Curbside Consult with Dr. Jayne 7/22/24
The big news of the weekend was hearing about the response of organizations to the CrowdStrike debacle on Friday. Despite official statements that everything was fine and patient care was proceeding as usual, comments from worker bees at several local hospitals revealed significant issues that did impact patient care.
At one facility, patients who had mammograms performed on Wednesday and Thursday and were told to expect results by end of day Friday were left in the lurch, since the hospital’s cloud-based dictation service was down. Apparently there was confusion about whether there was a backup plan and what it might be, so radiologists stopped reading studies, bringing everything to a halt. There was no proactive communication to impacted patients letting them know that results would be delayed, causing a great deal of anxiety.
One physician friend who was impacted as a patient reached out on a local physician forum to find out whether her study was being delayed because it was abnormal, which is a common thought among patients. She had no idea about the CrowdStrike situation, but a number of hospital-based physicians chimed in about the patient care nightmare that was unfolding across the region. Several affiliated hospitals canceled elective imaging, including screening mammograms, on Friday. Other physicians reported delays in getting operating room systems started and an inability to get through to internal help desks due to a high volume of calls.
Since I work with various organizations and have company-issued laptops for each of them, I was able to experience firsthand how different places handled the problem. One organization was extremely hands-on, sending messages via text starting in the wee hours of the morning. They’re not on my overnight priority list, so the text thread was muted, but I was impressed because they sent hourly updates. Fortunately, my laptop wasn’t impacted and I wasn’t scheduled to do work for them that day, but I followed along because that’s what a good healthcare IT reporter does. By around 7 p.m. in the company’s primary time zone, they sent another text indicating that mitigation efforts had concluded. I checked that company’s email over the weekend to see what other communications they might have sent and was pleased to see an overall summary and debrief communication.
Another company was radio silent, acting like nothing was happening. I guess it’s good that none of their systems or hardware were impacted, but it would have been nice to receive some kind of communication letting employees and contractors know that there was a worldwide issue and that vendors, external systems, or patient pharmacies might be impacted. Since they’re a virtual care company, I would be interested to see whether there was any increase in the number of failed prescription transmissions or patient callbacks asking for medications to be prescribed to a different pharmacy because of the outage.
My laptop for another health system was impacted by the outage and they didn’t send out any communications until two hours after I discovered the issue. I had reported it to the help desk via email by using my phone, so I knew I was in the hopper. Since everyone’s accounts are on Office 365, I was able to do the small amount of work I had for them by using my personal computer, which I’m not sure is entirely permitted based on the vague wording of their privacy and security policies. No one blinked when I said I was using my own device, though, so I’m assuming that I’ll ask for forgiveness if it becomes an issue later since I didn’t ask for permission. I was ultimately able to perform the fix on my laptop myself, which was good because the help desk didn’t get back to me until Saturday afternoon when I was nowhere near my laptop.
Mr. H reported a list of impacts in this week’s Monday Morning Update and they included surgery and procedure cancellations, appointment cancellations, closure of diagnostic facilities, and holds on shipping laboratory specimens due to delays with FedEx. Mr. H noted that Michigan Medicine reported a “major incident.” I’m not sure what that means at the institution, and whether something truly serious happened or whether it was classified as major due to the number of impacted systems, or something else. I’d be interested to hear from anyone at that organization as to what exactly that report means.
Since one of the more serious impacts occurred with 911 emergency call centers, it will be difficult to quantify the full effect on patients. Several state systems were down and analog backups were pulled into service in multiple places. It’s difficult to perform reporting and analysis on events that didn’t happen, but one could extrapolate from historical call volumes to estimate how many calls weren’t received compared to a typical summer Friday. Given the typical percentages of different types of critical calls (cardiac arrests, penetrating trauma, motor vehicle accidents), one can start to do the math to understand how many lives might have been either seriously impacted or lost due to what others minimize as a “computer glitch.” I’m sure the loved ones of those individuals who were frantically trying to call 911 for help might have other words for it.
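For anyone who wants to try that math, here is a minimal Python sketch of the extrapolation described above. Every number in it is a made-up placeholder, not real call center data, and the function names are hypothetical. It only shows the shape of the estimate: take the call volume of a typical summer Friday, subtract the calls that actually got through, and apply the usual share of each critical call type to the difference.

```python
from statistics import mean


def estimate_missed_calls(prior_friday_volumes, calls_received):
    """Estimate calls that never got through: typical Friday volume minus actual volume."""
    baseline = mean(prior_friday_volumes)
    return max(baseline - calls_received, 0)


def breakdown_by_type(missed_calls, type_shares):
    """Apply the typical share of each critical call type to the missed-call estimate."""
    return {call_type: missed_calls * share for call_type, share in type_shares.items()}


# Placeholder inputs for illustration only (assumptions, not real figures).
prior_fridays = [980, 1000, 1020]      # calls received on comparable summer Fridays
received_on_outage_day = 700           # calls that actually got through
critical_shares = {                    # typical fraction of all calls, by type
    "cardiac arrest": 0.01,
    "penetrating trauma": 0.005,
    "motor vehicle accident": 0.05,
}

missed = estimate_missed_calls(prior_fridays, received_on_outage_day)
estimate = breakdown_by_type(missed, critical_shares)

for call_type, count in estimate.items():
    print(f"{call_type}: roughly {count:.1f} calls may have gone unanswered")
```

A real analysis would need actual dispatch records and would have to account for calls that were rerouted to analog backups or neighboring centers, which this sketch ignores.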
I spent a fair amount of time this weekend following the Relive Apollo 11 thread (@ReliveApollo11) on the service formerly known as Twitter. I’ve always been a space junkie and being able to share the experience in a reenacted, real-time way was kind of thrilling. Through one of the links, I found the Apollo 11 Flight Journal, which is a fascinating read of the transcripts from mission communications. Other cool resources I found during my trip down the rabbit hole included a guide for using Google Earth to explore the moon, and in particular, the landing sites.
It’s hard to believe the level of accomplishment that took us to the moon, with human computers and slide rule-wielding engineers leading the way. The technologies are considered much less powerful than what most of us hold in our hands on a daily basis, but people achieved great things. It should be inspirational, especially on those days when we feel that we are making little progress.
I also learned a piece of information I didn’t previously know. The Apollo 11 mission patch doesn’t include the names of the crew members because those three astronauts wanted the patch to represent all of those who were involved in the mission. It’s a refreshing departure from the “me” culture with which we’re all too familiar.
For those of you who experienced Apollo 11 or other moon landings at the time they occurred, what are your significant memories? Leave a comment or email me.
Email Dr. Jayne.
Re: CrowdStrike issue causing BSOD
By way of explanation, and not to make excuses: the tech industry is chockablock with oligopolies and near-monopolies. It is how tech has evolved so far and so fast.
When people ask, “How did CrowdStrike get so popular and create this vulnerability?”, what they often mean is, “How come we don’t have hundreds or thousands of small security companies instead of a handful of big companies?” Well, quite simply, the market didn’t want massive numbers of small companies. The rewards all pointed toward consolidation and market domination.
Another thing to know: pretty much the entire tech sector has de-emphasized testing. Testing is built around an “avoid the problem” idea, which is great, but testing is slow and expensive, and it isn’t the only method of problem management. Most companies (from what I can gather) have disbanded their testing/QA groups and now issue new software versions in waves to select groups of users. Sometimes the early adopters are self-selected, and sometimes they are targeted by the software company. The Big Idea is to catch problems early and respond rapidly, but to do so using real users.
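To make the “waves” idea concrete, here is a minimal Python sketch of a staged rollout, under stated assumptions. It is not how CrowdStrike or any particular vendor ships updates. The wave names, the `deploy` callback, and the 1% failure threshold are all hypothetical. The point is only that each wave goes out in turn, and the rollout stops as soon as one wave fails too often.

```python
def staged_rollout(waves, deploy, max_failure_rate=0.01):
    """Deploy wave by wave; halt as soon as a wave's failure rate exceeds the threshold.

    waves: list of (wave_name, list_of_hosts) in rollout order.
    deploy: callable taking a host and returning True on success (hypothetical hook).
    """
    completed = []
    for name, hosts in waves:
        failures = sum(1 for host in hosts if not deploy(host))
        rate = failures / len(hosts) if hosts else 0.0
        if rate > max_failure_rate:
            # Stop here so later, larger waves never receive the bad update.
            return {"status": "halted", "halted_at": name, "completed": completed}
        completed.append(name)
    return {"status": "done", "halted_at": None, "completed": completed}


# Illustrative waves: a small self-selected group first, then everyone else.
example_waves = [
    ("early adopters", ["host-a", "host-b"]),
    ("everyone else", ["host-c", "host-d", "host-e"]),
]

# A bad update that crashes every machine is caught in the first wave.
bad_update = staged_rollout(example_waves, deploy=lambda host: False)
# A good update passes through every wave.
good_update = staged_rollout(example_waves, deploy=lambda host: True)
print(bad_update["halted_at"], good_update["status"])
```

The trade-off is the one described above: the early adopters are real users, so they absorb the failure that a test group would otherwise have caught.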
The truth is, there are advantages to market consolidation and single points of control (and of failure, too). Reputation is everything in the security space. It’s difficult to impossible to get to know the reputations of thousands of companies, so fewer names make reputation tracking easier for customers. Also, loci of control have huge administrative advantages for customers. You almost never get administrative control points in a massively fragmented market.
What no one will tell you about large-scale fragmentation of security companies and products is this: you are opting for near-permanent, partial disruption of your systems. Statistically, something like 1-10% of your systems would be down at any given time. It’s a permanent drag on Operations, but it is possible to reorganize around this dynamic, so maybe it’s not so bad.
With large-scale consolidation, you tend to get long intervals of high availability, interspersed with shorter intervals of disruption. And that disruption can be planned, unplanned, low impact or high impact. The CrowdStrike outage was the worst combo (unplanned and high impact).
The key issue, I submit, is this: does market consolidation cause more problems or fewer problems overall for customers? I know of no definitive answer to this question. Does market consolidation increase costs overall for customers? This does have at least a partial answer. Although there are savings in vendor tracking and product administration, the closer the market moves to monopoly, the more vendors tend to take advantage of market power and raise prices.
One more thing. Security software is typically afforded extremely high levels of access and authority. Yet security software is vulnerable to every type of exploit that non-security software is. One security guy I followed for decades bemoaned this situation! People tend to assume that security software is itself reliable and secure (by definition), but this turns out not to be true.
Yeah, disappointing, isn’t it?
I followed every space mission as a kid, including all of the lunar missions, and remember watching Armstrong’s first step live. It was a thrilling time. Five years ago, on the 50th anniversary, I visited Goddard Park outside Worcester, MA to say thanks for the first steps that led to that amazing day. I still follow activity (launches, returns from the ISS), but online now. It’s not newsworthy enough for mass media anymore.
Wow, memories!
I tried to watch what I believe was the very last Apollo mission. It was past my bedtime, but my parents gave me permission to stay up late.
The launch was delayed, repeatedly IIRC. And young Brian Too fell asleep in front of the TV, waiting for the magic moment. Arrgghhhh!
I was so disappointed.
If anyone is interested in a full multi-media presentation (video, audio, documentation, timelines) on Apollo 11, the Apollo in Real Time (https://apolloinrealtime.org/) site is amazing. Note that they also have rundowns on Apollo 13 and Apollo 17.