Curbside Consult with Dr. Jayne 1/24/22
Many healthcare organizations are struggling with the recent COVID surge due to the omicron variant. The focus is often on staffing issues, especially when large numbers of workers are out due to personal illness, caring for sick family members, or providing care for children whose schools have shifted to virtual learning. Other struggles include supply shortages, especially with personal protective equipment, medications and therapeutics, and occasionally cleaning products, all of which are shocking at this stage of the pandemic.
More recently, though, a number of organizations are seeing infrastructure challenges due to the sheer number of patient visits that are occurring.
I spent some time over the weekend trying to calm a CMIO friend whose ambulatory organization is in complete crisis. In the past, they had a robust IT department and hosted all of their own applications. In a round of cost cutting, the parent organization decided it would be better to outsource all of those functions. At the same time, they moved many of their internally hosted systems onto web-based platforms where available.
Their primary ambulatory EHR was one of those systems. It wasn’t just moved out of their data center — it was also transitioned to a SaaS model with multi-tenant architecture. This was fine for a number of months, but recently their system has been grinding to a halt at various times during the course of a day, and the user community is becoming increasingly frustrated.
Many of their outpatient clinical offices are back to pre-COVID productivity, through a combination of in-person and virtual visits. Because this organization is conservative, it conducts all of its telehealth visits by video, which takes up more bandwidth than an audio-only visit. Their urgent care and same-day facilities have been seeing high volumes throughout the pandemic, but the numbers have been fairly stable for the last few months since operational leaders wisely capped daily volumes in order to preserve staff sanity.
I’m sure they have lost some patients to other facilities in town, but they consider the leakage acceptable if it keeps staff from resigning. They made these decisions based on experiences from earlier in the pandemic when they didn’t cap volumes, which led to some pretty significant burnout and nearly insurmountable levels of turnover. They weren’t about to put their newly rebuilt staff through the same experience, and for that I commend them.
Still, they were puzzled why they were having such poor system performance with stable volumes. As a hosted client, the organization had its IT team opening performance tickets left and right, but with few answers. System latency continued to increase along with user frustration, as it was taking up to 30 seconds to load patient charts or 20 seconds to navigate from screen to screen. Even basic controls such as pick lists and pop-ups were sluggish. Performance would improve at times and they would feel like they were moving in the right direction. The urgent care locations, which run seven days a week, reported some slight improvement on the weekends, but not much.
After many conversations with the vendor and a number of executive escalations, it became clear that the way the vendor’s system is architected is the problem. After moving from their own data center onto the SaaS model, the group is experiencing lags related to the out-of-control visit volumes of other clients. They are feeling performance impacts caused by organizations that have doubled or tripled their daily visit volumes, putting additional load on the infrastructure. Since many of us didn’t anticipate how quickly the COVID curve would climb with the omicron variant, and how many people would be sickened in such a short interval, planning for such volume surges was inadequate.
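To make that "noisy neighbor" effect concrete, here is a minimal sketch that treats the shared platform as a single queue and applies the textbook M/M/1 formula for mean response time, 1 / (service rate - arrival rate). The capacity and load figures are invented for illustration and say nothing about the actual vendor; a real multi-tenant EHR is far more complicated than one queue.

```python
# Toy model: all tenants share one pool of capacity (an assumption, not the vendor's design).

def mean_response_time(service_rate, arrival_rate):
    """Mean time in system for an M/M/1 queue, in seconds: 1 / (mu - lambda)."""
    if arrival_rate >= service_rate:
        # Demand meets or exceeds capacity, so the queue grows without bound.
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)

SHARED_CAPACITY = 100.0  # requests per second the shared platform can serve (hypothetical)
OUR_LOAD = 20.0          # our own steady request rate, which never changes (hypothetical)

def shared_latency(neighbor_load):
    """Latency our users see when other tenants add neighbor_load to the same pool."""
    return mean_response_time(SHARED_CAPACITY, OUR_LOAD + neighbor_load)

# Our volume stays flat while the neighbors double and then triple theirs.
for neighbor_load in (25.0, 50.0, 75.0):
    latency_ms = shared_latency(neighbor_load) * 1000
    print(f"neighbors at {neighbor_load:.0f} req/s -> about {latency_ms:.0f} ms per request")
```

In this toy model, the stable tenant's response time climbs from roughly 18 ms to 200 ms as the neighbors triple their load, even though its own volume never moves. The steep rise near full capacity is the point: a system already running hot has very little room to absorb someone else's surge.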
Sometimes solving infrastructure problems can be as challenging as solving staffing problems in the hospital. Especially if the system is already running towards the higher end of capacity, there might not be available hardware that can be incorporated quickly. In the crisis situations that many 24×7 organizations are working in, it’s not easy to schedule a downtime for an upgrade or to modify resources. A lot of things can be done behind the scenes, but the reality is that most of us never planned for a peak that looks like what we are experiencing.
I can’t imagine what the staff at these doubly- or triply-busy practices are going through. They have got to be at wits’ end, because increasing throughput to that degree requires more staff, better processes, or less care being delivered. Based on what we know staffing looks like, and the difficulty in doing significant process changes during a crisis, I’m guessing care might be taking a hit. That would certainly mesh with the discussions I’m seeing on physician-only social media, where the number of mentions of moral injury has climbed along with the number of posts in which physicians are asking for advice on how to break their contracts.
My CMIO friend’s vendor was supposed to try some maneuvers over the weekend that would create relative isolation for his organization so that they wouldn’t be so dramatically impacted by what is going on with other clients. I’m trying to wrap my head around what their architecture might look like to make that happen. It makes me grateful for all the deeply techy people I’ve worked with over the years who understand better than I how those pieces of the healthcare IT world run.
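One common way to get "relative isolation" on shared infrastructure is to cap how many requests each tenant can have in flight at once, sometimes called a bulkhead. The sketch below is purely illustrative: the class name, tenant names, and limit are all hypothetical, and nothing here is known about what this particular vendor actually planned to do.

```python
from collections import defaultdict

class TenantBulkhead:
    """Caps in-flight requests per tenant so one tenant cannot use up the whole pool.

    Hypothetical illustration only. Not thread-safe; a real service would
    guard the counters with a lock or use a semaphore per tenant.
    """

    def __init__(self, per_tenant_limit):
        self.per_tenant_limit = per_tenant_limit
        self.in_flight = defaultdict(int)  # tenant id -> requests currently running

    def try_acquire(self, tenant_id):
        """Admit the request if the tenant is under its cap; otherwise reject it."""
        if self.in_flight[tenant_id] >= self.per_tenant_limit:
            return False
        self.in_flight[tenant_id] += 1
        return True

    def release(self, tenant_id):
        """Free a slot when one of the tenant's requests finishes."""
        if self.in_flight[tenant_id] > 0:
            self.in_flight[tenant_id] -= 1

bulkhead = TenantBulkhead(per_tenant_limit=3)

# A surging tenant sends five requests at once; only the first three are admitted.
surging = [bulkhead.try_acquire("surging_clinic") for _ in range(5)]

# A stable tenant is unaffected because its own counter is still at zero.
stable = bulkhead.try_acquire("stable_clinic")

print(surging, stable)
```

The trade-off is that the surging tenant's extra requests are rejected or queued instead of slowing everyone down, which protects the stable tenants but does nothing to add capacity.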
I wouldn’t want to be on the tip of the spear, whether it was my fault or my vendor’s, because an angry end user is an angry end user regardless of where the root cause of the problem lies. Still, I can offer a sympathetic ear, a soft virtual shoulder, and reassurance that his communication strategy was solid and that he had considered all of the things that I would have considered were I in the same unenviable position. He’s going to let me know mid-week how things are going, and for his sake, I hope they are improved.
Have you encountered infrastructure challenges related to booming visit volumes? Leave a comment or email me.
Email Dr. Jayne.
Wasn’t it Dilbert who said “isn’t the Cloud just someone else’s data center?” That’s the scenario you described, and the “tenants” compete for compute. Instead, Public Clouds (AWS/Azure/Google) use software “containers” – it’s virtualization at the application software level (not hardware, like VMWare). If the application is designed to be “Cloud-native,” and “Container-native,” the container/application spins up when you log in and are active, and evaporates when you’re not. This is how the Public Cloud vendors get the scale to do email, internet searches, etc. And if healthcare is going to adopt Artificial Intelligence, this is the only way to scale it – rather than building massive data centers that still can’t handle it.
Unfortunately, healthcare has tolerated vendors with 1990s fat client architectures, machine virtualization dependence, and other technical debt that removes any Cloud advantage, and won’t perform for AI. Rather than re-architecting the application, some are simply balling the whole mess up into a massive, expensive container that can’t spin up/down, so there is still no “Cloud-scale.” Many are also seeing Artificial Intelligence as a further revenue opportunity – and their customers will be trapped into a single-threaded, horsepower-dependent model. For example, it will be interesting to see if Oracle re-platforms Cerner to increase performance and make it Cloud-agnostic, or if it is simply a one-way ticket to buying the Oracle Cloud – what’s your bet?
Healthcare is at a cross-roads, and doesn’t seem to realize; we can continue to be a vendor-dependent quagmire that now sits in someone else’s data center where we have even less control, or we can raise our expectations and move into the current world – Cloud-native, Cloud-scale, Cloud-agnostic, open source, Edge-capable – there’s a whole other world out there, folks!
I’m a pretty traditional IT guy, and I’ve had concerns about the SAAS model all along. Yet everyone in my position either has been, or is vulnerable to, accusations of being stodgy, or featherbedding, or worse.
The thing is, a lot of organizations aggressively pursued SAAS without doing a realistic risk assessment. If you don’t do that then you are in danger of falling victim to the optimistic spin of the SAAS vendors. The spin most often being, “it’s cheaper, it’s so simple, and everything gets better!”
I 100% buy that for some customers, SAAS is the best solution. My view of the trade-offs is that usually, smaller customers will benefit the most. Very large customers may still benefit from a SAAS provisioning model and may choose to become their own SAAS provider. Yet I rarely hear about this latter service model in the wild.
Would love a longer article from the above commenter giving more explanation of the concepts cited, aimed at us medical types.
Funny story related to the problems cited: I was explaining to my IT son-in-law the difference between slow transit and pelvic floor constipation. He finally said of the latter “oh, so it’s not latency, it’s throughput”.
Vendors who can’t fix performance problems are corporate cheapskates. Some performance problems you can fix by buying more hardware or equipment. Other performance problems require retaining in-demand professionals with a deep knowledge of the system to troubleshoot what comes up. Some problems require both equipment and software changes. Businesses cheap out and get caught with their pants down; there is nothing technology-specific about that.