I recently wrote about Nuance and their efforts to create the exam room of the future, where charting is performed in real time as speech occurs. Several readers reached out with some detailed questions and discussion about the technology, which spurred me to dig a little deeper.
One reader commented about the concept of meaning as it relates to voice recognition technology and the need for systems to use pattern matching to correctly identify the content of the speech. I have a tremendous amount of trouble getting my phone to recognize the difference between “pictures” and “pitchers” no matter how clearly I try to articulate, and regardless of context. Getting a system to recognize words when you’re actually trying is one thing, and having it accurately identify speech in an exam room conversation that is all over the place is another.
An article in the Journal of the American Medical Informatics Association looked at the difficulty in detecting conversation topics during primary care office visits. They used transcripts of the visits to look at whether machine learning methods could be effective in automating annotation of visits. The authors recognized the complexity of the average primary care office visit, noting:
Patients present multiple issues during an office visit requiring clinicians to divide time and effort during a visit to address competing demands, such as a patient could be concerned about blood pressure, knee pain, and blurry vision in a single appointment. Moreover, visit content does not solely focus on biomedical issues, but also on psychosocial matters, personal habits, mental health, patient-physician relationship, and small talk.
When looking at the content of visits to determine what material was covered, research raters can label each so-called “talk-turn” using codes intended to capture the visit content. This process can take several hours per visit, making it difficult to scale such an analysis. Being able to automate the extraction of these topics could not only help reduce documentation burden, but could also help identify providers who may not be following up on all the clinically relevant parts of the encounter. The authors wanted to build on previous studies that looked at human-labeled interactions and showed that machine learning systems can create annotations of those conversations.
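For readers who like to see an idea in code, here is a minimal sketch of what automated talk-turn labeling could look like. To be clear, this is not the method from the JAMIA article. It is a toy naive Bayes classifier written in plain Python, and the sample talk-turns, the topic codes (`biomedical`, `mental_health`, `small_talk`), and the function names are all invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, made-up talk-turns and topic codes (not data from the study).
TRAINING_TURNS = [
    ("my blood pressure has been running high at home", "biomedical"),
    ("the knee pain gets worse when i climb stairs", "biomedical"),
    ("my vision has been blurry for about a week", "biomedical"),
    ("i have been feeling anxious and not sleeping well", "mental_health"),
    ("i feel sad and worried most days", "mental_health"),
    ("how was your vacation at the lake", "small_talk"),
    ("the weather has been lovely this week", "small_talk"),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def train(labeled_turns):
    """Count words per topic code and how often each code appears."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    vocab = set()
    for text, label in labeled_turns:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def classify(turn, model):
    """Return the topic code with the highest naive Bayes log score."""
    word_counts, label_counts, vocab = model
    total_turns = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        score = math.log(count / total_turns)  # prior for this topic code
        total_words = sum(word_counts[label].values())
        for token in tokenize(turn):
            if token not in vocab:
                continue  # skip words never seen in training
            # Add-one (Laplace) smoothing so unseen pairs do not zero out.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING_TURNS)
for turn in ["my knee pain is back", "i feel anxious and worried"]:
    print(turn, "->", classify(turn, model))
```

A handful of sentences is obviously nowhere near enough to handle a real visit, where a single talk-turn can wander across several topics. That gap between a tidy toy example and a messy exam room conversation is exactly why hand-labeled transcripts and larger datasets matter.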
Using 279 primary care office visits, they found that different models performed better at the visit level vs. the topic level, concluding that additional study and larger datasets are needed to achieve performance that would succeed in the real-world exam room. Using natural language processing to generate discrete data doesn’t seem to be as easy as people might think. I’ve often thought about what it would be like if you could just record an office visit (both audio and video) as documentation. The pain would be in reviewing it later, unless there was a way to transcribe the information or make it searchable. Various vendors have tried to solve this problem, including leveraging Google Glass to do so.
Remember Google Glass, the tech industry’s darling way back in 2013? It’s been hiding in plain sight, as an “Enterprise Edition” that’s being used in a variety of manufacturing and heavy industrial applications as well as in healthcare. A quick scan of the website shows several big-name healthcare organizations on the client roster.
I recently had a chance to catch up with Ian Shakil, founding chairman of Augmedix, whose client roster shares some of the big names listed by Glass. He confirmed that Glass is far from gone, with around 30% of Augmedix customers using it as part of tech-enabled scribing services. The remaining clients use smartphones, which might be worn or on a stand in the exam room. It sounds like patients have gotten over the concerns that many of us initially had with Glass and privacy – he cites a 98% acceptance rate by patients, which is partly accomplished by education by the front desk or clinical staff.
It was interesting to talk to someone knowledgeable about a segment of the healthcare industry that I admit I know little about. Other than some excitement around Glass half a decade ago, and some acquisitions of scribe and transcription services by other vendors in the voice recognition and EHR spaces, I hadn’t seen a lot of coverage. We spent some time talking about the way various solutions tackle the problem, from what can be described as “dictation in disguise” to human scribes to remote scribes to attempts to use voice recognition and virtual assistant technology to create a true AI-powered scribe. Some vendors like Augmedix even offer services across the continuum, depending on where their clients are, from a human virtual scribe all the way to tech-augmented scribes who use a variety of tools to enhance their abilities to document visits.
I was surprised to learn that there is variability in what is done with the recordings created during patient visits. Depending on the vendor and the client, some want the audio and video destroyed and others want them preserved. The recordings may be used for training, quality assurance activities, or even in the future as a multimedia note or for access by the patient as a reminder of the visit. Given the plaintiff’s attorney whose branch is close to mine on the family tree, I wondered about the use of the video feeds in potential litigation. I’ve pored through enough bulky, EHR-generated medical records to know that it certainly would be easier to watch the movie than to read the book in this case.
I use a human scribe in the exam room about half of the time. Our office fully agrees with industry data showing that such support leads to better notes, timelier patient care, and reduced clinician burnout. The biggest struggle I have, though, is going back and forth between having a scribe with me and not having one. When I have that support, everything I say is taken down or acted upon in the exam room before we leave, and I can just close that visit in my mind and move to the next exam room. The scribes watch for lab results or radiology tests to return and make sure I don’t miss going back to take care of a patient who is still pending disposition.
When I work a shift without a scribe, I’m pretty good at the follow-up piece, but I sometimes forget to put in my orders or flag patients for discharge. I’m so used to saying, “We’re going to do a flu swab and get a chest x-ray” and having those orders placed that my brain goes on autopilot right past the need to enter them myself. It’s enough of an issue that I usually tell the rest of my clinical team that “I had a scribe yesterday and don’t today, so if you see me missing orders or discharges, just grab me” and they usually laugh, because apparently I’m not the only physician who does it.
Shakil shared a great piece with me that ran in The Lancet a couple of weeks ago, one where the author discusses “Empathy in the age of the electronic medical record.” It’s worth a read for folks who might wonder what physicians who struggle with the EHR are thinking as they try to see patients. I’m interested to hear what readers think on the topic. Where are we, and where are we headed? In the meantime, I’m mentally prepping because tomorrow’s schedule does not include a scribe.
What do you think about virtual scribes, natural language processing, and the exam room of the future? Leave a comment or email me.
Email Dr. Jayne.