Curbside Consult with Dr. Jayne 8/26/19
I’ve received several postcards and emails from Nuance lately marketing their Ambient Clinical Intelligence product, which they also describe as “the exam room of the future.” I’m pretty sure this is the follow-on to what many of us saw in their demo/theater at HIMSS.
The premise was this: the physician and patient interact in an exam room that supports speech recognition while also serving up EHR data to the provider upon request. The demo scenario was a 40-something woman with knee pain. The system helped the provider navigate to find information about previous visits as well as documenting the current one.
At the time, I spoke with some of the Nuance team, and it sounded like they were really focusing on subspecialty situations where the workflows would be fairly standardized and/or predictable. In order for the technology to work, there needs to be a significant repository of data available: medical dictionaries, codified discrete data, and so on. On top of that, you have to layer the typical exam findings, questions, and possible answers for different conditions, so the system can recognize what is being said without having to “train” the speech recognition portion. Beyond that, components of historical EHR data have to be served up to help answer questions the clinician might ask, such as when a medication was first prescribed.
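Reading between the lines, here is a rough sketch of how I imagine those two layers might fit together. To be clear, this is my own illustration with made-up names and a placeholder FHIR endpoint, not anything Nuance has described: a specialty vocabulary layer sitting on top of a plain query of the patient’s medication history.

```python
# Hypothetical sketch only -- not Nuance's architecture. It illustrates the two
# layers described above: a specialty "expected phrasing" dictionary and a
# lookup of historical EHR data (the earliest prescription date for a drug)
# via a standard FHIR R4 MedicationRequest search.
from typing import Optional
import requests

FHIR_BASE = "https://example-ehr.org/fhir"  # placeholder endpoint

# Layer 1: specialty-specific findings and phrasings the recognizer should
# expect, so domain terms are caught without per-user voice training.
ORTHO_KNEE_PHRASES = {
    "effusion": ["knee effusion", "fluid on the knee"],
    "mcmurray_positive": ["positive McMurray", "McMurray's test is positive"],
    "joint_line_tenderness": ["medial joint line tenderness"],
}

def first_prescribed(patient_id: str, drug_name: str) -> Optional[str]:
    """Layer 2: answer 'when was this medication first prescribed?' by searching
    the patient's MedicationRequest history and taking the earliest authoredOn."""
    resp = requests.get(
        f"{FHIR_BASE}/MedicationRequest",
        params={"patient": patient_id, "_count": 100},
        timeout=10,
    )
    resp.raise_for_status()
    dates = []
    for entry in resp.json().get("entry", []):
        med = entry["resource"]
        text = med.get("medicationCodeableConcept", {}).get("text", "")
        if drug_name.lower() in text.lower() and "authoredOn" in med:
            dates.append(med["authoredOn"])
    return min(dates) if dates else None
```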
Although the orthopedic demo was pretty flashy, it was obvious that the participants were actors working from a script, especially when the real-time-looking demo on the screen didn’t 100% match what had been said. Still, it was attention-grabbing enough to send me to speak to one of their reps about where they really were in development for other specialties. It sounded like they were still a ways out from what would be necessary to support workflows in primary care or urgent care, which can be the exact opposite of predictable. With the mailings and email ads, I figured perhaps they had made more progress and decided to follow up.
One piece on the website that caught my eye was something they’re calling “integrated machine vision,” which is designed to “detect non-verbal cues.” I’d be curious to learn more about how they’re doing this and what it might entail to create a library of non-verbal information that could be parsed to add context to notes. I’m also curious whether this applies only to the patient side or whether it’s skilled enough to pick up non-verbal input from the clinician. Would it be able to interpret the complete absence of a poker face that I exhibited recently when seeing the largest hernia I have encountered in my career? Could it interpret the glassy-eyed stare of my patient to determine whether they just weren’t paying attention or whether I should be asking more deeply about potential substance abuse? For clinicians caring for teens, I’d think the ability to quantify teenage eye-rolling would be the gold standard.
Another major component of the system is the virtual assistant piece, kind of like Alexa, Siri, or Google. “Hey Dragon” is the wake word to access information in the EHR, and as this technology evolves, it gets us closer and closer to what many of us have seen in the “Star Trek” universe over the years. Having toyed with a virtual assistant over the last couple of years, I know there are nuances in how the questions have to be asked to get the data you want. Somehow in “Star Trek” they don’t have to ask the computer three different questions to get the desired output. I’m hoping Nuance has been able to figure out the secret sauce needed to translate how physicians think and speak and adapt the system to match.
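As a thought experiment for why phrasing matters so much, here is a tiny, purely hypothetical sketch of the kind of intent mapping a “Hey Dragon”-style assistant would need: several different ways a physician might ask the same question all have to resolve to one structured EHR query. None of the names or patterns below reflect Nuance’s actual implementation.

```python
# Purely illustrative sketch of wake-word handling plus intent matching -- not
# Nuance's code. The point: many physician phrasings must collapse to a single
# structured EHR query, or the user ends up asking three different ways.
import re

WAKE_WORD = "hey dragon"

# Each intent lists phrase patterns a clinician might plausibly use.
INTENTS = {
    "last_a1c": [r"last (hemoglobin )?a1c", r"most recent a1c", r"what was her a1c"],
    "med_start_date": [r"when did .* start", r"how long has .* been on"],
    "prior_imaging": [r"(any|show) prior (x-?rays?|imaging)", r"last knee film"],
}

def parse_utterance(utterance: str):
    """Return (intent, match) if the utterance starts with the wake word and
    matches a known phrasing; otherwise None."""
    text = utterance.lower().strip()
    if not text.startswith(WAKE_WORD):
        return None
    text = text[len(WAKE_WORD):].strip(" ,")
    for intent, patterns in INTENTS.items():
        for pattern in patterns:
            match = re.search(pattern, text)
            if match:
                return intent, match
    return None

# Example: three different phrasings should all land on the same intent.
for ask in [
    "Hey Dragon, what was her A1c?",
    "Hey Dragon, most recent A1c please",
    "Hey Dragon, last hemoglobin A1c",
]:
    result = parse_utterance(ask)
    print(ask, "->", result[0] if result else None)
```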
I was also intrigued by their “intelligent translation and summarization” comments on the website, where they note that it “turns natural language into coherent sentences.” That sounds a bit like physicians might have trouble being coherent, which probably isn’t far off the mark for many of us, especially at the end of a particularly long and brutal shift. I know I lean heavily on my scribes (when I’m fortunate enough to have one) to translate my often-wordy home care instructions into a bulleted list that patients will be more likely to follow once they get home.
Although some of us are skeptical about the power of AI, I was intrigued by some of the numbers presented on the website. The company claims 400 million consumer voiceprints, with 600 million virtual and live chats per year powered by their AI technology. Although I’ve used speech recognition in the past, I hadn’t realized how much speech-to-text has grown, or that they now offer 125 voices in 50 languages. If they could somehow work with Garmin to integrate the “Australian English Ken” voice I used to have with my stand-alone GPS, I’d be sold. I could listen to him all day, even if he was continually telling me to make a U-turn at the next safe intersection.
This type of technology could really be a game-changer for physicians, perhaps reducing burnout, decreasing medical errors, and making visits more efficient for patients and clinicians alike. I’d be interested to hear from anyone who is actually employing these types of features in practice, whether it’s a comprehensive suite as Nuance is promoting or whether it’s freestanding elements such as a voice assistant for chart navigation, data retrieval assistance, or something else.
I wonder how much research is being done in this arena outside of the vendor space, whether any of the institutions that have strong informatics programs are getting involved with similar initiatives, or whether it’s so expensive that the work is typically vendor-driven.
From a patient perspective, I’d love to see a voice assistant functionality that could make it a reality for me to simply ask it to “make me an eye appointment after November 3 using one of the open slots on my calendar” and have it connect with my provider’s practice management system and get the job done without two phone calls, a patient portal message, and a two-week timeframe like it took me to make my last appointment. Now that would be something, indeed.
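For what it’s worth, the plumbing for that kind of request already exists in standards form. Here is a hedged sketch (the endpoint, IDs, and the assumption that my eye clinic exposes this API are all made up for illustration) of how an assistant could search for free slots after a date and book an appointment using FHIR R4 Slot and Appointment resources, which is roughly what the phone calls and portal messages accomplish today.

```python
# Hypothetical sketch of the booking workflow described above, using standard
# FHIR R4 resources (Slot search + Appointment create). Endpoint and IDs are
# placeholders, and the clinic exposing this API is an assumption.
import requests

FHIR_BASE = "https://example-clinic.org/fhir"  # placeholder
PATIENT_ID = "Patient/12345"                   # placeholder

def find_open_slot(after_date: str, schedule: str):
    """Search for the first free slot on a given schedule after a date."""
    resp = requests.get(
        f"{FHIR_BASE}/Slot",
        params={"status": "free", "start": f"ge{after_date}", "schedule": schedule},
        timeout=10,
    )
    resp.raise_for_status()
    entries = resp.json().get("entry", [])
    return entries[0]["resource"] if entries else None

def book(slot: dict) -> dict:
    """Create an Appointment that references the chosen slot and the patient."""
    appointment = {
        "resourceType": "Appointment",
        "status": "booked",
        "slot": [{"reference": f"Slot/{slot['id']}"}],
        "start": slot["start"],
        "end": slot["end"],
        "participant": [{"actor": {"reference": PATIENT_ID}, "status": "accepted"}],
    }
    resp = requests.post(f"{FHIR_BASE}/Appointment", json=appointment, timeout=10)
    resp.raise_for_status()
    return resp.json()

# "Make me an eye appointment after November 3":
slot = find_open_slot("2019-11-03", "Schedule/ophthalmology-clinic")
if slot:
    print(book(slot)["id"])
```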
What is your most sought-after voice assistant functionality? Leave a comment or email me.
Email Dr. Jayne.
I’ll speak for the staff of the practices we support: outbound referrals to specialists. Voice-command the order, send the referral and supporting documents, and have the prior authorization from insurance initiated!
My Australian Siri dude (Aussie English Ken?) and I have a lot of trouble some days; I can’t imagine how he’d do charting for a mumbling provider. I listen to one of my orthopedic surgeons dictate, and he talks so fast and low that the transcriptionists have to slow the playback to hear his words. Cool and exciting, but I’ll fully buy into the AI hype once we have true interoperability working across all platforms!
The reason that you don’t have to ask Star Trek computers 3 times is that the Star Trek computers understand meaning. This is what is missing from all the classic voice recognition systems to date.
Now, the tech behind Siri, Google Voice, Cortana, and the latest crop of voice assistants may be different (though I doubt it is different enough). The tech in Nuance’s Dragon and similar systems uses pattern matching of various types. IBM also published research on triads of words that were statistically most often found together.
Even when the Star Trek computers had to ask for clarification, or allow for different meanings, those interactions showed deep understanding of content and meaning.
It has long been the case that you can dictate grammatically correct nonsense sentences to Dragon and the system will do nothing to stop you. These systems will also make jaw-dropping recognition errors by incorrectly parsing where words start and stop, or by choosing the wrong homonym.
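To make that pattern-matching point concrete, here is a toy sketch of the word-triad idea mentioned above. This is a generic illustration of statistical n-gram counting, not IBM’s or Nuance’s actual models: the model only knows which words tend to co-occur, so a grammatical nonsense sentence can still score as “familiar” because it has no concept of meaning.

```python
# Toy illustration of statistical pattern matching via trigram counts.
# Not IBM's or Nuance's actual models -- just the general idea that co-occurrence
# statistics say nothing about whether a sentence means anything.
from collections import Counter

corpus = (
    "the patient reports knee pain after a fall "
    "the patient reports chest pain after exertion "
    "the patient denies knee pain at rest"
).split()

# Count every consecutive triad of words in the corpus.
trigrams = Counter(zip(corpus, corpus[1:], corpus[2:]))

def score(sentence: str) -> int:
    """Sum of corpus counts for each triad in the sentence. Higher means more
    'familiar' word patterns, regardless of whether the sentence makes sense."""
    words = sentence.lower().split()
    return sum(trigrams[t] for t in zip(words, words[1:], words[2:]))

print(score("the patient reports knee pain"))      # sensible, scores well
print(score("the patient reports knee exertion"))  # nonsense, still partially scores
```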
Siri was originally based on Nuance’s tech, but Apple later replaced it with its own. Sometime in the mid-2010s, I think?