Thoughts on NIST’s EHR Usability Document 10/24/11
NIST’s EHR usability report, Technical Evaluation, Testing, and Validation of the Usability of Electronic Health Records, can be viewed here. It is in draft status and available for public comments. Comments can be sent to EHRUsability@nist.gov.
ONC has also pledged to review comments left on HIStalk. Click the link at the end of this article to add yours.
My Disclosures
- I’m not a usability expert, but I have attended usability workshops and possess some familiarity with how software usability is defined and measured.
- I’ve used badly designed software.
- I’ve had to tell clinical users to live with badly designed software and patient-endangering IT functionality because we as the customer had no capability to change it and our vendor wasn’t inclined to.
- I’ve designed and programmed some of that badly designed software myself, choosing a quick and dirty problem fix rather than a more elegant and thoughtful approach.
- My hospital job has involved reviewing reports of patient harm (potential and actual) that either resulted from poor software design or could have been prevented by better software design.
- I’ve seen examples from hospitals I’ve worked in where patients died from mistakes that software either caused or could have prevented.
First Impressions
My first impression of the report is that it was developed by the right people – usability experts. Vendor people and well-intentioned but untrained system users were not involved. Both have a role in assessing the usability of a given application, but not in designing a usability review framework. That’s where you want experts in usability, whose domain is product-agnostic.
My second impression of the report is that it is, in itself, usable. It’s an easy-to-read overview of what software usability is. It’s not an opinion piece, an academic literature review, or government boilerplate.
The document contains three sections:
- A discussion of usability as it relates to developing a new application.
- A review of how experts assess an application’s user interface usability after the fact.
- How to bring in qualified users to use the product under controlled conditions as a final test, analyzing both their interaction with the application and their opinions about how usable it is. This is where the user input comes in.
A Nod to the HIMSS Usability Task Force
I was pleased to see a Chapter 2 nod given to the HIMSS Usability Task Force, which did a good job of bringing the usability issue to light. They were especially bold to do this under the vendor-friendly HIMSS, which has traditionally given a wide berth to issues that might make its big-paying vendor members look bad. I credit that task force for putting usability on the front burner.
In fact, the HIMSS Usability Task Force’s white paper is similar to the NIST document, just less detailed. I’ll punt and suggest reading both for some good background. I actually like the HIMSS one better as an introduction.
Usability Protocol
A key issue raised early in Chapter 3 (Proposed EHR Usability Protocol) is that it’s important to understand the physical environment in which the software will be used. This is perhaps the biggest deficiency of software intended for physician use.
User interfaces that work well for users who are seated in a quiet room in front of a desktop computer may be significantly less functional when used on portable devices while walking down a hospital hallway, or on a laptop with only a built-in touchpad. That’s a variable that programmers and even IT-centric clinicians who spend their days riding an office chair often forget. The iPad is forcing re-examination of how and where applications are actually used and how to optimize them for frontline use.
The document mentions that ONC’s SHARPC program is developing a quick evaluation tool that assesses how well an application adheres to good design principles. Three experts will review 14 best practices to come up with what sounds like a final score. It will be interesting to see what’s done with that score, since it could clearly identify a given software product as either very good or very bad. In fact, the document lists “violations” that range from “advisory” to “catastrophic,” which implies some kind of government involvement with vendors. Publishing the results would certainly put usability at the forefront, but I would not expect that to happen.
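Purely as a back-of-the-envelope illustration of how such a score might be aggregated: the “advisory” through “catastrophic” severity levels come from the document, but the weights, names, and scoring formula below are my own assumptions, not anything SHARPC has specified.

```python
# Hypothetical sketch: three expert reviewers rate an application against
# 14 design best practices, recording violations by severity. All weights
# and names here are assumptions for illustration only.
from statistics import mean

SEVERITY_WEIGHT = {"advisory": 1, "minor": 2, "major": 4, "catastrophic": 8}

def reviewer_score(violations: list[str], n_practices: int = 14) -> float:
    """Penalty-based score in [0, 100]; fewer/lighter violations score higher."""
    penalty = sum(SEVERITY_WEIGHT[v] for v in violations)
    worst = n_practices * SEVERITY_WEIGHT["catastrophic"]
    return 100 * (1 - min(penalty, worst) / worst)

# One list of observed violations per expert reviewer:
reviews = [
    ["advisory", "major"],
    ["minor"],
    ["advisory", "advisory", "catastrophic"],
]
print(f"final score: {mean(reviewer_score(r) for r in reviews):.0f}/100")
```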
The document points out that usability testing “does not question an innovative feature” being introduced by a designer, “but nonetheless can identify troublesome or unsafe implementation of the user interface for that feature.” That’s the beauty of usability testing: it can be used to test anything. It doesn’t know or care whether what’s being tested is a worthless bell and whistle or a game-changing informatics development. It only cares whether the end result can be used effectively (and, with regard to clinical software, that patients won’t be harmed as a result of clinician confusion).
Methods of Expert Review of User Interfaces
Chapter 5 covers expert review of user interfaces. When it talked about standardization and monitoring, I was thinking how valuable a central EHR problem reporting capability would be. Customers find problems that either aren’t reported to vendors or aren’t fixed by them, meaning patients in potentially hundreds of locations are put at risk because of what their caregivers don’t know about an IT problem.
If the objective of improving usability is to reduce patient risk, why not have a single organization receive and aggregate EHR problem reports? It could be FDA, Joint Commission, ONC, NIST, or a variety of government or non-profit organizations. Their job would be to serve as the impartial intermediary between users and vendors in identifying problems, identifying their risk and severity, alerting other users of the potential risk, and tracking the problem through to resolution.
The NIST document cites draft guidance from FDA on usability of medical devices. It could be passionately argued either way that clinical IT systems are or aren’t medical devices, but the usability issues of medical devices and clinical IT systems are virtually identical. Since FDA has mechanisms in place for collecting problem reports for drugs and devices, making sure vendors are aware of the issues, and tracking those problems through to resolution, it would make perfect sense that FDA also oversee problem reports with software designed for clinician use. This oversight would not necessarily need to involve regulation or certification, but could instead be more like FDA’s product registration and recall process.
The document highlighted some issues that I’ve had personal gripes about in using clinical software, such as applications that don’t follow Windows standards for keystrokes and menus and those that don’t support longstanding accessibility guidelines for the disabled.
Choosing Expert Reviewers and Conducting a Usability Review
Chapter 6 talks about the expert review and analysis of EHR usability. So who is the “expert” involved in this step? It’s not just any clinician willing to volunteer. The “expert” is defined as someone with a Master’s or higher in a human factors discipline and three years’ experience working with EHRs or other clinical systems.
The idea that clinicians are the best people to (a) design clinical software from inception to final product, or (b) assess software usability ignores the formal discipline of human factors.
Validation Testing
Chapter 7 describes validation testing. It explains upfront that this refers to “summative” user testing, meaning giving users software tasks to perform and measuring what happens. It’s strictly observational. “Formative” testing occurs in product development, where an expert interacts collaboratively with users to talk through specific design challenges.
Validation testers, the document says, must be actively practicing physicians, ARNPs, PAs, or RNs. Those who have moved to the IT dark side aren’t candidates, and neither are those who have education in computer science.
How many of these testers do you need? The document cites studies that found that 80% of software problems can be found with 10 testers, while moving to 20 testers increases the detection rate to 95%. FDA split the difference in proposing 15 testers per distinct user group (15 doctors, 15 nurses, etc.).
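Those figures line up with the standard problem-discovery model, which assumes each tester independently finds a given problem with some fixed probability. A minimal sketch (the per-tester probability p is an assumption; p ≈ 0.15 happens to reproduce the cited 80% and 95% figures):

```python
# Problem-discovery curve (Nielsen/Landauer-style model), assuming each
# tester independently finds a given problem with probability p.
def proportion_found(n_testers: int, p: float = 0.15) -> float:
    """Expected share of usability problems found by n independent testers."""
    return 1 - (1 - p) ** n_testers

for n in (5, 10, 15, 20):
    print(f"{n:2d} testers -> {proportion_found(n):.0%} of problems found")
# With p = 0.15, 10 testers find ~80% and 20 find ~96%,
# consistent with the figures the NIST document cites.
```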
The paper notes that EHRs “are not intended to be walk-up-and-use applications.” Their users require training and experience to master complex clinical applications. The tester pool, then, might include (a) complete EHR newbies; (b) those who have experience with the specific product; and (c) users who have used a competing or otherwise different EHR.
Tester instructions should include the fact that in summative testing, nobody’s asking for their opinions or suggestions. They are lab rats. Their job is to complete the defined tasks under controlled conditions and observation and nothing more. They are welcome to use help text, manuals, or job aids that any other user would have available to complete the defined tasks.
The NIST report listed other government software usability programs, including those of the FAA, the Nuclear Regulatory Commission, the military, and FDA.
EHR Review Criteria
Appendix B is a meaty list of expert EHR review criteria. This is where the report gets really interesting in a healthcare-specific way. It’s just a list of example criteria, but if you’re a software-using clinician, you can immediately start to picture the extent of the usability issue by seeing how many of those criteria are not met by software you’re using today. Some of those that resonated with me are:
- Does the system warn users when twins are admitted simultaneously or when active patients share similar names?
- If the system allows copying and pasting, does it show the viewer where that information was copied from?
- Does the system have a separate test environment that mirrors the production environment, or does it instead use a “test patient” in production that might cause inadvertent ordering of test orders on live patients?
- Does a screen require pressing a refresh button after changing information to see that change fully reflected on the screen?
- For orders, does the system warn users to read the order’s comments if they further define a discrete data field? (example: does a drug taper order flag the dose field to alert the user that the taper instructions are contained in the comments?)
- When a provider leaves an unsigned note, are other providers alerted to its existence?
- Do fields auto-fill only when the typed-in text matches exactly one choice? (See the sketch after this list.)
- Can critical information (like a significant lab result) be manually flagged by a user to never be purged?
- Are commas automatically inserted when field values exceed 9999?
- Are “undo” options provided for multiple levels of actions?
- Is proper case text entry supported rather than uppercase-only?
- Do numeric fields automatically right-justify and decimal-align?
- Do error messages that relate to a data entry error automatically position the cursor to the field in error?
- Do error messages explain to the user what they need to do to correct the error?
- Do data entry fields indicate the maximum number of characters that can be entered?
- Are mandatory entry fields visually flagged?
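To make one of those criteria concrete, here’s a minimal sketch of the auto-fill rule above: complete the field only when the text typed so far matches exactly one choice. The function name and drug list are illustrative only, not from the NIST document.

```python
# Auto-fill only on a unique match; ambiguous prefixes leave the field alone.
def autocomplete(prefix: str, choices: list[str]) -> str | None:
    """Return the single matching choice, or None if the prefix is ambiguous."""
    matches = [c for c in choices if c.lower().startswith(prefix.lower())]
    return matches[0] if len(matches) == 1 else None

drugs = ["warfarin", "warfarin sodium", "hydralazine", "hydroxyzine"]
assert autocomplete("hydra", drugs) == "hydralazine"   # unique match: fill
assert autocomplete("hydr", drugs) is None             # ambiguous: don't fill
assert autocomplete("warfarin", drugs) is None         # still two candidates
```

The last assertion shows why the rule matters clinically: sound-alike drug names mean an eager auto-fill can silently select the wrong medication.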
My Random Thoughts
Usability principles would ideally be incorporated in early product design. Retrofitting usability to an existing application could require major rework, which may be why some vendors don’t measure usability – it would simply expose problems that the vendor is unwilling or unable to address.
On the other hand, many usability improvements don’t require heavy-duty programming or database changes. The main consideration would be, ironically, the need for users to be re-trained on the user interface (new documentation, new help text, etc.)
Usability can be measured, so does that mean there is “one best way” to do a given set of functions? Or, given that users are often forced to use a variety of competing CPOE and nurse documentation systems, is it really in the best interest of patients that each of those vendor systems has a totally different user interface?
Car models have their own design elements to distinguish them commercially, but it’s in the best interest of both the car industry and society in general that placement of the steering wheel and brake pedal is consistent. With PC software, this wasn’t the case until Windows forced standard conventions and the abandonment of bizarre keystroke combinations and menus.
I always feel for the community-based physician who covers two or more hospitals and possibly even multiple ambulatory practice settings, all of which have implemented different proprietary software applications that must be learned. This issue of “user interoperability” is rarely discussed, but will continue to increase along with EHR penetration.
From a purely patient safety perspective, we’d be better off with a single basic user interface for a given module like CPOE, or even a single system instead of competing ones (the benefits of the VA’s single VistA system spring immediately to mind). It’s the IT equivalent of a best practice. Usability can be measured and compared, which means that if there are 10 CPOE systems on the market, the patients of physician users of nine of them are being subjected to greater risk of harm or suboptimal care.
Usability testing does not require vendor participation or permission. Any expert can conduct formal usability testing with nothing more than access to the application. Any third party (government, private, or for-profit) could conduct objective and meaningful usability assessments and publish their results. It’s surprising that none have done so. They could make quite a splash and instantly change the dialogue from academic to near-hysterical by publicly listing the usability scores of competing products.
Conclusion
Read the report. It’s not too long, and much of it can really be skimmed unless you’re a hardcore usability fan. If nothing else, at least read the two-page executive summary.
For the folks who express a strong reaction to the word “usability” while clearly not really knowing what it means, the report should be comforting in its objective specificity.
Even though the document is open to public comment, there really isn’t much in it that’s contentious or bold. It’s just a nice summary of usability design principles, with no suggested actions or hints of what future actions (if any) are being contemplated.
I’m sure comments will be filed, but unless they are written by usability experts, they will most likely address not the actual paper, but rather the role the government may eventually take with regard to medical software usability.
It should also be noted that no product would register a perfect usability score, and that humans are infinitely adaptable and will learn to work around poor design without even thinking about it. In some respects, usability is less of an issue with experienced system users who have figured out a given system’s quirks and learned to work capably (even proudly) around them.
This document really just provides some well-researched background on usability. The real discussion will involve what’s to be done with it.
Let’s hear your thoughts. Leave a comment.
Well then, what I liked most in the executive summary was the phrase “use error,” in opposition to “user error.”
“User error” is what the vendors have, you know, coached hospital administrations to tell the doctors who complain when the mistakes cause their patients to die.
It is very bad for patient care, you know, for the doctors to be slowed by many clicks to order an aspirin, and then to sign off on the stupid decision support about aspirin. Do these devices’ programmers think doctors are that stupid?
If nothing else, you know, this report confirms that there are many very serious problems, with meaningful errors being produced by bad usability.
To my surprise, you, the writer of this blog, have softened your position on the role of the FDA.
BUT, you know, this report by NIST goes no place fast without enforcement, if you know what that means.
Nice summary of the NIST work! Thanks! And a call out to Shelley Myers, who went to work for UserCentric in Chicago and planted the seed of inspiration and awareness that led to this report. Very cool to see this issue of usability progress over the past few years.
A report like this is long overdue – a decade or more overdue, in fact. That HITECH was passed at the same time NIST was studying usability (admitted to be a problem by HIMSS and others) and while the IOM was studying safety is quite the cart before the horse.
That said, my concerns are:
1. The mention of the paper “The Benefits Of Health Information Technology: A Review Of The Recent Literature Shows Predominantly Positive Results” (Health Aff, March 2011, vol. 30, no. 3, 464-471), by ONC (footnote 17), as “emerging evidence that the use of health information technology (HIT) may help address significant challenges related to healthcare delivery and patient outcomes.”
This paper has severe methodological flaws; notably, the authors had no methodologic standards whatsoever for article inclusion, and the review included qualitative studies that were probably not meant to be evaluative, as well as observational studies subject to severe methodologic bias.
Much more on this issue at: http://hcrenewal.blogspot.com/2011/03/benefits-of-health-information.html
2. Is NIST the appropriate venue for what appears to be a very costly undertaking for the health IT industry? While the report appears sound, does NIST have regulatory authority and/or experience in healthcare/healthcare IT? It seems a far more appropriate venue is FDA, which has both.
Mr. HIStalk wrote:
I’ve seen examples from hospitals I’ve worked in where patients died from mistakes that software either caused or could have prevented.
Thanks for this candid statement.
Here’s a sort of “Diary of EHR-Initiated Tragedy” of my own: http://www.ischool.drexel.edu/faculty/ssilverstein/cases/?loc=cases&sloc=diary
A very cautionary tale.
Disclosure: I’m a practicing front-end designer and application developer.
I must start with a question: what is the purpose of this document? Is the goal of the article to educate vendors on best practices? Will this report serve as a foundation for evaluating the usability of EMR systems that will be published to consumers? Without this endpoint in mind, it’s hard to judge the merit of the report.
With that said, the methodologies outlined in the report contain little groundbreaking information; they are a rehash of what is considered best practice in the field of human-computer interaction. My major problem with the report is the emphasis placed on a “waterfall model of design” (see Figure #1).
First and foremost, good design is a process. It is not something that can be done once or injected at the very end of the development cycle. You can’t put “lipstick on a pig” and call your software usable. The best design companies have taken these principles to heart very early in the design process by constantly testing and modifying their product. Design is iterative. When you wait until the very end of the development process to run usability tests with your users, you’re going to find problems – lots of problems. At that point, what is a vendor going to do? Miss a shipment deadline or ship less “usable” code? I’m going to guess the latter. Add to that, usability testing will often uncover the need for major code rewrites, sometimes requiring starting from scratch. Don’t make the mistake of assuming usability testing is going to uncover only aesthetic problems. This report needs to place a greater emphasis on an iterative, rather than a once-through (waterfall), design process.
If you’re going to improve vendor design, it has to begin with an internal commitment to value designers and what they contribute to the product development process. They cannot be an afterthought. There are very few companies with this mindset. Almost all HIT companies are developer-driven, so the first thought, and one that is promoted in this report, is to turn developers into designers. This will not work! A developer and a designer require two fundamentally different skill sets that are not easily transposed. Developers are trained to think rationally and analytically; design requires empathy for the user.
Trained designers have a hard enough time in HIT because most of them are not clinicians. It’s much easier for a designer to create a new social network because we have all used them; it’s much harder for HIT designers to judge the value of a solution. That is why it is crucial to have a clinician as part of the design process. They can add the insights that the designer could never see. It’s hard enough for a trained designer to develop useful solutions in the HIT space, let alone asking a developer to do the same.
I don’t think that vendors are going to change to this designer-centric development process voluntarily, so how is this report going to push them?
Some other thoughts:
Why test applications in a sterile user testing environment? One of the biggest problems in HIT is that software is not designed with the clinical workflow in mind. How often have you seen a software tool that supports the interruptions that are common in a clinician’s day? By keeping the software in a controlled environment, everything will look fine, but once it’s in the wild, the problems will show themselves.
The highlight of the report for me was the HIT examples in the Appendix. They did a great job of capturing healthcare use cases.
When testing task completion times, why not assume expert users? If expert users are assumed, then software can be used to model the time to completion based on spacing of buttons, number of clicks, system lag, keystrokes, etc. This would give accurate, reproducible completion time estimates across vendors. Right now, untrained EMR users will be the test subjects, but as stated in the report, EMRs are not “walk-up-and-use applications.” My guess is that without training, they will have difficulty even completing the tasks!
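For example, a Keystroke-Level Model estimate (in the spirit of Card, Moran & Newell) sums standard operator times to predict expert completion time. A minimal sketch, with illustrative operator values and a hypothetical task breakdown:

```python
# Keystroke-Level Model (KLM) sketch: predicted expert completion time is
# the sum of standard operator times. Values and the task breakdown below
# are illustrative, not taken from the NIST document.
KLM_SECONDS = {
    "K": 0.20,  # keystroke or button press
    "P": 1.10,  # point with mouse to a target
    "B": 0.10,  # mouse button press or release
    "M": 1.35,  # mental preparation
    "R": 0.50,  # assumed system response (lag) per step
}

def klm_estimate(operators: str) -> float:
    """Total predicted time for a sequence of KLM operators, e.g. 'MPBK'."""
    return sum(KLM_SECONDS[op] for op in operators)

# Hypothetical "order aspirin" flow: think, point, click, type 7 keys,
# then think, point, click submit, and wait for the system to respond.
print(f"{klm_estimate('MPB' + 'K' * 7 + 'MPBR'):.1f} seconds")
```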
Nice overview. I would agree. One area that I would have liked the framework to address is the capturing and cataloging of “specific goals.”
In their definition of usability they do a great job talking about the other components of the definition but there is not even a sentence on identifying specific goals.
IMHO, the ambiguity of the definition of the specific goals the user is trying to accomplish is one of the big root causes of poor usability (especially adding in the misalignment of incentives to the “desired” goals).
Best,
Steven
My reactions:
1- I am in favor of better usability, but as you and others point out, what is ‘usable’ today may not be tomorrow. This is particularly true in medicine and the delivery of health care. New technologies and protocols are developed and promoted every month, and what you build for today’s care delivery will clearly be usably obsolete by next year. If usability becomes a requirement for vendor ONCHIT certification, providers had better get ready for astronomical support fees.
2- Getting ONC certified today (meeting some 50 or so criteria) can take up to two days of ATCB testing, and it is usually done by non-healthcare staff, some with little or no field experience. (In one client case I was recently involved with, the tester asked, “What’s an MPI and what’s it used for?”!) If they include usability criteria, I predict at least a week’s effort to get through a certification test.
3- And as anyone who has ever designed a system or written a line of code knows, ‘the devil is in the details’ – and I can see a few million devils lurking here.
The devil is in the details when taking care of patients. This report elucidates the dangers that patients face 24/7 when doctors use EMRs.
The FDA ought to be called upon to enforce the usability criteria, and there must be aftermarket surveillance for injuries, deaths, and near misses.
Frank Poggio writes:
I am in favor of better usability, but as you and others point out, what is ‘usable’ today may not be tomorrow. This is particularly true in medicine and the delivery of health care. New technologies and protocols are developed and promoted every month, and what you build for today’s care delivery will clearly be usably obsolete by next year.
Can you give some examples of how changes in healthcare would affect the usability of a health IT application that was very usable from the get-go per the guidelines in the NIST document?
MIMD,
Before I give an example let me first say in my view there are two levels of ‘usability’. The first deals with the simple issues of clicks and screens and other basic operational tasks. I do not see a problem with this level of usability.
The second, and where I see the ‘city of devils’, deals with workflow. Workflow usability is just as important as the first. Some examples: usability for a pediatric specialist is very different than for, say, an ophthalmologist, a psychiatrist, or an anesthesiologist. Nursing workflow for special care units is different from that for routine care units, and so on…
And how about usability for drug interactions? Whose library is ‘right’? What do you show on a screen (and in what format, with what urgency, etc.) when one library says minor conflict but the other says potential critical conflict? What’s a systems designer to do? If you want the developer to build usable applications, somebody first has to sort out the inherent conflicts that exist in medicine.
The NIST document pretty much admits this in saying, “Validation testers, the document says, must be actively practicing physicians, ARNPs, PAs, or RNs.” So how can we expect a GP to validate usability for a nephrologist? The quick answer is, well, get one (or more) of each and let them be the testers. Good luck with that approach. And by the way, it is not uncommon to see a surgeon from Stanford want a different workflow than the surgeon from Duke. Which is right? Which is better? I don’t know, but maybe that is why they call it ‘the practice of medicine’. Anyway, I digress… but try designing systems and writing code to address these millions of variations. See you next millennium.
My biggest concern is that the feds will take the same approach to usability they took from the start of the ONC certification program: pick a bunch of general criteria, throw them out there as quickly as possible, and hope for the best. If that happens, you’ll get the medical equivalent of the many little problems we have in the current certification process. Like, why do we need to capture date of death and cause of death in a demographic (registration) process if you are selling a radiology system? That seems to me the epitome of lousy usability and workflow. For the current rather simple criteria, I have a dozen more examples.
Thanks for your review and for bringing more attention to the issue of usability.
I am concerned by your comment: “If the objective of improving usability is to reduce patient risk, why not have a single organization receive and aggregate EHR problem reports? It could be FDA, Joint Commission, ONC, NIST, or a variety of government or non-profit organizations. Their job would be to serve as the impartial intermediary between users and vendors in identifying problems, identifying their risk and severity, alerting other users of the potential risk, and tracking the problem through to resolution.”
I hope you are not suggesting that all EHR problem reports be sent to an oversight organization. Sending all problem reports to a third party would seem akin to video review of all plays in an NFL game. If a vendor, as judged by its users, is doing a good job of identifying, categorizing, prioritizing, communicating, tracking and resolving problems (along the lines of the VHA example cited in the report), why penalize them or add another layer of complexity? If the two parties, vendor and user, can reach agreement, why involve a mediator? Resources should be focused where they will provide benefit and not encumber what already is working well.
I’ve long argued usability was a much greater barrier to widespread adoption than most Health IT organizations were thinking. Now that we’re making some degree of progress on interoperability, HIEs, etc. it’s nice to see the industry putting more attention on this important issue.
Our understanding via the HIMSS usability task force is that this RFC they’ve put out will be the basis for upcoming meaningful use stage 2 criteria around usability.