Richard Cramer is chief healthcare strategist for Informatica of Redwood City, CA.
Give me some background about yourself and about Informatica.
I am Informatica’s chief healthcare strategist. I’ve been on board about 10 months now. Formerly I was the associate CIO for operations and health exchange at UMass Memorial Healthcare in beautiful Worcester, Mass. I was there for a little over two years. I spent the prior 10 years in the software business doing strategy and marketing for software companies, healthcare, and whatnot. I ran corporate and industry marketing for SeeBeyond for four and a half years.
Before that, I was the director of applications development at the University of Pennsylvania Health System. I’ve been on the provider side and the vendor side, back and forth, over the course of the last 15 years. I’m now pretty excited to see where healthcare is. I’ve waited 15 years for healthcare IT to finally be cool.
Informatica was founded in 1993. It spent probably the first 10 or 11 years establishing a dominant place in the extract transform load marketplace, supporting data warehousing. We brought in a new CEO from Oracle in 2004, Sohaib Abbasi. Over the course of the last eight or nine years, we have branched out from our core beginnings in extract transform load to being what we say now is the leading independent data integration vendor in the marketplace. We moved from simply doing batch loads into data warehouses to including data quality, real-time transformation, business-to-business, master data management, archiving, and a whole slew of other things.
In its current incarnation, Informatica is a comprehensive data integration vendor with a horizontal focus to date, with 4,200 customers or so. Eighty-four of the Fortune 100 use our solutions in various capacities. Even though we’re relatively new to having a dedicated team focused on healthcare, we’ve got well over 100 healthcare enterprises that are Informatica customers. They acquired our solutions by virtue of looking for technology and licensing Informatica rather than because we had a dedicated focus on the healthcare market, which is really new in the last year.
When you look at healthcare specifically, who would you say are your main competitors?
Looking at healthcare specifically, our main competitor — and it’s not just healthcare specifically — is IBM. If you look at the suite of products that we have and the nature of those products, really the only big competitor we have for ETL or any of those is IBM at an enterprise level. That certainly became even more true when IBM acquired Initiate and brought them into the IBM master data management family. That’s our primary competitor.
We do run across organizations that are very much SQL Server shops and use the Microsoft stack, but those tend to be the smaller organizations, or we tend to be talking to people that have been using that and now see that they need something a bit more powerful, and then it’s really us or IBM.
Healthcare hasn’t been very fastidious about creating and managing information that could be valuable for managing outcomes, costs, and risks. A lot of times the best data anybody has is claims data, which is like a manufacturer trying to run a business using only information from its invoicing system. When you look at all the proprietary systems that are creating and consuming data oblivious to all the others that might need that data, do you think there is any chance all this can get resolved in a way that will allow healthcare organizations to meet healthcare quality and cost expectations?
I could not have described to you more or better why I joined Informatica. I absolutely think that’s going to happen in healthcare, and I absolutely think that Informatica has the platform required to achieve that.
I’ve been on the software vendor side long enough to know that you don’t go to a horizontal technology company and say, “You’ve got to build a bunch of healthcare-specific applications if we’re going to sell anything into the healthcare market.” The fact is that healthcare has finally woken up to the value of the data that they’re going to have. I don’t really think it matters what your political persuasion may or may not be. What the Obama administration did with HITECH and Meaningful Use is to finally get providers to adopt electronic health records. Finally we have the data available to do cool stuff with.
Meaningful Use is a useful microcosm of what’s going to happen on a much grander scale for healthcare data, because Meaningful Use really is nothing more than a data quality standard mandated by the government. They say, “Here are the data elements you have to collect. Here is the format you must collect them in. Here is who must enter those data elements. Here are the relationships between those data elements.”
By doing that, just in that one small section of the data that’s really available, what the government did is say, “Here is going to be high quality data.” We see the effect in healthcare organizations that previously had never done anything that resembled a quality report or a physician comparison report because the data was never accurate enough. What happens when you have bad quality data? You don’t share it, because you get eviscerated for the data being bad.
Even the most conservative provider organizations — because the Meaningful Use data that they’ve created is pretty good — are publishing those reports for all physicians to see, because the data is actually trustworthy. It is an interesting example of how high quality data in a clinical information system gets democratized because it is high quality.
EHRs are exciting because they actually collect data, not because they replace paper. Once that data is available and accessible, taking techniques and tools and things that were groomed over the past decade following SAP implementation for Y2K and using those to make high-quality, trustworthy data from healthcare systems is the whole opportunity, I think.
You mentioned that Informatica offers the platform, but unlike your previous employers that were really about the nuts and bolts and bits and bytes of moving data back and forth, is there some organizational commitment and expertise of being stewards of that data more than just moving it around electronically?
Yes, exactly. That is a very good counterpoint. Healthcare enterprises have been using interface engines for decades. Healthcare was actually at the forefront of adopting real-time interface technology. It was great at shifting data from one system to the other. For HL7, when is a standard so flexible that it’s not a standard? I don’t know that anybody has any real sense of the data quality problems that exist within those real-time messages, but it worked adequately.
If you look at the larger data integration challenge, though, not all of the data we care about in an analytical context is exposed through an HL7 message. We do HL7 messaging just fine. All of the libraries are supported, and it’s actually relatively easy to do HL7 when you do everything else. But you also have the option to say, “I can go directly against the database and pull the data out of the database en masse after profiling it to ensure the quality, using all of those sophisticated tools.”
Part of the challenge is we’ve got new electronic systems, but not all of them were designed to even have the triggers within the application to expose the data outbound. We were an Allscripts Enterprise shop when I was at UMass, and three years ago, Allscripts didn’t send any transactions out of Allscripts Enterprise. They just had never considered that their EMR was actually going to be a source of data to other people. I mean, shockingly. A fine company, no complaints about them, because I think they are representative of a lot of the thinking three to five years ago. We’ve got a whole series of older clinical applications where they didn’t even have the event model to send data out on HL7 messages.
Being able to connect directly to those databases and those applications and get data out other ways — when it changes in the database, send it out — is the big part of the story. Then the data quality component that says, “How do I do the profiling and the rules-based cleanup and all of those things to make sure that the data that we are transacting and we are getting from one system and moving to another and moving to a database or a data warehouse is of high quality every single time?”
The last component is the idea of master data management. Healthcare providers and even healthcare payers have been very familiar with enterprise master patient indexes. If you said master data management to a provider IT person, they might not be that familiar with it. They absolutely know what an enterprise master patient index is.
Our particular solution for master data management says if you can model the data, we can manage it as master data. If you look at other people, they built very traditional vertical applications on top of a specific domain, like “patient” or a specific domain like “provider.” We think that patient and provider are not adequate in terms of managing master data in the future. You need patient, provider, organization, health plan, physical location, and a whole slew of different things. More importantly, you also need to manage the relationships between the elements as master data.
For example, it’s not enough to know that Richard Cramer is a unique patient and Bob Smith is a unique doctor. We think that it’s important to know that Richard Cramer has Bob Smith as his primary care physician. That relationship data is as dirty as any other data in the enterprise. Being able to do traditional master data management things where you say, “I’m going to automatically reconcile relationships where I can. Where I can’t automatically reconcile, I’m going to put it in a task list and a data steward is going to look at it and they are going to manually resolve it just like you would patient or provider identity,” we think is key.
The whole idea of pervasive data quality is a key part of what we think is going to be a huge enabler to the healthcare analytics and the data decade in healthcare, as I like to call it.
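The relationship reconciliation described in this answer can be illustrated with a minimal Python sketch. It is not Informatica's product or API. The `Relationship` record, the `reconcile` function, and the rule "auto-accept only when every source system agrees" are all assumptions made for illustration; a real master data management tool would likely use richer matching rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """One system's assertion about a relationship (hypothetical record shape)."""
    patient_id: str
    provider_id: str
    kind: str    # e.g. "primary_care_physician"
    source: str  # the system that asserted the relationship


def reconcile(relationships):
    """Auto-resolve relationships where all sources agree.

    Conflicting assertions are placed on a task list for a data steward
    instead of being guessed at.
    """
    grouped = {}
    for rel in relationships:
        grouped.setdefault((rel.patient_id, rel.kind), []).append(rel)

    resolved = {}
    steward_tasks = []
    for (patient_id, kind), rels in grouped.items():
        providers = {r.provider_id for r in rels}
        if len(providers) == 1:
            # Every source names the same provider: reconcile automatically.
            resolved[(patient_id, kind)] = next(iter(providers))
        else:
            # Sources disagree: queue for manual review.
            steward_tasks.append({
                "patient_id": patient_id,
                "kind": kind,
                "candidates": sorted(providers),
                "sources": sorted(r.source for r in rels),
            })
    return resolved, steward_tasks
```

Calling `reconcile` on assertions from, say, a registration system and an EHR returns the agreed relationships plus a list of conflicts, which mirrors the automatic-then-manual workflow already familiar from patient identity matching.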
When you look at your previous career as well as where healthcare evolved from, do you think interface engines have made us complacent about standards and metadata?
I think they did. I think that interface engines allowed us the luxury of sharing data very easily between applications in a transaction-by-transaction way. One of the beauties of coming from the ETL world is that when you’re moving data en masse from one place to another, you have the great luxury of, “Wow, I’m going to move 400 million rows. Let me profile it and look at all of it in its entirety before I move it.” You really get a data quality bent about you starting from ETL.
With real-time interface engines, particularly since HL7 was so flexible and all of the different applications interpreted what an individual field meant in Z-Segments and all of that, you were driven to an approach that said, “When I’ve integrated to one Cerner Millennium, I’ve integrated to one Cerner Millennium.” You looked at it not only at an individual system-to-system level, but you looked at it at an individual transaction level. I worked in my interface engine until it passed the edits to be accepted by the target system. It was a very different style of work when you were focused on passing transactions as opposed to looking at the data in aggregate.
People are trying to exchange data, not just internally, but outside the four walls. Is that raising the bar for people to produce better quality data, or does that just make it obvious that we’re nowhere near where we need to be when it comes to being ready to exchange patient information meaningfully?
I think it’s the latter. I hope it’s going to move to being the former. All of those same problems that you have integrating and sharing data within the four walls — different formats, different standards, and questionable data quality — become much more complicated.
The data is much more fragmented when you try and go between organizations. I think that’s why you see so few organizations actually exchanging discrete data. They tend to exchange paper documents or a document like a CCD, but they don’t standardize the nomenclature in it, so you don’t consume the data into a receiving application through most HIEs yet. It’s all driven by the exact issue that you just raised.
If we wanted to share Meaningful Use data — and I think there is some hope that for the subset of the CCD that needs to be interoperable — I think there will be some real success in sharing that, again, because the data is high quality and trusted.
With HL7 interfaces, provider organizations had to figure out their own solutions and their interfaces really weren’t very transportable. In the case of general data exchange, does patient data need new standards and requirements, or will every provider have to figure it out for themselves?
I think there will be new standards, or there will be an adoption of some standards, with HITECH and Meaningful Use really defining the nomenclature that systems need to exchange data. I think it really was the varied nomenclature within the actual segments of a message that caused so many problems. You know, the RxNorm versus the MEDCIN versus the whatever for prescription drugs.
The structural differences in the message are very easily handled. The nomenclature things are very difficult to handle. From an exchange perspective, I think that’s going to help us a great deal. I have a great deal of enthusiasm for the CCD being a very good start to interoperability. Certainly it is not all inclusive and complete, but if we can get to the point where we can exchange the CCD, we will have fixed enough problems that exchanging more stuff after that will be easier.
The other piece that’s challenging and an example from my former life is the actual data elements within the applications. This speaks to the whole governance issue within the enterprise, because it’s not just the transaction. If you look at any enterprise system within a health system that’s been around for any period of time, people are misusing the data fields that are in the application to support other purposes that were never intended.
In a perfect example at UMass, in the registration record, there is a time stamp field. You’re going to do quality studies that look at the amount of time it takes from the time a patient is registered until they’re admitted to the floor. You go in and you try and do a report, because there’s a time stamp field in the application. One of the organizations did that report. They spent weeks and weeks, they ran the report, they looked at the results, and said, “Wow, these results make absolutely no sense.” They looked at the data in the time stamp field and said, “That doesn’t look like time.” They talked to the registrars in the emergency department and, lo and behold, they were putting the license plate number of the patient’s car in the time stamp field so the valets could find it.
It’s scary that they could even access a time stamp field.
In a lot of old applications, it’s a character-based field. Nobody was using it for anything else and there was no governance to enforce it, so somebody probably put in a request and said, “Hey, relax the edits on this field because I want to do this with it.” Ten years ago, it probably seemed a good idea, and off it went.
Those examples are rampant within every application that’s out there. Even if you have an HL7 message that’s drawing from the fields within the application, if you haven’t done a good enterprise data governance program and you haven’t inspected all of those applications and have good metadata management and data stewardship, you’re going to constantly run across those particular kinds of issues.
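The kind of field inspection described here can be sketched as a simple profiling pass run before anyone builds a report on the field. This is a minimal Python illustration, not a description of any vendor's profiling tool. The function name `profile_timestamp_field`, the expected format string, and the sample values are all hypothetical.

```python
from datetime import datetime


def profile_timestamp_field(values, fmt="%Y-%m-%d %H:%M"):
    """Report how many values in a character field actually parse as timestamps.

    `fmt` is an assumed expected format; a real field would need its own.
    """
    invalid = []
    for value in values:
        try:
            datetime.strptime(value.strip(), fmt)
        except ValueError:
            # Not a timestamp in the expected format (could be anything,
            # including a license plate number).
            invalid.append(value)

    total = len(values)
    valid = total - len(invalid)
    return {
        "total": total,
        "valid": valid,
        "valid_ratio": valid / total if total else 0.0,
        "invalid_samples": invalid[:5],  # a few examples for the data steward
    }
```

A low `valid_ratio` on a field that is supposed to hold times is an early signal that the field has been repurposed, and it surfaces in minutes rather than after weeks of report building.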
Data quality is about making the simple questions simple to answer. If every time you go to use a data element in an application you have to go through an enormously laborious effort to confirm that it’s reliable and clean it up, and you do it just for that one project or that one thing, you can’t answer even simple questions, much less talk about all of the exciting things that we can do with the data.
From my perspective, one of the least-appreciated challenges in healthcare goes back to what you started with, which is: are we ever going to get to where we use the data to profile quality, identify best practices, and improve value? I genuinely believe we are, but the least-appreciated thing to get us there, I believe, is data quality.
You mentioned the responsibility to manage the data and understand how it’s being used. Who would do that in a typical hospital and under whose governance?
Today, the responsibility doesn’t exist. I think other industries have seen that to do data governance, it needs to be an enterprise initiative with a broad membership and very strong leadership that reports high in the organization. In a healthcare provider organization, by and large those organizations don’t exist. People who have an EMPI have traditionally put data stewardship in the HIM group. That’s fine for patient identity. It’s not fine for all the other data elements.
Payers tend to be ahead of providers in this and have really stood up an executive level data governance and data stewardship function because that’s the only way to do it. It has to be an enterprise initiative. It has to be senior people. It has to have the highest level of support in the organization, and that doesn’t exist. I have not seen a provider system that does it well yet.
Are hospital data projects strategic enough to merit the funding and effort it would require to do it right?
Not yet, but they have to be. I think part of this is the evolution that says, when the only data you have to work with is claims data, for all the reasons that you said, you’re only going to be able to do so much with it. You’re only going to make so much of an investment and you’re not going to get a lot of horsepower out of it.
Now that we’ve got the keys to the kingdom being captured and generated in those EHRs, the stakeholders, the clinicians who we’ve pounded on for years to say, “Hey, you need to do this,” are going to say, “I’m doing your data entry for you at great personal expense of my own. Now I want some results from it.” The providers and the business are going to raise the visibility and say, “We’ve invested all this time and effort in our EHRs and our new financial systems and everything. We want to get some value out of it.” The only way they’re going to get value out of it is to elevate data governance to where it needs to be and invest in getting value from the data. If all we do as a healthcare industry is replace paper with electrons by doing EHRs, we will have failed miserably.
Any concluding thoughts?
An interesting topic for the future is the field of complex event processing. It started in the intelligence business to correlate all of these disconnected events against different data streams to be able to draw a conclusion and give alerts to people that, “Hey, you ought to probably be looking at people taking flying lessons and not caring about whether they know how to land or not.”
I see that there is a big opportunity for complex event processing in the healthcare market. Part of it is driven by our historical success with real-time messaging. Healthcare is going to follow the same dynamic as other industries did when they replaced all their ERP systems for Y2K: afterward there was a huge renaissance and blooming of analytics and data warehousing, driving value from all that rich supply chain data they now had.
Healthcare is going to follow the same path on the back of EHRs, I believe, and hopefully do it in a more expedient manner. The amount of time it’s going to take healthcare organizations to get the data, ensure it’s high quality, put it in a data warehouse, and start to do really powerful, compelling things with it is still going to be counted in years.
In the interim, CIOs and business executives aren’t going to wait two, three, or four years to start getting value from their investments in all those new systems, particularly given the competitive environment. With access to real-time messaging streams plus access to data that lives in databases, the ability to deliver real-time clinical and business decision support using complex event processing techniques to me is a fantastic way for executives to deliver real value to their business and clinical users before their data warehouse is ready.
An example of that would be something in an academic medical center. One of the most frequently challenging things to be able to do is to say, “When is a patient scheduled or when is a patient in-house that meets the criteria for my study so that I can go in and recruit them to be in my study before they’re discharged or before they leave the doctor’s office?”
In a normal organization, that’s a really difficult challenge to meet, because you’ve got registration data, you’ve got past claims data for billing history, you’ve got the laboratory system for some studies, and you’ve got the scheduling system for when the patient is going to be in-house. In the CEP world, if you can get to any of that data through your regular HL7 transactions — which you absolutely can — you can simply configure a real-time alert to go by e-mail to that end user and solve that question for them.
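The study recruitment example can be sketched as a small event-correlation rule in Python. This is a toy illustration of the idea, not a real complex event processing engine and not any vendor's product. It assumes the HL7 transactions have already been parsed into plain dictionaries. The event types, field names, lab code, and thresholds are all hypothetical, and the alert is returned as a string rather than sent by e-mail.

```python
from collections import defaultdict


class StudyRecruitmentRule:
    """Correlate events from several feeds and alert once per eligible patient."""

    def __init__(self, min_age, lab_code, lab_threshold):
        self.min_age = min_age
        self.lab_code = lab_code
        self.lab_threshold = lab_threshold
        self.facts = defaultdict(dict)  # per-patient facts gathered so far
        self.alerted = set()            # patients already flagged

    def on_event(self, event):
        """Update what is known about the patient, then re-check the criteria."""
        pid = event["patient_id"]
        known = self.facts[pid]
        if event["type"] == "registration":
            known["age"] = event["age"]
        elif event["type"] == "lab_result" and event["code"] == self.lab_code:
            known["lab_value"] = event["value"]
        elif event["type"] == "admission":
            known["in_house"] = True
        elif event["type"] == "discharge":
            known["in_house"] = False
        return self._check(pid, known)

    def _check(self, pid, known):
        eligible = (
            known.get("age", 0) >= self.min_age
            and known.get("lab_value", float("-inf")) >= self.lab_threshold
            and known.get("in_house", False)
        )
        if eligible and pid not in self.alerted:
            self.alerted.add(pid)
            return f"Patient {pid} is in-house and meets study criteria"
        return None
```

Each incoming event is passed to `on_event`. The rule returns an alert message the first time a patient satisfies all criteria while in-house and `None` otherwise, so the researcher hears about the patient before discharge rather than from a later report.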
I think there are probably hundreds of those specific little things that people want to be able to do. I don’t know that there is one grand slam home run CEP use case that everybody would say, “Oh, I’ve got to have it.” But I think being able to put real-time decision support in the hands of clinical analysts and financial analysts six months or a year from now rather than waiting for the data warehouse is an area that the industry is going to look at very closely in the next year.