Christine Swisher, PhD is chief scientific officer of Project Ronin of San Mateo, CA.
Tell me about yourself and the company.
My background is in healthcare, mostly in oncology, but also in building predictive models and AI as software and as a medical device. I’ve worked at Philips Healthcare, which is in the Fortune 500, as well as several startups. I’ve led products from idea to FDA clearance and expansion in the US and in Europe.
I am passionate about responsible AI and what that means, to deliver AI that is impactful in healthcare and that improves the lives of patients at scale.
At Ronin, we are fortunate to have a wonderful network and partners such that we are set up to achieve our mission of improving the life of cancer patients at scale and impacting all four of the Quadruple Aim verticals. We build technology such that an oncologist and the clinical care team that cares for cancer patients can see at a glance, and understand, their patient’s journey. We look through all of the structured data, clinical notes, and documents and bring that forward, so there isn’t that 30 minutes of clicking to prepare for a visit, but help them understand their patient at a glance. We also bring in the patient’s voice to understand what’s happening to the patient outside of the hospital and render that in their clinical workflow.
We have a mobile application that engages patients, not just for having a better understanding of the patient, but to empower clinicians with predictive information so they can take actions earlier and prevent adverse events and avoidable hospitalizations or emergency department visits and also better manage symptoms so that patients can stay on treatment longer.
What is the extent of genetics and genomics data that can be used to make clinical decisions?
A lot of that is about contextualizing that information. There’s a big jump from what scientists have discovered and where we are in this, especially in the genetics field. How do we deliver that to have meaningful outcomes in clinical care? How can we contextualize that information alongside their patient record of what’s happening, their entire patient record such as comorbidities, social determinants of health, and patient-reported outcomes? What’s happening to them at home? How can we bring all that together to have a total patient understanding, including their patient preferences?
With that total patient understanding, we can make the best choice for that particular patient. It’s a critical piece of information, especially things like EGFR mutations that are so impactful for treatment decisions that they can be lifesaving. We need to bring them into care decision making.
ChatGPT feels like an overnight success, but probably isn’t to experts in the field like yourself. How will your work be changed by its capabilities and popularity?
It definitely impacts the work that we do. In fact, I think it enables the next level of technology if we are thoughtful in how we deliver that.
It didn’t happen overnight from my perspective. In 2012, we witnessed a similar event in AI, where there was a technological breakthrough with convolutional neural networks, rectified linear units, and dropout that allowed us to have computer vision perform as well as humans for general domain tasks in classification. That particular event sparked the deep learning revolution.
From 2012 to 2020, there were about 100 FDA-cleared applications, 88 of which were computer vision or in the radiology space. That happened quickly, and the winners that were able to deliver on deep learning at that time shared certain characteristics. Radiologists, pathologists, and recipients of this technology were skeptical, just as skeptical as they are now.
It’s slightly higher publicity now because so many people are using things like ChatGPT in their work. But it mirrors much of what happened in the 2010s, when the AI winners in healthcare did three things. One, they prioritized interpretability and risk mitigation. Two, they focused on super-powering the clinicians versus trying to compete with them, and companies that said they were going to replace a clinician were not successful. Three, they delivered a complete solution, and those solutions fit seamlessly into the clinical workflow. They delivered on the CDS five rights, which means the right information, to the right person, in the right format, through the right channel, and at the right time. That’s the key to success.
None of those things have really changed about healthcare in the past 10 years. There was a technological breakthrough with the transformer architecture in 2017, and then a new generalizable method, which was GPT-based models. We had a new generation of applications like ChatGPT, Stable Diffusion, DALL-E, and all of these generative AI technologies. It’s very much like what we saw in 2012.
If we can take those learnings about what success looks like, and bring those into how we think about this new innovation or new class of AI-powered applications, we’re going to be a lot more successful. I am really excited about generative AI, but I think that it has to be delivered the right way.
We heard way too much back then about big data, which is rarely mentioned using that name today. Will AI and ML help deliver that promise?
We’ve been doing things that are interesting. AI has helped to identify sepsis patients earlier and to identify ischemic strokes so that patients can be treated within the golden hour. It’s been able to better detect breast cancer, lung cancer, and prostate cancer earlier. It’s already impacting people’s lives. That was with big data. It’s already living up to that promise. Maybe not at the scale that was predicted, but it is actually improving people’s lives at scale.
Now what we are seeing with this new class is new ways that we can better improve people’s lives. Generative AI can help scientists and researchers better discover new drugs, new treatments, and new therapies for cancer and other diseases.
It’s going to enable a better understanding of the patient’s journey, just like what we are doing at Ronin, being able to dig through the 80% of the EMR that is unstructured data (documents, clinical narratives, and notes) and have a better understanding of patients at an individual level and at a population level. That means that we are going to be able to better predict things like mortality, progression, adverse events, toxicities from treatment, and acute care utilization like emergency department visits. Then by being able to predict them and see what caused them, we can better inform on actions. I’m really excited about the technology, as long as it’s delivered safely and ethically.
The new book “Redefining the Boundaries of Medicine” notes that medicine is based around huge population studies that may lead to the wrong conclusions when a specific intervention doesn’t appear to be effective collectively, but works on subgroups of patients who share particular circumstances or comorbidities. How would a data scientist look at that issue?
This is very core to our Ronin mission, to deliver care decisions that are personalized to that particular patient versus based on population averages. So many decisions in oncology are based on population averages. By bringing data of what happened to patients like them — what happened in terms of their progression, their quality of life, the toxicities that they experienced — we can look at the patient in a comprehensive way, thinking about their demographics, social determinants of health, their cancer and treatment specific risk factors, their comorbidities, symptoms, active problems, and biomarkers as well.
If we bring that together to then say, what happened to patients like my patient, we can provide more personalized decisions. We can also empower the care team, oncologist, patient, and caregiver with data to make that decision.
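As an illustration only, the idea of asking "what happened to patients like my patient" can be sketched as a nearest-neighbor lookup. This is not a description of Ronin's method. The feature names, the toy cohort, the `hospitalized` outcome, and the function names are all hypothetical, and the features are assumed to be already scaled to a common 0-to-1 range.

```python
import math

def distance(a, b, features):
    """Euclidean distance between two patients over the chosen features."""
    return math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))

def similar_patient_summary(patient, cohort, features, k=3):
    """Find the k most similar historical patients and their outcome rate."""
    nearest = sorted(cohort, key=lambda c: distance(patient, c, features))[:k]
    rate = sum(c["hospitalized"] for c in nearest) / len(nearest)
    return nearest, rate

# Hypothetical historical cohort (features pre-scaled to 0..1).
cohort = [
    {"id": "A", "age": 0.20, "comorbidities": 0.1, "hospitalized": 0},
    {"id": "B", "age": 0.80, "comorbidities": 0.9, "hospitalized": 1},
    {"id": "C", "age": 0.70, "comorbidities": 0.8, "hospitalized": 1},
    {"id": "D", "age": 0.75, "comorbidities": 0.7, "hospitalized": 0},
    {"id": "E", "age": 0.10, "comorbidities": 0.2, "hospitalized": 0},
]
patient = {"age": 0.75, "comorbidities": 0.85}

nearest, rate = similar_patient_summary(patient, cohort, ["age", "comorbidities"])
print(sorted(c["id"] for c in nearest), round(rate, 2))
```

A real system would draw on far richer inputs (demographics, social determinants of health, biomarkers, symptoms) and would need a clinically validated notion of similarity. The sketch only shows the shape of the lookup.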
Previous technologies were implemented as advisory rather than a closed loop system that would require FDA approval. How prepared is FDA to evaluate AI technologies and are the usual retrospective studies adequate to do so?
I have two answers for that. The first is that regulatory and best practice groups are moving quickly in response to the innovation and excitement around generative AI and AI in general. Three seminal documents were released just in the past few months. The White House delivered a blueprint for an AI bill of rights, NIST delivered their risk management framework, and the Coalition for Health AI delivered their “Blueprint for Trustworthy AI Implementation Guidance and Assurance for Healthcare.”
When you look at these three documents, five themes emerge across them. You need validated, safe, and effective systems. You need protections against bias. You need privacy and security. You need interpretability and explainability. Finally, you need transparency and human factors.
Whether or not it’s FDA-cleared 510(k) software as a medical device, a CDSS, a CLIA-validated laboratory developed test, or AI for another application that doesn’t fit under that regulatory guidance, it’s still important that it delivers on those five principles. In fact, those actually expand past healthcare.
Those are the things where we will see guidance from groups like CHAI on how we concretely deliver on those principles. The principles have been defined, and now these groups are working very quickly to define the next steps. I also think that infrastructure cloud vendors and AI tooling vendors will, at some point, start to provide certified tools to companies like Ronin and others to accelerate our ability to deliver AI safely. That’s a huge market opportunity.
AI in healthcare, particularly with our last AI revolution in the 2010s, was most successful when it was partnered with clinicians to make them super-powered clinicians. If you look at other domains, the same thing is true. AI did not replace as many jobs as people thought it would.
You could also look at things like when we went from animators hand drawing to CGI. CGI just expanded the scope of what they could deliver, how productive they could be, and allowed them to work at a higher level with the tedious tasks taken away. It’s the same thing of going from FORTRAN to C++ to Python and how we develop AI.
If we look at how those industries are impacted, there’s a guiding principle that AI empowers people and takes the tedious things off their plate so that they can operate at a higher level and deliver higher quality. That’s true in healthcare as well.
How will the availability of complete, representative, and unbiased training data affect the market for AI technologies?
Protection against bias is a key theme in those three seminal documents that I just talked about, and something that we need to do proactively and continuously. It’s not a one-time event where you look at your patient population, see how it performs in subgroups, and then write it up in a medical journal.
It has to be part of your system, where you are continuously monitoring for bias. Then when you detect a bias incident, you need to have the systems in place to rapidly mitigate that issue. One of the solutions is representative data, but we need a three-pronged approach, where the first prong is like the brakes in your car, the second prong is the seatbelt, and the last one is the airbag.
The first prong, our brake, is about preventing any foreseeable bias. So that when you are developing the model, you have representation of the populations that you intend to serve. You have subject matter experts that understand that there isn’t bias built into the actual ground truth data or the data feeding into the model. That the way it is delivered from a user experience will not exacerbate currently existing biases in the system, so that there’s a lot of voice of the customer or human-centric design that has representation of the populations that we intend to serve. That’s the brake.
The seatbelt and the airbag are two pieces. The first is that you need to have proactive and continuous monitoring for bias across important subgroups. Things like social determinants of health. Do they have access to transportation? What about their insurance and demographic groups? We need a comprehensive understanding of the different ways that we could introduce bias that causes harm to different types of groups, then detecting that and being able to diagnose any problem quickly before it causes patient harm.
Then knowing that you have a problem, the next step is to fix the problem, so having the systems in place so you can rapidly retrain a model and you have the technology or ability to mitigate bias quickly. Machine learning operations, or MLOps, needs both infrastructure and practice to mitigate that and then deliver that fix quickly before there’s patient harm. In addition, there are human factors in how it’s delivered so that you can mitigate risk as well.
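The continuous monitoring step described above can be sketched roughly as follows. This is a minimal illustration, not a production monitor: the record fields (`label`, `pred`, `transport`), the choice of sensitivity as the metric, and the 0.10 tolerance are all assumptions made for the example.

```python
from collections import defaultdict

def sensitivity(records):
    """Fraction of true positive cases that the model flagged (recall)."""
    positives = [r for r in records if r["label"] == 1]
    if not positives:
        return None  # undefined when there are no positive cases
    return sum(r["pred"] for r in positives) / len(positives)

def flag_subgroup_gaps(records, group_key, tolerance=0.10):
    """Return subgroups whose sensitivity trails the overall value by more than tolerance."""
    overall = sensitivity(records)
    by_group = defaultdict(list)
    for r in records:
        by_group[r[group_key]].append(r)
    flagged = {}
    for group, rows in by_group.items():
        s = sensitivity(rows)
        if s is not None and overall is not None and overall - s > tolerance:
            flagged[group] = round(s, 3)
    return overall, flagged

# Hypothetical predictions, grouped by access to transportation.
records = [
    {"transport": "yes", "label": 1, "pred": 1},
    {"transport": "yes", "label": 1, "pred": 1},
    {"transport": "yes", "label": 1, "pred": 1},
    {"transport": "yes", "label": 0, "pred": 0},
    {"transport": "no", "label": 1, "pred": 1},
    {"transport": "no", "label": 1, "pred": 0},
    {"transport": "no", "label": 1, "pred": 0},
    {"transport": "no", "label": 0, "pred": 0},
]

overall, flagged = flag_subgroup_gaps(records, "transport")
print(round(overall, 3), flagged)
```

Run on a schedule against fresh predictions, a check like this would surface a subgroup the model is underserving, which is the trigger for the retraining and mitigation step. Which metric and threshold are appropriate depends on the clinical use case.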
IBM Watson Health failed at trying to do years ago what people think is possible now. What has changed?
For those that will be successful, what’s different now is the user experience and real-world validation of the technology. What is the AUC, area under the curve, of a model? All these abstract metrics that AI practitioners tend to focus on … instead of focusing on those, focus on the meaningful measures. Does the AI plus the human better prevent acute unplanned care? Does it keep patients on treatment longer with their symptoms better managed? Does it increase progression-free survival? Going back to what a meaningful measure is and evaluating the performance of your models against that, versus abstract measures, is one of those key pieces.
The other one is thoughtful, human-centric design. With those pieces together, that’s where you have meaningful impact. Companies compete too much on model AUC, accuracy, or F1 score. A 5% difference sounds good on paper, but it’s the execution of that. When you delivered in the clinical workflow, did you live up to the CDS five rights? If that’s true, you’re going to have a bigger impact. Focusing on the meaningful measures versus the abstract measures is key.
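For readers unfamiliar with the metric, AUC can be read as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows. The labels and scores are made-up toy values, and the function is written for clarity rather than speed.

```python
def auc(labels, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly, ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 1 = event occurred, 0 = no event.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1]

result = auc(labels, scores)
print(round(result, 3))
```

The number describes only how well the model ranks cases. It says nothing about whether the prediction reached the right person at the right time or whether outcomes improved, which is why the meaningful measures above have to be evaluated separately.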
Is there a tension between the absolutes of data science versus the frontline practice of medicine that incorporates variables that are personal, local, or perceptual?
Especially for CDSs that rely on predictive models, machine learning, or statistical methods, it’s crucially important. It is written in the FDA’s guidance that you need to share the basis of the prediction and the relevance of the training and development data. Both of those things need to be shared.
At Ronin, we show that in a way that is accessible to the clinician. You don’t have to have statistical knowledge or machine learning knowledge to understand that. It’s right there at the point of making the decision, the relevance of the patients that are similar that are giving this insight for this particular patient. The basis of that prediction is right there during clinical decision versus buried in a user manual or peer-reviewed publication that might be behind a paywall.
For things like generative AI and language models, we still need to innovate and develop the methods for transparency in sharing the basis of our prediction. When we look back to things like convolutional neural networks, there was innovation on how we do that. Things like saliency maps were invented and the methodology to do that. Semantic segmentation was another innovation that allowed us to provide that type of insight.
We probably will have to invent some new methods, and I’m excited about what those will be. We would like to be a part of that, and I am hopeful that our research community will gather around this challenge.
Will we see a trough of disillusionment with generative AI?
There will probably be a realization of the challenges, limitations, and areas of success. We’re going to learn that. We’re still learning about what this technology can do. How do we really understand what’s going on underneath the hood? How do we get it to explain the basis of its predictions?
People who are skeptical now (especially if they start to use it to help with writing, as a second reader, or to write code) may start to see a lot of value in it. On the other hand, we’re going to learn about its limitations. I think we might see the more skeptical folks being more embracing, and the ones that are less skeptical becoming more skeptical, as we learn more about the limitations.
What will the next few years bring to Ronin?
We are realizing that personalized, data-driven, total patient understanding in care decisions for cancer patients empowers clinicians. We can use AI, machine learning, and data science informatics for that and to bring the patient’s voice into it as well, where they can say what’s happening to them outside the home and their preferences can be brought in to care decision-making, even in the data that is driving those care decisions. There’s a huge opportunity to deliver on that vision, and we are already doing it.