<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: An HIT Moment with &#8230; Andrew Kapit</title>
	<atom:link href="http://histalk2.com/2009/02/02/an-hit-moment-with-andrew-kapit/feed/" rel="self" type="application/rss+xml" />
	<link>http://histalk2.com/2009/02/02/an-hit-moment-with-andrew-kapit/</link>
	<description>Healthcare IT News and Opinion</description>
	<lastBuildDate>Thu, 09 Feb 2012 04:50:12 +0000</lastBuildDate>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
	<item>
		<title>By: Andy Kapit</title>
		<link>http://histalk2.com/2009/02/02/an-hit-moment-with-andrew-kapit/comment-page-1/#comment-3273</link>
		<dc:creator>Andy Kapit</dc:creator>
		<pubDate>Wed, 04 Feb 2009 14:39:08 +0000</pubDate>
		<guid isPermaLink="false">http://histalk2.com/2009/02/02/an-hit-moment-with-andrew-kapit/#comment-3273</guid>
		<description>Dr. Scarlat, thank you for asking these important questions and the answers are: yes, we are happy to share metrics about our performance; the system does &quot;learn;&quot; and the length of time to demonstrate an ROI depends on the hospital&#039;s prior performance – but our rapid growth, 120 clients and almost 100 percent retention should make clear how powerful the ROI is.

I agree that to evaluate the quality of an NLP application it is imperative to know its precision and recall scores. In plain language, recall measures how much of the full set of correct answers the engine finds, and precision measures how many of the answers the engine gives are correct.

For example, when CodeRyte processes a note and identifies 10 ICD-9 codes and an expert coder agrees that each of those 10 codes was appropriately supported by the documentation, then the precision would be 100 percent.  However, if the expert coder also thinks that the engine should have identified 20 ICD-9 codes in all, then the recall score would be 50 percent.

As Dr. Scarlat points out, there can be a trade-off between precision and recall. If it were more important that every ICD code be right, then an NLP vendor might be willing to have lower recall. On the other hand, if it were more important that no ICD code be missed, then an NLP vendor might be willing to accept some incorrect ICD codes in order to make sure that all the correct ones were included.

Thus, the F-Score...

In addition to creating a high-performing coding engine, there was another challenge that CodeRyte had to overcome to successfully bring our NLP engine to the market. We had to put &quot;accuracy&quot; in context. In the scenarios above, for example, one would expect that three medical coders would agree that each of the ICD codes selected by the engine was correct and, furthermore, would agree on exactly which ICD codes should have also been included. Unfortunately, that is not how it works. There is significant disagreement among coders.

Because CodeRyte believes in transparency, and especially when bringing a disruptive technology like this to the market, we have already published metrics regarding CodeRyte&#039;s performance. The papers, which get to the heart of your questions, were written by our team of engineers and NLP pioneers to dissect automated confidence assessment for compliant accurate coding.

Because those papers were published two years ago and our scores have improved on every metric, I suggest that we set up a meeting during which we can provide you detailed answers to your questions.</description>
		<content:encoded><![CDATA[<p>Dr. Scarlat, thank you for asking these important questions and the answers are: yes, we are happy to share metrics about our performance; the system does &#8220;learn;&#8221; and the length of time to demonstrate an ROI depends on the hospital&#8217;s prior performance – but our rapid growth, 120 clients and almost 100 percent retention should make clear how powerful the ROI is.</p>
<p>I agree that to evaluate the quality of an NLP application it is imperative to know its precision and recall scores. In plain language, recall measures how much of the full set of correct answers the engine finds, and precision measures how many of the answers the engine gives are correct.</p>
<p>For example, when CodeRyte processes a note and identifies 10 ICD-9 codes and an expert coder agrees that each of those 10 codes was appropriately supported by the documentation, then the precision would be 100 percent.  However, if the expert coder also thinks that the engine should have identified 20 ICD-9 codes in all, then the recall score would be 50 percent.</p>
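<p>To make the arithmetic in this example concrete, here is a minimal sketch in Python. It is an illustration only, not CodeRyte's implementation: the function name <code>precision_recall</code> and the placeholder labels are assumptions, and the expert coder's list is treated as the reference answer.</p>

```python
def precision_recall(engine_codes, expert_codes):
    """Return (precision, recall) for one note.

    Assumption for this sketch: the expert coder's list is the reference.
    """
    engine = set(engine_codes)
    expert = set(expert_codes)
    agreed = engine & expert  # codes that both the engine and the expert assigned

    # Precision: share of the engine's codes that the expert accepts.
    precision = len(agreed) / len(engine) if engine else 0.0
    # Recall: share of the expert's codes that the engine found.
    recall = len(agreed) / len(expert) if expert else 0.0
    return precision, recall


# Hypothetical placeholder labels, not real ICD-9 codes.
expert_codes = [f"code-{i:02d}" for i in range(20)]  # expert expects 20 codes
engine_codes = expert_codes[:10]                     # engine found 10, all correct

precision, recall = precision_recall(engine_codes, expert_codes)
print(precision, recall)  # 1.0 0.5
```

<p>With 10 codes returned, all of them accepted, out of 20 expected, the sketch gives a precision of 100 percent and a recall of 50 percent, matching the figures above.</p>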
<p>As Dr. Scarlat points out, there can be a trade-off between precision and recall. If it were more important that every ICD code be right, then an NLP vendor might be willing to have lower recall. On the other hand, if it were more important that no ICD code be missed, then an NLP vendor might be willing to accept some incorrect ICD codes in order to make sure that all the correct ones were included.</p>
<p>Thus, the F-Score&#8230;</p>
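<p>As a rough sketch of how an F-score combines the two numbers, the Python below uses the standard F-beta formula, the weighted harmonic mean of precision and recall. The function name <code>f_beta</code> is an assumption for illustration, and the inputs are the 100 percent precision and 50 percent recall from the example earlier in this comment.</p>

```python
def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score: weighted harmonic mean of precision and recall.

    beta = 1 weighs both equally (the F1 score); beta above 1 favors recall,
    beta below 1 favors precision.
    """
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid dividing by zero when nothing was correct
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Precision of 100 percent and recall of 50 percent, as in the example.
f1 = f_beta(1.0, 0.5)
print(round(f1, 3))  # 0.667
```

<p>Because it is a harmonic mean, the score is pulled toward the weaker of the two numbers, so an engine cannot reach a high F-score by excelling at only one of them.</p>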
<p>In addition to creating a high-performing coding engine, there was another challenge that CodeRyte had to overcome to successfully bring our NLP engine to the market. We had to put &#8220;accuracy&#8221; in context. In the scenarios above, for example, one would expect that three medical coders would agree that each of the ICD codes selected by the engine was correct and, furthermore, would agree on exactly which ICD codes should have also been included. Unfortunately, that is not how it works. There is significant disagreement among coders.</p>
<p>Because CodeRyte believes in transparency, and especially when bringing a disruptive technology like this to the market, we have already published metrics regarding CodeRyte&#8217;s performance. The papers, which get to the heart of your questions, were written by our team of engineers and NLP pioneers to dissect automated confidence assessment for compliant accurate coding.</p>
<p>Because those papers were published two years ago and our scores have improved on every metric, I suggest that we set up a meeting during which we can provide you detailed answers to your questions.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Alex Scarlat MD</title>
		<link>http://histalk2.com/2009/02/02/an-hit-moment-with-andrew-kapit/comment-page-1/#comment-3251</link>
		<dc:creator>Alex Scarlat MD</dc:creator>
		<pubDate>Tue, 03 Feb 2009 15:48:10 +0000</pubDate>
		<guid isPermaLink="false">http://histalk2.com/2009/02/02/an-hit-moment-with-andrew-kapit/#comment-3251</guid>
		<description>Interesting article raising a couple of questions:

1. Can Andy share some metrics regarding CodeRyte performance (such as precision, recall and their combined F-statistic)?
These metrics will provide an insight into CodeRyte&#039;s capabilities for coding righteously... Minimum under-coding (loss of revenues) and minimum over-coding (auditing, penalties, etc.). Unfortunately, as with other binary classification techniques, a gain in one metric means a loss in the other and vice versa. Thus the F-statistic (no pun intended).

2. Does the NLP system &quot;learn&quot; in the process? Does the algorithm&#039;s performance improve over time?

3. How long does it take to demonstrate an ROI for a medium-sized hospital?

Many thanks,
Alex Scarlat MD</description>
		<content:encoded><![CDATA[<p>Interesting article raising a couple of questions:</p>
<p>1. Can Andy share some metrics regarding CodeRyte performance (such as precision, recall and their combined F-statistic)?<br />
These metrics will provide an insight into CodeRyte&#8217;s capabilities for coding righteously&#8230; Minimum under-coding (loss of revenues) and minimum over-coding (auditing, penalties, etc.). Unfortunately, as with other binary classification techniques, a gain in one metric means a loss in the other and vice versa. Thus the F-statistic (no pun intended).</p>
<p>2. Does the NLP system &#8220;learn&#8221; in the process? Does the algorithm&#8217;s performance improve over time?</p>
<p>3. How long does it take to demonstrate an ROI for a medium-sized hospital?</p>
<p>Many thanks,<br />
Alex Scarlat MD</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Spero Melior</title>
		<link>http://histalk2.com/2009/02/02/an-hit-moment-with-andrew-kapit/comment-page-1/#comment-3247</link>
		<dc:creator>Spero Melior</dc:creator>
		<pubDate>Tue, 03 Feb 2009 12:20:04 +0000</pubDate>
		<guid isPermaLink="false">http://histalk2.com/2009/02/02/an-hit-moment-with-andrew-kapit/#comment-3247</guid>
		<description>The upgrade to ICD-10 will not change what the codes are used for.  To quote Andrew himself: &quot;...reimbursement.  Not treatment. Not the most appropriate care. Not the patient at the heart of it all.&quot;

I agree that the pushback is almost entirely about the cost, and that it is wrongheaded.  It is surprisingly easy to push back on the structure of ICD-10-CM.

The upshot of it all is that &quot;accurate diagnoses&quot; are impossible with a disease-classification coding system, as opposed to a disease coding system.  So long as you code the class of diagnoses into which a diagnosis falls, and not the actual individual diagnosis, you have a problem.</description>
		<content:encoded><![CDATA[<p>The upgrade to ICD-10 will not change what the codes are used for.  To quote Andrew himself: &#8220;&#8230;reimbursement.  Not treatment. Not the most appropriate care. Not the patient at the heart of it all.&#8221;</p>
<p>I agree that the pushback is almost entirely about the cost, and that it is wrongheaded.  It is surprisingly easy to push back on the structure of ICD-10-CM.</p>
<p>The upshot of it all is that &#8220;accurate diagnoses&#8221; are impossible with a disease-classification coding system, as opposed to a disease coding system.  So long as you code the class of diagnoses into which a diagnosis falls, and not the actual individual diagnosis, you have a problem.</p>
]]></content:encoded>
	</item>
</channel>
</rss>

