The Theranos Scandal is an object lesson for the laboratory industry. It's not only an example of what NOT to do, but also an example of what we risk when we don't demand proof of quality, or what we might suffer if we don't prove the quality we're delivering. We don't want to be Theranosed. And we certainly don't want to be Theranosing others...
It seems a week cannot go by without more bad news for the former darling startup:
The problems are so notorious that "to Theranos" has now been turned into a verb, the equivalent of "to Enron."
There’s so much that’s happened that it’s hard to know where to start. Indeed, most of the stories have been covered by other news outlets already, and by real journalists. About the only additional insight we can add here is a closer reading of the lightly redacted inspection report, because buried in it are some performance details that no one else seems to have noticed.
But before we do that, let's admit that it's more than a little embarrassing that it got this far. A company that provided not a shred of proof of its performance was nevertheless able to market its way to hundreds of millions of dollars in venture capital funding, a $9 billion valuation, the Time 100, etc. The laboratory industry remained largely silent when we should have been vocal, challenging, and skeptical. Instead, we let a startup bully and bluster its way into prominence and, at the same time, claim that all of us in the laboratory world and diagnostics industry were unnecessary, antiquated, and soon to be driven into obsolescence.
Let’s start with the QC failure rates. The inspection report details significant out-of-control results for many tests; for some tests, up to 87% of QC results were out by more than 2 SD!
Knowing these failure rates, we can easily convert them into Six Sigma benchmarks using the simple short-term scale (a table lookup based on the DPM/defect rate).
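For readers who want to reproduce that kind of lookup, here is a minimal sketch. It assumes the short-term scale is the plain two-sided normal tail with no 1.5 SD shift (so a defect rate of 2,700 DPM corresponds to about 3 Sigma); the function name `short_term_sigma` is hypothetical, and a published conversion table may round differently.

```python
from statistics import NormalDist


def short_term_sigma(defect_rate: float) -> float:
    """Approximate short-term Sigma for a defect rate given as a fraction.

    Assumption: defects are split evenly between the two tails of a
    standard normal distribution, with no 1.5 SD long-term shift.
    """
    if not 0.0 < defect_rate <= 1.0:
        raise ValueError("defect_rate must be in (0, 1]")
    # Find the z-value whose two-sided tail area equals the defect rate.
    return NormalDist().inv_cdf(1.0 - defect_rate / 2.0)


def to_dpm(defect_rate: float) -> float:
    """Convert a fractional defect rate to defects per million (DPM)."""
    return defect_rate * 1_000_000


if __name__ == "__main__":
    # 0.0027 is the textbook 3 Sigma reference point on this scale;
    # 0.87 is the worst QC failure rate quoted from the inspection report.
    for rate in (0.0027, 0.87):
        print(f"{to_dpm(rate):>9.0f} DPM -> {short_term_sigma(rate):.2f} Sigma")
```

On this scale, a QC failure rate of 87% lands well below 1 Sigma, far from the 3 Sigma reference point.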
Recall that the minimum performance required for stable operation is 3 Sigma. Given that, and the inability of any Edison methods to achieve even close to 3 Sigma, we are hardly surprised by the voiding of two years of test results. Indeed, based on those Sigma benchmarks, you might even predict the unreliability of the results.
But the QC failure rates aren’t the only hidden nuggets in the report. There are also some details on the reported imprecision rates of some other tests.
The report notes that the imprecision for Vitamin B12 at level 1 was 52.5% and at level 3 was 48.5%. Already that doesn’t sound good, right? But now put it into context: the “Ricos goal” gives a desirable specification for allowable total error of 30%, as well as a desirable specification for allowable imprecision of 7.5%. So the Theranos method's imprecision is roughly six to seven times the recommended limit (52.5 / 7.5 = 7.0 at level 1, 48.5 / 7.5 ≈ 6.5 at level 3). The imprecision alone is nearly double the allowable total error (even before we account for bias, we are in terrible shape).
The inspection report also notes that the imprecision for Vitamin D at level 1 was 63.8% and at level 3 was 26.4%. Again, this doesn’t sound good, but just how bad is it? For that we can consult a paper from 2011:
Adie Viljoen, Dhruv K. Singh, Ken Farrington, Patrick J. Twomey, Analytical Quality Goals for 25-Vitamin D Based on Biologic Variation, Journal of Clinical Laboratory Analysis 25: 130–133 (2011)
This paper notes that the most forgiving (minimum) performance specification for imprecision should be 9% and the most forgiving (minimum) specification for allowable total error is 32.2%.
We can further calculate analytical Sigma-metrics by dividing the allowable total error by the imprecision observed. We don't know anything about bias, so we have to assume it's zero for the moment. Even with this optimistic assumption, the Sigma-metrics are terrible:
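As a rough sketch of that calculation, the snippet below applies the usual Sigma-metric formula, (allowable total error minus bias) divided by CV, to the four imprecision figures quoted from the inspection report. The function name `sigma_metric` is hypothetical, bias is set to zero as assumed above, and the allowable total error values are the 30% Ricos goal for Vitamin B12 and the 32.2% minimum specification for Vitamin D cited in the text.

```python
def sigma_metric(tea_pct: float, cv_pct: float, bias_pct: float = 0.0) -> float:
    """Analytical Sigma-metric: (TEa - |bias|) / CV, all in percent."""
    if cv_pct <= 0:
        raise ValueError("cv_pct must be positive")
    return (tea_pct - abs(bias_pct)) / cv_pct


# (test and level, allowable total error %, observed CV %)
# TEa values are the goals quoted in the text; CVs are from the report.
observations = [
    ("Vitamin B12, level 1", 30.0, 52.5),
    ("Vitamin B12, level 3", 30.0, 48.5),
    ("Vitamin D, level 1", 32.2, 63.8),
    ("Vitamin D, level 3", 32.2, 26.4),
]

if __name__ == "__main__":
    for name, tea, cv in observations:
        # Bias is unknown, so it is left at the optimistic default of zero.
        print(f"{name}: {sigma_metric(tea, cv):.2f} Sigma")
```

Under these assumptions the four results come out at roughly 0.57, 0.62, 0.50 and 1.22 Sigma. Any real bias would only lower them further.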
Thus, the Theranos methods consistently fail to meet even the most lenient goals for precision, and on the Sigma-metric scale, these methods are achieving mostly below 1 Sigma. Again, it’s not a shock that these test results are being voided.
I also hope it’s not necessary to state that this level of performance is significantly worse than what is provided and observed in traditional diagnostic platforms. Indeed, many “Traditional” POC devices are also far better in precision and performance than Edison.
Theranos' performance raises many more questions – Why was Theranos hiding this data? Did they understand how bad this performance was? What specifications were they setting for performance and why was this egregious performance considered acceptable?
At the AACC session, questions from the attending scientists are encouraged. I hope key questions will be asked about the long-term imprecision observed in the methods and about how these results compare against traditional laboratory instruments (for example, what is the bias between Edison and the Siemens instruments that Theranos has installed and running at the Newark facility?). If we know CV and bias, we can calculate analytical Sigma-metrics and determine the current performance of their methods, and whether or not this company deserves another chance to deliver test results to patients.
But let's not pretend that Theranos is alone in some of these practices, even in the diagnostics industry. There are instrument managers who prefer to hype their quality rather than provide hard facts. There are labs that build a facade of quality but hide an inner core of corrupted quality.
So those are just a few ways we can avoid being Theranosed by our vendors. But then turn around and make sure you're not committing the same sins with your clients, clinicians and patients.
When we conduct ourselves this way, holding ourselves to a higher standard than mere compliance, we will prevent any more Theranos upstarts from gaining footholds in our industry. Innovation has an important role to play in the diagnostics industry, but fraud does not.