“Data is the new gold” is the mantra of our age. The role of data scientist has been termed the “sexiest job of the 21st century.” Every industry is driven by the impetus to acquire, curate, and analyze data on all aspects of its business. We measure and quantify everything with the promise of utilizing this data to drive predictions and prescriptions. As an ER physician and an associate medical director, I am regularly shown a dashboard of metrics with rows and rows of color-coded goals. Metrics such as patient satisfaction scores, length of stay, door-to-discharge time, and door-to-doctor time are all color-coded, with green representing a passing score, yellow a warning, and red a failing score. There are incentives and disincentives tied to meeting these goals, and therefore there is an almost fanatical devotion to quality dashboards. This, in turn, drives behavior at all levels of the industry, from the department to the larger healthcare system. The goal for any department or hospital administrator is to turn the majority of the dashboard “green.”
These metrics are intended to serve as a proxy for the ultimate goal of increasing healthcare quality while concomitantly decreasing cost. However, in the United States, healthcare costs continue to rise, reaching nearly $3.2 trillion annually, or approximately 18% of GDP. Meanwhile, quality measures such as life expectancy and infant and maternal mortality are plateauing and even worsening in certain demographics. The English economist Charles Goodhart proposed that “any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes.” In other words, “when a measure becomes a target, it ceases to be a good measure.” The most famous example of this observation is the case study of nail factories in the Soviet Union. In order to increase nail production, administrators tied performance to the number of nails produced. As a consequence, nail factories started to produce millions of nails, but many of them were tiny and useless. Thereafter, in order to increase the size of these nails, the metric was switched to the total weight of nails produced; however, this imperative only resulted in the production of similarly useless large and heavy nails. As a practicing emergency physician, I suspect something eerily similar is happening in health care.
In the metric-centric culture of health care, where intrinsic and extrinsic incentives and disincentives are tied to these metrics, it is inevitable that the metrics themselves become the focus (Goodhart’s Law). Solutions are created in service of these metrics and consequently drive the system towards these goals. As in the case study above, if goals are poorly defined or represent a poor proxy for the desired target, then the system will produce suboptimal and unintended results. Furthermore, what we cannot measure (or do not measure) is often more important than what we can measure. Therefore, ignoring criteria that are difficult to quantify leads to faulty models, erroneous predictions, and undesirable consequences. For example, in order to create a patient-centered healthcare system, patient satisfaction metrics were instituted. The intent of these surveys was to quantify patient satisfaction and facilitate comparisons between physicians. Subsequently, compensation incentives and remediation disincentives were developed to drive behavior towards improved patient satisfaction as represented by these survey scores. These surveys often included questions similar to the following: “How often did the hospital staff do everything they could to help you with your pain?” The treatment of pain is a complex, multifactorial process with psychological, social, and physical aspects. Reducing it to a static, simplified, unbending survey question played a significant part in the over-prescription of opioids and the subsequent opioid epidemic.
Treatment of pain is a microcosm of the complexity and multidimensionality of medicine. Helping a patient through their illness is immensely complex and involves multiple dimensions ranging from the psychological to the social and the physical. Failing to account for this multidimensionality, relying on rigid and simplified metrics, and omitting layers of self-adjusting feedback loops from these goals all lead to suboptimal results and unintended consequences. Donella Meadows states, “human beings have been endowed not only with the ability to count but also with the ability to assess quality. Be a quality detector.” Similarly, Marvin Minsky stated that “we turn to using quantities when we can’t compare the qualities of things.” As an informaticist, I see the value of data in improving processes and decision making. As a physician (and a quality detector), however, I am inclined to believe that many of these metrics are poorly designed, overly reductive, and far too rigid to improve the quality of our healthcare system. My hope is that the “sexiest job of the 21st century,” integrated with physician domain expertise, will yield the next generation of more meaningful metrics that will drive the healthcare system towards better care.