Kenneth Arrow on healthcare economics: a 21st century appreciation

Nobel laureate Kenneth Arrow passed away on February 21, 2017. In a classic, fifty-year-old paper entitled Uncertainty and the Welfare Economics of Medical Care, Arrow discussed how:

“the operation of the medical-care industry and the efficacy with which it satisfies the needs of society differs from… a competitive model… If a competitive equilibrium exists at all, and if all commodities relevant to costs or utilities are in fact priced in the market, then the equilibrium is necessarily [Pareto] optimal” (emphasis added)

Note the implicit assumption that price reflects value, to which I’ll return. As Arrow elegantly explained, there are vast differences between the actual healthcare market and the competitive model, and, moreover, these differences arise from important features of the actual healthcare market.

Identifying the lack of realism of the competitive model in health care may lead to deeper understanding of the actual system. In essence this is what Arrow does. Although both medical care and our expectations have changed greatly, Arrow ’63 is still valid and worth reading today.

Here is Arrow’s summary of the differences between the healthcare market and typical competitive markets.

The nature of demand

Demand for medical services is irregular and unpredictable:

“Medical services, apart from preventive services, afford satisfaction only in the event of illness, a departure from the normal state of affairs… Illness is, thus, not only risky but a costly risk in itself, apart from the cost of medical care.”

Expected behavior of the physician

“It is at least claimed that treatment is dictated by objective needs of the case and not limited by financial considerations… Charity treatment in one form or another does exist because of this tradition about human rights to adequate medical care.”

Product uncertainty

“Recovery from disease is as unpredictable as its incidence…  Because medical knowledge is so complicated, the information possessed by the physician as to the consequences and possibilities of treatment is necessarily very much greater than that of the patient, or at least so it is believed by both parties.”

Supply conditions

Barriers to entry include licensing and other controls on quality (accreditation) and costs.

“One striking consequence of the control of quality is the restriction on the range offered… The declining ratio of physicians to total employees in the medical-care industry shows that substitution of less trained personnel, technicians and the like, is not prevented completely, but the central role of the highly trained physician is not affected at all.”

Pricing practices

There are no fixed prices:

“extensive price discrimination by income (with an extreme of zero prices for sufficiently indigent patients)… the apparent rigidity of so-called administered prices considerably understates the actual flexibility.”

Avik Roy observes in a critical National Review article that “Because patients don’t see the bill until after the non-refundable service has been consumed, and because patients are given little information about price and cost, patients and payors are rarely able to shop around for a medical service based on price and value.”

Medicine has seen major changes since Arrow’s 1963 paper. For example, the treatment of blocked coronary arteries has evolved from coronary bypass to angioplasty to early stents and finally drug-eluting stents. We have seen the advent of minimally invasive surgery, robotic surgery and catheter-based cardiac valve repair and replacement. We have seen drugs to treat hepatitis C and biologicals to treat arthritis and cancer. Many conditions have been transformed from acute to chronic but (at least temporarily) manageable. There are also divergent trends, such as increases in both natural childbirth and Caesarean sections.

In the last 50 years, medicine has become more powerful, but also significantly more complex and, overall, more expensive. Intensive care units are a good example: valuable therapeutically but expensive to provide. At the same time, many treatments are both better (more valuable to the patient) and less expensive to provide; these range from root canal treatment (frequently two visits to the dentist instead of four) to the significantly less invasive treatments for many cardiac rhythm abnormalities (radio-frequency ablation) and stents for coronary artery disease. The advent of epinephrine auto-injectors has been a lifesaver, but the cost of the EpiPen has increased significantly.

Can a competitive economic system appropriately and reasonably price such treatments and devices? Arrow argues that, if not, non-market social institutions will arise and address these challenges. Here is a deeper look.

Arrow’s first two points are still virtually axiomatic today: demand for medical services has become even more unpredictable with the continued growth of advanced, effective interventions and corresponding, appropriately increasing (in my opinion), patient expectations. Similarly, as medical care advances, we increasingly see medical care as a human right and in many cases, a societal obligation. We have come to expect treatment dictated by objective needs and not limited by financial considerations, not only from physicians but from a growing number of key players including pharmaceutical companies. To their credit, in many cases (AIDS comes to mind) pharmaceutical companies have responded by sharply reducing prices in the developing world.

Powerful chemotherapeutic and biologic drugs may have increased the uncertainty and asymmetry of information observed by Arrow, both in their effectiveness and in their side effects. In many cases one needs the language and mathematics of probability and statistics to evaluate, assess and describe their efficacy and utility. One needs an understanding of probability to determine when and how to use common preventive techniques, such as mammograms and PSA screening. Here is an example, paraphrased from Gigerenzer and Edwards (see also Strogatz). Women 40 to 50 years old, with no family history of breast cancer, are a low-risk population; the overall probability of breast cancer in this population is 0.8%. Assume that mammography has a sensitivity of 90% and a false positive rate of 7%. A woman has a positive mammogram. What is the probability that she has breast cancer? Among 25 German doctors surveyed, 36% said 90% or more, 32% said 50-80%, and 32% said 10% or less. Most (95%) United States doctors thought the probability was approximately 75%. (See the links above for the answer, or see my next blog post on the challenge of communicating probability).
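For readers who want to check the arithmetic, the question above is a direct application of Bayes' rule. The following is a minimal sketch that uses only the three numbers quoted in the example (0.8% prevalence, 90% sensitivity, 7% false positive rate); the function name is hypothetical and chosen only for illustration.

```python
def posterior_given_positive(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive test) via Bayes' rule."""
    # Share of the whole population that is ill AND tests positive.
    true_positives = prevalence * sensitivity
    # Share of the whole population that is healthy AND tests positive.
    false_positives = (1.0 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Inputs taken from the example in the text.
p = posterior_given_positive(
    prevalence=0.008, sensitivity=0.90, false_positive_rate=0.07
)
print(f"P(cancer | positive mammogram) = {p:.1%}")
```

With these inputs the result is a little over 9%: in a low-risk population, false positives among the many healthy women far outnumber true positives among the few who are ill.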

Arrow’s information asymmetry remains, despite the growing availability of accessible medical information on the web, perhaps for good reasons such as the ability to effectively address the needs of sicker patients.

I would amend Arrow’s discussion of supply conditions to include a wide variety of cost barriers ranging from large fixed costs of ICUs to the costs of medical research. The high cost of basic medical services relative to per capita GDP in the developing world represents a barrier as high as any faced in the developed world. As Arrow notes, society has addressed this challenge through a variety of pricing mechanisms outside traditional competitive models. These mechanisms may not, and in general will not, achieve a Pareto optimum, but their wide endorsement by society does indeed suggest that they achieve a more general optimum.

“I propose here the view that, when the market fails to achieve an optimal state, society will, to some extent at least, recognize the gap, and nonmarket social institutions will arise attempting to bridge it… But it is contended here that the special structural characteristics of the medical-care market are largely attempts to overcome the lack of optimality due to the nonmarketability of the bearing of suitable risks and the imperfect marketability of information. These compensatory institutional changes, with some reinforcement from usual profit motives, largely explain the observed noncompetitive behavior of the medical-care market, behavior which, in itself, interferes with optimality. The social adjustment towards optimality thus puts obstacles in its own path.”

It is this view which I find too limiting. I would suggest that society has at least implicitly concluded that price alone does not define value, and thus formed a broader definition of optimality, not simply Pareto optimality in a competitive market. Society is finding and supporting ways to overcome obstacles toward this broader sense of optimality.

The Bill & Melinda Gates Foundation vaccination project aims to reduce the number of children who die each year from preventable disease (currently around 1.5 million). The Lifebox project, founded by Dr Atul Gawande, provides affordable, high-quality pulse oximeters to the developing world and now seeks to address basic surgical safety there. Important advances also arise in the developing world; most recently, an easy-to-deliver, more effective oral cholera vaccine developed in Vietnam.

Arrow himself recognizes the limits of a traditional economic description of the medical care market in his concluding Postscript, arguing that “The logic and limitations of ideal competitive behavior under uncertainty force us to recognize the incomplete description of reality supplied by the impersonal price system.” I conclude more generally not only that prices do not necessarily represent value in medical care (as Arrow observed), but also that the combination of uncertainty, externalities, high costs, divergent economies, and technological advance means that price alone cannot describe value in medical care. A broader, more general theory of healthcare economics, standing on the shoulders of giants such as Kenneth Arrow and perhaps built on a more general multi-dimensional Pareto optimum, might help us all better understand where we are and where we might go.



Chris Sampson’s journal round-up for 27th February 2017

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Does it pay to know prices in health care? American Economic Journal: Economic Policy Published February 2017

In the US, people in need of health care have to pay for it – or for insurance to cover it – without knowing in advance how much said health care actually costs. Weird, right? Instinctively, it feels as if people really ought to be able to find out. However, if knowing prices in advance doesn’t actually affect consumption, maybe we can say it really doesn’t matter. Well, we can’t. As this new study shows, having access to price information affects consumer choices. There’s plenty of price dispersion to make this potentially important: in this study’s dataset, a move from the 90th to the 50th percentile is on average associated with a price drop of 35%. The data relate to 387,774 procedures for 6,208 people working for a corporate client of a price information firm. Access to this service was staggered for different employees, creating the potential for experimental investigation. The principal strategy is difference-in-differences regression analysis. Access to the price information service was associated with prices around 1.6% lower on average. For primary care – which might be less price sensitive – and for complex cases where lots of procedures are taking place, the effect is weakened. The results seem robust to matching and other tests. The author is able to provide further insight by showing that access to price information increases the probability of seeing a new doctor by 14%. And when an instrumental variable approach is used to assess the price reduction specifically for people who searched for price information and then received a procedure within 30 days, the reduction in price reaches a whopping 17%. This suggests that the average impact of a 1.6% reduction could be a lot higher if people searched for price information more frequently. The fact that they don’t is likely due to a particular kind of moral hazard being at play. Moral hazard in search occurs when people have no incentive to search for cheaper services. 
The author goes on to show that in any given week an individual is around 90% less likely to search if they have already met their deductible, and that this translates into an elasticity of search propensity to the proportion out-of-pocket expense of approximately 1.8. We mustn’t forget the other side of the welfare coin here. What if people are choosing lower quality care in order to save money, or foregoing it altogether? Looking at the rate of follow-through after searches and bringing in hospital quality data seems to show that this isn’t a concern here. This group of people aren’t representative of the general population so it may be that access to prices is only valuable to certain groups. Nevertheless, this paper tells us a lot about the importance of price information and in particular the special kind of moral hazard that can arise in the presence of comprehensive insurance coverage.
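The difference-in-differences logic behind the headline estimate can be sketched in its simplest two-group, two-period form. This is only an illustration of the idea: the study itself uses regression with staggered access, and every number below is made up rather than taken from the paper.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical prices paid (in dollars), NOT data from the study.
treated_pre = [100.0, 120.0, 110.0]   # before access to price information
treated_post = [98.0, 117.0, 109.0]   # after access
control_pre = [105.0, 115.0, 95.0]    # employees without access, same periods
control_post = [106.0, 116.0, 96.0]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"Difference-in-differences estimate: {effect:+.2f} dollars")
```

Subtracting the control group's change removes any price trend common to both groups, so what remains is attributed to access, provided the two groups would otherwise have moved in parallel.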

Mitigating the consequences of a health condition: The role of intra- and interhousehold assistance. Journal of Health Economics Published 20th February 2017

There’s a lot of research around the effect that an individual’s health problem can have on their immediate family, both in terms of the overspill in quality of life impacts and the costs of satisfying need for health care. However, large panel data research can be limited because the data can’t connect non-coresident family members. This study considers informal insurance and consumption smoothing within families beyond the current household. The data come from the Panel Study of Income Dynamics, with 7,578 individuals and around 33,000 household years from 2001-2011. The panel follows offspring after they leave a household, facilitating the identification of genetically linked families. Participants are asked whether they suffer from 11 different health problems and, if they do, the extent to which it limits their daily activities. The data also include information on different categories of spending, including health. The analysis involves regression that accounts for individual fixed effects and looks at the impact of a change in health status on consumption. If a household is fully insured, changes in health status should not affect non-health expenditures. The analysis focuses on the impact of severe limitations, which are reported at some point by 1,321 people. Such a change in health status was associated with a reduction in annual working hours of around 20%, corresponding to $5000 for men and $2800 for women. Additionally, household health expenditures increased by $479 on average. The notion of complete insurance facilitating consumption smoothing appears to fail, with a decline in consumption of around 10%. Partial insurance smoothes roughly half the loss. Households with formal insurance exhibit a much smaller reduction in consumption. A key finding is that being married may facilitate consumption smoothing to the extent of full insurance, while unmarried couples take a bigger hit. 
Home equity seems to play an important role in this dynamic, with married couples more likely to remortgage in response to a health shock. Married couples also receive more in social security transfers. Unmarried couples, it seems, have to turn to non-coresident family members instead and are 50% more likely to use this channel than married couples. Male children are more likely to use their own home equity to support their parents, while female children tend to reduce their own consumption. This study identifies a lot of interesting relationships and divergent strategies for consumption smoothing that warrant further investigation.

Handling missing data in within-trial cost-effectiveness analysis: a review with future recommendations. PharmacoEconomics – Open Published 9th February 2017

If you conduct trial-based cost-effectiveness analyses then chances are that at some point you’ve had to go and figure out how to deal with all that missing data. There are a handful of quality papers out there that offer guidance. If we all followed their advice then we’d be doing a decent job of it. This new paper demonstrates that we aren’t all doing a good job of it and offers fresh guidance. The paper starts by outlining the ‘principled’ approach to handling missing data. Essentially it means being sensible with the data, considering the most appropriate statistical model and describing assumptions about the missing data mechanism. Imputation methods that can support this principled approach are briefly discussed. The authors present a quality evaluation scheme, which can be used to assess the appropriateness of methods adopted in a study and the completeness of reporting. It makes recommendations with respect to the description of missing data, the methods used to handle it and the limitations associated with the study. The quality evaluation scheme can be used to score and rank papers from A-E. This is what the authors go on to do, with a systematic review including 81 eligible papers. A previous review found complete case analysis to be the most popular base case method adopted. In 2009-2015, multiple imputation became the most frequently used base case method, though complete case analysis remains common and many studies are still unclear about the methods adopted. Most articles did not describe any robustness analysis, reporting only the base case approach to missing data. Many articles were classified as the lowest quality (E), though this has improved over time. The authors demonstrate that their proposed grading system is associated with the strength of the assumptions in the adopted methods. If you’re engaged in trial-based economic evaluation, you ought to read this paper.
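For readers less familiar with multiple imputation, the step that distinguishes it from single imputation is the pooling of results across imputed datasets. The round-up does not say how the reviewed studies pooled their results; the sketch below assumes the commonly used Rubin's rules, and the function name and all numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates and their variances (Rubin's rules)."""
    m = len(estimates)
    pooled_estimate = mean(estimates)
    within = mean(variances)        # average sampling variance
    between = variance(estimates)   # sample variance across imputations (m - 1)
    total_variance = within + (1.0 + 1.0 / m) * between
    return pooled_estimate, total_variance


# Hypothetical incremental-cost estimates from m = 5 imputed datasets.
estimates = [1000.0, 1100.0, 900.0, 1050.0, 950.0]
variances = [2500.0, 2500.0, 2500.0, 2500.0, 2500.0]

pooled, total_var = pool_rubin(estimates, variances)
print(f"Pooled estimate: {pooled:.0f}, standard error: {total_var ** 0.5:.0f}")
```

The between-imputation term is what carries the extra uncertainty due to missing data; complete case analysis and single imputation have no counterpart to it.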


Brent Gibbons’s journal round-up for 30th January 2017

For this week’s round-up, I selected three papers from December’s issue of Health Services Research. I didn’t intend to limit my selections to one issue of one journal, but as I narrowed down my selections from several journals, these three papers stood out.

Treatment effect estimation using nonlinear two-stage instrumental variable estimators: another cautionary note. Health Services Research [PubMed] Published December 2016

This paper by Chapman and Brooks evaluates the properties of a non-linear instrumental variables (IV) estimator called two-stage residual inclusion or 2SRI. 2SRI has more recently been suggested as a consistent estimator of treatment effects under conditions of selection bias and where the dependent variable of the 2nd-stage equation is either binary or otherwise non-linear in its distribution. Terza, Bradford, and Dismuke (2007) and Terza, Basu, and Rathouz (2008) furthermore claimed that 2SRI can produce unbiased estimates not just of local average treatment effects (LATE) but of average treatment effects (ATE). However, Chapman and Brooks question why 2SRI, which is analogous to two-stage least squares (2SLS) when both the first and second stage equations are linear, should not require assumptions similar to those of 2SLS when generalizing beyond LATE to ATE. Backing up a step, when estimating treatment effects using observational data, one worry when trying to establish a causal effect is bias due to treatment choice. Where patient characteristics related to treatment choice are unobservable and one or more instruments are available, linear IV estimation (i.e. 2SLS) produces unbiased and consistent estimates of treatment effects for “marginal patients” or compliers. These are the patients whose treatment choices were influenced by the instrument, and their treatment effects are termed LATE. But if there is heterogeneity in treatment effects, a case needs to be made that treatment effect heterogeneity is not related to treatment choice in order to generalize to ATE. Moving to non-linear IV estimation, Chapman and Brooks are skeptical that this case for generalizing LATE to ATE no longer needs to be made with 2SRI. 2SRI, for those not familiar, uses the residual from stage 1 of a two-stage estimator as a variable in the 2nd-stage equation that uses a non-linear estimator for a binary outcome (e.g. probit) or another non-linear estimator (e.g. Poisson). 
The authors produce a simulation that tests the 2SRI properties over varying conditions of uniqueness of the marginal patient population and the strength of the instrument. The uniqueness of the marginal population is defined as the extent of the difference in treatment effects for the marginal population as compared to the general population. For each scenario tested, the bias between the estimated LATE and the true LATE and ATE is calculated. The findings support the authors’ suspicions that 2SRI is subject to biased results when uniqueness is high. In fact, the 2SRI results were only practically unbiased when uniqueness was low, but were biased for both ATE and LATE when uniqueness was high. Having very strong instruments did help reduce bias. In contrast, 2SLS was always practically unbiased for LATE for different scenarios and the authors use these results to caution researchers on using “new” estimation methods without thoroughly understanding their properties. In this case, old 2SLS still outperformed 2SRI even when dependent variables were non-linear in nature.
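The mechanics of the two estimators can be sketched in the fully linear case, where residual inclusion and 2SLS give the same treatment coefficient. This is not the authors' simulation: the data-generating process, the continuous treatment, the homogeneous effect of 2.0 and all helper names are assumptions made for illustration. The paper's concern is the non-linear second stage, where this equivalence no longer holds.

```python
import random


def demean(v):
    m = sum(v) / len(v)
    return [a - m for a in v]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def ols_slope(x, y):
    """Slope from regressing y on x with an intercept."""
    xd, yd = demean(x), demean(y)
    return dot(xd, yd) / dot(xd, xd)


def ols_two_slopes(x1, x2, y):
    """Slopes from regressing y on x1 and x2 with an intercept."""
    a, b, c = demean(x1), demean(x2), demean(y)
    s11, s12, s22 = dot(a, a), dot(a, b), dot(b, b)
    s1y, s2y = dot(a, c), dot(b, c)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det


# Simulated data (assumed design): u is an unobserved confounder, z the instrument.
random.seed(0)
n = 5000
true_effect = 2.0
z = [random.gauss(0, 1) for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
x = [zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [true_effect * xi + ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

# Stage 1: regress treatment on the instrument; keep fitted values and residuals.
gamma = ols_slope(z, x)
x_mean, z_mean = sum(x) / n, sum(z) / n
x_hat = [x_mean + gamma * (zi - z_mean) for zi in z]
resid = [xi - xh for xi, xh in zip(x, x_hat)]

b_ols = ols_slope(x, y)                   # ignores the confounder
b_2sls = ols_slope(x_hat, y)              # stage 2 uses fitted treatment
b_2sri, _ = ols_two_slopes(x, resid, y)   # stage 2 uses treatment + residual

print(f"OLS {b_ols:.3f} | 2SLS {b_2sls:.3f} | residual inclusion {b_2sri:.3f}")
```

In this linear setting the 2SLS and residual-inclusion coefficients agree up to floating-point error, while naive OLS is pulled away from the true effect by the confounder.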

Testing the replicability of a successful care management program: results from a randomized trial and likely explanations for why impacts did not replicate. Health Services Research [PubMed] Published December 2016

As is widely known, how to rein in U.S. healthcare costs has been a source of much hand-wringing. One promising strategy has been to promote better management of care in particular for persons with chronic illnesses. This includes coordinating care between multiple providers, encouraging patient adherence to care recommendations, and promoting preventative care. The hope was that by managing care for patients with more complex needs, higher cost services such as emergency visits and hospitalizations could be avoided. CMS, the Centers for Medicare and Medicaid Services, funded a demonstration of a number of care management programs to study what models might be successful in improving quality and reducing costs. One program implemented by Health Quality Partners (HQP) for Medicare Fee-For-Service patients was successful in reducing hospitalizations (by 34 percent) and expenditures (by 22 percent) for a select group of patients who were identified as high-risk. The demonstration occurred from 2002 – 2010 and this paper reports results for a second phase of the demonstration where HQP was given additional funding to continue treating only high-risk patients in the years 2010 – 2014. High-risk patients were identified as having a diagnosis of congestive heart failure (CHF), chronic obstructive pulmonary disease (COPD), coronary artery disease (CAD), or diabetes and had a hospitalization in the year prior to enrollment. In essence, phase II of the demonstration for HQP served as a replication of the original demonstration. The HQP care management program was delivered by nurse coordinators who regularly talked with patients and provided coordinated care between primary care physicians and specialists, as well as other services such as medication guidance. All positive results from phase I vanished in phase II and the authors test several hypotheses for why results did not replicate. 
They find that treatment group patients had similar hospitalization rates between phase I and II, but that control group patients had substantially lower phase II hospitalization rates. Outcome differences between phase I and phase II were risk-adjusted as phase II had an older population with higher severity of illness. The authors also used propensity score re-weighting to further control for differences in phase I and phase II populations. The Affordable Care Act did promote similar care management services through patient-centered medical homes and accountable care organizations, which likely contributed to improvements in the usual care received by control group patients. The authors also note that the effectiveness of care management may be sensitive to the complexity of the target population's needs. For example, the phase II population was more homebound and was therefore unable to participate in group classes. The big lesson in this paper, though, is that demonstration results may not replicate for different populations or even different time periods.

A machine learning framework for plan payment risk adjustment. Health Services Research [PubMed] Published December 2016

Since my company has been subsumed under IBM Watson Health, I have been trying to wrap my head around this big data revolution and the potential of technological advances such as artificial intelligence or machine learning. While machine learning has infiltrated other disciplines, it is really just starting to influence health economics, so watch out! This paper by Sherri Rose is a nice introduction to a range of machine learning techniques that she applies to the formulation of plan payment risk adjustments. In insurance systems where patients can choose from a range of insurance plans, there is the problem of adverse selection where some plans may attract an abundance of high risk patients. To control for this, plans (e.g. in the Affordable Care Act marketplaces) with high percentages of high risk consumers get compensated based on a formula that predicts spending based on population characteristics, including diagnoses. Rose says that these formulas are still based on a 1970s framework of linear regression and may benefit from machine learning algorithms. Given that plan payment risk adjustments are essentially predictions, this does seem like a good application. In addition to testing goodness of fit of machine learning algorithms, Rose is interested in whether such techniques can reduce the number of variable inputs. Without going into any detail, insurers have found ways to “game” the system and fewer variable inputs would restrict this activity. Rose introduces a number of concepts in the paper (at least they were new to me) such as ensemble machine learning, discrete learning frameworks and super learning frameworks. She uses a large private insurance claims dataset and breaks the dataset into what she calls 10 “folds” which allows her to run 5 prediction models, each with its own cross-validation dataset. Aside from one parametric regression model, she uses several penalized regression models, as well as neural net, single-tree, and random forest models. 
She describes machine learning as aiming to smooth over data in a similar manner to parametric regression but with fewer assumptions and allowing for more flexibility. To reduce the number of variables in models, she applies techniques that limit variables to, for example, just the 10 most influential. She concludes that applying machine learning to plan payment risk adjustment models can increase efficiencies and her results suggest that it is possible to get similar results even with a limited number of variables. It is curious that the parametric model performed as well as or better than many of the different machine learning algorithms. I’ll take that to mean we can continue using our trusted regression methods for at least a few more years.
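The “folds” mentioned above refer to k-fold cross-validation: each observation is predicted by a model that never saw it during fitting. The sketch below shows only that splitting-and-scoring loop, not Rose's super learner; the mean-only baseline learner, the function names and the toy data are all hypothetical.

```python
import random


def k_fold_indices(n, k, seed=0):
    """Shuffle the row indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_mse(xs, ys, fit, predict, k=10, seed=0):
    """Mean squared error when each fold is predicted by a model
    trained only on the other k - 1 folds."""
    folds = k_fold_indices(len(ys), k, seed)
    squared_errors = []
    for held_out in folds:
        held_out_set = set(held_out)
        train = [i for i in range(len(ys)) if i not in held_out_set]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in held_out:
            squared_errors.append((ys[i] - predict(model, xs[i])) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Hypothetical baseline learner: always predict the training-set mean.
def fit_mean(train_x, train_y):
    return sum(train_y) / len(train_y)


def predict_mean(model, x):
    return model


# Made-up toy data: annual spending rising with age (not from the paper).
ages = list(range(30, 50))
spending = [1000.0 + 50.0 * (age - 30) for age in ages]

folds = k_fold_indices(len(spending), k=10)
cv_mse = cross_validated_mse(ages, spending, fit_mean, predict_mean, k=10)
print(f"10-fold cross-validated MSE of the mean-only baseline: {cv_mse:.1f}")
```

Any learner that exposes the same fit and predict pair can be scored this way, which is what allows a parametric regression to be compared on equal terms with penalized regressions, trees or neural nets.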