Visualising PROMs data

The Patient Reported Outcome Measures data set, or PROMs, is a large database with before-and-after health-related quality of life (HRQoL) measures for a large number of patients undergoing one of four procedures: hip replacement, knee replacement, varicose vein surgery and surgery for groin hernia. The outcome measures are the EQ-5D index and visual analogue scale (and a disease-specific measure for three of the interventions). These data also identify the provider of the operation. Being publicly available, these data allow us to look at a range of different questions: what’s the average effect of the surgery on HRQoL? What are the differences between providers in gains to HRQoL or in patient casemix? Great!

The first thing we should always do with new data is to look at it. This might be in an exploratory way to determine the questions to ask of the data or in an analytical way to get an idea of the relationships between variables. Plotting the data communicates more about what’s going on than any table of statistics alone. However, the plots on the NHS Digital website might be accused of being a little uninspired as they collapse a lot of the variation into simple charts that conceal a lot of what’s going on. For example:

So let’s consider other ways of visualising these data. A walkthrough of the code for all these plots is at the end of this post.

Now, I’m not a regular user of PROMs data, so what I think are the interesting features of the data may not reflect what the data are generally used for. For this, I think the interesting features are:

  • The joint distribution of pre- and post-op scores
  • The marginal distributions of pre- and post-op scores
  • The relationship between pre- and post-op scores over time

We will pool all the data from six years’ worth of PROMs data. This gives us over 200,000 observations. A scatter plot with this information is useless as the density of the points will be very high. A useful alternative is hexagonal binning, which is like a two-dimensional histogram. Hexagonal tiles, which usefully tessellate and are more interesting to look at than squares, can be shaded or coloured with respect to the number of observations in each bin across the support of the joint distribution of pre- and post-op scores (which is [-0.5,1]x[-0.5,1]). We can add the marginal distributions to the axes and then add smoothed trend lines for each year. Since the data are constrained between -0.5 and 1, the mean may not be a very good summary statistic, so we’ll plot a smoothed median trend line for each year. Finally, we’ll add a line on the diagonal. Patients above this line have improved and patients below it deteriorated.

Hip replacement results

There’s a lot going on in the graph, but I think it reveals a number of key points about the data that we wouldn’t have seen from the standard plots on the website:

  • There appear to be four clusters of patients:
    • Those who were in close to full health prior to the operation and were in ‘perfect’ health (score = 1) after;
    • Those who were in close to full health pre-op and who didn’t really improve post-op;
    • Those who were in poor health (score close to zero) and made a full recovery;
    • Those who were in poor health and who made a partial recovery.
  • The median change is an improvement in health.
  • The median change improves modestly from year to year for a given pre-op score.
  • There are ceiling effects for the EQ-5D.

None of this is news to those who study these data. But this way of presenting the data certainly tells more of a story than the current plots on the website.

R code

We’re going to consider hip replacement, but the code is easily modified for the other outcomes. Firstly we will take the pre- and post-op score and their difference and pool them into one data frame.

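# The pre-op column names below are assumed by analogy with the post-op names;
# check them against the headers of the files you download.

# df 14/15
df <- read.csv("C:/docs/proms/Record Level Hip Replacement 1415.csv")

df$pre  <- df$Pre.Op.Q.EQ5D.Index   # assumed column name
df$post <- df$Post.Op.Q.EQ5D.Index
df$diff <- df$post - df$pre
df$year <- "2014/15"

df1415 <- df[, c('Provider.Code', 'pre', 'post', 'diff', 'year')]

# df 13/14
df <- read.csv("C:/docs/proms/Record Level Hip Replacement 1314.csv")

df$pre  <- df$Pre.Op.Q.EQ5D.Index   # assumed column name
df$post <- df$Post.Op.Q.EQ5D.Index
df$diff <- df$post - df$pre
df$year <- "2013/14"

df1314 <- df[, c('Provider.Code', 'pre', 'post', 'diff', 'year')]

# df 12/13
df <- read.csv("C:/docs/proms/Record Level Hip Replacement 1213.csv")

df$pre  <- df$Pre.Op.Q.EQ5D.Index   # assumed column name
df$post <- df$Post.Op.Q.EQ5D.Index
df$diff <- df$post - df$pre
df$year <- "2012/13"

df1213 <- df[, c('Provider.Code', 'pre', 'post', 'diff', 'year')]

# df 11/12
df <- read.csv("C:/docs/proms/Hip Replacement 1112.csv")

df$pre  <- df$Q1_EQ5D_INDEX         # assumed column name
df$post <- df$Q2_EQ5D_INDEX
df$diff <- df$post - df$pre
df$year <- "2011/12"

df1112 <- df[, c('Provider.Code', 'pre', 'post', 'diff', 'year')]

# df 10/11
df <- read.csv("C:/docs/proms/Record Level Hip Replacement 1011.csv")

df$pre  <- df$Q1_EQ5D_INDEX         # assumed column name
df$post <- df$Q2_EQ5D_INDEX
df$diff <- df$post - df$pre
df$year <- "2010/11"

df1011 <- df[, c('Provider.Code', 'pre', 'post', 'diff', 'year')]

# Pool the years into one data frame. Only five files are read above; a further
# year can be read in the same way and added to this call.
df <- rbind(df1011, df1112, df1213, df1314, df1415)
df$year <- factor(df$year)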
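Since each year is loaded in the same way, the repeated steps could be folded into a small helper. This is only a sketch: read_proms is a hypothetical function, not part of the original post, and the column names used in the demonstration (Q1_EQ5D_INDEX for the pre-op score in particular) are assumptions to check against the real files. The demonstration writes a tiny made-up file to a temporary path so that it runs without the PROMs downloads.

```r
# Hypothetical helper: read one year of PROMs data and keep the columns we need.
# The caller supplies the column names, since they differ between years.
read_proms <- function(path, pre_col, post_col, year_label,
                       provider_col = "Provider.Code") {
  raw <- read.csv(path)
  data.frame(
    Provider.Code = raw[[provider_col]],
    pre  = raw[[pre_col]],
    post = raw[[post_col]],
    diff = raw[[post_col]] - raw[[pre_col]],
    year = year_label,
    stringsAsFactors = FALSE
  )
}

# Demonstration on a tiny made-up file (not real PROMs data)
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(Provider.Code = c("AAA", "BBB"),
                     Q1_EQ5D_INDEX = c(0.2, 0.5),
                     Q2_EQ5D_INDEX = c(0.8, 0.4)),
          tmp, row.names = FALSE)

demo <- read_proms(tmp, "Q1_EQ5D_INDEX", "Q2_EQ5D_INDEX", "2011/12")
print(demo)
```

With a helper like this, the pooled data frame would be an rbind() of one read_proms() call per year.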




Now, for the plot. We will need the packages ggplot2, ggExtra, and extrafont. The last of these is only there to change the plot fonts: not essential, but aesthetically pleasing.

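library(ggplot2)   # stat_binhex() also needs hexbin; method = "rqss" needs quantreg
library(ggExtra)
library(extrafont)
loadfonts(device = "win")

# The ggplot(), hexagonal bin and diagonal layers are missing from this copy of the
# post and are reconstructed from the description above; bins = 15 is an assumed value.
p <- ggplot(df, aes(x = pre, y = post)) +
 stat_binhex(bins = 15, color = "white") +
 geom_abline(intercept = 0, slope = 1) +
 geom_quantile(aes(color=year),method = "rqss", lambda = 2,quantiles=0.5,size=1)+
 scale_fill_gradient2(name="Count (000s)",low="light grey",midpoint = 15000,
   mid="blue",high = "red", labels = function(x) x / 1000)+  # labels: assumed, shows counts in thousands
 labs(x="Pre-op EQ-5D index score",y="Post-op EQ-5D index score")+
 theme(legend.position = "bottom",text=element_text(family="Gill Sans MT"))

ggMarginal(p, type = "histogram")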
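The features read off the plot (the share of patients above the diagonal, the median change by year, and the ceiling effect) can also be checked numerically. The sketch below uses a simulated stand-in for the pooled data frame, so the numbers it prints are purely illustrative and say nothing about the real PROMs data. The names sim, shares, median_by_year and ceiling_share are mine.

```r
# Simulated stand-in for the pooled data frame (hypothetical values, not PROMs data)
set.seed(1)
n <- 1000
sim <- data.frame(
  pre  = pmin(1, pmax(-0.5, rnorm(n, mean = 0.4, sd = 0.3))),
  year = sample(c("2013/14", "2014/15"), n, replace = TRUE)
)
sim$post <- pmin(1, pmax(-0.5, sim$pre + rnorm(n, mean = 0.3, sd = 0.3)))
sim$diff <- sim$post - sim$pre

# Share of patients below, on, and above the diagonal
change <- cut(sim$diff, breaks = c(-Inf, -1e-8, 1e-8, Inf),
              labels = c("deteriorated", "no change", "improved"))
shares <- prop.table(table(change))

# Median change in score for each year
median_by_year <- tapply(sim$diff, sim$year, median)

# Ceiling effect: share of patients at the maximum score after the operation
ceiling_share <- mean(sim$post == 1)

print(shares)
print(median_by_year)
print(ceiling_share)
```

Swapping sim for the pooled PROMs data frame, with the same pre, post, diff and year columns, gives the corresponding summaries for the real data.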

Kenneth Arrow on healthcare economics: a 21st century appreciation

Nobel laureate Kenneth Arrow passed away on February 21, 2017. In a classic, fifty-year-old paper entitled Uncertainty and the Welfare Economics of Medical Care, Arrow discussed how:

“the operation of the medical-care industry and the efficacy with which it satisfies the needs of society differs from… a competitive model… If a competitive equilibrium exists at all, and if all commodities relevant to costs or utilities are in fact priced in the market, then the equilibrium is necessarily [Pareto] optimal” (emphasis added)

Note the implicit assumption that price reflects value, to which I’ll return. As Arrow elegantly explained, there are vast differences between the actual healthcare market and the competitive model, and, moreover, these differences arise from important features of the actual healthcare market.

Identifying the lack of realism of the competitive model in health care may lead to deeper understanding of the actual system. In essence this is what Arrow does. Although both medical care and our expectations have changed greatly, Arrow ’63 is still valid and worth reading today.

Here is Arrow’s summary of the differences between the healthcare market and typical competitive markets.

The nature of demand

Demand for medical services is irregular and unpredictable:

“Medical services, apart from preventive services, afford satisfaction only in the event of illness, a departure from the normal state of affairs… Illness is, thus, not only risky but a costly risk in itself, apart from the cost of medical care.”

Expected behavior of the physician

“It is at least claimed that treatment is dictated by objective needs of the case and not limited by financial considerations… Charity treatment in one form or another does exist because of this tradition about human rights to adequate medical care.”

Product uncertainty

“Recovery from disease is as unpredictable as its incidence…  Because medical knowledge is so complicated, the information possessed by the physician as to the consequences and possibilities of treatment is necessarily very much greater than that of the patient, or at least so it is believed by both parties.”

Supply conditions

Barriers to entry include licensing and other controls on quality (accreditation) and costs.

“One striking consequence of the control of quality is the restriction on the range offered… The declining ratio of physicians to total employees in the medical-care industry shows that substitution of less trained personnel, technicians and the like, is not prevented completely, but the central role of the highly trained physician is not affected at all.”

Pricing practices

There are no fixed prices:

“extensive price discrimination by income (with an extreme of zero prices for sufficiently indigent patients)… the apparent rigidity of so-called administered prices considerably understates the actual flexibility.”

Avik Roy observes in a critical National Review article that “Because patients don’t see the bill until after the non-refundable service has been consumed, and because patients are given little information about price and cost, patients and payors are rarely able to shop around for a medical service based on price and value.”

Medicine has seen major changes since Arrow’s 1963 paper. For example, the treatment of blocked coronary arteries has evolved from coronary bypass to angioplasty to early stents and finally drug-eluting stents. We have seen the advent of minimally invasive surgery, robotic surgery and catheter-based cardiac valve repair and replacement. We have seen drugs to treat hepatitis C and biologicals to treat arthritis and cancer. Many conditions have been transformed from acute to chronic but (at least temporarily) manageable. There are also divergent trends, such as increases in both natural childbirth and Caesarean sections.

In the last 50 years, medicine has become more powerful, but also significantly more complex and, overall, more expensive. Intensive care units are a good example: valuable therapeutically, but expensive to provide. At the same time, many treatments are both better (more valuable to the patient) and less expensive to provide; these range from root canal (frequently two visits to the dentist instead of four) to the significantly less invasive treatments for many cardiac rhythm abnormalities (radio-frequency ablation) and stents for coronary artery disease. The advent of epinephrine auto-injectors has been a lifesaver, but the cost of the Epi-Pen has increased significantly.

Can a competitive economic system appropriately and reasonably price such treatments and devices? Arrow argues that, if not, non-market social institutions will arise and address these challenges. Here is a deeper look.

Arrow’s first two points are still virtually axiomatic today: demand for medical services has become even more unpredictable with the continued growth of advanced, effective interventions and corresponding, appropriately increasing (in my opinion), patient expectations. Similarly, as medical care advances, we increasingly see medical care as a human right and in many cases, a societal obligation. We have come to expect treatment dictated by objective needs and not limited by financial considerations, not only from physicians but from a growing number of key players including pharmaceutical companies. To their credit, in many cases (AIDS comes to mind) pharmaceutical companies have responded by sharply reducing prices in the developing world.

Powerful chemotherapeutic and biologic drugs may have increased the uncertainty and asymmetry of information observed by Arrow, both in their effectiveness and in their side effects. In many cases one needs the language and mathematics of probability and statistics to evaluate, assess and describe their efficacy and utility. One needs an understanding of probability to determine when and how to use common preventive techniques, such as mammograms and PSA screening. Here is an example, paraphrased from Gigerenzer and Edwards (see also Strogatz). Women 40 to 50 years old, with no family history of breast cancer, are a low-risk population; the overall probability of breast cancer in this population is 0.8%. Assume that mammography has a sensitivity of 90% and a false positive rate of 7%.  A woman has a positive mammogram. What is the probability that she has breast cancer? Among 25 German doctors surveyed, 36% said 90% or more, 32% said 50-80%, and 32% said 10% or less. Most (95%) of United States doctors thought the probability was approximately 75%.  (See the links above for the answer, or see my next blog on the challenge of communicating probability).
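For readers who want to check the arithmetic themselves, the figures quoted above can be plugged straight into Bayes’ rule. The snippet is a minimal sketch; the variable names are mine, and the three inputs are exactly the prevalence, sensitivity and false positive rate given in the example.

```r
prevalence  <- 0.008  # P(cancer) in this low-risk population
sensitivity <- 0.90   # P(positive mammogram | cancer)
false_pos   <- 0.07   # P(positive mammogram | no cancer)

# Total probability of a positive mammogram
p_positive <- prevalence * sensitivity + (1 - prevalence) * false_pos

# Bayes' rule: P(cancer | positive mammogram)
ppv <- prevalence * sensitivity / p_positive
print(round(ppv, 3))  # 0.094
```

Put in terms of frequencies: of 1,000 such women, 8 have breast cancer and about 7 of them test positive, while about 69 of the 992 without cancer also test positive, so only around 7 of the 76 positive results, or about 9%, are true positives.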

Arrow’s information asymmetry remains, despite the growing availability of accessible medical information on the web, perhaps for good reasons such as the ability to effectively address the needs of sicker patients.

I would amend Arrow’s discussion of supply conditions to include a wide variety of cost barriers, ranging from the large fixed costs of ICUs to the costs of medical research. The high cost of basic medical services relative to per capita GDP in the developing world represents a barrier as high as any faced in the developed world. As Arrow notes, society has addressed this challenge through a variety of pricing mechanisms outside traditional competitive models. These mechanisms may not, and in general will not, achieve a Pareto optimum, but their wide endorsement by society does suggest that they achieve a more general optimum.

“I propose here the view that, when the market fails to achieve an optimal state, society will, to some extent at least, recognize the gap, and nonmarket social institutions will arise attempting to bridge it… But it is contended here that the special structural characteristics of the medical-care market are largely attempts to overcome the lack of optimality due to the nonmarketability of the bearing of suitable risks and the imperfect marketability of information. These compensatory institutional changes, with some reinforcement from usual profit motives, largely explain the observed noncompetitive behavior of the medical-care market, behavior which, in itself, interferes with optimality. The social adjustment towards optimality thus puts obstacles in its own path.”

It is this view which I find too limiting. I would suggest that society has at least implicitly concluded that price alone does not define value, and thus formed a broader definition of optimality, not simply Pareto optimality in a competitive market. Society is finding and supporting ways to overcome obstacles toward this broader sense of optimality.

The Bill & Melinda Gates Foundation vaccination project aims to reduce the number of children that die each year from preventable disease (currently around 1.5 million). The lifebox project, founded by Dr Atul Gawande, provides affordable, high quality pulse oximeters to the developing world and now seeks to address basic surgical safety in the developing world. Important advances also arise in the developing world; most recently, an easy to deliver, more effective oral cholera vaccine developed in Vietnam.

Arrow himself recognizes the limits of a traditional economic description of the medical care market in his concluding Postscript, arguing that “The logic and limitations of ideal competitive behavior under uncertainty force us to recognize the incomplete description of reality supplied by the impersonal price system.” I conclude more generally not only that prices do not necessarily represent value in medical care (as Arrow observed), but that the combination of uncertainty, externalities, high costs, divergent economies, and technological advance means that price alone cannot describe value in medical care. A broader, more general theory of healthcare economics, standing on the shoulders of giants such as Kenneth Arrow and perhaps built around a more general multi-dimensional Pareto optimum, might help us all better understand where we are and where we might go.


Alastair Canaway’s journal round-up for 20th February 2017

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

The estimation and inclusion of presenteeism costs in applied economic evaluation: a systematic review. Value in Health Published 30th January 2017

Presenteeism is one of those issues that you hear about from time to time, but rarely see addressed within economic evaluations. For those who haven’t come across it before, presenteeism refers to being at work but not working at full capacity, for example, because your health limits your ability to work. The literature suggests that presenteeism can have large associated costs, which could significantly affect the results of economic evaluations, and so should be considered. These impacts are rarely captured in practice. This paper sought to identify studies where presenteeism costs were included, and examined how valuation was approached and how much including presenteeism affected costs. The review included cost of illness studies as well as economic evaluations. Just 28 papers had attempted to capture the costs of presenteeism, across a wide variety of disease areas. A range of methods was used. Across all studies, presenteeism costs accounted for 52% (range 19% to 85%) of the total costs relating to the intervention and disease. This is a vast proportion and significantly outweighed absenteeism costs. Presenteeism is clearly a significant issue, yet widely ignored within economic evaluation. This may in part be due to the health and social care perspective advised within the NICE reference case, compounded by the lack of guidance on how to measure and value productivity costs. Should an economic evaluation pursue a societal perspective, the findings suggest that capturing and valuing presenteeism costs should be a priority.

Priority to end of life treatments? Views of the public in the Netherlands. Value in Health Published 5th January 2017

Everybody dies, and thus, end of life care is probably something that we should all have at least a passing interest in. The end of life context is an incredibly tricky research area with methodological pitfalls at every turn. End of life care is often seen as ‘different’ to other care, and this is reflected in NICE having supplementary guidance for the appraisal of end of life interventions. Similarly, in the Netherlands, treatments that do not meet typical cost per QALY thresholds may be provided should public support be sufficient. There is, however, a dearth of evidence on public support, and this paper sought to elucidate the issue using the novel Q methodology. Three primary viewpoints emerged: 1) Access to healthcare as a human right – all have equal rights regardless of setting, that is, nobody is more important. Viewpoint one appeared to reject the notion of scarce resources when it comes to health: ‘you can’t put a price on life’. 2) The second group focussed on providing the ‘right’ care for those with terminal illness and emphasised that quality of life should be respected and unnecessary care at end of life should be avoided. This second group did not place great importance on cost-effectiveness but did acknowledge that costly treatments at end of life might not be the best use of money. 3) Finally, the third group felt there should be a focus on care which is effective and efficient, that is, those treatments which generate the most health should be prioritised. There was a consensus across all three groups that the ultimate goal of the health system is to generate the greatest overall health benefit for the population. This rejects the notion that priority should be given to those at end of life, and the study concludes that across the three groups there was minimal support for the possibility of the terminally ill being treated with priority.

Methodological issues surrounding the use of baseline health-related quality of life data to inform trial-based economic evaluations of interventions within emergency and critical care settings: a systematic literature review. PharmacoEconomics [PubMed] Published 6th January 2017

Catchy title. Conducting research within emergency and critical care settings presents a number of unique challenges. For the health economist seeking to conduct a trial-based economic evaluation, one such issue relates to the calculation of QALYs. To calculate QALYs within a trial, baseline and follow-up data are required. For obvious reasons – severe and acute injuries/illness, unplanned admission – collecting baseline data on those entering emergency and critical care is problematic. Even when patients are conscious, there are ethical issues surrounding collecting baseline data in this setting. The example used relates to somebody being conscious after cardiac arrest: is it appropriate to be getting them to complete HRQL questionnaires? Probably not. Various methods have been used to circumnavigate this issue; this paper sought to systematically review the methods that have been used and provide guidance for future studies. Just 19 studies made it through screening, thus highlighting the difficulty of research in this context. Just one study prospectively collected baseline HRQL data, and this was restricted to patients in a non-life threatening state. Four different strategies were adopted in the remaining papers. Eight studies adopted a fixed health utility for all participants at baseline, and four used only the available data, that is, from the first time point where HRQL was measured. One asked patients to retrospectively recall their baseline state, whilst one other used Delphi methods to derive EQ-5D states from experts. The paper examines the implications and limitations of adopting each of these strategies. The key finding seems to relate to whether or not the trial arms are balanced with respect to HRQL at baseline. This obviously isn’t observed, so the authors suggest trial covariates should instead be used to explore it, with adjustments made where applicable.

If, and that’s a big if, trial arms are balanced, then all four of the methods should give similar answers. It seems the key here is the randomisation; however, even the best randomisation techniques do not always lead to balanced arms, and there is no guarantee of baseline balance. The authors conclude trials should aim to make an initial assessment of HRQL at the earliest opportunity and that further research is required to thoroughly examine how the different approaches will impact cost-effectiveness results.
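The point about balance can be illustrated with a toy calculation. This is a sketch under my own assumptions, not taken from the paper: QALYs are computed as the area under the utility curve using the trapezoidal rule, the follow-up utilities are made-up numbers, and qaly_auc and incremental are hypothetical helper names. When both arms are given the same baseline utility, whatever its value, the incremental QALYs are unchanged; once the baselines differ between arms, they are not.

```r
# Area under the utility curve by the trapezoidal rule
qaly_auc <- function(utility, years) {
  sum(diff(years) * (head(utility, -1) + tail(utility, -1)) / 2)
}

years <- c(0, 0.5, 1)                # baseline, 6 months, 12 months
follow_up_treat   <- c(0.60, 0.75)   # hypothetical follow-up utilities
follow_up_control <- c(0.50, 0.65)   # hypothetical follow-up utilities

# Incremental QALYs for given baseline utilities in each arm
incremental <- function(base_treat, base_control = base_treat) {
  qaly_auc(c(base_treat, follow_up_treat), years) -
    qaly_auc(c(base_control, follow_up_control), years)
}

inc_fixed_zero <- incremental(0)         # same fixed baseline of 0 in both arms
inc_fixed_mid  <- incremental(0.3)       # same fixed baseline of 0.3 in both arms
inc_imbalanced <- incremental(0.3, 0.2)  # baselines differ between arms

print(c(inc_fixed_zero, inc_fixed_mid, inc_imbalanced))
```

In this example the first two results coincide at 0.075 QALYs, while the imbalanced case gives 0.1, which is why the choice of baseline strategy only matters when the arms differ at baseline.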