Visualising PROMs data

The patient reported outcome measures (PROMs) data form a large database of before-and-after health-related quality of life (HRQoL) measures for a large number of patients undergoing one of four key procedures: hip replacement, knee replacement, varicose vein surgery and surgery for groin hernia. The outcome measures are the EQ-5D index and visual analogue scale (and a disease-specific measure for three of the interventions). The data also record the provider of the operation. Being publicly available, these data allow us to look at a range of different questions: what’s the average effect of the surgery on HRQoL? What are the differences between providers in gains to HRQoL or in patient casemix? Great!

The first thing we should always do with new data is to look at it. This might be in an exploratory way to determine the questions to ask of the data or in an analytical way to get an idea of the relationships between variables. Plotting the data communicates more about what’s going on than any table of statistics alone. However, the plots on the NHS Digital website might be accused of being a little uninspired as they collapse a lot of the variation into simple charts that conceal a lot of what’s going on. For example:

So let’s consider other ways of visualising these data. For all these plots, a walk-through of the code is at the end of this post.

Now, I’m not a regular user of PROMs data, so what I think are the interesting features of the data may not reflect what the data are generally used for. For this post, I think the interesting features are:

  • The joint distribution of pre- and post-op scores
  • The marginal distributions of pre- and post-op scores
  • The relationship between pre- and post-op scores over time

We will pool all the data from six years’ worth of PROMs data. This gives us over 200,000 observations. A scatter plot with this information is useless as the density of the points will be very high. A useful alternative is hexagonal binning, which is like a two-dimensional histogram. Hexagonal tiles, which usefully tessellate and are more interesting to look at than squares, can be shaded or coloured with respect to the number of observations in each bin across the support of the joint distribution of pre- and post-op scores (which is [-0.5,1]x[-0.5,1]). We can add the marginal distributions to the axes and then add smoothed trend lines for each year. Since the data are constrained between -0.5 and 1, the mean may not be a very good summary statistic, so we’ll plot a smoothed median trend line for each year. Finally, we’ll add a line on the diagonal. Patients above this line have improved and patients below it deteriorated.
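To make the binning idea concrete, here is a minimal base-R sketch on simulated scores (not PROMs data). It uses square cells as a stand-in for hexagons, and the simulated distribution, the bin width and the variable names are all assumptions made for illustration only.

```r
# Simulated pre- and post-op scores; NOT real PROMs data
set.seed(1)
n    <- 10000
pre  <- pmin(pmax(rnorm(n, mean = 0.35, sd = 0.3), -0.5), 1)
post <- pmin(pmax(pre + rnorm(n, mean = 0.4, sd = 0.3), -0.5), 1)

# Two-dimensional histogram: count the observations in each square cell
# (hexagonal binning does the same job with hexagonal tiles)
breaks <- seq(-0.5, 1, by = 0.25)
counts <- table(
  pre  = cut(pre,  breaks, include.lowest = TRUE),
  post = cut(post, breaks, include.lowest = TRUE)
)
print(counts)

# Position relative to the diagonal: above = improved, below = deteriorated
change <- table(factor(sign(post - pre), levels = c(-1, 0, 1),
                       labels = c("deteriorated", "unchanged", "improved")))
print(change)

# With scores bounded at 1 the distribution is skewed, so mean and median differ
print(c(mean = mean(post), median = median(post)))
```

The cell counts are what the shading in a binned plot encodes, and the second table is the numerical counterpart of reading points off either side of the diagonal.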

Hip replacement results

There’s a lot going on in the graph, but I think it reveals a number of key points about the data that we wouldn’t have seen from the standard plots on the website:

  • There appear to be four clusters of patients:
    • Those who were in close to full health prior to the operation and were in ‘perfect’ health (score = 1) after;
    • Those who were in close to full health pre-op and who didn’t really improve post-op;
    • Those who were in poor health (score close to zero) and made a full recovery;
    • Those who were in poor health and who made a partial recovery.
  • The median change is an improvement in health.
  • The median change improves modestly from year to year for a given pre-op score.
  • There are ceiling effects for the EQ-5D.

None of this is news to those who study these data. But this way of presenting the data certainly tells more of a story than the current plots on the website.

R code

We’re going to consider hip replacement, but the code is easily modified for the other outcomes. Firstly we will take the pre- and post-op score and their difference and pool them into one data frame.

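# The pre-op column names (Pre.Op.Q.EQ5D.Index, Q1_EQ5D_INDEX) are assumed by
# analogy with the post-op names; check them against the headers of your files.

# df 14/15
df <- read.csv("C:/docs/proms/Record Level Hip Replacement 1415.csv")

df$pre  <- df$Pre.Op.Q.EQ5D.Index
df$post <- df$Post.Op.Q.EQ5D.Index
df$diff <- df$post - df$pre

df1415 <- df[, c('Provider.Code', 'pre', 'post', 'diff')]
df1415$year <- "2014/15"

# df 13/14
df <- read.csv("C:/docs/proms/Record Level Hip Replacement 1314.csv")

df$pre  <- df$Pre.Op.Q.EQ5D.Index
df$post <- df$Post.Op.Q.EQ5D.Index
df$diff <- df$post - df$pre

df1314 <- df[, c('Provider.Code', 'pre', 'post', 'diff')]
df1314$year <- "2013/14"

# df 12/13
df <- read.csv("C:/docs/proms/Record Level Hip Replacement 1213.csv")

df$pre  <- df$Pre.Op.Q.EQ5D.Index
df$post <- df$Post.Op.Q.EQ5D.Index
df$diff <- df$post - df$pre

df1213 <- df[, c('Provider.Code', 'pre', 'post', 'diff')]
df1213$year <- "2012/13"

# df 11/12
df <- read.csv("C:/docs/proms/Hip Replacement 1112.csv")

df$pre  <- df$Q1_EQ5D_INDEX
df$post <- df$Q2_EQ5D_INDEX
df$diff <- df$post - df$pre

df1112 <- df[, c('Provider.Code', 'pre', 'post', 'diff')]
df1112$year <- "2011/12"

# df 10/11
df <- read.csv("C:/docs/proms/Record Level Hip Replacement 1011.csv")

df$pre  <- df$Q1_EQ5D_INDEX
df$post <- df$Q2_EQ5D_INDEX
df$diff <- df$post - df$pre

df1011 <- df[, c('Provider.Code', 'pre', 'post', 'diff')]
df1011$year <- "2010/11"

# Pool the years into one data frame; year is used to colour the trend lines
df <- rbind(df1415, df1314, df1213, df1112, df1011)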
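The per-year steps above repeat the same pattern, so they could be wrapped in a small function. The sketch below is one way to do that, not the code used for the post: `read_proms_year` and its arguments are hypothetical names, and the column names in the demonstration file are made up so that the example runs without the real data.

```r
# Hypothetical helper: read one year's file and return standardised columns.
# pre_col and post_col name the EQ-5D index columns in that file; the values
# passed in must match the headers of the actual CSV.
read_proms_year <- function(path, pre_col, post_col, year,
                            provider_col = "Provider.Code") {
  raw <- read.csv(path)
  out <- data.frame(
    Provider.Code = raw[[provider_col]],
    pre  = raw[[pre_col]],
    post = raw[[post_col]],
    stringsAsFactors = FALSE
  )
  out$diff <- out$post - out$pre
  out$year <- year
  out
}

# Self-contained demonstration on a tiny made-up file (not PROMs data)
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(Provider.Code = c("A01", "B02"),
                     Q1_EQ5D_INDEX = c(0.2, 0.5),
                     Q2_EQ5D_INDEX = c(0.8, 0.4)),
          tmp, row.names = FALSE)
demo <- read_proms_year(tmp, "Q1_EQ5D_INDEX", "Q2_EQ5D_INDEX", "2011/12")
print(demo)
```

With one call per file, the results could then be pooled with `do.call(rbind, list(...))`, which keeps any difference in column names between years in a single visible place.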




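Now, for the plot. We will need the packages ggplot2, ggExtra, and extrafont. The last of these is only used to change the plot fonts; it is not essential, but it is aesthetically pleasing. Note that geom_quantile with method = "rqss" also relies on the quantreg package being installed.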

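library(ggplot2)   # geom_hex() needs hexbin; method = "rqss" needs quantreg
library(ggExtra)
library(extrafont)
loadfonts(device = "win")

# The ggplot() call, hexagon layer, diagonal and legend labels were lost from the listing; they are reconstructed here as assumptions
p <- ggplot(df, aes(x = pre, y = post)) +
  geom_hex(bins = 15, colour = "white") +
  geom_abline(intercept = 0, slope = 1) +
  geom_quantile(aes(color = year), method = "rqss", lambda = 2, quantiles = 0.5, size = 1) +
  scale_fill_gradient2(name = "Count (000s)", low = "light grey", midpoint = 15000,
    mid = "blue", high = "red", labels = function(x) x / 1000) +
  labs(x = "Pre-op EQ-5D index score", y = "Post-op EQ-5D index score") +
  theme(legend.position = "bottom", text = element_text(family = "Gill Sans MT"))

ggMarginal(p, type = "histogram")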
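
The features read off the graph earlier can also be checked numerically. Here is a minimal sketch, run on simulated data so that it is self-contained; with the real data, `df` would be the pooled data frame with columns `pre`, `post`, `diff` and `year`. The simulated values and the cut-points of the pre-op bands are assumptions.

```r
# Simulated stand-in for the pooled data frame; NOT real PROMs data
set.seed(42)
n  <- 5000
df <- data.frame(
  pre  = round(runif(n, -0.2, 0.9), 2),
  year = sample(c("2013/14", "2014/15"), n, replace = TRUE)
)
df$post <- pmin(round(df$pre + rnorm(n, 0.4, 0.3), 2), 1)  # ceiling at 1
df$diff <- df$post - df$pre

# Median change in the index (positive = improvement)
print(median(df$diff))

# Ceiling effect: share of patients at the maximum score after surgery
print(mean(df$post == 1))

# Median post-op score by year within bands of pre-op score
df$pre_band <- cut(df$pre, breaks = c(-0.5, 0, 0.5, 1), include.lowest = TRUE)
med <- aggregate(post ~ pre_band + year, data = df, FUN = median)
print(med)
```

Comparing rows of `med` within a pre-op band is a crude, tabular version of comparing the smoothed median lines across years.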

Chris Sampson’s journal round-up for 16th January 2017

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Competition and quality indicators in the health care sector: empirical evidence from the Dutch hospital sector. The European Journal of Health Economics [PubMed] Published 3rd January 2017

In case you weren’t already convinced, this paper presents more evidence to support the notion that (non-price) competition between health care providers is good for quality. The Dutch system is based on compulsory insurance and information on quality of hospital care is made public. One feature of the Dutch health system is that – for many elective hospital services – prices are set following a negotiation between insurers and hospitals. This makes the setting of the study a bit different to some of the European evidence considered to date, because there is scope for competition on price. The study looks at claims data for 3 diagnosis groups – cataract, adenoid/tonsils and bladder tumor – between 2008 and 2011. The authors’ approach to measuring competition is a bit more sophisticated than some other studies’ and is based on actual market share. A variety of quality indicators are used for the 3 diagnosis groups relating mainly to the process of care (rather than health outcomes). Fixed and random effects linear regression models are used to estimate the impact of market share upon quality. Casemix was only controlled for in relation to the proportion of people over 65 and the proportion of women. Where a relationship was found, it tended to be in favour of lower market share (i.e. greater competition) being associated with higher quality. For cataract and for bladder tumor there was a ‘significant’ effect. So in this setting at least, competition seems to be good news for quality. But the effect sizes are neither huge nor certain. A look at each of the quality indicators separately showed plenty of ‘non-significant’ relationships in both directions. While a novelty of this study is the liberalised pricing context, the authors find that there is no relationship between price and quality scores. So even if we believe the competition-favouring results, we needn’t abandon the ‘non-price competition only’ mantra.

Cost-effectiveness thresholds in global health: taking a multisectoral perspective. Value in Health Published 3rd January 2017

We all know health care is not the only – and probably not even the most important – determinant of health. We call ourselves health economists, but most of us are simply health care economists. Rarely do we look beyond the domain of health care. If our goal as researchers is to help improve population health, then we should probably be allocating more of our mental resource beyond health care. The same goes for public spending. Publicly provided education might improve health in a way that the health service would be willing to fund. Likewise, health care might improve educational attainment. This study considers resource allocation decisions using the familiar ‘bookshelf approach’, but goes beyond the unisectoral perspective. The authors discuss a two-sector world of health and education, and demonstrate the ways in which there may be overlaps in costs and outcomes. In short, there are likely to be situations in which the optimal multisectoral decision would be for individual sectors to increase their threshold in order to incorporate the spillover benefits of an intervention in another sector. The authors acknowledge that – in a perfect world – a social-welfare-maximising government would have sufficient information to allocate resources earmarked for specific purposes (e.g. health improvement) across sectors. But this doesn’t happen. Instead the authors propose the use of a cofinancing mechanism, whereby funds would be transferred between sectors as needed. The paper provides an interesting and thought-provoking discussion, and the idea of transferring funds between sectors seems sensible. Personally I think the problem is slightly misspecified. I don’t believe other sectors face thresholds in the same way, because (generally speaking) they do not employ cost-effectiveness analysis. And I’m not sure they should. I’m convinced that for health we need to deviate from welfarism, but I’m not convinced of it for other sectors. So from my perspective it is simply a matter of health vs everything else, and we can incorporate the ‘everything else’ into a cost-effectiveness analysis (with a societal perspective) in monetary terms. Funds can be reallocated as necessary with each budget statement (of which there seem to be a lot nowadays).

Is the Rational Addiction model inherently impossible to estimate? Journal of Health Economics [RePEc] Published 28th December 2016

Saddle point dynamics. Something I’ve never managed to get my head around, but here goes… This paper starts from the problem that empirical tests of the Rational Addiction model serve up wildly variable and often ridiculous (implied) discount rates. That may be part of the reason why economists tend to support the RA model but at the same time believe that it has not been empirically proven. The paper sets out the basis for saddle point dynamics in the context of the RA model, and outlines the nature of the stable and unstable root within the function that determines a person’s consumption over time. The authors employ Monte Carlo estimation of RA-type equations, simulating panel data observations. These simulations demonstrate that the presence of the unstable root may make it very difficult to estimate the coefficients. So even if the RA model can truly represent behaviour, empirical estimation may contradict it. This raises the question of whether the RA model is essentially untestable. A key feature of the argument relates to use of the model where a person’s time horizon is not considered to be infinite. Some non-health economists like to assume it is, which, as the authors wryly note, is not particularly ‘rational’.


Sam Watson’s journal round-up for 10th October 2016

This week’s journal round-up is a special edition featuring a series of papers on health econometrics published in this month’s issue of the Journal of the Royal Statistical Society: Series A.

Healthcare facility choice and user fee abolition: regression discontinuity in a multinomial choice setting. JRSS: A [RePEc] Published October 2016

Charges for access to healthcare – user fees – present a potential barrier to patients in accessing medical services. User fees were touted in the 1980s as a way to provide revenue for healthcare services in low and middle income countries, improve quality, and reduce overuse of limited services. However, a growing evidence base suggests that user fees do not achieve these ends and reduce uptake of preventative and curative services. This article seeks to provide new evidence on the topic using a regression discontinuity (RD) design while also exploring the use of RD with multinomial outcomes. Based on South African data, the discontinuity of interest is that children under the age of six are eligible for free public healthcare whereas older children must pay a fee; user fees for the under sixes were abolished following the end of apartheid in 1994. The results provide evidence that removal of user fees resulted in more patients using public healthcare facilities than costly private care or care at home. The authors describe how their non-parametric model performs better, in terms of out-of-sample predictive performance, than the parametric model. And when the non-parametric model is applied to examine treatment effects across income quantiles, the treatment effect is found to be concentrated among poorer families and to be principally due to them switching between home care and public healthcare. This analysis supports an already substantial literature on user fees, but a literature that has previously been criticised for a lack of methodological rigour, so this paper makes a welcome addition.

Do market incentives for hospitals affect health and service utilization?: evidence from prospective pay system–diagnosis-related groups tariffs in Italian regions. JRSS: A [RePEc] Published October 2016

The effect of pro-market reforms in the healthcare sector on hospital quality is a contentious and oft-discussed topic, not least due to the difficulties with measuring quality. We critically discussed a recent, prominent paper that analysed competitive reforms in the English NHS, for example. This article examines the effect of increased competition in Italy on health service utilisation: in the mid 1990s the Italian national health service moved from a system of national tariffs to region-specific tariffs in order for regions to better incentivise local health objectives and reflect production costs. For example, the tariffs for a vaginal delivery ranged from €697 to €1,750 in 2003. This variation between regions and over time provides a source of variation to analyse the effects of these reforms. The treatment is defined as a binary variable at each time point for whether the regions had switched from national to local tariffs, although one might suggest that this disposes of some interesting variation in how the policy was enacted. The headline finding is that the reforms had little or no effect on health, but did reduce utilisation of healthcare services. The authors interpret this as suggesting they reduce over-utilisation and hence improve efficiency. However, I am still pondering how this might work: presumably the marginal benefit of treating patients who do not require particular services is reduced, although the marginal cost of treating those patients who do not need it is likely also to be lower as they are healthier. The between-region differences in tariffs may well shed some light on this.

Short- and long-run estimates of the local effects of retirement on health. JRSS: A [RePEc] Published October 2016

The proportion of the population that is retired is growing. Governments have responded by increasing the retirement age to ensure the financial sustainability of pension schemes. But retirement may have other consequences, not least on health. If retirement worsens one’s health then delaying the retirement age may improve population health, and if retirement is good for you, the opposite may occur. Retirement grants people a new lease of free time, which they may fill with health promoting activities, or the loss of activity and social relations may adversely impact on one’s health and quality of life. In addition, people who are less healthy may be more likely to retire. Taken all together, estimating the effects of retirement on health presents an interesting statistical challenge with important implications for policy. This article uses the causal inference method du jour, regression discontinuity design, and the data are from that workhorse of British economic studies, the British Household Panel Survey. The discontinuity is obviously the retirement age; to deal with the potential reverse causality, eligibility for the state pension is used as an instrument. Overall the results suggest that the short term impact on health is minimal, although it does increase the risk of a person becoming sedentary, which in the long run may precipitate health problems.


Other articles on health econometrics in this special issue:

The association between asymmetric information, hospital competition and quality of healthcare: evidence from Italy.

This paper finds evidence that increased between-hospital competition does not lead to improved outcomes, as patients were choosing hospitals on the basis of information from their social networks. We featured this paper in a previous round-up.

A quasi-Monte-Carlo comparison of parametric and semiparametric regression methods for heavy-tailed and non-normal data: an application to healthcare costs.

This article considers the problem of modelling non-normally distributed healthcare costs data. Linear models with square root transformations and generalised linear models with square root link functions are found to perform the best.

Phantoms never die: living with unreliable population data.

Not strictly health econometrics, more demographics, this article explores how to make inferences about population mortality rates and trends when there are unreliable population data due to fluctuations in birth patterns. For researchers using macro health outcomes data, such corrections may prove useful.