Visualising PROMs data

The patient-reported outcome measures (PROMs) programme is a large database of before-and-after health-related quality of life (HRQoL) measures for a large number of patients undergoing one of four procedures: hip replacement, knee replacement, varicose vein surgery, and groin hernia surgery. The outcome measures are the EQ-5D index and visual analogue scale (and a disease-specific measure for three of the interventions). The data also identify the provider of the operation. Being publicly available, these data allow us to look at a range of different questions: what’s the average effect of the surgery on HRQoL? What are the differences between providers in gains to HRQoL or in patient casemix? Great!

The first thing we should always do with new data is to look at them. This might be in an exploratory way, to determine the questions to ask of the data, or in an analytical way, to get an idea of the relationships between variables. Plotting the data communicates more about what’s going on than any table of statistics alone. However, the plots on the NHS Digital website might be accused of being a little uninspired, as they collapse a lot of the variation into simple charts that conceal much of what’s going on.

So let’s consider other ways of visualising these data. For all of these plots, a walk-through of the code is at the end of this post.

Now, I’m not a regular user of PROMs data, so what I think are the interesting features of the data may not reflect what they are generally used for. For me, the interesting features are:

  • The joint distribution of pre- and post-op scores
  • The marginal distributions of pre- and post-op scores
  • The relationship between pre- and post-op scores over time

We will pool the data from five years’ worth of PROMs returns. This gives us over 200,000 observations. A scatter plot of this information is useless, as the density of the points will be very high. A useful alternative is hexagonal binning, which is like a two-dimensional histogram. Hexagonal tiles, which usefully tessellate and are more interesting to look at than squares, can be shaded or coloured according to the number of observations in each bin across the support of the joint distribution of pre- and post-op scores (approximately [-0.5,1]x[-0.5,1]). We can add the marginal distributions to the axes and then add a smoothed trend line for each year. Since the data are constrained between -0.5 and 1, the mean may not be a very good summary statistic, so these will be smoothed median trend lines. Finally, we’ll add a line on the diagonal: patients above this line have improved and patients below it have deteriorated.

Hip replacement results

There’s a lot going on in the graph, but I think it reveals a number of key points about the data that we wouldn’t have seen from the standard plots on the website:

  • There appear to be four clusters of patients:
    • Those who were in close to full health prior to the operation and were in ‘perfect’ health (score = 1) after;
    • Those who were in close to full health pre-op and who didn’t really improve post-op;
    • Those who were in poor health (score close to zero) and made a full recovery;
    • Those who were in poor health and who made a partial recovery.
  • The median change is an improvement in health.
  • The median change improves modestly from year to year for a given pre-op score.
  • There are ceiling effects for the EQ-5D.

None of this is news to those who study these data. But this way of presenting the data certainly tells more of a story than the current plots on the website.

R code

We’re going to consider hip replacement, but the code is easily modified for the other procedures. First, for each year we take the pre- and post-op scores and their difference, and pool them into one data frame. Note that the column names differ between the earlier (2010/11 and 2011/12) and later releases.

# df 14/15
df<-read.csv("C:/docs/proms/Record Level Hip Replacement 1415.csv")

df<-df[!is.na(df$Pre.Op.Q.EQ5D.Index),]
df$pre<-df$Pre.Op.Q.EQ5D.Index
df$post<- df$Post.Op.Q.EQ5D.Index
df$diff<- df$post - df$pre

df1415 <- df[,c('Provider.Code','pre','post','diff')]

# df 13/14
df<-read.csv("C:/docs/proms/Record Level Hip Replacement 1314.csv")

df<-df[!is.na(df$Pre.Op.Q.EQ5D.Index),]
df$pre<-df$Pre.Op.Q.EQ5D.Index
df$post<- df$Post.Op.Q.EQ5D.Index
df$diff<- df$post - df$pre

df1314 <- df[,c('Provider.Code','pre','post','diff')]

# df 12/13
df<-read.csv("C:/docs/proms/Record Level Hip Replacement 1213.csv")

df<-df[!is.na(df$Pre.Op.Q.EQ5D.Index),]
df$pre<-df$Pre.Op.Q.EQ5D.Index
df$post<- df$Post.Op.Q.EQ5D.Index
df$diff<- df$post - df$pre

df1213 <- df[,c('Provider.Code','pre','post','diff')]

# df 11/12
df<-read.csv("C:/docs/proms/Hip Replacement 1112.csv")

df$pre<-df$Q1_EQ5D_INDEX
df$post<- df$Q2_EQ5D_INDEX
df$diff<- df$post - df$pre
names(df)[1]<-'Provider.Code'

df1112 <- df[,c('Provider.Code','pre','post','diff')]

# df 10/11
df<-read.csv("C:/docs/proms/Record Level Hip Replacement 1011.csv")

df$pre<-df$Q1_EQ5D_INDEX
df$post<- df$Q2_EQ5D_INDEX
df$diff<- df$post - df$pre
names(df)[1]<-'Provider.Code'

df1011 <- df[,c('Provider.Code','pre','post','diff')]

#combine

df1415$year<-"2014/15"
df1314$year<-"2013/14"
df1213$year<-"2012/13"
df1112$year<-"2011/12"
df1011$year<-"2010/11"

df<-rbind(df1415,df1314,df1213,df1112,df1011)
write.csv(df,"C:/docs/proms/eq5d.csv")
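
The five yearly blocks above differ only in the file names and the column names, so they could be condensed into a small helper function. Here is a minimal sketch of that alternative, assuming the file paths and column names match those used above; unlike the original, it drops records with a missing pre-op score for every year, and the function name is just illustrative.

# illustrative helper: read one year's file and return the pooled columns
read_proms_year <- function(file, year, pre_col, post_col, provider_col) {
  df <- read.csv(file)
  # provider_col can be a column name or an index (the 2010/11 and 2011/12
  # files keep the provider code in the first column under a different name)
  out <- data.frame(Provider.Code = as.character(df[[provider_col]]),
                    pre = df[[pre_col]],
                    post = df[[post_col]],
                    stringsAsFactors = FALSE)
  out <- out[!is.na(out$pre), ]        # drop records with no pre-op score
  out$diff <- out$post - out$pre
  out$year <- year
  out
}

files <- list(
  list("C:/docs/proms/Record Level Hip Replacement 1415.csv", "2014/15",
       "Pre.Op.Q.EQ5D.Index", "Post.Op.Q.EQ5D.Index", "Provider.Code"),
  list("C:/docs/proms/Record Level Hip Replacement 1314.csv", "2013/14",
       "Pre.Op.Q.EQ5D.Index", "Post.Op.Q.EQ5D.Index", "Provider.Code"),
  list("C:/docs/proms/Record Level Hip Replacement 1213.csv", "2012/13",
       "Pre.Op.Q.EQ5D.Index", "Post.Op.Q.EQ5D.Index", "Provider.Code"),
  list("C:/docs/proms/Hip Replacement 1112.csv", "2011/12",
       "Q1_EQ5D_INDEX", "Q2_EQ5D_INDEX", 1),
  list("C:/docs/proms/Record Level Hip Replacement 1011.csv", "2010/11",
       "Q1_EQ5D_INDEX", "Q2_EQ5D_INDEX", 1)
)

df <- do.call(rbind, lapply(files, function(f) do.call(read_proms_year, f)))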

Now, for the plot. We will need the packages ggplot2, ggExtra, and extrafont (the smoothed median lines drawn by geom_quantile below also require the quantreg package to be installed). The extrafont package is just there to change the plot fonts; it’s not essential, but it is aesthetically pleasing.

require(ggplot2)
require(ggExtra)
require(extrafont)
font_import()             # only needs to be run once and can take a while
loadfonts(device = "win") # register the fonts with the Windows graphics device

p<-ggplot(data=df,aes(x=pre,y=post))+
 stat_bin_hex(bins=15,color="white",alpha=0.8)+   # hexagonal bins of the joint distribution
 geom_abline(intercept=0,slope=1,color="black")+  # diagonal line of no change
 geom_quantile(aes(color=year),method = "rqss", lambda = 2,quantiles=0.5,size=1)+ # smoothed median trend by year
 scale_fill_gradient2(name="Count (000s)",low="light grey",midpoint = 15000,
   mid="blue",high = "red",
   breaks=c(5000,10000,15000,20000),labels=c(5,10,15,20))+
 theme_bw()+
 labs(x="Pre-op EQ-5D index score",y="Post-op EQ-5D index score")+
 scale_color_discrete(name="Year")+
 theme(legend.position = "bottom",text=element_text(family="Gill Sans MT"))

ggMarginal(p, type = "histogram")
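
ggMarginal returns a new plot object rather than modifying p, so if you want to save the finished figure you can print it to a graphics device. A minimal sketch (the file name is just for illustration):

p2 <- ggMarginal(p, type = "histogram")
png("C:/docs/proms/hip_hexbin.png", width = 800, height = 800)
print(p2)   # ggExtra plot objects have a print method that draws the full figure
dev.off()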

Chris Sampson’s journal round-up for 6th February 2017

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

A review of NICE methods and processes across health technology assessment programmes: why the differences and what is the impact? Applied Health Economics and Health Policy [PubMed] Published 27th January 2017

Depending on the type of technology under consideration, NICE adopts a variety of different approaches in coming up with their recommendations. Different approaches might result in different decisions, which could undermine allocative efficiency. This study explores this possibility. Data were extracted from the manuals and websites for 5 programmes, under the themes of ‘remit and scope’, ‘process of assessment’, ‘methods of evaluation’ and ‘appraisal of evidence’. Semi-structured interviews were conducted with 5 people with expertise in each of the 5 programmes. Results are presented in a series of tables – one for each theme – outlining the essential characteristics of the 5 programmes. In their discussion, the authors then go on to consider how the identified differences might impact on efficiency from either a ‘utilitarian’ health-maximisation perspective or NICE’s egalitarian aim of ensuring adequate levels of health care. Not all programmes deliver recommendations with mandatory funding status, and it is only the ones that do that have a formal appeals process. Allowing for local rulings on funding could be good or bad news for efficiency, depending on the capacity of local decision makers to conduct economic evaluations (so that means probably bad news). At the same time, regional variation could undermine NICE’s fairness agenda. The evidence considered by the programmes varies, from a narrow focus on clinical and cost-effectiveness to the incorporation of budget impact and wider ethical and social values. Only some of the programmes have reference cases, and those that do are the ones that use cost-per-QALY analysis, which probably isn’t a coincidence. The fact that some programmes use outcomes other than QALYs obviously has the potential to undermine health-maximisation. Most differences are borne of practicality; there’s no point in insisting on a CUA if there is no evidence at all to support one – the appraisal would simply not happen. The very existence of alternative programmes indicates that NICE is not simply concerned with health-maximisation. Additional weight is given to rare conditions, for example. And NICE want to encourage research and innovation. So it’s no surprise that we need to take into account NICE’s egalitarian view to understand the type of efficiency for which it strives.

Economic evaluations alongside efficient study designs using large observational datasets: the PLEASANT trial case study. PharmacoEconomics [PubMed] Published 21st January 2017

One of the worst things about working on trial-based economic evaluations is going to lots of effort to collect lots of data, then finding that at the end of the day you don’t have much to show for it. Nowadays, the health service routinely collects many data for other purposes. There have been proposals to use these data – instead of prospectively collecting data – to conduct clinical trials. This study explores the potential for doing an economic evaluation alongside such a trial. The study uses CPRD data, including diagnostic, clinical and resource use information, for 8,608 trial participants. The intervention was the sending out of a letter in the hope of reducing unscheduled medical contacts due to asthma exacerbation in children starting a new school year. QALYs couldn’t be estimated using the CPRD data, so values were derived from the literature and estimated on the basis of exacerbations indicated by changes in prescriptions or hospitalisations. Note here the potentially artificial correlation between costs and outcomes that this creates, thus somewhat undermining the benefit of some good old bootstrapping. The results suggest the intervention is cost-saving with little impact on QALYs. Lots of sensitivity analyses are conducted, which are interesting in themselves and say something about the concerns around some of the structural assumptions. The authors outline the pros and cons of the approach. It’s an important discussion as it seems that studies like this are going to become increasingly common. Regarding data collection, there’s little doubt that this approach is more efficient, and it should be particularly valuable in the evaluation of public health and service delivery type interventions. The problem is that the study is not able to use individual-level cost and outcome data from the same people, which is what sets a trial-based economic evaluation apart from a model-based study. So for me, this isn’t really a trial-based economic evaluation. Indeed, the analysis incorporates a Markov-type model of exacerbations. It’s a different kind of beast, which incorporates aspects of modelling and aspects of trial-based analysis, along with some unique challenges of its own. There’s a lot more methodological work that needs to be done in this area, but this study demonstrates that it could be fruitful.

“Too much medicine”: insights and explanations from economic theory and research. Social Science & Medicine [PubMed] Published 18th January 2017

Overconsumption of health care represents an inefficient use of resources, and so we wouldn’t recommend it. But is that all we – as economists – have to say on the matter? This study sought to dig a little deeper. A literature search was conducted to establish a working definition of overconsumption. Related notions such as overdiagnosis, overtreatment, overuse, low-value care, overmedicalisation and even ‘pharmaceuticalisation’ all crop up. The authors introduce ‘need’ as a basis for understanding overconsumption; it represents health care that should never be considered as “needed”. A useful distinction is identified between misconsumption – where an individual’s own consumption is detrimental to their own well-being – and overconsumption, which can be understood as having a negative effect on social welfare. Note that in a collectively funded system the two concepts aren’t entirely distinguishable. Misconsumption becomes the focus of the paper, as avoiding harm to patients has been the subject of the “too much medicine” movement. I think this is a shame, and not really consistent with an economist’s usual perspective. The authors go on to discuss issues such as moral hazard, supplier-induced demand, provider payment mechanisms, ‘indication creep’, regret theory, and physicians’ positional consumption, and whether or not such phenomena might lead to individual welfare losses and thus be considered causes of misconsumption. The authors provide a neat diagram showing the various causes of misconsumption on a plane. One dimension represents the extent to which the cause is imperfect knowledge or imperfect agency, and the other the degree to which the cause is at the individual or market level. There’s a big gap in the top right, where market level causes meet imperfect knowledge. This area could have included patent systems, research fraud and dodgy Pharma practices. Or maybe just a portrait of Ben Goldacre for shorthand. There are some warnings about the (limited) extent to which market reforms might address misconsumption, and the proposed remedy for overconsumption is not really an economic one. Rather, a change in culture is prescribed. More research looking at existing treatments rather than technology adoption, and to investigate subgroup effects, is also recommended. The authors further suggest collaboration between health economists and ecological economists.


Variations in NHS admissions at a glance

Variations in admissions to NHS hospitals are the source of a great deal of consternation. Over the long run, admissions and the volume of activity required of the NHS have increased without equivalent increases in funding or productivity. Over the course of the year, there are repeated claims of crises as hospitals are ill-equipped for the increase in demand in winter. And different patterns of admissions at weekends relative to weekdays may be the foundation of the ‘weekend effect’, as we recently demonstrated. Yet all these different sources of variation produce a single time series of daily admission numbers, and each source of variation matters for different planning and research aims. So let’s decompose the daily number of admissions into its various components.

Data

Daily number of emergency admissions to NHS hospitals between April 2007 and March 2015 from Hospital Episode Statistics.

Methods

A similar analysis was first conducted on variations in the number of births by day of the year. A full description of the model can be found in Chapter 21 of the textbook Bayesian Data Analysis (indeed, the model is shown on the front cover!). The model is a sum of Gaussian processes, each one capturing a different aspect of the data, such as the long-run trend or the weekly periodic variation. We have previously used Gaussian processes in a geostatistical model on this blog. Gaussian processes are a flexible class of models for which any finite-dimensional marginal distribution is Gaussian. Different covariance functions can be specified for the different components, such as the aforementioned periodic variation or long-run trend. The model was run using the software GPstuff in Octave (basically an open-source version of Matlab), and we modified code from the GPstuff website.
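
To give a flavour of what a ‘sum of Gaussian processes’ means in practice, here is a small R sketch (not the GPstuff model used for the analysis) that builds a covariance matrix as the sum of a smooth long-run kernel and a weekly periodic kernel, and draws one realisation from the resulting prior. All hyperparameter values are arbitrary and purely illustrative.

# squared-exponential kernel for a smooth long-run trend
k_trend <- function(t1, t2, sigma = 1, ell = 365) {
  sigma^2 * exp(-outer(t1, t2, "-")^2 / (2 * ell^2))
}

# periodic kernel for a weekly pattern
k_weekly <- function(t1, t2, sigma = 0.5, period = 7, ell = 1) {
  sigma^2 * exp(-2 * sin(pi * outer(t1, t2, "-") / period)^2 / ell^2)
}

t <- 1:730                             # two years of daily observations
K <- k_trend(t, t) + k_weekly(t, t)    # covariance of the summed process
K <- K + diag(1e-6, length(t))         # jitter for numerical stability

set.seed(1)
y <- drop(t(chol(K)) %*% rnorm(length(t)))   # one draw from the prior
plot(t, y, type = "l", xlab = "Day", ylab = "Simulated series (arbitrary units)")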

Results

[Figure: decomposition of daily emergency admissions into long-run trend, day-of-week, time-of-year, and day-of-year components]

The four panels of the figure reveal to us things we may claim to already know. Emergency admissions have been increasing over time and were about 15% higher in 2015 than in 2007 (top panel). The second panel shows us the day of the week effects: there are about 20% fewer admissions on a Saturday or Sunday than on a weekday. The third panel shows a decrease in summer and increase in winter as we often see reported, although perhaps not quite as large as we might have expected. And finally the bottom panel shows the effects of different days of the year. We should note that the large dip at the end of March/beginning of April is an artifact of coding at the end of the financial year in HES and not an actual drop in admissions. But, we do see expected drops for public holidays such as Christmas and the August bank holiday.

While none of this is unexpected, it does show that there’s a lot going on underneath the aggregate data. Perhaps the most alarming aspect is the long-run increase in emergency admissions when we compare it to the (lack of) change in funding or productivity. It suggests that hospitals will often be running at capacity, so that additional variation, such as the winter increase, may push demand beyond what they can cope with. We might also speculate on other possible ‘weekend effects’, such as admission on a bank holiday.

As a final thought, Gaussian processes are an excellent way of modelling data with an unknown structure without imposing assumptions, such as linearity, that might be too strong. Hence their use in geostatistics; they are also widely used in machine learning and artificial intelligence. We often encounter data with unknown and potentially complicated structures in health care and public health research, so hopefully this will serve as a good advert for some new methods. See this book, or the one referenced in the methods section, for an in-depth look.
