Thesis Thursday: Miqdad Asaria

On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Miqdad Asaria, who graduated with a PhD from the University of York. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.

Title
The economics of health inequality in the English National Health Service
Supervisors
Richard Cookson, Tim Doran
Repository link
http://etheses.whiterose.ac.uk/16189

What types of inequality are relevant in the context of the NHS?

For me, the inequalities that really matter are the inequalities in health outcomes; in the English context it is particularly the socioeconomic patterning of these inequalities that is of concern. The focus of health policy in England over the last 200 years has been on improving the average health of the population as well as on providing financial risk protection against catastrophic health expenditure. Whilst great strides have been made in improving average population health through various pioneering interventions, including the establishment of the NHS, health inequality has in fact consistently widened over this period. Recent research suggests that, in terms of quality-adjusted life expectancy, the gap between people living in the most deprived fifth of neighbourhoods in the country and those living in the most affluent fifth is now approximately 11 quality-adjusted life years.

However, these socioeconomic inequalities in health typically accumulate across the life course, and there is a limited amount that health care on its own can do to prevent these gaps from widening or indeed to close them once they emerge. This is why health systems, including the NHS, typically focus on measuring and tackling the inequalities that they can influence, even though eliminating such inequalities can have at best only modest impacts on reducing health inequality overall. These comprise inequalities in access to and quality of health care, as well as inequality in those health outcomes specifically amenable to health care.

What were the key methods and data that you used to identify levels of health inequality?

I am currently working on a project with the Ministry of Health and Family Welfare in India and it is really making me appreciate the amazingly detailed and comprehensive administrative datasets available to researchers in England. For the work underpinning my thesis I linked 10 years of data looking at every hospital admission and outpatient visit in the country with the quality and outcomes achieved for patients registered at each primary care practice, the number of doctors working at each primary care practice, general population census data, cause-specific mortality data, hospital cost data and deprivation data all at neighbourhood level. I spent a lot of time assembling, cleaning and linking these data sets and then used this data platform to build a range of health inequality indicators – some of which can be seen in an interactive tool I built to present the data to clinical commissioning groups.

As well as measuring inequality retrospectively in order to provide evidence to evaluate past NHS policies, and building tools to enable the NHS to monitor inequality going forward, another key focus of my thesis was to develop methods to model and incorporate health inequality impacts into cost-effectiveness analysis. These methods allow analysts to evaluate proposed health interventions in terms of their impact on the distribution of health rather than just their impact on the mythical average citizen. The distributional cost-effectiveness analysis framework I developed is based on the idea of using social welfare functions to evaluate the estimated health distributions arising from the rollout of different health care interventions and compute the equity-efficiency trade-offs that would need to be made in order to prefer one intervention over another. A key parameter required to make these equity-efficiency trade-offs is the level of health inequality aversion. This parameter was quite tricky to estimate, as the methods used to elicit it from the general public are prone to various framing effects. The preliminary estimates that I used in my analysis for this parameter suggested that, at the margin, the general public thought people living in the most deprived fifth of neighbourhoods in the country deserve approximately 7 times the priority in terms of health care spending as those who live in the most affluent fifth of neighbourhoods.
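
To make the idea of an equity-efficiency trade-off concrete, here is a minimal R sketch. It assumes an Atkinson-type social welfare function, which is one common choice and not necessarily the exact specification used in the thesis. The function name `atkinson_ede`, the inequality aversion values and all health figures are hypothetical and chosen purely for illustration.

```r
# Sketch: ranking two hypothetical interventions with an Atkinson-type
# social welfare function. All numbers are invented for illustration.

# Equally distributed equivalent (EDE) health for inequality aversion epsilon:
# the level of health which, if everyone had it, would give the same social
# welfare as the actual (unequal) distribution.
atkinson_ede <- function(health, epsilon) {
  stopifnot(all(health > 0), epsilon >= 0)
  if (epsilon == 1) {
    exp(mean(log(health)))                      # limiting case: geometric mean
  } else {
    mean(health^(1 - epsilon))^(1 / (1 - epsilon))
  }
}

# Hypothetical quality-adjusted life expectancy by deprivation fifth
# (most deprived first) after rolling out each intervention
health_a <- c(63.0, 67.0, 70.0, 72.0, 75.0)  # higher mean, more unequal
health_b <- c(64.5, 67.5, 70.0, 71.5, 73.0)  # lower mean, less unequal

for (eps in c(0, 1, 10)) {
  cat(sprintf("epsilon = %4.1f  EDE(A) = %.2f  EDE(B) = %.2f\n",
              eps, atkinson_ede(health_a, eps), atkinson_ede(health_b, eps)))
}
```

With no inequality aversion (epsilon = 0) the EDE is simply the mean, so intervention A ranks first. At a sufficiently high level of inequality aversion the ranking flips in favour of B, which is the kind of trade-off the inequality aversion parameter is meant to settle.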

Does your PhD work enable us to attach a ‘cost’ to inequality, and ‘value’ to policies that reduce it?

As budding economists, we are ever cautious to distinguish association and causation. My thesis starts by estimating the cost associated with inequality to the NHS. That is, the additional cost to the NHS of treating the excess morbidity in those living in relatively deprived neighbourhoods. I estimated the difference between the actual NHS hospital budget and what the cost would have been if everybody in the country had the morbidity profile of those who live in just the most affluent fifth of neighbourhoods. For inpatient hospital costs this difference came to £4.8 billion per year, and widening this to all NHS costs it came to £12.5 billion per year, approximately a fifth of the total NHS budget. I looked both cross-sectionally and also modelled lifetime estimated health care use, and found that even over their entire lifetimes people living in more deprived neighbourhoods consumed more health care despite their substantially shorter life expectancies.
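
The counterfactual calculation described here can be sketched in a few lines of R. This is a deliberately crude illustration: the populations and per-head costs below are hypothetical placeholders, and a real analysis would apply the most affluent fifth's cost profile within age and sex groups rather than using one overall average.

```r
# Sketch: the cost associated with inequality as actual spend minus a
# counterfactual in which everyone has the most affluent fifth's cost per head.
# All inputs are hypothetical; they are not the figures from the thesis.

# Deprivation fifths, most deprived first
population    <- rep(10.6e6, 5)                  # people per fifth (assumed equal)
cost_per_head <- c(1100, 1000, 920, 860, 800)    # GBP per person per year (assumed)

actual_cost         <- sum(population * cost_per_head)
counterfactual_cost <- sum(population) * cost_per_head[5]
excess_cost         <- actual_cost - counterfactual_cost

cat(sprintf("Actual cost:         GBP %.1f billion\n", actual_cost / 1e9))
cat(sprintf("Counterfactual cost: GBP %.1f billion\n", counterfactual_cost / 1e9))
cat(sprintf("Excess cost:         GBP %.1f billion (%.0f%% of actual)\n",
            excess_cost / 1e9, 100 * excess_cost / actual_cost))
```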

This cost is of course very different to the value of policies to reduce inequality. This difference arises for two main reasons. First, my estimates were not causal but rather associations, so we are unable to conclude that reducing socioeconomic inequality would actually result in everybody in the country gaining the morbidity profile of those living in the most affluent fifth of neighbourhoods. Second, and perhaps more significantly, my estimates do not value any of the health benefits that would result from reducing health inequality; they just count the costs that could be saved by the NHS due to the excess morbidity avoided. The value of these forgone health benefits, measured in quality-adjusted life years, would have to be converted into monetary terms using an estimate of willingness to pay for health and added to these cost savings (which themselves would need to be converted to consumption values) to get a total value of reducing inequality from a health perspective. There would also, of course, be a range of non-health impacts of reducing inequality that would need to be accounted for if this exercise were to be comprehensively conducted.

In simple terms, if the causal link between socioeconomic inequality and health could be determined then the value to the health sector of policies that could substantially reduce this inequality would likely be far greater than the costs quoted here.

How did you find the PhD-by-publication route? Would you recommend it?

I came to academia relatively late, having previously worked in both the government and the private sector for a number of years. The PhD-by-publication route suited me well as it allowed me to get stuck into a number of projects, work with a wide range of academics and build an academic career whilst simultaneously curating a set of papers to submit as a thesis. However, it is certainly not the fastest way to achieve PhD status: my thesis took 6 years to compile. The publication route is also still relatively uncommon in England and I found both my supervisors and examiners somewhat perplexed about how to approach it. Additionally, my wife, who did her PhD by the traditional route, assures me that it is not a ‘proper’ PhD!

For those fresh out of an MSc programme the traditional route probably works well, giving you the opportunity to develop research skills and focus on one area in depth with lots of guidance from a dedicated supervisor. However, for people like me who probably would never have got around to doing a traditional PhD, it is nice that there is an alternative way to acquire the ‘Dr’ title which I am finding confers many unanticipated benefits.

What advice would you give to a researcher looking to study health inequality?

The most important thing that I have learnt from my research is that health inequality, particularly in England, has very little to do with health care and everything to do with socioeconomic inequality. I would encourage researchers interested in this area to look at broader interventions tackling the social determinants of health. There is lots of exciting work going on at the moment around basic income and social housing as well as around the intersection between the environment and health which I would love to get stuck into given the chance.

Visualising PROMs data

Patient reported outcome measures, or PROMs, form a large database with before and after health-related quality of life (HRQoL) measures for a large number of patients undergoing four key procedures: hip replacement, knee replacement, varicose vein surgery and surgery for groin hernia. The outcome measures are the EQ-5D index and visual analogue scale (and a disease-specific measure for three of the interventions). The data also identify the provider of each operation. Being publicly available, these data allow us to look at a range of different questions: what’s the average effect of the surgery on HRQoL? What are the differences between providers in gains to HRQoL or in patient casemix? Great!

The first thing we should always do with new data is to look at it. This might be in an exploratory way to determine the questions to ask of the data or in an analytical way to get an idea of the relationships between variables. Plotting the data communicates more about what’s going on than any table of statistics alone. However, the plots on the NHS Digital website might be accused of being a little uninspired as they collapse a lot of the variation into simple charts that conceal a lot of what’s going on. For example:

So let’s consider other ways of visualising these data. A walk-through of the code for all these plots is at the end of this post.

Now, I’m not a regular user of PROMs data, so what I think are the interesting features of the data may not reflect what the data are generally used for. For this post, I think the interesting features are:

  • The joint distribution of pre- and post-op scores
  • The marginal distributions of pre- and post-op scores
  • The relationship between pre- and post-op scores over time

We will pool six years’ worth of PROMs data. This gives us over 200,000 observations. A scatter plot with this information is useless as the density of the points will be very high. A useful alternative is hexagonal binning, which is like a two-dimensional histogram. Hexagonal tiles, which usefully tessellate and are more interesting to look at than squares, can be shaded or coloured with respect to the number of observations in each bin across the support of the joint distribution of pre- and post-op scores (which is [-0.5,1]x[-0.5,1]). We can add the marginal distributions to the axes and then add smoothed trend lines for each year. Since the data are constrained between -0.5 and 1, the mean may not be a very good summary statistic, so we’ll plot a smoothed median trend line for each year. Finally, we’ll add a line on the diagonal. Patients above this line have improved and patients below it have deteriorated.
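
Before plotting, the two summaries that the diagonal and the median trend line convey can be checked numerically. The base R sketch below uses simulated scores, not PROMs data, and a crude binned median in place of a smoothed one; the variable names and simulation parameters are assumptions made for illustration.

```r
# Sketch on simulated data: share of patients either side of the diagonal,
# and an unsmoothed median of post-op score within bands of pre-op score.
set.seed(1)
n <- 5000

# Simulated scores clipped to the support [-0.5, 1] (not real PROMs values)
pre  <- pmin(pmax(rnorm(n, mean = 0.35, sd = 0.30), -0.5), 1)
post <- pmin(pmax(pre + rnorm(n, mean = 0.40, sd = 0.30), -0.5), 1)

# Position relative to the 45-degree line: above it means post > pre
position <- factor(sign(post - pre), levels = c(-1, 0, 1),
                   labels = c("deteriorated", "unchanged", "improved"))
print(round(prop.table(table(position)), 3))

# Median post-op score in bands of pre-op score of width 0.25
bands <- cut(pre, breaks = seq(-0.5, 1, by = 0.25), include.lowest = TRUE)
binned_median <- tapply(post, bands, median)
print(round(binned_median, 2))
```

The binned medians are a rough stand-in for the smoothed median line: they are easier to compute but jump at the band edges, which is one reason to prefer a smoothed quantile fit in the plot itself.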

Hip replacement results

There’s a lot going on in the graph, but I think it reveals a number of key points about the data that we wouldn’t have seen from the standard plots on the website:

  • There appear to be four clusters of patients:
    • Those who were in close to full health prior to the operation and were in ‘perfect’ health (score = 1) after;
    • Those who were in close to full health pre-op and who didn’t really improve post-op;
    • Those who were in poor health (score close to zero) and made a full recovery;
    • Those who were in poor health and who made a partial recovery.
  • The median change is an improvement in health.
  • The median change improves modestly from year to year for a given pre-op score.
  • There are ceiling effects for the EQ-5D.

None of this is news to those who study these data. But this way of presenting the data certainly tells more of a story than the current plots on the website.

R code

We’re going to consider hip replacement, but the code is easily modified for the other outcomes. Firstly we will take the pre- and post-op score and their difference and pool them into one data frame.

# df 14/15
df<-read.csv("C:/docs/proms/Record Level Hip Replacement 1415.csv")

df<-df[!is.na(df$Pre.Op.Q.EQ5D.Index),]
df$pre<-df$Pre.Op.Q.EQ5D.Index
df$post<- df$Post.Op.Q.EQ5D.Index
df$diff<- df$post - df$pre

df1415 <- df[,c('Provider.Code','pre','post','diff')]

# df 13/14
df<-read.csv("C:/docs/proms/Record Level Hip Replacement 1314.csv")

df<-df[!is.na(df$Pre.Op.Q.EQ5D.Index),]
df$pre<-df$Pre.Op.Q.EQ5D.Index
df$post<- df$Post.Op.Q.EQ5D.Index
df$diff<- df$post - df$pre

df1314 <- df[,c('Provider.Code','pre','post','diff')]

# df 12/13
df<-read.csv("C:/docs/proms/Record Level Hip Replacement 1213.csv")

df<-df[!is.na(df$Pre.Op.Q.EQ5D.Index),]
df$pre<-df$Pre.Op.Q.EQ5D.Index
df$post<- df$Post.Op.Q.EQ5D.Index
df$diff<- df$post - df$pre

df1213 <- df[,c('Provider.Code','pre','post','diff')]

# df 11/12
df<-read.csv("C:/docs/proms/Hip Replacement 1112.csv")

df$pre<-df$Q1_EQ5D_INDEX
df$post<- df$Q2_EQ5D_INDEX
df$diff<- df$post - df$pre
names(df)[1]<-'Provider.Code'

df1112 <- df[,c('Provider.Code','pre','post','diff')]

# df 10/11
df<-read.csv("C:/docs/proms/Record Level Hip Replacement 1011.csv")

df$pre<-df$Q1_EQ5D_INDEX
df$post<- df$Q2_EQ5D_INDEX
df$diff<- df$post - df$pre
names(df)[1]<-'Provider.Code'

df1011 <- df[,c('Provider.Code','pre','post','diff')]

#combine

df1415$year<-"2014/15"
df1314$year<-"2013/14"
df1213$year<-"2012/13"
df1112$year<-"2011/12"
df1011$year<-"2010/11"

df<-rbind(df1415,df1314,df1213,df1112,df1011)
write.csv(df,"C:/docs/proms/eq5d.csv")
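
The five near-identical blocks above could be folded into one helper. The sketch below is one possible refactoring, not the code used for the plot: the function name `standardise_proms` and the toy data frames (including the `PROVIDER` column name) are hypothetical, and unlike the original it drops rows with a missing pre-op score for every year, not just the later ones.

```r
# Sketch: one helper that copes with both column naming conventions
standardise_proms <- function(df, year) {
  # Later extracts use Pre.Op.Q.EQ5D.Index, earlier ones Q1_EQ5D_INDEX
  if ("Pre.Op.Q.EQ5D.Index" %in% names(df)) {
    pre  <- df$Pre.Op.Q.EQ5D.Index
    post <- df$Post.Op.Q.EQ5D.Index
  } else {
    pre  <- df$Q1_EQ5D_INDEX
    post <- df$Q2_EQ5D_INDEX
  }
  # Earlier extracts are assumed to hold the provider code in the first column
  provider <- if ("Provider.Code" %in% names(df)) df$Provider.Code else df[[1]]
  out <- data.frame(Provider.Code = provider, pre = pre, post = post,
                    diff = post - pre, year = year, stringsAsFactors = FALSE)
  out[!is.na(out$pre), ]
}

# Toy inputs standing in for two of the yearly CSV files
new_style <- data.frame(Provider.Code = c("A", "B"),
                        Pre.Op.Q.EQ5D.Index = c(0.5, NA),
                        Post.Op.Q.EQ5D.Index = c(0.8, 0.9))
old_style <- data.frame(PROVIDER = "C", Q1_EQ5D_INDEX = 0.2, Q2_EQ5D_INDEX = 0.7)

pooled <- rbind(standardise_proms(new_style, "2014/15"),
                standardise_proms(old_style, "2010/11"))
print(pooled)
```

In practice each toy data frame would be replaced by a `read.csv()` call on the corresponding yearly file.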

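Now, for the plot. We will need the packages ggplot2, ggExtra, and extrafont. The hexagonal binning and the quantile smoother also rely on the hexbin and quantreg packages being installed. The extrafont package is just to change the plot fonts; it is not essential, but aesthetically pleasing.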

library(ggplot2)
library(ggExtra)
library(extrafont)
font_import()               # slow; only needs to be run once per machine
loadfonts(device = "win")   # registers the fonts for the Windows device

p<-ggplot(data=df,aes(x=pre,y=post))+
 stat_bin_hex(bins=15,color="white",alpha=0.8)+
 geom_abline(intercept=0,slope=1,color="black")+
 geom_quantile(aes(color=year),method = "rqss", lambda = 2,quantiles=0.5,size=1)+
 scale_fill_gradient2(name="Count (000s)",low="light grey",midpoint = 15000,
   mid="blue",high = "red",
   breaks=c(5000,10000,15000,20000),labels=c(5,10,15,20))+
 theme_bw()+
 labs(x="Pre-op EQ-5D index score",y="Post-op EQ-5D index score")+
 scale_color_discrete(name="Year")+
 theme(legend.position = "bottom",text=element_text(family="Gill Sans MT"))

ggMarginal(p, type = "histogram")

Data sharing and the cost of error

The world’s highest impact factor medical journal, the New England Journal of Medicine (NEJM), seems to have been doing some soul searching. After publishing an editorial early in 2016 insinuating that researchers requesting data from trials for re-analysis were “research parasites”, they have released a series of articles on the topic of data sharing. Four articles were published in August: two in favour and two less so. This month another three articles are published on the same topic. And, the journal is sponsoring a challenge to re-analyse data from a previous trial. We reported earlier in the year about a series of concerns at the NEJM and these new steps are all welcome to address those challenges. However, while the articles consider questions of fairness about sharing data from large, long, and difficult trials, little has been said about the potential costs to society of unremedied errors in data analysis. The costs of not sharing data can be large, as the long-running saga over the controversial PACE trial illustrates.

The PACE trial was a randomised, controlled trial to assess the benefits of a number of treatments for chronic fatigue syndrome including graded exercise therapy and cognitive behavioural therapy. However, after publication of the trial results in 2011, a number of concerns were raised about the conduct of the trial, its analysis, and reporting. This included a change in the definitions of ‘improvement’ and ‘recovery’ mid-way through the trial. Other researchers sought access to the data from the trial for re-analysis, but such requests were rebuffed with what a judge later described as ‘wild speculations’. The data were finally released and recently re-analysed. The new analysis revealed what many suspected – that the interventions in the trial had little benefit. Nevertheless, the recommended treatments for chronic fatigue syndrome had changed as a result of the trial. (STAT has the whole story here).

A cost-effectiveness analysis was published alongside the PACE trial. The results showed that cognitive behavioural therapy (CBT) was cost-effective compared to standard care, as was graded exercise therapy (GET). Quality of life was measured in the trial using the EQ-5D, and costs were also recorded, making calculation of incremental cost-effectiveness ratios straightforward. Costs were higher for all the intervention groups. The table reporting QALY outcomes is reproduced below:

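[Table 5 of the cost-effectiveness paper (journal.pone.0040808): QALY outcomes by treatment group; image not reproduced here]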

At face value the analysis seems reasonable. But, in light of the problems with the trial, including that none of the objective measures of patient health, such as walking tests and step tests, nor labour market outcomes, showed much sign of improvement or recovery, these data seem less convincing. In particular, their statistically significant difference in QALYs – “After controlling for baseline utility, the difference between CBT and SMC was 0.05 (95% CI 0.01 to 0.09)” – may well just be a type I error. A re-analysis of these data is warranted (although gaining access may yet still be hard).

If there actually was no real benefit from the new treatments, then benefits have been lost from elsewhere in the healthcare system. If we assume the NHS achieves £20,000/QALY (contentious I know!) then the health service loses 0.05 QALYs for each patient with chronic fatigue syndrome put on the new treatment. The prevalence of chronic fatigue syndrome may be as high as 0.2% among adults in England, which represents approximately 76,000 people. If all of these were switched to new, ineffective treatments, the opportunity cost could potentially be as much as 3,800 QALYs.
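
The arithmetic behind that figure is easy to reproduce. The R sketch below uses only the numbers quoted above; the final line, which converts the forgone QALYs into a spending equivalent at the assumed £20,000/QALY, is an illustrative extension rather than a figure from the trial or its economic evaluation.

```r
# Worked arithmetic for the opportunity cost, using the figures in the text
patients          <- 76000   # approximate number of adults with CFS in England
qaly_gain_claimed <- 0.05    # reported CBT vs SMC difference in QALYs
nhs_threshold     <- 20000   # GBP per QALY, the contentious assumption above

# QALYs lost if the claimed gain was not real and every patient was switched
qalys_forgone <- patients * qaly_gain_claimed

# Illustrative only: spending that would buy this many QALYs at the threshold
spend_equivalent <- qalys_forgone * nhs_threshold

cat(sprintf("QALYs forgone:    %.0f\n", qalys_forgone))
cat(sprintf("Spend equivalent: GBP %.0f million\n", spend_equivalent / 1e6))
```

This is an upper bound in the sense used in the text: it assumes every prevalent case is switched and that none of the claimed benefit is real.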

The key point is that analytical errors have costs if the analyses go on to lead to changes in recommended treatments. And when summed over a national health service these costs could become quite substantial. Researchers may worry about publication prestige or fairness in using other people’s hard-won data, but the bigger issue is the wider costs of letting an error go unchallenged.
