# Chris Sampson’s journal round-up for 23rd July 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Quantifying life: understanding the history of quality-adjusted life-years (QALYs). Social Science & Medicine [PubMed] Published 3rd July 2018

We’ve had some fun talking about the history of the QALY here on this blog. The story of how the QALY came to be important in health policy has been obscured. This paper seeks to address that. The research adopts a method called ‘multiple streams analysis’ (MSA) in order to explain how QALYs caught on. The MSA framework identifies three streams – policy, politics, and problems – and considers the ‘policy entrepreneurs’ involved. For this study, archival material was collected from the National Archives, Department of Health files, and the University of York. The researchers also conducted 44 semi-structured interviews with academics and civil servants.

The problem stream highlights shocks to the UK economy in the late 1960s, coupled with growth in health care costs due to innovations and changing expectations. Cost-effectiveness began to be studied and, increasingly, policymaking was meant to be research-based and accountable. By the 80s, the likes of Williams and Maynard were drawing attention to apparent inequities and inefficiencies in the health service. The policy stream gets going in the 40s and 50s when health researchers started measuring quality of life. By the early 60s, the idea of standardising these measures to try to rank health states was on the table. Through the late 60s and early 70s, government economists proliferated and proved themselves useful in health policy. The meeting of Rachel Rosser and Alan Williams in the mid-70s led to the creation of QALYs as we know them, combining quantity and quality of life on a 0-1 scale. Having acknowledged inefficiencies and inequities in the health service, UK politicians and medics were open to new ideas, but remained unconvinced by the QALY. Yet it was a willingness to consider the need for rationing that put the wheels in motion for NICE, and the politics stream – like the problem and policy streams – characterises favourable conditions for the use of the QALY.

The MSA framework also considers ‘policy entrepreneurs’ who broker the transition from idea to implementation. The authors focus on the role of Alan Williams and of the Economic Advisers’ Office. Williams was key in translating economic ideas into forms that policymakers could understand. Meanwhile, the Economic Advisers’ Office encouraged government economists to engage with academics at HESG and later the QoL Measurement Group (which led to the creation of EuroQol).

The main takeaway from the paper is that good ideas only prevail in the right conditions and with the right people. It’s important to maintain multi-disciplinary and multi-stakeholder networks. In the case of the QALY, the two-way movement of economists between government and academia was crucial.

I don’t completely understand or appreciate the MSA framework, but this paper is an enjoyable read. My only reservation is with the way the authors describe the QALY as being a dominant aspect of health policy in the UK. I don’t think that’s right. It’s dominant within a niche of a niche of a niche – that is, health technology assessment for new pharmaceuticals. An alternative view is that the QALY has in fact languished in a quiet corner of British policymaking, and been completely excluded in some other countries.

Accuracy of patient recall for self‐reported doctor visits: is shorter recall better? Health Economics [PubMed] Published 2nd July 2018

In designing studies, such as clinical trials, I have always recommended that self-reported resource use be collected no less frequently than every 3 months. This is partly based on something I once read somewhere that I can’t remember, but partly also on some logic that the accuracy of people’s recall decays over time. This paper has come to tell me how wrong I’ve been.

The authors start by highlighting that recall can be subject to omission, whereby respondents forget relevant information, or commission, whereby respondents include events that did not occur. A key manifestation of the latter is ‘telescoping’, whereby events are included from outside the recall period. We might expect commission to be more likely in short recalls and omission to be more common for long recalls. But there’s very little research on this regarding health service use.

This study uses data from a large trial in diabetes care in Australia, in which 5,305 participants were randomised to receive either 2-week, 3-month, or 12-month recall for how many times they had seen a doctor. Then, the trial data were matched with Medicare data to identify the true levels of resource use.

Over 92% of 12-month recall participants made an error, 76% of the 3-month recall, and 46% of the 2-week recall. The patterns of errors were different. There was very little under-reporting in the 2-week recall sample, with 3-month giving the most over-reporting and 12-month giving the most under-reporting. 12-month recall was associated with the largest number of days reported in error. However, when the authors account for the longer period being considered, and estimate a relative error, the impact of misreporting is smallest for the 12-month recall and greatest for the 2-week recall. This translates into a smaller overall bias for the longest recall period. The authors also find that older, less educated, unemployed, and low‐income patients exhibit higher measurement errors.
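To make the absolute-versus-relative distinction concrete, here is a minimal sketch of one simple way to put errors from different recall windows on a common 12-month footing. All figures are hypothetical and are not the paper’s estimates. The function name `annualised_error` is an assumption for illustration, and the scaling assumes errors in successive windows simply add up, which may not match the authors’ own definition of relative error.

```python
# Hypothetical figures for illustration only; not the paper's estimates.
DAYS_PER_YEAR = 365


def annualised_error(error_in_visits: float, recall_days: int) -> float:
    """Scale a reporting error from one recall window up to a 12-month total."""
    return error_in_visits * DAYS_PER_YEAR / recall_days


# (assumed mean absolute error in visits within the window, window length in days)
recall_designs = {
    "2-week": (0.3, 14),
    "3-month": (1.0, 91),
    "12-month": (2.5, 365),
}

annualised = {
    name: annualised_error(err, days)
    for name, (err, days) in recall_designs.items()
}

for name, value in annualised.items():
    print(f"{name}: {value:.1f} visits per year in error")
```

With these made-up inputs, the 12-month window has the largest error within its own window but the smallest once everything is expressed per year, which is the pattern described above.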

Health surveys and comparative studies that estimate resource use over a long period of time should use 12-month recall unless they can find a reason to do otherwise. The authors provide some examples from economic evaluations to demonstrate how selecting shorter recall periods could result in recommending the wrong decisions. It’s worth trying to understand the reasons why people can more accurately recall service use over 12 months. That way, data collection methods could be designed to optimise recall accuracy.

Part of the reason the QALY faces opposition is that it has been used in a way that might not reflect societal preferences for resource allocation. In particular, the idea that ‘a QALY is a QALY is a QALY’ may conflict with notions of desert, severity, or process. We’re starting to see more evidence for groups of people holding different views, which makes it difficult to come up with decision rules to maximise welfare. This study considers some of the perspectives that people adopt, which have been identified in previous research – ‘equal right to healthcare’, ‘limits to healthcare’, and ‘effective and efficient healthcare’ – and looks at how they are distributed in the Netherlands. Using four willingness to trade-off (WTT) exercises, the authors explore the relationship between these views and people’s preferences about resource allocation. Trade-offs are between quality vs quantity of life, health maximisation vs equality, children vs the elderly, and lifestyle-related risk vs adversity. The authors sought to test several hypotheses: i) that ‘equal right’ respondents have a lower WTT; ii) ‘limits to healthcare’ people express a preference for health gains, health maximisation, and treating people with adversity; and iii) ‘effective and efficient’ people support health maximisation, treating children, and treating people with adversity.

A representative online sample of adults in the Netherlands (n=261) was recruited. The first part of the questionnaire collected socio-demographic information. The second part asked questions necessary to allocate people to one of the three perspectives using Likert scales based on a previous study. The third part of the questionnaire consisted of the four reimbursement scenarios. Participants were asked to identify the point (in terms of the relevant quantities) at which they would be indifferent between two options.

The distribution of the viewpoints was 65% ‘equal right’, 23% ‘limits to healthcare’, and 7% ‘effective and efficient’. 6% couldn’t be matched to one of the three viewpoints. In each scenario, people had the option to opt out of trading. 24% of respondents were non-traders for all scenarios and, of these, 78% were of the ‘equal right’ viewpoint. Unfortunately, a lot of people opted out of at least one of the trades, and for a wide variety of reasons. Decisionmakers can’t opt out, so I’m not sure how useful this is.

The authors describe many associations between individual characteristics, viewpoints, and WTT results. Overall, the tested hypotheses were broadly supported. While the findings showed that different groups were more or less willing to trade, the points of indifference for traders within the groups did not vary. So while you can’t please everyone in health care priority setting, this study shows how policies might be designed to satisfy the preferences of people with different perspectives.


# Thesis Thursday: Thomas Hoe

On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Thomas Hoe, who has a PhD from University College London. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.

Title
Essays on the economics of health care provision
Supervisors
Richard Blundell, Orazio Attanasio
http://discovery.ucl.ac.uk/10048627/

What data do you use in your analyses and what are your main analytical methods?

I use data from the English National Health Service (NHS). One of the great features of the NHS is the centralized data it collects, with the Hospital Episode Statistics (HES) containing information on every public hospital visit in England.

In my thesis, I primarily use two empirical approaches. In my work on trauma and orthopaedic departments, I exploit the fact that the number of emergency trauma admissions to hospital each day is random. This randomness allows me to conduct a quasi-experiment to assess how hospitals perform when they are more or less busy.

The second approach I use, in my work on emergency departments with Jonathan Gruber and George Stoye, is based on bunching techniques that originated in the tax literature (Chetty et al, 2013; Kleven and Waseem, 2013; Saez, 2010). These techniques use interpolation to infer how discontinuities in incentive schemes affect outcomes. We apply and extend these techniques to evaluate the impact of the ‘4-hour target’ in English emergency departments.
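For readers unfamiliar with bunching techniques, the following is a minimal sketch of the general idea on synthetic data, not the method used in the thesis. The histogram, the excluded window around the threshold, and the polynomial degree are all assumptions chosen for illustration.

```python
import numpy as np
from numpy.polynomial import Polynomial

THRESHOLD = 240      # the 4-hour target, in minutes
BIN_WIDTH = 10       # assumed histogram bin width, in minutes
WINDOW = (210, 270)  # assumed region distorted by the target

# Synthetic histogram of waiting times: a smooth (quadratic) baseline...
centres = np.arange(BIN_WIDTH / 2, 480, BIN_WIDTH)
baseline = 2000 - 6 * centres + 0.005 * centres**2

# ...with 100 patients per bin moved from just after the target to just before it.
below = (centres >= WINDOW[0]) & (centres < THRESHOLD)
above = (centres >= THRESHOLD) & (centres < WINDOW[1])
observed = baseline + 100 * below - 100 * above

# Fit a polynomial to the bins outside the window and interpolate across it.
outside = ~(below | above)
fit = Polynomial.fit(centres[outside], observed[outside], deg=2)
counterfactual = fit(centres)

# Bunching estimates: observed counts relative to the interpolated counterfactual.
excess_mass = float(np.sum(observed[below] - counterfactual[below]))
missing_mass = float(np.sum(counterfactual[above] - observed[above]))
print(f"excess just before target: {excess_mass:.0f}")
print(f"missing just after target: {missing_mass:.0f}")
```

Because the synthetic baseline is itself quadratic, the interpolation recovers it almost exactly. With real data the choice of window and polynomial degree matters, and results would normally be checked for sensitivity to both.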

How did you characterise and measure quality in your research?

Measuring the quality of health care outcomes is always a challenge in empirical research. Since my research primarily relies on administrative data from HES, I use the patient outcomes that can be directly constructed from this data: in-hospital mortality, and unplanned readmission.

Mortality is, of course, an outcome that is widely used, and offers an unambiguous interpretation. Readmission, on the other hand, is an outcome that has gained more acceptance as a measure of quality in recent years, particularly following the implementation of readmission penalties in the UK and the US.

What is ‘crowding’, and how can it affect the quality of care?

I use the term crowding to refer, in a fairly general sense, to how busy a hospital is. This could mean that the hospital is physically very crowded, with lots of patients in close proximity to one another, or that the number of patients outstrips the available resources.

In practice, I evaluate how crowding affects quality of care by comparing hospital performance and patient outcomes on days when hospitals deal with different levels of admissions (due to random spikes in the number of trauma admissions). I find that hospitals respond not only by cancelling some planned admissions, such as elective hip and knee replacements, but also by discharging existing patients sooner. For these discharged patients, the shorter-than-otherwise stay in hospital is associated with poorer health outcomes, most notably an increase in subsequent hospital visits (unplanned readmissions).

How might incentives faced by hospitals lead to negative consequences?

One of the strongest incentives faced by public hospitals in England is to meet the government-set waiting time target for elective care. This target has been very successful at reducing wait times. In doing so, however, it may have contributed to hospitals shortening patient stays and increasing patient admissions.

My research shows that shorter hospital stays, in turn, can lead to increases in unplanned readmissions. Setting strong wait time targets, then, is in effect trading off shorter waits (from which patients benefit) against crowding effects (which may harm patients).

Your research highlights the importance of time in the hospital production process. How does this play out?

I look at this from three dimensions, each a separate part of a patient’s journey through hospital.

The first two relate to waiting for treatment. For elective patients, this means waiting for an appointment, and previous work has shown that patients attach significant value to reductions in these wait times. I show that trauma and orthopaedic patients would be better off with further wait time reductions, even if that leads to more crowding.

Emergency patients, in contrast, wait for treatment while physically in a hospital emergency department. I show that these waiting times can be very harmful and that by shortening these wait times we can actually save lives.

The third dimension relates to how long a patient spends in hospital recovering from surgery. I show that, at least on the margin of care for trauma and orthopaedic patients, an additional day in hospital has tangible benefits in terms of reducing the likelihood of experiencing an unplanned readmission.

How could your findings be practically employed in the NHS to improve productivity?

I would highlight two areas of my research that speak directly to the policy debate about NHS productivity.

First, while the wait time targets for elective care may have led to some crowding problems and subsequently more readmissions, the net benefit of these targets to trauma and orthopaedic patients is positive. Second, the wait time target for emergency departments also appears to have benefited patients: it saved lives at a reasonably cost-effective rate.

From the perspective of patients, therefore, I would argue these policies have been relatively successful and should be maintained.

# Sam Watson’s journal round-up for 9th July 2018


Evaluating the 2014 sugar-sweetened beverage tax in Chile: an observational study in urban areas. PLoS Medicine [PubMed] Published 3rd July 2018

Sugar taxes are one of the public health policy options currently in vogue. Countries including Mexico, the UK, South Africa, and Sri Lanka all have sugar taxes. The aim of such levies is to reduce demand for the most sugary drinks, or if the tax is absorbed on the supply side, which is rare, to encourage producers to reduce the sugar content of their drinks. One may also view it as a form of Pigouvian taxation to internalise the public health costs associated with obesity. Chile has long had an ad valorem tax on soft drinks fixed at 13%, but in 2014 decided to pursue a sugar tax approach. Drinks with more than 6.25g of sugar per 100ml saw their tax rate rise to 18% and the tax on those below this threshold dropped to 10%. To understand what effect this change had, we would want to know three key things along the causal pathway from tax policy to sugar consumption: did people know about the tax change, did prices change, and did consumption behaviour change? On this last point, we can consider both the overall volume of soft drinks and whether people substituted low sugar for high sugar beverages. Using the Kantar Worldpanel, a household panel survey of purchasing behaviour, this paper examines these questions.
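As a rough benchmark for the price question, the arithmetic below works out what full pass-through of the tax change would imply. It assumes, purely for illustration, that the tax is levied on a pre-tax price $p$ and that $p$ itself does not change; neither assumption comes from the paper.

$$
\frac{1.18\,p}{1.13\,p} - 1 \approx +4.4\%, \qquad \frac{1.10\,p}{1.13\,p} - 1 \approx -2.7\%.
$$

Under those assumptions, shelf prices of high-sugar drinks would rise by about 4.4% and those of low-sugar drinks would fall by about 2.7%.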

Everyone in Chile was affected by the tax so there is no control group. We must rely on time series variation to identify the effect of the tax. Sometimes, looking at plots of the data reveals a clear step-change when an intervention is introduced (e.g. the plot in this post), but not so in this paper. We therefore rely heavily on the results of the model for our inferences, and I have a couple of small gripes with it. First, the model captures household fixed effects, but no consideration is given to dynamic effects. Some households may be more or less likely to buy drinks, but their decisions are also likely to be affected by how much they’ve recently bought. Similarly, the errors may be correlated over time. Ignoring dynamic effects can lead to large biases. Second, the authors choose among different functional form specifications of time using the Akaike Information Criterion (AIC). While AIC and the Bayesian Information Criterion (BIC) are often thought to be interchangeable, they are not; AIC estimates predictive performance on future data, while BIC estimates goodness of fit to the data. Thus, I would think BIC would be more appropriate. Additional results show the estimates are very sensitive to the choice of functional form, varying by an order of magnitude and even in sign. The authors estimate a fairly substantial decrease of around 22% in the volume of high sugar drinks purchased, but find evidence that the price paid changed very little (~1.5%) and there was little change in other drinks. While the analysis is generally careful and well thought out, I am not wholly convinced by the authors’ conclusions that “Our main estimates suggest a significant, sizeable reduction in the volume of high-tax soft drinks purchased.”
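For reference, the two criteria differ only in how they penalise the number of parameters $k$: AIC is $2k - 2\ln L$ and BIC is $k\ln n - 2\ln L$. The sketch below computes both for two time-trend specifications fitted by least squares with Gaussian errors. The series is simulated and the helper name `gaussian_aic_bic` is an assumption for illustration; it is not the paper’s model or data.

```python
import numpy as np


def gaussian_aic_bic(y, X):
    """AIC and BIC for an OLS fit with Gaussian errors (variance counted as a parameter)."""
    n, p = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    # Maximised Gaussian log-likelihood, using the ML variance estimate rss / n.
    log_lik = -0.5 * n * (np.log(2 * np.pi) + np.log(rss / n) + 1)
    k = p + 1  # regression coefficients plus the error variance
    return 2 * k - 2 * log_lik, k * np.log(n) - 2 * log_lik, k


# Simulated series with a linear trend (hypothetical, e.g. 48 monthly observations).
rng = np.random.default_rng(0)
n = 48
t = np.arange(n) / n
y = 10 + 2 * t + rng.normal(scale=1.0, size=n)

designs = {
    "linear": np.column_stack([np.ones(n), t]),
    "cubic": np.column_stack([np.ones(n), t, t**2, t**3]),
}
results = {name: gaussian_aic_bic(y, X) for name, X in designs.items()}
for name, (aic, bic, k) in results.items():
    print(f"{name}: k={k}, AIC={aic:.1f}, BIC={bic:.1f}")
```

For any sample with more than about seven observations, $\ln n > 2$, so BIC penalises each extra trend term more heavily than AIC and will tend to favour simpler functional forms.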

A Bayesian framework for health economic evaluation in studies with missing data. Health Economics [PubMed] Published 3rd July 2018

Missing data is a ubiquitous problem. I’ve never used a data set where no observations were missing and I doubt I’m alone. Despite its pervasiveness, it’s often only afforded an acknowledgement in the discussion or perhaps, in more complete analyses, something like multiple imputation will be used. Indeed, the majority of trials in the top medical journals don’t handle it correctly, if at all. The majority of the methods used for missing data in practice assume the data are ‘missing at random’ (MAR). One interpretation is that this means that, conditional on the observable variables, the probability of data being missing is independent of unobserved factors influencing the outcome. Another interpretation is that the distribution of the potentially missing data does not depend on whether they are actually missing. This interpretation comes from factorising the joint distribution of the outcome $Y$ and an indicator of whether the datum is observed $R$, along with some covariates $X$, into a conditional and marginal model: $f(Y,R|X) = f(Y|R,X)f(R|X)$, a so-called pattern mixture model. This contrasts with the ‘selection model’ approach: $f(Y,R|X) = f(R|Y,X)f(Y|X)$.
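To connect the two interpretations, consider the simplest case, assumed here for illustration: a single outcome $Y$ that is either observed or missing, with covariates $X$ always observed. In that case the two statements of MAR are equivalent wherever $f(R|X) > 0$:

$$
f(R|Y,X) = f(R|X) \iff f(Y|R,X) = f(Y|X).
$$

This follows directly from the factorisations above: if $f(R|Y,X) = f(R|X)$, then $f(Y,R|X) = f(R|X)f(Y|X)$, and dividing by $f(R|X)$ gives $f(Y|R,X) = f(Y|X)$; the reverse direction is symmetric.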

This paper considers a Bayesian approach using the pattern mixture model for missing data for health economic evaluation. Specifically, the authors specify a multivariate normal model for the data with an additional term in the mean if it is missing, i.e. the model of $f(Y|R,X)$. A model is not specified for $f(R|X)$. If it were then you would typically allow for correlation between the errors in this model and the main outcomes model. But, one could view the additional term in the outcomes model as some function of the error from the observation model somewhat akin to a control function. Instead, this article uses expert elicitation methods to generate a prior distribution for the unobserved terms in the outcomes model. While this is certainly a legitimate way forward in my eyes, I do wonder how specification of a full observation model would affect the results. The approach of this article is useful and they show that it works, and I don’t want to detract from that but, given the lack of literature on missing data in this area, I am curious to compare approaches including selection models. You could even add shared parameter models as an alternative, all of which are feasible. Perhaps an idea for a follow-up study. As a final point, the models run in WinBUGS, but regular readers will know I think Stan is the future for estimating Bayesian models, especially in light of the problems with MCMC we’ve discussed previously. So equivalent Stan code would have been a bonus.
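The role of the additional mean term can be illustrated with a deliberately simplified sketch. It is not the authors’ WinBUGS model: it treats the observed mean as fixed (ignoring sampling uncertainty), uses made-up QALY values, and assumes a normal prior on a single shift parameter `delta`, all of which are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical trial arm: observed QALYs and the share of participants observed.
observed_qalys = np.array([0.62, 0.71, 0.55, 0.80, 0.67, 0.74])
share_observed = 0.75


def pattern_mixture_mean(observed, share_obs, delta):
    """Mix the observed-pattern mean with a missing-pattern mean shifted by delta."""
    mean_obs = observed.mean()
    return share_obs * mean_obs + (1 - share_obs) * (mean_obs + delta)


# Assumed elicited prior: participants with missing data fare somewhat worse on average.
delta_draws = rng.normal(loc=-0.05, scale=0.03, size=10_000)
adjusted = pattern_mixture_mean(observed_qalys, share_observed, delta_draws)

# delta = 0 means the missing pattern has the same mean as the observed pattern (MAR).
mar_mean = pattern_mixture_mean(observed_qalys, share_observed, 0.0)
print(f"mean under MAR (delta = 0): {mar_mean:.3f}")
print(f"mean under elicited prior:  {adjusted.mean():.3f}")
```

Setting `delta` to zero recovers the MAR answer, so the prior on `delta` is what encodes the experts’ beliefs about departures from MAR.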

This is an economics blog. But focusing solely on economics papers in these round-ups would mean missing out on some papers from related fields that may provide insight into our own work. Thus I present to you a politics and sociology paper. It is not my field and I can’t give a reliable appraisal of the methods, but the results are of interest. In the global fight against non-communicable diseases, there is a range of policy tools available to governments, including the sugar tax of the paper at the top. The WHO recommends a large number. However, there is ongoing debate about whether trade rules and agreements are used to undermine this public health legislation. One agreement, the Technical Barriers to Trade (TBT) Agreement that World Trade Organization (WTO) members all sign, states that members may not impose ‘unnecessary trade costs’ or barriers to trade, especially if the intended aim of the measure can be achieved without doing so. For example, Philip Morris cited a bilateral trade agreement when it sued the Australian government for introducing plain packaging, claiming it violated the terms of trade. Philip Morris eventually lost, but not before substantial costs were incurred. In another example, the Thai government were deterred from introducing a traffic light warning system for food after threats of a trade dispute from the US, which cited WTO rules. However, there was no clear evidence on the extent to which trade disputes have undermined public health measures.

This article presents results from a new database of all TBT WTO challenges. Between 1995 and 2016, 93 challenges were raised concerning food, beverage, and tobacco products, the number per year growing over time. The most frequent challenges were over labelling products and then restricted ingredients. The paper presents four case studies, including Indonesia delaying food labelling of fat, sugar, and salt after challenge by several members including the EU, and many members including the EU again and the US objecting to the size and colour of a red STOP sign that Chile wanted to put on products containing high sugar, fat, and salt.

We have previously discussed the politics and political economy around public health policy relating to e-cigarettes, among other things. Understanding the political economy of public health and phenomena like government failure can be as important as understanding markets and market failure in designing effective interventions.
