# Chris Sampson’s journal round-up for 17th December 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Health related quality of life aspects not captured by EQ-5D-5L: results from an international survey of patients. Health Policy Published 14th December 2018

Generic preference-based measures, such as the EQ-5D, cannot capture all aspects of health-related quality of life. They’re not meant to. Rather, their purpose is to capture just enough information to adequately distinguish between health states with respect to the domains deemed normatively relevant to decision makers. The stated aim of this paper is to determine whether people with a variety of chronic conditions believe that their experiences can be adequately represented by the EQ-5D-5L.

The authors conducted an online survey, identifying participants through 320 patient associations across 47 countries. Participants were asked to complete the EQ-5D-5L and then asked if any aspects of their illness, which had a “big impact” on their health, were not captured by the EQ-5D-5L. 1,031 people started the survey and 767 completed it. More than half were from the UK. 51% of respondents said that there was some aspect of health not captured by the EQ-5D-5L. Of them, 19% mentioned fatigue, 12% mentioned medication side effects, 9.5% mentioned co-morbid conditions, and then a bunch of others in smaller proportions.

It’s nice to know what people think, but I have a few concerns about the usefulness of this study. One of the main problems is that it doesn’t seem safe to assume that respondents interpret “big impact” as meaning “an impact that is independently important in determining your overall level of quality of life”. So, even if we accept that people judging something to be important makes it important (which I’m not sure it does), then we still can’t be sure whether what they are identifying is within the scope of what we’re trying to measure. For starters, I can see no justification for including a ‘medication side effects’ domain. There’s also some concern about selection and attrition. I’d guess that people with more complicated or less common health concerns would be more likely to start and finish a survey about more complicated or less common health concerns.

The main thing I took from this study is that half of respondents with chronic diseases thought that the EQ-5D-5L captured every single aspect of health that had a “big impact”, and that there wasn’t broad support for any other specific dimension.

Reducing drug wastage in pharmaceuticals dosed by weight or body surface areas by optimising vial sizes. Applied Health Economics and Health Policy [PubMed] Published 5th December 2018

It’s common for pharmaceuticals to be wasted. Not just those out-of-date painkillers you threw in the bin, but also the expensive stuff being used in hospitals. One of the main reasons that waste occurs is that vials are made to specific sizes and, often, dosage varies from patient to patient – according to weight, for example – and doesn’t match the vial size. Suppose that vials are available as 50mg and 80mg and that an individual requires a 60mg dose. One way to address this might be to allow for vial sharing, whereby the leftovers are given to the next patient. But that isn’t always possible. So, we might like to consider what the best combination of available vial sizes should be, given the characteristics of the population.

In this paper, the authors set out the problem mathematically. Essentially, the optimisation problem is to minimise cost across the population subject to the vial sizes. An example is presented for two drugs (pembrolizumab and cabazitaxel), simulating patients based on samples drawn from the Health Survey for England. Simplifications are applied to the examples, such as setting a constraint of 8 vials per patient and assuming that prices are linear (i.e. fixed per milligram).
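The optimisation the authors describe can be sketched with a brute-force search over vial combinations. This is an illustrative toy, not the authors’ algorithm: the function name and example doses are mine, the 8-vial cap is borrowed from the paper’s constraint, and prices are assumed linear in milligrams so that minimising waste also minimises cost.

```python
import itertools

def min_waste(dose_mg, vial_sizes, max_vials=8):
    """Return (waste, combination) minimising leftover drug for one dose.

    Enumerates all combinations of up to max_vials vials (with repetition)
    whose total meets or exceeds the required dose, and keeps the one
    with the least leftover.
    """
    best = (float("inf"), None)
    for n in range(1, max_vials + 1):
        for combo in itertools.combinations_with_replacement(vial_sizes, n):
            total = sum(combo)
            if total >= dose_mg:
                waste = total - dose_mg
                if waste < best[0]:
                    best = (waste, combo)
    return best

# The 60mg example from the text: with 50mg and 80mg vials, the best
# feasible choice is a single 80mg vial, wasting 20mg
waste, combo = min_waste(60, [50, 80])
```

A population-level version would average this per-dose waste over simulated patient weights, which is essentially what the authors’ simulations from the Health Survey for England do.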

Pembrolizumab is currently available in 50mg and 100mg vials, and the authors estimate current wastage at 13.2%. The simulations show that switching the 50mg vial for a 70mg vial would cut wastage to 8.6%. Cabazitaxel is available in 60mg vials, resulting in 19.4% wastage. Introducing a 12.5mg vial would cut wastage by around two thirds. An important general finding, which should be self-evident, is that vial sizes should not be divisible by each other, as this limits the number of possible combinations.

Depending on when vial sizes are determined (e.g. pre- or post-authorisation), pharmaceutical companies might use this kind of optimisation to increase profit margins, or health systems might use it to save costs. Regardless, wastage isn’t useful. Evidence-based manufacture is one of those ideas that is simple and seems obvious once it’s spelt out. It’s a rare opportunity to benefit patients, health care providers, and manufacturers, with no significant burden on policymakers.

Death or debt? National estimates of financial toxicity in persons with newly-diagnosed cancer. The American Journal of Medicine [PubMed] Published October 2018

If you’re British, what’s the scariest thing about an ‘Americanised’ (/Americanized) health care system? Expensive inhalers? A shortened life expectancy? My guess is that the prospect of having to add financial ruin to terminal illness looms pretty large. You should make sure your fear is evidence-based. Here’s a paper to shake in the face of anyone who doesn’t support universal health care.

The authors use data from the Health and Retirement Study from 1998 to 2014, which covers people over 50 years of age and captures new (self-reported) diagnoses of cancer. A new diagnosis was the basis for inclusion in the study, representing over 9.5 million new diagnoses of cancer nationally. Up to two years pre-diagnosis was taken as the baseline. The data set also includes information on participants’ assets and debts, allowing the authors to use change in net worth as the primary outcome. Generalised linear models were used to assess various indicators of financial toxicity, including changes in or incurrence of consumer debt, mortgage debt, and home equity debt at two- and four-year follow-up. In addition to cancer diagnosis, various chronic comorbidities and socio-demographic variables were included in the models.

Shockingly, within two years of diagnosis, 42.4% of people had depleted their entire life’s assets. Average net worth had dropped by \$92,000. After four years, 38.2% were still insolvent. Women, older people, people who weren’t White, people with Medicaid, and those with worsening cancer status were among those more likely to have completely depleted their assets within two years. Having private insurance and being married had protective effects, as we might expect. There were some interesting findings associated with the 2008 financial crisis, which also seemed to be protective. And a protective effect associated with psychiatric comorbidity deserves more thought.

It’s difficult to explain away any (let alone all) of the magnitude of these findings. The analysis seems robust. But, given all the other evidence available about out-of-pocket costs for cancer patients in the US, it should be shocking but not unexpected. The authors describe financial toxicity as ‘unintended’. There’s nothing unintended about this. Policymakers in the US keep deciding that they’d rather destroy the lives of sick people than allow for the spreading of that financial risk.

# Sam Watson’s journal round-up for 8th October 2018

A cost‐effectiveness threshold based on the marginal returns of cardiovascular hospital spending. Health Economics [PubMed] Published 1st October 2018

There are two types of cost-effectiveness threshold of interest to researchers. First, there’s the societal willingness-to-pay for a given gain in health or quality of life. This is what many regulatory bodies, such as NICE, use. Second, there is the actual return on medical spending achieved by the health service. Reimbursement of technologies with a lesser return for every pound or dollar would reduce the overall efficiency of the health service. Some refer to this as the opportunity cost, although in a technical sense I would disagree that it is the opportunity cost per se. Nevertheless, this latter definition has seen a growth in empirical work; with some data on health spending and outcomes, we can start to estimate this threshold.

This article looks at the relationship between spending on cardiovascular disease (CVD) and survival among elderly age groups, by gender, in the Netherlands. Estimating the causal effect of spending is tricky with these data: spending may go up because survival is worsening; external factors like smoking may have a confounding role; and using five-year age bands (as the authors do) over time can lead to bias, as the average age within these bands increases as demographics shift. The authors do a pretty good job of specifying a Bayesian hierarchical model with enough flexibility to accommodate these potential issues. For example, linear time trends are allowed to vary by age-gender group, and dynamic effects of spending are included. However, there’s no examination of whether the model is actually a good fit to the data, something which I’m growing to believe is an area where we, in health and health services research, need to improve.

Most interestingly (for me at least), the authors look at a range of priors based on previous studies and a meta-analysis of similar studies. The estimated elasticity using information from prior studies is more ‘optimistic’ about the effect of health spending than under a ‘vague’ prior. This could be because CVD, or the Netherlands, differs in some particular way from other areas. I might argue that the modelling here is better than some previous efforts as well, which could explain the difference. Extrapolating using life tables, the authors estimate a base-case cost per QALY of €40,000.

Early illicit drug use and the age of onset of homelessness. Journal of the Royal Statistical Society: Series A Published 11th September 2018

How the consumption of different things, like food, drugs, or alcohol, affects life and health outcomes is a difficult question to answer empirically. Consider a recent widely-criticised study on alcohol published in The Lancet. Among a number of issues, despite including a huge amount of data, the paper was unable to address the problem that different kinds of people drink different amounts. The kind of person who is teetotal may be so for a number of reasons including alcoholism, interaction with medication, or other health issues. Similarly, studies on the effect of cannabis consumption have shown among other things an association with lower IQ and poorer mental health. But are those who consume cannabis already those with lower IQs or at higher risk of psychoses? This article considers the relationship between cannabis and homelessness. While homelessness may lead to an increase in drug use, drug use may also be a cause of homelessness.

The paper is a neat application of bivariate hazard models. We recently looked at shared parameter models on the blog, which factorise the joint distribution of two variables into their marginal distributions by assuming that their relationship is due to some unobserved variable. The bivariate hazard models work in a similar way here: the bivariate model is specified as the product of the marginal densities and the individual unobserved heterogeneity. This specification allows (i) people to have different unobserved risks for both homelessness and cannabis use and (ii) cannabis use to have a causal effect on homelessness and vice versa.
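A toy simulation can show why the shared-heterogeneity set-up matters. This is my own sketch, not the paper’s model: a discrete-time approximation with invented baseline hazards, in which a shared frailty `u` raises both hazards and `beta_causal` adds a causal effect of prior cannabis use on the homelessness hazard. Even with `beta_causal` set to zero, a naive comparison of raw rates finds that cannabis users become homeless more often, purely through the shared frailty.

```python
import math
import random

def simulate_person(beta_causal, rng, t_max=30):
    """Discrete-time onset of cannabis use and homelessness for one person.

    A shared frailty u raises both hazards (correlated unobserved risk);
    beta_causal additionally raises the homelessness hazard once the
    person has started using cannabis (the causal pathway).
    """
    u = rng.gauss(0.0, 1.0)  # unobserved heterogeneity shared by both processes
    t_cannabis = t_homeless = None
    for t in range(t_max):
        if t_cannabis is None and rng.random() < min(1.0, 0.05 * math.exp(0.5 * u)):
            t_cannabis = t
        causal = beta_causal if t_cannabis is not None else 0.0
        if t_homeless is None and rng.random() < min(1.0, 0.01 * math.exp(0.5 * u + causal)):
            t_homeless = t
    return t_cannabis, t_homeless

rng = random.Random(42)
# Switch the causal pathway off entirely: any association left over is
# driven purely by the shared frailty
people = [simulate_person(beta_causal=0.0, rng=rng) for _ in range(10_000)]
users = [h for c, h in people if c is not None]
non_users = [h for c, h in people if c is None]
rate_users = sum(h is not None for h in users) / len(users)
rate_non_users = sum(h is not None for h in non_users) / len(non_users)
# rate_users exceeds rate_non_users despite beta_causal = 0: this is the
# confounding that the bivariate hazard model is designed to separate out
```

The bivariate model recovers the causal effect by modelling the two onset times jointly with the frailty, rather than comparing raw rates as done here.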

Despite the careful set-up though, I’m not wholly convinced of the face validity of the results. The authors claim that daily cannabis use among men has a large effect on becoming homeless – as large an effect as having separated parents – which seems implausible to me. Cannabis use can cause psychological dependency but I can’t see people choosing it over having a home as they might with something like heroin. The authors also claim that homelessness doesn’t really have an effect on cannabis use among men because the estimated effect is “relatively small” (it is the same order of magnitude as the reverse causal effect) and only “marginally significant”. Interpreting these results in the context of cannabis use would then be difficult, though. The paper provides much additional material of interest. However, the conclusion that regular cannabis use, all else being equal, has a “strong effect” on male homelessness, seems both difficult to conceptualise and not in keeping with the messiness of the data and complexity of the empirical question.

How could health care be anything other than high quality? The Lancet: Global Health [PubMed] Published 5th September 2018

Tedros Adhanom Ghebreyesus, or Dr Tedros as he’s better known, is the head of the WHO. This editorial was penned in response to the recent Lancet Commission on Health Care Quality and related studies (see this round-up). However, I was critical of these studies for a number of reasons, in particular the conflation of ‘quality’ as we normally understand it with everything else that may impact on how a health system performs. This includes resourcing, which is obviously low in poor countries, availability of labour and medical supplies, and demand-side choices about health care access. The empirical evidence was fairly weak; even in countries like the UK, where we’re swimming in data, we struggle to quantify quality. Data are also often averaged at the national level, masking huge underlying variation within countries. This editorial is, therefore, a bit of an empty platitude: of course we should strive to improve ‘quality’ – its goodness is definitional. But without a solid understanding of how to do this, or even of what we mean when we say ‘quality’ in this context, we’re not really saying anything at all. Proposing that we need a ‘revolution’ without any real concrete proposals is fairly meaningless and ignores the massive strides that have been made in recent years. Delivering high-quality, timely, effective, equitable, and integrated health care in the poorest settings means more resources. Tinkering with what little services already exist for those most in need is not going to produce revolutionary change. But this strays into political territory, in which UN organisations often flounder.

Editorial: Statistical flaws in the teaching excellence and student outcomes framework in UK higher education. Journal of the Royal Statistical Society: Series A Published 21st September 2018

As a final note for our academic audience, we give you a statement on the Teaching Excellence Framework (TEF). For our non-UK audience, the TEF is a new system being introduced by the government, which seeks to introduce more of a ‘market’ in higher education by trying to quantify teaching quality and then allowing the best-performing universities to charge more. No-one would disagree with the sentiment that improving higher education standards is better for students and teachers alike, but the TEF is fundamentally statistically flawed, as discussed in this editorial in the JRSS.

Some key points of contention are: (i) TEF doesn’t actually assess any teaching, such as through observation; (ii) there is no consideration of uncertainty about scores and rankings; (iii) “The benchmarking process appears to be a kind of poor person’s propensity analysis” – copied verbatim as I couldn’t have phrased it any better; (iv) there has been no consideration of gaming the metrics; and (v) the proposed models do not reflect the actual aims of TEF and are likely to be biased. Economists will also likely have strong views on how the TEF incentives will affect institutional behaviour. But, as Michael Gove, the former justice and education secretary said, Britons have had enough of experts.

# Sam Watson’s journal round-up for 10th September 2018

Probabilistic sensitivity analysis in cost-effectiveness models: determining model convergence in cohort models. PharmacoEconomics [PubMed] Published 27th July 2018

Probabilistic sensitivity analysis (PSA) is rightfully a required component of economic evaluations. Deterministic sensitivity analyses are generally biased; the output of a model evaluated at a single choice of values from a complex joint distribution is not likely to be a good reflection of the true model mean. PSA involves repeatedly sampling parameters from their respective distributions and analysing the resulting model outputs. But how many times should you do this? Usually, an arbitrary number that seems “big enough” is selected, say 1,000 or 10,000. But these simulations themselves exhibit variance: so-called Monte Carlo error. This paper discusses making the choice of the number of simulations more formal by assessing the “convergence” of simulation output.

In the same way as sample sizes are chosen for trials, the number of simulations should provide an adequate level of precision; anything more wastes resources without improving inferences. For example, if the statistic of interest is the net monetary benefit, then we would want the confidence interval (CI) to exclude zero, as this should be a sufficient level of certainty for an investment decision. The paper therefore proposes conducting a number of simulations, examining whether the CI is ‘narrow enough’, and conducting further simulations if it is not. However, I see a problem with this proposal: the variance of a statistic from a sequence of simulations itself has variance. The stopping points at which we might check the CI are themselves arbitrary: additional simulations can increase the width of the CI as well as reduce it. Consider simulating a simple ratio of random variables, $ICER = \text{Gamma}(1, 0.01) / \text{Normal}(0.01, 0.01)$. The proposed “stopping rule” doesn’t necessarily indicate “convergence”, as a few more simulations could lead to a wider, as well as a narrower, CI. The heuristic approach is undoubtedly an improvement on the current way things are usually done, but I think there is scope here for a more rigorous method of assessing convergence in PSA.
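The instability of interim CIs is easy to demonstrate by simulation. This is a sketch under my own assumptions, not the paper’s method: a batch size of 500, a normal-approximation CI for the mean, and NumPy’s shape/scale parameterisation of the distributions quoted in the text.

```python
import numpy as np

rng = np.random.default_rng(7)

def ci_width(draws):
    """Width of a normal-approximation 95% CI for the mean of the draws."""
    se = draws.std(ddof=1) / np.sqrt(len(draws))
    return 2 * 1.96 * se

# Accumulate PSA draws in batches and record the CI width at each
# interim 'stopping point'
draws = np.empty(0)
widths = []
for _ in range(20):
    batch = rng.gamma(1.0, 0.01, 500) / rng.normal(0.01, 0.01, 500)
    draws = np.concatenate([draws, batch])
    widths.append(ci_width(draws))

# Because the ratio is heavy-tailed (the denominator is frequently near
# zero), an extreme draw in a later batch can make the interim CI wider
# than it was at an earlier stopping point, so a 'narrow enough' CI at
# one check is no guarantee of convergence
ci_ever_widened = any(b > a for a, b in zip(widths, widths[1:]))
```

With heavy-tailed outputs like this ratio, `widths` will typically not shrink monotonically, which is the author’s objection to checking the CI at arbitrary stopping points.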

Mortality due to low-quality health systems in the universal health coverage era: a systematic analysis of amenable deaths in 137 countries. The Lancet [PubMed] Published 5th September 2018

Richard Horton, the oracular editor-in-chief of the Lancet, tweeted last week.

There is certainly an argument that academic journals are good forums to make advocacy arguments. Who better to interpret the analyses presented in these journals than the authors and audiences themselves? But, without a strict editorial bulkhead between analysis and opinion, we run the risk that the articles and their content are influenced or dictated by the political whims of editors rather than scientific merit. Unfortunately, I think this article is evidence of that.

No-one debates that improving health care quality will improve patient outcomes and experience. It is in the very definition of ‘quality’. This paper aims to estimate the numbers of deaths each year due to ‘poor quality’ in low- and middle-income countries (LMICs). The trouble with this is two-fold: given the number of unknown quantities required to get a handle on this figure, the definition of quality notwithstanding, the uncertainty around this figure should be incredibly high (see below); and, attributing these deaths in a causal way to a nebulous definition of ‘quality’ is tenuous at best. The approach of the article is, in essence, to assume that the differences in fatality rates of treatable conditions between LMICs and the best performing health systems on Earth, among people who attend health services, are entirely caused by ‘poor quality’. This definition of quality would therefore seem to encompass low resourcing, poor supply of human resources, a lack of access to medicines, as well as everything else that’s different in health systems. Then, to get to this figure, the authors have multiple sources of uncertainty including:

• Using a range of proxies for health care utilisation;
• Using global burden of disease epidemiology estimates, which have associated uncertainty;
• A number of data slicing decisions, such as truncating case fatality rates;
• Estimating utilisation rates based on a predictive model;
• Estimating the case-fatality rate for non-users of health services based on other estimated statistics.

Despite this, the authors claim to estimate a mean of 5.0 million deaths per year due to ‘poor quality’, with a 95% uncertainty interval only 300,000 wide. This seems highly implausible, and yet it is claimed to be a causal effect of an undefined ‘poor quality’. The timing of this article coincides with the Lancet Commission on care quality in LMICs and, one suspects, had it not been for the advocacy angle on care quality, it would not have been published in this journal.
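Just how implausible a 300,000-wide interval is can be illustrated with a crude Monte Carlo propagation of the uncertainty sources listed above. The four input distributions and their 10% relative uncertainties are invented for illustration; only the 5.0 million central figure comes from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Four multiplicative adjustment factors, one per source of uncertainty,
# each with a modest ~10% relative standard deviation (invented figures)
utilisation = rng.normal(1.0, 0.1, n)   # proxy-based utilisation
incidence = rng.normal(1.0, 0.1, n)     # GBD epidemiology estimates
cfr_gap = rng.normal(1.0, 0.1, n)       # case-fatality gap vs best systems
nonuser_cfr = rng.normal(1.0, 0.1, n)   # non-user case-fatality adjustment

# Propagate the uncertainty through the product, centred on 5.0 million
deaths = 5.0e6 * utilisation * incidence * cfr_gap * nonuser_cfr
lo, hi = np.percentile(deaths, [2.5, 97.5])
# Even these modest input uncertainties produce a 95% interval several
# million wide, an order of magnitude wider than the 300,000 quoted
```

Real propagation through the paper’s full pipeline would involve more inputs and more steps, which should widen, not narrow, the interval.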

Embedding as a pitfall for survey‐based welfare indicators: evidence from an experiment. Journal of the Royal Statistical Society: Series A Published 4th September 2018

Health economists will be well aware of the various measures used to evaluate welfare and well-being. Surveys are typically used that comprise questions relating to a number of different dimensions. These could include emotional and social well-being or physical functioning. Similar types of surveys are also used to collect population preferences over states of the world or policy options; for example, Kahneman and Knetsch conducted a survey of willingness to pay (WTP) for different environmental policies. These surveys can exhibit what is called an ‘embedding effect’, which Kahneman and Knetsch described as when the value of a good varies “depending on whether the good is assessed on its own or embedded as part of a more inclusive package.” That is to say, the way people value single-dimensional attributes or qualities can be distorted when they’re embedded as part of a multi-dimensional choice. This article reports the results of an experiment involving students who were asked to weight the relative importance of different dimensions of the Better Life Index, including jobs, housing, and income. The randomised treatment was whether they rated ‘jobs’ as a single category or were presented with individual dimensions, such as the unemployment rate and job security. The experiment shows strong evidence of embedding – the overall weighting substantially differed by treatment. This, the authors conclude, means that the Better Life Index fails to accurately capture preferences and is subject to manipulation should a researcher be so inclined – if you want evidence to say your policy is the most important, just change the way the dimensions are presented.
