Chris Sampson’s journal round-up for 23rd September 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Can you repeat that? Exploring the definition of a successful model replication in health economics. PharmacoEconomics [PubMed] Published 18th September 2019

People talk a lot about replication and its role in demonstrating the validity and reliability of analyses. But what does a successful replication in the context of cost-effectiveness modelling actually mean? Does it mean coming up with precisely the same estimates of incremental costs and effects? Does it mean coming up with a model that recommends the same decision? The authors of this study sought to bring us closer to an operational definition of replication success.

There is potentially much to learn from other disciplines that have a more established history of replication. The authors reviewed literature on the definition of ‘successful replication’ across all disciplines, and used their findings to construct a variety of candidate definitions for use in the context of cost-effectiveness modelling in health. Ten definitions of a successful replication were pulled out of the cross-disciplinary review, which could be grouped into ‘data driven’ replications and ‘experimental’ replications – the former relating to the replication of analyses and the latter relating to the replication of specific observed effects. The ten definitions were from economics, biostatistics, cognitive science, psychology, and experimental philosophy. The definitions varied greatly, with many involving subjective judgments about the proximity of findings. A few studies were found that reported on replications of cost-effectiveness models and which provided some judgment on the level of success. Again, these were inconsistent and subjective.

Quite reasonably, the authors judge that the lack of a fixed definition of successful replication in any scientific field is not just an oversight. The threshold for ‘success’ depends on the context of the replication and on how the evidence will be used. This paper provides six possible definitions of replication success for use in cost-effectiveness modelling, ranging from an identical replication of the results, through partial success in replicating specific pathways within a given margin of error, to simply replicating the same implied decision.

Ultimately, ‘data driven’ replications are a solution to a problem that shouldn’t exist, namely, poor reporting. This paper mostly convinced me that overall ‘success’ isn’t a useful thing to judge in the context of replicating decision models. Replication of certain aspects of a model is useful to evaluate. Whether the replication implied the same decision is a key thing to consider. Beyond this, it is probably worth considering partial success in replicating specific parts of a model.

Differential associations between interpersonal variables and quality-of-life in a sample of college students. Quality of Life Research [PubMed] Published 18th September 2019

There is growing interest in the well-being of students and the distinct challenges involved in achieving good mental health and addressing high levels of demand for services in this group. Students go through many changes that might influence their mental health, prominent among these is the change to their social situation.

This study set out to identify the role of key interpersonal variables on students’ quality of life. The study recruited 1,456 undergraduate students from four universities in the US. The WHOQOL measure was used for quality of life and a barrage of measures were used to collect information on loneliness, social connectedness, social support, emotional intelligence, intimacy, empathic concern, and more. Three sets of analyses of increasing sophistication were conducted, from zero-order correlations between each measure and the WHOQOL, to a network analysis using a Gaussian Graphical Model to identify both direct and indirect relationships while accounting for shared variance.

In all analyses, loneliness stuck out as the strongest driver of quality of life. Social support, social connectedness, emotional intelligence, intimacy with one’s romantic partner, and empathic concern were also significantly associated with quality of life. But the impact of loneliness was greatest, with other interpersonal variables influencing quality of life through their impact on loneliness.

This is a well-researched and reported study. The findings are informative to student support and other services that seek to improve the well-being of students. There is reason to believe that such services should recognise the importance of interpersonal determinants of well-being and in particular address loneliness. But it’s important to remember that this study is only as good as the measures it uses. If you don’t think WHOQOL is adequately measuring student well-being, or you don’t think the UCLA Loneliness Scale tells us what we need to know, you might not want these findings to influence practice. And, of course, the findings may not be generalisable, as the extent to which different interpersonal variables affect quality of life is very likely dependent on the level of service provision, which varies greatly between different universities, let alone countries.

Affordability and non-perfectionism in moral action. Ethical Theory and Moral Practice [PhilPapers] Published 14th September 2019

The ‘cost-effective but unaffordable’ challenge has been bubbling for a while now, at least since sofosbuvir came on the scene. This study explores whether “we can’t afford it” is a justifiable position to take. The punchline is that, no, affordability is not a sound ethical basis on which to support or reject the provision of a health technology. I was extremely sceptical when I first read the claim. If we can’t afford it, it’s impossible, and how can there by a moral imperative in an impossibility? But the authors proceeded to convince me otherwise.

The authors don’t go into great detail on this point, but it all hinges on divisibility. The reason that a drug like sofosbuvir might be considered unaffordable is that loads of people would be eligible to receive it. If sofosbuvir was only provided to a subset of this population, it could be affordable. On this basis, the authors propose the ‘principle of non-perfectionism’. This states that not being able to do all the good we can do (e.g. provide everyone who needs it with sofosbuvir) is not a reason for not doing some of the good we can do. Thus, if we cannot support provision of a technology to everyone who could benefit from it, it does not follow (ethically) to provide it to nobody, but rather to provide it to some people. The basis for selecting people is not of consequence to this argument but could be based on a lottery, for example.

Building on this, the authors explain to us why this is wrong, with the notion of ‘numerical discrimination’. They argue that it is not OK to prioritise one group over another simply because we can meet the needs of everyone within that group as opposed to only some members of the other group. This is exactly what’s happening when we are presented with notions of (un)affordability. If the population of people who could benefit from sofosbuvir was much smaller, there wouldn’t be an issue. But the simple fact that the group is large does not make it morally permissible to deny cost-effective treatment to any individual member within that group. You can’t discriminate against somebody because they are from a large population.

I think there are some tenuous definitions in the paper and some questionable analogies. Nevertheless, the authors succeeded in convincing me that total cost has no moral weight. It is irrelevant to moral reasoning. We should not refuse any health technology to an entire population on the grounds that it is ‘unaffordable’. The authors frame it as a ‘mistake in moral mathematics’. For this argument to apply in the HTA context, it relies wholly on the divisibility of health technologies. To some extent, NICE and their counterparts are in the business of defining models of provision, which might result in limited use criteria to get around the affordability issue. Though these issues are often handled by payers such as NHS England.

The authors of this paper don’t consider the implications for cost-effectiveness thresholds, but this is where my thoughts turned. Does the principle of non-perfectionism undermine the morality of differentiating cost-effectiveness thresholds according to budget impact? I think it probably does. Reducing the threshold because the budget impact is great will result in discrimination (‘numerical discrimination’) against individuals simply because they are part of a large population that could benefit from treatment. This seems to be the direction in which we’re moving. Maybe the efficiency cart is before the ethical horse.


Brendan Collins’s journal round-up for 18th March 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Evaluation of intervention impact on health inequality for resource allocation. Medical Decision Making [PubMed] Published 28th February 2019

How should decision-makers factor equity impacts into economic decisions? Can we trade off an intervention’s cost-effectiveness with its impact on unfair health inequalities? Is a QALY just a QALY or should we weight it more if it is gained by someone from a disadvantaged group? Can we assume that, because people of lower socioeconomic position lose more QALYs through ill health, that most interventions should, by default, reduce inequalities?

I really like the health equity plane. This is where you show health impacts (usually including a summary measure of cost-effectiveness like net health benefit or net monetary benefit) and equity impacts (which might be a change in slope index of inequality [SII] or relative index of inequality) on the same plane. This enables decision-makers to identify potential trade-offs between interventions that produce a greater benefit, but have less impact on inequalities, and those that produce a smaller benefit, but increase equity. I think there has been a debate over whether the ‘win-win’ quadrant should be south-east (which would be consistent with the dominant quadrant of the cost-effectiveness plane) or north-east, which is what seems to have been adopted as the consensus and is used here.

This paper showcases a reproducible method to estimate the equity impact of interventions. It considers public health interventions recommended by NICE from 2006-2016, with equity impacts estimated based on whether they targeted specific diseases, risk factors or populations. The disease distributions were based on hospital episode statistics data by deprivation (IMD). The study used equity weights to convert QALYs gained to different social groups into net social welfare. In this case, valuing the most disadvantaged fifth of people’s health at around 6-7 times that of the least disadvantaged fifth. I think there might still be work to be done around reaching consensus for equity weights.

The total expected effect on inequalities is small – full implementation of all recommendations would produce a reduction of the quality-adjusted life expectancy gap between the healthiest and least healthy from 13.78 to 13.34 QALYs. But maybe this is to be expected; NICE does not typically look at vaccinations or screening and has not looked at large scale public health programmes like the Healthy Child Programme in the whole. Reassuringly, where recommended interventions were likely to increase inequality, the trade-off between efficiency and equity was within the social welfare function they had used. The increase in inequality might be acceptable because the interventions were cost-effective – producing 5.6million QALYs while increasing the SII by 0.005. If these interventions are buying health at a good price, then you would hope this might then release money for other interventions that would reduce inequalities.

I suspect that public health folks might not like equity trade-offs at all – trading off equity and cost-effectiveness might be the moral equivalent of trading off human rights – you can’t choose between them. But the reality is that these kinds of trade-offs do happen, and like a lot of economic methods, it is about revealing these implicit trade-offs so that they become explicit, and having ‘accountability for reasonableness‘.

Future unrelated medical costs need to be considered in cost effectiveness analysis. The European Journal of Health Economics [PubMed] [RePEc] Published February 2019

This editorial says that NICE should include unrelated future medical costs in its decision making. At the moment, if NICE looks at a cardiovascular disease (CVD) drug, it might look at future costs related to CVD but it won’t include changes in future costs of cancer, or dementia, which may occur because individuals live longer. But usually unrelated QALY gains will be implicitly included; so there is an inconsistency. If you are a health economic modeller, you know that including unrelated costs properly is technically difficult. You might weight average population costs by disease prevalence so you get a cost estimate for people with coronary heart disease, diabetes, and people without either disease. Or you might have a general healthcare running cost that you can apply to future years. But accounting for a full matrix of competing causes of morbidity and mortality is very tricky if not impossible. To help with this, this group of authors produced the excellent PAID tool, which helps with doing this for the Netherlands (can we have one for the UK please?).

To me, including unrelated future costs means that in some cases ICERs might be driven more by the ratio of future costs to QALYs gained. Whereas currently, ICERs are often driven by the ratio of the intervention costs to QALYs gained. So it might be that a lot of treatments that are currently cost-effective no longer are, or we need to judge all interventions with a higher ICER willingness to pay threshold or value of a QALY. The authors suggest that, although including unrelated medical costs usually pushes up the ICER, it should ultimately result in better decisions that increase health.

There are real ethical issues here. I worry that including future unrelated costs might be used for an integrated care agenda in the NHS, moving towards a capitation system where the total healthcare spend on any one individual is capped, which I don’t necessarily think should happen in a health insurance system. Future developments around big data mean we will be able to segment the population a lot better and estimate who will benefit from treatments. But I think if someone is unlucky enough to need a lot of healthcare spending, maybe they should have it. This is risk sharing and, without it, you may get the ‘double jeopardy‘ problem.

For health economic modellers and decision-makers, a compromise might be to present analyses with related and unrelated medical costs and to consider both for investment decisions.

Overview of cost-effectiveness analysis. JAMA [PubMed] Published 11th March 2019

This paper probably won’t offer anything new to academic health economists in terms of methods, but I think it might be a useful teaching resource. It gives an interesting example of a model of ovarian cancer screening in the US that was published in February 2018. There has been a large-scale trial of ovarian cancer screening in the UK (the UKCTOCS), which has been extended because the results have been promising but mortality reductions were not statistically significant. The model gives a central ICER estimate of $106,187/QALY (based on $100 per screen) which would probably not be considered cost-effective in the UK.

I would like to explore one statement that I found particularly interesting, around the willingness to pay threshold; “This willingness to pay is often represented by the largest ICER among all the interventions that were adopted before current resources were exhausted, because adoption of any new intervention would require removal of an existing intervention to free up resources.”

The Culyer bookshelf model is similar to this, although as well as the ICER you also need to consider the burden of disease or size of the investment. Displacing a $110,000/QALY intervention for 1000 people with a $109,000/QALY intervention for a million people will bust your budget.

This idea works intuitively – if Liverpool FC are signing a new player then I might hope they are better than all of the other players, or at least better than the average player. But actually, as long as they are better than the worst player then the team will be improved (leaving aside issues around different positions, how they play together, etc.).

However, I think that saying that the reference ICER should be the largest current ICER might be a bit dangerous. Leaving aside inefficient legacy interventions (like unnecessary tonsillectomies etc), it is likely that the intervention being considered for investment and the current maximum ICER intervention to be displaced may both be new, expensive immunotherapies. It might be last in, first out. But I can’t see this happening; people are loss averse, so decision-makers and patients might not accept what is seen as a fantastic new drug for pancreatic cancer being approved then quickly usurped by a fantastic new leukaemia drug.

There has been a lot of debate around what the threshold should be in the UK; in England NICE currently use £20,000 – £30,000, up to a hypothetical maximum £300,000/QALY in very specific circumstances. UK Treasury value QALYs at £60,000. Work by Karl Claxton and colleagues suggests that marginal productivity (the ‘shadow price’) in the NHS is nearer to £5,000 – £15,000 per QALY.

I don’t know what the answer to this is. I don’t think the willingness-to-pay threshold for a new treatment should be the maximum ICER of a current portfolio of interventions; maybe it should be the marginal health production cost in a health system, as might be inferred from the Claxton work. Of course, investment decisions are made on other factors, like impact on health inequalities, not just on the ICER.


Thesis Thursday: Koonal Shah

On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Koonal Shah who has a PhD from the University of Sheffield. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.

Valuing health at the end of life
Aki Tsuchiya, Allan Wailoo
Repository link

What were the key questions you wanted to answer with your research?

My key research question was: Do members of the general public wish to place greater weight on a unit of health gain for end of life patients than on that for other types of patients? Or put more concisely: Is there evidence of public support for an end of life premium?

The research question was motivated by a policy introduced by NICE in 2009 [PDF], which effectively gives special weighting to health gains generated by life-extending end of life treatments. This represents an explicit departure from the Institute’s reference case position that all equal-sized health gains are of equal social value (the ‘a QALY is a QALY’ rule). NICE’s policy was justified in part by claims that it represented the preferences of society, but little evidence was available to either support or refute that premise. It was this gap in the evidence that inspired my research question.

I also sought to answer other questions, such as whether the focus on life extensions (rather than quality of life improvements) in NICE’s policy is consistent with public preferences, and whether people’s stated end of life-related preferences depend on the ways in which the preference elicitation tasks are designed, framed and presented.

Which methodologies did you use to elicit people’s preferences?

All four of my empirical studies used hypothetical choice exercises to elicit preferences from samples of the UK general public. NICE’s policy was used as the framework for the designs in each case. Three of the studies can be described as having used simple choice tasks, while one study specifically applied the discrete choice experiment methodology. The general approach was to ask survey respondents which of two hypothetical patients they thought should be treated, assuming that the health service had only enough funds to treat one of them.

In my final study, which focused on framing effects and study design considerations, I included attitudinal questions with Likert item responses alongside the hypothetical choice tasks. The rationale for including these questions was to examine the consistency of respondents’ views across two different approaches (spoiler: most people are not very consistent).

Your study included face-to-face interviews. Did these provide you with information that you weren’t able to obtain from a more general survey?

The surveys in my first two empirical studies were both administered via face-to-face interviews. In the first study, I conducted the interviews myself, while in the second study the interviews were subcontracted to a market research agency. I also conducted a small number of face-to-face interviews when pilot testing early versions of the surveys for my third and fourth studies. The piloting process was useful as it provided me with first-hand information about which aspects of the surveys did and did not work well when administered in practice. It also gave me a sense of how appropriate my questions were. The subject matter – prioritising between patients described as having terminal illnesses and poor prognoses – had the potential to be distressing for some people. My view was that I shouldn’t be including questions that I did not feel comfortable asking strangers in an interview setting.

The use of face-to-face interviews was particularly valuable in my first study as it allowed me to ask debrief questions designed to probe respondents and elicit qualitative information about the thinking behind their responses.

What factors influence people’s preferences for allocating health care resources at the end of life?

My research suggests that people’s preferences regarding the value of end of life treatments can depend on whether the treatment is life-extending or quality of life-improving. This is noteworthy because NICE’s end of life criteria accommodate life extensions but not quality of life improvements.

I also found that the amount of time that end of life patients have to ‘prepare for death’ was a consideration for a number of respondents. Some of my results suggest that observed preferences for prioritising the treatment of end of life patients may be driven by concern about how long the patients have known their prognosis rather than by concern about how long they have left to live, per se.

The wider literature suggests that the age of the end of life patients (which may act as a proxy for their role in their household or in society) may also matter. Some studies have reported evidence that respondents become less concerned about the number of remaining life years when the patients in question are relatively old. This is consistent with the ‘fair innings’ argument proposed by Alan Williams.

Given the findings of your study, are there any circumstances under which you would support an end of life premium?

My findings offer limited support for an end of life premium (though it should be noted that the wider literature is more equivocal). So it might be considered appropriate for NICE to abandon its end of life policy on the grounds that the population health losses that arise due to the policy are not justified by the evidence on societal preferences. However, there may be arguments for retaining some form of end of life weighting irrespective of societal preferences. For example, if the standard QALY approach systematically underestimates the benefits of end of life treatments, it may be appropriate to correct for this (though whether this is actually the case would itself need investigating).

Many studies reporting that people wish to prioritise the treatment of the severely ill have described severity in terms of quality of life rather than life expectancy. And some of my results suggest that support for an end of life premium would be stronger if it applied to quality of life-improving treatments. This suggests that weighting QALYs in accordance with continuous variables capturing quality of life as well as life expectancy may be more consistent with public preferences than the current practice of applying binary cut-offs based only on life expectancy information, and would address some of the criticisms of the arbitrariness of NICE’s policy.