The irrelevance of inference: (almost) 20 years on is it still irrelevant?

The Irrelevance of Inference was a seminal paper published by Karl Claxton in 1999. In it he outlines a stochastic decision making approach to the evaluation of health technologies. A key point that he makes is that we need only to examine the posterior mean incremental net benefit of one technology compared to another to make a decision. Other aspects of the distribution of incremental net benefits are irrelevant – hence the title.

I hated this idea. From a Bayesian perspective, estimation and inference are decision problems. Surely uncertainty matters! But, in the extra-welfarist framework in which we generally conduct cost-effectiveness analysis, it is irrefutable. To see why, let’s consider a basic decision making framework.

There are three aspects to a decision problem. Firstly, there is a state of the world, \theta \in \Theta with density \pi(\theta). In this instance it is the net benefits in the population, but in other contexts it could be the state of the economy or the effectiveness of a medical intervention, for example. Secondly, there are the possible actions, denoted by a \in \mathcal{A}. There might be a discrete set of actions or a continuum of possibilities. Finally, there is the loss function L(a,\theta). The loss function describes the losses or costs associated with making decision a given that \theta is the state of nature. The action that should be taken is the one which minimises expected losses \rho(\theta,a)=E_\theta(L(a,\theta)). Minimising losses can be seen as analogous to maximising utility. We also observe data x=[x_1,...,x_N]' that provide information on the parameter \theta. Our state of knowledge regarding this parameter is then captured by the posterior distribution \pi(\theta|x). Our expected losses should be calculated with respect to this distribution.

Given the data and posterior distribution of incremental net benefits, we need to make a choice about a value (a Bayes estimator) that minimises expected losses. The opportunity loss from making the wrong decision is “the difference in net benefit between the best choice and the choice actually made.” So the decision comes down to deciding whether the incremental net benefits are positive or negative (and hence whether to invest), \mathcal{A}=\{a^+,a^-\}. The losses are linear if we make the wrong decision:

L(a^+,\theta) = 0 if \theta >0 and L(a^+,\theta) = -\theta if \theta <0

L(a^-,\theta) = \theta if \theta >0 and L(a^-,\theta) = 0 if \theta <0

So we should decide that the incremental net benefits are positive if the expected loss from choosing a^- exceeds that from choosing a^+:

E_\theta(L(a^-,\theta)) - E_\theta(L(a^+,\theta)) > 0

which is equivalent to

\int_0^\infty \theta dF^{\pi(\theta|x)}(\theta) - \int_{-\infty}^0 -\theta dF^{\pi(\theta|x)}(\theta) = \int_{-\infty}^\infty \theta dF^{\pi(\theta|x)}(\theta) > 0

which is obviously equivalent to E(\theta|x)>0 – the posterior mean!
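As a rough numerical check of this result, here is a minimal sketch that assumes we have posterior draws of the incremental net benefit. The draws are simulated from a deliberately skewed, made-up distribution, and the names `draws`, `expected_loss_adopt`, and `expected_loss_reject` are illustrative only; none of this comes from Claxton's paper. The two functions correspond to the two integrals above.

```python
import random

random.seed(1)

# Hypothetical posterior draws of incremental net benefit (theta):
# a shifted log-normal, chosen only because it is skewed.
draws = [random.lognormvariate(0.0, 1.0) - 1.5 for _ in range(100_000)]

def expected_loss_adopt(thetas):
    """Expected opportunity loss of adopting (a+): lose -theta when theta < 0."""
    return sum(-t for t in thetas if t < 0) / len(thetas)

def expected_loss_reject(thetas):
    """Expected opportunity loss of rejecting (a-): lose theta when theta > 0."""
    return sum(t for t in thetas if t > 0) / len(thetas)

posterior_mean = sum(draws) / len(draws)
loss_gap = expected_loss_reject(draws) - expected_loss_adopt(draws)

# The gap in expected losses equals the posterior mean, up to rounding error.
print(round(posterior_mean, 4), round(loss_gap, 4))

# Adopt exactly when the posterior mean is positive.
decision = "adopt" if posterior_mean > 0 else "reject"
print(decision)
```

In this made-up example roughly two-thirds of the draws are negative, yet the posterior mean is positive, so adopting is still the action with the lower expected loss.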

If our aim is simply the estimation of net benefits (so \mathcal{A} \subseteq \mathbb{R}), different loss functions lead to different estimators. If we have a squared loss function L(a, \theta)=|\theta-a|^2 then again we should choose the posterior mean. However, other choices of loss function lead to other estimators. The linear loss function, L(a, \theta)=|\theta-a| leads to the posterior median. And a ‘0-1’ loss function: L(a, \theta)=0 if a=\theta and L(a, \theta)=1 if a \neq \theta, gives the posterior mode, which is also the maximum likelihood estimator (MLE) if we have a uniform prior. This latter point does suggest that MLEs will not give the ‘correct’ answer if the net benefit distribution is asymmetric. The loss function is therefore important. But for the purposes of the decision between technologies I see no good reason to reject our initial loss function.
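The link between loss function and estimator can be checked numerically. The sketch below is only illustrative: it assumes a made-up, right-skewed set of posterior draws and searches a coarse grid of candidate estimates for the one with the smallest Monte Carlo expected loss. The helper names (`expected_loss`, `squared`, `absolute`) are mine, not from any particular library.

```python
import random
import statistics

random.seed(2)

# Hypothetical right-skewed posterior draws of net benefit (illustrative only).
draws = [random.gammavariate(2.0, 1.0) for _ in range(2_000)]

def expected_loss(a, thetas, loss):
    """Monte Carlo estimate of E[L(a, theta)] over the posterior draws."""
    return sum(loss(a, t) for t in thetas) / len(thetas)

def squared(a, t):
    return (t - a) ** 2

def absolute(a, t):
    return abs(t - a)

# Candidate point estimates on a grid from 0 to 6 in steps of 0.01.
grid = [i / 100 for i in range(0, 601)]
best_squared = min(grid, key=lambda a: expected_loss(a, draws, squared))
best_absolute = min(grid, key=lambda a: expected_loss(a, draws, absolute))

# Squared loss should land next to the mean, absolute loss next to the median.
print(best_squared, statistics.mean(draws))
print(best_absolute, statistics.median(draws))
```

Because the draws are skewed, the two estimates differ, which is the point: the choice of loss function changes the answer only when the distribution is asymmetric.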

Claxton also noted that equity considerations could be incorporated through ‘adjustments to the measure of outcome’. This could be some kind of weighting scheme. However, this is where I might begin to depart from the claim of the irrelevance of inference. I prefer a social decision maker approach to evaluation in the vein of cost-benefit analysis as discussed by the brilliant Alan Williams. This approach allows for non-market outcomes that extra-welfarism might include but classical welfarism would exclude; their valuations could be arrived at by a political, democratic process or by other means. It also permits inequality aversion and other features that I find are perhaps a more accurate reflection of a political decision making approach. However, one must be aware of all the flaws and failures of this approach, which Williams so neatly describes.

In a social decision maker framework, the decision that should be made is the one that maximises a social welfare function. A utility function expresses social preferences over the distribution of utility in the population; the social welfare function aggregates utility and is usually assumed to be linear (utilitarian). If the utility function is inequality averse then the variance obviously does matter. But in making this claim I am moving away from the arguments of Claxton’s paper and towards a discussion of the relative merits of extra-welfarism and other approaches.
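A toy illustration of why spread matters under inequality aversion, using entirely made-up numbers: two distributions of an outcome across four people with the same mean, aggregated with a linear (utilitarian) sum. The square-root utility is just one simple concave choice that I have assumed for the sketch; it is not taken from Williams or Claxton.

```python
import math

# Two hypothetical distributions of an outcome (say, QALYs) across four people,
# with the same mean but different spread. Values are made up for illustration.
equal = [5.0, 5.0, 5.0, 5.0]
unequal = [2.0, 4.0, 6.0, 8.0]

def social_welfare(outcomes, utility):
    """Utilitarian (linear) aggregation of individual utilities."""
    return sum(utility(y) for y in outcomes)

def linear(y):
    # No inequality aversion: only the total matters.
    return y

def concave(y):
    # Square root is one simple inequality-averse choice, assumed here.
    return math.sqrt(y)

print(social_welfare(equal, linear), social_welfare(unequal, linear))
print(social_welfare(equal, concave), social_welfare(unequal, concave))
```

With the linear utility the two distributions are ranked equally; with the concave one the more equal distribution scores higher, so the mean alone no longer settles the decision.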

Perhaps the statement that inference was irrelevant was made just to capture our attention. After all, the process of updating our knowledge of the net benefits of alternatives from data is inference. But Claxton’s statement refers more to the process of hypothesis testing and p-values (or Bayesian ranges of equivalence), the use of which has no place in decision making. On this point I wholeheartedly agree.


Data sharing and the cost of error

The world’s highest impact factor medical journal, the New England Journal of Medicine (NEJM), seems to have been doing some soul searching. After publishing an editorial early in 2016 insinuating that researchers requesting data from trials for re-analysis were “research parasites”, they have released a series of articles on the topic of data sharing. Four articles were published in August: two in favour and two less so. This month another three articles have been published on the same topic. The journal is also sponsoring a challenge to re-analyse data from a previous trial. We reported earlier in the year about a series of concerns at the NEJM and these new steps are all welcome to address those challenges. However, while the articles consider questions of fairness about sharing data from large, long, and difficult trials, little has been said about the potential costs to society of unremedied errors in data analysis. The costs of not sharing data can be large, as the long-running saga over the controversial PACE trial illustrates.

The PACE trial was a randomised, controlled trial to assess the benefits of a number of treatments for chronic fatigue syndrome including graded exercise therapy and cognitive behavioural therapy. However, after publication of the trial results in 2011, a number of concerns were raised about the conduct of the trial, its analysis, and reporting. These included a change in the definitions of ‘improvement’ and ‘recovery’ mid-way through the trial. Other researchers sought access to the data from the trial for re-analysis, but such requests were rebuffed with what a judge later described as ‘wild speculations’. The data were finally released and recently re-analysed. The new analysis revealed what many suspected – that the interventions in the trial had little benefit. Nevertheless, the recommended treatments for chronic fatigue syndrome had changed as a result of the trial. (STAT has covered the whole story.)

A cost-effectiveness analysis was published alongside the PACE trial. The results showed that cognitive behavioural therapy (CBT) was cost-effective compared to standard care, as was graded exercise therapy (GET). Quality of life was measured in the trial using the EQ-5D, and costs were also recorded, making calculation of incremental cost-effectiveness ratios straightforward. Costs were higher for all the intervention groups. The table reporting QALY outcomes is reproduced below:


At face value the analysis seems reasonable. But, in light of the problems with the trial, including that none of the objective measures of patient health, such as walking tests and step tests, nor labour market outcomes, showed much sign of improvement or recovery, these data seem less convincing. In particular, their statistically significant difference in QALYs – “After controlling for baseline utility, the difference between CBT and SMC was 0.05 (95% CI 0.01 to 0.09)” – may well just be a type I error. A re-analysis of these data is warranted (although gaining access may yet still be hard).

If there actually was no real benefit from the new treatments, then benefits have been lost from elsewhere in the healthcare system. If we assume the NHS achieves £20,000/QALY (contentious I know!) then the health service loses 0.05 QALYs for each patient with chronic fatigue syndrome put on the new treatment. The prevalence of chronic fatigue syndrome may be as high as 0.2% among adults in England, which represents approximately 76,000 people. If all of these were switched to new, ineffective treatments, the opportunity cost could potentially be as much as 3,800 QALYs.
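The back-of-the-envelope calculation above can be written out as a short sketch. All inputs are the figures quoted in the paragraph; the variable names are mine, and the monetary valuation in the last step is simply the QALY total multiplied by the assumed £20,000/QALY figure, not a number reported anywhere in the trial.

```python
# Figures quoted in the paragraph above (all approximate).
people_affected = 76_000          # adults in England with chronic fatigue syndrome
qaly_loss_per_patient = 0.05      # QALYs forgone per patient switched
threshold_gbp_per_qaly = 20_000   # assumed NHS cost per QALY (contentious)

# Upper bound: everyone is switched to a treatment with no real benefit.
total_qalys_lost = people_affected * qaly_loss_per_patient

# Valuing those QALYs at the assumed threshold (an illustrative extra step).
monetary_value_gbp = total_qalys_lost * threshold_gbp_per_qaly

print(f"QALYs forgone: {total_qalys_lost:,.0f}")
print(f"Valued at the threshold: GBP {monetary_value_gbp:,.0f}")
```

This is a worst case by construction: a smaller uptake or a smaller true loss per patient scales the total down proportionally.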

The key point is that analytical errors have costs if the analyses lead to changes in recommended treatments. And when summed over a national health service these costs could become quite substantial. Researchers may worry about publication prestige or fairness in using other people’s hard-won data, but the bigger issue is the wider costs of letting an error go unchallenged.


PrEP: A story in desperate need of health economics communication

The poor state of public economics communication has been decried in many fora. The consensus of economists regarding issues such as the impacts of austerity, leaving the European Union, and other major policy choices, is in general poorly communicated to the public. With a few exceptions, such as Martin Wolf in the Financial Times and Paul Krugman in the New York Times, most major economics issues are communicated by political journalists and frequently lack appropriate scrutiny. Health economics is no exception and this week’s ruling on PrEP (pre-exposure prophylaxis), a combination of anti-retroviral drugs that can reduce the risk of transmission of HIV by over 90%, reveals the poor state of public understanding of economic evaluation and cost-effectiveness analysis.

Perhaps the most shocking of the reportage on this topic came, unsurprisingly, from the Daily Mail, which claimed that funding the “promiscuity pill” would prevent cancer treatment and amputees receiving limbs. Even the comments sections in more temperate journals, such as the Guardian, reveal the same concerns: providing PrEP will both encourage risky sexual behaviour and prevent other treatments being provided. Indeed, NHS England has itself stated that it will be prevented from treating children with cystic fibrosis, despite the lack of any formal cost-effectiveness analyses. A basic understanding of health economics is lacking. The communication of some straightforward facts may improve public understanding:

  • New treatments that would displace resources for other interventions that provide greater benefits are generally not recommended within the NHS.
  • New interventions are often more expensive than standard therapy but, if the new intervention is considered cost-effective, the extra benefit is greater than is being achieved with the same resources elsewhere. In some cases a treatment may have a negative net cost if it is cheaper or prevents longer term costs arising, freeing up resources to be used elsewhere.
  • When assessing whether or not a new intervention is cost-effective it is compared to relevant alternative treatments. In the case of PrEP this would be providing treatment for HIV after it has been contracted, for example. A thoroughgoing cost-effectiveness analysis should take into account possible changes in behaviour induced by the availability of the treatment: vaccination programmes are a good example of this.
  • Criteria other than cost-effectiveness, such as access to other treatments or the demographics of the groups affected by the disease in question, are also used to decide on treatment recommendations.

Understanding these concepts should suggest many of the concerns around PrEP are moot until its cost-effectiveness has been established. But PrEP clearly ignites some heated moral debates. Some might still argue that regardless of its cost-effectiveness, even if it is cost saving, it shouldn’t be provided by a public healthcare system. Many of the objectors to the ruling voice the familiar objection to funding treatments for conditions that result from personal choices about behaviour: the luck egalitarian argument that we have addressed before. While these ethical and political considerations may be valid grounds for debate, the issue is not exclusive to PrEP, and could cover rugby injuries, cancer as a result of smoking, or even provision of HPV vaccine. A final point could therefore be added:

  • Normative economics, a question of what should be, and positive economics, a question of what is, shouldn’t be conflated.

Misconceptions of economic evaluation as bean counting or as being valueless abound. Only through effective communication can this be remedied.

Image credit: Jeffrey Beal (CC BY-SA 3.0)