# Sam Watson’s journal round-up for 10th September 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Probabilistic sensitivity analysis in cost-effectiveness models: determining model convergence in cohort models. PharmacoEconomics [PubMed] Published 27th July 2018

Probabilistic sensitivity analysis (PSA) is rightfully a required component of economic evaluations. Deterministic sensitivity analyses are generally biased; the output of a model evaluated at a single choice of values from a complex joint distribution is unlikely to be a good reflection of the true model mean. PSA involves repeatedly sampling parameters from their respective distributions and analysing the resulting model outputs. But how many times should you do this? Usually an arbitrary number that seems “big enough” is selected, say 1,000 or 10,000. But these simulations themselves exhibit variance; so-called Monte Carlo error. This paper discusses making the choice of the number of simulations more formal by assessing the “convergence” of the simulation output.

In the same way as sample sizes are chosen for trials, the number of simulations should provide an adequate level of precision; anything more wastes resources without improving inferences. For example, if the statistic of interest is the net monetary benefit, then we would want the confidence interval (CI) to exclude zero, as this should be a sufficient level of certainty for an investment decision. The paper therefore proposes conducting a number of simulations, checking whether the CI is ‘narrow enough’, and conducting further simulations if it is not. However, I see a problem with this proposal: the variance of a statistic from a sequence of simulations itself has variance. The stopping points at which we might check the CI are themselves arbitrary: additional simulations can increase the width of the CI as well as reduce it. Consider, for example, a set of simulations from a simple ratio of random variables, $ICER = \text{Gamma}(1, 0.01)/\text{Normal}(0.01, 0.01)$: the proposed “stopping rule” doesn’t necessarily indicate “convergence”, as a few more simulations could lead to a wider, as well as a narrower, CI. The heuristic approach is undoubtedly an improvement on the way things are usually done, but I think there is scope here for a more rigorous method of assessing convergence in PSA.
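To see how jumpy these interval checks can be, here is a minimal sketch (my own, not from the paper) that generates this kind of ratio and prints the 95% CI for its mean at a few arbitrary stopping points. Because the denominator can land close to zero, an occasional extreme draw can blow the interval wide open again after it had appeared to settle.

```python
import numpy as np

# Minimal sketch (not from the paper): check the 95% CI for the mean of a
# simulated ICER-like ratio at a few arbitrary stopping points. Treating the
# Gamma's second parameter as a rate is an assumption; the post doesn't say.
rng = np.random.default_rng(42)
n_max = 10_000

numerator = rng.gamma(shape=1.0, scale=1.0 / 0.01, size=n_max)   # Gamma(1, 0.01)
denominator = rng.normal(loc=0.01, scale=0.01, size=n_max)        # Normal(0.01, 0.01)
ratio = numerator / denominator

for n in (500, 1_000, 2_000, 5_000, 10_000):
    draws = ratio[:n]
    se = draws.std(ddof=1) / np.sqrt(n)            # Monte Carlo standard error
    print(f"n={n:>6}: mean={draws.mean():>14,.0f}  CI width={2 * 1.96 * se:>14,.0f}")
```

Run with different seeds, the width at each checkpoint can go up as well as down, which is exactly the worry about treating any single ‘narrow enough’ check as evidence of convergence.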

Mortality due to low-quality health systems in the universal health coverage era: a systematic analysis of amenable deaths in 137 countries. The Lancet [PubMed] Published 5th September 2018

Richard Horton, the oracular editor-in-chief of the Lancet, tweeted last week:

There is certainly an argument that academic journals are good forums to make advocacy arguments. Who better to interpret the analyses presented in these journals than the authors and audiences themselves? But, without a strict editorial bulkhead between analysis and opinion, we run the risk that the articles and their content are influenced or dictated by the political whims of editors rather than scientific merit. Unfortunately, I think this article is evidence of that.

No-one debates that improving health care quality will improve patient outcomes and experience. It is in the very definition of ‘quality’. This paper aims to estimate the number of deaths each year due to ‘poor quality’ in low- and middle-income countries (LMICs). The trouble with this is two-fold: given the number of unknown quantities required to get a handle on this figure, the definition of quality notwithstanding, the uncertainty around it should be incredibly high (see below); and attributing these deaths in a causal way to a nebulous definition of ‘quality’ is tenuous at best. The approach of the article is, in essence, to assume that the differences in fatality rates for treatable conditions between LMICs and the best-performing health systems on Earth, among people who attend health services, are entirely caused by ‘poor quality’. This definition of quality would therefore seem to encompass low resourcing, a poor supply of human resources, and a lack of access to medicines, as well as everything else that’s different between health systems. To arrive at this figure, the authors contend with multiple sources of uncertainty, including:

• Using a range of proxies for health care utilisation;
• Using global burden of disease epidemiology estimates, which have associated uncertainty;
• A number of data slicing decisions, such as truncating case fatality rates;
• Estimating utilisation rates based on a predictive model;
• Estimating the case-fatality rate for non-users of health services based on other estimated statistics.

Despite this, the authors claim to estimate a 95% uncertainty interval with a width of only 300,000 deaths around a mean estimate of 5.0 million deaths due to ‘poor quality’. This seems highly implausible, and yet it is claimed to be a causal effect of an undefined ‘poor quality’. The timing of this article coincides with the Lancet Commission on care quality in LMICs and, one suspects, had it not been for the advocacy angle on care quality, it would not have been published in this journal.
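For a sense of why stacking several uncertain inputs should produce a wide interval, here is a minimal, entirely hypothetical sketch: the quantities and their uncertainty below are invented, but multiplying just a few inputs that each carry modest relative uncertainty already yields an interval far wider than 300,000 on a mean of roughly 5 million.

```python
import numpy as np

# Entirely hypothetical sketch of uncertainty propagation: multiply a chain of
# uncertain inputs (utilisation share, excess case fatality, attributable
# fraction). All values are invented for illustration only.
rng = np.random.default_rng(1)
n = 100_000

population = 1_000_000_000                      # people with a treatable condition
utilisation = rng.normal(0.50, 0.05, n)         # share who reach health services
excess_cfr = rng.normal(0.010, 0.002, n)        # excess case fatality vs best systems
attributable = rng.normal(0.80, 0.10, n)        # share attributed to 'quality'

deaths = population * utilisation * excess_cfr * attributable
lo, hi = np.percentile(deaths, [2.5, 97.5])
print(f"mean {deaths.mean()/1e6:.1f}m, 95% interval {lo/1e6:.1f}m to {hi/1e6:.1f}m, "
      f"width {(hi - lo)/1e6:.1f}m")
```

Even with these modest input uncertainties the interval width comes out in the millions, not the hundreds of thousands.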

Embedding as a pitfall for survey‐based welfare indicators: evidence from an experiment. Journal of the Royal Statistical Society: Series A Published 4th September 2018

Health economists will be well aware of the various measures used to evaluate welfare and well-being. Surveys are typically used that comprise questions relating to a number of different dimensions. These could include emotional and social well-being or physical functioning. Similar types of surveys are also used to collect population preferences over states of the world or policy options; for example, Kahneman and Knetsch conducted a survey of WTP for different environmental policies. These surveys can exhibit what is called an ‘embedding effect’, which Kahneman and Knetsch described as when the value of a good varies “depending on whether the good is assessed on its own or embedded as part of a more inclusive package.” That is to say, the way people value single-dimensional attributes or qualities can be distorted when they’re embedded as part of a multi-dimensional choice. This article reports the results of an experiment involving students who were asked to weight the relative importance of different dimensions of the Better Life Index, including jobs, housing, and income. The randomised treatment was whether they rated ‘jobs’ as a single category or were presented with individual dimensions, such as the unemployment rate and job security. The experiment shows strong evidence of embedding – the overall weighting differed substantially by treatment. This, the authors conclude, means that the Better Life Index fails to accurately capture preferences and is subject to manipulation should a researcher be so inclined – if you want evidence to say your policy is the most important, just change the way the dimensions are presented.

Credits

# James Lomas’s journal round-up for 21st May 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Decision making for healthcare resource allocation: joint v. separate decisions on interacting interventions. Medical Decision Making [PubMed] Published 23rd April 2018

While it may be uncontroversial that including all of the relevant comparators in an economic evaluation is crucial, a careful examination of this statement raises some interesting questions. Which comparators are relevant? For those that are relevant, how crucial is it that they are not excluded? The answer to the first of these questions may seem obvious, that all feasible mutually exclusive interventions should be compared, but this is in fact deceptive. Dakin and Gray highlight inconsistency between guidelines as to what constitutes interventions that are ‘mutually exclusive’ and so try to re-frame the distinction according to whether interventions are ‘incompatible’ – when it is physically impossible to implement both interventions simultaneously – and, if not, whether interventions are ‘interacting’ – where the costs and effects of the simultaneous implementation of A and B do not equal the sum of their parts. What I really like about this paper is that it has a very pragmatic focus. Inspired by policy arrangements, for example single technology appraisals, and the difficulty in capturing all interactions, Dakin and Gray provide a reader-friendly flow diagram to illustrate cases where excluding interacting interventions from a joint evaluation is likely to have a big impact, and furthermore propose a sequencing approach that avoids the major problems in evaluating separately what should be considered jointly. Essentially, when we have interacting interventions at different points of the disease pathway, evaluating them separately may not be problematic if we start at the end of the pathway and move backwards, similar to the method of backward induction used in sequential games. There are additional related questions that I’d like to see these authors turn to next, such as how to include interaction effects between interventions and, in particular, how to evaluate system-wide policies that may interact with a very large number of interventions. This paper makes a great contribution to answering all of these questions by establishing a framework that clearly distinguishes concepts that had previously been subject to muddied thinking.
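As a toy illustration of that sequencing idea (my own sketch with invented numbers, not Dakin and Gray's worked example): choose the best option at the final stage of the pathway first, for each possible upstream choice, and then evaluate the upstream options conditional on that optimal continuation.

```python
# Toy sketch of the backward-induction sequencing idea; all numbers invented.
# Stage 2 (late) net benefit depends on which stage 1 (early) option was chosen,
# i.e. the interventions interact.
late_nb = {
    "early_A": {"late_X": 120.0, "late_Y": 100.0},
    "early_B": {"late_X": 90.0, "late_Y": 140.0},
}
early_nb = {"early_A": 50.0, "early_B": 40.0}     # stage 1 net benefit on its own

# Backward step: best late-stage option conditional on each early-stage choice.
best_late = {e: max(options, key=options.get) for e, options in late_nb.items()}

# Forward step: pick the early option that maximises total net benefit given
# the optimal downstream continuation.
total_nb = {e: early_nb[e] + late_nb[e][best_late[e]] for e in early_nb}
best_early = max(total_nb, key=total_nb.get)
print(best_early, best_late[best_early], total_nb[best_early])
```

Evaluating the stages in the other direction, or each in isolation, would pick early_A (the better stand-alone option) and miss the more valuable combination.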

When cost-effective interventions are unaffordable: integrating cost-effectiveness and budget impact in priority setting for global health programs. PLoS Medicine [PubMed] Published 2nd October 2017

In my opinion, there are many things that health economists shouldn’t try to include when they conduct cost-effectiveness analysis. Affordability is not one of these. This paper is great because Bilinski et al shine a light on the worldwide phenomenon of interventions being found to be ‘cost-effective’ but not affordable. A particular quote – that it would be financially impossible to implement all interventions that are found to be ‘very cost-effective’ in many low- and middle-income countries – is quite shocking. Bilinski et al compare and contrast cost-effectiveness analysis and budget impact analysis, and argue that there are four key reasons why something could be ‘cost-effective’ but not affordable: 1) judging cost-effectiveness with reference to an inappropriate cost-effectiveness ‘threshold’, 2) adoption of a societal perspective that includes costs not falling upon the payer’s budget, 3) failing to make explicit consideration of the distribution of costs over time, and 4) the use of an inappropriate discount rate that may not accurately reflect the borrowing and investment opportunities facing the payer. They then argue that, because of this, cost-effectiveness analysis should be presented along with budget impact analysis so that the decision-maker can base a decision on both analyses. I don’t disagree with this as a pragmatic interim solution, but – by highlighting these four reasons for divergence of results with such important economic consequences – I think that there will be further-reaching implications of this paper. To my mind, the paper by Bilinski et al essentially serves as a call to arms for researchers to come up with frameworks and estimates so that the conduct of cost-effectiveness analysis can be improved, paradoxical results are no longer produced, decisions are more usefully informed by cost-effectiveness analysis, and the opportunity costs of large budget impacts are properly evaluated – especially in the context of low- and middle-income countries, where the foregone health from poor decisions can be so significant.
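A minimal numerical sketch of the basic phenomenon, with invented numbers: an intervention can sit comfortably below a cost-effectiveness threshold and still swamp the available budget once it is scaled up to the eligible population.

```python
# Invented numbers: 'cost-effective' by the threshold test, yet unaffordable.
threshold = 500.0            # willingness to pay per DALY averted ($)
incr_cost = 300.0            # incremental cost per patient ($)
incr_effect = 1.2            # DALYs averted per patient

icer = incr_cost / incr_effect
print(f"ICER = {icer:.0f} $/DALY averted -> cost-effective: {icer <= threshold}")

eligible_patients = 2_000_000
annual_budget = 200_000_000.0
budget_impact = incr_cost * eligible_patients
print(f"Budget impact = ${budget_impact/1e6:.0f}m vs budget ${annual_budget/1e6:.0f}m "
      f"-> affordable: {budget_impact <= annual_budget}")
```

Any of the four reasons listed above can then drive a wedge between the two verdicts; for example, a threshold that does not reflect the true opportunity cost of the payer's budget makes the first test too easy to pass.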

Patient cost-sharing, socioeconomic status, and children’s health care utilization. Journal of Health Economics [PubMed] Published 16th April 2018

This paper evaluates a policy using a combination of regression discontinuity design and difference-in-differences methods. Not only does it do that, but it tackles an important policy question using a detailed population-wide dataset (a set of linked datasets, more accurately). As if that weren’t enough, one of the policy reforms was actually implemented as a result of a vote where two politicians ‘accidentally pressed the wrong button’, reducing concerns that the policy may have in some way not been exogenous. Needless to say, I found the method employed in this paper to be a pretty convincing identification strategy. The policy question at hand is whether demand for GP visits for children in the Swedish county of Scania (Skåne) is affected by cost-sharing. Cost-sharing for GP visits has applied to different age groups over different periods of time, providing the basis for regression discontinuities around the age threshold and treated and control groups over time. Nilsson and Paul find results suggesting that when health care is free of charge, doctor visits by children increase by 5–10%. In this context, doctor visits were subject to telephone triage by a nurse, and so in this sense it can be argued that all of these visits would be ‘needed’. Further, Nilsson and Paul find that the sensitivity to price is concentrated in low-income households and is greater among sicker children. The authors contextualise their results very well and, in addition to that context, I can’t deny that it also particularly resonated with me to read this approaching the 70th birthday of the NHS – a system where cost-sharing has never been implemented for GP visits by children. This paper is clearly also highly relevant to that debate, which has surfaced again and again in the UK.
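For readers unfamiliar with the combination, here is a minimal sketch of the general shape of such an identification strategy, on simulated data with hypothetical variable names (this is not the authors' exact specification): the effect of free care is read off the interaction between being below the age threshold and the post-reform period.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical sketch of a regression discontinuity + difference-in-differences
# specification; data, variable names and functional form are simulated for
# illustration and are not the authors' specification.
rng = np.random.default_rng(7)
n = 5_000
age = rng.uniform(0, 40, n)
post = rng.integers(0, 2, n)                       # 1 after the fee reform
below = (age < 20).astype(int)                     # 1 below the age threshold
age_c = age - 20                                   # age centred on the threshold
visits = (1.0 + 0.02 * age_c + 0.05 * post
          + 0.08 * below * post                    # 'true' effect of free care
          + rng.normal(0, 0.5, n))
df = pd.DataFrame(dict(visits=visits, age_c=age_c, below=below, post=post))

# The coefficient on below:post is the RD-DiD estimate of removing cost-sharing.
fit = smf.ols("visits ~ below * post + age_c * below", data=df).fit()
print(fit.params["below:post"])
```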

Credits

# Sam Watson’s journal round-up for 30th April 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

The Millennium Villages Project: a retrospective, observational, endline evaluation. The Lancet Global Health [PubMed] Published May 2018

There are some clinical researchers who would have you believe observational studies are completely useless. The clinical trial is king, they might say; observational studies are just too biased. And while it’s true that observational studies are difficult to do well and convincingly, they can be a reliable and powerful source of evidence. Similarly, randomised trials are frequently flawed: for example, there’s often missing data that hasn’t been dealt with, or a lack of allocation concealment, and many researchers forget that randomisation does not guarantee a balance of covariates, it merely increases the probability of it. I bring this up because this study is a particularly carefully designed observational study that I think serves as a good example to other researchers. The paper is an evaluation of the Millennium Villages Project, an integrated intervention program designed to help rural villages across sub-Saharan Africa meet the Millennium Development Goals over ten years between 2005 and 2015. Initial before-after evaluations of the project were criticised for inferring causal “impacts” from before and after data (for example, this Lancet paper had to be corrected after some criticism). To address these concerns, this new paper is incredibly careful about choosing appropriate control villages against which to evaluate the intervention. Their method is too long to summarise here, but in essence they match intervention villages to other villages on the basis of district, agroecological zone, and a range of variables from the DHS – the matches were then reviewed for face validity and revised until a satisfactory matching was complete. The wide range of outcomes are all scaled to a standard normal and made to “point” in the same direction, i.e. so that an increase indicates economic development. Then, to avoid multiple comparisons problems, a Bayesian hierarchical model is used to pool data across countries and outcomes. Cost data were also reported. Even better, “statistical significance” is barely mentioned at all! All in all, a neat and convincing evaluation.
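A rough sketch of what that kind of partial pooling usually looks like (my reading of the general approach, not the paper's exact specification): write $y_{oc}$ for the standardised effect estimate for outcome $o$ in country $c$, with standard error $s_{oc}$, and let

$$ y_{oc} \sim \mathcal{N}(\theta_{oc},\, s_{oc}^2), \qquad \theta_{oc} = \mu_o + \alpha_c + \epsilon_{oc}, \qquad \mu_o \sim \mathcal{N}(\mu,\, \sigma_\mu^2), \quad \alpha_c \sim \mathcal{N}(0,\, \sigma_\alpha^2), \quad \epsilon_{oc} \sim \mathcal{N}(0,\, \tau^2). $$

The shared hyperparameters shrink noisy country-by-outcome estimates towards the outcome and overall means, which is what tames the multiple-comparisons problem without a separate significance test for each outcome.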

Reconsidering the income‐health relationship using distributional regression. Health Economics [PubMed] [RePEc] Published 19th April 2018

The relationship between health and income has long been of interest to health economists. But it is a complex relationship. Increases in income may change consumption behaviours and the use of time, promoting health, while improvements to health may lead to increases in income. Similarly, people who are more likely to earn higher incomes may also be those who look after themselves, or maybe not. Disentangling these various factors has generated a pretty sizeable literature, but almost all of the empirical papers in this area (and indeed most empirical papers in general) use modelling techniques to estimate the effect of something on the expected value, i.e. mean, of some outcome. But the rest of the distribution is also of interest – the mean effect of income may not be very large, but a small increase in income for poorer individuals may have a relatively large effect on the risk of very poor health. This article looks at the relationship between income and the conditional distribution of health using something called “structured additive distributional regression” (SADR). My interpretation of SADR is that one would model the outcome $y \sim g(a,b)$ as being distributed according to some distribution $g(\cdot)$ indexed by parameters $a$ and $b$; for example, a normal or Gamma distribution has two parameters. One would then specify a generalised linear model for $a$ and $b$, e.g. $a = f(X'\beta)$. I’m not sure this is a completely novel method, as people use the approach to, for example, model heteroscedasticity. But that’s not to detract from the paper itself. The findings are very interesting – increases in income have a much greater effect on health at the lower end of the spectrum.
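A minimal sketch of that idea on simulated data (my own illustration, not the authors' SADR model): a Gaussian distributional regression in which both the location and the log of the scale are linear in the covariate, fitted by maximum likelihood.

```python
import numpy as np
from scipy.optimize import minimize

# Minimal sketch of distributional regression (not the authors' SADR model):
# both the location and the log-scale of a normal outcome are linear in x.
# Data are simulated purely for illustration.
rng = np.random.default_rng(0)
n = 2_000
x = rng.uniform(0, 1, n)                                  # e.g. (rescaled) income
y = rng.normal(1.0 + 0.5 * x, np.exp(-0.5 - 1.0 * x))     # spread shrinks as x rises
X = np.column_stack([np.ones(n), x])

def neg_log_lik(theta):
    mu = X @ theta[:2]                    # location: a = f(X'beta)
    sigma = np.exp(X @ theta[2:])         # scale: b modelled on a log link
    return np.sum(np.log(sigma) + 0.5 * ((y - mu) / sigma) ** 2)

fit = minimize(neg_log_lik, x0=np.zeros(4), method="BFGS")
print(fit.x)   # [beta_mu_0, beta_mu_1, beta_logsigma_0, beta_logsigma_1]
```

The second pair of coefficients is what ordinary mean regression throws away: it describes how the spread, and hence the risk of very poor outcomes, varies with income.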

Ask your doctor whether this product is right for you: a Bayesian joint model for patient drug requests and physician prescriptions. Journal of the Royal Statistical Society: Series C Published April 2018.