# Chris Sampson’s journal round-up for 23rd December 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

The Internet and children’s psychological wellbeing. Journal of Health Economics Published 13th December 2019

Here at the blog, we like the Internet. We couldn’t exist without it. We vie for your attention along with all of the other content factories (or “friends”). But there’s a well-established sense that people – especially children – should moderate their consumption of Internet content. The Internet is pervasive and is now a fundamental part of our day-to-day lives, not simply an information source to which we turn when we need it. Almost all 12- to 15-year-olds in the UK use the Internet. The ubiquity of the Internet makes it difficult to test its effects. But this paper has a good go at it.

This study is based on the idea that broadband speeds are a good proxy for Internet use. In England, a variety of public and private sector initiatives have resulted in a distorted market with quasi-random assignment of broadband speeds. The authors provide a very thorough explanation of children’s wellbeing in relation to the Internet, outlining a range of potential mechanisms.

The analysis combines data from the UK’s pre-eminent household panel survey (Understanding Society) with broadband speed data published by the UK regulator Ofcom. Six wellbeing outcomes are analysed from children’s self-reported responses. The questions ask children how they feel about their lives – measured on a seven-point scale – in relation to school work, appearance, family, friends, school attended, and life as a whole. An unbalanced panel of 6,310 children from 2012-2017 provides 13,938 observations from 3,765 different Lower Layer Super Output Areas (LSOA), with average broadband speeds for each LSOA for each year. Each of the six wellbeing outcomes is modelled with child-, neighbourhood- and time-specific fixed effects. The models’ covariates include a variety of indicators relating to the child, their parents, their household, and their local area.
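
To make the estimation strategy concrete, here is a minimal Python sketch of the within (fixed-effects) transformation on made-up data. Everything here is an illustrative assumption – the data, coefficient values, and the restriction to child fixed effects only (the paper also includes neighbourhood and time effects and many covariates) – not the authors’ actual specification.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical unbalanced panel: children observed in several years, with
# an LSOA-level (log) broadband speed and a wellbeing score.
n_children, n_obs = 200, 600
df = pd.DataFrame({
    "child": rng.integers(0, n_children, n_obs),
    "log_speed": rng.normal(3.0, 0.5, n_obs),
})
df["wellbeing"] = 5 - 0.3 * df["log_speed"] + rng.normal(0, 1, n_obs)

# Child fixed effects via the within transformation: demean outcome and
# regressor within each child, then run OLS on the demeaned data.
demeaned = df[["wellbeing", "log_speed"]] - df.groupby("child")[
    ["wellbeing", "log_speed"]].transform("mean")
x = demeaned["log_speed"].to_numpy()
y = demeaned["wellbeing"].to_numpy()
beta = (x @ y) / (x @ x)  # FE slope of wellbeing on log broadband speed
print(round(beta, 2))
```

The demeaning step sweeps out anything constant within a child (personality, stable family circumstances), which is what lets the quasi-random variation in speeds identify the effect.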

A variety of models are tested, and the overall finding is that higher broadband speeds are negatively associated with all of the six wellbeing indicators. Wellbeing in relation to appearance shows the strongest effect; a 1% increase in broadband speed reduces happiness with appearance by around 0.6%. The authors explore a variety of potential mechanisms by running pairs of models between broadband speeds and the mechanism and between the mechanism and the outcomes. A key finding is that the data seem to support the ‘crowding out’ hypothesis. Higher broadband speeds are associated with children spending less time on activities such as sports, clubs, and real world social interactions, and these activities are in turn positively associated with wellbeing. The authors also consider different subgroups, finding that the effects are more detrimental for girls.

Where the paper falls down is that it doesn’t do anything to convince us that broadband speeds represent a good proxy for Internet use. It’s also not clear exactly what the proxy is meant to be for – use (e.g. time spent on the Internet) or access (i.e. having the option to use the Internet) – though the authors seem to be interested in the former. If that’s the case, the logic of the proxy is not obvious. If I want to do X on the Internet then higher speeds will enable me to do it in less time, in which case the proxy would capture the inverse of the desired indicator. The other problem I think we have is in the use of self-reported measures in this context. A key supposed mechanism for the effect is through ‘social comparison theory’, which we might reasonably expect to influence the way children respond to questions as well as – or instead of – their underlying wellbeing.

One-way sensitivity analysis for probabilistic cost-effectiveness analysis: conditional expected incremental net benefit. PharmacoEconomics [PubMed] Published 16th December 2019

Here we have one of those very citable papers that clearly specifies a part of cost-effectiveness analysis methodology. A better title for this paper could be *Make one-way sensitivity analysis great again*. The authors start out by – quite rightly – bashing the tornado diagram, mostly on the basis that it does not intuitively characterise the information that a decision-maker needs. Instead, the authors propose an approach to probabilistic one-way sensitivity analysis (POSA) that is a kind of simplified version of EVPPI (expected value of partial perfect information) analysis. Crucially, this approach does not assume that the various parameters of the analysis are independent.

The key quantity created by this analysis is the conditional expected incremental net monetary benefit (cINMB), conditional, that is, on the value of the parameter of interest. There are three steps to creating a plot of the POSA results: 1) rank the costs and outcomes for the sampled values of the parameter – say from the first to the last centile; 2) plug in a cost-effectiveness threshold value to calculate the cINMB at each sampled value; and 3) record the probability of observing each value of the parameter. You could use this information to present a tornado-style diagram, plotting the credible range of the cINMB. But it’s more useful to plot a line graph showing the cINMB at the different values of the parameter of interest, taking into account the probability that the values will actually be observed.
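
The three steps can be sketched in Python on simulated PSA output. The parameter, cost, and QALY distributions and the €20,000 threshold below are all hypothetical stand-ins, and the dependence between parameter and outcomes is baked in because all three quantities come from the same simulation draws.

```python
import numpy as np

rng = np.random.default_rng(1)
threshold = 20_000  # hypothetical cost-effectiveness threshold per QALY

# Hypothetical PSA output: one parameter of interest plus incremental
# QALYs and costs, drawn jointly so their correlation is preserved.
n = 10_000
param = rng.normal(0.5, 0.1, n)
d_qaly = 0.05 + 0.4 * (param - 0.5) + rng.normal(0, 0.02, n)
d_cost = 800 + rng.normal(0, 100, n)

# Step 1: order the simulations by the parameter and split into centiles.
order = np.argsort(param)
centiles = np.array_split(order, 100)

# Step 2: conditional expected incremental net monetary benefit (cINMB)
# within each centile, at the chosen threshold.
cinmb = np.array([np.mean(threshold * d_qaly[idx] - d_cost[idx])
                  for idx in centiles])

# Step 3: each centile is observed with probability 1/100 by construction,
# so plotting cinmb against the centile midpoints gives the POSA line graph.
print(cinmb[0] < cinmb[-1])
```

Because the grouping is done on the jointly sampled draws rather than by re-running the model with one parameter fixed, correlations between parameters flow through to the cINMB automatically.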

The authors illustrate their method using three different parameters from a previously published cost-effectiveness analysis, in each case simulating 15,000 Monte Carlo ‘inner loops’ for each of the 99 centiles. It took me a little while to get my head around the results that are presented, so there’s still some work to do around explaining the visuals to decision-makers. Nevertheless, this approach has the potential to become standard practice.

A head-on ordinal comparison of the composite time trade-off and the better-than-dead method. Value in Health Published 19th December 2019

For years now, methodologists have been trying to find a reliable way to value health states ‘worse than dead’. The EQ-VT protocol, used to value the EQ-5D-5L, includes the composite time trade-off (cTTO). The cTTO task gives people the opportunity to trade away life years in good health to avoid having to subsequently live in a state that they have identified as being ‘worse than dead’ (i.e. they would prefer to die immediately rather than live in it). An alternative approach to this is the better-than-dead method, whereby people simply compare given durations in a health state to being dead. But are these two approaches measuring the same thing? This study sought to find out.

The authors recruited a convenience sample of 200 students and asked them to value seven different EQ-5D-5L health states that were close to zero in the Dutch tariff. Each respondent completed both a cTTO task and a better-than-dead task (the order varied) for each of the seven states. The analysis then looked at the extent to which there was agreement between the two methods in terms of whether states were identified as being better or worse than dead. Agreement was measured using counts and using polychoric correlations. Unsurprisingly, agreement was higher for those states that lay further from zero in the Dutch tariff. Around zero, there was quite a bit of disagreement – only 65% agreed for state 44343. Both approaches performed similarly with respect to consistency and test-retest reliability. Overall, the authors interpret these findings as meaning that the two methods are measuring the same underlying preferences.
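
A toy version of the agreement count is easy to write down. The numbers here are invented – two noisy readings of the same latent preference for a near-zero state – purely to show the mechanics of classifying each state as better or worse than dead under each method and counting matches.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical valuations of one near-zero health state by 200 respondents
# under the two methods: both are noisy versions of a shared latent value.
latent = rng.normal(0.0, 0.15, 200)
ctto = latent + rng.normal(0, 0.1, 200)
btd = latent + rng.normal(0, 0.1, 200)

# Classify the state as worse than dead (value < 0) under each method,
# then count how often the two classifications agree.
worse_ctto = ctto < 0
worse_btd = btd < 0
agreement = np.mean(worse_ctto == worse_btd)
print(f"{agreement:.0%} agreement")
```

Even with identical underlying preferences, measurement noise alone pulls agreement well below 100% for states near zero, which is worth bearing in mind when interpreting the 65% figure.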

I don’t find that very convincing. States were more often identified as worse than dead in the better-than-dead task, with 55% valued as such, compared with 37% in the cTTO. That seems like a big difference. The authors provide a variety of possible explanations for the differences, mostly relating to the way the tasks are framed. Or it might be that the complexity of the worse-than-dead task in the cTTO is so confusing and counterintuitive that respondents (intentionally or otherwise) avoid having to do it. For me, the findings reinforce the futility of trying to value health states in relation to being dead. If a slight change in methodology prevents a group of biomedical students from giving consistent assessments of whether or not a state is worse than being dead, what hope do we have?

# Rita Faria’s journal round-up for 4th November 2019

The marginal benefits of healthcare spending in the Netherlands: estimating cost-effectiveness thresholds using a translog production function. Health Economics [PubMed] Published 30th August 2019

The marginal productivity of the healthcare sector or, as commonly known, the supply-side cost-effectiveness threshold, is a hot topic right now. A few years ago, we could only guess at the magnitude of health that was displaced by reimbursing expensive and not-that-beneficial drugs. Since the seminal work by Karl Claxton and colleagues, we have started to have a pretty good idea of what we’re giving up.

This paper by Niek Stadhouders and colleagues adds to this literature by estimating the marginal productivity of hospital care in the Netherlands. Spoiler alert: they estimated that hospital care generates 1 QALY for around €74,000 at the margin, with a 95% confidence interval ranging from €53,000 to €94,000. Remarkably, this is close to the Dutch upper reference value for the cost-effectiveness threshold of €80,000!

The approach to estimation is quite elaborate, because it required constructing QALYs and costs and accounting for the effect of mortality on costs. The diagram in Figure 1 is excellent at explaining it. Their approach differs from the Claxton et al. method in that they corrected for the costs due to changes in mortality directly, rather than via an instrumental variable analysis. To estimate the marginal effect of spending on health, they use a translog production function. The confidence intervals are generated with Monte Carlo simulation, and various robustness checks are presented.
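
For intuition on what a translog specification buys you, here is a minimal Python sketch on invented regional data (the spending levels, coefficients, and resulting threshold are all made up, and the real paper’s estimation is far richer). The quadratic term in log spending lets the elasticity, and hence the cost per QALY, vary with the level of spending.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical regional data: hospital spending (euros) and log QALYs
# generated from a translog form, log Q = b0 + b1*log S + b2*(log S)^2.
n = 500
log_s = rng.normal(np.log(5e8), 0.3, n)
log_q = 2.0 + 0.8 * log_s - 0.015 * log_s**2 + rng.normal(0, 0.05, n)

# Centre log spending before fitting the quadratic, for numerical stability.
z = log_s - log_s.mean()
X = np.column_stack([np.ones(n), z, z**2])
a0, a1, a2 = np.linalg.lstsq(X, log_q, rcond=None)[0]

# At mean spending the elasticity d(log Q)/d(log S) is simply a1, and the
# marginal cost per QALY (the threshold) is (S / Q) divided by it.
s_bar, q_bar = np.exp(log_s.mean()), np.exp(a0)
cost_per_qaly = (s_bar / q_bar) / a1
print(round(a1, 2))
```

A Cobb-Douglas form would force the elasticity to be constant; the translog relaxes that, which matters when you want the threshold *at the margin* of current spending.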

This is a fantastic paper, which will be sure to have important policy implications. Analysts conducting cost-effectiveness analysis in the Netherlands, do take note.

Mixed-effects models for health care longitudinal data with an informative visiting process: a Monte Carlo simulation study. Statistica Neerlandica Published 5th September 2019

Electronic health records are the current big thing in health economics research, but they’re not without challenges. One issue is that the data reflect clinical management rather than a trial protocol. This means that doctors may test more severely ill patients more often. For example, people with higher cholesterol may get more frequent cholesterol tests. The challenge is that traditional methods for longitudinal data assume independence between the observation times and disease severity.

Alessandro Gasparini and colleagues set out to solve this problem. They propose using inverse intensity of visit weighting within a mixed-effects model framework. Importantly, they provide a Stata package that implements the method. It’s part of the wide-ranging and super-useful merlin package.

It was great to see how the method works with the directed acyclic graph. Essentially, after controlling for confounders, the longitudinal outcome and the observation process are associated through shared random effects. By assuming a distribution for the shared random effects, the model blocks the path between the outcome and the observation process. It makes it sound easy!
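
A small simulation shows why the informative visiting process matters in the first place. This is a deliberately stripped-down Python illustration (the intensity model and numbers are invented, and it uses a simple weighted mean rather than the authors’ mixed-effects machinery): sicker patients are measured more often, so a naive pooled average is biased, and reweighting each record by the inverse of its expected visit intensity removes most of the bias.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical cohort: each patient has a true underlying value, and the
# number of recorded visits rises with that value (informative visiting).
n_patients = 2_000
severity = rng.normal(0.0, 1.0, n_patients)
intensity = np.exp(0.5 + 0.5 * severity)          # expected visits per patient
visits = rng.poisson(intensity)

# One record per visit: the naive pooled mean over records is pulled
# towards the sicker, more frequently observed patients.
recorded = np.repeat(severity, visits)
naive_mean = recorded.mean()

# Inverse intensity of visit weighting: weight each record by 1/intensity,
# which undoes the over-representation of frequently visited patients.
weights = np.repeat(1.0 / intensity, visits)
iiw_mean = np.average(recorded, weights=weights)
print(round(naive_mean, 2), round(iiw_mean, 2))
```

The true population mean here is zero; the naive mean lands well above it while the reweighted mean is close. The paper’s contribution is to embed this reweighting within mixed-effects models, with the shared random effects handling the residual dependence.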

The paper goes through the method, compares it with other methods from the literature in a simulation study, and applies it to a real case study. It’s a brilliant paper that deserves a close look by all of those using electronic health records.

Would you like to use a propensity score method but don’t know where to start? Look no further! This paper by Rishi Desai and Jessica Franklin provides a practical guide to propensity score methods.

They start by explaining what a propensity score is and how it can be used, from matching to reweighting and regression adjustment. I particularly enjoyed reading about the importance of conceptualising the target of inference, that is, which treatment effect we are trying to estimate. In the medical literature, it is rare to see a paper that is clear on whether this is the average treatment effect or the average treatment effect in the treated population.

I found the algorithm for method selection really useful. Here, Rishi and Jessica describe the steps in the choice of the propensity score method and recommend their preferred method for each situation. The paper also includes the application of each method to the example of dabigatran versus warfarin for atrial fibrillation. Thanks to the graphs, we can visualise how the distribution of the propensity score changes for each method and depending on the target of inference.
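
To illustrate the reweighting branch of the toolkit, here is a minimal inverse probability of treatment weighting (IPTW) sketch in Python, targeting the ATE. It is a generic textbook illustration on simulated data, not a reproduction of Rishi and Jessica’s algorithm or their dabigatran example; the logistic propensity model is fitted with a hand-rolled Newton loop to keep things self-contained.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical observational data: one confounder drives both treatment
# assignment and outcome, so the raw difference in means is biased.
n = 5_000
x = rng.normal(0, 1, n)
p_treat = 1 / (1 + np.exp(-0.8 * x))               # true propensity score
treat = rng.binomial(1, p_treat)
outcome = 1.0 * treat + 2.0 * x + rng.normal(0, 1, n)  # true ATE = 1.0

# Estimate the propensity score by logistic regression (Newton's method).
X = np.column_stack([np.ones(n), x])
beta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    grad = X.T @ (treat - p)
    hess = X.T @ (X * (p * (1 - p))[:, None])
    beta += np.linalg.solve(hess, grad)
ps = 1 / (1 + np.exp(-X @ beta))

# IPTW for the ATE: weight the treated by 1/ps, controls by 1/(1 - ps).
ate = (np.average(outcome[treat == 1], weights=1 / ps[treat == 1])
       - np.average(outcome[treat == 0], weights=1 / (1 - ps[treat == 0])))
naive = outcome[treat == 1].mean() - outcome[treat == 0].mean()
print(round(naive, 2), round(ate, 2))
```

The naive difference is badly inflated by confounding, while the weighted contrast recovers something close to the true effect of 1.0. Matching on the same score would instead target the effect among the treated, which is exactly the target-of-inference distinction the paper stresses.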

This is an excellent paper for those starting their propensity score analyses, or for those who would like a refresher. It’s a keeper!

# Sam Watson’s journal round-up for 10th September 2018

Probabilistic sensitivity analysis in cost-effectiveness models: determining model convergence in cohort models. PharmacoEconomics [PubMed] Published 27th July 2018

Probabilistic sensitivity analysis (PSA) is rightfully a required component of economic evaluations. Deterministic sensitivity analyses are generally biased; averaging the outputs of a model based on a choice of values from a complex joint distribution is not likely to be a good reflection of the true model mean. PSA involves repeatedly sampling parameters from their respective distributions and analysing the resulting model outputs. But how many times should you do this? Most times, an arbitrary number is selected that seems “big enough”, say 1,000 or 10,000. But these simulations themselves exhibit variance; so-called Monte Carlo error. This paper discusses making the choice of the number of simulations more formal by assessing the “convergence” of simulation output.

In the same way as sample sizes are chosen for trials, the number of simulations should provide an adequate level of precision; anything more wastes resources without improving inferences. For example, if the statistic of interest is the net monetary benefit, then we would want the confidence interval (CI) to exclude zero, as this should be a sufficient level of certainty for an investment decision. The paper therefore proposes conducting a number of simulations, examining whether the CI is ‘narrow enough’, and conducting further simulations if it is not. However, I see a problem with this proposal: the variance of a statistic from a sequence of simulations itself has variance. The stopping points at which we might check the CI are themselves arbitrary: additional simulations can increase the width of the CI as well as reduce it. Consider a set of simulations from a simple ratio of random variables, $ICER = \mathrm{Gamma}(1, 0.01) / \mathrm{Normal}(0.01, 0.01)$. The “stopping rule” proposed doesn’t necessarily indicate “convergence”, as a few more simulations could lead to a wider, as well as a narrower, CI. The heuristic approach is undoubtedly an improvement on the current way things are usually done, but I think there is scope here for a more rigorous method of assessing convergence in PSA.
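
The point is easy to demonstrate in Python. This sketch draws the ratio above (reading the second arguments as the scale and standard deviation, an assumption on my part) and tracks the width of the 95% CI of the mean at a few arbitrary checkpoints; because the denominator crosses zero, the ratio is heavy-tailed and the width need not shrink as simulations accumulate.

```python
import numpy as np

rng = np.random.default_rng(6)

# The ratio from the text: Gamma(1, 0.01) over Normal(0.01, 0.01).
# A denominator that can sit arbitrarily close to zero makes the ratio
# heavy-tailed, so Monte Carlo error behaves erratically.
n_total = 50_000
icer = rng.gamma(1.0, 0.01, n_total) / rng.normal(0.01, 0.01, n_total)

# Width of the 95% CI of the mean at successive stopping points.
widths = []
for n in (1_000, 5_000, 10_000, 25_000, 50_000):
    sample = icer[:n]
    se = sample.std(ddof=1) / np.sqrt(n)   # Monte Carlo standard error
    widths.append(2 * 1.96 * se)

# If the width grows between checkpoints, a rule triggered at the earlier
# checkpoint would have declared "convergence" prematurely.
print([f"{w:.2f}" for w in widths])
```

A single extreme draw (a denominator very near zero) can blow the standard error back up long after the CI first looked ‘narrow enough’, which is exactly the weakness of checkpoint-based stopping rules.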

Mortality due to low-quality health systems in the universal health coverage era: a systematic analysis of amenable deaths in 137 countries. The Lancet [PubMed] Published 5th September 2018

Richard Horton, the oracular editor-in-chief of the Lancet, tweeted last week about the role of journals in advocacy.

There is certainly an argument that academic journals are good forums to make advocacy arguments. Who better to interpret the analyses presented in these journals than the authors and audiences themselves? But, without a strict editorial bulkhead between analysis and opinion, we run the risk that the articles and their content are influenced or dictated by the political whims of editors rather than scientific merit. Unfortunately, I think this article is evidence of that.

No-one debates that improving health care quality will improve patient outcomes and experience. It is in the very definition of ‘quality’. This paper aims to estimate the numbers of deaths each year due to ‘poor quality’ in low- and middle-income countries (LMICs). The trouble with this is two-fold: given the number of unknown quantities required to get a handle on this figure, the definition of quality notwithstanding, the uncertainty around this figure should be incredibly high (see below); and, attributing these deaths in a causal way to a nebulous definition of ‘quality’ is tenuous at best. The approach of the article is, in essence, to assume that the differences in fatality rates of treatable conditions between LMICs and the best performing health systems on Earth, among people who attend health services, are entirely caused by ‘poor quality’. This definition of quality would therefore seem to encompass low resourcing, poor supply of human resources, a lack of access to medicines, as well as everything else that’s different in health systems. Then, to get to this figure, the authors have multiple sources of uncertainty including:

• Using a range of proxies for health care utilisation;
• Using global burden of disease epidemiology estimates, which have associated uncertainty;
• A number of data slicing decisions, such as truncating case fatality rates;
• Estimating utilisation rates based on a predictive model;
• Estimating the case-fatality rate for non-users of health services based on other estimated statistics.

Despite this, the authors claim to estimate a 95% uncertainty interval with a width of only 300,000 people, with a mean estimate of 5.0 million, due to ‘poor quality’. This seems highly implausible, and yet it is claimed to be a causal effect of an undefined ‘poor quality’. The timing of this article coincides with the Lancet Commission on care quality in LMICs and, one suspects, had it not been for the advocacy angle on care quality, it would not have been published in this journal.

Embedding as a pitfall for survey‐based welfare indicators: evidence from an experiment. Journal of the Royal Statistical Society: Series A Published 4th September 2018

Health economists will be well aware of the various measures used to evaluate welfare and well-being. Surveys are typically used that comprise questions relating to a number of different dimensions. These could include emotional and social well-being or physical functioning. Similar types of surveys are also used to collect population preferences over states of the world or policy options; for example, Kahneman and Knetsch conducted a survey of WTP for different environmental policies. These surveys can exhibit what is called an ‘embedding effect’, which Kahneman and Knetsch described as when the value of a good varies “depending on whether the good is assessed on its own or embedded as part of a more inclusive package.” That is to say that the way people value single-dimensional attributes or qualities can be distorted when they’re embedded as part of a multi-dimensional choice. This article reports the results of an experiment involving students who were asked to weight the relative importance of different dimensions of the Better Life Index, including jobs, housing, and income. The randomised treatment was whether they rated ‘jobs’ as a single category, or were presented with individual dimensions, such as the unemployment rate and job security. The experiment shows strong evidence of embedding – the overall weighting substantially differed by treatment. This, the authors conclude, means that the Better Life Index fails to accurately capture preferences and is subject to manipulation should a researcher be so inclined – if you want evidence to say your policy is the most important, just change the way the dimensions are presented.
