Chris Sampson’s journal round-up for 11th March 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Identification, review, and use of health state utilities in cost-effectiveness models: an ISPOR Good Practices for Outcomes Research Task Force report. Value in Health [PubMed] Published 1st March 2019

When modellers select health state utility values to plug into their models, they often do it in an ad hoc and unsystematic way. This ISPOR Task Force report seeks to address that.

The authors discuss the process of searching, reviewing, and synthesising utility values. Searches need to use iterative techniques because evidence requirements develop as a model develops. Due to the scope of models, it may be necessary to develop multiple search strategies (for example, for different aspects of disease pathways). Searches needn’t be exhaustive, but they should be systematic and transparent. The authors provide a list of factors that should be considered in defining search criteria. In reviewing utility values, both quality and appropriateness should be considered. Quality is indicated by the precision of the evidence, the response rate, and missing data. Appropriateness relates to the extent to which the evidence being reviewed conforms to the context of the model in which it is to be used. This includes factors such as the characteristics of the study population, the measure used, value sets used, and the timing of data collection. When it comes to synthesis, the authors suggest it might not be meaningful in most cases, because of variation in methods. We can’t pool values if they aren’t (at least roughly) equivalent. Therefore, one approach is to employ strict inclusion criteria (e.g. only EQ-5D, only a particular value set), but this isn’t likely to leave you with much. Meta-regression can be used to analyse more dissimilar utility values and provide insight into the impact of methodological differences. But the extent to which this can provide pooled values for a model is questionable, and the authors concede that more research is needed.

This paper can inform that future research. Not least in its attempt to specify minimum reporting standards. We have another checklist, with another acronym (SpRUCE). The idea isn’t so much that this will guide publications of systematic reviews of utility values, but rather that modellers (and model reviewers) can use it to assess whether the selection of utility values was adequate. The authors then go on to offer methodological recommendations for using utility values in cost-effectiveness models, considering issues such as modelling technique, comorbidities, adverse events, and sensitivity analysis. It’s early days, so the recommendations in this report ought to be changed as methods develop. Still, it’s a first step away from the ad hoc selection of utility values that (no doubt) drives the results of many cost-effectiveness models.

Estimating the marginal cost of a life year in Sweden’s public healthcare sector. The European Journal of Health Economics [PubMed] Published 22nd February 2019

It’s only recently that health economists have gained access to data that enables the estimation of the opportunity cost of health care expenditure on a national level; what is sometimes referred to as a supply-side threshold. We’ve seen studies in the UK, Spain, Australia, and here we have one from Sweden.

The authors use data on health care expenditure at the national (1970-2016) and regional (2003-2016) level, alongside estimates of remaining life expectancy by age and gender (1970-2016). First, they try a time series analysis, testing the nature of causality. Finding an apparently causal relationship between longevity and expenditure, the authors don’t take it any further. Instead, the results are based on a panel data analysis, employing similar methods to estimates generated in other countries. The authors propose a conceptual model to support their analysis, which distinguishes it from other studies. In particular, the authors assert that the majority of the impact of expenditure on mortality operates through morbidity, which changes how the model should be specified. The number of newly graduated nurses is used as an instrument indicative of a supply-shift at the national rather than regional level. The models control for socioeconomic and demographic factors and morbidity not amenable to health care.

The authors estimate the marginal cost of a life year by dividing health care expenditure by the expenditure elasticity of life expectancy, finding an opportunity cost of €38,812 (with a massive 95% confidence interval). Using Swedish population norms for utility values, this would translate into around €45,000/QALY.
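The arithmetic behind these figures can be sketched in a few lines. This is only an illustration, assuming the elasticity is defined as the percentage change in life expectancy for a one percent change in expenditure; the function names and the example inputs are hypothetical and are not taken from the paper. The only numbers from the text are the €38,812 and roughly €45,000 figures, which together imply an average utility weight of about 0.86.

```python
def marginal_cost_per_life_year(expenditure, elasticity, life_years):
    """Cost of one extra life year implied by an expenditure elasticity.

    A 1% rise in spending costs 0.01 * expenditure and, by the definition
    of the elasticity assumed here, buys 0.01 * elasticity * life_years
    additional life years. The ratio of the two is returned.
    """
    if elasticity <= 0 or life_years <= 0:
        raise ValueError("elasticity and life_years must be positive")
    return expenditure / (elasticity * life_years)


def cost_per_qaly(cost_per_life_year, mean_utility):
    """Convert a cost per life year to a cost per QALY with a utility weight."""
    if not 0 < mean_utility <= 1:
        raise ValueError("mean_utility must lie in (0, 1]")
    return cost_per_life_year / mean_utility


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
example = marginal_cost_per_life_year(
    expenditure=1000.0, elasticity=0.05, life_years=20.0
)

# Utility weight implied by the two figures quoted in the text.
implied_utility = 38812 / 45000
print(f"implied mean utility: {implied_utility:.3f}")
```

Because the utility weight is below one, the cost per QALY is always higher than the cost per life year.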

The analysis is considered and makes plain the difficulty of estimating the marginal productivity of health care expenditure. It looks like a nail in the coffin for the idea of estimating opportunity costs using time series. For now, at least, estimates of opportunity cost will be based on variation according to geography, rather than time. In their excellent discussion, the authors are candid about the limitations of their model. Their instrument wasn’t perfect and it looks like there may have been important confounding variables that they couldn’t control for.

Frequentist and Bayesian meta‐regression of health state utilities for multiple myeloma incorporating systematic review and analysis of individual patient data. Health Economics [PubMed] Published 20th February 2019

The first paper in this round-up was about improving practice in the systematic review of health state utility values, and it indicated the need for more research on the synthesis of values. Here, we have some. In this study, the authors conduct a meta-analysis of utility values alongside an analysis of registry and clinical study data for multiple myeloma patients.

A literature search identified 13 ‘methodologically appropriate’ papers, providing 27 health state utility values. The EMMOS registry included data for 2,445 patients in 22 countries and the APEX clinical study included 669 patients, all with EQ-5D-3L data. The authors implement both a frequentist meta-regression and a Bayesian model. In both cases, the models were run including all values and then with a limited set of only EQ-5D values. These models predicted utility values based on the number of treatment classes received and the rate of stem cell transplant in the sample. The priors used in the Bayesian model were based on studies that reported general utility values for the presence of disease (rather than according to treatment).

The frequentist models showed that utility was low at diagnosis, higher at first treatment, and lower at each subsequent treatment. Stem cell transplant had a positive impact on utility values independent of the number of previous treatments. The results of the Bayesian analysis were very similar, which the authors suggest is due to weak priors. An additional Bayesian model was run with preferred data but vague priors, to assess the sensitivity of the model to the priors. At later stages of disease (for which data were more sparse), there was greater uncertainty. The authors provide predicted values from each of the five models, according to the number of treatment classes received. The models provide slightly different results, except in the case of newly diagnosed patients (where the difference was 0.001). For example, the ‘EQ-5D only’ frequentist model gave a value of 0.659 for one treatment, while the Bayesian model gave a value of 0.620.

I’m not sure that the study satisfies the recommendations outlined in the ISPOR Task Force report described above (though that would be an unfair challenge, given the timing of publication). We’re told very little about the nature of the studies that are included, so it’s difficult to judge whether they should have been combined in this way. However, the authors state that they have made their data extraction and source code available online, which means I could check that out (though, having had a look, I can’t find the material that the authors refer to, reinforcing my hatred for the shambolic ‘supplementary material’ ecosystem). The main purpose of this paper is to progress the methods used to synthesise health state utility values, and it does that well. Predictably, the future is Bayesian.


Sam Watson’s journal round-up for 11th February 2019


Contest models highlight inherent inefficiencies of scientific funding competitions. PLoS Biology [PubMed] Published 2nd January 2019

If you work in research you will have no doubt thought to yourself at one point that you spend more time applying to do research than actually doing it. You can spend weeks working on (what you believe to be) a strong proposal only for it to fail against other strong bids. That time could have been spent collecting and analysing data. Indeed, the opportunity cost of writing extensive proposals can be very high. The question arises as to whether there is another method of allocating research funding that reduces this waste and inefficiency. This paper compares the proposal competition to a partial lottery. In this lottery system, proposals are short, and among those that meet some qualifying standard those that are funded are selected at random. This system has the benefit of not taking up too much time but has the cost of reducing the average scientific value of the winning proposals. The authors compare the two approaches using an economic model of contests, which takes into account factors like proposal strength, public benefits, benefits to the scientist like reputation and prestige, and scientific value. Ultimately they conclude that, when the number of awards is smaller than the number of proposals worthy of funding, the proposal competition is inescapably inefficient. It means that researchers have to invest heavily to get a good project funded, and even if it is good enough it may still not get funded. The stiffer the competition the more researchers have to work to win the award. And what little evidence there is suggests that the format of the application makes little difference to the amount of time spent by researchers on writing it. The lottery mechanism only requires the researcher to propose something that is good enough to get into the lottery. Far less time would therefore be devoted to writing it and more time spent on actual science. I’m all for it!

Preventability of early versus late hospital readmissions in a national cohort of general medicine patients. Annals of Internal Medicine [PubMed] Published 5th June 2018

Hospital quality is hard to judge. We’ve discussed on this blog before the pitfalls of using measures such as adjusted mortality differences for this purpose. Just because a hospital has higher than expected mortality does not mean those deaths could have been prevented with higher quality care. More thorough methods assess errors and preventable harm in care. Case note review studies have suggested as little as 5% of deaths might be preventable in England and Wales. Another paper we have covered previously suggests that the predictive value of standardised mortality ratios for preventable deaths may therefore be less than 10%.

Another commonly used metric is readmission rates. Poor care can mean patients have to return to the hospital. But again, the question remains as to how preventable these readmissions are. Indeed, there may also be substantial differences between those patients who are readmitted shortly after discharge and those for whom it may take a longer time. This article explores the preventability of early and late readmissions in ten hospitals in the US. It uses case note review and a number of reviewers to evaluate preventability. The headline figures are that 36% of early readmissions are considered preventable compared to 23% of late readmissions. Moreover, it was considered that the early readmissions were most likely to have been preventable at the hospital whereas for late readmissions, an outpatient clinic or the home would have had more impact. All in all, another paper which provides evidence to suggest crude, or even adjusted rates, are not good indicators of hospital quality.

Visualisation in Bayesian workflow. Journal of the Royal Statistical Society: Series A (Statistics in Society) [RePEc] Published 15th January 2019

This article stems from a broader programme of work from these authors on good “Bayesian workflow”. That is to say, if we’re taking a Bayesian approach to analysing data, what steps ought we to be taking to ensure our analyses are as robust and reliable as possible? I’ve been following this work for a while as this type of pragmatic advice is invaluable. I’ve often read empirical papers where the authors have chosen, say, a logistic regression model with covariates x, y, and z and reported the outcomes, but at no point ever justified why this particular model might be any good at all for these data or the research objective. The key steps of the workflow include, first, exploratory data analysis to help set up a model, and second, performing model checks before estimating model parameters. This latter step is important: one can generate data from a model and set of prior distributions, and if the data that this model generates looks nothing like what we would expect the real data to look like, then clearly the model is not very good. Following this, we should check whether our inference algorithm is doing its job, for example, are the MCMC chains converging? We can also conduct posterior predictive model checks. These have had their criticisms in the literature for using the same data to both estimate and check the model, which could lead to the model generalising poorly to new data. Indeed, in a recent paper of my own, posterior predictive checks showed poor fit of a model to my data and that a more complex alternative was better fitting. But other model fit statistics, which penalise numbers of parameters, led to the opposite conclusion. So the simpler model was preferred on the grounds that the more complex model was overfitting the data. So I would argue posterior predictive model checks are a sensible test to perform but must be interpreted carefully as one step among many. Finally, we can compare models using tools like cross-validation.
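The check on data generated from the priors alone can be sketched very simply. The example below is a minimal illustration rather than anything taken from the article: the logistic model, the standardised covariate, and the two prior standard deviations are all assumptions made for the sake of the example. It simulates event probabilities from the priors and reports how many of them land almost exactly on 0 or 1, which is rarely what we expect real data to look like.

```python
import math
import random


def inv_logit(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def prior_predictive_probs(prior_sd, n_draws=5000, seed=1):
    """Draw event probabilities implied by the priors alone (no data used).

    Hypothetical model: logit(p) = a + b * x, with a, b ~ Normal(0, prior_sd)
    and a standardised covariate x ~ Normal(0, 1).
    """
    rng = random.Random(seed)
    probs = []
    for _ in range(n_draws):
        a = rng.gauss(0.0, prior_sd)
        b = rng.gauss(0.0, prior_sd)
        x = rng.gauss(0.0, 1.0)
        probs.append(inv_logit(a + b * x))
    return probs


def share_extreme(probs, cutoff=0.01):
    """Fraction of simulated probabilities within `cutoff` of 0 or 1."""
    return sum(p < cutoff or p > 1 - cutoff for p in probs) / len(probs)


vague = share_extreme(prior_predictive_probs(prior_sd=10.0))
weakly_informative = share_extreme(prior_predictive_probs(prior_sd=1.0))
print(f"sd=10: {vague:.2f} of draws are near 0 or 1")
print(f"sd=1:  {weakly_informative:.2f} of draws are near 0 or 1")
```

A supposedly vague prior on the logit scale turns out to be a strong statement on the probability scale, which is the kind of problem this step is meant to reveal before any parameters are estimated.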

This article discusses the use of visualisation to aid in this workflow. They use the running example of building a model to estimate exposure to small particulate matter from air pollution across the world. Plots are produced for each of the steps and show just how bad some models can be and how we can refine our model step by step to arrive at a convincing analysis. I agree wholeheartedly with the authors when they write, “Visualization is probably the most important tool in an applied statistician’s toolbox and is an important complement to quantitative statistical procedures.”


Thesis Thursday: Anna Heath

On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Anna Heath who has a PhD from University College London. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.

Title
Bayesian computations for value of information measures using Gaussian processes, INLA and Moment Matching
Supervisors
Gianluca Baio, Ioanna Manolopoulou
Repository link
http://discovery.ucl.ac.uk/id/eprint/10050229

Why are new methods needed for value of information analysis?

Value of Information (VoI) has been around for a really long time – it was first mentioned in a book published in 1959! More recently, it has been suggested that VoI methods can be used in health economics to direct and design future research strategies. There are several different concepts in VoI analysis and each of these can be used to answer different questions. The VoI measure with the most potential calculates the economic benefit of collecting additional data to inform a health economic model (known as the expected value of sample information, or EVSI). The EVSI can be compared with the cost of collecting data, which allows us to make sure that our clinical research is “cost-effective”.

The problem is that, mathematically, VoI measures are almost impossible to calculate, so we have to use simulation. Traditionally, these simulation methods have been very slow (in my PhD, one example took over 300 days to compute 10 VoI measures) so we need simulation methods that speed up the computation significantly before VoI can be used for decisions about research design and funding.

Do current EVPPI and EVSI estimation methods give different results?

For most examples, the current estimation methods give similar results but the computational time to obtain these results differs significantly. Since starting my PhD, different estimation methods for the EVPPI (expected value of partial perfect information) and the EVSI have been published. The differences between these methods lie in their assumptions and their ease of use. The results seem to be pretty stable for all the different methods, which is good!

The EVPPI determines which model parameters have the biggest impact on the cost-effectiveness of the different treatments. This is used to direct possible avenues of future research, i.e. we should focus on gaining more information about parameters with a large impact on cost-effectiveness. The EVPPI is calculated based only on simulations of the model parameters so the number of methods for EVPPI calculation is quite small. To calculate the EVSI, you need to consider how to collect additional data, through a clinical trial, observational study etc, so there is a wider range of available methods.

How does the Gaussian process you develop improve EVPPI estimation?

Before my PhD started, Mark Strong and colleagues at the University of Sheffield developed a method to calculate the EVPPI based on flexible regression. This method is accurate but when you want to calculate the value of a group of model parameters, the computational time increases significantly. A Gaussian process is a method for very flexible regression but could be slow when trying to calculate the EVPPI for a group of parameters. The method we developed adapted the Gaussian process to speed up computation when calculating the EVPPI for a group of parameters. The size of the group of parameters does not really make a difference to the computation for this method, so we allowed for fast EVPPI computation in nearly all practical examples!
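The regression idea can be illustrated with a rough sketch. Note the substitution: ordinary least squares on a single parameter stands in here for the flexible regression (and the Gaussian process) described above, and the simulated net benefits are hypothetical rather than output from a real model. The function names are illustrative only.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def evppi_by_regression(theta, net_benefits):
    """EVPPI for one parameter from probabilistic analysis samples.

    theta: sampled values of the parameter of interest.
    net_benefits: one list of net benefit samples per decision option.
    """
    fitted = []
    for nb in net_benefits:
        intercept, slope = fit_line(theta, nb)
        fitted.append([intercept + slope * t for t in theta])
    # Expected value of choosing the best option after learning theta ...
    value_with_info = statistics.fmean([max(row) for row in zip(*fitted)])
    # ... minus the value of choosing now, on current information.
    value_now = max(statistics.fmean(f) for f in fitted)
    return value_with_info - value_now


# Hypothetical simulations: option 0 is a reference with zero net benefit,
# option 1 depends on theta plus noise from the other model parameters.
rng = random.Random(42)
theta = [rng.gauss(0.0, 1.0) for _ in range(5000)]
nb_reference = [0.0] * len(theta)
nb_new = [1000.0 * t + rng.gauss(0.0, 2000.0) for t in theta]
evppi = evppi_by_regression(theta, [nb_reference, nb_new])
print(f"EVPPI estimate: {evppi:.0f}")
```

The regression averages out the noise from the other parameters, so only the existing simulations are needed and no nested simulation is run. A straight line is only adequate when net benefit is roughly linear in the parameter, which is why a more flexible regression is used in practice.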

What is moment matching, and how can it be used to estimate EVSI?

Moments define the shape of a distribution – the first moment is the mean, the second the variance, the third is the skewness and so on. To estimate the EVSI, we need to estimate a distribution with some specific properties. We can show that this distribution is similar to the distribution of the net benefit from a probabilistic sensitivity analysis. Moment matching is a fancy way of saying that we estimate the EVSI by changing the distribution of the net benefit so it has the same variance as the distribution needed to estimate the EVSI. This significantly decreases the computation time for the EVSI because traditionally we would estimate the distribution for the EVSI using a large number of simulations (I’ve used 10 billion simulations for one estimate).
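The rescaling step can be sketched as follows, for the simple case of a new treatment compared with a single alternative. This is an illustration under stated assumptions, not the thesis software: the target variance is simply taken as given (estimating it is the part of the method not shown here), and the function names and simulated values are hypothetical.

```python
import math
import random
import statistics


def rescale_to_variance(samples, target_variance):
    """Shift and scale samples so they keep their mean but have target_variance."""
    mean = statistics.fmean(samples)
    sd = statistics.pstdev(samples)
    scale = math.sqrt(target_variance) / sd
    return [mean + scale * (s - mean) for s in samples]


def evsi_two_options(inb_samples, target_variance):
    """EVSI sketch for a new treatment versus a comparator.

    inb_samples: simulated incremental net benefit from the probabilistic
        sensitivity analysis.
    target_variance: variance of the expected incremental net benefit
        across possible future datasets (taken as given here).
    """
    matched = rescale_to_variance(inb_samples, target_variance)
    value_with_sample = statistics.fmean([max(0.0, m) for m in matched])
    value_now = max(0.0, statistics.fmean(inb_samples))
    return value_with_sample - value_now


# Hypothetical simulations and hypothetical target variances.
rng = random.Random(7)
inb = [rng.gauss(500.0, 3000.0) for _ in range(10000)]
psa_variance = statistics.pvariance(inb)
evsi_small_study = evsi_two_options(inb, 0.2 * psa_variance)
evsi_large_study = evsi_two_options(inb, 0.8 * psa_variance)
print(f"smaller study: {evsi_small_study:.0f}, larger study: {evsi_large_study:.0f}")
```

A more informative study explains more of the current uncertainty, so its target variance is closer to the variance of the original net benefit simulations and its EVSI is larger.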

The really cool thing about this method is that we extended it to use the EVSI to find the trial design and sample size that gives the maximum value for money from research investment resources. The computation time for this analysis was around 5 minutes whereas the traditional method took over 300 days!

Do jobbing health economists need to be experts in value of information analysis to use your BCEA and EVSI software?

The BCEA software uses the costs and effects calculated from a probabilistic health economic model alongside the probabilistic analysis for the model parameters to give standard graphics and summaries. It is based in R and can be used to calculate the EVPPI without being an expert in VoI methods and analysis. All you need is to decide which model parameters you are interested in valuing. We’ve put together a Web interface, BCEAweb, which allows you to use BCEA without using R.

The EVSI software requires a model that incorporates how the data from the future study will be analysed. This can be complicated to design although I’m currently putting together a library of standard examples. Once you’ve designed the study, the software calculates the EVSI without any input from the user, so you don’t need to be an expert in the calculation methods. The software also provides graphics to display the EVSI results and includes text to help interpret the graphical results. An example of the graphical output can be seen here.