Rita Faria’s journal round-up for 4th November 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

The marginal benefits of healthcare spending in the Netherlands: estimating cost-effectiveness thresholds using a translog production function. Health Economics [PubMed] Published 30th August 2019

The marginal productivity of the healthcare sector or, as commonly known, the supply-side cost-effectiveness threshold, is a hot topic right now. A few years ago, we could only guess at the magnitude of health that was displaced by reimbursing expensive and not-that-beneficial drugs. Since the seminal work by Karl Claxton and colleagues, we have started to have a pretty good idea of what we’re giving up.

This paper by Niek Stadhouders and colleagues adds to this literature by estimating the marginal productivity of hospital care in the Netherlands. Spoiler alert: they estimated that hospital care generates 1 QALY for around €74,000 at the margin, with a 95% confidence interval ranging from €53,000 to €94,000. Remarkably, it’s close to the Dutch upper reference value for the cost-effectiveness threshold at €80,000!

The approach for estimation is quite elaborate because it required building QALYs and costs, and accounting for the effect of mortality on costs. The diagram in Figure 1 is excellent in explaining it. Their approach is different from the Claxton et al. method, in that they corrected for the cost due to changes in mortality directly rather than via an instrumental variable analysis. To estimate the marginal effect of spending on health, they use a translog function. The confidence intervals are generated with Monte Carlo simulation and various robustness checks are presented.
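For anyone who hasn't met a translog function before, here is a minimal sketch of the idea. It assumes a single input (spending) and coefficients that would come from a regression; it is not the specification that Stadhouders and colleagues estimated, and the function names and the parameters `b0`, `b1` and `b2` are my own. The point is simply that the elasticity of output with respect to spending is allowed to change with the level of spending, and that the cost of one more unit of output at the margin is the inverse of the marginal product.

```python
import math


def translog_output(x, b0, b1, b2):
    """Single-input translog: ln y = b0 + b1*ln x + 0.5*b2*(ln x)^2."""
    lx = math.log(x)
    return math.exp(b0 + b1 * lx + 0.5 * b2 * lx ** 2)


def elasticity(x, b1, b2):
    """d ln y / d ln x. It varies with spending whenever b2 is not zero."""
    return b1 + b2 * math.log(x)


def marginal_cost_per_unit(x, b0, b1, b2):
    """Spending needed for one more unit of output at the margin: 1 / (dy/dx)."""
    y = translog_output(x, b0, b1, b2)
    marginal_product = elasticity(x, b1, b2) * y / x
    return 1.0 / marginal_product
```

With `b2 = 0` this collapses to a constant-elasticity (Cobb-Douglas) function, which is one way to see what the extra translog term buys you.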

This is a fantastic paper, which will be sure to have important policy implications. Analysts conducting cost-effectiveness analysis in the Netherlands, do take note.

Mixed-effects models for health care longitudinal data with an informative visiting process: a Monte Carlo simulation study. Statistica Neerlandica Published 5th September 2019

Electronic health records are the current big thing in health economics research, but they’re not without challenges. One issue is that the data reflects the clinical management, rather than a trial protocol. This means that doctors may test more severe patients more often. For example, people with higher cholesterol may get more frequent cholesterol tests. The challenge is that traditional methods for longitudinal data assume independence between observation times and disease severity.

Alessandro Gasparini and colleagues set out to solve this problem. They propose using inverse intensity of visit weighting within a mixed-effects model framework. Importantly, they provide a Stata package that includes the method. It’s part of the wide-ranging and super-useful merlin package.

It was great to see how the method works with the directed acyclic graph. Essentially, after controlling for confounders, the longitudinal outcome and the observation process are associated through shared random effects. By assuming a distribution for the shared random effects, the model blocks the path between the outcome and the observation process. It makes it sound easy!

The paper goes through the method, compares it with other methods in the literature in a simulation study, and applies it to a real case study. It’s a brilliant paper that deserves a close look by all of those using electronic health records.

Alternative approaches for confounding adjustment in observational studies using weighting based on the propensity score: a primer for practitioners. BMJ [PubMed] Published 23rd October 2019

Would you like to use a propensity score method but don’t know where to start? Look no further! This paper by Rishi Desai and Jessica Franklin provides a practical guide to propensity score methods.

They start by explaining what a propensity score is and how it can be used, from matching to reweighting and regression adjustment. I particularly enjoyed reading about the importance of conceptualising the target of inference, that is, what treatment effect are we trying to estimate. In the medical literature, it is rare to see a paper that is clear on whether it is average treatment effect or average treatment effect among the treated population.
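To make the distinction concrete, here is a small sketch of how the weights differ depending on the target of inference. These are the standard textbook weights rather than anything specific to the paper, and the function names and the `rows` layout are my own. It assumes the propensity score has already been estimated and lies strictly between 0 and 1.

```python
def ate_weight(treated, ps):
    """Inverse probability of treatment weight, targeting the average treatment effect."""
    return 1.0 / ps if treated else 1.0 / (1.0 - ps)


def att_weight(treated, ps):
    """Weight targeting the average treatment effect among the treated."""
    return 1.0 if treated else ps / (1.0 - ps)


def weighted_mean_difference(rows, weight_fn):
    """rows: iterable of (treated, propensity_score, outcome) tuples."""
    totals = {True: [0.0, 0.0], False: [0.0, 0.0]}  # [sum of w*y, sum of w]
    for treated, ps, outcome in rows:
        w = weight_fn(treated, ps)
        totals[bool(treated)][0] += w * outcome
        totals[bool(treated)][1] += w
    treated_mean = totals[True][0] / totals[True][1]
    control_mean = totals[False][0] / totals[False][1]
    return treated_mean - control_mean
```

Swapping `ate_weight` for `att_weight` in the last function changes the population the estimate refers to, which is exactly why a paper should say which one it used.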

I found the algorithm for method selection really useful. Here, Rishi and Jessica describe the steps in the choice of the propensity score method and recommend their preferred method for each situation. The paper also includes the application of each method to the example of dabigatran versus warfarin for atrial fibrillation. Thanks to the graphs, we can visualise how the distribution of the propensity score changes for each method and depending on the target of inference.

This is an excellent paper for those starting their propensity score analyses, or for those who would like a refresher. It’s a keeper!


Brendan Collins’s journal round-up for 18th March 2019


Evaluation of intervention impact on health inequality for resource allocation. Medical Decision Making [PubMed] Published 28th February 2019

How should decision-makers factor equity impacts into economic decisions? Can we trade off an intervention’s cost-effectiveness with its impact on unfair health inequalities? Is a QALY just a QALY or should we weight it more if it is gained by someone from a disadvantaged group? Can we assume that, because people of lower socioeconomic position lose more QALYs through ill health, most interventions should, by default, reduce inequalities?

I really like the health equity plane. This is where you show health impacts (usually including a summary measure of cost-effectiveness like net health benefit or net monetary benefit) and equity impacts (which might be a change in slope index of inequality [SII] or relative index of inequality) on the same plane. This enables decision-makers to identify potential trade-offs between interventions that produce a greater benefit, but have less impact on inequalities, and those that produce a smaller benefit, but increase equity. I think there has been a debate over whether the ‘win-win’ quadrant should be south-east (which would be consistent with the dominant quadrant of the cost-effectiveness plane) or north-east, which is what seems to have been adopted as the consensus and is used here.
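For the equity axis, a rough sketch of how a slope index of inequality can be calculated from grouped data might help. This is the simple regression version: order the groups from most to least deprived, give each group the midpoint of its cumulative population share, and take the population-weighted least squares slope of health on that rank. I'm assuming the population shares sum to 1, and the function name and inputs are mine; the paper's own calculation may differ in detail.

```python
def slope_index_of_inequality(health, pop_share):
    """health and pop_share are ordered from most to least deprived group."""
    # Relative rank of each group: midpoint of its cumulative population share.
    cumulative = 0.0
    ranks = []
    for share in pop_share:
        ranks.append(cumulative + share / 2.0)
        cumulative += share
    # Population-weighted least squares slope of health on rank.
    mean_rank = sum(w * r for w, r in zip(pop_share, ranks))
    mean_health = sum(w * h for w, h in zip(pop_share, health))
    cov = sum(w * (r - mean_rank) * (h - mean_health)
              for w, r, h in zip(pop_share, ranks, health))
    var = sum(w * (r - mean_rank) ** 2 for w, r in zip(pop_share, ranks))
    return cov / var
```

The result reads as the modelled gap in health between the very bottom and the very top of the deprivation distribution, so an intervention's equity impact is the change in this number.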

This paper showcases a reproducible method to estimate the equity impact of interventions. It considers public health interventions recommended by NICE from 2006-2016, with equity impacts estimated based on whether they targeted specific diseases, risk factors or populations. The disease distributions were based on hospital episode statistics data by deprivation (IMD). The study used equity weights to convert QALYs gained to different social groups into net social welfare. In this case, that meant valuing the most disadvantaged fifth of people’s health at around 6-7 times that of the least disadvantaged fifth. I think there might still be work to be done around reaching consensus for equity weights.

The total expected effect on inequalities is small – full implementation of all recommendations would produce a reduction of the quality-adjusted life expectancy gap between the healthiest and least healthy from 13.78 to 13.34 QALYs. But maybe this is to be expected; NICE does not typically look at vaccinations or screening and has not looked at large scale public health programmes like the Healthy Child Programme as a whole. Reassuringly, where recommended interventions were likely to increase inequality, the trade-off between efficiency and equity was within the social welfare function they had used. The increase in inequality might be acceptable because the interventions were cost-effective – producing 5.6 million QALYs while increasing the SII by 0.005. If these interventions are buying health at a good price, then you would hope this might then release money for other interventions that would reduce inequalities.

I suspect that public health folks might not like equity trade-offs at all – trading off equity and cost-effectiveness might be the moral equivalent of trading off human rights – you can’t choose between them. But the reality is that these kinds of trade-offs do happen, and like a lot of economic methods, it is about revealing these implicit trade-offs so that they become explicit, and having ‘accountability for reasonableness’.

Future unrelated medical costs need to be considered in cost effectiveness analysis. The European Journal of Health Economics [PubMed] [RePEc] Published February 2019

This editorial says that NICE should include unrelated future medical costs in its decision making. At the moment, if NICE looks at a cardiovascular disease (CVD) drug, it might look at future costs related to CVD but it won’t include changes in future costs of cancer, or dementia, which may occur because individuals live longer. But usually unrelated QALY gains will be implicitly included; so there is an inconsistency. If you are a health economic modeller, you know that including unrelated costs properly is technically difficult. You might weight average population costs by disease prevalence so you get a cost estimate for people with coronary heart disease, diabetes, and people without either disease. Or you might have a general healthcare running cost that you can apply to future years. But accounting for a full matrix of competing causes of morbidity and mortality is very tricky if not impossible. To help with this, this group of authors produced the excellent PAID tool, which helps with doing this for the Netherlands (can we have one for the UK please?).

To me, including unrelated future costs means that in some cases ICERs might be driven more by the ratio of future costs to QALYs gained, whereas currently ICERs are often driven by the ratio of the intervention costs to QALYs gained. So it might be that a lot of treatments that are currently cost-effective no longer are, or we need to judge all interventions with a higher willingness-to-pay threshold or value of a QALY. The authors suggest that, although including unrelated medical costs usually pushes up the ICER, it should ultimately result in better decisions that increase health.
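A toy calculation shows how the ICER moves. Everything here is hypothetical: the intervention cost, the QALY and life year gains, and the flat annual unrelated cost are made-up numbers, and I've ignored discounting and any savings in related costs to keep it short.

```python
def icer(incremental_cost, incremental_qalys):
    """Incremental cost-effectiveness ratio."""
    return incremental_cost / incremental_qalys


def unrelated_future_cost(life_years_gained, annual_unrelated_cost):
    """Undiscounted unrelated costs incurred during the added years of life."""
    return life_years_gained * annual_unrelated_cost


# Hypothetical intervention (all values invented for illustration).
intervention_cost = 10_000.0
qalys_gained = 1.0
life_years_gained = 1.25
annual_unrelated_cost = 4_000.0

icer_related_only = icer(intervention_cost, qalys_gained)
icer_all_costs = icer(
    intervention_cost + unrelated_future_cost(life_years_gained, annual_unrelated_cost),
    qalys_gained,
)
```

In this made-up case the ICER rises from 10,000 to 15,000 per QALY, and the increase depends only on survival gains and the background cost of care, not on the price of the intervention.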

There are real ethical issues here. I worry that including future unrelated costs might be used for an integrated care agenda in the NHS, moving towards a capitation system where the total healthcare spend on any one individual is capped, which I don’t necessarily think should happen in a health insurance system. Future developments around big data mean we will be able to segment the population a lot better and estimate who will benefit from treatments. But I think if someone is unlucky enough to need a lot of healthcare spending, maybe they should have it. This is risk sharing and, without it, you may get the ‘double jeopardy’ problem.

For health economic modellers and decision-makers, a compromise might be to present analyses with related and unrelated medical costs and to consider both for investment decisions.

Overview of cost-effectiveness analysis. JAMA [PubMed] Published 11th March 2019

This paper probably won’t offer anything new to academic health economists in terms of methods, but I think it might be a useful teaching resource. It gives an interesting example of a model of ovarian cancer screening in the US that was published in February 2018. There has been a large-scale trial of ovarian cancer screening in the UK (the UKCTOCS), which has been extended because the results have been promising but mortality reductions were not statistically significant. The model gives a central ICER estimate of $106,187/QALY (based on $100 per screen) which would probably not be considered cost-effective in the UK.

I would like to explore one statement that I found particularly interesting, around the willingness to pay threshold; “This willingness to pay is often represented by the largest ICER among all the interventions that were adopted before current resources were exhausted, because adoption of any new intervention would require removal of an existing intervention to free up resources.”

The Culyer bookshelf model is similar to this, although as well as the ICER you also need to consider the burden of disease or size of the investment. Displacing a $110,000/QALY intervention for 1000 people with a $109,000/QALY intervention for a million people will bust your budget.
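To put rough numbers on that, here is a back-of-the-envelope sketch. The ICERs and population sizes are the ones in my example above; the QALY gain per person is a hypothetical value I've picked so the sums work, and I'm treating ICER x QALYs gained as the incremental spend compared with doing nothing.

```python
def budget_impact(icer, qalys_per_person, people):
    """Incremental spend implied by an ICER: cost per QALY x QALYs gained x people treated."""
    return icer * qalys_per_person * people


QALYS_PER_PERSON = 0.1  # hypothetical, assumed equal for both interventions

released = budget_impact(110_000, QALYS_PER_PERSON, 1_000)        # displaced intervention
required = budget_impact(109_000, QALYS_PER_PERSON, 1_000_000)    # new intervention
shortfall = required - released
```

Under these assumptions, displacing the old intervention frees up about $11 million, while the new one needs about $10.9 billion, even though its ICER is lower.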

This idea works intuitively – if Liverpool FC are signing a new player then I might hope they are better than all of the other players, or at least better than the average player. But actually, as long as they are better than the worst player then the team will be improved (leaving aside issues around different positions, how they play together, etc.).

However, I think that saying that the reference ICER should be the largest current ICER might be a bit dangerous. Leaving aside inefficient legacy interventions (like unnecessary tonsillectomies etc), it is likely that the intervention being considered for investment and the current maximum ICER intervention to be displaced may both be new, expensive immunotherapies. It might be last in, first out. But I can’t see this happening; people are loss averse, so decision-makers and patients might not accept what is seen as a fantastic new drug for pancreatic cancer being approved then quickly usurped by a fantastic new leukaemia drug.

There has been a lot of debate around what the threshold should be in the UK; in England NICE currently use £20,000 – £30,000, up to a hypothetical maximum £300,000/QALY in very specific circumstances. UK Treasury value QALYs at £60,000. Work by Karl Claxton and colleagues suggests that marginal productivity (the ‘shadow price’) in the NHS is nearer to £5,000 – £15,000 per QALY.

I don’t know what the answer to this is. I don’t think the willingness-to-pay threshold for a new treatment should be the maximum ICER of a current portfolio of interventions; maybe it should be the marginal health production cost in a health system, as might be inferred from the Claxton work. Of course, investment decisions are made on other factors, like impact on health inequalities, not just on the ICER.
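To see how much the choice of threshold matters, here is a quick sketch using net health benefit, which is QALYs gained minus the QALYs that the money could have bought elsewhere at the threshold. The thresholds are the figures quoted above; the intervention (an extra £20,000 for 1 QALY) is hypothetical, and the labels are just mine.

```python
def net_health_benefit(incremental_qalys, incremental_cost, threshold):
    """QALYs gained minus the QALYs displaced elsewhere when the cost is spent."""
    return incremental_qalys - incremental_cost / threshold


# Thresholds in GBP per QALY, taken from the figures quoted in the text.
thresholds = {
    "Claxton lower": 5_000,
    "Claxton upper": 15_000,
    "NICE lower": 20_000,
    "NICE upper": 30_000,
    "Treasury": 60_000,
}

# Hypothetical intervention: costs an extra 20,000 and gains 1 QALY.
results = {
    label: net_health_benefit(1.0, 20_000.0, k) for label, k in thresholds.items()
}
```

The same intervention breaks even at the lower NICE threshold, looks like a clear win at the Treasury value, and displaces more health than it produces at either end of the Claxton range.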


Chris Sampson’s journal round-up for 11th March 2019


Identification, review, and use of health state utilities in cost-effectiveness models: an ISPOR Good Practices for Outcomes Research Task Force report. Value in Health [PubMed] Published 1st March 2019

When modellers select health state utility values to plug into their models, they often do it in an ad hoc and unsystematic way. This ISPOR Task Force report seeks to address that.

The authors discuss the process of searching, reviewing, and synthesising utility values. Searches need to use iterative techniques because evidence requirements develop as a model develops. Due to the scope of models, it may be necessary to develop multiple search strategies (for example, for different aspects of disease pathways). Searches needn’t be exhaustive, but they should be systematic and transparent. The authors provide a list of factors that should be considered in defining search criteria. In reviewing utility values, both quality and appropriateness should be considered. Quality is indicated by the precision of the evidence, the response rate, and missing data. Appropriateness relates to the extent to which the evidence being reviewed conforms to the context of the model in which it is to be used. This includes factors such as the characteristics of the study population, the measure used, value sets used, and the timing of data collection. When it comes to synthesis, the authors suggest it might not be meaningful in most cases, because of variation in methods. We can’t pool values if they aren’t (at least roughly) equivalent. Therefore, one approach is to employ strict inclusion criteria (e.g. only EQ-5D, only a particular value set), but this isn’t likely to leave you with much. Meta-regression can be used to analyse more dissimilar utility values and provide insight into the impact of methodological differences. But the extent to which this can provide pooled values for a model is questionable, and the authors concede that more research is needed.

This paper can inform that future research, not least in its attempt to specify minimum reporting standards. We have another checklist, with another acronym (SpRUCE). The idea isn’t so much that this will guide publications of systematic reviews of utility values, but rather that modellers (and model reviewers) can use it to assess whether the selection of utility values was adequate. The authors then go on to offer methodological recommendations for using utility values in cost-effectiveness models, considering issues such as modelling technique, comorbidities, adverse events, and sensitivity analysis. It’s early days, so the recommendations in this report ought to be changed as methods develop. Still, it’s a first step away from the ad hoc selection of utility values that (no doubt) drives the results of many cost-effectiveness models.

Estimating the marginal cost of a life year in Sweden’s public healthcare sector. The European Journal of Health Economics [PubMed] Published 22nd February 2019

It’s only recently that health economists have gained access to data that enables the estimation of the opportunity cost of health care expenditure on a national level; what is sometimes referred to as a supply-side threshold. We’ve seen studies in the UK, Spain, Australia, and here we have one from Sweden.

The authors use data on health care expenditure at the national (1970-2016) and regional (2003-2016) level, alongside estimates of remaining life expectancy by age and gender (1970-2016). First, they try a time series analysis, testing the nature of causality. Finding an apparently causal relationship between longevity and expenditure, the authors don’t take it any further. Instead, the results are based on a panel data analysis, employing similar methods to estimates generated in other countries. The authors propose a conceptual model to support their analysis, which distinguishes it from other studies. In particular, the authors assert that the majority of the impact of expenditure on mortality operates through morbidity, which changes how the model should be specified. The number of newly graduated nurses is used as an instrument indicative of a supply-shift at the national rather than regional level. The models control for socioeconomic and demographic factors and morbidity not amenable to health care.

The authors estimate the marginal cost of a life year by dividing health care expenditure by the expenditure elasticity of life expectancy, finding an opportunity cost of €38,812 (with a massive 95% confidence interval). Using Swedish population norms for utility values, this would translate into around €45,000/QALY.
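The arithmetic here can be sketched in a few lines. The first function is just the textbook elasticity identity (if a 1% rise in spending buys an elasticity-sized percentage rise in life years, the cost of one more life year is spending divided by elasticity times life years); it is my reading of the calculation, not necessarily the authors' exact formula, and the function names are mine. The second part backs out the average utility implied by the two figures quoted above, bearing in mind that the QALY figure is rounded.

```python
def marginal_cost_per_life_year(expenditure, life_years, elasticity):
    """dE/dLY when elasticity = (dLY / LY) / (dE / E)."""
    return expenditure / (elasticity * life_years)


def cost_per_qaly(cost_per_life_year, mean_utility):
    """Convert a cost per life year into a cost per QALY using an average utility."""
    return cost_per_life_year / mean_utility


# Figures quoted in the text; the QALY figure is only given as 'around' 45,000.
implied_mean_utility = 38_812 / 45_000
```

The implied average utility comes out at roughly 0.86, which is only approximate given the rounding.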

The analysis is considered and makes plain the difficulty of estimating the marginal productivity of health care expenditure. It looks like a nail in the coffin for the idea of estimating opportunity costs using time series. For now, at least, estimates of opportunity cost will be based on variation according to geography, rather than time. In their excellent discussion, the authors are candid about the limitations of their model. Their instrument wasn’t perfect and it looks like there may have been important confounding variables that they couldn’t control for.

Frequentist and Bayesian meta‐regression of health state utilities for multiple myeloma incorporating systematic review and analysis of individual patient data. Health Economics [PubMed] Published 20th February 2019

The first paper in this round-up was about improving practice in the systematic review of health state utility values, and it indicated the need for more research on the synthesis of values. Here, we have some. In this study, the authors conduct a meta-analysis of utility values alongside an analysis of registry and clinical study data for multiple myeloma patients.

A literature search identified 13 ‘methodologically appropriate’ papers, providing 27 health state utility values. The EMMOS registry included data for 2,445 patients in 22 countries and the APEX clinical study included 669 patients, all with EQ-5D-3L data. The authors implement both a frequentist meta-regression and a Bayesian model. In both cases, the models were run including all values and then with a limited set of only EQ-5D values. These models predicted utility values based on the number of treatment classes received and the rate of stem cell transplant in the sample. The priors used in the Bayesian model were based on studies that reported general utility values for the presence of disease (rather than according to treatment).

The frequentist models showed that utility was low at diagnosis, higher at first treatment, and lower at each subsequent treatment. Stem cell transplant had a positive impact on utility values independent of the number of previous treatments. The results of the Bayesian analysis were very similar, which the authors suggest is due to weak priors. An additional Bayesian model was run with preferred data but vague priors, to assess the sensitivity of the model to the priors. At later stages of disease (for which data were more sparse), there was greater uncertainty. The authors provide predicted values from each of the five models, according to the number of treatment classes received. The models provide slightly different results, except in the case of newly diagnosed patients (where the difference was 0.001). For example, the ‘EQ-5D only’ frequentist model gave a value of 0.659 for one treatment, while the Bayesian model gave a value of 0.620.

I’m not sure that the study satisfies the recommendations outlined in the ISPOR Task Force report described above (though that would be an unfair challenge, given the timing of publication). We’re told very little about the nature of the studies that are included, so it’s difficult to judge whether they should have been combined in this way. However, the authors state that they have made their data extraction and source code available online, which means I could check that out (though, having had a look, I can’t find the material that the authors refer to, reinforcing my hatred for the shambolic ‘supplementary material’ ecosystem). The main purpose of this paper is to progress the methods used to synthesise health state utility values, and it does that well. Predictably, the future is Bayesian.
