Chris Sampson’s journal round-up for 5th August 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

The barriers and facilitators to model replication within health economics. Value in Health Published 16th July 2019

Replication is a valuable part of the scientific process, especially if there are uncertainties about the validity of research methods. When it comes to cost-effectiveness modelling, there are endless opportunities for researchers to do things badly, even with the best intentions. Attempting to replicate modelling studies can therefore support health care decision-making. But replication studies are rarely conducted, or, at least, rarely reported. The authors of this study sought to understand the factors that can make replication easy or difficult, with a view to informing reporting standards.

The authors attempted to replicate five published cost-effectiveness modelling studies, with the aim of recreating the key results. Each replication attempt was conducted by a different author and we’re even given a rating of the replicator’s experience level. The characteristics of the models were recorded and each replicator detailed – anecdotally – the things that helped or hindered their attempt. Some replications were a resounding failure. In one case, the replicated cost per patient was more than double the original, at more than £1,000 wide of the mark. Replicators reported that having a clear diagram of the model structure was a big help, as was the provision of example calculations and explicit listing of the key assumptions. Various shortcomings made replication difficult, all relating to a lack of clarity or completeness in reporting. The impact of this on the validation attempt was exacerbated if the model either involved lots of scenarios that weren’t clearly described or if the model had a long time horizon.

The quality of each study was assessed using the Philips checklist, and all did pretty well, suggesting that the checklist is not sufficient for ensuring replicability. If you develop and report cost-effectiveness models, this paper could help you better understand how end-users will interpret your reporting and make your work more replicable. This study focusses on Markov models. They’re definitely the most common approach, so perhaps that’s OK. It might be useful to produce prescriptive guidance specific to Markov models, informed by the findings of this study.
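
To make the point about time horizons concrete, here is a minimal sketch of a Markov cohort model in Python. Everything in it is hypothetical: the three states, the transition probabilities, the costs, and the choice of an unstated discount rate as the reporting gap are my own illustration rather than anything taken from the paper. It simply shows how one ambiguous input can matter little over a short horizon and a lot over a long one.

```python
def discounted_cohort_cost(transitions, cycle_costs, n_cycles, discount_rate):
    """Expected discounted cost per patient from a Markov cohort model.

    transitions[i][j] is the per-cycle probability of moving from state i to j.
    The whole cohort starts in state 0 and costs accrue at the start of each cycle.
    """
    n_states = len(transitions)
    cohort = [1.0] + [0.0] * (n_states - 1)
    total_cost = 0.0
    for cycle in range(n_cycles):
        # Cost of this cycle, weighted by the share of the cohort in each state.
        cycle_cost = sum(share * cost for share, cost in zip(cohort, cycle_costs))
        total_cost += cycle_cost / (1.0 + discount_rate) ** cycle
        # Move the cohort forward one cycle.
        cohort = [
            sum(cohort[i] * transitions[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
    return total_cost


# Hypothetical three-state model: well, sick, dead (annual cycles).
TRANSITIONS = [
    [0.85, 0.10, 0.05],
    [0.00, 0.80, 0.20],
    [0.00, 0.00, 1.00],
]
CYCLE_COSTS = [500.0, 2000.0, 0.0]  # made-up cost per cycle in each state


def discounting_gap(n_cycles):
    """Difference in cost per patient between assuming 0% and 3.5% discounting."""
    undiscounted = discounted_cohort_cost(TRANSITIONS, CYCLE_COSTS, n_cycles, 0.0)
    discounted = discounted_cohort_cost(TRANSITIONS, CYCLE_COSTS, n_cycles, 0.035)
    return undiscounted - discounted


gap_short = discounting_gap(5)
gap_long = discounting_gap(40)
print(f"Gap over 5 cycles: {gap_short:.0f}; over 40 cycles: {gap_long:.0f}")
```

A replicator who has to guess the discount rate will land closer to the original over 5 cycles than over 40, which is one plausible reason why long time horizons made replication harder.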

US integrated delivery networks perspective on economic burden of patients with treatment-resistant depression: a retrospective matched-cohort study. PharmacoEconomics – Open [PubMed] Published 28th June 2019

Treatment-resistant depression can be associated with high health care costs, as multiple lines of treatment are tried, with patients experiencing little or no benefit. New treatments and models of care can go some way to addressing these challenges. In the US, there’s some reason to believe that integrated delivery networks (IDNs) could be associated with lower care costs, because IDNs are based on collaborative care models and constitute a single point of accountability for patient costs. They might be particularly useful in the case of treatment-resistant depression, but evidence is lacking. The authors of this study investigated the difference in health care resource use and costs for patients with and without treatment-resistant depression, in the context of IDNs.

The researchers conducted a retrospective cohort study using claims data for people receiving care from IDNs, with up to two years of follow-up from first antidepressant use. A total of 1,582 people with treatment-resistant depression were propensity score matched to two other groups – patients without depression and patients with depression that was not classified as treatment-resistant. Various regression models were used to compare the key outcomes of all-cause and specific categories of resource use and costs. Unfortunately, there is no assessment of whether the selected models are actually any good at estimating differences in costs.

The average costs and resource use levels in the three groups ranked as you would expect: $25,807 per person per year for the treatment-resistant group versus $13,701 in the non-resistant group and $8,500 in the non-depression group. People with treatment-resistant depression used a wider range of antidepressants and for a longer duration. They also had twice as many inpatient visits as people with depression that wasn’t treatment-resistant, which seems to have been the main driver of the adjusted differences in costs.

We don’t know (from this study) whether or not IDNs provide a higher quality of care. And the study isn’t able to compare IDN and non-IDN models of care. But it does show that IDNs probably aren’t a full solution to the high costs of treatment-resistant depression.

Rabin’s paradox for health outcomes. Health Economics [PubMed] [RePEc] Published 19th June 2019

Rabin’s paradox arises from the theoretical demonstration that a risk-averse individual who turns down a 50:50 gamble of gaining £110 or losing £100 would, if expected utility theory is correct, turn down a 50:50 gamble of losing £1,000 or gaining millions. This is because of the assumed concave utility function over wealth that is used to model risk aversion and it is probably not realistic. But we don’t know about the relevance of this paradox in the health domain… until now.
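
The logic can be sketched numerically. The example below is a minimal illustration in Python, assuming a constant absolute risk aversion (CARA) utility function over changes in wealth, u(x) = -exp(-a·x). That functional form is my assumption, chosen because it makes the small gamble unattractive at every wealth level; Rabin's argument applies to concave utility functions generally. The code finds the level of risk aversion at which someone just rejects the small gamble, then checks what that implies for a large one.

```python
import math


def expected_utility(a, outcomes):
    """Expected CARA utility, u(x) = -exp(-a * x), of equally likely outcomes."""
    return sum(-math.exp(-a * x) for x in outcomes) / len(outcomes)


def rejects(a, loss, gain):
    """True if a 50:50 gamble of losing `loss` or gaining `gain` is turned down.

    Declining the gamble leaves wealth unchanged, which has utility u(0) = -1.
    """
    return expected_utility(a, [-loss, gain]) < -1.0


def threshold_risk_aversion(loss, gain, lo=1e-6, hi=1e-2, iterations=100):
    """Bisect for the smallest risk aversion (in the bracket) that rejects the gamble."""
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if rejects(mid, loss, gain):
            hi = mid
        else:
            lo = mid
    return hi


# Risk aversion at which the lose-100 / gain-110 gamble is only just rejected.
a_star = threshold_risk_aversion(loss=100, gain=110)

# The same person facing a 50:50 gamble of losing 1,000 or gaining 10 million.
rejects_large = rejects(a_star, loss=1_000, gain=10_000_000)
print(f"a* = {a_star:.6f}; rejects the large gamble: {rejects_large}")
```

Under this assumed utility function, the curvature needed to explain rejection of the small gamble is enough to make a loss of 1,000 outweigh any gain at all, which is the implausible implication at the heart of the paradox.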

A key contribution of this paper is that it considers both decision-making about one’s own health and decision-making from a societal perspective. Three different scenarios are set up in each case, relating to gains and losses in life expectancy with different levels of health functioning. 201 students were recruited as part of a larger study on preferences and each completed all six gamble-pairs (three individual, three societal). To test for Rabin’s paradox, the participants were asked whether they would accept each gamble involving a moderate stake and a large stake.

In short, the authors observe Rabin’s proposed failure of expected utility theory. Many participants rejected small gambles but did not reject the larger gambles. The effect was more pronounced for societal preferences, though for a large minority of participants expected utility theory was not violated. The upshot of all this is that our models of health preferences that are based on expected utility may be flawed where uncertain outcomes are involved – as they often are in health. This study adds to a growing body of literature supporting the relevance of alternative utility theories, such as prospect theory, to health and health care.

My only problem here is that life expectancy is not health. Life expectancy is everything. It incorporates the monetary domain, which this study did not want to consider, as well as every other domain of life. When you die, your stock of cash is as useful to you as your stock of health. I think it would have been more useful if the study focussed only on health status and outcomes and excluded all considerations of death.

Simon McNamara’s journal round-up for 6th August 2018

Euthanasia, religiosity and the valuation of health states: results from an Irish EQ5D5L valuation study and their implications for anchor values. Health and Quality of Life Outcomes [PubMed] Published 31st July 2018

Do you support euthanasia? Do you think there are health states worse than death? Are you religious? Don’t worry – I am not commandeering this week’s AHE journal round-up just to bombard you with a series of difficult questions. These three questions form the foundation of the first article selected for this week’s round-up.

The paper is based upon the hypothesis that your religiosity (“adherence to religious beliefs”) is likely to impact your support for euthanasia and, subsequently, the likelihood of you valuing severe health states as worse than death. This seems like a logical hypothesis. Religions tend to be anti-euthanasia, and so it appears likely that religious people will have lower levels of support for euthanasia than non-religious people. Equally, if you don’t support the principle of euthanasia, it stands to reason that you are likely to be less willing to choose immediate death over living in a severe health state – something you would need to do for a health state to be considered as being worse than death in a time trade-off (TTO) study.

The authors test this hypothesis using a sub-sample of data (n=160) collected as part of the Irish EQ-5D-5L TTO valuation study. Perhaps unsurprisingly, the authors find evidence in support of the above hypothesis. Those who attend a religious service weekly were more likely to oppose euthanasia than those who attend a few times a year or less, and those who oppose euthanasia were less likely to give “worse than death” responses in the TTO than those who support it.

I found this paper really interesting, as it raises a number of challenging questions. If a society is made up of people with heterogeneous beliefs regarding religion, how should we balance these in the valuation of health? If a society is primarily non-religious, is it fair to apply its valuation tariff to the lives of the religious, and vice versa? These certainly aren’t easy questions to answer, but they may be worth reflecting on.

E-learning and health inequality aversion: A questionnaire experiment. Health Economics [PubMed] [RePEc] Published 22nd July 2018

Moving on from the cheery topic of euthanasia, what do you think about socioeconomic inequalities in health? In my home country, England, if you are from the poorest quintile of society, you can expect to experience 62 years in full health in your lifetime, whilst if you are from the richest quintile, you can expect to experience 74 years – a gap of 12 years.

In the second paper to be featured in this round-up, Cookson et al. explore the public’s willingness to sacrifice incremental population health gains in order to reduce these inequalities in health – their level of “health inequality aversion”. This is a potentially important area of research, as the vast majority of economic evaluation in health is distributionally-naïve and effectively assumes that members of the public aren’t at all concerned with inequalities in health.

The paper builds on prior work conducted by the authors in this area, in which they noted a high proportion of respondents in health inequality aversion elicitation studies appear to be so averse to inequalities that they violate monotonicity – they choose scenarios that reduce inequalities in health even if these scenarios reduce the health of the rich at no gain to the poor, or they reduce the health of the poor, or they may reduce the health of both groups. The authors hypothesise that these monotonicity violations may be due to incomplete thinking from participants, and suggest that the quality of their thinking could be improved by two e-learning educational interventions. The primary aim of the paper is to test the impact of these interventions in a sample of the UK public (n=60).

The first e-learning intervention was an animated video that described a range of potential positions that a respondent could take (e.g. health maximisation, or maximising the health of the worst off). The second was an interactive spreadsheet-based questionnaire that presented the consequences of the participant’s choices, prior to them confirming their selection. Both interventions are available online.

The authors found that the interactive tool significantly reduced the number of extreme egalitarian (monotonicity-violating) responses, compared to a non-interactive, paper-based version of the study. Similarly, when the video was watched before completing the paper-based exercise, the number of extreme egalitarian responses fell. However, when the video was watched before the interactive tool there was no further decrease in extreme egalitarianism. Despite this reduction in extreme egalitarianism, the median levels of inequality aversion remained high, with implied weights of 2.6 (interactive questionnaire group) and 7.0 (video group) for QALY gains granted to someone from the poorest fifth of society, relative to someone from the richest fifth.

This is an interesting study that provides further evidence of inequality aversion, and raises further concern about the practical dominance of distributionally-naïve approaches to economic evaluation. The public does seem to care about distribution. Furthermore, the paper demonstrates that participant responses to inequality aversion exercises are shaped by the information given to them, and the way that information is presented. I look forward to seeing more studies like this in the future.

A new method for valuing health: directly eliciting personal utility functions. The European Journal of Health Economics [PubMed] [RePEc] Published 20th July 2018

Last, but not least, for this round-up, is a paper by Devlin et al. on a new method for valuing health.

The relative valuation of health states is a pretty important topic for health economists. If we are to quantify the effectiveness, and subsequently cost-effectiveness, of an intervention, we need to understand which health states are better than others, and how much better they are. Traditionally, this is done by asking members of the public to choose between different health profiles featuring differing levels of fulfilment of a range of domains of health, in order to ‘uncover’ the relative importance the respondent places on these domains, and levels. These can then be used in order to generate social tariffs that assign a utility value to a given health state for use in economic evaluation.

The authors point out that, in the modern day, valuation studies can be conducted rapidly, and at scale, online, but at the potential cost of deliberation from participants, and the resultant risk of heuristic-dominated decision making. In response to this, the authors propose a new method – direct elicitation of personal utility functions – and pilot its use for the valuation of EQ-5D in a sample of the English public (n=76).

The proposed approach differs from traditional approaches in three key ways. Firstly, instead of simply attempting to infer the relative importance that participants place on differing domains based upon choices between health profiles, the respondents are asked directly about the relative importance they place on differing domains of health, prior to validating these with profile choices. Secondly, the authors place a heavy emphasis on deliberation, and the construction, rather than uncovering, of preferences during the elicitation exercises. Thirdly, a “personal utility function” for each individual is constructed (in effect a personal EQ-5D tariff), and these individual utility functions are subsequently aggregated into a social utility function.

In the pilot, the authors find that the method appears feasible for wider use, albeit with some teething troubles associated with the computer-based tool developed to implement it, and the skills of the interviewers.

This direct method raises an interesting question for health economics – should we be inferring preferences based upon choices that differ in terms of certain attributes, or should we just ask directly about the attributes? This is a tricky question. It is possible that these different approaches could elicit different preferences – if they do, on what grounds should we choose one or the other? This requires a normative judgment, and at present, it appears both are (potentially) as legitimate as each other.

Whilst the authors apply this direct method to the valuation of health, I don’t see why similar approaches couldn’t be applied to any multi-attribute choice experiment. Keep an eye out for future uses of it in valuation, and perhaps beyond. It will be interesting to see how it develops.

Chris Sampson’s journal round-up for 2nd April 2018

Quality-adjusted life-years without constant proportionality. Value in Health Published 27th March 2018

The assumption of constant proportional trade-offs (CPTO) is at the heart of everything we do with QALYs. It assumes that duration has no impact on the value of a given health state, and so the value of a health state is constant regardless of its duration. This assumption has been repeatedly demonstrated to fail. This study looks for a non-constant alternative, which hasn’t been done before. The authors consider a quality-adjusted lifespan and four functional forms for the relationship between time and the value of life: constant, discount, logarithmic, and power. These relationships were tested in an online survey with more than 5,000 people, which involved the completion of 30-40 time trade-off pairs based on the EQ-5D-5L. Respondents traded off health states of varying severities and durations. Initially, a saturated model (making no assumptions about functional form) was estimated. This demonstrated that the marginal value of lifespan is decreasing. The authors provide a set of values attached to different health states at different durations. Then, the econometric model is adjusted to suit a power model, with the power estimated for duration expressed in days, weeks, months, or years. The power value for time is 0.415, but different expressions of time could introduce bias; time expressed in days (power=0.403) loses value faster than time expressed in years (power=0.654). There are also some anomalies that arise from the data that don’t fit the power function. For example, a single day of moderate problems can be worse than death, whereas 7 days or more is not. Using ‘power QALYs’ could be the future. But the big remaining question is whether decision-makers ought to respond to people’s time preferences in this way.
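
To see what a power of 0.415 implies, here is a minimal sketch in Python. It assumes the value of time in a health state takes the form quality × duration^power, which is my simplification for illustration and not necessarily the paper's exact specification; the function name and the example durations are hypothetical.

```python
def duration_value(quality, duration, power=0.415):
    """Value of `duration` units of time in a state with the given quality weight.

    Assumed form: quality * duration ** power. With power = 1 this reduces to
    the conventional QALY, in which value is proportional to duration.
    """
    return quality * duration ** power


# How much more is 10 units of time worth than 5 units, in full health?
ratio_power = duration_value(1.0, 10) / duration_value(1.0, 5)
ratio_cpto = duration_value(1.0, 10, power=1.0) / duration_value(1.0, 5, power=1.0)

print(f"Power model: {ratio_power:.2f} times; conventional QALY: {ratio_cpto:.2f} times")
```

Under this assumed form, doubling duration increases value by about a third, compared with the doubling that constant proportionality would imply.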

A systematic review of studies comparing the measurement properties of the three-level and five-level versions of the EQ-5D. PharmacoEconomics [PubMed] Published 23rd March 2018

The debate about the EQ-5D-5L continues (on Twitter, at least). Conveniently, this paper addresses a concern held by some people – that we don’t understand the implications of using the 5L descriptive system. The authors systematically review papers comparing the measurement properties of the 3L and 5L, written in English or German. The review ended up including 24 studies. The measurement properties that were considered by the authors were: i) distributional properties, ii) informativity, iii) inconsistencies, iv) responsiveness, and v) test-retest reliability. The last property involves consideration of index values. Each study was also quality-assessed, with all being considered of good to excellent quality. The studies covered numerous countries and different respondent groups, with sample sizes from the tens to the thousands. For most measurement properties, the findings for the 3L and 5L were very similar. Floor effects were generally below 5% and tended to be slightly reduced for the 5L. In some cases, the 5L was associated with major reductions in the proportion of people responding as 11111 – a well-recognised ceiling effect associated with the 3L. Just over half of the studies reported on informativity using Shannon’s H’ and Shannon’s J’. The 5L provided consistently better results. Only three studies looked at responsiveness, with two slightly favouring the 5L and one favouring the 3L. The latter could be explained by the use of the 3L-5L crosswalk, which is inherently less responsive because it is a crosswalk. The overarching message is consistency. Business as usual. This is important because it means that the 3L and 5L descriptive systems provide comparable results (which is the basis for the argument I recently made that they are measuring the same thing). In some respects, this could be disappointing for 5L proponents because it suggests that the 5L descriptive system is not a lot better than the 3L. But it is a little better. 
This study demonstrates that there are still uncertainties about the differences between 3L and 5L assessments of health-related quality of life. More comparative studies, of the kind included in this review, should be conducted so that we can better understand the differences in results that are likely to arise now that we have moved (relatively assuredly) towards using the 5L instead of the 3L.
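
For readers unfamiliar with the informativity measures mentioned above, here is a minimal sketch in Python of how Shannon's H' and J' are commonly calculated for a single dimension: H' is the entropy of the observed distribution of responses across levels, and J' divides it by the maximum possible entropy for that number of levels. The response data below are invented for illustration and are not taken from any of the reviewed studies.

```python
import math
from collections import Counter


def shannon_indices(responses, n_levels):
    """Return (H', J') for one dimension of a descriptive system.

    H' = -sum(p_i * log2(p_i)) over the levels actually used by respondents.
    J' = H' / log2(n_levels), so it lies between 0 and 1.
    """
    n = len(responses)
    proportions = [count / n for count in Counter(responses).values()]
    h = sum(-p * math.log2(p) for p in proportions)
    return h, h / math.log2(n_levels)


# Hypothetical responses to one dimension from 100 people.
responses_3l = [1] * 60 + [2] * 35 + [3] * 5
responses_5l = [1] * 45 + [2] * 25 + [3] * 18 + [4] * 9 + [5] * 3

h_3l, j_3l = shannon_indices(responses_3l, n_levels=3)
h_5l, j_5l = shannon_indices(responses_5l, n_levels=5)
print(f"3L: H'={h_3l:.2f}, J'={j_3l:.2f}; 5L: H'={h_5l:.2f}, J'={j_5l:.2f}")
```

A higher H' means that responses carry more information in absolute terms, while J' indicates how evenly the available levels are used.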

Preference-based measures to obtain health state utility values for use in economic evaluations with child-based populations: a review and UK-based focus group assessment of patient and parent choices. Quality of Life Research [PubMed] Published 21st March 2018

Calculating QALYs for kids continues to be a challenge. One of the challenges is the choice of which preference-based measure to use. Part of the problem here is that the EuroQol group – on which we rely for measuring adult health preferences – has been a bit slow. There’s the EQ-5D-Y, which has been around for a while, but it wasn’t developed with any serious thought about what kids value and there still isn’t a value set for the UK. So, if we use anything, we use a variety of measures. In this study, the authors review the use of generic preference-based measures. 45 papers are identified, including 5 different measures: HUI2, HUI3, CHU-9D, EQ-5D-Y, and AQOL-6D. No prizes for guessing that the EQ-5D (adult version) was the most commonly used measure for child-based populations. Unfortunately, the review is a bit of a disappointment. And I’m not just saying that because at least one study on which I’ve worked isn’t cited. The search strategy is likely to miss many (perhaps most) trial-based economic evaluations with children, for which cost-utility analyses don’t usually get a lot of airtime. It’s hard to see how a review of this kind is useful if it isn’t comprehensive. But the goal of the paper isn’t just to summarise the use of measures to date. The focus is on understanding when researchers should use self- or proxy-response, and when a parent-child dyad might be most useful. The literature review can’t do much to guide that question except to assert that the identified studies tended to use parent–proxy respondents. But the study also reports on some focus groups, which are potentially more useful. These were conducted as part of a wider study relating to the design of an RCT. In five focus groups, participants were presented with the EQ-5D-Y and the CHU-9D. It isn’t clear why these two measures were selected. The focus groups included parents and some children over the age of 11. 
Unfortunately, there’s no real (qualitative) analysis conducted, so the findings are limited. Parents expressed concern about a lack of sensitivity. Naturally, they thought that they knew best and should be the respondents. Of the young people reviewing the measures themselves, the EQ-5D-Y was perceived as more straightforward in referring to tangible experiences, whereas the CHU-9D’s severity levels were seen as more representative. Older adolescents tended to prefer the CHU-9D. The youths weren’t as sure of themselves as the adults and, though they expressed concern about their parents not understanding how they feel, they were generally neutral about who ought to respond. The older kids wanted to speak for themselves. The paper provides a good overview of the different measures, which could be useful for researchers planning data collection for child health utility measurement. But due to the limitations of the review and the lack of analysis of the focus groups, the paper isn’t able to provide any real guidance.
