James Lomas’s journal round-up for 21st May 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Decision making for healthcare resource allocation: joint v. separate decisions on interacting interventions. Medical Decision Making [PubMed] Published 23rd April 2018

While it may be uncontroversial that including all of the relevant comparators in an economic evaluation is crucial, a careful examination of this statement raises some interesting questions. Which comparators are relevant? For those that are relevant, how crucial is it that they are not excluded? The answer to the first of these questions may seem obvious, that all feasible mutually exclusive interventions should be compared, but this is in fact deceptive. Dakin and Gray highlight inconsistency between guidelines as to what constitutes interventions that are ‘mutually exclusive’ and so try to re-frame the distinction according to whether interventions are ‘incompatible’ – when it is physically impossible to implement both interventions simultaneously – and, if not, whether interventions are ‘interacting’ – where the costs and effects of implementing A and B together do not equal the sum of the costs and effects of implementing each alone. What I really like about this paper is that it has a very pragmatic focus. Inspired by policy arrangements, for example single technology appraisals, and the difficulty in capturing all interactions, Dakin and Gray provide a reader-friendly flow diagram to illustrate cases where excluding interacting interventions from a joint evaluation is likely to have a big impact, and furthermore propose a sequencing approach that avoids the major problems in evaluating separately what should be considered jointly. Essentially, when we have interacting interventions at different points of the disease pathway, evaluating separately may not be problematic if we start at the end of the pathway and move backwards, much like the backward induction used to solve sequential games in game theory. There are additional related questions that I’d like to see these authors turn to next, such as how to include interaction effects between interventions and, in particular, how to evaluate system-wide policies that may interact with a very large number of interventions. This paper makes a great contribution to answering all of these questions by establishing a framework that clearly distinguishes concepts that had previously been subject to muddied thinking.
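To make the idea of ‘interacting’ interventions concrete, here is a minimal sketch of my own, not the authors’ method. Every number in it is hypothetical, including the threshold of 20,000 per QALY, and the names `Outcome`, `interaction` and `net_monetary_benefit` are invented for illustration. It checks whether the joint costs and effects equal the sum of the parts, and then compares what a joint evaluation would choose with what two separate evaluations would adopt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    cost: float   # incremental cost versus doing neither (hypothetical)
    qalys: float  # incremental QALYs versus doing neither (hypothetical)


def interaction(a, b, both):
    """Joint outcome minus the sum of the parts; (0, 0) means no interaction."""
    return (both.cost - (a.cost + b.cost), both.qalys - (a.qalys + b.qalys))


def net_monetary_benefit(outcome, threshold):
    """Value of the health gained at the threshold, minus the cost."""
    return threshold * outcome.qalys - outcome.cost


THRESHOLD = 20_000  # assumed cost-effectiveness threshold per QALY

# Hypothetical options: A and B overlap in effect when used together.
options = {
    "neither": Outcome(0, 0.0),
    "A only": Outcome(4_000, 0.30),
    "B only": Outcome(2_500, 0.25),
    "A and B": Outcome(6_500, 0.35),
}

cost_interaction, qaly_interaction = interaction(
    options["A only"], options["B only"], options["A and B"]
)

# Joint evaluation: compare all four mutually exclusive strategies at once.
joint_choice = max(
    options, key=lambda name: net_monetary_benefit(options[name], THRESHOLD)
)

# Separate evaluation: each intervention is judged only against "neither".
separately_adopted = [
    name
    for name in ("A only", "B only")
    if net_monetary_benefit(options[name], THRESHOLD) > 0
]

print("QALY interaction:", qaly_interaction)
print("Joint evaluation picks:", joint_choice)
print("Separate evaluations adopt:", separately_adopted)
```

With these made-up numbers, each intervention looks worthwhile on its own, so separate appraisals would adopt both, whereas the joint evaluation picks only one of them because the combined health gain falls short of the sum of the parts.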

When cost-effective interventions are unaffordable: integrating cost-effectiveness and budget impact in priority setting for global health programs. PLoS Medicine [PubMed] Published 2nd October 2017

In my opinion, there are many things that health economists shouldn’t try to include when they conduct cost-effectiveness analysis. Affordability is not one of these. This paper is great, because Bilinski et al shine a light on the worldwide phenomenon of interventions being found to be ‘cost-effective’ but not affordable. One claim in particular – that it would be financially impossible to implement all interventions that are found to be ‘very cost-effective’ in many low- and middle-income countries – is quite shocking. Bilinski et al compare and contrast cost-effectiveness analysis and budget impact analysis, and argue that there are four key reasons why something could be ‘cost-effective’ but not affordable: 1) judging cost-effectiveness with reference to an inappropriate cost-effectiveness ‘threshold’, 2) adoption of a societal perspective that includes costs not falling upon the payer’s budget, 3) failing to make explicit consideration of the distribution of costs over time, and 4) the use of an inappropriate discount rate that may not accurately reflect the borrowing and investment opportunities facing the payer. They then argue that, because of this, cost-effectiveness analysis should be presented along with budget impact analysis so that the decision-maker can base a decision on both analyses. I don’t disagree with this as a pragmatic interim solution, but – by highlighting these four reasons for divergence of results with such important economic consequences – I think that this paper will have further-reaching implications. To my mind, the paper essentially serves as a call to arms for researchers to try to come up with frameworks and estimates so that the conduct of cost-effectiveness analysis can be improved in order that paradoxical results are no longer produced, decisions are more usefully informed by cost-effectiveness analysis, and the opportunity costs of large budget impacts are properly evaluated – especially in the context of low- and middle-income countries where the foregone health from poor decisions can be so significant.
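The gap between ‘cost-effective’ and ‘affordable’ is easy to see with some arithmetic. The sketch below is my own illustration, not taken from the paper: the function `assess` and all of the inputs (cost and QALY gain per patient, number of eligible patients, threshold and budget) are hypothetical, and it assumes a single year with no discounting.

```python
def assess(delta_cost, delta_qalys, eligible_patients, threshold, budget):
    """Compare an ICER with a threshold and a budget impact with a budget.

    Assumes a single-year horizon, no discounting, and an intervention
    that adds both cost and health for every eligible patient.
    """
    if delta_qalys <= 0:
        raise ValueError("this sketch only covers interventions that add health")
    icer = delta_cost / delta_qalys               # cost per QALY gained
    budget_impact = delta_cost * eligible_patients  # total extra spend
    return {
        "icer": icer,
        "cost_effective": icer <= threshold,
        "budget_impact": budget_impact,
        "affordable": budget_impact <= budget,
    }


# Hypothetical inputs: cheap per patient, but very many patients.
result = assess(
    delta_cost=500.0,
    delta_qalys=0.5,
    eligible_patients=2_000_000,
    threshold=3_000.0,
    budget=400_000_000.0,
)

for key, value in result.items():
    print(f"{key}: {value}")
```

Here the intervention sits comfortably below the assumed threshold, yet treating everyone eligible would cost well over the assumed budget, which is the kind of divergence the two analyses are meant to expose when presented side by side.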

Patient cost-sharing, socioeconomic status, and children’s health care utilization. Journal of Health Economics [PubMed] Published 16th April 2018

This paper evaluates a policy using a combination of regression discontinuity design and difference-in-differences methods. Not only does it do that, but it tackles an important policy question using a detailed population-wide dataset (a set of linked datasets, more accurately). As if that weren’t enough, one of the policy reforms was actually implemented as a result of a vote where two politicians ‘accidentally pressed the wrong button’, reducing concerns that the policy may have in some way not been exogenous. Needless to say, I found the method employed in this paper to be a pretty convincing identification strategy. The policy question at hand is whether demand for GP visits for children in the Swedish county of Scania (Skåne) is affected by cost-sharing. Cost-sharing for GP visits has applied to different age groups over different periods of time, providing the basis for regression discontinuities around the age threshold and for treated and control groups over time. Nilsson and Paul find results suggesting that when health care is free of charge, doctor visits by children increase by 5-10%. In this context, doctor visits happened subject to telephone triage by a nurse, and so in this sense it can be argued that all of these visits would be ‘needed’. Further, Nilsson and Paul find that the sensitivity to price is concentrated in low-income households, and is greater among sickly children. The authors contextualise their results very well and, in addition to that context, I can’t deny that it particularly resonated with me to read this as we approach the 70th birthday of the NHS – a system where cost-sharing has never been implemented for GP visits by children. This paper is clearly also highly relevant to the debate over charging for GP visits that has surfaced again and again in the UK.
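For readers less familiar with the method, the difference-in-differences part of the strategy boils down to a simple subtraction. The sketch below is only that part, with visit rates that I have made up; it is not the authors’ specification, which also exploits the age discontinuity and uses regression models.

```python
def difference_in_differences(treated_before, treated_after,
                              control_before, control_after):
    """Change in the treated group minus the change in the control group."""
    return (treated_after - treated_before) - (control_after - control_before)


# Hypothetical mean GP visits per child per year.
# "Treated" children move from paying a fee to free care; "control" do not.
TREATED_BEFORE, TREATED_AFTER = 2.00, 2.20
CONTROL_BEFORE, CONTROL_AFTER = 2.10, 2.14

did = difference_in_differences(
    TREATED_BEFORE, TREATED_AFTER, CONTROL_BEFORE, CONTROL_AFTER
)

# Express the estimate relative to the treated group's baseline.
relative_effect = did / TREATED_BEFORE

print(f"Estimated effect: {did:.2f} visits per child per year")
print(f"Relative to baseline: {relative_effect:.0%}")
```

The control group’s change stands in for what would have happened to the treated group without the reform, which is why the ‘wrong button’ vote matters: it makes that assumption easier to believe.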

Chris Sampson’s journal round-up for 2nd April 2018

Quality-adjusted life-years without constant proportionality. Value in Health Published 27th March 2018

The assumption of constant proportional trade-offs (CPTO) is at the heart of everything we do with QALYs. It assumes that the value of a given health state is the same regardless of how long it lasts. This assumption has been repeatedly demonstrated to fail. This study looks for a non-constant alternative, which hasn’t been done before. The authors consider a quality-adjusted lifespan and four functional forms for the relationship between time and the value of life: constant, discount, logarithmic, and power. These relationships were tested in an online survey with more than 5,000 people, which involved the completion of 30-40 time trade-off pairs based on the EQ-5D-5L. Respondents traded off health states of varying severities and durations. Initially, a saturated model (making no assumptions about functional form) was estimated. This demonstrated that the marginal value of lifespan is decreasing. The authors provide a set of values attached to different health states at different durations. Then, the econometric model is adjusted to suit a power model, with the power estimated for duration expressed in days, weeks, months, or years. The power value for time is 0.415, but different expressions of time could introduce bias; time expressed in days (power=0.403) loses value faster than time expressed in years (power=0.654). There are also some anomalies that arise from the data that don’t fit the power function. For example, a single day of moderate problems can be worse than death, whereas 7 days or more is not. Using ‘power QALYs’ could be the future. But the big remaining question is whether decision-makers ought to respond to people’s time preferences in this way.
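To see what a power function does to the value of time, here is a rough sketch. The functional form (health state weight multiplied by duration raised to a power) is my simplifying assumption about what a ‘power QALY’ looks like, and the weight of 0.8 is hypothetical; only the power of 0.415 comes from the summary above.

```python
def linear_qaly(weight, duration):
    """Conventional QALYs under CPTO: value scales one-for-one with time."""
    return weight * duration


def power_qaly(weight, duration, power=0.415):
    """Assumed power form: value grows with time, but at a decreasing rate."""
    if duration < 0:
        raise ValueError("duration must be non-negative")
    return weight * duration ** power


WEIGHT = 0.8  # hypothetical health state weight

for years in (1, 5, 10, 20):
    print(
        f"{years:>2} years: linear = {linear_qaly(WEIGHT, years):.2f}, "
        f"power = {power_qaly(WEIGHT, years):.2f}"
    )

# Doubling the duration doubles a linear QALY but not a power QALY.
linear_ratio = linear_qaly(WEIGHT, 10) / linear_qaly(WEIGHT, 5)
power_ratio = power_qaly(WEIGHT, 10) / power_qaly(WEIGHT, 5)
print(f"Ratio of 10 years to 5 years: linear {linear_ratio:.2f}, power {power_ratio:.2f}")
```

Under this assumed form, any power below 1 means that each additional unit of time adds less value than the one before, which is the decreasing marginal value of lifespan that the saturated model pointed to.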

A systematic review of studies comparing the measurement properties of the three-level and five-level versions of the EQ-5D. PharmacoEconomics [PubMed] Published 23rd March 2018

The debate about the EQ-5D-5L continues (on Twitter, at least). Conveniently, this paper addresses a concern held by some people – that we don’t understand the implications of using the 5L descriptive system. The authors systematically review papers comparing the measurement properties of the 3L and 5L, written in English or German. The review ended up including 24 studies. The measurement properties that were considered by the authors were: i) distributional properties, ii) informativity, iii) inconsistencies, iv) responsiveness, and v) test-retest reliability. The last property involves consideration of index values. Each study was also quality-assessed, with all being considered of good to excellent quality. The studies covered numerous countries and different respondent groups, with sample sizes from the tens to the thousands. For most measurement properties, the findings for the 3L and 5L were very similar. Floor effects were generally below 5% and tended to be slightly reduced for the 5L. In some cases, the 5L was associated with major reductions in the proportion of people responding as 11111 – a well-recognised ceiling effect associated with the 3L. Just over half of the studies reported on informativity using Shannon’s H’ and Shannon’s J’. The 5L provided consistently better results. Only three studies looked at responsiveness, with two slightly favouring the 5L and one favouring the 3L. The latter could be explained by the use of the 3L-5L crosswalk, which is inherently less responsive because it is a crosswalk. The overarching message is consistency. Business as usual. This is important because it means that the 3L and 5L descriptive systems provide comparable results (which is the basis for the argument I recently made that they are measuring the same thing). In some respects, this could be disappointing for 5L proponents because it suggests that the 5L descriptive system is not a lot better than the 3L. But it is a little better. This study demonstrates that there are still uncertainties about the differences between 3L and 5L assessments of health-related quality of life. More comparative studies, of the kind included in this review, should be conducted so that we can better understand the differences in results that are likely to arise now that we have moved (relatively assuredly) towards using the 5L instead of the 3L.
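For anyone who hasn’t come across the informativity measures, Shannon’s H’ captures how much information a dimension’s responses carry, and Shannon’s J’ rescales it by the maximum possible given the number of levels. The sketch below uses the standard definitions with base-2 logarithms; the response counts are invented by me for a single dimension and are not taken from any of the reviewed studies.

```python
import math


def shannon_h(counts):
    """Shannon's H': -sum(p * log2(p)) over the levels that were used."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("need at least one response")
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def shannon_j(counts):
    """Shannon's J': H' divided by its maximum, log2(number of levels)."""
    return shannon_h(counts) / math.log2(len(counts))


# Hypothetical responses to one dimension, from level 1 (no problems) upwards.
three_level = [60, 35, 5]
five_level = [45, 25, 18, 9, 3]

for label, counts in (("3L", three_level), ("5L", five_level)):
    print(f"{label}: H' = {shannon_h(counts):.3f}, J' = {shannon_j(counts):.3f}")
```

H’ will tend to be higher for the 5L simply because there are more levels to spread across, which is why J’ is reported alongside it: it shows how evenly the available levels are actually being used.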

Preference-based measures to obtain health state utility values for use in economic evaluations with child-based populations: a review and UK-based focus group assessment of patient and parent choices. Quality of Life Research [PubMed] Published 21st March 2018

Calculating QALYs for kids continues to be a challenge. One of the challenges is the choice of which preference-based measure to use. Part of the problem here is that the EuroQol group – on which we rely for measuring adult health preferences – has been a bit slow. There’s the EQ-5D-Y, which has been around for a while, but it wasn’t developed with any serious thought about what kids value and there still isn’t a value set for the UK. So, if we use anything, we use a variety of measures. In this study, the authors review the use of generic preference-based measures. 45 papers are identified, including 5 different measures: HUI2, HUI3, CHU-9D, EQ-5D-Y, and AQOL-6D. No prizes for guessing that the EQ-5D (adult version) was the most commonly used measure for child-based populations. Unfortunately, the review is a bit of a disappointment. And I’m not just saying that because at least one study on which I’ve worked isn’t cited. The search strategy is likely to miss many (perhaps most) trial-based economic evaluations with children, for which cost-utility analyses don’t usually get a lot of airtime. It’s hard to see how a review of this kind is useful if it isn’t comprehensive. But the goal of the paper isn’t just to summarise the use of measures to date. The focus is on understanding when researchers should use self- or proxy-response, and when a parent-child dyad might be most useful. The literature review can’t do much to answer that question except to assert that the identified studies tended to use parent-proxy respondents. But the study also reports on some focus groups, which are potentially more useful. These were conducted as part of a wider study relating to the design of an RCT. In five focus groups, participants were presented with the EQ-5D-Y and the CHU-9D. It isn’t clear why these two measures were selected. The focus groups included parents and some children over the age of 11. Unfortunately, there’s no real (qualitative) analysis conducted, so the findings are limited. Parents expressed concern about a lack of sensitivity. Naturally, they thought that they knew best and should be the respondents. Of the young people reviewing the measures themselves, the EQ-5D-Y was perceived as more straightforward in referring to tangible experiences, whereas the CHU-9D’s severity levels were seen as more representative. Older adolescents tended to prefer the CHU-9D. The youths weren’t as sure of themselves as the adults and, though they expressed concern about their parents not understanding how they feel, they were generally neutral about who ought to respond. The older kids wanted to speak for themselves. The paper provides a good overview of the different measures, which could be useful for researchers planning data collection for child health utility measurement. But due to the limitations of the review and the lack of analysis of the focus groups, the paper isn’t able to provide any real guidance.

IVF and the evaluation of policies that don’t affect particular persons

Over at the CLAHRC West Midlands blog, Richard Lilford (my boss, I should hasten to add!) writes about the difficulties with the economic evaluation of IVF. The post notes that there are a number of issues that “are not generally considered in the standard canon for health economic assessment” including the problems with measuring benefits, choosing an appropriate discount rate, indirect beneficiaries, and valuing the life of the as yet unborn child. Au contraire! These issues are the very bread and butter of health economics and economic evaluation research. But I would concede that their impact on estimates of cost-effectiveness is not nearly well enough integrated into standard assessments.

We’ve covered the issue of choosing a social discount rate on this blog before with regard to treatments with inter-generational effects. I want instead to consider the last point about how we should, in the most normative of senses, consider the life of the child born as a result of IVF.

It puts me in mind of the work of the late, great Derek Parfit. He could be said to have single-handedly developed the field of ethics about future people. He identified a number of ethical problems that still often don’t have satisfactory answers. Decisions like funding IVF have an impact on the very existence of persons. But these decisions do not affect the well-being or rights of any particular persons; rather, they affect what Parfit terms general persons. Few would deny that we have moral obligations not to cause material harm to future generations. Most would reject the ‘narrow person-centred view’ that the only relevant outcomes are those that affect actual, particular persons. For example, in considering the problem of global warming, we do not dismiss its consequences for future generations as irrelevant. But there remains the question of how we morally treat these general, future persons. Parfit calls this the non-identity problem and it applies neatly to the issue of IVF.

To illustrate the problem for IVF, consider the choice:

If we choose A: Adam and Barbara will not have children; Charles will not exist.
If we choose B: Adam and Barbara will have a child; Charles will live to 70.

If we ignore evidence that suggests quality of life actually declines after one has children, we can assume that having children will in fact raise Adam and Barbara’s quality of life, since they are fulfilling their preferences. It would then seem clear that Charles existing and living a healthy life is better than him not existing at all, and so the net benefit of Choice B is greater. But then consider the next choice:

If we choose A: Adam and Barbara will not have children; Charles will not exist; Dianne will not exist.
If we choose B: Adam and Barbara will have a child; Charles will live to 70; Dianne will not exist.
If we choose C: Adam and Barbara will have children; Charles will live to 40; Dianne will live to 40.

Now, Choice C would seem to be preferable to Choice B if all life years have the same quality of life, since it yields 80 life years in total rather than 70. But we could continue adding children with shorter and shorter life expectancies until we have a large population that lives a very short life, which is certainly not a morally superior position. This is a version of Parfit’s repugnant conclusion, in which general utilitarian principles lead us to prefer a very large population with a very low quality of life to a smaller, better off one. No satisfying solution has yet been proposed. For IVF this might imply increasing the probability of multiple births!
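The arithmetic behind this can be sketched in a few lines. Choices A, B and C are those described above; choices D and E are hypothetical extensions that I have added to show where the logic leads, and the sketch assumes, as above, that every life year has the same quality of life.

```python
# Lifespans (in years) of the children who exist under each choice.
choices = {
    "A": [],
    "B": [70],
    "C": [40, 40],
    "D (hypothetical)": [25, 25, 25, 25],
    "E (hypothetical)": [12] * 10,
}


def total_life_years(lifespans):
    """What a simple sum of life years would count as the benefit."""
    return sum(lifespans)


def life_years_per_person(lifespans):
    """How long each person who exists actually lives, on average."""
    return sum(lifespans) / len(lifespans) if lifespans else 0.0


for name, lifespans in choices.items():
    print(
        f"{name}: {len(lifespans)} children, "
        f"total = {total_life_years(lifespans)}, "
        f"per person = {life_years_per_person(lifespans):.1f}"
    )

# Ranking purely on the total favours the largest, shortest-lived population.
ranked_by_total = sorted(
    choices, key=lambda name: total_life_years(choices[name]), reverse=True
)
print("Ranked by total life years:", ranked_by_total)
```

Each step from B to E raises the total while shortening every individual life, which is the pattern that makes the conclusion ‘repugnant’.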

We can also consider the “opposite” of IVF, contraception. In providing contraception we are superficially choosing Choice A above, which by the same utilitarian reasoning would be a worse situation than one in which those children are born. However, contraception is often used to delay fertility decisions, so the choice is actually between a child being born earlier and living a worse life, and a child being born later in better circumstances. So for a couple, things would go worse for the general person who is their first child if things are worse for the particular person who is actually their first child. So it clearly matters how we frame the question as well.

We have a choice about how to weigh up the different situations if we reject the ‘narrow person-centred view’. On a ‘no difference view’, the effects on general and particular persons are weighted the same. On a ‘two-tier view’, the effects on general persons matter only a fraction as much as those on particular persons. For IVF this relates to how we weight Charles’s (and Dianne’s) life in an evaluation. But current practice is ambiguous about how we weigh up these lives and, if we hold a ‘two-tier view’, about how we weight the lives of general persons.
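One way to see how much rides on this choice is to treat each view as a weight on the benefits to general persons. This is my own simplification rather than anything from Parfit: the function name, the two-tier weight of 0.5 and the benefit figures are all hypothetical.

```python
def weighted_benefit(particular, general, general_weight):
    """Benefits to particular persons plus down-weighted benefits to general persons."""
    if not 0.0 <= general_weight <= 1.0:
        raise ValueError("general_weight must be between 0 and 1")
    return particular + general_weight * general


# Each view expressed as a weight on general persons (0.5 is arbitrary).
VIEWS = {
    "narrow person-centred": 0.0,
    "two-tier": 0.5,
    "no difference": 1.0,
}

# Hypothetical IVF example, in life years of benefit.
PARENTS_GAIN = 2.0   # particular persons: Adam and Barbara
CHILD_GAIN = 70.0    # general person: Charles

results = {
    view: weighted_benefit(PARENTS_GAIN, CHILD_GAIN, weight)
    for view, weight in VIEWS.items()
}

for view, benefit in results.items():
    print(f"{view}: {benefit:.1f}")
```

With numbers like these, the estimated benefit of IVF varies enormously depending on the view taken, which is why ambiguity about the weight matters so much in practice.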

From an economic perspective, we often consider the values we place on benefits resulting from decisions to be determined by societal preferences. Generally, we ignore the fact that for many treatments the actual beneficiaries do not yet exist, which would suggest a ‘no difference view’. For example, when assessing the benefits of providing a treatment for childhood leukaemia, we don’t value the benefits to those particular children who have the disease differently to those general persons who may have the disease in the future. Perhaps we do not consider this since the provision of the treatment does not cause a difference in who will exist in the future. But equally, when assessing the effects of interventions that may cause, in a counterfactual sense, changes in fertility decisions and the existence of persons, like social welfare payments or a lifesaving treatment for a woman of childbearing age, we do not think about the effects on the general persons who may be a child of that person or household. This would then suggest a ‘narrow person-centred view’.

There is clearly some inconsistency in how we treat general persons. For IVF evaluations, in particular, many avoid this question altogether and just estimate the cost per successful pregnancy, leaving the weighing up of benefits to later decision-makers. While the arguments clearly don’t point to a particular conclusion, my tentative position would be a ‘no difference view’. At any rate, it is an open question. In my rare lectures, I often remark that we spend a lot more time on empirical questions than on questions of normative economics. This example shows how this can result in inconsistencies in how we choose to analyse and report our findings.
