Chris Sampson’s journal round-up for 23rd April 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

What should we know about the person behind a TTO? The European Journal of Health Economics [PubMed] Published 18th April 2018

The time trade-off (TTO) is a staple of health state valuation. Ask someone to value a health state with respect to time and – hey presto! – you have QALYs. This editorial suggests that completing a TTO can be a difficult task for respondents and that, more importantly, individuals’ characteristics may determine the way that they respond and therefore the nature of the results. One of the most commonly demonstrated differences, in this respect, is that people tend to value their own health states more highly than the same states valued hypothetically. But this paper focuses on indirect (hypothetical) valuations. The authors highlight mixed evidence for the influence of age, gender, marital status, having children, education, income, expectations about the future, and one’s own health state. But why should we try to find out more about respondents when conducting TTOs? The authors offer 3 reasons: i) to inform sampling, ii) to inform the design and standardisation of TTO exercises, and iii) to inform the analysis. I agree – we need to better understand these sources of heterogeneity. Not to over-engineer responses, but to aid our interpretation, even if we want societally representative valuations that include all of these variations in response behaviour. TTO valuation studies should collect data relating to the individual respondents. Unfortunately, the paper doesn’t specify what those data should be, so the research question in the title isn’t really answered. But maybe that’s something the authors have in hand.
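
For the uninitiated, the arithmetic behind a conventional TTO is worth seeing once. A minimal sketch, for a state considered better than dead (worse-than-dead states need a modified protocol, e.g. lead-time TTO):

```latex
% The respondent is indifferent between t years in state h (followed
% by death) and x years in full health (followed by death), x <= t:
u(h) = \frac{x}{t}
% e.g. indifference between 10 years in h and 7 years in full health
% gives u(h) = 7/10 = 0.7
```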

Computer modeling of diabetes and its transparency: a report on the eighth Mount Hood Challenge. Value in Health Published 9th April 2018

The Mount Hood Challenge is a get-together for people working on the (economic) modelling of diabetes. The subject of the 2016 meeting was transparency, with two specific goals: i) to evaluate the transparency of two published studies, and ii) to develop a diabetes-specific checklist for the transparent reporting of modelling studies. Participants were tasked (in advance of the meeting) with replicating the two published studies and using the replicated models to evaluate some pre-specified scenarios. Both studies had serious shortcomings in their reporting of the data necessary for replication, including the baseline characteristics of the population. Five modelling groups replicated the first model and seven groups replicated the second. Naturally, the different groups made different assumptions about what should be used in place of the missing data. For the first paper, none of the models provided results that matched the original. Not even close. And the differences between the results of the replications – in terms of costs incurred and complications avoided – were huge. Performance was a bit better on the second paper, but hardly worth celebrating. In general, the findings were fear-confirming. Informed by these findings, the Diabetes Modeling Input Checklist was created, designed to complement existing, more general reporting checklists. It includes specific data requirements for the reporting of modelling studies, relating to the simulation cohort, treatments, costs, utilities, and model characteristics. If you’re doing any modelling in diabetes, you should have this paper to hand.
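
To see why unreported baseline data matters so much, here’s a toy illustration (my own invention, not one of the Mount Hood models): two replication teams fill in a missing baseline characteristic with different guesses and get very different results out of the same risk equation. All numbers below are made up.

```python
# Toy replication exercise: the 'published' model omitted baseline
# HbA1c, so each team substitutes its own guess. Purely illustrative.

def complications_per_1000(baseline_hba1c: float, years: int = 10) -> float:
    """Stylised risk equation: annual complication risk rises with HbA1c."""
    annual_risk = 0.01 * (baseline_hba1c - 5.0)  # invented relationship
    survivors, events = 1000.0, 0.0
    for _ in range(years):
        new_events = survivors * annual_risk
        events += new_events
        survivors -= new_events
    return events

print(complications_per_1000(7.5))  # team A's guess: ~224 events
print(complications_per_1000(9.0))  # team B's guess: ~335 events
```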

Setting dead at zero: applying scale properties to the QALY model. Medical Decision Making [PubMed] Published 9th April 2018

In health state valuation, whether or not a state is considered ‘worse than dead’ is heavily dependent on methodological choices. This paper reviews the literature to answer two questions: i) what are the reasons for anchoring at dead=0, and ii) how does the position of ‘dead’ on the utility scale affect decision making? The authors took a standard systematic approach to identify literature from databases, with 7 papers included. The authors then discuss scale properties and the idea that there are interval scales (such as temperature) and ratio scales (such as distance). The difference between these is the meaningfulness of the reference point (or origin): you can talk about distance doubling, but you can’t talk about temperature doubling, because 0 metres is not arbitrary, whereas 0 degrees Celsius is. The paper summarises some of the arguments put forward for using dead=0. They aren’t compelling. The authors argue that the duration part of the QALY (i.e. time) needs to have ratio properties for the QALY model to function. Time obviously holds this property and it’s clear that duration can be anchored at zero. The authors then demonstrate that, for the QALY model to work, the health-utility scale must also exhibit ratio scale properties. The basis for this is the assumption that zero duration nullifies health states and that ‘dead’ nullifies duration. But the paper doesn’t challenge the conceptual basis for using dead in health state valuation exercises. Rather, it considers the mathematical properties that must hold to allow for dead=0, and asserts them. The authors’ conclusion that dead “needs to have the value of 0 in a QALY model” is correct, but only within the existing restrictions and assumptions underlying current practice. Nevertheless, this is a very useful study for understanding the challenge of anchoring and explicating the assumptions underlying the QALY model.
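
A small worked example (mine, not the paper’s) shows why interval properties aren’t enough. If utility were only interval-scaled, any transformation u' = au + b would be admissible, but QALY orderings wouldn’t survive it:

```latex
% Profile A: 10 years at u = 0.5. Profile B: 6 years at u = 0.9.
\text{Under } u: \quad 0.5 \times 10 = 5.0 < 0.9 \times 6 = 5.4
% so B is preferred; but under the admissible interval transform
% u' = u + 0.5:
\text{Under } u': \quad 1.0 \times 10 = 10.0 > 1.4 \times 6 = 8.4
% and the ordering flips. Only ratio transformations u' = a \cdot u,
% which fix the zero (dead = 0), leave QALY comparisons intact.
```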

Alastair Canaway’s journal round-up for 28th August 2017

Valuing health-related quality of life: an EQ-5D-5L value set for England. Health Economics [PubMed] Published 22nd August 2017

With much anticipation, the new EQ-5D-5L value set has officially been published. For over 18 months we’ve had access to values via the OHE’s discussion paper, but the formal peer-reviewed paper has (I imagine) been in publication purgatory. This paper presents the value set for the new(ish) EQ-5D-5L measure. The study used the internationally agreed hybrid model, combining TTO and DCE data, to generate values for the 3125 health states. It’s worth noting that the official values are marginally different to those in the discussion paper, although in practice this is likely to have little impact on results. Important features of the new value set include fewer health states worse than dead (5.1% vs over 33%) and a higher minimum value (-0.285 vs -0.594). I’d always been a bit suspicious of the worse-than-dead values for the 3L measure, so this, if anything, is encouraging. It does, however, have important implications, primarily for interventions targeting those in the worst health states, where potential gains may now be smaller. Many of us are actively using the EQ-5D-5L within trials and have been eagerly awaiting this value set. Perhaps naively, I always anticipated that, with more levels and an improved algorithm, it would naturally supersede the 3L and its outdated value set upon publication. Unfortunately, to mark the release of the new value set, NICE released a ‘position statement’ [PDF] regarding the choice of measure and value sets for the NICE reference case. NICE specifies that i) the 5L value set is not recommended for use, ii) the EQ-5D-3L with the original UK TTO value set is recommended, and if both measures are included then the 3L should be preferred, iii) if the 5L measure is included, then scores should be mapped to the EQ-5D-3L using the van Hout et al algorithm, iv) NICE supports the use of the EQ-5D-5L generally to collect data on quality of life, and v) NICE will review this decision in August 2018 in light of future evidence. So, unfortunately, for the next year at least, we will either be sticking to the original 3L measure or mapping from the 5L. I suspect NICE is buying some time, as transitioning to the 5L is going to raise lots of interesting issues, e.g. if a treatment is cost-effective according to the 3L but not the 5L (or vice versa), and the comparability of new 5L results with old 3L results. Interesting times lie ahead. As a final note, it’s worth reading the OHE blog post outlining the position statement and OHE’s plans to satisfy NICE.
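
For anyone who hasn’t applied a value set before, the mechanics are simple. Here’s a hedged sketch of applying an additive 5L value set – the decrements below are placeholders I’ve invented, not the published England tariff:

```python
# Sketch of applying an additive EQ-5D-5L value set. The decrements
# are invented for illustration -- use the published tariff in real work.

ILLUSTRATIVE_DECREMENTS = {
    # dimension: decrements for levels 2-5 (level 1 = no problems = 0)
    "mobility":           [0.05, 0.08, 0.20, 0.27],
    "self_care":          [0.05, 0.08, 0.16, 0.20],
    "usual_activities":   [0.04, 0.06, 0.15, 0.19],
    "pain_discomfort":    [0.06, 0.07, 0.25, 0.28],
    "anxiety_depression": [0.07, 0.11, 0.26, 0.30],
}

def utility(profile: str) -> float:
    """profile is a 5-digit string, one level per dimension, e.g. '21345'."""
    u = 1.0
    for digit, decrements in zip(profile, ILLUSTRATIVE_DECREMENTS.values()):
        level = int(digit)
        if level > 1:
            u -= decrements[level - 2]
    return u

print(utility("11111"))  # 1.0: full health by construction
print(utility("55555"))  # -0.24 here: the worst state can fall below zero
```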

Long-term QALY-weights among spouses of dependent and independent midlife stroke survivors. Quality of Life Research [PubMed] Published 29th June 2017

For many years, spillover impacts were largely ignored within economic evaluation. There is now increased interest in capturing wider impacts: the NICE reference case recommends including carer impacts where relevant, whilst the US Panel on Cost-Effectiveness in Health and Medicine now advocates the inclusion of other affected parties. This study sought to examine whether the dependency of midlife stroke survivors impacted on their spouses’ HRQL, as measured using the SF-6D. An OLS approach was used, controlling for covariates (age, sex and education, amongst others). Spouses of dependent stroke survivors had lower utility (0.69) than spouses of independent survivors (0.77). This has interesting implications for economic evaluation. For example, if a treatment were to prevent dependence, there could be large QALY gains to spouses. Spillover impacts are clearly important. If we are to broaden the evaluative scope, as suggested by NICE and the US Panel, to include spillover impacts, then work is vital in terms of identifying relevant contexts, measuring spillover impacts, and understanding their implications within economic evaluation. This remains an important area for future research.
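
To picture the analysis, here’s a rough sketch of that kind of regression on simulated data (everything below is simulated; the built-in decrement only loosely echoes the paper’s 0.69 vs 0.77 difference):

```python
# Sketch of the paper's OLS approach: spouse SF-6D utility regressed
# on survivor dependency plus covariates. Data are simulated.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 200
dependent = rng.integers(0, 2, n)   # 1 = stroke survivor is dependent
age = rng.normal(55, 8, n)
female = rng.integers(0, 2, n)
education = rng.integers(0, 3, n)   # illustrative schooling bands

# Build in a ~0.08 utility decrement for spouses of dependent survivors
utility = (0.77 - 0.08 * dependent - 0.001 * (age - 55)
           + rng.normal(0, 0.05, n)).clip(0, 1)

X = sm.add_constant(np.column_stack([dependent, age, female, education]))
print(sm.OLS(utility, X).fit().params)  # 'dependent' coefficient ~ -0.08
```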

Conducting a discrete choice experiment study following recommendations for good research practices: an application for eliciting patient preferences for diabetes treatments. Value in Health Published 7th August 2017

To finish this week’s round-up, I thought it’d be helpful to signpost this article on conducting DCEs, which may be particularly useful for researchers embarking on their first DCE. The article doesn’t do anything particularly radical or make ground-breaking discoveries. What it does do, however, is provide a practical guide that walks you through each step of the DCE process, following the ISPOR guidelines/checklist. Furthermore, it expands upon the ISPOR checklist to provide researchers with a further resource to consider when conducting DCEs. The case study relates to measuring patient preferences for type 2 diabetes mellitus medications. For every item on the ISPOR checklist, the authors explain how they made the choices they did, and what influenced them. The paper goes through the entire process, from identifying the research question all the way through to presenting results and discussion (for those interested in diabetes: it turns out people have a preference for immediate consequences and a high discount rate for future benefits). For people who are keen to conduct a DCE and find a worked example easier to follow, this paper, alongside the ISPOR guidelines, is definitely one to add to your reference manager.
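
If it helps to see the estimation end of a DCE, here’s a minimal sketch on simulated data (attributes and coefficients are my inventions, loosely echoing the diabetes case study). With two alternatives per choice task, the conditional logit reduces to a binary logit on attribute differences:

```python
# Simulate paired-choice DCE data and recover preference weights.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n_tasks = 1000

# Attribute differences between alternatives A and B (invented)
d_efficacy = rng.normal(0, 1, n_tasks)  # e.g. HbA1c reduction
d_hypos = rng.normal(0, 1, n_tasks)     # hypoglycaemia risk
d_cost = rng.normal(0, 1, n_tasks)      # out-of-pocket cost

true_beta = np.array([1.0, -0.8, -0.5])
X = np.column_stack([d_efficacy, d_hypos, d_cost])
p_choose_a = 1 / (1 + np.exp(-(X @ true_beta)))
choose_a = (rng.random(n_tasks) < p_choose_a).astype(int)

# No constant term: it cancels out when attributes are differenced
print(sm.Logit(choose_a, X).fit(disp=0).params)  # ~ (1.0, -0.8, -0.5)
```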

Chris Sampson’s journal round-up for 17th October 2016

Estimating health-state utility for economic models in clinical studies: an ISPOR Good Research Practices Task Force report. Value in Health [PubMed] Published 3rd October 2016

When it comes to model-based cost-per-QALY analyses, researchers normally just use utility values from a single clinical study. So we’d best be sure that these studies are collecting the right data. This ISPOR Task Force report presents guidelines for the collection and reporting of utility values in the context of clinical studies, with a view to making them as useful as possible to the modelling process. The recommendations are quite general and would apply to most aspects of clinical studies: do some early planning; make sure the values are relevant to the population being modelled; bear HTA agencies’ expectations in mind. It bothers me, though, that the basis for the recommendations is not very concrete (the word “may” appears more than 100 times). The audience for this report isn’t so much people building models, or people conducting clinical trials, but rather people who are conducting some modelling within a clinical study (or vice versa). I’m in that position, so why don’t the guidelines strike me as useful? They expect a lot of time to be dedicated to the development of the model structure and aims before the clinical study gets underway, meaning that modelling work would be conducted alongside the full duration of a clinical study. In my experience, that isn’t how things usually work. And even when that does happen, practical limitations to data collection will thwart the satisfaction of the vast majority of the recommendations. In short, I think the Task Force’s position puts the cart on top of the horse. Models require data and, yes, models can be used to inform data collection. But seldom can proposed modelling work be the principal basis for determining data collection in a clinical study. I think that may be a good thing, and that a more incremental approach (review – model – collect data – repeat) is more fruitful. Having said all that, and having read the paper, I do think it’s useful – not as the set of recommendations we might expect from an ISPOR Task Force, but as a list of things to think about if you’re somebody involved in the collection of health state utility data. If you’re one of those people, it’s well worth a read.

Reliability, validity, and feasibility of direct elicitation of children’s preferences for health states: a systematic review. Medical Decision Making [PubMed] Published 30th September 2016

Set aside for the moment the question of whose preferences we ought to use in valuing health improvements. There are undoubtedly situations in which it would be interesting and useful to know patients’ preferences. What if those patients are children? This study presents the findings of a systematic review of attempts at direct elicitation of preferences from children, focusing on psychometric properties and with the hope of identifying the best approach. To be included in the review, studies needed to report validity, reliability and/or feasibility. 26 studies were included, most of them using time trade-off (n=14) or standard gamble (n=11). 7 studies reported validity, and the findings suggested good construct validity with condition-specific but not generic measures. 4 studies reported reliability, and TTO came off better than visual analogue scales. 9 studies reported on feasibility in terms of completion rates and generally found it to be high. The authors also extracted information about the use of preference elicitation in different age groups, and the studies making such comparisons suggested that it may not be appropriate for younger children. Generally speaking, it seems that standard gamble and time trade-off are acceptably valid, reliable and feasible. It’s important to note that there was a lot of potential for bias in the included studies, and that a number of them seemed somewhat lacking in their reporting. And there’s a definite risk of publication and reporting bias lurking here. A key issue that the study can’t really enlighten us on is the question of age. There might not be all that much difference between a 17-year-old and a 27-year-old, but there’s going to be a big difference between a 17-year-old and a 7-year-old. Future research needs to investigate the notion of an age threshold for valid preference elicitation. I’d like to see a more thorough quantitative analysis of findings from direct preference elicitation studies in children. But what we really need is a big new study in which children (both patients and the general public) are asked to complete various direct preference elicitation tasks at multiple time points. Because right now, there just isn’t enough evidence.
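
For reference, the most common task in the review works like this (the standard formulation, not specific to any included study; the TTO counterpart, u(h) = x/t, is sketched earlier in this document):

```latex
% Standard gamble: the respondent is indifferent between state h for
% certain and a gamble giving full health with probability p or death
% with probability 1 - p. The utility is then simply
u(h) = p
```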

Economic evaluation of integrated new technologies for health and social care: suggestions for policy makers, users and evaluators. Social Science & Medicine [PubMed] Published 24th September 2016

There are many debates that take place at the nexus of health care and social care, whether about funding, costs or outcome measurement. This study focuses on a specific example of health and social care integration – assisted living technologies (ALTs) – and tries to come up with a new and more appropriate method of economic evaluation. In this context, outcomes might matter ‘beyond health’. I should like this paper. It tries to propose an approach that might satisfy the suggestions I made in a recent essay. Why, then, am I not convinced? The authors outline their proposal as consisting of 3 steps: i) identify attributes relevant to the intervention, ii) value these in monetary terms, and iii) value the health benefit. In essence, the plan is to estimate QALYs for the health bit and a monetary valuation for the other bits, with the ‘other bits’ specified in advance of the evaluation. That’s very easily said and not at all easily done. And the paper makes no argument that this is actually what we ought to be doing. Capabilities work their way in as attributes, but little consideration is given to the normative differences between this and other approaches (what I have termed ‘consequents’). The focus on ALTs is odd: the authors fill a lot of space arguing (unconvincingly) that ALTs are a special case, before stating that their approach should be generalisable. The main problem can be summarised by a sentence that appears in the introduction: “the approach is highly flexible because the use of a consistent numeraire (either monetary or health) means that programmes can be compared even if the underlying attributes differ”. Maybe they can, but they shouldn’t. Or at least that’s what a lot of people think, which is precisely why we use QALYs. An ‘anything goes’ approach means that any intervention could easily be demonstrated to be more cost-effective than another if we just pick the right attributes. I’m glad to see researchers trying to tackle these problems, and this could be the start of something important, but I was disappointed that this paper couldn’t offer anything concrete.
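
One way to write down what the authors seem to be proposing (the notation is mine, not the paper’s) is a net monetary benefit with pre-specified non-health attributes bolted on:

```latex
% lambda: threshold value of a QALY; a_k: non-health attributes (e.g.
% capabilities) valued at willingness-to-pay w_k; C: costs.
NMB = \lambda \cdot \Delta \text{QALY} + \sum_{k} w_k \cdot \Delta a_k - \Delta C
% The worry raised above: with a free choice of the a_k, almost any
% programme can be engineered to look cost-effective.
```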
