Chris Sampson’s journal round-up for 27th January 2020

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

A general framework for classifying costing methods for economic evaluation of health care. The European Journal of Health Economics [PubMed] Published 20th January 2020

When it comes to health state valuation and quality of life, I’m always very concerned about the use of precise terminology, and it bugs me when people get things wrong. But when it comes to costing methods, I’m pretty shoddy. So I’m pleased to see this very useful paper, which should help us all to gain some clarity in our reporting of costing studies.

The authors start out by clearly distinguishing between micro-costing and gross-costing in the identification of costs and between top-down and bottom-up valuation of these costs. I’m ashamed to say that I had never properly grasped the four distinct approaches that can be adopted based on these classifications, but the authors make it quite clear. Micro-costing means detailed identification of cost components, while gross-costing considers resource use in aggregate. Top-down methods use expenditure data collected at the organisational level, while bottom-up approaches use patient-level data.
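To make the distinction concrete, here is a minimal sketch (all resource categories, unit costs, and figures are invented for illustration) of how a bottom-up micro-costing and a top-down gross-costing exercise might each arrive at a cost per patient:

```python
# A minimal, hypothetical sketch of two of the four approaches described above.
# All numbers and resource categories are invented for illustration.

# Bottom-up micro-costing: patient-level resource use x unit costs, itemised.
unit_costs = {"nurse_hour": 35.0, "gp_visit": 39.0, "inpatient_day": 586.0}
patient_resource_use = [
    {"nurse_hour": 2, "gp_visit": 1, "inpatient_day": 0},
    {"nurse_hour": 1, "gp_visit": 3, "inpatient_day": 2},
]

bottom_up_micro = [
    sum(quantity * unit_costs[item] for item, quantity in patient.items())
    for patient in patient_resource_use
]
print(bottom_up_micro)  # cost per patient, built up from individual components

# Top-down gross-costing: divide aggregate departmental expenditure by activity.
department_expenditure = 1_250_000.0  # annual spend reported by the finance team
annual_patient_episodes = 4_200
top_down_gross = department_expenditure / annual_patient_episodes
print(round(top_down_gross, 2))  # one average cost per episode, no component detail
```

The first approach preserves patient-level variation and component detail; the second yields a single average cost from aggregate expenditure, which is often all that routinely prepared accounting data can support.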

A key problem is that our language – as health economists – is in several respects contradictory to the language used by management accountants. It’s the accountants who are usually preparing the cost information that we might use in analyses, and these data are not normally prepared for the types of analysis that we wish to conduct, so there is a lot that can go awry. Perhaps most important is that financial accounting is not concerned with opportunity costs. The authors provide a kind of glossary of terms that can support translation between the two contexts, as well as a set of examples of the ways in which the two contexts differ. They also point out the importance of different accounting practices in different countries and the ways in which these might necessitate adjustment in costing methods for economic evaluation.

The study includes a narrative review of costing studies in order to demonstrate the sorts of errors in terminology that can arise and the lack of clarity that results. The studies included in the review provide examples of the different approaches to costing, though no study is identified as ‘bottom-up gross-costing’. One of the most useful contributions of the paper is to provide two methodological checklists, one for top-down and one for bottom-up costing studies. If you’re performing, reviewing, or in any way making use of costing studies, this will be a handy reference.

Health state values of deaf British Sign Language (BSL) users in the UK: an application of the BSL version of the EQ-5D-5L. Applied Health Economics and Health Policy [PubMed] Published 16th January 2020

The BSL translation of the EQ-5D is like no other. It is to be used – almost exclusively – by people who have a specific functional health impairment. For me, this raises questions about whether or not we can actually consider it simply a translation of the EQ-5D and compare values with other translations in the way we would any other language. This study uses data collected during the initial development and validation of the EQ-5D-5L BSL translation. The authors compared health state utility values from Deaf people (BSL users) with a general population sample from the Health Survey for England.

As we might expect, the Deaf sample reported a lower mean utility score (0.78) than the general population (0.84). Several other health measures were used in the BSL study. A staggering 43% of the Deaf participants had depression and a lot of the analysis in the paper is directed towards comparing the groups with and without psychological distress. The authors conduct some simple regression analyses to explore what might be the determinants of health state utility values in the Deaf population, with long-standing physical illness having the biggest impact.

I had hoped that the study might be able to tell us a bit more about the usefulness of the BSL version of the EQ-5D-5L, because the EQ-5D has previously been shown to be insensitive to hearing problems. The small sample (<100) can’t tell us a great deal on its own, so it’s a shame that there isn’t some attempt at matching with individuals from the Health Survey for England for the sake of comparison. Using the crosswalk from the EQ-5D-3L to obtain 5L values is also a problem, as it limits the responsiveness of index values. Nevertheless, it’s good to see data relating to this under-represented population.

A welfare-theoretic model consistent with the practice of cost-effectiveness analysis and its implications. Journal of Health Economics [PubMed] Published 11th January 2020

There are plenty of good reasons to deviate from a traditional welfarist approach to cost-benefit analysis in the context of health care, as health economists have debated for decades. But it is nevertheless important to understand the ways in which cost-effectiveness analysis, as we conduct it, deviates from welfarism, and to aim for some kind of consistency in our handling of different issues. This paper attempts to draw together disparate subjects of discussion on the theoretical basis for aspects of cost-effectiveness analysis. The author focuses on issues relating to the inclusion of future (unrelated) costs, to discounting, and to consistency with welfarism, in the conduct of cost-per-QALY analyses. The implications are given consideration with respect to adopting a societal perspective, recognising multiple budget holders, and accounting for distributional impacts.

All of this is based on the description of an intertemporal utility model and a model of medical care investment. The model hinges especially on how we understand consumption to be affected by our ambition to maximise QALYs. For instance, the author argues that, once we consider time preferences in an overall utility function, we don’t need to worry about differential discounting in health and consumption. The various implications of the model are compared to the recommendations of the Second Panel on Cost-Effectiveness in Health and Medicine. In general, the model supports the recommendations of the Panel, where others have been critical. As such, it sets out some of the theoretical basis for those recommendations. It also implies other recommendations, not considered by the Panel. For example, the optimal cost-effectiveness threshold is likely to be higher than GDP per capita.
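To illustrate the flavour of that claim (a stylised sketch under my own assumptions, not the paper's actual model), applying one discount factor to an overall period utility of consumption and health means the two are necessarily discounted at the same rate:

```python
# A stylised sketch (not the paper's model): if time preference enters through a
# single overall utility function, one discount factor applies to the whole
# period utility of consumption and health. The 3% rate and the functional form
# of period utility are illustrative assumptions.
def discounted_utility(consumption, health, rho=0.03, period_utility=lambda c, h: c * h):
    return sum(
        period_utility(c, h) / (1 + rho) ** t
        for t, (c, h) in enumerate(zip(consumption, health))
    )

# Ten periods of constant consumption and a 0.8 quality of life, discounted jointly.
print(round(discounted_utility([1.0] * 10, [0.8] * 10), 2))
```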

It’s difficult to judge the validity of the framework from a first read. The paper is dense with theoretical exposition. My first instinct is ‘so what?’. One of the great things about the practice of cost-effectiveness analysis in health care is that it isn’t constrained by restrictive theoretical frameworks, and so the very idea of a unified theoretical framework is a bit worrying to me. But my second thought is that this is a valuable paper, as it attempts to gather up several loose threads. Whether or not these can be gathered up within a welfarist framework is debatable, but the exercise is revealing. I suspect this paper will help to trigger further inquiry, which can only be a good thing.

Registered reports: time to radically rethink peer review in health economics. PharmacoEconomics – Open [PubMed] Published 23rd January 2020

As a discipline, health economics isn’t great when it comes to publication practices. We excel in neither the open access culture of medical sciences nor the discussion paper culture of economics proper. In this article, the authors express concern about publication bias, and the fact that health economics journals – and health economists in general – aren’t doing much to combat it. In fairness to the discipline, there isn’t really any evidence that publication bias abounds. But that isn’t really the point. We should be able to prove and ensure that it doesn’t if we want our research to be seen as credible.

One (partial) solution to publication bias is the adoption – by journals – of registered reports. Under such a system, researchers would submit study protocols to journals for peer review. If the journal were satisfied with the methods then they could guarantee to publish the study once the results are in, regardless of how sexy the results may or may not be. The authors of this paper identify the prevalence of studies in major health economics journals that could benefit from registered reports. These would be prospectively designed experimental or quasi-experimental studies. It seems that there are plenty.

I’ve used this blog in the past to propose more transparent research practices and to complain about publication practices in health economics generally, while others have complained about the use of p-values in our discipline. The adoption of registered reports is one tactic that could bring improvements and I hope it will be given proper consideration by those in a position to enact change.

Credits

Chris Sampson’s journal round-up for 14th May 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

A practical guide to conducting a systematic review and meta-analysis of health state utility values. PharmacoEconomics [PubMed] Published 10th May 2018

I love articles that outline the practical application of a particular method to solve a particular problem, especially when the article shares analysis code that can be copied and adapted. This paper does just that for the case of synthesising health state utility values. Decision modellers use utility values as parameters. Most of the time these are drawn from a single source, which almost certainly introduces some kind of bias to the resulting cost-effectiveness estimates. So it’s better to combine all of the relevant available information. But that’s easier said than done, as numerous researchers (myself included) have discovered.

This paper outlines the various approaches and some of the merits and limitations of each. There are some standard stages, for which advice is provided, relating to the identification, selection, and extraction of data. Those are by no means simple tasks, but the really tricky bit comes when you try to pool the utility values that you’ve found. The authors outline three strategies: i) fixed effect meta-analysis, ii) random effects meta-analysis, and iii) mixed effects meta-regression. Each is illustrated with a hypothetical example, with Stata and R commands provided. Broadly speaking, the authors favour mixed effects meta-regression because of its ability to identify the extent of similarity between sources and to help explain heterogeneity.

The authors insist that comparability between sources is a precondition for pooling. But the thing about health state utility values is that they are – almost by definition – never comparable. Different population? Not comparable. Different treatment pathway? No chance. Different utility measure? Ha! They may or may not appear to be similar statistically, but that’s totally irrelevant. What matters is whether the decision-maker ‘believes’ the values. If they believe them then they should be included and pooled. If decision-makers have reason to believe one source more or less than another then this should be accounted for in the weighting. If they don’t believe them at all then they should be excluded. Comparability is framed as a statistical question, when in reality it is a conceptual one. For now, researchers will have to tackle that themselves.

This paper doesn’t solve all of the problems around meta-analysis of health state utility values, but it does a good job of outlining methodological developments to date and provides recommendations in accordance with them.
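To make the pooling step concrete, here is a minimal sketch (not the authors’ code; the means and standard errors are invented for illustration) of fixed effect and DerSimonian-Laird random effects pooling of utility values. The mixed effects meta-regression that the authors favour extends the random effects model by adding study-level covariates.

```python
# A minimal sketch of fixed-effect and DerSimonian-Laird random-effects pooling
# of health state utility values. The means and standard errors are invented.
import numpy as np

means = np.array([0.71, 0.68, 0.75, 0.62])  # reported mean utilities per study
se = np.array([0.02, 0.03, 0.04, 0.05])     # their standard errors
variances = se ** 2

# Fixed-effect pooling: inverse-variance weights.
w_fe = 1.0 / variances
pooled_fe = np.sum(w_fe * means) / np.sum(w_fe)

# DerSimonian-Laird estimate of between-study variance (tau^2).
k = len(means)
q = np.sum(w_fe * (means - pooled_fe) ** 2)
c = np.sum(w_fe) - np.sum(w_fe ** 2) / np.sum(w_fe)
tau2 = max(0.0, (q - (k - 1)) / c)

# Random-effects pooling: weights incorporate tau^2.
w_re = 1.0 / (variances + tau2)
pooled_re = np.sum(w_re * means) / np.sum(w_re)
pooled_re_se = np.sqrt(1.0 / np.sum(w_re))

print(f"fixed effect: {pooled_fe:.3f}, random effects: {pooled_re:.3f} (SE {pooled_re_se:.3f})")
```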

Unemployment, unemployment duration, and health: selection or causation? The European Journal of Health Economics [PubMed] Published 3rd May 2018

One of the major socioeconomic correlates of poor health is unemployment. It appears not to be very good for you. But there’s an obvious challenge here – does unemployment cause ill-health, or are unhealthy people just more likely to be unemployed? Both, probably, but that answer doesn’t make for clear policy solutions. This paper – following a large body of literature – attempts to explain what’s going on. Its novelty comes in the way the author considers timing and distinguishes between mental and physical health.

The basis for the analysis is that selection into unemployment by the unhealthy ought to imply time-constant effects of unemployment on health, whereas a negative causal effect of unemployment on health ought to grow over time. Using seven waves of data from the German Socio-economic Panel, a sample of 17,000 people (chopped from 48,000) is analysed, of which around 3,000 experienced unemployment. The basis for measuring mental and physical health is summary scores from the SF-12. A fixed-effects model is constructed based on the dependence of health on the duration and timing of unemployment, rather than just the occurrence of unemployment per se.

The author finds a cumulative effect of unemployment on physical ill-health over time, implying causation. This is particularly pronounced for people unemployed in later life, and there was essentially no impact on physical health for younger people. The longer people spent unemployed, the more their health deteriorated. This was accompanied by a strong long-term selection effect of less physically healthy people being more likely to become unemployed. In contrast, for mental health, the findings suggest a short-term selection effect of people who experience a decline in mental health being more likely to become unemployed. But then, following unemployment, mental health declines further, so the balance of selection and causation effects is less clear. In contrast to physical health, people’s mental health is more badly affected by unemployment at younger ages.

By no means does this study settle the balance between selection and causation. It can’t account for people’s anticipation of unemployment or future ill-health. But it does provide inspiration for better-targeted policies to limit the impact of unemployment on health.
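As a rough illustration of the identification logic (a stylised sketch with invented data and variable names, not the author’s specification), a within-person fixed-effects regression strips out stable individual traits, so any remaining association between unemployment duration and health cannot be driven by time-constant selection:

```python
# A stylised sketch of a within-person fixed-effects regression of health on
# unemployment duration. Column names and data are hypothetical; one row per
# person-wave, as in a seven-wave panel.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
panel_df = pd.DataFrame({
    "person_id": np.repeat(np.arange(200), 7),
    "wave": np.tile(np.arange(7), 200),
})
panel_df["unemployment_months"] = rng.poisson(2, len(panel_df))
panel_df["sf12_physical"] = 50 - 0.3 * panel_df["unemployment_months"] + rng.normal(0, 5, len(panel_df))

# Within transformation: demean each variable by person, so time-constant traits
# that drive both health and unemployment drop out of the estimation.
demeaned = panel_df.groupby("person_id")[["sf12_physical", "unemployment_months"]].transform(
    lambda x: x - x.mean()
)

# OLS on the demeaned data gives the fixed-effects estimate of the duration effect.
x = demeaned["unemployment_months"].to_numpy()
y = demeaned["sf12_physical"].to_numpy()
beta = np.sum(x * y) / np.sum(x ** 2)
print(f"estimated effect of an extra month unemployed: {beta:.3f} SF-12 points")
```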

Different domains – different time preferences? Social Science & Medicine [PubMed] Published 30th April 2018

Economists are often criticised by non-economists. Usually, the criticisms are unfounded, but one of the ways in which I think some (micro)economists can have tunnel vision is in thinking that preferences elicited with respect to money exhibit the same characteristics as preferences about things other than money. My instinct tells me that – for most people – that isn’t true. This study looks at one of those characteristics of preferences – namely, time preferences. Unfortunately for me, it suggests that my instincts aren’t correct.

The authors outline a quasi-hyperbolic discounting model, incorporating both short-term present bias and long-term impatience, to explain gym members’ time preferences in the health and monetary domains. A survey was conducted with members of a chain of fitness centres in Denmark, of which 1,687 responded. Half were allocated to money-related questions and half to health-related questions. Respondents were asked to match an amount of future gains with an amount of immediate gains to provide a point of indifference. Health problems were formulated as back pain, with an EQ-5D-3L level 2 for usual activities and a level 2 for pain or discomfort.

The findings were that estimates for discount rates and present bias in the two domains are different, but not by very much. On average, discount rates are slightly higher in the health domain – a finding driven by female respondents and people with more education. Present bias is the same – on average – in each domain, though retired people are more present biased for health.

The authors conclude by focussing on the similarity between health and monetary time preferences, suggesting that time preferences in the monetary domain can safely be applied in the health domain. But I’d still be wary of this. For starters, one would expect a group of gym members – who have all decided to join the gym – to be relatively homogeneous in their time preferences. Findings are similar on average, and there are only small differences in subgroups, but when it comes to health care (even public health) we’re never dealing with average people. Targeted interventions are increasingly needed, which means that differential discount rates in the health domain – of the kind identified in this study – should be brought into focus.
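For readers unfamiliar with the setup, here is a minimal sketch of a quasi-hyperbolic (‘beta-delta’) discount function and how an elicited indifference point relates to it; the parameter values are illustrative, not the paper’s estimates:

```python
# A minimal sketch of the quasi-hyperbolic discount function described above.
# Parameter values are illustrative assumptions, not the paper's estimates.

def quasi_hyperbolic_weight(t, beta=0.9, delta=0.99):
    """Weight placed on a gain received t periods (here, months) from now.

    beta < 1 captures present bias (an extra penalty on anything delayed at all);
    delta captures standard long-run impatience.
    """
    return 1.0 if t == 0 else beta * delta ** t

# An indifference point of the kind elicited in the survey: a respondent who
# matches 100 units now with 130 units in 12 months implies a 12-month weight
# of 100/130, which can be compared with the model's prediction.
implied_weight = 100 / 130
print(round(implied_weight, 3), round(quasi_hyperbolic_weight(12), 3))
```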

Credits


Chris Sampson’s journal round-up for 2nd April 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Quality-adjusted life-years without constant proportionality. Value in Health Published 27th March 2018

The assumption of constant proportional trade-offs (CPTO) is at the heart of everything we do with QALYs. It assumes that the value of a health state is independent of the time spent in it, so that a state is worth the same per unit of time regardless of its duration. This assumption has been repeatedly demonstrated to fail. This study looks for a non-constant alternative, which hasn’t been done before.

The authors consider a quality-adjusted lifespan and four functional forms for the relationship between time and the value of life: constant, discount, logarithmic, and power. These relationships were tested in an online survey with more than 5,000 people, which involved the completion of 30-40 time trade-off pairs based on the EQ-5D-5L. Respondents traded off health states of varying severities and durations. Initially, a saturated model (making no assumptions about functional form) was estimated. This demonstrated that the marginal value of lifespan is decreasing. The authors provide a set of values attached to different health states at different durations.

Then, the econometric model is adjusted to suit a power model, with the power estimated for duration expressed in days, weeks, months, or years. The power value for time is 0.415, but different expressions of time could introduce bias; time expressed in days (power=0.403) loses value faster than time expressed in years (power=0.654). There are also some anomalies that arise from the data that don’t fit the power function. For example, a single day of moderate problems can be worse than death, whereas 7 days or more is not.

Using ‘power QALYs’ could be the future. But the big remaining question is whether decision-makers ought to respond to people’s time preferences in this way.
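One plausible reading of the power form (a sketch under my own assumptions, with only the quoted power values taken from the summary above) is that the value of spending duration t in a state with quality weight q is q multiplied by t raised to a power below one, so extra lifespan adds value at a decreasing rate:

```python
# A minimal sketch of a power form for the value of duration; the functional
# form and the quality weights are illustrative assumptions.

def power_qaly(q, t, alpha):
    return q * t ** alpha

# Ten years in full health, with the power for time expressed in years quoted above.
print(round(power_qaly(1.0, 10, alpha=0.654), 2))

# Under constant proportionality (alpha = 1), doubling duration doubles value;
# under the power form, it adds less than double.
print(power_qaly(0.7, 20, alpha=1.0) / power_qaly(0.7, 10, alpha=1.0))               # 2.0
print(round(power_qaly(0.7, 20, alpha=0.654) / power_qaly(0.7, 10, alpha=0.654), 2))  # < 2
```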

A systematic review of studies comparing the measurement properties of the three-level and five-level versions of the EQ-5D. PharmacoEconomics [PubMed] Published 23rd March 2018

The debate about the EQ-5D-5L continues (on Twitter, at least). Conveniently, this paper addresses a concern held by some people – that we don’t understand the implications of using the 5L descriptive system. The authors systematically review papers comparing the measurement properties of the 3L and 5L, written in English or German. The review ended up including 24 studies. The measurement properties that were considered by the authors were: i) distributional properties, ii) informativity, iii) inconsistencies, iv) responsiveness, and v) test-retest reliability. The last property involves consideration of index values. Each study was also quality-assessed, with all being considered of good to excellent quality. The studies covered numerous countries and different respondent groups, with sample sizes from the tens to the thousands.

For most measurement properties, the findings for the 3L and 5L were very similar. Floor effects were generally below 5% and tended to be slightly reduced for the 5L. In some cases, the 5L was associated with major reductions in the proportion of people responding as 11111 – a well-recognised ceiling effect associated with the 3L. Just over half of the studies reported on informativity using Shannon’s H’ and Shannon’s J’. The 5L provided consistently better results. Only three studies looked at responsiveness, with two slightly favouring the 5L and one favouring the 3L. The latter could be explained by the use of the 3L-5L crosswalk, which is inherently less responsive because it maps 5L responses onto the coarser set of 3L values.

The overarching message is consistency. Business as usual. This is important because it means that the 3L and 5L descriptive systems provide comparable results (which is the basis for the argument I recently made that they are measuring the same thing). In some respects, this could be disappointing for 5L proponents because it suggests that the 5L descriptive system is not a lot better than the 3L. But it is a little better. This study demonstrates that there are still uncertainties about the differences between 3L and 5L assessments of health-related quality of life. More comparative studies, of the kind included in this review, should be conducted so that we can better understand the differences in results that are likely to arise now that we have moved (relatively assuredly) towards using the 5L instead of the 3L.
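For readers unfamiliar with the informativity measures, here is a minimal sketch (with invented response counts) of how Shannon’s H’ and the evenness index J’ are typically computed for a single dimension:

```python
# A minimal sketch of Shannon's H' (informativity) and J' (evenness) for the
# response distribution on a single dimension; the counts are invented.
import numpy as np

def shannon_h_j(counts, n_levels):
    """H' = -sum p*log2(p) over observed levels; J' = H' / log2(n_levels)."""
    p = np.asarray(counts, dtype=float)
    p = p / p.sum()
    p = p[p > 0]  # empty levels contribute nothing to H'
    h = -np.sum(p * np.log2(p))
    return h, h / np.log2(n_levels)

# Hypothetical responses on one dimension for the 3L and 5L versions.
h3, j3 = shannon_h_j([600, 300, 100], n_levels=3)
h5, j5 = shannon_h_j([400, 250, 200, 100, 50], n_levels=5)
print(f"3L: H'={h3:.2f}, J'={j3:.2f}   5L: H'={h5:.2f}, J'={j5:.2f}")
```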

Preference-based measures to obtain health state utility values for use in economic evaluations with child-based populations: a review and UK-based focus group assessment of patient and parent choices. Quality of Life Research [PubMed] Published 21st March 2018

Calculating QALYs for kids continues to be a challenge. One of the challenges is the choice of which preference-based measure to use. Part of the problem here is that the EuroQol Group – on which we rely for measuring adult health preferences – has been a bit slow. There’s the EQ-5D-Y, which has been around for a while, but it wasn’t developed with any serious thought about what kids value and there still isn’t a value set for the UK. So, if we use anything, we use a variety of measures.

In this study, the authors review the use of generic preference-based measures. 45 papers are identified, including 5 different measures: HUI2, HUI3, CHU-9D, EQ-5D-Y, and AQOL-6D. No prizes for guessing that the EQ-5D (adult version) was the most commonly used measure for child-based populations. Unfortunately, the review is a bit of a disappointment. And I’m not just saying that because at least one study on which I’ve worked isn’t cited. The search strategy is likely to miss many (perhaps most) trial-based economic evaluations with children, for which cost-utility analyses don’t usually get a lot of airtime. It’s hard to see how a review of this kind is useful if it isn’t comprehensive.

But the goal of the paper isn’t just to summarise the use of measures to date. The focus is on understanding when researchers should use self- or proxy-response, and when a parent-child dyad might be most useful. The literature review can’t do much to guide that question except to assert that the identified studies tended to use parent-proxy respondents. But the study also reports on some focus groups, which are potentially more useful. These were conducted as part of a wider study relating to the design of an RCT. In five focus groups, participants were presented with the EQ-5D-Y and the CHU-9D. It isn’t clear why these two measures were selected. The focus groups included parents and some children over the age of 11.

Unfortunately, there’s no real (qualitative) analysis conducted, so the findings are limited. Parents expressed concern about a lack of sensitivity. Naturally, they thought that they knew best and should be the respondents. Of the young people reviewing the measures themselves, the EQ-5D-Y was perceived as more straightforward in referring to tangible experiences, whereas the CHU-9D’s severity levels were seen as more representative. Older adolescents tended to prefer the CHU-9D. The young people weren’t as sure of themselves as the adults and, though they expressed concern about their parents not understanding how they feel, they were generally neutral about who ought to respond. The older kids wanted to speak for themselves.

The paper provides a good overview of the different measures, which could be useful for researchers planning data collection for child health utility measurement. But due to the limitations of the review and the lack of analysis of the focus groups, the paper isn’t able to provide any real guidance.

Credits