Chris Sampson’s journal round-up for 30th September 2019

Every Monday our authors provide a round-up of some of the most recently published peer-reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

A need for change! A coding framework for improving transparency in decision modeling. PharmacoEconomics [PubMed] Published 24th September 2019

We’ve featured a few papers in recent round-ups that (I assume) will be included in an upcoming themed issue of PharmacoEconomics on transparency in modelling. It’s shaping up to be a good one. The value of transparency in decision modelling has been recognised, but simply making the stuff visible is not enough – it needs to make sense. The purpose of this paper is to help make that achievable.

The authors highlight that the writing of analyses, including coding, involves personal style and preferences. To aid transparency, we need a systematic framework of conventions that make the inner workings of a model understandable to any (expert) user. The paper describes a framework developed by the Decision Analysis in R for Technologies in Health (DARTH) group. The DARTH framework builds on a set of core model components, generalisable to all cost-effectiveness analyses and model structures. There are five components – i) model inputs, ii) model implementation, iii) model calibration, iv) model validation, and v) analysis – and the paper describes the role of each. Importantly, the analysis component can be divided into several parts relating to, for example, sensitivity analyses and value of information analyses.

Based on this framework, the authors provide recommendations for organising and naming files and on the types of functions and data structures required. The recommendations build on conventions established in other fields and in the use of R generally. The authors recommend the implementation of functions in R, and relate general recommendations to the context of decision modelling. We’re also introduced to unit testing, which will be unfamiliar to most Excel modellers but which can be implemented relatively easily in R. The roles of various tools are introduced, including RStudio, R Markdown, Shiny, and GitHub.
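
To give a flavour of what unit testing means in this context, here’s a minimal sketch using the testthat package. The model-input function and the test are my own illustrative inventions, not code from the DARTH framework itself.

```r
library(testthat)

# Hypothetical model-input function: builds a 2-state transition
# probability matrix from named parameters (not DARTH's actual API)
build_transition_matrix <- function(p_healthy_to_sick, p_sick_to_healthy) {
  matrix(c(1 - p_healthy_to_sick, p_healthy_to_sick,
           p_sick_to_healthy,     1 - p_sick_to_healthy),
         nrow = 2, byrow = TRUE,
         dimnames = list(c("Healthy", "Sick"), c("Healthy", "Sick")))
}

# Unit test: probabilities must be valid and each row must sum to 1
test_that("transition matrix is well-formed", {
  m <- build_transition_matrix(0.15, 0.05)
  expect_true(all(m >= 0 & m <= 1))
  expect_equal(unname(rowSums(m)), c(1, 1))
})
```

A test like this catches a mistyped probability the moment it is introduced, which is exactly the kind of safety net that spreadsheet-based models tend to lack.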

The real value of this work lies in the linked R packages and other online material, which you can use to test out the framework and consider its application to whatever modelling problem you might have. The authors provide an example using a basic Sick-Sicker model, which you can have a play with using the DARTH packages. In combination with the online resources, this is a valuable paper that you should have to hand if you’re developing a model in R.
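
If you haven’t met the Sick-Sicker model before, the sketch below shows a bare-bones cohort trace of the same general flavour. To be clear, this is my own toy version with invented transition probabilities, not the DARTH packages’ implementation.

```r
# Toy 4-state Sick-Sicker-style cohort model: Healthy (H), Sick (S1),
# Sicker (S2), Dead (D). All probabilities are invented for illustration.
states <- c("H", "S1", "S2", "D")
n_cycles <- 30

# Per-cycle transition probability matrix (each row sums to 1)
P <- matrix(c(0.845, 0.150, 0.000, 0.005,   # from H
              0.500, 0.380, 0.105, 0.015,   # from S1
              0.000, 0.000, 0.950, 0.050,   # from S2
              0.000, 0.000, 0.000, 1.000),  # from D (absorbing)
            nrow = 4, byrow = TRUE, dimnames = list(states, states))

# Markov trace: the cohort's distribution across states at each cycle
trace <- matrix(NA, nrow = n_cycles + 1, ncol = length(states),
                dimnames = list(0:n_cycles, states))
trace[1, ] <- c(1, 0, 0, 0)                      # everyone starts Healthy
for (t in 1:n_cycles) trace[t + 1, ] <- trace[t, ] %*% P

head(round(trace, 3))
```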

Accounts from developers of generic health state utility instruments explain why they produce different QALYs: a qualitative study. Social Science & Medicine [PubMed] Published 19th September 2019

It’s well known that different preference-based measures of health will generate different health state utility values for the same person. Yet, they continue to be used almost interchangeably. For this study, the authors spoke to people involved in the development of six popular measures: QWB, 15D, HUI, EQ-5D, SF-6D, and AQoL. Their goal was to understand the bases for the development of the measures and to explain why the different measures should give different results.

At least one original developer for each instrument was recruited, along with people involved at later stages of development. Semi-structured interviews were conducted with 15 people, with questions on the background, aims, and criteria for the development of the measure, and on the descriptive system, preference weights, performance, and future development of the instrument.

Five broad topics were identified as being associated with differences in the measures: i) knowledge sources used for conceptualisation, ii) development purposes, iii) interpretations of what makes a ‘good’ instrument, iv) choice of valuation techniques, and v) the context for the development process. The online appendices provide some useful tables that summarise the differences between the measures. The authors distinguish between measures based on ‘objective’ definitions of health (QWB) and those based on items that people found important (15D). Some prioritised sensitivity (AQoL, 15D), others prioritised validity (HUI, QWB), and several focused on pragmatism (SF-6D, HUI, 15D, EQ-5D). Some instruments had modest goals and opportunistic processes (EQ-5D, SF-6D, HUI), while others had grand goals and purposeful processes (QWB, 15D, AQoL). The use of some measures (EQ-5D, HUI) extended far beyond what the original developers had anticipated. In short, different measures were developed with quite different concepts and purposes in mind, so it’s no surprise that they give different results.

This paper provides some interesting accounts and views on the process of instrument development. It might prove most useful in understanding different measures’ blind spots, which can inform the selection of measures in research, as well as future development priorities.

The emerging social science literature on health technology assessment: a narrative review. Value in Health Published 16th September 2019

Health economics provides a good example of multidisciplinarity, with economists, statisticians, medics, epidemiologists, and plenty of others working together to inform health technology assessment. But I still don’t understand what sociologists are talking about half of the time. Yet, it seems that sociologists and political scientists are busy working on the big questions in HTA, as demonstrated by this paper’s 120 references. So, what are they up to?

This article reports on a narrative review, based on 41 empirical studies. Three broad research themes are identified: i) what drove the establishment and design of HTA bodies? ii) what has been the influence of HTA? and iii) what have been the social and political influences on HTA decisions? Some have argued that HTA is inevitable, while others have argued that there are alternative arrangements. Either way, no two systems are the same and it is not easy to explain the differences. It’s important to understand HTA in the context of other social tendencies and trends, recognising that HTA both influences and is influenced by these. The authors provide a substantial discussion of the role of stakeholders in HTA and the potential for some to attempt to game the system. Uncertainty abounds in HTA; it necessitates negotiation and limits the extent to which HTA can rely on objectivity and rationality.

Something lacking is a critical history of HTA as a discipline and the question of what HTA is actually good for. There’s also not a lot of work out there on culture and values, which contrasts with medical sociology. The authors suggest that sociologists and political scientists could be more closely involved in HTA research projects. I suspect that such a move would be more challenging for the economists than for the sociologists.


Chris Sampson’s journal round-up for 19th December 2016


Discounting the recommendations of the Second Panel on Cost-Effectiveness in Health and Medicine. PharmacoEconomics [PubMed] Published 9th December 2016

I do enjoy a bit of academic controversy. In this paper, renowned troublemakers Paulden, O’Mahony and McCabe do what they do best. Their target is the approach to discounting recommended in the report of the Second Panel on Cost-Effectiveness in Health and Medicine, which I briefly covered in a recent round-up. This paper starts out by describing what – exactly – the Panel recommends. The real concerns lie with the approach recommended for analyses from the societal perspective. According to the authors, the problems start when the Panel conflates the marginal utility of income with that of consumption, and confusingly labels it with our old friend the lambda. The confusion continues with the use of other imprecise terminology. And then there are some aspects of the Panel’s calculations that just seem to be plain old errors, resulting in illogical results – for example, that future consumption should be discounted more heavily if it is associated with higher marginal utility. Eh? The core criticism is that the Panel recommends the same discount rate for both costs and the consumption value of health, and that this contradicts recent developments in the discounting literature. The Panel fails to clearly explain the basis for its recommendation. Helpfully, the authors outline an alternative (correct?) approach. The 3% rate for costs and health effects that the Panel recommends is not justified. The criticisms made in this paper are technical ones. That doesn’t make them any less important, but all we can see is that use of the Panel’s recommended decision rule poses some vague threat to utility maximisation. Whether or not the conflation of consumption and utility value would actually result in bad decisions is not clear. Nevertheless, considering the massive influence of the original Gold Panel that will presumably be enjoyed by the Second Panel, extreme scrutiny is needed. I hope Basu and Ganiats see fit to respond. I also wonder whether Paulden, O’Mahony and McCabe might have other chapters in their crosshairs.
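
To see why the choice of rate matters, it’s worth doing the arithmetic on a simple stream of health effects. The sketch below is mine, not the authors’: 3% is the Panel’s recommended rate, while 1.5% stands in for the kind of lower health-effects rate that proponents of differential discounting have argued for.

```r
# Present value of a constant stream of 1 QALY per year over 40 years,
# discounted at the Panel's 3% versus an illustrative lower 1.5% rate
horizon <- 40
years <- 1:horizon

pv <- function(annual, rate) sum(annual / (1 + rate)^years)

pv_3pct  <- pv(1, 0.030)   # ~23.1 discounted QALYs
pv_15pct <- pv(1, 0.015)   # ~29.9 discounted QALYs
round(c(panel_3pct = pv_3pct, lower_15pct = pv_15pct), 1)
```

The same stream of health gains is worth around 30% more under the lower rate, which is why the choice of discounting approach is anything but a technicality.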

Is best–worst scaling suitable for health state valuation? A comparison with discrete choice experiments. Health Economics [PubMed] Published 4th December 2016

BWS is gaining favour as a means of valuing health states. In this paper, team DCE throw down the gauntlet to team BWS. The study uses data collected during the development of a ‘glaucoma utility index’, in which both DCE and BWS exercises were completed. The first question is, do DCE and BWS give the same results? The answer is no. The models indicate relatively weak correlation. For most dimensions, BWS gave values for different severity levels that were closer together than the DCE did. This means that large improvements in health might be associated with smaller utility gains using BWS values than using DCE values. BWS is also identified as being more prone to decision biases. The second question is, which technique is best ‘to develop health utility indices’ (as the authors put it)? We need to bear in mind that this may in part be moot. Proponents of BWS have often claimed that they are not even trying to measure utility, so to judge BWS on this basis may not be appropriate. Anyway, set aside for now the fact that your own definition of utility might be (and that the authors’ almost certainly is) at odds with the BWS approach. It’s no surprise that the authors suggest that DCE is superior. The bases on which this judgement is made are stability, monotonicity, continuity, and completeness, all of which relate to whether respondents give the kinds of responses we might expect. BWS answers are found to be less stable, more likely to be non-continuous, and less likely to satisfy monotonicity. Personally, I don’t see these as objective indicators of a technique’s goodness or of its ability to identify ‘true’ preferences. Also, I don’t know anything about how the glaucoma measure was developed, but if the health states it defines aren’t very informative then the results of this study won’t be either. Nevertheless, the findings do suggest to me that health state valuation using BWS might be subject to more caveats, which need investigating before we start to make greater use of the technique. The much larger body of research behind DCE counts in its favour. Over to you, team BWS.
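
Monotonicity is the easiest of those properties to picture: within a dimension, the utility decrement should grow in magnitude with severity. A toy check might look like the sketch below; the dimensions and coefficients are invented, not taken from the glaucoma study.

```r
# Invented coefficients for two dimensions, with level 1 as the reference.
# In the second dimension, level 3 violates monotonicity (its decrement
# is smaller in magnitude than level 2's).
coefs <- data.frame(
  dimension = rep(c("central_vision", "mobility"), each = 3),
  level     = rep(2:4, times = 2),
  decrement = c(-0.05, -0.12, -0.20,
                -0.04, -0.03, -0.15)
)

# TRUE if decrements never shrink in magnitude as severity increases
monotone <- tapply(coefs$decrement, coefs$dimension,
                   function(d) all(diff(d) <= 0))
monotone
```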

Preference weighting of health state values: what difference does it make, and why? Value in Health Published 23rd November 2016

When non-economists ask about the way we measure health outcomes, the crux of it all is that the EQ-5D et al are preference-based. We think – or at least have accepted – that preferences must be really very serious and important. Equal weighting of dimensions? Nothing but meaningless nonsense! That may well be true in theory, but what if our approach to preference elicitation is actually providing us with much the same results as if we were using equal weighting? Much research energy (and some money) goes into the preference weighting project, but could it be a waste of time? I had hoped that this paper might answer that question, but while it’s a useful study I didn’t find it quite so enlightening. The authors look at the EQ-5D-5L and the 15D, comparing the usual preference-based index for each with one constructed using equal weighting, rescaled to the 0-1 dead-to-full-health scale. The rescaling takes into account the difference in scale length between the 15D (0 to 1, a length of 1.000) and the EQ-5D-5L (-0.281 to 1, a length of 1.281). Data are from the Multi-Instrument Comparison (MIC) study, which includes healthy people as well as subsamples with a range of chronic diseases. The authors look at the correlations between the preference-based and equal-weighted index values. They find very high correlation, especially for the 15D, and agreement on the EQ-5D increases when adjusted for the scale length. Furthermore, the results are investigated for known-group validity alongside a depression-specific outcome measure, on which the EQ-5D performs a little better. But the study doesn’t really tell me what I want to know: would the use of equal weighting normally give us the same results, and in what cases might it not? The MIC study includes a whole range of generic and condition-specific measures, and I can’t see why the study didn’t look at all of them. It could also have used alternative preference weights to see how they differ. And it could have looked at all of the different disease-based subgroups in the sample to try to determine under what circumstances preference weighting might approach equal weighting. I hope to see more research on this issue, not to undermine preference weighting but to inform its improvement.
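
For concreteness, here is one plausible way to construct an equal-weighted, rescaled index for the EQ-5D-5L. This is my own sketch of the general idea, not necessarily the authors’ exact method; the -0.281 floor is the bottom of the value-set range quoted above.

```r
# Equal-weighted EQ-5D-5L index: each of the 5 dimensions contributes
# equally, and the score is mapped linearly so that the best state
# (11111) scores 1 and the worst state (55555) scores the floor
equal_weight_eq5d5l <- function(levels, floor = -0.281) {
  stopifnot(length(levels) == 5, all(levels %in% 1:5))
  severity <- sum(levels - 1) / (5 * 4)   # 0 for 11111, 1 for 55555
  1 - severity * (1 - floor)
}

equal_weight_eq5d5l(c(1, 1, 1, 1, 1))   # best state: 1
equal_weight_eq5d5l(c(5, 5, 5, 5, 5))   # worst state: -0.281
equal_weight_eq5d5l(c(2, 1, 3, 2, 1))   # a moderate state: ~0.74
```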


The ‘Q’ in the QALY: are we fudging it?

Some recent research from the Centre for Health Economics at Monash University has quantified something that we are all aware of: fudging in the measurement of health-related quality of life. They have found that, on average, randomly changing from one health-related quality of life measure to another changes cost-effectiveness results by 41%. This is clearly huge.

Health-related quality of life?

I am of the opinion that health-related quality of life is not something that, at least in any objective way, actually exists. The extent to which health-related aspects of life affect overall quality of life differs across people, places and time. Discrepancies can become apparent on two levels:

  1. What we perceive as dimensions of health may or may not affect an individual’s subjective level of overall health, and
  2. The relative importance of health in defining overall quality of life, compared with other aspects of life, can vary. This issue has been addressed in relation to adaptation.

These discrepancies translate into an inconsistency in the valuation processes we currently use. The people from whom values are being elicited are seeking to maximise utility (at least, this is what we assume), while the researcher’s chosen maximand is health-related quality of life. This means that any dimension of the chosen measure that can affect non-health-related quality of life will skew the results. As such, we end up with a fudge that combines (objective) health characteristics and (subjective) preferences. I believe that, eventually, we will have to settle on a stricter definition of the ‘Q’ in the QALY, and that this will have to be based entirely in either objective health or (subjective) utility.

Health

An approach to measuring ‘health’ would not be entirely dissimilar to our current approach, but an ‘objective’ health measure would have to be more comprehensive than the EQ-5D, SF-6D, AQoL and other similar measures. Of existing measures, the 15D comes closest. It could include items such as mobility, sensory function, pain, sexual function, fatigue, anxiety, depression and cognition, which the individual may or may not consider dimensions of their own health, but which could define health objectively. These would involve a level of subjectivity, in that they would be assessed by the individual, but they are less contextual; items such as self-care, emotion, usual activities and relationships, from current measures, are heavily influenced by the context in which they are completed. The instrument could then be valued using ranking exercises to establish a complete set of health states, ranked from best health to worst health. Dead can be equated to the worst possible health state, as all other outcomes are, in terms of health, an improvement upon death. If all valuations were completed relative to other health states, rather than to ‘death’, much of the distortion from non-health-related considerations would be removed.
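
As a toy illustration of that rank-based anchoring, the sketch below assigns values by normalised rank position, with dead tied to the worst state. The states and the linear mapping are my own inventions; a real exercise would aggregate rankings from many respondents and need not assume equal spacing.

```r
# Invented health states, already ranked from best to worst, with dead
# anchored to the worst state rather than valued directly
ranked_states <- c("no problems", "mild pain", "moderate pain + fatigue",
                   "severe pain + immobility", "worst state (= dead)")
n <- length(ranked_states)

# Linear rank-to-value mapping: best = 1, worst (and dead) = 0
values <- 1 - (seq_len(n) - 1) / (n - 1)
setNames(round(values, 2), ranked_states)
```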

I see no reason why the process should involve the elicitation of preferences. A health service does not seek to maximise utility. Evidence-based medicine does not aim to make people happier. Health care – particularly that which is publicly funded – should seek to maximise health, not health-related quality of life. If a person does not wish to improve their own health in a given way, they can choose not to consume that health care (so long as this is not detrimental to the health of others). For example, an individual may choose not to have a cochlear implant if their social network consists largely of deaf people [thanks, Scrubs]. Surely this should be the role of preferences in the allocation of health care.

Quality of life

At the other end of the scale we have a measure of general well-being. In some respects this is the easier approach, though there remain unanswered questions; for example, do we wish to measure present well-being or satisfaction with life? These approaches are simpler insofar as they require only one question, such as ‘how happy are you right now?’ or ‘how satisfied are you with your life overall?’. These questions should be posed to patients. Again, I do not see any benefit of using preferences or capturing decision utility in this case; experienced utility gives a better indication of the impact of health states upon quality of life. This approach could provide us with a measure of utility, so we could implement a cost-utility analysis (which is by no means what we do currently).

The two approaches described here could be used in conjunction. They would provide very different results, as the early findings from Monash demonstrate. A public health service should maintain health as its maximand, but other government departments or private individuals could provide funding for interventions that also benefit people in ways other than their health, or improve the rate at which individuals can derive utility from their health (e.g. education, social housing, social care).

I have little doubt that our current fudging approach – of maximising mainly-but-not-completely-health-related quality of life – is the best thing to do in the meantime, but I suspect it isn’t a long-term solution.

DOI: 10.6084/m9.figshare.1186883