Simon McNamara’s journal round-up for 24th June 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Manipulating the 5 dimensions of the EuroQol instrument: the effects on self-reporting actual health and valuing hypothetical health states. Medical Decision Making [PubMed] Published 4th June 2019

EQ-5D is the Rocky Balboa of health economics. A left-hook here, a jab there, vicious undercuts straight to the chin – it takes the hits, it never stays down. Every man and his dog is ganging up on it, yet, it still stands, proudly resolute in its undefeated record.

“When you are the champ,” it thinks to itself, “everyone wants a piece of you”. The door opens. Out of the darkness emerge four mysterious figures. “No… not…”, the instrument stumbles over its words. A bead of sweat rolls slowly down its glistening forehead. Its thumping heartbeat pierces the silence like a drum being thrashed by spear-wielding members of an ancient tribe. “It can’t be… No.” A clear, precise voice emerges from the darkness. “Taken at face value,” it states, “our results suggest that economic evaluations that use EQ-5D-5L are systematically biased.” EQ-5D stares blankly, its pupils dilated. It responds, “I’ve been waiting for you”. The gloom clears. Tsuchiya et al (2019) stand there proudly: “bring it on… punk”.

The first paper in this week’s round-up is a surgical probing of a sample of potential issues with EQ-5D. Whilst the above paragraph contains a fair amount of poetic licence (read: this is the product of an author who would rather be writing dystopian health-economics short stories than doing their actual work), this paper by Tsuchiya et al. does seem to land a number of strong blows squarely on the chin of EQ-5D. The authors employ a large discrete choice experiment (n=2,494 members of the UK general public) in order to explore the impact of three issues on the way people both report and value health. Specifically: (1) the order in which the five dimensions are presented; (2) the use of composite dimensions (dimensions that pool two things – e.g. pain or discomfort) rather than separate dimensions; (3) “bolting off” domains (the reverse of a bolt-on: removing domains from the EQ-5D).

If you are interested in these issues, I suggest you read the paper in full. In brief, the authors find that splitting anxiety/depression into two dimensions had a significant effect on the way people reported their health; that splitting level 5 of the pain/discomfort and anxiety/depression dimensions (e.g. I have extreme pain or discomfort) into individual dimensions significantly impacted the way people valued health; and that “bolting off” dimensions impacted valuation of the remaining dimensions. Personally, I think the composite domain findings are most interesting here. The authors find that extreme pain/discomfort is perceived as being a more severe state than extreme discomfort alone, and similarly, that being extremely depressed/anxious is perceived as a more severe state than simply being extremely anxious. The authors suggest this means the EQ-5D-5L may be systematically biased, as an individual who reports extreme discomfort (or anxiety) will have their health state valued based upon the composite domains for each of these, and subsequently have the severity of their health state over-estimated.

I like this paper, and think it has a lot to contribute to the refinement of EQ-5D, and the development of new instruments. I suggest the champ uses Tsuchiya et al as a sparring partner, gets back to the gym and works on some new moves – I sense a training montage coming on.

Methods for public health economic evaluation: A Delphi survey of decision makers in English and Welsh local government. Health Economics [PubMed] Published 7th June 2019

Imagine the government in your local city is considering a major new public health initiative. Politicians plan to demolish a number of out-of-date social housing blocks in deprived communities and build 10,000 new high-quality homes in their place. This will cost a significant amount of money and, as a result, you have been asked to conduct an economic evaluation of this intervention. How would you go about doing this?

This is clearly a complicated task. You are unlikely to find a randomised controlled trial on which to base your evaluation, the costs and benefits of the programme are likely to fall on multiple sectors, and you will likely have to balance health gains with a wide range of other non-health outcomes (e.g. reductions in crime). If you somehow managed to model the impact of the intervention perfectly, you would then be faced with the challenge of how to value these benefits. Equally, you would have to consider whether or not to weight the benefits of this programme more highly than programmes in alternative parts of the city, because it benefits people in deprived communities – note that inequalities in health seem to be a much larger issue in public health than in ‘normal health’ (i.e. the bread and butter of health economic evaluation). This complexity, and concern for inequalities, makes public health economic evaluation a completely different beast to traditional economic evaluation. This has led some to question the value of QALY-based cost-utility analysis in public health, and to calls for methods that better meet the needs of the field.

The second paper in this week’s round-up contributes to the development of these methods, by providing information on what public health decision makers in England and Wales think about different economic evaluation methodologies. The authors fielded an online, two-round, Delphi-panel study featuring 26 to 36 statements (rounds 1 and 2, respectively). For each statement, participants were asked to rate their level of agreement on a five-point scale (1 = strongly agree to 5 = strongly disagree). In the first round, participants (n=66) simply responded to the statements; in the second, they (n=29) were presented with the median response from the prior round, and asked to consider their response in light of this feedback. The statements tested covered a wide range of issues, including: the role distributional concerns should play in public health economic evaluation (e.g. economic evaluation should formally weight outcomes by population subgroup); the type of outcomes considered (e.g. economic evidence should use a single outcome that captures length of life and quality of life); and the budgets to be considered (e.g. economic evaluation should take account of the multi-sectoral budgets available).
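The mechanics of the round-two feedback step are simple enough to sketch in a few lines of code. Everything below is illustrative: the statements are paraphrased and the scores are invented, not taken from the study.

```python
from statistics import median

# Hypothetical round-1 data: statement -> list of Likert scores
# (1 = strongly agree ... 5 = strongly disagree). Scores are invented.
round1 = {
    "Formally weight outcomes by population subgroup": [1, 2, 2, 1, 3, 2],
    "Use a single outcome capturing length and quality of life": [4, 3, 5, 4, 2, 4],
}

# Round 2: each participant re-rates every statement after seeing the
# panel's median response from round 1.
feedback = {statement: median(scores) for statement, scores in round1.items()}

for statement, med in feedback.items():
    print(f"{statement}: round-1 median = {med}")
```

Feeding back the median (rather than the mean) is the standard Delphi choice, as it is robust to the odd extreme response on an ordinal scale.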

Interestingly, the decision-makers rejected the idea of focusing solely on maximising outcomes (the current norm for health economic evaluations), and supported placing an equal focus on minimising inequality and maximising outcomes. Furthermore, they supported formal weighting of outcomes by population subgroup and the use of multiple outcomes to capture health, wellbeing and broader outcomes, but did not support the use of a single outcome that captures well-being gain. These findings suggest cost-consequence analysis may provide a better fit to the needs of these decision makers than simply attempting to apply the QALY model in public health – particularly if augmented by some form of multi-criteria decision analysis (MCDA) that can reflect distributional concerns and allow comparison across outcome types. I think this is a great paper and expect to be citing it for years to come.

I AM IMMORTAL. Economic Inquiry [RePEc] Published 16th November 2016

I love this paper. It isn’t a recent one, but it hasn’t been covered in the AHE blog before, and I think everyone should know about it, so – luckily for you – it has made it into this week’s round-up.

In this groundbreaking work, Riccardo Trezzi fits a series of “state of the art”, complex, econometric models to his own electrocardiogram (ECG) signal – a measure of the electrical function of the heart. He then compares these models, identifies the one that best fits his data, and uses the model to predict his future ECG signal, and subsequently his life expectancy. This provides an astonishing result – “the n steps ahead forecast remains bounded and well above zero even after one googol period, implying that my life expectancy tends to infinite. I therefore conclude that I am immortal”.

I think this is genius. If you haven’t already realised the point of the paper by the time you have reached this part of my write-up, I suggest you think very carefully about the face-validity of this result. If you still don’t get it after that, have a look at the note on the front page – specifically the bit that says “this paper is intended to be a joke”. If you still don’t get it: the author measured his heart activity for 10 seconds, then applied lots of complex statistical methods which (obviously), when extrapolated, suggested his heart would keep beating forever, and subsequently that he would live forever.
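For anyone who wants to recreate the joke at home, here is a minimal sketch of the same trick: fit a mean-reverting autoregressive model to a short periodic “heartbeat” series and extrapolate. The signal and the AR(1) fit below are my own toy construction, not the author’s actual models.

```python
import math

# Fake 'ECG': a clean periodic signal (10-sample beats) shifted above zero.
signal = [math.sin(2 * math.pi * t / 10) + 1.5 for t in range(100)]

# Ordinary least squares fit of the AR(1) model x[t] = c + phi * x[t-1].
x, y = signal[:-1], signal[1:]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
phi = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
c = my - phi * mx

# n-steps-ahead forecast: with |phi| < 1 it decays to the unconditional
# mean c / (1 - phi), which is bounded and well above zero -- 'immortality'.
level = signal[-1]
for _ in range(1000):  # one googol periods, give or take
    level = c + phi * level

print(f"phi = {phi:.3f}, long-run forecast = {level:.3f}")
```

Any stationary model fitted to in-sample data will behave like this: the forecast settles at the sample mean, however far ahead you push it, which is precisely why extrapolation without external evidence is dangerous.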

Whilst the paper is a parody, it makes an important point. If we fit models to data, and attempt to predict the future without considering external evidence, we may well make a hash of that prediction – despite the apparent sophistication of our econometric methods. This is clearly an extreme example, but resonates with me, because this is what many people continue to do when modelling oncology data. This is certainly less prevalent than it was a few years ago, and I expect it will become a thing of the past, but for now, whenever I meet someone who does this, I will be sure to send them a copy of this paper. That being said, as far as I am aware the author is still alive, so maybe he will have the last laugh – perhaps even the last laugh of all of humankind if his model is to be believed.

Credits

Chris Sampson’s journal round-up for 17th June 2019

Mental health: a particular challenge confronting policy makers and economists. Applied Health Economics and Health Policy [PubMed] Published 7th June 2019

This paper has a bad title. You’d never guess that its focus is on the ‘inconsistency of preferences’ expressed by users of mental health services. The idea is that people experiencing certain mental health problems (e.g. depression, conduct disorders, ADHD) may express different preferences during acute episodes. Preference inconsistency, the author explains, can result in failures in prediction (because behaviour may contradict expectations) and failures in evaluation (because… well, this is a bit less clear). Because of preference inconsistency, a standard principal-agent model cannot apply to treatment decisions. Conventional microeconomic theory cannot apply. If this leaves you wondering “so what has this got to do with economists?” then you’re not alone. The author of this article believes that our role is to identify suitable agents who can interpret patients’ inconsistent preferences and make appropriate decisions on their behalf.

But, after introducing this challenge, the framing of the issue seems to change and the discussion becomes about finding an agent who can determine a patient’s “true preferences” from “conflicting statements”. That seems to me to be a bit different from the issue of ‘inconsistent preferences’, and the phrase “true preferences” should raise an eyebrow of any sceptical economist. From here, the author describes some utility models of perfect agency and imperfect agency – the latter taking account of the agent’s opportunity cost of effort. The models include error in judging whether the patient is exhibiting ‘true preferences’ and the strength of the patient’s expression of preference. Five dimensions of preference with respect to treatment are specified: when, what, who, how, and where. Eight candidate agents are specified: family member, lay helper, worker in social psychiatry, family physician, psychiatrist/psychologist, health insurer, government, and police/judge. The knowledge level of each agent in each domain is surmised and related to the precision of estimates for the utility models described. The author argues that certain agents are better at representing a patient’s ‘true preferences’ within certain domains, and that no candidate agent will serve an optimal role in every domain. For instance, family members are likely to be well-placed to make judgements with little error, but they will probably have a higher opportunity cost than care professionals.

The overall conclusion that different agents will be effective in different contexts seems logical, and I support the view of the author that economists should dedicate themselves to better understanding the incentives and behaviours of different agents. But I’m not convinced by the route to that conclusion.

Exploring the impact of adding a respiratory dimension to the EQ-5D-5L. Medical Decision Making [PubMed] Published 16th May 2019

I’m currently working on a project to develop and test EQ-5D bolt-ons for cognition and vision, so I was keen to see the methods reported in this study. The EQ-5D-5L has been shown to have only a weak correlation with clinically-relevant changes in the context of respiratory disease, so it might be worth developing a bolt-on (or multiple bolt-ons) that describe relevant functional changes not captured by the core dimensions of the EQ-5D. In this study, the authors looked at how the inclusion of respiratory dimensions influenced utility values.

Relevant disease-specific outcome measures were reviewed. The researchers also analysed EQ-5D-3L data and disease-specific outcome measure data from three clinical studies in asthma and COPD, to see how much variance in visual analogue scores was explained by disease-specific items. The selection of potential bolt-ons was also informed by principal-component analysis to try to identify which items form constructs distinct from the EQ-5D dimensions. The conclusion of this process was that two other dimensions represented separate constructs and could be good candidates for bolt-ons: ‘limitations in physical activities due to shortness of breath’ and ‘breathing problems’. Some think-aloud interviews were conducted to ensure that the bolt-ons made sense to patients and the general public.

A valuation study using time trade-off and discrete choice experiments was conducted in the Netherlands with a representative sample of 430 people from the general public. The sample was split in two, with each half completing the EQ-5D-5L with one or the other bolt-on. The Dutch EQ-5D-5L valuation study was used as a comparator data set. The inclusion of the bolt-ons seemed to extend the scale of utility values; the best-functioning states were associated with higher utility values when the bolt-ons were added and the worst-functioning states were associated with lower values. This was more pronounced for the ‘breathing problems’ bolt-on. The size of the coefficients on the two bolt-ons (i.e. the effect on utility values) was quite different. The ‘physical activities’ bolt-on had coefficients similar in size to self-care and usual activities. The coefficients on the ‘breathing problems’ bolt-on were a bit larger, comparable in size with those of the mobility dimension.
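To see how a bolt-on “widens” the utility scale, consider a toy additive decrement model. The decrement sizes below are invented purely to mirror the qualitative finding that the ‘breathing problems’ coefficients were comparable in size to mobility’s; none of them come from the paper.

```python
# Utility decrements at the worst level of each dimension (all invented).
decrements = {
    "mobility": 0.25,
    "self-care": 0.15,
    "usual activities": 0.15,
    "pain/discomfort": 0.30,
    "anxiety/depression": 0.30,
    "breathing problems": 0.25,  # hypothetical bolt-on, sized like mobility
}

def utility(levels):
    """levels: dimension -> severity in [0, 1]; 0 = no problems, 1 = worst."""
    return 1.0 - sum(decrements[d] * s for d, s in levels.items())

# Full health stays at 1.0, but a state that is bad only on the bolt-on
# now sits below where the five core dimensions alone would place it.
print(utility({}))                           # 1.0
print(utility({"breathing problems": 1.0}))  # 0.75
```

The worst state across all six dimensions lands lower than the worst state on the original five, which is the scale-extension effect the authors report.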

The authors raise an interesting question in light of their findings from the development process, in which the quantitative analysis supported a ‘symptoms’ dimension and patients indicated the importance of a dimension relating to ‘physical activities’. They ask whether it is more important for an item to be relevant or for it to be quantitatively important for valuation. Conceptually, it seems to me that the apparent added value of a ‘physical activity’ bolt-on is problematic for the EQ-5D. The ‘physical activity’ bolt-on specifies “climbing stairs, going for a walk, carrying things, gardening” as the types of activities it is referring to. Surely, these should be reflected in ‘mobility’ and ‘usual activities’. If they aren’t then I think the ‘usual activities’ descriptor, in particular, is not doing its job. What we might be seeing here, more than anything, is the flaws in the development process for the original EQ-5D descriptors. Namely, that they didn’t give adequate consideration to the people who would be filling them in. Nevertheless, it looks like a ‘breathing problems’ bolt-on could be a useful part of the EuroQol armoury.

Technology and college student mental health: challenges and opportunities. Frontiers in Psychiatry [PubMed] Published 15th April 2019

Universities in the UK and elsewhere are facing growing demand for counselling services from students. That’s probably part of the reason that our Student Mental Health Research Network was funded. Some researchers have attributed this rising demand to the use of personal computing technologies – smartphones, social media, and the like. No doubt, their use is correlated with mental health problems, certainly through time and probably between individuals. But causality is uncertain, and there are plenty of ways in which – as set out in this article – these technologies might be used in a positive way.

Most obviously, smartphones can be a platform for mental health programmes, delivered via apps. This is particularly important because there are perceived and actual barriers for students to accessing face-to-face support. This is an issue for all people with mental health problems. But the opportunity to address this issue using technology is far greater for students, who are hyper-connected. Part of the problem, the authors argue, is that there has not been a focus on implementation, and so the evidence that does exist is from studies with self-selecting samples. Yet the opportunity is great here, too, because students are often co-located with service providers and already engaged with course-related software.

Challenges remain with respect to ethics, privacy, accountability, and duty of care. In the UK, we have the benefit of being able to turn to GDPR for guidance, and universities are well-equipped to assess the suitability of off-the-shelf and bespoke services in terms of their ethical implications. The authors outline some possible ways in which universities can approach implementation and the challenges therein. Adopting these approaches will be crucial if universities are to address the current gap between the supply and demand for services.

Chris Sampson’s journal round-up for 11th March 2019

Identification, review, and use of health state utilities in cost-effectiveness models: an ISPOR Good Practices for Outcomes Research Task Force report. Value in Health [PubMed] Published 1st March 2019

When modellers select health state utility values to plug into their models, they often do it in an ad hoc and unsystematic way. This ISPOR Task Force report seeks to address that.

The authors discuss the process of searching, reviewing, and synthesising utility values. Searches need to use iterative techniques because evidence requirements develop as a model develops. Due to the scope of models, it may be necessary to develop multiple search strategies (for example, for different aspects of disease pathways). Searches needn’t be exhaustive, but they should be systematic and transparent. The authors provide a list of factors that should be considered in defining search criteria. In reviewing utility values, both quality and appropriateness should be considered. Quality is indicated by the precision of the evidence, the response rate, and missing data. Appropriateness relates to the extent to which the evidence being reviewed conforms to the context of the model in which it is to be used. This includes factors such as the characteristics of the study population, the measure used, value sets used, and the timing of data collection. When it comes to synthesis, the authors suggest it might not be meaningful in most cases, because of variation in methods. We can’t pool values if they aren’t (at least roughly) equivalent. Therefore, one approach is to employ strict inclusion criteria (e.g. only EQ-5D, or only a particular value set), but this isn’t likely to leave you with much. Meta-regression can be used to analyse more dissimilar utility values and provide insight into the impact of methodological differences. But the extent to which this can provide pooled values for a model is questionable, and the authors concede that more research is needed.

This paper can inform that future research. Not least in its attempt to specify minimum reporting standards. We have another checklist, with another acronym (SpRUCE). The idea isn’t so much that this will guide publications of systematic reviews of utility values, but rather that modellers (and model reviewers) can use it to assess whether the selection of utility values was adequate. The authors then go on to offer methodological recommendations for using utility values in cost-effectiveness models, considering issues such as modelling technique, comorbidities, adverse events, and sensitivity analysis. It’s early days, so the recommendations in this report ought to be changed as methods develop. Still, it’s a first step away from the ad hoc selection of utility values that (no doubt) drives the results of many cost-effectiveness models.

Estimating the marginal cost of a life year in Sweden’s public healthcare sector. The European Journal of Health Economics [PubMed] Published 22nd February 2019

It’s only recently that health economists have gained access to data that enables the estimation of the opportunity cost of health care expenditure at a national level – what is sometimes referred to as a supply-side threshold. We’ve seen studies in the UK, Spain, and Australia, and here we have one from Sweden.

The authors use data on health care expenditure at the national (1970-2016) and regional (2003-2016) level, alongside estimates of remaining life expectancy by age and gender (1970-2016). First, they try a time series analysis, testing the nature of causality. Finding an apparently causal relationship between longevity and expenditure, the authors don’t take it any further. Instead, the results are based on a panel data analysis, employing similar methods to estimates generated in other countries. The authors propose a conceptual model to support their analysis, which distinguishes it from other studies. In particular, the authors assert that the majority of the impact of expenditure on mortality operates through morbidity, which changes how the model should be specified. The number of newly graduated nurses is used as an instrument indicative of a supply-shift at the national rather than regional level. The models control for socioeconomic and demographic factors and morbidity not amenable to health care.

The authors estimate the marginal cost of a life year by dividing health care expenditure by the expenditure elasticity of life expectancy, finding an opportunity cost of €38,812 (with a massive 95% confidence interval). Using Swedish population norms for utility values, this would translate into around €45,000/QALY.
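The translation from cost per life year to cost per QALY is a one-line calculation. Note that the utility norm below is my own back-calculation (roughly 0.86 makes the reported numbers line up), not a figure quoted by the authors.

```python
cost_per_life_year = 38_812  # EUR, the paper's central estimate
utility_norm = 0.86          # assumed mean population utility (my back-calculation)

# A life year lived at utility 0.86 is worth 0.86 of a QALY, so the cost
# per QALY is the cost per life year scaled up by 1 / utility_norm.
cost_per_qaly = cost_per_life_year / utility_norm
print(f"EUR {cost_per_qaly:,.0f} per QALY")  # roughly EUR 45,000
```

Scaling by a population norm below 1 always pushes the cost-per-QALY figure above the cost-per-life-year figure, which is why supply-side thresholds expressed in QALYs look less generous than the same estimates in life years.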

The analysis is considered and makes plain the difficulty of estimating the marginal productivity of health care expenditure. It looks like a nail in the coffin for the idea of estimating opportunity costs using time series. For now, at least, estimates of opportunity cost will be based on variation according to geography, rather than time. In their excellent discussion, the authors are candid about the limitations of their model. Their instrument wasn’t perfect and it looks like there may have been important confounding variables that they couldn’t control for.

Frequentist and Bayesian meta‐regression of health state utilities for multiple myeloma incorporating systematic review and analysis of individual patient data. Health Economics [PubMed] Published 20th February 2019

The first paper in this round-up was about improving practice in the systematic review of health state utility values, and it indicated the need for more research on the synthesis of values. Here, we have some. In this study, the authors conduct a meta-analysis of utility values alongside an analysis of registry and clinical study data for multiple myeloma patients.

A literature search identified 13 ‘methodologically appropriate’ papers, providing 27 health state utility values. The EMMOS registry included data for 2,445 patients in 22 countries and the APEX clinical study included 669 patients, all with EQ-5D-3L data. The authors implement both a frequentist meta-regression and a Bayesian model. In both cases, the models were run including all values and then with a limited set of only EQ-5D values. These models predicted utility values based on the number of treatment classes received and the rate of stem cell transplant in the sample. The priors used in the Bayesian model were based on studies that reported general utility values for the presence of disease (rather than according to treatment).

The frequentist models showed that utility was low at diagnosis, higher at first treatment, and lower at each subsequent treatment. Stem cell transplant had a positive impact on utility values independent of the number of previous treatments. The results of the Bayesian analysis were very similar, which the authors suggest is due to weak priors. An additional Bayesian model was run with preferred data but vague priors, to assess the sensitivity of the model to the priors. At later stages of disease (for which data were more sparse), there was greater uncertainty. The authors provide predicted values from each of the five models, according to the number of treatment classes received. The models provide slightly different results, except in the case of newly diagnosed patients (where the difference was 0.001). For example, the ‘EQ-5D only’ frequentist model gave a value of 0.659 for one treatment, while the Bayesian model gave a value of 0.620.
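The authors’ observation that weak priors leave the Bayesian results close to the frequentist ones can be illustrated with a conjugate normal-normal update for a mean utility value. All numbers here are invented for illustration; this is not the model the paper fits.

```python
def posterior_mean(prior_mu, prior_sd, data_mu, data_se):
    """Precision-weighted average of prior and data (normal-normal model)."""
    w_prior, w_data = 1 / prior_sd ** 2, 1 / data_se ** 2
    return (w_prior * prior_mu + w_data * data_mu) / (w_prior + w_data)

data_mu, data_se = 0.66, 0.02  # pooled utility estimate (illustrative)

weak = posterior_mean(0.70, 0.50, data_mu, data_se)    # vague prior
strong = posterior_mean(0.70, 0.01, data_mu, data_se)  # informative prior

# The vague prior barely moves the estimate off the data; the informative
# one pulls it noticeably towards the prior mean of 0.70.
print(f"weak prior: {weak:.3f}, strong prior: {strong:.3f}")
```

The same logic explains why the sparse late-stage data produced more uncertainty: with few observations, the data precision term shrinks and the prior (however vague) does more of the work.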

I’m not sure that the study satisfies the recommendations outlined in the ISPOR Task Force report described above (though that would be an unfair challenge, given the timing of publication). We’re told very little about the nature of the studies that are included, so it’s difficult to judge whether they should have been combined in this way. However, the authors state that they have made their data extraction and source code available online, which means I could check that out (though, having had a look, I can’t find the material that the authors refer to, reinforcing my hatred for the shambolic ‘supplementary material’ ecosystem). The main purpose of this paper is to progress the methods used to synthesise health state utility values, and it does that well. Predictably, the future is Bayesian.
