Chris Sampson’s journal round-up for 16th December 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

MCDA-based deliberation to value health states: lessons learned from a pilot study. Health and Quality of Life Outcomes [PubMed] Published 1st July 2019

The rejection of the EQ-5D-5L value set for England indicates something of a crisis in health state valuation. Evidently, there is a lack of trust in the quantitative data and methods used. This is despite decades of methodological development. Perhaps we need a completely different approach. Could we instead develop a value set using qualitative methods?

A value set based on qualitative research aligns with an idea forwarded by Daniel Hausman, who has argued for the use of deliberative approaches. This could circumvent the problems associated with asking people to give instant (and possibly ill-thought-out) responses to preference elicitation surveys. The authors of this study report on the first ever (pilot) attempt to develop a consensus value set using methods of multi-criteria decision analysis (MCDA) and deliberation. The study attempts to identify a German value set for the SF-6D.

The study included 34 students in a one-day conference setting. A two-step process was followed for the MCDA using MACBETH (the Measuring Attractiveness by a Categorical Based Evaluation Technique), which uses pairwise comparisons to derive numerical scales without quantitative assessments. First, a scoring procedure was conducted for each of the six dimensions. Second, a weighting was identified for each dimension. After an introductory session, participants were allocated into groups of five or six and each group was tasked with scoring one SF-6D dimension. Within each group, consensus was achieved. After these group sessions, all participants were brought together to present and validate the results. In this deliberation process, consensus was achieved for all domains except pain. Then the weighting session took place, but resulted in no consensus. Subsequent to the one-day conference, a series of semi-structured interviews were conducted with moderators. All the sessions and interviews were recorded, transcribed, and analysed qualitatively.
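To make the two-step logic concrete, here is a minimal Python sketch of how dimension scores and dimension weights could be combined into a value for a health state. It assumes a simple additive aggregation, which is a common choice in MCDA but is my assumption rather than something reported above. Every number is hypothetical (the study reached no consensus on weights), and each dimension is cut down to three levels for brevity, so the names `SCORES`, `WEIGHTS`, and `state_value` are purely illustrative.

```python
# Hypothetical numbers throughout: nothing here comes from the study.
DIMENSIONS = [
    "physical_functioning", "role_limitation", "social_functioning",
    "pain", "mental_health", "vitality",
]

# Step 1 (scoring): within each dimension, every level gets a score on a
# 0-100 scale (level 1 = best = 100, worst level = 0). Each dimension is
# simplified to three levels here.
SCORES = {dim: {1: 100.0, 2: 60.0, 3: 0.0} for dim in DIMENSIONS}
SCORES["pain"] = {1: 100.0, 2: 40.0, 3: 0.0}

# Step 2 (weighting): one weight per dimension, summing to 1.
WEIGHTS = {
    "physical_functioning": 0.15,
    "role_limitation": 0.10,
    "social_functioning": 0.15,
    "pain": 0.25,
    "mental_health": 0.25,
    "vitality": 0.10,
}


def state_value(state, scores=SCORES, weights=WEIGHTS):
    """Additive value of a health state: weighted sum of level scores, on a 0-1 scale."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[d] * scores[d][state[d]] for d in DIMENSIONS) / 100.0


best = {d: 1 for d in DIMENSIONS}    # best level on every dimension
worst = {d: 3 for d in DIMENSIONS}   # worst level on every dimension
mixed = dict(best, pain=2)           # best health except moderate pain

print(state_value(best), state_value(mixed), state_value(worst))
```

The sketch also shows why the weighting step matters so much: the within-dimension scores only become comparable across dimensions once the weights are agreed.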

In short, the study failed. A consensus value set could not be identified. Part of the problem was probably in the SF-6D descriptive system, particularly in relation to pain, which was interpreted differently by different people. But the main issue was that people had different opinions and didn’t seem willing to move towards consensus with a societal perspective in mind. Participants broadly fell into three groups – one in favour of prioritising pain and mental health, one opposed to trading-off SF-6D dimensions and favouring equal weights, and another group that was not willing to accept any trade-offs.

Despite its apparent failure, this seems like an extremely useful and important study. The authors provide a huge amount of detail regarding what they did, what went well, and what might be done differently next time. I’m not sure it will ever be possible to get a group of people to reach a consensus on a value set. The whole point of preference-based measures is surely that different people have different priorities, and they should be expected to disagree. But I think we should expect that the future of health state valuation lies in mixed methods. There might be more success in a qualitative and deliberative approach to scoring combined with a quantitative approach to weighting, or perhaps a qualitative approach informed by quantitative data that demands trade-offs. Whatever the future holds, this study will be a valuable guide.

Preference-based health-related quality of life outcomes associated with preterm birth: a systematic review and meta-analysis. PharmacoEconomics [PubMed] Published 9th December 2019

Premature and low birth weight babies can experience a whole host of negative health outcomes. Most studies in this context look at short-term biomedical assessments or behavioural and neurodevelopmental indicators. But some studies have sought to identify the long-term consequences on health-related quality of life by identifying health state utility values. This study provides us with a review and meta-analysis of such values.

The authors screened 2,139 articles from their search and included 20 in the review. Lots of data were extracted from the articles, which is helpfully tabulated in the paper. The majority of the studies included adolescents and focussed on children born very preterm or at very low birth weight.

For the meta-analysis, the authors employed a linear mixed-effects meta-regression, which is an increasingly routine approach in this context. The models were used to estimate the decrement in utility values associated with preterm birth or low birth weight, compared with matched controls. Conveniently, only one of the studies used a measure other than the HUI2 or HUI3, so the analysis was restricted to these two measures. Preterm birth was associated with an average decrement of 0.066 and extremely low birth weight with a decrement of 0.068. The mean estimated utility score for the study groups was 0.838, compared with 0.919 for the control groups.

Reviews of utility values are valuable as they provide modellers with a catalogue of potential parameters that can be selected in a meaningful and transparent way. Even though this is a thorough and well-reported study, it’s a bit harder to see how its findings will be used. Most reviews of utility values relate to a particular disease, which might be prevented or ameliorated by treatment, and the value of this treatment depends on the utility values selected. But how will these utility values be used? The avoidance of preterm or low-weight birth is not the subject of most evaluations in the neonatal setting. Even if it was, how valuable are estimates from a single point in adolescence? The authors suggest that future research should seek to identify a trajectory of utility values over the life course. But, even if we could achieve this, it’s not clear to me how this should complement utility values identified in relation to the specific health problems experienced by these people.

The new and non-transparent Cancer Drugs Fund. PharmacoEconomics [PubMed] Published 12th December 2019

Not many (any?) health economists liked the Cancer Drugs Fund (CDF). It was set up to give special treatment to cancer drugs, which weren’t assessed on the same basis as other drugs being assessed by NICE. In 2016, the CDF was brought within NICE’s remit, with medicines available through the CDF requiring a managed access agreement. This includes agreements on data collection and on payments by the NHS during the period. In this article, the authors contend that the new CDF process is not sufficiently transparent.

Three main issues are raised: i) lack of transparency relating to the value of CDF drugs, ii) lack of transparency relating to the cost of CDF drugs, and iii) lack of transparency relating to the amount of time that medicines remain on the CDF. The authors tabulate the reporting of ICERs according to the decisions made, showing that the majority of treatment comparisons do not report ICERs. Similarly, the time in the CDF is tabulated, with many indications being in the CDF for an unknown amount of time. In short, we don’t know much about medicines going through the CDF, except that they’re probably costing a lot.

I’m a fan of transparency, in almost all contexts. I think it is inherently valuable to share information widely. It seems that the authors of this paper do too. A lack of transparency in NICE decision-making is a broader problem that arises from the need to protect commercially sensitive pricing agreements. But what this paper doesn’t manage to do is to articulate why anybody who doesn’t support transparency in principle should care about the CDF in particular. Part of the authors’ argument is that the lack of transparency prevents independent scrutiny. But surely NICE is the independent scrutiny? The authors argue that it is a problem that commissioners and the public cannot assess the value of the medicines, but it isn’t clear why that should be a problem if they are not the arbiters of value. The CDF has quite rightly faced criticism over the years, but I’m not convinced that its lack of transparency is its main problem.


Chris Sampson’s journal round-up for 30th September 2019

A need for change! A coding framework for improving transparency in decision modeling. PharmacoEconomics [PubMed] Published 24th September 2019

We’ve featured a few papers in recent round-ups that (I assume) will be included in an upcoming themed issue of PharmacoEconomics on transparency in modelling. It’s shaping up to be a good one. The value of transparency in decision modelling has been recognised, but simply making the stuff visible is not enough – it needs to make sense. The purpose of this paper is to help make that achievable.

The authors highlight that the writing of analyses, including coding, involves personal style and preferences. To aid transparency, we need a systematic framework of conventions that make the inner workings of a model understandable to any (expert) user. The paper describes a framework developed by the Decision Analysis in R for Technologies in Health (DARTH) group. The DARTH framework builds on a set of core model components, generalisable to all cost-effectiveness analyses and model structures. There are five components – i) model inputs, ii) model implementation, iii) model calibration, iv) model validation, and v) analysis – and the paper describes the role of each. Importantly, the analysis component can be divided into several parts relating to, for example, sensitivity analyses and value of information analyses.

Based on this framework, the authors provide recommendations for organising and naming files and on the types of functions and data structures required. The recommendations build on conventions established in other fields and in the use of R generally. The authors recommend the implementation of functions in R, and relate general recommendations to the context of decision modelling. We’re also introduced to unit testing, which will be unfamiliar to most Excel modellers but which can be relatively easily implemented in R. The roles of various tools are introduced, including RStudio, R Markdown, Shiny, and GitHub.

The real value of this work lies in the linked R packages and other online material, which you can use to test out the framework and consider its application to whatever modelling problem you might have. The authors provide an example using a basic Sick-Sicker model, which you can have a play with using the DARTH packages. In combination with the online resources, this is a valuable paper that you should have to hand if you’re developing a model in R.
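For readers who want a feel for what this looks like in practice, below is a rough Python sketch of a four-state cohort model in the spirit of the Sick-Sicker example, organised loosely around the components listed above. To be clear about the assumptions: the DARTH materials are written in R, so this is not their code. All parameter values are hypothetical, the function names are my own, calibration and validation are left out, and the analysis assumes one-year cycles with no discounting or half-cycle correction.

```python
STATES = ["H", "S1", "S2", "D"]  # Healthy, Sick, Sicker, Dead


def make_inputs():
    """Model inputs: keep every parameter in one place (all values hypothetical)."""
    return {
        "n_cycles": 30,
        "p_HS1": 0.15, "p_S1H": 0.5, "p_S1S2": 0.105,
        "p_HD": 0.005, "p_S1D": 0.015, "p_S2D": 0.05,
        "utility": {"H": 1.0, "S1": 0.75, "S2": 0.5, "D": 0.0},
    }


def transition_matrix(p):
    """Model implementation: rows are 'from' states, columns are 'to' states."""
    return [
        [1 - p["p_HS1"] - p["p_HD"], p["p_HS1"], 0.0, p["p_HD"]],
        [p["p_S1H"], 1 - p["p_S1H"] - p["p_S1S2"] - p["p_S1D"],
         p["p_S1S2"], p["p_S1D"]],
        [0.0, 0.0, 1 - p["p_S2D"], p["p_S2D"]],
        [0.0, 0.0, 0.0, 1.0],
    ]


def check_matrix(m):
    """Unit-test-style check: valid probabilities and each row sums to 1."""
    for row in m:
        assert all(0.0 <= x <= 1.0 for x in row), "probability out of range"
        assert abs(sum(row) - 1.0) < 1e-9, "row does not sum to 1"


def run_cohort(p):
    """Trace the cohort's distribution across states, starting with everyone healthy."""
    m = transition_matrix(p)
    check_matrix(m)
    trace = [[1.0, 0.0, 0.0, 0.0]]
    for _ in range(p["n_cycles"]):
        prev = trace[-1]
        trace.append([sum(prev[i] * m[i][j] for i in range(4)) for j in range(4)])
    return trace


def total_qalys(p):
    """Analysis: undiscounted QALYs per person, summed over cycles 1..n_cycles."""
    u = [p["utility"][s] for s in STATES]
    trace = run_cohort(p)
    return sum(sum(share * u_s for share, u_s in zip(dist, u)) for dist in trace[1:])


print(round(total_qalys(make_inputs()), 3))
```

Keeping inputs, implementation, checks, and analysis in separate functions is the point of the exercise: each piece can be read, tested, and swapped out on its own.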

Accounts from developers of generic health state utility instruments explain why they produce different QALYs: a qualitative study. Social Science & Medicine [PubMed] Published 19th September 2019

It’s well known that different preference-based measures of health will generate different health state utility values for the same person. Yet, they continue to be used almost interchangeably. For this study, the authors spoke to people involved in the development of six popular measures: QWB, 15D, HUI, EQ-5D, SF-6D, and AQoL. Their goal was to understand the bases for the development of the measures and to explain why the different measures should give different results.

At least one original developer for each instrument was recruited, along with people involved at later stages of development. Semi-structured interviews were conducted with 15 people, with questions on the background, aims, and criteria for the development of the measure, and on the descriptive system, preference weights, performance, and future development of the instrument.

Five broad topics were identified as being associated with differences in the measures: i) knowledge sources used for conceptualisation, ii) development purposes, iii) interpretations of what makes a ‘good’ instrument, iv) choice of valuation techniques, and v) the context for the development process. The online appendices provide some useful tables that summarise the differences between the measures. The authors distinguish between measures based on ‘objective’ definitions (QWB) and items that people found important (15D). Some prioritised sensitivity (AQoL, 15D), others prioritised validity (HUI, QWB), and several focused on pragmatism (SF-6D, HUI, 15D, EQ-5D). Some instruments had modest goals and opportunistic processes (EQ-5D, SF-6D, HUI), while others had grand goals and purposeful processes (QWB, 15D, AQoL). The use of some measures (EQ-5D, HUI) extended far beyond what the original developers had anticipated. In short, different measures were developed with quite different concepts and purposes in mind, so it’s no surprise that they give different results.

This paper provides some interesting accounts and views on the process of instrument development. It might prove most useful in understanding different measures’ blind spots, which can inform the selection of measures in research, as well as future development priorities.

The emerging social science literature on health technology assessment: a narrative review. Value in Health Published 16th September 2019

Health economics provides a good example of multidisciplinarity, with economists, statisticians, medics, epidemiologists, and plenty of others working together to inform health technology assessment. But I still don’t understand what sociologists are talking about half of the time. Yet, it seems that sociologists and political scientists are busy working on the big questions in HTA, as demonstrated by this paper’s 120 references. So, what are they up to?

This article reports on a narrative review, based on 41 empirical studies. Three broad research themes are identified: i) what drove the establishment and design of HTA bodies? ii) what has been the influence of HTA? and iii) what have been the social and political influences on HTA decisions? Some have argued that HTA is inevitable, while others have argued that there are alternative arrangements. Either way, no two systems are the same and it is not easy to explain differences. It’s important to understand HTA in the context of other social tendencies and trends, and that HTA influences and is influenced by these. The authors provide a substantial discussion on the role of stakeholders in HTA and the potential for some to attempt to game the system. Uncertainty abounds in HTA and this necessarily requires negotiation and acts as a limit on the extent to which HTA can rely on objectivity and rationality.

Something lacking is a critical history of HTA as a discipline and the question of what HTA is actually good for. There’s also not a lot of work out there on culture and values, which contrasts with medical sociology. The authors suggest that sociologists and political scientists could be more closely involved in HTA research projects. I suspect that such a move would be more challenging for the economists than for the sociologists.


Thesis Thursday: Cheryl Jones

On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Cheryl Jones, who has a PhD from the University of Manchester. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.

The economics of presenteeism in the context of rheumatoid arthritis, ankylosing spondylitis and psoriatic arthritis
Katherine Payne, Suzanne Verstappen, Brenda Gannon
Repository link

What attracted you to studying health-related presenteeism?

I was attracted to studying presenteeism because it gave me a chance to address both normative and positive issues. Presenteeism, a concept related to productivity, is a controversial topic in the economic evaluation of healthcare technologies and is currently excluded from health economic evaluations, following the recommendation made by the NICE reference case. The reasons why productivity is excluded from economic evaluations are important and valid; however, there are some circumstances where excluding productivity is difficult to defend. Presenteeism offered an opportunity for me to explore and question the social value judgements that underpin economic evaluation methods with respect to productivity. In terms of positive issues related to presenteeism, research into the development of methods that can be used to measure and value presenteeism was (and still is) limited. This provided an opportunity to think creatively about the types of methods we could use, both quantitative and qualitative, to address and further methods for quantifying presenteeism.

Are existing tools adequate for measuring and valuing presenteeism in inflammatory arthritic conditions?

That is the question! Research into methods that can be used to quantify presenteeism is still in its infancy. Presenteeism is difficult to measure accurately because there is a lack of objective measures that can be used, such as the number of cars assembled per day. As a consequence, many methods rely on self-report surveys, which tend to suffer from bias, such as reporting or recall bias. Methods that have been used to value presenteeism have largely focused on valuing presenteeism as a cost using the human capital approach (HCA: volume of presenteeism multiplied by a monetary factor). The monetary factor typically used to convert the volume of presenteeism into a cost value is wages. Valuing productivity using wages risks taking account of discriminatory factors that are associated with wages, such as age. There are also economic arguments that question whether the value of the wage truly reflects the value of productivity. My PhD focused on developing a method that values presenteeism as a non-monetary benefit, thereby avoiding the need to value it as a cost using wages. Overall, methods to measure and value presenteeism still have some way to go before a ‘gold standard’ can be established; however, there are many experts from many disciplines who are working to improve these methods.
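As a rough illustration of the human capital approach described in this answer, the Python sketch below multiplies a volume of presenteeism by a wage. The function name, the choice to express volume as a percentage impairment while at work, and all of the numbers are assumptions made for illustration only, not details taken from the thesis.

```python
def presenteeism_cost_hca(pct_impairment_at_work, hours_worked, hourly_wage):
    """Human capital approach: lost productive hours multiplied by the wage.

    pct_impairment_at_work: self-reported impairment while working, 0-100.
    hours_worked: hours actually worked in the period.
    hourly_wage: the monetary factor used to value each lost hour.
    """
    if not 0.0 <= pct_impairment_at_work <= 100.0:
        raise ValueError("impairment must be between 0 and 100")
    if hours_worked < 0 or hourly_wage < 0:
        raise ValueError("hours and wage must be non-negative")
    lost_hours = hours_worked * pct_impairment_at_work / 100.0
    return lost_hours * hourly_wage


# Same volume of presenteeism, two hypothetical wages.
cost_lower_wage = presenteeism_cost_hca(20.0, 35.0, 15.0)
cost_higher_wage = presenteeism_cost_hca(20.0, 35.0, 30.0)
print(cost_lower_wage, cost_higher_wage)
```

The two calls use an identical volume of presenteeism, yet the cost doubles with the wage, which is exactly why valuing productivity through wages can carry wage-related inequalities into the result.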

Why was it important to conduct qualitative interviews as part of your research?

The quantitative component of my PhD was to develop an algorithm, using mapping methods, that links presenteeism with health status and capability measures. A study by Connolly et al. recommends conducting qualitative interviews to provide some evidence of face/content validity to establish whether a quantitative link between two measures (or concepts) is feasible and potentially valid. The qualitative study I conducted was designed to understand the extent to which the EQ-5D-5L, SF-6D and ICECAP-A were able to capture those aspects of rheumatic conditions that negatively impact presenteeism. The results suggested that all three measures were able to capture those important aspects of rheumatic conditions that affect presenteeism; however, the results indicated that the SF-6D would most likely be the most appropriate measure. The results from the quantitative mapping study identified the SF-6D as the most suitable outcome measure able to predict presenteeism in working populations with rheumatic conditions. The advantage of the qualitative results was that they provided some evidence that explained why the SF-6D was the more suitable measure rather than relying on speculation.

Is it feasible to predict presenteeism using outcome measures within economic evaluation?

I developed an algorithm that links presenteeism, measured using the Work Productivity and Activity Impairment (WPAI) questionnaire, with health and capability. Health status was measured using the EQ-5D-5L and SF-6D, and capability was measured using the ICECAP-A. The SF-6D was identified as the most suitable measure to predict presenteeism in a population of employees with rheumatoid arthritis or ankylosing spondylitis. The results indicate that it is possible to predict presenteeism using generic outcome measures; however, the results have yet to be externally validated. The qualitative interviews provided evidence as to why the SF-6D was the better predictor for presenteeism and the result gave rise to questions about the suitability of outcome measures given a specific population. Overall, it appears potentially feasible to predict presenteeism using outcome measures.

What would be your key recommendation to a researcher hoping to capture the impact of an intervention on presenteeism?

Due to the lack of a ‘gold standard’ method for capturing the impact of presenteeism, I would recommend that the researcher does the following:

  1. Provide a rationale that explains why presenteeism is an important factor that needs to be considered in the analysis.
  2. Explain how and why presenteeism will be captured and included in the analysis: as a cost, monetary benefit, or non-monetary benefit.
  3. Justify the methods used to measure and value presenteeism. It is important that the researcher clearly reports why specific tools, such as presenteeism surveys, have been selected for use.

Because there is no ‘gold standard’ method for measuring and valuing presenteeism and guidelines do not exist to inform the reporting of methods used to quantify presenteeism, it is important that the researcher reports and justifies their selection of methods used in their analysis.