Chris Sampson’s journal round-up for 16th December 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

MCDA-based deliberation to value health states: lessons learned from a pilot study. Health and Quality of Life Outcomes [PubMed] Published 1st July 2019

The rejection of the EQ-5D-5L value set for England indicates something of a crisis in health state valuation. Evidently, there is a lack of trust in the quantitative data and methods used. This is despite decades of methodological development. Perhaps we need a completely different approach. Could we instead develop a value set using qualitative methods?

A value set based on qualitative research aligns with an idea forwarded by Daniel Hausman, who has argued for the use of deliberative approaches. This could circumvent the problems associated with asking people to give instant (and possibly ill-thought-out) responses to preference elicitation surveys. The authors of this study report on the first ever (pilot) attempt to develop a consensus value set using methods of multi-criteria decision analysis (MCDA) and deliberation. The study attempts to identify a German value set for the SF-6D.

The study included 34 students in a one-day conference setting. A two-step process was followed for the MCDA using MACBETH (the Measuring Attractiveness by a Categorical Based Evaluation Technique), which uses qualitative pairwise comparisons of attractiveness to derive numerical scales, without requiring quantitative assessments from participants. First, a scoring procedure was conducted for each of the six dimensions. Second, a weighting was identified for each dimension. After an introductory session, participants were allocated to groups of five or six, and each group was tasked with scoring one SF-6D dimension. Within each group, consensus was achieved. After these group sessions, all participants were brought together to present and validate the results. In this deliberation process, consensus was achieved for all domains except pain. The weighting session then took place, but resulted in no consensus. Subsequent to the one-day conference, a series of semi-structured interviews was conducted with the moderators. All the sessions and interviews were recorded, transcribed, and analysed qualitatively.
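For readers unfamiliar with this kind of MCDA, the end product is simple: once consensus scores for each dimension level and consensus weights for each dimension exist, an additive model combines them into a value for any health state. The sketch below is only a minimal illustration of that final step, with invented dimension names, levels, scores, and weights; it is not output from the study or from the MACBETH software.

```python
# Minimal sketch of an additive MCDA value model.
# All numbers below are illustrative, not the study's consensus results.

# Hypothetical consensus scores for each level of two SF-6D-style dimensions,
# on a 0 (worst level) to 100 (best level) scale.
scores = {
    "physical_functioning": {1: 100, 2: 70, 3: 30, 4: 0},
    "pain":                 {1: 100, 2: 60, 3: 20, 4: 0},
}

# Hypothetical consensus weights for each dimension (summing to 1).
weights = {"physical_functioning": 0.55, "pain": 0.45}

def health_state_value(state):
    """Value of a health state (a level on each dimension), rescaled to
    0 (worst state) - 1 (best state)."""
    total = sum(weights[d] * scores[d][level] for d, level in state.items())
    return total / 100

# A state with moderate physical functioning problems and severe pain.
print(health_state_value({"physical_functioning": 2, "pain": 3}))  # 0.475
```

It was the inputs to this arithmetic, the scores and especially the weights, on which the participants could not agree.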

In short, the study failed. A consensus value set could not be identified. Part of the problem probably lay in the SF-6D descriptive system, particularly in relation to pain, which was interpreted differently by different people. But the main issue was that people had different opinions and didn’t seem willing to move towards consensus with a societal perspective in mind. Participants broadly fell into three groups: one in favour of prioritising pain and mental health, one opposed to trading off SF-6D dimensions and therefore favouring equal weights, and a third that was not willing to accept any trade-offs at all.

Despite its apparent failure, this seems like an extremely useful and important study. The authors provide a huge amount of detail regarding what they did, what went well, and what might be done differently next time. I’m not sure it will ever be possible to get a group of people to reach a consensus on a value set. The whole point of preference-based measures is surely that different people have different priorities, and they should be expected to disagree. But I think we should expect that the future of health state valuation lies in mixed methods. There might be more success in a qualitative and deliberative approach to scoring combined with a quantitative approach to weighting, or perhaps a qualitative approach informed by quantitative data that demands trade-offs. Whatever the future holds, this study will be a valuable guide.

Preference-based health-related quality of life outcomes associated with preterm birth: a systematic review and meta-analysis. PharmacoEconomics [PubMed] Published 9th December 2019

Premature and low birth weight babies can experience a whole host of negative health outcomes. Most studies in this context look at short-term biomedical assessments or behavioural and neurodevelopmental indicators. But some studies have sought to identify the long-term consequences on health-related quality of life by identifying health state utility values. This study provides us with a review and meta-analysis of such values.

The authors screened 2,139 articles from their search and included 20 in the review. Lots of data were extracted from the articles, and these are helpfully tabulated in the paper. The majority of the studies included adolescents and focussed on children born very preterm or at very low birth weight.

For the meta-analysis, the authors employed a linear mixed-effects meta-regression, which is an increasingly routine approach in this context. The models were used to estimate the decrement in utility values associated with preterm birth or low birth weight, compared with matched controls. Conveniently, all but one of the studies used either the HUI2 or the HUI3, so the analysis was restricted to these two measures. Preterm birth was associated with an average decrement of 0.066 and extremely low birth weight with a decrement of 0.068. The mean estimated utility score for the study groups was 0.838, compared with 0.919 for the control groups.
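To give a flavour of what pooling utility decrements involves, the sketch below runs a simple DerSimonian-Laird random-effects analysis on invented study-level decrements. It is a simplification, not the authors’ linear mixed-effects meta-regression, and the numbers are made up for illustration.

```python
import numpy as np

# Hypothetical study-level utility decrements (preterm minus control) and
# their standard errors; illustrative only, not the review's data.
decrements = np.array([-0.05, -0.08, -0.07, -0.04])
se = np.array([0.02, 0.03, 0.025, 0.015])

# Fixed-effect (inverse-variance) weights and pooled estimate.
w = 1 / se**2
pooled_fe = np.sum(w * decrements) / np.sum(w)

# DerSimonian-Laird estimate of between-study variance (tau-squared).
q = np.sum(w * (decrements - pooled_fe) ** 2)
df = len(decrements) - 1
tau2 = max(0.0, (q - df) / (np.sum(w) - np.sum(w**2) / np.sum(w)))

# Random-effects weights and pooled decrement.
w_re = 1 / (se**2 + tau2)
pooled_re = np.sum(w_re * decrements) / np.sum(w_re)
se_re = np.sqrt(1 / np.sum(w_re))

print(f"Pooled decrement: {pooled_re:.3f} (SE {se_re:.3f})")
```

The meta-regression framework used in the paper extends this idea to adjust for study-level covariates such as the measure used and the age at assessment.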

Reviews of utility values are valuable as they provide modellers with a catalogue of potential parameters that can be selected in a meaningful and transparent way. Even though this is a thorough and well-reported study, it’s a bit harder to see how its findings will be used. Most reviews of utility values relate to a particular disease, which might be prevented or ameliorated by treatment, and the value of this treatment depends on the utility values selected. But how will these utility values be used? The avoidance of preterm or low-weight birth is not the subject of most evaluations in the neonatal setting. Even if it was, how valuable are estimates from a single point in adolescence? The authors suggest that future research should seek to identify a trajectory of utility values over the life course. But, even if we could achieve this, it’s not clear to me how this should complement utility values identified in relation to the specific health problems experienced by these people.

The new and non-transparent Cancer Drugs Fund. PharmacoEconomics [PubMed] Published 12th December 2019

Not many (any?) health economists liked the Cancer Drugs Fund (CDF). It was set up to give special treatment to cancer drugs, which weren’t assessed on the same basis as other drugs appraised by NICE. In 2016, the CDF was brought within NICE’s remit, with medicines available through the CDF requiring a managed access agreement. This includes agreements on data collection and on payments by the NHS during the managed access period. In this article, the authors contend that the new CDF process is not sufficiently transparent.

Three main issues are raised: i) a lack of transparency relating to the value of CDF drugs, ii) a lack of transparency relating to the cost of CDF drugs, and iii) the amount of time that medicines remain on the CDF. The authors tabulate the reporting of ICERs according to the decisions made, showing that the majority of treatment comparisons do not report ICERs. Similarly, the time spent in the CDF is tabulated, with many indications having been in the CDF for an unknown amount of time. In short, we don’t know much about medicines going through the CDF, except that they’re probably costing a lot.

I’m a fan of transparency, in almost all contexts. I think it is inherently valuable to share information widely. It seems that the authors of this paper do too. A lack of transparency in NICE decision-making is a broader problem that arises from the need to protect commercially sensitive pricing agreements. But what this paper doesn’t manage to do is to articulate why anybody who doesn’t support transparency in principle should care about the CDF in particular. Part of the authors’ argument is that the lack of transparency prevents independent scrutiny. But surely NICE is the independent scrutiny? The authors argue that it is a problem that commissioners and the public cannot assess the value of the medicines, but it isn’t clear why that should be a problem if they are not the arbiters of value. The CDF has quite rightly faced criticism over the years, but I’m not convinced that its lack of transparency is its main problem.


Rachel Houten’s journal round-up for 11th November 2019


A comparison of national guidelines for network meta-analysis. Value in Health [PubMed] Published October 2019

The evolving treatment landscape means a greater reliance on indirect treatment comparisons to generate estimates of clinical effectiveness when the proposed new intervention has not been compared with current practice in a head-to-head trial. This paper is a review of reimbursement bodies’ guidelines for conducting network meta-analyses. Reassuringly, the authors find that it is possible to meet the needs of multiple agencies with one analysis.

The authors assign the criteria to three categories: “assessment and analysis to test assumptions required for a network meta-analysis, presentation and reporting of results, and justification of modelling choices”. Heterogeneity of the included studies is highlighted as one of the key elements to include if the criteria have to be prioritised. I think this is a simple way of thinking about what needs to be presented, but the ‘justification’ category, in my experience, is often given less weight than the other two.

This paper is a useful resource for companies submitting to multiple HTA agencies, with the requirements of each national body displayed in tables that are easy to navigate. It meets a practical need but doesn’t really go far enough for me. The authors do signpost to the PRISMA criteria, but I think it would have been really good to think about the purpose of the submission guidelines: to encourage a logical and coherent summary of the approaches taken, so that the evidence can be evaluated by decision-makers.

Variation in responsiveness to warranted behaviour change among NHS clinicians: novel implementation of change detection methods in longitudinal prescribing data. BMJ [PubMed] Published 2nd October 2019

I really like this paper. Such a lot of work, from all sectors, is devoted to the production of relevant and timely evidence to inform practice, but if the guidance does not become embedded into the real world then its usefulness is limited.

The authors have managed to utilise a HUGE amount of data to identify the real-world response to two pieces of guidance recommending a change in practice in England. The authors used “trend indicator saturation”, which I’m not ashamed to admit I knew nothing about beforehand, but it is explained nicely. Their thoughtful use of the information available to them results in three indicators of response (in this case, the deprescribing of two drugs): when the change occurs, how quickly it occurs, and how much change occurs.
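For anyone else new to change detection, the underlying idea of locating a change in a prescribing series can be illustrated with a crude step-change search: for each candidate month, fit a ‘before’ and ‘after’ mean and keep the break point that best explains the data. The sketch below is a deliberate simplification with made-up monthly prescribing counts; it is not the authors’ trend indicator saturation method.

```python
import numpy as np

# Made-up monthly prescription counts for a drug, with a drop after guidance is issued.
y = np.array([120, 118, 125, 122, 119, 121, 60, 58, 55, 57, 56, 54], dtype=float)

best = None
for t in range(2, len(y) - 1):  # candidate break points
    before, after = y[:t], y[t:]
    sse = ((before - before.mean()) ** 2).sum() + ((after - after.mean()) ** 2).sum()
    if best is None or sse < best[0]:
        best = (sse, t, after.mean() - before.mean())

_, break_month, change_size = best
print(f"Change detected at month {break_month}, size {change_size:.1f} prescriptions/month")
```

Trend indicator saturation goes further, testing indicators at every possible time point and retaining only those supported by the data, which is how the authors recover when, how quickly, and how much each practice changed.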

The authors find variation in response to the recommendations, but suggest that their methods could be used to generate feedback to clinicians and thereby drive further change. As some primary care practices took a while to embed the guidance into their prescribing, the paper raises interesting questions as to where the barriers to the adoption of guidance lie.

What is next for patient preferences in health technology assessment? A systematic review of the challenges. Value in Health Published November 2019

It may be that patient preferences have a role to play in the uptake of guideline recommendations, as proposed by the authors of my final paper this week. This systematic review of the literature on embedding patient preferences in HTA decision-making groups the discussion in the academic literature into five broad areas: conceptual, normative, procedural, methodological, and practical. The authors state that their purpose was not to formulate their own views, merely to present the available literature, but they do a good job of indicating where to find more opinionated literature on this topic.

Methodological issues formed the biggest group, covering aspects such as sample selection, the internal and external validity of the preferences generated, and the generalisability of preferences collected from a sample to the entire population. In general, though, the range of topics covered in the literature is vast and varied.

It’s a great summary of the challenges that are faced, and a ranking based on how frequently each topic is mentioned in the literature drives the authors’ proposed next steps. They recommend further research into the incorporation of preferences within or beyond the QALY, and into the use of multiple-criteria decision analysis as a method of integrating patient preferences into decision-making. I support the need for “a scientifically and valid manner” of integrating patient preferences into HTA decision-making, but wonder if we can first learn what has worked well, and less well, from the attempts of HTA agencies thus far.


Rita Faria’s journal round-up for 21st October 2019


Quantifying how diagnostic test accuracy depends on threshold in a meta-analysis. Statistics in Medicine [PubMed] Published 30th September 2019

A diagnostic test is often based on a continuous measure, e.g. cholesterol, which is dichotomised at a certain threshold to classify people as ‘test positive’, who should be treated, or ‘test negative’, who should not. In an economic evaluation, we may wish to compare the costs and benefits of using the test at different thresholds: for example, the cost-effectiveness of offering lipid-lowering therapy to people with cholesterol over 7 mmol/L versus over 5 mmol/L. This is straightforward to do if we have access to a large dataset comparing the test to its gold standard, from which we can estimate its sensitivity and specificity at various thresholds. It is quite the challenge if we only have aggregate data from multiple publications.
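With individual-level data, the threshold-specific accuracy really is straightforward. The snippet below shows the idea with invented cholesterol values and a hypothetical gold-standard label, evaluating sensitivity and specificity at 5 and 7 mmol/L; it is an illustration of the general point, not anything from the paper.

```python
import numpy as np

# Invented individual-level data: a continuous test result (cholesterol, mmol/L)
# and the gold-standard disease status (1 = diseased, 0 = not diseased).
cholesterol = np.array([4.2, 5.1, 5.8, 6.4, 7.2, 7.9, 8.5, 4.9, 6.1, 7.5])
disease     = np.array([0,   0,   0,   1,   1,   1,   1,   0,   0,   1  ])

def accuracy_at(threshold):
    test_positive = cholesterol > threshold
    sensitivity = np.mean(test_positive[disease == 1])   # true positive rate
    specificity = np.mean(~test_positive[disease == 0])  # true negative rate
    return sensitivity, specificity

for thr in (5.0, 7.0):
    sens, spec = accuracy_at(thr)
    print(f"Threshold {thr} mmol/L: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

The hard problem the paper tackles is recovering this whole threshold-accuracy relationship when each published study only reports sensitivity and specificity at one or two thresholds of its own choosing.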

In this brilliant paper, Hayley Jones and colleagues report on a new method to synthesise diagnostic accuracy data from multiple studies. It consists of a multinomial meta-analysis model that can estimate how accuracy depends on the diagnostic threshold. This method produces estimates that can be used to parameterise an economic model.

These new developments in evidence synthesis are very exciting and really important for improving the data going into economic models. My only concern is that the model is implemented in WinBUGS, which is not software that many applied analysts use. Would it be possible to have a tutorial or, even better, to include this method in the online tools available on the Complex Reviews Support Unit website?

Early economic evaluation of diagnostic technologies: experiences of the NIHR Diagnostic Evidence Co-operatives. Medical Decision Making [PubMed] Published 26th September 2019

Keeping with the diagnostic theme, this paper by Lucy Abel and colleagues reports on the experience of the Diagnostic Evidence Co-operatives in conducting early modelling of diagnostic tests. These were established in 2013 to help developers of diagnostic tests link up with clinical and academic experts.

The paper discusses eight projects where economic modelling was conducted at an early stage of project development. It was fascinating to read about the collaboration between academics and test developers. One of the positive aspects was the buy-in of the developers, while a less positive one was the pressure to produce evidence quickly, and evidence that supported the product.

The paper is excellent in discussing the strengths and challenges of these projects. Of note, there were challenges in mapping out a clinical pathway, selecting the appropriate comparators, and establishing the consequences of testing. Furthermore, they found that the parameters around treatment effectiveness were the key driver of cost-effectiveness in many of the evaluations. This is not surprising given that the benefits of a test are usually in better informing the management decisions, rather than via its direct costs and benefits. It definitely resonates with my own experience in conducting economic evaluations of diagnostic tests (see, for example, here).

Following on from the challenges, the authors suggest areas for methodological research: mapping the clinical pathway, ensuring model transparency, and modelling sequential tests. They finish with advice for researchers doing early modelling of tests, although I’d say that it would be applicable to any economic evaluation. I completely agree that we need better methods for economic evaluation of diagnostic tests. This paper is a useful first step in setting up a research agenda.

A second chance to get causal inference right: a classification of data science tasks. Chance [arXiv] Published 14th March 2019

This impressive paper by Miguel Hernan, John Hsu and Brian Healy is an essential read for all researchers, analysts and scientists. Miguel and colleagues classify data science tasks into description, prediction and counterfactual prediction. Description is using data to quantitatively summarise some features of the world. Prediction is using the data to know some features of the world given our knowledge about other features. Counterfactual prediction is using the data to know what some features of the world would have been if something hadn’t happened; that is, causal inference.

I found the explanation of the difference between prediction and causal inference quite enlightening. It is not about the amount of data or the statistical/econometric techniques. The key difference is in the role of expert knowledge. Predicting requires expert knowledge to specify the research question, the inputs, the outputs and the data sources. Additionally, causal inference requires expert knowledge “also to describe the causal structure of the system under study”. This causal knowledge is reflected in the assumptions, the ideas for the data analysis, and for the interpretation of the results.

The section on implications for decision-making makes some important points. First, the goal of data science is to help people make better decisions. Second, predictive algorithms can tell us that decisions need to be made, but not which decision is most beneficial – for that, we need causal inference. Third, many of us work on complex systems about which we don’t know everything (the human body is a great example). Because we don’t know everything, it is impossible to predict with certainty what the consequences of an intervention would be for a specific individual from routine health records. At most, we can estimate the average causal effect, and even for that we need assumptions. The relevance to the latest developments in data science is obvious, given all the hype around real-world data, artificial intelligence, and machine learning.
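A toy simulation makes the distinction concrete: a variable can be an excellent predictor of an outcome while the naive estimate of its effect is badly biased by a confounder, and recovering the causal effect requires the expert knowledge that the confounder belongs in the model. The variable names and numbers below are invented for illustration; this is my sketch, not an example from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Confounder (e.g. underlying severity) drives both treatment and outcome.
severity = rng.normal(size=n)
treatment = (severity + rng.normal(size=n) > 0).astype(float)
true_effect = -0.5  # treatment lowers (improves) the outcome
outcome = 2.0 * severity + true_effect * treatment + rng.normal(size=n)

# "Prediction": treatment status predicts the outcome well, because it carries
# information about severity...
naive_effect = outcome[treatment == 1].mean() - outcome[treatment == 0].mean()

# ...but the naive contrast is not the causal effect. Adjusting for the
# confounder (here via least squares on treatment and severity) recovers it.
X = np.column_stack([np.ones(n), treatment, severity])
beta = np.linalg.lstsq(X, outcome, rcond=None)[0]

print(f"Naive difference in means: {naive_effect:.2f}")  # positive, far from -0.5
print(f"Adjusted estimate:         {beta[1]:.2f}")       # close to -0.5
```

The adjustment only works because we told the model about severity; no amount of data, on its own, would have supplied that piece of causal knowledge.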

I absolutely loved reading this paper and wholeheartedly recommend it for any health economist. It’s a must read!
