How to explain cost-effectiveness models for diagnostic tests to a lay audience

Non-health economists (henceforth referred to as ‘lay stakeholders’) are often asked to use the outputs of cost-effectiveness models to inform decisions, but they can find these models difficult to understand. Conversely, health economists may have limited experience of explaining cost-effectiveness models to lay stakeholders. How can we do better?

This article shares my experience of explaining cost-effectiveness models of diagnostic tests to lay stakeholders such as researchers in other fields, clinicians, managers, and patients, and suggests some approaches to make models easier to understand. It is the condensed version of my presentation at ISPOR Europe 2018.

Why are cost-effectiveness models of diagnostic tests difficult to understand?

Models designed to compare diagnostic strategies are particularly challenging. In my view, this is for two reasons.

Firstly, there is the sheer number of possible diagnostic strategies that a cost-effectiveness model allows us to compare. Even if we are looking at only a couple of tests, we can use them in various combinations and at many diagnostic thresholds. See, for example, this cost-effectiveness analysis of diagnosis of prostate cancer.
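To give a rough sense of scale, here is a minimal sketch in Python (with entirely hypothetical tests, cut-offs, and sequencing rules, not taken from any real analysis) of how quickly the number of candidate strategies grows once we allow different test orderings, thresholds, and rules for when to add a second test:

from itertools import product

# Hypothetical tests and candidate cut-offs; the values are illustrative only.
tests = {"MRI": [3, 4, 5], "biopsy": [10, 12]}
second_test_rules = ["never", "if first test positive", "if first test negative"]

strategies = []
for first, second in product(tests, repeat=2):
    if first == second:
        continue  # each test is used at most once per strategy
    for first_cutoff, rule in product(tests[first], second_test_rules):
        if rule == "never":
            strategies.append((first, first_cutoff, rule, None, None))
        else:
            for second_cutoff in tests[second]:
                strategies.append((first, first_cutoff, rule, second, second_cutoff))

print(len(strategies))  # 29 strategies from just two tests and a handful of cut-offs

Add more tests, cut-offs, or re-testing rules and the count quickly runs into the hundreds.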

Secondly, diagnostic tests can affect costs and health outcomes in multiple ways. A test can have a direct effect on people’s health-related quality of life and mortality risk, as well as on costs through its acquisition cost and the consequences of its side effects. Furthermore, a test can have an indirect effect via the consequences of the subsequent management decisions. This indirect effect is often the key driver of cost-effectiveness.

As a result, a cost-effectiveness analysis of diagnostic tests can include many strategies, with multiple effects modelled over the short and the long term. This makes the model and its results difficult to understand.

Map out the effect of the test on health outcomes or costs

The first step in developing any cost-effectiveness model is to understand how the new technology, such as a diagnostic test or a drug, can impact the patient and the health care system. Ferrante di Ruffano et al and Kip et al are two studies that can be used as a starting point to understand the possible effects of a test on health outcomes and/or costs.

Ferrante di Ruffano et al conducted a review of the mechanisms by which diagnostic tests can affect health outcomes, and provide a list of these possible effects.

Kip et al suggest a checklist for the reporting of cost-effectiveness analyses of diagnostic tests and biomarkers. Although the checklist is intended for reporting a cost-effectiveness analysis that has already been conducted, it can also be used as a prompt to define the possible effects of a test.

Reach a shared understanding of the clinical pathway

The parallel step is to understand the clinical pathway into which the diagnostic strategies will be integrated and which they will affect. This consists of conceptualising the elements of the health care service relevant to the decision problem. If you’d like to know more about model conceptualisation, I suggest this excellent paper by Paul Tappenden.

These conceptual models are necessarily simplifications of reality. They need to be as simple as possible, yet accurate enough that lay stakeholders recognise them as valid. As Einstein said: “to make the irreducible basic elements as simple and as few as possible, without having to surrender the adequate representation of a single datum of experience.”

Agree which impacts to include in the cost-effectiveness model

What to include in and what to exclude from the model is, at present, more of an art than a science. For example, Chilcott et al conducted a series of interviews with health economists and found that their approaches to model development varied widely.

I find that the best approach is to design the model in consultation with the relevant stakeholders, such as clinicians, patients, and health care managers. This ensures that the cost-effectiveness model has face validity for those who will ultimately be its end users and (hopefully) advocates of its results.

Decouple the model diagram from the mathematical model

When we have a reasonable idea of the model that we are going to build, we can draw its diagram. A model diagram is not only a recommended component of the reporting of a cost-effectiveness model, but it also helps lay stakeholders understand the model.

The temptation is often to draw the model diagram to mirror the mathematical model as closely as possible. In cost-effectiveness models of diagnostic tests, the mathematical model tends to be a decision tree. Therefore, we often see a decision tree diagram.

The problem is that decision trees can easily become unwieldy when there are various test combinations and decision nodes. We can try to condense a gigantic decision tree into a simpler diagram, but unless you have great graphic design skills, it may be a futile exercise (see, for example, here).

An alternative approach is to decouple the model diagram from the mathematical model and break down the decision problem into steps. The figure below shows an example of how the model diagram can be decoupled from the mathematical model.

The diagram breaks the problem down into steps that relate to the clinical pathway and, therefore, to the stakeholders. In this example, the diagram follows the questions that clinicians and patients may ask: which test should be done first? Given the result of the first test, should a second test be done? If a second test is done, which one?

Simplified model diagram for the cost-effectiveness analysis of magnetic resonance imaging (MRI) and biopsy to diagnose prostate cancer

Relate the results to the model diagram

The next point of contact between health economists and lay stakeholders is likely to be when the first cost-effectiveness results become available.

The typical chart for presenting probabilistic results is the cost-effectiveness acceptability curve (CEAC). In my experience, the CEAC is challenging for lay stakeholders. It plots results over a range of cost-effectiveness thresholds, which are not quantities that most people outside cost-effectiveness analysis relate to. Additionally, CEACs showing the results of multiple strategies can have many lines and some discontinuities, which can be difficult for the untrained eye to interpret.

An alternative approach is to re-use the model diagram to present the results. The model diagram can show the strategy that is expected to be cost-effective and its probability of cost-effectiveness at the relevant threshold. For example, the probability that strategies starting with a specific test are cost-effective is X%; the probability that strategies using that test at a specific cut-off are cost-effective is Y%; and so on.
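As an illustration of where such numbers could come from, here is a minimal sketch in Python, using made-up probabilistic sensitivity analysis output and an assumed threshold of £20,000 per QALY, that computes the probability of cost-effectiveness for each strategy and for a family of strategies sharing the same first test:

import numpy as np

rng = np.random.default_rng(0)
threshold = 20_000  # assumed cost-effectiveness threshold (per QALY)

strategies = ["MRI first, cut-off A", "MRI first, cut-off B", "Biopsy first"]
n_sims = 5_000

# Hypothetical PSA samples of mean costs and QALYs for each strategy
costs = rng.normal(loc=[10_000, 11_000, 12_000], scale=1_000, size=(n_sims, 3))
qalys = rng.normal(loc=[8.00, 8.10, 8.05], scale=0.20, size=(n_sims, 3))

net_benefit = qalys * threshold - costs
best = net_benefit.argmax(axis=1)  # cost-effective strategy in each simulation

for i, name in enumerate(strategies):
    print(f"P({name} is cost-effective) = {np.mean(best == i):.0%}")

# Probability that any strategy starting with MRI is cost-effective
print(f"P(MRI-first strategies) = {np.mean(best < 2):.0%}")

Reported this way, the results map directly onto the questions in the diagram, rather than onto a threshold axis.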

Next steps for practice and research

Research about the communication of cost-effectiveness analysis is sparse, and guidance is lacking. Beyond the general recommendation to use plain English and avoid jargon, there is little concrete advice. Hence, health economists find themselves developing their own approaches and techniques.

In my experience, the key aspects of effective communication are to engage with lay stakeholders from the start of model development, to explain the intuition behind the model with simplified diagrams, and to strike a balance between scientific accuracy and clarity that is appropriate for the audience.

More research and guidance are clearly needed to develop communication methods that are effective and straightforward to use in applied cost-effectiveness analysis. Perhaps this is where patient and public involvement can really make a difference!

James Altunkaya’s journal round-up for 3rd September 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Sensitivity analysis for not-at-random missing data in trial-based cost-effectiveness analysis: a tutorial. PharmacoEconomics [PubMed] [RePEc] Published 20th April 2018

Last month, we highlighted a Bayesian framework for imputing missing data in economic evaluation. The paper dealt with the issue of departure from the ‘Missing at Random’ (MAR) assumption by using a Bayesian approach to specify a plausible missingness model from the results of expert elicitation. This was used to estimate a prior distribution for the unobserved terms in the outcomes model.

For those less comfortable with Bayesian estimation, this month we highlight a tutorial paper from the same authors, outlining an approach to recognise the impact of plausible departures from ‘Missingness at Random’ assumptions on cost-effectiveness results. Given poor adherence to current recommendations for best practice in handling and reporting missing data, an incremental approach to improving missing data methods in health research may be more realistic. The authors supply accompanying Stata code.

The paper investigates the importance of assuming a degree of ‘informative’ missingness (i.e. ‘Missingness not at Random’) in sensitivity analyses. In a case study, the authors present a range of scenarios which assume a decrement of 5-10% in the quality of life of patients with missing health outcomes, compared to multiple imputation estimates based on observed characteristics under standard ‘Missing at Random’ assumptions. This represents an assumption that, controlling for all observed characteristics used in multiple imputation, those with complete quality of life profiles may have higher quality of life than those with incomplete surveys.

Quality of life decrements were implemented in the control and treatment arms separately, and then jointly, across six scenarios. This aimed to demonstrate the sensitivity of cost-effectiveness judgements to the possibility of a different missingness mechanism in each arm. The authors similarly investigate sensitivity to higher health costs among those with missing data than predicted from observed characteristics in imputation under ‘Missingness at Random’. Finally, sensitivity to a simultaneous departure from ‘Missingness at Random’ in both health outcomes and health costs is investigated.
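To make the mechanics concrete, here is an illustrative sketch in Python (not the authors’ Stata code, and with made-up data): utility values imputed under ‘Missing at Random’ are shifted downwards for those with missing outcomes, separately by trial arm, before the incremental result is recomputed.

import numpy as np

rng = np.random.default_rng(1)
n = 200
arm = rng.integers(0, 2, n)                       # 0 = control, 1 = treatment
utility = rng.normal(0.70 + 0.05 * arm, 0.10, n)  # hypothetical complete data
missing = rng.random(n) < 0.3                     # ~30% of outcomes missing

# Stand-in for multiple imputation under MAR: impute the observed arm mean
imputed = utility.copy()
for a in (0, 1):
    imputed[(arm == a) & missing] = utility[(arm == a) & ~missing].mean()

def incremental_utility(values, dec_control=0.0, dec_treatment=0.0):
    """Mean treatment-control difference after MNAR decrements on imputed values only."""
    adjusted = values.copy()
    adjusted[missing & (arm == 0)] -= dec_control
    adjusted[missing & (arm == 1)] -= dec_treatment
    return adjusted[arm == 1].mean() - adjusted[arm == 0].mean()

print("MAR:", round(incremental_utility(imputed), 3))
print("MNAR, treatment arm only:", round(incremental_utility(imputed, 0.00, 0.07), 3))
print("MNAR, both arms:", round(incremental_utility(imputed, 0.07, 0.07), 3))

In a full analysis the decrements would typically be applied across each of the multiply imputed datasets and the results combined in the usual way, but the basic logic is the same.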

The proposed sensitivity analyses provide a useful heuristic to assess what degree of difference between missing and non-missing subjects on unobserved characteristics would be necessary to change cost-effectiveness decisions. The authors admit this framework could appear relatively crude to those comfortable with more advanced missing data approaches such as those outlined in last month’s round-up. However, this approach should appeal to those interested in presenting the magnitude of uncertainty introduced by missing data assumptions, in a way that is easily interpretable to decision makers.

The impact of waiting for intervention on costs and effectiveness: the case of transcatheter aortic valve replacement. The European Journal of Health Economics [PubMed] [RePEc] Published September 2018

This paper appears in print this month and sparked interest as one of comparatively few studies on the cost-effectiveness of waiting lists. Given the interest in using constrained optimisation methods in health outcomes research, highlighted in this month’s editorial in Value in Health, there is rightly an appetite for extending the traditional sphere of economic evaluation beyond drugs and devices, to understanding the trade-offs of investing in a wider range of policy interventions using a common metric of costs and QALYs. Rachel Meacock’s paper earlier this year did a great job of outlining some of the challenges involved in broadening the scope of economic evaluation to more general decisions in health service delivery.

The authors set out to understand the cost-effectiveness of delaying a cardiac treatment (TAVR) using a waiting list of up to 12 months, compared to a policy of immediate treatment. The effectiveness of treatment at 3, 6, 9 and 12 months after initial diagnosis, health decrements during waiting, and the corresponding health costs during the wait and post-treatment were derived from a small observational study. As the treatment is studied in an elderly population, a non-ignorable proportion of patients die whilst waiting for surgery. This translates to lower modelled costs, but also fewer quality-adjusted life years, in modelled cohorts with any delay relative to a policy of immediate treatment. The authors conclude that eliminating all waiting time for TAVR would produce population health at a rate of ~€12,500 per QALY gained.
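For readers less used to ICERs, the headline figure is simply the ratio of the incremental cost to the incremental health of immediate treatment versus waiting. With purely illustrative per-patient numbers (mine, not the paper’s), chosen only so that the ratio matches the reported rate:

\[
\text{ICER} = \frac{C_{\text{immediate}} - C_{\text{wait}}}{Q_{\text{immediate}} - Q_{\text{wait}}} \approx \frac{\text{\euro}2{,}500}{0.20\ \text{QALYs}} = \text{\euro}12{,}500 \text{ per QALY gained}.
\]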

However, based on the modelling presented, the authors lack the ability to make cost-effectiveness judgements of this sort. Waiting lists exist for a reason, chiefly a lack of clinical capacity to treat patients immediately. In taking a decision to treat patients immediately in one disease area, we therefore need some judgement as to whether the health displaced among now-untreated patients in another disease area is of greater, lesser, or equal magnitude to that gained by treating TAVR patients immediately. Alternatively, the modelling should include the cost of acquiring additional clinical capacity (such as theatre space) to treat TAVR patients immediately, so as not to displace other treatments. In that case, the ICER is likely to be much higher, due to the large cost of the new resources needed to reduce waiting times to zero.

Given the data available, a simple improvement to the paper would be to treat current waiting times (already gathered from the observational study) as the ‘standard of care’ arm. The estimated change in quality of life and healthcare resource cost from reducing waiting times to zero, from the levels observed in current practice, could then be calculated. This could in turn be used to calculate the maximum acceptable cost of acquiring the additional treatment resources needed to treat patients with no waiting time, given current national willingness-to-pay thresholds.
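In symbols (notation mine, not the paper’s), at a willingness-to-pay threshold \(\lambda\) the additional capacity would only be worth buying if its cost stayed below the incremental net monetary benefit of eliminating the wait:

\[
C^{\max}_{\text{capacity}} = \lambda\,(Q_{\text{no wait}} - Q_{\text{current wait}}) - (C_{\text{no wait}} - C_{\text{current wait}}).
\]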

Admittedly, there remain problems in using the authors’ chosen observational dataset to calculate quality of life and cost outcomes for patients treated at different time points. Waiting times in this ‘real world’ observational study were prioritised based on clinical assessment of patients’ treatment need. It is therefore to be expected that the quality of life lost while waiting would be lower for the patients treated at 12 months than it would have been had the group judged to need immediate treatment been made to wait. A previous study in cardiac care took on the more manageable task of investigating the cost-effectiveness of different prioritisation strategies for the waiting list, examining the sensitivity of its conclusions to varying a fixed maximum wait time for the last patient treated.

This study therefore demonstrates some of the difficulties in attempting to make cost-effectiveness judgements about waiting time policy. Given that the cost-effectiveness of reducing waiting times is expected to vary across disease areas, depending on the relative impact of waiting on short- and long-term health outcomes and costs, this remains an interesting area for economic evaluation to explore. In the context of the current focus on constrained optimisation techniques across different areas of healthcare (see the ISPOR task force), extending economic evaluation to a broader range of decision problems on a common scale is likely to become increasingly important in future.

Understanding and identifying key issues with the involvement of clinicians in the development of decision-analytic model structures: a qualitative study. PharmacoEconomics [PubMed] Published 17th August 2018

This paper gathers evidence from interviews with clinicians and modellers, with the aim of improving the working relationship between the two fields during model development.

Researchers gathered opinion from a variety of settings, including industry. The main report focusses on evidence from two case studies: one tracking the working relationship between modellers and a single clinical advisor at a UK university, and a second gathering evidence from a UK policy institute, where modellers worked with up to 11 clinical experts per meeting.

Some of the authors’ conclusions are not particularly surprising. Modellers reported difficulty in recruiting clinicians to advise on model structures, and further difficulty in then engaging recruited clinicians to provide relevant advice for the model building process. Specific comments suggested that some clinical advisors had difficulty identifying representative patient experiences, instead diverting modellers’ attention towards rare outlier events.

Study responses suggested that currently only one or two clinicians are typically consulted during model development. The authors recommend involving a larger group of clinicians at this stage of the modelling process, with a more varied range of clinical experience (junior as well as senior clinicians, and some geographical variation). This is intended to help ensure that the clinical pathways modelled are generalisable. The experience of the single clinical collaborator in the university-based case study, compared with the 11 clinicians at the policy institute, perhaps also illustrates a general problem of inadequate compensation for clinical time within the university system. The authors also advocate making relevant training in decision modelling available to clinicians, to help make participants’ time during model building more efficient. The clinicians sampled were supportive of this view, citing the need for further guidance from modellers on the nature of their expected contribution.

This study ties into the general literature regarding structural uncertainty in decision analytic models. In advocating the early contribution of a larger, more diverse group of clinicians in model development, the authors advocate a degree of alignment between clinical involvement during model structuring, and guidelines for eliciting parameter estimates from clinical experts. Similar problems, however, remain for both fields, in recruiting clinical experts from sufficiently diverse backgrounds to provide a valid sample.

Credits


Chris Sampson’s journal round-up for 23rd April 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

What should we know about the person behind a TTO? The European Journal of Health Economics [PubMed] Published 18th April 2018

The time trade-off (TTO) is a staple of health state valuation. Ask someone to value a health state with respect to time and – hey presto! – you have QALYs. This editorial suggests that completing a TTO can be a difficult task for respondents and that, more importantly, individuals’ characteristics may determine the way that they respond and therefore the nature of the results. One of the most commonly demonstrated differences, in this respect, is the fact that valuations of people’s own health states tend to be higher than those of health states valued hypothetically. But this paper focuses on indirect (hypothetical) valuations. The authors highlight mixed evidence for the influence of age, gender, marital status, having children, education, income, expectations about the future, and one’s own health state. But why should we try to find out more about respondents when conducting TTOs? The authors offer three reasons: i) to inform sampling, ii) to inform the design and standardisation of TTO exercises, and iii) to inform the analysis. I agree – we need to better understand these sources of heterogeneity. Not to over-engineer responses, but to aid our interpretation, even if we want societally-representative valuations that include all of these variations in response behaviour. TTO valuation studies should collect data relating to the individual respondents. Unfortunately, what those data should be isn’t specified in this study, so the research question in the title isn’t really answered. But maybe that’s something the authors have in hand.
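For readers unfamiliar with the mechanics, the conventional TTO for a state regarded as better than dead rests on a simple indifference condition: if a respondent is indifferent between \(x\) years in full health and \(t\) years in health state \(h\), then

\[
u(h) \times t = 1 \times x \quad\Longrightarrow\quad u(h) = \frac{x}{t},
\]

which is why everything we learn about how respondents handle the task feeds directly into the utilities, and hence the QALYs, that we produce.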

Computer modeling of diabetes and its transparency: a report on the eighth Mount Hood Challenge. Value in Health Published 9th April 2018

The Mount Hood Challenge is a get-together for people working on the (economic) modelling of diabetes. The subject of the 2016 meeting was transparency, with two specific goals: i) to evaluate the transparency of two published studies, and ii) to develop a diabetes-specific checklist for transparent reporting of modelling studies. Participants were tasked (in advance of the meeting) with replicating the two published studies and using the replicated models to evaluate some pre-specified scenarios. Both of the studies had some serious shortcomings in the reporting of the necessary data for replication, including the baseline characteristics of the population. Five modelling groups replicated the first model and seven groups replicated the second model. Naturally, the different groups made different assumptions about what should be used in place of missing data. For the first paper, none of the models provided results that matched the original. Not even close. And the differences between the results of the replications – in terms of costs incurred and complications avoided – were huge. The performance was a bit better on the second paper, but hardly worth celebrating. In general, the findings were fear-confirming. Informed by these findings, the Diabetes Modeling Input Checklist was created, designed to complement existing checklists with more general applications. It includes specific data requirements for the reporting of modelling studies, relating to the simulation cohort, treatments, costs, utilities, and model characteristics. If you’re doing some modelling in diabetes, you should have this paper to hand.

Setting dead at zero: applying scale properties to the QALY model. Medical Decision Making [PubMed] Published 9th April 2018

In health state valuation, whether or not a state is considered ‘worse than dead’ is heavily dependent on methodological choices. This paper reviews the literature to answer two questions: i) what are the reasons for anchoring at dead=0, and ii) how does the position of ‘dead’ on the utility scale affect decision making? The authors took a standard systematic approach to identify literature from databases, with seven papers included. Then the authors discuss scale properties and the idea that there are interval scales (such as temperature) and ratio scales (such as distance). The difference between these is the meaningfulness of the reference point (or origin). This means that you can talk about distance doubling, but you can’t talk about temperature doubling, because 0 metres is not arbitrary, whereas 0 degrees Celsius is. The paper summarises some of the arguments put forward for using dead=0. They aren’t compelling. The authors argue that the duration part of the QALY (i.e. time) needs to have ratio properties for the QALY model to function. Time obviously holds this property and it’s clear that duration can be anchored at zero. The authors then demonstrate that, for the QALY model to work, the health-utility scale must also exhibit ratio scale properties. The basis for this is the assumption that zero duration nullifies health states and that ‘dead’ nullifies duration. But the paper doesn’t challenge the conceptual basis for using dead in health state valuation exercises. Rather, it considers the mathematical properties that must hold to allow for dead=0, and asserts them. The authors’ conclusion that dead “needs to have the value of 0 in a QALY model” is correct, but only within the existing restrictions and assumptions underlying current practice. Nevertheless, this is a very useful study for understanding the challenge of anchoring and explicating the assumptions underlying the QALY model.
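A compact way to restate the authors’ argument: in the multiplicative QALY model

\[
\text{QALYs} = u(h) \times t,
\]

zero duration nullifies any health state (\(u(h) \times 0 = 0\)), and if ‘dead’ is to nullify any duration in the same way (\(u(\text{dead}) \times t = 0\) for all \(t\)), then \(u(\text{dead})\) must equal 0, which is only meaningful if the utility scale has a true, non-arbitrary origin, i.e. ratio-scale properties.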

Credits