Chris Sampson’s journal round-up for 9th October 2017

Evaluating the relationship between visual acuity and utilities in patients with diabetic macular edema enrolled in intravitreal aflibercept studies. Investigative Ophthalmology & Visual Science [PubMed] Published October 2017

Part of my day job involves the evaluation of a new type of screening programme for diabetic eye disease, including the use of a decision analytic model. Cost-effectiveness models usually need health state utility values for parameters in order to estimate QALYs. There are some interesting challenges in evaluating health-related quality of life in the context of vision loss; does vision in the best eye or worst eye affect quality of life most; do different eye diseases have different impacts independent of sight loss; do generic preference-based measures even work in this context? This study explores some of these questions. It combines baseline and follow-up EQ-5D and VFQ-UI (a condition-specific preference-based measure) responses from 1,320 patients from 4 different studies, along with visual acuity data. OLS and random effects panel models are used to predict utility values dependent on visual acuity and other individual characteristics. Best-seeing eye seems to be a more important determinant than worst-seeing eye, which supports previous studies. But worst-seeing eye is still important, with about a third of the impact of best-seeing eye. So economic evaluations shouldn’t ignore the bilateral nature of eye disease. Visual acuity – in both best- and worst-seeing eye – was more strongly associated with the condition-specific VFQ-UI than with the EQ-5D index, leading to better predictive power, which is not a big surprise. One way to look at this is that the EQ-5D underestimates the impact of visual acuity on utility. An alternative view could be that the VFQ-UI valuation process overestimates the impact of visual acuity on utility. This study is a nice demonstration of the fact that selecting health state utility values for a model-based economic evaluation is not straightforward. Attention needs to be given to the choice of measure (e.g. generic or condition-specific), but also to the way states are defined to allow for accurate utility values to be attached.
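To make the best-eye versus worst-eye comparison concrete, here is a minimal sketch of the OLS part of that kind of analysis. Everything in it is an assumption for illustration: the data are simulated rather than taken from the study, acuity is coded as a letter score, and the coefficients `beta_best` and `beta_worst` are chosen only so that the worst-seeing eye has a third of the best-seeing eye's effect. The random effects panel models are not attempted here.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulated visual acuity (letter score) for each eye; NOT study data.
eye_a = rng.uniform(20, 90, n)
eye_b = rng.uniform(20, 90, n)
best = np.maximum(eye_a, eye_b)   # best-seeing eye
worst = np.minimum(eye_a, eye_b)  # worst-seeing eye

# Assumed data-generating process: the worst-eye effect is one third
# of the best-eye effect, plus noise.
beta_best, beta_worst = 0.003, 0.001
utility = 0.5 + beta_best * best + beta_worst * worst + rng.normal(0, 0.05, n)

# OLS of utility on an intercept, best-eye acuity and worst-eye acuity.
X = np.column_stack([np.ones(n), best, worst])
coef, *_ = np.linalg.lstsq(X, utility, rcond=None)
intercept_hat, best_hat, worst_hat = coef

print(f"best-eye coefficient:  {best_hat:.4f}")
print(f"worst-eye coefficient: {worst_hat:.4f}")
print(f"worst/best ratio:      {worst_hat / best_hat:.2f}")
```

Entering both eyes as separate regressors is what lets a model assign different utility values to states that share the same best-eye acuity but differ in the other eye.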

Do capability and functioning differ? A study of U.K. survey responses. Health Economics [PubMed] Published 24th September 2017

I like the capability approach in theory, but not in practice. I’ve written before about some of my concerns. One of them is that we don’t know whether capability measures (such as the ICECAP) offer anything beyond semantic nuance. This study sought to address that. A ‘functioning and capability’ instrument was devised, which reworded the ICECAP-A by changing phrases like “I am able to be” to phrases like “I am”, so that each question could have a ‘functioning’ version as well as a ‘capability’ version. Then, both the functioning and capability versions of the domains were presented in tandem. Questionnaires were sent to 1,627 individuals who had participated in another study about spillover effects in meningitis. Respondents (n=1,022) were family members of people experiencing after-effects of meningitis. The analysis focusses on the instances where capabilities and functionings diverge. Across the sample, 34% of respondents reported a difference between capability and functioning on at least one domain. For all domain-level responses, 12% were associated with higher capability than functioning, while 2% reported higher functioning. Some differences were observed between different groups of people. Older people tended to be less likely to report excess capabilities, while those with degree-level education reported greater capabilities. Informal care providers had lower functionings and capabilities but were more likely to report a difference between the two. Women were more likely to report excess capabilities in the ‘attachment’ domain. These differences lead the author to conclude that the wording of the ICECAP measure enables researchers to capture something beyond functioning, and that the choice of a capability measure could lead to different resource allocation decisions. I’m not convinced. The study makes an error that is common in this field; it presupposes that the changes in wording successfully distinguish between capabilities and functionings. 
This is implemented analytically by dropping from the primary analysis the cases where functionings exceeded capabilities, which are presumed to be illogical. If we don’t accept this presupposition (and we shouldn’t) then the meaning of the findings becomes questionable. The paper does outline most of the limitations of the study, but it doesn’t dedicate much space to alternative explanations. One is to do with the distinction between ‘can’ and ‘could’. If people answer ‘capability’ questions with reference to future possibilities, then the difference could simply be driven by optimism about future functionings. This future-reference problem is most obvious in the ‘achievement and progress’ domain, which incidentally, in this study, was the domain with the greatest probability of showing a discrepancy between capabilities and functionings. Another alternative explanation is that showing someone two slightly different questions coaxes them into making an artificial distinction that they wouldn’t otherwise make. In my previous writing on this, I suggested that two things needed to be identified. The first was to see whether people give different responses with the different wording. This study goes some way towards that, which is a good start. The second was to see whether people value states defined in these ways any differently. Until we have answers to both these questions I will remain sceptical about the implications of the ICECAP’s semantic nuance.
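For anyone wanting to see what the divergence tally looks like mechanically, here is a rough sketch. The paired responses are made up, and the coding (levels 1 to 4, higher meaning more capability or functioning) is an assumption rather than something taken from the paper.

```python
from collections import Counter

def classify(capability, functioning):
    """Compare one paired domain response (higher level = more)."""
    if capability > functioning:
        return "excess capability"
    if functioning > capability:
        return "excess functioning"
    return "no difference"

# Hypothetical (capability, functioning) pairs for one domain.
pairs = [(4, 4), (4, 3), (3, 3), (2, 3), (4, 2), (3, 3), (1, 1), (4, 4)]

counts = Counter(classify(c, f) for c, f in pairs)
shares = {label: count / len(pairs) for label, count in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.1%}")
```

Note that this only counts differences in responses. It says nothing about whether respondents understood the two wordings as capability and functioning, which is the presupposition at issue.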

Estimating a constant WTP for a QALY—a mission impossible? The European Journal of Health Economics [PubMed] Published 21st September 2017

The idea of estimating willingness to pay (WTP) for a QALY has fallen out of fashion. It’s a nice idea in principle but, as the title of this paper suggests, it’s not easy to come up with a meaningful answer. One key problem has been that WTP for a QALY is not constant in the number of QALYs being gained – that is, people are willing to pay less (at the margin) for greater QALY gains. But maybe that’s OK. NICE and their counterparts tend not to use a fixed threshold but rather a range: £20,000-£30,000 per QALY, say. So maybe the variability in WTP for a QALY can be reflected in this range. This study explores some of the reasons – including uncertainty – for differences in elicited WTP values for a QALY. A contingent valuation exercise was conducted using a 2014 Internet panel survey of 1,400 Swedish citizens. The survey consisted of 21 questions about respondents’ own health, sociodemographics, prioritisation attitudes, WTP for health improvements, and a societal decision-making task. Respondents were randomly assigned to one of five scenarios with different magnitudes and probabilities of health gain, with yes/no responses for five different WTP ‘bids’. The estimated WTP for a QALY – using the UK EQ-5D-3L tariff – was €17,000. But across the different scenarios, the WTP ranged from €10,600 to over a million. Wide confidence intervals abound. The authors’ findings only partially support an assumption of weak scope sensitivity – that more QALYs are worth paying more for – and do not at all support the strong assumption of scope sensitivity, namely that WTP is proportional to QALY gain. This is what is known as scope bias, and this insensitivity to scope also applied to the variability in uncertainty. The authors also found that using different EQ-5D or VAS tariffs to estimate health state values resulted in variable differences in WTP estimates. 
Consistent relationships between individuals’ characteristics and their WTP were not found, though income and education seemed to be associated with higher willingness to pay across the sample. It isn’t clear what the implications of these findings are, except for the reinforcement of any scepticism you might have about the sociomathematical validity (yes, I’m sticking with that) of the QALY.
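The arithmetic behind the scope problem is simple enough to sketch. The scenarios and amounts below are hypothetical, not the survey’s, and the function assumes the simplest possible approach of dividing stated WTP by the expected QALY gain (magnitude times probability).

```python
import math

def wtp_per_qaly(wtp, qaly_gain, probability=1.0):
    """Implied WTP per QALY: stated WTP over expected QALY gain."""
    expected_gain = qaly_gain * probability
    if expected_gain <= 0:
        raise ValueError("expected QALY gain must be positive")
    return wtp / expected_gain

# Hypothetical scenarios: (stated WTP in euros, QALY gain, probability).
scenarios = {
    "small certain gain": (500.0, 0.05, 1.0),
    "large certain gain": (800.0, 0.50, 1.0),
    "large uncertain gain": (400.0, 0.50, 0.1),
}

implied = {name: wtp_per_qaly(*args) for name, args in scenarios.items()}
for name, value in implied.items():
    print(f"{name}: about {value:,.0f} euros per QALY")
```

In this made-up example, a tenfold larger gain attracts a stated WTP that is less than double, so the implied value of a QALY falls by a factor of about six. If WTP were proportional to QALY gain, all three scenarios would give the same figure.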


Thesis Thursday: Lidia Engel

Title: Going beyond health-related quality of life for outcome measurement in economic evaluation
Supervisors: David Whitehurst, Scott Lear, Stirling Bryan
Your thesis explores the potential for expanding the ‘evaluative space’ in economic evaluation. Why is this important?

I think there are two answers to this question. Firstly, methods for economic evaluation of health care interventions have existed for a number of years but these evaluations have mainly been applied to more narrowly defined ‘clinical’ interventions, such as drugs. Interventions nowadays are more complex, where benefits cannot be simply measured in terms of health. You can think of areas such as public health, mental health, social care, and end-of-life care, where interventions may result in broader benefits, such as increased control over daily life, independence, or aspects related to the process of health care delivery. Therefore, I believe there is a need to re-think the way we measure and value outcomes when we conduct an economic evaluation. Secondly, ignoring broader outcomes of health care interventions that go beyond the narrow focus of health-related quality of life can potentially lead to misallocation of scarce health care resources. Evidence has shown that the choice of outcome measure (such as a health outcome or a broader measure of wellbeing) can have a significant influence on the conclusions drawn from an economic evaluation.

You use both qualitative and quantitative approaches. Was this key to answering your research questions?

I mainly applied quantitative methods in my thesis research. However, Chapter 3 draws upon some qualitative methodology. To gain a better understanding of ‘benefits beyond health’, I came across a novel approach, called Critical Interpretive Synthesis. It is similar to meta-ethnography (i.e. a synthesis of qualitative research), with the difference that the synthesis is not of qualitative literature but of methodologically diverse literature. It involves an iterative approach, where searching, sampling, and synthesis go hand in hand. It doesn’t only produce a summary of existing literature but enables the development of new interpretations that go beyond those originally offered in the literature. I really liked this approach because it enabled me to synthesise the evidence in a more effective way compared with a conventional systematic review. Defining and applying codes and themes, as it is traditionally done in qualitative research, allowed me to organize the general idea of non-health benefits into a coherent thematic framework, which in the end provided me with a better understanding of the topic overall.

What data did you analyse and what quantitative methods did you use?

I conducted three empirical analyses in my thesis research, which all made use of data from the ICECAP measures (ICECAP-O and ICECAP-A). In my first paper, I used data from the ‘Walk the Talk (WTT)’ project to investigate the complementarity of the ICECAP-O and the EQ-5D-5L in a public health context using regression analyses. My second paper used exploratory factor analysis to investigate the extent of overlap between the ICECAP-A and five preference-based health-related quality of life measures, using data from the Multi Instrument Comparison (MIC) project. I am currently finalizing submission of my third empirical analysis, which reports findings from a path analysis using cross-sectional data from a web-based survey. The path analysis explores three outcome measurement approaches (health-related quality of life, subjective wellbeing, and capability wellbeing) through direct and mediated pathways in individuals living with spinal cord injury. Each of the three studies addressed different components of the overall research question, which, collectively, demonstrated the added value of broader outcome measures in economic evaluation when compared with existing preference-based health-related quality of life measures.

Thinking about the different measures that you considered in your analyses, were any of your findings surprising or unexpected?

In my first paper, I found that the ICECAP-O is more sensitive to environmental features (i.e. social cohesion and street connectivity) when compared with the EQ-5D-5L. As my second paper has shown, this was not surprising, as the ICECAP-A (a measure for adults rather than older adults) and the EQ-5D-5L measure different constructs and had only limited overlap in their descriptive classification systems. While a similar observation was made when comparing the ICECAP-A with three other preference-based health-related quality of life measures (15D, HUI-3, and SF-6D), a substantial overlap was observed between the ICECAP-A and the AQoL-8D, which suggests that it is possible for broader benefits to be captured by preference-based health-related measures (although some may not consider the AQoL-8D to be exclusively ‘health-related’, despite the label). The findings from the path analysis confirmed the similarities between the ICECAP-A and the AQoL-8D. However, the findings do not imply that the AQoL-8D and ICECAP-A are interchangeable instruments, as a mediation effect was found that requires further research.

How would you like to see your research inform current practice in economic evaluation? Is the QALY still in good health?

I am aware of the limitations of the QALY and although there are increasing concerns that the QALY framework does not capture all benefits of health care interventions, it is important to understand that the evaluative space of the QALY is determined by the dimensions included in preference-based measures. From a theoretical point of view, the QALY can embrace any characteristics that are important for the allocation of health care resources. However, in practice, it seems that QALYs are currently defined by what is measured (e.g. the dimensions and response options of EQ-5D instruments) rather than the conceptual origin. Therefore, although non-health benefits have been largely ignored when estimating QALYs, one should not dismiss the QALY framework but rather develop appropriate instruments that capture such broader benefits. I believe the findings of my thesis have particular relevance for national HTA bodies that set guidelines for the conduct of economic evaluation. While the need to maintain methodological consistency is important, the assessment of the real benefits of some health care interventions would be more accurate if we were less prescriptive in terms of which outcome measure to use when conducting an economic evaluation. As my thesis has shown, some preference-based measures already adopt a broad evaluative space but are less frequently used.

Alastair Canaway’s journal round-up for 28th November 2016

The cost-effectiveness of antibiotic prophylaxis for patients at risk of infective endocarditis. Circulation [PubMed] Published 13th November 2016

Did NICE get it wrong? In 2008 NICE recommended stopping antibiotic prophylaxis (AP) for those at risk of infective endocarditis (IE). For those unfamiliar with this research area, AP refers to the use of antibiotics or similar to prevent infection complications. IE is an infection of the endocardial surface of the heart which can have severe and potentially fatal consequences. NICE withdrew its recommendation of AP for those at risk of IE undergoing dental procedures, citing a lack of evidence of efficacy and cost-effectiveness. This paper sought to fill the void in evidence and conduct an economic evaluation of AP using the latest estimates of efficacy and resource use. The paper constructed a decision analytic model to estimate costs and benefits. Both resource use and adverse event rates were sourced through Hospital Episode Statistics. The results were pretty conclusive: AP was less costly and more effective (than no AP) for all patients at risk of IE. Scenario analyses suggested that AP would have to be substantially less effective than estimated for it to fail on grounds of cost-effectiveness. The paper estimated that the annual savings of reintroducing AP in England would be between £5.5m and £8.2m with a health gain of over 2,600 QALYs. Given the low costs of AP, the consequent cost saving and health improvements, perhaps NICE will be persuaded to reconsider their decision.
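“Less costly and more effective” is what is usually called dominance, and the decision logic can be sketched in a few lines. The per-patient increments below are invented for illustration and are not taken from the paper; the £20,000 per QALY threshold is simply the lower end of the range commonly associated with NICE.

```python
import math

def assess(delta_cost, delta_qaly, threshold):
    """Classify an option against its comparator.

    delta_cost and delta_qaly are increments (new minus comparator);
    threshold is the value placed on one QALY.
    """
    net_monetary_benefit = threshold * delta_qaly - delta_cost
    if delta_cost < 0 and delta_qaly > 0:
        verdict = "dominant"
    elif delta_cost > 0 and delta_qaly < 0:
        verdict = "dominated"
    elif net_monetary_benefit > 0:
        verdict = "cost-effective at threshold"
    else:
        verdict = "not cost-effective at threshold"
    return verdict, net_monetary_benefit

# Hypothetical per-patient increments: cheaper and more effective.
verdict, nmb = assess(delta_cost=-10.0, delta_qaly=0.002, threshold=20_000)
print(verdict, round(nmb, 2))

# A costlier option with a small health gain, for contrast.
verdict_b, nmb_b = assess(delta_cost=500.0, delta_qaly=0.01, threshold=20_000)
print(verdict_b, round(nmb_b, 2))
```

A dominant option is preferred whatever threshold is chosen, which is why a result like this leaves relatively little room for argument about the decision rule.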

Maximizing health or sufficient capability in economic evaluation? A methodological experiment of treatment for drug addiction. Medical Decision Making [PubMed] Published 17th November 2016

The standard normative framework for economic evaluation within the UK is extra-welfarism, specifically, using health as the maximand (typically measured using QALYs). Thus, the evaluative space is health, with maximisation as the decision rule. Arguments have been made that health maximisation is not always the most appropriate framework. It has been suggested that the evaluative space be broadened to include capability wellbeing (based on the work of Sen), whilst a minimum threshold approach has been touted as an alternative approach to decision making. Such an approach is egalitarian and aims to ensure all members of society achieve a ‘sufficient’ level of capability wellbeing. This paper reports a pilot trial for the treatment of drug addiction to explore how i) changing the evaluative space to that of capability wellbeing, and ii) switching the decision-making principle to sufficient capability, impacts upon the decisions made. The drug addiction context is particularly pertinent due to non-health spillover impacts on the patient and others. The trial compares three treatments: treatment as usual (TAU), TAU with social behaviour and network therapy (SBNT) and TAU with goal setting (GS). The two measures of interest within this study are the EQ-5D-5L and the ICECAP-A (capability measure for adults), from which QALYs and years of full capability (YFC) were calculated. Years of sufficient capability (YSC) were also calculated, with sufficient capability defined by a score of 33333: ‘a lot’ on each dimension of the ICECAP-A instrument. The study examined four situations: i) broadening the costing perspective from NHS/PSS to government, ii) broadening the evaluative space from QALYs to YFC, iii) broadening both costing perspective and evaluative space, and iv) changing the decision making rule to years of sufficient capability (YSC). 
The study found that changing from health maximisation to capability maximisation changed the treatment decision, as did changing the perspective: the treatment recommendation is sensitive to the choice of evaluative space and perspective. In the YSC analysis, the decision remained the same as in the YFC analysis. The authors note a number of limitations with their study. The biggest, for me, was the sample size of 83, which is unsurprising given this was a pilot trial. As a result of the small numbers in each arm (30, 27, and 26) there is a surfeit of uncertainty, and just a handful of extreme cases in any one arm has the potential to change the results, so it is difficult to draw any firm conclusions from this study. This paper does, however, provide a good starting point for the novel YFC approach, and I’d be very interested in seeing this operationalised in a larger trial.
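To show how YFC and YSC might differ in practice, here is a rough sketch. It assumes YFC is the area under a person’s capability index over time (by analogy with QALYs), and that sufficiency is imposed by rescaling each index value against the value of the threshold state and capping it at 1. That rescaling rule is my reading, not something confirmed above, and both the index values and the `threshold_value` of 0.75 are hypothetical stand-ins, not the ICECAP-A tariff value for state 33333.

```python
import math

def area_under_curve(times_years, values):
    """Trapezoidal area under a series of (time, value) observations."""
    total = 0.0
    for i in range(1, len(times_years)):
        width = times_years[i] - times_years[i - 1]
        total += width * (values[i] + values[i - 1]) / 2
    return total

def to_sufficiency(value, threshold_value):
    """Assumed rule: at or above the threshold counts as 1, else rescale."""
    return min(value / threshold_value, 1.0)

# Hypothetical capability index values at baseline, 6 and 12 months.
times_years = [0.0, 0.5, 1.0]
capability = [0.6, 0.8, 0.9]
threshold_value = 0.75  # hypothetical index value of the 'sufficient' state

yfc = area_under_curve(times_years, capability)
ysc = area_under_curve(
    times_years, [to_sufficiency(v, threshold_value) for v in capability]
)
print(f"YFC: {yfc:.3f}, YSC: {ysc:.3f}")
```

Under this rule, improvements above the threshold add nothing to YSC, so an intervention that mainly benefits people who are already ‘sufficient’ looks worse than it does under YFC.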

Does the EQ-5D capture the effects of physical and mental health status on life satisfaction among older people? A path analysis approach. Quality of Life Research [PubMed] Published 19th November 2016

This study sought to identify whether the EQ-5D captures impacts of mental and physical health on life satisfaction (LS) of older adults. This involved a retrospective cohort of 884 patients in Ireland. Path analysis was used to evaluate the direct and indirect effects. The EQ-5D-3L was used to measure health-related quality of life, whilst life satisfaction was measured with the life satisfaction index (LSI). Various specific measures of health status were also collected, e.g. co-morbidity level, activity limitation, and anxiety and depression. Within the analysis a number of assumptions were required, specifically around causation. The overall findings suggest that the EQ-5D-3L sufficiently captures the impact of physical health on life satisfaction, but not mental health. The authors reflect that this may be due to a fundamental incommensurability of the general public’s preferences (who value the health states for the EQ-5D) and those who experience these health states. The authors conclude that the EQ-5D-3L should be used with caution within economic evaluations, and that the use of the EQ-5D will underestimate benefits of treatment to mental health. The authors suggest alternative measures: HUI-3, AQoL and the ICECAP, and advocate their use alongside the EQ-5D within economic evaluation to better capture mental health impacts. A lot of this boils down to existing issues of debate: who should do the valuing (patient vs society), what are we trying to maximise (health vs well-being, or minimum threshold) and are existing measures doing the job they are supposed to be doing (is the EQ-5D fit for purpose). All these are interesting areas and it’s nice to see these issues being pushed to the fore once more.
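For readers less familiar with path analysis, the direct/indirect decomposition can be sketched with two regressions. This is a simplified single-mediator illustration on simulated, standardised variables, not the model or data from the paper; the path values `a_true`, `b_true` and `direct_true` are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000

# Simulated variables; NOT study data.
a_true, b_true, direct_true = 0.5, 0.4, 0.2
health = rng.normal(size=n)                       # health status
eq5d = a_true * health + rng.normal(size=n)       # mediator
life_sat = b_true * eq5d + direct_true * health + rng.normal(size=n)

def ols_slopes(y, *regressors):
    """Return OLS slope coefficients (intercept fitted but dropped)."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1:]

(a_hat,) = ols_slopes(eq5d, health)                    # health -> EQ-5D
b_hat, direct_hat = ols_slopes(life_sat, eq5d, health)  # EQ-5D -> LS, health -> LS
indirect_hat = a_hat * b_hat                           # effect carried via EQ-5D
total_hat = direct_hat + indirect_hat

print(f"direct: {direct_hat:.3f}, indirect: {indirect_hat:.3f}, "
      f"share via EQ-5D: {indirect_hat / total_hat:.0%}")
```

If the EQ-5D captured the whole effect of a health variable on life satisfaction, the direct path would be close to zero; a sizeable direct path is the signal that something is being missed. The causal ordering is imposed by the analyst, which is where the assumptions around causation come in.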