Shilpi Swami’s journal round-up for 9th December 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Performance of UK National Health Service compared with other high-income countries: observational study. BMJ [PubMed] Published 27th November 2019

Efficiencies and inefficiencies of the NHS in the UK have been debated in recent years. This new study reveals the performance of the NHS compared to other high-income countries, based on observational data, and has already caught a bunch of attention (almost 3,000 tweets and 6 news appearances, since publication)!

The authors presented a descriptive analysis of the UK (England, Scotland, Northern Ireland, and Wales) compared to nine other countries (US, Canada, Germany, Australia, Sweden, France, Denmark, the Netherlands, and Switzerland) based on aggregated recent data from a range of sources (such as OECD, World Bank, the Institute for Health Metrics and Evaluation, and Eurostat). Good things first: on access to care, a lower proportion of people reported unmet needs owing to costs. Waiting times were comparable to those in other countries, except for specialist care. The UK performed slightly better on the metric of patient safety. The main challenge, however, is that NHS healthcare spending is lower and has been growing more slowly. This means fewer doctors and nurses, and doctors spending less time with patients. The authors suggest that

“Policy makers should consider how recent changes to nursing bursaries, the weakened pound, and uncertainty about the status of immigrant workers in the light of the Brexit referendum result have influenced these numbers and how to respond to these challenges in the future.”

Understandably, comparing healthcare systems across the world is difficult. Including the US in the study, and not including other countries like Spain and Japan, may need more justification or could be a topic for future research.

To be fair, the article is a not-to-miss read. It is an eye-opener for those who think it’s only a (too much) demand-side problem that the NHS is facing and confirms the perspective of those who think it’s a (not enough) supply-side problem. Kudos to the hardworking doctors and nurses who are currently delivering efficiently in the stretched situation! For sustainability, the NHS needs to consider increasing its spending to increase labour supply and long-term care.

A systematic review of methods to predict weight trajectories in health economic models of behavioral weight management programs: the potential role of psychosocial factors. Medical Decision Making [PubMed] Published 2nd December 2019

In economic modelling, assumptions are often made about the long-term impact of interventions, and it’s important that these assumptions are based on sound evidence and/or tested in sensitivity analysis, as these could affect the cost-effectiveness results.

The authors explored assumptions about weight trajectories to inform economic modelling of behavioural weight management programmes. Also, they checked their evidence sources, and whether these assumptions were based on any psychosocial variables (such as self-regulation, motivation, self-efficacy, and habit), as these are known to be associated with weight-loss trajectories.

The authors conducted a systematic literature review of economic models of weight management interventions that aimed at reducing weight. In the 38 studies included, they found 6 types of assumptions of weight trajectories beyond trial duration (weight loss maintained, weight loss regained immediately, linear weight regain, subgroup-specific trajectories, exponential decay of effect, maintenance followed by regain), with only 15 of the studies reporting sources for these assumptions. The authors also elaborated on the assumptions and graphically represented them. Psychosocial variables were, in fact, measured in evidence sources of some of the included studies. However, the authors found that none of the studies estimated their weight trajectory assumptions based on these! Though the article also reports on how the assumptions were tested in sensitivity analyses and their impact on results in the studies (if reported within these studies), it would have been interesting to see more insights into this. The authors feel that there’s a need to investigate how psychosocial variables measured in trials can be used within health economic models to calculate weight trajectories and, thus, to improve the validity of cost-effectiveness estimates.
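
To make the assumption types concrete, here is a minimal sketch of how five of the six might look as functions of time since the end of a trial. The function name, parameter names, and default values (a 5 kg initial loss, a 5-year regain period, and so on) are purely illustrative assumptions and are not taken from the review or from any of the included models. Subgroup-specific trajectories are left out because they depend on how a given model defines its subgroups.

```python
import math


def weight_loss_remaining(assumption, years, initial_loss=5.0,
                          regain_years=5.0, maintain_years=2.0,
                          decay_rate=0.5):
    """Weight loss (kg) still present `years` after the trial ends.

    All default values are hypothetical placeholders for illustration.
    """
    if assumption == "maintained":
        # Weight loss is kept indefinitely.
        return initial_loss
    if assumption == "immediate_regain":
        # All weight is regained as soon as follow-up ends.
        return initial_loss if years == 0 else 0.0
    if assumption == "linear_regain":
        # Weight returns at a constant rate over `regain_years`.
        return initial_loss * max(0.0, 1.0 - years / regain_years)
    if assumption == "exponential_decay":
        # The effect shrinks by a constant proportion each year.
        return initial_loss * math.exp(-decay_rate * years)
    if assumption == "maintain_then_regain":
        # Loss is held for `maintain_years`, then regained linearly.
        if years <= maintain_years:
            return initial_loss
        elapsed = years - maintain_years
        return initial_loss * max(0.0, 1.0 - elapsed / regain_years)
    raise ValueError(f"Unknown assumption: {assumption}")


# Compare the remaining weight loss three years after the trial.
for name in ("maintained", "immediate_regain", "linear_regain",
             "exponential_decay", "maintain_then_regain"):
    print(name, round(weight_loss_remaining(name, 3.0), 2))
```

Even with identical trial results, the remaining benefit at three years differs a lot across these choices, which is why the source of the assumption matters for the cost-effectiveness estimate.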

To me, the fact that only around half of included studies reported sources of assumptions on long-term effects of the interventions and performed sensitivity analysis on these assumptions raises the bigger, long-debated question about the quality of economic evaluations! To conclude, the review is comprehensive and insightful. It is an interesting read and will be especially useful for those interested in modelling long-term impacts of behavioural support programs.

The societal monetary value of a QALY associated with EQ‐5D‐3L health gains. The European Journal of Health Economics [PubMed] Published 28th November 2019

Estimates of the societal monetary value of a QALY (MVQALY) are mostly sought to inform the range of thresholds that guide cost-effectiveness decisions.

This study explores the degree of variation in the societal MVQALY based on a large sample of the population in Spain. It uses a discrete choice experiment and a time trade-off exercise to derive a value set for utilities, followed by a willingness to pay (WTP) questionnaire. The study reveals that the societal values for a QALY, corresponding to different EQ-5D-3L health gains, vary approximately between €10,000 and €30,000. Counterintuitively, the MVQALY associated with larger improvements in quality of life (QoL) was found to be lower than with moderate QoL gains, meaning that WTP is less than proportional to the size of the QoL improvement. The authors further explored whether budgetary restrictions could be a reason for this by analysing responses of individuals with higher income and found that this may explain it partly, but not fully. As this, at face value, implies there should be a lower cost per QALY threshold for interventions with the largest improvements in health than for those with moderate improvements, it raises a lot of questions and forces you to interpret the findings with caution. The authors suggest that the diminishing MVQALY is, at least partly, produced by the lack of sensitivity of WTP responses.
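
The arithmetic behind a falling MVQALY is simple enough to sketch. The snippet below assumes the MVQALY is just stated WTP divided by the QALY gain. The function name and the WTP figures are invented for illustration and are not results from the paper.

```python
def mvqaly(wtp_euros, qaly_gain):
    """Monetary value of a QALY implied by a WTP response."""
    if qaly_gain <= 0:
        raise ValueError("QALY gain must be positive")
    return wtp_euros / qaly_gain


# Hypothetical respondent: doubling the health gain raises stated WTP,
# but by much less than double.
moderate = mvqaly(wtp_euros=7500, qaly_gain=0.25)
large = mvqaly(wtp_euros=10000, qaly_gain=0.5)

print(moderate)  # 30000.0
print(large)     # 20000.0
```

WTP goes up with the size of the gain, yet the implied value per QALY goes down, which is the pattern of less-than-proportional WTP described above.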

Though I think that the article does not provide a clear take-home message, it makes readers re-think the underlying norms of estimating monetary values of QALYs. The study ultimately raises more questions than it answers but could be useful for further exploring areas of utility research.


Chris Sampson’s journal round-up for 2nd July 2018

Choice in the presence of experts: the role of general practitioners in patients’ hospital choice. Journal of Health Economics [PubMed] [RePEc] Published 26th June 2018

In the UK, patients are in principle free to choose which hospital they use for elective procedures. However, as these choices operate through a GP referral, the extent to which the choice is ‘free’ is limited. The choice set is provided by the GP and thus there are two decision-makers. It’s a classic example of the principal-agent relationship. What’s best for the patient and what’s best for the local health care budget might not align. The focus of this study is on the applied importance of this dynamic and the idea that econometric studies that ignore it – by looking only at patient decision-making or only at GP decision-making – may give biased estimates. The author outlines a two-stage model for the choice process that takes place. Hospital characteristics can affect choices in three ways: i) by only influencing the choice set that the GP presents to the patient, e.g. hospital quality, ii) by only influencing the patient’s choice from the set, e.g. hospital amenities, and iii) by influencing both, e.g. waiting times. The study uses Hospital Episode Statistics for 30,000 hip replacements that took place in 2011/12, referred by 4,721 GPs to 168 hospitals, to examine revealed preferences. The choice set for each patient is not observed, so a key assumption is that all hospitals to which a GP made referrals in the period are included in the choice set presented to patients. The main findings are that both GPs and patients are influenced primarily by distance. GPs are influenced by hospital quality and the budget impact of referrals, while distance and waiting times explain patient choices. For patients, parking spaces seem to be more important than mortality ratios. The results support the notion that patients defer to GPs in assessing quality. In places, it’s difficult to follow what the author did and why they did it.
But in essence, the author is looking for (and in most cases finding) reasons not to ignore GPs’ preselection of choice sets when conducting econometric analyses involving patient choice. Econometricians should take note. And policymakers should be asking whether freedom of choice is sensible when patients prioritise parking and when variable GP incentives could give rise to heterogeneous standards of care.
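
To see why the preselection matters, here is a toy sketch of a two-stage choice. It is not the author’s model: the GP rule (offer the top-scoring hospitals), the logit form for the patient’s choice, and all names and numbers are assumptions made up for illustration.

```python
import math


def gp_choice_set(gp_scores, size):
    """Stage 1: a hypothetical GP rule offering the `size` best-scoring hospitals."""
    ranked = sorted(gp_scores, key=gp_scores.get, reverse=True)
    return ranked[:size]


def patient_choice_probs(patient_utils, choice_set):
    """Stage 2: logit choice probabilities over the offered hospitals only."""
    weights = {h: math.exp(patient_utils[h]) for h in choice_set}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


# Invented values: the GP scores hospitals on quality and budget impact,
# while the patient cares about distance, waiting times, and parking.
gp_scores = {"A": 0.9, "B": 0.6, "C": 0.2}
patient_utils = {"A": 1.0, "B": 1.0, "C": 3.0}

offered = gp_choice_set(gp_scores, size=2)
probs = patient_choice_probs(patient_utils, offered)
print(offered, probs)
```

Hospital C would be this patient’s favourite, but it is never offered, so it is never chosen. An analysis that treats all three hospitals as available would wrongly conclude that the patient dislikes C.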

Using evidence from randomised controlled trials in economic models: what information is relevant and is there a minimum amount of sample data required to make decisions? PharmacoEconomics [PubMed] Published 20th June 2018

You’re probably aware of the classic ‘irrelevance of inference’ argument. Statistical significance is irrelevant in deciding whether or not to fund a health technology, because we ought to do whatever we expect to be best on average. This new paper argues the case for irrelevance in other domains, namely multiplicity (e.g. multiple testing) and sample size. With a primer on hypothesis testing, the author sets out the regulatory perspective. Multiplicity inflates the chance of a type I error, so regulators worry about it. That’s why triallists often obsess over primary outcomes (and avoiding multiplicity). But when we build decision models, we rely on all sorts of outcomes from all sorts of studies, and QALYs are never the primary outcome. So what does this mean for reimbursement decision-making? Reimbursement is based on expected net benefit as derived using decision models, which are Bayesian by definition. Within a Bayesian framework of probabilistic sensitivity analysis, data for relevant parameters should never be disregarded on the basis of the status of their collection in a trial, and it is up to the analyst to properly specify a model that properly accounts for the effects of multiplicity and other sources of uncertainty. The author outlines how this operates in three settings: i) estimating treatment effects for rare events, ii) the number of trials available for a meta-analysis, and iii) the estimation of population mean overall survival. It isn’t so much that multiplicity and sample size are irrelevant, as they could inform the analysis, but rather that no data is too weak for a Bayesian analyst.
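
The decision rule at the heart of the ‘irrelevance of inference’ argument can be sketched in a few lines. This is a minimal illustration, assuming we already have probabilistic sensitivity analysis draws of incremental QALYs and incremental costs. The function name, the draws, and the £20,000 threshold are all hypothetical choices for the example, not anything from the paper.

```python
import statistics


def expected_net_benefit(qaly_draws, cost_draws, threshold):
    """Mean incremental net monetary benefit across PSA draws."""
    net_benefits = [threshold * q - c
                    for q, c in zip(qaly_draws, cost_draws)]
    return statistics.mean(net_benefits)


# Invented draws: the QALY gain is noisy and is negative in one draw,
# so a significance test on so few data points would not be convincing.
incremental_qalys = [0.1, -0.05, 0.2, 0.05]
incremental_costs = [1000, 1000, 1000, 1000]

enb = expected_net_benefit(incremental_qalys, incremental_costs,
                           threshold=20000)
adopt = enb > 0
print(round(enb, 2), adopt)
```

The rule looks only at the sign of the expected value, which here comes out at about 500, so the technology would be adopted. The uncertainty still matters, but for deciding whether more research is worthwhile rather than for the adoption decision itself.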

Life satisfaction, QALYs, and the monetary value of health. Social Science & Medicine [PubMed] Published 18th June 2018

One of this blog’s first ever posts was on the subject of ‘the well-being valuation approach‘ but, to date, I don’t think we’ve ever covered a study in the round-up that uses this method. In essence, the method is about estimating trade-offs between (for example) income and some measure of subjective well-being, or some health condition, in order to estimate the income equivalence for that state. This study attempts to estimate the (Australian) dollar value of QALYs, as measured using the SF-6D. Thus, the study is a rival cousin to the Claxton-esque opportunity cost approach, and a rival sibling to stated preference ‘social value of a QALY’ approaches. The authors are trying to identify a threshold value on the basis of revealed preferences. The analysis is conducted using 14 waves of the Australian HILDA panel, with more than 200,000 person-year responses. A regression model estimates the impact on life satisfaction of income, SF-6D index scores, and the presence of long-term conditions. The authors adopt an instrumental variable approach to try and address the endogeneity of life satisfaction and income, using an indicator of ‘financial worsening’ to approximate an income shock. The estimated value of a QALY is found to be around A$42,000 (~£23,500) over a 2-year period. Over the long-term, it’s higher, at around A$67,000 (~£37,500), because individuals are found to discount money differently to health. The results also demonstrate that individuals are willing to pay around A$2,000 to avoid a long-term condition on top of the value of a QALY. The authors apply their approach to a few examples from the literature to demonstrate the implications of using well-being valuation in the economic evaluation of health care. As with all uses of experienced utility in the health domain, adaptation is a big concern. But a key advantage is that this approach can be easily applied to large sets of survey data, giving powerful results. 
However, I haven’t quite got my head around how meaningful the results are. SF-6D index values – as used in this study – are generated on the basis of stated preferences. So to what extent are we measuring revealed preferences? And if it’s some combination of stated and revealed preference, how should we interpret willingness to pay values?



Chris Sampson’s journal round-up for 9th October 2017

Evaluating the relationship between visual acuity and utilities in patients with diabetic macular edema enrolled in intravitreal aflibercept studies. Investigative Ophthalmology & Visual Science [PubMed] Published October 2017

Part of my day job involves the evaluation of a new type of screening programme for diabetic eye disease, including the use of a decision analytic model. Cost-effectiveness models usually need health state utility values for parameters in order to estimate QALYs. There are some interesting challenges in evaluating health-related quality of life in the context of vision loss; does vision in the best eye or worst eye affect quality of life most; do different eye diseases have different impacts independent of sight loss; do generic preference-based measures even work in this context? This study explores some of these questions. It combines baseline and follow-up EQ-5D and VFQ-UI (a condition-specific preference-based measure) responses from 1,320 patients from 4 different studies, along with visual acuity data. OLS and random effects panel models are used to predict utility values dependent on visual acuity and other individual characteristics. Best-seeing eye seems to be a more important determinant than worst-seeing eye, which supports previous studies. But worst-seeing eye is still important, with about a third of the impact of best-seeing eye. So economic evaluations shouldn’t ignore the bilateral nature of eye disease. Visual acuity – in both best- and worst-seeing eye – was more strongly associated with the condition-specific VFQ-UI than with the EQ-5D index, leading to better predictive power, which is not a big surprise. One way to look at this is that the EQ-5D underestimates the impact of visual acuity on utility. An alternative view could be that the VFQ-UI valuation process overestimates the impact of visual acuity on utility. This study is a nice demonstration of the fact that selecting health state utility values for a model-based economic evaluation is not straightforward. Attention needs to be given to the choice of measure (e.g. generic or condition-specific), but also to the way states are defined to allow for accurate utility values to be attached.

Do capability and functioning differ? A study of U.K. survey responses. Health Economics [PubMed] Published 24th September 2017

I like the capability approach in theory, but not in practice. I’ve written before about some of my concerns. One of them is that we don’t know whether capability measures (such as the ICECAP) offer anything beyond semantic nuance. This study sought to address that. A ‘functioning and capability’ instrument was devised, which reworded the ICECAP-A by changing phrases like “I am able to be” to phrases like “I am”, so that each question could have a ‘functioning’ version as well as a ‘capability’ version. Then, both the functioning and capability versions of the domains were presented in tandem. Questionnaires were sent to 1,627 individuals who had participated in another study about spillover effects in meningitis. Respondents (n=1,022) were family members of people experiencing after-effects of meningitis. The analysis focusses on the instances where capabilities and functionings diverge. Across the sample, 34% of respondents reported a difference between capability and functioning on at least one domain. For all domain-level responses, 12% were associated with higher capability than functioning, while 2% reported higher functioning. Some differences were observed between different groups of people. Older people tended to be less likely to report excess capabilities, while those with degree-level education reported greater capabilities. Informal care providers had lower functionings and capabilities but were more likely to report a difference between the two. Women were more likely to report excess capabilities in the ‘attachment’ domain. These differences lead the author to conclude that the wording of the ICECAP measure enables researchers to capture something beyond functioning, and that the choice of a capability measure could lead to different resource allocation decisions. I’m not convinced. The study makes an error that is common in this field; it presupposes that the changes in wording successfully distinguish between capabilities and functionings. 
This is implemented analytically by dropping from the primary analysis the cases where functionings exceeded capabilities, which are presumed to be illogical. If we don’t accept this presupposition (and we shouldn’t) then the meaning of the findings becomes questionable. The paper does outline most of the limitations of the study, but it doesn’t dedicate much space to alternative explanations. One is to do with the distinction between ‘can’ and ‘could’. If people answer ‘capability’ questions with reference to future possibilities, then the difference could simply be driven by optimism about future functionings. This future-reference problem is most obvious in the ‘achievement and progress’ domain, which incidentally, in this study, was the domain with the greatest probability of showing a discrepancy between capabilities and functionings. Another alternative explanation is that showing someone two slightly different questions coaxes them into making an artificial distinction that they wouldn’t otherwise make. In my previous writing on this, I suggested that two things needed to be identified. The first was to see whether people give different responses with the different wording. This study goes some way towards that, which is a good start. The second was to see whether people value states defined in these ways any differently. Until we have answers to both these questions I will remain sceptical about the implications of the ICECAP’s semantic nuance.

Estimating a constant WTP for a QALY—a mission impossible? The European Journal of Health Economics [PubMed] Published 21st September 2017

The idea of estimating willingness to pay (WTP) for a QALY has fallen out of fashion. It’s a nice idea in principle but, as the title of this paper suggests, it’s not easy to come up with a meaningful answer. One key problem has been that WTP for a QALY is not constant in the number of QALYs being gained – that is, people are willing to pay less (at the margin) for greater QALY gains. But maybe that’s OK. NICE and their counterparts tend not to use a fixed threshold but rather a range: £20,000-£30,000 per QALY, say. So maybe the variability in WTP for a QALY can be reflected in this range. This study explores some of the reasons – including uncertainty – for differences in elicited WTP values for a QALY. A contingent valuation exercise was conducted using a 2014 Internet panel survey of 1,400 Swedish citizens. The survey consisted of 21 questions about respondents’ own health, sociodemographics, prioritisation attitudes, WTP for health improvements, and a societal decision-making task. Respondents were randomly assigned to one of five scenarios with different magnitudes and probabilities of health gain, with yes/no responses for five different WTP ‘bids’. The estimated WTP for a QALY – using the UK EQ-5D-3L tariff – was €17,000. But across the different scenarios, the WTP ranged from €10,600 to over a million. Wide confidence intervals abound. The authors’ findings only partially support an assumption of weak scope sensitivity – that more QALYs are worth paying more for – and do not at all support a strong assumption of scope sensitivity that WTP is proportional to QALY gain. This is what is known as scope bias, and this insensitivity to scope also applied to the variability in uncertainty. The authors also found that using different EQ-5D or VAS tariffs to estimate health state values resulted in variable differences in WTP estimates.
Consistent relationships between individuals’ characteristics and their WTP were not found, though income and education seemed to be associated with higher willingness to pay across the sample. It isn’t clear what the implications of these findings are, except for the reinforcement of any scepticism you might have about the sociomathematical validity (yes, I’m sticking with that) of the QALY.