Don Husereau’s journal round-up for 25th November 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Development and validation of the TRansparent Uncertainty ASsessmenT (TRUST) tool for assessing uncertainties in health economic decision models. PharmacoEconomics [PubMed] Published 11th November 2019

You’re going to quickly see that all three papers in today’s round-up align with some strong personal pet peeves that I harbour toward the nebulous world of market access and health technology assessment – most prominent is how loose we seem to be with language and form without overarching standards. This may be of no surprise to some when discussing a field which lacks a standard definition and for which many international standards of what constitutes good practice have never been defined.

This first paper deals with both issues and provides a useful tool for characterizing uncertainty. The authors state the purpose of the tool is “for systematically identifying, assessing, and reporting uncertainty in health economic models.” They suggest, to the best of their knowledge, no such tool exists. They also support the need for the tool by asserting that uncertainty in health economic modelling is often not fully characterized. The reasons, they suggest, are twofold: (1) there has been too much emphasis on imprecision; and (2) it is difficult to express all uncertainty.

I couldn’t agree more. What I deeply believe about those planning and conducting economic evaluation is that they too often obsess about uncertainty that is less relevant (but more amenable to statistical adjustment) and don’t address the uncertainty that payers actually care about. To wit, while it may be important to explore and adopt methods that deal with imprecision (dealing with known unknowns), such as improving utility variance estimates (from an SE of 0.003 to 0.011, yes sorry Kelvin and Feng for the callout), not getting this right is unlikely to lead to truly bad decisions. (Kelvin and Feng both know this.)

What is much more important for decision makers is uncertainty that stems from a lack of knowledge. These are unknown unknowns. In my experience this typically has to do with generalizability (how well will it work in different patients or against a different comparator?) and durability (how do I translate 16 weeks of data into a lifetime?); not things resolved by better variance estimates and probabilistic analysis. In Canada, our HTA body has even gone so far as to respond to the egregious act of not providing different parametric forms for extrapolation with the equally egregious act of using unrealistic time horizon adjustments to deal with this. Two wrongs don’t make a right.
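To make the durability point concrete, here is a minimal sketch of how two parametric forms can agree perfectly with short-term data and still disagree wildly over a lifetime. Everything in it is an assumption for illustration: the 90% survival at 16 weeks, the Weibull shape of 0.6, the 40-year horizon, and all function names are made up and come from no paper or HTA submission.

```python
import math

def exponential_survival(t_years, rate):
    """Survival under a constant hazard."""
    return math.exp(-rate * t_years)

def weibull_survival(t_years, scale, shape):
    """Survival under a Weibull model; shape < 1 means a falling hazard."""
    return math.exp(-((t_years / scale) ** shape))

def restricted_mean_survival(survival_fn, horizon_years, step=0.01):
    """Area under the survival curve up to the horizon (trapezoid rule)."""
    n_steps = int(round(horizon_years / step))
    area = 0.0
    for i in range(n_steps):
        t0, t1 = i * step, (i + 1) * step
        area += 0.5 * (survival_fn(t0) + survival_fn(t1)) * step
    return area

# Hypothetical trial result: 90% of patients alive at 16 weeks.
t_obs = 16 / 52
s_obs = 0.90

# Calibrate both models so they pass exactly through the observed point.
rate = -math.log(s_obs) / t_obs
shape = 0.6  # assumed, not estimated from anything
scale = t_obs / (-math.log(s_obs)) ** (1 / shape)

horizon = 40  # assumed "lifetime" horizon in years
mean_exp = restricted_mean_survival(
    lambda t: exponential_survival(t, rate), horizon)
mean_weibull = restricted_mean_survival(
    lambda t: weibull_survival(t, scale, shape), horizon)

print(f"Exponential: {mean_exp:.1f} life-years over {horizon} years")
print(f"Weibull:     {mean_weibull:.1f} life-years over {horizon} years")
```

Both curves fit the 16-week data equally well, so no amount of probabilistic analysis around either one tells you which extrapolation to believe. That is the kind of uncertainty a tighter standard error does not touch.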

To develop the tool, the authors first conducted a (presumably narrative) review of uncertainty frameworks and then ran identified concepts across a bunch of HTA expert committee types. They also used a previously developed framework as a basis for identifying all the places where uncertainty in HTA could occur. Using the concepts and the HTA areas, they developed a tool which was presented a few times and then validated through semi-structured interviews with different international stakeholders (N = 11). The interviews also provided insights into barriers to its use, user-friendliness, and feasibility.

Once the tool was developed, six case studies were worked up with an illustration of one of them (pembrolizumab for Hodgkin’s lymphoma) in the manuscript. While the tool does not provide a score or coefficient to adjust estimates or deal with uncertainty, it is not supposed to. What it is trying to do is make sure you are aware of them all so that you can make some determination as to whether the uncertainties are dealt with. One of the challenges of developing the tool is the lack of standardized terminology regarding uncertainty itself. While a short primer exists in the manuscript, for those who have looked into it, uncertainty terminology is far more uncertain than even the authors let on.

While I appreciate the tool and the attempt to standardize things, I do suspect the approach could have been strengthened (a systematic review and possibly a nominal group technique as is done for reporting guidelines). However, I’m not sure this would have gotten us much closer to the truth. Uncertainty needs to be sorted first and I am happy at their attempt. I hope it raises some awareness of how we can’t simply say we are “uncertain” as if that means something.

Unmet medical need: an introduction to definitions and stakeholder perceptions. Value in Health [PubMed] Published November 2019

The second, and also often-abused, term without an obvious definition is unmet medical need (UMN). My theory is that some confusion has arisen due to a confluence of marketing and clinical development teams and regulators. UMN has come to mean patients with rare diseases, drugs with ‘novel’ mechanisms of action, patients with highly prevalent disease, drugs with a more convenient formulation, or drugs with fewer side effects. And yet payers (in my experience) usually recognize none of these. Payers tend to characterize UMN in different ways: no drugs available to treat the condition, available drugs do not provide consistent or durable responses, and there have been no new medical developments in the area for > 10 years.

The purpose of this research then was to unpack the term UMN further. The authors conducted a comprehensive (gray) literature review to identify definitions of UMN in use by different stakeholders and then unpacked their meaning through consultations and discussions with stakeholders from across Europe, trying to focus on the key elements of unmet medical need with a regulatory and reimbursement lens. This consisted of six one-hour teleconference calls and two workshops held in 2018. One open workshop involved 69 people from regulatory agencies, industry, payers, HTA bodies, patient organizations, healthcare, and academia.

A key finding of this work was that, yes indeed, UMN means different things to different people. A key dimension is whether unmet need is being defined in terms of individuals or populations. Population size (whether prevalent or rare) was not felt to be an element of the definition while there was general consensus that disease severity was. This means UMN should really only consider the UMNs of individual patients, not whether very few or very many patients are in need. It also means we see people who have higher rates of premature mortality and severe morbidity as having more of an unmet need, regardless of how many people are affected by the condition.

And last but not least was the final dimension of how many treatments are actually available. This, the authors point out, is the current legal definition in Europe (as laid down in Article 4, paragraph 2 of Commission Regulation [EC] No. 507/2006). And while this seems the most obvious definition of ‘need’ (we usually need things that are lacking) there was some acknowledgement by stakeholders that simply counting existing therapies is not adequate. There was also acknowledgement that there may be existing therapies available and still an UMN. Certainly this reflects my experience on the pan-Canadian Oncology Drug Review expert review committee, where unmet medical need was an explicit subdomain in their value framework, and where on more than one occasion it was felt, to my surprise, there was an unmet need despite the availability of two or more treatments.

Like the previous paper, the authors did not conduct a systematic review and could have consulted more broadly (no clinician stakeholders were consulted) or used more objective methods. This is a limitation they acknowledge, though addressing it would be unlikely to get them much further ahead in understanding. So what to do with this information? Well, the authors do propose an HTA approach that would triage reimbursement decisions based on UMN. However, stakeholders commented that the method you use really depends on the HTA context. As such, the authors conclude that “the application of the definition within a broader framework depends on the scope of the stakeholder.” In other words, HTA must be fit for purpose (something we knew already). However, as with uncertainty, I’m happy someone is actually trying to create reasonable coherent definitions of such an important concept.

On value frameworks and opportunity costs in health technology assessment. International Journal of Technology Assessment in Health Care [PubMed] Published 18th September 2019

The final, and most-abused, term is that of ‘value’. While value seems an obvious prerequisite to those making investments in healthcare, and while we (some of us) are willing to acknowledge that value is what we are willing to give up to get something, what is less clear is what we want to get and what we want to give up.

The author of this paper, then, hopes to remind us of the various schools of thought on defining value in health that speak to these trade-offs. The first is broadly consistent with the welfarist school of economics and proposes that the value of health care used by decision makers should reflect individuals’ willingness to pay for it. An alternative approach, sometimes referred to as the extra-welfarist framework, argues that the value of a health technology should be consistent with the policy objectives of the health care system, typically health (the author states it is ‘health’ but I’m not sure it has to be). The final school of thought (which I was not familiar with, and neither might you be, which is the point of the paper) is what he terms ‘classical’, where the point is not to maximize a maximand or be held up to notions of efficiency but rather to discuss how consumers will be affected. The reference cited to support this framework is this interesting piece, although I couldn’t find any allusion to the framework within.

What follows is a relatively fair treatment of extra-welfarist and welfarist applications to decision-making with a larger critical swipe at the former (using legitimate arguments that have been previously published – yes, extra-welfarists assume resources are divisible and, yes, extra-welfarists don’t identify the health-producing resources that will actually be displaced and, yes, using thresholds doesn’t always maximize health) and much downplay of the latter (how we might measure trade-offs reliably under a welfarist framework appears to be a mere detail until this concession is finally mentioned: “On account of the measurement issues surrounding [willingness to pay], there may be many situations in which no valid and reliable methods of operationalizing [welfarist economic value frameworks] exist.”) Given the premise of this commentary is that a recent commentary by Culyer seemed to overlook concepts of value beyond extra-welfarist ones, the swipe at extra-welfarist views is understandable. Hence, this paper can be seen as a kind of rebuttal and reminder that other views should not be ignored.

I like the central premise of the paper as summarized here:

“Although the concise term “value for money” may be much easier to sell to HTA decision makers than, for example, “estimated mean valuation of estimated change in mean health status divided by the estimated change in mean health-care costs,” the former loses too much in precision; it seems much less honest. Because loose language could result in dire consequences of economic evaluation being oversold to the HTA community, it should be avoided at all costs”

However, while I am really sympathetic to warning against conceptual shortcuts and loose language, I wonder if this paper misses the bigger point. Firstly, I’m not convinced we are making such bad decisions as those who wish the lambda to be silenced tend to want us to believe. But more importantly, while it is easy to be critical about economics applied loosely or misapplied, this paper (like others) offers no real practical solutions other than the need to acknowledge other frameworks. It is silent on the real reason extra-welfarist approaches and thresholds seem to have stuck around, namely, they have provided a practical and meaningful way forward for difficult decision-making and the HTA processes that support them. They make sense to decision-makers who are willing to overlook some of the conceptual wrinkles. And I’m a firm believer that conceptual models are a starting point for pragmatism. We shouldn’t be slaves to them.

Chris Sampson’s journal round-up for 19th August 2019

Paying for kidneys? A randomized survey and choice experiment. American Economic Review [RePEc] Published August 2019

This paper starts with a quote from Alvin Roth about ‘repugnant transactions’, of which markets for organs provide a prime example. This idea of ‘repugnant transactions’ has been hijacked by some pop economists to represent the stupid opinions of non-economists. If you ask me, markets for organs aren’t repugnant, they just seem like a very bad idea in terms of both efficiency and equity. But it doesn’t matter what I think; it matters what the people of the United States think.

The authors of this study conducted an online survey with a representative sample of 2,666 Americans. Each respondent was randomised to evaluate one of eight systems compared with the current system. The eight systems differed with respect to i) cash or non-cash compensation of ii) different sizes ($30,000 or $100,000), iii) paid by either a public agency or the organ recipient. Participants made five binary choices that differed according to the gain – in transplants generated – associated with the new system. Half of the participants were also asked to express moral judgements.
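For anyone who wants to see where the number eight comes from, here is a rough sketch. It assumes the eight systems are simply every combination of the three two-level features listed above, and that respondents are assigned by simple randomisation. Neither assumption is confirmed by the summary, and the variable names are mine.

```python
import itertools
import random

# The three features described above, each with two levels.
COMPENSATION_FORM = ("cash", "non-cash")
COMPENSATION_SIZE = (30_000, 100_000)  # US dollars
PAYER = ("public agency", "organ recipient")

# Assumption: the eight systems are the full 2 x 2 x 2 crossing.
systems = list(itertools.product(COMPENSATION_FORM, COMPENSATION_SIZE, PAYER))

def assign_system(rng):
    """Pick one system uniformly at random (assumed simple randomisation)."""
    return rng.choice(systems)

rng = random.Random(2019)  # fixed seed so the sketch is repeatable
assignments = [assign_system(rng) for _ in range(2666)]

for form, size, payer in systems:
    n = assignments.count((form, size, payer))
    print(f"{form:>8} | ${size:>7,} | {payer:<15} | {n} respondents")
```

Each respondent then compares their one assigned system against the current system, which is why the features vary between respondents rather than within them.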

Both the system features (e.g. who pays) and the outcomes of the new system influenced people’s choices. Broadly speaking, the results suggest that people aren’t opposed to donors being paid, but are opposed to patients paying. (Remember, we’re talking about the US here!) Around 21% of respondents opposed payment no matter what, 46% were in favour no matter what, and 18% were sensitive to the gain in the number of transplants. A 10 percentage point increase in transplants resulted in a 2.6 percentage point increase in support. Unsurprisingly, individuals’ moral judgements were predictive of the attitudes they expressed, particularly with respect to fairness. The authors describe their results as exhibiting ‘strong polarisation’, which is surely inevitable for questions that involve moral judgement.

Being in AER, this is a long meandering paper with extensive analyses and thoroughly reported results. There’s lots of information and findings that I can’t share here. It’s a valuable study with plenty of food for thought, but I can’t help but think that it is, methodologically, a bit weak. If we want to understand the different views in society, surely some Q methodology would be more useful than a basic online survey. And if we want to elicit stated preferences, surely a discrete choice experiment with a well-thought-out efficient design would give us more meaningful results.

Estimating local need for mental healthcare to inform fair resource allocation in the NHS in England: cross-sectional analysis of national administrative data linked at person level. The British Journal of Psychiatry [PubMed] Published 8th August 2019

The need to fairly (and efficiently) allocate NHS resources across the country played an important part in the birth of health economics in the UK, and resulted in resource allocation formulas. Since 1996 there has been a separate formula for mental health services, which is periodically updated. This study describes the work undertaken for the latest update.

The model is based on predicting service use and total mental health care costs observed in 2015 from predictors in the years 2013-2014, to inform allocations in 2019-2024. Various individual-level data sources available to the NHS were used for 43.7 million people registered with a GP practice and over the age of 20. The cost per patient who used mental health services ranged from £94 to over £1 million, averaging around £2,000. The predictor variables included individual indicators such as age, sex, ethnicity, physical diagnoses, and household type (e.g. number of adults and kids). The model also used variables observed at the local or GP practice level, such as the proportion of people receiving out-of-work benefits and the distance from the mental health trust. All of this got plugged into a good old OLS regression. From individual-level predictions, the researchers created aggregated indices of need for each clinical commissioning group (CCG).

A lot went into the model, which explained 99% of the variation in costs between CCGs. A key way in which this model differs from previous versions is that it relies on individual-level indicators rather than those observed at the level of GP practice or CCG. There was a lot of variation in the CCG need indices, ranging from 0.65 for Surrey Heath to 1.62 for Southwark, where 1.00 is the average. You’ll need to check the online appendices for your own CCG’s level of need (Lewisham: 1.52). As one might expect, the researchers observed a strong correlation between a CCG’s need index and the CCG’s area’s level of deprivation. Compared with previous models, this new model indicates a greater allocation of resources to more deprived and older populations.
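The step from individual-level predictions to a CCG index is easy to lose in the prose, so here is a toy sketch of the general idea. It is not the authors' model. It uses a single made-up predictor instead of their long list, six invented people in two invented CCGs, and it assumes the index is mean predicted cost per head in a CCG divided by the national mean, which is one natural way to get an index where 1.00 is the average.

```python
from collections import defaultdict

# Hypothetical individuals: (CCG label, deprivation score, observed cost in GBP).
people = [
    ("A", 1.0, 100.0), ("A", 2.0, 150.0), ("A", 1.5, 120.0),
    ("B", 4.0, 300.0), ("B", 5.0, 320.0), ("B", 3.0, 210.0),
]

def fit_simple_ols(x, y):
    """Closed-form OLS for one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def need_indices(ccg_labels, predicted_costs):
    """Mean predicted cost per head in each CCG, relative to the national mean."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for ccg, cost in zip(ccg_labels, predicted_costs):
        totals[ccg] += cost
        counts[ccg] += 1
    national_mean = sum(predicted_costs) / len(predicted_costs)
    return {ccg: (totals[ccg] / counts[ccg]) / national_mean for ccg in totals}

ccgs = [p[0] for p in people]
scores = [p[1] for p in people]
costs = [p[2] for p in people]

# Fit on individuals, predict for individuals, then aggregate by CCG.
intercept, slope = fit_simple_ols(scores, costs)
predicted = [intercept + slope * s for s in scores]
indices = need_indices(ccgs, predicted)

for ccg in sorted(indices):
    print(f"CCG {ccg}: need index {indices[ccg]:.2f}")
```

The index is built from predicted rather than observed costs, so it reflects what a CCG's population would be expected to need given its characteristics, not what the CCG happened to spend.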

Measuring, valuing and including forgone childhood education and leisure time costs in economic evaluation: methods, challenges and the way forward. Social Science & Medicine [PubMed] Published 7th August 2019

I’m a ‘societal perspective’ sceptic, not because I don’t care about non-health outcomes (though I do care less) but because I think it’s impossible to capture everything that is of value to society, and that capturing just a few things will introduce a lot of bias and noise. I would also deny that time has any intrinsic value. But I do think we need to do a better job of evaluating interventions for children. So I expected this paper to provide me with a good mix of satisfaction and exasperation.

Health care often involves a loss of leisure or work time, which can constitute an opportunity cost and is regularly included in economic evaluations – usually proxied by wages – for adults. The authors outline the rationale for considering ‘time-related’ opportunity costs in economic evaluations and describe the nature of lost time for children. For adults, the distinction is generally between paid or unpaid work and leisure time. Arguably, this distinction is not applicable to children. Two literature reviews are described. One looked at economic evaluations in the context of children’s health, to see how researchers have valued lost time. The other sought to identify ideas about the value of lost time for children from a broader literature.

The authors do a nice job of outlining how difficult it is to capture non-health-related costs and outcomes in the context of childhood. There is a handful of economic evaluations that have tried to measure and value children’s foregone time. The valuations generally focussed on the costs of childcare rather than the costs to the child, though one looked at the rate of return to education. There wasn’t a lot to go off in the non-health literature, which mostly relates to adults. From what there is, the recommendation is to capture absence from formal education and foregone leisure time. Of course, consideration needs to be given to the importance of lost time and thus the value of capturing it in research. We also need to think about the risk of double counting. When it comes to measurement, we can probably use similar methods as we would for adults, such as diaries. But we need very different approaches to valuation. On this, the authors found very little in the way of good examples to follow. More research needed.

Chris Sampson’s journal round-up for 23rd July 2018

Quantifying life: understanding the history of quality-adjusted life-years (QALYs). Social Science & Medicine [PubMed] Published 3rd July 2018

We’ve had some fun talking about the history of the QALY here on this blog. The story of how the QALY came to be important in health policy has been obscured. This paper seeks to address that. The research adopts a method called ‘multiple streams analysis’ (MSA) in order to explain how QALYs caught on. The MSA framework identifies three streams – policy, politics, and problems – and considers the ‘policy entrepreneurs’ involved. For this study, archival material was collected from the National Archives, Department of Health files, and the University of York. The researchers also conducted 44 semi-structured interviews with academics and civil servants.

The problem stream highlights shocks to the UK economy in the late 1960s, coupled with growth in health care costs due to innovations and changing expectations. Cost-effectiveness began to be studied and, increasingly, policymaking was meant to be research-based and accountable. By the 80s, the likes of Williams and Maynard were drawing attention to apparent inequities and inefficiencies in the health service. The policy stream gets going in the 40s and 50s when health researchers started measuring quality of life. By the early 60s, the idea of standardising these measures to try and rank health states was on the table. Through the late 60s and early 70s, government economists proliferated and proved themselves useful in health policy. The meeting of Rachel Rosser and Alan Williams in the mid-70s led to the creation of QALYs as we know them, combining quantity and quality of life on a 0-1 scale. Having acknowledged inefficiencies and inequities in the health service, UK politicians and medics were open to new ideas, but remained unconvinced by the QALY. Yet it was a willingness to consider the need for rationing that put the wheels in motion for NICE, and the politics stream – like the problem and policy streams – characterises favourable conditions for the use of the QALY.

The MSA framework also considers ‘policy entrepreneurs’ who broker the transition from idea to implementation. The authors focus on the role of Alan Williams and of the Economic Advisers’ Office. Williams was key in translating economic ideas into forms that policymakers could understand. Meanwhile, the Economic Advisers’ Office encouraged government economists to engage with academics at HESG and later the QoL Measurement Group (which led to the creation of EuroQol).

The main takeaway from the paper is that good ideas only prevail in the right conditions and with the right people. It’s important to maintain multi-disciplinary and multi-stakeholder networks. In the case of the QALY, the two-way movement of economists between government and academia was crucial.

I don’t completely understand or appreciate the MSA framework, but this paper is an enjoyable read. My only reservation is with the way the authors describe the QALY as being a dominant aspect of health policy in the UK. I don’t think that’s right. It’s dominant within a niche of a niche of a niche – that is, health technology assessment for new pharmaceuticals. An alternative view is that the QALY has in fact languished in a quiet corner of British policymaking, and been completely excluded in some other countries.

Accuracy of patient recall for self‐reported doctor visits: is shorter recall better? Health Economics [PubMed] Published 2nd July 2018

In designing studies, such as clinical trials, I have always recommended that self-reported resource use be collected no less frequently than every 3 months. This is partly based on something I once read somewhere that I can’t remember, but partly also on some logic that the accuracy of people’s recall decays over time. This paper has come to tell me how wrong I’ve been.

The authors start by highlighting that recall can be subject to omission, whereby respondents forget relevant information, or commission, whereby respondents include events that did not occur. A key manifestation of the latter is ‘telescoping’, whereby events are included from outside the recall period. We might expect commission to be more likely in short recalls and omission to be more common for long recalls. But there’s very little research on this regarding health service use.

This study uses data from a large trial in diabetes care in Australia, in which 5,305 participants were randomised to receive either 2-week, 3-month, or 12-month recall for how many times they had seen a doctor. Then, the trial data were matched with Medicare data to identify the true levels of resource use.

Over 92% of 12-month recall participants made an error, 76% of the 3-month recall, and 46% of the 2-week recall. The patterns of errors were different. There was very little under-reporting in the 2-week recall sample, with 3-month giving the most over-reporting and 12-month giving the most under-reporting. 12-month recall was associated with the largest number of days reported in error. However, when the authors account for the longer period being considered, and estimate a relative error, the impact of misreporting is smallest for the 12-month recall and greatest for the 2-week recall. This translates into a smaller overall bias for the longest recall period. The authors also find that older, less educated, unemployed, and low‐income patients exhibit higher measurement errors.
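The flip from "most errors" to "least bias" is a matter of scale. Here is a rough sketch of one plausible reading, which is my assumption and not necessarily the authors' definition of relative error: if you need an annual figure, an error made in a short recall window gets multiplied up when the window is scaled to a year. The reported and actual visit counts below are invented.

```python
WEEKS_PER_YEAR = 52
RECALL_WEEKS = {"2-week": 2, "3-month": 13, "12-month": 52}

def annualised_error(reported, actual, recall_weeks):
    """Error in visits per year after scaling the recall window to 12 months."""
    return (reported - actual) * (WEEKS_PER_YEAR / recall_weeks)

# Hypothetical (reported, actual) doctor visits within each recall window.
examples = {
    "2-week": (1, 0),     # one telescoped visit
    "3-month": (4, 3),    # one visit over-reported
    "12-month": (10, 13), # three visits forgotten
}

annual = {
    label: annualised_error(reported, actual, RECALL_WEEKS[label])
    for label, (reported, actual) in examples.items()
}

for label, error in annual.items():
    print(f"{label:>8} recall: {error:+.0f} visits per year")
```

In this toy case the 12-month respondent makes the largest raw error, yet the single phantom visit in the 2-week window does far more damage once it is extrapolated to a year.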

Health surveys and comparative studies that estimate resource use over a long period of time should use 12-month recall unless they can find a reason to do otherwise. The authors provide some examples from economic evaluations to demonstrate how selecting shorter recall periods could result in recommending the wrong decisions. It’s worth trying to understand the reasons why people can more accurately recall service use over 12 months. That way, data collection methods could be designed to optimise recall accuracy.

Who should receive treatment? An empirical enquiry into the relationship between societal views and preferences concerning healthcare priority setting. PLoS One [PubMed] Published 27th June 2018

Part of the reason the QALY faces opposition is that it has been used in a way that might not reflect societal preferences for resource allocation. In particular, the idea that ‘a QALY is a QALY is a QALY’ may conflict with notions of desert, severity, or process. We’re starting to see more evidence for groups of people holding different views, which makes it difficult to come up with decision rules to maximise welfare. This study considers some of the perspectives that people adopt, which have been identified in previous research – ‘equal right to healthcare’, ‘limits to healthcare’, and ‘effective and efficient healthcare’ – and looks at how they are distributed in the Netherlands. Using four willingness to trade-off (WTT) exercises, the authors explore the relationship between these views and people’s preferences about resource allocation. Trade-offs are between quality vs quantity of life, health maximisation vs equality, children vs the elderly, and lifestyle-related risk vs adversity. The authors sought to test several hypotheses: i) that ‘equal right’ respondents have a lower WTT; ii) ‘limits to healthcare’ people express a preference for health gains, health maximisation, and treating people with adversity; and iii) ‘effective and efficient’ people support health maximisation, treating children, and treating people with adversity.

A representative online sample of adults in the Netherlands (n=261) was recruited. The first part of the questionnaire collected socio-demographic information. The second part asked questions necessary to allocate people to one of the three perspectives using Likert scales based on a previous study. The third part of the questionnaire consisted of the four reimbursement scenarios. Participants were asked to identify the point (in terms of the relevant quantities) at which they would be indifferent between two options.

The distribution of the viewpoints was 65% ‘equal right’, 23% ‘limits to healthcare’, and 7% ‘effective and efficient’. 6% couldn’t be matched to one of the three viewpoints. In each scenario, people had the option to opt out of trading. 24% of respondents were non-traders for all scenarios and, of these, 78% were of the ‘equal right’ viewpoint. Unfortunately, a lot of people opted out of at least one of the trades, and for a wide variety of reasons. Decision makers can’t opt out, so I’m not sure how useful this is.

The authors describe many associations between individual characteristics, viewpoints, and WTT results. Overall, the tested hypotheses were broadly supported. While the findings showed that different groups were more or less willing to trade, the points of indifference for traders within the groups did not vary. So while you can’t please everyone in health care priority setting, this study shows how policies might be designed to satisfy the preferences of people with different perspectives.
