Don Husereau’s journal round-up for 25th November 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Development and validation of the TRansparent Uncertainty ASsessmenT (TRUST) tool for assessing uncertainties in health economic decision models. PharmacoEconomics [PubMed] Published 11th November 2019

You’re going to quickly see that all three papers in today’s round-up align with some strong personal pet peeves that I harbour toward the nebulous world of market access and health technology assessment – most prominent is how loose we seem to be with language and form without overarching standards. This may be of no surprise to some when discussing a field which lacks a standard definition and for which many international standards of what constitutes good practice have never been defined.

This first paper deals with both issues and provides a useful tool for characterizing uncertainty. The authors state the purpose of the tool is “for systematically identifying, assessing, and reporting uncertainty in health economic models.” They suggest, to the best of their knowledge, no such tool exists. They also support the need for the tool by asserting that uncertainty in health economic modelling is often not fully characterized. The reasons, they suggest, are twofold: (1) there has been too much emphasis on imprecision; and (2) it is difficult to express all uncertainty.

I couldn’t agree more. I often think that those planning and conducting economic evaluations obsess too much about uncertainty that is less relevant (but more amenable to statistical adjustment) and don’t address the uncertainty that payers actually care about. To wit, while it may be important to explore and adopt methods that deal with imprecision (dealing with known unknowns), such as improving utility variance estimates (from an SE of 0.003 to 0.011, yes sorry Kelvin and Feng for the callout), not getting this right is unlikely to lead to truly bad decisions. (Kelvin and Feng both know this.)

What is much more important for decision makers is uncertainty that stems from a lack of knowledge. These are unknown unknowns. In my experience this typically has to do with generalizability (how well will it work in different patients or against a different comparator?) and durability (how do I translate 16 weeks of data into a lifetime?); not things resolved by better variance estimates and probabilistic analysis. In Canada, our HTA body has even gone so far as to respond to the egregious act of not providing different parametric forms for extrapolation with the equally egregious act of using unrealistic time horizon adjustments to deal with this. Two wrongs don’t make a right.

To develop the tool, the authors first conducted a (presumably narrative) review of uncertainty frameworks and then ran the identified concepts across a bunch of HTA expert committee types. They also used a previously developed framework as a basis for identifying all the places where uncertainty in HTA could occur. Using the concepts and the HTA areas, they developed a tool that was presented a few times and then validated through semi-structured interviews with different international stakeholders (N = 11), which also yielded insights into barriers to its use, user-friendliness, and feasibility.

Once the tool was developed, six case studies were worked up, with one of them (pembrolizumab for Hodgkin’s lymphoma) illustrated in the manuscript. While the tool does not provide a score or coefficient to adjust estimates or deal with uncertainty, it is not supposed to. What it is trying to do is make sure you are aware of all the uncertainties so that you can make some determination as to whether they have been dealt with. One of the challenges of developing the tool is the lack of standardized terminology regarding uncertainty itself. While a short primer exists in the manuscript, for those who have looked into it, uncertainty terminology is far more uncertain than even the authors let on.

While I appreciate the tool and the attempt to standardize things, I do suspect the approach could have been strengthened (a systematic review and possibly a nominal group technique as is done for reporting guidelines). However, I’m not sure this would have gotten us much closer to the truth. Uncertainty needs to be sorted first and I am happy at their attempt. I hope it raises some awareness of how we can’t simply say we are “uncertain” as if that means something.

Unmet medical need: an introduction to definitions and stakeholder perceptions. Value in Health [PubMed] Published November 2019

The second, and also often-abused, term without an obvious definition is unmet medical need (UMN). My theory is that some confusion has arisen due to a confluence of marketing and clinical development teams and regulators. UMN has come to mean patients with rare diseases, drugs with ‘novel’ mechanisms of action, patients with highly prevalent disease, drugs with a more convenient formulation, or drugs with fewer side effects. And yet payers (in my experience) usually recognize none of these. Payers tend to characterize UMN in different ways: no drugs available to treat the condition, available drugs do not provide consistent or durable responses, and there have been no new medical developments in the area for > 10 years.

The purpose of this research then was to unpack the term UMN further. The authors conducted a comprehensive (gray) literature review to identify definitions of UMN in use by different stakeholders and then unpacked their meaning through consultations and multi-stakeholder discussions across Europe, trying to focus on the key elements of unmet medical need with a regulatory and reimbursement lens. This consisted of six one-hour teleconference calls and two workshops held in 2018. One open workshop involved 69 people from regulatory agencies, industry, payers, HTA bodies, patient organizations, healthcare, and academia.

A key finding of this work was that, yes indeed, UMN means different things to different people. A key dimension is whether unmet need is being defined in terms of individuals or populations. Population size (whether prevalent or rare) was not felt to be an element of the definition, while there was general consensus that disease severity was. This means UMN should really only consider the UMNs of individual patients, not whether very few or very many patients are in need. It also means we see people who have higher rates of premature mortality and severe morbidity as having more of an unmet need, regardless of how many people are affected by the condition.

And last but not least was the final dimension of how many treatments are actually available. This, the authors point out, is the current legal definition in Europe (as laid down in Article 4, paragraph 2 of Commission Regulation [EC] No. 507/2006). And while this seems the most obvious definition of ‘need’ (we usually need things that are lacking) there was some acknowledgement by stakeholders that simply counting existing therapies is not adequate. There was also acknowledgement that there may be existing therapies available and still an UMN. Certainly this reflects my experience on the pan-Canadian Oncology Drug Review expert review committee, where unmet medical need was an explicit subdomain in their value framework, and where on more than one occasion it was felt, to my surprise, there was an unmet need despite the availability of two or more treatments.

As with the previous paper, the authors did not conduct a systematic review and could have consulted more broadly (no clinician stakeholders were consulted) or used more objective methods. This is a limitation they acknowledge, but addressing it would be unlikely to get them much further ahead in understanding. So what to do with this information? Well, the authors do propose an HTA approach that would triage reimbursement decisions based on UMN. However, stakeholders commented that the method you use really depends on the HTA context. As such, the authors conclude that “the application of the definition within a broader framework depends on the scope of the stakeholder.” In other words, HTA must be fit for purpose (something we knew already). However, as with uncertainty, I’m happy someone is actually trying to create reasonable, coherent definitions of such an important concept.

On value frameworks and opportunity costs in health technology assessment. International Journal of Technology Assessment in Health Care [PubMed] Published 18th September 2019

The final, and most-abused, term is that of ‘value’. While value seems an obvious prerequisite to those making investments in healthcare, and while we (some of us) are willing to acknowledge that value is what we are willing to give up to get something, what is less clear is what we want to get and what we want to give up.

The author of this paper, then, hopes to remind us of the various schools of thought on defining value in health that speak to these trade-offs. The first is broadly consistent with the welfarist school of economics and proposes that the value of health care used by decision makers should reflect individuals’ willingness to pay for it. An alternative approach, sometimes referred to as the extra-welfarist framework, argues that the value of a health technology should be consistent with the policy objectives of the health care system, typically health (the author states it is ‘health’ but I’m not sure it has to be). The final school of thought (which I was not familiar with, and neither might you be, which is the point of the paper) is what he terms ‘classical’, where the point is not to maximize a maximand or be held up to notions of efficiency but rather to discuss how consumers will be affected. The reference cited to support this framework is this interesting piece, although I couldn’t find any allusion to the framework within.

What follows is a relatively fair treatment of extra-welfarist and welfarist applications to decision-making with a larger critical swipe at the former (using legitimate arguments that have been previously published – yes, extra-welfarists assume resources are divisible and, yes, extra-welfarists don’t identify the health-producing resources that will actually be displaced and, yes, using thresholds doesn’t always maximize health) and much downplay of the latter (how we might measure trade-offs reliably under a welfarist framework appears to be a mere detail until this concession is finally mentioned: “On account of the measurement issues surrounding [willingness to pay], there may be many situations in which no valid and reliable methods of operationalizing [welfarist economic value frameworks] exist.”) Given the premise of this commentary is that a recent commentary by Culyer seemed to overlook concepts of value beyond extra-welfarist ones, the swipe at extra-welfarist views is understandable. Hence, this paper can be seen as a kind of rebuttal and reminder that other views should not be ignored.

I like the central premise of the paper as summarized here:

“Although the concise term “value for money” may be much easier to sell to HTA decision makers than, for example, “estimated mean valuation of estimated change in mean health status divided by the estimated change in mean health-care costs,” the former loses too much in precision; it seems much less honest. Because loose language could result in dire consequences of economic evaluation being oversold to the HTA community, it should be avoided at all costs”

However, while I am really sympathetic to warning against conceptual shortcuts and loose language, I wonder if this paper misses the bigger point. Firstly, I’m not convinced we are making such bad decisions as those who wish the lambda to be silenced tend to want us to believe. But more importantly, while it is easy to be critical about economics applied loosely or misapplied, this paper (like others) offers no real practical solutions other than the need to acknowledge other frameworks. It is silent on the real reason extra-welfarist approaches and thresholds seem to have stuck around, namely, they have provided a practical and meaningful way forward for difficult decision-making and the HTA processes that support them. They make sense to decision-makers who are willing to overlook some of the conceptual wrinkles. And I’m a firm believer that conceptual models are a starting point for pragmatism. We shouldn’t be slaves to them.


Are QALYs #ableist?

As many of us who have had to review submitted journal articles, thesis defenses, grant applications, white papers, and even published literature know, providing feedback on something that is poorly conceived is much harder than providing feedback on something well done.

This is going to be hard.

Who is ValueOurHealth?

The video above comes from the website of “ValueOurHealth.org”; I would tell you more about them, but there is no “About Us” menu item on the website. However, the website indicates that they are a group of patient organizations concerned about:

“The use of flawed, discriminatory value assessments [that] could threaten access to care for patients with chronic illnesses and people with disabilities.”

In particular, they take issue with value assessments that

“place a value on the life of a human based on their health status and assume every patient will respond the same way to treatments.”

QALYs, according to these concerned patient groups, assign a value to human beings. People with lower values (like Jessica, in the video above), then, will be denied coverage because their life is “valued less than someone in perfect health” which means “less value is also placed on treating” them. (Many will be quick to notice that health states and QALYs are used interchangeably here. I try to explain why below.)

It’s not like this is a well-intended rogue group who simply misunderstands the concept of a QALY, requires someone to send them a polite email, and then we can all move on. Other groups have also asserted that QALYs unfairly discriminate against the aged and disabled, and include AimedAlliance, Alliance for Patient Access, Institute for Patient Access, Alliance for Aging Research, and Global Liver Institute. There are likely many more patient groups that abhor QALYs (and definite articles/determiners, it seems) out there, and are justifiably concerned about patient access to therapy. But these are all the ones I could find through a quick search and sitting from my perch in Canada.

Why do they hate QALYs?

One can infer pretty quickly that ValueOurHealth and their illustrative message are largely motivated by another very active organization, the “Partnership to Improve Patient Care” (PIPC). The video, and the arguments about “assigning QALYs” to people, seem to stem from a white paper produced by the PIPC, which in turn cites a very nicely written paper by Franco Sassi (of Imperial College London) that explains QALY and DALY calculations for researchers and policymakers.

The PIPC white paper, in fact, uses the very same calculation provided by Prof. Sassi to illustrate the impact of preventing a case of tuberculosis. However, unlike Prof. Sassi’s illustrative example, the PIPC fails to quantify the QALYs gained by the intervention. Instead, they simply focus on the QALYs an individual who has tuberculosis for 6 months will experience (0.36, versus 0.50, for those keeping score). After some further discussion about problems with measuring health states, the PIPC white paper then skips ahead to ethical problems with QALYs central to their position, citing a Value in Health paper by Erik Nord and colleagues. One of the key problems with the QALY, according to the PIPC and as argued in the Nord paper, goes as follows:
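To make the distinction between QALYs experienced and QALYs gained concrete, here is a minimal sketch using only the two figures quoted above. It assumes, purely for illustration, that 0.50 corresponds to six months lived in full health (a utility weight of 1.0) and that no discounting is applied; the implied utility weight is my own back-calculation, not a number taken from the Sassi paper or the PIPC white paper, and the function name `qalys` is hypothetical.

```python
def qalys(duration_years: float, utility_weight: float) -> float:
    """Undiscounted QALYs: time in a health state times its utility weight."""
    return duration_years * utility_weight


# Figures quoted in the text: 0.36 QALYs with tuberculosis versus 0.50 without.
QALYS_WITH_TB = 0.36
QALYS_WITHOUT_TB = 0.50

# Assumption: both figures cover the same six-month period, and the 0.50
# reflects full health (weight 1.0) with no discounting.
DURATION_YEARS = 0.5

# Hypothetical weight back-calculated from the quoted figures (0.36 / 0.5).
implied_weight = QALYS_WITH_TB / DURATION_YEARS

# What the intervention buys is the difference between the two scenarios,
# not the number of QALYs the sick person happens to experience.
qalys_gained = qalys(DURATION_YEARS, 1.0) - qalys(DURATION_YEARS, implied_weight)

print(f"Implied utility weight: {implied_weight:.2f}")
print(f"QALYs gained by preventing the case: {qalys_gained:.2f}")
```

Under these assumptions the quantity relevant to a payer is the difference between the two scenarios (about 0.14 QALYs), which is the number the white paper leaves out.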

“Valuing health gains in terms of QALYs means that life-years gained in full health—through, for instance, prevention of fatal accidents in people in normal health—are counted as more valuable than life-years gained by those who are chronically ill or disabled—for instance, by averting fatal episodes in people with asthma, heart disease, or mental illness.”

It seems the PIPC assumes the lower number of QALYs experienced by those who are sick equates with the value of lives to payers. Even more interestingly, Prof. Nord’s analysis says nothing about costs. While those who are older have fewer QALYs to potentially gain, they also incur fewer costs. This is why, contrary to the assertion of preventing accidents in healthy people, preventive measures may offer a similar value to treatments when both QALYs and costs are considered.

It is also why an ICER review showed that alemtuzumab is good value in individuals requiring second-line treatment for relapsing-remitting multiple sclerosis (1.34 QALYs can be gained compared to the next best alternative and at a lower cost than comparators), while a policy of annual mammography screening of similarly aged (i.e., >40) healthy women is of poor economic value (0.036 QALYs can be gained compared to no screening at an additional cost of $5,500 for every woman). Mammography provides better value in older individuals. It is not unlike fracture prevention and a myriad of other interventions in healthy, asymptomatic people in this regard. Quite contrary to the assertion of these misinformed groups, many interventions represent increasingly better value in frail, disabled, and older patients. Relative risks create larger yields when baseline risks are high.
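For those who like to see the arithmetic, the sketch below works through the two comparisons above as incremental cost-effectiveness ratios (cost per QALY gained, not to be confused with the institute of the same acronym), and then shows why a fixed relative risk yields more when baseline risk is high. The mammography figures are the ones quoted in the text. The alemtuzumab cost difference is not given here beyond "lower cost", so the value used is a hypothetical placeholder, as are the relative risk of 0.8, the two baseline risks, and the function names.

```python
def icer(incremental_cost: float, incremental_qalys: float) -> float:
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    if incremental_qalys <= 0:
        raise ValueError("ratio is only computed here for a positive QALY gain")
    return incremental_cost / incremental_qalys


def describe(incremental_cost: float, incremental_qalys: float) -> str:
    """Label a comparison as dominant or report its cost per QALY gained."""
    if incremental_qalys > 0 and incremental_cost <= 0:
        return "dominant (more QALYs without added cost)"
    return f"${icer(incremental_cost, incremental_qalys):,.0f} per QALY gained"


def absolute_risk_reduction(baseline_risk: float, relative_risk: float) -> float:
    """Absolute benefit of an intervention with a fixed relative risk."""
    return baseline_risk * (1.0 - relative_risk)


# Figures quoted in the text: $5,500 more per woman for 0.036 QALYs gained.
mammography = describe(5_500, 0.036)

# The text says only "at a lower cost"; -1.0 is a hypothetical placeholder
# standing in for any negative cost difference.
alemtuzumab = describe(-1.0, 1.34)

# Hypothetical numbers: the same relative risk applied to a low-risk and a
# high-risk population.
HYPOTHETICAL_RELATIVE_RISK = 0.8
low_risk_benefit = absolute_risk_reduction(0.01, HYPOTHETICAL_RELATIVE_RISK)
high_risk_benefit = absolute_risk_reduction(0.20, HYPOTHETICAL_RELATIVE_RISK)

print("Mammography:", mammography)  # $152,778 per QALY gained
print("Alemtuzumab:", alemtuzumab)
print(f"Absolute risk reduction: {low_risk_benefit:.3f} vs {high_risk_benefit:.3f}")
```

With the same relative effect, the higher-risk group gets twenty times the absolute benefit in this made-up example, which is the mechanism behind interventions looking better value in frail and older patients.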

None of this is to say that QALYs (and incremental cost-effectiveness ratios) do not have problems. And the PIPC, at the very least, should be commended for trying to advance alternative metrics, something that very few critics have offered. Instead, the PIPC and like-minded organizations are likely trapped in a filter bubble. They know there are problems with QALYs, and they see expensive and rare disease treatments being valued harshly. So, ergo, blame the QALY. (Note to PIPC: it is because the drugs are expensive, relative to other life-saving things, not because of your concerns about the QALY.) They then see that others feel the same way, which means their concerns are likely justified. A critique of QALYs issued by the Pioneer Institute identifies many of these same arguments. One Twitterer, a disabled Massachusetts lawyer “alive because of Medicaid” has offered further instruction for the QALY-naive.

What to do about it?

As a friend recently told me, not everyone is concerned with the QALY. Some don’t like what they see as a rationing approach promoted by the Institute for Clinical and Economic Review (ICER) assessments. Some hate the QALY. Some hate both. Last year, Joshua T. Cohen, Dan Ollendorf, and Peter Neumann published their own blog entry on the effervescing criticism of ICER, even allowing the PIPC head to have a say about QALYs. They then tried to set the record straight with these thoughts:

While we applaud the call for novel measures and to work with patient and disability advocates to understand attributes important to them, there are three problems with PIPC’s position.

First, simply coming up with that list of key attributes does not address how society should allocate finite resources, or how to price a drug given individual or group preferences.

Second, the diminished weight QALYs assign to life with disability does not represent discrimination. Instead, diminished weight represents recognition that treatments mitigating disability confer value by restoring quality of life to levels typical among most of the population.

Finally, all value measures that inform allocation of finite resources trade off benefits important to some patients against benefits potentially important to others. PIPC itself notes that life years not weighted for disability (e.g., the equal value life-year gained, or evLYG, introduced by ICER for sensitivity analysis purposes) do not award value for improved quality of life. Indeed, any measure that does not “discriminate” against patients with disability cannot award treatments credit for improving their quality of life. Failing to award that credit would adversely affect this population by ruling out spending on such improvements.

Certainly a lot more can be said here.

But for now, I am more curious what others have to say…