Are QALYs #ableist?

As many of us who have had to review submitted journal articles, thesis defenses, grant applications, white papers, and even published literature know, providing feedback on something that is poorly conceived is much harder than providing feedback on something well done.

This is going to be hard.

Who is ValueOurHealth?

The video above comes from the website of “ValueOurHealth.org”; I would tell you more about them, but there is no “About Us” menu item on the website. However, the website indicates that they are a group of patient organizations concerned about:

“The use of flawed, discriminatory value assessments [that] could threaten access to care for patients with chronic illnesses and people with disabilities.”

In particular, they take issue with value assessments that

“place a value on the life of a human based on their health status and assume every patient will respond the same way to treatments.”

QALYs, according to these concerned patient groups, assign a value to human beings. People with lower values (like Jessica, in the video above), then, will be denied coverage because their life is “valued less than someone in perfect health” which means “less value is also placed on treating” them. (Many will be quick to notice that health states and QALYs are used interchangeably here. I try to explain why below.)
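For readers new to the mechanics, it helps to separate the two concepts being conflated: a health-state utility is a weight describing a state of health, while a QALY multiplies that weight by time. A minimal sketch (the 0.7 weight is illustrative, not from any published tariff):

```python
def qalys(spells):
    """Sum QALYs over a list of (utility_weight, years) spells.

    A utility weight describes a health state, from 0 (dead) to
    1 (full health); QALYs multiply that weight by the time spent
    in the state, then sum across spells.
    """
    return sum(weight * years for weight, years in spells)

# Ten years in full health vs ten years in a chronic-illness state:
full_health = qalys([(1.0, 10)])      # 10 QALYs
chronic_illness = qalys([(0.7, 10)])  # ~7 QALYs
```

Note that the weight attaches to a health state, not to a person: the same individual accrues different QALYs depending on which states they pass through and for how long.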

Nor is this a well-intended rogue group that simply misunderstands the concept of a QALY, needing only a polite email before we can all move on. Other groups have also asserted that QALYs unfairly discriminate against the aged and disabled, including AimedAlliance, Alliance for Patient Access, Institute for Patient Access, Alliance for Aging Research, and Global Liver Institute. There are likely many more patient groups out there that abhor QALYs (and definite articles/determiners, it seems) and are justifiably concerned about patient access to therapy, but these are all the ones I could find through a quick search from my perch in Canada.

Why do they hate QALYs?

One can infer pretty quickly that ValueOurHealth and its illustrative message are largely motivated by another very active organization, the “Partnership to Improve Patient Care” (PIPC). The video, and the arguments about “assigning QALYs” to people, seem to stem from a white paper produced by the PIPC, which in turn cites a very nicely written paper by Franco Sassi (of Imperial College London) that explains QALY and DALY calculations for researchers and policymakers.

The PIPC white paper, in fact, uses the very same calculation provided by Prof. Sassi to illustrate the impact of preventing a case of tuberculosis. However, unlike Prof. Sassi’s illustrative example, the PIPC fails to quantify the QALYs gained by the intervention. Instead, they simply focus on the QALYs that an individual who has tuberculosis for 6 months will experience (0.36, versus 0.50 in full health, for those keeping score). After some further discussion about problems with measuring health states, the PIPC white paper then skips ahead to the ethical problems with QALYs central to its position, citing a Value in Health paper by Erik Nord and colleagues. One of the key problems with the QALY, according to the PIPC and as argued in the Nord paper, goes as follows:
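Working backwards from the figures quoted, six months at a tuberculosis utility weight of 0.72 gives 0.36 QALYs, versus 0.50 for the same period in full health (the 0.72 is my inference from 0.36 / 0.50, not a value checked against Prof. Sassi's paper). The quantity the PIPC leaves out is the difference:

```python
years = 0.5            # six months with tuberculosis
u_full = 1.0           # full health
u_tb = 0.72            # implied by 0.36 / 0.50; an inference, not a tariff value

qalys_full = u_full * years            # 0.50 QALYs over the six months
qalys_tb = u_tb * years                # 0.36 QALYs over the six months
qalys_gained = qalys_full - qalys_tb   # ~0.14 QALYs gained by preventing the case
```

That final line is the point of the exercise: the QALYs gained by the intervention, which is what drives the economic evaluation.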

“Valuing health gains in terms of QALYs means that life-years gained in full health—through, for instance, prevention of fatal accidents in people in normal health—are counted as more valuable than life-years gained by those who are chronically ill or disabled—for instance, by averting fatal episodes in people with asthma, heart disease, or mental illness.”

It seems the PIPC assumes that the lower number of QALYs experienced by those who are sick equates to the value payers place on their lives. Even more interestingly, Prof. Nord’s analysis says nothing about costs. While those who are older have fewer QALYs to potentially gain, they also incur fewer costs. This is why, contrary to the accident-prevention example above, preventive measures in healthy people may offer value similar to treatments for the sick once both QALYs and costs are considered.

It is also why an ICER review showed that alemtuzumab is good value in individuals requiring second-line treatment for relapsing-remitting multiple sclerosis (1.34 QALYs can be gained compared to the next best alternative, and at a lower cost than comparators), while a policy of annual mammography screening of similarly aged (i.e., >40) healthy women is of poor economic value (0.036 QALYs can be gained compared to no screening, at an additional cost of $5,500 for every woman). Mammography provides better value in older individuals. It is not unlike fracture prevention and a myriad of other interventions in healthy, asymptomatic people in this regard. Quite contrary to the assertion of these misinformed groups, many interventions represent increasingly better value in frail, disabled, and older patients: the same relative risk reduction yields larger absolute gains when baseline risk is high.
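The contrast can be made concrete with the incremental cost-effectiveness ratio, ICER = Δcost / ΔQALYs, using the figures quoted above:

```python
def icer(delta_cost, delta_qalys):
    """Incremental cost-effectiveness ratio: extra cost per QALY gained."""
    return delta_cost / delta_qalys

# Annual mammography vs no screening, using the figures quoted above:
mammography = icer(5500, 0.036)   # roughly $153,000 per QALY gained

# Alemtuzumab vs the next-best alternative gains 1.34 QALYs at *lower*
# cost, so it dominates: better on both axes, and no ICER is needed.
```

Screening the healthy population buys each QALY at a very high price; treating the sicker population buys more QALYs and saves money. Costs, not utility weights alone, drive the verdict.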

None of this is to say that QALYs (and incremental cost-effectiveness ratios) do not have problems. And the PIPC, at the very least, should be commended for trying to advance alternative metrics, something very few critics have offered. But the PIPC and like-minded organizations are likely trapped in a filter bubble. They know there are problems with QALYs, and they see expensive and rare disease treatments being valued harshly. Ergo, blame the QALY. (Note to PIPC: it is because the drugs are expensive relative to other life-saving things, not because of your concerns about the QALY.) They then see that others feel the same way, which suggests to them that their concerns are justified. A critique of QALYs issued by the Pioneer Institute makes many of these same arguments. One Twitterer, a disabled Massachusetts lawyer “alive because of Medicaid”, has offered further instruction for the QALY-naive.

What to do about it?

As a friend recently told me, not every critic is concerned with the QALY itself. Some don’t like what they see as a rationing approach promoted by Institute for Clinical and Economic Review (ICER) assessments. Some hate the QALY. Some hate both. Last year, Joshua T. Cohen, Dan Ollendorf, and Peter Neumann published their own blog entry on the effervescing criticism of ICER, even allowing the PIPC head to have a say about QALYs. They then tried to set the record straight with these thoughts:

While we applaud the call for novel measures and to work with patient and disability advocates to understand attributes important to them, there are three problems with PIPC’s position.

First, simply coming up with that list of key attributes does not address how society should allocate finite resources, or how to price a drug given individual or group preferences.

Second, the diminished weight QALYs assign to life with disability does not represent discrimination. Instead, diminished weight represents recognition that treatments mitigating disability confer value by restoring quality of life to levels typical among most of the population.

Finally, all value measures that inform allocation of finite resources trade off benefits important to some patients against benefits potentially important to others. PIPC itself notes that life years not weighted for disability (e.g., the equal value life-year gained, or evLYG, introduced by ICER for sensitivity analysis purposes) do not award value for improved quality of life. Indeed, any measure that does not “discriminate” against patients with disability cannot award treatments credit for improving their quality of life. Failing to award that credit would adversely affect this population by ruling out spending on such improvements.
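The evLYG point in that final paragraph can be illustrated with a toy comparison (my own sketch of the logic, not ICER's implementation): a treatment that improves quality of life without extending it earns credit under the QALY but scores zero under an unweighted life-year measure.

```python
def qaly_gain(u_treated, u_untreated, years):
    """QALY gain from a quality-of-life improvement sustained over `years`."""
    return (u_treated - u_untreated) * years

def evly_gain(years_treated, years_untreated):
    """Equal value life-years gained: counts only added life-years,
    each at full weight, regardless of the quality of those years."""
    return years_treated - years_untreated

# A treatment raises utility from 0.5 to 0.8 for 10 years but does not
# extend survival (utility values are illustrative):
quality_credit = qaly_gain(0.8, 0.5, 10)   # ~3 QALYs: the quality gain counts
life_year_credit = evly_gain(10, 10)       # 0: no credit for the improvement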

Certainly a lot more can be said here.

But for now, I am more curious what others have to say…

Analysing Patient-Level Data using HES Workshop

This intensive workshop introduces participants to HES (Hospital Episode Statistics) data and how to handle and manipulate these very large patient-level data sets using computer software. Understanding and interpreting the data is a key first step for using these data in economic evaluation or evaluating health care policy and practice. Participants will engage in lectures and problem-solving exercises, analysing the information in highly interactive sessions. Data manipulation and statistical analysis will be taught and demonstrated using Stata.

This workshop is offered to people in the academic, public and commercial sectors. It is useful for analysts who wish to harness the power of HES non-randomised episode-level patient data to shed further light on such things as patient costs and pathways, re-admissions and outcomes, and provider performance. The workshop is suitable for individuals working in NHS hospitals, commissioning organisations, NHS England, Monitor, the Department of Health and Social Care, pharmaceutical companies, or consultancy companies, and for health care researchers and PhD students. Overseas participants may find the tuition helpful for their own country, but note that the course is heavily oriented towards understanding HES data for England.

The workshop fee is £900 for the public sector and £1,400 for the commercial sector. This includes all tuition, course materials, lunches, the welcome and drinks reception, the workshop dinner and refreshments, but does not include accommodation.

Online registration is now open; further information and registration is at: https://www.york.ac.uk/che/courses/patient-data/

Subsidised places are available for full-time PhD students. If this is applicable to you, please email the workshop administrators and request an Application Form.

Contact: Gillian or Louise, Workshop Administrators, at: che-apd@york.ac.uk;  tel: +44 (0)1904 321436

Chris Sampson’s journal round-up for 5th March 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Healthy working days: the (positive) effect of work effort on occupational health from a human capital approach. Social Science & Medicine Published 28th February 2018

If you look at the literature on the determinants of subjective well-being (or happiness), you’ll see that unemployment is often cited as having a big negative impact. The same sometimes applies for its impact on health, but here – of course – the causality is difficult to tease apart. Then, in research that digs deeper, looking at hours worked and different types of jobs, we see less conclusive results. In this paper, the authors start by asserting that the standard approach in labour economics (on which I’m not qualified to comment) is to assume that there is a negative association between work effort and health. This study extends the framework by allowing for positive effects of work that are related to individuals’ characteristics and working conditions, and where health is determined in a Grossman-style model of health capital that accounts for work effort in the rate of health depreciation. This model is used to examine health as a function of work effort (as indicated by hours worked) in a single wave of the European Working Conditions Survey (EWCS) from 2010 for 15 EU member states. Key items from the EWCS included in this study are questions such as “does your work affect your health or not?”, “how is your health in general?”, and “how many hours do you usually work per week?”. Working conditions are taken into account by looking at data on shift working and the need to wear protective equipment. One of the main findings of the study is that – with good working conditions – greater work effort can improve health. The Marxist in me is not very satisfied with this. We need to ask the question, compared to what? Working fewer hours? For most people, that simply isn’t an option. Aren’t the people who work fewer hours the people who can afford to work fewer hours? No attention is given to the sociological aspects of employment, which are clearly important. The study also shows that overworking or having poorer working conditions reduces health. 
We also see that, for many groups, longer hours do not negatively impact on health until we reach around 120 hours a week. This fails a good sense check. Who are these people?! I’d be very interested to see if these findings hold for academics. That the key variables are self-reported undermines the conclusions somewhat, as we can expect people to adjust their expectations about work effort and health in accordance with their colleagues. It would be very difficult to avoid a type 2 error (with respect to the negative impact of effort on health) using these variables to represent health and the role of work effort.

Agreement between retrospectively and contemporaneously collected patient-reported outcome measures (PROMs) in hip and knee replacement patients. Quality of Life Research [PubMed] Published 26th February 2018

The use of patient-reported outcome measures (PROMs) in elective care in the NHS has been a boon for researchers in our field, providing before-and-after measurement of health-related quality of life so that we can look at the impact of these interventions. But we can’t do this in emergency care because the ‘before’ is never observed – people only show up when they’re in the middle of the emergency. But what if people could accurately recall their pre-emergency health state? There’s some evidence to suggest that people can, so long as the recall period is short. This study looks at NHS PROMs data (n=443), with generic and condition-specific outcomes collected from patients having hip or knee replacements. Patients included in the study were additionally asked to recall their health state 4 weeks prior to surgery. The authors assess the extent to which the contemporary PROM measurements agree with the retrospective measurements, and the extent to which any disagreement relates to age, socioeconomic status, or the length of time to recall. There wasn’t much difference between contemporary and retrospective measurements, though patients reported slightly lower health on the retrospective questionnaires. And there weren’t any compelling differences associated with age, socioeconomic status, or the length of recall. These findings are promising, suggesting that we might be able to rely on retrospective PROMs. But the elective surgery context is very different to the emergency context, and I don’t think we can expect the two types of health care to impact recollection in the same way. In this study, responses may also have been influenced by participants’ memories of completing the contemporary questionnaire, and the recall period was very short. But the only way to find out more about the validity of retrospective PROM collection is to do more of it, so hopefully we’ll see more studies asking this question.

Adaptation or recovery after health shocks? Evidence using subjective and objective health measures. Health Economics [PubMed] Published 26th February 2018

People’s expectations about their health can influence their behaviour and determine their future health, so it’s important that we understand people’s expectations and any ways in which they diverge from reality. This paper considers the effect of a health shock on people’s expectations about how long they will live. The authors focus on survival probability, measured objectively (i.e. what actually happens to these patients) and subjectively (i.e. what the patients expect), and the extent to which the latter corresponds to the former. The arguments presented are couched within the concept of hedonic adaptation. So the question is – if post-shock expectations return to pre-shock expectations after a period of time – whether this is because people are recovering from the disease or because they are moving their reference point. Data are drawn from the Health and Retirement Study. Subjective survival probability is scaled to whether individuals expect to survive for 2 years. Cancer, stroke, and myocardial infarction are the health shocks used. The analysis uses some lagged regression models, separate for each of the three diagnoses, with objective and subjective survival probability as the dependent variable. There’s a bit of a jumble of things going on in this paper, with discussions of adaptation, survival, self-assessed health, optimism, and health behaviours. So it’s a bit difficult to see the wood for the trees. But the authors find the effect they’re looking for. Objective survival probability is negatively affected by a health shock, as is subjective survival probability. But then subjective survival starts to return to pre-shock trends whereas objective survival does not. The authors use this finding to suggest that there is adaptation. I’m not sure about this interpretation. To me it seems as if subjective life expectancy is only weakly responsive to changes in objective life expectancy. 
The findings seem to have more to do with how people process information about their probability of survival than with how they adapt to a situation. So while this is an interesting study about how people process changes in survival probability, I’m not sure what it has to do with adaptation.

3L, 5L, what the L? A NICE conundrum. PharmacoEconomics [PubMed] Published 26th February 2018

In my last round-up, I said I was going to write a follow-up blog post to an editorial on the EQ-5D-5L. I didn’t get round to it, but that’s probably best as there has since been a flurry of other editorials and commentaries on the subject. Here’s one of them. This commentary considers the perspective of NICE in deciding whether to support the use of the EQ-5D-5L and its English value set. The authors point out the differences between the 3L and 5L, namely the descriptive systems and the value sets. Examples of the 5L descriptive system’s advantages are provided: a reduced ceiling effect, reduced clustering, better discriminative ability, and the benefits of doing away with the ‘confined to bed’ level of the mobility domain. Great! On to the value set. There are lots of differences here, with 3 main causes: the data, the preference elicitation methods, and the modelling methods. We can’t immediately determine whether these differences are improvements or not. The authors stress the point that any differences observed will be in large part due to quirks in the original 3L value set rather than in the 5L value set. Nevertheless, the commentary is broadly supportive of a cautionary approach to 5L adoption. I’m not. Time for that follow-up blog post.

Credits