Chris Sampson’s journal round-up for 28th October 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Spatial competition and quality: evidence from the English family doctor market. Journal of Health Economics [RePEc] Published 17th October 2019

Researchers will never stop asking questions about the role of competition in health care. There’s a substantial body of literature now suggesting that greater competition in the context of regulated prices may bring some quality benefits. But with weak indicators of quality and limited generalisability, it isn’t a closed case. One context in which evidence has been lacking is in health care beyond the hospital. In the NHS, an individual’s choice of GP practice is perhaps the context in which quality can be observed and choice most readily (and meaningfully) exercised. That’s where this study comes in. Aside from the horrible format of a ‘proper economics’ paper (where we start with spoilers and climax with robustness tests), it’s a good read.

The study relies on a measure of competition based on the number of rival GPs within a 2km radius. Number of GPs, that is, rather than number of practices. This is important, as the number of GPs per practice has been increasing. About 75% of a practice’s revenues are linked to the number of patients registered, wherein lies the incentive to compete with other practices for patients. And, in this context, research has shown that patient choice is responsive to indicators of quality. The study uses data for 2005-2012 from all GP practices in England, making it an impressive data set.
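
To make the competition measure concrete, here is a minimal sketch of how a count of rival GPs within 2km might be computed. It assumes straight-line (great-circle) distance between practice locations, which may not be how the authors measured distance, and the practices, coordinates, and GP counts are invented purely for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def rival_gps_within(practices, focal_id, radius_km=2.0):
    """Sum the GPs (not practices) at every other practice inside the radius."""
    focal = practices[focal_id]
    total = 0
    for pid, p in practices.items():
        if pid == focal_id:
            continue  # a practice does not compete with itself
        dist = haversine_km(focal["lat"], focal["lon"], p["lat"], p["lon"])
        if dist <= radius_km:
            total += p["gps"]
    return total


# Hypothetical practices: B is under 1km from A, C is roughly 3km away.
practices = {
    "A": {"lat": 53.9600, "lon": -1.0870, "gps": 4},
    "B": {"lat": 53.9650, "lon": -1.0800, "gps": 6},
    "C": {"lat": 53.9900, "lon": -1.0870, "gps": 3},
}

rivals_of_a = rival_gps_within(practices, "A")
```

Counting GPs rather than practices means that a single large neighbouring practice registers as more competition than a single small one.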

The measures of quality come from the Quality and Outcomes Framework (QOF) and the General Practice Patient Survey (GPPS) – the former providing indicators of clinical quality and the latter providing indicators of patient experience. A series of OLS regressions is run on the different outcome measures, with practice fixed effects and various characteristics of the population. The models show that all of the quality indicators are improved by greater competition, but the effect is very small. For example, an extra competing GP within a 2km radius results in a 0.035% increase in the percentage of the population for whom the QOF indicators have been achieved. The effects are a little stronger for the patient satisfaction indicators.
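
For readers less familiar with practice fixed effects, the sketch below shows the idea with a single regressor. It is not the authors' specification (which also includes population characteristics). The data are invented, and the 0.035 slope is simply plugged in as the 'true' effect so we can check that the estimator recovers it.

```python
def within_slope(x_by_unit, y_by_unit):
    """One-regressor fixed-effects (within) estimator.

    Demeans x and y inside each unit, then fits a slope on the demeaned
    values. Any time-invariant unit effect drops out in the demeaning.
    """
    num = 0.0
    den = 0.0
    for xs, ys in zip(x_by_unit, y_by_unit):
        x_bar = sum(xs) / len(xs)
        y_bar = sum(ys) / len(ys)
        for x, y in zip(xs, ys):
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    return num / den


# Toy panel: three hypothetical practices observed over three years.
rivals = [[5, 6, 7], [20, 21, 23], [10, 10, 12]]  # rival GPs within 2km
practice_effect = [90.0, 80.0, 85.0]              # unobserved baseline quality
TRUE_BETA = 0.035                                 # assumed effect per extra rival GP
quality = [[a + TRUE_BETA * x for x in xs]
           for a, xs in zip(practice_effect, rivals)]

# Fixed effects: compare each practice only with itself over time.
beta_fe = within_slope(rivals, quality)

# Pooled OLS: treat all observations as one unit, ignoring practice effects.
beta_pooled = within_slope([sum(rivals, [])], [sum(quality, [])])
```

In this toy panel the practices facing more rivals happen to have lower baseline quality, so the pooled slope comes out negative even though the within-practice effect is positive. That is the kind of confounding the fixed effects are there to remove.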

The paper reports a bunch of important robustness checks. For instance, the authors try to test whether practices select their locations based on the patient casemix, finding no evidence that they do. The authors even go so far as to test the impact of a policy change, which resulted in an exogenous increase in the number of GPs in some areas but not others. The main findings seem to have withstood all the tests. They also try out a lagged model, which gives similar results.

The findings from this study slot in comfortably with the existing body of research on the role of competition in the NHS. More competition might help to achieve quality improvement, but it hardly seems worthy of dedicating much effort or, importantly, much expense to the cause.

Worth living or worth dying? The views of the general public about allowing disabled children to die. Journal of Medical Ethics [PhilPapers] [PubMed] Published 15th October 2019

Recent years have seen a series of cases in the UK where (usually very young) children have been so unwell and with such a severe prognosis that someone (usually a physician) has judged that continued treatment is not warranted and that the child should be allowed to die. These cases have generated debate and outrage in the media. But what do people actually think?

This study recruited members of the public in the UK (n=130) to an online panel and asked about the decisions that participants would support. The survey had three parts. The first part set out six scenarios of hospitalised infants, which varied in terms of the infants’ physical and sensory abilities, cognitive capacity, level of suffering, and future prospects. Some of the cases approximated real cases that have received media coverage, and the participants were asked whether they thought that withdrawing treatment was justified in each case. In the second part of the survey, participants were asked about the factors that they believed were important in making such decisions. In the third part, participants answered a few questions about themselves and answered the Oxford Utilitarianism Scale.

The authors set up the concept of a ‘life not worth living’, based on the idea that net future well-being is ‘negative’, and supposing the individual’s own judgement were they able to provide it. In the first part of the survey, 88% indicated that life would be worse than death in at least one of the cases. In such cases, 65% thought that treatment withdrawal was ethically obligatory, while 33% thought that either decision was acceptable. Pain was considered the most important factor in making such decisions, followed by the presence of pleasure. Perhaps predictably for health economists familiar with the literature, about 42% of people thought that resources should be considered in the decision, while 40% thought they shouldn’t.

The paper includes an extensive discussion, with plenty of food for thought. In particular, it discusses the ways in which the findings might inform the debate between the ‘zero line view’, whereby treatment should be withdrawn at the point where life has no benefit, and the ‘threshold view’, which establishes a grey zone of ethical uncertainty, in which either decision is ethically acceptable. To some extent, the findings of this study support the need for a threshold approach. Ethical questions are rarely black and white.

How is the trade-off between adverse selection and discrimination risk affected by genetic testing? Theory and experiment. Journal of Health Economics [PubMed] [RePEc] Published 1st October 2019

A lot of people are worried about how knowledge of their genetic information could be used against them. The most obvious scenario is one in which insurers increase premiums – or deny coverage altogether – on the basis of genetic risk factors. There are two key regulatory options in this context – disclosure duty, whereby individuals are obliged to tell insurers about the outcome of genetic tests, or consent law, whereby people can keep the findings to themselves. This study explores how people behave under each of these regulations.

The authors set up a theoretical model in which individuals can choose whether to purchase a genetic test that can identify them as being either high-risk or low-risk of developing some generic illness. The authors outline utility functions under disclosure duty and consent law. Under disclosure duty, individuals face a choice between the certainty of not knowing their risk and receiving pooled insurance premiums, or a lottery in which they have to disclose their level of risk and receive a higher or lower premium accordingly. Under consent law, individuals will only reveal their test results if they are at low risk, thus securing lower premiums and contributing to adverse selection. As a result, individuals will be more willing to take a test under consent law than under disclosure duty, all else equal.
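
The logic can be sketched with a toy expected utility calculation. Everything in it is an assumption made for illustration rather than the authors' model: the square-root utility function, the wealth and premium figures, and the simplification that the pooled premium is the same under both regimes (adverse selection would tend to push it up under consent law).

```python
import math


def utility(wealth):
    """Concave (risk-averse) utility; the functional form is an assumption."""
    return math.sqrt(wealth)


def value_of_testing(regime, wealth, test_price, share_high,
                     premium_low, premium_high, premium_pooled):
    """Expected utility of taking the test minus utility of staying untested."""
    untested = utility(wealth - premium_pooled)
    after_test = wealth - test_price
    if regime == "disclosure_duty":
        # A high-risk result must be disclosed, so the premium rises.
        high_branch = utility(after_test - premium_high)
    elif regime == "consent_law":
        # A high-risk result can be concealed, so the pooled premium still applies.
        high_branch = utility(after_test - premium_pooled)
    else:
        raise ValueError(f"unknown regime: {regime}")
    # A low-risk result is revealed under either regime.
    low_branch = utility(after_test - premium_low)
    tested = share_high * high_branch + (1 - share_high) * low_branch
    return tested - untested


# Hypothetical parameters.
params = dict(wealth=100, share_high=0.5,
              premium_low=10, premium_high=50, premium_pooled=30)

gain_disclosure = value_of_testing("disclosure_duty", test_price=2, **params)
gain_consent = value_of_testing("consent_law", test_price=2, **params)
gain_consent_pricey = value_of_testing("consent_law", test_price=30, **params)
```

With these numbers, testing is unattractive under disclosure duty (a risk-averse person dislikes the premium lottery) and attractive under consent law (a bad result can be hidden), until the test becomes expensive enough to wipe out the gain.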

After setting out their model (at great length), the authors go on to describe an experiment that they conducted with 67 economics students, to elicit preferences within and between the different regulatory settings. The experiment was set up in a very generic way, not related to health at all. Participants were presented with a series of tasks across which the parameters representing the price of the test and the pooled premium were varied. All of the authors’ hypotheses were supported by the experiment. More people took tests under consent law. Higher test prices reduce the number of people taking tests. If prices are high enough, people will prefer disclosure duty. The likelihood that people take tests under consent law is increasing with the level of adverse selection. And people are very sensitive to the level of discrimination risk under disclosure duty.

It’s an interesting study, but I’m not sure how much it can tell us about genetic testing. Framing the experiment as entirely unrelated to health seems especially unwise. People’s risk preferences may be very different in the domain of real health than in the hypothetical monetary domain. In the real world, there’s a lot more at stake.


Chris Sampson’s journal round-up for 23rd September 2019

Can you repeat that? Exploring the definition of a successful model replication in health economics. PharmacoEconomics [PubMed] Published 18th September 2019

People talk a lot about replication and its role in demonstrating the validity and reliability of analyses. But what does a successful replication in the context of cost-effectiveness modelling actually mean? Does it mean coming up with precisely the same estimates of incremental costs and effects? Does it mean coming up with a model that recommends the same decision? The authors of this study sought to bring us closer to an operational definition of replication success.

There is potentially much to learn from other disciplines that have a more established history of replication. The authors reviewed literature on the definition of ‘successful replication’ across all disciplines, and used their findings to construct a variety of candidate definitions for use in the context of cost-effectiveness modelling in health. Ten definitions of a successful replication were pulled out of the cross-disciplinary review, which could be grouped into ‘data driven’ replications and ‘experimental’ replications – the former relating to the replication of analyses and the latter relating to the replication of specific observed effects. The ten definitions were from economics, biostatistics, cognitive science, psychology, and experimental philosophy. The definitions varied greatly, with many involving subjective judgments about the proximity of findings. A few studies were found that reported on replications of cost-effectiveness models and which provided some judgment on the level of success. Again, these were inconsistent and subjective.

Quite reasonably, the authors judge that the lack of a fixed definition of successful replication in any scientific field is not just an oversight. The threshold for ‘success’ depends on the context of the replication and on how the evidence will be used. This paper provides six possible definitions of replication success for use in cost-effectiveness modelling, ranging from an identical replication of the results, through partial success in replicating specific pathways within a given margin of error, to simply replicating the same implied decision.

Ultimately, ‘data driven’ replications are a solution to a problem that shouldn’t exist, namely, poor reporting. This paper mostly convinced me that overall ‘success’ isn’t a useful thing to judge in the context of replicating decision models. Replication of certain aspects of a model is useful to evaluate. Whether the replication implied the same decision is a key thing to consider. Beyond this, it is probably worth considering partial success in replicating specific parts of a model.

Differential associations between interpersonal variables and quality-of-life in a sample of college students. Quality of Life Research [PubMed] Published 18th September 2019

There is growing interest in the well-being of students and the distinct challenges involved in achieving good mental health and addressing high levels of demand for services in this group. Students go through many changes that might influence their mental health; prominent among these is the change to their social situation.

This study set out to identify the role of key interpersonal variables on students’ quality of life. The study recruited 1,456 undergraduate students from four universities in the US. The WHOQOL measure was used for quality of life and a barrage of measures were used to collect information on loneliness, social connectedness, social support, emotional intelligence, intimacy, empathic concern, and more. Three sets of analyses of increasing sophistication were conducted, from zero-order correlations between each measure and the WHOQOL, to a network analysis using a Gaussian Graphical Model to identify both direct and indirect relationships while accounting for shared variance.
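
The difference between a zero-order correlation and the kind of edge a Gaussian Graphical Model reports can be shown with three variables, where the edge weight reduces to a first-order partial correlation. The correlations below are invented for illustration and are not results from the study.

```python
import math


def partial_corr(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing what each shares with z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical zero-order correlations.
r_support_qol = 0.40         # social support and quality of life
r_support_lonely = -0.60     # social support and loneliness
r_lonely_qol = -0.60         # loneliness and quality of life

# Association between support and quality of life, holding loneliness fixed.
direct_support_qol = partial_corr(r_support_qol, r_support_lonely, r_lonely_qol)
```

In this made-up case, a moderate raw correlation between social support and quality of life shrinks to almost nothing once loneliness is accounted for, which is the pattern you would expect if support mainly works through loneliness.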

In all analyses, loneliness stuck out as the strongest driver of quality of life. Social support, social connectedness, emotional intelligence, intimacy with one’s romantic partner, and empathic concern were also significantly associated with quality of life. But the impact of loneliness was greatest, with other interpersonal variables influencing quality of life through their impact on loneliness.

This is a well-researched and reported study. The findings are informative to student support and other services that seek to improve the well-being of students. There is reason to believe that such services should recognise the importance of interpersonal determinants of well-being and in particular address loneliness. But it’s important to remember that this study is only as good as the measures it uses. If you don’t think WHOQOL is adequately measuring student well-being, or you don’t think the UCLA Loneliness Scale tells us what we need to know, you might not want these findings to influence practice. And, of course, the findings may not be generalisable, as the extent to which different interpersonal variables affect quality of life is very likely dependent on the level of service provision, which varies greatly between different universities, let alone countries.

Affordability and non-perfectionism in moral action. Ethical Theory and Moral Practice [PhilPapers] Published 14th September 2019

The ‘cost-effective but unaffordable’ challenge has been bubbling for a while now, at least since sofosbuvir came on the scene. This study explores whether “we can’t afford it” is a justifiable position to take. The punchline is that, no, affordability is not a sound ethical basis on which to support or reject the provision of a health technology. I was extremely sceptical when I first read the claim. If we can’t afford it, it’s impossible, and how can there be a moral imperative in an impossibility? But the authors proceeded to convince me otherwise.

The authors don’t go into great detail on this point, but it all hinges on divisibility. The reason that a drug like sofosbuvir might be considered unaffordable is that loads of people would be eligible to receive it. If sofosbuvir was only provided to a subset of this population, it could be affordable. On this basis, the authors propose the ‘principle of non-perfectionism’. This states that not being able to do all the good we can do (e.g. provide everyone who needs it with sofosbuvir) is not a reason for not doing some of the good we can do. Thus, if we cannot support provision of a technology to everyone who could benefit from it, it does not follow (ethically) to provide it to nobody, but rather to provide it to some people. The basis for selecting people is not of consequence to this argument but could be based on a lottery, for example.

Building on this, the authors explain to us why denying a technology to everyone on these grounds is wrong, with the notion of ‘numerical discrimination’. They argue that it is not OK to prioritise one group over another simply because we can meet the needs of everyone within that group as opposed to only some members of the other group. This is exactly what’s happening when we are presented with notions of (un)affordability. If the population of people who could benefit from sofosbuvir was much smaller, there wouldn’t be an issue. But the simple fact that the group is large does not make it morally permissible to deny cost-effective treatment to any individual member within that group. You can’t discriminate against somebody because they are from a large population.

I think there are some tenuous definitions in the paper and some questionable analogies. Nevertheless, the authors succeeded in convincing me that total cost has no moral weight. It is irrelevant to moral reasoning. We should not refuse any health technology to an entire population on the grounds that it is ‘unaffordable’. The authors frame it as a ‘mistake in moral mathematics’. For this argument to apply in the HTA context, it relies wholly on the divisibility of health technologies. To some extent, NICE and their counterparts are in the business of defining models of provision, which might result in limited use criteria to get around the affordability issue. Though these issues are often handled by payers such as NHS England.

The authors of this paper don’t consider the implications for cost-effectiveness thresholds, but this is where my thoughts turned. Does the principle of non-perfectionism undermine the morality of differentiating cost-effectiveness thresholds according to budget impact? I think it probably does. Reducing the threshold because the budget impact is great will result in discrimination (‘numerical discrimination’) against individuals simply because they are part of a large population that could benefit from treatment. This seems to be the direction in which we’re moving. Maybe the efficiency cart is before the ethical horse.


Are QALYs #ableist?

As many of us who have had to review submitted journal articles, thesis defenses, grant applications, white papers, and even published literature know, providing feedback on something that is poorly conceived is much harder than providing feedback on something well done.

This is going to be hard.

Who is ValueOurHealth?

The video above comes from the website of “ValueOurHealth.org”; I would tell you more about them, but there is no “About Us” menu item on the website. However, the website indicates that they are a group of patient organizations concerned about:

“The use of flawed, discriminatory value assessments [that] could threaten access to care for patients with chronic illnesses and people with disabilities.”

In particular, they take issue with value assessments that

“place a value on the life of a human based on their health status and assume every patient will respond the same way to treatments.”

QALYs, according to these concerned patient groups, assign a value to human beings. People with lower values (like Jessica, in the video above), then, will be denied coverage because their life is “valued less than someone in perfect health” which means “less value is also placed on treating” them. (Many will be quick to notice that health states and QALYs are used interchangeably here. I try to explain why below.)

It’s not like this is a well-intended rogue group who simply misunderstands the concept of a QALY, requires someone to send them a polite email, and then we can all move on. Other groups have also asserted that QALYs unfairly discriminate against the aged and disabled, including AimedAlliance, Alliance for Patient Access, Institute for Patient Access, Alliance for Aging Research, and Global Liver Institute. There are likely many more patient groups out there that abhor QALYs (and definite articles/determiners, it seems) and are justifiably concerned about patient access to therapy. But these are all the ones I could find through a quick search from my perch in Canada.

Why do they hate QALYs?

One can infer pretty quickly that ValueOurHealth and their illustrative message are largely motivated by another very active organization, the “Partnership to Improve Patient Care” (PIPC). The video, and the arguments about “assigning QALYs” to people, seem to stem from a white paper produced by the PIPC, which in turn cites a very nicely written paper by Franco Sassi (of Imperial College London) that explains QALY and DALY calculations for researchers and policymakers.

The PIPC white paper, in fact, uses the very same calculation provided by Prof. Sassi to illustrate the impact of preventing a case of tuberculosis. However, unlike Prof. Sassi’s illustrative example, the PIPC fails to quantify the QALYs gained by the intervention. Instead they simply focus on the QALYs an individual who has tuberculosis for 6 months will experience. (0.36, versus 0.50, for those keeping score). After some further discussion about problems with measuring health states, the PIPC white paper then skips ahead to ethical problems with QALYs central to their position, citing a Value in Health paper by Erik Nord and colleagues. One of the key problems with the QALY according to the PIPC and argued in the Nord paper goes as follows:
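
For those keeping score at home, the arithmetic behind those two numbers can be sketched as follows. The utility weight of 0.72 is back-calculated from the quoted figures (0.36 QALYs over half a year) and is an assumption here, as is the absence of discounting.

```python
def qalys(years, utility_weight):
    """QALYs accrued over a period lived at a constant utility weight."""
    return years * utility_weight


DURATION_YEARS = 0.5   # six months
FULL_HEALTH = 1.0
TB_WEIGHT = 0.72       # implied by 0.36 / 0.5; an assumption, not quoted from the paper

with_tb = qalys(DURATION_YEARS, TB_WEIGHT)        # what the person with TB experiences
without_tb = qalys(DURATION_YEARS, FULL_HEALTH)   # the same six months in full health
gained_by_prevention = without_tb - with_tb       # what preventing the case is worth
```

The quantity that matters for judging an intervention is the last one, the difference of 0.14 QALYs, not the 0.36 that the person with tuberculosis experiences.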

“Valuing health gains in terms of QALYs means that life-years gained in full health—through, for instance, prevention of fatal accidents in people in normal health—are counted as more valuable than life-years gained by those who are chronically ill or disabled—for instance, by averting fatal episodes in people with asthma, heart disease, or mental illness.”

It seems the PIPC assume the lower number of QALYs experienced by those who are sick equates with the value of lives to payers. Even more interestingly, Prof. Nord’s analysis says nothing about costs. While those who are older have fewer QALYs to potentially gain, they also incur fewer costs. This is why, contrary to the assertion of preventing accidents in healthy people, preventive measures may offer a similar value to treatments when both QALYs and costs are considered.

It is also why an ICER review showed that alemtuzumab is good value in individuals requiring second-line treatment for relapsing-remitting multiple sclerosis (1.34 QALYs can be gained compared to the next best alternative and at a lower cost than comparators), while a policy of annual mammography screening of similarly aged (i.e., >40) healthy women is of poor economic value (0.036 QALYs can be gained compared to no screening at an additional cost of $5,500 for every woman). Mammography provides better value in older individuals. It is not unlike fracture prevention and a myriad of other interventions in healthy, asymptomatic people in this regard. Quite contrary to the assertion of these misinformed groups, many interventions represent increasingly better value in frail, disabled, and older patients. Relative risks create larger yields when baseline risks are high.
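
To see why these two examples land so far apart, here is a rough sketch of the incremental cost-effectiveness ratio (the ratio, not the Institute) using the figures quoted above. The size of the cost saving for alemtuzumab is not given in the text, so the -1.0 below is only a placeholder standing in for "lower cost".

```python
def icer(delta_cost, delta_qaly):
    """Incremental cost per QALY gained, for an option that gains QALYs."""
    if delta_qaly <= 0:
        raise ValueError("this sketch only handles a positive QALY gain")
    return delta_cost / delta_qaly


def is_dominant(delta_cost, delta_qaly):
    """True when an option is both cheaper and more effective than its comparator."""
    return delta_cost < 0 and delta_qaly > 0


# Annual mammography versus no screening: +$5,500 and +0.036 QALYs per woman.
mammography_icer = icer(5500, 0.036)

# Alemtuzumab versus the next best alternative: +1.34 QALYs at lower cost.
# The -1.0 is a placeholder; the actual saving is not stated above.
alemtuzumab_dominant = is_dominant(-1.0, 1.34)
```

On these figures, screening works out at roughly $153,000 per QALY gained, while alemtuzumab needs no ratio at all because it is better and cheaper. The patients with the lower baseline health are the ones for whom treatment looks like the better buy.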

None of this is to say that QALYs (and incremental cost-effectiveness ratios) do not have problems. And the PIPC, at the very least, should be commended for trying to advance alternative metrics, something that very few critics have offered. But the PIPC and like-minded organizations are likely trapped in a filter bubble. They know there are problems with QALYs, and they see expensive and rare disease treatments being valued harshly. So, ergo, blame the QALY. (Note to PIPC: it is because the drugs are expensive, relative to other life-saving things, not because of your concerns about the QALY.) They then see that others feel the same way, which means their concerns are likely justified. A critique of QALYs issued by the Pioneer Institute identifies many of these same arguments. One Twitterer, a disabled Massachusetts lawyer “alive because of Medicaid” has offered further instruction for the QALY-naive.

What to do about it?

As a friend recently told me, not everyone is concerned with the QALY. Some don’t like what they see as a rationing approach promoted by the Institute for Clinical and Economic Review (ICER) assessments. Some hate the QALY. Some hate both. Last year, Joshua T. Cohen, Dan Ollendorf, and Peter Neumann published their own blog entry on the effervescing criticism of ICER, even allowing the PIPC head to have a say about QALYs. They then tried to set the record straight with these thoughts:

While we applaud the call for novel measures and to work with patient and disability advocates to understand attributes important to them, there are three problems with PIPC’s position.

First, simply coming up with that list of key attributes does not address how society should allocate finite resources, or how to price a drug given individual or group preferences.

Second, the diminished weight QALYs assign to life with disability does not represent discrimination. Instead, diminished weight represents recognition that treatments mitigating disability confer value by restoring quality of life to levels typical among most of the population.

Finally, all value measures that inform allocation of finite resources trade off benefits important to some patients against benefits potentially important to others. PIPC itself notes that life years not weighted for disability (e.g., the equal value life-year gained, or evLYG, introduced by ICER for sensitivity analysis purposes) do not award value for improved quality of life. Indeed, any measure that does not “discriminate” against patients with disability cannot award treatments credit for improving their quality of life. Failing to award that credit would adversely affect this population by ruling out spending on such improvements.

Certainly a lot more can be said here.

But for now, I am more curious what others have to say…