Journal Club Briefing: Dolan and Kahneman (2008)

Today’s Journal Club Briefing comes from the Academic Unit of Health Economics at the University of Leeds. At their journal club on 2nd August 2017, they discussed Dolan and Kahneman’s 2008 article from The Economic Journal: ‘Interpretations of utility and their implications for the valuation of health’. If you’ve discussed an article at a recent journal club meeting at your own institution and would like to write a briefing for the blog, get in touch.

Why this paper?

Dolan and Kahneman (2008) was published nearly ten years ago, was written several years before that, and did not appear in a health-related journal. At first sight, then, it is a slightly curious choice for a health economics journal club. However, it raises issues that are at the heart of health economics practice – questions that have not yet been answered and do not look likely to be answered any time soon.

Summary

Experienced vs. decision utility

The article’s point of departure is the distinction between experienced utility and decision utility, often a source of fruitful research in behavioural economics. Experienced utility is utility in the Benthamite sense, meaning the hedonic experience in the current moment: the pleasure and/or pain felt by a person at any given point in time. Decision utility is utility as taught in undergraduate economics textbooks: an objective function which the individual dispassionately acts to maximise. In the neoclassical framework of said undergraduate textbooks, this is a distinction without a difference. The individual correctly forecasts the expected flow of experienced utility given the available information and her actions, forms a decision utility function from it and acts to maximise it.
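
As a rough sketch (our notation, not the authors’), the neoclassical identification of the two concepts can be written as

\[
U^{D}(a) \;=\; \mathbb{E}\left[\sum_{t=0}^{T} \delta^{t}\, u^{E}_{t}(a)\right],
\]

where \(u^{E}_{t}(a)\) is the experienced (hedonic) utility flow at time \(t\) given action \(a\), \(\delta\) is a discount factor, and \(U^{D}\) is the decision utility the individual maximises. The interesting cases arise when the forecast inside the expectation is systematically wrong.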

However, Thaler and Sunstein wouldn’t have sold as many books if things were so simple. Many systematic and significant divergences between experienced and decision utility have been well documented, and several people (including one of this paper’s authors) have won Nobel prizes for documenting them. The divergence this article focuses on is adaptation.

Adaptation

The authors summarise a large body of evidence showing that individuals suffer a substantial loss of utility after a traumatic event (e.g. the loss of a limb or loss of function), but that for many conditions they adapt to their new situation and recover much of that loss. After as little as a year, their valuation of their health is very similar to that of the general population. Furthermore, the authors précis various studies showing that individuals routinely and drastically underestimate the amount of adaptation that would occur should such a traumatic event befall them.

This improvement over time in the health-related utility experienced by people with many conditions is partly due to hedonic adaptation – the internal scale of pleasure/pain re-calibrates to their new situation – and partly due to behavioural change, such as finding new pastimes to replace those ruled out by their condition. While the causes of adaptation are fascinating, the focus here is not on the mechanisms behind it, but rather on the consequences for measuring utility and the implications for resource allocation.

Health valuation and adaptation

The methods health economists use to evaluate the utility of being in a given health state, such as time trade-off, standard gamble or discrete choice experiments, will tend to elicit decision utility. They are based on choices between hypothetical states and so will not capture the changes in experienced utility due to adaptation. Thus valuations of health states from the general public will tend to be lower than the valuations from people actually living in the health state.
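
To make the elicitation concrete, consider the textbook versions of two of these tasks (standard illustrations, not drawn from the paper). In a time trade-off, the respondent states how many years \(x\) in full health they would accept in place of \(t\) years in the hypothetical state \(h\); in a standard gamble, they state what probability \(p\) of full health (against a \(1-p\) risk of death) would make them indifferent to living in \(h\) for certain. The implied values are

\[
v_{\mathrm{TTO}}(h) = \frac{x}{t}, \qquad v_{\mathrm{SG}}(h) = p .
\]

Both answers rest on the respondent’s imagination of a state they have never experienced, so any adaptation they fail to anticipate never enters the valuation.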

At first glance, the consequences for resource allocation may not appear to be particularly severe. The bias may lead to more resources being devoted to healthcare as a whole (at least for life-improving treatments – life-extending treatments are a different case), but the overall healthcare budget is in practice largely a political decision. And, on the face of it, it should not distort the allocation between treatments for different conditions.

Yet adaptation is not a universal phenomenon. There are conditions for which little or no adaptation is seen (unexplained pain, for example), and where it does occur, it occurs at different speeds and to differing extents across conditions. The authors show that conditions with a greater initial utility loss (but more adaptation) receive lower valuations than conditions with a lesser initial loss but a lower degree of adaptation, and so attract more resources, even when the sum of experienced utility lost is the same for both. The authors argue that this is unfair, and that health economists should update their practices to better capture experienced utility.
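
A stylised numerical example (ours, not the authors’) illustrates the distortion. Suppose condition A drops utility to 0.5 in the first year but adapts to 0.9 thereafter, while condition B sits at a constant 0.82 with no adaptation. Over five undiscounted years the experienced-utility totals are identical:

\[
\underbrace{0.5 + 4 \times 0.9}_{\text{condition A}} \;=\; 4.1 \;=\; \underbrace{5 \times 0.82}_{\text{condition B}} .
\]

A hypothetical-state valuation anchored on the un-adapted first year would nonetheless score A at around 0.5 and B at around 0.82, steering more resources towards A.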

Public vs. patient preference

A common argument in favour of the status quo is that (in many countries at least) it is public resources which are being allocated, and thus it is public preferences which should be respected. It appears legitimate to allocate resources to assuage public fears of health states, even if those states are worse in the imagination than in reality. The authors consider this argument and reply that, in this case, the instruments of health economists are still not fit for purpose. Generic measures of health states, such as the EQ-5D, go out of their way to describe states in abstract terms and to separate them from causes, such as cancer, that carry an emotional charge. It cannot be argued that public valuations are justified because resources should be allocated according to public fears if the valuation exercise deliberately tries not to elicit those fears.

The argument that adaptation causes serious problems for valuing health and for allocation of health resources is a persuasive one. It is undoubtedly true that changes in utility over time, and other violations of the neoclassical economic paradigm such as reference dependence, do not presently receive sufficient attention in health economics and policy decisions in general.

Discussion

Which yardstick?

Despite the stimulating discussion and the overall brilliance of the paper, there are some elements which can be challenged. One of them is that, throughout, the authors’ arguments and recommendations are made from the standpoint that the sum over time of the flow of experienced utility from a health state should be the sole measure of value. In practice, this would be captured using what one of the authors calls the day reconstruction method (DRM), in which respondents rate a range of feelings including happiness, worry, and frustration.

Despite acknowledging some philosophical difficulties, the authors treat the sum of the flow of experienced utility as if it were the only true yardstick with which to measure health. This position is not convincingly justified, and there is no discussion of the qualitative nature of such measurement, as opposed to a truly cardinal measure of health that would allow individuals’ health states to be ranked.

Public vs. private preferences revisited

The authors raise the question of whether current practice can be justified by a desire to soothe public fears, and dismiss it on the grounds that the elicitation tools are not suitable. However, they do not address the question of whether allocating public resources according to the public’s (incorrect) fears of given diseases or health states could be a legitimate health policy aim. One could imagine, for example, a discrete choice experiment eliciting how much more the general public dreads cancer than other diseases, and an argument that the welfare of the public is improved by allocating resources based on its results. There are myriad problems with such an approach, of course, but there seem to be no fewer problems with the alternatives.
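
To make that thought experiment slightly more concrete, such a DCE might be analysed with a standard random-utility specification (our illustration, with hypothetical attribute names):

\[
U_{ij} = \beta_{c}\,\mathrm{cost}_{ij} + \beta_{r}\,\mathrm{risk}_{ij} + \gamma\,\mathrm{dread}_{j} + \varepsilon_{ij},
\]

where respondent \(i\) chooses between disease-prevention programmes \(j\), \(\mathrm{dread}_{j}\) captures how feared the disease is, and \(\varepsilon_{ij}\) is the usual extreme-value error. The marginal rate of substitution \(-\gamma/\beta_{c}\) would then put a monetary value on assuaging dread itself – the sort of quantity the authors’ discussion leaves unexamined.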

Intertemporal welfare

Intertemporal welfare judgements are notoriously difficult once one leaves the exponential discounting framework. It seems just as legitimate to base valuations on the ex post judgement of individuals who have fully adjusted to a health state as on an integration of past feelings, most of which are now distant memories. Most people would agree that the time to value their experience of a marathon is after completing it, not during the twenty-fifth mile or at the start line.
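
For reference, the exponential framework alluded to here (standard notation, not the paper’s) values a stream of experienced utilities as

\[
W = \sum_{t=0}^{T} \delta^{t} u_{t}, \qquad 0 < \delta \le 1 ,
\]

and its main attraction is dynamic consistency: the ranking of two streams does not change simply because time has passed. Once it is abandoned – for instance, in favour of retrospective, memory-based evaluation – there is no single privileged way to aggregate past and future feelings.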

Indeed, this appears to be the position tacitly taken elsewhere by Kahneman in his work on the peak-end rule. In Redelmeier et al. (2003), it was found that the retrospective rating of the pain of a colonoscopy was based almost exclusively on the peak intensity of pain and on the pain felt at the end. Thus procedures which were extended by an extra three minutes were remembered as less painful than standard procedures, even though the total pain experienced was greater. Furthermore, those who underwent the extended procedure were more likely to state they would undergo it again. It would seem strange, in this case, to judge them as worse off.
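
A common shorthand for the peak-end rule (a summary of the empirical regularity, not a formula taken from Redelmeier et al.) is that the remembered intensity of an episode is approximated by the average of its worst and final moments,

\[
\text{remembered pain} \;\approx\; \frac{\text{peak pain} + \text{end pain}}{2},
\]

with duration largely neglected. Appending a milder final segment lowers the end term, and with it the memory of the whole episode, even though the total amount of pain experienced increases.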

Schelling (1984) ends his superlative discussion of the problems of intertemporal decision making with the following thought experiment. Just as with valuing health, there are no easy answers.

[S]ome anesthetics block transmission of the nervous impulses that constitute pain; others have the characteristic that the patient responds to the pain as if feeling it fully but has utterly no recollection afterwards. One of these is sodium pentothal. In my imaginary experiment we wish to distinguish the effects of the drug from the effects of the unremembered pain, and we want a healthy control subject in parallel with some painful operations that will be performed with the help of this drug. For a handsome fee you will be knocked out for an hour or two, allowed to sleep it off, then tested before you go home. You do this regularly, and one afternoon you walk into the lab a little early and find the experimenters viewing some videotape. On the screen is an experimental subject writhing, and though the audio is turned down the shrieks are unmistakably those of a person in pain. When the pain stops the victim pleads, “Don’t ever do that again. Please.”

The person is you.

Do you care?

Do you walk into your booth, lie on the couch, and hold out your arm for today’s injection?

Should I let you?

Meeting round-up: EuroQol Plenary Meeting 2017

The 34th Plenary Meeting of the EuroQol Group took place in Barcelona on 21st and 22nd September 2017. The local hosts of the meeting were Mike Herdman (UK-born but a Barcelona resident for many years), Juan Manuel Ramos-Goñi and Oliver Rivero-Arias. For the second year running, I chaired the Scientific Programme together with Anna Lugnér.

At its inception, the EuroQol Group was very much a northern European collaboration – the early versions of the EuroQol instrument (now known as the EQ-5D) were developed by researchers in the Netherlands, UK, Sweden, Finland and Norway – see here for an overview of the Group and its history. This year’s Plenary Meeting was attended by 111 participants (primarily academic researchers) representing 23 different countries spanning six continents.

As with previous Plenary Meetings, an HESG-style discussant format was followed – papers were pre-circulated to participants and presented by discussants rather than by authors. The parallel poster sessions also followed a discussant format, with approximately 10 minutes dedicated to the discussion of each poster. In total, 19 papers and 20 posters were presented. For the first time, the majority of the papers were lead-authored by women.

One of the themes of the meeting was a focus on the relationships and interactions between EQ-5D dimensions. A paper by Anna Selivanova compared health state values derived from discrete choice data with and without interaction terms. Anna reported results demonstrating that interactions are important, and that the interaction between mobility and self-care was the most salient. Another paper by Thor Gamst-Klaussen (represented at the meeting by co-author Jan Abel Olsen) explored whether the EQ-5D dimensions behave as causal or effect indicators. The authors applied confirmatory tetrad analysis and confirmatory factor analysis to multi-country, cross-sectional data in order to test a conceptual framework depicting relationships among the five dimensions. The results suggest that the EQ-5D comprises both causal variables – mobility, pain/discomfort and anxiety/depression – and effect variables – self-care and usual activities.

An intriguing paper by John Hartman tested for differences in respondent characteristics, participation, response quality and EQ-5D-5L values depending on the device and connection used to access an online survey. The results showed systematic variability in participation and response quality, but the variability did not affect the resulting health state values. The findings could support extending the administration of valuation surveys to smaller devices (e.g. mobile phones) to obtain responses from younger, more ethnically diverse populations who have traditionally been found to be difficult to recruit.

Other topics covered in the programme included the views of UK decision makers on the role of well-being in resource allocation decisions, the development of a value set for the EQ-5D-Y (a version of the EQ-5D designed for use in children and adolescents), and the prevalence and impact of so-called ‘implausible’ health states.

The Plenary Meeting concluded with a guest presentation by Janel Hanmer of the University of Pittsburgh, followed by a reception at a restaurant on the Montjuïc hill overlooking the Barcelona harbour. The next EuroQol conference will be the Academy Meeting, which takes place in Budapest on 6-8 March 2018.

Chris Sampson’s journal round-up for 25th September 2017

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Good practices for real-world data studies of treatment and/or comparative effectiveness: recommendations from the Joint ISPOR-ISPE Special Task Force on Real-World Evidence in Health Care Decision Making. Value in Health Published 15th September 2017

I have an instinctive mistrust of buzzwords. They’re often used to avoid properly defining something, either because it’s too complicated or – worse – because it isn’t worth defining in the first place. For me, ‘real-world evidence’ falls foul of this. If your evidence isn’t from the real world, then it isn’t evidence at all. But I do like a good old ISPOR Task Force report, so let’s see where this takes us. Real-world evidence (RWE) and its sibling buzzword real-world data (RWD) relate to observational studies and other data not collected in an experimental setting. The purpose of this ISPOR task force (joint with the International Society for Pharmacoepidemiology) was to prepare some guidelines about the conduct of RWE/RWD studies, with a view to improving decision-makers’ confidence in them. Essentially, the hope is to create for RWE the kind of ecosystem that exists around RCTs, with procedures for study registration, protocols, and publication: a noble aim. The authors distinguish between two types of RWD study: ‘Exploratory Treatment Effectiveness Studies’ and ‘Hypothesis Evaluating Treatment Effectiveness Studies’. The idea is that the latter test a priori hypotheses, and these are the focus of this report. Seven recommendations are presented: i) pre-specify the hypotheses, ii) publish a study protocol, iii) publish the study with reference to the protocol, iv) enable replication, v) test hypotheses on a dataset separate from the one used to generate them, vi) publicly address methodological criticisms, and vii) involve key stakeholders. Fair enough. But these are just good practices for research generally. It isn’t clear how they are in any way specific to RWE. Of course, that was always going to be the case. RWE-specific recommendations would be entirely contingent on whether or not one chose to define a study as using ‘real-world evidence’ (which you shouldn’t, because it’s meaningless). The authors are trying to fit a bag of square pegs into a hole of undefined shape. It isn’t clear to me why retrospective observational studies, prospective observational studies, registry studies, or analyses of routinely collected clinical data should all be treated the same, yet differently from randomised trials. Maybe someone can explain why I’m mistaken, but this report didn’t do it.

Are children rational decision makers when they are asked to value their own health? A contingent valuation study conducted with children and their parents. Health Economics [PubMed] [RePEc] Published 13th September 2017

Obtaining health state utility values for children presents all sorts of interesting practical and theoretical problems, especially if we want to use them in decisions about trade-offs with adults. For this study, the researchers conducted a contingent valuation exercise to elicit the preferences of children (aged 7-19) for a reduced risk of asthma attacks, in terms of willingness to pay. The study was informed by two preceding studies that sought to identify the best way in which to present health risk and financial information to children. The participating children (n=370) completed questionnaires at school, which asked about socio-demographics, experience of asthma, risk behaviours and altruism. They were reminded (in child-friendly language) about the idea of opportunity cost, and to consider their own budget constraint. Baseline asthma attack risk and 3 risk-reduction scenarios were presented graphically. Two weeks later, the parents completed similar questionnaires. Only 9% of children were unwilling to pay for risk reduction, and most of those said that it was the mayor’s problem! In some senses, the children did a better job than their parents. The authors conducted 3 tests for ‘incorrect’ responses – 14% of adults failed at least one, while only 4% of children did so. Older children demonstrated better scope sensitivity. Of course, children’s willingness to pay was much lower in absolute terms than their parents’, because children have a much smaller budget. As a percentage of the budget, parents were – on average – willing to pay more than children. That seems reassuringly predictable. Boys and fathers were willing to pay more than girls and mothers. Having experience of frequent asthma attacks increased willingness to pay. Interestingly, teenagers were willing to pay less (as a proportion of their budget) than younger children… and so were the teenagers’ parents! Children’s willingness to pay was correlated with their parents’ at the higher risk reductions, but not at the lowest. This study reports lots of interesting findings and opens up plenty of avenues for future research. But the take-home message is obvious. Kids are smart. We should spend more time asking them what they think.

Journal of Patient-Reported Outcomes: aims and scope. Journal of Patient-Reported Outcomes Published 12th September 2017

Here we have a new journal that warrants a mention. The journal is sponsored by the International Society for Quality of Life Research (ISOQOL), making it a sister journal of Quality of Life Research. One of its Co-Editors-in-Chief is the venerable David Feeny, of HUI fame. They’ll be looking to publish research using PRO(M) data from trials or routine settings, studies of the determinants of PROs, qualitative studies in the development of PROs; anything PRO-related, really. This could be a good journal for more thorough reporting of PRO data that can get squeezed out of a study’s primary outcome paper. Also, “JPRO” is fun to say. The editors don’t mention that the journal is open access, but the website states that it is, so APCs at the ready. ISOQOL members get a discount.

Research and development spending to bring a single cancer drug to market and revenues after approval. JAMA Internal Medicine [PubMed] Published 11th September 2017

We often hear that new drugs are expensive because they’re really expensive to develop. Then we hear about how much money pharmaceutical companies spend on marketing, and we baulk. The problem is, pharmaceutical companies aren’t forthcoming with their accounts, so researchers have to come up with more creative ways to estimate R&D spending. Previous studies have reported divergent estimates. Whether R&D costs ‘justify’ high prices remains an open question. For this study, the authors looked at public data from the US for 10 companies that had only one cancer drug approved by the FDA between 2007 and 2016. Not very representative, perhaps, but useful because it allows for the isolation of the development costs associated with a single drug reaching the market. The median time for drug development was 7.3 years. The most generous estimate of the mean cost of development came in at under a billion dollars – substantially less than some previous estimates. This looks like a bargain; the mean revenue for the 10 companies up to December 2016 was over $6.5 billion. This study may seem a bit back-of-the-envelope in nature. But that doesn’t mean it isn’t accurate. If anything, it warrants more confidence than some previous studies because the methods are entirely transparent.
