Journal Club Briefing: Dolan and Kahneman (2008)

Today’s Journal Club Briefing comes from the Academic Unit of Health Economics at the University of Leeds. At their journal club on 2nd August 2017, they discussed Dolan and Kahneman’s 2008 article from The Economic Journal: ‘Interpretations of utility and their implications for the valuation of health’. If you’ve discussed an article at a recent journal club meeting at your own institution and would like to write a briefing for the blog, get in touch.

Why this paper?

Dolan and Kahneman (2008) was published nearly ten years ago, was written several years before that, and did not appear in a health-related journal. It is therefore, at first sight, a slightly curious choice for a health economics journal club. However, it raises issues which are at the heart of health economics practice. The questions raised by this article have not as yet been answered, and don’t look likely to be answered anytime soon.

Summary

Experienced vs. decision utility

The article’s point of departure is the distinction between experienced utility and decision utility, often a source of fruitful research in behavioural economics. Experienced utility is utility in the Benthamite sense, meaning the hedonic experience in the current moment: the pleasure and/or pain felt by a person at any given point in time. Decision utility is utility as taught in undergraduate economics textbooks: an objective function which the individual dispassionately acts to maximise. In the neoclassical framework of said undergraduate textbooks, this is a distinction without a difference. The individual correctly forecasts the expected flow of experienced utility given the available information and her actions, forms a decision utility function from it and acts to maximise it.

However, Thaler and Sunstein wouldn’t have sold as many books if things were so simple. Many systematic and significant instances of divergences between experienced and decision utility have been well documented, and several people (including one of the authors of this paper) have won Nobel prizes for it. The one which this article focuses on is adaptation.

Adaptation

The authors summarise a large body of evidence that shows that individuals suffer a large loss of utility after a traumatic event (e.g. the loss of a limb or loss of function), but that for many conditions they will adapt to their new situation and recover much of their utility loss. After as little as a year, their valuation of their health is very similar to that of the general population. Furthermore, the authors summarise various studies which show that individuals routinely and drastically underestimate the amount of adaptation that would occur should such a traumatic event befall them.

This improvement over time in the health-related utility experienced by people with many conditions is partly due to hedonic adaptation – the internal scale of pleasure/pain re-calibrates to their new situation – and partly due to behavioural change, such as finding new pastimes to replace those ruled out by their condition. While the causes of adaptation are fascinating, the focus here is not on the mechanisms behind it, but rather on the consequences for measuring utility and the implications for resource allocation.

Health valuation and adaptation

The methods health economists use to evaluate the utility of being in a given health state, such as time trade-off, standard gamble or discrete choice experiments, will tend to elicit decision utility. They are based on choices between hypothetical states and so will not capture the changes in experienced utility due to adaptation. Thus valuations of health states from the general public will tend to be lower than the valuations from people actually living in the health state.

At first glance, the consequences for resource allocation may not appear to be particularly severe. It may lead to more resources being devoted to healthcare as a whole (at least for life-improving treatments – life-extending treatments are a different case), but the overall healthcare budget is in practice largely a political decision. And if adaptation were the same for every condition, it would not lead to distortions between treatments for alternative conditions.

Yet adaptation is not a universal phenomenon. There are conditions for which little or no adaptation is seen (for example, unexplained pain), and when it occurs, it occurs at different speeds and to differing extents for different conditions. The authors show that valuations of conditions with a greater initial utility loss are lower than those of conditions with a lesser initial loss but a lower degree of adaptation, and thus the former will receive a greater level of resources, despite the sum of experienced utility being the same for both. The authors argue that this is unfair, and that health economists should update their practices to better capture experienced utility.
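To make the arithmetic concrete, here is a minimal sketch in Python. All of the numbers are invented for illustration and do not come from the paper, and `anticipated_valuation` is an assumed stand-in for a valuation that ignores adaptation (it simply reads off the first-year utility).

```python
import math

# Hypothetical experienced utility per year (1.0 = full health) over five years.
# These numbers are invented for illustration; they are not from the paper.
condition_a = [0.4, 0.7, 0.8, 0.8, 0.8]  # large initial loss, strong adaptation
condition_b = [0.7, 0.7, 0.7, 0.7, 0.7]  # smaller initial loss, no adaptation


def total_experienced_utility(path):
    """Sum the flow of experienced utility over the whole period."""
    return sum(path)


def anticipated_valuation(path):
    """Assumed proxy for a valuation that ignores adaptation: first-year utility."""
    return path[0]


for name, path in [("A", condition_a), ("B", condition_b)]:
    print(name, round(total_experienced_utility(path), 2), anticipated_valuation(path))

# Compare the totals with a tolerance, since the inputs are floats.
same_total = math.isclose(
    total_experienced_utility(condition_a),
    total_experienced_utility(condition_b),
)
```

Under these assumptions both conditions deliver the same total experienced utility, yet condition A looks worse to anyone who anticipates only the initial loss, and so would attract more resources.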

Public vs. patient preference

A common argument in favour of the status quo is that (in many countries at least) it is public resources which are being allocated, and thus it is public preferences which should be respected. It appears legitimate to allocate resources to assuage public fears of health states, even if those health states are worse in their imagination than in reality. The authors consider this argument and reply that, in this case, the instruments of health economists are still not fit for purpose. General measures of health states, such as EQ-5D, go out of their way to describe states in abstract terms and to separate them from causes, such as cancer, which may carry an emotional affect. It cannot be argued that public valuations are justified because resources should be allocated according to public fears if the measurement of valuation deliberately tries not to elicit those fears.

The argument that adaptation causes serious problems for valuing health and for allocation of health resources is a persuasive one. It is undoubtedly true that changes in utility over time, and other violations of the neoclassical economic paradigm such as reference dependence, do not presently receive sufficient attention in health economics and policy decisions in general.

Discussion

Which yardstick?

Despite the stimulating discussion and the overall brilliance of the paper, there are some elements which can be challenged. One of them is that throughout, the authors’ arguments and recommendations are made from the standpoint that the sum over time of the flow of experienced utility from a health state is to be used as the sole measure of value. In practice, this would mean using what one of the authors calls the day reconstruction method (DRM), in which respondents rate a range of feelings including happiness, worry, and frustration.

Despite the acknowledgement of some philosophical difficulties, the sum of the flow of experienced utility is treated as if it is the only true yardstick with which to measure health. No convincing justification is given, and there is no discussion of the qualitative aspect of the measurement as opposed to a truly cardinal measure of health allowing ranking of individuals’ health states.

Public vs. private preferences revisited

The authors raise the question of whether current practice can be justified by a desire to soothe public fears, and dismiss it since the elicitation tools are not suitable. However, they do not address the question of whether allocating public resources according to the public’s (incorrect) fears of given diseases or health states could be a legitimate health policy aim. One could imagine, for example, a discrete choice experiment eliciting how much the general public dreads cancer over other diseases, and make an argument that the welfare of the public is improved by allocating resources based on these results. There are myriad problems with such an approach, of course, but there seem to be no fewer problems with alternative approaches.

Intertemporal welfare

Intertemporal welfare judgements are notoriously difficult once one leaves the exponential discounting framework. It seems just as legitimate to base valuations on the ex post judgement of individuals who have fully adjusted to a health state as on an integration of past feelings, most of which are now distant memories. Most people would agree that the time to value their experience of a marathon is after completing it, not during the twenty-fifth mile or at the start line.

Indeed, this appears to be the position tacitly taken elsewhere by Kahneman in his work on the peak-end rule. In Redelmeier et al. (2003), it was found that the retrospective rating of the pain of a colonoscopy was based almost exclusively on the peak intensity of pain and on the pain felt at the end. Thus procedures which were extended by an extra three minutes were remembered as less painful than standard procedures, even though the total pain experienced was greater. Furthermore, those who underwent the extended procedure were more likely to state they would undergo it again. It would seem strange, in this case, to judge them as worse off.
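The gap between remembered and total pain can be sketched in a few lines of Python. The per-minute pain ratings below are invented, and averaging the peak and the final rating is one common simple formalisation of the peak-end rule, assumed here rather than taken from the study.

```python
def total_pain(ratings):
    """Total pain experienced: the sum of the moment-by-moment ratings."""
    return sum(ratings)


def peak_end_score(ratings):
    """Assumed peak-end summary: the mean of the worst rating and the last rating."""
    return (max(ratings) + ratings[-1]) / 2


# Hypothetical pain ratings (0-10) for each minute of a procedure.
standard = [2, 5, 8, 7]          # ends close to the peak
extended = standard + [4, 3, 2]  # same procedure plus three milder minutes

for name, ratings in [("standard", standard), ("extended", extended)]:
    print(name, total_pain(ratings), peak_end_score(ratings))
```

With these made-up ratings the extended procedure involves more total pain but a lower peak-end score, which is the pattern described above.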

Schelling (1984) ends his superlative discussion of the problems of intertemporal decision making with the following thought experiment. Just as with valuing health, there are no easy answers.

[S]ome anesthetics block transmission of the nervous impulses that constitute pain; others have the characteristic that the patient responds to the pain as if feeling it fully but has utterly no recollection afterwards. One of these is sodium pentothal. In my imaginary experiment we wish to distinguish the effects of the drug from the effects of the unremembered pain, and we want a healthy control subject in parallel with some painful operations that will be performed with the help of this drug. For a handsome fee you will be knocked out for an hour or two, allowed to sleep it off, then tested before you go home. You do this regularly, and one afternoon you walk into the lab a little early and find the experimenters viewing some videotape. On the screen is an experimental subject writhing, and though the audio is turned down the shrieks are unmistakably those of a person in pain. When the pain stops the victim pleads, “Don’t ever do that again. Please.”

The person is you.

Do you care?

Do you walk into your booth, lie on the couch, and hold out your arm for today’s injection?

Should I let you?

Chris Sampson’s journal round-up for 19th June 2017

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Health-related resource-use measurement instruments for intersectoral costs and benefits in the education and criminal justice sectors. PharmacoEconomics [PubMed] Published 8th June 2017

Increasingly, people are embracing a societal perspective for economic evaluation. This often requires the identification of costs (and benefits) in non-health sectors such as education and criminal justice. But it feels as if we aren’t as well-versed in capturing these as we are in the health sector. This study reviews the measures that are available to support a broader perspective. The authors searched the Database of Instruments for Resource Use Measurement (DIRUM) as well as the usual electronic journal databases. The review also sought to identify the validity and reliability of the instruments. From 167 papers assessed in the review, 26 different measures were identified (half of which were in DIRUM). 21 of the instruments were only used in one study. Half of the measures included items relating to the criminal justice sector, while 21 included education-related items. Common specifics for education included time missed at school, tutoring needs, classroom assistance and attendance at a special school. Criminal justice sector items tended to include legal assistance, prison detainment, court appearances, probation and police contacts. An assessment of the psychometric properties was found for only 7 of the 26 measures, with specific details on the non-health items available for just 2: test-retest reliability for the Child and Adolescent Services Assessment (CASA) and validity for the WPAI+CIQ:SHP,V2 (there isn’t room on the Internet for the full name). So there isn’t much evidence of any validity for any of these measures in the context of intersectoral (non-health) costs and benefits. It’s no doubt the case that health-specific resource use measures aren’t subject to adequate testing, but this study has identified that the problem may be even greater when it comes to intersectoral costs and benefits.

Most worrying, perhaps, is the fact that 1 in 5 of the articles identified in the review reported using some unspecified instrument, presumably developed specifically for the study or adapted from an off-the-shelf instrument. The authors propose that a new resource use measure for intersectoral costs and benefits (RUM ICB) be developed from scratch, with reference to existing measures and guidance from experts in education and criminal justice.

Use of large-scale HRQoL datasets to generate individualised predictions and inform patients about the likely benefit of surgery. Quality of Life Research [PubMed] Published 31st May 2017

In the NHS, EQ-5D data are now routinely collected from patients before and after undergoing one of four common procedures. These data can be used to see how much patients’ health improves (or deteriorates) following the operations. However, at the individual level, for a person deciding whether or not to undergo the procedure, aggregate outcomes might not be all that useful. This study relates to the development of a nifty online tool that a prospective patient can use to find out the expected likelihood that they will feel better, the same or worse following the procedure. The data used include EQ-5D-3L responses associated with almost half a million unilateral hip or knee replacements or groin hernia repairs between April 2009 and March 2016. Other variables are also included, and central to this analysis is a Likert scale about improvement or worsening of hip/knee/hernia problems compared to before the operation. The purpose of the study is to group people – based on their pre-operation characteristics – according to their expected postoperative utility scores. The authors employed a recursive Classification and Regression Tree (CART) algorithm to split the datasets into strata according to the risk factors. The final risk variables were age, gender, pre-operative EQ-5D-3L profile and symptom duration. The CART analysis grouped people into between 55 and 60 different groups for each of the procedures, with the groupings explaining 14-27% of the variation in postoperative utility scores. Minimally important (positive and negative) differences in the EQ-5D utility score were estimated with reference to changes in the Likert scale for each of the procedures. These ranged in magnitude from 0.041 to 0.106. The resulting algorithms are what drive the results delivered by the online interface (you can go and have a play with it).

There are a few limitations to the study, such as the reliance on complete case analysis and the fact that the CART analysis might lack predictive ability. And there’s an interesting problem inherent in all of this, that the more people use the tool, the less representative it will become as it influences selection into treatment. The validity of the tool as a precise risk calculator is quite limited. But that isn’t really the point. The point is that it unlocks some of the potential value of PROMs to provide meaningful guidance in the process of shared decision-making.
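For readers unfamiliar with CART, the core step can be sketched as follows. This is a toy illustration and not the authors’ implementation: the four patients are invented, the `best_split` helper is hypothetical, and it performs only a single split, whereas CART applies the same search recursively within each resulting group.

```python
from statistics import mean

# Hypothetical patients: invented numbers, not taken from the PROMs data.
patients = [
    {"age": 55, "pre_utility": 0.2, "post_utility": 0.60},
    {"age": 62, "pre_utility": 0.3, "post_utility": 0.65},
    {"age": 70, "pre_utility": 0.6, "post_utility": 0.85},
    {"age": 48, "pre_utility": 0.7, "post_utility": 0.90},
]


def sse(values):
    """Sum of squared deviations from the group mean (0 for an empty group)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, feature, outcome="post_utility"):
    """Find the threshold on `feature` that minimises within-group SSE of `outcome`.

    Returns (threshold, sse_after), or None if the feature takes a single value.
    """
    best = None
    thresholds = sorted({r[feature] for r in rows})
    for t in thresholds[:-1]:  # splitting at the largest value leaves one side empty
        left = [r[outcome] for r in rows if r[feature] <= t]
        right = [r[outcome] for r in rows if r[feature] > t]
        score = sse(left) + sse(right)
        if best is None or score < best[1]:
            best = (t, score)
    return best


threshold, sse_after = best_split(patients, "pre_utility")
sse_before = sse([p["post_utility"] for p in patients])
share_explained = 1 - sse_after / sse_before  # variation explained by the grouping
print(threshold, round(share_explained, 2))
```

The share of variation explained by the resulting groups is the same kind of quantity as the 14-27% reported in the paper, although a toy dataset like this one will of course give a far more flattering figure.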

Can present biasedness explain early onset of diabetes and subsequent disease progression? Exploring causal inference by linking survey and register data. Social Science & Medicine [PubMed] Published 26th May 2017

The term ‘irrational’ is overused by economists. But one situation in which I am willing to accept it is with respect to excessive present bias. That people don’t pay enough attention to future outcomes seems to be a fundamental limitation of the human brain in the 21st century. When it comes to diabetes and its complications, there are lots of treatments available, but there is only so much that doctors can do. A lot depends on the patient managing their own disease, and it stands to reason that present bias might cause people to manage their diabetes poorly, as the value of not going blind or losing a foot 20 years in the future seems less salient than the joy of eating your own weight in carbs right now. But there’s a question of causality here; does the kind of behaviour associated with time-inconsistent preferences lead to poorer health or vice versa? This study provides some insight on that front. The authors outline an expected utility model with quasi-hyperbolic discounting and probability weighting, and incorporate a present bias coefficient attached to payoffs occurring in the future. Postal questionnaires were collected from 1031 type 2 diabetes patients in Denmark with an online discrete choice experiment as a follow-up. These data were combined with data from a registry of around 9000 diabetes patients, from which the postal/online participants were identified. BMI, HbA1c, age and year of diabetes onset were all available in the registry and the postal survey included physical activity, smoking, EQ-5D, diabetes literacy and education. The DCE was designed to elicit time preferences using the offer of (monetary) lottery wins, with 12 different choice sets presented to all participants. Unfortunately, despite the offer of a real-life lottery award for taking part in the research, only 79 of 1031 completed the online DCE survey.

Regression analyses showed that individuals with diabetes since 1999 or earlier, or who were 48 or younger at the time of onset, exhibited present bias. And the present bias seems to be causal. Being inactive, obese, diabetes illiterate and having lower quality of life or poorer glycaemic control were associated with being present biased. These relationships hold when subject to a number of control measures. So it looks as if present bias explains at least part of the variation in self-management and health outcomes for people with diabetes. Clearly, the selection of the small sample is a bit of a concern. It may have meant that people with particular risk preferences (given that the reward was a lottery) were excluded, and so the sample might not be representative. Nevertheless, it seems that at least some people with diabetes could benefit from interventions that increase the salience of future health-related payoffs associated with self-management.
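For anyone who hasn’t met quasi-hyperbolic discounting before, the standard beta-delta form is easy to sketch. The code below is a textbook illustration rather than the model estimated in the paper: it leaves out probability weighting, and the parameter values (`beta=0.7`, `delta=0.99`) and payoffs are made up.

```python
def discounted_value(amount, delay, beta=0.7, delta=0.99):
    """Beta-delta present value of `amount` received after `delay` periods.

    Immediate payoffs are not discounted; every future payoff is scaled by the
    present bias coefficient `beta` on top of exponential discounting `delta**delay`.
    """
    if delay == 0:
        return amount
    return beta * (delta ** delay) * amount


def prefers_later(sooner, later, **params):
    """True if the later (amount, delay) option is valued above the sooner one."""
    return discounted_value(*later, **params) > discounted_value(*sooner, **params)


# Hypothetical choices: 100 now vs 110 one period later, and the same pair
# pushed 30 periods into the future.
near_choice = prefers_later((100, 0), (110, 1))
far_choice = prefers_later((100, 30), (110, 31))
print(near_choice, far_choice)

# With beta = 1 the model collapses to exponential discounting and the
# preference no longer depends on when the choice is made.
consistent_near = prefers_later((100, 0), (110, 1), beta=1.0)
consistent_far = prefers_later((100, 30), (110, 31), beta=1.0)
```

With these assumed parameters, the person takes the smaller payoff when it is available immediately but is happy to wait for the larger one when both options lie in the future. That reversal is the time inconsistency that the present bias coefficient is meant to capture.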

Chris Sampson’s journal round-up for 8th May 2017

Verification of decision-analytic models for health economic evaluations: an overview. PharmacoEconomics [PubMed] Published 29th April 2017

Increasingly, it’s expected that model-based economic evaluations can be validated and shown to be fit-for-purpose. However, up to now, discussions have focussed on scientific questions about conceptualisation and external validity, rather than technical questions, such as whether the model is programmed correctly and behaves as expected. This paper looks at how things are done in the software industry with a view to creating guidance for health economists. Given that Microsoft Excel remains one of the most popular software packages for modelling, there is a discussion of spreadsheet errors. These might be errors in logic, simple copy-paste type mistakes and errors of omission. A variety of tactics is discussed. In particular, the authors describe unit testing, whereby individual parts of the code are demonstrated to be correct. Unit testing frameworks do not exist for application to spreadsheets, so the authors recommend the creation of a ‘Tests’ spreadsheet with tests for parameter assignments, functions, equations and exploratory items. Independent review by another modeller is also recommended. Six recommendations are given for taking model verification forward: i) the use of open source models, ii) standardisation in model storage and communication (anyone for a registry?), iii) style guides for script, iv) agency and journal mandates, v) training and vi) creation of an ISPOR/SMDM task force. This is a worthwhile read for any modeller, with some neat tactics that you can build into your workflow.
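Outside of spreadsheets, the same idea is straightforward to sketch in code. The example below is a hypothetical three-state Markov cohort model with invented transition probabilities. The function names are illustrative and nothing here is taken from the paper. Each test checks one small property that a correctly programmed model should satisfy.

```python
# Hypothetical three-state model: healthy, sick, dead. Probabilities are invented.
TRANSITION = [
    [0.85, 0.10, 0.05],
    [0.00, 0.70, 0.30],
    [0.00, 0.00, 1.00],
]


def run_markov_cohort(initial, transition, cycles):
    """Return the cohort distribution at cycle 0 and after each subsequent cycle."""
    state = list(initial)
    trace = [state]
    n = len(state)
    for _ in range(cycles):
        state = [sum(state[i] * transition[i][j] for i in range(n)) for j in range(n)]
        trace.append(state)
    return trace


def test_transition_rows_are_probabilities(tol=1e-9):
    """Parameter check: every row holds valid probabilities that sum to 1."""
    for row in TRANSITION:
        assert all(0.0 <= p <= 1.0 for p in row)
        assert abs(sum(row) - 1.0) < tol


def test_cohort_is_conserved(tol=1e-9):
    """Equation check: nobody appears or disappears between cycles."""
    for state in run_markov_cohort([1.0, 0.0, 0.0], TRANSITION, 20):
        assert abs(sum(state) - 1.0) < tol


def test_dead_is_absorbing(tol=1e-12):
    """Logic check: the proportion dead never falls from one cycle to the next."""
    dead = [state[2] for state in run_markov_cohort([1.0, 0.0, 0.0], TRANSITION, 20)]
    assert all(later >= earlier - tol for earlier, later in zip(dead, dead[1:]))


for test in (test_transition_rows_are_probabilities,
             test_cohort_is_conserved,
             test_dead_is_absorbing):
    test()
print("all checks passed")
```

In a real project these functions would more naturally live in a test file run by a framework such as `unittest` or `pytest`. Calling them directly here just keeps the sketch self-contained.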

How robust are value judgments of health inequality aversion? Testing for framing and cognitive effects. Medical Decision Making [PubMed] Published 25th April 2017

Evidence shows that people are often extremely averse to health inequality. Sometimes these super-egalitarian responses imply such extreme preferences that monotonicity is violated. The starting point for this study is the idea that these findings are probably influenced by framing effects and cognitive biases, and that they may therefore not constitute a reliable basis for policy making. The authors investigate 4 hypotheses that might indicate the presence of bias: i) realistic small health inequality reductions vs larger ones, ii) population- vs individual-level descriptions, iii) concrete vs abstract intervention scenarios and iv) online vs face-to-face administration. Two samples were recruited: one with a face-to-face discussion (n=52) and the other online (n=83). The questionnaire introduced respondents to health inequality in England before asking 4 questions in the form of a choice experiment, with 20 paired choices. Responses are grouped according to non-egalitarianism, prioritarianism and strict egalitarianism. The main research question is whether or not the alternative strategies resulted in fewer strict egalitarian responses. Not much of an effect was found with regard to large gains or to population-level descriptions. There was evidence that the abstract scenarios resulted in a greater proportion of people giving strong egalitarian responses. And the face-to-face sample did seem to exhibit some social desirability bias, with more egalitarian responses. But the main take-home message from this study for me is that it is not easy to explain away people’s extreme aversion to health inequality, which is heartening. Yet, as with all choice experiments, we see that the mode of administration – and cognitive effects induced by the question – can be very important.

Adaptation to health states: sick yet better off? Health Economics [PubMed] Published 20th April 2017

Should patients or the public value health states for the purpose of resource allocation? It’s a question that’s cropped up plenty of times on this blog. One of the trickier challenges is understanding and dealing with adaptation. This paper has a pretty straightforward purpose – to look for signs of adaptation in a longitudinal dataset. The authors’ approach is to see whether there is a positive relationship between the length of time a person has an illness and the likelihood of them reporting better health. I did pretty much the same thing (for SF-6D and satisfaction with life) in my MSc dissertation, and found little evidence of adaptation, so I’m keen to see where this goes! The study uses 4 waves of data from the British Cohort Study, looking at self-assessed health (on a 4-point scale) and self-reported chronic illness and health shocks. Latent self-assessed health is modelled using a dynamic ordered probit model. In short, there is evidence of adaptation. People who have had a long-standing illness for a greater duration are more likely to report a higher level of self-assessed health. An additional 10 years of illness is associated with an 8 percentage point increase in the likelihood of reporting ‘excellent’ health. The study is opaque about sample sizes, but I’d guess that finding is based on not-that-many people. Further analyses are conducted to show that adaptation seems to become important only after a relatively long duration (~20 years) and that better health before diagnosis may not influence adaptation. The authors also look at specific conditions, finding that some (e.g. diabetes, anxiety, back problems) are associated with adaptation, while others (e.g. depression, cancer, Crohn’s disease) are not. I have a bit of a problem with this study though, in that it’s framed as being relevant to health care resource allocation and health technology assessment. But I don’t think it is.

Self-assessed health in the ‘how healthy are you’ sense is very far removed from the process by which health state utilities are obtained using the EQ-5D. And the two probably don’t reflect adaptation in the same way.
