Chris Sampson’s journal round-up for 31st December 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Perspectives of patients with cancer on the quality-adjusted life year as a measure of value in healthcare. Value in Health Published 29th December 2018

Patients should have the opportunity to understand how decisions are made about which treatments they are and are not allowed to use, given their coverage. This study reports on a survey of cancer patients and survivors, with the aim of identifying patients’ awareness, understanding, and opinions about the QALY as a measure of value.

Participants were recruited from a (presumably US-based) patient advocacy group, and 774 people responded – mostly well-educated, mostly white, and mostly women. The online survey asked about cancer status and included a couple of measures of health literacy. Fewer than 7% of participants had ever heard of the QALY, with awareness more likely among those with greater health literacy. The survey explained the QALY to participants and then asked whether the concept made sense. Around half said it did, and 24% thought that it was a good way to measure value in health care. The researchers report a variety of ‘significant’ differences in tendencies to understand or support the use of QALYs, but I’m not convinced that they’re meaningful, because the differences aren’t big and the subgroups are relatively small.

At the end of the survey, respondents were asked to provide opinions on QALYs and value in health care. 165 people provided responses, which were coded and analysed qualitatively. The researchers identified three themes from this one free-text question: i) measuring value, ii) opinions on QALY, and iii) value in health care and decision making. I’m not sure that these are meaningful themes that help us to understand patients’ views on QALYs. A substantial proportion of respondents rejected the idea of using numbers to quantify value in health care. On the other hand, some suggested that the QALY could be a useful decision aid for patients. There was opposition to ‘external decision makers’ having any involvement in health care decision making. Unless you’re paying for all of your care out of pocket, that’s tough luck. But the most obvious finding from the qualitative analysis is that respondents didn’t understand what QALYs were for. That’s partly because health economists in general need to be better at communicating concepts like the QALY. But I think it’s also in large part because the authors failed to provide a clear explanation. They didn’t even use my lovely Wikipedia graphic. Many of the points made by respondents are entirely irrelevant to the appropriateness of QALYs as they’re used (or, in the case of the US, aren’t yet used) in practice. For example, several discussed the use of QALYs in clinical decision making. Patients think that they should maintain autonomy, which is fair enough but has nothing to do with how QALYs are used to assess health technologies.

QALYs are built on the idea of trade-offs. They measure the trade-off between life extension and life improvement, and they are used to guide trade-offs between different treatments for different people. But the researchers didn’t explain how or why QALYs are used to make these trade-offs, so the views they elicited weren’t well-informed. A stylised example of the first trade-off is sketched below.
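To see the life extension versus life improvement trade-off concretely, here’s a minimal sketch in Python. The numbers are invented for illustration and have nothing to do with the study:

```python
# Illustrative QALY arithmetic with made-up numbers (not from the study).
# QALYs = sum over periods of (utility weight x years lived in that state).

def qalys(profile):
    """profile: list of (utility_weight, years) tuples."""
    return sum(u * t for u, t in profile)

# Treatment A: extends survival to 6 years, but at reduced quality of life.
treatment_a = [(0.6, 6)]   # 6 years at utility 0.6 -> 3.6 QALYs

# Treatment B: 4 years of survival at better quality of life.
treatment_b = [(0.8, 4)]   # 4 years at utility 0.8 -> 3.2 QALYs

print(qalys(treatment_a), qalys(treatment_b))   # 3.6 3.2
# On QALYs, A beats B: the extra survival outweighs the quality loss.
# Nudge the utility weights and the ranking flips - that is precisely
# the trade-off that the survey never spelled out for its respondents.
```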

Measuring multivariate risk preferences in the health domain. Journal of Health Economics Published 27th December 2018

Health preferences research is now a substantial field in itself. But there’s still a lot of work left to be done on understanding risk preferences with respect to health. Gradually, we’re coming round to the idea that people tend to be risk-averse. But risk preferences aren’t (necessarily) so simple. Recent research has proposed that ‘higher order’ preferences such as prudence and temperance play a role. A person exhibiting univariate prudence for longevity is better able to cope with risk if they are going to live longer. Univariate temperance is characterised by a preference for prospects that disaggregate risk across different possible outcomes. Risk preferences can also be multivariate – defined across health and wealth, for example – giving rise to traits such as correlation aversion, cross-prudence, and cross-temperance, which describe how risk preferences in one attribute depend on the level of the other. Many articles from the Arthur Attema camp demand a great deal of background knowledge. This paper isn’t an exception, but it does provide a very clear and intuitive description of the various kinds of uni- and multivariate risk preferences that the researchers are considering.
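The paper elicits these traits model-free through choices between simple lotteries. To make one of them concrete, here’s a minimal Python sketch of the standard Eeckhoudt–Schlesinger lottery test for prudence, using an assumed CRRA utility function and made-up stakes (the study’s stakes and elicitation procedure differ):

```python
# Lottery-based test for prudence, with made-up numbers. A prudent person
# prefers lottery A, which keeps a sure loss and a zero-mean risk in
# separate states of the world, over lottery B, which stacks them together.

def u(x, gamma=0.5):
    """CRRA utility - an assumed functional form for illustration."""
    return x ** (1 - gamma) / (1 - gamma)

w, k = 40.0, 10.0      # baseline longevity (years) and a sure loss
eps = [-5.0, +5.0]     # zero-mean risk, 50/50

# Lottery A: 50% suffer the loss; 50% face the risk (harms disaggregated).
eu_a = 0.5 * u(w - k) + 0.5 * sum(u(w + e) for e in eps) / len(eps)
# Lottery B: 50% suffer the loss AND face the risk; 50% keep w as it is.
eu_b = 0.5 * sum(u(w - k + e) for e in eps) / len(eps) + 0.5 * u(w)

print(f"EU(A) = {eu_a:.4f}, EU(B) = {eu_b:.4f}")
print("prudent (prefers A):", eu_a > eu_b)   # True whenever u''' > 0
```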

For this study, an experiment was conducted with 98 people, who were each asked to make 69 choices, corresponding to 3 choices for each risk preference trait being tested, in both gains and losses. Participants were told that they had €240,000 in wealth and 40 years of life to play with. The number of times that an individual made choices in line with a particular trait was used as an indicator of their strength of preference for it.
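That scoring approach is simple enough to show in a few lines. Here’s a sketch of the idea in Python – the column names and data are invented, not the study’s:

```python
# Tallying trait-consistent choices per participant: each person makes
# 3 choices per trait and domain, so scores run from 0 (never consistent)
# to 3 (always consistent). Data and column names are illustrative only.
import pandas as pd

choices = pd.DataFrame({
    "participant": [1, 1, 1, 2, 2, 2],
    "trait": ["prudence"] * 6,
    "domain": ["gains"] * 6,
    "consistent": [1, 1, 0, 1, 1, 1],   # 1 = choice matches the trait
})

scores = (choices
          .groupby(["participant", "trait", "domain"])["consistent"]
          .sum()
          .rename("trait_score"))
print(scores)
```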

For gains, risk aversion was common for both wealth and longevity, and prudence was a common trait. There was no clear tendency towards temperance. For losses, risk aversion and prudence tended towards neutrality. For multivariate risk preferences, a majority of people were correlation averse for gains and correlation seeking for losses. For gains, 76% of choices were compatible with correlation aversion, suggesting that people prefer to disaggregate fixed wealth and health gains. For losses, the opposite was true in 68% of choices. There was evidence for cross-prudence in wealth gains but not in longevity gains, suggesting that people are more willing to accept risk to their health when their wealth is higher. For losses, the researchers observed cross-prudence and cross-temperance neutrality. The authors go on to explore associations between the different traits.

A key contribution is in understanding how risk preferences differ in the health domain as compared with the monetary domain (which is what most economists study). Conveniently, there are a lot of similarities between risk preferences in the two domains, suggesting that health economists can learn from the wider economics literature. Risk aversion and prudence seem to apply to longevity as well as monetary gains, with a shift to neutrality in losses. The potential implications of these findings are far-reaching, but this is just a small experimental study. More research needed (and anticipated).

Prospective payment systems and discretionary coding—evidence from English mental health providers. Health Economics [PubMed] Published 27th December 2018

If you’ve conducted an economic evaluation in the context of mental health care in England, you’ll have come across mental health care clusters. Patients undergoing mental health care are allocated to one of 20 clusters, classed as either ‘psychotic’, ‘non-psychotic’, or ‘organic’, which forms the basis of an episodic payment model. In 2013/14, these episodes were associated with an average cost of between £975 and £9,354 per day. Doctors determine the clusters and the clusters determine reimbursement. Perverse incentives abound. Or do they?

This study builds on the fact that patients are allocated by clinical teams with guidance from the algorithm-based Mental Health Clustering Tool (MHCT). Clinical teams might exhibit upcoding, whereby patients are allocated to clusters that attract a higher price than the one recommended by the MHCT. Data were analysed for 148,471 patients from the Mental Health Services Data Set for 2011-2015. For each patient, their allocated cluster is known, along with a variety of socioeconomic indicators and the HoNOS and SARN instruments, which feed into the MHCT algorithm. Mixed-effects logistic regression was used to look at whether or not individual patients were allocated to the cluster recommended as ‘best fit’ by the MHCT, controlling for patient and provider characteristics. Further to this, multilevel multinomial logit models were used to categorise decisions that don’t match the MHCT as either under- or overcoding.

Average agreement across clusters between the MHCT and clinicians was 36%. In most cases, patients were allocated to a cluster either one step higher or one step lower in terms of the level of need, and there isn’t an obvious tendency to overcode. The authors are able to identify a few ways in which observable provider and patient characteristics influence the tendency to under- or over-cluster patients. For example, providers with higher activity are less likely to deviate from the MHCT best fit recommendation. However, the dominant finding – identified by using median odds ratios for the probability of a mismatch between two random providers – seems to be that unobserved heterogeneity determines variation in behaviour.
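The median odds ratio is a neat way of summarising that unobserved between-provider heterogeneity. As an illustration (the variance value below is invented, not taken from the paper), it can be computed from the estimated provider-level random-intercept variance of a mixed-effects logistic model:

```python
# Median odds ratio (MOR) for between-provider heterogeneity (Larsen &
# Merlo, 2005): the median odds ratio between two randomly drawn
# providers for otherwise identical patients. sigma2 is the estimated
# variance of the provider-level random intercept - made up here.
from math import exp, sqrt
from statistics import NormalDist

def median_odds_ratio(sigma2):
    """MOR = exp(sqrt(2 * sigma2) * z_0.75), where z_0.75 ~ 0.6745."""
    z75 = NormalDist().inv_cdf(0.75)
    return exp(sqrt(2 * sigma2) * z75)

print(median_odds_ratio(0.5))   # variance of 0.5 -> MOR of about 1.96
# An MOR of ~2 would mean that, for identical patients, moving from a
# lower- to a higher-propensity provider typically doubles the odds of
# a mismatch with the MHCT recommendation.
```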

The study provides clues about the ways in which providers could manipulate coding to their advantage and identifies the need for further data collection for a proper assessment. But reimbursement wasn’t linked to clustering during the time period of the study, so it remains to be seen how clinicians actually respond to these potentially perverse incentives.

Thesis Thursday: Caroline Vass

On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Caroline Vass who has a PhD from the University of Manchester. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.

Title
Using discrete choice experiments to value benefits and risks in primary care
Supervisors
Katherine Payne, Stephen Campbell, Daniel Rigby
Repository link
https://www.escholar.manchester.ac.uk/uk-ac-man-scw:295629

Are there particular challenges associated with asking people to trade off risks in a discrete choice experiment?

The challenge of communicating risk in general, not just in DCEs, was one of the things which drew me to the PhD. I’d heard a TED talk discussing a study which tested people’s understanding of weather forecasts. Although most people think they understand a simple statement like “there’s a 30% chance of rain tomorrow”, few correctly interpret it as meaning that it will rain on 30% of days like tomorrow. Most take it to mean that there will be rain 30% of the time, or over 30% of the area.

My first ever publication was a review of the risk communication literature, which confirmed our suspicions: even highly educated samples don’t always interpret risk information as we expect. Testing whether the communication of risk mattered when making trade-offs in a DCE therefore seemed a pretty important topic, and it formed the overarching research question of my PhD.

Most of your study used data relating to breast cancer screening. What made this a good context in which to explore your research questions?

In the UK, all women are invited to participate in breast screening (either via a GP referral or on reaching 47-50 years of age). This makes every woman a potential consumer and a potential ‘patient’. I conducted a lot of qualitative research to ensure the survey text was easily interpretable, and having a disease which many people have heard of made this easier and allowed us to focus on the risk communication formats. My supervisor Prof. Katherine Payne had also been working on a large evaluation of stratified screening, which made contacting experts, patients and charities easier.

There are also national screening participation figures so we were able to test if the DCE had any real-world predictive value. Luckily, our estimates weren’t too far off the published uptake rates for the UK!

How did you come to use eye-tracking as a research method, and were there any difficulties in employing a method not widely used in our field?

I have to credit my supervisor Prof. Dan Rigby with planting the seed and introducing me to the method. I did a bit of reading into what psychologists thought you could measure using eye movements and thought it was worth further investigation. I literally found people publishing with the technology at our institution and knocked on doors until someone would let me use it! If the University of Manchester hadn’t already had the equipment, it would have been much more challenging to collect these data.

I then discovered the joys of lab-based work which I think many health economists, fortunately, don’t encounter in their PhDs. The shared bench, people messing with your experiment set-up, restricted lab time which needs to be booked weeks in advance etc. I’m sure it will all be worth it… when the paper is finally published.

What are the key messages from your research in terms of how we ought to be designing DCEs in this context?

I had a bit of a null result on the risk communication formats: I found that they didn’t affect preferences. Looking back, I think that might have been down to the types of numbers I was presenting (5%, 10%, and 20% are relatively easy to understand), and maybe people already have a lot of knowledge about the risks of breast screening. It certainly warrants further research to see if my finding holds in other settings. There is a lot of support for visual risk communication formats like icon arrays in other literatures, and their addition didn’t seem to do any harm.

Some of the most interesting results came from the think-aloud interviews I conducted with female members of the public. Although I originally wanted to focus on their interpretation of the risk attributes, people started verbalising all sorts of interesting behaviours and strategies. Some of these aligned with economic concepts I hadn’t thought of, such as feelings of regret associated with opting out, and discounting of both the costs and health benefits of later screens in the programme. But there were also some glaring violations, like ignoring certain attributes, associating cost with quality, using other people’s budget constraints to make choices, and trying to game the survey with protest responses. So perhaps people designing DCEs for benefit-risk trade-offs specifically, or in healthcare more generally, should be aware that respondents can and do adopt simplifying heuristics. Is this evidence of the benefits of qualitative research in this context? I make that argument here.

Your thesis describes a wealth of research methods and findings, but is there anything that you wish you could have done that you weren’t able to do?

Achieved a larger sample size for my eye-tracking study!

Chris Sampson’s journal round-up for 25th September 2017

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Good practices for real‐world data studies of treatment and/or comparative effectiveness: recommendations from the Joint ISPOR‐ISPE Special Task Force on Real‐World Evidence in Health Care Decision Making. Value in Health Published 15th September 2017

I have an instinctive mistrust of buzzwords. They’re often used to avoid properly defining something, either because it’s too complicated or – worse – because it isn’t worth defining in the first place. For me, ‘real-world evidence’ falls foul. If your evidence isn’t from the real world, then it isn’t evidence at all. But I do like a good old ISPOR Task Force report, so let’s see where this takes us. Real-world evidence (RWE) and its sibling buzzword real-world data (RWD) relate to observational studies and other data not collected in an experimental setting. The purpose of this ISPOR task force (joint with the International Society for Pharmacoepidemiology) was to prepare some guidelines about the conduct of RWE/RWD studies, with a view to improving decision-makers’ confidence in them. Essentially, the hope is to create for RWE the kind of ecosystem that exists around RCTs, with procedures for study registration, protocols, and publication: a noble aim. The authors distinguish between two types of RWE study: ‘Exploratory Treatment Effectiveness Studies’ and ‘Hypothesis Evaluating Treatment Effectiveness Studies’. The idea is that the latter test a priori hypotheses, and these are the focus of this report. Seven recommendations are presented: i) pre-specify the hypotheses, ii) publish a study protocol, iii) publish the study with reference to the protocol, iv) enable replication, v) test hypotheses on a dataset separate from the one used to generate them, vi) publicly address methodological criticisms, and vii) involve key stakeholders. Fair enough. But these are just good practices for research generally. It isn’t clear how they are in any way specific to RWE. Of course, that was always going to be the case. RWE-specific recommendations would be entirely contingent on whether or not one chose to define a study as using ‘real-world evidence’ (which you shouldn’t, because it’s meaningless). The authors are trying to fit a bag of square pegs into a hole of undefined shape. It isn’t clear to me why retrospective observational studies, prospective observational studies, registry studies, and analyses of routinely collected clinical data should all be treated the same, yet differently to randomised trials. Maybe someone can explain why I’m mistaken, but this report didn’t do it.

Are children rational decision makers when they are asked to value their own health? A contingent valuation study conducted with children and their parents. Health Economics [PubMed] [RePEc] Published 13th September 2017

Obtaining health state utility values for children presents all sorts of interesting practical and theoretical problems, especially if we want to use them in decisions about trade-offs with adults. For this study, the researchers conducted a contingent valuation exercise to elicit the preferences of children (aged 7-19) for reduced risk of asthma attacks, in terms of willingness to pay. The study was informed by two preceding studies that sought to identify the best way in which to present health risk and financial information to children. The participating children (n=370) completed questionnaires at school, which asked about socio-demographics, experience of asthma, risk behaviours, and altruism. They were reminded (in child-friendly language) about the idea of opportunity cost and asked to consider their own budget constraint. Baseline asthma attack risk and 3 risk-reduction scenarios were presented graphically. Two weeks later, the parents completed similar questionnaires. Only 9% of children were unwilling to pay for risk reduction, and most of those said that it was the mayor’s problem! In some senses, the children did a better job than their parents. The authors conducted 3 tests for ‘incorrect’ responses – 14% of adults failed at least one, while only 4% of children did so. Older children demonstrated better scope sensitivity. Of course, children’s willingness to pay was much lower in absolute terms than their parents’, because children have a much smaller budget. As a percentage of the budget, parents were – on average – willing to pay more than children. That seems reassuringly predictable. Boys and fathers were willing to pay more than girls and mothers. Having experience of frequent asthma attacks increased willingness to pay. Interestingly, teenagers were willing to pay less (as a proportion of their budget) than younger children… and so were the teenagers’ parents! Children’s willingness to pay was correlated with that of their own parents at the higher risk reductions, but not at the lowest. This study reports lots of interesting findings and opens up plenty of avenues for future research. But the take-home message is obvious. Kids are smart. We should spend more time asking them what they think.
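Because children and parents face wildly different budget constraints, the comparison only makes sense after normalising. Here’s a minimal sketch of that normalisation in Python – the figures are invented, chosen only so the pattern matches the study’s direction (parents willing to give up a larger share):

```python
# Comparing willingness to pay (WTP) across respondents with different
# budgets by expressing WTP as a share of the budget. Numbers are made up.
respondents = {
    "child":  {"wtp": 2.0,  "budget": 20.0},    # e.g. pocket money
    "parent": {"wtp": 60.0, "budget": 400.0},   # e.g. disposable income
}

for who, r in respondents.items():
    share = r["wtp"] / r["budget"]
    print(f"{who}: WTP = {r['wtp']:.2f}, share of budget = {share:.1%}")
# child:  WTP = 2.00,  share of budget = 10.0%
# parent: WTP = 60.00, share of budget = 15.0%
```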

Journal of Patient-Reported Outcomes: aims and scope. Journal of Patient-Reported Outcomes Published 12th September 2017

Here we have a new journal that warrants a mention. The journal is sponsored by the International Society for Quality of Life Research (ISOQOL), making it a sister journal of Quality of Life Research. One of its Co-Editors-in-Chief is the venerable David Feeny, of HUI fame. They’ll be looking to publish research using PRO(M) data from trials or routine settings, studies of the determinants of PROs, and qualitative studies in the development of PROs – anything PRO-related, really. This could be a good journal for more thorough reporting of PRO data that can get squeezed out of a study’s primary outcome paper. Also, “JPRO” is fun to say. The editors don’t mention that the journal is open access, but the website states that it is, so have your APCs at the ready. ISOQOL members get a discount.

Research and development spending to bring a single cancer drug to market and revenues after approval. JAMA Internal Medicine [PubMed] Published 11th September 2017

We often hear that new drugs are expensive because they’re really expensive to develop. Then we hear about how much money pharmaceutical companies spend on marketing, and we baulk. The problem is, pharmaceutical companies aren’t forthcoming with their accounts, so researchers have to come up with more creative ways of estimating R&D spending. Previous studies have reported divergent estimates, so whether R&D costs ‘justify’ high prices remains an open question. For this study, the authors looked at public data from the US for 10 companies that had only one cancer drug approved by the FDA between 2007 and 2016. Not very representative, perhaps, but useful because it allows for the isolation of the development costs associated with a single drug reaching the market. The median time for drug development was 7.3 years. The most generous estimate of the mean cost of development came in at under a billion dollars – substantially less than some previous estimates. That looks like a bargain: the mean revenue for the 10 companies up to December 2016 was over $6.5 billion. This study may seem a bit back-of-the-envelope in nature, but that doesn’t mean it isn’t accurate. If anything, it inspires more confidence than some previous studies because the methods are entirely transparent.
