Chris Sampson’s journal round-up for 23rd July 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Quantifying life: understanding the history of quality-adjusted life-years (QALYs). Social Science & Medicine [PubMed] Published 3rd July 2018

We’ve had some fun talking about the history of the QALY here on this blog. The story of how the QALY came to be so important in health policy has remained somewhat obscure, and this paper seeks to address that. The research adopts a method called ‘multiple streams analysis’ (MSA) to explain how QALYs caught on. The MSA framework identifies three streams – problems, policy, and politics – and considers the role of the ‘policy entrepreneurs’ involved. For this study, archival material was collected from the National Archives, Department of Health files, and the University of York. The researchers also conducted 44 semi-structured interviews with academics and civil servants.

The problem stream highlights shocks to the UK economy in the late 1960s, coupled with growth in health care costs due to innovation and changing expectations. Cost-effectiveness began to be studied and, increasingly, policymaking was expected to be research-based and accountable. By the 80s, the likes of Williams and Maynard were drawing attention to apparent inequities and inefficiencies in the health service. The policy stream gets going in the 40s and 50s, when health researchers started measuring quality of life. By the early 60s, the idea of standardising these measures to rank health states was on the table. Through the late 60s and early 70s, government economists proliferated and proved themselves useful in health policy. The meeting of Rachel Rosser and Alan Williams in the mid-70s led to the creation of QALYs as we know them, combining quantity of life with quality of life measured on a 0-1 scale. Having acknowledged inefficiencies and inequities in the health service, UK politicians and medics were open to new ideas but remained unconvinced by the QALY. Yet it was a willingness to consider the need for rationing that put the wheels in motion for NICE, and the politics stream – like the problem and policy streams – came to characterise favourable conditions for the use of the QALY.
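As a refresher on what that combination means in practice, here is a minimal sketch of a QALY calculation: time spent in each health state is weighted by that state’s quality-of-life value. The health states and utility weights below are purely illustrative.

```python
# Minimal sketch of a QALY calculation: years in each health state are
# weighted by a 0-1 quality-of-life (utility) value for that state.
# The profile below is invented for illustration.

health_profile = [
    # (years in state, utility weight: 1 = full health, 0 = dead)
    (2.0, 0.9),   # two years in near-full health
    (3.0, 0.6),   # three years in a moderately impaired state
    (1.0, 0.3),   # one year in a severely impaired state
]

qalys = sum(years * utility for years, utility in health_profile)
print(f"QALYs accrued: {qalys:.1f}")  # 2*0.9 + 3*0.6 + 1*0.3 = 3.9
```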

The MSA framework also considers ‘policy entrepreneurs’ who broker the transition from idea to implementation. The authors focus on the role of Alan Williams and of the Economic Advisers’ Office. Williams was key in translating economic ideas into forms that policymakers could understand. Meanwhile, the Economic Advisers’ Office encouraged government economists to engage with academics at HESG and later the QoL Measurement Group (which led to the creation of EuroQol).

The main takeaway from the paper is that good ideas only prevail in the right conditions and with the right people. It’s important to maintain multi-disciplinary and multi-stakeholder networks. In the case of the QALY, the two-way movement of economists between government and academia was crucial.

I don’t completely understand or appreciate the MSA framework, but this paper is an enjoyable read. My only reservation is with the way the authors describe the QALY as being a dominant aspect of health policy in the UK. I don’t think that’s right. It’s dominant within a niche of a niche of a niche – that is, health technology assessment for new pharmaceuticals. An alternative view is that the QALY has in fact languished in a quiet corner of British policymaking, and been completely excluded in some other countries.

Accuracy of patient recall for self‐reported doctor visits: is shorter recall better? Health Economics [PubMed] Published 2nd July 2018

When designing studies that collect self-reported resource use, such as clinical trials, I have always recommended that the data be collected no less frequently than every 3 months. This is based partly on something I once read somewhere that I can’t remember, and partly on the intuition that the accuracy of people’s recall decays over time. This paper has come to tell me how wrong I’ve been.

The authors start by highlighting that recall can be subject to omission, whereby respondents forget relevant information, or commission, whereby respondents include events that did not occur. A key manifestation of the latter is ‘telescoping’, whereby events are included from outside the recall period. We might expect commission to be more likely for short recall periods and omission to be more common for long recall periods. But there’s very little research on this regarding health service use.

This study uses data from a large trial in diabetes care in Australia, in which 5,305 participants were randomised to receive either a 2-week, 3-month, or 12-month recall period for reporting how many times they had seen a doctor. The self-reported data were then matched with Medicare records to identify the true level of resource use.

Over 92% of participants in the 12-month recall group made an error, compared with 76% of the 3-month group and 46% of the 2-week group. The patterns of error differed: there was very little under-reporting in the 2-week recall sample, with 3-month recall giving the most over-reporting and 12-month recall the most under-reporting. The 12-month recall was associated with the largest number of days reported in error. However, when the authors account for the longer period being considered and estimate a relative error, the impact of misreporting is smallest for the 12-month recall and greatest for the 2-week recall. This translates into a smaller overall bias for the longest recall period. The authors also find that older, less educated, unemployed, and low-income patients exhibit higher measurement errors.
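To make the absolute-versus-relative distinction concrete, here is a rough illustration with made-up numbers (not the paper’s data or its exact error definitions): a longer recall period can produce a larger absolute error while still producing a smaller error relative to the true amount of resource use.

```python
# Rough illustration (with made-up numbers) of why a longer recall period
# can give a larger absolute error but a smaller relative error.
# These are not the paper's data or its exact error definitions.

scenarios = {
    # recall period: (true visits in period, self-reported visits)
    "2-week":   (1,  2),    # over-reporting by 1 visit
    "3-month":  (6,  9),    # over-reporting by 3 visits
    "12-month": (24, 20),   # under-reporting by 4 visits
}

for period, (true, reported) in scenarios.items():
    absolute_error = abs(reported - true)
    relative_error = absolute_error / true
    print(f"{period:>9}: absolute error = {absolute_error} visits, "
          f"relative error = {relative_error:.0%}")
```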

Health surveys and comparative studies that estimate resource use over a long period should use 12-month recall unless they can find a reason to do otherwise. The authors provide some examples from economic evaluations to demonstrate how selecting shorter recall periods could lead to the wrong decisions being recommended. It’s worth trying to understand why people can more accurately recall service use over 12 months; that way, data collection methods could be designed to optimise recall accuracy.

Who should receive treatment? An empirical enquiry into the relationship between societal views and preferences concerning healthcare priority setting. PLoS One [PubMed] Published 27th June 2018

Part of the reason the QALY faces opposition is that it has been used in a way that might not reflect societal preferences for resource allocation. In particular, the idea that ‘a QALY is a QALY is a QALY’ may conflict with notions of desert, severity, or process. We’re starting to see more evidence of different groups of people holding different views, which makes it difficult to come up with decision rules to maximise welfare. This study considers some of the perspectives that people adopt, which have been identified in previous research – ‘equal right to healthcare’, ‘limits to healthcare’, and ‘effective and efficient healthcare’ – and looks at how they are distributed in the Netherlands. Using four willingness-to-trade-off (WTT) exercises, the authors explore the relationship between these views and people’s preferences about resource allocation. The trade-offs are between quality and quantity of life, health maximisation and equality, children and the elderly, and lifestyle-related risk and adversity. The authors sought to test several hypotheses: i) ‘equal right’ respondents have a lower WTT; ii) ‘limits to healthcare’ respondents express a preference for health gains, health maximisation, and treating people with adversity; and iii) ‘effective and efficient’ respondents support health maximisation, treating children, and treating people with adversity.

A representative online sample of adults in the Netherlands (n=261) was recruited. The first part of the questionnaire collected socio-demographic information. The second part asked the questions needed to allocate respondents to one of the three viewpoints, using Likert scales based on a previous study. The third part of the questionnaire consisted of the four reimbursement scenarios. Participants were asked to identify the point (in terms of the relevant quantities) at which they would be indifferent between two options.
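As a stylised example of how such an indifference point can be interpreted, consider a quality-versus-quantity trade-off resembling a time trade-off exercise. The scenario and numbers below are invented, not taken from the study.

```python
# Hypothetical sketch of how an indifference point from a willingness to
# trade-off (WTT) exercise implies a trade-off rate. The scenario and
# numbers are invented, not taken from the study.

# Option A: 10 remaining life-years in a health state with moderate problems.
# Option B: a shorter life in full health; the respondent adjusts the number
# of years in B until they are indifferent between the two options.
years_in_poor_health = 10.0
indifferent_years_full_health = 7.0   # the respondent's stated indifference point

# Implied rate: each year in the impaired state is valued at 0.7 of a year
# in full health, i.e. the respondent's implicit quality weight.
implied_quality_weight = indifferent_years_full_health / years_in_poor_health
print(f"Implied quality weight for the impaired state: {implied_quality_weight:.2f}")
```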

The distribution of the viewpoints was 65% ‘equal right’, 23% ‘limits to healthcare’, and 7% ‘effective and efficient’; 6% couldn’t be matched to one of the three viewpoints. In each scenario, people had the option to opt out of trading. 24% of respondents were non-traders for all scenarios and, of these, 78% held the ‘equal right’ viewpoint. Unfortunately, a lot of people opted out of at least one of the trades, and for a wide variety of reasons. Decision-makers can’t opt out, so I’m not sure how useful this is.

The authors describe many associations between individual characteristics, viewpoints, and WTT results. But the tested hypotheses were broadly supported. While the findings showed that different groups were more or less willing to trade, the points of indifference for traders within the groups did not vary. So while you can’t please everyone in health care priority setting, this study shows how policies might be designed to satisfy the preferences of people with different perspectives.

Chris Sampson’s journal round-up for 2nd July 2018

Choice in the presence of experts: the role of general practitioners in patients’ hospital choice. Journal of Health Economics [PubMed] [RePEc] Published 26th June 2018

In the UK, patients are in principle free to choose which hospital they use for elective procedures. However, because these choices operate through a GP referral, the extent to which the choice is ‘free’ is limited. The choice set is provided by the GP, so there are two decision-makers: it’s a classic example of the principal-agent relationship. What’s best for the patient and what’s best for the local health care budget might not align. The focus of this study is on the applied importance of this dynamic and the idea that econometric studies that ignore it – by looking only at patient decision-making or only at GP decision-making – may give biased estimates. The author outlines a two-stage model of the choice process. Hospital characteristics can affect choices in three ways: i) by only influencing the choice set that the GP presents to the patient, e.g. hospital quality; ii) by only influencing the patient’s choice from the set, e.g. hospital amenities; and iii) by influencing both, e.g. waiting times. The study uses Hospital Episode Statistics for 30,000 hip replacements that took place in 2011/12, referred by 4,721 GPs to 168 hospitals, to examine revealed preferences. The choice set for each patient is not observed, so a key assumption is that all hospitals to which a GP made referrals in the period were included in the choice sets presented to that GP’s patients. The main findings are that both GPs and patients are influenced primarily by distance. GPs are also influenced by hospital quality and the budget impact of referrals, while distance and waiting times explain patient choices. For patients, parking spaces seem to matter more than mortality ratios. The results support the notion that patients defer to GPs in assessing quality. In places, it’s difficult to follow what the author did and why, but in essence the author is looking for (and in most cases finding) reasons not to ignore GPs’ preselection of choice sets when conducting econometric analyses of patient choice. Econometricians should take note. And policymakers should be asking whether freedom of choice is sensible when patients prioritise parking and when variable GP incentives could give rise to heterogeneous standards of care.
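The key data-construction assumption, that a patient’s choice set is whatever set of hospitals their GP referred to over the period, is easy to sketch. The snippet below is a hypothetical illustration of that step and of the long ‘one row per patient-alternative’ format a conditional logit could then use; the field names and records are invented, not the HES variables from the paper.

```python
# Simplified sketch of the choice-set assumption: each patient's choice set
# is taken to be the set of hospitals to which their GP made any referral
# during the study period. Records and field names are hypothetical.

from collections import defaultdict

referrals = [
    # (gp_id, patient_id, hospital_chosen)
    ("GP1", "P1", "Hosp_A"),
    ("GP1", "P2", "Hosp_B"),
    ("GP2", "P3", "Hosp_A"),
    ("GP2", "P4", "Hosp_C"),
]

# Step 1: infer each GP's choice set from observed referrals.
gp_choice_sets = defaultdict(set)
for gp, _, hospital in referrals:
    gp_choice_sets[gp].add(hospital)

# Step 2: expand to one row per patient-alternative pair, flagging the chosen
# hospital. Hospital characteristics (distance, waiting times, quality) would
# then be merged in before estimating a conditional logit.
choice_data = [
    {"patient": patient, "alternative": alt, "chosen": int(alt == chosen)}
    for gp, patient, chosen in referrals
    for alt in sorted(gp_choice_sets[gp])
]

for row in choice_data:
    print(row)
```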

Using evidence from randomised controlled trials in economic models: what information is relevant and is there a minimum amount of sample data required to make decisions? PharmacoEconomics [PubMed] Published 20th June 2018

You’re probably aware of the classic ‘irrelevance of inference’ argument. Statistical significance is irrelevant in deciding whether or not to fund a health technology, because we ought to do whatever we expect to be best on average. This new paper argues the case for irrelevance in other domains, namely multiplicity (e.g. multiple testing) and sample size. With a primer on hypothesis testing, the author sets out the regulatory perspective. Multiplicity inflates the chance of a type I error, which is why regulators worry about it and why triallists often obsess over primary outcomes (and avoiding multiplicity). But when we build decision models, we rely on all sorts of outcomes from all sorts of studies, and QALYs are never the primary outcome. So what does this mean for reimbursement decision-making? Reimbursement is based on expected net benefit as derived using decision models, which are Bayesian by definition. Within a Bayesian framework of probabilistic sensitivity analysis, data for relevant parameters should never be disregarded on the basis of the status of their collection in a trial; it is up to the analyst to specify a model that properly accounts for the effects of multiplicity and other sources of uncertainty. The author outlines how this operates in three settings: i) estimating treatment effects for rare events, ii) the number of trials available for a meta-analysis, and iii) the estimation of population mean overall survival. It isn’t so much that multiplicity and sample size are irrelevant – they can inform the analysis – but rather that no data are too weak for a Bayesian analyst.
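As a reminder of the decision rule the author leans on, here is a minimal sketch of adopting whichever option has the highest expected net benefit across probabilistic sensitivity analysis draws, with no hypothesis test in sight. The distributions, threshold, and values are invented for illustration.

```python
# Minimal sketch of an 'expected net benefit' decision rule: simulate
# parameter uncertainty (as in probabilistic sensitivity analysis), compute
# net monetary benefit for each draw, and adopt if the expected value is
# positive. All values are illustrative only.

import random

random.seed(1)
threshold = 20_000          # willingness to pay per QALY (illustrative)
n_draws = 10_000

expected_inb = 0.0
for _ in range(n_draws):
    # Uncertain incremental QALYs and costs of a new treatment vs comparator.
    inc_qalys = random.gauss(0.05, 0.10)     # wide uncertainty, 'not significant'
    inc_cost = random.gauss(500, 200)
    inb = threshold * inc_qalys - inc_cost   # incremental net monetary benefit
    expected_inb += inb / n_draws

print(f"Expected incremental net benefit: £{expected_inb:,.0f}")
print("Adopt" if expected_inb > 0 else "Do not adopt")
```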

Life satisfaction, QALYs, and the monetary value of health. Social Science & Medicine [PubMed] Published 18th June 2018

One of this blog’s first ever posts was on the subject of ‘the well-being valuation approach’ but, to date, I don’t think we’ve ever covered a study in the round-up that uses this method. In essence, the method involves estimating the trade-off between income and some measure of subjective well-being, or some health condition, in order to estimate the income equivalent of that state. This study attempts to estimate the (Australian) dollar value of QALYs, as measured using the SF-6D. Thus, the study is a rival cousin to the Claxton-esque opportunity cost approach, and a rival sibling to stated preference ‘social value of a QALY’ approaches. The authors are trying to identify a threshold value on the basis of revealed preferences. The analysis is conducted using 14 waves of the Australian HILDA panel, with more than 200,000 person-year responses. A regression model estimates the impact on life satisfaction of income, SF-6D index scores, and the presence of long-term conditions. The authors adopt an instrumental variable approach to address the endogeneity of life satisfaction and income, using an indicator of ‘financial worsening’ to approximate an income shock. The estimated value of a QALY is found to be around A$42,000 (~£23,500) over a 2-year period. Over the long term it’s higher, at around A$67,000 (~£37,500), because individuals are found to discount money differently to health. The results also demonstrate that individuals are willing to pay around A$2,000 to avoid a long-term condition, on top of the value of a QALY. The authors apply their approach to a few examples from the literature to demonstrate the implications of using well-being valuation in the economic evaluation of health care. As with all uses of experienced utility in the health domain, adaptation is a big concern. But a key advantage is that this approach can easily be applied to large survey datasets, giving powerful results. However, I haven’t quite got my head around how meaningful the results are. SF-6D index values – as used in this study – are generated on the basis of stated preferences. So to what extent are we measuring revealed preferences? And if it’s some combination of stated and revealed preference, how should we interpret willingness to pay values?
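The core logic of well-being valuation can be sketched in a few lines: if life satisfaction depends on both income and health, then the money value of a health change is the income change that would shift life satisfaction by the same amount. The coefficients below are invented for illustration; the paper’s actual model (log income, instrumented, with long-term conditions) is considerably more involved.

```python
# Stylised sketch of the well-being valuation logic: the money value of a
# health change is the income change producing the same change in life
# satisfaction. Coefficients are invented, not estimates from the paper.

beta_income = 0.020   # life-satisfaction points per $1,000 of annual income
beta_sf6d = 1.50      # life-satisfaction points per unit of SF-6D (0-1 scale)

# Monetary equivalent of one full QALY (an SF-6D change of 1.0 for a year):
value_per_qaly = (beta_sf6d / beta_income) * 1_000
print(f"Implied value of a QALY: ${value_per_qaly:,.0f}")
```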

Meeting round-up: 7th Meeting of the International Academy of Health Preference Research

The 7th meeting of the International Academy of Health Preference Research (IAHPR) took place in Glasgow on Saturday 4th November 2017. The meeting was chaired by Karin Groothuis-Oudshoorn and Terry Flynn. It was preceded by a Friday afternoon symposium on the econometrics of heterogeneity, which I was unable to attend.

IAHPR is a relatively new organisation, describing itself as an ‘international network of multilingual, multidisciplinary researchers who contribute to the field of health preference research’. To minimise participants’ travel costs, IAHPR meetings are usually scheduled alongside major international conferences such as the meetings of iHEA, EuHEA and AHES (the Australian Health Economics Society). The November meeting took place just before the kick-off of the ISPOR European Congress (a behemoth by comparison). Most, but not all, of the attendees I spoke to said that they would also be attending the ISPOR Congress.

The meeting was attended by 49 researchers from nine different countries. Nine were from the US, 16 from the UK, and 22 from elsewhere in the EU (sadly, I won’t be able to use the phrase ‘elsewhere in the EU’ for much longer). Understandably, the regional representation of the Glasgow meeting was quite different from that of the (July 2017) Boston meeting, where over 60% of the participants were based in the US.

In total there were 12 podium presentations (half by student presenters) and about eight posters. Each podium presenter was allocated 12 minutes for their presentation and a further eight minutes for questions and group discussion. The poster authors were given the opportunity to briefly introduce themselves and their research to the group as part of an ‘elevator talks’ session.

Although all of the presentations focused on issues in stated preference research, the range of topics was quite broad, covering preferences between health outcomes, preferences between health services, conceptual and theoretical issues, experimental design approaches, and novel analytical techniques. Most of the studies presented applications of discrete choice experiment (DCE) and best-worst scaling methods. Several presentations examined issues relating to preference heterogeneity and decision heuristics.

A personal highlight was Tabea Schmidt-Ott’s examination of the use of dominance tests to assess rational choice behaviour amongst survey respondents. She reported that such tests were included in a quarter of the health-related DCE studies published in 2015 (including many studies that had been led by IAHPR meeting attendees). Their inclusion had often been used to justify choices about which respondents to exclude from the final samples. Tabea concluded that dominance tests are a weak technique for assessing the rationality of people’s choice behaviour, as the observation of dominated choices can be explained by and accounted for in DCE models.
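For readers unfamiliar with these rationality checks, here is a simple sketch of what a dominance test amounts to: in one choice task, one option is at least as good as the other on every attribute and strictly better on at least one, so choosing the other option is a ‘dominated’ choice. The attributes and values below are hypothetical.

```python
# Sketch of a dominance test in a DCE choice task. Attribute values are
# hypothetical; assume higher is better for every attribute.

option_a = {"effectiveness": 0.9, "convenience": 0.8, "cost_score": 0.7}
option_b = {"effectiveness": 0.6, "convenience": 0.8, "cost_score": 0.5}

def dominates(a, b):
    """True if a is no worse than b on every attribute and better on at least one."""
    no_worse = all(a[k] >= b[k] for k in a)
    strictly_better = any(a[k] > b[k] for k in a)
    return no_worse and strictly_better

respondent_choice = "B"  # hypothetical response
if dominates(option_a, option_b) and respondent_choice == "B":
    print("Dominated choice made - some studies would flag or exclude this respondent.")
```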

Overall, the IAHPR meeting was enjoyable and intellectually stimulating. The standard of the presentations and discussions was high, and it was a good forum for learning about the latest advances in stated preference research. It was quite DCE-dominated, so it would have been interesting to have had some representation from researchers who are sceptical about that methodology.

The next meeting will take place in Tasmania, to be chaired by Brendan Mulhern and Richard Norman.
