Paul Mitchell’s journal round-up for 6th November 2017

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

A longitudinal study to assess the frequency and cost of antivascular endothelial therapy, and inequalities in access, in England between 2005 and 2015. BMJ Open [PubMed] Published 22nd October 2017

I am breaking one of my unwritten rules in a journal paper round-up by talking about colleagues’ work, but I feel it is too important not to provide a summary, for a number of reasons. The study highlights the problems faced by regional healthcare purchasers in England when implementing national guideline recommendations on the cost-effectiveness of new treatments. The paper focuses on anti-vascular endothelial growth factor (anti-VEGF) medicines in particular, with two drugs, ranibizumab and aflibercept, offered to patients with a range of eye conditions, costing £550-800 per injection. Another drug, bevacizumab, which is closely related to ranibizumab and performs similarly in trials, could be provided at a fraction of the cost (£50-100 per injection), but it is currently unlicensed for eye conditions in the UK. Using administrative data from Hospital Episode Statistics, the study investigates how regional areas in England have coped with providing the recommended drugs, tracking their use between 2005 and 2015 as they were recommended for a number of different eye conditions over the past decade. In 2014/15 the cost of these two new drugs for treating eye conditions alone was estimated at £447 million nationally. The distribution of where these drugs are provided is not equal, varying widely across regions after controlling for socio-demographics, suggesting an inequality of access associated with the introduction of these high-cost drugs over the past decade at a time of relatively low growth in national health spending. Although there are limitations associated with using data not intended for research purposes, the study shows how the most can be made from data routinely collected for non-research purposes. On a public policy level, it raises questions over the provision of such high-cost drugs, for which, the authors state, the NHS is currently paying more than US insurers.
Although it is important to be careful when comparing to unlicensed drugs, the authors point to clear evidence in the paper as to why their comparison is a reasonable one in this scenario, with a large opportunity cost associated with not including this option in national guidelines. If national recommendations continue to insist that such drugs be provided, clearer guidance is also required on how to disinvest from existing services at a regional level to reduce further examples of inequality in access in the future.

In search of a common currency: a comparison of seven EQ-5D-5L value sets. Health Economics [PubMed] Published 24th October 2017

For those of us out there who like a good valuation study, you will need to set aside a good chunk of time to work your way through this one. The new EQ-5D-5L measure of health status, whose primary purpose is to generate quality-adjusted life years (QALYs) for economic evaluations, now has valuation studies emerging from different countries, whereby the relative importance of each of the measure’s dimensions and levels is quantified based on general population preferences. This study offers the first comparison of value sets across seven countries: three Western European (England, the Netherlands, Spain), one North American (Canada), one South American (Uruguay), and two East Asian (Japan and South Korea). The authors aim to describe methodological differences between the seven value sets, compare the relative importance of dimensions, level decrements and scale length (i.e. quality/quantity trade-offs for QALYs), and develop a common (Western) currency across four of the value sets. In brief, there do appear to be similar trends across the three Western European countries: level decrements from level 3 to level 4 have the largest value, followed by levels 1 to 2. There is also a pattern in these three countries’ dimensions, whereby the two “symptom” dimensions (i.e. pain/discomfort, anxiety/depression) have equal importance to the three “functioning” dimensions (i.e. mobility, self-care and usual activities). There are also clear differences from the other four value sets. Canada, although it too has the largest level decrement between levels 3 and 4 (49%), unusually has equal decrements for the remainder (17% x 3). For the other three countries, greater weight is attached to the three functioning dimensions than to the two symptom dimensions. And although South Korea also has its greatest level decrement between levels 3 and 4, the greatest decrement falls between levels 4 and 5 in Uruguay and between levels 1 and 2 in Japan.
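To make the level-decrement comparisons concrete, here is a small sketch of how a step’s share of the total decrement is computed for one dimension. The decrement values below are invented for illustration and are not taken from any of the seven published value sets.

```python
# Illustrative only: hypothetical utility decrements for one dimension of a
# five-level value set. A "level decrement" is the utility lost moving one
# level down; its share of the total decrement shows where value is lost.
decrements = {
    "1->2": 0.05,
    "2->3": 0.07,
    "3->4": 0.20,
    "4->5": 0.09,
}

total = sum(decrements.values())
shares = {step: round(d / total, 2) for step, d in decrements.items()}

# With these made-up numbers, the 3->4 step accounts for ~49% of the total
# decrement for this dimension, echoing the pattern reported for Canada.
print(shares)
```

Comparing these shares across countries’ value sets is essentially what the dimension-by-dimension comparisons in the paper amount to.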
Although the authors give a number of plausible reasons as to why these differences may occur, less justification is given for the choice of the four value sets they offer as a common currency, beyond the need to have a value set for countries that do not have one already. The most similar value sets were the three Western European ones, so a Western European value set may have been more appropriate if the criterion was comparability of values across countries. If the aim was really a more international common currency, there are issues with the exclusion of non-Western countries’ value sets from their common currency version: surely differences across cultures should be reflected in a common currency if they are apparent in different cultures and settings. A common currency should also have a better geographical spread; no country from Africa, the Middle East, or Central or South Asia is represented in this study, nor any lower- or middle-income countries, though this final criticism reflects current data availability rather than anything within the authors’ control.

Quantifying the relationship between capability and health in older people: can’t map, won’t map. Medical Decision Making [PubMed] Published 23rd October 2017

The EQ-5D is one of many ways quality of life can be measured within economic evaluations. A more recent approach, based on Amartya Sen’s capability approach, has attempted to develop outcome measures that move beyond the health-related aspects of quality of life captured by EQ-5D and similar measures used in the generation of QALYs. This study examines the relationship between the EQ-5D and the ICECAP-O capability measure in three different patient populations included in the Medical Crises in Older People programme in England. The authors propose a reasonable hypothesis that health could be considered a conversion factor for a person’s broader capability set, so it is plausible to test how well the EQ-5D-3L dimension values and overall score map onto the ICECAP-O overall score. Across the numerous regressions performed, the strongest relationship between the two measures in this sample was an R-squared of 0.35. Interestingly, the EQ-5D dimensions that had a significant relationship with the ICECAP-O score were a mix of those focused on functioning (i.e. self-care, usual activities) and symptoms (anxiety/depression), so overall capability on ICECAP-O appears to be related, at least to a small degree, to both of the health components of EQ-5D discussed in this round-up’s previous paper. The authors suggest this provides further evidence of the complementary data provided by EQ-5D and ICECAP-O, but the causal relationship between the two measures, as the authors acknowledge, remains under-researched. Longitudinal data analysis would provide a more definitive answer to the question of how much interaction there is between the two measures and their dimensions as health and capability change over time in response to different treatments and care provision.
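For readers unfamiliar with what a “mapping” exercise involves, here is a rough sketch of the mechanics: regress a capability-style score on a health-index score and report the R-squared. The data are simulated, not the MCOP data, and the coefficients are invented; the point is only that a low R-squared signals that health explains little of the variation in capability.

```python
import numpy as np

# Simulated stand-ins for an EQ-5D index and an ICECAP-O capability score.
rng = np.random.default_rng(42)
n = 500

eq5d = rng.uniform(0.2, 1.0, n)                          # health index scores
capability = 0.4 + 0.3 * eq5d + rng.normal(0, 0.15, n)   # noisy capability scores

# Ordinary least squares: intercept + EQ-5D score.
X = np.column_stack([np.ones(n), eq5d])
beta, *_ = np.linalg.lstsq(X, capability, rcond=None)

fitted = X @ beta
ss_res = np.sum((capability - fitted) ** 2)
ss_tot = np.sum((capability - capability.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# A modest R-squared (the paper's best was 0.35) is the "can't map" message:
# much of the variation in capability is not explained by health alone.
print(round(r_squared, 2))
```

The paper’s models are richer (individual dimension scores, multiple populations), but each is a variation on this basic regression.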

Chris Sampson’s journal round-up for 11th September 2017

Core items for a standardized resource use measure (ISRUM): expert Delphi consensus survey. Value in Health Published 1st September 2017

Trial-based collection of resource use data, for the purpose of economic evaluation, is wild. Lots of studies use bespoke questionnaires. Some use off-the-shelf measures, but many of these are altered to suit the context. Validity rarely gets a mention. Some of you may already be aware of this research; I’m sure I’m not the only one here who participated. The aim of the study is to establish a core set of resource use items that should be included in all studies to aid comparability, consistency and validity. The researchers identified a long list of 60 candidate items for inclusion through a review of 59 resource use instruments. An NHS and personal social services perspective was adopted, and any similar items were merged. This list was constructed into a Delphi survey. Members of the HESG mailing list – as well as 111 other identified experts – were invited to complete the survey, which ran over two rounds. The first round asked participants to rate the importance of including each item in the core set, using a scale from 1 (not important) to 9 (very important). Participants were then asked to select their ‘top 10’. Items survived round 1 if more than 50% of respondents scored them at least 7 and no more than 15% scored them less than 3, either overall or within two or more participant subgroups. In round 2, participants were presented with the results of round 1 and asked to re-rate the 34 remaining items. There were 45 usable responses in round 1 and 42 in round 2. Comments could also be provided, which were subsequently subject to content analysis. After all was said and done, a meeting was held for final item selection based on the findings, to which some survey participants were invited but only one attended (sorry I couldn’t make it).
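For concreteness, the round 1 survival rule can be sketched as a small function. This is a simplified reading applied to the overall sample only (the paper also lets items through via two or more participant subgroups), and the item names and ratings below are invented.

```python
# Simplified sketch of the Delphi round 1 survival rule described above.
# Ratings are on the 1 (not important) to 9 (very important) scale.
def survives_round_1(ratings):
    """Survive if >50% rate the item at least 7 and <=15% rate it below 3."""
    n = len(ratings)
    high_share = sum(r >= 7 for r in ratings) / n
    low_share = sum(r < 3 for r in ratings) / n
    return high_share > 0.5 and low_share <= 0.15

# Hypothetical ratings for two candidate items from 10 respondents.
hospital_admissions = [9, 8, 7, 9, 7, 8, 6, 9, 7, 5]  # broad agreement
over_counter_meds = [7, 2, 8, 1, 4, 7, 2, 9, 1, 6]    # polarised

print(survives_round_1(hospital_admissions))  # True: 8/10 rate >= 7, none < 3
print(survives_round_1(over_counter_meds))    # False: 4/10 rate >= 7, 4/10 < 3
```

A rule like this, run per item and per subgroup, is what whittled the 60 candidates down to the 34 carried into round 2.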
The final 10 items were: i) hospital admissions, ii) length of stay, iii) outpatient appointments, iv) A&E visits, v) A&E admissions, vi) number of appointments in the community, vii) type of appointments in the community, viii) number of home visits, ix) type of home visits and x) name of medication. The measure isn’t ready to use just yet. There is still research to be conducted to identify the ideal wording for each item. But it looks promising. Hopefully, this work will trigger a whole stream of research to develop bolt-ons in specific contexts for a modular system of resource use measurement. I also think that this work should form the basis of alignment between costing and resource use measurement. Resource use is often collected in a way that is very difficult to ‘map’ onto costs or prices. I’m sure the good folk at the PSSRU are paying attention to this work, and I hope they might help us all out by estimating unit costs for each of the core items (as well as any bolt-ons, once they’re developed). There’s some interesting discussion in the paper about the parallels between this work and the development of core outcome sets. Maybe analysis of resource use can be as interesting as the analysis of quality of life outcomes.

A call for open-source cost-effectiveness analysis. Annals of Internal Medicine [PubMed] Published 29th August 2017

Yes, this paper is behind a paywall. Yes, it is worth pointing out this irony over and over again until we all start practising what we preach. We’re all guilty; we all need to keep on keeping on at each other. Now, on to the content. The authors argue in favour of making cost-effectiveness analysis (and model-based economic evaluation in particular) open to scrutiny. The key argument is that there is value in transparency, and analogies are drawn with clinical trial reporting and epidemiological studies. This potential additional value is thought to derive from i) easy updating of models with new data and ii) less duplication of efforts. The main challenges are thought to be the need for new infrastructure – technical and regulatory – and preservation of intellectual property. Recently, I discussed similar issues in a call for a model registry. I’m clearly in favour of cost-effectiveness analyses being ‘open source’. My only gripe is that the authors aren’t the first to suggest this, and should have done some homework before publishing this call. Nevertheless, it is good to see this issue being raised in a journal such as Annals of Internal Medicine, which could be an indication that the tide is turning.

Differential item functioning in quality of life measurement: an analysis using anchoring vignettes. Social Science & Medicine [PubMed] [RePEc] Published 26th August 2017

Differential item functioning (DIF) occurs when different groups of people have different interpretations of response categories. For example, in response to an EQ-5D questionnaire, the way that two groups of people understand ‘slight problems in walking about’ might not be the same. If that were the case, the groups wouldn’t be truly comparable. That’s a big problem for resource allocation decisions, which rely on trade-offs between different groups of people. This study uses anchoring vignettes to test for DIF, whereby respondents are asked to rate their own health alongside some health descriptions for hypothetical individuals. The researchers conducted two online surveys, which together recruited a representative sample of 4,300 Australians. Respondents completed the EQ-5D-5L, some vignettes, some other health outcome measures and a bunch of sociodemographic questions. The analysis uses an ordered probit model to predict responses to the EQ-5D dimensions, with the vignettes used to identify the model’s thresholds. This is estimated for each dimension of the EQ-5D-5L, in the hope that the model can produce coefficients that facilitate ‘correction’ for DIF. But this isn’t a guaranteed approach to identifying the effect of DIF. Two important assumptions are inherent: first, that individuals rate the hypothetical vignette states on the same latent scale as they rate their own health (AKA response consistency) and, second, that everyone values the vignettes on an equivalent latent scale (AKA vignette equivalence). Only if these assumptions hold can anchoring vignettes be used to adjust for DIF and make different groups comparable. The researchers dedicate a lot of effort to testing these assumptions. To test response consistency, separate (condition-specific) measures are used to assess each domain of the EQ-5D. The findings suggest that responses are consistent.
Vignette equivalence is assessed by the significance of individual characteristics in determining vignette values. In this study, the vignette equivalence assumption didn’t hold, which prevents the authors from making generalisable conclusions. However, the researchers looked at whether the assumptions were satisfied in particular age groups. For 55-65 year olds (n=914), they did hold for all dimensions except anxiety/depression. That might be because older people are better at understanding health problems, having had more experience of them. So the authors can tell us about DIF in this older group. Having corrected for DIF, the mean health state value in this group increases from 0.729 to 0.806. Various characteristics explain the heterogeneous response behaviour. After correcting for DIF, the difference in EQ-5D index values between high and low education groups increased from 0.049 to 0.095. The difference between employed and unemployed respondents increased from 0.077 to 0.256. In some cases, the rankings changed. The difference between those divorced or widowed and those never married increased from -0.028 to 0.060. The findings hint at a trade-off between giving personalised vignettes to facilitate response consistency and generalisable vignettes to facilitate vignette equivalence. It may be that DIF can only be assessed within particular groups (such as the older sample in this study). But then, if that’s the case, what hope is there for correcting DIF in high-level resource allocation decisions? Clearly, DIF in the EQ-5D could be a big problem. Accounting for it could flip resource allocation decisions. But this study shows that there isn’t an easy answer.
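A toy simulation makes the DIF problem concrete: two groups with identical latent health, but different response thresholds, report different EQ-5D-style levels. The thresholds below are invented; in the paper, it is the anchoring vignettes that allow thresholds like these to be estimated within the ordered probit model.

```python
from bisect import bisect

# Map latent health (higher = worse) to a 1-5 response level using a group's
# thresholds: the reported level is one plus the number of thresholds crossed.
def report_level(latent, thresholds):
    return bisect(thresholds, latent) + 1

latent_health = 1.2  # identical underlying severity for both groups

# Invented thresholds: "lenient" raters need a worse state before moving up
# a level than "strict" raters do.
lenient_thresholds = [0.5, 1.5, 2.5, 3.5]
strict_thresholds = [0.2, 0.8, 1.6, 2.8]

print(report_level(latent_health, lenient_thresholds))  # 2: "slight problems"
print(report_level(latent_health, strict_thresholds))   # 3: "moderate problems"
```

Same health, different answers: that gap is exactly what DIF correction tries to remove before group comparisons are made.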

How to design the cost-effectiveness appraisal process of new healthcare technologies to maximise population health: a conceptual framework. Health Economics [PubMed] Published 22nd August 2017

The starting point for this paper is that, when it comes to reimbursement decisions, the more time and money spent on the appraisal process, the more precise the cost-effectiveness estimates are likely to be. So the question is: how many resources should be committed to the appraisal process? The authors set up a framework in which to consider a variety of alternatively defined appraisal processes, asking how these might maximise population health and which factors are the key drivers. The appraisal process is conceptualised as a diagnostic tool that identifies which technologies are cost-effective (true positives) and which aren’t (true negatives). The framework builds on the fact that manufacturers can present a claimed ICER that makes their technology more attractive, but the true ICER can never be known with certainty. As with a diagnostic test, there are four possible outcomes: true positive, false positive, true negative, or false negative. Each outcome is associated with an expected payoff in terms of population health and producer surplus. Payoffs depend on the accuracy of the appraisal process (sensitivity and specificity), incremental net benefit per patient, disease incidence, the time of relevance for an approval, the cost of the process and the price of the technology. The accuracy of the process can be affected by altering the time and resources dedicated to it or by adjusting the definition of cost-effectiveness in terms of the acceptable level of uncertainty around the ICER. So, what determines an optimal level of accuracy in the appraisal process, assuming that producers’ price setting is exogenous? Generally, the process should have greater sensitivity (at the expense of specificity) when there is more to gain: when a greater proportion of technologies are cost-effective or when the population or time of relevance is greater. There is no fixed optimum for all situations.
If we relax the assumption of exogenous pricing decisions, and allow pricing to be partly determined by the appraisal process, we can see that a more accurate process incentivises cost-effective price setting. The authors also consider the possibility of there being multiple stages of appraisal, with appeals, re-submissions and price agreements. The take-home message is that the appraisal process should be re-defined over time and with respect to the range of technologies being assessed, or even individualised for each technology in each setting. At the least, it seems clear that technologies with exceptional characteristics (with respect to their potential impact on population health) should be given a bespoke appraisal. NICE is already onto these ideas – they recently introduced a fast track process for technologies with a claimed ICER below £10,000 and now give extra attention to technologies with major budget impact.
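The diagnostic-test framing can be sketched with some stylised arithmetic. All of the payoffs and probabilities below are invented, and the function is my simplification rather than the paper’s model, but it shows why sensitivity becomes more valuable when a larger share of submitted technologies is truly cost-effective.

```python
# Stylised expected-payoff calculation for an appraisal process treated as a
# diagnostic test. Payoffs are invented net population-health values per
# decision; p_cost_effective is the share of submissions that are truly
# cost-effective.
def expected_payoff(sensitivity, specificity, p_cost_effective,
                    payoff_tp, payoff_fp, payoff_tn, payoff_fn):
    p = p_cost_effective
    return (p * sensitivity * payoff_tp                 # approve good tech
            + (1 - p) * (1 - specificity) * payoff_fp   # approve bad tech
            + (1 - p) * specificity * payoff_tn         # reject bad tech
            + p * (1 - sensitivity) * payoff_fn)        # reject good tech

# When 60% of submissions are cost-effective, a process that trades
# specificity for sensitivity does better under these assumed numbers.
lenient = expected_payoff(0.95, 0.70, 0.6, payoff_tp=100, payoff_fp=-80,
                          payoff_tn=0, payoff_fn=-100)
strict = expected_payoff(0.70, 0.95, 0.6, payoff_tp=100, payoff_fp=-80,
                         payoff_tn=0, payoff_fn=-100)
print(lenient > strict)  # True under these assumed numbers
```

Lowering `p_cost_effective` (or raising the harm from false positives) flips the comparison, which is the paper’s point that there is no fixed optimum.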

Lazaros Andronis’s journal round-up for 4th September 2017

The effect of spending cuts on teen pregnancy. Journal of Health Economics [PubMed] Published July 2017

High teenage pregnancy rates are an important concern that features prominently on many countries’ social policy agendas. In the UK, a country with one of the highest teen pregnancy rates in the world, efforts to tackle the issue have been spearheaded by the Teenage Pregnancy Strategy, an initiative aiming to halve under-18 pregnancy rates by offering access to sex education and contraception. However, recent spending cuts have led to reductions in grants to local authorities, many of which have, in turn, limited or cut a number of teenage pregnancy-related programmes. This has led to vocal opposition from politicians and organisations, who argue that cuts jeopardise the reductions in teenage pregnancy rates seen in previous years. In this paper, Paton and Wright set out to examine whether this is the case; that is, whether cuts to Teenage Pregnancy Strategy-related services have had an impact on teenage pregnancy rates. To do so, the authors used panel data from 149 local authorities in England collected between 2009 and 2014. To capture changes in teenage pregnancy rates across local authorities over the specified period, the authors used a fixed effects model in which under-18 conception rates are a function of annual expenditure on teenage pregnancy services per 13-17-year-old female in the local authority, and a set of other socioeconomic variables acting as controls. Area and year dummies were also included in the model to account for unobservable effects relating to particular years and localities, and a number of additional analyses were run to guard against spurious correlations between expenditure and pregnancy rates. Overall, findings showed that areas which implemented bigger cuts to teenage pregnancy-targeting programmes have, on average, seen larger drops in teenage pregnancy rates. However, these drops are, in absolute terms, small (e.g. a 10% reduction in expenditure is associated with a 0.25% decrease in teenage conception rates).
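For readers unfamiliar with the setup, a fixed-effects regression of this kind can be sketched on simulated data. Everything below is invented (a tiny 20-area, 5-year panel rather than the paper’s 149 authorities over 2009-2014, and a made-up “true” spending effect); the point is the mechanics of absorbing area and year effects with dummies.

```python
import numpy as np

# Simulated panel: conception rates as a function of expenditure plus
# area-specific effects and noise.
rng = np.random.default_rng(0)
n_areas, n_years = 20, 5

area = np.repeat(np.arange(n_areas), n_years)
year = np.tile(np.arange(n_years), n_areas)
spend = rng.uniform(10, 100, n_areas * n_years)

area_effect = rng.normal(0, 2, n_areas)
rate = 30 - 0.05 * spend + area_effect[area] + rng.normal(0, 1, len(spend))

# Design matrix: intercept, expenditure, and area/year dummies (one category
# of each dropped to avoid collinearity with the intercept).
X = np.column_stack(
    [np.ones(len(spend)), spend]
    + [(area == a).astype(float) for a in range(1, n_areas)]
    + [(year == t).astype(float) for t in range(1, n_years)]
)
beta, *_ = np.linalg.lstsq(X, rate, rcond=None)

# The coefficient on expenditure should recover something near the simulated
# "true" value of -0.05, net of the area and year effects.
print(round(beta[1], 3))
```

The paper’s model adds socioeconomic controls and robustness checks on the same skeleton.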
Various explanations can be put forward to interpret these findings, one of which is that cuts might have trimmed off superfluous or underperforming elements of the programme. If this is the case, Paton and Wright’s findings offer some support to arguments that spending cuts may not always be bad for the public.

Young adults’ experiences of neighbourhood smoking-related norms and practices: a qualitative study exploring place-based social inequalities in smoking. Social Science & Medicine [PubMed] Published September 2017

Smoking is a universal problem affecting millions of people around the world, and Canada’s young adults are no exception. As in most countries, smoking prevalence and initiation are highest amongst young groups, which is bad news, as many people who start smoking at a young age continue to smoke throughout adulthood. Evidence suggests that there is a strong socioeconomic gradient in smoking, which can be seen in the fact that smoking prevalence is unequally distributed according to education and neighbourhood-level deprivation, being a greater problem in more deprived areas. This offers an opportunity for local-level interventions, which may be more effective than national strategies. To come up with such interventions, though, policy makers need to understand how neighbourhoods might shape, encourage or tolerate certain attitudes towards smoking. To understand this, Glenn and colleagues saw smoking as a practice that is closely related to local smoking norms and social structures, and sought to get young adult smokers’ views on how their neighbourhood affects their attitudes towards smoking. Within this context, the authors carried out a number of focus groups with young adult smokers who lived in four different neighbourhoods, during which they asked questions such as “do you think your neighbourhood might be encouraging or discouraging people to smoke?” Findings showed that some social norms, attitudes and practices were common among neighbourhoods of the same SES. Participants from low-SES neighbourhoods reported more tolerant and permissive local smoking norms, whereas in more affluent neighbourhoods, participants felt that smoking was more contained and regulated. While young smokers from high-SES neighbourhoods expressed some degree of alignment and agency with local smoking norms and practices, smokers in low-SES neighbourhoods described smoking as inevitable.
Of interest is how individuals living in different SES areas saw anti-smoking regulations: while young smokers in affluent areas advocated social responsibility (and downplayed the role of regulations), their counterparts in poorer areas called for more protection and spoke in favour of greater government intervention and smoking restrictions. Glenn and colleagues’ findings serve to highlight the importance of context in designing public health measures, especially when such measures affect different groups in entirely different ways.

Cigarette taxes, smoking—and exercise? Health Economics [PubMed] Published August 2017

Evidence suggests that rises in cigarette taxes have a positive effect on smoking reduction and/or cessation. However, it is also plausible that the effect of tax hikes extends beyond smoking, to decisions about exercise. To explore whether this proposition is supported by empirical evidence, Conway and Niles put together a simple conceptual framework, which assumes that individuals aim to maximise the utility they get from exercise, smoking, health (or weight management) and other goods, subject to market inputs (e.g. medical care, diet aids) and time and budget constraints. Much of the data for this analysis came from the Behavioral Risk Factor Surveillance System (BRFSS) in the US, which includes survey participants’ demographic characteristics (age, gender), as well as answers to questions about physical activities and exercise (e.g. intensity and time per week spent on activities) and smoking behaviour (e.g. current smoking status, number of cigarettes smoked per day). Survey data were subsequently combined with changes in cigarette taxes and other state-level variables. Conway and Niles’s results suggest that increased cigarette costs reduce both smoking and exercise, with the decline in exercise being more pronounced among heavy and regular smokers. However, the direction of the effect varied according to one’s age and smoking experience (e.g. higher cigarette costs increased physical activity among recent quitters), which highlights the need for caution in drawing conclusions about the exact mechanism that underpins this relationship. Encouraging smoking cessation and promoting physical exercise are important and desirable public health objectives but, as Conway and Niles’s findings suggest, pursuing both at the same time may not always be feasible.
