Chris Sampson’s journal round-up for 27th January 2020

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

A general framework for classifying costing methods for economic evaluation of health care. The European Journal of Health Economics [PubMed] Published 20th January 2020

When it comes to health state valuation and quality of life, I’m always very concerned about the use of precise terminology, and it bugs me when people get things wrong. But when it comes to costing methods, I’m pretty shoddy. So I’m pleased to see this very useful paper, which should help us all to gain some clarity in our reporting of costing studies.

The authors start out by clearly distinguishing between micro-costing and gross-costing in the identification of costs and between top-down and bottom-up valuation of these costs. I’m ashamed to say that I had never properly grasped the four distinct approaches that can be adopted based on these classifications, but the authors make it quite clear. Micro-costing means detailed identification of cost components, while gross-costing considers resource use in aggregate. Top-down methods use expenditure data collected at the organisational level, while bottom-up approaches use patient-level data.
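To keep the taxonomy straight, the four combinations can be laid out as a simple lookup. This is my own illustrative encoding of the 2×2 classification described above, not something taken from the paper:

```python
# A sketch of the 2x2 costing taxonomy: one axis for identification
# (micro-costing vs gross-costing), one for valuation (bottom-up
# patient-level data vs top-down organisational expenditure data).

COSTING_APPROACHES = {
    ("micro", "bottom-up"): "Detailed cost components, valued from patient-level data",
    ("micro", "top-down"): "Detailed cost components, valued from organisational expenditure",
    ("gross", "bottom-up"): "Aggregate resource use, valued from patient-level data",
    ("gross", "top-down"): "Aggregate resource use, valued from organisational expenditure",
}

def classify_costing(identification: str, valuation: str) -> str:
    """Return a description of the costing approach for a study design."""
    return COSTING_APPROACHES[(identification, valuation)]

print(classify_costing("micro", "bottom-up"))
```

Nothing more than a mnemonic, but it makes plain that the two distinctions are orthogonal, which is exactly the clarity the authors are after.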

A key problem is that our language – as health economists – is in several respects at odds with the language used by management accountants. It's the accountants who usually prepare the cost information that we might use in analyses, and these data are not normally prepared for the types of analysis that we wish to conduct, so a lot can go awry. Perhaps most important is that financial accounting is not concerned with opportunity costs. The authors provide a kind of glossary of terms that can support translation between the two contexts, as well as a set of examples of the ways in which the two contexts differ. They also point out that accounting practices differ between countries, and that these differences might necessitate adjustments to costing methods for economic evaluation.

The study includes a narrative review of costing studies in order to demonstrate the sorts of errors in terminology that can arise and the lack of clarity that results. The studies included in the review provide examples of the different approaches to costing, though no study is identified as ‘bottom-up gross-costing’. One of the most useful contributions of the paper is to provide two methodological checklists, one for top-down and one for bottom-up costing studies. If you’re performing, reviewing, or in any way making use of costing studies, this will be a handy reference.

Health state values of deaf British Sign Language (BSL) users in the UK: an application of the BSL version of the EQ-5D-5L. Applied Health Economics and Health Policy [PubMed] Published 16th January 2020

The BSL translation of the EQ-5D is like no other. It is to be used – almost exclusively – by people who have a specific functional health impairment. For me, this raises questions about whether or not we can actually consider it simply a translation of the EQ-5D and compare values with other translations in the way we would any other language. This study uses data collected during the initial development and validation of the EQ-5D-5L BSL translation. The authors compared health state utility values from Deaf people (BSL users) with a general population sample from the Health Survey for England.

As we might expect, the Deaf sample reported a lower mean utility score (0.78) than the general population (0.84). Several other health measures were used in the BSL study. A staggering 43% of the Deaf participants had depression, and a lot of the analysis in the paper is directed towards comparing the groups with and without psychological distress. The authors conduct some simple regression analyses to explore the determinants of health state utility values in the Deaf population, with long-standing physical illness having the biggest impact.

I had hoped that the study might be able to tell us a bit more about the usefulness of the BSL version of the EQ-5D-5L, because the EQ-5D has previously been shown to be insensitive to hearing problems. The small sample (<100) can’t tell us a great deal on its own, so it’s a shame that there isn’t some attempt at matching with individuals from the Health Survey for England for the sake of comparison. Using the crosswalk from the EQ-5D-3L to obtain 5L values is also a problem, as it limits the responsiveness of index values. Nevertheless, it’s good to see data relating to this under-represented population.

A welfare-theoretic model consistent with the practice of cost-effectiveness analysis and its implications. Journal of Health Economics [PubMed] Published 11th January 2020

There are plenty of good reasons to deviate from a traditional welfarist approach to cost-benefit analysis in the context of health care, as health economists have debated for decades. But it is nevertheless important to understand the ways in which cost-effectiveness analysis, as we conduct it, deviates from welfarism, and to aim for some kind of consistency in our handling of different issues. This paper attempts to draw together disparate subjects of discussion on the theoretical basis for aspects of cost-effectiveness analysis. The author focuses on issues relating to the inclusion of future (unrelated) costs, to discounting, and to consistency with welfarism, in the conduct of cost-per-QALY analyses. The implications are given consideration with respect to adopting a societal perspective, recognising multiple budget holders, and accounting for distributional impacts.

All of this is based on the description of an intertemporal utility model and a model of medical care investment. The model hinges especially on how we understand consumption to be affected by our ambition to maximise QALYs. For instance, the author argues that, once we consider time preferences in an overall utility function, we don’t need to worry about differential discounting in health and consumption. The various implications of the model are compared to the recommendations of the Second Panel on Cost-Effectiveness in Health and Medicine. In general, the model supports the recommendations of the Panel, where others have been critical. As such, it sets out some of the theoretical basis for those recommendations. It also implies other recommendations, not considered by the Panel. For example, the optimal cost-effectiveness threshold is likely to be higher than GDP per capita.

It’s difficult to judge the validity of the framework from a first read. The paper is dense with theoretical exposition. My first instinct is ‘so what’. One of the great things about the practice of cost-effectiveness analysis in health care is that it isn’t constrained by restrictive theoretical frameworks, and so the very idea of a kind of unified theoretical framework is a bit worrying to me. But my second thought is that this is a valuable paper, as it attempts to gather up several loose threads. Whether or not these can be gathered up within a welfarist framework is debatable, but the exercise is revealing. I suspect this paper will help to trigger further inquiry, which can only be a good thing.

Registered reports: time to radically rethink peer review in health economics. PharmacoEconomics – Open [PubMed] Published 23rd January 2020

As a discipline, health economics isn't great when it comes to publication practices. We excel in neither the open access culture of medical sciences nor the discussion paper culture of economics proper. In this article, the authors express concern about publication bias, and about the fact that health economics journals – and health economists in general – aren't doing much to combat it. In fairness to the discipline, there isn't really any evidence that publication bias abounds. But that isn't really the point. We should be able to prove and ensure that it doesn't if we want our research to be seen as credible.

One (partial) solution to publication bias is the adoption – by journals – of registered reports. Under such a system, researchers would submit study protocols to journals for peer review. If the journal were satisfied with the methods then they could guarantee to publish the study once the results are in, regardless of how sexy the results may or may not be. The authors of this paper identify the prevalence of studies in major health economics journals that could benefit from registered reports. These would be prospectively designed experimental or quasi-experimental studies. It seems that there are plenty.

I’ve used this blog in the past to propose more transparent research practices and to complain about publication practices in health economics generally, while others have complained about the use of p-values in our discipline. The adoption of registered reports is one tactic that could bring improvements and I hope it will be given proper consideration by those in a position to enact change.

Credits

Thesis Thursday: Koh Jun Ong

On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Koh Jun Ong, who has a PhD from the University of Groningen. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.

Title
Economic aspects of public health programmes for infectious disease control: studies on human immunodeficiency virus & human papillomavirus
Supervisors
Maarten Postma, Mark Jit
Repository link
http://hdl.handle.net/11370/0edbcfae-2a0c-4103-9722-fb8086d75cff

Which public health programmes did you consider in your research?

Three public health programmes were considered in the thesis: 1) HIV Pre-Exposure Prophylaxis (PrEP), 2) Human Papillomavirus (HPV) vaccination, and 3) HIV screening to reduce undiagnosed infections in the population.

The first two of the three involved primary infectious disease prevention among men who have sex with men (MSM), and both of these programmes were to be delivered via sexual health clinics in England (commonly known as genitourinary medicine, GUM, clinics).

The third public health infectious disease control programme involved secondary prevention of onward HIV transmission in the general population by encouraging routine HIV screening to reduce undiagnosed HIV, with a view to earlier diagnosis leading to antiretroviral treatment initiation, since viral suppression stops onward transmission.

Was it necessary to develop complex mathematical models?

It depends on the policy research question. A dynamic model, which captures the ecological externality that vaccination provides by reducing transmission to non-vaccinees, was used for the HPV vaccination question. This was appropriate because the programme would likely reach a high proportion of MSM who attend GUM clinics in England, so the knock-on impact on disease transmission in the wider population was likely to be substantial.

The policy research question was different for PrEP, and a static model was more suitable, since the objective was to advise NHS England on whether and how such a programme – with relatively small numbers of patients over an initial time-limited period – might represent value for money in England. We first considered a public health control programme, with promising new efficacy data from the 500-person PrEP pilot study (the UK-based PROUD trial) and additional information from per-protocol participants in the earlier iPrEx study. The initial consideration was to maintain the preventative effect of a drug that needs to be taken on a daily basis (compared with near one-off HPV vaccination – three doses in total delivered within a year). Regular monitoring of STIs and of patients’ renal function meant there were clinical service capacity issues to consider, which were likely to limit access initially. Thus, a static model that did not take transmission into account was used.

However, dynamic modelling would be useful to inform policy decisions as PrEP usage expands. Firstly, because it would then be important to capture the indirect effect on infection transmission. Secondly, because when the force of infection begins to fall as incidence declines, dynamic modelling will inform future delivery of a programme that maintains its value. These represent important areas for future research.

Finally, the model designed for the research question on HIV screening was quite straightforward, as its aim was primarily to advise local commissioners on the financial implications of offering routine screening in their local area, which depend on local clinical resources and local disease prevalence.

Did you draw any important conclusions from your literature reviews?

Two literature reviews were conducted: 1) a review of economic parameters, i.e. cost and utility estimates, for HPV-related outcomes, and 2) a review of published MSM HPV vaccination economic evaluations.

In relation to the first review, most economic models of HPV-related interventions selected economic parameters in a pretty ad hoc way, without reviewing the entirety of the literature. We found substantial variations in cost and utility estimates for all diseases considered in our systematic review, wherever there was more than one publication. These variations could result from differences in cancer site, disease stage, study population, treatment pathway/setting, treatment country, and the utility elicitation methods used. It would be important for future models to be transparent about parameter sources and assumptions, and to recognise that, as disease management changes over time, cost and utility estimates will need updating accordingly. These considerations matter when applying such estimates in future economic evaluations, to ensure that assumptions are up-to-date and closely reflect the case mix of patients being evaluated.

In relation to the second review, despite the small number of models and their differing approaches and assumptions, a general theme emerges: modelled outcomes are most sensitive to assumptions around vaccine efficacy and price. Future studies could consider synchronising parameter assumptions to test the outputs generated by different models.

What can your research tell us about the ‘cost-effective but unaffordable’ paradox?

A key finding and concluding remark of this thesis was that “findings around cost-effectiveness should not be considered independently of budget impact and affordability considerations, as the two are interlinked”. Ultimately, cost-effectiveness is linked to the budget and, in an ideal world, a cost-effectiveness threshold should correspond to the opportunity cost of replacing the least cost-effective care at the margin of the whole healthcare budget. This willingness-to-pay threshold should be linked to the amount of budgetary resources an intervention displaces. After all, the concept of opportunity cost in a fixed-budget setting means that a decision to invest in something translates to funding being displaced elsewhere.

Since most health economies do not have unlimited resources, even an intervention that offers a high return, and is therefore worthwhile from a value for money perspective, cannot always be afforded. With a limited budget, funding an expensive new intervention may mean moving funding away from existing services, which may be more cost-effective than the new intervention. The services from which funds are moved will lose out, and this may leave society worse off.

A simple analogy: buying a property that guarantees a return over a defined period may be worthwhile, but if I cannot afford it in the first place, is it really an option?

This was clearly demonstrated in the PrEP example, where, despite its potential to be cost-effective, the intervention’s high list price carried with it a very high budget impact. The population that would need to be given PrEP to achieve substantial public health benefits is large, which meant that a public health programme could pose an affordability challenge to the national health care system.

Based on your findings, how might HIV and HPV prevention strategies be made more cost-effective?

Two strategies could influence cost-effectiveness: optimising the population covered and using an appropriate comparator price.

The most obvious way to improve cost-effectiveness is to optimise the population covered. For example, we know that HIV risk, as measured by HIV incidence, is higher among GUM-attending MSM. Therefore, delivering a PrEP programme to this population (at least in the initial phase until the intervention becomes more affordable) will likely result in a higher number of new HIV infections prevented. Similarly, HIV screening offered to areas with high local prevalence would likely give a higher number of new diagnoses.

The other important factor to consider around cost-effectiveness is the comparator price on which the technology appraisal is based. In the chapter on estimating HIV care costs in England, we demonstrated that, with the imminent availability of generic antiretrovirals, the lifetime care cost for a person living with HIV will fall substantially. This reduced cost, representing the cost of care under the existing intervention, should be used as the comparator for newer HIV interventions, as it represents what society would pay in their absence; price expectations for new interventions should fall correspondingly, to ensure that cost-effectiveness is maintained.

How did you find the experience of completing your thesis by publication?

It was brilliant! I must acknowledge all the contributions from my supervisors and co-authors in making this possible and for the very positive experience of the process. A major advantage of doing a PhD by publication is that the work was regularly peer-reviewed, providing an extra check on the robustness of the analyses. Another advantage is that the work is out for public consumption almost immediately, making the science available for other researchers to consider and to move to the next stage.

Chris Sampson’s journal round-up for 13th January 2020


A vision ‘bolt-on’ increases the responsiveness of EQ-5D: preliminary evidence from a study of cataract surgery. The European Journal of Health Economics [PubMed] Published 4th January 2020

The EQ-5D is insensitive to differences in how well people can see, despite this seeming to be an important aspect of health. In contexts where the impact of visual impairment may be important, we could potentially use a ‘bolt-on’ item that asks about a person’s vision. I’m working on the development of a vision bolt-on at the moment. But ours won’t be the first. A previously-developed bolt-on has undergone some testing and has been shown to be sensitive to differences between people with different levels of visual function. However, there is little or no evidence to support its responsiveness to changes in visual function, which might arise from treatment.

For this study, 63 individuals were recruited prior to receiving cataract surgery in Singapore. Participants completed the EQ-5D-3L and EQ-5D-5L, both with and without a vision bolt-on, which matched the wording of the other EQ-5D dimensions. Additionally, the SF-6D, HUI3, and VF-12 were completed, along with a LogMAR assessment of visual acuity. The authors sought to compare the responsiveness of the EQ-5D with a vision bolt-on against that of the standard EQ-5D and the other measures, so all measures were completed before and after cataract surgery. Preference weights can be generated for the EQ-5D-3L with a vision bolt-on, but not for the EQ-5D-5L, so the authors looked at rescaled sum scores to compare across all measures. Responsiveness was assessed using indicators such as the standardised effect size and the standardised response mean.
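These responsiveness indices have standard definitions: the standardised effect size divides the mean change by the standard deviation of baseline scores, while the standardised response mean divides it by the standard deviation of the change scores. A minimal sketch with made-up numbers (my own illustration, not the paper's data):

```python
import statistics

def standardised_effect_size(baseline, follow_up):
    """Mean change divided by the SD of the baseline scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return statistics.mean(changes) / statistics.stdev(baseline)

def standardised_response_mean(baseline, follow_up):
    """Mean change divided by the SD of the change scores."""
    changes = [f - b for b, f in zip(baseline, follow_up)]
    return statistics.mean(changes) / statistics.stdev(changes)

# Fabricated utility scores before and after surgery, for illustration only:
pre = [0.62, 0.70, 0.55, 0.68, 0.60]
post = [0.75, 0.78, 0.70, 0.74, 0.72]
print(standardised_effect_size(pre, post))
print(standardised_response_mean(pre, post))
```

The two can disagree: an instrument whose changes are consistent across patients will score well on the standardised response mean even if the changes are small relative to baseline variation.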

Visual acuity changed dramatically from before to after surgery for almost everybody. The authors found that the vision bolt-on does seem to make the measure considerably more responsive to this change. For instance, the mean change in the EQ-5D-3L index score was 0.018 without the vision bolt-on and 0.031 with it. The HUI3 came out with a mean change of 0.105 and showed the highest responsiveness across all analyses.

Does this mean that we should all be using a vision bolt-on, or perhaps the HUI3? Not exactly. Something I see a lot in papers of this sort – including in this one – is the framing of a “superior responsiveness” as an indication that the measure is doing a better job. That isn’t true if the measure is responding to things to which we don’t want it to respond. As the authors point out, the HUI3 has quite different foundations to the EQ-5D. We also don’t want a situation where analysts can pick and choose measures according to whichever is most responsive to the thing to which they want it to be most responsive. In EuroQol parlance, what goes into the descriptive system is very important.

The causal effect of social activities on cognition: evidence from 20 European countries. Social Science & Medicine Published 9th January 2020

Plenty of studies have shown that cognitive abilities are correlated with social engagement, but few have attempted to demonstrate causality in a large sample. The challenge, of course, is that people who engage in more social activities are likely to have greater cognitive abilities for other reasons, and people’s decision to engage in social activities might depend on their cognitive abilities. This study tackles the question of causality using a novel (to me, at least) methodology.

The analysis uses data from five waves of SHARE (the Survey of Health, Ageing and Retirement in Europe). Survey respondents are asked about whether they engage in a variety of social activities, such as voluntary work, training, sports, or community-related organisations. From this, the authors generate an indicator for people participating in zero, one, or two or more of these activities. The survey also uses a set of tests to measure people’s cognitive abilities in terms of immediate recall capacity, delayed recall capacity, fluency, and numeracy. The authors look at each of these four outcomes, with 231,407 observations for the first three and 124,381 for numeracy (for which the questions were missing from some waves). Confirming previous findings, a strong positive correlation is found between engagement in social activities and each of the cognition indicators.

The empirical strategy, which I had never heard of, is partial identification. This is a non-parametric method that identifies bounds for the average treatment effect; it is ‘partial’ because it doesn’t identify a point estimate. Fewer assumptions mean wider and less informative bounds. The authors start with a model with no assumptions, for which the lower bound for the treatment effect goes below zero. They then incrementally add assumptions: i) monotone treatment response, assuming that social participation does not reduce cognitive abilities on average; ii) monotone treatment selection, assuming that people who choose to be socially active tend to have higher cognitive capacities; and iii) a monotone instrumental variable assumption that body mass index is negatively associated with cognitive abilities. The authors argue that their methodology is not likely to be undermined by unobservables, as previous studies might be.
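To give a flavour of the approach, here is my own sketch (not the authors' code) of the no-assumption, worst-case bounds for an outcome known to lie in a fixed range. The idea is to fill in each unobserved counterfactual with the worst or best possible value; the monotonicity assumptions described above then tighten such bounds:

```python
def no_assumption_ate_bounds(y_treated, y_control, y_min, y_max):
    """Worst-case (Manski-style) bounds on the average treatment effect
    for an outcome bounded in [y_min, y_max], with no assumptions about
    selection into treatment."""
    n1, n0 = len(y_treated), len(y_control)
    p = n1 / (n1 + n0)                 # P(treated)
    m1 = sum(y_treated) / n1           # E[Y | treated]
    m0 = sum(y_control) / n0           # E[Y | control]
    # Bound each counterfactual mean by filling the unobserved arm
    # with the smallest / largest possible outcome value.
    ey1_lo, ey1_hi = m1 * p + y_min * (1 - p), m1 * p + y_max * (1 - p)
    ey0_lo, ey0_hi = m0 * (1 - p) + y_min * p, m0 * (1 - p) + y_max * p
    return ey1_lo - ey0_hi, ey1_hi - ey0_lo

# Toy data: cognition scores in [0, 10] for socially active vs inactive.
lo, hi = no_assumption_ate_bounds([7, 8, 6, 9], [5, 6, 4, 5], 0, 10)
print(lo, hi)  # the interval straddles zero, as in the paper's no-assumption model
```

Even with the treated group scoring clearly higher, the no-assumption bounds include negative effects, which is exactly why the authors' added assumptions carry so much identifying power.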

The various models show that engaging in social activities has a positive impact on all four of the cognitive indicators. The assumption of monotone treatment response had the highest identifying power. For all models that included this, the 95% confidence intervals in the estimates showed a statistically significant positive impact of social activities on cognition. What is perhaps most interesting about this approach is the huge amount of uncertainty in the estimates. Social activities might have a huge effect on cognition or they might have a tiny effect. A basic OLS-type model, assuming exogenous selection, provides very narrow confidence intervals, whereas the confidence intervals on the partial identification models are almost as wide as the bounds themselves.

One shortcoming of this study for me is that it doesn’t seek to identify the causal channels that have been proposed in previous literature (e.g. loneliness, physical activity, self-care). So it’s difficult to paint a clear picture of what’s going on. But then, maybe that’s the point.

Do research groups align on an intervention’s value? Concordance of cost-effectiveness findings between the Institute for Clinical and Economic Review and other health system stakeholders. Applied Health Economics and Health Policy [PubMed] Published 10th January 2020

Aside from having the most inconvenient name imaginable, ICER has been a welcome addition to the US health policy scene, appraising health technologies in order to provide guidance on coverage. ICER has become influential, with some pharmacy benefit managers using their assessments as a basis for denying coverage for low-value medicines. ICER identify technologies as falling into one of three categories – high, low, or intermediate long-term value – according to whether the ICER (grr) falls below, above, or between the threshold range of $50,000-$175,000 per QALY. ICER conduct their own evaluations, but so do plenty of other people. This study sought to find out whether other analyses in the literature agree with ICER’s categorisations.
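As described, the categorisation amounts to a simple threshold rule. A sketch of it, with the caveat that the boundary handling is my own assumption and that ICER's actual process involves far more than a lookup:

```python
def long_term_value(icer_per_qaly: float) -> str:
    """Map an incremental cost-effectiveness ratio ($/QALY) to a
    long-term value category, per the $50,000-$175,000 threshold
    range described above. Boundary treatment is illustrative."""
    if icer_per_qaly < 50_000:
        return "high"
    if icer_per_qaly <= 175_000:
        return "intermediate"
    return "low"

print(long_term_value(40_000))    # high
print(long_term_value(120_000))   # intermediate
print(long_term_value(200_000))   # low
```

One consequence of any hard threshold rule is that small methodological differences between analyses can flip a categorisation, which is exactly the kind of discordance this study went looking for.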

The authors consider 18 assessments by ICER, including 76 interventions, between 2015 and 2017. For each of these, the authors searched the literature for other comparative studies. Specifically, they went looking for cost-effectiveness analyses that employed the same perspectives and outcomes. Unfortunately, they were only able to identify studies for six disease areas and 14 interventions (of the 76), across 25 studies. It isn’t clear whether this is because there is a lack of literature out there – which would be an interesting finding in itself – or because their search strategy or selection criteria weren’t up to scratch. Of the 14 interventions compared, 10 get a more favourable assessment in the published studies than in their corresponding ICER evaluations, with most being categorised as intermediate value instead of low value. The authors go on to conduct one case study, comparing an ICER evaluation in the context of migraine with a published study by some of the authors of this paper. There were methodological differences. In some respects, it seems as if ICER did a more thorough job, while in other respects the published study seemed to use more defensible assumptions.

I agree with the authors that these kinds of comparisons are important. Not least, we need to be sure that ICER’s approach to appraisal is valid. The findings of this study suggest that maybe ICER should be looking at multiple studies and combining all available data in a more meaningful way. But the authors excluded too many studies. Some imperfect comparisons would have been more useful than exclusion – 14 of 76 is kind of pitiful and probably not representative. And I’m not sure why the authors set out to identify studies that are ‘more favourable’, rather than just different. That perspective seems to reveal an assumption that ICER are unduly harsh in their assessments.
