Chris Sampson’s journal round-up for 30th September 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

A need for change! A coding framework for improving transparency in decision modeling. PharmacoEconomics [PubMed] Published 24th September 2019

We’ve featured a few papers in recent round-ups that (I assume) will be included in an upcoming themed issue of PharmacoEconomics on transparency in modelling. It’s shaping up to be a good one. The value of transparency in decision modelling has been recognised, but simply making the stuff visible is not enough – it needs to make sense. The purpose of this paper is to help make that achievable.

The authors highlight that the writing of analyses, including coding, involves personal style and preferences. To aid transparency, we need a systematic framework of conventions that make the inner workings of a model understandable to any (expert) user. The paper describes a framework developed by the Decision Analysis in R for Technologies in Health (DARTH) group. The DARTH framework builds on a set of core model components, generalisable to all cost-effectiveness analyses and model structures. There are five components – i) model inputs, ii) model implementation, iii) model calibration, iv) model validation, and v) analysis – and the paper describes the role of each. Importantly, the analysis component can be divided into several parts relating to, for example, sensitivity analyses and value of information analyses.

Based on this framework, the authors provide recommendations for organising and naming files and on the types of functions and data structures required. The recommendations build on conventions established in other fields and in the use of R generally. The authors recommend the implementation of functions in R, and relate general recommendations to the context of decision modelling. We’re also introduced to unit testing, which will be unfamiliar to most Excel modellers but which can be relatively easily implemented in R. The roles of various tools are introduced, including R Studio, R Markdown, Shiny, and GitHub.
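To give a flavour of what unit testing a model component involves – shown here in Python rather than R, with hypothetical function and parameter names – a test can simply assert properties that any valid building block must have:

```python
import numpy as np

def make_transition_matrix(p_sick, p_recover, p_die):
    """Build a transition matrix for a hypothetical 3-state model
    (Healthy, Sick, Dead). Parameter names are illustrative only."""
    return np.array([
        [1 - p_sick - p_die, p_sick,                p_die],
        [p_recover,          1 - p_recover - p_die, p_die],
        [0.0,                0.0,                   1.0],
    ])

def test_rows_sum_to_one():
    # Every row of a transition matrix must sum to exactly 1
    m = make_transition_matrix(p_sick=0.1, p_recover=0.3, p_die=0.05)
    assert np.allclose(m.sum(axis=1), 1.0)

def test_probabilities_in_unit_interval():
    # No entry may be negative or exceed 1
    m = make_transition_matrix(p_sick=0.1, p_recover=0.3, p_die=0.05)
    assert ((m >= 0) & (m <= 1)).all()

test_rows_sum_to_one()
test_probabilities_in_unit_interval()
```

In R, the testthat package plays the same role; the point is that such checks run automatically whenever the model code changes, rather than relying on someone eyeballing a spreadsheet.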

The real value of this work lies in the linked R packages and other online material, which you can use to test out the framework and consider its application to whatever modelling problem you might have. The authors provide an example using a basic Sick-Sicker model, which you can have a play with using the DARTH packages. In combination with the online resources, this is a valuable paper that you should have to hand if you’re developing a model in R.

Accounts from developers of generic health state utility instruments explain why they produce different QALYs: a qualitative study. Social Science & Medicine [PubMed] Published 19th September 2019

It’s well known that different preference-based measures of health will generate different health state utility values for the same person. Yet, they continue to be used almost interchangeably. For this study, the authors spoke to people involved in the development of six popular measures: QWB, 15D, HUI, EQ-5D, SF-6D, and AQoL. Their goal was to understand the bases for the development of the measures and to explain why the different measures should give different results.

At least one original developer for each instrument was recruited, along with people involved at later stages of development. Semi-structured interviews were conducted with 15 people, with questions on the background, aims, and criteria for the development of the measure, and on the descriptive system, preference weights, performance, and future development of the instrument.

Five broad topics were identified as being associated with differences in the measures: i) knowledge sources used for conceptualisation, ii) development purposes, iii) interpretations of what makes a ‘good’ instrument, iv) choice of valuation techniques, and v) the context for the development process. The online appendices provide some useful tables that summarise the differences between the measures. The authors distinguish between measures based on ‘objective’ definitions (QWB) and items that people found important (15D). Some prioritised sensitivity (AQoL, 15D), others prioritised validity (HUI, QWB), and several focused on pragmatism (SF-6D, HUI, 15D, EQ-5D). Some instruments had modest goals and opportunistic processes (EQ-5D, SF-6D, HUI), while others had grand goals and purposeful processes (QWB, 15D, AQoL). The use of some measures (EQ-5D, HUI) extended far beyond what the original developers had anticipated. In short, different measures were developed with quite different concepts and purposes in mind, so it’s no surprise that they give different results.

This paper provides some interesting accounts and views on the process of instrument development. It might prove most useful in understanding different measures’ blind spots, which can inform the selection of measures in research, as well as future development priorities.

The emerging social science literature on health technology assessment: a narrative review. Value in Health Published 16th September 2019

Health economics provides a good example of multidisciplinarity, with economists, statisticians, medics, epidemiologists, and plenty of others working together to inform health technology assessment. But I still don’t understand what sociologists are talking about half of the time. Yet, it seems that sociologists and political scientists are busy working on the big questions in HTA, as demonstrated by this paper’s 120 references. So, what are they up to?

This article reports on a narrative review, based on 41 empirical studies. Three broad research themes are identified: i) what drove the establishment and design of HTA bodies? ii) what has been the influence of HTA? and iii) what have been the social and political influences on HTA decisions? Some have argued that HTA is inevitable, while others have argued that there are alternative arrangements. Either way, no two systems are the same and it is not easy to explain differences. It’s important to understand HTA in the context of other social tendencies and trends, and to recognise that HTA both influences and is influenced by them. The authors provide a substantial discussion on the role of stakeholders in HTA and the potential for some to attempt to game the system. Uncertainty abounds in HTA and this necessarily requires negotiation and acts as a limit on the extent to which HTA can rely on objectivity and rationality.

Something lacking is a critical history of HTA as a discipline and the question of what HTA is actually good for. There’s also not a lot of work out there on culture and values, which contrasts with medical sociology. The authors suggest that sociologists and political scientists could be more closely involved in HTA research projects. I suspect that such a move would be more challenging for the economists than for the sociologists.

Credits

Thesis Thursday: Cheryl Jones

On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Cheryl Jones who has a PhD from the University of Manchester. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.

Title
The economics of presenteeism in the context of rheumatoid arthritis, ankylosing spondylitis and psoriatic arthritis
Supervisors
Katherine Payne, Suzanne Verstappen, Brenda Gannon
Repository link
https://www.research.manchester.ac.uk/portal/en/theses/the-economics-of-presenteeism-in-the-context-of-rheumatoid-arthritis-ankylosing-spondylitis-and-psoriatic-arthritis%288215e79a-925e-4664-9a3c-3fd42d643528%29.html

What attracted you to studying health-related presenteeism?

I was attracted to studying presenteeism because it gave me a chance to address both normative and positive issues. Presenteeism, a concept related to productivity, is a controversial topic in the economic evaluation of healthcare technologies and is currently excluded from health economic evaluations, following the recommendation made by the NICE reference case. The reasons why productivity is excluded from economic evaluations are important and valid; however, there are some circumstances where excluding productivity is difficult to defend. Presenteeism offered an opportunity for me to explore and question the social value judgements that underpin economic evaluation methods with respect to productivity. In terms of positive issues related to presenteeism, research into the development of methods that can be used to measure and value presenteeism was (and still is) limited. This provided an opportunity to think creatively about the types of methods we could use, both quantitative and qualitative, to address and further methods for quantifying presenteeism.

Are existing tools adequate for measuring and valuing presenteeism in inflammatory arthritic conditions?

That is the question! Research into methods that can be used to quantify presenteeism is still in its infancy. Presenteeism is difficult to measure accurately because there is a lack of objective measures that can be used, such as the number of cars assembled per day. As a consequence, many methods rely on self-report surveys, which tend to suffer from bias, such as reporting or recall bias. Methods that have been used to value presenteeism have largely focused on valuing presenteeism as a cost using the human capital approach (HCA: volume of presenteeism multiplied by a monetary factor). The monetary factor typically used to convert the volume of presenteeism into a cost value is wages. Valuing productivity using wages risks taking account of discriminatory factors that are associated with wages, such as age. There are also economic arguments that question whether the value of the wage truly reflects the value of productivity. My PhD focused on developing a method that values presenteeism as a non-monetary benefit, thereby avoiding the need to value it as a cost using wages. Overall, methods to measure and value presenteeism still have some way to go before a ‘gold standard’ can be established; however, there are many experts from many disciplines who are working to improve these methods.
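The human capital approach described above reduces to a one-line calculation; the figures below are hypothetical:

```python
def presenteeism_cost_hca(hours_worked, impairment_fraction, hourly_wage):
    """Human capital approach: volume of presenteeism multiplied by a
    monetary factor (here, the wage)."""
    lost_hours = hours_worked * impairment_fraction
    return lost_hours * hourly_wage

# Hypothetical example: a 37.5-hour week, 20% self-reported impairment, £15/hour
weekly_cost = presenteeism_cost_hca(37.5, 0.20, 15.0)  # 7.5 lost hours -> £112.50
```

The wage term is exactly where the discrimination and valuation concerns enter; valuing presenteeism as a non-monetary benefit, as in the thesis, avoids it entirely.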

Why was it important to conduct qualitative interviews as part of your research?

The quantitative component of my PhD was to develop an algorithm, using mapping methods, that links presenteeism with health status and capability measures. A study by Connolly et al. recommends conducting qualitative interviews to provide some evidence of face/content validity to establish whether a quantitative link between two measures (or concepts) is feasible and potentially valid. The qualitative study I conducted was designed to understand the extent to which the EQ-5D-5L, SF6D and ICECAP-A were able to capture those aspects of rheumatic conditions that negatively impact presenteeism. The results suggested that all three measures were able to capture those important aspects of rheumatic conditions that affect presenteeism; however, the results indicated that the SF6D would most likely be the most appropriate measure. The results from the quantitative mapping study identified the SF6D as the most suitable outcome measure able to predict presenteeism in working populations with rheumatic conditions. The advantage of the qualitative results was that they provided some evidence that explained why the SF6D was the more suitable measure, rather than relying on speculation.

Is it feasible to predict presenteeism using outcome measures within economic evaluation?

I developed an algorithm that links presenteeism, measured using the Work Productivity and Activity Impairment (WPAI) questionnaire, with health and capability. Health status was measured using the EQ-5D-5L and SF6D, and capability was measured using the ICECAP-A. The SF6D was identified as the most suitable measure to predict presenteeism in a population of employees with rheumatoid arthritis or ankylosing spondylitis. The results indicate that it is possible to predict presenteeism using generic outcome measures; however, the results have yet to be externally validated. The qualitative interviews provided evidence as to why the SF6D was the better predictor for presenteeism and the result gave rise to questions about the suitability of outcome measures given a specific population. The results indicate that it is potentially feasible to predict presenteeism using outcome measures.
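Stripped to its essentials, a mapping algorithm of this kind is a regression of the presenteeism measure on the utility score. The sketch below uses synthetic data and ordinary least squares purely for illustration; the thesis used real WPAI and SF6D data and more carefully specified models:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data standing in for real survey responses (illustration only):
# SF-6D utility scores and WPAI-style % work impairment
sf6d = rng.uniform(0.4, 1.0, 200)
presenteeism = np.clip(80 - 90 * (sf6d - 0.3) + rng.normal(0, 5, 200), 0, 100)

# The mapping: predict presenteeism from the utility score by OLS
X = np.column_stack([np.ones_like(sf6d), sf6d])
coef, *_ = np.linalg.lstsq(X, presenteeism, rcond=None)
predicted = X @ coef  # better health (higher SF-6D) -> less presenteeism
```

An algorithm like this lets a trial that collected only a generic outcome measure feed a presenteeism estimate into an economic evaluation, which is the practical pay-off of the mapping exercise.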

What would be your key recommendation to a researcher hoping to capture the impact of an intervention on presenteeism?

Due to the lack of a ‘gold standard’ method for capturing the impact of presenteeism, I would recommend that the researcher report and justify the following:

  1. Provide a rationale that explains why presenteeism is an important factor that needs to be considered in the analysis.
  2. Explain how and why presenteeism will be captured and included in the analysis: as a cost, a monetary benefit, or a non-monetary benefit.
  3. Justify the methods used to measure and value presenteeism. It is important that the researcher clearly reports why specific tools, such as presenteeism surveys, have been selected for use.

Because there is no ‘gold standard’ method for measuring and valuing presenteeism and guidelines do not exist to inform the reporting of methods used to quantify presenteeism, it is important that the researcher reports and justifies their selection of methods used in their analysis.

Chris Sampson’s journal round-up for 2nd July 2018


Choice in the presence of experts: the role of general practitioners in patients’ hospital choice. Journal of Health Economics [PubMed] [RePEc] Published 26th June 2018

In the UK, patients are in principle free to choose which hospital they use for elective procedures. However, as these choices operate through a GP referral, the extent to which the choice is ‘free’ is limited. The choice set is provided by the GP and thus there are two decision-makers. It’s a classic example of the principal-agent relationship. What’s best for the patient and what’s best for the local health care budget might not align. The focus of this study is on the applied importance of this dynamic and the idea that econometric studies that ignore it – by looking only at patient decision-making or only at GP decision-making – may give biased estimates. The author outlines a two-stage model for the choice process that takes place. Hospital characteristics can affect choices in three ways: i) by only influencing the choice set that the GP presents to the patient, e.g. hospital quality, ii) by only influencing the patient’s choice from the set, e.g. hospital amenities, and iii) by influencing both, e.g. waiting times. The study uses Hospital Episode Statistics for 30,000 hip replacements that took place in 2011/12, referred by 4,721 GPs to 168 hospitals, to examine revealed preferences. The choice set for each patient is not observed, so a key assumption is that all hospitals to which a GP made referrals in the period are included in the choice set presented to patients. The main findings are that both GPs and patients are influenced primarily by distance. GPs are influenced by hospital quality and the budget impact of referrals, while distance and waiting times explain patient choices. For patients, parking spaces seem to be more important than mortality ratios. The results support the notion that patients defer to GPs in assessing quality. In places, it’s difficult to follow what the author did and why they did it.
But in essence, the author is looking for (and in most cases finding) reasons not to ignore GPs’ preselection of choice sets when conducting econometric analyses involving patient choice. Econometricians should take note. And policymakers should be asking whether freedom of choice is sensible when patients prioritise parking and when variable GP incentives could give rise to heterogeneous standards of care.

Using evidence from randomised controlled trials in economic models: what information is relevant and is there a minimum amount of sample data required to make decisions? PharmacoEconomics [PubMed] Published 20th June 2018

You’re probably aware of the classic ‘irrelevance of inference’ argument. Statistical significance is irrelevant in deciding whether or not to fund a health technology, because we ought to do whatever we expect to be best on average. This new paper argues the case for irrelevance in other domains, namely multiplicity (e.g. multiple testing) and sample size. With a primer on hypothesis testing, the author sets out the regulatory perspective. Multiplicity inflates the chance of a type I error, so regulators worry about it. That’s why triallists often obsess over primary outcomes (and avoiding multiplicity). But when we build decision models, we rely on all sorts of outcomes from all sorts of studies, and QALYs are never the primary outcome. So what does this mean for reimbursement decision-making? Reimbursement is based on expected net benefit as derived using decision models, which are Bayesian by definition. Within a Bayesian framework of probabilistic sensitivity analysis, data for relevant parameters should never be disregarded on the basis of the status of their collection in a trial, and it is up to the analyst to properly specify a model that properly accounts for the effects of multiplicity and other sources of uncertainty. The author outlines how this operates in three settings: i) estimating treatment effects for rare events, ii) the number of trials available for a meta-analysis, and iii) the estimation of population mean overall survival. It isn’t so much that multiplicity and sample size are irrelevant, as they could inform the analysis, but rather that no data is too weak for a Bayesian analyst.
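The decision-theoretic point can be sketched in a few lines: the funding decision rests on expected incremental net benefit across the full parameter uncertainty, with no significance threshold anywhere. All distributions below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
n_sim = 10_000
wtp = 20_000  # hypothetical willingness-to-pay per QALY

# Hypothetical parameter uncertainty for a new treatment vs standard care
delta_qalys = rng.normal(0.05, 0.10, n_sim)   # incremental QALYs (very uncertain)
delta_cost = rng.normal(500.0, 200.0, n_sim)  # incremental cost

inb = wtp * delta_qalys - delta_cost          # incremental net benefit per draw

# The decision rule: fund if expected INB > 0 -- no hypothesis test involved
expected_inb = inb.mean()
prob_cost_effective = (inb > 0).mean()
```

Here the QALY gain's credible interval comfortably spans zero, yet expected net benefit is positive; the residual uncertainty is expressed as a probability of cost-effectiveness (and, if needed, a value of information analysis), not a p-value.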

Life satisfaction, QALYs, and the monetary value of health. Social Science & Medicine [PubMed] Published 18th June 2018

One of this blog’s first ever posts was on the subject of ‘the well-being valuation approach’ but, to date, I don’t think we’ve ever covered a study in the round-up that uses this method. In essence, the method is about estimating trade-offs between (for example) income and some measure of subjective well-being, or some health condition, in order to estimate the income equivalence for that state. This study attempts to estimate the (Australian) dollar value of QALYs, as measured using the SF-6D. Thus, the study is a rival cousin to the Claxton-esque opportunity cost approach, and a rival sibling to stated preference ‘social value of a QALY’ approaches. The authors are trying to identify a threshold value on the basis of revealed preferences. The analysis is conducted using 14 waves of the Australian HILDA panel, with more than 200,000 person-year responses. A regression model estimates the impact on life satisfaction of income, SF-6D index scores, and the presence of long-term conditions. The authors adopt an instrumental variable approach to try to address the endogeneity of life satisfaction and income, using an indicator of ‘financial worsening’ to approximate an income shock. The estimated value of a QALY is found to be around A$42,000 (~£23,500) over a 2-year period. Over the long-term, it’s higher, at around A$67,000 (~£37,500), because individuals are found to discount money differently to health. The results also demonstrate that individuals are willing to pay around A$2,000 to avoid a long-term condition on top of the value of a QALY. The authors apply their approach to a few examples from the literature to demonstrate the implications of using well-being valuation in the economic evaluation of health care. As with all uses of experienced utility in the health domain, adaptation is a big concern. But a key advantage is that this approach can be easily applied to large sets of survey data, giving powerful results.
However, I haven’t quite got my head around how meaningful the results are. SF-6D index values – as used in this study – are generated on the basis of stated preferences. So to what extent are we measuring revealed preferences? And if it’s some combination of stated and revealed preference, how should we interpret willingness to pay values?
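For readers unfamiliar with the mechanics, well-being valuation typically rests on a compensating-variation formula: if life satisfaction is modelled as LS = α + β·ln(income) + γ·health, the income equivalent of a health gain ΔH is Y(1 − exp(−γΔH/β)). A sketch with hypothetical coefficients (not those estimated in the paper):

```python
import math

def wellbeing_value(income, beta_ln_income, gamma_health, delta_health):
    """Compensating income variation for a health change under
    LS = a + beta*ln(income) + gamma*health. Coefficients hypothetical."""
    return income * (1 - math.exp(-gamma_health * delta_health / beta_ln_income))

# Hypothetical: beta = 0.4, gamma = 1.2 per SF-6D index point, a full QALY gained
value_of_qaly = wellbeing_value(income=50_000, beta_ln_income=0.4,
                                gamma_health=1.2, delta_health=1.0)
```

The ratio γ/β drives everything, which is one reason the instrumental-variable step matters: an underestimated income coefficient β would mechanically inflate these values.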
