Chris Sampson’s journal round-up for 20th May 2019

Every Monday our authors provide a round-up of some of the most recently published peer-reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

A new method to determine the optimal willingness to pay in cost-effectiveness analysis. Value in Health Published 17th May 2019

Efforts to identify a robust estimate of the willingness to pay for a QALY have floundered. Mostly, these efforts have relied on asking people about their willingness to pay. In the UK, we have moved away from using such estimates as a basis for setting cost-effectiveness thresholds in the context of resource allocation decisions. Instead, we have attempted to identify the opportunity cost of a QALY, which is perhaps even more difficult, but easier to justify in the context of a fixed budget. This paper seeks to inject new life into the willingness-to-pay approach by developing a method based on relative risk aversion.

The author outlines the relationship between relative risk aversion and the rate at which willingness-to-pay changes with income. Various candidate utility functions are described with respect to risk preferences, with a Weibull function being adopted for this framework. Estimates of relative risk aversion have been derived from numerous data sources, including labour supply, lottery experiments, and happiness surveys. These estimates from the literature are used to demonstrate the relationship between relative risk aversion and the ‘optimal’ willingness to pay (K), calibrated using the Weibull utility function. For an individual with ‘representative’ parameters plugged into their utility function, K is around twice the income level. K always increases with relative risk aversion.
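To see in general terms why relative risk aversion should govern how willingness to pay moves with income, here is a deliberately generic derivation (not the paper’s Weibull specification): if the willingness to pay for a fixed health gain is taken to be proportional to the inverse of the marginal utility of income, its income elasticity is exactly the coefficient of relative risk aversion.

```latex
% Illustrative only: a generic utility of income u(y), not the paper's Weibull form.
% Suppose the willingness to pay for a fixed health gain is K(y) = v / u'(y),
% where v is the (constant) utility value of that gain. Then
\[
\frac{\mathrm{d}\ln K(y)}{\mathrm{d}\ln y}
  = \frac{y}{K(y)}\,\frac{\mathrm{d}K(y)}{\mathrm{d}y}
  = \frac{y\,u'(y)}{v}\left(-\frac{v\,u''(y)}{u'(y)^{2}}\right)
  = -\frac{y\,u''(y)}{u'(y)}
  \equiv \rho(y),
\]
% i.e. the income elasticity of willingness to pay equals relative risk aversion, \rho(y).
```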

Various normative questions are raised, including whether a uniform K should be adopted for everybody within the population, and whether individuals should be able to spend on health care on top of public provision. This approach certainly appears to be more straightforward than other approaches to estimating willingness-to-pay in health care, and may be well-suited to decentralised (US-style) resource allocation decision-making. It’s difficult to see how this framework could gain traction in the UK, but it’s good to see alternative approaches being proposed and I hope to see this work developed further.

Striving for a societal perspective: a framework for economic evaluations when costs and effects fall on multiple sectors and decision makers. Applied Health Economics and Health Policy [PubMed] Published 16th May 2019

I’ve always been sceptical of a ‘societal perspective’ in economic evaluation, and I have written in favour of a limited health care perspective. This is mostly for practical reasons. Being sufficiently exhaustive to identify a truly ‘societal’ perspective is so difficult that, in attempting to do so, there is a very high chance that you will produce estimates that are so inaccurate and imprecise that they are more dangerous than useful. But the fact is that there is no single decision-maker when it comes to public expenditure. Governments are made up of various departments, within which there are many levels and divisions. Not everybody will care about the health care perspective, so other objectives ought to be taken into account.

The purpose of this paper is to build on the idea of the ‘impact inventory’, described by the Second Panel on Cost-Effectiveness in Health and Medicine, which sought to address the challenge of multiple objectives. The extended framework described in this paper captures effects and opportunity costs associated with an intervention within various dimensions. These dimensions could (or should) align with decision-makers’ objectives. Trade-offs invariably require aggregation, and this aggregation could take place either within individuals or within dimensions – something not addressed by the Second Panel. The authors describe the implications of each approach to aggregation, providing visual representations of the impact inventory in each case. Aggregating within individuals requires a normative judgement about how each dimension is valued by the individual and then a judgement about how to aggregate for overall population net benefit. Aggregating across individuals within dimensions requires similar normative judgements. Where the chosen aggregation functions are linear and additive, both approaches will give the same results. But as soon as we start to consider equity concerns or more complex aggregation, we’ll see different decisions being indicated.
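The equivalence under linear, additive aggregation (and its breakdown under anything more complex) is easy to check numerically. Below is a minimal, hypothetical sketch in Python; the net-benefit matrix and the non-linear ‘equity-style’ weighting are invented for illustration and are not taken from the paper.

```python
import numpy as np

# Illustrative only: a 3-person, 2-dimension net-benefit matrix (rows = individuals,
# columns = dimensions, e.g. health and consumption), in arbitrary monetary units.
nb = np.array([[ 4.0,  1.0],
               [-2.0,  5.0],
               [ 1.0, -3.0]])

# Linear, additive aggregation: summing within individuals first, or within
# dimensions first, gives the same population net benefit.
within_individuals = nb.sum(axis=1).sum()  # aggregate dimensions per person, then add up people
within_dimensions  = nb.sum(axis=0).sum()  # aggregate people per dimension, then add up dimensions
print(within_individuals, within_dimensions)  # identical

# A hypothetical non-linear (equity-style) weighting applied at the aggregation
# stage breaks that equivalence.
weight = lambda x: np.sign(x) * np.sqrt(np.abs(x))
via_individuals = weight(nb.sum(axis=1)).sum()  # weight each person's total
via_dimensions  = weight(nb.sum(axis=0)).sum()  # weight each dimension's total
print(via_individuals, via_dimensions)          # generally different
```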

The authors adopt an example used by the Second Panel to demonstrate the decisions that would be made within a health-only perspective and then decisions that consider other dimensions. There could be a simple extension beyond health, such as including the impact on individuals’ consumption of other goods. Or it could be more complex, incorporating multiple dimensions, sectors, and decision-makers. For the more complex situation, the authors consider the inclusion of the criminal justice sector, introducing the number of crimes averted as an object of value.

It’s useful to think about the limitations of the Second Panel’s framing of the impact inventory and to make explicit the normative judgements involved. What this paper seems to be saying is that cross-sector decision-making is too complex to be adequately addressed by the Second Panel’s impact inventory. The framework described in this paper may be too abstract to be practically useful, and too vague to be foundational. But the complexities and challenges in multi-sector economic evaluation need to be spelt out – there is no simple solution.

Advanced data visualisation in health economics and outcomes research: opportunities and challenges. Applied Health Economics and Health Policy [PubMed] Published 4th May 2019

Computers can make your research findings look cool, which can help make people pay attention. But data visualisation can also be used as part of the research process and provide a means of more intuitively (and accurately) communicating research findings. The data sets used by health economists are getting bigger, which provides more opportunity and need for effective visualisation. The authors of this paper suggest that data visualisation techniques could be more widely adopted in our field, but that there are challenges and potential pitfalls to consider.

Decision modelling is an obvious context in which to use data visualisation, because models tend to involve large numbers of simulations. Dynamic visualisations can provide a means by which to better understand what is going on in these simulations, particularly with respect to uncertainty in estimates associated with alternative model structures or parameters. If paired with interactive models and customised dashboards, visualisation can make complex models accessible to non-expert users. Communicating patient outcomes data is also highlighted as a potential application, aiding the characterisation of differences between groups of individuals and alternative outcome measures.

Yet, there are barriers to wider use of visualisation. There is some scepticism about bias in underlying analyses, and end users don’t want to be bamboozled by snazzy graphics. The fact that journal articles are still the primary mode of communicating research findings is a problem, as you can’t have dynamic visualisations in a PDF. There’s also a learning curve for analysts wishing to develop complex visualisations. Hopefully, opportunities will be identified for two-way learning between the health economics world and data scientists more accustomed to data visualisation.

The authors provide several examples (static in the publication, but with links to live tools) to demonstrate the types of visualisations that can be created. Generally speaking, complex visualisations are proposed as complements to our traditional presentations of results, such as cost-effectiveness acceptability curves, rather than as alternatives. The key thing is to maintain credibility by ensuring that data visualisation is used to describe data in a more accurate and meaningful way, and to avoid exaggeration of research findings. It probably won’t be long until we see a set of good practice guidelines being developed for our field.
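As a flavour of how little code such a visualisation can take, here is a minimal, hypothetical sketch of a static cost-effectiveness acceptability curve built from simulated probabilistic sensitivity analysis output; all of the numbers and distributions are invented, and an interactive version would simply wrap the same calculation in a dashboard tool.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)

# Hypothetical probabilistic sensitivity analysis output: 5,000 simulated
# incremental costs and incremental QALYs for a new treatment vs. comparator.
d_cost = rng.normal(2000, 800, 5000)   # incremental cost (£)
d_qaly = rng.normal(0.10, 0.08, 5000)  # incremental QALYs

# Cost-effectiveness acceptability curve: probability that incremental net
# monetary benefit is positive at each willingness-to-pay threshold.
thresholds = np.linspace(0, 50000, 101)
prob_ce = [np.mean(k * d_qaly - d_cost > 0) for k in thresholds]

plt.plot(thresholds, prob_ce)
plt.xlabel("Willingness to pay per QALY (£)")
plt.ylabel("Probability cost-effective")
plt.ylim(0, 1)
plt.title("Cost-effectiveness acceptability curve (simulated data)")
plt.show()
```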

Thesis Thursday: Kevin Momanyi

On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Kevin Momanyi who has a PhD from the University of Aberdeen. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.

Title
Enhancing quality in social care through economic analysis

Supervisors
Paul McNamee

Repository link
http://digitool.abdn.ac.uk/webclient/DeliveryManager?pid=240815

What are reablement and telecare services and why should economists study them?

Reablement and telecare are two types of services within homecare that enable individuals to live independently in their own homes with little or no assistance from other people. Reablement focuses on helping individuals relearn the skills needed for independent living after an illness or injury. It is a short-term intervention that lasts for about 6 to 12 weeks and usually involves several health care professionals and social care workers working together to meet some set objectives. Telecare, on the other hand, entails the use of devices (e.g. community alarms and linked pill dispensers) to facilitate communication between homecare clients and their care providers in the event of an accident or negative health shock. Economists should study reablement and telecare to determine whether or not the services provide value for money, and to develop policies that would reduce social care costs without compromising the welfare of the populace.

In what ways did your study reach beyond the scope of previous research?

My study extended the previous studies in three main ways. Firstly, I estimated the treatment effects in a non-experimental setting unlike the previous studies that used either randomised controlled trials or quasi-experiments. Secondly, I used linked administrative health and social care data in Scotland for the 2010/2011 financial year. The data covered the administrative records for the entire Scottish population and was larger and more robust than the data used by the previous studies. Thirdly, the previous studies were simply concerned with quantifying the treatment effects and thus did not provide a rationale as to how the interventions affect the outcomes of interest. My thesis addressed this knowledge gap by formulating an econometric model that links the demand for reablement/telecare to several outcomes.

How did you go about trying to estimate treatment effects from observational data?

I used a theory-driven approach combined with specialised econometric techniques to estimate the treatment effects. The theoretical model drew from the Almost Ideal Demand System (AIDS), Andersen’s Behavioural Model of Health Services Use, the Grossman Model of the demand for health capital, and Samuelson’s Revealed Preference Theory; whereas the estimation strategy simultaneously controlled for unexplained trend variations, potential endogeneity of key variables, potential sample selection bias, and potential unobserved heterogeneity. For a more substantive discussion of the theoretical model and estimation strategy, see Momanyi, 2018. Although the majority of studies in the econometric literature advocate for the use of quasi-experimental study designs when estimating treatment effects from observational data, I provided several proofs in my thesis showing that these designs do not always yield consistent results, and that estimating the econometric models in the way that I did is preferable since it nests several study designs and estimation strategies as special cases.

Are there key groups of people that could benefit from greater use of reablement and telecare services?

According to the empirical results of my thesis, there is sufficient evidence to conclude that there are certain groups within the population that could benefit from greater use of telecare. For instance, one empirical study investigating the effect of telecare use on the expected length of stay in hospital showed that the community alarm users with physical disabilities are more likely than the other community alarm users to have a shorter length of stay in hospital, holding other factors constant. Correspondingly, the results also showed that the individuals who use more advanced telecare devices than the community alarm and who are also considered to be frail elderly are expected to have a relatively shorter length of stay in hospital as compared to the other telecare users in the population, all else equal. A discussion of various econometric models that can be used to link telecare use to the length of stay in hospital can be found in Momanyi, 2017.

What would be your main recommendation for policymakers in Scotland?

The main recommendation for policymakers is that they ought to subsidise the cost of telecare services, especially in regions that currently have relatively low utilisation levels, so as to increase the uptake of telecare in Scotland. This was informed by a decomposition analysis that I conducted in the first empirical study to shed light on what could be driving the observed direct relationship between telecare use and independent living at home. The analysis showed that the treatment effect was in part due to the underlying differences (both observable and unobservable) between telecare users and non-users, and thus policymakers could stimulate telecare use in the population by addressing these differences. In addition to that, policymakers should advise the local authorities to target telecare services at the groups of people that are most likely to benefit from them, as well as sensitise the population to the benefits of using community alarms. This is because the econometric analyses in my thesis showed that the treatment effects are not homogeneous across the population, and that the use of a community alarm is expected to reduce the likelihood of unplanned hospitalisation, whereas the use of the other telecare devices has the opposite effect, all else equal.

Can you name one thing that you wish you could have done as part of your PhD, which you weren’t able to do?

I would have liked to include in my thesis an empirical study on the effects of reablement services. My analyses focused only on telecare use as the treatment variable due to data limitations. This additional study would have been vital in validating the econometric model that I developed in the first chapter of the thesis as well as addressing the gaps in knowledge that were identified by the literature review. In particular, it would have been worthwhile to determine whether reablement services should be offered to individuals discharged from hospital or to individuals who have been selected into the intervention directly from the community.

Poor statistical communication means poor statistics

Statistics is a broad and complex field. For a given research question, any number of statistical approaches could be taken. In an article published last year, researchers asked 61 analysts to use the same dataset to address the question of whether referees were more likely to give dark-skinned players a red card than light-skinned players. They got 61 different responses. Each analysis had its advantages and disadvantages, and I’m sure each analyst would have defended their work. However, as many statisticians and economists may well know, the merit of an approach is not the only factor that matters in its adoption.

There has, for decades, been criticism about the misunderstanding and misuse of null hypothesis significance testing (NHST). P-values have been a common topic on this blog. Despite this, NHST remains the predominant paradigm for most statistical work. If used appropriately this needn’t be a problem, but if it were being used appropriately it wouldn’t be used nearly as much: p-values can’t perform the inferential role many expect of them. It’s not difficult to understand why things are this way: most published work uses NHST, we teach students NHST in order to understand the published work, students become researchers who use NHST, and so on. Part of statistical education involves teaching the arbitrary conventions that have gone before such as that p-values are ‘significant’ if below 0.05 or a study is ‘adequately powered’ if power is above 80%. One of the most pernicious consequences of this is that these heuristics become a substitute for thinking. The presence of these key figures is expected and their absence often marked by a request from reviewers and other readers for their inclusion.

I have argued on this blog and elsewhere for a wider use of Bayesian methods (and less NHST) and I try to practice what I preach. For an ongoing randomised trial I am involved with, I adopted a Bayesian approach to design and analysis. Instead of the usual power calculation, I conducted a Bayesian assurance analysis (which Anthony O’Hagan has written some good articles on for those wanting more information). I’ll try to summarise the differences between ‘power’ and ‘assurance’ calculations by attempting to define them, which is actually quite hard!

Power calculation. If we were to repeat a trial infinitely many times, what sample size would we need so that, in x% of trials, the assumed data-generating model produces data which would fall in the α% most extreme quantiles of the distribution of data that would be produced from the same data-generating model but with one parameter set to exactly zero (or any equivalent hypothesis)? Typically we set x% to be 80% (power) and α% to be 5% (statistical significance threshold).

Assurance calculation. For a given data-generating model, what sample size do we need so that there is an x% probability that we will be 1-α% certain that the parameter is positive (or any equivalent choice)?

The assurance calculation could be reframed in a decision framework: what sample size do we need so that there is an x% probability that we will make the right decision about whether a parameter is positive (or any equivalent decision), given the costs of making the wrong decision?
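For concreteness, here is a minimal sketch contrasting the two calculations, using invented numbers and a deliberately simplified normal-normal model (known outcome SD, conjugate prior on the treatment effect); it is not the design or analysis of the trial mentioned above.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.power import TTestIndPower

# --- Conventional power calculation (frequentist) --------------------------
# Sample size per arm for 80% power to detect a standardised effect of 0.3
# at a two-sided 5% significance level; the effect size is a fixed assumption.
n_power = TTestIndPower().solve_power(effect_size=0.3, alpha=0.05, power=0.8)
print(f"Power calculation: {np.ceil(n_power):.0f} per arm")

# --- Bayesian assurance calculation (Monte Carlo sketch) -------------------
# Instead of a single assumed effect, the treatment effect is drawn from a
# prior; assurance is the probability (over that prior and the sampling
# distribution of the data) that the trial ends with >= 95% posterior
# certainty that the effect is positive. All numbers here are invented.
rng = np.random.default_rng(42)
sigma = 1.0                      # known outcome SD (simplification)
prior_mean, prior_sd = 0.3, 0.2  # prior on the standardised treatment effect

def assurance(n_per_arm, n_sims=5000, certainty=0.95):
    hits = 0
    for _ in range(n_sims):
        delta = rng.normal(prior_mean, prior_sd)                   # 'true' effect for this trial
        diff = rng.normal(delta, sigma * np.sqrt(2 / n_per_arm))   # observed mean difference
        # Conjugate normal update of the prior on delta given the observed difference
        lik_var = 2 * sigma**2 / n_per_arm
        post_var = 1 / (1 / prior_sd**2 + 1 / lik_var)
        post_mean = post_var * (prior_mean / prior_sd**2 + diff / lik_var)
        # Does the trial end with >= 95% posterior probability that delta > 0?
        if 1 - stats.norm.cdf(0, post_mean, np.sqrt(post_var)) >= certainty:
            hits += 1
    return hits / n_sims

for n in (100, 200, 400):
    print(f"n = {n} per arm: assurance ≈ {assurance(n):.2f}")
```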

Both of these are complex, but I would argue it is the assurance calculation that gives us what we want to know most of the time when designing a trial. The assurance analysis also better represents uncertainty, since we specify distributions over all the uncertain parameters rather than choosing exact values. Despite this, the funder of the trial mentioned above, who shall remain nameless, insisted on the results of a power calculation in order to determine whether the trial was worth continuing with, because that’s “what they’re used to.”

The main culprit for this issue is, I believe, communication. A simpler explanation with better presentation may have been easier to understand and accept. This is not to say that the funder wasn’t simply substituting the heuristic ‘80% or more power = good’ for actually thinking about what we could learn from the trial. But until statisticians, economists, and other data-analytic researchers start communicating better, how can we expect others to listen?

Image credit: Geralt