How important is healthcare for population health?

How important is a population’s access to healthcare as a determinant of population health? I have heard the claim that “as little as 10% of a population’s health is linked to access to healthcare”, or some variant of it, in many places, including from the Health Foundation, the AHRQ, the King’s Fund, and the WHO. This claim is appealing: it feels counter-intuitive and it brings to the fore questions of public health and health-related behaviour. But it’s not clear what it means.

I can think of two possible interpretations. One, 10% of the variation in population health outcomes is explained by variation in healthcare access. Or two, access to healthcare leads to a 10% change in population health outcomes compared to no access to healthcare. Both of these claims would be very hard to evaluate empirically. Within many countries, particularly the highest income countries, there is little variation in access to healthcare relative to possible levels of access across the world. Inter-country comparisons would provide a greater range of variation to compare to population outcomes. But even the most sophisticated statistical analysis will struggle to separate out the effects of other economic determinants of health.

It would also be difficult to make sense of any study that purported to estimate the effect of adding or removing healthcare beyond any within-country variation. The labour and capital resource needs of the most sophisticated hospitals are too great for the poorest settings, and it is implausible that the wealthiest democratic countries would let their healthcare fall to the level that the world’s poorest face.

But what is the evidence for the claim of 10%? There are a handful of key citations, all of which were summarised at the time in a widely cited article in Health Affairs in 2014. For either of the two interpretations above, we would need estimates of the probability of health conditional on different levels of healthcare, Pr(health|healthcare). Each of the references for the 10% figure in fact provides evidence for the proportion of deaths associated with ‘inadequate’ healthcare, or, to put it another way, the probability of having received ‘inadequate’ care given death, Pr(healthcare|health). This is known as transposing the conditional: we have got our conditional probability the wrong way round. Even if we accept mortality rates as an acceptable proxy for population health, the two probabilities are not equal to one another.
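The gap between the two conditionals is easy to demonstrate with Bayes’ theorem. A minimal sketch, using entirely made-up probabilities chosen for illustration:

```python
# Illustrative (invented) numbers showing that transposing the conditional
# changes the answer: Pr(inadequate care | death) != Pr(death | inadequate care).
p_inadequate = 0.3                 # Pr(inadequate care) in the population (assumed)
p_death_given_inadequate = 0.02    # Pr(death | inadequate care) (assumed)
p_death_given_adequate = 0.01      # Pr(death | adequate care) (assumed)

# Law of total probability: overall probability of death
p_death = (p_death_given_inadequate * p_inadequate
           + p_death_given_adequate * (1 - p_inadequate))

# Bayes' theorem: transpose the conditional
p_inadequate_given_death = p_death_given_inadequate * p_inadequate / p_death

print(round(p_inadequate_given_death, 3))  # ~0.462: nearly half of deaths follow inadequate care
print(p_death_given_inadequate)            # 0.02: yet the risk of death given inadequate care is 2%
```

With these numbers, almost half of all deaths are preceded by ‘inadequate’ care, even though inadequate care raises the risk of death only modestly; quoting one probability as if it were the other is exactly the error described above.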

Interpretation of this evidence is also complex. Smoking tobacco, for example, would be considered a behavioural determinant of health, and deaths caused by it would be attributed to a behavioural cause rather than to healthcare. But survival rates for lung cancers have improved dramatically over the last few decades due to improvements in healthcare. While it would be foolish to attribute a death in the past to a lack of access to treatments which had not been invented, contemporary lung cancer deaths in low income settings may well have been prevented by access to better healthcare. Thus, using cause-of-death statistics to estimate the contributions of different factors to population health typically picks up only those deaths resulting from medical error or negligence. Such statistics are a wholly unreliable guide to the role of healthcare in determining population health.

A study published recently in The Lancet, timed to coincide with a commission on healthcare quality, adopted a different approach. The study aimed to estimate the annual number of deaths worldwide due to a lack of access to high-quality care. To do this, the authors compared the mortality rates of conditions amenable to healthcare intervention around the world with those in the wealthiest nations. Any differences were attributed to either non-utilisation of, or lack of access to, high-quality care. The study estimated 15.6 million ‘excess deaths’. However, to attribute these deaths to inadequate healthcare access, one would need to conceive of a counterfactual world in which everyone was treated in the best healthcare systems. This is surely implausible in the extreme. A comparable question might be to ask how many people around the world are dying because their incomes are not as high as those of the top 10% of Americans.

On the normative question, there is little disagreement with the goal of achieving universal health coverage and improving population health. But these dramatic, eye-catching, or counter-intuitive figures do little to support achieving these ends: they can distort policy priorities and create unattainable goals and expectations. Health systems are not built overnight; an incremental approach is needed to ensure sustainability and affordability. It is in the evidence to support this incremental approach that great strides are being made, both methodologically and empirically, but it is not nearly as exciting as claiming healthcare isn’t very important or that millions of people are dying every year due to poor healthcare access. Healthcare systems are an integral and important part of overall population health; assigning a number to this importance is not.

Picture credit: pixabay

Sam Watson’s journal round-up for 29th October 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Researcher Requests for Inappropriate Analysis and Reporting: A U.S. Survey of Consulting Biostatisticians. Annals of Internal Medicine. [PubMed] Published October 2018.

I have spent a fair bit of time masquerading as a statistician. While I frequently try to push for Bayesian analyses where appropriate, I have still had to do Frequentist work, including power and sample size calculations. In principle these power calculations serve a good purpose: if the study is likely to produce very uncertain results it won’t contribute much to scientific knowledge and so won’t justify its cost. A power calculation can also indicate that a two-arm trial would be preferred over a three-arm trial, despite losing an important comparison. But many power analyses, I suspect, are purely for show; all that is wanted is the false assurance of some official-looking statistics to demonstrate that a particular design is good enough. Now, I’ve never worked on economic evaluation, but I can imagine that the same pressures can sometimes exist to achieve a certain result. This study presents a survey of 400 US-based statisticians, asking them how frequently they are asked to do some inappropriate analysis or reporting and to rate how egregious each request is. The most severe request, for example, is thought to be to falsify statistical significance. But the list includes common requests such as not showing plots because they don’t reveal an effect as significant as hoped, downplaying ‘insignificant’ findings, or dressing up post hoc power calculations as a priori analyses. I would think that those responding to this survey are less likely to be those who comply with such requests, and the survey does not ask them if they did. But it wouldn’t be a big leap to suggest that there are those who do comply, career pressures being what they are. We already know that statistics are widely misused and misreported, especially p-values. Whether this is due to ignorance or malfeasance, I’ll let the reader decide.
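As a reminder of what such a calculation involves, here is a minimal two-sample power calculation using the normal approximation (the effect size and standard deviation are illustrative, not taken from the study):

```python
# Sample size per arm for a two-sample comparison of means, using the
# standard normal-approximation formula. All inputs are illustrative.
from math import ceil
from statistics import NormalDist

def n_per_arm(effect, sd, alpha=0.05, power=0.8):
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # two-sided significance level
    z_beta = z(power)            # desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / effect) ** 2)

print(n_per_arm(effect=0.5, sd=1.0))  # 63 per arm for a medium standardised effect
```

The point of doing this a priori is to check that the design can answer the question; doing it post hoc and presenting it as a priori is one of the requests the survey asks about.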

Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results. Advances in Methods and Practices in Psychological Science. [PsyArXiv] Published August 2018.

Every data analysis requires a large number of decisions. From receiving the raw data, the analyst must decide what to do with missing or outlying values, which observations to include or exclude, whether any transformations of the data are required, how to code and combine categorical variables, how to define the outcome(s), and so forth. Each of these decisions leads to a different analysis, and if all possible analyses were enumerated there could be a myriad. Gelman and Loken called this the ‘garden of forking paths’ after the short story by Jorge Luis Borges. They identify it as the source of the problem called p-hacking. It’s not that researchers are conducting thousands of analyses and publishing the one with the statistically significant result, but that each decision along the way may be favourable towards finding a statistically significant result. Do the outliers go against what you were hypothesising? Exclude them. Is there a nice long tail of the distribution in the treatment group? Don’t take logs.
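The combinatorial explosion is easy to see: with only a handful of forks, the number of possible analyses multiplies quickly. A toy sketch (the decision names are invented for illustration):

```python
from itertools import product

# A toy 'garden of forking paths': each analysis decision is a fork, and a
# complete analysis is one path through all of them (names are illustrative).
decisions = {
    "outliers":       ["keep", "drop", "winsorise"],
    "missing_data":   ["complete_cases", "impute"],
    "transformation": ["none", "log"],
    "covariate_set":  ["minimal", "full"],
    "outcome_coding": ["binary", "count"],
}

paths = list(product(*decisions.values()))
print(len(paths))  # 3 * 2 * 2 * 2 * 2 = 48 distinct analyses from five decisions
```

Five seemingly innocuous decisions already yield 48 distinct analyses; a realistic analysis has many more forks than this.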

This article explores the garden of forking paths by getting a number of analysts to try to answer the same question with the same data set. The question was: are darker skinned soccer players more likely to receive a red card than their lighter skinned counterparts? The data set provided had information on league, country, position, skin tone (based on subjective rating), and previous cards. Unsurprisingly there was a wide range of results, with point estimates ranging from odds ratios of 0.89 to 2.93, and a similar range of standard errors. Looking at the list of analyses, I see a couple that I might have pursued, both producing vastly different results. The authors see this as demonstrating the usefulness of crowdsourcing analyses. At the very least it should be a stark warning to any analyst to be transparent about every decision and to consider its consequences.

Front-Door Versus Back-Door Adjustment With Unmeasured Confounding: Bias Formulas for Front-Door and Hybrid Adjustments With Application to a Job Training Program. Journal of the American Statistical Association. Published October 2018.

Econometricians love instrumental variables. Without any supporting evidence, I would be willing to conjecture it is the most widely used type of analysis in empirical economic causal inference. When the assumptions are met it is a great tool, but decent instruments are hard to come by. We’ve covered a number of unconvincing applications on this blog where the instrument might be weak or not exogenous, and some of my own analyses have been criticised (rightfully) on these grounds. But, and we often forget, there are other causal inference techniques. One of these, which I think is unfamiliar to most economists, is the ‘front-door’ adjustment. Consider the following diagram:

On the right is the instrumental variable type causal model. Provided Z satisfies an exclusion restriction, i.e. it is independent of U (and some other assumptions hold), it can be used to estimate the causal effect of A on Y. The front-door approach, on the left, shows a causal diagram where there is a post-treatment variable, M, unrelated to U, which causes the outcome Y. Pearl showed that, under a set of assumptions comparable to those for instrumental variables (the effect of A on Y is entirely mediated by M, and there are no common causes of A and M or of M and Y), M can be used to identify the causal effect of A on Y. This article discusses the front-door approach in the context of estimating the effect of a jobs training program (a favourite of James Heckman). The instrumental variable approach uses random assignment to the program, while the front-door analysis, in the absence of randomisation, uses program enrollment as its mediating variable. The paper considers the effect of the assumptions breaking down, and shows the front-door estimator to be fairly robust.
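The front-door logic can be checked with a quick simulation. The sketch below uses a linear Gaussian model with invented coefficients (not the paper’s job-training data): the naive regression of Y on A is badly confounded by U, while chaining the A-to-M and M-to-Y regressions recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Structural model (all coefficients invented for illustration):
U = rng.normal(size=n)                       # unmeasured confounder
A = 1.0 * U + rng.normal(size=n)             # treatment, confounded by U
M = 0.7 * A + rng.normal(size=n)             # mediator: A -> M, no arrow U -> M
Y = 0.5 * M + 1.0 * U + rng.normal(size=n)   # true effect of A on Y = 0.7 * 0.5 = 0.35

def ols(regressors, y):
    """OLS slope coefficients (intercept dropped)."""
    X = np.column_stack([np.ones(len(y))] + list(regressors))
    return np.linalg.lstsq(X, y, rcond=None)[0][1:]

naive = ols([A], Y)[0]         # biased: the back-door path A <- U -> Y is open
b_hat = ols([A], M)[0]         # effect of A on M
c_hat = ols([M, A], Y)[0]      # effect of M on Y; adjusting for A blocks M <- A <- U -> Y
front_door = b_hat * c_hat     # chain the two pieces together

print(round(naive, 2))         # far from the true 0.35
print(round(front_door, 2))    # close to the true 0.35
```

The two-step estimate works precisely because M receives no arrow from U: each link of the A-to-M-to-Y chain is identifiable even though the direct A-to-Y comparison is hopelessly confounded.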



Rita Faria’s journal round-up for 22nd October 2018


Economically efficient hepatitis C virus treatment prioritization improves health outcomes. Medical Decision Making [PubMed] Published 22nd August 2018

Hepatitis C treatment was in the news a couple of years ago when the new direct-acting antivirals first appeared on the scene. These drugs are very effective but also incredibly expensive. This prompted a flurry of cost-effectiveness analyses and discussions of the role of affordability in cost-effectiveness (my views here).

This compelling study by Lauren Cipriano and colleagues joins the debate by comparing various strategies to prioritise patients for treatment when the budget is not enough to meet patient demand. This is a clear example of the health losses due to the opportunity cost.

The authors compare the costs and health outcomes of various prioritisation schedules in terms of the number of patients treated, the distribution by severity and age, time to treatment, impact on end-stage liver disease, QALYs, costs and net benefit.

The differences between prioritisation schedules in terms of these various outcomes were remarkable. Reassuringly, the optimal prioritisation schedule on the basis of net benefit (the “optimisation” schedule) was the one that achieved the most QALYs and the greatest net benefit. This was even though the cost-effectiveness threshold did not reflect the opportunity cost, as it was set at $100,000 per QALY gained.

This study is fascinating. It shows how the optimal policy depends on what we are trying to maximise. The “first come first serve” schedule treats the most patients, but it is the “optimisation” schedule that achieves the most health benefits net of the opportunity cost.
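The intuition behind that result can be sketched with a toy budget-allocation example (all costs and QALY gains below are invented, and the study’s actual optimisation model is far richer):

```python
# Toy sketch of prioritisation under a fixed budget (numbers invented).
# Net benefit is valued at a threshold of $100,000 per QALY, as in the paper.
THRESHOLD = 100_000
BUDGET = 300_000

# (patient, treatment cost, expected QALY gain), in order of arrival
patients = [
    ("B", 100_000, 0.5),
    ("D", 80_000, 0.9),
    ("A", 150_000, 2.0),
    ("C", 120_000, 1.5),
]

def net_benefit(cost, qalys):
    return qalys * THRESHOLD - cost

def treat(order):
    """Treat patients in the given order until the budget runs out."""
    spent, treated, qalys = 0, 0, 0.0
    for _, cost, gain in order:
        if spent + cost <= BUDGET:
            spent += cost
            treated += 1
            qalys += gain
    return treated, qalys

fcfs_n, fcfs_q = treat(patients)  # first come, first served
opt_n, opt_q = treat(sorted(patients, key=lambda p: net_benefit(p[1], p[2]), reverse=True))
print(fcfs_n, fcfs_q)  # treats more patients...
print(opt_n, opt_q)    # ...but fewer, higher-benefit patients gain more QALYs overall
```

In this toy example, first-come-first-served treats three patients for 2.9 QALYs, while ranking by net benefit treats only two patients but yields 3.5 QALYs, mirroring the qualitative pattern the study reports.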

Since their purpose was not to compare treatments, the authors used a representative price depending on whether patients had progressed to cirrhosis. A future study could include a comparison between drugs, as our previous work found that there are clear differences in cost-effectiveness between treatment strategies. The more cost-effective the treatment strategies, the more patients can be treated with a given budget.

The authors made the Excel model available as supporting material, together with documentation. This is excellent practice! It disseminates the work and shows openness to independent validation. Well done!

Long-term survival and value of chimeric antigen receptor T-cell therapy for pediatric patients with relapsed or refractory leukemia. JAMA Pediatrics [PubMed] Published 8th October 2018

This fascinating study looks at the cost-effectiveness of tisagenlecleucel in the treatment of children with relapsed or refractory leukaemia compared to chemotherapy.

Tisagenlecleucel is the first chimeric antigen receptor T-cell (CAR-T) therapy. CAR-T therapy is the new kid on the block in cancer treatment. It involves modifying the patient’s own immune system cells to recognise and kill the patient’s cancer (see here for details). Such high-tech treatment comes with a hefty price tag. Tisagenlecleucel is listed at $475,000 for a one-off administration.

The key challenge was to obtain the effectiveness inputs under the chemotherapy option. This was because tisagenlecleucel has only been studied in single-arm trials and individual-level data were not available to the research team. The research team selected a single-arm study of outcomes with clofarabine monotherapy, since its patients at baseline were most similar, in terms of demographics and number of prior therapies, to those in the tisagenlecleucel study.

This study is brilliant in approaching a difficult decision problem and conducting extensive sensitivity analysis. In particular, it tests the impact of common drivers of the cost-effectiveness of potentially curative therapies in children, such as the discount rate, duration of benefit, treatment initiation, and the inclusion of future health care costs. Ideally, the sensitivity analysis would also have tested the assumption that the studies informing the effectiveness inputs for tisagenlecleucel and clofarabine monotherapy were comparable, and whether clofarabine monotherapy represents the current standard of care, although these would be difficult to parameterise.
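To see why the discount rate is such a strong driver for one-off curative therapies, consider how the present value of a long stream of QALY gains shrinks as the rate rises (a generic sketch with illustrative numbers, not the study’s model):

```python
# A cure in childhood can yield QALY gains for decades; discounting makes
# distant gains count for less. Illustrative: 1 QALY per year for 60 years.
def discounted_qalys(annual_qalys, years, rate):
    return sum(annual_qalys / (1 + rate) ** t for t in range(1, years + 1))

for rate in (0.015, 0.03, 0.05):
    print(f"{rate:.1%}: {discounted_qalys(1.0, 60, rate):.1f} discounted QALYs")
# The total roughly halves moving from a 1.5% to a 5% discount rate.
```

Because the upfront cost is paid in full today while the benefits accrue over a lifetime, the choice of discount rate can swing the cost-effectiveness result substantially, which is why it features so prominently in the sensitivity analysis.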

This outstanding study highlights the challenges posed by the approval of treatments based on single-arm studies. Had individual-level data been available, an adjusted comparison may have been possible, which would improve the degree of confidence in the cost-effectiveness of tisagenlecleucel. Regulators and trial sponsors should work together to make anonymised individual-level data available to bona fide researchers.

Researcher requests for inappropriate analysis and reporting: a U.S. survey of consulting biostatisticians. Annals of Internal Medicine [PubMed] Published 10th October 2018

This study reports a survey of biostatisticians on the frequency and severity of requests for inappropriate analysis and reporting. The results are stunning!

The top 3 requests in terms of severity were to falsify statistical significance to support a desired result, to change data to achieve the desired outcome, and to remove or alter data records to better support the research hypothesis. Fortunately, these sorts of requests appear to be rare.

The top 3 requests in terms of frequency were: not showing a plot because it does not show as strong an effect as hoped; stressing only the significant findings while under-reporting non-significant ones; and reporting results before data have been cleaned and validated.

Given the frequency and severity of the requests, the authors recommend that researchers should be better educated in good statistical practice and research ethics. I couldn’t agree more and would suggest that cost-effectiveness analysis is included, given that it informs policy decisions and it is generally conducted by multidisciplinary teams.

I’m now wondering what the responses would be if we ran a similar survey of health economists, particularly those working in health technology assessment! Something for HESG, iHEA or ISPOR to look at in the future?