Sam Watson’s journal round-up for 2nd October 2017

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

The path to longer and healthier lives for all Africans by 2030: the Lancet Commission on the future of health in sub-Saharan Africa. The Lancet [PubMed] Published 13th September 2017

The African continent has the highest rates of economic growth, the fastest growing populations and rates of urbanisation, but also the highest burden of disease. The challenges for public health and health care provision are great. It is no surprise then that this Lancet commission on the future of health in Sub-Saharan Africa runs to 57 pages yet still has some notable absences. In the space of a few hundred words, it would be impossible to fully discuss the topics in this tome; these will appear in future blog posts. For now, I want to briefly discuss a lack of consideration of the importance of political economy in the Commission’s report. For example, the report notes the damaging effects of IMF and World Bank structural adjustment programs in the 70s and 80s. These led to a dismantling of much of the public sector in indebted African nations in order for them to qualify for further loans. However, these issues have not gone away. Despite strongly emphasising that countries in Africa must increase their health spending, the report does not mention that many countries spend much more on servicing debt than on public health and health care. Kenya, for example, will soon no longer qualify for aid as it becomes a middle-income country, and yet it spends almost twice as much servicing its debt (around $6 billion) as it does on health care (around $3 billion). Debt reform and relief may be a major step towards increasing health expenditure. The inequalities in access to basic health services reflect the disparities in income and wealth both between and within countries. The growth of slums across the continent is stark evidence of this. Residents of these communities, despite often facing the worst exposure to major disease risk factors, are often not recognised by authorities and cannot access health services. Even where health services are available there are still difficulties with access. 
A lack of regulation and oversight can lead to the growth of a rentier class within slums as those with access to small amounts of capital, land, or property act as petty landlords. So while some in slum areas can afford the fees for basic health services, the poorest still face a barrier even when services are available. These people are also those who have little access to decent water and sanitation or education and have the highest risk of disease. Finally, the lack of incentives for trained doctors and medical staff to work in poor or rural areas is also identified as a key problem. Many doctors either leave for wealthier countries or work in urban areas. Doctors are often a powerful interest group and can influence macro health policy, distorting it to favour richer urban areas. Political solutions are required, as well as the public health interventions more widely discussed. The Commission’s report is extensive and worth the time to read for anyone with an interest in the subject matter. What also becomes clear upon reading it is the lack of solid evidence on health systems and what works and does not work. From an economic perspective, much of the evidence pertaining to health system functioning and efficiency is still just the results from country-level panel data regressions, which tell us very little about what is actually happening. As a result, we can identify areas in need of reform but have very little idea of how to reform them.

The relationship of health insurance and mortality: is lack of insurance deadly? Annals of Internal Medicine [PubMed] Published 19th September 2017

One sure-fire way of increasing your chances of publishing in a top-ranked journal is to do something on a hot political topic. In the UK this has been seven-day services, as well as other issues relating to deficiencies of supply. In the US, health insurance is right up there, with the Republicans trying to repeal the Affordable Care Act, a.k.a. Obamacare. This paper systematically reviews the literature on the relationship between health insurance coverage and the risk of mortality. The theory is that health insurance permits access to medical services, and therefore to treatment and prevention measures that reduce the risk of death. Many readers will be familiar with the Oregon Health Insurance Experiment, in which the US state of Oregon allocated access to expanded Medicaid coverage by lottery, thereby creating an RCT. This experiment, which takes a top spot in the review, estimated that those who had ‘won’ the lottery had a mortality rate 0.032 percentage points lower than the ‘losers’, whose mortality rate was 0.8%; a relative reduction of around 4%. Similar results were found for the quasi-experimental studies included, and slightly larger effects were found in cohort follow-up studies. These effects are small. But then so is the baseline. Most of these studies only examined non-elderly, non-disabled people, who would otherwise not qualify for any other public health insurance. For people under 45 in the US, the leading cause of death is unintentional injury, and it’s only above this age that cancer becomes the leading cause of death. If you suffer major trauma in the US you will (for the most part) be treated in an ER whether insured or uninsured, even if you end up with a large bill afterwards. So it’s no surprise that the effects of insurance coverage on mortality are very small for these people. Mortality is probably an inappropriate endpoint for this study. 
Indeed, the Oregon experiment found that the biggest differences were in reduced out-of-pocket expenses and medical debt, and improved self-reported health. The review’s conclusion that, “The odds of dying among the insured relative to the uninsured is 0.71 to 0.97,” is seemingly unwarranted. If they want to make a political point about the need for insurance, they’re looking in the wrong place.
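To make the absolute-versus-relative distinction concrete, here is a minimal sketch of the arithmetic using only the two figures quoted above. The variable names and the per-100,000 scaling are illustrative choices, not something taken from the review.

```python
# Figures quoted in the text for the Oregon Health Insurance Experiment.
control_mortality_pct = 0.8     # mortality rate among lottery 'losers', in percent
absolute_reduction_pp = 0.032   # reduction among 'winners', in percentage points


def relative_reduction(baseline_pct: float, reduction_pp: float) -> float:
    """Relative reduction expressed as a fraction of the baseline rate."""
    return reduction_pp / baseline_pct


treated_mortality_pct = control_mortality_pct - absolute_reduction_pp
rel = relative_reduction(control_mortality_pct, absolute_reduction_pp)

# Illustrative rescaling only: the same absolute difference per 100,000 people.
averted_per_100k = absolute_reduction_pp / 100 * 100_000

print(f"Mortality among winners: {treated_mortality_pct:.3f}%")
print(f"Relative reduction: {rel:.1%}")
print(f"Difference per 100,000 people: {averted_per_100k:.0f}")
```

The same estimate sounds very different depending on whether it is reported as a relative change or as an absolute difference on a small baseline.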

Smoking, expectations, and health: a dynamic stochastic model of lifetime smoking behavior. Journal of Political Economy [RePEc] Published 24th August 2017

I’ve long been sceptical of mathematical models of complex health behaviours. The most egregious example is often the ‘rational addiction’ literature. Originating with the late Gary Becker, the rational addiction model, in essence, assumes that addiction is a rational choice made by utility maximising individuals, whose preferences alter with use of a particular drug. The biggest problem I find with this approach is that it is completely out of touch with the reality of addiction and drug dependence, and makes absurd assumptions about the preferences of addicts. Nevertheless, it has spawned a sizable literature. And one may argue that the model is useful if it makes accurate predictions, regardless of the assumptions underlying it. On this front, I have yet to be convinced. This paper builds a rational addiction-type model for smoking to examine whether learning of one’s health risks reduces smoking. As an illustration of why I dislike this method of understanding addictive behaviours, the authors note that “…the model cannot explain why individuals start smoking. […] The estimated preference parameters in the absence of a chronic illness suggest that, for a never smoker under the age of 25, there is no incentive to begin smoking because the marginal utility of smoking is negative.” But for many, social and cultural factors explain quite simply why young people start smoking. The weakness of the deductive approach to social science seems to rear its head, but as I said, the aim here may be the development of good predictive models. And the model does appear to predict smoking behaviour well. However, it is all in-sample prediction, and with the number of parameters it is not surprising it predicts well. This discussion is not meant to be completely excoriating. 
What is interesting is the discussion and attempt to deal with the endogeneity of smoking – people in poor health may be more likely to smoke and so the estimated effects of smoking on longevity may be overestimated. As a final point of contention though, I’m still trying to work out what the “addictive stock of smoking capital” is.


The effect of spending cuts on teen pregnancy. Journal of Health Economics [PubMed] Published July 2017

High teenage pregnancy rates are an important concern that features high in many countries’ social policy agendas. In the UK, a country which has one of the highest teen pregnancy rates in the world, efforts to tackle the issue have been spearheaded by the Teenage Pregnancy Strategy, an initiative aiming to halve under-18 pregnancy rates by offering access to sex education and contraception. However, the recent spending cuts have led to reductions in grants to local authorities, many of which have, in turn, limited or cut a number of teenage pregnancy-related programmes. This has led to vocal opposition by politicians and organisations, who argue that cuts jeopardise the reductions in teenage pregnancy rates seen in previous years. In this paper, Paton and Wright set out to examine whether this is the case; that is, whether cuts to Teenage Pregnancy Strategy-related services have had an impact on teenage pregnancy rates. To do so, the authors used panel data from 149 local authorities in England collected between 2009 and 2014. To capture changes in teenage pregnancy rates across local authorities over the specified period, the authors used a fixed effects model which assumed that under-18 conception rates are a function of annual expenditure on teenage pregnancy services per female aged 13-17 in the local authority, and a set of other socioeconomic variables acting as controls. Area and year dummies were also included in the model to account for unobservable effects that relate to particular years and localities, and a number of additional analyses were run to get around spurious correlations between expenditure and pregnancy rates. Overall, findings showed that areas which implemented bigger cuts to teenage pregnancy-targeting programmes have, on average, seen larger drops in teenage pregnancy rates. However, these drops are, in absolute terms, small (e.g. a 10% reduction in expenditure is associated with a 0.25% decrease in teenage conception rates). 
Various explanations can be put forward to interpret these findings, one of which is that cuts might have trimmed off superfluous or underperforming elements of the programme. If this is the case, Paton and Wright’s findings offer some support to arguments that spending cuts may not always be bad for the public.
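The kind of two-way fixed effects specification described above can be sketched as follows. The notation is illustrative and not the authors’ own, and the log-log form is an assumption, chosen only because it lets the coefficient be read as an elasticity.

```latex
\ln(\text{ConceptionRate}_{it})
  = \beta \, \ln(\text{Spend}_{it})
  + \mathbf{x}_{it}'\boldsymbol{\gamma}
  + \alpha_i + \lambda_t + \varepsilon_{it}
```

Here $i$ indexes local authorities and $t$ years, $\text{Spend}_{it}$ is expenditure on teenage pregnancy services per female aged 13-17, $\mathbf{x}_{it}$ holds the socioeconomic controls, and $\alpha_i$ and $\lambda_t$ are the area and year effects. Under this assumed form, the reported association (a 10% cut in spending alongside a 0.25% fall in conceptions) would correspond to a $\beta$ of roughly 0.025.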

Young adults’ experiences of neighbourhood smoking-related norms and practices: a qualitative study exploring place-based social inequalities in smoking. Social Science & Medicine [PubMed] Published September 2017

Smoking is a universal problem affecting millions of people around the world and Canada’s young adults are no exception. As in most countries, smoking prevalence and initiation are highest amongst young groups, which is bad news, as many people who start smoking at a young age continue to smoke throughout adulthood. Evidence suggests that there is a strong socioeconomic gradient in smoking, which can be seen in the fact that smoking prevalence is unequally distributed according to education and neighbourhood-level deprivation, being a greater problem in more deprived areas. This offers an opportunity for local-level interventions that may be more effective than national strategies. However, to design such interventions, policy makers need to understand how neighbourhoods might shape, encourage or tolerate certain attitudes towards smoking. To understand this, Glenn and colleagues saw smoking as a practice that is closely related to local smoking norms and social structures, and sought to get young adult smokers’ views on how their neighbourhood affects their attitudes towards smoking. Within this context, the authors carried out a number of focus groups with young adult smokers who lived in four different neighbourhoods, during which they asked questions such as “do you think your neighbourhood might be encouraging or discouraging people to smoke?” Findings showed that some social norms, attitudes and practices were common among neighbourhoods of the same SES. Participants from low-SES neighbourhoods reported more tolerant and permissive local smoking norms, whereas in more affluent neighbourhoods, participants felt that smoking was more contained and regulated. While young smokers from high-SES neighbourhoods expressed some degree of alignment and agency with local smoking norms and practices, smokers in low-SES neighbourhoods described smoking as inevitable in their neighbourhood. 
Of interest is how individuals living in different SES areas saw anti-smoking regulations: while young smokers in affluent areas advocated social responsibility (and downplayed the role of regulations), their counterparts in poorer areas called for more protection and spoke in favour of greater government intervention and smoking restrictions. Glenn and colleagues’ findings serve to highlight the importance of context in designing public health measures, especially when such measures affect different groups in entirely different ways.

Cigarette taxes, smoking—and exercise? Health Economics [PubMed] Published August 2017

Evidence suggests that rises in cigarette taxes have a positive effect on smoking reduction and/or cessation. However, it is also plausible that the effect of tax hikes extends beyond smoking, to decisions about exercise. To explore whether this proposition is supported by empirical evidence, Conway and Niles put together a simple conceptual framework, which assumes that individuals aim to maximise the utility they get from exercise, smoking, health (or weight management) and other goods subject to market inputs (e.g. medical care, diet aids) and time and budget constraints. Much of the data for this analysis came from the Behavioral Risk Factor Surveillance System (BRFSS) in the US, which includes survey participants’ demographic characteristics (age, gender), as well as answers to questions about physical activities and exercise (e.g. intensity and time per week spent on activities) and smoking behaviour (e.g. current smoking status, number of cigarettes smoked per day). Survey data were subsequently combined with changes in cigarette taxes and other state-level variables. Conway and Niles’s results suggest that increased cigarette costs reduce both smoking and exercise, with the decline in exercise being more pronounced among heavy and regular smokers. However, the direction of the effect varied according to one’s age and smoking experience (e.g. higher cigarette cost increased physical activity among recent quitters), which highlights the need for caution in drawing conclusions about the exact mechanism that underpins this relationship. Encouraging smoking cessation and promoting physical exercise are important and desirable public health objectives, but, as Conway and Niles’s findings suggest, pursuing both of them at the same time may not always be feasible.


Widespread misuse of statistical significance in health economics

Despite widespread cautionary messages, p-values and claims of statistical significance are continually misused. One of the most common errors is to mistake statistical significance for economic, clinical, or political significance. This error may manifest itself by authors interpreting only ‘statistically significant’ results as important, or even neglecting to examine the magnitude of estimated coefficients. For example, we’ve written previously about a claim that statistically insignificant results are ‘meaningless’. Another common error is to ‘transpose the conditional’, that is to interpret the p-value as the posterior probability of a null hypothesis. For example, in an exchange on Twitter recently, David Colquhoun, whose discussions of p-values we’ve also previously covered, made the statement:

However, the p-value does not provide the probability of, or evidence for or against, a null hypothesis (and so does not show that an effect ‘exists’). P-values are correlated with the posterior probability of the null hypothesis in a way that depends on statistical power, choice of significance level, and prior probability of the null. But observing a significant p-value only means that the data were unlikely to be produced by a particular model, not that the alternative hypothesis is true. Indeed, the null hypothesis may be a poor explanation for the observed data, but that does not mean the alternative is a better one. This is the essence of Lindley’s paradox.
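To illustrate that dependence on power, significance level, and prior, here is a minimal sketch applying Bayes’ rule to the event ‘the test came out significant’. It conditions on p falling below the threshold rather than on the exact p-value observed, and the function name and example inputs are hypothetical.

```python
def prob_null_given_significant(prior_null: float, alpha: float, power: float) -> float:
    """P(H0 is true | result is significant), by Bayes' rule.

    alpha = P(significant | H0 true), the significance level
    power = P(significant | H0 false)
    """
    false_positive = alpha * prior_null
    true_positive = power * (1.0 - prior_null)
    return false_positive / (false_positive + true_positive)


# Hypothetical scenarios: (prior probability of the null, statistical power)
for prior_null, power in [(0.5, 0.8), (0.9, 0.8), (0.9, 0.2)]:
    post = prob_null_given_significant(prior_null, alpha=0.05, power=power)
    print(f"prior P(H0)={prior_null:.1f}, power={power:.1f} "
          f"-> P(H0 | significant)={post:.2f}")
```

With a 5% threshold in every case, the probability that the null is true given a significant result ranges from about 0.06 to about 0.69 across these three scenarios, so ‘p < 0.05’ on its own says little about it.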

So what can we say about p-values? The six principles of the ASA’s statement on p-values are:

  1. P-values can indicate how incompatible the data are with a specified statistical model.
  2. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone.
  3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
  4. Proper inference requires full reporting and transparency.
  5. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
  6. By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis.
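Principle 5 can be illustrated with a small sketch: the same estimated effect gives very different p-values as the sample size changes. The numbers are made up, and the test is a simple two-sided z-test of a sample mean against zero, assuming a known standard deviation.

```python
import math


def two_sided_p(estimate: float, sd: float, n: int) -> float:
    """Two-sided p-value for a sample mean against zero (normal approximation)."""
    z = estimate / (sd / math.sqrt(n))
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical data summary: identical effect and spread, only n differs.
effect, sd = 0.1, 1.0
for n in (100, 1_000, 10_000):
    print(f"n={n:>6}: effect={effect}, p={two_sided_p(effect, sd, n):.4f}")
```

The effect is 0.1 in every row; only the precision changes, which is why the p-value cannot stand in for the magnitude or importance of the estimate.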


In 1996, Deirdre McCloskey and Stephen Ziliak surveyed economics papers published in the American Economic Review in the 1980s for p-value misuse. Overall, 70% did not distinguish statistical from economic significance and 96% misused a test statistic in some way. Things hadn’t improved when they repeated the study ten years later. Unfortunately, these problems are not exclusive to the AER. A quick survey of a top health economics journal, Health Economics, finds similar misuse as we discuss below. This journal is not singled out for any particular reason beyond that it’s one of the key journals in the field covered by this blog, and frequently features in our journal round-ups. Similarly, no comment is made on the quality of the studies or authors beyond the claims and use of statistical significance. Nevertheless, where there are p-values, there are problems. For such a pivotal statistic, one on which careers can be made or broken, we should at least get it right!

Nine studies were published in the May 2017 issue of Health Economics. The list below shows some examples of p-value errors in the text of the articles. The most common issue was using the p-value to interpret whether an effect exists or not, or using it as the (only) evidence to support or reject a particular hypothesis. As described above, the statistical significance of a coefficient does not imply the existence of an effect. Some of the statements claimed below to be erroneous may be contentious as, in the broader context of the paper, they may make sense. For example, claiming that a statistically significant estimate is evidence of an effect may be correct where the broader totality of the evidence suggests that any observed data would be incompatible with a particular model. However, this is generally not the way the p-values are used.

Examples of p-value (mis-)statements

Even the CMI has no statistically significant effect on the facilitation ratio. Thus, the diversity and complexity of treated patients do not play a role for the subsidy level of hospitals.

the coefficient for the baserate is statistically significant for PFP hospitals in the FE model, indicating that a higher price level is associated with a lower level of subsidies.

Using the GLM we achieved nine significant effects, including, among others, Parkinson’s disease and osteoporosis. In all components we found more significant effects compared with the GLM approach. The number of significant effects decreases from component 2 (44 significant effects) to component 4 (29 significant effects). Although the GLM lead to significant results for intestinal diverticulosis, none of the component showed equivalent results. This might give a hint that taking the component based heterogeneity into account, intestinal diverticulosis does not significantly affect costs in multimorbidity patients. Besides this, certain coefficients are significant in only one component.

[It is unclear what ‘significant’ and ‘not significant’ refer to or how they are calculated but appear to refer to t>1.96. Not clear if corrections for multiple comparisons.]

There is evidence of upcoding as the coefficient of spreadp_pos is statistically significant.

Neither [variable for upcoding] is statistically significant. The incentive for upcoding is, according to these results, independent of the statutory nature of hospitals.

The checkup significantly raises the willingness to pay any positive amount, although it does not significantly affect the amount reported by those willing to pay some positive amount.

[The significance is with reference to statistical significance].

Similarly, among the intervention group, there were lower probabilities of unhappiness or depression (−0.14, p = 0.045), being constantly under strain (0.098, p = 0.013), and anxiety or depression (−0.10, p = 0.016). There was no difference between the intervention group and control group 1 (eligible non-recipients) in terms of the change in the likelihood of hearing problems (p = 0.64), experiencing elevate blood pressure (p = 0.58), and the number of cigarettes smoked (p = 0.26).

The ∆CEs are also statistically significant in some educational categories. At T + 1, the only significant ∆CE is observed for cancer survivors with a university degree for whom the cancer effect on the probability of working is 2.5 percentage points higher than the overall effect. At T + 3, the only significant ∆CE is observed for those with no high school diploma; it is 2.2 percentage points lower than the overall cancer effect on the probability of working at T + 3.

And, just for balance, here are a couple from this year’s winner of the Arrow prize at iHEA, which gets bonus points for the phrase ‘marginally significant’, which can be used both to confirm and refute a hypothesis depending on the inclination of the author:

Our estimated net effect of waiting times for high-income patients (i.e., adding the waiting time coefficient and the interaction of waiting times and high income) is positive, but only marginally significant (p-value 0.055).

We find that patients care about distance to the hospital and both of the distance coefficients are highly significant in the patient utility function.


As we’ve argued before, p-values should not be the primary result reported. Their interpretation is complex and so often leads to mistakes. Our goal is to understand economic systems and to determine the economic, clinical, or policy relevant effects of interventions or modifiable characteristics. The p-value does provide some useful information but not enough to support the claims made from it.
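As a sketch of what reporting magnitude and uncertainty might look like, the snippet below backs out an approximate confidence interval from a point estimate and its two-sided p-value. It assumes the test statistic is approximately standard normal under the null, which may not match the test actually used in any given paper; the function name is hypothetical, and the inputs are the estimate and p-value from one of the quoted examples above (−0.14, p = 0.045).

```python
from statistics import NormalDist


def ci_from_p(estimate: float, p_value: float, level: float = 0.95):
    """Approximate confidence interval implied by an estimate and a two-sided p-value.

    Assumes estimate / standard error is approximately standard normal under the null.
    """
    std_normal = NormalDist()
    z_observed = std_normal.inv_cdf(1 - p_value / 2)   # |z| implied by the p-value
    standard_error = abs(estimate) / z_observed
    z_critical = std_normal.inv_cdf(1 - (1 - level) / 2)
    return estimate - z_critical * standard_error, estimate + z_critical * standard_error


low, high = ci_from_p(-0.14, 0.045)
print(f"estimate -0.14, approximate 95% CI ({low:.3f}, {high:.3f})")
```

Under these assumptions the interval runs from very close to zero to roughly twice the point estimate, which says far more about what the data can and cannot rule out than the label ‘significant’ does.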