Sam Watson’s journal round-up for 10th April 2017

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Expertise versus Bias in Evaluation: Evidence from the NIH. American Economic Journal: Applied Economics. Published April 2017.

As an academic’s career progresses, she learns two things: patience and how to deal with rejection. Getting a paper accepted by a top journal is hard. Obtaining funding for what seems like a good idea is similarly so. We sometimes convince ourselves that the system is rigged, or at least biased. Research funding bodies may make poor decisions. This paper considers this question in great detail. While reviewers may have an informational advantage that allows them to assess quality, they may also be biased towards projects in their own domain of expertise. More funding for health economics blogs! To assess this, the paper examines 100,000 applications to the US National Institutes of Health. The proximity of the reviewer to the subject area of the application is judged by the number of times the reviewer has cited the work of the applicant. Quality is judged by the number of publications and citations the research produces; an attempt is made to adapt this to judge unfunded work. The principal finding is that reviewers are both more informed and more biased about work in their own field. Each additional permanent reviewer in an applicant’s area is estimated to increase the chance of funding by 2.2 percent, an effect equivalent to increasing quality by one quarter of a standard deviation. These effects seem small, as the author notes, and what strikes me is how little of the variation in funding decisions these measures explain. Perhaps I will find some solace in the fact that there is quite a lot of apparent randomness in what gets funded. Nevertheless, the author argues that trying to reduce bias by using impartial reviewers will also reduce the ability to judge quality.

Long-term effects of youth unemployment on mental health: does an economic crisis make a difference? Journal of Epidemiology and Community Health. Published April 2017.

Unemployment is related to mental health issues. The effect appears to be particularly acute among young people, for whom the transition to adult life can be difficult. Indeed, at this vulnerable period young people also transition from youth to adult mental health services, which breaks their continuity of care. Many become lost in the system. Services in many areas are being redesigned in light of this. This paper asks if the effect of unemployment on youth mental health differs depending on the economic conditions. Do periods of high unemployment nationally exacerbate the effects of becoming unemployed? Surprisingly, the paper concludes, no, there is no difference. I say ‘surprisingly’ since I cannot recall finding a paper in this area, or one that has featured on this blog, with a negative finding. The analyses seem careful, and the authors concentrate on the magnitude of the effects, rather than statistical significance. Large sample sizes are required for adequate power to test a hypothesis on an interaction; this study does have a large sample size. The interactive effect is likely to be very small, not necessarily non-existent. But in comparison with the large effects of unemployment on youth mental health in general, the effect of economic conditions is of little importance. Nevertheless, Simpson’s paradox may rear its head here: during times of high unemployment, the cohort of the unemployed will be different. If those who only become unemployed during economic downturns have a lower risk of mental health issues, then this may attenuate the estimated effect of unemployment on mental health. This issue is unfortunately not addressed, but I don’t want that to detract from a sensible use of statistics.

The Distortionary Effects of Incentives in Government: Evidence from China’s ‘Death Ceiling’ Program. American Economic Journal: Applied Economics. Published April 2017.

Targets and incentives to achieve those targets can distort the actions of agents. This is especially true of difficult-to-observe outcomes. People may be more inclined to manipulate the data than to actually achieve the target. Gaming and other similar behaviours have been noted in health services, for example. This article examines a policy in China designed to reduce the high rates of accidental deaths. In 2004 the State Administration of Work Safety announced that provinces would have to reduce their rate of accidental deaths by 2.5% per year. The provinces were set a so-called ‘death ceiling’. In 2012, the policy was declared a success; accidental death rates had come down by 45% since 2005. But further examination of the data, which were made publicly available in the state newspaper the People’s Daily, suggests this may not be the case. First of all, there was a sharp discontinuity in reported accidental deaths right below the death ceiling. This discontinuity is not what one would expect of a continuous variable. Provinces had much discretion about how to achieve the reductions. Those that used significant incentives for local officials were more likely to be successful. The authors also consider why, if the data were manipulated, deaths weren’t made to look significantly below the death ceiling rather than just below it. They speculate that this would have the effect of making next year’s death ceiling even lower and more difficult to achieve. This paper provides a nice narrative that adds to our understanding of the perverse effects of incentives. For health services this is important. For many of the difficult-to-observe outcomes, like patient health, merely incentivising doctors and hospitals to improve may have little actual benefit.





What’s the significance of this?

A good illustration of the muddles that p-values can get us in appeared recently on HealthNewsReview. HealthNewsReview examines and debunks the often hyped-up claims about medicines that appear in the media. But last week they “called BS” on a claim on Novartis’ website for the drug Everolimus. Novartis claimed that in a recent trial Everolimus demonstrated benefits that were “not statistically significant but clinically meaningful.” HealthNewsReview writes:

When results aren’t statistically significant, researchers can’t be sufficiently confident that any benefit they observed is real. Such findings are considered speculative until confirmed by other studies.

Sometimes, a result that was initially “not significant” might well reach the threshold of significance in a bigger study group with more patients, which is what this promotional material seems to anticipate.

And they quote a biostatistician further on:

A result that is statistically insignificant is not meaningful, period. Thus, we cannot say a result is statistically insignificant and clinically meaningful at the same time.

A null hypothesis significance testing (NHST) framework aims to determine whether the data are compatible with a model in which the coefficient on the treatment is exactly zero. For the Everolimus trial, the test statistic did not hit the magical 1.96 threshold, and so it has been concluded that either the effect was exactly zero or there was insufficient power. Hence it is “not meaningful”.
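
To make the 1.96 threshold concrete, here is a minimal sketch of the calculation behind such a verdict. The estimate, standard error, and “clinically meaningful” threshold are all hypothetical values chosen for illustration; none of them is taken from the Everolimus trial, and a normal approximation is assumed.

```python
from statistics import NormalDist


def summarise(estimate, std_error, level=0.95):
    """Return the z statistic, two-sided p-value and confidence interval."""
    std_normal = NormalDist()
    z = estimate / std_error
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    crit = std_normal.inv_cdf(0.5 + level / 2)  # roughly 1.96 when level=0.95
    interval = (estimate - crit * std_error, estimate + crit * std_error)
    return z, p_value, interval


# Hypothetical treatment effect and standard error (illustrative only).
ESTIMATE = 0.30
STD_ERROR = 0.18
# Hypothetical smallest effect that would matter clinically.
CLINICALLY_MEANINGFUL = 0.25

z, p_value, (lower, upper) = summarise(ESTIMATE, STD_ERROR)
print(f"z = {z:.2f}, p = {p_value:.3f}, 95% CI = ({lower:.2f}, {upper:.2f})")
print("significant at the 5% level:", p_value < 0.05)
print("interval includes clinically meaningful effects:",
      upper >= CLINICALLY_MEANINGFUL)
```

With these made-up numbers the test statistic falls short of the threshold, yet the interval comfortably covers effects larger than the clinically meaningful value as well as zero. The data have not shown the effect to be nil; they are simply not precise enough to rule zero out.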

This is where the problems of NHST become obvious. Everolimus is an mTOR inhibitor, a class of drugs under active development for the treatment of cancer. Hyperactivation of mTOR signalling in cancer has been widely observed and various preclinical trials have shown promising results (see more here). So why one should expect it to have an effect of exactly zero, I can’t say.

Perhaps more importantly, this is where the irrelevance of inference rears its head. A decision to use Everolimus has to be made. It cannot be deferred and the only reason we use other treatments at this point in time is an accident of history. As we discussed recently, all that matters for these decisions is the (posterior) mean net benefits. In the US, though, cost information has been outlawed because of those pernicious “death panels”, and decisions are made on the basis of “comparative effectiveness”. But even in this case, the above comments on Everolimus do not follow this logic, seeming to imply: (i) in the absence of statistical significance we have learned nothing from the data to inform our decision; and (ii) we should only choose to implement technologies that have demonstrated statistical significance. If taken as true then we have no choice but to conflate clinical and statistical significance, since apparently we cannot conclude something is clinically significant unless it is also statistically significant. This goes against all sound advice.
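
A decision rule based on posterior mean net benefit can be sketched in a few lines. Everything here is an assumption made for illustration: the normal posteriors for incremental QALYs and costs, their parameters, and the willingness-to-pay value are invented and do not describe Everolimus or any real appraisal.

```python
import random
from statistics import mean

rng = random.Random(2017)   # fixed seed so the sketch is reproducible
N_DRAWS = 100_000
WTP = 30_000.0              # hypothetical willingness to pay per QALY

# Hypothetical posterior draws for the new treatment versus current care.
net_benefit = []
for _ in range(N_DRAWS):
    d_qaly = rng.gauss(0.10, 0.08)       # incremental QALYs
    d_cost = rng.gauss(2_000.0, 500.0)   # incremental cost
    net_benefit.append(WTP * d_qaly - d_cost)

expected_nb = mean(net_benefit)
prob_positive = sum(nb > 0 for nb in net_benefit) / N_DRAWS

# The decision depends only on the sign of the expected net benefit.
adopt = expected_nb > 0
print(f"expected incremental net benefit: {expected_nb:.0f}")
print(f"probability net benefit is positive: {prob_positive:.2f}")
print("adopt new treatment:", adopt)
```

Note that in this sketch a 95% interval for the QALY gain (0.10 plus or minus about 0.16) includes zero, so the effect would be labelled “not significant”, and yet the expected net benefit still favours adoption. The uncertainty is relevant to whether more research is worthwhile, not to which option to choose today.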

I write this without good knowledge of the original Everolimus trial. It may well be flawed. Industry-funded research can be biased. Indeed, it is the design and conduct of the trial that should be the basis for a reasonable critique of its findings, not its statistical significance.

HealthNewsReview is typically an excellent reviewer of claims derived from often flawed studies. Their conflation of statistical and clinical significance here, though, is by no means unique; many regulatory agencies around the world do the same. This just goes to show how the p-value can continue to distract from a sound decision-making process in health care.

Sam Watson’s journal round-up for 27th March 2017


The minimum legal drinking age and morbidity in the United States. Review of Economics and Statistics. Published 23rd February 2017.

Governments have tried multiple different policies to reduce the physical and social harms of alcohol consumption. In the United Kingdom, a minimum price per unit of alcohol has been investigated recently, and in 2003 opening times for licensed premises were extended. Neither policy was overwhelmingly judged to be an effective way to reduce harms. In the United States, the legal minimum age for purchasing alcohol is 21, notably higher than in other Western nations. This legal age resulted from the National Minimum Drinking Age Act of 1984, which threatened states with a reduction of 10% in their funding for federal highways if they did not raise the legal age to 21. The Act was ostensibly in response to evidence of increased traffic fatalities associated with a lower legal age. This study adds evidence to this ongoing debate. The legal cut-off provides a natural discontinuity for the authors to investigate. Regression discontinuity can be abused, with some researchers controlling inappropriately for high powers of the running variable, ‘forcing’ a difference to appear. This paper takes a more sensible approach, adopting a quadratic form. For some variables, such as ED admission for alcohol intoxication, the discontinuity is obvious, as you would expect. But for others, such as accidental injury or deliberate injury by another person, the difference is not so apparent if you ignore the fitted lines. One wonders then how much their effect size is driven by their functional form. The authors write that their model is to ‘determine if an increase in the morbidity rate visible in a figure is statistically significant’. Oh dear. Theoretically, the effect makes sense: alcohol does lead to physical and social harms. But I’m not convinced by the magnitude of the effect they’ve estimated; some sensitivity analyses wouldn’t have gone amiss.
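
The kind of sensitivity analysis I have in mind can be sketched as follows: fit a separate polynomial on each side of the cut-off and see how the estimated jump moves with the order of the polynomial. The data are simulated with a known jump and have nothing to do with the paper’s data; the sample size, noise level, and helper functions are all my own assumptions.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def polyfit(xs, ys, order):
    """Least-squares polynomial coefficients, lowest power first."""
    k = order + 1
    xtx = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    xty = [sum((x ** i) * y for x, y in zip(xs, ys)) for i in range(k)]
    return solve(xtx, xty)


# Simulated outcome against (age - 21) in years, with a known jump at zero.
rng = random.Random(0)
TRUE_JUMP = 0.5
N_SIDE = 400
left_x = [-2.0 * (i + 0.5) / N_SIDE for i in range(N_SIDE)]
right_x = [2.0 * (i + 0.5) / N_SIDE for i in range(N_SIDE)]
left_y = [1.0 + 0.3 * x + rng.gauss(0, 0.2) for x in left_x]
right_y = [1.0 + 0.3 * x + TRUE_JUMP + rng.gauss(0, 0.2) for x in right_x]

# The estimated discontinuity is the difference in intercepts at the cut-off.
estimates = {}
for order in (1, 2, 3, 4):
    jump = polyfit(right_x, right_y, order)[0] - polyfit(left_x, left_y, order)[0]
    estimates[order] = jump
    print(f"polynomial order {order}: estimated jump = {jump:.3f}")
```

If the estimates are stable across orders, as they should be in this well-behaved simulation, the functional form is doing little work. If they swing about, the headline effect owes a good deal to the choice of polynomial.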

A re-evaluation of fixed effect(s) meta-analysis. Journal of the Royal Statistical Society: Series A. Published 16th March 2017.

Meta-analysis is a frequently used method to combine results from multiple studies. Evidence synthesis is frequently required in health economic analyses to estimate parameters for models. Practitioners typically consider either ‘fixed effects’ or ‘random effects’ meta-analysis. The latter is used when it is assumed the estimated effects differ between studies, leading many authors to shun fixed effects analyses if there’s any suspicion of heterogeneity. But, as this article argues, there are multiple interpretations of fixed effects analyses. They can provide useful results even in the presence of between-study heterogeneity. There are three key assumptions about the parameters estimated in different studies. First, there could be the same common effect underlying all studies. Secondly, each study could have its own separate fixed effect. Or thirdly, each estimate is a draw from an underlying sampling distribution, an exchangeable parameters assumption. This latter assumption is the basis of random effects meta-analysis. The fixed effects meta-analysis estimator is consistent for the common effect parameter. For the multiple fixed effects assumption the fixed effects meta-analysis is a consistent estimator for the parameter that would have been estimated if the samples in each study were amalgamated. The key point of the paper is that under both the common effect and fixed effects assumptions the fixed effects meta-analysis estimator is useful.
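
For reference, the fixed effects estimator under discussion is the usual inverse-variance weighted average of the study estimates. A minimal sketch, using three hypothetical study estimates and standard errors invented for illustration:

```python
import math


def fixed_effects_meta(estimates, std_errors):
    """Inverse-variance weighted pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * est for w, est in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical study-level estimates and standard errors (illustrative only).
study_estimates = [0.10, 0.30, 0.20]
study_std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se = fixed_effects_meta(study_estimates, study_std_errors)
print(f"pooled estimate = {pooled:.3f} (SE {pooled_se:.3f})")
```

The calculation is the same whichever assumption one adopts; what changes is the interpretation. Under a common effect the pooled value estimates that single effect, while under separate fixed effects it estimates a precision-weighted average of the study-specific effects.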

Insurer competition in health care markets. Econometrica. Published 21st March 2017.

Given the gestation length of an economics paper, it is perhaps somewhat fortuitous that this one should land just as major health care market legislation is being discussed in the US. Health care provision differs notably between the US and other high income countries. Health care is predominantly left up to the market, with ‘consumers’ purchasing insurance or health care directly. This is despite it being long recognised that health care markets are likely to fail (see our recent piece on the late Kenneth Arrow). But a single payer system is politically unpalatable. The Affordable Care Act (ACA; Obamacare) aimed to ensure universal coverage of health care through a system of subsidies, regulations, and mandates. The ACA brought about changes to the insurance market, with a number of providers merging and consolidating. The consequences of these mergers may be deleterious, as increased monopoly power within states may lead to higher premiums, but equally increased monopsony power may mean lower prices negotiated with health care providers. This article attempts to simulate what will happen to premiums and health care prices when insurers of different sizes are removed from the market. I can’t give a fair review of the methods in the time I’ve had to read this paper, as there is a lot going on, including econometric models of household choice and game theoretic models of insurer bargaining. But I put it here as it appears at first glance to be a solid analysis of what is an incredibly large and complex market in the US, and it is likely worth more time to understand.