Sam Watson’s journal round-up for 26th November 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Alcohol and self-control: a field experiment in India. American Economic Review Forthcoming

Addiction is complex. For many people it is characterised by a need or compulsion to take something, often to prevent withdrawal, often in conflict with a desire to not take it. This conflicts with Gary Becker’s much-maligned rational theory of addiction, which views the addiction as a choice to maximise utility in the long term. Under Becker’s model, one could use market-based mechanisms to end repeated, long-term drug or alcohol use: if the cost of continuing to use were made higher, people would choose to stop. This has led to the development of interventions like conditional payment or cost mechanisms, in which a user receives a payment on condition of sobriety. Previous studies, however, have found little evidence people would be willing to pay for such sobriety contracts. This article reports a randomised trial among rickshaw drivers in Chennai, India, a group with a high prevalence of heavy alcohol use and dependency. The three trial arms consisted of a control arm that received an unconditional daily payment, a treatment arm that received a small payment plus extra if they passed a breathalyser test, and a third arm that had the choice between the two payment mechanisms. Two findings are of much interest. First, the incentive payments significantly increased daytime sobriety. Second, over half the participants preferred the conditional sobriety payments over the unconditional payments when the conditional payments were weakly dominated, and a third still preferred them even when the unconditional payments were higher than the maximum possible conditional payment. This conflicts with a market-based conception of addiction and its treatment. Indeed, the nature of addiction means it can override all intrinsic motivation to stop, or do anything else frankly. So it makes sense that individuals are willing to pay for extrinsic motivation, which in this case did make a difference.

Heterogeneity in long term health outcomes of migrants within Italy. Journal of Health Economics [PubMed] [RePEc] Published 2nd November 2018

We’ve discussed neighbourhood effects a number of times on this blog (here and here, for example). In the absence of a randomised allocation to different neighbourhoods or areas, it is very difficult to discern why people living there or who have moved there might be better or worse off than elsewhere. This article is another neighbourhood effects analysis, this time framed through the lens of immigration. It looks at those who migrated within Italy in the 1970s during a period of large northward population movements. The authors, in essence, identify the average health and mental health of people who moved to different regions conditional on duration spent in origin destinations and a range of other factors. The analysis is conceptually similar to that of two papers we discussed at length on internal migration in the US and labour market outcomes in that it accounts for the duration of ‘exposure’ to poorer areas and differences between destinations. In the case of the labour market outcomes papers, the analysis couldn’t really differentiate between a causal effect of a neighbourhood increasing human capital, differences in labour market conditions, and unobserved heterogeneity between migrating people and families. Now this article examining Italian migration looks at health outcomes, such as the SF-12, which limit the explanations since one cannot ‘earn’ more health by moving elsewhere. Nevertheless, the labour market can still impact upon health strongly.

The authors carefully discuss the difficulties in identifying causal effects here. A number of model extensions are also estimated to try to deal with some of the issues discussed. These include a type of propensity score weighting approach, although I would emphasise that this categorically does not deal with issues of unobserved heterogeneity. A finite mixture model is also estimated. Generally, it is a well-thought-through analysis. However, there is a reliance on statistical significance here. I know I do bang on about statistical significance a lot, but it is widely used inappropriately. A rule of thumb I’ve adopted for reviewing papers for journals is that if the conclusions would change if you changed the statistical significance threshold then there’s probably an issue. This article would fail that test. They use a threshold of p<0.10, which seems inappropriate for an analysis with a sample size in the tens of thousands, and they build a concluding narrative around what is and isn’t statistically significant. This is not to detract from the analysis, merely its interpretation. In future, this could be helped by banning asterisks in tables, like the AER has done, or better yet developing submission guidelines around the use of statistical significance.
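To make the rule of thumb concrete, here is a minimal sketch using made-up numbers (none of them come from the paper). It assumes a simple two-sample z-test with equal group sizes, and shows how a very small standardised difference can pass a p<0.10 threshold with a large sample while failing stricter ones. The function name and all values are hypothetical.

```python
import math

def two_sided_p_value(difference, sd, n_per_group):
    """Two-sided p-value for a difference in means between two equal-sized groups (z-test)."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    z = difference / standard_error
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical values: a difference of 0.018 standard deviations, 20,000 people per group.
p = two_sided_p_value(difference=0.018, sd=1.0, n_per_group=20_000)

# The "finding" survives or disappears depending only on the threshold chosen.
for threshold in (0.10, 0.05, 0.01):
    print(f"threshold {threshold:.2f}: significant = {p < threshold}")
```

If the concluding narrative flips between these thresholds, as it would here, that is a sign that the narrative is leaning on the threshold rather than on the size of the estimate.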


Sam Watson’s journal round-up for 12th November 2018

Estimating health opportunity costs in low-income and middle-income countries: a novel approach and evidence from cross-country data. BMJ Global Health. Published November 2017.

The relationship between health care expenditure and population health outcomes is a topic that comes up often on this blog. Understanding how population health changes in response to increases or decreases in the health system budget is a reasonable way to set a cost-effectiveness threshold. Purchasing things above this threshold will, on average, displace activity with greater benefits. But identifying this effect is hard. Commonly papers use some kind of instrumental variable method to try to get at the causal effect with aggregate, say country-level, data. These instruments, though, can be controversial. Years ago I tried to articulate why I thought using socio-economic variables as instruments was inappropriate. I also wrote a short paper a few years ago, which remains unpublished, that used international commodity price indexes as an instrument for health spending in Sub-Saharan Africa, where commodity exports are a big driver of national income. This was rejected from a journal because of the choice of instruments. Commodity prices may well influence other things in the country that can influence population health. And a similar critique could be made of this article here, which uses consumption:investment ratios and military expenditure in neighbouring countries as instruments for national health expenditure in low and middle income countries.

I remain unconvinced by these instruments. The paper doesn’t present validity checks on them, which is forgivable given medical journal word limitations, but does mean it is hard to assess. In any case, consumption:investment ratios change in line with the general macroeconomy – in an economic downturn this should change (assuming savings = investment) as people switch from consumption to investment. There are a multitude of pathways through which this will affect health. Similarly, neighbouring military expenditure would act by displacing own-country health expenditure towards military expenditure. But for many regions of the world, there has been little conflict between neighbours in recent years. And at the very least there would be a lag on this effect. Indeed, of all the models of health expenditure and population health outcomes I’ve seen, barely a handful take into account dynamic effects.

Now, I don’t mean to let the perfect be the enemy of the good. I would never have suggested this paper should not be published as it is, at the very least, important for the discussion of health care expenditure and cost-effectiveness. But I don’t feel there is strong enough evidence to accept these as causal estimates. I would even be willing to go as far as to say that any mechanism that affects health care expenditure is likely to affect population health by some other means, since health expenditure is typically decided in the context of the broader public sector budget. That’s without considering what happens with private expenditure on health.

Strategic Patient Discharge: The Case of Long-Term Care Hospitals. American Economic Review. [RePEc] Published November 2018.

An important contribution of health economics has been to undermine people’s trust that doctors act in their best interest. Perhaps that’s a little facetious; nevertheless, there has been ample demonstration that health care providers will often act in their own self-interest. Often this is due to trying to maximise revenue by gaming reimbursement schemes, but it also includes things like doctors acting differently near the end of their shift so they can go home on time. So when I describe a particular reimbursement scheme that Medicare in the US uses, I don’t think there’ll be any doubt about the results of this study of it.

In the US, long-term acute care hospitals (LTCHs) specialise in treating patients with chronic care needs who require extended inpatient stays. Medicare reimbursement typically works on a fixed rate for each of many diagnosis-related groups (DRGs), but given the longer and more complex care needs in LTCHs, they get a higher tariff. To discourage admitting patients purely to get higher levels of reimbursement, the bulk of the payment only kicks in after a certain length of stay. Like I said – you can guess what happened.

This article shows that 26% of patients are discharged in the three days after the length of stay threshold compared to just 7% in the three days prior. This pattern is most strongly observed in discharges to home, and is not present in patients who die. But it may still be just by chance that the threshold and these discharges coincide. Fortunately for the authors, the thresholds differ between DRGs and even move around within a DRG over time in a way that appears unrelated to actual patient health. They therefore estimate a set of decision models for patient discharge to try to estimate the effect of different reimbursement policies.

Estimating misreporting in condom use and its determinants among sex workers: Evidence from the list randomisation method. Health Economics. Published November 2018.

Working on health and health care research, especially if you conduct surveys, means you often want to ask people about sensitive topics. These could include sex and sexuality, bodily function, mood, or other ailments. For example, I work a fair bit on sanitation, where frequently self-reported diarrhoea in under fives (reported by the mother that is) is the primary outcome. This could be poorly reported, particularly if an intervention includes any kind of educational component that suggests it could be the mother’s fault for, say, not washing her hands, if the child gets diarrhoea. This article looks at condom use among female sex workers in Senegal, another potentially sensitive topic, since unprotected sex is seen as risky. To try and get at the true prevalence of condom use, the authors use a ‘list randomisation’ method. This randomises survey participants to two sets of questions: a set of non-sensitive statements, or the same set of statements with the sensitive question thrown in. All respondents have to do is report how many of the statements they agree with. This means it is generally not possible to identify any individual’s response to the sensitive question, but the difference in the average number of statements reported between the two groups gives an unbiased estimator for the population proportion. Neat, huh? Ultimately the authors report an estimate of 80% of sex workers using condoms, which compares to the 97% who said they used a condom when asked directly.
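The difference-in-means logic behind list randomisation can be sketched with a small simulation. This is only an illustration under stated assumptions, not the authors’ analysis: the four non-sensitive statements, the 50% chance of agreeing with each, the sample size, and the function names are all hypothetical, and the true prevalence is set to 0.8 simply to echo the estimate reported above.

```python
import random
from statistics import mean

def simulate_list_experiment(n, true_prevalence, n_control_items=4, seed=0):
    """Simulate counts of agreed statements for the two randomised groups."""
    rng = random.Random(seed)
    control_counts, treatment_counts = [], []
    for _ in range(n):
        # Number of non-sensitive statements this respondent agrees with
        # (assumed: each agreed with independently with probability 0.5).
        base = sum(rng.random() < 0.5 for _ in range(n_control_items))
        # Whether the sensitive statement is true for this respondent.
        sensitive = rng.random() < true_prevalence
        if rng.random() < 0.5:
            control_counts.append(base)                # short list only
        else:
            treatment_counts.append(base + sensitive)  # short list plus sensitive item
    return control_counts, treatment_counts

def list_estimate(control_counts, treatment_counts):
    """Estimated prevalence: difference in mean counts between the groups."""
    return mean(treatment_counts) - mean(control_counts)

control, treatment = simulate_list_experiment(n=200_000, true_prevalence=0.8)
estimate = list_estimate(control, treatment)
print(f"estimated prevalence: {estimate:.3f}")
```

Only the counts are ever recorded, so no individual answer to the sensitive item is observed. The price is a noisier estimate than direct questioning would give, which is why the simulated sample here is so large.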



Sam Watson’s journal round-up for 29th October 2018

Researcher Requests for Inappropriate Analysis and Reporting: A U.S. Survey of Consulting Biostatisticians. Annals of Internal Medicine. [PubMed] Published October 2018.

I have spent a fair bit of time masquerading as a statistician. While I frequently try to push for Bayesian analyses where appropriate, I have still had to do Frequentist work including power and sample size calculations. In principle these power calculations serve a good purpose: if the study is likely to produce very uncertain results it won’t contribute much to scientific knowledge and so won’t justify its cost. It can indicate that a two-arm trial would be preferred over a three-arm trial despite losing an important comparison. But many power analyses, I suspect, are purely for show; all that is wanted is the false assurance of some official looking statistics to demonstrate that a particular design is good enough. Now, I’ve never worked on economic evaluation, but I can imagine that the same pressures can sometimes exist to achieve a certain result. This study presents a survey of 400 US-based statisticians, which asks them how frequently they are asked to do some inappropriate analysis or reporting and to rate how egregious the request is. For example, the most severe request is thought to be to falsify statistical significance. But it includes common requests like to not show plots as they don’t reveal an effect as significant as thought, to downplay ‘insignificant’ findings, or to dress up post hoc power calculations as a priori analyses. I would think that those responding to this survey are less likely to be those who comply with such requests and the survey does not ask them if they did. But it wouldn’t be a big leap to suggest that there are those who do comply, career pressures being what they are. We already know that statistics are widely misused and misreported, especially p-values. Whether this is due to ignorance or malfeasance, I’ll let the reader decide.

Many Analysts, One Data Set: Making Transparent How Variations in Analytic Choices Affect Results. Advances in Methods and Practices in Psychological Science. [PsyArXiv] Published August 2018.

Every data analysis requires a large number of decisions. From receiving the raw data, the analyst must decide what to do with missing or outlying values, which observations to include or exclude, whether any transformations of the data are required, how to code and combine categorical variables, how to define the outcome(s), and so forth. Each of these decisions leads to a different analysis, and if all possible analyses were enumerated there could be a myriad. Gelman and Loken called this the ‘garden of forking paths’ after the short story by Jorge Luis Borges, who explored this idea. Gelman and Loken identify this as the source of the problem called p-hacking. It’s not that researchers are conducting thousands of analyses and publishing the one with the statistically significant result, but that each decision along the way may be favourable towards finding a statistically significant result. Do the outliers go against what you were hypothesising? Exclude them. Is there a nice long tail of the distribution in the treatment group? Don’t take logs.

This article explores the garden of forking paths by getting a number of analysts to try to answer the same question with the same data set. The question was, are darker skinned soccer players more likely to receive a red card than their lighter skinned counterparts? The data set provided had information on league, country, position, skin tone (based on subjective rating), and previous cards. Unsurprisingly there was a wide range of results, with point estimates ranging from odds ratios of 0.89 to 2.93, with a similar range of standard errors. Looking at the list of analyses, I see a couple that I might have pursued, both producing vastly different results. The authors see this as demonstrating the usefulness of crowdsourcing analyses. At the very least it should be a stark warning to any analyst to be transparent with every decision and to consider its consequences.

Front-Door Versus Back-Door Adjustment With Unmeasured Confounding: Bias Formulas for Front-Door and Hybrid Adjustments With Application to a Job Training Program. Journal of the American Statistical Association. Published October 2018.

Econometricians love instrumental variables. Without any supporting evidence, I would be willing to conjecture it is the most widely used type of analysis in empirical economic causal inference. When the assumptions are met it is a great tool, but decent instruments are hard to come by. We’ve covered a number of unconvincing applications on this blog where the instrument might be weak or not exogenous, and some of my own analyses have been criticised (rightfully) on these grounds. But, and we often forget, there are other causal inference techniques. One of these, which I think is unfamiliar to most economists, is the ‘front-door’ adjustment. Consider the following diagram:

[Figure: causal diagrams for the front-door approach (left) and the instrumental variable approach (right)]

On the right is the instrumental variable type causal model. Provided Z is independent of U and satisfies an exclusion restriction (and some other assumptions hold), it can be used to estimate the causal effect of A on Y. The front-door approach, on the left, shows a causal diagram where there is a post-treatment variable, M, unrelated to U, and which causes the outcome Y. Pearl showed that, under a set of assumptions similar to those for instrumental variables, namely that the effect of A on Y is entirely mediated by M and that there are no unobserved common causes of A and M or of M and Y, M can be used to identify the causal effect of A on Y. This article discusses the front-door approach in the context of estimating the effect of a jobs training program (a favourite of James Heckman). The instrumental variable approach uses random assignment to the program, while the front-door analysis, in the absence of randomisation, uses program enrollment as its mediating variable. The paper considers the effect of the assumptions breaking down, and shows the front-door estimator to be fairly robust.
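For readers who have not seen the front-door idea in action, here is a minimal simulated sketch. It is not the estimator or the data from the paper: it assumes a simple linear model in which an unobserved U affects both A and Y, A affects Y only through M, and all coefficients are made up. In that linear setting the front-door estimate is the effect of A on M multiplied by the effect of M on Y holding A fixed.

```python
import random

def slope(x, y):
    """OLS slope from a simple regression of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def residuals(x, y):
    """Residuals from a simple regression of y on x (with intercept)."""
    b = slope(x, y)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return [(yi - my) - b * (xi - mx) for xi, yi in zip(x, y)]

rng = random.Random(1)
n = 50_000
a_to_m, m_to_y = 0.5, 2.0  # hypothetical coefficients: true effect of A on Y is 0.5 * 2.0 = 1.0

U = [rng.gauss(0, 1) for _ in range(n)]                 # unobserved confounder
A = [u + rng.gauss(0, 1) for u in U]                    # treatment, affected by U
M = [a_to_m * a + rng.gauss(0, 1) for a in A]           # mediator, affected by A only
Y = [m_to_y * m + 1.5 * u + rng.gauss(0, 1) for m, u in zip(M, U)]  # outcome

# Naive regression of Y on A is confounded by U.
naive = slope(A, Y)

# Front-door: (effect of A on M) x (effect of M on Y, adjusting for A).
effect_a_on_m = slope(A, M)
# Coefficient on M in a regression of Y on M and A, via partialling A out of both.
effect_m_on_y = slope(residuals(A, M), residuals(A, Y))
front_door = effect_a_on_m * effect_m_on_y

print(f"naive: {naive:.2f}, front-door: {front_door:.2f}, truth: {a_to_m * m_to_y:.2f}")
```

U is never used in the estimation, only in generating the data. The front-door estimate recovers the effect because M is unaffected by U except through A, which is exactly the assumption whose failure the paper examines.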