The trouble with estimating neighbourhood effects, part 2

When we think of the causal effect of living in one neighbourhood compared to another we think of how the social interactions and lifestyle of that area produce better outcomes. Does living in an area with more obese people cause me to become fatter? (Quite possibly). Or, if a family moves to an area where people earn more will they earn more? (Read on).

In a previous post, we discussed such effects in the context of slums, where the synergy of poor water and sanitation, low quality housing, small incomes, and high population density likely has a negative effect on residents’ health. However, we also discussed how difficult it is to estimate neighbourhood effects empirically for a number of reasons. On top of this are the different ways neighbourhood effects can manifest. Social interactions may mean behaviours that lead to better health or incomes rub off on one another. But there may also be some underlying cause of the group’s, and hence each individual’s, outcomes. In the slum, for example, low education may mean poor hygiene habits spread, or the shared environment may contain pathogens. Both of these pathways may constitute a neighbourhood effect, but they imply very different explanations and potential policy remedies.

What should we make, then, of not one but two new articles by Raj Chetty and Nathaniel Hendren in a recent issue of the Quarterly Journal of Economics, both of which use observational data to estimate neighbourhood effects?

Paper 1: The Impacts of Neighborhoods on Intergenerational Mobility I: Childhood Exposure Effects.

The authors have an impressive data set. They use federal tax records from the US between 1996 and 2012 and identify all children born between 1980 and 1988 and their parents (or parent). For each of these family units they determine household income and then the income of the children when they are older. To summarise a rather long exegesis of the methods used, I’ll try to describe the principal finding in one sentence:

Among families moving between commuting zones in the US, the average income percentile of children at age 26 is 0.04 percentile points higher per year spent and per additional percentile point increase in the average income percentile of the children of permanent residents at age 26 in the destination the family moves to. (Phew!)

They interpret this as the outcomes of in-migrating children ‘converging’ to the outcomes of permanently resident children at a rate of 4% per year. That should provide an idea of how the outcomes and treatments were defined, and who constituted the sample. The paper makes the assumption that the effect is the same regardless of the age of the child. Or, to perhaps make it a bit clearer, the claim can be interpreted as saying that human capital, H, does something like this (ignoring growth over childhood due to schooling etc.):

[Figure: humancap1]

where ‘good’ and ‘bad’ mean ‘good neighbourhood’ and ‘bad neighbourhood’. This could be called the ‘better neighbourhoods cause you to do better’ hypothesis.

The analyses also take account of parental income at the time of the move and look at families who moved due to a natural disaster or other ‘exogenous’ shock. The different analyses generally support the original estimate, putting the result in the region of 0.03 to 0.05 percentile points.
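To get a feel for the size of the estimate, here is a minimal sketch of the arithmetic implied by the one-sentence finding. The function name, and the example of ten years of exposure with a ten-point gap, are my own illustrative assumptions rather than anything taken from the paper.

```python
def predicted_rank_gain(years_exposed, destination_gap, rate=0.04):
    """Predicted change in a child's income percentile at age 26.

    years_exposed   -- years of childhood spent in the destination
    destination_gap -- how many percentile points higher the children of
                       permanent residents rank in the destination
    rate            -- percentile points gained per year, per point of gap
                       (0.04 is the headline estimate quoted in the text)
    """
    return rate * years_exposed * destination_gap


if __name__ == "__main__":
    # Hypothetical move: 10 years of exposure, destination 10 points better.
    # The three rates span the 0.03 to 0.05 range reported across analyses.
    for rate in (0.03, 0.04, 0.05):
        gain = predicted_rank_gain(10, 10, rate=rate)
        print(f"rate={rate:.2f}: predicted gain of {gain:.1f} percentile points")
```

Under the headline estimate, this hypothetical child would be predicted to close about four of the ten percentile points separating origin from destination.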

But are these neighbourhood effects?

A different way of interpreting these results is that there is an underlying effect driving incomes in each area. Areas with higher incomes for their children in the future are those that have a higher market price for labour in the future. So we could imagine that this is what is going on with human capital instead:

[Figure: humancap2]

This is the ‘those moving to areas where people will earn more in the future also earn more in the future because of differences in the labour market’ hypothesis. The Bureau of Labor Statistics, for example, cites the wage rate for a registered nurse as $22.61 in Iowa and $36.13 in California. But we can’t say from the data whether the children are sorting into different occupations or are getting paid different amounts for the same occupations.

The reflection problem

Manski (1993) called the issue the ‘reflection problem’, which he described as arising when

a researcher observes the distribution of a behaviour in a population and wishes to infer whether the average behaviour in some group influences the behaviour of the individuals that compose the group.

What we have here is a linear-in-means model estimating the effect of average incomes on individual incomes. But what we cannot do is distinguish between the competing explanations of what Manski called endogenous effects, which result from interaction with families with higher incomes, and correlated effects, which lead to similar outcomes due to exposure to the same underlying latent forces, i.e. the market. We could also add contextual effects that manifest due to shared group characteristics (e.g. levels of schooling or experience). When I think of a ‘neighbourhood effect’, I tend to think of the endogenous variety, i.e. the direct effects of living in a certain neighbourhood. For example, under different labour market conditions, both my income and the average income of the permanent residents of the neighbourhood I move to might be lower, but not because of the neighbourhood.
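A small simulation makes the point concrete. This is a sketch under assumptions of my own (a single latent ‘market’ factor per area, standard normal noise, made-up sample sizes), not a reconstruction of the paper’s model. Movers are assigned to destinations completely at random and there is no social interaction at all, yet regressing movers’ incomes on the average income of permanent residents still gives a slope close to one.

```python
import random
from statistics import mean


def ols_slope(x, y):
    """Slope from a simple least-squares regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_correlated_effects(n_areas=200, n_residents=50,
                                n_movers=2000, seed=1):
    """Slope of mover income on destination residents' mean income.

    Incomes depend only on a latent area-level labour market factor plus
    individual noise: a pure 'correlated effect' with no endogenous effect.
    """
    rng = random.Random(seed)
    # Latent labour market conditions in each area (assumed standard normal).
    market = [rng.gauss(0, 1) for _ in range(n_areas)]
    # Average income of permanent residents: market factor plus noise.
    resident_mean = [
        mean(m + rng.gauss(0, 1) for _ in range(n_residents)) for m in market
    ]
    x, y = [], []
    for _ in range(n_movers):
        dest = rng.randrange(n_areas)             # random assignment of movers
        x.append(resident_mean[dest])
        y.append(market[dest] + rng.gauss(0, 1))  # no peer effect in here
    return ols_slope(x, y)


if __name__ == "__main__":
    print(f"estimated 'neighbourhood effect': {simulate_correlated_effects():.2f}")
```

The slope is large because residents and movers share the same market, which is exactly the alternative explanation the linear-in-means regression cannot rule out.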

The third hypothesis

There’s also a third hypothesis: families that are better off move to better areas (i.e. effects are accounted for by unobserved family differences):

[Figure: humancap3]

The paper presents lots of modifications to the baseline model, but none of them can provide an exogenous choice of destination. They look at an exogenous cause of moving – natural disasters – and also instrument with the expected difference in income percentiles for parents from the same zip code, but I can’t see how this instrument is valid. Selection bias is acknowledged in the paper, but without some exogenous variation in where a family moves to it’ll be difficult to really claim to have identified a causal effect. The choice to move is, in the vast majority of families’ cases, based on preferences over welfare and well-being, especially income. Indeed, why would a family move to a worse off area unless their circumstances demanded it of them? So in reality, I would imagine the truth lies somewhere in between these three explanations.

Robust analysis?

As a slight detour, we might want to consider if these are causal effects, even if the underlying assumptions hold. The paper presents a range of analyses to show that the results are robust. But these analyses represent just a handful of those possible. Given that the key finding is relatively small in magnitude, one wonders what would have happened under different scenarios and choices – the so-called garden of forking paths problem. To illustrate, consider some of the choices that were made about the data and models, and all the possible alternative choices. The sample included only those with a positive mean income between 1996 and 2004 and those living in commuting zones with populations of over 250,000 in the 2000 census. Those whose income was missing were assigned a value of zero. Average income over 1996 to 2000 is used as a proxy for lifetime income. If the marital status of the parents changed then the child was assigned to the mother’s location. Non-filers were coded as single. Income is measured in percentile ranks and not dollar terms. The authors justify each of the choices, but an equally valid analysis could have resulted from different choices and possibly produced very different results.
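To see how quickly the paths fork, the sketch below enumerates the analyses that could be run if each of the seven choices listed above had just one defensible alternative. The first option in each pair is the choice described in the text. The second is a hypothetical alternative I have made up for illustration, not something the paper considers.

```python
from itertools import product

# Authors' choice first, one hypothetical alternative second.
ANALYSIS_CHOICES = {
    "sample_income_window": ["positive mean income 1996-2004", "other window"],
    "min_population": ["over 250,000", "no population threshold"],
    "missing_income": ["set to zero", "drop the observation"],
    "lifetime_income_proxy": ["mean of 1996-2000", "mean of all years"],
    "location_after_separation": ["mother's location", "father's location"],
    "non_filers": ["coded as single", "excluded"],
    "income_scale": ["percentile ranks", "dollars"],
}


def enumerate_specifications(choices):
    """Return every combination of analysis choices as a list of dicts."""
    names = list(choices)
    return [dict(zip(names, combo)) for combo in product(*choices.values())]


if __name__ == "__main__":
    specs = enumerate_specifications(ANALYSIS_CHOICES)
    print(f"{len(specs)} possible analyses from {len(ANALYSIS_CHOICES)} binary choices")
```

Even with only two options per decision there are 2^7 = 128 candidate analyses, of which the published robustness checks cover a small share.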

-o-

Paper 2: The Impacts of Neighborhoods on Intergenerational Mobility II: County-Level Estimates.

The strategy of this paper is much like the first one, except that rather than trying to estimate the average effect of moving to higher or lower income areas, they try to estimate the effect of moving to each of 3,000 counties in the US. To do this they assume that the number of years’ exposure to the county is as good as random after taking account of i) origin fixed effects, ii) parental income percentile, and iii) a quadratic function of birth cohort year and parental income percentile to try and control for some differences in labour market conditions. An even stronger assumption than before! The hierarchical model is estimated using some complex two-step method for ‘computational tractability’ (I’d have just used a Bayesian estimator). There are some further strange calculations, like conversion from percentile ranks into dollar terms by regressing the dollar amounts on average income ranks and multiplying everything by the coefficient, rather than just estimating the model with dollars as the outcome (I suspect it’s to do with their complicated estimation strategy). Nevertheless, we are presented with some (noisy) county-level estimates of the effect of an additional year spent there in childhood. There is a weak correlation with the income ranks of permanent residents. Again, though, we have the issue of many competing explanations for the observed effects.

The differences in predicted causal effect by county don’t help distinguish between our hypotheses. Consider this figure:

[Figure: usincomes1]

Do children of poorer parents in the Southern states end up with lower human capital and lower-skilled jobs than in the Midwest? Or does the market mean that people get paid less for the same job in the South? Compare the map above to the maps below showing wage rates of two common lower-skilled professions, cashiers (right) and teaching assistants (left):

A similar pattern is seen. While this is obviously just a correlation, one suspects that such variation in wages is not being driven by large differences in human capital generated through personal interaction with higher earning individuals. This is also without taking into account any differences in purchasing power between geographic areas.

What can we conclude?

I’ve only discussed a fraction of the contents of these two enormous papers. The contents could fill many more blog posts to come. But it all hinges on whether we can interpret the results as the average causal effect of a person moving to a given place. Not nearly enough information is given to know whether families moving to areas with lower future incomes are comparable to those with higher future incomes. Also, we could easily imagine a world where the same people were all induced to move to different areas – this might produce completely different sets of neighbourhood effects since they themselves contribute to those effects. But I feel that the greatest issue is the reflection problem. Even random assignment won’t get around this. This is not to discount the value or interest these papers generate, but I can’t help but feel too much time is devoted to trying to convince the reader of a ‘causal effect’. A detailed exploration of the relationships in the data between parental incomes, average incomes, spatial variation, later life outcomes, and so forth, might have been more useful for generating understanding and future analyses. Perhaps sometimes in economics we spend too long obsessing over estimating unconvincing ‘causal effects’ and ‘quasi-experimental’ studies that really aren’t and forget the value of just a good exploration of data with some nice plots.


Sam Watson’s journal round-up for 25th June 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

The efficiency of slacking off: evidence from the emergency department. Econometrica [RePEc] Published May 2018

Scheduling workers is a complex task, especially in large organisations such as hospitals. Not only should one consider when different shifts start throughout the day, but also how work is divided up over the course of each shift. Physicians, like anyone else, value their leisure time and want to go home at the end of a shift. Given how they value this leisure time, as the end of a shift approaches physicians may behave differently. This paper explores how doctors in an emergency department behave at ‘end of shift’, in particular looking at whether doctors ‘slack off’ by accepting fewer patients or tasks and also whether they rush to finish those tasks they have. Both cases can introduce inefficiency by either under-using their labour time or using resources too intensively to complete something. Immediately, from the plots of the raw data, it is possible to see a drop in patients ‘accepted’ both close to end of shift and close to the next shift beginning (if there is shift overlap). Most interestingly, after controlling for patient characteristics, time of day, and day of week, there is a decrease in the length of stay of patients accepted closer to the end of shift, which is ‘dose-dependent’ on time to end of shift. There are also marked increases in patient costs, orders, and inpatient admissions in the final hour of the shift. Assuming that only the number of patients assigned and not the type of patient changes over the course of a shift (a somewhat strong assumption despite the additional tests), then this would suggest that doctors are rushing care and potentially providing sub-optimal or inefficient care closer to the end of their shift. The paper goes on to explore optimal scheduling on the basis of the results, among other things, but ultimately shows an interesting, if not unexpected, pattern of physician behaviour. The results relate mainly to efficiency, but it’d be interesting to see how they relate to quality in the form of preventable errors.

Semiparametric estimation of longitudinal medical cost trajectory. Journal of the American Statistical Association Published 19th June 2018

Modern computational and statistical methods have opened up to estimation a range of statistical models that were hitherto inestimable. This includes complex latent variable structures, non-linear models, and non- and semi-parametric models. Recently we covered the use of splines for semi-parametric modelling in our Method of the Month series. Not that complexity is everything of course, but given this rich toolbox to more faithfully replicate the data generating process, one does wonder why the humble linear model estimated with OLS remains so common. Nevertheless, I digress. This paper addresses the problem of estimating the medical cost trajectory for a given disease from diagnosis to death. There are two key issues: (i) the trajectory is likely to be non-linear, with costs probably increasing near death and possibly also being higher immediately after diagnosis (a U-shape), and (ii) we don’t observe the costs of those who die, i.e. there is right-censoring. Such a set-up is also applicable in other cases, for example looking at health outcomes in panel data with informative dropout. The authors model medical costs for each month post-diagnosis and time of censoring (death) by factorising their joint distribution into a marginal model for censoring and a conditional model for medical costs given the censoring time. The likelihood then has contributions from the observed medical costs and their times, and the times of the censored outcomes. We then just need to specify the individual models. For medical costs, they use a multivariate normal with mean function consisting of a bivariate spline of time and time of censoring. The time of censoring is modelled non-parametrically. This setup of the missing data problem is sometimes referred to as a pattern mixture model, in that the outcome is modelled as a mixture density over different populations dying at different times.
The authors note that another possibility for modelling informative missing data, previously considered not to be estimable for complex non-linear structures, is the shared parameter model (soon to appear in another Method of the Month), which assumes outcomes and dropout are independent conditional on an underlying latent variable. This approach can be more flexible, especially in cases with varying treatment effects. One wonders if the mixed model representation of penalised splines wouldn’t fit nicely in a shared parameter framework and provide at least as good inferences. An idea for a future paper perhaps… Nevertheless, the authors illustrate their method by replicating the well-documented U-shaped costs from the time of diagnosis in patients with stage IV breast cancer.
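For readers who prefer symbols, the factorisation described above can be sketched as follows. The notation is my own assumption rather than the paper’s: y_i is the vector of monthly costs for patient i, s_i the time of censoring (death), and μ(t, s) the bivariate spline mean function.

```latex
% Pattern mixture factorisation: joint = conditional for costs x marginal for censoring time
f(y_i, s_i) = f(y_i \mid s_i)\, f(s_i)

% Conditional model for costs: multivariate normal with a bivariate spline mean
y_i \mid s_i \sim \mathrm{MVN}\big(\mu(t, s_i),\, \Sigma\big)

% Marginal cost distribution as a mixture over censoring times,
% with F the non-parametrically modelled distribution of s
f(y_i) = \int f(y_i \mid s)\, \mathrm{d}F(s)
```

The last line is where the ‘mixture’ in the name comes from: the overall cost distribution averages over groups of patients who die at different times.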

Do environmental factors drive obesity? Evidence from international graduate students. Health Economics [PubMed] Published 21st June 2018

‘The environment’ can encompass any number of things including social interactions and networks, politics, green space, and pollution. Sometimes referred to as ‘neighbourhood effects’, the impact of the shared environment above and beyond the effect of individual risk factors is of great interest to researchers and policymakers alike. But there are a number of substantive issues that hinder estimation of neighbourhood effects. For example, social stratification into neighbourhoods likely means people living together are similar so it is difficult to compare like with like across neighbourhoods; trying to model neighbourhood choice will also, therefore, remove most of the variation in the data. Similarly, this lack of common support, i.e. overlap, between people from different neighbourhoods means estimated effects are not generalisable across the population. One way of getting around these problems is simply to randomise people to neighbourhoods. As odd as that sounds, that is what occurred in the Moving to Opportunity experiments and others. This paper takes a similar approach in trying to look at neighbourhood effects on the risk of obesity by looking at the effects of international students moving to different locales with different local obesity rates. The key identifying assumption is that the choice to move to different places is conditionally independent of the local obesity rate. This doesn’t seem a strong assumption – I’ve never heard a prospective student ask about the weight of our student body. Some analysis supports this claim. The raw data and some further modelling show a pretty strong and robust relationship between local obesity rates and weight gain of the international students. Given the complexity of the causes and correlates of obesity (see the crazy diagram in this post) it is hard to discern why certain environments contribute to obesity. 
The paper presents some weak evidence of differences in unhealthy behaviours between high and low obesity places – but this doesn’t quite get at the environmental link, such as whether these behaviours are shared through social networks or perhaps the structure and layout of the urban area, for example. Nevertheless, here is some strong evidence that living in an area where there are obese people means you’re more likely to become obese yourself.


Variations in NHS admissions at a glance

Variations in admissions to NHS hospitals are the source of a great deal of consternation. Over the long run, admissions and the volume of activity required of the NHS have increased, without equivalent increases in funding or productivity. Over the course of the year, there are repeated claims of crises as hospitals are ill-equipped for the increase in demand in the winter. Meanwhile, different patterns of admissions at weekends relative to weekdays may be the foundation of the ‘weekend effect’, as we recently demonstrated. And yet all these different sources of variation produce a single time series of numbers of daily admissions. But each of the different sources of variation is important for different planning and research aims. So let’s decompose the daily number of admissions into its various components.

Data

Daily number of emergency admissions to NHS hospitals between April 2007 and March 2015 from Hospital Episode Statistics.

Methods

A similar analysis was first conducted on variations in the number of births by day of the year. A full description of the model can be found in Chapter 21 of the textbook Bayesian Data Analysis (indeed the model is shown on the front cover!). The model is a sum of Gaussian processes, each one modelling a different aspect of the data, such as the long-run trend or weekly periodic variation. We have previously used Gaussian processes in a geostatistical model on this blog. Gaussian processes are a flexible class of models for which any finite dimensional marginal distribution is Gaussian. Different covariance functions can be specified for different models, such as the aforementioned periodic or long-run trends. The model was run using the software GPstuff in Octave (basically an open-source version of Matlab) and we have modified code from the GPstuff website.
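To illustrate what ‘a sum of Gaussian processes’ means in terms of covariance functions, here is a minimal sketch in plain Python. It is not the GPstuff model: the variances, length scales, and the choice of just three components are all illustrative assumptions, and the full model also handles day-of-year effects and observation noise.

```python
import math


def squared_exponential(t1, t2, variance=1.0, length_scale=365.0):
    """Smooth covariance for a slowly varying long-run trend."""
    return variance * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)


def periodic(t1, t2, variance=1.0, period=7.0, length_scale=1.0):
    """Covariance that repeats exactly every `period` days."""
    s = math.sin(math.pi * abs(t1 - t2) / period)
    return variance * math.exp(-2.0 * (s / length_scale) ** 2)


def admissions_kernel(t1, t2):
    """Sum of components: a sum of independent GPs has the summed covariance."""
    trend = squared_exponential(t1, t2, variance=1.0, length_scale=365.0)
    weekly = periodic(t1, t2, variance=0.5, period=7.0)
    yearly = periodic(t1, t2, variance=0.3, period=365.25)
    return trend + weekly + yearly


def covariance_matrix(times, kernel):
    """Covariance of the process evaluated at a finite set of days."""
    return [[kernel(a, b) for b in times] for a in times]


if __name__ == "__main__":
    days = list(range(0, 15))
    K = covariance_matrix(days, admissions_kernel)
    # The weekly component ties day 0 more closely to day 7 than to day 3.
    print(f"cov(day 0, day 3) = {K[0][3]:.3f}")
    print(f"cov(day 0, day 7) = {K[0][7]:.3f}")
```

Because the components are additive, each one can be inspected separately after fitting, which is what allows a decomposition of the series into trend, weekly, and seasonal parts.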

Results

[Figure: admit5-1]

The four panels of the figure reveal to us things we may claim to already know. Emergency admissions have been increasing over time and were about 15% higher in 2015 than in 2007 (top panel). The second panel shows us the day of the week effects: there are about 20% fewer admissions on a Saturday or Sunday than on a weekday. The third panel shows a decrease in summer and increase in winter as we often see reported, although perhaps not quite as large as we might have expected. And finally the bottom panel shows the effects of different days of the year. We should note that the large dip at the end of March/beginning of April is an artifact of coding at the end of the financial year in HES and not an actual drop in admissions. But, we do see expected drops for public holidays such as Christmas and the August bank holiday.

While none of this is unexpected, it does show that there’s a lot going on underneath the aggregate data. Perhaps the most alarming aspect of the data is the long-run increase in emergency admissions when we compare it to the (lack of) change in funding or productivity. It suggests that hospitals will often be running at capacity, so other variation, such as over winter, may push demand beyond capacity. We might also speculate on other possible ‘weekend effects’, such as admission on a bank holiday.

As a final thought, Gaussian processes are an excellent way of modelling data with an unknown structure without imposing assumptions, such as linearity, that might be too strong. Hence their use in geostatistics. They are widely used in machine learning and artificial intelligence as well. We often encounter data with unknown and potentially complicated structures in health care and public health research, so hopefully this will serve as a good advert for some new methods. See this book, or the one referenced in the methods section, for an in-depth look.
