A six-hour delay flying home from the Health Economists Study Group conference in Gran Canaria is providing me with ample time to mull over the great issues in life. One of these big issues is, of course, the trade-off between bias and variance.

Typically, the discussion of an empirical economics paper at a conference focuses heavily on the model and estimation method. Often the word ‘endogenous’ echoes round the room as the discussion turns to whether the estimator employed is biased. This is of course an important consideration for any empirical work, but the question of efficiency (essentially the variance) of the estimator rarely comes up. Indeed, Andrew Gelman has discussed this predilection among economists elsewhere. So why don’t we prefer to think in terms of overall error?
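For reference, ‘overall error’ can be read as mean squared error (MSE), which for an estimator of a parameter *θ* splits into a variance term and a squared bias term. The notation below is introduced here for illustration:

$$
\mathrm{MSE}(\hat{\theta}) = \mathbb{E}\left[(\hat{\theta} - \theta)^{2}\right] = \mathrm{Var}(\hat{\theta}) + \mathrm{Bias}(\hat{\theta})^{2}
$$

An unbiased estimator sets the second term to zero but may pay for it through the first.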

As an example, consider that fixed-effects (FE) models are almost always preferred to random-effects (RE) models among economists. (Although the meaning of these terms varies widely!) This is for reasons of unbiasedness; we teach undergraduates to choose FE if the Hausman test rejects the null hypothesis of no systematic difference between the FE and RE estimates. But RE is more efficient. So the question should be: under what conditions is the overall error smaller in the RE model? If much of the variation is between individuals (or whatever the unit of the panel is) rather than within individuals, then the efficiency gains of RE may outweigh the error due to bias.
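The FE versus RE comparison can be illustrated with a small Monte Carlo sketch. Everything here is an assumption made for illustration: the data-generating process, the parameter values, the function name `simulate_fe_re_mse`, and the use of the true variance components to build the RE quasi-demeaning weight (in practice these would be estimated). Most of the variation in *x* is between units, and `c` controls how strongly the individual effect is correlated with *x*.

```python
import math
import random


def simulate_fe_re_mse(c, n_units=50, n_periods=3, sd_between=1.0,
                       sd_within=0.3, n_reps=1000, beta=1.0, seed=0):
    """Monte Carlo MSE of the FE and RE slope estimators (illustrative only).

    c sets the correlation between the individual effect and x;
    with c = 0 the RE estimator is unbiased.
    """
    rng = random.Random(seed)
    var_error = 1.0                                # Var(e_it)
    var_effect = (c * sd_between) ** 2 + 1.0       # Var(a_i)
    # RE quasi-demeaning weight, built from the TRUE variance components
    # (an assumption: real applications estimate them).
    theta = 1.0 - math.sqrt(var_error / (var_error + n_periods * var_effect))

    sq_err = {"FE": 0.0, "RE": 0.0}
    for _ in range(n_reps):
        num = {"FE": 0.0, "RE": 0.0}
        den = {"FE": 0.0, "RE": 0.0}
        for _ in range(n_units):
            m = rng.gauss(0.0, sd_between)         # between-unit part of x
            a = c * m + rng.gauss(0.0, 1.0)        # individual effect
            xs = [m + rng.gauss(0.0, sd_within) for _ in range(n_periods)]
            ys = [beta * x + a + rng.gauss(0.0, 1.0) for x in xs]
            x_bar = sum(xs) / n_periods
            y_bar = sum(ys) / n_periods
            # FE subtracts the full unit mean (weight 1); RE subtracts theta times it.
            for name, w in (("FE", 1.0), ("RE", theta)):
                for x, y in zip(xs, ys):
                    xt, yt = x - w * x_bar, y - w * y_bar
                    num[name] += xt * yt
                    den[name] += xt * xt
        for name in sq_err:
            sq_err[name] += (num[name] / den[name] - beta) ** 2
    return {name: total / n_reps for name, total in sq_err.items()}


mild = simulate_fe_re_mse(c=0.05)    # weak correlation between effect and x
strong = simulate_fe_re_mse(c=0.5)   # strong correlation between effect and x
print("mild:", mild)
print("strong:", strong)
```

Under these assumed settings, RE has the smaller simulated MSE when the correlation is mild, and FE wins when it is strong. The crossover point depends entirely on the chosen parameters.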

To give a more mathematically explicit example, consider the use of an ordinary least squares (OLS) estimator versus a two-stage least squares (2SLS) estimator. If we have the simple linear model *y = xβ + u*, the OLS estimator is biased if *Corr(x,u) = ρ ≠ 0*. In such cases, if an instrumental variable, say *z*, is available, one which is correlated with *x* but not with *u*, then 2SLS is a consistent estimator (its bias vanishes in large samples). But what of overall error? If *λ* is the correlation between *z* and *x*, and *n* is the sample size, then, to a large-sample approximation, 2SLS has a lower mean squared error than OLS if

$$
\frac{\rho^{2}\lambda^{2}n}{1-\lambda^{2}} > 1
$$
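One way to see where a condition of this form comes from is a rough large-sample comparison. This is a sketch under stated assumptions, not necessarily the original derivation: errors are homoskedastic, *σ_u²* and *σ_x²* (the variances of *u* and *x*) are notation introduced here, and a factor of *(1 − ρ²)* is dropped from the OLS variance, which matters little when *ρ* is small.

$$
\begin{aligned}
\mathrm{MSE}_{\mathrm{OLS}} &\approx \underbrace{\rho^{2}\,\frac{\sigma_u^{2}}{\sigma_x^{2}}}_{\text{bias}^{2}} + \underbrace{\frac{\sigma_u^{2}}{n\,\sigma_x^{2}}}_{\text{variance}},
\qquad
\mathrm{MSE}_{\mathrm{2SLS}} \approx \frac{\sigma_u^{2}}{n\,\lambda^{2}\sigma_x^{2}} \\[4pt]
\mathrm{MSE}_{\mathrm{2SLS}} < \mathrm{MSE}_{\mathrm{OLS}}
&\iff \frac{1}{n\lambda^{2}} < \rho^{2} + \frac{1}{n}
\iff \frac{\rho^{2}\lambda^{2}n}{1-\lambda^{2}} > 1
\end{aligned}
$$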

Thus, if the correlation between *x* and *u* is low or the instruments are weak, OLS may well be preferred. Often it comes down to whether the sample size is sufficient.
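To get a feel for the sample sizes involved, the condition above can be turned into a small helper. The function names are hypothetical, and the code simply evaluates the inequality as stated; it inherits whatever approximations lie behind it.

```python
import math


def _check_lam(lam):
    """The condition is only defined for 0 < |lam| < 1."""
    if not 0 < abs(lam) < 1:
        raise ValueError("lam must be non-zero and strictly between -1 and 1")


def tsls_beats_ols(rho, lam, n):
    """True if rho^2 * lam^2 * n / (1 - lam^2) > 1, i.e. 2SLS is preferred."""
    _check_lam(lam)
    return rho ** 2 * lam ** 2 * n / (1 - lam ** 2) > 1


def min_sample_size(rho, lam):
    """Smallest integer n satisfying the condition (infinite if rho == 0)."""
    _check_lam(lam)
    if rho == 0:
        return math.inf  # no endogeneity: OLS is never beaten on this criterion
    threshold = (1 - lam ** 2) / (rho ** 2 * lam ** 2)
    return math.floor(threshold) + 1


# Illustrative values: modest endogeneity and a fairly weak instrument.
print(min_sample_size(rho=0.3, lam=0.2))
print(tsls_beats_ols(rho=0.3, lam=0.2, n=200))
```

With *ρ = 0.3* and *λ = 0.2*, for instance, the condition only holds once *n* exceeds roughly 267, so a study with 200 observations would favour OLS on this criterion.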

The same considerations could be made of predictive models for economic evaluation. An ambitious young student (as I was) may want to create an ever more complex model that captures this and that ambiguity in the world. While each addition may reduce bias in the prediction of the outcome, it will also increase variance. Thus, beyond a certain point, added complexity just increases the uncertainty in our predictions.
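The effect of model complexity can be sketched with a toy simulation. This is not an economic evaluation model: the 'true' relationship (a sine curve), the noise level, and the use of a piecewise-constant predictor whose number of bins stands in for complexity are all assumptions chosen to keep the example short.

```python
import math
import random


def true_outcome(x):
    """Hypothetical true relationship, chosen only for illustration."""
    return math.sin(2 * math.pi * x)


def bin_index(x, n_bins):
    """Bin containing x, for x in [0, 1)."""
    return min(int(x * n_bins), n_bins - 1)


def fit_bin_means(xs, ys, n_bins):
    """Piecewise-constant model: predict the mean of y within each bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = bin_index(x, n_bins)
        sums[b] += y
        counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def bias_variance(n_bins, n_obs=64, noise_sd=1.0, n_reps=400, seed=1):
    """Average squared bias and variance of predictions over repeated samples."""
    rng = random.Random(seed)
    xs = [(i + 0.5) / n_obs for i in range(n_obs)]  # fixed design on [0, 1)
    preds = [[] for _ in xs]                        # predictions at each x
    for _ in range(n_reps):
        ys = [true_outcome(x) + rng.gauss(0.0, noise_sd) for x in xs]
        fitted = fit_bin_means(xs, ys, n_bins)
        for i, x in enumerate(xs):
            preds[i].append(fitted[bin_index(x, n_bins)])
    bias_sq = 0.0
    variance = 0.0
    for x, p in zip(xs, preds):
        mean_p = sum(p) / len(p)
        bias_sq += (mean_p - true_outcome(x)) ** 2
        variance += sum((v - mean_p) ** 2 for v in p) / len(p)
    return bias_sq / n_obs, variance / n_obs


# More bins = a more complex model.
results = {k: bias_variance(k) for k in (1, 4, 16)}
for k, (b2, var) in results.items():
    print(f"bins={k:2d}  bias^2={b2:.3f}  variance={var:.3f}  total={b2 + var:.3f}")
```

In this setup squared bias falls and variance rises as bins are added, and the intermediate model has the lowest total error of the three.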

It could be argued that one of the key goals of research is to inform a decision. Minimising the error in the estimator that informs the decision will lead to a lower probability of making the wrong decision. We should therefore consider overall error. This could be a plug for Bayesian methods; the posterior mean is the estimator that minimises the posterior expected squared error. But I don’t think Bayesianism is implied by the premises; we should just be less biased towards bias.
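The point about the posterior mean can be illustrated with a conjugate normal example. The setup is assumed for illustration (a normal prior centred at zero, known noise variance, and the hypothetical function name `compare_risk`), and the comparison averages squared error over parameters drawn from that prior.

```python
import random


def compare_risk(prior_sd=1.0, noise_sd=2.0, n_obs=5, n_reps=20000, seed=0):
    """Average squared error of the sample mean and of the posterior mean.

    The parameter is drawn from the N(0, prior_sd^2) prior in each replication,
    so the averages approximate Bayes risk under that prior.
    """
    rng = random.Random(seed)
    # Weight the posterior mean places on the sample mean (prior mean is 0).
    shrink = prior_sd ** 2 / (prior_sd ** 2 + noise_sd ** 2 / n_obs)
    se_sample_mean = 0.0
    se_posterior_mean = 0.0
    for _ in range(n_reps):
        theta = rng.gauss(0.0, prior_sd)
        y_bar = sum(rng.gauss(theta, noise_sd) for _ in range(n_obs)) / n_obs
        se_sample_mean += (y_bar - theta) ** 2           # unbiased estimator
        se_posterior_mean += (shrink * y_bar - theta) ** 2  # biased towards 0
    return se_sample_mean / n_reps, se_posterior_mean / n_reps


risk_sample_mean, risk_posterior_mean = compare_risk()
print(risk_sample_mean, risk_posterior_mean)
```

The posterior mean is biased towards the prior mean for any given parameter value, yet its average squared error is lower here. The advantage relies on the prior being a reasonable description of where the parameter lies.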
