Alastair Canaway’s journal round-up for 28th May 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Information, education, and health behaviours: evidence from the MMR vaccine autism controversy. Health Economics [PubMed] Published 2nd May 2018

In 1998, Andrew Wakefield published (in the Lancet) his infamous and later retracted research purportedly linking the measles-mumps-rubella (MMR) vaccine and autism. Despite the thorough debunking and exposure of academic skulduggery, a noxious cloud of misinformation remained in the public mind, particularly in the US. This study examined several facets of the MMR fake news, including: what impact it had on vaccine uptake in the US (both MMR and other vaccines); how state-level variation in media coverage affected uptake; and what role education played in subsequent decisions about whether to vaccinate. The study used the National Immunization Survey from 1995 to 2006 to answer these questions. This is a yearly dataset of over 200,000 children aged 19 to 35 months with detailed information not just on immunisation, but also on maternal education, income and other sociodemographics. The NewsLibrary database was used to identify stories published in national and state media relating to vaccines and autism. Various regression methods were implemented to examine these data. The paper found that, unsurprisingly, in the year following the Wakefield publication MMR take-up declined by between 1.1% and 1.5% (notably less than the 3% decline in the UK), and that this fall spilled over into the take-up of other vaccines. The most interesting finding related to education: MMR take-up for children of college-educated mothers declined significantly compared to those without a degree. This can be explained by the education gradient, whereby more-educated individuals absorb and respond to health information more quickly. However, in the US, this continued for many years beyond 2003 despite the proliferation of research refuting the autism-MMR link. This contrasts with the UK, where the education gap closed soon after the findings were refuted; that is, in the UK, the educated responded to the new information refuting the MMR-autism link.
In the US, despite the research being debunked, MMR uptake remained lower among the children of those with higher levels of education for many more years. The author speculates that this contrast with the UK may be a result of the media influencing parents’ decisions. Whilst the media buzz in the UK peaked in 2002, it had largely subsided by 2003. In the US, however, media attention was constant, if not increasing, until 2006, and this may be why the gap persisted there. So, we have Andrew Wakefield and arguably fearmongering media to blame for causing a long-term reduction in MMR take-up in the US. Overall, an interesting study leaning on multiple datasets that could be of interest for those working with big data.

Can social care needs and well-being be explained by the EQ-5D? Analysis of the Health Survey for England. Value in Health Published 23rd May 2018

There is increasing discussion about integrating health and social care so that people’s health and social care needs are met in a more joined-up way. This creates challenges for health economists and decision makers when allocating resources, particularly when comparing benefits from different sectors. NICE itself recognises that the EQ-5D may be inappropriate in some situations. With the likes of ASCOT, ICECAP and WEMWBS frequenting the health economics world, this isn’t an unknown issue. To better understand the relationship between health and social care measures, this EuroQol Foundation funded study examined the relationship between social care needs as measured by the Barthel Index, well-being measured using WEMWBS and also the GHQ-12, and the EQ-5D as the measure of health. Data were obtained from the Health Survey for England (HSE) and covered 3,354 individuals aged over 65 years. Unsurprisingly, the authors found that higher health and wellbeing scores were associated with an increased probability of having no social care needs: those who are healthier or at higher levels of wellbeing are less likely to need social care. Of all the instruments, it was the self-care and the pain/discomfort dimensions of the EQ-5D that were most strongly associated with the need for social care. No GHQ-12 dimensions were statistically significant, and for the WEMWBS only the ‘been feeling useful’ and ‘had energy to spare’ items were statistically significantly associated with social care need. The authors also investigated various other associations between the measures, with many unsurprising findings, e.g. the EQ-5D anxiety/depression dimension was negatively associated with wellbeing as measured using the GHQ-12. Although the findings are favourable for the EQ-5D in that it captures social care needs to some extent, there is clearly still a gap whereby some outcomes are not necessarily captured.
Considering this, the authors suggest that it might be appropriate to strap an extra dimension on to the EQ-5D (known as a ‘bolt on’) to better capture important ‘other’ dimensions, for example, dignity or any other important social care outcomes. Of course, a significant limitation of this paper relates to the measures available in the data. Measures such as ASCOT and ICECAP have been developed and operationalised for economic evaluation with social care in mind, and a comparison against these would have been more informative.

The health benefits of a targeted cash transfer: the UK Winter Fuel Payment. Health Economics [PubMed] [RePEc] Published 9th May 2018

In the UK, each winter is accompanied by an increase in mortality, often known as ‘excess winter mortality’ (EWM). To combat this, the UK introduced the Winter Fuel Payment (WFP): an unconditional cash transfer to households containing an older person (those most vulnerable to EWM) above the female state pension age, intended to help the elderly with the cost of keeping their dwelling warm. The purpose of this paper was to examine whether the WFP policy has improved the health of elderly people. The authors use the Health Surveys for England (HSE), the Scottish Health Survey (SHeS) and the English Longitudinal Study of Ageing (ELSA) and employ a regression discontinuity design to estimate causal effects of the WFP. To measure impact (benefit) they focus on circulatory and respiratory illness as measured by: self-reports of chest infection, nurse-measured hypertension, and two blood biomarkers for infection and inflammation. The authors found that for those living in a household receiving the payment there was a 6 percentage point reduction (p<0.01) in the incidence of high levels of serum fibrinogen, a biomarker that is considered to be a marker of current infection and is associated with chronic pulmonary disease. For the other health outcomes, although positive, the estimated effects were less robust and not statistically significant. The authors also investigated the impact of increasing the age of eligibility for the WFP (in line with the increase in women’s pension age). Their findings suggest there may be some health cost associated with the increase in the age of eligibility. To summarise, the paper highlights that there may be some health benefits from the receipt of the WFP. What it doesn’t consider, however, is opportunity cost. With the WFP costing about £2 billion per year, as a health economist, I can’t help but wonder if the money could have been better spent through other avenues.



Method of the month: custom likelihoods with Stan

Once a month we discuss a particular research method that may be of interest to people working in health economics. We’ll consider widely used key methodologies, as well as more novel approaches. Our reviews are not designed to be comprehensive but provide an introduction to the method, its underlying principles, some applied examples, and where to find out more. If you’d like to write a post for this series, get in touch. This month’s method is custom likelihoods with Stan.


Regular readers of this blog will know that I am a fan of Bayesian methods. The exponential growth in personal computing power has opened up a whole new range of Bayesian models that can be estimated at home. WinBUGS and JAGS were the go-to pieces of software for estimating Bayesian models, both using Markov Chain Monte Carlo (MCMC) methods. Theoretically, an MCMC chain will explore the posterior distribution. But MCMC has flaws. For example, if the target distribution has a high degree of curvature, as many hierarchical models exhibit, then MCMC chains can have trouble exploring it. To compensate, a chain stays in the ‘difficult’ bit of the space for longer before leaving to go elsewhere, so its average oscillates around the true value. Asymptotically, these oscillations balance out, but in real, finite time, they ultimately lead to bias. Further, MCMC chains can be very slow to converge to the target distribution, and for complex models can take a literal lifetime. An alternative, Hamiltonian Monte Carlo (HMC), provides a solution to these issues. Michael Betancourt’s introduction to HMC is great for anyone interested in the topic.

Stan is a ‘probabilistic programming language’ that implements HMC. A huge range of probability distributions are already implemented in the software; check out the manual for more information. There is also an R package, rstanarm, that estimates a number of standard models using normal R code, which means you can even use these tools without learning the Stan language. However, Stan may not have the necessary distributions for more complex econometric or statistical models. It used to be the case that you would have to build your own MCMC sampler, but given the problems with MCMC, this is now strongly advised against in favour of HMC. Fortunately, we can implement our own probability density functions in Stan. So, if you can write down the (log) likelihood for your model, you can estimate it in Stan!

The aim of this post is to provide an example of implementing a custom probability function in Stan from the likelihood of our model. We will look at the nested logit model. These models have been widely used for multinomial choice problems. An area of interest among health economists is the choice of hospital provider. A basic multinomial choice model, such as a multinomial logit, requires an independence of irrelevant alternatives (IIA) assumption, which says that the odds of choosing one option over another are independent of any other alternative. For example, it would assume that the odds of me choosing the pharmacy in town over the hospital in town would be unaffected by a pharmacy opening on my own road. This is likely too strong. There are many ways to relax this assumption, the nested logit being one. The nested logit is useful when choices can be ‘nested’ in groups, and it assumes there is a correlation among choices within each group. For example, we can group health care providers into pharmacies, primary care clinics, hospitals, etc.
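
To make the IIA assumption concrete, here is a minimal Python sketch (separate from the Stan and R code used in the rest of this post). The utilities are made-up numbers chosen purely for illustration. It shows that, under a multinomial logit, the odds of choosing the town pharmacy over the town hospital do not change when a third option (a pharmacy on my own road) is added to the choice set.

```python
import numpy as np

def mnl_probs(v):
    """Multinomial logit choice probabilities for a vector of utilities v."""
    e = np.exp(v - np.max(v))  # subtract the max for numerical stability
    return e / e.sum()

# Hypothetical deterministic utilities (illustrative values only)
v_town_pharmacy, v_town_hospital, v_road_pharmacy = 1.0, 0.5, 1.2

# Choice set before and after the new pharmacy opens
p_before = mnl_probs(np.array([v_town_pharmacy, v_town_hospital]))
p_after = mnl_probs(np.array([v_town_pharmacy, v_town_hospital, v_road_pharmacy]))

# Odds of town pharmacy versus town hospital in each choice set
odds_before = p_before[0] / p_before[1]
odds_after = p_after[0] / p_after[1]
print(odds_before, odds_after)  # both equal exp(1.0 - 0.5) up to rounding
```

In reality we might expect the new pharmacy to draw custom mostly from the other pharmacy rather than from the hospital, which is the kind of substitution pattern the multinomial logit rules out.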



Econometric model

Firstly, we need a nesting structure for our choices, like that described above. We’ll consider a 2-level nesting structure, with T branches and R options in total, and with R_t options in each branch t. As with most choice models, we start from an additive random utility model, which is, for individual i=1,…,N, and with choice over branch t and option r:

U_{itr} = V_{itr} + \epsilon_{itr}

Then the chosen option is the one with the highest utility. The motivating feature of the nested logit is that the hierarchical structure allows us to factorise the joint probability of choosing branch t and option r into a marginal model for the branch and a conditional model for the option given the branch:

p_{itr} = p_{it} \times p_{ir|t}

Nested logit models arise when the errors are assumed to have a generalised extreme value (GEV) distribution; the multinomial logit is the special case in which the errors are independent. We will model the deterministic part of the equation with branch-varying and option-varying variables:

V_{itr} = Z_{it}'\alpha + X_{itr}'\beta_t

Then the model can be written as:

p_{itr} = p_{it} \times p_{ir|t} = \frac{exp(Z_{it}'\alpha + \rho_t I_{it})}{\sum_{k \in T} exp(Z_{ik}'\alpha + \rho_k I_{ik})} \times \frac{exp(X_{itr}'\beta_t/\rho_t)}{\sum_{m \in R_t} exp( X_{itm}'\beta_t/\rho_t) }

where \rho_t is variously called a scale parameter, correlation parameter, etc. and defines the within branch correlation (arising from the GEV distribution). We also have the log-sum, which is also called the inclusive value:

I_{it} = log \left(  \sum_{m \in R_t} exp( X_{itm}'\beta_t/\rho_t)  \right).

Now we have our model setup, the log likelihood over all individuals is

\sum_{i=1}^N \sum_{t \in T} \sum_{r \in R_t} y_{itr} \left[ Z_{it}'\alpha + \rho_t I_{it} - log \left( \sum_{k \in T} exp(Z_{ik}'\alpha + \rho_k I_{ik}) \right) + X_{itr}'\beta_t/\rho_t - log \left(  \sum_{m \in R_t} exp( X_{itm}'\beta_t/\rho_t) \right) \right]
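
As a rough cross-check of the formulas above, the following Python sketch computes the nested logit choice probabilities and the log likelihood contribution for a single individual. It is an illustration only: the function names, array layout (options stacked branch by branch) and the randomly generated inputs are my own assumptions, not part of the Stan program in this post.

```python
import numpy as np

def nested_logit_probs(Z, X, R, alpha, beta, rho):
    """Choice probabilities for one individual in a 2-level nested logit.

    Z: (T, P) branch-varying variables; X: (sum(R), Q) option-varying variables,
    stacked branch by branch; R: number of options per branch;
    alpha: (P,); beta: (Q, T); rho: (T,).
    """
    T = len(R)
    starts = np.concatenate(([0], np.cumsum(R)[:-1]))  # first option of each branch
    incl = np.empty(T)  # inclusive values I_t
    cond = []           # conditional probabilities p_{r|t}
    for t in range(T):
        rows = slice(starts[t], starts[t] + R[t])
        e = np.exp(X[rows] @ beta[:, t] / rho[t])
        incl[t] = np.log(e.sum())
        cond.append(e / e.sum())
    branch = np.exp(Z @ alpha + rho * incl)
    branch = branch / branch.sum()  # marginal probabilities p_t
    return np.concatenate([branch[t] * cond[t] for t in range(T)])

def nested_logit_loglik(y, Z, X, R, alpha, beta, rho):
    """Log likelihood contribution of one individual; y is a 0/1 indicator vector."""
    return float(np.sum(y * np.log(nested_logit_probs(Z, X, R, alpha, beta, rho))))

# Randomly generated inputs, for illustration only
rng = np.random.default_rng(1)
R = [2, 2, 2]
Z = rng.normal(size=(3, 2))
X = rng.normal(size=(6, 2))
alpha = rng.normal(size=2)
beta = rng.normal(size=(2, 3))
rho = np.array([0.9, 0.7, 0.5])

p = nested_logit_probs(Z, X, R, alpha, beta, rho)
y = np.array([0, 0, 1, 0, 0, 0])  # the individual chose the third option
print(p.sum())  # the six probabilities sum to one (up to rounding)
print(nested_logit_loglik(y, Z, X, R, alpha, beta, rho), np.log(p[2]))
```

Setting every element of rho to one makes the inclusive value terms cancel, and the probabilities collapse to those of a multinomial logit with utilities Z'alpha + X'beta, which is a handy sanity check on any implementation.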

As a further note, for the model to be compatible with an ARUM specification, a number of conditions need to be satisfied. One of these is that 0<\rho_t \leq 1, so we will make that restriction. We have also only included alternative-varying variables, but we are often interested in individual-varying variables and in allowing parameters to vary over alternatives. These can be added to this specification quite simply, but we will leave them out for now to keep things “simple”. We will also use basic weakly informative priors and leave prior specification as a separate issue we won’t consider further:

\alpha \sim normal(0,5), \beta_t \sim normal(0,5), \rho_t \sim Uniform(0,1)


DISCLAIMER: This code is unlikely to be the most efficient, nor can I guarantee it is 100% correct – use at your peril!

The following assumes a familiarity with Stan and R.

Stan programs are divided into blocks including data, parameters, and model. The functions block allows us to define custom (log) probability density functions. These take a form something like:

real xxx_lpdf(real y, ...){}

which says that the function outputs a real valued variable and takes a real valued variable, y, as one of its arguments. The _lpdf suffix allows the function to act as a density function in the program (and equivalently _lpmf for log probability mass functions for discrete variables). Now we just have to convert the log likelihood above into a function. But first, let’s just consider what data we will be passing to the program:

  • N, the number of observations;
  • T, the number of branches;
  • P, the number of branch-varying variables;
  • Q, the number of choice-varying variables;
  • R, a T x 1 vector with the number of choices in each branch, from which we can also derive the total number of options as sum(R). We will call the total number of options Rk for now;
  • Y, an N x Rk array, where Y[i,j] = 1 if individual i=1,…,N chose option j=1,…,Rk, and 0 otherwise;
  • Z, a N x T x P array of branch-varying variables;
  • X, a N x Rk x Q array of choice-varying variables.

And the parameters:

  • \rho , a T x 1 vector of correlation parameters;
  • \alpha , a P x 1 vector of coefficients on the branch-varying variables;
  • \beta , a Q x T matrix of coefficients on the choice-varying variables.

Now, to develop the code, we will specify the function for individual observations of Y, rather than the whole matrix, and then perform the sum over all the individuals in the model block. So we only need to feed each individual’s observations into the function rather than the whole data set. The model is specified in blocks as follows (with all the data and parameters as arguments to the function):

functions {
  real nlogit_lpdf(real[] y, real[,] Z, real[,] X, int[] R,
                   vector alpha, matrix beta, vector rho){
    //first define our additional local variables
    real lprob; //variable to hold log prob
    int count1; //keep track of which option in the loops
    int count2; //keep track of which option in the loops
    vector[size(R)] I_val; //inclusive values
    real denom; //sum denominator of marginal model
    //for the variables appearing in sum loops, set them to zero
    lprob = 0;
    count1 = 0;
    count2 = 0;
    denom = 0;
    //determine the log-sum for each conditional model, p_ir|t,
    //i.e. inclusive value
    for(k in 1:size(R)){
      I_val[k] = 0;
      for(m in 1:R[k]){
        count1 = count1 + 1;
        I_val[k] = I_val[k] + exp(to_row_vector(X[count1,])*
          beta[,k]/rho[k]);
      }
      I_val[k] = log(I_val[k]);
    }
    //determine the sum for the marginal model, p_it, denominator
    for(k in 1:size(R)){
      denom = denom + exp(to_row_vector(Z[k,])*alpha + rho[k]*I_val[k]);
    }
    //put everything together in the log likelihood;
    //the option-level term is divided by rho[k] to match the
    //conditional model in the likelihood above
    for(k in 1:size(R)){
      for(m in 1:R[k]){
        count2 = count2 + 1;
        lprob = lprob + y[count2]*(to_row_vector(Z[k,])*alpha +
          rho[k]*I_val[k] - log(denom) +
          to_row_vector(X[count2,])*beta[,k]/rho[k] - I_val[k]);
      }
    }
    //return the log likelihood value
    return lprob;
  }
}
data {
  int N; //number of observations
  int T; //number of branches
  int R[T]; //number of options per branch
  int P; //dim of Z
  int Q; //dim of X
  real y[N,sum(R)]; //outcomes array
  real Z[N,T,P]; //branch-varying variables array
  real X[N,sum(R),Q]; //option-varying variables array
}
parameters {
  vector<lower=0, upper=1>[T] rho; //scale-parameters
  vector[P] alpha; //branch-varying parameters
  matrix[Q,T] beta; //option-varying parameters
}
model {
  //specify priors; rho has an implicit Uniform(0,1) prior from its bounds
  for(p in 1:P) alpha[p] ~ normal(0,5);
  for(q in 1:Q) for(t in 1:T) beta[q,t] ~ normal(0,5);

  //loop over all observations with the data
  for(i in 1:N){
    y[i] ~ nlogit(Z[i,,],X[i,,],R,alpha,beta,rho);
  }
}

Simulation model

To see whether our model is doing what we’re hoping it’s doing, we can run a simple test with simulated data. It may be useful to compare the result we get to those from other estimators; the nested logit is most frequently estimated using the FIML estimator. But, neither Stata nor R provide packages that estimate a model with branch-varying variables – another reason why we sometimes need to program our own models.

The code we’ll use to simulate the data is:

#### simulate 2-level nested logit data ###

N <- 300 #number of people
P <- 2 #number of branch variant variables
Q <- 2 #number of option variant variables
R <- c(2,2,2) #vector with number of options per branch
T <- length(R) #number of branches
Rk <- sum(R) #number of options

#simulate data

Z <- array(rnorm(N*T*P,0,0.5),dim = c(N,T,P))
X <- array(rnorm(N*Rk*Q,0,0.5), dim = c(N,Rk,Q))

rho <- runif(3,0.5,1)
beta <- matrix(rnorm(T*Q,0,1),c(Q,T))
alpha <- rnorm(P,0,1)

#option models #change beta indexing as required
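vals_opt <- cbind(exp(X[,1,]%*%beta[,1]/rho[1]),exp(X[,2,]%*%beta[,1]/rho[1]),
 exp(X[,3,]%*%beta[,2]/rho[2]),exp(X[,4,]%*%beta[,2]/rho[2]),
 exp(X[,5,]%*%beta[,3]/rho[3]),exp(X[,6,]%*%beta[,3]/rho[3]))

#sum within each branch; the log of this is the inclusive value
incl_val <- cbind(vals_opt[,1]+vals_opt[,2],vals_opt[,3]+vals_opt[,4],vals_opt[,5]+vals_opt[,6])

#branch models
vals_branch <- cbind(exp(Z[,1,]%*%alpha + rho[1]*log(incl_val[,1])),
 exp(Z[,2,]%*%alpha + rho[2]*log(incl_val[,2])),
 exp(Z[,3,]%*%alpha + rho[3]*log(incl_val[,3])))

sum_branch <- rowSums(vals_branch)

#joint probability = conditional (option | branch) x marginal (branch);
#options 2 to 6 follow the same pattern as option 1
probs <- cbind((vals_opt[,1]/incl_val[,1])*(vals_branch[,1]/sum_branch),
 (vals_opt[,2]/incl_val[,1])*(vals_branch[,1]/sum_branch),
 (vals_opt[,3]/incl_val[,2])*(vals_branch[,2]/sum_branch),
 (vals_opt[,4]/incl_val[,2])*(vals_branch[,2]/sum_branch),
 (vals_opt[,5]/incl_val[,3])*(vals_branch[,3]/sum_branch),
 (vals_opt[,6]/incl_val[,3])*(vals_branch[,3]/sum_branch))

#draw one choice per person from their choice probabilities
Y <- t(apply(probs, 1, rmultinom, n = 1, size = 1))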

Then we’ll put the data into a list and run the Stan program with 500 iterations and 3 chains:

data <- list(
 y = Y,
 X = X,
 Z = Z,
 R = R,
 T = T,
 N = N,
 P = P,
 Q = Q
)

library(rstan)
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

fit <- stan("C:/Users/Samuel/Dropbox/Code/nlogit.stan",
 data = data,
 chains = 3,
 iter = 500)

Which gives results (with the 25th percentile dropped to fit on screen):

> print(fit)
Inference for Stan model: nlogit.
3 chains, each with iter=500; warmup=250; thin=1; 
post-warmup draws per chain=250, total post-warmup draws=750.

             mean se_mean   sd    2.5%     50%     75%   97.5% n_eff Rhat
rho[1]       1.00    0.00 0.00    0.99    1.00    1.00    1.00   750  1
rho[2]       0.87    0.00 0.10    0.63    0.89    0.95    1.00   750  1
rho[3]       0.95    0.00 0.04    0.84    0.97    0.99    1.00   750  1
alpha[1]    -1.00    0.01 0.17   -1.38   -0.99   -0.88   -0.67   750  1
alpha[2]    -0.56    0.01 0.16   -0.87   -0.56   -0.45   -0.26   750  1
beta[1,1]   -3.65    0.01 0.32   -4.31   -3.65   -3.44   -3.05   750  1
beta[1,2]   -0.28    0.01 0.24   -0.74   -0.27   -0.12    0.15   750  1
beta[1,3]    0.99    0.01 0.25    0.48    0.98    1.15    1.52   750  1
beta[2,1]   -0.15    0.01 0.25   -0.62   -0.16    0.00    0.38   750  1
beta[2,2]    0.28    0.01 0.24   -0.16    0.28    0.44    0.75   750  1
beta[2,3]    0.58    0.01 0.24    0.13    0.58    0.75    1.07   750  1
lp__      -412.84    0.14 2.53 -418.56 -412.43 -411.05 -409.06   326  1

Samples were drawn using NUTS(diag_e) at Sun May 06 14:16:43 2018.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

Which we can compare to the original parameters:

> beta
           [,1]       [,2]      [,3]
[1,] -3.9381389 -0.3476054 0.7191652
[2,] -0.1182806  0.2736159 0.5237470
> alpha
[1] -0.9654045 -0.6505002
> rho
[1] 0.9503473 0.9950653 0.5801372

You can see that the posterior means and quantiles of the distribution provide pretty good estimates of the original parameters. Convergence diagnostics such as Rhat and traceplots (not reproduced here) show good convergence of the chains. But, of course, this is not enough for us to rely on it completely – you would want to investigate further to ensure that the chains were actually exploring the posterior of interest.
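
For readers wondering what the Rhat column is measuring, the following is a minimal Python sketch of the classic split R-hat described in Gelman et al.’s Bayesian Data Analysis. It is an illustration on simulated draws, not the code Stan runs, and newer versions of Stan refine the calculation, so don’t expect the numbers to match Stan’s output exactly.

```python
import numpy as np

def split_rhat(draws):
    """Classic split R-hat for one parameter.

    draws: array of shape (n_chains, n_draws) of post-warmup draws.
    Each chain is split in half, then the between-chain and
    within-chain variances of the half-chains are compared.
    """
    n_chains, n_draws = draws.shape
    half = n_draws // 2
    # split each chain into two halves (a middle draw is dropped if n_draws is odd)
    splits = np.concatenate([draws[:, :half], draws[:, n_draws - half:]], axis=0)
    n = splits.shape[1]
    chain_means = splits.mean(axis=1)
    W = splits.var(axis=1, ddof=1).mean()  # within-chain variance
    B = n * chain_means.var(ddof=1)        # between-chain variance
    var_plus = (n - 1) / n * W + B / n     # pooled estimate of the posterior variance
    return float(np.sqrt(var_plus / W))

# Simulated draws, for illustration only
rng = np.random.default_rng(0)
good = rng.normal(size=(3, 250))              # three chains sampling the same distribution
bad = good + np.array([[0.0], [2.0], [4.0]])  # three chains stuck in different places
print(split_rhat(good), split_rhat(bad))
```

Values close to one indicate that the chains agree with each other; values well above one, as for the second set of draws here, indicate that they do not. Agreement between chains is necessary but not sufficient, which is why further checks are still needed.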


I am not aware of any examples in health economics of using custom likelihoods in Stan. There are not even many examples of Bayesian nested logit models, one exception being a paper by Lahiri and Gao, who ‘analyse’ the nested logit using MCMC. But given the limitations of MCMC discussed above, one should prefer the implementation in this post to the MCMC samplers of that paper. It’s also a testament to computing advances and Stan that in 2001 an MCMC sampler and analysis could fill a paper in a decent econometrics journal and now we can knock one out for a blog post.

In terms of nested logit models in general in health economics, there are many examples going back 30 years (e.g. this article from 1987). More recent papers have preferred “mixed” or “random parameters” logit or probit specifications, which are much more flexible than the nested logit. We would advise these sorts of models for this reason. The nested logit was used as an illustrative example of estimating custom likelihoods for this post.



Chris Sampson’s journal round-up for 14th May 2018


A practical guide to conducting a systematic review and meta-analysis of health state utility values. PharmacoEconomics [PubMed] Published 10th May 2018

I love articles that outline the practical application of a particular method to solve a particular problem, especially when the article shares analysis code that can be copied and adapted. This paper does just that for the case of synthesising health state utility values. Decision modellers use utility values as parameters. Most of the time these are drawn from a single source which almost certainly introduces some kind of bias to the resulting cost-effectiveness estimates. So it’s better to combine all of the relevant available information. But that’s easier said than done, as numerous researchers (myself included) have discovered. This paper outlines the various approaches and some of the merits and limitations of each. There are some standard stages, for which advice is provided, relating to the identification, selection, and extraction of data. Those are by no means simple tasks, but the really tricky bit comes when you try and pool the utility values that you’ve found. The authors outline three strategies: i) fixed effect meta-analysis, ii) random effects meta-analysis, and iii) mixed effects meta-regression. Each is illustrated with a hypothetical example, with Stata and R commands provided. Broadly speaking, the authors favour mixed effects meta-regression because of its ability to identify the extent of similarity between sources and to help explain heterogeneity. The authors insist that comparability between sources is a precondition for pooling. But the thing about health state utility values is that they are – almost by definition – never comparable. Different population? Not comparable. Different treatment pathway? No chance. Different utility measure? Ha! They may or may not appear to be similar statistically, but that’s totally irrelevant. What matters is whether the decision-maker ‘believes’ the values. If they believe them then they should be included and pooled. 
If decision-makers have reason to believe one source more or less than another then this should be accounted for in the weighting. If they don’t believe them at all then they should be excluded. Comparability is framed as a statistical question, when in reality it is a conceptual one. For now, researchers will have to tackle that themselves. This paper doesn’t solve all of the problems around meta-analysis of health state utility values, but it does a good job of outlining methodological developments to date and provides recommendations in accordance with them.
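
To illustrate the difference between the first two pooling strategies, here is a minimal Python sketch of inverse-variance fixed effect pooling and random effects pooling. The utility values and standard errors are invented for illustration, and the DerSimonian-Laird estimator of the between-study variance is my choice for the sketch; the paper’s own Stata and R commands may do things differently.

```python
import numpy as np

def pool_fixed(est, se):
    """Inverse-variance fixed effect pooled estimate and its standard error."""
    w = 1.0 / se**2
    return np.sum(w * est) / np.sum(w), np.sqrt(1.0 / np.sum(w))

def pool_random(est, se):
    """DerSimonian-Laird random effects pooled estimate, standard error and tau^2."""
    w = 1.0 / se**2
    fixed, _ = pool_fixed(est, se)
    q = np.sum(w * (est - fixed) ** 2)         # Cochran's Q statistic
    c = np.sum(w) - np.sum(w**2) / np.sum(w)
    tau2 = max(0.0, (q - (len(est) - 1)) / c)  # between-study variance
    w_star = 1.0 / (se**2 + tau2)              # weights allowing for heterogeneity
    pooled = np.sum(w_star * est) / np.sum(w_star)
    return pooled, np.sqrt(1.0 / np.sum(w_star)), tau2

# Hypothetical utility values and standard errors from four studies
est = np.array([0.70, 0.65, 0.80, 0.60])
se = np.array([0.03, 0.05, 0.04, 0.02])

print(pool_fixed(est, se))
print(pool_random(est, se))
```

When the sources disagree by more than their standard errors would suggest, the random effects model spreads the weights more evenly across studies and reports a wider interval. Neither model says anything about whether the values are conceptually comparable in the first place.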

Unemployment, unemployment duration, and health: selection or causation? The European Journal of Health Economics [PubMed] Published 3rd May 2018

One of the major socioeconomic correlates of poor health is unemployment. It appears not to be very good for you. But there’s an obvious challenge here – does unemployment cause ill-health, or are unhealthy people just more likely to be unemployed? Both, probably, but that answer doesn’t make for clear policy solutions. This paper – following a large body of literature – attempts to explain what’s going on. Its novelty comes in the way the author considers timing and distinguishes between mental and physical health. The basis for the analysis is that selection into unemployment by the unhealthy ought to imply time-constant effects of unemployment on health. On the other hand, the negative effect of unemployment on health ought to grow over time. Using seven waves of data from the German Socio-economic Panel, a sample of 17,000 people (chopped from 48,000) is analysed, of which around 3,000 experienced unemployment. The basis for measuring mental and physical health is summary scores from the SF-12. A fixed-effects model is constructed based on the dependence of health on the duration and timing of unemployment, rather than just the occurrence of unemployment per se. The author finds a cumulative effect of unemployment on physical ill-health over time, implying causation. This is particularly pronounced for people unemployed in later life, and there was essentially no impact on physical health for younger people. The longer people spent unemployed, the more their health deteriorated. This was accompanied by a strong long-term selection effect of less physically healthy people being more likely to become unemployed. In contrast, for mental health, the findings suggest a short-term selection effect of people who experience a decline in mental health being more likely to become unemployed. But then, following unemployment, mental health declines further, so the balance of selection and causation effects is less clear. 
In contrast to physical health, people’s mental health is more badly affected by unemployment at younger ages. By no means does this study prove the balance between selection and causality. It can’t account for people’s anticipation of unemployment or future ill-health. But it does provide inspiration for better-targeted policies to limit the impact of unemployment on health.

Different domains – different time preferences? Social Science & Medicine [PubMed] Published 30th April 2018

Economists are often criticised by non-economists. Usually, the criticisms are unfounded, but one of the ways in which I think some (micro)economists can have tunnel vision is in thinking that preferences elicited with respect to money exhibit the same characteristics as preferences about things other than money. My instinct tells me that – for most people – that isn’t true. This study looks at one of those characteristics of preferences – namely, time preferences. Unfortunately for me, it suggests that my instincts aren’t correct. The authors outline a quasi-hyperbolic discounting model, incorporating both short-term present bias and long-term impatience, to explain gym members’ time preferences in the health and monetary domains. A survey was conducted with members of a chain of fitness centres in Denmark, of which 1,687 responded. Half were allocated to money-related questions and half to health-related questions. Respondents were asked to match an amount of future gains with an amount of immediate gains to provide a point of indifference. Health problems were formulated as back pain, with an EQ-5D-3L level 2 for usual activities and a level 2 for pain or discomfort. The findings were that estimates for discount rates and present bias in the two domains are different, but not by very much. On average, discount rates are slightly higher in the health domain – a finding driven by female respondents and people with more education. Present bias is the same – on average – in each domain, though retired people are more present biased for health. The authors conclude by focussing on the similarity between health and monetary time preferences, suggesting that time preferences in the monetary domain can safely be applied in the health domain. But I’d still be wary of this. For starters, one would expect a group of gym members – who have all decided to join the gym – to be relatively homogenous in their time preferences. 
Findings are similar on average, and there are only small differences in subgroups, but when it comes to health care (even public health) we’re never dealing with average people. Targeted interventions are increasingly needed, which means that differential discount rates in the health domain – of the kind identified in this study – should be brought into focus.
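
For anyone unfamiliar with quasi-hyperbolic discounting, here is a minimal Python sketch of the standard beta-delta form, in which beta captures present bias and delta captures long-term impatience. The parameter values are hypothetical and the functions are my own illustration of the matching task described above, not the authors’ estimation code.

```python
def quasi_hyperbolic_discount(t, beta, delta):
    """Discount factor at delay t (in periods) under the beta-delta model."""
    if t < 0:
        raise ValueError("delay must be non-negative")
    # no discounting of immediate outcomes; all future periods carry the extra factor beta
    return 1.0 if t == 0 else beta * delta**t

def matching_immediate_amount(future_amount, t, beta, delta):
    """Immediate gain that leaves a person indifferent to future_amount at delay t."""
    return future_amount * quasi_hyperbolic_discount(t, beta, delta)

# Hypothetical parameter values, for illustration only
beta, delta = 0.9, 0.95
print(matching_immediate_amount(100.0, 1, beta, delta))
print(matching_immediate_amount(100.0, 2, beta, delta))
```

With beta equal to one the model reduces to ordinary exponential discounting, so a domain-specific beta or delta is exactly what a comparison between health and money is looking for.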