Sam Watson’s journal round-up for 11th February 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Contest models highlight inherent inefficiencies of scientific funding competitions. PLoS Biology [PubMed] Published 2nd January 2019

If you work in research, you will no doubt have thought to yourself at some point that you spend more time applying to do research than actually doing it. You can spend weeks working on (what you believe to be) a strong proposal only for it to fail against other strong bids. That time could have been spent collecting and analysing data. Indeed, the opportunity cost of writing extensive proposals can be very high. The question arises as to whether there is another method of allocating research funding that reduces this waste and inefficiency. This paper compares the proposal competition to a partial lottery. In this lottery system, proposals are short, and among those that meet some qualifying standard the funded proposals are selected at random. This system has the benefit of not taking up too much time but has the cost of reducing the average scientific value of the winning proposals. The authors compare the two approaches using an economic model of contests, which takes into account factors like proposal strength, public benefits, benefits to the scientist like reputation and prestige, and scientific value. Ultimately they conclude that, when the number of awards is smaller than the number of proposals worthy of funding, the proposal competition is inescapably inefficient. It means that researchers have to invest heavily to get a good project funded, and even if a project is good enough it may still not get funded. The stiffer the competition, the more researchers have to work to win the award. And what little evidence there is suggests that the format of the application makes little difference to the amount of time researchers spend writing it. The lottery mechanism only requires the researcher to propose something that is good enough to get into the lottery. Far less time would therefore be devoted to writing proposals and more time spent on actual science. I’m all for it!
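To make the trade-off concrete, here is a toy simulation (not taken from the paper; the number of proposals, writing times, and qualifying rule are all invented) contrasting total researcher time and the average quality of funded projects under the two mechanisms:

```python
import numpy as np

rng = np.random.default_rng(42)

n_proposals = 200   # proposals competing in a funding round (invented)
n_awards = 20       # grants available (invented)
short_cost = 2.0    # days to write a short lottery entry (invented)
long_cost = 30.0    # days to write a full competitive proposal (invented)

# Latent scientific quality of each underlying project.
quality = rng.normal(size=n_proposals)

# Proposal competition: everyone writes a full proposal and the top-scoring
# bids win. Reviewer scores are noisy signals of quality.
scores = quality + rng.normal(scale=0.5, size=n_proposals)
winners_contest = np.argsort(scores)[-n_awards:]
effort_contest = n_proposals * long_cost

# Partial lottery: everyone writes a short proposal; those above a modest
# qualifying threshold enter a draw for the awards.
pool = np.flatnonzero(scores > np.quantile(scores, 0.5))  # top half qualify (invented rule)
winners_lottery = rng.choice(pool, size=n_awards, replace=False)
effort_lottery = n_proposals * short_cost

print(f"Competition: {effort_contest:.0f} researcher-days, "
      f"mean funded quality {quality[winners_contest].mean():.2f}")
print(f"Lottery:     {effort_lottery:.0f} researcher-days, "
      f"mean funded quality {quality[winners_lottery].mean():.2f}")
```

With these invented numbers the lottery saves an order of magnitude of researcher time at the cost of a somewhat lower average quality among funded projects, which is essentially the trade-off the authors formalise.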

Preventability of early versus late hospital readmissions in a national cohort of general medicine patients. Annals of Internal Medicine [PubMed] Published 5th June 2018

Hospital quality is hard to judge. We’ve discussed on this blog before the pitfalls of using measures such as adjusted mortality differences for this purpose. Just because a hospital has higher than expected mortality does not mean those deaths could have been prevented with higher quality care. More thorough methods assess errors and preventable harm in care. Case note review studies have suggested that as little as 5% of deaths might be preventable in England and Wales. Another paper we have covered previously suggests that the predictive value of standardised mortality ratios for preventable deaths may be less than 10%.

Another commonly used metric is readmission rates. Poor care can mean patients have to return to the hospital. But again, the question remains as to how preventable these readmissions are. Indeed, there may also be substantial differences between those patients who are readmitted shortly after discharge and those for whom readmission takes longer. This article explores the preventability of early and late readmissions in ten hospitals in the US. It uses case note review by a number of reviewers to evaluate preventability. The headline figures are that 36% of early readmissions are considered preventable compared to 23% of late readmissions. Moreover, the early readmissions were judged most likely to have been preventable at the hospital, whereas for late readmissions an outpatient clinic or the home would have had more impact. All in all, another paper which provides evidence to suggest that crude, or even adjusted, readmission rates are not good indicators of hospital quality.

Visualisation in Bayesian workflow. Journal of the Royal Statistical Society: Series A (Statistics in Society) [RePEc] Published 15th January 2019

This article stems from a broader programme of work by these authors on good “Bayesian workflow”. That is to say, if we’re taking a Bayesian approach to analysing data, what steps ought we to take to ensure our analyses are as robust and reliable as possible? I’ve been following this work for a while as this type of pragmatic advice is invaluable. I’ve often read empirical papers where the authors have chosen, say, a logistic regression model with covariates x, y, and z and reported the outcomes, but at no point justified why this particular model might be any good at all for these data or the research objective. The key steps of the workflow include, first, exploratory data analysis to help set up a model, and second, performing model checks before estimating model parameters. This latter step is important: one can generate data from the model and its prior distributions, and if the data that this model generates look nothing like what we would expect the real data to look like, then clearly the model is not very good. Following this, we should check whether our inference algorithm is doing its job, for example, are the MCMC chains converging? We can also conduct posterior predictive model checks. These have been criticised in the literature for using the same data to both estimate and check the model, which could lead to the model generalising poorly to new data. Indeed, in a recent paper of my own, posterior predictive checks showed poor fit of a model to my data and suggested that a more complex alternative fitted better. But other model fit statistics, which penalise the number of parameters, led to the opposite conclusion, so the simpler model was preferred on the grounds that the more complex model was overfitting the data. I would therefore argue that posterior predictive model checks are a sensible test to perform but must be interpreted carefully as one step among many. Finally, we can compare models using tools like cross-validation.
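As an illustration of that second step, here is a minimal prior predictive check for a hypothetical logistic regression (the data, priors, and outcome are all made up; the paper itself works through an air pollution example):

```python
import numpy as np

rng = np.random.default_rng(1)

# Suppose we plan a logistic regression of 30-day mortality on standardised age.
# Before looking at the real outcomes, simulate datasets from the priors alone
# and ask whether they look remotely plausible.
n = 500
age = rng.normal(size=n)                 # standardised covariate (made up)

n_sims = 1000
alpha = rng.normal(0, 10, size=n_sims)   # "vague" priors (assumed)
beta = rng.normal(0, 10, size=n_sims)

sim_rates = np.empty(n_sims)
for s in range(n_sims):
    p = 1 / (1 + np.exp(-(alpha[s] + beta[s] * age)))
    sim_rates[s] = rng.binomial(1, p).mean()   # simulated in-sample event rate

# With N(0, 10) priors most simulated event rates pile up near 0 or 1, which is
# implausible for 30-day mortality, so we would tighten the priors before fitting.
print(np.quantile(sim_rates, [0.05, 0.5, 0.95]))
```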

This article discusses the use of visualisation to aid in this workflow. They use the running example of building a model to estimate exposure to small particulate matter from air pollution across the world. Plots are produced for each of the steps and show just how bad some models can be and how we can refine our model step by step to arrive at a convincing analysis. I agree wholeheartedly with the authors when they write, “Visualization is probably the most important tool in an applied statistician’s toolbox and is an important complement to quantitative statistical procedures.”

Sam Watson’s journal round-up for 9th July 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Evaluating the 2014 sugar-sweetened beverage tax in Chile: an observational study in urban areas. PLoS Medicine [PubMed] Published 3rd July 2018

Sugar taxes are one of the public health policy options currently in vogue. Countries including Mexico, the UK, South Africa, and Sri Lanka all have sugar taxes. The aim of such levies is to reduce demand for the most sugary drinks, or if the tax is absorbed on the supply side, which is rare, to encourage producers to reduce the sugar content of their drinks. One may also view it as a form of Pigouvian taxation to internalise the public health costs associated with obesity. Chile has long had an ad valorem tax on soft drinks fixed at 13%, but in 2014 decided to pursue a sugar tax approach. Drinks with more than 6.25g/100ml saw their tax rate rise to 18% and the tax on those below this threshold dropped to 10%. To understand what effect this change had, we would want to know three key things along the causal pathway from tax policy to sugar consumption: did people know about the tax change, did prices change, and did consumption behaviour change. On this latter point, we can consider both the overall volume of soft drinks and whether people substituted low sugar for high sugar beverages. Using the Kantar Worldpanel, a household panel survey of purchasing behaviour, this paper examines these questions.

Everyone in Chile was affected by the tax, so there is no control group. We must rely on time series variation to identify the effect of the tax. Sometimes, looking at plots of the data reveals a clear step-change when an intervention is introduced (e.g. the plot in this post); not so in this paper. We therefore rely heavily on the results of the model for our inferences, and I have a couple of small gripes with it. First, the model captures household fixed effects, but no consideration is given to dynamic effects. Some households may be more or less likely to buy drinks, but their decisions are also likely to be affected by how much they’ve recently bought. Similarly, the errors may be correlated over time. Ignoring dynamic effects can lead to large biases. Second, the authors choose among different functional form specifications of time using the Akaike Information Criterion (AIC). While AIC and the Bayesian Information Criterion (BIC) are often thought to be interchangeable, they are not: AIC estimates predictive performance on future data, while BIC estimates goodness of fit to the data. Thus, I would think BIC would be more appropriate. Additional results show that the estimates are very sensitive to the choice of functional form, varying by an order of magnitude and even in sign. The authors estimate a fairly substantial decrease of around 22% in the volume of high-sugar drinks purchased, but find that the price paid changed very little (~1.5%) and that there was little change in purchases of other drinks. While the analysis is generally careful and well thought out, I am not wholly convinced by the authors’ conclusion that “Our main estimates suggest a significant, sizeable reduction in the volume of high-tax soft drinks purchased.”
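To illustrate the point about dynamic effects, here is a minimal sketch (fabricated data and variable names, not the authors’ specification) of adding last month’s purchases to a household fixed-effects regression. Note that a lagged outcome combined with fixed effects brings its own (Nickell) bias, so a proper dynamic panel estimator would be needed in practice; this only shows what the extra term looks like.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Fabricated panel: monthly litres of high-sugar drinks per household.
rng = np.random.default_rng(0)
households, months = 100, 36
df = pd.DataFrame({
    "household": np.repeat(np.arange(households), months),
    "month": np.tile(np.arange(months), households),
})
df["post_tax"] = (df["month"] >= 18).astype(int)   # tax change midway (invented)
df["litres"] = 2.0 - 0.3 * df["post_tax"] + rng.normal(scale=0.5, size=len(df))

# Static specification: household fixed effects only.
static = smf.ols("litres ~ post_tax + C(household)", data=df).fit()

# Dynamic specification: add last month's purchases so that habit formation
# is not forced into the error term.
df["litres_lag"] = df.groupby("household")["litres"].shift(1)
dynamic = smf.ols("litres ~ litres_lag + post_tax + C(household)",
                  data=df.dropna()).fit()

print(static.params["post_tax"], dynamic.params["post_tax"])
```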

A Bayesian framework for health economic evaluation in studies with missing data. Health Economics [PubMed] Published 3rd July 2018

Missing data is a ubiquitous problem. I’ve never used a data set where no observations were missing, and I doubt I’m alone. Despite its pervasiveness, missing data is often only afforded an acknowledgement in the discussion, or perhaps, in more complete analyses, something like multiple imputation will be used. Indeed, the majority of trials in the top medical journals don’t handle it correctly, if at all. The majority of the methods used for missing data in practice assume the data are ‘missing at random’ (MAR). One interpretation is that this means that, conditional on the observable variables, the probability of data being missing is independent of unobserved factors influencing the outcome. Another interpretation is that the distribution of the potentially missing data does not depend on whether they are actually missing. This interpretation comes from factorising the joint distribution of the outcome Y and an indicator of whether the datum is observed R, along with some covariates X, into a conditional and a marginal model: f(Y,R|X) = f(Y|R,X)f(R|X), a so-called pattern mixture model. This contrasts with the ‘selection model’ approach: f(Y,R|X) = f(R|Y,X)f(Y|X).

This paper considers a Bayesian approach using the pattern mixture model for missing data in health economic evaluation. Specifically, the authors specify a multivariate normal model for the data with an additional term in the mean if an observation is missing, i.e. a model of f(Y|R,X). A model is not specified for f(R|X). If it were, then you would typically allow for correlation between the errors in this model and the main outcomes model. But one could view the additional term in the outcomes model as some function of the error from the observation model, somewhat akin to a control function. Instead, this article uses expert elicitation methods to generate a prior distribution for the unobserved terms in the outcomes model. While this is certainly a legitimate way forward in my eyes, I do wonder how specification of a full observation model would affect the results. The approach of this article is useful and the authors show that it works, and I don’t want to detract from that; but, given the lack of literature on missing data in this area, I am curious to see how it compares with other approaches, including selection models. You could even add shared parameter models as an alternative, all of which are feasible. Perhaps an idea for a follow-up study. As a final point, the models are run in WinBUGS, but regular readers will know I think Stan is the future for estimating Bayesian models, especially in light of the problems with MCMC we’ve discussed previously. So equivalent Stan code would have been a bonus.
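For intuition, here is a drastically simplified sketch of the pattern-mixture idea in plain Python (not the paper’s multivariate normal model, and with all numbers fabricated): the mean of the missing outcomes is taken to be the observed mean shifted by an offset delta, and since the data carry no information about delta, it is drawn from an elicited prior and the uncertainty propagated through.

```python
import numpy as np

rng = np.random.default_rng(7)

# Fabricated trial arm: 150 observed costs, 50 missing.
observed = rng.normal(2000, 500, size=150)
n_missing = 50

# Pattern mixture idea: E[Y | R=0] = E[Y | R=1] + delta, with delta coming
# from an elicited prior because the data say nothing about it. Here the
# prior (invented) says the missing patients tend to cost more.
n_draws = 5000
delta = rng.normal(500, 300, size=n_draws)

overall_mean = np.empty(n_draws)
for d in range(n_draws):
    y_miss = rng.normal(observed.mean() + delta[d], observed.std(), size=n_missing)
    overall_mean[d] = np.concatenate([observed, y_miss]).mean()

# Interval for the arm mean, reflecting the elicited uncertainty about delta
# (a full analysis would also propagate sampling uncertainty in the observed data).
print(np.quantile(overall_mean, [0.025, 0.5, 0.975]))
```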

Trade challenges at the World Trade Organization to national noncommunicable disease prevention policies: a thematic document analysis of trade and health policy space. PLoS Medicine [PubMed] Published 26th June 2018

This is an economics blog. But focusing solely on economics papers in these round-ups would mean missing out on some papers from related fields that may provide insight into our own work. Thus I present to you a politics and sociology paper. It is not my field and I can’t give a reliable appraisal of the methods, but the results are of interest. In the global fight against non-communicable diseases, there is a range of policy tools available to governments, including the sugar tax of the paper at the top. The WHO recommends a large number of them. However, there is ongoing debate about whether trade rules and agreements are used to undermine this public health legislation. One agreement, the Technical Barriers to Trade (TBT) Agreement that World Trade Organization (WTO) members all sign, states that members may not impose ‘unnecessary trade costs’ or barriers to trade, especially if the intended aim of the measure can be achieved without doing so. For example, Philip Morris cited a bilateral trade agreement when it sued the Australian government for introducing plain packaging, claiming it violated the terms of trade. Philip Morris eventually lost, but not before substantial costs were incurred. In another example, the Thai government were deterred from introducing a traffic light warning system for food after threats of a trade dispute from the US, which cited WTO rules. However, there has been no clear evidence on the extent to which trade disputes have undermined public health measures.

This article presents results from a new database of all WTO TBT challenges. Between 1995 and 2016, 93 challenges were raised concerning food, beverage, and tobacco products, with the number per year growing over time. The most frequent challenges concerned product labelling, followed by restricted ingredients. The paper presents four case studies, including Indonesia delaying food labelling of fat, sugar, and salt content after challenges by several members including the EU, and many members, including the EU again and the US, objecting to the size and colour of a red STOP sign that Chile wanted to put on products high in sugar, fat, and salt.

We have previously discussed the politics and political economy around public health policy relating to e-cigarettes, among other things. Understanding the political economy of public health and phenomena like government failure can be as important as understanding markets and market failure in designing effective interventions.


Brent Gibbons’s journal round-up for 22nd January 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Is retirement good for men’s health? Evidence using a change in the retirement age in Israel. Journal of Health Economics [PubMed] Published January 2018

This article is a tour de force from one chapter of a recently completed dissertation from the Hebrew University of Jerusalem. The article focuses on the health implications of extending working years for older adults. As many countries face critical decisions on how to adjust labor policies to address rising pension costs (or, in the case of the U.S., Social Security insolvency) in the face of aging populations, one obvious potential solution is to change the retirement age. Most OECD countries appear to have retirement ages in the mid-60s, with a number of countries on track to increase that threshold. Israel is one of these countries, having raised its retirement age for men from 65 to 67 in 2004. The author capitalizes on this exogenous change in retirement incentives, as workers are incentivized to keep working to receive full pension benefits, to measure the causal effect of working in these later years, compared to retiring. As the relationship between employment and health is complicated by the endogenous nature of the decision to work, there is a growing literature that has attempted to deal with this endogeneity in different ways. Shai details the conflicting findings in this literature and describes various shortcomings of the methods used. He helpfully categorizes studies into those that compare health between retirees and non-retirees (which do not deal with the selection problem), those that use variation in retirement age across countries (retirement ages could be correlated with individual health across countries), those that exploit variation in sector-specific retirement ages (a problem for generalizing to the population), and those that use age-specific retirement eligibility (health may deteriorate at specific ages regardless of eligibility for retirement). As this empirical question has yielded conflicting evidence, the author suggests that his methodology is an improvement on prior papers. He uses a difference-in-differences model that estimates the impact on various health outcomes, before and after the law change, comparing those aged 65-66 after 2004 with both older and younger cohorts unaffected by the law. The assumption is that any differences in measured health between the 65-66 group and the comparison groups are a result of the extended work in later years. There are several different datasets used in the study and quite a number of analyses that attempt to assuage threats to a causal interpretation of the results. Overall, the results are that delaying the retirement age has a negative effect on individual health. The size of the effect found is in the ballpark of 1 standard deviation; outcome measures included a severe morbidity index, a poor health index, and the number of physician visits. In addition, these impacts were stronger for individuals with lower levels of education, which the author relates to more physically demanding jobs. Counterfactual outcomes, for example the number of dentist visits, which are not expected to be related to employment, are not found to be statistically different. Furthermore, there are non-trivial estimated effects on health care expenditures that are positive for the delayed retirement group. The author suggests that all of these findings are important pieces of evidence for retirement age policy decisions. The implication is that health, at least for men, and especially for those with lower education, may be negatively impacted by delaying retirement and that, furthermore, savings as a result of such policies may be tempered by increased health care expenditures.
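For readers unfamiliar with the design, here is a minimal difference-in-differences sketch in the spirit of the paper (entirely fabricated data and effect sizes, not the author’s code or results):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Fabricated data mimicking the design: men observed before and after the 2004
# law, either in the affected 65-66 age band or in unaffected comparison cohorts.
rng = np.random.default_rng(3)
n = 5000
df = pd.DataFrame({
    "post2004": rng.integers(0, 2, n),
    "aged65_66": rng.integers(0, 2, n),
})
df["poor_health"] = (0.2
                     + 0.05 * df["post2004"] * df["aged65_66"]  # invented effect
                     + rng.normal(scale=0.1, size=n))

# Difference-in-differences: the interaction term is the estimated effect of
# exposure to the higher retirement age on the health outcome.
m = smf.ols("poor_health ~ post2004 * aged65_66", data=df).fit(cov_type="HC1")
print(m.params["post2004:aged65_66"])
```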

Evaluating community-based health improvement programs. Health Affairs [PubMed] Published January 2018

For article 2, I see that the lead author is a doctoral student in health policy at Harvard, working with colleagues at Vanderbilt. Without intention, this round-up is highlighting two very impressive studies from extremely promising young investigators. This study takes on the challenge of evaluating community-based health improvement programs, which I will call CBHIPs. CBHIPs take a population-based approach to public health for their communities and often focus on issues of prevention and health promotion. Investment in CBHIPs has increased in recent years, emphasizing collaboration between the community and the public and private sectors. At the heart of CBHIPs are the ideas that communities should be empowered to self-assess and make needed changes from within (in collaboration with outside partners) and that CBHIPs allow for more flexibility in creating programs that target a community’s unique needs. Evaluations of CBHIPs, however, suffer from limited resources and investment, and often use “easily-collectable data and pre-post designs without comparison or control communities.” Current overall evidence on the effectiveness of CBHIPs remains limited as a result. In this study, the authors attempt to evaluate a large set of CBHIPs across the United States using inverse propensity score weighting and a difference-in-differences analysis. County-level health outcomes (poor or fair health, smoking status, and obesity status) were taken from the BRFSS (Behavioral Risk Factor Surveillance System) SMART (Selected Metropolitan/Micropolitan Area Risk Trends) data. Information on counties implementing CBHIPs was compiled through a series of systematic web searches and through interviews with leaders in population health efforts in the public and private sectors. With information on the exact years of implementation of CBHIPs in each county, a pre-post design was used that identified county treatment and control groups. With additional census data, untreated counties were weighted to achieve better balance on pre-implementation covariates. Importantly, treated counties were limited to those with CBHIPs that implemented programs related to smoking and obesity. Results showed little to no evidence that CBHIPs improved population health outcomes. For example, CBHIPs focusing on tobacco prevention were associated with a 0.2 percentage point reduction in the rate of smoking, which was not statistically significant. Several important limitations of the study were noted by the authors, such as limited information on the intensity of programs and the resources available. It is recognized that it is difficult to improve population-level health outcomes and that the study period of 5 years post-implementation may not have been long enough. The researchers encourage future CBHIPs to utilize more rigorous evaluation methods, while acknowledging the uphill battle CBHIPs face to do this.
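The estimation strategy can be sketched roughly as follows (fabricated county data and covariates, not the authors’ code; the weights shown target the effect on treated counties):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Fabricated county data: smoking rates before and after CBHIP implementation.
rng = np.random.default_rng(11)
n_counties = 400
counties = pd.DataFrame({
    "treated": rng.integers(0, 2, n_counties),
    "median_income": rng.normal(50, 10, n_counties),
    "pct_rural": rng.uniform(0, 100, n_counties),
})

# Step 1: propensity score for implementing a CBHIP, from pre-period covariates.
ps_model = smf.logit("treated ~ median_income + pct_rural", data=counties).fit(disp=0)
ps = ps_model.predict(counties)

# ATT-style inverse propensity weights: 1 for treated counties, ps/(1-ps) for
# untreated ones, so the untreated group is reweighted to resemble the treated.
counties["w"] = np.where(counties["treated"] == 1, 1.0, ps / (1 - ps))

# Step 2: weighted difference-in-differences on the stacked pre/post panel.
panel = pd.concat([counties.assign(post=0), counties.assign(post=1)],
                  ignore_index=True)
panel["smoking"] = (20 - 0.5 * panel["post"]
                    - 0.2 * panel["post"] * panel["treated"]   # invented effect
                    + rng.normal(scale=1.0, size=len(panel)))
did = smf.wls("smoking ~ post * treated", data=panel, weights=panel["w"]).fit()
print(did.params["post:treated"])
```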

Through the looking glass: estimating effects of medical homes for people with severe mental illness. Health Services Research [PubMed] Published October 2017

The third article in this round-up comes from a publication from October of last year; however, it is in the latest issue of Health Services Research, so I deem it fair play. The article uses the topic of medical homes for individuals with severe mental illness to critically examine the topic of heterogeneous treatment effects. While specifically looking to answer whether there are heterogeneous treatment effects of medical homes on different portions of the population with a severe mental illness, the authors make a strong case for the need to examine heterogeneous treatment effects as a more general practice in observational research, as well as to be more precise in interpretations of results and statements of generalizability when presenting estimated effects. Adults with a severe mental illness were identified as good candidates for medical homes because of complex health care needs (including high physical health care needs) and because barriers to care have been found to exist for these individuals. Medicaid medical homes establish primary care physicians and their teams as the managers of the individual’s overall health care treatment. The authors are particularly concerned with the reasons individuals choose to participate in medical homes, whether because of expected improvements in quality of care, regional availability of medical homes, or symptomatology. Clever use of different estimation methods allows the authors to estimate treatment effects associated with these different enrollment reasons. As an example, an instrumental variables analysis, using measures of regional availability as instruments, estimated local average treatment effects that were much smaller than the fixed effects estimates or the generalized estimating equation model’s effects. This implies that the effects driven by differences in county-level medical home availability are a smaller portion of the overall effects measured by the other models. Overall, the results were that medical homes were positively associated with access to primary care, access to specialty mental health care, medication adherence, and measures of routine health care (e.g. screenings); there was also a slightly negative association with emergency room use. Since unmeasured stable attributes (e.g. patient preferences) do not seem to affect outcomes, results should be generalizable to the larger patient population. Finally, medical homes do not appear to be a good strategy for cost savings but do promise to increase access to appropriate levels of health care treatment.
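To show why an instrumental variables estimate can differ so much from a naive comparison, here is a toy two-stage least squares sketch (fabricated variables; the treatment effect is constant here, so it only illustrates the mechanics of removing confounding rather than the LATE/ATE distinction, and the second-stage standard errors from this manual two-step approach are not valid):

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Fabricated data: medical home enrolment, an unobserved confounder (symptom
# severity) that drives both enrolment and outcomes, and county-level
# availability of medical homes as the instrument.
rng = np.random.default_rng(21)
n = 10000
df = pd.DataFrame({"availability": rng.uniform(0, 1, n)})
severity = rng.normal(size=n)                                   # unobserved
df["enrolled"] = ((0.5 * df["availability"] + 0.5 * severity
                   + rng.normal(size=n)) > 0.5).astype(int)
df["pcp_visits"] = 2 + 1.0 * df["enrolled"] + 1.0 * severity + rng.normal(size=n)

# Naive OLS is confounded by severity and overstates the effect.
ols = smf.ols("pcp_visits ~ enrolled", data=df).fit()

# Two-stage least squares by hand: predict enrolment from the instrument,
# then regress the outcome on the prediction (point estimate only).
df["enrolled_hat"] = smf.ols("enrolled ~ availability", data=df).fit().fittedvalues
tsls = smf.ols("pcp_visits ~ enrolled_hat", data=df).fit()

print(f"OLS:  {ols.params['enrolled']:.2f}")
print(f"2SLS: {tsls.params['enrolled_hat']:.2f}")
```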
