Sam Watson’s journal round-up for 21st August 2017

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Multidimensional performance assessment of public sector organisations using dominance criteria. Health Economics [RePEc] Published 18th August 2017

The empirical assessment of the performance or quality of public organisations such as health care providers is an interesting and oft-tackled problem. Despite the development of sophisticated methods in a large and growing literature, public bodies continue to use demonstrably inaccurate or misleading statistics such as the standardised mortality ratio (SMR). Beyond the issue that these statistics may not be well correlated with underlying quality, organisations may improve on a given measure by sacrificing their performance on another outcome valued by different stakeholders. One example from a few years ago showed how hospital rankings based upon SMRs shifted significantly once readmission rates and their correlation with SMRs were taken into account. This paper advances this thinking a step further by considering multiple outcomes potentially valued by stakeholders and using dominance criteria to compare hospitals. A hospital dominates another if it performs at least as well across all outcomes and strictly better on at least one. Importantly, correlation between these measures is captured in a multilevel model. I am an advocate of this type of approach, that is, the use of multilevel models to combine information across multiple ‘dimensions’ of quality. Indeed, my only real criticism would be that it doesn’t go far enough! The multivariate normal model used in the paper assumes a linear relationship between outcomes in their conditional distributions. Similarly, an instrumental variable model is also used (with the now routine distance-to-health-facility instrument) that likewise assumes a linear relationship between outcomes and ‘unobserved heterogeneity’. The complex behaviour of health care providers may well mean these assumptions do not hold – for example, failing institutions may show poor performance across the board, while other facilities are able to trade off outcomes against one another. This would suggest a non-linear relationship.
I’m also finding it hard to get my head around the IV model: in particular, what the covariance matrix for the whole model is, and whether correlations are permitted at multiple levels as well. Nevertheless, it’s an interesting take on the performance question, but my faith that decent methods like this will be used in practice continues to wane as organisations such as Dr Foster still dominate quality monitoring.
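To make the dominance idea concrete, here is a minimal sketch (hypothetical hospitals and outcome scores, not data from the paper), with outcomes coded so that higher is better. A hospital dominates another if it scores at least as well on every outcome and strictly better on at least one:

```python
def dominates(a, b):
    """True if hospital `a` dominates hospital `b`: at least as good on
    every outcome and strictly better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

# Hypothetical (survival score, non-readmission score) pairs
hospitals = {
    "A": (0.95, 0.90),
    "B": (0.93, 0.88),
    "C": (0.96, 0.85),
}

# All ordered pairs (h1, h2) where h1 dominates h2
pairs = [(h1, h2) for h1 in hospitals for h2 in hospitals
         if h1 != h2 and dominates(hospitals[h1], hospitals[h2])]
print(pairs)  # A dominates B; A and C are incomparable
```

Note that dominance only gives a partial ordering: hospitals that trade off one outcome against another (A and C above) simply cannot be ranked against each other.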

A simultaneous equation approach to estimating HIV prevalence with nonignorable missing responses. Journal of the American Statistical Association [RePEc] Published August 2017

Non-response is a problem encountered more often than not in survey-based data collection. For many public health applications, though, surveys are the primary way of determining the prevalence and distribution of disease, knowledge of which is required for effective public health policy. Methods such as multiple imputation can be used in the face of missing data, but this requires an assumption that the data are missing at random. For disease surveys this is unlikely to be true. For example, the stigma around HIV may make many people choose not to respond to an HIV survey, leading to a situation where data are missing not at random. This paper tackles the question of estimating HIV prevalence in the face of informative non-response. Most economists are familiar with the Heckman selection model, which is a way of correcting for sample selection bias. The Heckman model is typically estimated or viewed as a control function approach, in which the residuals from a selection model are used in a model for the outcome of interest to control for unobserved heterogeneity. An alternative way of representing this model is as a copula between a survey response (selection) indicator and the outcome variable itself. This representation is more flexible and permits a variety of models for both selection and outcomes. This paper includes spatial effects (given the nature of disease transmission) not only in the selection and outcome models, but also in the model for the mixing parameter between the two marginal distributions, which allows the degree of informative non-response to differ by location and be correlated over space. The instrumental variable used is the identity of the interviewer, since different interviewers are expected to be more or less successful at collecting data, independent of the status of the individual being interviewed.
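As a rough illustration of the control-function logic behind the Heckman model mentioned above, the following sketch simulates informative non-response and applies the classic two-step correction: a probit response model, then an outcome regression that includes the inverse Mills ratio. All variables and coefficients are simulated and hypothetical, and this is the textbook Heckman two-step rather than the copula model of the paper:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)   # instrument, e.g. an interviewer effect
x = rng.normal(size=n)   # covariate appearing in both equations
# Correlated errors make non-response informative
u, e = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=n).T
respond = 0.5 * z + 0.5 * x + u > 0          # who answers the survey
y = 1.0 + 0.8 * x + e                         # outcome, observed only for responders

# Step 1: probit model for the response indicator
def nll(b):
    idx = b[0] + b[1] * z + b[2] * x
    p = norm.cdf(idx).clip(1e-10, 1 - 1e-10)
    return -np.where(respond, np.log(p), np.log(1 - p)).sum()

b = minimize(nll, np.zeros(3)).x
idx = b[0] + b[1] * z + b[2] * x
mills = norm.pdf(idx) / norm.cdf(idx)         # inverse Mills ratio (control function)

# Step 2: outcome regression on responders, adding the control function
X = np.column_stack([np.ones(n), x, mills])[respond]
coef, *_ = np.linalg.lstsq(X, y[respond], rcond=None)
print(coef[1])  # slope on x, corrected for informative non-response
```

Dropping the `mills` column reproduces the naive regression on responders only, whose slope is biased by the correlation between the two error terms.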

Clustered multistate models with observation level random effects, mover–stayer effects and dynamic covariates: modelling transition intensities and sojourn times in a study of psoriatic arthritis. Journal of the Royal Statistical Society: Series C [ArXiv] Published 25th July 2017

Modelling the progression of disease accurately is important for economic evaluation. A delicate balance between bias and variance should be sought: a model that is too simple will be wrong for most people, while a model that is too complex will be too uncertain. A huge range of models therefore exists, from ‘simple’ decision trees to ‘complex’ patient-level simulations. A popular choice is the multistate model, such as the Markov model, which provides a convenient framework for examining the evolution of stochastic processes and systems. A common feature of such models is the Markov property: the probability of moving to a given state depends only on the current state and not on what has happened previously. This can be relaxed by adding covariates to the transition intensities that capture event history or other salient features. This paper provides a neat example of extending this approach further in the case of arthritis. The development of arthritic damage in a hand joint can be described by a multistate model, but there are obviously multiple joints in one hand. What is more, the outcomes in any one joint are not likely to be independent of one another. This paper describes a multilevel model of transition probabilities for multiple correlated processes, along with other extensions such as dynamic covariates and different mover–stayer probabilities.
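As a toy illustration of a continuous-time multistate model of this general kind (the states and intensities below are hypothetical, not the paper’s fitted values), transition probabilities over an interval follow from the intensity matrix via the matrix exponential, and mean sojourn times come straight off its diagonal:

```python
import numpy as np
from scipy.linalg import expm

# States: 0 = no damage, 1 = moderate damage, 2 = severe damage (absorbing)
# Rows of the intensity matrix Q sum to zero; off-diagonals are transition rates.
Q = np.array([[-0.10,  0.10,  0.00],
              [ 0.00, -0.25,  0.25],
              [ 0.00,  0.00,  0.00]])

# Transition probability matrix over 5 years: P(t) = expm(Q * t)
P5 = expm(Q * 5)
print(P5[0])  # probabilities of being in each state after 5 years, starting undamaged

# Mean sojourn time in a transient state is -1 / (its diagonal intensity)
print(-1 / Q[0, 0], -1 / Q[1, 1])  # 10 years in state 0, 4 years in state 1
```

The Markov property is visible in the construction: `P5` depends only on `Q` and the elapsed time, not on the path taken. Covariates, random effects, and mover–stayer mixtures as in the paper would enter by modifying the entries of `Q`.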


Paul Mitchell’s journal round-up for 17th April 2017


Is foreign direct investment good for health in low and middle income countries? An instrumental variable approach. Social Science & Medicine [PubMed] Published 28th March 2017

Foreign direct investment (FDI) is considered a key benefit of globalisation for the economic development of countries with developing economies. The effect FDI has on the population health of countries is less well understood. In this paper, the authors draw from a large panel of data, primarily World Bank and UN sources, for 85 low and middle income countries between 1974 and 2012 to assess the relationship between FDI and population health, proxied by life expectancy at birth as well as child and adult mortality data. They explain clearly the problem of using basic regression analysis to examine this relationship, given the endogeneity between FDI and health outcomes. By introducing two instrumental variables, gross fixed capital formation and volatility of exchange rates in FDI origin countries, as well as controlling for GDP per capita, education, quality of institutions and urban population, the study shows that FDI is weakly statistically associated with life expectancy, estimated to amount to a 4.15-year increase in life expectancy over the study period. FDI also appears to reduce adult mortality, but has a negligible effect on child mortality. The authors also produce some evidence that FDI linked to manufacturing could lead to reductions in life expectancy, although these findings are not as robust as the other instrumental variable results, so they recommend that the relationship between FDI type and population health be explored further. The paper also clearly shows the benefit of robust analysis using instrumental variables: without them, the regression would have led to misleading inferences, with no relationship found between life expectancy and FDI because the analysis would not have adjusted for the underlying endogeneity bias.
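The instrumental variable logic the authors rely on can be sketched generically on simulated data (the variable names and effect sizes below are hypothetical, not the paper’s): an unobserved confounder biases the naive regression, while two-stage least squares using an instrument that shifts FDI but affects health only through it recovers the true effect:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 2000
z = rng.normal(size=n)   # instrument, e.g. capital formation in origin countries
u = rng.normal(size=n)   # unobserved confounder of FDI and health
fdi = 0.7 * z + u + rng.normal(size=n)
life_exp = 0.3 * fdi - u + rng.normal(size=n)   # true effect of FDI is 0.3

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
# Naive regression: biased, because u pushes fdi up and life_exp down
naive = ols(np.column_stack([ones, fdi]), life_exp)[1]
# 2SLS: stage 1 predicts fdi from the instrument, stage 2 uses the prediction
fdi_hat = np.column_stack([ones, z]) @ ols(np.column_stack([ones, z]), fdi)
iv = ols(np.column_stack([ones, fdi_hat]), life_exp)[1]
print(naive, iv)  # naive is pushed well below the true 0.3; iv is close to it
```

This mirrors the paper’s point that the uninstrumented analysis would have found no (or even a misleading) relationship between FDI and life expectancy.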

Uncovering waste in US healthcare: evidence from ambulance referral patterns. Journal of Health Economics [PubMed] Published 22nd March 2017

This study looks to unpick some of the reasons behind the estimated waste in US healthcare spending by focusing on mortality rates across the country following an emergency admission to hospital by ambulance. The authors argue that patients admitted to hospital for emergency care by ambulance provide a good instrument for assessing hospital quality, given that the nature of emergency admissions limits selection bias in which types of patients end up in different hospitals. Using linear regressions, the study primarily measures the relationship between the 90-day spending on patients assigned to certain hospitals and mortality. They also consider one-year mortality and the effect that downstream payments for post-acute care (excluding pharmaceuticals outside the hospital setting) have on this outcome. Through a lengthy data cleaning process, the study looks at over 1.5 million admissions between 2002 and 2011, with patients averaging a high age of 82 and being predominantly female and white. Approximately $27,500 per patient was spent in the first 90 days post-admission, with inpatient spending accounting for the majority of this amount (≈$16,000). The authors argue initially that the higher 90-day spending in some hospitals produces only modestly lower mortality rates. Spending over one year is estimated to cost more than $300,000 per life year, which the authors use to argue that current spending levels do not lead to improved outcomes. But when the authors dig deeper, it seems clear there is an association between hospitals with higher spending on inpatient care and reduced mortality, approximately 10% lower. This leads the authors to turn their attention to post-acute care as their main target for reducing waste, and they find an association between mortality and patients receiving specialised nursing care.
However, this target seems somewhat strange to me, as post-acute care is not controlled for in the same way as in their initial, insightful approach of quasi-randomisation based on ambulance assignment. I imagine those in such care are likely to be a different mix from those receiving other types of care beyond 90 days after the initial event. I feel there really is not enough in their analysis to support recommendations about specialist nursing care being the key waste driver, as it says nothing, beyond mortality, about the quality of care these elderly patients are receiving in specialist nurse facilities. After reading this paper, one way I would suggest to reduce the inefficiency identified in their primary analysis would be to send patients to the most appropriate hospital for their needs in the first place, which seems difficult given the complexity of the mix of private and hospital-provided ambulance services currently offered in the US.

Population health and the economy: mortality and the Great Recession in Europe. Health Economics [PubMed] Published 27th March 2017

Understanding how economic recessions affect population health is of great research interest, given that the recent global financial crisis led to the worst downturn in economic performance in the West since the 1930s. This study uses data from 27 European countries between 2004 and 2010, collected by the WHO and the World Bank, to study the relationship between economic performance and population health by comparing national unemployment and mortality rates before and after 2007. Regression analyses appropriate for time-series data are applied under a number of different specifications. The authors find that the more severe the economic downturn, the greater the increase in life expectancy at birth. Additional cause-specific mortality rates follow a similar trend in their analysis, with the largest improvements observed in countries where the severity of the recession was highest. The only exception the authors note is the data on suicide, where they argue the relationship is less clear but points towards higher rates of suicide with greater unemployment. The message the authors were trying to get across was not very clear throughout most of the paper, and some lay readers of the abstract alone could easily be misled into thinking recessions themselves were responsible for better population health. Mortality rates fell across all six years, but at a faster rate in the recession years. Although the results appeared consistent across all models, question marks remain for me over their initial variable selection. Although the discussion mentions evidence suggesting health care may not have a short-term effect on mortality, the authors did not consider any potential lagged effect that record investment in healthcare as a proportion of GDP up until 2007 may have had on mortality in the initial recession years.
The authors rule out comparisons with earlier studies of countries in the post-Soviet era, but do not consider the effect of recent EU accession for many of the countries and the more regulated national policies that followed. Another issue is the differing scope countries have for improving their mortality rates: countries with lower existing life expectancy have more room to move in the right direction. However, one interesting discussion point raised by the authors in trying to explain their findings is the potential impact of reduced economic activity on pollution levels and the knock-on health effects from this (and, to a lesser extent, on occupational hazards), which may lend some plausibility to the better physical-health-related mortality rates observed during recessions.


Brent Gibbons’s journal round-up for 30th January 2017


For this week’s round-up, I selected three papers from December’s issue of Health Services Research. I didn’t intend to limit my selections to one issue of one journal, but as I narrowed down my choices from several journals, these three papers stood out.

Treatment effect estimation using nonlinear two-stage instrumental variable estimators: another cautionary note. Health Services Research [PubMed] Published December 2016

This paper by Chapman and Brooks evaluates the properties of a non-linear instrumental variable (IV) estimator called two-stage residual inclusion, or 2SRI. 2SRI has recently been suggested as a consistent estimator of treatment effects under conditions of selection bias and where the dependent variable of the second-stage equation is binary or otherwise non-linear in its distribution. Terza, Bradford, and Dismuke (2007) and Terza, Basu, and Rathouz (2008) furthermore claimed that 2SRI can produce unbiased estimates not just of local average treatment effects (LATE) but of average treatment effects (ATE). However, Chapman and Brooks question why 2SRI, which is analogous to two-stage least squares (2SLS) when both the first- and second-stage equations are linear, should not require assumptions similar to those of 2SLS when generalizing beyond LATE to ATE. Backing up a step: when estimating treatment effects using observational data, one worry in trying to establish a causal effect is bias due to treatment choice. Where patient characteristics related to treatment choice are unobservable and one or more instruments are available, linear IV estimation (i.e. 2SLS) produces unbiased and consistent estimates of treatment effects for “marginal patients”, or compliers. These are the patients whose treatment choices were influenced by the instrument, and their treatment effects are termed LATE. But if there is heterogeneity in treatment effects, a case needs to be made that treatment effect heterogeneity is unrelated to treatment choice in order to generalize to ATE. Moving to non-linear IV estimation, Chapman and Brooks are skeptical that this case for generalizing LATE to ATE no longer needs to be made with 2SRI. 2SRI, for those not familiar, uses the residual from the first stage of a two-stage estimator as a variable in the second-stage equation, which uses a non-linear estimator (e.g. probit for a binary outcome, or Poisson for counts).
The authors produce a simulation that tests the properties of 2SRI over varying conditions of uniqueness of the marginal patient population and strength of the instrument. The uniqueness of the marginal population is defined as the extent of the difference in treatment effects for the marginal population compared to the general population. For each scenario tested, the bias of the estimates relative to the true LATE and ATE is calculated. The findings support the authors’ suspicions that 2SRI is subject to biased results when uniqueness is high. In fact, the 2SRI results were practically unbiased only when uniqueness was low, and were biased for both ATE and LATE when uniqueness was high. Having very strong instruments did help reduce bias. In contrast, 2SLS was always practically unbiased for LATE across scenarios, and the authors use these results to caution researchers against using “new” estimation methods without thoroughly understanding their properties. In this case, old 2SLS still outperformed 2SRI even when dependent variables were non-linear in nature.
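The mechanics of 2SRI can be sketched on simulated data as follows. This is a hypothetical setup (a linear-probability first stage for simplicity, hypothetical coefficients, not the authors’ simulation design): stage 1 regresses treatment on the instrument, and stage 2 is a probit for the binary outcome that includes the stage-1 residual:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(1)
n = 5000
z = rng.normal(size=n)   # instrument
u = rng.normal(size=n)   # unobserved confounder of treatment and outcome
treat = (0.8 * z + u + rng.normal(size=n) > 0).astype(float)
y = (0.5 * treat + u + rng.normal(size=n) > 0).astype(float)  # binary outcome

# Stage 1: linear probability model of treatment on the instrument
Z = np.column_stack([np.ones(n), z])
g, *_ = np.linalg.lstsq(Z, treat, rcond=None)
resid = treat - Z @ g          # stage-1 residual carrying the confounding

# Stage 2: probit of y on treatment plus the residual (the "inclusion")
X = np.column_stack([np.ones(n), treat, resid])
def nll(b):
    p = norm.cdf(X @ b).clip(1e-10, 1 - 1e-10)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).sum()

b = minimize(nll, np.zeros(3)).x
print(b[1])  # treatment coefficient, with the residual controlling for confounding
```

The paper’s warning is about exactly this construction: whether the resulting coefficient can be interpreted as an ATE, rather than something closer to a LATE, depends on how unique the marginal patients are.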

Testing the replicability of a successful care management program: results from a randomized trial and likely explanations for why impacts did not replicate. Health Services Research [PubMed] Published December 2016

As is widely known, how to rein in U.S. healthcare costs has been a source of much hand-wringing. One promising strategy has been to promote better management of care, in particular for persons with chronic illnesses. This includes coordinating care between multiple providers, encouraging patient adherence to care recommendations, and promoting preventative care. The hope was that by managing care for patients with more complex needs, higher-cost services such as emergency visits and hospitalizations could be avoided. The Centers for Medicare and Medicaid Services (CMS) funded a demonstration of a number of care management programs to study which models might be successful in improving quality and reducing costs. One program, implemented by Health Quality Partners (HQP) for Medicare fee-for-service patients, was successful in reducing hospitalizations (by 34 percent) and expenditures (by 22 percent) for a select group of patients identified as high-risk. The demonstration ran from 2002 to 2010, and this paper reports results for a second phase in which HQP was given additional funding to continue treating only high-risk patients from 2010 to 2014. High-risk patients were identified as those with a diagnosis of congestive heart failure (CHF), chronic obstructive pulmonary disease (COPD), coronary artery disease (CAD), or diabetes, and a hospitalization in the year prior to enrollment. In essence, phase II of the demonstration served as a replication of the original demonstration for HQP. The HQP care management program was delivered by nurse coordinators who regularly talked with patients and coordinated care between primary care physicians and specialists, as well as providing other services such as medication guidance. All positive results from phase I vanished in phase II, and the authors test several hypotheses for why the results did not replicate.
They find that treatment group patients had similar hospitalization rates in phases I and II, but that control group patients had substantially lower hospitalization rates in phase II. Outcome differences between the phases were risk-adjusted, as phase II had an older population with higher severity of illness, and the authors also used propensity score re-weighting to further control for differences between the phase I and phase II populations. The Affordable Care Act promoted similar care management services through patient-centered medical homes and accountable care organizations, which likely contributed to improvements in the usual care received by control group patients. The authors also note that the effectiveness of care management may be sensitive to the complexity of the target population’s needs: for example, the phase II population was more homebound and was therefore unable to participate in group classes. The big lesson of this paper, though, is that demonstration results may not replicate for different populations or even different time periods.

A machine learning framework for plan payment risk adjustment. Health Services Research [PubMed] Published December 2016

Since my company has been subsumed under IBM Watson Health, I have been trying to wrap my head around this big data revolution and the potential of technological advances such as artificial intelligence and machine learning. While machine learning has infiltrated other disciplines, it is really just starting to influence health economics, so watch out! This paper by Sherri Rose is a nice introduction to a range of machine learning techniques, which she applies to the formulation of plan payment risk adjustment. In insurance systems where patients can choose from a range of insurance plans, there is a problem of adverse selection in that some plans may attract an abundance of high-risk patients. To control for this, plans (e.g. in the Affordable Care Act marketplaces) with high percentages of high-risk consumers are compensated based on a formula that predicts spending from population characteristics, including diagnoses. Rose notes that these formulas are still based on a 1970s framework of linear regression and may benefit from machine learning algorithms. Given that plan payment risk adjustments are essentially predictions, this does seem like a good application. In addition to testing the goodness of fit of machine learning algorithms, Rose is interested in whether such techniques can reduce the number of variable inputs: insurers have found ways to “game” the system, and fewer variable inputs would restrict this activity. Rose introduces a number of concepts in the paper (at least they were new to me) such as ensemble machine learning, discrete learning frameworks and super learning frameworks. She uses a large private insurance claims dataset and breaks it into what she calls 10 “folds”, which allows her to run 5 prediction models, each with its own cross-validation dataset. Aside from one parametric regression model, she uses several penalized regression models, a neural net, single-tree, and random forest models.
She describes machine learning as aiming to smooth over data in a similar manner to parametric regression but with fewer assumptions and allowing for more flexibility. To reduce the number of variables in models, she applies techniques that limit variables to, for example, just the 10 most influential. She concludes that applying machine learning to plan payment risk adjustment models can increase efficiencies and her results suggest that it is possible to get similar results even with a limited number of variables. It is curious that the parametric model performed as well as or better than many of the different machine learning algorithms. I’ll take that to mean we can continue using our trusted regression methods for at least a few more years.
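The cross-validated comparison of a plain regression against a penalized learner can be sketched as follows, using ridge regression as a simple stand-in for the penalized methods in the paper (all data are simulated and the variable names hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 1000, 20
X = rng.normal(size=(n, p))                  # claims-like predictors
beta = np.zeros(p)
beta[:3] = [2.0, -1.0, 0.5]                  # only a few variables truly matter
y = X @ beta + rng.normal(scale=2.0, size=n) # "spending" outcome

def fit_ridge(Xt, yt, lam):
    # Closed-form ridge estimator: (X'X + lam I)^-1 X'y; lam = 0 gives OLS
    return np.linalg.solve(Xt.T @ Xt + lam * np.eye(Xt.shape[1]), Xt.T @ yt)

def cv_mse(lam, k=10):
    # k-fold cross-validated mean squared prediction error
    folds = np.array_split(rng.permutation(n), k)
    errs = []
    for hold in folds:
        train = np.setdiff1d(np.arange(n), hold)
        b = fit_ridge(X[train], y[train], lam)
        errs.append(np.mean((y[hold] - X[hold] @ b) ** 2))
    return float(np.mean(errs))

print(cv_mse(0.0), cv_mse(50.0))  # plain OLS vs penalized fit, by 10-fold CV error
```

This also mirrors the paper’s observation about dimensionality: the penalty shrinks the coefficients on the many irrelevant predictors, which is the same impulse behind restricting risk-adjustment formulas to a handful of influential variables.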