Thesis Thursday: Sarah Zheng

On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Sarah Zheng, who has a PhD from Boston University. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.

Design for performance: studies on cost and quality in U.S. health care
Z. Justin Ren, Kimberley H. Geissler, Janelle Heineke, Anita Tucker
Repository link

In the context of your PhD research, what does ‘design for performance’ mean?

“Design for performance” is a further step in managing healthcare beyond “pay for performance”, which has received decades of attention from practitioners and academics. Despite the long effort on “pay for performance”, the core challenge remains how to properly incentivize patients, clinicians and staff to align their behaviors with optimal, safe, cost-effective, patient-centric care. This dissertation suggests an important set of issues to consider around “design for performance” at the system and process levels.

At the system level, under what conditions does cost-sharing lead to lower total costs without reducing quality of care? Previous literature has studied contract theory and mechanism design in varied industry settings (Guajardo et al. 2012), yet very few studies address the healthcare domain, where insurance plans are offered to patients under different contract arrangements. It remains unclear whether particular contract designs in such settings lead to desired outcomes (e.g., low healthcare spending). At the process level, under what conditions and to what extent do excellent internal supply operations result in superior hospital performance? Industrial studies suggest that reliable, efficient internal supply chains that are integrated with production yield better financial and quality performance for manufacturing companies (Droge et al. 2004, Flynn et al. 2010). However, there is scant quantitative research on the impact of support departments in hospitals (Tucker et al. 2008, Fredendall et al. 2009). Studies are needed to understand the extent to which support departments impact patient care outcomes, such as adverse events.

How was quality captured in the data that you used in your analyses?

In Chapter 3 of my thesis, I studied the impact of internal service quality on one particular quality performance metric: adverse events. Specifically, it is a rate variable, calculated as the sum of adverse events (i.e., patient falls with injury and pressure ulcers) on the unit that month divided by the number of patient days on the unit that month, multiplied by 1,000. The hospital collects these data monthly. The adverse event data come from both patient record reviews and incident reports in the hospital’s safety reporting system, as is typical of this type of data (Lake and Cheung 2006). The error event data are audited internally as well as reported to CMS (Zheng et al. 2017).
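As a rough illustration of the rate described above, here is a minimal Python sketch. The function name, argument names and the example figures are hypothetical and are not taken from the thesis; only the formula (adverse events divided by patient days, times 1,000) comes from the description.

```python
def adverse_event_rate(falls_with_injury, pressure_ulcers, patient_days):
    """Adverse events per 1,000 patient days for one nursing unit in one month.

    Adverse events are counted as patient falls with injury plus pressure
    ulcers, as in the description above. Argument names are illustrative.
    """
    if patient_days <= 0:
        raise ValueError("patient_days must be positive")
    adverse_events = falls_with_injury + pressure_ulcers
    return adverse_events / patient_days * 1000


# Hypothetical unit-month: 1 fall with injury, 2 pressure ulcers, 750 patient days.
rate = adverse_event_rate(falls_with_injury=1, pressure_ulcers=2, patient_days=750)
print(f"{rate:.1f} adverse events per 1,000 patient days")
```

Scaling by patient days makes units of different sizes comparable from month to month, which is presumably why a rate rather than a raw count is used.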

This is a unique opportunity to study quality as most healthcare operations research has relied on publicly available, hospital-level quality data, such as patient mortality (e.g., Senot et al. 2015, KC and Terwiesch 2011)—which is a blunt measure of quality—or process of care measures (e.g., Boyer et al. 2012, Gardner et al. 2015, Senot et al. 2015), which have been criticized in the healthcare literature for their weak connection to clinical outcomes (Patterson et al. 2010).

You complemented your quantitative analysis with some qualitative interviews – was this a valuable exercise?

Yes, definitely. To understand further the role patients (and the patient-physician dyad) play in deciding the usage of imaging studies, I conducted in-depth conversations with both physicians and patients. Specifically, I interviewed three physicians (i.e., hospitalist, primary care provider) and one imaging technician, with an average conversation time of 70 minutes. I also interviewed six patients, with an average conversation time of 20 minutes.

I found that patients did play a role in deciding the usage of imaging studies: high-deductible health plan (HDHP) patients are less likely to demand imaging studies than non-HDHP patients. However, as patients cannot distinguish low-value care from high-value care, HDHP patients avoid patient care in general. This is consistent with previous literature on patient cost-sharing and HDHPs, where patients indiscriminately reduce medical care (Hibbard et al. 2008, Lohr et al. 1986). It further suggests that HDHPs may be a blunt instrument, reducing all diagnostic imaging, rather than helping physicians and patients choose high-value imaging.

Did any of your findings about high-deductible health plans stand out as different from previous studies?

I wouldn’t say different, but more like complementary. Previous studies found HDHPs have different impacts depending on the site and type of care (Haviland et al. 2015, Wharam et al. 2013, Bundorf 2012, Nair et al. 2009, Waters et al. 2011, Hibbard et al. 2008, Busch et al. 2006, Rowe et al. 2008, Parente et al. 2004). By explicitly testing associations between HDHP enrollment and diagnostic imaging, we provide a more complete picture for policymakers in making guidelines related to HDHPs. Our results suggest that increases in HDHP enrollment may contribute to a slowing in the growth of diagnostic imaging utilization. However, increased cost-sharing may not allow patients to differentiate between high-value and low-value utilization, and better patient awareness and education should be a crucial part of any reductions in diagnostic imaging utilization (Zheng et al. 2016).

‘Internal service quality’ is a term that doesn’t often appear in health economics journals – should researchers be dedicating more attention to this?

Yes. In our study we find improved internal service quality to be a particularly novel driver of reduced adverse events because it is not obvious a priori that support departments—most of which are not clinical in nature—could have a significant impact on clinical outcomes. In particular, we find that improving the overall average internal service quality received by a nursing unit by 0.1 on a 5-point scale is associated with a 38% reduction in adverse events per nursing unit, which has roughly the same benefit for reducing adverse events as increasing staffing on that unit by nearly one full-time equivalent nurse. In the hospital that we study, the average salary of a support service technician is lower than the average salary of a nurse. Thus, hospitals might be able to improve quality of care at a lower cost by increasing support staff to relieve the burden on nurses (Zheng et al. 2017). More studies are needed in this area to explore further internal service quality as a viable and cost-effective means to improve clinical performance.

Sam Watson’s journal round-up for 12th March 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

[While the journal round-up is not on strike this week, academics and other university staff across the UK continue to be. Please support these staff members and the university sector that produces much of the great research we feature on this blog.]

How does household income affect child personality traits and behaviors? American Economic Review [RePEc] Published February 2018

The intergenerational transmission of poor social and health outcomes and its remediation has long been of concern to policy makers and economists alike. A popular hypothesis to explain this phenomenon is that of fetal origins: the nine months in the womb are perhaps the most important in determining a person’s health over their lifetime. We have featured numerous papers on this blog looking at the impact of in utero conditions on infant, child, and adult outcomes. This hypothesis though leaves a sense of pessimism since if this generational link is rooted in biology then it is not likely to be modifiable by any intervention. Studies of institutional interventions in schools and the health care system have shown that the health of children from impoverished households can be improved. But what about the effects of simply improving the material conditions of those households? Would this have an effect? This study uses a longitudinal dataset of children in North Carolina, USA, which oversampled children from Native American families who, in the middle of the period of observation, began to receive an unconditional cash transfer from the tribal government funded by casino revenues. A difference-in-difference-in-differences model is used with the relevant differences being: before v. after, younger cohorts v. older cohorts (older children’s households did not receive the cash while they were children), and Native American v. non-Native American. An ‘event study analysis’ is also used, which takes into account time from the intervention. (This is the exact same method as another recently featured paper on this blog – perhaps a sign of the growing popularity of such techniques). Average annual income increased by around $3,500. Quite clear improvements in a range of psychological traits are estimated from the models including increases in conscientiousness and agreeableness, and declines in emotional and behavioural disorders. 
Potential mediating mechanisms for these changes are explored and uncertain evidence is shown indicating improved parental supervision and interaction and a reduction in parental mental health care seeking (they plot 90% confidence intervals which appear ‘statistically significant’ where 95% confidence intervals clearly would not be – however, the lack of significance stars and p-values is refreshing). Such evidence should weigh heavily on policy makers’ minds when implementing reductions to social assistance programs and household income.
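For readers less familiar with triple differences, the three contrasts described above can be sketched in a few lines of Python. This is not the authors' model, which is estimated on individual-level data; it only shows the arithmetic of a difference-in-difference-in-differences on eight group means. The group labels, function name and all numbers below are hypothetical and purely illustrative.

```python
def triple_difference(means):
    """Difference-in-difference-in-differences from eight cell means.

    `means` maps (group, cohort, period) -> mean outcome, where
    group is "native_american" or "other", cohort is "younger" or "older",
    and period is "before" or "after". All labels are illustrative.
    """
    def diff_in_diff(group):
        # Change over time for the younger (exposed) cohort...
        younger = means[(group, "younger", "after")] - means[(group, "younger", "before")]
        # ...minus the change over time for the older (unexposed) cohort.
        older = means[(group, "older", "after")] - means[(group, "older", "before")]
        return younger - older

    # Third difference: households eligible for the transfer versus everyone else.
    return diff_in_diff("native_american") - diff_in_diff("other")


# Made-up mean scores on some trait scale, for illustration only.
example_means = {
    ("native_american", "younger", "before"): 50,
    ("native_american", "younger", "after"): 56,
    ("native_american", "older", "before"): 50,
    ("native_american", "older", "after"): 51,
    ("other", "younger", "before"): 50,
    ("other", "younger", "after"): 52,
    ("other", "older", "before"): 50,
    ("other", "older", "after"): 51,
}
print(triple_difference(example_means))
```

The third difference is what strips out anything that changed over time for younger cohorts in general, leaving only the part specific to households that received the transfer.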

Adaptation or recovery after health shocks? Evidence using subjective and objective health measures. Health Economics. [PubMed] Published March 2018.

Hedonic adaptation is a well evidenced phenomenon in health economics and related fields. Individuals can get used to health conditions and adverse circumstances, such as amputation or blindness, and recover much of their pre-illness quality of life. This makes it hard for healthy people to judge the quality of life of these conditions and is one of the reasons for the divergence in preferences over health states depending on who you ask. This paper takes an interesting approach to looking at adaptation by asking whether the improvement in someone’s subjective assessment of their own life expectancy after a serious illness is reflective of actual recovery or is in fact due to the optimism brought on by adaptation. Typically, beliefs about life expectancy are found to accord well with actuarial assessments of life expectancy, but little is known about how this relates to serious illness. This study suggests that subjective assessments of mortality risk do drop with cancer, stroke, and myocardial infarction in line with changes to objective risk of death. However, these subjective assessments generally return to their pre-illness levels, which doesn’t reflect the continued increase in risk actually faced by these people. An explanation for this is hedonic adaptation – people perhaps end up feeling as well as they did before even if they are not. It’s hard to say though if there’s a survivorship bias in favour of the optimists.

The local influence of pioneer investigators on technology adoption: Evidence from new cancer drugs. Review of Economics and Statistics. [RePEc] Published March 2018.

Technology diffusion typically shows a strong spatial pattern. If you know someone who has adopted a new technology, you are more likely to do the same yourself. But what about in medicine – do doctors also adopt similar patterns of prescribing new drugs? In the UK, we might think such patterns are unlikely as doctors are not free to prescribe what they like since they are restricted generally to what the NHS will reimburse. New technologies have to be first approved on the basis of being demonstrably cost-effective. But in the United States doctors are freer to prescribe what they like. While this has benefits, it also leads to adoption of cost-ineffective interventions or persistence in prescribing sub-optimal treatments. If the diffusion of new treatments is based upon social and professional spatial networks then one might expect the epicentre to be where the drug was trialled: the PI may well also be the loudest cheerleader for the new drug should it be shown to be effective. Indeed, if a ‘superstar’ researcher is involved with the development of a drug this may attract more attention to it still. The key finding from this study in the US is that patients treated in the hospital market of the first author of the paper reporting the results of the main clinical trial of a drug were 36% more likely to receive the drug than patients elsewhere in the first two years. This is generally beneficial to patients in those areas, particularly since the average survival benefit to those patients is larger than is attributable to the drug itself, which may suggest that doctors with local information are better at selecting which patients will benefit from a treatment. However, with some of the problems arising from reporting bias, p-values, and the like, patients may also be getting a worse deal should the drug not be as good as claimed.


Chris Sampson’s journal round-up for 5th March 2018

Healthy working days: the (positive) effect of work effort on occupational health from a human capital approach. Social Science & Medicine Published 28th February 2018

If you look at the literature on the determinants of subjective well-being (or happiness), you’ll see that unemployment is often cited as having a big negative impact. The same sometimes applies for its impact on health, but here – of course – the causality is difficult to tease apart. Then, in research that digs deeper, looking at hours worked and different types of jobs, we see less conclusive results. In this paper, the authors start by asserting that the standard approach in labour economics (on which I’m not qualified to comment) is to assume that there is a negative association between work effort and health. This study extends the framework by allowing for positive effects of work that are related to individuals’ characteristics and working conditions, and where health is determined in a Grossman-style model of health capital that accounts for work effort in the rate of health depreciation. This model is used to examine health as a function of work effort (as indicated by hours worked) in a single wave of the European Working Conditions Survey (EWCS) from 2010 for 15 EU member states. Key items from the EWCS included in this study are questions such as “does your work affect your health or not?”, “how is your health in general?”, and “how many hours do you usually work per week?”. Working conditions are taken into account by looking at data on shift working and the need to wear protective equipment. One of the main findings of the study is that – with good working conditions – greater work effort can improve health. The Marxist in me is not very satisfied with this. We need to ask the question, compared to what? Working fewer hours? For most people, that simply isn’t an option. Aren’t the people who work fewer hours the people who can afford to work fewer hours? No attention is given to the sociological aspects of employment, which are clearly important. The study also shows that overworking or having poorer working conditions reduces health. 
We also see that, for many groups, longer hours do not negatively impact on health until we reach around 120 hours a week. This fails a good sense check. Who are these people?! I’d be very interested to see if these findings hold for academics. That the key variables are self-reported undermines the conclusions somewhat, as we can expect people to adjust their expectations about work effort and health in accordance with their colleagues. It would be very difficult to avoid a type 2 error (with respect to the negative impact of effort on health) using these variables to represent health and the role of work effort.

Agreement between retrospectively and contemporaneously collected patient-reported outcome measures (PROMs) in hip and knee replacement patients. Quality of Life Research [PubMed] Published 26th February 2018

The use of patient-reported outcome measures (PROMs) in elective care in the NHS has been a boon for researchers in our field, providing before-and-after measurement of health-related quality of life so that we can look at the impact of these interventions. But we can’t do this in emergency care because the ‘before’ is never observed – people only show up when they’re in the middle of the emergency. But what if people could accurately recall their pre-emergency health state? There’s some evidence to suggest that people can, so long as the recall period is short. This study looks at NHS PROMs data (n=443), with generic and condition-specific outcomes collected from patients having hip or knee replacements. Patients included in the study were additionally asked to recall their health state 4 weeks prior to surgery. The authors assess the extent to which the contemporary PROM measurements agree with the retrospective measurements, and the extent to which any disagreement relates to age, socioeconomic status, or the length of time to recall. There wasn’t much difference between contemporary and retrospective measurements, though patients reported slightly lower health on the retrospective questionnaires. And there weren’t any compelling differences associated with age or socioeconomic status or the length of recall. These findings are promising, suggesting that we might be able to rely on retrospective PROMs. But the elective surgery context is very different to the emergency context, and I don’t think we can expect the two types of health care to impact recollection in the same way. In this study, responses may also have been influenced by participants’ memories of completing the contemporary questionnaire, and the recall period was very short. But the only way to find out more about the validity of retrospective PROM collection is to do more of it, so hopefully we’ll see more studies asking this question.

Adaptation or recovery after health shocks? Evidence using subjective and objective health measures. Health Economics [PubMed] Published 26th February 2018

People’s expectations about their health can influence their behaviour and determine their future health, so it’s important that we understand people’s expectations and any ways in which they diverge from reality. This paper considers the effect of a health shock on people’s expectations about how long they will live. The authors focus on survival probability, measured objectively (i.e. what actually happens to these patients) and subjectively (i.e. what the patients expect), and the extent to which the latter corresponds to the former. The arguments presented are couched within the concept of hedonic adaptation. So the question is – if post-shock expectations return to pre-shock expectations after a period of time – whether this is because people are recovering from the disease or because they are moving their reference point. Data are drawn from the Health and Retirement Study. Subjective survival probability is scaled to whether individuals expect to survive for 2 years. Cancer, stroke, and myocardial infarction are the health shocks used. The analysis uses some lagged regression models, separate for each of the three diagnoses, with objective and subjective survival probability as the dependent variable. There’s a bit of a jumble of things going on in this paper, with discussions of adaptation, survival, self-assessed health, optimism, and health behaviours. So it’s a bit difficult to see the wood for the trees. But the authors find the effect they’re looking for. Objective survival probability is negatively affected by a health shock, as is subjective survival probability. But then subjective survival starts to return to pre-shock trends whereas objective survival does not. The authors use this finding to suggest that there is adaptation. I’m not sure about this interpretation. To me it seems as if subjective life expectancy is only weakly responsive to changes in objective life expectancy. 
The findings seem to have more to do with how people process information about their probability of survival than with how they adapt to a situation. So while this is an interesting study about how people process changes in survival probability, I’m not sure what it has to do with adaptation.

3L, 5L, what the L? A NICE conundrum. PharmacoEconomics [PubMed] Published 26th February 2018

In my last round-up, I said I was going to write a follow-up blog post to an editorial on the EQ-5D-5L. I didn’t get round to it, but that’s probably best as there has since been a flurry of other editorials and commentaries on the subject. Here’s one of them. This commentary considers the perspective of NICE in deciding whether to support the use of the EQ-5D-5L and its English value set. The authors point out the differences between the 3L and 5L, namely the descriptive systems and the value sets. Examples of the 5L descriptive system’s advantages are provided: a reduced ceiling effect, reduced clustering, better discriminative ability, and the benefits of doing away with the ‘confined to bed’ level of the mobility domain. Great! On to the value set. There are lots of differences here, with 3 main causes: the data, the preference elicitation methods, and the modelling methods. We can’t immediately determine whether these differences are improvements or not. The authors stress the point that any differences observed will be in large part due to quirks in the original 3L value set rather than in the 5L value set. Nevertheless, the commentary is broadly supportive of a cautionary approach to 5L adoption. I’m not. Time for that follow-up blog post.