Thesis Thursday: Caroline Vass

On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Caroline Vass who has a PhD from the University of Manchester. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.

Title
Using discrete choice experiments to value benefits and risks in primary care
Supervisors
Katherine Payne, Stephen Campbell, Daniel Rigby
Repository link
https://www.escholar.manchester.ac.uk/uk-ac-man-scw:295629

Are there particular challenges associated with asking people to trade off risks in a discrete choice experiment?

The challenge of communicating risk in general, not just in DCEs, was one of the things which drew me to the PhD. I’d heard a TED talk discussing a study which tested people’s understanding of weather forecasts. Although most people think they understand a simple statement like “there’s a 30% chance of rain tomorrow”, few correctly interpret it as meaning that it will rain on 30% of days like tomorrow. Most take it to mean it will rain 30% of the time, or over 30% of the area.

My first ever publication was a review of the risk communication literature, which confirmed our suspicions: even highly educated samples don’t always interpret risk information as we expect. Testing whether the communication of risk mattered when making trade-offs in a DCE therefore seemed a pretty important topic, and it formed the overarching research question of my PhD.

Most of your study used data relating to breast cancer screening. What made this a good context in which to explore your research questions?

In the UK, all women are invited to participate in breast screening (either via a GP referral or on reaching 47-50 years of age). This makes every woman a potential consumer and a potential ‘patient’. I conducted a lot of qualitative research to ensure the survey text was easily interpretable, and having a disease which many people had heard of made this easier and allowed us to focus on the risk communication formats. My supervisor Prof. Katherine Payne had also been working on a large evaluation of stratified screening, which made contacting experts, patients and charities easier.

There are also national screening participation figures, so we were able to test whether the DCE had any real-world predictive value. Luckily, our estimates weren’t too far off the published uptake rates for the UK!

How did you come to use eye-tracking as a research method, and were there any difficulties in employing a method not widely used in our field?

I have to credit my supervisor Prof. Dan Rigby with planting the seed and introducing me to the method. I did a bit of reading into what psychologists thought you could measure using eye movements and decided it was worth further investigation. I literally found people publishing with the technology at our institution and knocked on doors until someone would let me use it! If the University of Manchester hadn’t already had the equipment, it would have been much more challenging to collect these data.

I then discovered the joys of lab-based work, which I think many health economists, fortunately, don’t encounter in their PhDs: the shared bench, people messing with your experiment set-up, restricted lab time that needs to be booked weeks in advance, and so on. I’m sure it will all be worth it… when the paper is finally published.

What are the key messages from your research in terms of how we ought to be designing DCEs in this context?

I had a bit of a null result on the risk communication formats: I found they didn’t affect preferences. Looking back, I think that might have been due to the types of numbers I was presenting (5%, 10% and 20% are easier to understand), and maybe people already have a lot of knowledge about the risks of breast screening. It certainly warrants further research to see if my finding holds in other settings. There is a lot of support for visual risk communication formats like icon arrays in other literatures, and their addition didn’t seem to do any harm.

Some of the most interesting results came from the think-aloud interviews I conducted with female members of the public. Although I originally wanted to focus on their interpretation of the risk attributes, people started verbalising all sorts of interesting behaviour and strategies. Some of it aligned with economic concepts I hadn’t thought of, such as feelings of regret associated with opting out and discounting both the costs and health benefits of later screens in the programme. But there were also some glaring violations, like ignoring certain attributes, associating cost with quality, using other people’s budget constraints to make choices, and trying to game the survey with protest responses. So perhaps people designing DCEs for benefit-risk trade-offs specifically, or in healthcare more generally, should be aware that respondents can and do adopt simplifying heuristics. Is this evidence of the benefits of qualitative research in this context? I make that argument here.

Your thesis describes a wealth of research methods and findings, but is there anything that you wish you could have done that you weren’t able to do?

Achieved a larger sample size for my eye-tracking study!

Thesis Thursday: Thomas Allen

On the third Thursday of every month we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Thomas Allen who graduated with a PhD from the University of Manchester. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.

Title
The impact of provider incentives on professionals and patients
Supervisors
Matt Sutton, William Whittaker
Repository link
https://www.escholar.manchester.ac.uk/item/?pid=uk-ac-man-scw:296844

Let’s dive straight in: what was the most important or overarching finding of your research?

My thesis focused on a large financial incentive scheme for UK GPs, so it is a collection of UK studies, but I think the main findings can be generalised reasonably well.

Two of these studies actually looked at how the non-financial incentives of the scheme affected GPs, namely reputation and peer effects. I found reputation became more important, compared to revenue, a few years into the scheme. My explanation for this: reputation matters once you can observe performance benchmarks.

As for peer effects, the focus was on how practices react to their peer groups getting larger, which was caused by mergers of PCTs (groups of practices). You might expect peer effects to shrink when the group gets larger, and this is what I found. Practice performance is also pulled down by poor peers more than it is pulled up by good peers; imagine merging a good classroom with a bad one.

There is quite a lot of variation (at GP level) in the amount of income that was linked to performance (10-30% in most cases), so the third study exploits this variation. The size of this exposure to performance pay does affect GPs’ working lives – their job satisfaction, working hours, intentions to quit, etc.

The final study was pretty novel, as it linked patient-reported quality with practice-reported quality. It seemed to be the case that as practices improved on the incentivised areas of quality (e.g. blood pressure tests), they got worse on the non-incentivised areas (e.g. communication).

What were the main methodologies that you used and which researchers’ work did your study most depend on?

It was a quantitative thesis, so various regression methods were used. I’ll admit there was nothing particularly special or new about the methods; they were standard, but I think they were applied in interesting ways. For example, two studies linked existing datasets in new ways so I could answer questions which would otherwise probably have been impossible. One method used which is not so common was the continuous difference-in-differences from the job satisfaction chapter. It’s been used before by David Card and Carol Propper. It can be used when you have a continuous treatment variable, instead of the typical treatment vs control situation: everyone is treated, but there is some exogenous factor deciding the amount of treatment.
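To make that concrete, here is a minimal sketch of a continuous difference-in-differences regression in Python, assuming a hypothetical GP-level panel; the file and variable names (gp_panel.csv, job_satisfaction, exposure) are illustrative and not taken from the thesis:

```python
# A minimal sketch of continuous difference-in-differences, assuming a
# hypothetical GP-level panel (one row per GP per year).
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("gp_panel.csv")  # illustrative file name

# post     = 1 for observations after the scheme was introduced
# exposure = continuous treatment intensity, e.g. the share of income
#            linked to performance (0.10-0.30); everyone is treated,
#            but by different amounts
model = smf.ols("job_satisfaction ~ post + exposure + post:exposure", data=df)
result = model.fit(cov_type="cluster", cov_kwds={"groups": df["gp_id"]})

# The coefficient on post:exposure is the DiD estimate: the change in
# the outcome after the scheme, per unit of treatment intensity.
print(result.summary())
```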

I’m not sure there is one researcher that my study most depended on. The four different empirical chapters were influenced by slightly different literatures. Two big influences were systematic reviews of financial incentives (Scott et al. 2011) and of the scheme which I studied (Steel & Willems 2010). Both helped to identify areas where I could add to the existing literature.

What was the most surprising thing that you discovered; was there anything odd or unexpected?

Lots of theories would suggest an effect of pay for performance on job satisfaction and working lives. For example, large financial incentives should crowd out internal motivation and so reduce job satisfaction. Pay for performance appeals more to risk-seeking individuals; those who are risk-averse should feel uncomfortable as more income is linked to performance. Pay for performance can often result in wage dispersion, where incomes differ because some individuals perform better; this is usually linked to lower job satisfaction. A section of Chapter 6 is dedicated to these theories, but I found no effect of pay for performance on GPs’ job satisfaction or working lives. Even specific areas you would expect to be affected weren’t, like satisfaction with choice of working methods or levels of autonomy.

This was certainly an unexpected result but I think still very interesting. I was able to publish this quite recently in Social Science & Medicine.

What was the biggest challenge that you encountered during your PhD, and did it change the direction of your research?

I started to answer this by saying I didn’t have any big challenges, but then a few came to me. I guess looking back they don’t seem as significant as they were at the time.

In the first few weeks I realised one of the studies from the PhD proposal couldn’t be done – basically I wanted to use PROMs to analyse a policy but had glossed over the difference between hip fractures and hip replacements, which seems very obvious now. I had to think of Plan B.

Plan B turned into Plan C around the end of my second year. I was going to try linking three datasets to measure the impact of pay for performance using administrative data, patient data and GP data. Imagine a Venn diagram of the overlapping samples from these three datasets. In the end the sample covered by all three was too small.

I’m pleased with how the thesis turned out; these challenges ended up improving the finished product.

Have you any words of wisdom for any researchers who might be embarking on a similar programme of research?

On this research area… The incentive scheme I focused on, the QOF, has been around for 12 years. If you have a new research question, maybe someone else has already tried it and it doesn’t work. Review the literature well and talk to those who have done work on the scheme. My internal examiner was a GP. She gave some great insight which would have been helpful at the start of the PhD, not the end! So if you can, talk with those affected by the incentive or policy you are evaluating – it might not work in the way described in policy documents.

On PhDs generally… Choose your supervisors wisely – they are more than just a boss/manager, so try and find someone you think you can work with, not for. If you can, have a professor and a less senior person; Matt and Will were a great combo. In the end you might find you are sick of the PhD topic, so make sure you at least start off liking it. Don’t just pick it because it is the only one going. Try and do some extra work (teaching, collaborating with others, blogging), but make sure you gain from it in some way. Plan your time well at the start. You won’t stick to it, but at least you’ll know how far behind you are.

Chris Sampson’s journal round-up for 26th September 2016

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Health economics as rhetoric: the limited impact of health economics on funding decisions in four European countries. Value in Health Published 19th September 2016

We start on a sombre note, with a paper that begs the question: why do we bother? A key purpose of health economic evaluation is to prevent the use of low-value, high-cost technologies. Influence on funding decisions is arguably a good basis on which to judge the impact of health economics. This study looks at funding decisions in England, Germany, the Netherlands and Sweden. The paper identifies key features of the HTA institutions and processes in each country. In all countries, there is very little evidence of economic evaluation having been the basis for the restriction of high-cost drugs. England found ways to support the funding of drugs for multiple sclerosis and cancer, despite high cost and apparently low value. One positive impact might be in facilitating the negotiation of reduced prices – for example, through NICE’s patient access schemes. While the different countries have quite different processes, they have produced similar decisions in practice. The authors suggest that, despite having had limited impact on the outcome of funding decisions, health economics has influenced the process of decision making and the language of health care prioritisation. In this sense, health economics has value as rhetoric, increasing transparency and rational decision making. It’s an interesting idea that I’d like to see developed further, as the authors only provide a limited discussion of it. Personally, I think some distinction needs to be drawn between ‘health economics’ – as identified in the title – and ‘agency-mandated health technology assessment’. While many readers of this blog might do the former on a daily basis, I’d bet not many of us deal in the latter. I certainly don’t. So there’s a lot of ‘health economics’ that can’t – at least not directly – be judged on the basis of funding decisions. Yes, high-cost drugs backed by money-hungry Pharma evade HTA defences. But what about the other end of the spectrum? What about high-value interventions that have been commissioned because the economic evidence was so compelling? Wishful thinking? Maybe not. Either way, we shouldn’t understate the value of health economics as rhetoric when dealing with capricious and myopic governments.

Recommendations for conduct, methodological practices, and reporting of cost-effectiveness analyses: Second Panel on Cost-Effectiveness in Health and Medicine. JAMA [PubMed] Published 13th September 2016

What do you mean you haven’t yet pre-ordered the new edition of the ‘Gold’ book from the famous Panel on Cost-Effectiveness in Health and Medicine? The original Panel was a big deal (not that I remember it, of course, as I was 8 years old), and so, presumably, is the Second Panel. Maybe less so, as relative consensus has developed in the use of health technology assessment in practice around the world. But we still need guidance. It’s ironic that the Panel was convened and funded by US organisations, in a country that lags far behind in its use of economic evaluation in health technology assessment. This article in JAMA outlines the Panel’s recommendations. I can’t summarise them all here, so you probably need to go and read it all yourself. But know that there isn’t anything radical or unexpected. The Panel updated the original recommendations and created new ones where necessary. Threatening the validity of many a joke at economists’ expense, the Panel was able to reach consensus on all recommendations. Readers are chastised for not appropriately adopting a societal perspective as recommended by the first Panel, but then we are offered a compromise: “All studies should report a reference case analysis based on a health care sector perspective and another reference case analysis based on a societal perspective”. The Panel also recommends the use of an “impact inventory”. This is a nice suggestion and I like the terminology. Including a disaggregated list of costs (and outcomes) improves transparency and makes studies more useful to future researchers. One new recommendation is that we should include unrelated future costs, which is something we saw discussed in a recent journal round-up. Another departure from the first Panel is that we are told to include productivity costs on the cost side of our equation. A suggestion that’s dropped in is that protocols should be written in advance of a study. I wish the Panel had been more forceful with this one, as published protocols could go a long way towards improving consistency, transparency and quality.

The Load Model: an alternative to QALY. Journal of Medical Economics [PubMed] Published 7th September 2016

OK, I admit it: I went into this paper with a lot of scepticism. The QALY – that is, the combination of the quality and quantity of life – fundamentally makes sense. I’m not sure we need ‘an alternative’. The paper introduces some interesting ideas, but they aren’t as revolutionary as the author suggests, and I’m not sure that it gets us anywhere. There are some problems from the outset. The article jumbles up positive and normative matters, criticising the QALY on the basis of its capacity to indicate what we might consider to be inequitable results. The author hints that the need for a new model derives from the QALY’s inappropriate combination of quality and duration of life. The most obvious criticism here would be that the constant proportional trade-off assumption does not hold, but there’s no discussion of CPTO. The Load Model is presented as “radically different”, but it isn’t. Equations are shuffled so that we’re dealing in rates rather than time, but this adjustment appears to be inconsequential. It might be a more useful way to think about morbidity and mortality, but no argument to that end is presented. The main difference in the Load Model is that a ‘load’ is added for the negative impact of death (as opposed to being dead). Now, I have big problems with the way we handle ‘dead’ in health state valuation. I think it’s a more serious issue than we know (and we know quite a bit), so I am always glad to see attempts to fix it. Once you get past the superficial adjustments to the QALY, what’s really going on is that the Load Model is adding a third dimension to the valuation process: in addition to length of life and quality of life (in the Load Model it’s disease burden), we also have quality (or rather the burden) of death. But this could be incorporated into a QALY framework; I’ve spoken before about the notion of a 3- or otherwise multi-dimensional QALY. Given that death is so key to the distinction between the Load Model and the QALY, it’s unfortunate that in the worked example an entirely arbitrary value of questionable meaning is attributed to it. So the subsequent comparison between the two approaches seems meaningless. There may be more merit in the Load Model than I can see – perhaps I lack the imagination. But it seems to solve none of the problems associated with the QALY framework, while introducing new ones.
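For reference, here is a minimal sketch of the standard QALY calculation that the Load Model sets itself against; the Load Model’s own equations are not reproduced here:

```latex
% The standard QALY: a quality weight multiplied by duration, summed
% over health states. q_i is the utility of state i (1 = full health,
% 0 = dead) and t_i is the time spent in it. The Load Model reworks
% this in terms of rates and adds a separate 'load' for the event of
% death itself.
\[
  \text{QALYs} = \sum_{i} q_i \, t_i
\]
```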

Associations between extending access to primary care and emergency department visits: a difference-in-differences analysis. PLOS Medicine [PubMed] Published 6th September 2016

We’ve had quite a bit of discussion of 7-day services here on the blog. But the papers continue to flood in, much to the chagrin of Jeremy Hunt. This study doesn’t look at the most controversial case of extending hospital services, but investigates whether extended (evening and weekend) opening of GP practices reduces hospital attendance. The context is that providers in Manchester (England) were invited to bid for funding to roll out extended hours from December 2013. In total we’re looking at 56 practices that succeeded in the bid and 469 practices that provided services as normal. The analysis uses routinely collected hospital administrative data for almost 3 million patients from 2011 to 2014. A difference-in-differences OLS regression was used, with propensity score matching to try and deal with the obvious selection problem. Of course, there was an increase in the number of GP visits: 33,519 in total. The main finding is that patients registered at practices with extended hours exhibited a 26.4% relative reduction in attendances for minor problems at A&E. So in this sense, extending opening hours seems to have served its purpose, though each emergency attendance ‘avoided’ corresponded to around 3 additional GP appointments. Unfortunately, the study wasn’t able to determine the set-up and running costs of the extended GP services, so couldn’t carry out a proper cost-effectiveness analysis. And as we’ve discussed before in this context, that’s the question that really matters.
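As a rough back-of-envelope using only the figures quoted above (the paper’s own counts of avoided attendances may differ):

```python
# Back-of-envelope from the figures in the round-up; illustrative only.
extra_gp_visits = 33_519               # additional GP appointments reported
gp_visits_per_avoided_attendance = 3   # "around 3" per avoided A&E attendance

avoided = extra_gp_visits / gp_visits_per_avoided_attendance
print(f"Implied avoided A&E attendances: ~{avoided:,.0f}")  # roughly 11,000
```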
