Thesis Thursday: Francesco Longo

On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Francesco Longo, who has a PhD from the University of York. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.

Essays on hospital performance in England
Luigi Siciliani
Repository link

What do you mean by ‘hospital performance’, and how is it measured?

The concept of performance in the healthcare sector covers a number of dimensions including responsiveness, affordability, accessibility, quality, and efficiency. A PhD does not normally provide enough time to investigate all these aspects and, hence, my thesis mostly focuses on quality and efficiency in the hospital sector. The concept of quality or efficiency of a hospital is also surprisingly broad and, as a consequence, perfect quality and efficiency measures do not exist. For example, mortality and readmissions are good clinical quality measures but the majority of hospital patients do not die and are not readmitted. How well does the hospital treat these patients? Similarly for efficiency: knowing that a hospital is more efficient because it now has lower costs is essential, but how is that hospital actually reducing costs? My thesis also tries to answer these questions by analysing various quality and efficiency indicators. For example, Chapter 3 uses quality measures such as overall and condition-specific mortality, overall readmissions, and patient-reported outcomes for hip replacement. It also uses efficiency indicators such as bed occupancy, cancelled elective operations, and cost indexes. Chapter 4 analyses additional efficiency indicators, such as admissions per bed, the proportion of day cases, and the proportion of untouched meals.

You dedicated a lot of effort to comparing specialist and general hospitals. Why is this important?

The first part of my thesis focuses on specialisation, i.e. an organisational form which is supposed to generate greater efficiency, quality, and responsiveness but not necessarily lower costs. Some evidence from the US suggests that orthopaedic and surgical hospitals had 20 percent higher inpatient costs because of, for example, higher staffing levels and better quality of care. In the English NHS, specialist hospitals play an important role because they deliver high proportions of specialised services, commonly low-volume but high-cost treatments for patients with complex and rare conditions. Specialist hospitals, therefore, allow the achievement of a critical mass of clinical expertise to ensure patients receive specialised treatments that produce better health outcomes. More precisely, my thesis focuses on specialist orthopaedic hospitals which, for instance, provide 90% of bone and soft tissue sarcoma surgeries, and 50% of scoliosis treatments. It is therefore important to investigate the financial viability of specialist orthopaedic hospitals relative to general hospitals that undertake similar activities, under the current payment system. The thesis implements weighted least squares regressions to compare profit margins between specialist and general hospitals. Specialist orthopaedic hospitals are found to have lower profit margins, which are explained by patient characteristics such as age and severity. This means that, under the current payment system, providers that generally attract more complex patients, such as specialist orthopaedic hospitals, may be financially disadvantaged.

In what way is your analysis of competition in the NHS distinct from that of previous studies?

The second part of my thesis investigates the effect of competition on quality and efficiency from two different perspectives. First, it explores whether under competitive pressures neighbouring hospitals strategically interact in quality and efficiency, i.e. whether a hospital’s quality and efficiency respond to neighbouring hospitals’ quality and efficiency. Previous studies on English hospitals analyse strategic interactions only in quality and they employ cross-sectional spatial econometric models. Instead, my thesis uses panel spatial econometric models and a cross-sectional IV model in order to make causal statements about the existence of strategic interactions among rival hospitals. Second, the thesis examines the direct effect of hospital competition on efficiency. The previous empirical literature has studied this topic by focusing on two measures of efficiency, unit costs and length of stay, measured at the aggregate level or for a specific procedure (hip and knee replacement). My thesis provides a richer analysis by examining a wider range of efficiency dimensions. It combines a difference-in-differences strategy, commonly used in the literature, with Seemingly Unrelated Regression models to estimate the effect of competition on efficiency and enhance the precision of the estimates. Moreover, the thesis tests whether the effect of competition varies for more or less efficient hospitals using an unconditional quantile regression approach.

Where should researchers turn next to help policymakers understand hospital performance?

Hospitals are complex organisations and the idea of performance within this context is multifaceted. Even when we focus on a single performance dimension such as quality or efficiency, it is difficult to identify a measure that could work as a comprehensive proxy. It is therefore important to decompose the analysis as much as possible by exploring indicators capturing complementary aspects of the performance dimension of interest. This practice is likely to generate findings that are readily interpretable by policymakers. For instance, some results from my thesis suggest that hospital competition improves efficiency by increasing admissions per bed. Such an effect is driven by a reduction in the number of beds rather than an increase in the number of admissions. In addition, competition improves efficiency by pushing hospitals to increase the proportion of day cases. These findings may help to explain why other studies in the literature find that competition decreases length of stay: hospitals may replace elective patients, who occupy hospital beds for one or more nights, with day case patients, who are instead likely to be discharged on the day of admission.

Method of the month: Coding qualitative data

Once a month we discuss a particular research method that may be of interest to people working in health economics. We’ll consider widely used key methodologies, as well as more novel approaches. Our reviews are not designed to be comprehensive but provide an introduction to the method, its underlying principles, some applied examples, and where to find out more. If you’d like to write a post for this series, get in touch. This month’s method is coding qualitative data.


Health economists are increasingly stepping away from quantitative datasets and conducting interviews and focus groups, as well as collecting free text responses. Good qualitative analysis requires thought and rigour. In this blog post, I focus on coding of textual data – a fundamental part of analysis in nearly all qualitative studies. Many textbooks deal with this in detail. I have drawn on three in particular in this blog post (and my research): Coast (2017), Miles and Huberman (1994), and Ritchie and Lewis (2003).

Coding involves tagging segments of the text with salient words or short phrases. This assists the researcher with retrieving the data for further analysis and is, in itself, the first stage of analysing the data. Ultimately, the codes will feed into the final themes or model resulting from the research. So the codes – and the way they are applied – are important!


There is no ‘right way’ to code. However, I have increasingly found it useful to think of two phases of coding. First, ‘open coding’, which refers to the initial exploratory process of identifying pertinent phrases and concepts in the data. Second, formal or ‘axial’ coding, involving the application of a clear, pre-specified coding framework consistently across the source material.

Open coding

Any qualitative analysis should start with the researcher being very familiar with both the source material (such as interview transcripts) and the study objectives. This sounds obvious, but it is easy, as a researcher, to get drawn into the narrative of an interview and forget what exactly you are trying to get out of the research and, by extension, the coding. Open coding requires the researcher to go through the text, carefully, line-by-line, tagging segments with codes to denote their meaning. It is important to be inquisitive. What is being said? Does this relate to the research question and, if so, how?

Take, for example, the excerpt below from a speech by the Secretary of State for Health, Jeremy Hunt, on safety and efficiency in the NHS in 2015:

Let’s look at those challenges. And I think we have good news and bad news. If I start with the bad news it is that we face a triple whammy of huge financial pressures because of the deficit that we know we have to tackle as a country, of the ageing population that will mean we have a million more over 70s by 2020, and also of rising consumer expectations, the incredible excitement that people feel when they read about immunotherapy in the newspapers that gives a heart attack to me and Simon Stevens but is very very exciting for the country. The desire for 24/7 access to healthcare. These are expectations that we have to recognise in the NHS but all of these add to a massive pressure on the system.

This excerpt may be analysed, for example, as part of a study into demand pressures on the NHS. In this case, codes such as “ageing population”, “consumer expectations”, “immunotherapy” and “24/7 access to healthcare” might initially be identified. However, if the study were investigating the nature of ministerial responsibility for the NHS, one might pull out very different codes, such as “tackle as a country”, “public demands vs. government stewardship” and “minister – chief exec shared responsibility”.

Codes can be anything – attitudes, behaviours, viewpoints – so long as they relate to the research question. It is very useful to get (at least) one other person to also code some of the same source material. Comparing codes will provide new ideas for the coding framework, a different perspective on the meaning of the source material and a check that key sections of the source material have not been missed. Researchers shouldn’t aim to code all (or even most) of the text of a transcript – there is always some redundancy. And, in general, initial codes should be as close to the source text as possible – some interpretation is fine but it is important to not get too abstract too quickly!

Formal or ‘axial’ coding

When the researcher has an initial list of codes, it is a good time to develop a formal coding framework. The aim here is to devise an index of some sort to tag all the data in a logical, systematic and comprehensive way, and in a way that will be useful for further analysis.

One way to start is to chart how the initial codes can be grouped and relate to one another. For example, in analysing NHS demand pressures, a researcher may group “immunotherapy” with other medical innovations mentioned elsewhere in the study. It’s important to avoid having many disconnected codes, and at this stage, many codes will be changed, subdivided, or combined. Much like an index, the resulting codes could be organised into loose chapters (or themes) such as “1. Consumer expectations”, “2. Access” and/or there might be a hierarchical relationship between codes, for example, with codes relating to national and local demand pressures. A proper axial coding framework has categories and sub-categories of codes with interdependencies formally specified.

There is no right number of codes. There could be as few as 10, or as many as 50, or more. It is crucial, however, that the list of codes is logically organised (not alphabetically listed) and sufficiently concise, so that the researcher can hold the codes in their head while coding transcripts. Alongside the coding framework itself – which may only be a page – it can be very helpful to put together an explanatory document with more detail on the meaning of each code and possibly some examples.


Once the formal coding framework is finalised it can be applied to the source material. I find this a good stage to use software like NVivo. While coding in NVivo takes a similar amount of time to paper-based methods, it can help speed up the process of retrieving and comparing segments of the text later on. Other software packages are available and some researchers prefer to use computer packages earlier in the process or not at all – it is a personal choice.
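For readers who think in code, the idea of a framework that is applied to text and then used for retrieval can be sketched in a few lines of Python. This is only an illustration, not a description of how NVivo or any other package works. The theme and code labels are borrowed from the NHS example above, while the "3. Demographics" theme, the variable names, and the `group_by_theme` helper are all hypothetical.

```python
from collections import defaultdict

# Hypothetical coding framework: each theme maps to the codes grouped under it.
# Labels are taken from the example above; the grouping itself is an assumption.
FRAMEWORK = {
    "1. Consumer expectations": ["immunotherapy"],
    "2. Access": ["24/7 access to healthcare"],
    "3. Demographics": ["ageing population"],
}

# Reverse lookup so that every code points back to its parent theme.
CODE_TO_THEME = {
    code: theme for theme, codes in FRAMEWORK.items() for code in codes
}

# Hypothetical tagged segments: (source id, text segment, code applied).
SEGMENTS = [
    ("speech_2015", "we have a million more over 70s by 2020", "ageing population"),
    ("speech_2015", "when they read about immunotherapy in the newspapers", "immunotherapy"),
    ("speech_2015", "The desire for 24/7 access to healthcare", "24/7 access to healthcare"),
]


def group_by_theme(segments):
    """Reorganise tagged segments under the theme that each code belongs to."""
    grouped = defaultdict(list)
    for source, text, code in segments:
        if code not in CODE_TO_THEME:
            # A code outside the framework signals that the framework needs revisiting.
            raise ValueError(f"Code not in framework: {code!r}")
        grouped[CODE_TO_THEME[code]].append((source, code, text))
    return dict(grouped)


for theme, items in group_by_theme(SEGMENTS).items():
    print(theme, items)
```

Rejecting codes that are not in the framework is a design choice made for this sketch: it forces any new code to be added to the framework deliberately rather than slipping in unnoticed.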

Again, it is a good idea to involve at least one other person. One possibility is for two researchers to apply the framework separately and code the first, say, five pages of a transcript. Reliability between coders can then be compared, with any discrepancies discussed and used to adjust the coding framework accordingly. The researchers could then repeat the process. Once reliability is at an acceptable level, a researcher should be able to code the transcripts in a much more reproducible way.
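This post does not prescribe a particular reliability statistic. One common option is Cohen's kappa, which adjusts the raw proportion of agreement for the agreement expected by chance. The sketch below is a minimal illustration under a simplifying assumption: both coders have assigned exactly one code to each of the same set of segments. The example codes and the function names are hypothetical.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Share of segments to which both coders applied the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("Both coders must code the same non-empty set of segments")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two coders, one code per segment."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    counts_a = Counter(codes_a)
    counts_b = Counter(codes_b)
    # Chance agreement: probability that both coders pick the same code at random,
    # given how often each of them used each code.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1:
        # Both coders used one identical code throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codes applied by two researchers to the same six segments.
coder_a = ["access", "access", "expectations", "ageing", "expectations", "access"]
coder_b = ["access", "expectations", "expectations", "ageing", "expectations", "access"]

print(round(percent_agreement(coder_a, coder_b), 2))  # 5 of 6 segments match
print(round(cohens_kappa(coder_a, coder_b), 2))
```

In practice segments often carry several codes and coders may not segment the text identically, so a statistic like this is a starting point for discussion rather than a verdict.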

Even at this stage, the formal coding framework does not need to be set in stone. If it is based on a subset of interviews, new issues are likely to emerge in subsequent transcripts and these may need to be incorporated. Additionally, analyses may be conducted with sub-samples of participants or the analysis may move from more descriptive to explanatory work, and therefore the coding needs may change.


Published qualitative studies will often mention that transcript data were coded, with few details to discern how this was done. In the study I worked on to develop the ICECAP-A capability measure, we coded to identify influences on quality of life in the first batch of interviews and dimensions of quality of life in later batches of interviews. A recent study into disinvestment decisions highlights how a second rater can be used in coding. Reporting guidelines for qualitative research papers highlight three important items related to coding – number of coders, description of the coding tree (framework), and derivation of the themes – that ought to be included in study write-ups.

Coding qualitative data can feel quite laborious. However, the real benefit of a well organised coding framework comes when reconstituting transcript data under common codes or themes. Codes that relate clearly to the research question, and one another, allow the researcher to reorganise the data with real purpose. Juxtaposing previously unrelated text and quotes sparks the discovery of exciting new links in the data. In turn, this spawns the interpretative work that is the fundamental value of the qualitative analysis. In economics parlance, good coding can improve both the efficiency of retrieving text for analysis and the quality of the analytical output itself.


Chris Sampson’s journal round-up for 8th January 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

An empirical comparison of the measurement properties of the EQ-5D-5L, DEMQOL-U and DEMQOL-Proxy-U for older people in residential care. Quality of Life Research [PubMed] Published 5th January 2018

There is now a condition-specific preference-based measure of health-related quality of life that can be used for people with cognitive impairment: the DEMQOL-U. Beyond the challenge of appropriately defining quality of life in this context, cognitive impairment presents the additional difficulty that individuals may not be able to self-complete a questionnaire. There’s some good evidence that proxy responses can be valid and reliable for people with cognitive impairment. The purpose of this study is to try out the new(ish) EQ-5D-5L in the context of cognitive impairment in a residential setting. Data were taken from an observational study in 17 residential care facilities in Australia. A variety of outcome measures were collected including the EQ-5D-5L (proxy where necessary), a cognitive bolt-on item for the EQ-5D, the DEMQOL-U and the DEMQOL-Proxy-U (from a family member or friend), the Modified Barthel Index, the cognitive impairment Psychogeriatric Assessment Scale (PAS-Cog), and the neuropsychiatric inventory questionnaire (NPI-Q). The researchers tested the correlation, convergent validity, and known-group validity for the various measures. 143 participants self-completed the EQ-5D-5L and DEMQOL-U, while 387 responses were available for the proxy versions. People with a diagnosis of dementia reported higher utility values on the EQ-5D-5L and DEMQOL-U than people without a diagnosis. Correlations between the measures were weak to moderate. Some people reported full health on the EQ-5D-5L despite identifying some impairment on the DEMQOL-U, and some vice versa. The EQ-5D-5L was more strongly correlated with clinical outcome measures than were the DEMQOL-U or DEMQOL-Proxy-U, though the associations were generally weak. The relationship between cognitive impairment and self-completed EQ-5D-5L and DEMQOL-U utilities was not in the expected direction; people with greater cognitive impairment reported higher utility values. 
There was quite a lot of disagreement between utility values derived from the different measures, so the EQ-5D-5L and DEMQOL-U should not be seen as substitutes. An EQ-QALY is not a DEM-QALY. This is all quite perplexing when it comes to measuring health-related quality of life in people with cognitive impairment. What does it mean if a condition-specific measure does not correlate with the condition? It could be that for people with cognitive impairment the key determinant of their quality of life is only indirectly related to their impairment, and more dependent on their living conditions.

Resolving the “cost-effective but unaffordable” paradox: estimating the health opportunity costs of nonmarginal budget impacts. Value in Health Published 4th January 2018

Back in 2015 (as discussed on this blog), NICE started appraising drugs that were cost-effective but implied such high costs for the NHS that they seemed unaffordable. This forced a consideration of how budget impact should be handled in technology appraisal. But the matter is far from settled and different countries have adopted different approaches. The challenge is to accurately estimate the opportunity cost of an investment, which will depend on the budget impact. A fixed cost-effectiveness threshold isn’t much use. This study builds on York’s earlier work that estimated cost-effectiveness thresholds based on health opportunity costs in the NHS. The researchers attempt to identify cost-effectiveness thresholds that are in accordance with different non-marginal (i.e. large) budget impacts. The idea is that a larger budget impact should imply a lower (i.e. more difficult to satisfy) cost-effectiveness threshold. NHS expenditure data were combined with mortality rates for different disease categories by geographical area. When primary care trusts’ (PCTs) budget allocations change, they transition gradually. This means that – for a period of time – some trusts receive a larger budget than they are expected to need while others receive a smaller budget. The researchers identify these as over-target and under-target accordingly. The expenditure and outcome elasticities associated with changes in the budget are estimated for the different disease groups (defined by programme budgeting categories; PBCs). Expenditure elasticity refers to the change in PBC expenditure given a change in overall NHS expenditure. Outcome elasticity refers to the change in PBC mortality given a change in PBC expenditure. Two econometric approaches are used: an interaction term approach, whereby a subgroup interaction term is used with the expenditure and outcome variables, and a subsample estimation approach, whereby subgroups are analysed separately. 
Despite the limitations associated with a reduced sample size, the subsample estimation approach is preferred on theoretical grounds. Using this method, under-target PCTs face a cost-per-QALY of £12,047 and over-target PCTs face a cost-per-QALY of £13,464, reflecting diminishing marginal returns. The estimates are used as the basis for identifying a health production function that can approximate the association between budget changes and health opportunity costs. Going back to the motivating example of hepatitis C drugs, a £772 million budget impact would ‘cost’ 61,997 QALYs, rather than the 59,667 that we would expect without accounting for the budget impact. This means that the threshold should be lower (at £12,452 instead of £12,936) for a budget impact of this size. The authors discuss a variety of approaches for ‘smoothing’ the budget impact of such investments. Whether or not you believe the absolute size of the quoted numbers depends on whether you believe the stack of (necessary) assumptions used to reach them. But regardless of that, the authors present an interesting and novel approach to establishing an empirical basis for estimating health opportunity costs when budget impacts are large.
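The arithmetic behind the hepatitis C example can be checked in a few lines. The sketch below simply divides the quoted budget impact by the quoted QALYs forgone. It is not the authors' health production function, and the function name is hypothetical. The marginal case comes out a couple of pounds above the quoted £12,936, presumably because the £772 million figure is rounded.

```python
def implied_threshold(budget_impact_gbp, qalys_forgone):
    """Average cost per QALY forgone implied by a given budget impact."""
    if qalys_forgone <= 0:
        raise ValueError("QALYs forgone must be positive")
    return budget_impact_gbp / qalys_forgone


BUDGET_IMPACT_GBP = 772_000_000  # rounded figure quoted above

# QALYs forgone with and without accounting for the size of the budget impact.
nonmarginal = implied_threshold(BUDGET_IMPACT_GBP, 61_997)
marginal = implied_threshold(BUDGET_IMPACT_GBP, 59_667)
extra_qalys_forgone = 61_997 - 59_667

print(round(nonmarginal))   # about 12,452 pounds per QALY
print(round(marginal))      # about 12,938 pounds per QALY
print(extra_qalys_forgone)  # 2,330 additional QALYs forgone
```

The gap between the two thresholds looks small, but across a budget impact of this size it corresponds to 2,330 additional QALYs forgone.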

First do no harm – the impact of financial incentives on dental x-rays. Journal of Health Economics [RePEc] Published 30th December 2017

If dentists move from fee-for-service to a salary, or if patients move from co-payment to full exemption, does it influence the frequency of x-rays? That’s the question that the researchers are trying to answer in this study. It’s important because x-rays always present some level of (carcinogenic) risk to patients and should therefore only be used when the benefits are expected to exceed the harms. Financial incentives shouldn’t come into it. If they do, then some dentists aren’t playing by the rules. And that seems to be the case. The authors start out by establishing a theoretical framework for the interaction between patient and dentist, which incorporates the harmful nature of x-rays, dentist remuneration, the patient’s payment arrangements, and the characteristics of each party. This model is used in conjunction with data from NHS Scotland, with 1.3 million treatment claims from 200,000 patients and 3,000 dentists. In 19% of treatments, an x-ray occurs. Some dentists are salaried and some are not, while some people pay charges for treatment and some are exempt. A series of fixed effects models is used to take advantage of these differences in arrangements by modelling the extent to which switches (between arrangements, for patients or dentists) influence the probability of receiving an x-ray. The authors’ preferred model shows that both the dentist’s remuneration arrangement and the patient’s financial status influence the number of x-rays in the direction predicted by the model. That is, fee-for-service and charge exemption result in more x-rays. The combination of these two factors results in a 9.4 percentage point increase in the probability of an x-ray during treatment, relative to salaried dentists with non-exempt patients. While the results do show that financial incentives influence this treatment decision (when they shouldn’t), the authors aren’t able to link the behaviour to patient harm. 
So we don’t know what percentage of treatments involving x-rays would correspond to the decision rule of benefits exceeding harms. Nevertheless, this is an important piece of work for informing the definition of dentist reimbursement and patient payment mechanisms.