Thesis Thursday: Andrea Gabrio

On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Andrea Gabrio who has a PhD from University College London. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.

Title
Full Bayesian methods to handle missing data in health economic evaluation
Supervisors
Gianluca Baio, Alexina Mason, Rachael Hunter
Repository link
http://discovery.ucl.ac.uk/10072087

What kind of assumptions about missing data are made in trial-based economic evaluations?

In any analysis, assumptions are always made about the missing values, that is, about those values which are not observed. Since the final results may depend on these assumptions, it is important that they are as plausible as possible within the context considered. For example, in trial-based economic evaluations, missing values often occur when data are collected through self-reported patient questionnaires, and in many cases it is plausible that patients with unobserved responses differ from the others (e.g. have worse health states). In general, it is very important that a range of plausible scenarios (defined according to the available information) is considered, and that the robustness of the conclusions across them is assessed in sensitivity analysis. Often, however, analysts prefer to ignore this uncertainty and rely on ‘default’ approaches (e.g. removing the missing data from the analysis) which implicitly make unrealistic assumptions and may lead to biased results. For a more in-depth overview of current practice, I refer to my published review.

Given that any assumption about the missing values cannot be checked from the data at hand, an ideal approach to handle missing data should combine a well-defined model for the observed data and explicit assumptions about missingness.
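As a toy illustration of why the ‘default’ approaches can mislead, here is a small simulation in Python (with invented numbers, not data from the thesis) in which the probability of a missing response depends on the unobserved health state, so a complete-case analysis overstates average health.

```python
# Toy illustration (not from the thesis): complete-case analysis under
# missingness that depends on the unobserved outcome (MNAR).
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# True (fully observed) utility scores for a hypothetical trial arm.
utility = rng.beta(a=5, b=2, size=n)          # skewed towards good health

# Patients in worse health are more likely not to return the questionnaire.
p_missing = 0.6 * (1 - utility)               # worse health -> more missingness
observed = rng.random(n) > p_missing

print("True mean utility:          ", round(utility.mean(), 3))
print("Complete-case mean utility: ", round(utility[observed].mean(), 3))
# The complete-case estimate is biased upwards because the sicker
# patients are disproportionately removed from the analysis.
```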

What do you mean by ‘full Bayesian’?

The term ‘full Bayesian’ is a technicality and typically indicates that, in the Bayesian analysis, the prior distributions are freely specified by the analyst, rather than being derived from the data (as in ‘empirical Bayes’ approaches). Being ‘fully’ Bayesian has some key advantages for handling missingness compared with other approaches, especially in small samples. First, a flexible choice of the priors may help to stabilise inference and avoid giving too much weight to implausible parameter values. Second, external information about missingness (e.g. expert opinion) can easily be incorporated into the model through the priors. This is essential when performing sensitivity analysis to missingness, as it allows the robustness of the results to be assessed under a range of assumptions, with the uncertainty about any unobserved quantity (parameters or missing data) being fully propagated and quantified in the posterior distribution.
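For intuition, here is a minimal sketch in Python (a generic conjugate normal model with made-up numbers, not one of the thesis models) of how a freely specified prior combines with a small sample, and how the remaining uncertainty is carried in the posterior.

```python
# Toy sketch (not from the thesis): a fully Bayesian update in a small sample,
# using a conjugate normal-normal model with a known data standard deviation.
import numpy as np

rng = np.random.default_rng(7)

# Small observed sample of costs (hypothetical numbers), assumed known sd.
y = rng.normal(loc=1500, scale=400, size=10)
sigma = 400.0

# Informative prior on the mean cost, e.g. elicited from experts.
prior_mean, prior_sd = 1200.0, 300.0

# Posterior for the mean (standard conjugate update).
post_prec = 1 / prior_sd**2 + len(y) / sigma**2
post_var = 1 / post_prec
post_mean = post_var * (prior_mean / prior_sd**2 + y.sum() / sigma**2)

print(f"Sample mean:    {y.mean():.0f}")
print(f"Posterior mean: {post_mean:.0f}  (sd {post_var**0.5:.0f})")
# With only 10 observations the prior pulls the estimate towards plausible
# values, and the posterior sd carries the remaining uncertainty forward.
```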

How did you use case studies to support the development of your methods?

In my PhD I had access to economic data from two small trials, each characterised by a considerable amount of missing outcome values, which I used as motivating examples for implementing my methods. Individual-level economic data present a series of complexities that make it difficult to justify the use of more ‘standardised’ methods and which, if not taken into account, may lead to biased results.

Examples of these include the correlation between effectiveness and costs, the skewness in the empirical distributions of both outcomes, the presence of identical values for many individuals (e.g. excess zeros or ones), and, on top of that, missingness. In many cases, the implementation of methods to handle these issues is not straightforward, especially when multiple types of complexities affect the data.
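To make these complexities concrete, the following toy simulation in Python (made-up parameters, not the trial data) generates outcomes with a spike of ones in the utilities, right-skewed costs, and correlation between the two.

```python
# Toy simulation (invented numbers) of the complexities listed above:
# correlated costs and effects, skewed costs, and a spike of ones in utilities.
import numpy as np

rng = np.random.default_rng(42)
n = 300

# QALYs: a structural spike at 1 (perfect health) plus a skewed Beta component.
perfect = rng.random(n) < 0.25
qaly = np.where(perfect, 1.0, rng.beta(a=6, b=2, size=n))

# Costs: Gamma-distributed (right-skewed), with mean decreasing in QALYs,
# which induces negative correlation between the two outcomes.
mean_cost = 3000 - 1500 * qaly
cost = rng.gamma(shape=2.0, scale=mean_cost / 2.0)

print("Share of utilities equal to 1:", round(perfect.mean(), 2))
print("Cost skewness (mean vs median):", round(cost.mean()), round(np.median(cost)))
print("Correlation(QALY, cost):", round(np.corrcoef(qaly, cost)[0, 1], 2))
```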

The flexibility of the Bayesian framework allows the specification of a model whose level of complexity can be increased in a relatively easy way to handle all these problems simultaneously, while also providing a natural way to perform probabilistic sensitivity analysis. For an example of how Bayesian models can be implemented to handle trial-based economic data, I refer to my published work.

How does your framework account for longitudinal data?

Since the data collected within a trial have a longitudinal nature (i.e. they are collected at different time points), it is important that any missingness method for trial-based economic evaluations takes this feature into account. I therefore developed a Bayesian parametric model for a bivariate health economic longitudinal response which, in addition to accounting for the typical complexities of the data (e.g. skewness), can be fitted to all the effectiveness and cost variables in a trial.

Time dependence between the responses is formally taken into account by means of a series of regressions, where each variable can be modelled conditionally on other variables collected at the same or at previous time points. This also offers an efficient way to handle missingness, as the available evidence at each time is included in the model, which may provide valuable information for imputing the missing data and therefore improve the confidence in the final results. In addition, sensitivity analysis to a range of missingness assumptions can be performed using a ‘pattern mixture’ approach. This allows the identification of certain parameters, known as sensitivity parameters, on which priors can be specified to incorporate external information and quantify its impact on the conclusions. A detailed description of the longitudinal model and the missing data analyses explored is also available online.
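The following is a rough sketch in Python (simulated data and illustrative numbers, not the thesis model) of the pattern mixture idea: a missing follow-up utility is imputed conditionally on the previous time point, shifted by a sensitivity parameter drawn from a prior, and the resulting uncertainty is propagated to the quantity of interest. Unlike the full Bayesian model, this two-stage shortcut ignores the uncertainty in the regression coefficients.

```python
# Toy pattern mixture sketch (not the thesis model): impute missing follow-up
# utilities from the previous time point plus a sensitivity parameter delta
# with a prior expressing how much worse non-responders might be.
import numpy as np

rng = np.random.default_rng(3)
n, n_draws = 200, 2000

# Baseline and follow-up utilities (hypothetical); follow-up missing for some.
u0 = rng.beta(5, 2, size=n)
u1_true = np.clip(0.1 + 0.8 * u0 + rng.normal(0, 0.1, n), 0, 1)
missing = rng.random(n) < 0.3

# Regression of observed follow-up on baseline (the 'conditional' structure).
X = np.column_stack([np.ones(n), u0])[~missing]
beta = np.linalg.lstsq(X, u1_true[~missing], rcond=None)[0]

arm_means = []
for _ in range(n_draws):
    delta = rng.normal(-0.1, 0.05)            # prior belief about the MNAR departure
    u1_imp = u1_true.copy()
    pred = beta[0] + beta[1] * u0[missing]
    u1_imp[missing] = np.clip(pred + delta + rng.normal(0, 0.1, missing.sum()), 0, 1)
    arm_means.append(u1_imp.mean())

print("Mean and 95% interval for follow-up utility:")
print(round(np.mean(arm_means), 3), np.round(np.percentile(arm_means, [2.5, 97.5]), 3))
```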

Are your proposed methods easy to implement?

Most of the methods that I developed in my project were implemented in JAGS, a program specifically designed for the analysis of Bayesian models using Markov chain Monte Carlo (MCMC) simulation. Like other Bayesian software (e.g. OpenBUGS and Stan), JAGS is freely available and can be interfaced with different statistical programs, such as R, SAS, and Stata. Therefore, I believe that, once people are willing to overcome the initial barrier of becoming familiar with a new software language, these programs provide extremely powerful tools for implementing Bayesian methods. Although analysts in economic evaluations are typically more familiar with frequentist methods (e.g. multiple imputation), it is clear that, as the complexity of the analysis increases, these methods require tailor-made routines for the optimisation of non-standard likelihood functions. A full Bayesian approach is then likely to be a preferable option, as it naturally allows uncertainty to be propagated to the wider economic model and sensitivity analysis to be performed.
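As a toy illustration of what MCMC does under the hood, here is a random-walk Metropolis sampler written in plain Python (illustrative only; it is not JAGS code and not one of the thesis models). JAGS automates this kind of simulation for far more complex models.

```python
# Toy random-walk Metropolis sampler for the mean of normal data (known sd = 1),
# with a vague N(0, 10^2) prior; plain NumPy, purely for illustration.
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(loc=2.0, scale=1.0, size=30)     # some simulated data

def log_post(mu):
    # log prior + log likelihood, up to an additive constant
    return -mu**2 / (2 * 10**2) - np.sum((y - mu) ** 2) / 2

draws, mu = [], 0.0
for _ in range(5000):
    prop = mu + rng.normal(0, 0.5)              # symmetric random-walk proposal
    if np.log(rng.random()) < log_post(prop) - log_post(mu):
        mu = prop                               # accept the proposal
    draws.append(mu)

print("Posterior mean of mu:", round(np.mean(draws[1000:]), 2))  # after burn-in
```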

Chris Sampson’s journal round-up for 18th February 2019

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

An educational review about using cost data for the purpose of cost-effectiveness analysis. PharmacoEconomics [PubMed] Published 12th February 2019

Costing can seem like a Cinderella method in the health economist’s toolkit. If you’re working on an economic evaluation, estimating resource use and costs can be tedious. That is perhaps why costing methodology has been relatively neglected in the literature compared with, say, health state valuation. This paper tries to redress the balance slightly by providing an overview of the main issues in costing and explaining why they’re important, so that we can do a better job. The issues are more complex than many assume.

Supported by a formidable reference list (n=120), the authors tackle 9 issues relating to costing: i) costs vs resource use; ii) trial-based vs model-based evaluations; iii) costing perspectives; iv) data sources; v) statistical methods; vi) baseline adjustments; vii) missing data; viii) uncertainty; and ix) discounting, inflation, and currency. It’s a big paper with a lot to say, so it isn’t easily summarised. Its role is as a reference point for us to turn to when we need it. There’s a stack of papers and other resources cited in here that I wasn’t aware of. The paper itself doesn’t get technical, leaving that to the papers cited therein. But the authors provide a good discussion of the questions that ought to be addressed by somebody designing a study, relating to data collection and analysis.

The paper closes with some recommendations. The main one is that people conducting cost-effectiveness analysis should think harder about why they’re making particular methodological choices. The point is also made that new developments could change the way we collect and analyse cost data. For example, the growing use of observational data demands that greater consideration be given to unobserved confounding. Costing methods are important and interesting!

A flexible open-source decision model for value assessment of biologic treatment for rheumatoid arthritis. PharmacoEconomics [PubMed] Published 9th February 2019

Wherever feasible, decision models should be published open-source, so that they can be reviewed, reused, recycled, or, perhaps, rejected. But open-source models are still a rare sight. Here, we have one for rheumatoid arthritis. But the paper isn’t really about the model. After all, the model and supporting documentation are already available online. Rather, the paper describes the reasoning behind publishing a model open-source, and the process for doing so in this case.

This is the first model released as part of the Open Source Value Project, which tries to convince decision-makers that cost-effectiveness models are worth paying attention to. That is, it’s aimed at the US market, where models are largely ignored. The authors argue that models need to be flexible to be valuable into the future and that, to achieve this, four steps should be followed in the development: 1) release the initial model, 2) invite feedback, 3) convene an expert panel to determine actions in light of the feedback, and 4) revise the model. Then, repeat as necessary. Alongside this, people with the requisite technical skills (i.e. knowing how to use R, C++, and GitHub) can proffer changes to the model whenever they like. This paper was written after step 3 had been completed, and the authors report receiving 159 comments on their model.

The model itself (which you can have a play with here) is an individual patient simulation, which is set up to evaluate a variety of treatment scenarios. It estimates costs and (mapped) QALYs and can be used to conduct cost-effectiveness analysis or multi-criteria decision analysis. The model was designed to be able to run 32 different model structures based on different assumptions about treatment pathways and outcomes, meaning that the authors could evaluate structural uncertainties (which is a rare feat). A variety of approaches were used to validate the model.
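For a very rough idea of what an individual patient simulation looks like under the hood, here is a bare-bones sketch in Python with entirely hypothetical parameters; it is not the published model, which is far richer.

```python
# Bare-bones individual patient simulation (hypothetical numbers only; not the
# open-source rheumatoid arthritis model) to show the general structure.
import numpy as np

rng = np.random.default_rng(2024)
n_patients, n_years, disc = 1000, 10, 0.03

def simulate(treatment_effect, annual_cost):
    total_costs, total_qalys = 0.0, 0.0
    for _ in range(n_patients):
        severity = rng.uniform(0.3, 0.8)              # baseline disease activity
        for t in range(n_years):
            severity = np.clip(severity + rng.normal(0.02 - treatment_effect, 0.05), 0, 1)
            utility = 0.9 - 0.5 * severity            # crude mapping to utility
            d = (1 + disc) ** -t                      # discount factor
            total_qalys += utility * d
            total_costs += (annual_cost + 2000 * severity) * d
    return total_costs / n_patients, total_qalys / n_patients

c0, q0 = simulate(treatment_effect=0.00, annual_cost=500)     # comparator
c1, q1 = simulate(treatment_effect=0.04, annual_cost=8000)    # hypothetical biologic
print("ICER (cost per QALY gained):", round((c1 - c0) / (q1 - q0)))
```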

The authors identify several challenges that they experienced in the process, including difficulties in communication between stakeholders and the large amount of time needed to develop, test, and describe a model of this sophistication. I would imagine that, compared with most decision models, the amount of work underlying this paper is staggering. Whether or not that work is worthwhile depends on whether researchers and policymakers make use of the model. The authors have made it as easy as possible for stakeholders to engage with and build on their work, so they should be hopeful that it will bear fruit.

EQ-5D-Y-5L: developing a revised EQ-5D-Y with increased response categories. Quality of Life Research [PubMed] Published 9th February 2019

The EQ-5D-Y has been a slow burner. It’s been around 10 years since it first came on the scene, but we’ve been without a value set and – with the introduction of the EQ-5D-5L – the questionnaire has lost some comparability with its adult equivalent. But the EQ-5D-Y has almost caught up, and this study describes part of how that’s been achieved.

The reason to develop a 5L version for the EQ-5D-Y is the same as for the adult version – to reduce ceiling effects and improve sensitivity. A selection of possible descriptors was identified through a review of the literature. Focus groups were conducted with children between 8 and 15 years of age in Germany, Spain, Sweden, and the UK in order to identify labels that can be understood by young people. Specifically, the researchers wanted to know the words used by children and adolescents to describe the quantity or intensity of health problems. Participants ranked the labels according to severity and specified which labels they didn’t like. Transcripts were analysed using thematic content analysis. Next, individual interviews were conducted with 255 participants across the four countries, which involved sorting and response scaling tasks. Younger children used a smiley scale. At this stage, both 4L and 5L versions were being considered. In a second phase of the research, cognitive interviews were used to test for comprehensibility and feasibility.

A 5-level version was preferred by most, and 5L labels were identified in each language. The English version used terms like ‘a little bit’, ‘a lot’, and ‘really’. There’s plenty more research to be done on the EQ-5D-Y-5L, including psychometric testing, but I’d expect it to be coming to studies near you very soon. One of the key takeaways from this study, and something that I’ve been seeing more in research in recent years, is that kids are smart. The authors make this point clear, particularly with respect to the response scaling tasks that were conducted with children as young as 8. Decision-making criteria and frameworks that relate to children should be based on children’s preferences and ideas.


Sam Watson’s journal round-up for 25th June 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

The efficiency of slacking off: evidence from the emergency department. Econometrica [RePEc] Published May 2018

Scheduling workers is a complex task, especially in large organisations such as hospitals. Not only should one consider when different shifts start throughout the day, but also how work is divided up over the course of each shift. Physicians, like anyone else, value their leisure time and want to go home at the end of a shift. Given how they value this leisure time, physicians may behave differently as the end of a shift approaches. This paper explores how doctors in an emergency department behave at ‘end of shift’, in particular looking at whether doctors ‘slack off’ by accepting fewer patients or tasks, and whether they rush to finish the tasks they have. Both cases can introduce inefficiency, by either under-using physicians’ labour time or using resources too intensively to complete something. Immediately, from the plots of the raw data, it is possible to see a drop in patients ‘accepted’ both close to the end of a shift and close to the next shift beginning (if there is shift overlap). Most interestingly, after controlling for patient characteristics, time of day, and day of week, there is a decrease in the length of stay of patients accepted closer to the end of shift, which is ‘dose-dependent’ on the time to end of shift. There are also marked increases in patient costs, orders, and inpatient admissions in the final hour of the shift. Assuming that only the number of patients assigned, and not the type of patient, changes over the course of a shift (a somewhat strong assumption despite the additional tests), this would suggest that doctors are rushing care and potentially providing sub-optimal or inefficient care closer to the end of their shift. The paper goes on to explore optimal scheduling on the basis of the results, among other things, but ultimately shows an interesting, if not entirely unexpected, pattern of physician behaviour. The results relate mainly to efficiency, but it’d be interesting to see how they relate to quality in the form of preventable errors.
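For a sense of the kind of regression involved, here is a sketch in Python with simulated data and an invented specification (not the paper’s): length of stay is regressed on the time remaining in the shift when the patient is accepted, conditioning on patient characteristics, hour of day, and day of week.

```python
# Sketch with simulated data (not the paper's) of an end-of-shift regression:
# length of stay on hours remaining in the shift, controlling for casemix.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 5000
df = pd.DataFrame({
    "hours_to_end": rng.uniform(0, 8, n),    # time left in the shift at acceptance
    "severity": rng.normal(0, 1, n),         # a patient characteristic
    "hour_of_day": rng.integers(0, 24, n),
    "day_of_week": rng.integers(0, 7, n),
})
# Built-in 'dose-dependent' effect: less time left -> shorter length of stay.
df["length_of_stay"] = (4 + 0.15 * df["hours_to_end"] + 0.5 * df["severity"]
                        + rng.normal(0, 1, n))

model = smf.ols("length_of_stay ~ hours_to_end + severity "
                "+ C(hour_of_day) + C(day_of_week)", data=df).fit()
print(model.params["hours_to_end"])   # positive: less time remaining, shorter stay
```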

Semiparametric estimation of longitudinal medical cost trajectory. Journal of the American Statistical Association Published 19th June 2018

Modern computational and statistical methods have made it possible to estimate a range of statistical models that were previously intractable. This includes complex latent variable structures, non-linear models, and non- and semi-parametric models. Recently we covered the use of splines for semi-parametric modelling in our Method of the Month series. Not that complexity is everything, of course, but given this rich toolbox to more faithfully replicate the data generating process, one does wonder why the humble linear model estimated with OLS remains so common. Nevertheless, I digress. This paper addresses the problem of estimating the medical cost trajectory for a given disease from diagnosis to death. There are two key issues: (i) the trajectory is likely to be non-linear, with costs probably increasing near death and possibly also higher immediately after diagnosis (a U-shape), and (ii) we don’t observe the costs of those who die, i.e. there is right-censoring. Such a set-up is also applicable in other cases, for example looking at health outcomes in panel data with informative dropout. The authors model medical costs for each month post-diagnosis and the time of censoring (death) by factorising their joint distribution into a marginal model for censoring and a conditional model for medical costs given the censoring time. The likelihood then has contributions from the observed medical costs and their times, and from the times of the censored outcomes. We then just need to specify the individual models. For medical costs, they use a multivariate normal with a mean function consisting of a bivariate spline of time and time of censoring. The time of censoring is modelled non-parametrically. This set-up of the missing data problem is sometimes referred to as a pattern mixture model, in that the outcome is modelled as a mixture density over different populations dying at different times. The authors note that another possibility for informative missing data, previously considered not to be estimable for complex non-linear structures, is the shared parameter model (soon to appear in another Method of the Month), which assumes that outcomes and dropout are independent conditional on an underlying latent variable. This approach can be more flexible, especially in cases with varying treatment effects. One wonders if the mixed model representation of penalised splines wouldn’t fit nicely in a shared parameter framework and provide at least as good inferences. An idea for a future paper, perhaps. Nevertheless, the authors illustrate their method by replicating the well-documented U-shaped costs from the time of diagnosis in patients with stage IV breast cancer.
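In generic notation (my symbols, and a simplification of the authors’ multivariate set-up), the factorisation behind this kind of pattern mixture model is:

```latex
% Illustrative notation only (a simplification of the paper's model):
% Y_j = medical cost in month j after diagnosis, T = time of death/censoring.
\[
  f(Y, T) = f(T)\, f(Y \mid T),
\]
\[
  Y_j \mid T = t \;\sim\; \mathcal{N}\big(\mu(j, t),\, \sigma^2\big),
\]
% where \mu(\cdot,\cdot) is a bivariate spline in time since diagnosis and time
% of death, and f(T) is left unspecified (non-parametric). Patients dying at
% different times t form the 'patterns' of the pattern mixture model.
```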

Do environmental factors drive obesity? Evidence from international graduate students. Health Economics [PubMed] Published 21st June 2018

‘The environment’ can encompass any number of things, including social interactions and networks, politics, green space, and pollution. Sometimes referred to as ‘neighbourhood effects’, the impact of the shared environment above and beyond the effect of individual risk factors is of great interest to researchers and policymakers alike. But there are a number of substantive issues that hinder estimation of neighbourhood effects. For example, social stratification into neighbourhoods likely means that people living together are similar, so it is difficult to compare like with like across neighbourhoods; trying to model neighbourhood choice will also, therefore, remove most of the variation in the data. Similarly, this lack of common support, i.e. overlap, between people from different neighbourhoods means estimated effects are not generalisable across the population. One way of getting around these problems is simply to randomise people to neighbourhoods. As odd as that sounds, that is what occurred in the Moving to Opportunity experiments and others. This paper takes a similar approach in trying to look at neighbourhood effects on the risk of obesity, by looking at the effects of international students moving to different locales with different local obesity rates. The key identifying assumption is that the choice to move to different places is conditionally independent of the local obesity rate. This doesn’t seem a strong assumption – I’ve never heard a prospective student ask about the weight of our student body. Some analysis supports this claim. The raw data and some further modelling show a pretty strong and robust relationship between local obesity rates and the weight gain of the international students. Given the complexity of the causes and correlates of obesity (see the crazy diagram in this post), it is hard to discern why certain environments contribute to obesity. The paper presents some weak evidence of differences in unhealthy behaviours between high and low obesity places – but this doesn’t quite get at the environmental mechanism, for example whether these behaviours are shared through social networks or shaped by the structure and layout of the urban area. Nevertheless, this is some strong evidence that living in an area where there are obese people means you’re more likely to become obese yourself.
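The reduced-form comparison amounts to something like the following sketch (simulated data and my own invented variable names, not the paper’s specification): weight gain is regressed on the destination’s obesity rate, conditioning on characteristics fixed before the move.

```python
# Sketch with simulated data (not the paper's): weight gain of movers regressed
# on the obesity rate of the place they move to, conditional on pre-move traits.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(11)
n = 3000
df = pd.DataFrame({
    "local_obesity_rate": rng.uniform(0.15, 0.40, n),   # destination obesity rate
    "baseline_bmi": rng.normal(23, 3, n),                # measured before the move
    "origin_region": rng.integers(0, 10, n),             # where the student came from
})
# Built-in 'neighbourhood effect': higher local obesity -> more weight gain.
df["weight_gain"] = (2.0 * df["local_obesity_rate"] + 0.05 * df["baseline_bmi"]
                     + rng.normal(0, 1, n))

fit = smf.ols("weight_gain ~ local_obesity_rate + baseline_bmi + C(origin_region)",
              data=df).fit()
print(fit.params["local_obesity_rate"])   # the 'environmental' effect of interest
```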
