On the third Thursday of every month, we speak to a recent graduate about their thesis and their studies. This month’s guest is Dr Anthony Hatswell who has a PhD from University College London. If you would like to suggest a candidate for an upcoming Thesis Thursday, get in touch.
A modelling framework for estimation of comparative effectiveness in pharmaceuticals using uncontrolled clinical trials
Gianluca Baio, Nick Freemantle
Are uncontrolled clinical trials an important source of data?
The simple answer here is yes, because for some interventions they are the only source of data. To go back a bit, there exist situations where randomised controlled trials, the bread and butter of evidence-based medicine, just aren’t possible. The most obvious example is parachutes; would you want to take part in a placebo-controlled trial to see if they work?
Parachutes are a bit of a straw man, but there are other, similar scenarios, such as medicines for cyanide poisoning (cyanide is produced when certain plastics burn, such as in house fires), through to terminal cancers where the alternative is unpalatable and a trial is therefore not possible because there is no equipoise.
In such circumstances, uncontrolled studies may give regulators sufficient information to approve a product, as it appears efficacious. The challenge, however, is for payers, where the question isn’t necessarily whether a product is efficacious but how effective it is compared with the standard of care. The magnitude of health gain, after all, is one of the most important drivers of cost-effectiveness.
How have previous approaches to identifying comparative effectiveness been inadequate?
There have been a few previous approaches proposed for individual circumstances but, as far as I’ve been able to tell, I’m the first person to look at this as a whole, especially when it comes to comparative effectiveness as opposed to efficacy from a regulatory perspective.
In performing a systematic review (technically, over 70 systematic reviews), I found that what people have done in the past is, for the most part, compare the contemporary data with a ‘historical control’, without adjusting for differences between studies. This is a major issue, as there is a lot of literature demonstrating that the same intervention can have a different effect size in different populations. The best example I think comes from Moroz et al., who show that trials outperform historical data for the same intervention by about 5%. A 5% overestimation might not sound massive, but it is more than enough to affect the cost-effectiveness markedly.
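To put an illustrative number on that (the figures below are my own hypothetical inputs, not taken from the thesis or from Moroz et al.), a 5% overestimate of the health gain can be enough to move an ICER across a typical willingness-to-pay threshold:

```python
# Illustrative only: hypothetical cost and QALY inputs.
# A naive historical comparison that overestimates the treatment effect by
# 5% can flip a reimbursement decision at a standard threshold.

incremental_cost = 61_000   # assumed extra cost of the new treatment (GBP)
true_qaly_gain = 2.0        # assumed true incremental QALYs vs standard care

biased_qaly_gain = true_qaly_gain * 1.05   # naive estimate, 5% too optimistic

icer_true = incremental_cost / true_qaly_gain      # 30,500 per QALY
icer_biased = incremental_cost / biased_qaly_gain  # ~29,048 per QALY

threshold = 30_000  # a commonly quoted willingness-to-pay benchmark (GBP/QALY)
print(f"True ICER:   {icer_true:,.0f} (cost-effective: {icer_true <= threshold})")
print(f"Biased ICER: {icer_biased:,.0f} (cost-effective: {icer_biased <= threshold})")
```

With these (made-up) inputs the unbiased estimate sits just above the threshold and the biased one just below it, so the small bias changes the decision.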
Subjectively I think things have improved a lot in the past few years, with techniques such as propensity scoring and matching-adjusted indirect comparison (MAIC) being more widely used. That isn’t to say that we can’t get better though!
How did you go about testing alternative methods?
One of the things that became apparent early on in my studies is that MAIC was being increasingly used (I think there are now over 80 published examples), but that we don’t really have any guidance on when it works, and when it doesn’t. Filling a part of this gap was the main methodological piece of work in my PhD.
To look into this I performed a simulation study: simulating data that we know to have a bias, seeing how well MAIC does in correcting it, then repeating many times to understand whether MAIC is unbiased (i.e. does it get the right answer on average) and precise (i.e. how much uncertainty there is in its estimates). After checking that it works as intended in the base case with ideal conditions (and it does), we are then able to change the setup of the study to push, or even violate, the assumptions on which it relies. This work was published in Value in Health earlier this year.

On top of that, part of the work was also in proposing new methods for creating controls, all of which work with techniques we use widely as a field, such as parametric curve fitting. These are published here and here.
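As a minimal sketch of one iteration of that kind of simulation (my own toy parameters, not the thesis code): simulate patient-level data whose covariate distribution differs from an aggregate-only comparator population, fit Signorovitch-style MAIC weights so the weighted covariate mean matches the comparator’s, and check whether the weighted estimate recovers the comparator population’s mean where the naive estimate does not.

```python
import numpy as np

rng = np.random.default_rng(42)

# --- Simulated "index trial" with patient-level data (all values assumed) ---
n = 2000
x = rng.normal(0.5, 1.0, n)                # covariate; mean differs from target
y = 1.0 + 2.0 * x + rng.normal(0, 1, n)    # outcome depends on the covariate

# --- Comparator population: only the aggregate covariate mean is reported ---
target_mean = 0.0   # true outcome mean there is 1 + 2*0 = 1

# --- MAIC weights w_i = exp(a * (x_i - target_mean)), with 'a' chosen so the
#     weighted covariate mean hits the target; found by Newton's method on
#     the convex objective sum(exp(a * z)) ---
z = x - target_mean
a = 0.0
for _ in range(50):
    w = np.exp(a * z)
    grad = np.sum(w * z)       # zero exactly when the weighted mean matches
    hess = np.sum(w * z**2)
    a -= grad / hess

w = np.exp(a * z)
print("weighted covariate mean:", np.sum(w * x) / np.sum(w))  # ~0.0 (matched)
print("naive outcome mean:     ", y.mean())                   # ~2.0 (biased)
print("MAIC-weighted mean:     ", np.sum(w * y) / np.sum(w))  # ~1.0 (target)
```

Repeating this over many simulated datasets, and then distorting the setup (e.g. omitting the covariate from the weighting), is what lets you measure bias and precision under increasingly unfavourable conditions.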
What are the key features of the framework that you propose?
After working through all the different methods that are available, I put them into a flow chart to understand which methods are applicable under which circumstances. It isn’t necessarily to say that you must do X, but more to present the options that you have. Even if I wanted to, at present it isn’t possible to say for instance that simulated treatment comparison (STC) is more or less appropriate than MAIC – what we can say however is that if you have access to patient-level data from one study, but only aggregated data from another, both are options and should at least be considered.
In case of interest, the diagram below highlights the options. Most of them, I think, people will be familiar with, apart from the ‘E-value’, which is a really neat idea that went to press recently. Basically: how much influence would unmeasured confounding need to have before it changed the result we see?
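For the curious, the E-value has a simple closed form (VanderWeele and Ding’s formula); a quick sketch:

```python
import math

def e_value(rr: float) -> float:
    """E-value for an observed risk ratio: the minimum strength of
    association, on the risk-ratio scale, that an unmeasured confounder
    would need with both treatment and outcome to fully explain away
    the observed result."""
    rr = 1 / rr if rr < 1 else rr          # symmetric for protective effects
    return rr + math.sqrt(rr * (rr - 1))

# An observed risk ratio of 2 would need a confounder associated with both
# treatment and outcome by a risk ratio of at least ~3.41 to explain it away.
print(round(e_value(2.0), 2))   # 3.41
```

A large E-value means only an implausibly strong confounder could account for the finding, which is exactly the kind of reassurance an unanchored comparison needs.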
I think the main takeaway I would give, however, is that multiple estimates are much better than a single value. We are unlikely to ever estimate exactly what an RCT would have shown and, even if we did, there would be no way to confirm it. What is more reassuring is if you can show a naive estimate alongside adjustments performed in a few different ways (even if just matching on different numbers of variables) and demonstrate that the results are similar; that builds a lot of confidence in the face of structural uncertainty.
Who would you most like to influence with this research: regulators or researchers?
Researchers. Regulators have a difficult task but do have the power to demand evidence of a certain level in any given context. Some of the techniques might be of relevance (and some, propensity scoring for example, are already widely used), so there may be some scope for their use but, for the most part, regulators have done a good job (and are very consistent).
Ultimately, what I’m hoping is that the work I’ve done helps push up the standards a little. This means clinical researchers making sure the relevant endpoints are captured (and economists having the tools to push back when they aren’t), but also making sure that we as a field are aware of the tools available to do the best job possible. What I think we all want is the right decisions made for the right reasons, and anything we can do to get there is a bonus.
How did you find the experience of working full-time alongside your part-time PhD?
The route I took (working for eight years before starting, then doing it part-time over six years) is unusual but not unique – others have done this before (Professor Alan Brennan for one). What I would say, though, is that it does require a thesis question you are very passionate about, as you need to keep going at it for a very long time, and likely through a lot of life changes. As an example, I started just after my wife and I got married; we moved house, had two children, and both moved jobs before I finished.
It isn’t all downsides though. Having worked previously, you know a lot of the field and already have many of the basic research skills that a full-time student has to learn. Other things, such as finances, are less of a worry, provided you have a supportive employer (if you don’t, Delta Hat are hiring). The same goes for review times: when submitting a paper, three months is pretty typical, which is quite a chunk of a full-time student’s three years, but much less of a hit part-time, when you don’t run out of other things to do while you wait.

If anyone is seriously considering it, it isn’t hard to find my contact details, and I’ll happily talk you through my experience and answer any questions you have. The main advice would be firstly not to underestimate it (you need to give it at least one full day a week, which does mean going part-time at work), but also not to overestimate it (it is doable! I’ve done it, and many others have). Good communication, and supervisors who are on board with the plan from the beginning, are also vital. In terms of resources, I found the book ‘How to get a PhD’ invaluable, especially the chapter titled ‘How not to get a PhD’.