# James Altunkaya’s journal round-up for 3rd September 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Sensitivity analysis for not-at-random missing data in trial-based cost-effectiveness analysis: a tutorial. PharmacoEconomics [PubMed] [RePEc] Published 20th April 2018

Last month, we highlighted a Bayesian framework for imputing missing data in economic evaluation. The paper dealt with the issue of departure from the ‘Missing at Random’ (MAR) assumption by using a Bayesian approach to specify a plausible missingness model from the results of expert elicitation. This was used to estimate a prior distribution for the unobserved terms in the outcomes model.

For those less comfortable with Bayesian estimation, this month we highlight a tutorial paper from the same authors, outlining an approach to recognise the impact of plausible departures from ‘Missingness at Random’ assumptions on cost-effectiveness results. Given poor adherence to current recommendations for the best practice in handling and reporting missing data, an incremental approach to improving missing data methods in health research may be more realistic. The authors supply accompanying Stata code.

The paper investigates the importance of assuming a degree of ‘informative’ missingness (i.e. ‘Missingness not at Random’) in sensitivity analyses. In a case study, the authors present a range of scenarios which assume a decrement of 5-10% in the quality of life of patients with missing health outcomes, compared to multiple imputation estimates based on observed characteristics under standard ‘Missing at Random’ assumptions. This represents an assumption that, controlling for all observed characteristics used in multiple imputation, those with complete quality of life profiles may have higher quality of life than those with incomplete surveys.

Quality of life decrements were implemented in the control and treatment arm separately, and then jointly, in six scenarios. This aimed to demonstrate the sensitivity of cost-effectiveness judgements to the possibility of a different missingness mechanism in each arm. The authors similarly investigate sensitivity to higher health costs in those with missing data than predicted based on observed characteristics in imputation under ‘Missingness at Random’. Finally, sensitivity to a simultaneous departure from ‘Missingness at Random’ in both health outcomes and health costs is investigated.

The proposed sensitivity analyses provide a useful heuristic to assess what degree of difference between missing and non-missing subjects on unobserved characteristics would be necessary to change cost-effectiveness decisions. The authors admit this framework could appear relatively crude to those comfortable with more advanced missing data approaches such as those outlined in last month’s round-up. However, this approach should appeal to those interested in presenting the magnitude of uncertainty introduced by missing data assumptions, in a way that is easily interpretable to decision makers.

The impact of waiting for intervention on costs and effectiveness: the case of transcatheter aortic valve replacement. The European Journal of Health Economics [PubMed] [RePEc] Published September 2018

This paper appears in print this month and sparked interest as one of comparatively few studies on the cost-effectiveness of waiting lists. Given interest in using constrained optimisation methods in health outcomes research, highlighted in this month’s editorial in Value in Health, there is rightly interest in extending the traditional sphere of economic evaluation from drugs and devices to understanding the trade-offs of investing in a wider range of policy interventions, using a common metric of costs and QALYs. Rachel Meacock’s paper earlier this year did a great job at outlining some of the challenges involved broadening the scope of economic evaluation to more general decisions in health service delivery.

The authors set out to understand the cost-effectiveness of delaying a cardiac treatment (TVAR) using a waiting list of up to 12 months compared to a policy of immediate treatment. The effectiveness of treatment at 3, 6, 9 & 12 months after initial diagnosis, health decrements during waiting, and corresponding health costs during wait time and post-treatment were derived from a small observational study. As treatment is studied in an elderly population, a non-ignorable proportion of patients die whilst waiting for surgery. This translates to lower modelled costs, but also lower quality life years in modelled cohorts where there was any delay from a policy of immediate treatment. The authors conclude that eliminating all waiting time for TVAR would produce population health at a rate of ~€12,500 per QALY gained.

However, based on the modelling presented, the authors lack the ability to make cost-effectiveness judgements of this sort. Waiting lists exist for a reason, chiefly a lack of clinical capacity to treat patients immediately. In taking a decision to treat patients immediately in one disease area, we therefore need some judgement as to whether the health displaced in now untreated patients in another disease area is of greater, less or equal magnitude to that gained by treating TVAR patients immediately. Alternately, modelling should include the cost of acquiring additional clinical capacity (such as theatre space) to treat TVAR patients immediately, so as not to displace other treatments. In such a case, the ICER is likely to be much higher, due to the large cost of new resources needed to reduce waiting times to zero.

Given the data available, a simple improvement to the paper would be to reflect current waiting times (already gathered from observational study) as the ‘standard of care’ arm. As such, the estimated change in quality of life and healthcare resource cost from reducing waiting times to zero from levels observed in current practice could be calculated. This could then be used to calculate the maximum acceptable cost of acquiring additional treatment resources needed to treat patients with no waiting time, given current national willingness-to-pay thresholds.

Admittedly, there remain problems in using the authors’ chosen observational dataset to calculate quality of life and cost outcomes for patients treated at different time periods. Waiting times were prioritised in this ‘real world’ observational study, based on clinical assessment of patients’ treatment need. Thus it is expected that the quality of life lost during a waiting period would be lower for patients treated in the observational study at 12 months, compared to the expected quality of life loss of waiting for the group of patients judged to need immediate treatment. A previous study in cardiac care took on the more manageable task of investigating the cost-effectiveness of different prioritisation strategies for the waiting list, investigating the sensitivity of conclusions to varying a fixed maximum wait-time for the last patient treated.

This study therefore demonstrates some of the difficulties in attempting to make cost-effectiveness judgements about waiting time policy. Given that the cost-effectiveness of reducing waiting times in different disease areas is expected to vary, based on relative importance of waiting for treatment on short and long-term health outcomes and costs, this remains an interesting area for economic evaluation to explore. In the context of the current focus on constrained optimisation techniques across different areas in healthcare (see ISPOR task force), it is likely that extending economic evaluation to evaluate a broader range of decision problems on a common scale will become increasingly important in future.

Understanding and identifying key issues with the involvement of clinicians in the development of decision-analytic model structures: a qualitative study. PharmacoEconomics [PubMed] Published 17th August 2018

This paper gathers evidence from interviews with clinicians and modellers, with the aim to improve the nature of the working relationship between the two fields during model development.

Researchers gathered opinion from a variety of settings, including industry. The main report focusses on evidence from two case studies – one tracking the working relationship between modellers and a single clinical advisor at a UK university, with the second gathering evidence from a UK policy institute – where modellers worked with up to 11 clinical experts per meeting.

Some of the authors’ conclusions are not particularly surprising. Modellers reported difficulty in recruiting clinicians to advise on model structures, and further difficulty in then engaging recruited clinicians to provide relevant advice for the model building process. Specific comments suggested difficulty for some clinical advisors in identifying representative patient experiences, instead diverting modellers’ attention towards rare outlier events.

Study responses suggested currently only 1 or 2 clinicians were typically consulted during model development. The authors recommend involving a larger group of clinicians at this stage of the modelling process, with a more varied range of clinical experience (junior as well as senior clinicians, with some geographical variation). This is intended to help ensure clinical pathways modelled are generalizable. The experience of one clinical collaborator involved in the case study based at a UK university, compared to 11 clinicians at the policy institute studied, perhaps may also illustrate a general problem of inadequate compensation for clinical time within the university system. The authors also advocate the availability of some relevant training for clinicians in decision modelling to help enhance the efficiency of participants’ time during model building. Clinicians sampled were supportive of this view – citing the need for further guidance from modellers on the nature of their expected contribution.

This study ties into the general literature regarding structural uncertainty in decision analytic models. In advocating the early contribution of a larger, more diverse group of clinicians in model development, the authors advocate a degree of alignment between clinical involvement during model structuring, and guidelines for eliciting parameter estimates from clinical experts. Similar problems, however, remain for both fields, in recruiting clinical experts from sufficiently diverse backgrounds to provide a valid sample.

Credits

# Method of the month: Semiparametric models with penalised splines

Once a month we discuss a particular research method that may be of interest to people working in health economics. We’ll consider widely used key methodologies, as well as more novel approaches. Our reviews are not designed to be comprehensive but provide an introduction to the method, its underlying principles, some applied examples, and where to find out more. If you’d like to write a post for this series, get in touch. This month’s method is semiparametric models with penalised splines.

## Principles

A common assumption of regression models is that effects are linear and additive. However, nothing is ever really that simple. One might respond that all models are wrong, but some are useful, as George Box once said. And the linear, additive regression model has coefficients that can be interpreted as average treatment effects under the right assumptions. Sometimes though we are interested in conditional average treatment effects and how the impact of an intervention varies according to the value of some variable of interest. Often this relationship is not linear and we don’t know its functional form. Splines provide a way of estimating curves (or surfaces) of unknown functional form and are a widely used tool for semiparametric regression models. The term ‘spline’ was derived from the tool shipbuilders and drafters used to construct smooth edges: a bendable piece of material that when fixed at a number of points would relax into the desired shape.

## Implementation

Our interest lies in estimating the unknown function m:

$y_i = m(x_i) + e_i$

A ‘spline’ in the mathematical sense is a function constructed piece-wise from polynomial functions. The places where the functions meet are known as knots and the spline has order equal to one more than the degree of the underlying polynomial terms. Basis-splines or B-splines are the typical starting point for spline functions. These are curves that are defined recursively as a sum of ‘basis functions’, which depend only on the polynomial degree and the knots. A spline function can be represented as a linear combination of B-splines, the parameters dictating this combination can be estimated using standard regression model estimation techniques. If we have $N$ B-splines then our regression function can be estimated as:

$y_i = \sum_{j=1}^N ( \alpha_j B_j(x_i) ) + e_i$

by minimising $\sum_{i=1}^N \{ y_i - \sum_{j=1}^N ( \alpha_j B_j(x_i) ) \} ^2$. Where the $B_j$ are the B-splines and the $\alpha_j$ are coefficients to be estimated.

Useful technical explainers of splines and B-splines can be found here [PDF] and here [PDF].

One issue with fitting splines to data is that we run the risk of ‘overfitting’. Outliers might distort the curve we fit, damaging the external validity of conclusions we might make. To deal with this, we can enforce a certain level of smoothness using so-called penalty functions. The smoothness (or conversely the ‘roughness’) of a curve is often defined by the integral of the square of the second derivative of the curve function. Penalised-splines, or P-splines, were therefore proposed which added on this smoothness term multiplied by a smoothing parameter $\lambda$. In this case, we look to minimising:

$\sum_{i=1}^N \{ y_i - \sum_{j=1}^N ( \alpha_j B_j(x_i) ) \}^2 + \lambda\int m''(x_i)^2 dx$

to estimate our parameters. Many other different variations on this penalty have been proposed. This article provides a good explanation of P-splines.

An attractive type of spline has become the ‘low rank thin plate spline‘. This type of spline is defined by its penalty, which has a physical analogy with the resistance that a thin sheet of metal puts up when it is bent. This type of spline removes the problem associated with thin plate splines of having too many parameters to estimate by taking a ‘low rank’ approximation, and it is generally insensitive to the choice of knots, which other penalised spline regression models are not.

Crainiceanu and colleagues show how the low rank thin plate smooth splines can be represented as a generalised linear mixed model. In particular, our model can be represented as:

$m(x_i) = \beta_0 + \beta_1x_i + \sum_{k=1}^K u_k |x_i - \kappa_k|^3$

where $\kappa_k$, $k=1,...,K$, are the knots. The parameters, $\theta = (\beta_0,\beta_1,u_k)'$, can be estimated by minimising

$\sum_{i=1}^N \{ y_i - m(x_i) \} ^2 + \frac{1}{\lambda} \theta ^T D \theta$ .

This is shown to give the mixed model

$y_i = \beta_0 + \beta_1 + Z'b + u_i$

where each random coefficient in the vector $b$ is distributed as $N(0,\sigma^2_b)$ and $Z$ and $D$ are given in the paper cited above.

As a final note, we have discussed splines in one dimension, but they can be extended to more dimensions. A two-dimensional spline can be generated by taking the tensor product of the two one dimensional spline functions. I leave this as an exercise for the reader.

### Software

#### R

• The package gamm4 provides the tools necessary for a frequentist analysis along the lines described in this post. It uses restricted maximum likelihood estimation with the package lme4 to estimate the parameters of the thin plate spline model.
• A Bayesian version of this functionality is implemented in the package rstanarm, which uses gamm4 to produce the matrices for thin plate spline models and Stan for the estimation through the stan_gamm4 function.

If you wanted to implement these models for yourself from scratch, Crainiceanu and colleagues provide the R code to generate the matrices necessary to estimate the spline function:

n<-length(covariate)
X<-cbind(rep(1,n),covariate)
knots<-quantile(unique(covariate),
seq(0,1,length=(num.knots+2))[-c(1,(num.knots+2))])
Z_K<-(abs(outer(covariate,knots,"-")))^3
OMEGA_all<-(abs(outer(knots,knots,"-")))^3
svd.OMEGA_all<-svd(OMEGA_all)
sqrt.OMEGA_all<-t(svd.OMEGA_all$v %*% (t(svd.OMEGA_all$u)*sqrt(svd.OMEGA_all\$d)))
Z<-t(solve(sqrt.OMEGA_all,t(Z_K)))

#### Stata

I will temper this advice by cautioning that I have never estimated a spline-based semi-parametric model in Stata, so what follows may be hopelessly incorrect. The only implementation of penalised splines in Stata is the package and associated function psplineHowever, I cannot find any information about the penalty function used, so I would advise some caution when implementing. An alternative is to program the model yourself, through conversion of the above R code in Mata to generate the matrix Z and then the parameters could be estimated with xtmixed.

## Applications

Applications of these semi-parametric models in the world of health economics have tended to appear more in technical or statistical journals than health economics journals or economics more generally. For example, recent examples include Li et al who use penalised splines to estimate the relationship between disease duration and health care costs. Wunder and co look at how reported well-being varies over the course of the lifespan. And finally, we have Stollenwerk and colleagues who use splines to estimate flexible predictive models for cost-of-illness studies with ‘big data’.

Credit

# Review: Health Econometrics Using Stata (Partha Deb et al)

Health Econometrics Using Stata

Paperback, 264 pages, ISBN: 978-1-59718-228-7, published 31 August 2017

This book is the perfect guide to understanding the various econometric methods available for modelling of costs and counts data for the individual who understands econometrics best after applying it to a dataset (like myself). Pre-requisites include a decent knowledge of Stata and a desire to apply econometric methods to a cost or count outcome variable

It’s important to say that this book does not cover all aspects of econometrics within health economics, but instead focuses on ‘modelling health care costs and counts’ (the title of the short course from which the book evolved). As expected from this range of texts, the vast majority of the book comes with detailed example Stata code for all of the methods described, with illustrations either using a publicly available sample of MEPS data or simulated data.

Like many papers in this field, the focus of the book revolves around the non-normal characteristics of health care resource use distributions. These are the mass point at zero, right-hand skew and inherent heteroskedasticity. As such the book covers the broad suite of models that have been developed in order to account for these features, ranging from two-part models, transformation of the data (and the problematic re-transformation of estimated effects) to non-linear modelling methods such as generalised linear models (GLMs). Unlike many papers in this field, the authors emphasise the need – and provide guidance on how – to delve deep into the underlying data in order to appreciate the most appropriate methods (there is even a chapter on design effects) and encourage rigorous testing of model specification. In addition, Health Econometrics Using Stata considers the important issue of endogeneity and is not solely fixated on distributional issues, providing important insight and code for estimation of non-linear models that control for potential endogeneity (interested readers may wish to heed the published cautionary notes for some of these methods, e.g. Chapman and Brooks). Finally, the book describes more advanced methods for estimating heterogeneous effects, although code is not provided for all of these methods, which is a bit of a shame (but perhaps understandable given the complexity).

This could be a very dry text, but it is not – emphatically! The personality of the authors comes through very strongly from the writing. Reading it brought back many pleasant memories from the course ‘modelling health care costs and counts’ that I sat in 2012. The book also features a dedication to Willard Manning, which is a fitting tribute to a man who was both a great academic and an outstanding mentor. One particular highlight, with which past course attendants will be familiar, is the section ‘top 10 myths in health econometrics’. This straightforward and punchy presentation, backed up by rigorous methodological research, is a great way to get these key messages across in an accessible format. Other great features of this book include the use of simulations to illustrate important features of the econometric models (with code provided to recreate) and a personal highlight (granted, a niche interest…) was the code to generate appropriate standard errors when using the poisson family within GLMs for costs.

Of course, Health Econometrics Using Stata cannot be comprehensive and there are developments in this field that are not covered. Most notably, there is no discussion of how to model these data in a panel/longitudinal setting, which is crucially important for estimating parameters for decision models, for example. Potential issues around missing data and censoring are also not discussed. Also, this text does not cover advances in flexible parametric modelling, which enable modelling of data that are both highly skewed and leptokurtic (see Jones 2017 for an excellent summary of this literature along with a primer on data visualisation using Stata).

I heartily recommend Health Econometrics Using Stata to interested colleagues who want practical advice – on model selection and specification testing with cost and count outcome data – from some of the top specialists in our field, in their own words.

Credit