Sam Watson’s journal round-up for 12th February 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Tuskegee and the health of black men. The Quarterly Journal of Economics [RePEc] Published February 2018

In 1932, a study often considered the most infamous and potentially most unethical in U.S. medical history began. Researchers in Alabama enrolled impoverished black men in a research program designed to examine the effects of syphilis, under the guise of providing government-funded health care. The study became known as the Tuskegee syphilis experiment. For 40 years the research subjects were not informed they had syphilis, nor were they treated, even after penicillin was shown to be effective. The study was terminated in 1972 after its details were leaked to the press; numerous men died, 40 wives contracted syphilis, and a number of children were born with congenital syphilis. It is no surprise, then, that there is distrust of the medical system among African Americans. The aim of this article is to examine whether the distrust engendered by the Tuskegee study could have contributed to the significant differences in health outcomes between black males and other groups. To derive a causal estimate the study makes use of a number of differences: black vs non-black, for obvious reasons; male vs female, since the study targeted males, and since women were more likely to have had contact with, and hence higher trust in, the medical system; before vs after the study's public disclosure; and geographic differences, since proximity to the location of the study may be informative about trust in the local health care facilities. A wide variety of further checks reinforce the conclusion that the study led to a reduction in health care utilisation among black men of around 20%. The effect is particularly pronounced in those with low education and income. Beyond elucidating the indirect harms caused by this most heinous of studies, the article illustrates the importance of trust in mediating the effectiveness of public institutions. Poor reputations caused by negligence and malpractice can spread far and wide – the mid-Staffordshire hospital scandal may be just such an example.
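
The identification strategy is, in essence, a layered difference-in-differences design. As a very rough illustration of what such a specification might look like – with hypothetical variable and file names, not the authors' actual data or code – consider:

```python
# A minimal sketch of a triple-difference specification of the kind described
# above; all variable and file names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# One row per person-year, with (hypothetical) columns:
#   utilisation - measure of health care utilisation
#   black, male - 0/1 group indicators
#   post        - 1 for years after the study's public disclosure in 1972
#   proximity   - measure of closeness to the study's location
df = pd.read_csv("utilisation_panel.csv")  # placeholder file

# The black*male*post interaction captures the differential change in
# utilisation for black men after disclosure, relative to all comparison
# groups; the paper additionally lets effects vary with proximity.
model = smf.ols(
    "utilisation ~ black * male * post + proximity + C(year) + C(state)",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["state"]})

print(model.summary())
```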

The economic consequences of hospital admissions. American Economic Review [RePEc] Published February 2018

That this paper’s title recalls that of Keynes’s book The Economic Consequences of the Peace is to my mind no mistake. Keynes argued that a generous and equitable post-war settlement was required to ensure peace and economic well-being in Europe. The slow ‘economic privation’ driven by the punitive measures and imposed austerity of the Treaty of Versailles would lead to crisis. Keynes was evidently highly critical of the conference that led to the Treaty and resigned in protest before its end. But what does this have to do with hospital admissions? Using an ‘event study’ approach – in essence regressing the outcome of interest on covariates including indicators of time relative to an event – the paper examines the impact that hospital admissions have on a range of economic outcomes. The authors find that for insured non-elderly adults “hospital admissions increase out-of-pocket medical spending, unpaid medical bills, and bankruptcy, and reduce earnings, income, access to credit, and consumer borrowing.” Similarly, they estimate that hospital admissions among this same group are responsible for around 4% of bankruptcies annually. These losses are often not insured, but they note that in a number of European countries the social welfare system does provide assistance for lost wages in the event of hospital admission. Certainly, this could be construed as economic privation brought about by a lack of generosity of the state. Nevertheless, it also reinforces the fact that negative health shocks can have adverse consequences throughout a person’s life beyond those directly caused by the need for medical care.
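
For readers unfamiliar with the method, the sketch below shows roughly what an event-study regression of this kind looks like. The variable and file names are hypothetical, and the paper's actual specification, which draws on credit-report and tax data, is considerably richer.

```python
# A minimal event-study sketch: regress the outcome on indicators of time
# relative to the admission event. All names here are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("admissions_panel.csv")  # placeholder: person-year panel

# Years relative to the index hospital admission (e.g. -3 ... +4); the year
# before admission (-1) is the omitted reference period.
df["event_time"] = df["year"] - df["admission_year"]

model = smf.ols(
    "earnings ~ C(event_time, Treatment(reference=-1)) + C(age) + C(year)",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["person_id"]})

# The event_time coefficients trace out the path of earnings around admission.
print(model.params.filter(like="event_time"))
```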

Is health care infected by Baumol’s cost disease? Test of a new model. Health Economics [PubMed] [RePEc] Published 9th February 2018

A few years ago we discussed Baumol’s theory of the ‘cost disease’ and an empirical study trying to identify it. In brief, the theory supposes that spending on health care (and other labour-intensive or creative industries) as a proportion of GDP increases, at least in part, because these sectors experience the least productivity growth. Productivity increases the fastest in sectors like manufacturing, and remuneration increases as a result. If pay only tracked productivity, though, wages in the most productive sectors would outstrip those in the ‘stagnant’ sectors: salaries for doctors would end up below those for low-skilled factory work. Because labour can move between sectors, wages therefore increase in the stagnant sectors despite a lack of productivity growth. The consequence of all this is that as GDP grows, the proportion spent on stagnant sectors increases, but importantly the absolute amount spent on the productive sectors does not decrease. The stagnant sectors’ share of the pie gets bigger, but the pie is growing at least as fast, as it were. To test this, this article starts with a theoretical two-sector model to develop some testable predictions. In particular, the authors posit that the cost disease implies: (i) productivity is related to the share of labour in the health sector, and (ii) productivity is related to the ratio of prices in the health and non-health sectors. Using data from 28 OECD countries between 1995 and 2016, as well as further data on US industry groups, they find no evidence to support these predictions, nor others generated by their model. One reason for this could be that wages in the last ten years or more have not risen in line with productivity in manufacturing or other ‘productive’ sectors, or that productivity in the health care sector has in fact increased as fast as in the rest of the economy. Indeed, we have discussed productivity growth in the health sector in England and Wales previously. The cost disease may well then not be a cause of rising health care costs – nevertheless, health care need is rising and we should still expect costs to rise concordantly.
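
To make the two predictions concrete, a country-year panel test might look something like the sketch below. The variable names and the simple two-way fixed effects specification are my own illustration rather than the authors' estimation code.

```python
# A rough sketch of how the two cost-disease predictions could be examined in
# a country-year panel; all names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("oecd_panel.csv")  # placeholder: country-year observations

# Prediction (i): the share of labour employed in the health sector should be
# related to the productivity gap between the non-health and health sectors.
m1 = smf.ols(
    "health_labour_share ~ productivity_gap + C(country) + C(year)", data=df
).fit()

# Prediction (ii): the ratio of health to non-health prices should also be
# related to that productivity gap.
m2 = smf.ols(
    "relative_health_price ~ productivity_gap + C(country) + C(year)", data=df
).fit()

print(m1.params["productivity_gap"], m2.params["productivity_gap"])
```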

Chris Sampson’s journal round-up for 8th January 2018

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

An empirical comparison of the measurement properties of the EQ-5D-5L, DEMQOL-U and DEMQOL-Proxy-U for older people in residential care. Quality of Life Research [PubMed] Published 5th January 2018

There is now a condition-specific preference-based measure of health-related quality of life that can be used for people with cognitive impairment: the DEMQOL-U. Beyond the challenge of appropriately defining quality of life in this context, cognitive impairment presents the additional difficulty that individuals may not be able to self-complete a questionnaire. There’s some good evidence that proxy responses can be valid and reliable for people with cognitive impairment. The purpose of this study is to try out the new(ish) EQ-5D-5L in the context of cognitive impairment in a residential setting. Data were taken from an observational study in 17 residential care facilities in Australia. A variety of outcome measures were collected including the EQ-5D-5L (proxy where necessary), a cognitive bolt-on item for the EQ-5D, the DEMQOL-U and the DEMQOL-Proxy-U (from a family member or friend), the Modified Barthel Index, the cognitive impairment scale of the Psychogeriatric Assessment Scales (PAS-Cog), and the neuropsychiatric inventory questionnaire (NPI-Q). The researchers tested correlations between the measures, convergent validity, and known-group validity. 143 participants self-completed the EQ-5D-5L and DEMQOL-U, while 387 responses were available for the proxy versions. People with a diagnosis of dementia reported higher utility values on the EQ-5D-5L and DEMQOL-U than people without a diagnosis. Correlations between the measures were weak to moderate. Some people reported full health on the EQ-5D-5L despite identifying some impairment on the DEMQOL-U, and some vice versa. The EQ-5D-5L was more strongly correlated with clinical outcome measures than were the DEMQOL-U or DEMQOL-Proxy-U, though the associations were generally weak. The relationship between cognitive impairment and self-completed EQ-5D-5L and DEMQOL-U utilities was not in the expected direction; people with greater cognitive impairment reported higher utility values. There was quite a lot of disagreement between utility values derived from the different measures, so the EQ-5D-5L and DEMQOL-U should not be seen as substitutes. An EQ-QALY is not a DEM-QALY. This is all quite perplexing when it comes to measuring health-related quality of life in people with cognitive impairment. What does it mean if a condition-specific measure does not correlate with the condition? It could be that for people with cognitive impairment the key determinant of their quality of life is only indirectly related to their impairment, and more dependent on their living conditions.
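
The validity checks described above amount to fairly standard correlation and group-comparison analyses. A minimal sketch of that kind of check – with hypothetical column names, not the study's data – might look like this:

```python
# A small sketch of convergent and known-group validity checks of the sort
# described above; column names are hypothetical.
import pandas as pd
from scipy import stats

df = pd.read_csv("residential_care_outcomes.csv")  # placeholder

# Convergent validity: rank correlation between utilities from the two measures.
rho, p = stats.spearmanr(df["eq5d5l_utility"], df["demqol_u_utility"], nan_policy="omit")
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")

# Known-group validity: do utilities differ between those with and without a
# dementia diagnosis? (Mann-Whitney U, since utility values are rarely normal.)
with_dx = df.loc[df["dementia_diagnosis"] == 1, "eq5d5l_utility"].dropna()
without_dx = df.loc[df["dementia_diagnosis"] == 0, "eq5d5l_utility"].dropna()
print(stats.mannwhitneyu(with_dx, without_dx))
```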

Resolving the “cost-effective but unaffordable” paradox: estimating the health opportunity costs of nonmarginal budget impacts. Value in Health Published 4th January 2018

Back in 2015 (as discussed on this blog), NICE started appraising drugs that were cost-effective but implied such high costs for the NHS that they seemed unaffordable. This forced a consideration of how budget impact should be handled in technology appraisal. But the matter is far from settled and different countries have adopted different approaches. The challenge is to accurately estimate the opportunity cost of an investment, which will depend on the budget impact. A fixed cost-effectiveness threshold isn’t much use. This study builds on York’s earlier work that estimated cost-effectiveness thresholds based on health opportunity costs in the NHS. The researchers attempt to identify cost-effectiveness thresholds that are in accordance with different non-marginal (i.e. large) budget impacts. The idea is that a larger budget impact should imply a lower (i.e. more difficult to satisfy) cost-effectiveness threshold. NHS expenditure data were combined with mortality rates for different disease categories by geographical area. When primary care trusts’ (PCTs) budget allocations change, they transition gradually. This means that – for a period of time – some trusts receive a larger budget than they are expected to need while others receive a smaller budget. The researchers identify these as over-target and under-target accordingly. The expenditure and outcome elasticities associated with changes in the budget are estimated for the different disease groups (defined by programme budgeting categories; PBCs). Expenditure elasticity refers to the change in PBC expenditure given a change in overall NHS expenditure. Outcome elasticity refers to the change in PBC mortality given a change in PBC expenditure. Two econometric approaches are used: an interaction term approach, whereby a subgroup interaction term is used with the expenditure and outcome variables, and a subsample estimation approach, whereby subgroups are analysed separately. Despite the limitations associated with a reduced sample size, the subsample estimation approach is preferred on theoretical grounds. Using this method, under-target PCTs face a cost-per-QALY of £12,047 and over-target PCTs face a cost-per-QALY of £13,464, reflecting diminishing marginal returns. The estimates are used as the basis for identifying a health production function that can approximate the association between budget changes and health opportunity costs. Going back to the motivating example of hepatitis C drugs, a £772 million budget impact would ‘cost’ 61,997 QALYs, rather than the 59,667 that we would expect without accounting for the budget impact. This means that the threshold should be lower (at £12,452 instead of £12,936) for a budget impact of this size. The authors discuss a variety of approaches for ‘smoothing’ the budget impact of such investments. Whether or not you believe the absolute size of the quoted numbers depends on whether you believe the stack of (necessary) assumptions used to reach them. But regardless of that, the authors present an interesting and novel approach to establishing an empirical basis for estimating health opportunity costs when budget impacts are large.
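
The hepatitis C figures can be roughly reproduced by simply dividing the budget impact by each threshold; the small differences from the quoted QALY totals reflect rounding of the published thresholds, and the authors' actual health production function is of course more sophisticated than this back-of-the-envelope sketch.

```python
# Back-of-the-envelope reproduction of the hepatitis C example: dividing the
# budget impact by the marginal and budget-impact-adjusted thresholds.
budget_impact = 772_000_000  # £

marginal_threshold = 12_936  # £ per QALY, ignoring the size of the budget impact
adjusted_threshold = 12_452  # £ per QALY, accounting for the non-marginal impact

print(f"{budget_impact / marginal_threshold:,.0f} QALYs")  # roughly 59,700
print(f"{budget_impact / adjusted_threshold:,.0f} QALYs")  # roughly 62,000
```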

First do no harm – the impact of financial incentives on dental x-rays. Journal of Health Economics [RePEc] Published 30th December 2017

If dentists move from fee-for-service to a salary, or if patients move from co-payment to full exemption, does this influence the frequency of x-rays? That’s the question that the researchers are trying to answer in this study. It’s important because x-rays always present some level of (carcinogenic) risk to patients and should therefore only be used when the benefits are expected to exceed the harms. Financial incentives shouldn’t come into it. If they do, then some dentists aren’t playing by the rules. And that seems to be the case. The authors start out by establishing a theoretical framework for the interaction between patient and dentist, which incorporates the harmful nature of x-rays, dentist remuneration, the patient’s payment arrangements, and the characteristics of each party. This model is used in conjunction with data from NHS Scotland, with 1.3 million treatment claims from 200,000 patients and 3,000 dentists. In 19% of treatments, an x-ray occurs. Some dentists are salaried and some are not, while some people pay charges for treatment and some are exempt. A series of fixed effects models are used to take advantage of these differences in arrangements by modelling the extent to which switches (between arrangements, for patients or dentists) influence the probability of receiving an x-ray. The authors’ preferred model shows that both the dentist’s remuneration arrangement and the patient’s financial status influence the number of x-rays in the direction predicted by the model. That is, fee-for-service and charge exemption result in more x-rays. The combination of these two factors results in a 9.4 percentage point increase in the probability of an x-ray during treatment, relative to salaried dentists with non-exempt patients. While the results do show that financial incentives influence this treatment decision (when they shouldn’t), the authors aren’t able to link the behaviour to patient harm. So we don’t know what percentage of treatments involving x-rays would correspond to the decision rule of benefits exceeding harms. Nevertheless, this is an important piece of work for informing the definition of dentist reimbursement and patient payment mechanisms.
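
As a rough illustration of the switching design, the sketch below sets out a two-way fixed effects linear probability model. Variable names are hypothetical, and the authors' models are richer than this.

```python
# A minimal sketch of a fixed effects linear probability model for the
# probability of an x-ray; all names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("dental_claims.csv")  # placeholder: one row per treatment claim

# xray            : 1 if the treatment included an x-ray
# fee_for_service : 1 if the dentist was paid fee-for-service (0 if salaried)
# exempt          : 1 if the patient was exempt from charges
# With dentist and patient fixed effects, identification comes from switches
# between arrangements. (At the scale of the real data the fixed effects would
# be absorbed rather than estimated as dummies.)
model = smf.ols(
    "xray ~ fee_for_service * exempt + C(dentist_id) + C(patient_id) + C(year)",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["dentist_id"]})

print(model.params[["fee_for_service", "exempt", "fee_for_service:exempt"]])
```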

Chris Sampson’s journal round-up for 11th September 2017

Every Monday our authors provide a round-up of some of the most recently published peer reviewed articles from the field. We don’t cover everything, or even what’s most important – just a few papers that have interested the author. Visit our Resources page for links to more journals or follow the HealthEconBot. If you’d like to write one of our weekly journal round-ups, get in touch.

Core items for a standardized resource use measure (ISRUM): expert Delphi consensus survey. Value in Health Published 1st September 2017

Trial-based collection of resource use data, for the purpose of economic evaluation, is wild. Lots of studies use bespoke questionnaires. Some use off-the-shelf measures, but many of these are altered to suit the context. Validity rarely gets a mention. Some of you may already be aware of this research; I’m sure I’m not the only one here who participated. The aim of the study is to establish a core set of resource use items that should be included in all studies to aid comparability, consistency and validity. The researchers identified a long list of 60 candidate items for inclusion, through a review of 59 resource use instruments. An NHS and personal social services perspective was adopted, and any similar items were merged. This list formed the basis of a Delphi survey. Members of the HESG mailing list – as well as 111 other identified experts – were invited to complete the survey, for which there were two rounds. The first round asked participants to rate the importance of including each item in the core set, using a scale from 1 (not important) to 9 (very important). Participants were then asked to select their ‘top 10’. Items survived round 1 if they were scored at least 7 by more than 50% of respondents, and less than 3 by no more than 15%, either overall or within two or more participant subgroups. In round 2, participants were presented with the results of round 1 and asked to re-rate the 34 remaining items. There were 45 usable responses in round 1 and 42 in round 2. Comments could also be provided, which were subsequently subject to content analysis. After all was said and done, a meeting was held for final item selection based on the findings, to which some survey participants were invited but only one attended (sorry I couldn’t make it). The final 10 items were: i) hospital admissions, ii) length of stay, iii) outpatient appointments, iv) A&E visits, v) A&E admissions, vi) number of appointments in the community, vii) type of appointments in the community, viii) number of home visits, ix) type of home visits and x) name of medication. The measure isn’t ready to use just yet. There is still research to be conducted to identify the ideal wording for each item. But it looks promising. Hopefully, this work will trigger a whole stream of research to develop bolt-ons in specific contexts for a modular system of resource use measurement. I also think that this work should form the basis of alignment between costing and resource use measurement. Resource use is often collected in a way that is very difficult to ‘map’ onto costs or prices. I’m sure the good folk at the PSSRU are paying attention to this work, and I hope they might help us all out by estimating unit costs for each of the core items (as well as any bolt-ons, once they’re developed). There’s some interesting discussion in the paper about the parallels between this work and the development of core outcome sets. Maybe analysis of resource use can be as interesting as the analysis of quality of life outcomes.
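
As I read it, the round 1 retention rule is simple enough to express in a few lines. The sketch below applies it to a made-up ratings matrix and only checks the overall criterion, not the subgroup one; it is my interpretation of the rule, not the authors' code.

```python
# The round 1 retention rule applied to hypothetical ratings (respondents x
# items, scores 1-9); only the overall criterion is checked here.
import numpy as np
import pandas as pd

ratings = pd.DataFrame(
    np.random.default_rng(1).integers(1, 10, size=(45, 60)),
    columns=[f"item_{i + 1}" for i in range(60)],
)

def survives_round_1(scores: pd.Series) -> bool:
    """Retained if >50% score the item at least 7 and <=15% score it below 3."""
    return (scores >= 7).mean() > 0.5 and (scores < 3).mean() <= 0.15

retained = [item for item in ratings.columns if survives_round_1(ratings[item])]
print(f"{len(retained)} of {ratings.shape[1]} items retained")
```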

A call for open-source cost-effectiveness analysis. Annals of Internal Medicine [PubMed] Published 29th August 2017

Yes, this paper is behind a paywall. Yes, it is worth pointing out this irony over and over again until we all start practising what we preach. We’re all guilty; we all need to keep on keeping on at each other. Now, on to the content. The authors argue in favour of making cost-effectiveness analysis (and model-based economic evaluation in particular) open to scrutiny. The key argument is that there is value in transparency, and analogies are drawn with clinical trial reporting and epidemiological studies. This potential additional value is thought to derive from i) easy updating of models with new data and ii) less duplication of efforts. The main challenges are thought to be the need for new infrastructure – technical and regulatory – and preservation of intellectual property. Recently, I discussed similar issues in a call for a model registry. I’m clearly in favour of cost-effectiveness analyses being ‘open source’. My only gripe is that the authors aren’t the first to suggest this, and should have done some homework before publishing this call. Nevertheless, it is good to see this issue being raised in a journal such as Annals of Internal Medicine, which could be an indication that the tide is turning.

Differential item functioning in quality of life measurement: an analysis using anchoring vignettes. Social Science & Medicine [PubMed] [RePEc] Published 26th August 2017

Differential item functioning (DIF) occurs when different groups of people have different interpretations of response categories. For example, in response to an EQ-5D questionnaire, the way that two groups of people understand ‘slight problems in walking about’ might not be the same. If that were the case, the groups wouldn’t be truly comparable. That’s a big problem for resource allocation decisions, which rely on trade-offs between different groups of people. This study uses anchoring vignettes to test for DIF, whereby respondents are asked to rate their own health alongside some health descriptions for hypothetical individuals. The researchers conducted two online surveys, which together recruited a representative sample of 4,300 Australians. Respondents completed the EQ-5D-5L, some vignettes, some other health outcome measures, and a bunch of sociodemographic questions. The analysis uses an ordered probit model to predict responses to the EQ-5D dimensions, with the vignettes used to identify the model’s thresholds. This is estimated for each dimension of the EQ-5D-5L, in the hope that the model can produce coefficients that facilitate ‘correction’ for DIF. But this isn’t a guaranteed approach to identifying the effect of DIF. Two important assumptions are inherent: first, that individuals rate the hypothetical vignette states on the same latent scale as they rate their own health (AKA response consistency) and, second, that everyone values the vignettes on an equivalent latent scale (AKA vignette equivalence). Only if these assumptions hold can anchoring vignettes be used to adjust for DIF and make different groups comparable. The researchers dedicate a lot of effort to testing these assumptions. To test response consistency, separate (condition-specific) measures are used to assess each domain of the EQ-5D. The findings suggest that responses are consistent. Vignette equivalence is assessed by the significance of individual characteristics in determining vignette values. In this study, the vignette equivalence assumption didn’t hold, which prevents the authors from making generalisable conclusions. However, the researchers looked at whether the assumptions were satisfied in particular age groups. For 55-65 year olds (n=914), they did, for all dimensions except anxiety/depression. That might be because older people are better at understanding health problems, having had more experience of them. So the authors can tell us about DIF in this older group. Having corrected for DIF, the mean health state value in this group increases from 0.729 to 0.806. Various characteristics explain the heterogeneous response behaviour. After correcting for DIF, the difference in EQ-5D index values between high and low education groups increased from 0.049 to 0.095. The difference between employed and unemployed respondents increased from 0.077 to 0.256. In some cases, the rankings changed: the difference between those divorced or widowed and those never married moved from -0.028 to 0.060. The findings hint at a trade-off between giving personalised vignettes to facilitate response consistency and generalisable vignettes to facilitate vignette equivalence. It may be that DIF can only be assessed within particular groups (such as the older sample in this study). But then, if that’s the case, what hope is there for correcting DIF in high-level resource allocation decisions? Clearly, DIF in the EQ-5D could be a big problem. Accounting for it could flip resource allocation decisions. But this study shows that there isn’t an easy answer.
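
For readers unfamiliar with the approach, a stylised version of the vignette-anchored ordered probit is sketched below. This is my shorthand for the general modelling idea, not the authors' exact specification: the respondent-specific thresholds are what capture DIF, and the two key assumptions map directly onto the model's components.

```latex
% Own health: latent value reported through respondent-specific thresholds,
% which shift with characteristics z_i -- this shifting is the DIF.
y_i^* = x_i'\beta + \varepsilon_i, \qquad
y_i = k \;\Longleftrightarrow\; \tau_i^{k-1} < y_i^* \le \tau_i^{k}, \qquad
\tau_i^{k} = \gamma_k' z_i

% Vignette v: a common latent value, rated through the same thresholds.
y_{iv}^* = \alpha_v + \varepsilon_{iv}
```

In this notation, response consistency is the assumption that the same thresholds govern both own-health and vignette responses, and vignette equivalence is the assumption that the vignette values do not vary across respondents. Only with both in place can the estimated thresholds be used to put different groups' self-reports on a common scale.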

How to design the cost-effectiveness appraisal process of new healthcare technologies to maximise population health: a conceptual framework. Health Economics [PubMed] Published 22nd August 2017

The starting point for this paper is that, when it comes to reimbursement decisions, the more time and money spent on the appraisal process, the more precise the cost-effectiveness estimates are likely to be. So the question is: how much in the way of resources should be committed to the appraisal process? The authors set up a framework in which to consider a variety of alternatively defined appraisal processes, how these might maximise population health, and which factors are the key drivers. The appraisal process is conceptualised as a diagnostic tool to identify which technologies are cost-effective (true positives) and which aren’t (true negatives). The framework builds on the fact that manufacturers can present a claimed ICER that makes their technology look more attractive, but that the true ICER can never be known with certainty. As a diagnostic test, there are four possible outcomes: true positive, false positive, true negative, or false negative. Each outcome is associated with an expected payoff in terms of population health and producer surplus. Payoffs depend on the accuracy of the appraisal process (sensitivity and specificity), incremental net benefit per patient, disease incidence, the time of relevance for an approval, the cost of the process, and the price of the technology. The accuracy of the process can be affected by altering the time and resources dedicated to it or by adjusting the definition of cost-effectiveness in terms of the acceptable level of uncertainty around the ICER. So, what determines an optimal level of accuracy in the appraisal process, assuming that producers’ price setting is exogenous? Generally, the process should have greater sensitivity (at the expense of specificity) when there is more to gain: when a greater proportion of technologies are cost-effective or when the population or time of relevance is greater. There is no fixed optimum for all situations. If we relax the assumption of exogenous pricing decisions, and allow pricing to be partly determined by the appraisal process, we can see that a more accurate process incentivises cost-effective price setting. The authors also consider the possibility of there being multiple stages of appraisal, with appeals, re-submissions and price agreements. The take-home message is that the appraisal process should be re-defined over time and with respect to the range of technologies being assessed, perhaps even with an individualised process for each technology in each setting. At least, it seems clear that technologies with exceptional characteristics (with respect to their potential impact on population health) should be given a bespoke appraisal. NICE is already onto these ideas – they recently introduced a fast track process for technologies with a claimed ICER below £10,000 and now give extra attention to technologies with major budget impact.
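
To illustrate the payoff logic, the sketch below computes the expected population health from an appraisal process under a crude linear payoff structure. The numbers and the function itself are invented for illustration; the paper's framework is more general.

```python
# An illustrative expected-payoff calculation in the spirit of the framework
# described above; the payoff structure and numbers are invented.
def expected_population_health(sens, spec, p_ce, inb_ce, inb_not_ce,
                               n_patients, appraisal_cost_qalys):
    """Expected net health (in QALYs) from an appraisal process.

    sens, spec           : probability of approving a truly cost-effective
                           technology / rejecting a truly cost-ineffective one
    p_ce                 : proportion of submissions that are truly cost-effective
    inb_ce, inb_not_ce   : incremental net health benefit per patient if the
                           technology is (isn't) cost-effective (inb_not_ce < 0)
    n_patients           : patients treated over the decision's time of relevance
    appraisal_cost_qalys : resources consumed by the appraisal, in QALY terms
    """
    gain_from_true_positives = p_ce * sens * inb_ce * n_patients
    loss_from_false_positives = (1 - p_ce) * (1 - spec) * inb_not_ce * n_patients
    return gain_from_true_positives + loss_from_false_positives - appraisal_cost_qalys

# A quick (hypothetical) comparison of a cheap, inaccurate process with a
# costlier, more accurate one:
print(expected_population_health(0.80, 0.80, 0.5, 0.1, -0.1, 10_000, 20))  # 280 QALYs
print(expected_population_health(0.95, 0.95, 0.5, 0.1, -0.1, 10_000, 60))  # 390 QALYs
```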
