By Chris Sampson, David Whitehurst and Andrew Street
In December 2012, an article was published in The European Journal of Health Economics with the title ‘Patients whose GP knows complementary medicine tend to have lower costs and live longer’. We spotted a number of shortcomings in the analysis and reporting, to which we felt a response was worthwhile. Subsequently the authors of the original piece, Professor Peter Kooreman and Dr Erik Baars, wrote a reply. In this blog post we summarise the debate and offer some concluding thoughts.
The study employed a large dataset (n~150,000) from a Dutch health insurer. The objective of the study was “to explore the cost-effectiveness of CAM compared with conventional medicine”. The study sought to find out whether different levels of cost or mortality were observed depending on whether or not an individual’s general practitioner (GP) was trained in complementary and alternative medicine (CAM). The authors specifically looked at GPs trained in anthroposophy, homeopathy and acupuncture.
The authors implemented both a linear and log-linear regression model to estimate the cost differences associated with different types of CAM-training. Separate regressions were carried out for each type of CAM, for four different age groups and for five different cost categories. This gave a total of 120 different coefficients (2 (models) x 3 (CAM approaches) x 4 (age groups) x 5 (cost categories)) for the cost difference associated with CAM-training. Eighteen (15%) of these coefficients were negative (indicating positive findings attributable to CAM training) and statistically significant at the 5% level. Three (2.5%) coefficients showed a greater cost associated with CAM training.
For mortality effects, the authors implemented both a fixed effects logit and a fixed effects linear probability model (LPM). In this case the groups were split by sex and, again, by type of CAM-training; additionally an overall effect of CAM-training was included. This gave a total of 24 different coefficients for the mortality difference associated with CAM-training. Four (16.7%) of these were lower and statistically significant at the 5% level; all from the LPM.
The authors concluded that “patients whose GP has additional CAM training have 0–30% lower healthcare costs and mortality rates, depending on age groups and type of CAM”; adding that “since the differences are obtained while controlling for confounders… the lower costs and longer lives are unlikely to be related to differences in socioeconomic status.”
The study’s faults
A major problem with the study is one of selection, which operates at two levels: individuals decide whether or not to register with a CAM-trained GP, and GPs choose whether or not to pursue CAM training. Patients who register with CAM-trained GPs may have different characteristics from those who do not, and may exhibit different levels of cost and mortality as a result of these characteristics rather than of CAM itself. Risk adjustment is the only way the authors deal with selection, and the set of risk-adjusters is very small, including only age, gender and postal code. The authors defend their position by citing a paper suggesting that selection bias might operate in the other direction. Neither we nor the authors can prove this one way or another. To address selection thoroughly, a larger set of risk-adjusters should have been included, and an approach such as propensity score matching would have been superior to the model adopted by the authors.
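To make the selection argument concrete, here is a minimal sketch of propensity score matching on simulated data. Everything in it is an assumption made for illustration: the single confounder (age), the rule that younger patients are more likely to register with a CAM-trained GP, the cost equation, and the absence of any true CAM effect. It is not the authors' data or model, and it only addresses selection on an observed characteristic.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulated confounder: age drives both the choice of GP and the cost.
age = rng.uniform(20, 90, n)
# Hypothetical selection rule: younger patients are more likely to pick a CAM-trained GP.
p_cam = 1 / (1 + np.exp((age - 55) / 10))
cam = (rng.random(n) < p_cam).astype(float)
# Hypothetical cost equation with NO true effect of CAM on cost.
cost = 100 + 10 * age + rng.normal(0, 50, n)

# Naive comparison of mean costs, ignoring selection.
naive_diff = cost[cam == 1].mean() - cost[cam == 0].mean()

# Step 1: estimate the propensity score with a logistic regression,
# fitted here by Newton-Raphson on standardised age.
z = (age - age.mean()) / age.std()
X = np.column_stack([np.ones(n), z])
beta = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta))
    hessian = X.T @ (X * (p * (1 - p))[:, None])
    beta += np.linalg.solve(hessian, X.T @ (cam - p))
score = 1 / (1 + np.exp(-X @ beta))

# Step 2: match each CAM patient to the non-CAM patient with the
# closest propensity score (nearest neighbour, with replacement).
treated = np.flatnonzero(cam == 1)
control = np.flatnonzero(cam == 0)
gaps = np.abs(score[treated][:, None] - score[control][None, :])
nearest = gaps.argmin(axis=1)

# Step 3: average the cost difference within matched pairs.
matched_diff = (cost[treated] - cost[control[nearest]]).mean()

print(f"naive difference:   {naive_diff:.1f}")
print(f"matched difference: {matched_diff:.1f}")
```

In this simulation the naive comparison suggests a large cost saving even though none exists by construction, and matching removes most of it. Matching cannot correct for characteristics that are not observed, which is why a richer set of risk-adjusters matters as well.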
In reporting and reflecting upon their analyses, the authors do not recognise the problems associated with multiple testing. The authors appear to misunderstand the familywise error rate and the implications of this for the results that are currently shown as statistically significant. The authors should have accounted for this, using a method such as the Bonferroni correction.
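The arithmetic behind the multiple testing concern can be sketched in a few lines. The counts (120 cost coefficients, 24 mortality coefficients) and the 5% level come from the study as described above. The function name is hypothetical, and the familywise error rate formula assumes independent tests with every null hypothesis true, which will not hold exactly here because the regressions share the same sample.

```python
ALPHA = 0.05  # nominal significance level used in the study


def multiple_testing_summary(n_tests, alpha=ALPHA):
    """Return (expected false positives, familywise error rate, Bonferroni threshold).

    Assumes independent tests and that every null hypothesis is true.
    """
    expected_false_positives = n_tests * alpha
    familywise_error_rate = 1 - (1 - alpha) ** n_tests
    bonferroni_threshold = alpha / n_tests
    return expected_false_positives, familywise_error_rate, bonferroni_threshold


for label, n_tests in [("cost", 120), ("mortality", 24)]:
    expected, fwer, threshold = multiple_testing_summary(n_tests)
    print(
        f"{label}: {n_tests} tests, "
        f"expected false positives {expected:.1f}, "
        f"familywise error rate {fwer:.3f}, "
        f"Bonferroni threshold {threshold:.5f}"
    )
```

Under these simplifying assumptions, around six of the 120 cost coefficients would be expected to appear significant at the 5% level by chance alone, and a Bonferroni-corrected threshold would be far stricter than 0.05.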
The primary claims of the study are that patients with CAM-trained GPs had “0–30% lower costs” and “0–30% lower mortality rates”. These claims can be found throughout the original study, including the title, and in the authors’ subsequent dealings with the media. We believe that the first claim is a ‘cherry-picked’ finding; the second is simply false.
With regard to costs, as identified in the authors’ reply, the 30% relates specifically to patients “aged 75 and above with an anthroposophic GP-CAM”. But there are some coefficients that show a greater cost associated with CAM-trained GPs. Yet the paper’s title and publicity statements focus on this significant result alone. This is not an accurate reflection of the cost implications for patients in general, and highlighting this cherry-picked result is a misleading representation of the overall effects. A more appropriate way of reporting the results would have been to present the expected cost differences across the whole sample.
The analysis of mortality is simply incorrect. Mortality risk is bounded between 0 and 1, but the linear probability model is unbounded, making it inappropriate for modelling mortality data. The logit model is designed for binary outcomes, and when this is employed the significance of the mortality differences disappears or is less than 5%. But even the logit is inappropriate for these data because mortality is an infrequent event (around 3% of the sample died). A probit model would be preferable and we suspect that, had a probit been employed, no significant differences would be found. In short, the ‘significant’ effects that the authors identify are due to incorrect model specification.
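The boundedness point can be illustrated with a toy example. The ten observations below are invented for illustration (a hypothetical risk score `x` and a binary death indicator `y`) and have nothing to do with the insurer's data. The sketch fits a linear probability model by ordinary least squares and a logit by Newton-Raphson, then compares the fitted probabilities. It illustrates only the boundedness issue, not the separate concern about rare events.

```python
import numpy as np

# Invented toy data: x is a hypothetical risk score, y indicates death (1) or survival (0).
x = np.arange(10, dtype=float)
y = np.array([0, 0, 0, 0, 0, 0, 1, 0, 1, 1], dtype=float)
X = np.column_stack([np.ones_like(x), x])

# Linear probability model: ordinary least squares on the binary outcome.
beta_lpm, *_ = np.linalg.lstsq(X, y, rcond=None)
lpm_pred = X @ beta_lpm

# Logit: maximum likelihood via Newton-Raphson iterations.
beta_logit = np.zeros(2)
for _ in range(25):
    p = 1 / (1 + np.exp(-X @ beta_logit))
    gradient = X.T @ (y - p)
    hessian = X.T @ (X * (p * (1 - p))[:, None])
    beta_logit += np.linalg.solve(hessian, gradient)
logit_pred = 1 / (1 + np.exp(-X @ beta_logit))

print("LPM fitted values:  ", np.round(lpm_pred, 3))
print("logit fitted values:", np.round(logit_pred, 3))
```

For the lowest risk scores the linear probability model returns a negative fitted "probability" of death, whereas the logit's fitted values stay strictly between 0 and 1 by construction.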
In their responses, the authors retreated from their original emphasis on the significance of the mortality results saying that “our results do not show any evidence that patients of GP-CAMs have higher mortality rates”. We agree with this re-statement. Nevertheless, the title of the paper remains “Patients whose GP knows complementary medicine tend to … live longer”, which the authors now appear to admit is false.
The study was available in its current form, as well as earlier versions, long before it was published in the EJHE. As a result, the study’s inaccurate claims have been repeated in a number of papers that cite the work in relation to herbal medicine and CAM in primary care. The publicity surrounding these claims, and the authors’ conduct with the media, have been discussed elsewhere (English translation).
We believe that the original study and the response pieces might be used as a case study to aid teaching. To this end we have provided material to the Health Economics Education website. In addition, please do consider commenting below to develop the discussion – whatever your thoughts on the matter. Do you see other flaws in the study design? Or maybe you think some of our comments are unfounded? Are there better ways of studying important questions such as these?
[…] Although the authors of the study Patients whose GP knows complementary medicine tend to have lower cost and live longer have, oddly enough, been anonymised, their identity is easy to establish. They are the researchers Peter Kooreman and Erik Baars, both with anthroposophic sympathies. Their study did not appear in the European Journal of Health Economics until the end of 2012, but by then it had been available online for more than two years, including on Kooreman’s own website, which is presumably why it is also shown as [link] in the court ruling. Cees Renckens and Jan Willem Nienhuys had already published strong criticism of the study in June 2010, and an extensive commentary by Pepijn van Erp appeared here on Kloptdatwel? in May 2013. A follow-up study by Kooreman and Baars was likewise criticised by Pepijn. And finally, several critics also responded in the European Journal of Health Economics itself (see also here). […]
The story continues… if you can read Dutch.
Kooreman and Baars have published another study in a similar vein, with similar flaws: http://www.economie.nl/artikel/complementair-werkende-huisartsen-en-de-kosten-van-zorg
And it has already received a rebuttal: http://www.economie.nl/artikel/discussie-complementair-werkende-huisartsen-en-de-kosten-van-zorg [paywall]
The original study has also been cited in a review, which acknowledges a number of methodological flaws: http://link.springer.com/article/10.1007%2Fs10198-013-0462-7
[…] dismiss objections in an extremely facile way. They intend to come back to this shortly [Update: this has since happened]. At first I thought that the ‘15 percent’ might well have come from a […]
Last week the ‘Patients Platform for Complementary Healthcare’ in the Netherlands wrote an open letter to the minister for health care. The title of this letter is: ‘INTEGRALE ZORG KAN MOGELIJK 15% AAN KOSTEN BESPAREN!’ (which translates as ‘integrative medicine possibly saves 15% on health care costs’). The reference for this claim is of course the article of Kooreman and Baars.
The open letter is signed by numerous organisations for complementary and alternative medicine as well as by some well-known individuals.
Good post, it is a good article to demonstrate methodological flaws. I had a look and I had a couple of questions that I couldn’t seem to find an answer for:
– Were all the individuals present in every quarter t? If there is a non-zero mortality rate then I assume the answer is no, and I assume new insurees are added in each quarter as well. Since selection/observation is determined by things such as healthcare spending or physician competence, this may present a problem of endogenous selection. We would also have censoring of the high cost insurees due to mortality – of course you will have lower costs if you kill off the high cost patients. This is probably a bigger problem in the elderly. One thing they could’ve done is create a weighted average cost at the GP level to account for censoring.
– There were presumably individuals with 0 costs in some quarters – were these dropped in the log-costs model?
– I am assuming the model is the typical fixed effects model (within transformation with OLS). How were the standard errors calculated? Their identification strategy seems in part to rely on the fact that individuals are very similar within postcodes, so then why weren’t the standard errors clustered by postcode?
Thanks for the article. I totally agree with the critiques of the empirical strategy. Self-selection is indeed a major issue here, in terms of selection on both observables and unobservables. Propensity Score Matching would have been a much more appropriate and robust estimation strategy, at least to control for selection on observables.
In addition to that, I also believe that at least some controls for health behaviors should have been included. I wouldn’t be surprised if patients of GP-CAMs showed differences in health behaviors compared to patients of traditional GPs.