# “Doing the math” on the distribution of healthcare expenditures: a Pareto-like distribution is inevitable

Yesterday I explored one of the major challenges to affordable, universal health insurance, namely the high cost of providing care to the sickest patients. The extreme distribution of healthcare costs means that “Targeting the highest spenders represents the greatest opportunity to have a significant impact on overall spending”, an opportunity for insurance carriers to reduce costs by risk selection, as well as for public policy. Here is a deeper look into the math behind the distribution of healthcare expenditures, using 2012 US data as a model.

One can fit a Pareto (power law, 80/20) distribution with tail index $\alpha$, that is, $prob(expenditure)\sim 1/expenditure^{\alpha+1}$, to the data in several ways. For a Pareto distribution with tail index $\alpha$, the per-capita expenditure at a given percentile from the top scales as $1/\%ile^{1/\alpha}$. The first two of these approaches yield $\alpha\approx1.12$, with expenditures scaling as $1/\%ile^{0.893}$:

1. Use the 80/20 rule modified to fit the data: the top 25% ranked by healthcare expenditures account for 86.7% of costs; thus $\alpha=1.115$.
2. Use the ratio of mean to median expenditure, 5.05:1; thus $\alpha=1.119$.
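
The first two estimates can be reproduced from the standard Pareto formulas. The sketch below assumes an exact Pareto distribution, for which the top fraction $p$ of patients accounts for a share $p^{1-1/\alpha}$ of costs and the mean-to-median ratio is $\alpha/((\alpha-1)\,2^{1/\alpha})$. The function names are illustrative, not from the original analysis, and the bisection bounds are an arbitrary choice.

```python
import math

def alpha_from_top_share(top_fraction, share_of_total):
    """Method 1: a Pareto tail gives share = top_fraction ** (1 - 1/alpha)."""
    exponent = math.log(share_of_total) / math.log(top_fraction)  # equals 1 - 1/alpha
    return 1.0 / (1.0 - exponent)

def mean_to_median(alpha):
    """Mean/median ratio of a Pareto distribution (requires alpha > 1)."""
    return alpha / ((alpha - 1.0) * 2.0 ** (1.0 / alpha))

def alpha_from_mean_to_median(ratio, lo=1.0001, hi=10.0, tol=1e-9):
    """Method 2: solve mean_to_median(alpha) = ratio by bisection.

    The ratio decreases monotonically in alpha for alpha > 1,
    so a ratio that is too large means alpha must be larger.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_to_median(mid) > ratio:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

alpha_1 = alpha_from_top_share(0.25, 0.867)   # top 25% account for 86.7% of costs
alpha_2 = alpha_from_mean_to_median(5.05)     # mean : median = 5.05 : 1
print(f"method 1: alpha = {alpha_1:.3f}, percentile exponent = {1 / alpha_1:.3f}")
print(f"method 2: alpha = {alpha_2:.3f}, percentile exponent = {1 / alpha_2:.3f}")
```

Both values should land close to the $\alpha\approx1.12$ quoted above; small differences in the last digit come from rounding in the input figures.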

However, a graphical analysis finds that the data do not follow such a Pareto distribution, shown as a black dashed line in the following figure (representing a Pareto distribution with $\alpha=1.117$ and median expenditure \$854, the actual median expenditure).

3. Use data for the most expensive patients (10th through 30th percentiles from the top). For these patients, per-capita expenditure scales as $1/\%ile^{1.24}$ ($R^{2}=0.994$), shown as a dashed red line in the figure above; thus $\alpha=0.806$.
4. Use the fraction of total expenses paid by the most expensive patients. A comparison of the fraction of expenses paid by the most expensive 1%, 5% and 10% finds that this fraction scales as $x^{0.4228}$ ($R^{2}=0.987$), shown as a dashed black line in the figure below. This scaling exponent is $1-1/\alpha$; thus $\alpha=1.733$.

(Scaling added to figure modified from Cohen, 2014)

The four estimates of $\alpha$ differ widely (from 0.806 to 1.733), so no single Pareto distribution fits the whole range of the data. Thus, there really is no typical patient. For discussion and implications, see Feyman, who called the empirical distribution of healthcare costs “worse than Pareto”. The Pareto-like (hyper-Pareto?) empirical distribution of expenditures presents a severe challenge to risk pooling through insurance without limiting the highest expenditures through risk selection (illegal!).

Pareto distributions differ sharply from normal distributions, with important consequences for payment models. For a Pareto-like distribution with $\alpha\leq2$ at large expenditures, the variance is not defined, and the sample variance approaches infinity with increasing sample size. Therefore, unlike the case of distributions with finite variance, variability in the mean of a sample of size $N$ does not decrease as $1/\sqrt{N}$: for $1<\alpha<2$ it decreases only as $N^{1/\alpha-1}$, and for $\alpha\leq1$ it does not decrease at all. This undermines a standard requirement for insurance: that risk pooling over a large sample effectively reduces variability in the mean expenditure. Thus, standard insurance models cannot effectively price health insurance when the highest per capita expenditures follow Pareto distributions.
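
To put a number on how weak risk pooling becomes, here is a minimal sketch of how much larger an insurance pool must be to halve the variability of its mean expenditure. It assumes independent, identically distributed patients with a Pareto tail and uses the standard heavy-tail result that the spread of the sample mean scales as $N^{1/\alpha-1}$ for $1<\alpha<2$, against $N^{-1/2}$ when the variance is finite. The function names are illustrative.

```python
def mean_spread_exponent(alpha):
    """Exponent b such that the spread of a sample mean scales as N**b.

    Assumes i.i.d. draws with a Pareto tail of index alpha:
      alpha > 2       -> finite variance, the usual b = -1/2
      1 < alpha <= 2  -> b = 1/alpha - 1 (ignoring a log correction at alpha = 2)
      alpha <= 1      -> no finite mean to converge to
    """
    if alpha <= 1.0:
        raise ValueError("mean is undefined for alpha <= 1")
    return max(1.0 / alpha - 1.0, -0.5)

def pool_growth_to_halve_spread(alpha):
    """Factor by which the pool must grow to halve the spread of its mean."""
    return 2.0 ** (-1.0 / mean_spread_exponent(alpha))

# alpha = 3.0 is a hypothetical light-tailed reference case;
# 1.733 and 1.117 are estimates quoted in the text.
for a in (3.0, 1.733, 1.117):
    factor = pool_growth_to_halve_spread(a)
    print(f"alpha = {a}: pool must grow by a factor of about {factor:.0f}")
```

With finite variance, quadrupling the pool halves the spread. Under these assumptions, for $\alpha\approx1.12$ the pool would have to grow by a factor of several hundred to achieve the same reduction.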
Moreover, a Pareto-like distribution may be a natural consequence of advances in healthcare: our growing ability to manage multiple simultaneous chronic conditions, with consequent exponential growth in costs, while extending life expectancy, so that the probability of dying is not only not reduced, but may actually increase. In a mathematically limiting case, with no bound on healthcare costs, these dynamics yield a Pareto distribution.

In fact, if one extrapolates the power law for a broad range of the sickest patients (the 10th through 30th percentiles of expenditures from the top), one obtains a Pareto distribution with $\alpha\leq1$, for which even the mean is not defined and the sample mean approaches infinity with increasing sample size. The actual distribution of healthcare cost for the very sickest patients clearly falls below the empirical Pareto distribution with $\alpha=0.806$: such a distribution predicts a cost at the 1st percentile of \$178,194, well above the average for the top 1% of \$97,956. Deviations from this distribution for the very sickest patients may reflect current limits on healthcare and thus healthcare expenses. These limits may be relaxed with advances in healthcare, causing further growth in costs.
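
The role of a limit on expenditures can be illustrated with a short calculation. The sketch below assumes an exact Pareto distribution with a hard cap on per-patient cost and computes the mean of the capped cost, $E[\min(X, cap)]$, in units of the minimum expenditure. The cap values and `x_min = 1.0` are hypothetical and chosen only for illustration; they are not estimates from the data.

```python
import math

def capped_mean(alpha, cap, x_min=1.0):
    """Mean of min(X, cap) for X ~ Pareto(alpha) with minimum x_min.

    Uses E[min(X, cap)] = x_min + integral of P(X > x) from x_min to cap,
    with P(X > x) = (x_min / x) ** alpha.
    """
    if cap <= x_min:
        return cap  # every patient hits the cap
    if alpha == 1.0:
        return x_min * (1.0 + math.log(cap / x_min))
    tail = (x_min ** alpha) * (cap ** (1.0 - alpha) - x_min ** (1.0 - alpha)) / (1.0 - alpha)
    return x_min + tail

# Hypothetical caps, expressed as multiples of the minimum expenditure.
for alpha in (1.117, 0.806):
    means = [capped_mean(alpha, cap) for cap in (1e3, 1e6, 1e9)]
    print(f"alpha = {alpha}: capped means = " + ", ".join(f"{m:.1f}" for m in means))
```

For $\alpha>1$ the capped mean approaches the finite limit $\alpha/(\alpha-1)$ as the cap is raised. For $\alpha<1$ it grows without bound, roughly as $cap^{1-\alpha}$, which is one way to see why relaxing limits on care could drive mean costs upward.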

A Pareto-like distribution of healthcare costs is here to stay, and must be reflected in how we share the burden of healthcare and provide care to our sickest patients.

Credits