Method of the month: Distributional cost effectiveness analysis

Once a month we discuss a particular research method that may be of interest to people working in health economics. We’ll consider widely used key methodologies, as well as more novel approaches. Our reviews are not designed to be comprehensive but provide an introduction to the method, its underlying principles, some applied examples, and where to find out more. If you’d like to write a post for this series, get in touch. This month’s method is distributional cost effectiveness analysis.

Principles

Variation in population health outcomes, particularly when socially patterned by characteristics such as income and race, is often of concern to policymakers. For example, the fact that people born in the poorest tenth of neighbourhoods in England can expect to live 19 fewer years of healthy life than those born in the richest tenth, or the fact that black Americans born today can expect to die 4 years earlier than white Americans, is often considered unfair and in need of policy attention. As policymakers look to implement health programmes to tackle such unfair disparities, they need tools that enable them to evaluate the alternative programmes available to them in terms of their impact on reducing these undesirable health inequalities, as well as their impact on improving population health.

Traditional tools for prospectively evaluating health programmes – that is, estimating their likely impacts prior to implementation – are typically based on cost-effectiveness analysis (CEA). CEA selects the programmes that most improve the health of the average recipient, taking into consideration the health opportunity costs involved in implementing the programme. There is therefore a risk that programmes selected using CEA will not reduce the health disparities of concern to policymakers, as these disparities play no part in the evaluation process used when comparing programmes. Indeed, in some cases, the programmes chosen using CEA may even unintentionally exacerbate these health inequalities.

There has been recent methodological work building upon standard CEA methods to explicitly incorporate concerns for reducing health disparities. This equity-augmented form of CEA is called distributional cost effectiveness analysis (DCEA). DCEA estimates the impacts of health interventions on different groups within the population and evaluates the resulting health distributions in terms of both health inequality and population health. Where necessary, DCEA can then be used to guide the trade-off between these two dimensions to pick the most “socially beneficial” programme to implement.

Implementation

The six core steps in implementing a DCEA are outlined below – full details of how DCEA is conducted in practice and applied to evaluate alternative options in a real case study (the NHS Bowel Cancer Screening Programme in England) can be found in a published tutorial.

1. Identify policy-relevant subgroups in the population

The first step in the analysis is to decide which characteristics of the population are of policy concern when thinking about health inequalities. For example, in England there is a lot of concern about the fact that people born in poor neighbourhoods can expect to die earlier than those born in rich neighbourhoods, but little concern about the fact that men have shorter life expectancies than women.

2. Construct the baseline distribution of health

The next step is to construct a baseline distribution of health for the population. This baseline distribution describes the health of the population, typically measured in quality-adjusted life expectancy at birth, to show the level of health and health inequality prior to implementing the proposed interventions. This distribution can be standardised (using methods of either direct or indirect standardisation) to remove any variation in health that is not associated with the characteristics of interest. For example, in England, we might standardise the health distribution to remove variation associated with gender but retain variation associated with neighbourhood deprivation. This then gives us a description of the population health distribution with a particular focus on the health disparities we are trying to reduce. An example of how to construct such a ‘social distribution of health’ for England is given in another published article.
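As a toy illustration of direct standardisation (with entirely made-up numbers, not real English data), we can apply a common gender mix to gender-specific quality-adjusted life expectancy within each deprivation group, so that only the deprivation-related variation remains:

```python
# Hypothetical example: directly standardise quality-adjusted life
# expectancy (QALE) by gender within each deprivation quintile, so the
# remaining variation reflects deprivation only. All figures illustrative.

# QALE at birth by (deprivation quintile, gender)
qale = {
    (1, "female"): 74.0, (1, "male"): 70.0,   # most deprived quintile
    (5, "female"): 82.0, (5, "male"): 79.0,   # least deprived quintile
}

# Standard population gender mix applied to every quintile
standard_mix = {"female": 0.51, "male": 0.49}

def standardised_qale(quintile):
    """Gender-standardised QALE for one deprivation quintile."""
    return sum(qale[(quintile, g)] * w for g, w in standard_mix.items())

for q in (1, 5):
    print(q, round(standardised_qale(q), 2))
```

The gap between the standardised figures for quintiles 1 and 5 is then the deprivation-related disparity we are trying to reduce.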

3. Estimate post-intervention distributions of health

We next estimate the health impacts of the interventions we are comparing. In producing these estimates we need to take into account differences across the equity-relevant subgroups identified in step 1 in the:

  • prevalence and incidence of the diseases impacted by the intervention,
  • rates of uptake and adherence to the intervention,
  • efficacy of the intervention,
  • mortality and morbidity, and
  • health opportunity costs.

Standardising these health impacts and combining them with the baseline distribution of health derived above gives us an estimated post-intervention distribution of health for each intervention.
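A minimal sketch of what this looks like in practice, assuming just two deprivation subgroups and entirely hypothetical inputs (incidence, uptake, per-case QALY gain and opportunity cost are not taken from any real evaluation):

```python
# Toy version of step 3: per-person health impacts differ across two
# equity-relevant subgroups. All input values are hypothetical.

subgroups = {
    "most deprived":  {"incidence": 0.030, "uptake": 0.45,
                       "qaly_gain_per_case": 2.0, "opp_cost_qaly": 0.010},
    "least deprived": {"incidence": 0.015, "uptake": 0.70,
                       "qaly_gain_per_case": 2.0, "opp_cost_qaly": 0.005},
}

def net_health_impact(g):
    """Per-person QALY gain, net of the subgroup's health opportunity cost."""
    gross = g["incidence"] * g["uptake"] * g["qaly_gain_per_case"]
    return gross - g["opp_cost_qaly"]

# Combine with the baseline distribution (step 2) to get the
# post-intervention distribution of quality-adjusted life expectancy
baseline = {"most deprived": 62.0, "least deprived": 70.0}
post = {name: baseline[name] + net_health_impact(g)
        for name, g in subgroups.items()}
print(post)
```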

4. Compare post-intervention distributions using the health equity impact plane

Once post-intervention distributions of health have been estimated for each intervention, we can compare them both in terms of their average level of health and in terms of their level of health inequality. Whilst calculating average levels of health is straightforward, calculating levels of inequality requires some value judgements to be made. There is a wide range of alternative inequality measures that could be employed, each of which captures different aspects of inequality. For example, a relative inequality measure would conclude that a health distribution in which half the population lives for 40 years and the other half for 50 years is just as unequal as one in which half the population lives for 80 years and the other half for 100 years. An absolute inequality measure would instead find the equivalence with a distribution in which half the population lives for 80 years and the other half for 90 years.
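The distinction can be checked with a couple of lines of arithmetic, treating inequality in a two-group population as either a ratio or a difference:

```python
# Relative inequality compares ratios; absolute inequality compares
# differences. The (low, high) pairs are the life expectancies from the
# example in the text.

def relative_gap(low, high):
    return high / low          # ratio-based inequality

def absolute_gap(low, high):
    return high - low          # difference-based inequality

assert relative_gap(40, 50) == relative_gap(80, 100)   # both ratios are 1.25
assert absolute_gap(40, 50) == absolute_gap(80, 90)    # both gaps are 10 years
assert absolute_gap(40, 50) != absolute_gap(80, 100)   # 10 years vs 20 years
```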

Two commonly used inequality measures are the Atkinson relative inequality measure and the Kolm absolute inequality measure. These both have the additional feature that they can be calibrated using an inequality aversion parameter to vary the level of priority given to those worst off in the distribution. We will see these inequality aversion parameters in action in the next step of the DCEA process.

Having selected a suitable inequality measure, we can plot our post-intervention distributions on a health equity impact plane. Let us assume we are comparing two interventions, A and B: we plot intervention A at the origin of the plane and plot intervention B relative to A.

[Figure: the health equity impact plane]
If intervention B falls in the north-east quadrant of the health equity impact plane we know it both improves health overall and reduces health inequality relative to intervention A and so intervention B should be selected. If, however, intervention B falls in the south-west quadrant of the health equity impact plane we know it both reduces health and increases health inequality relative to intervention A and so intervention A should be selected. If intervention B falls either in the north-west or south-east quadrants of the health equity impact plane there is no obvious answer as to which intervention should be preferred as there is a trade-off to be made between health equity and total health.
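The decision rule above can be sketched as a simple classification, where intervention B's position relative to A is expressed as a change in average health and a change in equity (positive meaning inequality falls):

```python
# Classify intervention B relative to A on the health equity impact
# plane. d_health > 0 means B improves average health; d_equity > 0
# means B reduces inequality.

def compare_on_plane(d_health, d_equity):
    if d_health > 0 and d_equity > 0:
        return "choose B"     # north-east: more health, less inequality
    if d_health < 0 and d_equity < 0:
        return "choose A"     # south-west: less health, more inequality
    return "trade-off"        # north-west / south-east: needs a social welfare function

assert compare_on_plane(0.5, 0.2) == "choose B"
assert compare_on_plane(-0.3, -0.1) == "choose A"
assert compare_on_plane(0.5, -0.2) == "trade-off"
```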

5. Evaluate trade-offs between inequality and efficiency using social welfare functions

We use social welfare functions to trade off inequality reduction against average health improvement. These social welfare functions are constructed by combining our chosen measure of inequality with the average health in the distribution. This combination is used to calculate what is known as an equally distributed equivalent (EDE) level of health. The EDE summarises the health distribution being analysed as a single number: the amount of health that each person in a hypothetical perfectly equal health distribution would need to have for us to be indifferent between that perfectly equal distribution and the actual distribution analysed. Where our social welfare function is built around an inequality measure with an inequality aversion parameter, the EDE will also be a function of that parameter. Where inequality aversion is set to zero there is no concern for inequality, and the EDE simply reflects the average health in the distribution, replicating the results we would see under standard utilitarian CEA. As the inequality aversion level approaches infinity, our focus falls increasingly on those worse off in the health distribution until, at the limit, we reflect the Rawlsian idea of focusing entirely on improving the lot of the worst-off in society.
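A sketch of the EDE calculation, using the standard Atkinson and Kolm EDE formulas, the inequality aversion values suggested for England (Atkinson 10.95, Kolm 0.15) and an illustrative two-group health distribution:

```python
import math

# Equally distributed equivalent (EDE) health under the Atkinson and
# Kolm social welfare functions. The two-group distribution is illustrative.

def atkinson_ede(health, eps):
    """EDE under the Atkinson (relative) social welfare function, eps != 1."""
    n = len(health)
    return (sum(h ** (1 - eps) for h in health) / n) ** (1 / (1 - eps))

def kolm_ede(health, alpha):
    """EDE under the Kolm (absolute) social welfare function."""
    n = len(health)
    return -math.log(sum(math.exp(-alpha * h) for h in health) / n) / alpha

health = [60.0, 70.0]            # quality-adjusted life expectancy by subgroup
mean = sum(health) / len(health)

# With positive inequality aversion the EDE sits below the mean...
assert atkinson_ede(health, 10.95) < mean
assert kolm_ede(health, 0.15) < mean

# ...and as aversion approaches zero the EDE approaches the mean,
# replicating standard utilitarian CEA.
assert abs(atkinson_ede(health, 1e-9) - mean) < 0.1
assert abs(kolm_ede(health, 1e-9) - mean) < 0.1
```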

 

Social welfare functions derived from the Atkinson relative inequality measure and the Kolm absolute inequality measure are given below, expressed as the EDE they assign to a distribution of health h_1, \dots, h_n across n equally sized groups, with inequality aversion parameters \epsilon and \alpha respectively. Research carried out with members of the public in England suggests that suitable values for the Atkinson and Kolm inequality aversion parameters are 10.95 and 0.15 respectively.

Atkinson relative social welfare function: EDE = \left( \frac{1}{n} \sum_{i=1}^{n} h_i^{1-\epsilon} \right)^{1/(1-\epsilon)}

Kolm absolute social welfare function: EDE = -\frac{1}{\alpha} \ln \left( \frac{1}{n} \sum_{i=1}^{n} e^{-\alpha h_i} \right)

When comparing interventions where one intervention does not simply dominate the others on the health equity impact plane we need to use our social welfare functions to calculate EDE levels of health associated with each of the interventions and then select the intervention that produces the highest EDE level of health.

In the example depicted in the figure above we can see that pursuing intervention A results in a health distribution which appears less unequal but has a lower average level of health than the health distribution resulting from intervention B. The choice of intervention, in this case, will be determined by the form of social welfare function selected and the level of inequality this social welfare function is parameterised to embody.

6. Conduct sensitivity analysis on forms of social welfare function and extent of inequality aversion

Given that the conclusions drawn from DCEA may be dependent on the social value judgments made around the inequality measure used and the level of inequality aversion embodied in it, we should present results for a range of alternative social welfare functions parameterised at a range of inequality aversion levels. This will allow decision makers to clearly understand how robust conclusions are to alternative social value judgements.
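As a sketch of such a sensitivity analysis (with hypothetical post-intervention distributions), we can sweep the Atkinson inequality aversion parameter and watch the preferred intervention flip:

```python
# Sweep the Atkinson inequality aversion parameter and recompute the
# EDE for two hypothetical interventions: A gives a more equal but
# lower-average distribution than B. All numbers are illustrative.

def atkinson_ede(health, eps):
    """EDE under the Atkinson social welfare function, eps != 1."""
    n = len(health)
    return (sum(h ** (1 - eps) for h in health) / n) ** (1 / (1 - eps))

post_a = [62.0, 66.0]   # more equal, lower average health (mean 64)
post_b = [58.0, 72.0]   # less equal, higher average health (mean 65)

for eps in (0.001, 1.5, 5.0, 10.95):
    winner = "A" if atkinson_ede(post_a, eps) > atkinson_ede(post_b, eps) else "B"
    print(f"eps = {eps:>6}: prefer {winner}")
```

Under these illustrative numbers, B is preferred at low aversion (it has more total health) while A is preferred at high aversion (it is more equal), so the conclusion genuinely depends on the social value judgement.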

Applications

DCEA is of particular use when evaluating large-scale public health programmes that have an explicit goal of tackling health inequality. It has been applied to the NHS bowel cancer screening programme in England and to the rotavirus vaccination programme in Ethiopia.

Some key limitations of DCEA are that: (1) it currently only analyses programmes in terms of their health impacts whilst large public health programmes often have important impacts across a range of sectors beyond health; and (2) it requires a range of data beyond that required by standard CEA which may not be readily available in all contexts.

For low and middle-income settings an alternative augmented CEA methodology called extended cost effectiveness analysis (ECEA) has been developed to combine estimates of health impacts with estimates of impacts on financial risk protection. More information on ECEA can be found here.

There are ongoing efforts to generalise the DCEA methods to be applied to interventions having impacts across multiple sectors. Follow the latest developments on DCEA at the dedicated website based at the Centre for Health Economics, University of York.

Credit

On the commensurability of efficiency

In this week’s round-up, I highlighted a recent paper in the journal Cambridge Quarterly of Healthcare Ethics. There are some interesting ideas presented regarding the challenge of decision-making at the individual patient level, and in particular a supposed trade-off between achieving efficiency and satisfying health need.

The gist of the argument is that these two ‘values’ are incommensurable, in the sense that the comparative value of two choices is ambiguous where the achievement of efficiency must be traded against the satisfaction of need. In the journal round-up, I highlighted two criticisms. First, I suggested that efficiency and health need satisfaction are commensurable. Second, I suggested that the paper did not adequately tackle the special nature of microlevel decision-making. The author – Anders Herlitz – was gracious enough to respond to my comments with several tweets.

Here, I’d like to put forth my reasoning on the subject (albeit with an ignorance of the background literature on incommensurability and other matters of ethics).

Consider a machine gun

A machine gun is far more efficient than a pistol, right? Well, maybe. A machine gun can shoot more bullets than a pistol over a sustained period. Likewise, a doctor who can treat 50 patients per day is more efficient than a doctor who can treat 20 patients per day.

However, the premise of this entire discussion, as established by Herlitz, is one of values. Herlitz introduces efficiency as a value, not as some dispassionate indicator of return on input. When we are considering values – as we necessarily are when discussing decision-making and, more generally, ‘what matters’ – we cannot take the ‘more bullets’ approach to assessing efficiency.

That’s because ‘more bullets’ is not what we mean when we talk about the value of efficiency. The production function is fundamental to our understanding of efficiency as a value. Once values are introduced, it is plain to see that in the context of war (where value is attached to a greater number of deaths) a machine gun may very well be considered more efficient. However, bearing a machine gun is far less efficient than bearing a pistol in a civilian context because we value a situation that results in fewer deaths.

In this analogy, bullets are health care and deaths are (somewhat confusingly, I admit) health improvement. Treating more people is not better because we want to provide more health care, but because we want to improve people’s health (along with some other basket of values).

Efficiency only has value with respect to the outcome in whose terms it is defined, and is therefore always commensurable with that outcome. That is, the production function is an inherent and necessary component of an efficiency to which we attach value.

I believe that Herlitz’s idea of incommensurability could be a useful one. Different outcomes may well be incommensurable in the way described in the paper. But efficiency has no place in this discussion. The incommensurability Herlitz describes in his paper seems to be a simple conflict between utilitarianism and prioritarianism, though I don’t have the wherewithal to pursue that argument so I’ll leave it there!

Microlevel efficiency trade-offs

Having said all that, I do think there could be a special decision-making challenge regarding efficiency at the microlevel. And that might partly explain Herlitz’s suggestion that efficiency is incommensurable with other outcomes.

There could be an incommensurability between values that can be measured in their achievement at the individual level (e.g. health improvement) and values that aren’t measured with individual-level outcomes (e.g. prioritisation of more severe patients). Those two outcomes are incommensurable in the way Herlitz described, but the simple fact that we tend to think about the former as an efficiency argument and the latter as an equity argument is irrelevant. We could think about both in efficiency terms (for example, treating n patients of severity x is more efficient than treating n-1 patients of severity x, or n patients of severity x-1), we just don’t. The difficulty is that this equity argument is meaningless at the individual level because it relies on information about outcomes outside the microlevel. The real challenge at the microlevel, therefore, is to acknowledge scope for efficiency in all outcomes of value. The incommensurability that matters is between microlevel and higher-level assessments of value.

As an aside, I was surprised that the Rule of Rescue did not get a mention in the paper. This is a perfect example of a situation in which arguments that tend to be made on efficiency grounds are thrown out and another value (the duty to save an immediately endangered life) takes over. One doesn’t need to think very hard about how Rule of Rescue decision-making could be framed as efficient.

In short, efficiency is never incommensurable because it is never an end in itself. If you’re concerned with being more efficient for the sake of being more efficient then you are probably not making very efficient decisions.


Are we estimating the effects of health care expenditure correctly?

It is a contentious issue in philosophy whether an omission can be the cause of an event. At the very least, it seems we should consider causation by omission differently from ‘ordinary’ causation. Consider Sarah McGrath’s example. Billy promised Alice to water the plant while she was away, but he did not water it. Intuitively, Billy’s not watering the plant caused its death. But there are good reasons to suppose that Billy did not cause its death: if Billy’s lack of watering caused the death of the plant, it would be equally reasonable to conclude that Vladimir Putin, and indeed anyone else who did not water the plant, was also a cause. McGrath argues that there is a normative consideration here: Billy ought to have watered the plant, and that is why we judge his omission to be a cause and not anyone else’s. Similarly, consider the example from L.A. Paul and Ned Hall’s excellent book Causation: A User’s Guide: Billy and Suzy are playing soccer on rival teams. One of Suzy’s teammates scores a goal. Both Billy and Suzy were nearby and could have easily prevented the goal. But our judgement is that the goal should only be credited to Billy’s failure to block it, as Suzy had no responsibility to do so.

These arguments may appear far removed from the world of health economics. But, they have practical implications. Consider the estimation of the effect that increasing health care expenditure has on public health outcomes. The government, or relevant health authority, makes a decision about how the budget is allocated. It is often the case that there are allocative inefficiencies: greater gains could be had by reallocating the budget to more effective programs of care. In this case there would seem to be a relevant omission; the budget has not been spent where it could have provided benefits. These omissions are often seen as causes of a loss of health. Karl Claxton wrote of the Cancer Drugs Fund, a pool of money diverted from the National Health Service to provide cancer drugs otherwise considered cost-ineffective, that it was associated with

a net loss of at least 14,400 quality adjusted life years in 2013/14.

Similarly, the authors of an analysis of the lack of spending on effective HIV treatment and prevention by the Mbeki administration in South Africa wrote that

More than 330,000 lives or approximately 2.2 million person-years were lost because a feasible and timely ARV treatment program was not implemented in South Africa.

But our analyses of the effects of health care expenditure typically do not take these omissions into account.

Causal inference methods are founded on a counterfactual theory of causation. The aim of a causal inference method is to estimate the potential outcomes that would have been observed under different treatment regimes. In our case this would be what would have happened under different levels of expenditure. This is typically estimated by examining the relationship between population health and levels of expenditure, perhaps using some exogenous determinant of expenditure to identify the causal effects of interest. But this only identifies those changes caused by expenditure and not those changes caused by not spending.

Consider the following toy example. There are two causes of death in the population, a and b, with associated programmes of care and prevention A and B. The total health care expenditure is x, of which a proportion p \in P \subseteq [0,1] is spent on A and 1-p on B. The deaths due to each cause are y_a and y_b, so total deaths are y = y_a + y_b. Finally, the effects of a unit increase in expenditure on each programme are \beta_a and \beta_b. The question is to determine the causal effect of expenditure. If Y_x is the potential outcome for level of expenditure x, then the average treatment effect is given by E(\frac{\partial Y_x}{\partial x}).

The country has chosen an allocation between the programmes of care of p_0. If causation by omission is not a concern then, given linear, additive models (and that all the model assumptions are met), y_a = \alpha_a + \beta_a p x + f_a(t) + u_a and y_b = \alpha_b + \beta_b (1-p) x + f_b(t) + u_b, the causal effect is E(\frac{\partial Y_x}{\partial x}) = \beta = \beta_a p_0 + \beta_b (1-p_0). But if causation by omission is relevant, then the net effect of expenditure is the lives gained, \beta_a p_0 + \beta_b (1-p_0), less the lives lost. The lives lost are those under all the possible things we did not do, so the estimator of the causal effect is \beta' = \beta_a p_0 + \beta_b (1-p_0) - \int_{P \setminus \{p_0\}} [ \beta_a p + \beta_b (1-p) ] dG(p). Clearly \beta \neq \beta' unless P \setminus \{p_0\} is the empty set, i.e. there was no other option. Indeed, the choice of possible alternatives involves a normative judgement, as we’ve suggested: for an omission to count as a cause, there needs to be a judgement about what ought to have been done. For health care expenditure this may mean that the only viable alternative is the allocatively efficient distribution, in which case any allocation that is not allocatively efficient results in a net loss of life, which some may argue is reasonable. An alternative view is that the government merely has to do no worse than in the past, and perhaps that it is also reasonable for the government not to make significant changes to the allocation, for whatever reason. In that case we might say that P = [p_0, 1] and G(p) might be a distribution truncated below p_0 with most of its mass around p_0 and small variance.
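A numerical sketch of this toy model (with illustrative effect sizes and a small discrete distribution standing in for G) shows how \beta and \beta' can even differ in sign:

```python
# Toy model from the text: two programmes A and B, a chosen allocation
# p0, and a discrete distribution G over the alternative allocations
# that "ought" to have been considered. All numbers are illustrative.

# Deaths averted per unit of expenditure in each programme
beta_a, beta_b = 2.0, 5.0      # programme B is more effective

p0 = 0.8                       # chosen share of expenditure on programme A

def effect(p):
    """Health effect of a unit of expenditure under allocation p."""
    return beta_a * p + beta_b * (1 - p)

# Ordinary causal effect of expenditure at the chosen allocation
beta = effect(p0)

# Alternative allocations and their weights under G (weights sum to 1)
alternatives = {0.2: 0.5, 0.5: 0.5}

# Net effect once causation by omission is counted: gains under p0
# minus the weighted gains forgone under the alternatives
beta_prime = beta - sum(w * effect(p) for p, w in alternatives.items())

print(beta, beta_prime)
```

Here the chosen allocation favours the less effective programme, so the ordinary effect \beta is positive while the omission-adjusted effect \beta' is negative: spending saved lives, but fewer than the alternatives would have.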

The problem is that we generally do not observe the effect of expenditure in each programme of care, nor do we know the distribution of possible budget allocations. The normative judgements are also contentious. Claxton clearly believes the government ought not to have initiated the Cancer Drugs Fund, but he does not go so far as to say that any allocative inefficiency results in a net loss of life; some working out of the underlying normative principles is warranted. But if it is not possible to estimate these net causal effects, why discuss them? Perhaps because of the inconsistency: we estimate the ‘ordinary’ causal effect in our empirical work, yet we often describe opportunity costs and losses from inefficiency as being caused by the spending decisions that are made. As the examples at the beginning illustrate, the normative question of responsibility seeps into our judgements about whether an omission is the cause of an outcome. For health care expenditure, the government or other health care body does have a relevant responsibility. I would argue, then, that causation by omission is important and that perhaps we need to reconsider the inferences we make.
