The ‘value’ in value based pricing

The use of labour market outcomes in the Value Based Pricing scheme is inconsistent with the concept of value

This year, the Department of Health in the UK will begin using a new system of ‘value based pricing’ (VBP) to set prices for medicines and other health technologies. Decisions regarding the adoption of new medical technologies rely, in large part, upon formal assessments of cost-effectiveness; these assessments are most often carried out by the National Institute for Health and Care Excellence (NICE). The aim of the new VBP system is to better capture the benefits of a treatment, particularly benefits accruing directly to non-treated individuals such as carers and indirectly to society as a whole. In the latter case, these indirect benefits are referred to as wider societal benefits (WSBs), and are to be measured in terms of market based activity—specifically, the difference between productivity and consumption. However, I believe that the proposed methodology is inconsistent with the concept of ‘value’.

The concept of value is hard to specify, but whenever we talk of something being ‘good’, ‘better’, or ‘best’, or conversely ‘bad’, ‘worse’, or ‘worst’, we are talking in terms of value. The health technology assessments (HTAs) conducted in the UK generally take what is best to be the state of affairs with the greatest amount of goodness, and hence value, overall, subject to a budget constraint. But how do we measure value in these HTAs? The standard measure currently used is the quality adjusted life year (QALY); a medicine that leads to the largest number of QALYs overall within our budget constraint, i.e. a cost-effective medicine, is good. In this sense we can say one treatment is better than another in terms of its cost-effectiveness. At this point it becomes important to think about different types of value.
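
In practice, the cost-effectiveness judgment described above is usually operationalised as an incremental cost-effectiveness ratio (ICER) compared against a willingness-to-pay threshold. A minimal sketch, where the function names are illustrative and the £20,000 default reflects the lower end of the range NICE conventionally works with (roughly £20,000–£30,000 per QALY):

```python
def icer(delta_cost, delta_qalys):
    """Incremental cost-effectiveness ratio: the extra cost per extra
    QALY of one treatment relative to a comparator."""
    return delta_cost / delta_qalys

def is_cost_effective(delta_cost, delta_qalys, threshold=20000):
    # A treatment is deemed cost-effective when its ICER falls at or
    # below the willingness-to-pay threshold per QALY.
    return icer(delta_cost, delta_qalys) <= threshold

# A drug costing an extra £15,000 for one extra QALY passes; the same
# cost for a quarter of a QALY (ICER £60,000) does not.
print(is_cost_effective(15000, 1.0))   # True
print(is_cost_effective(15000, 0.25))  # False
```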

An important contrast is drawn between intrinsic value and instrumental value. Something with intrinsic value is good in and of itself, whereas things of instrumental value are good because they causally lead to intrinsically good things. Consider money: it is good only because it leads to things that are themselves good, such as good housing or an HDTV, which in turn may be good because of what they lead to, such as a safe and clean environment or relaxing weekends watching sport. As a third category there is constitutive value; while instrumental values causally lead to intrinsic values, constitutive value constitutes intrinsic value without causing it. For example, giving you money may lead to your pleasure, and this pleasure constitutes your happiness without necessarily causing it. Within these distinctions, QALYs arguably have constitutive value in that they constitute well-being and longevity.

One further distinction is that between value monism and value pluralism. A monist believes that there is only one kind of value to which all other values are reducible. Economists arguably fall into this camp, since they often use utility as the encompassing super-value. This position has some attractive features, such as being able to explain rational choice through, for example, diminishing marginal value. The opposing school of thought, value pluralism, posits that different kinds of value (e.g. happiness and liberty) are distinct and hence incommensurable. Thus, the QALY may be constitutive of the singular super-value, which we can refer to as utility without loss of generality, or it may be a measure of just one kind of value, such as quality of life.

From a monist perspective, we could consider the aim of VBP to be estimating the effect of healthcare expenditure on overall utility for each specific technology. The new VBP system aims to capture not only the utility accruing directly to the recipients of a medical technology (of which QALYs are constitutive) but also the utility generated by the increased level of resources in the economy caused by their increased productivity (i.e. the instrumental value of productivity). In this sense, the VBP system aims to estimate a multiplier effect of healthcare expenditure for each technology. But the VBP methodology appears inconsistent with this position. Firstly, the WSBs of a treatment are determined by productivity minus consumption, yet consumption itself generates utility; in a monist sense, all rational decisions regarding consumption boil down to utility. Secondly, the changes to societal welfare caused by increased productivity are estimated by calculating the effect of changes in individual QALYs on productivity. There is no reason to suspect that the effect of productivity on QALYs is at all similar.

We could instead adopt a pluralist position in which QALYs constitute only one kind of value and productivity is instrumental for another kind. But if these types of value are distinct then they are incommensurable and cannot be combined. Furthermore, linking productivity to other types of value, such as liberty or happiness, is fraught with difficulty and is not discussed as such in the VBP literature. We could argue that just because two things are incommensurable does not mean they are incomparable—to take a particularly contrived example, we may prefer to reimburse a medicine that treats a disease afflicting only charity workers over one for a salesperson-specific disease, out of a particular notion of value (I have no quarrel with people who work in sales). In this way we could create an ordinal scale, but this would preclude the calculation of thresholds and cost-effectiveness ratios, upon whose very existence HTA relies.

I believe VBP, as an attempt to more accurately capture the effects of healthcare expenditure, to be a good idea, but the V in VBP is particularly nebulous. At the very least, however, VBP is a step in the right direction and will lead to wider discussions about the often under-considered normative side of economics.



A comment on the value of screening

Everybody’s talking about screening again, with good reason. Research seems to suggest that screening for breast cancer using mammography is not effective (let alone cost-effective). Here I present a view on the value of screening, of whose validity I have yet to fully convince myself.

Why screening?

The point of screening is the early detection of disease, or the identification of increased risk of disease, in people without symptoms. It is based on the idea that commencing treatment or care at an earlier stage of disease can be beneficial to an individual’s health or well-being in the long term, including life-extension. Read that sentence again – it’s important. So, the potential value of screening is commensurate with the incremental value of early treatment or care over and above treatment or care that begins with the onset of symptoms. However, as I suggest below, a screening trial is not the best way to capture this.

The value of treatment for disease is clear. Treatment can directly improve health and well-being, and this is something to which people attach value. The same cannot be said of screening, which in and of itself is not health-improving. I believe that we need to stop valuing screening in terms of its ability to extend life and quality of life. It may serve to give people peace of mind, but this is not something that we routinely consider in the evaluation of screening. The true value of screening lies in the extent to which it provides us with new knowledge about a patient; a screening intervention with better sensitivity and specificity is of greater value. It seems perverse to suppose that an intervention that provides us with information can be ‘ineffective’ in terms of health outcomes, or even harmful. The error is surely in our use of that information.

The redundancy of screening trials

To capture the value of screening we do the same as for any other health care intervention; we carry out an RCT. It appears to me that screening trials are unnecessary, of little value and potentially dangerous. This may be a foolhardy suggestion given that my employment depends upon a screening trial, but I’ll go on.

I’ll tackle my three claims in reverse order. Firstly, screening trials are potentially dangerous. This is simply because screening is potentially dangerous. As we’ve already established, the value of screening is derived from the value of treatment for the disease for which an individual is being screened. If a person is screened positive, they may be able to receive treatment earlier than they otherwise would. However, it seems unlikely to me that the treatment they will receive has been evaluated in asymptomatic individuals like themselves. Trials of pharmaceuticals, surgery or other health care interventions are invariably carried out in populations with symptomatic disease. Therefore, our understanding of the benefits, costs and side-effects of the treatment may not be relevant to individuals receiving treatment after being screened positive. The effect in this population may be detrimental.

Secondly, screening trials may be of little value. Given that the value of screening is derived from the effectiveness of treatment, a change in the cost-effectiveness of the best available treatment for a disease will render the ‘effectiveness’ results from the screening trial redundant. Studies of screening no doubt provide other important data and findings, but I do not believe ‘effectiveness’ to be one of them… (please don’t fire me).

Thirdly, trials of the effectiveness of screening are unnecessary. Again because the value of screening is derived from treatment. Armed with information about the effectiveness of treatment, screening uptake and the sensitivity and specificity of the screening method, we can accurately model the ‘effectiveness’ of screening. A trial of the effectiveness provides no new information.
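
If the value of screening really is derived entirely from treatment, the modelling exercise described above is straightforward. A minimal sketch, in which all parameter names and values are illustrative and benefit is assumed to accrue only to true positives who commence treatment early:

```python
def modelled_screening_benefit(uptake, prevalence, sensitivity,
                               qalys_early, qalys_late):
    """Expected incremental QALYs per person invited to screening.

    Only those who take up screening, have the disease, and are
    correctly detected (true positives) gain the benefit of starting
    treatment before symptom onset.
    """
    p_true_positive = uptake * prevalence * sensitivity
    incremental_benefit = qalys_early - qalys_late
    return p_true_positive * incremental_benefit

# e.g. 70% uptake, 1% prevalence, 90% sensitivity, and early treatment
# worth 1.5 extra QALYs over treatment commenced at symptom onset:
print(modelled_screening_benefit(0.7, 0.01, 0.9, 8.0, 6.5))
```

A full model would add specificity (to count the costs and harms of false positives), but the point stands: every input here comes from treatment trials and test-accuracy studies, not from a screening trial.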

The solution

It seems to me that the problems set out above can be addressed by changing the way we carry out trials of treatments for diseases that have an asymptomatic state. Trials of treatments for such diseases should recruit asymptomatic individuals and subject all participants to a screening programme. Upon detection of disease (with any necessary decision rules and stratifications), individuals can be given the treatment that is being evaluated. Then, at the end of the trial, the cost-effectiveness of the treatment should be reported as a function of disease progression. In some cases disease progression will be categorised – for example into asymptomatic/symptomatic – while in others it will be a function of some other indicator, such as tumour size. Using this information it would be simple to elicit the ‘value’ of a screening programme. There are three possible results:

  1. Commencement of treatment is not cost-effective at any stage of disease, in which case screening is not effective.
  2. Commencing treatment in the asymptomatic stages of disease is no more cost-effective than commencing treatment once symptoms are identified, in which case screening is not effective.
  3. There is a stage of pre-symptomatic disease at which the commencement of treatment provides benefits over and above waiting to start treatment once symptoms develop, in which case screening is effective.

It is only in the third scenario that the information provided by screening can be harnessed for health benefit. Using this information it would be possible to implement a screening programme that maximises cost-effectiveness, without carrying out a trial of the ‘effectiveness’ of screening.
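
The three-scenario rule above can be expressed compactly. A sketch, under the assumption that cost-effectiveness at each disease stage has been summarised as a net monetary benefit of commencing treatment at that stage (all names illustrative):

```python
def screening_is_effective(net_benefit_by_stage, symptom_onset):
    """Screening is effective only if some pre-symptomatic stage offers
    a positive net benefit that exceeds the net benefit of waiting
    until symptoms appear (scenario iii).

    net_benefit_by_stage: net monetary benefit of commencing treatment
    at each successive disease stage.
    symptom_onset: index of the first symptomatic stage.
    """
    symptomatic_nb = net_benefit_by_stage[symptom_onset]
    pre_symptomatic = net_benefit_by_stage[:symptom_onset]
    return any(nb > max(symptomatic_nb, 0) for nb in pre_symptomatic)

print(screening_is_effective([5.0, 2.0, 1.0], 2))   # scenario iii: True
print(screening_is_effective([-1.0, 0.5, 1.0], 2))  # scenario ii: False
```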

No doubt there are problems with this suggested solution. However, the dramatic inconsistency in the findings of screening trials, and the resulting scepticism amongst clinicians and decision makers, suggests that something is wrong with the way we currently think about the value of screening. It’s time for a rethink.



Social impact bonds: is an ounce of (bond) prevention worth more than a pound of (budgetary) cure?

It is one of the curious ironies of history that ideas which tend to destroy also help to rebuild. Innovative financial instruments played a key role in the 2007-2008 financial crisis that not only dented economic growth worldwide, but also hit government revenue streams making fewer resources available for health care spending. Roughly five years after the crisis, social impact bonds (SIBs) – a new financial instrument – hold promise to fund a raft of innovative social service delivery models via private capital. Though SIBs are still in the early development phase, they could play a niche role in relieving burdened state health care budgets and financing innovative preventive health schemes in both the US and UK.

SIBs share some common characteristics with (vanilla) bonds; however, there are also notable differences. When an investor purchases a regular bond, he/she pays a principal amount (e.g. a face value of $10,000) with the expectation of receiving periodic interest payments until the bond matures, at which point the principal amount is returned to the investor. SIBs still require an initial principal investment from investors, usually with more than a modicum of altruism for the cause involved. Not-for-profits, and sometimes commercial entities, are the main current investors in SIBs.

The main differences lie in how the money is used and how payments to investors are made. An intermediary, which charges fees, serves as the organiser of the SIB, selecting the investors and service providers and overseeing the process. Once investors purchase a SIB, a government agency contracts with social service delivery organisation(s) to serve a selected cohort of individuals. Investors are not offered regular interest payments; rather, they receive ‘performance-based’ payments tied to agreed benchmarks in service delivery.

For example, Social Finance UK issued the first social impact bond in September 2010 in the United Kingdom. In the case of the 2010 Peterborough SIB offering, incentive payments were tied to ex-prisoner recidivism levels. That is, if the selected cohort of released ex-prisoners ‘covered’ under the bond’s services had a lower rate of recidivism than an agreed-upon counterfactual cohort (usually the national average), investors would be rewarded with a payment from the government. If the cohort demonstrated a higher rate of recidivism, investors would forfeit both the initial principal investment and the performance payments. In this scheme, the financing mechanism acts more like equity, where investors receive a dividend for superior corporate performance (without the capital gain), than like guaranteed interest payments (see diagram below from Social Finance for the SIB flow of funds between investor, government, and social service deliverer).
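
The payoff structure can be caricatured in a few lines. A stylised, all-or-nothing sketch — the real Peterborough contract tiers payments to the size of the reduction in recidivism, so the function and figures below are purely illustrative:

```python
def sib_investor_payout(principal, cohort_recidivism,
                        counterfactual_recidivism, performance_payment):
    """Stylised SIB payoff: the investor recovers the principal plus a
    performance payment only if the covered cohort out-performs the
    counterfactual; otherwise everything is forfeited."""
    if cohort_recidivism < counterfactual_recidivism:
        return principal + performance_payment
    return 0.0

print(sib_investor_payout(10_000, 0.35, 0.40, 1_500))  # 11500
print(sib_investor_payout(10_000, 0.45, 0.40, 1_500))  # 0.0
```

The asymmetry is the point: unlike a vanilla bond, the downside is a total loss, which is why the instrument behaves more like equity than debt.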

(c) Social Finance 2011


Interest in SIBs for health care service delivery is gaining momentum. After the successful launch of the first SIB in 2010, coupled with a greater emphasis on ‘responsible finance’, the idea quickly expanded to other fields including education, adoption and work retraining schemes. The business case for health care SIBs is arguably at least as strong as, if not stronger than, in other areas. There are two reasons for this.

First, governments face difficult funding choices in the age of austerity. Regardless of the expenditure area, general budgetary funds are usually allocated to existing programs with minimal risk; innovative programs with high start-up costs and unknown outcomes are not seen to deliver value-for-money.

Second, a majority of health care budgets in advanced countries are dedicated to treating patients with chronic conditions, primarily in hospital or long-term care settings. Spending on preventive services has traditionally been much lower, although this is gradually changing. This is particularly true for innovative schemes to prevent chronic disease onset. Policy makers need more tools to address the crowding out of preventive spending in health care budgets as the average population age and the number of comorbidities per patient grow. SIBs might be one tool to diversify the risk associated with these schemes, while also allowing governments to pay only for programs that actually improve outcomes.

Although interest exists, adoption of SIBs for health care services has been slow. Though the UK served as the initial testing ground for SIBs, their use in health care has been minimal. Some of the inertia is due to the NHS: the large bureaucracy has established payment and program trial systems that are not compatible with SIBs. This attitude may be changing, however, particularly due to the fiscal pressures of austerity. In reaction to a May 2013 NHS/Monitor discussion paper on changing the NHS’s payment system, several organisations submitted responses proposing SIBs as a necessary strategy. The Health Foundation’s submission cited a trial in the Milton Keynes NHS Trust, associated with psychological assessment of diabetes patients, with ‘SIB-like’ properties.

In the United States, state and local health care stakeholders have been at the forefront of developing SIBs. The city of Fresno in California is the country’s first site for a health care SIB: a two-year demonstration bond has been approved to assess the use of evidence-based practices in the treatment of 200 low-income paediatric asthma patients. The $660,000 SIB, funded by Collective Health and the California Endowment, will evaluate whether intensive patient education and home visits are effective in preventing emergency department visits and inpatient hospitalisations. If the selected cohort achieves a lower utilisation rate than another selected cohort in California’s Medi-Cal population, investors will receive their payback and the initial trial will be expanded to cover 2,000 children in the state.

SIBs, despite their innovative nature, are also a target of criticism. First, critics point out that the SIB delivery structure is economically inefficient. The SIB’s intermediary charges fees that would not exist in a direct relationship between the government and the contractor; these fees mean that a project can be expensive to scale up and may waste government funds. Second, the singular focus on pre-determined quantitative measures may be wrong-headed. A typical evaluation of a social service scheme is more flexible, including both qualitative and quantitative assessments of success. Such an evaluation also takes note of when service delivery or outcomes did not follow prescribed guidelines, or allows for changes in how the demonstration proceeds based on feedback. This iterative process may not be possible in SIBs.

Overall, SIBs are still in their nascency and face many challenges. The idea, however, is not simply part of a larger social investing fad. If SIBs are able to allocate investments in areas where governments are unable or unwilling to invest, they may serve their purpose, even if only by showing which delivery schemes fail. With tighter health care budgets and the pressing need for innovative solutions in health, SIBs should be seen as a useful new financing tool.



Solution for Trap 66 (Postcondition violated)

WinBUGS is a widely used free software program within health economics. It allows for Bayesian statistical modelling using Gibbs sampling (hence the name: the Windows version of Bayesian inference Using Gibbs Sampling). One of the drawbacks of WinBUGS is the notoriously uninformative error messages you can receive. While Google is usually a fountain of knowledge on solving errors, where WinBUGS is concerned it often only serves up other people asking the same question, and hardly any answers. This post is about one error message I encountered, the solution that is sometimes offered (which I think only partly solves the problem), and the solution I found (which solves it completely).

The error message itself is “Trap 66 (postcondition violated)”. Variance priors have been identified as the culprits. The suggested solutions I could find (for example here, here and here) all point towards those priors being too big. The standard advice is then to reduce the priors (for example from dunif(0,100) to dunif(0,10)) and rerun the model. This usually solves the problem.

However, to me this doesn’t make a whole lot of sense theoretically. And, in a rare case of the two aligning, it also didn’t solve my problem in practice. I have been performing a simulation study with about 8,000 similar but different data sets (8 scenarios, 1,000 repetitions of each). They all represent mixed treatment comparisons (MTCs), which are analysed by WinBUGS. I used SAS to create the data, send it to WinBUGS, and collect and analyse the outcomes. When I started the random effects MTC, “Trap 66 (postcondition violated)” popped up around data set 45. Making the priors smaller, as suggested, solved the problem for this particular data set, but it came back at data set 95. The funny thing is that making the priors larger also solved the problem for the original data set, but once again the same problem arose at a different data set (this time number 16).

Whenever I tried to recreate the problem, it would give the same error message at the exact same point, even though it’s a random sampler. From this it seems that the reason the suggested solution works for a given data set is that the generated ‘chains’, as they are called in WinBUGS, are identical given the same priors and initial values. Defining a smaller prior will give a different chain, which is likely not to cause problems; but so will a larger prior or a different initial value. In other words, it didn’t really solve the underlying problem.

The solution I have found to work for all 8,000 data sets is to look not at the maximum value of the prior, but at the minimum. The prior given for a standard deviation usually looks something like dunif(0,X). In my case I took an extra step, specifying a uniform prior on a variable called tau. The precision (one divided by the variance) that goes into the link function is then defined by

prec <- pow(tau,-2)

This does not make any difference for the problem or the solution. My hypothesis is that when Trap 66 comes up, the chain generates a tau (or standard error, if that’s what you modelled directly) equal to 0, which resolves into a precision equal to 1 divided by 0, or infinity. The solution is to let the prior not start at 0, but at a small epsilon. I used dunif(0.001,10), which solved all my problems.
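
The hypothesis is easy to demonstrate outside WinBUGS. A Python sketch of the same arithmetic, showing why a uniform prior that includes 0 can blow the precision up, while the truncated dunif(0.001,10) bounds it:

```python
def precision(tau):
    """Precision as WinBUGS computes it from the standard deviation:
    prec <- pow(tau, -2)."""
    return tau ** -2.0

# A chain value of tau drawn close to 0 sends the precision towards
# infinity (and tau == 0 makes it undefined outright):
print(precision(1e-150))  # astronomically large
# Truncating the prior to dunif(0.001, 10) caps the precision:
print(precision(0.001))   # about one million at worst
```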

This solution is related to a different problem I once had when programming a probability. I mistakenly used a dunif(0,1) prior. Every now and then the chain will generate exactly 0 or 1, which does not sit well with the binomial link function. The error message is different (“Undefined real result”), but the solution is again to use a prior that does not include the extremes. In my case, using a flat dbeta instead (which I should have done to begin with) solved that problem.

Any suggestions and comments are much appreciated. You can download WinBUGS, the free immortality key and lots of examples from the BUGS project website here. It also contains a list of common error messages and their solutions. Not Trap 66, obviously.


Posted on January 8, 2014 in Health Statistics and Econometrics



New year’s resolutions

In 2013 we published 34 posts and got about 12,000 hits. While we’re by no means disappointed with this, we know we can do better.

Our primary goal is to publish more posts, while ensuring that articles maintain a high quality. I hope that this year you will consider becoming a contributor and writing for the blog. It’s a great way to present any issues you’ve been pondering that just won’t formulate into a journal submission, or to simply pose tricky questions to the health economics community.

We hope to cultivate more of a community around the blog, and the Health Economics Journal Club will become a primary means of achieving this. We will be experimenting with new platforms for #HEJC discussions, such as Google Hangouts, and new formats, including guest authors. If you’d like to help put together a future discussion, please get in touch.

You’ll now also find a new Resources page in the menu above. A links page is an oft-requested feature for the site. While other websites already provide links pages, many are outdated, incomplete or simply too huge to be of value. We hope this one will prove more useful, but do let us know what you think.


Posted on January 6, 2014 in #HEJC, News



#HEJC for 17/10/2013

The next #HEJC discussion will take place Thursday 17th October, at 8am London time. Join the Facebook event here. For more information about the Health Economics Twitter Journal Club and how to take part, click here.

The paper for discussion this month is a working paper published by IZA. The authors are Maja Adena and Michal Myck. The title of the paper is:

“Poverty and transitions in health”

Following the meeting, a transcript of the Twitter discussion can be downloaded here.

Summary of the paper

My interest in discussing ‘Poverty and transitions in health’ by Adena and Myck was driven by my own curiosity about the socio-economic determinants of health and well-being more generally. The income based approach to assessing a country’s progress, or to assessing welfare within nations, has faced sustained criticism from a number of quarters, most notably the Commission on the Measurement of Economic Performance and Social Progress (Stiglitz et al. 2009). Even when it comes to assessing poverty, the method of drawing a relative income poverty threshold has faced further scrutiny, most notably from advocates of a multidimensional poverty approach (Alkire & Foster, 2011).

The paper by Adena and Myck is another critique of the relative income approach to assessing wellness, but in a new light – investigating the ability of relative income to predict future health outcomes in an older population group. The authors argue that old age poverty is one of the key challenges facing developed countries, with the over-65s expected to make up almost three in every ten EU citizens by 2060, adding to concerns about the sustainability of national pension plans. They also argue that epidemiological research has so far failed to account for the relationship between material conditions and health in the later stages of life.

The data used to investigate this research question were drawn from 12 European countries in a large (n=29,110) European panel survey, the Survey of Health, Ageing and Retirement in Europe (SHARE). The percentage of the sample aged 50-64 at baseline was 53.23%, with the remainder 65 and older; males accounted for 54.69% of the sample. Wave 2 of the SHARE dataset (2006) was used at baseline to predict binary outcomes of good or bad health in Wave 4 of the survey (2012), conditional on whether an individual was in good or bad health at baseline. Three measures of “material circumstances” were used to predict three measures of ‘health’ in this study (as well as mortality). The three material circumstances measures were:

  1. Income poverty – 60% of the median equivalised household income
  2. Subjective poverty – having difficulty ‘making ends meet’ each month
  3. Wealth poverty – bottom tertile of country wealth distributions.

The three health measures were:

  1. Self-assessed health status (SAH) – “fair” or “poor” health status on a five-point scale
  2. Symptoms of poor health (SMT) – poor if they have 3 or more of 12 symptoms measured
  3. Limitations in performing activities of daily living (ADL) – poor if they have 3 or more of 23 ADLs.
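
The first of the material measures is easy to make concrete. An illustrative sketch only: the actual study equivalises incomes within households and draws the threshold within each country, both of which the version below skips:

```python
import statistics

def income_poor(equivalised_incomes, own_income):
    """Measure 1: classed as income poor if household income falls
    below 60% of the median equivalised household income."""
    threshold = 0.6 * statistics.median(equivalised_incomes)
    return own_income < threshold

incomes = [12_000, 18_000, 22_000, 30_000, 45_000]
print(income_poor(incomes, 11_000))  # below 0.6 * 22,000 = 13,200: True
print(income_poor(incomes, 15_000))  # False
```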

The authors’ key findings suggest that the “broader measures” of subjective and wealth poverty predict negative health outcomes more accurately than income poverty (in some cases, no relationship was found between income poverty and health outcomes). People who were in bad health in Wave 2 are also less likely to have recovered by Wave 4 if they are classified as subjective or wealth poor. The most striking finding by Adena and Myck concerns the probability of death for those reported as subjectively poor in Wave 2: the probability of dying is 40.3% higher for men and 58.3% higher for all those aged between 50 and 64. The authors conclude by stating that “improvements in material conditions may not only translate into better quality of life but also living longer”.

Discussion points

  • The method of defining health poverty as 3 problems for the health measures SMT and ADL is reported as arbitrary but common. How common is this practice? No reference to other examples in the paper.
  • Should having 3 problems be equivalent irrespective of ‘problem’ under consideration? Might be worth considering literature on ‘core’ poverty methods (Clark & Qizilbash, 2008).
  • Sensitivity analysis considered people with two or more problems: I was expecting an analysis which went higher (i.e. four or more problems), especially for ADL.
  • Sensitivity analysis would have also been useful for the material measures. This paper shows problems with the 60% median relative income threshold, rather than relative income itself.
  • Is the method for defining wealth within a country appropriate for defining “wealth poverty”?
  • While the authors touched on qualitative information contained in Wave 3 in the sensitivity analysis, this could warrant future research as to the drivers of changes in health outcomes over time.
  • Personally, I did not feel the link to Grossman’s (1972) model on health stock was necessary. Felt the results, if properly presented, could stand up on their own merits.
  • Imputation of missing values for income and wealth needed further explanation.
  • Related to Footnote 11: why were the results in Hahn et al. (1995) different from those found in this study? This requires more detailed consideration.

Can’t join in with the Twitter discussion? Add your thoughts on the paper in the comments below.


Posted on October 10, 2013 in #HEJC



Longitudinal datasets of potential interest to health economists

At the beginning of my PhD I struggled to find good data. I wanted data with good measures of mental health, physical health, employment, income, small area indicators and standard controls like age, gender, education and so on. I eventually found what I needed, but it required looking at many datasets. I am sure other people have faced similar problems. This post attempts to give you a bit of a short-cut by providing an overview of datasets that might interest health economists who work empirically.

The focus here is on datasets which allow following individuals over time (even if they weren’t specifically meant as panels). The reason for this is simply that I have more knowledge of these datasets and these alone already bring this post to a substantial length.

The selection criteria for datasets to make it into this post are quite simple: I have to know of their existence and they have to be available in English. I cannot vouch for the quality of all the datasets, as I have only used a few of them; suitability will also depend on your research question.

I will certainly have missed one or two interesting datasets, so I am already curious about your comments and additions at the bottom. The reason to include only English-language datasets is quite simply that using a dataset in a foreign language is a pain, though it is possible! I have a forthcoming publication using a Portuguese dataset without speaking a single word of Portuguese. It took several highly disciplined days with Google Translate to find all the variables I needed; it worked out in the end, but I would not encourage it! The reverse is also true: if you are proficient in another language, using data from that country for your research (which might so far have only used UK/US data) might be part of your competitive advantage. Naturally, the table below is not exhaustive and, as already mentioned, I look forward to learning about other interesting surveys. First, however, a few words of warning. Most of these are straightforward, but better safe than sorry, right?

  1. Not all of these datasets are free to researchers. In fact, some come at a substantial cost, especially the administrative ones. I would advise narrowing the options down and making your own cost-benefit analysis to determine which dataset to use. Your supervisor and/or colleagues can probably offer some advice.
  2. Merging and cleaning most of these datasets is often a substantial amount of work. Economies of scale might therefore argue in favour of sticking to one dataset if possible. Some datasets are complex enough that dedicated workshops exist: ISER at Essex, for example, offers (free) workshops on Understanding Society, and CHE at York offers workshops on HES.
  3. If what you are looking for does not exist in one dataset but exists in two datasets from the same source, you might want to check whether they are linkable. HES, for example, can be linked to the Scottish Health Survey and should soon also be linkable to ELSA.
  4. Another option, if this table does not contain your dream dataset, is to check out websites which offer researchers access to publicly available data. For the UK that is the UK Data Archive; for Ireland, look at the Irish Social Science Data Archive. A similar website exists for the Netherlands, called CentER Data. The European Institute also has a good website with an overview of datasets.
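On the merging and linking points above: whatever software you use, the core operation is usually an inner join on a person identifier (and wave, for panels). A minimal sketch in plain Python of what your statistical package does under the hood; all variable names here (pid, wave, sf12, income) are made up for illustration and will differ in any real dataset:

```python
# Two hypothetical survey extracts, each a list of person-wave records.
health = [
    {"pid": 1, "wave": 1, "sf12": 48.2},
    {"pid": 2, "wave": 1, "sf12": 51.7},
]
income = [
    {"pid": 1, "wave": 1, "income": 21000},
    {"pid": 3, "wave": 1, "income": 18500},
]

def merge_on_keys(left, right, keys):
    """Inner-join two lists of records on the given key fields."""
    # Index the right-hand file by its key tuple for O(1) lookup.
    index = {tuple(r[k] for k in keys): r for r in right}
    merged = []
    for row in left:
        match = index.get(tuple(row[k] for k in keys))
        if match is not None:
            # Combine the fields from both records into one row.
            merged.append({**row, **match})
    return merged

panel = merge_on_keys(health, income, keys=["pid", "wave"])
# Only pid 1 appears in both extracts, so one merged row survives.
```

An inner join silently drops anyone missing from either file, so it is worth counting matched and unmatched records at every merge step; attrition introduced here can bias a panel analysis.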

I hope this shortens the search for a data source for some and I am looking forward to hearing about further interesting datasets which could help me in my research.

| Dataset | Country of origin | Started in | Still running? | Short description |
|---|---|---|---|---|
| 1970 British Cohort Study (1970 BCS) | Great Britain | | Yes | A study surveying a group of individuals born in 1970 at regular intervals. |
| British Household Panel Survey (BHPS) | Great Britain (& Northern Ireland from wave 7) | | No, stopped in 2008, but BHPS participants can be followed up in Understanding Society | A well-organised general household panel. |
| Clinical Practice Research Datalink (CPRD) | England | 2002 | | Following recent NHS reforms, CPRD will become obsolete as the NHS itself will provide one dataset including both hospital and GP records, the Care Episode Service (CES). |
| Cognitive Function and Ageing Studies (CFAS) | UK | CFAS I in the late 1980s; CFAS II in 2008 | | A UK study focusing on health and cognitive ability in the elderly. |
| Dutch Central Bank Household Study (DHS) | The Netherlands | | Yes | Particularly well suited to studying the effects of household finances on health. |
| English Longitudinal Study of Ageing (ELSA) | England | | Yes | Focuses on socio-economic and health dynamics among the elderly in England; only individuals aged 50+ are included. |
| European Community Household Panel (ECHP) | EU | | No, stopped in 2001; succeeded by EU-SILC | A general household panel covering many EU member states. |
| European Union Statistics on Income and Living Conditions (EU-SILC) | EU | 2003/2004 | Yes | The successor of the ECHP, used to produce official statistics for the EU. |
| German Socio-Economic Panel (GSOEP) | Germany | | Yes | A very long-running household panel comparable with the BHPS or PSID. |
| Growing Up in Scotland (GUS) | Scotland | | Yes | A study especially well suited to studying Scottish children. |
| Hospital Episode Statistics (HES) | England | | | Covers everyone treated in an NHS hospital. HES will soon develop into the Care Episode Service (CES), a dataset including NHS records on both hospital and GP visits. |
| Household, Income and Labour Dynamics in Australia (HILDA) | Australia | | Yes | A general household panel with good data quality, a decent sample size, and very easy handling. |
| Korean Labor & Income Panel Study (KLIPS) | South Korea | | Yes | Especially well suited to studying the relationship between labour and health. |
| Longitudinal Internet Studies for the Social sciences (LISS) | The Netherlands | | Yes | A smaller Dutch panel, but interesting for its subsection focusing on immigrants; researchers can propose new questions at no cost. |
| Longitudinal Study of Young People in England (LSYPE) | England | | Yes | A cohort study following a group of individuals who were aged 13/14 in 2004. |
| Mental Health Minimum Dataset (MHMDS) | England | | | Contains NHS record data on individuals with severe mental illness. In the future it is probably more interesting to look at the CES, once mental health institutions are integrated (see HES). |
| Millennium Cohort Study (MCS) | UK | | Yes | A study surveying a group of individuals born in 2000 at regular intervals. |
| National Child Development Study (NCDS) | Great Britain | | Yes | A study regularly surveying individuals born in a particular week in 1958; the longest-running cohort study I am aware of. |
| Panel Study of Income Dynamics (PSID) | USA | | Yes | The longest-running household survey, covering a similar spectrum to the BHPS or GSOEP. |
| Survey of Family, Income, and Employment (SoFIE) | New Zealand | | No, only 8 waves were carried out | Offers 8 waves of data on income and employment from New Zealand; detailed health data is only available for a few waves. |
| Survey of Health, Ageing and Retirement in Europe (SHARE) | Most European countries | | No, only 5 waves were carried out | A dataset specifically suited to studying the elderly over time across Europe. |
| The Irish Longitudinal Study on Ageing (TILDA) | Ireland | 2009/2011 | Yes | The Irish version of ELSA, also covering only individuals aged 50+. It includes very specific health measures such as heart rate, blood pressure, and grip strength. |
| Understanding Society (US) | UK | | Yes | Currently the largest household panel available to researchers (ignoring administrative data). |

(Blank cells reflect information not given in the original table.)
