Longitudinal datasets of potential interest to health economists

At the beginning of my PhD I struggled to find good data. I wanted data with good measures of mental health, physical health, employment, income, small-area indicators and standard controls such as age, gender and education. I eventually found what I needed, but it required looking at many datasets. I am sure other people have faced similar problems. This post attempts to give you a bit of a shortcut by providing an overview of datasets that might be interesting to health economists who work empirically.

The focus here is on datasets which allow following individuals over time (even if they weren’t specifically meant as panels). The reason for this is simply that I have more knowledge of these datasets and these alone already bring this post to a substantial length.

The selection criteria for datasets to make it into this post are quite simple: I have to know of their existence and they have to be available in English. I cannot vouch for the quality of all the datasets, as I have only used a few of them. In addition, whether a dataset is good enough depends on your research question.

I will certainly have missed one or two interesting datasets, so I look forward to your comments and additions at the bottom. The reason for including only English-language datasets is simply that using a dataset in a foreign language is a pain, though it is possible! I have a forthcoming publication using a Portuguese dataset without speaking a single word of Portuguese. However, it took me several highly disciplined days with Google Translate to find all the variables I needed. It worked out for me in the end, but I would not encourage it! The reverse holds if you are proficient in another language: using data from that country in a research area that has so far relied only on UK/US data might be part of your competitive advantage.

Naturally, the table below is not exhaustive. Before you use it, here are a few words of warning. Most of my “warnings” are straightforward, but better to be safe than sorry, right?

  1. Not all of these datasets are free to researchers. In fact, some of them come at substantial cost, especially the administrative ones. I would advise narrowing down your options and doing your own cost-benefit analysis to decide which dataset to use. Your supervisor and/or colleagues can probably give you some advice.
  2. Merging and cleaning most of these datasets is a substantial amount of work. Economies of scale therefore argue in favour of sticking to one dataset if possible. Some datasets are so complex that there are dedicated workshops on them. ISER, Essex, for example, offers workshops on Understanding Society (for free) and CHE, York offers workshops on HES.
  3. If what you are looking for does not exist in one dataset, but exists in two datasets from the same source, you might want to check whether they are linkable. HES, for example, can be linked to the Scottish Health Survey and should soon also be linkable to ELSA.
  4. Another option to pursue, if this table does not provide you with your dream dataset, is to check out websites which offer researchers access to publicly available datasets. For the UK that would be the UK Data Archive; for Ireland, look at the Irish Social Science Data Archive. A similar website exists for the Netherlands, called CentER Data. The European Institute also has a good website with an overview of datasets.
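To give a feel for the merging and linking work mentioned in points 2 and 3, here is a minimal sketch in plain Python. It is only an illustration under assumed names: the identifier `pid`, the variables `age`, `employed` and `admissions`, and the toy records are all hypothetical and do not come from any of the surveys listed here. Real datasets have their own identifiers and linkage rules.

```python
# Hypothetical toy records; real surveys use their own identifiers and variable names.
wave1 = [
    {"pid": 1, "age": 34, "employed": 1},
    {"pid": 2, "age": 51, "employed": 0},
]
wave2 = [
    {"pid": 1, "age": 35, "employed": 1},
    {"pid": 3, "age": 28, "employed": 1},
]
# A second (hypothetical) source sharing the same person identifier.
hospital = [
    {"pid": 1, "admissions": 2},
    {"pid": 3, "admissions": 1},
]


def stack_waves(waves):
    """Stack per-wave files into one long panel, adding a wave counter."""
    panel = []
    for number, records in enumerate(waves, start=1):
        for record in records:
            panel.append({**record, "wave": number})
    # Sort so that each person's observations sit next to each other in time order.
    return sorted(panel, key=lambda r: (r["pid"], r["wave"]))


def link(panel, other, key="pid"):
    """Left-join `other` onto the panel; people without a match get None."""
    lookup = {row[key]: row for row in other}
    extra_fields = sorted({f for row in other for f in row if f != key})
    linked = []
    for row in panel:
        match = lookup.get(row[key], {})
        linked.append({**row, **{f: match.get(f) for f in extra_fields}})
    return linked


panel = link(stack_waves([wave1, wave2]), hospital)
for row in panel:
    print(row)
```

The result is an unbalanced panel: person 1 appears in both waves, persons 2 and 3 in only one, and person 2 has no linked hospital record. Deciding how to treat such gaps is usually where most of the cleaning time goes.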

I hope this shortens the search for a data source for some and I am looking forward to hearing about further interesting datasets which could help me in my research.

| Dataset | Country of origin | Started in | Still running? | Short description |
| --- | --- | --- | --- | --- |
| 1970 British Cohort Study (1970 BCS) | Great Britain | 1970 | Yes | A study surveying a group of individuals born in 1970 at regular intervals. |
| British Household Panel Survey (BHPS) | Great Britain (& Northern Ireland from wave 7) | 1990 | No, stopped in 2008, but BHPS participants can be followed up in Understanding Society (US) | A well-organized general household panel. |
| Clinical Practice Research Datalink (CPRD) | England | 2002 | | Following recent NHS reforms, CPRD will become obsolete as the NHS itself will provide one dataset including both hospital and GP records, called Care Episode Service (CES). |
| Cognitive Function and Ageing Studies (CFAS) | UK | CFAS I started in the late 80s. CFAS II started in 2008. | | A UK study focusing on health and cognitive ability in the elderly. |
| Dutch Central Bank Household Study (DHS) | The Netherlands | 1993 | Yes | The survey is particularly well-suited to studying financial effects on health. |
| English Longitudinal Study of Ageing (ELSA) | England | 2002 | Yes | The study focuses on the socio-economic and health dynamics among the elderly in England, therefore only individuals aged 50+ are included. |
| European Community Household Panel (ECHP) | EU | 1994 | No, stopped in 2001. EU-SILC succeeds this survey. | A general household panel covering many EU member states. |
| European Union Statistics on Income and Living Conditions (EU-SILC) | EU | 2003/2004 | Yes | The successor of the ECHP, used to create official statistics for the EU. |
| German Socio-economic Panel (GSOEP) | Germany | 1984 | | A very long, still running household panel comparable with the BHPS or PSID. |
| Growing Up in Scotland (GUS) | Scotland | 2005 | Yes | A study especially well-suited to studying Scottish children. |
| Hospital Episodes Statistics (HES) | England | | | Covers everyone treated in an NHS hospital. Soon HES will develop into Care Episode Service (CES), a dataset including NHS records on both hospital and GP visits. |
| Household, Income and Labour Dynamics in Australia (HILDA) | Australia | 2001 | Yes | A general household panel with good data quality and a decent sample size that is very easy to handle. |
| Korean Labor & Income Panel Study (KLIPS) | Korea | 1998 | Yes | Especially well-suited to studying the labor-health relationship. |
| Longitudinal Internet Studies for the Social sciences (LISS) | The Netherlands | 2007 | Yes | A smaller Dutch panel, but interesting as it has a subsection focusing on immigrants and researchers can propose new questions at no cost. |
| Longitudinal Study of Young People in England (LSYPE) | England | 2004 | Yes | A cohort study following a group of individuals who were 13/14 in 2004. |
| Mental Health Minimum Dataset (MHMDS) | England | 2003 | | Contains NHS record data about individuals with severe mental illnesses. In the future it is probably more interesting to look at CES once mental health institutions are integrated (see HES). |
| Millennium Cohort Study (MCS) | UK | 2000 | Yes | A study surveying a group of individuals born in 2000 at regular intervals. |
| National Child Development Study (NCDS) | Britain | 1958 | Yes | A study regularly surveying individuals born in a particular week in 1958. It is the longest cohort study I am aware of. |
| Panel Study of Income Dynamics (PSID) | US | 1968 | Yes | The longest running household survey, covering a similar spectrum to the BHPS or GSOEP. |
| Survey of Family, Income, and Employment (SoFIE) | New Zealand | 2002 | No, only 8 waves were carried out. | This survey offers 8 waves of data on income and employment from New Zealand. Detailed health data is only available for a few waves. |
| Survey of Health, Ageing and Retirement in Europe (SHARE) | Most European countries | 2004 | No, only 5 waves were carried out. | A dataset specifically suited to studying the elderly over time across Europe. |
| The Irish Longitudinal study on Ageing (TILDA) | Ireland | 2009/2011 | Yes | The Irish version of ELSA, also covering only individuals aged 50+. It includes very specific health measures such as heart rate, blood pressure, grip strength, etc. |
| Understanding Society (US) | UK | 2009 | Yes | Currently the largest household panel available to researchers (ignoring administrative data). |

2 Comments
Christoph Kronenberg
10 years ago

Another dataset which I just discovered is the Russia Longitudinal Monitoring Survey (RLMS):
http://www.cpc.unc.edu/projects/rlms-hse

Carl Haakon Samuelsen
10 years ago

Reblogged this on Helseøkonomen.com and commented:
The academic health economist blog has put together a list of empirical datasets that may be of interest to health economists. It is by no means complete, but it also contains a number of links to websites that provide overviews of open databases. I would also like to point to HERO's database overview: http://www.med.uio.no/helsam/forskning/nettverk/hero/helsedata/
