Data everywhere! – Introduction to Quandl

Economists need data. In this post I want to introduce those of you who don’t know it to a magnificent data source: quandl.com. Quandl is an open source website that indexes a huge range of data (over 6,000,000 data sets, according to the website) covering almost every country on Earth. The bulk of these data appear to be financial, but there is also a wealth of socioeconomic data for many countries (see here for their list of health topics, for example).

One of the most useful things about Quandl is that it delivers the data in a directly usable format. You can even download any of the datasets straight into R; here, I will show you how.

Let’s look at total health spending as a proportion of GDP in the UK. We first find the dataset on Quandl (it is here) and then click download, which offers a number of format options. In this case let’s opt for R.

[Screenshot: Quandl download options]

We copy and paste the generated code into R:

[code language="r"]
df <- read.csv('http://www.quandl.com/api/v1/datasets/WHO/20600_56.csv?&trim_start=1995-12-31&trim_end=2010-12-31&sort_order=desc', colClasses = c('Year' = 'Date'))
[/code]
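If you want a different dataset or date range, the URL can be assembled rather than copied by hand. The following is only a sketch: `quandl_csv_url` is a hypothetical helper of my own naming, not something Quandl provides, and it simply assumes that the `trim_start`, `trim_end` and `sort_order` parameters behave as they do in the generated code.

[code language="r"]
# Hypothetical helper (not part of Quandl): builds the CSV download URL
# from a dataset code and a date range, reusing the query parameters
# that appear in the code generated by the website.
quandl_csv_url <- function(code, start, end, sort_order = "desc") {
  sprintf(
    "http://www.quandl.com/api/v1/datasets/%s.csv?trim_start=%s&trim_end=%s&sort_order=%s",
    code, start, end, sort_order
  )
}

# "WHO/20600_56" is the dataset code taken from the generated URL
url <- quandl_csv_url("WHO/20600_56", "1995-12-31", "2010-12-31")
print(url)

# Reading the data then works as before (needs network access):
# df <- read.csv(url, colClasses = c("Year" = "Date"))
[/code]

Swapping in another dataset code or another pair of dates should then be a one-line change, assuming the dataset has a date column named Year as this one does.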

And then we plot:

[code language="r"]
library(ggplot2)  # needed for ggplot()
ggplot(df, aes(Year, Value)) +
  geom_line() +
  theme_bw() +
  labs(x = "Year", y = "Healthcare spending as % of GDP")
[/code]

[Figure: UK healthcare spending as % of GDP]

Simple. There is also an R package available on CRAN that lets you access data from Quandl without going through the website, and customise the data set (selecting variables and dates) as you do so. I suspect that this will make finding appropriate data much easier in future.
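For a flavour of what that might look like, here is a minimal sketch using the Quandl package. I am assuming that the dataset code is the `WHO/20600_56` part of the download URL and that the date range is set with the `start_date` and `end_date` arguments; the wrapper function name is my own invention, and you should check the package documentation before relying on any of this.

[code language="r"]
# install.packages("Quandl")  # one-off install from CRAN

# Hypothetical wrapper: fetches the UK health spending series via the
# Quandl package instead of a hand-built URL.
fetch_uk_health_spending <- function() {
  if (!requireNamespace("Quandl", quietly = TRUE)) {
    stop("Please install the 'Quandl' package first.")
  }
  # Dataset code assumed from the download URL; dates match the
  # range used in the read.csv() call.
  Quandl::Quandl("WHO/20600_56",
                 start_date = "1995-12-31",
                 end_date   = "2010-12-31")
}

# Needs network access, so it is left commented out here:
# df <- fetch_uk_health_spending()
# head(df)
[/code]

The resulting data frame should be usable with the same ggplot call, provided the column names match those in the CSV download.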

Author

  • Health economics, statistics, and health services research at the University of Warwick. Also like rock climbing and making noise on the guitar.
