R for Journalists

Melting Drugs Data: Part One

Posted on October 18, 2016 by Rob

Every year the Home Office, which is responsible for drugs policy, carries out an anonymous survey into use of illegal drugs in England and Wales.

It always throws up some interesting data. By region, people in London and South West England are the most likely in the country to own up to taking illegal drugs. Lifestyle is also a major factor: younger people who go out four times a month are more likely to be drug users (the illegal kind anyway) than stay-at-home bookworms.

I cleaned up the regional data a bit and you can download it to follow along here.

Here is the structure and head of the data:

> str(data)
'data.frame': 84 obs. of 17 variables:
 $ Drug : chr "Amphetamines" "Amphetamines" "Amphetamines" "Amphetamines" ...
 $ Region : chr "England and Wales" "England" " North East " " North West " ...
 $ X2001.02: num 1.5 1.5 1.3 2.1 1.4 1.2 1.7 0.8 1.6 1.9 ...
 $ X2002.03: num 1.6 1.5 2.1 1.6 1.5 1.4 1.2 1.2 1.9 1.5 ...
 $ X2003.04: num 1.5 1.5 2.4 1.5 2.1 1.4 1.4 1.1 1.3 1.4 ...
 $ X2004.05: num 1.4 1.3 1.4 1.6 1.7 1.3 0.8 0.9 1.2 1.4 ...
 $ X2005.06: num 1.4 1.3 2.4 1.2 1.6 1.5 1.3 1.2 1.3 0.9 ...
 $ X2006.07: num 1.3 1.3 2.2 1.5 1 1.4 1.1 1.4 1.1 0.8 ...
 $ X2007.08: num 1 1 1.2 1.1 1 1 0.7 0.8 0.7 0.9 ...
 $ X2008.09: num 1.2 1.2 2 1.4 1.7 0.9 1.5 0.7 0.5 1.2 ...
 $ X2009.10: num 0.9 0.9 1.7 1.7 0.7 0.8 0.7 0.9 0.5 0.8 ...
 $ X2010.11: num 1 1 1.9 1.4 1.3 1 0.7 0.9 0.7 0.8 ...
 $ X2011.12: num 0.8 0.7 1.6 0.6 0.9 0.8 0.5 0.7 0.4 0.7 ...
 $ X2012.13: num 0.6 0.6 0.9 0.6 0.8 0.7 0.3 0.3 0.5 1 ...
 $ X2013.14: num 0.8 0.7 1 0.8 0.6 0.6 0.5 1.1 0.7 0.7 ...
 $ X2014.15: num 0.6 0.6 0.6 0.8 0.9 0.5 0.2 0.4 0.5 0.5 ...
 $ X2015.16: num 0.6 0.6 0.8 0.5 0.7 0.4 0.4 0.3 0.7 0.7 ...

          Drug                   Region X2001.02 X2002.03 X2003.04
1 Amphetamines        England and Wales      1.5      1.6      1.5
2 Amphetamines                  England      1.5      1.5      1.5
3 Amphetamines               North East      1.3      2.1      2.4
4 Amphetamines               North West      2.1      1.6      1.5
5 Amphetamines Yorkshire and the Humber      1.4      1.5      2.1
6 Amphetamines            East Midlands      1.2      1.4      1.4
  X2004.05 X2005.06 X2006.07 X2007.08 X2008.09 X2009.10 X2010.11 X2011.12
1      1.4      1.4      1.3      1.0      1.2      0.9      1.0      0.8
2      1.3      1.3      1.3      1.0      1.2      0.9      1.0      0.7
3      1.4      2.4      2.2      1.2      2.0      1.7      1.9      1.6
4      1.6      1.2      1.5      1.1      1.4      1.7      1.4      0.6
5      1.7      1.6      1.0      1.0      1.7      0.7      1.3      0.9
6      1.3      1.5      1.4      1.0      0.9      0.8      1.0      0.8
  X2012.13 X2013.14 X2014.15 X2015.16
1      0.6      0.8      0.6      0.6
2      0.6      0.7      0.6      0.6
3      0.9      1.0      0.6      0.8
4      0.6      0.8      0.8      0.5
5      0.8      0.6      0.9      0.7
6      0.7      0.6      0.5      0.4

This data presents a challenge.

The years are in different columns. If you remember from the last post on unemployment, our data was in a vertical time series.

In other words, the dates were stacked vertically in one column rather than across several different columns.

Remember our basic geom_line command, which goes like this:

ggplot([our_data], aes(x = [what goes on x axis],
                       y = [what goes on y axis])) + geom_line()

We want our y axis to be a single variable for this to work (if there are alternatives, please let me know in the comments).
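As a rough sketch of why the stacked layout matters, the toy data frame below keeps every year in one column, so a single y variable is enough for geom_line. The numbers and the names toy_long, region, year and value are made up for illustration and are not the survey data.

```r
library(ggplot2)

# Toy long-format data: one row per region per year (made-up values)
toy_long <- data.frame(
  region = rep(c("North", "South"), each = 3),
  year   = rep(c("2001/02", "2002/03", "2003/04"), times = 2),
  value  = c(1.3, 2.1, 2.4, 1.6, 1.9, 1.3)
)

# One x column and one y column; group and colour give one line per region
p <- ggplot(toy_long, aes(x = year, y = value,
                          group = region, colour = region)) +
  geom_line()
```

Printing p draws the chart. The group aesthetic is needed here because year is text rather than a number, and ggplot2 would otherwise not know which points to join.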

This is where reshape2 comes in.

Reshape2 is a package designed by Hadley Wickham. Sean Anderson has a detailed overview here.

It functions like a pivot table in a spreadsheet – you melt down data and then (if you want) cast it the way you want.

Melting our data is what we want to do here.

#make sure you have reshape2 installed in RStudio
library(reshape2)

#id.vars are the columns to keep; every other column gets stacked
mdata <- melt(data, id.vars = c("Drug", "Region"))

Here we are saying: ‘Keep “Drug” and “Region” as they are and stack all the other columns into two new ones: “variable” (the old column name) and “value” (the number it held)’.
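To see what melt does on something small, here is a minimal sketch with a made-up two-row table. The name toy_wide and its values are assumptions for illustration; only the column layout mirrors the survey data.

```r
library(reshape2)

# Toy wide data: two id columns plus three year columns (made-up values)
toy_wide <- data.frame(
  Drug     = c("Cannabis", "Cannabis"),
  Region   = c("North", "South"),
  X2001.02 = c(10.1, 9.5),
  X2002.03 = c(10.4, 9.8),
  X2003.04 = c(9.9, 9.2),
  stringsAsFactors = FALSE
)

# Stack the year columns; Drug and Region are repeated for every year
toy_melted <- melt(toy_wide, id.vars = c("Drug", "Region"))

# Rows after melting = original rows x number of stacked columns
rows_match <- nrow(toy_melted) == nrow(toy_wide) * 3
```

The same arithmetic applies to the real data: 84 rows and 15 year columns should melt to 84 x 15 = 1260 rows.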

This is the result:

> str(mdata)
'data.frame': 1260 obs. of 4 variables:
 $ Drug : chr "Amphetamines" "Amphetamines" "Amphetamines" "Amphetamines" ...
 $ Region : chr "England and Wales" "England" " North East " " North West " ...
 $ variable: Factor w/ 15 levels "X2001.02","X2002.03",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ value : num 1.5 1.5 1.3 2.1 1.4 1.2 1.7 0.8 1.6 1.9 ...

Excellent! Our years are now in one column, named variable.

Here is the head of our data:

          Drug                   Region variable value
1 Amphetamines        England and Wales X2001.02   1.5
2 Amphetamines                  England X2001.02   1.5
3 Amphetamines               North East X2001.02   1.3
4 Amphetamines               North West X2001.02   2.1
5 Amphetamines Yorkshire and the Humber X2001.02   1.4
6 Amphetamines            East Midlands X2001.02   1.2

Each row now holds one drug, one region, one year and one value.
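The labels in variable still look like "X2001.02", presumably because R prefixes column names that start with a digit. One possible tidy-up, sketched here on a toy vector, turns them into "2001/02" and also pulls out a numeric start year. The names year_label and start_year and the "2001/02" style are assumptions for illustration, not something the survey data dictates.

```r
# Toy stand-in for mdata$variable (a factor, as in the str() output)
variable <- factor(c("X2001.02", "X2002.03", "X2015.16"))

# Drop the leading X and swap the dot for a slash
year_label <- sub("^X([0-9]{4})\\.([0-9]{2})$", "\\1/\\2",
                  as.character(variable))

# Numeric start year, handy for a continuous x axis
start_year <- as.integer(substr(year_label, 1, 4))
```

On the real data the same calls could be pointed at mdata$variable instead of the toy vector.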

While we are here, we can clean the data a bit more thoroughly.

We have two types of drugs that aren’t very interesting: ‘Any drug’ and ‘Any Class A drug’. Let’s remove them.

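#remove unwanted drug types
unwanted <- grepl("^(Any Class A drug|Any drug)$", trimws(mdata$Drug))
mdata <- mdata[!unwanted, ]

This regular expression uses the | (OR) operator inside parentheses to match either name, with ^ and $ anchoring the match to the whole string. The ! then inverts the result, so we keep everything except this OR that. trimws guards against stray spaces like the ones in the Region column. (Square brackets would not do the job here: [^...] is a negated character class, which matches single characters rather than whole phrases.)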
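Before running a filter like this on the full data, it can help to try it on a small vector first. The sketch below uses a made-up handful of drug names and compares two approaches: an anchored alternation with grepl, and an exact comparison with %in%. The names drugs, unwanted_regex, unwanted_exact and kept are hypothetical.

```r
# Toy stand-in for mdata$Drug (made-up subset of names)
drugs <- c("Amphetamines", "Any drug", "Cannabis",
           "Any Class A drug", "Ecstasy")

# Approach 1: anchored alternation; ^ and $ force a whole-string match
unwanted_regex <- grepl("^(Any Class A drug|Any drug)$", drugs)

# Approach 2: exact comparison against a lookup vector
unwanted_exact <- drugs %in% c("Any Class A drug", "Any drug")

# Keep everything that was not flagged
kept <- drugs[!unwanted_regex]
```

Both approaches flag the same two entries here. The %in% version is arguably easier to read when the names are known exactly, while the regular expression is more flexible if the labels vary.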

We now have our data in the correct format!
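One further tidy-up that may be worth doing: the str() output shows region names padded with spaces, such as " North East ". A minimal sketch, using a toy vector in place of mdata$Region, trims them with base R's trimws. The names region and region_clean are assumptions for illustration.

```r
# Toy stand-in for mdata$Region, with the padding seen in the str() output
region <- c("England and Wales", " North East ", " North West ")

# Remove leading and trailing whitespace
region_clean <- trimws(region)
```

On the real data this would be mdata$Region <- trimws(mdata$Region), which should make later filtering by region name less error-prone.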

Next up in Part Two we will take a look at the drug habits of different regions.
