R for Journalists

Unlock the power of R

Cyclists Involved in Accidents Undertaking

Posted on February 15, 2017 By Rob
Last week I wrote some stories about cyclists involved in accidents while undertaking.

The stories generated a LOT of comment and debate: far more of it, and far more negative, than I was expecting.

I’ll address that in a moment, but first, here’s a useful exercise in how to get the Government’s STATS19 data and filter down to what you want.

What is STATS19?

We’ve come across STATS19 before. It is the set of police records of road traffic accidents in which at least one person was injured or killed.

Each accident has a unique Accident Index. The data is divided into three files for each year: accidents, vehicles and casualties. The accident index links the data across these three files.
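To see how the Accident Index ties the three files together, here is a minimal sketch on invented toy data. Only the Accident_Index and Vehicle_Type column names come from this post; the other columns and all the values are made up for illustration. It also shows a side effect worth knowing about: joining on the Accident Index alone gives one row per vehicle-casualty pair, so the number of rows is not the number of accidents.

```r
# Toy data only: values are invented, and the severity columns are
# assumptions used purely for illustration.
accidents_toy <- data.frame(
  Accident_Index = c("A1", "A2"),
  Accident_Severity = c(3, 2)
)
vehicles_toy <- data.frame(
  Accident_Index = c("A1", "A1", "A2"),
  Vehicle_Type = c(1, 9, 1)
)
casualties_toy <- data.frame(
  Accident_Index = c("A1", "A2", "A2"),
  Casualty_Severity = c(3, 2, 3)
)

# Join accidents to vehicles, then to casualties, on the shared key.
joined <- merge(accidents_toy, vehicles_toy, by = "Accident_Index")
joined <- merge(joined, casualties_toy, by = "Accident_Index")

# A1: 2 vehicles x 1 casualty = 2 rows; A2: 1 vehicle x 2 casualties = 2 rows.
nrow(joined)                            # 4 rows
length(unique(joined$Accident_Index))   # 2 distinct accidents
```
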

The data is published year by year, so the first thing to do is to get every year's files and combine them into one huge data frame.

#clean data and build one big file

vehicles2015 <- read.csv("Vehicles_2015.csv")
vehicles2014 <- read.csv("Vehicles_2014.csv")
vehicles2013 <- read.csv("Vehicles_2013.csv")
vehicles2012 <- read.csv("Vehicles_2012.csv")
vehicles2011 <- read.csv("Vehicles_2011.csv")
vehicles2010 <- read.csv("Vehicles_2010.csv")

accidents2015 <- read.csv("Accidents_2015.csv")
accidents2014 <- read.csv("DfTRoadSafety_Accidents_2014.csv")
accidents2013 <- read.csv("DfTRoadSafety_Accidents_2013.csv")
accidents2012 <- read.csv("DfTRoadSafety_Accidents_2012.csv")
accidents2011 <- read.csv("DfTRoadSafety_Accidents_2011.csv")
accidents2010 <- read.csv("DfTRoadSafety_Accidents_2010.csv")


casualties2015 <- read.csv("Casualties_2015.csv")
casualties2014 <- read.csv("DfTRoadSafety_Casualties_2014.csv")
casualties2013 <- read.csv("DfTRoadSafety_Casualties_2013.csv")
casualties2012 <- read.csv("DfTRoadSafety_Casualties_2012.csv")
casualties2011 <- read.csv("DfTRoadSafety_Casualties_2011.csv")
casualties2010 <- read.csv("DfTRoadSafety_Casualties_2010.csv")


vehicles2013 <- as.data.frame(append(vehicles2013, list(Age_of_Driver = NA), after = 15))
vehicles2012 <- as.data.frame(append(vehicles2012, list(Age_of_Driver = NA), after = 15))
vehicles2011 <- as.data.frame(append(vehicles2011, list(Age_of_Driver = NA), after = 15))
vehicles2010 <- as.data.frame(append(vehicles2010, list(Age_of_Driver = NA), after = 15))

names(vehicles2013)[1] <- "Accident_Index"
names(vehicles2012)[1] <- "Accident_Index"
names(vehicles2011)[1] <- "Accident_Index"
names(vehicles2010)[1] <- "Accident_Index"

names(vehicles2010)[13] <- "Was_Vehicle_Left_Hand_Drive"
names(vehicles2011)[13] <- "Was_Vehicle_Left_Hand_Drive"
names(vehicles2012)[13] <- "Was_Vehicle_Left_Hand_Drive"
names(vehicles2013)[13] <- "Was_Vehicle_Left_Hand_Drive"
names(vehicles2014)[13] <- "Was_Vehicle_Left_Hand_Drive"
names(vehicles2015)[13] <- "Was_Vehicle_Left_Hand_Drive"

vehicles2015 <- vehicles2015[, 1:22]

vehicles <- rbind(vehicles2010, vehicles2011, vehicles2012, vehicles2013, vehicles2014, vehicles2015)

names(accidents2014)[1] <- "Accident_Index"
accidents <- rbind(accidents2010, accidents2011, accidents2012, accidents2013, accidents2014, accidents2015)

casualties2010 <- as.data.frame(append(casualties2010, list(Age_of_Casualty = NA), after = 5))
casualties2011 <- as.data.frame(append(casualties2011, list(Age_of_Casualty = NA), after = 5))
casualties2012 <- as.data.frame(append(casualties2012, list(Age_of_Casualty = NA), after = 5))
casualties2013 <- as.data.frame(append(casualties2013, list(Age_of_Casualty = NA), after = 5))

names(casualties2010)[1] <- "Accident_Index"
names(casualties2011)[1] <- "Accident_Index"
names(casualties2012)[1] <- "Accident_Index"
names(casualties2013)[1] <- "Accident_Index"
names(casualties2014)[1] <- "Accident_Index"
casualties2015 <- casualties2015[, 1:15]

casualties <- rbind(casualties2010, casualties2011, casualties2012, casualties2013, casualties2014, casualties2015)

full <- merge(x = accidents, y = vehicles, by = "Accident_Index")
full <- merge(x = full, y = casualties, by = "Accident_Index")

Above we are cleaning the data. The rbind function only works if the data frames have the same number of columns, identically labelled. These files do not all have the same number of columns; additional fields were added in later years. So we add dummy columns of NA where necessary and change the labels so they match.
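The fix above relies on knowing the column positions for each year. As an alternative sketch, not the method used for the stories, the missing columns can be worked out by name. The helper align_columns is hypothetical, and the toy data frames stand in for an older and a newer year's file.

```r
# Hypothetical helper: add any column in `all_cols` that `df` lacks, filled
# with NA, then return the columns in a common order.
align_columns <- function(df, all_cols) {
  missing_cols <- setdiff(all_cols, names(df))
  for (col in missing_cols) {
    df[[col]] <- NA
  }
  df[, all_cols, drop = FALSE]
}

# Toy stand-ins: the newer file has an extra column, as in the real data.
older <- data.frame(Accident_Index = c("A1", "A2"), Vehicle_Type = c(1, 9))
newer <- data.frame(Accident_Index = "A3", Vehicle_Type = 1, Age_of_Driver = 34)

all_cols <- union(names(newer), names(older))
combined <- rbind(align_columns(older, all_cols),
                  align_columns(newer, all_cols))
combined
```

This only handles columns that are absent. Columns that exist under a different name in some years, such as the first column here, would still need renaming by hand.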

Take a look at the append function for adding in columns in between others.
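Here is a small sketch of that append idiom on a toy data frame, so you can see where the new column lands. The column names are invented.

```r
df <- data.frame(a = 1:2, b = 3:4, c = 5:6)

# append() treats the data frame as a list of columns, so `after = 1`
# slots the new column in after the first one. as.data.frame() then
# rebuilds the data frame and recycles the single NA down every row.
df2 <- as.data.frame(append(df, list(new_col = NA), after = 1))

names(df2)  # "a" "new_col" "b" "c"
```
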

Cyclists only

The next step is to keep only the cases where the vehicle was a bike. In the guidance that accompanies the data, Vehicle_Type code 1 means pedal cycle, so we filter on that.

cyclists <- full[full$Vehicle_Type == 1, ]

Undertaking

The term in the data is ‘overtaking – nearside’ (in other words, undertaking), code 15. It is a simple task to filter down for this code in the ‘vehicle manoeuvre’ column.

undertaking <- cyclists[cyclists$Vehicle_Manoeuvre == 15, ]

That is really all there is to getting the data we want.
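One thing to watch, sketched below on invented toy data: because the merged table can hold several rows for the same accident, counting rows and counting accidents may give different answers. Wrapping the condition in which() is an optional extra that drops any rows where a code is NA, rather than returning them as blank rows.

```r
# Toy stand-in for the merged table; values are invented.
full_toy <- data.frame(
  Accident_Index = c("A1", "A1", "A2", "A3", "A4"),
  Vehicle_Type = c(1, 1, 1, 9, NA),
  Vehicle_Manoeuvre = c(15, 15, 18, 15, 15)
)

# Pedal cycles (code 1) with manoeuvre code 15, in one step.
is_match <- full_toy$Vehicle_Type == 1 & full_toy$Vehicle_Manoeuvre == 15
undertaking_toy <- full_toy[which(is_match), ]

nrow(undertaking_toy)                            # 2 rows
length(unique(undertaking_toy$Accident_Index))   # 1 distinct accident
```
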

To make it faster to write the stories, I exported the data using write.table and plotted it in Google Fusion Tables to see where the accidents were.
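A minimal sketch of that export step, assuming a comma-separated file is wanted. The toy data frame and its coordinate values are invented, and the file is written to a temporary path so the example runs anywhere.

```r
# Toy stand-in for the filtered data; coordinates are invented.
export_toy <- data.frame(
  Accident_Index = c("A1", "A2"),
  Longitude = c(-0.12, -0.10),
  Latitude = c(51.51, 51.52)
)

# Write a comma-separated file with a header row and no row numbers.
out_file <- tempfile(fileext = ".csv")
write.table(export_toy, file = out_file, sep = ",",
            row.names = FALSE, col.names = TRUE)

# Read it back to check the export looks right.
read_back <- read.csv(out_file)
read_back
```
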

A section of the Fusion table in central London

Some thoughts on the data

There were 2,823 accidents while cyclists were undertaking between 2010 and 2015.

More than half of them were in London.

According to the latest Department for Transport survey, 14.7 per cent of Londoners cycled at least once a month in 2014/15, exactly on a par with the England average.

So people in the capital are no more or less likely to cycle than average.

London makes up 13.7 per cent of the population but somehow 58 per cent of the accidents while undertaking.
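To put a number on that gap, here is the arithmetic using only the figures quoted in this post. Both percentages are rounded, so treat the results as approximate.

```r
total_accidents <- 2823            # accidents while undertaking, 2010 to 2015
london_share_accidents <- 58       # per cent of those accidents
london_share_population <- 13.7    # per cent of the population

# How over-represented is London relative to its share of the population?
ratio <- london_share_accidents / london_share_population
round(ratio, 1)  # 4.2

# Rough number of London accidents implied by the rounded 58 per cent.
round(total_accidents * london_share_accidents / 100)  # 1637
```

In other words, London has roughly four times as many of these accidents as its share of the population would suggest.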

The rest of the accidents, as you might expect, are in cities – including ones where cycling is popular such as Cambridge, York and Exeter.

Some thoughts on my stories

The publicly-available yearly STATS19 data is neutral on ‘blame’ for accidents.

There really is no way to tell whether one party, another or a third is at fault in any way from the data.

The Highway Code seems to allow ‘filtering’ by cyclists – passing through slow-moving or stationary vehicle traffic either on the left or the right.

Filtering on the left (nearside) obviously leaves the cyclist little room for manoeuvre: they can quickly run out of space between a car and the kerb if a driver turns or moves to the left.

After the story was published I had a chat with a transport campaigner – he wants, among other things, to encourage more people to take to cycling on the road. He told me that cyclists often get frustrated with stories about cycling in the news. They often feel that it is portrayed as more dangerous than it actually is.

I am a cyclist myself – you can find me pedalling along Oldham Road in Manchester in the mornings and afternoons to and from work – but it wasn’t a point I understood as well before. The data is accurate, but of course it’s just one small facet of accidents on the roads in Britain.

 

Picture © Copyright Albert Bridge and licensed for reuse under this Creative Commons Licence.
