Here is my first Shiny app! Shiny lets you create interactive visualisations in R. It’s a big step forward from the static visualisations we have done thus far. R has a fairly steep learning curve at the beginning. It took me several months and a DataCamp course before I began to know what I was … Read More “My First Shiny App: See Where Your Team Ranks in the Football Pyramid” »
Year: 2017
Often I use R to handle large datasets, analyse the data and filter out the data I don’t need. When all this is done, I usually use write.csv() to print my data off and reopen it in Google Sheets. My workflow would look something like this: full_data <- read.csv("some_dataset.csv") #R analysis ending up with relevant_data write.csv(relevant_data, … Read More “How to Use googlesheets to Connect R to Google Sheets” »
On Tuesday I gave a workshop at the Data Journalism UK conference, run by Paul Bradshaw. This was the worked example for absolute beginners that we went through. If you’ve never looked at R before and want to run some R code, load up this page, copy the following in step by step … Read More “R for Absolute Beginners” »
Recently the British Department for Transport published its latest STATS19 data for the year 2016. We’ve looked at this data before. To recap, each row of the STATS19 data is a traffic accident that caused injury or death, identified by a unique Accident Index. It’s an extremely detailed dataset containing fields such as the latitude … Read More “Road accidents in November” »
Over the past two weeks I’ve been looking at Network Rail’s delays data. The data tells us how many delays there have been to trains thanks to all kinds of problems that affect the railways, from natural causes such as that seasonal favourite ‘leaves on the line’ to human causes such as vandalism. There are … Read More “Vandalism Causing Train Delays” »
Back in August 2014, around the 100th anniversary of the outbreak of the First World War, the Data Unit published our analysis of the Commonwealth War Graves Commission’s records of fallen soldiers, airmen, sailors and other servicemen and women who gave their lives during the next four years. As the 100th anniversaries have come and … Read More “The Losses in the Final Year of WW1” »
Over the past few years a good source of data has been Parliament’s petitions website. Anyone can start a petition or sign one. MPs have to consider the ones that get to 100,000 signatures for debate. The most popular petitions often end up leading the news cycle, such as the petition arguing for a second EU … Read More “Scraping in R: Access to mortgage petition” »
It’s time to branch out into a new area of data visualisation: proportional area plots. These plots use area to show the proportions between related values. A common type of proportional area plot is the tree map. We are going to be using the same principle but with circles. A common subject for area visualisation is … Read More “Spring Budget 2017: Circle visualisation” »
R has a lot of packages for users to analyse posts on social media. As an experiment in this field, I decided to start with the biggest one: Facebook. I decided to look at the Facebook activity of Donald Trump and Hillary Clinton during the 2016 presidential election in the United States. The winner may … Read More “Comparing Donald Trump and Hillary Clinton’s Facebook pages during the US presidential election, 2016” »
Earlier this month Marie Segger, Carlos Novoa and I had a major new project published about different rail speeds between cities around Britain. We compared the distances between train stations in Britain’s largest cities and found which areas were poorly served by slow trains. Our project was picked up by an MP for Plymouth, a city … Read More “Calculating Distances in R: How Fast is Your Train?” »