Wikipedia’s page for the 2015/16 Premier League has a matrix table of results.
It shows every result for each team, home and away, against every other team.
For example, if you look at Arsenal in row 1, they beat Aston Villa 4-0 at the Emirates but lost their home match against Chelsea 0-1.
How to recreate this in R:
First of all, we need the source data from James Curley [full citation below]. He has compiled results for every match in the top four tiers of English league football from 1888 to 2015, as well as the FA Cup, Champions League and other European results. His page has instructions for installing the package in R.
Once that’s done, run this command:
data(package="engsoccerdata")
It lists the datasets within the package. The one containing league matches is called ‘england’.
We just want last season’s Premier League data. To get it, we first select the rows for last season, then keep only Tier 1, storing the result as ‘premier_league201516’.
Here’s how we do that:
#load the package so the 'england' data frame is available
library(engsoccerdata)
#get just last season's data
season <- grep("2015", england$Season)
last_season <- england[season, ]
#get just last season's Premier League (i.e. Tier 1)
premier_league <- grep(1, last_season$tier)
premier_league201516 <- last_season[premier_league, ]
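The same selection can also be sketched with logical comparisons instead of grep(), which avoids accidental partial matches (grep(1, ...) would also match a tier numbered 10, for instance). The toy data frame below is a hypothetical stand-in for 'england', with made-up rows and only the columns used in the filter, so the idea can be tried without the package installed:

```r
# Hypothetical stand-in for the 'england' data frame (illustrative rows only)
toy <- data.frame(
  Season  = c(2014, 2015, 2015, 2015),
  tier    = c(1, 1, 2, 1),
  home    = c("Arsenal", "Arsenal", "Leeds United", "Chelsea"),
  visitor = c("Chelsea", "Aston Villa", "Burnley", "Arsenal"),
  stringsAsFactors = FALSE
)

# Keep rows where both conditions hold: season 2015 and tier 1
toy_pl <- toy[toy$Season == 2015 & toy$tier == 1, ]

nrow(toy_pl)  # 2
```

With the real data, the equivalent line would be england[england$Season == 2015 & england$tier == 1, ], assuming Season and tier are stored as numbers.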
Having got the data we want, it’s time to build the plot. This code uses the package ggplot2.
I have broken the code into commented steps so you can see what each one does. Running the steps one at a time in RStudio won’t be of much use, because the plot only appears after the last one, so select it all and run it at once. If it fails, check that each of the five plotting commands sits on a single line:
#load ggplot2
library(ggplot2)
#plot graph called 'season_matrix' using ggplot2
season_matrix <- ggplot(premier_league201516, aes(x = home, y = visitor, color = result)) + geom_point(size = 2)
#make the x axis labels 90 degrees
season_matrix <- season_matrix + theme(axis.text.x = element_text(angle = 90))
#add a title
season_matrix <- season_matrix + ggtitle("Premier League 2015/16 results matrix")
#add a legend and colour scheme
season_matrix <- season_matrix + scale_color_manual(name = "Result", labels = c("Away win", "Draw", "Home win"), values = c("#E69F00", "#999999", "#56B4E9"))
#add axes labels
season_matrix + labs(x = "Home team", y = "Away team")
Resulting graphic:
Analysis:
It’s a nice, clean graphic showing the outcomes of 380 football matches. The teams are listed in alphabetical order.
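The figure of 380 follows from the league format: each of the 20 teams hosts each of the other 19 once. A quick sketch of that arithmetic, using placeholder team names rather than the real ones:

```r
# Placeholder names standing in for the 20 Premier League teams
teams <- paste("Team", 1:20)

# Every home/visitor pairing, then drop the pairings of a team with itself
fixtures <- expand.grid(home = teams, visitor = teams, stringsAsFactors = FALSE)
fixtures <- fixtures[fixtures$home != fixtures$visitor, ]

nrow(fixtures)  # 380
```

Comparing this with nrow(premier_league201516) is a simple way to check that the filtering step kept exactly one season of one division.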
Starting in the left-hand corner, Bournemouth lost to Arsenal, Aston Villa and Chelsea at the Vitality Stadium in their first season in the Premier League.
However, they did secure a famous win against Manchester United at home. Junior Stanislas scored directly from a corner that day.
Leicester’s title-winning form is also plain to see – they lost just once at home all season, by five goals to two against Arsenal.
Reading left to right, we can also see that Arsenal beat Leicester at home, making them the only team to do the double over the champions last season. Danny Welbeck scored a late, late header to win that game 2-1 at the Emirates.
See if you can spot West Ham’s surprisingly good away form and just how dreadful Aston Villa were all season long.
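If you would rather count than squint, away form can be tallied with table(). This is only a sketch on made-up data: the teams "A", "B" and "C" are hypothetical, and it assumes the result column is coded "H", "D" and "A" for home win, draw and away win, which matches the three legend labels in the plot but is worth checking against the real data first.

```r
# Made-up mini league with the same column names as the plot uses
toy <- data.frame(
  home    = c("A", "B", "C", "A", "B", "C"),
  visitor = c("B", "C", "A", "C", "A", "B"),
  result  = c("H", "A", "D", "A", "A", "H"),
  stringsAsFactors = FALSE
)

# Away wins per visiting team, best travellers first
away_wins <- sort(table(toy$visitor[toy$result == "A"]), decreasing = TRUE)

# Home defeats per home team (an away win is a loss for the hosts)
home_defeats <- sort(table(toy$home[toy$result == "A"]), decreasing = TRUE)

away_wins
home_defeats
```

Swapping toy for premier_league201516 should give the real counts, under the same assumption about how result is coded.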
I’ll be returning to this dataset in the future.
Full data citation:
James P. Curley (2016). engsoccerdata: English Soccer Data 1871-2016. R package version 0.1.5.