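Here is my first Shiny app! Select a football team and the app will plot where the team has ranked in the top four divisions of English football.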
Shiny lets you create interactive visualisations in R.
It’s a big step forward from the static visualisations we have done thus far.
R has a fairly steep learning curve at the beginning. It took me several months and a DataCamp course before I began to know what I was doing.
Shiny has another learning curve on top of that: the code and syntax are a bit different from regular R code.
If you persevere with it, though, you can create some brilliant, clear visualisations.
In Part I of this post I will show you how I got the data together in regular R and in Part II we will delve into the Shiny code itself.
The source data is the engsoccerdata package
We have worked with this data before when we looked at a Boxing Day league table. Happily, James Curley has since updated the data with last season’s results [Full citation is at the end of this post].
Let’s take another look at the data to familiarise ourselves with it:
library(engsoccerdata)
> str(england)
'data.frame':	194040 obs. of  12 variables:
 $ Date    : Date, format: "1888-12-15" "1889-01-19" "1889-03-23" ...
 $ Season  : num  1888 1888 1888 1888 1888 ...
 $ home    : chr  "Accrington F.C." "Accrington F.C." "Accrington F.C." "Accrington F.C." ...
 $ visitor : chr  "Aston Villa" "Blackburn Rovers" "Bolton Wanderers" "Burnley" ...
 $ FT      : chr  "1-1" "0-2" "2-3" "5-1" ...
 $ hgoal   : int  1 0 2 5 6 3 1 0 2 2 ...
 $ vgoal   : int  1 2 3 1 2 1 2 0 0 1 ...
 $ division: chr  "1" "1" "1" "1" ...
 $ tier    : num  1 1 1 1 1 1 1 1 1 1 ...
 $ totgoal : int  2 2 5 6 8 4 3 0 2 3 ...
 $ goaldif : int  0 -2 -1 4 4 2 -1 0 2 1 ...
 $ result  : chr  "D" "A" "A" "H" ...
There is also a maketable() function within the package that creates a league table for a division in England’s top four tiers for a particular season:
> maketable(england, tier = 1, pts = 3, Season = 2016)
                   team GP  W  D  L gf ga  gd Pts Pos
 1              Chelsea 38 30  3  5 85 33  52  93   1
 2    Tottenham Hotspur 38 26  8  4 86 26  60  86   2
 3      Manchester City 38 23  9  6 80 39  41  78   3
 4            Liverpool 38 22 10  6 78 42  36  76   4
 5              Arsenal 38 23  6  9 77 44  33  75   5
 6    Manchester United 38 18 15  5 54 29  25  69   6
 7              Everton 38 17 10 11 62 44  18  61   7
 8          Southampton 38 12 10 16 41 48  -7  46   8
 9      AFC Bournemouth 38 12 10 16 55 67 -12  46   9
10 West Bromwich Albion 38 12  9 17 43 51  -8  45  10
11      West Ham United 38 12  9 17 47 64 -17  45  11
12       Leicester City 38 12  8 18 48 63 -15  44  12
13           Stoke City 38 11 11 16 41 56 -15  44  13
14       Crystal Palace 38 12  5 21 50 63 -13  41  14
15         Swansea City 38 12  5 21 45 70 -25  41  15
16              Burnley 38 11  7 20 39 55 -16  40  16
17              Watford 38 11  7 20 40 68 -28  40  17
18            Hull City 38  9  7 22 37 80 -43  34  18
19        Middlesbrough 38  5 13 20 27 53 -26  28  19
20           Sunderland 38  6  6 26 29 69 -40  24  20
We want to look at league positions over time. We’ll take the 1958/59 season as a starting point because that is when the old Fourth Division came into existence. England’s top four tiers have changed names and sizes since then but essentially they have remained the same.
However, the points awarded for a win moved from two to three starting from the 1981/82 season.
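To see what the rule change does to a points total, here is a small arithmetic sketch. The helper name `league_points()` is made up for illustration, and it assumes the usual one point for a draw; the win and draw counts are Chelsea's from the 2016 table above.

```r
# Hypothetical helper: points total under a given points-per-win rule,
# assuming one point for a draw and none for a defeat
league_points <- function(wins, draws, pts_per_win) {
  wins * pts_per_win + draws
}

# Chelsea in the 2016 table above: 30 wins, 3 draws
league_points(30, 3, pts_per_win = 3)  # 93, matching the Pts column
league_points(30, 3, pts_per_win = 2)  # 63 under the old two-point rule
```

This is why the `pts` argument passed to maketable() has to change depending on the season.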
There is probably a neater way to handle the change, but I decided to loop through the data in two stages, one with two points for a win (1958/59 to 1980/81) and one with three points for a win (1981/82 onwards), adding each league table to a data frame:
library(engsoccerdata)
library(shiny)
library(plyr)
library(dplyr)
library(tidyr)
library(ggplot2)
#create blank data frames
tier1_data <- data.frame()
tier1 <- data.frame()
tier2_data <- data.frame()
tier2 <- data.frame()
tier3_data <- data.frame()
tier3 <- data.frame()
tier4_data <- data.frame()
tier4 <- data.frame()
#set the first season
j = 1958
#2pts for a win
for (i in 1:23) {
  #create a top tier league table for each season and add it to the previous
  tier1 <- maketable(england, tier = 1, pts = 2, Season = j)
  tier1$season <- j
  tier1$tier <- 1
  tier1_data <- rbind(tier1_data, tier1)
  #repeat for the other tiers
  tier2 <- maketable(england, tier = 2, pts = 2, Season = j)
  tier2$season <- j
  tier2$tier <- 2
  tier2_data <- rbind(tier2_data, tier2)
  tier3 <- maketable(england, tier = 3, pts = 2, Season = j)
  tier3$season <- j
  tier3$tier <- 3
  tier3_data <- rbind(tier3_data, tier3)
  tier4 <- maketable(england, tier = 4, pts = 2, Season = j)
  tier4$season <- j
  tier4$tier <- 4
  tier4_data <- rbind(tier4_data, tier4)
  #move on to the next season
  j <- j + 1
}
#3pts for a win
for (i in 1:36) {
  #create a top tier league table for each season and add it to the previous
  tier1 <- maketable(england, tier = 1, pts = 3, Season = j)
  tier1$season <- j
  tier1$tier <- 1
  tier1_data <- rbind(tier1_data, tier1)
  #repeat for the other tiers
  tier2 <- maketable(england, tier = 2, pts = 3, Season = j)
  tier2$season <- j
  tier2$tier <- 2
  tier2_data <- rbind(tier2_data, tier2)
  tier3 <- maketable(england, tier = 3, pts = 3, Season = j)
  tier3$season <- j
  tier3$tier <- 3
  tier3_data <- rbind(tier3_data, tier3)
  tier4 <- maketable(england, tier = 4, pts = 3, Season = j)
  tier4$season <- j
  tier4$tier <- 4
  tier4_data <- rbind(tier4_data, tier4)
  #move on to the next season
  j <- j + 1
}
#combine all the data together
tables <- data.frame()
tables <- rbind(tier1_data, tier2_data, tier3_data, tier4_data)
tables$Pos <- as.numeric(tables$Pos)
We now have all the league tables in one big data frame.
Each league table within the data has a tier field and a season field so we can identify it.
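For anyone curious about the "neater way", one possible sketch is to pick the points rule from the season and build every season/tier combination in a single pass. This is not the code used for the app: `points_for_win()`, `build_tables()` and `fake_table()` are hypothetical names, and `fake_table()` merely stands in for maketable() so the example runs without the package installed.

```r
# Points for a win: two up to 1980/81, three from 1981/82 onwards
points_for_win <- function(season) ifelse(season >= 1981, 3, 2)

# Hypothetical helper: build one table per season/tier and stack them.
# `make_table` is any function(season, tier, pts) that returns a data frame.
build_tables <- function(seasons, tiers, make_table) {
  grid <- expand.grid(season = seasons, tier = tiers)
  pieces <- Map(function(season, tier) {
    tab <- make_table(season, tier, points_for_win(season))
    tab$season <- season
    tab$tier <- tier
    tab
  }, grid$season, grid$tier)
  do.call(rbind, pieces)
}

# Stand-in for maketable() so the sketch is self-contained
fake_table <- function(season, tier, pts) {
  data.frame(team = c("A", "B"), Pts = c(2 * pts, pts), Pos = 1:2)
}

demo <- build_tables(1980:1981, 1:2, fake_table)
nrow(demo)  # 8 rows: 2 seasons x 2 tiers x 2 teams

# With the real data, the call would look something like:
# tables <- build_tables(1958:2016, 1:4, function(season, tier, pts)
#   maketable(england, tier = tier, pts = pts, Season = season))
```

The trade-off is that the two explicit loops are easier to follow step by step, while the helper avoids repeating the same four lines for every tier.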
The challenge now is how to deal with shifting league sizes
In 1995/96 the Premier League switched to its current 20-club format. Before that, there were 22 teams in the top flight. The sizes of the other leagues have changed in the past 60 years as well.
This poses a challenge: at the moment the team that wins the Championship (the second tier) finishes 21st in the overall footballing pyramid. But this wasn’t always the case in the old Second Division – so how can we adjust for the different league sizes?
The first thing we can do is work out the size of each tier for each season:
league_size <- tables %>% dplyr::group_by(season, tier) %>% dplyr::summarise(n = n())
league_size <- league_size %>% spread(key = tier, value = n)
> head(league_size)
# A tibble: 6 x 5
# Groups:   season [6]
  season   `1`   `2`   `3`   `4`
   <dbl> <int> <int> <int> <int>
1   1958    22    22    24    24
2   1959    22    22    24    24
3   1960    22    22    24    24
4   1961    22    22    24    24
5   1962    22    22    24    24
6   1963    22    22    24    24
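A quick way to sanity-check these counts without dplyr or tidyr is base R's table(), which cross-tabulates season against tier directly. The sketch below uses a tiny made-up data frame (`tables_demo`) rather than the real `tables` object, so the numbers are purely illustrative.

```r
# Toy stand-in for the combined league tables: one row per club per season
tables_demo <- data.frame(
  season = c(1958, 1958, 1958, 1959, 1959),
  tier   = c(1, 1, 2, 1, 2)
)

# Rows are seasons, columns are tiers, cells are the number of clubs
size_check <- table(season = tables_demo$season, tier = tables_demo$tier)
size_check

# Look up one cell by season and tier
size_check["1958", "1"]
```

Run against the real `tables` data frame, the same call should give the same counts as `league_size`, just as a contingency table rather than a tibble.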
Now we can just add the columns cumulatively to get figures for how many teams finish above each tier for each season.
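league_size$tier2 <- league_size$`1`
league_size$tier3 <- league_size$`1` + league_size$`2`
league_size$tier4 <- league_size$`1` + league_size$`2` + league_size$`3`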
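Once the number of clubs above each tier is known, a club's overall position in the pyramid is its league position plus that offset. The following is a minimal sketch under stated assumptions: the two rows of league sizes are toy inputs in the wide layout shown above, and the `above` matrix and `overall_position()` helper are hypothetical names rather than code from the app.

```r
# Toy league sizes in the wide layout shown above
league_size_demo <- data.frame(
  season = c(1958, 1995),
  `1` = c(22L, 20L), `2` = c(22L, 24L), `3` = c(24L, 24L), `4` = c(24L, 24L),
  check.names = FALSE
)

# Clubs finishing above each tier: 0 for tier 1, then running totals
sizes <- as.matrix(league_size_demo[, c("1", "2", "3")])
above <- cbind(0L, t(apply(sizes, 1, cumsum)))
colnames(above) <- paste0("tier", 1:4)
rownames(above) <- league_size_demo$season

# Hypothetical helper: league position plus the clubs in the tiers above
overall_position <- function(pos, tier, season) {
  pos + above[cbind(as.character(season), paste0("tier", tier))]
}

overall_position(pos = 1, tier = 2, season = 1995)  # 21: winner below a 20-club top flight
overall_position(pos = 1, tier = 2, season = 1958)  # 23: winner below a 22-club top flight
```

The same second-tier title is worth 21st place in one season and 23rd in another, which is exactly the adjustment the cumulative columns make possible.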
Here is my first Shiny app!
Select a football team and the app will plot where the team has ranked in the top four divisions of English football:
Shiny lets you create interactive visualisations in R.
It’s a big step forward from the static visualisations we have done thus far.
R has a fairly steep learning curve at the beginning. It took me several months and a DataCamp course before I began to know what I was doing.
Shiny has another learning curve on top of that – the code and syntax is a bit different from regular R code.
If you persevere with though you can create some brilliant, clear visualisations.
In Part I of this post I will show you how I got the data together in regular R and in Part II we will delve into the Shiny code itself.
The source data is the engsoccerdata package
We have worked with this data before when we looked at a Boxing Day league table. Happily James Curley has updated the data with last season’s results since then [Full citation is at the end of this post].
Let’s take another look at the data to familiarise ourselves with it:
library(engsoccerdata) > str(england) 'data.frame': 194040 obs. of 12 variables: $ Date : Date, format: "1888-12-15" "1889-01-19" "1889-03-23" ... $ Season : num 1888 1888 1888 1888 1888 ... $ home : chr "Accrington F.C." "Accrington F.C." "Accrington F.C." "Accrington F.C." ... $ visitor : chr "Aston Villa" "Blackburn Rovers" "Bolton Wanderers" "Burnley" ... $ FT : chr "1-1" "0-2" "2-3" "5-1" ... $ hgoal : int 1 0 2 5 6 3 1 0 2 2 ... $ vgoal : int 1 2 3 1 2 1 2 0 0 1 ... $ division: chr "1" "1" "1" "1" ... $ tier : num 1 1 1 1 1 1 1 1 1 1 ... $ totgoal : int 2 2 5 6 8 4 3 0 2 3 ... $ goaldif : int 0 -2 -1 4 4 2 -1 0 2 1 ... $ result : chr "D" "A" "A" "H" ...
There is also a maketable() function within the package that creates a league table for a division in England’s top four tiers for a particular season:
> maketable(england, tier = 1, pts = 3, Season = 2016) team GP W D L gf ga gd Pts Pos 1 Chelsea 38 30 3 5 85 33 52 93 1 2 Tottenham Hotspur 38 26 8 4 86 26 60 86 2 3 Manchester City 38 23 9 6 80 39 41 78 3 4 Liverpool 38 22 10 6 78 42 36 76 4 5 Arsenal 38 23 6 9 77 44 33 75 5 6 Manchester United 38 18 15 5 54 29 25 69 6 7 Everton 38 17 10 11 62 44 18 61 7 8 Southampton 38 12 10 16 41 48 -7 46 8 9 AFC Bournemouth 38 12 10 16 55 67 -12 46 9 10 West Bromwich Albion 38 12 9 17 43 51 -8 45 10 11 West Ham United 38 12 9 17 47 64 -17 45 11 12 Leicester City 38 12 8 18 48 63 -15 44 12 13 Stoke City 38 11 11 16 41 56 -15 44 13 14 Crystal Palace 38 12 5 21 50 63 -13 41 14 15 Swansea City 38 12 5 21 45 70 -25 41 15 16 Burnley 38 11 7 20 39 55 -16 40 16 17 Watford 38 11 7 20 40 68 -28 40 17 18 Hull City 38 9 7 22 37 80 -43 34 18 19 Middlesbrough 38 5 13 20 27 53 -26 28 19 20 Sunderland 38 6 6 26 29 69 -40 24 20
We want to look at league positions over time. We’ll take the 1958/59 season as a starting point because that is when the old Fourth Division came into existence. England’s top four tiers have changed names and sizes since then but essentially they have remained the same.
However the points awarded for a win moved from two to three starting from the 1981/82 season.
There is probably a neater way to do this, but I decided to loop through the data in two stages – one with two points for a win from 1958/59 to 1980/81 and one for three points for a win from 1981/82 onwards and add it to a data frame:
library(engsoccerdata) library(shiny) library(plyr) library(dplyr) library(tidyr) library(ggplot2)
#create blank data frames
tier1_data <- data.frame()
tier1 <- data.frame()
tier2_data <- data.frame()
tier2 <- data.frame()
tier3_data <- data.frame()
tier3 <- data.frame()
tier4_data <- data.frame()
tier4 <- data.frame()
#set the first season
j = 1958
#2pts for a win
for (i in 1:23) {
#create a top tier league table for each season and add it to the previous
tier1 <- maketable(england, tier = 1, pts = 2, Season = j)
tier1$season <- j
tier1$tier <- 1
tier1_data <- rbind(tier1_data, tier1)
#repeat for the other tiers
tier2 <- maketable(england, tier = 2, pts = 2, Season = j)
tier2$season <- j
tier2$tier <- 2
tier2_data <- rbind(tier2_data, tier2)
tier3 <- maketable(england, tier = 3, pts = 2, Season = j)
tier3$season <- j
tier3$tier <- 3
tier3_data <- rbind(tier3_data, tier3)
tier4 <- maketable(england, tier = 4, pts = 2, Season = j)
tier4$season <- j
tier4$tier <- 4
tier4_data <- rbind(tier4_data, tier4)
j = j+1
}
#3pts for a win
for (i in 1:36) {
#create a top tier league table for each season and add it to the previous
tier1 <- maketable(england, tier = 1, pts = 3, Season = j)
tier1$season <- j
tier1$tier <- 1
tier1_data <- rbind(tier1_data, tier1)
#repeat for the other tiers
tier2 <- maketable(england, tier = 2, pts = 3, Season = j)
tier2$season <- j
tier2$tier <- 2
tier2_data <- rbind(tier2_data, tier2)
tier3 <- maketable(england, tier = 3, pts = 3, Season = j)
tier3$season <- j
tier3$tier <- 3
tier3_data <- rbind(tier3_data, tier3)
tier4 <- maketable(england, tier = 4, pts = 3, Season = j)
tier4$season <- j
tier4$tier <- 4
tier4_data <- rbind(tier4_data, tier4)
j = j+1
}
#combine all the data together
tables <- data.frame()
tables <- rbind(tier1_data, tier2_data, tier3_data, tier4_data)
tables$Pos <- as.numeric(tables$Pos)
We now have all the league tables in one big data frame.
Each league table within the data has a tier field and a season field so we can identify it.
The challenge now is how to deal with shifting league sizes
In 1995/96 the Premier League switched to its current 20 club format. Before that, there were 22 teams in the top-flight. The sizes of the other leagues have changed in the past 60 years as well.
This poses a challenge: at the moment the team that wins the Championship (the second tier) finishes 21st in the overall footballing pyramid. But this wasn’t always the case in the old Second Division – so how can we adjust for the different league sizes?
The first thing we can do is work out the size of each tier for each season:
league_size <- tables %>% dplyr::group_by(season, tier) %>% dplyr::summarise(n = n())
league_size <- league_size %>% spread(key = tier, value = n)
> head(league_size)
# A tibble: 6 x 5
# Groups: season [6]
season `1` `2` `3` `4`
<dbl> <int> <int> <int> <int>
1 1958 22 22 24 24
2 1959 22 22 24 24
3 1960 22 22 24 24
4 1961 22 22 24 24
5 1962 22 22 24 24
6 1963 22 22 24 24
Now we can just add the columns cumulatively to get figures for how many teams finish above each tier for each season.
1` league_size$tier3 <- league_size
Here is my first Shiny app!
Select a football team and the app will plot where the team has ranked in the top four divisions of English football:
Shiny lets you create interactive visualisations in R.
It’s a big step forward from the static visualisations we have done thus far.
R has a fairly steep learning curve at the beginning. It took me several months and a DataCamp course before I began to know what I was doing.
Shiny has another learning curve on top of that – the code and syntax is a bit different from regular R code.
If you persevere with though you can create some brilliant, clear visualisations.
In Part I of this post I will show you how I got the data together in regular R and in Part II we will delve into the Shiny code itself.
The source data is the engsoccerdata package
We have worked with this data before when we looked at a Boxing Day league table. Happily James Curley has updated the data with last season’s results since then [Full citation is at the end of this post].
Let’s take another look at the data to familiarise ourselves with it:
library(engsoccerdata) > str(england) 'data.frame': 194040 obs. of 12 variables: $ Date : Date, format: "1888-12-15" "1889-01-19" "1889-03-23" ... $ Season : num 1888 1888 1888 1888 1888 ... $ home : chr "Accrington F.C." "Accrington F.C." "Accrington F.C." "Accrington F.C." ... $ visitor : chr "Aston Villa" "Blackburn Rovers" "Bolton Wanderers" "Burnley" ... $ FT : chr "1-1" "0-2" "2-3" "5-1" ... $ hgoal : int 1 0 2 5 6 3 1 0 2 2 ... $ vgoal : int 1 2 3 1 2 1 2 0 0 1 ... $ division: chr "1" "1" "1" "1" ... $ tier : num 1 1 1 1 1 1 1 1 1 1 ... $ totgoal : int 2 2 5 6 8 4 3 0 2 3 ... $ goaldif : int 0 -2 -1 4 4 2 -1 0 2 1 ... $ result : chr "D" "A" "A" "H" ...
There is also a maketable() function within the package that creates a league table for a division in England’s top four tiers for a particular season:
> maketable(england, tier = 1, pts = 3, Season = 2016) team GP W D L gf ga gd Pts Pos 1 Chelsea 38 30 3 5 85 33 52 93 1 2 Tottenham Hotspur 38 26 8 4 86 26 60 86 2 3 Manchester City 38 23 9 6 80 39 41 78 3 4 Liverpool 38 22 10 6 78 42 36 76 4 5 Arsenal 38 23 6 9 77 44 33 75 5 6 Manchester United 38 18 15 5 54 29 25 69 6 7 Everton 38 17 10 11 62 44 18 61 7 8 Southampton 38 12 10 16 41 48 -7 46 8 9 AFC Bournemouth 38 12 10 16 55 67 -12 46 9 10 West Bromwich Albion 38 12 9 17 43 51 -8 45 10 11 West Ham United 38 12 9 17 47 64 -17 45 11 12 Leicester City 38 12 8 18 48 63 -15 44 12 13 Stoke City 38 11 11 16 41 56 -15 44 13 14 Crystal Palace 38 12 5 21 50 63 -13 41 14 15 Swansea City 38 12 5 21 45 70 -25 41 15 16 Burnley 38 11 7 20 39 55 -16 40 16 17 Watford 38 11 7 20 40 68 -28 40 17 18 Hull City 38 9 7 22 37 80 -43 34 18 19 Middlesbrough 38 5 13 20 27 53 -26 28 19 20 Sunderland 38 6 6 26 29 69 -40 24 20
We want to look at league positions over time. We’ll take the 1958/59 season as a starting point because that is when the old Fourth Division came into existence. England’s top four tiers have changed names and sizes since then but essentially they have remained the same.
However the points awarded for a win moved from two to three starting from the 1981/82 season.
There is probably a neater way to do this, but I decided to loop through the data in two stages – one with two points for a win from 1958/59 to 1980/81 and one for three points for a win from 1981/82 onwards and add it to a data frame:
library(engsoccerdata) library(shiny) library(plyr) library(dplyr) library(tidyr) library(ggplot2)
#create blank data frames
tier1_data <- data.frame()
tier1 <- data.frame()
tier2_data <- data.frame()
tier2 <- data.frame()
tier3_data <- data.frame()
tier3 <- data.frame()
tier4_data <- data.frame()
tier4 <- data.frame()
#set the first season
j = 1958
#2pts for a win
for (i in 1:23) {
#create a top tier league table for each season and add it to the previous
tier1 <- maketable(england, tier = 1, pts = 2, Season = j)
tier1$season <- j
tier1$tier <- 1
tier1_data <- rbind(tier1_data, tier1)
#repeat for the other tiers
tier2 <- maketable(england, tier = 2, pts = 2, Season = j)
tier2$season <- j
tier2$tier <- 2
tier2_data <- rbind(tier2_data, tier2)
tier3 <- maketable(england, tier = 3, pts = 2, Season = j)
tier3$season <- j
tier3$tier <- 3
tier3_data <- rbind(tier3_data, tier3)
tier4 <- maketable(england, tier = 4, pts = 2, Season = j)
tier4$season <- j
tier4$tier <- 4
tier4_data <- rbind(tier4_data, tier4)
j = j+1
}
#3pts for a win
for (i in 1:36) {
#create a top tier league table for each season and add it to the previous
tier1 <- maketable(england, tier = 1, pts = 3, Season = j)
tier1$season <- j
tier1$tier <- 1
tier1_data <- rbind(tier1_data, tier1)
#repeat for the other tiers
tier2 <- maketable(england, tier = 2, pts = 3, Season = j)
tier2$season <- j
tier2$tier <- 2
tier2_data <- rbind(tier2_data, tier2)
tier3 <- maketable(england, tier = 3, pts = 3, Season = j)
tier3$season <- j
tier3$tier <- 3
tier3_data <- rbind(tier3_data, tier3)
tier4 <- maketable(england, tier = 4, pts = 3, Season = j)
tier4$season <- j
tier4$tier <- 4
tier4_data <- rbind(tier4_data, tier4)
j = j+1
}
#combine all the data together
tables <- data.frame()
tables <- rbind(tier1_data, tier2_data, tier3_data, tier4_data)
tables$Pos <- as.numeric(tables$Pos)
We now have all the league tables in one big data frame.
Each league table within the data has a tier field and a season field so we can identify it.
The challenge now is how to deal with shifting league sizes
In 1995/96 the Premier League switched to its current 20 club format. Before that, there were 22 teams in the top-flight. The sizes of the other leagues have changed in the past 60 years as well.
This poses a challenge: at the moment the team that wins the Championship (the second tier) finishes 21st in the overall footballing pyramid. But this wasn’t always the case in the old Second Division – so how can we adjust for the different league sizes?
The first thing we can do is work out the size of each tier for each season:
league_size <- tables %>% dplyr::group_by(season, tier) %>% dplyr::summarise(n = n())
league_size <- league_size %>% spread(key = tier, value = n)
> head(league_size)
# A tibble: 6 x 5
# Groups: season [6]
season `1` `2` `3` `4`
<dbl> <int> <int> <int> <int>
1 1958 22 22 24 24
2 1959 22 22 24 24
3 1960 22 22 24 24
4 1961 22 22 24 24
5 1962 22 22 24 24
6 1963 22 22 24 24
Now we can just add the columns cumulatively to get figures for how many teams finish above each tier for each season.
1` + league_size
Here is my first Shiny app!
Select a football team and the app will plot where the team has ranked in the top four divisions of English football:
Shiny lets you create interactive visualisations in R.
It’s a big step forward from the static visualisations we have done thus far.
R has a fairly steep learning curve at the beginning. It took me several months and a DataCamp course before I began to know what I was doing.
Shiny has another learning curve on top of that – the code and syntax is a bit different from regular R code.
If you persevere with though you can create some brilliant, clear visualisations.
In Part I of this post I will show you how I got the data together in regular R and in Part II we will delve into the Shiny code itself.
The source data is the engsoccerdata package
We have worked with this data before when we looked at a Boxing Day league table. Happily James Curley has updated the data with last season’s results since then [Full citation is at the end of this post].
Let’s take another look at the data to familiarise ourselves with it:
library(engsoccerdata) > str(england) 'data.frame': 194040 obs. of 12 variables: $ Date : Date, format: "1888-12-15" "1889-01-19" "1889-03-23" ... $ Season : num 1888 1888 1888 1888 1888 ... $ home : chr "Accrington F.C." "Accrington F.C." "Accrington F.C." "Accrington F.C." ... $ visitor : chr "Aston Villa" "Blackburn Rovers" "Bolton Wanderers" "Burnley" ... $ FT : chr "1-1" "0-2" "2-3" "5-1" ... $ hgoal : int 1 0 2 5 6 3 1 0 2 2 ... $ vgoal : int 1 2 3 1 2 1 2 0 0 1 ... $ division: chr "1" "1" "1" "1" ... $ tier : num 1 1 1 1 1 1 1 1 1 1 ... $ totgoal : int 2 2 5 6 8 4 3 0 2 3 ... $ goaldif : int 0 -2 -1 4 4 2 -1 0 2 1 ... $ result : chr "D" "A" "A" "H" ...
There is also a maketable() function within the package that creates a league table for a division in England’s top four tiers for a particular season:
> maketable(england, tier = 1, pts = 3, Season = 2016) team GP W D L gf ga gd Pts Pos 1 Chelsea 38 30 3 5 85 33 52 93 1 2 Tottenham Hotspur 38 26 8 4 86 26 60 86 2 3 Manchester City 38 23 9 6 80 39 41 78 3 4 Liverpool 38 22 10 6 78 42 36 76 4 5 Arsenal 38 23 6 9 77 44 33 75 5 6 Manchester United 38 18 15 5 54 29 25 69 6 7 Everton 38 17 10 11 62 44 18 61 7 8 Southampton 38 12 10 16 41 48 -7 46 8 9 AFC Bournemouth 38 12 10 16 55 67 -12 46 9 10 West Bromwich Albion 38 12 9 17 43 51 -8 45 10 11 West Ham United 38 12 9 17 47 64 -17 45 11 12 Leicester City 38 12 8 18 48 63 -15 44 12 13 Stoke City 38 11 11 16 41 56 -15 44 13 14 Crystal Palace 38 12 5 21 50 63 -13 41 14 15 Swansea City 38 12 5 21 45 70 -25 41 15 16 Burnley 38 11 7 20 39 55 -16 40 16 17 Watford 38 11 7 20 40 68 -28 40 17 18 Hull City 38 9 7 22 37 80 -43 34 18 19 Middlesbrough 38 5 13 20 27 53 -26 28 19 20 Sunderland 38 6 6 26 29 69 -40 24 20
We want to look at league positions over time. We’ll take the 1958/59 season as a starting point because that is when the old Fourth Division came into existence. England’s top four tiers have changed names and sizes since then but essentially they have remained the same.
However the points awarded for a win moved from two to three starting from the 1981/82 season.
There is probably a neater way to do this, but I decided to loop through the data in two stages – one with two points for a win from 1958/59 to 1980/81 and one for three points for a win from 1981/82 onwards and add it to a data frame:
library(engsoccerdata) library(shiny) library(plyr) library(dplyr) library(tidyr) library(ggplot2)
#create blank data frames
tier1_data <- data.frame()
tier1 <- data.frame()
tier2_data <- data.frame()
tier2 <- data.frame()
tier3_data <- data.frame()
tier3 <- data.frame()
tier4_data <- data.frame()
tier4 <- data.frame()
#set the first season
j = 1958
#2pts for a win
for (i in 1:23) {
#create a top tier league table for each season and add it to the previous
tier1 <- maketable(england, tier = 1, pts = 2, Season = j)
tier1$season <- j
tier1$tier <- 1
tier1_data <- rbind(tier1_data, tier1)
#repeat for the other tiers
tier2 <- maketable(england, tier = 2, pts = 2, Season = j)
tier2$season <- j
tier2$tier <- 2
tier2_data <- rbind(tier2_data, tier2)
tier3 <- maketable(england, tier = 3, pts = 2, Season = j)
tier3$season <- j
tier3$tier <- 3
tier3_data <- rbind(tier3_data, tier3)
tier4 <- maketable(england, tier = 4, pts = 2, Season = j)
tier4$season <- j
tier4$tier <- 4
tier4_data <- rbind(tier4_data, tier4)
j = j+1
}
#3pts for a win
for (i in 1:36) {
#create a top tier league table for each season and add it to the previous
tier1 <- maketable(england, tier = 1, pts = 3, Season = j)
tier1$season <- j
tier1$tier <- 1
tier1_data <- rbind(tier1_data, tier1)
#repeat for the other tiers
tier2 <- maketable(england, tier = 2, pts = 3, Season = j)
tier2$season <- j
tier2$tier <- 2
tier2_data <- rbind(tier2_data, tier2)
tier3 <- maketable(england, tier = 3, pts = 3, Season = j)
tier3$season <- j
tier3$tier <- 3
tier3_data <- rbind(tier3_data, tier3)
tier4 <- maketable(england, tier = 4, pts = 3, Season = j)
tier4$season <- j
tier4$tier <- 4
tier4_data <- rbind(tier4_data, tier4)
j = j+1
}
#combine all the data together
tables <- data.frame()
tables <- rbind(tier1_data, tier2_data, tier3_data, tier4_data)
tables$Pos <- as.numeric(tables$Pos)
We now have all the league tables in one big data frame.
Each league table within the data has a tier field and a season field so we can identify it.
The challenge now is how to deal with shifting league sizes
In 1995/96 the Premier League switched to its current 20 club format. Before that, there were 22 teams in the top-flight. The sizes of the other leagues have changed in the past 60 years as well.
This poses a challenge: at the moment the team that wins the Championship (the second tier) finishes 21st in the overall footballing pyramid. But this wasn’t always the case in the old Second Division – so how can we adjust for the different league sizes?
The first thing we can do is work out the size of each tier for each season:
league_size <- tables %>% dplyr::group_by(season, tier) %>% dplyr::summarise(n = n())
league_size <- league_size %>% spread(key = tier, value = n)
> head(league_size)
# A tibble: 6 x 5
# Groups: season [6]
season `1` `2` `3` `4`
<dbl> <int> <int> <int> <int>
1 1958 22 22 24 24
2 1959 22 22 24 24
3 1960 22 22 24 24
4 1961 22 22 24 24
5 1962 22 22 24 24
6 1963 22 22 24 24
Now we can just add the columns cumulatively to get figures for how many teams finish above each tier for each season.
2` league_size$tier4 <- league_size
Here is my first Shiny app!
Select a football team and the app will plot where the team has ranked in the top four divisions of English football:
Shiny lets you create interactive visualisations in R.
It’s a big step forward from the static visualisations we have done thus far.
R has a fairly steep learning curve at the beginning. It took me several months and a DataCamp course before I began to know what I was doing.
Shiny has another learning curve on top of that – the code and syntax is a bit different from regular R code.
If you persevere with though you can create some brilliant, clear visualisations.
In Part I of this post I will show you how I got the data together in regular R and in Part II we will delve into the Shiny code itself.
The source data is the engsoccerdata package
We have worked with this data before when we looked at a Boxing Day league table. Happily James Curley has updated the data with last season’s results since then [Full citation is at the end of this post].
Let’s take another look at the data to familiarise ourselves with it:
library(engsoccerdata) > str(england) 'data.frame': 194040 obs. of 12 variables: $ Date : Date, format: "1888-12-15" "1889-01-19" "1889-03-23" ... $ Season : num 1888 1888 1888 1888 1888 ... $ home : chr "Accrington F.C." "Accrington F.C." "Accrington F.C." "Accrington F.C." ... $ visitor : chr "Aston Villa" "Blackburn Rovers" "Bolton Wanderers" "Burnley" ... $ FT : chr "1-1" "0-2" "2-3" "5-1" ... $ hgoal : int 1 0 2 5 6 3 1 0 2 2 ... $ vgoal : int 1 2 3 1 2 1 2 0 0 1 ... $ division: chr "1" "1" "1" "1" ... $ tier : num 1 1 1 1 1 1 1 1 1 1 ... $ totgoal : int 2 2 5 6 8 4 3 0 2 3 ... $ goaldif : int 0 -2 -1 4 4 2 -1 0 2 1 ... $ result : chr "D" "A" "A" "H" ...
There is also a maketable() function within the package that creates a league table for a division in England’s top four tiers for a particular season:
> maketable(england, tier = 1, pts = 3, Season = 2016) team GP W D L gf ga gd Pts Pos 1 Chelsea 38 30 3 5 85 33 52 93 1 2 Tottenham Hotspur 38 26 8 4 86 26 60 86 2 3 Manchester City 38 23 9 6 80 39 41 78 3 4 Liverpool 38 22 10 6 78 42 36 76 4 5 Arsenal 38 23 6 9 77 44 33 75 5 6 Manchester United 38 18 15 5 54 29 25 69 6 7 Everton 38 17 10 11 62 44 18 61 7 8 Southampton 38 12 10 16 41 48 -7 46 8 9 AFC Bournemouth 38 12 10 16 55 67 -12 46 9 10 West Bromwich Albion 38 12 9 17 43 51 -8 45 10 11 West Ham United 38 12 9 17 47 64 -17 45 11 12 Leicester City 38 12 8 18 48 63 -15 44 12 13 Stoke City 38 11 11 16 41 56 -15 44 13 14 Crystal Palace 38 12 5 21 50 63 -13 41 14 15 Swansea City 38 12 5 21 45 70 -25 41 15 16 Burnley 38 11 7 20 39 55 -16 40 16 17 Watford 38 11 7 20 40 68 -28 40 17 18 Hull City 38 9 7 22 37 80 -43 34 18 19 Middlesbrough 38 5 13 20 27 53 -26 28 19 20 Sunderland 38 6 6 26 29 69 -40 24 20
We want to look at league positions over time. We’ll take the 1958/59 season as a starting point because that is when the old Fourth Division came into existence. England’s top four tiers have changed names and sizes since then but essentially they have remained the same.
However the points awarded for a win moved from two to three starting from the 1981/82 season.
There is probably a neater way to do this, but I decided to loop through the data in two stages – one with two points for a win from 1958/59 to 1980/81 and one for three points for a win from 1981/82 onwards and add it to a data frame:
library(engsoccerdata) library(shiny) library(plyr) library(dplyr) library(tidyr) library(ggplot2)
#create blank data frames
tier1_data <- data.frame()
tier1 <- data.frame()
tier2_data <- data.frame()
tier2 <- data.frame()
tier3_data <- data.frame()
tier3 <- data.frame()
tier4_data <- data.frame()
tier4 <- data.frame()
#set the first season
j = 1958
#2pts for a win
for (i in 1:23) {
#create a top tier league table for each season and add it to the previous
tier1 <- maketable(england, tier = 1, pts = 2, Season = j)
tier1$season <- j
tier1$tier <- 1
tier1_data <- rbind(tier1_data, tier1)
#repeat for the other tiers
tier2 <- maketable(england, tier = 2, pts = 2, Season = j)
tier2$season <- j
tier2$tier <- 2
tier2_data <- rbind(tier2_data, tier2)
tier3 <- maketable(england, tier = 3, pts = 2, Season = j)
tier3$season <- j
tier3$tier <- 3
tier3_data <- rbind(tier3_data, tier3)
tier4 <- maketable(england, tier = 4, pts = 2, Season = j)
tier4$season <- j
tier4$tier <- 4
tier4_data <- rbind(tier4_data, tier4)
j = j+1
}
#3pts for a win
for (i in 1:36) {
#create a top tier league table for each season and add it to the previous
tier1 <- maketable(england, tier = 1, pts = 3, Season = j)
tier1$season <- j
tier1$tier <- 1
tier1_data <- rbind(tier1_data, tier1)
#repeat for the other tiers
tier2 <- maketable(england, tier = 2, pts = 3, Season = j)
tier2$season <- j
tier2$tier <- 2
tier2_data <- rbind(tier2_data, tier2)
tier3 <- maketable(england, tier = 3, pts = 3, Season = j)
tier3$season <- j
tier3$tier <- 3
tier3_data <- rbind(tier3_data, tier3)
tier4 <- maketable(england, tier = 4, pts = 3, Season = j)
tier4$season <- j
tier4$tier <- 4
tier4_data <- rbind(tier4_data, tier4)
j = j+1
}
#combine all the data together
tables <- data.frame()
tables <- rbind(tier1_data, tier2_data, tier3_data, tier4_data)
tables$Pos <- as.numeric(tables$Pos)
We now have all the league tables in one big data frame.
Each league table within the data has a tier field and a season field so we can identify it.
The challenge now is how to deal with shifting league sizes
In 1995/96 the Premier League switched to its current 20 club format. Before that, there were 22 teams in the top-flight. The sizes of the other leagues have changed in the past 60 years as well.
This poses a challenge: at the moment the team that wins the Championship (the second tier) finishes 21st in the overall footballing pyramid. But this wasn’t always the case in the old Second Division – so how can we adjust for the different league sizes?
The first thing we can do is work out the size of each tier for each season:
league_size <- tables %>% dplyr::group_by(season, tier) %>% dplyr::summarise(n = n())
league_size <- league_size %>% spread(key = tier, value = n)
> head(league_size)
# A tibble: 6 x 5
# Groups: season [6]
season `1` `2` `3` `4`
<dbl> <int> <int> <int> <int>
1 1958 22 22 24 24
2 1959 22 22 24 24
3 1960 22 22 24 24
4 1961 22 22 24 24
5 1962 22 22 24 24
6 1963 22 22 24 24
Now we can just add the columns cumulatively to get figures for how many teams finish above each tier for each season.
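league_size$tier2 <- league_size$`1`
league_size$tier3 <- league_size$`1` + league_size$`2`
league_size$tier4 <- league_size$`1` + league_size$`2` + league_size$`3`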
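To sanity-check the cumulative arithmetic, here is a small sketch on a made-up two-season table. The names league_size_toy and offsets are illustrative and not part of the original code; the sizes are the 22/22/24/24 format shown above and the current 20/24/24/24 format. It works out, for each tier, how many clubs sit in the tiers above it.

```r
# Toy league sizes for two seasons (illustrative object, not the real data).
league_size_toy <- data.frame(
  season = c(1958, 2016),
  `1` = c(22, 20),
  `2` = c(22, 24),
  `3` = c(24, 24),
  `4` = c(24, 24),
  check.names = FALSE
)

sizes <- as.matrix(league_size_toy[, c("1", "2", "3", "4")])

# Running total of clubs across tiers for each season (one row per season).
running <- t(apply(sizes, 1, cumsum))

# Shift right by one tier: each column now holds the number of clubs in the
# tiers ABOVE that tier, so tier 1 gets an offset of zero.
offsets <- cbind(0, running[, 1:3, drop = FALSE])
colnames(offsets) <- c("tier1", "tier2", "tier3", "tier4")

# Overall position of the second-tier champions in each season.
second_tier_champions <- unname(offsets[, "tier2"]) + 1
print(second_tier_champions)  # 23 for 1958, 21 for 2016
```

The bottom club in the fourth tier comes out as 92nd in both seasons, which matches the largest values in the data.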
Merge with our original data
tables <- merge(tables, league_size, by = "season")
#create a field for our total finishing position
tables$overall_position <- NA
Next up we need to calculate the overall finishing position
Again, there is probably a way to do this using an apply function. If you figure it out, please get in touch.
for (i in 1:nrow(tables)) {
  if (tables$tier[i] == 1) {
    tables$overall_position[i] <- tables$Pos[i]
  } else if (tables$tier[i] == 2) {
    tables$overall_position[i] <- tables$Pos[i] + tables$tier2[i]
  } else if (tables$tier[i] == 3) {
    tables$overall_position[i] <- tables$Pos[i] + tables$tier3[i]
  } else if (tables$tier[i] == 4) {
    tables$overall_position[i] <- tables$Pos[i] + tables$tier4[i]
  }
}
tables$overall_position <- unlist(tables$overall_position)
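One possible loop-free version of that step, as a rough sketch only: it relies on matrix indexing rather than an apply function, and it uses a made-up tables_toy data frame (values invented for illustration) in place of the real merged tables.

```r
# Toy rows standing in for the merged `tables` data frame.
# tier2/tier3/tier4 hold the number of clubs above each tier, as in the post.
tables_toy <- data.frame(
  team  = c("A", "B", "C", "D"),
  tier  = c(1, 2, 3, 4),
  Pos   = c(5, 1, 10, 24),
  tier2 = 22,
  tier3 = 44,
  tier4 = 68
)

# One column of offsets per tier, with a zero column for the top tier.
offsets <- cbind(0, tables_toy$tier2, tables_toy$tier3, tables_toy$tier4)

# For each row, pick the offset in the column matching that row's tier.
row_offset <- offsets[cbind(seq_len(nrow(tables_toy)), tables_toy$tier)]

# Overall position = position within the division + clubs in the tiers above.
tables_toy$overall_position <- tables_toy$Pos + row_offset
print(tables_toy$overall_position)  # 5 23 54 92
```

Because the whole column is computed in one go, there is no need to pre-fill it with NA or to unlist it afterwards.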
Now we can summarise our data to leave us just with the team, season and finishing position
league_positions <- tables %>% group_by(team, season, overall_position) %>% summarise()
#remove attributes
attributes(league_positions) <- NULL
league_positions <- data.frame(league_positions)
names(league_positions) <- c("team", "season", "overall_position")
league_positions <- arrange(league_positions, team, season)
league_positions <- league_positions %>% spread(key = team, value = overall_position, fill = 0)
Let’s take a look at our data now.
> str(league_positions)
'data.frame': 59 obs. of 118 variables:
 $ season : num 1958 1959 1960 1961 1962 ...
 $ Accrington : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Accrington Stanley : num 62 68 86 92 0 0 0 0 0 0 ...
 $ AFC Bournemouth : num 57 59 62 49 50 48 54 64 64 58 ...
 $ AFC Wimbledon : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Aldershot : num 90 81 80 73 80 77 86 86 79 77 ...
 $ Arsenal : num 3 13 11 9 7 8 13 19 7 9 ...
 $ Aston Villa : num 21 23 8 7 14 19 14 14 21 35 ...
 $ Barnet : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Barnsley : num 44 62 51 65 63 65 68 85 86 70 ...
 $ Barrow : num 91 86 90 78 77 92 88 80 70 51 ...
 $ Birmingham City : num 8 19 18 17 20 20 22 32 32 26 ...
 $ Blackburn Rovers : num 10 15 10 16 11 7 10 22 26 30 ...
 $ Blackpool : num 9 11 20 14 15 18 17 13 22 24 ...
 $ Bolton Wanderers : num 4 6 19 12 17 21 25 31 31 34 ...
 $ Boston United : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Bradford City : num 55 63 66 74 91 72 87 91 77 72 ...
 $ Bradford Park Avenue : num 80 79 71 54 65 81 76 77 91 92 ...
 $ Brentford : num 48 48 61 66 69 60 49 67 78 80 ...
 $ Brighton & Hove Albion : num 37 36 38 44 66 76 70 59 63 57 ...
 $ Bristol City : num 32 44 57 50 58 49 46 28 38 40 ...
 $ Bristol Rovers : num 29 29 37 43 64 53 51 60 49 61 ...
 $ Burnley : num 7 1 4 2 3 9 12 2 14 13 ...
 $ Burton Albion : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Bury : num 54 49 45 33 30 39 38 40 44 45 ...
 $ Cambridge United : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Cardiff City : num 31 24 16 21 31 37 37 42 42 36 ...
 $ Carlisle United : num 79 87 88 72 67 69 45 33 25 32 ...
 $ Charlton Athletic : num 30 30 31 37 42 26 40 38 41 38 ...
 $ Chelsea : num 13 17 12 22 23 5 3 4 9 7 ...
 $ Cheltenham : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Chester : num 82 88 92 90 88 79 74 75 85 90 ...
 $ Chesterfield : num 59 61 68 88 86 84 79 89 80 74 ...
 $ Colchester United : num 49 54 67 69 55 62 67 74 56 68 ...
 $ Coventry City : num 70 47 58 58 49 46 31 25 24 21 ...
 $ Crawley Town : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Crewe Alexandra : num 86 82 76 79 71 66 80 82 74 73 ...
 $ Crystal Palace : num 75 76 70 59 56 45 29 35 29 33 ...
 $ Dagenham and Redbridge : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Darlington : num 85 83 77 84 79 88 81 69 66 85 ...
 $ Derby County : num 28 38 34 39 40 35 32 30 40 39 ...
 $ Doncaster Rovers : num 66 84 78 89 84 83 77 70 67 78 ...
 $ Everton : num 15 18 5 4 1 3 4 11 6 5 ...
 $ Exeter City : num 73 77 89 86 83 73 62 66 83 88 ...
 $ Fleetwood Town : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Fulham : num 24 9 15 20 16 16 20 20 19 22 ...
 $ Gateshead : num 84 89 0 0 0 0 0 0 0 0 ...
 $ Gillingham : num 78 74 85 87 73 70 50 49 57 54 ...
 $ Grimsby Town : num 43 51 50 46 41 42 56 56 58 65 ...
 $ Halifax Town : num 53 60 53 62 68 80 91 83 81 79 ...
 $ Hartlepool United : num 87 91 91 91 92 91 85 84 75 71 ...
 $ Hereford United : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Huddersfield Town : num 35 28 42 29 29 34 30 27 27 37 ...
 $ Hull City : num 45 43 54 53 54 55 48 45 33 41 ...
 $ Ipswich Town : num 34 32 23 1 18 22 28 36 28 25 ...
 $ Kidderminster Harriers : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Leeds United : num 16 21 36 41 27 23 2 3 4 4 ...
 $ Leicester City : num 19 16 6 13 4 11 19 5 8 14 ...
 $ Leyton Orient : num 39 33 39 24 22 38 42 44 62 64 ...
 $ Lincoln City : num 41 35 44 67 90 78 90 90 92 82 ...
 $ Liverpool : num 25 25 25 23 8 1 8 1 5 3 ...
 $ Luton Town : num 17 22 35 32 44 61 65 73 84 69 ...
 $ Macclesfield : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Maidstone United : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Manchester City : num 20 12 14 11 21 29 33 23 15 1 ...
 $ Manchester United : num 2 7 7 15 19 2 1 6 1 2 ...
 $ Mansfield Town : num 64 66 87 83 72 51 47 62 53 66 ...
 $ Middlesbrough : num 38 27 27 34 26 33 39 43 46 28 ...
 $ Millwall : num 77 75 74 70 61 64 72 46 30 29 ...
 $ Milton Keynes Dons : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Morecambe : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Newcastle United : num 11 8 21 36 28 28 23 15 20 10 ...
 $ Newport County : num 61 53 55 68 89 82 83 78 87 81 ...
 $ Northampton Town : num 76 73 72 52 45 32 24 21 43 62 ...
 $ Norwich City : num 47 46 26 40 34 40 26 37 35 31 ...
 $ Nottingham Forest : num 14 20 13 19 9 13 6 17 3 11 ...
 $ Notts County : num 67 70 49 57 51 68 84 76 88 84 ...
 $ Oldham Athletic : num 89 92 79 81 70 52 64 65 54 60 ...
 $ Oxford United : num 0 0 0 0 87 86 73 55 60 46 ...
 $ Peterborough United : num 0 0 69 47 48 54 52 58 61 52 ...
 $ Plymouth Argyle : num 46 41 32 27 35 43 34 39 37 44 ...
 $ Port Vale : num 69 56 52 56 47 58 66 87 82 86 ...
 $ Portsmouth : num 22 42 43 45 39 31 43 34 36 27 ...
 $ Preston North End : num 12 10 22 35 38 25 35 41 34 42 ...
 $ Queens Park Rangers : num 56 52 47 48 57 57 57 47 45 23 ...
 $ Reading : num 51 58 63 51 62 50 59 53 47 49 ...
 $ Rochdale : num 68 80 84 77 75 89 75 88 89 87 ...
 $ Rotherham United : num 42 31 40 31 36 30 36 29 39 43 ...
 $ Rushden & Diamonds : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Scarborough : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Scunthorpe United : num 40 37 33 26 32 44 61 48 59 67 ...
 $ Sheffield United : num 26 26 24 6 10 12 18 9 11 20 ...
 $ Sheffield Wednesday : num 23 5 3 5 6 6 9 16 12 19 ...
 $ Shrewsbury Town : num 71 50 56 63 59 56 60 54 51 47 ...
 $ Southampton : num 58 45 30 28 33 27 27 24 18 17 ...
 $ Southend United : num 52 55 64 61 53 59 55 63 73 75 ...
 $ Southport : num 92 90 81 85 82 87 89 79 71 59 ...
 $ Stevenage Borough : num 0 0 0 0 0 0 0 0 0 0 ...
 $ Stockport County : num 65 78 82 82 85 85 92 81 69 55 ...
 [list output truncated]
At this point I tried to connect my data to Google using googlesheets but it took too long.
I printed it off and uploaded instead. It’s here.
We’ll run a test now for Arsenal to show you how a plot of the finishing positions looks.
This will be a template for our Shiny app.
p <- ggplot(data = league_positions, aes(x = season, y = Arsenal)) +
  geom_line(size = 1.1)
p
Stay tuned for part II, where we move into the Shiny code itself!
Data source: James P. Curley (2016). engsoccerdata: English Soccer Data 1871-2016. R package version 0.1.5
P.S. Can you think of a way to run the for loops using apply functions? I thought for ages about how to do it but I haven’t figured it out yet.
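One possible answer, as a rough sketch rather than a tested replacement: build every season and tier combination with expand.grid(), create each table inside lapply(), and bind the pieces once at the end. So that the sketch runs without the package, it uses a hypothetical stand-in called fake_maketable(); the helper names win_points() and build_tables() are also made up for illustration.

```r
# Hypothetical stand-in for maketable(), so the sketch runs without the
# engsoccerdata package. It returns the same two-team table for any input,
# with Pos stored as text.
fake_maketable <- function(df, tier, pts, Season) {
  data.frame(
    team = c("Team A", "Team B"),
    Pts  = c(2 * pts, pts),
    Pos  = c("1", "2"),
    stringsAsFactors = FALSE
  )
}

# Points for a win: two up to 1980/81, three from 1981/82 onwards.
win_points <- function(season) if (season >= 1981) 3 else 2

build_tables <- function(seasons, tiers = 1:4,
                         table_fun = fake_maketable, data = NULL) {
  # One row per season/tier combination.
  combos <- expand.grid(season = seasons, tier = tiers)
  pieces <- lapply(seq_len(nrow(combos)), function(k) {
    s  <- combos$season[k]
    tr <- combos$tier[k]
    out <- table_fun(data, tier = tr, pts = win_points(s), Season = s)
    out$season <- s
    out$tier   <- tr
    out
  })
  # Bind once at the end instead of growing a data frame inside a loop.
  do.call(rbind, pieces)
}

tables_toy <- build_tables(1980:1981)
tables_toy$Pos <- as.numeric(tables_toy$Pos)
```

With the package loaded, a call along the lines of build_tables(1958:2016, table_fun = maketable, data = england) should stand in for both loops, assuming maketable() accepts the arguments in the way the calls earlier in this post use them.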