Strikes don’t really happen too much in Britain any more.
A total of 170,000 working days were lost in Britain due to strikes and industrial action in 2015.
That might sound like a lot, but it was actually the second lowest on record.
And this record goes back to 1891 – one of the longest time series of any official UK dataset I know of.
This makes it perfect for introducing the geom_line plot in ggplot2.
You can download the data here.
It’s an .xls file. I deleted the title and the blank rows at the top and saved it as strikes.csv. There are some more empty rows at the bottom that you can delete if you want, but they don’t matter much.
# stringsAsFactors = TRUE reproduces the Factor columns shown below;
# since R 4.0.0, read.csv() returns character columns by default
data <- read.csv("strikes.csv", stringsAsFactors = TRUE)
str(data)

'data.frame': 128 obs. of 5 variables:
 $ Year                                                 : Factor w/ 127 levels "","1891","1892",..: 2 3 4 5 6 7 8 9 10 11 ...
 $ Number.of.Stoppages.beginning.in.year                : Factor w/ 117 levels "","1,004","1,053",..: 113 99 90 112 101 113 107 98 100 95 ...
 $ Stoppages.in.progress.in.year                        : Factor w/ 86 levels "","-","1,016",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Total.number.of.workers.involved.in.year..thousands. : Factor w/ 118 levels "","-","1,041",..: 2 2 92 59 48 37 44 46 33 36 ...
 $ Total.number.of.working.days.lost.in.year..thousands.: int 6809 7382 30439 9506 5701 3565 10327 15257 2503 3088 ...

# Calling str on our data shows that the columns could do with renaming because they are too long
colnames(data) <- c("Year", "Stoppages", "Stops_progress", "Workers_involved", "Days_lost")
str(data)

'data.frame': 128 obs. of 5 variables:
 $ Year            : Factor w/ 127 levels "","1891","1892",..: 2 3 4 5 6 7 8 9 10 11 ...
 $ Stoppages       : Factor w/ 117 levels "","1,004","1,053",..: 113 99 90 112 101 113 107 98 100 95 ...
 $ Stops_progress  : Factor w/ 86 levels "","-","1,016",..: 2 2 2 2 2 2 2 2 2 2 ...
 $ Workers_involved: Factor w/ 118 levels "","-","1,041",..: 2 2 92 59 48 37 44 46 33 36 ...
 $ Days_lost       : int 6809 7382 30439 9506 5701 3565 10327 15257 2503 3088 ...

head(data)

  Year Stoppages Stops_progress Workers_involved Days_lost
1 1891       906              -                -      6809
2 1892       700              -                -      7382
3 1893       599              -              634     30439
4 1894       903              -              322      9506
5 1895       728              -              259      5701
6 1896       906              -              192      3565
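The str output shows that every count column except Days_lost was read as a factor, because the values contain thousands separators ("1,004") and "-" where no figure was recorded. That does not affect the plot in this post, but it would if you wanted to plot Stoppages or Workers_involved. Here is a minimal sketch of one way to convert them. The helper to_number and the small demo data frame are my own illustration, not part of the original workflow.

```r
# Hypothetical helper (not from the original post): turn strings such as
# "1,004" into numbers, treating "-" and "" as missing values.
to_number <- function(x) {
  x <- as.character(x)           # factors must become text first
  x[x %in% c("-", "")] <- NA     # placeholders for "no data"
  as.numeric(gsub(",", "", x))   # drop thousands separators, then convert
}

# Small stand-in for the real file, using the first rows shown by head(data)
demo <- data.frame(
  Year = factor(c("1891", "1892", "1893")),
  Stoppages = factor(c("906", "700", "599")),
  Workers_involved = factor(c("-", "-", "634"))
)

# Year is left as a factor because the plots here use a discrete x axis
demo$Stoppages <- to_number(demo$Stoppages)
demo$Workers_involved <- to_number(demo$Workers_involved)
str(demo)
```

On the real data the same two assignment lines would be applied to data rather than demo.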
The data is in the correct format for a line plot. So let’s do it:
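library(ggplot2)

# group = 1 joins all the points into a single line, which is needed because Year is a factor
# (ggplot2 3.4.0 and later prefer linewidth over size for lines; size still works with a warning)
ggplot(data, aes(x = Year, y = Days_lost, group = 1)) +
  geom_line(size = 2.2, color = "#7f0000")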
This is how it looks initially. The main problem is that we have all the x labels overlapping each other.
Avoiding overlapping labels by showing only every tenth year
p <- ggplot(data, aes(x = Year, y = Days_lost, group = 1)) +
  geom_line(size = 2.2, color = "#7f0000")

# break up the x axis labels
p + scale_x_discrete(breaks = seq(1880, 2020, by = 10))
The scale_x_discrete call here uses a sequence running from 1880 to 2020 in steps of 10, so only every tenth year is labelled.
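To see which labels that sequence can actually produce, it helps to compare it with the years in the data. This is a small base R sketch. It assumes, as I understand the discrete scale to behave, that breaks with no matching value on the axis are simply not drawn. The years vector is a stand-in for the Year column, not the real file.

```r
# Stand-in for the Year column: one label per year, 1891 to 2015
years <- as.character(1891:2015)

# The same sequence that is passed to scale_x_discrete(breaks = ...)
breaks <- seq(1880, 2020, by = 10)

# Only breaks that match an actual year can appear as axis labels
shown <- breaks[as.character(breaks) %in% years]
print(shown)
```

For this stand-in, 1880, 1890 and 2020 have no matching year, so the labels would run from 1900 to 2010.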
This looks much tidier.
The ONS interactive explains that the spikes correspond to significant events in UK industrial history.
The huge spike in the 1920s was the 1926 General Strike. As its name suggests, it was a massive walkout organised by the TUC, the main trade union body in the UK. The TUC called a halt to it nine days later. It ultimately failed in its goal of protecting miners from pay cuts and longer hours.
The following year sympathy strikes were banned by the Trade Disputes and Trade Unions Act 1927.
World War II, the Attlee Government and the ‘post-war consensus’ kept industrial relations on an even keel until the 1970s.
The first of the three main spikes from 1970 onwards was a miners’ strike in 1972.
The two larger ones to follow were the Winter of Discontent in 1979 and the Miners’ Strike of 1984-85.
Both of these events have had a profound effect on British politics since.
The Winter of Discontent, with its strikes from binmen and gravediggers, is still referenced by right-wing politicians and journalists as an example of what happens if you let trade unions (and Labour) run the show.
The strike of 1984-85 ended in defeat for the miners.
It entrenched bitterness in the parts of Britain where coal mining was important – bitterness towards the ‘scabs’ who broke the strikes and to Margaret Thatcher and her government, so much so that there was a party in one village when she died in 2013.
Annotating the line plot
We can annotate these events on the plot using annotate.
# formatting
# (the scale is added again here because p itself does not store it)
p +
  scale_x_discrete(breaks = seq(1880, 2020, by = 10)) +
  labs(y = "Days lost (1000s)", x = "") +
  ggtitle("Days lost to strikes,\n 1891-2015") +
  theme(plot.title = element_text(size = 60),
        axis.text.y = element_text(size = 24),
        axis.text.x = element_text(size = 24),
        axis.title.y = element_text(size = 32),
        axis.title.x = element_blank()) +
  # annotations
  annotate("text", x = 38, y = 170000, label = "1926 General Strike", size = 7) +
  annotate("text", x = 90, y = 37000, label = "1979 Winter of Discontent", size = 7) +
  annotate("text", x = 109, y = 25000, label = "1984-85 Miners' strike", size = 7)
The x and y correspond to the axes. Because Year is a factor, x counts positions along the axis rather than years. Having x = 1 would put the label at the extreme left of the plot. Having x = 38 puts it at the 38th value in the line plot, in the late 1920s. The label is wide enough to stretch to both sides of the spike. Having y = 170,000 puts it above the top of the 1926 spike.
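Rather than counting along the axis by eye, you can look up where a given year sits among the factor levels. This is only a sketch. The helper year_position is hypothetical, and the year vector is a clean stand-in for the Year column. In the real file the blank rows add extra levels, so the result may be off by one or two and the label may still need nudging.

```r
# Stand-in for the Year column as read from the CSV (a factor of labels)
year <- factor(as.character(1891:2015))

# Hypothetical helper: position of a given year on the discrete axis,
# counted among the levels that actually occur in the data
year_position <- function(f, label) {
  match(label, levels(droplevels(f)))
}

year_position(year, "1926")  # 36 for this stand-in
```

The result could then be passed as the x value in annotate, with a small offset if the label needs to sit to one side of the spike.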
The Winter of Discontent and the miners’ strike happened just a few years apart, so I tweaked the labels to give them both space.
Here’s how it looks:
There you have it: using R to illustrate Britain’s industrial history with geom_line and annotations. I removed the ‘Year’ label on the x axis to save space, and because I thought the title and the annotations made it clear enough that the axis shows years. Was that a fair decision?
I’m looking into interactive R plots. I’ll post soon if I crack it – any tips would be much appreciated.
Take a look at the ggiraph package https://github.com/davidgohel/ggiraph
Thanks, I’ll take a look!