R for Journalists

Visualising the Boom in New Flats in Manchester

Posted on October 6, 2016 by Rob
New apartments in Salford.
© Copyright Geoff Royle and licensed for reuse under this Creative Commons Licence.

Any visitor to Manchester city centre will be struck by the number of modern blocks of flats.

Take the tram out towards Altrincham and you’ll see modern apartment blocks on both sides, out towards Salford and back towards Castlefield.

Many of these flats will have been sold, which means they will turn up in the Land Registry’s Price Paid data.

This is an excellent dataset containing residential property sales in England and Wales. You can download the (huge) full file, or you can use the data wizard to filter the data you want. The Land Registry also lets you download this year’s data only, which it updates monthly. This is the file I’m going to be using (note it’s since been updated with another month).

Getting the data into RStudio

#read Land Registry CSV, called pp-2016.csv
houses <- read.csv("pp-2016.csv",header=FALSE,stringsAsFactors = FALSE)

#add column names
colnames(houses) <- c("id","price","date","postcode","type",
"y/n","hold","housename2","housename1","street",
"neighbourhood1","neighbourhood2","la","county","a","a2")

Our data is now in RStudio. The CSV file doesn’t come with column headers, so we’ve correctly labelled them. We won’t need all these columns but it helps to know what they are.

Calling str(houses) shows us the structure of our data:

'data.frame': 469605 obs. of 16 variables:
 $ id : chr "{369DFB16-3F24-3A19-E050-A8C0620518C6}" "{369DFB16-3F25-3A19-E050-A8C0620518C6}" "{369DFB16-3F26-3A19-E050-A8C0620518C6}" "{369DFB16-3F27-3A19-E050-A8C0620518C6}" ...
 $ price : int 169995 110000 240000 70000 80000 165000 138500 246500 145000 230000 ...
 $ date : chr "2016-06-01 00:00" "2016-04-29 00:00" "2016-06-10 00:00" "2016-05-27 00:00" ...
 $ postcode : chr "S73 0BX" "S12 4RW" "S74 9NW" "S5 7DQ" ...
 $ type : chr "D" "S" "T" "S" ...
 $ y/n : chr "N" "N" "N" "N" ...
 $ hold : chr "F" "F" "F" "F" ...
 $ housename2 : chr "1" "34" "MOOR VIEW BARN" "29" ...
 $ housename1 : chr "" "" "" "" ...
 $ street : chr "COTTERDALE GARDENS" "ALPORT PLACE" "HIGH ROYD LANE" "MUSGRAVE CRESCENT" ...
 $ neighbourhood1: chr "WOMBWELL" "" "HOYLAND" "" ...
 $ neighbourhood2: chr "BARNSLEY" "SHEFFIELD" "BARNSLEY" "SHEFFIELD" ...
 $ la : chr "BARNSLEY" "SHEFFIELD" "BARNSLEY" "SHEFFIELD" ...
 $ county : chr "SOUTH YORKSHIRE" "SOUTH YORKSHIRE" "SOUTH YORKSHIRE" "SOUTH YORKSHIRE" ...
 $ a : chr "A" "A" "A" "A" ...
 $ a2 : chr "A" "A" "A" "A" ...
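
As a side note, the read-and-rename step can be folded into one call. The sketch below is only an illustration: it rebuilds a single row from the first record in the str() output above as a stand-in for pp-2016.csv, and the names cols, sample_csv and one_row are made up for the example. The point it shows is that read.csv() rewrites a name like "y/n" to "y.n" unless check.names = FALSE is set, whereas assigning colnames() afterwards, as the post does, keeps the name unchanged.

```r
# Column names as used in the post
cols <- c("id", "price", "date", "postcode", "type",
          "y/n", "hold", "housename2", "housename1", "street",
          "neighbourhood1", "neighbourhood2", "la", "county", "a", "a2")

# One-row stand-in for pp-2016.csv, rebuilt from the first record in the str() output
sample_csv <- paste(
  '"{369DFB16-3F24-3A19-E050-A8C0620518C6}"', 169995, '"2016-06-01 00:00"',
  '"S73 0BX"', '"D"', '"N"', '"F"', '"1"', '""', '"COTTERDALE GARDENS"',
  '"WOMBWELL"', '"BARNSLEY"', '"BARNSLEY"', '"SOUTH YORKSHIRE"', '"A"', '"A"',
  sep = ","
)

# check.names = FALSE stops read.csv() rewriting "y/n" as "y.n"
one_row <- read.csv(text = sample_csv, header = FALSE, col.names = cols,
                    check.names = FALSE, stringsAsFactors = FALSE)

# "y/n" is not a syntactic R name, so use [[ ]] (or backticks) to reach it
new_build_flag <- one_row[["y/n"]]
print(names(one_row))
```

With the real file, the same arguments would go into read.csv("pp-2016.csv", ...).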

Focusing on ‘type’

‘Type’ refers to the kind of home being sold. The five possibilities are ‘D’ (detached), ‘S’ (semi-detached), ‘T’ (terraced), ‘F’ (flat) and ‘O’ (other).

We don’t really want ‘other’ as it can skew the data. We also want to write these out in full so they show up properly on our legends.

#remove other by using a regular expression to search for everything except 'O'
no.other <- grep("[^O]", houses$type)
filtered_data <- houses[no.other, ]
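
The pattern "[^O]" matches any value containing at least one character that is not 'O'. Because every type code is a single letter, that amounts to "everything except O". A minimal sketch on a made-up vector (not Land Registry data) suggests a plain comparison picks out the same rows; the names type_codes, keep_regex and keep_direct are just for illustration.

```r
# Made-up property-type codes for illustration only
type_codes <- c("D", "S", "T", "F", "O", "F")

# The post's approach: positions whose value contains a character other than 'O'
keep_regex <- grep("[^O]", type_codes)

# A direct comparison: positions whose value is not exactly "O"
keep_direct <- which(type_codes != "O")

# Both give the same positions for single-letter codes
print(identical(keep_regex, keep_direct))
print(type_codes[keep_direct])
```

The two would differ for empty strings, which the regular expression drops and the comparison keeps, so it is worth checking the column for blanks before swapping one for the other.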

#load plyr, then use revalue() to rename the values in the 'type' column
library(plyr)

filtered_data$type <- revalue(filtered_data$type,
  c("D" = "Detached", "F" = "Flat",
    "S" = "Semi-detached", "T" = "Terraced"))

We now have the data we want

So let’s take a look at Manchester, using ggplot2. We are going to use geom_jitter, which spreads our dots out so we can see them better. If we used geom_point they would all be on one line because they are all from Manchester.

library(ggplot2)

selected <- grep("MANCHESTER",filtered_data$la)
manchester_data <- filtered_data[selected, ]

#the + has to end the line, otherwise R runs ggplot() on its own
ggplot(manchester_data, aes(x = price, y = la, col = type)) +
  geom_jitter()

[Figure: jitter plot of 2016 property sales in Manchester, by price and type]
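
To see what the jitter is doing, here is a small sketch on made-up prices (not Land Registry figures). The data frame toy and the plot names are hypothetical. By default geom_jitter() nudges points both horizontally and vertically; setting width = 0, as assumed here, leaves the prices exactly where they are and only spreads the dots up and down.

```r
library(ggplot2)

# Made-up prices for illustration only
toy <- data.frame(
  price = c(100000, 105000, 110000, 250000),
  la    = "MANCHESTER",
  type  = c("Flat", "Flat", "Terraced", "Detached")
)

# geom_point(): every dot sits on one horizontal line because 'la' has a single value
p_point <- ggplot(toy, aes(x = price, y = la, col = type)) +
  geom_point()

# geom_jitter(): random vertical noise separates dots that would otherwise overlap
p_jitter <- ggplot(toy, aes(x = price, y = la, col = type)) +
  geom_jitter(width = 0, height = 0.3)

print(p_jitter)
```

Because the noise is random, the dots land in slightly different places each time the plot is drawn.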

Three things jump out at me:

  • There was one detached house that sold for a fortune, almost £3.5m

  • Terraced houses tend to be among the cheapest properties sold in Manchester

  • There are lots of flats being sold in the city

Now let’s take a look at Oldham:

oldham <- grep("OLDHAM",filtered_data$la)
oldham_data <- filtered_data[oldham, ]

ggplot(oldham_data, aes(x = price, y = la, col = type)) +
  geom_jitter()

[Figure: jitter plot of 2016 property sales in Oldham, by price and type]

This is much easier to read. Two things jump out from the data:

  • Hardly any flats have been sold in Oldham this year

  • The cheapest properties are almost always terraced houses

Putting them both on the same graph doesn’t really help much, because of that one detached house:

manol <- grep("MANCHESTER|OLDHAM",filtered_data$la)
manol_data <- filtered_data[manol, ]

ggplot(manol_data, aes(x = price, 
y = la, col = type)) + geom_jitter()

[Figure: Manchester and Oldham sales on the same plot]

How about if we only focus on new builds?

If we go back to the structure of our data, there is a y/n column. ‘Y’ (yes) means a new build, ‘N’ (no) means an existing property.

Let’s filter just for new builds and see what comes up:
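
The filtering code itself isn't shown in the post, so the following is only a guess at what it might look like. It assumes new builds are flagged with "Y" in the y/n column (the str() output only shows "N" values) and uses a few made-up rows in place of the real manol_data so that it runs on its own. The result is named manolnew_data to match the plotting code further down.

```r
# Made-up rows standing in for manol_data; not real Land Registry sales
manol_data <- data.frame(
  price = c(150000, 320000, 95000, 410000),
  la    = c("OLDHAM", "MANCHESTER", "OLDHAM", "MANCHESTER"),
  type  = c("Semi-detached", "Flat", "Terraced", "Flat"),
  `y/n` = c("N", "Y", "N", "Y"),
  check.names = FALSE,        # keep the column name "y/n" as it is
  stringsAsFactors = FALSE
)

# Assumption: "Y" marks a new build
# "y/n" is not a syntactic name, so it is accessed with [[ ]]
manolnew_data <- manol_data[manol_data[["y/n"]] == "Y", ]

print(manolnew_data)
```

The same subsetting line should work unchanged on the full manol_data built earlier.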

[Figure: new-build sales in Manchester and Oldham, before formatting]

Now this is more interesting.

We can see two trends here:

  • Hardly any new homes are being sold in Oldham compared to Manchester
  • Almost all the new homes in Manchester being sold are flats

Let’s tidy up the graph:

#load scales for dollar_format()
library(scales)

ggplot(manolnew_data, aes(x = price, y = la, col = type)) +
  geom_jitter(size = 3) +
  ggtitle("New homes sold in Manchester and Oldham, 2016") +
  labs(y = "", x = "Price", color = "") +
  theme(plot.title = element_text(size = 30),
        legend.title = element_text(size = 18),
        axis.title.x = element_text(size = 28),
        axis.text = element_text(size = 16),
        legend.text = element_text(size = 18),
        legend.key.size = unit(0.8, "cm")) +
  #add pound signs and thousands separator
  scale_x_continuous(labels = dollar_format(prefix = "£"))

[Figure: new homes sold in Manchester and Oldham, 2016, with formatted labels]
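
To check what the axis labels will look like without drawing the plot, the labelling function can be called on its own. This is a small sketch using round prices in the range discussed in this post; pounds and price_labels are names made up for the example.

```r
library(scales)

# dollar_format() returns a function; prefix = "£" swaps the dollar sign for a pound sign
pounds <- dollar_format(prefix = "£")

# Apply it to a few round prices to preview the axis labels
price_labels <- pounds(c(300000, 400000, 3500000))
print(price_labels)
```

Newer releases of scales also offer label_dollar(), which takes the same prefix argument.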

Some swish new flats are being sold in Manchester this year for extraordinary prices – well over £300,000, even £400,000 in three cases.

Meanwhile, hardly anyone is purchasing new homes in Oldham.

A quick glance at the Government’s house building tables shows that only 40 new builds were completed in Oldham in Q2 of 2016, the joint lowest in Greater Manchester. Perhaps that goes some way to explaining why.
