Skip to content

R for Journalists

Unlock the power of R

  • What Is R?
  • R for Rob
  • GitHub
  • Twitter
  • Etsy
  • Home
  • 2021
  • January
  • 1
  • Five useful spatial functions from the sf R package

Five useful spatial functions from the sf R package

Posted on January 1, 2021May 4, 2022 By Rob 1 Comment on Five useful spatial functions from the sf R package
Geospatial data, Landmark Atlas

Author’s note: if you like this post, you might like Landmark Atlas. This is my new blog where I tell stories about Britain’s history, culture and geography through maps.


During my mapping project I have made heavy use of the sf package by Edzer Pebesma.

The package contains intuitive, straightforward functions for working with geospatial data. In this post I’m going to outline five useful functions from the wide array available for anyone interested in working with spatial data in R.

You can find the code on GitHub to practise these functions.

1. read_sf()

The first function you need to know about is read_sf(). This allows you to read shapefiles into R. Shapefiles contain the spatial data in a geometry dimension that you need to plot maps.

Here is a simple example of how to read a shapefile, using British parliamentary constituencies.

Firstly, go to the ONS Geography portal and download the zipped shapefile folder for Westminster constituencies.

It contains the actual shapefile (.shp) as well as other associated files, which are also needed for R to read it correctly.

Extract all files from the folder and put it in a folder in your working directory called constituencies.

Then run:

library(sf)
constituencies <- read_sf('~/constituencies')

This will give you a data frame of 632 observations.

The ~/ indicates to R that this shapefile is within your current working directory. If it’s not, put the full path to the constituencies folder inside read_sf().

On Windows you can find the full path by right-clicking on a folder and clicking Properties and then copying the Location. You will have to change the backslashes to forward slashes, however. This will look on Windows something like C:/Users/You/R/constituencies.

If you call str() on the data frame you get the following:

str(constituencies)
tibble [632 x 10] (S3: sf/tbl_df/tbl/data.frame)
$ objectid : int [1:632] 1 2 3 4 5 6 7 8 9 10 …
$ pcon16cd : chr [1:632] "E14000530" "E14000531" "E14000532" "E14000533" …
$ pcon16nm : chr [1:632] "Aldershot" "Aldridge-Brownhills" "Altrincham and Sale West" "Amber Valley" …
$ bng_e : chr [1:632] "484884" "404723" "374132" "440478" …
$ bng_n : int [1:632] 155126 302568 389051 349675 115542 355939 138824 399819 199541 233685 …
$ long : num [1:632] -0.784 -1.932 -2.39 -1.398 -0.426 …
$ lat : num [1:632] 51.3 52.6 53.4 53 50.9 …
$ st_areasha: num [1:632] 5.30e+07 4.40e+07 5.09e+07 1.25e+08 6.45e+08 …
$ st_lengths: num [1:632] 40827 38222 46098 62106 326799 …
$ geometry :sfc_MULTIPOLYGON of length 632; first list element: List of 1
..$ :List of 1
.. ..$ : num [1:243, 1:2] 485407 485466 485540 485546 485603 …
..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
attr(*, "sf_column")= chr "geometry"
attr(, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA ..- attr(, "names")= chr [1:9] "objectid" "pcon16cd" "pcon16nm" "bng_e" …

The geometry column contains the coordinates that plot the outlines of the constituencies.

2. st_transform()

This function is used to transform geospatial data between different coordinate reference systems (CRSs). A CRS is used to turn Earth’s spherical surface into a flat two-dimensional map. There are various different ways to do this represented in different CRSs.

If you’re working with shapefiles from different sources you might find one uses one CRS and the other uses another. Trying to work with the two with different CRSs is no good – so use st_transform() to bring them into a consistent format.

You can use st_crs() to find out the EPSG code of a spatial data frame. Each CRS will have an EPSG code:

st_crs(constituencies)
Coordinate Reference System:
User input: OSGB 1936 / British National Grid
wkt:
PROJCRS["OSGB 1936 / British National Grid",
BASEGEOGCRS["OSGB 1936",
DATUM["OSGB 1936",
ELLIPSOID["Airy 1830",6377563.396,299.3249646,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4277]],
CONVERSION["British National Grid",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",49,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",-2,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",0.9996012717,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",400000,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",-100000,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["(E)",east,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["(N)",north,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["unknown"],
AREA["UK - Britain and UKCS 49°46'N to 61°01'N, 7°33'W to 3°33'E"],
BBOX[49.75,-9.2,61.14,2.88]],
ID["EPSG",27700]]

We can see that the EPSG code is 27700 for this data frame, as it uses the 1936 Ordnance Survey reference system. We can change it to another common CRS, the WGS 84, using the following:

constituencies_wgs <- st_transform(constituencies, crs = 4326)

I’ve often had frustrating error messages crop up when I come to plot maps only to find that the solution is to transform one of my data frames into the right CRS.

3. st_point(), st_multilinestring(), st_cast()

The constituencies data frame that we have contains polygons of each parliamentary constituency in Great Britain. We may wish to create some of our own polygons from time to time.

We start with st_point(). This is used to create one-dimensional coordinate points. They need to have an x and y coordinate, like so:

pt1 <- st_point(c(-0.76, 51.28))

That is a point with a latitude of 51.28 and a longitude of -0.76. That places it within Aldershot, the first polygon in our constituencies data frame.

Let’s create a few more:

pt2 <- st_point(c(-0.76, 51.3))
pt3 <- st_point(c(-0.74, 51.3))
pt4 <- st_point(c(-0.74, 51.28))

Now let’s plot these points, using the geom_sf() function from ggplot2.

ggplot() + geom_sf(data = pt1) + geom_sf(data = pt2) + geom_sf(data = pt3) + geom_sf(data = pt4)

We have four points independent of each other in a square. We can join them up as a multilinestring:

mls <- st_multilinestring(list(rbind(pt1,pt2,pt3,pt4,pt1)))

We have to put pt1 in twice to join it up back to where we started at point 1. This cuts down our call to plot it:

ggplot() + geom_sf(data = mls)

We now have a nice square. But it is still treated as a linestring, so we won’t be able to fill it in or calculate other functions like its area. To unlock these we need to turn it into a polygon:

polygon <- st_cast(mls, to = 'POLYGON')
ggplot() + geom_sf(data = polygon, fill = 'yellow')

Success! Now we have our polygon.

You will usually want to import other shapefiles for your spatial work. But it can be very useful to know how to draw your own if you wish to use them to annotate other shapefiles.

Here’s how our yellow polygon looks in relation to Aldershot:

#add crs
polygon <- sf::st_sfc(polygon, crs = 4326)
ggplot() + geom_sf(data = constituencies_wgs[1,]) + geom_sf(data = polygon, fill = 'yellow')

4. st_intersects()

This is one of a number of functions that can be used to see whether two polygons occupy the same space.

constituencies_wgs$intersects <- as.character(st_intersects(constituencies_wgs, polygon))

This line of code adds a new column to our constituencies data frame, asking whether our yellow polygon intersects with each of the 632 constituencies of Great Britain. The first constituency, Aldershot, returned ‘1’. The others were all integer(0), indicating they do not touch the yellow polygon at all.

This function can be very handy for narrowing down a large shapefile to a particular area. You can draw a polygon to act as a bounding box as shown above in number three and then use st_intersects() to pinpoint just the part of the shapefile within that box. After that you can remove the superfluous parts of the shapefile from your data frame.

An important constraint in geospatial mapping is process time. Polygons can take a long time to load, work with and visualise so any time-saving devices are most welcome.

5. st_union()

The final function in this post takes more than one polygon and combines them together. This one removes the internal boundaries between them if they are next to each other. Another related function st_combine() keeps the internal boundaries.

union <- st_union(constituencies_wgs[1,],constituencies_wgs[315,])
ggplot() + geom_sf(data = union)

combination <- rbind(constituencies_wgs[1,], constituencies_wgs[315,])
combination <- st_combine(combination)
ggplot() + geom_sf(data = combination)

st_union()
st_combine()

I hope you find these functions useful in your R mapping.

Share this:

  • Click to share on X (Opens in new window) X
  • Click to share on Facebook (Opens in new window) Facebook

Related

Tags: geospatial mapping sf

Post navigation

❮ Previous Post: Introducing Landmark Atlas: my R mapping project on Etsy
Next Post: Sentiment analysis of Nineteen-Eighty-Four: how gloomy is George Orwell’s dystopian novel? ❯

One thought on “Five useful spatial functions from the sf R package”

  1. Pingback: Five more useful spatial functions from the sf package – R for Journalists

Comments are closed.

Recent Posts

  • I’ve moved my blog over to Substack
  • How to plot a large rural area using Ordnance Survey data in R
  • Check the COVID-19 vaccination progress in your area
  • Let R tell you what to watch on Netflix
  • Sentiment analysis of Nineteen-Eighty-Four: how gloomy is George Orwell’s dystopian novel?

Archives

  • April 2022
  • April 2021
  • March 2021
  • February 2021
  • January 2021
  • December 2020
  • February 2020
  • December 2019
  • November 2019
  • October 2019
  • April 2018
  • March 2018
  • January 2018
  • December 2017
  • November 2017
  • October 2017
  • September 2017
  • August 2017
  • July 2017
  • May 2017
  • April 2017
  • March 2017
  • February 2017
  • January 2017
  • December 2016
  • November 2016
  • October 2016
  • September 2016

Categories

  • Geospatial data
  • Landmark Atlas
  • Learn
  • See
  • Seen Elsewhere
  • Site
  • Uncategorized

Copyright © 2025 R for Journalists.

Theme: Oceanly by ScriptsTown

 

Loading Comments...