Chapter 5 Map making

5.1 Learning outcomes

By the end of this practical you should be able to:

  1. List and explain basic mapping concepts across QGIS and R
  2. Interpret and manipulate data from multiple sources
  3. Create near publishable static and interactive mapped outputs
  4. Evaluate and critique mapping approaches between QGIS and R

5.2 Homework

Outside of our scheduled sessions you should be doing around 12 hours of extra study per week. Feel free to follow your own GIS interests, but good places to start include the following:

Exam

Each week we will provide a short task to test your knowledge, these should be used to guide your study for the final exam.

The task this week is to:

  • Fork the repository below you from the spreadsheet last week
  • Run the code, editing it if required
  • Create a map (you can use additional data if you wish)
  • submit a pull request to the original person

Reading

This week:

Watching

Remember this is just a starting point, explore the reading list, practical and lecture for more ideas.

5.4 Introduction

In this practical we’re going to focus on creating mapped outputs in R. For fun we’re going to use data from OpenStreetMap (OSM) and Airbnb.

Within this practical we show how to make maps in just R. We used to provide instructions for QGIS and these have been moved to the additional resources section. It is useful to have some knowledge of QGIS for your future career but it is not needed for the exam.

5.4.1 OSM

OpenStreetMap is collaborative project that has created a free editable map of the World. As users can create their own content it is classed as Volunteered geographic Information (VGI). There is a lot of academic literature on VGI, it’s advnatages and disadvantages. For an overview of VGI checkout this article by Goodchild (2007).

If you are interested in exploring the power of VGI and OSM further checkout missing maps. They aim to map missing places, often in less developed countires, using OSM so that when natural disasters occur first responders can make data informed decisions. They run events all over the world and it’s worth going to meet other spatial professionals, gain some experience with OSM and contribute to a good cause.

5.4.2 Airbnb

Airbnb is an online marketplace that connects people looking to rent homes to those seeking accommodation often over short time periods.

5.5 Data

It’s possible to download OSM data straight from the website, although the interface can be a little unreliable (it works better for small areas). There are, however, a number of websites that allow OSM data to be downloaded more easily and are directly linked to from the ‘Export’ option in OSM. Geofabrik (one of these websites) allows you to download frequently updated shapefiles for various global subdivisions.

5.5.1 OSM

  1. Go to the Geofabrik download server website

  2. Navigate to: Europe > United Kingdom > England > Greater London

  3. Download greater-london-latest-free.shp.zip

  4. Unzip the data and save it to your current folder

5.5.2 London boroughs

We’ll use our London boroughs layer again, either load it from week 1 or download it:

  1. To get the data go to the London data store

  2. Search for Statistical GIS Boundary Files for London

  3. Download the statistical-gis-boundaries-london.zip

  4. Unzip the data and save it to your current folder

5.5.3 World cities

We will use World cities to provide some context to our maps.

  1. Download them from the ArcGIS HUB > Download > Shapefile.

5.5.4 Uk outline

  1. Download GADM data for the UK

5.5.5 Airbnb

  1. Download the listings.csv from the inside airbnb website for London.

In the lecture we discussed how when producing maps there should be some sort of common denominator as opposed to mapping raw counts. Go and source a suitable common denominator then using the skills from previous weeks normalise your data. Hint there is a table on the Office for National Statistics NOMIS website called number of bedrooms which would let you normalise the airbnb and hotel data based on the number of bedrooms in each ward.

5.6 Spatial joining

I will note that i have made use of a new function called st_join(). This is similar to the the joins we explored with attribute data (e.g. joining based on a unique attribute column) but here we just want to join datasets together and keep all their attribute data, this is useful in the code below where i want to join assign the London Borough to each hotel in London. The output will be a massive dataset where each hotel will be a new row and will retain the attributes of the hotel data but also append the attribute of the borough….st_join() defaults to a left join, so in this case the borough data is the left dataset and all the right data has been appended to it. If the left data (borough) had no matches (so no hotels) they would still appear in the final dataset. The default argument for this is st_intersects but we will explore this more next week.

library(sf)

Londonborough <- st_read(here::here("Prac1_data",
                                    "statistical-gis-boundaries-london", 
                                    "ESRI", 
                                    "London_Borough_Excluding_MHW.shp"))%>%
  st_transform(., 27700)
## Reading layer `London_Borough_Excluding_MHW' from data source 
##   `C:\Users\Andy\OneDrive - University College London\Teaching\CASA0005\CASA0005repo\prac1_data\statistical-gis-boundaries-london\ESRI\London_Borough_Excluding_MHW.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 33 features and 8 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 503568.2 ymin: 155850.8 xmax: 561957.5 ymax: 200933.9
## Projected CRS: OSGB36 / British National Grid
OSM <- st_read(here::here("prac5_data",
                          "greater-london-latest-free.shp", 
                          "gis_osm_pois_a_free_1.shp")) %>%
  st_transform(., 27700) %>%
  #select hotels only
  dplyr::filter(fclass == 'hotel')
## Reading layer `gis_osm_pois_a_free_1' from data source 
##   `C:\Users\Andy\OneDrive - University College London\Teaching\CASA0005\CASA0005repo\prac5_data\greater-london-latest-free.shp\gis_osm_pois_a_free_1.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 35197 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -0.5108706 ymin: 51.28117 xmax: 0.322123 ymax: 51.68948
## Geodetic CRS:  WGS 84
join_example <-  st_join(Londonborough, OSM)

head(join_example)
## Simple feature collection with 6 features and 12 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 516362.6 ymin: 155850.8 xmax: 539657.9 ymax: 172367
## Projected CRS: OSGB36 / British National Grid
##                     NAME  GSS_CODE HECTARES NONLD_AREA ONS_INNER SUB_2009
## 1   Kingston upon Thames E09000021 3726.117          0         F     <NA>
## 1.1 Kingston upon Thames E09000021 3726.117          0         F     <NA>
## 1.2 Kingston upon Thames E09000021 3726.117          0         F     <NA>
## 1.3 Kingston upon Thames E09000021 3726.117          0         F     <NA>
## 2                Croydon E09000008 8649.441          0         F     <NA>
## 2.1              Croydon E09000008 8649.441          0         F     <NA>
##     SUB_2006 numeruc    osm_id code fclass                               name
## 1       <NA>       3  28055159 2401  hotel                        Premier Inn
## 1.1     <NA>       3 161631526 2401  hotel           Chessington Safari Hotel
## 1.2     <NA>       3 421483573 2401  hotel            Premier Inn Chessington
## 1.3     <NA>       3 421506999 2401  hotel           Chessington Azteca Hotel
## 2       <NA>    2684  32349145 2401  hotel                       Queens Hotel
## 2.1     <NA>    2684  71738948 2401  hotel Holiday Inn Express London-Croydon
##                           geometry
## 1   MULTIPOLYGON (((516401.6 16...
## 1.1 MULTIPOLYGON (((516401.6 16...
## 1.2 MULTIPOLYGON (((516401.6 16...
## 1.3 MULTIPOLYGON (((516401.6 16...
## 2   MULTIPOLYGON (((535009.2 15...
## 2.1 MULTIPOLYGON (((535009.2 15...

5.6.1 Static map

##Load all our data
library(sf)
library(tmap)
library(tmaptools)
library(tidyverse)
library(here)

# read in all the spatial data and 
# reproject it 

OSM <- st_read(here::here("prac5_data",
                          "greater-london-latest-free.shp", 
                          "gis_osm_pois_a_free_1.shp")) %>%
  st_transform(., 27700) %>%
  #select hotels only
  filter(fclass == 'hotel')
## Reading layer `gis_osm_pois_a_free_1' from data source 
##   `C:\Users\Andy\OneDrive - University College London\Teaching\CASA0005\CASA0005repo\prac5_data\greater-london-latest-free.shp\gis_osm_pois_a_free_1.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 35197 features and 4 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -0.5108706 ymin: 51.28117 xmax: 0.322123 ymax: 51.68948
## Geodetic CRS:  WGS 84
Worldcities <- st_read(here::here("prac5_data", 
                                  "World_Cities", 
                                  "World_Cities.shp")) %>%
  st_transform(., 27700)
## Reading layer `World_Cities' from data source 
##   `C:\Users\Andy\OneDrive - University College London\Teaching\CASA0005\CASA0005repo\prac5_data\World_Cities\World_Cities.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 2540 features and 13 fields
## Geometry type: POINT
## Dimension:     XY
## Bounding box:  xmin: -176.1516 ymin: -54.792 xmax: 179.2219 ymax: 78.2
## Geodetic CRS:  WGS 84
UK_outline <- st_read(here::here("prac5_data", 
                                 "gadm36_GBR_shp", 
                                 "gadm36_GBR_0.shp")) %>%
  st_transform(., 27700)
## Reading layer `gadm36_GBR_0' from data source 
##   `C:\Users\Andy\OneDrive - University College London\Teaching\CASA0005\CASA0005repo\prac5_data\gadm36_GBR_shp\gadm36_GBR_0.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 2 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: -13.69139 ymin: 49.86542 xmax: 1.764168 ymax: 61.52708
## Geodetic CRS:  WGS 84
#London Borough data is already in 277000
Londonborough <- st_read(here::here("Prac1_data",
                                    "statistical-gis-boundaries-london", 
                                    "ESRI", 
                                    "London_Borough_Excluding_MHW.shp"))%>%
  st_transform(., 27700)
## Reading layer `London_Borough_Excluding_MHW' from data source 
##   `C:\Users\Andy\OneDrive - University College London\Teaching\CASA0005\CASA0005repo\prac1_data\statistical-gis-boundaries-london\ESRI\London_Borough_Excluding_MHW.shp' 
##   using driver `ESRI Shapefile'
## Simple feature collection with 33 features and 8 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 503568.2 ymin: 155850.8 xmax: 561957.5 ymax: 200933.9
## Projected CRS: OSGB36 / British National Grid
# read in the .csv
# and make it into spatial data

Airbnb <- read_csv("prac5_data/listings.csv") %>%
  # longitude is considered x value here, latitude is y
  st_as_sf(., coords = c("longitude", "latitude"), 
                   crs = 4326) %>%
    st_transform(., 27700)%>%
    #select entire places that are available all year
    filter(room_type == 'Entire home/apt' & availability_365 =='365')


# make a function for the join
# functions are covered in practical 7
# but see if you can work out what is going on
# hint all you have to do is replace data1 and data2
# with the data you want to use

Joinfun <- function(data1, data2){

output<- data1%>%
  st_join(data2,.) %>%
  add_count(GSS_CODE, name="hotels_in_borough") 

  return(output)
}

# use the function for hotels
Hotels <- Joinfun(OSM, Londonborough)

# then for airbnb
# this is incorrect - change to airbnb2 to look at result
Airbnb <- Joinfun(Airbnb, Londonborough)
 
Worldcities2 <- Worldcities %>%
  filter(CNTRY_NAME=='United Kingdom'&
           Worldcities$CITY_NAME=='Birmingham'|
           Worldcities$CITY_NAME=='London'|
           Worldcities$CITY_NAME=='Edinburgh')

newbb <- c(xmin=-296000, ymin=5408, xmax=655696, ymax=1000000)
  
UK_outlinecrop <- UK_outline$geometry %>%
  st_crop(., newbb)

Hotels <- Hotels %>%
  #at the moment each hotel is a row for the borough
  #we just one one row that has number of airbnbs
  group_by(., GSS_CODE, NAME)%>%
  summarise(`Accomodation count` = unique(hotels_in_borough))

Airbnb <- Airbnb %>%
  group_by(., GSS_CODE, NAME)%>%
  summarise(`Accomodation count` = unique(hotels_in_borough))

This is actually incorrect and we will see why after reading week. Essentially there are two boroughs that have 0 airbnbs with our filters - Kingston upon Thames and Sutton. They don’t appear on the QGIS map and you can explore them on the interactive map later on…

However, in our data a they have a value of 1…why?

Well because this spatial join is like a left join. The Boroughs will always exist (there is always a row of Sutton) we have then grouped by and summarised the Borough (Sutton) and as it has a row it will return a value of 1 even through it is actually 0. To explore this you will need to break the code above, change Airbnb <- Joinfun(Airbnb, Londonborough) to Airbnb2 <- Joinfun(Airbnb, Londonborough) and explore the Airbnb2 object - Sutton will have a single row but no Airbnb names!

Let’s check this…from the code below Sutton has 1 Airbnb (meeting our criteria) however we know it does not!

Airbnb %>%
  filter(NAME=="Sutton")
## Simple feature collection with 1 feature and 3 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 522223.3 ymin: 159658.1 xmax: 531245.8 ymax: 167622.5
## Projected CRS: OSGB36 / British National Grid
## # A tibble: 1 × 4
## # Groups:   GSS_CODE [1]
##   GSS_CODE  NAME   `Accomodation count`                                 geometry
## * <chr>     <chr>                 <int>                       <MULTIPOLYGON [m]>
## 1 E09000029 Sutton                    1 (((528552.3 159658.1, 528399.7 159928.8…

Make the map

tmap_mode("plot")

# set the breaks
# for our mapped data
breaks = c(0, 5, 12, 26, 57, 286) 

# plot each map
tm1 <- tm_shape(Hotels) + 
  tm_polygons("Accomodation count", 
              breaks=breaks,
              palette="PuBu")+
  tm_legend(show=FALSE)+
  tm_layout(frame=FALSE)+
  tm_credits("(a)", position=c(0,0.85), size=1.5)

tm2 <- tm_shape(Airbnb) + 
  tm_polygons("Accomodation count",
              breaks=breaks, 
              palette="PuBu") + 
  tm_legend(show=FALSE)+
  tm_layout(frame=FALSE)+
  tm_credits("(b)", position=c(0,0.85), size=1.5)

tm3 <- tm_shape(UK_outlinecrop)+ 
  tm_polygons(col="darkslategray1")+
  tm_layout(frame=FALSE)+
  tm_shape(Worldcities2) +
  tm_symbols(col = "red", scale = .5)+
  tm_text("CITY_NAME", xmod=-1, ymod=-0.5)

legend <- tm_shape(Hotels) +
    tm_polygons("Accomodation count",
                breaks=breaks,
                palette="PuBu") +
    tm_scale_bar(position=c(0.2,0.04), text.size=0.6)+
    tm_compass(north=0, position=c(0.65,0.6))+
    tm_layout(legend.only = TRUE, legend.position=c(0.2,0.25),asp=0.1)+
    tm_credits("(c) OpenStreetMap contrbutors and Air b n b", position=c(0.0,0.0))
  
t=tmap_arrange(tm1, tm2, tm3, legend, ncol=2)

t

We can also arrange our maps using the grid package…

library(grid)
# erases the current device or moves to a new page 
# probably not needed but makes sure you are plotting on a new page.
grid.newpage()

pushViewport(viewport(layout=grid.layout(2,2)))
print(tm1, vp=viewport(layout.pos.col=1, layout.pos.row=1, height=5))
print(tm2, vp=viewport(layout.pos.col=2, layout.pos.row=1, height=5))
print(tm3, vp=viewport(layout.pos.col=1, layout.pos.row=2, height=5))
print(legend, vp=viewport(layout.pos.col=2, layout.pos.row=2, height=5))

5.6.2 Inset map

Londonbb <- st_bbox(Airbnb,
                    crs = st_crs(Airbnb))%>%
  #we need this to convert it into a class of sf
  # otherwise it our bb won't have a class it will just be x and y coordinates for the box
  # this makes it into a polygon
  st_as_sfc()
main <- tm_shape(Airbnb, bbbox = Londonbb) + 
  tm_polygons("Accomodation count",
              breaks=breaks, 
              palette="PuBu")+
  tm_scale_bar(position = c("left", "bottom"), text.size = .75)+
  tm_layout(legend.position = c("right","top"), 
            legend.text.size=.75, 
            legend.title.size = 1.1,
            frame=FALSE)+
  tm_credits("(c) OpenStreetMap contrbutors and Air b n b", position=c(0.0,0.0))+
  #tm_text(text = "NAME", size = .5, along.lines =T, remove.overlap=T,  auto.placement=F)+
  tm_compass(type = "8star", position = c(0.06, 0.1)) +

  #bottom left top right
  tm_layout(inner.margin=c(0.02,0.02,0.02,0.2))
inset = tm_shape(UK_outlinecrop) + tm_polygons() +
  tm_shape(Londonbb)+ 
  tm_borders(col = "grey40", lwd = 3)+
    tm_layout(frame=FALSE,
            bg.color = "transparent")+
  tm_shape(Worldcities2) +
  tm_symbols(col = "red", scale = .5)+
  tm_text("CITY_NAME", xmod=-1.5, ymod=-0.5)
library(grid)
main
print(inset, vp = viewport(0.86, 0.29, width = 0.5, height = 0.55))

5.6.3 Export

So how do we output our map then…

tmap_save(t, 'hotelsandairbnbR.png')

library(grid)
tmap_save(main,insets_tm = inset,insets_vp=viewport(x=0.86, y=0.29, width=.5, height=.55), filename="test.pdf", dpi=600)

5.6.4 Basic interactive map

But could we not also make an interactive map like we did in practical 2?

tmap_mode("view")

tm_shape(Airbnb) + 
  tm_polygons("Accomodation count", breaks=breaks) 

5.6.5 Advanced interactive map

But let’s take it a bit further so we can select our layers on an interactive map..

# library for pop up boxes
library(leafpop)
library(leaflet)

#join data
Joined <- Airbnb%>%
  st_join(., Hotels, join = st_equals)%>%
  dplyr::select(GSS_CODE.x, NAME.x, `Accomodation count.x`, `Accomodation count.y`)%>%
  dplyr::rename(`GSS code` =`GSS_CODE.x`,
                `Borough` = `NAME.x`,
                `Airbnb count` = `Accomodation count.x`,
                `Hotel count`= `Accomodation count.y`)%>%
  st_transform(., 4326)
  
  
#remove the geometry for our pop up boxes to avoid
popupairbnb <-Joined %>%
  st_drop_geometry()%>%
  dplyr::select(`Airbnb count`, Borough)%>%
  popupTable()

popuphotel <-Joined %>%
  st_drop_geometry()%>%
  dplyr::select(`Hotel count`, Borough)%>%
  popupTable()

tmap_mode("view")

# set the colour palettes using our previously defined breaks


pal1 <- Joined %>%
  colorBin(palette = "YlOrRd", domain=.$`Airbnb count`, bins=breaks)

pal1 <-colorBin(palette = "YlOrRd", domain=Joined$`Airbnb count`, bins=breaks)

pal2 <- Joined %>%
  colorBin(palette = "YlOrRd", domain=.$`Hotel count`, bins=breaks)


map<- leaflet(Joined) %>%

  #add our polygons, linking to the tables we just made
  addPolygons(color="white", 
              weight = 2,
              opacity = 1,
              dashArray = "3",
              popup = popupairbnb,
              fillOpacity = 0.7,
              fillColor = ~pal2(`Airbnb count`),
              group = "Airbnb")%>%
  
  addPolygons(fillColor = ~pal2(`Hotel count`), 
              weight = 2,
              opacity = 1,
              color = "white",
              dashArray = "3",
              popup = popupairbnb,
              fillOpacity = 0.7,group = "Hotels")%>%
  
  #add basemaps
  addTiles(group = "OSM (default)") %>%
  addProviderTiles(providers$Stadia.StamenToner, group = "Toner") %>%
  addProviderTiles(providers$Stadia.StamenTonerLite, group = "Toner Lite") %>%
  addProviderTiles(providers$CartoDB.Positron, group = "CartoDB")%>%
  
  # add a legend
  addLegend(pal = pal2, values = ~`Hotel count`, group = c("Airbnb","Hotel"), 
            position ="bottomleft", title = "Accomodation count") %>%
  # specify layers control
  addLayersControl(
    baseGroups = c("OSM (default)", "Toner", "Toner Lite", "CartoDB"),
    overlayGroups = c("Airbnb", "Hotels"),
    options = layersControlOptions(collapsed = FALSE)
  )

# plot the map
map

The legend on this map doesn’t update when you select a different layer. I could have enabled this by chaning the group argument to just Airbnb or Hotel, then calling a second addLegend() function for the other group. However, when displaying maps such as these, it’s often useful to have a consistent scale in the legend so they are directly comparable.

If you want to explore Leaflet more have a look at the leaflet for R Guide

To see other basemap options (there are loads!) have a look here at leaflet extras

5.7 Bad maps

What makes a bad map then… and what should you avoid:

  • Poor labeling — don’t present something as an output with the file name (e.g. layer_1_osm) in the legend — name your layers properly, it’s really easy to do and makes a big difference to the quality of the map.
  • No legend
  • Screenshot of the map — export it properly, we’ve been doing this a while and can tell
  • Change the values in the legend … what is aesthetically more pleasing 31.99999 or 32?. Make it as easy as possible to interpret your map.
  • Too much data presented on one map — be selective or plot multiple maps
  • Presented data is too small or too big — be critical about what you produce, it should be easy to read and understand
  • A map or figure without enough detail — A reader should be able to understand a map or figure using the graphic in the figure/map and the caption alone! A long caption is fine assuming it’s all relevant information.

For more cartography ideas/advice have a look at Katie Jolly’s blog post on urban heat islands, consult axis map catography guide and check out the data is beautiful reddit.

Another decent resource is the Fundamentals of Data Visualization book

5.8 Feedback

Was anything that we explained unclear this week or was something really clear…let us know using the feedback form. It’s anonymous and we’ll use the responses to clear any issues up in the future / adapt the material.