Chapter 9 GWR and spatially lagged regression
9.1 Learning objectives
By the end of this practical you should be able to:
- Explain hypothesis testing
- Execute regression in R
- Describe the assumptions associated with regression models
- Explain steps to deal with spatially autocorrelated (spatial similarity of nearby observations) residuals
Outside of our scheduled sessions you should be doing around 12 hours of extra study per week. Feel free to follow your own GIS interests, but good places to start include the following:
From weeks 6-9, learn and practice analysis from the course and identify appropriate techniques (from wider research) that might be applicable/relevant to your data. Conduct an extensive methodological review – this could include analysis from within academic literature and/or government departments (or any reputable source).
Reading This week:
Chapter 2 “Linear Regression” from Hands-On Machine Learning with R by Boehmke & Greenwell (2020).
Chapters 5 and 6, “Basic Regression” and “Multiple Regression”, from Modern Dive by Ismay and Kim (2019).
Chapter 9 Spatial regression models from Crime Mapping in R by Juanjo Medina and Reka Solymosi (2019).
Remember this is just a starting point; explore the reading list, practical and lecture for more ideas.
9.3 Recommended listening 🎧
Some of these practicals are long, so take regular breaks and have a listen to some of our fav tunes each week.
Andy Haim. They can all play loads of instruments and just switch during live sets.
Adam Music this week - time for some big, dirty rock and roll. The Wildhearts have only gone and released a massive live album - oh yes!
In this practical you will be introduced to a suite of different models that will allow you to test a variety of research questions and hypotheses through modelling the associations between two or more spatially referenced variables.
In the worked example, we will explore the factors that might affect the average exam scores of 16-year-olds across London. GCSEs are the exams taken at the end of secondary education, and here they have been aggregated for all pupils at their home addresses across the city for Ward geographies.
The London Data Store collates a range of other variables for each Ward and so we will see if any of these are able to help explain the patterns of exam performance that we see.
This practical will walk you through the common steps that you should go through when building a regression model using spatial data to test a stated research hypothesis: from carrying out some descriptive visualisation and summary statistics, to interpreting the results and using the outputs of the model to inform your next steps.
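As a rough preview of those steps, here is a minimal sketch in base R. It uses simulated data, not the London ward data used in this practical, and the names `exam_score` and `absence` are hypothetical stand-ins for an outcome and a predictor.

```r
# Simulated data standing in for ward-level variables (hypothetical names,
# NOT the real London data)
set.seed(1)
n <- 200
absence <- runif(n, min = 0.5, max = 2.5)               # hypothetical predictor
exam_score <- 370 - 20 * absence + rnorm(n, sd = 10)    # hypothetical outcome
wards <- data.frame(exam_score, absence)

# Step 1: descriptive summary statistics
print(summary(wards))

# Step 2: fit a simple linear regression
model <- lm(exam_score ~ absence, data = wards)

# Step 3: interpret the coefficients and overall fit
print(summary(model))

# Step 4: inspect the residuals before deciding on next steps
print(summary(residuals(model)))
```

With spatial data, the residuals from the last step are also what you would check for spatial autocorrelation.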
9.4.1 Setting up your Data
First, let’s set up R and read in some data to enable us to carry out our analysis.
#library a bunch of packages we may (or may not) use - install them first if not installed already.
library(tidyverse)
library(tmap)
library(geojsonio)
library(plotly)
#rgdal was retired from CRAN in 2023, so skip this line if it will not install
library(rgdal)
library(broom)
library(mapview)
library(crosstalk)
library(sf)
library(sp)
library(spdep)
library(car)
library(fs)
library(janitor)
Read some ward data in
#download a zip file containing some boundaries we want to use
download.file("https://data.london.gov.uk/download/statistical-gis-boundary-files-london/9ba8c833-6370-4b11-abdc-314aa020d5e0/statistical-gis-boundaries-london.zip",
              destfile="prac9_data/statistical-gis-boundaries-london.zip")
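Note that download.file() cannot write to a folder that does not exist. A minimal sketch for creating the folder first, assuming the relative folder name `prac9_data` from the `destfile` path above and that your working directory is the project folder:

```r
# Folder name taken from the destfile path used above
data_dir <- "prac9_data"

# Create the folder only if it is missing
if (!dir.exists(data_dir)) {
  dir.create(data_dir, recursive = TRUE)
}
```

Run this before the download so the zip file has somewhere to go.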
Get the zip file and extract it
listfiles<-dir_info(here::here("prac9_data")) %>%
  dplyr::filter(str_detect(path, ".zip")) %>%
  dplyr::select(path)%>%
  pull()%>%
  #print out the .zip file
  print()%>%
  as.character()%>%
  utils::unzip(exdir=here::here("prac9_data"))
Look inside the zip and read in the ward boundaries shapefile
#look what is inside the zip
Londonwards<-dir_info(here::here("prac9_data",
                                 "statistical-gis-boundaries-london",
                                 "ESRI"))%>%
  #$ means the path must end with this pattern
  dplyr::filter(str_detect(path,
                           "London_Ward_CityMerged.shp$"))%>%
  dplyr::select(path)%>%
  pull()%>%
  #read in the file
  st_read()
## Reading layer `London_Ward_CityMerged' from data source `C:\Users\Andy\OneDrive - University College London\Teaching\CASA0005\2020_2021\CASA0005repo\prac9_data\statistical-gis-boundaries-london\ESRI\London_Ward_CityMerged.shp' using driver `ESRI Shapefile'
## Simple feature collection with 625 features and 7 fields
## geometry type:  POLYGON
## dimension:      XY
## bbox:           xmin: 503568.2 ymin: 155850.8 xmax: 561957.5 ymax: 200933.9
## projected CRS:  OSGB 1936 / British National Grid
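The `$` at the end of the file pattern matters because a shapefile usually sits alongside other files with similar names. Here is a minimal sketch of the difference, using base R's grepl() on some made-up paths (these paths are hypothetical, not a listing of the real zip). For simple patterns like these, str_detect() should behave the same way.

```r
# Hypothetical file paths, similar in style to those inside the unzipped folder
paths <- c("ESRI/London_Ward_CityMerged.shp",
           "ESRI/London_Ward_CityMerged.shp.xml",
           "ESRI/London_Ward.shp")

# Without "$" the pattern can match anywhere in the path
unanchored <- grepl("London_Ward_CityMerged.shp", paths)

# With "$" the path must end with the pattern
anchored <- grepl("London_Ward_CityMerged.shp$", paths)

# Escaping the dot makes it match a literal "." rather than any character
escaped <- grepl("London_Ward_CityMerged\\.shp$", paths)

print(data.frame(paths, unanchored, anchored, escaped))
```

Only the anchored patterns pick out a single file, which is what st_read() needs.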
#check the data
qtm(Londonwards)