Overview
The main focus of this week’s assignment is building a simple statistical (regression) model to help us understand a spatial pattern.
We will also look at the RMarkdown file format, which you can use to present analysis results in this or other assignments. Using it is not required, but it can be a very effective and convenient way of presenting results.
Because the lab material is presented in RMarkdown format in
.Rmd files, we will consider that first, before going on to
the assignment proper.
This means that the procedure this week is slightly different. You
should download all the materials for this week’s lab from this link. You should then
uncompress them to a folder on your computer, and set up a project in
that folder as usual. Then inside the project, open up the file
rmarkdown.Rmd.
That file explains itself, and you should spend a little bit of time with it to get used to the idea of RMarkdown.
When you are done, you can come back to these lab instructions in the
usual way, or you can follow the instructions in the
07-lab-instructions.Rmd file instead (the instructions are
the same in that document as in this one).
Building a simple statistical model
In this assignment you will build a simple regression model of the Airbnb listings in and around Wellington that we assembled a few weeks ago. The model will aim to account for variation in the number of listings in terms of the age structure of the population (from the census) and the numbers of various ‘amenities’ such as cafés, retail, and so on.
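To give a rough sense of what a simple regression model looks like in R, here is a minimal sketch that uses made-up data, not the lab data. The column names n_listings and n_cafes and all of the simulated values are hypothetical. The name a15_29 is borrowed from the base data, but the values here are random.

```r
# Simulated data only: none of these numbers come from the lab datasets
set.seed(1)
n <- 50
toy <- data.frame(
  a15_29  = runif(n, min = 10, max = 40),  # pretend % of population aged 15 to 29
  n_cafes = rpois(n, lambda = 5)           # pretend count of cafes (hypothetical)
)

# Pretend listing counts that depend on both variables, plus random noise
toy$n_listings <- 2 + 0.5 * toy$a15_29 + 1.5 * toy$n_cafes + rnorm(n, sd = 3)

# Fit a linear regression of listings on the two explanatory variables
m <- lm(n_listings ~ a15_29 + n_cafes, data = toy)

# Coefficient estimates, standard errors and R-squared
summary(m)
```

The formula n_listings ~ a15_29 + n_cafes reads as "model n_listings as a function of a15_29 and n_cafes". The model built in the assignment itself may use different variables.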
Libraries
Before you start, as usual we need some libraries.
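library(sf)     # reading and handling spatial data
library(tmap)   # thematic maps
library(dplyr)  # data manipulation
tmap_mode("view")  # make interactive web maps by default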
The data
Provided you have unpacked this week’s materials to an accessible
folder and opened a .Rproj file in the usual way, you
should find the datasets by simply running the commands shown below. If
that doesn’t seem to work, then download the data from the links
provided in the section below. You should find all the data in a
subfolder called data, if you unpacked the zip file
correctly.
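If you are unsure whether the files ended up in the right place, a quick check along the following lines may help. This is only a sketch: it assumes the two file names used in the commands in this lab, and that your working directory is the project folder.

```r
# The data files this lab reads, relative to the project folder
expected <- c("data/wellington-base-data.gpkg", "data/abb.gpkg")

# TRUE for each file that R can see from the current working directory
found <- file.exists(expected)
print(setNames(found, expected))

# If any are FALSE, check where R is looking and what is in the data folder
print(getwd())
print(list.files("data"))
```

If getwd() does not show your project folder, the project was probably not opened from its .Rproj file.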
Base data
The base data are in this file. Open them
with st_read:
welly <- st_read("data/wellington-base-data.gpkg")
and take a look with a plot command:
plot(welly, lwd = 0.5, pal = RColorBrewer::brewer.pal(9, "Reds"))
I’ve used the pal option here to get a nicer colour
palette than the plot command default.
If you really want to get a feel for the distribution of different
variables, then you should make some tmap maps of
individual attributes. If you make web maps with
tmap_mode("view") you will also be able to get a closer
look at things.
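As a minimal sketch of mapping a single attribute, something like the following should work. It assumes the data file is where the st_read command above expects it, and it uses pop and a15_29, two of the attributes described in the table that follows.

```r
library(sf)
library(tmap)

welly <- st_read("data/wellington-base-data.gpkg")

# Interactive web map, so you can zoom and pan
tmap_mode("view")

# Map one attribute at a time; here the total population of each SA2
pop_map <- tm_shape(welly) +
  tm_polygons("pop")
pop_map

# Swap in any other attribute name, for example the % aged 15 to 29
tm_shape(welly) +
  tm_polygons("a15_29")
```

Passing the attribute name as the first argument of tm_polygons colours each polygon by that attribute.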
The attributes in this base data set are:

| Attribute | Description |
|---|---|
| sa2_id | Statistical Area 2 (SA2) ID |
| sa2_name | Statistical Area 2 name |
| ta_name | Territorial Authority name, limited to just Wellington City and Lower Hutt City (this is a smaller area than we originally looked at, to allow easier mapping) |
| pop | Total population of SA2 per the 2018 Census |
| u15 | % of population under 15 |
| a15_29 | % of population aged from 15 to 29 |
| a30_64 | % of population aged from 30 to 64 |
| o65 | % of population aged 65 and over |
| dist_cbd | Approximate distance in km to the CBD. This is measured in a straight line, not over the road network |
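One way to get a quick numerical feel for these attributes is sketched below. It assumes the column names are exactly as listed in the table. The check that the four age groups sum to roughly 100 is my own suggestion, not a required step.

```r
library(sf)
library(dplyr)

welly <- st_read("data/wellington-base-data.gpkg")

# Drop the geometry so we are left with an ordinary table of attributes
welly_attrs <- st_drop_geometry(welly)

# Summary statistics for each attribute
summary(welly_attrs)

# The four age-group percentages should add up to about 100 in each SA2
age_check <- welly_attrs %>%
  mutate(age_total = u15 + a15_29 + a30_64 + o65) %>%
  select(sa2_name, age_total)

head(age_check)
```

Values of age_total that are far from 100 would suggest missing or suppressed census values for that area.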
Airbnb locations
We have already seen this dataset, which records all the Airbnb locations across the wider region.
abb <- st_read("data/abb.gpkg")
tm_shape(abb) +
  tm_dots()