7.3 Geographic data packages

A multitude of R packages have been developed for accessing geographic data, some of which are presented in Table 7.1. These provide interfaces to one or more spatial libraries or geoportals and aim to make data access even quicker from the command line.

Table 7.1: Selected R packages for geographic data retrieval.
Package         Description
getlandsat      Provides access to Landsat 8 data.
osmdata         Download and import of OpenStreetMap data.
raster          getData() imports administrative, elevation, WorldClim data.
rnaturalearth   Access to Natural Earth vector and raster data.
rnoaa           Imports National Oceanic and Atmospheric Administration (NOAA) climate data.
rWBclimate      Access World Bank climate data.

It should be emphasised that Table 7.1 represents only a small number of available geographic data packages. Other notable packages include GSODR, which provides Global Summary Daily Weather Data in R (see the package’s README for an overview of weather data sources); tidycensus and tigris, which provide socio-demographic vector data for the USA; and hddtools, which provides access to a range of hydrological datasets.

Each data package has its own syntax for accessing data. This diversity is demonstrated in the subsequent code chunks, which show how to get data using three packages from Table 7.1. Country borders are often useful and these can be accessed with the ne_countries() function from the rnaturalearth package as follows:

    library(rnaturalearth)
    usa = ne_countries(country = "United States of America") # United States borders
    class(usa)
    #> [1] "SpatialPolygonsDataFrame"
    #> attr(,"package")
    #> [1] "sp"
    # alternative way of accessing the data, with raster::getData()
    # getData("GADM", country = "USA", level = 0)

By default rnaturalearth returns objects of class Spatial. The result can be converted into an sf object with st_as_sf() as follows:

    library(sf) # provides st_as_sf()
    usa_sf = st_as_sf(usa)
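As a possible shortcut, ne_countries() also accepts a returnclass argument, so the conversion step can be skipped by requesting an sf object directly. The sketch below assumes that the rnaturalearth and sf packages are installed; the object name usa_sf_direct is chosen here for illustration only.

```r
library(rnaturalearth)
library(sf)

# Request an sf object directly instead of converting a Spatial object afterwards
usa_sf_direct = ne_countries(country = "United States of America",
                             returnclass = "sf")

# The result should inherit from class "sf"
inherits(usa_sf_direct, "sf")
```

Whichever route is taken, the resulting object can be used with the sf functions shown elsewhere in the book.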

A second example downloads a series of rasters containing global monthly precipitation sums with a spatial resolution of ten arc-minutes. The result is a multilayer object of class RasterStack.

    library(raster)
    worldclim_prec = getData(name = "worldclim", var = "prec", res = 10)
    class(worldclim_prec)
    #> [1] "RasterStack"
    #> attr(,"package")
    #> [1] "raster"
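Because the download requires a network connection, it can help to explore the structure of a multilayer object offline first. The following sketch builds a small synthetic RasterStack with twelve layers standing in for the monthly rasters. The values are random, and the grid size and layer names (month_1 to month_12) are assumptions made for illustration, not the actual WorldClim data.

```r
library(raster)

set.seed(2018) # make the random stand-in values reproducible

# Create twelve coarse global layers (18 rows x 36 columns, i.e., 10-degree cells)
monthly_layers = lapply(1:12, function(i) {
  r = raster(nrows = 18, ncols = 36)
  values(r) = runif(ncell(r), min = 0, max = 300) # synthetic 'precipitation'
  r
})

# Combine the layers into one multilayer object
prec_demo = stack(monthly_layers)
names(prec_demo) = paste0("month_", 1:12) # hypothetical layer names

class(prec_demo)   # a RasterStack
nlayers(prec_demo) # number of layers in the stack

# Extract a single layer by name
january = prec_demo[["month_1"]]

# Collapse the stack into one layer holding the cell-wise annual sum
annual_sum = calc(prec_demo, fun = sum)
```

The same functions, such as nlayers() and [[ ]] subsetting, can be applied to a downloaded RasterStack.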

A third example uses the osmdata package (Padgham et al. 2018) to find parks from the OpenStreetMap (OSM) database. As illustrated in the code chunk below, queries begin with the function opq() (short for OpenStreetMap query), the first argument of which is a bounding box, or a text string representing a bounding box (the city of Leeds in this case). The result is passed to a function for selecting which OSM elements we’re interested in (parks in this case), represented by key-value pairs. Next, the query is passed to the function osmdata_sf(), which does the work of downloading the data and converting it into a list of sf objects (see vignette('osmdata') for further details):

    library(osmdata)
    parks = opq(bbox = "leeds uk") %>%
      add_osm_feature(key = "leisure", value = "park") %>%
      osmdata_sf()
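The query-building steps can be separated from the download, which makes it possible to inspect a query before sending it. The sketch below passes a numeric bounding box in the form c(xmin, ymin, xmax, ymax) rather than a place name. The coordinates are only a rough approximation of the Leeds area and are an assumption for illustration, as are the object names.

```r
library(osmdata)

# Approximate bounding box around Leeds: c(xmin, ymin, xmax, ymax)
# (illustrative coordinates, not an official boundary)
leeds_bb = c(-1.80, 53.70, -1.29, 53.95)

# Step 1: start the query from the bounding box
leeds_query = opq(bbox = leeds_bb)

# Step 2: select the OSM elements of interest with a key-value pair
parks_query = add_osm_feature(leeds_query, key = "leisure", value = "park")

# Inspect the Overpass query text built so far; nothing is downloaded here
query_text = opq_string(parks_query)
cat(query_text)

# Step 3 (requires a network connection, hence commented out):
# parks = osmdata_sf(parks_query)
```

Building the query in steps like this is equivalent to the piped version; only the final osmdata_sf() call contacts the server.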

OpenStreetMap is a vast global database of crowd-sourced data and it is growing daily. Although the quality is not as spatially consistent as many official datasets, OSM data have many advantages: they are globally available free of charge and using crowd-sourced data can encourage ‘citizen science’ and contributions back to the digital commons. Further examples of osmdata in action are provided in Chapters 9, 12 and 13.

Sometimes, packages come with inbuilt datasets. These can be accessed in four ways: by attaching the package (if the package uses ‘lazy loading’ as spData does), with data(dataset), by referring to the dataset with pkg::dataset or with system.file() to access raw data files. The following code chunk illustrates the latter two options using the world dataset (already loaded by attaching its parent package with library(spData)):

    library(sf) # provides st_read()
    world2 = spData::world
    world3 = st_read(system.file("shapes/world.gpkg", package = "spData"))
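The first two options, attaching the package and calling data(), can be sketched in a similar way. The example assumes that spData is installed; the environment name demo_env and the other object names are chosen here for illustration.

```r
library(spData)

# Option 1: lazy loading makes the dataset available by name after attaching
world_classes = class(world)

# Option 2: data() loads the dataset explicitly, here into a separate
# environment to avoid overwriting objects in the workspace
demo_env = new.env()
data("world", package = "spData", envir = demo_env)
exists("world", envir = demo_env)

# List the names of all datasets shipped with the package
spdata_items = data(package = "spData")$results[, "Item"]
head(spdata_items)

# List the raw files that system.file() can point to
list.files(system.file("shapes", package = "spData"))
```

Listing the available datasets and raw files first is a convenient way to find out what a data package contains before loading anything.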