2 Geographic data in R - 2.3 Raster data - 《[英文] Geocomputation with R》

2.3 Raster data

2.3 Raster data

The geographic raster data model usually consists of a raster headerand a matrix (with rows and columns) representing equally spaced cells (often also called pixels; Figure 2.10:A).¹²The raster header defines the coordinate reference system, the extent and the origin.The origin (or starting point) is frequently the coordinate of the lower-left corner of the matrix (the raster package, however, uses the upper left corner, by default (Figure 2.10:B)).The header defines the extent via the number of columns, the number of rows and the cell size resolution.Hence, starting from the origin, we can easily access and modify each single cell by either using the ID of a cell (Figure 2.10:B) or by explicitly specifying the rows and columns.This matrix representation avoids storing explicitly the coordinates for the four corner points (in fact it only stores one coordinate, namely the origin) of each cell corner as would be the case for rectangular vector polygons.This and map algebra makes raster processing much more efficient and faster than vector data processing.However, in contrast to vector data, the cell of one raster layer can only hold a single value.The value might be numeric or categorical (Figure 2.10:C).

Figure 2.10: Raster data types: (A) cell IDs, (B) cell values, (C) a colored raster map.

Raster maps usually represent continuous phenomena such as elevation, temperature, population density or spectral data (Figure 2.11).Of course, we can represent discrete features such as soil or land-cover classes also with the help of a raster data model (Figure 2.11).Consequently, the discrete borders of these features become blurred, and depending on the spatial task a vector representation might be more suitable.

Figure 2.11: Examples of continuous and categorical rasters.

2.3.1 An introduction to raster

The raster package supports raster objects in R.It provides an extensive set of functions to create, read, export, manipulate and process raster datasets.Aside from general raster data manipulation, raster provides many low-level functions that can form the basis to develop more advanced raster functionality.raster also lets you work on large raster datasets that are too large to fit into the main memory.In this case, raster provides the possibility to divide the raster into smaller chunks (rows or blocks), and processes these iteratively instead of loading the whole raster file into RAM (for more information, please refer to vignette("functions", package = "raster").

For the illustration of raster concepts, we will use datasets from the spDataLarge (note these packages were loaded at the beginning of the chapter).It consists of a few raster objects and one vector object covering an area of the Zion National Park (Utah, USA).For example, srtm.tif is a digital elevation model of this area (for more details, see its documentation ?srtm).First, let’s create a RasterLayer object named new_raster:

raster_filepath = system.file("raster/srtm.tif", package = "spDataLarge")
new_raster = raster(raster_filepath)

Typing the name of the raster into the console, will print out the raster header (extent, dimensions, resolution, CRS) and some additional information (class, data source name, summary of the raster values):

new_raster
#> class       : RasterLayer 
#> dimensions  : 457, 465, 212505  (nrow, ncol, ncell)
#> resolution  : 0.000833, 0.000833  (x, y)
#> extent      : -113, -113, 37.1, 37.5  (xmin, xmax, ymin, ymax)
#> coord. ref. : +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0 
#> data source : /home/robin/R/x86_64-pc-linux../3.5/spDataLarge/raster/srtm.tif 
#> names       : srtm 
#> values      : 1024, 2892  (min, max)

Dedicated functions report each component: dim(new_raster) returns the number of rows, columns and layers; the ncell() function the number of cells (pixels); res() the raster’s spatial resolution; extent() its spatial extent; and crs() its coordinate reference system (raster reprojection is covered in Section 6.6).inMemory() reports whether the raster data is stored in memory (the default) or on disk.

help("raster-package") returns a full list of all available raster functions.

2.3.2 Basic map making

Similar to the sf package, raster also provides plot() methods for its own classes.

plot(new_raster)

Figure 2.12: Basic raster plot.

There are several other approaches for plotting raster data in R that are outside the scope of this section, including:

Functions such as spplot() and levelplot() (from the sp and rasterVis packages, respectively) to create facets, a common technique for visualizing change over time.
Packages such as tmap, mapview and leaflet to create interactive maps of raster and vector objects (see Chapter 8).

2.3.3 Raster classes

The RasterLayer class represents the simplest form of a raster object, and consists of only one layer.The easiest way to create a raster object in R is to read-in a raster file from disk or from a server.

raster_filepath = system.file("raster/srtm.tif", package = "spDataLarge")
new_raster = raster(raster_filepath)

The raster package supports numerous drivers with the help of rgdal.To find out which drivers are available on your system, run raster::writeFormats() and rgdal::gdalDrivers().

Rasters can also be created from scratch using the raster() function.This is illustrated in the subsequent code chunk, which results in a new RasterLayer object.The resulting raster consists of 36 cells (6 columns and 6 rows specified by nrows and ncols) centered around the Prime Meridian and the Equator (see xmn, xmx, ymn and ymx parameters).The CRS is the default of raster objects: WGS84.This means the unit of the resolution is in degrees which we set to 0.5 (res).Values (vals) are assigned to each cell: 1 to cell 1, 2 to cell 2, and so on.Remember: raster() fills cells row-wise (unlike matrix()) starting at the upper left corner, meaning the top row contains the values 1 to 6, the second 7 to 12, etc.

new_raster2 = raster(nrows = 6, ncols = 6, res = 0.5, 
                     xmn = -1.5, xmx = 1.5, ymn = -1.5, ymx = 1.5,
                     vals = 1:36)

For other ways of creating raster objects, see ?raster.

Aside from RasterLayer, there are two additional classes: RasterBrick and RasterStack.Both can handle multiple layers, but differ regarding the number of supported file formats, type of internal representation and processing speed.

A RasterBrick consists of multiple layers, which typically correspond to a single multispectral satellite file or a single multilayer object in memory.The brick() function creates a RasterBrick object.Usually, you provide it with a filename to a multilayer raster file but might also use another raster object and other spatial objects (see ?brick for all supported formats).

multi_raster_file = system.file("raster/landsat.tif", package = "spDataLarge")
r_brick = brick(multi_raster_file)

r_brick
#> class       : RasterBrick 
#> resolution  : 30, 30  (x, y)
#> ...
#> names       : landsat.1, landsat.2, landsat.3, landsat.4 
#> min values  :      7550,      6404,      5678,      5252 
#> max values  :     19071,     22051,     25780,     31961

nlayers() retrieves the number of layers stored in a Raster* object:

nlayers(r_brick)
#> [1] 4

A RasterStack is similar to a RasterBrick in the sense that it consists also of multiple layers.However, in contrast to RasterBrick, RasterStack allows you to connect several raster objects stored in different files or multiply objects in memory.More specifically, a RasterStack is a list of RasterLayer objects with the same extent and resolution.Hence, one way to create it is with the help of spatial objects already existing in R’s global environment.And again, one can simply specify a path to a file stored on disk.

raster_on_disk = raster(r_brick, layer = 1)
raster_in_memory = raster(xmn = 301905, xmx = 335745,
                          ymn = 4111245, ymx = 4154085, 
                          res = 30)
values(raster_in_memory) = sample(seq_len(ncell(raster_in_memory)))
crs(raster_in_memory) = crs(raster_on_disk)

r_stack = stack(raster_in_memory, raster_on_disk)
r_stack
#> class : RasterStack
#> dimensions : 1428, 1128, 1610784, 2
#> resolution : 30, 30
#> ...
#> names       :   layer, landsat.1 
#> min values  :       1,      7550 
#> max values  : 1610784,     19071

Another difference is that the processing time for RasterBrick objects is usually shorter than for RasterStack objects.

Decision on which Raster class should be used depends mostly on a character of input data.Processing of a single mulitilayer file or object is the most effective with RasterBrick, while RasterStack allows calculations based on many files, many Raster objects, or both.

Operations on RasterBrick and RasterStack objects will typically return a RasterBrick.