13.4 Create census rasters

After the preprocessing, the data can be converted into a raster stack or brick (see Sections 2.3.3 and 3.3.1). rasterFromXYZ() makes this really easy. It requires an input data frame where the first two columns represent coordinates on a regular grid. All the remaining columns (here: pop, women, mean_age, hh_size) will serve as input for the raster brick layers (Figure 13.1; see also code/13-location-jm.R in our GitHub repository).

    input_ras = rasterFromXYZ(input_tidy, crs = st_crs(3035)$proj4string)

    input_ras
    #> class : RasterBrick
    #> dimensions : 868, 642, 557256, 4 (nrow, ncol, ncell, nlayers)
    #> resolution : 1000, 1000 (x, y)
    #> extent : 4031000, 4673000, 2684000, 3552000 (xmin, xmax, ymin, ymax)
    #> coord. ref. : +proj=laea +lat_0=52 +lon_0=10
    #> names : pop, women, mean_age, hh_size
    #> min values : 1, 1, 1, 1
    #> max values : 6, 5, 5, 5
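
To make the input requirement concrete, the following base R sketch builds a toy XYZ table and arranges its values on the grid implied by the coordinates. It only illustrates the idea of a regular grid and is not how rasterFromXYZ() is implemented; the object names (xyz, grid) and all numbers are made up for this example.

```r
# Toy XYZ table: cell-centre coordinates of a regular 3 x 2 grid plus one
# value column. All names and numbers are illustrative assumptions.
xyz = expand.grid(x = c(500, 1500, 2500), y = c(1500, 500))
xyz$pop = c(1, 2, 3, 4, 5, 6)

# On a regular grid, the unique coordinates are evenly spaced,
# so each of these should be a single number (the resolution)
res_x = unique(diff(sort(unique(xyz$x))))
res_y = unique(diff(sort(unique(xyz$y))))

# Arrange the values as a matrix with north at the top, like a raster layer
xs = sort(unique(xyz$x))
ys = sort(unique(xyz$y), decreasing = TRUE)
grid = matrix(NA_real_, nrow = length(ys), ncol = length(xs))
grid[cbind(match(xyz$y, ys), match(xyz$x, xs))] = xyz$pop
grid
```

If res_x or res_y contained more than one value, the coordinates would not form a regular grid and the table would not be suitable input in this form.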

Note that we are using an equal-area projection (EPSG:3035; Lambert Equal Area Europe), i.e., a projected CRS where each grid cell has the same area, here 1000 x 1000 m (1 km²). Since we are using mainly densities such as the number of inhabitants or the proportion of women per grid cell, it is of utmost importance that the area of each grid cell is the same to avoid ‘comparing apples and oranges’. Be careful with geographic CRS, where grid cell areas constantly decrease in poleward directions (see also Section 2.4 and Chapter 6).
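
The poleward shrinkage of grid cells in a geographic CRS can be illustrated with a short calculation. The sketch below assumes a spherical Earth with a radius of 6371 km (a simplification; real CRSs use an ellipsoid) and computes the area of a 1 x 1 degree cell at several latitudes. The function name cell_area_km2() and the chosen latitudes are our own and serve only as an example.

```r
# Area (km^2) of a cell spanning `size` degrees in both directions whose
# southern edge lies at latitude `lat`, on a sphere (assumed radius: 6371 km).
# Formula: R^2 * delta_lambda * (sin(phi_north) - sin(phi_south))
cell_area_km2 = function(lat, size = 1, radius_km = 6371) {
  to_rad = pi / 180
  radius_km^2 * (size * to_rad) *
    (sin((lat + size) * to_rad) - sin(lat * to_rad))
}

lats = c(0, 30, 47, 54, 80)   # 47 and 54 lie roughly within Germany
areas = cell_area_km2(lats)
data.frame(lat = lats,
           area_km2 = round(areas),
           share_of_equator = round(areas / areas[1], 2))
```

Under these assumptions, a one-degree cell in northern Germany covers noticeably less ground than one in the south, so counts per cell would not be comparable without an equal-area grid.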

Figure 13.1: Gridded German census data of 2011 (see Table 13.1 for a description of the classes).

The next stage is to reclassify the values of the rasters stored in input_ras in accordance with the survey mentioned in Section 13.2, using the raster function reclassify(), which was introduced in Section 4.3.3. In the case of the population data, we convert the classes into a numeric data type using class means. Raster cells are assumed to have a population of 127 if they have a value of 1 (cells in ‘class 1’ contain between 3 and 250 inhabitants) and 375 if they have a value of 2 (containing 250 to 500 inhabitants), and so on (see Table 13.1). A cell value of 8000 inhabitants was chosen for ‘class 6’ because these cells contain more than 8000 people. Of course, these are approximations of the true population, not precise values. However, the level of detail is sufficient to delineate metropolitan areas (see next section).
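
The class means can be reproduced with a few lines of base R. In this sketch, only the bounds of classes 1 and 2 are taken from the text above; the bounds used for classes 3 to 5 are assumptions, chosen because they reproduce the values that appear in rcl_pop later in this section (Table 13.1 is the authoritative source). Note that ceiling() is used because R's round() would turn 126.5 into 126.

```r
# Lower and upper bounds of population classes 1 to 5.
# Classes 1 and 2 are stated in the text; classes 3 to 5 are assumed here.
lower = c(3, 250, 500, 2000, 4000)
upper = c(250, 500, 2000, 4000, 8000)

# Class means, rounded up to whole inhabitants (126.5 becomes 127)
class_means = ceiling((lower + upper) / 2)

# Class 6 is open-ended (more than 8000 inhabitants), so it has no mean;
# its lower bound of 8000 is used instead
pop_values = c(class_means, 8000)
pop_values
```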

In contrast to the pop variable, representing absolute estimates of the total population, the remaining variables were reclassified as weights corresponding to the weights used in the survey. Class 1 in the variable women, for instance, represents areas in which 0 to 40% of the population is female; these are reclassified with a comparatively high weight of 3 because the target demographic is predominantly male. Similarly, the classes containing the youngest people and the highest proportion of single households are reclassified to have high weights.

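    # Each matrix row reads: from, to, new value
    # Population: class number -> class mean (class 6: 8000)
    rcl_pop = matrix(c(1, 1, 127, 2, 2, 375, 3, 3, 1250,
                       4, 4, 3000, 5, 5, 6000, 6, 6, 8000),
                     ncol = 3, byrow = TRUE)
    # Women: classes 1 to 3 get weights 3, 2, 1; classes 4 and 5 get 0
    rcl_women = matrix(c(1, 1, 3, 2, 2, 2, 3, 3, 1, 4, 5, 0),
                       ncol = 3, byrow = TRUE)
    # Mean age: only class 1 gets a weight (3); classes 2 to 5 get 0
    rcl_age = matrix(c(1, 1, 3, 2, 2, 0, 3, 5, 0),
                     ncol = 3, byrow = TRUE)
    # Household size uses the same weights as women
    rcl_hh = rcl_women
    # Same order as the layers of input_ras
    rcl = list(rcl_pop, rcl_women, rcl_age, rcl_hh)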
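
To see how such a three-column matrix is read, the sketch below mimics the lookup on a plain numeric vector in base R. apply_rcl() is a hypothetical helper written for this illustration, not a raster function; it assumes intervals that are closed on both sides and leaves values outside all intervals unchanged.

```r
# Hypothetical helper (not part of raster): look up each value in a
# from-to-becomes matrix, treating every interval as closed on both sides.
# Values that fall in no interval (including NA) are left as they are.
apply_rcl = function(values, rcl) {
  out = values
  for (r in seq_len(nrow(rcl))) {
    hit = !is.na(values) & values >= rcl[r, 1] & values <= rcl[r, 2]
    out[hit] = rcl[r, 3]
  }
  out
}

# Same numbers as rcl_women, copied here so the example is self-contained
rcl_women_demo = matrix(c(1, 1, 3, 2, 2, 2, 3, 3, 1, 4, 5, 0),
                        ncol = 3, byrow = TRUE)
res = apply_rcl(c(1, 2, 3, 4, 5, NA), rcl_women_demo)
res
```

The last matrix row covers both class 4 and class 5, which is why a single row is enough to assign them a weight of 0.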

Note that we have made sure that the order of the reclassification matrices in the list is the same as for the elements of input_ras. For instance, the first element corresponds in both cases to the population. Subsequently, the for-loop applies the reclassification matrix to the corresponding raster layer. Finally, the code chunk below ensures the reclass layers have the same name as the layers of input_ras.

    reclass = input_ras
    for (i in seq_len(nlayers(reclass))) {
      # right = NA closes the intervals on both sides
      reclass[[i]] = reclassify(x = reclass[[i]], rcl = rcl[[i]], right = NA)
    }
    names(reclass) = names(input_ras)

    reclass
    #> ... (full output not shown)
    #> names : pop, women, mean_age, hh_size
    #> min values : 127, 0, 0, 0
    #> max values : 8000, 3, 3, 3