14.6 Exercises

  • Run an NMDS using the percentage data of the community matrix. Report the stress value and compare it to the stress value retrieved from the NMDS using presence-absence data. What might explain the observed difference?

  • Compute all the predictor rasters we have used in the chapter (catchment slope, catchment area), and put them into a raster stack. Add dem and ndvi to the raster stack. Next, compute profile and tangential curvature as additional predictor rasters and add them to the raster stack (hint: grass7:r.slope.aspect). Finally, construct a response-predictor matrix. The scores of the first NMDS axis (which were the result when using the presence-absence community matrix) rotated in accordance with elevation represent the response variable, and should be joined to random_points (use an inner join). To complete the response-predictor matrix, extract the values of the environmental predictor raster stack to random_points.

  • Use the response-predictor matrix of the previous exercise to fit a random forest model. Find the optimal hyperparameters and use them for making a prediction map.

  • Retrieve the bias-reduced RMSE of a random forest model using spatial cross-validation including the estimation of optimal hyperparameter combinations (random search with 50 iterations) in an inner tuning loop (see Section 11.5.2). Parallelize the tuning level (see Section 11.5.2). Report the mean RMSE and use a boxplot to visualize all retrieved RMSEs.

  • Retrieve the bias-reduced RMSE of a simple linear model using spatial cross-validation. Compare the result to the result of the random forest model by making RMSE boxplots for each modeling approach.
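
The resampling idea behind the last two exercises can be sketched in base R. This is a minimal illustration on simulated data, not a solution: the data frame d, its columns and all of its values are invented, and k-means clustering of the coordinates is only one simple way to obtain spatially disjoint folds. The exercises themselves call for the tuning and resampling workflow referred to in Section 11.5.2.

```r
# Simulated data: every value below is made up for illustration only
set.seed(1)
n = 200
d = data.frame(x = runif(n), y = runif(n))
d$elev = 100 * d$y + rnorm(n, sd = 5)      # hypothetical predictor
d$sc = 0.02 * d$elev + rnorm(n, sd = 0.3)  # hypothetical response

# Spatial partitioning: k-means on the coordinates gives 5 compact folds
folds = kmeans(d[, c("x", "y")], centers = 5)$cluster

# Fit on four folds and compute the RMSE on the held-out fold
rmse_lm = vapply(sort(unique(folds)), function(i) {
  fit = lm(sc ~ elev, data = d[folds != i, ])
  pred = predict(fit, newdata = d[folds == i, ])
  sqrt(mean((d$sc[folds == i] - pred)^2))
}, numeric(1))

mean(rmse_lm)                               # mean RMSE across folds
boxplot(list(lm = rmse_lm), ylab = "RMSE")  # one box per modeling approach
```

A second vector of fold-wise RMSEs, for example from a random forest evaluated on the same folds, could be added to the list passed to boxplot() to compare the two approaches side by side.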

References

Dillon, M. O., M. Nakazawa, and S. G. Leiva. 2003. “The Lomas Formations of Coastal Peru: Composition and Biogeographic History.” In El Niño in Peru: Biology and Culture over 10,000 Years, edited by J. Haas and M. O. Dillon, 1–9. Chicago: Field Museum of Natural History.

Muenchow, Jannes, Achim Bräuning, Eric Frank Rodríguez, and Henrik von Wehrden. 2013. “Predictive Mapping of Species Richness and Plant Species’ Distributions of a Peruvian Fog Oasis Along an Altitudinal Gradient.” Biotropica 45 (5): 557–66. https://doi.org/10.1111/btp.12049.

Muenchow, Jannes, Simon Hauenstein, Achim Bräuning, Rupert Bäumler, Eric Frank Rodríguez, and Henrik von Wehrden. 2013. “Soil Texture and Altitude, Respectively, Largely Determine the Floristic Gradient of the Most Diverse Fog Oasis in the Peruvian Desert.” Journal of Tropical Ecology 29 (5): 427–38. https://doi.org/10.1017/S0266467413000436.

Breiman, Leo. 2001. “Random Forests.” Machine Learning 45 (1): 5–32. https://doi.org/10.1023/A:1010933404324.

Wickham, Hadley. 2014b. “Tidy Data.” Journal of Statistical Software 59 (10). https://doi.org/10.18637/jss.v059.i10.

Borcard, Daniel, François Gillet, and Pierre Legendre. 2011. Numerical Ecology with R. Use R! New York: Springer.

von Wehrden, Henrik, Jan Hanspach, Helge Bruelheide, and Karsten Wesche. 2009. “Pluralism and Diversity: Trends in the Use and Application of Ordination Methods 1990-2007.” Journal of Vegetation Science 20 (4): 695–705. https://doi.org/10.1111/j.1654-1103.2009.01063.x.

McCune, Bruce, James B. Grace, and Dean L. Urban. 2002. Analysis of Ecological Communities. 2nd ed. Gleneden Beach, OR: MjM Software Design.

Muenchow, Jannes, Patrick Schratz, and Alexander Brenning. 2017. “RQGIS: Integrating R with QGIS for Statistical Geocomputing.” The R Journal 9 (2): 409–28.

Hengl, Tomislav, Madlene Nussbaum, Marvin N. Wright, Gerard B.M. Heuvelink, and Benedikt Gräler. 2018. “Random Forest as a Generic Framework for Predictive Modeling of Spatial and Spatio-Temporal Variables.” PeerJ 6 (August): e5518. https://doi.org/10.7717/peerj.5518.

Schratz, Patrick, Jannes Muenchow, Eugenia Iturritxa, Jakob Richter, and Alexander Brenning. 2018. “Performance Evaluation and Hyperparameter Tuning of Statistical and Machine-Learning Models Using Spatial Data.” arXiv:1803.11266 [Cs, Stat], March. http://arxiv.org/abs/1803.11266.

James, Gareth, Daniela Witten, Trevor Hastie, and Robert Tibshirani, eds. 2013. An Introduction to Statistical Learning: With Applications in R. Springer Texts in Statistics 103. New York: Springer.

Probst, Philipp, Marvin Wright, and Anne-Laure Boulesteix. 2018. “Hyperparameters and Tuning Strategies for Random Forest.” arXiv:1804.03515 [Cs, Stat], April. http://arxiv.org/abs/1804.03515.

Zuur, Alain, Elena N. Ieno, Neil Walker, Anatoly A. Saveliev, and Graham M. Smith. 2009. Mixed Effects Models and Extensions in Ecology with R. Statistics for Biology and Health. New York: Springer-Verlag.

Zuur, Alain F., Elena N. Ieno, and Anatoly A. Saveliev. 2017. Beginner’s Guide to Spatial, Temporal and Spatial-Temporal Ecological Data Analysis with R-INLA. Vol. 1. Newburgh, United Kingdom: Highland Statistics Ltd.


  • Similar vegetation formations also develop in other parts of the world, e.g., in Namibia and along the coasts of Yemen and Oman (Galletti, Turner, and Myint 2016).

  • In statistics, this is also called a contingency table or cross-table.

  • Admittedly, it is a bit unsatisfying that the only way of knowing that sagawetnessindex computes the desired terrain attributes is to be familiar with SAGA.

  • One way of choosing k is to try k values between 1 and 6 and then use the result that yields the best stress value (McCune, Grace, and Urban 2002).
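
The procedure in the last note can be sketched as follows. This is a minimal example under stated assumptions: it uses the dune dataset shipped with the vegan package as a stand-in for the chapter's community matrix, and the seed and the number of random starts (try = 20) are arbitrary choices for illustration.

```r
library(vegan)

data(dune)                  # example community matrix shipped with vegan
pa = decostand(dune, "pa")  # presence-absence transformation

set.seed(25)
# Run one NMDS per candidate dimensionality and keep the stress value
stress = vapply(1:6, function(k) {
  metaMDS(comm = pa, distance = "bray", k = k, try = 20, trace = FALSE)$stress
}, numeric(1))

data.frame(k = 1:6, stress = round(stress, 3))
```

Stress generally decreases as k increases, so it is usually more informative to look for the k beyond which the improvement levels off than to pick the lowest value outright.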