7.7 Data output (O)
Writing geographic data allows you to convert from one format to another and to save newly created objects. Depending on the data type (vector or raster), object class (e.g., multipoint or RasterLayer), and type and amount of stored information (e.g., object size, range of values), it is important to know how to store spatial files in the most efficient way. The next two sections will demonstrate how to do this.
7.7.1 Vector data
The counterpart of st_read() is st_write(). It allows you to write sf objects to a wide range of geographic vector file formats, including the most common ones such as .geojson, .shp and .gpkg. Based on the file name, st_write() decides automatically which driver to use. The speed of the writing process also depends on the driver.
st_write(obj = world, dsn = "world.gpkg")
#> Writing layer `world' to data source `world.gpkg' using driver `GPKG'
#> features: 177
#> fields: 10
#> geometry type: Multi Polygon
Note: if you try to write to the same data source again, the function will fail:
st_write(obj = world, dsn = "world.gpkg")
#> Updating layer `world' to data source `world.gpkg' using driver `GPKG'
#> Creating layer world failed.
#> Error in CPL_write_ogr(obj, dsn, layer, driver, ...), :
#> Layer creation failed.
#> In addition: Warning message:
#> In CPL_write_ogr(obj, dsn, layer, driver, ...), :
#> GDAL Error 1: Layer world already exists, CreateLayer failed.
#> Use the layer creation option OVERWRITE=YES to replace it.
The error message provides some information as to why the function failed. The GDAL Error 1 statement makes clear that the failure occurred at the GDAL level. Additionally, the suggestion to use OVERWRITE=YES provides a clue about how to fix the problem. However, this is not an st_write() argument; it is a GDAL option. Luckily, st_write() provides a layer_options argument through which we can pass driver-dependent options:
st_write(obj = world, dsn = "world.gpkg", layer_options = "OVERWRITE=YES")
Another solution is to use the st_write() argument delete_layer. Setting it to TRUE deletes already existing layers in the data source before the function attempts to write (note there is also a delete_dsn argument):
st_write(obj = world, dsn = "world.gpkg", delete_layer = TRUE)
You can achieve the same with write_sf(), since it is equivalent to (technically an alias for) st_write(), except that its defaults for delete_layer and quiet are TRUE.
write_sf(obj = world, dsn = "world.gpkg")
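The overwrite behavior described above can be checked on a throwaway file. The following is a minimal sketch, assuming the sf package is installed; the three-point object pts and the file name pts_demo.gpkg are hypothetical stand-ins for world and world.gpkg.

```r
library(sf)

# Hypothetical three-point layer standing in for `world`
pts <- st_sf(
  id = 1:3,
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), st_point(c(2, 2)),
                    crs = 4326)
)

# Use a temporary GeoPackage so nothing in the working directory is touched
dsn <- file.path(tempdir(), "pts_demo.gpkg")
unlink(dsn)  # start from a clean state if the sketch is re-run

st_write(pts, dsn, quiet = TRUE)                       # first write creates the layer
st_write(pts, dsn, delete_layer = TRUE, quiet = TRUE)  # second write replaces the layer
write_sf(pts, dsn)                                     # replaces it again, silently

# Read the file back to confirm the layer was replaced rather than appended to
n_back <- nrow(st_read(dsn, quiet = TRUE))
```

Reading the file back is a cheap way to verify that repeated writes did not duplicate features.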
The layer_options argument can also be used for many other purposes. One of them is to write spatial data to a text file. This can be done by specifying GEOMETRY inside layer_options. Its value can be either AS_XY for simple point datasets (it creates two new columns for the coordinates) or AS_WKT for more complex spatial data (it creates one new column containing the well-known text representation of the spatial objects).
st_write(cycle_hire_xy, "cycle_hire_xy.csv", layer_options = "GEOMETRY=AS_XY")
st_write(world_wkt, "world_wkt.csv", layer_options = "GEOMETRY=AS_WKT")
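To see what these two options produce, the text files can be read back as ordinary tables. The sketch below assumes the sf package is installed and uses a hypothetical two-point object pts in place of cycle_hire_xy and world_wkt; the column names it inspects are assigned by the GDAL CSV driver, not by sf.

```r
library(sf)

# Hypothetical two-point dataset used only for illustration
pts <- st_sf(
  id = 1:2,
  geometry = st_sfc(st_point(c(10, 50)), st_point(c(11, 51)), crs = 4326)
)

# tempfile() returns unused paths, so the CSV driver creates new files
xy_file  <- tempfile(fileext = ".csv")
wkt_file <- tempfile(fileext = ".csv")

st_write(pts, xy_file,  layer_options = "GEOMETRY=AS_XY",  quiet = TRUE)
st_write(pts, wkt_file, layer_options = "GEOMETRY=AS_WKT", quiet = TRUE)

# Inspect the column names: coordinate columns versus a single WKT column
xy_cols  <- names(read.csv(xy_file))
wkt_cols <- names(read.csv(wkt_file))
```

Comparing xy_cols with wkt_cols shows how the geometry is stored in each case, alongside the id attribute.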
7.7.2 Raster data
The writeRaster() function saves Raster* objects to files on disk. The function expects input regarding output data type and file format, but also accepts GDAL options specific to a selected file format (see ?writeRaster for more details).
The raster package offers nine data types when saving a raster: LOG1S, INT1S, INT1U, INT2S, INT2U, INT4S, INT4U, FLT4S, and FLT8S. The data type determines the bit representation of the raster object written to disk (Table 7.4). Which data type to use depends on the range of the values of your raster object. The more values a data type can represent, the larger the file will get on disk. Commonly, one would use LOG1S for bitmap (binary) rasters. Unsigned integers (INT1U, INT2U, INT4U) are suitable for categorical data, while float numbers (FLT4S and FLT8S) usually represent continuous data. writeRaster() uses FLT4S as the default. While this works in most cases, the size of the output file will be unnecessarily large if you save binary or categorical data. Therefore, we recommend using the data type that needs the least storage space but is still able to represent all values (check the range of values with the summary() function).
Data type | Minimum value | Maximum value |
---|---|---|
LOG1S | FALSE (0) | TRUE (1) |
INT1S | -127 | 127 |
INT1U | 0 | 255 |
INT2S | -32,767 | 32,767 |
INT2U | 0 | 65,534 |
INT4S | -2,147,483,647 | 2,147,483,647 |
INT4U | 0 | 4,294,967,296 |
FLT4S | -3.4e+38 | 3.4e+38 |
FLT8S | -1.7e+308 | 1.7e+308 |
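The ranges in the table above can be turned into a small lookup. The following base R sketch is only an illustration: suggest_datatype() is a hypothetical helper, not part of the raster package, and it simply returns the first type, in order of increasing size, whose range covers the supplied values. For non-integer values it falls back to FLT4S without checking whether single precision is sufficient.

```r
# Hypothetical helper: suggest a raster data type from a vector of cell values,
# using the minimum and maximum values listed in the table above
suggest_datatype <- function(x) {
  x <- x[!is.na(x)]
  rng <- range(x)

  # Non-integer values need a floating point type
  if (any(x != round(x))) {
    return(if (max(abs(rng)) <= 3.4e38) "FLT4S" else "FLT8S")
  }

  # Binary rasters fit into the logical type
  if (all(x %in% c(0, 1))) {
    return("LOG1S")
  }

  # Integer types ordered from smallest to largest storage size
  int_types <- data.frame(
    type = c("INT1U", "INT1S", "INT2U", "INT2S", "INT4U", "INT4S"),
    min  = c(0, -127, 0, -32767, 0, -2147483647),
    max  = c(255, 127, 65534, 32767, 4294967296, 2147483647),
    stringsAsFactors = FALSE
  )
  fits <- int_types$min <= rng[1] & int_types$max >= rng[2]

  # Fall back to double precision if no integer type is wide enough
  if (any(fits)) int_types$type[which(fits)[1]] else "FLT8S"
}

suggest_datatype(c(1, 5, 200))   # land cover classes, for example
suggest_datatype(c(0.5, 2.3))    # continuous measurements
```

For a Raster* object, the cell values or the minimum and maximum reported by summary() could be passed to such a helper before calling writeRaster().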
The file extension determines the output file format when saving a Raster* object to disk. For example, the .tif extension will create a GeoTIFF file:
writeRaster(x = single_layer,
            filename = "my_raster.tif",
            datatype = "INT2U")
The raster file format (native to the raster package) is used when a file extension is invalid or missing. Some raster file formats come with additional options. You can use them with the options parameter. GeoTIFF files, for example, can be compressed using COMPRESS:
writeRaster(x = single_layer,
            filename = "my_raster.tif",
            datatype = "INT2U",
            options = c("COMPRESS=DEFLATE"),
            overwrite = TRUE)
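The effect of the COMPRESS option can be checked by comparing file sizes. The sketch below assumes the raster package is installed together with a working GDAL backend; the raster r and both file paths are hypothetical. COMPRESS=NONE is set explicitly for the baseline because the default compression may differ between GDAL backends.

```r
library(raster)

# Hypothetical 200 x 200 categorical raster with four repeating classes
r <- raster(nrows = 200, ncols = 200)
values(r) <- rep(1:4, length.out = ncell(r))

plain_file   <- tempfile(fileext = ".tif")
deflate_file <- tempfile(fileext = ".tif")

# Same data and data type, written without and with DEFLATE compression
writeRaster(r, filename = plain_file, datatype = "INT1U",
            options = "COMPRESS=NONE", overwrite = TRUE)
writeRaster(r, filename = deflate_file, datatype = "INT1U",
            options = "COMPRESS=DEFLATE", overwrite = TRUE)

# File sizes in bytes
sizes <- file.size(c(plain_file, deflate_file))
names(sizes) <- c("none", "deflate")
sizes
```

How much space compression saves depends on the data: repetitive categorical rasters such as this one tend to compress well, while noisy continuous rasters may compress far less.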
Note that writeFormats() returns a list with all supported file formats on your computer.