8.5 Data Input

Specifically, loading data into RAM, i.e. into the R session's global environment (.GlobalEnv). The two main functions are:

sf::read_sf()

terra::rast()

For my GDAL installation:

sf_drivers <- st_drivers()
sf_drivers
##                          name
## ESRIC                   ESRIC
## FITS                     FITS
## PCIDSK                 PCIDSK
## netCDF                 netCDF
## PDS4                     PDS4
## VICAR                   VICAR
## JP2OpenJPEG       JP2OpenJPEG
## PDF                       PDF
## MBTiles               MBTiles
## BAG                       BAG
## EEDA                     EEDA
## OGCAPI                 OGCAPI
## ESRI Shapefile ESRI Shapefile
## MapInfo File     MapInfo File
## UK .NTF               UK .NTF
## LVBAG                   LVBAG
## OGR_SDTS             OGR_SDTS
## S57                       S57
## DGN                       DGN
## OGR_VRT               OGR_VRT
## REC                       REC
## Memory                 Memory
## CSV                       CSV
## NAS                       NAS
## GML                       GML
## GPX                       GPX
## LIBKML                 LIBKML
## KML                       KML
## GeoJSON               GeoJSON
## GeoJSONSeq         GeoJSONSeq
## ESRIJSON             ESRIJSON
## TopoJSON             TopoJSON
## Interlis 1         Interlis 1
## Interlis 2         Interlis 2
## OGR_GMT               OGR_GMT
## GPKG                     GPKG
## SQLite                 SQLite
## ODBC                     ODBC
## WAsP                     WAsP
## PGeo                     PGeo
## MSSQLSpatial     MSSQLSpatial
## OGR_OGDI             OGR_OGDI
## PostgreSQL         PostgreSQL
## MySQL                   MySQL
## OpenFileGDB       OpenFileGDB
## DXF                       DXF
## CAD                       CAD
## FlatGeobuf         FlatGeobuf
## Geoconcept         Geoconcept
## GeoRSS                 GeoRSS
## GPSTrackMaker   GPSTrackMaker
## VFK                       VFK
## PGDUMP                 PGDUMP
## OSM                       OSM
## GPSBabel             GPSBabel
## OGR_PDS               OGR_PDS
## WFS                       WFS
## OAPIF                   OAPIF
## SOSI                     SOSI
## Geomedia             Geomedia
## EDIGEO                 EDIGEO
## SVG                       SVG
## CouchDB               CouchDB
## Cloudant             Cloudant
## Idrisi                 Idrisi
## ARCGEN                 ARCGEN
## XLS                       XLS
## ODS                       ODS
## XLSX                     XLSX
## Elasticsearch   Elasticsearch
## Walk                     Walk
## Carto                   Carto
## AmigoCloud         AmigoCloud
## SXF                       SXF
## Selafin               Selafin
## JML                       JML
## PLSCENES             PLSCENES
## CSW                       CSW
## VDV                       VDV
## GMLAS                   GMLAS
## MVT                       MVT
## NGW                       NGW
## MapML                   MapML
## TIGER                   TIGER
## AVCBin                 AVCBin
## AVCE00                 AVCE00
## HTTP                     HTTP
##                                                                    long_name
## ESRIC                                                     Esri Compact Cache
## FITS                                         Flexible Image Transport System
## PCIDSK                                                  PCIDSK Database File
## netCDF                                            Network Common Data Format
## PDS4                                            NASA Planetary Data System 4
## VICAR                                                        MIPL VICAR file
## JP2OpenJPEG                       JPEG-2000 driver based on OpenJPEG library
## PDF                                                           Geospatial PDF
## MBTiles                                                              MBTiles
## BAG                                               Bathymetry Attributed Grid
## EEDA                                                   Earth Engine Data API
## OGCAPI                                                                OGCAPI
## ESRI Shapefile                                                ESRI Shapefile
## MapInfo File                                                    MapInfo File
## UK .NTF                                                              UK .NTF
## LVBAG                                            Kadaster LV BAG Extract 2.0
## OGR_SDTS                                                                SDTS
## S57                                                           IHO S-57 (ENC)
## DGN                                                         Microstation DGN
## OGR_VRT                                             VRT - Virtual Datasource
## REC                                                            EPIInfo .REC 
## Memory                                                                Memory
## CSV                                             Comma Separated Value (.csv)
## NAS                                                              NAS - ALKIS
## GML                                          Geography Markup Language (GML)
## GPX                                                                      GPX
## LIBKML                                      Keyhole Markup Language (LIBKML)
## KML                                            Keyhole Markup Language (KML)
## GeoJSON                                                              GeoJSON
## GeoJSONSeq                                                  GeoJSON Sequence
## ESRIJSON                                                            ESRIJSON
## TopoJSON                                                            TopoJSON
## Interlis 1                                                        Interlis 1
## Interlis 2                                                        Interlis 2
## OGR_GMT                                             GMT ASCII Vectors (.gmt)
## GPKG                                                              GeoPackage
## SQLite                                                   SQLite / Spatialite
## ODBC                                                                        
## WAsP                                                        WAsP .map format
## PGeo                                               ESRI Personal GeoDatabase
## MSSQLSpatial                           Microsoft SQL Server Spatial Database
## OGR_OGDI                                       OGDI Vectors (VPF, VMAP, DCW)
## PostgreSQL                                                PostgreSQL/PostGIS
## MySQL                                                                  MySQL
## OpenFileGDB                                                     ESRI FileGDB
## DXF                                                              AutoCAD DXF
## CAD                                                           AutoCAD Driver
## FlatGeobuf                                                        FlatGeobuf
## Geoconcept                                                        Geoconcept
## GeoRSS                                                                GeoRSS
## GPSTrackMaker                                                  GPSTrackMaker
## VFK                                     Czech Cadastral Exchange Data Format
## PGDUMP                                                   PostgreSQL SQL dump
## OSM                                                OpenStreetMap XML and PBF
## GPSBabel                                                            GPSBabel
## OGR_PDS                                         Planetary Data Systems TABLE
## WFS                                            OGC WFS (Web Feature Service)
## OAPIF                                                     OGC API - Features
## SOSI                                                 Norwegian SOSI Standard
## Geomedia                                                       Geomedia .mdb
## EDIGEO                                         French EDIGEO exchange format
## SVG                                                 Scalable Vector Graphics
## CouchDB                                                   CouchDB / GeoCouch
## Cloudant                                                  Cloudant / CouchDB
## Idrisi                                                  Idrisi Vector (.vct)
## ARCGEN                                                     Arc/Info Generate
## XLS                                                          MS Excel format
## ODS                     Open Document/ LibreOffice / OpenOffice Spreadsheet 
## XLSX                                          MS Office Open XML spreadsheet
## Elasticsearch                                                 Elastic Search
## Walk                                                                        
## Carto                                                                  Carto
## AmigoCloud                                                        AmigoCloud
## SXF                                              Storage and eXchange Format
## Selafin                                                              Selafin
## JML                                                             OpenJUMP JML
## PLSCENES                                              Planet Labs Scenes API
## CSW                                   OGC CSW (Catalog  Service for the Web)
## VDV                                      VDV-451/VDV-452/INTREST Data Format
## GMLAS          Geography Markup Language (GML) driven by application schemas
## MVT                                                      Mapbox Vector Tiles
## NGW                                                              NextGIS Web
## MapML                                                                  MapML
## TIGER                                                 U.S. Census TIGER/Line
## AVCBin                                              Arc/Info Binary Coverage
## AVCE00                                         Arc/Info E00 (ASCII) Coverage
## HTTP                                                   HTTP Fetching Wrapper
##                write  copy is_raster is_vector   vsi
## ESRIC          FALSE FALSE      TRUE      TRUE  TRUE
## FITS            TRUE FALSE      TRUE      TRUE FALSE
## PCIDSK          TRUE FALSE      TRUE      TRUE  TRUE
## netCDF          TRUE  TRUE      TRUE      TRUE  TRUE
## PDS4            TRUE  TRUE      TRUE      TRUE  TRUE
## VICAR           TRUE  TRUE      TRUE      TRUE  TRUE
## JP2OpenJPEG    FALSE  TRUE      TRUE      TRUE  TRUE
## PDF             TRUE  TRUE      TRUE      TRUE  TRUE
## MBTiles         TRUE  TRUE      TRUE      TRUE  TRUE
## BAG             TRUE  TRUE      TRUE      TRUE  TRUE
## EEDA           FALSE FALSE     FALSE      TRUE FALSE
## OGCAPI         FALSE FALSE      TRUE      TRUE  TRUE
## ESRI Shapefile  TRUE FALSE     FALSE      TRUE  TRUE
## MapInfo File    TRUE FALSE     FALSE      TRUE  TRUE
## UK .NTF        FALSE FALSE     FALSE      TRUE  TRUE
## LVBAG          FALSE FALSE     FALSE      TRUE  TRUE
## OGR_SDTS       FALSE FALSE     FALSE      TRUE  TRUE
## S57             TRUE FALSE     FALSE      TRUE  TRUE
## DGN             TRUE FALSE     FALSE      TRUE  TRUE
## OGR_VRT        FALSE FALSE     FALSE      TRUE  TRUE
## REC            FALSE FALSE     FALSE      TRUE FALSE
## Memory          TRUE FALSE     FALSE      TRUE FALSE
## CSV             TRUE FALSE     FALSE      TRUE  TRUE
## NAS            FALSE FALSE     FALSE      TRUE  TRUE
## GML             TRUE FALSE     FALSE      TRUE  TRUE
## GPX             TRUE FALSE     FALSE      TRUE  TRUE
## LIBKML          TRUE FALSE     FALSE      TRUE  TRUE
## KML             TRUE FALSE     FALSE      TRUE  TRUE
## GeoJSON         TRUE FALSE     FALSE      TRUE  TRUE
## GeoJSONSeq      TRUE FALSE     FALSE      TRUE  TRUE
## ESRIJSON       FALSE FALSE     FALSE      TRUE  TRUE
## TopoJSON       FALSE FALSE     FALSE      TRUE  TRUE
## Interlis 1      TRUE FALSE     FALSE      TRUE  TRUE
## Interlis 2      TRUE FALSE     FALSE      TRUE  TRUE
## OGR_GMT         TRUE FALSE     FALSE      TRUE  TRUE
## GPKG            TRUE  TRUE      TRUE      TRUE  TRUE
## SQLite          TRUE FALSE     FALSE      TRUE  TRUE
## ODBC           FALSE FALSE     FALSE      TRUE FALSE
## WAsP            TRUE FALSE     FALSE      TRUE  TRUE
## PGeo           FALSE FALSE     FALSE      TRUE FALSE
## MSSQLSpatial    TRUE FALSE     FALSE      TRUE FALSE
## OGR_OGDI       FALSE FALSE     FALSE      TRUE FALSE
## PostgreSQL      TRUE FALSE     FALSE      TRUE FALSE
## MySQL           TRUE FALSE     FALSE      TRUE FALSE
## OpenFileGDB    FALSE FALSE     FALSE      TRUE  TRUE
## DXF             TRUE FALSE     FALSE      TRUE  TRUE
## CAD            FALSE FALSE      TRUE      TRUE  TRUE
## FlatGeobuf      TRUE FALSE     FALSE      TRUE  TRUE
## Geoconcept      TRUE FALSE     FALSE      TRUE  TRUE
## GeoRSS          TRUE FALSE     FALSE      TRUE  TRUE
## GPSTrackMaker   TRUE FALSE     FALSE      TRUE  TRUE
## VFK            FALSE FALSE     FALSE      TRUE FALSE
## PGDUMP          TRUE FALSE     FALSE      TRUE  TRUE
## OSM            FALSE FALSE     FALSE      TRUE  TRUE
## GPSBabel        TRUE FALSE     FALSE      TRUE FALSE
## OGR_PDS        FALSE FALSE     FALSE      TRUE  TRUE
## WFS            FALSE FALSE     FALSE      TRUE  TRUE
## OAPIF          FALSE FALSE     FALSE      TRUE FALSE
## SOSI           FALSE FALSE     FALSE      TRUE FALSE
## Geomedia       FALSE FALSE     FALSE      TRUE FALSE
## EDIGEO         FALSE FALSE     FALSE      TRUE  TRUE
## SVG            FALSE FALSE     FALSE      TRUE  TRUE
## CouchDB         TRUE FALSE     FALSE      TRUE FALSE
## Cloudant        TRUE FALSE     FALSE      TRUE FALSE
## Idrisi         FALSE FALSE     FALSE      TRUE  TRUE
## ARCGEN         FALSE FALSE     FALSE      TRUE  TRUE
## XLS            FALSE FALSE     FALSE      TRUE FALSE
## ODS             TRUE FALSE     FALSE      TRUE  TRUE
## XLSX            TRUE FALSE     FALSE      TRUE  TRUE
## Elasticsearch   TRUE FALSE     FALSE      TRUE FALSE
## Walk           FALSE FALSE     FALSE      TRUE FALSE
## Carto           TRUE FALSE     FALSE      TRUE FALSE
## AmigoCloud      TRUE FALSE     FALSE      TRUE FALSE
## SXF            FALSE FALSE     FALSE      TRUE  TRUE
## Selafin         TRUE FALSE     FALSE      TRUE  TRUE
## JML             TRUE FALSE     FALSE      TRUE  TRUE
## PLSCENES       FALSE FALSE      TRUE      TRUE FALSE
## CSW            FALSE FALSE     FALSE      TRUE FALSE
## VDV             TRUE FALSE     FALSE      TRUE  TRUE
## GMLAS          FALSE  TRUE     FALSE      TRUE  TRUE
## MVT             TRUE FALSE     FALSE      TRUE  TRUE
## NGW             TRUE  TRUE      TRUE      TRUE FALSE
## MapML           TRUE FALSE     FALSE      TRUE  TRUE
## TIGER           TRUE FALSE     FALSE      TRUE  TRUE
## AVCBin         FALSE FALSE     FALSE      TRUE  TRUE
## AVCE00         FALSE FALSE     FALSE      TRUE  TRUE
## HTTP           FALSE FALSE      TRUE      TRUE FALSE
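The driver table is long, so it can help to filter it. The following is a minimal sketch, assuming the sf package is installed and relying only on the columns shown in the output above (name, long_name, write, vsi); the result depends on the local GDAL installation.

```r
library(sf)

# List the vector drivers known to this GDAL installation
sf_drivers <- st_drivers(what = "vector")

# Keep drivers that can write data and that support GDAL virtual file systems
writable_vsi <- subset(sf_drivers, write & vsi, select = c(name, long_name))
head(writable_vsi)

# Check whether one specific driver, e.g. GeoPackage, is available
"GPKG" %in% sf_drivers$name
```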

8.5.1 Vector Data

read_sf() guesses the driver from the file name extension.

f <- system.file("shapes/world.gpkg", package = "spData")
world <- read_sf(f, quiet = TRUE)

For some drivers, dsn can instead be a folder name, access credentials for a database, or a GeoJSON string representation.
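As an illustration of the GeoJSON case, here is a small sketch in which dsn is a character string rather than a file path. The coordinates are arbitrary illustration values, not data from the chapter.

```r
library(sf)

# A GeoJSON document held in a character string rather than in a file
geojson_txt <- '{"type":"MultiPoint","coordinates":[[3.2,4],[3,4.6],[3.8,4.4]]}'

# read_sf() hands the string to GDAL, which recognises it as GeoJSON
pts <- read_sf(geojson_txt)
st_geometry_type(pts)
```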

Some vector driver formats can store multiple data layers. By default, read_sf() automatically reads the first layer of the file specified in dsn; however, using the layer argument you can specify any other layer.
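A short sketch of how the layer argument might be used, assuming the spData package is installed. The layer name "world" is taken from the SQL query used elsewhere in these notes; st_layers() lists what is actually available.

```r
library(sf)

f <- system.file("shapes/world.gpkg", package = "spData")

# Inspect which layers the file contains before reading
st_layers(f)

# Request a layer explicitly by name instead of relying on the default
world <- read_sf(f, layer = "world")
```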

read_sf() SQL features

tanzania <- read_sf(f, query = 'SELECT * FROM world WHERE name_long = "Tanzania"')

tanzania |>
  ggplot(aes()) +
  geom_sf()

If you do not know the names of the available columns, a good approach is to read just one row of the data with 'SELECT * FROM world WHERE FID = 1'.
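The one-row trick could look like the sketch below, again assuming spData is installed. The object name first_row is only illustrative.

```r
library(sf)

f <- system.file("shapes/world.gpkg", package = "spData")

# Read a single feature to inspect the attribute table cheaply
first_row <- read_sf(f, query = "SELECT * FROM world WHERE FID = 1")

# Column names that can then be used in later queries
names(first_row)
```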

Well Known Text

Another approach is to filter by geometry using well-known text (WKT) and the wkt_filter argument. We need to prepare our “filter” by (a) creating the buffer, (b) converting the sf buffer object into an sfc geometry object with st_geometry(), and (c) translating geometries into their well-known text representation with st_as_text().

The result contains Tanzania and every country within its 0.2 arc degree buffer.

tanzania_buf <- st_buffer(tanzania, 0.2)
## Warning in st_buffer.sfc(st_geometry(x), dist, nQuadSegs, endCapStyle =
## endCapStyle, : st_buffer does not correctly buffer longitude/latitude data
## dist is assumed to be in decimal degrees (arc_degrees).
tanzania_buf_geom <- st_geometry(tanzania_buf)
tanzania_buf_wkt <- st_as_text(tanzania_buf_geom)

tanzania_neigh <- read_sf(f, wkt_filter = tanzania_buf_wkt)

tanzania_neigh |>
  ggplot(aes()) +
  geom_sf() 

# code knits to the correct map of East Africa
# something fishy is happening with blogdown in building the book using a 50 km buffer. 
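To see what a well-known text string looks like, the sketch below converts a hand-built polygon. The square and its coordinates are made up for illustration; the same st_as_text() call is what produces tanzania_buf_wkt above.

```r
library(sf)

# A small polygon built by hand (coordinates are arbitrary illustration values)
square <- st_polygon(list(rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))))
square_sfc <- st_sfc(square, crs = "EPSG:4326")

# st_as_text() returns one WKT string per geometry, here starting with "POLYGON"
square_wkt <- st_as_text(square_sfc)
square_wkt
```

A string of this form is what wkt_filter expects.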

read_sf() also reads KML files. A KML file stores geographic information in XML, a data format for the creation of web pages and the transfer of data in an application-independent way. The sample file below contains more than one layer, so st_layers() is used first to list them.

u <- "https://developers.google.com/kml/documentation/KML_Samples.kml"
download.file(u, "KML_Samples.kml")
st_layers("KML_Samples.kml")
## Driver: LIBKML 
## Available layers:
##               layer_name geometry_type features fields crs_name
## 1             Placemarks                      3     11   WGS 84
## 2      Styles and Markup                      1     11   WGS 84
## 3       Highlighted Icon                      1     11   WGS 84
## 4        Ground Overlays                      1     11   WGS 84
## 5        Screen Overlays                      0     11   WGS 84
## 6                  Paths                      6     11   WGS 84
## 7               Polygons                      0     11   WGS 84
## 8          Google Campus                      4     11   WGS 84
## 9       Extruded Polygon                      1     11   WGS 84
## 10 Absolute and Relative                      4     11   WGS 84
kml <- read_sf("KML_Samples.kml", layer = "Placemarks")

kml |>
  ggplot(aes()) +
  geom_sf() +
  coord_sf()


8.5.2 Raster Data

Raster data comes in many file formats with some of them supporting multilayer files.

raster_filepath <- system.file("raster/srtm.tif", package = "spDataLarge")
single_layer <- rast(raster_filepath)

ggplot() +
  tidyterra::geom_spatraster(data = single_layer) 

rast() also works when you want to read a multilayer file.

multilayer_filepath <- system.file("raster/landsat.tif", package = "spDataLarge")
multilayer_rast <- rast(multilayer_filepath)

multilayer_rast
## class       : SpatRaster 
## dimensions  : 1428, 1128, 4  (nrow, ncol, nlyr)
## resolution  : 30, 30  (x, y)
## extent      : 301905, 335745, 4111245, 4154085  (xmin, xmax, ymin, ymax)
## coord. ref. : WGS 84 / UTM zone 12N (EPSG:32612) 
## source      : landsat.tif 
## names       : landsat_1, landsat_2, landsat_3, landsat_4 
## min values  :      7550,      6404,      5678,      5252 
## max values  :     19071,     22051,     25780,     31961
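Once a multilayer file is loaded, individual layers can be picked out. A minimal sketch, assuming terra and spDataLarge are installed; the layer names come from the printed output above, while band1 and bands34 are illustrative object names.

```r
library(terra)

multilayer_filepath <- system.file("raster/landsat.tif", package = "spDataLarge")
multilayer_rast <- rast(multilayer_filepath)

# Number of layers and their names
nlyr(multilayer_rast)
names(multilayer_rast)

# Extract one layer by name, or several by position
band1 <- multilayer_rast[["landsat_1"]]
bands34 <- subset(multilayer_rast, 3:4)
```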

All of the previous examples read spatial information from files stored on your hard drive. However, GDAL also allows reading data directly from online resources, such as HTTP/HTTPS/FTP web resources.

To do so, add a /vsicurl/ prefix before the path to the file.

For example, the global monthly snow probability at 500 m resolution for the period 2000-2012:

myurl <- "/vsicurl/https://zenodo.org/record/5774954/files/clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0.tif"
snow <- rast(myurl)
snow
## class       : SpatRaster 
## dimensions  : 35849, 86400, 1  (nrow, ncol, nlyr)
## resolution  : 0.004166667, 0.004166667  (x, y)
## extent      : -180, 180, -62.00083, 87.37  (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (EPSG:4326) 
## source      : clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0.tif 
## name        : clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0

Because the input data is a Cloud Optimized GeoTIFF (COG), we are not actually reading this file into RAM, but rather creating a connection to it without obtaining any values.
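Whether a SpatRaster holds its values in memory can be checked directly. The sketch below uses a small raster bundled with terra as a stand-in for the remote file (an assumption made so the example runs without a network connection); the same calls should apply to the snow object.

```r
library(terra)

# Small example raster shipped with terra (stands in for the remote file)
elev <- rast(system.file("ex/elev.tif", package = "terra"))

# FALSE means the object only points at its source; no cell values are held in RAM
inMemory(elev)

# The file (or URL) the object is connected to
sources(elev)

# Cell values are read only when an operation needs them
global(elev, "mean", na.rm = TRUE)
```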

We can get the snow probability for December in Reykjavik by specifying its coordinates and applying the extract() function. Only the values needed for this location are read from the remote file.

rey <- data.frame(lon = -21.94, lat = 64.15)
snow_rey <- terra::extract(snow, rey)
snow_rey
##   ID clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0
## 1  1                                                         70

The /vsicurl/ prefix works not only for raster but also for vector file formats. It allows reading vectors directly from online storage with read_sf() just by adding the prefix before the vector file URL.

/vsicurl/ is not the only prefix provided by GDAL; many more exist, such as /vsizip/ to read spatial files from ZIP archives without decompressing them beforehand, or /vsis3/ to read files in AWS S3 buckets on the fly. Learn more at https://gdal.org/user/virtual_file_systems.html
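The prefixes are plain strings placed in front of the path, so they can be assembled programmatically. The helper below is hypothetical (it is not part of sf, terra, or GDAL), and the URL and file names are placeholders rather than real data sources.

```r
# Hypothetical helper: builds a GDAL virtual file system path from a prefix,
# a target (URL or archive), and an optional path inside the target
vsi_path <- function(prefix, target, inner = NULL) {
  path <- paste0("/", prefix, "/", target)
  if (!is.null(inner)) path <- paste0(path, "/", inner)
  path
}

# Remote file over HTTP(S); example.com is a placeholder URL
remote_path <- vsi_path("vsicurl", "https://example.com/data/countries.gpkg")
remote_path

# File inside a ZIP archive; both names are placeholders
zip_path <- vsi_path("vsizip", "archive.zip", "countries.shp")
zip_path
```

With a real URL or archive, the resulting string would be passed to read_sf() for vector data or rast() for raster data.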
