8.5 Data Input
Specifically, loading data into RAM in the R session (.GlobalEnv):
sf::read_sf() for vector data
terra::rast() for raster data
For my GDAL installation:
sf_drivers <- st_drivers()
sf_drivers
## name
## ESRIC ESRIC
## FITS FITS
## PCIDSK PCIDSK
## netCDF netCDF
## PDS4 PDS4
## VICAR VICAR
## JP2OpenJPEG JP2OpenJPEG
## PDF PDF
## MBTiles MBTiles
## BAG BAG
## EEDA EEDA
## OGCAPI OGCAPI
## ESRI Shapefile ESRI Shapefile
## MapInfo File MapInfo File
## UK .NTF UK .NTF
## LVBAG LVBAG
## OGR_SDTS OGR_SDTS
## S57 S57
## DGN DGN
## OGR_VRT OGR_VRT
## REC REC
## Memory Memory
## CSV CSV
## NAS NAS
## GML GML
## GPX GPX
## LIBKML LIBKML
## KML KML
## GeoJSON GeoJSON
## GeoJSONSeq GeoJSONSeq
## ESRIJSON ESRIJSON
## TopoJSON TopoJSON
## Interlis 1 Interlis 1
## Interlis 2 Interlis 2
## OGR_GMT OGR_GMT
## GPKG GPKG
## SQLite SQLite
## ODBC ODBC
## WAsP WAsP
## PGeo PGeo
## MSSQLSpatial MSSQLSpatial
## OGR_OGDI OGR_OGDI
## PostgreSQL PostgreSQL
## MySQL MySQL
## OpenFileGDB OpenFileGDB
## DXF DXF
## CAD CAD
## FlatGeobuf FlatGeobuf
## Geoconcept Geoconcept
## GeoRSS GeoRSS
## GPSTrackMaker GPSTrackMaker
## VFK VFK
## PGDUMP PGDUMP
## OSM OSM
## GPSBabel GPSBabel
## OGR_PDS OGR_PDS
## WFS WFS
## OAPIF OAPIF
## SOSI SOSI
## Geomedia Geomedia
## EDIGEO EDIGEO
## SVG SVG
## CouchDB CouchDB
## Cloudant Cloudant
## Idrisi Idrisi
## ARCGEN ARCGEN
## XLS XLS
## ODS ODS
## XLSX XLSX
## Elasticsearch Elasticsearch
## Walk Walk
## Carto Carto
## AmigoCloud AmigoCloud
## SXF SXF
## Selafin Selafin
## JML JML
## PLSCENES PLSCENES
## CSW CSW
## VDV VDV
## GMLAS GMLAS
## MVT MVT
## NGW NGW
## MapML MapML
## TIGER TIGER
## AVCBin AVCBin
## AVCE00 AVCE00
## HTTP HTTP
## long_name
## ESRIC Esri Compact Cache
## FITS Flexible Image Transport System
## PCIDSK PCIDSK Database File
## netCDF Network Common Data Format
## PDS4 NASA Planetary Data System 4
## VICAR MIPL VICAR file
## JP2OpenJPEG JPEG-2000 driver based on OpenJPEG library
## PDF Geospatial PDF
## MBTiles MBTiles
## BAG Bathymetry Attributed Grid
## EEDA Earth Engine Data API
## OGCAPI OGCAPI
## ESRI Shapefile ESRI Shapefile
## MapInfo File MapInfo File
## UK .NTF UK .NTF
## LVBAG Kadaster LV BAG Extract 2.0
## OGR_SDTS SDTS
## S57 IHO S-57 (ENC)
## DGN Microstation DGN
## OGR_VRT VRT - Virtual Datasource
## REC EPIInfo .REC
## Memory Memory
## CSV Comma Separated Value (.csv)
## NAS NAS - ALKIS
## GML Geography Markup Language (GML)
## GPX GPX
## LIBKML Keyhole Markup Language (LIBKML)
## KML Keyhole Markup Language (KML)
## GeoJSON GeoJSON
## GeoJSONSeq GeoJSON Sequence
## ESRIJSON ESRIJSON
## TopoJSON TopoJSON
## Interlis 1 Interlis 1
## Interlis 2 Interlis 2
## OGR_GMT GMT ASCII Vectors (.gmt)
## GPKG GeoPackage
## SQLite SQLite / Spatialite
## ODBC
## WAsP WAsP .map format
## PGeo ESRI Personal GeoDatabase
## MSSQLSpatial Microsoft SQL Server Spatial Database
## OGR_OGDI OGDI Vectors (VPF, VMAP, DCW)
## PostgreSQL PostgreSQL/PostGIS
## MySQL MySQL
## OpenFileGDB ESRI FileGDB
## DXF AutoCAD DXF
## CAD AutoCAD Driver
## FlatGeobuf FlatGeobuf
## Geoconcept Geoconcept
## GeoRSS GeoRSS
## GPSTrackMaker GPSTrackMaker
## VFK Czech Cadastral Exchange Data Format
## PGDUMP PostgreSQL SQL dump
## OSM OpenStreetMap XML and PBF
## GPSBabel GPSBabel
## OGR_PDS Planetary Data Systems TABLE
## WFS OGC WFS (Web Feature Service)
## OAPIF OGC API - Features
## SOSI Norwegian SOSI Standard
## Geomedia Geomedia .mdb
## EDIGEO French EDIGEO exchange format
## SVG Scalable Vector Graphics
## CouchDB CouchDB / GeoCouch
## Cloudant Cloudant / CouchDB
## Idrisi Idrisi Vector (.vct)
## ARCGEN Arc/Info Generate
## XLS MS Excel format
## ODS Open Document/ LibreOffice / OpenOffice Spreadsheet
## XLSX MS Office Open XML spreadsheet
## Elasticsearch Elastic Search
## Walk
## Carto Carto
## AmigoCloud AmigoCloud
## SXF Storage and eXchange Format
## Selafin Selafin
## JML OpenJUMP JML
## PLSCENES Planet Labs Scenes API
## CSW OGC CSW (Catalog Service for the Web)
## VDV VDV-451/VDV-452/INTREST Data Format
## GMLAS Geography Markup Language (GML) driven by application schemas
## MVT Mapbox Vector Tiles
## NGW NextGIS Web
## MapML MapML
## TIGER U.S. Census TIGER/Line
## AVCBin Arc/Info Binary Coverage
## AVCE00 Arc/Info E00 (ASCII) Coverage
## HTTP HTTP Fetching Wrapper
## write copy is_raster is_vector vsi
## ESRIC FALSE FALSE TRUE TRUE TRUE
## FITS TRUE FALSE TRUE TRUE FALSE
## PCIDSK TRUE FALSE TRUE TRUE TRUE
## netCDF TRUE TRUE TRUE TRUE TRUE
## PDS4 TRUE TRUE TRUE TRUE TRUE
## VICAR TRUE TRUE TRUE TRUE TRUE
## JP2OpenJPEG FALSE TRUE TRUE TRUE TRUE
## PDF TRUE TRUE TRUE TRUE TRUE
## MBTiles TRUE TRUE TRUE TRUE TRUE
## BAG TRUE TRUE TRUE TRUE TRUE
## EEDA FALSE FALSE FALSE TRUE FALSE
## OGCAPI FALSE FALSE TRUE TRUE TRUE
## ESRI Shapefile TRUE FALSE FALSE TRUE TRUE
## MapInfo File TRUE FALSE FALSE TRUE TRUE
## UK .NTF FALSE FALSE FALSE TRUE TRUE
## LVBAG FALSE FALSE FALSE TRUE TRUE
## OGR_SDTS FALSE FALSE FALSE TRUE TRUE
## S57 TRUE FALSE FALSE TRUE TRUE
## DGN TRUE FALSE FALSE TRUE TRUE
## OGR_VRT FALSE FALSE FALSE TRUE TRUE
## REC FALSE FALSE FALSE TRUE FALSE
## Memory TRUE FALSE FALSE TRUE FALSE
## CSV TRUE FALSE FALSE TRUE TRUE
## NAS FALSE FALSE FALSE TRUE TRUE
## GML TRUE FALSE FALSE TRUE TRUE
## GPX TRUE FALSE FALSE TRUE TRUE
## LIBKML TRUE FALSE FALSE TRUE TRUE
## KML TRUE FALSE FALSE TRUE TRUE
## GeoJSON TRUE FALSE FALSE TRUE TRUE
## GeoJSONSeq TRUE FALSE FALSE TRUE TRUE
## ESRIJSON FALSE FALSE FALSE TRUE TRUE
## TopoJSON FALSE FALSE FALSE TRUE TRUE
## Interlis 1 TRUE FALSE FALSE TRUE TRUE
## Interlis 2 TRUE FALSE FALSE TRUE TRUE
## OGR_GMT TRUE FALSE FALSE TRUE TRUE
## GPKG TRUE TRUE TRUE TRUE TRUE
## SQLite TRUE FALSE FALSE TRUE TRUE
## ODBC FALSE FALSE FALSE TRUE FALSE
## WAsP TRUE FALSE FALSE TRUE TRUE
## PGeo FALSE FALSE FALSE TRUE FALSE
## MSSQLSpatial TRUE FALSE FALSE TRUE FALSE
## OGR_OGDI FALSE FALSE FALSE TRUE FALSE
## PostgreSQL TRUE FALSE FALSE TRUE FALSE
## MySQL TRUE FALSE FALSE TRUE FALSE
## OpenFileGDB FALSE FALSE FALSE TRUE TRUE
## DXF TRUE FALSE FALSE TRUE TRUE
## CAD FALSE FALSE TRUE TRUE TRUE
## FlatGeobuf TRUE FALSE FALSE TRUE TRUE
## Geoconcept TRUE FALSE FALSE TRUE TRUE
## GeoRSS TRUE FALSE FALSE TRUE TRUE
## GPSTrackMaker TRUE FALSE FALSE TRUE TRUE
## VFK FALSE FALSE FALSE TRUE FALSE
## PGDUMP TRUE FALSE FALSE TRUE TRUE
## OSM FALSE FALSE FALSE TRUE TRUE
## GPSBabel TRUE FALSE FALSE TRUE FALSE
## OGR_PDS FALSE FALSE FALSE TRUE TRUE
## WFS FALSE FALSE FALSE TRUE TRUE
## OAPIF FALSE FALSE FALSE TRUE FALSE
## SOSI FALSE FALSE FALSE TRUE FALSE
## Geomedia FALSE FALSE FALSE TRUE FALSE
## EDIGEO FALSE FALSE FALSE TRUE TRUE
## SVG FALSE FALSE FALSE TRUE TRUE
## CouchDB TRUE FALSE FALSE TRUE FALSE
## Cloudant TRUE FALSE FALSE TRUE FALSE
## Idrisi FALSE FALSE FALSE TRUE TRUE
## ARCGEN FALSE FALSE FALSE TRUE TRUE
## XLS FALSE FALSE FALSE TRUE FALSE
## ODS TRUE FALSE FALSE TRUE TRUE
## XLSX TRUE FALSE FALSE TRUE TRUE
## Elasticsearch TRUE FALSE FALSE TRUE FALSE
## Walk FALSE FALSE FALSE TRUE FALSE
## Carto TRUE FALSE FALSE TRUE FALSE
## AmigoCloud TRUE FALSE FALSE TRUE FALSE
## SXF FALSE FALSE FALSE TRUE TRUE
## Selafin TRUE FALSE FALSE TRUE TRUE
## JML TRUE FALSE FALSE TRUE TRUE
## PLSCENES FALSE FALSE TRUE TRUE FALSE
## CSW FALSE FALSE FALSE TRUE FALSE
## VDV TRUE FALSE FALSE TRUE TRUE
## GMLAS FALSE TRUE FALSE TRUE TRUE
## MVT TRUE FALSE FALSE TRUE TRUE
## NGW TRUE TRUE TRUE TRUE FALSE
## MapML TRUE FALSE FALSE TRUE TRUE
## TIGER TRUE FALSE FALSE TRUE TRUE
## AVCBin FALSE FALSE FALSE TRUE TRUE
## AVCE00 FALSE FALSE FALSE TRUE TRUE
## HTTP FALSE FALSE TRUE TRUE FALSE
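Because the full driver table is long, it can help to query it for the drivers of interest. The following is a minimal sketch, assuming sf is installed; the three driver names are picked only for illustration, and the result depends on the local GDAL build.

```r
library(sf)

# Restrict the table to vector drivers only
vector_drivers <- st_drivers(what = "vector")

# Illustrative selection of commonly used formats (an assumption, not from the text)
wanted <- c("GPKG", "ESRI Shapefile", "GeoJSON")

# Show whether each selected driver can write and supports virtual file systems
vector_drivers[vector_drivers$name %in% wanted, c("name", "write", "vsi")]
```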
8.5.1 Vector Data
read_sf() guesses the driver based on the file name extension.
f <- system.file("shapes/world.gpkg", package = "spData")
world <- read_sf(f, quiet = TRUE)
For some drivers, dsn could be provided as a folder name, access credentials for a database, or a GeoJSON string representation.
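As a small illustration of the GeoJSON case, the sketch below passes a GeoJSON string directly as dsn. The coordinates are made up for the example and carry no meaning.

```r
library(sf)

# A GeoJSON geometry held in a character string (illustrative coordinates)
geojson_txt <- paste(
  "{\"type\":\"MultiPoint\",\"coordinates\":",
  "[[3.2,4],[3,4.6],[3.8,4.4],[3.5,3.8],[3.4,3.6],[3.9,4.5]]}"
)

# read_sf() hands the string to GDAL's GeoJSON driver instead of opening a file
pts <- read_sf(geojson_txt)
pts
```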
Some vector driver formats can store multiple data layers. By default, read_sf() automatically reads the first layer of the file specified in dsn; however, using the layer argument you can specify any other layer.
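A minimal sketch of the layer argument, assuming the spData package is installed. The world.gpkg file has a single layer, so naming it explicitly gives the same result as the default; with a multilayer file the same call selects any other layer.

```r
library(sf)

f <- system.file("shapes/world.gpkg", package = "spData")

# List the layers stored in the file
layers <- st_layers(f)
layers$name

# Read a layer by name rather than relying on the default (first) layer
world <- read_sf(f, layer = "world", quiet = TRUE)
```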
read_sf() also supports SQL through its query argument, so that only the matching features are read:
tanzania <- read_sf(f, query = 'SELECT * FROM world WHERE name_long = "Tanzania"')
tanzania |>
  ggplot(aes()) +
  geom_sf()
If you do not know the names of the available columns, a good approach is to read just one row of the data with 'SELECT * FROM world WHERE FID = 1'.
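The one-row approach can be sketched as follows, assuming the spData package is installed. The object name first_row is only illustrative.

```r
library(sf)

f <- system.file("shapes/world.gpkg", package = "spData")

# Read a single feature; this is enough to inspect the column names
first_row <- read_sf(f, query = "SELECT * FROM world WHERE FID = 1")
names(first_row)
```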
Well Known Text
Another approach is the wkt_filter argument, which expects a well-known text geometry describing the area for which we want to read data. We need to prepare our “filter” by (a) creating the buffer, (b) converting the sf buffer object into an sfc geometry object with st_geometry(), and (c) translating geometries into their well-known text representation with st_as_text().
Our result contains Tanzania and every country that intersects its 0.2 arc degree buffer.
tanzania_buf <- st_buffer(tanzania, 0.2)
## Warning in st_buffer.sfc(st_geometry(x), dist, nQuadSegs, endCapStyle =
## endCapStyle, : st_buffer does not correctly buffer longitude/latitude data
## dist is assumed to be in decimal degrees (arc_degrees).
tanzania_buf_geom <- st_geometry(tanzania_buf)
tanzania_buf_wkt <- st_as_text(tanzania_buf_geom)
tanzania_neigh <- read_sf(f, wkt_filter = tanzania_buf_wkt)
tanzania_neigh |>
  ggplot(aes()) +
  geom_sf()
# code knits to the correct map of East Africa
# something fishy is happening with blogdown in building the book using a 50 km buffer.
read_sf() also reads KML files. A KML file stores geographic information in XML format, a data format for the creation of web pages and the transfer of data in an application-independent way. The sample file used here contains more than one layer:
u <- "https://developers.google.com/kml/documentation/KML_Samples.kml"
download.file(u, "KML_Samples.kml")
st_layers("KML_Samples.kml")
## Driver: LIBKML
## Available layers:
## layer_name geometry_type features fields crs_name
## 1 Placemarks 3 11 WGS 84
## 2 Styles and Markup 1 11 WGS 84
## 3 Highlighted Icon 1 11 WGS 84
## 4 Ground Overlays 1 11 WGS 84
## 5 Screen Overlays 0 11 WGS 84
## 6 Paths 6 11 WGS 84
## 7 Polygons 0 11 WGS 84
## 8 Google Campus 4 11 WGS 84
## 9 Extruded Polygon 1 11 WGS 84
## 10 Absolute and Relative 4 11 WGS 84
kml <- read_sf("KML_Samples.kml", layer = "Placemarks")
kml |>
  ggplot(aes()) +
  geom_sf() +
  coord_sf()
8.5.2 Raster Data
Raster data comes in many file formats with some of them supporting multilayer files.
raster_filepath <- system.file("raster/srtm.tif", package = "spDataLarge")
single_layer <- rast(raster_filepath)
ggplot() +
  tidyterra::geom_spatraster(data = single_layer)
rast() also works when you want to read a multilayer file.
multilayer_filepath <- system.file("raster/landsat.tif", package = "spDataLarge")
multilayer_rast <- rast(multilayer_filepath)
multilayer_rast
## class : SpatRaster
## dimensions : 1428, 1128, 4 (nrow, ncol, nlyr)
## resolution : 30, 30 (x, y)
## extent : 301905, 335745, 4111245, 4154085 (xmin, xmax, ymin, ymax)
## coord. ref. : WGS 84 / UTM zone 12N (EPSG:32612)
## source : landsat.tif
## names : landsat_1, landsat_2, landsat_3, landsat_4
## min values : 7550, 6404, 5678, 5252
## max values : 19071, 22051, 25780, 31961
All of the previous examples read spatial information from files stored on your hard drive. However, GDAL also allows reading data directly from online resources, such as HTTP/HTTPS/FTP web resources.
To do so, add a /vsicurl/ prefix before the path to the file.
For example, the following connects to the global monthly snow probability at 500 m resolution for the period 2000-2012:
myurl <- "/vsicurl/https://zenodo.org/record/5774954/files/clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0.tif"
snow <- rast(myurl)
snow
## class : SpatRaster
## dimensions : 35849, 86400, 1 (nrow, ncol, nlyr)
## resolution : 0.004166667, 0.004166667 (x, y)
## extent : -180, 180, -62.00083, 87.37 (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (EPSG:4326)
## source : clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0.tif
## name : clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0
Because the input data is a Cloud Optimized GeoTIFF (COG), we are not actually reading this file into RAM, but rather creating a connection to it without obtaining any values.
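The difference between a connection and values held in RAM can be checked from R. The sketch below is only an illustration under an assumption: it uses a small GeoTIFF shipped with terra (ex/elev.tif) as a stand-in for the remote file, so it runs without network access.

```r
library(terra)

# Assumption: a local example file from terra stands in for the remote COG
elev_path <- system.file("ex/elev.tif", package = "terra")
elev <- rast(elev_path)

# The object points to the file; cell values are not yet held in memory
inMemory(elev)
sources(elev)

# A value-based operation triggers the actual read from the source
elev_vals <- values(elev)
dim(elev_vals)
```

Here rast() only records where the data lives, and values are fetched when an operation such as values() or extract() asks for them.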
We can get the snow probability for December in Reykjavik by specifying its coordinates and applying the extract() function:
rey <- data.frame(lon = -21.94, lat = 64.15)
snow_rey <- terra::extract(snow, rey)
snow_rey
## ID clm_snow.prob_esacci.dec_p.90_500m_s0..0cm_2000..2012_v2.0
## 1 1 70
70
## [1] 70
The /vsicurl/ prefix works not only for raster but also for vector file formats. It allows reading vectors directly from online storage with read_sf() just by adding the prefix before the vector file URL.
/vsicurl/ is not the only prefix provided by GDAL. Many more exist, such as /vsizip/ to read spatial files from ZIP archives without decompressing them beforehand, or /vsis3/ for on-the-fly reading of files available in AWS S3 buckets. Learn more at https://gdal.org/user/virtual_file_systems.html