How can I more easily access APIs from R?

✅ Learning objectives

  • Use API docs to explore available endpoints.
  • Fetch data from an API with {httr2}.
  • Build a {httr2} request piece-by-piece.
library(jsonlite)
library(tibblify)
library(tidyr)
library(dplyr)
library(httr2)

What can we do so far?

We can:

  • Access APIs that can be accessed in a web browser
  • Copy/paste or manually build URLs
  • Pass an authentication key as part of a URL
  • Fetch JSON data with {jsonlite}
  • Fetch YAML data with {yaml}

We can’t yet:

  • Access APIs that can’t be accessed in a web browser
  • Build API requests systematically
  • Access APIs that require more than a key in the URL for authentication
  • Fetch other types of data (raw text, images, videos, etc.)

How can I learn about API endpoints?

Motivating Example: OpenFEC

2020 Presidential Candidates

openfec_response <- jsonlite::read_json("https://api.open.fec.gov/v1/candidates/?election_year=2020&office=P&api_key=DEMO_KEY")
tibblify(openfec_response$results) |> 
  select(name, party_full, has_raised_funds) |> 
  head()
#> # A tibble: 6 × 3
#>   name                       party_full       has_raised_funds
#>   <chr>                      <chr>            <lgl>           
#> 1 753, JO                    NONE             FALSE           
#> 2 ABRAMSON, MAX              VETERANS PARTY   FALSE           
#> 3 ABRAUGH, MATTHEW M. MR.    NON-PARTY        FALSE           
#> 4 ACKER, BRANDON W           DEMOCRATIC PARTY FALSE           
#> 5 ACKER, RYAN                NON-PARTY        TRUE            
#> 6 ACORD, ROBERT BRADFORD LEE DEMOCRATIC PARTY FALSE
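
The URL in this example packs a host, a path, and three query parameters into one string. As a small sketch (not part of the original example), httr2's url_parse() can split it into those pieces without sending anything, which previews the parts that get built one at a time later on:

```r
library(httr2)

# Split the motivating URL into its components (no request is sent).
fec_url <- "https://api.open.fec.gov/v1/candidates/?election_year=2020&office=P&api_key=DEMO_KEY"
parts <- url_parse(fec_url)

parts$hostname     # the server that hosts the API
parts$path         # the endpoint path
names(parts$query) # the query parameter names
parts$query$office # the value of one query parameter
```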

What is {httr2}?

What do {httr2} calls look like?

Pipe-based API calls

candidates <- 
  request("https://api.open.fec.gov/v1") |> 
  req_url_path_append("candidates") |> 
  req_url_query(api_key = "DEMO_KEY") |> 
  req_url_query(
    election_year = 2020,
    office = "P"
  ) |> 
  req_perform() |> # Chapter 5 (next)
  resp_body_json() # Chapter 7
waldo::compare(candidates, openfec_response)
#> ✔ No differences

Why “httr2”?

HTTP = HyperText Transfer Protocol

  • “HyperText” = web content
  • “Transfer” = exchange
  • “Protocol” = rules
  • “rules for exchanging web content”
  • HTTP(S) = the protocol behind most web traffic

How do I use {httr2}?

req_*() functions return httr2_request objects

req_fec <- request("https://api.open.fec.gov/v1")
req_fec_auth <- req_url_query(req_fec, api_key = "DEMO_KEY")
req_candidates <- req_url_path_append(req_fec_auth, "candidates")
req_candidates_president <- req_url_query(req_candidates, office = "P")
pres_2024 <- req_url_query(req_candidates_president, election_year = 2024) |> 
  req_perform() |> resp_body_json()
candidates_2022 <- req_url_query(req_candidates, election_year = 2022) |> 
  req_perform() |> resp_body_json()
req_calendar <- req_url_path_append(req_fec_auth, "calendar-dates")
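
The branching above works because each req_*() call returns a new request and leaves its input untouched. A minimal sketch of that behavior, reusing the same base URL and paths (nothing is sent to the API):

```r
library(httr2)

req_fec <- request("https://api.open.fec.gov/v1")

# Two different requests derived from the same base object.
req_candidates <- req_url_path_append(req_fec, "candidates")
req_calendar <- req_url_path_append(req_fec, "calendar-dates")

class(req_fec)     # includes "httr2_request"
req_fec$url        # the base request still points at the API root
req_candidates$url
req_calendar$url
```

Because the base object never changes, it is safe to keep one "main" request around and derive as many specific requests from it as needed.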

How do I build {httr2} requests?

request() & req_url_path_append()

request("https://api.open.fec.gov/v1/candidates/")

Cleaner: “main” request object + specific path

req_fec <- request("https://api.open.fec.gov/v1")
req_candidates <- req_fec |> 
  req_url_path_append("candidates")
req_candidates$url
#> [1] "https://api.open.fec.gov/v1/candidates"

Don’t use req_url_path()!

req_path_bad <- req_fec |> 
  req_url_path("candidates")
req_path_bad$url
#> [1] "https://api.open.fec.gov/candidates"
req_candidates$url
#> [1] "https://api.open.fec.gov/v1/candidates"
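
req_url_path() replaces the whole path, which is why "v1" disappears above. req_url_path_append() keeps the existing path and also accepts several segments at once. A short sketch, where the "candidates/search" path is used only for illustration and nothing is sent:

```r
library(httr2)

req_fec <- request("https://api.open.fec.gov/v1")

# Several segments can be supplied at once; they are joined with "/".
# "candidates/search" is an illustrative path, assumed for this sketch.
req_search <- req_url_path_append(req_fec, "candidates", "search")
req_search$url

# Appending in two steps builds the same URL.
req_search2 <- req_fec |>
  req_url_path_append("candidates") |>
  req_url_path_append("search")
identical(req_search$url, req_search2$url)
```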

req_url_query()

https://api.open.fec.gov/v1/candidates/?api_key=DEMO_KEY&office=P

req_pres <- req_candidates |> 
  req_url_query(
    api_key = "DEMO_KEY",
    office = "P"
  )
  • Query parameters can be added piecewise, even before the path!
req_fec_auth <- req_fec |> 
  req_url_query(api_key = "DEMO_KEY")
req_candidates_auth <- req_fec_auth |> 
  req_url_path_append("candidates")
req_pres2 <- req_candidates_auth |> 
  req_url_query(office = "P")
identical(req_pres$url, req_pres2$url)
#> [1] TRUE
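
Piecewise building also means a parameter can be changed later. As a sketch of how this appears to work in {httr2}: supplying an existing name replaces its value, and supplying NULL drops the parameter. The election years here are arbitrary examples:

```r
library(httr2)

req_2020 <- request("https://api.open.fec.gov/v1") |>
  req_url_path_append("candidates") |>
  req_url_query(office = "P", election_year = 2020)

# Supplying an existing name replaces its value.
req_2024 <- req_url_query(req_2020, election_year = 2024)
req_2024$url

# Supplying NULL drops the parameter.
req_all_years <- req_url_query(req_2020, election_year = NULL)
req_all_years$url
```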

req_url_query(.multi)

req_url_query(req_candidates, office = c("H", "S"))
#> Error in `req_url_query()`:
#> ! All vector elements of `...` must be length 1.
#> ℹ Use `.multi` to choose a strategy for handling vectors.

.multi = "pipe"

req_url_query(req_candidates, office = c("H", "S"), .multi = "pipe")$url
#> [1] "https://api.open.fec.gov/v1/candidates?office=H|S"

.multi = "comma"

req_url_query(req_candidates, office = c("H", "S"), .multi = "comma")$url
#> [1] "https://api.open.fec.gov/v1/candidates?office=H,S"

.multi = "explode"

req_url_query(req_candidates, office = c("H", "S"), .multi = "explode")$url
#> [1] "https://api.open.fec.gov/v1/candidates?office=H&office=S"
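
Putting the pieces together, one way to build requests systematically is a small wrapper function. This is only a sketch: fec_request() is a hypothetical helper (not part of {httr2} or OpenFEC), and the choice of .multi = "explode" is an assumption that should be checked against the API docs.

```r
library(httr2)

# Hypothetical helper: build (but do not perform) a request for one endpoint.
fec_request <- function(endpoint, ..., api_key = "DEMO_KEY") {
  request("https://api.open.fec.gov/v1") |>
    req_url_path_append(endpoint) |>
    req_url_query(api_key = api_key) |>
    # Assumption: repeated parameters ("explode") suit this API.
    req_url_query(..., .multi = "explode")
}

req_congress <- fec_request(
  "candidates",
  office = c("H", "S"),
  election_year = 2022
)
req_congress$url
```

The helper returns an unperformed request, so it can still be piped into further req_*() calls or into req_perform().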

More httr2 to come!

  • Chapter 4 = you are here
  • Chapter 5 = “How can I get a lot of data from an API?”
  • Chapter 6 = “How do I tell the API who I am?”
  • Chapter 7 = “How can I process API responses?”
  • Chapter 8 = “How can I do other things with APIs?”
