How can I process API responses?

Add req_error() in here somewhere!

Learning objectives:

  • Recognize the types of API responses.
  • Recognize JSON data structures.
  • Recognize XML data structures.
  • Parse text responses.
  • Parse nested lists with the {tidyverse}.
  • Parse nested lists with {tibblify}.
  • Use API docs to anticipate API responses.
  • Parse binary responses such as images and videos.

Content-Type

  • Content-Type = “MIME type”
    • “Multipurpose Internet Mail Extensions”
  • type/subtype;parameter=value
  • httr2::resp_content_type() gets type/subtype
  • httr2::resp_encoding() gets charset parameter
  • More at MDN MIME types

Common text content types

MIME type httr2 function Description
application/json resp_body_json() By far most common
application/xml resp_body_xml() Briefly most common
text/html resp_body_html() See later chapter on scraping
text/plain resp_body_string() Text wildcard

JSON

  • application/json or */json
  • 4 scalars (length-1 vectors)
    • nullNA
    • stringcharacter(1), always " (not ')
    • numbernumeric(1), no Inf/-Inf/NaN
    • booleanlogical(1), true = TRUE, false = FALSE
  • array ≈ unnamed list()
    • []: [null, "a", 1, true]list(NULL, "a", 1, TRUE)
  • object ≈ named list()
    • {}: {"a": 1, "b": [1, 2]}list(a = 1, b = list(1, 2))
  • httr2::resp_body_json() uses jsonlite::fromJSON()

JSON Example

resp_json <- req_template(request(example_url()), "/json") |>
  req_perform()
resp_json |> resp_body_string() |> jsonlite::prettify(indent = 2)
extracted_json <- resp_body_json(resp_json)
class(extracted_json)
names(extracted_json)

XML

eXtensible Markup Language

  • application/xml, text/xml, or */xml)
  • Tags as <tagname attribute="a">contents</tagname>
  • Everything nestable
  • httr2::resp_body_xml() uses xml2::read_xml()

XML Example

resp_xml <- req_template(request(example_url()), "/xml") |>
  req_perform()
resp_xml |> resp_body_string() |> cat()
extracted_xml <-  resp_body_xml(resp_xml)
class(extracted_xml)
# We'll see other ways to parse this in a later chapter.
xml2::as_list(extracted_xml) |> names()
xml2::as_list(extracted_xml)$root |> names()

Nested lists: tidyverse

  • tibble::enframe() to tibble-ify the list()
  • tidyr::unnest_wider() and/or tidyr::unnest_longer()
    • Sometimes tidyr::unnest() but it can hide layers
list(root = extracted_json) |> 
  tibble::enframe(name = NULL) |> 
  tidyr::unnest_wider(value)

xml2::as_list(extracted_xml) |> 
  tibble::enframe(name = NULL) |> 
  tidyr::unnest_wider(value)

Nested lists: tibblify

list(root = extracted_json) |> 
  tibblify::tibblify()

xml2::as_list(extracted_xml) |> 
  tibblify::tibblify()

Response objects in API docs

  • Look for 200 in docs for this path
  • Find “OpenAPI” or “Swagger” links (or “API json”, “API yaml”, etc)
    • Eg: Asana
    • Search for the path, then method, then 200 (eg: /tasks:)
    • Often a $ref to something like #/components/schemas/TaskCompact
    • Ctrl-F around that doc!
  • {anyapi} will soon build a tibblify spec from API spec

Binary objects

  • image/* (png, jpeg, svg+xml)
  • audio/* (webm, ogg, wav)
  • video/* (webm, ogg, mp4)
  • application/octet-stream (catch-all)

resp_body_raw(resp) |> writeBin(filename)

Meeting Videos

Cohort 1

Meeting chat log
LOG