2.4 Reusing existing data structures

“You don’t have to reinvent the wheel, just attach it to a new wagon.”

- Mark McCormack

R has many different data structures, such as matrices, lists, and data frames.[1] A typical function takes in data of some form, performs an operation, and returns the result.

Functions in the tidyverse most often operate on data structures called tibbles.

  • Traditional data frames can represent different data types in each column, and multiple values in each row.

  • Tibbles are a special type of data frame with additional properties that are helpful for data analysis.

    • Example: list-columns
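As a minimal sketch of what a list-column is, assuming the tibble package is installed: the object `tbl` and its column names `name` and `values` are made up for illustration and are not part of the original example.

```r
library(tibble)

# Each cell of a list-column can hold an arbitrary R object,
# here two vectors of different types and lengths.
tbl <- tibble(
  name   = c("a", "b"),
  values = list(1:3, letters[1:5])
)

tbl

class(tbl$values)    # "list"
lengths(tbl$values)  # 3 5
```

The `values` column is an ordinary list with one element per row, which is what lets a single tibble carry complex objects alongside plain columns.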

boot_samp <- rsample::bootstraps(mtcars, times = 3)
boot_samp
## # Bootstrap sampling 
## # A tibble: 3 × 2
##   splits          id        
##   <list>          <chr>     
## 1 <split [32/10]> Bootstrap1
## 2 <split [32/12]> Bootstrap2
## 3 <split [32/11]> Bootstrap3
class(boot_samp)
## [1] "bootstraps" "rset"       "tbl_df"     "tbl"        "data.frame"

The example above creates three bootstrap resamples of the mtcars data frame. The result is a tibble whose splits list-column defines the resampled data sets.
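To see what one element of the splits column holds, the following sketch pulls the first split apart with `rsample::analysis()` and `rsample::assessment()`. The seed value and the names `first_split`, `boot_data`, and `oob_data` are assumptions chosen for illustration.

```r
library(rsample)

set.seed(123)  # arbitrary seed, only so the sketch is repeatable
boot_samp <- bootstraps(mtcars, times = 3)

# Each element of the list-column is a single split object
first_split <- boot_samp$splits[[1]]

# Rows drawn with replacement; same number of rows as mtcars
boot_data <- analysis(first_split)

# Rows that were never drawn for this resample
oob_data <- assessment(first_split)

nrow(boot_data)  # 32
nrow(oob_data)   # varies with the random draw
```

The two numbers correspond to the pair printed for each split, such as `[32/10]` in the output above.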

The resulting object inherits from both the data frame and tibble classes, so functions that operate on those data structures can be used with it.
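A short sketch of what that inheritance allows, assuming the rsample and dplyr packages are installed. The name `first_row` is made up for illustration.

```r
library(rsample)

boot_samp <- bootstraps(mtcars, times = 3)

# Base R functions written for data frames work on the object
nrow(boot_samp)   # 3
names(boot_samp)  # "splits" "id"

# So do dplyr verbs written for tibbles
first_row <- dplyr::filter(boot_samp, id == "Bootstrap1")
nrow(first_row)   # 1

# The class vector is what makes this dispatch possible
inherits(boot_samp, "tbl_df")      # TRUE
inherits(boot_samp, "data.frame")  # TRUE
```

No rsample-specific helpers are needed for this kind of row and column manipulation, because existing data frame and tibble tools apply directly.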


  1. For a more detailed discussion, see Hadley Wickham’s Advanced R.