2.4 Reusing existing data structures
“You don’t have to reinvent the wheel, just attach it to a new wagon.”
- Mark McCormack
There are many different data types in R, such as matrices, lists, and data frames.¹ A typical function takes in data of some form, conducts an operation, and returns the result.
Functions in the tidyverse most often operate on data structures called tibbles.
Traditional data frames can represent different data types in each column, and multiple values in each row.
A tibble is a special kind of data frame that has additional properties helpful for data analysis.
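As a rough illustration of one such property, the sketch below compares single-column subsetting on a base data frame and on a tibble. The objects `df` and `tbl` and their column names are made up for this example; only `tibble::as_tibble()` and base R are used.

```r
library(tibble)

# Hypothetical toy data, used only for illustration
df  <- data.frame(x = 1:3, label = c("a", "b", "c"))
tbl <- as_tibble(df)

# Selecting one column with `[`: a base data frame drops to a bare vector
# by default, while a tibble stays a tibble
class(df[, "x"])
## [1] "integer"
class(tbl[, "x"])
## [1] "tbl_df"     "tbl"        "data.frame"
```

Because the result keeps its type, code written against tibbles tends to behave more predictably when the number of selected columns changes.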
- Example: list-columns
boot_samp <- rsample::bootstraps(mtcars, times = 3)
boot_samp
## # Bootstrap sampling
## # A tibble: 3 × 2
## splits id
## <list> <chr>
## 1 <split [32/10]> Bootstrap1
## 2 <split [32/12]> Bootstrap2
## 3 <split [32/11]> Bootstrap3
class(boot_samp)
## [1] "bootstraps" "rset" "tbl_df" "tbl" "data.frame"
The example above creates bootstrap resamples of the data frame mtcars. It returns a tibble with a splits column that defines the resampled data sets.
The object it returns inherits from the data frame and tibble classes, so other functions that operate on those data structures can be used on it.
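A minimal sketch of what this inheritance allows, assuming the rsample package is installed. The seed value and the names `first_resample` and `boot_means` are arbitrary choices for this example, and `rsample::analysis()` is used here to pull the resampled data out of a single split.

```r
library(rsample)

set.seed(1)  # arbitrary seed, only so the resamples are reproducible
boot_samp <- bootstraps(mtcars, times = 3)

# Generic data frame functions work directly on the resampling object
is.data.frame(boot_samp)
nrow(boot_samp)
names(boot_samp)

# Each element of the splits list-column describes one resample;
# analysis() materializes it as an ordinary data frame
first_resample <- analysis(boot_samp$splits[[1]])
dim(first_resample)

# Iterate over the list-column like any other list:
# here, the mean of mpg within each bootstrap resample
boot_means <- vapply(
  boot_samp$splits,
  function(split) mean(analysis(split)$mpg),
  numeric(1)
)
boot_means
```

Nothing here is specific to bootstrapping: the same base R calls would apply to any object that carries the data frame class.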
¹ For a more detailed discussion, see Hadley Wickham’s Advanced R.