2.7 Tibbles vs. Data Frames

A tibble is a special type of data frame with some additional properties. Specifically:

  • Tibbles keep column names that are not syntactically valid R variable names, whereas data.frame() rewrites them by default.
library(tibble)

data.frame(`this does not work` = 1:2,
           oops = 3:4)
##   this.does.not.work oops
## 1                  1    3
## 2                  2    4
tibble(`this does work, though` = 1:2,
       `woohoo!` = 3:4)
## # A tibble: 2 × 2
##   `this does work, though` `woohoo!`
##                      <int>     <int>
## 1                        1         3
## 2                        2         4
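As a small sketch of how such names can be used afterwards (the object names df and tb are only illustrative): non-syntactic names need backticks with $, or a quoted string with [[. Base R can also be asked to leave names alone through the check.names argument of data.frame().

```r
library(tibble)

# check.names = FALSE tells data.frame() not to rewrite the names
df <- data.frame(`this does work too` = 1:2, check.names = FALSE)
names(df)

# Non-syntactic names must be wrapped in backticks when used with $ ...
tb <- tibble(`woohoo!` = 3:4)
tb$`woohoo!`

# ... or passed as an ordinary string to [[
tb[["woohoo!"]]
```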
  • Tibbles do not partially match column names when you use $, which helps avoid accidental errors.
df <- data.frame(partial = 1:5)
tbbl <- tibble(partial = 1:5)

df$part
## [1] 1 2 3 4 5
tbbl$part
## Warning: Unknown or uninitialised column: `part`.
## NULL
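For comparison, a brief sketch of how [[ behaves, reusing the same two objects: in base R, [[ matches names exactly unless partial matching is requested through its exact argument, so it is the $ shortcut on data frames that is lenient.

```r
library(tibble)

df <- data.frame(partial = 1:5)
tbbl <- tibble(partial = 1:5)

# [[ on a data frame matches exactly by default, so this finds nothing
exact_df <- df[["part"]]

# Partial matching with [[ has to be requested explicitly
partial_df <- df[["part", exact = FALSE]]

# A tibble finds nothing under the abbreviated name
exact_tbbl <- tbbl[["part"]]
```

Setting options(warnPartialMatchDollar = TRUE) is one way to make base R warn whenever $ falls back on a partial match.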
  • Tibbles never drop dimensions under single-bracket subsetting, so selecting a single column returns a one-column tibble rather than a vector.
df[, "partial"]
## [1] 1 2 3 4 5
tbbl[, "partial"]
## # A tibble: 5 × 1
##   partial
##     <int>
## 1       1
## 2       2
## 3       3
## 4       4
## 5       5
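The two behaviours can be lined up in either direction. The sketch below, again with the same example objects, uses drop = FALSE to keep a data frame from collapsing into a vector, and [[ or $ to pull a vector out of a tibble on purpose.

```r
library(tibble)

df <- data.frame(partial = 1:5)
tbbl <- tibble(partial = 1:5)

# drop = FALSE keeps the result a one-column data frame
df_one_col <- df[, "partial", drop = FALSE]

# [[ and $ extract the column itself as a vector
vec_bracket <- tbbl[["partial"]]
vec_dollar <- tbbl$partial
```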
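  • Tibbles make it straightforward to create list-columns, which can be a powerful tool when working with the purrr package. data.frame() instead spreads a list across several columns unless the list is wrapped in I().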
template_list <- list(a = 1, b = 2, c = 3, d = 4, e = 5)

data.frame(col = 1:5, list_col = template_list)
##   col list_col.a list_col.b list_col.c list_col.d list_col.e
## 1   1          1          2          3          4          5
## 2   2          1          2          3          4          5
## 3   3          1          2          3          4          5
## 4   4          1          2          3          4          5
## 5   5          1          2          3          4          5
tibble(col = 1:5, list_col = template_list)
## # A tibble: 5 × 2
##     col list_col    
##   <int> <named list>
## 1     1 <dbl [1]>   
## 2     2 <dbl [1]>   
## 3     3 <dbl [1]>   
## 4     4 <dbl [1]>   
## 5     5 <dbl [1]>
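To illustrate why list-columns pair well with purrr, here is a minimal sketch. The tibble tb and its columns id and vals are made up for this example. Each element of the list-column can hold a vector of a different length, and the map functions summarise one element at a time.

```r
library(tibble)
library(purrr)

# A list-column whose elements have different lengths
tb <- tibble(id = 1:3, vals = list(1:3, 4:6, 7:10))

# Apply a function to each element and collect a typed vector
n_vals <- map_int(tb$vals, length)
means <- map_dbl(tb$vals, mean)
```

The results are ordinary vectors with one entry per row, so they can be stored back in the tibble as regular columns.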