29.15 List-columns
- List-columns are implicit in the definition of the data frame: a data frame is a named list of equal-length vectors.
- Base R doesn’t make it easy to create list-columns, and data.frame() treats a list as a list of columns.
- You can prevent data.frame() from treating a list as a list of columns by wrapping the argument in I(). However, the result doesn’t print well. I() stands for Inhibit Interpretation/Conversion of Objects: it changes the class of an object to indicate that it should be treated as is.
- Tibble alleviates this problem by being lazier (tibble() doesn’t modify its inputs) and by providing a better print method.
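The three behaviours described in these bullets can be compared side by side. This is a minimal sketch, assuming the tibble package is installed; the object names (df_cols, df_asis, tb) and the small example vectors are illustrative choices, not taken from the notes.

```r
library(tibble)

# data.frame() treats a list as a list of columns:
# the two list elements become two separate columns.
df_cols <- data.frame(x = list(1:3, 3:5))
dim(df_cols)

# Wrapping the list in I() marks it "as is", so it stays one list-column.
df_asis <- data.frame(
  x = I(list(1:3, 3:5)),
  y = c("1, 2", "3, 4, 5")
)
class(df_asis$x)

# tibble() does not modify its inputs, so the list is kept as a list-column
# and the print method shows the type and length of each element.
tb <- tibble(
  x = list(1:3, 3:5),
  y = c("1, 2", "3, 4, 5")
)
tb
```

In this sketch, x is built from an unquoted list of integer vectors, while y is an ordinary character vector whose elements are quoted strings.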
Note where the quotes are placed
tribble() can automatically work out that you need a list.
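A short sketch of that behaviour, assuming the tibble package is installed; the column names and values are illustrative only. Because the entries under x are vectors longer than one element, tribble() stores them in a list-column without any explicit call to list().

```r
library(tibble)

# Row-wise definition: x receives integer vectors, y receives single strings.
tb_rows <- tribble(
   ~x, ~y,
  1:3, "1, 2",
  3:5, "3, 4, 5"
)

is.list(tb_rows$x)       # x was turned into a list-column
is.character(tb_rows$y)  # y stays an ordinary character vector
```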
- List-columns are often most useful as an intermediate data structure.
- The advantage of keeping related items together in a data frame is worth a little hassle.
There are three parts of an effective list-column pipeline:
- You create the list-column using one of: nest(), summarise() + list(), or mutate() + a map function.
- You create other intermediate list-columns by transforming existing list-columns with map(), map2(), or pmap().
- You simplify the list-column back down to a data frame or atomic vector.
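The three parts can be sketched end to end as follows. This is one possible illustration rather than the only way to build such a pipeline: it assumes dplyr, tidyr (version 1.0 or later, for the nest(data = -cyl) syntax), and purrr are installed, and it uses the built-in mtcars dataset with a per-group linear model chosen purely as an example.

```r
library(dplyr)
library(tidyr)
library(purrr)

# 1. Create a list-column: one nested data frame per value of cyl.
by_cyl <- mtcars %>%
  nest(data = -cyl)

# 2. Create an intermediate list-column by transforming the existing one:
#    fit one linear model per nested data frame.
models <- by_cyl %>%
  mutate(model = map(data, ~ lm(mpg ~ wt, data = .x)))

# 3. Simplify back down to an atomic vector: extract the slope for wt.
slopes <- models %>%
  mutate(slope = map_dbl(model, ~ coef(.x)[["wt"]])) %>%
  select(cyl, slope)

slopes
```

Keeping data and model next to cyl in the same tibble means each model stays aligned with the group it was fitted on until the final simplification step.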