Controlling column types
readr will try to guess the type of each column (i.e. whether it’s a logical, number, string, etc.). readr uses a heuristic to figure out the column types. For each column, it pulls the values of 1,000 rows and works through the following questions:
- Does it contain only F, T, FALSE, TRUE, false, true, f, t, etc.? If so, it’s a logical.
- Does it contain only numbers (e.g., 1, -4.5, 5e6, Inf)? If so, it’s a number.
- Does it match the ISO8601 standard? If so, it’s a date or date-time.
- Otherwise, it must be a string.
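The questions above can be tried out one vector at a time. The sketch below assumes readr is installed and uses its guess_parser() function, which reports the type it would guess for a character vector. It is only an approximation of what read_csv() does for a whole column, and the input values are made up for illustration.

```r
library(readr)

# Made-up inputs, one per question in the list above
guessed <- c(
  logical = guess_parser(c("TRUE", "FALSE", "T", "F")),
  number  = guess_parser(c("1", "-4.5", "5e6")),
  date    = guess_parser(c("2021-01-15", "2021-02-15")),
  string  = guess_parser(c("abc", "def"))
)

# Named character vector holding the guessed type for each input
print(guessed)
```

Each input is classified by the first question it satisfies, so anything that is not clearly a logical, number, or date falls through to a string.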
This heuristic works well if you have a clean dataset like the one below, but real-world data is rarely this tidy, and the guesses will sometimes be wrong.
read_csv("
logical,numeric,date,string
TRUE,1,2021-01-15,abc
false,4.5,2021-02-15,def
T,Inf,2021-02-16,ghi
")
## Rows: 3 Columns: 4
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (1): string
## dbl (1): numeric
## lgl (1): logical
## date (1): date
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
## # A tibble: 3 × 4
##   logical numeric date       string
##   <lgl>     <dbl> <date>     <chr>
## 1 TRUE        1   2021-01-15 abc
## 2 FALSE       4.5 2021-02-15 def
## 3 TRUE      Inf   2021-02-16 ghi
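One common way the heuristic can go wrong on messier data is a missing value recorded as something other than NA. The sketch below is a hypothetical illustration, not taken from the example above: the column x and the use of "." as a missing-value marker are assumptions. It shows the guess falling back to a string and one way to correct it with the na argument of read_csv().

```r
library(readr)

# Hypothetical data: "." stands in for a missing value in a numeric column
messy_csv <- "x\n10\n.\n20\n30"

# With the default settings, the stray "." makes the whole column a string
df <- read_csv(messy_csv, show_col_types = FALSE)
class(df$x)

# Telling readr that "." means missing lets the column be read as a number
df_fixed <- read_csv(messy_csv, na = ".", show_col_types = FALSE)
class(df_fixed$x)
```

Another option is to state the type up front with col_types (for example cols(x = col_double())), in which case readr reports the values it could not parse instead of silently switching the column to a string.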