NA
and NULL
black hole
#> R version 4.5.1 (2025-06-13 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 26100)
#>
#> Matrix products: default
#> LAPACK version 3.12.1
#>
#> locale:
#> [1] LC_COLLATE=English_United States.utf8
#> [2] LC_CTYPE=English_United States.utf8
#> [3] LC_MONETARY=English_United States.utf8
#> [4] LC_NUMERIC=C
#> [5] LC_TIME=English_United States.utf8
#>
#> time zone: America/Chicago
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] palmerpenguins_0.1.1 gt_1.0.0 dplyr_1.1.4
#>
#> loaded via a namespace (and not attached):
#> [1] digest_0.6.37 R6_2.6.1 fastmap_1.2.0 tidyselect_1.2.1
#> [5] xfun_0.53 magrittr_2.0.3 glue_1.8.0 tibble_3.3.0
#> [9] knitr_1.50 pkgconfig_2.0.3 htmltools_0.5.8.1 rmarkdown_2.29
#> [13] generics_0.1.4 lifecycle_1.0.4 xml2_1.3.8 cli_3.6.5
#> [17] vctrs_0.6.5 compiler_4.5.1 tools_4.5.1 pillar_1.11.0
#> [21] evaluate_1.0.4 yaml_2.3.10 rlang_1.1.6 jsonlite_2.0.0
#> [25] keyring_1.4.1
Palmer Penguins
Consider this code to count the number of Gentoo penguins in the penguins data set. We see that there are 124 Gentoo penguins.
One subtle error can arise in trying out %in% here instead.
Where did the penguins go?
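The slip can be sketched as follows, assuming it comes from swapping the operands of %in%. The `species` vector is a small hypothetical stand-in for `penguins$species`, so the counts differ from the 124 reported above.

```r
# Hypothetical stand-in for penguins$species
species <- c("Adelie", "Gentoo", "Chinstrap", "Gentoo", "Gentoo")

# Element-wise comparison: one TRUE/FALSE per penguin
n_equal <- sum(species == "Gentoo")

# Data on the left of %in%: still one TRUE/FALSE per penguin
n_in <- sum(species %in% "Gentoo")

# Operands swapped: asks "is 'Gentoo' anywhere in species?", a single TRUE
n_swapped <- sum("Gentoo" %in% species)

c(n_equal = n_equal, n_in = n_in, n_swapped = n_swapped)
```

`x %in% table` returns one value per element of `x`, so with a length-one left-hand side the sum can never exceed 1.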
Image Credit: Advanced R
Two main types: atomic vectors and lists.
Closely related but not technically a vector: NULL.
Scalars are vectors that consist of a single value.
Integers must be followed by L and cannot have fractional values.
The L suffix dates back to the C programming language and its “long int” type for memory allocation.
Strings can use single or double quotes, and special characters are escaped with \ (backslash).
There are several ways to make longer vectors:
1. With single values inside c() for combine.
2. With other vectors
We can determine the type of a vector with typeof() and its length with length().
Types of Atomic Vectors [1]
name | value | typeof() | length() |
---|---|---|---|
lgl_var | TRUE, FALSE | logical | 2 |
int_var | 1L, 6L, 10L | integer | 3 |
dbl_var | 1, 2.5, 4.5 | double | 3 |
chr_var | 'these are', 'some strings' | character | 2 |
[1] Source: https://adv-r.hadley.nz/index.html
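The table can be reproduced with a few lines of base R. This is a minimal sketch that reuses the variable names from the table; `combined` and `vars` are illustrative names, not part of the original.

```r
# Build each atomic vector with c()
lgl_var <- c(TRUE, FALSE)
int_var <- c(1L, 6L, 10L)
dbl_var <- c(1, 2.5, 4.5)
chr_var <- c("these are", "some strings")

# c() also combines existing vectors and always flattens the result
combined <- c(c(1, 2), c(3, 4))

# Collect type and length of each variable
vars <- list(lgl_var = lgl_var, int_var = int_var,
             dbl_var = dbl_var, chr_var = chr_var)
types <- vapply(vars, typeof, character(1))
sizes <- lengths(vars)

data.frame(typeof = types, length = sizes)
```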
For most computations, an operation over values that includes a missing value yields a missing value (unless you’re careful)
To search for missing values use is.na()
Each type has its own NA type:
NA (logical)
NA_integer_
NA_real_ (double)
NA_character_
This may not matter in many contexts.
It can matter for operations where types matter, like dplyr::if_else().
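A short base R sketch of how missing values propagate and how the typed NA constants differ. The vector `x` is made up for illustration.

```r
x <- c(1, NA, 3)

# A missing value makes most computations missing
total <- sum(x)                      # NA
total_clean <- sum(x, na.rm = TRUE)  # drops the NA first

# Comparing with == does not find missing values; is.na() does
compared <- x == NA                  # NA NA NA
found <- is.na(x)                    # FALSE TRUE FALSE

# Each atomic type has its own missing value constant
na_types <- c(
  typeof(NA),
  typeof(NA_integer_),
  typeof(NA_real_),
  typeof(NA_character_)
)
na_types
```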
What type of vector is.*() it?
Test data type:
is.logical()
is.integer()
is.double()
is.character()
What type of object is it?
Don’t test objects with these tools:
is.vector()
is.atomic()
is.numeric()
They don’t test if you have a vector, atomic vector, or numeric vector; you’ll need to carefully read the documentation to figure out what they actually do (preview: attributes)
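A few base R calls illustrate why these functions surprise people. The objects below are made up for the example.

```r
x <- c(a = 1, b = 2)

# Names are the one attribute is.vector() tolerates
named_ok <- is.vector(x)

# Any other attribute makes is.vector() return FALSE
attr(x, "units") <- "cm"
with_attr <- is.vector(x)

# Lists count as vectors for is.vector()
list_ok <- is.vector(list(1, 2))

# A factor is stored as integers but is not "numeric"
f <- factor(c("a", "b"))
factor_type <- typeof(f)
factor_numeric <- is.numeric(f)

c(named_ok, with_attr, list_ok, factor_numeric)
```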
Use the is_*() functions from {rlang}?
rlang::is_vector
rlang::is_atomic
#> [1] TRUE
#> [1] TRUE
#> [1] TRUE
#> [1] FALSE
R follows rules for coercion: character → double → integer → logical
R can coerce either automatically or explicitly
Two contexts for automatic coercion:
Coercion of Atomic Vectors [1]
name | value | as.logical() | as.integer() | as.double() | as.character() |
---|---|---|---|---|---|
lgl_var | TRUE, FALSE | TRUE FALSE | 1 0 | 1 0 | 'TRUE' 'FALSE' |
int_var | 1L, 6L, 10L | TRUE TRUE TRUE | 1 6 10 | 1 6 10 | '1' '6' '10' |
dbl_var | 1, 2.5, 4.5 | TRUE TRUE TRUE | 1 2 4 | 1.0 2.5 4.5 | '1' '2.5' '4.5' |
chr_var | 'these are', 'some strings' | NA NA | NA NA (NA_integer_) | NA NA (NA_real_) | 'these are' 'some strings' |
[1] Source: https://adv-r.hadley.nz/index.html
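Automatic and explicit coercion can be sketched in base R as follows. The inputs are illustrative; the warning from the failed conversion is suppressed only to keep the example quiet.

```r
# Automatic coercion when combining: the most flexible type wins
t_chr <- typeof(c("a", 1))     # character beats double
t_dbl <- typeof(c(1.5, 2L))    # double beats integer
t_int <- typeof(c(1L, TRUE))   # integer beats logical

# Automatic coercion in arithmetic: TRUE is 1, FALSE is 0
n_true <- sum(c(TRUE, FALSE, TRUE))

# Explicit coercion truncates doubles toward zero
trunc_int <- as.integer(c(1, 2.5, 4.5))

# Explicit coercion can fail: unparseable strings become NA, with a warning
failed <- suppressWarnings(as.integer(c("1", "a")))

list(t_chr, t_dbl, t_int, n_true, trunc_int, failed)
```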
But note that coercion may fail in one of two ways, or both:
Why is 1 == "1" true? Why is -1 < FALSE true? Why is "one" < 2 false?
These comparisons are carried out by operator-functions (==, <), which coerce their arguments to a common type. In the examples above, these types will be character, double and character: 1 will be coerced to “1”, FALSE is represented as 0, and 2 turns into “2”. Numbers precede letters in lexicographic order, although this may depend on the locale.
When combining NAs with other atomic types, the NAs will be coerced to integer (NA_integer_), double (NA_real_) or character (NA_character_) and not the other way round. If NA were a character and added to a set of other values, all of these would be coerced to character as well.
What do is.atomic(), is.numeric(), and is.vector() test for?
is.atomic() tests if an object is an atomic vector. Before R 4.4.0 it also returned TRUE for NULL (!). Atomic vectors are objects of type logical, integer, double, complex, character or raw.
is.numeric() tests if an object has type integer or double and is not of class factor, Date, POSIXt or difftime.
is.vector() tests if an object is a vector or an expression and has no attributes, apart from names. Vectors are atomic vectors or lists.
Attributes are name-value pairs that attach metadata to an object (vector).
Three functions:
attr()
attributes()
structure()
Use:
attr(): get or set a single attribute
structure(): set multiple attributes
attributes(): get multiple attributes
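The three functions can be tried out on a small vector. This is a minimal sketch; the attribute names `x` and `y` are arbitrary.

```r
a <- 1:3

# attr(): set and get one attribute at a time
attr(a, "x") <- "abcdef"
attr(a, "y") <- 4:6
one_attr <- attr(a, "x")

# attributes(): get everything as a named list
all_attrs <- attributes(a)

# structure(): set several attributes in one call
b <- structure(1:3, x = "abcdef", y = 4:6)
same <- identical(a, b)

# Most operations drop custom attributes
after_subset <- attributes(a[1])
after_sum <- attributes(sum(a))

str(all_attrs)
```

Both `after_subset` and `after_sum` come back as NULL, which is the attribute loss that S3 classes are designed to work around.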
Three particularly important attributes:
Most attributes are lost by most operations. Only two attributes are routinely preserved: names and dimension.
Four ways to name:
Remove names with x <- unname(x) or names(x) <- NULL
haven::labelled()
matrix()
and array()
dim()
Vector | Matrix | Array |
---|---|---|
names() | rownames(), colnames() | dimnames() |
length() | nrow(), ncol() | dim() |
c() | rbind(), cbind() | abind::abind() |
- | t() | aperm() |
is.null(dim(x)) | is.matrix() | is.array() |
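A base R sketch of how the dim attribute turns a plain vector into a matrix or array. The object names are illustrative.

```r
v <- 1:6

# A plain vector has NULL dimensions, so nrow() is NULL as well
no_dim <- is.null(dim(v))
no_nrow <- is.null(nrow(v))

# NROW() and NCOL() treat a plain vector as a one-column matrix
v_NROW <- NROW(v)
v_NCOL <- NCOL(v)

# Create a matrix directly, or by assigning dim() to a vector
m <- matrix(1:6, nrow = 2, ncol = 3)
x <- 1:6
dim(x) <- c(2, 3)
same_matrix <- identical(x, m)

# Arrays generalise matrices to more dimensions
arr <- array(1:24, dim = c(2, 3, 4))
arr_dim <- dim(arr)

# t() swaps the two dimensions of a matrix
t_dim <- dim(t(m))
```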
A vector without a dim attribute set has NULL dimensions, not 1.
How is setNames() implemented? Read the source code.
How is unname() implemented? Read the source code.
What does dim() return when applied to a 1-dimensional vector? When might you use NROW() or NCOL()?
[...] 1:5?
[...] structure():
Why don’t you see the comment attribute on print? Is the attribute missing, or is there something else special about it?
The documentation states (see ?comment):
Contrary to other attributes, the comment is not printed (by print or print.default).
Also, from ?attributes:
Note that some attributes (namely class, comment, dim, dimnames, names, row.names and tsp) are treated specially and have restrictions on the values which can be set.
Retrieve comment attributes with attr():
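A minimal sketch of the behaviour, assuming a call along the lines of `structure(1:5, comment = "my attribute")`; the exact code from the exercise is not shown here.

```r
x <- structure(1:5, comment = "my attribute")

# Printing hides the comment attribute
printed <- capture.output(print(x))
printed

# ...but the attribute is still there
via_attr <- attr(x, "comment")
via_comment <- comment(x)      # dedicated accessor in base R
attr_names <- names(attributes(x))
```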
Credit: Advanced R by Hadley Wickham
Having a class attribute turns an object into an S3 object.
What makes S3 atomic vectors different?
A factor is a vector used to store categorical data that can contain only predefined values.
Factors are integer vectors with:
Factors can be ordered. This can be useful for models or visualizations where order matters.
values <- c('high', 'med', 'low', 'med', 'high', 'low', 'med', 'high')
ordered_factor <- ordered(
x = values,
levels = c('low', 'med', 'high') # in order
)
ordered_factor
#> [1] high med low med high low med high
#> Levels: low < med < high
table(values)
#> values
#> high low med
#> 3 2 3
table(ordered_factor)
#> ordered_factor
#> low med high
#> 2 3 3
Dates are:
The double component represents the number of days since the Unix epoch, 1970-01-01.
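The day count can be inspected by removing the class. A small sketch with an arbitrary date:

```r
d <- as.Date("1970-01-10")

# A Date is a double with a class attribute
d_type <- typeof(d)
d_class <- class(d)

# Stripping the class reveals the number of days since 1970-01-01
d_days <- unclass(d)

# Date arithmetic is arithmetic on that day count
next_week <- d + 7
```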
There are 2 Date-time representations in base R:
We’ll focus on POSIXct because:
Let’s now build and deconstruct a Date-time
# Build
note_date_time <- as.POSIXct(
x = Sys.time(), # time
tz = "America/New_York" # time zone, used only for formatting
)
# Inspect
note_date_time
#> [1] "2025-09-03 07:11:21 EDT"
#> [1] "double"
#> $class
#> [1] "POSIXct" "POSIXt"
#>
#> $tzone
#> [1] "America/New_York"
#> [1] "2025-09-03 13:11:21 CEST"
Durations represent the amount of time between pairs of dates or date-times.
#> Time difference of 1 mins
#> [1] "double"
#> $class
#> [1] "difftime"
#>
#> $units
#> [1] "mins"
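Output of this shape can be produced with difftime(). The two timestamps below are hypothetical, chosen to be exactly one minute apart.

```r
start <- as.POSIXct("2025-01-01 12:00:00", tz = "UTC")
end   <- as.POSIXct("2025-01-01 12:01:00", tz = "UTC")

# Duration between two date-times, reported in minutes
gap <- difftime(end, start, units = "mins")

# A difftime is a double with class and units attributes
gap_type <- typeof(gap)
gap_attrs <- attributes(gap)

# The same duration expressed in seconds
gap_secs <- as.double(gap, units = "secs")

# Durations can also be built directly
one_min <- as.difftime(1, units = "mins")
```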
See also:
What sort of object does table() return? What is its type? What attributes does it have? How does the dimensionality change as you tabulate more variables?
table() returns a contingency table of its input variables. It is implemented as an integer vector with class table and dimensions (which makes it act like an array). Its attributes are dim (dimensions) and dimnames (one name for each input column). The dimensions correspond to the number of unique values (factor levels) in each input variable.
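These properties can be checked on the built-in mtcars data. A brief sketch:

```r
# One variable: a 1-dimensional table
t1 <- table(mtcars$cyl)
t1_type <- typeof(t1)
t1_class <- class(t1)
t1_attrs <- names(attributes(t1))
t1_dim <- dim(t1)

# Two variables: one dimension per variable,
# sized by the number of unique values in each
t2 <- table(mtcars$cyl, mtcars$gear)
t2_dim <- dim(t2)

# Three variables: a 3-dimensional array
t3 <- table(mtcars$cyl, mtcars$gear, mtcars$am)
t3_ndim <- length(dim(t3))
```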
The underlying integer values stay the same, but the levels are changed, making it look like the data has changed.
f1 <- factor(letters)
f1
#> [1] a b c d e f g h i j k l m n o p q r s t u v w x y z
#> Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z
as.integer(f1)
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#> [26] 26
levels(f1) <- rev(levels(f1))
f1
#> [1] z y x w v u t s r q p o n m l k j i h g f e d c b a
#> Levels: z y x w v u t s r q p o n m l k j i h g f e d c b a
as.integer(f1)
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
#> [26] 26
How do f2 and f3 differ from f1?
For f2 and f3 either the order of the factor elements or its levels are being reversed. For f1 both transformations are occurring.
# Reverse element order
(f2 <- rev(factor(letters)))
#> [1] z y x w v u t s r q p o n m l k j i h g f e d c b a
#> Levels: a b c d e f g h i j k l m n o p q r s t u v w x y z
as.integer(f2)
#> [1] 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2
#> [26] 1
# Reverse factor levels (when creating factor)
(f3 <- factor(letters, levels = rev(letters)))
#> [1] a b c d e f g h i j k l m n o p q r s t u v w x y z
#> Levels: z y x w v u t s r q p o n m l k j i h g f e d c b a
as.integer(f3)
#> [1] 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2
#> [26] 1
Simple lists:
# Construct
simple_list <- list(
c(TRUE, FALSE), # logicals
1:20, # integers
c(1.2, 2.3, 3.4), # doubles
c("primo", "secundo", "tercio") # characters
)
simple_list
#> [[1]]
#> [1] TRUE FALSE
#>
#> [[2]]
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
#>
#> [[3]]
#> [1] 1.2 2.3 3.4
#>
#> [[4]]
#> [1] "primo" "secundo" "tercio"
#> [1] "list"
#> List of 4
#> $ : logi [1:2] TRUE FALSE
#> $ : int [1:20] 1 2 3 4 5 6 7 8 9 10 ...
#> $ : num [1:3] 1.2 2.3 3.4
#> $ : chr [1:3] "primo" "secundo" "tercio"
#> [[1]]
#> [1] TRUE FALSE
#> [[1]]
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20
#> [[1]]
#> [1] 1.2 2.3 3.4
#> [[1]]
#> [1] "primo" "secundo" "tercio"
#> [1] FALSE
#> [1] 8
#> [1] 2.3
#> [1] "tercio"
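The inspection and subsetting steps can be sketched like this. It assumes the `simple_list` defined above and is one plausible way to obtain output of this shape, not necessarily the original code.

```r
simple_list <- list(
  c(TRUE, FALSE),
  1:20,
  c(1.2, 2.3, 3.4),
  c("primo", "secundo", "tercio")
)

# Inspect
list_type <- typeof(simple_list)
str(simple_list)

# [ returns a smaller list
sub_list <- simple_list[1]

# [[ extracts the element itself, which can then be subset again
second_lgl <- simple_list[[1]][2]
eighth_int <- simple_list[[2]][8]
second_dbl <- simple_list[[3]][2]
third_chr  <- simple_list[[4]][3]
```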
nested_list <- list(
# first level
list(
# second level
list(
# third level
list(1)
)
)
)
str(nested_list)
#> List of 1
#> $ :List of 1
#> ..$ :List of 1
#> .. ..$ :List of 1
#> .. .. ..$ : num 1
Like JSON.
list_comb1 <- list(list(1, 2), list(3, 4)) # with list()
list_comb2 <- c(list(1, 2), list(3, 4)) # with c()
# compare structure
str(list_comb1)
#> List of 2
#> $ :List of 2
#> ..$ : num 1
#> ..$ : num 2
#> $ :List of 2
#> ..$ : num 3
#> ..$ : num 4
str(list_comb2)
#> List of 4
#> $ : num 1
#> $ : num 2
#> $ : num 3
#> $ : num 4
# does this work if they are different data types?
list_comb3 <- c(list(1, 2), list(TRUE, FALSE))
str(list_comb3)
#> List of 4
#> $ : num 1
#> $ : num 2
#> $ : logi TRUE
#> $ : logi FALSE
Check that an object is a list:
is.list()
rlang::is_list()
The two do the same, except that the latter can check for the number of elements.
Use as.list() to coerce to a list.
Although not often used, the dimension attribute can be added to create list-matrices or list-arrays.
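A small base R sketch of testing, coercing, and adding dimensions to a list. The contents are arbitrary.

```r
# Coerce an atomic vector to a list: one element per value
as_l <- as.list(1:3)
is_l <- is.list(as_l)

# Add a dim attribute to get a list-matrix
l <- list(1:3, "a", TRUE, 1.0)
dim(l) <- c(2, 2)

# It is still a list, and now also a matrix
still_list <- is.list(l)
also_matrix <- is.matrix(l)

# Matrix-style indexing with [[ extracts a single cell
cell <- l[[1, 1]]
l
```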
Why do you need to use unlist() to convert a list to an atomic vector? Why doesn’t as.vector() work?
Compare and contrast c() and unlist() when combining a date and date-time into a single vector.
Date and date-time objects are both built upon doubles. While dates store the number of days since the reference date 1970-01-01 (also known as “the Epoch”), date-time objects (POSIXct) store the time difference to this date in seconds.
As the c() generic only dispatches on its first argument, combining date and date-time objects via c() could lead to surprising results in older R versions (pre R 4.0.0):
In the first statement above c.Date() is executed, which incorrectly treats the underlying double of dttm_ct (3600) as days instead of seconds. Conversely, when c.POSIXct() is called on a date, one day is counted as one second only.
We can highlight these mechanics by the following code:
As of R 4.0.0 these issues have been resolved and both methods now convert their input first into POSIXct and Date, respectively.
However, as c() strips the time zone (and other attributes) of POSIXct objects, some caution is still recommended.
A package that deals with these kinds of problems in more depth and provides a structural solution for them is the {vctrs} package, which is also used throughout the tidyverse.
Let’s look at unlist(), which operates on list input.
We see again that dates and date-times are internally stored as doubles. Unfortunately, this is all we are left with, when unlist strips the attributes of the list.
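The mechanics described above can be sketched as follows. The values are assumptions chosen for illustration: a date one day after the epoch, and a date-time one hour after it (the name `dttm_ct` follows the text).

```r
date    <- as.Date("1970-01-02")
dttm_ct <- as.POSIXct("1970-01-01 01:00", tz = "UTC")

# Underlying doubles: days for Date, seconds for POSIXct
date_days <- as.numeric(date)
dttm_secs <- as.numeric(dttm_ct)

# c() dispatches on its first argument, so the result class follows it
first_date <- class(c(date, dttm_ct))
first_dttm <- class(c(dttm_ct, date))

# unlist() drops the classes and leaves only the doubles (names are kept)
flat <- unlist(list(date = date, datetime = dttm_ct))
flat
```

On R 4.0.0 or later, `first_date` should start with "Date" and `first_dttm` with "POSIXct", in line with the text above.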
To summarise: c() coerces types and strips time zones. Errors may have occurred in older R versions because of inappropriate method dispatch/immature methods. unlist() strips attributes.
Credit: Advanced R by Hadley Wickham
A data frame is a:
names
row.names
# Construct
df <- data.frame(
col1 = c(1, 2, 3), # named atomic vector
col2 = c("un", "deux", "trois") # another named atomic vector
# ,stringsAsFactors = FALSE # default since R 4.0.0
)
# Inspect
df
#> col1 col2
#> 1 1 un
#> 2 2 deux
#> 3 3 trois
#> [1] "list"
#> $names
#> [1] "col1" "col2"
#>
#> $class
#> [1] "data.frame"
#>
#> $row.names
#> [1] 1 2 3
#> [1] "1" "2" "3"
#> [1] "col1" "col2"
#> [1] "col1" "col2"
#> [1] 3
#> [1] 2
#> [1] 2
Unlike other lists, the length of each vector must be the same (i.e. as many vector elements as rows in the data frame).
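The inspection steps can be sketched as below, using the same `df` as above. The calls are one plausible way to produce output of this shape.

```r
df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c("un", "deux", "trois")
)

# A data frame is a list underneath, with three attributes
df_type <- typeof(df)
df_attrs <- names(attributes(df))

# Names: colnames() and names() agree, rownames() are character
rn <- rownames(df)
cn <- colnames(df)
nm <- names(df)

# Dimensions: length() counts columns, like ncol()
n_row <- nrow(df)
n_col <- ncol(df)
len <- length(df)

# Columns of different lengths are rejected unless they can be recycled
bad <- try(data.frame(a = 1:3, b = 1:2), silent = TRUE)
```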
Created to relieve some of the frustrations and pain points created by data frames, tibbles are data frames that are:
Tibbles do not:
chr_col <- c("don't", "factor", "me", "bro")
# data frame
df <- data.frame(
a = chr_col,
# before R 4.0.0, this was the default
stringsAsFactors = TRUE
)
# tibble
tbl <- tibble::tibble(
a = chr_col
)
# contrast the structure
str(df$a)
#> Factor w/ 4 levels "bro","don't",..: 2 3 4 1
str(tbl$a)
#> chr [1:4] "don't" "factor" "me" "bro"
# data frame
df <- data.frame(
col1 = c(1, 2, 3, 4),
col2 = c(1, 2)
)
# tibble
tbl <- tibble::tibble(
col1 = c(1, 2, 3, 4),
col2 = c(1, 2)
)
#> Error in `tibble::tibble()`:
#> ! Tibble columns must have compatible sizes.
#> • Size 4: Existing data.
#> • Size 2: Column `col2`.
#> ℹ Only values of size one are recycled.
Tibbles do only what they’re asked and complain if what they’re asked doesn’t make sense:
tibble() allows you to refer to variables created during construction
#> # A tibble: 3 × 2
#> x y
#> <int> <dbl>
#> 1 1 2
#> 2 2 4
#> 3 3 6
rownames()
df3 <- data.frame(
age = c(35, 27, 18),
hair = c("blond", "brown", "black"),
row.names = c("Bob", "Susan", "Sam")
)
df3
#> age hair
#> Bob 35 blond
#> Susan 27 brown
#> Sam 18 black
#> [1] "Bob" "Susan" "Sam"
#> age hair
#> Bob 35 blond
#> [1] "Susan" "Bob" "Sam"
#> age hair
#> Bob 27 brown
There are three reasons why row names are undesirable:
Data frames and tibbles print differently
Two undesirable subsetting behaviours:
When you extract variables with df[, vars], you will get a vector if vars selects one variable, otherwise you’ll get a data frame, unless you always remember to use df[, vars, drop = FALSE].
When you attempt to extract a single column with df$x and there is no column x, a data frame will instead select any variable that starts with x. If no variable starts with x, df$x will return NULL.
Tibbles tweak these behaviours so that a [ always returns a tibble, and a $ doesn’t do partial matching and warns if it can’t find a variable (this is what makes tibbles surly).
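Both data frame behaviours are easy to reproduce in base R. The data frame below is a made-up example with a column named `xyz`.

```r
df <- data.frame(xyz = c("a", "b"), n = 1:2)

# Selecting one column with [ silently drops to a vector
one_col <- df[, "xyz"]

# drop = FALSE keeps the data frame
kept <- df[, "xyz", drop = FALSE]

# Selecting two columns gives a data frame either way
two_cols <- df[, c("xyz", "n")]

# $ does partial matching: there is no column x, but xyz starts with x
partial <- df$x

# No column starts with q, so the result is NULL
missing_col <- df$q
```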
Whether data frame: is.data.frame(). Note: both data frame and tibble are data frames.
Whether tibble: tibble::is_tibble(). Note: only tibbles are tibbles. Vanilla data frames are not.
as.data.frame()
tibble::as_tibble()
List-columns are allowed in data frames but you have to do a little extra work by either adding the list-column after creation or wrapping the list in I()
I()
dfm <- data.frame(
x = 1:3 * 10,
y = I(matrix(1:9, nrow = 3))
)
dfm$z <- data.frame(a = 3:1, b = letters[1:3], stringsAsFactors = FALSE)
str(dfm)
#> 'data.frame': 3 obs. of 3 variables:
#> $ x: num 10 20 30
#> $ y: 'AsIs' int [1:3, 1:3] 1 2 3 4 5 6 7 8 9
#> $ z:'data.frame': 3 obs. of 2 variables:
#> ..$ a: int 3 2 1
#> ..$ b: chr "a" "b" "c"
#> [,1] [,2] [,3]
#> [1,] 1 4 7
#> [2,] 2 5 8
#> [3,] 3 6 9
#> a b
#> 1 3 a
#> 2 2 b
#> 3 1 c
Yes, you can create these data frames easily; either during creation or via subsetting. Even both dimensions can be zero. Create a 0-row, 0-column, or an empty data frame directly:
Create similar data frames via subsetting the respective dimension with either 0, NULL, FALSE or a valid 0-length atomic (logical(0), character(0), integer(0), double(0)). Negative integer sequences would also work. The following example uses a zero:
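Both routes can be sketched in base R. The column names `a` and `b` are arbitrary, and mtcars stands in for any existing data frame.

```r
# Direct creation
df_0rows <- data.frame(a = integer(), b = logical())  # 0 rows, 2 columns
df_0cols <- data.frame(row.names = 1:3)               # 3 rows, 0 columns
df_empty <- data.frame()                              # 0 rows, 0 columns

# Via subsetting with a zero
sub_0rows <- mtcars[0, ]
sub_0cols <- mtcars[, 0]
sub_empty <- mtcars[0, 0]

dims <- list(
  dim(df_0rows), dim(df_0cols), dim(df_empty),
  dim(sub_0rows), dim(sub_0cols), dim(sub_empty)
)
dims
```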
Matrices can have duplicated row names, so this does not cause problems.
Data frames, however, require unique rownames and you get different results depending on how you attempt to set them. If you set them directly or via row.names(), you get an error:
data.frame(row.names = c("x", "y", "y"))
#> Error in data.frame(row.names = c("x", "y", "y")): duplicate row.names: y
df <- data.frame(x = 1:3)
row.names(df) <- c("x", "y", "y")
#> Warning: non-unique value when setting 'row.names': 'y'
#> Error in `.rowNamesDF<-`(x, value = value): duplicate 'row.names' are not allowed
If you use subsetting, [ automatically deduplicates:
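A minimal sketch of the deduplication, using a hypothetical three-row data frame and repeating its first row:

```r
df <- data.frame(x = 1:3, row.names = c("x", "y", "z"))

# Repeating a row would duplicate its name, so [ makes the names unique
dup <- df[c(1, 1, 1), , drop = FALSE]
dup_names <- rownames(dup)
dup
```

The suffixes `.1` and `.2` follow the same scheme as `make.unique()`.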
If df is a data frame, what can you say about t(df), and t(t(df))? Perform some experiments, making sure to try different column types.
Both of t(df) and t(t(df)) will return matrices:
The dimensions will respect the typical transposition rules:
Because the output is a matrix, every column is coerced to the same type. (It is implemented within t.data.frame() via as.matrix(), which is described below.)
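These claims can be checked with a small mixed-type data frame. The columns are made up for the experiment.

```r
df <- data.frame(x = 1:3, y = letters[1:3])

# Both results are matrices
is_mat_t  <- is.matrix(t(df))
is_mat_tt <- is.matrix(t(t(df)))

# Dimensions follow the usual transposition rules
dim_df <- dim(df)
dim_t  <- dim(t(df))
dim_tt <- dim(t(t(df)))

# A character column forces the whole matrix to character
type_mixed <- typeof(t(df))

# With only integer columns, the matrix stays integer
type_int <- typeof(t(data.frame(a = 1:3, b = 4:6)))
```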
What does as.matrix() do when applied to a data frame with columns of different types? How does it differ from data.matrix()?
The type of the result of as.matrix() depends on the types of the input columns (see ?as.matrix):
The method for data frames will return a character matrix if there is only atomic columns and any non-(numeric/logical/complex) column, applying as.vector to factors and format to other non-character columns. Otherwise the usual coercion hierarchy (logical < integer < double < complex) will be used, e.g. all-logical data frames will be coerced to a logical matrix, mixed logical-integer will give an integer matrix, etc.
On the other hand, data.matrix() will always return a numeric matrix (see ?data.matrix).
Return the matrix obtained by converting all the variables in a data frame to numeric mode and then binding them together as the columns of a matrix. Factors and ordered factors are replaced by their internal codes. […] Character columns are first converted to factors and then to integers.
We can illustrate and compare the mechanics of these functions using a concrete example. as.matrix() makes it possible to retrieve most of the original information from the data frame but leaves us with characters. To retrieve all information from data.matrix()’s output, we would need a lookup table for each column.
df_coltypes <- data.frame(
a = c("a", "b"),
b = c(TRUE, FALSE),
c = c(1L, 0L),
d = c(1.5, 2),
e = factor(c("f1", "f2"))
)
as.matrix(df_coltypes)
#> a b c d e
#> [1,] "a" "TRUE" "1" "1.5" "f1"
#> [2,] "b" "FALSE" "0" "2.0" "f2"
data.matrix(df_coltypes)
#> a b c d e
#> [1,] 1 1 1 1.5 1
#> [2,] 2 0 0 2.0 2
NULL
Special type of object that:
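A few base R calls show how NULL behaves. This is a small illustrative sketch; the list `lst` is made up.

```r
# NULL has its own type and always has length zero
null_type <- typeof(NULL)
null_len <- length(NULL)

# Test for it with is.null()
null_test <- is.null(NULL)

# An empty c() is NULL, and NULL vanishes when combined
empty_c <- c()
combined <- c(NULL, 1:2)

# Assigning NULL removes an element from a list
lst <- list(a = 1, b = 2)
lst$a <- NULL
remaining <- names(lst)
```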
Let us use some of this chapter’s skills on the penguins data.
#> tibble [344 × 17] (S3: tbl_df/tbl/data.frame)
#> $ studyName : chr [1:344] "PAL0708" "PAL0708" "PAL0708" "PAL0708" ...
#> $ Sample Number : num [1:344] 1 2 3 4 5 6 7 8 9 10 ...
#> $ Species : chr [1:344] "Adelie Penguin (Pygoscelis adeliae)" "Adelie Penguin (Pygoscelis adeliae)" "Adelie Penguin (Pygoscelis adeliae)" "Adelie Penguin (Pygoscelis adeliae)" ...
#> $ Region : chr [1:344] "Anvers" "Anvers" "Anvers" "Anvers" ...
#> $ Island : chr [1:344] "Torgersen" "Torgersen" "Torgersen" "Torgersen" ...
#> $ Stage : chr [1:344] "Adult, 1 Egg Stage" "Adult, 1 Egg Stage" "Adult, 1 Egg Stage" "Adult, 1 Egg Stage" ...
#> $ Individual ID : chr [1:344] "N1A1" "N1A2" "N2A1" "N2A2" ...
#> $ Clutch Completion : chr [1:344] "Yes" "Yes" "Yes" "Yes" ...
#> $ Date Egg : Date[1:344], format: "2007-11-11" "2007-11-11" ...
#> $ Culmen Length (mm) : num [1:344] 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42 ...
#> $ Culmen Depth (mm) : num [1:344] 18.7 17.4 18 NA 19.3 20.6 17.8 19.6 18.1 20.2 ...
#> $ Flipper Length (mm): num [1:344] 181 186 195 NA 193 190 181 195 193 190 ...
#> $ Body Mass (g) : num [1:344] 3750 3800 3250 NA 3450 ...
#> $ Sex : chr [1:344] "MALE" "FEMALE" "FEMALE" NA ...
#> $ Delta 15 N (o/oo) : num [1:344] NA 8.95 8.37 NA 8.77 ...
#> $ Delta 13 C (o/oo) : num [1:344] NA -24.7 -25.3 NA -25.3 ...
#> $ Comments : chr [1:344] "Not enough blood for isotopes." NA NA "Adult not sampled." ...
#> - attr(*, "spec")=List of 3
#> ..$ cols :List of 17
#> .. ..$ studyName : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_character" "collector"
#> .. ..$ Sample Number : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Species : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_character" "collector"
#> .. ..$ Region : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_character" "collector"
#> .. ..$ Island : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_character" "collector"
#> .. ..$ Stage : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_character" "collector"
#> .. ..$ Individual ID : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_character" "collector"
#> .. ..$ Clutch Completion : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_character" "collector"
#> .. ..$ Date Egg :List of 1
#> .. .. ..$ format: chr ""
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_date" "collector"
#> .. ..$ Culmen Length (mm) : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Culmen Depth (mm) : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Flipper Length (mm): list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Body Mass (g) : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Sex : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_character" "collector"
#> .. ..$ Delta 15 N (o/oo) : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Delta 13 C (o/oo) : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_double" "collector"
#> .. ..$ Comments : list()
#> .. .. ..- attr(*, "class")= chr [1:2] "collector_character" "collector"
#> ..$ default: list()
#> .. ..- attr(*, "class")= chr [1:2] "collector_guess" "collector"
#> ..$ skip : num 1
#> ..- attr(*, "class")= chr "col_spec"
#> tibble [344 × 17] (S3: tbl_df/tbl/data.frame)
#> $ studyName : chr [1:344] "PAL0708" "PAL0708" "PAL0708" "PAL0708" ...
#> $ Sample Number : num [1:344] 1 2 3 4 5 6 7 8 9 10 ...
#> $ Species : chr [1:344] "Adelie Penguin (Pygoscelis adeliae)" "Adelie Penguin (Pygoscelis adeliae)" "Adelie Penguin (Pygoscelis adeliae)" "Adelie Penguin (Pygoscelis adeliae)" ...
#> $ Region : chr [1:344] "Anvers" "Anvers" "Anvers" "Anvers" ...
#> $ Island : chr [1:344] "Torgersen" "Torgersen" "Torgersen" "Torgersen" ...
#> $ Stage : chr [1:344] "Adult, 1 Egg Stage" "Adult, 1 Egg Stage" "Adult, 1 Egg Stage" "Adult, 1 Egg Stage" ...
#> $ Individual ID : chr [1:344] "N1A1" "N1A2" "N2A1" "N2A2" ...
#> $ Clutch Completion : chr [1:344] "Yes" "Yes" "Yes" "Yes" ...
#> $ Date Egg : Date[1:344], format: "2007-11-11" "2007-11-11" ...
#> $ Culmen Length (mm) : num [1:344] 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42 ...
#> $ Culmen Depth (mm) : num [1:344] 18.7 17.4 18 NA 19.3 20.6 17.8 19.6 18.1 20.2 ...
#> $ Flipper Length (mm): num [1:344] 181 186 195 NA 193 190 181 195 193 190 ...
#> $ Body Mass (g) : num [1:344] 3750 3800 3250 NA 3450 ...
#> $ Sex : chr [1:344] "MALE" "FEMALE" "FEMALE" NA ...
#> $ Delta 15 N (o/oo) : num [1:344] NA 8.95 8.37 NA 8.77 ...
#> $ Delta 13 C (o/oo) : num [1:344] NA -24.7 -25.3 NA -25.3 ...
#> $ Comments : chr [1:344] "Not enough blood for isotopes." NA NA "Adult not sampled." ...
#> species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
#> 1 Adelie Torgersen 39.1 18.7 181 3750
#> 2 Adelie Torgersen 39.5 17.4 186 3800
#> 3 Adelie Torgersen 40.3 18.0 195 3250
#> 4 Adelie Torgersen NA NA NA NA
#> 5 Adelie Torgersen 36.7 19.3 193 3450
#> 6 Adelie Torgersen 39.3 20.6 190 3650
#> sex year
#> 1 male 2007
#> 2 female 2007
#> 3 female 2007
#> 4 <NA> 2007
#> 5 female 2007
#> 6 male 2007
#> # A tibble: 344 × 8
#> species island bill_length_mm bill_depth_mm flipper_length_mm body_mass_g
#> <fct> <fct> <dbl> <dbl> <int> <int>
#> 1 Adelie Torgersen 39.1 18.7 181 3750
#> 2 Adelie Torgersen 39.5 17.4 186 3800
#> 3 Adelie Torgersen 40.3 18 195 3250
#> 4 Adelie Torgersen NA NA NA NA
#> 5 Adelie Torgersen 36.7 19.3 193 3450
#> 6 Adelie Torgersen 39.3 20.6 190 3650
#> 7 Adelie Torgersen 38.9 17.8 181 3625
#> 8 Adelie Torgersen 39.2 19.6 195 4675
#> 9 Adelie Torgersen 34.1 18.1 193 3475
#> 10 Adelie Torgersen 42 20.2 190 4250
#> # ℹ 334 more rows
#> # ℹ 2 more variables: sex <fct>, year <int>
species_vector_df <- penguins_df |> select(species)
species_unlist_df <- penguins_df |> select(species) |> unlist()
species_pull_df <- penguins_df |> select(species) |> pull()
species_vector_tb <- penguins_tb |> select(species)
species_unlist_tb <- penguins_tb |> select(species) |> unlist()
species_pull_tb <- penguins_tb |> select(species) |> pull()
typeof() and class()
#> [1] "list"
#> [1] "data.frame"
#> [1] "integer"
#> [1] "factor"
#> [1] "integer"
#> [1] "factor"
#> [1] "list"
#> [1] "tbl_df" "tbl" "data.frame"
#> [1] "integer"
#> [1] "factor"
#> [1] "integer"
#> [1] "factor"
#> [1] "species" "island" "bill_length_mm"
#> [4] "bill_depth_mm" "flipper_length_mm" "body_mass_g"
#> [7] "sex" "year"
tibbles are surly!
Get and set individual attributes with attr(x, "y") and attr(x, "y") <- value; or you can get and set all attributes at once with attributes().
Matrix and data frame columns can be added after creation with df$x <- matrix(), or by using I() when creating a new data frame: data.frame(x = I(matrix())).