3.3 Atomic Vectors

3.3.1 Types of atomic vectors

Image Credit: Advanced R
  • Logical: TRUE/FALSE
  • Integer: Numeric (discrete, no decimals)
  • Double: Numeric (continuous, decimals)
  • Character: String

3.3.2 Vectors of Length One

Scalars are vectors that consist of a single value.

3.3.2.1 Logicals

lgl1 <- TRUE
lgl2 <- T #abbreviation for TRUE
lgl3 <- FALSE
lgl4 <- F #abbreviation for FALSE

3.3.2.2 Doubles

# integer, decimal, scientific, or hexadecimal format
dbl1 <- 1
dbl2 <- 1.234 # decimal
dbl3 <- 1.234e0 # scientific format
dbl4 <- 0xcafe # hexadecimal format

3.3.2.3 Integers

Integers are written like doubles but must be followed by L, and they cannot have fractional values.

int1 <- 1L
int2 <- 1234L
int3 <- 1234e0L
int4 <- 0xcafeL
Pop Quiz: Why “L” for integers? Wickham notes that the use of L dates back to the C programming language and its “long int” type for memory allocation.
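A minimal sketch of how the L suffix changes the storage type. The variable names below are illustrative and do not come from the original notes.

```r
# Without the suffix, a numeric literal is stored as a double;
# with the suffix, it is stored as an integer.
plain_literal <- 1
suffixed_literal <- 1L

typeof(plain_literal)
#> [1] "double"
typeof(suffixed_literal)
#> [1] "integer"

# Hexadecimal literals accept the suffix too: 0xcafe is 51966 in decimal.
hex_int <- 0xcafeL
identical(hex_int, 51966L)
#> [1] TRUE
```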

3.3.2.4 Strings

Strings can use single or double quotes, and special characters are escaped with a backslash (\).

str1 <- "hello" # double quotes
str2 <- 'hello' # single quotes
str3 <- "مرحبًا" # Unicode
str4 <- "\U0001f605" # sweaty_smile
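As a small illustration of backslash escaping, the sketch below embeds a double quote and a newline in strings. The variable name quoted is made up for this example.

```r
# A double quote inside a double-quoted string must be escaped.
quoted <- "She said \"hello\""

# print() shows the escapes; writeLines() shows the text itself.
print(quoted)
#> [1] "She said \"hello\""
writeLines(quoted)
#> She said "hello"

# An escape sequence such as \n counts as a single character.
nchar("a\nb")
#> [1] 3
```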

3.3.3 Longer Vectors

There are several ways to make longer vectors:

1. With single values inside c() for combine.

lgl_var <- c(TRUE, FALSE)
int_var <- c(1L, 6L, 10L)
dbl_var <- c(1, 2.5, 4.5)
chr_var <- c("these are", "some strings")
Image Credit: Advanced R

2. With other vectors

c(c(1, 2), c(3, 4)) # output is not nested
#> [1] 1 2 3 4
Side Quest: rlang

{rlang} has vector constructor functions too:

  • rlang::lgl(...)
  • rlang::int(...)
  • rlang::dbl(...)
  • rlang::chr(...)

They appear to do both more and less than c().

  • More:
    • Enforce type
    • Splice lists
    • More types: rlang::bytes(), rlang::cpl(...)
  • Less:
    • Stricter rules on names

Note: these constructors currently carry a questioning lifecycle badge, since they may be moved to {vctrs}.

3.3.4 Type and Length

We can determine the type of a vector with typeof() and its length with length().

Types of Atomic Vectors (1)

name     value                        typeof()   length()
lgl_var  TRUE, FALSE                  logical    2
int_var  1L, 6L, 10L                  integer    3
dbl_var  1, 2.5, 4.5                  double     3
chr_var  'these are', 'some strings'  character  2

(1) Source: https://adv-r.hadley.nz/index.html

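The table can be reproduced with a short sketch like the one below. The helper names vectors, types, and lengths_out are introduced here for illustration only.

```r
# The four example vectors from the section on longer vectors.
lgl_var <- c(TRUE, FALSE)
int_var <- c(1L, 6L, 10L)
dbl_var <- c(1, 2.5, 4.5)
chr_var <- c("these are", "some strings")

# Collect them in a named list so each can be queried in one pass.
vectors <- list(lgl_var = lgl_var, int_var = int_var,
                dbl_var = dbl_var, chr_var = chr_var)

# typeof() returns one string per vector; length() returns one integer.
types <- vapply(vectors, typeof, character(1))
lengths_out <- vapply(vectors, length, integer(1))

# One row per vector, mirroring the table columns.
data.frame(typeof = types, length = lengths_out)
```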
Side Quest: Penguins
typeof(penguins$species)
#> [1] "integer"
class(penguins$species)
#> [1] "factor"

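# species_unlist and species_pull are not defined in these notes; presumably they
# hold the species column extracted with unlist() and dplyr::pull(), respectively
typeof(species_unlist)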
#> [1] "integer"
class(species_unlist)
#> [1] "factor"

typeof(species_pull)
#> [1] "integer"
class(species_pull)
#> [1] "factor"

3.3.5 Missing values

3.3.5.1 Contagion

For most computations, an operation over values that includes a missing value yields a missing value (unless you’re careful).

# contagion
5*NA
#> [1] NA
sum(c(1, 2, NA, 3))
#> [1] NA

3.3.5.2 Exceptions

NA ^ 0
#> [1] 1
NA | TRUE
#> [1] TRUE
NA & FALSE
#> [1] FALSE
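These exceptions arise when the result would be the same for every possible value the NA could stand for. The sketch below checks that reasoning with a few arbitrary stand-in values; the vector candidates is hypothetical.

```r
# Any number raised to the power 0 is 1, so the unknown value is irrelevant.
candidates <- c(-2, 0, 3.5)
candidates ^ 0
#> [1] 1 1 1

# x | TRUE is TRUE and x & FALSE is FALSE for either logical value of x.
c(TRUE, FALSE) | TRUE
#> [1] TRUE TRUE
c(TRUE, FALSE) & FALSE
#> [1] FALSE FALSE

# When the answer does depend on the missing value, NA still propagates.
NA & TRUE
#> [1] NA
NA | FALSE
#> [1] NA
```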

3.3.5.3 Inoculation

sum(c(1, 2, NA, 3), na.rm = TRUE)
#> [1] 6

To search for missing values, use is.na().

x <- c(NA, 5, NA, 10)
x == NA
#> [1] NA NA NA NA   (Batman!)
is.na(x)
#> [1]  TRUE FALSE  TRUE FALSE
Side Quest: NA Types

Each type has its own NA type

  • Logical: NA
  • Integer: NA_integer_
  • Double: NA_real_
  • Character: NA_character_

This may not matter in many contexts.

But this does matter for operations where types matter like dplyr::if_else().
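A brief base R sketch of the typed missing values follows. It deliberately avoids calling dplyr, so it shows only the types themselves and how c() treats them; the name na_types is illustrative.

```r
# Each typed NA reports its own storage type.
na_types <- c(
  typeof(NA),
  typeof(NA_integer_),
  typeof(NA_real_),
  typeof(NA_character_)
)
na_types
#> [1] "logical"   "integer"   "double"    "character"

# In c(), the plain logical NA adapts to its neighbours ...
typeof(c(NA, 1L))
#> [1] "integer"

# ... whereas a typed NA can force coercion of everything else.
typeof(c(NA_character_, 1L))
#> [1] "character"
```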

3.3.6 Testing

What type of vector is.*() it?

Test data type:

  • Logical: is.logical()
  • Integer: is.integer()
  • Double: is.double()
  • Character: is.character()

What type of object is it?

Don’t test objects with these tools:

  • is.vector()
  • is.atomic()
  • is.numeric()

They don’t test whether you have a vector, an atomic vector, or a numeric vector; you’ll need to read the documentation carefully to figure out what they actually do (preview: attributes).
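The sketch below shows two of the surprises, using made-up objects (plain, x, and f). It is an illustration of the documented behaviour, not an exhaustive account.

```r
# is.vector() is TRUE only when there are no attributes other than names.
plain <- c(a = 1, b = 2)
is.vector(plain)
#> [1] TRUE

x <- plain
attr(x, "units") <- "cm"   # any extra attribute flips the result
is.vector(x)
#> [1] FALSE

# A factor is stored as an integer vector, yet is.numeric() says FALSE,
# while is.atomic() says TRUE.
f <- factor(c("a", "b"))
typeof(f)
#> [1] "integer"
is.numeric(f)
#> [1] FALSE
is.atomic(f)
#> [1] TRUE
```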

Side Quest: rlang

Instead, maybe, use {rlang}

  • rlang::is_vector
  • rlang::is_atomic
# vector
rlang::is_vector(c(1, 2))
#> [1] TRUE
rlang::is_vector(list(1, 2))
#> [1] TRUE

# atomic
rlang::is_atomic(c(1, 2))
#> [1] TRUE
rlang::is_atomic(list(1, "a"))
#> [1] FALSE

3.3.7 Coercion

  • When types are mixed, R coerces to the most flexible type, in a fixed order: character → double → integer → logical

  • R can coerce either automatically or explicitly

3.3.7.1 Automatic

Two contexts for automatic coercion:

  1. Combination
  2. Mathematical
3.3.7.1.1 Coercion by Combination:
str(c(TRUE, "TRUE"))
#>  chr [1:2] "TRUE" "TRUE"
3.3.7.1.2 Coercion by Mathematical operations:
# imagine a logical vector about whether an attribute is present
has_attribute <- c(TRUE, FALSE, TRUE, TRUE)

# number with attribute
sum(has_attribute)
#> [1] 3
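The same coercion makes other arithmetic on logical vectors meaningful. A short sketch, reusing the has_attribute example:

```r
has_attribute <- c(TRUE, FALSE, TRUE, TRUE)

# TRUE is treated as 1 and FALSE as 0, so the mean is the proportion of TRUEs.
mean(has_attribute)
#> [1] 0.75

# The underlying integer values, made explicit.
as.integer(has_attribute)
#> [1] 1 0 1 1

# Arithmetic with an integer yields an integer vector.
typeof(has_attribute + 0L)
#> [1] "integer"
```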

3.3.7.2 Explicit

Coercion of Atomic Vectors (1)

name     value                        as.logical()    as.integer()             as.double()        as.character()
lgl_var  TRUE, FALSE                  TRUE FALSE      1 0                      1 0                'TRUE' 'FALSE'
int_var  1L, 6L, 10L                  TRUE TRUE TRUE  1 6 10                   1 6 10             '1' '6' '10'
dbl_var  1, 2.5, 4.5                  TRUE TRUE TRUE  1 2 4                    1.0 2.5 4.5        '1' '2.5' '4.5'
chr_var  'these are', 'some strings'  NA NA           NA_integer_ NA_integer_  NA_real_ NA_real_  'these are', 'some strings'

(1) Source: https://adv-r.hadley.nz/index.html

But note that coercion may fail in one of two ways, or both:

  • With warning/error
  • NAs
as.integer(c(1, 2, "three"))
#> Warning: NAs introduced by coercion
#> [1]  1  2 NA
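One possible way to find out which elements failed to convert is to compare missingness before and after coercion. This is a sketch under the assumption that the input arrives as strings; raw_input, converted, and failed are hypothetical names.

```r
raw_input <- c("1", "2", "three")

# Silence the warning here because the failures are inspected explicitly below.
converted <- suppressWarnings(as.integer(raw_input))
converted
#> [1]  1  2 NA

# An element failed if it is NA after coercion but was not NA before.
failed <- is.na(converted) & !is.na(raw_input)
raw_input[failed]
#> [1] "three"
```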

3.3.8 Exercises

  1. How do you create raw and complex scalars?
Answer(s)
as.raw(42)
#> [1] 2a
charToRaw("A")
#> [1] 41
complex(length.out = 1, real = 1, imaginary = 1)
#> [1] 1+1i
  2. Test your knowledge of the vector coercion rules by predicting the output of the following uses of c():
c(1, FALSE)
c("a", 1)
c(TRUE, 1L)
Answer(s)
c(1, FALSE)      # will be coerced to double    -> 1 0
c("a", 1)        # will be coerced to character -> "a" "1"
c(TRUE, 1L)      # will be coerced to integer   -> 1 1
  3. Why is 1 == "1" true? Why is -1 < FALSE true? Why is "one" < 2 false?
Answer(s)

These comparisons are carried out by operator functions (==, <), which coerce their arguments to a common type. In the examples above, these types are character, double, and character: 1 is coerced to "1", FALSE is represented as 0, and 2 turns into "2". Numbers precede letters in lexicographic order, although this may depend on the locale.
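Each comparison can be unpacked by performing the coercion by hand, as in the sketch below. The last result relies on string collation, so it may depend on the locale.

```r
# 1 == "1": the number is coerced to the string "1".
as.character(1)
#> [1] "1"
1 == "1"
#> [1] TRUE

# -1 < FALSE: the logical is coerced to the double 0.
as.double(FALSE)
#> [1] 0
-1 < FALSE
#> [1] TRUE

# "one" < 2: the number is coerced to "2", then the strings are compared.
# Digits usually sort before letters, so this is typically FALSE.
"one" < as.character(2)
```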

  4. Why is the default missing value, NA, a logical vector? What’s special about logical vectors?
Answer(s)

The presence of missing values shouldn’t affect the type of an object. Recall that there is a type-hierarchy for coercion from character → double → integer → logical. When combining NAs with other atomic types, the NAs will be coerced to integer (NA_integer_), double (NA_real_) or character (NA_character_) and not the other way round. If NA were a character and added to a set of other values all of these would be coerced to character as well.
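A quick check of this argument, as a sketch: combining the logical NA with other values leaves their type alone, while a character NA does not.

```r
# The logical NA sits at the bottom of the hierarchy, so it adapts.
typeof(c(NA, 1L))
#> [1] "integer"
typeof(c(NA, 1.5))
#> [1] "double"
typeof(c(NA, "a"))
#> [1] "character"

# A character NA would instead drag the other values up to character.
c(NA_character_, 1L)
#> [1] NA  "1"
```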
  5. Precisely what do is.atomic(), is.numeric(), and is.vector() test for?
Answer(s)

The documentation states that:

  • is.atomic() tests if an object is an atomic vector (as defined in Advanced R). In R versions before 4.4.0 it also returned TRUE for NULL (!); since R 4.4.0, is.atomic(NULL) is FALSE.
  • is.numeric() tests if an object has type integer or double and is not of class factor, Date, POSIXt or difftime.
  • is.vector() tests if an object is a vector (as defined in Advanced R) or an expression and has no attributes, apart from names.
Atomic vectors are defined in Advanced R as objects of type logical, integer, double, complex, character or raw. Vectors are defined as atomic vectors or lists.