Control flow

Learning objectives

  • Understand the two primary tools for control flow: choices and loops
  • Learn best practices to avoid common pitfalls
  • Distinguish when to use if, ifelse(), and switch() for choices
  • Distinguish when to use for, while, and repeat for loops

Note

Basic familiarity with choices and loops is assumed.

Choices

if is the basic statement for a choice

Single line format

if (condition) true_action
if (condition) true_action else false_action

Compound statement within {

grade <- function(x) {
  if (x > 90) {
    "A"
  } else if (x > 80) {
    "B"
  } else if (x > 50) {
    "C"
  } else {
    "F"
  }
}

Results of if can be assigned

x1 <- if (TRUE) 1 else 2
x2 <- if (FALSE) 1 else 2

c(x1, x2)
#> [1] 1 2

Tip

Only recommended with single line format; otherwise hard to read.

if without else can be combined with c() or paste() to create compact expressions

  • if without else invisibly returns NULL when FALSE.
greet <- function(name, birthday = FALSE) {
  paste0(
    "Hi ", name,
    if (birthday) " and HAPPY BIRTHDAY"
  )
}
greet("Maria", FALSE)
#> [1] "Hi Maria"
greet("Jaime", TRUE)
#> [1] "Hi Jaime and HAPPY BIRTHDAY"
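
The same pattern works with c(), because c() drops NULL elements. A minimal sketch, where build_args() and its verbose flag are hypothetical names used only for illustration:

```r
# Hypothetical helper: assemble command-line style flags.
build_args <- function(file, verbose = FALSE) {
  c(
    "--input", file,
    if (verbose) "--verbose"  # NULL when verbose is FALSE, so c() drops it
  )
}

build_args("data.csv")
#> [1] "--input"  "data.csv"
build_args("data.csv", verbose = TRUE)
#> [1] "--input"   "data.csv"  "--verbose"
```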

if should have a single TRUE or FALSE condition; most other inputs generate errors

if ("x") 1
#> Error in if ("x") 1: argument is not interpretable as logical
if (logical()) 1
#> Error in if (logical()) 1: argument is of length zero
if (NA) 1
#> Error in if (NA) 1: missing value where TRUE/FALSE needed
if (c(TRUE, FALSE)) 1
#> Error in if (c(TRUE, FALSE)) 1: the condition has length > 1

Use ifelse() for vectorized conditions

x <- 1:10
ifelse(x %% 5 == 0, "XXX", as.character(x))
#>  [1] "1"   "2"   "3"   "4"   "XXX" "6"   "7"   "8"   "9"   "XXX"
ifelse(x %% 2 == 0, "even", "odd")
#>  [1] "odd"  "even" "odd"  "even" "odd"  "even" "odd"  "even" "odd"  "even"

Tip

Only use ifelse() if both results are of the same type; otherwise output type is hard to predict.
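
To see why mixed types are risky, the sketch below checks the output of ifelse() when yes is an integer and no is a character string. The result type depends on which branches are actually selected, so the same call can return different types for different inputs.

```r
# Scalar tests: the type follows whichever branch is selected.
typeof(ifelse(TRUE, 1L, "no"))
#> [1] "integer"
typeof(ifelse(FALSE, 1L, "no"))
#> [1] "character"

# Mixed test: both branches contribute, so the result is coerced to character.
ifelse(c(TRUE, FALSE), 1L, "no")
#> [1] "1"  "no"
```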

Use dplyr::case_when() for multiple condition-vector pairs

dplyr::case_when(
  x %% 35 == 0 ~ "fizz buzz",
  x %% 5 == 0 ~ "fizz",
  x %% 7 == 0 ~ "buzz",
  is.na(x) ~ "???",
  TRUE ~ as.character(x)
)
#>  [1] "1"    "2"    "3"    "4"    "fizz" "6"    "buzz" "8"    "9"    "fizz"

switch() is a special purpose equivalent to if that can be used to compact code

x_option <- function(x) {
  if (x == "a") {
    "option 1"
  } else if (x == "b") {
    "option 2" 
  } else if (x == "c") {
    "option 3"
  } else {
    stop("Invalid `x` value")
  }
}
x_option <- function(x) {
  switch(x,
    a = "option 1",
    b = "option 2",
    c = "option 3",
    stop("Invalid `x` value")
  )
}

Tip

  • The last component of a switch() should always throw an error, otherwise unmatched inputs will invisibly return NULL.
  • Only use switch() with character inputs. Numeric inputs are hard to read and have undesirable failure modes.
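
Both failure modes can be reproduced in a few lines. In this sketch, no_default() is a hypothetical function written without a final error branch, and the numeric calls show positional matching.

```r
# Hypothetical switch() with no fallback branch.
no_default <- function(x) {
  switch(x,
    a = "option 1",
    b = "option 2"
  )
}

# An unmatched string silently yields NULL instead of an error.
is.null(no_default("z"))
#> [1] TRUE

# Numeric input selects by position, which is harder to read ...
switch(2, "first", "second", "third")
#> [1] "second"

# ... and an out-of-range position also yields NULL silently.
is.null(switch(4, "first", "second", "third"))
#> [1] TRUE
```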

Caution

Like if, switch() can only take a single condition, not vector conditions
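
One way to handle a vector of inputs is to apply a scalar switch() to each element. The sketch below assumes a hypothetical legs_scalar() helper and uses base R's vapply(); it is one possible approach, not the only one.

```r
# Hypothetical scalar lookup built on switch().
legs_scalar <- function(x) {
  switch(x,
    cow = ,
    dog = 4,
    chicken = 2,
    stop("Unknown input")
  )
}

animals <- c("cow", "chicken", "dog")

# Passing the whole vector fails because switch() needs a length-1 input.
res <- tryCatch(legs_scalar(animals), error = function(e) "error")
res
#> [1] "error"

# Applying the function element by element works.
vapply(animals, legs_scalar, numeric(1), USE.NAMES = FALSE)
#> [1] 4 2 4
```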

Avoid repeating outputs by leaving the right side of = empty

  • Inputs will “fall through” to the next value.
legs <- function(x) {
  switch(x,
    cow = ,
    horse = ,
    dog = 4,
    human = ,
    chicken = 2,
    plant = 0,
    stop("Unknown input")
  )
}
legs("cow")
#> [1] 4
legs("dog")
#> [1] 4

Loops

To iterate over items in a vector, use a for loop

for (item in vector) perform_action
for (i in 1:3) {
  print(i)
}
#> [1] 1
#> [1] 2
#> [1] 3

Note

Convention uses short variables like i, j, or k for iterating vector indices

for will overwrite existing variables in the current environment

i <- 100
for (i in 1:3) {}
i
#> [1] 3

Use next or break to terminate loops early

  • next exits the current iteration, but continues the loop
  • break exits the entire loop

for (i in 1:10) {
  if (i < 3) 
    next

  print(i)
  
  if (i >= 5)
    break
}
#> [1] 3
#> [1] 4
#> [1] 5

Preallocate an output container to avoid slow loops

means <- c(1, 50, 20)
out <- vector("list", length(means))
for (i in 1:length(means)) {
  out[[i]] <- rnorm(10, means[[i]])
}

Tip

The vector() function is helpful for preallocation.
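
As a brief illustration, vector() takes a mode and a length and returns a container filled with that mode's default value. The variable names below are made up for this example.

```r
n <- 3

vector("list", n)       # list of n NULL elements
vector("numeric", n)    # 0 0 0
vector("character", n)  # "" "" ""
vector("logical", n)    # FALSE FALSE FALSE

# Fill a preallocated numeric vector instead of growing it with c().
squares <- vector("numeric", n)
for (i in seq_len(n)) {
  squares[[i]] <- i^2
}
squares
#> [1] 1 4 9
```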

Use seq_along(x) instead of 1:length(x)

  • 1:length(x) causes unexpected failure for 0 length vectors
  • : works with both increasing and decreasing sequences
means <- c()
1:length(means)
#> [1] 1 0
seq_along(means)
#> integer(0)

out <- vector("list", length(means))
for (i in 1:length(means)) {
  out[[i]] <- rnorm(10, means[[i]])
}
#> Error in rnorm(10, means[[i]]): invalid arguments
out <- vector("list", length(means))
for (i in seq_along(means)) {
  out[[i]] <- rnorm(10, means[[i]])
}
out
#> list()

Avoid problems when iterating over S3 vectors by using seq_along(x) and x[[i]]

  • loops typically strip attributes
xs <- as.Date(c("2020-01-01", "2010-01-01"))
for (x in xs) {
  print(x)
}
#> [1] 18262
#> [1] 14610
for (i in seq_along(xs)) {
  print(xs[[i]])
}
#> [1] "2020-01-01"
#> [1] "2010-01-01"

Use while or repeat when you don’t know the number of iterations

  • while(condition) action: performs action while condition is TRUE
  • repeat(action): repeats action forever (or until a break)
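
A small sketch of both forms, counting the steps a number needs to reach 1 under the Collatz rule (halve if even, otherwise triple and add one). The function names are hypothetical; the point is that the number of iterations is not known in advance.

```r
# while: the condition is checked before every iteration.
collatz_steps_while <- function(n) {
  steps <- 0L
  while (n != 1) {
    n <- if (n %% 2 == 0) n / 2 else 3 * n + 1
    steps <- steps + 1L
  }
  steps
}

# repeat: loops forever, so the exit condition must call break explicitly.
collatz_steps_repeat <- function(n) {
  steps <- 0L
  repeat {
    if (n == 1) break
    n <- if (n %% 2 == 0) n / 2 else 3 * n + 1
    steps <- steps + 1L
  }
  steps
}

# 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1
collatz_steps_while(6)
#> [1] 8
collatz_steps_repeat(6)
#> [1] 8
```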

Always use the least-flexible loop option possible

  • Use for before while or repeat
  • In data analysis use apply() or purrr::map() before for
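
For comparison, a loop that fills a preallocated list from a vector of means can be written with base R's lapply(), a member of the apply family. This is a sketch of the idea rather than a prescribed replacement; purrr::map() would look much the same.

```r
means <- c(1, 50, 20)

# lapply() handles iteration and output allocation, and copes with empty input.
out <- lapply(means, function(m) rnorm(10, m))
lengths(out)
#> [1] 10 10 10

# With zero-length input the result is simply an empty list.
lapply(numeric(), function(m) rnorm(10, m))
#> list()
```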

Quiz & Exercises

What is the difference between if and ifelse()?

if works with scalars; ifelse() works with vectors.

In the following code, what will the value of y be if x is TRUE? What if x is FALSE? What if x is NA?

y <- if (x) 3

When x is TRUE, y will be 3; when FALSE, y will be NULL; when NA the if statement will throw an error.

What does switch("x", x = , y = 2, z = 3) return?

switch(
  "x",
  x = ,
  y = 2,
  z = 3
)

This switch() statement makes use of fall-through so it will return 2.

What type of vector does each of the following calls to ifelse() return?

Read the documentation and write down the rules in your own words.

ifelse(TRUE, 1, "no")
ifelse(FALSE, 1, "no")
ifelse(NA, 1, "no")

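The arguments of ifelse() are named test, yes and no. In general, ifelse() returns the entry for yes when test is TRUE, the entry for no when test is FALSE, and NA when test is NA. The output type therefore follows the selected entry: double for the first call, character for the second, and logical for the third.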

ifelse(TRUE, 1, "no")
#> [1] 1
ifelse(FALSE, 1, "no")
#> [1] "no"
ifelse(NA, 1, "no")
#> [1] NA

In practice, test is first converted to logical and if the result is neither TRUE nor FALSE, then as.logical(test) is returned.

ifelse(logical(), 1, "no")
#> logical(0)
ifelse(NaN, 1, "no")
#> [1] NA
ifelse(NA_character_, 1, "no")
#> [1] NA
ifelse("a", 1, "no")
#> [1] NA
ifelse("true", 1, "no")
#> [1] 1

Why does the following code work?

x <- 1:10
if (length(x)) "not empty" else "empty"
#> [1] "not empty"
x <- numeric()
if (length(x)) "not empty" else "empty"
#> [1] "empty"

if() expects a logical condition, but also accepts a numeric vector where 0 is treated as FALSE and all other numbers are treated as TRUE. Numerical missing values (including NaN) lead to an error in the same way that a logical missing, NA, does.
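
The coercion rules can be checked directly. In this sketch, describe() is a hypothetical helper that reports which branch was taken, and tryCatch() records whether a missing numeric value raises an error.

```r
# Hypothetical helper: report which branch a condition selects.
describe <- function(x) if (x) "TRUE branch" else "FALSE branch"

describe(0)
#> [1] "FALSE branch"
describe(-1.5)
#> [1] "TRUE branch"

# NaN cannot be interpreted as TRUE or FALSE, so if() throws an error.
failed <- tryCatch(
  {
    describe(NaN)
    FALSE
  },
  error = function(e) TRUE
)
failed
#> [1] TRUE
```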

Why does this code succeed without errors or warnings?

x <- numeric()
out <- vector("list", length(x))
for (i in 1:length(x)) {
  out[i] <- x[i] ^ 2
}
out
#> [[1]]
#> [1] NA

  • The explanation lies in how [ and [<- handle out-of-bounds and 0 indices.
  • x[1] is out of bounds and returns NA. Assigning NA to out[1] extends the zero-length list out to length 1.
  • x[0] returns numeric(0). numeric(0) is assigned to out[0]. Assigning a 0-length vector to a 0-length subset doesn’t change the object.
  • Each step includes valid R operations (even though the result may not be what the user intended).

Walk-through

Setup

x <- numeric()
out <- vector("list", length(x))
1:length(x)
#> [1] 1 0

First Iteration

x[1]
#> [1] NA
x[1]^2
#> [1] NA
out[1]
#> [[1]]
#> NULL
out[1] <- x[1]^2
out[1]
#> [[1]]
#> [1] NA

Second Iteration

x[0]
#> numeric(0)
x[0]^2
#> numeric(0)
out[0]
#> list()
out[0] <- x[0]^2
out[0]
#> list()

Final Result

out
#> [[1]]
#> [1] NA

When the following code is evaluated, what can you say about the vector being iterated?

xs <- c(1, 2, 3)
for (x in xs) {
  xs <- c(xs, x * 2)
}
xs
#> [1] 1 2 3 2 4 6

In this loop x takes on the values of the initial xs (1, 2 and 3), indicating that the vector being iterated over is evaluated just once, at the beginning of the loop, and is not re-evaluated after each iteration. (Otherwise, we would run into an infinite loop.)

What does the following code tell you about when the index is updated?

for (i in 1:3) {
  i <- i * 2
  print(i) 
}
#> [1] 2
#> [1] 4
#> [1] 6

In a for loop the index is updated at the beginning of each iteration. Therefore, reassigning the index symbol during one iteration doesn’t affect the following iterations. (Again, we would otherwise run into an infinite loop.)