Control flow

Learning objectives:

  • Learn the tools for controlling flow of execution.

  • Learn some technical pitfalls and (perhaps lesser known) useful features.

Introduction

There are two main groups of flow control tools: choices and loops.

  • Choices (if, switch, ifelse, dplyr::if_else, dplyr::case_when) allow you to run different code depending on the input.

  • Loops (for, while, repeat) allow you to run code repeatedly.

Choices

if() and else

Use if to specify a block of code to run when a condition is TRUE. Use else to specify a block of code to run when the same condition is FALSE.

if (condition) true_action
if (condition) true_action else false_action

(Note that braces are only needed for compound expressions.)

if (test_expression) {   
  true_action
} else {
  false_action
}

Can be expanded to more alternatives:

if (test_expression) {   
  true_action
} else if (other_test_expression) {
  other_action
} else {
  false_action
}
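
As a concrete instance of the chained form, here is a minimal sketch. The function name grade and its cut-offs are made up for illustration; only the first branch whose condition is TRUE is run.

```r
# Hypothetical helper: map a numeric score to a letter grade.
grade <- function(score) {
  if (score >= 90) {
    "A"
  } else if (score >= 80) {
    "B"
  } else {
    "C"
  }
}

grade(95)
#> [1] "A"
grade(85)
#> [1] "B"
grade(40)
#> [1] "C"
```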

Exercise

Why does this work?

x <- 1:10
if (length(x)) "not empty" else "empty"
#> [1] "not empty"

x <- numeric()
if (length(x)) "not empty" else "empty"
#> [1] "empty"

if returns a value which can be assigned

x1 <- if (TRUE) 1 else 2
x2 <- if (FALSE) 1 else 2

c(x1, x2)
#> [1] 1 2

The book recommends assigning the results of an if statement only when the entire expression fits on one line; otherwise it tends to be hard to read.

Single if without else

When you use the single argument form without an else statement, if invisibly (Section 6.7.2) returns NULL if the condition is FALSE. Since functions like c() and paste() drop NULL inputs, this allows for a compact expression of certain idioms:

greet <- function(name, birthday = FALSE) {
  paste0(
    "Hi ", name,
    if (birthday) " and HAPPY BIRTHDAY"
  )
}
greet("Maria", FALSE)
#> [1] "Hi Maria"
greet("Jaime", TRUE)
#> [1] "Hi Jaime and HAPPY BIRTHDAY"

format_lane_text <- function(number) {
  paste0(
    number,
    " lane",
    if (number > 1) "s",
    " of sequencing"
  )
}

format_lane_text(1)
#> [1] "1 lane of sequencing"
format_lane_text(4)
#> [1] "4 lanes of sequencing"

Invalid inputs

  • Condition must evaluate to a single TRUE or FALSE

A single number gets coerced to a logical type.

if (56) 1
#> [1] 1
if (0.3) 1
#> [1] 1
if (0) 1

If the condition cannot be coerced to a single TRUE or FALSE, an error is produced. A string only works when it spells a logical value, such as "true".

if ("text") 1
#> Error in if ("text") 1: argument is not interpretable as logical
if ("true") 1
#> [1] 1
if (numeric()) 1
#> Error in if (numeric()) 1: argument is of length zero
if (NULL) 1
#> Error in if (NULL) 1 : argument is of length zero
if (NA) 1
#> Error in if (NA) 1: missing value where TRUE/FALSE needed

A logical vector of length greater than 1 used to be an exception: before R 4.2.0 it only generated a warning (and the first element was used), unless the environment variable _R_CHECK_LENGTH_1_CONDITION_ was set to true. Since R 4.2.0 it is always an error.

if (c(TRUE, FALSE)) 1
#> Error in if (c(TRUE, FALSE)) 1 : the condition has length > 1
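
When a condition may have more than one element, it has to be reduced to a single value before it reaches if. A minimal sketch using base R's any(), all(), and isTRUE(); the vector name flags is only illustrative.

```r
flags <- c(TRUE, FALSE)

# any() and all() reduce a logical vector to length one.
if (any(flags)) "at least one TRUE" else "none TRUE"
#> [1] "at least one TRUE"
if (all(flags)) "all TRUE" else "not all TRUE"
#> [1] "not all TRUE"

# isTRUE() is FALSE for NA, NULL, and anything other than a single TRUE,
# so it does not trigger the errors shown above.
if (isTRUE(NA)) "yes" else "no"
#> [1] "no"
```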

Vectorized choices

  • ifelse() is a vectorized version of if:

x <- 1:10
ifelse(x %% 5 == 0, "XXX", as.character(x))
#>  [1] "1"   "2"   "3"   "4"   "XXX" "6"   "7"   "8"   "9"   "XXX"

ifelse(x %% 2 == 0, "even", "odd")
#>  [1] "odd"  "even" "odd"  "even" "odd"  "even" "odd"  "even" "odd"  "even"

  • The book recommends using ifelse() “only when the yes and no vectors are the same type as it is otherwise hard to predict the output type.”

  • dplyr::if_else() enforces this recommendation.

For example:

ifelse(c(TRUE,TRUE,FALSE),"a",3)
#> [1] "a" "a" "3"
dplyr::if_else(c(TRUE,TRUE,FALSE),"a",3)
#> Error in `dplyr::if_else()`:
#> ! `false` must be a character vector, not a double vector.
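
To see why the output type of ifelse() is hard to predict with mixed yes and no types, the following sketch (base R only) checks the result type for different test values. The type depends on which branches are actually selected.

```r
# Only the yes branch is selected: the result is character.
typeof(ifelse(TRUE, "a", 3))
#> [1] "character"

# Only the no branch is selected: the result is double.
typeof(ifelse(FALSE, "a", 3))
#> [1] "double"

# Both branches are selected: the values are coerced to a common type.
typeof(ifelse(c(TRUE, FALSE), "a", 3))
#> [1] "character"
```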

Switch

Rather than stringing together multiple if - else if chains, you can often use switch().

centre <- function(x, type) {
  switch(type,
    mean = mean(x),
    median = median(x),
    trimmed = mean(x, trim = .1),
    stop("Invalid `type` value")
  )
}

The last component should always throw an error, as unmatched inputs would otherwise invisibly return NULL. The book recommends using switch() only with character inputs.

vec <- c(1:20,50:55)
centre(vec, "mean")
#> [1] 20.19231
centre(vec, "median")
#> [1] 13.5
centre(vec, "trimmed")
#> [1] 18.77273

set.seed(123)
x <- rlnorm(100)

# Compute each measure of centre for the same sample
centers <- data.frame(type = c('mean', 'median', 'trimmed'))
centers$value <- sapply(centers$type, \(t) centre(x, t))

# Overlay the three centres on the density of x
library(ggplot2)
ggplot(data = data.frame(x), aes(x)) +
  geom_density() +
  geom_vline(data = centers,
             mapping = aes(color = type, xintercept = value),
             linewidth = 0.5, linetype = "dashed") +
  xlim(-1, 10) +
  theme_bw()

Example from the book of “falling through” to the next value:

legs <- function(x) {
  switch(x,
    cow = ,
    horse = ,
    dog = 4,
    human = ,
    chicken = 2,
    plant = 0,
    stop("Unknown input")
  )
}
legs("cow")
#> [1] 4
legs("dog")
#> [1] 4
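
To illustrate why the final stop() matters, here is a sketch of the same function without it. The name legs_no_default is hypothetical and only used for this comparison.

```r
# Hypothetical variant of legs() without the final stop() fallback.
legs_no_default <- function(x) {
  switch(x,
    cow = ,
    horse = ,
    dog = 4,
    human = ,
    chicken = 2,
    plant = 0
  )
}

legs_no_default("horse")   # falls through to the value for dog
#> [1] 4
legs_no_default("human")   # falls through to the value for chicken
#> [1] 2

# An unmatched input returns NULL invisibly instead of signalling a problem.
is.null(legs_no_default("spider"))
#> [1] TRUE
```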

Using dplyr::case_when

  • case_when() is a more general if_else() and can often be used in place of multiple chained if_else() calls or of sapply()-ing switch().

  • It uses a special formula syntax (condition ~ value) to allow any number of condition-vector pairs:

set.seed(123)
x <- rlnorm(100)

centers <- data.frame(type = c('mean', 'median', 'trimmed'))

centers$value = dplyr::case_when(
  centers$type == 'mean' ~ mean(x),
  centers$type == 'median' ~ median(x),
  centers$type == 'trimmed' ~ mean(x, trim = 0.1),
  .default = 1000
  )

centers
#>      type    value
#> 1    mean 1.652545
#> 2  median 1.063744
#> 3 trimmed 1.300568

Loops

  • Iteration over the elements of a vector

for (item in vector) perform_action

First example

for(i in 1:5) {
  print(1:i)
}
#> [1] 1
#> [1] 1 2
#> [1] 1 2 3
#> [1] 1 2 3 4
#> [1] 1 2 3 4 5

df <- data.frame(x = 1:5)
df$y <- numeric(nrow(df))  # preallocate the output column

for (i in seq_len(nrow(df))) {
  df$y[[i]] <- i + 1
}

Second example: terminate a for loop early

  • next skips the rest of the current iteration
  • break exits the loop entirely

for (i in 1:10) {
  if (i < 3) 
    next

  print(i)
  
  if (i >= 5)
    break
}
#> [1] 3
#> [1] 4
#> [1] 5
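
The introduction also lists while and repeat, which are not shown elsewhere in these notes. As a rough sketch, the same output as the loop above can be produced with either one; the counters i and j have to be managed by hand.

```r
# while: keep looping as long as the condition holds.
i <- 3
while (i <= 5) {
  print(i)
  i <- i + 1
}
#> [1] 3
#> [1] 4
#> [1] 5

# repeat: loop until break is reached.
j <- 3
repeat {
  print(j)
  if (j >= 5) break
  j <- j + 1
}
#> [1] 3
#> [1] 4
#> [1] 5
```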

Exercise

When the following code is evaluated, what can you say about the vector being iterated?

xs <- c(1, 2, 3)
for (x in xs) {
  xs <- c(xs, x * 2)
}
xs
#> [1] 1 2 3 2 4 6

Pitfalls

  • Preallocate output containers to avoid slow code.

  • Beware that when v has length 0, 1:length(v) iterates backwards over 1:0, which is probably not what is intended. Use seq_along(v) instead.

  • When iterating over S3 vectors, loop over the indices and extract each element with [[ yourself, because for strips the attributes.

xs <- as.Date(c("2020-01-01", "2010-01-01"))
for (x in xs) {
  print(x)
}
#> [1] 18262
#> [1] 14610

vs. 

for (i in seq_along(xs)) {
  print(xs[[i]])
}
#> [1] "2020-01-01"
#> [1] "2010-01-01"
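
The first two pitfalls can be sketched in the same way. The counters n_colon and n_seq and the input values are illustrative only.

```r
v <- numeric()

# 1:length(v) is 1:0, i.e. c(1, 0), so the loop body runs twice.
n_colon <- 0
for (i in 1:length(v)) n_colon <- n_colon + 1
n_colon
#> [1] 2

# seq_along(v) is integer(0), so the loop body never runs.
n_seq <- 0
for (i in seq_along(v)) n_seq <- n_seq + 1
n_seq
#> [1] 0

# Preallocate the output, then fill it in place instead of growing it.
means <- c(1, 50, 20)                    # illustrative inputs
out <- vector("list", length(means))
for (i in seq_along(means)) {
  out[[i]] <- means[[i]] * 2
}
length(out)
#> [1] 3
```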