4.2 Vectors, matrices and data.frames
A vector is a collection of elements of the same type, without dimensions. Vectors of length 1 are called scalars.
scalar_1 <- 1
scalar_2 <- 2
scalar_3 <- 3
length(scalar_1) #the length of a scalar is 1
## [1] 1
(vector_123 <- c(scalar_1, scalar_2, scalar_3))
## [1] 1 2 3
(vector_long <- c(vector_123, vector_123))
## [1] 1 2 3 1 2 3
dim(vector_123) #a vector has no dimensions, so dim() returns NULL
## NULL
Scalars come in four common types: logical, double, integer and character. (R also has complex and raw types, which are rarely needed.)
typeof(TRUE)
## [1] "logical"
typeof(1.5)
## [1] "double"
typeof(1L)
## [1] "integer"
typeof("one")
## [1] "character"
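Because all elements of a vector must share one type, combining values of different types with c() converts them to a common type. The following is a minimal sketch of that behaviour; the variable names are illustrative only.

```r
# Mixing types in c() coerces every element to the most flexible type present
mixed_num <- c(1L, 2.5)         # integer + double
mixed_lgl <- c(TRUE, 2L)        # logical + integer
mixed_chr <- c(1, "one", TRUE)  # anything + character

typeof(mixed_num)  # "double"
typeof(mixed_lgl)  # "integer"
typeof(mixed_chr)  # "character"
```

The coercion order is logical, then integer, then double, then character, so a single character element turns the whole vector into text.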
R can serve as a calculator.
#basic mathematical operations
scalar_1 + scalar_2
## [1] 3
scalar_3^scalar_2
## [1] 9
Computations on vectors are performed element-wise.
vector_123 * scalar_2
## [1] 2 4 6
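Element-wise computation also applies when both operands are vectors. When the lengths differ, R repeats ("recycles") the shorter vector. A short sketch, using freshly defined vectors so that it runs on its own:

```r
v <- c(1, 2, 3)
long <- c(v, v)   # 1 2 3 1 2 3

v + v     # 2 4 6: elements are added pairwise
v * v     # 1 4 9: multiplication is element-wise, not a dot product
v > 1     # FALSE TRUE TRUE: comparisons are element-wise too
long + v  # 2 4 6 2 4 6: v is recycled to match the length of long
```

Multiplying by a scalar is simply the special case in which a vector of length 1 is recycled.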
Functions in R have names. A basic function is print().
print("Hello world!")
## [1] "Hello world!"
log() is another useful function. It takes two arguments, x and base.
#unnamed arguments are matched by position, so their order matters
log(8, base = 2)
## [1] 3
log(x = 8, base = 2)
## [1] 3
log(8, 2)
## [1] 3
log(base = 2, x = 8)
## [1] 3
log(2, 8) #here x = 2 and base = 8
## [1] 0.3333333
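The same matching rules apply to functions you write yourself. The sketch below defines a hypothetical function, power(), which is not part of R, with a default value for its second argument.

```r
# power() is a made-up example function; exponent defaults to 2
power <- function(x, exponent = 2) {
  x^exponent
}

power(3)                    # 9: exponent falls back to its default
power(3, 3)                 # 27: arguments matched by position
power(exponent = 3, x = 2)  # 8: named arguments can come in any order
power(2, 3) == power(3, 2)  # FALSE: swapping unnamed arguments changes the result

# log() works the same way: base defaults to exp(1), the natural logarithm
all.equal(log(8), log(8, base = exp(1)))  # TRUE
```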
Another important function is sample(), which draws a random sample from a vector.
sample(1:5, 3) #random sample without replacement
## [1] 2 3 1
sample(1:5, 3, replace = TRUE) #random sample with replacement
## [1] 4 3 5
sample(1:5, 3, prob = c(0.2,0.1,0.3,0.1,0.3)) #odd-biased sample without replacement
## [1] 3 1 5
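Since sample() is random, its output changes from run to run. Setting a seed with set.seed() beforehand makes the draw reproducible. A minimal sketch follows; the seed value 123 is arbitrary.

```r
set.seed(123)              # fix the state of the random number generator
draw_1 <- sample(1:5, 3)

set.seed(123)              # reset to the same state
draw_2 <- sample(1:5, 3)

identical(draw_1, draw_2)  # TRUE: same seed, same sample
```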
A matrix is just a vector with dimensions.
#adding dimensions to a vector turns it into a matrix
vector_long_2 <- vector_long
identical(vector_long, vector_long_2)
## [1] TRUE
(dim(vector_long_2) <- c(2,3))
## [1] 2 3
identical(vector_long, vector_long_2)
## [1] FALSE
dim(vector_long)
## NULL
dim(vector_long_2)
## [1] 2 3
Matrices can be created using the matrix() function.
matrix(vector_long, 2, 3)
##      [,1] [,2] [,3]
## [1,]    1    3    2
## [2,]    2    1    3
By default, matrix() fills values column by column. This can be changed by setting byrow to TRUE.
matrix(vector_long, 2, 3, byrow = TRUE)
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    1    2    3
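The fill order determines which value ends up in which cell. The sketch below rebuilds both matrices from a fresh vector and reads individual cells with [row, column] indexing; the names m_col and m_row are illustrative only.

```r
v <- c(1, 2, 3, 1, 2, 3)

m_col <- matrix(v, nrow = 2, ncol = 3)                # filled column by column
m_row <- matrix(v, nrow = 2, ncol = 3, byrow = TRUE)  # filled row by row

m_col[1, 2]  # 3: first row, second column
m_row[1, 2]  # 2
m_row[2, ]   # 1 2 3: an empty index selects the whole row

# Filling by row is the same as filling the transposed shape by column
identical(m_row, t(matrix(v, nrow = 3, ncol = 2)))  # TRUE
```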
Example: Simulate an adjacency matrix for a network[4]
set.seed(1992)
#1 = link, 0 = no link
x <- sample(c(1,0), 25, replace = TRUE, prob = c(.5,.5))
#names of the network's nodes
dim_names <- list(c("Thea", "Pravin", "Troy", "Albin", "Clementine"),
                  c("Thea", "Pravin", "Troy", "Albin", "Clementine"))
#create a 5x5 adjacency matrix
(matrix_data2 <- matrix(x,
                        nrow = 5,
                        ncol = 5,
                        byrow = TRUE,
                        dimnames = dim_names)) #set names of the rows and columns
##            Thea Pravin Troy Albin Clementine
## Thea          1      1    1     1          0
## Pravin        1      0    1     1          1
## Troy          0      0    1     0          1
## Albin         1      1    0     1          0
## Clementine    1      0    0     1          0
isSymmetric(matrix_data2)
## [1] FALSE
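A randomly filled matrix is generally not symmetric, so it describes a directed network. One possible way to obtain an undirected network from it, sketched below under that assumption, is to mirror the upper triangle onto the lower triangle. The node names and the seed are made up for the illustration.

```r
set.seed(1)  # arbitrary seed
nodes <- c("A", "B", "C", "D")  # hypothetical node names
adj <- matrix(sample(c(0, 1), 16, replace = TRUE),
              nrow = 4,
              dimnames = list(nodes, nodes))

# Copy each upper-triangle entry [j, i] into its mirror position [i, j]
adj_sym <- adj
adj_sym[lower.tri(adj_sym)] <- t(adj_sym)[lower.tri(adj_sym)]

isSymmetric(adj_sym)  # TRUE
```

The diagonal and the upper triangle are left untouched; only the lower triangle is overwritten.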
A data.frame is a collection of vectors of the same length which, unlike the columns of a matrix, may be of different types. We can convert a matrix into a data.frame and vice versa.
#convert matrix to dataframe
class(matrix_data2)
## [1] "matrix" "array"
df_data2 <- as.data.frame(matrix_data2)
class(df_data2)
## [1] "data.frame"
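A data.frame can also be built directly from vectors with data.frame(). The sketch below uses made-up node attributes to show that each column keeps its own type, and what happens to those types when the data.frame is converted back to a matrix.

```r
# Hypothetical node attributes, one vector per column
nodes_df <- data.frame(
  name   = c("Thea", "Pravin", "Troy"),
  degree = c(4L, 4L, 2L),
  active = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE  # keep text as character in older R versions too
)

sapply(nodes_df, class)  # name: character, degree: integer, active: logical
nodes_df$degree          # 4 4 2: a column is an ordinary vector

# A matrix holds a single type, so mixed columns are coerced to character
typeof(as.matrix(nodes_df))            # "character"
typeof(as.matrix(nodes_df["degree"]))  # "integer"
```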
Base data.frames can be inefficient for large data sets and are increasingly being replaced by data classes from user-contributed packages, such as data.table.[5]
Another important data structure to familiarize yourself with if you’re new to R is the list.
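Unlike a vector, a list can hold elements of different types and lengths. The following is a brief sketch with made-up values:

```r
# A hypothetical record describing one node of a network
node <- list(
  name       = "Thea",
  neighbours = c("Pravin", "Troy"),
  n_links    = 2L,
  active     = TRUE
)

node$name             # "Thea": access an element by name
node[["neighbours"]]  # "Pravin" "Troy": [[ ]] extracts the element itself
length(node)          # 4: number of elements, not of values
sapply(node, typeof)  # each element keeps its own type

# A data.frame is itself a list of equal-length columns
is.list(data.frame(a = 1:2))  # TRUE
```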
[4] For undirected graphs, the adjacency matrix of a one-mode network is symmetric. This symmetry might not hold for two-mode (also known as bipartite) networks. For directed graphs, the adjacency matrix can be asymmetric to reflect the direction of each link/edge. (Thanks, Pierre Olivier, for your input.)
[5] Adjacency matrices, especially large ones, are best stored as sparse matrices for memory efficiency.