4.2 Vectors, matrices and data.frames
Vectors are collections of elements of the same type, without dimensions. Vectors of length 1 are called scalars.
scalar_1 <- 1
scalar_2 <- 2
scalar_3 <- 3
length(scalar_1) #length of a scalar is 1
## [1] 1
(vector_123 <- c(scalar_1, scalar_2, scalar_3))
## [1] 1 2 3
(vector_long <- c(vector_123, vector_123))
## [1] 1 2 3 1 2 3
dim(vector_123) #dimensions of the vector returns NULL
## NULL
Scalars most commonly come in four different types (R also has the less frequently used complex and raw types):
typeof(TRUE)
## [1] "logical"
typeof(1.5)
## [1] "double"
typeof(1L)
## [1] "integer"
typeof("one")
## [1] "character"
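Because a vector holds elements of a single type, combining values of different types with c() converts them to a common type. The following is a minimal sketch of that coercion; the variable names are illustrative and not part of the original example.

```r
# Combining different types coerces all elements to the most general type.
mixed_int <- c(TRUE, 2L)     # logical + integer
mixed_dbl <- c(1L, 2.5)      # integer + double
mixed_chr <- c(1.5, "one")   # double + character

typeof(mixed_int)  # "integer"
typeof(mixed_dbl)  # "double"
typeof(mixed_chr)  # "character"
```

The ordering logical < integer < double < character explains why a single string turns a whole numeric vector into text.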
R can serve as a calculator.
#basic mathematical operations
scalar_1 + scalar_2
## [1] 3
scalar_3^scalar_2
## [1] 9
## [1] 9
Computations on vectors are performed element-wise.
vector_123 * scalar_2
## [1] 2 4 6
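Element-wise computation also applies when both operands are vectors. As a small sketch (the vectors are redefined here so the block runs on its own, and the result names are illustrative), note that R recycles the shorter vector when the lengths differ:

```r
vector_123  <- c(1, 2, 3)
vector_long <- c(vector_123, vector_123)

# Same length: elements are paired position by position.
squared <- vector_123 * vector_123    # 1 4 9

# Different lengths: the shorter vector is recycled to match the longer one.
recycled <- vector_long + vector_123  # 2 4 6 2 4 6
```

Recycling is silent when the longer length is a multiple of the shorter one, so it is worth checking lengths when a result looks unexpected.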
Functions in R have names. A basic function is print().
print("Hello world!")
## [1] "Hello world!"
log() is another useful function, and it has two arguments, x and base.
#order matters if arguments are not named
log(8, base = 2)
## [1] 3
log(x = 8, base = 2)
## [1] 3
log(8, 2)
## [1] 3
log(base = 2, x = 8)
## [1] 3
log(2, 8)
## [1] 0.3333333
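The same matching rules apply to functions you write yourself. The sketch below uses a hypothetical helper, power(), that is not part of the original text: unnamed arguments are matched by position, while named arguments can be given in any order.

```r
# Hypothetical helper, defined only to illustrate argument matching.
power <- function(base, exponent) {
  base^exponent
}

by_position <- power(2, 3)                    # base = 2, exponent = 3 -> 8
by_name     <- power(exponent = 3, base = 2)  # same call, names given -> 8
swapped     <- power(3, 2)                    # base = 3, exponent = 2 -> 9
```

Naming arguments costs a few keystrokes but makes calls such as log(base = 2, x = 8) robust to ordering mistakes.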
Another important function is sample(), which takes a random sample from a vector.
sample(1:5, 3) #random sample without replacement
## [1] 2 3 1
sample(1:5, 3, replace = TRUE) #random sample with replacement
## [1] 4 3 5
sample(1:5, 3, prob = c(0.2,0.1,0.3,0.1,0.3)) #odd-biased sample without replacement
## [1] 3 1 5
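Since sample() is random, the outputs above will differ from run to run. A minimal sketch of making a draw reproducible with set.seed() follows; the seed value 123 is arbitrary and the variable names are illustrative.

```r
set.seed(123)                  # fix the state of the random number generator
first_draw <- sample(1:5, 3)

set.seed(123)                  # reset to the same state
second_draw <- sample(1:5, 3)

identical(first_draw, second_draw)  # TRUE: same seed, same sample
```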
A matrix is just a vector with dimensions.
#adding dimensions to a vector transforms it into a matrix
vector_long_2 <- vector_long
identical(vector_long, vector_long_2)
## [1] TRUE
(dim(vector_long_2) <- c(2,3))
## [1] 2 3
identical(vector_long, vector_long_2)
## [1] FALSE
dim(vector_long)
## NULL
dim(vector_long_2)
## [1] 2 3
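One consequence of a matrix being a vector with dimensions is that its elements can be addressed either by row and column or by their position in the underlying vector. The following sketch uses an illustrative object m that is not part of the original example:

```r
m <- 1:6
dim(m) <- c(2, 3)         # m is now a 2 x 3 matrix, filled column by column

by_row_col  <- m[2, 3]    # row 2, column 3 -> 6
by_position <- m[5]       # fifth element of the underlying vector -> 5
was_matrix  <- is.matrix(m)

dim(m) <- NULL            # removing the dimensions gives back a plain vector
is_matrix_now <- is.matrix(m)
```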
Matrices can be created using the matrix() function.
matrix(vector_long, 2, 3)
##      [,1] [,2] [,3]
## [1,]    1    3    2
## [2,]    2    1    3
By default, the matrix() function fills values column by column. This can be changed by setting byrow to TRUE.
matrix(vector_long, 2, 3, byrow = TRUE)
##      [,1] [,2] [,3]
## [1,]    1    2    3
## [2,]    1    2    3
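To see how the two fill orders relate, the sketch below (with illustrative object names) checks that filling a 2 x 3 matrix by row yields the transpose of the 3 x 2 matrix filled by column from the same values.

```r
values <- 1:6

by_row <- matrix(values, 2, 3, byrow = TRUE)  # rows: 1 2 3 / 4 5 6
by_col <- matrix(values, 3, 2)                # columns: 1 2 3 / 4 5 6

# t() transposes a matrix; the two objects hold the same layout.
same_layout <- all(t(by_row) == by_col)       # TRUE
```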
Example: Simulate an adjacency matrix for a network [4]
set.seed(1992)
#1 = link, 0 = no-link
x <- sample(c(1,0), 25, replace = TRUE, prob=c(.5,.5))
#names of the network's nodes
dim_names <- list(c("Thea", "Pravin", "Troy", "Albin", "Clementine"),
                  c("Thea", "Pravin", "Troy", "Albin", "Clementine"))
#create 5x5 adjacency matrix
(matrix_data2 <- matrix(x,
                        nrow=5,
                        ncol=5,
                        byrow =TRUE,
                        dimnames = dim_names)#set names of the rows and columns
)
##            Thea Pravin Troy Albin Clementine
## Thea          1      1    1     1          0
## Pravin        1      0    1     1          1
## Troy          0      0    1     0          1
## Albin         1      1    0     1          0
## Clementine    1      0    0     1          0
isSymmetric(matrix_data2)
## [1] FALSE
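A randomly filled matrix like the one above is generally not symmetric, so it describes a directed network. If an undirected network is wanted instead, one possible approach (a sketch, not the author's method) is to mirror the upper triangle onto the lower triangle. The node names and the seed below are hypothetical.

```r
set.seed(1)                                # arbitrary seed
nodes <- c("a", "b", "c", "d")             # hypothetical node names
adj <- matrix(sample(c(1, 0), 16, replace = TRUE),
              nrow = 4,
              dimnames = list(nodes, nodes))

# Copy each upper-triangle entry [j, i] onto its mirror position [i, j].
adj_sym <- adj
adj_sym[lower.tri(adj_sym)] <- t(adj_sym)[lower.tri(adj_sym)]

isSymmetric(adj_sym)  # TRUE
```

The diagonal is left untouched, so self-links present in the original matrix are kept.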
A data.frame is a collection of vectors of the same length. We can convert a matrix into a data.frame and vice versa.
#convert matrix to dataframe
class(matrix_data2)
## [1] "matrix" "array"
df_data2 <- as.data.frame(matrix_data2)
class(df_data2)
## [1] "data.frame"
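Unlike a matrix, the columns of a data.frame may each have a different type, which matters when converting in the other direction. The sketch below uses made-up data and column names purely for illustration:

```r
# Hypothetical data; each column is a vector of the same length.
df_example <- data.frame(
  name  = c("Thea", "Pravin", "Troy"),
  score = c(1.5, 2.0, 3.5),
  flag  = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

column_types <- sapply(df_example, typeof)  # one type per column

# A matrix can hold only one type, so mixed columns become character.
mat_example <- as.matrix(df_example)
typeof(mat_example)  # "character"
```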
data.frames can be inefficient in R, particularly for large data sets, and are increasingly being replaced by data classes from contributed packages, such as data.table. [5]
Another important data structure to become familiar with if you are new to R is the list.
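As a brief sketch of what a list looks like (the object and element names are illustrative), a list can hold elements of different types and lengths, and a data.frame is itself a list of equal-length columns:

```r
my_list <- list(number = 1.5, text = "one", values = 1:3)

n_elements <- length(my_list)    # 3
by_dollar  <- my_list$text       # "one"
by_bracket <- my_list[["values"]]  # 1 2 3

# A data.frame is a special kind of list.
df_is_list <- is.list(data.frame(a = 1:2))  # TRUE
```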
[4] For undirected graphs, the adjacency matrix of a one-mode network is symmetric. This symmetry might not hold for two-mode (bipartite) networks. For directed graphs, the adjacency matrix can be asymmetric to reflect the direction of each link (edge). (Thanks to Pierre Olivier for the input.)
[5] Adjacency matrices, especially large ones, are best stored as sparse matrices for memory efficiency.