Generating random variables

A procedure for generating random variables from a single variable Gaussian is typically available in any data science system (in r , we can use rnorm), but how can we generate a multivariate Gaussian? In r there are libraries (e.g. MASS) that can do this for you, but we now have the tools generate these directly from rnorm.

The ideas is to start with i.i.d. random normal variables arranged as vector $\mathbf{X}$ with mean zero and standard deviation 1. The covariance matrix for this random vector is diagonal with all 1’s along the diagonal. Now transform the variable:

$\mathbf{Y} = \mathbf{\Sigma^{1/2} X} + \mathbf{\mu}$ Then the distribution of $\mathbf{Y}$ (as shown in text) has mean $\mathbf{\mu}$ and covariance $\mathbf{\Sigma}$ .

sigma <- matrix(c(3,1,1,1),nrow=2)

# compute matrix square root
ev <- eigen(sigma)
ev$values <- sqrt(ev$values)
sigma_root <- ev$vectors %*% diag(ev$values) %*% t(ev$vectors)

sigma_root %*% sigma_root

##      [,1] [,2]
## [1,]    3    1
## [2,]    1    1

Ok sqrt in hand, we can compute perform the transform:

# random normal variables
uncorr <-   matrix(rnorm(2000),ncol=2)
print(cov(uncorr))

##            [,1]       [,2]
## [1,] 1.00419561 0.01080475
## [2,] 0.01080475 1.01521702

# multiply by sqrt of sigma
correlated <-  t(apply(uncorr,1,\(x)sigma_root %*% x))
print(cov(correlated))

##          [,1]     [,2]
## [1,] 3.028170 1.026537
## [2,] 1.026537 1.021243

This process can also be used in reverse to ‘whiten’ an noise source (for example for random number generation from a natural source)