7.2 Linear Layer: nn_linear()

Consider the linear layer:

library(torch)

l <- nn_linear(in_features = 5, out_features = 16) # bias = TRUE is the default
l
## An `nn_module` containing 96 parameters.
## 
## ── Parameters ──────────────────────────────────────────────────────────────────
## • weight: Float [1:16, 1:5]
## • bias: Float [1:16]
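
As a quick sanity check (an illustrative sketch, not part of the original example), the 96 parameters break down into \(16 \times 5\) weights plus \(16\) biases. We can confirm this by summing the element counts of the module's parameters:

# Illustrative check: sum the number of elements across all parameters.
n_params <- sum(sapply(l$parameters, function(p) p$numel()))
n_params
## expected: 96  (16 * 5 weights + 16 biases)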

A comment about size: we might expect the weight matrix of l to be \(5 \times 16\), so that the matrix multiplication \(X_{50 \times 5} \, \beta_{5 \times 16}\) works directly. Below we see that it is actually \(16 \times 5\): for performance reasons, the underlying C++ implementation (libtorch) stores the transpose.

l$weight$size()
## [1] 16  5

Apply the module:

# Generate data from the standard normal distribution
x <- torch_randn(50, 5) 

# Feed x into the layer:
output <- l(x)

output$size()
## [1] 50 16
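
Because the transpose is what is stored, the layer computes \(X W^\top + b\). The following sketch (an illustration, assuming the shapes above) reproduces the layer's output by hand and compares it with torch_allclose():

# Illustrative sketch: reproduce the layer's computation manually.
manual <- torch_matmul(x, l$weight$t()) + l$bias
torch_allclose(output, manual)
## expected: TRUE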

When we use built-in modules, we do not have to set requires_grad = TRUE when creating tensors (unlike in previous chapters): the module creates its parameters with gradient tracking already enabled.
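
To see this, note that the parameters created by nn_linear() already track gradients, so a backward pass populates them without any extra setup on our part. The snippet below is a small sketch of that behaviour (again an illustration, not part of the original example):

# Parameters of built-in modules track gradients automatically.
l$weight$requires_grad
## expected: TRUE

# A backward pass through a scalar loss fills in the gradients.
loss <- output$sum()
loss$backward()
l$weight$grad$size()
## expected: 16  5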