7.2 Linear Layer: nn_linear()
Consider the linear layer:
# A linear layer mapping 5 input features to 16 output features:
l <- nn_linear(in_features = 5, out_features = 16)
l
## An `nn_module` containing 96 parameters.
##
## ── Parameters ──────────────────────────────────────────────────────────────────
## • weight: Float [1:16, 1:5]
## • bias: Float [1:16]
A comment about size: we might expect the weight of l
to be \(5 \times 16\), to match the matrix multiplication \(X_{50 \times 5} \, \beta_{5 \times 16}\). Checking its dimensions, however, we see that it is \(16 \times 5\):
l$weight$size()
## [1] 16 5
This is due to the underlying C++ implementation in libtorch: for performance reasons, the transpose of the weight matrix is stored.
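To convince ourselves that the stored weight really is the transpose, here is a small sketch (re-creating `l` and `x` so the snippet is self-contained; the names match those used in this section). The layer's forward pass is equivalent to multiplying the input by the transposed weight and adding the bias:

```r
library(torch)

l <- nn_linear(in_features = 5, out_features = 16)
x <- torch_randn(50, 5)

# Reproduce the layer's computation by hand: x %*% t(weight) + bias.
manual <- torch_matmul(x, l$weight$t()) + l$bias

# The two results agree up to floating-point tolerance.
torch_allclose(l(x), manual)
```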
Apply the module:
# Generate data from the standard normal distribution:
x <- torch_randn(50, 5)
# Feed x into the layer:
output <- l(x)
output$size()
## [1] 50 16
When we use built-in modules, we do not need to set requires_grad = TRUE
when creating tensors (unlike in previous chapters); the module takes care of that for us.
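To see this in action, a minimal sketch (the scalar loss here is a toy, chosen only to have something to differentiate): the module's parameters already track gradients, and a backward pass populates them without any manual setup.

```r
library(torch)

l <- nn_linear(in_features = 5, out_features = 16)
x <- torch_randn(50, 5)

# The module created its parameters with gradient tracking enabled.
l$weight$requires_grad
## [1] TRUE

# A toy scalar loss, just so we can call backward().
loss <- l(x)$sum()
loss$backward()

# The gradient has the same shape as the (transposed) weight.
l$weight$grad$size()
## [1] 16 5
```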