Quasi-Newton methods
- Cons of Newton’s method: every iteration requires evaluating (and factorizing) the Jacobian, and the iteration tends to diverge from many starting points
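For reference, each Newton iteration for \(\mathbf{f}(\mathbf{x}) = \mathbf{0}\) evaluates the Jacobian and solves a linear system to obtain the step:
\[ \mathbf{J}(\mathbf{x}_k)\,\boldsymbol{\delta}_k = -\mathbf{f}(\mathbf{x}_k), \qquad \mathbf{x}_{k+1} = \mathbf{x}_k + \boldsymbol{\delta}_k. \]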
Tackling the cost of computing the Jacobian:
- Finite-difference approach: a multidimensional analogue of the secant iteration; each column of the Jacobian is replaced by a difference quotient, perturbing one coordinate at a time:
\[ \mathbf{J}(\mathbf{x})\,\mathbf{e}_j \approx \frac{\mathbf{f}(\mathbf{x} + \delta \mathbf{e}_j) - \mathbf{f}(\mathbf{x})}{\delta}, \quad j = 1, \dots, n. \]
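A minimal NumPy sketch of this column-by-column approximation (the function name `fd_jacobian` and the default step `delta` are illustrative choices, not from the notes):

```python
import numpy as np

def fd_jacobian(f, x, delta=1e-7):
    """Forward-difference Jacobian: column j is (f(x + delta*e_j) - f(x)) / delta."""
    x = np.asarray(x, dtype=float)
    fx = np.asarray(f(x), dtype=float)
    J = np.empty((fx.size, x.size))
    for j in range(x.size):
        e = np.zeros(x.size)
        e[j] = 1.0                                   # unit vector e_j
        J[:, j] = (f(x + delta * e) - fx) / delta    # j-th difference quotient
    return J
```

Note that this still costs \(n\) extra evaluations of \(\mathbf{f}\) per iteration, which is exactly what Broyden’s update avoids.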
- Broyden’s update: maintain an approximate Jacobian and update it using only quantities already computed on the last iteration (the step taken and the resulting change in \(\mathbf{f}\)), requiring no additional function evaluations; the rank-one formula is given below
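The standard rank-one (“good”) Broyden update, with \(\Delta\mathbf{x} = \mathbf{x}_{k+1} - \mathbf{x}_k\) and \(\Delta\mathbf{f} = \mathbf{f}(\mathbf{x}_{k+1}) - \mathbf{f}(\mathbf{x}_k)\):
\[ \mathbf{B}_{k+1} = \mathbf{B}_k + \frac{(\Delta\mathbf{f} - \mathbf{B}_k\,\Delta\mathbf{x})\,\Delta\mathbf{x}^{\mathsf T}}{\Delta\mathbf{x}^{\mathsf T}\Delta\mathbf{x}}. \]
As a sketch (assuming `dx` and `df` were saved from the previous iteration, and reusing the NumPy import above):

```python
def broyden_update(B, dx, df):
    """Rank-one correction chosen so the new approximation maps dx to df."""
    return B + np.outer(df - B @ dx, dx) / (dx @ dx)
```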
Tackling divergence:
- Levenberg’s method:
  - tempers the squared-Jacobian matrix \(\mathbf{J}^{\mathsf T}\mathbf{J}\) with a small damping term \(\lambda \mathbf{I}\), akin to ridge regression:
    \[ (\mathbf{J}^{\mathsf T}\mathbf{J} + \lambda \mathbf{I})\,\boldsymbol{\delta} = -\mathbf{J}^{\mathsf T}\mathbf{f}(\mathbf{x}) \]
  - transitions smoothly between a (Gauss-)Newton step as \(\lambda \to 0\) and a short gradient-descent step as \(\lambda\) grows
- the Levenberg-Marquardt extension scales the damping by \(\operatorname{diag}(\mathbf{J}^{\mathsf T}\mathbf{J})\) and has a superior strategy for changing \(\lambda\): shrink it after a step that reduces the residual, grow it after one that does not (see the sketch below)
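A minimal sketch of a Levenberg-Marquardt-style loop under these assumptions (the halving/doubling schedule for \(\lambda\), the tolerance, and the function names are illustrative, not prescribed by the notes); `jac` could be the finite-difference approximation above, e.g. `jac=lambda x: fd_jacobian(f, x)`, or an approximation maintained with `broyden_update`:

```python
def lm_solve(f, jac, x, lam=1e-3, tol=1e-10, max_iter=100):
    """Damped (Gauss-)Newton iteration with a simple lambda schedule."""
    x = np.asarray(x, dtype=float)
    r = np.asarray(f(x), dtype=float)
    for _ in range(max_iter):
        J = jac(x)
        A = J.T @ J
        # Marquardt scaling: damp by diag(J^T J) rather than the identity.
        delta = np.linalg.solve(A + lam * np.diag(np.diag(A)), -J.T @ r)
        r_new = np.asarray(f(x + delta), dtype=float)
        if r_new @ r_new < r @ r:        # residual decreased: accept the step,
            x, r = x + delta, r_new      # move toward the (Gauss-)Newton regime
            lam *= 0.5
            if np.linalg.norm(delta) < tol:
                break
        else:                            # residual grew: reject the step, damp harder
            lam *= 2.0
    return x
```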