Quasi-Newton methods
- Cons of Newton’s method: cost of evaluating the Jacobian at every iteration; tendency to diverge from many starting points
Tackling computation of Jacobian:
- Finite-difference approach: a multidimensional secant iteration; each column of the Jacobian is replaced by a difference quotient, perturbing one component at a time
J(x)e_j ≈ (f(x + δe_j) − f(x)) / δ,   j = 1, ..., n.
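A minimal sketch of the column-by-column approximation above; the function name `fd_jacobian` and the two-equation test system are illustrative choices, not from the notes:

```python
import numpy as np

def fd_jacobian(f, x, delta=1e-7):
    """Approximate J(x) one column at a time:
    J(x) e_j ~= (f(x + delta*e_j) - f(x)) / delta."""
    x = np.asarray(x, dtype=float)
    fx = np.asarray(f(x))
    J = np.empty((fx.size, x.size))
    for j in range(x.size):
        e = np.zeros(x.size)
        e[j] = 1.0  # perturb only component j
        J[:, j] = (np.asarray(f(x + delta * e)) - fx) / delta
    return J

# Toy system: f(x, y) = (x^2 + y^2 - 1, x - y)
f = lambda v: np.array([v[0]**2 + v[1]**2 - 1.0, v[0] - v[1]])
J = fd_jacobian(f, [1.0, 2.0])
# Exact Jacobian at (1, 2) is [[2, 4], [1, -1]]
```

Each column costs one extra evaluation of f, so building the full Jacobian takes n additional function evaluations per iteration.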
- Broyden’s update: approximate the Jacobian using quantities already computed in the last iteration, with no additional function evaluations
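A sketch of a Broyden iteration under the standard rank-one update B ← B + ((Δf − BΔx)Δxᵀ)/(Δxᵀ Δx); the function name and the toy system are my own choices for illustration:

```python
import numpy as np

def broyden_solve(f, x0, B0, tol=1e-10, max_iter=50):
    """Broyden's method: Newton-like steps with an approximate Jacobian B
    that is corrected by a rank-one update after each step, using only
    quantities from the last iteration (one f-evaluation per step)."""
    x = np.asarray(x0, dtype=float)
    B = np.asarray(B0, dtype=float)
    fx = f(x)
    for _ in range(max_iter):
        dx = np.linalg.solve(B, -fx)       # step with current estimate B
        x_new = x + dx
        fx_new = f(x_new)                  # the only new evaluation of f
        df = fx_new - fx
        # Rank-one correction: fix B along the step direction only
        B = B + np.outer(df - B @ dx, dx) / (dx @ dx)
        x, fx = x_new, fx_new
        if np.linalg.norm(fx) < tol:
            break
    return x

f = lambda v: np.array([v[0]**2 + v[1]**2 - 1.0, v[0] - v[1]])
B0 = np.array([[2.0, 1.0], [1.0, -1.0]])   # exact Jacobian at x0 as a warm start
root = broyden_solve(f, x0=[1.0, 0.5], B0=B0)
# root ~ (1/sqrt(2), 1/sqrt(2))
```

After the first step, B is never recomputed from scratch, which is what removes the per-iteration Jacobian cost.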
Tackling divergence:
- Levenberg’s method:
- akin to ridge regression
- damping JᵀJ by adding a small multiple of the identity, λI
- smooth transition between Newton’s method and gradient descent
- The Levenberg-Marquardt extension adds a better strategy for adjusting λ
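The damped step above can be sketched for a least-squares residual r(x): solve (JᵀJ + λI)δ = −Jᵀr, shrinking λ after an accepted step and growing it after a rejected one. The ×10 schedule and the toy problem are illustrative assumptions; Marquardt's refinement of the λ strategy is not shown:

```python
import numpy as np

def levenberg(r, J, x0, lam=1e-3, max_iter=100, tol=1e-12):
    """Minimal Levenberg loop.
    lam -> 0 recovers the Gauss-Newton step; large lam approaches a
    (scaled) gradient-descent step, giving the smooth transition."""
    x = np.asarray(x0, dtype=float)
    cost = 0.5 * np.sum(r(x)**2)
    for _ in range(max_iter):
        Jx, rx = J(x), r(x)
        # Damped normal equations: (J^T J + lam*I) step = -J^T r
        step = np.linalg.solve(Jx.T @ Jx + lam * np.eye(x.size), -Jx.T @ rx)
        x_try = x + step
        cost_try = 0.5 * np.sum(r(x_try)**2)
        if cost_try < cost:
            x, cost, lam = x_try, cost_try, lam / 10  # accept, trust Newton more
        else:
            lam *= 10                                 # reject, damp harder
        if cost < tol:
            break
    return x

# Same toy system, posed as a least-squares problem
r = lambda v: np.array([v[0]**2 + v[1]**2 - 1.0, v[0] - v[1]])
J = lambda v: np.array([[2*v[0], 2*v[1]], [1.0, -1.0]])
x = levenberg(r, J, x0=[2.0, 0.0])
```

The λI term plays the same role as the penalty in ridge regression: it keeps the linear solve well-conditioned and shortens the step when the local model is untrustworthy.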