Quasi-Newton methods

  • Cons of Newton’s method: the Jacobian must be evaluated at every iteration, and the iteration tends to diverge from many starting points

Tackling the computation of the Jacobian:

  • Finite difference approach: a multidimensional secant iteration; each column of the Jacobian is replaced by a difference quotient, perturbing one component of \(\mathbf{x}\) at a time (see the formula and sketch below)

\[ \mathbf{J}(\mathbf{x})\,\mathbf{e}_j \approx \frac{\mathbf{f}(\mathbf{x} + \delta \mathbf{e}_j) - \mathbf{f}(\mathbf{x})}{\delta}, \quad j = 1, \dots, n. \]
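
Below is a minimal NumPy sketch of this forward-difference approximation. The function name `fd_jacobian`, the step size `delta`, and the test function are illustrative assumptions, not part of the notes:

```python
import numpy as np

def fd_jacobian(f, x, delta=1e-7):
    """Forward-difference Jacobian: perturb one coordinate e_j at a
    time, costing n extra evaluations of f per Jacobian."""
    x = np.asarray(x, dtype=float)
    fx = f(x)
    J = np.empty((fx.size, x.size))
    for j in range(x.size):
        x_step = x.copy()
        x_step[j] += delta                  # x + delta * e_j
        J[:, j] = (f(x_step) - fx) / delta  # j-th column of J
    return J

# Hypothetical test: f(x, y) = (x^2 + y - 1, x - y^2)
f = lambda v: np.array([v[0]**2 + v[1] - 1.0, v[0] - v[1]**2])
print(fd_jacobian(f, [1.0, 2.0]))  # exact Jacobian: [[2, 1], [1, -4]]
```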

  • Broyden’s update: approximates the Jacobian using only quantities already computed in the previous iteration, so it requires no additional function evaluations (see the rank-one update below)
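
For reference, the standard (“good”) Broyden rank-one update, with \(\Delta\mathbf{x} = \mathbf{x}_{k+1} - \mathbf{x}_k\) and \(\Delta\mathbf{f} = \mathbf{f}(\mathbf{x}_{k+1}) - \mathbf{f}(\mathbf{x}_k)\):

\[ \mathbf{J}_{k+1} = \mathbf{J}_k + \frac{(\Delta\mathbf{f} - \mathbf{J}_k\,\Delta\mathbf{x})\,\Delta\mathbf{x}^\top}{\Delta\mathbf{x}^\top \Delta\mathbf{x}}. \]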

Tackling divergence:

  • Levenberg’s method:
    • akin to ridge regression
    • tempers the squared Jacobian \(\mathbf{J}^\top\mathbf{J}\) by adding a small constant \(\lambda\) to its diagonal
    • gives a smooth transition between Newton’s method (\(\lambda \to 0\)) and gradient descent (large \(\lambda\))
    • the Levenberg-Marquardt extension has a superior strategy for adapting \(\lambda\) between iterations (see the sketch after this list)
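
A minimal sketch of a Levenberg-style iteration for solving \(\mathbf{f}(\mathbf{x}) = \mathbf{0}\), treating \(\mathbf{f}\) as a residual. The doubling/halving schedule for \(\lambda\), the helper names, and the test problem are assumptions for illustration; Levenberg-Marquardt replaces this crude schedule with a more careful one (and damps by the diagonal of \(\mathbf{J}^\top\mathbf{J}\) rather than the identity):

```python
import numpy as np

def levenberg_step(r, J, lam):
    """Solve the damped normal equations (J^T J + lam * I) delta = -J^T r.
    Large lam ~ a short gradient-descent step; lam -> 0 ~ the Newton step."""
    JtJ = J.T @ J
    return np.linalg.solve(JtJ + lam * np.eye(JtJ.shape[0]), -J.T @ r)

def levenberg(f, jac, x, lam=1e-3, tol=1e-10, max_iter=100):
    """Crude lambda schedule: halve lam after a successful step,
    double it after a failed one."""
    x = np.asarray(x, dtype=float)
    for _ in range(max_iter):
        r = f(x)
        if np.linalg.norm(r) < tol:
            break
        x_new = x + levenberg_step(r, jac(x), lam)
        if np.linalg.norm(f(x_new)) < np.linalg.norm(r):
            x, lam = x_new, 0.5 * lam   # accepted: trust the model more
        else:
            lam *= 2.0                  # rejected: damp harder
    return x

# Hypothetical test problem: roots of f(x, y) = (x^2 + y - 1, x - y^2)
f = lambda v: np.array([v[0]**2 + v[1] - 1.0, v[0] - v[1]**2])
jac = lambda v: np.array([[2 * v[0], 1.0], [1.0, -2 * v[1]]])
print(levenberg(f, jac, [2.0, 2.0]))
```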