Model fitting by least squares

Next: The meaning of the Up: KRYLOV SUBSPACE ITERATIVE METHODS Previous: Sign convention

## Method of random directions and steepest descent

Let us minimize the sum of the squares of the components of the residual vector given by

 (52) (53)

A contour plot is based on an altitude function of space. The altitude is the dot product . By finding the lowest altitude, we are driving the residual vector as close as we can to zero. If the residual vector reaches zero, then we have solved the simultaneous equations . In a two-dimensional world the vector has two components, . A contour is a curve of constant in -space. These contours have a statistical interpretation as contours of uncertainty in , with measurement errors in .

Let us see how a random search-direction can be used to reduce the residual . Let be an abstract vector with the same number of components as the solution , and let contain arbitrary or random numbers. We add an unknown quantity of vector to the vector , and thereby create :

 (54)

This gives a new residual:
 (55) (56) (57)

which defines .

Next we adjust to minimize the dot product:

 (58)

Set to zero its derivative with respect to using the chain rule
 (59)

which says that the new residual is perpendicular to the fitting function'' . Solving gives the required value of .
 (60)

A computation template'' for the method of random directions is



iterate {

}

A nice thing about the method of random directions is that you do not need to know the adjoint operator .

In practice, random directions are rarely used. It is more common to use the gradient direction than a random direction. Notice that a vector of the size of is

 (61)

Notice also that this vector can be found by taking the gradient of the size of the residuals:
 (62)

Choosing to be the gradient vector is called the method of steepest descent.''

Starting from a model (which may be zero), below is a template of pseudocode for minimizing the residual by the steepest-descent method:



iterate {

}


 Model fitting by least squares

Next: The meaning of the Up: KRYLOV SUBSPACE ITERATIVE METHODS Previous: Sign convention

2011-07-17