next up previous [pdf]

Next: Conjugate direction Up: KRYLOV SUBSPACE ITERATIVE METHODS Previous: Null space and iterative

Why steepest descent is so slow

Before we can understand why the conjugate-direction method is so fast, we need to see why the steepest-descent method is so slow. Imagine yourself sitting on the edge of a circular bowl. If you jump off the rim, you'll slide straight to the bottom at the center. Now imagine an ellipsoidal bowl of very large ellipticity. As you jump off the rim, you'll first move in the direction of the gradient. This is not towards the bottom at the center of the ellipse (unless you were sitting on the major or minor axis).

We can formalize the situation. A parametric equation for a line is $\bold x=\bold x_{\rm old} +\alpha \Delta \bold x$ where $\alpha$ is the parameter for moving on the line. The process of selecting $\alpha$ is called ``line search'', but for a linear problem like the one we have chosen here, we hardly recognize choosing $\alpha$ as searching a line. A more graphic understanding of the whole process is possible from considering a two-dimensional space where the vector of unknowns $\bold x$ has just two components, $x_1$ and $x_2$. Then the size of the residual vector $\bold r \cdot \bold r$ can be displayed with a contour plot in the plane of $(x_1 , x_2 )$. Visualize a contour map of a mountainous terrain. The gradient is perpendicular to the contours. Contours and gradients are curved lines. When we use the steepest-descent method we start at a point and compute the gradient direction at that point. Then we begin a straight-line descent in that direction. The gradient direction curves away from our direction of travel, but we continue on our straight line until we have stopped descending and are about to ascend. There we stop, compute another gradient vector, turn in that direction, and descend along a new straight line. The process repeats until we get to the bottom, or until we get tired.

What could be wrong with such a direct strategy? The difficulty is at the stopping locations. These occur where the descent direction becomes parallel to the contour lines. (There the path becomes level.) So after each stop, we turn $90^\circ$, from parallel to perpendicular to the local contour line for the next descent. What if the final goal is at a $45^\circ$ angle to our path? A $45^\circ$ turn cannot be made. Instead of moving like a rain drop down the centerline of a rain gutter, we move along a fine-toothed zigzag path, crossing and recrossing the centerline. The gentler the slope of the rain gutter, the finer the teeth on the zigzag path.

1


next up previous [pdf]

Next: Conjugate direction Up: KRYLOV SUBSPACE ITERATIVE METHODS Previous: Null space and iterative

2011-07-17