Next: Measuring success Up: KRYLOV SUBSPACE ITERATIVE METHODS Previous: A basic solver program

The modeling success and the solver success

Every time we run a data modeling program we have access to two publishable numbers $1-\vert\bold r\vert/\vert\bold d\vert$ and $1-\vert\bold F\T\bold r\vert/\vert\bold F\T\bold d\vert$. The first says how well the model fits the data. The second says how well we did the job of finding the best-fitting model.

Define the residual $\bold r=\bold F\bold m-\bold d$ and the ``size'' of any vector, such as the data vector, as $\vert\bold d\vert=\sqrt{\bold d \cdot \bold d}$. The number $1-\vert\bold r\vert/\vert\bold d\vert$ will be called the ``modeling success at fitting data.'' (When the data fitting includes a residual weighting function, it should be incorporated in $\bold F$ and $\bold d$.)
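As a minimal sketch of these definitions, the modeling success can be computed directly from $\bold F$, $\bold m$, and $\bold d$. The small operator `F`, the data `d`, and the helper names `matvec`, `norm`, and `modeling_success` are hypothetical, chosen only for illustration.

```python
import math

def matvec(F, m):
    """Apply the operator F (a list of rows) to the model vector m."""
    return [sum(f * x for f, x in zip(row, m)) for row in F]

def norm(v):
    """Size of a vector: sqrt(v . v)."""
    return math.sqrt(sum(x * x for x in v))

def modeling_success(F, m, d):
    """Return 1 - |r|/|d| with the residual r = F m - d."""
    r = [a - b for a, b in zip(matvec(F, m), d)]
    return 1.0 - norm(r) / norm(d)

# Hypothetical toy problem: 3 data values, 2 model parameters.
F = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
d = [1.0, 2.0, 3.0]

# Zero model: r = -d, so the modeling success is 0.
print(modeling_success(F, [0.0, 0.0], d))
# A model that fits these data exactly: r = 0, so the success is 1.
print(modeling_success(F, [1.0, 2.0], d))
```

A residual weighting function, if present, would be folded into `F` and `d` before calling `modeling_success`.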

While the modeling success is of interest to everyone, the second number $1-\vert\bold F\T\bold r\vert/\vert\bold F\T\bold d\vert$, to be called the ``solver success at achieving goals,'' is of more interest to us, the analysts. It tells us to what extent our program has achieved our stated goals of data fitting (and regularization). Experience seeing this number may give us guidance to where opportunities have been missed and where more work might be worthwhile. In applications of low dimensionality we can normally drive the solver success to unity. Conjugate-gradient theory says that, given infinite-precision arithmetic, the iteration should converge to the exact solution in a number of iterations equal to the number of unknowns. In reality, however, in geophysics we often find ourselves iterating on applications far too large to run to completion. The solver success number tells us how well we are doing.

There are three ways to understand the important expression $\bold F\T\bold r$. First, it is the residual $\bold r$, originally in data space, transformed to model space by $\bold F\T$. Second, it is (up to a factor of 2) the gradient of the penalty function $\bold r \cdot \bold r$, namely $d(\bold r \cdot \bold r)/d\bold m = 2\,\bold F\T\bold r$. This gradient vanishes as the best fit is found. The statement that this gradient vanishes is the ``normal equations''

\begin{displaymath}
\bold 0 \eq
\bold F\T \bold r
\eq \bold F\T(\bold F\bold m-\bold d)
\eq (\bold F\T\bold F) \bold m - \bold F\T\bold d
\end{displaymath} (83)

The third way to understand the gradient $\bold F\T\bold r$ is that it is called $\Delta \bold m$ in many solver programs because of its role in building an update for the model $\bold m$.
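The gradient interpretation can be checked numerically. The sketch below compares a central finite-difference gradient of the penalty $\bold r \cdot \bold r$ with $2\,\bold F\T\bold r$ (the factor of 2 comes from differentiating the quadratic). The toy `F`, `d`, trial model `m`, and the helper names are assumptions made for illustration.

```python
def matvec(F, m):
    """Forward operator: data-space vector F m (F is a list of rows)."""
    return [sum(f * x for f, x in zip(row, m)) for row in F]

def rmatvec(F, r):
    """Adjoint operator: model-space vector F^T r."""
    return [sum(F[i][j] * r[i] for i in range(len(F)))
            for j in range(len(F[0]))]

def penalty(F, m, d):
    """Quadratic penalty r . r with r = F m - d."""
    r = [a - b for a, b in zip(matvec(F, m), d)]
    return sum(x * x for x in r)

# Hypothetical toy problem and an arbitrary trial model.
F = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
d = [1.0, 2.0, 3.0]
m = [0.5, -1.0]

# Residual in data space, then carried to model space by the adjoint.
r = [a - b for a, b in zip(matvec(F, m), d)]
g = rmatvec(F, r)

# Central finite differences of the penalty, one model component at a time.
eps = 1e-6
fd = []
for j in range(len(m)):
    m_plus = list(m)
    m_plus[j] += eps
    m_minus = list(m)
    m_minus[j] -= eps
    fd.append((penalty(F, m_plus, d) - penalty(F, m_minus, d)) / (2 * eps))

# The two columns should agree to within finite-difference rounding.
for j in range(len(m)):
    print(j, fd[j], 2 * g[j])
```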

The progress of a solver can be observed by a plot of $\vert\Delta\bold m\vert = \sqrt{\Delta\bold m \cdot \Delta\bold m }$ versus iteration. Ideally it converges to zero. Unfortunately the amplitude axis of this plot scales with the units of model space. We should nondimensionalize it. Then we could make simple statements like, ``We iterated 90% of the way to the solution.''

Starting from a zero model $\bold m=\bold 0$ the first residual is $\bold r = -\bold d$. Its size in model space is $\vert\bold F\T\bold d\vert$ (which also happens to be our first estimated model). After many iterations the residual in model space is $\vert\bold F\T\bold r\vert$. The ratio $\vert\bold F\T\bold r\vert/\vert\bold F\T\bold d\vert$ progresses towards zero as we progress. The degree of success our solver is having is $1-\vert\bold F\T\bold r\vert/\vert\bold F\T\bold d\vert$. We could say, ``Starting from our first model, we iterated until we had gone $100\times (1- \vert\bold F\T\bold r\vert/\vert\bold F\T\bold d\vert)$% of the way to the ultimate model $\bold m$.'' More simply, $1- \vert\bold F\T\bold r\vert/\vert\bold F\T\bold d\vert$ is the ``solver success.''
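To see the solver success evolve with iteration, here is a minimal sketch that records $1-\vert\bold F\T\bold r\vert/\vert\bold F\T\bold d\vert$ after each step of a conjugate-gradient least-squares iteration started from the zero model. It is a generic textbook-style loop written for illustration, not the solver program of this book; the function name `cgls_with_success` and the toy `F` and `d` are hypothetical, and the sketch assumes $\bold F\T\bold d \ne \bold 0$.

```python
import math

def matvec(F, m):
    """Forward operator: F m (F is a list of rows)."""
    return [sum(f * x for f, x in zip(row, m)) for row in F]

def rmatvec(F, r):
    """Adjoint operator: F^T r."""
    return [sum(F[i][j] * r[i] for i in range(len(F)))
            for j in range(len(F[0]))]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def cgls_with_success(F, d, niter):
    """Conjugate-gradient least squares from m = 0.

    Returns the model and the solver success after each iteration
    (entry 0 is the starting value, before any iteration).
    """
    m = [0.0] * len(F[0])
    r = [-x for x in d]                # r = F m - d with m = 0
    g = rmatvec(F, r)                  # model-space residual F^T r
    g0 = norm(g)                       # |F^T d|, the normalizing size
    p = [-x for x in g]                # first search direction
    gamma = sum(x * x for x in g)
    history = [1.0 - norm(g) / g0]
    for _ in range(niter):
        if gamma == 0.0:               # gradient vanished: nothing left to do
            break
        q = matvec(F, p)
        alpha = gamma / sum(x * x for x in q)
        m = [mi + alpha * pi for mi, pi in zip(m, p)]
        r = [ri + alpha * qi for ri, qi in zip(r, q)]
        g = rmatvec(F, r)
        gamma_new = sum(x * x for x in g)
        history.append(1.0 - math.sqrt(gamma_new) / g0)
        beta = gamma_new / gamma
        p = [-gi + beta * pi for gi, pi in zip(g, p)]
        gamma = gamma_new
    return m, history

# Hypothetical toy problem with two unknowns, so two iterations suffice.
F = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]
d = [1.0, 2.0, 3.0]

m, history = cgls_with_success(F, d, niter=2)
for k, s in enumerate(history):
    print("iteration", k, "solver success", s)
```

With two unknowns the success starts at zero and reaches unity, to rounding error, after two iterations. On applications too large to run to completion, the same `history` would stop short of unity and report how far the iteration got.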

The ``modeling success at fitting data'' is the number $1-\vert\bold r\vert/\vert\bold d\vert$. The ``solver success at achieving goals'' is the number $1-\vert\bold F\T\bold r\vert/\vert\bold F\T\bold d\vert$.



2011-07-17