
CODE FOR THE REGULARIZED SOLVER

In Chapter [*] we defined linear interpolation as the extraction of values from between mesh points. In a typical setup (occasionally the roles of data and model are swapped), a model is given on a uniform mesh, and we solve the easy problem of extracting values between the mesh points with subroutine lint1(). The genuine problem is the inverse problem, which we attack here. Data values are sprinkled all around, and we wish to find a function on a uniform mesh from which we can extract that data by linear interpolation. The adjoint operator for subroutine lint1() simply piles data back into its proper location in model space without regard to how many data values land in each region. Thus, some locations may receive much data while other locations get none. We could interpolate by minimizing the energy in the model gradient, in the second derivative of the model, or in the output of any other roughening filter applied to the model.
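To make the forward and adjoint roles concrete, here is a minimal sketch of 1-D linear interpolation and its adjoint. It is illustrative only, not the actual lint1() code: the function name, the argument list, and the toy values in main() are invented for this example.

#include <math.h>
#include <stdio.h>

/* Forward (adj=0): pull data values from between mesh points.
 * Adjoint (adj=1): pile data values back onto the two nearest mesh points,
 * regardless of how many data values land in each interval. */
void lint1_sketch(int adj, int n1, float o1, float d1, /* uniform model mesh */
                  int nd, const float *coord,          /* data coordinates */
                  float *mm, float *dd)                /* model and data */
{
    int id, im;
    float f, w;

    for (id = 0; id < nd; id++) {
        f = (coord[id] - o1) / d1;              /* fractional mesh position */
        im = (int) floorf(f);                   /* left neighbor */
        if (im < 0 || im >= n1 - 1) continue;   /* skip points off the mesh */
        w = f - im;                             /* weight of the right neighbor */

        if (adj) {                              /* data -> model */
            mm[im]   += (1.f - w) * dd[id];
            mm[im+1] +=        w  * dd[id];
        } else {                                /* model -> data */
            dd[id] += (1.f - w) * mm[im] + w * mm[im+1];
        }
    }
}

int main(void)
{
    float mm[6] = {0.f, 1.f, 2.f, 3.f, 4.f, 5.f};  /* model on a unit mesh */
    float coord[3] = {0.2f, 2.5f, 4.8f};           /* scattered data locations */
    float dd[3] = {0.f, 0.f, 0.f};
    int id;

    lint1_sketch(0, 6, 0.f, 1.f, 3, coord, mm, dd);   /* forward extraction */
    for (id = 0; id < 3; id++) printf("d[%d] = %g\n", id, dd[id]);
    return 0;
}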

Formalizing now our wish that data $\bold d$ be extractable by linear interpolation $\bold F$ from a model $\bold m$, and our wish that the output of a roughening operator $\bold A$ applied to the model have minimum energy, we write the fitting goals:

\begin{displaymath}
\begin{array}{lll}
\bold 0 &\approx & \bold F \bold m - \bold d \\
\bold 0 &\approx & \bold A \bold m
\end{array}\end{displaymath} (14)

Suppose we take the roughening filter to be the second difference operator $(1,-2,1)$ scaled by a constant $\epsilon$, and suppose we have a data point near each end of the model and a third data point exactly in the middle. Then, for a model space 6 points long, the fitting goal could look like:
\begin{displaymath}
\left[
\begin{array}{rrrrrr}
 .8 &  .2 &   . &   . &   . &   . \\
  . &   . &   1 &   . &   . &   . \\
  . &   . &   . &   . &  .5 &  .5 \\
 \epsilon & -2\epsilon & \epsilon & . & . & . \\
 . & \epsilon & -2\epsilon & \epsilon & . & . \\
 . & . & \epsilon & -2\epsilon & \epsilon & . \\
 . & . & . & \epsilon & -2\epsilon & \epsilon
\end{array}
\right]
\left[
\begin{array}{c}
m_1 \\ m_2 \\ m_3 \\ m_4 \\ m_5 \\ m_6
\end{array}
\right]
\;-\;
\left[
\begin{array}{c}
d_1 \\ d_2 \\ d_3 \\ 0 \\ 0 \\ 0 \\ 0
\end{array}
\right]
\;=\;
\left[
\begin{array}{c}
\bold r_d \\[1ex]
\bold r_m
\end{array}
\right]
\quad \approx \quad \bold 0
\end{displaymath} (15)

The residual vector has two parts, a data part $\bold r_d$ on top and a model part $\bold r_m$ on the bottom. The data residual should vanish except where contradictory data values happen to lie in the same place. The model residual is the roughened model.

Finding something unexpected is good science and engineering. We should look in both data space and model space. In data space, we look at the residual $\bold r$. In model space, we look at the residual projected there, $\Delta \bold m = \bold F\T \bold r_d + \bold A\T\bold r_m$. After iterating to completion, we have $\Delta \bold m = \bold 0 = \bold F\T \bold r_d + \bold A\T\bold r_m$, a sum of two images identical but for polarity. This image tells us what we have learned from the data and how the model differs from what we thought it would be.

After all the definitions, we load the negative of the data into the residual. If a starting model $\bold m_0$ is present, we update the data part of the residual, $\bold r_d=\bold F \bold m_0 - \bold d$, and we load the model part of the residual, $ \bold r_m = \bold A \bold m_0$. Otherwise, we begin from a zero model $\bold m_0=\bold 0$, and thus the model part of the residual $\bold r_m$ is also zero. After this initialization, subroutine solver_reg() begins an iteration loop by first computing the proposed model perturbation $\Delta \bold m$ (called g in the program) with the adjoint operator:

\begin{displaymath}
\Delta \bold m
\quad\longleftarrow\quad
\left[
\begin{array}{cc}
\bold F\T & \bold A\T
\end{array}
\right]
\left[
\begin{array}{c}
\bold r_d \\
\bold r_m
\end{array}
\right]
\end{displaymath} (16)

Using this value of $\Delta \bold m$, we can find the implied change in residual $\Delta\bold r$ as:
\begin{displaymath}
\Delta
\left[
\begin{array}{c}
\bold r_d \\
\bold r_m
\end{array}
\right]
\quad\longleftarrow\quad
\left[
\begin{array}{c}
\bold F \\
\bold A
\end{array}
\right]
\;
\Delta \bold m
\end{displaymath} (17)

and the last thing in the loop is to use the optimization stepping function stepper() to decide the step length and how much of the previous step to include.
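The stripped-down sketch below puts these steps together for the small example of equation (15). It is not the real sf_solver_reg(): the operators are hard-wired as one dense stacked matrix, the data values are made up, and plain steepest descent stands in for the conjugate-direction stepper sf_cgstep(). Only the loop structure (residual, model perturbation $\Delta \bold m$, residual change $\Delta \bold r$, step) is meant to mirror the description above.

#include <stdio.h>

#define NM 6    /* model size */
#define NR 7    /* stacked residual: 3 data rows + 4 roughening rows */

int main(void)
{
    const float eps = 1.f;                       /* regularization weight */
    float G[NR][NM] = {                          /* stacked operator [F; eps A] */
        {.8f, .2f, 0, 0, 0, 0},                  /* linear interpolation rows */
        {0, 0, 1, 0, 0, 0},
        {0, 0, 0, 0, .5f, .5f},
        {eps, -2*eps, eps, 0, 0, 0},             /* scaled second difference */
        {0, eps, -2*eps, eps, 0, 0},
        {0, 0, eps, -2*eps, eps, 0},
        {0, 0, 0, eps, -2*eps, eps}};
    float d[NR] = {1.f, 2.f, 1.5f, 0, 0, 0, 0};  /* made-up data, zero padded */
    float m[NM] = {0.f}, r[NR], dm[NM], dr[NR];
    int i, j, iter;

    for (i = 0; i < NR; i++) r[i] = -d[i];       /* r = G m - d with m = 0 */

    for (iter = 0; iter < 50; iter++) {
        float num = 0.f, den = 0.f, alpha;

        for (j = 0; j < NM; j++) {               /* perturbation: dm = G' r */
            dm[j] = 0.f;
            for (i = 0; i < NR; i++) dm[j] += G[i][j] * r[i];
        }
        for (i = 0; i < NR; i++) {               /* residual change: dr = G dm */
            dr[i] = 0.f;
            for (j = 0; j < NM; j++) dr[i] += G[i][j] * dm[j];
        }
        for (i = 0; i < NR; i++) {               /* step minimizing |r + alpha dr|^2 */
            num += dr[i] * r[i];
            den += dr[i] * dr[i];
        }
        if (den <= 0.f) break;
        alpha = -num / den;

        for (j = 0; j < NM; j++) m[j] += alpha * dm[j];   /* update model */
        for (i = 0; i < NR; i++) r[i] += alpha * dr[i];   /* and residual */
    }

    for (j = 0; j < NM; j++) printf("m[%d] = %g\n", j, m[j]);
    return 0;
}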

An example of using the new solver is subroutine invint1(). I chose to implement the model roughening operator $\bold A$ with the convolution subroutine tcai1(), which has transient end effects (and an output length equal to the input length plus the filter length). The adjoint of subroutine tcai1() suggests perturbations in the convolution input (not the filter).
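To make the transient-convolution behavior concrete, here is an illustrative sketch in the spirit of tcai1(); it is not the actual tcai1_lop() code, and the function name, argument list, and toy input in main() are invented. The output array is simply sized to input length plus filter length, the same count (n1+na) that invint1() passes to the solver below.

#include <stdio.h>

/* Transient convolution y = a * x and its adjoint.  The output is longer
 * than the input, and the adjoint perturbs the convolution input xx,
 * never the filter aa. */
void tcai1_sketch(int adj, int na, const float *aa,   /* filter */
                  int nx, float *xx,                  /* model (input) */
                  float *yy)                          /* data (output), nx+na long */
{
    int ia, ix;

    for (ia = 0; ia < na; ia++) {
        for (ix = 0; ix < nx; ix++) {
            if (adj) xx[ix]      += aa[ia] * yy[ia + ix];   /* adjoint: y -> x */
            else     yy[ia + ix] += aa[ia] * xx[ix];        /* forward: x -> y */
        }
    }
}

int main(void)
{
    float aa[3] = {1.f, -2.f, 1.f};                   /* second-difference filter */
    float xx[6] = {0.f, 1.f, 4.f, 9.f, 16.f, 25.f};   /* a smooth test input */
    float yy[9] = {0.f};                              /* nx + na outputs, start at zero */
    int iy;

    tcai1_sketch(0, 3, aa, 6, xx, yy);                /* forward convolution */
    for (iy = 0; iy < 9; iy++) printf("y[%d] = %g\n", iy, yy[iy]);
    return 0;
}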

user/gee/invint1.c
void invint1(int niter                  /* number of iterations */,
	     int nd                     /* data size */,
	     float *coord               /* data coordinates */, 
	     const float *dd            /* data values */, 
	     int n1, float o1, float d1 /* model grid */, 
	     int na, const float *aa    /* filter */, 
	     float *mm                  /* estimated model */, 
	     float eps                  /* regularization */)
/*< inverse interpolation >*/
{
    lint1_init( o1, d1, coord); /* interpolation */
    tcai1_init( na, aa);         /* filtering */

    sf_solver_reg(lint1_lop, sf_cgstep, tcai1_lop, 
		  n1+na, n1, nd, mm, dd, niter, eps);
    sf_cgstep_close( );
}
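For orientation, a call corresponding to the Figure 11 experiment might look roughly like the fragment below. The coordinates, data values, and iteration count are placeholders invented for illustration; the published figure is produced by a Madagascar program that reads its inputs from files.

/* hypothetical driver, compiled and linked against the same library as
 * invint1.c above */
void invint1_demo(void)
{
    float coord[3] = {0.2f, 2.5f, 4.6f};   /* scattered data coordinates */
    float dd[3]    = {1.0f, 2.0f, 1.5f};   /* data values (made up) */
    float aa[3]    = {1.f, -2.f, 1.f};     /* (1,-2,1) roughening filter */
    float mm[6];                           /* estimated model comes back here */

    invint1(20,            /* niter: a modest number of iterations */
            3, coord, dd,  /* nd, coordinates, values */
            6, 0.f, 1.f,   /* model grid: n1, o1, d1 */
            3, aa,         /* na, filter */
            mm, 1.f);      /* model output, eps = 1 as in Figure 11 */
}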

Figure 11 shows an example for a $(1,-2,1)$ filter with $\epsilon = 1$. The continuous curve representing the model $\bold m$ passes through the data points. Because the model is computed with transient convolution end effects, it tends to damp linearly to zero outside the region where data samples are given.

Figure 11: Sample points and estimation of a continuous function through them.

To show an example where the result is clearly a theoretical answer, I prepared another figure with the simpler filter $\bold A = (1,-1)$. When we minimize the energy in the first derivative of the waveform, the residual distributes itself uniformly between data points, so the solution there is a straight line. Theoretically it should be a straight line, because a straight line has a vanishing second derivative, and that condition arises by differentiating the minimized quadratic form $\bold x\T \bold A\T\bold A \bold x$ with respect to $\bold x\T$ and getting $\bold A\T\bold A \bold x=\bold 0$. (By this logic, the curves between data points in Figure 11 must be cubics.) The $(1,-1)$ result is shown in Figure 12.
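Spelling that step out: at mesh points untouched by any data coordinate, the rows of $\bold F\T\bold F$ and $\bold F\T\bold d$ in the normal equations vanish, leaving only the regularization term, so

\begin{displaymath}
\frac{\partial}{\partial \bold x\T}
\left( \bold x\T \bold A\T \bold A\, \bold x \right)
\;=\; 2\, \bold A\T \bold A\, \bold x \;=\; \bold 0 ,
\qquad
\left( \bold A\T \bold A\, \bold x \right)_i
\;=\; - x_{i-1} + 2\, x_i - x_{i+1} \;=\; 0
\quad\hbox{for}\quad \bold A = (1,-1),
\end{displaymath}

which is the vanishing second difference that makes the solution linear between data points; the same computation with $\bold A = (1,-2,1)$ gives a vanishing fourth difference there, hence the cubics of Figure 11.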

Figure 12: The same data samples and a function through them that minimizes the energy in the first derivative.

The example of Figure 12 has been a useful test case for me. You will see it again in later chapters. What I would like to show you here is a movie showing the convergence to Figure 12. Convergence occurs rapidly where data points are close together. The large gaps, however, fill at a rate of one point per iteration.


