Preconditioning
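First I remind you of a rarely used little bit of mathematical notation.
Given a vector $\mathbf{m}$ with components $(m_1, m_2, m_3)$,
the notation $\mathbf{diag}\,\mathbf{m}$ means
$$
\mathbf{diag}\,\mathbf{m} \;=\;
\begin{bmatrix} m_1 & 0 & 0 \\ 0 & m_2 & 0 \\ 0 & 0 & m_3 \end{bmatrix}
\tag{38}
$$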
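As a small illustration of the notation, here is a NumPy sketch. The component values are arbitrary choices for the example, not taken from the text. It also shows that applying $\mathbf{diag}\,\mathbf{m}$ to a vector is the same as elementwise multiplication, which is how a diagonal weight would normally be applied in practice.

```python
import numpy as np

# Arbitrary example components (an assumption for illustration only).
m = np.array([2.0, 3.0, 5.0])

# diag m: a square matrix with the components of m on the main diagonal.
D = np.diag(m)

# Applying diag m to a vector equals elementwise multiplication by m,
# so the matrix never has to be formed explicitly.
x = np.array([1.0, 10.0, 100.0])
Dx = D @ x
```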
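Given the usual linearized fitting goal between
data space and model space,
$\mathbf{d} \approx \mathbf{F}\mathbf{m}$,
the simplest image of the model space results from
application of the adjoint operator,
$\hat{\mathbf{m}} = \mathbf{F}'\mathbf{d}$.
Unless $\mathbf{F}$ has no physical units, however,
the physical units of $\hat{\mathbf{m}}$
do not match those of $\mathbf{m}$,
so we need a scaling factor.
The theoretical solution
$\mathbf{m}_{\rm theor} = (\mathbf{F}'\mathbf{F})^{-1}\mathbf{F}'\mathbf{d}$
tells us that the scaling units should be those of
$(\mathbf{F}'\mathbf{F})^{-1}$.
We are going to approximate
$(\mathbf{F}'\mathbf{F})^{-1}$
by a diagonal matrix $\mathbf{W}^2$
with the correct units so
$\hat{\mathbf{m}} = \mathbf{W}^2\mathbf{F}'\mathbf{d}$.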
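What we use for $\mathbf{W}^2$
will be a guess, simply a guess.
If it works better than nothing, we'll be happy,
and if it doesn't we'll forget about it.
Experience shows it is a good idea to try.
Common sense tells us to insist that all elements of
$\mathbf{W}^2$ are positive.
$\mathbf{W}^2$ is a square matrix the size of model space.
From any vector $\tilde{\mathbf{m}}$
in model space with all positive components,
we could guess that $\mathbf{W}^2$
be $\mathbf{diag}\,\tilde{\mathbf{m}}$
to any power.
To get the right physical dimensions we choose
$\tilde{\mathbf{m}} = \mathbf{1}$, a vector of all ones, and choose
$$
\mathbf{W}^2 \;=\; \frac{1}{\mathbf{diag}\,(\mathbf{F}'\mathbf{F}\,\mathbf{1})}
\tag{39}
$$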
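The scaled adjoint $\hat{\mathbf{m}} = \mathbf{W}^2\mathbf{F}'\mathbf{d}$ with $\mathbf{W}^2 = 1/\mathbf{diag}(\mathbf{F}'\mathbf{F}\mathbf{1})$ can be sketched in a few lines of NumPy. The operator `F`, its sizes, and the column scales below are hypothetical stand-ins chosen only for illustration; `F` is kept positive so that every element of the weight is positive.

```python
import numpy as np

rng = np.random.default_rng(0)
nd, nm = 8, 5                      # data-space and model-space sizes (assumed)

# Hypothetical positive operator whose columns have very different scales.
col_scale = np.array([1.0, 10.0, 0.1, 5.0, 100.0])
F = rng.uniform(0.5, 1.5, size=(nd, nm)) * col_scale

ones = np.ones(nm)                 # the all-ones model-space vector
w2 = 1.0 / (F.T @ (F @ ones))      # diagonal of W^2 = 1 / diag(F'F 1)

def scaled_adjoint(d):
    """Return W^2 F' d, the adjoint image rescaled to model units."""
    return w2 * (F.T @ d)

# Sanity check: data made from the all-ones model is mapped back to ones,
# which is exactly what the definition of W^2 enforces.
m_back = scaled_adjoint(F @ ones)
```

By construction the weight is exact only for the all-ones model; for any other model it is, as the text says, simply a guess.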
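To go beyond the scaled adjoint we can use
$\mathbf{W}$ as a preconditioner.
To use $\mathbf{W}$ as a preconditioner
we define implicitly a new set of variables $\mathbf{p}$
by the substitution
$\mathbf{m} = \mathbf{W}\mathbf{p}$.
Then
$\mathbf{d} \approx \mathbf{F}\mathbf{m} = \mathbf{F}\mathbf{W}\mathbf{p}$.
To find $\mathbf{p}$ instead of $\mathbf{m}$,
we iterate with the operator $\mathbf{F}\mathbf{W}$
instead of with $\mathbf{F}$.
As usual, the first step of the iteration is to use the adjoint
of $\mathbf{d} \approx \mathbf{F}\mathbf{W}\mathbf{p}$
to form the image
$\hat{\mathbf{p}} = (\mathbf{F}\mathbf{W})'\mathbf{d}$.
At the end of the iterations,
we convert from $\mathbf{p}$ back to $\mathbf{m}$
with $\mathbf{m} = \mathbf{W}\mathbf{p}$.
The result after the first iteration,
$\hat{\mathbf{m}} = \mathbf{W}\hat{\mathbf{p}} = \mathbf{W}(\mathbf{F}\mathbf{W})'\mathbf{d} = \mathbf{W}^2\mathbf{F}'\mathbf{d}$,
turns out to be the same as scaling.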
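The change of variables $\mathbf{m} = \mathbf{W}\mathbf{p}$ can be checked numerically. The sketch below uses a hypothetical random positive operator and random data, and it ignores any step-length scalar that a particular iterative solver would apply in its first step. It confirms that the adjoint image of the preconditioned problem, mapped back to model space, equals the scaled adjoint $\mathbf{W}^2\mathbf{F}'\mathbf{d}$. As a stand-in for iterating to convergence it calls `numpy.linalg.lstsq`, which is a substitution made here for brevity and not the iterative method of the text.

```python
import numpy as np

rng = np.random.default_rng(1)
nd, nm = 8, 5                                # assumed sizes
F = rng.uniform(0.1, 1.0, size=(nd, nm))     # hypothetical positive operator
d = rng.normal(size=nd)                      # hypothetical data

w2 = 1.0 / (F.T @ (F @ np.ones(nm)))         # W^2 = 1 / diag(F'F 1)
w = np.sqrt(w2)                              # diagonal of W

FW = F * w                                   # F W: column j of F scaled by w[j]

p_hat = FW.T @ d                             # adjoint image in the p variables
m_first = w * p_hat                          # back to model space: W (F W)' d
m_scaled = w2 * (F.T @ d)                    # scaled adjoint: W^2 F' d

# Fully converged answers: solve for p with F W, then convert with m = W p.
p_ls, *_ = np.linalg.lstsq(FW, d, rcond=None)
m_ls, *_ = np.linalg.lstsq(F, d, rcond=None)
m_from_p = w * p_ls
```

For a full-column-rank overdetermined `F`, the converged solution is the same whether one solves for $\mathbf{m}$ directly or for $\mathbf{p}$ and converts back; the preconditioner changes the path of the iteration, not its destination.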
By (39), $\mathbf{W}$
has physical units inverse to $\mathbf{F}$.
Thus the transformation $\mathbf{F}\mathbf{W}$
has no units,
so the $\mathbf{p}$
variables have physical units of data space.
Experimentalists might enjoy seeing the
solution $\mathbf{p}$
with its data units more than viewing the solution $\mathbf{m}$
with its more theoretical model units.
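The theoretical solution for underdetermined systems,
$\mathbf{m} = \mathbf{F}'(\mathbf{F}\mathbf{F}')^{-1}\mathbf{d}$,
suggests
an alternate approach using instead
$\hat{\mathbf{m}} = \mathbf{F}'\mathbf{W}_d^2\,\mathbf{d}$.
This diagonal weighting matrix $\mathbf{W}_d^2$
must be drawn
from vectors in data space.
Again I chose a vector of all 1's, getting the weight
$$
\mathbf{W}_d^2 \;=\; \frac{1}{\mathbf{diag}\,(\mathbf{F}\mathbf{F}'\,\mathbf{1})}
$$
My choice of a vector of 1's is quite arbitrary.
I might as well have chosen a vector of random numbers.
Bill Symes, who suggested this approach to me,
suggests using an observed data vector $\mathbf{d}$
for the data space weight,
and $\mathbf{F}'\mathbf{d}$
for the model space weight.
This requires an additional step, dividing out the units of the data
$\mathbf{d}$.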
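The data-space version of the weight can be sketched the same way. Here the all-ones vector lives in data space, so the weight $\mathbf{W}_d^2 = 1/\mathbf{diag}(\mathbf{F}\mathbf{F}'\mathbf{1})$ has one element per data point and is applied before the adjoint. The operator and data are hypothetical, with more model points than data points to mimic an underdetermined problem. The variant that uses an observed data vector in place of the ones is not shown, because the text does not spell out how the data units are divided out.

```python
import numpy as np

rng = np.random.default_rng(3)
nd, nm = 4, 9                                 # assumed sizes, nd < nm
F = rng.uniform(0.1, 1.0, size=(nd, nm))      # hypothetical positive operator
d = rng.uniform(1.0, 2.0, size=nd)            # hypothetical data

ones_d = np.ones(nd)                          # all-ones vector in data space
wd2 = 1.0 / (F @ (F.T @ ones_d))              # W_d^2 = 1 / diag(F F' 1)

m_hat = F.T @ (wd2 * d)                       # image F' W_d^2 d

# By construction, W_d^2 maps F F' 1 back to the all-ones data vector.
check = wd2 * (F @ (F.T @ ones_d))
```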
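Experience tells me that a broader methodology than all of the above is needed.
Appropriate scaling is required in both data space and model space.
We need two other weights
$\mathbf{W}_m$ and $\mathbf{W}_d$
where
$\hat{\mathbf{m}} = \mathbf{W}_m\mathbf{F}'\mathbf{W}_d\,\mathbf{d}$.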
I have a useful practical example (stacking in $v(z)$ media)
in another of my electronic books (BEI),
where I found both $\mathbf{W}_m$ and $\mathbf{W}_d$
by iterative guessing.
First assume $\mathbf{W}_d = \mathbf{I}$
and estimate $\mathbf{W}_m$ as above.
Then assume you have the correct $\mathbf{W}_m$
and estimate $\mathbf{W}_d$ as above.
Iterate.
(Perhaps some theorist can find a noniterative solution.)
I believe this iterative procedure leads us to the best diagonal
pre- and post-multipliers for any operator $\mathbf{F}$.
By this I mean that the modified operator
$(\mathbf{W}_d\mathbf{F}\mathbf{W}_m)$
is as close to being unitary as we will be able to obtain
with diagonal transformation.
Unitary means it is energy conserving and that the inverse
is simply the conjugate transpose.
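One possible reading of the alternating procedure is sketched below. It is an interpretation, not the BEI implementation: "estimate as above" is taken to mean applying the all-ones formulas to the currently weighted operator, so that $\mathbf{W}_m^2 = 1/\mathbf{diag}(\mathbf{F}'\mathbf{W}_d^2\mathbf{F}\mathbf{1})$ with $\mathbf{W}_d$ held fixed, and $\mathbf{W}_d^2 = 1/\mathbf{diag}(\mathbf{F}\mathbf{W}_m^2\mathbf{F}'\mathbf{1})$ with $\mathbf{W}_m$ held fixed. The function name `balance_weights`, the iteration count, and the test operator are all hypothetical.

```python
import numpy as np

def balance_weights(F, niter=20):
    """Alternately re-estimate diagonal model and data weights.

    Returns the diagonals of W_m and W_d (not their squares).
    Assumes F has positive entries so every weight stays positive.
    """
    nd, nm = F.shape
    ones_m, ones_d = np.ones(nm), np.ones(nd)
    wd2 = np.ones(nd)                         # start from W_d = I
    wm2 = np.ones(nm)
    for _ in range(niter):
        # W_d fixed: operator is W_d F, so W_m^2 = 1 / diag(F' W_d^2 F 1).
        wm2 = 1.0 / (F.T @ (wd2 * (F @ ones_m)))
        # W_m fixed: operator is F W_m, so W_d^2 = 1 / diag(F W_m^2 F' 1).
        wd2 = 1.0 / (F @ (wm2 * (F.T @ ones_d)))
    return np.sqrt(wm2), np.sqrt(wd2)

rng = np.random.default_rng(2)
col_scale = np.array([1.0, 10.0, 0.1, 5.0, 100.0])
F = rng.uniform(0.5, 1.5, size=(8, 5)) * col_scale   # hypothetical operator

wm, wd = balance_weights(F)
A = wd[:, None] * F * wm                      # modified operator W_d F W_m

# A diagnostic one might inspect; no particular outcome is guaranteed.
cond_before, cond_after = np.linalg.cond(F), np.linalg.cond(A)
```

Comparing `cond_before` with `cond_after`, or inspecting how far `A.T @ A` is from the identity, is one way to judge whether the weights have moved the operator toward being unitary for a given problem.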
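What good is it that
$(\mathbf{W}_d\mathbf{F}\mathbf{W}_m)'(\mathbf{W}_d\mathbf{F}\mathbf{W}_m) \approx \mathbf{I}$?
It gives us the most rapid convergence of least squares problems of the form
$$
\mathbf{0} \;\approx\; \mathbf{W}_d(\mathbf{F}\mathbf{m} - \mathbf{d})
\;=\; \mathbf{W}_d(\mathbf{F}\mathbf{W}_m\mathbf{p} - \mathbf{d})
$$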
The PhD thesis of James Rickett experiments extensively with data space and model space weighting functions in the context of seismic velocity estimation.