Next: THE WORLD OF CONJUGATE Up: Model fitting by least Previous: Analytical solutions

OPERATOR SCALING (BINORMALIZATION)

We can accept model $\bold m$ and data $\bold d$ as they arise from the geometry or geophysics of an application, or we can transform them to forms that are computationally convenient, work with them, and finally transform back. Suppose we transform them to new variables $\bold u$ and $\bold v$:

$\displaystyle \bold u$	$\textstyle =$	$\displaystyle \bold M \bold m$	(97)
$\displaystyle \bold v$	$\textstyle =$	$\displaystyle \bold D \bold d$	(98)

These transformations change the $\bold F$ operator to $\bold D \bold F \bold M^{-1}$ because

$\displaystyle \bold d$	$\textstyle =$	$\displaystyle \bold F \bold m$	(99)
$\displaystyle \bold D \bold d$	$\textstyle =$	$\displaystyle \bold D \bold F \bold M^{-1} \bold M \bold m$	(100)
$\displaystyle \bold v$	$\textstyle =$	$\displaystyle \bold D \bold F \bold M^{-1} \bold u$	(101)
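As a quick numerical check of (99) through (101), here is a minimal NumPy sketch. The sizes, the random $\bold F$, and the diagonal choices for $\bold M$ and $\bold D$ are illustrative assumptions, not anything prescribed above; any invertible $\bold M$ would do.

```python
import numpy as np

rng = np.random.default_rng(0)
n_data, n_model = 6, 4

# Hypothetical operator and model, only for illustration.
F = rng.standard_normal((n_data, n_model))
m = rng.standard_normal(n_model)
d = F @ m                                   # equation (99)

# Assumed diagonal (hence trivially invertible) scalings.
M = np.diag(rng.uniform(0.5, 2.0, n_model))
D = np.diag(rng.uniform(0.5, 2.0, n_data))

u = M @ m                                   # equation (97)
v = D @ d                                   # equation (98)

# The operator seen by the new variables, equation (101).
F_new = D @ F @ np.linalg.inv(M)

# v should equal F_new applied to u, up to rounding.
assert np.allclose(v, F_new @ u)
```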

The game is to find the best $\bold M$ and $\bold D$.

If I were able and willing to handle linear algebra in a modern way, I would show you this matrix iteration

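$\displaystyle \lambda\ \left[ \begin{array}{c} \bold d \\ \bold m \end{array}\right]_{i+1}$	$\textstyle =$	$\displaystyle \left[ \begin{array}{cc} \bold 0 & \bold F \\ \bold F\T & \bold 0 \end{array}\right] \ \left[ \begin{array}{c} \bold d \\ \bold m \end{array}\right]_i$	(102)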

What is $\lambda$? It is a scale factor that keeps the vectors normalized as the iteration proceeds. You may recognize a path leading to eigenvectors, eigenvalues, Hermitian matrices, and singular-value decomposition. You'll need to find other sources to go further on that path, because it has not led me to a solution to the problem at hand, which is how to choose the best $\bold M$ and $\bold D$.
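For readers who want to see what such an iteration does, here is a hedged sketch. It assumes the iterated matrix is the symmetric block matrix with $\bold F$ and $\bold F\T$ off the diagonal, the usual construction tying a Hermitian eigenproblem to the singular values of $\bold F$. The function name and the small test matrix are made up for illustration. Under that assumption $\lambda$ settles at the largest singular value of $\bold F$.

```python
import numpy as np

def block_power_iteration(F, n_iter=200, seed=0):
    """Iterate [d; m] <- [[0, F], [F^T, 0]] [d; m] / lam (assumed form)."""
    n_data, n_model = F.shape
    rng = np.random.default_rng(seed)
    d = rng.standard_normal(n_data)
    m = rng.standard_normal(n_model)
    lam = 0.0
    for _ in range(n_iter):
        d_new = F @ m          # top block row
        m_new = F.T @ d        # bottom block row
        # lam is the factor by which the stacked vector grew this step;
        # dividing by it keeps the stacked norm constant.
        lam = np.sqrt((d_new @ d_new + m_new @ m_new) / (d @ d + m @ m))
        d, m = d_new / lam, m_new / lam
    return lam, d, m

# Hypothetical 3x2 operator.
F = np.array([[2.0, 0.0],
              [1.0, 1.0],
              [0.0, 3.0]])
lam, d, m = block_power_iteration(F)
largest_singular_value = np.linalg.svd(F, compute_uv=False)[0]
```

Comparing `lam` with `largest_singular_value` shows the link to singular-value decomposition mentioned above; it says nothing yet about how to pick $\bold M$ and $\bold D$.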

If you have an operator that you are using millions of times, it is worth seeking good choices. Good choices are those that make the adjoint of your new operator $\bold D \bold F \bold M^{-1}$ a good approximation to its inverse. These are the two conditions we seek:

$\displaystyle \bold I$	$\textstyle \approx$	$\displaystyle (\bold D \bold F \bold M^{-1})\T \ (\bold D \bold F \bold M^{-1})$	(103)
$\displaystyle \bold I$	$\textstyle \approx$	$\displaystyle (\bold D \bold F \bold M^{-1}) \ (\bold D \bold F \bold M^{-1})\T$	(104)

If these were true, we could probe with any test vectors we wished

$\displaystyle \hat {\bold t}_m$	$\textstyle =$	$\displaystyle (\bold M^{-1})\T \ (\rm something)\ \bold M^{-1}\ \bold t_m$	(105)
$\displaystyle \hat {\bold t}_d$	$\textstyle =$	$\displaystyle \bold D \ {(\rm something\ else)} \ \bold D\T \ \bold t_d$	(106)

and find both $\hat {\bold t}_m\approx \bold t_m$ and $\hat {\bold t}_d\approx \bold t_d$. So, the game is to play with $\bold M$ and $\bold D$ to try to get this to happen.
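For reference, expanding the transposes in (103) and (104) with the rule $(\bold A \bold B)\T = \bold B\T \bold A\T$ shows what sits between the outer scaling factors in (105) and (106):

$\displaystyle (\bold D \bold F \bold M^{-1})\T \ (\bold D \bold F \bold M^{-1})$	$\textstyle =$	$\displaystyle (\bold M^{-1})\T \ (\bold F\T \bold D\T \bold D \bold F) \ \bold M^{-1}$
$\displaystyle (\bold D \bold F \bold M^{-1}) \ (\bold D \bold F \bold M^{-1})\T$	$\textstyle =$	$\displaystyle \bold D \ (\bold F \bold M^{-1} (\bold M^{-1})\T \bold F\T) \ \bold D\T$

So on the model side the inner factor is $\bold F\T \bold D\T \bold D \bold F$, and on the data side it is $\bold F \bold M^{-1} (\bold M^{-1})\T \bold F\T$.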

About the only trick I know is to try $\bold M$ and $\bold D$ as diagonal matrices. For test functions $\bold t$, I generally use a pattern of moderately spaced impulses. In physical space we may see places where $\hat {\bold t}$ is smaller than $\bold t$. Those are the places to boost the corresponding diagonal elements.

There are many test functions you could use. You could use all ones. You could use random numbers. You could use a pile of random old data, though I'm not sure what you would use for old models. Take the output. Take its absolute value. Maybe smooth it. Take the square root, since a weight applied with $\bold F$ is applied a second time with $\bold F\T$, so the round trip sees its square.
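The recipe above can be sketched in a few lines of NumPy. This is one possible reading, not a definitive implementation: it treats only the model side, takes $\bold D = \bold I$, probes with all ones, skips the optional smoothing, and repeats the "probe, absolute value, square root" correction until the probe comes back close to ones. The function names and the test operator (positive entries with badly scaled columns) are hypothetical.

```python
import numpy as np

def probe_model_side(F, m_inv_diag, t):
    """t_hat = (M^-1)^T F^T F M^-1 t, for diagonal M^-1 and D = I."""
    return m_inv_diag * (F.T @ (F @ (m_inv_diag * t)))

def estimate_model_scaling(F, n_iter=100):
    """Return the diagonal of M^-1 found by repeated probing with ones."""
    n_model = F.shape[1]
    t = np.ones(n_model)
    m_inv_diag = np.ones(n_model)
    for _ in range(n_iter):
        t_hat = probe_model_side(F, m_inv_diag, t)
        # Where t_hat is below one we boost, where it is above one we cut.
        # The square root splits the correction between F and F^T.
        m_inv_diag = m_inv_diag / np.sqrt(np.abs(t_hat))
    return m_inv_diag

# Hypothetical operator: positive entries, columns of very different size.
rng = np.random.default_rng(0)
column_scale = np.array([1.0, 10.0, 0.1, 3.0, 0.5])
F = rng.uniform(0.1, 1.0, size=(8, 5)) * column_scale

ones = np.ones(F.shape[1])
residual_before = np.max(np.abs(probe_model_side(F, ones, ones) - 1.0))
m_inv_diag = estimate_model_scaling(F)
residual_after = np.max(np.abs(probe_model_side(F, m_inv_diag, ones) - 1.0))
```

Getting ones back from a probe with ones does not make the scaled $\bold F\T \bold F$ an identity; it only balances its rows. The data side could be treated the same way with $\bold F$ and $\bold F\T$ exchanging roles.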

I know one more trick. In seismology many operators appear as integrals. One of many such operators is called ``Kirchhoff migration''. Because these operators and their adjoints contain integrations, they boost low frequencies. We can attenuate them back to their original size by having $\bold M$ or $\bold D$ apply $\sqrt{-i\omega}$ (known in the time domain as the ``Hankel tail'').
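A minimal frequency-domain sketch of the $\sqrt{-i\omega}$ filter follows. It takes the formula at face value under NumPy's FFT sign convention, which may differ from the convention intended here, so whether the result is the causal or the anticausal half derivative is left as an assumption. The function name and the odd trace length (chosen to avoid a Nyquist sample) are illustrative. Applying the filter twice should amount to one multiplication by $-i\omega$.

```python
import numpy as np

def half_derivative(x, dt=1.0):
    """Multiply the spectrum of a real trace x by sqrt(-i*omega)."""
    n = len(x)
    omega = 2.0 * np.pi * np.fft.rfftfreq(n, d=dt)   # omega >= 0 only
    filt = np.sqrt(-1j * omega)                      # principal branch
    return np.fft.irfft(filt * np.fft.rfft(x), n=n)

# Hypothetical input: a single impulse on an odd-length trace.
n = 101
x = np.zeros(n)
x[n // 2] = 1.0

once = half_derivative(x)
twice = half_derivative(once)

# Reference: one multiplication by -i*omega.
omega = 2.0 * np.pi * np.fft.rfftfreq(n, d=1.0)
full = np.fft.irfft(-1j * omega * np.fft.rfft(x), n=n)
```

Splitting the filter this way is what lets half of the correction ride with the operator and half with its adjoint.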

What, may we ask, is the interpretation of the $(\bold u, \bold v)$ variables? They feel like ``energy conservation'' variables, though it makes no sense to say the physical energy in $\bold m$ or $\bold d$ should be conserved in the way of Parseval's theorem for Fourier transforms. I imagined the $(\bold u, \bold v)$ variables might be especially suitable for display (like preconditioned variables), but now I am less certain.


2014-12-01