Just as no one seismic migration algorithm would be the ideal choice in all cases, no one optimization algorithm can possibly be efficient or even successful in all cases of interest. If all our problems involved objective functions that were quadratic, with analytic derivatives, then life would be simple. But in practice we know that many of the problems we face in geophysics are vastly more complicated than this. What's worse, often the function we are attempting to optimize is not available analytically, and is only known in the form of a computer code that evaluates this function point by point. What makes some problems hard and some easy is not always clear.
Figure 1: An overview of objective functions. A: unimodal. B:
essentially unimodal but with parasitic local extrema.
C: fundamentally multimodal,
small number of local extrema. D: significant null-space effects.
E: fundamentally multimodal, huge number of local extrema. F:
lacking any useful structure, brute force probably required.
To aid in the visualization, all the function examples shown are
are functions of only two dimensions. In practice, we are faced
with functions of hundreds or thousands of unknowns.
For now, let us focus on the unconstrained optimization of
functions
defined on
dimensional Euclidean spaces.
It is clear that geometry plays a key
role in determining the difficulty of a problem. Can we
follow the gradient up-hill or down-hill to our target
or will we be trapped in local extrema? Are these local
extrema important in that they represent significant features
in the model or are they irrelevant roadblocks on our
way to the global extremum?
Figure 1 gives a simple overview of some of these geometrical issues. This list is not exhaustive, but illustrates some of the practical issues that arise in attempting to find the appropriate tool for a given problem. We have drawn the picture to emphasize maximization, but obviously there is no fundamental distinction between maximization and minimization. Let us look at each case individually.
dimensional
quadratic form. Non-quadratic, but still unimodal,
functions can usually be optimized by making a sequence
of quadratic approximations. If the matrix of second
derivatives of a quadratic form is known, then special-purpose
algorithms can be used.
This simple picture must be taken with a grain of salt
when considering optimization problems with many unknowns.
Much of our geometric intuition fails when confronted with
high-dimensional spaces. For example, if you calculate the
volumes of
dimensional spheres of radius
and
and take their difference,
you will quickly discover that as
gets
large, all the volume of an
sphere
becomes concentrated in a thin shell
just inside the radius
. So if one is thinking of
sampling points within the domain of a function
of, let's say, 1000 unknowns, then just how to do so
efficiently is not obvious. In fact, this is one of
the fundamental
limitations of Monte Carlo methods which, almost by
definition, must sample the space of possible models
globally.