Just as no one seismic migration algorithm would be the ideal choice in all cases, no one optimization algorithm can possibly be efficient or even successful in all cases of interest. If all our problems involved objective functions that were quadratic, with analytic derivatives, then life would be simple. But in practice we know that many of the problems we face in geophysics are vastly more complicated than this. What's worse, often the function we are attempting to optimize is not available analytically, and is only known in the form of a computer code that evaluates this function point by point. What makes some problems hard and some easy is not always clear.
Figure 1: An overview of objective functions. A: unimodal. B:
essentially unimodal but with parasitic local extrema.
C: fundamentally multimodal,
small number of local extrema. D: significant null-space effects.
E: fundamentally multimodal, huge number of local extrema. F:
lacking any useful structure, brute force probably required.
To aid in the visualization, all the function examples shown are
are functions of only two dimensions. In practice, we are faced
with functions of hundreds or thousands of unknowns.
For now, let us focus on the unconstrained optimization of functions defined on dimensional Euclidean spaces. It is clear that geometry plays a key role in determining the difficulty of a problem. Can we follow the gradient up-hill or down-hill to our target or will we be trapped in local extrema? Are these local extrema important in that they represent significant features in the model or are they irrelevant roadblocks on our way to the global extremum?
Figure 1 gives a simple overview of some of these geometrical issues. This list is not exhaustive, but illustrates some of the practical issues that arise in attempting to find the appropriate tool for a given problem. We have drawn the picture to emphasize maximization, but obviously there is no fundamental distinction between maximization and minimization. Let us look at each case individually.
This simple picture must be taken with a grain of salt when considering optimization problems with many unknowns. Much of our geometric intuition fails when confronted with high-dimensional spaces. For example, if you calculate the volumes of dimensional spheres of radius and and take their difference, you will quickly discover that as gets large, all the volume of an sphere becomes concentrated in a thin shell just inside the radius . So if one is thinking of sampling points within the domain of a function of, let's say, 1000 unknowns, then just how to do so efficiently is not obvious. In fact, this is one of the fundamental limitations of Monte Carlo methods which, almost by definition, must sample the space of possible models globally.