**Curve Fitting of Any Function by Nelder and Mead's Iterative Simplex Method**

**Introduction** - *Linear* regressions are regressions in which the
unknowns are coefficients of the terms of the equations, for example, a
polynomial regression like y=a + b*x + c*x^2. In this case, a, b, and c are
multiplied by the known quantities 1, x, and x^2, to calculate y. With
*nonlinear* regressions the unknowns are not always coefficients of the
terms of the equation, for example, an exponential equation like y=e^(a*x).

CoStat's `Nonlinear Regression` procedure lets you type in any
function of y. It then solves for the unknowns in the function.

If you are familiar with linear regressions (like polynomial regressions) but
unfamiliar with nonlinear regressions, be prepared for a shock. The approach to
finding a solution is entirely different. While linear regressions have a
definite solution which can be directly arrived at, there is no direct method to
solve nonlinear regressions. They must be solved iteratively (repeated
intelligent guesses) until you get to what appears to be the best answer. And
there is no way to determine if that answer is indeed the best possible answer.
Fortunately, there are several good algorithms for making each successive guess.
The algorithm used here (the simplex procedure as originally described by Nelder
and Mead, 1965) was chosen because it is widely used, does not require
derivatives of the equation (which are sometimes difficult or impossible to
get), is fairly quick, and is very reliable. See *Press et al., 1986*, for
an overview and a comparison of different algorithms.

**How does the procedure work?** In any regression, you are seeking to
minimize the deviations between the observed y values and the expected y values
(the values of the equation for specific values of the unknowns).
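In code, this "elevation" is simply the sum of squared deviations for a given guess at the unknowns. A minimal sketch (Python for illustration, not CoStat's code; the data are made up and roughly follow y=e^(0.7*x)):

```python
# The quantity a regression minimizes: the sum of squared deviations
# between the observed y values and the expected y values.
import math

x_obs = [0.0, 0.5, 1.0, 1.5]
y_obs = [1.0, 1.40, 2.01, 2.85]   # made-up data, roughly e^(0.7*x)

def sum_of_squares(a):
    """Elevation at guess a for the model y = e^(a*x)."""
    return sum((y - math.exp(a * x)) ** 2 for x, y in zip(x_obs, y_obs))

# A guess closer to the truth sits at a lower "elevation":
print(sum_of_squares(1.0) > sum_of_squares(0.7))  # True
```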

Any regression is analogous to searching for the lowest point of ground in a
given state (for example, California). (Just so you may know, the lowest spot is
in Death Valley, at 282 feet *below* sea level.) In this example, there are
2 unknowns: longitude and latitude. The simplex method requires that you make an
initial guess at the answer (initial values for the unknowns). The simplex
method will then make n additional nearby guesses (one for each unknown, based
on the initial guess and on the simplex size). The simplex size determines the
distance from the initial guess to the n nearby guesses. In this example, we
have 3 points (the initial guess and 2 nearby guesses). This triangle (in our
example) is the "simplex" - the simplest possible shape in the n-dimensional
world in which the simplex is moving around.
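The construction of the initial simplex can be sketched as follows (Python for illustration; the function name `initial_simplex` and its `step` argument are our own, not CoStat's):

```python
# Build the initial simplex: the initial guess plus one nearby guess
# per unknown, offset by the simplex size. With n unknowns you get
# n+1 points (a triangle when n = 2).
def initial_simplex(guess, step):
    n = len(guess)
    points = [list(guess)]
    for i in range(n):
        p = list(guess)
        p[i] += step          # one nearby guess per unknown
        points.append(p)
    return points

# Two unknowns (longitude, latitude) give 3 points: a triangle.
print(initial_simplex([0.0, 0.0], 1.0))
```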

The procedure starts by determining the elevation at each of these 3 points. The triangle then tries to flip itself by moving the highest point in the direction of the lower points; sort of like an amoeba. The simplex only commits to a move if it results in an improvement. One of the nice features of the Nelder and Mead variation of the simplex method is that it allows the simplex to grow and shrink as necessary to pass through valleys.
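The flip, grow, and shrink moves can be sketched in a few dozen lines. This is an illustration of the standard Nelder and Mead moves with the usual coefficients (1 for reflection, 2 for expansion, 0.5 for contraction and shrinking); it is not CoStat's actual implementation:

```python
import numpy as np

def nelder_mead(f, start, step=0.5, tol=1e-9, max_iter=500):
    """Minimal Nelder-Mead sketch: reflect the worst point through the
    centroid of the others, expand on success, contract or shrink on
    failure. An illustration only, not CoStat's code."""
    n = len(start)
    # initial simplex: the guess plus one nearby point per unknown
    simplex = [np.asarray(start, float)]
    for i in range(n):
        p = np.asarray(start, float).copy()
        p[i] += step
        simplex.append(p)
    for _ in range(max_iter):
        simplex.sort(key=f)                        # best first, worst last
        best, worst = simplex[0], simplex[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        centroid = np.mean(simplex[:-1], axis=0)   # center of the lower points
        reflected = centroid + (centroid - worst)  # "flip" the highest point
        if f(reflected) < f(best):
            expanded = centroid + 2 * (centroid - worst)   # grow the simplex
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected                # accept the plain flip
        else:
            contracted = centroid + 0.5 * (worst - centroid)
            if f(contracted) < f(worst):
                simplex[-1] = contracted           # pull the worst point in
            else:
                # shrink the whole simplex toward the best point
                simplex = [best + 0.5 * (p - best) for p in simplex]
    return min(simplex, key=f)

# "Elevation" f(x, y) = (x-3)^2 + (y+1)^2; the lowest point is (3, -1).
xy = nelder_mead(lambda p: (p[0] - 3) ** 2 + (p[1] + 1) ** 2, [0.0, 0.0])
```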

This analogy highlights some of the perils of doing nonlinear regressions:

- Sensitivity to bad initial guesses. A bad initial guess can put you on the wrong side of the Sierras (a huge mountain range). The simplex will not find its way over the Sierras to the other side, so it will never make real progress toward finding the lowest point in the state. The lesson: a bad initial guess can doom you to failure.
- Going beyond reasonable boundaries. In the example, the simplex can crawl
over the edge of the state border. In a real regression, this occurs when the
values of the unknowns go outside the range of what you consider to be
legitimate values. The procedure does not let you set limits, but you may see
the unknowns heading toward infinity or 0. You may also see n (the number of
rows of data used) decreasing; this indicates that the equation can't be
evaluated for some rows of data, usually because of numeric overflows (for
example, `e^(u1*col(1))` where `u1*col(1)` generates numbers greater than
650). If this occurs, try using different initial values for the equation.
- Local minima. The simplex can be suckered into thinking that some gopher
hole or puddle is the lowest spot in California. This is more likely if the
simplex size is set way too small. When the regression has 3, 4, 5, or more
unknowns, and/or if the data set is very large, it often becomes less likely
that the simplex will find the true global minimum. CoStat minimizes the risk
of this problem by automatically restarting the algorithm (see "Restarts"
below).
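The "n decreasing" symptom can be reproduced in a small sketch (Python for illustration; the data are made up, and the overflow threshold depends on the floating-point format in use, so the exponent here is exaggerated for safety):

```python
# Rows where the equation can't be evaluated (numeric overflow) get
# dropped, so n (the number of usable rows) decreases.
import math

def usable_rows(u1, xs):
    """Evaluate e^(u1*x) row by row, keeping only rows that don't overflow."""
    kept = []
    for x in xs:
        try:
            kept.append(math.exp(u1 * x))
        except OverflowError:
            pass          # this row can't be evaluated; n goes down by one
    return kept

xs = [1, 10, 100, 1000]
print(len(usable_rows(2.0, xs)))  # e^2000 overflows, so only 3 rows remain
```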

**Restarts** - After the procedure finds what it believes to be an answer, it
restarts itself at that point with a reinitialized simplex. If that point was
indeed the best in the area, the procedure will stop there. But sometimes the
procedure can find better answers. The procedure will continue to restart itself
until the new result is not significantly better than the old result (a relative
change in the sum of squares of the observed-expected deviations of less than
10^-9).
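The restart logic can be sketched as follows (Python for illustration; `one_run` stands in for a single run of the simplex minimizer and is stubbed out here so the sketch runs; the names and the stub are our own, not CoStat's):

```python
# Restart the minimizer at each answer until the relative improvement
# in the sum of squares drops below the 10^-9 threshold described above.
def fit_with_restarts(f, one_run, start, tol=1e-9):
    best, sse = one_run(f, start)
    while True:
        # restart with a reinitialized simplex at the current best point
        new_best, new_sse = one_run(f, best)
        if sse == 0 or (sse - new_sse) / sse < tol:
            return new_best, new_sse   # no significant improvement: stop
        best, sse = new_best, new_sse

# Stub "minimizer" for demonstration: each run halves the guess for f(v) = v^2,
# so successive restarts keep improving until the guess stops moving.
def one_run(f, s):
    s = s / 2 if abs(s) > 1e-6 else s
    return s, f(s)

best, sse = fit_with_restarts(lambda v: v * v, one_run, 1.0)
```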
