Math 121 - Calculus for Biology I
San Diego State University -- This page last updated 07-Aug-04
Fitting data to a mathematical model is more of an art form than a precise mathematical technique. It is vitally important that the person modeling a particular data set knows what he or she hopes to learn from the mathematical model and then selects the model accordingly. The most common means of fitting data is a least squares best fit of the model to the data. As we saw earlier, when the data are approximated by a straight line, there are precise statistical formulae for finding the line that best fits the data in the least squares sense. These formulae are derived using techniques from two-variable calculus. The technique can be extended to more general polynomial forms, with correspondingly more complicated formulae.
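As a reminder of what the straight-line formulae compute, here is a minimal Python sketch. The function name `linear_least_squares` and the sample data are illustrative assumptions, not part of the course materials.

```python
def linear_least_squares(xs, ys):
    """Return (slope, intercept) of the line minimizing the sum of squared errors."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sums of products of deviations from the means.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Hypothetical data, chosen to lie exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = linear_least_squares(xs, ys)
print(slope, intercept)
```

Because the sample points lie exactly on a line, the fit should return that line's slope and intercept. With real data the same formulae give the line with the smallest sum of squared vertical errors.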
When the mathematical model is nonlinear, there are in general no precise formulae for finding the least squares best fit to the data. However, there are mathematical methods for finding the least squares best fit numerically. These numerical methods are notoriously unstable: they may fail to converge, or may converge to a poor fit, when the starting parameter values are far from the best ones.
Nonlinear Least Squares for Cumulative AIDS cases
Below is an applet that computes the least squares best fit to the data for cumulative AIDS cases shown in the Allometric modeling section. In this case, the least squares fit is made directly to the data, instead of doing a linear least squares fit to the logarithms of the data. See if you can find the least squares best fit to these data.
By adjusting the parameters A and r, the sum of the squares of the errors can be minimized at approximately A = 210 and r = 2.87, giving J(A,r) = 210,000. The resulting graph is visibly closer to the data than the one obtained from the linear least squares fit to the logarithms of the data. Furthermore, inspection of the graph suggests that this fit will give a better projection of future cumulative AIDS cases.
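The search for A and r can also be automated. The sketch below is one possible approach and is not the method used by the applet. It assumes the power law form y = A t^r, so that J(A,r) is the sum of (y_i - A t_i^r)^2. For a fixed r the model is linear in A, so the best A has a closed form, and a golden-section search then minimizes J over r alone. The function names, the search bracket for r, and the sample data are all hypothetical; the AIDS data are not reproduced here.

```python
import math


def sum_squared_errors(A, r, ts, ys):
    """J(A, r): sum of squared errors of the assumed model y = A * t**r."""
    return sum((y - A * t ** r) ** 2 for t, y in zip(ts, ys))


def best_A_for_r(r, ts, ys):
    """For fixed r the model is linear in A, so the best A has a closed form."""
    num = sum(y * t ** r for t, y in zip(ts, ys))
    den = sum(t ** (2 * r) for t in ts)
    return num / den


def fit_power_law(ts, ys, r_low=0.0, r_high=6.0, tol=1e-10):
    """Minimize J over r by golden-section search, using the best A for each r.

    Assumes J has a single minimum in r on [r_low, r_high]; the default
    bracket is a guess and may need adjusting for other data.
    """
    def profile(r):
        return sum_squared_errors(best_A_for_r(r, ts, ys), r, ts, ys)

    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = r_low, r_high
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        # Keep the sub-interval that contains the smaller of the two values.
        if profile(c) < profile(d):
            b = d
        else:
            a = c
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    r = (a + b) / 2.0
    A = best_A_for_r(r, ts, ys)
    return A, r, sum_squared_errors(A, r, ts, ys)


# Hypothetical data generated exactly from y = 3 * t**2 (not the AIDS data).
ts = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0 * t ** 2 for t in ts]
A, r, J = fit_power_law(ts, ys)
print(round(A, 4), round(r, 4))
```

On this noise-free sample the search should recover values close to A = 3 and r = 2, with J close to zero. Reducing the problem to a search over one parameter avoids some of the instability of general two-parameter methods, but it still depends on the bracket for r containing the minimum.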
The linear least squares fit to the logarithms of the data is a much simpler way to find a power law model, especially with features such as Trendline in Excel. However, this method tends to give too much weight to the earlier data points, which is especially poor for projecting future results. The applet above gives a nonlinear least squares fit that weights all data points equally, which is probably the best fit if no other information is available. When more is known about a particular data set, other weighted least squares analyses may provide the best fit. However, all of these nonlinear least squares methods are significantly more difficult than the method studied in the allometric section.
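A short calculation suggests why the logarithmic fit favors the earlier data points. It assumes the power law form y = A t^r and errors e_i that are small compared with the model values. The symbols e_i and J_log are introduced here only for illustration.

```latex
\begin{align*}
J(A,r) &= \sum_{i} e_i^{2}, \qquad e_i = y_i - A t_i^{r}, \\
J_{\log}(A,r) &= \sum_{i} \bigl(\ln y_i - \ln A - r \ln t_i\bigr)^{2}
  = \sum_{i} \ln^{2}\!\Bigl(1 + \frac{e_i}{A t_i^{r}}\Bigr)
  \approx \sum_{i} \frac{e_i^{2}}{\bigl(A t_i^{r}\bigr)^{2}} .
\end{align*}
```

Under these assumptions, the logarithmic fit approximately minimizes the squared relative errors. Each squared error is divided by the square of the model value, so for data that grow with time the early, small values carry far more weight than the later, large ones.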