Parameter estimation of nonlinear growth models in forestry

Partial derivatives of the negative exponential, monomolecular, Mitcherlich, Gompertz, logistic, Chapman-Richards, von Bertalanffy, Weibull and the Richard’s nonlinear growth models are presented. The application of these partial derivatives in estimating the model parameters is illustrated. The parameters are estimated using the Marquardt iterative method of nonlinear regression relating top height to age of Norway spruce (Picea abies L.) from the Bowmont Norway Spruce Thinning Experiment. Formulas that provide good initial values of the parameters are specified. Clear definitions of the parameters of the nonlinear models in the context of the system being modelled are found to be critically important in the process of parameter estimation.


Introduction
Growth studies in many branches of science have demonstrated that more complex nonlinear functions are justified and required if the range of the independent variable encompasses juvenile, adolescent, mature and senescent stages of growth (Philip 1994). Then a function with a sigmoid form, ideally its origin at (0,0), a point of inflection occurring early in the adolescent stage and either approaching a maximum value, an asymptote, or peaking and falling in the senescent stage, is justified. Examples of such models include the logistic, the Gompertz, the Chapman-Richards and the von Bertalanffy functions. In contrast to empirical models, e.g., polynomial equations, the above theoretical models have an underlying hypothesis associated with cause or function of the phenomenon described by the response variable and have meaningful parameters from a forestry perspective. Theoretically based equations may also be more reliable for predictions which involve extrapolations beyond the range of data compared to empirical polynomial equations (Martin and Ek 1984). More importantly, theoretical nonlinear mathematical models provide the basis for an objective method of estimating yield class (growth potential) and the sustainable yield of a crop.
Even though there are few theoretical models formulated specifically for forestry applications, many developed in other disciplines have a considerable potential for modelling forest growth and yield parameters. However, the use of theoretical nonlinear mathematical models explicitly formulated to provide consistent growth and yield has not progressed in the field of forestry. This is partly attributed to the fact that the statistical methodology used for fitting nonlinear models to forest growth data is closely related to the mathematics of the models and the importance of this relationship is not well understood in a forestry context (Fekedulegn 1996).
Nonlinear models are more difficult to specify and estimate than linear models and the solutions are determined iteratively (Ratkowsky 1983). The iterative methods used in nonlinear regression include the modified Gauss-Newton method (Taylor series), gradient or steepest-descent method, multivariate secant or false position, and the Marquardt method (Draper and Smith 1981). If a model, after reparameteriza-tion, does not behave in a near-linear fashion, the parameter estimates will not have desirable properties such us unbiasedness, normality, and minimum variance and hence, complex estimation techniques (e.g., the Marquardt (1963) method) may be necessary (Ratkowsky 1983 and1990). In such cases, the use of partial derivatives rather than computational approximations usually results in more efficient and more precise parameter estimation. Therefore, the purpose of this study was to derive the partial derivatives of nine nonlinear growth models and demonstrate the method of parameter estimation using experimental height growth data.

Models and Sample Data
The growth models considered are given in Table  1. Further details, background and historical information can be found in the sources mentioned in Table 1. For all models considered, ω is the dependent growth variable, t is the independent variable (usually age in years), α , β, k, and m are parameters to be estimated, exp is the base of natural logarithms, and ε is the additive error term. The mathematical properties of the models and meaningful forestry interpretation of the parameters are discussed by Fekedulegn (1996). The Bowmont Norway Spruce Thinning Experiment (1930Experiment ( -1974 was established at the Roxburghe district of the Border region of Scot- Negative exponential ω(t) = α(1 -exp(-kt)) + ε Philip (1994) Monomolecular Mitcherlich Gompertz ω(t) = α exp(-β exp(-kt)) + ε Draper & Smith (1981) Logistic Nelder (1961), Oliver (1964) Chapman-Richards Bertalanffy (1957), Vanclay (1994) Richard's ω(t) = α / (1 + β exp(-kt)) 1/m + ε Richard (1959), Myers (1986) Weibull Ratkowsky (1983), Myers (1986) land to investigate effects of four thinning treatments on growth and yield of Norway spruce (Picea abies L.). Each treatment was replicated four times and a total of 16 permanent sample plots each with an area of 0.04 hectare were measured at five year intervals up until 1970 and at irregular intervals thereafter. At the time of establishing the experiment in 1930 the stand was 20 years old and the local yield class was 15 cubic meters per hectare per year. Top height growth data from sample plot 3661, that received a B-grade thinning treatment, are used to fit the nonlinear growth models and demonstrate the method of parameter estimation (Table 2). A Bgrade thinning treatment refers to the removal of dead and dying trees only. These data were made available by the UK Forestry Commission under an end-user license (Methley 1996).

Method of Estimation
For a nonlinear model (1) i = 1, 2,… n, where ω is the response variable, t is the independent variable, B is the vector of parameters β j to be estimated (β 1 , β 2 ,..., β p ), ε i is a random error term, p is the number of unknown parameters, and n is the number of observation. The estimators of β j 's are found by minimizing the sum of squares residual (SS Res ) function under the assumption that the ε i are normal and independent with mean zero and common variance σ 2 . Since ω i and t i are fixed observations, the sum of squares residual is a function of B.
Least squares estimates of B are values which when substituted into Eq.
(2) will make the SS Res a minimum and are found by differentiating Eq.
(2) with respect to each parameter and setting the result to zero. This provides the p normal equations that must be solved for B . These normal equations take the form for j = 1, 2, …, p. When the model is nonlinear in the parameters so are the normal equations. Consequently, for the nonlinear models considered in Table 1, it is impossible to obtain a closed form solution to the least squares estimate of the parameters by solving the p normal equations described in Eq. (3). Hence an iterative method must be employed to minimize the SS Res (Draper andSmith 1981, Ratkowsky 1983). The NLIN (nonlinear regression) procedure in SAS (1985) was used to fit the models to the height growth data and estimate the parameters. The Marquardt (1963) iterative method was chosen as it represents a compromise between the linearization (Gauss-Newton) method and the steepest descent method and appears to combine the best features of both while avoiding their most serious limitations (Draper and Smith 1981). The Marquardt iterative method requires specification of the names and starting values of the parameters to be estimated, the model using a single dependent variable, and the partial derivatives of the model with respect to each parameter (SAS 1985). The usual statistical tests which are appropriate in the general linear model case are, in general, not appropriate when the model is nonlinear and one cannot use the F statistic to obtain conclusions at any stated level of significance (Draper and Smith 1981). Hence the models were compared based on the proportion of the unexplained variation.

Partial Derivatives
For ease of understanding the symbols of the parameters of the nonlinear models given in Table 1, α, β, k, and m, are replaced by new symbols β 0 , β 1 , β 2 , and β 3 respectively. The parameters for all models considered in this paper are defined as follows: β 0 is the asymptote or the potential maximum of the response variable; β 1 is the biological constant; β 2 is the parameter governing the rate at which the response variable approaches its potential maximum; and β 3 is the allometric constant. The partial derivatives of the models with respect to each parameter (∂ω /∂β j ) are given in Tables 3-7. The NLIN procedure in SAS (1985) requires that the integral forms and the partial derivatives of the nonlinear models must be entered in the program using valid SAS syntax. Table 3. Partial derivatives of the negative exponential, monomolecular, and the Mitcherlich growth models.
The negative exponential The monomolecular The Mitcherlich Table 4. Partial derivatives of the Gompertz and logistic growth models. Partial

Model derivative
The Gompertz The logistic

Starting Value Specification
The Marquardt iterative method requires that an initial or starting value for each parameter be estimated. Starting value specification is one of the most difficult problems encountered in estimating parameters of nonlinear models (Draper and Smith 1981). However, the problem of specifying initial values of parameters can be solved with proper understanding of the definition of the parameters in the context of the phenomenon being modelled. Wrong starting values result in longer iteration, greater execution time, non-convergence of the iteration, and possibly convergence to unwanted local minimum sum of squares residual (SAS 1985). Hence expressions that provide good starting values for some of the parameters were developed. The most efficient order for determining starting values is β 0 , β 2 , β 3 , and finally β 1 . The only parameter that is simple to specify is β 0 . This is attributed to the clarity of its definition. The parameterβ 0 is defined in the literature (Bertalanffy 1957, Richard 1959 as maximum possible value of the dependent variable determined by the productive capacity of the site. Therefore, in modelling the top height-age relationship β 0 was specified as the maximum value of the response variable in the data. For all nonlinear models given in Table 1 the β 2 parameter is defined as the rate constant at which the response variable approaches its maximum possible value β 0 . On the basis of this definition it was found that the expression ω ω β where ω 1 and ω 2 are values of the response variable corresponding to a wide range of the predictor variable t 1 and t 2 , and β 0s is the starting value specified for the β 0 parameter, gave good starting values for the β 2 parameter. For modelling biological growth variables the allometric constant, β 3 , lies between zero and one (0 < β 3 < 1) for the Chapman-Richards growth model and is positive (β 3 > 0) for the von Bertalanffy and Weibull growth models. The starting value for the biological constant, β 1 , was specified by evaluating the models at the start of growth when the predictor variable is zero. Table 8 gives expressions used to specify starting values of the β 1 parameter for each model. ω(0) is the magnitude of the response variable at the start of growth, which ideally is zero, but one should choose a relatively small positive number.

Parameter Estimates and Analysis
The least squares estimates of the parameters of the nonlinear models for top height-age relationship are given in Tables 9-10. The parameter estimates for the Mitcherlich, monomolecular, Gompertz, logistic and Richard's growth functions are all statistically significant at the 5% level. Estimates of β 1 , β 2 , and β 3 for Chapman-Richards and the von Bertalanffy growth models are not statistically significant at the 5% level.
Parameter estimates of the Weibull growth model except β 2 are statistically significant at the 5% level. However, the Marquardt iteration procedure did not converge for the negative exponential model and hence, no parameter estimates of this model are presented. Statistical significance of the parameters of the nonlinear models was determined by evaluating the 95% asymptotic confidence intervals of the estimated parameters. The null hypothesis H 0 :β j = 0 was rejected when the 95% asymptotic confidence interval of β j does not include zero. Table 11 provides predicted values of top height over the range of age using the least squares parameter estimates derived from the Marquardt algorithm. The monomolecular, Mitcherlich, Gompertz, Richard's, and the Weibull models have produced a slightly smaller residual standard error (0.13 m) compared to the logistic (0.18 m), and the von Bertalanffy and the Chapman-Richards models (0.14 m). All the models in Table 11 appear to predict reasonable estimates over the entire range of age.
When nonlinear models are fitted to a biological growth data set statistical non-significance of the estimated parameters might imply one of the following: (a) one or more parameters in the  Bertalanffy models indicate that the functions are suitable to model a system that encompasses the entire range of the life cycle (i.e., juvenile, adolescent, mature and senescent stages) of a biological response variable. However, the top height growth measurements considered in this study (Table 1) lacks data on juvenile and senescent stages of growth. Hence, non-significance of three of the parameters of the two models might be attributed to this cause. To support this argument we have included an initial data point  Table 2 and refitted the Chapman-Richards and the von Bertalanffy models. Table 12 shows the parameter estimates, asymptotic standard error (ASE) and asymptotic 95% confidence intervals (95% ACI) for each parameter of these two models. Without inclusion of the initial data point three of the parameters (β 1 , β 2 , and β 3 ) are not statistically significant (see Table 12). However, inclusion of the initial data point has resulted in statistically significant estimates of the three parameters. Inclusion of any additional data point from an early or juvenile stage of growth will result in significant improvement in the estimates of the parameters of these two models. This clearly illustrates that significance of the parameters of the Chapman-Richards and the von Bertalanffy growth models depends on the range of the growth data.
It is important to note that some of the models such as the negative exponential, monomolecular and the Mitcherlich have no points of inflection and are not sigmoid shaped. Hence, the models are not appropriate for modelling the entire range of the life cycle of biological response variables such as height growth that exhibit a sigmoid pattern over time (reason (c) in the previous paragraph). In other words, growth pattern of most living organisms follow slow initial and terminal growth rates, with fastest growth during the middle of the life cycle, and a maximum final size (Kramer andKozlowski 1979, Philip 1994). But evaluation of the second derivative (d 2 ω/dt 2 ) of the negative exponential, monomolecular, and the Mitcherlich models indicate that it is negative (d 2 ω/dt 2 < 0) over the entire rage of the independent variable. Hence, the current annual increment (dω/dt) of a biological response variable, according to these models, decreases throughout the range of the independent variable and this is contrary to the general pattern of growth of a biological system. These models, however, are useful for quantifying the latter stages of growth of a biological system. Therefore, statistical significance of the parameters of these models does not necessarily indicate the usefulness of the functions to simulate the top height growth data presented in Table 2 or to quantify any biological growth data that encompasses the entire life cycle of the response varia-ble being studied. On the other hand, the Gompertz, logistic, Chapman-Richards, Richard's, and the von Bertalnffy growth models have points of inflection and are sigmoid. These models are suitable for quantifying a growth phenomenon that exhibit a sigmoid pattern over time.

Discussion
The main focus of this paper is to provide the major components (i.e., the partial derivatives) required for estimating parameters of nonlinear growth models, test significance of the parameters, and interpret some of the relevant statistical outputs from a forestry perspective. These results show the fundamental importance of partial derivatives of nonlinear models in estimating their parameters. The NLIN procedure in SAS does not guarantee that the iteration converges to a global minimum sum of squares residual (SAS 1985). Hence, an alternative approach for avoiding the problem of non-convergence or convergence to unwanted local minimum sum of squares residual is to specify a grid of values for each parameter. Then NLIN evaluates the residual sum of squares at each combination of values to determine the best starting values for the iterative process. Initial values may be intelligent guesses or preliminary estimates based on available information. Initial values may, for example, be values suggested by the information gained in fitting a similar equation in a different laboratory or values suggested as "about right" by the experimenter based on personal experience and knowledge. Based on meaningful biological definition of the parameters of the nonlinear models, expressions to specify initial values for the asymptote and the biological constant were developed. These expressions were found useful to specify initial values of the parameters for modelling the sample top height data used in the study.
Provided that one has knowledge of the meaning of the parameters of the model and the phenomena being modelled, it may be possible to identify when an iteration procedure has converged to a local minimum sum of squares residual. One recognizes this non-optimal solution by reviewing the magnitude and sign of the estimat-ed parameters, and the size of the asymptotic correlation matrix of the estimated parameters. Large asymptotic correlation matrices for the estimated parameters of the nonlinear models may indicate that some of the parameters are not important or the model is overparameterized. However, Draper and Smith (1981) explain that large correlations of the estimated parameters do not necessarily mean that the original model is inappropriate for the physical situation under study. For example, in a linear model, when a particular β (a coefficient) does not appear to be different from zero, it does not always imply that the corresponding x (independent variable) is ineffective; it may be that, in the particular set of data under study, x does not change enough for its effect to be discernible. In general, efficient parameter estimation can best be achieved through a good understanding of the meaning of the parameters, the mathematics of the models, including the partial derivatives, and the system being modelled.