Parameter recovery vs. parameter prediction for the Weibull distribution validated for Scots pine stands in Finland

The moment-based parameter recovery method (PRM) has not been applied in Finland since the 1930s, even after a continuation of forest stand structure modelling in the 1980s. This paper presents a general overview of PRM and some useful applications. Applied PRM provided compatibility for the included stand characteristics of stem number (N) and basal area (G) with either mean (D), basal area-weighted mean (DG), median (DM) or basal area-median (DGM) diameter at breast height (dbh). A two-parameter Weibull function was used to describe the dbh-frequency distribution of Scots pine stands in Finland. In the validation, PRM was compared with existing parameter prediction models (PPMs). In addition, existing models for stand characteristics were used for the prediction of unknown characteristics. Validation consisted of examining the performance of the predicted distributions with respect to variation in stand density and accuracy of the localised distributions, as well as accuracy in terms of bias and the RMSE in stand characteristics in the independent test data set. The validation data consisted of 467 randomly selected stands from the National Forest Inventory based plots. PRM demonstrated excellent accuracy if G and N were both known. At its best, PRM provided accuracy that was superior to any existing model in Finland – especially in young stands (mean height < 9 m), where the RMSE in total and pulp wood volumes, 3.6 and 5.7%, respectively, was reduced by one-half of the values obtained using the best performing existing PPM (8.7–11.3%). The unweighted Weibull distribution solved by PRM was found to be competitive with weighted existing PPMs for advanced stands. Therefore, using PRM, the need for a basal area weighted distribution proved unnecessary, contrary to common belief. Models for G and N were shown to be unreliable and need to be improved to obtain more reliable distributions using PRM.


Introduction
There are two main options for predicting distribution in tree size distribution modelling: the parameter prediction method (PPM) and the parameter recovery method (PRM) (Hyink and Moser 1983). PRM has been divided to two main approaches: percentile-based and moment-based recovery. However, these methods are typically mixed (see Baldwin and Feduccia 1987;Cao 2004;Poudel 2011) and augmented with recovery equations that use neither moments nor percentiles, instead using a parameter such as total volume (Hyink and Moser 1983;Mehtätalo et al. 2007). Therefore, we ignore the percentile-and moment-based recovery division in this paper. In addition to the abovementioned parametric methods, non-parametric methods are also available for describing stand structure in terms of size distribution. In Finland, non-parametric methods that have been used include the k-MSN method and the k-NN method (Haara et al. 1997;Maltamo and Kangas 1998;Packalén and Maltamo 2007;Peuhkurinen et al. 2008).The distribution-free percentile-based diameter distribution models have been presented by Kangas and Maltamo (2000b). The Weibull distribution is the theoretical distribution most likely to be used due to its high flexibility relative to the number of parameters used. However, there is no a priori biological basis for using the Weibull distribution or any other statistical function (Shiver 1988). Thus, the selection of the distribution function is entirely up to the researcher. A similar situation arises when choosing whether the selected distribution function is used to describe the unweighted or weighted random variable. The former is typically selected for stem frequency distribution of tree heights or breast-height diameters (dbh), and the latter is selected for basal area-dbh distribution or stand volume-dbh distribution (see Loetsch et al. 1973).
In Finland, models are typically based on a weighted, basal area-dbh distribution, whereas elsewhere, basal area-dbh distribution models are rarely used (see Gove and Patil 1998). For example, in Scandinavia, dbh-frequency distributions are typically unweighted (e.g., Mønnes 1982;Tham 1988;Holte 1993). The use of weighting is partly a result of relascope-sampled data, which provide direct estimates for the stand basal area, partly because of its ability to emphasise the large and the most valuable trees (see Päivinen 1980). Weighted distribution models include beta, Weibull and Johnson's SB functions (e.g., Päivinen 1980;Maltamo 1997;Siipilehto 1999;Siipilehto 2011a;Kilkki and Päivinen 1986;Kilkki et al. 1989;Maltamo et al. 1995;Siipilehto et al. 2007). The few existing dbh-frequency distributions used in Finland were presented at the beginning of 21st century (Maltamo et al. 2000;Sarkkola et al. 2003;Sarkkola et al. 2005). More recently, comparisons between dbh-frequency and basal area-dbh distribution model were made by Maltamo et al. (2007) and Siipilehto (2011a) in Finland and Gobakken and Naesset (2004) in Norway. Note that for young stands, the assessed stand characteristics are unweighted in the Finnish National Forest Inventory (NFI) and in forest management planning (FMP) fieldwork (see Solmun maastotyöopas 1997). Many of the existing distribution models used in Finland have been compared with each other to find the most reliable models for applied forest planning system use (Maltamo et al. 2002a;Maltamo et al. 2002b). The system requires auxiliary models for stand characteristics, such as stand basal area and weighted median dbh for young stands models developed by Nissinen (2002). In addition, there are still few distribution models specifically for young stands. These models include height distribution models for Scots pine (Siipilehto 2006a) and planted Norway spruce (Valkonen 1997), in addition to dbh-frequency distribution models for Scots pine (Siipilehto 2011a), all of which apply the Weibull function.
Note that parameter prediction method has been improved lately by using Generalized Linear Model (GLM) to fit the distribution function and the prediction models for the parameters simultaneously (Cao 2004). However, the improvement by GLM was quite marginal, when the prediction equations provided high degrees of determinations (see Siipilehto 2006a).
The parameter recovery procedure is different from direct parameter prediction. In general, one needs as many known distribution-related stand characteristics as there are parameters in the used distribution function. These characteristics may be moments (e.g., mean and quadratic mean) or percentiles (e.g., median) of the distribution, but the characteristics can be any stand characteristics that can be mathematically derived from the distribution (see Hyink and Moser 1983;). In special cases, one might use only moments (e.g., Burk and Newberry 1984;Lindsay et al. 1996) or only percentiles (e.g., Dubey 1967;Bailey and Dell 1973;Gobakken and Naesset 2004). If the number of known characteristics is lower than the number of distribution parameters, then some of the characteristics need to be predicted using regression models. Moments or percentiles are better understood in terms of their relationship with stand characteristics and, therefore, are regarded as easier to predict from other stand characteristics (see Knoebel and Burkhart 1991;Gobakken and Naesset 2004).
It is interesting that since the old studies by Cajanus (1914), Ilvessalo (1920), Lönnroth (1925) and Lappi-Seppälä (1930), recovery models have not, for the most part, been used in Finland. Some exceptions include pure percentile-based recovery models for special cases, such as the effect of moose browsing or the effect of retained trees and stand edges on the height distribution for Scots pine sapling stands (Valkonen et al. 2002;Siipilehto and Heikkilä 2005;Siipilehto 2006a;Ruuska et al. 2008). As an option in model validation, Kangas and Maltamo (2000c) used sample minimum and maximum for recovering the three-parameter Weibull function. More recently, parameter recovery utilizing predicted volume from airborne laser scanning (ALS) data have been presented by Mehtätalo et al. (2007) and applied by Peuhkurinen et al. (2011).
In this study, we first present the general relationships between tree diameter distribution and certain stand characteristics (number of stems, quadratic mean diameter and both basal-area weighted and pure mean and median diameters). These relationships are used to develop parameter recovery equations for a two-parameter Weibull function. Using these equations, four combinations of recovery equations are proposed. Two of these combinations are evaluated thoroughly and compared with the latest parameter prediction methods (Siipilehto 2011a;Siipilehto 2011b). The effect of the imputation of missing stand characteristics in PRM is also evaluated.

Test data
The data were obtained from 6th and 7th NFI-based permanent sample plots, established in seedling stands (TINKA: dominant height, HDOM < 5 m when established) and more advanced stands (INKA: HDOM ≥ 5 m) in 1976-1989 (see Gustavsen et al. 1998). The re-measurements were carried out twice, 5 and 10 years or 5 and 15 years after establishment for the INKA and TINKA data sets, respectively. Each stand sample plot consisted of a cluster of three circular plots within a stand. These plots were combined for reliable stand characteristics and distributions (see Shiver 1988). The total number of trees tallied was approximately 100-120 per stand. In the sapling stand TINKA data, all crop trees were measured for dbh and height. In the INKA data, the tree heights were measured from smaller radius circular plots representing one-third of the total sample area. Missing tree heights were completed with Näslund's (1936) height curve fitted by stand. Instead of pure expectation, random error was added to the expected height (see Siipilehto 2011a). Data were restricted to D > 1.5 cm and G > 1.5 m 2 ha -1 to avoid the obvious anomalies resulting from the low proportion of trees above breast height in the youngest stands. Siipilehto (2011a) divided the data randomly into a modelling data set (75%) and a validation data set (25%) to evaluate the models.
Apart from the formulation of the systems of equations, the applied parameter recovery method did not need any further modelling. Thus, the recovered Weibull distributions were simply tested using the above mentioned validation data set ( Table 1). The final number of observations was 467 for the validation data set. The data set was divided to represent young stands, (mean height H < 9 m, 248 stands) and advanced stands (H 9 m, 219 stands). Note that stand characteristics are calculated from tallied trees and the measured dbh is assumed as being error-free.

Diameter distribution and stand characteristics
Let random variable X be the diameter of a tree in a forest stand. The probability density function of tree diameter is

N N
To emphasise that the density f N (x) is proportional to the number of stems, we use subscript N here. The corresponding cumulative diameter distribution is The number of stems between the diameters x 1 and x 2 is N(F N (x 2 ) -F N (x 1 )) where N is the total number of stems. It is often more convenient to use the basal area instead of the number of stems to measure the stand density. The density function of the basal-area weighted diameter distribution is a sizebiased distribution of order two (Gove and Patil 1998), which can also be viewed as a special case of a weighted distribution: The corresponding cumulative distribution function is G G x 0 and the basal area between the diameters x 1 and x 2 is G(F G (x 2 ) -F G (x 1 )) , where G is the total basal area. To derive relationships between different stand characteristics, the following general expression, which holds for all values of x, is very useful: where t(x) defines an increasing transformation of tree diameter x and T is the total of that variable (e.g., in per ha basis). The most practically useful transformations as t(x) are tree volume and basal area and in those cases, T is the total volume or basal area. If the transformation is basal area, the equation yields that yields the relationship for quadratic mean diameter, basal area and the number of stems: N 2 2 0 in which q = π / (2 × 100) 2 is the conversion factor to square meters. The arithmetic mean diameter is the expected value of X over the unweighted distribution The basal-area weighted mean diameter is the expected value of the basal-area weighted distribution: The inverse of cumulative distribution function provides the quantile function: 1 which can be used to compute different tree diameter quantiles. In the context of tree diameter distributions, the particularly important quantiles are the arithmetic and basal-area weighted median diameters, which are the 0.5th quantiles of the unweighted and basal-area weighted diameter distributions, respectively: The following section provides applications of the above-specified relationships to the recovery of the parameters of the two-parameter Weibull distribution using different stand characteristic measurements.

Properties of the two-parameter Weibull distribution
The Weibull distribution has been widely used to describe and predict size distributions in forestry. It can be regarded as quite flexible in describing different shapes of unimodal distributions (Bailey and Dell 1973). The two-parameter Weibull probability density function (f) is as follows: where x is the random variable, the observed diameter or height of a tree in a stand plot and b and c are the scale and shape parameters of the Weibull function, respectively. Assuming the Weibull distribution for random variable X, The mean of X (Eq. 8) becomes N and the mean of the basal-area weighted distribution (Eq. 9) becomes The cumulative Weibull function has a closed-form expression: The closed-form cumulative function makes the computation of quantiles easy. The median (Eq. 11) becomes The density of the basal-area weighted diameter distribution (Eq. 3) yields that implies (Gove and Patil 1998) that transformation Y = (x / b) c has the standard gamma distribution with the cumulative distribution function where k = 2 / c + 1 and γ(k), the incomplete gamma function, is a shortcut for the integral . Although integrals γ(k, y) and Γ(k) cannot be solved in closed forms, they are practically useful because numerical evaluation of the integrals has been implemented in most computing environments. Using the distribution function of Y, the basal-area weighted median diameter (Eq. 12) for the Weibull-distribution of tree diameter can be written as (because the median of a monotonic transformation equals the monotonic transformation of the original median): c 1 1 where F -1 (y) is the quantile function of Y that is the inverse function of the standard Gamma cumulative density function.

Parameter recovery options
We focused on PRM using the two-parameter Weibull distribution and solving it using the commonly assessed mean or median diameters, such as arithmetic mean (D), basal area-weighted mean (DG), median (DM) and basal area-median (DGM) using the recovery equations (14), (15), (17) and (20), respectively. These equations ensured that the applied mean or median diameter of the stand was equal to the corresponding characteristic computed from the recovered Weibull distribution.
In addition, Eq. (21) was used to ensure compatibility with stand density in terms of the number of stems and basal area. The recovery equations are presented in Table 2. As there are two parameters in the Weibull distribution, a recovery needs two of these equations for the system to have a unique solution. Therefore, recovery Methods A-D (See Table 2) were constructed, each of which used the basal area and number of stems and one additional measure for mean or median stand diameter. Note that no matter the mean or median used for solution, the solved two-parameter Weibull function always represents dbh-frequency distribution in this paper. Table 2 lists the equations in the form of finding the roots, f(y) = 0, as they were required in the subroutine funcv for the Newton's method subroutine newt of the Numerical recipes in FORTRAN (Press et al. 1992, p. 379). The FORTRAN IMSL library was used for solving the parameters (see Appendix).

Stand characteristics and the height curve
If the stand characteristic required for PRM was not measured in the field, it was predicted. The applied family of models (Siipilehto 2011a) consisted of seemingly unrelated regressions for eight A, B, C, D stand characteristics (G, N, D, DGM, H, HGM, DDOM, HDOM) and for the parameters of three optional dbh-distributions along with the height curve. The models were calibrated using linear prediction theory for the best linear unbiased predictor (BLUP) estimation (see Lappi 1991).
Known stand characteristics were applied as calibrating variables to improve the expectations for the unknown stand characteristics and unknown parameters of the distribution functions and the height curve. Calibrated (BLUP) estimates for the parameters of Näslund's (1936) height curve have demonstrated excellent goodness-of-fit (see Siipilehto 2011b, page 28-30). In conclusion, the family of models explored by Siipilehto (2011a) included all the needed sub-models (stand characteristics, dbh distributions, height-dbh relationship) for generating dbh and height for the sample trees of a stand from different combinations of known stand characteristics. In the present study, comparisons between distribution modelling approaches in their capability to generate essential stand characteristics (e.g., assortments volume) were made exactly in the same manner as in Siipilehto (2011a). Although the focus was on Weibull dbh-frequency distribution (W N ), the results can be compared similarly with the BLUP estimated basal area-dbh distribution using Weibull (W G ) and Johnson's SB (SB G ) functions (Siipilehto 2011a). In addition, a recent SB G model (Siipilehto et al. 2007) and the Weibull height distribution model for young stands (Siipilehto 2009) are compared with the same data in Siipilehto (2011b).

Validation of the models
Because the BLUP models for stand characteristics included D and DGM, but not DM and DG, only Method A and Method D were validated thoroughly. When predicting the missing stand variable, we started from the assumption that the commonly known stand characteristics, according to guidelines for FMP and for NFI, were available for linear prediction. This presupposed knowledge of the basal area-weighted characteristics (DGM, HGM and G) for advanced stands and arithmetic stand characteristics (D, H and N) for young stands. These variables are referred to later as SOLMU variables (see Solmun maastotyöopas 1997). The most realistic additional characteristics for advanced stands are HDOM and N because they are sometimes additionally required for the aforementioned characteristics (Kuvioittainen arviointi ja... 1998). The combination of known stand characteristics has its effect not only on recovered or predicted dbh-distribution but also on the predicted Näslund's (1936) height curve. Therefore, both parameters have their effect on generated stand characteristics; recovered or predicted distributions have a particular their effect on generated dbh characteristics: D, DGM and DDOM, in addition to basal area, G, while the predicted height curve has its effect especially on height characteristics, H, HGM and HDOM. Finally, both parameters are needed to derive tree and stand volume. Stand volume and timber assortments were calculated according to models for individual tree volume and a taper curve as a function of known tree dbh and height (Laasasenaho 1982). The well-known inequality claims that the arithmetic mean, D, is always smaller than the quadratic mean, DQM (Hardy et al. 1988). In practice, it was shown that DGM should be greater than DQM for the unimodal Weibull distribution. When DQM is approaching DGM, the recovered shape parameter c is increasing and the distribution is turning into a peak. We noticed that the BLUP estimated basal area was not accurate enough and sometimes resulted in a DQM that was lower than or equal to D. Note that if this happened, PRM was not able to find the solution. In these cases, we were forced to re-predict basal area. Using the model by Nissinen (2002), the inequality DQM > D was achieved and the parameters could be solved so that the convergence criterion (10 -5 ) was always met.
The accuracy of the compared methods was validated in terms of bias (22) and RMSE (23) in the generated stand characteristics, such as stem number, basal area, dominant diameter and height, volume of total growing stock, timber assortments and waste wood fraction. The relative bias and RMSE were calculated by dividing them by the mean value of the observed Y i and they were expressed as percentages.
Using the same data and the same validation criteria as Siipilehto (2011a) and Siipilehto (2011b) means that the results were fully comparable, not only between PPM and PRM methods for the Weibull in this paper but also among all of the models that were previously validated by Siipilehto (2011b). Note that the trees for young stands were sampled from the cumulative Weibull distribution, imitating fixed area sampling, whereas trees for the advanced stands were sampled from the weighted distribution, imitating relascope sampling. A systematic sample of 40 trees was taken from the predicted distributions.

PRM compared with PPM for characterising the variation in N
Method D was used in a 25-year-old Scots pine stand with fixed DGM = 10 cm, G = 10 m 2 ha -1 and varying stem number, N (ha -1 ): 3100 for high density, 2500 and 1900 for moderate densities and 1300 for low density forest stands. Comparisons were made between the BLUP estimated (PPM) and recovered (PRM) W N distributions. Similar comparisons between various models can be observed in Siipilehto (2011b, p. 33-34). The BLUP estimation did not respond adequately to the variation in stand density (Fig. 1). Thus, PPM resulted in as high as 27% underestimation in stem number for the highest density and 16% overestimation for the lowest density when the distribution was scaled for the basal area. PRM found converged solutions for these distributions and thus provided correct densities. The resulting variation in the shape of the W N distribution was surprisingly wide. The highest density of 3100 ha -1 resulted in a strongly right-skewed, almost decreasing distribution while the lowest density of 1300 ha -1 resulted in a very peaked distribution and therefore, it was shown with its own Y-axis in Fig. 1. Indeed, in the lowest density, the trees were mostly distributed between the 8 and 11 cm dbh-classes while in the other cases, the trees were distributed between the 1 and 20 cm dbh-classes. In the peaked distribution, DQM (9.896 cm) was close to DGM, resulting in a high value for the shape parameter c (24.5). The only existing PPM that could provide an almost adequate response to the same variation in stem number was the SB G distribution model by Siipilehto et al. (2007), as shown by Siipilehto (2011b, Fig. 5 on p. 34).  the unknown stand characteristics. Fig. 2 gives us the observed and the expected W(0) distribution without localisation, in addition to the distributions localised with N, D and H (i.e., W(N, D, H)) and with the varying additional stand characteristics (e.g., W(+G) with the additional basal area). The resulting distributions resembled those in Siipilehto (2011a, Fig. 4). However, the primary difference is the fact that basal area was the most useful additional variable for PRM while it did not calibrate the BLUP estimate for W N effectively (Siipilehto 2011a). The error (overestimation) in total volume with the additional G was 5.3 m 3 ha -1 (12%) using PPM and only 0.9 m 3 ha -1 (2%) using PRM. An even smaller error was found using PRM with the additional DGM, namely 0.7 m 3 ha -1 . However, when using the common SOLMU variables (N, D and H), the resulting error was approximately the same, 5 m 3 ha -1 , an overestimation with both PPM and PRM.

Localising for a real stand
It is worth mentioning that if parameters were recovered using Method B, C or D (not shown) from the known N and G, together with either DG (12.3 cm), DM (11.1 cm) or DGM, the solution for the parameters were so close to that of Method A that the differences were hardly visible. Indeed, parameter b was always the same 12.06 and c was given values of 4. 421, 4.474, 4.458 and 4.466, respectively.

PRM compared with PPM for generated stand characteristics
Method A for PRM was used in young stands except with the knowledge of N, DGM, HGM and G, when Method D was used instead. The first bars in the Figs. 3-5 represent knowledge of the common SOLMU variables. When the SOLMU variables were assumed to be known in young stands, the RMSE of each validated characteristic was higher in PRM compared with any PPM in the studies by Siipilehto (2011a, b). Included additional HDOM and especially DDOM improved the accuracy in all validated stand characteristics. The knowledge of G in addition to N means that DQM and thus the 2nd raw moment could be derived correctly. This was shown to be a huge improvement in the accuracy of the stand characteristics (Fig. 3). For example, RMSE in total volume was 27% when predicted with SOLMU variables, and the additional dominant tree characteristics reduced RMSE to 22-19%, whereas the additional G provided an RMSE of only 3.5% using Method A and 3.6% using Method D (Fig. 3).
The two rightmost bars in Figs. 3-5 demonstrate that the RMSEs of the validated characteristics were quite comparable, regardless if Method A with D or Method D with DGM was applied in PRM, when both N and G were also known. This comparability was also true in both young and advanced stands. However, in young stands, pulpwood volume as well as HDOM benefited more from DGM knowledge, while waste wood and DDOM were slightly more accurate with a known D. Respectively, in advanced stands, the total volume and HDOM were more accurate with DGM, whereas the volume fractions and DDOM were slightly more accurate with D (Figs. 4 and  5). Finally, knowledge of G and N resulted in an RMSE of only 1.6% in total volume with known DGM while it was 2.8% with D. When N was unknown, the BLUP outcome estimation presented only modest accuracy (RMSE of 20%).
The compared methods produced different type of biases. In young stands, Siipilehto (2011a) found 4.6% underestimation in total volume with W N , when SOLMU input variables were used as a calibrating variables (Table 3). However, using PRM with the BLUP estimate for basal area, the corresponding bias reached a 12% underestimation. In addition, the same 12% underestimation was also found for DDOM, as was a 9% underestimation of HDOM. The BLUP estimation for W N resulted in the smallest biases in most cases, if the sum characteristics G and N were not also known (Table 3). In general, PRM was superior in terms of bias if both G and N were also known. The residual errors in such a case are given in Fig. 6 for the total volume in young stands using PPM and PRM. Fig. 4. RMSE% in volume characteristics using parameter recovery with BLUP estimation for missing stand variable in advanced stands (Siipilehto 2011a). Method D was used except when G, D, H and N were known, in which case, Method A was applied. If we compare the W N distribution in advanced stands, PRM resulted in generally lower bias in total and commercial wood volumes but also a higher bias in waste wood volume as well as in stem number and dominant tree characteristics (Table 4). The overestimations of 7% of DDOM and 13-14% of N were the obvious reasons for the 23-24% overestimation in waste wood volume. Indeed, if N and G were known, they were unbiased and the bias in DDOM was reduced to 0.3-1% using PRM. Thus, the determination of the correct DQM through the knowledge of G and N proved essential for the accuracy of PRM. However, no matter the compared method or combination of known stand characteristics, the total volume was always only slightly biased (Table 4) (see also Siipilehto 2011a, b). The approximately 7-10% bias in log and pulpwood volumes using BLUP estimation for W N was reduced to 0-6% using PRM. The best-performing W G distribution in the study by Siipilehto (2011a) also most frequently presented (17 times out of 35 cases) the least   N, D, H  biased stand characteristics in the present study. According to the data reported in Table 4, choosing the size distribution model such that the available data and the PPM applied are based on the same scale (i.e., arithmetic stand variables for frequency distribution, weighted medians for basal-area dbh distribution) appears to be essential. In the case of PRM, the mean characteristic used was of no practical importance (Table 4).

Discussion
It is convenient to present various options for parameter recovery in one paper. In addition to the standard moment-based parameter recovery with the arithmetic mean (D) together with the second raw moment, squared quadratic mean dbh, DQM 2 (e.g., Lindsay et al. 1996;Cao 2003), we presented equations utilising basal area-weighted mean (DG), median (DM) or basal area-weighted median (DGM) for the recovery of the two-parameter Weibull function. DQM was calculated from either the known or predicted stem number (N) and basal area (G). The Newton-Raphson method for root-finding was used to solve the parameters of the system of two non-linear equations. Because this method appeared somewhat sensitive to the initial guess, we emphasised finding functions for the appropriate initial values of the shape parameter c. The final functions eliminated approximately 50-70% of the needed iterations compared with the use of constant initial values that converged for each stand (see Appendix). Studies by Siipilehto (2011a;b) in Finland previously indicated that dbh-frequency distribution was superior in young stands, but the weighted, basal area-dbh distribution was a better option in advanced stands in terms of the accuracy in stand characteristics generated from predicted distributions. Similar results have been shown by Gobakken and Naesset (2004) and Palahi et al. (2007). Furthermore, Siipilehto (2011b, p. 33-34) noticed that stem number was not an efficient variable for calibrating the shape of the Weibull distribution, but instead, it was more useful for calibrating the SB G distribution using BLUP estimation. Thus, considerably high RMSE was found in the generated stand density with the Weibull distribution using PPM.
Recently, a great amount of effort has been expended toward calibrating the predicted distribution to achieve compatibility between various input stand characteristics. This has been performed mostly to improve the accuracy in stem volume and assortments (see Kangas and Maltamo 2000a;Kangas and Maltamo 2002;Mabvirura et al. 2002). Note that the calibration estimation presented by Deville and Särndal (1992) necessitates abandoning the smooth shape of the original distribution function. Thus, calibration functions enable mimicking of the irregularities (e.g., bi-or multimodality) of the observed distributions (see Kangas and Maltamo 2000a;Kangas and Maltamo 2003). The Mehtätalo (2004) method, on the other hand, performs the calibration through adjusting directly the predicted diameter percentiles. In contrast, PRM provides a unique solution, keeping the original smooth shape of the unimodal distribution function together with the demonstrated compatibility and without a priori estimated prediction models. In this approach, the irregularities of the observed distribution are more or less brushed aside and regarded as random variation due to sampling. The interesting question raised when applying PRM is how much the achieved compatibility can improve e.g. volume characteristics compared with PPM.
In young stands, PRM was able to yield a clearly lower RMSE for total volume (3.6%) than PPM at its best (8.4% according to Siipilehto 2011a). Conversely, the lowest RMSE for DDOM (3.8%) and HDOM (4.4%) were found using the BLUP estimation for W N as the best option for existing PPMs (Siipilehto 2011a). There are two main reasons for the above results. On one hand, the BLUP estimations for the Weibull distributions (W G and especially for W N ) were effectively calibrated with dominant tree characteristics (Siipilehto 2011a) while the applied PRM did not utilise knowledge of dominant tree characteristics, when solving the parameters of the Weibull distribution. Thus, the more accurate DDOM using the BLUP estimated Weibull distribution resulted in the more accurate HDOM as well. On the other hand, the compatibility in the sum characteristics G and N using PRM resulted in a meaningful improvement in volume characteristics. Of the compared PPMs in advanced stands, the BLUP estimation for W G provided the lowest RMSE of 1.7, 5.3 and 7.0% for total, log and pulp wood volumes, respectively, but W N provided the lowest RMSE for waste wood of 11.6% (Siipilehto 2011a). Apart from total volume, the above best results required a large amount of information in addition to common SOLMU variables (see Siipilehto 2011a, Table 10). PRM proved competitive with PPM as long as G and N were both known. For example in volume fractions, PRM provided an RMSE of 5-7% in log and pulp wood volumes, which were quite similar to the best options using PPM. Indeed, the BLUP model for the W G and SB G distribution by Siipilehto et al. (2007) provided an RMSE of 6-8%. Simultaneously, an RMSE in waste wood of 10-13% using PRM was equal to PPMs at their best but clearly lower than PPMs in general (22-25%), when N was a known input variable in addition to SOLMU variables (see Siipilehto 2011b, Fig. 7).
Overall, the results of the PRM were closely related to the accuracy of the models, which were used for the unknown stand characteristics. In this study, the stand characteristics were predicted using BLUP estimation, which enabled calibration of the expectations with the arbitrary set of known stand characteristics under linear prediction theory (see Lappi 1991;Siipilehto 2011a). For PRM, the predictions for G in young stands and predictions for N in advanced stands proved crucial. Burk and Newberry (1984) warned that PRM is very sensitive to small changes in the moment predictions. Because the BLUP estimated basal area was not accurate enough, the solved DQM was sometimes lower than D and PRM was not able to find the solution. For example, with the SOLMU input variables, this outcome was observed 30 times out of 248 cases in young stands. When the Nissinen model (2002) was used for re-predicting basal area, the required inequality D < DQM was achieved. On the other hand, the BLUP estimated N, calibrated with SOLMU variables for advanced stands (i.e., G, DGM and HGM) was suitable for recovery because the resulting DQM was lower than DGM. However, the accuracy in N was unsatisfactory, having an RMSE of 20% and a bias of 13%.
The evaluation data of this study included error-free measurements of the stand characteristics used. In practice, however, the measurements always include errors. Therefore, in a practical situation, the incompatibility of the stand characteristics may lead to lack of solution more often than we observed in this study. Furthermore, the accuracy of the recovered diameter distribution may be lower than demonstrated here with error-free data. Indeed, in Finland, the RMSE in N and G in forest inventory data have varied between 18-81% and 10-32%, respectively. Today, ALS is used in practical forest inventory and ALS based applications have shown similar Uuttera et al. 2006) or improved accuracy (Suvanto et al 2005; Uuttera et al. 2006;Peuhkurinen et al. 2011;Naesset (2002) compared with the conventional visual assessment of the stand characteristics or photo-interpretation (Kangas et al 2004;Uuttera et al. 2006).
In conclusion, in simulation systems, such as MELA or MOTTI, a considerable number of different types of models are linked together (see Hynynen et al. 2002). Especially in young stands, we are attempting to follow the development of numerous stand characteristics over time in the current version of MOTTI (Siipilehto 2006b). When reaching a threshold of HDOM of 8 m, the followed stand characteristics are converted into individual trees using alternative size distribution models. It is obvious that the sampled trees describing the stand should represent at least those stand characteristics that were used for predicting the distributions. The current PPM models in MOTTI, namely, the Weibull height distribution (Siipilehto 2009) and SB G diameter distribution (Siipilehto et al. 2007), perform quite satisfactorily, but they do not provide compat-ibility. Strict compatibility, if required, may be difficult to achieve and the solution would require extra calibration functions, such as those shown by Kangas and Maltamo (2000a) and Palahi et al. (2006). However, the compatibility between the three essential stand characteristics is obviously one of the desired features that the presented parameter recovery method can offer. At its best, the parameter recovery method provided superior accuracy for young stands and at least competitive accuracy for advanced stands compared with any existing distribution model used in Finland. In addition, the common belief that the random variable needs weighting for advanced stands proved somewhat controversial. As a whole, the PRM approach can be best improved by improving the accuracy of the forest inventory methods (e.g. Peuhkurinen et al. 2011) and models for the most appropriate stand characteristics. To this end, multivariate multiresponse modelling (see Miina and Saksa 2006;Miina and Heinonen 2008) for stand characteristics are under study. To be able to find a solution for the unimodal Weibull distribution using PRM, we must ensure that the inequalities D < DQM < DGM hold for the input variables in the system of equations, regardless whether they are given as predicted characteristics or determined by measurement or visual assessment.