Models for diameter distribution and tree height in hybrid aspen plantations in southern Finland

Highlights


Introduction
Hybrid aspen, a hybrid between the European aspen and North American trembling aspen (Populus tremula L. × P. tremuloides Michx.), is highlighted with great potential in modern wood production and biofuel forestry in northern Europe (Tullus et al. 2012;Fahlvik et al. 2019).Although it was first introduced for the matchwood industry in 1950s, as of today, plywood and veneer are one of the principal consumer goods because of the fibre characteristics and outstanding growth rate (Beuker 2000;Heräjärvi and Junkkonen 2006).Moreover, given the urgent need to mitigate climate change, hybrid aspen may play a significant role.First, due to its high growth rate, hybrid aspen plantation can uptake atmospheric carbon more rapidly than other northern tree species.Second, its use as bioenergy is expected to be accelerated in accordance with international conventions such as the Paris Agreement (Commission of European Communities 2008; European Commission 2013; United Nations Framework Convention on Climate Change 2015; Hytönen 2018).
Significant advantages of hybrid aspen in terms of growth characteristics and productivity based on clonal plantations have been reported in many studies (Hynynen et al. 2004;Luoranen et al. 2006;Stener and Westin 2017;Hytönen et al. 2018;Niemczyk et al. 2019).They generally concluded that hybrid aspen was a promising broadleaved species particularly for the region of coniferous-dominated forests such as those of northern Europe (Tullus et al. 2012;Fahlvik et al. 2019).The high growth and yield potential of hybrid aspen plantations has been observed in longterm experiments in Finland (Hynynen et al. 2002) and Sweden (Fahlvik et al. 2019).Nevertheless, growth and yield models for hybrid aspen have yet to be developed for stand management purposes (Stener et al. 2019).There is still a lack of modelling research except for a few studies carrying out the dominant height and site index model (Johansson 2013;Lee et al. 2021), thus highlighting the importance and necessity of developing stand-and tree-level models.
The choice of modelling approach depends on the characteristics of the stands to be modelled and their intended use.If stand structure is heterogeneous, e.g.including diverse tree species with a large variation of within-stand age or size, tree-level models instead of stand-level models are often preferred to better predict stand structure and dynamics (Somers and Nepal 1994;Quin and Cao 2006).However, in stands with a more homogeneous structure, such as even-aged, singlespecies plantations of clonal origin, the stand-level modelling approach is a viable option (Pienaar and Rheney 1995;Scolforo et al. 2019).
In practical forestry, forest inventory data often contain mean and sum values of measured stand characteristics, but seldom include information about individual trees.There are some differences of the collected stand characteristics between inventory methods.In the field inventory the mean characteristics are typically basal area-median dbh (DGM) and the corresponding height (HGM) due to the convenience to assess from the measured relascope sample plots.Such examples are the compartment-wise field inventory (CWFI) and the national forest inventory (NFI) (Koivuniemi and Korhonen 2006).In addition, the assessed stand characteristics for the advanced stands include basal area (BA) but do not include trees per hectare (TPH) in CWFI, NFI or in the open access Metsaan.fiservice for stand compartments.Today, the area-based approach (ABA) for the airborne laser scanning (ALS) data for the 16 m × 16 m grid cells includes both BA and TPH together with basal area-weighted mean dbh (DG) and the Lorey's height (HG) for Scots pine (Pinus sylvestris L.), Norway spruce (Picea abies (L.) Karst.) and broadleaves (Siipilehto et al. 2016).Thus, stand-level models are directly compatible with these data containing stand characteristics because stand-level models can be used to predict both stand dynamics (regeneration, growth, and mortality) and within-stand structure (diameter and height distributions).Information on trees' within-stand size variation is necessary to assess timber assortments and the monetary value of the growing stock (Vanclay 1994).Further, size distribution models generate tree lists representing the stands, which are often used as input data of stand simulators applying individual-tree growth and yield models such as Motti in Finland (Salminen et al. 2005).
To generate a tree list that represents a stand, we need the diameter distribution of tree diameters at breast height (dbh), and a tree height model to complete the description for tree dimensions.Parameter prediction methods (PPM) for different distribution functions are already available, but none for hybrid aspen in Finland.There are several PPM in Finland, e.g. for the Weibull function (Kilkki and Päivinen 1986;Kilkki et al. 1986;Hökkä et al. 1991;Maltamo et al. 1995;Maltamo 1997;Siipilehto 1999Siipilehto , 2011) ) for predicting diameter distributions, mainly for Scots pine and Norway spruce but also for broadleaves (Siipilehto 1999).In addition, PPM for the more flexible Johnson's SB distribution exists for pine, spruce, and birch in Finland (Siipilehto 1999;Siipilehto et al. 2013).Similarly, several height models have been presented in Finland (Lappi 1997;Mehtätalo 2004Mehtätalo , 2005;;Siipilehto 1999;Siipilehto and Kangas 2015), but none for hybrid aspen.Mehtätalo et al. (2015) showed the good performance of Näslund's height curve among a large number of alternative height models and tree species.Further, Siipilehto and Kangas (2015) presented alternative Näslunds's height curve models for Scots pine, Norway spruce, and birch species (Betula pendula Roth and B. pubescens Ehrh.) in Finland.Siipilehto and Mehtätalo (2013) showed a superior result using the parameter recovery method (PRM) for 2-parameter Weibull distribution instead of PPM for Scots pine stands.However, only arithmetic mean dbh (D) and DGM were included when PPM and PRM were compared in terms of accuracy in volume characteristics (Siipilehto and Mehtätalo 2013).D and DGM obtained quite comparable accuracy, but D was slightly better for young stands, whereas DGM performed slightly better for advanced stands in terms of bias and RMSE in volume characteristics (Siipilehto and Mehtätalo 2013).Thus far, we have no evidence about how the median dbh (DM) or DG perform with the parameter recovery method.No matter which mean or median dbh is selected, the recovered distributions can provide it correctly with TPH and BA.The alternative mean or median can therefore be assumed to provide quite comparable results.
The general objective of this study was to develop methods and models for predicting individual tree dimensions (dbh and height) in hybrid aspen plantations.The specific objectives were: i) to test the optional mean and median characteristics for the PRM in solving diameter distribution; and ii) to develop the tree height models with the best performing stand characteristic.Diameter distribution with the height-diameter relationship provided the final models and methods for predicting tree dbh and height for hybrid aspen.

Modelling data
Data for modelling were collected in clonal hybrid aspen plantations in southern Finland, which had been designed for assessing the effects of varying initial planting densities in four different locations (i.e.experiments).The experiments were established in Lohja, Jalassaari (60°13´N, 23°56´E), Lohja, Kirkniemi (60°11´N, 23°57´E), Lapinjärvi (60°39´N, 26°08´E), and Pornainen (60°32´N, 25°20´E) in Finland, and comprised of 4-24 plots with a plot area of 1000 m 2 by location (Lee et al. 2021).The experiments were repeatedly remeasured 7-12 times by site between 1997 and 2016.The detailed information on clones, plot design, and the history of silvicultural treatment was described in Lee et al. (2021).In each plot, dbh and height were measured for all trees.
A variety of stand-and tree-level characteristics such as stand age (AGE), stand density in terms of TPH and BA, diameter, and height were used for modelling tree dimensions (Table 1).Among them, four kinds of mean or median dbh were analysed to find the best option for the parameter recovery method: D, DG, DM, and DGM with quadratic mean dbh (DQ) squared as a second moment (Table 1).Dominant height (HDOM) was calculated by averaging tree height of one hundred trees with the largest dbh per ha.Site index (SI) was computed based on the densitysensitive site index model suggested by Lee et al. (2021).

Parameter recovery method for dbh distribution
We applied the 2-parameter Weibull function to characterize dbh distributions.The 2-parameter Weibull probability density function (f) is as follows: where x is the random variable, the observed diameter of a tree in a plot, and b and c are the scale and shape parameters of the Weibull function respectively.The computation of quantiles (e.g.median) is easy due to the closed-form cumulative Weibull function: 2 Parameter recovery equations for the 2-parameter Weibull distribution are presented in Siipilehto and Mehtätalo (2013) for the optional mean and median characteristics: D, DG, DM, and DGM.The corresponding recovery equations are as follows: With the mean or median characteristics, we need the second raw moment for the recovery.The second moment of diameter distribution is as follows: In the previous equations, the gamma function Γ( ) k is a shortcut for the integral, is the quantile function of Y that is the inverse function of the standard Gamma cumulative density function, and DQ is the quadratic mean dbh (Siipilehto and Mehtätalo 2013).
TPH and BA are used to define the second moment, which is DQ 2 = BA/(TPH × q), in which q is a conversion factor π/200 2 = 7.854E−05.Note that the square root of the second raw moment is the quadratic mean diameter (DQ).We applied the recovery equations by Siipilehto and Mehtätalo (2013), utilising their supplementary file Weibull2recovery.r for solving the two parameters of the Weibull function (also found as recweib function in lmfor package by Mehtätalo and Kansanen (2020)).The Newton-Raphson method for root finding was used to resolve the parameters of the system of two non-linear equations.

Prediction for individual tree height
We used the Näslund height curve (Näslund 1936) model as a base function to predict individual tree height (h).Näslund's function is as follows: where b 0 and b 1 are the estimated parameters.Power 2 in Eq. 8 was selected for hybrid aspen as a light demanding species.Note that power 3 has been used for shade-tolerant species, e.g.Norway spruce and European beech (Fagus sylvatica L.), for flexibility to the sigmoid curve (Vestjordet 1972;Siipilehto 1999;Nord-Larsen 2006;Siipilehto and Kangas 2015).We used the lmfor package (Mehtätalo 2015) of the R statistical software (R Core Team 2019) and the exponential formulation for Näslund's height curve HDNaslund4, in which tree height is as follows: One of the benefits in this formulation is that b 0 and b 1 are always given a positive value that is needed for logical behaviour.In addition, Siipilehto (2011) showed the linear relationship between the logarithmic parameter ln(b 1 ) and logarithmic Lorey's height ln(HG), as well as the logarithmic dominant height ln(HDOM).
In b 0 and b 1 parts of Eq. 9, we attempted to add several stand-level characteristics as predictors, e.g.DG, HG, AGE, BA, TPH, and SI.Several transformations such as square root, logarithm, and reciprocal form and intercepts (k) ranging from 0.1 to 5.0 were also checked to identify the most stable and unbiased model performance, e.g.ln(BA + k), 1/(AGE + k) (see Siipilehto 2011; Siipilehto and Kangas 2015).

Validation of dbh distributions
We evaluated the goodness-of-fit of the Weibull distribution, which was solved by the optional mean and median dbh characteristics for the parameter recovery (Eqs.3-6).The goodness-of-fit tests included the Kolmogorov-Smirnov (KS) test, using an alpha (α) 0.1 risk level.When the number of sampled trees (n) exceeded 100, the approximate critical value was calculated according to Sokal and Rolf (1981) as √(ln(α/2)/2n).When the calculated test value (supremum) was related to critical value as a KS quotient (KSq), the quotient value above 1 indicated rejection (Tham 1988;Siipilehto et al. 2016).The number of rejected cases was counted for each optional recovery equations.In addition, the Error Index (EI) of Reynolds et al. (1988) was used for evaluation.EI is the sum (or weighted sum) of the absolute errors in the real frequencies by dbh class.In this study, the differences were calculated in 1-cm dbh classes without weighting.The smaller the test value of KS and EI, the better the fit.
In addition to the assessment of KS test and EI, the recovered dbh distributions were checked by illustrating with the observed dbh distribution.Based on the comparison between the observed and predicted number of trees by dbh class, the best model prediction was verified as stable and reasonable with every stand, plot, and age measurement instances of our dataset.

Modelling height curve
A nonlinear mixed-effects regression for Eq. 9 was applied for the Näslund's height curve.We used experiment (four locations) as random effects for both intercept (b 0 ) and coefficient (b 1 ).Logarithmic transformations of stand characteristics showed the best performance for the Näslund's height curve.The final mixed-effects model can thus be written as follows: where a 0 -a n are estimated parameters for the intercept and predictor variables ln(x 1 ) -ln(x n ) for the Näslund's parameter b 0 in Eq. 9, c 0 -c n are the estimated parameters for the intercept and the predictor variables ln(x 1 ) -ln(x n ) for the Näslund's parameter b 1 in Eq. 9, u 0 is the random intercept and u 1 is the random coefficient representing the four experiments, and ε is the residual error.
For model development, we used the nlme package of the R statistical software (R Core Team 2019), and the initial values and the initial model structure for parameter fitting were referenced from the preceding study (Siipilehto and Kangas 2015).The power function was used to take into account the heteroscedasticity of the residual error ε with respect to predicted tree height, i.e. weights = varPower().
The modelling results were basically evaluated with the basic fit statistics: AIC, BIC, and -2Log likelihood.For the candidate models, residual plots of the dependent variable over each independent variable were carefully examined to assure the unbiased model fit by using the fixedeffect parameters without estimated random effects.Moreover, residuals against the social status of a tree (dbh/DG) were examined (Siipilehto 2011;Siipilehto and Kangas 2015).
The height models were demonstrated by applying only fixed-effects parameters with measured height-dbh scatterplots.Finally, the most representative and illustrative examples were presented to obtain information about model behaviours with respect to dbh distributions and height-dbh relationship, also without random effects.

Comparison of the parameter recovery equations
In all the cases the converged solution was found.The goodness-of-fit results clearly showed superior performance for mean characteristics (D and DG) than the median characteristics (DM and DGM) (Table 2).DM (Eq.5) provided the worst results, and the average KSq greater than 1 implied that the recovered distributions frequently rejected the KS test at the α 0.1 level.The number of rejected cases at the α 0.1 level was 101 (34%) out of 294 total cases using DM.In contrast, DG (Eq.4) provided the best goodness-of-fit results, and only 8 cases (2.7%) rejected the KS test at the α 0.1 level (Table 2).The first moment, D (Eq. 3) provided the second-best result with 5.1% rejected cases, while the result of DGM (Eq. 6) was 12.6% (Table 2).Most of the rejected cases using DG occurred in young stands where the dbh distribution was still narrow and the predicted mode class was typically 1 cm higher than the observed mode class, e.g. the observed mode class was 4 cm and the predicted mode class was 5 cm.
Table 2.The goodness-of-fit statistics for the recovered Weibull functions using arithmetic mean (D), basal area weighted mean (DG), median (DM), and basal area median dbh (DGM) for resolving the parameters based on the data of hybrid aspen (Populus tremula × P. tremuloides) from southern Finland.The smaller the value of test criterion, the better the fit.The values are mean ± standard deviation with minimum-maximum in parentheses.The box and whisker plots presented the distribution of calculated KSq (a) and EI (b) statistics by four age classes (Fig. 1).According to these results, D showed a slightly better fit for the youngest state of the stand than DG.In the oldest state (age ≥18 years) DG and DGM showed equal and best KSq, but simultaneously, DG had slightly smaller EI (Fig. 1).Overall, the central location of statistics was lowest in an order of DG < D < DGM < DM.DM especially presented a lengthy quartile range, skewed pattern, and large mean value due to extreme outliers, which implies an unstable estimation.In an early age class, the median of KSq statistic was 0.47 for D and 0.50 for DG (Fig. 1a).However, using DG for recovery provided the most constant results across all the age classes.The KS and EI goodness-of-fit results encouraged us to select DG as a candidate predictor variable for modelling the height-dbh relationship.KSq is the Kolmogorov-Smirnov quotient (supremum/critical value).EI is the Error Index (Reynolds et al. 1988).All the other abbreviations indicate stand characteristics as follows: AGE is the stand age (year); TPH is the number of trees per ha (trees ha −1 ); BA is the stand basal area (m 2 ha −1 ); D is the arithmetic mean diameter (cm); DG is the basal area-weighted mean diameter (cm); DM is the median diameter (cm); DGM is the basal area median diameter (cm).
To illustrate the parameter recovery method of the Weibull function, three representative cases of probability distribution presented fitting accuracy.These plots were deliberately chosen by each stand condition to show how different solutions could be provided based on the four kinds of recovery equation, and the results were thus distinctively presented in Fig. 2. In most cases of the studied plots, the recovered Weibull distributions fitted the observed dbh distributions well.In the relevant example in this case, for D, DG, DM, and DGM, KSq was 0.518,0.485,0.573,and 0.616,and EI was 30.4,29.5,37.7,and 34.1 respectively (Fig. 2a).In some other cases, there was a greater difference between the parameter recovery equations.For example, in the case of Fig. 2b, the shape of the distribution using DGM was too peaked (rejected KS test), whereas DM resulted in too flat a distribution (but it still passed the KS test).
Moreover, in extreme cases, the recovered distributions using DM evidently deviated from the observed distribution, resulting in large KSq and EI of the DM (Eq.5) option (Table 2, Fig. 1).A representative plot of this type was shown in plot 17 of stand (i.e., experiment) 3 at age 16 (Fig. 2c), although the KSq of every method still passed the KS test in this case (KSq < 1.0).Unlike the other parameter recovery options, DM resulted in a poor fit with the observed distribution.Specifically, the KSq value of 0.887 and the EI of 55.2 using DM were much higher than with the other options: KSq 0.217-0.248and EI 16.2-17.5for D, DG, and DGM.In some cases, the highest KSq values (KSq > 3) were caused by distributions where almost all the predicted trees' dbh was about the same as DM.Overall, the goodness of fit varied depending on the stand condition as presented in the examples.Nonetheless, the performances when using D and DG for recovery were the most stable, while several unstable cases used DM and DGM.

Models for individual tree height
The formulation of Näslund's height model included various stand characteristics such as age, stand density, mean diameter, and mean height.Model fitting with logarithmic transformation performed well.Finally, the chosen predictors were the logarithm of DG, HG, AGE, BA, and TPH.SI was not significant due to the correlation with DG, HG, and AGE, so it was excluded.The finally proposed two model formulas for fixed effects are respectively as follows: in where b 0 and b 1 of Model 1 and Model 2 are the parameters for Näslund's exponential Eq. 9, a 0 -a 5 are the estimated parameters of each equation in this study, and the other terms are as previously defined.
The fit statistics of the final models were the best among the tested models, and all the fixed-effect parameters were highly significant (P < 0.0001) (Table 3).The standard deviation of random effects was similar, and the power for weighting was larger in Model 1 (Table 3).Because the random effects were significant for intercept (b 0 ) and coefficient (b 1 ), both the shape and the asymptote of the height curve changed between the four experiments.Overall, the fit statistics of the AIC, BIC, and −2Log-likelihood were better in Model 2, which was the result of adding the predictor of stand density, TPH.Compared with the absolute values of predictors, DG and HG were larger in Model 1 than in Model 2. The opposite sign in the parameter for ln(DG) and ln(HG) also resulted in Model 1 following the changes in these mean characteristics.In Model 2, the parameters of DG and HG were relatively low, whereas those of BA and TPH were larger in Model 2, which reflects the higher impact caused by stand density.
In residual plots, all the predictors used in models, including DG, HG, AGE, BA, TPH, and tree dbh, were unbiased, showing a random distribution of residuals (Supplement file S1, available at https://doi.org/10.14214/sf.10612).To validate the models with a tree's social status, an additional residual plot was analysed over dbh divided by DG, which represents the relative dominance in a stand.A subtle bias of the residuals over dbh/DG was detected in the middle range of Model 1 (Fig. 3a).On the other hand, the residuals were completely unbiased throughout the whole range of dbh/DG with Model 2 (Fig. 3b).Simultaneously, the residual errors were clearly diminished in Model 2 (Fig. 3b) compared with Model 1 (Fig. 3a).
To test the goodness of fit in individual tree height estimation, two examples of Näslund's model application were displayed in Fig. 4. As the scatterplot of stand (experiment) 1, plot 26, age 11, there were many cases where the predicted value of Model 1 was a little higher than Model 2 (e.g.Fig. 4a).On the other hand, the scatterplot of height-dbh allometry in stand 1, plot 32, age 18, the predicted height curve of Model 1 was a little lower than Model 2 (e.g.Fig. 4b).Other than these cases, no case showed a noticeable difference between models.However, in most of the studied sample plot cases, the height estimation difference was so subtle that both models were considered reasonable in reality.In addition, both models fitted well throughout all stand characteristics.One different feature was the sign of the estimated parameters for ln(DG) and ln(HG), and its impact on the model equation.Because of the opposite signs for these parameters, Model 1 was more closely driven by DG and HG.In contrast, DG and HG in Model 2 had a relatively minor impact on tree height prediction.This was because the estimated effects of DG and HG for the Näslund's parameter b 1 were both negative (effect of DG turned from positive in Model 1 to negative in Model 2).Instead, the impact of stand density was strong because of both predictor variables, BA and TPH, in Model 2. Note that here the residual analysis was performed only with fixed-effect parameters for general purposes.The whiskers were offered by the lmfor package in R statistical software to detect any bias (Mehtätalo 2015).and Feduccia 1987;Liu et al. 2009).Nevertheless, the differences in the recovery performance between the alternative mean and median characteristics were unknown.
In our study, DG (Eq.4) provided the overall best KS goodness-of-fit results for the recovered dbh distributions.The proportion of rejected cases was far less (2.7%) than the risk level of 10% (Table 2).The main reason for rejection was 1-cm discrepancies in the dbh class between the predicted and observed mode, and this mostly happened in the early state of stands (6 out of 8 rejected cases belonged to the first measurement occasion).The first moment (D) provided a better result than DGM (Eq.6), as in the paper by Siipilehto and Mehtätalo (2013) for young Scots pine stands.DGM has typically been a defined characteristic in forest management planning fieldwork as an easily defined median tree of the relascope sample plot.Today, when management planning is based on ALS data, DG has replaced DGM because of a straightforward calculation, i.e. ∑dbh 3 /∑dbh 2 (Siipilehto et al. 2016).
DM (Eq.5) clearly provided the most unstable fit results, and the proportion of rejected cases was as high as 34%.The estimated parameter c was sometimes so high (c > 20) that all the trees were practically of the same size when using DM.Note that the median is less sensitive than the mean for small changes in dbh.This means the sample mean is a statistically more efficient estimator than the median.Because of this feature, the recovery methods using mean characteristics in our study reflected the distributions of the observed tree dbh more similarly than those using median characteristics.Taking the stable solution and the lowest KS and EI across the whole range into account, DG is considered the best option for the parameter recovery of the Weibull function.
Based on this finding, DG can hence be further used as a promising dependent variable to develop the stand-level model for the mean diameter of hybrid aspen plantation.

Performance of tree height model
Model 2 has better fit statistics than Model 1, but Model 1 was more sensitive to changes in the mean dimensions DG and HG, while Model 2 was more obviously affected by stand density such as BA and TPH.Among every candidate model examined in this research, Model 1 and Model 2 were finally chosen because fit statistics were the best, and the estimated parameters were all highly significant (P < 0.0001).The residuals over all predictors also showed unbiased patterns without an abnormal trend.Both models can thus be applied especially within the fitted data range when stand characteristics are considered.
When dbh/DG representing social status was also checked, a slight bias was detected in the middle of the dbh range for Model 1 (Fig. 3a).As reported in Siipilehto and Kangas (2015), the tree heights for trees of average size were slightly overestimated.Nonetheless, Model 1 is still considered a compelling formula, because the height-dbh relationship followed the mean dimensions, DG and HG, well.Meanwhile, Model 2 placed more weight on stand density, BA and TPH, but did not behave as logically with respect to DG and HG as Model 1. Model 1 would therefore be safer and more reasonable to estimate individual tree height especially for extrapolation ranging beyond our modelling dataset, e.g.TPH.In addition, the strong correlation between the random effects (-0.998) may indicate overparameterizing of the Model 2.
Model 2 including TPH presented slightly better fit statistics and residuals over dbh/DG, but Model 1 is still considered desirable and viable for general stand condition, e.g.predicting beyond the range of the modelling data.Overall, Näslund's models for tree height prediction are assessed to implicitly include the feature of the site index with the density effect caused by the interaction among AGE, DG, HG, BA, and/or TPH.According to common knowledge, stand density affects competition among trees, and it thus affects stem taper.This feature was performed in both height models.SI depicts stand carrying capacity and yield.On more fertile sites, the trees are taller than on less fertile sites at a given age.However, this feature was more or less characterised through AGE, DG, and HG, especially in Model 1. Model 1 is therefore expected to offer a flexible and suitable prediction according to varying stand condition.

Applicability and practicability of the final models
The developed models were designed to be used in practice after taking the modelling data range into account.The number of trees per hectare in this study ranged from 400 to 1600, which are sparser than the prevailing practice in Finland (Hynynen et al. 2002;Lee et al. 2021).In addition to the compliance with these conditions, DG was evaluated as the most stable characteristic for recovering diameter distribution as presented in this study (Table 2, Fig. 1, Fig. 2).
When predicting tree height using Näslund's model, extrapolation should be cautiously reviewed with the stand age applied in the field, because Models 1 and 2 in this study were developed with the predictor of AGE, ranging from 5 to 20 years.According to the current silvicultural guideline in southern Finland for broadleaved species such as silver birch, the mean diameter range of 27-32 cm is recommended for final felling (Rantala 2011).Based on the maximum mean diameter of our data, the age of 20 years is already close to suitable rotation age for hybrid aspen (Table 1).Furthermore, since the height and dominant height in this study differed significantly by initial stand density (Lee et al. 2021), it is distinguishing that Näslund's model for hybrid aspen included the predictors of HG and BA which dynamically responded to the stand density effect (Table 3).Considering that Model 1 was sensitive to dimensions DG and HG while Model 2 was highly sensitive to stand density such as TPH and BA, Model 1 is evaluated as more practicable with parsimonious input variables than Model 2.

Conclusion
In this study, tree-level methods and models were studied for a clonal hybrid aspen plantation using empirical data collected from southern Finland.Four kinds of mean diameter were examined to identify the best-performing equation for the parameter recovery of the Weibull function.The comparison was successfully conducted with meaningful fitting differences.This was the first study to present evidence that the selected mean characteristic has a significant effect on the goodness-of-fit for the recovered dbh distributions.Both the arithmetic and weighted mean characteristic (D, DG) performed better than the corresponding median dbh (DM, DGM).DG generally provided the best fit results, and it was therefore selected as an input variable for height-curve modelling.Note that DG is included in the stand characteristics assessed from the current ALS data.
Näslund's height curve showed excellent fit to data with fundamental stand-and tree-level variables, including AGE, DG, HG, and tree dbh.Variables describing stand density such as BA and TPH were also successfully added to the model, implying a significant stand density effect on tree height: only BA in Model 1 and both BA and TPH in Model 2. None of the models were biased against predictors according to residual plots.A slight bias in a tree's social status (dbh/DG) using Model 1 was corrected in Model 2. However, despite the slightly worse fit statistics, Model 1 showed more logical behaviour in changes in mean characteristics as the tree height curves always followed the DG and HG.Thus, it was evaluated to be safer to apply Model 1 in a simulator.

Fig. 1 .
Fig. 1.Box and whisker plots of Kolmogorov-Smirnov quotient (KSq, plot a) statistics and Error Index (EI, plot b) by parameter recovery method of Weibull function for diameter distribution of hybrid aspen (Populus tremula × P. tremuloides) in southern Finland.The colour legend indicates different diameter types: D is the arithmetic mean diameter (cm); DG is the basal area-weighted mean diameter (cm); DM is the median diameter (cm); DGM is the basal area median diameter (cm).Each symbolic shape indicates the specific statistic as follows: the centre horizontal solid line inside the box is the median; the × mark is the mean; the lower boundary of the box is the 1st quartile (Q1); the upper boundary of the box is the 3rd quartile (Q3); the vertical length of the box is the interquartile range (IQR, Q3-Q1); the minimum value within the lower whisker below the box is the smallest observation above Q1-1.5×IQR;and the maximum value within the upper whisker above the box is the largest observation below Q3+1.5×IQR.

Fig. 2 .
Fig. 2. Examples of the recovered Weibull distributions for three plots with fit statistics in clonal hybrid aspen (Populus tremula × P. tremuloides) plantations in southern Finland.The stand characteristics of each case are presented together on the left of the subplot.The fit statistics corresponding to each case are provided on the right with KSq and EI to compare the diameter types for the recovery method of Weibull distribution.KSq is the Kolmogorov-Smirnov quotient (supremum/critical value).EI is the Error Index(Reynolds et al. 1988).All the other abbreviations indicate stand characteristics as follows: AGE is the stand age (year); TPH is the number of trees per ha (trees ha −1 ); BA is the stand basal area (m 2 ha −1 ); D is the arithmetic mean diameter (cm); DG is the basal area-weighted mean diameter (cm); DM is the median diameter (cm); DGM is the basal area median diameter (cm).

Fig. 3 .
Fig. 3. Residual plots of Näslund's Model 1 (a) and Model 2 (b) against the social status of a tree, dbh/DG to predict tree height of hybrid aspen (Populus tremula × P. tremuloides) in southern Finland.The selected variables and formulas are provided in Eqs.11 and 12 for Model 1 (a), and Eqs. 13 and 14 for Model 2 (b).Note that here the residual analysis was performed only with fixed-effect parameters for general purposes.The whiskers were offered by the lmfor package in R statistical software to detect any bias(Mehtätalo 2015).

Table 1 .
Summary statistics of modelling data to test parameter recovery method of Weibull distribution and to develop Näslund's height curve model for hybrid aspen (Populus tremula × P. tremuloides) from southern Finland.

Table 3 .
Parameter estimates and fit statistics for Näslund's height curve models to predict individual tree height of hybrid aspen (Populus tremula × P. tremuloides) in southern Finland.