Modelling tree height-diameter allometry of Chinese fir in relation to stand and climate variables through Bayesian model averaging approach

Tree height-diameter allometry reflects the response of specific species to above and belowground resource allocation patterns. However, traditional methods (e.g. stepwise regression (SR)) may ignore model uncertainty during the variable selection process. In this study, 450 trees of Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) grown at five spacings were used. We explored the height-diameter allometry in relation to stand and climate variables through Bayesian model averaging (BMA) and identifying the contributions of these variables to the allometry, as well as comparing with the SR method. Results showed the SR model was equal to the model with the third highest posterior probability of the BMA models. Although parameter estimates from the SR method were similar to BMA, BMA produced estimates with slightly narrower 95% intervals. Heights increased with increasing planting density, dominant height, and mean annual temperature, but decreased with increasing stand basal area and summer mean maximum temperature. The results indicated that temperature was the dominant climate variable shaping the height-diameter allometry for Chinese fir plantations. While the SR model included the mean coldest month temperature and winter mean minimum temperature, these variables were excluded in BMA, which indicated that redundant variables can be removed through BMA.


Introduction
Tree height-diameter allometry is a critical structure relationship and widely used in forest growth and yield models (Curtis 1966;Fang and Bailey 1998). The relationship can be used for estimating biomass and carbon storage (Hulshof et al. 2015). Errors in tree height prediction may lead to large errors when estimating carbon stocks. In addition, height-diameter relationships are also used for evaluating mechanical stability and timber quality (King 1990;Mensah et al. 2018). Accounting for both diameter and height provides a flexible perspective to understand differences in growth allocation of species or functional groups (Mensah et al. 2018).
Precise prediction of tree height-diameter allometry is crucial in forest growth and yield systems. A number of models have been developed. For instance, Tewari and Gadow (1999) used a bivariate height-diameter distribution which included the S BB distribution (the bivariate distribution) to develop height-diameter relationships. The power law, logistic, Richards, Gompertz, and Weibull equations have also been tested (Curtis 1966;Huang et al. 1992;Zeide 1993;Fang and Bailey 1998;Sánchez et al. 2003).
When there are a large number of variables in the models, we should choose the subset of independent variables which are strongly associated with the dependent variable for model development Wang et al. 2004). Usually, the variables with statistical significance are selected using stepwise regression (SR). However, little or no attention is given to the imprecision arising from the variable selection procedure, since the final single model is implicitly assumed to be "optimal". In addition, a small change in the size of the sample data set may lead to a different set of selected variables (Prost et al. 2008). If there are n variables, the number of possible models will be 2 n . Actually, we do not know the model with which variables is the optimal one before we analyze the data. These problems are commonly known as model uncertainty (Draper 1995;Raftery et al. 1997). Neglecting model uncertainty may lead to overconfident predictions and reduce reliability Wintle et al. 2003).
Bayesian model averaging (BMA) developed by Kass and Raftery (1995) is a method that provides an efficient tool for data analysts to discover promising models and obtain estimates of their posterior probabilities. In BMA models, the probabilities are used as weights for averaged model predictions and parameter estimates (Lipkovich 2002). The BMA approach has been commonly used in hydrological and ecological studies (Verbeeck et al. 2006;Dong et al. 2011). Additionally, BMA has been used to model diameter distributions (Bullock and Boone 2007), tree aboveground biomass (Picard et al. 2012), and tree mortality .
As shade intolerant species, Chinese fir (Cunninghamia lanceolata (Lamb.) Hook.) prefers to grow in warm and humid environments (Wu 1984). It is adapted to climate conditions that are characterized by annual average temperatures of 15-23 °C and precipitation amounts of 800-2000 mm. It is one of the most widely distributed and important tree species in southern China for timber production. Due to its high quality timber, the total planting area has reached 8.93 million hectares, accounting for 19.01% of all plantation area (SFA 2013). Prior studies have developed the height-diameter allometry of Chinese fir. Yuan and Li (2012) indicated that the height-diameter allometry was different in south versus north aspects. Li et al. (2015) developed height-diameter models and found the model including multiple stand variables performed better in height estimation than one-independent-variable models for Chinese fir. Zhang et al. (2019) determined that climate variables affected the height-diameter allometry of Chinese fir using data collected from four provinces. However, none of these studies have accounted for the uncertainty in the selection of the height-diameter allometry model.
The aim of this study is to model tree height-diameter allometry of Chinese fir plantations in relation to stand and climate variables using a BMA approach to address model uncertainty and identify the contributions of these variables on the allometry under five initial planting densities. In addition, we compared the BMA approach with a traditional method (SR) on selection of variables by predictive performance on the tree height-diameter allometry.

Study area and data collection
The study site is in Shaowu city (117°43´E, 27°05´N), Fujian province, in southern China (Fig. 1). Chinese fir is widely distributed in Fujian. The region belongs to the subtropical monsoon climate zone with mean annual temperature of 17.7 °C, annual rainfall of 1768 mm, and annual sunshine duration of 1740.7 h. The soil type of the site is yellow-brown. The soil thickness is about 50-100 cm. The soil forming parent rock is mainly sandy shale and phyllite. Data used in this study were from Chinese fir stands established with bare-root seedlings in 1982. Five initial planting densities were planted in a randomized block design: A: 2 m × 3 m (1667 trees/ha), B: 2 m × 1.5 m (3333 trees/ha), C: 2 m × 1 m (5000 trees/ha), D: 1 m × 1.5 m (6667 trees/ha), and E: 1 m × 1 m (10 000 trees/ha). There are 15 plots (20 m × 30 m /plot) in total with every treatment replicated three times. Every plot comprised a buffer zone including two lines of similar trees. The 15 plots were uniformly distributed in the middle slope position of the mountain (Fig. 1).
We tracked 450 trees which continued to survive from 1984 to 2010 in the 15 plots. We measured tree height (H) with a Vertex (Haglof, Sweden) hypsometer and diameter at breast height (DBH at 1.3 m) every year from 1984 to 1990, and every other year from 1992 to 2010, for a total of 17 sampling times. Here, 60% of the sampled tree data including the measurements at each sampling interval were randomly selected and used for model development and the remaining 40% were used for model validation.

Candidate stand and climate variables
Some reports indicated that stand basal area, number of trees per ha, and stand dominant height (a proxy of site index) were significant explanatory variables in height-diameter models (Huang and Titus 1994;Sharma and Parton 2007). Generally, age is regarded as a good proxy of tree size in even-aged stands. In this study, stand variables included the arithmetic mean diameter (Dm), stand basal area per ha (BA), stand dominant height (HD, the mean height of the five highest trees in the stand), tree age (A), and number of trees per ha (N). To determine the influence of initial planting density (PD) on height-diameter allometry, the planting density was also given consideration as a candidate variable. Statistics of stand and tree variables across all the planting density levels are shown in Table 1 in which means and standard deviations were calculated over the 17 sampling times from 1984-2010 (every year from 1984 to 1990, and every other year from 1992 to 2010).
Some studies reported that both precipitation and temperature influence growth processes of trees (Vizcaíno-Palomar et al. 2017;Fortin et al. 2019;Zhang et al. 2019). Eight climate variables including mean annual temperature (MAT), spring mean temperature (SMT), winter mean minimum temperature (WMMT), annual precipitation (AP), mean coldest month temperature (MCMT), summer mean maximum temperature (SMMT), annual heat moisture index (AHM), and mean warmest month temperature (MWMT), which are listed in Table 2, were incorporated to model height-diameter allometry. AHM was used to represent annual climatic water deficit, since MAT and AP were integrated into a single parameter: AHM = (MAT + 10)/(AP/1000) . We used ClimateAP software to obtain the climate data. The software derives the spatially Table 1. Statistics of stand-and tree-level variables used for modelling height-diameter allometry of Chinese fir in this study.
Planting Density (PD) interpolated data based on latitude, longitude, and elevation of the site . Table 2 lists the statistics of climate variables of the study site for the years 1984-2010. We pre-selected each variable through the variance inflation factor (VIF) test before regressing to avoid multicollinearity using package DAAG in R (Maindonald and Braun 2015). A common rule is that there is no multicollinearity when the VIF < 5 ). Mensah et al. (2018) found that the power function had excellent performance in terms of fitting the height-diameter model for 45 species in South Africa. The allometry model including the selected variables had the following structure:

Height-diameter allometry model
where H is tree height, DBH is diameter at breast height, a and b are the parameters to be estimated, x is an independent variable vector containing stand and climate variables, β is the parameter vector, and ε is the residual error term. In height-diameter allometry, the parameter b also referred to scale parameter (Niklas 1993). For estimating the model under the BMA framework (Raftery et al. 2005) and to deal with model heteroscedasticity, Eq. 1 can be linearized using logarithms: It should be noted that the log-transformation of the data results in a minor bias in height estimation. Baskerville (1972) gave an approximation of the estimate that is corrected for this bias: where / (n -2) is the estimated variance of Ln(ε), n is the sample size, Ln(H) is the natural log of the observation values, and is the predicted values from Eq. 2.

The theory of BMA
Let θ represent a quantity of interest; here, it denotes the Log transform of tree height, i.e., Ln(H). y is independent variable vector including DBH, stand and climate variables. In general, model uncertainty is large because there are many variables that could affect tree height. For example, the number of candidate models will be 2 p , if there are p variables. Actually, we are rarely totally  confident about which model best represents our data . The BMA method provides a way of selecting a subset of possible models (each submodel contains a unique combination of the variables) and using the posterior probabilities of these models to estimate key influential variables (Kass and Raftery 1995). Based on the probability model (Raftery 1996), the posterior distribution of θ is listed as follows: is the probability of model M i , and k is the number of candidate models. Based on Bayes rule, the posterior probability of the model M i can be given by: where β i is the vector of regression parameters for model M i . The posterior probability shown in equation (4) gives a method to calculate the estimates and credible intervals given the uncertainty of the model. An uninformative prior was used for the parameter prior (Box and Tiao 1973). The influence of a specific variable on the likelihood of affecting tree height can be interpreted as follows using the posterior probability P(β j ≠ 0 | y): where β j ≠ 0 means the predictor has an effect on the height growth, G is the model space, I Mi is a 0/1 indicator variable. I Mi = 1 when β j of the j th variable is in the model M i , and 0 when it is not. A common set of rules for making inferences using the BMA posterior probability (Kass and Raftery 1995;Wang et al. 2004;Lu et al. 2019) is shown in Table 3. If the variable x j affects height growth, then the posterior mean of its parameter β j is shown as:

Probability
Effect of x j on tree height P(β j ≠ 0 | y) < 0.5 no effect 0.5 ≤ P(β j ≠ 0 | y) < 0.75 weak effect 0.75 ≤ P(β j ≠ 0 | y) < 0.95 positive effect P(β j ≠ 0 | y) ≥ 0.95 strong effect To fit the BMA models, we used the function bicreg in the BMA package (Raftery et al. 2005) in R software (R Development Core Team 2018). The SR method was performed in SAS software (SAS Institute 2011). Independent variables were selected using bidirectional elimination (P-value < 0.05). Therefore P-values significant at P < 0.05 for the traditional SR approach and mean of β j > 75% for BMA were used, which is equivalent to P < 0.05 in the traditional SR approach (Viallefont et al. 2001;Mu et al. 2019). The algorithm for BMA modeling needs to compute the integrated likelihoods for all models (Eq. 6) and averaged over all models. In the function bicreg, the integrated likelihood is approximated through the BIC (Bayesian Information Criterion) approximation (Raftery et al. 2005). The following BIC approximation in the BMA package was used for calculating the integrated likelihood function of model M i : is the number of independent variables in M i , and is the maximum likelihood estimate vector. The models with high posterior probabilities are found through the fast leaps and bounds algorithm to approximate the sum over all the models (Raftery et al. 2005), which was introduced by Furnival and Wilson (1974) and Raftery (1995).

Model performance evaluation
Both BMA and SR methods (bidirectional elimination) were used to model height-diameter allometry. The evaluation statistics including R 2 , mean difference (MD), and mean absolute difference (MAD) were used to evaluate the model performance. In general, models with higher R 2 , and lower MD and MAD are better.
where denotes the mean value of observed H, and is the predicted value after log-transfomation correction at tree i.

Results
Based on the VIF test, multicollinearity did not exist among PD, HD, BA, MAT, MCMT, AHM, SMMT, and WMMT (VIF < 5). Therefore, these eight variables were included in modelling heightdiameter allometry of Chinese fir. Table 4 shows the top five models and their posterior probabilities of the tree height-diameter allometry model derived through BMA and this model development was carried out with the training data set. The model fitted using the SR method was the same as the third highest model determined by BMA (Table 4). In the BMA models, the posterior probability of the highest probability model  was 0.454, which was much higher than the third highest one that had a value of 0.127 (which had the same variables as the SR model). Fig. 2 shows the model space of the height-diameter models determined by BMA. It clearly indicated that the model performance changes with different combinations of the variables. Based on the validation data set, there were negligible differences in R 2 , MD, and MAD between the BMA and SR model. Consequently, these statistics showed that the methods have no significant difference in model performance. Nonetheless, we found the BMA method performed well in modelling height-diameter allometry of Chinese fir (Table 5, Fig. 3).

Parameter estimates and P-values (or posterior probability) from SR and BMA approaches
Via the BMA approach, there was strong evidence to indicate that tree height increased with increasing DBH, PD, HD, and MAT, while tree height decreased with increasing BA, MCMT, and SMMT (Table 6). There was weak evidence to indicate that MCMT had a negative effect on tree height. In addition, there was no evidence to support that AHM and WMMT affected tree height. Comparing the two methods, the SR method selected more variables than BMA. In addition to the same variables that were deemed to strongly affect tree height using the BMA approach, it was observed that in the SR model, tree height decreased significantly with increasing MCMT and WMMT (P < 0.01) ( Table 6).
We also found most of the intervals of parameter estimates from BMA were narrower than those of the SR model (Table 6). In addition, stand variables generally had greater influences on tree height than climate variables according to the posterior probabilities of the parameters ( Table 6). The scale parameter b 1 of the height-diameter allometry of Chinese fir was 0.54 (Table 7).

SR and BMA approaches
In this study, the BMA and SR methods were compared for modeling tree height-diameter allometry in relation to stand and climate variables. The SR method selects variables according to the significance of independent variables on the dependent variable (P-value). However, Freedman et al. (1988) said that the use of P-values during variable selection may be dramatically misleading. Furthermore, SR ignores the uncertainty of the models (Fig. 2). Choosing a single model and disregarding the others could distort our inference capabilities based on the quantities of interest of a single specific regression model (Zhang et al. 2007;Picard et al. 2012). In addition, Altman and Adersen (1989) concluded that small changes of sample size may lead to different sets of variables, and instability of variable selection.
In fact, we are rarely totally confident about which height-diameter model best simulates our data. Uncertainty must be taken into account, and BMA provides an elegant way to do so. The advantage of the BMA method is consideration of model uncertainty (Fig. 2), which has also been evaluated in several different kinds of models, such as survival analysis (Volinsky et al. 1997;Viallefont et al. 2001), linear modelling (Raftery 1996), and hazard modelling (Murphy and Wang 2001). Their results indicated that BMA enhanced model performance.
Here, the model fitted using the SR method had the same selection of variables as the third best model selected by BMA (Table 4). This showed that the selection procedure is simply different  between the two methods and thus leads to different models. In addition, more variables (MCMT and WMMT) were incorporated in the SR model than BMA. These variables might be redundant variables. Generally, as the minimum temperature rose in winter, tree height of Chinese fir should increase. But the results of SR suggested that MCMT and WMMT had significantly negative effects on tree height growth. However, in BMA, there was no strong evidence to indicate that there were negative effects on tree height. In other words, the results showed that the parameter estimates of BMA yielded better interpretation than those of the SR model. Lu et al. (2019) used BMA and SR methods to model the relationships between tree mortality and stand and climate variables, and found that more redundant variables were included in the SR model. The SR method has the tendency to choose a redundant variable correlating with a true variable more often than choosing an uncorrelated variable, whereas BMA did not choose a redundant variable even if it was correlated with a true variable Viallefont et al. 2001;Wang et al. 2004). Furthermore, Genell et al. (2010), using simulated data, showed that BMA had a larger probability of not choosing a redundant variable compared with the SR method and the probability of choosing a true variable of two methods was similar. In our study, it appears that BMA has the capacity to develop more parsimonious models than the traditional SR approach. However, differences in model performance between the two methods were negligible in this study.

Variables affecting tree height
Tree height of Chinese fir increased with increasing HD (Table 6). Calama and Montero (2004) fitted an individual-tree height-diameter model of stone pine (Pinus pinea L.) in Spain. Crecente-Campo et al. (2010) developed a generalized height-diameter model of Eucalyptus globulus Labill. stands in Galicia (northwestern Spain). Both of these two studies found that HD was a good estimator for tree height. In height-diameter model, site index was traditionally used as a representative of site fertility (Sánchez et al. 2003). Better fertility accelerates tree height growth. BA, which can be regarded as a competition index, had negative effects on tree height (Table 6). Feldpausch et al. (2011) reported that BA was an important driver of variation in height-diameter allometry. For Chinese fir, BA had a negative effect on height, which suggested that trees tended to be shorter in height because of exposure to high competition (BA) .
Previous studies have showed that stand density was the main biological variable affecting tree height-diameter allometry (Zeide and Vanderschaaf 2002;Crecente-Campo et al. 2010). When trees grow in a dense stand, they tend to be higher than those in a sparse stand to elevate their leaves above competitors in order to maximize light capture (Liu et al. 2003). In our study, planting density (PD) was positively correlated with height (Table 6). With less growth being allocated to height at lower densities, more open grown trees have a general tendency of allocating more growth to lower branches and fuller, lateral crown size.
For climate variables, tree height increased with increasing MAT, but decreased with increasing SMMT (Table 6), which was consistent with Zhang et al. (2019). With increasing temperature, additional carbon had higher priority to be allocated to height growth (Thornley 1999;Martínez and López-Portillo 2003;Vizcaíno-Palomar et al. 2017;Fortin et al. 2019). Vizcaíno-Palomar et al. (2017) reported that under a warmer climate, Pinus halepensis Mill. and Pinus pinaster Aiton trees would be taller which corresponded to what was found in our study which showed that MAT had a positive effect on tree height. Warm summers can prolong the growing season and lead to an increase in radial growth (Zhang et al. 2017). However, excessive high temperature in summer may lead to aridity, by aggravating evaporation and transpiration, which will lead to the decrease of soil water-use efficiency, thus inhibiting radial growth (Allen et al. 2015;Chen et al. 2018). Trees tend to allocate more resources to belowground growth rather than height growth when water is lacking (Schwinning and Ehleringer 2001). In addition, in the current study, variables associated with temperature were included in the model (Table 6), which was consistent with previous studies (Wang et al. 2006;Zhang et al. 2019) indicating that temperature was the key variable influencing height-diameter relationships in humid environments with sufficient moisture.

Scale parameter in height-diameter allometry
The scale parameter (b) in tree height-diameter allometry has been debated from theoretical and empirical perspectives (Niklas 1993;Henry and Aarssen 1999;Hulshof et al. 2015). McMahon and Kronauer (1976) discussed the three competing models to describe the scale parameter in plant height and diameter allometry (McMahon and Kronauer 1976) (Table 7). In this study, we found the estimated scale parameter value in height-diameter allometry was close to 0.5, and it almost remained constant in four study sites ). But other studies reported different sites because of different weather conditions may affect the estimated scale parameter value (Pilli et al. 2006;Vizcaíno-Palomar et al. 2017). Zeide and Vanderschaaf (2002) found that the scale was 0.7374 in normal height-diameter relationship for loblolly pine plantations. The large scale parameter value in the exponent confers comparatively small safety factors against elastic buckling due to the influence of gravity on plant mass. Niklas (1993) drew a conclusion that the estimated scale parameter value for gymnosperm trees was 0.43 and the scale parameter value of gymnosperm and dicotyledonous trees was 0.538. McMahon and Kronauer (1976) showed that stress similarity and elastic similarity was the representative strategy of woody plants (gymnosperm and dicotyledonous trees) and geometric similarity reflected the strategy of non-woody plants (pteridophytes). In this study, the estimated scale parameter value was 0.54 which was close to expectations from the stress-similarity model.

Conclusions
In our paper, BMA and SR methods were used to model tree height-diameter allometry of Chinese fir plantations. The SR model had the same set of variables as the model with the third highest posterior probability developed via BMA, and SR method included bigger number of variables in the model than the best BMA model, although the model performance was equal. The SR method does not consider the uncertainty associated with model selection. Compared with the SR method, BMA had a higher probability of not selecting potentially redundant variables (MCMT and WMMT) and therefore incorporated more parsimonious models. The difference in model performance was generally negligible. Nonetheless, BMA did noticeably produce estimates with narrower intervals around the parameter estimates.
Tree height was influenced by both stand and climate variables. Trees planted in higher density and better sites were taller than those planted in lower density and poor sites. Tree height increased with increasing MAT whereas decreased with increasing SMMT. Temperature was the pivotal climate variable in shaping height-diameter allometry of Chinese fir. In addition, the estimated scale parameter value was around 0.54 which was close to the expectations from the stress-similarity theory.