Assessing genetic diversity and population structure of Salix viminalis across Ergun and West Liao basin

Salix viminalis L. is an important shrub that has potential for use as a bioenergy crop, for phytoremediation of heavy metal contaminated soil and sewage sludge treatment. It is mainly distributed in the northeast of China, but the species has not yet been used a resource here. We examined the genetic diversity and population structure of populations from the Ergun basin and West Liao basin using 20 microsatellite markers. A high level of genetic diversity (Na = 16.45, He = 0.742) was detected for S. viminalis, and populations from the Ergun basin exhibited higher genetic diversity and private alleles numbers than the West Liao basin. The 12 populations could be divided into two clusters by both Bayesian analysis and UPGMA clustering which were consistent with the populations derived from the two basins. Moderate population differentiation (FST = 0.076) was shown in S. viminalis, and AMOVA analysis confirmed that most of the genetic variation (86.13%) was attributed to individual differences within populations, while 11.49% was attributed to differences between basins and 2.38% to differences within each basin. Significant correlations of FST/(1–FST) with log (geographic distance) among 12 populations (r = 0.634, p < 0.00) and 10 populations within the Ergun basin (r = 0.482, p = 0.0002) indicated that geographical distance was the principal factor influencing genetic structure. As most of genetic variation exist within populations, so protection measures should be focused on populations with higher genetic diversity and unique alleles, such as Tuli, Mordaga downstream, Zhadun1 and Genhe.


Introduction
The genus Salix (willow) have been considered one important taxonomic entity in the world because of its extensive use, such as riverbank stabilization (Watson et al. 1997), medical uses (Mahdi et al. 2010) and basket production (Grönroos et al. 1989).The genus comprises around 520 species worldwide which show considerable variation in plant size, growth habit and morphological characteristics, ranging from small shrubs to large trees (Fang et al. 1987;Newsholme 1992).Salix spp.occupy diverse ecological niches, and most of them are native to the northern hemisphere (Argus 1997).Asia is considered to be the original center of genus Salix and about 275 species exist in China, of which 189 are endemics (Fang et al. 1987).
Salix viminalis L. belongs to family Salicaceae.It is a diploid species (2n = 2x = 38) with an outcrossing breeding system that combines insect and wind pollination (Rönnberg-Wästljung 2001; Hanley et al. 2002).S. viminalis is a pioneer tree species which is mainly found in riparian environments, and it can be propagated both sexually and vegetatively (Flora of China Editorial Committee 1994).The willow is prized for its economic value as a promising woody bioenergy crop due to perennial habit, easy vegetative propagation, fast growth rate and good coppicing response (Keoleian and Volk 2005;Trybush et al. 2012).In recent years, S. viminalis also be used for sewage sludge treatment (Kocik et al. 2007) and phytoremediation of heavy metal contaminated soils (Klang-Westin and Perttu 2002;Mleczek et al. 2010).S. viminalis is a diecious shrub native to China, which is mainly distributed across the northeast, such as Heilongjiang, Jilin, Liaoning and eastern Inner Mongolia provinces (Flora of China Editorial Committee 1994).So far, S. viminalis has been underutilized and the overwhelming majority of this specie is still living in the wild in China.Moreover, with the increase of human activities, the survival of S. viminalis has been strongly impacted.Assessment of the genetic diversity and population structure of natural S. viminalis is essential to conduct species' conservation and initiate domestication process.
Over the last decades, a number of molecular marker technologies have appeared and been proved to be valuable tools to evaluate genetic diversity within and between populations (Kadri et al. 2009).Among various molecular markers, microsatellite or simple sequence repeat (SSR) makers are well known for multiallelic nature, codominant inheritance, high information content, and simple detection by PCR amplifications (Wang et al. 2012).Many SSR markers have been developed in Salix genus, and it is possible to use these markers in related species within the same genus because of the transferability of the markers (Lian et al. 2002;Barker et al. 2003;Du et al. 2012).There have been studies conducted to assess the genetic diversity of S. viminalis (Trybush et al. 2012;Lascoux et al. 1996;Przyborowski and Sulima 2010).However, the number of individuals within populations was limited, and current researches are confined to central Europe.
In this study, we collected cuttings of 330 S. viminalis individuals from 12 natural populations across Ergun basin and West Liao basin in China, and all samples were analyzed using 20 SSR markers obtained from literatures.The specific aims of this study were: (1) to evaluate genetic variation within and between populations (2) to examine whether the genetic diversity of populations from West Liao basin was the same as that of populations from Ergun basin and (3) assess genetic differentiation of populations with short distances.

Plant materials and DNA extraction
Cuttings of 330 S. viminalis individuals were collected from 12 populations along 9 rivers across two basins (Table1; Fig. 1).The river Mordaga, Genhe, Tuli, Kuduer, Zhadun, Yimin and Huihe belong to Ergun basin, while river Dahaiqing and Heili belong to West Liao basin.Moreover, three populations were collected along Zhadun river to study genetic differentiation of populations with short distance.For each population, 17-30 individuals were randomly sampled.In order to avoid resampling clones, the individuals within population should be at least 50 m apart (except at YM, HH and HL sites where willows are no more than 50).As soon as the cuttings were transported to laboratory, they were planted in pot at once.One month later, fresh leaves were sampled and kept frozen at -80 ℃ in laboratory for later use.Total DNA was extracted from about 150 mg fresh leaf tissues according to the modified cetyltrimethyl ammonium bromide (CTAB) protocol described by Doyle and Doyle (1987).The DNA concentration was checked with a spectrophotometer (NAS-99, ACTGene Company) and the DNA quality was verified on a 1.2% agarose gel electrophoresis.DNA samples were stored at -18 °C.

PCR amplification and genotyping
A total of 70 microsatellite markers of Salix were selected from public literatures (Lian et al. 2001;Barker et al. 2003;Stamati et al. 2003;Hoshikawa et al. 2009).They were used to detect interspecific variation across six S. viminalis individuals by polyacrylamide gel electrophoresis.Based on the clarity and reproducibility of the band patterns, twenty microsatellite markers revealed high inter-individual variation and then were amplified to all samples.Amplification reactions were performed in a 15 μl volume containing 6.0 µl of 2.5 × PCR buffer (KAPA Taq HotStart PCR Kits, KAPA Company), 1.0 µl of each of the forward and reverse primers (5 µM), 0.1µl of Taq polymerase (5U µl -1 ), 1.0 µl of template DNA, and water to 15 µl.The thermal cycling was performed using GeneAmp 9600 PCR system (Applied Biosystems) as follows: 10 min at 95 °C; then 30 cycles consisting of 30 s at 95 °C, 30 s at 58 °C, 30 s at 70 °C, followed by a final extension for 7 min at 72 °C.After amplification, 1 µl of PCR products were added to 0.5 µl of ROX-500 size standard (Beijing Microread Gene Technology Co., Ltd) and 8.5 µl of Hi-Di formamide (Applied Biosystems) in 96 well-plates, and denatured at 95 °C for 3 min.Then the products were separated by capillary electrophoresis on 3730XL DNA analyzer (Applied Biosystems).Data were scored using the GeneMapper v 4.0 (Applied Biosystems).

Data analysis
The intra-and inter-population genetic statistics were measured using POPGENE v. 1.32 (Yeh et al. 1997) and GenAlex 6.5 (Peakall and Smouse 2012), including number of observed alleles (N a ), number of effective alleles (N e ), observed heterozygosity (H o ), expected heterozygosity (H e ), number of private alleles (N p ) and F-statistics (F IS , F IT , F ST ).Besides, statistical tests were performed on SPSS 18.0 (SPSS Inc., Chicago, IL, USA) with one-way analysis of variance to reveal the difference of two basins.Without prior classification information, population structure can be analyzed by Bayesian clustering algorithm in software STRUCTURE v 2.3.4 (Falush et al. 2003;Pritchard et al. 2000).Fifteen independent runs were performed for each value of K ranging from 1 to 12 with a burn-in of 50 000 iterations followed by 100 000 iterations.The most appropriate K value could be  determined by mathematical model described by Pritchard et al. (2000) and Evanno et al. (2005).Genetic relationships among populations were performed using 10 000 bootstrap replicates from PowerMarker v. 3.25 (Liu and Muse 2005) based on Shriver distance (Shriver et al. 1995) and a UPGMA tree was constructed with Consense of the software package Phylip v. 3.6 (Felsenstein 2004).Furthermore, in order to investigate the correlation between genetic and geographic distance between all pairs of populations, Isolation By Distance (IBD) analysis based on a mantel test with 30 000 matrix randomizations was carried out using IBDWS (isolation by distance web service) (Jensen et al. 2005).The software MEGA 6.0 (Tamura et al. 2013) was applied to view and edit the tree.Analysis of molecular variance (AMOVA) was performed on GenAlex to estimate the genetic variation within and among populations.Pairwise values of genetic differentiation (F ST ) were calculated by the AMOVA routine in GenAlex, and permutation procedures (9999 replicates) were used to test the significance of the differentiation between pairs of populations.In addition, the program Barrier version 2.2 was used to identify potential physical barriers (Manni et al. 2004).

Polymorphism of 20 simple sequence repeat (SSR) markers
A total of 329 alleles at 20 SSR markers were identified across 330 samples and all of the markers showed moderate to high level of polymorphism.The number of observed alleles (N a ) ranged from 3 at SB1148 to 49 at gSIMCT052 with an average of 16.45, while the number of effective alleles (N e ) ranged from 1.595 at Shuk058 to 13.47 at gSIMCT052 with an average of 5.384.Mean observed heterozygosity (H o ) and expected heterozygosity (H e ) were 0.636 and 0.742, respectively; H o was less than H e at 18 loci except SB1172 and SB1148 which were confirmed by F IS and F IT , indicating a deficiency of heterozygote at these 18 loci (Supplementary file S1, available at https:// doi.org/10.14214/sf.7001).

Genetic diversity of Salix viminalis populations
The mean N a and N e per locus and per population were 8.313 and 4.387, respectively.The maximum values of N a (10.50) and N e (5.549) presented in TL and KDE separately, while the minimum values of them (N a = 3.850, N e = 2.187) appeared in HL.Observed heterozygosity (H o ) ranged from 0.490 (HL) to 0.728 (KDE), while expected heterozygosity (H e ) ranged from 0.473 (HL) to 0.757 (KDE) (Table 2).For each population, H o was less than H e except HL where an excess of heterozygote were observed.A total of 56 private alleles were found in 12 populations ranging from 0 (ZD2) to 14 (TL), with an average value of 4.67 (Table 2).Moreover, the genetic diversity between the two basins showed highly significant differences by one-way analysis of variance (Suppl.file S2).

Genetic structure of Salix viminalis populations
According to the mathematical model described by Pritchard et al. (2000), the most optimal K value was to be 3 where the log-likelihood value emerged an inflection point (Suppl.file S3).However, the statistical model suggested by Evanno et al. (2005) showed a clear peak at value of K = 2 (Suppl.file S3).The clustering results of 12 populations are shown in Fig. 2. When K = 2, the 12 populations could be divided into two clusters: the first cluster comprised individuals from HL and DHQ of West Liao basin, while the second cluster comprised individuals from Ergun basin.When K = 3, the 12 populations could be divided in to three clusters: the individuals of West Liao basin were still classified into the same cluster, while individuals of Ergun basin could be divided into two clusters equaling that the individuals from Ergun basin were with two substructures: populations of GH, KDE, MDG1, MDG2 and TL constituted one subcluster (S1), and HH, YM, ZD1, ZD2 and ZD3 constituted the other subcluster (S2).The results of Bayesian clustering were confirmed by UPGMA with strong bootstrap supporting for Ergun cluster and West Liao cluster, although subclusters in Ergun cluster were not obvious (Fig. 3).This was further supported by Mantel test with 30 000 permutations, revealing significant correlation between F ST /(1-F ST ) and log (geographic distance) among 12 populations (y = 0.153x -0.261, r = 0.634, p ﹤ 0.00, Fig. 4A).In addition, significant correlations of genetic distance and geographic distance was also observed in Ergun basin (y = 0.031x -0.038, r = 0.482, p = 0.0002, Fig. 4B).

Genetic differentiation and genetic variance of Salix viminalis populations
For the 12 populations of S. viminalis, F ST of different loci ranged from 0.031 to 0.141 with an average of 0.076 (Suppl.file S1), while pairwise F ST values ranged from 0.003 to 0.220 (Suppl.file S4).Highly significant differentiation between populations of two basins was revealed by pairwise F ST values, and the differentiation of the two subclusters of Ergun basin was also highly significant.However, there was no statistically significant differentiation between HH and YM, ZD2 and ZD3, and among KDE, TL, GH and MDG1 (Suppl.file S4).AMOVA analysis revealed the populations were structured and moderately differentiated.As expected, the largest proportion of the total molecular variation (93.37%) was attributed to differences within populations and 6.63% was among populations.If taking into account the clustering results by Bayesian and Neighbor-Joining clustering, then most of the total genetic variation (86.13%) could be due to individual differences within populations while 11.49% was ascribed to difference of two basins and 2.38% existed in populations within each basin.For populations of Ergun basin, the variation between subclusters (2.33%) was larger than that between populations within subclusters (0.88%), but 96.80% of variation was still associated with individual differences within populations (Table 3).
Results revealed by Barrier software was in accord with the analyses of genetic structure.When the number of barrier was set as only one, a putative barrier (Da Hinggan Mountains) which separated the twelve populations into two groups was detected.Moreover, there were other two barriers existing in two basins respectively when the number of barrier was set to three.One isolated HL from DHQ and the other isolated HH, YM, ZD1, ZD2 and ZD3 from MDG1, MDG2, GH, TL and KDE (Fig. 5).

High polymorphism of simple sequence repeat (SSR) markers and high genetic diversity of Salix viminalis
The application of SSR markers to estimate the level of genetic diversity has been shown to be appropriate in many studies (Liu et al. 2012;Trybush et al. 2012;Talve et al. 2014).In this study, a moderate to high level of polymorphism for the 20 SSR markers indicates that they are effective to reveal genetic variation of S. viminalis.An average number of 16.45 observed alleles (N a ) and 5.38 effective alleles (N e ) per locus within 330 individuals was higher than studies in Salix like S. caprea (N a = 10.00,N e = 3.90) (Perdereau et al. 2014), and S. viminalis in Czech Republic (N a = 13.46,N e = 3.73) (Trybush et al. 2012) (Berlin et al. 2014).The high genetic diversity of S. viminalis is probably associated with its biological characters including woody habit, dioecism, wind or insect pollination, and wind seed dispersal (Hamrick et al. 1992;Karp et al. 2011;Teixeira et al. 2014).Moreover, heterozygosis deficit is observed within all populations except HL.The deficit of heterozygosity is very common in the Salicaceae family, such as S. lapponum and S. lanata (Stamati et al. 2007), S. daphnoides (Sochor et al. 2013), P. alba (Castiglione et al. 2010) and P. davidiana (Lee et al. 2011).Limited population size, extensive clonality or biased sex ratio and subsequent inbreeding are usually the reasons of deficit (Mandel et al. 2001;Sochor et al. 2013).
The SSR data demonstrated that the populations from Ergun basin exhibited significantly more genetic diverse than that from West Liao basin.The two populations of Ergun basin were either small or isolated, genetic drift are likely to take place.A small number of private alleles were also detected in the two populations.The genetic drift, small efficient individuals and inbreeding would lead the genetic diversity to decrease in Ergun basin.Chen et al. (2008) found that the genetic variance might be lost under unfavorable environment due to an increase in vegetative reproduction and a decrease in resource-demanding sexuality.The unfavorable environment in West Liao basin with arid spring and hot summer might also lead to declining genetic diversity.In this study, lower genetic diversity and together with less number of private alleles for populations from West Liao basin might also reflect changes of geographic distribution caused by past climate changes and glaciation followed by an expansion of S. viminalis distribution towards lower latitudes (Kim et al. 2011;Mehes et al. 2009).Less private alleles found in YM, HH and ZD2 of Ergun basin might be due to the disturbance of human activities, leading to rapid reduction in individuals.

Geographical patterns of Salix viminalis
The evolutionary potential and ability to cope with adverse environment of a species not only depends on intraspecific genetic variation, but also determined by genetic structure of population (Millar and Libby 1991).Both of Bayesian analysis and UPGMA clustering suggested that the 12 populations could be classified to two different genetic clusters which generally agreed with populations deriving from two basins.The populations of Ergun basin were further subdivided into two subclusters.It has been reported that physical barriers, such as mountains, rivers and deserts often play an important role in the population structure (Lamborot and Eaton 1997;Yuan et al. 2012;Wang et al. 2015).The Barrier result showed that the most significant geographical barrier was between the two basins.Da Hinggan Mountains could act as a key barrier to impede gene exchange among populations of two basins.In addition, the two suclesters of Ergun basin are located on the east and west side of Da Hinggan Mountains respectively, so Da Hinggan Mountains was also a barrier for populations inside the Ergun basin.
The overall genetic differentiation based on F ST (0.076 on average) showed that 7.60% of the total genetic variation was attributed to inter-population differentiation and 92.84% to individual differentiation within populations, suggesting moderate differentiation of S. viminalis according to the criterion suggested by Wright (1978).AMOVA results also showed a moderate differentiation that 6.63% genetic variance was among populations, which was mainly due to the contrast between two basins (Table 3).Our result was consistent with previous studies that the differentiation was low to moderate for species of Saliaceae (Imbert and Lefèvre 2003;Smulders et al. 2008;Lin et al. 2009;Callahan et al. 2013).However, the value of F ST in our study is higher than that for natural populations of S. viminalis in Czech Republic (F ST = 0.05) (Trybush et al. 2012), which might be due to that the 12 populations in our study are distributed in two basins with long distance, and they are separated and isolated by Da Hinggan Mountains between them.
Isolation By Distance (IBD) is based on the theory that when the populations are in equilibrium between gene flow and gene drift, a positive relationship should be detected between genetic distance and geographic distance.The mantel test showed a significant positive correlation of F ST /(1-F ST ) with log (geographic distance) among all 12 populations and among 10 populations in Ergun basin.It means that the genetic structure of S. viminalis is affected by geographic isolation and this is the results of adapting environment for a long time.Conversely, no correlation was found between F ST /(1-F ST ) and geographic distance of 6 S. viminalis populations in Czech republic (r = 0.316, p = 0.262) (Trybush et al. 2012).The inconsistent results might be due to that the populations in our study have little disturbance by human activities and the sampling range is relatively larger.Though YM, HH and ZD2 have been disturbed to some extent, it mainly occurs in recent years.
Populations of KDE, TL, GH and MDG1 were not statistically differentiated.The exchange of material that local residents collected twigs of S. viminalis for firewood or weaving might explain this result.Populations of ZD, YM and HH are located in plain area and the three rivers join into Hailar river and flood events could promote widespread regeneration across a floodplain (Hughes and Rood 2003).For species with remarkable capacity to flooding, they can even grow vigorously in flooded conditions (Bailey-Serres and Voesenek 2008).The flood in Hailar river might promote the gene exchange of ZD, YM and HH populations, so no differentiation was found.However, the flood has little effect on the differentiation of other populations of the same basin because they were located in mountain area.The differentiation is usually assumed very weak in Salicaceae, but the gene flow was restricted within 1-3 km for P. nigra (Imbert and Lefèvre 2003).In our study, differentiation of the three populations with short distance (1.5~1.9 km) from Zhadun river was very low, which was probably related to the outbreeding system of S. viminalis.The three sites along Zhadun river bank are not separated by barriers such as mountain, so the gene exchange could take place within above distances widely.

Conclusion
This study applied 20 SSR markers to examine the genetic diversity and genetic structure in natural populations of S. viminalis across Ergun and West Liao basins, China.Genotyping with microsatellites markers indicates that the S. viminalis are characterized by a high level genetic diversity and moderate population differentiation.The results support the separation of 12 populations into two differentiated genetic clusters which is in accordance with geographical distribution of S. viminalis populations across two basins, and Da Hinggan Mountains acts as a geographical barrier for this structure.Moreover, populations from Ergun basin exhibit significantly higher genetic diversity and higher number of private alleles than that from West Liao basin.In addition, the differentiation of the three populations with short distance (1.5~1.9 km) from Zhadun river was very low.Due to most of genetic variation exist within populations, so protection measures should be focused on populations, especially populations with higher diversity and unique alleles, such as TL, MDG2, ZD1 and GH.Meanwhile, the aforementioned populations are important reservoirs of genetic variation for special germplasm resources, thus we can make use of these potential germplasm in genetic improvement for selection of genotypes with rapid growth rate, high biomass, strong adaptability and resistance.

Table 1 .
General information for 12 Salix viminalis populations across two basins in China.

Table 2 .
Genetic diversity of 12 populations in Salix viminalis based on 20 SSR markers.
Abbreviations of populations are explained inTable 1. N a = Number of observed alleles; N e = Number of effective alleles; H o = Observed heterozygosity; H e = Expected heterozygosity; F IS = Inbreeding coefficient among individuals within subpopulation; N p = Number of private alleles.
Fig. 5. Genetic barrier of 12 Salix viminalis populations based on Barrier version 2.2.[Note:Abbreviations of populations are explained in Table1].