High genetic diversity and strong genetic structure of Strongyllodes variegatus (Coleoptera: Nitidulidae) demonstrate the genetic mechanism of its distribution in oilseed rape production areas in China
Current status: under review

Background: Strongyllodes variegatus (Fairmaire) is a major insect pest of oilseed rape in China. Despite its economic importance, the population genetics of this pest, knowledge of which would support the development of suitable management and control strategies, is poorly known. To understand the population genetics and assess the geographical patterns and genetic structure of S. variegatus in China, we used mitochondrial DNA cytochrome c oxidase subunit I and cytochrome b region sequences as genetic markers and analyzed population genetic diversity and structure in 437 individuals collected from 15 S. variegatus populations located in different oilseed rape production areas in China. In addition, we estimated the demographic history using a neutrality test and mismatch distribution analysis. Results: A high level of genetic diversity was detected among the mtDNA region sequences of S. variegatus. The population structure analysis strongly suggested that three genetic and geographical regions occur with limited gene flow among them. The Mantel test showed that genetic distance was strongly influenced by geographical distance. The demographic analyses showed that S. variegatus experienced population fluctuation during the Pleistocene, which was likely related to climatic changes. Conclusion: Overall, these results demonstrate that the strong population genetic structure of this beetle may be attributable to geographical barriers and subsequent adaptation to regional ecological conditions, which together shape the distribution of S. variegatus in China.

became more and more serious. In spring 2013, an outbreak of S. variegatus occurred in Hanshan, Anhui province, destroying 97% of oilseed rape leaves [6]. This beetle has become a major insect pest of oilseed rape and has spread to Qinghai, Gansu, Sichuan, Shaanxi, Chongqing, Hubei, Anhui and Jiangsu provinces in China.
S. variegatus displays ecological adaptation to the temperature and photoperiod of different geographical regions. In the spring oilseed rape areas, this beetle species reproduces once or twice a year [2]. However, only two generations occur in the winter oilseed rape areas in Anhui [6].
The fitness and viability of populations, along with their ability to adapt to environmental changes, are strongly influenced by genetic diversity [7]. Population genetic studies on crop pests can provide information on the spatial scales at which population structure and gene flow occur. Such information can help define spatially relevant strategies for pest control [8]. In addition, genetic diversity carries information on past and present demography that can be used to characterize the demographic history of crop pests [9]. However, the population genetics of S. variegatus has not been studied. Consequently, genetic studies of S. variegatus are urgently needed to support the management and control of this beetle species.
In recent years, a growing number of molecular markers have been used to study insect population genetics, demonstrating the importance of phylogeographical approaches [10]. Mitochondrial (mtDNA) genes are characterized by strict maternal inheritance, lack of genetic recombination, and fast evolutionary rates, so they are widely used to analyze the degree of inter- and intraspecific population differentiation, population history and other aspects [10,11]. Fragments of the mtDNA genes cytochrome c oxidase subunit I (COI) and cytochrome b (Cytb) are good molecular markers for studying insect phylogeny, population genetic variation and differentiation, as shown for Dendrolimus kikuchii, Chilo suppressalis and Agriosphodrus dohrni [12][13][14][15][16]. COI and Cytb have also been used to trace the colonization routes of Halyomorpha halys and to identify its places of origin [17][18][19].
To understand the population genetics and assess the geographical patterns and genetic structure of S. variegatus in China, we examined the genetic variation of the COI and Cytb genes and the population structure of this species. Three haplogroups were identified across its distribution in China. The demographic history of S. variegatus in the oilseed rape production areas of China was also inferred.

Population codes: GDQH, HZGS, ZYGS, GYSC, HZSX, AKSX, FJCQ, ESHB, LCHB, AQAH, LAAH, HFAH, CHAH, NJJS, ZJJS
Limited gene flow (Nm < 1) was also revealed among the three haplogroups. It is known that once populations have become genetically differentiated, their genetic divergence can be maintained if they have differentially adapted to regional ecological conditions, since geographic variation in selection can act as a strong barrier to gene flow [23]. On the other hand, our analysis suggested that there was substantial gene flow among S. variegatus populations within each region. This may be due to the geographical isolation of the regions and the flight capacity of the beetle. The Mantel test results showed that gene flow between populations was strongly influenced by geographical distance. The absence of gene flow on larger scales over China further confirmed the strong isolation-by-distance relationship in this species. The strong isolation-by-distance relationship in the present study supported the assumption that S. variegatus has a limited flight capacity. It was reported that S. variegatus can fly 30 ~ 40 m in 2 min [2]. However, the flight range of S. variegatus is less than tens of kilometres, which would not be enough to weaken the isolation-by-distance relationship and could increase the potential for allopatric or parapatric speciation [24,25]. In addition, the three haplogroups shared common haplotypes, suggesting small amounts of gene flow between the haplogroups. Although these three regions are geographically separated, the transportation network for oilseed rape seeds and the shuttle breeding of oilseed rape crops could increase gene flow among regions [6].
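The Nm < 1 threshold mentioned above can be related to FST through Wright's island model. The following is a minimal sketch, assuming the haploid form FST = 1/(1 + 2Nm) that is often applied to maternally inherited mtDNA; the text does not state which estimator was actually used, and the function name `nm_from_fst` is hypothetical.

```python
def nm_from_fst(fst, haploid=True):
    """Estimate the number of migrants per generation (Nm) from FST.

    Assumption (not stated in the text): Wright's island model,
    FST = 1 / (1 + c * Nm), with c = 2 for haploid, maternally
    inherited markers such as mtDNA and c = 4 for diploid nuclear markers.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    c = 2.0 if haploid else 4.0
    return (1.0 / fst - 1.0) / c
```

Under this assumed model, Nm falls below 1 whenever FST exceeds 1/3.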
Gene flow in insects has been reported to increase with mobility; the effect is more pronounced in species living on herbaceous plants and is especially high in agricultural pests [26]. Large genetic variation within populations has also been found in the pollen beetle, Meligethes aeneus, another oilseed rape pest [9][27][28][29]. However, no population structure of the pollen beetle could be found in five provinces of Sweden [28]. M. aeneus uses high-altitude flights (up to ca 200 m) at specific points during the year and low-altitude flights at multiple periods [29], which could help it disperse over large distances with the assistance of prevailing wind currents [30], resulting in high gene flow similar to that of the diamondback moth, Plutella xylostella [31].
Both the neutrality test and the mismatch distribution indicated a population expansion in the UY haplogroup. Furthermore, the phylogeographic patterns of the COI and Cytb haplotype networks were roughly composed of three "star-like" clusters. Based on a rate of 2.3% per site per million years [32], the expansion time of the UY haplogroup was estimated at 104 ka for COI and 128 ka for Cytb, within the interglacial time of the Pleistocene. Vast glaciers developed at that time on the Tibetan Plateau, in the Qinling Mountains and even in the Yangtze River valley [33,34], which could have triggered episodes of range contraction and expansion in many plant and animal species [35][36][37].

Conclusions
The current study provides the first population genetic analysis of S. variegatus. The high variability observed in the COI and Cytb molecular markers indicates that these markers are useful for measuring genetic patterns in S. variegatus populations. We confirmed the strong genetic structure of S. variegatus populations in China, which could be divided into three genetic haplogroups and geographical regions with limited gene flow among them. The distribution of this species in oilseed rape production areas in China is mainly structured by isolation through geographical barriers between the haplogroups and by genetic divergence between individuals within populations. We also found a signature of population expansion in the UY haplogroup, which might be related to climatic changes during the Pleistocene.

Methods
Sampling
A total of 437 S. variegatus individuals were collected from 15 populations in China (Fig. 1). Sample size ranged from 24 to 37 individuals per population, except for the ESHB population, which had eight individuals (Table S1).
The PCR products were subjected to electrophoresis on a 1.5% agarose gel (UltraPure Agarose, Invitrogen) containing GelRed (Biotium; 10,000× stock diluted 1:10,000), visualized on a BioDoc-it imaging system (UVP) and purified using ExoSAP-IT (USB, USA). The PCR products were bidirectionally sequenced (using the above primers) on an ABI 3730XL Automated Sequencer using the BigDye

Data Analysis
Forward and reverse sequences were assembled and aligned using the ClustalW algorithm [39]. The chromatograms obtained were checked for the presence of ambiguous bases. The sequences were also translated to amino acids using the invertebrate mitochondrial code implemented in MEGA7 to check for the presence of stop codons, which would indicate pseudogenes [40]. Population genetic diversity was estimated using the program DnaSP 5.0 [41], as indexed by the number of variable sites (S), the number of parsimony informative sites, the number of haplotypes (Hn), the percentage of haplotypes unique to a given geographical area, haplotype diversity (Hd), nucleotide diversity (π), and the average number of nucleotide differences (k).
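The diversity indices listed above can be illustrated with a short sketch. It assumes the standard definitions (Hd = n/(n-1) × (1 - Σp²), k as the mean number of pairwise differences, π = k divided by the alignment length) applied to equal-length aligned sequences without gaps or ambiguous bases; DnaSP treats gaps and missing data in ways not reproduced here, and the function name is hypothetical.

```python
from collections import Counter
from itertools import combinations


def diversity_indices(seqs):
    """Compute S, Hn, Hd, k and pi for aligned, gap-free sequences."""
    n = len(seqs)
    length = len(seqs[0])
    if n < 2 or any(len(s) != length for s in seqs):
        raise ValueError("need at least two sequences of equal length")

    # Haplotype counts and haplotype diversity (with sample-size correction)
    haplotypes = Counter(seqs)
    hd = n / (n - 1) * (1 - sum((c / n) ** 2 for c in haplotypes.values()))

    # Variable (segregating) sites: alignment columns with more than one state
    s = sum(1 for column in zip(*seqs) if len(set(column)) > 1)

    # Average number of nucleotide differences over all sequence pairs
    pairs = list(combinations(seqs, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    return {"S": s, "Hn": len(haplotypes), "Hd": hd, "k": k, "pi": k / length}
```

For example, `diversity_indices(["AAAA", "AAAT", "AATT", "AAAA"])` describes a toy sample of four sequences with three haplotypes and two variable sites.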
The Templeton, Crandall, and Sing (TCS) network of the haplotypes was constructed using POPART [42,43]. Population genetic structure was assessed with an analysis of molecular variance (AMOVA) in Arlequin 3.5 according to the degree of differentiation between regions (FCT), between populations within regions (FSC), and between all populations (FST). Pairwise FST analyses between populations and between geographical regions were carried out with significance tests based on 1,000 permutations using Arlequin 3.5 [44]. To test for isolation by distance, the matrices of genetic distance FST/(1-FST) and geographic distance (ln) between all 15 populations were compared using the Mantel test with 10,000 permutations [45]. The analysis was carried out using the zt software package [46].
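The isolation-by-distance test described above can be sketched as a simple permutation procedure. This is an illustrative one-sided Mantel test using Pearson's r, not the zt implementation; the function names, the seed handling and the one-sided p-value convention are assumptions made for this example.

```python
import math
import random


def linearized_fst(fst):
    """Transform a pairwise FST value into FST / (1 - FST)."""
    return fst / (1.0 - fst)


def upper_triangle(matrix, order):
    """Flatten the upper triangle after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mantel_test(genetic, geographic, permutations=10000, seed=0):
    """Return (r, p) for a one-sided Mantel test on two square distance matrices."""
    n = len(genetic)
    identity = list(range(n))
    geo_vec = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(genetic, identity), geo_vec)

    rng = random.Random(seed)
    at_least_as_large = 1  # the observed arrangement counts once
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)  # permute population labels of one matrix only
        if pearson(upper_triangle(genetic, order), geo_vec) >= r_obs:
            at_least_as_large += 1
    return r_obs, at_least_as_large / (permutations + 1)
```

Here `genetic` would hold the `linearized_fst` values and `geographic` the natural logarithm of the pairwise distances; only the off-diagonal entries are used, so the diagonals can be set to zero.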
We examined the historical demographic expansion with Tajima's neutrality test and mismatch distribution analysis. A goodness-of-fit test was used to determine the smoothness of the observed mismatch distribution (using Harpending's raggedness index, Rag) and the degree of fit between the observed and simulated data (using the sum of squares deviation, SSD) [51,52]. An expansion signal for a population was indicated by a smooth and unimodal distribution pattern with non-significant p-values for the SSD. We also estimated the time of expansion with the formula τ = 2µkt [49], where τ is the crest of the mismatch distribution, µ is the nucleotide substitution rate, k is the number of nucleotides, and t is the time since expansion.
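The mismatch-based quantities above can be illustrated with a minimal sketch. It assumes gap-free aligned sequences, Harpending's raggedness index computed as the sum of squared differences between adjacent mismatch classes, and a substitution rate µ expressed per site per year; in practice τ is estimated by model fitting in dedicated software, which this sketch does not attempt. Function names and the example values are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(seqs):
    """Return frequencies x[0..d] of sequence pairs that differ at i sites."""
    diffs = [sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)]
    counts = Counter(diffs)
    total = len(diffs)
    return [counts.get(i, 0) / total for i in range(max(diffs) + 1)]


def raggedness_index(freqs):
    """Harpending's raggedness: squared steps between adjacent mismatch classes."""
    padded = list(freqs) + [0.0]  # class d + 1 has frequency zero
    return sum((padded[i] - padded[i - 1]) ** 2 for i in range(1, len(padded)))


def expansion_time(tau, mu, k):
    """Solve tau = 2 * mu * k * t for t (years, if mu is per site per year)."""
    return tau / (2.0 * mu * k)


# Hypothetical inputs for illustration only, not values from this study
t_years = expansion_time(tau=3.0, mu=2.3e-8, k=700)
```

A smooth, unimodal distribution gives a low raggedness value, whereas a multimodal distribution gives a higher one.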

Figure 2
Haplotype networks estimated from the sequences of (a) COI and (b) Cytb. Circles represent haplotypes, numbers in the circles are haplotype names, small black circles represent missing haplotypes that were not observed, circle size denotes the total haplotype frequency, each slice represents the haplotype frequency in a different population, and each line between linked haplotypes corresponds to one mutation. The three haplotype regions are indicated by different colors: SP region (red), UY region (yellow) and LY region (green).

Figure 3
Scatter plots of genetic divergence vs. geographical distance for pairwise comparisons of all populations (r = 0.500, P < 0.0001). The genetic divergence FST/(1-FST) and the geographic distance (ln) were compared using the Mantel test with 10,000 permutations.

Figure 4
Pairwise mismatch distributions of the (a) COI and (b) Cytb genes for the three derived regions. The x coordinate represents the number of pairwise differences among sequences, and the y coordinate represents the frequencies of pairwise differences in each region.

Supplementary Files
Additional Files.pdf