DNA Barcoding Reveals Cryptic Diversity in the Genus Triplophysa (Cypriniformes: Cobitidae, Nemacheilinae) from the Northeastern Qinghai-Tibet Plateau

Background: The northeastern part of the Qinghai-Tibet Plateau (QTP) is one of the areas with the largest number of plateau loach species. As one of the three major groups of fishes distributed on the QTP, plateau loach have very important ecological value. However, their taxonomy and systematics are still controversial, and a large number of new species have been reported. The reason for this phenomenon is that the degree of morphological variation is low, the phylogenetic information provided by the morphological and anatomical features used for species identification is relatively poor, and there are many cryptic species. Based on high-density sampling in the surveyed biodiversity hotspot, this study aims to evaluate the taxonomic characteristics of the plateau loach by means of morphology, DNA barcoding and multiple species delimitation methods, in order to accurately describe species and allocate taxonomic units to unknown specimens. Results: After careful identification and comparison of the morphology and DNA barcodes of 1,630 specimens, 22 species were identified, 20 of which were considered valid local species and two of which were new species that had not been described. Based on the combination of morphological and molecular methods, a total of 24 native species were found, two of which are cryptic species: Triplophysa robusta sp1 and Triplophysa minxianensis sp1. Fourteen of the 24 species form clusters of barcodes, which allow them to be reliably identified. The remaining cases involved 10 closely related species, some of which were rapidly differentiated, had a disputed taxonomic status, or showed introgression. Conclusions: The results highlight the need to combine traditional taxonomy with molecular methods to correctly identify species, especially in closely related species such as the plateau loach. This study provides a basis for protecting the biodiversity of plateau loach.


Background
With problems such as global climate change, issues related to populations, the ecological environment, energy and food are becoming increasingly serious, and sustainable anthropogenic development and the ability to understand and meet the requirements of biodiversity are becoming urgent (Loreau et al., 2001; Isbell et al., 2011; Cardinale et al., 2012). There is a major global demand for accurate and rapid identification of species for the protection and sustainable use of biodiversity resources. Species identification and classification is a basic requirement for biological research. Based on morphological characteristics, classical taxonomy has made great contributions to species classification; however, due to morphological plasticity, traditional taxonomy cannot accurately distinguish all species, in particular morphologically similar, closely related species (Robinson and Parsons, 2002; Pigliucci, 2005). Therefore, there is a need for a new approach that supports species identification alongside classical taxonomic methods. Tautz et al.
(2002) first suggested using DNA sequencing, namely, DNA taxonomy, as the main platform for biological classification. Then, Professor Paul Hebert from the University of Guelph in Canada introduced the concept of DNA barcoding, highlighting its significance to the field of biological taxonomy and species identification (Hebert et al., 2003; Remigio et al., 2003) and suggesting the use of the mitochondrial cytochrome c oxidase subunit I (COI) gene as the basis for animal DNA barcoding. The applicability of DNA barcoding to the identification of marine and freshwater fish species has been shown by using a short fragment of approximately 650 bp from the mitochondrial COI gene to identify species based on sequence differences (Ward et  The Qinghai-Tibet Plateau (QTP), known as "the roof of the world", is rich in biodiversity and is a relatively unique area with many endemic species (Khan et al., 2005). The native fish living in the Qinghai-Tibet region belong to three orders: Salmoniformes, Siluriformes and Cypriniformes (Wu and Wu, 1992). Triplophysa, which belongs to the subfamily Nemacheilinae (Cypriniformes), is widely distributed on the QTP and in its adjacent regions (Wang et al., 2016). It is a special group adapted to the climatic characteristics of the QTP, such as cool temperatures and oxygen shortages (Zhu and Wu, 1981; Wu and Wu, 1992). In 1992, there were 33 Triplophysa species identified. However, over time, a large number of new species have been described; so far there are a total of 140 valid species (He et al., 2008; Li et al., 2017). Although there may be some synonymous species (He et al., 2008; Prokofiev et al., 2007), these studies show that a large amount of unknown biodiversity exists in Triplophysa, and many species have not been recognized or described. The phenomenon of many new species being reported is mainly caused by the existence of cryptic species or the lack of careful taxonomic review.
The simple body structure and relatively conservative morphological evolution of the plateau loach fish, coupled with their weak migration ability due to the restrictions of the water system, have led to limited gene exchange between different populations. Over time, although morphologically imperceptible, the process of species differentiation, including genetic structural differentiation and reproductive isolation, may have occurred, and many hidden taxa may have been ignored. Therefore, the genus Triplophysa should be considered in the study of cryptic diversity.
Classical morphological classification has always played a dominant role in species identification, but it has limitations. In particular, for the fish of the genus Triplophysa, the phenotype is easily affected by biological factors and the external environment and there is morphological plasticity; therefore, morphological differences are not easily detected (He et al., 2008). Moreover, some species were named many years ago, and their morphological descriptions were relatively simple. All these factors have led to difficulties in the subsequent identification of species and taxonomic research. Due to the difficulty in obtaining detailed data for comparisons, it is possible that the distribution of some species is artificially expanded and mistakenly divided into different geographical populations (Ding et al., 1996; Prokofiev, 2007). Such a long-term, complex classification history is significant to the genus. Many studies have shown that species identification results based on DNA barcoding, which is one of the effective means to identify morphologically indistinguishable cryptic species, have a high degree of matching with the current classification system (Barrett et al., 2005; Dincǎ et al., 2011; Dhar et al., 2017). Triplophysa species can also be identified by DNA barcoding. To date, there are no studies on the identification or evaluation of cryptic biodiversity within the genus Triplophysa using DNA barcoding in the northeastern QTP. Herein, samples obtained from the northeastern QTP, a hotspot for biodiversity, were identified using DNA barcoding. Furthermore, the uncovered cryptic lineages were analysed in combination with morphological features.

Results
A total of 1,630 native specimens were collected from the northeastern edge of the QTP (Table S1), and 22 morphospecies were identified including two undetermined species (Triplophysa sp1 and Triplophysa sp2). Among the specimens, the endemic species T. robusta (n = 413) had the largest number of individuals, followed by T. minxianensis (n = 253). The undetermined species T. sp1 (n = 3) and T. bleekeri (n = 5) had the lowest number of specimens, with 68 specimens per species on average (Table 1). A total of 1,630 COI sequences were obtained. The size of the sequences obtained was 606 bp after trimming to a consensus length. No stop codons were observed, and the mean nucleotide composition within the complete data set was 30.6% thymine (T), 26.7% cytosine (C), 24.3% adenine (A) and 18.4% guanine (G).
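The composition and stop-codon checks reported above can be illustrated with a minimal Python sketch. The function names and toy sequences are hypothetical, the reading frame is an assumption (the source does not state it), and the stop codons listed are those of the vertebrate mitochondrial genetic code; this is not the authors' pipeline.

```python
from collections import Counter

# Stop codons of the vertebrate mitochondrial genetic code (translation table 2).
MITO_STOP_CODONS = {"TAA", "TAG", "AGA", "AGG"}


def base_composition(seqs):
    """Return the fraction of A, C, G and T pooled over all sequences."""
    counts = Counter()
    for seq in seqs:
        # Ignore gaps and ambiguity codes; count only unambiguous bases.
        counts.update(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    return {b: counts[b] / total for b in "ACGT"}


def stop_codon_positions(seq, frame=0):
    """Return 0-based codon indices of in-frame stop codons (assumed frame)."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    return [i for i, codon in enumerate(codons) if codon in MITO_STOP_CODONS]
```

An empty list from `stop_codon_positions` for every sequence would correspond to the "no stop codons were observed" statement, provided the assumed frame matches the true COI reading frame.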
There were 393 conserved sites, 213 variable sites, 178 parsimony-informative sites and 35 singleton sites. A total of 230 unique haplotypes were generated in the 1623 COI sequences. The haplotype number of T. robusta was the largest (Nh = 46), followed by that of T. obscura (Nh = 27) and T. stoliczkai (Nh = 25). The haplotype numbers of T. bleekeri and T. orientalis were the smallest (Nh = 1). Correspondingly, the haplotype diversity of T. robusta was the highest (h = 0.9360 ± 0.006). The nucleotide diversity was the highest for T. obscura (π = 0.00777 ± 0.00145). Phylogenetic trees were constructed by the neighbor-joining (NJ), maximum likelihood (ML) and Bayesian inference (BI) methods. With Homatula variegata as the outgroup (GenBank no. MF953219), the topologies of the trees obtained by the three methods were basically the same; only the topology of the NJ tree is retained here, and the values at the nodes are the support values from the NJ/ML/BI trees, respectively. The PTP analysis with a maximum likelihood partition and Bayesian implementation resulted in 17 MOTUs (Fig. 3). The GMYC analysis robusta showed correspondence between the morphological species and MOTUs. The MOTUs of T. minxianensis, T. pappenheimi, T. siluroides, T. pappenheimi and T. robusta sp1 cannot be distinguished by the PTP, GMYC, ABGD or BOLD analyses. The same phenomenon occurs between T. stoliczkae and T. dalaica and between T. scleroptera and T. pseudoscleroptera. T. leptosoma and T. papilloso-labiatus cannot be distinguished by the PTP or GMYC analyses, but they can be distinguished by the ABGD and BOLD analyses. The same is true of T. shiyangensis.
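As an illustration of the haplotype diversity (h) and nucleotide diversity (π) statistics reported above, the following Python sketch computes both for a set of aligned sequences. It assumes the standard definitions (h as the sample-size-corrected probability that two sampled sequences differ, π as the mean number of pairwise differences per site); the source does not state which software or estimator was used, and the function names and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """h = n / (n - 1) * (1 - sum(p_i^2)), p_i = frequency of haplotype i."""
    n = len(seqs)
    if n < 2:
        return 0.0
    counts = Counter(seqs)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences per site (aligned, equal length)."""
    n = len(seqs)
    if n < 2:
        return 0.0
    length = len(seqs[0])
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t)) for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) / 2
    return total_diffs / n_pairs / length


# Toy alignment: three haplotypes among four sequences of four sites.
toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
h = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
```

A species represented by a single haplotype (Nh = 1) gives h = 0 and π = 0 under these definitions.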
The average K2P intraspecific distance ranged between 0 and 3.10% (Table 2). The maximum observed average K2P intraspecific distance was that of T. robusta. The maximum intraspecific K2P distance ranged from 0 to 7.90%. The largest K2P intraspecific distance was observed for T. robusta, followed by T. minxianensis with a value of 7.40%. The nearest neighbour distance ranged between 0 and 8.57%. For T. robusta, T. minxianensis, T. siluroides and T. pappenheimi, a nearest neighbour distance of 0% was observed. The nearest neighbour distance of 18 species was lower than the maximum K2P intraspecific distance. Only the nearest neighbour distance of T. scleroptera and T. pseudoscleroptera was less than 1%, at 0.40%. The distributions of the maximum K2P intraspecific distances and the nearest neighbour K2P genetic distances overlapped, and no barcode gap was found (Fig. 4). Most species form very good evolutionary branches in the NJ tree, and these main branches represent different taxonomic species. Monophyletic clades have also been observed for T. stoliczkae and T. dalaica, and T. scleroptera and T. pseudoscleroptera. Neither T. minxianensis nor T. robusta formed an independent monophyletic clade, but they formed two larger branches according to geographic distribution. Because of a shared haplotype between T. minxianensis, T. pappenheimi, T. siluroides and T. robusta, these four species form a larger clade. The trend of mixed genealogies was confirmed by the examination of the haplotype networks. Two pairs of species (T. stoliczkae and T. dalaica (Fig. 5A) and T. scleroptera and T. pseudoscleroptera (Fig. 5B)) cannot be distinguished by the four algorithms used for MOTU delimitation, and there is no shared haplotype between them. Four haplotypes were shared among T. minxianensis, T. pappenheimi, T. siluroides and T. robusta (Fig. 5C).
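The distance comparison described above can be sketched in Python as follows. The sketch assumes the standard Kimura two-parameter formula, d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions, and it treats a barcode gap as the nearest neighbour distance exceeding the maximum intraspecific distance. Function names, species labels and sequences are hypothetical and do not reproduce the authors' analysis.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}


def k2p_distance(s, t):
    """Kimura two-parameter distance between two aligned sequences."""
    pairs = [(a, b) for a, b in zip(s.upper(), t.upper())
             if a in "ACGT" and b in "ACGT"]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")
    # Transition: both purines or both pyrimidines; otherwise transversion.
    ts = sum(1 for a, b in pairs
             if a != b and (a in PURINES) == (b in PURINES))
    tv = sum(1 for a, b in pairs
             if a != b and (a in PURINES) != (b in PURINES))
    p, q = ts / n, tv / n
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def barcode_gap_table(records):
    """Map species -> (max intraspecific, nearest neighbour, gap present)."""
    table = {}
    for sp in {name for name, _ in records}:
        own = [seq for name, seq in records if name == sp]
        others = [seq for name, seq in records if name != sp]
        max_intra = max((k2p_distance(a, b)
                         for a, b in combinations(own, 2)), default=0.0)
        nearest = min(k2p_distance(a, b) for a in own for b in others)
        table[sp] = (max_intra, nearest, nearest > max_intra)
    return table


# Toy data: two sequences of hypothetical species A, one of species B.
a1 = "ACGTACGTACGTACGTACGT"
a2 = "GCGTACGTACGTACGTACGT"  # one transition relative to a1
b1 = "TTATACGTACGTACGTACGT"  # two transitions and one transversion vs a1
table = barcode_gap_table([("A", a1), ("A", a2), ("B", b1)])
```

In this toy case both species show a gap; species that share haplotypes, as reported for T. robusta and its relatives, would have a nearest neighbour distance of 0 and therefore no gap.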

Discussion
In this study, a total of 24 species were reported, including two new species: a cryptic species in the T. minxianensis population and a cryptic species in the T. robusta population. The morphological and molecular data were consistent in 14 of the 22 species identified. The results show that there are two cryptic species that can be described in the biodiversity hotspot area, which reinforces the general view that there is still a large amount of unrecorded diversity in the plateau loach. A single sequence forms a single branch for T. bleekeri and T. orientalis. It is necessary to collect more specimens and add sequences, but we do not rule out the possibility of identifying more cryptic species.
Different numbers of MOTUs were identified by the four DNA barcode analysis methods: 17 different MOTUs were identified using the PTP and GMYC models and 19 MOTUs were identified using the ABGD and BOLD methods. T. shiyangensis and T. leptosoma cannot be distinguished by the PTP or GMYC models, but the ABGD and BOLD methods allow different MOTUs to be assigned to each species (Fig. 3). The inconsistent results of the four methods may be due to different threshold values used for the identification of species; the BOLD system method defaults to 2.2%, ABGD to 2.7%, PTP to 1%, and GMYC to 2%. Although it has been pointed out that the RESL in the BOLD system has a stronger taxonomic performance than that in the ABGD system, showing better species identification and MOTU assignment results (Ratnasingham and Hebert, 2013), the two methods in this study achieved the same results, which may be related to the identified species. A key aspect implicit in DNA barcoding analysis is the genetic distance threshold values used to define the MOTUs. The difference in the number of MOTUs detected by the different analysis methods was mainly seen in two pairs of MOTUs: the genetic distance between T. shiyangensis and T. stoliczkae was relatively low (2.65%), as was the genetic distance between T. leptosoma and T. papilloso-labiatus (1.47%). These relatively low genetic distance values may be related to the late differentiation of these MOTUs. Notably, the MOTUs of relatively recent origin had less time than species of distant origin to accumulate genetic differences, which hindered their correct identification, even though the species differ greatly in their morphological characteristics. T. papilloso-labiatus has an obvious swim bladder, while T. leptosoma does not (Zhao, 1984). The characteristics of the genetic diversity of these species are the same: there is a relatively high level of haplotype diversity (> 0.5) and relatively low levels of nucleotide diversity (< 0.5%) (Table 1).
This indicates that after the differentiation of these species, influenced by the founder effect and environmental heterogeneity caused by water system changes, the population rapidly accumulated variation, resulting in a high haplotype diversity index. The accumulation time of the nucleotide diversity index was much longer than that of the haplotype diversity index. In terms of geographical distribution, these two species are mainly distributed in the Shulehe River and Heihe River. The possibility of sympatric speciation exists, but this needs to be confirmed by further analysis.
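The dependence of the MOTU count on the distance threshold, discussed above, can be illustrated with a deliberately simple Python sketch. It groups sequences by single-linkage clustering on uncorrected p-distance and is only a stand-in for the idea of a threshold: it is not the ABGD, RESL (BOLD), PTP or GMYC algorithm, and the function names, sequences and thresholds are hypothetical.

```python
from itertools import combinations


def p_distance(s, t):
    """Uncorrected proportion of differing sites between aligned sequences."""
    return sum(a != b for a, b in zip(s, t)) / len(s)


def cluster_motus(seqs, threshold):
    """Single-linkage clusters: link sequences whose distance <= threshold."""
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if p_distance(seqs[i], seqs[j]) <= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(seqs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy alignment of 100 sites: three close sequences and one divergent one.
toy = [
    "A" * 100,
    "C" + "A" * 99,       # 1% from the first sequence
    "CC" + "A" * 98,      # 2% from the first, 1% from the second
    "GGGGG" + "A" * 95,   # 5% from each of the others
]
strict = cluster_motus(toy, 0.005)
medium = cluster_motus(toy, 0.015)
loose = cluster_motus(toy, 0.06)
```

With these toy data the strict threshold yields four clusters, the medium one two and the loose one a single cluster, which mirrors how closely related pairs such as T. leptosoma and T. papilloso-labiatus can be merged or split depending on the cut-off applied.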
An example of incompletely separated species was also found. T. minxianensis, T. robusta, T. pappenheimi and T. siluroides are not sufficiently differentiated by COI gene differences, and there are also shared haplotypes among the four species (Fig. 5). These phenomena can be explained as frequent gene infiltration events before species differentiation (Feng et al., 2018) or phenotypic plasticity in fish (Robinson and Parsons, 2002; Thibert-Plante and Hendry, 2011). Another species that clusters together with T. minxianensis, T. robusta, T. pappenheimi and T. siluroides on a larger branch is T. hsutschouensis. The morphological characteristics of T. hsutschouensis, which was identified as an independent species isolated from T. robusta, include bare and scaleless bodies and a relatively low ratio of body length to body height (Wang, 1991). T. robusta only has residual scales in specific parts of its body. The Jinghe River population of T. robusta has scales along the lateral line from the caudal fin to the front of the dorsal fin. Moreover, the Jinghe River population and other populations of T. robusta were clustered into two branches (Fig. 3), and the genetic distance between the populations reached 7.9% (Table 2). These phenomena suggested the existence of cryptic species of T. robusta. There was no difference between T. minxianensis and T. minxianensis sp1 in the degradation of the swim bladder, whether the end of the pelvic fin reached the anus, the starting point of the dorsal fin and the pelvic fin relative to each other or the morphological measurement data. However, the scales of T. minxianensis sp1 were only found on the caudal peduncle, which is quite different from T. minxianensis, in which all the body parts except the head have obvious round scales. The genetic distance between the two populations was 7.4% (Table 2), which indicated that there were cryptic species in T. minxianensis.
Similar to this example of incomplete species separation, Wang (1991) argued that the plateau loach groups without scales (T. hsutschouensis) come from scaly groups (T. minxianensis) following the degeneration of scales. The groups with remnant body scales (T. robusta) are the intermediate species between the two types. Whether scales degenerate marks a leap in the evolution of plateau loach populations. The cryptic species found in this study provide more evidence for this speculation.
The morphological characteristics and molecular characteristics were inconsistent in T. pseudoscleroptera and T. scleroptera. The two species have similar appearances but different internal anatomical structures. The anterior and posterior segments of the swim bladder of T. pseudoscleroptera were the same size, with a long pouch or oblong oval shape and no pyloric caecum. The posterior chamber of the swim bladder of T. scleroptera is developed, the anterior segment is thin and the posterior segment is enlarged into a long pouch (Zhu et al., 1981). Without the comparison of internal anatomical structure, these species are easy to misidentify and morphological identification may be incorrect (He et al., 2008). However, due to the low interspecific distance between the two species (0.40%), the two MOTUs cannot be correctly distinguished. This inconsistency was also found between T. dalaica and T. stoliczkai. The posterior chamber of T. dalaica's swim bladder was oval, while the posterior chamber of T. stoliczkai's swim bladder was degraded; this feature can be used to accurately distinguish the two species.
As shown by the two cases reported here, the DNA barcoding did not show enough difference to distinguish similar species because the lineages were not completely divided into different branches. It is easy to identify species with morphological characteristics that are not significantly different as a single species. For example, T. bleekeri and T. polyfasciata have very similar morphological characteristics, there is no significant difference in the quantitative traits in different proportions of their bodies, and they have been identified as the same nominal species. Ding et al. (1996) believed that they should be divided into two different species based on molecular data and pointed out that the main distinguishing feature was that there were 10-12 wide, dark brown horizontal stripes on the side of the body. However, even among T. bleekeri individuals collected from the same site, the number of horizontal stripes on the side of the body can range from 0 to 10. Of the specimens collected from Wenchuanhe River in Sichuan Province, most had 5-7 horizontal stripes, and almost none had more than 10. It was concluded that the validity of T. polyfasciata was still questionable (He et al., 2008). In this study, the numbers of these two species of plateau loach collected were relatively small, with 10 T. bleekeri and 5 T. polyfasciata, and 7-9 horizontal stripes were observed on the sides of the fish bodies. The division into two different species was also not supported by morphology, but the genetic distance between the two species reached 8.57%, far exceeding the 2% threshold of intraspecific genetic distance (Pereira et al., 2013). Therefore, it is speculated that these two species have differentiated genetically, but due to the small size of the individuals (the length of the collected samples is 5-8 cm), the morphological difference is not obvious, so they have historically been regarded as one species.
Obviously, the body colour or body markings of the plateau loach may not be an effective classification feature for the identification of species and cannot be used as the main basis for identification.
Herzenstein (1891) identified T. papilloso-labiatus as a subspecies of T. strauchii; this finding was also supported by Zugmeyer (1910). T. strauchii lack a developed mastoid process similar to that of T. papilloso-labiatus. Instead, they have only a strong, naked fold, while the mastoid process on the upper lip of the plateau loach living in the Hexi corridor is obviously a double line, and that on the lower lip is a blurred double line. Characteristics such as the mastoid process and strong, naked crease are continuously transitive in a geographical distribution without obvious boundaries. However, the appearance of significant double lines on the mastoid marks discontinuity in the variation, and there are relatively stable differences in a series of other morphological traits. Thus, T. papilloso-labiatus should be regarded as an independent species (Li and Chang, 1974; Zhao, 1984). This is also supported in the phylogenetic tree constructed in this study (Fig. 3). T. strauchii and T. papilloso-labiatus are clustered into two different branches and should be independent species.
There is little difference in the morphological characteristics between T. wuweiensis and T. scleroptera. Li and Chang (1974) regarded T. wuweiensis as an independent species based on 7 morphological traits. Wu (1975, 1981) believed that there was a certain continuity in the identification characteristics of these two species. However, after collecting specimens of T. scleroptera distributed in the Datonghe River, only one mountain away from the T. wuweiensis specimens, Zhao (1984) believed that there were significant differences between the two species in the number of pectoral fin rays, intestinal shapes and gill rakers, supporting T. wuweiensis as an independent species. In this study, T. wuweiensis and T. scleroptera clustered in different branches, and the two species were greatly differentiated, which also supported the idea that T. wuweiensis is an independent species. The low genetic diversity of T. wuweiensis may be due to the short time since species differentiation, and the low haplotype diversity and nucleotide diversity may be caused by the founder effect and the narrow distribution area (the species is only distributed in the east and west Shiyanghe River tributaries).
T. shiyangensis, T. papilloso-labiatus and T. hsutschouensis are distributed in three inland river systems in the Hexi corridor. The maximum intra-species genetic distance of these three species is more than 1%.
This may be mainly due to the wide geographic distribution of the three species and the large population differentiation caused by the barriers created by the water systems. This phenomenon also appears in the sympatric distribution of Gymnocypris chilianensis, in which each geographic population is clustered into a single branch, with a large genetic differentiation (Zhao et al., 2011).
The different geographic populations of some widespread species are identified as different species or subspecies due to some more significant morphological differences. For example, T. stoliczkae was divided into 7 subspecies (Herzenstein, 1891) due to the differences in the number of gill rakers, the proportion of quantitative traits and the number of spiral loops of intestinal tubes with changes in altitude or water system. In this study, the samples were collected in three drainage systems (Yellow River, Jialing River and the inland rivers in the Hexi corridor). The maximum genetic distance within the population was greater than 1.2% (Table 2). However, the samples of different water systems have shared haplotypes. This indicated that the population differentiation of T. stoliczkae was low in the surveyed area.
The membranous swim bladder of T. obscura is very developed with a constriction in the middle, and its length accounts for approximately 2/3 of the abdominal cavity. Compared with T. orientalis, its body surface has obvious spines. It is regarded as an independent new species (Li, 2017). In this study, a relatively large number of samples (n = 234) were collected in the distribution area. The phylogenetic tree showed that the samples from different water systems were clustered into different branches, the maximum genetic distance within the species was 2%, and the nucleotide diversity and haplotype diversity were relatively high (h = 0.887, π = 0.00777). These findings indicate that there is a large differentiation between the two geographically separated populations of T. obscura and the possibility of allopatric speciation. T. obscura and T. orientalis are also divided into two different monophyletic lines in the phylogenetic tree, which is consistent with the results of the analysis of Wu (2017).
Although only 3 specimens of T. sp1 were collected in the Liangdang section of the Jialing River, there are obvious differences in morphological characteristics from other species of plateau loach. It should be identified as a new species that has not been reported, but more specimens should be collected for further confirmation. T. sp2 was collected in the Jialing River, and showed degeneration of the membranous swim bladder, leaving only a small chamber, an anus near the start of the anal fin, the end of the pelvic fin adjacent to the anus, a large spot on the back of the body, a spot on the side of the body and other morphological characteristics which were obviously different from those of the closely related species T. obscura. A detailed description of these newly discovered species is necessary to make it possible to record the relationship between morphology and molecular identification criteria (Versteirt et al., 2015).

Conclusions
This study is the first comprehensive assessment of plateau loach species in a biodiversity hotspot using

Sample collection
The samples were collected at 114 sampling sites in two exorheic rivers (Jialing River, which is the largest branch of the Yangtze River, and the upstream of the Yellow River) and three inland water bodies (Shiyanghe River, Heihe River and Shulehe River) located on the northeastern edge of the QTP from 2015 to 2018 (Fig. 1, Fig. 2). The specimens were caught using gill nets and cage nets. To accurately identify the fish based on taxonomic books, the fresh specimens were examined for specific morphological characters (Zhu and Wu, 1981; Wu and Wu, 1992; Wang, 1991). The muscle tissue of each specimen was preserved in 95% ethanol for DNA extraction, and the voucher specimens were stored in 10% formaldehyde solution for further examination of specific morphological characters (Table S1).

DNA extraction, amplification and sequencing
Total genomic DNA was extracted from the muscle tissue using the high-salt method, and a segment of

Figure caption: Collection sites. Details of the 111 sites and collected specimens are provided in Table S1. This map has been provided by the authors.

Figure caption: Relationship between maximum genetic distance within species and nearest neighbor genetic distance among species.