Riverine barrier effects on population genetic structure of the Hanuman langur (Semnopithecus entellus) in the Nepal Himalaya

Background Past climatological events and contemporary geophysical barriers shape the distribution, population genetic structure, and evolutionary history of many organisms. The Himalayan region, frequently referred to as the third pole of the Earth, has experienced large-scale climatic oscillations in the past and bears unique geographic, topographic, and climatic areas. The influences of the Pleistocene climatic fluctuations and present-day geographical barriers such as rivers in shaping the demographic history and population genetic structure of organisms in the Nepal Himalaya have not yet been documented. Hence, we examined the effects of late-Quaternary glacial-interglacial cycles and riverine barriers on the genetic composition of Hanuman langurs (Semnopithecus entellus), a colobine primate with a wide range of altitudinal distribution across the Nepalese Himalaya, using the mitochondrial DNA control region (CR, 1090 bp) and cytochrome B (CYTB, 1140 bp) sequences combined with paleodistribution modeling. Results DNA sequences were successfully retrieved from 67 non-invasively collected fecal samples belonging to 18 wild Hanuman langur troops covering the entire distribution range of the species in Nepal. We identified 37 haplotypes from the concatenated CR + CYTB (2230 bp) sequences, with haplotype and nucleotide diversities of 0.958 ± 0.015 and 0.0237 ± 0.0008, respectively. The troops were clustered into six major clades corresponding to their river-isolated spatial distribution, with the significantly high genetic variation among these clades confirming the barrier effects of the snow-fed Himalayan rivers on genetic structuring. Analysis of demographic history projected a decrease in population size with the onset of the last glacial maximum (LGM); and, in accordance with the molecular analyses, paleodistribution modeling revealed a range shift in its suitable habitat downward/southward during the LGM. The complex genetic structure among the populations of central Nepal, and the stable optimal habitat through the last interglacial period to the present suggest that the central mid-hills of Nepal served as glacial refugia for the Hanuman langur. Conclusions Hanuman langurs of the Nepal Himalaya region exhibit high genetic diversity, with their population genetic structure is strongly shaped by riverine barrier effects beyond isolation by distance; hence, this species demands detailed future phylogenetic study. Electronic supplementary material The online version of this article (10.1186/s12862-018-1280-4) contains supplementary material, which is available to authorized users.


Background
The geographical separation of populations by barriers can alter ranging behavior, prevent dispersal, and restrict gene flow [1]. Barriers influencing population connectivity can be historical or contemporary, physical or nonphysical, and natural or manmade [2]. Animal behavior, including philopatry, foraging preferences, dispersal patterns, and mate selection, etc. can also lead to reproductive isolation and disruption of gene flow, leading to changes in population genetic structure within a species [3].
Rivers are major physical barriers, with Wallace [4] initially noting their effects on the distributions of Amazonian monkeys. Wallace's riverine hypothesis implies that major rivers significantly reduce or prevent gene flow between populations inhabiting opposing river banks, hence promoting genetic differentiation and speciation [5]. The extent of the riverine barrier effect depends on the physical characteristics of the river as well as the ecology and dispersal ability of the taxa [6,7]. History, width, discharge, flow, and chemistry among other factors can significantly influence the way in which rivers separate species distributions and constitute physical barriers to species dispersal [5]. Large or faster rivers have been described as insurmountable barriers for terrestrial mammals, including primates, although assessment of the riverine barrier effects in primates has been mainly confined to the Amazon [5,8,9], Congo River basin [3,[10][11][12], and Madagascar [13][14][15][16]. Among larger rivers, their lower sections are considered more effective barriers than the headwaters [17].
Rivers of the Nepal Himalaya originate at an elevation of 6000-8000 m and carry snow melt southwards to the plains at elevations lower than 100 m within a north-south distance of less than 200 km [18]. The widths of the Himalayan rivers in Nepal are less than most of those viewed as impassable to primates in the Amazon and Congo basins, but are characterized by very low temperatures of snow-fed water, elevated headwaters, and strong currents due to their extreme elevational gradients. The headwaters are narrower and their currents stronger than the lower reaches, where they become broader after uniting with many tributaries in lowland Nepal. Running along the north-south axis, these rivers divide the landmass into fragments, which impede the movement of terrestrial animals. However, the genetic influence of such isolation on the fauna within this range has not yet been documented.
Primate species have been used to assess the riverine barrier hypothesis as the taxonomic boundaries of many primates coincide with major river courses [6,8,19], although parapatric models of speciation have been observed for some [20,21]. The body mass and foraging behavior of primates are considered important factors influencing their ability to cross rivers [22]. Hanuman langurs (Semnopithecus entellus), the only colobine primate from the Nepal Himalaya (central third of the Himalayan range), occupy a wide elevational and latitudinal distribution and are likely a good model species to explore riverine barrier effects. Hanuman langurs are restricted to the Indian subcontinent and are distributed throughout most parts of India, Nepal and Sri Lanka, as well as parts of Pakistan and Bangladesh [23]. Taxonomic ambiguity at the species and subspecies level still exists due to the occurrence of many morphotypes and incongruence among various classification schemes [24][25][26][27]. Hanuman langurs occur in diverse habitats, ranging from lowland tropical forest to subalpine forests covering multiple vegetative and climatic zones in Nepal [28]. Studies on this species in Nepal are limited to feeding ecology [29,30] and reproductive behaviors of subtropical [31,32] and temperate troops [33,34]. Recently, Chalise [28] described the distribution of the species and basic morphological variations among populations from various forest fragments across Nepal.
Due to the morphological variations among the isolated populations of Hanuman langurs living in complex and fragmented forest environments as well as the effects of riverine barriers, population genetic analysis was necessary. Thus, we explored i) the level of genetic diversity among Hanuman langur populations; ii) the strength and influence of riverine barriers on the population genetic structure of Hanuman langur; and, iii) the effect of Pleistocene climatic fluctuations on the demographic history of Hanuman langurs in Nepal. To investigate these issues, we collected fecal samples from wild Hanuman langur troops covering almost the entire range of the species in Nepal from < 200 m to 3800 m asl and analyzed genetic variation within the control region (CR, 1090 bp) and cytochrome B (CYTB, 1140 bp) fragments of the mitochondrial DNA. Coalescent-based molecular analyses and paleodistribution modeling can provide a robust picture of the distributional history of a species [35][36][37]. Therefore, the evolutionary history of the Hanuman langur revealed from molecular analyses was further established by ecological niche modeling using a maximum entropy algorithm and paleodistribution reconstruction based on the principle of niche conservatism.

Study area and research species
Nepal lies on the southern flank of the central Himalaya between China and India, ranging at latitudes of 26°22′ and 30°27′ N and longitudes of 80°40′ and 88°12′ E (Fig. 1). Of the 2400 km long Himalayan range, the central one-third (~800 km) forms the Nepalese Himalaya (NH) [38]. Within the 200 km north-south width of Nepal, the elevation ranges from 60 m in lowland Terai to 8848 m at Mount Everest. This extreme altitudinal gradient has resulted in diverse bioclimatic zones from tropical to nival [37]. It includes the Palearctic and the Indo-Malayan biogeographical regions and hence the major floristic provinces of Asia (Sino-Japanese, Indian, western and central Asiatic, Southeast Asiatic, and African Indian desert), creating a unique and rich terrestrial biodiversity [28]. The Nepalese territory can be divided into the eastern, central, and western regions, which comprise the Koshi, Gandaki, and Karnali drainage basins, respectively [39]. Each drainage/river system has major tributaries and many minor streams that originate from the Himalayas and flow southwards.
Among the three species of nonhuman primates found in Nepal, the Hanuman langur is the only colobine, with the other two being the cercopithecine macaques (rhesus and Assam macaques) [28]. Hanuman langurs mainly distributed in the foot hills of the Nepal Himalaya and adjacent states of northern India and Pakistan [40,41]. The Nepalese Hanuman langur populations have been classified previously as the Tarai gray langur (at lower elevations) and the Nepal gray langurs (at higher elevations) based on morphological variation. They have black faces and their pelage coloration is silver gray dorsally and white on the ventral side of the body, and highland populations bear a whorl of bright white hairs on head [28]. Their head and body length can reach up to 1 m, with a longer tail, and their body weight ranges from 7 to 20 kg, though the highland individuals are heavier than the lowland ones [42].
Hanuman langurs are diurnal, arboreal, and primarily folivorous, relying mostly on young tender leaves, buds, fruits, coniferous needles, and cones, though are known to occasionally consume termites, insect larvae [43], and even eggs of nesting birds [44]. The troops are mostly multi-male, multi-female in which adult males and females live together with young adults, juveniles, and infants. Females are philopatric. In Nepal, Hanuman langurs are found in the dipterocarp forests of outer and inner Tarai, mixed deciduous and evergreen forest of Schima-Castanopsis, Elaeocarpus-Macaranga forests in the mid-hills and mountains, and Quercus-Pinus-Rhododendron forests in the high mountains [42].

Sample collection
From August 2016 to May 2017, surveys were conducted along the tributaries of the Koshi, Gandaki, and Karnali-Mahakali river systems (KRS, GRS and KMRS, respectively). In eastern Nepal, the Tamor and  Table 1 Arun tributaries of KRS; in central Nepal, the Trishuli, Marshyangdi and Kaligandaki tributaries of GRS; and, in far-western Nepal, the Karnali and Mahakali rivers of KMRS were sampled (Fig. 1). The surveys were carried out along the eastern and western sides of the north-south river axis starting from < 100 m asl and continuing up to 4000 m asl. Whenever a Hanuman langur troop was encountered, the occurrence point was noted using GPS, population size was estimated, and the major habitat characteristics were surveyed (Table 1).
Fecal samples of wild Hanuman langurs were collected from troops that were encountered during the field surveys. A total of 87 fecal samples were collected noninvasively using sterilized cotton swabs and a plastic vial with 2 ml of lysis buffer prepared according to the protocols in White and Densmore III [45]. The dry cotton swab was rolled on the surface of the fecal sample extensively and soaked in lysis buffer to remove fecal matter and recover the mucus cells from the sample. This process was repeated three times, with similar swabbing conducted after turning the fecal sample over. To avoid re-sampling of fecal material from the same individual of a troop, the following precautions were taken: i) fecal sampling was preceded by clear identification and detailed census of the troop; ii) only fresh, moist, and intact fresh fecal material was sampled; and, iii) a troop was sampled only once from the fresh defecates after their mid-day rest. The collected samples were stored at ambient temperature and transferred to the lab for DNA extraction.

DNA extraction
Total genomic DNA was extracted from fecal samples using the QIAamp DNA Stool Mini Kit (QIAGEN, Germany) following the manufacturer's protocols, with some modifications: the 2 ml sample tube with lysis buffer was centrifuged for 2 min at 14000 rpm and 600 μl of the supernatant was taken for further processing; final elution was performed using 75 μl of elution buffer considering the low concentration of DNA in fecal samples.

PCR amplification and sequencing
PCR amplification of mtDNA fragments encompassing the entire control region (CR) was performed using the primer pairs LCRF (5'-AATTGACGTTCTATCTAAA CTAC-3′) and LCRR (5′-GGGGATGCTTGCAT GTGTAA-3′); furthermore, the CYTB was amplified using the primer pairs LCYF (5'-CGAGATCTGAAAAA CCATCGTTG-3′) and LCYR (5'-AACTGCAGTCATCT CCGGTTTACAAGA-3′). The 25 μl reaction solution contained 12.5 μl of 2× power Taq PCR Master Mix (BioTeke, Beijing), 10.5 μl of ddH 2 O, 0.5 μl of each forward and reverse primer, and 1.0 μl of template DNA. The PCR assays for loci were carried out with an initial denaturation at 94°C for 5 min, followed by 35 cycles with denaturation at 94°C for 30 s, annealing at 55°C for 30 s, and extension at 72°C for 60 s, and final extension at 72°C for 10 min. Precautions were taken to avoid any contamination between the samples and from external sources. Only six samples were processed in one batch. Every PCR amplification was carried out with a negative control and the pre-and post-PCR work were performed independently using separate equipment. The amplicons obtained were the intended loci and were confirmed not to be exogenous contaminants or nuclear insertions of mitochondrial DNA (numts) because: (i) colobine-specific primer pairs that successively failed to amplify high-quality vertebrate DNA were used for each locus; (ii) single PCR amplicon of the expected size was detected in all individual samples; (iii) no extreme variant sequences were detected; and iv) the coding region of CYTB had no abnormal stop codons. Moreover, as fecal samples are richer in mtDNA molecules due to their large copy numbers, small size, and lower vulnerability to degradation than the nuclear DNA, the natural degradation of DNA from fecal samples made nuclear copies even more scarce, reducing the likelihood of amplifying numts [46].
After separation in 1.2% agarose gel, the amplicons were purified and sequenced using the BigDye Terminator Cycle Kit v.3.1 (Invitrogen, USA) in both directions on an ABI 3730XL sequencer (Applied Biosystems, USA).

Genetic data analysis Genetic diversity
The CR (1090 bp) and CYTB (1140 bp) sequences were assembled and edited using Seqman (DNASTAR Lasergene v.7.1) and aligned using the Clustal W [47] algorithm in MEGA7 [48]. The sequences from respective samples were concatenated (CR + CYTB, 2230 bp) using SequenceMatrix v.1.8 [49]. Genetic diversity including number of polymorphic sites (s), haplotype number (H), haplotype diversity (Hd), and nucleotide diversity (π) were assessed separately for CR, CYTB, and CR + CYTB sequences using DnaSP v.5.10 [50]. To ensure that our analyses were comparable to other colobine research, the genetic analyses were also performed by trimming the CR sequences to leave a 489 bp long first hypervariable region (HVR I).

Population genetic structure
The concatenated CR and CYTB sequences were used to assess the population genetic structure. A phylogenetic tree depicting the relationship among the haplotypes was constructed using the maximum likelihood (ML) algorithm in RaxML v.8.2.10 [51], and the neighbor joining (NJ) and maximum parsimony (MP) algorithms in MEGA7 [48] with 1000 bootstrap replications. For ML analysis in RaxML, the best partitioning scheme (P1 = CYTB_1 st , CYTB_2 nd ; P2 = CYTB_3 rd , CR) with the GTR + G evolutionary model and Bayesian Information Criterion (BIC) was determined using Partition Finder v.2.1.1 [52]. The geographical distributions and mutational relationships of the haplotypes among the isolated populations were described by the median-joining (MJ) haplotype network constructed using PopART v.1.7 [53].
The six major clades defined by the phylogenetic tree and MJ haplotype network were considered as populations for downstream analyses. Population pairwise F ST was calculated from pairwise nucleotide differences between populations and their statistical significance was checked using 10,000 permutations as implemented in Arlequin v.3.5.1.2 [54]. The genetic differentiation due to isolation by distance (IBD) was assessed by the Mantel test, which analyzed the correlation between geographic and interindividual genetic distances with 10,000 permutations executed in IBDWS v.3.23 [55]. The regression of all pairwise genetic differentiation values, F ST , against the corresponding log10 transformed geographical distance (in km, Additional file 1: Table S1) was used to determine the strength of the correlation. There was a high correlation (r = 0.876, P < 0.05) between the Euclidian distance and number of river tributaries separating the population pairs; therefore, the correlation results between genetic distance and number of major rivers isolating the population pairs were omitted. Further, the effect of altitudinal gradient on genetic variations was assessed by correlation between genetic distance and altitudinal gradient between the population pairs using the Mantel test in Arlequin v.3.5.1.2 [54]. Assessment of the genetic structure among the populations was performed using analysis of molecular variance (AMOVA) in Arlequin v.3.5.1.2 [54] by grouping the six populations into three geographical groups (east, central, and west) ( Table 2).

Demographic history
Multiple complementary methods were employed to assess the demographic history of the Hanuman langurs in Nepal. Neutrality tests, Tajima's D [56] and Fu's Fs tests [57] in Arlequin v.3.5.1.2 [54], and the Ramos-Onsins and Rozas test (R 2 -test) [58] in DnaSP v. 5.10 [50] were performed to assess the statistical significance with 1000 permutations. Mismatch distribution analyses were performed to infer the population expansion patterns for the set of all sequences and population groups separately with 10,000 bootstrap replicates in Arlequin v.3.5.1.2 [54].
Demographic changes of Hanuman langurs through time were estimated on HVR I sequences by Bayesian skyline plot (BSP) [59] analysis in BEAST v.1.8.4 [60]. The jMODELTEST v.2.1.7 [61] was used with the Akaike information criterion (AIC) to determine the best model of nucleotide substitution (HKY + I + G). Further analyses of BSP were performed following the method in Khanal et al. [37].

Ecological niche modeling and paleodistribution reconstruction
Ecological niche modeling (ENM) was performed with the MaxEnt algorithm using the geographic coordinates of 29 wild Hanuman langur troops (18 troops used in molecular analyses and 11 from Chalise [28]) and bioclimatic variables. The 19 bioclimatic variables version 1.4 (Additional file 1: Table S3) for the current (~1950-2000) and Last Interglacial (LIG,~120,000-140,000 years before present, YBP) in 30 arcsec [62] and LGM (~22,000 YBP) in 2.5 arcmin [63] were downloaded from the Worldclim website (http://worldclim.org/). The spatial resolutions of LGM variables were harmonized with those of current and LIG periods by resampling in 30 arcsec. All bioclimatic variables were clipped to a region from 78.5°E to 92.5°E and from 24°N to 31°N using ArcGIS 10.3.1 and exported in ASCII format. Bioclimatic variables often show high collinearity, resulting in poor model performance and misleading interpretation [64]. Therefore, correlation among the bioclimatic variables were tested by Pearson correlation test (P < 0.05) with the threshold of r ≥ |0.8|, and from all pairs of correlated variables beyond the threshold, only ecologically meaningful variables for the species (Bio:1, 3, 5, 11, 12, 15, 18) were selected for further analyses (Additional file 1: Table S4). MaxEnt v.3.4.1 [65] was used to model and map the current potential distribution of the Hanuman langur. The ecological niche defined by MaxEnt for Hanuman langurs was used to reconstruct their potential distributions during the period of the LGM and LIG. To facilitate the model evaluation, presence data were randomly divided into a 75% training data set and 25% validation data set, and the uncertainty introduced by such split was accounted by generating 25 replication models on a cross-validation method [62]. Area under the curve (AUC) of the receiving operating curve (ROC) was used to evaluate the accuracy of the model (Additional file 1: Figure S5). The relative importance of each environmental variable in the model was evaluated by a Jackknife test and variable contribution table.
The logistic outputs of habitat suitability for the present, LGM, and LIG were converted to binary outputs of unsuitable (0-0.377) and suitable (0.377-1) habitats using the threshold of maximum training specificity and sensitivity (maxTSS = 0.377), as explained for the model generated by presence-only data by Liu et al. [66]. The potential altitudinal shift of suitable habitats was then evaluated by overlaying binary outputs for the present, LGM, and LIG separately to the SRTM DEM (http://www.cgiar-csi.org/data/srtm-90m-digital-elevation-database-v4-1). The elevation values of suitable habitat pixels were extracted, and their mean, maximum, and minimum were computed.

Population genetic structure
The gene genealogy inferred by the ML, MP, and NJ analyses consistently deduced a six-clade structure (Fig. 2). Clade I (EA) clustered all haplotypes from eastern Nepal. The sequences from central Nepal formed three major clades (CA, CB, and CC) corresponding to isolation from the Budhigandaki and Marshyangdi rivers, whereas the populations from western Nepal formed two clades (WA and WB) delineated spatially by the Karnali River. The median-joining haplotype network was consistent with the phylogenetic tree in defining six distinct clades (Fig. 3). This congruence signifies the deep divergence between the eastern, central, and western Hanuman langur populations, as well as high-level of sub-structuring in the central populations (CA, CB, and CC). Among the six clusters (Hap_17 was included in clade CB due to its geographic proximity and closeness in the MJ network with Hap_18 and Hap_19) defined by the phylogenetic tree and MJ network, haplotype diversity (for the 2230 bp fragment) was highest in CA, whereas the nucleotide diversity was highest in CB and lowest in EA (Table 2). There were differences in sample size among the population groups but mtDNA diversity had no significant correlation with sample size (r = − 0.061, P = 0.909 for number of polymorphic sites; r = 0.615, P = 0.193 for number of haplotypes; r = 0.200, P = 0.703 for haplotype diversity; and r = − 0.382, P = 0.454 for nucleotide diversity; no r-values were statistically significant at P < 0.05).
Population pairwise F ST values among the six populations were high and statistically significant. The highest pairwise F ST was between the EA and WA populations (0.9408) and the lowest was between the CA and CB populations (0.7246) ( Table 3). The Mantel test assessing isolation by distance (IBD) revealed a statistically significant positive correlation (r = 0.343, P < 0.001) between the genetic and geographical distances among populations (Fig. 4). A similar test for weighing the effect of altitudinal gradients on genetic variation among population pairs resulted in a weaker positive correlation (r = 0.237, P < 0.05) between genetic distance and altitudinal gradient.
Assessment of population genetic structure by AMOVA partitioned 48.00% of genetic variation among the populations within groups and 41.44% of variation among groups (Table 4). These results revealed high genetic variations among the river-isolated populations within groups and also yielded statistically significant high fixation indices suggesting a strong pattern of genetic differentiation.

Demographic history
Neutrality tests resulted in statistically nonsignificant positive values (Tajima's D = 1.209, P = 0.918; Fu's Fs = 3.493, P = 0.868). The Ramos-Onsin and Rozas test also yielded a high R 2 value (R 2 = 0.140, P = 0.954), indicating a relatively stable population. The neutrality test results for the individual clades were also not sufficiently statistically informative to analyze their demographic patterns (Additional file 1: Table S2).
The mismatch distribution analysis of the entire set of sequences projected a multimodal curve (Fig. 5) with a raggedness index (rg) of 0.0065 (P < 0.05). Among the population groups, the Eastern group demonstrated a  unimodal curve, which peaked close to the y-axis, whereas the Central and Western groups displayed multimodal curves. The Bayesian skyline plot revealed a relatively stable effective population size (N e ) over the last 250,000 years, though the population decreased with the onset of the LGM in the Himalayan region (Fig. 6), after which it remained stable at a lower level until the mid-Holocene, before increasing to the current population size.

Ecological niche model and paleodistribution
Both the sampling strategies of ecological niche modeling produced similar results. In the single training/test split run, the AUC values based on training and test data were 0.934 and 0.998, respectively (Additional file 1: Figure S5). The 25 cross-validation multiplication runs resulted mean AUC of 0.892 (SD = 0.088) (Additional file 1: Figure S6).
The AUC values signified that the potential distribution of Hanuman langurs fits well with our data.
According to the paleodistribution model, suitable Hanuman langur habitat at present (Max_prob = 0.965) is more extensive compared to that during the LGM (Max_prob = 0.940) and LIG (Max_prob = 0.895) (Fig. 7). Suitable habitat during the LGM almost remained the same in terms of area but shifted downward/southward from the present habitat. The current suitable habitat elevation ranges between 49 m asl to 4190 m asl, with a mean value of 1823 m asl; whereas, the historical elevational distribution of suitable habitat during the LGM and LIG ranged from 16 to 2927 m (mean = 1375 m) and 16-    Figure S8). The lowland Terai and central mid-hills of Nepal were the most stable habitats from the LIG to the present day.

Discussion
Hanuman langurs are one of the most endangered and protected non-human primate species in Nepal. The species has a wide latitudinal and elevational distribution in forest patches that are fragmented by natural barriers such as rivers and human settlements. Hanuman langurs are vulnerable to alteration and loss of their natural habitat due to anthropogenic activities, as well as conflicts with local people due to their raiding of crops [28]. One of the key elements to the management of Hanuman langurs should be a clear understanding of their population history, dynamics, and structure. Here, we used mtDNA CR (1090 bp) and CYTB (1040 bp) sequences from Hanuman langurs to explore their genetic diversity, population genetic structure, and demographic history.

High genetic diversity
The Hanuman langur populations of Nepal demonstrated higher haplotype and nucleotide diversity (Hd = 0.955 ± 0.015 and π = 0.0528 ± 0.0015 for HVR I fragment) compared to other colobines of limited distribution range, including Trachypithecus geei (Hd = 0.934, π = 0.0244) from India [67], T. leucocephalus (Hd = 0.570 ± 0.056; π = 0.00323 ± 0.00044) [68], Rhinopithecus roxellana (Hd = 0.845 and π = 0.0331) [69], and R. brelichi (Hd = 0.457 ± 0.048, π = 0.014 ± 0.007) from China [70], but comparable to that of R. bieti (Hd = 0.945 ± 0.006, π = 0.036 ± 0.018) [70]. The higher genetic diversity in Hanuman langurs correlates with their morphological variations at different latitudes and elevations within Nepal [28]. According to the neutral theory of evolution, genetic diversity is proportional to effective population size and genetic variation takes a long time to accumulate in vertebrates with long generation times [71]. Thus, the observed higher genetic diversity signifies a large historical effective population size and long evolutionary history of Hanuman langurs in Nepal. Among the populations we examined, the highest haplotype diversity was found in clade CA, which might relate to the strong elevational gradient associated with the occurrence of troops in these populations [72], even in the absence of river isolation. However, this does not explain the observation that the WB population was also sampled from a wide range of elevations but showed low haplotype diversity, even though the nucleotide diversity was higher. Intertroop haplotype sharing was very low (4/37; 10.81%) and was limited to the troops which were not isolated by rivers (H14 was shared between K-LNP and S-LNP; H28 was shared between DP and BP; H32 was shared between BNP and CBNP troops; and H37 was shared among three troops of ANCA). Such low haplotype sharing may be due to the prevention of dispersal by rivers, and the effect of female philopatry [7] as we used matrilineally inherited mtDNA for this study. Overall, the populations from central Nepal had the highest genetic diversity, and thus we hypothesized that the central populations had a comparatively longer evolutionary history and may therefore be the center of distribution [73] from which the eastern and western populations later radiated.
The overall pairwise F ST between the populations was high. Population pairs in close geographical proximity had lower F ST values, whereas the populations separated by distance and/or rivers had higher F ST values. This F ST distribution demonstrates the effectiveness of rivers as barriers as well as isolation by distance phenomenon. The genetic diversity pattern in Hanuman langurs fits the assumption of the central-abundance model that explains the reduced genetic diversity and higher genetic differentiation in peripheral populations than in central populations [74]. The average number of pairwise nucleotide differences was high between almost all population pairs but the genetic variation within populations was low, which may be due to the combined effects of smaller dispersal distance of females coupled with female philopatry and long-term isolation of populations [75].

Riverine barrier effect and isolation by the distance
The phylogenetic inference using ML, MP, and NJ algorithms consistently defined six major clades for the Hanuman langurs in Nepal; namely EA, CA, CB, CC, WA, and WB. Clade EA from eastern Nepal and clade CA from central Nepal are isolated by the Arun, Tamakoshi, and Sunkoshi tributaries of the Koshi River system, and the Trishuli River of the Gandaki River system (Fig. 1). Clades CA and CB are isolated by the Budhigandaki and Daraundi rivers, whereas CB and CC are separated by the Marshyangdi River of the Gandaki River system. Central clades (CA, CB, and CC) are isolated from the western clades (WA and WB) by the Kaligandaki River, with its strong current, and the deep and wide Kaligandaki Gorge. The WA and WB clades from western Nepal are delineated by the Karnali and Seti rivers together with other tributaries such as the Chamelia. mtDNA phylogenies can provide unique insights into population history [76] and can indicate the boundaries of genetically divergent groups [77]. Here, the MJ haplotype network depicted the same haplotype grouping as determined by the phylogenetic analyses, and haplogroup boundaries were delineated by rivers. The network described deep divergences between eastern, central, and western populations; further, it showed sub-structuring within the central and western populations which formed three and two haplogroups, respectively. To further validate the riverine barrier effects, we grouped all sampled troops into three groups and six populations based on isolation by rivers and used AMOVA to assess the population genetic structures. Our results partitioned 48% of the genetic variations among populations within groups revealing a strong riverine barrier effect in the population genetic structure of Hanuman langur in Nepal. In addition, the genetic variation among groups (41.44%) was high, signifying isolation by distance. Similar roles of rivers in shaping the boundaries of regional haplogroups/taxa have been described for other primates such as gorillas [6], bonobos [11], lemurs [16], and tamarins [9]. The shy behavior and strong female philopatry of Hanuman langurs [42], and fast currents and cold snow-fed waters of the Himalayan rivers, with headwaters much above the elevational range of the species, have likely prevented the Hanuman langurs from crossing the rivers; however, their climatic tolerance did not prevent their distribution along extensive altitudinal gradients.
In addition to the riverine barrier effect on the population genetic structure of Hanuman langurs, isolation by distance also showed a profound effect. The nonparametric Mantel test demonstrated a statistically significant positive correlation (r = 0.343, P < 0.001) between the inter-individual genetic distance F ST and the respective geographic distance, but a lower correlation (r = 0.237, P < 0.05) between the altitudinal gradients and genetic distance. Isolation by distance can explain the increase in genetic differentiation at neutral loci with geographic distance [78], with considerable time is required for this pattern to be established and stabilized [79]; the moderate correlation coefficient of Hanuman langurs may be due to its long generation time and the socio-ecological phenomenon of population fragmentation [80].

Effects of Pleistocene climatic fluctuation on demographic history
Multiple complementary methods were used to infer the demographic history of Hanuman langurs in Nepal, which indicated long-term population stability, with some oscillations under the influence of Pleistocene climatic fluctuations. The neutrality test results did not reject the population stability but lacked statistical significance. The MDA curve was multimodal, which also signified a relatively stable population with periodic fluctuations. Both the neutrality test and MDA suggest that the population did not fluctuate to an extent sufficient to imply bottleneck or expansion [81]. The Bayesian skyline plot, which is used to estimate population dynamics of a species through time [59], indicated that the Hanuman langur population remained stable for a long period, but decreased with the onset of the LGM about 22,000 YBP. During the early-Holocene, the local LGM (LLGM) continued in the Himalayan region [82], during which the smaller Hanuman langur population remained stable. After the LLGM, with the onset of the mid-Holocene climatic optimum around 8000 YBP, the population started increasing and reached the present-day effective population size. A similar pattern was supported by the MDA curve, which revealed a peak close to the y-axis, signifying population expansion in the recent past. However, our Bayesian skyline plot analysis has some limitations that should be taken into consideration while interpreting the results. A considerable improvement on the demographic history estimates of the Hanuman langur would have been gained by using multiple non-recombining and neutrally evolving loci [83]. In addition, in highly structured populations, separate analyses at the subpopulation level would meet the assumption of panmixia [84]; however, because of the small sample size of individual clades, we performed the Bayesian skyline plot analysis on the entire set of sequences. Bayesian skyline plots together with ecological niche modeling provide more refined insights into the evolutionary history of populations [85]; hence, we further validated our molecular analysis results with paleodistribution modeling.
Pleistocene climate change and tectonic instability formed a unique scenario of disturbance ecology for the Nepal Himalaya [86]. Palynological and geomorphological data indicate the presence of dry and cold climates in higher altitudes of Nepal during the glacial periods [39]. The paleodistribution modeling during this study revealed an altitudinal/latitudinal range shift of suitable Hanuman langur habitat downward/southward during the dry and cold LGM period. Annual precipitation, precipitation seasonality, and mean temperature of the coldest quarter had higher contributions in defining the suitable habitat of the Hanuman langur. Although the Hanuman langur is a habitat generalist species with wide range of distribution [28,87], precipitation and temperature were the powerful limiting factors for their high-altitude populations during the cold and dry glacial period. The lower elevations of central Nepal were comparatively warmer and may have received higher precipitation as they remained under the influence of both the South Asian summer monsoon and mid-latitude winter westerlies [18,88], and, as Pleistocene refugia, should have provided amenable habitat for the Hanuman langur. The higher genetic diversity and complex population genetic structure of the Hanuman langur in central Nepal also suggest the area as a possible Pleistocene refugia for their historic populations [89]. The widespread Hanuman langur population in most of the physiographic zones of Nepal today may have been constrained during glaciation in the Himalayan region, causing them to retreat to lower elevations, especially in central Nepal. This shrinkage in habitat and food availability might have increased competition and survival threats, resulting in population size reduction. With the onset of the mid-Holocene climatic optimum, the potential habitat for the species likely increased, allowing for a concomitant spatial and demographic expansion of Hanuman langurs in the Nepal Himalaya. At present, suitable habitat is almost uniformly distributed along the entire length of the lower Nepalese Himalaya; hence, habitat interruption does not seem to constrain the distribution of the Hanuman langur. Rather, the major factor behind the genetic structuring of the Hanuman langur is the river barriers.

Conclusions
The primary aims of this study were to explore the genetic diversity, population genetic structure and demographic history of the Hanuman langur in Nepal by assessing the riverine barrier effects and influence of the Pleistocene climatic fluctuations. The Hanuman langur populations in Nepal exhibited high genetic and nucleotide diversity and the population genetic structure was clearly demarcated by the rivers, thus supporting the riverine barrier effects. Himalayan rivers of narrow width but high flow rate with cold snow-fed water have remained effective barriers for the movement of the Hanuman langur for a long period, which has caused the accumulation of genetic variations into distinct clusters. The genetic data of the Hanuman langur and paleodistribution modeling demonstrate a robust signature of late-Pleistocene and Holocene climatic fluctuations, especially the LGM, in shaping their distribution, population genetic structure and demographic history. Being the first study of its kind in the Nepal Himalaya, our work provides baseline information on the highly diverse population genetic composition of Hanuman langur populations; further, our work justifies further fine-scale sampling and multi-locus population genetic and phylogenetic analyses to clarify the species' ambiguous taxonomy.

Additional file
Additional file 1: Table S1. Geographic distance matrix among the sampled Hanuman langur troops. Table S2. Neutrality tests and demographic history parameters of population groups of Hanuman langur in Nepal based on mtDNA HVR I (489 bp) sequences. Table S3. Predictor variables used in the construction of the niche models. Table S4. Correlation matrix among the 19 bioclimatic variables retrieved from the Worldclim website (http://worldclim.org/) after clipping to a region from 78.5°E to 92.5°E and from 24°N to 31°N. Figure S5. Area under curve (AUC) of the receiving operating curve (ROC) for the single training/test split run. Figure S6. Average area under the curve (AUC) for 25 replicates of MaxEnt runs. The red line is average value and blue bars represent ±1 standard deviation. Figure S7. The individual response curve of major three predictive variables to the Maxent model prediction. Figure S8. Distribution of suitable habitat pixels along the elevational gradient. X-axis represents the elevation gradients and y-axis represents the pixel frequency.