Lipidome analysis of breast milk composition in humans, monkeys, bovids, and pigs.

Background Lipids contained in breast milk are an essential source of energy and structural materials for a growing infant. Furthermore, lipids’ long-chain unsaturated fatty acid residues can directly participate in infant tissue formation. Here, we used untargeted mass spectrometric measurements to assess breast milk lipid composition in seven mammalian species: humans, two macaque species, cows, goats, yaks, and pigs. Results Analysis of the main milk lipid class, triacylglycerides, revealed species-specific differences in the composition of fatty acid residues for each of seven species. While human milk showed more medium and long-chain unsaturated fatty acids, pig milk composition was the most distinct, featuring the highest proportion of long-chain polyunsaturated fatty acids. Conclusions We show that breast milk lipidome composition is dynamic across mammalian species, changed extensively in pigs, and contains features particular to humans.

directly participating in the infant tissue development and growth [7,8].
The most variable element of breast milk composition is total fat content ranging from 60% in the hooded seals [9] to less than 1% in white rhinoceroses and ringtailed lemur [10] -the phenomenon mainly linked to the duration of breastfeeding and ecological conditions. In most species, however, a single lipid class, triacylglycerides (TAGs), composes on average, 98% of the milk fat [11]. TAGs have simple chemical composition, consisting of glycerol and three fatty acid residues.
Several studies analyzed breast milk TAG composition in mammalian species, including humans [12][13][14][15][16], demonstrating differences in the distribution of DHAcontaining TAGs in comparison with the baby formula [17]. However, the systematic characterization of the breast milk lipidome composition across mammalian species remained incomplete.
While most lipids contained in breast milk get broken down to energy sources and simple building blocks, there is evidence that some of TAG fatty acid residues could directly participate in infant tissue development and growth. For example, docosahexaenoic acid (DHA) derived from milk TAGs is transported across the blood-brain barrier in the form of LPC with the help of a specific protein transporter [18]. The direct use of milk fatty acid residues in infant tissue development indicates that breast milk lipidome composition of a species might reflect specific requirements of its infant tissues. To approach this question, we first assessed the extent of breast milk TAG composition differences among seven mammalian species, including humans represented by two distinct populations.
Randomized milk samples were extracted, separated using liquid chromatography, and measured using untargeted mass spectrometry in positive ionization mode. The measurements yielded a total of 472 mass spectrometry features representing distinct hydrophobic compounds (lipids) with molecular weights below 1,200 Daltons (Da).
Visualization of the relationship among samples based on the abundance levels of these 472 detected lipids using multidimensional scaling (MDS) revealed good separation of species and phylogenetic groups (Fig. 1B). Furthermore, distances between species calculated using the normalized intensities of mass spectrometry signals generally agreed with the phylogenetic distances (Fig. 1C). Pig milk was the only obvious exception from this linear relationship, showing a greater difference to the bovidae species than expected from the phylogeny.
Of the 472 detected lipids 403 (85%) showed significant intensity differences among species (ANOVA, BH-corrected P < 0.05). Unsupervised clustering of these 403 species-dependent lipids based on their intensity profiles across samples yielded 5 four clusters ( Fig. 2A). Notably, lipid intensities within these clusters differed within the mammalian orders as much as between them. Particularly, the lipid composition of the pig milk stood out in three of the four clusters, while cluster 4 contained milk lipidome composition features shared between monkeys and bovids (Fig. 2B).
In agreement with previous knowledge, of all lipids detected in milk, the most of lipid was covered by a specific lipid class -triacylglycerides (TAGs). Interestingly, 76 lipid features computationally annotated as TAGs were present in all four clusters, covering the entire spectrum of lipid variation patterns (Additional file 6: Table S2). Furthermore, the average length and unsaturation extent of the fatty acid chains of these TAGs deferred among the clusters (Fig. 2C). For instance, cluster 2 TAGs contained long-chain polyunsaturated fatty acid residues, while cluster 4 TAGs preferentially contained medium-chain fatty acid residues ( Fig. 2C; Additional file 1: Figure S1).
Notably, relative abundance analysis of detected TAGs revealed apparent intensity differences characteristic of each species. Specifically, cow milk contained more TAGs composed of long monounsaturated fatty acids. By contrast, goat milk contained more TAGs composed of medium-chain saturated and monounsaturated fatty acids. Pigs milk stood out from the rest of the species by having TAGs composed of long-and very long-chain polyunsaturated fatty acids. Monkey milk had more TAGs composed of medium-chain monounsaturated and polyunsaturated fatty acids. Finally, human milk tended to contain more TAGs with long-chain polyunsaturated fatty acids ( Fig. 3A; Additional file 1: Figure S1). We next specifically searched for lipids showing statistically significant intensity differences between humans and other three species' groups represented in our study. Of the 472 detected lipids, 94 differed in intensity between humans and 6 macaques, 23 of them annotated as TAGs (ANOVA, BH-corrected P < 0.05).
Clustering of these lipids revealed a notable intensity increase in the human milk for a group of lipids, including nine TAGs with long-and very long-chain fatty acids ( Fig. 4; Additional file 2: Figure S2).
Comparison between humans and bovids yielded significant intensity differences for 269 lipids, 61 of them annotated as TAGs (ANOVA, BH-corrected P < 0.05). Among them, 23 TAGs with long-and very long-chain polyunsaturated fatty acids showed increased intensities in human milk ( Fig. 5B; Additional file 3: Figure S3).
Comparison between humans and pigs revealed significant intensity differences for 343 lipids, 64 of them annotated as TAGs (ANOVA, BH-corrected P < 0.05). Among them, 15 TAGs containing medium-chain saturated, monounsaturated, and polyunsaturated fatty acids were increased in the human milk, while the rest of TAGs containing long-and very long-chain polyunsaturated fatty acids were elevated in the pig milk ( Fig. 4C; Additional file 4: Figure S4).

Discussion
We show that triacylglycerides, the main components of the milk fat, differ widely among mammalian species in terms of their average fatty acid length and unsaturation degree. Of the seven species examined in our study, pig milk shows the most rapid evolutionary divergence, resulting in unusually high proportion of polyunsaturated fatty acids. This rapid evolution might either represent the relaxation of stabilizing selection or, more likely, an adaptive change enabling shortening of the lactation period on the suidae lineage.
Importantly, we detect specific features of the breast milk lipidome composition in each of the seven species, including humans. While we were unable to obtain milk 7 samples from apes, the comparison to two old world monkey species, rhesus and crab-eating macaques, revealed breast milk lipidome features potentially unique to humans. Specifically, we show that after the human-monkey species' divergence approximately 30 million years ago, human ancestors started to produce milk with higher abundance of TAGs containing long-chain fatty acids with high levels of unsaturation. This observation is intriguing, given the reports of particular longchain poly-unsaturated fatty acids accumulating in the human brain during the last trimester of pregnancy and after birth [19] potentially influencing brain development and functionality [8]. In addition to TAGs, we detect human-specific intensity differences for lipids representing other lipid classes present in milk, such as phospholipids. However, given the small sample size of the study, we did not consider low-abundance lipid classes in our analysis.

Conclusions
Our study revealed substantial differences in triacylglyceride composition among seven mammalian species: three primates, three bovids, and pigs. While for most species changes in milk lipidome composition fit the general evolutionary pattern, with distances proportional to the phylogenetic times, there is an exception.
Specifically, pig milk stood out by containing unusually high amounts of long-chain polyunsaturated fatty acids. Notably, human milk was second in terms of long-chain polyunsaturated fatty acids abundance, followed by two macaque species, and then by the bovids. These results indicate the need for further studies of breast milk lipidome evolution in conjunction with other developing tissues, especially the brain.

Lipid extraction
The milk aliquots were thawed at 0 °C mixed, and 16 µl of milk were transferred to a 2.0 ml Eppendorf safe-lock tube and resuspended in 34 µl of LC-MS grade water.
Prior to extraction, samples were randomized with regard to species' identity. For lipid extraction, a modified two-phase protocol was used as described in [20]. All From each diluted sample, 3 µl were injected to a reversed-phase Bridged Ethylene Hybrid (BEH) C8 reverse column (100 mm x 2.1 mm, containing 1.7 µm diameter particles, Waters) coupled to a Vanguard pre-column with the same dimensions, using a Waters Acquity UPLC system (Waters, Manchester, UK). The mobile phases used for the chromatographic separation were: water, containing 10 mM ammonium acetate, 0.1% formic acid (Buffer A) and acetonitrile:isopropanol (7:3 (v:v)), containing 10 mM ammonium acetate, 0.1% formic acid (Buffer B). The gradient separation was: 1 min 55% B, 3 min linear gradient from 55% to 80% B, 8 min linear gradient from 80% B to 85% B, and 3 min linear gradient from 85% A to 100% A. After 4.5 min washing with 100% B the column was re-equilibrated with 55% B. The flow rate was set to 400 µl/min. The mass spectra were acquired in a positive mode using a heated electrospray ionization source in combination with a Bruker Impact II QTOF (quadrupole-Time-of-Flight) mass spectrometer (Bruker Daltonics, Bremen, Germany).
Four blank samples were run at the beginning of the queue, followed by four QC samples to equilibrate the column. After them, 38 samples were queued in the same random order used for extraction with all samples randomized by species, interleaved with seven pooled samples, one QC preceding the first sample and then a QCs after every 9th sample. At the end of the queue, we performed two injections containing 100% acetonitrile to wash the column, followed by blank samples. Blank samples were prepared as usual samples, but contained only extraction buffers to reveal all contaminants that could come from the extraction and other technical steps, and not from the sample itself.

Data preprocessing
After the acquisition, Bruker raw data .d files were automatically calibrated using the internal calibration and converted into mzXML format using a custom DataAnalysis script (Bruker, Version 4.3). The mzXML files were then subjected to the standard alignment and peak picking procedure using xcms software [21]. We then filtered from the output table all lipid features with the coefficient of variation (CoV, calculated as the standard deviation over the mean across QC samples) > 30%, and peaks with zero values in > 50% of the individual samples. Lipid features' intensities were then normalized using the upper quartile normalization and base-two log-transformed. Raw data is uploaded to the metabolomics study data repository MetaboLights [22].

Phylogenetic distances calculation
The lipid intensity-based distances between species were calculated as the Euclidean distance between the vectors containing the intensities of 472 lipids detected in each pair of species. The phylogenetic distances between the species' pairs were obtained from the TimeTree database [23].

Statistical analysis
Species-dependent lipids were defined with ANOVA and BH-corrected p-value cutoff of 0.05. The correlation matrix of species-dependent lipids was calculated as (1-cor) Pearson's distances between all lipids. Unsupervised clustering of the speciesdependent lipids was performed using hierarchical clustering with complete linkage in the R statistical environment.

TAGs annotation
All detected lipid features were annotated against the theoretical list with all possible masses of NH4 + adducts of TAGs. The theoretical list of masses was generated using ALEX Target List Generator with the ALEX lipid database (5.2) [24].
Among the detected lipids, 76 matched the theoretical masses with < 10 ppm and were considered for further analysis.

Species-dependent TAG intensity differences
We calculated the mean intensities of TAGs in each species using the raw intensities of the annotated TAGs. To assess TAGs concentrations in each species, the mean intensity of a particular TAG was divided by the sum of the mean intensities of all TAGs in that species. For comparison between species, the 100% intensity value of 12 the TAG was defined as the maximal intensity of the TAG across the species.

Declarations
Ethics approval and consent to participate The study was carried out with the ethical approval of the Institutional Review Board of the Skolkovo Institute of Science and Technology (non-regular meeting No 2). All participants have signed an Informed Consent form.

Consent for publicationNot Applicable
Availability of data and materials The datasets generated and analysed during the current study are available in the MetaboLights repository, www.ebi.ac.uk/metabolights/MTBLS1338.

Competing interests
The authors declare that they have no competing interests.