We examined the features that were top ranked predictors in both Ldlr- and ApoE-based classifiers


IHH is a clinically important exposure because it markedly promotes atherosclerotic lesions in the pulmonary arteries and aorta not only in Ldlr-/- mice but also in ApoE knockout mice, another widely used atherosclerosis model, thereby mimicking the adverse cardiovascular changes that occur in OSA patients. In Ldlr-/- mice, we reported significant shifts in the bacterial and chemical composition of the gut on IHH-exposure. The key chemical alterations included changes in microbe-dependent metabolites such as gut-derived estrogen-like molecules and bile acids. These observations revealed an unrecognized link between IHH and gut microbes, thereby holding immense potential for translation in OSA patients. However, a key challenge in microbiome research is understanding whether different animal models, or animal models and human subjects, are characterized by common changes in the microbiome and metabolome. As a first step, finding reproducible alterations across multiple animal models would provide confidence in the generalizability of the findings and accrue evidence for clinical relevance. Here, we use machine learning predictive models to address the reproducibility of the perturbations associated with IHH exposure in the gut ecosystem using both Ldlr-/- and ApoE-/- mouse models. To model OSA and its cardiovascular conditions, all mice were fed a high-fat diet (HFD) and exposed to either IHH or air. Individuals were studied longitudinally for 6 weeks or 10 weeks to understand the impact of prolonged IHH-exposure. Furthermore, multiple cages per treatment group were used to untangle the effect of treatment from the effect of distinct housing conditions.

Starting at 10 weeks of age, fecal pellets were collected twice every week and profiled for microbiome and metabolome using 16S rRNA amplicon sequencing and liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based untargeted mass spectrometry, respectively. These data layers were processed per recommended practices to obtain relative abundances of microbial and molecular species per sample for all downstream analyses.

Predictive models that classify microbiome and metabolome responses to interventions have proven extremely useful in disease diagnosis and biomarker discovery. Yet, these have been surprisingly hard to generalize across populations or model systems. In this work, we use Random Forest (RF) classification to investigate the cross-applicability of our previous findings in Ldlr-/- mice to ApoE-/- mice and vice versa. RF is an ensemble machine learning algorithm that fits many decision trees on random subsamples of the original data, and then aggregates the results of the individual decision trees to improve the prediction accuracy. The level of accuracy is often expressed using the area under the curve (AUC) of true-positive versus false-positive rates, known as the receiver operating characteristic (ROC) curve. RF has consistently been reported to perform well in high-dimensional datasets, i.e. datasets with many features such as ours, making it our algorithm of choice for this work. We had previously shown that machine learning classifiers trained on Inflammatory Bowel Disease (IBD) cases and healthy controls in humans can distinguish between IBD cases and controls in dogs using cross-sectional microbiome data. To our knowledge, however, this type of cross-model classification task has not been performed with metabolomics data, or with data collected longitudinally.

First, we performed Principal Coordinate Analysis (PCoA) to get a visual overview of the characteristic microbiome and metabolome of the two animal models.

PCoA is an unsupervised method routinely used to explore the major factors that drive the clustering of data points in high-dimensional datasets by projecting the samples into a reduced-dimensional space. Figure 4.1 displays the PCoA results plotted along time to visualize the dynamics of diet- and IHH-associated changes in the gut ecosystem. This analysis shows that the Ldlr-/- and ApoE-/- mice in our study have very distinct gut microbial and chemical signatures, which are captured by the first principal axis in both data layers. These plots also capture a rapid shift in the baseline gut microbial and chemical composition in response to HFD, which has also been reported previously. We also performed PCoA without baseline samples to better visualize the impact of IHH-exposure alone. We observed that, despite underlying differences between the two genotypes, axis 2 consistently captured IHH-induced shifts in both the gut microbiome and metabolome, highlighting common shifts in the gut ecosystem due to IHH exposure.

It is important to note that the two animal models are temporally separated for sample collection and data acquisition, which likely contributes to the strong distinction between the models observed here. We quantified the effects of covariates such as genotype, age of individuals, housing conditions and individual variability on the microbiome and metabolome composition by performing effect size analysis on our dataset. While the largest effect on the microbial and chemical composition was linked to the mouse model, the type of exposure significantly impacted each data layer within both models. Moreover, the effect sizes varied based on the animal model, highlighting the distinctive characteristics of the gut ecosystem in the two models.

Our unsupervised analysis showed that the gut ecosystems of ApoE-/- and Ldlr-/- mice, despite being inherently distinct, consistently shift in response to IHH-exposure.
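The projection step behind PCoA can be illustrated with a minimal sketch of classical PCoA: the squared distance matrix is double-centered and its leading eigenvectors give the sample coordinates. This is not the pipeline used in the study. The function name `pcoa`, the NumPy dependency, and the toy Euclidean distances are assumptions made for illustration, and the distance metric applied to the actual microbiome and metabolome data is not specified here.

```python
import numpy as np

def pcoa(distances, n_axes=2):
    """Classical PCoA: return sample coordinates and the fraction of
    variation captured by each retained axis."""
    d = np.asarray(distances, dtype=float)
    n = d.shape[0]
    # Double-center the squared distances (Gower centering).
    centering = np.eye(n) - np.full((n, n), 1.0 / n)
    gram = -0.5 * centering @ (d ** 2) @ centering
    # eigh returns eigenvalues in ascending order; sort them descending.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Non-Euclidean distances can give small negative eigenvalues; clip them.
    positive = np.clip(eigvals, 0.0, None)
    coords = eigvecs[:, :n_axes] * np.sqrt(positive[:n_axes])
    explained = positive[:n_axes] / positive.sum()
    return coords, explained

# Toy example (hypothetical data): four samples lying on a line.
positions = np.array([0.0, 1.0, 3.0, 6.0])
toy_distances = np.abs(positions[:, None] - positions[None, :])
coords, explained = pcoa(toy_distances, n_axes=2)
```

In this toy case the first axis recovers the original spacing of the samples, so nearly all of the variation falls on axis 1. With real data, plotting each axis against sample age would give a time-resolved view of the kind described for Figure 4.1.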
We applied supervised machine learning in order to capture the consistent shifts associated with IHH-exposure in both animal models.

Specifically, we built RF classifiers using the IHH-associated microbial and chemical composition in ApoE-/- mice and tested their performance in predicting IHH-exposure in Ldlr-/- mice, and vice versa. This informed us whether the changes we observe in one model are reproducible in the other, which would make the findings more relevant for translation in OSA patients.

To examine the predictive potential of microbiome data, we trained RF classifiers on relative abundances of 16S tag sequences shared between the two mouse models. Within each animal model, the classifiers yielded nearly perfect prediction of IHH-exposure. We then predicted IHH-exposure in Ldlr-/- mice using RF trained on the microbiome signature in ApoE-/- mice and vice versa, still achieving very high cross-model prediction accuracies. Similarly, we used metabolomics data for training RF models on the relative abundance of MS1 spectral ions. Metabolome-based RF classifiers also predicted IHH-exposure accurately within animal models, and maintained high cross-model prediction accuracies. Together, these analyses suggest that IHH-exposure alters both the gut microbial and chemical composition distinguishably in each animal model. Moreover, the changes induced by IHH-exposure are consistent across Ldlr-/- and ApoE-/- models, despite the underlying differences between the two genotypes. It is worth noting that we benefited greatly from our longitudinal sample collection scheme, as it gave us more data points for learning despite the limited number of animals per group. We accounted for the longitudinal samples from the same individual in our analyses by ensuring that observations for each individual appeared either in the training or in the validation dataset, but not both. This prevented over-optimistic cross-validation accuracy scores resulting from the model overfitting to the characteristics of the individual itself rather than to the treatment.
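The individual-aware splitting described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code used in the study. The function name `grouped_folds`, the mouse identifiers, the number of folds and the sampling layout are all hypothetical.

```python
import random
from collections import defaultdict

def grouped_folds(mouse_ids, n_folds=3, seed=0):
    """Yield (train_idx, valid_idx) pairs in which all samples from a given
    mouse fall on the same side of the split."""
    by_mouse = defaultdict(list)
    for idx, mouse in enumerate(mouse_ids):
        by_mouse[mouse].append(idx)
    mice = sorted(by_mouse)
    random.Random(seed).shuffle(mice)  # reproducible assignment of mice to folds
    for k in range(n_folds):
        held_out = set(mice[k::n_folds])  # every n_folds-th mouse is validated in fold k
        valid = [i for m in mice if m in held_out for i in by_mouse[m]]
        train = [i for m in mice if m not in held_out for i in by_mouse[m]]
        yield train, valid

# Hypothetical layout: 6 mice, each sampled at 4 time points.
mouse_ids = [f"mouse{m}" for m in range(6) for _ in range(4)]
folds = list(grouped_folds(mouse_ids, n_folds=3))
```

Each fold would then be used to fit the classifier on `train` and score it on `valid`. Because no mouse contributes samples to both sides, the score reflects how well the treatment signal carries over to unseen individuals.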
Next, we used these longitudinal datasets to learn how the duration of IHH-exposure impacts the gut microbiome and metabolome over time, and whether this is consistent across the mouse models. The goal was to compare the dynamics of changes in the gut ecosystem with chronic IHH exposure in the ApoE-/- and Ldlr-/- mice. We tested this by assessing the capability of the RF classifier to distinguish IHH samples from controls at each time point. In ApoE-/- mice, the classification AUC using gut microbiome data is high at each time point starting at 11 weeks of age. The microbiome in Ldlr-/- mice, however, appears more predictive only at later time points, with its classification AUC improving from 0.71 at week 11 to more than 0.99 beyond week 14. We also observed a similar lag in gut metabolome changes in Ldlr-/- compared to ApoE-/- animals. Importantly, this is concordant with our previous finding that the atherosclerotic lesions evolved slowly and mildly in Ldlr-/- mice as compared to ApoE-/- mice. Therefore, observing this trend in both 'omics layers provides supporting evidence that the atherosclerosis phenotype in these animals is linked to perturbations in their gut ecosystem. Moreover, the gut microbiome and metabolome changes occur quickly after IHH-exposure, before atherosclerotic lesions were observed, which was reported to be at 4 weeks for ApoE-/- mice and 6 weeks for Ldlr-/- mice post IHH exposure.

The subsequent goal of this analysis was to narrow down the list of fecal biomarkers that are reproducibly predictive of IHH-exposure, thereby guiding future mechanistic and clinical studies. The RF classifiers used to distinguish IHH-exposed and control animals described above provided us with a ranked list of bacterial and chemical features important for prediction.
To investigate whether some key biomarkers could single-handedly distinguish IHH from control, we used the abundance of each of these features individually to plot ROC curves and compute AUCs. Indeed, some of these microbial and chemical features could alone detect IHH-exposure within each mouse model with high accuracy.
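A single-feature AUC of this kind can be computed directly from abundances, since the AUC equals the probability that a randomly chosen IHH sample has a higher value than a randomly chosen control. The sketch below is illustrative only. The function names and the abundance values are hypothetical, and treating depletion and enrichment as equally informative (taking the larger of AUC and 1 - AUC) is an assumption rather than the procedure reported here.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive (IHH = 1) sample scores higher
    than a negative (control = 0) sample; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample from each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def single_feature_auc(abundances, labels):
    """Direction-agnostic AUC, so that a feature depleted under IHH scores
    as high as one that is enriched (an assumption of this sketch)."""
    auc = roc_auc(abundances, labels)
    return max(auc, 1.0 - auc)

# Hypothetical relative abundances for three IHH (1) and three control (0) samples.
labels = [1, 1, 1, 0, 0, 0]
enriched_feature = [0.30, 0.25, 0.40, 0.10, 0.05, 0.12]
depleted_feature = [0.02, 0.01, 0.03, 0.20, 0.15, 0.25]
uninformative_feature = [0.10, 0.30, 0.20, 0.20, 0.10, 0.30]

aucs = {
    "enriched": single_feature_auc(enriched_feature, labels),
    "depleted": single_feature_auc(depleted_feature, labels),
    "uninformative": single_feature_auc(uninformative_feature, labels),
}
```

Keeping the raw `roc_auc` value alongside the direction-agnostic score preserves the sign of the change, which matters when checking whether a feature shifts in the same direction in both mouse models.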

We used our longitudinal data to compare trends of these predictive features in IHH-exposed and control groups in both animal models. These predictors included bacterial strains from the families Clostridiaceae and molecules identified as muricholic acid and vaccenic acid. The goal was to investigate whether these microbial and chemical species changed in the same direction on IHH-exposure in both ApoE-/- and Ldlr-/- mice, or had idiosyncratic responses to IHH exposure based on the genetic background of the host. Figure 4.3c, e, and f show trends in these consistently altered features. These microbes and metabolites highlight key IHH-related changes in the gut microenvironment, and could guide subsequent reconstitution experiments in germ-free mice to establish causality. It is noteworthy that one unclassified species from the order Clostridiales, despite being highly predictive within each animal model, was depleted in IHH in ApoE-/- mice but enriched in Ldlr-/- mice. This, together with the high cross-genotype prediction accuracy using all features, suggests that although the microbiome and metabolome changes induced by IHH are reproducible across mouse models overall, animal model-specific changes exist as well. Hence, multi-animal model studies such as this are highly advantageous in precisely identifying biomarkers associated with an intervention of interest.

We examined the reproducibility of IHH-associated alterations in the gut microbiome and metabolome of Ldlr-/- and ApoE-/- mouse models, which is crucial for understanding links between OSA and associated cardiovascular pathologies. As both APOE and LDLR are important in clearing cholesterol and triglyceride-rich particles from the blood, both models show elevated plasma cholesterol levels. However, they develop atherosclerotic plaques to different extents under high-fat dietary conditions.
Concordant with these phenotypic differences, we highlight throughout that the gut ecosystems of the two models are also intrinsically distinct. As technical variables such as origin of animals, housing conditions, experimental batches and data acquisition protocols are important considerations for meta-analyses such as ours, we ensured that all animals were housed and handled in the same facility and that data were acquired using identical protocols to minimize confounding effects. Furthermore, we used supervised machine learning to identify features that are specifically and reproducibly associated with IHH-exposure in both animal models. To our knowledge, the impact of IHH on the gut microbiome and metabolome in the context of atherosclerosis has not been investigated before, making our work exploratory in nature. Intermittent hypoxia alone has been reported to significantly alter the microbiome in wild-type mice and guinea pigs, which lends support to our findings with IHH-exposure. Another study modeled human OSA and its cardiovascular consequences in HFD-fed rats by inflating a tracheal balloon during the sleep cycle. The authors reported that HFD and OSA synergistically caused hypertension and gut dysbiosis in these rats. This study also noted perturbations in members of the order Clostridiales in response to HFD. Curiously, we also observe a member of this order to be highly predictive of IHH, yet changing in different directions in ApoE-/- and Ldlr-/- animals, hinting that this may be due to the differential impact of HFD on the two models. In addition to genotype-specific changes, we also report consistent changes to unclassified strains belonging to the families Ruminococcaceae, Mogibacteriaceae, Lachnospiraceae and Clostridiaceae. These taxonomic groups have previously been associated with cardiovascular, metabolic and inflammatory conditions, which indicates shared mechanistic pathways in OSA-associated cardiovascular conditions.
Furthermore, our work is the first to profile OSA-associated changes in the gut metabolome at this scale. We observed reproducible perturbations in clinically relevant biomolecules in both ApoE-/- and Ldlr-/- mice. For example, vaccenic acid, a trans-fatty acid that has been reported to lower LDL cholesterol and triglyceride levels in rats, was found to decrease under IHH-exposure in both models.