Machine Learning Monte Carlo Approaches and Statistical Physics Notions to Characterize Bacterial Species in Human Microbiota / Bellingeri, Michele; Mancabelli, Leonardo; Milani, Christian; Lugli, Gabriele Andrea; Alfieri, Roberto; Turchetto, Massimiliano; Ventura, Marco; Cassi, Davide. - In: MACHINE LEARNING AND KNOWLEDGE EXTRACTION. - ISSN 2504-4990. - (2024). [10.3390/make6040117]
Machine Learning Monte Carlo Approaches and Statistical Physics Notions to Characterize Bacterial Species in Human Microbiota
Michele Bellingeri; Leonardo Mancabelli; Christian Milani; Gabriele Andrea Lugli; Roberto Alfieri; Massimiliano Turchetto; Marco Ventura; Davide Cassi
2024-01-01
Abstract
Recent studies have shown correlations between the microbiota’s composition and various health conditions. Machine learning (ML) techniques are essential for analyzing complex biological data, particularly in microbiome research. ML methods help analyze large datasets to uncover microbiota patterns and understand how these patterns affect human health. This study introduces a novel approach combining statistical physics with Monte Carlo (MC) methods to characterize bacterial species in the human microbiota. We assess the significance of bacterial species in different age groups by using notions of statistical distances to evaluate species prevalence and abundance across age groups and employing MC simulations based on statistical mechanics principles. Our findings show that the microbiota composition experiences a significant transition from early childhood to adulthood. Species such as Bifidobacterium breve and Veillonella parvula decrease with age, while others like Agathobaculum butyriciproducens and Eubacterium rectale increase. Additionally, low-prevalence species may hold significant importance in characterizing age groups. Finally, we propose an overall species ranking by integrating the methods proposed here in a multicriteria classification strategy. Our research provides a comprehensive tool for microbiota analysis using statistical notions, ML techniques, and MC simulations.

File | Size | Format |
---|---|---|
make-06-00117 (1).pdf (open access; type: publisher's version (PDF); license: Creative Commons) | 4.3 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.