An Investigation to Test Spectral Segments as Bacterial Biomarkers / Astorino, S.; Bonnici, V.; Franco, G. - (2023), pp. 1-16. (Paper presented at the 20th International Conference on Unconventional Computation and Natural Computation (UCNC 2023), held in Jacksonville, FL, USA, in 05/2023) [10.1007/978-3-031-34034-5_1].
An Investigation to Test Spectral Segments as Bacterial Biomarkers
Bonnici V.; Franco G.
2023-01-01
Abstract
A dictionary-based analysis of bacterial genomes is performed, using specific k-long factors (called res) and their maximal right elongations along the genome (called spectral segments), in order to find biomarkers that discriminate at the genus and species level. The aim is pursued through a previously introduced k-mer-based approach, applied here to genomes of different bacterial taxa. Intervals of values of k are identified that yield meaningful genomic fragments, whose collection is a suitable representation for comparing genomes by means of informational indexes and Jaccard similarity matrices. Corresponding dictionaries of k-mers are identified that discriminate bacterial genomes at the genus and species level. The approach appears competitive with traditional barcoding methods in terms of both performance (e.g., species discrimination) and size.
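To make the comparison step concrete, the following is a minimal sketch of how k-mer dictionaries and a Jaccard similarity matrix can be computed. It is not the authors' implementation: it uses plain k-mer sets rather than res or spectral segments, and the function names (`kmer_dictionary`, `jaccard`, `jaccard_matrix`) and the toy sequences are hypothetical, chosen only for illustration.

```python
def kmer_dictionary(genome: str, k: int) -> set[str]:
    """Return the set of distinct k-long factors (k-mers) occurring in the sequence."""
    return {genome[i:i + k] for i in range(len(genome) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B|; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def jaccard_matrix(genomes: dict[str, str], k: int) -> dict[tuple[str, str], float]:
    """Pairwise Jaccard similarities between the k-mer dictionaries of named sequences."""
    dicts = {name: kmer_dictionary(seq, k) for name, seq in genomes.items()}
    return {(x, y): jaccard(dicts[x], dicts[y]) for x in dicts for y in dicts}


# Toy sequences (assumption: illustrative strings, not real bacterial genomes).
toy = {"g1": "ACGTACGTGA", "g2": "ACGTACGTGA", "g3": "TTTTGGGGCC"}
sim = jaccard_matrix(toy, k=3)
```

In this sketch, identical sequences get similarity 1 and sequences sharing no k-mer get similarity 0. On real genomes the choice of k matters, which is why the abstract refers to identifying suitable intervals for k.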