METAnnotatorX2: A comprehensive tool for deep and shallow metagenomic data set analyses / Milani, C.; Lugli, G. A.; Fontana, F.; Mancabelli, L.; Alessandri, G.; Longhi, G.; Anzalone, R.; Viappiani, A.; Turroni, F.; van Sinderen, D.; Ventura, M. - In: mSystems. - ISSN 2379-5077. - 6:3 (2021), e00583-21. [10.1128/mSystems.00583-21]

METAnnotatorX2: A comprehensive tool for deep and shallow metagenomic data set analyses

Milani C.; Lugli G. A.; Fontana F.; Mancabelli L.; Alessandri G.; Longhi G.; Anzalone R.; Viappiani A.; Turroni F.; van Sinderen D.; Ventura M.
2021

Abstract

The use of bioinformatic tools for read-based taxonomic and functional analyses of metagenomic data sets, including their assembly and management, is rather fragmentary because no gold standard has been accepted. Moreover, most currently available software tools require millions of input reads and rely on approximations in data analysis to reduce computing times. These issues lead to suboptimal accuracy, sensitivity, and specificity, whether the tools are used to reconstruct taxonomic or functional profiles through read analysis or to analyze genomes reconstructed by metagenomic assembly. In addition, novel DNA sequencing technologies that generate long reads, such as Nanopore and PacBio, provide a valuable data resource that still lacks dedicated tools for integrated hybrid analysis alongside short-read data. To overcome these limitations, we describe a comprehensive bioinformatic platform, METAnnotatorX2, which aims to provide an optimized, user-friendly resource that maximizes output quality while allowing user-specific adaptation of the pipeline and straightforward integrated analysis of both short- and long-read data. To further improve the quality and accuracy of taxonomic assignment of reads and contigs, custom preprocessed and taxonomically revised genomic databases for viruses, prokaryotes, and various eukaryotes were developed. The performance of METAnnotatorX2 was tested by analyzing artificial data sets encompassing viral, archaeal, bacterial, and eukaryotic (fungal) sequence reads that simulate different biological matrices. Real biological samples were then used to validate the in silico results.
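The abstract evaluates tools in terms of accuracy, sensitivity, and specificity on artificial data sets where the true origin of each read is known. The following is a minimal sketch of how those three measures could be computed per taxon from read-level assignments. It is not the METAnnotatorX2 implementation or its evaluation protocol: the function name `per_taxon_metrics`, the dictionary layout, and the example reads are all hypothetical and chosen only for illustration.

```python
def per_taxon_metrics(truth, predicted, taxon):
    """Confusion-matrix metrics for one taxon, counting each read once.

    truth:     dict mapping read ID -> true taxon (known for simulated reads)
    predicted: dict mapping read ID -> assigned taxon; reads that are absent
               are treated as unassigned (hypothetical convention)
    """
    tp = fp = fn = tn = 0
    for read_id, true_taxon in truth.items():
        call = predicted.get(read_id)  # None means the read was left unassigned
        if true_taxon == taxon and call == taxon:
            tp += 1  # read from the taxon, correctly assigned to it
        elif true_taxon == taxon:
            fn += 1  # read from the taxon, missed or assigned elsewhere
        elif call == taxon:
            fp += 1  # read from another taxon, wrongly assigned to this one
        else:
            tn += 1  # read from another taxon, not assigned to this one
    total = tp + fp + fn + tn
    nan = float("nan")
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "accuracy": (tp + tn) / total if total else nan,
    }


# Hypothetical simulated data set: five reads with known origin.
truth = {
    "r1": "Bifidobacterium longum",
    "r2": "Bifidobacterium longum",
    "r3": "Escherichia coli",
    "r4": "Saccharomyces cerevisiae",
    "r5": "Escherichia coli",
}
# Hypothetical classifier output; r5 is left unassigned.
predicted = {
    "r1": "Bifidobacterium longum",
    "r2": "Escherichia coli",
    "r3": "Escherichia coli",
    "r4": "Saccharomyces cerevisiae",
}

metrics = per_taxon_metrics(truth, predicted, "Escherichia coli")
print(metrics)
```

In this toy example, the taxon "Escherichia coli" has one true positive, one false negative, one false positive, and two true negatives, so sensitivity is 1/2, specificity is 2/3, and accuracy is 3/5.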

Use this identifier to cite or link to this document: http://hdl.handle.net/11381/2895519