A k-mer Based Sequence Similarity for Pangenomic Analyses

Bonnici, V.; Cracco, A.; Franco, G.

doi:10.1007/978-3-030-95470-3_3

In this work we propose an approach that improves the performance of a current methodology for pangenomic analyses, which computes k-mer based sequence similarity via the Jaccard index. Recent studies have shown that this measure performs well at retrieving homology among genetic sequences belonging to a group of genomes. Our improvement is obtained by exploiting a suitable k-mer representation, which enables fast and memory-efficient computation of sequence similarity. Experimental results on genomes of living organisms of different species provide evidence that the approach improves on a state-of-the-art methodology in terms of running time and memory requirements.
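As a point of reference, the measure named in the abstract can be sketched in a few lines. This is a minimal, naive illustration of the Jaccard index over sets of distinct k-mers, not the representation proposed in the paper, which the abstract does not describe. The function names `kmers` and `jaccard` are hypothetical, and treating k-mers as plain sets (rather than multisets) is an assumption made here for simplicity.

```python
def kmers(seq: str, k: int) -> set:
    """Return the set of distinct k-mers (length-k substrings) of seq.

    Assumption: k-mers are treated as a plain set, so multiplicities
    are ignored. The paper may count them differently.
    """
    # range() is empty when the sequence is shorter than k.
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: str, b: str, k: int) -> float:
    """Jaccard index |A & B| / |A | B| of the k-mer sets of a and b."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        # Neither sequence contains a k-mer; define the similarity as 0.
        return 0.0
    return len(ka & kb) / len(union)
```

For example, `jaccard("ACGTAC", "ACGTTT", 3)` compares the sets {ACG, CGT, GTA, TAC} and {ACG, CGT, GTT, TTT}, which share 2 of 6 distinct 3-mers. Storing every k-mer as a Python string is exactly the kind of cost that a more compact representation would aim to reduce.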

A k-mer Based Sequence Similarity for Pangenomic Analyses / Bonnici, V.; Cracco, A.; Franco, G. - 13164:(2022), pp. 31-44. (LOD2021 - The 7th International Online & Onsite Conference on Machine Learning, Optimization, and Data Science, Grasmere, Lake District, England, UK, 10/2021) [10.1007/978-3-030-95470-3_3].

2022-01-01



Year: 2022
Appears in categories: 4.1b Conference proceedings (volume)


Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11381/2919531
