Graphs are a widely used structure for knowledge representation. Their uses range from biochemical to biomedical applications and are recently involved in multi-omics analyses. A key computational task regarding graphs is the search of specific topologies contained in them. The task is known to be NP-complete, thus indexing techniques are applied for dealing with its complexity. In particular, techniques exploiting paths extracted from graphs have shown good performances in terms of time requirements, but they still suffer because of the relatively large size of the produced index. We applied decision diagrams (DDs) as index data structure showing a good reduction in the indexing size with respect to other approaches. Nevertheless, the size of a DD is dependent on its variable order. Because the search of an optimal order is an NP-complete task, variable order heuristics on DDs are applied by exploiting domain-specific information. Here, we propose a heuristic based on the information content of the labeled paths. Tests on well-studied biological benchmarks, which are an essential part of multi-omics graphs, show that the resultant size correlates with the information measure related to the paths and that the chosen order allows to effectively reduce the index size.

An entropy heuristic to optimize decision diagrams for index-driven search in biological graph databases / Licheri, N.; Amparore, E.; Bonnici, V.; Giugno, R.; Beccuti, M.. - ELETTRONICO. - 3052:(2021). (Intervento presentato al convegno 2021 International Conference on Information and Knowledge Management Workshops, CIKMW 2021 tenutosi a aus nel 2021).

An entropy heuristic to optimize decision diagrams for index-driven search in biological graph databases

Bonnici V.;
2021-01-01

Abstract

Graphs are a widely used structure for knowledge representation. Their uses range from biochemical to biomedical applications and are recently involved in multi-omics analyses. A key computational task regarding graphs is the search of specific topologies contained in them. The task is known to be NP-complete, thus indexing techniques are applied for dealing with its complexity. In particular, techniques exploiting paths extracted from graphs have shown good performances in terms of time requirements, but they still suffer because of the relatively large size of the produced index. We applied decision diagrams (DDs) as index data structure showing a good reduction in the indexing size with respect to other approaches. Nevertheless, the size of a DD is dependent on its variable order. Because the search of an optimal order is an NP-complete task, variable order heuristics on DDs are applied by exploiting domain-specific information. Here, we propose a heuristic based on the information content of the labeled paths. Tests on well-studied biological benchmarks, which are an essential part of multi-omics graphs, show that the resultant size correlates with the information measure related to the paths and that the chosen order allows to effectively reduce the index size.
2021
An entropy heuristic to optimize decision diagrams for index-driven search in biological graph databases / Licheri, N.; Amparore, E.; Bonnici, V.; Giugno, R.; Beccuti, M.. - ELETTRONICO. - 3052:(2021). (Intervento presentato al convegno 2021 International Conference on Information and Knowledge Management Workshops, CIKMW 2021 tenutosi a aus nel 2021).
File in questo prodotto:
File Dimensione Formato  
paper13.pdf

accesso aperto

Tipologia: Versione (PDF) editoriale
Licenza: Creative commons
Dimensione 539.14 kB
Formato Adobe PDF
539.14 kB Adobe PDF Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11381/2914709
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 0
  • ???jsp.display-item.citation.isi??? ND
social impact