MOTIVATION: Unsupervised class discovery in gene expression data relies on the statistical signals in the data to exclusively drive the results. It is often the case, however, that one is interested in constraining the search space to respect certain biological prior knowledge while still allowing a flexible search within these boundaries.

RESULTS: We develop an approach to semi-supervised class discovery. One component of our approach uses clinical sample information to constrain the search space and guide the class discovery process to yield biologically relevant partitions. A second component consists of using known biological annotation of genes to drive the search, seeking partitions that manifest strong differential expression in specific sets of genes. We develop efficient algorithmics for these tasks, implementing both approaches and combinations thereof. We show that our method is robust enough to detect known clinical parameters in accordance with expected clinical values. We also use our method to elucidate cardiovascular disease (CVD) putative risk factors.

Clinically driven semi-supervised class discovery in gene expression data / Steinfeld, I; Navon, R; Ardigò, D; Zavaroni, Ivana; Yakhini, Z. - In: BIOINFORMATICS. - ISSN 1367-4803. - 24(16):(2008), pp. 190-197. [10.1093/bioinformatics/btn279]

Clinically driven semi-supervised class discovery in gene expression data

ZAVARONI, Ivana
2008-01-01

Files in this record:
clinically driven.pdf (post-print document, Adobe PDF, 479.6 kB). Not available; license: not public, private/restricted access.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11381/1874838
Citations
  • PMC: ND
  • Scopus: 16
  • ISI: 15