Consensus clustering in gene expression
Napolitano F.
2015-01-01
Abstract
In data analysis, clustering is the process of finding groups in unlabelled data according to the similarities among items, such that items belonging to the same group are more similar to each other than to items in different groups. Consensus clustering is a methodology for combining different clustering solutions of the same data set into a new clustering, in order to obtain a more accurate and stable solution. In this work we compared different consensus approaches in combination with different clustering algorithms and ran several experiments on gene expression data sets. We show that consensus techniques lead to an improvement in clustering accuracy and give evidence of the stability of the solutions obtained with these methods.
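As an illustration of what combining clustering solutions can mean in practice, the following is a minimal sketch of one common consensus scheme, based on a co-association matrix. It is not necessarily one of the approaches compared in this work. The function names, the 0.5 threshold, and the toy label vectors are assumptions made for the example.

```python
from itertools import combinations


def co_association(partitions):
    """Return the fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / len(partitions) for c in row] for row in counts]


def consensus_partition(partitions, threshold=0.5):
    """Link items that co-cluster in more than `threshold` of the partitions
    and return the connected components as consensus labels (assumed rule)."""
    matrix = co_association(partitions)
    n = len(matrix)
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if matrix[i][j] > threshold:
            parent[find(j)] = find(i)

    # Relabel components as 0, 1, 2, ... in order of first appearance.
    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(n)]


# Three hypothetical clusterings of six items; label names need not agree
# across clusterings, because only co-membership is used.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 2],
]
print(consensus_partition(partitions))  # [0, 0, 0, 1, 1, 1]
```

In this toy input, each of the second and third clusterings disagrees with the first on some items, yet the majority vote over pairs recovers two groups of three. Real gene expression data would typically call for a more careful final step, such as hierarchical clustering on the co-association matrix.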