Comparison of Spectral Clustering Algorithms for Gene Expression Data

  • Hoàng Thị Thanh Giang
  • Nguyễn Thị Thúy Hạnh
  • Nguyễn Hoàng Huy

Abstract

      Spectral clustering algorithms are among the most effective algorithms for dividing genes into groups according to the similarity of their expression. Such a grouping may suggest that the respective genes are correlated and/or co-regulated, which in turn indicates that they may share a common biological role. In this paper, three spectral clustering algorithms were investigated and benchmarked against each other: unnormalized spectral clustering, normalized spectral clustering according to Shi and Malik (2000), and normalized spectral clustering according to Ng, Jordan and Weiss (2002). Their performance was studied on time series expression data, using the Dynamic Time Warping (DTW) distance to measure similarity between gene expression profiles. Four cluster validation measures were used to evaluate the algorithms: Connectivity and the Silhouette Index for estimating the quality of clusters, the Jaccard Index for evaluating the stability of a clustering method, and the Rand Index for assessing accuracy. The results were analyzed with Friedman’s test. Normalized spectral clustering according to Ng, Jordan and Weiss (2002) performed best under the Silhouette and Rand validation indices.
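The three variants named in the abstract differ mainly in which graph Laplacian supplies the eigenvectors. The following is a minimal Python sketch of that difference, not the authors' implementation. It makes several assumptions the abstract does not state: a Gaussian kernel (with a hypothetical bandwidth `sigma`) turns DTW distances into affinities, the graph is fully connected, k-means is seeded by a deterministic farthest-point rule, and the toy profiles are synthetic. All function names are illustrative.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.cluster.vq import kmeans2


def dtw_distance(a, b):
    """Classic dynamic-programming DTW distance between two 1-D series."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]


def dtw_affinity(profiles, sigma):
    """Assumed similarity graph: Gaussian kernel on pairwise DTW distances."""
    n = len(profiles)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = dtw_distance(profiles[i], profiles[j])
    W = np.exp(-dist ** 2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def spectral_embedding(W, k, variant):
    """Return the n x k matrix whose rows are clustered by k-means."""
    d = W.sum(axis=1)
    L = np.diag(d) - W  # unnormalized Laplacian L = D - W
    if variant == "unnormalized":
        _, vecs = eigh(L)  # eigenvalues returned in ascending order
        return vecs[:, :k]
    if variant == "shi_malik":
        _, vecs = eigh(L, np.diag(d))  # generalized problem L u = lambda D u
        return vecs[:, :k]
    if variant == "ng_jordan_weiss":
        s = 1.0 / np.sqrt(d)
        L_sym = np.eye(len(d)) - s[:, None] * W * s[None, :]
        _, vecs = eigh(L_sym)
        U = vecs[:, :k]
        return U / np.linalg.norm(U, axis=1, keepdims=True)  # unit-length rows
    raise ValueError(f"unknown variant: {variant}")


def farthest_point_init(X, k):
    """Deterministic k-means seeding (an assumption made for reproducibility)."""
    idx = [0]
    for _ in range(1, k):
        gaps = np.linalg.norm(X[:, None, :] - X[idx][None, :, :], axis=2).min(axis=1)
        idx.append(int(np.argmax(gaps)))
    return X[idx].copy()


def spectral_cluster(W, k, variant):
    X = spectral_embedding(W, k, variant)
    _, labels = kmeans2(X, farthest_point_init(X, k), minit="matrix")
    return labels


# Toy data: two groups of four synthetic "expression profiles" at 10 time points.
t = np.linspace(0.0, 2.0 * np.pi, 10)
profiles = [np.sin(t) + 0.05 * i for i in range(4)]
profiles += [np.sin(t) + 3.0 + 0.05 * i for i in range(4)]

W = dtw_affinity(profiles, sigma=2.0)
results = {v: spectral_cluster(W, 2, v)
           for v in ("unnormalized", "shi_malik", "ng_jordan_weiss")}
```

On data this cleanly separated all three variants should recover the same two groups; the differences reported in the paper concern real expression data, where the choice of Laplacian and the row normalization step can change the partition.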

Published
2017-10-04
Section
ENGINEERING AND TECHNOLOGY