Unsupervised clustering in mRNA expression profiles D.K. Tasoulis, V.P. Plagianakos, and M.N. Vrahatis Computational Intelligence Laboratory (CILAB), Department.


1 Unsupervised clustering in mRNA expression profiles
D.K. Tasoulis, V.P. Plagianakos, and M.N. Vrahatis
Computational Intelligence Laboratory (CILAB), Department of Mathematics, University of Patras, GR-26110 Patras, Greece
University of Patras Artificial Intelligence Research Center (UPAIRC), University of Patras, GR-26110 Patras, Greece
Computers in Biology and Medicine, In Press, Corrected Proof, available online 24 October 2005

2 K-Windows Clustering
– Adaptation of K-means, originally proposed in 2002 by Vrahatis et al.
– Windowing technique improves speed and accuracy
– Tries to place a d-dimensional window (box) containing all patterns that belong to a single cluster

3 K-Windows – Basic Concepts
Move windows to find cluster centers (fig. a)
1. Select k points as centers of d-dimensional windows of size a.
2. The mean of the points inside each window becomes its new center.
3. Repeat until the stopping criterion is met (based on the movement of the center).
Enlarge windows to determine cluster edges (fig. b)
1. Enlarge one dimension by a specified percentage.
2. Relocate the window as above.
3. Keep the enlargement only if the increase in instances inside the window exceeds a threshold.
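The two phases above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the half-width representation of a window, and the default values of `tol`, `percent`, and `threshold` are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def points_in_window(points: Sequence[Point], center: Point,
                     half_widths: Sequence[float]) -> List[Point]:
    """Return the points lying inside the axis-aligned box around center."""
    return [p for p in points
            if all(abs(x - c) <= h for x, c, h in zip(p, center, half_widths))]


def move_window(points: Sequence[Point], center: Point,
                half_widths: Sequence[float],
                tol: float = 1e-3, max_iter: int = 100) -> List[float]:
    """Recenter the window on the mean of its points until it stops moving.

    tol and max_iter are hypothetical stopping parameters.
    """
    center = list(center)
    for _ in range(max_iter):
        inside = points_in_window(points, center, half_widths)
        if not inside:
            break  # empty window: nothing to move toward
        new_center = [sum(col) / len(inside) for col in zip(*inside)]
        shift = max(abs(n - c) for n, c in zip(new_center, center))
        center = new_center
        if shift < tol:
            break
    return center


def enlarge_window(points: Sequence[Point], center: Point,
                   half_widths: Sequence[float],
                   percent: float = 0.2,
                   threshold: float = 0.1) -> Tuple[List[float], List[float]]:
    """Try growing each dimension once; keep the growth only if the relative
    gain in points exceeds threshold (an assumed reading of the slide)."""
    center, half_widths = list(center), list(half_widths)
    for d in range(len(half_widths)):
        before = len(points_in_window(points, center, half_widths))
        trial = list(half_widths)
        trial[d] *= 1.0 + percent
        trial_center = move_window(points, center, trial)  # relocate as above
        after = len(points_in_window(points, trial_center, trial))
        if before > 0 and (after - before) / before > threshold:
            center, half_widths = trial_center, trial
    return center, half_widths
```

Calling `move_window` for each of the k initial centers and then `enlarge_window` on the result mirrors the order of the two lists on the slide; how the real algorithm interleaves the two steps is not stated here.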

4 Unsupervised K-Windows (UKW)
– Start with a sufficiently large number of windows
– Merge windows to determine the number of clusters automatically
– For each pair of overlapping windows, calculate the proportion of overlap for each window:
a) Large overlap: considered the same cluster, and W1 is deleted.
b) Many points in common: considered the same cluster.
c) Low overlap: considered two different clusters.
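The three merge cases can be illustrated with a short Python sketch. Several details are assumptions made for the example rather than facts from the slides: each window is represented as a set of point indices, the two overlap proportions are averaged before comparison, the smaller window is the one deleted, and `theta_s = 0.9` is a hypothetical "large overlap" threshold. Only `theta_m = 0.1` matches the default merging threshold listed in the deck.

```python
from typing import Dict, List, Set


def merge_windows(windows: List[Set[int]],
                  theta_s: float = 0.9,   # hypothetical "large overlap" cutoff
                  theta_m: float = 0.1) -> List[int]:
    """Return one cluster label per window; deleted windows get label -1."""
    n = len(windows)
    alive = [True] * n
    parent = list(range(n))  # union-find forest over windows

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if not (alive[i] and alive[j]):
                continue
            shared = len(windows[i] & windows[j])
            if shared == 0:
                continue  # windows do not overlap
            # Proportion of overlap for each window, then their mean (assumed).
            mean_overlap = (shared / len(windows[i])
                            + shared / len(windows[j])) / 2
            if mean_overlap >= theta_m:
                parent[find(i)] = find(j)  # cases a and b: same cluster
            if mean_overlap >= theta_s:
                # Case a: near-duplicate windows, drop the smaller one.
                smaller = i if len(windows[i]) <= len(windows[j]) else j
                alive[smaller] = False
            # Case c (mean_overlap < theta_m): leave as separate clusters.

    labels: List[int] = []
    root_to_label: Dict[int, int] = {}
    for k in range(n):
        if not alive[k]:
            labels.append(-1)
        else:
            labels.append(root_to_label.setdefault(find(k), len(root_to_label)))
    return labels
```

The number of distinct non-negative labels in the result is the number of clusters, which is how merging lets the cluster count emerge from the data instead of being fixed in advance.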

5 Experimental Setup
Leukemia dataset – well characterized
Default UKW parameters used
Supervised dimension reduction
– Two previously published gene subsets and their union
Unsupervised dimension reduction
– Biclustering with UKW
– PCA
– PCA and UKW hybrid

6 Supervised Feature Selection
Use two gene subsets that were selected with supervised techniques in previously published papers.
All algorithms performed best on the combined set; results below.

7 Unsupervised Feature Selection (Biclustering Technique)
Apply UKW to cluster the genes, then select the one gene closest to each cluster center as that cluster's representative.
Apply UKW to the samples, using only those representative genes (239).
UKW accuracy: 93.6% (ALL) and 76% (AML)
No results reported for other algorithms
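The representative-selection step can be sketched as follows. This is an illustration only: the gene names, the dictionary layout, and the use of Euclidean distance to the cluster mean are assumptions, and the cluster labels are taken as given (in the slides they come from UKW).

```python
import math
from typing import Dict, List, Sequence


def representative_genes(profiles: Dict[str, Sequence[float]],
                         labels: Dict[str, int]) -> Dict[int, str]:
    """For each gene cluster, return the gene closest to the cluster center."""
    clusters: Dict[int, List[str]] = {}
    for gene, label in labels.items():
        clusters.setdefault(label, []).append(gene)

    reps: Dict[int, str] = {}
    for label, genes in clusters.items():
        dim = len(profiles[genes[0]])
        # Cluster center = per-sample mean of the member expression profiles.
        center = [sum(profiles[g][d] for g in genes) / len(genes)
                  for d in range(dim)]
        reps[label] = min(genes, key=lambda g: math.dist(profiles[g], center))
    return reps


# Hypothetical toy data: six genes measured in two samples.
profiles = {"g1": (1.0, 1.0), "g2": (2.0, 2.0), "g3": (3.0, 3.5),
            "g4": (10.0, 0.0), "g5": (12.0, 0.0), "g6": (11.2, 0.0)}
labels = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1, "g6": 1}
selected = representative_genes(profiles, labels)
```

Keeping one real gene per cluster, rather than the cluster mean itself, means the reduced feature set still consists of measured genes that can be interpreted biologically.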

8 Unsupervised Feature Selection (PCA Techniques)
PCA and scree plot to reduce features
– Poor performance
Hybrid PCA and UKW method
– Partition genes using UKW
– Transform each partition using PCA
– Select representative factors from each cluster
– UKW accuracy: 97.87% (ALL) and 88% (AML)

9 UKW Results Summary
Dataset                               ALL Accuracy   AML Accuracy
Published Gene Subsets (Supervised)   90%            100%
UKW Biclustering (Unsupervised)       93.6%          76%
PCA (Unsupervised)                    N/A            N/A
PCA-UKW Hybrid (Unsupervised)         97.87%         88%

11 Default parameters
– initial window size a = 5
– enlargement threshold θe = 0.8
– merging threshold θm = 0.1
– coverage threshold θc = 0.2
– variability threshold θv = 0.02

