MicroArray Data Analysis Candice Quadros & Amol Kothari.


1 MicroArray Data Analysis Candice Quadros & Amol Kothari

2 Neural Network for classification
Harnessing the power of a neural network to classify samples.

3 Neural Network for classification
Reduce the no. of genes:
- We have to reduce the data dimensionality, i.e. reduce the number of genes considered.
- PCA can be used to select the most informative genes, but computing the eigenvectors is expensive for high-dimensional data.
- Instead, use the method suggested by Golub et al. to obtain the informative genes.
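The gene-selection step can be sketched in Python. This is a minimal sketch, assuming the signal-to-noise score of Golub et al., P(g) = (mean1 - mean0) / (std0 + std1), computed per gene. Ranking by absolute score, the function names, and the synthetic data are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def golub_scores(X, y, eps=1e-12):
    """Signal-to-noise score per gene. X: samples x genes, y: 0/1 class labels."""
    X0, X1 = X[y == 0], X[y == 1]
    mean_diff = X1.mean(axis=0) - X0.mean(axis=0)
    std_sum = X0.std(axis=0, ddof=1) + X1.std(axis=0, ddof=1)
    return mean_diff / (std_sum + eps)  # eps guards against zero variance

def select_informative(X, y, k):
    """Indices of the k genes with the largest absolute score (a simplification)."""
    scores = golub_scores(X, y)
    return np.argsort(-np.abs(scores))[:k]

# Synthetic example (hypothetical data): 20 samples, 50 genes,
# with genes 3 and 7 made strongly class-dependent.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
y = np.array([0] * 10 + [1] * 10)
X[y == 1, 3] += 8.0
X[y == 1, 7] -= 8.0

scores = golub_scores(X, y)
selected = select_informative(X, y, k=2)
print(sorted(selected.tolist()))
```

Golub et al. pick equal numbers of genes from the positive and negative ends of the ranking; sorting by absolute value is used here only to keep the sketch short.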

4 Neural Network for classification
Steps in classification:
- Obtain the informative genes using Golub's method.
- Normalize each gene by subtracting its mean and dividing by its standard deviation.
- Train the neural network on the training data and targets to obtain the weights.
- Classify the test data using the weights obtained above.
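The normalize, train, and classify steps above can be sketched as follows. The slides do not give the network architecture or training procedure, so everything here is an assumption for illustration: one tanh hidden layer, a sigmoid output, full-batch gradient descent on cross-entropy, and synthetic two-class data. Applying the training-set mean and standard deviation to the test set is also a design choice of this sketch.

```python
import numpy as np

def zscore_fit(X_train):
    """Per-gene mean and standard deviation, estimated on the training data."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant genes
    return mean, std

def train_mlp(X, y, n_hidden=3, lr=0.5, epochs=1000, seed=0):
    """One-hidden-layer network trained by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, (X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, n_hidden)
    b2 = 0.0
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                   # hidden activations
        p = 1.0 / (1.0 + np.exp(-(H @ W2 + b2)))   # predicted P(class 1)
        err = (p - y) / len(y)                     # d(mean cross-entropy)/d(logit)
        dH = np.outer(err, W2) * (1.0 - H ** 2)    # backprop through tanh
        gW2, gb2 = H.T @ err, err.sum()
        gW1, gb1 = X.T @ dH, dH.sum(axis=0)
        W1 -= lr * gW1
        b1 -= lr * gb1
        W2 -= lr * gW2
        b2 -= lr * gb2
    return W1, b1, W2, b2

def predict(X, params):
    W1, b1, W2, b2 = params
    return (np.tanh(X @ W1 + b1) @ W2 + b2 > 0).astype(int)

# Hypothetical data: 5 selected genes, class 1 shifted upward.
rng = np.random.default_rng(1)
y_train = np.array([0, 1] * 20)
y_test = np.array([0, 1] * 10)
X_train = rng.normal(size=(40, 5)) + 3.0 * y_train[:, None]
X_test = rng.normal(size=(20, 5)) + 3.0 * y_test[:, None]

mean, std = zscore_fit(X_train)
params = train_mlp((X_train - mean) / std, y_train)
accuracy = (predict((X_test - mean) / std, params) == y_test).mean()
print(accuracy)
```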

5 Neural Network for classification
Results obtained:
Inform. genes   No. of hidden units   NN accuracy (%)   Golub accuracy (%)
100             3                     70.55             61.76
200             13                    76.74             58.82

6 Hierarchical Merging: When to stop?
Question: When to stop the merging?
Suggested solutions:
- Diameter(C) ≤ MaxD
- Avg(sim(O_i, O_j)) ≥ a similarity threshold, for O_i, O_j ∈ C
These parameters are difficult to estimate in high dimensions.

7 Hierarchical Merging: When to stop?
Another solution: stop merging when m clusters are present.
Problem: the m clusters might include single-point clusters.
Use the concept of MinPts (from DBSCAN): a set of points is a significant cluster only if it contains at least MinPts points.
Stop when there are m significant clusters.
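The MinPts stopping rule can be sketched as below. The slides do not specify the linkage or the distance, so average linkage with Euclidean distance is assumed. Because the number of significant clusters starts at zero, rises, and then falls as merging proceeds, this sketch records the count after every merge and keeps the clustering from the last merge at which exactly m significant clusters exist. That choice, and all function names, are assumptions.

```python
import numpy as np

def merge_with_minpts(points, m, min_pts):
    """Agglomerative merging; returns (clusters at the chosen stop, count history)."""
    points = np.asarray(points, dtype=float)
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    clusters = [[i] for i in range(len(points))]
    history, chosen = [], None
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest average pairwise distance.
        best_pair, best_d = None, np.inf
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = dist[np.ix_(clusters[a], clusters[b])].mean()
                if d < best_d:
                    best_pair, best_d = (a, b), d
        a, b = best_pair
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
        # A cluster is significant only if it holds at least min_pts points.
        n_significant = sum(len(c) >= min_pts for c in clusters)
        history.append(n_significant)
        if n_significant == m:
            chosen = [list(c) for c in clusters]
    return chosen, history

# Hypothetical data: two well-separated groups of five points each.
group = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5]])
points = np.vstack([group, group + 10.0])

chosen, history = merge_with_minpts(points, m=2, min_pts=3)
significant = sorted(sorted(c) for c in chosen if len(c) >= 3)
print(significant)
print(history)
```

The `history` list is the quantity one would plot against the number of iterations to inspect how the count of significant clusters evolves.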

8 [Figure: plot with axes "No. of iterations" and "No. of Significant Clusters"]

9 Visualization of data: Vizstruct

10 Visualization of data: Vizstruct
Equation used: [equation image]
How do we weigh each dimension, i.e. how do we select λ?
- Default value = 0.5
- Use the eigenvalue of each dimension to obtain the value of λ.

11 Visualization of data: Vizstruct
Steps for visualization:
- Project the data into eigenspace.
- Take the eigenvalue of each dimension i as λ_i.
- Use the same formula as before to calculate the 2D point: [equation image], where λ_i is the eigenvalue of the i-th dimension.
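The visualization steps above can be sketched as follows. The projection formula itself is not reproduced in this transcript, so the sketch assumes a first-Fourier-harmonic style mapping in which dimension i contributes λ_i times its value along the unit-circle direction exp(2πi·i/n). That mapping, the normalization of eigenvalues to sum to 1, and the function names are all assumptions, not the presenters' exact method.

```python
import numpy as np

def project_to_eigenspace(X):
    """PCA projection; returns projected data and eigenvalues (largest first)."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    order = np.argsort(eigvals)[::-1]
    eigvals = np.clip(eigvals[order], 0.0, None)  # remove tiny negative round-off
    return Xc @ eigvecs[:, order], eigvals

def weighted_harmonic_2d(Z, weights):
    """Map each row of Z to a 2D point using per-dimension weights (assumed formula)."""
    n = Z.shape[1]
    directions = np.exp(2j * np.pi * np.arange(n) / n)  # n points on the unit circle
    F = (Z * weights) @ directions
    return np.column_stack([F.real, F.imag])

# Uniform default weight of 0.5 on a single unit vector (easy to check by hand).
default_point = weighted_harmonic_2d(np.array([[1.0, 0.0, 0.0, 0.0]]), 0.5)

# Eigenvalue weighting on hypothetical data with unequal variance per dimension.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4)) * np.array([5.0, 2.0, 1.0, 0.5])
Z, eigvals = project_to_eigenspace(X)
weights = eigvals / eigvals.sum()  # assumed normalization of λ_i
points_2d = weighted_harmonic_2d(Z, weights)
print(points_2d[:3])
```

With eigenvalue weights, the high-variance directions dominate the 2D position, whereas the uniform default treats every dimension equally.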

12 Visualization of data: Vizstruct
Results:
- The visualization obtained by this method is more representative of the data than the Vizstruct visualization.
- Demo

13 Thank You

