Peter Scully Investigating Rough Set Feature Selection for Gene Expression Analysis.


1 Peter Scully pds7@aber.ac.uk Investigating Rough Set Feature Selection for Gene Expression Analysis

2 Project Summary
Feature Selection
  Rough Set Feature Selection: using RSAR/QuickReduct
Data
  Gene Expression Datasets
  Large, Continuous: U ~100, A >~22,000 and <~55,000
Weka
  Development
  Experimentation
  RSARSubsetEval
  ConsistencySubsetEval

3 Content
Background
  Introduction to Rough Set Theory
  Feature Selection, Rough Set Feature Selection
  Analysis of Gene Expression Data
Demonstration
  Dissertation Approach (RSAR)
  Install Package - RSARSubsetEval
  Experiment on Dataset
Critical Discussion
  Dissertation – New Approach
  RSCTC 2010 – Better Approaches

4 Introduction to Rough Set Theory
Indiscernibility
Lower/Upper Approximations, Regions
Reducts
Figure 1 – RST Regions
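The concepts listed on this slide can be illustrated with a minimal sketch. It assumes a small, already discretised decision table; the table, the attribute names `a`, `b`, `d` and the helper names `partition` and `approximations` are hypothetical and do not come from the presentation. Objects with identical values on the chosen attributes are indiscernible and fall into the same equivalence class; the lower approximation collects the classes that lie entirely inside the target set, and the upper approximation collects the classes that overlap it.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group object indices into equivalence classes of IND(attrs)."""
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        # Objects with the same values on attrs are indiscernible.
        classes[tuple(row[a] for a in attrs)].add(i)
    return list(classes.values())


def approximations(rows, attrs, target):
    """Return (lower, upper) approximations of the object set `target`."""
    lower, upper = set(), set()
    for block in partition(rows, attrs):
        if block <= target:   # class lies entirely inside the target
            lower |= block
        if block & target:    # class overlaps the target
            upper |= block
    return lower, upper


# Hypothetical toy decision table: conditional attributes a, b; decision d.
rows = [
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 0, "b": 1, "d": "no"},
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 1, "d": "no"},
]
target = {i for i, r in enumerate(rows) if r["d"] == "yes"}
lower, upper = approximations(rows, ["a", "b"], target)
boundary = upper - lower  # objects that cannot be classified with certainty
```

In this toy table the first two objects are indiscernible but carry different decisions, so they end up in the boundary region rather than in the lower approximation.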

5 Feature Selection, Rough Set Feature Selection
Feature Selection (FS)
Rough Set FS Approaches:
  Lower Approximations (RSAR/QuickReduct)
  Relative Dependency, Dynamic Reduct, ...
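For orientation, the lower-approximation approach is usually stated in terms of the positive region and the dependency degree. The definitions below are the standard ones from the rough set literature rather than text from the slides; the symbols P (a subset of conditional attributes) and Q (the decision attributes) are introduced here as assumptions, while U is the set of objects as on the Project Summary slide.

```latex
\[
\mathrm{POS}_P(Q) \;=\; \bigcup_{X \in U/Q} \underline{P}X,
\qquad
\gamma_P(Q) \;=\; \frac{\lvert \mathrm{POS}_P(Q) \rvert}{\lvert U \rvert}
\]
```

Here \underline{P}X is the lower approximation of X with respect to P, and U/Q is the partition of U induced by Q. A value of 1 means that Q depends totally on P; a value of 0 means that no object can be classified with certainty from P alone.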

6 Analysis of Gene Expression Data
Expressed Genes, Gene Products and Gene Expression
“Wet Lab” Experimentation Process
Data Collection of Gene Expression Quantifications
Dataset Collation
Figure 2 – Microarray Hybridisation

7 Demonstration
Dissertation Approach (RSAR)
Install Package - RSARSubsetEval
Experimentation on Dataset
  UCI diabetes.arff
  RF_diabetes.arff – Relief-F Ranked
  RSCTC2010 - Reduced
    RF500_data3.arff – Relief-F Ranked and reduced (from 22k) to 500 attributes.
      3 minutes (RSARSubsetEval)
      13 minutes (ConsistencySubsetEval)

8 Approach in Dissertation
Rough Set Attribute Reduction – QuickReduct
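The slide names QuickReduct without showing it, so the following is a minimal sketch of the greedy idea as it is commonly described, not the dissertation's Weka implementation. It assumes discretised data held as a list of dicts; the gene names `g1` to `g3`, the decision name `cls` and the function names are hypothetical. Starting from an empty subset, it repeatedly adds the attribute that yields the highest dependency degree (the fraction of objects whose equivalence class is pure in the decision) until the subset matches the dependency of the full attribute set.

```python
from collections import defaultdict


def dependency(rows, attrs, decision):
    """Fraction of objects whose IND(attrs) class has a single decision value."""
    if not rows:
        return 0.0
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[a] for a in attrs)].append(row[decision])
    consistent = sum(len(v) for v in classes.values() if len(set(v)) == 1)
    return consistent / len(rows)


def quickreduct(rows, conditional, decision):
    """Greedy forward selection on the dependency degree (a sketch)."""
    reduct = []
    best = dependency(rows, reduct, decision)
    full = dependency(rows, conditional, decision)
    while best < full:
        gains = {
            a: dependency(rows, reduct + [a], decision)
            for a in conditional
            if a not in reduct
        }
        # Assumption: on ties or no improvement the first best attribute is
        # still added, so the loop always terminates once all are included.
        candidate = max(gains, key=gains.get)
        reduct.append(candidate)
        best = gains[candidate]
    return reduct


# Hypothetical toy table in which g2 alone determines the class.
rows = [
    {"g1": 0, "g2": 0, "g3": 1, "cls": "A"},
    {"g1": 0, "g2": 1, "g3": 1, "cls": "B"},
    {"g1": 1, "g2": 0, "g3": 0, "cls": "A"},
    {"g1": 1, "g2": 1, "g3": 0, "cls": "B"},
]
selected = quickreduct(rows, ["g1", "g2", "g3"], "cls")
```

Each pass evaluates every remaining attribute, so the cost grows quickly with the number of attributes, which is presumably one reason the demonstration first reduces the data to 500 Relief-F ranked attributes.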

9 Critical Discussion
Dissertation - New Approach
  Entropy attribute ranking at subset evaluation
RSCTC 2010 - Better Approaches
  Cut Point Importance/Dynamic Clustering
  Attribute Relevance
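The slide does not say how the entropy ranking is computed, so the sketch below is only one plausible reading: attributes are ordered by the conditional entropy of the decision given each attribute, lowest first, so that a subset evaluator could try the most informative attributes early. The function names, the attribute names and the toy table are all hypothetical, and the data is assumed to be discretised.

```python
import math
from collections import Counter, defaultdict


def conditional_entropy(rows, attr, decision):
    """H(decision | attr) in bits for a discretised attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attr]].append(row[decision])
    total = len(rows)
    entropy = 0.0
    for labels in groups.values():
        counts = Counter(labels)
        group_entropy = -sum(
            (c / len(labels)) * math.log2(c / len(labels))
            for c in counts.values()
        )
        # Weight each group's entropy by its share of the objects.
        entropy += (len(labels) / total) * group_entropy
    return entropy


def rank_by_entropy(rows, attrs, decision):
    """Order attributes from most to least informative (lowest entropy first)."""
    return sorted(attrs, key=lambda a: conditional_entropy(rows, a, decision))


# Hypothetical toy table in which g2 alone determines the class.
rows = [
    {"g1": 0, "g2": 0, "g3": 1, "cls": "A"},
    {"g1": 0, "g2": 1, "g3": 1, "cls": "B"},
    {"g1": 1, "g2": 0, "g3": 0, "cls": "A"},
    {"g1": 1, "g2": 1, "g3": 0, "cls": "B"},
]
ranking = rank_by_entropy(rows, ["g1", "g2", "g3"], "cls")
```

Because Python's sort is stable, attributes with equal entropy keep their original order.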

10 References
Figure 1 from: Rough set-aided keyword reduction for text categorization. Alexios Chouchoulas; Qiang Shen. Applied Artificial Intelligence: An International Journal, 1087-6545, Volume 15, Issue 9, 2001, Pages 843-873.
Figure 2 by “User:Squidonius”, http://upload.wikimedia.org/wikipedia/en/e/e8/Microarray_exp_horizontal.svg [Public Domain License].
Rough Sets. Zdzisław Pawlak. International Journal of Parallel Programming, Volume 11, Number 5, 1982, Pages 341-356.
See Dissertation for references of: RSAR/QuickReduct, Relative Dependency, Dynamic Reducts, RSCTC2010, UCI, Weka, Consistency Feature Selection etc.

