Stanford University Boolean Analysis of Large Gene-expression Datasets Debashis Sahoo PhD Candidate, Electrical Engineering Joint work with David Dill,


1 Stanford University Boolean Analysis of Large Gene-expression Datasets Debashis Sahoo PhD Candidate, Electrical Engineering Joint work with David Dill, Andrew Gentles, Rob Tibshirani, Sylvia Plevritis

2 Outline: Standard microarray workflow; Data collection and preprocessing; Boolean analysis; Biological insights; Conclusion and future work

3 Microarray Work Flow: mRNA → Hybridization → Scanning → Image processing → Normalization → Data analysis

4 Data Collection: Thousands of microarrays are freely available from public repositories: GEO, ArrayExpress, SMD, Celsius.

5 Preprocessing: Gather the original raw CEL files for a single platform (typically 2,000 to 11,000 CEL files). Use RMA to normalize the CEL files, which generates an expression value for each probeset. At this scale a memory-efficient algorithm is needed.

6 Existing Methods: Correlation analysis; Conditional probability; Mutual information

7 Boolean Analysis: Get raw data → Normalize → Determine thresholds → Discover Boolean relationships → New biology

8 Example

9 Determine threshold: Sort the expression values of the gene. Use StepMiner to determine the threshold.
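The thresholding step can be illustrated with a minimal sketch. The slide does not spell out StepMiner's internals, so the code below is only an assumption of how a step-based threshold might work: it fits a single rising step to the sorted values by least squares and takes the midpoint of the two fitted levels as the threshold. The function name `fit_one_step` and the midpoint rule are hypothetical choices for illustration, not the StepMiner implementation.

```python
def fit_one_step(values):
    """Fit one rising step to the sorted values by least squares.

    Returns (threshold, split_index). Taking the threshold as the midpoint
    between the low-segment mean and the high-segment mean is an assumption
    of this sketch, not a documented StepMiner rule.
    """
    xs = sorted(values)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two expression values")

    # Prefix sums let each candidate split be scored in constant time.
    prefix = [0.0]
    prefix_sq = [0.0]
    for x in xs:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    best = None
    for k in range(1, n):  # low segment is xs[:k], high segment is xs[k:]
        low_sum = prefix[k]
        high_sum = prefix[n] - prefix[k]
        low_sq = prefix_sq[k]
        high_sq = prefix_sq[n] - prefix_sq[k]
        # Sum of squared errors of each segment around its own mean.
        sse = (low_sq - low_sum ** 2 / k) + (high_sq - high_sum ** 2 / (n - k))
        if best is None or sse < best[0]:
            best = (sse, k, low_sum / k, high_sum / (n - k))

    _, k, low_mean, high_mean = best
    return (low_mean + high_mean) / 2.0, k


if __name__ == "__main__":
    threshold, split = fit_one_step([1.0, 1.2, 0.8, 5.0, 5.2, 4.8])
    print(threshold, split)
```

For a gene with a clear low group and a clear high group, the fitted step falls in the gap between them. For a gene whose values rise smoothly, the best split is poorly determined, which is consistent with the difficulty noted on the next slide.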

10 Determine threshold: It's hard to determine a threshold for this gene. StepMiner usually puts the threshold in the middle in this case.

11 Discover Boolean Relationships: Analyze the scatter plot between two genes. Use the two thresholds to divide the space into four regions (quadrants). Determine which quadrants are sparse. Determine the Boolean relationship from the sparse quadrants. [Figure: WNT5A versus PAX5 scatter plot with quadrants numbered 0 to 3; label: WNT5A high, PAX5 low.]
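As a rough sketch of the quadrant step, the code below counts the samples that fall in each of the four regions defined by the two thresholds. The function name `quadrant_counts` and the key convention (first index for gene A, second for gene B, 0 for low and 1 for high) are assumptions made for illustration; the slide numbers its quadrants 0 to 3 in the figure instead.

```python
def quadrant_counts(a_values, b_values, a_threshold, b_threshold):
    """Count samples per quadrant of the A-versus-B scatter plot.

    Keys are (a_level, b_level) with 0 = low and 1 = high, so (1, 0) holds
    the samples where A is high and B is low. Values equal to a threshold
    are treated as low, which is an arbitrary choice of this sketch.
    """
    if len(a_values) != len(b_values):
        raise ValueError("both genes need one value per sample")
    counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
    for a, b in zip(a_values, b_values):
        counts[(int(a > a_threshold), int(b > b_threshold))] += 1
    return counts


if __name__ == "__main__":
    a = [1.0, 1.0, 5.0, 5.0, 5.0]
    b = [1.0, 5.0, 5.0, 5.0, 1.0]
    print(quadrant_counts(a, b, a_threshold=3.0, b_threshold=3.0))
```

A quadrant whose count is far below what the marginal counts would suggest is a candidate sparse quadrant.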

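12 Statistical Tests: Compute the expected number of points in a quadrant under the independence model. Compute the maximum likelihood estimate of the error rate. $\text{statistic} = \dfrac{\text{expected} - \text{observed}}{\sqrt{\text{expected}}}$, $\text{error rate} = \dfrac{1}{2}\left(\dfrac{a_{00}}{a_{00} + a_{01}} + \dfrac{a_{00}}{a_{00} + a_{10}}\right)$, where $a_{00}$, $a_{01}$, $a_{10}$, $a_{11}$ are the numbers of points in the four quadrants.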
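The two tests can be sketched for the a00 quadrant as follows. The slide does not state the expected count explicitly; the sketch assumes the usual independence estimate, (a00 + a01)(a00 + a10) / total. The function name `sparsity_test` and any cutoffs applied to its results are illustrative assumptions, not values given on the slide.

```python
import math


def sparsity_test(a00, a01, a10, a11):
    """Return (statistic, error_rate) for the a00 quadrant.

    expected is the count predicted for a00 if the two genes were
    independent (an assumption of this sketch); observed is a00 itself.
    """
    total = a00 + a01 + a10 + a11
    row = a00 + a01  # marginal count of samples sharing the first index
    col = a00 + a10  # marginal count of samples sharing the second index
    if row == 0 or col == 0:
        raise ValueError("marginal counts must be non-zero")
    expected = row * col / total
    observed = a00
    statistic = (expected - observed) / math.sqrt(expected)
    error_rate = 0.5 * (a00 / row + a00 / col)
    return statistic, error_rate


if __name__ == "__main__":
    # 400 samples, only 2 of them in the tested quadrant.
    print(sparsity_test(2, 98, 98, 202))
```

A large statistic together with a small error rate suggests the quadrant is sparse. The other three quadrants can be tested by relabeling the counts so that the quadrant of interest plays the role of a00.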

13 Boolean Relationships: Tightly co-regulated genes form two sparse quadrants. There are six possible Boolean relationships: Equivalent; Opposite; A low ⇒ B low; A low ⇒ B high; A high ⇒ B low; A high ⇒ B high.
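One way to read the six relationships is as a lookup from the set of sparse quadrants to a relationship name: one sparse quadrant gives an implication, and two diagonally opposite sparse quadrants give a symmetric relationship. The sketch below encodes that mapping. The quadrant names ("a00" for A low and B low, "a01" for A low and B high, "a10" for A high and B low, "a11" for A high and B high) and the function name `classify` are assumptions made for illustration.

```python
# Each key is the set of sparse quadrants; names follow the convention
# first index = gene A, second index = gene B, 0 = low, 1 = high.
RELATIONSHIPS = {
    frozenset({"a01", "a10"}): "equivalent",
    frozenset({"a00", "a11"}): "opposite",
    frozenset({"a01"}): "A low => B low",
    frozenset({"a00"}): "A low => B high",
    frozenset({"a11"}): "A high => B low",
    frozenset({"a10"}): "A high => B high",
}


def classify(sparse_quadrants):
    """Map a collection of sparse quadrant names to a relationship.

    Returns None when the pattern matches none of the six relationships,
    for example when no quadrant is sparse.
    """
    return RELATIONSHIPS.get(frozenset(sparse_quadrants))


if __name__ == "__main__":
    print(classify(["a11"]))
    print(classify(["a01", "a10"]))
    print(classify([]))
```

The mapping follows from logic alone: if the quadrant "A high and B low" is empty, every sample with A high must have B high, which is the implication A high ⇒ B high.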

14 Boolean Relationships: [Figure: example scatter plots. Symmetric: Equivalent, Opposite. Asymmetric: PTPRC low ⇒ CD19 low; XIST high ⇒ RPS4Y1 low; COL3A1 high ⇒ COL1A1 high; FAM60A low ⇒ NUAK1 high.]

15 Boolean Implication Network: Directed graph. Nodes: two for each gene A, A high and A low. Edges: one for each implication; for example, A high ⇒ B low gives an edge from A high to B low. [Figure: example network on the nodes A high, A low, B high, B low, C high, C low.]
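The network described here can be sketched as an adjacency map with two nodes per gene and one directed edge per implication. The function name `build_network` and the tuple representation of nodes are hypothetical choices for illustration; the slide does not describe a concrete data structure.

```python
def build_network(genes, implications):
    """Build a directed graph of Boolean implications.

    Each gene contributes two nodes, (gene, "high") and (gene, "low").
    Each implication is a pair (source_node, target_node), so the
    implication A high => B low is (("A", "high"), ("B", "low")).
    Returns a dict mapping every node to the set of nodes it points to.
    """
    graph = {(g, level): set() for g in genes for level in ("high", "low")}
    for source, target in implications:
        if source not in graph or target not in graph:
            raise KeyError("implication refers to an unknown gene or level")
        graph[source].add(target)
    return graph


if __name__ == "__main__":
    network = build_network(
        ["A", "B", "C"],
        [
            (("A", "high"), ("B", "low")),
            (("A", "low"), ("C", "high")),
        ],
    )
    for node in sorted(network):
        print(node, "->", sorted(network[node]))
```

Storing successors in sets keeps duplicate implications from creating duplicate edges. Whether to also store the contrapositive of each implication (B high ⇒ A low for A high ⇒ B low) is a design choice left out of this sketch.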

16 New Biology: This slide is under construction!!

17 Biological Insights: Gender; Organ; Tissue; Development; Differentiation; Co-expression

18 Example Application: Immunology, B cell differentiation. Goal: discover genes that mark unique B cell precursors.

19 Differentiation Tree: Hematopoietic stem cell differentiation is a tree. Root: HSC. Leaves: Lymphocytes (B cell, T cell, NK cell, Dendritic cell); Erythrocytes; Granulocytes (Basophil, Neutrophil, Eosinophil); Monocytes (Dendritic cell); Thrombocytes.

20 [Figure: genes along B cell differentiation in the order KIT, A, B, B220, CD19, with the labels KIT high, A high, B low, B220 low, CD19 low; highlighted pair: A high, B low.]

21 Conclusion: Boolean analysis: relationships are directly visible on the scatter plot; it enables discovery of asymmetric relationships; it follows biology; it has potential application to immunology. Future work: cancer progression; new biology.

22 Acknowledgements: The Felsher Lab: Natalie Wu, Cathy Shachaf, Dean Felsher. Funding: ICBP Program (NIH grant: 5U56CA112973-02). Leonore A Herzenberg, James Brooks, Joe Lipsick, Gavin Sherlock, Howard Chang, Stuart Kim.

23 The END

