Microarray analysis: The CCBR’s perspective. Manjula Kasoji, CCBR, 09/29/2014


1 Microarray analysis: The CCBR’s perspective
Manjula Kasoji, CCBR, 09/29/2014

2 Common pitfalls
– Number of replicates
– Source, quantity and quality of RNA
– Batch effects
– Adequate expression signal
– Time series experiments
– Non-target tissue contamination

3 No Replicates, No Statistics
A project with no replicates may give you some information, but it is not possible to do statistics on it.
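One way to see why a single sample per group rules out statistics: the within-group variance, which t-type tests rely on, cannot be estimated from one observation. The following is a minimal sketch using Python's standard library; the expression values and the helper name `within_group_variance` are made up for illustration.

```python
import statistics

# Hypothetical log2 expression values for one gene (illustrative only).
control_single = [7.9]                 # one array per group: no replicates
control_triplicate = [7.9, 8.3, 8.0]   # three biological replicates


def within_group_variance(values):
    """Return the sample variance, or None when it cannot be estimated."""
    try:
        return statistics.variance(values)
    except statistics.StatisticsError:
        # Fewer than two observations: variability cannot be estimated.
        return None


print(within_group_variance(control_single))       # None
print(within_group_variance(control_triplicate))   # a positive number
```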

4 How many biological replicates are needed in a quantitative study?
Replicate numbers shown on the slide: >= 3 replicates, >= 4-5 replicates, >= 7 replicates
More replicates if:
– High biological variability
– Contamination by non-target tissues
– Subtle treatment effect
– Multiple treatments
– Mechanism of action
– Network analysis
– And many more…
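The link between variability, effect size, and replicate number can be sketched with the textbook normal-approximation sample size formula for comparing two group means, n = 2 * ((z_alpha + z_power) * sd / effect)^2 per group. This is not how the replicate numbers on the slide were derived; the function name and the example standard deviations are assumptions for illustration, and the approximation ignores multiple testing across genes.

```python
import math
from statistics import NormalDist


def replicates_per_group(sd, effect, alpha=0.05, power=0.8):
    """Normal-approximation sample size for comparing two group means.

    sd     : assumed within-group standard deviation (log2 scale)
    effect : smallest difference in means worth detecting (log2 scale)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided test
    z_power = NormalDist().inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / effect) ** 2
    return math.ceil(n)


# A 2-fold change is a difference of 1 on the log2 scale.
for sd in (0.5, 1.0):
    print(f"sd={sd}: {replicates_per_group(sd, effect=1.0)} replicates per group")
```

Doubling the assumed standard deviation roughly quadruples the required replicates, which is why high biological variability and subtle treatment effects both push the number up.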

5 Depending on the source of RNA, samples sometimes do not cluster well even with the required number of replicates
Embryonic tissue
Knock-out efficiency may also play a role in good sample clustering.
Quality, quantity and source of RNA influence sample clustering.
[Figure: sample clustering. Groups: Restoration of Knockout 1, Knockout 1, Knockout 2, Control]

6 Randomization and consistent processing will help avoid batch effects
[Figure: experimental design diagram. Levels: Treatment, Biological Replicate, Technical Replicate, Array Batch (A, A, B, B)]
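Randomized batch assignment can be sketched as follows: shuffle the samples within each treatment group, then deal them out across batches so every batch holds a balanced mix of treatments. This is one simple scheme under stated assumptions, not a procedure prescribed in the slides; the function name and sample IDs are hypothetical.

```python
import random
from collections import defaultdict


def assign_batches(samples, n_batches, seed=0):
    """Spread each treatment group evenly across batches in random order.

    samples : list of (sample_id, treatment) tuples
    Returns a dict mapping sample_id -> batch index (0-based).
    """
    rng = random.Random(seed)  # fixed seed so the layout can be reproduced
    by_treatment = defaultdict(list)
    for sample_id, treatment in samples:
        by_treatment[treatment].append(sample_id)

    assignment = {}
    for treatment in sorted(by_treatment):
        ids = by_treatment[treatment]
        rng.shuffle(ids)                    # random order within the group
        offset = rng.randrange(n_batches)   # random starting batch
        for i, sample_id in enumerate(ids):
            assignment[sample_id] = (offset + i) % n_batches
    return assignment


# Hypothetical design: 4 treated and 4 control samples, 2 array batches.
samples = [(f"treated_{i}", "treated") for i in range(1, 5)]
samples += [(f"control_{i}", "control") for i in range(1, 5)]

layout = assign_batches(samples, n_batches=2)
for sample_id, batch in sorted(layout.items()):
    print(sample_id, "-> batch", "AB"[batch])
```

With this layout, treatment and batch are never confounded: each batch receives two treated and two control samples.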

7 Example of batch effect
[Figure: samples from Cell line A, Cell line B, Cell line C, Cell line D, Cell line E; Treated 1, Control 1, Treated 2, Control 2]
Batch 1 – Scan Date 02/22/2011
Batch 2 – Scan Date 08/12/2011

8 Batch effects can be visualized via clustering as well
[Figure: clustering of samples. Labels: Batch A, Batch B]
Summary:
– Batch effects can be avoided by good experimental design and randomization.
– Batch effects can be visualized on a PCA plot and by clustering.

9 Weak expression signal across samples confounds analysis results
Genes regulated by gene A induced upon DNA damage; 4 different conditions
[Figure: Group 1, Group 2, Group 3, Group 4. Poor clustering of samples]

10 Weak signal leads to very few or no significant, differentially expressed genes
Contrasts              DEGs (FDR 0.05, FC 2)
Group 1 vs. Group 2    0
Group 3 vs. Group 4    0
Group 1 vs. Group 3    0
Group 2 vs. Group 4    0
What can we do in this situation?
Relax the statistical parameters (use less stringent p-value and FC thresholds). The caveat is that this will increase the number of false positives and will negatively influence downstream analysis.
Summary:
– Sufficient number of replicates
– Randomization
– Validation
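The effect of the DEG thresholds can be illustrated with a short sketch that applies an FDR cutoff and a fold-change cutoff together, then repeats the count with relaxed values. The slides do not say which FDR procedure was used; Benjamini-Hochberg is assumed here, and the gene names, p-values, and fold changes are invented for illustration.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def count_degs(genes, fdr_cutoff, fc_cutoff):
    """Count genes passing both the FDR and absolute fold-change cutoffs.

    genes : list of (name, p_value, log2_fold_change)
    """
    adj = benjamini_hochberg([p for _, p, _ in genes])
    log2_cutoff = math.log2(fc_cutoff)
    return sum(1 for (_, _, lfc), q in zip(genes, adj)
               if q <= fdr_cutoff and abs(lfc) >= log2_cutoff)


# Hypothetical results for six genes: (name, raw p-value, log2 fold change).
genes = [("g1", 0.001, 1.4), ("g2", 0.012, -1.1), ("g3", 0.03, 0.8),
         ("g4", 0.04, -0.7), ("g5", 0.2, 0.3), ("g6", 0.7, 0.1)]

print("FDR 0.05, FC 2  :", count_degs(genes, 0.05, 2.0))
print("FDR 0.10, FC 1.5:", count_degs(genes, 0.10, 1.5))
```

Relaxing the cutoffs admits more genes, but the extra genes are exactly those with weaker evidence, which is the false-positive caveat noted on the slide.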

11 Adding time points to an experiment can be useful for finding biological relevance
Comparing immune system response in a knockout mouse model to a human model after treatment with endotoxin.
– Only one time point in mouse: 24 hours
– 6 time points in human data (0, 2, 4, 6, 9, 24 hours)
[Figure: WT-Mouse, KO-Mouse, Human 0 hr, Human 2 hr, Human 4 hr, Human 6 hr, Human 9 hr, Human 24 hr]

12 A successful project: sufficient number of replicates, and samples of a group cluster well
Effect of cell density and drug treatment on cell survival and growth. Two conditions and 4 samples per group.
[Figure: Principal Component Analysis. Groups: Treated, Time 1; Treated, Time 2; Not treated, Time 1; Not treated, Time 2]

13 Diagnosing outliers
PCA plots are a good way to flag outliers.

14 Diagnosing outliers: Quality control
arrayQualityMetrics() from R/Bioconductor
Metrics measured:
1. Between array comparison (Distance between arrays, PCA)
2. Array intensity distribution (Box plots, density plots)
3. Affymetrix specific plots on raw data (RLE – Relative Log Expression)
4. Affymetrix specific plots on raw data (NUSE – Normalized Unscaled Standard Error)
5. Individual array quality (MA plots)
6. Spatial distribution of intensities
If a sample outlier fails more than one QC metric:
1. that sample should be re-run if possible, or
2. that sample should be removed from the analysis.
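The idea behind a between-array distance check can be sketched in a few lines: compute the mean absolute difference between every pair of arrays, sum the distances for each array, and flag arrays whose total is unusually large. This is a simplified illustration of one metric only, not the arrayQualityMetrics implementation; the function names, the Tukey boxplot cutoff, and the intensity values are assumptions.

```python
import statistics


def mean_abs_distance(a, b):
    """Mean absolute difference between two arrays' log2 intensities."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def flag_outlier_arrays(arrays):
    """Flag arrays whose summed distance to all others is unusually large.

    arrays : dict mapping array name -> list of log2 intensities
    Uses the Tukey boxplot rule (above Q3 + 1.5 * IQR) as the cutoff.
    """
    names = list(arrays)
    totals = {
        n: sum(mean_abs_distance(arrays[n], arrays[o])
               for o in names if o != n)
        for n in names
    }
    q1, _, q3 = statistics.quantiles(totals.values(), n=4)
    cutoff = q3 + 1.5 * (q3 - q1)
    return [n for n in names if totals[n] > cutoff]


# Hypothetical data: seven similar arrays and one shifted by 3 log2 units.
base = [8.0, 9.0, 10.0, 11.0]
arrays = {f"array_{i}": [v + 0.1 * i for v in base] for i in range(7)}
arrays["array_7"] = [v + 3.0 for v in base]

print(flag_outlier_arrays(arrays))
```

A flag from a single metric like this one is only a prompt to look closer; the slide's rule acts on samples that fail more than one QC metric.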

15 Diagnosing outliers: Quality control
[Figures: Density Plot, Box Plot, Heat map]

16 Sufficient number of replicates and good quality lead to sufficient number of DEGs
Significant, differentially expressed genes (DEGs), p-value 0.05, FC 2
Contrast Groups                              DEGs
Treated-Day 1 vs. Not Treated Day 2          288
Treated-Day 2 vs. Not Treated Day 2          434
Treated-Day 2 vs. Treated-Day 1              105
Not Treated Day 2 vs. Not Treated Day 1      264
Summary:
– Sufficient replicates and good quality samples yield a successful project.
– Outliers can be diagnosed by visualization on a PCA plot and checking technical QC metrics to ensure that the outlier is not due to biological variability.

17 Downstream analysis: Functional enrichment using IPA
Question: Which genes are associated with the growth-suppressive effect of low cell density on cell proliferation and survival?
Time 1 = low cell density, Time 2 = high cell density
[Venn diagram: Time 2, Treated vs. Non-treated; Time 1, Treated vs. Non-treated; counts 224, 210, 78]
[Figure: Top 5 Bio-functions]
Subset of the 10 genes specifically involved in the Cellular Growth and Proliferation function that are also predicted to be growth suppressive.
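Splitting two contrasts' DEG lists into the regions of a Venn diagram comes down to set operations. A minimal sketch follows; the gene identifiers are placeholders rather than genes from this study, and the function name is hypothetical.

```python
def split_by_contrast(degs_time1, degs_time2):
    """Partition two DEG lists into time-1-only, shared, and time-2-only."""
    time1, time2 = set(degs_time1), set(degs_time2)
    return {
        "time1_only": time1 - time2,
        "shared": time1 & time2,
        "time2_only": time2 - time1,
    }


# Placeholder gene identifiers (hypothetical).
degs_time1 = ["GENE_A", "GENE_B", "GENE_C"]
degs_time2 = ["GENE_B", "GENE_C", "GENE_D", "GENE_E"]

regions = split_by_contrast(degs_time1, degs_time2)
for name, region_genes in regions.items():
    print(name, sorted(region_genes))
```

The genes unique to one contrast are the natural input for a condition-specific enrichment query.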

18 Visualization of fold changes on a heat-map
Time 1 – low cell density
Time 2 – high cell density
[Heat map: Time 1, unique genes involved in Cell Proliferation]

19 Visualization of networks in IPA
[Figures: Interaction network; Interaction network expanded to include connections to upstream molecules]

20 After the analysis
Submit data to a public repository and provide the required metadata.

21 What you need to provide to CCBR
[Diagram: Investigator, Microarray Facility, CCBR; steps numbered 1 to 4]
– Give us a visit before you begin your experiment
– Raw data (e.g. .CEL files)
– Metadata (type of array, platform, species, experimental design information, processing dates) – http://ccrifx.cancer.gov/apps/site/example_microarray
– Your goals and participation
– Submit your project request – https://ccrifx.cancer.gov/apps/project_request/request_project

22 If you want to perform the analysis on your own, you need to…
Learn appropriate QC methods, different statistical tests, and experimental designs
Know what is in your tool box
– Command line
  Affymetrix Power Tools (APT): for Macs, command line only; free
  R/Bioconductor packages
– GUI tools
  Affymetrix Expression Console (PC only): free
  Partek
  Gene Set Enrichment Analysis (GSEA)
  Ingenuity Pathway Analysis (IPA)
To take this further
– Know how to run command line programs
– Learn how to script (R/Bioconductor)
– Learn different R packages

23 Recap
– Appropriate experimental design
– Sufficient replicates to have statistical power
– Consistent processing to avoid batch effects
– Raw data and metadata
– Visualization
– Validation
* Continuous interaction with CCBR

24 Acknowledgements
CCRIFX: Fathi Elloumi, PhD; Parthav Jailwala, MS; Li Jia, MS; Manjula Kasoji, MS; Anjan Purkayastha, PhD; Anand S Merchant, MD, PhD; Eric Stahlberg, PhD
CCR experts: Maggie Cam, PhD; Sean Davis, MD, PhD; Max Lee, PhD; Peter FitzGerald, PhD; David Goldstein, PhD
Sequencing Facility: Yongmei Zhao, MS; Bao Tran, MS
ABCC: Brian Luke, PhD; Uma Mudunuri, MS; Bob Stephens, PhD; Ming Yi, PhD; Jack Collins, PhD

25 Questions??
Contact
CCBR home page: http://ccrifx.cancer.gov/apps/site/default
CCBR email: ccrifx_support@mail.nih.gov
Building 37, room 1123
Building 41, room B620
Office hours: Fridays 9:30am - 11:30am

