Presentation on theme: "Linear Models for Microarray Data"— Presentation transcript:
1 Linear Models for Microarray Data
LIMMA: Linear Models for Microarray Data
2 Difficulties with microarray data
- The variability of the expression values differs between genes
- Expression values are not identically distributed and are dependent between genes
- Multiple testing across tens of thousands of genes
3 Correct for multiple comparisons
- Multiple testing corrections: family-wise error rate, false discovery rate, etc.
- The parallel nature of the inference allows for compensating possibilities
- Information is borrowed from the ensemble of genes to assist inference about individual genes
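To make the two error rates named above concrete, here is a minimal Python sketch of a Bonferroni adjustment (family-wise error rate) and a Benjamini-Hochberg adjustment (false discovery rate). The slides do not say which procedure the presenter used. The function names and the p-values are made up for illustration.

```python
def bonferroni(pvalues):
    """Family-wise error rate control: scale each p-value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """False discovery rate control: Benjamini-Hochberg adjusted p-values."""
    m = len(pvalues)
    # Indices of the p-values, smallest p-value first.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values, for illustration only.
raw = [0.001, 0.008, 0.039, 0.041, 0.6]
print("Bonferroni:", bonferroni(raw))
print("Benjamini-Hochberg:", benjamini_hochberg(raw))
```

With tens of thousands of genes the Bonferroni factor becomes very large, which is one reason false discovery rate control is commonly preferred for microarray data.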
4 Empirical Bayes
- In frequentist methods, a hypothesis is typically rejected or not rejected without directly assigning it a probability
- Bayesian methods specify a prior probability, which is then updated in the light of new data
- In Bayesian techniques, the prior distribution is assigned independently of the data and fixed before any data are observed
5 Empirical Bayes
- Superficially similar to Bayesian methods in that a prior distribution is assigned
- However, the prior distribution is estimated from the data
- Empirical Bayes can therefore be regarded as a frequentist technique
6 LIMMA
- Empirical Bayes techniques have previously been applied to microarray data
- Those analyses were specific to the experiment and very difficult to implement
- LIMMA uses a simple model with a simple expression for the posterior odds
- It allows linear modelling to be applied to microarray data
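The slides do not give the formula, so the following sketch is an addition. It shows the variance shrinkage usually described in the limma literature: each gene's sample variance is pulled towards a prior variance, weighted by degrees of freedom, and the moderated t-statistic uses the shrunken variance. The prior values below are made up. In practice the prior is estimated from all genes, which is the empirical Bayes step. All function and parameter names are hypothetical.

```python
import math


def moderated_variance(s2_gene, df_gene, s2_prior, df_prior):
    """Posterior variance: degrees-of-freedom-weighted average of prior and gene variance."""
    return (df_prior * s2_prior + df_gene * s2_gene) / (df_prior + df_gene)


def moderated_t(effect, s2_gene, df_gene, s2_prior, df_prior, unscaled_var):
    """t-statistic computed with the shrunken variance instead of the gene's own variance."""
    s2_post = moderated_variance(s2_gene, df_gene, s2_prior, df_prior)
    return effect / math.sqrt(s2_post * unscaled_var)


# Illustrative numbers only: a gene whose sample variance happens to be very small.
effect = 1.0        # estimated log2 fold change
s2_gene = 0.01      # gene-wise residual variance
df_gene = 4         # e.g. 8 arrays minus 4 model parameters
s2_prior = 0.05     # assumed prior variance (would be estimated from all genes)
df_prior = 4        # assumed prior degrees of freedom
unscaled_var = 1.0  # 1/2 + 1/2 for a difference of two means with 2 replicates each

ordinary_t = effect / math.sqrt(s2_gene * unscaled_var)
print("ordinary t: ", ordinary_t)
print("moderated t:", moderated_t(effect, s2_gene, df_gene, s2_prior, df_prior, unscaled_var))
```

Shrinkage makes a gene with an accidentally tiny variance look less extreme, which is the practical benefit of borrowing information across genes.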
7 Estrogen Data
- 2x2 factorial experiment on MCF7 breast cancer cells using Affymetrix HGU95av2 arrays
- Factors: estrogen (presence/absence) and length of exposure (10 hr/48 hr)
- The aim of the study is to identify genes that respond to estrogen treatment
8 Read in the Data
- Load in the estrogen data
- Normalise the data
- Define the targets (factors) for the linear model
9 Design Matrix
   File          Estrogen  Time (hr)
1  low10-1.cel   absent    10
2  low10-2.cel   absent    10
3  high10-1.cel  present   10
4  high10-2.cel  present   10
5  low48-1.cel   absent    48
6  low48-2.cel   absent    48
7  high48-1.cel  present   48
8  high48-2.cel  present   48
- Eight arrays
- Four pairs of replicates
- Four parameters in the linear model
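One way to arrive at four parameters is a group-means parameterisation, with one indicator column per combination of estrogen and time. The slides do not show which parameterisation was used, so the Python sketch below is an assumption. The group labels such as "absent.10" are hypothetical names chosen for illustration.

```python
# Targets as listed on the slide: (file, estrogen, hours of exposure).
targets = [
    ("low10-1.cel", "absent", 10),
    ("low10-2.cel", "absent", 10),
    ("high10-1.cel", "present", 10),
    ("high10-2.cel", "present", 10),
    ("low48-1.cel", "absent", 48),
    ("low48-2.cel", "absent", 48),
    ("high48-1.cel", "present", 48),
    ("high48-2.cel", "present", 48),
]

# Hypothetical labels for the four treatment groups (one model parameter each).
groups = ["absent.10", "present.10", "absent.48", "present.48"]


def design_matrix(targets, groups):
    """Build an arrays-by-groups indicator matrix (1 if the array belongs to the group)."""
    rows = []
    for _file, estrogen, hours in targets:
        label = f"{estrogen}.{hours}"
        rows.append([1 if label == g else 0 for g in groups])
    return rows


design = design_matrix(targets, groups)
print(groups)
for (file_name, _, _), row in zip(targets, design):
    print(f"{file_name:<14}{row}")
```

With this layout each fitted coefficient is simply the mean log-expression of the two replicate arrays in that group.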
11 Differential Expression
- Extract the linear model fit for the contrasts
- Obtain a list of differentially expressed genes for each contrast
- Look for overlap among the differentially expressed genes
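The three steps above can be sketched in Python under some loud assumptions: the coefficients are group means in the order shown, the contrasts of interest are the estrogen effect at each time point, and genes are selected by a plain fold-change cutoff rather than by the moderated statistics a real analysis would use. The probe names and fitted values are invented.

```python
# Assumed coefficient order (group means); the labels are hypothetical.
COEFS = ["absent.10", "present.10", "absent.48", "present.48"]

# Each contrast is a weight vector over the coefficients.
CONTRASTS = {
    "estrogen_10h": [-1, 1, 0, 0],  # present.10 - absent.10
    "estrogen_48h": [0, 0, -1, 1],  # present.48 - absent.48
}

# Invented fitted group means (log2 scale) for three made-up probes.
fitted = {
    "probe_A": [7.0, 9.5, 7.1, 9.9],
    "probe_B": [8.0, 8.1, 8.0, 10.2],
    "probe_C": [6.0, 6.1, 6.2, 6.0],
}


def contrast_estimate(coefs, weights):
    """Linear combination of coefficients, i.e. the log2 fold change for the contrast."""
    return sum(w * c for w, c in zip(weights, coefs))


def select(fitted, weights, min_abs_logfc=1.0):
    """Probes whose absolute contrast estimate reaches the cutoff (a simplification)."""
    return {
        probe
        for probe, coefs in fitted.items()
        if abs(contrast_estimate(coefs, weights)) >= min_abs_logfc
    }


de_10h = select(fitted, CONTRASTS["estrogen_10h"])
de_48h = select(fitted, CONTRASTS["estrogen_48h"])
print("10 hr only:", sorted(de_10h - de_48h))
print("48 hr only:", sorted(de_48h - de_10h))
print("both:      ", sorted(de_10h & de_48h))
```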
12 Linear Model Fit
- logFC: estimate of the log2-fold-change corresponding to the effect or contrast
- AveExpr: average log2-expression for the probe over all arrays/channels
- t: moderated t-statistic
- P.Value: raw p-value
- adj.P.Val: adjusted p-value
- B: log-odds that the gene is differentially expressed
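Two of these columns are on a log scale and are easier to read after conversion. A small Python sketch follows, with hypothetical function names. It assumes that B is a natural-log odds, which is how limma reports it.

```python
import math


def log_odds_to_probability(b):
    """Convert a natural-log odds B into a probability of differential expression."""
    return 1.0 / (1.0 + math.exp(-b))


def logfc_to_fold_change(logfc):
    """Convert a log2 fold change into a fold change on the original scale."""
    return 2.0 ** logfc


for b in (0.0, 1.5):
    print(f"B = {b}: probability = {log_odds_to_probability(b):.3f}")
for lfc in (1.0, -1.0):
    print(f"logFC = {lfc}: fold change = {logfc_to_fold_change(lfc)}")
```

A B of 0 corresponds to even odds, and a logFC of 1 to a doubling of expression.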
13 Annotating Data
- Probe arrays can be annotated with external data
- Multiple sources of gene annotations are available
14 Gene Set Enrichment
- All biochemical pathways are determined by sets of genes
- Gene sets are defined from prior biological knowledge relating to co-expression, function, location or known biochemical pathways
- If a pathway is in any way related to a biological trait, the co-functioning genes should display a higher degree of enrichment than the rest of the transcriptome
- Gene Set Enrichment (GSE) is a computational technique that determines whether an a priori defined set of genes shows statistically significant overlap with the differentially expressed genes
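One common way to test for "statistically significant overlap" is a hypergeometric (over-representation) test. The slides do not say which test the presenter applied, so the Python sketch below shows one option rather than the method used. The function name and the counts in the example are invented.

```python
from math import comb


def overlap_p_value(universe, set_size, n_selected, n_overlap):
    """P(overlap >= n_overlap) when n_selected genes are drawn at random, without
    replacement, from `universe` genes of which `set_size` belong to the gene set."""
    upper = min(set_size, n_selected)
    total = comb(universe, n_selected)
    # Sum the hypergeometric probabilities over the upper tail.
    tail = sum(
        comb(set_size, k) * comb(universe - set_size, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy example: 20 genes on the array, 5 in the gene set,
# 5 called differentially expressed, 3 of which are in the set.
p = overlap_p_value(universe=20, set_size=5, n_selected=5, n_overlap=3)
print(f"p-value for the observed overlap: {p:.4f}")
```

A small p-value suggests that the gene set is over-represented among the differentially expressed genes. With many gene sets, these p-values would themselves need a multiple testing correction.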
16 Estrogen receptor (ER) gene set
- If estrogen is present, the estrogen receptor binds it and becomes activated
- The activated receptor gains the ability to regulate gene expression, resulting in differential expression between cells with and without estrogen
- This should lead to up-regulation of the ER genes