Large Scale Approaches to the Study of Gene Expression. Prof: Rui Alves


1 http://creativecommons.org/licenses/by-sa/2.0/

2 Large Scale Approaches to the Study of Gene Expression Prof: Rui Alves, ralves@cmb.udl.es, 973702406, Dept Ciencies Mediques Basiques, 1st Floor, Room 1.08. Website of the Course: http://web.udl.es/usuaris/pg193845/Courses/Bioinformatics_2007/ Course: http://10.100.14.36/Student_Server/

3 Why Study Gene Expression? Just because the gene is there, is it expressed? If so, under which circumstances? Are there quantitative aspects to the expression of a gene or genes? Are they expressed differentially (other than in an on-off manner) under different conditions? Which TFs regulate expression of a given gene?

4 Genome Wide Gene Expression With fully sequenced genomes available, one can finally study how gene expression is regulated across the whole genome simultaneously. This is done by creating hybridization probes for each mRNA, using them all together in a single experiment, and measuring how gene expression changes under different conditions.

5 Substrates for High Throughput Arrays Nylon membrane: single label (33P). GeneChip: single label (biotin-streptavidin). Microarray: dual label (Cy3, Cy5).

6 GeneChip® Probe Arrays [Figure: a 1.28 cm GeneChip probe array carrying >200,000 different complementary probes. Each 24 µm probe cell holds millions of copies of a specific oligonucleotide probe, to which the single-stranded, labeled RNA target hybridizes. Also shown: image of a hybridized probe array.]

7 GeneChip® Expression Array Design [Figure: multiple oligo probes laid out along the gene sequence (5´ to 3´); probes designed to be Perfect Match and probes designed to be Mismatch.]

8 Procedures for Target Preparation Cells → Poly(A)+ / total RNA → cDNA → IVT (Biotin-UTP, Biotin-CTP) → labeled transcript → fragment (heat, Mg2+) → labeled fragments → hybridize (16 hours) → wash & stain → scan

9 Microarray Technology

10 [Figure: cells from condition A and cells from condition B → total RNA or mRNA → cDNA, each labeled with a different dye (Dye 1, Dye 2) → mix → ratio of expression of genes from the two sources (equal, over, under).]

11 GSI Lumonics

12 Radioactive Microarrays Cells → Poly(A)+ / total RNA → cDNA (radioactive NTP) → hybridize & scan

13 Microarray Expression Analysis Tissue selection; differential state/stage selection → RNA preparation and labeling → competitive hybridization → gene spots on an array → fluorescence/radioactivity intensity → expression measurement

14 What is measured? Photons; radioactive emission. [Table: rows are genes (Gene 1, …), columns are experiments, cells hold the measured values.] What do we do with the numbers?

15 Normalize the data Background noise exists and must be subtracted from the signal. This noise results from contamination and/or non-specific hybridization. Normalization is not simple, and no method is the best in all cases. The normalized ratio is usually given as a log ratio, which gives the difference between expression in one condition and expression in another condition.
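
As a minimal sketch of the background subtraction and log-ratio steps on this slide (not a full normalization method), the Python below assumes each spot has a foreground and a background intensity in each channel. The function names, the `floor` value and the example numbers are hypothetical and are not taken from the slides.

```python
import math


def background_corrected(foreground, background, floor=1.0):
    """Subtract background from foreground.

    The result is clamped at `floor` (an assumed choice) so that the
    logarithm taken later is always defined.
    """
    return max(foreground - background, floor)


def log2_ratio(red_fg, red_bg, green_fg, green_bg):
    """Log2 of background-corrected red over background-corrected green."""
    red = background_corrected(red_fg, red_bg)
    green = background_corrected(green_fg, green_bg)
    return math.log2(red / green)


# Hypothetical spot: red 900 over background 100, green 300 over background 100.
# Corrected intensities are 800 and 200, so the ratio is 4 and the log2 ratio is 2.0.
print(log2_ratio(900, 100, 300, 100))
```

Clamping at a floor is only one way to handle spots whose background exceeds the signal; such spots are often flagged or dropped instead.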

16 Ratio vs. log-ratio $A_i$: red intensity; $B_i$: green intensity. Let $R_i = A_i / B_i$. Gene1: $R_1 = 4$, $\log_2 R_1 = 2$. Gene2: $R_2 = 1/4$, $\log_2 R_2 = -2$. Advantages of log transformation: it treats up-regulated and down-regulated genes symmetrically, and it transfers multiplication operations to addition operations, because $\log_2(A \cdot B) = \log_2 A + \log_2 B$. [Figure: Gene1 and Gene2 plotted on a ratio axis $R$ and on a $\log_2 R$ axis.]
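
The two advantages of the log transformation can be checked numerically. This short Python sketch uses the slide's own values for Gene1 and Gene2; the helper names and the extra pair of numbers `a` and `b` are illustrative assumptions.

```python
import math


def distance_from_no_change(ratio):
    """Distance of a raw ratio from 1 (the 'no change' value)."""
    return abs(ratio - 1.0)


def log2_distance_from_no_change(ratio):
    """Distance of log2(ratio) from 0 (the 'no change' value on the log scale)."""
    return abs(math.log2(ratio))


gene1, gene2 = 4.0, 0.25  # four-fold up and four-fold down

# On the raw scale the two genes look unequal: 3.0 versus 0.75.
print(distance_from_no_change(gene1), distance_from_no_change(gene2))

# On the log2 scale they are symmetric: 2.0 versus 2.0.
print(log2_distance_from_no_change(gene1), log2_distance_from_no_change(gene2))

# Multiplication becomes addition: log2(a * b) equals log2(a) + log2(b).
a, b = 8.0, 2.0
print(math.isclose(math.log2(a * b), math.log2(a) + math.log2(b)))
```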

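17 Expression Vectors As Points in ‘Expression Space’ [Table: genes G1 to G5 (rows) by Exp 1 to Exp 3 (columns). Figure: each gene plotted as a point with axes x, y, z = Experiment 1, Experiment 2, Experiment 3; genes with similar expression lie close together.]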
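
One way to make "similar expression" concrete is to measure the distance between two genes' expression vectors. The sketch below uses Euclidean distance, which is an assumed choice (the slide does not name a metric), and the expression values are invented for illustration.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two expression vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("expression vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_gene(name, vectors):
    """Return the gene whose expression vector is closest to that of `name`."""
    others = (g for g in vectors if g != name)
    return min(others, key=lambda g: euclidean_distance(vectors[name], vectors[g]))


# Hypothetical log ratios for three genes across Exp 1, Exp 2, Exp 3.
expression = {
    "G1": (2.0, 1.0, 0.0),
    "G2": (2.0, 1.0, 1.0),
    "G3": (-1.0, -2.0, 3.0),
}

print(nearest_gene("G1", expression))  # G2 lies closest to G1 in this toy data
```

Correlation-based distances are a common alternative when the shape of the profile matters more than its magnitude.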

18 Data Mining Methods Classification, Regression (Predictive Modeling); Clustering (Segmentation); Association Discovery (Summarization); Change and deviation detection; Dependency Modeling; Information Visualization

19 Biological question (differentially expressed genes, sample class prediction, etc.) → Experimental design → Microarray experiment → Image analysis: R, G from (Rfg, Rbg), (Gfg, Gbg) → Normalization → Estimation, Testing, Clustering, Discrimination → Biological verification and interpretation

20 Why Study Gene Expression? Just because the gene is there, is it expressed? If so, under which circumstances? Which TFs regulate expression of a given gene? Are there quantitative aspects to the expression of a gene or genes? Are they expressed differentially (other than in an on-off manner) under different conditions?

21 Transcriptional Regulation [Figure: DNA binding proteins (activator, repressor) attach to binding sites (specific sequences) in the non-coding region next to the coding (transcribed) region of Gene 1, Gene 2 and Gene 3; the coding region gives the RNA transcript.]

22 ChIP-on-Chip Based on: – ChIP (Chromatin Immunoprecipitation) – Microarray. In vivo assay. Genome-wide location analysis.

23 Predicting regulatory modules with ChIP-chip experiments Cells → crosslink protein/DNA → break DNA → then either (a) reverse crosslink and purify DNA pieces, or (b) affinity purification of the transcription factor, then reverse crosslink and purify the DNA pieces bound to the TF → compare (a) and (b) on a microarray
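
The final "compare in microarray" step can be sketched as a per-region enrichment ratio between the TF-purified DNA and the unpurified DNA. The Python below is an illustration under stated assumptions: the region names, signal values and the log2 cutoff of 1.0 are hypothetical, and real analyses add normalization and error models that are omitted here.

```python
import math


def log2_enrichment(ip_signal, control_signal):
    """Log2 ratio of the TF-purified (IP) signal over the control signal."""
    return math.log2(ip_signal / control_signal)


def enriched_regions(signals, min_log2=1.0):
    """Return regions whose log2 enrichment is at least `min_log2`.

    `signals` maps a region name to an (ip_signal, control_signal) pair.
    """
    return sorted(
        region
        for region, (ip, control) in signals.items()
        if log2_enrichment(ip, control) >= min_log2
    )


# Hypothetical intensities for three intergenic regions.
signals = {
    "region_A": (800.0, 200.0),  # four-fold enriched
    "region_B": (210.0, 200.0),  # essentially unchanged
    "region_C": (400.0, 200.0),  # two-fold enriched
}

print(enriched_regions(signals))  # ['region_A', 'region_C']
```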

24 ChIP-on-Chip Array of intergenic sequences from the whole genome

25 Protein Binding Microarray (PBM) (Bulyk et al.) In vitro assay. The DNA-binding protein of interest is expressed with an epitope tag, purified, and then bound directly to a double-stranded DNA microarray. Can overcome the shortcomings of ChIP-on-Chip: – Poor enrichment – No available antibody – Unknown culture condition or time points

26 Protein Binding Microarray Whole-genome yeast intergenic microarray bound by Rap1

27 ChIP-on-Chip vs PBM Done by Mukherjee et al. Useful when ChIP-on-Chip does not result in enough enrichment. (* Lee et al., # Lieb et al.)

28 Motif Discovery MEME (Expectation Maximization); CONSENSUS (greedy multiple alignment); WINNOWER (clique finding in graphs); SP-STAR (sum of pairs scoring); MITRA (mismatch trees to prune exhaustive search space); BioProspector (Gibbs sampling based); MDScan (differential weight for sequences); Motif Regressor; EBMF (Energy Based Motif Finding)

29 Obstacles in TFBS Analysis Variation in binding sequences might be problematic in the motif discovery process. – But for differential binding, there is no sequence discrepancy. In eukaryotic systems, many transcription factors (TFs) work together with other TFs, affecting each other’s binding to DNA.

30 To Do Go online and find out whether there are examples of M. xanthus microarray and TF experiments. If you find them, analyze how the HKs and RRs are involved in the regulation under the different conditions.

31 Causes of Differential Binding We suspect the possible causes of this differential binding to be: – Changes in the expression of the TF – Changes in the expression of other TFs – Modifications of TFs (protein level) – Changes in physical structures (epigenetic features) – Other unknown reasons

32 Cooperation among TFs [Figure: Condition 1, Condition 2, Condition 3.] What has caused the difference in the binding affinity?

33 Differentially Bound Promoters Simple correlation (A, B: binding ratio of TF in condition 1 and 2 respectively)
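
The formula on this slide is not preserved in the transcript, so the following is only one possible reading of "simple correlation": a Pearson correlation between the binding ratios A (condition 1) and B (condition 2) across promoters. The function name and the example ratios are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (spread_x * spread_y)


# Hypothetical binding ratios of one TF at four promoters.
A = [1.0, 2.0, 4.0, 8.0]  # condition 1
B = [1.1, 1.9, 4.2, 0.9]  # condition 2: the last promoter loses binding

print(pearson(A, B))
```

A correlation well below 1 suggests that some promoters are bound differently in the two conditions; identifying which ones requires a per-promoter comparison.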

34 Differentially Bound Promoters Con1 vs Con2: Gene_1, Gene_2, Gene_3, Gene_4, ~, Gene_n. Con1 vs Con3: Gene_2, Gene_5, Gene_7, Gene_8, ~, Gene_n. Con2 vs Con3: Gene_1, Gene_2, Gene_3, Gene_4, ~, Gene_n. How can we confirm which other TF(s) are involved?
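
A first step in working with these lists is to compare them as sets. The sketch below uses only the genes named explicitly on the slide (the lists there continue to Gene_n, which is left out here), and the helper names are hypothetical.

```python
def shared_by_all(comparisons):
    """Genes that are differentially bound in every comparison."""
    return set.intersection(*comparisons.values())


def unique_to(name, comparisons):
    """Genes differentially bound only in the comparison called `name`."""
    others = set()
    for other_name, genes in comparisons.items():
        if other_name != name:
            others |= genes
    return comparisons[name] - others


# Only the explicitly named genes from the slide; the elided ones are omitted.
comparisons = {
    "Con1 vs Con2": {"Gene_1", "Gene_2", "Gene_3", "Gene_4"},
    "Con1 vs Con3": {"Gene_2", "Gene_5", "Gene_7", "Gene_8"},
    "Con2 vs Con3": {"Gene_1", "Gene_2", "Gene_3", "Gene_4"},
}

print(sorted(shared_by_all(comparisons)))              # ['Gene_2']
print(sorted(unique_to("Con1 vs Con3", comparisons)))  # ['Gene_5', 'Gene_7', 'Gene_8']
```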

35 Methods – Sequence analyses on the differentially bound promoters? – Comparison of ChIP-on-Chip results? – Protein-protein interaction between TFs? Other possible analysis: – Gene Ontology distribution of differentially bound promoters

36 Expected Results We may be able to use heterogeneous experimental data to reveal the mechanisms underlying the differential binding of transcription factors to cis-regulatory regions.

37 To Do Look for microarray experiments using M. xanthus in the literature (http://www.ncbi.nih.gov/) or available on the web. Make a list of the experimental conditions you find and collect the data. Look for data analysis software on the web and, by analyzing the data, find out whether any of the HKs or RRs is involved in the studied conditions.

38 Transcriptional regulatory code by Harbison et al. Identification of transcription factor binding site specificities

39 Transcriptional regulatory code by Harbison et al. Construction of regulatory map of Yeast

40 Transcriptional regulatory code by Harbison et al. Promoter architectures

41 Transcriptional regulatory code by Harbison et al. Environment-specific use of regulatory codes

42 Overview Introduction to Transcriptional Regulation; ChIP-on-Chip (ChIP-Chip); Current Approaches; Our Approach

43 Scatterplot of Normalized Data [Figure: Adult vs. Fetal.]

44 [Figure: points marked >0.3 and <-0.3.]
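
Assuming the values 0.3 and -0.3 on this slide are cutoffs applied to normalized log ratios (the transcript does not say so explicitly), selecting the genes above and below them could look like the following Python sketch. The gene names and values are invented.

```python
def split_by_threshold(log_ratios, cutoff=0.3):
    """Return (up, down): genes with log ratio > cutoff and < -cutoff."""
    up = sorted(g for g, value in log_ratios.items() if value > cutoff)
    down = sorted(g for g, value in log_ratios.items() if value < -cutoff)
    return up, down


# Hypothetical normalized log ratios (for example Adult versus Fetal).
log_ratios = {"G1": 0.45, "G2": -0.10, "G3": -0.62, "G4": 0.30}

up, down = split_by_threshold(log_ratios)
print(up)    # ['G1']  (G4 sits exactly on the cutoff and is not included)
print(down)  # ['G3']
```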

45 Characteristics of Data Data can be viewed as an N×M matrix (N >> M), where N is the number of genes and M is the number of data points for each gene; or as an N×(M+K) matrix, where K is the number of features describing each gene (genome location, functional description, metabolic pathway, etc.).
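
The two views of the data can be illustrated with plain Python structures. In this sketch the gene names, measurements and feature values are all hypothetical; it only shows how an N×M table becomes N×(M+K) once K descriptive features are appended to each gene's row.

```python
def shape(table):
    """Return (number of rows, number of columns) of a gene -> row mapping."""
    widths = {len(row) for row in table.values()}
    if len(widths) != 1:
        raise ValueError("all rows must have the same number of columns")
    return len(table), widths.pop()


def append_features(expression, features):
    """Build the N x (M + K) table by appending each gene's K features."""
    return {gene: list(expression[gene]) + list(features[gene]) for gene in expression}


# N = 3 genes, M = 4 data points per gene (hypothetical log ratios).
expression = {
    "G1": [0.5, 1.2, -0.3, 0.0],
    "G2": [-1.1, 0.4, 0.9, 0.2],
    "G3": [0.0, 0.1, -0.2, 2.3],
}

# K = 2 features per gene: genome location and functional description.
features = {
    "G1": ["chr1", "kinase"],
    "G2": ["chr2", "transporter"],
    "G3": ["chr1", "unknown"],
}

print(shape(expression))                             # (3, 4)
print(shape(append_features(expression, features)))  # (3, 6)
```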

46 Model for Data Analysis Gene expression is a dynamic process. Each microarray experiment is a snapshot of the process. Basic biological knowledge is needed to build a model. For example, an assumption: in most experiments, only a small set of genes (100s/1000s) has been affected significantly.

47 Data Mining Need for Data Mining: data volumes are too large for traditional analysis methods; large number of records and high dimensional data; only a small portion of the data is analyzed; the decision support process becomes more complex. Functions of Data Mining: use the data to build predictors (prediction, classification, deviation detection, segmentation); generate more sophisticated summaries and reports to aid understanding of the data (find clusters, partitions in data).

48 Normalized data can be mined for biological knowledge Normalize and filter. Mine the data for biological patterns of expression. Integrate expression data with other ancillary data, such as genotype, phenotype, the genome, and its annotation.

49 Chromatin Immunoprecipitation (ChIP) Sonication or vortexing with glass beads → immunoprecipitation using an antibody against a protein of interest → supernatant / pellet. DNA bound to the specific protein is enriched.

