DNA, Gene, and Genome Translating Machinery for Genetic Information.


1

2 DNA, Gene, and Genome

3 Translating Machinery for Genetic Information

4 Transcription factors mRNA levels

5 Automated DNA Sequencing

6

7 Data Increase (from NCBI web site)

8

9 Partial Display of Human Draft Sequence (Nature, 2001)

10 Human Genome Map at NCBI

11 60-70 kDa protein interacting with prostate cancer suppressor
MGALRPTLLPPSLPLLLLLMLGMGCWAREVLVPEGPLYRVAGTAVSISCNVTGY
EGPAQQNFEWFLYRPEAPDTALGIVSTKDTQFSYAVFKSRVVAGEVQVQRLQGD
AVVLKIARLQAQDQGIYECTPSTDTRYLGSYSGKVELRVLPDVLQVSAAPPGPR
GRQAPTSPPRMTVHEGQELALGCLARTSTQKHTHLAVSFGRSVPEAPVGRSTLQ
EVVGIRSDLAVEAGAPYAERLAAGELRLGKEGTDRYRMVVGGAQAGDAGTYH
CTAAEWIQDPDGSWAQIAEKRAVLAHVDVQTLSSQLAVTVGPGERRIGPGEPLE
LLCNVSGALPPAGRHAAYSVGWEMAPAGAPGPGRLVAQLDTEGVGSLGPGYE
GRHIAMEKVASRTYRLRLEAARPGDAGTYRCLAKAYVRGSGTRLREAASARSR
PLPVHVREEGVVLEAVAWLAGGTVYRGETASLLCNISVRGGPPGLRLAASWWV
ERPEDGELSSVPAQLVGGVGQDGVAELGVRPGGGPVSVELVGPRSHRLRLHSL
GPEDEGVYHCAPSAWVQHADYSWYQAGSARSGPVTVYPYMHALDTLFVPLL
VGTGVALVTGATVLGTITCCFMKRLRKR

12

13 Molecular biology databases
Sequence databases
– Annotated
– Low-annotation
– Specialized
Structural databases
Motif databases
Genome databases
Proteome databases
RNA expression
Literature
Populations
Mutations
Polymorphisms
Organisms
Pathways

14 Promoters; ESTs; Tissues and cells; Genome maps; DNA sequences; Molecular Phylogeny; Protein sequences; Protein structures; DNA motifs; Protein motifs; Substrates; Metabolic pathways; Transcription Factors; RNA expression; Mutations/polymorphisms; Gene Family

15 Database formats
Relational databases
– GDB, GSDB, MGD, etc.
– Vendors: Sybase, Oracle, etc.
Flat file databases
– GenBank, SWISS-PROT, etc.
Object-oriented databases
– ACeDB, AtDB, etc.

16 Molecular biology data types: Organisms, Genome maps
Mouse chromosome X from the Mouse Genome Informatics project http://www.informatics.jax.org/

17 Molecular biology data types: Organisms, Genome maps, DNA sequences, RNA sequences
...AATGGTACCGATGACCTGGAGCTTGGTTCGA...

18 Molecular biology data types: Organisms, Genome maps, DNA sequences, RNA sequences, Protein sequences
...TRLRPLLALLALWPPPPARAFVNQHLCGSHLVEA...
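
How the DNA, RNA, and protein sequence data types on these slides relate can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: it assumes the input is the coding strand, uses the standard genetic code, and the function names and the `frame` parameter are hypothetical. The reading frame of the fragment shown on slide 17 is not given, so any frame chosen for it is a guess.

```python
# Standard genetic code, with codons ordered by T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def transcribe(dna):
    """Return the mRNA matching the coding strand (T replaced by U)."""
    return dna.upper().replace("T", "U")


def translate(dna, frame=0):
    """Translate coding-strand DNA into one-letter amino acid codes.

    Translation starts at offset `frame` (an assumed reading frame) and
    stops at the first stop codon or when fewer than three bases remain.
    Ambiguous bases such as N are not handled and raise KeyError.
    """
    dna = dna.upper()
    protein = []
    for pos in range(frame, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)
```

For example, `translate("ATGGCCTAA")` reads the codons ATG, GCC, TAA and returns `"MA"`.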

19 Molecular biology data types: Organisms, Genome maps, DNA sequences, RNA sequences, Protein sequences, Protein structures, RNA structures
PDB entry 1CIS (P.Osmark, P.Sorensen, F.M.Poulsen)

20 Molecular biology data types: Organisms, Genome maps, DNA sequences, RNA sequences, Protein sequences, Protein structures, DNA motifs, Protein motifs, RNA expression, RNA structures

21 DNA microarrays measure variations in RNA levels
The full Yeast genome on a chip http://cmgm.Stanford.EDU/pbrown/
DeRisi et al., Science 278:680
Red dots: genes whose RNA level increased
Green dots: genes whose RNA level decreased

22 Substrates for High Throughput Arrays
Substrates: Nylon Membrane; Glass Slides; GeneChip
Labeling: Single label (33P); Single label (biotin, streptavidin); Dual label (Cy3, Cy5)

23 GeneChip® Probe Arrays
Figure labels: GeneChip Probe Array (1.28 cm); Hybridized Probe Cell; 24 µm; Millions of copies of a specific oligonucleotide probe; >200,000 different complementary probes; Single stranded, labeled RNA target; Oligonucleotide probe; Image of Hybridized Probe Array

24 GeneChip® Expression Array Design
Figure labels: Gene Sequence (5´ to 3´); Multiple oligo probes; Probes designed to be Perfect Match; Probes designed to be Mismatch

25 Procedures for Target Preparation
Cells -> Poly (A)+ / Total RNA -> cDNA -> IVT (Biotin-UTP, Biotin-CTP) -> Labeled transcript -> Fragment (heat, Mg2+) -> Labeled fragments -> Hybridize (16 hours) -> Wash & Stain -> Scan

26 Microarray Technology

27 NSF Soybean Functional Genomics Steve Clough / Vodkin Lab Printing Arrays on 50 slides

28 Ratio of expression of genes from two sources
Cells from condition A and cells from condition B -> total RNA or mRNA -> cDNA, labeled with Dye 1 or Dye 2 (one dye per condition) -> Mix -> ratio per gene: equal, over, or under
NSF / U of Illinois Microarray Workshop - Steve Clough / Vodkin Lab
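
The "equal / over / under" call for a single spot can be sketched as a log ratio of the two dye intensities. This is a minimal illustration, not the workshop's method: the slide does not say which dye labels which condition, so the parameter names `intensity_a` and `intensity_b`, the function names, and the two-fold default `cutoff` are all assumptions.

```python
import math


def log2_ratio(intensity_a, intensity_b):
    """Return log2(B / A) for one spot; both intensities must be positive."""
    if intensity_a <= 0 or intensity_b <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(intensity_b / intensity_a)


def call_expression(intensity_a, intensity_b, cutoff=1.0):
    """Label a spot as 'over', 'under', or 'equal' in B relative to A.

    cutoff is a hypothetical threshold in log2 units (1.0 means two-fold).
    """
    ratio = log2_ratio(intensity_a, intensity_b)
    if ratio >= cutoff:
        return "over"
    if ratio <= -cutoff:
        return "under"
    return "equal"
```

Working in log units makes a two-fold increase and a two-fold decrease symmetric around zero, which is why the cutoff is applied to both signs.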

29 GSI Lumonics NSF Soybean Functional Genomics Steve Clough / Vodkin Lab

30 Cattle and Soy Controls
Spot labels: Beta Actin; PKG; HPRT; Beta 2 microglobulin; Rubisco; AB binding protein; Major latex protein homologue (MSG)
Array of cattle and soy spiking controls. 50 µg of cattle brain total RNA was labeled with Cy3 (green). 1 µl each of in vitro transcribed soy Rubisco (5 ng), AB binding protein (0.5 ng), and MSG (0.05 ng) were labeled with Cy5. The two labeled samples were cohybridized on superamine slides (Telechem, Inc.). To the right of each set of spots are five negative controls (water).

31 Fetal Spleen (Cy3) vs. Adult Spleen (Cy5)
Spot labels: IgM; IgM heavy chain; MYLK; COL1A2

32 Placenta vs. Brain – 3800 Cattle Placenta Array
Channels: Cy3, Cy5
GenePix Image Analysis Software

33

34 Microarray Data Process
1. Experimental Design
2. Image Analysis – raw data
3. Normalization – “clean” data
4. Data Filtering – informative data
5. Model building
6. Data Mining (clustering, pattern recognition, etc.)
7. Validation
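
Steps 3 and 4 can be sketched as follows. The slides do not say which normalization or filter was used, so this assumes one common choice: center the log2 ratios on their median, then keep genes whose normalized ratio exceeds a cutoff. The function names, the input layout, and the cutoff are hypothetical.

```python
import math
from statistics import median


def normalize_log_ratios(spots):
    """Median-center log2 ratios.

    spots: dict mapping gene name -> (intensity_a, intensity_b), both positive.
    Returns a dict mapping gene name -> log2(b / a) minus the median log ratio.
    """
    raw = {gene: math.log2(b / a) for gene, (a, b) in spots.items()}
    center = median(raw.values())
    return {gene: ratio - center for gene, ratio in raw.items()}


def filter_informative(log_ratios, cutoff):
    """Keep genes whose absolute normalized log ratio is at least cutoff."""
    return {g: r for g, r in log_ratios.items() if abs(r) >= cutoff}
```

Median centering assumes that most genes do not change between the two samples, so the typical log ratio should be zero after correcting for dye or scanner bias.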

35 Scatterplot of Normalized Data (axes: Adult, Fetal)

36 >0.3<-0.3

37 Complexity Levels of Microarray Experiments:
1. Compare genes in a control situation versus a treatment situation
Example: Is the level of expression (up-regulated or down-regulated) significantly different in the two situations? (drug design application)
Methods: t-test, Bayesian approach
2. Find multiple genes that share common functionalities
Example: Which related genes depend on one another?
Methods: Clustering (hierarchical, k-means, self-organizing maps, neural network, support vector machines)
3. Infer the underlying gene and protein networks that are responsible for the patterns and functional pathways observed
Example: What is the gene regulation at system level?
Directions: mining regulatory regions, modeling regulatory networks on a global scale
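
For level 1, a per-gene t-test between control and treatment replicates might look like the sketch below. The slide only says "t-test"; this uses Welch's unequal-variance variant as an assumption, and the function name is hypothetical. It returns the statistic and degrees of freedom only; turning them into a p-value needs the t distribution, which is not implemented here.

```python
import math
from statistics import mean, variance


def welch_t(control, treated):
    """Return (t, df) for Welch's t-test on one gene.

    control, treated: lists of expression values (e.g. log ratios), each
    with at least two replicates and non-zero variance in at least one group.
    """
    n1, n2 = len(control), len(treated)
    # Squared standard error contributed by each group.
    se1 = variance(control) / n1
    se2 = variance(treated) / n2
    t_stat = (mean(treated) - mean(control)) / math.sqrt(se1 + se2)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t_stat, df
```

Because one test is run per gene, thousands of tests are made on a single array, so some correction for multiple testing is normally needed before calling a gene significant.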

38 Comparing data from two experiments.

39 Clustering to extract genes which tightly co-express.
Conditions: NO DRUG; 1 nM Drug; 1 µM Drug
Statistical filters used: genes present (Presence Call in Affymetrix) in drug treated, ANOVA p<0.02 between groups.
Red indicates increased expression, and green is decreased expression (Log(fold change)).
Genesight 3 (Biodiscovery Software, www.biodiscovery.com)
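
One simple way to find genes that co-express is to compare their expression profiles across conditions by Pearson correlation. The sketch below illustrates that idea only; it is not the algorithm used by Genesight, and the function names, the `profiles` layout, and the `min_r` threshold are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(sxx * syy)


def coexpressed_with(seed, profiles, min_r):
    """Return genes whose profile correlates with the seed gene at >= min_r.

    profiles: dict mapping gene name -> list of expression values, one per
    condition (e.g. log fold change at each drug dose).
    """
    return sorted(
        gene
        for gene, values in profiles.items()
        if gene != seed and pearson(profiles[seed], values) >= min_r
    )
```

Correlation compares the shape of a profile rather than its magnitude, so two genes that rise with dose are grouped together even if one changes far more than the other.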

40 Conditions: NO DRUG; 1 nM Drug; 1 µM Drug
Statistical filters used: genes present (Presence Call in Affymetrix) in absence of drug, ANOVA p<0.02 between groups.
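
The ANOVA filter compares, for each gene, the variation between the three dose groups with the variation among replicates inside each group. A minimal sketch of the one-way F statistic is below; the function name and input layout are assumptions, and converting F to a p-value (for the p<0.02 filter) needs the F distribution, for example via `scipy.stats.f_oneway`, which is not used here.

```python
from statistics import mean


def anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA on one gene.

    groups: list of lists, one list of replicate expression values per
    condition. Raises ZeroDivisionError if there is no within-group variation.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of replicates around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

A large F means the group means differ by more than replicate noise would explain, which is the property the filter uses to keep dose-responsive genes.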

41 Self Organizing Maps

42 Molecular Classification of Cancer

43

44 Gene Expression Profile of Aging and Its Retardation by Caloric Restriction Cheol-Koo Lee, Roger G. Klopp, Richard Weindruch, Tomas A. Prolla

45 Data Mining Methods
Classification, Regression (Predictive Modeling)
Clustering (Segmentation)
Association Discovery (Summarization)
Change and deviation detection
Dependency Modeling
Information Visualization

