Epidemiology 217 Molecular and Genetic Epidemiology Bioinformatics & Proteomics John Witte.


1 Epidemiology 217 Molecular and Genetic Epidemiology Bioinformatics & Proteomics John Witte

2 Coding Genotypes
Coding         CC  CT  TT
Co-dominant    0   1   0
               0   0   1
Dominant       0   1   1
Recessive      0   0   1
Log Additive   0   1   2
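The coding schemes on this slide can be sketched in Python. This is a minimal illustration, not part of the original lecture. It assumes the slide's columns are the genotypes CC, CT, and TT, that T is the variant allele, and that the co-dominant model uses two indicator variables (one for CT, one for TT). The names `CODINGS` and `code_genotype` are hypothetical.

```python
# Covariate codings for a biallelic SNP with alleles C and T.
# Assumption: T is the variant allele and CC is the reference genotype.
CODINGS = {
    # Two indicator variables: (is CT, is TT)
    "co-dominant": {"CC": (0, 0), "CT": (1, 0), "TT": (0, 1)},
    # One or two copies of T have the same effect
    "dominant": {"CC": (0,), "CT": (1,), "TT": (1,)},
    # Only two copies of T have an effect
    "recessive": {"CC": (0,), "CT": (0,), "TT": (1,)},
    # Count of T alleles (each copy multiplies the odds by the same factor)
    "log-additive": {"CC": (0,), "CT": (1,), "TT": (2,)},
}


def code_genotype(genotype, model):
    """Return the covariate tuple for a genotype under the given model."""
    key = "".join(sorted(genotype.upper()))  # "TC" and "CT" are the same genotype
    return CODINGS[model][key]


if __name__ == "__main__":
    for model in CODINGS:
        print(model, [code_genotype(g, model) for g in ("CC", "CT", "TT")])
```

The returned tuples are what would be entered as covariates in a regression model; the co-dominant coding spends two degrees of freedom, the other three spend one.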

3 Post-Genomic Era: Lots of Data!

4 Bioinformatics: “The study of genetic and other biological information using computer and statistical techniques.” A Genome Glossary, Science, Feb 16, 2001

5 Bioinformatics in Genetic Epi
Some key aspects:
– Data management
– Candidate regions / genes (selection and SNP mining)
– Genetic analyses (e.g., genotyping)
– Statistical analyses

6 Data Management 5/20
Databases Hub linking: Demographic Database, Laboratory Database, Clinical Database, Health and Habits Database, Nutritional Database, Genomic Database, CaP Genes Database

7 From gene to polymorphisms
Given a gene, how do I…
– Find its polymorphisms?
– Find information about those polymorphisms?

8 Hands-on guide for browsing and analyzing genomic data. Contains worked examples, providing:
– an overview of the types of data available,
– details on how these data can be browsed, and
– step-by-step instructions for using many of the most commonly used tools for sequence-based discovery.
www.nature.com/cgi-taf/dynapage.taf?file=/ng/journal/v35/n1s/

9 Nature Genetics: A User's Guide to the Human Genome
3 of the 13 worked example questions:
– How does one find a gene of interest and determine that gene's structure?
– How would one retrieve the sequence of a gene, along with all annotated exons and introns, as well as a certain number of flanking bases for use in primer design?
– A user wishes to find all the single nucleotide polymorphisms that lie between two sequence-tagged sites. Do any of these single nucleotide polymorphisms fall within the coding region of a gene? Where can any additional information about the function of these genes be found?

10 Look for SNPs in Databases
General databases:
--- dbSNP (http://www.ncbi.nlm.nih.gov/)
--- UCSC Genome Bioinformatics (http://genome.ucsc.edu/)
--- HapMap (http://www.hapmap.org/)
--- The SNP consortium (TSC) (http://snp.cshl.org/)
--- Human gene variation base (HGVbase) (http://hgvbase.cgb.ki.se)
Special databases:
--- The UW-FHCRC Variation Discovery Resource (SeattleSNPs) (http://pga.gs.washington.edu/)
--- Cancer Genome Anatomy Project - SNP500Cancer Database (http://snp500cancer.nci.nih.gov/home_1.cfm)
--- InnateImmunity (http://innateimmunity.net)
--- Drug response (http://pharmgkb.org)
More…

11 UCSC Browser
Features shown: comparative genomics, SNPs, gene structure

12 SeattleSNPs
Resequencing the complete genomic region of each gene among 24 African-American (AA) subjects and 23 European (CEPH) subjects
– 2000 bp upstream of first exon
– 1500 bp downstream of poly-A signal
– All exons and introns for genes below 35 kbp
Summary data (2/18/05)
– Number of genes sequenced: 208
– Total kilobases sequenced: 4408.78
– Number of SNPs found: 23,590
– SNPs in AA sample: 20,765
– SNPs in CEPH sample: 12,937
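The two per-sample SNP counts sum to more than the total, because many SNPs are seen in both samples. A short calculation makes the overlap and the SNP density explicit. This is a sketch that assumes the three counts refer to distinct SNP sites and that every reported SNP was observed in at least one of the two samples; the variable names are illustrative only.

```python
# Summary figures quoted on the slide (2/18/05)
kb_sequenced = 4408.78
snps_total = 23_590
snps_aa = 20_765
snps_ceph = 12_937

# Inclusion-exclusion: |AA and CEPH| = |AA| + |CEPH| - |AA or CEPH|
# Assumption: every SNP found appears in at least one sample.
snps_shared = snps_aa + snps_ceph - snps_total
snps_aa_only = snps_total - snps_ceph
snps_ceph_only = snps_total - snps_aa

# Average SNP density over the resequenced regions
snps_per_kb = snps_total / kb_sequenced
bp_per_snp = kb_sequenced * 1000 / snps_total

if __name__ == "__main__":
    print(f"shared: {snps_shared}, AA only: {snps_aa_only}, CEPH only: {snps_ceph_only}")
    print(f"{snps_per_kb:.2f} SNPs per kb, about one SNP every {bp_per_snp:.0f} bp")
```

Under these assumptions, roughly 10,000 SNPs are shared between the samples and the average density is a little over five SNPs per kilobase.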

13 From Genomics to Proteomics Our ~ 25,000 genes carry the blueprint for making proteins, of which all living matter is made. Each protein has a particular shape and function that determine its role in the body. Proteomics is the study of protein shape, function, and patterns of expression.

14 Anatomy of a gene
Shown 5′ to 3′ at four levels: DNA, pre-splicing RNA, post-splicing RNA, protein.
Labeled elements: enhancer, promoter, exon (coding), exon (non-coding: 5′ UTR, 3′ UTR), intron, polyadenylation.

15 Proteomics
– Characterize proteins derived from genetic code
– Compare variations in their expression levels under different conditions
– Study their interactions
– Identify their functional role

16 Proteome Complexity
Recall that the genome is relatively static. In contrast, many cellular proteins are continually moving and undergoing changes such as:
1. binding to a cell membrane,
2. partnering with another protein,
3. gaining or losing a chemical group such as a sugar, fat, or phosphate, or
4. breaking into two or more pieces.

17 Size of Proteome?
More than 1 million proteins, far more than the ~25,000 genes in humans. The large number is due to complexity (a given gene can make many different proteins). Features such as folds and motifs allow proteins to be categorized into groups and families, which should make proteomic research easier. But no proteome has yet been sequenced.

18 How to Analyze Proteomes
Broad range of technologies. Central paradigm:
– 2-D gel electrophoresis (2D-GE) and mass spectrometry (MS).
– 2D-GE is used to separate the proteins by isoelectric point and then by size.
– MS determines their identity and characteristics.

19 Bioinformatics in Proteomics
– Creation and maintenance of databases of protein info.
– Development of methods to predict the structure and/or function of newly discovered proteins and structural RNA sequences.
– Clustering protein sequences into families of related sequences and the development of protein models.
– Aligning similar proteins and generating phylogenetic trees to examine evolutionary relationships.
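As an illustration of what "aligning similar proteins" involves, here is a minimal sketch of global pairwise alignment using the Needleman-Wunsch dynamic programming algorithm. The slides do not name a specific method, so this choice is an assumption. The scoring (match +1, mismatch -1, gap -2) is deliberately simplified; real protein alignment tools use substitution matrices and more refined gap penalties. The function name `global_alignment` is hypothetical.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment with a simple scoring scheme.

    Returns (score, aligned_a, aligned_b), with '-' marking gaps.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Toy sequences, not real proteins
    print(global_alignment("ACD", "AD"))
```

Pairwise scores like these are one possible input to the clustering and tree-building steps listed above, for example by converting them into distances between sequences.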

