Http://www.youtube.com/watch?v=X4mO4KPdtEM.

2 Genomics and Bioinformatics

3 What is Functional Genomics?
Functional genomics refers to the development and application of global (genome-wide or system-wide) experimental approaches to assess gene function by making use of the information and reagents provided by structural genomics. (Hieter and Boguski 1997)
Functional genomics as a means of assessing phenotype differs from more classical approaches primarily with respect to the scale and automation of biological investigations. (UC Davis Genome Center)

4 Functional Genomics Hunt & Livesey (eds.)
cDNA Libraries
Differential Display
Representational Difference Analysis
Suppression Subtractive Hybridization
cDNA Microarrays
Serial Analysis of Gene Expression
2-D Gel Electrophoresis

5 Functional Genomics
Differential gene expression
SAGE/MPSS *Open systems*
Identifying the function of genes
Functional complementation
RNA interference/RNA silencing

6 Why We Need Functional Genomics
Organism | # genes | % of genes with inferred function | Completion date of genome
E. coli | 4288 | 60 | 1997
yeast | 6,600 | 40 | 1996
C. elegans | 19,000 |  | 1998
Drosophila | 12-14K | 25 | 1999
Arabidopsis | 25,000 |  | 2000
mouse | ~30,000? | 10-20 | 2002
human |  |  |

7 What is the limitation of functional genomics? 5P

8 Questions Functional genomics will not replace the time-honored use of genetics, biochemistry, cell biology and structural studies in gaining a detailed understanding of biological mechanisms.

9 SAGE & MPSS
Serial Analysis of Gene Expression (SAGE)
Massively Parallel Signature Sequencing (MPSS)
Start from mRNA (eukaryotes)
Generate a short sequence tag (9-21 nt) for each mRNA ‘species’ in a cell

10 SAGE
Described by Velculescu et al. (1995)
Originally 9 bp tags; LongSAGE now uses 21 bp tags
10-50 tags in a clone
Only requires a sequencer (and some time)
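As a rough illustration of how a short tag can stand in for an mRNA species, the sketch below takes the bases immediately 3' of the last CATG site in each cDNA sequence and counts how often each tag occurs. The choice of CATG (the NlaIII site commonly used as the SAGE anchoring site), the function names, and the toy sequences are assumptions for illustration and do not come from the slides.

```python
from collections import Counter

# Assumed anchoring-enzyme recognition site (NlaIII); not stated on the slide.
ANCHOR_SITE = "CATG"


def extract_tag(cdna, tag_length):
    """Return the tag that follows the 3'-most anchor site, or None."""
    pos = cdna.rfind(ANCHOR_SITE)
    if pos == -1:
        return None  # no anchor site, so no tag can be derived
    start = pos + len(ANCHOR_SITE)
    tag = cdna[start:start + tag_length]
    # Discard tags that run off the end of the sequence.
    return tag if len(tag) == tag_length else None


def count_tags(cdnas, tag_length):
    """Count tag occurrences across a collection of cDNA sequences."""
    tags = (extract_tag(seq, tag_length) for seq in cdnas)
    return Counter(tag for tag in tags if tag is not None)


# Hypothetical toy sequences: the first two share the same 3'-most tag.
toy_cdnas = [
    "GGACATGAAACCCGGGTTTAAAA",
    "TTCATGCCCATGAAACCCGGGTTTAAAA",
    "CATGTTTTTTTTTTGG",
    "ACGTACGT",
]
tag_counts = count_tags(toy_cdnas, tag_length=9)
```

Tag counts obtained this way are proportional to how often each transcript was sampled, which is what makes the approach an open system: no prior list of genes is needed.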

11 MPSS
Proprietary technology; published 2000
Generates a 17 nt “signature sequence”
Collects >1,000,000 signatures per sample
Requires 2 µg of mRNA and $$

22 Kamath et al. 2003
16,757 strains = 86% of predicted ORFs
Looked for sterility or lethality (Nonv), slow growth (Gro), or defects (Vpep)
1,722 strains (10.3%) had such phenotypes

23 Genes involved in basic metabolism & cell maintenance are enriched for the Nonv phenotype
Genes involved in more complex ‘metazoan’ processes (signal transduction, transcriptional regulation) are enriched for the Vpep phenotype
Nonv phenotypes are highly underrepresented on the X chromosome
The X chromosome is enriched for Vpep phenotypes

24 Basal functions of eukaryotes are shared:
- lethal (Nonv) genes tended to be of ancient origin
- ‘animal-specific’ genes tended to be non-lethal (Vpep)
- almost no ‘worm-specific’ genes were lethal

25 Protein-Protein Interaction
Interactome: proteome-scale data sets of protein-protein interactions; a protein network.
Methods: yeast two-hybrid, protein microarray, gene disruption phenotype, protein subcellular localization, mRNA expression profile, immunoprecipitation/mass spectrometry.
Problem: false positives and false negatives.
Although many protein-protein interaction maps on a global scale have been generated, they suffer from high error rates, as evidenced by the low overlap among the different databases.

26 Network Topology
Various patterns of links connecting pairs of nodes
Computer network or any communication network
A given node has one or more links to others
Network topology is determined only by the configuration of connections between nodes
Daisy chain: linear, ring
Centralization
Decentralization
Hybrid: two or more different basic network topologies

27 Traveling Salesman Network (or Conference Site Map)
Goal: to find the cheapest way of visiting all of the cities and returning to your starting point.
Cities on the map: Seattle, WA; Sioux Falls, SD; Boston, MA; Steamboat, CO; Denver, CO; San Francisco, CA; Iowa City, IA; New York, NY; Lincoln, NE; Chicago, IL; Moab, UT; Washington, DC; Wichita, KS; Lake of the Ozarks, MO; Columbia, SC; Atlanta, GA; Dallas, TX; Fort Lauderdale, FL; Anchorage, AK; Orlando, FL; Honolulu, HI
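To make the traveling salesman goal concrete, here is a minimal brute-force sketch that tries every ordering of the cities and keeps the cheapest round trip. The four-city subset and all travel costs are invented for illustration; they are not from the slide, and exhaustive search like this is only feasible for a handful of cities.

```python
from itertools import permutations


def cheapest_tour(cost, start):
    """Return (tour, total_cost) for the cheapest round trip from start."""
    others = [city for city in cost if city != start]
    best_tour, best_cost = None, float("inf")
    for order in permutations(others):
        tour = (start, *order, start)
        # Sum the cost of each consecutive leg of the round trip.
        total = sum(cost[a][b] for a, b in zip(tour, tour[1:]))
        if total < best_cost:
            best_tour, best_cost = tour, total
    return best_tour, best_cost


# Hypothetical symmetric travel costs between four of the cities.
toy_cost = {
    "Denver": {"Chicago": 3, "Dallas": 2, "Atlanta": 5},
    "Chicago": {"Denver": 3, "Dallas": 4, "Atlanta": 2},
    "Dallas": {"Denver": 2, "Chicago": 4, "Atlanta": 3},
    "Atlanta": {"Denver": 5, "Chicago": 2, "Dallas": 3},
}
tour, total = cheapest_tour(toy_cost, "Denver")
```

The number of orderings grows factorially with the number of cities, which is why the full 21-city map cannot be solved this way in practice.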

28 Protein-Protein Interaction: Extended Neighborhood
Counting simple paths between two proteins.
In addition to the path through the immediate neighbor G, we consider four other simple paths between A and B.
The local structure beyond the shortest path through G contains valuable information.
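The idea of counting simple paths can be sketched with a depth-first search that never revisits a node and stops at a maximum path length. The toy network below is an assumption built only to mirror the slide's description (one path through G plus four longer ones between A and B); the actual figure is not available, so the other node names and edges are hypothetical.

```python
def build_graph(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def simple_paths(graph, start, end, max_len):
    """Yield simple paths (no repeated nodes) with at most max_len edges."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            yield path
            continue
        if len(path) - 1 >= max_len:
            continue  # path is already as long as allowed
        for neighbor in graph[node]:
            if neighbor not in path:
                stack.append((neighbor, path + [neighbor]))


# Hypothetical interaction network around proteins A and B.
toy_edges = [
    ("A", "G"), ("G", "B"),
    ("A", "C"), ("C", "D"), ("D", "B"),
    ("A", "E"), ("E", "F"), ("F", "B"),
    ("C", "F"), ("E", "D"),
]
toy_graph = build_graph(toy_edges)
paths = list(simple_paths(toy_graph, "A", "B", max_len=3))
path_lengths = sorted(len(p) - 1 for p in paths)
```

With a maximum length of three edges, the search finds the two-edge path through G and four three-edge paths, the kind of extended neighborhood the slide describes.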

29 Protein-Protein Interaction: Assigning Weight on the Hub
As the paths between a pair of proteins are found, we assign a score to each path according to its length and the characteristics of its nodes.
We then sum the contributions from all paths to calculate the overall score.
Longer paths should have less weight, since they offer weaker evidence for an interaction.
Traveling along nodes that have fewer neighbors should be given more weight.

30 Total score
np: maximum path length to consider
al: weighting coefficient for paths of length l
nl: number of paths with length l
Pli: nodes along the i-th path of length l, including the start and end nodes
dj: degree of node j
As the paths between a pair of proteins are found, we assign a score to each path according to its length and the characteristics of the nodes it traverses. We then sum the contributions from all paths to calculate the overall score. Intuitively, longer paths should have less weight, since they present weaker evidence for the interaction. Similarly, traversing nodes that have fewer neighbors should generally be given more weight than traversing nodes with many neighbors. However, we assign a lower weight to the scenario on the left. This may not always give the correct result, as there are many real hub-like structures in the network, but we apply a conservative weighting scheme in order to protect against false positives due to experimental errors. The square-root scaling is intuitively appealing because it gives each additional degree diminishing weight, and the 1 added inside the square root lessens the difference between nodes with degrees one and two. The total score is then the sum, over path lengths l up to np, of al times the combined weights of the nl paths of length l, where each path's weight is computed from the degrees dj of the nodes in Pli.
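The transcript does not preserve the score formula itself, so the sketch below uses one form that is consistent with the definitions above, stated here as an assumption: score = sum over l of al × sum over the nl paths of length l of the product, over nodes j in Pli, of 1/sqrt(1 + dj). The product form, the coefficient values, and the toy networks are all hypothetical; only the square-root scaling with the added 1 comes from the text.

```python
import math


def simple_paths(graph, start, end, max_len):
    """Yield simple paths (no repeated nodes) with at most max_len edges."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == end:
            yield path
            continue
        if len(path) - 1 >= max_len:
            continue
        for neighbor in graph[node]:
            if neighbor not in path:
                stack.append((neighbor, path + [neighbor]))


def path_weight(graph, path):
    """Assumed path weight: product of 1/sqrt(1 + degree) over all nodes."""
    return math.prod(1.0 / math.sqrt(1 + len(graph[n])) for n in path)


def total_score(graph, start, end, coeffs):
    """Sum a_l * path weight over all simple paths up to the longest l."""
    max_len = max(coeffs)  # plays the role of np
    score = 0.0
    for path in simple_paths(graph, start, end, max_len):
        length = len(path) - 1
        score += coeffs.get(length, 0.0) * path_weight(graph, path)
    return score


# Hypothetical coefficients: longer paths get a smaller a_l.
toy_coeffs = {2: 1.0, 3: 0.5}

# A and B linked through a low-degree neighbor G ...
chain = {"A": {"G"}, "G": {"A", "B"}, "B": {"G"}}
# ... versus the same link through a hub-like G with three extra partners.
hub = {
    "A": {"G"}, "B": {"G"},
    "G": {"A", "B", "H1", "H2", "H3"},
    "H1": {"G"}, "H2": {"G"}, "H3": {"G"},
}
chain_score = total_score(chain, "A", "B", toy_coeffs)
hub_score = total_score(hub, "A", "B", toy_coeffs)
```

Under these assumptions the path through the hub scores lower than the path through the low-degree neighbor, which is the conservative behavior the text describes.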

31 Database
Molecular Interaction (MINT) database (Zanzoni et al., 2002)
Datasets by Gavin et al. (2002) and Ho et al. (2002)
Database of Interacting Proteins (DIP) by Salwinski et al. (2004)
MINT focuses on experimentally verified protein interactions mined from the scientific literature by expert curators. The curated data can be analyzed in the context of the high-throughput data and viewed graphically with the 'MINT Viewer'.
The DIP database catalogs experimentally determined interactions between proteins. It combines information from a variety of sources to create a single, consistent set of protein-protein interactions. The data stored in DIP were curated both manually, by expert curators, and automatically, using computational approaches that draw on knowledge about protein-protein interaction networks extracted from the most reliable, core subset of the DIP data. Projects related to DIP include LiveDIP, the Database of Ligand-Receptor Partners (DLRP), and JDIP.

32 Mediator Complex
RNA polymerase II transcription process
Bridges gene-specific transcription factors and the core RNAP II machinery
25 subunits
Computational and 3D structural analysis

33 Glucose Metabolism
Carbon and energy source
Adaptation of metabolism based on the available nutrients
Regulates gene expression
Glucose homeostasis regulates lifespan and aging in all eukaryotes
Snf1 protein kinase complex: key component of the glucose repression and derepression pathway

34 Aging
The ultimate causes of aging are unknown
Multifactorial process
Mutation accumulation and oxidation

35 Mediator Closed Conformation
Diagram node labels: SRB1, SRB10, SRB9, SRB8, NUT1, MED10, MED3, MED1, RGR1, SIN4, MED2, MED7, SRB7, GAL11, MED4, ROX3, SRB2, SRB6, SRB5, MED3, SRB4, MED6, MED11

36 Mediator Open Conformation
Diagram node labels: SIN4, GAL1, RGR1, MED2, MED4, MED1, SRB7, MED10, SRB8, ROX3, SRB9, SRB11, NUT1, SRB6, MED7, SRB10, SRB4, SRB5, MED6, MED8, SRB2, MED11

37 Mediator, Glucose, and Aging Network
Diagram node labels: GPR1, RAS1, SDS22, SLT2, RAS2, GPA2, PAK1, CDC25, SET1, GLC7, SWI4, CYR1, TOS3, HDA1, TUP1, REG1, SGS1, TRK2, SNF4, SIP1, SIR3, ELM1, HOG1, CYCB, MSN5, SIP1, RAP1, SIP2, SNF1, SIT4, DMC1, GALB3, CRC1, MIG1, SIR4, ACC1, ZDS2, SCD1, SRB11, SRB10, SIP4, ZDS1, RAD50, NET1, CAT8, HAP4, GAL4, SIR2, SRB6, SRB2, CDC14, RB4, RGR1, MED11, SIN4, ESCB, ADA1, MED10, MED8, GAL11, MED3, MED6, ROX3, NUT1, SRB5, MED2, SRB8, MED1, GCN5, SRB7, SRB9, SRB4, MED4, MED7

38 Oxidative stress Aging Stress (cell senescence) Plasma membrane Stress Osmotic stress Calorie restriction Tor Activity Unknown Kinase Activity Respiration Fermentation Hog1 Activity Slt4 Activity Slt2 Activity Tup1 Repressor Activity Sir2 Activity Slt2 pSir3 Sir3 Longevity Sit4 Gre2 Gene Expression PAU gene Expression Regulated Longevity Cell Wall Remodeling Stress Response

