Subcellular Localization: provides a simple goal for genome-scale functional prediction. Determine how many of the ~6000 yeast proteins go into each compartment.
Subcellular Localization, a standardized aspect of function [Figure: cell compartments: Nucleus, Membrane, Extracellular (secreted), ER, Cytoplasm, Mitochondria, Golgi]
"Traditionally" subcellular localization is "predicted" by sequence patterns NLS TM-helix Sig. Seq. HDEL Nucleus Membrane Extra- cellular [secreted] ER Cytoplasm Mitochondria Golgi Import Sig.
Subcellular localization is associated with the level of gene expression [Figure: expression level in copies/cell for the compartments Nucleus, Membrane, Extracellular (secreted), ER, Cytoplasm, Mitochondria, Golgi]
Combine Expression Information & Sequence Patterns to Predict Localization [Figure: sequence patterns NLS, TM-helix, Sig. Seq., HDEL, and Import Sig., together with expression level in copies/cell, shown against the compartments Nucleus, Membrane, Extracellular (secreted), ER, Cytoplasm, Mitochondria, Golgi]
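The slide does not say how the two kinds of evidence are combined. One common way to merge independent hints is a naive Bayes update, sketched below under that assumption. Every number in `PRIOR` and `LIKELIHOODS` is a made-up placeholder chosen only to make the arithmetic visible; none of them comes from the slides or from real yeast data, and only three compartments are shown to keep the example short.

```python
def posterior(prior, likelihoods, observed):
    """Naive Bayes: P(compartment | features) is proportional to
    prior * product of P(feature value | compartment)."""
    scores = {}
    for compartment, p in prior.items():
        score = p
        for feature, value in observed.items():
            score *= likelihoods[compartment][feature][value]
        scores[compartment] = score
    total = sum(scores.values())
    return {compartment: s / total for compartment, s in scores.items()}


# Hypothetical placeholder numbers, for illustration only.
PRIOR = {"nucleus": 0.3, "cytoplasm": 0.5, "ER": 0.2}
LIKELIHOODS = {
    "nucleus": {
        "NLS": {True: 0.6, False: 0.4},
        "HDEL": {True: 0.01, False: 0.99},
        "expression": {"high": 0.2, "low": 0.8},
    },
    "cytoplasm": {
        "NLS": {True: 0.1, False: 0.9},
        "HDEL": {True: 0.01, False: 0.99},
        "expression": {"high": 0.6, "low": 0.4},
    },
    "ER": {
        "NLS": {True: 0.05, False: 0.95},
        "HDEL": {True: 0.5, False: 0.5},
        "expression": {"high": 0.3, "low": 0.7},
    },
}

# One protein: NLS-like pattern present, no HDEL, low expression bin.
observed = {"NLS": True, "HDEL": False, "expression": "low"}
for compartment, p in posterior(PRIOR, LIKELIHOODS, observed).items():
    print(f"{compartment}: {p:.3f}")
```

The point of the sketch is that expression level enters as just one more feature alongside the sequence patterns, so a weak pattern match can be strengthened or weakened by it.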

Major Objective: Discover a comprehensive theory of life’s organization at the molecular level
–The major actors of molecular biology: the nucleic acids, deoxyribonucleic acid (DNA) and ribonucleic acid (RNA)
–The central dogma of molecular biology???
Proteins are very complicated molecules built from 20 different amino acids.
Epigenetics, RNA editing, post-translational modification, translational regulation

Data Mining [Diagram labels: Microarray Experiment; Image Analysis; Biology Application Domain; Experiment Design and Hypothesis; Data Analysis; Artificial Intelligence (AI); Knowledge Discovery in Databases (KDD); Data Warehouse; Validation; Statistics]

Higher-Level Microarray Data Analysis
Clustering and pattern detection
Data mining and visualization
Linkage between gene expression data and gene sequence/function/metabolic pathway databases
Discovery of common sequences in co-regulated genes
Meta-studies using data from multiple experiments

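As one illustration of "discovery of common sequences in co-regulated genes", the sketch below counts which short subsequences (k-mers) occur in every sequence of a group. This is a simplified stand-in for real motif discovery, which would also compare against background frequencies; the function names and the toy sequences are made up for the example.

```python
from collections import Counter


def kmer_gene_counts(sequences, k):
    """For each k-mer, count how many sequences contain it at least once."""
    counts = Counter()
    for seq in sequences:
        s = seq.upper()
        # A set, so that repeats within one sequence are counted once.
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return counts


def shared_kmers(sequences, k, min_fraction=1.0):
    """Return k-mers present in at least min_fraction of the sequences."""
    counts = kmer_gene_counts(sequences, k)
    needed = min_fraction * len(sequences)
    return sorted(kmer for kmer, n in counts.items() if n >= needed)


# Made-up toy "upstream regions" for three co-regulated genes.
upstream = ["TTACGCGTAA", "GGACGCGTCC", "CAACGCGTTG"]
print(shared_kmers(upstream, k=6))
```

Lowering `min_fraction` tolerates genes in the cluster that lack the shared sequence.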
Scatter plot of all genes in a simple comparison of two controls (A) and two treatments (B: high vs. low glucose), showing changes in expression greater than 2.2-fold and 3-fold.
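The fold-change cutoffs in the scatter plot can be sketched as below. This assumes fold change is the ratio of the mean treatment signal to the mean control signal and that the cutoff applies symmetrically to decreases; the slide does not state either choice, and the gene names and intensities are made up.

```python
from statistics import mean


def fold_change(control, treatment):
    """Ratio of mean treatment signal to mean control signal."""
    return mean(treatment) / mean(control)


def classify(fc, threshold):
    """Label a gene by whether its fold change exceeds the threshold."""
    if fc > threshold:
        return "up"
    if fc < 1 / threshold:
        return "down"
    return "unchanged"


# Made-up intensities: two control arrays and two treatment arrays per gene.
genes = {
    "geneA": ([100, 120], [330, 352]),
    "geneB": ([200, 200], [500, 500]),
    "geneC": ([400, 400], [100, 100]),
    "geneD": ([100, 100], [110, 90]),
}

for name, (control, treatment) in genes.items():
    fc = fold_change(control, treatment)
    print(name, round(fc, 2), classify(fc, 2.2), classify(fc, 3.0))
```

A gene can pass the 2.2-fold cutoff and still fail the 3-fold one, which is why the plot marks both.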

Types of Clustering
Hierarchical
–Link similar genes, build up to a tree of all
Self-Organizing Maps (SOM)
–Split all genes into similar sub-groups
–Finds its own groups (machine learning)

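The hierarchical idea ("link similar genes, build up to a tree of all") can be sketched as a small agglomerative loop. Euclidean distance and average linkage are assumptions made here for illustration; the slide does not name a distance or linkage rule, and the expression profiles are made up.

```python
from math import dist


def average_linkage(a, b, profiles):
    """Mean pairwise Euclidean distance between two clusters of genes."""
    total = sum(dist(profiles[i], profiles[j]) for i in a for j in b)
    return total / (len(a) * len(b))


def hierarchical_clustering(profiles):
    """Repeatedly merge the two closest clusters until one remains.

    Returns the merges in order as (cluster_a, cluster_b, distance);
    read bottom-up, the merges describe the tree.
    """
    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], profiles)
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges


# Made-up expression profiles over three conditions.
profiles = {
    "g1": (1.0, 2.0, 3.0),
    "g2": (1.1, 2.1, 2.9),
    "g3": (3.0, 2.0, 1.0),
    "g4": (2.9, 1.9, 1.2),
}

for a, b, d in hierarchical_clustering(profiles):
    print(a, "+", b, f"at distance {d:.3f}")
```

This brute-force loop is fine for a handful of genes; for genome-scale data a library implementation would be the practical choice.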

Public Databases
Gene expression data is an essential aspect of annotating the genome
Publication and data exchange for microarray experiments
Data mining/meta-studies
Common data format: XML
MIAME (Minimum Information About a Microarray Experiment)


The 3 Gene Ontologies
Molecular Function = elemental activity/task
–the tasks performed by individual gene products; examples are carbohydrate binding and ATPase activity
Biological Process = biological goal or objective
–broad biological goals, such as mitosis or purine metabolism, that are accomplished by ordered assemblies of molecular functions
Cellular Component = location or complex
–subcellular structures, locations, and macromolecular complexes; examples include nucleus, telomere, and RNA polymerase II holoenzyme


One Last Note
Microarrays are “cutting edge” technology
You now have experience doing a technique that most Ph.D.s have never done
Looks great on a resume…