Luciano Brocchieri, PhD Research Interests. Summary of Research Interests 1.Gene identification and genome annotation 2.The evolution of genome-sequence.

Slides:



Advertisements
Similar presentations
After 13 years of scientist work predominatly in USA & UK the DNA sequence of the human genome was completed in 2003 Any ideas how they did it? What would.
Advertisements

Using phylogenetic profiles to predict protein function and localization As discussed by Catherine Grasso.
Summer Bioinformatics Workshop 2008 Comparative Genomics and Phylogenetics Chi-Cheng Lin, Ph.D., Professor Department of Computer Science Winona State.
Phylogenetic reconstruction
Speaker: HU Xue-Jia Supervisor: WU Yun-Dong Date: 19/12/2013.
Systems Biology Existing and future genome sequencing projects and the follow-on structural and functional analysis of complete genomes will produce an.
RNA and Protein Synthesis
. Class 1: Introduction. The Tree of Life Source: Alberts et al.
Molecular Evolution with an emphasis on substitution rates Gavin JD Smith State Key Laboratory of Emerging Infectious Diseases & Department of Microbiology.
Genetics and the Organism 10 Jan, Genetics Experimental science of heredity Grew out of need of plant and animal breeders for greater understanding.
BLOSUM Information Resources Algorithms in Computational Biology Spring 2006 Created by Itai Sharon.
Bioinformatics Unit 1: Data Bases and Alignments Lecture 3: “Homology” Searches and Sequence Alignments (cont.) The Mechanics of Alignments.
Bioinformatics Original definition (1979 by Paulien Hogeweg): “application of information technology and computer science to the field of molecular biology”
Lecture 12 Splicing and gene prediction in eukaryotes
Subsystem Approach to Genome Annotation National Microbial Pathogen Data Resource Claudia Reich NCSA, University of Illinois, Urbana.
TGCAAACTCAAACTCTTTTGTTGTTCTTACTGTATCATTGCCCAGAATAT TCTGCCTGTCTTTAGAGGCTAATACATTGATTAGTGAATTCCAATGGGCA GAATCGTGATGCATTAAAGAGATGCTAATATTTTCACTGCTCCTCAATTT.
Multiple Sequence Alignment May 12, 2009 Announcements Quiz #2 return (average 30) Hand in homework #7 Learning objectives-Understand ClustalW Homework#8-Due.
What is comparative genomics? Analyzing & comparing genetic material from different species to study evolution, gene function, and inherited disease Understand.
Amandine Bemmo 1,2, David Benovoy 2, Jacek Majewski 2 1 Universite de Montreal, 2 McGill university and Genome Quebec innovation centre Analyses of Affymetrix.
Measuring the T m of DNA GC pairs connected by 3 H bonds AT pairs connected by 2 H bonds * Higher GC content  higher T m Absorbance of 260 nM light (UV)
Identify gene markers for different taxonomic groups in Archaea and Bacteria Genomes Dongying Wu 1,2, Jonathan A. Eisen 1,2 1. DOE Joint Genome Institute,
OMICS Group Contact us at: OMICS Group International through its Open Access Initiative is committed to make genuine and.
Lecture 25 - Phylogeny Based on Chapter 23 - Molecular Evolution Copyright © 2010 Pearson Education Inc.
Genomes and Their Evolution. GenomicsThe study of whole sets of genes and their interactions. Bioinformatics The use of computer modeling and computational.
TGCAAACTCAAACTCTTTTGTTGTTCTTACTGTATCATTGCCCAGAATAT TCTGCCTGTCTTTAGAGGCTAATACATTGATTAGTGAATTCCAATGGGCA GAATCGTGATGCATTAAAGAGATGCTAATATTTTCACTGCTCCTCAATTT.
Ch. 21 Genomes and their Evolution. New approaches have accelerated the pace of genome sequencing The human genome project began in 1990, using a three-stage.
Codon usage bias Ref: Chapter 9 Xuhua Xia dambe.bio.uottawa.ca.
Chapter 24: Molecular and Genomic Evolution CHAPTER 24 Molecular and Genomic Evolution.
Identifying and Modeling Selection Pressure (a review of three papers) Rose Hoberman BioLM seminar Feb 9, 2004.
Multiple Alignment and Phylogenetic Trees Csc 487/687 Computing for Bioinformatics.
A Theoretical Approach for the Genetic Code Paul SORBA Seminar dedicated to my always young friend Branko Belgrad, Sept.2015.
November 18, 2000ICTCM 2000 Introductory Biological Sequence Analysis Through Spreadsheets Stephen J. Merrill Sandra E. Merrill Marquette University Milwaukee,
Central dogma: the story of life RNA DNA Protein.
Codon usage bias Ref: Chapter 9
Cédric Notredame (08/12/2015) Molecular Evolution Cédric Notredame.
Chapter 10 Phylogenetic Basics. Similarities and divergence between biological sequences are often represented by phylogenetic trees Phylogenetics is.
Genome annotation and search for homologs. Genome of the week Discuss the diversity and features of selected microbial genomes. Link to the paper describing.
Human Genomics. Writing in RED indicates the SQA outcomes. Writing in BLACK explains these outcomes in depth.
Translational evidence and the accuracy of prokaryotic gene annotation Luciano Brocchieri Department of Molecular Genetics & Microbiology and Genetics.
Genes and Genomes. Genome On Line Database (GOLD) 243 Published complete genomes 536 Prokaryotic ongoing genomes 434 Eukaryotic ongoing genomes December.
Classification. Cell Types Cells come in all types of shapes and sizes. Cell Membrane – cells are surrounded by a thin flexible layer Also known as a.
Classification.
Sequence Alignment.
Ayesha M.Khan Spring Phylogenetic Basics 2 One central field in biology is to infer the relation between species. Do they possess a common ancestor?
Finding genes in the genome
Major characteristics used in taxonomy
BIOINFORMATICS Ayesha M. Khan Spring 2013 Lec-8.
Protein Evolution Introducing the use of Biology Workbench as a Bioinformatics Tool.
General Microbiology (Micr300)
1. 2 Discovering the codon bias 3 Il codice genetico è DEGENERATO.
Discovering the codon bias
Bioinformatics Overview
OMICS Journals are welcoming Submissions
Introduction to Bioinformatics Resources for DNA Barcoding
Week-6: Genomics Browsers
Causes of Variation in Substitution Rates
Pipelines for Computational Analysis (Bioinformatics)
5.4 Cladistics.
Genomes and their evolution
Genomes and their evolution
What are the Patterns Of Nucleotide Substitution Within Coding and
There are four levels of structure in proteins
Genome organization and Bioinformatics
Summary and Recommendations
Evidence and Phylogenetic trees
KEY CONCEPT Entire genomes are sequenced, studied, and compared.
Unit Genomic sequencing
Maximum-likelihood phylogenetic tree of the bacterial domain, highlighting heterotrophic taxa most commonly found associated with diatoms. Maximum-likelihood.
Summary and Recommendations
Patterns of amino acid usage and its GC-content of synonymous codons in 65 nuclear genomes in this study. Patterns of amino acid usage and its GC-content.
Presentation transcript:

Luciano Brocchieri, PhD Research Interests

Summary of Research Interests 1.Gene identification and genome annotation 2.The evolution of genome-sequence compositional properties 3.Models of protein evolution 4.The phylogeny of Bacteria 5.Measures of differentiation from genomic and metagenomic data 6.The evolution of chaperone proteins

1. Gene identification and genome annotation From gene prediction to prokaryotic genome annotation – Missing genes are recognized in bacterial annotations – New methods and bioinformatics tools are needed to translate information from computational gene prediction and from conservation into accurate genome annotations Frame analysis – Frame analysis provides an effective way of visualizing frame-specific contrasts in sequence composition that reveal genes missed in genome annotations. – Tools are needed to efficiently translate the qualitative information provided by frame analysis into genome-wide annotations.

1. Gene identification and genome annotation Quantitative frame analysis We are developing methods for: – Quantitative characterization of sequence regions with compositional periodicity. – Statistical characterization of the intensity of compositional contrasts – Graphical representation of results in relation to gene-prediction information for sequence annotation

1. Gene identification and genome annotation Expression analyses To verify expression of predicted genes we use: – RNAseq: genome-wide sequencing of the trancriptome, measuring mRNA expression levels. – Ribosome Profiling (RIBO-seq): genome-wide analysis of translation of coding regions.

2. Sequence compositional properties What are the determinants of the composition of bacterial genomes? – Bacterial genomes encompass a wide variety of compositional properties. – In bacterial GC content ranges from about 16% to 75% GC. Does GC content reflect different selective pressures in different organisms? – Codon usage is highly variable across bacterial species. What are the determinants of codon usage?

2. Sequence compositional properties What are the determinants of GC content? How does GC content vary in different regions of the genomes? – We are investigating the relation of GC content with coding capacity and how GC content. – How does GC content relate to site-specific variation in selective pressures for optimal codon usage and for amino acid composition across species and across protein families?

2. Sequence compositional properties What are the determinants of optimal codon usage? How does codon usage vary across species? – How does codon usage relate to GC content? – What are the optimal codons for each organism? – How does optimal codon usage relate to patterns of gene conservation? 1000 prokaryotic genomes analysis – Definition of amino-acid-specific optimal codon types in 1000 genomes – We are interested in investigating how optimal codon usage relates to patterns of gene conservation across 1000 genomes.

3. Models of protein evolution ClassesAmino acid type Hydrophilic and small (MW: 75–146)A N C G P S T Hydrophobic and small (MW: )I L M V Negatively charged (MW: 133–147)D E Positively charged, aromatic and relatively large (MW: ) R Q H K F W Y Hanada et al Protein functional differentiation has been most commonly associated with substitution events between amino acid types belonging to different classes of physico-chemical properties (Hanada et al. 2007) implying that substitutions between similar types prevalently involve preservation of the protein functional/structural properties.

3. Models of protein evolution Long-term protein evolution as a mixture of neutral substitution and functional differentiation P1 P2 P3 P4 We are modeling protein evolution as a mixture of two processes: 1) frequent neutral amino acid substitutions within profiles (e.g., P1), punctuated by 2) profile changes related to functional differentiation (P1 -> P2 -> P3 ….). A protein can be modeled as a sequence of profiles of site-specific amino acid frequencies.

Determining the evolutionary relationships among bacterial phyla based on phylogenetic trees of protein evolution 4. The phylogeny of Bacteria

Identification of characters preserving the phylogenetic signal Development of amino acid substitution models 4. The phylogeny of bacteria Reconstruction of phylogenetic trees

Gammaproteobacteria Betaproteobacteria Alpahproteobacteria Deltaproteobacteria Epsilonproteobacteria Aquificae Bacteriodetes Plactomycetes, Verucomicrobia Chlamydiae Chlorobi Spirochates Lactobacillales Bacillales Clostridia Tenericutes Actinobacteria Thermotogae Deinococcus-Thermus Cyanobacteria Chloroflexi 0.1 Acidobacteria 4. The phylogeny of bacteria A tree based on our neutral-constrained model and switch positions

5. Measures of differentiation from genomic and metagenomic data We are developing measures of diversity within samples of data (e.g., from genome projects). to help direct future sequencing projects in an effort to maximize diversity. We are also interested in developing measures of diversity that will allow for meaningful comparisons between sets of data. For example, comparisons between microbiomes that take into consideration the evolutionary relations of the different sequences. We are working on measures of diversity based on the concept of genetic relatedness (Wright, 1922), calculated over given evolutionary trees.

5. Measures of differentiation Expected Genetic Diversity A measure of genetic diversity (as opposed to phenotypic diversity) It estimates how many states not identical by descent are expected to be represented at the leaves of a phylogenetic tree (or of a cluster). In our implementation states correspond to amino acid types. Estimates of Expected Genetic Diversity depend on the substitution model as well as on branch lengths and tree topology. Associated with this measure are measures of sampling density, group relatedness, and leaf-weights.

6. Evolution of chaperone proteins Chaperone proteins are well known for the critical role they play in assisting cellular activities, such as protein folding, and their role in in disease. Recent analyses indicated that the number and functional differentiation of eukaryotic chaperone genes is larger than previously thought. The availability of complete genome sequences makes it possible to pursue characterization of the complete set of chaperone genes encoded by human and other species.

6. The evolution of chaperone proteins Human chaperonin-genes Chaperonins are involved in folding nascent peptides and in rescuing mis-folded proteins. We identified fifty-four chaperonin-like sequences in the human genome including a newly-defined class of chaperonin genes named CCT8L, represented in human by the two sequences CCT8L1 and CCT8L2 (Mukherjee et al, BMC Evol Biol 2010). The characterization of many newly-discovered chaperonin pseudogenes uncovered the intense duplication activity of eukaryotic chaperonin genes (Mukherjee et al, BMC Evol Biol 2010). By extensive searches of chaperonin-like genes, we uncovered the ancient origin of chaperonin-like BBS genes, associated with the human developmental disorder Bardet-Biedl Syndrome (BBS), and previously believed to be metazoan-specific (Mukherjee and Brocchieri, J Phylogen Evolution Biol 2013).