plants.ensembl.org: 2nd transPLANT user training workshop, Poznań, 27th-28th June 2013. EBI is an Outstation of the European Molecular Biology Laboratory.

Presentation transcript:

Browsing Genomic Information with Ensembl Plants
Dan Bolser (adapted from slides by Bert Overduin)

Outline of workshop
Brief introduction to Ensembl Plants: history, content
Tutorial (~1:30 h)
Interactive exercises and answers

Ensembl & Ensembl Genomes
1999: Start of Ensembl project (Human Genome)
2001: First release of data and web interface
2002: Mouse, mosquito, fugu, zebrafish and rat added
…
2009: First release of Ensembl Genomes
…
2012: Ensembl (v69): 71 genomes
2012: Ensembl Genomes (v16): 359 genomes

Ensembl & Ensembl Genomes
Vertebrates: annotation in-house by the Ensembl project (European Bioinformatics Institute & Wellcome Trust Sanger Institute)
Invertebrates, plants, fungi, protists and bacteria: annotation by or in collaboration with the scientific community (European Bioinformatics Institute)

Species in Ensembl
Primates; Rodents etc.; Laurasiatheria; Afrotheria; Xenarthra; Other mammals; Birds & reptiles; Amphibians; Fish; Other chordates; Other eukaryotes; On Pre! Ensembl

Species in Ensembl Genomes

Species in Ensembl Plants

Data
Genomic sequence
Gene / transcript / protein models
External references
Mapped sequences: cDNAs, proteins, repeats, markers, probes, etc.
Variation data: sequence variants, structural variants

Data
Comparative data:
Orthologues and paralogues (between plants and pan-taxonomic)
Protein families
Whole genome pairwise alignments (selected species)
Synteny (selected species)
8-way whole genome multiple alignment

Expected … sooner or later
Barley (Hordeum vulgare)
Potato (Solanum tuberosum)
Bread wheat (Triticum aestivum)
Medicago (Medicago truncatula)
Pigeon pea (Cajanus cajan)
Papaya (Carica papaya)
Cucumber (Cucumis sativus)
Domesticated apple (Malus x domestica Borkh.)
Woodland strawberry (Fragaria vesca)
Norway spruce (Picea abies) (18 Gb!)

Access to data
Web browser
BioMart
FTP: ftp://ftp.ensemblgenomes.org/pub/plants/
Public MySQL server: mysql.ebi.ac.uk:4157:anonymous
Ensembl APIs
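The public MySQL entry above can be read as host, port and user name. The following is a minimal sketch, assuming that host:port:user reading is correct and that the standard `mysql` command-line client is installed. `parse_server` and `mysql_command` are hypothetical helper names, and the code only builds the command; it does not contact the server.

```python
import shlex

# Connection string exactly as written on the slide.
# Assumption: the three colon-separated fields are host, port and user name.
PUBLIC_SERVER = "mysql.ebi.ac.uk:4157:anonymous"


def parse_server(spec):
    """Split a 'host:port:user' string into its three parts."""
    host, port, user = spec.split(":")
    return {"host": host, "port": int(port), "user": user}


def mysql_command(spec):
    """Build (but do not run) a mysql client command line for the server."""
    server = parse_server(spec)
    args = [
        "mysql",
        "--host", server["host"],
        "--port", str(server["port"]),
        "--user", server["user"],
    ]
    # Quote each argument so the result can be pasted into a shell.
    return " ".join(shlex.quote(arg) for arg in args)


print(mysql_command(PUBLIC_SERVER))
```

Pasting the printed command into a shell should open a session, from which `SHOW DATABASES;` would list what is available.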

BioMart
Data retrieval tool
Originally developed for Ensembl (EnsMart)
Now used by many large data resources
Integrated with several widely used software packages, e.g. Galaxy, BioConductor
Joint project between the European Bioinformatics Institute (EBI) and the Ontario Institute for Cancer Research (OICR)
Central portal:

Help
Helpdesk
Mailing lists
YouTube and YouKu (优酷网) channels

Workshops
Browser (0.5-2 days) and API (1-3 days) workshops
Combination of lectures and hands-on exercises
Advertised on
You can host your own workshop!
For academic institutions there is no fee, apart from the instructor’s expenses
You only need a computer room and participants
You can get more info from or

Ensembl Genomes

Tutorial

Tutorial objectives
After this tutorial you should be able to:
Search and navigate the Ensembl Plants website.
Understand Ensembl Plants annotation.
Attach and visualize your BAM and VCF data.
Retrieve Ensembl Plants data using BioMart.
Find help and documentation.

Background: G6PD
Glucose-6-phosphate dehydrogenase (G6PD or G6PDH) is a cytosolic enzyme in the pentose phosphate pathway, a metabolic pathway that supplies reducing energy to cells by maintaining the level of the co-enzyme nicotinamide adenine dinucleotide phosphate (NADPH). G6PD is widely distributed in many species from bacteria to humans. In higher plants, several isoforms of G6PDH have been reported, which are localized in the cytosol, the plastidic stroma, and peroxisomes.

Species pages; Info on current release; Search

Exercise 1
Go to the Ensembl Plants homepage.
What is the current release (version) of Ensembl Plants?
On which data are the genome sequence and gene annotation for Arabidopsis thaliana based?

Gene tab
Help
Side menu
Top panel stays the same as long as you stay on the same tab
Main panel changes when you choose another page from the side menu

Exercise 2
Find the Arabidopsis thaliana gene encoding glucose-6-phosphate dehydrogenase 1.
What is the official gene name for this gene?
On which chromosome and on which strand is it located?
What do the empty boxes, filled boxes and lines in the transcript models represent?

Phylogenetic GeneTree: Duplication node; Speciation node; Collapsed sub tree; Gene of interest
Protein multiple alignment: (Mis)match; Gap

Exercise 3
Explore the ‘Paralogues’ and ‘Gene Tree’ pages.
How many paralogues have been identified for the G6PD1 gene? Which paralogues show the highest sequence similarity?
Does the plant gene tree reflect the information that is shown on the ‘Paralogues’ page?
Does the pan-taxonomic gene tree confirm that glucose-6-phosphate dehydrogenase is present in species across all kingdoms?

Transcript tab; Changed side menu

Exercise 4
Explore the G6PD1 transcript and protein (AT5G ).
How many exons does this transcript have? Are any of them (partially) untranslated?
Is it cross-referenced to the UniProtKB/Swiss-Prot database? What is its ID and recommended name according to UniProtKB/Swiss-Prot?
Do any of the associated Gene Ontology (GO) terms hint at a role of glucose-6-phosphate dehydrogenase 1 in the pentose phosphate pathway?
Where in the cell is glucose-6-phosphate dehydrogenase 1 located?
In which part of the glucose-6-phosphate dehydrogenase 1 protein is its NAD binding domain located?

Location tab
Top panel: Overview; Chromosome
Main panel: Zoom in, zoom out; Add tracks and remove tracks; Add your own data
Add tracks; Tracks

Categories of tracks; Search tracks; Turn track on/off

Exercise 5
Explore the genomic region of the G6PD1 gene.
Which species in Ensembl Plants shows the highest sequence conservation for this region when compared to Arabidopsis thaliana? And which species the lowest?
What part of the sequence is most conserved across the various species? Is this what you would expect?

Add your own data; Location of your data

Exercise 6
Attach the following file, which contains RNA-Seq data for a wild type Arabidopsis thaliana seedling, to Ensembl Plants:
Is the G6PD1 gene expressed?
Compare its expression to a gene that is expected to be constitutively highly expressed, e.g. RBCS1A (ribulose bisphosphate carboxylase small chain 1A), and one that is not, e.g. PR1 (pathogenesis-related protein 1).

Paste data … or upload file … or provide URL

Exercise 7
The following file contains the genomic coordinates and alleles of a number of new variants in the G6PD1 gene of Arabidopsis thaliana:
Do any of these variants change the sequence of the glucose-6-phosphate dehydrogenase 1 protein?
Have any of the variants already been annotated in Ensembl?
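A variant file of this kind gives, for each variant, its coordinates and alleles. As a sketch only, and assuming the exercise relies on the Ensembl Variant Effect Predictor: that tool accepts a whitespace-separated default format of chromosome, start, end, alleles (reference/alternative) and strand. The helper name `to_vep_line` and the coordinates below are made up for illustration; they are not taken from the exercise file, whose contents are not shown here.

```python
def to_vep_line(chromosome, start, end, ref, alt, strand="+"):
    """Format one variant as: chromosome start end ref/alt strand."""
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    return f"{chromosome} {start} {end} {ref}/{alt} {strand}"


# Hypothetical single-nucleotide variants (placeholders, not real data).
# For a single-nucleotide variant, start and end are the same position.
variants = [
    ("5", 1000, 1000, "A", "G"),
    ("5", 2000, 2000, "C", "T"),
]

lines = [to_vep_line(*variant) for variant in variants]
print("\n".join(lines))
```

Lines in this shape could then be pasted, uploaded as a file, or served from a URL.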

Step 1; Step 2; Step 3; Step 4; Preview of results; Export results to file

BioMart
Step 1 (Dataset): Choose your dataset and species
Step 2 (Filters): Limit your dataset
Step 3 (Attributes): Specify what information you want to output
Step 4 (Results): Preview and output your results
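The same four steps can also be written down as a BioMart XML query, which is how BioMart queries are expressed outside the web interface. The sketch below only builds the XML text and sends nothing. The function name `build_biomart_query` is hypothetical, and the schema, dataset, filter and attribute names in the example call are assumptions that should be checked against the mart actually being queried; the GO identifier is a placeholder.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes, virtual_schema="default"):
    """Return BioMart query XML for one dataset.

    dataset    -- Step 1: internal dataset name
    filters    -- Step 2: mapping of filter name to value
    attributes -- Step 3: list of attribute names to output
    """
    query = ET.Element("Query", {
        "virtualSchemaName": virtual_schema,
        "formatter": "TSV",        # Step 4: tab-separated results
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset", {"name": dataset, "interface": "default"})
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": value})
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    body = ET.tostring(query, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n<!DOCTYPE Query>\n' + body


# All names below are assumptions for illustration only.
example = build_biomart_query(
    dataset="athaliana_eg_gene",                      # assumed dataset name
    filters={"hypothetical_go_filter": "GO:XXXXXXX"},  # placeholder filter and ID
    attributes=["ensembl_gene_id", "description"],    # assumed attribute names
    virtual_schema="plants_mart",                     # assumed schema name
)
print(example)
```

Keeping filters and attributes as plain arguments mirrors the web interface, so a query built interactively can be reproduced later by changing only the values passed in.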

Exercise 8
Select the Ensembl Genes dataset for Arabidopsis thaliana.
Filter for all genes that are annotated with the GO term ‘pentose-phosphate shunt’, the official GO term for the pentose-phosphate pathway ( bin/amigo/term_details?term=GO: ).
Select the following attributes: Ensembl Gene ID, Associated Gene Name and Description.
View the results. How many genes does the query find? Are all G6PD genes amongst the results?

Explore your favorite genes!

Acknowledgments
Team: Dan Bolser, Paul Davies, Paul Derwent, Christoph Grabmüller, Kevin Howe, Daniel Hughes, Jay Humphrey, Arnaud Kerhornou, Paul Kersey, Eugene Kulesha, Nick Langridge, Dan Lawson, Uma Maheswari, Gareth Maslen, Mark McDowall, Karyn Megy, Michael Nuhn, ChuangKee Ong, Michael Paulini, Helder Pedro, Dan Staines, Iliana Toneva, Mary-Ann Tuli, Gareth Williams, Derek Wilson
Collaborators: Gramene, Rothamsted Research
Funding: EMBL, EU-FP7, BBSRC