Viewing & Getting GO COST Functional Modeling Workshop 22-24 April, Helsinki.

Presentation transcript:


Summary
1. Ontology browsers
   - QuickGO, AmiGO – searching for GO Terms, getting GO
   - Plant Ontology Browser
2. Finding/adding GO for functional modeling
   - GOProfiler – summary of GO for your species
   - GORetriever – gets existing GO
   - GOanna – adds GO (BLAST)
   - Adding GO for large datasets
3. Array annotation
4. Using added GO in GO enrichment analysis

GO Browsers
- QuickGO Browser (EBI GOA Project): protein annotations; search by GO Term or by UniProt ID.
- AmiGO Browser (GO Consortium Project) bin/amigo/go.cgi: search by GO Term or by accession.

Example: QuickGO Browser

QuickGO Features
Searching for gene products
- Can use gene or protein names, but it is better to use accessions.
- Works off UniProt accessions/IDs.
- Can enter multiple accessions (separated by a space).
Searching for GO Terms
- Has autocomplete.
- Provides a ranked list of matches.
- Matches are also grouped by BP, CC, MF.
- GO Terms – definitions, annotations, parent and child terms.
Advanced filtering options.
Download as protein lists or in gene association file format.

[Screenshot callouts: record information; annotation data; number of annotations]

Add taxon ID for horse. (Use to find taxon ID for your species.)

Example: AmiGO Browser

AmiGO Features
- Need to select either a gene product or a GO Term search.
Searching for gene products
- Can use gene or protein names, but it is better to use accessions.
- Works off multiple types of accessions/IDs.
- Only accepts a single accession, not a list.
- View information about the gene product and about annotations for that gene product.
Searching for GO Terms
- Large numbers of GO annotations are truncated.
Some filtering options
- Filter by ontology or evidence code.
- Filter by database or species.
Download as sequences or as a gene association file.

Plant Ontology (PO) Browser
Describes plant anatomy and morphology, and stages of development, for all plants.
- Plant Anatomy, e.g., plant structures (PO: ) such as plant organ (PO: ), plant cell (PO: ), whole plant (PO: ), portion of plant tissue (PO: ), and vascular system (PO: ), etc.
- Plant Structure Development Stage, e.g., plant tissue development stage (PO: ), leaf development stage (PO: ), whole plant development stage (PO: ), seed development stage (PO: ), and sporophyte development stage (PO: ), etc.

Ontology Browsers
- Use to identify specific ontology terms of interest.
- Use to download annotation files for specific gene lists or for a species.
- Use the downloads as input for GO or PO expression analysis.

Tutorial 1. Familiarizing yourself with ontology browsers. OR Use browsers to look for GO/PO for accessions from your own data set.

2. Finding/Adding GO for Functional Modeling
- How much GO is available for your species? (GOProfiler)
- How much GO is available for your data set? (GORetriever)
- How much of this is in the tool(s) you want to use? (Last update? Source?)
- Do you need to add GO? (GOanna, Blast2GO, etc.)

GOProfiler
GOProfiler gives you an overview of what GO annotation exists for the species you are interested in.

- Number of proteins is based upon GO Consortium records for these species.
- Species with only IEA annotations do not have an active GO annotation project.
- GO provided automatically by the EBI GOA Project.

GORetriever  Allows you to get existing GO annotations for a specific set of gene products.  Accepts a text file of accessions or IDs.  Returns GO annotations, list of accessions that have no GO and a GO Summary file.

Input file – text file of return-separated accessions (one accession per line).
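A minimal sketch of preparing such an input file, assuming the accessions start out as a loosely formatted block of text with blank lines, stray spaces and repeats. The function names and the example accessions are hypothetical and are not part of GORetriever.

```python
def clean_accessions(raw_text):
    """Return accessions in first-seen order, without blanks or duplicates."""
    seen = set()
    cleaned = []
    for line in raw_text.splitlines():
        acc = line.strip()
        if acc and acc not in seen:
            seen.add(acc)
            cleaned.append(acc)
    return cleaned


def to_input_text(accessions):
    """Format accessions as return-separated text, one accession per line."""
    return "\n".join(accessions) + "\n"


if __name__ == "__main__":
    # Placeholder accessions for illustration only.
    raw = "P12345\n\n Q67890 \nP12345\n"
    with open("accessions.txt", "w", encoding="utf-8") as handle:
        handle.write(to_input_text(clean_accessions(raw)))
```

Removing duplicates before submission keeps the returned annotation list and the "no GO" list easier to compare against the original data set.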

GORetriever Results

add GO to this list using GOanna or Blast2GO

GORetriever Results do functional grouping using GOSlimViewer

GORetriever:
- only returns existing GO
- only accepts limited accession types
GOanna:
- does a BLAST search against existing GO-annotated products
- allows you to quickly transfer GO to gene products that have similar sequences
- accepts FASTA files

Incorrect address – you will not receive your results! Contact AgBase if you have not received results after 24-48h.

GOanna Results If you enter an incorrect address – you will not receive your results! Contact AgBase if you have not received results after 24-48h.

query IDs are hyperlinked to BLAST data (files must be in the same directory)

1. Manually inspect alignments and delete any lines where there is not a good alignment*.
2. Add this additional annotation to the annotations from GORetriever.
*What is a good alignment?
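Manual inspection can be preceded by a rough automated pre-filter. The sketch below assumes the BLAST hits are available in standard BLAST tabular format (the 12-column `-outfmt 6` layout); that is an assumption, since the GOanna output layout is not described here. The thresholds are placeholders, not recommended values, and should be tuned on a sample of your own data.

```python
import csv
import io

# Column order of standard BLAST tabular output (-outfmt 6).
BLAST6_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def filter_hits(tabular_text, max_evalue=1e-20, min_identity=70.0):
    """Keep hits at or below max_evalue and at or above min_identity.

    Both thresholds are illustrative placeholders (assumptions).
    """
    kept = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        # Skip blank lines and comment lines.
        if not row or row[0].startswith("#"):
            continue
        hit = dict(zip(BLAST6_COLUMNS, row))
        if (float(hit["evalue"]) <= max_evalue
                and float(hit["pident"]) >= min_identity):
            kept.append(hit)
    return kept
```

A filter like this only narrows the list; the remaining alignments still need to be checked by eye before any GO is transferred.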

GOanna2ga
- New to AgBase: an online script to convert your GOanna file to gene association file format.
- Use it to add manually checked GOanna annotations to a GORetriever file.

Tutorial 2. Getting GO.
- GOProfiler – check what is available
- GORetriever – get existing GO
- GOanna – add GO annotations
Note: you will use Blast2GO to add additional GO annotations to your data sets tomorrow.
OR
Getting existing GO and adding additional GO to your own data set.

Some limitations of GOanna:
- BLAST analysis is slow – results are e-mailed
- limited to 5,000 inputs or an overall file size of 6 Mb
- limited to 3 jobs submitted per user at one time
How do I get GO for my 50,000 RNA-Seq dataset?
- 50 x GOanna submissions + manual interpretation of results – impractical and slow!
ALTERNATIVELY: contact AgBase
- We use internal GO annotation pipelines/queuing.
- We can help customize databases.
- GO can be kept private and released after publication.
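If you do submit a large FASTA file in pieces, the splitting can be scripted. This is a minimal sketch, assuming the limits above apply per submission and that the size limit means 6 megabytes; the function names are hypothetical.

```python
def read_fasta_records(text):
    """Split FASTA text into records, each a header line plus its sequence lines."""
    records, current = [], []
    for line in text.splitlines():
        if line.startswith(">"):
            if current:
                records.append("\n".join(current) + "\n")
            current = [line]
        elif line.strip() and current:
            current.append(line)
    if current:
        records.append("\n".join(current) + "\n")
    return records


def batch_records(records, max_records=5000, max_bytes=6 * 1024 * 1024):
    """Group records into batches that respect both limits.

    A single record larger than max_bytes still gets a batch of its own.
    """
    batches, batch, size = [], [], 0
    for rec in records:
        rec_size = len(rec.encode("utf-8"))
        # Start a new batch when adding this record would break a limit.
        if batch and (len(batch) >= max_records or size + rec_size > max_bytes):
            batches.append("".join(batch))
            batch, size = [], 0
        batch.append(rec)
        size += rec_size
    if batch:
        batches.append("".join(batch))
    return batches
```

Each returned string can be written to its own file. With at most 3 jobs allowed at one time, the batches would still have to be submitted in rounds.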

How do I get GO for my 50,000 RNA-Seq dataset?
GOanna is being deployed on the iPlant Discovery Environment:
- increased computing capacity
- faster BLAST searches
- no limitations on file number or size

GO annotation of RNA-Seq data
1. Retrieve any existing GO annotation for gene products.
   - Genome2Seq: rapidly retrieves a FASTA file of sequences and GO based on genome co-ordinates generated from RNA-Seq data.
2. InterProScan – identifies functional motifs and domains.
   - Can be mapped to GO terms (IEA).
   - VERY computer intensive – do this on HPC resources; being implemented on iPlant.
   - Improved results if transcripts are translated (e.g. EMBOSS).
3. BLAST-based similarity transfer (ISA), e.g. Blast2GO, GOanna.
   - Should only transfer GO annotations based upon direct experimental evidence codes.
   - Need to test a sample set to determine "good" matches/E-values.
4. Combine GO annotations into a single file.
   - Remove duplicates.
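Step 4, and the evidence-code restriction in step 3, can be sketched as follows. The sketch assumes all annotation sources have already been converted to tab-delimited gene association (GAF) lines, where column 2 is the object ID, column 5 the GO ID and column 7 the evidence code. Treating two lines as duplicates when those three columns match is an illustrative choice, and the function names are hypothetical.

```python
# GO evidence codes in the experimental category.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def merge_annotations(*gaf_texts):
    """Combine GAF texts into one list of lines, dropping duplicates."""
    seen = set()
    merged = []
    for text in gaf_texts:
        for line in text.splitlines():
            # Skip blank lines and '!' header/comment lines.
            if not line.strip() or line.startswith("!"):
                continue
            cols = line.split("\t")
            if len(cols) < 7:
                continue  # too short to be an annotation line
            key = (cols[1], cols[4], cols[6])  # object ID, GO ID, evidence
            if key not in seen:
                seen.add(key)
                merged.append(line)
    return merged


def experimental_only(gaf_lines):
    """Keep only lines whose evidence code is experimental."""
    return [ln for ln in gaf_lines if ln.split("\t")[6] in EXPERIMENTAL_CODES]
```

`experimental_only` is meant for the annotations of BLAST targets, before their GO is transferred to your query sequences; the merged file for your own data set will normally also contain IEA and similarity-based annotations.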

Adding GO Annotation
- GO annotations are usually added as gene association files.
- Check the number of columns.
- Can check file format against the GO guide:
- Check your analysis tool: does it accept additional GO annotations, and what format is required?
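The column check can be automated. This is a minimal sketch, assuming a tab-delimited gene association file in which GAF 1.0 lines have 15 columns and GAF 2.x lines have 17; the function name is hypothetical.

```python
def check_gaf_columns(text, allowed=(15, 17)):
    """Return (line_number, column_count) for each line with an unexpected count."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Skip blank lines and '!' header/comment lines.
        if not line.strip() or line.startswith("!"):
            continue
        count = len(line.split("\t"))
        if count not in allowed:
            problems.append((number, count))
    return problems
```

An empty result means every annotation line has an accepted column count. It does not check that the values in each column are valid, so the file should still be compared against the format your analysis tool requires.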

GO Enrichment tools that support agricultural species.