GO based data analysis Iowa State Workshop 11 June 2009.


1 GO based data analysis Iowa State Workshop 11 June 2009

2 All tools and materials from this workshop are available online at the AgBase database Educational Resources link. For continuing support and assistance please contact: agbase@cse.msstate.edu This workshop is supported by USDA CSREES grant number MISV-329140.

3 AgBase protein annotation process. Input: protein identifiers or FASTA format. GORetriever returns annotated proteins; proteins with no annotations go to GOanna. Results are summarized with GOSlimViewer.

4 Hypothesis generating. Gene Ontology enrichment analysis: GO terms that are statistically (Fisher’s exact test) over- or underrepresented in a set of genes. Annotation clustering: group similar annotations based on the hypothesis that they should have similar gene members.
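As a rough illustration of the enrichment test named on this slide, the sketch below computes one-sided Fisher's exact test p-values for a single GO term from four counts. The function names and the example counts are made up for illustration and do not come from the workshop materials; the enrichment tools listed in this workshop may use different variants of the test and typically add a multiple-testing correction.

```python
from math import comb


def hypergeom_pmf(k, pop_size, pop_hits, study_size):
    """Probability that exactly k of study_size randomly drawn genes carry the term."""
    return (comb(pop_hits, k) * comb(pop_size - pop_hits, study_size - k)
            / comb(pop_size, study_size))


def go_term_pvalues(study_hits, study_size, pop_hits, pop_size):
    """One-sided Fisher's exact test p-values for a single GO term.

    Returns (p_over, p_under): the probability of seeing at least / at most
    study_hits annotated genes in the study set by chance alone.
    """
    # Smallest and largest number of annotated genes the study set can contain.
    k_min = max(0, study_size - (pop_size - pop_hits))
    k_max = min(study_size, pop_hits)
    p_over = sum(hypergeom_pmf(k, pop_size, pop_hits, study_size)
                 for k in range(study_hits, k_max + 1))
    p_under = sum(hypergeom_pmf(k, pop_size, pop_hits, study_size)
                  for k in range(k_min, study_hits + 1))
    return p_over, p_under


# Hypothetical counts: 1000 background genes, 50 of which carry the term;
# the study set has 20 genes, 5 of which carry the term.
p_over, p_under = go_term_pvalues(study_hits=5, study_size=20,
                                  pop_hits=50, pop_size=1000)
print(f"over-representation p = {p_over:.4g}")
print(f"under-representation p = {p_under:.4g}")
```

A small p_over suggests the term is overrepresented in the study set; a small p_under suggests it is underrepresented. In a real analysis this is repeated for every GO term, so the p-values need correcting for multiple testing.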

5 Some resources
DAVID: http://david.abcc.ncifcrf.gov/
GOStat: http://gostat.wehi.edu.au/
EasyGO: http://bioinformatics.cau.edu.cn/easygo/
AmiGO: http://amigo.geneontology.org/cgi-bin/amigo/term_enrichment (does not use IEA)
Onto-Express & OE2GO: http://vortex.cs.wayne.edu/projects.htm
GOEAST: http://omicslab.genetics.ac.cn/GOEAST
http://www.geneontology.org/GO.tools.shtml
Comparison of enrichment analysis tools: Nucleic Acids Research, 2009, Vol. 37, No. 1, 1–13 (Tool_Comparison_09.pdf)
DAVID and EasyGO analysis included: DAVID&EasyGo.ppt

6 Database for Annotation, Visualization and Integrated Discovery

9 http://vortex.cs.wayne.edu/ontoexpress Onto-Express analysis instructions are available in onto-express.ppt

10 Species represented in Onto-Express

11 For uploading your own annotations use OE2GO

12 Comparison of Onto-Express, EasyGO, GOStat and DAVID. Test set: 60 randomly selected chicken genes. Used AgBase GO annotations as baseline annotations. Vandenberg et al. (BMC Bioinformatics, in review)

14 Networks & Pathways Iowa State Workshop 11 June 2009

15 Multiple data analysis platforms: Proteomics, Transcriptomics, ESTs; LIST

16 Our original aim: understand biological phenomena. Bits and pieces of information; do not have the full picture. How do we get back to BIOLOGY in this digital information landscape?

17 What do we know about biological systems? Biological systems are dynamic, not static. How molecules interact is key to understanding complex systems. Francis Crick, 1958

18 Types of interactions
protein (enzyme) – metabolite (ligand): metabolic pathways
protein – protein: cell signaling pathways, protein complexes
protein – gene: genetic networks

19 Sod1, Mus musculus. STRING Database http://string.embl.de/

21 PLoS Computational Biology March 2007, Volume 3 e42
Database: URL/FTP
DIP: http://dip.doe-mbi.ucla.edu
BIND: http://bind.ca
MPact/MIPS: http://mips.gsf.de/services/ppi
STRING: http://string.embl.de
MINT: http://mint.bio.uniroma2.it/mint
IntAct: http://www.ebi.ac.uk/intact
BioGRID: http://www.thebiogrid.org
HPRD: http://www.hprd.org
ProtCom: http://www.ces.clemson.edu/compbio/ProtCom
3did, Interprets: http://gatealoy.pcb.ub.es/3did/
Pibase, Modbase: http://alto.compbio.ucsf.edu/pibase
CBM: ftp://ftp.ncbi.nlm.nih.gov/pub/cbm
SCOPPI: http://www.scoppi.org/
iPfam: http://www.sanger.ac.uk/Software/Pfam/iPfam
InterDom: http://interdom.lit.org.sg
DIMA: http://mips.gsf.de/genre/proj/dima/index.html
Prolinks: http://prolinks.doe-mbi.ucla.edu/cgibin/functionator/pronav/
Predictome: http://predictome.bu.edu/

22 Pathways & Networks. A network is a collection of interactions. Pathways are a subset of networks: networks of interacting proteins that carry out biological functions such as metabolism and signal transduction. All pathways are networks of interactions; NOT ALL NETWORKS ARE PATHWAYS.

23 Biological Networks. Networks are often represented as graphs. Nodes represent proteins or genes that code for proteins. Edges represent the functional links between nodes (e.g. regulation). Small changes in a graph’s topology/architecture can result in the emergence of novel properties.
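The graph view of a network can be sketched in a few lines. The example below stores an undirected interaction network as an adjacency map and counts the links per node (its degree). The protein names and interactions are placeholders invented for illustration, not data from the workshop.

```python
from collections import defaultdict

# Hypothetical interactions; "A".."E" are placeholder protein names.
interactions = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]


def build_network(edges):
    """Build an undirected graph as an adjacency map: node -> set of neighbours."""
    network = defaultdict(set)
    for u, v in edges:
        network[u].add(v)
        network[v].add(u)
    return dict(network)


def degrees(network):
    """Number of interaction partners for each node."""
    return {node: len(neighbours) for node, neighbours in network.items()}


network = build_network(interactions)
degree = degrees(network)
hub = max(degree, key=degree.get)  # node with the most interaction partners

print(degree)
print("highest-degree node:", hub)
```

Highly connected nodes are a common first thing to look at in an interaction network, although degree alone says nothing about the quality of the underlying interactions.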

24 Yeast Protein-Protein Interaction Map. H. Jeong et al., Nature 411, 2001

25 Some resources
KEGG: http://www.genome.jp/kegg/pathway.html/
BioCyc: http://www.biocyc.org/
Reactome: http://www.reactome.org/
GenMAPP: http://www.genmapp.org/
BioCarta: http://www.biocarta.com/
Pathguide (the pathway resource list): http://www.pathguide.org/

27 Pathguide statistics: Gallus gallus is missing

28 Reactome

29 What is feasible with my specific dataset?

30 Systems Biology Workflow. Nanduri & McCarthy, CAB Reviews, 2008

31 Systems Biology Workflow. For a given species of interest, what type of data is available?

32 Retrieval of interaction datasets. Evaluate PPI resources such as Predictome and Prolinks for existence of the species of interest. If unavailable, find orthologous proteins in related species that have interactions!
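The ortholog-based fallback described on this slide can be sketched as follows: interactions known in a related species are mapped onto the species of interest through an ortholog table. This is a minimal sketch; the function name, the identifiers and the data are hypothetical, and interactions transferred this way are predictions, not observed interactions.

```python
def transfer_interactions(ortholog_map, reference_interactions):
    """Predict interactions in the species of interest from a related species.

    ortholog_map: protein in the species of interest -> ortholog in the reference species
    reference_interactions: iterable of (protein, protein) pairs in the reference species
    """
    # Invert the map: reference protein -> proteins in the species of interest.
    reverse = {}
    for target, ref in ortholog_map.items():
        reverse.setdefault(ref, set()).add(target)

    predicted = set()
    for ref_a, ref_b in reference_interactions:
        for a in reverse.get(ref_a, ()):
            for b in reverse.get(ref_b, ()):
                if a != b:  # this sketch ignores self-interactions
                    predicted.add(tuple(sorted((a, b))))
    return predicted


# Hypothetical identifiers: "chk*" in the species of interest, "ref*" in a related species.
ortholog_map = {"chk1": "ref1", "chk2": "ref2", "chk3": "ref3"}
reference_interactions = [("ref1", "ref2"), ("ref2", "ref4")]

predicted = transfer_interactions(ortholog_map, reference_interactions)
print(predicted)
```

Only the pair whose two partners both have an ortholog is transferred; the interaction involving "ref4" is dropped because no ortholog is listed for it.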

33 I have interactions, what next? Evaluate the quality of interactions, i.e. the type of method used for identification. What exactly are these methods?

34 I have interactions, what next? Evaluate the quality of interactions, i.e. the type of method used for identification. What exactly are these methods? STRING Database

35 PPI Identification
Experimental: TAP assays, Yeast two hybrid (Y2H), Protein arrays
Computational: Gene coexpression, Sequence coevolution, Phylogenetic profile, Gene cluster, Rosetta stone method, Text mining
PLoS Computational Biology March 2007, Volume 3 e42

36 PPI database comparisons. Proteins: Structure, Function and Bioinformatics 63:490-500, 2006

37 I have interactions, what next? Evaluate the quality of interactions, i.e. the type of method used for identification. What exactly are these methods? Visualize these interactions as a network and analyze. What are the available tools?

