Collaboration with IntAct and InterMine: SGD Rama Balakrishnan Saccharomyces Genome Database Gene Ontology Consortium Stanford University, CA USA.

Presentation transcript:


Outline
Advantages of collaborating
When do you collaborate?
What are the considerations?

Why collaborate?
There is a lot of data out there and it is hard to keep up
Each MOD doesn’t have the resources to build tools to capture all the data
  – It doesn’t make sense for each MOD to build a curation tool to capture the same data type
There is a lot to gain by collaborating (it is a win-win situation)
  – Curation is much more consistent
  – Money is well spent
  – Data is produced in standard file formats

Considerations
Type of data (not all data types are amenable to collaborative curation)
Data flow model to receive data from external sources and integrate it into our database (e.g. BioGRID, Protein2GO)
Important to get data in standard file formats so loading scripts don’t have to change
  – Need good error checking/reporting

Data flow
Main database
Data out
  – Web display
  – Download as flat file
  – Specialized tools such as InterMine
Data in
  – Curate directly into database
  – Load data from a flat file (flat file available routinely on FTP site; loading script with good error checking/reporting)
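The "loading script with good error checking/reporting" idea can be sketched roughly as follows. This is a minimal illustration and not SGD's actual loader: the three-column tab-delimited layout (complex ID, gene ID, GO ID), the function name `load_flat_file`, and the sample rows are all assumptions made up for the example. The point is that bad rows are reported with their line numbers and skipped, so one malformed line does not abort the load.

```python
import csv
import io
import re
from dataclasses import dataclass

# Hypothetical file layout (an assumption): complex ID, gene ID, GO ID.
EXPECTED_COLUMNS = 3
# GO identifiers have the form "GO:" followed by seven digits.
GO_ID_PATTERN = re.compile(r"^GO:\d{7}$")


@dataclass
class LoadReport:
    records: list  # rows that passed every check
    errors: list   # human-readable messages, one per rejected row


def load_flat_file(handle):
    """Validate a tab-delimited flat file row by row.

    Rejected rows are recorded with their line number instead of
    stopping the load, so the provider gets a complete error report.
    """
    records, errors = [], []
    reader = csv.reader(handle, delimiter="\t")
    for line_no, row in enumerate(reader, start=1):
        # Skip blank lines and comment/header lines starting with '#'.
        if not row or row[0].startswith("#"):
            continue
        if len(row) != EXPECTED_COLUMNS:
            errors.append(
                f"line {line_no}: expected {EXPECTED_COLUMNS} columns, "
                f"found {len(row)}"
            )
            continue
        complex_id, gene_id, go_id = (field.strip() for field in row)
        if not complex_id or not gene_id:
            errors.append(f"line {line_no}: empty identifier")
            continue
        if not GO_ID_PATTERN.match(go_id):
            errors.append(f"line {line_no}: malformed GO ID {go_id!r}")
            continue
        records.append(
            {"complex_id": complex_id, "gene_id": gene_id, "go_id": go_id}
        )
    return LoadReport(records, errors)


# Invented sample input: one good row, one short row, one bad GO ID.
sample = io.StringIO(
    "# complex_id\tgene_id\tgo_id\n"
    "CPX-0001\tGENE-A\tGO:0005737\n"
    "CPX-0001\tGENE-B\n"
    "CPX-0002\tGENE-C\tGO:12\n"
)
report = load_flat_file(sample)
print(f"loaded {len(report.records)} record(s)")
for message in report.errors:
    print("ERROR", message)
```

Because the checks depend only on the agreed file format, the script does not need to change when a new release of the flat file appears on the FTP site; only a format change would require touching it.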

Protein complex curation with IntAct
SGD needed to curate more in-depth information on complexes
  – Curate/capture functional (using GO) data for complexes (rather than for individual subunits)
  – Represent complexes in cellular pathways
  – One has to invest a lot of resources to define the curation model and build software
  – Thanks to the tight-knit curation community, we had a chance to talk to IntAct

IntAct complex portal
A few conference calls to understand their curation model
IntAct was very open to suggestions/changes
They were able to make software changes rapidly to accommodate new features
Curation was done on a web interface, so it was easy for SGD curators to get trained and curate online
  – Important to have a test site where we can get trained
  – Mailing list to share notes, tips, tricks, bug reports

Complex Curation
SGD has been curating complexes for almost a year now
We load the complex data into YeastMine for now
  – Data will move to our main database soon
  – Goal is to make pages for complexes like we do for genes

YeastMine
Collaboration with the InterMine project (Cambridge, UK)
It is a data warehouse
  – Has a sophisticated querying interface
  – Can query for various slices of data without knowing any database query language
  – Can make lists of any object and retrieve data for the members of the list
  – Can download data in custom formats
RGD, MGI, FlyBase and ZFIN each have a “Mine”

YeastMine Home