Published by Myrtle Hampton. Modified over 9 years ago.
1
Collaboration with IntAct and InterMine: SGD
Rama Balakrishnan
Saccharomyces Genome Database
Gene Ontology Consortium
Stanford University, CA, USA
2
Outline
Advantages of collaborating
When do you collaborate?
What are the considerations?
3
Why collaborate?
There is a lot of data out there and it is hard to keep up
No single MOD (model organism database) has the resources to build tools to capture all the data
– Doesn’t make sense for each MOD to build a curation tool to capture the same data type
There is a lot to gain by collaborating (it is a win-win situation)
– There is a lot more consistency in curation
– Money is well spent
– Data is produced in standard file formats
4
Considerations
Type of data (not all data types are amenable to collaborative curation)
Data flow model to receive data from external sources and integrate it into our database (e.g. BioGRID, Protein2GO)
Important to get data in standard file formats so loading scripts don’t have to change
– Need good error checking/reporting
5
Data flow
Main Database
Data out
– Web display
– Download as flat file
– Specialized tools such as InterMine
Data in
– Curate directly into database
– Load data from a flat file
– Flat file available routinely on FTP site
– Loading script with good error checking/reporting
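The "loading script with good error checking/reporting" step can be sketched roughly as follows. This is only an illustration, not SGD's actual loader: the slides do not specify a file format, so the tab-delimited layout, the column names (`complex_id`, `gene_id`, `go_id`), the function name `load_flat_file`, and the sample rows are all assumptions made up for the example. The idea shown is that a changed header stops the load, while a bad line is reported with its line number and skipped.

```python
import csv
import io

# Hypothetical column layout; the slides do not specify a file format.
EXPECTED_COLUMNS = ["complex_id", "gene_id", "go_id"]


def load_flat_file(handle):
    """Parse a tab-delimited flat file and return (rows, errors).

    Bad lines are reported with their line number instead of aborting the load.
    """
    rows, errors = [], []
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header != EXPECTED_COLUMNS:
        # A changed header means the format changed; stop before loading anything.
        errors.append(f"line 1: unexpected header {header!r}")
        return rows, errors
    for line_no, fields in enumerate(reader, start=2):
        if len(fields) != len(EXPECTED_COLUMNS):
            errors.append(
                f"line {line_no}: expected {len(EXPECTED_COLUMNS)} fields, got {len(fields)}"
            )
            continue
        record = dict(zip(EXPECTED_COLUMNS, fields))
        if not record["go_id"].startswith("GO:"):
            errors.append(f"line {line_no}: malformed GO identifier {record['go_id']!r}")
            continue
        rows.append(record)
    return rows, errors


# Made-up sample data: one good row, one short row, one row with a bad GO ID.
sample = io.StringIO(
    "complex_id\tgene_id\tgo_id\n"
    "CPX-A\tGENE1\tGO:0000001\n"
    "CPX-B\tGENE2\n"
    "CPX-C\tGENE3\tbad-id\n"
)
rows, errors = load_flat_file(sample)
print(f"loaded {len(rows)} rows, {len(errors)} errors")
for message in errors:
    print(message)
```

Because the provider's file keeps a standard format, a script along these lines should not need to change between releases; only the error report does.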
6
Protein complex curation with IntAct
SGD needed to curate more in-depth information on complexes
– Curate/capture functional (using GO) data for complexes (rather than for individual subunits)
– Represent complexes in cellular pathways
– One has to invest a lot of resources to define the curation model and build software
– Thanks to the tight-knit curation community we had a chance to talk to IntAct
7
IntAct complex portal
A few conference calls to understand their curation model
IntAct was very open to suggestions/changes
They were able to make software changes rapidly to accommodate new features
Since curation was done on a web interface
– Easy for SGD curators to get trained and curate online
– Important to have a test site where we can get trained
– Mailing list to share notes, tips, tricks, bug reports
8
Complex Curation
SGD has been curating complexes for almost a year now
We load the complex data into YeastMine for now
– Data will move to our main database soon
– Goal is to make pages for complexes like we do for genes
9
YeastMine
Collaboration with InterMine project (Cambridge, UK)
It is a data warehouse
– Has sophisticated querying interface
– Can query for various slices of data without knowing any database query language
– Can make lists of any object and retrieve data for the members of the list
– Can download data in custom formats
RGD, MGI, FlyBase, and ZFIN also have a “Mine”
10
YeastMine Home