AstroDAS: Sharing Assertions across Astronomy Catalogues through Distributed Annotation Rajendra Bose, Robert G. Mann, Diego Prina-Ricotti Digital Curation.

Presentation transcript:

AstroDAS: Sharing Assertions across Astronomy Catalogues through Distributed Annotation Rajendra Bose, Robert G. Mann, Diego Prina-Ricotti Digital Curation Centre 4 May 2006 International Provenance and Annotation Workshop (IPAW’06)

Outline
1. Astronomy catalogues and existing OpenSkyQuery system
2. Custom cross-matching algorithms: AstroDAS
3. How AstroDAS compares to other annotation systems

SDSS (Visual) TWOMASS (Infrared)


AstroDAS: Astronomy Distributed Annotation System Example astronomy catalogue schema

Existing OpenSkyQuery system for astronomy catalogue access

Diagram components: SDSS :Sky node; TWOMASS :Sky node; USNOB :Sky node; :OpenSkyQuery client; OpenSkyQuery Portal

ADQL query:
  SELECT s.objid, t.objid, u.objid, s.ra, s.dec, s.type, t.ra, t.dec, u.ra, u.dec
  FROM SDSS:photoprimary s, TWOMASS:photoprimary t, USNOB:photoprimary u
  WHERE XMATCH(s,t,u)<3.5 AND Region('CircleJ ') AND s.type=3 AND

SDSS: Sloan Digital Sky Survey
TWOMASS: the Two Micron All Sky Survey
USNOB: U.S. Naval Observatory USNO-B1.0 catalogue

X-Match: cross-matching algorithm built into OpenSkyQuery, based on spatial proximity; the user specifies a parameter sigma which encodes the tolerance of the match.

National Virtual Observatory. (2006). Open SkyQuery Help: The XMatch Algorithm
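The idea of matching by spatial proximity can be illustrated with a minimal Python sketch. This is not the XMatch algorithm itself, which the slides only describe as proximity-based with a user-supplied sigma tolerance; it is a simplified stand-in that pairs each object with its nearest neighbour inside a fixed angular radius. The function names, the catalogue tuples and the object identifiers are all hypothetical.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation of two sky positions (degrees in, arcseconds out)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine form, which stays accurate for very small separations.
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a)))) * 3600.0


def proximity_match(cat_a, cat_b, tolerance_arcsec):
    """Pair each object in cat_a with its nearest cat_b object within the tolerance.

    Catalogues are lists of (object_id, ra_deg, dec_deg) tuples.
    This brute-force loop is O(len(cat_a) * len(cat_b)); it is for illustration only.
    """
    matches = []
    for id_a, ra_a, dec_a in cat_a:
        best = None
        for id_b, ra_b, dec_b in cat_b:
            sep = angular_separation_arcsec(ra_a, dec_a, ra_b, dec_b)
            if sep <= tolerance_arcsec and (best is None or sep < best[1]):
                best = (id_b, sep)
        if best is not None:
            matches.append((id_a, best[0], best[1]))
    return matches


# Hypothetical objects; the identifiers and coordinates are made up.
sdss = [("s1", 180.0, 0.0), ("s2", 181.0, 1.0)]
twomass = [("t1", 180.0, 0.0005), ("t2", 185.0, 5.0)]
matches = proximity_match(sdss, twomass, tolerance_arcsec=3.5)
print(matches)
```

A purely positional criterion like this cannot use any other knowledge about the objects, which is the limitation the following slides address.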


Existing OpenSkyQuery system provides X-Match results based on proximity

But X-Match results based on proximity are not always adequate (figure labels: Catalogue1, Catalogue2)

So each group produces its own cross-match results
UEdinburgh: ↔ URome: ↔

Storing annotations to map database objects

Diagram components: URome :AstroDAS Server; UEdinburgh :AstroDAS Server; SDSS :Sky node; TWOMASS :Sky node; USNOB :Sky node; :OpenSkyQuery client; :AstroDAS client; AstroDAS Portal; OpenSkyQuery Portal
UEdinburgh: ↔ URome: ↔

Storing annotations to map database objects
UEdinburgh: ↔ URome: ↔

Annotation table columns: id, db_object, annote1, author, annote_source
  SDSS_ TWOMASS_ | SAME OBJECT | (algorithm1) | researcher1
  SDSS_ USNOB_ | SAME OBJECT | (algorithm1) | researcher1
  SDSS_ TWOMASS_ | NOT SAME OBJECT | (algorithm2) | researcher2
  ………………
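As a rough illustration of the annotation table on this slide, the following Python sketch stores assertions about pairs of catalogue objects. The field names follow the slide's column headings; the class names, the in-memory list and the object identifiers such as "SDSS:1" are assumptions made for the example, since the slides do not show the AstroDAS server's actual storage schema or full identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    db_objects: tuple   # pair of catalogue object identifiers (hypothetical format)
    annote: str         # "SAME OBJECT" or "NOT SAME OBJECT"
    author: str         # algorithm that produced the assertion
    annote_source: str  # researcher or group responsible


class AnnotationStore:
    """In-memory stand-in for an annotation server's table."""

    def __init__(self):
        self._rows = []

    def add(self, annotation):
        """Store an annotation and return its 1-based id."""
        self._rows.append(annotation)
        return len(self._rows)

    def by_author(self, author):
        """Return all annotations produced by the given algorithm."""
        return [row for row in self._rows if row.author == author]


store = AnnotationStore()
store.add(Annotation(("SDSS:1", "TWOMASS:7"), "SAME OBJECT", "algorithm1", "researcher1"))
store.add(Annotation(("SDSS:1", "USNOB:4"), "SAME OBJECT", "algorithm1", "researcher1"))
store.add(Annotation(("SDSS:1", "TWOMASS:7"), "NOT SAME OBJECT", "algorithm2", "researcher2"))

for row in store.by_author("algorithm1"):
    print(row.db_objects, row.annote)
```

Keeping the author and source with every row lets conflicting assertions about the same pair coexist, so a query can choose whose cross-match to trust.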

Querying annotations on astronomy catalogues

Diagram components: URome :AstroDAS Server; UEdinburgh :AstroDAS Server; SDSS :Sky node; TWOMASS :Sky node; USNOB :Sky node; :OpenSkyQuery client; :AstroDAS client; AstroDAS Portal; OpenSkyQuery Portal
Mapping table created dynamically from annotations (columns: SDSS, TWOMASS, USNOB; some entries null)

DSQL query:
  SELECT s.objid, s.ra, s.dec, s.type, t.objid, t.ra, t.dec, u.objid, u.ra, u.dec
  FROM SDSS:photoprimary s, TWOMASS:photoprimary t, USNOB:photoprimary u, AS:UEdinburgh e, AS:URome r
  WHERE Region('CircleJ ') AND s.type=3 AND e.author='algorithm1' AND r.author='algorithm2'

Creating a mapping table from stored annotations: inference
UEdinburgh: ↔ URome: ↔
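The slides do not spell out how the mapping table is inferred from annotations. One plausible reading, sketched below in Python, is to treat SAME OBJECT assertions as transitive, group linked objects with a union-find structure, and emit one row per group with null (None) where a catalogue has no member. The transitivity assumption, the function name and the "CATALOGUE:id" identifier format are all hypothetical, and NOT SAME OBJECT assertions are simply left unlinked here.

```python
def build_mapping_table(annotations, catalogues=("SDSS", "TWOMASS", "USNOB")):
    """Group objects linked by SAME OBJECT assertions into mapping-table rows.

    annotations: iterable of (object_a, object_b, assertion) tuples, where
    object identifiers look like "SDSS:1" (a made-up format for this sketch).
    Returns a list of dicts keyed by catalogue name; None stands for null.
    """
    parent = {}

    def find(obj):
        # Union-find lookup with path halving.
        parent.setdefault(obj, obj)
        while parent[obj] != obj:
            parent[obj] = parent[parent[obj]]
            obj = parent[obj]
        return obj

    for obj_a, obj_b, assertion in annotations:
        root_a, root_b = find(obj_a), find(obj_b)
        if assertion == "SAME OBJECT":
            parent[root_a] = root_b  # merge the two groups

    groups = {}
    for obj in list(parent):
        groups.setdefault(find(obj), []).append(obj)

    table = []
    for members in sorted(groups.values(), key=lambda m: sorted(m)[0]):
        row = dict.fromkeys(catalogues)  # every column starts as None
        for obj in sorted(members):
            catalogue = obj.split(":", 1)[0]
            if catalogue in row and row[catalogue] is None:
                row[catalogue] = obj
        table.append(row)
    return table


# Hypothetical annotations already filtered by author.
annotations = [
    ("SDSS:1", "TWOMASS:7", "SAME OBJECT"),
    ("SDSS:1", "USNOB:4", "SAME OBJECT"),
    ("SDSS:2", "TWOMASS:9", "NOT SAME OBJECT"),
]
table = build_mapping_table(annotations)
for row in table:
    print(row)
```

A fuller treatment would also have to decide what happens when a NOT SAME OBJECT assertion contradicts a chain of SAME OBJECT links; this sketch does not attempt that.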

Outline
1. Astronomy catalogues and existing OpenSkyQuery system
2. Custom cross-matching algorithms: AstroDAS
3. How AstroDAS compares to other annotation systems

Example 1: Genome annotation and BioDAS
BioDAS: Biology Distributed Annotation System (Dowell 2001)
/das/ / ?

Dowell, R., Jokerst, R., Day, A., Eddy, S., & Stein, L. (2001). The Distributed Annotation System. BMC Bioinformatics, 2(7).

Ensembl system which includes BioDAS functionality Example 1: Genome annotation and BioDAS



Example 1: Genome annotation and BioDAS
Annotation of the Malaria Mosquito Anopheles gambiae genome sequence

The Genome Sequence of the Malaria Mosquito Anopheles gambiae, Robert A. Holt, et al., Science 4 October 2002: Vol no. 5591, pp DOI: /science ;


Example 1: Genome annotation and BioDAS
Example of genome annotation from the biological literature

Lauer, Kim P., Llorente, Isabel, Blair, Eric, Seto, Jason, Krasnov, Vladimir, Purkayastha, Anjan, Ditty, Susan E., Hadfield, Ted L., Buck, Charles, Tibbetts, Clark, Seto, Donald. Natural variation among human adenoviruses: genome sequence and annotation of human adenovirus serotype 1. J Gen Virol :


Example 2: Medical image annotation
Human Brain Project (HBP) image annotation (Gertz 2002, 2003)

Gertz, M., Sattler, K.-U., Gorin, F., Hogarth, M., & Stone, J. (2002). Annotating Scientific Images: A Concept-based Approach. Proceedings of the 14th International Conference on Scientific and Statistical Database Management (SSDBM 2002), Edinburgh, Scotland. IEEE Computer Society.
Gertz, M., & Sattler, K.-U. (2003). Integrating scientific data through external, concept-based annotations. In Efficiency and Effectiveness of XML Tools and Techniques and Data Integration over the Web (Vol. 2590, pp ).

Example 2: Medical image annotation
Edinburgh Mouse Atlas Project (EMAP) (Baldock 1999)

Baldock, R. A., Dubreuil, C., Hill, W., & Davidson, D. (1999). The Edinburgh Mouse Atlas: Basic Structure and Informatics. In S. I. Letovsky (Ed.), Bioinformatics: Databases and Systems (pp ). Kluwer Academic Publishers. (See

Storing annotations to map database objects

Annotation table columns: id, db_object, annote1, annote2, annote_source
Table cells: SDSS_112233 TWOMASS_ | SAME OBJECT (algorithm1) GROUP1 | NOT SAME OBJECT (algorithm2) GROUP2 | NOT SAME OBJECT (algorithm1) GROUP1 | SDSS_ TWOMASS_ TWOMASS_445566
………………