The CLARION Project for the Infrastructure for Integration in Structural Sciences (I2S2) meeting, Rutherford Labs, 11th February 2010. CLARION: Chemical Laboratory Repository In/Organic Notebooks.


The CLARION Project
for the Infrastructure for Integration in Structural Sciences (I2S2) meeting, Rutherford Labs, 11th February 2010

CLARION: Chemical Laboratory Repository In/Organic Notebooks

Principal Investigator: Peter Murray-Rust
Co-Investigator: Jim Downing
Project Team: Nick Day, Sam Adams, Brian Brooks
Unilever Centre, Department of Chemistry, University of Cambridge

CLARION overview

Diagram labels: ELN (IDBS); Crystallography Files (CIF); NMR files; Publications database; EmMa Embargo Mgr; EmMa user interface; Data Releaser; Data Loader; JUMBO converters; CML, RDF; CHEM-1 repository; CHEM-0 repository; RDF triplestores; SPARQL interface; CLARION query app; Internal Scientist; External Scientist.

Workflow:
1. Scientist collects data and stores it in a variety of locations
2. EmMa is notified about the new content
3. Scientist specifies the release conditions for the data
4. Timer waits until release conditions are met
5. Data is moved into the CHEM-1 repository and (at some time) into the CHEM-0 repository
7. Repository is queried by scientists
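Steps 3 to 5 of the workflow above can be sketched in a few lines of Python. This is only an illustration, not CLARION's implementation: the slides do not say what form a release condition takes, so the sketch assumes the simplest case, a fixed embargo date. `DataRecord`, `release_due` and `run_timer` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class DataRecord:
    """A data item known to the embargo manager (hypothetical model)."""
    identifier: str
    release_after: Optional[date] = None  # step 3: condition set by the scientist
    repository: str = "pending"           # where the record currently lives


def release_due(record: DataRecord, today: date) -> bool:
    """Step 4: assume the condition is met once the embargo date is reached."""
    return record.release_after is not None and today >= record.release_after


def run_timer(records: List[DataRecord], today: date) -> List[str]:
    """Step 5: move every pending record whose condition is met into CHEM-1."""
    released = []
    for record in records:
        if record.repository == "pending" and release_due(record, today):
            record.repository = "CHEM-1"
            released.append(record.identifier)
    return released


records = [DataRecord("nmr-001", date(2010, 3, 1)), DataRecord("cif-002")]
print(run_timer(records, date(2010, 6, 1)))  # prints ['nmr-001']
```

A record with no release condition stays where it is, which matches the idea that nothing leaves the private side until the scientist has specified when it may.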

CLARION architecture

Diagram labels (blue boxes indicate logical machine environments): ELN server; ELN Data; Files; File Feed; ELN Feed; ELN API; Adapter; Data Handler; Atom Feed; Atom Feed Reader; GUI client; Release Manager; Embargo Manager (EmMa); Lensfield Loader; JUMBO converters; CLARION repository; CHEM-0/1 repository; CML; RDF Triplestore; Chemical Structure index; Query System; SPARQL; SOAP; Sesame; Chemicx.
Technologies named in the boxes: Jetty webserver; cron jobs; Java; Java & Clojure; H2db for metadata.
Ontologies: ChemAxiom ORE ORE Chem Expt

Design principles used:
- Decoupling through standard web interfaces (HTTP, Atom)
- Avoid data duplication (by using HTTP references unless a copy is required)
- Don't do manually that which can be done automatically
- Manual semantification as early as possible
- Automatic semantification as late as possible
- Give the ability to undo an action during a grace period rather than getting confirmation

EmMa's role:
- Adds metadata
- Defines embargo release conditions
- Is the gatekeeper for metadata quality
- Is the gatekeeper for security (trust, authentication, authorisation)
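The last design principle, undo during a grace period instead of a confirmation prompt, can be illustrated with a small queue. The slides do not describe how EmMa implements it, so this is a sketch under assumptions: `GracePeriodQueue` and its methods are hypothetical, and the current time is passed in explicitly so the behaviour is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GracePeriodQueue:
    """Actions take effect only after a grace period and can be undone before it ends."""
    grace_seconds: float
    _pending: Dict[str, float] = field(default_factory=dict)  # action id -> due time

    def request(self, action_id: str, now: float) -> None:
        """Accept the action immediately, without asking for confirmation."""
        self._pending[action_id] = now + self.grace_seconds

    def undo(self, action_id: str, now: float) -> bool:
        """Cancel the action if it is still inside its grace period."""
        due = self._pending.get(action_id)
        if due is None or now >= due:
            return False
        del self._pending[action_id]
        return True

    def commit_due(self, now: float) -> List[str]:
        """Return, and stop tracking, the actions whose grace period has ended."""
        due_ids = sorted(a for a, due in self._pending.items() if now >= due)
        for action_id in due_ids:
            del self._pending[action_id]
        return due_ids


queue = GracePeriodQueue(grace_seconds=60)
queue.request("release-1", now=0)
queue.request("release-2", now=0)
queue.undo("release-1", now=30)   # still within the grace period, so it is cancelled
print(queue.commit_due(now=60))   # prints ['release-2']
```

The user is never interrupted with a dialog; the cost is that an action only becomes final when `commit_due` is next run after the grace period.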

CLARION development stages & timings

Diagram labels: Sources; EmMa; Data Loader; Repository; Stages 1, 2 and 3 marked on a January to December timeline.

Stage 1: First data-feed into EmMa
- Atom-feeds from file stores
- EmMa feed-readers
- EmMa user review tool
- EmMa output atom-feeds

Stage 2: Basic functionality to store first data-type into repository
- Lensfield reads EmMa feeds
- Process data to CML
- Process CML to RDF
- Store triples into triple-store
- Indexing of chemical structures

Stage 3: Basic querying functionality
- Authentication & authorisation
- Pilot users loading data
- V1 query tool

Outcome captions on the slide:
- Scientists presented with data records to which they add metadata and then set embargo release conditions
- Data stored in RDF and chemical structures indexed
- System in use by pilot users & simple query interface for SSS & RDF queries. Querying by outside users.
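The Stage 2 chain (CML to RDF, then into a triple store) can be sketched as follows. This is not the JUMBO or Lensfield code: the predicates under `EX` are a hypothetical vocabulary rather than ChemAxiom, the sample molecule is invented, and a Python set stands in for the triple store.

```python
import xml.etree.ElementTree as ET

CML_NS = "{http://www.xml-cml.org/schema}"  # CML namespace in ElementTree notation
EX = "http://example.org/clarion#"          # hypothetical vocabulary, not ChemAxiom

# Invented sample input: a two-atom CML molecule.
SAMPLE_CML = """<molecule xmlns="http://www.xml-cml.org/schema" id="m1">
  <atomArray>
    <atom id="a1" elementType="C"/>
    <atom id="a2" elementType="O"/>
  </atomArray>
</molecule>"""


def cml_to_triples(cml_text):
    """Turn each CML atom into (subject, predicate, object) triples."""
    root = ET.fromstring(cml_text)
    molecule_uri = EX + root.get("id")
    triples = set()
    for atom in root.iter(CML_NS + "atom"):
        atom_uri = molecule_uri + "/" + atom.get("id")
        triples.add((molecule_uri, EX + "hasAtom", atom_uri))
        triples.add((atom_uri, EX + "elementType", atom.get("elementType")))
    return triples


def match(store, s=None, p=None, o=None):
    """Return stored triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


store = cml_to_triples(SAMPLE_CML)
for triple in match(store, p=EX + "elementType"):
    print(triple)
```

The `match` helper plays the role that a SPARQL basic graph pattern would play against a real triple store.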

EmMa: A general tool for controlling data release between systems

Diagram labels: EmMa; ISIS; ELN; XRay; NMR; etc.; PubChem; PDB; Chem-1; Chem-0; NCS; eCrystals; Pump; Atom feed; Private Atom feed; Public Atom feed; Original data plus basic metadata; Fully semantified data (RDF).

How EmMa could facilitate data release in collaborating institutions

Diagram labels: Institution A EmMa; Institution B EmMa; Rutherford neutron; Private repository; Public repository.

Events:
1. Scientist sends sample to Rutherford
2. Rutherford stores data locally and sends a copy back to the scientist
3. Institution's EmMa is informed about the new data
4. Scientist specifies data release conditions
5. Release conditions reached, data released to public repository
6. Rutherford monitors the institution's Atom feed and detects that the data is released
7. Rutherford makes the data visible in its own public-access repository
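Event 6 amounts to polling an Atom feed and acting on entries not seen before. A minimal sketch, assuming the feed text has already been fetched; the sample feed, its identifiers and the `new_entries` helper are hypothetical, and only the Atom namespace and element names come from the Atom format itself.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom namespace in ElementTree notation

# Invented sample of an institution's public feed.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Institution A public releases</title>
  <id>urn:example:institution-a:public</id>
  <updated>2010-06-01T00:00:00Z</updated>
  <entry>
    <title>Neutron data set 17</title>
    <id>urn:example:dataset:17</id>
    <updated>2010-06-01T00:00:00Z</updated>
  </entry>
</feed>"""


def new_entries(feed_xml, seen_ids):
    """Return (id, title) for entries not seen before and record them as seen."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for entry in root.iter(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        if entry_id and entry_id not in seen_ids:
            seen_ids.add(entry_id)
            fresh.append((entry_id, entry.findtext(ATOM + "title", default="")))
    return fresh


seen = set()
print(new_entries(SAMPLE_FEED, seen))  # first poll: the released data set is reported
print(new_entries(SAMPLE_FEED, seen))  # second poll: prints []
```

Because the monitoring side needs only the feed, the two institutions stay decoupled, in line with the design principle of standard web interfaces.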