Metadata Mòrag Burgon-Lyon University of Glasgow.


31st January 2005 Metadata
Contents: Overview, gLite Adoption, Use Cases, Schema, OGSA-DAI Evaluation, Further information

Overview
The Metadata group exists to examine commonalities in metadata handling across all the High Energy Physics experiments, at the technology, interface and schema levels.
The aim of the group is to ensure that metadata services are deployed, as far as possible, using web services and other grid standards.

gLite Adoption
Atlas have implemented the gLite QueryMetadataCatalog and AttributeMetadataCatalog interfaces as wrappers for an AMI backend:
–AttributeMetadataCatalog: specific methods, e.g. ‘queryByAttributes’, ‘remove’; maps to an AMI command, e.g. AMI ListDataset
–QueryMetadataCatalog: supports AMI and SQL queries; uses a SAX XML parser to interpret the AMI response and produce a list of logical datasets
Classes have been deployed as web services using Apache Axis.
Skeletons of web service clients have been produced in Java and Python.
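
Illustrative sketch (not from the slides): the SAX step described for QueryMetadataCatalog could look roughly like the Python below. The XML layout, the element and attribute names, and the dataset names are all assumptions made for the example; the real AMI response format is not given here.

```python
import xml.sax

# Hypothetical AMI-style response. The real AMI XML layout is NOT shown in the
# slides; element names, attribute names and dataset names are invented.
SAMPLE_RESPONSE = """<?xml version="1.0"?>
<AMIMessage>
  <Result>
    <row><field name="logicalDatasetName">example.dataset.A</field></row>
    <row><field name="logicalDatasetName">example.dataset.B</field></row>
  </Result>
</AMIMessage>"""


class DatasetNameHandler(xml.sax.ContentHandler):
    """Collect the text of every <field name="logicalDatasetName"> element."""

    def __init__(self):
        super().__init__()
        self.datasets = []
        self._buffer = None  # None while we are outside a field of interest

    def startElement(self, name, attrs):
        if name == "field" and attrs.get("name") == "logicalDatasetName":
            self._buffer = []

    def characters(self, content):
        # SAX may deliver one text node in several chunks, so accumulate them.
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "field" and self._buffer is not None:
            self.datasets.append("".join(self._buffer).strip())
            self._buffer = None


def parse_logical_datasets(xml_text):
    """Return the list of logical dataset names found in an XML response."""
    handler = DatasetNameHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.datasets


if __name__ == "__main__":
    print(parse_logical_datasets(SAMPLE_RESPONSE))
```

A streaming SAX handler avoids building the whole response in memory, which may matter when a query returns many datasets.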

Use Cases
Reviewed existing Use Case documentation:
–HEPCAL I & II
–CDF Note 5858 / SAMGrid / D0 & CDF
–BaBar Analysis Grid Application – Use Case and Requirements Document
–Atlas – Catalog Services for Atlas
Held discussions with representatives of the experiments, ARDA and EGEE.

Use Cases
Thirteen Core Use Cases of HEP Metadata: “Unlucky for some?”
ases/CoreUseCases_v10.pdf
Data handling
–Specify a new dataset
–Read metadata for datasets
–Update metadata for a dataset
–Resolve physical data
–Access data in a dataset

Use Cases
Analysis
–Run a physics simulation program
–Select a subset of a dataset
–Run an algorithm over an input dataset
Job handling
–Submit a job to a Grid
–Retrieve/Access the output of a job
–Estimate the system resources cost of running a job
–Monitor the progress of a job
–Repeat a previous job

Use Cases
Conclusions:
Highlighted occasionally conflicting use cases (for updating metadata).
General consensus on the 13 Core Use Cases.
More feedback welcome.

Schema
Looking at different approaches to database schema with two case studies:
–AMI
–SAM
Aim to document the common elements and note the differences in both schema and approach.

Schema
AMI Schema:
Stores Atlas production data for Data Challenges
Simply supports different schemas for different projects
Allows the schema to be changed easily
User accesses multiple DBs via generic commands and a router database
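
Illustrative sketch (not from the slides): one way to picture "generic commands and a router database" is a small lookup table that maps a project to the database holding its data, with the generic command reading whatever columns that project's schema defines. The table names, column names, project names and the `list_datasets` helper are all hypothetical, and SQLite stands in for whatever database AMI actually uses.

```python
import sqlite3


def build_router():
    """Router database: maps a project name to the database that holds it."""
    router = sqlite3.connect(":memory:")
    router.execute("CREATE TABLE router (project TEXT PRIMARY KEY, db_name TEXT)")
    router.executemany(
        "INSERT INTO router VALUES (?, ?)",
        [("projectA", "projectA_db"), ("projectB", "projectB_db")],
    )
    return router


def build_project_dbs():
    """Two project databases whose 'dataset' tables have different schemas."""
    db_a = sqlite3.connect(":memory:")
    db_a.execute("CREATE TABLE dataset (name TEXT, events INTEGER)")
    db_a.execute("INSERT INTO dataset VALUES ('a.sample', 1000)")

    db_b = sqlite3.connect(":memory:")
    db_b.execute("CREATE TABLE dataset (name TEXT, events INTEGER, generator TEXT)")
    db_b.execute("INSERT INTO dataset VALUES ('b.sample', 5000, 'gen1')")
    return {"projectA_db": db_a, "projectB_db": db_b}


def list_datasets(router, databases, project):
    """Generic command: look up the project's database, then list its datasets.

    Column names are read from the cursor, so the same command works even
    though each project defines its own schema.
    """
    row = router.execute(
        "SELECT db_name FROM router WHERE project = ?", (project,)
    ).fetchone()
    if row is None:
        raise KeyError(f"unknown project: {project}")
    cursor = databases[row[0]].execute("SELECT * FROM dataset")
    columns = [col[0] for col in cursor.description]
    return [dict(zip(columns, values)) for values in cursor.fetchall()]


if __name__ == "__main__":
    router, dbs = build_router(), build_project_dbs()
    print(list_datasets(router, dbs, "projectA"))
    print(list_datasets(router, dbs, "projectB"))
```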

Schema
AMI future plans: implement the catalogues defined by David Adams and the ADA group:
–Consists (roughly) of a selectableDataset catalogue with links to physics properties.
–Related to a virtual dataset catalogue containing recipes for creating dataset instances, and a concrete dataset catalogue for finding files.
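
Illustrative sketch (not from the slides): the three catalogues might be pictured as the Python structures below. The class names, fields, and in particular the rule in `locate` (return files when a concrete instance exists, otherwise return the recipe) are assumptions made for the example, not the ADA group's definitions.

```python
from dataclasses import dataclass, field


@dataclass
class SelectableDataset:
    name: str
    physics_properties: dict  # e.g. {"process": "..."}; the keys are hypothetical


@dataclass
class VirtualDataset:
    name: str
    recipe: str  # description of how to create an instance of the dataset


@dataclass
class ConcreteDataset:
    name: str
    files: list  # logical file names of an existing instance


@dataclass
class Catalogues:
    selectable: dict = field(default_factory=dict)
    virtual: dict = field(default_factory=dict)
    concrete: dict = field(default_factory=dict)

    def select(self, **wanted):
        """Names of datasets whose physics properties match every keyword."""
        return [
            ds.name
            for ds in self.selectable.values()
            if all(ds.physics_properties.get(k) == v for k, v in wanted.items())
        ]

    def locate(self, name):
        """Assumed rule: prefer existing files, otherwise fall back to the recipe."""
        if name in self.concrete:
            return ("files", self.concrete[name].files)
        return ("recipe", self.virtual[name].recipe)


if __name__ == "__main__":
    cats = Catalogues()
    cats.selectable["ds1"] = SelectableDataset("ds1", {"process": "signal"})
    cats.virtual["ds1"] = VirtualDataset("ds1", "run simulation with config X")
    print(cats.select(process="signal"), cats.locate("ds1"))
```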

Schema
SAM Schema:
Mature system which has been in production for a number of years
Stores real physics data
Schema updates take between days and weeks to complete due to integration testing
New code, the Dimension Editor, is under development to allow run-time addition of new dimensions

Schema
SAM future plans: Dimension Editor development
–Convert ‘SAM translate constraints’ commands into SQL queries
–Optimise queries using ‘Chains’ and ‘Links’ tables. The Chains table describes the complete definition of join paths all the way back to the fact destination table. The Links table depicts the actual join details needed to follow the chains.
–Allow the creation of new dimensions, built from queries, stored in the ‘Chains’ and ‘Links’ tables. Each time a new dimension is added, the validity of the new table addition can be verified by walking through the paths of all Links on the Chains described.
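
Illustrative sketch (not from the slides): the Chains and Links idea can be pictured as follows. Each link records one join, each chain lists the links leading from the fact table to a dimension's table, and a chain is checked by walking its links in order. All table, column and dimension names are hypothetical and do not reflect the real SAM schema; the functions are an assumption about how such tables could be used.

```python
FACT_TABLE = "data_files"  # hypothetical fact table

# Links: the actual join details (left table, left column, right table, right column).
LINKS = {
    1: ("data_files", "run_id", "runs", "run_id"),
    2: ("runs", "detector_id", "detectors", "detector_id"),
}

# Chains: for each dimension, the ordered links from the fact table to its column.
CHAINS = {
    "run_number": {"column": "runs.run_number", "links": [1]},
    "detector_name": {"column": "detectors.name", "links": [1, 2]},
}


def validate_chain(dimension):
    """Walk the chain: every link must start from a table already reached."""
    reached = {FACT_TABLE}
    for link_id in CHAINS[dimension]["links"]:
        left, _, right, _ = LINKS[link_id]
        if left not in reached:
            return False
        reached.add(right)
    target_table = CHAINS[dimension]["column"].split(".")[0]
    return target_table in reached


def constraint_to_sql(dimension):
    """Translate 'dimension = value' into a parameterised SQL query."""
    if not validate_chain(dimension):
        raise ValueError(f"broken join path for dimension: {dimension}")
    joins = []
    for link_id in CHAINS[dimension]["links"]:
        left, left_col, right, right_col = LINKS[link_id]
        joins.append(f"JOIN {right} ON {left}.{left_col} = {right}.{right_col}")
    column = CHAINS[dimension]["column"]
    return (
        f"SELECT {FACT_TABLE}.file_name FROM {FACT_TABLE} "
        + " ".join(joins)
        + f" WHERE {column} = ?"
    )


if __name__ == "__main__":
    print(constraint_to_sql("detector_name"))
```

Keeping the join paths in data rather than in code is what would let a new dimension be added at run time: adding a row to each table is enough, and `validate_chain` rejects a path that does not connect back to the fact table.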

OGSA-DAI Evaluation
Evaluation to establish the appropriateness of OGSA-DAI for the HEP community. Initial investigation highlights:
Pros
–Web service (SOAP) access to database
–Allows deployment of OGSA-DQP (Distributed Query Processor)
Cons
–Difficult installation due to poor packaging; mass deployment would be problematic
–OGSA-DAI is heavy-weight in terms of dependencies

OGSA-DAI Evaluation
Initial conclusions
–SOAP and DQP are both attractive features. Migration to Globus Toolkit 4 might reduce the dependencies issue. Will keep an eye on future OGSA-DAI releases.
–Further evaluation required, especially throughput tests.
Full document: dai.pdf

Further Information
GridPP Metadata pages:
Metadata Wiki:
Mailing List: