Metadata Workshop Rick St. Denis Glasgow University April 26-28, 2004.

Format
Goal: Answer the question “What is Metadata?” in our document
Method: Provocateurs
Topics list: augment now
Get acquainted, divide and study topics, present together, course of action
Output: Revamped deliverables

Rough Agenda
Mon:
– 2-3: 5 min on who we are
– 3-3:30: Decide on topics
– 3:30-5:00: Get to Stepps, Hotel
– 5:00: Meet in 2 West Ave
Tues: Provocateur sessions and research
Wed: Final Document with deliverables, Plans for future: MO, CHEP abstract

Topics
1. Metadata Architecture and components
– Replica Catalogs, file catalogs, physics catalogs
2. Use Cases
3. Query Languages
4. Implementations and Performance. Technology Considerations, Performance reqs
5. Service architectures, Deployment Architectures
6. Database implementations: text/mysql/postgres/oracle/enth

Informing ourselves
SAM Services (Julie)
Arda/OGSA-DAI (Gav will outline)
LHCB (Carmine)
AMI (Solveig)
Pool and Graphical Visualization (Carmine)
Spitfire (Paul)
PNPA-GGF (Rick)
Project Management (Tony)
Package services for release in SourceForge

Use Cases
CDF5858: physicist use case (Rick)
HEPCAL II (Solveig, Tony)
– Production
– Analysis
– ADA: Atlas catalogs – David Adams (Steve)
D0: Wyatt
Schema Update Document: use cases? (Adam)

Services
Compare Arda and SAM approaches: Arda architecture: Gavin
Given Use cases: Define services
List Services from SAM: Services to services
Interfaces: The SAM service with one schema – the Grid services implemented in several schemas.
Interfaces: Physics catalog impact from failure of lower level services. “file content status”.
Action: outline models of access: physical/logical
Discrete or related bits of functionality: dependencies between services.
Zenness of services. List of files, directive on where to use, not connection to why anymore.
Performance implications on interfaces.
Wyatt, Gavin, Rick, Julie

Deployment Architectures
Where do the services run? Application servers?
Tiers of applications and databases
Replication for HA. At what tier? Application or DB? Oracle?
Is it replication or mirroring?
What is the time constant for replication?
When do metadata become stale? Freshness date: status bits.
Centralized catalogs as a single point of failure: what are the single points of failure?
HA strategies
Federation of metadata
Julie, Gavin, Paul, Solveig
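The staleness question above can be made concrete with a small sketch. It assumes a replica that stamps each catalog entry with the time it was last synchronised (the "freshness date") and derives a status bit from a replication period. The names `CatalogEntry`, `is_stale`, `lookup` and the one-hour `REPLICATION_PERIOD` are hypothetical, chosen for illustration only; they are not taken from SAM, Spitfire or any other system discussed at the workshop.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical replication period: how often a replica re-syncs with the master.
REPLICATION_PERIOD = timedelta(hours=1)


@dataclass
class CatalogEntry:
    logical_name: str
    location: str
    synced_at: datetime  # "freshness date": last time the replica copied this entry


def is_stale(entry, now, period=REPLICATION_PERIOD):
    """Status bit: True once the entry is older than one replication period."""
    return now - entry.synced_at > period


def lookup(replica, logical_name, now):
    """Return (entry, stale_flag) so the caller can decide whether to trust it."""
    entry = replica[logical_name]
    return entry, is_stale(entry, now)


now = datetime(2004, 4, 27, 12, 0)
replica = {
    "run1.dat": CatalogEntry("run1.dat", "site-a", datetime(2004, 4, 27, 11, 30)),
    "run2.dat": CatalogEntry("run2.dat", "site-b", datetime(2004, 4, 27, 9, 0)),
}
fresh_entry, fresh_flag = lookup(replica, "run1.dat", now)
old_entry, old_flag = lookup(replica, "run2.dat", now)
print(fresh_flag, old_flag)  # prints: False True
```

Returning the flag alongside the entry, rather than refusing stale reads, leaves the policy decision (use it anyway, or go back to the master) with the caller.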

Tools
DB: jdbc, phpi, text, mysql, msql, oracle, xml, soap, python
Dbserver
Tools on top of *sql.
Relation to deployment architectures: db access directly or application server.
Replication
Data Virtualization
Rick, Gavin, Solveig, Adam, Julie

Query Languages and Interfaces
SQL
Chains and Links (Rick)
General Dimensions (Wyatt)
Queries against multiple databases. Related to deployment architecture (dimensions, c&l, SBIR II/enth)
POOL (Carmine)

Monitoring
Sam TV (Adam)
Mining and instrumenting (Caitriana)
MonALISA
File access patterns stats

Security
Table Access in a distributed architecture
Server to Server security
Access to the Server by the user
A standard certification protocol
VOMS
Spitfire security

Use Cases: Steve, Rick, Solveig, Tony, Wyatt, Adam
Services: Gavin, Wyatt, Rick, Julie
Deployment Architecture: Solveig, Julie, Gavin, Paul
Monitoring: Adam, Caitriana, Carmine
Query Languages and Interfaces: Carmine, Rick, Wyatt
Tools: Julie, Rick, Gavin, Solveig, Adam
Deliverables: Tony, Gavin

Columns: Rick, Julie, Solveig, Adam, Wyatt, Steve, Mog, Cait, Carm, Tony, Gav, Paul
1 Use Cases: x x x x L x
Services: X x x L
Deployment: x L x x
1 Monitoring: L x x
QueryLang/Int: x x x L
1 Security (&!£): x L x
Tools: x L x x x
Deliverables: L x

Next Steps
Design for Keyword-Value
Schema evolution and self-describing schema
Use previous 2 to automate transition from keyword-value to query-efficient schema and determination of which queries need to be satisfied.
Unique dataset tool
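A minimal sketch of the keyword-value to query-efficient transition, under stated assumptions: metadata starts in a generic `dataset_kv` table, a log of past queries determines which keywords matter, and those keywords are pivoted into real columns of a `dataset_wide` table. The table names, the sample keywords (`stream`, `run_period`, `comment`) and the helper functions are hypothetical; the slides do not specify a design.

```python
import sqlite3
from collections import Counter


def most_queried(query_log, top_n):
    """Return the top_n keywords that appear most often in the logged queries."""
    counts = Counter(key for query in query_log for key in query)
    return [key for key, _ in counts.most_common(top_n)]


def promote(conn, keywords):
    """Pivot the chosen keywords out of the keyword-value table into columns."""
    # Keywords are spliced into SQL, so only accept plain identifiers.
    if not all(k.isidentifier() for k in keywords):
        raise ValueError("keywords must be plain identifiers")
    columns = ", ".join(f'"{k}" TEXT' for k in keywords)
    conn.execute(f"CREATE TABLE dataset_wide (dataset TEXT PRIMARY KEY, {columns})")
    pivots = ", ".join(
        f"MAX(CASE WHEN keyword = '{k}' THEN value END)" for k in keywords
    )
    conn.execute(
        f"INSERT INTO dataset_wide SELECT dataset, {pivots} "
        "FROM dataset_kv GROUP BY dataset"
    )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataset_kv (dataset TEXT, keyword TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO dataset_kv VALUES (?, ?, ?)",
    [
        ("ds1", "stream", "muon"),
        ("ds1", "run_period", "2003"),
        ("ds1", "comment", "test"),
        ("ds2", "stream", "jet"),
        ("ds2", "run_period", "2004"),
    ],
)

# Each logged query is recorded as the list of keywords it filtered on.
query_log = [["stream"], ["stream", "run_period"], ["run_period"], ["comment"]]
hot = most_queried(query_log, 2)
promote(conn, hot)
rows = conn.execute("SELECT dataset FROM dataset_wide WHERE stream = 'jet'").fetchall()
```

Rarely queried keywords such as `comment` stay in the keyword-value table, so new attributes can still be added without a schema change, while the frequent queries run against ordinary indexed columns.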

Deliverables
Docs from next steps
Use case filtered for our group
Services: Decomposition of ER-Diagram into collab diagram
Deployment Arch: Enumerate problems
Monitoring: Stats on queries (accumulate/doc)
QueryLang/Int: Survey of QL (Pool, C&L)
Tools: Wrap corba w/xml
Deliverables: longer term

Schedules
Monthly meeting: Last Tues of month at 8:30/14:30/15:30
First: May 25.
H323:
Mailing list (Paul)

Metadata for the Common Physicist
A working group on metadata, with representatives from ATLAS, BaBar, CDF, CMS, D0, and LHCB in cooperation with EGEE, has identified overlapping user requirements that may be supported by common service implementations. Classes of metadata specific to each service and their relations are described. These include a set of use cases compiled from various HEP documents. The use cases inform the interfaces of existing and planned services as described in the metadata schema. Emphasis is placed on evolving the schema from keyword-value pairs, which are then transformed into a normalised, performant database schema. Self-description mechanisms are reported which, coupled with updating processes, allow the APIs to remain static as the schema evolves. The way use cases drive performance is presented. Requirements are given for the physical and logical arrangement of service implementations, which dictate the degree to which the databases containing the metadata may be distributed or centralised. A set of existing monitoring tools exposes the validity and completeness of the use cases for experiments in various stages of maturity. A survey of the query languages, web service interfaces and tools in use across the experiments is presented.
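One way to picture how self-description could keep an API static while the schema evolves is sketched below. It assumes the catalog can report its own columns (here through SQLite's `PRAGMA table_info`), so a single generic `find` call accepts whatever fields currently exist. The `datasets` table and the `describe` and `find` helpers are hypothetical illustrations, not the working group's mechanism.

```python
import sqlite3


def describe(conn, table):
    """Ask the database for the current column names and declared types."""
    return {row[1]: row[2] for row in conn.execute(f"PRAGMA table_info({table})")}


def find(conn, table, **criteria):
    """Generic lookup whose signature does not change when columns are added."""
    unknown = set(criteria) - set(describe(conn, table))
    if unknown:
        raise KeyError(f"unknown metadata fields: {sorted(unknown)}")
    where = " AND ".join(f'"{name}" = ?' for name in criteria) or "1"
    sql = f"SELECT name FROM {table} WHERE {where}"
    return [row[0] for row in conn.execute(sql, tuple(criteria.values()))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE datasets (name TEXT, run_period TEXT)")
conn.executemany(
    "INSERT INTO datasets VALUES (?, ?)", [("ds1", "2003"), ("ds2", "2004")]
)
before = find(conn, "datasets", run_period="2004")

# The schema evolves: a new attribute is added and filled in.
conn.execute("ALTER TABLE datasets ADD COLUMN stream TEXT")
conn.execute("UPDATE datasets SET stream = 'muon' WHERE name = 'ds2'")
after = find(conn, "datasets", stream="muon")
```

The client code calling `find` is unchanged by the `ALTER TABLE`; only the set of field names it may pass grows, and a misspelt or not-yet-existing field fails early with a `KeyError`.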

Future
Work to deliverables
Meet according to deadlines
Workshops according to major deadlines