1 WP2: Data Management. Paul Millar. eScience All Hands Meeting, 2-4 September 2003.
2 Introduction
- EDG is in the third and final year of the project.
- The 1st generation of tools provided a very good base for input.
- The 2nd generation is designed for modularity and to allow evolution.
- Java-based services (using either Tomcat or Oracle 9iAS).
- Interface design defined in WSDL.
- Client stubs for Java and C/C++, using AXIS and gSOAP.
- Persistent service data is stored in MySQL or Oracle.
- Replication service framework (RM, RLS, RMC, ROS).
- Java security package.
3 Replication Service Framework
[Diagram. Component labels: User, Replica Management Service, Core, Optimisation, Security, Collections, Sessions, Subscriptions, Consistency, Processing, Replica Location Service, MetaData Catalogue, Transport, Replica Selection, Access History, Replication Initiation.]
4 Interaction with services
Internal:
- Replica Location Service (RLS)
- Replica Metadata Catalogue (RMC)
- Replica Optimisation Service (ROS)
External:
- Relational Grid Monitoring Architecture (R-GMA)
- Globus C-based libraries, as well as CoG
- EDG network monitoring services
- EDG-SE services
5 Replica Location Service
- Maintains a (possibly distributed) catalogue of files: 1 file maps to potentially many replicas.
- Need to keep track of file locations and keep them consistently updated.
- RLS stores one-to-many relations between GUIDs and Physical File Names (PFNs).
- Two-level design: LRC (Local Replica Catalogue) and RLI (Replica Location Index).
- An LRC contains a list of GUID to PFN mappings.
- An RLI contains GUID to LRC mappings.
- RLS will operate with just an LRC (EDG 2.0 operation).
- LRCs publish Bloom filter objects: a compact form of representing a set. A Bloom filter may give false positives, but never false negatives.
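The two-level lookup described on this slide can be sketched as follows. This is a minimal illustration, not the EDG RLS API: the class and method names (`RlsSketch`, `Lrc`, `Rli`, `locate`), the filter size and the hash functions are all assumptions made for the example. Each LRC publishes a Bloom filter of its GUIDs, and the RLI queries an LRC only when that LRC's filter says the GUID might be present.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of an LRC/RLI lookup; names are illustrative only. */
class RlsSketch {

    /** Small Bloom filter: may report false positives, never false negatives. */
    static final class BloomFilter {
        private final BitSet bits;
        private final int size;
        private final int hashes;

        BloomFilter(int size, int hashes) {
            this.bits = new BitSet(size);
            this.size = size;
            this.hashes = hashes;
        }

        // Derive the i-th bit position from two simple string hashes (double hashing).
        private int position(String key, int i) {
            int h1 = 0;
            int h2 = 0;
            for (int k = 0; k < key.length(); k++) {
                h1 = 31 * h1 + key.charAt(k);
                h2 = 131 * h2 + key.charAt(k);
            }
            return Math.floorMod(h1 + i * h2, size);
        }

        void add(String key) {
            for (int i = 0; i < hashes; i++) bits.set(position(key, i));
        }

        boolean mightContain(String key) {
            for (int i = 0; i < hashes; i++) {
                if (!bits.get(position(key, i))) return false;
            }
            return true;
        }
    }

    /** Local Replica Catalogue: GUID to PFN mappings for one site. */
    static final class Lrc {
        final String site;
        private final Map<String, List<String>> replicas = new HashMap<>();

        Lrc(String site) { this.site = site; }

        void register(String guid, String pfn) {
            replicas.computeIfAbsent(guid, g -> new ArrayList<>()).add(pfn);
        }

        List<String> lookup(String guid) {
            return replicas.getOrDefault(guid, Collections.emptyList());
        }

        /** Build the compact set summary that this LRC would publish. */
        BloomFilter publish() {
            BloomFilter filter = new BloomFilter(1024, 3); // sizes chosen arbitrarily
            for (String guid : replicas.keySet()) filter.add(guid);
            return filter;
        }
    }

    /** Replica Location Index: one published filter per known LRC. */
    static final class Rli {
        private final Map<Lrc, BloomFilter> filters = new LinkedHashMap<>();

        void update(Lrc lrc) { filters.put(lrc, lrc.publish()); }

        /** Ask only those LRCs whose filter says the GUID might be there. */
        List<String> locate(String guid) {
            List<String> pfns = new ArrayList<>();
            for (Map.Entry<Lrc, BloomFilter> e : filters.entrySet()) {
                if (e.getValue().mightContain(guid)) pfns.addAll(e.getKey().lookup(guid));
            }
            return pfns;
        }
    }
}
```

A false positive in a filter only costs one unnecessary LRC query, because the LRC itself gives the authoritative answer. A false negative would hide a replica, which is why the one-sided error of a Bloom filter suits this role.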
6 RLS Demo at SC2002
7 Replica MetaData Catalogue
- RLS provides the GUID to PFN mapping, but a GUID isn't user friendly.
- RMC provides metadata on a per-GUID basis.
- One such item of metadata is the Logical File Name (LFN). A GUID may have many LFNs associated with it.
- RMC is also capable of storing other metadata, such as file size, date of creation, owner...
- User-defined metadata can also be stored, and searched against.
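The relationships on this slide can be illustrated with a small in-memory model. It is a sketch under assumptions, not the RMC interface: the class `RmcSketch`, its method names and the `lfn:` prefix used in the usage note are invented for the example. It shows many LFNs aliasing one GUID, plus free-form attributes that can be searched.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory model of a metadata catalogue keyed by GUID. */
class RmcSketch {
    // Each LFN names exactly one GUID; a GUID may be named by many LFNs.
    private final Map<String, String> lfnToGuid = new TreeMap<>();
    // Per-GUID attributes (size, owner, user-defined keys, ...).
    private final Map<String, Map<String, String>> attributes = new TreeMap<>();

    void addLfn(String guid, String lfn) {
        lfnToGuid.put(lfn, guid);
    }

    void setAttribute(String guid, String key, String value) {
        attributes.computeIfAbsent(guid, g -> new HashMap<>()).put(key, value);
    }

    /** Resolve a user-friendly name to its GUID, or null if unknown. */
    String guidFor(String lfn) {
        return lfnToGuid.get(lfn);
    }

    /** All LFNs registered for a GUID, in sorted order. */
    List<String> lfnsFor(String guid) {
        List<String> lfns = new ArrayList<>();
        for (Map.Entry<String, String> e : lfnToGuid.entrySet()) {
            if (e.getValue().equals(guid)) lfns.add(e.getKey());
        }
        return lfns;
    }

    /** GUIDs whose attribute 'key' equals 'value', in sorted order. */
    List<String> search(String key, String value) {
        List<String> guids = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> e : attributes.entrySet()) {
            if (value.equals(e.getValue().get(key))) guids.add(e.getKey());
        }
        return guids;
    }
}
```

In this model a user would call `guidFor("lfn:some-name")` to obtain the GUID and then hand that GUID to the RLS to find the physical replicas.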
9 Replica Optimisation Service
- In early TB1, getBestFile was absent. It is now available: it selects the best replica of several available.
- A light-weight web service gathers information from the network monitoring service and the Storage Element services.
- The Resource Broker (meta-scheduler) decides on which CE a job will run.
- ROS treats files mentioned in the JDL as hints, returning an access cost for a given array of potential CEs, allowing the RB to rank them based on availability of data.
- This is the most research-oriented task. OptorSim was developed to test replica optimisation ideas.
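The ranking idea on this slide can be sketched with a toy cost model. Everything here is an assumption made for illustration: the slides do not give the ROS cost function, so this example estimates access cost as file size divided by the monitored bandwidth between a Storage Element and a Computing Element, and the names `RosSketch`, `bestReplica` and `accessCosts` are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical cost model: transfer time = file size / SE-to-CE bandwidth. */
class RosSketch {
    private final Map<String, List<String>> replicaSites = new HashMap<>(); // GUID -> SEs
    private final Map<String, Long> sizes = new HashMap<>();                // GUID -> bytes
    private final Map<String, Double> bandwidth = new HashMap<>();          // "se->ce" -> bytes/s

    void addReplica(String guid, long sizeBytes, String se) {
        sizes.put(guid, sizeBytes);
        replicaSites.computeIfAbsent(guid, g -> new ArrayList<>()).add(se);
    }

    /** Stand-in for figures that would come from network monitoring. */
    void setBandwidth(String se, String ce, double bytesPerSecond) {
        bandwidth.put(se + "->" + ce, bytesPerSecond);
    }

    private double transferSeconds(String guid, String se, String ce) {
        Double rate = bandwidth.get(se + "->" + ce);
        if (rate == null || rate <= 0) return Double.POSITIVE_INFINITY; // no known route
        return sizes.get(guid) / rate;
    }

    /** The SE holding the cheapest replica for this CE, or null if none is reachable. */
    String bestReplica(String guid, String ce) {
        String best = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (String se : replicaSites.getOrDefault(guid, new ArrayList<>())) {
            double cost = transferSeconds(guid, se, ce);
            if (cost < bestCost) {
                bestCost = cost;
                best = se;
            }
        }
        return best;
    }

    /** One total access cost per candidate CE, in the same order as 'ces'. */
    double[] accessCosts(List<String> guids, String[] ces) {
        double[] costs = new double[ces.length];
        for (int i = 0; i < ces.length; i++) {
            for (String guid : guids) {
                String se = bestReplica(guid, ces[i]);
                costs[i] += (se == null)
                        ? Double.POSITIVE_INFINITY
                        : transferSeconds(guid, se, ces[i]);
            }
        }
        return costs;
    }
}
```

A broker could sort its candidate CEs by the returned costs, alongside its other scheduling criteria, so that jobs tend to run close to the data they name.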
10 Security
- Provided by a separate Java package.
- Covers authentication and coarse-grained authorisation.
- Aims to be as flexible as possible.
- Investigating collaboration with the Liberty Alliance, a consortium developing standards and solutions for federated identity.
11 Authentication
- Extends normal Java SSL.
- Mutual authentication in SSL happens by exchanging public certificates signed by mutually trusted CAs, and by cryptographic challenges.
- Uses proxy certificates.
- Accepts GSI proxies as the authentication method.
- Supports GSI proxy loading and reloading.
- Supports OpenSSL certificate and private key loading.
- Supports CRLs with periodic reloading.
- Integrates with Tomcat and the Jakarta AXIS SOAP framework.
- A proxy doesn't have to be signed by a CA, but its DN has to start with the DN of the user's certificate.
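The naming rule in the last bullet can be illustrated with a string check. This is only a sketch of that one rule, assuming DNs written in the slash-separated OpenSSL style and assuming a proxy extends the user's DN with further `/CN=` components. The class `ProxyDnCheck` is hypothetical, and a real validator would also verify the signature chain, validity dates and CRLs.

```java
/** Hypothetical check of the "proxy DN starts with the user DN" rule. */
class ProxyDnCheck {

    /**
     * True when proxyDn is userDn followed by at least one extra CN component.
     * Requiring the "/CN=" boundary means that a DN which merely shares a
     * textual prefix (for example ".../CN=jane doe2") is not accepted.
     */
    static boolean isProxyOf(String proxyDn, String userDn) {
        if (proxyDn == null || userDn == null || userDn.isEmpty()) {
            return false;
        }
        return proxyDn.startsWith(userDn + "/CN=");
    }

    public static void main(String[] args) {
        String user = "/C=UK/O=eScience/CN=jane doe"; // made-up DN
        System.out.println(isProxyOf(user + "/CN=proxy", user));
        System.out.println(isProxyOf("/C=UK/O=eScience/CN=jane doe2/CN=proxy", user));
    }
}
```

A proxy of a proxy (the user DN followed by two `/CN=` components) also passes this check, since its DN still begins with the user's DN.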
12 Coarse-grained authorisation
- Coarse-grained means the server decides what access to grant before the request is processed: it is role based.
- Modular design for client-server interaction. Modules for SOAP and HTTP web traffic are already written.
- Modular configuration. Configuration modules currently exist for XML and for a text file (the gridmap file).
- Integration work with the Virtual Organisation Membership Service (VOMS). This allows authorisation on a per-VO basis, without gridmap files.
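A role-based decision taken before the request is processed might look like the sketch below. It assumes the common gridmap layout of a quoted DN followed by a single name, and it treats that name as a role. That interpretation, the class `GridmapSketch` and its methods are all assumptions for illustration, not the EDG security package's API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical gridmap-style authoriser: DN -> role, checked up front. */
class GridmapSketch {
    // One entry per line: "<distinguished name>" <name>
    private static final Pattern ENTRY =
            Pattern.compile("^\\s*\"([^\"]+)\"\\s+(\\S+)\\s*$");

    /** Parse entries, skipping blank lines, '#' comments and malformed lines. */
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> roles = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) continue;
            Matcher m = ENTRY.matcher(line);
            if (m.matches()) roles.put(m.group(1), m.group(2));
        }
        return roles;
    }

    /** Coarse-grained check: does this DN hold the role the operation needs? */
    static boolean permits(Map<String, String> roles, String dn, String requiredRole) {
        return requiredRole != null && requiredRole.equals(roles.get(dn));
    }
}
```

Because the decision depends only on the caller's identity and the required role, a server can reject a request before doing any work on it. Finer-grained checks, such as per-file permissions, are outside this sketch.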
13 Conclusions
- The 2nd generation of data management services has been written based on the Web-services paradigm.
- We have chosen an extensible service framework. This will allow the adoption of upcoming OGSA standards.
- Our choice of software is based on our aim of supporting both high-availability commercial products and standard Open-Source solutions.
- The 2nd generation of WP2 software is currently being rolled out in production systems as part of the 2.0 release of EDG software.
- Integration of additional services (such as full RLS and VOMS) is being scheduled.