Jean-Philippe Baud, IT-GD, CERN November 2007

1 LFC and DPM
Jean-Philippe Baud, IT-GD, CERN, November 2007

2 Reliability Workshop: LFC-DPM
Agenda
  Goals for LFC and DPM
  DPM architecture
  Simple design
  Good coding practices
  Secure services
  Testing
  Operations

3 Goals for LFC and DPM
LFC: LCG File Catalogue
  Replace EDG RLS
  Provide hierarchical name space, access control lists, sessions and transactions
DPM: Disk Pool Manager
  Provide a scalable solution to replace the Classic Storage Elements at Tier2s
  Focus on manageability
    Easy to install
    Easy to configure
    Low effort for ongoing maintenance
    Easy to add/remove resources
  Integrated security (authentication/authorization)

4 DPM architecture
[Architecture diagram; recoverable labels]
Clients: CLI, C API, SRM-enabled client, etc.
DPM head node
  DPM Name Server
    Namespace
    Authorization
    Physical files location
  DPM Server
    Requests queuing and processing
    Space management
  SRM Servers (v1.1, v2.1, v2.2)
DPMCOPY node
DPM disk servers
  Physical files
Direct data transfer from/to disk server (no bottleneck)

5 DPM architecture (head node)
[Head node diagram; recoverable labels]
Daemons: DPM daemon, DPNS daemon, SRM v1 and v2 daemons
Clients: DPM client, SRM client, lcg-util/gfal
Database backend: DPM tables, DPNS tables
Database access: insert/select data to/from the DPM tables; insert/select data to/from the DPNS tables
Request types: synchronous requests; asynchronous requests to DB
Other labels: Space manager, Request Scheduler, Persistency, Server side, Interoperability, Maestro of metadata, Metadata, Control interface, Authentication, Control data

6 Simple design (1)
DPM architecture is database centric
Only 2 DBs
Fairly simple schema
No complex query (mostly key access)
Use of bind variables, indices, transactions and integrity constraints
Automatic reconnection to the DB (allows transparent failover when using Oracle)
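
The database practices listed on this slide (key access, bind variables, transactions, integrity constraints) can be illustrated with a minimal sketch. It is not the DPM or LFC schema: the tables `file_metadata` and `file_replica`, their columns, and the functions below are hypothetical, and Python's built-in `sqlite3` stands in for the real database backend.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a small, hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce REFERENCES constraints
    conn.executescript("""
        CREATE TABLE file_metadata (
            fileid        INTEGER PRIMARY KEY,
            parent_fileid INTEGER NOT NULL,
            name          TEXT NOT NULL,
            UNIQUE (parent_fileid, name)      -- integrity constraint + index
        );
        CREATE TABLE file_replica (
            sfn    TEXT PRIMARY KEY,          -- one row per physical copy
            fileid INTEGER NOT NULL REFERENCES file_metadata(fileid)
        );
    """)
    return conn


def register_file(conn, parent_fileid, name, sfn):
    """Insert a name space entry and its replica in a single transaction."""
    with conn:  # commits on success, rolls back if any statement fails
        cur = conn.execute(
            "INSERT INTO file_metadata (parent_fileid, name) VALUES (?, ?)",
            (parent_fileid, name),  # bind variables, never string formatting
        )
        conn.execute(
            "INSERT INTO file_replica (sfn, fileid) VALUES (?, ?)",
            (sfn, cur.lastrowid),
        )
        return cur.lastrowid


def lookup(conn, parent_fileid, name):
    """Key access: resolve one entry through the unique (parent, name) index."""
    row = conn.execute(
        "SELECT fileid FROM file_metadata WHERE parent_fileid = ? AND name = ?",
        (parent_fileid, name),
    ).fetchone()
    return None if row is None else row[0]
```

If the second insert violates a constraint (for example a duplicate `sfn`), the whole transaction is rolled back, so no name space entry is left without a replica.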

7 Simple design (2)
Few daemons
Mainly communicating through the DB
Stateless
Configuration is kept in DB
A given daemon can be restarted on a different server
Scalability and high availability
All servers (except the DPM one) can be replicated if needed (DNS load balancing)
Daemons can be restarted independently
Automatic retries in clients
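
Automatic retries in clients are what make an independent daemon restart invisible to users. The following is a minimal sketch of that idea, not the actual client library logic: the function name `call_with_retries`, its parameters, and the choice of `ConnectionError` as the retryable failure are all assumptions made for illustration.

```python
import time


def call_with_retries(operation, attempts=3, delay=1.0, sleep=time.sleep):
    """Call operation(); on ConnectionError wait and try again.

    The wait grows linearly (delay, 2 * delay, ...). The last failure is
    re-raised so the caller still sees the error if the server stays down.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts:
                raise
            sleep(delay * attempt)
```

A caller would wrap a single request, for example `call_with_retries(lambda: send_request(req))`, where `send_request` is whatever hypothetical function talks to the server. Because the daemons are stateless, repeating the request after a restart is safe as long as the request itself is idempotent.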

8 Good coding practices
For long term maintainability of the code
Portable code (compiled and tested on several platforms)
Modular code with enough comments
Protect against buffer overrun
Check validity of parameters
Check for memory leaks
Avoid mutexes in multi-threaded applications for performance reasons (good design is needed)
Code profiling

9 Security
All control and I/O services have security built-in (GSI)
The entries in the name space can be protected by POSIX Access Control Lists
All privileged operations can only be done with a Host Certificate on a trusted host
VOMS integration: groups, sub-groups and roles are supported
The DNs and VOMS FQANs are mapped to virtual ids (no pool account)
All the groups present in the proxy are used for authorization in the namespace
Only the primary group/role is used in disk pool selection
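
The last three points (virtual ids, all groups for namespace authorization, primary group only for pool selection) can be sketched as follows. This is an illustration under stated assumptions, not the DPM implementation: the class `VirtualIdMap`, the helper functions, the starting id 101, and the simplified group check (a real POSIX ACL evaluation also considers owner, mask and other entries) are all hypothetical.

```python
import itertools


class VirtualIdMap:
    """Assign a stable numeric virtual id to each name on first use."""

    def __init__(self, first_id=101):
        self._ids = {}
        self._next = itertools.count(first_id)

    def lookup(self, name):
        if name not in self._ids:
            self._ids[name] = next(self._next)
        return self._ids[name]


def credentials(dn, fqans, users, groups):
    """Map a DN to a virtual uid and each FQAN in the proxy to a virtual gid.

    The order of fqans is preserved, so gids[0] is the primary group/role.
    """
    return users.lookup(dn), [groups.lookup(fqan) for fqan in fqans]


def may_access_entry(gids, allowed_gids):
    """Namespace authorization: any group present in the proxy may match."""
    return any(gid in allowed_gids for gid in gids)


def pool_selection_gid(gids):
    """Disk pool selection: only the primary group/role is considered."""
    return gids[0]
```

With a proxy carrying `/atlas/Role=production` followed by `/atlas`, a namespace entry that grants access to the `/atlas` gid is accessible, while pool selection would use only the gid of `/atlas/Role=production`.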

10 Testing
Unit tests
  Test of new features
  Test after bug fixes
Functional tests
  Full test suite
  Interoperability testing (SRM)
Stress tests
  Find the limits of the system
  Discover timing and corner problems
Pilot service (LFC only)
  Test of bulk methods by ATLAS
  Test of new permission and ownership scheme (LHCb)

11 Operations
Common logging format with timestamps and user identity
LFC upgrade is transparent if no DB schema change and if 2 frontends are used
We limit the number of DB schema updates to about once a year
LFC and DPM databases do not need to run on the same machine as the frontend server
Monitoring scripts (LFC)
  Number of threads, response time, DB errors
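
A common logging format with timestamps and user identity can be sketched with Python's standard `logging` module. The field layout below (timestamp, service name, process id, `user=` followed by the DN) is an assumption chosen for illustration, not the actual LFC/DPM log format, and the function names are hypothetical.

```python
import io
import logging


def make_logger(name, stream):
    """Build a logger whose every line carries a timestamp and the user DN."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep output confined to our handler
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(name)s[%(process)d] user=%(user)s %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    ))
    logger.handlers = [handler]
    return logger


def log_request(logger, user_dn, message):
    """Log one request; the DN is passed through the 'extra' mechanism."""
    logger.info(message, extra={"user": user_dn})


def example_line():
    """Return one formatted log line, captured in memory."""
    stream = io.StringIO()
    logger = make_logger("lfc-sketch", stream)
    log_request(logger, "/DC=ch/CN=Some User", "mkdir /grid/test")
    return stream.getvalue()
```

Keeping the same field order in every daemon means a single monitoring script can extract response times or DB errors per user without service-specific parsing.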

12 Conclusion
The LFC and DPM have become very popular (more than 100 sites are using them for many VOs)
The simple and robust design allows us to do external site support with less than one FTE at CERN
Documentation:
  Reference man pages
  Admin guide
  Troubleshooting

