Published by Shonda Gibbs. Modified over 8 years ago.
1
Proxy management mechanism and gLExec integration with the PanDA pilot
Status and perspectives
2
Agenda
o Architectural review
o Integration status
o Site deployment
o Perspectives
3
Introduction
WLCG Grid Job and Worker Node Security Assessment:
o http://cern.ch/go/7vK9
o User traceability
o Sandboxed execution of payloads
o Local policy enforcement
gLExec:
o Provides proxy management on the worker node (WN)
o Toolset to manage the sandbox
o Can change UID transparently
4
Architectural review
PanDA:
o Late-binding, pilot-based framework (pull)
The pilot job and gLExec:
o Handle the payload and user proxy
o Sandbox and detach
o Monitor and kill
o Multi-process, multi-user environment
5
Pilot and gLExec
[Diagram. Components: PanDA Server, MyProxy, Computing Element, PanDA pilot, gLExec, Payload. Arrow labels: Job, User proxy.]
6
[Sequence diagram along a time axis (t). Lanes: Wrapper (pilot), Main process (pilot), Sandboxed pilot process (user), Payload process (user). Calls: setUpEnvironment(), createSandbox(), runGlexec(), startMonitoringLoop(), startPayload(), payload(), sendHeartbeat(), kill(), cleanup().]
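The call sequence in the diagram can be approximated with a short sketch. It is an illustration only: it starts the payload with a plain subprocess instead of going through gLExec, the function name `run_with_monitoring` and its parameters are hypothetical, and the comments map each step to the diagram's call names on a best-guess basis.

```python
import shutil
import subprocess
import sys
import tempfile
import time


def run_with_monitoring(command, timeout, poll_interval=0.1,
                        send_heartbeat=lambda: None):
    """Run a payload in a scratch directory, monitor it, and clean up.

    Hypothetical helper: no gLExec call and no identity switch happen here.
    """
    sandbox = tempfile.mkdtemp(prefix="sandbox-")      # createSandbox()
    try:
        proc = subprocess.Popen(command, cwd=sandbox)  # startPayload()
        deadline = time.monotonic() + timeout
        while proc.poll() is None:                     # startMonitoringLoop()
            send_heartbeat()                           # sendHeartbeat()
            if time.monotonic() > deadline:
                proc.kill()                            # kill()
                proc.wait()
                return "killed"
            time.sleep(poll_interval)
        return "finished" if proc.returncode == 0 else "failed"
    finally:
        shutil.rmtree(sandbox, ignore_errors=True)     # cleanup()


if __name__ == "__main__":
    print(run_with_monitoring([sys.executable, "-c", "pass"], timeout=30))
```

In the real pilot the payload would run under the user's identity via gLExec; that step is deliberately left out here because its exact invocation is not given in the slides.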
7
Pilot
[Flowchart. Box labels, in extraction order: Collect local info; Job recovery; Additional cleanup; Multi-job loop; Check proxy; Check local space; Collect local info; Get job; Fork sub process; Clean up; Monitoring loop; Check log size; (Check workDir); Looping check; Check local space; Setup job; Transfer input; Transfer output; Execute payload; Signal handler; Abort and clean up; gLExec; Signal handler; Abort and clean up.]
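The monitoring-loop boxes in the flowchart (log size, local space, looping check) could be expressed as independent checks that each return an error message or nothing. The sketch below is an assumption about how such checks might look, not the pilot's code: all function names, the `limits` keys, and the definition of "looping" as "no file in the work directory modified recently" are hypothetical.

```python
import os
import shutil


def check_log_size(log_path, max_bytes):
    """Report a problem if the payload log has grown beyond max_bytes."""
    if os.path.isfile(log_path) and os.path.getsize(log_path) > max_bytes:
        return "log size above limit"
    return None


def check_local_space(work_dir, min_free_bytes):
    """Report a problem if the file system holding work_dir is nearly full."""
    if shutil.disk_usage(work_dir).free < min_free_bytes:
        return "not enough local space"
    return None


def check_looping(work_dir, now, max_idle_seconds):
    """Assumed heuristic: no file under work_dir was modified recently."""
    mtimes = [os.path.getmtime(os.path.join(root, name))
              for root, _dirs, names in os.walk(work_dir)
              for name in names]
    if mtimes and now - max(mtimes) > max_idle_seconds:
        return "payload appears to be looping"
    return None


def run_checks(work_dir, log_path, now, limits):
    """Run every check once and return the list of problems found."""
    results = [
        check_log_size(log_path, limits["max_log_bytes"]),
        check_local_space(work_dir, limits["min_free_bytes"]),
        check_looping(work_dir, now, limits["max_idle_seconds"]),
    ]
    return [message for message in results if message is not None]
```

A monitoring loop would call `run_checks` periodically and abort and clean up the job when the returned list is not empty.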
8
Pilot refactoring
Two processes running:
o Regular pilot process (downloads payloads)
o Sandboxed process (monitoring loop)
Process status encapsulation
Multi-user:
o Pilot process ↔ isolated sandbox
o Transfer environments securely
9
Integration status: done
New monitoring loop:
o Configuration handler
o Serialize and encapsulate process status
Sandbox creation:
o Status transfer, user proxy retrieval
o Copy over the binaries
Subprocess detaching
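Serializing and encapsulating the process status for transfer into the sandbox might look like the following sketch. The `PilotStatus` fields, the file name `pilot_status.json`, and the choice of JSON are assumptions made for illustration; the slides do not say which format or fields the pilot uses.

```python
import json
import os
from dataclasses import asdict, dataclass, field


@dataclass
class PilotStatus:
    """Hypothetical snapshot of what the sandboxed process needs."""
    job_id: str
    work_dir: str
    proxy_path: str
    environment: dict = field(default_factory=dict)


def write_status(status, sandbox_dir):
    """Serialize the status to a JSON file created with owner-only access."""
    path = os.path.join(sandbox_dir, "pilot_status.json")
    # Create the file with mode 0600 so other local users cannot read it.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        json.dump(asdict(status), handle)
    return path


def read_status(path):
    """Rebuild the status object inside the sandboxed process."""
    with open(path) as handle:
        return PilotStatus(**json.load(handle))
```

Handing the file over to the target user account, which gLExec would be responsible for in a multi-user setup, is not shown here.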
10
Integration status: pending
Fine-tuning the sandboxed process:
o Lots of different data to pass into the sandbox
o And to retrieve afterwards
Batch system signal handling
Payload completion
Time floor testing with different users
Time scale: still work to do.
11
Issues
Code structure:
o Monolithic: much refactoring was needed, but it is done.
o No consensus on environments:
  Many options for: initdir, workdir, sandbox, mkgltmpdir, stagein/out...
  No clear definition in all batch systems.
  O(n^4) possible sandbox configurations.
gLExec testing sandbox:
o Kindly provided by Maarten.
12
Scalability
Every worker node has to reach MyProxy:
o Artificially high load
o There is no need if the PanDA Server gets proxies and caches them
o Security concerns for payload transfer
o The worker node is not a trusted entity: allowing all of them to reach MyProxy is not recommended
13
[Diagram. Components: PanDA client, PanDA Server, MyProxy, Proxy cache *, Computing Element, PanDA pilot, gLExec, Job. Numbered flows: 1. User proxy and job; 2. User proxy; 2. Job; 3. User proxy; 4. User proxy and job; 5. Notification: proxy about to expire.]
* The user credential is forwarded to the WN as part of the payload metadata by PanDA. Implementation-wise, simple caching schemes would be desirable to reduce the load imposed on MyProxy.
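One possible shape for such a simple caching scheme is sketched below: a per-user cache that only goes back to the credential source when the cached proxy is missing or close to expiry. The class name `ProxyCache`, the `fetch` callable, and the `min_remaining` threshold are hypothetical; no real MyProxy client call is made.

```python
import time


class ProxyCache:
    """Cache user proxies and refresh them shortly before they expire."""

    def __init__(self, fetch, min_remaining=3600.0, clock=time.time):
        # fetch is an assumed callable: user_dn -> (proxy_text, expiry_epoch).
        self._fetch = fetch
        self._min_remaining = min_remaining
        self._clock = clock
        self._entries = {}

    def get(self, user_dn):
        """Return a proxy for user_dn, contacting the source only if needed."""
        entry = self._entries.get(user_dn)
        if entry is None or entry[1] - self._clock() < self._min_remaining:
            entry = self._fetch(user_dn)
            self._entries[user_dn] = entry
        return entry[0]
```

With this arrangement the number of requests to the credential source scales with the number of users and the proxy lifetime rather than with the number of jobs or worker nodes.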
14
Perspectives
Need to define new responsible people:
o Developers
o Testers