CASTOR new stager proposal CASTOR users’ meeting 24/06/2003 The CASTOR team.


1 CASTOR new stager proposal
CASTOR users’ meeting, 24/06/2003
The CASTOR team

2 Outline
24/06/2003 CASTOR new stager proposal, http://cern.ch/castor/DOCUMENTATION/ARCHITECTURE/NEW/Welcome.html
The vision... and a caveat
Some problems with today’s system
Proposal
  – Ideas how to get around those problems
  – Architecture
  – Request registration and scheduling
  – Catalogues
  – Disk server access and physical file ownership
  – Some interesting features
Project planning and progress monitoring
Conclusions

3 Vision...
With clusters of 100s of disk and tape servers, automated storage management increasingly faces the same problems as CPU cluster management
  – (Storage) Resource management
  – (Storage) Resource sharing
  – (Storage) Request scheduling
  – Configuration
  – Monitoring
The stager is the main gateway to all resources managed by CASTOR
Vision: Storage Resource Sharing Facility

4 ... and a caveat
The vision is to provide a scalable Storage Resource Sharing Facility
  – The hope is to achieve an efficiency of storage resource utilization similar to what LSF provides for CPU resources today
However: nothing in the proposed design enforces a single shared stager instance
  – Today’s configuration with some 40 independent stagers is still OK

5 Some problems with today’s stager
A lot of code for supporting direct tape access
No true request scheduling
  – Throttling, load-balancing
  – Fair-share
Resource sharing not supported
  – Stagers are either dedicated or public
  – Dedicated resources → Some disk servers are 100% full/loaded while others are idle
  – Public resources → No control of who gets how much of the resources. Invites abuse
Operational issues
  – No unique request identifiers
  – Problem tracing difficult
stagein -V P01234 -v EK4432 -q u -f MYHIGGSTODAY \
    -g 994BR5 -b 8000 -F FB -L 80 -C ebcdic,block -E skip

6 Proposal: Ideas for the new stager
Pluggable framework rather than total solution
  – True request scheduling: delegate the scheduling to a pluggable black-box scheduler. Possibly using third party schedulers, e.g. Maui or LSF
  – Policy attributes: externalize policy engines governing the resource matchmaking. Start with today’s policies for file system selection, GC, migration, ... Could move toward full-fledged policy languages, e.g. implemented using “GUILE”
Restricted access to storage resources to achieve predictable load
  – No random rfiod eating up the resources behind the back of the scheduling system
Disk server autonomy as far as possible
  – In charge of local resources: file system selection and execution of garbage collection
  – Losing a server should not affect the rest of the system
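
The idea of a pluggable black-box scheduler can be illustrated with a minimal sketch. Everything below is an assumption made for illustration only: the `SchedulerPlugin` interface, the `Request` fields and both plugins are hypothetical and are not taken from the CASTOR proposal or from the Maui or LSF APIs.

```python
import heapq
import itertools
from abc import ABC, abstractmethod
from collections import deque
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Request:
    """A file access request; the fields are illustrative only."""
    request_id: int
    user: str
    castor_path: str


class SchedulerPlugin(ABC):
    """Hypothetical interface through which the framework talks to a scheduler."""

    @abstractmethod
    def submit(self, request: Request) -> None:
        """Hand a registered request to the scheduler."""

    @abstractmethod
    def next_request(self) -> Optional[Request]:
        """Return the next request to run, or None if nothing is pending."""


class FifoScheduler(SchedulerPlugin):
    """Simplest possible plugin: first come, first served."""

    def __init__(self) -> None:
        self._pending = deque()

    def submit(self, request: Request) -> None:
        self._pending.append(request)

    def next_request(self) -> Optional[Request]:
        return self._pending.popleft() if self._pending else None


class PriorityUserScheduler(SchedulerPlugin):
    """Alternative plugin: favoured users first, FIFO within each class."""

    def __init__(self, priority_users: Set[str]) -> None:
        self._priority_users = priority_users
        self._heap = []
        self._seq = itertools.count()  # tie-breaker that preserves arrival order

    def submit(self, request: Request) -> None:
        rank = 0 if request.user in self._priority_users else 1
        heapq.heappush(self._heap, (rank, next(self._seq), request))

    def next_request(self) -> Optional[Request]:
        return heapq.heappop(self._heap)[2] if self._heap else None


def drain(scheduler: SchedulerPlugin, requests: Iterable[Request]) -> List[int]:
    """Framework side: knows only the interface, never the scheduling logic."""
    for request in requests:
        scheduler.submit(request)
    order = []
    while True:
        request = scheduler.next_request()
        if request is None:
            break
        order.append(request.request_id)
    return order
```

The point of the sketch is that `drain` works unchanged with either plugin, which is the property a pluggable framework would rely on when swapping in a third party scheduler.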

7 Proposal
[Architecture diagram. Components: Physics application; RFIO API; stage API; RequestHandler; Request queue; Request scheduler; Scheduling policies; Global catalogue; mstaged; Common RTCOPY client: rtcpclientd; VDQM; VMGR; Cns; sstaged; Local Request scheduler; Local catalogue; Local policies; rfiod; rtcpd. Arrow labels: Request /castor/…; Access /castor/…; Get disk server; Get physical path; Start tape request. Legend: New module; Existing, modified; Existing module; External.]

8 Proposal: Request scheduling (1)
A “master stager” (mstaged) receives all CASTOR file access requests
  – Authenticate client and register the request
  – Queue the request
  – The request registration is independent of the scheduling. It has to be designed to cope with high request load peaks
Pluggable scheduler manages the queue and applies configured policies
  – E.g. requests from gid=1307 should only run on atlas001d, ...
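
A minimal sketch of how registration can be kept separate from scheduling, with a policy of the kind mentioned above (gid=1307 restricted to atlas001d). The class and function names, the policy table format and the "least loaded server" rule are assumptions for illustration, not part of the proposal.

```python
import itertools
from collections import deque
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass
class StageRequest:
    request_id: int
    gid: int
    castor_path: str


class RequestRegister:
    """Registration only: assign a unique id and queue. No scheduling decision here."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self.queue = deque()

    def register(self, gid: int, castor_path: str) -> int:
        request = StageRequest(next(self._ids), gid, castor_path)
        self.queue.append(request)  # cheap, so it can absorb request peaks
        return request.request_id


# Hypothetical policy table: gid -> set of disk servers the group may use.
# A gid that is absent from the table is unrestricted.
POLICY: Dict[int, Set[str]] = {1307: {"atlas001d"}}


def schedule_one(
    register: RequestRegister,
    server_load: Dict[str, float],
    policy: Dict[int, Set[str]],
) -> Optional[Tuple[int, str]]:
    """Take the oldest request and place it on the least loaded eligible server.

    Returns (request_id, server), or None if the queue is empty or no server
    is eligible (in which case the request is put back at the front).
    """
    if not register.queue:
        return None
    request = register.queue.popleft()
    allowed = policy.get(request.gid)
    candidates = [s for s in server_load if allowed is None or s in allowed]
    if not candidates:
        register.queue.appendleft(request)
        return None
    server = min(candidates, key=lambda s: server_load[s])
    return request.request_id, server
```

In this sketch `register` does a constant amount of work per request, while `schedule_one` can run at its own pace, which mirrors the distinction between keeping up with peak rates and keeping up with average rates.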

9 Proposal: Request handling & scheduling
[Diagram. Typical file request: Read: /castor/cern.ch/user/c/castor/TastyTrees, DN=castor. Components: RequestRegister (thread pool); Fabric Authentication service, e.g. Kerberos-V server; Request repository (Oracle, MySQL); Scheduler; Scheduling Policies; Dispatcher; Catalogue. Arrow labels: Authenticate “castor”; Store request; Get Jobs; user “castor” has priority; Disk server load; File staged?; Run request on pub003d.]
Request registration: Must keep up with high request rate peaks
Request scheduling: Must keep up with average request rates

10 Proposal: Request scheduling (2)
A “slave stager” (sstaged) runs on each disk server
  – Executes and controls all requests scheduled to it by the mstaged
  – Takes care of local resource scheduling such as file system selection and execution of garbage collector
The sstaged also gathers relevant local load information for the central scheduler
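
The local duties listed above (file system selection and load reporting) might look roughly as follows. This is a sketch under stated assumptions: the `FileSystem` fields, the selection rule (fewest active streams, then most free space) and the shape of the load report are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class FileSystem:
    mount_point: str      # hypothetical local mount point
    free_bytes: int
    active_streams: int


def select_file_system(
    file_systems: List[FileSystem], needed_bytes: int
) -> Optional[FileSystem]:
    """Pick a local file system with enough space for the file.

    Assumed rule: prefer the fewest active streams, break ties by most free space.
    Returns None when no file system can hold the file.
    """
    candidates = [fs for fs in file_systems if fs.free_bytes >= needed_bytes]
    if not candidates:
        return None
    return min(candidates, key=lambda fs: (fs.active_streams, -fs.free_bytes))


def load_report(server: str, file_systems: List[FileSystem]) -> Dict[str, object]:
    """Summarise local load in a form a central scheduler could consume."""
    return {
        "server": server,
        "free_bytes": sum(fs.free_bytes for fs in file_systems),
        "active_streams": sum(fs.active_streams for fs in file_systems),
    }
```

Keeping the selection on the disk server means the central scheduler only needs the aggregated report, not the state of every file system.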

11 Proposal: Catalogues
Request catalogues
  – Central repository of all running requests + request history
      Predictable load → facilitate load balancing
      Usage accounting from request history
      Fair-share
File catalogues
  – Central CASTOR file → disk server mapping allows for finding files
  – Local CASTOR file → physical filename catalogue on the disk servers
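
The two-level file catalogue can be pictured as two lookups: a central one from CASTOR file name to disk server, and a local one on that server from CASTOR file name to physical file name. The sketch below uses plain dictionaries and made-up entries; the function name and the data are illustrative assumptions only.

```python
from typing import Dict, Optional, Tuple

# Central catalogue (hypothetical content): CASTOR file name -> disk server.
central_catalogue: Dict[str, str] = {
    "/castor/cern.ch/user/c/castor/TastyTrees": "pub003d",
}

# Local catalogues (hypothetical content), one per disk server:
# CASTOR file name -> physical file name on that server.
local_catalogues: Dict[str, Dict[str, str]] = {
    "pub003d": {
        "/castor/cern.ch/user/c/castor/TastyTrees":
            "/shift/pub003d/data01/c3/stage/castor/TastyTrees.1",
    },
}


def locate(castor_path: str) -> Optional[Tuple[str, str]]:
    """Resolve a CASTOR file name to (disk server, physical path).

    Returns None if the file is not staged on any disk server.
    """
    server = central_catalogue.get(castor_path)
    if server is None:
        return None
    physical = local_catalogues.get(server, {}).get(castor_path)
    if physical is None:
        return None
    return server, physical
```

Only the second lookup needs to know physical paths, so in this picture the central side never has to track how a disk server lays out its file systems.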

12 Proposal: Disk server access
Today a user can access files on disk servers either by
  – The CASTOR file name /castor/cern.ch/...
  – The physical file name /shift/lhcb003d/...
With the new stager we restrict
  – To only allow for access by CASTOR file name
  – All physical files are owned by a generic account (stage,st) and their paths are hidden from direct RFIO access
Why?

13 Proposal: Disk server access
Avoid two databases for file permissions & ownership
  – CASTOR name server
  – File system holding physical file
Facilitate migration/recall of user files
  – Files with different owners are normally grouped together on tapes owned by a generic account (stage,st)
  – Would like to avoid setuid/setgid for every file
Avoid backdoors: all disk server access must be scheduled
A useful analogy: forbid interactive login access to the batch nodes in an LSF cluster

14 Proposal: Disk server access
CASTOR name server
  File: /castor/cern.ch/user/c/castor/RottenTrees
  Owner: castor,c3
sstaged managed disk server
  File: /shift/pub002d/data05/c3/stage/castor/RottenTrees.82345
  Owner: stage,st
Every scheduled access results in an instance of rfiod running on behalf of the user under the generic account (stage,st) on the disk server
rfiod authenticates the user and checks that the request has been scheduled. Unscheduled requests are rejected
rfiod on the disk server only allows access to /castor files. Access by physical path is rejected
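
The two checks described for rfiod (reject physical paths, reject unscheduled requests) can be sketched as a single function. This is an illustration only: `check_access`, the `scheduled` set and the returned reason strings are hypothetical, and user authentication is assumed to have happened before the call.

```python
from typing import Set, Tuple


def check_access(
    user: str, path: str, scheduled: Set[Tuple[str, str]]
) -> Tuple[bool, str]:
    """Decide whether an already authenticated user may open `path`.

    `scheduled` holds the (user, CASTOR file name) pairs that the scheduling
    system has dispatched to this disk server (an assumed representation).
    """
    # Only CASTOR file names are accepted; physical paths are never served.
    if not path.startswith("/castor/"):
        return False, "access by physical path is rejected"
    # The access must correspond to a request that was actually scheduled.
    if (user, path) not in scheduled:
        return False, "request has not been scheduled"
    return True, "ok"
```

For example, with `scheduled = {("castor", "/castor/cern.ch/user/c/castor/RottenTrees")}`, the user castor may open that CASTOR file name, while the same user asking for a path under /shift/ is refused regardless of what has been scheduled.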

15 Proposal: Some interesting features
Modifications to the tape mover allow for adding files to running tape requests
Migration (and recall) controlled by a new central component called rtcpclientd
  – Initiates the tape requests
  – Schedules the file copies just-in-time when the tape is positioned
Dynamically expanding migration streams
Better load-balancing is possible since the file copies are scheduled according to the load
Allow for seeks in RFIO v3 (streaming) mode

16 Project planning and monitoring
Detailed plan in proposal document
Three milestones:
  – October 2003: Demonstrate concept of pluggable scheduler and high rate request handling
  – February 2004: Integrated prototype of the whole system
  – April 2004: Production system ready for deployment
Progress monitoring
  – Aim to use the Project/Task manager provided by the LCG Savannah portal (http://savannah.cern.ch/projects/castor/)
  – Progress reviews at each milestone? Are the experiments interested in contributing effort to help with the reviews?

17 Conclusions
The proposal aims for
  – A pluggable framework for intelligent and policy controlled file access scheduling
  – Evolvable storage resource sharing facility framework rather than a total solution
  – File access request running/control and local resource allocation delegated to disk servers
Questions, remarks, proposals?

