1 LCG Phase 2 Planning Meeting - Friday July 30th, 2004 Jean-Yves Nief CC-IN2P3, Lyon An example of a data access model in a Tier 1

2 Overview of BaBar @ CC-IN2P3 (I)
CC-IN2P3 has been a mirror site of SLAC for BaBar since November 2001:
– real data.
– simulation data.
(total = 220 TB)
It provides the infrastructure needed to analyze these data for the end users, and is open to all BaBar physicists.

3 Overview of BaBar @ CC-IN2P3 (II)
2 types of data available:
– Objectivity format (commercial OO database): being phased out.
– ROOT format (ROOT I/O; xrootd developed at SLAC).
Hardware:
– 200 GB tapes (type: 9940).
– 20 tape drives (r/w rate = 20 MB/s).
– 20 Sun servers.
– 30 TB of disks (disk/tape ratio = 15%; actually ~30% if rarely accessed data are ignored), split between a permanent area and a staging cache.

4 BaBar usage @ CC-IN2P3
2002 – 2004: ~20% of the available CPU (out of a total of ~1000 CPUs).
Up to 450-500 user jobs running in parallel.
Remote access to the Objectivity and ROOT files from the batch workers (BW), as sketched below:
 random access to the files: only the objects needed by the client are transferred to the BW (~kB per request).
 hundreds of connections per server.
 thousands of requests per second.
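As an illustration of this access pattern, here is a minimal ROOT macro (a sketch only: the server name, file path, tree name and branch name are hypothetical). Because ROOT I/O fetches baskets on demand, only the branches actually read cross the network:

#include "TFile.h"
#include "TTree.h"
#include <iostream>

void readRemote() {
  // A "root://" URL makes the client talk to an xrootd daemon instead
  // of a local file system (the host name below is made up).
  TFile *f = TFile::Open("root://ccxroot.in2p3.fr//babar/T1.root");
  if (!f || f->IsZombie()) { std::cerr << "open failed\n"; return; }
  TTree *t = nullptr;
  f->GetObject("events", t);        // hypothetical tree name
  if (!t) { f->Close(); return; }
  t->SetBranchStatus("*", 0);       // disable all branches ...
  t->SetBranchStatus("energy", 1);  // ... except the one we need
  Float_t energy = 0;
  t->SetBranchAddress("energy", &energy);
  for (Long64_t i = 0; i < t->GetEntries(); ++i)
    t->GetEntry(i);                 // transfers only "energy" baskets
  f->Close();
}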

5 Data access model
[Flattened diagram: a client asks the master servers (xrootd / Objectivity master daemons) for T1.root and is redirected to one of the data servers (xrootd / Objectivity slave daemons), whose disks are backed by HPSS.]
(1) + (2): dynamic load balancing.
(4) + (5): dynamic staging.
(6): random access to the data.
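The toy model below (all names invented; this is not the actual xrootd protocol, just the logic of the diagram) sketches the flow: the master redirects the client to the least-loaded data server, the server stages the file from HPSS if it is not already on disk, and the client then reads from that server directly:

#include <algorithm>
#include <iostream>
#include <set>
#include <string>
#include <vector>

// Toy model of the access flow; names and logic are illustrative only.
struct DataServer {
  std::string name;
  int load;                     // running client sessions
  std::set<std::string> cache;  // files currently on disk
};

// (4) + (5) dynamic staging: copy the file from the MSS (HPSS) to disk.
void stageFromMSS(DataServer &s, const std::string &file) {
  std::cout << s.name << ": staging " << file << " from HPSS\n";
  s.cache.insert(file);
}

// (1) + (2) dynamic load balancing: the master redirects the client to
// the least-loaded data server (any server may stage any file).
DataServer &redirect(std::vector<DataServer> &farm, const std::string &file) {
  auto it = std::min_element(farm.begin(), farm.end(),
      [](const DataServer &a, const DataServer &b) { return a.load < b.load; });
  if (!it->cache.count(file)) stageFromMSS(*it, file);
  ++it->load;
  return *it;
}

int main() {
  std::vector<DataServer> farm = {{"srv1", 3, {}}, {"srv2", 1, {}}};
  DataServer &s = redirect(farm, "T1.root");
  // (6) random access: the client now reads objects directly from s.
  std::cout << "client reads T1.root from " << s.name << "\n";
}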

6 Dynamic staging
Average file size: 500 MB.
Average staging time: 120 s.
When the system was overloaded (before the dynamic load balancing era): 10-15 min delays (with only 200 jobs).
Up to 10k files staged from tape to disk cache per day (150k staging requests/month!).
Max of 4 TB from tape to disk cache per day.
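These figures are roughly self-consistent, as the back-of-the-envelope check below shows (it reuses the 20 MB/s drive rate from slide 3; the mount/seek overhead is inferred, not stated in the slides):

#include <cstdio>

int main() {
  // Figures from the slides; the overhead is derived, not measured.
  const double driveMBs = 20.0;   // one 9940 drive, MB/s
  const int    drives   = 20;
  const double fileMB   = 500.0;  // average file size
  const double stageSec = 120.0;  // average observed staging time

  // Pure transfer time for one file vs. the observed staging time:
  double xferSec = fileMB / driveMBs;  // 25 s
  std::printf("transfer %.0f s, overhead ~%.0f s (mount/seek/queue)\n",
              xferSec, stageSec - xferSec);

  // Aggregate tape bandwidth vs. the observed 4 TB/day peak:
  double aggTBday = driveMBs * drives * 86400 / 1e6;  // ~34.6 TB/day
  std::printf("theoretical drive limit ~%.1f TB/day vs. 4 TB/day observed\n",
              aggTBday);
}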

7 Dynamic load balancing
Up and running since December 2003 for Objectivity (previously, a file could only be staged on one given server).
 no more delayed jobs (even with 450 jobs in parallel).
 more efficient management of the disk cache (the entire disk space is seen as a single file system).
 fault tolerance in case of server crashes (see the sketch below).
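Continuing the toy model from the previous slide (still illustrative, not xrootd's real logic), fault tolerance follows naturally from the redirection step: a crashed server simply stops being a candidate, and the next request stages the file on another node:

#include <string>
#include <vector>

struct Server { std::string name; bool alive; int load; };

// Illustrative only: the master considers live servers and picks the
// least-loaded one; a dead node is skipped without any special handling.
const Server *pick(const std::vector<Server> &farm) {
  const Server *best = nullptr;
  for (const auto &s : farm)
    if (s.alive && (!best || s.load < best->load)) best = &s;
  return best;  // nullptr only if every server is down
}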

8 Pros …
Mass Storage System (MSS) usage is completely transparent to the end user.
No cache space management by the user.
Extremely fault tolerant (surviving server crashes and maintenance work).
Highly scalable + the entire disk space is used efficiently.
On the admin side: you can choose your favourite MSS and your favourite staging protocol (SLAC: pftp, Lyon: RFIO, …).

9 … and cons
The entire machinery relies on many different components (especially an MSS).
In case of very high demand on the client side  response times can become very slow.
But this also depends on:
– the number of data sets available.
– a good data structure.

10 Data structure: the fear factor
A high-performing data access model also depends on the data structure. Deep copies vs. "pointer" files (files containing only pointers to other files)?
Deep copies:
– duplicated data.
– OK in a "full disk" scenario.
– OK if used with an MSS.
"Pointer" files:
– no data duplication.
– OK in a "full disk" scenario.
– potentially very stressful on the MSS (VERY BAD).

11 What about other experiments?
Xrootd is well adapted to user jobs that use ROOT to analyze a large dataset.
It is being included in the official version of ROOT.
Already set up in Lyon and being used or tested by other groups: D0, EUSO and INDRA.
 transparent access to files stored in HPSS.
 no need to manage the disk space.
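From the user's point of view, adopting xrootd is typically just a change of URL when opening the file in a ROOT macro or at the ROOT prompt (both paths below are hypothetical); staging from HPSS happens behind the scenes:

// The same analysis code works in both cases; only the URL changes.
TFile *local  = TFile::Open("/data/babar/T1.root");                    // local disk
TFile *remote = TFile::Open("root://ccxroot.in2p3.fr//babar/T1.root"); // via xrootd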

12 Summary
Storage and data access is the main challenge.
A good disk/tape ratio is hard to determine: it depends on many factors (users, number of tape drives, etc.).
Xrootd provides many interesting features for remote data access.
 extremely robust (a great achievement for a distributed system).

