
1  "T1 status" — Hans Wenzel, Fermilab. PMG Meeting, Friday Dec. 13th 2002
- Near-term vision for data management and farm configuration management
- Implementing the new WBS
- Work performed in the last few weeks
- Near-term plans

2  What do we expect from dCache?
- Makes a multi-terabyte server farm look like one coherent and homogeneous storage system.
- Rate adaptation between the application and the tertiary storage resources.
- Optimized usage of expensive tape robot systems and drives through coordinated read and write requests. Use the dccp command instead of encp!
- No explicit staging is necessary to access the data (but prestaging is possible and in some cases desirable).
- The data access method is the same regardless of where the data resides.
- High-performance and fault-tolerant transport protocol between applications and data servers.
- Fault tolerant: no specialized servers that can cause severe downtime when they crash.
- Can be accessed directly from your application (e.g. the ROOT TDCacheFile class), as sketched below.
- Can be used as a scalable file store without an HSM.
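To illustrate the last two points, here is a minimal sketch of application-level access through ROOT's TDCacheFile class, which behaves like an ordinary TFile but talks to a dCache door over dcap. The door host, port, pnfs path, and histogram name are placeholders, not our actual configuration; on the command line the same file could simply be copied with dccp.

    // read_dcache.C -- minimal ROOT macro sketch: open a file in dCache via dcap
    // and read one histogram. Door host/port and pnfs path are hypothetical.
    #include "TDCacheFile.h"
    #include "TH1F.h"
    #include <iostream>

    void read_dcache()
    {
       // The dcap:// URL is served by a dCache door instead of local disk.
       TDCacheFile f("dcap://door.example.fnal.gov:22125//pnfs/fnal.gov/usr/cms/demo/events.root",
                     "READ");
       if (f.IsZombie()) {
          std::cerr << "could not open file in dCache" << std::endl;
          return;
       }
       TH1F *h = (TH1F*) f.Get("hEnergy");   // hypothetical histogram name
       if (h) std::cout << "entries: " << h->GetEntries() << std::endl;
       f.Close();
    }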

3  [Diagram: FNAL (dCache + ENSTORE), CERN (CASTOR), and Tier 2 sites with dCache (Florida, …), each with a catalog, linked by GridFTP]

4  Global data management with SRB — [diagram: each site (Site A, Site B, …, Site N) runs dCache with GridFTP and a local MCAT; a global catalog ties the sites together]

5  [figure-only slide]

6  Dynamic partitioning and configuration of the farm — [diagram: front-end node, farm nodes, network-attached disk, dCache, ENSTORE, web servers, database servers]

7  NPACI ROCKS — [diagram: descriptive information to configure a node is compiled into a kickstart file; compute node, IO server, and web server are appliances built from the collection of all possible software packages (a.k.a. the distribution, RPMs)]

8  Implementing the new WBS
- Use the new WBS to track our efforts.
- A consequence of the new WBS is closer cooperation between T1 and T2 sites.
- Established biweekly meetings to discuss R&D projects (1.1.4.1).
- Fermilab and SD collaborate on deploying and evaluating ROCKS. Goal: a standard ROCKS distribution for all US centers. (1.1.1.3.2.1, 1.1.1.3.2.2, 1.1.1.5.5.1)

9  Implementing the new WBS (II)
- SD and FNAL also collaborate on SRB and dCache (1.1.1.7.2.1.1): dCache combines the disks on the farm nodes, GridFTP serves as the transport mechanism, and SRB does the bookkeeping. dCache is deployed at SD. Will package dCache.
- Caltech: evaluation and optimization of the disk IO of the IDE-based disk servers; next step is to sustain 200 MB/sec over the WAN (1.1.1.2.2.1).
- Need even more collaboration next year.

10  What was happening at FCC
- Chris Brew (1/2) from SCS joined the CMS team in SCS (Joe Kaiser lead, Chris Brew, Merina Albert). ISD: Michael Zalokar, Jon Bakken (dCache), Igor Mandrichenko (FBSNG). CMS: Hans Wenzel, Michael Ernst, Natalia Ratnikova.
- 65 dual AMD Athlon 1900+ nodes are installed and commissioned. The acceptance period (30 days) finished right before Thanksgiving. (Many delays; the same vendor won the bids for the CDF, CMS, and D0 purchases.)
- Monitored and evaluated daily, swapped broken parts. Achieved 98.3% uptime (1.1.2.1.1.1).
- Upgraded the monitoring software (temperature and fan speeds) (1.1.2.1.5.1).
- We used the farm for HCAL test beam production.
- We used the farm to test the 3 TB Aztera disk system (1.1.1.1.6.1 benchmarking suite, 1.1.1.2.2.2). It needed constant tuning and upgrading; in the end we managed to double its performance.

11  What was happening at FCC (II)
- Used the farm in tests of the dCache system. Developed a ROOT-based benchmarking suite testing the entire data path from application to mass storage. (1.1.1.2.8.1)
- Used the farm to install and configure ROCKS. (1.1.1.3.2.2)
- 7 dCache Linux nodes: the hardware has been upgraded (SCSI system disk), and the system was upgraded to kernel 2.4.18 and the XFS file system. The system remained usable during the upgrades (1.1.2.1.2.1). The dCache software and configuration were upgraded.
- Installed web servers and an electronic logbook. (1.1.1.6.8.1)
- Work on the transition from RH 6 to RH 7.
- Missed milestone: test of the interactive farm-based analysis prototype (1.1.2.3.1). Slipped by 2-3 weeks.

12  The new farm and the dCache system

13  Linux dCache node: important points
- Kernel 2.4.18 to avoid memory management problems.
- XFS file system: we found it is the only one that scales and still delivers performance when the file system is full.
- Added a SCSI system disk.
- Need a server-specific Linux distribution!!!
- Need to tweak many parameters to achieve optimum performance -> feed back to vendors.
- Next generation: Xeon based, PCI-X bus, large-capacity disks, dual system disk (RAID 1).

14  First results with the dCache system
These tests were done before the hardware and configuration upgrade. The average file size is ~1 GByte and the reads are equally distributed over all read pools. Reads with dccp from the popcrn nodes into /dev/null:

    # of concurrent reads (40 farm nodes)    Aggregate input speed (sustained over hours)
    70                                       108 MByte/sec
    60                                       104 MByte/sec
    5                                        42.5 MByte/sec

READS: 2.7 MB/sec per process
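The read test simply streams files to /dev/null with dccp; the same single-stream measurement can also be coded directly against the dcap client library. A minimal sketch, assuming the standard libdcap calls dc_open/dc_read/dc_close and a hypothetical door host, port, and pnfs path (link with -ldcap):

    // dcap_read_bench.cpp -- sketch of one read stream, like "dccp <file> /dev/null",
    // using the dcap client library directly and reporting the sustained rate.
    #include <dcap.h>
    #include <sys/types.h>
    #include <fcntl.h>
    #include <ctime>
    #include <cstdio>

    int main()
    {
       const char *url =
          "dcap://door.example.fnal.gov:22125//pnfs/fnal.gov/usr/cms/demo/events.root";

       int fd = dc_open(url, O_RDONLY);        // open through the dCache door
       if (fd < 0) { std::fprintf(stderr, "dc_open failed\n"); return 1; }

       char buf[1 << 20];                      // 1 MB read buffer
       double total = 0.0;
       std::time_t start = std::time(0);

       ssize_t n;
       while ((n = dc_read(fd, buf, sizeof(buf))) > 0)
          total += n;                          // discard the data, count the bytes

       dc_close(fd);

       double secs = std::difftime(std::time(0), start);
       if (secs > 0)
          std::printf("read %.1f MB at %.1f MB/s\n", total / 1e6, total / 1e6 / secs);
       return n < 0 ? 1 : 0;
    }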

15  First results with the dCache system (II)
The following was done with 2 write pools, using a ROOT application that utilizes TDCacheFile to write an event tree into dCache. Only three farm nodes were available, so we are network-limited.

    # of concurrent writes (3 farm nodes)    Aggregate output speed (sustained over hours)
    6                                        29.3 MByte/sec

WRITES: 5.2 MB/sec per process
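For context on the write path, a minimal sketch of the kind of single-process ROOT writer used here: one job fills a toy event tree through TDCacheFile and reports its own output rate. The door URL, pnfs path, and branch layout are placeholders, not the actual benchmarking suite.

    // write_bench.C -- sketch of the write-side test: fill a simple event tree
    // through TDCacheFile and report the sustained output rate of this process.
    #include "TDCacheFile.h"
    #include "TTree.h"
    #include "TRandom.h"
    #include "TStopwatch.h"
    #include <cstdio>

    void write_bench(Long64_t nevents = 100000)
    {
       TDCacheFile f("dcap://door.example.fnal.gov:22125//pnfs/fnal.gov/usr/cms/demo/bench.root",
                     "RECREATE");
       if (f.IsZombie()) { std::printf("cannot open dCache file for writing\n"); return; }

       // A toy event: a few floats standing in for real reconstructed quantities.
       Float_t e, px, py, pz;
       TTree t("Events", "toy event tree");
       t.Branch("e",  &e,  "e/F");
       t.Branch("px", &px, "px/F");
       t.Branch("py", &py, "py/F");
       t.Branch("pz", &pz, "pz/F");

       TStopwatch sw;
       for (Long64_t i = 0; i < nevents; ++i) {
          e  = gRandom->Exp(50.0);
          px = gRandom->Gaus(0, 10);
          py = gRandom->Gaus(0, 10);
          pz = gRandom->Gaus(0, 10);
          t.Fill();
       }
       t.Write();
       Double_t mb = f.GetBytesWritten() / 1.0e6;   // bytes pushed to the door so far
       f.Close();

       sw.Stop();
       if (sw.RealTime() > 0)
          std::printf("wrote %.1f MB in %.1f s -> %.2f MB/s\n",
                      mb, sw.RealTime(), mb / sw.RealTime());
    }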

16  [photo] 1800 pounds

17  Aztera results [figure]

18  Last week's power outage (>1.5 days)
- Made good use of the outage by combining several maintenance projects (we knew it was coming sometime).
- Completed the remote power automation project, which required complete rewiring and re-racking of the nodes.
- Addressed safety issues: safer power distribution.
- Better connectivity between CMS computing and the central switch.
- Installation and configuration of the private network for ROCKS.
- Still causes delays, things break, ...
- Many tasks were not listed in the WBS (need to review it more often).

19  Plans for the near-term future
- Continue evaluating disk systems (Zambeel, dCache, dfarm, Panasas, Exanet, ...). Procure a network-attached system at the beginning of next year? But a lot is happening on the market.
- Upgrade the dCache system to follow the technology (serial IDE, PCI-X, Xeon, memory bandwidth, ...).
- Configure and deploy farm-based interactive/batch user computing.
- Continue evaluating ROCKS, SRB, ...; need more manpower for this.
- Follow the marching orders given in the new WBS; refine the WBS as we go along.

