Slide 1: Mass Storage For BaBar at SLAC
Andrew Hanushevsky, Stanford Linear Accelerator Center
HEPiX, October 6-8, 1999
http://www.slac.stanford.edu/~abh/HEPiX99-MSS/
Produced under contract DE-AC03-76SF00515 between Stanford University and the Department of Energy
Slide 2: BaBar & The B-Factory
- Use big-bang energies to create B meson particles
  - Look at collision decay products
  - Answer the question "where did all the anti-matter go?"
- 800 physicists collaborating from >80 sites in 10 countries
  - USA, Canada, China, France, Germany, Italy, Norway, Russia, UK, Taiwan
- Data reconstruction & analysis requires lots of CPU power
  - Need over 250 Ultra 5s just to find particle tracks in the data
- The experiment also produces large quantities of data
  - 200-400 TB/year for 10 years
  - Data stored as objects using Objectivity
  - Backed up offline on tape in HPSS
  - Distributed to regional labs across the world
Slide 3: HPSS Milestones
- Production HPSS 4.1 deployed in May 1999
  - B-Factory data taking begins
  - Solaris Mover is working
- To date, ~12 TB of data stored
  - Over 10,000 files written
- STK 9840 tapes used exclusively
  - Over 300 tapes written
Slide 4: HPSS Core Server
- RS6000/F50 running AIX 4.2.1
  - 4 CPUs
  - 1 GB RAM
  - 12 x 9 GB disks for Encina/SFS, etc.
- Tape-only storage hierarchy
  - pftp is the only way data is accessed (see the sketch below)
- One problem with BFS
  - Symptom: pftp_client file open failures
  - Two circumventions added to BFS
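Because the hierarchy is tape-only, clients never see a filesystem view of HPSS; every transfer goes through pftp, which speaks the FTP protocol with parallel-transfer extensions. The following is a minimal sketch of that access pattern using plain ftplib; the host, port, credentials, and paths are placeholders, not the SLAC configuration, and the real setup uses pftp_client for parallel transfers rather than a plain FTP session.

```python
# Hedged sketch: moving a file in and out of an HPSS pftp endpoint.
# pftp is FTP-based, so the standard ftplib is enough to show the shape
# of a session. Host, port, user, and paths are hypothetical.
from ftplib import FTP

HPSS_HOST = "hpss.example.edu"   # hypothetical pftp daemon host
HPSS_PORT = 4021                 # hypothetical pftp control port

def put_file(local_path: str, hpss_path: str) -> None:
    """Store a local file into HPSS over the FTP-style control/data channels."""
    ftp = FTP()
    ftp.connect(HPSS_HOST, HPSS_PORT)
    ftp.login("babar", "secret")             # placeholder credentials
    with open(local_path, "rb") as f:
        ftp.storbinary(f"STOR {hpss_path}", f)
    ftp.quit()

def get_file(hpss_path: str, local_path: str) -> None:
    """Retrieve a file from HPSS back to local disk."""
    ftp = FTP()
    ftp.connect(HPSS_HOST, HPSS_PORT)
    ftp.login("babar", "secret")
    with open(local_path, "wb") as f:
        ftp.retrbinary(f"RETR {hpss_path}", f.write)
    ftp.quit()

if __name__ == "__main__":
    put_file("/scratch/run123.db", "/hpss/babar/run123.db")
    get_file("/hpss/babar/run123.db", "/scratch/run123.copy.db")
```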
Slide 5: Solaris Tape Movers
- SLAC port of the mover, based on HPSS version 4.1
- Solaris machine configuration
  - Ultra 250 with 2 CPUs, 512 MB RAM, Gigabit Ethernet
  - Solaris 2.6, DCE 2.0, Encina TX4.2
  - Three 9840 tape drives, each on a separate Ultra SCSI bus
- Observed peak load
  - CPU 60% busy
  - Aggregate I/O 26 MB/sec
Slide 6: Solaris Disk Movers
- The disk movers do not use the HPSS disk cache
  - Performance & reliability reasons:
    - HPSS latency is too high for small block transfers
    - Disk cache maintenance is rather complex (a purge-policy sketch follows this slide)
- Solaris machine configuration
  - E4500 & Ultra 450: 4 CPUs, 1 GB RAM, Gigabit Ethernet
    - A3500s, RAID-5, 5-way striped, 2 controllers, 500 GB to 1 TB
  - Ultra 250: 2 CPUs, 512 MB RAM, Gigabit Ethernet
    - A1000s, RAID-5, 5-way striped, 2 controllers, 100 to 200 GB
  - Solaris 2.6, DCE 2.0, Encina TX4.2 (DCE/Encina not necessary)
- Observed peak load
  - CPU 65% busy
  - Aggregate I/O 10 MB/sec (no migration or staging at the time)
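Since the disk pools sit outside HPSS, SLAC's own daemons have to keep them from filling up. As a rough illustration of what "disk cache maintenance" involves, here is a minimal purge pass that evicts least-recently-accessed files between a high and a low watermark; the pool path and thresholds are invented, and this is not the actual SLAC purge daemon.

```python
# Hedged sketch of a purge pass over a staging disk pool: when the pool
# fills past a high-water mark, evict the least recently accessed files
# until usage falls below a low-water mark. Paths and thresholds are
# illustrative only.
import os
import shutil

POOL = "/var/staging/pool"      # hypothetical disk pool mount point
HIGH_WATER = 0.90               # start purging above 90% full
LOW_WATER = 0.75                # stop purging below 75% full

def pool_usage() -> float:
    """Fraction of the pool filesystem currently in use."""
    total, used, _free = shutil.disk_usage(POOL)
    return used / total

def purge_pass() -> None:
    if pool_usage() < HIGH_WATER:
        return
    # Oldest access time first -- a simple LRU approximation.
    candidates = sorted(
        (os.path.join(root, name)
         for root, _dirs, names in os.walk(POOL) for name in names),
        key=lambda p: os.stat(p).st_atime,
    )
    for path in candidates:
        if pool_usage() < LOW_WATER:
            break
        os.remove(path)         # the file still has its copy on tape in HPSS

if __name__ == "__main__":
    purge_pass()
```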
Slide 7: Mass Storage Architecture
[Architecture diagram. Components: Disk Server (Solaris) running AMS (Unix fs I/O), with a Gateway daemon handling gateway requests, a Disk Pool, a Staging Manager, and Prestage, Migration, and Purge daemons; file & catalog management; the HPSS Server with an HPSS Mover and Tape Robot; PFTP control and data connections between the disk server and HPSS.]
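The division of labor in the diagram (the gateway serves from the disk pool, the staging manager fills misses from HPSS, and migration copies new files back to tape so the purge daemon may evict them) can be outlined as follows. This is a hedged sketch of the control flow only; the directory names and functions are invented, and the HPSS side is stood in for by a plain directory where the real system uses pftp transfers.

```python
# Hedged sketch of the request flow implied by the architecture diagram.
# The HPSS namespace is represented by a local directory for the sake of
# a runnable example; in the real system both directions go over pftp.
import os
import shutil

DISK_POOL = "/pool"            # hypothetical staging disk pool
HPSS_STORE = "/hpss_standin"   # stand-in for the HPSS namespace

def stage_in(filename: str) -> str:
    """Staging manager: bring a file into the pool if it is not already there."""
    local = os.path.join(DISK_POOL, filename)
    if not os.path.exists(local):                      # cache miss
        shutil.copy(os.path.join(HPSS_STORE, filename), local)
    return local

def gateway_open(filename: str) -> str:
    """Gateway daemon: hand AMS a pool path, staging the file on demand."""
    return stage_in(filename)

def migration_pass(new_files: list[str]) -> None:
    """Migration daemon: copy newly written pool files back to HPSS so the
    purge daemon is later free to evict them from the pool."""
    for filename in new_files:
        shutil.copy(os.path.join(DISK_POOL, filename),
                    os.path.join(HPSS_STORE, filename))
```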
Slide 8: HPSS Configuration
[Figure only; no text captured.]
Slide 9: SLAC Detailed Hardware Configuration
[Figure only; no text captured.]
Slide 10: HPSS Total Space Used
[Chart only; no text captured.]
Slide 11: HPSS Total Files
[Chart only; no text captured.]
Slide 12: Summary
- HPSS is very stable
  - The mass storage architecture has proven to be highly flexible
- The Solaris mover is a success
- The 9840 is working well for a new technology
- Software upgrades will be a problem
- Disk space is always an issue
  - Will be adding 1 TB/month for the next year (a total of about 25 TB)
- Tape drive contention is a concern
  - Will be adding 12 more drives this year (for a total of 24)