Presentation is loading. Please wait.

Presentation is loading. Please wait.

Storage issues for end-user analysis Bernd Panzer-Steindel, CERN/IT 08 July 2008 1.

Similar presentations


Presentation on theme: "Storage issues for end-user analysis Bernd Panzer-Steindel, CERN/IT 08 July 2008 1."— Presentation transcript:

1 Storage issues for end-user analysis Bernd Panzer-Steindel, CERN/IT 08 July 2008 1

2 F Batch and Interactive nodes G End-User Analysis Analysis E B A B D C MC and Data Import/Export MC and Data Import/Export Tape Read Tape Migration Online DAQ Data Export file merge file copy file merge file copy T0 Calibration and Alignment T2/T3 Analysis Production Data Disk Pools Scratch file copy Reprocessing Tape Read Reprocessing Tape Read T1/T2/T3 High level Data Flow 08 July 20082Bernd Panzer-Steindel, CERN/IT

3 Tentative ‘requirements’ for end-user analysis storage  Storage capacity of 1-2 TB per user (  assume 1.5 TB) (Ntuple, data samples, logfiles, ….)  Reliable storage, ‘server-mirroring’ 99.9% availability  4 times per year unavailable for 4 h each  No tape access  too many small files  Some backup possibility ( archive ? ) Backup  5% changes per day (75 GB/d) + 4 month retention time = 9 TB backup space per user  Quota system  Easy accessibility from batch and interactive worker nodes and the notebook  POSIX access type  distributed file system  World-wide access  High file access read/write performance  User identity and security  ……………………………. 08 July 20083Bernd Panzer-Steindel, CERN/IT

4 Estimated 500 users at CERN  750 TB of analysis disk storage + backup and archive? 1.5 TB USB disk Amazon Cloud Storage TSM Backup HSM Backup, Archive AFS Isilon, BlueArc, Exanet, DataDirect NFS4, Lustre, xrootd, Cost and Technology Scenarios Distributed File System implementations Hardware investments over 2 years ‘guestimates’ 7 MCHF 0.4 MCHF 3 MCHF 2.5 MCHF 1 MCHF 2 MCHF 4.5 MCHF HSM, disk-only 2 MCHF 08 July 20084Bernd Panzer-Steindel, CERN/IT + software operation, support, functionality,..

5 Questions  Where are these ‘extra’ resources coming from ?  Is there only one unique storage per user world-wide ?  What about users working on different sites ?  Do they have multiple end-user storage instances ?  How is data transferred between instances ?  The difference between the ‘home-directory’ storage and end-user analysis space is small.  Analysis tools/programs and the data itself must be accessed at the same time.  Who decides which user gets how much space where ?  Experiment specific policies  What is the data flow model ?  Notebook disk + site local file system + global file system  Notebook disk + site local scratch + cloud storage  Global file system only  ….. More combinations……….  Notebook issues  OS support, virtual analysis infrastructure, network connectivity = data ‘gas station’  ……many more questions…………. 08 July 20085Bernd Panzer-Steindel, CERN/IT

6 Is there some common interest to solve this problem ? Need/interest for the creation of a working group to investigate in more detail? Experiments, Sites ? 08 July 20086Bernd Panzer-Steindel, CERN/IT


Download ppt "Storage issues for end-user analysis Bernd Panzer-Steindel, CERN/IT 08 July 2008 1."

Similar presentations


Ads by Google