Experiment Support, CERN IT Department, CH-1211 Geneva 23, Switzerland, www.cern.ch/it, DBES. P. Saiz: The future of AliEn.


1 The future of AliEn (P. Saiz)

2 CERN IT Department, CH-1211 Geneva 23, Switzerland, www.cern.ch/it, ES. 27 Mar 2013, Pablo Saiz, ALICE offline week
Table of contents
Current status
Ongoing work
Future plans
Summary

3 AliEn
File Catalogue
  – LFN to PFN mapping
  – Metadata
  – 700 M entries
TaskQueue
  – Job execution model
  – Package management
  – 50K concurrent jobs
File transfers
Used by ALICE and PANDA

4 AliEn versions
v2-19 **: Current version of ALICE
  – With plenty of patches
v2-20: Current version of PANDA
  – JSON, removal of PackMan, Catalogue layout
v2-21: Development release
  – GUIDless catalogue
After a release has been adopted, database changes go to the new release

5 AliEn improvements not yet used by ALICE
Catalogue structure
  – InnoDB tables, foreign keys, numeric id
  – 2-day downtime, or creating a 1-week hybrid version
Removal of PackMan service
  – Clients can handle package installation by themselves
JSON communication
  – Backward incompatible. Full redeployment
File popularity
  – Requires changes in the Central Services

6 Current work
File Catalogue
jAliEn
Popularity
Classads
Trust Model
Priority
Price
AliEn/PoD
VO to VO

7 File Catalogue
Investigate File Catalogue using file system
  – Using all features from a real file system: user, quotas, ...
Prototype of AliEn creating entries on FS:
  – 700M entries in the ALICE catalogue
  – Ext4 not up to the challenge → reiserfs
  – One entry per file → one entry per directory
Locking, simultaneous clients, booking entries, backups
  – Prototype was discontinued
File catalogue without GUID
  – See Miguel’s presentation

8 jAliEn
Already used in production:
  – Managing productions
  – Data transfers
  – Data cleanup
Server part for the web interface
Need to:
  – Improve the ROOT plugin
  – Integrate into FITS

9 Other improvements
TaskQueue improvements:
  – Store diffs between original and final JDL
  – Remove Classad library
  – Retry mechanism
Separation of price and priority
  – Priority: select user
  – Price: sort among the jobs of the same user
More worker node platforms: SLC6, Fedora
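The price/priority split on slide 9 can be read as a two-step selection: priority picks which user is served next, and price orders the jobs of that user. The following is a minimal sketch under that reading. The `Job` fields, the `pick_next_job` name, and the "higher value wins" conventions are all hypothetical and are not taken from the AliEn TaskQueue code.

```python
from dataclasses import dataclass


@dataclass
class Job:
    job_id: int
    user: str
    price: float  # assumption: a higher price runs earlier among the user's own jobs


def pick_next_job(jobs, user_priority):
    """Pick the waiting user with the highest priority, then that user's highest-price job.

    `user_priority` maps user name -> priority; unknown users default to 0.0.
    Returns None when no job is waiting.
    """
    if not jobs:
        return None
    waiting_users = sorted({job.user for job in jobs})
    # Priority: select the user (ties go to the alphabetically first user).
    user = max(waiting_users, key=lambda u: user_priority.get(u, 0.0))
    # Price: sort among the jobs of the same user (ties go to the lowest job id).
    candidates = [job for job in jobs if job.user == user]
    return max(candidates, key=lambda j: (j.price, -j.job_id))
```

With this split, a user cannot jump ahead of other users by raising the price of a job; the price only reorders that user's own queue.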

10 File Popularity
Developed by A. Abramyan and N. Manukyan
Requires patches in central services v2-19
Frequency of file access:
  – Including errors
  – File types
Identify:
  – In-demand files → increase replicas
  – Other files → decrease replicas
  – Broken files
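The classification described on slide 10 could look roughly like the sketch below: count accesses per file (including failed ones), then sort files into in-demand, other, and broken. The function name, the `(lfn, ok)` log format, and both thresholds are assumptions made for illustration; the slide does not say how the actual popularity service decides.

```python
from collections import Counter


def classify_files(access_log, hot_threshold=3, broken_error_ratio=0.5):
    """Classify files from an access log of (lfn, ok) tuples.

    `ok` is False for a failed access. Thresholds are illustrative only.
    """
    total = Counter()
    errors = Counter()
    for lfn, ok in access_log:
        total[lfn] += 1
        if not ok:
            errors[lfn] += 1

    actions = {}
    for lfn, count in total.items():
        if errors[lfn] / count >= broken_error_ratio:
            actions[lfn] = "broken"  # mostly failing accesses
        elif count >= hot_threshold:
            actions[lfn] = "increase replicas"  # in-demand file
        else:
            actions[lfn] = "decrease replicas"  # rarely accessed file
    return actions
```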

11 Other contributions
AliEn trust model
  – Define service/user trust, and schedule jobs/storage accordingly
  – Sergio Guinez, TALCA
AliEn/PoD integration
  – Interactive analysis on the grid
  – Cinzia Luzzi
VO to VO submission
  – Submit jobs from one VO to another, output visible in both
  – PANDA colleagues

12 PANDA GRID/AliEn developers
Link

13 Future work
Testing Framework
Job Brokering
User credentials
Scaling up

14 Testing framework
Create environment to test new approaches
Up to now:
  – BITS & FITS (functionality tests)
  – PANDA (becoming a mature GRID)
  – Development VO: ALICE_TEST
    Set up and running for one year
    Used for some train analyses
    Users have different priorities

15 Development environment I
[Diagram: two environments, each with FC, TQ, SE and CE components]

16 Environment I
One-way catalogue synchronization
  – Take snapshot of catalogue
Duplicate small percentage of jobs
  – 5-10% of TQ
Jobs get executed twice
  – Easy to check output
  – Duplication of work
  – Setting new SE that will be erased
Test of the full-scale catalogue
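The job duplication idea on slide 16 can be sketched as follows: pick a stable subset of job IDs to run a second time in the development environment, then compare the outputs of both runs. The function names, the hash-based selection, and the checksum dictionaries are assumptions for illustration; the slide does not specify how jobs would be selected or compared.

```python
import hashlib


def should_duplicate(job_id, percent=5):
    """Deterministically select roughly `percent`% of job IDs for a second execution.

    Hashing the ID (instead of drawing a random number) means the same job
    always gets the same answer, which keeps the selection reproducible.
    """
    digest = hashlib.sha256(str(job_id).encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def outputs_match(production_checksums, test_checksums):
    """Compare {output file name: checksum} of the production run and the duplicate run."""
    return production_checksums == test_checksums
```

A comparison like this only works for jobs whose output is reproducible; jobs that embed timestamps or random seeds in their output would need a looser check.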

17 Development Environment II
[Diagram: one environment with FC, TQ, SE and CE components; a second with FC, TQ and CE components only]

18 Using VO to VO submission
  – Once the plugin becomes available…
New VO with only CE
  – Easier to set up
  – Using same SE as ALICE
If jobs fail, reschedule them
Does not test the full catalogue

19 Alternative Job Brokering
Two-level broker:
  – Broker dispatches batches of jobs to CM
  – CM distributes among worker nodes
  – Bigger dependency on vobox
  – Reduce load on central services
New job optimizer:
  – Groups jobs together
    Ideally, with the same input
  – Send group to the JobAgent
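The "groups jobs together, ideally with the same input" step of the proposed job optimizer on slide 19 might look like the sketch below. The function name, the `(job_id, input_files)` representation, and the `max_group_size` cap are hypothetical; the slide only states the goal, not the mechanism.

```python
from collections import defaultdict


def group_jobs_by_input(jobs, max_group_size=10):
    """Group job IDs that read exactly the same set of input files.

    `jobs` is an iterable of (job_id, input_files). Each returned group holds
    at most `max_group_size` job IDs and could be sent to one JobAgent.
    """
    by_input = defaultdict(list)
    for job_id, input_files in jobs:
        # frozenset makes the input list hashable and order-independent.
        by_input[frozenset(input_files)].append(job_id)

    groups = []
    for job_ids in by_input.values():
        for start in range(0, len(job_ids), max_group_size):
            groups.append(job_ids[start:start + max_group_size])
    return groups
```

Grouping on identical input sets is the simplest choice; grouping on overlapping inputs would catch more cases but needs a similarity measure that the slides do not describe.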

20 User credentials
gLExec
Propagate user credentials to worker node
Sign JDL and changes
  – Traceability
As already presented by S. Schreiner:
http://indico.cern.ch/getFile.py/access?contribId=58&sessionId=9&resId=0&materialId=slides&confId=111325

21 Factor 1000 scale up…
Number of sites: 80 → 80,000
  – SETI, BOINC, …
    Opportunistic sites (without vobox)
Number of nodes: 50K jobs → 50M jobs
  – Amazon has 0.5M servers [1]
    Decentralized job brokering
Amount of information: 30 PB → 30 EB
  – One tenth of the world’s info! [2]
    I/O bottleneck
Number of files: 700M → 700B
  – Default ext4, max 4B
[1] http://www.zdnet.com/blog/open-source/amazon-ec2-cloud-is-made-up-of-almost-half-a-million-linux-servers/10620
[2] http://www.businessinsider.com/amount-of-information-in-the-world-2011-2?op=1
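The arithmetic behind slide 21 can be checked with a few lines. This is a sketch only: the helper names are made up, and it assumes one catalogue entry costs one inode (as in the one-entry-per-file prototype) and that the "max 4B" figure refers to ext4's 32-bit inode numbers, i.e. 2**32 inodes per file system.

```python
EXT4_MAX_INODES = 2**32  # 32-bit inode numbers: about 4.29 billion per file system


def scale(current, factor=1000):
    """Multiply every quantity in `current` by the scale-up factor."""
    return {name: value * factor for name, value in current.items()}


def min_filesystems(num_files, max_inodes=EXT4_MAX_INODES):
    """Smallest number of file systems needed if every file takes one inode."""
    return -(-num_files // max_inodes)  # integer ceiling division


current = {"sites": 80, "concurrent_jobs": 50_000, "files": 700_000_000}
target = scale(current)
print(target)
print(min_filesystems(current["files"]), min_filesystems(target["files"]))
```

Under these assumptions today's 700M entries fit in a single ext4 file system, while 700B entries would have to be spread over at least 163 of them.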

22 Factor 1:1000 scale up
It will require quite some tuning…
Luckily, factor 10 is not even questioned
  – And that’s more than enough for the expected increase in resources

23 After more than 13 years…

24 Summary
AliEn can handle current load
  – 80 sites, 50K concurrent jobs, 700 M files
An increase of a factor of 10 should be easy
Plenty of areas for research/improvement
  – Catalogue
  – Job distribution
  – jAliEn
AliEn needs a new project leader
  – Thank you for the last 13 years!

