Florida Tech Grid Cluster P. Ford 2 * X. Fave 1 * M. Hohlmann 1 High Energy Physics Group 1 Department of Physics and Space Sciences 2 Department of Electrical & Computer Engineering
History 2004 - Original conception, funded by an FIT ACITC grant. 2007 - Received over 30 more low-end systems from UF; basic cluster software operational. 2008 - Purchased high-end servers and designed the new cluster; established the cluster on the Open Science Grid. 2009 - Upgraded and added systems; registered as a CMS Tier 3 site.
Current Status OS: Rocks V (CentOS 5.0). Job Manager: Condor 7.2.0. Grid Middleware: OSG 1.2, Berkeley Storage Manager (BeStMan) 18.104.22.168.i7.p3, Physics Experiment Data Export (PhEDEx) 3.2.0. Contributed over 400,000 wall-clock hours to the CMS experiment; over 1.3M wall-clock hours total. Fully compliant on OSG Resource Service Validation (RSV) and CMS Site Availability Monitoring (SAM) tests.
System Architecture [Diagram: Compute Element (CE), Storage Element (SE), NAS node nas-0-0, and compute nodes compute-1-X / compute-2-X.]
Rocks OS Comprehensive software package for clusters (e.g. 411, dev tools, Apache, autofs, Ganglia). Allows customization through "Rolls" and appliances. Configuration is stored in MySQL. Customizable appliances auto-install nodes and run post-install scripts.
Storage Set up XFS on the NAS partition, mounted on all machines. The NAS stores all user and grid data and streams it over NFS. The Storage Element acts as the gateway for Grid storage on the NAS array.
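As a sketch, such an NFS mount might be declared in /etc/fstab on each node roughly as follows (the export path, mount point, and options here are illustrative assumptions, not the site's actual configuration):

```
# Hypothetical /etc/fstab entry mounting the NAS export over NFS.
# "nas-0-0" matches the node name above; the export path is assumed.
# rsize/wsize of 64 kB match the NFS block size discussed later.
nas-0-0:/export/data  /mnt/nas  nfs  rw,hard,intr,rsize=65536,wsize=65536  0 0
```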
Condor Batch Job Manager Batch job system that distributes workflow jobs to compute nodes - distributed computing, NOT parallel. Users submit jobs to a queue and the system finds places to run them. Great for Grid computing; the most widely used batch system in OSG/CMS. Supports "Universes": Vanilla, Standard, Grid...
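A minimal vanilla-universe submit description file looks roughly like this (the executable, argument, and file names are hypothetical):

```
# job.sub - minimal Condor vanilla-universe job (illustrative names)
universe   = vanilla
executable = analyze.sh
arguments  = run42
output     = job.$(Cluster).out
error      = job.$(Cluster).err
log        = job.log
queue 1
```

The job is submitted with `condor_submit job.sub`; Condor then queues it and matches it to an idle execute node.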
Personal Condor / Central Manager Master: manages all other daemons. Negotiator: "matchmaker" between idle jobs and pool nodes. Collector: directory service for all daemons; daemons send periodic ClassAd updates. Startd: runs on each "execute" node. Schedd: runs on a "submit" host and creates a "shadow" process on that host; allows manipulation of the job queue.
Condor Priority User priority is managed by a half-life decay algorithm with configurable parameters. The system does not kick off running jobs; a resource claim is freed as soon as the job finishes. This enforces fair use AND allows vanilla jobs to finish. Optimized for Grid computing.
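The half-life idea can be sketched as exponential decay of a user's accumulated usage: with half-life parameter $h$ (configurable in Condor), recorded usage $u$ decays between updates as

```latex
u(t + \Delta t) = u(t) \cdot 2^{-\Delta t / h}
```

so a user who stays idle for one half-life sees their accumulated usage, and hence their priority penalty, cut in half. This is a simplified sketch of the idea, not Condor's exact bookkeeping.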
OSG Middleware OSG middleware is installed and updated via the Virtual Data Toolkit (VDT). Site configuration was complex before the 1.0 release; it is simpler now. Provides the Globus framework and security via a Certificate Authority. Low maintenance: Resource Service Validation (RSV) provides a snapshot of the site. The Grid User Management System (GUMS) handles mapping of grid certificates to local users.
BeStMan Storage Berkeley Storage Manager: the SE runs the basic gateway configuration - a short config, but hard to get working. Still, not nearly as difficult as dCache; BeStMan is a good replacement for small-to-medium sites. Allows grid users to transfer data to and from designated storage via an SRM URL, e.g. srm://uscms1-se.fltech-grid3.fit.edu:8443/srm/v2/server?SFN=/bestman/BeStMan/cms...
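A transfer against such an endpoint might be issued with an SRM client; sketched here with srmcp (the local path and SFN are illustrative, and the exact client tool and flags vary by installation and require a valid grid proxy):

```shell
# Hypothetical transfer of a local file into BeStMan-managed storage.
# "-2" requests the SRM v2 protocol; the destination SFN is assumed.
srmcp -2 file:///home/user/data.root \
  "srm://uscms1-se.fltech-grid3.fit.edu:8443/srm/v2/server?SFN=/bestman/BeStMan/cms/user/data.root"
```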
WLCG The Large Hadron Collider is expected to produce 15 PB/year; the Compact Muon Solenoid (CMS) detector will account for a large part of this. The Worldwide LHC Computing Grid (WLCG) handles the data and interfaces with sites in OSG, EGEE (European), etc. Tier 0 - CERN; Tier 1 - Fermilab; closest Tier 2 - UFlorida; Tier 3 - us! Not officially part of the CMS computing group (i.e. no funding), but very important for dataset storage and analysis.
T2/T3 sites in the US [Map of US Tier 2 and Tier 3 sites; see https://cmsweb.cern.ch/sitedb/sitelist/]
Local Usage Trends Over 400,000 cumulative hours for CMS. Over 900,000 cumulative hours by local users. Total of 1.3 million CPU hours utilized.
Tier-3 Sites Not yet completely defined. Consensus: T3 sites give scientists a framework for collaboration (via transfer of datasets) and also provide compute resources. Regular testing by RSV and Site Availability Monitoring (SAM) tests, plus OSG site-info publishing to CMS. FIT is one of the largest Tier 3 sites.
PhEDEx Physics Experiment Data Export: the final milestone for our site. Physics datasets can be downloaded from, or exported to, other sites. All relevant datasets are catalogued in the CMS Data Bookkeeping System (DBS), which keeps track of dataset locations on the grid. A central web interface handles dataset copy/deletion requests.
CMS Remote Analysis Builder (CRAB) Universal method for processing experimental data. Automates the analysis workflow, i.e. status tracking and resubmissions. Datasets can be exported to the Data Discovery Page. Used extensively locally in our muon tomography simulations.
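A CRAB task is driven by a configuration file; a minimal sketch might look roughly like the following (the dataset path, CMSSW parameter-set name, and job counts are hypothetical, and exact section/key names depend on the CRAB version deployed):

```
[CRAB]
jobtype   = cmssw
scheduler = condor

[CMSSW]
# Hypothetical dataset and CMSSW config for a muon tomography run:
datasetpath            = /MuonTomography/Sim/RECO
pset                   = analysis_cfg.py
total_number_of_events = 10000
events_per_job         = 1000

[USER]
return_data = 1
```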
Network Performance Changed to a default 64 kB block size across NFS. Changed the RAID array to fix write-caching. Increased kernel memory allocation for TCP. Result: improvements in both network and grid transfer rates. dd copy tests across the network: reads improved from 2.24 to 2.26 GB/s, and writes from 7.56 to 81.78 MB/s.
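The dd tests above can be reproduced with a simple write-then-read pass. This sketch uses a small 16 MB file in /tmp so it runs anywhere; the site's actual tests streamed roughly 1 GB across the NFS mount, and the file name and size here are illustrative:

```shell
#!/bin/sh
# Write test: stream zeros with a 64 kB block size and flush to disk;
# dd reports the elapsed time and throughput on stderr.
dd if=/dev/zero of=/tmp/ddtest.bin bs=64k count=256 conv=fsync
# Read test: stream the file back. Note: the page cache inflates this
# number unless caches are dropped first (as root):
#   echo 3 > /proc/sys/vm/drop_caches
dd if=/tmp/ddtest.bin of=/dev/null bs=64k
# Clean up the test file.
rm -f /tmp/ddtest.bin
```

Scaling `count` up (e.g. to 16384 for 1 GB) and pointing the output path at the NFS mount approximates the measurement conditions described above.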
DD on the Frontend Before (64 kB blocks, ~1 GB file):
WRITE (s)   WRITE (MB/s)   READ (s)   READ (GB/s)
102.9       10.4           0.45       2.4
94.5        11.4           0.49       2.2
(row illegible in source)
244.8       4.40           0.49       2.2
135         7.9            0.49       2.2
avg: 173.15  7.56          0.482      2.24

DD on the Frontend After (64 kB blocks, ~1 GB file):
WRITE (s)   WRITE (MB/s)   READ (s)   READ (GB/s)
12.7        84.4           0.42       2.5
(row illegible in source)
(row illegible in source)
14          76.5           0.53       2.3
12.6        84.8           0.46       2.3
avg: 13.1    81.78         0.468      2.26

Iperf on the Frontend Before:
TCP:S (Mbits/s)   TCP:C (Mbits/s)   UDP jitter (ms)   lost   UDP (Mbits/s)
753               754               0.11              0      1.05
912               913               0.022             0      1.05
896               897               0.034             0      1.05
891               892               0.393             0      1.05
888               889               1.751             0      1.05
avg: 868          869               0.462             0      1.05

Iperf on the Frontend After:
TCP:S (Mbits/s)   TCP:C (Mbits/s)   UDP jitter (ms)   lost   UDP (Mbits/s)
941               942               0.048             0      1.05
939               940               0.025             0      1.05
935               937               0.022             0      1.05
930               931               0.023             0      1.05
941               942               0.025             0      1.05
avg: 937.2        938.4             0.0286            0      1.05