
1 A Scalable and Resilient PanDA Service (BNL: ATLAS Computing)
US ATLAS Computing Facility and Physics Application Group, Brookhaven National Lab
Presented by Dantong Yu

2 Motivation and Outline
 Build a scalable and resilient PanDA service for ATLAS
 Support ATLAS VOs and thousands of ATLAS users and jobs.
 Reliable, scalable, and high performance.
 Cost-effective and flexible deployment.
 A joint effort between the Physics Application Group and the RACF Grid Computing Group to deploy and operate every component of the PanDA system.
 In this talk:
 BNL PanDA architecture
 PanDA components
 PanDA hardware
 Required software infrastructure and Grid middleware
 Infrastructure and procedure to download and install required RPMs
 Nagios-based PanDA monitoring systems
 Operation procedures
 Problems experienced

3 BNL PanDA Architecture

4 Reliable/High-Performance ATLAS Job Management Architecture (PanDA)
[Architecture diagram: clients connect to virtual services at a VIP on an F5 server load-balancing switch, which rewrites the IP header's source and destination addresses (IP relay) and forwards requests to the physical servers: PanDA servers, monitoring servers, AutoPilot, the PanDA DB, and the PanDA archive.]

5 PanDA Components

6 Production PanDA Components
 Production system
 Front-end load balancers
 The F5 switch provides load balancing and reliability.
 Its transparency allows flexible management of the heterogeneous service, with only minimal application-level configuration and coding necessary to support integration with the smart switch.
 PanDA monitoring service, PanDA server, and PanDA logging servers (all stateless)
 The PanDA server dispatches jobs to pilots as they request them; it is HTTPS-based and stateless, and needs to connect to the central PanDA DB.
 The monitoring service provides read-only graphical information about PanDA function via HTTP; the GUI is also stateless and needs to connect to the central PanDA DB.
 The logging server logs PanDA server events into the PanDA DB.
 AutoPilot submission systems (stateful)
 Use Condor-G and site gatekeepers to fill sites with pilots.
 PanDA pilot wrapper code distributor: Subversion with a web front-end.
 Pilots dynamically download the pilot wrapper script from the Subversion web cache.
 PanDA database system
 https://www.racf.bnl.gov/experiments/usatlas/gridops/griduiconfig/
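The wrapper-distribution step above can be sketched as follows: a pilot fetches the current wrapper script over HTTP from the Subversion web cache and installs it as an executable. This is an illustrative sketch only; the URL, filename, and helper functions are hypothetical, not the actual BNL paths or pilot code.

```python
import os
import stat
import tempfile
import urllib.request

# Hypothetical URL of the pilot wrapper on the Subversion web cache;
# the real BNL location differs.
WRAPPER_URL = "https://svn.example.org/panda/pilot/wrapper.sh"

def fetch_wrapper(url=WRAPPER_URL):
    """Download the current pilot wrapper, as a pilot would at startup."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()

def install_wrapper(script_bytes, dest_dir=None):
    """Write the wrapper script to disk and mark it executable."""
    dest_dir = dest_dir or tempfile.mkdtemp()
    path = os.path.join(dest_dir, "pilot_wrapper.sh")
    with open(path, "wb") as f:
        f.write(script_bytes)
    # Add owner-execute permission so the batch job can run it.
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path
```

Because the wrapper is fetched at pilot start rather than baked into the submission, updating the script in Subversion updates all subsequently launched pilots.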

7 BNL: ATLAS Computing 7 Panda Development and Testbed Systems  Panda Testbed and Development Systems  Panda Monitoring Service and Panda Server  Database System

8 PanDA Hardware

9 PanDA Hardware
 Each component group requires a separate set of hosts and hardware. Most servers are standalone, except for a few.
 Front-end load balancers: two F5 3600 load-balancing switches.
 PanDA monitor, PanDA server, and PanDA logging servers:
 Dual quad-core Intel Xeon E5335 @ 2.00 GHz (eight cores per host), 16 GB memory, and six 750 GB SATA drives (software RAID 10 provides 2 TB local storage): three servers.
 AutoPilot submission systems (local and global pilots), stateful:
 Dual quad-core Intel Xeon E5430 @ 2.66 GHz (eight cores per host), 16 GB memory, and two 750 GB SATA drives (mirrored disks): four servers.
 PanDA pilot wrapper code distributor (Subversion with a web front-end):
 Dual quad-core Intel Xeon E5430 @ 2.66 GHz (eight cores per host), 8 GB memory, and two 150 GB SAS drives (mirrored disks): one server. An archive system is needed to recover if disk storage is lost.
 Web Apache server

10 BNL ATLAS MySQL Production and Development Servers
 The following BNL production MySQL servers are used:
 2 PanDA-production MySQL servers (InnoDB): primary and spare, dual dual-core with 16 GB memory and a 64-bit OS.
 4 PanDA-archive MySQL servers (MyISAM): 2 primary + 2 spare, 2 quad-core processors with 16 GB memory and a 64-bit OS.
 Daily text-based backup (database content) for all databases on the production servers above, with an extra disk copy on a special data server having an interface to the tape system.
 64-bit architecture, x86_64, kernel 2.6.9-55.0.9.ELsmp.
 Six 15k rpm SAS drives, each with 145 GB disk space.
 Details can be found at https://www.racf.bnl.gov/experiments/usatlas/gridops/atlasdbinfo.
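A nightly text-based backup of the kind described above is typically driven from cron via `mysqldump`; the sketch below just builds one such invocation. The output directory, option set, and naming scheme are illustrative assumptions, not BNL's actual backup configuration (e.g. `--single-transaction` suits InnoDB databases; the MyISAM archive servers would use table locking instead).

```python
import datetime

def dump_command(db_name, out_dir="/backup/mysql"):
    """Build a mysqldump command line for one database's nightly
    text-based backup. Paths and options are illustrative only."""
    stamp = datetime.date.today().isoformat()
    outfile = "%s/%s-%s.sql" % (out_dir, db_name, stamp)
    # --single-transaction gives a consistent snapshot for InnoDB
    # without locking the tables for the duration of the dump.
    return ["mysqldump", "--single-transaction", db_name,
            "--result-file", outfile]
```

The resulting list can be handed to `subprocess.run` from a cron-driven script, with the dump file then copied to the tape-attached data server.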

11 ATLAS MySQL Production Databases at BNL: Details and Performance
 PanDA production MySQL server and its replica server with identical hardware: a "fast-buffer" database.
 Keeps the info about all PanDA-managed reprocessing, MC-production, and user-analysis jobs for up to 2 weeks; a cron job moves the data into the archive periodically.
 Designed initially for US ATLAS; since September 2007 it supports 10 different ATLAS clouds (CERN, CA, DE, ES, FR, NL, UK, US, TW, and 2 instances for NorduGrid: ND, NDGF).
 Runs MySQL version 5.0.X, engine InnoDB; simple structure, autoincrement for IDs, no foreign keys.
 31 tables, max number of rows ~16,500,000.
 Provides fast, multiple parallel connections to the basic PanDA components: PanDA server, PanDA monitor, and logger.
 Performance
 Access pattern: ~380-440 parallel threads open simultaneously all the time (max ~600).
 Performance: average ~360 q/sec (max > 800).
 Query types: select ~35%, update ~35%, insert ~25%, others (delete, etc.) ~5%.
 Nice monitoring interface, the PanDA monitor: http://pandamon.usatlas.bnl.gov:25880/server/pandamon/query?dash=prod
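The two-week fast-buffer plus periodic archival cron described above amounts to an insert-then-delete move of old rows. The sketch below shows the idea using an in-memory SQLite database as a stand-in for the production MySQL servers; the table and column names (`jobsActive`, `jobsArchived`, `endTime`) are illustrative, not the real PanDA schema.

```python
import sqlite3
import time

TWO_WEEKS = 14 * 24 * 3600  # archival cutoff, in seconds

def archive_old_jobs(conn, now=None):
    """Move jobs that ended more than two weeks ago from the live
    table to the archive table, as the archival cron job does.
    Returns the number of rows moved."""
    now = now if now is not None else time.time()
    cutoff = now - TWO_WEEKS
    cur = conn.cursor()
    cur.execute("INSERT INTO jobsArchived SELECT * FROM jobsActive "
                "WHERE endTime < ?", (cutoff,))
    cur.execute("DELETE FROM jobsActive WHERE endTime < ?", (cutoff,))
    conn.commit()
    return cur.rowcount

# In-memory SQLite stands in for the MySQL fast-buffer and archive.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobsActive (pandaID INTEGER, endTime REAL)")
conn.execute("CREATE TABLE jobsArchived (pandaID INTEGER, endTime REAL)")
now = time.time()
conn.execute("INSERT INTO jobsActive VALUES (1, ?)", (now - 20 * 24 * 3600,))
conn.execute("INSERT INTO jobsActive VALUES (2, ?)", (now - 3600,))
moved = archive_old_jobs(conn, now)
```

Keeping only two weeks of rows in the InnoDB buffer is what keeps the high-concurrency select/update traffic fast, while the MyISAM archive grows without bound.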

12 Critical DBs on Four PanDA Archival Database Servers
 PanDA archive production MySQL server (along with a spare node)
 Database PandaArchiveDB
 Keeps the full archive of PanDA-managed reprocessing, Monte Carlo production, and user analysis jobs since the end of 2005.
 Engine MyISAM, no autoincrement; replication from PandaDB through cron jobs.
 Partitioning: bi-monthly structure of job/file archive tables for better search performance.
 44 tables, max number of rows ~33,000,000 per table.
 Databases PandaLogDB, PandaMetaDB
 Keep the archive of log-extract files for jobs, some monitoring information about pilots, AutoPilot, and scheduler-configuration support (schedconfig).
 Engine MyISAM, ~52-54 tables.
 Partitioning: bi-monthly structure for some tables.
 Max number of rows ~4,600,000 per table.
 Access pattern: ~400-450 parallel threads open (max ~740).
 Performance: average ~1300-1600 q/sec (max ~2800); select ~80%, insert ~20%.

13 PanDA Server Infrastructure

14 PanDA Software Infrastructure
 OS, Grid middleware, and software requirements
 OS (RHEL/SL 4) RPMs: mod_ssl, subversion, rrdtool, openmpi, gridsite, graphtool, matplotlib, MySQL.
 gLite-UI 3.1: setup from /etc/profile.d/.
 CA certificates installed/updated.
 Unix accounts with ssh-key access: sm
 Python 2.5 (from Tadashi) RPMs: python25, mod_python25, python25-curl, python25-numeric, MySQL-python25, python25-imaging.

15 PanDA AutoPilot
 gLite-UI 3.1: setup from /etc/profile.d/
 CA certificates installed/updated
 Unix accounts with ssh-key access: sm (sm2 for grid AutoPilot, usatlas1 for local submission)
 Condor 7.3.0 with custom configuration

16 PanDA System OS Administration
 Initial install: a semi-manual setup script is at /afs/usatlas.bnl.gov/mgmt/etc/gridui.usatlas.bnl.gov/system-setup.sh
 Ongoing package maintenance: BNL Red Hat Satellite system.
 Condor administration: on systems with global Condor, config changes and restarts require root.
 Account management: occasional SSH key additions for new team members.

17 PanDA Monitoring Systems

18 PanDA Monitoring Systems: https://www.usatlas.bnl.gov/nagios/sla_array.html

19 USATLAS Tier 2 Sites: https://www.usatlas.bnl.gov/nagios/tier2.html

20 MySQL Server Monitoring
We use three monitoring tools for the MySQL servers:
- MySQLStat: provides a monitoring service for the internal ATLAS community: BNL ATLAS MySQL servers, CERN MySQL servers, and some other MySQL servers in the USA and Europe.
- Ganglia
- Nagios: provides critical server status, sends warnings and alarms if a service has a problem, opens RT tickets, and can do some simple automatic recovery.
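Nagios drives the warning/alarm behavior described above through plugin exit codes (0 = OK, 1 = WARNING, 2 = CRITICAL). The sketch below shows the shape of such a check against a MySQL connected-thread count; the metric and thresholds are illustrative, not the production checks (the ~600 and ~740 figures echo the peak thread counts quoted on the database slides).

```python
# Nagios plugin convention: the exit status encodes the service state.
OK, WARNING, CRITICAL = 0, 1, 2

def check_threads(connected, warn=600, crit=740):
    """Classify a MySQL connected-thread count against thresholds,
    returning (nagios_state, status_line). Thresholds are illustrative."""
    if connected >= crit:
        return CRITICAL, "CRITICAL - %d connected threads" % connected
    if connected >= warn:
        return WARNING - 0, "WARNING - %d connected threads" % connected
    return OK, "OK - %d connected threads" % connected
```

A real plugin would read the count via `SHOW STATUS LIKE 'Threads_connected'`, print the status line, and call `sys.exit(state)` so Nagios can alarm, open an RT ticket, or trigger a recovery handler.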

21 MySQL Server Monitoring: MySQLstat

22 PanDA Operation Procedures

23 Alarm Flow: Nagios, SLA, and RT
 Machines and services are monitored by Nagios. When a critical machine or service fails, Nagios generates alarms and sends email alerts to the SLA system; when the service recovers, Nagios sends a recovery notification to the SLA system.
 The RACF SLA system provides a configurable alarm management layer that automates service alerts from the Nagios-based monitoring system.
 RT can exchange problem reports with external ticketing systems such as OSG Footprints and GGUS.
 Escalation occurs if no response happens within the SLA-specified time window.

24 Experienced Operation Problems

25 PanDA Server and Database Problems
 PanDA server hanging
 A cron job at the database server detects slow queries and disconnects the PanDA server's MySQL connection if it appears to be slow.
 PanDA processes did not handle this disconnection and froze.
 The PanDA server had to be restarted, either manually or automatically by Nagios.
 PanDA database server load
 Enhanced database monitoring capabilities to identify intrusive queries and the particular users and applications that initiate them, and worked with users to modify their MySQL queries.
 This effectively and significantly reduced the number of slow queries.
 Purchased licensed MySQL backup software to reduce the backup time.
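The slow-query watchdog described above boils down to scanning `SHOW PROCESSLIST` output and selecting long-running queries to kill. The sketch below works on processlist rows represented as dicts mirroring MySQL's column names; the 300-second threshold and the function itself are illustrative assumptions, not the actual BNL cron job.

```python
def slow_query_ids(processlist, max_seconds=300):
    """Return the connection ids of queries that have been running
    longer than max_seconds. Each row mirrors a SHOW PROCESSLIST
    entry (Id, Command, Time); the threshold is illustrative."""
    return [row["Id"] for row in processlist
            if row["Command"] == "Query" and row["Time"] > max_seconds]
```

A real watchdog would issue `KILL <id>` for each returned id, which is exactly the disconnection the PanDA server processes failed to handle; the fix was to make the server tolerate (or Nagios restart it after) such killed connections.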

26 Condor-G Based Auto-Pilots
 Condor-G and gatekeepers use GASS servers to synchronize job status; a large number of Condor-G jobs adds significant load and results in status loss and held jobs.
 Condor-G frequently froze due to a large number of held jobs.
 Pilot job status reported by Condor-G was out of sync with the actual status of ATLAS jobs.
 Killing held pilot jobs caused good ATLAS jobs to be aborted early.
 Worked with the University of Wisconsin to customize Condor-G:
 Stage-in and stage-out events go into the user log for better diagnosis.
 More Condor-G tuning options for large numbers of job submissions and dispatches.
 More fine-tuning knobs with separate throttles, for example limiting jobmanagers by their role: submission vs. stage-out/removal.
 Efficiently process failed jobs and prevent bad jobs from clogging the submission system: when a gridmanager decides to put a job on hold, instead use the hold_reason as the abort_reason and abort the job.
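Throttles of the kind described above are set through Condor's gridmanager configuration macros. The fragment below shows the general shape; the parameter values are purely illustrative, and the exact set of per-role throttle knobs added for PanDA came from the custom Wisconsin builds, so treat the names as examples of this class of setting rather than BNL's production configuration.

```
# condor_config fragment (illustrative values, not BNL's settings)
GRIDMANAGER_MAX_SUBMITTED_JOBS_PER_RESOURCE = 1000
GRIDMANAGER_MAX_PENDING_SUBMITS_PER_RESOURCE = 5
GRIDMANAGER_JOB_PROBE_INTERVAL = 300
```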

27 PanDA Monitoring
 Front-end switch system hanging due to expired licenses.
 The new Python version and the Oracle clients require manual compilation.
 The certificate authority does not issue certificates with a DN containing a wildcard (*), so clients could not properly perform X.509 certificate-based authentication with the multiple backend servers behind the F5 switch.

28 Summary
 Contributions:
 Innovation in hardware resilience, extensive monitoring, and automatic problem reporting and tracking.
 Significantly enhanced the reliability of the evolving PanDA system.
 Supported easy access to the system for software improvement.
 Remaining issues:
 Condor-G is slow to update pilot status, causing inconsistency between actual job status and PanDA monitoring.
 Frequent crashes of Condor-G components were fixed after the Condor team provided Condor 7.3.0.

