Presentation is loading. Please wait.

Presentation is loading. Please wait.

Ruben Gaspar – CERN - Speaker Dawid Wojcik – CERN Ignacio Coterillo– CERN Daniel Gomez- CERN UKOUG Database Server SIG Meeting, 29th January 2013.

Similar presentations


Presentation on theme: "Ruben Gaspar – CERN - Speaker Dawid Wojcik – CERN Ignacio Coterillo– CERN Daniel Gomez- CERN UKOUG Database Server SIG Meeting, 29th January 2013."— Presentation transcript:

1

2 Ruben Gaspar – CERN - Speaker Dawid Wojcik – CERN Ignacio Coterillo– CERN Daniel Gomez- CERN UKOUG Database Server SIG Meeting, 29th January 2013

3 Outline CERN and Databases Architecture Hardware overview Empowering users Functionality overview Conclusions

4 4 CERN European Organization for Nuclear Research founded in 1954 20 Member States, 7 Observer States + UNESCO and UE 60 Non-member States collaborate with CERN 2400 staff members work at CERN as personnel, 10 000 more researchers from institutes world-wide

5 5 LHC and Experiments Large Hadron Collider (LHC) – particle accelerator used to collide beams at very high energy 27 km long circular tunnel Located ~100m underground Protons currently travel at 99.9999972% of the speed of light Collisions are analysed with usage of special detectors and software in the experiments dedicated to LHC New particle discovered! consistent with the Standard Model Higgs boson

6 6 WLCG The world’s largest computing grid More than 25 Petabytes of data stored and analysed every year Over 68 000 physical CPUs Over 305 000 logical CPUs 157 computer centres in 36 countries More than 8000 physicists with real-time access to LHC data

7 7 Oracle at CERN Relational DBs play a key role in the LHC production chains Accelerator logging and monitoring systems Online acquisition, offline: data (re)processing, data distribution, analysis Grid infrastructure and operation services Monitoring, dashboards, etc. Data management services File catalogues, file transfers, etc. Metadata and transaction processing for tape storage system Database on Demand service: Oracle & MySQL instances (more later)

8

9 9 Database as a Service – Rationale Empowering CERN IT and research community Users can request and manage different database instances (currently MySQL and Oracle single instance) Aimed at medium size and long-term projects Users are provided with a self-service portal Ease of administration Integrated backup & recovery Monitoring solution One click patching

10 10 Database as a Service principles Scalable Provide flexible and cost effective Database as a Service Owners are grouped by a mailing group (access authorization) Owners receive full DBA privileges* on their instances Owners are responsible for ensuring that their systems, and the use of their systems, are fully compliant with the Rules of CERN Computing Facilities (including security) The “Database on Demand” (DBoD) service – OS administration and providing support for self-service portal functionality The DBoD service does not provide DBA or application support

11 11 DBaaS providers Legend MySQL Oracle SQL Server B&R HA Upgrades Monitoring add-on

12 12 Private cloud model 12 Reuse existing virtualization infrastructure and know- how – cost efficient Improve operations Standardization Consolidation – migrate existing DBs to DBoD service Reuse tools and management frameworks HA via virtualization (live migration) Oracle clusterware Master/Slave replication (just for MySQL)

13 13 Architecture 13 Virtualization Oracle VM (2.2, 3.0.2 and 3.1.1) on Linux x86_64 Typical VM size: 2 cores, 8 or 16GB RAM Physical server: usually running several instances Storage NFS over 10 Gigabit Ethernet Configuration Management Open-source Quattor Toolkit CERN is currently adopting Puppet Management framework Syscontrol – developed at CERN Custom development mainly Perl + Bash. About 10K lines code Web self-service portal

14 + Physical Servers

15 15 Hardware servers Dell blades PowerEdge M610 2x Quad-Core Intel Xeon @ 2.53GHz 48 GB RAM Transtec Database server 2x Six-Core Intel Xeon @ 2.26 GHz 128 GB RAM NetApp cluster Next release 10GbE

16 16 Hardware storage NetappFAS3240(nowadays)FAS6210 (future release) CPU 1xL5410 (Harpertown/FSB) 2xE5520 (Nehalem-EP, QPI) cores48 RAM8GB (-1GB NVRAM)24GB NVRAM1 GB4 GB Max flash512GB1.5TB Max aggregate size90 TB162 TB Moving from 7-mode Ontap to C-mode QPI = Intel QuickPath Interconnect (measured x3 memory bandwidth over FSB)

17 Hardware storage II DatafilesBinary logs DBOD instances Redo logs Controlfile DBOD instances Datafiles diagnostics Dual SAS loop 2x3GbE 10GbE NetApp cluster

18 18 Shared Instances 18 DBoD supports more than one MySQL instance on one VM Sharing CPU Sharing MySQL/Oracle binaries Separate buffer pools (pre-allocated memory) Separate NFS volumes Independent backup and restore

19 19 MySQL 19 Currently running MySQL Community Edition 5.5 InnoDB as the preferred storage engine – backup & recovery Binary logs enabled ACID (atomicity, consistency, isolation, durability) – innodb_flush_log_at_trx_commit = 1, sync_binlog and innodb_flush_method = O_DIRECT Using innodb_buffer_pool_size of ~5GB Using thread cache (big gain for some clients) Using query cache ( query_cache_size = 768M ) Performance schema is enabled by default

20 20 Oracle 11gR2 Archivelog mode Scheduler job for automatic archivelog clean-up COST to Restrict Instance Registration [ID 1453883.1] SQLNET.CRYPTO_CHECKSUM_SERVER=required & SQLNET.ENCRYPTION_SERVER=required filesystemio_options='SETALL‘ Scheduler job for automatic archivelog clean-up

21 21 Empowering users 21 Self-service portal Instance administration (status, start, stop) Manage configuration and logs MySQL: download/upload my.cnf, download slow queries log Oracle: download trace files Set up backups (automatic or manual) or command a restore One button instance upgrade (coming one button system upgrade) Access to monitoring information Behind scenes: J2EE Web Application running on central web servers ZK Framework 5.0.11 (Ajax based) SSO (Single Sign On) + SSL for authentication/authorisation JDBC + Apache DBCP connection pooling via JNDI JAX-WS 2.2 for SOAP Web Services Webapp (Java, ZUML, Javascript, CSS, etc): ~ 12500 lines of code

22 22 Backup solution 22 MySQL instances running in binary log mode (InnoDB recommended) Oracle instances running in archivelog mode Backups based on storage snapshots Full online DB backups done just in a few seconds Manual and/or automatic (scheduled) Small storage overhead (depends on instance activity) Point-in-time recovery – easy with snapshots and binary/archive logs Snapshots can be configured to be sent to tape (DR)

23 23 Backup management 23 Backup configuration panel Backup procedure: mysql> FLUSH TABLES WITH READ LOCK; mysql> FLUSH LOGS; Or Oracle>alter database begin backup; mysql> UNLOCK TABLES; Or Oracle>alter database end backup; snapshot resume … some time later new snapshot

24 ~1 sec Flush tables with read lock

25 25 Instance restore 25 Owners can request point-in-time recoveries Full restore takes just a few seconds Recovery time depends on number of binary logs/redo logs to replay/apply Warning: snapshots taken after the one used for recovery are lost

26 26 Instance restore 26 Binary logs TIME Data files Manual snapshot Now Automatic snapshots Point-in-time recovery

27 27 Framework Monitoring 27 Management server Queries its jobs table regulary (based on Oracle dbms_scheduler) Informs admins in case of: Pending jobs not executed Timed out jobs Failed jobs Lines of code: SQL, PL/SQL ~ 1300

28 28 Instance Monitoring 28 Evaluated several monitoring products OEM with Pythian plug-in MySQL Enterprise Monitor Monitoring server runs RACMon (in-house development) Implemented using Python ~13k lines & 15k lines of PHP Availability and performance monitoring system for Oracle DBs, MySQL, NAS storage and VM infrastructure ~30 MySQL metrics stored in monitoring DB mysql> show status Selected AWR metrics stored for Oracle instances Admins are notified via email (and SMS if needed) about Availability problems Performance issues (OS level and DB checks)

29 29 Monitoring interface 29

30 30 One button upgrades 30 DBoD admins prepare upgrade scripts Complete upgrade process is scripted and tested Upgrades of one minor version or several minors possible Owners can decide to upgrade at their convenience - one button upgrade Instance is stopped Binaries are upgraded shared instances must be upgraded at the same time Instance is restarted All post-installation tasks executed

31 31 HA: Oracle VM 31 CERN has more than 2 years of experience of running Oracle VM in production Easy scale-out of VM Pools 10 Gbit Ethernet – one network only (NAS over NFS) Production on OVM 2.2, currently testing OVM 3.1.1 and 3.2.1 Live migration for HW interventions and host OS upgrades

32 32 HA: Oracle CRS (being tested) Well-known clusterware in IT-DB group Requires to modify /etc/init.d/mysql Special case mysqld is suddenly killed → pid file not removed Stop/start operation should be done via crsctl Start-up after server’s boot as well Tar files allow several binaries upgrades mysqld_safe, mysqlbinlog, mysqladmin,… invoked from right distribution and right environment: --basedir=$BASEDIR --bindir=$BINDIR --datadir=$DATADIR Two basic configurations: 2 nodes cluster → critical applications +2 nodes cluster → optimised resources utilization

33 HA: Oracle CRS Check Start Stop Clean VIP for MySQL instance Enough time for crash recovery HOSTING_MEMBERS + SERVER_POOLS to assign instances

34 HA: Replication MySQL master → slave replication Monitoring Based on a “ping” table, show slave status not reliable Percona script pt-table-checksum (manual run) Change role “automated” my.cnf variables: Disaster recovery + Upgrades

35 35 HA comparison TechnologyUse caseTime to recover (seconds) CRSkill MySQL daemon~ 60 cluster relocate~ 20 server crashes~ 20 OracleVMhypervisor crashes~ 360 live migration0 ReplicationRole changeN/A On All cases sysbench was producing some OLTP load

36 36 Important clients 36 Hosting ~45 instances at CERN for IT and experiments Drupal content management system BOINC – LHC@home CERN document server Audio video conferencing and webcasts service HammerCloud for experiments Piwik (open source web analytics software) OpenStack Nova Trac for subversion PVSS, Fisheye,…

37 37 Summary 37 Many lessons learned during the design and implementation of the DBoD service Building Database as a Service helped CERN DB group to Gain experience with MySQL Provide a solution for Oracle database with special needs e.g. Unicode character sets Improve tools and operations Standardize on tools and frameworks Consolidate

38 38 Acknowledgements DBoD Team: IT-DB group colleagues, specially our Virtual experts! Daniel Gomez Dawid Wojcik Ignacio Coterillo

39


Download ppt "Ruben Gaspar – CERN - Speaker Dawid Wojcik – CERN Ignacio Coterillo– CERN Daniel Gomez- CERN UKOUG Database Server SIG Meeting, 29th January 2013."

Similar presentations


Ads by Google