TSM CERN. Daniele Francesco Kruse, CERN IT/DSS. Data & Storage Services, CERN IT Department, CH-1211 Genève 23, Switzerland.

Presentation transcript:

TSM CERN
Daniele Francesco Kruse, CERN IT/DSS
Presented by Giuseppe Lo Presti
20th HEPiX - Vancouver - October 2011
Data & Storage Services (DSS), CERN IT Department, CH-1211 Genève 23, Switzerland

Outline
- TSM at CERN
- TSM Management Station
  - Overview
  - Main features
- TSMMSv2
  - Motivations
  - Design
  - New ideas

TSM at CERN (1/3)
We back up:
1. Network filesystems (60’000 AFS, 1’500 DFS volumes)
2. E-mail (18’000 mailboxes)
3. Web sites (12’000 websites)
4. Databases (120 DB servers)
5. Servers (1’000 Linux and Windows servers)
6. Virtual machines (120 hypervisors)
We don’t back up:
1. Physics data (CASTOR is used for this)
2. User PCs (home AFS/DFS directories are already backed up)

TSM at CERN (2/3)
- We currently have around 3.8 PB of backup data and 0.6 PB of archived data, and growing superlinearly (last year 1 PB)
- Average daily traffic is 50 TB, also growing steadily
- Around 1,200 nodes are backed up, for a total of 1,500 million files

TSM at CERN (3/3)
- 17 TSM servers in production on RHEL4/5
- 80 TB of disk storage
- 2 IBM TS3500 libraries
- 48 IBM drives
- 4’500 IBM 3952 cartridges

TSM Management Station
- TSM monitoring tool developed in-house
- Gathers data from the TSM servers
- Generates graphs and reports with various statistics
- Sends e-mails to users and administrators to inform them about potential issues
- Very useful for managing the increasing number of TSM servers

TSM Management Station
[slide image not captured in transcript]

TSM Management Station
- TSMMS daily report example: [image not captured in transcript]
- TSMMS also sends an e-mail for each error on each TSM server

TSM Management Station
- Allows management of groups of nodes (by department and division) and generates graphs and stats for each group
- Sends alerts to nodes whenever an operation fails or whenever they miss their periodic backup
- Features options to suspend or stop the alerting system
- Gives information about each node: file spaces, backup history, performance and stats, associated schedules, etc.
- … and many other stats and graphs
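The alerting rule described on this slide (alert a node when an operation fails or a periodic backup is missed, unless alerting has been suspended) can be sketched as below. This is only an illustration, not the TSMMS implementation: the `Node` record, its field names, and the 24-hour default backup period are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical record for a backed-up node (not the TSMMS schema)."""
    name: str
    group: str                           # e.g. department or division
    last_backup: Optional[datetime]      # None if the node never backed up
    last_operation_failed: bool = False
    alerts_suspended: bool = False       # the "suspend or stop alerting" option
    period: timedelta = timedelta(hours=24)  # assumed backup period


def alert_reasons(node: Node, now: datetime) -> List[str]:
    """Return the reasons this node should be alerted, or [] if none."""
    if node.alerts_suspended:
        return []
    reasons = []
    if node.last_operation_failed:
        reasons.append("operation failed")
    if node.last_backup is None or now - node.last_backup > node.period:
        reasons.append("periodic backup missed")
    return reasons


def alerts_by_group(
    nodes: List[Node], now: datetime
) -> Dict[str, List[Tuple[str, List[str]]]]:
    """Collect alerting nodes per group, for per-group reports."""
    result: Dict[str, List[Tuple[str, List[str]]]] = {}
    for node in nodes:
        reasons = alert_reasons(node, now)
        if reasons:
            result.setdefault(node.group, []).append((node.name, reasons))
    return result
```

Keeping the suspend flag on the node record means a suspended node is skipped before any check runs, so it never appears in a group report.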

Motivations for a new TSMMS
- TSMMS provides 90% of all the information that is needed
- However, it is:
  - not use-case oriented
  - not compatible with TSM v6.x (it depends heavily on the TSM 5 database schema)
- The choice was therefore to start from scratch with a clean design and architecture
- Change in philosophy: the focus is now on how to convey the relevant information for each use case

Splunk
- TSMMS takes care of both the monitoring and the alerting system
- TSMMSv2 will be responsible only for the monitoring, while the alerting tasks will be moved to Splunk
- Splunk is a commercially available tool (with a free trial):
  - Log aggregator/mining
  - Search engine
  - New features: alerting and reporting
- TSMMSv2 and Splunk will work together to provide the TSM admin with proper information and alerts

Splunk
[slide image not captured in transcript]

TSMMSv2 modeled on a typical TSM admin day
TSM admin tasks:
- Add nodes to TSM
- Spot issues and solve them
- Check DB space and tape pools
- Handle user support tickets
Needs:
- Need to find a suitable server...
- Need to have a clear view of DB and pools...
- Check quickly for any anomaly in the system
Scope reduced: Splunk does the rest!

Structure of TSMMSv2
- Model Layer
- TSMMS DB
- TSM Server 1, TSM Server 2, TSM Server 3, TSM Server 4, TSM Server N
- Controller Layer (Display Logic)
- View Layer (HTML and JavaScript Templates)
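The three layers named on this slide can be illustrated with a small sketch. It assumes, as one reading of the diagram, that the model layer reads from the TSMMS DB, which is in turn filled from the TSM servers. Everything concrete here is hypothetical: the in-memory SQLite database standing in for the TSMMS DB, the `db_usage` table, the server names, and the 85% threshold are invented for the example and are not taken from TSMMSv2.

```python
import sqlite3
from string import Template


# --- Model layer: reads from a stand-in for the TSMMS DB ---
def make_demo_db() -> sqlite3.Connection:
    """Build an in-memory SQLite stand-in; table and values are invented."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE db_usage (server TEXT, used_pct REAL)")
    conn.executemany(
        "INSERT INTO db_usage VALUES (?, ?)",
        [("tsm01", 62.0), ("tsm02", 91.5), ("tsm03", 40.0)],
    )
    return conn


def fetch_db_usage(conn: sqlite3.Connection):
    """Model: return raw (server, used_pct) rows; no display decisions."""
    return conn.execute(
        "SELECT server, used_pct FROM db_usage ORDER BY server"
    ).fetchall()


# --- Controller layer: display logic ---
def servers_needing_attention(rows, threshold_pct: float = 85.0):
    """Controller: keep only the rows an admin should look at."""
    return [
        {"server": server, "used_pct": f"{pct:.1f}"}
        for server, pct in rows
        if pct >= threshold_pct
    ]


# --- View layer: HTML template ---
ROW = Template("<li>$server: DB $used_pct% full</li>")


def render(items) -> str:
    """View: turn prepared items into HTML, with no filtering logic."""
    return "<ul>" + "".join(ROW.substitute(item) for item in items) + "</ul>"
```

With this split, supporting a different TSM database schema would only touch the model functions, which is one way to avoid the schema dependence that the deck cites as a limitation of the first TSMMS.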

TSMMSv2 New Ideas
- TSMMSv2 will focus on helping TSM admins with daily tasks
- Display only relevant information (not everything else) for the most important issues that may arise
- Not only monitoring: also a GUI for selected common administrative tasks
  - Add new nodes to the appropriate server
- Automation of certain tasks, such as:
  - Add new storage space where needed (e.g. DB)
  - Automatically deal with faulty tapes or drives
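One way to picture "add new nodes to the appropriate server" is a simple selection rule. The slides do not say how TSMMSv2 chooses a server, so the rule below (skip servers whose database is nearly full, then prefer the one with the fewest nodes) is purely an assumption, as are the `ServerLoad` fields and the 80% limit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServerLoad:
    """Hypothetical per-server summary (not a TSM or TSMMS structure)."""
    name: str
    node_count: int
    db_used_pct: float


def pick_server(
    servers: List[ServerLoad], max_db_used_pct: float = 80.0
) -> Optional[ServerLoad]:
    """Pick a server for a new node, or None if no server qualifies.

    Assumed rule: exclude servers whose DB usage is at or above the
    limit, then choose the fewest nodes, breaking ties by DB usage.
    """
    candidates = [s for s in servers if s.db_used_pct < max_db_used_pct]
    if not candidates:
        return None
    return min(candidates, key=lambda s: (s.node_count, s.db_used_pct))
```

Returning `None` instead of a fallback server leaves the decision with the admin when every server is close to its limit.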

Thank you. Questions?