EGEE is a project funded by the European Union under contract IST-2003-508833
GPFS – General Parallel File System
INFN-GRID Technical Board – Bologna, 1-2 July
Rosanna Catania

Introducing GPFS
The General Parallel File System (GPFS) for Linux on xSeries® is a high-performance shared-disk file system that can provide data access from all nodes in a Linux cluster environment. Parallel and serial applications can readily access shared files using standard UNIX® file system interfaces, and the same file can be accessed concurrently from multiple nodes. GPFS provides high availability through logging and replication, and can be configured for failover from both disk and server malfunctions.

What does GPFS do?
- Presents one file system to many nodes – it appears to the user as a standard UNIX filesystem
- Allows nodes concurrent access to the same data
- GPFS offers: scalability, high availability and recoverability, and high performance

Why use GPFS? GPFS highlights:
- Improved system performance
- Assured file consistency
- High recoverability and increased data availability
- Enhanced system flexibility
- Simplified administration

Distributions and kernel levels tested with GPFS 2.2 for Linux on xSeries (GPFS Version / Linux Distribution / Kernel Version):
- Red Hat EL *
- Red Hat Pro
- SuSE SLES (Service Pack 3)

"Direct attached" configuration (the one NOT tested yet because of RH 7.3)
- GPFS software installed on each node
- GPFS dedicated network, min. 10/100 Ethernet
- SAN connection between all nodes and storage
- Each logical disk becomes a logical volume, from which the GPFS filesystem is created
[Figure: GPFS nodes attached to the storage through a SAN]

"Shared disk" configuration (the one actually tested!)
- Additional shared disk (SD) software layer on all nodes
- Nodes which have a connection to the storage are SD servers
- Each logical disk becomes a "shared disk"; disks here are twin-tailed between nodes
- Nodes which aren't connected to the storage are SD clients and can access disks via the SD servers
- This configuration produces a lot of traffic across the GPFS network
[Figure: GPFS nodes acting as SD clients accessing storage through SD server nodes]

Quorum: multi-node quorum
Quorum = ½ × (number of nodes) + 1
[Figure: a four-node GPFS cluster – while quorum exists the filesystem is accessible; once quorum is lost the filesystem becomes inaccessible]
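A quick worked example of the formula above: in a 4-node cluster the quorum is 4/2 + 1 = 3 nodes, so the filesystem stays accessible with one node down, but becomes inaccessible as soon as two nodes are lost.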

System requirements
- Upgrade the kernel
- Apply the mmap-invalidate (patch.gz) and NFS lock patches to the Linux kernel, recompile, and install this kernel
- Ensure the glibc level is at the required version or greater
- Ensure proper authorization is granted to all nodes in the GPFS cluster to use the alternative remote shell and remote copy commands (at Catania we use SSH everywhere)
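The slides do not spell out how the SSH authorization is prepared; a minimal sketch of one common way to set up password-less root SSH between the cluster nodes (the hostname node2 is a hypothetical example) would be:

   # on the originator node, generate a key pair without a passphrase
   ssh-keygen -t rsa -N "" -f /root/.ssh/id_rsa
   # copy the public key to every other node in the cluster
   ssh-copy-id -i /root/.ssh/id_rsa.pub root@node2
   # verify that ssh (and hence scp) works without a password prompt
   ssh node2 date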

RPMs:
- rsct.basic i386.rpm
- rsct.core i386.rpm
- rsct.core.utils i386.rpm
- src i386.rpm
- gpfs.base i386.rpm
- gpfs.docs noarch.rpm
- gpfs.gpl noarch.rpm
- gpfs.msg.en_US noarch.rpm
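A minimal sketch of installing these packages (the wildcards stand in for the version strings omitted above, and the order – SRC and RSCT before GPFS – is an assumption based on the package dependencies, not a prescription from the slides):

   # SRC and RSCT packages first
   rpm -ivh src*.i386.rpm rsct.core*.i386.rpm rsct.basic*.i386.rpm
   # then the GPFS packages
   rpm -ivh gpfs.base*.i386.rpm gpfs.docs*.noarch.rpm gpfs.gpl*.noarch.rpm gpfs.msg.en_US*.noarch.rpm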

RSCT: Reliable Scalable Cluster Technology
RSCT is a set of software components that together provide a comprehensive clustering environment for Linux.
It is the infrastructure used by a variety of IBM products to provide clusters with improved system availability, scalability, and ease of use.

RSCT: Components
- The Resource Monitoring and Control (RMC) subsystem
   Provides global access to subsystems and resources throughout the cluster: a single monitoring/management infrastructure
- The RSCT core resource managers
   A software layer between a resource (hardware or software) and RMC
- The RSCT cluster security services
   Provide the security infrastructure that enables RSCT components to authenticate the identity of other parties
- The Topology Services subsystem
   Provides node/network failure detection
- The Group Services subsystem
   Provides cross-node/process coordination

RSCT peer domain: configuration
- Ensure IP connectivity between all nodes of the peer domain
- Prepare the initial security environment on each node that will be in the peer domain using
   preprpnode -k originator_node ip_server1
- Create a new peer domain definition by issuing
   mkrpdomain -f allnodes.txt domain_name
- Bring the peer domain online using
   startrpdomain domain_name
- Verify your configuration:
   lsrpdomain domain_name
   lsrpnode -a
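As an illustration, here is a hedged sketch of the same sequence for a hypothetical three-node domain (the hostnames gpfs01/gpfs02/gpfs03 and the domain name gpfsdomain are examples, not values from the slides):

   # on every node, authorize the originator node (gpfs01) to configure it
   preprpnode gpfs01
   # on the originator node: create the node list and define the domain
   printf "gpfs01\ngpfs02\ngpfs03\n" > allnodes.txt
   mkrpdomain -f allnodes.txt gpfsdomain
   # bring the domain online and check that all nodes have joined
   startrpdomain gpfsdomain
   lsrpdomain
   lsrpnode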

GPFS: Installation
- On each node copy the self-extracting image from the CD-ROM, invoke it and accept the license agreement:
   ./gpfs_install _i386 --silent
   rpm -ivh gpfs.base i386.rpm gpfs.docs noarch.rpm gpfs.gpl noarch.rpm gpfs.msg.en_US noarch.rpm
- Build your GPFS portability module:
   vi /usr/lpp/mmfs/src/config/site.mcr
   export SHARKCLONEROOT=/usr/lpp/mmfs/src
   cd /usr/lpp/mmfs/src/
   make World
- To install the Linux portability interface for GPFS:
   make InstallImages
- Verification:
   less /var/adm/ras/mmfs.log.latest
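A rough sketch of what the portability-layer build step can look like in practice (the site.mcr settings described in the comments are placeholders to be adapted to the node's distribution and kernel, not values taken from the slides):

   # edit /usr/lpp/mmfs/src/config/site.mcr and set the distribution and
   # kernel version macros to match the running node (assumed values)
   export SHARKCLONEROOT=/usr/lpp/mmfs/src
   cd /usr/lpp/mmfs/src
   make World && make InstallImages
   # a successful GPFS start is then recorded in the log
   tail /var/adm/ras/mmfs.log.latest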

GPFS: Configuration
- Creating the cluster:
   mmcrcluster -t lc -n allnodes.txt -p primary_server -s secondary_server -r /usr/bin/ssh -R /usr/bin/scp
   mmlscluster
- Creating the nodeset on the originator node:
   mmconfig -n allnodes.txt -A -C cluster_name
   mmlsconfig -C cluster_name
- Start the GPFS services on each node:
   mmstartup -C cluster_name   (or on all nodes at once: mmstartup -a)
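To make the parameters concrete, here is a hedged example of the same commands with hypothetical values (gpfs01 and gpfs02 as primary and secondary configuration data servers, cluster1 as the nodeset name, and allnodes.txt assumed to contain one hostname per line):

   # allnodes.txt contains gpfs01, gpfs02, gpfs03 – one per line
   mmcrcluster -t lc -n allnodes.txt -p gpfs01 -s gpfs02 -r /usr/bin/ssh -R /usr/bin/scp
   mmconfig -n allnodes.txt -A -C cluster1
   mmstartup -C cluster1
   mmlscluster
   mmlsconfig -C cluster1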

GPFS: Configuration
- Create NSDs (Network Shared Disks):
   mmcrnsd -F Descfile -v yes
- Creating a file system:
   mkdir /gpfs
   mmcrfs /gpfs gpfs0 -F Descfile -C cluster_name -A yes
- Mount the file system:
   mount /gpfs
- Verification:
   mmlscluster
   mmlsconfig -C cluster_name
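The contents of Descfile are not shown on the slides; a hedged sketch of one disk descriptor line in the GPFS 2.x format (the device name, NSD server names, disk usage and failure group below are all hypothetical values):

   # Descfile – one descriptor per line:
   # DiskName:PrimaryServer:BackupServer:DiskUsage:FailureGroup
   /dev/sdb:gpfs01:gpfs02:dataAndMetadata:1
   # mmcrnsd rewrites this file with the generated NSD names;
   # the rewritten file is then passed to mmcrfs with -F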

Conclusions and outlook
- The GPFS learning curve is very steep: the documentation is "monumental" but not always well organized
- Once properly installed and configured, GPFS actually allows many disk servers to be "seen" as a single entity
- Network bandwidth of the individual servers is VERY important (GPFS adjusts down to the "slowest" node)
- Reliability is still under testing
- Preliminary I/O performance tests in the "NFS" configuration show worse behaviour w.r.t. native NFS (about 4:1)
- The proper configuration with GPFS installed both on WNs and servers still has to be tested (very soon!):
   short term: trying to install the "right" kernel on the WNs running Grid.It
   long term: re-doing the tests on Scientific Linux whenever available

Useful links