An Introduction to GPFS


Vladimir Sapunenko, INFN-CNAF
INFN T1+T2 cloud workshop, 21-22/11/2006

What is GPFS?
IBM General Parallel File System (GPFS) is a high-performance, shared-disk cluster file system.
- Designed to support high-performance computing
- POSIX compliant
- Provides concurrent, high-speed file access to applications running on multiple nodes of AIX and Linux clusters
[Architecture diagram: I/O nodes connected to shared disks over a switching fabric]

The file system
- Supports quotas, snapshots and extended ACLs
- Built from a collection of disks
- Each disk can contain data and/or metadata
- Coherency and consistency are maintained via a distributed lock manager
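As a minimal sketch of how these features are driven from the GPFS CLI (the file system name fs1, user name and paths are hypothetical; options should be checked against the release in use):

    # Create a snapshot of file system fs1
    mmcrsnapshot fs1 snap_20061121

    # List the snapshots of fs1
    mmlssnapshot fs1

    # Report quota usage for a user on fs1
    mmlsquota -u someuser fs1

    # Show the extended ACL of a file in the GPFS mount
    mmgetacl /gpfs/fs1/data/somefile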

Performance and scalability
- Striping across multiple disks attached to multiple nodes
- Efficient client-side caching
- Support for large, configurable block sizes
- Advanced algorithms for read-ahead and write-behind
- Dynamic optimization of I/O based on access pattern (sequential, reverse sequential, random)
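As an illustration (not from the slides): the block size is chosen at file system creation time. The mount point, device name and disk descriptor file below are hypothetical, and option syntax varies between GPFS releases:

    # Create a GPFS file system with a 1 MB block size,
    # striped across the disks listed in disk.desc
    mmcrfs /gpfs/fs1 fs1 -F disk.desc -B 1M -A yes

    # Verify the block size chosen for the file system
    mmlsfs fs1 -B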

Administration
- Consistent with standard Linux file system administration
- Simple CLI; most commands can be issued from any node in the cluster
- No dependency on Java or graphics libraries
- Extensions for clustering aspects: a single command can perform an action across the entire cluster
- Support for the Data Management API (IBM's implementation of the X/Open data storage management API)
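For example, a few cluster-wide administrative commands that can be run from any node (a hedged sketch; output formats differ between releases and fs1 is a hypothetical file system name):

    # Show the cluster definition (nodes, configuration servers, remote shell)
    mmlscluster

    # Show the GPFS daemon state on every node of the cluster
    mmgetstate -a

    # Report free and used space, per disk, for file system fs1
    mmdf fs1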

Data availability
- Fault tolerance:
  - node failure is handled by the clustering layer
  - storage system failure is handled by data replication
- File system health monitoring: extensive logging and automated recovery actions in case of failure
- Data replication is available for journal logs, data and metadata
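A hedged sketch of how replication is typically enabled (file system and descriptor file names are hypothetical; -m/-r set the default metadata and data replica counts):

    # Create a file system that keeps two copies of both data and metadata,
    # placed in different failure groups
    mmcrfs /gpfs/fs1 fs1 -F disk.desc -m 2 -M 2 -r 2 -R 2

    # Or raise the default replication on an existing file system
    mmchfs fs1 -m 2 -r 2

    # Re-replicate existing files to match the new settings
    mmrestripefs fs1 -R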

Information Lifecycle Management (ILM)
- New feature of GPFS v3.1
- Storage pools allow the creation of disk groups within a file system (hardware partitioning)
- A fileset is a sub-tree of the file system namespace (namespace partitioning); for example, filesets can serve as administrative boundaries for setting quotas
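For illustration (the fileset name and junction path below are hypothetical), a fileset is created and then linked into the namespace:

    # Create a fileset in file system fs1
    mmcrfileset fs1 cms_data

    # Attach (link) the fileset at a junction point inside the file system
    mmlinkfileset fs1 cms_data -J /gpfs/fs1/cms

    # List filesets and their junction paths
    mmlsfileset fs1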

User-defined policies
- File placement policies
  - Define where data will be created (the appropriate storage pool)
  - Rules are driven by attributes such as file name, user name and fileset
- File management policies
  - Move data from one pool to another without changing the file's location in the directory structure
  - Change replication status
  - Prune the file system (delete files as defined by policy)
  - Driven by attributes such as access time, path name and file size
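A minimal placement-rule sketch in the same GPFS policy language used on the next slide (the pool names and the file-name pattern are hypothetical; a default placement rule is required):

    /* Place new *.root files in the 'fast' pool, everything else in 'system' */
    RULE 'data'    SET POOL 'fast' WHERE lower(NAME) LIKE '%.root'
    RULE 'default' SET POOL 'system'

Placement rules like these would then be installed with mmchpolicy, e.g. "mmchpolicy fs1 placement.pol" (file names hypothetical).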

Policy rule examples
If the storage pool named pool_1 has an occupancy percentage above 90%, bring the occupancy of pool_1 down to 70% by migrating the largest files to storage pool pool_2:

    RULE 'mig1' MIGRATE FROM POOL 'pool_1'
         THRESHOLD(90,70) WEIGHT(KB_ALLOCATED)
         TO POOL 'pool_2'

Delete files from the storage pool named pool_1 that have not been accessed in the last 30 days and are named like temporary files or appear in any directory named tmp:

    RULE 'del1' DELETE FROM POOL 'pool_1'
         WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME) > 30)
           AND (lower(NAME) LIKE '%.tmp' OR PATH_NAME LIKE '%/tmp/%')
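Management rules like these are evaluated by the policy engine on demand; a hedged usage sketch (file system and rules file names are hypothetical):

    # Install the placement policy for fs1
    mmchpolicy fs1 policy.rules

    # Run the migration/deletion rules against fs1,
    # first in test mode, then for real
    mmapplypolicy fs1 -P policy.rules -I test
    mmapplypolicy fs1 -P policy.rules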

Cluster configuration: shared-disk cluster
- SAN storage attached to all nodes in the cluster via Fibre Channel (FC)
- All nodes interconnected via LAN
- Data flows over FC; control information is transmitted over Ethernet
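A hedged sketch of creating such a cluster (node names, configuration servers and remote-shell choices are hypothetical, and exact options differ between GPFS releases):

    # nodes.list: one node descriptor per line, e.g.
    #   node01:quorum-manager
    #   node02:quorum-manager
    #   node03:client
    mmcrcluster -N nodes.list -p node01 -s node02 -r /usr/bin/ssh -R /usr/bin/scp

    # Start GPFS on all nodes and check their state
    mmstartup -a
    mmgetstate -a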

Network-based block I/O
- Block-level I/O interface over the network: Network Shared Disk (NSD)
- GPFS transparently handles I/O whether NSD servers or direct attachment are used
- Intra-cluster communication can be separated onto dedicated network interfaces
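A hedged sketch of defining NSDs served over the LAN (the disk descriptor format shown follows the GPFS 2.3/3.1 style; device and server names are hypothetical):

    # disk.desc: DiskName:PrimaryServer:BackupServer:DiskUsage:FailureGroup
    #   /dev/sdb:diskserv01:diskserv02:dataAndMetadata:1
    #   /dev/sdc:diskserv02:diskserv01:dataAndMetadata:2
    mmcrnsd -F disk.desc

    # List the NSDs and which file system they belong to
    mmlsnsd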

GPFS@Tier1
Production: GPFS v2.3.0-12
- 12 I/O servers
- 15 file systems, 125.4 TB (via SAN)
- 654 worker nodes
Testbed: GPFS v3.1.0-4
- 3 I/O servers (no SAN disks)
- 2 clients
[Diagram: worker nodes and I/O servers connected to storage disks; the testbed uses a virtual SAN]

Enhancements in GPFS v3.1
- The requirements for IP connectivity in a multi-cluster environment have been relaxed. In earlier releases, 'all-to-all' connectivity was required: any node mounting a given file system had to be able to open a TCP/IP connection to any other node mounting the same file system, regardless of which cluster either node belonged to.
- Enhanced file system administration: the mmmount and mmumount commands are provided for cluster-wide file system management.
- Performance monitoring
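For instance (a hedged sketch; the device name fs1 is hypothetical):

    # Mount file system fs1 on every node of the cluster
    mmmount fs1 -a

    # Unmount it everywhere, e.g. before maintenance
    mmumount fs1 -a

    # Mount all GPFS file systems on the local node only
    mmmount all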

Known GPFS limitations
- Number of file systems: < 32
- Number of cluster nodes: < 4096 (2441 tested)
- Single disk size: < 2 TB
- File system size: < 2^99 bytes (2 PB tested)
- Number of files: < 2 x 10^9
- The Red Hat EL 4.0 uniprocessor (UP) kernel is not supported
- The RHEL 3.0 and RHEL 4.0 hugemem kernels are not supported
- Although GPFS is a POSIX-compliant file system, some exceptions apply:
  - Memory-mapped files are not supported in this release
  - stat() is not fully supported: mtime, atime and ctime returned by the stat() system call may be updated slowly if the file has recently been modified on another node