Commercial Applications of High Performance Computing: Massive Data Processing on the Acxiom Cluster Testbed
Amy Apon, Pawel Wolinski, Dennis Reed, Greg Amerson, Prathima Gorjala
University of Arkansas

The Acxiom Cluster Testbed
A high performance cluster and cluster computing research project at the University of Arkansas.
The primary goal of the research is to investigate cluster computing hardware and middleware architectures for use in massive data processing (search, sort, match, ...).

The University of Arkansas
Fayetteville, AR

Outline of Talk
Motivation and application description
Experimental study setup
  Two cluster platforms
  System and application software
  Two file systems
  Four workloads
Results
Future work

Sponsors
1) National Science Foundation
2) Arkansas Science and Technology Authority (ASTA)
3) Acxiom Corporation
   A billion-dollar corporation based in Little Rock, Arkansas
   Provides generous support to universities in Arkansas
   Provides products and services for information integration

Application Characteristics
The files are REALLY BIG (>>100 GB).
Never underestimate the bandwidth of a Sentra carrying a hard drive and a grad student across campus.
File access may be sequential through all or portions of a file,
e.g., stepping through a list of all addresses in a very large customer file.

Application Characteristics
Or, file access may be "random",
e.g., reading the record of a particular customer number.
File cache may be ineffective for these types of workloads.

Typical Cluster Architecture
Network File System (NFS) server?
Easy to configure.
But, may not have good performance for the Acxiom workload, even with a fast network and disks.
Diagram: Node 0 (NFS server?) and Nodes 1 through N, each a dual-processor Pentium running Linux with a local hard disk and lots of memory, connected by a network switch to an external network.
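As a rough illustration of the "easy to configure" point above: an NFS setup on such a cluster needs only an export entry on the server node and a mount on each client. The paths and host names below are hypothetical, not taken from the testbed.

    # On Node 0 (the server), in /etc/exports -- path and host pattern are hypothetical:
    /export/data  node*(rw,sync)

    # Activate the export, then mount on each client node:
    exportfs -a
    mount -t nfs node0:/export/data /mnt/data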

Two Cluster Platforms: The Eagle Cluster
Four single-processor Pentium II 450 MHz computers and four Pentium III 500 MHz computers, Fast Ethernet (12.5 MBps)
Node 0 (NFS server): IDE HD, throughput ≈ 19 MBps
Nodes 1, 2, 3: IDE HD, throughput ≈ 13 MBps
Nodes 4, 5, 6: SCSI HD, throughput ≈ 18 MBps
Node 7: IDE HD, throughput ≈ 18 MBps

Two Cluster Platforms: The ACT Cluster
Seven dual-processor Pentium III 1 GHz computers
Dual EIDE disk RAID 0 subsystem in all nodes, throughput ≈ 60 MBps
Both Fast Ethernet (12.5 MBps raw bandwidth) and Myrinet (250 MBps raw bandwidth) switches, both full duplex

System and Application Software
RedHat Linux version 7.1, kernel version 2.4, on both clusters
MPI, used for spawning processes in parallel
For each node:
  Open file
  Barrier synchronize
  Start timer
  Read/write my portion
  Barrier synchronize
  End timer
  Report bytes processed
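A minimal sketch of this measurement loop for the whole-file (LWF) case, assuming MPI and standard C I/O; the file path and chunk size are placeholders, not the exact benchmark code used in the study:

    /* Sketch of the timing loop listed above (not the original benchmark).
     * File path and chunk size are illustrative placeholders. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define CHUNK (64 * 1024)            /* illustrative chunk size */

    int main(int argc, char **argv)
    {
        int rank, nprocs;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        FILE *fp = fopen("/mnt/data/bigfile", "rb");   /* open file */
        if (!fp) MPI_Abort(MPI_COMM_WORLD, 1);

        MPI_Barrier(MPI_COMM_WORLD);                   /* barrier synchronize */
        double t0 = MPI_Wtime();                       /* start timer */

        char *buf = malloc(CHUNK);
        long bytes = 0;
        size_t n;
        while ((n = fread(buf, 1, CHUNK, fp)) > 0)     /* read my portion (whole file for LWF) */
            bytes += (long)n;

        MPI_Barrier(MPI_COMM_WORLD);                   /* barrier synchronize */
        double t1 = MPI_Wtime();                       /* end timer */

        printf("rank %d: %ld bytes in %.2f s\n",       /* report bytes processed */
               rank, bytes, t1 - t0);

        free(buf);
        fclose(fp);
        MPI_Finalize();
        return 0;
    }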

Two File Systems: NFS, Version 3
Distributed file view to clients, but uses a central server for files
Sophisticated client-side cache, block size of 32 KB
Uses the Linux buffer cache on the server side
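The 32 KB block size corresponds to the rsize and wsize NFS mount options on the clients. A hypothetical mount line matching that configuration (host name and paths are placeholders):

    mount -t nfs -o nfsvers=3,rsize=32768,wsize=32768 node0:/export/data /mnt/data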

Two File Systems: Parallel Virtual File System (PVFS), kernel version
Also uses the Linux buffer cache on the server side
No client cache

PVFS Setup
Diagram: Node 0 through Node 7 connected by a network switch. Legend: a cluster node acts as a computing node (Linux, application code) and as an I/O node ("stripe" server); the MGR node is also a cluster node.

File Striping on PVFS
A very large file is striped across 7 or 8 nodes, with a stripe size of 8 KB and a fixed record length of 839 bytes.
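Given the 8 KB stripe and the 839-byte fixed record length, the I/O node that holds the start of a given record follows from simple arithmetic. A small sketch of this mapping, assuming round-robin striping that starts at server 0 (the helper function and names are illustrative, not part of PVFS):

    /* Map a record number to the I/O node holding its first byte, assuming
     * round-robin striping from server 0.  Stripe size and record length
     * are taken from the slide above. */
    #define STRIPE_SIZE  8192L     /* 8 KB stripe */
    #define RECORD_LEN   839L      /* fixed record length */

    static int record_to_server(long record_no, int num_servers)
    {
        long offset = record_no * RECORD_LEN;          /* byte offset of the record */
        return (int)((offset / STRIPE_SIZE) % num_servers);
    }
    /* Example: record 100 starts at byte 83900, i.e. stripe 10, so with
     * 8 servers it starts on server 10 % 8 = 2.  Because 8192 is not a
     * multiple of 839, a record can span two stripes (and two servers). */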

Experimental Workload One: Local Whole File (LWF)
N processes run on N nodes. Each process reads the entire file to memory.

Experimental Workload Two: Global Whole File (GWF)
N processes run on N nodes. Each process reads an equal-sized disjoint portion of the file. From a global perspective the entire file is read.
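A minimal sketch of how each process might compute its disjoint portion for GWF (the same split applies to the GWFW write workload later), assuming an even division by MPI rank; the function and variable names are illustrative:

    /* Compute this rank's disjoint byte range, assuming the file is split
     * evenly by rank (any remainder goes to the last rank). */
    static void my_portion(long file_size, int rank, int nprocs,
                           long *offset, long *length)
    {
        long portion = file_size / nprocs;
        *offset = (long)rank * portion;
        *length = (rank == nprocs - 1) ? file_size - *offset : portion;
    }
    /* Each process then seeks to *offset and reads *length bytes in chunks
     * of the chosen chunk size. */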

Experimental Workload Three: Random (RND)
N processes run on N nodes. Each process reads an equal number of records from the file from random starting locations.
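A sketch of the random-record access pattern, assuming 839-byte records, a per-rank random seed, and that the file descriptor and record count come from setup code elsewhere; the names and the seed are illustrative:

    /* RND workload sketch: read 'count' records of 839 bytes from random
     * starting locations.  fd and num_records are assumed to come from
     * setup code; the record length is taken from the slides. */
    #define _XOPEN_SOURCE 500      /* for pread */
    #include <stdlib.h>            /* rand, srand */
    #include <sys/types.h>
    #include <unistd.h>            /* pread */

    static void random_reads(int fd, long num_records, long count, int rank)
    {
        char rec[839];
        srand(1234u + (unsigned)rank);               /* per-rank seed (illustrative) */
        for (long i = 0; i < count; i++) {
            long record_no = rand() % num_records;   /* pick a random record */
            off_t offset = (off_t)record_no * 839;   /* byte offset of that record */
            pread(fd, rec, sizeof rec, offset);      /* read one record */
        }
    }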

Experimental Workload Four: Global Whole File Write (GWFW)
N processes run on N nodes. Each process writes an equal-sized disjoint portion of the file. From a global perspective the entire file is written. No locking is used since the writes are disjoint.
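A sketch of the disjoint-write pattern: each rank writes only its own byte range (computed as in the GWF sketch above), so the writes never overlap and no locking is needed. The chunk size and fill data are placeholders:

    /* GWFW sketch: write this rank's disjoint range in fixed-size chunks.
     * Offsets are non-overlapping across ranks, so no locking is required. */
    #define _XOPEN_SOURCE 500      /* for pwrite */
    #include <string.h>            /* memset */
    #include <sys/types.h>
    #include <unistd.h>            /* pwrite */

    static void write_my_portion(int fd, long my_offset, long my_length)
    {
        char chunk[64 * 1024];                       /* illustrative chunk size */
        memset(chunk, 'x', sizeof chunk);            /* placeholder data */
        for (long done = 0; done < my_length; ) {
            long n = my_length - done;
            if (n > (long)sizeof chunk) n = sizeof chunk;
            pwrite(fd, chunk, (size_t)n, (off_t)(my_offset + done));
            done += n;
        }
    }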

NFS Read, LWF and GWF, ACT with Fast Ethernet and Eagle
Chart: total throughput across all nodes, varying chunk size

NFS Read, LWF and GWF, ACT with Myrinet, ACT with Fast Ethernet, and Eagle
Chart: total throughput across all nodes, varying chunk size

NFS Random Read, ACT with Myrinet and ACT with Fast Ethernet
Chart: total throughput across all nodes, varying chunk size

PVFS Read, All Workloads, Eagle
Chart: total throughput across all nodes, varying chunk size

PVFS Read, All Workloads, shown with NFS Read (GWF, LWF), Eagle
Chart: total throughput across all nodes, varying chunk size

PVFS Read, ACT with Fast Ethernet, All Workloads
Chart: total throughput across all nodes, varying chunk size

PVFS Read, ACT with Fast Ethernet, All Workloads, shown with Eagle at chunk size 150
Chart: total throughput across all nodes, varying chunk size

RND Read, Chunk Size = 1, PVFS versus NFS, ACT with Fast Ethernet and Eagle
Chart: total throughput across all nodes

NFS Write, GWF, Eagle and ACT with Fast Ethernet
Chart: total throughput across all nodes, varying chunk size

PVFS Write, Eagle (shown with NFS Write, Eagle)
Chart: total throughput across all nodes, varying chunk size

Conclusions
File system performance is limited by disk throughput as well as network throughput, and depends on the workload.
NFS overall throughput degrades with more parallel (different-data) access, probably due to contention at the disks, and even more dramatically with our faster hardware.

Conclusions
For our system where disk speed is close to network speed, PVFS read performance is best when the access is spread across many servers; small stripes seem to be good in this case.
For our system where the disks are much faster than the network, PVFS read performance does not depend on access size.

Conclusions
PVFS write performance is dependent on access size for all platforms tested.
Myrinet is not even close to being saturated with these workloads and hardware.

Future Work
Read and write performance with Myrinet
Sensitivity studies of how PVFS stripe size affects parallel file system performance
Development of a lightweight locking mechanism for PVFS (PVFS currently does not support concurrent writes)
Exploration of fast, fault-tolerant techniques for metadata storage