Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers over InfiniBand K. Vaidyanathan P. Balaji H. –W. Jin D.K. Panda Network-Based.

Slides:



Advertisements
Similar presentations
A Proposal of Capacity and Performance Assured Storage in The PRAGMA Grid Testbed Yusuke Tanimura 1) Hidetaka Koie 1,2) Tomohiro Kudoh 1) Isao Kojima 1)
Advertisements

Crossing the Chasm: Sneaking a parallel file system into Hadoop Wittawat Tantisiriroj Swapnil Patil, Garth Gibson PARALLEL DATA LABORATORY Carnegie Mellon.
High Speed Total Order for SAN infrastructure Tal Anker, Danny Dolev, Gregory Greenman, Ilya Shnaiderman School of Engineering and Computer Science The.
Head-to-TOE Evaluation of High Performance Sockets over Protocol Offload Engines P. Balaji ¥ W. Feng α Q. Gao ¥ R. Noronha ¥ W. Yu ¥ D. K. Panda ¥ ¥ Network.
Exploiting Spatial Locality in Data Caches using Spatial Footprints Sanjeev Kumar, Princeton University Christopher Wilkerson, MRL, Intel.
Exploiting Remote Memory Operations to Design Efficient Reconfiguration for Shared Data-Centers over InfiniBand P. Balaji, K. Vaidyanathan, S. Narravula,
Evaluation of ConnectX Virtual Protocol Interconnect for Data Centers Ryan E. GrantAhmad Afsahi Pavan Balaji Department of Electrical and Computer Engineering,
Performance Characterization of a 10-Gigabit Ethernet TOE W. Feng ¥ P. Balaji α C. Baron £ L. N. Bhuyan £ D. K. Panda α ¥ Advanced Computing Lab, Los Alamos.
Performance Evaluation of RDMA over IP: A Case Study with the Ammasso Gigabit Ethernet NIC H.-W. Jin, S. Narravula, G. Brown, K. Vaidyanathan, P. Balaji,
On the Provision of Prioritization and Soft QoS in Dynamically Reconfigurable Shared Data-Centers over InfiniBand P. Balaji, S. Narravula, K. Vaidyanathan,
1 Web Server Performance in a WAN Environment Vincent W. Freeh Computer Science North Carolina State Vsevolod V. Panteleenko Computer Science & Engineering.
Novell Server Linux vs. windows server 2008 By: Gabe Miller.
An Adaptable Benchmark for MPFS Performance Testing A Master Thesis Presentation Yubing Wang Advisor: Prof. Mark Claypool.
Improving Proxy Cache Performance: Analysis of Three Replacement Policies Dilley, J.; Arlitt, M. A journal paper of IEEE Internet Computing, Volume: 3.
Energy Efficient Prefetching – from models to Implementation 6/19/ Adam Manzanares and Xiao Qin Department of Computer Science and Software Engineering.
Internet Cache Pollution Attacks and Countermeasures Yan Gao, Leiwen Deng, Aleksandar Kuzmanovic, and Yan Chen Electrical Engineering and Computer Science.
Yaksha: A Self-Tuning Controller for Managing the Performance of 3-Tiered Web Sites Abhinav Kamra, Vishal Misra CS Department Columbia University Erich.
LDU Parametrized Discrete-Time Multivariable MRAC and Application to A Web Cache System Ying Lu, Gang Tao and Tarek Abdelzaher University of Virginia.
© 2008 Hewlett-Packard Development Company, L.P. The information contained herein is subject to change without notice Automated Workload Management in.
NPACI: National Partnership for Advanced Computational Infrastructure August 17-21, 1998 NPACI Parallel Computing Institute 1 Cluster Archtectures and.
Locality-Aware Request Distribution in Cluster-based Network Servers Presented by: Kevin Boos Authors: Vivek S. Pai, Mohit Aron, et al. Rice University.
Sockets vs. RDMA Interface over 10-Gigabit Networks: An In-depth Analysis of the Memory Traffic Bottleneck Pavan Balaji  Hemal V. Shah ¥ D. K. Panda 
P. Balaji, S. Bhagvat, D. K. Panda, R. Thakur, and W. Gropp
Towards Autonomic Hosting of Multi-tier Internet Services Swaminathan Sivasubramanian, Guillaume Pierre and Maarten van Steen Vrije Universiteit, Amsterdam,
© 2013 Mellanox Technologies 1 NoSQL DB Benchmarking with high performance Networking solutions WBDB, Xian, July 2013.
Supporting Strong Cache Coherency for Active Caches in Multi-Tier Data-Centers over InfiniBand S. Narravula, P. Balaji, K. Vaidyanathan, S. Krishnamoorthy,
Networking Virtualization Using FPGAs Russell Tessier, Deepak Unnikrishnan, Dong Yin, and Lixin Gao Reconfigurable Computing Group Department of Electrical.
Hybrid Prefetching for WWW Proxy Servers Yui-Wen Horng, Wen-Jou Lin, Hsing Mei Department of Computer Science and Information Engineering Fu Jen Catholic.
Designing Efficient Systems Services and Primitives for Next-Generation Data-Centers K. Vaidyanathan, S. Narravula, P. Balaji and D. K. Panda Network Based.
Dynamic and Decentralized Approaches for Optimal Allocation of Multiple Resources in Virtualized Data Centers Wei Chen, Samuel Hargrove, Heh Miao, Liang.
Design and Implement an Efficient Web Application Server Presented by Tai-Lin Han Date: 11/28/2000.
1 Design and Performance of a Web Server Accelerator Eric Levy-Abegnoli, Arun Iyengar, Junehwa Song, and Daniel Dias INFOCOM ‘99.
Global NetWatch Copyright © 2003 Global NetWatch, Inc. Factors Affecting Web Performance Getting Maximum Performance Out Of Your Web Server.
High Performance User-Level Sockets over Gigabit Ethernet Pavan Balaji Ohio State University Piyush Shivam Ohio State University.
1 Configurable Security for Scavenged Storage Systems NetSysLab The University of British Columbia Abdullah Gharaibeh with: Samer Al-Kiswany, Matei Ripeanu.
Profile Driven Component Placement for Cluster-based Online Services Christopher Stewart (University of Rochester) Kai Shen (University of Rochester) Sandhya.
Sensitivity of Cluster File System Access to I/O Server Selection A. Apon, P. Wolinski, and G. Amerson University of Arkansas.
1 Specification and Implementation of Dynamic Web Site Benchmarks Sameh Elnikety Department of Computer Science Rice University.
Amy Apon, Pawel Wolinski, Dennis Reed Greg Amerson, Prathima Gorjala University of Arkansas Commercial Applications of High Performance Computing Massive.
Architectural Characterization of an IBM RS6000 S80 Server Running TPC-W Workloads Lei Yang & Shiliang Hu Computer Sciences Department, University of.
Architectural Characterization of an IBM RS6000 S80 Server Running TPC-W Workloads Lei Yang & Shiliang Hu Computer Sciences Department, University of.
A Measurement Based Memory Performance Evaluation of High Throughput Servers Garba Isa Yau Department of Computer Engineering King Fahd University of Petroleum.
Performance Prediction for Random Write Reductions: A Case Study in Modelling Shared Memory Programs Ruoming Jin Gagan Agrawal Department of Computer and.
Impact of High Performance Sockets on Data Intensive Applications Pavan Balaji, Jiesheng Wu, D.K. Panda, CIS Department The Ohio State University Tahsin.
Architecture for Caching Responses with Multiple Dynamic Dependencies in Multi-Tier Data- Centers over InfiniBand S. Narravula, P. Balaji, K. Vaidyanathan,
A Data Communication Reliability and Trustability Study for Cluster Computing Speaker: Eduardo Colmenares Midwestern State University Wichita Falls, TX.
Adaptive Web Caching CS411 Dynamic Web-Based Systems Flying Pig Fei Teng/Long Zhao/Pallavi Shinde Computer Science Department.
Server Performance, Scaling, Reliability and Configuration Norman White.
Infiniband Bart Taylor. What it is InfiniBand™ Architecture defines a new interconnect technology for servers that changes the way data centers will be.
PROP: A Scalable and Reliable P2P Assisted Proxy Streaming System Computer Science Department College of William and Mary Lei Guo, Songqing Chen, and Xiaodong.
Improving Disk Throughput in Data-Intensive Servers Enrique V. Carrera and Ricardo Bianchini Department of Computer Science Rutgers University.
Sockets Direct Protocol Over InfiniBand in Clusters: Is it Beneficial? P. Balaji, S. Narravula, K. Vaidyanathan, S. Krishnamoorthy, J. Wu and D. K. Panda.
Modeling Billion-Node Torus Networks Using Massively Parallel Discrete-Event Simulation Ning Liu, Christopher Carothers 1.
CERN - IT Department CH-1211 Genève 23 Switzerland t High Availability Databases based on Oracle 10g RAC on Linux WLCG Tier2 Tutorials, CERN,
MiddleMan: A Video Caching Proxy Server NOSSDAV 2000 Brian Smith Department of Computer Science Cornell University Ithaca, NY Soam Acharya Inktomi Corporation.
An Efficient Threading Model to Boost Server Performance Anupam Chanda.
Ohio State University Department of Computer Science and Engineering Servicing Range Queries on Multidimensional Datasets with Partial Replicas Li Weng,
Scalable Data Scale #2 site on the Internet (time on site) >200 billion monthly page views Over 1 million developers in 180 countries.
A Scalable Distributed Datastore for BioImaging R. Cai, J. Curnutt, E. Gomez, G. Kaymaz, T. Kleffel, K. Schubert, J. Tafas {jcurnutt, egomez, keith,
GPFS: A Shared-Disk File System for Large Computing Clusters Frank Schmuck & Roger Haskin IBM Almaden Research Center.
LIOProf: Exposing Lustre File System Behavior for I/O Middleware
A Practical Performance Analysis of Stream Reuse Techniques in Peer-to-Peer VoD Systems Leonardo B. Pinho and Claudio L. Amorim Parallel Computing Laboratory.
Abhinav Kamra, Vishal Misra CS Department Columbia University
Distributed Network Traffic Feature Extraction for a Real-time IDS
Authors: Sajjad Rizvi, Xi Li, Bernard Wong, Fiodar Kazhamiaka
Memory Management for Scalable Web Data Servers
Introduction to client/server architecture
Introduction to Cloud Computing
Overview Introduction VPS Understanding VPS Architecture
Li Weng, Umit Catalyurek, Tahsin Kurc, Gagan Agrawal, Joel Saltz
Presentation transcript:

Workload-driven Analysis of File Systems in Shared Multi-Tier Data-Centers over InfiniBand K. Vaidyanathan P. Balaji H. –W. Jin D.K. Panda Network-Based Computing Laboratory Department of Computer Science and Engineering The Ohio State University

Presentation Outline Introduction and Background Characterization of local and network- based file systems Multi File System for Data-Centers Experimental Results Conclusions

Introduction Exponential growth of Internet –Primary means of electronic interaction –Online book-stores, World-cup scores, Stock markets –Ex. Google, Amazon, etc Highly Scalable and Available Web-Services Performance is critical for such Services Utilizing Clusters for Web-Services? [shah01] –High Performance-to-cost ratio –Has been proposed by Industry and Research Environments [shah01]: CSP: A Novel System Architecture for Scalable Internet and Communication Services. H. V. Shah, D. B. Minturn, A. Foong, G. L. McAlpine, R. S. Madukkarumukumana and G. J. Regnier In USITS 2001

Cluster-Based Data-Centers Nodes are logically partitioned –provides specific services (serving static and dynamic content) –Use high speed interconnects like InfiniBand, Myrinet, etc. Requests get forwarded through multiple tiers Replication of content on all nodes Proxy Server Web Server (Apache) Application Server (PHP) Database Server (MySQL) WAN Clients Storage

Shared Cluster-Based Data-Centers Hosting several unrelated services on a single data-center –Currently used by several ISPs and Web Service Providers (IBM, HP) Replication of content –Amount of data replicated increases linearly with the number of web- sites hosted Proxy Server Web Server Application Server Database Server WAN Clients Website B Website C Website A } } } ABCABC ABCABC ABCABC Storage

Issues in Shared Cluster-Based Data-Centers File System Caches being shared across multiple web-sites Under-utilization of aggregate cache of all nodes Web-site Content –Replication of content on all nodes if we use local file system –Need to fetch the document via network if we use network file system, however no replication required Can we adapt the file system to avoid these?

File System Interactions Proxy Server SAN Web Server Application Server Database Server Network-based File Systems Local file system Data-Center Interaction File System Interaction Local file system

Existing File Systems Network-based File System: Parallel Virtual File System (PVFS) and Lustre (supports client-side caching) Local File System: ext3fs and memory file system (ramfs) compute node SAN Web Server Local file system Metadata Manager I/O(OST) Node I/O(OST) Node Meta Data compute node Client-side Cache Server-side Cache

Presentation Outline Introduction and Background Characterization of local and network- based file systems Multi File System for Data-Centers Experimental Analysis Conclusions

Characterization of local and network-based File Systems Network Traffic Requirements Aggregate Cache Cache Pollution Effects

Network Traffic Requirements Absolute Network Traffic generated –Static Content –Dynamic Content Network Utilization –Large/Small burst (static or dynamic content) Overhead of Metadata Operations

Aggregate Cache in Data-Centers Local File Systems use only single node’s cache –Small files get huge benefits, if in memory. Otherwise, we pay a penalty of accessing the disk –Large Files may not fit in memory and also have high penalties in accessing the disk Network File Systems use aggregate cache from all nodes –Large Files, if striped, can reside in file system cache on multiple nodes –Small files also get benefits due to aggregate cache

Cache Pollution Effects Working set – frequently accessed documents; usually fits in memory Shared Data-Centers –Multiple web-sites share the file system cache; each website has lesser amount of file system cache to utilize –Bursts of requests/accesses to one web-site may result in cache pollution –May result in drastic drop in the number of cache hits

Presentation Outline Introduction and Background Characterization of local and network- based file systems Multi File System for Data-Centers Experimental Results Conclusions

Multi File System for Data-Centers Characterization ext3fsramfspvfslustre Network Traffic generated Min More traffic Min Use of Aggregate Cache No Yes Cache pollution effects YesNoYes Metadata overhead No Yes

Multi File System for Data-Centers A combination of file systems for different environments Memory file system and local file system (ext3fs) for workloads with high temporal locality Memory file system and network file system (pvfs/lustre) for workloads with low temporal locality

Presentation Outline Introduction and Background Characterization of local and network- based file systems with data-centers Multi File System for Data-Centers Experimental Results Conclusions

Experimental Test-bed Cluster 1 with: –8 SuperMicro SUPER X5DL8-GG nodes; Dual Intel Xeon 3.0 GHz processors –512 KB L2 Cache, 2 GB memory; PCI-X 64 bit 133 MHz Cluster 2 with: –8 SuperMicro SUPER P4DL6 nodes; Dual Intel Xeon 2.4 GHz processors –512 KB L2 Cache, 512 MB memory; PCI-X 64 bit 133 MHz Mellanox MT23108 Dual Port 4x HCAs; MT port switch Apache Web and PHP Servers; MySQL , PVFS 1.6.2, Lustre 1.0.4

Workloads Zipf workloads: the relative probability of a request for the i th most popular document is proportional to 1/i  with   1 –High Temporal locality (constant  ) –Low Temporal locality (varying  ) TPC-W traces according to the specifications ClassFile SizesSize Class 01K – 250K25 MB Class 11K – 1MB100 MB Class 21K – 4MB450 MB Class 31K – 16MB2 GB Class 41K – 64MB6 GB

Experimental Analysis (Outline) Basic Performance of different file systems Network Traffic Requirements Impact of Aggregate Cache Cache Pollution Effects Multi File System for Data-Centers

Basic Performance Network File Systems incur high overhead for metadata operations (open() and close()) Lustre supports client-side cache For large files, network-based file system does better than local file system due to striping of the file Latencyext3fs (usecs) ramfs (usecs) pvfs (usecs) lustre (usecs) 4K1M4K1M4K1M4K1M Open & Close overhead Read Latency (cache) Read Latency (no cache)

Network Traffic Requirements Absolute Network Traffic Generated: –Increases proportionally compared to the local file system for PVFS –For Lustre, the traffic is close to that of the local file system –For dynamic content, the network traffic does not increase with increase in database size

Impact of Caching and Metadata operations Local File Systems are better for workloads with high temporal locality Surprisingly Lustre performs comparable with local file systems

Impact of Aggregate Cache Aggregate Cache improves data-center performance for network-based file systems

Cache Pollution Effects in Shared Data-Centers Small Workloads, web-sites are not affected Large Workloads, cache pollution affects multiple web-sites Placing files on memory file system might avoid the cache pollution effects

Multi File System Data-Centers Performance benefits for static content is close to 48% Performance benefits for dynamic content is close to 41%

Multi File System Data-Centers Benefits are two folds: –Avoidance of Cache Pollution –Reduced overhead of open() and close() operations for small files

Conclusions & Future Work Fragmentation of resources in shared data-Centers –Under-utilization of file system cache in clusters –Cache Pollution affects performance Studied the impact of file systems in terms of network traffic, aggregate cache and cache pollution effects Proposed a Multi File System approach to utilize the benefits from each file system –Combination of Network and Memory File System for static content with low temporal locality –Memory File System and local file system for static content with high temporal locality and dynamic content Propose to perform dynamic reconfiguration based on each node’s memory cache and provide prioritization and QoS

Web Pointers NOWLAB