RobuSTore: Performance Isolation for Distributed Storage and Parallel Disk Arrays
Justin Burke, Huaxia Xia, and Andrew A. Chien
Department of Computer Science and Engineering and Center for Networked Systems, University of California, San Diego
OptIPuter

Supported in part by the National Science Foundation under awards NSF Cooperative Agreement ANI (OptIPuter), NSF CCR (VGrADS), NSF ACI, and NSF Research Infrastructure Grant EIA. Support from the UCSD Center for Networked Systems, BigBangwidth, and Fujitsu is also gratefully acknowledged.

Storage Systems in the OptIPuter Project

“Optical IP Computer” is a project to develop a powerfully distributed infrastructure that tightly couples computational, storage and visualization resources over optical DWDM networks using novel software.

[Figure: OptIPuter Software Architecture. Labels: OptIPuter Applications; DVC #1, DVC #2, DVC #3; Higher Level Grid Services; Security Models; Data Services: DWTP; Real-Time Objects; Grid and Web Middleware (Globus/OGSA/WebServices/J2EE); Node Operating Systems (Storage Systems); λ-configuration, Net Management; Layer 5: SABUL, RBUDP, Fast, GTP; Layer 4: XCP; Physical Resources]

Overview

Workload Assumptions
- Multi-terabyte datasets
- Interactive applications mean fewer benefits from prefetching.

Goals
- QoS Guarantees: Provide statistical guarantees about storage system performance
- Performance Isolation in Shared Environment: Minimize jitter in a competitive (shared) environment
- Performance Resilience to Failures: Ability to cope with node failures while still maintaining QoS guarantees
- High Performance: Match hardware speeds

RobuSTore for Distributed Storage

Design Approach
- Manages remote storage nodes in a SAN-like fashion.
- Additional capacity can be added independent of current configuration.
- Use of erasure codes allows us to achieve order independence of block retrieval.
- Storage nodes and file blocks are managed by a metadata server (MDS).
- The MDS is used to locate file blocks and provide user authentication.
- Achieves performance goals by exploiting parallelism.

From Traditional Storage Methods to Erasure Encoding

[Figure: two panels. Traditional Data Storage Methods: FILE, Segments, Striping and Replication. Erasure Encoded: FILE, Segments, Encoded Segments. Each panel lists High Performance, Fault Tolerance, Performance Isolation.]

RobuSTore for Parallel Drive Arrays

Exploit detailed knowledge of drive internals to further improve performance and performance isolation.

[Figure: Requested Segment, Candidate Blocks, Current Disk State, Set of Reads, Reconstruct Segment]

- Store segments of data files as encoded blocks
  - Use erasure codes to create choice freedom.
- Distribute encoded blocks across drive array
- When a segment is requested, identify candidate encoded blocks for retrieval
  - Use model of current disk state to optimize for head motion.
  - Erasure codes allow choice freedom of block retrieval.
- Reconstruct original data segment in device driver
  - Leverages disparity between host processing capabilities and disk speed.

Erasure Encoding
- Encoding creates interdependencies among the Encoded Segments.
- Any K of N Encoded Segments are sufficient to reconstruct the original file.

Design Approach
- Reconstructing the file from the first set of blocks returned from a large group of storage servers yields improvement in both latency and bandwidth.
- The ability to reconstruct the file from any sufficiently large set of blocks (any K of N) yields robust and isolated performance.

[Figure: Received Blocks and Reconstructed Blocks, shown twice, with status labels "Complete!" and "Delayed!"]

[Figure labels: Grid with slow network; Lambda Grids]
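
The "any K of N" property and the idea of rebuilding a segment from whichever blocks arrive first can be illustrated with a small sketch. The slides do not say which erasure code RobuSTore uses, so this example swaps in a toy Reed-Solomon-style polynomial code over the prime field GF(257), chosen only because it is short and easy to check. The function names, the field size, K = 9, N = 16, and the random "arrival delay" model are all assumptions made for illustration, not details from the presentation.

```python
import random

P = 257  # prime just above the byte range; an illustrative choice, not from the slides


def encode(data, n):
    """Treat the K bytes of `data` as polynomial coefficients (lowest degree
    first) and evaluate the polynomial at x = 1..n, giving n blocks (x, y)."""
    blocks = []
    for x in range(1, n + 1):
        y = 0
        for coeff in reversed(data):  # Horner's rule, highest degree first
            y = (y * x + coeff) % P
        blocks.append((x, y))
    return blocks


def decode(blocks, k):
    """Recover the K coefficients from any K distinct blocks using
    Lagrange interpolation over GF(P)."""
    if len(blocks) < k:
        raise ValueError("need at least k blocks to reconstruct")
    pts = blocks[:k]
    coeffs = [0] * k
    for i, (xi, yi) in enumerate(pts):
        num = [1]   # coefficients of prod_{j != i} (x - xj), lowest degree first
        denom = 1   # prod_{j != i} (xi - xj)
        for j, (xj, _) in enumerate(pts):
            if j == i:
                continue
            nxt = [0] * (len(num) + 1)
            for d, c in enumerate(num):  # multiply num by (x - xj)
                nxt[d] = (nxt[d] - c * xj) % P
                nxt[d + 1] = (nxt[d + 1] + c) % P
            num = nxt
            denom = (denom * (xi - xj)) % P
        scale = yi * pow(denom, -1, P) % P  # modular inverse (Python 3.8+)
        for d in range(k):
            coeffs[d] = (coeffs[d] + scale * num[d]) % P
    return coeffs


def first_k_to_arrive(blocks, k, rng):
    """Hypothetical retrieval model: every storage node answers after a
    random delay, and the reader keeps the first k blocks that arrive."""
    delayed = sorted(((rng.random(), b) for b in blocks), key=lambda t: t[0])
    return [b for _, b in delayed[:k]]


segment = list(b"RobuSTore")          # K = 9 data symbols
K, N = len(segment), 16               # N - K = 7 redundant blocks
blocks = encode(segment, N)
arrived = first_k_to_arrive(blocks, K, random.Random(0))
recovered = bytes(decode(arrived, K))
print(recovered)
```

Because any K distinct evaluations determine a polynomial of degree K - 1, the reader never has to wait for a particular node: slow or failed nodes simply do not contribute to the first K arrivals. A production system would use a code designed for large blocks and fast decoding rather than this per-byte interpolation.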