Enabling Cross-Layer Optimizations in Storage Systems with Custom Metadata

Presentation transcript:

Enabling Cross-Layer Optimizations in Storage Systems with Custom Metadata
Elizeu Santos-Neto, Samer Al-Kiswany, Nazareno Andrade, Sathish Gopalakrishnan, Matei Ripeanu
University of British Columbia
Universidade Federal de Campina Grande

HPDC'08, http://netsyslab.ece.ubc.ca

Motivation
- ~160 exabytes of digital data created in 2006 [Reinsel07]
  - Harder to organize, search, or navigate
- Solution: custom metadata
  - User-generated metadata assigned to files
  - Used to improve navigation and search
  - Communication between applications
- Idea: what about exploiting custom metadata for cross-layer optimizations too?

Layered Architectures
- TCP/IP
- File system
- Good, but it limits information flow across layers.

Cross-Layer Optimizations
- Examples
  - HTTP (cache directives)
  - IP (optional fields)
- Applications → Storage System
  - Performance
  - QoS requirements
  - Consistency requirements
- Applications ← Storage System
  - Provide storage-level information to applications
- Data-intensive schedulers: notification about data movements
- Data-intensive applications: co-usage of files
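To make the HTTP example concrete: the `Cache-Control` header lets an application pass caching hints to intermediaries without those intermediaries understanding the payload. The sketch below is a simplified parser written for illustration only; the function name `parse_cache_control` is hypothetical, and it does not handle every case in the HTTP grammar (for instance, commas inside quoted values).

```python
def parse_cache_control(header_value):
    """Split a Cache-Control header value into a dict of directives.

    Simplified illustration: directives without a value map to True.
    """
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or True
    return directives


# The application layer attaches hints that a cache may act on.
directives = parse_cache_control("public, max-age=3600")
print(directives)
```

The point of the analogy is that the cache acts on a small, generic vocabulary of hints rather than on application-specific logic.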

Traditional Use of Custom Metadata
[Diagram: Application Layer, File System Layer, and Storage System Layer. Components shown: File Browser, Metadata Manager, File Organization Module, Basic File System. The file input.dat carries the custom metadata Author=Smith.]
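The traditional use shown on this slide, tagging input.dat with Author=Smith so that a file browser can find it, can be sketched as a small key/value store. The `MetadataManager` class and its methods are hypothetical names chosen for this illustration and live purely in memory; they are not the interface of any particular file system.

```python
from collections import defaultdict


class MetadataManager:
    """Hypothetical in-memory store of user-defined key/value pairs per file."""

    def __init__(self):
        self._attrs = defaultdict(dict)  # file name -> {key: value}

    def set(self, filename, key, value):
        self._attrs[filename][key] = value

    def get(self, filename, key, default=None):
        # Use .get on the outer dict so lookups do not create empty entries.
        return self._attrs.get(filename, {}).get(key, default)

    def find(self, key, value):
        """Return the sorted names of files tagged with key=value."""
        return sorted(f for f, kv in self._attrs.items() if kv.get(key) == value)


manager = MetadataManager()
manager.set("input.dat", "Author", "Smith")
manager.set("notes.txt", "Author", "Jones")  # second file is made up for the example

# A file browser would issue a query like this to navigate by author.
matches = manager.find("Author", "Smith")
print(matches)
```

In this traditional setting only applications read the tags; the storage layer stores them without interpreting them.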

Cross-Layer Communication
[Diagram: Application Layer, File System Layer, and Storage System Layer. Components shown: Metadata Manager, File Organization Module, Basic File System. Messages shown: "Replicate input.dat 3x"; "input.dat moved from node1 to node3"; "OK. Schedule task on node3".]
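The two message flows on this slide can be sketched with a toy model in which custom metadata is the only channel between layers. Everything here is an assumption made for illustration: the `StorageSystem` class, the `replication` and `location` keys, and the `schedule` function are hypothetical and do not describe the authors' implementation.

```python
class StorageSystem:
    """Toy storage layer that reads and writes custom metadata (hypothetical)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.metadata = {}  # file name -> {key: value}
        self.replicas = {}  # file name -> list of nodes holding a copy

    def create(self, filename, node):
        self.metadata[filename] = {"location": node}
        self.replicas[filename] = [node]

    def set_metadata(self, filename, key, value):
        """Application -> storage: a tag the storage layer may act on."""
        self.metadata[filename][key] = value
        if key == "replication":
            self._replicate(filename, int(value))

    def _replicate(self, filename, count):
        # Add copies on nodes that do not hold one until the target is met.
        holders = self.replicas[filename]
        for node in self.nodes:
            if len(holders) >= count:
                break
            if node not in holders:
                holders.append(node)

    def move(self, filename, src, dst):
        """Storage -> application: publish the new location as metadata."""
        holders = self.replicas[filename]
        holders[holders.index(src)] = dst
        self.metadata[filename]["location"] = dst


def schedule(storage, filename):
    """Scheduler places the task on the node the metadata points to."""
    return storage.metadata[filename]["location"]


storage = StorageSystem(["node1", "node2", "node3", "node4"])
storage.create("input.dat", "node1")

# Storage moves the file and records the new location as metadata.
storage.move("input.dat", "node1", "node3")
task_node = schedule(storage, "input.dat")

# The application asks for three replicas by tagging the file.
storage.set_metadata("input.dat", "replication", 3)
print(task_node, storage.replicas["input.dat"])
```

Neither side calls an application-specific API on the other: the application only writes a tag, and the scheduler only reads one.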

What are the anticipated gains?
- Application-agnostic cross-layer communication
- Opportunistic use of custom metadata
- New opportunities for usage-based optimization

Challenges
- Standardization is important
- Should keep the advantages of a layered design
- Conflicting metadata
- Implementation complexity and scalability
- Policy versus mechanism
- Incentive-compatible adoption
  - The lesson from IP optional fields

Summary and Future Work
- Cross-layer optimizations in storage systems with custom metadata
- Build the cross-layer communication mechanism
  - Target cluster-based file systems
  - e.g., FreeLoader, Lustre File System
- Perform an experimental evaluation

Questions

Use Cases
- Application → Storage System
  - Custom metadata describing co-usage of files
  - Caching mechanism can consider file bundling
- Storage System → Application
  - Custom metadata describing file location characteristics
  - Application scheduler can exploit this information
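The first use case, a cache that bundles co-used files, can be sketched as follows. This is a minimal illustration under stated assumptions: the `BundlingCache` class, the `co_used_with` metadata key, and the file names other than input.dat are hypothetical, and the cache never evicts.

```python
class BundlingCache:
    """Toy cache that prefetches files tagged as co-used (hypothetical)."""

    def __init__(self, metadata):
        self.metadata = metadata  # file name -> {"co_used_with": [file names]}
        self.cached = set()
        self.fetches = 0  # number of round trips to the backing store

    def read(self, filename):
        if filename in self.cached:
            return "hit"
        # On a miss, fetch the file together with its co-used bundle.
        bundle = [filename] + self.metadata.get(filename, {}).get("co_used_with", [])
        self.cached.update(bundle)
        self.fetches += 1
        return "miss"


# The application describes co-usage by tagging input.dat.
metadata = {"input.dat": {"co_used_with": ["params.cfg", "mesh.dat"]}}
cache = BundlingCache(metadata)

results = [cache.read(name) for name in ("input.dat", "params.cfg", "mesh.dat")]
print(results, cache.fetches)
```

With the co-usage tag, three reads cost one round trip to the backing store; without it, each file would miss separately.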

Design Considerations
- Modular and extendable: new mechanisms and attributes
- Interfaces should be stable
- Components (layers) should be confined