Plethora: A Wide-Area Read-Write Storage Repository Design Goals, Objectives, and Applications Suresh Jagannathan, Christoph Hoffmann, Ananth Grama Computer.


Plethora: A Wide-Area Read-Write Storage Repository Design Goals, Objectives, and Applications Suresh Jagannathan, Christoph Hoffmann, Ananth Grama Computer Sciences, Purdue University.

Plethora: Design Goals ● To build a wide-area read-write object repository from semi-static peers for supporting a single seamless distributed storage resource. ● To support desirable features of end-user performance, global resource utilization, robustness, and application support.

Plethora: Motivation ● A number of applications require: – Large aggregate storage; – Supporting distributed access to data; – Collaborative operations on shared datasets; – Content-based retrieval; – Distributed services infrastructure; and – High degree of availability and robustness. Such applications motivate design decisions in Plethora.

Trends in Storage Software

Sample Applications: GriPhyN ● The Grid Physics Network (GriPhyN) is a classic example of an application in which a large dataset is accessed by many people. ● Data is generated at the rate of roughly 1 PB/year in the form of high-energy physics experiment readouts (each experiment corresponds to roughly a MB of data). ● Researchers across the world access selected experiments.
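A quick back-of-the-envelope check of the figures on this slide, assuming decimal units (1 PB = 10^15 bytes, 1 MB = 10^6 bytes) and a 365-day year. The constant names are illustrative and do not come from the slides.

```python
# Assumed decimal units; the slide only gives "roughly 1 PB/year" and "roughly a MB".
PB = 10**15
MB = 10**6
SECONDS_PER_YEAR = 365 * 24 * 3600

# Number of ~1 MB experiment readouts produced per year.
experiments_per_year = PB // MB

# Sustained ingest rate needed to absorb 1 PB over one year.
bytes_per_second = PB / SECONDS_PER_YEAR

print(f"{experiments_per_year:.0e} experiments/year")
print(f"{bytes_per_second / MB:.1f} MB/s sustained")
```

Under these assumptions the repository has to index on the order of a billion objects per year while absorbing a few tens of MB per second continuously.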

Sample Applications: GriPhyN ● Tier 0 is the data source (in this case CERN), tier 1 is a national center (Fermilab), tier 2 consists of regional centers, tier 3 consists of workgroup servers, and tier 4 consists of individual desktops.

Sample Application: Collaborative Design ● The volume of data associated with typical product lifecycle stages (concept, design, analysis, manufacturing, and field support) grows exponentially. ● At the same time, the need for effective data access, sharing, capture, and protection becomes increasingly important. ● Scalable and distributed solutions to these problems are critical components of PLM.

Sample Application: Collaborative Design ● Desirable Characteristics: – Complexity and Interoperability – Distributed Design Collaboration – Reuse and Versioning – Availability – Performance

Collaborative Design: State-of-the-art Source:

Collaborative Design: State-of-the-art ● The client-server model is ideal for local-area environments but does not scale to a larger number of installations. ● Availability relies on conventional mechanisms such as snapshots, which do not facilitate real-time recovery or account for network failures. ● Minimal client-side support for end-user performance. ● Little or no support for content-based location, application-specific consistency mechanisms, or versioning techniques.

Plethora: System Overview ● Plethora Routing Core: Routing data requests to appropriate sites. ● Robustness: Novel erasure coding schemes. ● Versioning Semantics: Supporting read-write access efficiently and data reuse over wide-area networks. ● Content-based Location and Placement: Routing queries on content.

Plethora Routing Core ● Design goals: reliability, performance, end-user latency. – Locality-enhancing multi-level overlays of participating sites. – Efficient caching techniques for end-user latency. – Network maintenance via redundant overlay links and real-time monitoring and updates.

Plethora: Robustness ● Novel erasure coding techniques: – Conventional (n,m) techniques can reconstruct data if any m of n total data blocks can be accessed. – These techniques are resilient to multiple network and disk failures. – These techniques, however, have considerable communication and computing overhead for block updates, block reconstruction, and for reconstituting the code. – Plethora relies on novel codes that minimize these overheads.
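To make the conventional (n,m) idea concrete, here is a minimal sketch of the simplest such code: single XOR parity, where n = m + 1 and any m of the n blocks suffice to rebuild the data. This is a textbook illustration, not Plethora's novel code, which the slides do not describe. The function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data_blocks):
    """Return the m data blocks followed by one parity block (n = m + 1)."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(stored):
    """Rebuild the m data blocks when at most one of the n blocks is None."""
    missing = [i for i, block in enumerate(stored) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one lost block")
    stored = list(stored)
    if missing:
        survivors = [block for block in stored if block is not None]
        # The XOR of the m surviving blocks equals the lost block.
        stored[missing[0]] = xor_blocks(survivors)
    return stored[:-1]


def update_block(stored, index, new_block):
    """Overwrite one data block and patch the parity block to match."""
    parity = xor_blocks([stored[-1], stored[index], new_block])
    updated = list(stored)
    updated[index] = new_block
    updated[-1] = parity
    return updated


data = [b"abcd", b"efgh", b"ijkl"]
stored = encode(data)
damaged = list(stored)
damaged[1] = None  # simulate one failed site or disk
recovered = reconstruct(damaged)
```

Even in this tiny code, `update_block` must touch the parity block on every write and `reconstruct` must read every surviving block. Those are the update and reconstruction overheads the slide refers to, and they grow with codes that tolerate more failures.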

Plethora: Versioning Semantics ● Scaling to wide-area systems requires alternative semantics for concurrent data access. Plethora relies on versioning semantics to facilitate performance. – Each access is to a version of an object. – Updates to objects are not reflected globally unless they are committed. – The resulting version tree for each object can be reconciled in an application-specific manner. ● Versioning systems are well suited to high-latency environments with real-time applications. They also facilitate version-based data reuse.
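The three rules above can be sketched as a small in-memory model. This is an illustrative assumption about how such semantics might look, not Plethora's actual interface; `VersionedObject`, its methods, and the merge callback are all hypothetical names.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, List, Tuple


@dataclass
class Version:
    vid: int
    parents: Tuple[int, ...]  # empty for the root; several for a reconciled version
    data: str
    committed: bool = False


class VersionedObject:
    def __init__(self, initial: str):
        self._ids = count()
        root = Version(next(self._ids), (), initial, committed=True)
        self.versions = {root.vid: root}

    def read(self, vid: int) -> str:
        """Every access names a specific version."""
        return self.versions[vid].data

    def update(self, parent_vid: int, data: str) -> int:
        """Create an uncommitted child version; others do not see it yet."""
        version = Version(next(self._ids), (parent_vid,), data)
        self.versions[version.vid] = version
        return version.vid

    def commit(self, vid: int) -> None:
        self.versions[vid].committed = True

    def visible(self) -> List[int]:
        """Version ids that are globally visible (committed)."""
        return sorted(v.vid for v in self.versions.values() if v.committed)

    def heads(self) -> List[int]:
        """Committed versions that no committed version descends from."""
        committed = [v for v in self.versions.values() if v.committed]
        has_child = {p for v in committed for p in v.parents}
        return sorted(v.vid for v in committed if v.vid not in has_child)

    def reconcile(self, merge: Callable[[List[str]], str]) -> int:
        """Collapse all heads into one version using an application-supplied merge."""
        heads = self.heads()
        merged = Version(next(self._ids), tuple(heads),
                         merge([self.versions[h].data for h in heads]),
                         committed=True)
        self.versions[merged.vid] = merged
        return merged.vid


obj = VersionedObject("base")
a = obj.update(0, "base+A")  # two clients branch from version 0
b = obj.update(0, "base+B")
before_commit = obj.visible()
obj.commit(a)
obj.commit(b)
merged = obj.reconcile(lambda datas: " | ".join(datas))
```

Because writers never block each other, no wide-area locking round trip is needed on the write path. The cost moves into `reconcile`, where the application decides what a merge means.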

Plethora: Content-Based Location ● Content-based location is critical for supporting design applications. – Each data object has keys corresponding to searchable attributes installed in the Plethora routing core (keys are derived using conventional hashing techniques). – The routing core is then used to route queries generated at clients (using the same hash function) to locate data objects. – By giving applications the ability to install keys, Plethora supports a powerful content-based search capability.
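The install-then-query flow described above might look roughly like the following sketch. It assumes SHA-1 as the "conventional hashing technique" and a simple hash ring to pick the responsible site. Both are stand-ins, since the slides do not specify the hash function or the overlay structure, and `RoutingCore` and its methods are hypothetical names.

```python
import bisect
import hashlib


def key_for(attribute: str, value: str) -> int:
    """Derive a routing key from a searchable attribute (assumed: SHA-1)."""
    digest = hashlib.sha1(f"{attribute}={value}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


class RoutingCore:
    """Toy stand-in for the routing core: a hash ring over named sites."""

    def __init__(self, site_names):
        self._ring = sorted((key_for("node", name), name) for name in site_names)
        self._points = [point for point, _ in self._ring]
        self._index = {name: {} for name in site_names}

    def _owner(self, key: int) -> str:
        # First site clockwise from the key, wrapping around the ring.
        i = bisect.bisect_left(self._points, key) % len(self._ring)
        return self._ring[i][1]

    def install(self, object_id: str, attributes: dict) -> None:
        """Install one key per searchable attribute at the responsible site."""
        for attribute, value in attributes.items():
            key = key_for(attribute, value)
            self._index[self._owner(key)].setdefault(key, set()).add(object_id)

    def query(self, attribute: str, value: str) -> list:
        """Hash the query with the same function and ask the responsible site."""
        key = key_for(attribute, value)
        return sorted(self._index[self._owner(key)].get(key, set()))


core = RoutingCore(["site-a", "site-b", "site-c"])
core.install("wing.cad", {"material": "titanium", "stage": "design"})
core.install("strut.cad", {"material": "titanium"})
titanium_parts = core.query("material", "titanium")
```

Because clients and publishers use the same hash function, a query reaches the site holding the matching keys without any central directory.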

Plethora: Deliverables ● Fully functional Plethora client. ● Extensive system-level and application-level scaling studies and performance characterization (simulations and deployment). ● Sample applications demonstrating large storage capabilities, access performance, collaboration facilities, and mobile applications that maximize value for sponsor.