Grid Computing – Issues in Data grids and Solutions Sudhindra Rao.

Presentation transcript:

Grid Computing – Issues in Data grids and Solutions Sudhindra Rao

Slide 2 (Grid Computing, OSCAR Lab): Outline
- Grid computing: introduction
- Computational grids
- Data grids
- Data management
- Related work
- Technologies: JavaSpaces, OceanStore
- Our research plan
- Discussion

Slide 3: What is grid computing?
- Use a network of PCs
- Faster networks, cheaper PCs, a lot of idle time
- Easy to build, maintain, scale
- Generic solution for scientific and business problems alike
- Some form of grid computing: Argonne National Lab, Google, etc.

Slide 4: Why today?
- Capabilities: security, manageability, agility
- Goals: efficiency, profitability, control
- Other diagram labels: uncertainty, complexity, distribution, new opportunities, world events, market dynamics, maturing technology, grid computing

Slide 5: Applications – data grids
- Geographic distribution of data
- Computations on large-scale data
Application areas: compute-intensive analytics; OLAP data analysis; data center operations; compute utility services
Examples shown in the diagram:
- Value at risk
- Credit risk
- Real-time risk management
- Automated trade programs
- Anti-money laundering
- Credit card (risk and customer data mining)
- Billing
- In-process system migration
- High fault tolerance
- Geographic data center independence for failover and business applications
- Data center compute farms
- Corporate compute utility services creating a low-cost infrastructure similar to the electric grid

Slide 6: Evolution of distributed computing
Diagram labels: file sharing, CORBA, data translation, data queues, publish/subscribe, smart routing, pipes/sockets, clusters, data grids, utility service, middleware, client/server, grid computing

Slide 7: Compute grid
- Distributed pool of resources
- Completing a task for a user
- User requests and reserves resources
- Some kind of middleware manages resources and tasks
- Resilient and fault tolerant

Slide 8: Data grid
- Client and server joined by a network pipe with 1-1 connectivity
- Business app server, client, server and data storage
- Compute grid: coordinating set of tasks
- Multiple applications/worker threads accessing a single datastore

Slide 9: Data grid – eliminates data access bottlenecks
- Data storage
- Compute grid: coordinating set of tasks
- Data grid: manages data

Slide 10: Data grid architecture
- Mechanism neutrality
- Policy neutrality
- Compatibility with compute grid
- Uniformity with information infrastructure
Services:
- Storage service
- Grid storage API
- Metadata service
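
The three services named on this slide can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the class names, method names and the "storage/logical" naming scheme are hypothetical and are not taken from the slides or from any real grid toolkit. The sketch only shows a uniform API resolving a logical name through a metadata catalog and then fetching bytes from whichever storage system holds a replica.

```python
class StorageService:
    """Hypothetical storage system: keeps byte strings under physical names."""

    def __init__(self, name):
        self.name = name
        self._objects = {}

    def put(self, physical_name, data):
        self._objects[physical_name] = bytes(data)

    def get(self, physical_name):
        return self._objects[physical_name]


class MetadataService:
    """Hypothetical catalog: logical name -> list of (storage name, physical name)."""

    def __init__(self):
        self._replicas = {}

    def register(self, logical_name, storage_name, physical_name):
        self._replicas.setdefault(logical_name, []).append(
            (storage_name, physical_name)
        )

    def locate(self, logical_name):
        return list(self._replicas.get(logical_name, []))


class GridStorageAPI:
    """Hypothetical uniform front end over several storage services."""

    def __init__(self, metadata, stores):
        self._metadata = metadata
        self._stores = {store.name: store for store in stores}

    def write(self, logical_name, data, storage_name):
        # The physical naming scheme is an assumption of this sketch.
        physical_name = f"{storage_name}/{logical_name}"
        self._stores[storage_name].put(physical_name, data)
        self._metadata.register(logical_name, storage_name, physical_name)

    def read(self, logical_name):
        replicas = self._metadata.locate(logical_name)
        if not replicas:
            raise KeyError(logical_name)
        # Use the first registered replica; a real system would choose by policy.
        storage_name, physical_name = replicas[0]
        return self._stores[storage_name].get(physical_name)
```

Callers only ever use logical names, so the storage mechanism behind each replica can change without affecting them, which is one way to read "mechanism neutrality".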

Slide 11: Data grid architecture
Expectations:
- Coordination between compute and data grid
- Data delivery to facilitate task and resource management
- Sharing data distribution and location information
- Leveraging data locality
Guarantees:
- Dependability
- Consistency
- Pervasiveness
- Security
- Inexpensive
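
As a rough illustration of "sharing data distribution and location information" and "leveraging data locality", the sketch below places each task on the node that already holds most of its inputs and breaks ties by current load. The function names, the replica map format and the tie-breaking rule are assumptions of this sketch, not something the slides specify.

```python
def pick_node(task_inputs, replica_map, load):
    """Return the node holding the most inputs; ties go to the least loaded node."""

    def score(node):
        local = sum(
            1 for item in task_inputs if node in replica_map.get(item, ())
        )
        # Sort key: more local inputs first, then lower load, then name.
        return (-local, load[node], node)

    return min(load, key=score)


def schedule(tasks, replica_map, nodes):
    """Assign each task (task id -> list of input names) to a node."""
    load = {node: 0 for node in nodes}
    placement = {}
    for task_id, inputs in tasks.items():
        node = pick_node(inputs, replica_map, load)
        placement[task_id] = node
        load[node] += 1
    return placement
```

For example, with `replica_map = {"a": {"n1"}, "b": {"n2"}}`, a task reading `"a"` lands on `n1`, and a task with no inputs falls back to the least loaded node.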

Slide 12: Data delivery – QoS requirements
Axes: application complexity (work, time, data, transactional) against data grid QoS (Level 0, Level 1)
Application profiles (work / time / data / transactions):
- Batch / synchronous / static data / nontransactional
- Atomic / synchronous / static data / nontransactional
- Atomic / asynchronous / static data / nontransactional
- Atomic / asynchronous / dynamic data / nontransactional
- Atomic / synchronous / static data / transactional
- Atomic / asynchronous / dynamic data / transactional
- Atomic / asynchronous / static data / transactional
- Batch / synchronous / static data / transactional
Example applications: OLAP, real-time datamart, Monte Carlo simulation

Slide 13: Related work
- Grid File System: provides primitives like a file system; Level 0 QoS
- NFSv4: high performance, extensible, secure; in the works
- Secure File System: self-certifying paths, unique identifiers, global namespace, key-based certification

Slide 14: Technologies related to data grids – JavaSpaces
- "Make Room for JavaSpaces, Part 1: Ease the Development of Distributed Apps with JavaSpaces", Eric Freeman and Susanne Hupfer
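
JavaSpaces is built around a shared space in which processes write entries and later read or take entries that match a template. The sketch below imitates that idea in Python; it is not the JavaSpaces API. The class name, the use of dictionaries as entries, and `None` as a wildcard are assumptions of this sketch, and unlike the real operations, `read` and `take` here return immediately instead of blocking with a timeout.

```python
class TupleSpace:
    """Hypothetical in-memory space with write, read and take operations."""

    def __init__(self):
        self._entries = []

    def write(self, entry):
        # Store a copy so later changes by the writer do not leak in.
        self._entries.append(dict(entry))

    @staticmethod
    def _matches(entry, template):
        # A None value in the template acts as a wildcard for that field.
        return all(
            value is None or entry.get(key) == value
            for key, value in template.items()
        )

    def read(self, template):
        """Return a copy of the first matching entry, leaving it in the space."""
        for entry in self._entries:
            if self._matches(entry, template):
                return dict(entry)
        return None

    def take(self, template):
        """Remove and return the first matching entry, or None if none match."""
        for index, entry in enumerate(self._entries):
            if self._matches(entry, template):
                return self._entries.pop(index)
        return None
```

A worker could call `take({"kind": "task", "id": None})` to claim any pending task; because `take` removes the entry, no other worker in the same process sees it again.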

Slide 15: OceanStore
- Global replication of data
- Promiscuously caches data
- Version-based archival storage
- Applications can control their consistency requirements to manage performance
- Internal event monitors analyze access patterns to move data and provide redundancy
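
"Version-based archival storage" can be pictured as an append-only history in which every update creates a new version and earlier versions stay readable. The sketch below shows only that idea; the class and method names are hypothetical, and it leaves out replication, caching and everything else that makes up OceanStore itself.

```python
class VersionedObject:
    """Hypothetical append-only object: updates never overwrite old versions."""

    def __init__(self):
        self._versions = []

    def update(self, data):
        """Append a new version and return its 1-based version number."""
        self._versions.append(bytes(data))
        return len(self._versions)

    def read(self, version=None):
        """Return the given version, or the latest one when version is None."""
        if not self._versions:
            raise LookupError("object has no versions yet")
        if version is None:
            return self._versions[-1]
        if not 1 <= version <= len(self._versions):
            raise LookupError(f"no such version: {version}")
        return self._versions[version - 1]
```

Because old versions are immutable, a reader holding version 1 is unaffected by later updates, which is what makes stale cached copies safe to serve for archival reads.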

Slide 16: Grid Fabric – Integrasoft
- Business solution provided for financial institutions and share traders
- Designed to complement the compute grid
- Works closely with the compute grid to schedule tasks based on data availability
- Moves data closer to computation

Slide 17: SOA and data grids
Diagram labels: web services, business process, data grid, state; relations: delivers, has, requires
- Moore's law and Metcalfe's law
- Network-based computation and grid computing with SOA
- Intelligent infrastructure: SONA

Slide 18: Web 2.0

Slide 19: Our research – Motivation
Issues in data management:
- Data tightly coupled to computation
- Data cached locally
- Distribution is haphazard and reuse is minimal
- Data pulled by computation, not delivered
- Mechanisms still improvise based on experience on smaller systems

Slide 20: Data grid and DBMS
Grid DBMS: security, transparency, robustness, efficiency, intelligence, fragmentation, heterogeneity
DBMS compared with data regions:
- Schema: tables (DBMS); ordered structure (data regions)
- Events: triggers (DBMS)
- Optimizations: stored procedures (DBMS); distributed procedures (data regions)
- Indexing: intra-table fields (DBMS); cross-structure (data regions)
- Locking: table/row level (DBMS); data atom level (data regions)
- Relation: table joins (DBMS); data atom (data regions)
- Query: SQL (DBMS); programmatic string base (data regions)
- Repeated data access: indexes (DBMS); tags (data regions)

Slide 21: Data grids as extended DBMS
- Data grid: eliminates data access bottlenecks
- Persistence mechanism with data regions
- Data storage indicates replicas, relations

Slide 22: Datacentric grids
- Automated space management and garbage collection
- Space and data objects lifetime mechanism
- I/O allocation on storage system
- Estimating access from magnetic storage
- Co-scheduling of compute and storage resources
- Space reservation dilemma
- Thin clients
- Code mobility towards data
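
One possible reading of "space and data objects lifetime mechanism" combined with "automated space management and garbage collection" is a lease: each stored object reserves space for a fixed lifetime, and expired objects are reclaimed before new space is granted. The sketch below assumes that reading. The class, its methods and the explicit `now` argument (used instead of a real clock to keep the example deterministic) are all hypothetical.

```python
class SpaceManager:
    """Hypothetical storage space manager with per-object lifetimes."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._objects = {}  # name -> (size, expires_at)

    def used(self):
        return sum(size for size, _ in self._objects.values())

    def collect(self, now):
        """Garbage-collect objects whose lifetime has ended; return their names."""
        expired = sorted(
            name
            for name, (_, expires_at) in self._objects.items()
            if expires_at <= now
        )
        for name in expired:
            del self._objects[name]
        return expired

    def store(self, name, size, lifetime, now):
        """Reserve space for an object; return False if it does not fit."""
        self.collect(now)
        # Space already held under the same name is released by the overwrite.
        held = self._objects.get(name, (0, 0))[0]
        if self.used() - held + size > self.capacity:
            return False
        self._objects[name] = (size, now + lifetime)
        return True
```

A rejected `store` call is one face of the "space reservation dilemma": the caller must either wait for leases to expire or find space on another storage system.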

Slide 23: Expected results
- Can we move computation closer to data?
- Data grid with features of persistence?
- Performance improvement using tags?
- Loosely coupled data grid and compute grid?
- Scalability of unique naming in file systems?

Thank you!