Autonomic aspects in cloud data management Alexandra Carpen-Amarie KerData.

Presentation transcript:

Autonomic aspects in cloud data management Alexandra Carpen-Amarie KerData

Contents
- Monitoring framework for BlobSeer
- Self-Adaptive data replication
- Dynamic Provider deployment
- BlobSeer as a Fair Data Storage Service
- BlobSeer as a storage backend for Cumulus

Monitoring framework for BlobSeer
[Diagram labels: Proxy service, Repository, Monitored data, Proxy Services, MonALISA Services, Monitored nodes]
Types of monitored nodes: Providers

Monitoring framework for BlobSeer
[Diagram labels: Monitoring Database, Monitored data, MonALISA Services, Monitored nodes]
Types of monitored nodes: Providers
Malicious clients detection: Mihaela Vlad (Master internship in the KerData team)

Monitoring framework for BlobSeer
[Diagram labels: Monitoring Database, Monitored data, MonALISA Services, Monitored nodes]
Types of monitored nodes: Providers
Monitoring for all BlobSeer components:
- Metadata Providers
- Version Manager
- Provider Manager

Self-Adaptive data replication
Lucian Cancescu (PUB student), PUB advisor: Alexandru Costan
Goals:
- Maintain the replication factor for each BLOB
- Automatically adapt the replication factor

Self-Adaptive data replication
BlobSeer current status:
- Specify a replication degree at BLOB creation
- Write operation: attempts to create all the needed replicas for each page
Failures:
- Do not affect read operations (at least 2 replicas)
- The initial replication degree is not restored
Advantage:
- Data is never updated, so replication is easy

Self-Adaptive data replication Maintaining the replication degree (1)

Self-Adaptive data replication Maintaining the replication degree (2)
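
The idea of restoring the replication degree after failures can be sketched roughly as follows. This is only an illustration under stated assumptions, not BlobSeer's actual mechanism: the function `plan_repairs`, the page and provider identifiers, and the choice of picking new providers in alphabetical order are all hypothetical.

```python
from typing import Dict, List, Set


def plan_repairs(page_replicas: Dict[str, Set[str]],
                 live_providers: Set[str],
                 target_degree: int) -> Dict[str, List[str]]:
    """Return, for each under-replicated page, the providers to copy it to.

    Hypothetical sketch: page_replicas maps a page id to the set of
    providers believed to hold a replica of that page.
    """
    plan: Dict[str, List[str]] = {}
    for page_id, holders in page_replicas.items():
        alive = holders & live_providers          # replicas that survived
        missing = target_degree - len(alive)
        if missing <= 0 or not alive:
            # Either the page is healthy, or no surviving copy exists
            # to restore from, so nothing can be planned here.
            continue
        # Assumption: any live provider without a copy is a valid target.
        candidates = sorted(live_providers - alive)
        plan[page_id] = candidates[:missing]
    return plan
```

Because data is never updated, a repair in this sketch amounts to copying an immutable page, with no need to coordinate with concurrent writers.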

Self-adaptive data replication
Adapting the replication degree
- Use monitoring information
  - Number of accesses per BLOB
  - Disk space
  - Memory
  - Network
  - User-defined metrics
- Increase/decrease the replication degree automatically
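
One possible way to turn monitoring information into a replication decision is a simple threshold rule. The sketch below is an assumption for illustration only: the function name `adapt_degree`, every threshold value, and the bounds on the degree are hypothetical and do not come from the presentation.

```python
def adapt_degree(current: int,
                 accesses_per_hour: float,
                 free_disk_fraction: float,
                 min_degree: int = 2,        # assumed lower bound
                 max_degree: int = 5,        # assumed upper bound
                 hot: float = 1000.0,        # assumed "heavily accessed" level
                 cold: float = 10.0,         # assumed "rarely accessed" level
                 min_free_disk: float = 0.2  # assumed disk headroom needed
                 ) -> int:
    """Return the new replication degree for one BLOB (hypothetical rule)."""
    if accesses_per_hour >= hot and free_disk_fraction >= min_free_disk:
        # Popular BLOB and enough disk space: add one replica.
        return min(current + 1, max_degree)
    if accesses_per_hour <= cold:
        # Rarely accessed BLOB: drop one replica, but keep the minimum.
        return max(current - 1, min_degree)
    return current
```

Changing the degree by one step per evaluation keeps the behaviour gradual; other metrics from the list (memory, network, user-defined) could be added as further parameters in the same style.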

Dynamic Provider deployment
Alexandru Palade (PUB student), PUB advisor: Alexandru Costan
Goal: Enable BlobSeer to scale up and down automatically

Dynamic Provider deployment
Motivation
- Cloud Computing: pay-per-use model
- Optimize resource consumption
Challenges
- Finding the optimal number of resources
- Maintaining data integrity when scaling down

Dynamic Provider deployment
Dynamic Deployment Module:
- Compute a score for each provider
- Enable or disable providers

Dynamic Provider deployment
Heuristics for computing the providers’ score
- Factors
  - Physical factors (storage space, bandwidth usage)
  - BlobSeer-specific factors (number of accesses)
- Weights associated with factors
- Decision based on thresholds
Framework for specifying the scenarios that define the scoring algorithm
- Flexible
  - Select factors
  - Define conditions for the factors’ values
  - Time interval
- Extensible
  - Define new scenarios

Dynamic Provider deployment
Example of scenario:
- free disk space is above the 70% threshold
- read access rate per time unit is small
- write access rate per time unit is small
=> The provider can be shut down
Factor weights:
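
A scenario of this kind could be expressed as a list of weighted conditions whose satisfied weights are summed and compared with a threshold. The sketch below is a guess at the general shape, not the module's real interface: `Condition`, `score`, `can_shut_down`, the metric names, the weights (0.5, 0.25, 0.25), and the "small" access rate of fewer than 5 operations per minute are all hypothetical. Only the 70% free disk space threshold comes from the slide.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Condition:
    factor: str                      # name of the monitored metric
    weight: float                    # hypothetical weight of this factor
    holds: Callable[[float], bool]   # condition on the factor's value


def score(metrics: Dict[str, float], conditions: List[Condition]) -> float:
    """Sum the weights of all conditions satisfied by the metrics."""
    return sum(c.weight for c in conditions if c.holds(metrics[c.factor]))


# Scenario from the slide; weights and the "small" rate are assumptions.
SHUTDOWN_SCENARIO = [
    Condition("free_disk_fraction", 0.5, lambda v: v > 0.70),
    Condition("reads_per_minute", 0.25, lambda v: v < 5),
    Condition("writes_per_minute", 0.25, lambda v: v < 5),
]


def can_shut_down(metrics: Dict[str, float], threshold: float = 1.0) -> bool:
    """A provider may be shut down only if every condition holds."""
    return score(metrics, SHUTDOWN_SCENARIO) >= threshold
```

With a threshold equal to the total weight, all three conditions must hold; a lower threshold would let the heavier factors decide on their own, which is one way the weights could matter.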

BlobSeer as a Fair Data Storage Service
Mihai Mircea (PUB student), PUB advisor: Alexandru Costan
Goal: Enhance BlobSeer with fairness policies
- Web-service on top of BlobSeer

BlobSeer as a Fair Data Storage Service
Rapidshare-like functionality
- Reward users that add data to the system
- Penalize users that just collect data
Flexible policies:
- Rewards:
  - Priorities for upload/download
  - Increased storage space
- Penalties:
  - Download delays
  - Access restrictions
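
A fairness policy along these lines might be driven by each user's upload-to-download ratio. The following is a minimal sketch under assumptions: the `Policy` fields, the function `policy_for`, the ratio cut-offs (1.0 and 0.1), the 30-second delay, and the doubled quota are hypothetical values chosen for illustration, not figures from the presentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    priority: str            # transfer priority: "high", "normal" or "low"
    download_delay_s: int    # penalty: delay before a download starts
    quota_multiplier: float  # reward: factor applied to the storage quota


NEUTRAL = Policy("normal", 0, 1.0)


def policy_for(uploaded_bytes: int, downloaded_bytes: int) -> Policy:
    """Pick a policy from a user's transfer history (hypothetical rule)."""
    if downloaded_bytes == 0:
        # No downloads yet: reward contributors, stay neutral for new users.
        return Policy("high", 0, 2.0) if uploaded_bytes > 0 else NEUTRAL
    ratio = uploaded_bytes / downloaded_bytes
    if ratio >= 1.0:
        # The user adds at least as much data as they collect.
        return Policy("high", 0, 2.0)
    if ratio < 0.1:
        # The user mostly collects data: apply a download delay.
        return Policy("low", 30, 1.0)
    return NEUTRAL
```

Access restrictions, the other penalty listed, could be modelled as an additional field on the same record.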

BlobSeer as a storage backend for Cumulus
Cumulus
- Nimbus storage cloud implementation
- Compatible with Amazon S3 interface
- Replaces the GridFTP-based VM repository
- Upload/download VMs using S3 tools
- Currently supports POSIX filesystems

BlobSeer as a storage backend for Cumulus
Integrating BlobSeer with Cumulus
- File namespace manager for BlobSeer
- Python bindings for BlobSeer
- Enable Cumulus to store data into BlobSeer
Enable BlobSeer as a VM repository
- Extend the BlobSeer backend to support VM uploads/downloads
- Work with EC2 AMI-tools
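
The role of a file namespace manager can be illustrated with a small in-memory mapping from S3-style paths to BLOB identifiers. This sketch does not use the real BlobSeer Python bindings or any Cumulus interface: the class `FileNamespace`, its methods, and the integer BLOB ids are all hypothetical stand-ins.

```python
import itertools
from typing import Dict, List, Optional


class FileNamespace:
    """Hypothetical mapping from file paths (bucket/key) to BLOB ids."""

    def __init__(self) -> None:
        self._paths: Dict[str, int] = {}
        self._next_id = itertools.count(1)  # stand-in for BLOB creation

    def create(self, path: str) -> int:
        """Register a new file and return the id of its backing BLOB."""
        if path in self._paths:
            raise FileExistsError(path)
        blob_id = next(self._next_id)
        self._paths[path] = blob_id
        return blob_id

    def lookup(self, path: str) -> Optional[int]:
        """Return the BLOB id for a path, or None if it is unknown."""
        return self._paths.get(path)

    def delete(self, path: str) -> int:
        """Forget a path and return its BLOB id (KeyError if missing)."""
        return self._paths.pop(path)

    def list_prefix(self, prefix: str = "") -> List[str]:
        """List paths under a prefix, as an S3 bucket listing would."""
        return sorted(p for p in self._paths if p.startswith(prefix))
```

A backend built this way would translate each S3 upload or download of a VM image into a lookup followed by a write or read on the corresponding BLOB.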

Q&A