Implementation of Simple Cloud-based Distributed File System
Group ID: 4
Baolin Wu, Liushan Yang, Pengyu Ji

Motivation & Goals
Provide a layer between client companies and multiple cloud storage companies for data storage.
- Security: no single cloud storage company can get a complete view of any single file.
- Fault tolerance: a cloud service can sometimes fail or be delayed; keeping more copies on different services solves the problem.
- More storage: by combining several cloud services, clients can get more storage without extra cost.

System Architecture

What we have done
- Integrated the Google Storage and Dropbox APIs (with local API simulations during development).
- Each file is replicated N times across different cloud servers.
- The server is responsible for keeping file versions consistent in the cloud.
- Supports directory create/remove and file create/open/read/write/close.
- Download/assemble on open, upload/splice on close; reads and writes go to a local file.
- The server can migrate data when one cloud node is down.
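The replication step above can be illustrated with a minimal sketch. It assumes a simple round-robin placement policy, which the slides do not confirm; the names `place_blocks` and `surviving_copies` and the provider labels are hypothetical and chosen only for illustration.

```python
# Hypothetical sketch: round-robin placement of file blocks across cloud
# providers. The placement policy is an assumption, not the authors' method.

def place_blocks(num_blocks, providers, replicas):
    """Map each block id to `replicas` distinct providers."""
    if not 1 <= replicas <= len(providers):
        raise ValueError("replicas must be between 1 and the number of providers")
    placement = {}
    for block_id in range(num_blocks):
        # Start at a different provider for each block, then take the next
        # `replicas` providers in order, wrapping around the list.
        placement[block_id] = [
            providers[(block_id + k) % len(providers)] for k in range(replicas)
        ]
    return placement


def surviving_copies(placement, failed_provider):
    """Return, per block, the providers that still hold a copy after one fails."""
    return {
        block_id: [p for p in holders if p != failed_provider]
        for block_id, holders in placement.items()
    }


if __name__ == "__main__":
    providers = ["google_storage", "dropbox", "local_sim"]  # illustrative labels
    placement = place_blocks(num_blocks=4, providers=providers, replicas=2)
    for block_id, holders in placement.items():
        print(block_id, holders)
    # With 2 replicas, every block still has one copy if a single provider fails.
    print(surviving_copies(placement, "dropbox"))
```

With at least two replicas per block, losing one provider leaves a copy of every block elsewhere, which is the information a migration step would need to restore the replica count.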

Implementation Detail
Read/write workflow (Client, Master node, Cloud Storage):
1. Request read block ids
2. Copy to temporary file in cloud
3. Return file name and server ids
4. Transfer temporary file
5. Commit transaction
6. Delete temporary file
Block size & upload time: we choose 1 MB as the block size.
Folder is a special file: files in the cloud are all flat.
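The 1 MB block size can be illustrated with a short sketch of splitting a file into blocks before upload and reassembling it after download. The function names `split_blocks` and `assemble_blocks` are hypothetical; the slides do not show how the project implements this step.

```python
# Hypothetical sketch: split file contents into fixed-size blocks and
# reassemble them. Only the 1 MB block size comes from the slides.

BLOCK_SIZE = 1024 * 1024  # 1 MB, as chosen in the slides


def split_blocks(data, block_size=BLOCK_SIZE):
    """Cut a bytes object into consecutive blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def assemble_blocks(blocks):
    """Concatenate blocks, in order, back into the original bytes."""
    return b"".join(blocks)


if __name__ == "__main__":
    data = bytes(2 * BLOCK_SIZE + 512 * 1024)  # 2.5 MB of zero bytes
    blocks = split_blocks(data)
    print(len(blocks), [len(b) for b in blocks])
    print(assemble_blocks(blocks) == data)
```

Only the last block can be shorter than the block size, so the block count and the length of the final block are enough to recover the file size.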

Demo Result
- Files in cloud: 3 Google Storage, 2 replicas
- File system client: using FUSE server