A Reliable Memory-Centric Distributed Storage System a Haoyuan Li October 16 @ Strata & Hadoop World NYC Website: tachyon-project.org Meetup: www.meetup.com/Tachyon.

Slides:



Advertisements
Similar presentations
Matei Zaharia, in collaboration with Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Cliff Engle, Michael Franklin, Haoyuan Li, Antonio Lupher, Justin Ma,
Advertisements

Shark Hive SQL on Spark Michael Armbrust.
Berkley Data Analysis Stack Shark, Bagel. 2 Previous Presentation Summary Mesos, Spark, Spark Streaming Infrastructure Storage Data Processing Application.
UC Berkeley a Spark in the cloud iterative and interactive cluster computing Matei Zaharia, Mosharaf Chowdhury, Michael Franklin, Scott Shenker, Ion Stoica.
Mapreduce and Hadoop Introduce Mapreduce and Hadoop
MapReduce Online Created by: Rajesh Gadipuuri Modified by: Ying Lu.
EHarmony in Cloud Subtitle Brian Ko. eHarmony Online subscription-based matchmaking service Available in United States, Canada, Australia and United Kingdom.
Spark Streaming Large-scale near-real-time stream processing UC BERKELEY Tathagata Das (TD)
Spark Lightning-Fast Cluster Computing UC BERKELEY.
Spark in the Hadoop Ecosystem Eric Baldeschwieler (a.k.a. Eric14)
Matei Zaharia University of California, Berkeley Spark in Action Fast Big Data Analytics using Scala UC BERKELEY.
UC Berkeley Spark Cluster Computing with Working Sets Matei Zaharia, Mosharaf Chowdhury, Michael Franklin, Scott Shenker, Ion Stoica.
Berkeley Data Analytics Stack (BDAS) Overview Ion Stoica UC Berkeley UC BERKELEY.
Spark: Cluster Computing with Working Sets
Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael Franklin, Scott Shenker, Ion Stoica Spark Fast, Interactive,
Resource Management with YARN: YARN Past, Present and Future
Discretized Streams An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters Matei Zaharia, Tathagata Das, Haoyuan Li, Scott Shenker,
Shark Cliff Engle, Antonio Lupher, Reynold Xin, Matei Zaharia, Michael Franklin, Ion Stoica, Scott Shenker Hive on Spark.
Fast and Expressive Big Data Analytics with Python
Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Justin Ma, Murphy McCauley, Michael Franklin, Scott Shenker, Ion Stoica Spark Fast, Interactive,
Mesos A Platform for Fine-Grained Resource Sharing in Data Centers Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy.
CPS216: Advanced Database Systems (Data-intensive Computing Systems) How MapReduce Works (in Hadoop) Shivnath Babu.
AMPCamp Introduction to Berkeley Data Analytics Systems (BDAS)
Outline | Motivation| Design | Results| Status| Future
Next Generation of Apache Hadoop MapReduce Arun C. Murthy - Hortonworks Founder and Architect Formerly Architect, MapReduce.
Take An Internal Look at Hadoop Hairong Kuang Grid Team, Yahoo! Inc
Advanced Topics: MapReduce ECE 454 Computer Systems Programming Topics: Reductions Implemented in Distributed Frameworks Distributed Key-Value Stores Hadoop.
U.S. Department of the Interior U.S. Geological Survey David V. Hill, Information Dynamics, Contractor to USGS/EROS 12/08/2011 Satellite Image Processing.
Hadoop Ida Mele. Parallel programming Parallel programming is used to improve performance and efficiency In a parallel program, the processing is broken.
SparkR: Enabling Interactive Data Science at Scale
Cloud Distributed Computing Environment Content of this lecture is primarily from the book “Hadoop, The Definite Guide 2/e)
CS525: Special Topics in DBs Large-Scale Data Management Hadoop/MapReduce Computing Paradigm Spring 2013 WPI, Mohamed Eltabakh 1.
MapReduce: Hadoop Implementation. Outline MapReduce overview Applications of MapReduce Hadoop overview.
Introduction to Apache Hadoop Zibo Wang. Introduction  What is Apache Hadoop?  Apache Hadoop is a software framework which provides open source libraries.
Introduction to Hadoop and HDFS
f ACT s  Data intensive applications with Petabytes of data  Web pages billion web pages x 20KB = 400+ terabytes  One computer can read
Contents HADOOP INTRODUCTION AND CONCEPTUAL OVERVIEW TERMINOLOGY QUICK TOUR OF CLOUDERA MANAGER.
Tachyon: memory-speed data sharing Haoyuan (HY) Li, Ali Ghodsi, Matei Zaharia, Scott Shenker, Ion Stoica Good morning everyone. My name is Haoyuan,
1 CS 294: Big Data System Research: Trends and Challenges Fall 2015 (MW 9:30-11:00, 310 Soda Hall) Ion Stoica and Ali Ghodsi (
Mesos A Platform for Fine-Grained Resource Sharing in the Data Center Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony Joseph, Randy.
Introduction to Hadoop Programming Bryon Gill, Pittsburgh Supercomputing Center.
What does it mean to virtualize the Hadoop File System?
Resilient Distributed Datasets: A Fault- Tolerant Abstraction for In-Memory Cluster Computing Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave,
CS525: Big Data Analytics MapReduce Computing Paradigm & Apache Hadoop Open Source Fall 2013 Elke A. Rundensteiner 1.
Matei Zaharia, in collaboration with Mosharaf Chowdhury, Tathagata Das, Ankur Dave, Haoyuan Li, Justin Ma, Murphy McCauley, Joshua Rosen, Reynold Xin,
Hadoop/MapReduce Computing Paradigm 1 CS525: Special Topics in DBs Large-Scale Data Management Presented By Kelly Technologies
Cloud Distributed Computing Environment Hadoop. Hadoop is an open-source software system that provides a distributed computing environment on cloud (data.
INTRODUCTION TO HADOOP. OUTLINE  What is Hadoop  The core of Hadoop  Structure of Hadoop Distributed File System  Structure of MapReduce Framework.
1 Student Date Time Wei Li Nov 30, 2015 Monday 9:00-9:25am Shubbhi Taneja Nov 30, 2015 Monday9:25-9:50am Rodrigo Sanandan Dec 2, 2015 Wednesday9:00-9:25am.
BIG DATA/ Hadoop Interview Questions.
Resilient Distributed Datasets A Fault-Tolerant Abstraction for In-Memory Cluster Computing Matei Zaharia, Mosharaf Chowdhury, Tathagata Das, Ankur Dave,
Introduction to Hadoop Programming Bryon Gill, Pittsburgh Supercomputing Center.
Raju Subba Open Source Project: Apache Spark. Introduction Big Data Analytics Engine and it is open source Spark provides APIs in Scala, Java, Python.
Big Data is a Big Deal!.
About Hadoop Hadoop was one of the first popular open source big data technologies. It is a scalable fault-tolerant system for processing large datasets.
Hadoop Aakash Kag What Why How 1.
Machine Learning Library for Apache Ignite
Spark.
Hadoop MapReduce Framework
Spark Presentation.
Apache Spark Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing Aditya Waghaye October 3, 2016 CS848 – University.
Interactive Data Analytics with Spark on Tachyon in Baidu
Introduction to Spark.
Sajitha Naduvil-vadukootu
Apache Spark Lecture by: Faria Kalim (lead TA) CS425, UIUC
Overview of big data tools
Apache Spark Lecture by: Faria Kalim (lead TA) CS425 Fall 2018 UIUC
Apache Hadoop and Spark
Fast, Interactive, Language-Integrated Cluster Computing
Lecture 29: Distributed Systems
Presentation transcript:

A Reliable Memory-Centric Distributed Storage System a Haoyuan Li October 16 @ Strata & Hadoop World NYC Website: tachyon-project.org Meetup: www.meetup.com/Tachyon UC Berkeley

Outline Overview Open Source Roadmap Feature 1: Memory Centric Storage Architecture Feature 2: Lineage in Storage Open Source Roadmap

Outline Overview Open Source Roadmap Feature 1: Memory Centric Storage Architecture Feature 2: Lineage in Storage Open Source Roadmap

Projects UC Berkeley Design next generation data analytics stack: Berkeley Data Analytics Stack (BDAS) a cluster manager making it easy to write and deploy distributed applications. a parallel computing system supporting general and efficient in-memory execution. a reliable distributed memory-centric storage enabling memory-speed data sharing.

Why Tachyon?

Memory-locality key to interactive response time Memory is King RAM throughput increasing exponentially Disk throughput increasing slowly Memory-locality key to interactive response time

Realized by many… Frameworks already leverage memory

Problem solved?

Missing a Solution for Storage Layer

An Example: - Fast in-memory data processing framework Keep one in-memory copy inside JVM Track lineage of operations used to derive data Upon failure, use lineage to recompute data Lineage Tracking map join reduce filter map

Issue 1 Data Sharing is the bottleneck in analytics pipeline: Slow writes to disk Spark Task Spark Task storage engine & execution engine same process (slow writes) Spark mem block manager block 1 Spark mem block manager block 3 block 3 block 1 HDFS / Amazon S3 block 1 block 2 block 3 block 4

Issue 1 Data Sharing is the bottleneck in analytics pipeline: Slow writes to disk Spark Task Hadoop MR storage engine & execution engine same process (slow writes) Spark mem block manager block 1 YARN block 3 HDFS / Amazon S3 block 1 block 2 block 3 block 4

Cache loss when process crashes. Issue 2 Cache loss when process crashes. Spark Task execution engine & storage engine same process Spark memory block manager block 1 block 3 HDFS / Amazon S3 block 1 block 2 block 3 block 4

Cache loss when process crashes. Issue 2 Cache loss when process crashes. crash execution engine & storage engine same process Spark memory block manager block 1 block 3 HDFS / Amazon S3 block 1 block 2 block 3 block 4

Cache loss when process crashes. Issue 2 Cache loss when process crashes. crash execution engine & storage engine same process HDFS / Amazon S3 block 1 block 2 block 3 block 4

In-memory Data Duplication & Java Garbage Collection Issue 3 In-memory Data Duplication & Java Garbage Collection Spark Task Spark Task execution engine & storage engine same process (duplication & GC) Spark mem block manager block 1 Spark mem block manager block 3 block 3 block 1 HDFS / Amazon S3 block 1 block 2 block 3 block 4

Tachyon Reliable data sharing at memory-speed within and across cluster frameworks/jobs

Solution Overview Basic idea Feature 1: memory-centric storage architecture Feature 2: push lineage down to storage layer Facts One data copy in memory Recomputation for fault-tolerance

Stack Computation Frameworks (Spark, MapReduce, Impala, H2O, …) Tachyon Existing Storage Systems (HDFS, S3, GlusterFS, …)

Memory-Centric Storage Architecture

Memory-speed data sharing among jobs in different frameworks Issue 1 revisited Memory-speed data sharing among jobs in different frameworks execution engine & storage engine same process (fast writes) Spark Task Hadoop MR Spark mem block 1 YARN Tachyon in-memory block 1 block 3 block 4 HDFS disk block 1 block 3 block 2 block 4 HDFS / Amazon S3 block 1 block 2 block 3 block 4

Keep in-memory data safe, even when a job crashes. Issue 2 revisited Keep in-memory data safe, even when a job crashes. Spark Task execution engine & storage engine same process Spark memory block manager block 1 HDFS / Amazon S3 block 1 block 3 block 2 block 4 Tachyon in-memory block 1 block 3 block 4

Keep in-memory data safe, even when a job crashes. Issue 2 revisited Keep in-memory data safe, even when a job crashes. crash execution engine & storage engine same process Spark memory block manager HDFS disk block 1 block 3 block 2 block 4 Tachyon in-memory block 1 block 3 block 4 HDFS / Amazon S3 block 1 block 2 block 3 block 4

Keep in-memory data safe, even when a job crashes. Issue 2 revisited Keep in-memory data safe, even when a job crashes. crash execution engine & storage engine same process HDFS disk block 1 block 3 block 2 block 4 Tachyon in-memory block 1 block 3 block 4 HDFS / Amazon S3 block 1 block 2 block 3 block 4

No in-memory data duplication, much less GC Issue 3 revisited No in-memory data duplication, much less GC execution engine & storage engine same process (no duplication & GC) Spark Task Spark Task Spark mem Spark mem HDFS disk block 1 block 3 block 2 block 4 Tachyon in-memory block 1 block 3 block 4 HDFS / Amazon S3 block 1 block 2 block 3 block 4

Lineage in Storage (alpha)

Comparison with in Memory HDFS

Workflow Improvement Performance comparison for realistic workflow. The workflow ran 4x faster on Tachyon than on MemHDFS. In case of node failure, applications in Tachyon still finishes 3.8x faster.

Further Improve Spark’s Performance Grep Program BLOW UP

How easy / hard to use Tachyon?

Spark/MapReduce/Shark without Tachyon val file = sc.textFile(“hdfs://ip:port/path”) Hadoop MapReduce hadoop jar hadoop-examples-1.0.4.jar wordcount hdfs://localhost:19998/input hdfs://localhost:19998/output Shark CREATE TABLE orders_cached AS SELECT * FROM orders;

Spark/MapReduce/Shark with Tachyon val file = sc.textFile(“tachyon://ip:port/path”) Hadoop MapReduce hadoop jar hadoop-examples-1.0.4.jar wordcount tachyon://localhost:19998/input tachyon://localhost:19998/output Shark CREATE TABLE orders_tachyon AS SELECT * FROM orders;

Spark on Tachyon ./bin/spark-shell sc.hadoopConfiguration.set("fs.tachyon.impl", "tachyon.hadoop.TFS") // Load input from Tachyon val file = sc.textFile("tachyon://localhost:19998/LICENSE") file.count() ; file.take(10); // Store RDD OFF_HEAP in Tachyon import org.apache.spark.storage.StorageLevel; file.persist(StorageLevel.OFF_HEAP) file.count(); file.take(10); // Save output to Tachyon file.flatMap(line => line.split(" ")).map(s => (s, 1)).reduceByKey((a, b) => a + b).saveAsTextFile("tachyon://localhost:19998/LICENSE_WC")

Outline Overview Open Source Roadmap Feature 1: Memory Centric Storage Architecture Feature 2: Lineage in Storage Open Source Roadmap

History Started at UC Berkeley AMPLab from the summer of 2012 Reliable, Memory Speed Storage for Cluster Computing Frameworks (UC Berkeley EECS Tech Report) Haoyuan Li, Ali Ghodsi, Matei Zaharia, Ion Stoica, Scott Shenker

A Open Source Status Apache License 2.0, Version 0.5.0 (July 2014) Deployed at tens of companies 20+ Companies Contributing Spark and MapReduce applications can run without any code change

Release Growth Tachyon 0.1: -1 contributor Dec ‘12

Release Growth Tachyon 0.2: 3 contributors Tachyon 0.1: -1 contributor Apr ‘13 Dec ‘12

Release Growth Tachyon 0.3: 15 contributors Tachyon 0.2: Oct‘13 Tachyon 0.2: 3 contributors Tachyon 0.1: -1 contributor Apr ‘13 Dec ‘12

Release Growth Tachyon 0.4: 30 contributors Tachyon 0.3: Feb ‘14 Tachyon 0.3: 15 contributors Oct‘13 Tachyon 0.2: 3 contributors Tachyon 0.1: -1 contributor Apr ‘13 Dec ‘12

Release Growth Tachyon 0.5: 46 contributors Tachyon 0.4: July ‘14 Tachyon 0.4: 30 contributors Feb ‘14 Tachyon 0.3: 15 contributors Oct‘13 Tachyon 0.2: 3 contributors Tachyon 0.1: -1 contributor Apr ‘13 Dec ‘12

Open Community

Thanks to our Code Contributors! Aaron Davidson Achal Soni Ali Ghodsi Andrew Ash Anurag Khandelwal Aslan Bekirov Bill Zhao Brad Childs Calvin Jia Chao Chen Cheng Chang Cheng Hao Colin Patrick McCabe David Capwell David Zhu Du Li Fei Wang Gerald Zhang Grace Huang Haoyuan Li Henry Saputra Hobin Yoon Huamin Chen Jey Kottalam Joseph Tang Juan Zhou Jun Aoki Lin Xing Lukasz Jastrzebski Manu Goyal Mark Hamstra Mingfei Shi Mubarak Seyed Nick Lanham Orcun Simsek Pengfei Xuan Qianhao Dong Qifan Pu Raymond Liu Reynold Xin Robert Metzger Rong Gu Sean Zhong Seonghwan Moon Siyuan He Shivaram Venkataraman Srinivas Parayya Tao Wang Timothy St. Clair Thu Kyaw Vamsi Chitters Xi Liu Xiang Zhong Xiaomin Zhang Zhao Zhang

Thanks to Redhat!

Commercially supported by x and running in dozens of their customers’ clusters Thanks to Redhat!

Tachyon is the Default Off-Heap Storage Solution for . Thanks to Redhat!

Exchange Data Between Spark and H20

Belief from Industry

Reaching wider communities: e.g. GlusterFS

Under Filesystem Choices (Big Data, Cloud, HPC, Enterprise) This means that people can run Tachyon for Big Data, on Cloud, in HPC, and even for enterprise.

Under Filesystem Choices (Big Data, Cloud, HPC, Enterprise) This means that people can run Tachyon for Big Data, on Cloud, in HPC, and for enterprise.

Outline Overview Open Source Roadmap Feature 1: Memory Centric Storage Architecture Feature 2: Lineage in Storage Open Source Roadmap

Features Memory Centric Storage Architecture Lineage in Storage (alpha) Hierarchical Local Storage Data Serving Different hardware More… Your Requirements?

Short Term Roadmap (0.6 Release) Ceph Integration (Ceph Community, Redhat) Hierarchical Local Storage (Intel) Performance Improvement (Yahoo) Multi-tenancy (AMPLab) Mesos Integration (Mesos Community, Mesosphere) Network Sub-system Improvement (Pivotal) Many more from AMPLab and Industry Contributors

Goal?

Better Assist Other Components Spark MapReduce H2O Spark SQL GraphX Impala …… Tachyon HDFS S3 GlusterFS OrangeFS NFS Ceph …… Keeping in mind, the potential for tachyon to work in different areas beyond big data, such as cloud, HPC, and enterprise. With that, I would like to welcome any and all collaborations! Welcome Collaboration!

Thanks! Questions? More Information: Website: http://tachyon-project.org Github: https://github.com/amplab/tachyon Meetup: http://www.meetup.com/Tachyon Email: haoyuan@cs.berkeley.edu

Release Growth Tachyon 0.5: 46 contributors Tachyon 0.4: July ‘14 Tachyon 0.4: 30 contributors Feb ‘14 Tachyon 0.3: 15 contributors Oct‘13 Tachyon 0.2: 3 contributors Tachyon 0.1: -1 contributor Apr ‘13 Dec ‘12