Apache ZooKeeper By Patrick Hunt, Mahadev Konar



What is ZooKeeper?
- ZooKeeper is much more than a distributed lock server!
- A highly available, scalable, distributed configuration, consensus, group membership, leader election, naming, and coordination service

Why use ZooKeeper?
Difficulty of implementing these kinds of services reliably:
- brittle in the presence of change
- difficult to manage
- different implementations lead to management complexity when the applications are deployed

What is ZooKeeper again?
- File API without partial reads/writes
- No renames
- Ordered updates and strong persistence guarantees
- Conditional updates (version)
- Watches for data changes
- Ephemeral nodes
- Generated file names

Any Guarantees?
- Clients will never detect old data.
- Clients will get notified of a change to data they are watching within a bounded period of time.
- All requests from a client will be processed in order.
- All results received by a client will be consistent with results received by all other clients.

Data Model
- Hierarchical namespace
- Each znode has data and children
- Data is read and written in its entirety
[Figure: example namespace tree rooted at /, with znodes services, YaView, servers, stupidname, morestupidity, locks, read-1, apps, users]

ZooKeeper API
- String create(path, data, acl, flags)
- void delete(path, expectedVersion)
- Stat setData(path, data, expectedVersion)
- (data, Stat) getData(path, watch)
- Stat exists(path, watch)
- String[] getChildren(path, watch)
- void sync(path)
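The conditional updates and generated names listed above can be illustrated with a toy in-memory model. This is a sketch only: `ToyZnodeStore` and `BadVersionError` are hypothetical names and not part of any ZooKeeper binding, and the model ignores watches, ACLs, sessions, and parent checks. It assumes the usual ZooKeeper conventions that a version of -1 skips the check and that sequential names get a zero-padded 10-digit counter.

```python
class BadVersionError(Exception):
    """The caller's expected version does not match the znode's version."""


class ToyZnodeStore:
    """In-memory stand-in for a znode namespace (hypothetical, for illustration)."""

    def __init__(self):
        self._nodes = {}      # path -> (data, version)
        self._next_seq = {}   # parent path -> next sequence number

    def create(self, path, data, sequential=False):
        if sequential:
            parent = path.rsplit("/", 1)[0]
            seq = self._next_seq.get(parent, 0)
            self._next_seq[parent] = seq + 1
            path = f"{path}{seq:010d}"   # generated name: prefix + counter
        if path in self._nodes:
            raise KeyError(f"node exists: {path}")
        self._nodes[path] = (data, 0)
        return path

    def get_data(self, path):
        return self._nodes[path]         # (data, version)

    def set_data(self, path, data, expected_version):
        _, version = self._nodes[path]
        # A version of -1 skips the check; otherwise it must match exactly.
        if expected_version not in (-1, version):
            raise BadVersionError(path)
        self._nodes[path] = (data, version + 1)
        return version + 1


store = ToyZnodeStore()
store.create("/config", b"v1")
data, version = store.get_data("/config")
store.set_data("/config", b"v2", version)        # version matches: accepted
try:
    store.set_data("/config", b"v3", version)    # stale version: rejected
    stale_rejected = False
except BadVersionError:
    stale_rejected = True
lock_name = store.create("/locks/read-", b"", sequential=True)
```

The version check is what lets two writers update the same znode without a lock: the slower writer gets an error instead of silently overwriting, and can re-read and retry.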

ZooKeeper Service
[Figure: ZooKeeper Service: an ensemble of servers, one marked Leader, and a set of clients]
- All servers store a copy of the data (in memory)
- A leader is elected at startup
- Followers service clients, all updates go through leader
- Update responses are sent when a majority of servers have persisted the change
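The majority rule above implies some simple arithmetic about ensemble sizes. The helper names below are illustrative, not part of ZooKeeper; the sketch only assumes that "majority" means strictly more than half of the servers.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Servers that can fail while a majority can still persist updates."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
# prints:
# 3 2 1
# 4 3 1
# 5 3 2
```

Going from 3 to 4 servers raises the quorum size without tolerating any additional failure, which is why odd ensemble sizes are the usual choice.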

Use cases inside of Yahoo!
- Leader Election
- Group Membership
- Work Queues
- Configuration Management
- Cluster Management
- Load Balancing
- Sharding

Leader Election
1. getData("/servers/leader", true)
2. if successful, follow the leader described in the data and exit
3. create("/servers/leader", hostname, EPHEMERAL)
4. if successful, lead and exit
5. goto step 1
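The steps above can be traced with a small simulation. `ToyEnsemble` is a hypothetical in-memory stand-in, not a ZooKeeper client: it models only ephemeral znodes that disappear when their owning session closes, and it uses the hostname as the session id for brevity.

```python
LEADER_PATH = "/servers/leader"


class ToyEnsemble:
    """In-memory stand-in for the service: path -> (data, owning session)."""

    def __init__(self):
        self._ephemerals = {}

    def get_data(self, path):
        entry = self._ephemerals.get(path)
        return None if entry is None else entry[0]

    def create_ephemeral(self, path, data, session):
        if path in self._ephemerals:
            return False                  # someone else created it first
        self._ephemerals[path] = (data, session)
        return True

    def close_session(self, session):
        # Ephemeral znodes vanish when their owning session ends.
        for path in [p for p, (_, s) in self._ephemerals.items() if s == session]:
            del self._ephemerals[path]


def elect(zk, hostname):
    """Run the election steps; return the hostname of the current leader."""
    while True:
        leader = zk.get_data(LEADER_PATH)                    # step 1
        if leader is not None:
            return leader                                    # step 2: follow
        if zk.create_ephemeral(LEADER_PATH, hostname, session=hostname):  # step 3
            return hostname                                  # step 4: lead
        # step 5: lost the race between steps 1 and 3; try again


zk = ToyEnsemble()
print(elect(zk, "host-a"))   # host-a becomes the leader
print(elect(zk, "host-b"))   # host-b follows host-a
zk.close_session("host-a")   # leader's session ends, its znode is removed
print(elect(zk, "host-b"))   # host-b takes over
```

In the real service the watch set in step 1 is what tells followers that the leader znode has gone away, so they know when to rerun the procedure; the simulation simply calls `elect` again.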

Leader Election in Python
    import zookeeper

    handle = zookeeper.init("localhost:2181", my_connection_watcher, 10000, 0)
    # Read the leader znode and leave a watch on it
    (data, stat) = zookeeper.get(handle, "/app/leader", True)
    if stat is None:
        # No leader yet: try to become the leader with an ephemeral znode
        # ("hostname:info" is a placeholder payload)
        path = zookeeper.create(handle, "/app/leader", "hostname:info",
                                [ZOO_OPEN_ACL_UNSAFE], zookeeper.EPHEMERAL)
        if path is None:
            # someone else is the leader
            (data, stat) = zookeeper.get(handle, "/app/leader", True)
            # parse the string data that contains the leader address
        else:
            # we are the leader; continue leading
            pass
    else:
        # someone else is the leader
        # parse the string data that contains the leader address
        pass

Cluster Management
/nodes
    node-1
    node-2
    node-3
Monitoring process:
1. Watch on /nodes
2. On watch trigger do getChildren(/nodes, true)
3. Track which nodes have gone away
Each node:
1. Create /nodes/node-${i} as ephemeral nodes
2. Keep updating /nodes/node-${i} periodically for node status changes (status updates could be load/iostat/cpu/others)
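The "track which nodes have gone away" step amounts to comparing successive getChildren results. A minimal sketch, assuming the monitor keeps the previous child list in memory; `diff_members` is an illustrative helper, not a ZooKeeper call.

```python
def diff_members(previous, current):
    """Compare two getChildren snapshots; return (joined, departed)."""
    previous, current = set(previous), set(current)
    return sorted(current - previous), sorted(previous - current)


# Snapshot before and after a watch on /nodes fires.
before = ["node-1", "node-2", "node-3"]
after = ["node-1", "node-3", "node-4"]

joined, departed = diff_members(before, after)
print("joined:", joined)       # joined: ['node-4']
print("departed:", departed)   # departed: ['node-2']
```

Because the znodes are ephemeral, a node that crashes or loses its session shows up in `departed` without having to deregister itself.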

Work Queues
/tasks
    task-1
    task-2
    task-3
/machines
    m-1
        task-1
Monitoring process:
1. Watch /tasks for published tasks
2. Pick tasks on watch trigger from /tasks
3. Assign each task to a machine-specific queue by calling create(/machines/m-${i}/task-${j})
4. Watch for deletion of tasks (maps to task completion)
Machine process:
1. Machines watch /machines/m-${i} for any creation of tasks
2. After executing task-${j}, delete task-${j} from /tasks and from /machines/m-${i}
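The assignment and completion steps can be sketched with plain Python collections standing in for the /tasks and /machines subtrees. The slide does not say how a machine is chosen, so round-robin is an assumption here, and `assign_tasks` and `complete_task` are hypothetical helper names.

```python
from itertools import cycle


def assign_tasks(tasks, machines, assignments):
    """Put each not-yet-assigned task on a machine queue (round-robin, assumed)."""
    already = {t for queue in assignments.values() for t in queue}
    picker = cycle(sorted(machines))
    for task in sorted(tasks):
        if task not in already:
            # Corresponds to create(/machines/m-${i}/task-${j})
            assignments.setdefault(next(picker), []).append(task)
    return assignments


def complete_task(task, machine, tasks, assignments):
    """After executing a task, remove it from the machine queue and from /tasks."""
    assignments[machine].remove(task)
    tasks.discard(task)


tasks = {"task-1", "task-2", "task-3"}          # children of /tasks
machines = ["m-1", "m-2"]                       # children of /machines
assignments = assign_tasks(tasks, machines, {})
print(assignments)   # {'m-1': ['task-1', 'task-3'], 'm-2': ['task-2']}

complete_task("task-1", "m-1", tasks, assignments)
print(sorted(tasks))  # ['task-2', 'task-3']
```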

Performance Numbers.

Monitoring Tools
- Command port
- JMX
- Trace Logging

Where are we?
- Multi Tenant
- Observers
- Recipes: reusable code libraries
- Bindings: Java, C, Perl, Python, REST, Ruby?

Where are we?
- BookKeeper (a contrib project)
- System to reliably log streams of records
- Using BookKeeper and ZooKeeper for a pub sub system

Who is using ZooKeeper?
- HBase
- Solr
- Digg
- LinkedIn
- Unnamed financial institutions

What do we do next?
- WAN: more testing
  - Cross colo quorum
  - Client and server in different colos
- Usability: timeouts from ZooKeeper clients are a headache (ZOOKEEPER-22)

Q&A
Questions?
Links: http://hadoop.apache.org/zookeeper/