MAHADEV KONAR Apache ZooKeeper. What is ZooKeeper? A highly available, scalable, distributed coordination kernel.


1 MAHADEV KONAR Apache ZooKeeper

2 What is ZooKeeper? A highly available, scalable, distributed coordination kernel

3 Use Cases
» Leader Election
» Group Membership
» Work Queues
» Event Notifications/workflow management
» Configuration Management
» Cluster Management
» Sharding

4 What is ZooKeeper again?
» File API without partial reads/writes
» No renames
» Ordered updates and strong persistence guarantees
» Conditional updates (version)
» Watches for data changes
» Ephemeral znodes
» Generated file names
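The conditional-update and generated-name items above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: `ToyStore`, `BadVersion`, and the paths are hypothetical names invented for illustration, not the ZooKeeper client library. It assumes a write succeeds only when the caller's expected version matches the stored one (with -1 meaning "any version"), and that generated names append a zero-padded counter.

```python
class BadVersion(Exception):
    """Raised when the caller's expected version is stale."""


class ToyStore:
    """In-memory stand-in for a znode store (illustration only)."""

    def __init__(self):
        self.nodes = {}    # path -> (data, version)
        self.counter = 0   # one global counter; real ZooKeeper counts per parent

    def create(self, path, data, sequential=False):
        if sequential:
            # Generated name: append a zero-padded, increasing suffix.
            path = f"{path}{self.counter:010d}"
            self.counter += 1
        if path in self.nodes:
            raise KeyError(f"node exists: {path}")
        self.nodes[path] = (data, 0)
        return path

    def set_data(self, path, data, expected_version):
        _, version = self.nodes[path]
        # -1 is treated as "unconditional"; anything else must match exactly.
        if expected_version != -1 and expected_version != version:
            raise BadVersion(path)
        # Data is replaced in its entirety; there are no partial writes.
        self.nodes[path] = (data, version + 1)
        return version + 1

    def get_data(self, path):
        return self.nodes[path]


store = ToyStore()
store.create("/config", b"v1")
data, version = store.get_data("/config")
store.set_data("/config", b"v2", version)      # version matches, write applies

stale_rejected = False
try:
    store.set_data("/config", b"v3", version)  # same version again, now stale
except BadVersion:
    stale_rejected = True
```

The second write is rejected because another update already bumped the version, which is what lets clients build read-modify-write loops without locks.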

5 Data Model
» Hierarchical namespace
» Each znode has data and children
» Data is read and written in its entirety
(Diagram: example znode tree rooted at / with nodes apps, users, locks, servers, app1, read-1, master, regionserver)

6 ZooKeeper API
» String create(path, data, acl, flags)
» void delete(path, expectedVersion)
» Stat setData(path, data, expectedVersion)
» (data, Stat) getData(path, watch)
» Stat exists(path, watch)
» String[] getChildren(path, watch)
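The `watch` argument on the read calls above can be pictured with a toy model. This is a hedged sketch, not the real client: `WatchedStore` and its methods are hypothetical, and it assumes the classic behaviour in which a watch is a one-time trigger that must be re-registered by reading again.

```python
class WatchedStore:
    """Toy key/value store with one-shot watches (illustration only)."""

    def __init__(self):
        self.data = {}
        self.watches = {}   # path -> list of callbacks waiting for a change

    def get_data(self, path, watcher=None):
        # Reading with a watcher registers interest in the next change.
        if watcher is not None:
            self.watches.setdefault(path, []).append(watcher)
        return self.data[path]

    def set_data(self, path, value):
        self.data[path] = value
        # Watches are removed before firing, so each one fires at most once.
        for callback in self.watches.pop(path, []):
            callback(path)


events = []
store = WatchedStore()
store.data["/cfg"] = b"a"

store.get_data("/cfg", watcher=events.append)  # register a watch
store.set_data("/cfg", b"b")                   # fires the watch
store.set_data("/cfg", b"c")                   # no watch registered any more
```

After the two writes, `events` holds a single notification; a client that wants to keep following the znode reads it again with a new watch.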

7 ZooKeeper Service
» All servers store a copy of the data (in memory)
» A leader is elected at startup
» Followers service clients; all updates go through the leader
» Update responses are sent when a majority of servers have persisted the change
(Diagram labels: ZooKeeper Service, Server, Leader, Client)
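The majority rule on this slide comes down to simple arithmetic. The helpers below are a minimal sketch with hypothetical function names; they assume a plain majority quorum, meaning more than half of the servers in the ensemble.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def is_committed(acks, ensemble_size):
    """An update can be acknowledged once a majority has persisted it."""
    return acks >= quorum_size(ensemble_size)


def tolerated_failures(ensemble_size):
    """Servers that can fail while a majority still remains."""
    return ensemble_size - quorum_size(ensemble_size)
```

Under this assumption a 5-server ensemble needs 3 acknowledgements and survives 2 failures, while a 4-server ensemble also needs 3 and survives only 1, which is why odd ensemble sizes are the usual choice.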

8 ZooKeeper and HBase
» Master failover
» Region Server and Master discovery via ZooKeeper
  » HBase clients connect to ZooKeeper to find configuration data
  » Region Server and Master failure detection

9 HBase and ZooKeeper as of now!
(Diagram labels: znode tree with root-region-server, rs, master, shutdown)
» master: if more than one master, they fight
» root-region-server: this znode holds the location of the server hosting the root of all tables in HBase
» rs: a directory in which there is a znode per HBase region server
» Region Servers register themselves with ZooKeeper when they come online
» On Region Server failure (detected via ephemeral znodes and notification via ZooKeeper), the master splits the edits out per region
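The "masters fight" and ephemeral-znode failure detection described above can be sketched with a toy model. Everything here is an assumption made for illustration: `ToyEnsemble`, the session ids, and the paths `/master` and `/rs/server-1` are hypothetical, and the model only captures two ideas, namely that creating an existing node fails and that a session's ephemeral nodes vanish when the session expires.

```python
class ToyEnsemble:
    """In-memory model of ephemeral znodes (illustration only)."""

    def __init__(self):
        self.owners = {}   # path -> id of the session that owns the node

    def create_ephemeral(self, path, session_id):
        """Return True if this session created the node, False if it exists."""
        if path in self.owners:
            return False
        self.owners[path] = session_id
        return True

    def expire_session(self, session_id):
        """Delete every ephemeral node owned by the session; return the paths."""
        removed = [p for p, s in self.owners.items() if s == session_id]
        for path in removed:
            del self.owners[path]
        return removed


zk = ToyEnsemble()

# Two masters race for the same znode; only the first create succeeds.
a_won = zk.create_ephemeral("/master", "master-a")
b_won = zk.create_ephemeral("/master", "master-b")

# A region server registers itself when it comes online.
zk.create_ephemeral("/rs/server-1", "rs-1")

# The active master's session expires, so its znode disappears
# and the standby can take over.
lost = zk.expire_session("master-a")
b_retry = zk.create_ephemeral("/master", "master-b")
```

In a real deployment the standby would learn about the deletion through a watch rather than by retrying blindly; the retry here just keeps the sketch short.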

10 Common Problems/Error Cases
» Garbage collection at the Region Servers
  » Causes ZooKeeper clients to stall
» Session expiry
» Low throughput and connection loss
  » Mostly due to under-provisioned ZooKeeper instances
  » Disk and memory usage
» Bad usage example:
  » NameNode, RegionServer, JobTracker, and ZooKeeper running on the same node

11 Release 3.3.0: what's in it for HBase?
» Allow configuration of session timeout min/max bounds
  » HBase needs large session timeouts
» Improved logging information to detect issues
» Improved debugging tools
» Improved documentation
» Improved performance and robustness
» Queue implementation available
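The session timeout bounds mentioned above matter because the server clamps whatever timeout a client asks for. The function below is a rough sketch with a hypothetical name and signature; it assumes the commonly documented defaults of 2 x tickTime for the minimum and 20 x tickTime for the maximum, and a tickTime of 2000 ms.

```python
def negotiate_session_timeout(requested_ms, tick_time_ms=2000,
                              min_ms=None, max_ms=None):
    """Clamp a client's requested session timeout to the server's bounds."""
    # Assumed defaults: 2 ticks for the lower bound, 20 ticks for the upper.
    low = 2 * tick_time_ms if min_ms is None else min_ms
    high = 20 * tick_time_ms if max_ms is None else max_ms
    return max(low, min(requested_ms, high))
```

Under these assumptions a client asking for 60000 ms is cut down to 40000 ms unless the upper bound is raised, which is the situation the "HBase needs large session timeouts" bullet points at.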

12 Upcoming 3.4 release
» No ConnectionLoss
» Use Netty; allows encryption
» Testing
  » Mockito
» More backwards compatibility testing

13 More ZooKeeper in HBase?
» Table schema and state in ZooKeeper
  » Read only, online
» Region Server state transitions via ZooKeeper
» Store region assignment in ZooKeeper for each Region Server
» http://wiki.apache.org/hadoop/ZooKeeper/HBaseUseCases

14 Questions?

