1 HBase Operations
Swati Agarwal, Thomas Pan, eBay Inc.

2 Overview
Pre-production cluster handling production data sets and workloads
Data storage for listed items drives eBay search indexing
Data storage for ranking data in the future
Leverage MapReduce in the same cluster to build the search index

3 HBase Cluster
225 data nodes, each running a Region Server, Task Tracker, and Data Node
14 enterprise nodes: Primary Name Node, Secondary Name Node, Job Tracker node, 5 ZooKeeper nodes, HBase Master, CLI node, Ganglia reporting nodes, and spare nodes for failover
Node hardware: 12 x 2TB hard drives, 72GB RAM, 24 cores under hyper-threading

4 Cluster Level Configuration

5 Hadoop/HBase Configuration
Region Server
HBase Region Server JVM heap size: -Xmx15GB
HBase Region Server JVM NewSize: -XX:MaxNewSize=150m -XX:NewSize=100m (-XX:MaxNewSize=512m)
Number of HBase Region Server handlers: hbase.regionserver.handler.count=50 (matching the number of active regions)
HBase Region Server lease period: hbase.regionserver.lease.period=300000 (5-minute server-side lease timeout)
Region size: hbase.hregion.max.filesize=53687091200 (50GB, to avoid automatic splits)
Turn off automatic major compaction: hbase.hregion.majorcompaction=0
Read/write cache configuration
HBase block cache size (read cache): hfile.block.cache.size=0.65 (65% of 15GB ~= 9.75GB)
HBase Region Server memstore upper limit: hbase.regionserver.global.memstore.upperLimit=0.10
HBase Region Server memstore lower limit: hbase.regionserver.global.memstore.lowerLimit=0.09
Scanner caching: hbase.client.scanner.caching=200
HBase block multiplier: hbase.hregion.memstore.block.multiplier=4 (for the memstore flush issue)
Client settings (see the client-side sketch below)
HBase RPC timeout: hbase.rpc.timeout=600000 (10-minute client-side timeout)
HBase client pause: hbase.client.pause=3000
ZooKeeper maximum client count: hbase.zookeeper.property.maxClientCnxns=5000
HDFS block size: dfs.block.size=134217728 (128MB)
Data node xceiver count: dfs.datanode.max.xcievers=131072
Number of mappers per node: mapred.tasktracker.map.tasks.maximum=8
Number of reducers per node: mapred.tasktracker.reduce.tasks.maximum=6
Swap turned off
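The client-side values can also be set programmatically. A minimal sketch, assuming the HBase 0.90-era Java client API (the property names are the ones on the slide; in production they would normally live in hbase-site.xml):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;

    // Minimal sketch: overriding a few of the slide's client settings in code.
    public class ClientConfig {
        public static Configuration create() {
            Configuration conf = HBaseConfiguration.create();
            conf.setInt("hbase.client.scanner.caching", 200); // rows fetched per scanner RPC
            conf.setLong("hbase.rpc.timeout", 600000L);       // 10-minute client-side timeout
            conf.setLong("hbase.client.pause", 3000L);        // pause between client retries
            return conf;
        }
    }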

6 HBase Tables
Multiple tables in a single cluster
Multiple column families per table
Number of columns per column family: < 200
1.45 billion rows total
Max row size: ~20KB; average row size: ~10KB
13.01TB of data
Bulk load speed: ~500 million items in 30 minutes
Random write updates: 25K records per minute
Scan speed: 2,004 rows per second per region server (averaging 3 versions), 465 rows per second per region server (averaging 10 versions)
Scan speed with filters: 325-353 rows per second per region server (see the scan sketch below)
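A minimal sketch of a filtered scan with the caching value above, using the HBase 0.90-era client API; the table name, column family, qualifier, and value are hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
    import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
    import org.apache.hadoop.hbase.util.Bytes;

    // Minimal sketch: scan with client-side caching and a server-side filter.
    public class ScanExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HTable table = new HTable(conf, "items");       // hypothetical table name
            Scan scan = new Scan();
            scan.setCaching(200);                           // hbase.client.scanner.caching
            scan.setFilter(new SingleColumnValueFilter(
                    Bytes.toBytes("cf"),                    // hypothetical column family
                    Bytes.toBytes("status"),                // hypothetical qualifier
                    CompareOp.EQUAL,
                    Bytes.toBytes("ACTIVE")));              // hypothetical value
            ResultScanner scanner = table.getScanner(scan);
            try {
                for (Result row : scanner) {
                    // process row...
                }
            } finally {
                scanner.close();
                table.close();
            }
        }
    }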

7 HBase Tables (cont.)
Pre-split: 3,600 regions per table (see the admin sketch below)
Table is split into roughly equal-sized regions
Important to pick well-distributed keys; currently using bit reversal
Region splitting has been disabled by setting a very large region size
Major compaction on demand
Purge rows periodically
Balance regions among region servers on demand
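A minimal sketch of pre-splitting a table into 3,600 regions with split points spaced evenly over the 64-bit key space (which yields roughly equal-sized regions because bit-reversed keys are uniformly distributed), plus on-demand major compaction, using the HBase 0.90-era HBaseAdmin API; the table and family names are hypothetical:

    import java.math.BigInteger;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.HColumnDescriptor;
    import org.apache.hadoop.hbase.HTableDescriptor;
    import org.apache.hadoop.hbase.client.HBaseAdmin;
    import org.apache.hadoop.hbase.util.Bytes;

    // Minimal sketch: create a pre-split table, then major-compact on demand.
    public class AdminExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = HBaseConfiguration.create();
            HBaseAdmin admin = new HBaseAdmin(conf);

            int numRegions = 3600;
            byte[][] splits = new byte[numRegions - 1][];
            BigInteger keySpace = BigInteger.ONE.shiftLeft(64); // 2^64
            for (int i = 1; i < numRegions; i++) {
                BigInteger point = keySpace.multiply(BigInteger.valueOf(i))
                                           .divide(BigInteger.valueOf(numRegions));
                // longValue() keeps the low 64 bits; big-endian encoding preserves
                // the unsigned byte order HBase uses to compare row keys.
                splits[i - 1] = Bytes.toBytes(point.longValue());
            }

            HTableDescriptor desc = new HTableDescriptor("items"); // hypothetical name
            desc.addFamily(new HColumnDescriptor("cf"));           // hypothetical family
            admin.createTable(desc, splits);

            admin.majorCompact("items"); // major compaction on demand
        }
    }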

8 RowKey Scheme and Sharding
RowKey is a 64-bit unsigned integer: the bit reversal of the document ID
Example: document ID 2 becomes RowKey 0x4000000000000000 (see the key-encoding sketch below)
HBase creates regions with even RowKey ranges
Each map task maps to one region
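A minimal sketch of the key encoding; Long.reverse performs the 64-bit bit reversal, spreading sequential document IDs uniformly across the key space:

    import org.apache.hadoop.hbase.util.Bytes;

    // Minimal sketch: bit-reversal row key scheme.
    public class RowKeys {
        public static byte[] toRowKey(long documentId) {
            return Bytes.toBytes(Long.reverse(documentId));
        }

        public static void main(String[] args) {
            // Document ID 2 (binary ...0010) reverses to 0x4000000000000000.
            System.out.printf("0x%016X%n", Long.reverse(2L));
        }
    }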

9 Monitoring Systems
Ganglia
Nagios alerts: table consistency (hbck), table balancing (in-house tool), region size, CPU usage, memory usage, disk failures, HDFS block count, ...
In-house job monitoring system based on OpenTSDB, tracking job counters

10 Challenges/Issues
HBase stability: HDFS issues (such as NameNode failure) can impact HBase; MapReduce jobs (such as those with high memory usage) can impact HBase region servers; regions can get stuck in migration
HBase health monitoring
HBase table maintenance: table regions become unbalanced; major compaction is needed after row purges and updates
Software upgrades cause significant downtime
Normal hardware failures may cause issues: regions stuck due to a failed hard disk; region servers deadlocked due to the JVM
Testing

11 Future Direction
High scalability: scale out a table with more regions; scale out the whole cluster with more data
High availability: no downtime for upgrades
Adopt coprocessors: near-real-time indexing (see the sketch below)
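A minimal sketch of what a near-real-time indexing hook could look like, assuming the HBase 0.92 coprocessor API (BaseRegionObserver); the forwardToIndexer hand-off is hypothetical:

    import java.io.IOException;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.coprocessor.BaseRegionObserver;
    import org.apache.hadoop.hbase.coprocessor.ObserverContext;
    import org.apache.hadoop.hbase.coprocessor.RegionCoprocessorEnvironment;
    import org.apache.hadoop.hbase.regionserver.wal.WALEdit;

    // Minimal sketch: a region observer that forwards each write to an
    // indexing pipeline, so the search index tracks the table in near real time.
    public class IndexingObserver extends BaseRegionObserver {
        @Override
        public void postPut(ObserverContext<RegionCoprocessorEnvironment> ctx,
                            Put put, WALEdit edit, boolean writeToWAL)
                throws IOException {
            forwardToIndexer(put.getRow(), put); // hypothetical hand-off
        }

        private void forwardToIndexer(byte[] rowKey, Put put) {
            // e.g. enqueue (rowKey, put) for the near-real-time indexer.
        }
    }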

12 Community Acknowledgement
Kannan Muthukkaruppan Karthik Ranganathan Lars George Michael Stack Ted Yu Todd Lipcon Konstantin Shvachko

