By Vaibhav Nachankar and Arvind Dwarakanath. HBase is an open-source, distributed, column-oriented, sorted-map data store. It is a Hadoop database;


1 By Vaibhav Nachankar and Arvind Dwarakanath

2
• HBase is an open-source, distributed, column-oriented, sorted-map data store.
• It is a Hadoop database; it sits on HDFS.
• HBase supports reliable storage of, and efficient access to, very large amounts of structured data.
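To make the "sorted-map" idea concrete, here is a minimal in-memory sketch in Python. It is not the HBase API: the class `SortedMapTable`, its methods, and the sample row keys are hypothetical names chosen for illustration. It only shows why keeping row keys sorted makes range scans cheap.

```python
import bisect


class SortedMapTable:
    """Toy stand-in for a sorted-map table (hypothetical, not the HBase API)."""

    def __init__(self):
        self._row_keys = []  # row keys, kept in sorted order
        self._rows = {}      # row key -> {"family:qualifier": value}

    def put(self, row_key, column, value):
        # Insert the row key in sorted position the first time it is seen.
        if row_key not in self._rows:
            bisect.insort(self._row_keys, row_key)
            self._rows[row_key] = {}
        self._rows[row_key][column] = value

    def get(self, row_key, column):
        # Random access by row key and column name.
        return self._rows.get(row_key, {}).get(column)

    def scan(self, start_key, stop_key):
        """Yield (row_key, columns) for start_key <= row_key < stop_key."""
        lo = bisect.bisect_left(self._row_keys, start_key)
        hi = bisect.bisect_left(self._row_keys, stop_key)
        for key in self._row_keys[lo:hi]:
            yield key, self._rows[key]


table = SortedMapTable()
table.put("user3", "info:name", "Carol")
table.put("user1", "info:name", "Alice")
table.put("user2", "info:name", "Bob")

# Rows come back in key order regardless of insertion order.
scanned = [key for key, _ in table.scan("user1", "user3")]
print(scanned)
```

Because the keys are sorted, a scan over a key range only needs two binary searches and a contiguous slice, which is the property a sorted-map store relies on.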

3

4
• Modeled after BigTable.
• MapReduce with Hadoop.
• Optimizations for real-time queries.
• No single point of failure.
• Random-access performance is comparable to MySQL.
• Application: Facebook messaging database.

5
• Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store.
• Column family:
• Super column family:
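The two data-model terms on this slide can be pictured as nested maps. The sketch below uses plain Python dictionaries, not a Cassandra client; the table names, row keys, and values are made up for illustration. A column family maps a row key to named columns, and a super column family adds one more level of nesting.

```python
# Column family: row key -> column name -> value (sample data is invented).
users = {
    "alice": {"email": "alice@example.com", "city": "Austin"},
    "bob": {"email": "bob@example.com"},
}

# Super column family: row key -> super column name -> column name -> value.
timeline = {
    "alice": {
        "post-001": {"text": "hello", "client": "web"},
        "post-002": {"text": "second post", "client": "mobile"},
    },
}


def get_column(column_family, row_key, column):
    """Look up one column value in a column family; None if absent."""
    return column_family.get(row_key, {}).get(column)


def get_subcolumn(super_column_family, row_key, super_column, column):
    """Look up one value inside a super column; None if absent."""
    row = super_column_family.get(row_key, {})
    return row.get(super_column, {}).get(column)


print(get_column(users, "alice", "city"))
print(get_subcolumn(timeline, "alice", "post-002", "client"))
```

Rows in the same column family need not share the same columns, as the "bob" row shows.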

6
• Best of BigTable and Dynamo.
• MapReduce is possible with Apache Hadoop.
• Querying by column and by range of keys.
• BigTable-like features: columns, column families.
• Writes are much faster than reads.
• Application: Twitter tweets.

7
• Store the output of word count inside the table.
• The question to ask: is this likely to be more efficient, and if so, how does it affect reads and writes?
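The idea of storing word-count output in a table can be sketched as follows. This is an assumption-laden illustration in plain Python: a dictionary stands in for the table, and the column name `count:total` and the sample input are hypothetical. No HBase or Cassandra client is used.

```python
from collections import Counter


def word_count(lines):
    """Count whitespace-separated words, case-insensitively."""
    counts = Counter()
    for line in lines:
        counts.update(line.lower().split())
    return counts


def store_counts(table, counts):
    """Write one row per word; the row key is the word itself."""
    for word, n in counts.items():
        table.setdefault(word, {})["count:total"] = n


table = {}  # stand-in for a table: row key -> {column: value}
store_counts(table, word_count(["the quick brown fox", "the lazy dog"]))

# Reading one word's count is a single keyed lookup.
print(table["the"]["count:total"])
```

With the word as the row key, looking up a single count does not require scanning the whole word-count output, which is one reason the table-backed layout might be more efficient for reads.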

8
• Get familiar with HBase and Cassandra and do a broad study of benchmarking techniques.
• Do a read/write analysis.
• Integrate additional components, such as a Lucene index, to see whether performance improves.
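A read/write analysis usually starts from a small timing harness. The following is a minimal sketch, not the authors' benchmark: the function names, the row-key format, and the default row count are assumptions, and a Python dictionary stands in for the store. In a real run, `put` and `get` would wrap calls to an HBase or Cassandra client.

```python
import time


def time_operations(operation, keys):
    """Return the mean wall-clock seconds per call of operation(key)."""
    start = time.perf_counter()
    for key in keys:
        operation(key)
    elapsed = time.perf_counter() - start
    return elapsed / len(keys)


def run_read_write_benchmark(put, get, num_rows=1000):
    """Write num_rows rows, then read them back, timing each phase."""
    keys = [f"row{i:06d}" for i in range(num_rows)]
    write_latency = time_operations(lambda k: put(k, "value"), keys)
    read_latency = time_operations(get, keys)
    return {"write_s_per_op": write_latency, "read_s_per_op": read_latency}


# A dict stands in for the store under test (assumption for this sketch).
store = {}
results = run_read_write_benchmark(store.__setitem__, store.__getitem__)
print(results)
```

Timing writes and reads separately is what makes claims such as "writes are much faster than reads" checkable for a given store and workload.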

9
• 'Hadoop HBase-0.20.2 Performance Evaluation' by D. Carstoiu, A. Cernian, A. Olteanu, University of Bucharest.
• 'Hadoop HBase-0.20.2 Performance Evaluation' by Kareem Dana at Duke University. It presents a varied set of test cases for exercising HBase.

10

