Breaking with Relational DBMS and Dating with HBase. Gaurav Kohli, Xebia.


1 Breaking with Relational DBMS and Dating with HBase. Gaurav Kohli, Xebia

2 About me: Gaurav Kohli (gaurav.in@gmail.com), Consultant, Xebia IT Architects

3 Agenda: Why are we here?; something about RDBMS; limitations of RDBMS; why HBase or any NoSQL solution; overview of HBase; specific use cases; paradigm shift in schema design; architecture of HBase; HBase interface (Java API, Thrift); conclusion

4 Relational Databases

5 Relational databases have a lot of limitations

6 Limitations: Data sets going into petabytes; RDBMS don't scale inherently: scale up / scale out (load balancing + replication); hard to shard / partition; both read and write throughput not possible (transactional / analytical databases); specialized hardware …... is very expensive (Oracle clustering)

7 Scaling Out: Replication [diagram labels: Master, Slave, Replication]

8 Scaling Out: Master, Many Slaves: the MySQL master becomes a problem; all slaves must have the same write capacity as the master; single point of failure, no easy failover [diagram labels: Master, Reads, Writes, Slave nodes]

9 Dual Master Replication [diagram labels: Master, Slave, Replication]

10 NoSQL


12 Background: 2006.11 Google releases paper on BigTable; 2007.2 initial HBase prototype created as Hadoop contrib; 2007.10 first usable HBase; 2008.1 Hadoop becomes an Apache top-level project and HBase becomes a subproject; 2010.5~ HBase becomes an Apache top-level project; 2010.6 HBase 0.20.5 released; 2010.10 HBase 0.89.2010092 (third developer release)

13 HBase: Distributed (uses HDFS for storage); column-oriented; multi-dimensional (versions); high availability; high performance; storage system

14 HBase is not: a SQL database (no joins, no query engine, no datatypes, no SQL); no schema; denormalized data; wide and sparsely populated data structure (key-value); no DBA needed

15 Use Case: Bigness (big data, big number of users, big number of computers); massive write performance (Facebook needs 135 billion messages a month, Twitter stores 7 TB of data per day); fast key-value access; write availability; no single point of failure

16 Specific Use Case: Managing large streams of non-transactional data: Apache logs, application logs, MySQL logs, etc. Real-time inserts, updates, and queries. Fraud detection by comparing transactions to known patterns in real time. Analytics: use MapReduce, Hive, or Pig to perform analytical queries.

17 Storage Model: Column-oriented database; tables are sorted by row; the table schema only defines column families; a column family can have any number of columns; each cell value has a timestamp

18 Storage Model

19 Storage Model

20 Storage Model: SortedMap(RowKey, List(SortedMap(Column, List(Value, Timestamp))))

21 Schema Design: A big sorted map: row key + column key + timestamp => value [figure: Student table, annotated with: 2 versions of this row; timestamp is a long value; column family; column qualifier/name; sorted by row key and column key]
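The "big sorted map" idea can be sketched in a few lines of plain Python. This is only an in-memory illustration, not HBase code: the class name ToyTable, the max_versions limit of 3, and the row and column names ("student1", "info:name") are assumptions made up for the example.

```python
from collections import defaultdict


class ToyTable:
    """Toy model: row key -> column key -> versions, newest first."""

    def __init__(self, max_versions=3):
        self.max_versions = max_versions  # assumed retention limit
        self.rows = defaultdict(lambda: defaultdict(list))

    def put(self, row, column, value, timestamp):
        versions = self.rows[row][column]
        versions.append((timestamp, value))
        # Keep versions ordered newest first and drop the oldest extras.
        versions.sort(key=lambda tv: tv[0], reverse=True)
        del versions[self.max_versions:]

    def get(self, row, column, timestamp=None):
        """Return the newest value at or before `timestamp` (latest if None)."""
        for ts, value in self.rows.get(row, {}).get(column, []):
            if timestamp is None or ts <= timestamp:
                return value
        return None

    def cells(self):
        """Yield cells sorted by row key, then column key, newest first."""
        for row in sorted(self.rows):
            for column in sorted(self.rows[row]):
                for ts, value in self.rows[row][column]:
                    yield row, column, ts, value


table = ToyTable()
table.put("student2", "info:name", "Ravi", 1)
table.put("student1", "info:name", "Asha", 1)
table.put("student1", "info:name", "Asha K", 2)
print(table.get("student1", "info:name"))               # Asha K
print(table.get("student1", "info:name", timestamp=1))  # Asha
```

Reading a cell without a timestamp returns the newest version; passing a timestamp reaches back to an older one, which is the "2 versions of this row" case on the slide.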

22 Schema Design: Example of a Student and Subject (m:n relationship)

23 Schema Design (RDBMS): Example of a Student and Subject. Three tables: Student table, Subject table, Student-Subject table

24 Schema Design (HBase): Student-Subject schema. Only two tables: Student table, Subject table

25 Schema Design (HBase): Student-Subject schema. Only two tables: Student table, Subject table
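The slide images with the actual column layout did not survive in this transcript, so the following is only a guess at what a two-table design could look like. It assumes each Student row carries one column per subject and each Subject row carries one column per student, so the join table disappears. The class name ToySchool and the column family names "info", "subjects", and "students" are hypothetical.

```python
class ToySchool:
    """Two denormalized tables, each modeled as row key -> {column: value}."""

    def __init__(self):
        self.student_table = {}
        self.subject_table = {}

    def enroll(self, student_id, student_name, subject_id, subject_title):
        # Write the relationship twice, once per table, instead of
        # keeping a separate Student-Subject join table.
        student = self.student_table.setdefault(
            student_id, {"info:name": student_name})
        student["subjects:" + subject_id] = subject_title
        subject = self.subject_table.setdefault(
            subject_id, {"info:title": subject_title})
        subject["students:" + student_id] = student_name

    def subjects_of(self, student_id):
        row = self.student_table.get(student_id, {})
        return sorted(v for k, v in row.items() if k.startswith("subjects:"))

    def students_of(self, subject_id):
        row = self.subject_table.get(subject_id, {})
        return sorted(v for k, v in row.items() if k.startswith("students:"))


school = ToySchool()
school.enroll("s1", "Asha", "c1", "Maths")
school.enroll("s1", "Asha", "c2", "Physics")
school.enroll("s2", "Ravi", "c1", "Maths")
print(school.subjects_of("s1"))  # ['Maths', 'Physics']
print(school.students_of("c1"))  # ['Asha', 'Ravi']
```

Both lookups are single-row reads. The price is that every enrollment is written in two places, which is the usual trade-off of denormalized data.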

26 Column family attributes

27 Regions: Each table is partitioned into regions; a region is a contiguous set of lexicographically sorted rows; hbase.hregion.max.filesize (default: 256 MB); regions are hosted by region servers

28 Regions and Splitting [diagram labels: row200, row201, row500, row1, new row]

29 Regions and Splitting [diagram labels: row200, row201, row350, row1, row351, row501]
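A rough way to picture region splitting is a sorted list of row keys that is cut in half once it grows past a limit. The sketch below is a simplification under stated assumptions: the ToyRegions class is invented for illustration, and it splits on a row count (max_rows), whereas HBase decides based on store file size (hbase.hregion.max.filesize).

```python
import bisect


class ToyRegions:
    """Each region is a sorted list of row keys, identified by its start key."""

    def __init__(self, max_rows=4):
        self.max_rows = max_rows   # stand-in for the real size threshold
        self.start_keys = [""]     # the first region starts at the empty key
        self.regions = [[]]

    def locate(self, row_key):
        """Index of the region whose key range contains row_key."""
        return bisect.bisect_right(self.start_keys, row_key) - 1

    def insert(self, row_key):
        i = self.locate(row_key)
        region = self.regions[i]
        pos = bisect.bisect_left(region, row_key)
        if pos < len(region) and region[pos] == row_key:
            return  # row already present
        region.insert(pos, row_key)
        if len(region) > self.max_rows:
            # Split at the middle key; the upper half becomes a new region.
            mid = len(region) // 2
            self.regions[i:i + 1] = [region[:mid], region[mid:]]
            self.start_keys.insert(i + 1, region[mid])


regions = ToyRegions(max_rows=4)
for key in ["row1", "row2", "row3", "row4", "row5"]:
    regions.insert(key)
print(regions.regions)  # [['row1', 'row2'], ['row3', 'row4', 'row5']]
```

Because rows stay lexicographically sorted inside and across regions, locating the region for a key is a binary search over the start keys.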

30 Architecture: Master; ZooKeeper; RegionServers; HDFS; MapReduce

31 Architecture

32 Tools: Java API, Thrift...

33 Tools (Java API, Thrift...): Java; Thrift (Ruby, PHP, Python, Perl, C++...); REST; Groovy DSL; MapReduce; HBase Shell

34 Tools (Java API, Thrift...): Java: Get, Put, Delete, Scan, incrementColumnValue
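To show what these five operations do, without claiming to reproduce the Java client, here is a small in-memory stand-in written in Python. The ToyClient class and its method signatures are assumptions for illustration only; the one behavior taken from HBase is that a scan includes the start row and excludes the stop row.

```python
class ToyClient:
    """In-memory stand-in for a table: row key -> {column: value}."""

    def __init__(self):
        self.rows = {}

    def put(self, row, column, value):
        self.rows.setdefault(row, {})[column] = value

    def get(self, row):
        # Return a copy so callers cannot mutate the stored row.
        return dict(self.rows.get(row, {}))

    def delete(self, row):
        self.rows.pop(row, None)

    def scan(self, start_row="", stop_row=None):
        """Yield rows in key order, start inclusive, stop exclusive."""
        for row in sorted(self.rows):
            if row < start_row:
                continue
            if stop_row is not None and row >= stop_row:
                break
            yield row, dict(self.rows[row])

    def increment_column_value(self, row, column, amount=1):
        """Add `amount` to a counter cell and return the new value."""
        cells = self.rows.setdefault(row, {})
        cells[column] = cells.get(column, 0) + amount
        return cells[column]


client = ToyClient()
client.put("log-001", "d:msg", "started")
client.put("log-002", "d:msg", "running")
client.put("log-010", "d:msg", "stopped")
print([row for row, _ in client.scan("log-001", "log-010")])
# ['log-001', 'log-002']
print(client.increment_column_value("counters", "d:hits"))  # 1
```

Range scans over sorted row keys are why row key design matters so much: rows that should be read together need keys that sort together.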


36 Conclusion: HBase vs. RDBMS: not a replacement; solves only a small subset (~5%)

37 Conclusion: Where SQL makes life easy: joining, secondary indexing, referential integrity (updates), ACID. Where HBase makes life easy: dataset scale, read/write scale, replication, batch analysis.


40 References & Credit: HBase at Apache (http://hbase.apache.org/); HBase Wiki (wiki.apache.org/hadoop/Hbase); HBase blog (blog.hbase.org); http://www.larsgeorge.com/2009/10/hbase-architecture-101-storage.html; http://highscalability.com/blog/2010/12/6/what-the-heck-are-you-actually-using-nosql-for.html; images from Google Search

