Secondary Indexing in Phoenix


1 Secondary Indexing in Phoenix
SF HBase User Group – September 26, 2013
James Taylor, Phoenix Lead, Software Engineer
Jesse Yates, HBase Committer, Software Engineer

2 Agenda
- About
- Indexes In Phoenix
- Immutable Indexes
- Mutable Indexes
- Demo!
- Roadmap
SF HUG – Sept 2013

3 Phoenix
- Open Source
- “SQL-skin” on HBase: everyone knows SQL!
- JDBC Driver: plug-and-play
- Faster than HBase in some cases

4 Secondary Indexes
- Sort on ‘orthogonal’ axis
- Save full-table scan
- Expected database feature
- Hard in HBase because of ACID considerations

5 Agenda: About, Indexes In Phoenix, Immutable Indexes, Mutable Indexes, Demo!, Roadmap

6 Indexes In Phoenix
- Creating an index
  - DDL statement
  - Creates another HBase table behind the scenes
- Deciding when an index is used
  - Transparent to the user (but user can override through hint)
  - No stats yet
- Knowing which table was used
  - EXPLAIN <query>

7 Creating Indexes In Phoenix
CREATE INDEX <index_name> ON <table_name>(<columns_to_index>…) INCLUDE (<columns_to_cover>…);
Optionally add IMMUTABLE_ROWS=true property to CREATE TABLE statement

8 Creating Indexes In Phoenix
CREATE TABLE baby_names (
    name VARCHAR PRIMARY KEY,
    occurrences BIGINT);
CREATE INDEX baby_names_idx ON baby_names(occurrences DESC, name);

9 Deciding When To Use
- Transparent to the user
- Query optimizer does the following:
  - Compiles query against data and index tables
  - Chooses “best” one (not yet stats driven)
    - Can the index even be used? It must be active, and the query must use only columns contained in the index (no join back to data table)
    - Can ORDER BY be removed?
    - Which plan forms the longest start/stop scan key?
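The checks listed above can be sketched as a small selection function. This is a toy model in Python, not Phoenix's optimizer code: the `Plan` record, its field names, and the tie-breaking order (ORDER BY removal first, then scan-key length) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Hypothetical summary of one compiled plan (data table or index table)."""
    table: str
    active: bool              # index is usable (not disabled or still building)
    columns: frozenset        # columns available without a join back
    order_by_removed: bool    # rows already come back in the requested order
    scan_key_len: int         # leading key columns bounded by the filter


def choose_plan(plans, needed_columns):
    # Keep only plans that are active and contain every referenced column.
    usable = [p for p in plans if p.active and needed_columns <= p.columns]
    # Assumed ranking: dropping the ORDER BY wins, then the longest scan key.
    # max() returns the first maximal element, so list the data table first
    # to make it win ties.
    return max(usable, key=lambda p: (p.order_by_removed, p.scan_key_len))


cols = frozenset({"name", "occurrences"})
data_plan = Plan("baby_names", True, cols, False, 0)
index_plan = Plan("baby_names_idx", True, cols, True, 0)
chosen = choose_plan([data_plan, index_plan], cols)
```

For an ORDER BY occurrences DESC query, `chosen.table` is `baby_names_idx` in this model because that plan no longer needs a sort.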

10 Deciding When To Use
SELECT name, occurrences FROM baby_names ORDER BY occurrences DESC LIMIT 10;
SELECT name, occurrences FROM baby_names_idx LIMIT 10
ORDER BY is not necessary since rows in the index table are already ordered this way

11 Deciding When To Use
SELECT name, occurrences FROM baby_names WHERE occurrences > 100;
SELECT name, occurrences FROM baby_names_idx
Uses index, since we can form start row for scan based on filter of occurrences
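Why a sorted index helps both queries can be shown with a toy model in Python. The sample rows and the helper names `top_n` and `more_than` are made up for illustration; the index is modeled as a sorted list of (occurrences DESC, name) keys, which is an assumption about layout and not Phoenix's byte encoding. Because the toy negates the count to get descending order, the filter bound appears here as the end of the slice.

```python
import bisect

# Toy data table keyed by name (sample values are invented).
baby_names = {"Emma": 250, "Liam": 300, "Zed": 5, "Ada": 100}

# Toy index keyed by (occurrences DESC, name). Negating the count makes
# ascending tuple order equal to descending occurrence order.
index = sorted((-count, name) for name, count in baby_names.items())


def top_n(n):
    """ORDER BY occurrences DESC LIMIT n: just read the first n index rows."""
    return [(name, -neg) for neg, name in index[:n]]


def more_than(threshold):
    """WHERE occurrences > threshold: one binary search gives the scan bound."""
    bound = bisect.bisect_left(index, (-threshold,))
    return [(name, -neg) for neg, name in index[:bound]]


top_two = top_n(2)
over_100 = more_than(100)
```

Neither helper sorts at query time or touches rows outside the requested range, which is the saving the two slides describe.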

12 Deciding When To Use
SELECT /*+ NO_INDEX */ name FROM baby_names WHERE occurrences > 100;
Override optimizer by telling it not to use any indexes
SELECT /*+ INDEX (baby_names baby_names_idx other_baby_names_idx) */ name,occurrences
Tell optimizer priority in which it should consider using indexes

13 Knowing which table was used
EXPLAIN SELECT name, occurrences FROM baby_names ORDER BY occurrences DESC LIMIT 10;
CLIENT PARALLEL 1-WAY FULL SCAN OVER BABY_NAMES_IDX
    SERVER FILTER BY PageFilter 10
CLIENT 10 ROW LIMIT

14 Agenda: About, Indexes In Phoenix, Immutable Indexes, Mutable Indexes, Demo!, Roadmap

15 Immutable Indexes
- Immutable Rows
- Much easier to implement
- Client-managed
- Bulk-loadable
- e.g. stats, historical data

16 Agenda: About, Indexes In Phoenix, Immutable Indexes, Mutable Indexes, Demo!, Roadmap

17 Mutable Indexes
- Global Index
- Change row state
- Common use-case
- “expected” implementation
- Covered Columns/Join Index

18 1.5 years*

19 Internals
- Index Management
  - Build index updates
  - Ensures index is ‘cleaned up’
- Recovery Mechanism
  - Ensures index updates are “ACID”

20 “There is no magic” - Every programming hipster (chipster)

21 Mutable Indexing: Standard Write Path
[Diagram: Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore]

22 Mutable Indexing: Standard Write Path
[Diagram: Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore]

23 Mutable Indexing
[Diagram: Region Coprocessor Host, Indexer, Builder, Codec, WAL Updater, WAL (Durable!), Index Table]

24 Index Management
- Lives within a RegionCoprocessorObserver
- Access to the local HRegion
- Specifies the mutations to apply to the index tables
public interface IndexBuilder {
    public void setup(RegionCoprocessorEnvironment env);
    public Map<Mutation, String> getIndexUpdate(Put put);
    public Map<Mutation, String> getIndexUpdate(Delete delete);
}

25 Why not write my own?
- Managing Cleanup
  - Efficient point-in-time correctness
- Performance tricks
  - Abstract access to HRegion
  - Minimal network hops
- Sorting correctness
  - Phoenix typing ensures correct index sorting

26 Example: Managing Cleanup
- Updates can arrive out of order
- Client-managed timestamps
ROW   FAMILY  QUALIFIER  TS  VALUE
Row1  Fam     Qual       10  val1
      Fam2    Qual2      12  val2
                         13  val3

27 Example: Managing Cleanup
Index Table
ROW             FAMILY  QUALIFIER   TS
Val1|Row1       Index   Fam:Qual    10
Val1|Val2|Row1          Fam2:Qual2  12
Val3|Val2|Row1                      13

28 Example: Managing Cleanup
Row1  Fam     Qual       11  val4

ROW   FAMILY  QUALIFIER  TS  VALUE
Row1  Fam     Qual       10  val1
      Fam2    Qual2      12  val2
                         13  val3

29 Example: Managing Cleanup
ROW   FAMILY  QUALIFIER  TS  VALUE
Row1  Fam     Qual       10  val1
                         11  val4
      Fam2    Qual2      12  val2
                         13  val3

30 Example: Managing Cleanup
ROW             FAMILY  QUALIFIER   TS
Val1|Row1       Index   Fam:Qual    10
Val4|Row1                           11
Val4|Val2|Row1          Fam2:Qual2  12
Val1|Val2|Row1
Val3|Val2|Row1                      13

31 Example: Managing Cleanup
ROW             FAMILY  QUALIFIER   TS
Val1|Row1       Index   Fam:Qual    10
Val4|Row1                           11
Val4|Val2|Row1          Fam2:Qual2  12
Val1|Val2|Row1
Val3|Val2|Row1                      13
And don’t forget to clean up the old row state!
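The cleanup example can be replayed with a toy model in Python. The function `index_keys` and the tuple layout are hypothetical, and the code assumes (from the index rows shown in the slides) that val3 at timestamp 13 belongs to Fam:Qual. It computes the index key in effect at each timestamp, then shows which key becomes stale once the out-of-order write at timestamp 11 arrives.

```python
def index_keys(cells, row, columns):
    """Return {ts: index_key} for every timestamp at which `row` changed.

    `cells` is a list of (row, column, ts, value) tuples; the index key is
    the indexed column values as of that timestamp, then the row key.
    """
    keys = {}
    for ts in sorted({c[2] for c in cells if c[0] == row}):
        parts = []
        for col in columns:
            # Latest value of this column at or before ts, if any.
            versions = [(c[2], c[3]) for c in cells
                        if c[0] == row and c[1] == col and c[2] <= ts]
            if versions:
                parts.append(max(versions)[1])
        keys[ts] = "|".join(parts + [row])
    return keys


COLUMNS = ["Fam:Qual", "Fam2:Qual2"]
cells = [
    ("Row1", "Fam:Qual", 10, "val1"),
    ("Row1", "Fam2:Qual2", 12, "val2"),
    ("Row1", "Fam:Qual", 13, "val3"),   # assumed column, see lead-in
]
before = index_keys(cells, "Row1", COLUMNS)

# The out-of-order update: written after ts 13 but stamped 11.
cells.append(("Row1", "Fam:Qual", 11, "val4"))
after = index_keys(cells, "Row1", COLUMNS)

# Index rows that were correct before but must now be cleaned up.
stale = set(before.values()) - set(after.values())
```

In this model `stale` holds the single key `val1|val2|Row1`: the late write changes what the row looked like at timestamp 12, so the index entry built from the old state has to be removed.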

32 Managing Cleanup
- History “roll up”
- Out-of-order Updates
- Point-in-time correctness
- Multiple Timestamps per Mutation
- Delete vs. DeleteColumn vs. DeleteFamily
Surprisingly hard!

33 Phoenix Index Builder
- Much simpler than full index management
- Hides cleanup considerations
- Abstracted access to local state
public interface IndexCodec {
    public void initialize(RegionCoprocessorEnvironment env);
    public Iterable<IndexUpdate> getIndexDeletes(TableState state);
    public Iterable<IndexUpdate> getIndexUpserts(TableState state);
}
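To show how little a codec has to decide once cleanup is handled elsewhere, here is a rough Python analogue of the two-method shape above. `TableState`, `ConcatCodec`, and their fields are hypothetical stand-ins; the real Phoenix types carry timestamps and byte arrays, which this sketch ignores.

```python
from dataclasses import dataclass, field


@dataclass
class TableState:
    """Hypothetical view of one row: stored values plus the pending write."""
    row: str
    current: dict                                  # column -> stored value
    pending: dict = field(default_factory=dict)    # column -> value being written


class ConcatCodec:
    """Toy codec: index key = indexed column values joined by '|', then row."""

    def __init__(self, columns):
        self.columns = columns

    def _key(self, row, values):
        parts = [values[c] for c in self.columns if c in values]
        return "|".join(parts + [row]) if parts else None

    def get_index_deletes(self, state):
        # The key built from the old row state, if the write changes it.
        old = self._key(state.row, state.current)
        new = self._key(state.row, {**state.current, **state.pending})
        return [old] if old is not None and old != new else []

    def get_index_upserts(self, state):
        # The key built from the row state after the pending write.
        new = self._key(state.row, {**state.current, **state.pending})
        return [new] if new is not None else []


codec = ConcatCodec(["Fam:Qual", "Fam2:Qual2"])
state = TableState("Row1",
                   current={"Fam:Qual": "val1", "Fam2:Qual2": "val2"},
                   pending={"Fam:Qual": "val3"})
deletes = codec.get_index_deletes(state)
upserts = codec.get_index_upserts(state)
```

The codec only maps a row state to index keys; deciding which historical states need revisiting is left to the surrounding builder, which matches the division of labor the slide describes.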

34 Phoenix Index Codec
8pt font, <200 lines, including comments

35 Dude, where’s my data? Ensuring Correctness

36 HBase ACID
Does NOT give you:
- Cross-row consistency
- Cross-table consistency
Does give you:
- Durable data on success
- Visibility on success without partial rows

37 Key Observation
“Secondary indexing is inherently an easier problem than full transactions… secondary index updates are idempotent.” - Lars Hofhansl

38 Idempotent Index Updates
- Doesn’t need full transactions
- Replay as many times as needed
- Can tolerate a little lag
- As long as we get the order right
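A small Python sketch of why replaying timestamped index updates is harmless. The cell layout, `apply_updates`, and `visible_keys` are illustrative assumptions: each update is stored as a (timestamp, operation) cell under its index key, loosely mimicking how HBase keeps versioned cells, so applying the same update twice adds nothing new and the timestamps carry the ordering.

```python
def apply_updates(index, updates):
    """Apply timestamped index updates; `index` maps key -> set of (ts, op)."""
    for op, key, ts in updates:
        index.setdefault(key, set()).add((ts, op))
    return index


def visible_keys(index):
    """A key is visible when its newest cell is a put rather than a delete."""
    return sorted(k for k, cells in index.items() if max(cells)[1] == "put")


updates = [
    ("put", "val1|val2|Row1", 12),
    ("delete", "val1|val2|Row1", 13),
    ("put", "val3|val2|Row1", 13),
]

once = apply_updates({}, updates)
# Replay the whole batch again, and in reverse, as a recovery might.
replayed = apply_updates(apply_updates({}, updates), list(reversed(updates)))
```

`once` and `replayed` end up identical, and only `val3|val2|Row1` is visible in either, so a recovery path can resend updates without coordinating a transaction.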

39 Failure Recovery
- Custom WALEditCodec
  - Encodes index updates
  - Supports compressed WAL
- Custom WAL Reader
  - Replay index updates from WAL
<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>o.a.h.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
<property>
  <name>hbase.regionserver.hlog.reader.impl</name>
  <value>o.a.h.hbase.regionserver.wal.IndexedHLogReader</value>
</property>

40 Failure Situations
- Any time before WAL, client replay
- Any time after WAL, HBase replay
- All-or-nothing

41 Failure #1: Before WAL
[Diagram: Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore]

42 Failure #1: Before WAL
[Diagram: Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore]
No problem! No data is stored in the WAL, client just retries entire update.

43 Failure #2: After WAL
[Diagram: Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore]

44 Failure #2: After WAL
[Diagram: Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore]
WAL replayed via usual replay mechanisms
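The two failure cases can be walked through with a toy write path in Python. The `Region` class, its `fail_at` switch, and the single-column index key are all hypothetical; the sketch only assumes what the slides state, namely that the data edit and its index updates are logged together in the WAL before being applied.

```python
class Region:
    """Toy region: a durable log plus in-memory data and index tables."""

    def __init__(self):
        self.wal = []        # durable log of (row, value, index_key) edits
        self.memstore = {}   # stand-in for the data table
        self.index = {}      # stand-in for the index table

    def write(self, row, value, fail_at=None):
        edit = (row, value, f"{value}|{row}")
        if fail_at == "before_wal":
            raise IOError("failed before WAL append")
        self.wal.append(edit)            # data and index edits logged together
        if fail_at == "after_wal":
            raise IOError("failed after WAL append")
        self._apply(edit)

    def _apply(self, edit):
        row, value, index_key = edit
        self.memstore[row] = value
        self.index[index_key] = row

    def recover(self):
        # Replaying is safe to repeat because applying an edit is idempotent.
        for edit in self.wal:
            self._apply(edit)


region = Region()

# Failure #1: nothing reached the WAL, so the client retries the update.
try:
    region.write("Row1", "val1", fail_at="before_wal")
except IOError:
    region.write("Row1", "val1")

# Failure #2: the edit is in the WAL, so recovery replays it.
try:
    region.write("Row2", "val2", fail_at="after_wal")
except IOError:
    region.recover()
```

After both scenarios the data table and the index agree on Row1 and Row2, which is the all-or-nothing outcome slide 40 names.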

45 “Magic”
- Server-short circuit
- Lazy load columns
- Skip-scan for cache
- Parallel Writing
- Custom MemStore in Indexer
- Caching HTables
- Pluggable Index Writing/Failure Policy
- Minimize byte[] copy (ImmutableBytesPtr)

46 Agenda: About, Indexes In Phoenix, Immutable Indexes, Mutable Indexes, Demo!, Roadmap

47 Demo

48 Agenda: About, Indexes In Phoenix, Immutable Indexes, Mutable Indexes, Demo!, Roadmap

49 Roadmap
- Next release of Phoenix
- Performance improvements
- Functional Indexes
- Other indexing approaches (Huawei, SEP)

50 Open Source!
Main: https://github.com/forcedotcom/phoenix
Indexing:

51 We’re Hiring! (obligatory hiring slide)

52 Questions? Comments?
jtaylor@salesforce.com
@jamesplusplus
@jesse_yates

53 Appendix
AsyncHBaseWriter: github.com/jyates/phoenix/tree/async-hbase
2x+ slower*
* Written in 2hrs, not 100% correct either

