1
Secondary Indexing in Phoenix
SF HBase User Group – September 26, 2013
James Taylor: Phoenix Lead, Software Engineer
Jesse Yates: HBase Committer, Software Engineer
2
Agenda
  About
  Indexes In Phoenix
  Immutable Indexes
  Mutable Indexes
  Demo!
  Roadmap
SF HUG – Sept 2013
3
Phoenix
  Open Source
  “SQL-skin” on HBase
    Everyone knows SQL!
  JDBC Driver
    Plug-and-play
  Faster than HBase
    in some cases
4
Secondary Indexes
  Sort on ‘orthogonal’ axis
  Save full-table scan
  Expected database feature
  Hard in HBase b/c of ACID considerations
6
Indexes In Phoenix
  Creating an index
    DDL statement
    Creates another HBase table behind the scenes
  Deciding when an index is used
    Transparent to the user (but user can override through hint)
    No stats yet
  Knowing which table was used
    EXPLAIN <query>
7
Creating Indexes In Phoenix
  CREATE INDEX <index_name>
      ON <table_name>(<columns_to_index>…)
      INCLUDE (<columns_to_cover>…);
  Optionally add IMMUTABLE_ROWS=true property to CREATE TABLE statement
8
Creating Indexes In Phoenix
  CREATE TABLE baby_names (
      name VARCHAR PRIMARY KEY,
      occurrences BIGINT);
  CREATE INDEX baby_names_idx ON baby_names(occurrences DESC, name);
9
Deciding When To Use
  Transparent to the user
  Query optimizer does the following:
    Compiles query against data and index tables
    Chooses “best” one (not yet stats driven)
      Can index even be used?
        Active, using columns contained in index (no join back to data table)
      Can ORDER BY be removed?
      Which plan forms the longest start/stop scan key?
10
Deciding When To Use
  SELECT name, occurrences FROM baby_names ORDER BY occurrences DESC LIMIT 10;
  is executed against the index as:
  SELECT name, occurrences FROM baby_names_idx LIMIT 10
  ORDER BY not necessary since rows in index table are already ordered this way
11
Deciding When To Use
  SELECT name, occurrences FROM baby_names WHERE occurrences > 100;
  is executed against the index as:
  SELECT name, occurrences FROM baby_names_idx
  Uses index, since we can form start row for scan based on filter of occurrences
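Both rewrites above depend on the index table being kept sorted by (occurrences DESC, name). The following is a minimal sketch of that idea only, using a Java TreeSet as a stand-in for the sorted HBase index table. The class name, method names, and sample data are hypothetical, and this is not how Phoenix itself is implemented.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical stand-in for baby_names_idx: entries are kept sorted by
// (occurrences DESC, name), mirroring the column order in CREATE INDEX.
final class SortedIndexSketch {
    static final class Entry {
        final long occurrences;
        final String name;
        Entry(long occurrences, String name) {
            this.occurrences = occurrences;
            this.name = name;
        }
    }

    private final TreeSet<Entry> index = new TreeSet<>(
        Comparator.comparingLong((Entry e) -> e.occurrences).reversed()
                  .thenComparing(e -> e.name));

    void add(String name, long occurrences) {
        index.add(new Entry(occurrences, name));
    }

    // ORDER BY occurrences DESC LIMIT n: read the first n entries, no sort step.
    List<String> topN(int n) {
        List<String> names = new ArrayList<>();
        for (Entry e : index) {
            if (names.size() == n) break;
            names.add(e.name);
        }
        return names;
    }

    // WHERE occurrences > min: larger counts sort first, so the matching rows
    // form one contiguous range bounded by a key built from the filter value.
    List<String> moreThan(long min) {
        List<String> names = new ArrayList<>();
        for (Entry e : index.headSet(new Entry(min, ""), false)) {
            names.add(e.name);
        }
        return names;
    }
}
```

In this model, topN(10) corresponds to the LIMIT 10 query without an ORDER BY, and moreThan(100) corresponds to a bounded scan rather than a full-table scan.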
12
Deciding When To Use
  SELECT /*+ NO_INDEX */ name FROM baby_names WHERE occurrences > 100;
    Override optimizer by telling it not to use any indexes
  SELECT /*+ INDEX (baby_names baby_names_idx other_baby_names_idx) */ name,occurrences
    Tell optimizer priority in which it should consider using indexes
13
Knowing which table was used
  EXPLAIN SELECT name, occurrences FROM baby_names ORDER BY occurrences DESC LIMIT 10;

  CLIENT PARALLEL 1-WAY FULL SCAN OVER BABY_NAMES_IDX
      SERVER FILTER BY PageFilter 10
  CLIENT 10 ROW LIMIT
15
Immutable Indexes
  Immutable Rows
  Much easier to implement
  Client-managed
  Bulk-loadable
  e.g. stats, historical data
17
Mutable Indexes
  Global Index
  Change row state
  Common use-case
  “expected” implementation
  Covered Columns/Join Index
18
1.5 years*
19
Internals
  Index Management
    Build index updates
    Ensures index is ‘cleaned up’
  Recovery Mechanism
    Ensures index updates are “ACID”
20
“There is no magic” - Every programming hipster (chipster)
21
Mutable Indexing: Standard Write Path
[Diagram: Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore]
23
Mutable Indexing
[Diagram labels: Region Coprocessor Host, Indexer, Builder, Codec, WAL Updater, WAL, Index Table; “Durable!” callouts]
24
Index Management
  Lives within a RegionCoprocessorObserver
  Access to the local HRegion
  Specifies the mutations to apply to the index tables

  public interface IndexBuilder {
    public void setup(RegionCoprocessorEnvironment env);
    public Map<Mutation, String> getIndexUpdate(Put put);
    public Map<Mutation, String> getIndexUpdate(Delete delete);
  }
25
Why not write my own?
  Managing Cleanup
    Efficient point-in-time correctness
    Performance tricks
  Abstract access to HRegion
    Minimal network hops
  Sorting correctness
    Phoenix typing ensures correct index sorting
26
Example: Managing Cleanup
  Updates can arrive out of order
  Client-managed timestamps

  ROW   FAMILY  QUALIFIER  TS  VALUE
  Row1  Fam     Qual       10  val1
        Fam2    Qual2      12  val2
                           13  val3
27
Example: Managing Cleanup
  Index Table

  ROW             FAMILY  QUALIFIER   TS
  Val1|Row1       Index   Fam:Qual    10
  Val1|Val2|Row1          Fam2:Qual2  12
  Val3|Val2|Row1                      13
28
Example: Managing Cleanup
  New update arriving out of order:

  Row1  Fam     Qual       11  val4

  ROW   FAMILY  QUALIFIER  TS  VALUE
  Row1  Fam     Qual       10  val1
        Fam2    Qual2      12  val2
                           13  val3
29
Example: Managing Cleanup
  ROW   FAMILY  QUALIFIER  TS  VALUE
  Row1  Fam     Qual       10  val1
                           11  val4
        Fam2    Qual2      12  val2
                           13  val3
30
Example: Managing Cleanup
  ROW             FAMILY  QUALIFIER   TS
  Val1|Row1       Index   Fam:Qual    10
  Val4|Row1                           11
  Val4|Val2|Row1          Fam2:Qual2  12
  Val1|Val2|Row1
  Val3|Val2|Row1                      13
31
Example: Managing Cleanup
  ROW             FAMILY  QUALIFIER   TS
  Val1|Row1       Index   Fam:Qual    10
  Val4|Row1                           11
  Val4|Val2|Row1          Fam2:Qual2  12
  Val1|Val2|Row1
  Val3|Val2|Row1                      13

  And don’t forget to clean up the old row state!
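The cleanup walk-through above can be sketched as a replay of the row's history. This is a simplified, hypothetical model, not the Phoenix or HBase implementation: it rebuilds the index row key that is current at each timestamp from all cell versions, then reports which previously written index entries no longer match. It assumes val3 at timestamp 13 is a later version of Fam:Qual, as the index key Val3|Val2|Row1 suggests. All class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical model: derives the index row key that is current at each
// timestamp by replaying every cell version of one data row in time order.
final class IndexTimelineSketch {
    static final class Cell {
        final String column;
        final long ts;
        final String value;
        Cell(String column, long ts, String value) {
            this.column = column;
            this.ts = ts;
            this.value = value;
        }
    }

    // Returns timestamp -> index row key ("<indexed values>|<data row key>").
    static SortedMap<Long, String> timeline(String row, List<String> indexedColumns,
                                            List<Cell> cells) {
        List<Cell> byTime = new ArrayList<>(cells);
        byTime.sort(Comparator.comparingLong((Cell c) -> c.ts));
        Map<String, String> state = new HashMap<>();
        SortedMap<Long, String> result = new TreeMap<>();
        for (Cell cell : byTime) {
            state.put(cell.column, cell.value);
            StringBuilder key = new StringBuilder();
            for (String column : indexedColumns) {
                String value = state.get(column);
                if (value != null) key.append(value).append('|');
            }
            result.put(cell.ts, key.append(row).toString());
        }
        return result;
    }

    // Index entries that were correct before a late update but are wrong after it.
    static SortedMap<Long, String> stale(SortedMap<Long, String> before,
                                         SortedMap<Long, String> after) {
        SortedMap<Long, String> result = new TreeMap<>();
        for (Map.Entry<Long, String> e : before.entrySet()) {
            if (!e.getValue().equals(after.get(e.getKey()))) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }
}
```

With the slide's data, computing the timeline before and after adding the Fam:Qual cell at timestamp 11 yields the new entries Val4|Row1 (ts 11) and Val4|Val2|Row1 (ts 12), and stale() returns the single entry Val1|Val2|Row1 at ts 12, which is the old row state that has to be removed.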
32
Managing Cleanup
  History “roll up”
  Out-of-order Updates
  Point-in-time correctness
  Multiple Timestamps per Mutation
  Delete vs. DeleteColumn vs. DeleteFamily
  Surprisingly hard!
33
Phoenix Index Builder
  Much simpler than full index management
  Hides cleanup considerations
  Abstracted access to local state

  public interface IndexCodec {
    public void initialize(RegionCoprocessorEnvironment env);
    public Iterable<IndexUpdate> getIndexDeletes(TableState state);
    public Iterable<IndexUpdate> getIndexUpserts(TableState state);
  }
34
Phoenix Index Codec
8pt font, <200 lines, including comments
35
Dude, where’s my data?
Ensuring Correctness
36
HBase ACID
  Does NOT give you:
    Cross-row consistency
    Cross-table consistency
  Does give you:
    Durable data on success
    Visibility on success without partial rows
37
Key Observation
  “Secondary indexing is inherently an easier problem than full transactions… secondary index updates are idempotent.”
  - Lars Hofhansl
38
Idempotent Index Updates
  Doesn’t need full transactions
  Replay as many times as needed
  Can tolerate a little lag
    As long as we get the order right
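The idempotency point above can be illustrated with a small sketch, assuming an index update is a put of a fixed (index row, timestamp) cell. Writing the same cell again leaves the table unchanged, whereas an operation such as an increment does not have this property. The class and method names are hypothetical and the "table" is just an in-memory map.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model of an index table: "<index row>@<timestamp>" -> data row key.
final class IdempotentReplaySketch {
    static final class IndexPut {
        final String indexRow;
        final long ts;
        final String dataRow;
        IndexPut(String indexRow, long ts, String dataRow) {
            this.indexRow = indexRow;
            this.ts = ts;
            this.dataRow = dataRow;
        }
    }

    // Applies the same batch of timestamped puts the given number of times.
    static Map<String, String> replay(List<IndexPut> updates, int times) {
        Map<String, String> table = new TreeMap<>();
        for (int i = 0; i < times; i++) {
            for (IndexPut u : updates) {
                table.put(u.indexRow + "@" + u.ts, u.dataRow);
            }
        }
        return table;
    }

    // Contrast: an increment is not idempotent, so each replay changes the result.
    static long replayIncrement(long delta, int times) {
        long total = 0;
        for (int i = 0; i < times; i++) {
            total += delta;
        }
        return total;
    }
}
```

Here replay(updates, 1) and replay(updates, 5) produce equal maps, while replayIncrement(1, 1) and replayIncrement(1, 5) differ. That difference is why replaying index updates does not need full transactions in this model.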
39
Failure Recovery
  Custom WALEditCodec
    Encodes index updates
    Supports compressed WAL
  Custom WAL Reader
    Replay index updates from WAL

  <property>
    <name>hbase.regionserver.wal.codec</name>
    <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
  </property>
  <property>
    <name>hbase.regionserver.hlog.reader.impl</name>
    <value>org.apache.hadoop.hbase.regionserver.wal.IndexedHLogReader</value>
  </property>
40
Failure Situations
  Any time before WAL, client replay
  Any time after WAL, HBase replay
  All-or-nothing
41
Failure #1: Before WAL
[Diagram: Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore]
42
Failure #1: Before WAL
[Diagram: Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore]
  No problem! No data is stored in the WAL, client just retries entire update.
43
Failure #2: After WAL
[Diagram: Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore]
44
Failure #2: After WAL
[Diagram: Client, HRegion, RegionCoprocessorHost, WAL, RegionCoprocessorHost, MemStore]
  WAL replayed via usual replay mechanisms
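The two failure cases can be sketched with a heavily simplified, hypothetical write path in which one WAL entry carries both the data edit and its index edit. This is an illustration of the all-or-nothing behaviour described above, not the actual HBase or Phoenix code. The class, the FailAt enum, and the key encoding are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model: a write is either absent from the WAL (client retries)
// or present in the WAL (replay applies data and index edits together).
final class WalRecoverySketch {
    enum FailAt { NONE, BEFORE_WAL, AFTER_WAL }

    static final class WalEntry {
        final String row;
        final String value;
        final long ts;
        WalEntry(String row, String value, long ts) {
            this.row = row;
            this.value = value;
            this.ts = ts;
        }
    }

    final List<WalEntry> wal = new ArrayList<>();
    final Map<String, String> dataTable = new TreeMap<>();
    final Map<String, String> indexTable = new TreeMap<>();

    // Returns true if the write was acknowledged to the client.
    boolean write(String row, String value, long ts, FailAt failAt) {
        if (failAt == FailAt.BEFORE_WAL) {
            return false;               // nothing durable yet; client retries the update
        }
        WalEntry entry = new WalEntry(row, value, ts);
        wal.add(entry);                 // durable from this point on
        if (failAt == FailAt.AFTER_WAL) {
            return false;               // simulated crash before the tables are updated
        }
        apply(entry);
        return true;
    }

    // Replays every WAL entry; repeating it is harmless because the edits are idempotent.
    void recover() {
        for (WalEntry entry : wal) {
            apply(entry);
        }
    }

    private void apply(WalEntry e) {
        dataTable.put(e.row + "@" + e.ts, e.value);
        indexTable.put(e.value + "|" + e.row + "@" + e.ts, e.row);
    }
}
```

A BEFORE_WAL failure leaves the WAL and both tables empty, so a plain retry is enough. An AFTER_WAL failure leaves one WAL entry, and calling recover() brings the data and index tables to the same state a successful write would have produced.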
45
“Magic”
  Server-short circuit
  Lazy load columns
  Skip-scan for cache
  Parallel Writing
  Custom MemStore in Indexer
  Caching HTables
  Pluggable Index Writing/Failure Policy
  Minimize byte[] copy (ImmutableBytesPtr)
47
Demo
49
Roadmap
  Next release of Phoenix
  Performance improvements
  Functional Indexes
  Other indexing approaches (Huawei, SEP)
50
Open Source!
  Main: https://github.com/forcedotcom/phoenix
  Indexing:
51
We’re Hiring!
(obligatory hiring slide)
52
Questions? Comments?
  jtaylor@salesforce.com, @jamesplusplus
  @jesse_yates
53
Appendix
  AsyncHBaseWriter
  github.com/jyates/phoenix/tree/async-hbase
  2x+ slower*
  * Written in 2hrs, not 100% correct either