Parallel Event Processing for Content-Based Publish/Subscribe Systems Amer Farroukh Department of Electrical and Computer Engineering University of Toronto.


1 Parallel Event Processing for Content-Based Publish/Subscribe Systems
Amer Farroukh, Department of Electrical and Computer Engineering, University of Toronto
Joint work with Elias Ferzli, Naweed Tajuddin, and Hans-Arno Jacobsen
DEBS 2009

2 Motivation
– Event processing is ubiquitous in enterprise-scale applications (fraud detection, data analysis)
– Network security monitoring and analysis tools require gigabit-per-second speeds (application-layer firewalls)
– Selective dissemination of information for Internet-scale applications (RSS, XML, XPath)
– These systems need to support thousands of users and process millions of events
– Achieving scalability and high performance under excessive load is a challenging problem
– The matching engine is the most computation-intensive function in event processing

3 How to support high data-processing rates?
– Choose an existing, powerful matching algorithm
– Leverage chip multi-processors
– Increase throughput or reduce matching time
– Evaluate multi-threading vs. software transactional memory

4 Outline
– Related work
– Matching algorithm
– Parallelization techniques
– Implementation and results

5 Sequential Matching Algorithms
Single phase: A_TREAT [E.H., 1992]
– Predicates are compiled into a test network
– Subscriptions may appear in one or several leaves
– Poor locality, space consuming, hard to maintain
Two phase: SIFT [T.Y., 2000]
– Predicates are evaluated in the first phase
– Subscriptions are matched in the second phase
– Predicates and subscriptions are indexed
Algorithm used: Filtering Algorithms [F.F., 2001]

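6 Matching Algorithm
[Figure: an event E (P1, P2) passes through two phases. Phase 1 uses the Price, Color, and Quantity predicate tables to set bits in a bit vector. Phase 2 scans the clusters (C1, C2, C3) under Ap 1 to Ap 5. Subscriptions S1, S5, and S9 are shown as results.]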
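
The two-phase flow on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the predicate list, the `SUBSCRIPTIONS` layout, and the function names are assumptions made for the example. Only the two subscriptions S1 and S2 are taken from the backup slide on predicate tables.

```python
import operator

# Comparison operators used by the hypothetical predicates below.
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt, "!=": operator.ne}

# Assumed predicate set; the list index is the predicate's bit position.
PREDICATES = [
    ("quantity", "=", 2),   # bit 0
    ("price", "<", 30),     # bit 1
    ("quantity", ">", 4),   # bit 2
    ("price", "=", 20),     # bit 3
]

# Each subscription is a conjunction of predicate bits.
SUBSCRIPTIONS = {"S1": [0, 1], "S2": [2, 3]}


def phase1(event):
    """Phase 1: evaluate every predicate and record the result in a bit vector."""
    bits = [0] * len(PREDICATES)
    for i, (attr, op, const) in enumerate(PREDICATES):
        if attr in event and OPS[op](event[attr], const):
            bits[i] = 1
    return bits


def phase2(bits):
    """Phase 2: a subscription matches when all of its predicate bits are set."""
    return sorted(name for name, preds in SUBSCRIPTIONS.items()
                  if all(bits[p] for p in preds))


def match(event):
    return phase2(phase1(event))
```

For example, `match({"quantity": 2, "price": 25})` sets bits 0 and 1 and reports only S1.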

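7 Multiple Events Independent Processing
[Figure: events E1 and E2 (each with P1, P2) are processed by Thread 1 and Thread 2. Two bit vectors are shown over the same Price, Color, and Quantity predicate tables and the clusters (C1, C2, C3) under Ap 1 to Ap 5. Subscriptions S1, S2, S3, S7, S8, and S9 are shown as results.]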
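
A minimal sketch of the idea behind independent processing, assuming each thread takes a whole event and keeps all matching state local, so the shared subscription data is only read. The subscription layout and the names `holds`, `match_event`, and `match_all` are hypothetical. In CPython the global interpreter lock prevents real CPU parallelism for this kind of work, so the code illustrates the structure rather than the speedup.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed read-only subscription set shared by all threads.
SUBSCRIPTIONS = {
    "S1": [("quantity", "=", 2), ("price", "<", 30)],
    "S2": [("quantity", ">", 4), ("price", "=", 20)],
}


def holds(event, attr, op, const):
    """Evaluate one predicate against one event."""
    if attr not in event:
        return False
    value = event[attr]
    return {"=": value == const, "<": value < const, ">": value > const}[op]


def match_event(event):
    """Match a single event sequentially; nothing here is shared for writing."""
    return sorted(name for name, preds in SUBSCRIPTIONS.items()
                  if all(holds(event, *p) for p in preds))


def match_all(events, workers=2):
    """Give each event to one worker thread; no locks are needed."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(match_event, events))
```

Because threads never write to shared data, throughput in this scheme is limited mainly by the number of cores, while the matching time of an individual event is unchanged.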

8 Single Event Collaborative Processing
[Figure: Thread 1 and Thread 2 work on the same event E (P1, P2), using the Price, Color, and Quantity predicate tables, bit vectors, and the clusters (C1, C2, C3) under Ap 1 to Ap 5. Subscriptions S1, S2, and S8 are shown as results.]
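
One way to picture collaborative processing of a single event is sketched below. The slides do not state how work is divided among threads, so the round-robin split of predicates and subscriptions is an assumption, as are all names. The lock stands in for the lock-based synchronization variant. With disjoint slices as used here, a purely static partition would also be safe.

```python
import threading

# Assumed predicate set (index = bit position) and subscriptions.
PREDICATES = [
    ("quantity", "=", 2), ("price", "<", 30),
    ("quantity", ">", 4), ("price", "=", 20),
]
SUBSCRIPTIONS = {"S1": [0, 1], "S2": [2, 3]}


def holds(event, attr, op, const):
    if attr not in event:
        return False
    value = event[attr]
    return {"=": value == const, "<": value < const, ">": value > const}[op]


def match_collaboratively(event, n_threads=2):
    bits = [0] * len(PREDICATES)       # bit vector shared by all threads
    matched = []                       # shared result list
    lock = threading.Lock()
    barrier = threading.Barrier(n_threads)
    subs = sorted(SUBSCRIPTIONS.items())

    def worker(tid):
        # Phase 1: each thread evaluates its own slice of the predicates.
        for i in range(tid, len(PREDICATES), n_threads):
            if holds(event, *PREDICATES[i]):
                with lock:
                    bits[i] = 1
        # Phase 2 must not start until the bit vector is complete.
        barrier.wait()
        # Phase 2: each thread checks its own slice of the subscriptions.
        for name, preds in subs[tid::n_threads]:
            if all(bits[p] for p in preds):
                with lock:
                    matched.append(name)

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(matched)
```

All threads touch the same bit vector, which is consistent with the later observation in the deck that bit vector size limits scalability for this scheme.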

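9 Multiple Events Collaborative Processing
[Figure: threads T1 to T4 are divided into Group 1 and Group 2. Events E1 and E2 (each with P1, P2) are processed using the Price, Color, and Quantity predicate tables, bit vectors, and the clusters (C1, C2, C3) under Ap 1 to Ap 5. Subscriptions S1, S2, S3, S4, S7, and S9 are shown as results.]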
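
The grouped scheme combines the two previous ideas: events are spread across groups, and the threads inside a group share one event. The sketch below assumes two groups of two threads, as on the slide, and splits the subscriptions round-robin inside each group. The split and all names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

# Assumed read-only subscription set.
SUBSCRIPTIONS = {
    "S1": [("quantity", "=", 2), ("price", "<", 30)],
    "S2": [("quantity", ">", 4), ("price", "=", 20)],
}


def holds(event, attr, op, const):
    if attr not in event:
        return False
    value = event[attr]
    return {"=": value == const, "<": value < const, ">": value > const}[op]


def match_slice(event, names):
    """Work done by one thread of a group: check a slice of the subscriptions."""
    return [n for n in names
            if all(holds(event, *p) for p in SUBSCRIPTIONS[n])]


def match_in_group(event, group_size):
    """One group: its threads split the subscriptions for a single event."""
    names = sorted(SUBSCRIPTIONS)
    slices = [names[i::group_size] for i in range(group_size)]
    with ThreadPoolExecutor(max_workers=group_size) as group:
        parts = group.map(lambda s: match_slice(event, s), slices)
        return sorted(n for part in parts for n in part)


def match_all(events, n_groups=2, group_size=2):
    """Different events go to different groups and are processed concurrently."""
    with ThreadPoolExecutor(max_workers=n_groups) as groups:
        return list(groups.map(lambda e: match_in_group(e, group_size), events))
```

Changing `n_groups` and `group_size` trades throughput (more groups) against per-event matching time (more threads per group).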

10 Implementation Setup
Synchronization
– Static
– Locks
– Software transactional memory (STM)
Machine
– 2.33 GHz quad-core Xeon processors
– 32 KB L1 cache and 4 MB L2 cache
Workload
– Number of subscriptions: 1M to 6M
– Average predicates per subscription: 10
– Predicate range: 1 to 15
– Number of events: 5000
– Average attributes per event: 50

11 Multiple Events Independent Processing Analysis
– Linear throughput and constant average matching time

12 Single Event Collaborative Processing Analysis
– Lock implementation is best
– Bit vector size limits scalability

13 Multiple Events Collaborative Processing Analysis
– Threads can be allocated based on system requirements and load

14 Conclusions
– A parallel matching engine is a promising solution
– Over 1600 events/s with 6M subscriptions
– Matching time vs. throughput
– Lock-based implementation is more efficient
– HTM is a potential candidate for enhancing speed and potential ease of implementation


16 Predicate Tables (Phase 1)
S1: quantity = 2, price < 30
S2: quantity > 4, price = 20
[Figure: two predicate tables, PRICE (columns 10, 20, 30, 40, 50) and QUANTITY (columns 1, 2, 3, 4, 5), each with rows EQUAL, LESS, GREATER, and NOT EQUAL. Markers 1 to 4 appear in the tables.]
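
A per-attribute predicate table of this kind might look like the sketch below, where predicates are grouped first by operator and then by constant. The class, its methods, and the assignment of ids 1 to 4 to the four predicates of S1 and S2 are assumptions made for illustration. The linear scans over the LESS, GREATER, and NOT EQUAL rows are kept simple and are not meant to reflect an optimized index.

```python
from collections import defaultdict


class PredicateTable:
    """Predicates on one attribute, grouped by operator and then by constant."""

    def __init__(self):
        self.rows = {op: defaultdict(list) for op in ("=", "<", ">", "!=")}

    def add(self, op, const, pred_id):
        self.rows[op][const].append(pred_id)

    def satisfied(self, value):
        """Return the ids of all predicates that hold for an event's value."""
        hits = list(self.rows["="].get(value, []))   # EQUAL: direct lookup
        for const, ids in self.rows["<"].items():    # LESS: attr < const
            if value < const:
                hits += ids
        for const, ids in self.rows[">"].items():    # GREATER: attr > const
            if value > const:
                hits += ids
        for const, ids in self.rows["!="].items():   # NOT EQUAL: attr != const
            if value != const:
                hits += ids
        return sorted(hits)


# Predicates of S1 and S2 from the slide, with assumed ids 1 to 4.
quantity, price = PredicateTable(), PredicateTable()
quantity.add("=", 2, 1)   # S1: quantity = 2
price.add("<", 30, 2)     # S1: price < 30
quantity.add(">", 4, 3)   # S2: quantity > 4
price.add("=", 20, 4)     # S2: price = 20
```

An event with price 20 satisfies both `price < 30` and `price = 20`, so `price.satisfied(20)` returns ids 2 and 4. Those ids are the bits that Phase 1 would set.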

17 Subscription Clusters (Phase 2)
[Figure: clusters Ap 1, Ap 2, ..., Ap N, holding subscriptions S1 to S5 alongside predicates P1 to P4.]
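
A rough sketch of how clustering can cut Phase 2 work, assuming that each "Ap" is a predicate gating access to its cluster: a cluster is scanned only when that predicate's bit was set in Phase 1. The slide does not spell out the exact layout, so the `CLUSTERS` structure, the bit positions, and the function name are hypothetical.

```python
# Assumed layout: gating predicate bit -> [(subscription, remaining predicate bits)].
CLUSTERS = {
    0: [("S1", [2]), ("S2", [2, 3])],
    1: [("S5", [4])],
}


def match_clusters(bits):
    """Phase 2 over clusters, given the bit vector produced by Phase 1."""
    matched = []
    for ap, cluster in CLUSTERS.items():
        if not bits[ap]:
            continue  # gating predicate failed: skip the whole cluster
        for name, preds in cluster:
            if all(bits[p] for p in preds):
                matched.append(name)
    return matched
```

With the bit vector `[1, 0, 1, 0, 1]`, only the first cluster is scanned and only S1 matches. S5 is never examined because its gating bit is 0.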

18 Time Profiling

19 Block Size

20 Subscriptions Effect (ME-IP, SE-CP)

