4 Stream Processing (Electronic Trading)
A feed comes out of the wall
Compute a “secret sauce” looking for events of interest
Trade based on the result
But only if you are more nimble than the next guy…
5 Traditional RDBMS Model: Outbound Processing
Store the data before processing!
  Latency
  What if the data is not important?
Too many processes!
Optimized for business data processing
  Where you don’t trust the app.
Too slow to be interesting!
[Diagram labels: Updates, Queries, Memory, Disk]
6 Stream Processing Engine with StreamSQL: Inbound Processing
Database paradigm (SQL) a good one
But need a different architecture
  Straight through processing
  No task switches
  Lightweight scheduling
[Diagram labels: Event Data, StreamBase Application, Alerts, Actions, Memory, Disk, Queries]
7 StreamSQL Application Example
[Diagram labels: Market_Feeds, My_Buys, Alerts]
Example: Every minute, for every stock I am trading:
  Calculate VWAP (volume-weighted average price) for my trades and for all trades
  Alert whenever my personal trading execution is inferior to the market
5 StreamBase operators, 30 min to build
Streams of “tuples” (time-series data) flow through the query
Queries run continuously
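The VWAP example can be sketched in ordinary Python to make the computation concrete. This is not StreamSQL and not the StreamBase application from the slide: it is a batch approximation under stated assumptions. The tuple layout (timestamp in seconds, symbol, price, volume), the function names, and the reading of "inferior" as "my average buy price is higher than the market VWAP in the same minute" are all assumptions made for illustration. A real stream engine would evaluate the windows continuously rather than over a finished list.

```python
from collections import defaultdict

def vwap_by_minute(trades):
    """Return {(minute, symbol): VWAP} for (ts_sec, symbol, price, volume) tuples."""
    notional = defaultdict(float)   # running sum of price * volume per window
    volume = defaultdict(float)     # running sum of volume per window
    for ts, symbol, price, vol in trades:
        key = (int(ts // 60), symbol)   # one-minute tumbling window per stock
        notional[key] += price * vol
        volume[key] += vol
    return {k: notional[k] / volume[k] for k in notional if volume[k] > 0}

def execution_alerts(market_feed, my_buys):
    """List the windows where my buy VWAP is higher (worse) than the market VWAP."""
    market = vwap_by_minute(market_feed)
    mine = vwap_by_minute(my_buys)
    alerts = []
    for key, my_vwap in sorted(mine.items()):
        market_vwap = market.get(key)
        if market_vwap is not None and my_vwap > market_vwap:
            alerts.append((key[0], key[1], my_vwap, market_vwap))
    return alerts

# Hypothetical sample data, not taken from the slides.
market_feed = [
    (0, "IBM", 100.0, 200), (30, "IBM", 102.0, 200),   # minute 0
    (65, "IBM", 99.0, 100),                            # minute 1
]
my_buys = [
    (10, "IBM", 101.5, 50),   # minute 0: paid more than the market VWAP
    (70, "IBM", 99.0, 10),    # minute 1: matched the market VWAP
]
alerts = execution_alerts(market_feed, my_buys)
```

With this sample data only minute 0 raises an alert, because the buy at 101.5 is above the market VWAP for that minute.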
8 StreamSQL Will Dominate Rule Engines
Essentially all applications entail a mix of stored and real-time data
  StreamSQL covers both kinds of data in a single paradigm
  A rule engine must switch paradigms
StreamSQL amenable to compilation
  Know what is the next event to process
  In contrast, hard to figure this out in a rule engine
9 Performance Benchmark
Financial Services Application:
  Construct a virtual feed of “first arrivers” on a low-end Linux machine
Relational DB: 11,000 messages/sec
StreamBase: 300,000 messages/sec
Another StreamSQL vendor: 20,000 messages/sec
Result: StreamBase was a factor of 27 faster than the relational DB
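The slide does not define "first arrivers". One plausible reading, assumed here, is that several redundant feeds carry the same messages and the virtual feed keeps whichever copy of each message arrives first. The sketch below illustrates that reading only; the message layout (feed name, symbol, sequence number, price) and the function name are hypothetical, and this is not the benchmarked application.

```python
def first_arrivers(messages):
    """Yield each message the first time its (symbol, seq) key is seen.

    `messages` is an iterable of (feed, symbol, seq, price) tuples in
    arrival order, merged from several redundant feeds.
    """
    seen = set()
    for feed, symbol, seq, price in messages:
        key = (symbol, seq)
        if key in seen:
            continue          # a later copy of a message already delivered
        seen.add(key)
        yield feed, symbol, seq, price

# Hypothetical merged input from two feeds, A and B, in arrival order.
messages = [
    ("A", "IBM", 1, 100.0),
    ("B", "IBM", 1, 100.0),   # duplicate, arrives second
    ("B", "IBM", 2, 100.5),
    ("A", "IBM", 2, 100.5),   # duplicate, arrives second
    ("A", "MSFT", 1, 30.0),
]
virtual_feed = list(first_arrivers(messages))
```

The `seen` set grows without bound in this sketch; a long-running version would need to expire old keys, for example by sequence-number watermark.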
10 Tick Stores (and Other Warehouse Applications)
Store all market data for the last 10 years
  To back test “secret sauce” models
  To answer ad-hoc queries – “how many times has X happened”
Typical size – 100 Tbytes
Append only
12 Rotate Your Thinking 90 Degrees
Column stores read only the columns required
  Not all of them
Compression works better
  By a factor of 2-3 against the elephants
No record headers
  Which are big ticket items
No padding to byte or word boundaries
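A toy illustration of the two column-store points, reading only the needed column and compressing better, is sketched below. It says nothing about how Vertica or any other product is implemented: the table contents are invented, and run-length encoding is used only as one simple example of a scheme that benefits from storing a column's values contiguously.

```python
from itertools import groupby

# Row layout: every query has to walk whole records.
rows = [
    ("IBM", 100.0, 200),
    ("IBM", 100.5, 100),
    ("IBM", 101.0, 300),
    ("MSFT", 30.0, 500),
    ("MSFT", 30.1, 400),
]

# Column layout: one contiguous list per attribute.
columns = {
    "symbol": [r[0] for r in rows],
    "price": [r[1] for r in rows],
    "volume": [r[2] for r in rows],
}

def total_volume(cols):
    """Aggregate that touches only the 'volume' column."""
    return sum(cols["volume"])

def run_length_encode(values):
    """Collapse runs of equal adjacent values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]

volume_sum = total_volume(columns)
encoded_symbols = run_length_encode(columns["symbol"])
```

Because equal symbols sit next to each other in the column, five stored values collapse to two pairs. In a row layout the same values are interleaved with prices and volumes, so no such runs exist.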
13 Benchmark Summary
Vertica has been baked off about 30 times
  Typically against the incumbent
Has yet to win by less than a factor of 30 against a row store
Beats most other column stores by around 10X
  KX is the only system to come within an order of magnitude
14 Maybe Elephants are Good at OLTP…
OLTP is a main memory market
  Not a disk-based one
Transactions are short and have no I/O or user stalls
  Run to completion (single threaded)
Disaster Recovery (and HA) a requirement
  Build it into the bottom of the system
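The "run to completion (single threaded)" idea can be sketched as a main-memory store that executes queued transactions one at a time, so no locks or latches are needed. This is a minimal illustration and not H-Store's design: the class, the `transfer` transaction, and the abort convention are hypothetical, and replication for disaster recovery and HA is left out entirely.

```python
from collections import deque

class SingleThreadedStore:
    """Main-memory key-value store that runs each transaction to completion."""

    def __init__(self, initial=None):
        self.data = dict(initial or {})
        self.queue = deque()

    def submit(self, txn, *args):
        """Queue a transaction: a function taking the data dict plus arguments."""
        self.queue.append((txn, args))

    def run(self):
        """Execute queued transactions serially; no interleaving, so no locking."""
        outcomes = []
        while self.queue:
            txn, args = self.queue.popleft()
            try:
                outcomes.append(txn(self.data, *args))
            except ValueError:
                # In this sketch, transactions validate before writing,
                # so an abort leaves the data untouched.
                outcomes.append("aborted")
        return outcomes

def transfer(data, src, dst, amount):
    """Move `amount` from src to dst, aborting if src cannot cover it."""
    if data.get(src, 0) < amount:
        raise ValueError("insufficient funds")
    data[src] -= amount
    data[dst] = data.get(dst, 0) + amount
    return "committed"

store = SingleThreadedStore({"a": 100, "b": 0})
store.submit(transfer, "a", "b", 60)
store.submit(transfer, "a", "b", 60)   # aborts: only 40 left in "a"
outcomes = store.run()
```

The design choice being illustrated is that short transactions with no I/O or user stalls leave nothing for a second thread to overlap with, so serial execution avoids concurrency-control overhead altogether.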
15 TPC-C Performance on a Low-end Machine
Elephant
  850 TPS (1/2 the land speed record per processor)
H-Store (so far – a university prototype)
  70,416 TPS (41X the land speed record per processor)
Factor of 82!!!!!
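The ratios quoted on this slide can be checked with a few lines of arithmetic. The only inputs are the two TPS figures from the slide. The per-processor record itself is not stated, so it is inferred here from the "1/2 the land speed record" remark, on the assumption that both TPS figures are per-processor numbers.

```python
elephant_tps = 850
hstore_tps = 70_416

# Direct comparison of the two systems on the same machine.
speedup = hstore_tps / elephant_tps            # about 82.8

# Assumption: the elephant runs at half the per-processor record,
# so the implied record is twice its throughput.
implied_record_tps = elephant_tps * 2
hstore_vs_record = hstore_tps / implied_record_tps   # about 41.4

print(f"H-Store vs elephant: {speedup:.1f}x")
print(f"H-Store vs implied record: {hstore_vs_record:.1f}x")
```

Both results round down to the slide's figures of 82 and 41, so the three numbers on the slide are mutually consistent.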
16 Implications for the Elephants
They are selling “one size fits all”
  Which is 30-year-old legacy technology that is good at nothing
18 The DBMS Landscape – Performance Needs
[Figure: performance needs, marked low to high, for three segments: Streaming data, OLTP, Data Warehouse]
19 One Size Does Not Fit All -- Pictorially
[Figure labels: Elephants get only “the crevices”; StreamBase; Open source; H-Store successors; Vertica]
20 Thank You