Apr, 2012 Maximize WebFOCUS Performance with Hyperstage.


1 Apr, 2012 Maximize WebFOCUS Performance with Hyperstage

2 Agenda
• Introduction to Hyperstage
• How does it work
• Recent results
• Demonstration
• Wrap Up and Q&A

3 Copyright 2007, Information Builders. Slide 3 Introducing Hyperstage

4 WebFOCUS Hyperstage: Why?
Why Do BI Applications Fail? Typically 3 Reasons:
1. Too Complicated: Self-Service, Guided Ad hoc
2. Bad Data: Data Quality
3. Too Slow: Hyperstage
Hyperstage will improve database performance for WebFOCUS applications with less hardware, no database tuning and easy migration.

5 What is WebFOCUS Hyperstage
• Embedded, columnar data store that can dramatically increase the performance of WebFOCUS applications
  • Columnar = reduced I/O (vs relational)
• Easily implemented without the need for database administration
• Disk footprint is reduced with a powerful compression algorithm
• Includes embedded ETL for seamless migration of existing analytical databases
  • No change in query or application required
  • Data migrations are seamless and easy
• WF M and higher includes optimized Hyperstage Adapter
• Runs on commodity hardware (Intel based)
  • Windows 64
  • Linux (Redhat, Centos, Suse, Debian)

6 Introducing WebFOCUS Hyperstage
Hyperstage is an integrated, column-oriented data store that helps WebFOCUS applications achieve outstanding query performance.

7 How does it work?
Smarter Architecture
• No maintenance
• No query planning
• No partition schemes
• No DBA
WebFOCUS Hyperstage Engine
• Column Orientation
• Data Packs: data stored in manageably sized, highly compressed data packs
• Knowledge Grid: statistics and metadata “describing” the super-compressed data
• Data compressed using algorithms tailored to data type

8 Data Organization and the Knowledge Grid …

9 Pivoting Your Perspective: Columnar Technology

Employee Id   Name     Location   Sales
1             Smith    New York   50,000
2             Jones    New York   65,000
3             Fraser   Boston     40,000
4             Fraser   Boston     70,000

Data stored in rows:
  1, Smith, New York, 50,000
  2, Jones, New York, 65,000
  3, Fraser, Boston, 40,000
  4, Fraser, Boston, 70,000

Data stored in columns:
  Employee Id: 1, 2, 3, 4
  Name: Smith, Jones, Fraser, Fraser
  Location: New York, New York, Boston, Boston
  Sales: 50,000, 65,000, 40,000, 70,000
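The difference between the two layouts can be sketched in a few lines of Python. This is only an illustration of the row-versus-column idea using the slide's four employees; the variable names are made up for the example and nothing here reflects how Hyperstage stores data internally.

```python
# Illustrative only: the slide's sample table held row-wise, then pivoted column-wise.
rows = [
    (1, "Smith", "New York", 50_000),
    (2, "Jones", "New York", 65_000),
    (3, "Fraser", "Boston", 40_000),
    (4, "Fraser", "Boston", 70_000),
]

# Pivot the rows into one list per column (column names are hypothetical).
names = ("employee_id", "name", "location", "sales")
columns = {name: list(values) for name, values in zip(names, zip(*rows))}

# Row layout: every row is visited to total a single field.
total_from_rows = sum(row[3] for row in rows)

# Column layout: only the "sales" column is read.
total_from_columns = sum(columns["sales"])

print(columns["location"])   # ['New York', 'New York', 'Boston', 'Boston']
print(total_from_columns)    # 225000
```

A query that aggregates one column only has to read that column's values, which is the "reduced I/O" argument made for columnar storage.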

10 Data Organization and the Knowledge Grid: Data Packs
• The data within each column is stored in groupings of 65,536 values called Data Packs
• Data Packs improve data compression, as the optimal compression algorithm is applied based on the data contents
• An average compression ratio of 10:1 is achieved after loading data into Hyperstage. For example, 1TB of raw data can be stored in about 100GB of space.
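As a rough sketch of the grouping described on this slide, the snippet below splits one column into consecutive packs of 65,536 values and counts how many packs a column of a given length needs. The function names are hypothetical and the real engine's layout is not shown in the slides.

```python
import math

PACK_SIZE = 65_536  # values per Data Pack, as stated on the slide


def split_into_packs(column, pack_size=PACK_SIZE):
    """Split one column's values into consecutive fixed-size packs."""
    return [column[i:i + pack_size] for i in range(0, len(column), pack_size)]


def pack_count(row_count, pack_size=PACK_SIZE):
    """Number of packs one column needs for a given row count."""
    return math.ceil(row_count / pack_size)


column = list(range(200_000))
packs = split_into_packs(column)
print([len(p) for p in packs])   # [65536, 65536, 65536, 3392]
print(pack_count(14_000_000))    # 214
```

The last pack is simply shorter when the row count is not a multiple of the pack size.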

11 Data Organization and the Knowledge Grid: 64K Data Packs and Compression
64K Data Packs
• Each data pack contains 65,536 data values
• Compression is applied to each individual data pack
• The compression algorithm varies depending on data type and data distribution
Compression
• Results vary depending on the distribution of data among data packs
• A typical overall compression ratio seen in the field is 10:1
• Some customers have seen results as high as 40:1
Patent Pending Compression Algorithms
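To see why results depend on the distribution of the data, here is a small experiment that compresses two packs separately. It uses `zlib` from the Python standard library purely as a stand-in; Hyperstage's own patent-pending algorithms are not described in the slides, so the ratios printed here say nothing about the product.

```python
import random
import struct
import zlib

PACK_SIZE = 65_536


def compress_pack(values):
    """Compress one pack of 32-bit unsigned ints; return (raw size, compressed size)."""
    raw = struct.pack(f"<{len(values)}I", *values)
    return len(raw), len(zlib.compress(raw, 9))


random.seed(0)
# A pack with only three distinct values versus a pack of random 32-bit values.
low_cardinality = [random.choice((10, 20, 30)) for _ in range(PACK_SIZE)]
high_cardinality = [random.getrandbits(32) for _ in range(PACK_SIZE)]

for label, pack in (("low cardinality", low_cardinality),
                    ("high cardinality", high_cardinality)):
    raw, packed = compress_pack(pack)
    print(f"{label}: {raw / packed:.1f}:1")
```

The pack with few distinct values compresses far better than the random one, which is the effect the "results vary" bullet refers to.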

12 Data Organization and the Knowledge Grid: The Knowledge Grid
[Figure: Knowledge Nodes laid over Pack Rows 1 to 6 of Column A and Column B]
• Global Knowledge, built during LOAD: string and character data, numeric data, distributions
• Dynamic Knowledge, built per-query, e.g. for aggregates, joins

13 This metadata layer = 1% of the compressed volume
• Data Pack Nodes (DPN): A separate DPN is created for every data pack in the database to store basic statistical information.
• Character Maps (CMAPs): Every Data Pack that contains text gets a matrix that records the occurrence of every possible ASCII character.
• Histograms: A histogram with 1024 MIN-MAX intervals is created for every Data Pack that contains numeric data.
• Pack-to-Pack Nodes (PPN): PPNs track relationships between Data Packs when tables are joined. Query performance gets better as the database is used.
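The per-pack metadata described above might look roughly like the following. This is a sketch under stated assumptions: the field names, the choice of min/max/sum/count as the "basic statistical information", the equal-width intervals, and the per-position character map are all guesses made for illustration, not the documented Hyperstage structures.

```python
from dataclasses import dataclass


@dataclass
class DataPackNode:
    """Hypothetical per-pack statistics; the field names are illustrative."""
    minimum: int
    maximum: int
    total: int
    count: int


def build_dpn(pack):
    """Summarize a numeric pack without keeping its values."""
    return DataPackNode(min(pack), max(pack), sum(pack), len(pack))


def build_histogram(pack, intervals=1024):
    """Mark which of `intervals` equal-width slices of [min, max] hold a value."""
    lo, hi = min(pack), max(pack)
    width = (hi - lo) / intervals or 1  # fall back to 1 when all values are equal
    bits = [False] * intervals
    for v in pack:
        bits[min(int((v - lo) / width), intervals - 1)] = True
    return bits


def build_cmap(pack):
    """Record which (position, character) pairs occur in a text pack (assumed layout)."""
    return {(pos, ch) for text in pack for pos, ch in enumerate(text)}


salaries = [30_000, 42_000, 58_000, 61_000]
dpn = build_dpn(salaries)
print(dpn.minimum, dpn.maximum, dpn.count)   # 30000 61000 4

hist = build_histogram(salaries)
cmap = build_cmap(["Toronto", "Boston"])
print((0, "T") in cmap, (0, "X") in cmap)    # True False
```

Structures like these are small compared with the data they describe, and they let a query rule packs in or out before any pack is decompressed.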

14 How does it work …

15 WebFOCUS Hyperstage Example: Query and Knowledge Grid
SELECT count(*) FROM employees WHERE salary > AND age < 65 AND job = 'Shipping' AND city = 'Toronto';
[Figure: Data Packs for the columns salary, age, job, city. Legend: Completely Irrelevant, Suspect, All values match]

16 WebFOCUS Hyperstage Example: salary >
SELECT count(*) FROM employees WHERE salary > AND age < 65 AND job = 'Shipping' AND city = 'Toronto';
1. Find the Data Packs with salary >
[Figure: Data Packs for the columns salary, age, job, city. Legend: Completely Irrelevant, All values match]

17 WebFOCUS Hyperstage Example: age < 65
SELECT count(*) FROM employees WHERE salary > AND age < 65 AND job = 'Shipping' AND city = 'Toronto';
1. Find the Data Packs with salary >
2. Find the Data Packs that contain age < 65
[Figure: Data Packs for the columns salary, age, job, city. Legend: Completely Irrelevant, Suspect, All values match]

18 WebFOCUS Hyperstage Example: job = 'shipping'
SELECT count(*) FROM employees WHERE salary > AND age < 65 AND job = 'Shipping' AND city = 'Toronto';
1. Find the Data Packs with salary >
2. Find the Data Packs that contain age < 65
3. Find the Data Packs that have job = 'shipping'
[Figure: Data Packs for the columns salary, age, job, city. Legend: Completely Irrelevant, Suspect, All values match]

19 WebFOCUS Hyperstage Example: city = 'Toronto'
SELECT count(*) FROM employees WHERE salary > AND age < 65 AND job = 'Shipping' AND city = 'Toronto';
1. Find the Data Packs with salary >
2. Find the Data Packs that contain age < 65
3. Find the Data Packs that have job = 'shipping'
4. Find the Data Packs that have city = 'Toronto'
[Figure: Data Packs for the columns salary, age, job, city. Legend: Completely Irrelevant, Suspect, All values match]

20 WebFOCUS Hyperstage Example: Eliminate Pack Rows
SELECT count(*) FROM employees WHERE salary > AND age < 65 AND job = 'Shipping' AND city = 'Toronto';
1. Find the Data Packs with salary >
2. Find the Data Packs that contain age < 65
3. Find the Data Packs that have job = 'shipping'
4. Find the Data Packs that have city = 'Toronto'
5. Eliminate all pack rows that have been flagged as irrelevant
[Figure: Data Packs for the columns salary, age, job, city, with three pack rows marked "All packs ignored". Legend: Completely Irrelevant, Suspect, All values match]

21 WebFOCUS Hyperstage Example: Decompress and scan
SELECT count(*) FROM employees WHERE salary > AND age < 65 AND job = 'Shipping' AND city = 'Toronto';
1. Find the Data Packs with salary >
2. Find the Data Packs that contain age < 65
3. Find the Data Packs that have job = 'shipping'
4. Find the Data Packs that have city = 'Toronto'
5. Eliminate all pack rows that have been flagged as irrelevant
6. Finally we identify the pack that needs to be decompressed
[Figure: Data Packs for the columns salary, age, job, city, with three pack rows marked "All packs ignored" and one marked "Only this pack will be de-compressed". Legend: Completely Irrelevant, Suspect, All values match]
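The six steps in this example can be sketched as follows. This is a simplified model, not the engine's actual code: pack statistics are computed on the fly instead of being read from a Knowledge Grid, text columns are summarized as a set of distinct values rather than a character map, and `salary_floor` is a hypothetical stand-in because the salary threshold is missing from the transcript.

```python
from enum import Enum


class Match(Enum):
    IRRELEVANT = 0   # no value in the pack can satisfy the predicate
    SUSPECT = 1      # some values may satisfy it, so the pack must be scanned
    ALL = 2          # every value in the pack satisfies it


def classify_gt(lo, hi, bound):
    """Classify `value > bound` from a pack's min and max."""
    if hi <= bound:
        return Match.IRRELEVANT
    return Match.ALL if lo > bound else Match.SUSPECT


def classify_lt(lo, hi, bound):
    """Classify `value < bound` from a pack's min and max."""
    if lo >= bound:
        return Match.IRRELEVANT
    return Match.ALL if hi < bound else Match.SUSPECT


def classify_eq(distinct, wanted):
    """Classify `value = wanted` from the set of distinct values in a pack."""
    if wanted not in distinct:
        return Match.IRRELEVANT
    return Match.ALL if distinct == {wanted} else Match.SUSPECT


def combine(results):
    """AND together the per-column verdicts for one pack row."""
    if Match.IRRELEVANT in results:
        return Match.IRRELEVANT
    return Match.ALL if all(r is Match.ALL for r in results) else Match.SUSPECT


def count_matches(pack_rows, salary_floor):
    """Return (matching row count, number of pack rows that had to be scanned)."""
    total, scanned = 0, 0
    for rows in pack_rows:
        salaries = [r["salary"] for r in rows]
        ages = [r["age"] for r in rows]
        verdict = combine([
            classify_gt(min(salaries), max(salaries), salary_floor),
            classify_lt(min(ages), max(ages), 65),
            classify_eq({r["job"] for r in rows}, "Shipping"),
            classify_eq({r["city"] for r in rows}, "Toronto"),
        ])
        if verdict is Match.ALL:
            total += len(rows)       # answered from statistics alone
        elif verdict is Match.SUSPECT:
            scanned += 1             # only these pack rows are "decompressed"
            total += sum(
                r["salary"] > salary_floor and r["age"] < 65
                and r["job"] == "Shipping" and r["city"] == "Toronto"
                for r in rows
            )
    return total, scanned


def row(salary, age, job, city):
    return {"salary": salary, "age": age, "job": job, "city": city}


pack_rows = [
    [row(20_000, 30, "Shipping", "Toronto"), row(25_000, 41, "Sales", "Toronto")],     # salary rules it out
    [row(60_000, 70, "Shipping", "Toronto"), row(70_000, 66, "Shipping", "Toronto")],  # age rules it out
    [row(60_000, 40, "Shipping", "Toronto"), row(30_000, 50, "Sales", "Boston")],      # suspect
    [row(80_000, 35, "Shipping", "Toronto"), row(90_000, 45, "Shipping", "Toronto")],  # all values match
]
print(count_matches(pack_rows, salary_floor=50_000))   # (3, 1)
```

In this toy data set, two pack rows are discarded, one is counted from its statistics, and only one has to be scanned, mirroring the single decompressed pack on the slide.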

22 POC Results (Internal Use Only)
• Insurance Company
  • Query performance issues with SQL Server: insurance claims analysis
  • 3 day POC; compression achieved 40:1
  • Most queries running 3X faster in Hyperstage
• Large Bank
  • Query performance issues with SQL Server: web traffic analysis
  • 3 day POC; compression achieved 10:1
  • Queries that ran for 10 to 15 mins in SQL Server ran sub-second in Hyperstage
• Government Application
  • Query performance issues with Oracle: Federal Loan/Grant Tracking
  • 3 day POC; compression achieved 15:1
  • Queries that ran for 10 to 15 mins in Oracle ran in 30 secs in Hyperstage
POCs can typically be completed within 3 days

23 Beyond WebFOCUS
[Diagram: Java and .Net applications reach the WebFOCUS Reporting Server through the WF Connector and WF Service; the WebFOCUS Client talks to the Reporting Server; the Reporting Server reaches the WebFOCUS Hyperstage Server through the WF Hyperstage Adapter; generic applications (Java, C, .Net, PHP, Perl) connect to the Hyperstage Server directly]
• Hyperstage is integrated in the WebFOCUS BI Architecture through the reporting server and is administered using the WebFOCUS console
• WebFOCUS client applications communicate directly through the reporting server
• Custom applications developed via Java or .Net can access the reporting server via WebFOCUS services and a supplied WebFOCUS connector
• Hyperstage also supports connections from any application via industry standard JDBC or ODBC connections. There are also native drivers for .NET, C, or PHP applications to connect directly to the Hyperstage engine.
• Data can be loaded and maintained in Hyperstage using iWay Data Integration or using any commercial ETL tool.

24 Hyperstage vs. OLAP
• Many companies are looking to migrate from legacy OLAP solutions
• Hyperstage can offer excellent query performance with a commonly understood star pattern database
• WebFOCUS can offer navigation and drill path navigation
• Hyperstage can support large numbers of dimensional attributes and can be easily updated

OLAP                                               WebFOCUS Hyperstage
Limited number of dimensions                       Supports up to 4096 columns on a single table
Difficult to add new dimensions                    Dimension tables can be updated
Rebuilding cubes can be slow                       Bulk loads of up to 500GB per hour
Disk consumed can be up to 10X the raw data size   Typically 10:1 compression

25 Hyperstage vs. In-Memory
• WebFOCUS Hyperstage is a viable alternative to BI tools that utilize an in-memory architecture like QlikView, Tableau, Cognos TM1 and Tibco/Spotfire
• In-memory is limited to the amount of data you can store in RAM.
• Hyperstage is a hybrid approach that efficiently uses disk I/O without sacrificing the performance achieved by in-memory
• Tableau for example has approximately a 100GB limit on its in-memory cache.

In Memory Solutions            WebFOCUS Hyperstage
Storage: RAM                   Storage: RAM/Disk
Expensive                      Cheap
Short term                     Long term
Requires additional hardware   Leverage existing hardware

26 Demonstration …

27 NYSE Daily Stock Price History
• Daily history from 1970 to 2006 for 7000 stocks, downloaded from the internet
• 14 million rows
• 1.4GB of raw data
• Compressed to 70MB
• Test query summarizes stock information for top tech companies in March 2000 and compares it with the same period in March 2002 (dot com collapse)
• Note: Hyperstage running on a Dell laptop with 1 duo core processor and 4GB of RAM

28 NYSE Daily Stock Price History (exploded)
• Simulated additional stock prices up to 2043
• 2 billion rows
• 200GB of raw data
• Compressed to 17GB
• Test query summarizes stock information for top tech companies in March 2000 and compares it with the same period in March 2002 (dot com collapse)
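The compression ratios implied by the two demonstration data sets can be worked out from the sizes quoted on the slides. The short calculation below assumes decimal units (1GB = 1000MB); the helper name is made up for the example.

```python
def compression_ratio(raw_bytes, compressed_bytes):
    """Raw size divided by compressed size."""
    return raw_bytes / compressed_bytes


MB, GB = 10**6, 10**9   # decimal units assumed

original = compression_ratio(1.4 * GB, 70 * MB)   # 1.4GB compressed to 70MB
exploded = compression_ratio(200 * GB, 17 * GB)   # 200GB compressed to 17GB
print(f"{original:.1f}:1")   # 20.0:1
print(f"{exploded:.1f}:1")   # 11.8:1
```

Both figures sit at or above the typical 10:1 ratio quoted earlier in the deck.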

29 WebFOCUS Hyperstage: The Big Deal…
• No indexes
• No partitions
• No views
• No materialized aggregates
No DBA Required!
• Value proposition
  • Low IT overhead
  • Allows for autonomy from IT
  • Ease of implementation
  • Fast time to market
  • Less Hardware
  • Lower TCO

30 Q&A

