SAP SYBASE IQ15 VLDB OPTION

Technical Overview
Courtney Claussen, Analytics Product Management Team
February 2012

2 AGENDA
Product Success
What is Information Lifecycle Management?
SAP SYBASE IQ VLDB Option
VLDB in Use in a Large Bank
PowerDesigner ILM Model for SAP SYBASE IQ
Summary

This slide deck describes the technical details of the SAP SYBASE IQ VLDB option for managing big data. First, we will showcase SAP SYBASE IQ as a proven analytics platform for big data applications. Then we will describe the concept and practice of Information Lifecycle Management (ILM) for managing large volumes of data. The SAP SYBASE IQ VLDB option supports ILM with partitioning, placement, and data administration features that are important for big data management. We will show how the VLDB option is being used at a large bank. And then we will describe how Sybase PowerDesigner has been enhanced with modeling features that support building an ILM scenario that can be deployed in SAP SYBASE IQ.

3 Product Success

Mature, industrial strength analytic DBMS
LEADERSHIP: Industry leading performance & scale benchmarks; recognized EDW market leader by Gartner, Forrester; pioneering technology with 10+ patents
ADOPTION: 4500+ installations in accounts; ~200 new customer wins per year (last 4 years); consistently 96%+ customer satisfaction rates
MOMENTUM: 2 x DW market growth rate (last 4 years); fast paced product releases v15, v15.1 (2009), v15.2 (2010), v15.3, v15.4 (2011)
Ericsson • Sungard • Nielsen • BNP Paribas • Telefonica • comScore • Agricultural Bank of China

SAP SYBASE IQ is a recognized leader in analytics, with a growing customer base and a high level of customer satisfaction. Our revenue growth curve has been double that of the general data warehouse marketplace during the last 4 years. Our performance in certified benchmarks, and recognition as a market leader by Gartner and Forrester, have earned us our stripes. Sybase has never rested on its laurels, though, and has continuously improved SAP SYBASE IQ with a rapid series of innovations. The last 3 years have seen 5 major releases that have introduced significant new features and capabilities.

5 SAP SYBASE IQ Stores and analyzes large amounts of data
Stands out as the leading enterprise data warehouse amongst the largest banks, insurance agencies, and telecom operators worldwide
Manage and analyze statistical measures for the entire nation of Canada
Analyze complex models in more than 200 financial institutions worldwide
Analyze ALL Federal tax returns in the US
Store and analyze massive amounts of industry segment data in 30 of the largest information providers in the world, including TransUnion, Nielsen and Acxiom

SAP SYBASE IQ handles big data across many industries worldwide. It is the custodian of very large societal information:
All statistical measures in Canada
All federal tax returns in the USA
All citizen health information in Korea
SAP SYBASE IQ allows the largest commercial information providers to thrive, including 30 of the largest information providers in the world: Nielsen, Experian, TransUnion, Acxiom, Dun & Bradstreet, Thomson Reuters, TNS Media, …
SAP SYBASE IQ crunches through the most complex models in the financial world, and is deployed in more than 200 financial institutions: JPMC, HSBC, Goldman Sachs, Alliance Bernstein, Citigroup, CSFB, Etrade, …

6 What is Information Lifecycle Management?
First, let’s begin with a definition of Information Lifecycle Management.

7 “ILM is a management approach aimed at tackling the storage ‘information overload' problem which has so far failed to live up to its potential. The key to its success is being able to automate identification of the most valuable information contained in company data at any given time so that relatively unimportant data can be automatically demoted to lower-cost, less accessible storage media and ultimately discarded.”
Bloor Research

Here is a definition of Information Lifecycle Management from respected research firm, Bloor Research.

8 ILM in the Real World
NOAA: National Oceanic and Atmospheric Administration
A global network of sensors provides a steady stream of data on the Earth’s oceans and weather
With streams and a vast archive of historical data, NOAA manages some of the largest databases in federal government
The Princeton, NJ data center alone stores more than 20 petabytes of data
NOAA CIO Joe Klimavicz: “I focus much of my time on DATA LIFECYCLE MANAGEMENT”
“The keys to ensuring that data is useable and easy to find include using accurate metadata, publishing data in standard formats, and having a WELL-CONCEIVED DATA STORAGE STRATEGY”

ILM is a real challenge for organizations that are dealing with large volumes of data. NOAA is one example.

9 Data Decreases in Value Over Time
[Figure: data lifecycle. Stages: Business Event; Operational Transaction Data; Transform and Load into DW; Data is Queried, Analysed and Reported; Data is Archived; Data is Purged. Time axis starting at T=0: Minute/s, Hour/s, Day/s, Months, Year/s, Decade/s.]

The value of data changes over time, beginning when it first appears as a business event. Business data begins as an operational transaction that is fulfilled and closed relatively quickly. After that it becomes data required for current reporting, then for historical reporting, then it is archived for compliance and risk mitigation, and finally it is purged when it has no further value.

10 Information Lifecycle Management
Data partitioning and placement according to data value
[Figure: monthly data partitions, Jan through Dec, moving through the lifecycle steps below]
1. Roll-on: Load monthly table partition
2. Mark partition read-only
3. Back-up the partition
4. Drop partition
5. Drop backup files

Many companies implement a “roll on, roll off” scenario, where new data is loaded into a particular area of fast storage, then as it ages, it is moved through tiered storage, each tier implemented with cheaper and slower storage. The purpose of this is to spend IT resource dollars more efficiently, and acquire just the right level of service for each type of data.
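The roll-on, roll-off idea can be sketched as a sliding window over monthly partitions. This is a minimal illustration only, not IQ functionality: the function name `roll_window` and the `keep` parameter are assumptions, and the read-only, backup, and backup-cleanup steps (2, 3 and 5) are left out to keep it short.

```python
from collections import deque

def roll_window(months, keep):
    """Load each month in order and drop the oldest once more than `keep` are online."""
    online = deque()   # partitions currently in the table, oldest first
    dropped = []       # partitions that have been rolled off, in order
    for month in months:
        online.append(month)                  # step 1: roll-on (load the partition)
        if len(online) > keep:
            dropped.append(online.popleft())  # step 4: roll-off (drop the oldest)
    return list(online), dropped

online, dropped = roll_window(["Jan", "Feb", "Mar", "Apr", "May"], keep=3)
print("online:", online)
print("dropped:", dropped)
```

With a window of three months, loading a fourth month pushes the oldest one out, which is the point at which a real deployment would drop or archive that partition.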


12 SAP SYBASE IQ Information lifecycle management
[Figure: SAP SYBASE IQ15 Engine components: Multiplex Grid Architecture; Admin & Monitoring Framework; Storage Area Network; Communications & Security; Column Indexing Sub-system; Loading Engine; Column Storage Processor; Query Engine; In-Database Analytics; Text Search; Web Enabled Analytics; Information Lifecycle Management]
Information Lifecycle Management: Manage data through its existence in the DW

Among its many other capabilities, SAP SYBASE IQ offers information lifecycle management features that help users manage large volumes of data more effectively.

Data partitioning
Multiple user DBSpaces
Separate unstructured data from transactional data
Place frequently accessed data on fast storage
Granular database administration with read-only, read-write, on-line and off-line DBSpaces
[Figure: Catalog Store; IQ Main Store for User Data; Temp Store; DBSpace; DBFile; Table; Table Partition; Table Column; Index]

Being able to manage data according to its value requires partitioning functions to organize data, multiple user DBSpaces to map logical containers to different areas of physical storage, and placement commands to locate data in the preferred DBSpace. SAP SYBASE IQ offers all of this. In addition, DBSpaces can be marked read-only so that once data is no longer changing, it can be validated and backed up only once.

14 VLDB OPTION Benefits

Partitioned Tables
  VLDB Option: Partition by range; single column partition key
  SAP SYBASE IQ Base Product: Single table partition
Number of User DBSpaces
  VLDB Option: Multiple DBSpaces, each with multiple DBFiles; unlimited data volume
  SAP SYBASE IQ Base Product: Single user DBSpace with multiple DBFiles
Database Object Placement
  VLDB Option: Place database objects (tables, table partitions, columns, indexes) in specific DBSpaces
  SAP SYBASE IQ Base Product: All database objects are placed in one user DBSpace
DBSpace Attributes
  VLDB Option: DBSpaces can be marked read-only, read-write, on-line or off-line
  SAP SYBASE IQ Base Product: Single user DBSpace is read-write and on-line
DBSpace Management
  VLDB Option: Validate read-write portions of database separately from read-only; backup read-write DBSpaces separately from read-only
  SAP SYBASE IQ Base Product: Validate and backup single user DBSpace as a unit

This table shows what is included in the base product compared to what is provided by the VLDB option. In the base product, you cannot partition tables, and you have a single user DBSpace, albeit with multiple DBFiles and unlimited storage. The single user DBSpace is writeable and always on-line. With the VLDB option, you can partition tables based on a range of values of a column (the partition key), and you can have an unlimited number of user DBSpaces. These user DBSpaces can be read-only, read-write, on-line or off-line. You can back up read-write DBSpaces separately from read-only DBSpaces. Read-only DBSpaces need to be backed up only once.

15 ILM in SAP SYBASE IQ Partitioning and placement
IQ provides partitioning and placement features to manage the storage and movement of data:
Partitioning divides data into non-overlapping subsets across a dimension, such as “date”. For example, you may partition customer order data by date
Placement maps a data partition to a particular area of storage: the partition “June Customer Orders 2009” resides in file “/opt/data/orders/june2009.dat”
Separate big, unstructured data from transactional data:
  Different levels of protection
  Different administration needs
  Use of tiered storage to control cost

Partitioning and placement are two key functions necessary for information lifecycle management. Partitioning allows you to organize your data into logical sets. Then you can place those data sets in appropriate areas of storage. Partitioning allows you to localize data that belongs together, and to separate data that is not usually accessed at the same time. You can apply the appropriate storage technology to a data set, depending on how quickly the data must be served up, and what your budget is. Data that needs to be accessed quickly and frequently deserves the highest grade storage. Also, you can protect and administer data sets in different ways, according to security and risk mitigation requirements.
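To make the two ideas concrete, here is a small sketch that treats partitioning as a function from a row's date to a partition key, and placement as a lookup from that key to a storage location. It is a conceptual model, not IQ syntax. The June 2009 path comes from the slide; the July path, the `PLACEMENT` name, and both function names are assumptions made for illustration.

```python
from datetime import date

# Placement: partition key -> area of storage.
PLACEMENT = {
    "2009-06": "/opt/data/orders/june2009.dat",  # example from the slide
    "2009-07": "/opt/data/orders/july2009.dat",  # assumed, for illustration only
}

def partition_key(order_date):
    """Partitioning: every date maps to exactly one month, so subsets never overlap."""
    return f"{order_date.year:04d}-{order_date.month:02d}"

def locate(order_date):
    """Placement: find where the partition holding this date is stored."""
    return PLACEMENT[partition_key(order_date)]

print(locate(date(2009, 6, 15)))
```

Because placement is a separate lookup, a partition can be pointed at a different area of storage without changing how rows are assigned to partitions.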

16 Controls for Database Administration
Database administrative operations can be performed with finer control
The database can be divided into read-only and read-write sections that are managed differently
Backup and restore time can be reduced by backing up read-only data once
Data validation can be invoked on just the read-write portions of the database
Frequently accessed data can be assigned to faster data storage, and less frequently accessed data can be segregated to cheaper, slower storage

Database administration can be a very time consuming and costly activity. Think how much time you can save by dividing up the database into read-only and read-write sections. You can validate and back up read-only data once, saving precious CPU cycles and clock time.
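The backup rule described here (read-write data every cycle, read-only data once) can be sketched as a simple selection. This is an illustrative model rather than an IQ command: the function name `plan_backup` is hypothetical, FIBER and SATA are the DBSpace names used in this deck's DDL example, and ARCHIVE is an assumed third DBSpace.

```python
def plan_backup(dbspaces, already_backed_up):
    """Return the DBSpaces to include in the next backup cycle."""
    todo = []
    for name, mode in dbspaces.items():
        if mode == "read-write":
            todo.append(name)   # data is still changing: back up every cycle
        elif name not in already_backed_up:
            todo.append(name)   # read-only: back up once, then skip
    return todo

dbspaces = {"FIBER": "read-write", "SATA": "read-only", "ARCHIVE": "read-only"}
print(plan_backup(dbspaces, already_backed_up={"ARCHIVE"}))
```

The same selection applies to validation: once a read-only DBSpace has been checked, only the read-write portion needs to be checked again.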

17 Partition and position a table in IQ
Partition by range: single column partition key

1) Partition table Orders
    CREATE TABLE Orders (
        OrderID INT,
        OrderDate DATE,
        Description CHAR(10) )
    PARTITION BY RANGE (OrderDate) (
        p2010 VALUES <= ' ' IN FIBER,
        p2011 VALUES <= ' ' IN FIBER,
        pNextYear VALUES <= (MAX) IN FIBER );

Over time, as data is being loaded, start migrating older data to slower, cheaper storage

2) Move p2010 to SATA storage
    ALTER TABLE Orders MOVE PARTITION p2010 TO SATA;

3) Later, drop very old partitions
    ALTER TABLE Orders DROP PARTITION p2010;

This slide shows examples of IQ DDL commands to create, move and drop partitions. The “PARTITION BY RANGE” clause on the CREATE TABLE statement at the top shows the creation of several table partitions in one statement. The ALTER TABLE…MOVE PARTITION statement shows the movement of a table partition onto a different DBSpace as it ages.
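The way a range-partitioned table routes a row can be sketched as a search over ascending upper bounds, with the last partition acting as the catch-all that `(MAX)` provides. This is a conceptual model, not how IQ implements it. The year-end bounds below are assumptions, since the slide's date literals did not survive, and `partition_for` is a hypothetical helper name.

```python
from bisect import bisect_left
from datetime import date

# Inclusive upper bounds for p2010 and p2011 (assumed year-end dates).
BOUNDS = [date(2010, 12, 31), date(2011, 12, 31)]
# One more name than bounds: the last partition is the MAX catch-all.
NAMES = ["p2010", "p2011", "pNextYear"]

def partition_for(order_date):
    """Return the first partition whose upper bound is >= order_date."""
    return NAMES[bisect_left(BOUNDS, order_date)]

for d in (date(2010, 12, 31), date(2011, 1, 1), date(2012, 5, 1)):
    print(d, "->", partition_for(d))
```

A row dated exactly on a bound lands in that bound's partition, which matches the `<=` comparison in the partition clause.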

18 Full Mesh High Speed Interconnect
Virtual Data Marts
Unique, user community focused platform for big data analytics
[Figure: user communities (Data Scientists, Business Analysts, Operations, End Users) connected through a Full Mesh High Speed Interconnect and SAN Fabric]
Virtual data mart of servers and partitioned storage
Workload management
Privacy through isolation of resources
Separate big unstructured data from transactional data
Back up and restore independently

Building upon separation of data and storage into discrete sets, SAP SYBASE IQ Multiplex introduced the concept of “logical servers”. A logical server is a grouping of physical nodes in the Multiplex. When a query is executing on a machine in a logical server, only the nodes within the particular logical server will participate in the query. This allows workloads to be isolated from each other for security or resource balancing purposes. Logical servers are elastic: physical machines may be added to or removed from a logical server dynamically as workload demand changes. A logical server can be used to build a “virtual data mart”, a set of storage and compute resources used for a particular purpose within an enterprise. The data mart is “virtual” because the set of storage and compute resources is part of a larger set, and the boundary around the mart is changeable: data can be moved to other areas of storage, and physical servers can migrate among logical servers.
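The isolation and elasticity of logical servers can be illustrated with a small model in which each logical server is a set of node names. This is a sketch of the concept only, not the Multiplex API. The server and node names are hypothetical, and the model assumes each node belongs to one logical server at a time, which is a simplification.

```python
# Hypothetical logical servers: name -> set of physical node names.
logical_servers = {
    "data_scientists": {"node1", "node2"},
    "operations": {"node3"},
}

def nodes_for_query(entry_node):
    """Return the nodes that may participate in a query started on entry_node."""
    for members in logical_servers.values():
        if entry_node in members:
            return sorted(members)
    raise ValueError(f"{entry_node} is not in any logical server")

def move_node(node, target):
    """Reassign a node to another logical server (the elastic part)."""
    for members in logical_servers.values():
        members.discard(node)
    logical_servers[target].add(node)

print(nodes_for_query("node1"))
```

Calling `move_node("node2", "operations")` would shift capacity toward the operations workload, after which a query entering on node1 would run on node1 alone.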

19 VLDB in Use at a Large Bank

20 Shorten Data Backup Times
A large bank is using the SAP SYBASE IQ VLDB option to shorten backup times. The bank divided the database into read-write and read-only partitions. The read-only DBSpaces are backed up just once, and after that only the read-write data needs to be backed up regularly.

21 Re-claim valuable Storage space
The bank also implemented a data consolidation activity that copied the data from partially used DBFiles (the physical files that make up a logical DBSpace) into other DBFiles in the same DBSpace. Each emptied DBFile was then returned to the storage team for reuse. The result was more efficient use of storage resources, and money saved.
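The consolidation described above can be sketched as follows: pick the least-used DBFile, move its contents into the free space of the other DBFiles in the same DBSpace, and release it if everything fits. This is an illustration with made-up numbers, not the bank's procedure or an IQ utility. The function name `consolidate`, the file names, and the assumption that every DBFile has the same capacity are all hypothetical.

```python
def consolidate(used, capacity):
    """Try to empty the least-used DBFile into the others.

    `used` maps DBFile name -> units in use; every file holds `capacity` units.
    Returns (freed_file, new_usage), or None if the data does not fit elsewhere.
    """
    donor = min(used, key=used.get)     # the file that is cheapest to empty
    remaining = used[donor]
    result = dict(used)
    for name in used:
        if name == donor:
            continue
        moved = min(capacity - result[name], remaining)
        result[name] += moved           # copy data into this file's free space
        remaining -= moved
    if remaining > 0:
        return None                     # not enough free space in the DBSpace
    del result[donor]                   # the emptied file can be handed back
    return donor, result

print(consolidate({"file1": 70, "file2": 60, "file3": 20}, capacity=100))
```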

22 PowerDesigner ILM Model for SAP SYBASE IQ

23 ILM in PowerDesigner
Model the database
Create DBSpaces
Assign cost
Create a new lifecycle
Assign start date and phase retention periods
Associate tables with lifecycle
Select date column partition key
Estimate cost savings
Generate scripts to move partitions through DBSpaces as they age

Implementing ILM in SAP SYBASE IQ is made easier with PowerDesigner. In the Sybase PowerDesigner modeling tool, the user can define a data lifecycle: how data is partitioned, and how partitions are positioned on DBSpaces. PowerDesigner can generate cost savings reports as data is migrated over time onto cheaper storage, and can also generate the DDL scripts that move partitions at prescribed times.

24 Create Lifecycle
Here is a picture of a PowerDesigner dialog box for defining a data lifecycle. The user defines the total length of a lifecycle, how many phases comprise the lifecycle, and how long a partition stays in a particular phase before moving to the next phase.
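A lifecycle defined this way amounts to an ordered list of phases, each with a retention period, and a partition's age determines which phase it is in. The sketch below models that idea. It is not PowerDesigner's data model: the phase names, the retention periods in months, and the function name `phase_for` are assumptions for illustration.

```python
# Hypothetical lifecycle: (phase name, months a partition stays in that phase).
PHASES = [("fast", 3), ("medium", 9), ("slow", 24)]   # total length: 36 months

def phase_for(age_months):
    """Return the phase for a partition of the given age, or None once it has aged out."""
    start = 0
    for name, months in PHASES:
        if age_months < start + months:
            return name
        start += months
    return None   # past the end of the lifecycle: candidate for purge

for age in (0, 3, 12, 36):
    print(age, "->", phase_for(age))
```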

25 Lifecycle Properties
Assign a cost to the storage:
Indicate which tables are part of the lifecycle:

The user assigns database tables to the lifecycle, and estimates the initial volume of data, and how data will potentially grow over time. Each phase of a data lifecycle is associated with a particular tablespace with a particular cost.

26 Generate Data Movement Scripts
PowerDesigner will generate data partition movement scripts that implement the data lifecycle and work with SAP SYBASE IQ.
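As a rough idea of what a generated movement script contains, the sketch below emits statements in the same form as the ALTER TABLE examples shown earlier in this deck. It is not PowerDesigner's generator or its actual output. The function name `movement_script` and its parameters are hypothetical, and the DBSpace name ARCHIVE is assumed.

```python
def movement_script(table, moves, drops):
    """Build a DDL script that moves and drops partitions.

    `moves` is a list of (partition, target_dbspace) pairs; `drops` is a list of partitions.
    """
    lines = [f"ALTER TABLE {table} MOVE PARTITION {part} TO {dbspace};"
             for part, dbspace in moves]
    lines += [f"ALTER TABLE {table} DROP PARTITION {part};" for part in drops]
    return "\n".join(lines)

print(movement_script("Orders",
                      moves=[("p2011", "SATA"), ("p2010", "ARCHIVE")],
                      drops=["p2009"]))
```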

27 Generate Cost Savings Report
Generate cost savings information
Report:

Finally, PowerDesigner can generate a report that shows cost savings as data is migrated through the lifecycle phases onto cheaper and cheaper storage.
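The arithmetic behind such a report is straightforward: multiply the volume held on each tier by that tier's cost, and compare against keeping everything on the fastest tier. The sketch below shows the calculation with made-up figures. The tier names, volumes, and per-GB costs are all hypothetical, and this is not the formula PowerDesigner uses.

```python
def storage_cost(volume_gb, cost_per_gb):
    """Total cost of storing each tier's volume at that tier's unit cost."""
    return sum(volume_gb[tier] * cost_per_gb[tier] for tier in volume_gb)

cost_per_gb = {"fast": 10, "slow": 2}     # assumed cost units per GB

all_fast = {"fast": 1000}                 # everything kept on fast storage
tiered = {"fast": 200, "slow": 800}       # older data moved to slow storage

savings = storage_cost(all_fast, cost_per_gb) - storage_cost(tiered, cost_per_gb)
print("savings:", savings)
```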

28 Summary

SUMMARY
Storage strategies for managing big data: to service data requests responsively, while controlling costs
Learn more
Visit:
Call:

For more information, visit the URL shown.


