SAP SYBASE IQ15 VLDB OPTION


SAP SYBASE IQ15 VLDB OPTION Technical Overview Courtney Claussen Analytics Product Management Team Courtney.claussen@sap.com FEBRUARY, 2012

AGENDA
Product Success
What is Information Lifecycle Management?
SAP SYBASE IQ VLDB Option
VLDB in Use at a Large Bank
PowerDesigner ILM Model for SAP SYBASE IQ
Summary

This slide deck describes the technical details of the SAP SYBASE IQ VLDB option for managing big data. First, we will showcase SAP SYBASE IQ as a proven analytics platform for big data applications. Then we will describe the concept and practice of Information Lifecycle Management (ILM) for managing large volumes of data. The SAP SYBASE IQ VLDB option supports ILM with partitioning, placement, and data administration features that are important for big data management. We will show how the VLDB option is being used at a large bank. Finally, we will describe how Sybase PowerDesigner has been enhanced with modeling features that support building an ILM scenario that can be deployed in SAP SYBASE IQ.

Product Success

SAP SYBASE IQ: LEADERSHIP, ADOPTION, MOMENTUM
Mature, industrial-strength analytic DBMS

LEADERSHIP
Industry-leading performance and scale benchmarks
Recognized EDW market leader by Gartner, Forrester
Pioneering technology with 10+ patents

ADOPTION
4500+ installations in 2150+ accounts
~200 new customer wins per year (last 4 years)
Consistently 96%+ customer satisfaction rates

MOMENTUM
2x DW market growth rate (last 4 years)
Fast-paced product releases: v15, v15.1 (2009), v15.2 (2010), v15.3, v15.4 (2011)

Ericsson • Sungard • Nielsen • BNP Paribas • Telefonica • hmv.com • comScore • Agricultural Bank of China

SAP SYBASE IQ is a recognized leader in analytics, with a growing customer base and a high level of customer satisfaction. Our revenue growth has been double that of the general data warehouse market during the last 4 years. Our performance in certified benchmarks and our recognition as a market leader by Gartner and Forrester have earned us our stripes. Sybase has never rested on its laurels, though, and has continuously improved SAP SYBASE IQ with a rapid series of innovations. The last 3 years have seen 5 major releases that introduced significant new features and capabilities.

SAP SYBASE IQ
Stores and analyzes large amounts of data
Stands out as the leading enterprise data warehouse among the largest banks, insurance agencies, and telecom operators worldwide
Manages and analyzes statistical measures for the entire nation of Canada
Analyzes complex models in more than 200 financial institutions worldwide
Analyzes ALL federal tax returns in the US
Stores and analyzes massive amounts of industry segment data in 30 of the largest information providers in the world, including TransUnion, Nielsen and Acxiom

SAP SYBASE IQ handles big data across many industries worldwide. It is the custodian of very large societal information: all statistical measures in Canada, all federal tax returns in the USA, and all citizen health information in Korea. It allows the largest commercial information providers to thrive, including 30 of the largest in the world: Nielsen, Experian, TransUnion, Acxiom, Dun & Bradstreet, Thomson Reuters, TNS Media, … SAP SYBASE IQ crunches through the most complex models in the financial world and is deployed in more than 200 financial institutions: JPMC, HSBC, Goldman Sachs, Alliance Bernstein, Citigroup, CSFB, Etrade, …

What is Information Lifecycle Management? First, let’s begin with a definition of Information Lifecycle Management.

“ILM is a management approach aimed at tackling the storage ‘information overload’ problem which has so far failed to live up to its potential. The key to its success is being able to automate identification of the most valuable information contained in company data at any given time so that relatively unimportant data can be automatically demoted to lower-cost, less accessible storage media and ultimately discarded.”
Bloor Research

Here is a definition of Information Lifecycle Management from the respected research firm Bloor Research.

ILM in the Real World
NOAA: National Oceanic and Atmospheric Administration
A global network of sensors provides a steady stream of data on the Earth’s oceans and weather
With streams and a vast archive of historical data, NOAA manages some of the largest databases in federal government
The Princeton, NJ data center alone stores more than 20 petabytes of data
NOAA CIO Joe Klimavicz:
“I focus much of my time on DATA LIFECYCLE MANAGEMENT”
“The keys to ensuring that data is useable and easy to find include using accurate metadata, publishing data in standard formats, and having a WELL-CONCEIVED DATA STORAGE STRATEGY”

ILM is a real challenge for organizations that deal with large volumes of data. NOAA is one example.

Data Decreases in Value Over Time
[Figure: data lifecycle plotted against time. Stages: Business Event; Operational Transaction Data; Transform and Load into DW; Data is Queried, Analysed and Reported; Data is Archived; Data is Purged. Time axis: T=0, Minute/s, Hour/s, Day/s, Months, Year/s, Decade/s.]

The value of data changes over time, beginning when it first appears as a business event. Business data begins as an operational transaction that is fulfilled and closed relatively quickly. After that it becomes data required for current reporting, then for historical reporting. It is then archived for compliance and risk mitigation, and finally purged when it has no further value.

Information Lifecycle Management
Data partitioning and placement according to data value
[Figure: monthly data partitions, Jan through Dec, annotated with the steps below]
1. Roll-on: Load monthly table partition
2. Mark partition read-only
3. Back-up the partition
4. Drop partition
5. Drop backup files

Many companies implement a “roll-on/roll-off” scenario, in which new data is loaded into an area of fast storage and then, as it ages, moved through tiered storage, with each successive tier implemented on cheaper and slower storage. The purpose is to spend IT dollars more efficiently and to acquire just the right level of service for each type of data.
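
The five numbered steps on this slide can be replayed as a small simulation. The following is only an illustrative Python model, not IQ syntax: the function name and the three window sizes (`writable`, `online`, `backups`, all counted in monthly loads) are assumptions introduced for the sketch, since the slide does not say how long each stage lasts.

```python
def roll_on_roll_off(months, writable=1, online=3, backups=5):
    """Replay monthly loads and return the actions taken, in order.

    writable: loads a partition stays read-write before it is frozen (assumed)
    online:   loads after which a partition is dropped (assumed)
    backups:  loads after which a partition's backup files are dropped (assumed)
    """
    actions = []
    for i, month in enumerate(months):
        # Step 1: roll-on, load the new monthly partition
        actions.append(("load", month))
        if i - writable >= 0:
            frozen = months[i - writable]
            # Steps 2 and 3: freeze the aged partition, then back it up once
            actions.append(("mark_read_only", frozen))
            actions.append(("backup", frozen))
        if i - online >= 0:
            # Step 4: roll-off, drop the oldest on-line partition
            actions.append(("drop_partition", months[i - online]))
        if i - backups >= 0:
            # Step 5: the backup has outlived its retention, drop its files
            actions.append(("drop_backup_files", months[i - backups]))
    return actions


if __name__ == "__main__":
    for action, month in roll_on_roll_off(["Jan", "Feb", "Mar", "Apr", "May", "Jun"]):
        print(f"{action:18} {month}")
```

With these assumed windows, every partition passes through the five steps in the order the slide lists them; changing the window sizes changes only the timing, not the order.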

SAP SYBASE IQ VLDB Option

SAP SYBASE IQ
Information lifecycle management
[Figure: SAP SYBASE IQ15 Engine components: Multiplex Grid Architecture; Admin & Monitoring Framework; Storage Area Network; Communications & Security; Column Indexing Sub-system; Loading Engine; Column Storage Processor; Query Engine; In-Database Analytics; Text Search; Web Enabled Analytics; Information Lifecycle Management]
Information Lifecycle Management: manage data through its existence in the DW

Among its many other capabilities, SAP SYBASE IQ offers information lifecycle management features that help users manage large volumes of data more effectively.

SAP SYBASE IQ VLDB OPTION
Data partitioning
Multiple user DBSpaces
Separate unstructured data from transactional data
Place frequently accessed data on fast storage
Granular database administration with read-only, read-write, on-line and off-line DBSpaces
[Figure labels: Catalog Store; IQ Main Store for User Data; Temp Store; DBSpace; DBFile; Table; Table Partition; Column; Index]

Managing data according to its value requires partitioning functions to organize data, multiple user DBSpaces to map logical containers to different areas of physical storage, and placement commands to locate data in the preferred DBSpace. SAP SYBASE IQ offers all of this. In addition, DBSpaces can be marked read-only, so that once data is no longer changing it can be validated and backed up only once.

VLDB OPTION Benefits

Partitioned Tables
  SAP SYBASE IQ Base Product: Single table partition
  VLDB Option: Partition by range; single column partition key

Number of User DBSpaces
  SAP SYBASE IQ Base Product: Single user DBSpace with multiple DBFiles
  VLDB Option: Multiple DBSpaces, each with multiple DBFiles; unlimited data volume

Database Object Placement
  SAP SYBASE IQ Base Product: All database objects are placed in one user DBSpace
  VLDB Option: Place database objects (tables, table partitions, columns, indexes) in specific DBSpaces

DBSpace Attributes
  SAP SYBASE IQ Base Product: Single user DBSpace is read-write and on-line
  VLDB Option: DBSpaces can be marked read-only, read-write, on-line or off-line

DBSpace Management
  SAP SYBASE IQ Base Product: Validate and back up single user DBSpace as a unit
  VLDB Option: Validate read-write portions of database separately from read-only; back up read-write DBSpaces separately from read-only

This table shows what is included in the base product compared to what is provided by the VLDB option. In the base product, you cannot partition tables, and you have a single user DBSpace, albeit with multiple DBFiles and unlimited storage. The single user DBSpace is writeable and always on-line. With the VLDB option, you can partition tables based on a range of values of a column (the partition key), and you can have an unlimited number of user DBSpaces. These user DBSpaces can be read-only, read-write, on-line or off-line. You can back up read-write DBSpaces separately from read-only DBSpaces. Read-only DBSpaces need to be backed up only once.

ILM in SAP SYBASE IQ
Partitioning and placement
IQ provides partitioning and placement features to manage the storage and movement of data:
  Partitioning divides data into non-overlapping subsets across a dimension, such as “date”. For example, you may partition customer order data by date
  Placement maps a data partition to a particular area of storage: the partition “June Customer Orders 2009” resides in file “/opt/data/orders/june2009.dat”
Separate big, unstructured data from transactional data:
  Different levels of protection
  Different administration needs
  Use of tiered storage to control cost

Partitioning and placement are two key functions necessary for information lifecycle management. Partitioning allows you to organize your data into logical sets. Then you can place those data sets in appropriate areas of storage. Partitioning allows you to localize data that belongs together, and to separate data that is not usually accessed at the same time. You can apply the appropriate storage technology to a data set, depending on how quickly the data must be served up and what your budget is. Data that needs to be accessed quickly and frequently deserves the highest grade of storage. You can also protect and administer data sets in different ways, according to security and risk mitigation requirements.
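
To make the two ideas concrete, here is a minimal Python sketch of range partitioning followed by placement. It is a model of the concept only, not of how IQ implements it. The June 2009 partition and its file path come from the slide; the other partition names and paths are hypothetical and exist only so that the ranges cover every date.

```python
from bisect import bisect_left
from datetime import date

# (partition name, inclusive upper bound of the range, storage location).
# Only the june2009 entry is taken from the slide; the other two are hypothetical.
PARTITIONS = [
    ("may2009", date(2009, 5, 31), "/opt/data/orders/may2009.dat"),
    ("june2009", date(2009, 6, 30), "/opt/data/orders/june2009.dat"),
    ("later", date.max, "/opt/data/orders/later.dat"),
]
UPPER_BOUNDS = [upper for _, upper, _ in PARTITIONS]


def locate(order_date):
    """Return (partition, storage location) for an order date.

    Partitioning: the sorted upper bounds split the date axis into
    non-overlapping ranges, so each date falls into exactly one partition.
    Placement: the chosen partition maps to one storage location.
    """
    index = bisect_left(UPPER_BOUNDS, order_date)  # first bound >= order_date
    name, _, location = PARTITIONS[index]
    return name, location


if __name__ == "__main__":
    print(locate(date(2009, 6, 15)))
```

Because the bounds are sorted and the last one is the maximum date, every order lands in exactly one partition, which is the non-overlapping property the slide describes.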

Controls for Database Administration
Database administrative operations can be performed with finer control
The database can be divided into read-only and read-write sections that are managed differently
Backup and restore time can be reduced by backing up read-only data once
Data validation can be invoked on just the read-write portions of the database
Frequently accessed data can be assigned to faster data storage, and less frequently accessed data can be segregated to cheaper, slower storage

Database administration can be a very time-consuming and costly activity. Think how much time you can save by dividing the database into read-only and read-write sections. You can validate and back up read-only data once, saving precious CPU cycles and clock time.
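
The backup rule described here (read-only data once, read-write data every time) can be written down in a few lines. This is a hedged Python sketch of the selection logic only: the function name, the DBSpace names, and the idea of tracking a set of already backed-up DBSpaces are assumptions for illustration, not IQ commands.

```python
def plan_backup(dbspaces, already_backed_up):
    """Return the DBSpaces to include in the next backup cycle.

    dbspaces:          mapping of DBSpace name -> "read-write" or "read-only"
    already_backed_up: names of read-only DBSpaces that already have a backup
    """
    selected = []
    for name, mode in dbspaces.items():
        if mode == "read-write":
            selected.append(name)       # still changing: back up every cycle
        elif name not in already_backed_up:
            selected.append(name)       # read-only: back up exactly once
    return selected


if __name__ == "__main__":
    # Hypothetical DBSpace names
    spaces = {"current": "read-write", "y2010": "read-only", "y2009": "read-only"}
    print(plan_backup(spaces, already_backed_up={"y2009"}))
```

Once every read-only DBSpace has been backed up, each later cycle touches only the read-write DBSpaces, which is where the time saving comes from.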

Partition and position a table in IQ
Partition by range: single column partition key

1) Partition table Orders

    CREATE TABLE Orders (
        OrderID INT,
        OrderDate DATE,
        Description CHAR(10)
    )
    PARTITION BY RANGE (OrderDate) (
        p2010 VALUES <= ('2010-12-31') IN FIBER,
        p2011 VALUES <= ('2011-12-31') IN FIBER,
        pNextYear VALUES <= (MAX) IN FIBER
    );

Over time, as data is being loaded, start migrating older data to slower, cheaper storage

2) Move p2010 to SATA storage

    ALTER TABLE Orders MOVE PARTITION p2010 TO SATA;

3) Later, drop very old partitions

    ALTER TABLE Orders DROP PARTITION p2010;

This slide shows examples of IQ DDL commands to create, move and drop partitions. The PARTITION BY RANGE clause on the CREATE TABLE statement creates several table partitions in one statement. The ALTER TABLE…MOVE PARTITION statement moves a table partition onto a different DBSpace as it ages.

Virtual Data Marts
Unique, user community focused platform for big data analytics
[Figure labels: Data Scientists; Business Analysts; Operations; End Users; Full Mesh High Speed Interconnect; SAN Fabric]
Virtual data mart of servers and partitioned storage
Workload management
Privacy through isolation of resources
Separate big unstructured data from transactional data
Back up and restore independently

Building upon the separation of data and storage into discrete sets, SAP SYBASE IQ Multiplex introduced the concept of “logical servers”. A logical server is a grouping of physical nodes in the Multiplex. When a query executes on a machine in a logical server, only the nodes within that logical server participate in the query. This allows workloads to be isolated from each other for security or resource balancing purposes. Logical servers are elastic: physical machines may be added to or removed from a logical server dynamically as workload demand changes. A logical server can be used to build a “virtual data mart”, a set of storage and compute resources used for a particular purpose within an enterprise. The data mart is “virtual” because its storage and compute resources are part of a larger set and the boundary around the mart is changeable: data can be moved to other areas of storage, and physical servers can migrate among logical servers.
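
The grouping behaviour of logical servers can be illustrated with a toy Python model. Everything here is hypothetical (the class, method, node and logical server names); it shows only the two properties the notes describe, namely that a query uses just the nodes of its logical server and that membership can change over time.

```python
class Multiplex:
    """Toy model of a multiplex whose nodes are grouped into logical servers."""

    def __init__(self, nodes):
        self.nodes = set(nodes)
        self.logical_servers = {}  # logical server name -> set of member nodes

    def define(self, name, members):
        """Create a logical server from nodes that exist in the multiplex."""
        members = set(members)
        unknown = members - self.nodes
        if unknown:
            raise ValueError(f"not multiplex nodes: {sorted(unknown)}")
        self.logical_servers[name] = members

    def add_node(self, name, node):
        """Grow a logical server, for example when its workload increases."""
        if node not in self.nodes:
            raise ValueError(f"not a multiplex node: {node}")
        self.logical_servers[name].add(node)

    def remove_node(self, name, node):
        """Shrink a logical server; the node stays in the multiplex."""
        self.logical_servers[name].discard(node)

    def nodes_for_query(self, name):
        """Only members of the logical server take part in its queries."""
        return sorted(self.logical_servers[name])


if __name__ == "__main__":
    mpx = Multiplex(["n1", "n2", "n3", "n4"])
    mpx.define("analysts", ["n1", "n2"])
    mpx.define("operations", ["n3"])
    mpx.add_node("operations", "n4")
    print(mpx.nodes_for_query("operations"))
```

The isolation comes from `nodes_for_query` never looking outside the named group; the elasticity comes from `add_node` and `remove_node` changing the group without touching the rest of the multiplex.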

VLDB in Use at a Large Bank

Shorten Data Backup Times
A large bank is using the SAP SYBASE IQ VLDB option to shorten backup times. The bank divided the database into read-write and read-only sections. The read-only DBSpaces are backed up just once, and only the read-write data needs to be backed up regularly.

Reclaim Valuable Storage Space
The bank also ran a data consolidation activity that copied the data from partially used DBFiles (the physical files that make up a logical DBSpace) into other DBFiles in the same DBSpace. Each emptied DBFile was then returned to the storage team for reuse. The result was more efficient use of storage resources, and money saved.
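
The consolidation idea can be sketched as follows. The deck does not describe the bank's actual procedure, so this Python model is an assumption throughout: it empties the least-used DBFiles first, moves their data into free space in the other DBFiles of the same DBSpace, and reports which files could be released. File names and sizes are hypothetical.

```python
def consolidate(dbfiles):
    """Empty lightly used DBFiles into free space in the others.

    dbfiles: mapping of DBFile name -> (used, capacity), in the same unit.
    Returns (usage after consolidation, list of DBFiles that can be released).
    """
    used = {name: u for name, (u, _) in dbfiles.items()}
    capacity = {name: c for name, (_, c) in dbfiles.items()}
    released = []

    # Try the least-used files first; they are the cheapest to empty.
    for donor in sorted(used, key=used.get):
        targets = [n for n in used if n != donor and n not in released]
        free = sum(capacity[n] - used[n] for n in targets)
        if not targets or free < used[donor]:
            continue  # the remaining files cannot absorb this one

        remaining = used[donor]
        for target in targets:
            moved = min(remaining, capacity[target] - used[target])
            used[target] += moved
            remaining -= moved
        used[donor] = 0
        released.append(donor)

    return used, released


if __name__ == "__main__":
    files = {"f1": (80, 100), "f2": (10, 100), "f3": (60, 100)}
    print(consolidate(files))
```

In the usage example only `f2` can be emptied, because the other two files hold more data than the free space left elsewhere in the DBSpace.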

PowerDesigner ILM Model for SAP SYBASE IQ

ILM in PowerDesigner
Model the database
Create DBSpaces
Assign cost
Create a new lifecycle
Assign start date and phase retention periods
Associate tables with lifecycle
Select date column partition key
Estimate cost savings
Generate scripts to move partitions through DBSpaces as they age

Implementing ILM in SAP SYBASE IQ is made easier with PowerDesigner. In the Sybase PowerDesigner modeling tool, the user can define a data lifecycle: how data is partitioned, and how partitions are positioned on DBSpaces. PowerDesigner can generate cost savings reports as data is migrated over time onto cheaper storage, and can also generate the DDL scripts that move partitions at prescribed times.

Create Lifecycle
Here is a picture of a PowerDesigner dialog box for defining a data lifecycle. The user defines the total length of a lifecycle, how many phases make up the lifecycle, and how long a partition stays in a particular phase before moving to the next phase.

Lifecycle Properties
Assign a cost to the storage:
Indicate which tables are part of the lifecycle:

The user assigns database tables to the lifecycle and estimates the initial volume of data and how the data may grow over time. Each phase of a data lifecycle is associated with a particular tablespace that has a particular cost.

Generate Data Movement Scripts PowerDesigner will generate data partition movement scripts that implement the data lifecycle and work with SAP SYBASE IQ.
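
As a rough idea of what such a script generator has to decide, here is a small Python sketch. It is not PowerDesigner's algorithm or output format. The statements it emits reuse the ALTER TABLE forms shown earlier in this deck; the phase list, the retention periods, and the rule of measuring a partition's age from its start date are all assumptions made for the example.

```python
from datetime import date

# Hypothetical lifecycle: (DBSpace, retention in months), in phase order.
PHASES = [("FIBER", 12), ("SATA", 24)]


def months_between(start, today):
    """Whole calendar months from start to today."""
    return (today.year - start.year) * 12 + (today.month - start.month)


def movement_script(table, partitions, today, phases=PHASES):
    """Return the DDL statements needed to bring partitions in line.

    partitions: mapping of partition name -> (start date, current DBSpace)
    """
    statements = []
    for name, (start, current) in partitions.items():
        age = months_between(start, today)
        target, elapsed = None, 0
        for dbspace, retention in phases:
            elapsed += retention
            if age < elapsed:
                target = dbspace
                break
        if target is None:
            # Older than the whole lifecycle: roll the partition off
            statements.append(f"ALTER TABLE {table} DROP PARTITION {name};")
        elif target != current:
            statements.append(
                f"ALTER TABLE {table} MOVE PARTITION {name} TO {target};"
            )
    return statements


if __name__ == "__main__":
    parts = {
        "p2008": (date(2008, 1, 1), "SATA"),
        "p2009": (date(2009, 1, 1), "SATA"),
        "p2010": (date(2010, 1, 1), "FIBER"),
        "p2011": (date(2011, 1, 1), "FIBER"),
    }
    for statement in movement_script("Orders", parts, date(2011, 6, 1)):
        print(statement)
```

Partitions that already sit in the DBSpace of their current phase produce no statement, so the script can be regenerated and rerun at every prescribed interval.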

Generate Cost Savings Report
Generate cost savings information
[Figure: cost savings report]

Finally, PowerDesigner can generate a report that shows cost savings as data is migrated through the lifecycle phases onto cheaper and cheaper storage.
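
The arithmetic behind a cost savings figure of this kind is simple enough to show directly. The Python sketch below is an assumption about how such a saving could be computed, not a description of PowerDesigner's report: it compares keeping a data set on the first (most expensive) tier for its whole life against moving it through cheaper phases. The volumes and per-GB costs in the example are made up.

```python
def lifecycle_cost(gigabytes, phases):
    """Cost of holding a data set through every phase of its lifecycle.

    phases: list of (months in phase, cost per GB per month), in phase order.
    """
    return sum(gigabytes * months * cost for months, cost in phases)


def cost_savings(gigabytes, phases):
    """Saving versus leaving the data on the first-phase storage throughout."""
    total_months = sum(months for months, _ in phases)
    first_tier_cost = phases[0][1]
    baseline = gigabytes * total_months * first_tier_cost
    return baseline - lifecycle_cost(gigabytes, phases)


if __name__ == "__main__":
    # Hypothetical figures: 12 months at 1.00 per GB-month, then 24 at 0.25
    phases = [(12, 1.0), (24, 0.25)]
    print(lifecycle_cost(100, phases), cost_savings(100, phases))
```

With these made-up figures, 100 GB costs 1800 over the tiered lifecycle against 3600 on the first tier alone, a saving of 1800.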

Summary

SAP SYBASE IQ VLDB OPTION SUMMARY
Storage strategies for managing big data, to service data requests responsively while controlling costs
Learn more
Visit: http://www.sybase.com/sybaseiq-vldb
Call: 1.800.792.2735

For more information, visit the URL shown.