1
IBM PureData System for Operational Analytics
James Cho
Chief Architect, PureData Transaction System and Operational Analytics Solutions
Session Code:
2
Agenda
  Introduction
  Pure Systems
  PureData for Operational Analytics
  Smart Analytics 5600
  Questions/Discussion
3
A new family of expert integrated systems
Systems with integrated expertise and built for cloud
  Built-in Expertise: Capturing and automating what experts do, from the infrastructure patterns to the application patterns
  Integration by Design: Deeply integrating and tuning hardware and software in a ready-to-go, workload-optimized system
  Simplified Experience: Making every part of the IT lifecycle easier, with integrated management of the entire system and a broad open ecosystem of optimized solutions
Main Point: The time has come for a new breed of systems. Expert integrated systems are systems with integrated expertise that combine the flexibility of a general purpose system, the elasticity of cloud and the simplicity of an appliance tuned to the workload, fundamentally changing the experience and economics of IT.
Speaker Notes: The time has come for a new class of systems: systems with integrated expertise and built for cloud that combine the flexibility of a general purpose system, the elasticity of cloud and the simplicity of an appliance. Expert integrated systems will fundamentally change both the experience and economics of IT.
Expert integrated systems are more than a static stack of self-tuned components (a server here, some database software there, serving a fixed application at the top). Instead, these systems have three unique attributes.
The first is built-in expertise. Think of expert integrated systems as representing the collective knowledge of thousands of deployments, established best practices, innovative thinking, IT industry leadership, and the distilled expertise of business partners and solution providers, captured into the system in a deployable form from the base system infrastructure through the application.
The second is that expert integrated systems are integrated by design. All the hardware and software components are deeply integrated and tuned in the lab and packaged in the factory into a single ready-to-go system that is optimized for the workloads it will be running. All of the integration is done for you, by experts.
The third is that the entire experience is much simpler: from the moment you start designing what you need, to the time you purchase, to setting up the system, to operating, maintaining and upgrading it over time. Management of the entire system of physical and virtual resources is integrated. And all this is done in an open manner, enabling participation by a broad ecosystem of partners to bring their industry-optimized solutions to bear.
4
IBM PureSystems Family
How much flexibility, integration and workload optimization do you want out of the box?
  Infrastructure (delivering IT infrastructure services): Integrated and optimized infrastructure with flexibility. Runs your choice of applications and middleware.
  Application Platform (delivering application platform services): Integrated and optimized application platform. Built on IBM middleware to accelerate deployment of your choice of applications.
  Data Platform (new PureSystem with models optimized exclusively for data workloads): Integrated and optimized data platform. Delivers high performance data services to transactional and analytics applications.
Two key points here:
  Spend time explaining the difference between the three PureSystems.
  Point out that the PureData System is new, helping clients address their big data challenges.
PureSystems are flexible, pre-integrated and optimized systems. There is a key difference in what is integrated for each system.
PureFlex: this system is a perfect fit when clients want to deploy and manage pre-integrated IT infrastructure, that is, compute and storage resources. They would then install and integrate the middleware and applications themselves onto this infrastructure.
PureApplication: this system is a perfect fit when clients want to deploy and manage pre-integrated IT and application infrastructure. This system contains compute and storage resources as well as all the application and database middleware required to run applications. This system is ready to go; all clients need to do is load it with their applications and data.
PureData: this system is a perfect fit when clients want an integrated and optimized data platform. This system contains compute and storage resources as well as data management middleware, and comes in different models optimized for different data workloads.
5
IBM PureData System
System for Transactions
  Pattern based database deployment in minutes, not hours [1]
  Handles more than 100 databases on 1 system [2]
System for Analytics, powered by Netezza technology
  10-100x faster than traditional custom systems [4]
  20x greater concurrency and throughput for tactical queries than previous Netezza technology [5]
System for Operational Analytics
  Continuous ingest of operational data
  Handles concurrent operational queries [3]
  Up to 10x storage savings with adaptive compression [6]
Each model of PureData System delivers compelling value.
PureData for Transactions: delivers our superior database scaling technology in a system that is more integrated and optimized for OLTP applications than Oracle Exadata.
PureData for Analytics: continues to deliver the values that have made Netezza a success over both Oracle and Teradata offerings. This enhanced offering is more than just a name change: it delivers 20x greater concurrency and throughput than the previous Netezza technology, and it expands the industry’s richest set of in-database analytics capabilities.
PureData for Operational Analytics: builds on the successes we have had with the Smart Analytics System and the example of the Netezza appliance simplicity. It delivers a new generation offering that provides both a powerful set of capabilities and greater speed and simplicity of deployment and management.
1. Based on IBM internal tests and system design for normal operation under expected typical workload. Individual results may vary.
2. Based on one large configuration.
3. Based on IBM internal tests of prior generation system, and on system design for normal operations under expected typical workload. Individual results may vary.
4. Based on IBM customers' reported results. "Traditional custom systems" refers to systems that are not professionally pre-built, pre-tested and optimized. Individual results may vary.
5. Based on IBM internal performance benchmarking.
6. Based on client testing in the DB2 10 Early Access Program.
6
Operational Analytics
Extreme concurrent query volumes on real time information
[Diagram labels: Business Users, Call Centers, Online Queries, etc.; 100s to 1,000+ Read and Update Queries; Business Analysts; Multiple, Concurrent Analytic Queries; BI Reports and Analytics; Data Warehouse]
PureData System for Operational Analytics provides data services to meet the needs of operational analytics workloads. An operational data warehouse must be able to balance continuous ingest of data, complex analytics across large volumes of data, and high-throughput interactions with the data warehouse to deliver insights for real-time operational decision making. Operations can drive hundreds, and even more than 1,000, concurrent queries against the warehouse.
7
Performance and Mixed Workloads
Data Mining
Call Center
Small, large and extra large queries
High concurrency
Example:
8
Availability and more frequent Ingest
9
Proven Scalability
Value to clients
  Can grow system as needed by adding data nodes
  Typical customers grow data by 30% per year
Size | Footprint | Capacity
Extra Small (Min Config) | 1/3 Rack | Up to 133 TB
Small | 2/3 Rack | Up to 266 TB
Medium | Full Rack | 400 TB
Plus 0 to 18 Data Expansion Racks
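The 30% annual growth figure compounds, so it can be turned into a rough planning estimate. The sketch below is illustrative only: the function names, and the idea of comparing a starting data volume against a fixed capacity ceiling, are assumptions and not part of any IBM sizing method.

```python
def projected_size_tb(current_tb: float, years: int, annual_growth: float = 0.30) -> float:
    """Data volume after `years` of compound growth (30% per year, per the slide)."""
    return current_tb * (1.0 + annual_growth) ** years


def years_until_full(current_tb: float, capacity_tb: float, annual_growth: float = 0.30) -> int:
    """Whole years of growth until the data volume reaches or exceeds `capacity_tb`."""
    if current_tb <= 0 or capacity_tb <= 0 or annual_growth <= 0:
        raise ValueError("sizes and growth rate must be positive")
    years = 0
    size = current_tb
    while size < capacity_tb:
        size *= 1.0 + annual_growth
        years += 1
    return years


if __name__ == "__main__":
    # Hypothetical starting point: 133 TB of data growing toward a 400 TB ceiling.
    for year in range(6):
        print(year, round(projected_size_tb(133, year), 1))
    print("years until 400 TB:", years_until_full(133, 400))
```

Under these assumptions a system that starts at 133 TB would pass 400 TB during its fifth year of growth, which is the kind of horizon the data expansion racks are meant to cover.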
10
Pure Data for Operational Analytics New Features
Embedded 7710
  Single box solution: scales down as well as up
  The starting building block for 7700 scalable
Consolidated GUI Console: significantly improved operational experience
  “Single pane of glass”
  Alerts for all HW/SW components
  Maintenance for all firmware/software cluster wide
  Integrated OPM application management
Enhanced availability: fewer components, reduced outage time
  Integrated Backup leveraging 900 GB HDD
  Roving Standby, Hot swap SSD
  56x fewer managed resources; 1/3rd fewer relationships
Enhanced next generation SSD and enclosure
  Dense (1U) Dual Controller SSD enclosure
  RAID-10 at high IOPS and bandwidth
Higher Density: simpler manufacturing, shipment, service
  One HA group per rack: no FC cables cross racks
  P730 2U server: ½ the rack space
  Dense SSD enclosure, Double Dense IO cards
Higher I/O Bandwidth: high ingest capability, high scaling
  40 Gbps etherchannel interconnect
  64 Gbps FC storage bandwidth
New Software Stack: DB2 10 capabilities
  AIX
  DB2 10
11
HA Simplification (‘small cluster’)
PureData for Operational Analytics Smart Analytics 7700
12
HA tools in PureData for Operational Analytics
hareset
  Soft reset will bring the domain to a base state
    Clear failed, stuck, pending states; remove resource group requests across entire configuration
  Rebuild HA configuration completely
    Drop the domain, create domain, create all resources/groups/relationships/equivs etc.
hachkconfig
  Detect and repair resource model problems
  Detect OS configuration issues that impact HA and alert
  Can be run automatically or manually
A hareset rebuild takes about 5 minutes for every 30 partitions.
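The rebuild-time rule of thumb above (about 5 minutes per 30 partitions) can be used for a rough maintenance-window estimate. This is a minimal sketch that assumes the time scales linearly with partition count, which the slide implies but does not state; the function name is hypothetical and nothing here calls the actual hareset tool.

```python
MINUTES_PER_BLOCK = 5       # from the slide: about 5 minutes ...
PARTITIONS_PER_BLOCK = 30   # ... for every 30 partitions


def estimate_rebuild_minutes(partitions: int) -> float:
    """Rough HA rebuild duration, assuming linear scaling with partition count."""
    if partitions < 0:
        raise ValueError("partition count cannot be negative")
    return MINUTES_PER_BLOCK * partitions / PARTITIONS_PER_BLOCK


if __name__ == "__main__":
    for count in (30, 60, 120):
        print(f"{count} partitions: about {estimate_rebuild_minutes(count):.0f} minutes")
```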
13
A1791 Component Building Blocks
Capacity Add-on Data Racks
XL
Foundation Rack
Data Rack(s)
Server Nodes: IBM 8205-E6C, IBM 8231-E2C
Storage: IBM SSD 5888, IBM V, IBM V
Ethernet Switches: IBM 1G C, IBM 10G E
SAN switches: SAN48B-5
HMC: 7042-CR6
Modules: Data Module 3, Data Module 2, Foundation Modules, Data Module 1, Roving Standby Module
Figures are not to scale
14
Config Summary
Capacity Expansion Add-On
MTM:
  XS - 8279-A01 (0.5D)
  M A03 (2.5D)
  L A04 (3.5D), scalable to 9.5D
  XL A05 (3.5D), scalable to 18.5D
Data Expansion Add-on:
  8279-AD1 (1D)
  8279-AD2 (2D)
  8279-AD3 (3D)
Upgrade path (Capacity Expansion Add-On):
  Add-On up to 5 x 8279-AD3
  Add-On up to 2 x 8279-AD3
15
MES Upgrade / Add-on Options
Model Upgrade: A01 to A04
  Data nodes add-on (fewer than 9D)
Model Upgrade: AD1 to AD3
  Data nodes add-on (10-18D)
16
A1791 Configuration Sizes
Columns: Size | System | Part Number | Modules (Foundation, Data) | Primary Data (TB) | Backup / Cold Data (TB) | Comments
XS | A | 8279-A01 | 1 | 10.8 | 21.6 | 0.5D
S | A | 8279-A02 | 32.4 | 64.8 | 1.5D
M | A | 8279-A03 | 2 | 54.0 | 108.0 | 2.5D
L | A | 8279-A04 | 3+ | 75.6 | 151.2 | 3.5D to 9.5D (add up to 2 full add-on racks AD3)
XL | A1791-X4 | 8279-A05 | 1+switches | 3.5D to 18.5D (add up to 5 full add-on racks AD3)
Add-on data 1 | A1791-E1 | 8279-AD1 | n/a | 43.2 | 1D
Add-on data 2 | A1791-E2d | 8279-AD2 | 86.4 | 2D
Add-on data 3 | A1791-E3 | 8279-AD3 | 3 | 129.6 | 3D
Capacity per data module (foundation module has half):
  21.6 TB Primary “Hot” Active data
  43.2 TB free space (for backups, cold data, etc.)
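The per-module figures above (21.6 TB primary and 43.2 TB free space per data module, with the foundation module counting as half) are enough to reproduce the primary and backup columns for the XS through L sizes. The sketch below is a cross-check under that reading of the table; the function and dictionary names are illustrative.

```python
from typing import Dict, Tuple

PRIMARY_TB_PER_DATA_MODULE = 21.6   # "hot" active data per data module (from the slide)
FREE_TB_PER_DATA_MODULE = 43.2      # backup / cold data space per data module (from the slide)

# Data-module equivalents ("D") per size, taken from the Comments column.
DATA_MODULE_EQUIVALENTS: Dict[str, float] = {"XS": 0.5, "S": 1.5, "M": 2.5, "L": 3.5}


def capacity_tb(data_modules: float) -> Tuple[float, float]:
    """Return (primary TB, backup/cold TB) for a given number of data-module equivalents."""
    primary = round(PRIMARY_TB_PER_DATA_MODULE * data_modules, 1)
    free = round(FREE_TB_PER_DATA_MODULE * data_modules, 1)
    return primary, free


if __name__ == "__main__":
    for size, modules in DATA_MODULE_EQUIVALENTS.items():
        primary, free = capacity_tb(modules)
        print(f"{size:>2} ({modules}D): primary {primary} TB, backup/cold {free} TB")
```

The results match the table rows for XS (10.8 / 21.6), S (32.4 / 64.8), M (54.0 / 108.0) and L (75.6 / 151.2).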
17
PureData System for Operational Analytics System Console
Unified Appliance UI
Alerting via SNMP & (Single System Software Status, Hardware)
Platform Monitoring (Hardware and non workload specific)
Maintenance Wizard (Launch point integration)
User Authentication (For console roles/access only)
Workload Monitoring: OPM (SSO, Debranding, LIC)
License Acceptance: On initial interface access
System Console: Management, Monitoring, Maintenance
Provide “Easy to Use”, “Common” interfaces for management of Pure System Family
18
Simplified maintenance with pre-integrated fixes
Single point of contact for support
Automated updates for faster maintenance
All hardware firmware and OS software patches integrated and tested together at the factory
Another area where we have simplified system administration is support: we provide a single point of contact for the system, along with integrated fixes and updates for all OS and hardware components in the system. This eliminates the time spent bouncing back and forth between various support organizations, as well as the time spent figuring out maintenance prerequisites and co-requisites. All hardware firmware and OS software patches are integrated and tested together at the factory, which greatly reduces downtime caused by applying system maintenance.
22
The Overview Dashboard – At a Glance view
Details on OS-level system CPU and paging utilization, and a breakdown of time within DB2
The Overview Dashboard content is divided into sections which present logical groups of related metrics. Here, on the left, it shows runtime metrics, which tell us how time is spent from an OS perspective (user and system CPU time, both inside and outside of DB2, as well as IO wait and idle time). On the right, we see a breakdown of time spent inside DB2, derived from the new wait and component time metrics in DB2. These are an extremely useful way to see what DB2 is doing (or waiting for): executing SQL? Waiting on locks? Waiting on disk IO? Sorting?
23
The Overview Dashboard – At a Glance view
Throughput within DB2 (statements, transactions, rows and connections) and average response time
This quadrant tells us about throughput on the DB2 server, in terms of overall database requests, SELECTs, transactions, etc. As a very useful complement to statement and transaction throughput, it also shows row throughput, which is very handy for judging in-flight activity of longer-running SQL statements. In the bottom right, we also have average statement response time, which is useful in both transactional and complex query environments, since it gives us a good indication of response time at the application level.
24
The Overview Dashboard – At a Glance view
Top 3 SQLs: by elapsed time, CPU time, rows read, or lock wait time
The Overview dashboard also gives a useful sampling of the top 3 SQL statements, selectable by elapsed time, CPU time, rows read or lock wait time. For each of these top 3 statements, it gives us the value used for the 'Top 3' sorting, and the execution time. For particularly long statements, there's a link for each one that will bring up a dialog with the complete statement text. And of course, if we suspect a problem in one of these statements and want to investigate, we can follow the 'Go to SQL Statement Dashboard' link to drill down on SQL hotspots.
25
The Overview Dashboard – At a Glance view
Performance focus: choose a tab to show core information on locking, I/O, SQL or pureScale-specific metrics.
Two key pureScale metrics shown here:
  GBP hit ratio
  Average XI time
26
Performance Overview Report
Provides good top-level view of overall system performance
Choose metrics and sort orders that are most relevant for the database being analyzed
Report is exportable to pdf/xls/ppt formats
  Useful for upline reporting
Drill-down available via other reports:
Another great addition to OPM 5 is the performance overview report. OPM has always had very useful built-in reports, but this one gives the report equivalent of the overview dashboard: good at-a-glance coverage of the main performance metrics of your database. This is particularly useful for reporting performance information up the management chain, where the boss doesn't want to come sit in front of an OPM screen, no matter how appealing and well-designed it is. Depending on the performance sensitivities of the system, you can choose the appropriate data to be included and the order in which it is sorted.
27
Drill down to alert in Database Performance Monitor
28
Enhanced Information Center
Information is organized according to topic area and operational lifecycle
  Overview, Planning, Getting Started, Operational Tasks, Advanced Tasks, Troubleshooting, Reference
Each section has further detail, and links to relevant information
Easily searchable within the Information Center
Easily searchable from any search engine
Ability for users to enter comments on topics and participate in discussions
29
Information Center – Sample comment
30
ISAS 5600 R3
Announced April 9th, GA June 2013
31
5600 R3 Highlights
Server Refresh
  16 core, Intel® Xeon E series processors
  2 cores per partition
  192GB Memory
  24 GB memory per partition
Increased Storage Capacity
  900 GB SAS-2 Disk Drives Standard
  Excess space intended for cold data and/or local backup
SSD Standard
  Hot swap and RAID protected
10 Gb Database Interconnect Standard
Optim Performance Manager 5.2
New Software Stack
  SLES 11 + DB2 10.1
Enhanced HA
  Roving Standby
  Higher Availability with simpler HA management
  More stable cluster based file system (replacing NFS)
  Fewer managed resources
  Fewer relationships
New ATK Deployment
  Simpler deployment model with new ATK
  Simplified software upgrades via master image
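The server figures above imply a partition count per server. The sketch below assumes that the per-partition numbers are simple divisions of the per-server totals, which the slide does not state explicitly; the variable and function names are illustrative.

```python
CORES_PER_SERVER = 16          # from the slide
CORES_PER_PARTITION = 2        # from the slide
MEMORY_GB_PER_SERVER = 192     # from the slide
MEMORY_GB_PER_PARTITION = 24   # from the slide


def partitions_supported(total: int, per_partition: int) -> int:
    """How many partitions a resource can back, using whole partitions only."""
    return total // per_partition


by_cores = partitions_supported(CORES_PER_SERVER, CORES_PER_PARTITION)
by_memory = partitions_supported(MEMORY_GB_PER_SERVER, MEMORY_GB_PER_PARTITION)

if __name__ == "__main__":
    print("partitions by cores:", by_cores)
    print("partitions by memory:", by_memory)
```

Both ratios work out to the same partition count, so under this assumption neither cores nor memory is the tighter constraint.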
32
Management Module
New management module
  Incorporates Application module (separate in 5600 R1 and R2)
  Provides rootfs images for core modules (xCAT)
  HA for applications ISW and xCAT
Management Module hosts the following applications
  InfoSphere Warehouse (ISW) 10.1
  Optim Performance Manager (OPM) 5.2
  IBM Systems Director (ISD) 6.3
  xCAT 2.7.5
Management module
  Management node: IBM x3650 M4 8 core E GB memory
  Management standby node: IBM x3650 M4 8 core E GB memory
  DS disks
33
IBM System x3650 M4
ISAS R3 PCI Adapters
3 x Dual Port HBAs
  4 ports to disk
  2 ports to LAN-free backup
1 x 10 Gb Ethernet NIC
34
PCI placement – Core Warehouse Modules
Dual Socket Configuration
Slot 1 (Riser1/Slot 1), full height full length: (empty)
Slot 2 (Riser1/Slot 2), full height half length: 2nd QLogic 8Gb FC Dual-port HBA
Slot 3 (Riser1/Slot 3), full height half length: 1st QLogic 8Gb FC Dual-port HBA
Slot 4 (Riser2/Slot 1), full height full length: (empty)
Slot 5 (Riser2/Slot 2), full height full length: 3rd QLogic 8Gb FC Dual-port HBA (LAN-free backup)
Slot 6 (Riser2/Slot 3), full height half length: Emulex Dual Port 10GbE SFP+ VFA III (95Y3762)
Suggested slot priority: 3, 2, 6, 1, 5, 4
35
GPFS Design
Nodes: MGMT, MGMT-STDBY, ADMIN, DATA1, DATA2, DATA3, DATA4, STDBY1
  GPFS cluster 1: /db2home, /dwhome, /stage
  GPFS cluster 1: /db2home, /dwhome, /db2fs, /db2plog, /db2mlog, /stage
  2 x quorum-manager, nsd server
  HA GROUP 1
Nodes: DATA5, DATA6, DATA7, DATA8, DATA9, STDBY2
  GPFS cluster 2: /db2fs, /db2plog, /db2mlog
  Remote from GPFS cluster 1: /db2home, /dwhome, /stage
  2 x quorum-manager, nsd server
  HA GROUP 2
36
What is a Terabyte: Defining Terms
[Diagram labels: Spinning Disk; RAID – %; Available Space; Usable Space; Sys Overhead; Temp; Indexes; Aggregates, Derived Tables; User Space with compression; User Space no compression; Raw Hot Data; Raw Hot Data after compression; Backup Space; Backup / Cold Data]
The data warehouse marketplace seems to be moving to a price per terabyte metric. However, there isn’t a standard definition of what represents a terabyte. The terms on this slide should prove helpful in this regard. Spinning disk is the total storage available to the system, based on the number of drives and their size. Usable storage is the space available to the database software after RAID is applied (RAID is used to provide a second copy of the data for high availability). Usable storage includes raw data, in addition to aggregates, indexes, temp, and system overhead and, new in PureData, integrated backup space and cold data space. Raw data is the actual customer data that is loaded into the system. More details are described on the next slide.
Spinning disk: The total amount of storage available to the system, i.e. total drives x drive capacity
Available space: The total amount of storage that is available to the database software after RAID
Usable space: The total amount of storage that is available for hot active data, including supporting objects like indexes and temps
User space: The amount of base data input to the system (raw hot data, i.e. number of records x record size) plus any aggregates or derived tables; assumed to be 55% of available usable space
Backup Space / Cold Data: The total amount of storage that is available for database backups and cold data, which is infrequently accessed historical data
37
Smart Analytics System 5600 R3 data sizing
Capacity Sizes, Smart Analytics System 5600 | 5600 R3 (900GB) | 5600 R2 (300GB)
# Data Modules | 1
Spinning Disk (TB) | 42.2 | 14.1
Available Database Space after RAID formatting (/db2fs) (TB) | 26.2 | 8.7
Active 33% (TB)
User 55% (TB) | 4.8
User Space Compressed, assuming 2.5x compression (TB) | 12
Space available for index, temp, logs (TB) | 3.9
Backup Space / Cold 66% (TB) | 17.5 | -
Peak User Space Compressed with Cold Data Compressed* (TB) | 55.7
Solid State Device for temp (TB) | 1.2 | 0.6 **
Each 5600 R3 Data Module:
  Spinning Disk: 42.2 TB = 900GB * 48 / 1024
  Available Database Space after RAID: 26.2 TB = 900GB * 0.93 * 48 disks * 4 / 6 / (minus some unallocated space)
    Active Space (8 partitions): 8.7 TB = 26.2 TB * 0.33
      User Space: 4.8 TB = 0.55 * 8.7 TB
      User Space Compressed: 12 TB = 4.8 TB x 2.5
      Space available for index, temps, logs: 3.9 TB = 0.45 * 8.7 TB
    Backup Space: 17.5 TB = 26.2 TB * 0.67
  Peak User Space Compressed with Cold Data Compressed: 55.7 TB = 12 TB TB * 2.5
  SSD (RAID ): 1.4 TB = 512GB * 0.93 * 3 / 1024
*Cold Data Compressed: infrequently accessed historical data w/o temps or logs
** 5600 R2 with SSD option
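The sizing arithmetic for a 5600 R3 data module can be re-run as a sanity check. The sketch below evaluates each formula with the slide's own rounded inputs and compares the result with the stated value. Two gaps in the slide are filled by assumption and marked in the comments: the divisor after "4 / 6" is taken to be 1024 (as in the spinning-disk line), and the garbled peak formula is taken to be 12 TB plus 17.5 TB of backup space at 2.5x compression, which reproduces the stated 55.7 TB.

```python
# Each row: (label, value stated on the slide in TB, value recomputed from the slide's formula).
ROWS = [
    ("Spinning disk", 42.2, 900 * 48 / 1024),
    # Assumption: the elided divisor is 1024, as in the spinning-disk line.
    ("Available after RAID", 26.2, 900 * 0.93 * 48 * 4 / 6 / 1024),
    ("Active space (33%)", 8.7, 26.2 * 0.33),
    ("User space (55%)", 4.8, 0.55 * 8.7),
    ("User space compressed (2.5x)", 12.0, 4.8 * 2.5),
    ("Index, temps, logs (45%)", 3.9, 0.45 * 8.7),
    ("Backup space (67%)", 17.5, 26.2 * 0.67),
    # Assumption: the garbled formula is 12 TB + 17.5 TB * 2.5.
    ("Peak with cold data compressed", 55.7, 12 + 17.5 * 2.5),
    ("SSD", 1.4, 512 * 0.93 * 3 / 1024),
]


def max_deviation(rows) -> float:
    """Largest absolute gap between a stated value and its recomputed value."""
    return max(abs(stated - computed) for _, stated, computed in rows)


if __name__ == "__main__":
    for label, stated, computed in ROWS:
        print(f"{label:<32} stated {stated:6.1f}  recomputed {computed:7.3f}")
    print(f"largest deviation: {max_deviation(ROWS):.3f} TB")
```

Every recomputed value lands within 0.1 TB of the stated one, so the differences are consistent with rounding (33% and 67% standing in for one third and two thirds). The SSD formula yields about 1.4 TB, while the table above it lists 1.2 TB; the slide does not explain that difference.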
38
ISAS 5600 Data Module Comparisons
39
James Cho IBM jamescho@us.ibm.com
Evaluate my session online: James Cho IBM Session IBM PureData System for Operational