June 2002 Capacity Planning for the Newer Workloads Linwood Merritt Capital One Services, Inc.

1 June 2002 Capacity Planning for the Newer Workloads Linwood Merritt Capital One Services, Inc. linwood.merritt@capitalone.com

2 June 2002 Disclaimer These generic issues are addressed by this presentation: –Vendor capacity ratings –e-Commerce –Continuous availability –Data warehousing –Growth rates This presentation contains no specific business-related information.

3 June 2002 Introduction: Environment Capital One –5th largest card issuer in the United States –Capital One to S&P 500 in 1998 –Fortune 500 company (#260) –Managed loans at $48.6 billion as of Q1 2002 –Accounts at 46.6 million as of Q1 2002 –Fortune 100 “Best Places to Work in America” –CIO 100 Award “Master of the Customer Connection” –Information Week “Innovation 100” Award Winner –ComputerWorld “Top 100 places to work in IT”

4 June 2002 Outline of Approach Understand behavior and issues around workloads, hardware, and data Create projections and build recommendations. Report the findings.

5 June 2002 Outline of Presentation Discussion of workload types and capacity projection approaches Overall summary of issues and approaches Examples

6 June 2002 What Workloads? E-Commerce Relational database systems Mainframe-class UNIX Multiple platforms New characteristics

7 June 2002 e-Commerce Workloads Direct to Client (business-to-business) Access –Internet –Leased line Services –Point of Care / Point of Sale –Value-added analysis

8 June 2002 e-Commerce Workloads Direct to Customer Access –Internet –Dial-in Services –Marketing –Account query

9 June 2002 e-Commerce Workloads How to Predict Take business projections of volumes or users (include fudge factor) Estimate transaction volumes and CPU/transaction Convert to normalized unit such as MIPS
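The prediction steps on this slide can be sketched roughly as below. This is only an illustration under stated assumptions: the function name, the user and transaction counts, the 500-MIPS reference machine, and the 70% target utilization are all hypothetical placeholders, not figures from the presentation.

```python
def required_mips(users, txns_per_user_per_hour, cpu_sec_per_txn,
                  reference_mips, fudge_factor=1.0, target_utilization=1.0):
    """Estimate MIPS demand from a business projection of users.

    cpu_sec_per_txn is the CPU time per transaction measured on a machine
    rated at reference_mips. All inputs here are illustrative assumptions.
    """
    # Business projection (with fudge factor) -> transaction rate
    txns_per_sec = users * fudge_factor * txns_per_user_per_hour / 3600.0
    # Transaction rate x CPU/transaction -> busy fraction of the reference machine
    busy_fraction = txns_per_sec * cpu_sec_per_txn
    # Normalize to MIPS, leaving headroom via the target utilization
    return busy_fraction * reference_mips / target_utilization


if __name__ == "__main__":
    mips = required_mips(users=10_000, txns_per_user_per_hour=6,
                         cpu_sec_per_txn=0.03, reference_mips=500,
                         fudge_factor=1.2, target_utilization=0.7)
    print(f"Estimated demand: {mips:.0f} MIPS")
```

Normalizing to a unit such as MIPS lets the estimate be compared directly with vendor capacity ratings, whatever machine the CPU/transaction figure was measured on.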

10 June 2002 Relational Databases Sub-second (OLTP), decision support / data mining Distributed gateways Database machines Redundant data with extracts How to predict: estimate a factor over current database demand or take usage estimates

11 June 2002 Mainframe-Class Unix Types: Mainframe USS or Linux, Future UNIX vendor offerings Candidate applications –Web server –Vendor-ported applications –User-ported / new applications How to predict: –Estimate by timeframe –Add factor to growth rates

12 June 2002 Multiple Platforms Mainframe: plan like existing applications (#users, transactions * CPU/transaction, application look-alikes, sizing tools) Distributed: use vendor sizing, modeling tools, existing applications Network: use network simulation tools, rules-of-thumb, bandwidth calculations

13 June 2002 New Characteristics External users Continuous availability New user interfaces Cross-platform

14 June 2002 External Users Drive need for continuous availability Different access patterns (e.g., doctor’s office vs. call center) Service level measurement - harder to put agent on external workstations

15 June 2002 Continuous Availability Driven by external users 24x7 schedule –Application redesign –Data Sharing: CPU overhead –Coupling Facility –Expansion of “prime shift” 99.999% “up time” –Redundancy, overhead –Availability reporting
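To put the 99.999% "up time" target in perspective, the arithmetic below converts an availability percentage into the downtime it allows per year. The function name is an illustrative choice, and the calculation ignores leap years.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(availability_pct):
    """Minutes of downtime per year permitted by an availability target."""
    return (1.0 - availability_pct / 100.0) * MINUTES_PER_YEAR


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% available -> {minutes:.2f} minutes/year of downtime")
```

At 99.999% the budget is a little over five minutes a year, which is why the slide pairs this target with redundancy and availability reporting.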

16 June 2002 User Interfaces TCP/IP - no “definite response” (end-to-end response time measurement) Multiple internal transactions per “mouse click” Response time measurement: –Agent on workstations –Scripting from “robots”

17 June 2002 Cross Platform Applications Only unified view: simulation package Each platform (“silo”) can be analyzed separately. Different application development groups May be able to cross-validate user numbers

18 June 2002 Types of Implementation (1) Standalone / “shrink-wrap” Layered onto legacy applications –New mainframe application code –GUI front-end –Browser –Middle-tier (Unix or NT) –MQSeries - can add middle-tier and new mainframe applications

19 June 2002 Types of Implementation (2) Legacy extracts Re-engineered legacy applications –Convergence of business rules / applications –Re-usable components –Redundant access –Salvage investment, fix Band-Aids –Simplify logic, reduce platform complexity

20 June 2002 What Are We Analyzing? (Mainframe) MIPS - growth, latent demand, software cost Memory - track and watch 2 GB limit on central storage (goes away with 64-bit) I/O - channels, gigabytes of disk, tape Coupling Facility - Parallel Sysplex, Shared Data, continuous availability Vendor upgrade paths New partitions

21 June 2002 What Are We Analyzing? (Distributed) Number and types of platforms CPU, memory, disk space Bandwidth Location of applications / processes Platform limitations (CPU, memory) Software pricing considerations Porting opportunities

22 June 2002 Measurement of New Workloads Summarize by platform: –Workload rules (process or user names) –Processes by descending CPU% Resources: CPU, memory, disk space, Coupling Facility, network traffic Growth: –Resources/user/application –Number of users + application changes

23 June 2002 Distributed Approach Consider tiers of service (not currently at Capital One) Address service level measurement issue Implement reporting Add to Capacity Plan “Silo” vs. “Application”

24 June 2002 Tiers of Service “Platinum” Most expensive Modeling product Install in one server for each major application, use collection product for other servers

25 June 2002 Tiers of Service “Gold” Collection product Capacity planning with Rules of Thumb

26 June 2002 Tiers of Service “Brass” Least expensive (man-hours only) “Native” –Unix scripts –NT PerfMon

27 June 2002 Service Level Measurement API call at workstation - “Application Response Measurement” (ARM) or Windows 2000 trace API calls Agents: software tracing of Windows API calls - can be installed in a subset of end-user base (sampling) Scripting (“robots”) Stop watch sampling and logging

28 June 2002 Distributed Reporting

29 June 2002 Add to Capacity Plan

30 June 2002 Scope of Analysis Silos –Look at each hardware/application environment independently. Applications –Look at each application as a whole. –Application instrumentation –Inference: put platform silos together.

31 June 2002 Analyzing the Data Growth Rates General list of business plans List of technical scenarios Timeline Estimate median and maximum likely MIPS/CPU/users/business units Derive scenario growth rates

32 June 2002 Analyzing the Data Additional Resources Parallel Sysplex (Coupling Facility): important for continuous availability, level set functionality Disk / channels / tape: disk megabytes, channel maximum, tape connectivity Communications connectivity: new partitions for availability Memory: 2 GB constraint, 64-bit

33 June 2002 Growth “Baseline” growth “Scenario” growth Independent events (merger/acquisition, potential major project)

34 June 2002 Example 1: Mainframe Upgrade Task force, led by Capacity Planner Driven by expiring three-year lease (CPU replacement, three-year planning horizon) “Vendor parade” - presentations and dialogues –Upgrade paths –Technology / service differences –References / site visits –Capacity sizing: MIPS charts, LSPR / sizing tools

35 June 2002 Mainframe Upgrade Deliverables Document –Business drivers and technical scenarios –Growth forecasts –Vendor options and growth paths –Coupling Facility / Parallel Sysplex Evaluation –Difference thresholds: MIPS claims, price/MIPS, ICF –Differentiators

36 June 2002 Business and Technical Business Drivers Cost management External business Improved data access Business expansion Technical Scenarios Consolidation of distributed servers Continuous availability Significant external business Data Warehousing Acquisition/merger

37 June 2002 Projections Make educated guess by timeframe for each scenario Add to “baseline” growth Convert to growth rate Use both “baseline” and “scenario growth” Compare maximum scenario growth to maximum for platform family
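One way to read the steps on this slide is sketched below: grow the baseline year by year, add the educated-guess scenario demand in the year it is expected to land, then convert the end point back into a single growth rate. The starting MIPS, the 25% baseline rate, and the 200-MIPS scenario are hypothetical, and letting scenario demand grow at the baseline rate after it arrives is an assumption of this sketch.

```python
def cagr(start, end, years):
    """Compound annual growth rate that takes start to end in the given years."""
    return (end / start) ** (1.0 / years) - 1.0


def scenario_projection(start_mips, baseline_rate, years, scenario_adds):
    """Project demand: baseline growth plus scenario demand by year.

    scenario_adds maps a year number (1-based) to the extra MIPS guessed
    for scenarios landing in that year. Illustrative only.
    """
    demand = start_mips
    for year in range(1, years + 1):
        demand *= 1.0 + baseline_rate          # "baseline" growth
        demand += scenario_adds.get(year, 0.0)  # "scenario" growth
    return demand


if __name__ == "__main__":
    start, years = 1000.0, 3
    baseline_end = scenario_projection(start, 0.25, years, {})
    scenario_end = scenario_projection(start, 0.25, years, {2: 200.0})
    print(f"Baseline growth rate: {cagr(start, baseline_end, years):.1%}")
    print(f"Scenario growth rate: {cagr(start, scenario_end, years):.1%}")
```

Reporting both rates gives the bracket the presentation describes, with the baseline as the lower end and the scenarios as the upper end.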

38 June 2002 Impact Analysis

39 June 2002 Scenario Timeline
Periods: Period1, Period2, Period3, Period4, Period5, Period6, Period7
Events (in slide order):
–First mainframe Wk1 Application
–24x7 operation
–First Parallel Sysplex exploitation
–Initial muck exploitation with 250 Users
–(Potential acquisition)
–New DB2 functionality exploitation
–Full Data Sharing exploitation (IMS, CICS, DB2)
–Full subsystem redundancy (IMS, CICS, DB2)
–Major Project A with 100 users, 150% CAGR
–64-bit OS/390

40 June 2002 Vendor Upgrade Paths Detail
Use logarithms: Start * CAGR^x = Threshold, so x years = log(Threshold/Start) / log(CAGR)

Model     MIPS   MSU   +40%/Yr   +25%/Yr
GS2068E    952   160   Aug-00    Sep-00
GS2074E   1013   171   Oct-00    Dec-00
GS2084E   1141   193   Apr-01    Jul-01
GS2094E   1260   213   Sep-01    Dec-01
GS2104E   1378   234   Nov-01    May-02
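The logarithm formula on this slide can be checked with a few lines of code. For the formula to hold, CAGR must be expressed as a growth factor (1.40 for +40%/yr, 1.25 for +25%/yr). The 900-MIPS starting demand below is a hypothetical value chosen for illustration; the thresholds are the MIPS ratings from the slide.

```python
import math


def years_to_threshold(start, threshold, growth_factor):
    """Solve start * growth_factor**x = threshold for x (in years)."""
    return math.log(threshold / start) / math.log(growth_factor)


if __name__ == "__main__":
    start_mips = 900.0  # hypothetical current demand, not from the slide
    ratings = {"GS2068E": 952, "GS2074E": 1013, "GS2084E": 1141,
               "GS2094E": 1260, "GS2104E": 1378}
    for model, mips in ratings.items():
        fast = years_to_threshold(start_mips, mips, 1.40)
        slow = years_to_threshold(start_mips, mips, 1.25)
        print(f"{model}: {fast:.2f} years at +40%/yr, {slow:.2f} years at +25%/yr")
```

Adding each result to the measurement date gives the kind of month-by-month upgrade threshold shown in the table's +40%/Yr and +25%/Yr columns.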

41 June 2002 Vendor Upgrade Paths Summary

42 June 2002 Upgrade Document

43 June 2002 Example 2: UNIX Modeling Modeling product installed on MQSeries server Application running with a known number of users Projected rollout schedule used to drive model Mainframe side: CICS application, IMS load

44 June 2002 UNIX Platform Workloads Two primary workloads: –MQSeries userids (mqm*) - memory intensive –Messaging application processes (MDA*) - “CPU intensive”

45 June 2002 Workload Modeling Methodology MQSeries - Calculate relative workload intensity, enter model ratio. Messaging application processes - Keep constant until application is removed from platform (“design loop” - always uses 1 CPU). Must adjust across CPU upgrade to continue using 1 CPU.

46 June 2002 Track Across Upgrade CPU Upgrade

47 June 2002 Model Spreadsheet

48 June 2002 Model Presentation Timeframe: April 2000; #Users: 180, 100; Ratios: 1.27, 1.00; Config: F50/02, 2 GB; Comment: Add Event1 Users

49 June 2002 Validation - Tracking Users (on mainframe)
//ECLUSRS  EXEC SASV8,REGION=0M
//ECLD1    DD DSN=XYZ.PRD.A.AAAPRD.I.VOLFIL,DISP=SHR
//ECLDPDB  DD DSN=CAPLAN.PRD.ECLDPDB,DISP=OLD
//SYSIN    DD *,DLM=@@
data ecld1;
  format date date.;
  format dt datetime.;
  INFILE ECLD1 MISSOVER;
  INPUT @1  RECNUM  $CHAR5.
        @6  RECTYPE $CHAR8.
        @14 USERCT  $CHAR5.
        @19 USERMAX $CHAR5.;
  if recnum =: '99999' and rectype =: 'TCSCONFG';
  dt = datetime();
  date = datepart(dt);
  hour = hour(dt);
data ecldpdb.users;
  update ecldpdb.users ecld1;
  by date hour;
proc print;
  title 'Ecloud1 Users';

50 June 2002 Example 3: Server Replacement Project: replace “old” NT servers Application: Imaging servers Capacity sizing data: –Rules-of-thumb analysis by vendor, using projected claims/minute and processor clock speeds –Benchmark information

51 June 2002 Server Replacement Process Multiple servers: each server is a workload, must be sized separately. Enumerate and measure servers. Apply growth rates and determine processing power requirements for the replacements. Research available configurations and order appropriate server configurations. Track CPU utilization across the upgrades. Update relative capacity specs for next upgrade.

52 June 2002 Server Sizing Find (or derive) benchmark capacity ratings for starting and replacement configurations. Apply an estimate of current CPU utilization, a growth percentage, and a “peak/average” and performance buffer (+100% for this study). Output: estimated percentages of a standard configuration. The number of estimated CPUs needed (23) came very close to the vendor’s original number of 24.
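The sizing arithmetic described on this slide might look like the sketch below. The benchmark ratings, utilizations, and growth percentage are hypothetical; only the +100% peak/average and performance buffer comes from the slide.

```python
import math


def required_fraction(current_util, old_rating, new_rating,
                      growth_pct, buffer_pct=100.0):
    """Fraction of one standard replacement configuration a server needs.

    current_util is the measured CPU utilization (0..1) on the old server;
    old_rating and new_rating are benchmark capacity ratings in the same unit.
    """
    demand = current_util * old_rating     # demand in benchmark units
    demand *= 1.0 + growth_pct / 100.0     # apply growth
    demand *= 1.0 + buffer_pct / 100.0     # peak/average and performance buffer
    return demand / new_rating


if __name__ == "__main__":
    # (name, current CPU utilization, old benchmark rating): illustrative values
    servers = [("img01", 0.60, 100.0), ("img02", 0.45, 100.0), ("img03", 0.80, 150.0)]
    new_rating = 400.0  # hypothetical rating of the standard replacement
    total = 0.0
    for name, util, rating in servers:
        frac = required_fraction(util, rating, new_rating, growth_pct=30.0)
        total += frac
        print(f"{name}: {frac:.0%} of a standard configuration")
    print(f"Standard configurations needed: {math.ceil(total)}")
```

Because each server is treated as its own workload, the fractions are computed one server at a time and only summed at the end.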

53 June 2002 Sizing Spreadsheet

54 June 2002 Example 4: Hundreds of Servers Data capture Reporting Business drivers

55 June 2002 Data Capture Time-based scheduling product Script-based data “pull” Issue: data loss, time to find and rebuild Potential fixes: –Product –Data “push” from servers

56 June 2002 Data Reporting, Analysis Color-based “health index” (Concord NetHealth metric). Statistical Analysis (over two standard deviations from mean) Thumbnail drilldown graphs Automatic generation of html “Treemap” graphs
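The statistical test named on this slide (a value more than two standard deviations from the mean) can be sketched with the standard library alone. The utilization history below is made up for illustration, and using the sample standard deviation of a prior baseline window is an assumption of this sketch.

```python
import statistics


def is_exception(history, value, sigmas=2.0):
    """True if value lies more than `sigmas` standard deviations from the
    mean of the baseline history."""
    mean = statistics.mean(history)
    sd = statistics.stdev(history)  # sample standard deviation
    return abs(value - mean) > sigmas * sd


if __name__ == "__main__":
    # Hypothetical CPU% readings for the same hour on previous days
    baseline = [40, 42, 38, 41, 39, 40, 43, 37]
    for reading in (43, 45, 34):
        status = "EXCEPTION" if is_exception(baseline, reading) else "ok"
        print(f"CPU {reading}%: {status}")
```

With hundreds of servers, a test like this lets reports surface only the servers behaving unusually instead of requiring every graph to be inspected.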

57 June 2002 Health Index* (*Concord NetHealth metric)

58 June 2002 Statistical Process Control cmg

59 June 2002 Thumbnail Html

60 June 2002 Automatic Generation of Html Driven by “matrix” –Originally spreadsheet –Converted to relational database –Ultimate capacity planning solution: information by server, application, platform, business driver SAS code - builds web pages and hyperlinks

61 June 2002 Treemap Paper by Ben Shneiderman, University of Maryland, http://www.cs.umd.edu/hcil/treemaps [Figure: example treemap]

62 June 2002 Business Drivers Capacity Councils - business units responsible for capacity planning of “demand” side Capacity Planners - build projections based on business drivers and historical trending

63 June 2002 Business Driver Based Forecasts [Diagram: Server, Application, Business Driver, Business Driver Projections]

64 June 2002 Regression Analysis Input = CPU and Business Drivers (Widgets, Gadgets, Customers) by month Output = Coefficients (f1, f2, f3) By month: projection = Widgets*f1 + Gadgets*f2 + Customers*f3;
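The regression on this slide can be sketched as an ordinary least-squares fit with no intercept, matching the projection formula Widgets*f1 + Gadgets*f2 + Customers*f3. The presentation does not say which fitting method or tool was used, so solving the normal equations in plain Python is an assumption of this sketch, as are the monthly figures.

```python
def fit_coefficients(drivers, cpu):
    """Least-squares fit of cpu ~ sum(f_j * driver_j), with no intercept.

    drivers: one row of business-driver values per month
    cpu:     measured CPU for the same months
    """
    n = len(drivers[0])
    # Normal equations: (X^T X) f = X^T y
    a = [[sum(r[i] * r[j] for r in drivers) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * y for r, y in zip(drivers, cpu)) for i in range(n)]
    # Gaussian elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("business drivers are collinear; cannot fit")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution
    f = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * f[c] for c in range(r + 1, n))
        f[r] = (b[r] - tail) / a[r][r]
    return f


def project(coeffs, driver_row):
    """projection = Widgets*f1 + Gadgets*f2 + Customers*f3"""
    return sum(f * x for f, x in zip(coeffs, driver_row))


if __name__ == "__main__":
    # Hypothetical monthly (Widgets, Gadgets, Customers) and CPU values
    months = [[100, 50, 1000], [120, 40, 1100], [90, 70, 1200],
              [150, 60, 1300], [110, 80, 1250]]
    cpu = [46.0, 55.5, 64.0, 73.5, 76.0]
    f1, f2, f3 = fit_coefficients(months, cpu)
    print(f"f1={f1:.4f} f2={f2:.4f} f3={f3:.4f}")
    print(f"Projected CPU: {project([f1, f2, f3], [160, 90, 1400]):.1f}")
```

Once fitted, the coefficients turn a business-driver forecast from the Capacity Councils directly into a CPU projection.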

65 June 2002 Graphical Output Widgets Gadgets Customers

66 June 2002 Enterprise “Capacity at a Glance”

67 June 2002 Summary Issues Access patterns and schedules Platforms (more types and numbers) Resources (what to track) Levels of capacity management Reporting of utilization and service levels, for large numbers of platforms Higher availability (redundancy, reporting) Deriving and reporting projections

68 June 2002 Summary Deriving Projections Basic capacity planning: –Growth rates –Upgrade thresholds Aggressive estimate of “scenario” demand Bracket growth: –Lower end: “baseline” –Upper end: “scenarios”

69 June 2002 Summary Types of Projections Number of transactions Number of users Number of platforms Application sizing input Application complexity Fraction of an existing workload Growth rate

70 June 2002 Summary Capacity Planning Projections based on application and platform Levels of capacity planning service Report on all enterprise resources Organize data with “matrix” database

