Storage Optimization Strategies Techniques for configuring your Progress OpenEdge Database in order to minimize IO operations.

Presentation transcript:

Storage Optimization Strategies Techniques for configuring your Progress OpenEdge Database in order to minimize IO operations.

A Few Words About The Speaker Tom Bascom, roaming DBA & Progress user since 1987. President, DBAppraise, LLC – remote database management service, simplifying the job of managing and monitoring the world's best business applications. VP, White Star Software, LLC – expert consulting services related to all aspects of Progress and OpenEdge.

What do we mean by Storage Optimization? The trade press thinks we mean BIG DISKS. Bean counters think we mean using the cheapest possible disks. SAN vendors think we mean whatever will get them the biggest commission. Programmers think we mean that they can store even more stuff, without a care in the world. DBAs seek the best possible reliability and performance at a reasonable cost.

We will NOT be Talking About: SANs Servers Operating Systems RAID Levels … and so forth.

The Pillars of OpenEdge Storage Optimization Avoid unnecessary IO. Only read the blocks that you need to read. Make every IO operation count. Perform IO in the optimal order.

How Do We Do That?

Always Use Type 2 Storage Areas Cluster Sizes of 0 or 1 are Type 1 Areas. – The Schema Area is always a Type 1 Area. – It would be nice if a Cluster Size of 1 was a Type 2 Area. Type 2 Areas have Cluster Sizes of 8, 64 or 512. Data Blocks in Type 2 Areas contain data from just one table.

Mixed vs Homogenous Blocks [Diagram: in a Type 1 (mixed) area, each block contains records from many tables interleaved (A, B, C, D…); in a Type 2 area, each block contains records from only one table; a dedicated Type 1 area can also hold a single table, but its blocks are not organized into clusters.]

Always Use Type 2 Storage Areas Type 2 Storage Areas are the foundation for all advanced features of the OpenEdge database. – New features will almost always require that Type 2 areas be in use in order to be leveraged.

Sneak Peek! 10.2B is expected to support a new feature called Alternate Buffer Pool. This can be used to isolate specified database objects (tables and/or indexes). The alternate buffer pool has its own distinct –B. If the database objects are smaller than –B there is no need for the LRU algorithm. This can result in major performance improvements for small, but very active, tables.
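As a very rough, hedged sketch of how this could look (assuming the -B2 broker startup parameter and a proutil enableB2 qualifier as described for 10.2B; the database name, area name, and sizes are invented, and the exact syntax should be verified against the release documentation once it ships):

# Start the broker with a primary buffer pool (-B) and an alternate buffer pool (-B2).
proserve mydb -B 500000 -B2 25000
# Assign a small, hot Type 2 area's objects to the alternate buffer pool.
proutil mydb -C enableB2 "Hot_Lookup_Area"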

Only Read What You Need Because Type 2 Storage Areas are asocial: – Data in the block is from a single table. – Locality of Reference is leveraged more strongly. – Table-oriented utilities, such as index rebuild, binary dump and so forth, know which blocks they need to read and which blocks they do not. – DB features, such as the SQL-92 fast table scan and fast table drop, can operate much more effectively.

Type 2 Areas are not just… … for large tables – very small, yet active tables can dominate an application's IO. Case Study: A system with 30,000 record reads/sec. – The bulk of the reads were from one 10,000 record table. – That table was in a Type 1 area and its records were widely scattered. – Big B was set to 10,000 and RAM was tight. – Moving the table to a Type 2 Area patched the problem. Only 10% of –B was now needed for this table.

Use Many (Type 2) Storage Areas Do NOT assign tables to areas based on function. Create distinct storage areas for: – Each very large table – Each very active table – Indexes vs Data – Tables with common Rows Per Block settings
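For illustration only, a structure (.st) file for such a layout might look something like the sketch below. Every name, area number, path, rows-per-block and cluster-size value here is invented for the example, and the # annotations are explanatory notes for this sketch rather than guaranteed .st comment syntax:

b ./bi
d "Schema Area":6,32 .
# "AreaName":area-number,rows-per-block;blocks-per-cluster  path
d "Order_Data":10,128;512 ./data      # one very large, very active table
d "Misc_Lookup":11,256;8 ./data       # small but hot lookup tables
d "Order_Index":20,32;8 ./index       # indexes kept apart from the data

The cluster size after the semicolon (8, 64 or 512) is what makes an area Type 2.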

Finding Active Tables

Shameless Plug

Use the Largest DB Block Size Large blocks reduce IO: fewer operations are needed to move the same amount of data. More data can be packed into the same space because there is proportionally less overhead. Because a large block can contain more data, it has improved odds of being a cache hit. Large blocks enable HW features to be leveraged, especially SAN HW.
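The block size is fixed when the database is built, so it is chosen at creation time. A minimal sketch, assuming an 8 KB block size and a hypothetical database named mydb:

# Build the physical structure with 8 KB blocks, then copy in the matching empty database.
prostrct create mydb mydb.st -blocksize 8192
procopy $DLC/empty8 mydb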

Set Rows Per Block Optimally Use the largest Rows Per Block that: – Fills the block! – But does not unnecessarily fragment it. Rough Guideline: – Next power of 2 after BlockSize / (AvgRecSize + 20) – Example: 8192 / (90 + 20) ≈ 74; next power of 2 = 128 – Range is powers of 2 from 1 to 256. Caveat: there are far more complex rules that can be used and a great deal depends on the application's record creation & update behavior.
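Scripted, the rule of thumb above looks something like this sketch (the ~20 bytes of per-record overhead is the approximation from the slide, and the average record size of 90 is just an example value):

# Rough RPB guideline: smallest power of 2 (capped at 256) at or above
# blocksize / (average record size + ~20 bytes of overhead).
awk -v bs=8192 -v rec=90 'BEGIN {
  n = bs / (rec + 20);              # 8192 / 110 = about 74 records fit per block
  p = 1; while (p < n) p *= 2;      # round up to the next power of 2
  if (p > 256) p = 256;             # valid RPB range is 1 to 256
  print p                           # prints 128
}'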

Set Rows Per Block Optimally
[Table: a 10,000 record simulation based on real table data with an average row size of 220, comparing 1 KB, 4 KB and 8 KB block sizes at various Rows Per Block settings – blocks required, disk space used (KB), wasted space per block, % of the block used, and the actual rows per block achieved.]

Set Rows Per Block Optimally Caveats: – Blocks have overhead which varies by storage area type, block size, Progress version and by tweaking the create and toss limits. – Not all data behaves the same: records which are created small and which grow frequently may tend to fragment if RPB is too high, and record size distribution is not always Gaussian.

Create and Toss Limits Free space thresholds within a block: – Create Limit: new records may be created in a block only if more than this much free space is available. Type 2 default = 150. – Toss Limit: a block is tossed (removed) from the RM chain when it has less free space than this. Type 2 default = 300. – Defaults changed significantly between V9 and OE10. – Change with PROUTIL; can be set either per area or per object.
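By way of example, the PROUTIL qualifiers involved look roughly like the commands below; the database, area and table names and the limit values are hypothetical, and the exact qualifier names and arguments should be checked against your release's documentation:

# Per-area limits (hypothetical values)
proutil mydb -C setareacreatelimit "Order_Data" 300
proutil mydb -C setareatosslimit "Order_Data" 600
# Per-object (table-level) limits
proutil mydb -C settablecreatelimit PUB.Order 300
proutil mydb -C settabletosslimit PUB.Order 600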

Create and Toss Limits
Symptom: Fragmentation occurs on updates to existing records; you anticipated one fragment, but two were created. → Action: Increase Create Limit.
Symptom: Fragmentation occurs on updates to existing records, or you have many (thousands, not hundreds) blocks on the RM chain with insufficient space to create new records. → Action: Increase Toss Limit.
Symptom: There is limited fragmentation, but database block space is being used inefficiently, and records are not expected to grow beyond their original size. → Action: Decrease Create Limit.
Symptom: There is limited fragmentation, but database block space is being used inefficiently, and records are not expected to grow beyond their original size. → Action: Decrease Toss Limit.

Set Cluster Size Optimally There is no advantage to having a cluster more than twice the size of the table. Except that you need a cluster size of at least 8 to be Type 2. Indexes are usually smaller than data and there may be dramatic differences in index size.

Set Cluster Size Optimally
$ proutil dbname -C dbanalys > dbname.dba
…
RECORD BLOCK SUMMARY FOR AREA "APP_FLAGS_Dat":
                                    Record Size (B)     -Fragments-   Scatter
Table            Records   Size    Min    Max    Mean    Count  Factor  Factor
PUB.APP_FLAGS       …        …M      …      …      …       …      …       …
…
INDEX BLOCK SUMMARY FOR AREA "APP_FLAGS_Idx":
Table            Index             Fields  Levels  Blocks  Size  %Util  Factor
PUB.APP_FLAGS    AppNo                …       …       …      …M    …       …
                 FaxDateTime          …       …       …      …K    …       …
                 FaxUserNotified      …       …       …      …K    …       …

Avoid IO, But If You Must…
[Table: cost of reads at each layer of the storage hierarchy – Progress to –B, –B to FS cache, FS cache to SAN, –B to SAN cache*, SAN cache to disk, and –B to disk – comparing time, number of records, number of operations, cost per operation and relative cost. Each step further from –B is dramatically more expensive per operation.]
* Used concurrent IO to eliminate FS cache.
… In Big B You Should Trust!

Make Every IO Op Count Use big blocks. – DB block size should be at least equal to the OS block size. – Bigger is good – especially for read-heavy workloads, it leverages the read-ahead capabilities of the HW. Pack the data densely. – Set Rows Per Block (RPB) high. – But not so dense as to fragment records.

Perform IO in the Optimal Order
4k DB Block:
Table    Index    %Sequential   %Idx Used   Density
Table1   idx1         69%          99%       0.51
         idx2*        98%           1%       0.51
         idx3         74%           0%       0.51
8k DB Block:
Table    Index    %Sequential   %Idx Used   Density
Table1   idx1         85%          99%       1.00
         idx2*       100%           1%       1.00
         idx3         83%           0%       0.99

Case Study
[Table: successive tuning runs showing block size, hit ratio, % sequential access, block references, IO operations and elapsed time.]
The process was improved from an initial runtime of roughly 2 hours (top line) to approximately 20 minutes (bottom line) by moving from 4k blocks and 69% sequential access at a hit ratio of approximately 95% to 8k blocks, 85% sequential access and a hit ratio of 99%.

Conclusion While these are general best practices that should be thought of as a baseline, it is certainly possible that testing will reveal advantages to different approaches. Optimizing storage requires substantial testing of many different variations. Methods which work with one particular combination of application, database, hardware and Progress release may very well change dramatically as the underlying components change. White Star Software has a great deal of experience in optimizing storage. We would be happy to engage with any customer that would like our help!

Questions?

Thank You!