Index Rebuild Performance: Hopefully you'll never need it. Wei Qiu, Principal Engineer, Progress Software Inc.


Agenda
  1. Overview
  2. Processing Details
  3. Tuning Suggestions
  4. Some Statistics
  5. Summary

Phases of Index Rebuild (non-recoverable)
  1. Index Scan
     Scan index data area start to finish
     I/O bound with little CPU activity
     Eliminated with area truncate
  2. Data Scan / Key Build
     Scan table data area start to finish (area at a time)
     Read records, build keys, insert to temp sort buffer
     Sort full temp file buffer blocks (write if > -TF)
     I/O bound with CPU activity
  3. Sort-Merge
     Sort-merge -TF and/or temp sort file
     CPU bound with I/O activity
     I/O eliminated if -TF large enough
  4. Index Key Insertion
     Read sorted list in -TF or temp sort file
     Insert keys into index
     Formats new clusters; may raise HWM
     I/O bound with little CPU activity

Index Rebuild Parameters - Overview
  -TB               sort block size (8K to 64K, note new limit)
  -datascanthreads  # threads for data scan phase
  -TMB              merge block size (default -TB)
  -TF               merge pool fraction of system memory (in %)
  -mergethreads     # threads per concurrent sort group merging
  -threadnum        # concurrent sort group merging
  -TM               # merge buffers to merge each merge pass
  -rusage           report system usage statistics
  -silent           a bit quieter than before

Agenda: Processing Details

Phases of Index Rebuild: Index Scan
  Summary:
    Scan index data area start to finish
    I/O bound with little CPU activity
    Eliminated with area truncate
  Details:
    Index area is scanned start to finish (single threaded)
    Block at a time with cluster hops
    Index blocks are put on the free chain for the index
    Index object is not deleted (to fix corrupt cluster or block chains)
    Order of operation:
      Blocks are read from disk
      Blocks are re-formatted in memory
      Blocks are written to disk as -B is exhausted
    Causes I/O in other phases for block re-format
    Can be eliminated with manual area truncate where possible
  Log output:
    Area 9: Index scan (Type II) complete.

Phases of Index Rebuild: Data Scan / Key Build
  Summary:
    Scan table data area start to finish (area at a time)
    Read records, build keys, insert to temp sort buffer
    Sort full temp file buffer blocks (write if > -TF)
    I/O bound with CPU activity
  Details:
    Table data area is scanned start to finish (multi-threaded if -datascanthreads)
    Each thread processes the next block in the area (with cluster hops)
    Database is re-opened by each thread in R/O mode
    Ensure file handle ulimits are set high enough
  Log output:
    Processing area 8 : (11463)
    Start 4 threads for the area. (14536)
    Area 8: Multi-threaded record scan (Type II) complete.

Rules for -datascanthreads
  Index rebuild is run with the sort option
  No index being rebuilt exists in the table area being scanned
  No word indexes are being rebuilt for the table data being scanned
  Data area being scanned is a Type II storage area
  If any rule is not met: use single-threaded data scan for that area
  Violation for one area does not preclude use on another area
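
The rules above can be read as a per-area check. The sketch below is only an illustration, assuming the four rules are conditions that must all hold for an area to get a multi-threaded scan; the class and field names are hypothetical and are not part of proutil.

```python
from dataclasses import dataclass


@dataclass
class AreaScanContext:
    """Hypothetical description of one table area being scanned."""
    sort_option: bool            # index rebuild was started with the sort option
    rebuilt_index_in_area: bool  # an index being rebuilt lives in this table area
    rebuilding_word_index: bool  # a word index is being rebuilt for this table data
    is_type_ii_area: bool        # the data area is a Type II storage area


def can_use_datascanthreads(ctx: AreaScanContext) -> bool:
    """Return True only when every rule from the slide is satisfied."""
    return (
        ctx.sort_option
        and not ctx.rebuilt_index_in_area
        and not ctx.rebuilding_word_index
        and ctx.is_type_ii_area
    )


def areas_scanned_multithreaded(areas: dict) -> list:
    """Each area is judged on its own: one failing area does not affect others."""
    return [name for name, ctx in areas.items() if can_use_datascanthreads(ctx)]
```

Because the check is evaluated area by area, a mixed area that holds both tables and indexes falls back to a single-threaded scan without changing how the other areas are scanned.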

Data Scan/Key Build
  a) Thread reads next data block (RM block) in data area
  b) Extract next record from data block and build index keys (sort order)
  c) Insert key into sort block for sort group (-TB: 8K thru 64K)
  d) Sort/merge full sort block into merge block (-TMB: -TB thru 64K)
  e) Write merge block to -TF, overflow to temp files .srt1, .srt2, ... (-TMB sized I/O)
  What about mixed areas?
    Process index blocks as in index scan phase
    No longer read only!
  [Diagram: DB -> RM Block -> Record -> Key -> Sort Block -> Merge Block -> -TF / .srt1, .srt2, ...]

Sort Groups: -SG 8 (note: 8 is the minimum)
  Each sort group has its own sort file
  Each index is assigned a particular sort group (hashed index #)
  Sort file location:
    1 & 2. Sort files in same directory (I/O contention)
    4. Sort files in different locations
  Ensure enough space
  Sort file location examples:
    1) -T /usr1/richb/temp/
    2) .srt file:
         0 /usr1/richb/temp/
    3) .srt file:
         /usr1/richb/temp/
         0 /usr1/richb/temp/
    4) .srt file:
         0 /usr1/richb/temp/
         0 /usr2/richb/temp/
         0 /usr3/richb/temp/
  [Diagram: Record -> Index 1, Index 2, Index 3, Index 9 -> SG 1, SG 2, SG 3 -> .srt1, .srt2, .srt3]

More on Sort Groups
  Sort file size is in 1K units
  0 indicates unlimited space
  Sort file max size with -TMB 8: 16 TB
  Increase -TMB for larger sort files (-TMB 64: 128 TB)
  Example (location option 3), .srt file:
    /usr1/richb/temp/
    0 /usr1/richb/temp/
  [Diagram repeated from the previous slide: Record -> Index 1, Index 2, Index 3, Index 9 -> SG 1, SG 2, SG 3 -> .srt1, .srt2, .srt3]
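
The two size limits quoted on this slide scale linearly with -TMB (16 TB at 8 KB, 128 TB at 64 KB). The sketch below only reproduces that arithmetic; the limit of 2^31 merge blocks per sort file is an inference from those two figures, not something the slide states.

```python
# Assumption (inferred, not stated in the source): a sort file can hold
# at most 2**31 merge blocks, each -TMB kilobytes in size.
ASSUMED_MAX_BLOCKS = 2 ** 31
KB_PER_TB = 2 ** 30


def max_sort_file_tb(tmb_kb: int) -> float:
    """Largest sort file, in TB, for a given -TMB value in KB."""
    return tmb_kb * ASSUMED_MAX_BLOCKS / KB_PER_TB


for tmb in (8, 16, 32, 64):
    print(f"-TMB {tmb}: {max_sort_file_tb(tmb):.0f} TB")
```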

Last one on Sort Groups, I promise
  Max -SG 64 per area
  What if there are more than 64 indexes?
    Sort groups can contain more than one index's entries: MOD(idx-num, -SG)
    Data scan adds each key entry to the appropriate sort group
    Sort/merge orders indexes in a sort group by index number
    Key insertion phase inserts all entries for one index, followed by all index key entries for the other index within the same sort group
  Each index is assigned a particular sort group (hashed index #)
  [Diagram: Record -> SG 1 (Index 1, Index 9), SG 2 (Index 2), SG 3 (Index 3) -> .srt1, .srt2, .srt3]
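
The MOD(idx-num, -SG) assignment can be illustrated with a few lines of Python. This is a sketch of the hashing rule as written on the slide, not the utility's actual code; the function name is hypothetical, and how a remainder of 0 maps onto a sort group number is left as-is because the slide does not say.

```python
from collections import defaultdict


def assign_sort_groups(index_numbers, sg=8):
    """Map each index number to a sort group using MOD(idx-num, -SG).

    Indexes are visited in ascending order, so the entries inside each
    group are already ordered by index number, as the sort/merge phase
    is described as doing.
    """
    groups = defaultdict(list)
    for idx in sorted(index_numbers):
        groups[idx % sg].append(idx)
    return dict(groups)


# The indexes from the diagram, with the minimum -SG of 8:
print(assign_sort_groups([1, 2, 3, 9], sg=8))
```

With -SG 8, index 1 and index 9 share a sort group, which matches the diagram: their keys go to the same sort file and are inserted one index after the other.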

Phases of Index Rebuild: Sort-Merge
  Summary:
    Sort-merge -TF and/or temp sort file
    CPU bound with I/O activity
    I/O eliminated if -TF large enough
  Log output:
    Sorting index group 3
    Spawning 4 threads for merging of group 3.
    Sorting index group 3 complete.

Sort-Merge Phase
  Sort blocks in each sort group have been sorted and merged into a linked list of individual merge blocks stored in -TF and temp files.
  Merge blocks are further merged -TM at a time to form new, larger runs of sorted merge blocks.
  -TM of these new runs are then merged, forming even larger runs of sorted merge blocks.
  When only one very large run is left, all the key entries in the sort group are in sorted order. Sorted!
  [Diagram: successive merge passes with up to 7 concurrent merge threads, then up to 3 concurrent merge threads, then only 1 merge thread on the last run]
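
The repeated "-TM at a time" merging is a standard multi-way external merge. The sketch below models it in memory with Python lists standing in for runs of merge blocks; it assumes nothing about the on-disk format and uses the hypothetical parameter name tm for the -TM value.

```python
import heapq


def merge_pass(runs, tm):
    """Merge the sorted runs tm at a time, producing fewer, longer runs."""
    return [
        list(heapq.merge(*runs[i:i + tm]))
        for i in range(0, len(runs), tm)
    ]


def sort_merge(runs, tm):
    """Repeat merge passes until one sorted run is left.

    Returns the final run and the number of passes that were needed.
    """
    if tm < 2:
        raise ValueError("tm must be at least 2 for the merge to make progress")
    passes = 0
    while len(runs) > 1:
        runs = merge_pass(runs, tm)
        passes += 1
    return (runs[0] if runs else []), passes


initial_runs = [[1, 5], [2, 6], [3, 7], [4, 8], [0, 9]]
print(sort_merge(initial_runs, tm=2))   # several passes with a small fan-in
print(sort_merge(initial_runs, tm=32))  # a single pass with a large fan-in
```

A larger fan-in means fewer passes over the data, which is why the number of runs merged per pass matters for the amount of I/O when the runs do not fit in -TF.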

-threadnum vs -mergethreads: -threadnum 2
  Sort groups: -TF/.srt1, -TF/.srt2, -TF/.srt3
  Thread 1 and Thread 2 each run the merge phase for one sort group (groups 1 and 2)

-threadnum vs -mergethreads: -threadnum 2 (continued)
  Sort groups: -TF/.srt1, -TF/.srt2, -TF/.srt3
  Thread 1 and Thread 2 each run the merge phase for one sort group (groups 2 and 3)
  B-tree insertion occurs as soon as a sort group's merge is completed.
  Thread 0 begins b-tree insertion concurrently.

-threadnum vs -mergethreads: -threadnum 2 -mergethreads 3
  Sort groups: -TF/.srt1, -TF/.srt2, -TF/.srt3
  Threads 1 through 8 run the merge phase for groups 1 and 2
  Merge threads merge successive runs of merge blocks concurrently.
  Note: 8 actively running threads

-threadnum vs -mergethreads: -threadnum 2 -mergethreads 3 (continued)
  Sort groups: -TF/.srt1, -TF/.srt2, -TF/.srt3
  Thread 2, Thread 6, Thread 7, Thread 8
  Thread 1, Thread 3, Thread 4, Thread 5
  Merge phase now running for groups 3 and 2

-threadnum vs -mergethreads: -threadnum 2 -mergethreads 3 (continued)
  Sort groups: -TF/.srt1, -TF/.srt2, -TF/.srt3
  Thread 2, Thread 6, Thread 7, Thread 8
  Thread 1, Thread 3, Thread 4, Thread 5
  Merge phase running for groups 2 and 3
  B-tree insertion occurs as soon as a sort group's merge is completed.
  Thread 0 begins b-tree insertion concurrently.
  Note: 9 actively running threads
  Best performance with low -threadnum and high -mergethreads
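
The thread counts noted on these slides (8 during merging, 9 once b-tree insertion starts) are consistent with each of the -threadnum sort-group threads running its own -mergethreads merge threads. The sketch below only encodes that reading of the diagrams; the function name and the counting rule are assumptions, not documented behaviour.

```python
def active_threads(threadnum: int, mergethreads: int, btree_insertion: bool = False) -> int:
    """Estimate concurrently running threads during the sort-merge phase.

    Assumed model: threadnum sort-group threads, each with mergethreads
    merge threads, plus thread 0 once b-tree insertion has begun.
    """
    group_threads = threadnum
    merge_threads = threadnum * mergethreads
    insertion_thread = 1 if btree_insertion else 0
    return group_threads + merge_threads + insertion_thread


print(active_threads(2, 3))                        # merging only
print(active_threads(2, 3, btree_insertion=True))  # merging plus b-tree insertion
```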

Phases of Index Rebuild: Index Key Insertion
  Summary:
    Read sorted list in -TF or temp sort file
    Insert keys into index
    Formats new clusters; may raise HWM
    I/O bound with little CPU activity

Index Key Insertion Phase
  Key entries from sorted merge blocks are inserted into the b-tree
  Performed sequentially: entry at a time, index at a time
  Leaf level insertion optimization (avoids b-tree scan)
  Leaf level written to disk as soon as full (since never revisited)
  Log output:
    Building index 11 (cust-num) of group 3
    …
    Building of indexes in group 3 completed.
    Multi-threaded index sorting and building complete.
  [Diagram: index b-tree from root to leaf; each leaf is written to the DB when full]
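
Because the keys arrive already sorted, each one can be appended to the current leaf without searching the tree, and a full leaf can be flushed immediately. The sketch below shows that general bulk-load idea with a list of lists standing in for disk writes; leaf_capacity and the function names are hypothetical, and this is not a description of the OpenEdge b-tree code.

```python
def build_leaf_level(sorted_keys, leaf_capacity):
    """Append sorted keys to the current leaf; emit a leaf as soon as it is full.

    Returns the leaves in the order they would be written. A leaf is never
    revisited after it has been emitted.
    """
    if leaf_capacity < 1:
        raise ValueError("leaf_capacity must be positive")
    written = []
    current = []
    for key in sorted_keys:
        current.append(key)
        if len(current) == leaf_capacity:
            written.append(current)  # stands in for writing the full leaf to disk
            current = []
    if current:
        written.append(current)      # final, partially filled leaf
    return written


def separator_keys(leaves):
    """First key of each leaf, the usual input for building the level above."""
    return [leaf[0] for leaf in leaves]


leaves = build_leaf_level(range(1, 8), leaf_capacity=3)
print(leaves)
print(separator_keys(leaves))
```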

2085 Indexes were rebuilt. (11465) Index rebuild complete. 0 error(s) encountered.

Agenda: Tuning Suggestions

Assumptions for best performance
  Index data is segregated from table data
    Indexes and tables are in different storage areas
  User data areas are Type II storage areas
  You have enough memory/disk space for sorting
    If not, go home. You're done.
  You understand the impact of CPU and memory consumption
    Process is allowed to use available system resources

Index Rebuild - Tuning
  Truncate index-only area if possible
    proutil -C truncate area Customer/Order Area
    The contents of table "customer" will be deleted.
    The contents of index "cust-order" will be deleted.
    Are you sure you want to truncate storage area "Customer/Order area" (y/n)
    This could be a life changing decision…
  .srt file: set up properly
    Spread I/O across disks
    Avoid with -TF settings (or RAM disk)

Index Rebuild - Tuning Parameters
  -datascanthreads = 1.5 * # CPUs
  -mergethreads * -threadnum = 1.5 * # CPUs
  -threadnum 2 to 4
  -B 1024
  -TF 80 (monitor physical memory paging)
  -TMB 64
  -TB 64
  -TM 32 (could be lower with large -mergethreads (> 16))
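
These rules of thumb can be turned into a small calculator. The sketch below simply applies the ratios from the slide to a CPU count; the function name, the default of 3 for -threadnum, and the rounding choices are assumptions made for illustration, and the output is a starting point to test rather than a recommendation.

```python
import math


def suggest_parameters(cpus: int, threadnum: int = 3) -> dict:
    """Apply the slide's tuning ratios to a given number of CPUs."""
    if not 2 <= threadnum <= 4:
        raise ValueError("the slide suggests -threadnum between 2 and 4")
    target = math.ceil(1.5 * cpus)  # 1.5 * # CPUs, rounded up
    return {
        "-datascanthreads": target,
        "-threadnum": threadnum,
        # chosen so that -mergethreads * -threadnum is close to 1.5 * # CPUs
        "-mergethreads": max(1, target // threadnum),
        "-B": 1024,
        "-TF": 80,
        "-TMB": 64,
        "-TB": 64,
        "-TM": 32,
    }


for flag, value in suggest_parameters(cpus=16).items():
    print(flag, value)
```

For 16 CPUs this yields -datascanthreads 24, -threadnum 3 and -mergethreads 8, the same thread settings used in the presentation's 64 GB example.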

Memory usage approximation
  Total memory used (in KB):
  totalMemory = dataScanOverhead + tfMemory + allSortGroups + 1 MB
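
The formula is a plain sum, so it is easy to evaluate once the three components are known. The sketch below assumes tfMemory is the -TF percentage applied to total system memory (the parameter overview describes -TF as a fraction of system memory in %); how dataScanOverhead and allSortGroups are derived is not given in the transcript, so they are left as inputs. All function names are hypothetical.

```python
KB_PER_MB = 1024
KB_PER_GB = 1024 * 1024


def tf_memory_kb(system_memory_kb: int, tf_percent: int) -> int:
    """Memory reserved for the merge pool: -TF percent of system memory."""
    return system_memory_kb * tf_percent // 100


def total_memory_kb(data_scan_overhead_kb: int, tf_kb: int, all_sort_groups_kb: int) -> int:
    """totalMemory = dataScanOverhead + tfMemory + allSortGroups + 1 MB, in KB."""
    return data_scan_overhead_kb + tf_kb + all_sort_groups_kb + KB_PER_MB


def kb_to_gb(kb: int) -> float:
    return kb / KB_PER_GB


# -TF 80 on a machine with 64 GB of RAM:
print(tf_memory_kb(64 * KB_PER_GB, 80))
# Converting the total quoted in the presentation's example (54,909,667 K):
print(round(kb_to_gb(54_909_667), 1))
```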

A rough approximation: 64 GB RAM, 16 CPUs, 8 indexes
  proutil -C idxbuild area Customer/Order Index Area -B 1024 -SG 10 -TB 64 -TM 32 -TMB 64 -TF 80 -datascanthreads 24 -threadnum 3 -mergethreads 8
  Total memory used (in KB):
  totalMemory = 76,... + ...,687,... + ... + 1 MB = 54,909,667 KB = 52.4 GB

Agenda: Some Statistics

Performance Numbers
  [Chart: elapsed time; cost of each phase (in seconds)]

Agenda: Summary

Summary
  Index rebuild: big improvements if
    Your database is set up properly
    You provide system resources to index rebuild
    You use the new settings in 10.2B06
  One bug fix in 10.2B07
  More efficient memory allocation coming soon! (10.2B08)
  Hopefully you'll never need to index rebuild

Questions?

October 6–9, 2013 Boston #PRGS Special low rate of $495 for PUG Challenge attendees with the code PUGAM And visit the Progress booth to learn more about the Progress App Dev Challenge!