Bernhard Zeller Alfons Kemper University of * This work was supported by an SAP contract within the so-called Terabyte-Project.

Slides:



Advertisements
Similar presentations
Chapter 1: The Database Environment
Advertisements

Database Systems: Design, Implementation, and Management
Extreme Performance with Oracle Data Warehousing
Tuning: overview Rewrite SQL (Leccotech)Leccotech Create Index Redefine Main memory structures (SGA in Oracle) Change the Block Size Materialized Views,
Module 13: Performance Tuning. Overview Performance tuning methodologies Instance level Database level Application level Overview of tools and techniques.
Information Systems & Semantic Web University of Koblenz Landau, Germany Advanced Data Modeling Steffen Staab with Simon Schenk TexPoint fonts used in.
Daniel Schall, Volker Höfner, Prof. Dr. Theo Härder TU Kaiserslautern.
Big Data Working with Terabytes in SQL Server Andrew Novick
A comparison of MySQL And Oracle Jeremy Haubrich.
© 2007 by Prentice Hall 1 Chapter 1: The Database Environment Modern Database Management 8 th Edition Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden.
SAP BW Implementation at JoAnn Stores Inc. Session Code 3309 Craig Eick JoAnn Stores Inc.
Chapter 1: The Database Environment
Chapter 9 Overview  Reasons to monitor SQL Server  Performance Monitoring and Tuning  Tools for Monitoring SQL Server  Common Monitoring and Tuning.
Presented by Jacob Wilson SharePoint Practice Lead Bross Group 1.
Simplify your Job – Automatic Storage Management Angelo Session id:
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. 1 Preview of Oracle Database 12 c In-Memory Option Thomas Kyte
Frangipani: A Scalable Distributed File System C. A. Thekkath, T. Mann, and E. K. Lee Systems Research Center Digital Equipment Corporation.
Analyzing the Energy Efficiency of a Database Server Hanskamal Patel SE 521.
Distributed Databases Dr. Lee By Alex Genadinik. Distributed Databases? What is that!?? Distributed Database - a collection of multiple logically interrelated.
By N.Gopinath AP/CSE. Why a Data Warehouse Application – Business Perspectives  There are several reasons why organizations consider Data Warehousing.
PMIT-6102 Advanced Database Systems
Exam QUESTION CertKiller.com has hired you as a database administrator for their network. Your duties include administering the SQL Server 2008.
Microsoft ® SQL Server ® 2008 and SQL Server 2008 R2 Infrastructure Planning and Design Published: February 2009 Updated: January 2012.
IT The Relational DBMS Section 06. Relational Database Theory Physical Database Design.
Data: Migrating, Distributing and Audit Tracking Michelle Ayers, Advisory Solution Consultant
HBase A column-centered database 1. Overview An Apache project Influenced by Google’s BigTable Built on Hadoop ▫A distributed file system ▫Supports Map-Reduce.
A337 File Design Computerized and Manual Systems 11/10/2009.
Physical Database Design & Performance. Optimizing for Query Performance For DBs with high retrieval traffic as compared to maintenance traffic, optimizing.
TM 7-1 Copyright © 1999 Addison Wesley Longman, Inc. Physical Database Design.
DBSQL 14-1 Copyright © Genetic Computer School 2009 Chapter 14 Microsoft SQL Server.
Data Warehousing 1 Lecture-24 Need for Speed: Parallelism Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for Agro-Informatics.
© 2007 by Prentice Hall 1 Introduction to databases.
© 2009 Pearson Education, Inc. Publishing as Prentice Hall 1 Chapter 1: The Database Environment Modern Database Management 9 th Edition Jeffrey A. Hoffer,
1099 Why Use InterBase? Bill Todd The Database Group, Inc.
Designing and Deploying a Scalable EPM Solution Ken Toole Platform Test Manager MS Project Microsoft.
BW Know-How Call : Performance Tuning dial-in phone numbers! U.S. Toll-free: (877) International: (612) Passcode: “BW”
Parallel Database Systems Instructor: Dr. Yingshu Li Student: Chunyu Ai.
Chapter 1 Chapter 1: The Database Environment Modern Database Management 8 th Edition Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden © 2007 by Prentice.
Introduction to Database Systems1. 2 Basic Definitions Mini-world Some part of the real world about which data is stored in a database. Data Known facts.
Distributed Database. Introduction A major motivation behind the development of database systems is the desire to integrate the operational data of an.
CS338Parallel and Distributed Databases11-1 Parallel and Distributed Databases Lecture Topics Multi-CPU and distributed systems Monolithic system Client–server.
Database Management 7. course. Reminder Disk and RAM RAID Levels Disk space management Buffering Heap files Page formats Record formats.
Praveen Srivatsa Director| AstrhaSoft Consulting blogs.asthrasoft.com/praveens |
Physical Database Design Purpose- translate the logical description of data into the technical specifications for storing and retrieving data Goal - create.
Introduction.  Administration  Simple DBMS  CMPT 454 Topics John Edgar2.
Infrastructure for Data Warehouses. Basics Of Data Access Data Store Machine Memory Buffer Memory Cache Data Store Buffer Bus Structure.
Copyright © 2009 Pearson Education, Inc. Publishing as Prentice Hall Chapter 9 Designing Databases 9.1.
IMS 4212: Database Implementation 1 Dr. Lawrence West, Management Dept., University of Central Florida Physical Database Implementation—Topics.
Lock Tuning. Overview Data definition language (DDL) statements are considered harmful DDL is the language used to access and manipulate catalog or metadata.
Load Rebalancing for Distributed File Systems in Clouds.
Hands-On Microsoft Windows Server 2008 Chapter 7 Configuring and Managing Data Storage.
Reliable Web Service Execution and Deployment in Dynamic Environments * Markus Keidl, Stefan Seltzsam, and Alfons Kemper Universität Passau Passau,
Introduction to Core Database Concepts Getting started with Databases and Structure Query Language (SQL)
Configuring SQL Server for a successful SharePoint Server Deployment Haaron Gonzalez Solution Architect & Consultant Microsoft MVP SharePoint Server
Retele de senzori Curs 1 - 1st edition UNIVERSITATEA „ TRANSILVANIA ” DIN BRAŞOV FACULTATEA DE INGINERIE ELECTRICĂ ŞI ŞTIINŢA CALCULATOARELOR.
Cofax Scalability Document Version Scaling Cofax in General The scalability of Cofax is directly related to the system software, hardware and network.
History of organizational systems Calculation systems Functional systems Integrated systems.
Abstract MarkLogic Database – Only Enterprise NoSQL DB Aashi Rastogi, Sanket V. Patel Department of Computer Science University of Bridgeport, Bridgeport,
Introduction to Partitioning in SQL Server
Data Warehouse.
What is the Azure SQL Datawarehouse?
April 30th – Scheduling / parallel
1 Demand of your DB is changing Presented By: Ashwani Kumar
Physical Database Design
Overview of big data tools
Data Warehousing Concepts
Introduction of Week 14 Return assignment 12-1
Performance And Scalability In Oracle9i And SQL Server 2000
The Database Environment
The Gamma Database Machine Project
Presentation transcript:

Bernhard Zeller Alfons Kemper University of * This work was supported by an SAP contract within the so-called Terabyte-Project Exploiting Advanced Database Optimization Features for Large-Scale SAP R/3 Installations* Experience Report:

University of Passau - Outline Brief Overview of SAP R/3 Motivation Related Work Traditional Performance Tuning Techniques Exploiting Horizontal Partitioning for Tuning Purposes Partitioning Scenarios/Techniques and their Pros and Cons Possible Benefits and Drawbacks of Partitioning Performance Analysis Conclusion

University of Passau - Overview of SAP R/3 SAP is the market leader for integrated business solutions SAP R/3 is SAPs enterprise resource planning (ERP) product SAP R/3 provides modules for finance, human resources, material management, etc. today about customers world wide use SAP R/3 (used by most Fortune 500 companies) more than Installations world wide Three-Tier Client/Server-Architecture

University of Passau - Three-Tier Client/Server- Architecture of SAP R/3 (second party) Relational DBMS DBMS Application Server 2 Application Server N Application Server 1 Presentation Server 1 Presentation Server 2 Presentation Server M... LAN/ WAN LAN Many Even more ONE DBMS on ONE Host !

University of Passau - Motivation (1) Todays high end SAP R/3 installations have reached their load capacity limits: data volumes of SAP R/3 System (i.e., the database volumes) are growing tremendously (several hundred Terabytes) hard to maintain (7 x 24) performance worsens Exploiting advanced features like horizontal partitioning can widen these load capacity limits Already implemented by most database vendors: No additional effort, just switch on.

University of Passau - Motivation (2) High end systems are the most important systems: revenue prestige new contracts, business competition Therefore: every day business is the top priority, i.e. only tolerable slow down of important (=OLTP) transactions due to the use of new techniques if this cant be guaranteed: dont do it ! In this case: The benefits are obvious. Prove that horizontal partitioning doesnt conflict with daily business !

University of Passau - Related Work G. Copeland, W. Alexander, E. Boughter, and T. Keller Data placement in bubba. In Proc. of the ACM SIGMOD Conf. on Management of Data, Chicago, IL, USA, D. J. DeWitt, R. H. Gerber, G. Graefe, M. L. Heytens, K. B. Kumar, and M. Muralikrishna Gamma - a high performance dataflow database machine. In Proc. Of the Conf. on Very Large Data Bases (VLDB), Kyoto, Japan, M. Mehta and D. J. DeWitt Data placement in shared-nothing parallel database systems. VLDB Journal, 6(1):53-72, S. Ceri, M. Negri, and G. Pelagatti Horizontal data partitioning in database design. In Proc. of the ACM SIGMOD Conf. on Management of Data, Orlando, USA, 1982.

University of Passau - Whats Different shared nothing distributed databases: - use of local computing power, i.e. #partitions #CPUs, #disks - network between the nodes is the bottleneck centralized systems like SAP R/3: - limited number of CPUs, main memory, disks, i.e., #partitions >> #CPUs, #disks - shared memory/disk access hazards on disk page / CPU level likely (thrashing)

University of Passau - Outline Brief Overview of SAP R/3 Motivation Related Work Traditional Performance Tuning Techniques Exploiting Horizontal Partitioning for Tuning Purposes Partitioning Scenarios/Techniques and their Pros and Cons Possible Benefits and Drawbacks of Partitioning Performance Analysis Conclusion

University of Passau - Traditional Tuning Techniques Reduce data (Archiving) Additional Indices Additional job instances Additional/better hardware sometimes not possible even more data huge hard to maintain slow down updates/inserts limited number of CPUs problems due to data skews too expensive already the newest HW installed Special designed software really expensive long deployment times

University of Passau - Partitioning Techniques and their Pros and Cons round robin, hash partitioning + balanced partition sizes - in which partition will record R be stored ? range partitioning + users have knowledge about the data distribution can use this knowledge at application level (work load balancing, definition of working sets,...) - unbalanced partition sizes due to data skews

University of Passau - Partitioning Scenarios partitioning no partitioning partition only table choose one partitioning algorithm partition table and indices partition only indices choose one partitioning algorithm different # of partitions same # of partitions choose partitioning algorithm for index and table equi partitioned non-equi partitioned choose one uniform partitioning algorithm choose separate algorithm for index and table done

University of Passau - Benefits of Partitioning Keep data manageable by processing data partition wise: administrative tasks like index re-creation, gathering statistics, table re-organization,... partition-wise, parallel processing at application level, e.g., partition by plant number and start an inventory job for each plant (equi) join processing (when partitioning fields are subset of join attributes) bulk deletes drop whole partitions

University of Passau - Possible Drawbacks: Row Movement (RM) RM = movement of data from one partition to another because of an update doubles the cost of an update transaction produces additional logging and locking overhead

University of Passau - Possible Drawbacks: Row Movement (Example) Table Equipment: Equipment of company COMP Inc. Partition Plant LA PlantIDDescription LA EquID Desk Chair de luxe Office copier Partition Plant NY PlantIDDescription NY EquID LCD Display Chair simple Mainframe NY008Office copier Task: New office copier for LA, old one to NYDB: update equipment set PlantID = NY where PlantID = LA and EquID = 008 DB Step 1: Delete in Partition Plant LADB Step 2: Insert into Partition Plant NY

University of Passau - Possible Drawbacks: Conflicts with Parallel Jobs resources of ERP systems (CPU, memory, storage) managed at application level ERP System has no knowledge of partitioning (i.e., parallelization) at database level Conflicts at CPU and disk page level are likely

University of Passau - Possible Drawbacks: Conflicts with Parallel Jobs Host CPU 1CPU 2 CPU 4CPU 3 DBMS Table A Partition A_1 ERP Job List Job 1: analyze A CPUs used: 0 CPUs used: 1 Partition A_2 Partition A_3 Partition A_4

University of Passau - Possible Drawbacks: Conflicts with Parallel Jobs Host CPU 1CPU 2 CPU 4CPU 3 DBMS Table A Partition A_1 ERP Job List Job 1: analyze A CPUs used: 0 CPUs used: 1 Partition A_2 Partition A_3 Partition A_4 Job 2: show... Job 3: calculate Job 4: transform CPUs used: 4 Job 2: show... Job 3: calculate Job 4: transform

University of Passau - Outline Brief Overview of SAP R/3 Motivation Related Work Traditional Performance Tuning Techniques Exploiting Horizontal Partitioning for Tuning Purposes Partitioning Scenarios/Techniques and their Pros and Cons Possible Benefits and Drawbacks of Partitioning Performance Analysis Conclusion

University of Passau - Analyzed Scenarios Flat: table and index are not partitioned Global Index: the table is partitioned but the index is not Partitioned Index: the table and the index are partitioned using the same partitioning algorithm (range partitioning)

University of Passau - Used Data and Hardware (1) Copies of SAP R/3 standard tables (Material Management): indexed key columns av. row length # rowstable size MARCmandt, matnr, werks 496 Byte5 million2.5 GB MARDmandt, matnr, werks, lgort 142 Byte25 million3.5 GB

University of Passau - Used Data and Hardware (2) range partitioned according to value of werks 100 partitions anonymized data from a productive SAP R/3 system Hardware: SUN Enterprise 450 with four 400 MHz CPUs and 4 GB RAM SUN A GB RAID with RAID level 5 the R/3 system and the DBMS used 512 MB main memory each

University of Passau - Analyzed Statements selects (single, for all entries, up to n rows, parallel selects,...) inserts, updates, and deletes joins parallel jobs at application level administrative tasks...

University of Passau - Evaluated Dimensions number of processed rows set orientated and one-record-at-a-time approach number of commits number of indices parallel jobs: processing data partition wise and across partition borders

University of Passau - Joining Single Plants (Hash)

University of Passau - Joining Whole Tables (Hash)

University of Passau - Joining Single Plants (Merge)

University of Passau - Insertions and Deletions

University of Passau - Insertions and Deletions

University of Passau - Update Statements

University of Passau - Select Statements

University of Passau - Select Statements

University of Passau - Conclusion Results show: horizontal partitioning is applicable (with tolerable costs) especially joins, administrative task, parallel selects benefit greatly horizontal partitioning is already used in some large-scale system

University of Passau - Thank you for your attention ! Questions ?