Teradata Physical Implementation – Case Study

1 Teradata Physical Implementation – Case Study

2 Create Table
  - Distribution Check & PI Change
  - Fallback
Create Index
  - USI
  - NUSI
Create Join Index
Create & Collect Statistics

3 Create Table – Copy Data
Log in using your own ID and view the definition of the source table:

    SHOW TABLE TPCH.Customer;

    CREATE SET TABLE TPCH.Customer ,NO FALLBACK ,
         NO BEFORE JOURNAL,
         NO AFTER JOURNAL
         (
          C_CUSTKEY INTEGER NOT NULL,
          C_NAME VARCHAR(25) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_ADDRESS VARCHAR(40) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_NATIONKEY INTEGER NOT NULL,
          C_PHONE CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_ACCTBAL DECIMAL(15,2) NOT NULL,
          C_MKTSEGMENT CHAR(10) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_COMMENT VARCHAR(117) CHARACTER SET LATIN CASESPECIFIC NOT NULL)
    UNIQUE PRIMARY INDEX ( C_CUSTKEY );

If privileges are missing, grant them to your user:

    GRANT SELECT ON DBC TO TRAINER;
    GRANT SELECT ON TPCH TO TRAINER;

Check the size of each table in the TPCH database:

    SELECT DatabaseName, TableName, SUM(CurrentPerm)
    FROM DBC.TABLESIZE
    WHERE DatabaseName = 'TPCH'
    GROUP BY DatabaseName, TableName;

    DatabaseName  TableName  Sum(CurrentPerm)
    TPCH          ORDERTBL
    TPCH          LINEITEM
    TPCH          PARTTBL
    TPCH          PARTSUPP
    TPCH          NATION
    TPCH          REGION
    TPCH          CUSTOMER
    TPCH          SUPPLIER

4 Data Distribution Check
1) Create the Customer table in your own user/database.
   - Keep the same definition (NO FALLBACK and the same PI).
   - You can create the table and then load the data, or do both in one statement:

    CREATE TABLE TRAINER.CUSTOMER AS TPCH.CUSTOMER WITH DATA;

   Show the table and check the definition and the data in it:

    SHOW TABLE TRAINER.Customer;

2) Check the table size by AMP:

    SELECT * FROM DBC.TABLESIZE
    WHERE DatabaseName = 'TRAINER' AND TableName = 'CUSTOMER';

    Vproc  DatabaseName  AccountName  TableName  CurrentPerm  PeakPerm
    0      TRAINER       DBC          CUSTOMER
    1      TRAINER       DBC          CUSTOMER

The current PI of the table is C_CUSTKEY, and its degree of uniqueness is 100%:

    SELECT COUNT(DISTINCT C_CUSTKEY), COUNT(1) FROM TRAINER.CUSTOMER;

    Count(Distinct(C_CUSTKEY))  Count(1)
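The per-AMP rows returned in step 2 can be condensed into a single skew figure. The query below is a minimal sketch, assuming the same DBC.TABLESIZE view and the TRAINER.CUSTOMER table used on this slide. The column aliases and the skew formula (100 * (1 - average / maximum)) are illustrative choices, not part of the original deck.

```sql
-- Sketch: summarize per-AMP space usage for one table.
-- A SkewPct near 0 means the rows are spread evenly across the AMPs;
-- a large value means one AMP holds far more data than the average.
SELECT DatabaseName,
       TableName,
       SUM(CurrentPerm) AS TotalPerm,    -- total space over all AMPs
       MAX(CurrentPerm) AS MaxAmpPerm,   -- space on the fullest AMP
       AVG(CurrentPerm) AS AvgAmpPerm,   -- average space per AMP
       100 * (1 - AVG(CurrentPerm) / NULLIFZERO(MAX(CurrentPerm))) AS SkewPct
FROM DBC.TABLESIZE
WHERE DatabaseName = 'TRAINER'
  AND TableName = 'CUSTOMER'
GROUP BY DatabaseName, TableName;
```

With a unique PI such as C_CUSTKEY, the skew figure would be expected to stay close to zero.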

5 Change in PI
1) Consider a different PI for the CUSTOMER table, here C_MKTSEGMENT. Check how its values are distributed:

    SELECT C_MKTSEGMENT, COUNT(1)
    FROM TRAINER.CUSTOMER
    GROUP BY C_MKTSEGMENT;

    C_MKTSEGMENT  Count(1)
    FURNITURE     1169
    MACHINERY     1174
    BUILDING      1296
    HOUSEHOLD     1171
    AUTOMOBILE    1190

    SELECT COUNT(DISTINCT C_MKTSEGMENT), COUNT(1) FROM TRAINER.CUSTOMER;

    Count(Distinct(C_MKTSEGMENT))  Count(1)
    5                              6000

   There are only 5 distinct values for 6000 rows, and all rows with the same segment hash to the same AMP.

2) Create the table CUSTOMER_PI with the same definition as Customer but with C_MKTSEGMENT as the PI:

    CREATE MULTISET TABLE TRAINER.Customer_PI ,NO FALLBACK ,
         NO BEFORE JOURNAL,
         NO AFTER JOURNAL
         (
          C_CUSTKEY INTEGER NOT NULL,
          C_NAME VARCHAR(25) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_ADDRESS VARCHAR(40) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_NATIONKEY INTEGER NOT NULL,
          C_PHONE CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_ACCTBAL DECIMAL(15,2) NOT NULL,
          C_MKTSEGMENT CHAR(10) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_COMMENT VARCHAR(117) CHARACTER SET LATIN CASESPECIFIC NOT NULL)
    PRIMARY INDEX ( C_MKTSEGMENT );

3) Check the size by AMP.

    Vproc  DatabaseName  AccountName  TableName    CurrentPerm  PeakPerm
    0      TRAINER       DBC          CUSTOMER_PI
    1      TRAINER       DBC          CUSTOMER_PI
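The distribution a candidate PI would produce can be previewed before the new table is populated. The sketch below uses Teradata's HASHROW, HASHBUCKET and HASHAMP functions against the existing TRAINER.CUSTOMER table. The slide does not show how CUSTOMER_PI is loaded, so the INSERT ... SELECT here is an assumption, and the aliases AmpNo and RowCnt are illustrative.

```sql
-- Sketch: count how many rows each AMP would receive
-- if C_MKTSEGMENT were the primary index.
SELECT HASHAMP(HASHBUCKET(HASHROW(C_MKTSEGMENT))) AS AmpNo,
       COUNT(*) AS RowCnt
FROM TRAINER.CUSTOMER
GROUP BY 1
ORDER BY 1;

-- Assumed load step (not shown on the slide): copy the rows into the
-- new table so that its per-AMP size can be checked in DBC.TABLESIZE.
INSERT INTO TRAINER.Customer_PI
SELECT * FROM TRAINER.CUSTOMER;
```

Running the same HASHAMP query with C_CUSTKEY in place of C_MKTSEGMENT gives a side-by-side comparison of the two PI choices.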

6 Fallback Impact
1) Create the table CUSTOMER_FB with FALLBACK on and check the table size:

    CREATE MULTISET TABLE TRAINER.Customer_FB , FALLBACK ,
         NO BEFORE JOURNAL,
         NO AFTER JOURNAL
         (
          C_CUSTKEY INTEGER NOT NULL,
          C_NAME VARCHAR(25) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_ADDRESS VARCHAR(40) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_NATIONKEY INTEGER NOT NULL,
          C_PHONE CHAR(15) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_ACCTBAL DECIMAL(15,2) NOT NULL,
          C_MKTSEGMENT CHAR(10) CHARACTER SET LATIN CASESPECIFIC NOT NULL,
          C_COMMENT VARCHAR(117) CHARACTER SET LATIN CASESPECIFIC NOT NULL)
    PRIMARY INDEX ( C_MKTSEGMENT );

2) Check the size by AMP.
   - Note that the total size is doubled.
   - This is only a two-AMP system, so each AMP holds the fallback copy of the other AMP's rows. With FALLBACK, both AMPs therefore show the same size.

    Vproc  DatabaseName  AccountName  TableName    CurrentPerm  PeakPerm
    0      TRAINER       DBC          CUSTOMER_FB
    1      TRAINER       DBC          CUSTOMER_FB
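The doubling can be confirmed by putting the two tables side by side. This is a minimal sketch, assuming CUSTOMER_PI and CUSTOMER_FB have both been loaded with the same rows (the load step is not shown in the deck). The aliases AmpCount and TotalPerm are illustrative.

```sql
-- Sketch: compare total space with and without FALLBACK.
-- CUSTOMER_FB is expected to use roughly twice the space of CUSTOMER_PI.
SELECT TableName,
       COUNT(*)         AS AmpCount,   -- one DBC.TABLESIZE row per AMP
       SUM(CurrentPerm) AS TotalPerm   -- total space over all AMPs
FROM DBC.TABLESIZE
WHERE DatabaseName = 'TRAINER'
  AND TableName IN ('CUSTOMER_PI', 'CUSTOMER_FB')
GROUP BY TableName
ORDER BY TableName;
```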

8 Creating USI
1) Explain the query below (the PI is C_MKTSEGMENT):

    EXPLAIN SELECT * FROM CUSTOMER_PI WHERE C_CUSTKEY = 1613;

    We do an all-AMPs RETRIEVE step from TRAINER.CUSTOMER_PI by way of an all-rows scan with a condition of ( ...

   Note: it is doing a full table scan.

2) Create a USI on C_CUSTKEY:

    CREATE UNIQUE INDEX IDX_CKEY (C_CUSTKEY) ON CUSTOMER_PI;

3) Note the change in the Explain to an index access and the change in the response time.

CUSTOMER_PI is the table created on slide 5 (MULTISET, NO FALLBACK, PRIMARY INDEX ( C_MKTSEGMENT )).

EXPLAIN – WITHOUT USI (captured against CUSTOMER_FB, which has the same PI)
  1) First, we lock a distinct TRAINER."pseudo table" for read on a RowHash to prevent global deadlock for TRAINER.CUSTOMER_FB.
  2) Next, we lock TRAINER.CUSTOMER_FB for read.
  3) We do an all-AMPs RETRIEVE step from TRAINER.CUSTOMER_FB by way of an all-rows scan with a condition of ("TRAINER.CUSTOMER_FB.C_CUSTKEY = 1613") into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 707 rows. The estimated time for this step is 0.14 seconds.
  -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.14 seconds.

EXPLAIN – WITH USI
  1) First, we do a two-AMP RETRIEVE step from TRAINER.CUSTOMER_PI by way of unique index # 4 "TRAINER.CUSTOMER_PI.C_CUSTKEY = 1613" with no residual conditions. The estimated time for this step is 0.02 seconds.
  -> The row is sent directly back to the user as the result of statement 1. The total estimated time is 0.02 seconds.

9 Creating NUSI
1) Explain the query below:

    EXPLAIN SELECT * FROM CUSTOMER_PI WHERE C_NATIONKEY = 17;

   Note: it is doing a full table scan.

2) Create a NUSI on C_NATIONKEY:

    CREATE INDEX IDX_NTKEY (C_NATIONKEY) ON CUSTOMER_PI;

3) Explain the same query again and look for an index scan and a change in the response time.
   - Here the index is not used: the plan is still an all-rows scan, although the row estimate changes from 707 rows (no confidence) to 283 rows (low confidence).
   - We can only create the index; we cannot enforce its usage. Usage depends on the Optimizer.

EXPLAIN – WITHOUT NUSI
  1) First, we lock a distinct TRAINER."pseudo table" for read on a RowHash to prevent global deadlock for TRAINER.CUSTOMER_PI.
  2) Next, we lock TRAINER.CUSTOMER_PI for read.
  3) We do an all-AMPs RETRIEVE step from TRAINER.CUSTOMER_PI by way of an all-rows scan with a condition of ("TRAINER.CUSTOMER_PI.C_NATIONKEY = 17") into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 707 rows. The estimated time for this step is 0.14 seconds.
  -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.14 seconds.

EXPLAIN – WITH NUSI
  1) First, we lock a distinct TRAINER."pseudo table" for read on a RowHash to prevent global deadlock for TRAINER.CUSTOMER_PI.
  2) Next, we lock TRAINER.CUSTOMER_PI for read.
  3) We do an all-AMPs RETRIEVE step from TRAINER.CUSTOMER_PI by way of an all-rows scan with a condition of ("TRAINER.CUSTOMER_PI.C_NATIONKEY = 17") into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with low confidence to be 283 rows. The estimated time for this step is 0.13 seconds.
  -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.13 seconds.
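Because the Optimizer decides on cost whether a NUSI is worth using, one common follow-up is to give it statistics on the indexed column and check the plan again. The statements below are a sketch under that assumption. They are not part of the original deck, and on a table this small the Optimizer may still prefer the all-rows scan.

```sql
-- Sketch: list the indexes defined on the table, including the NUSI.
HELP INDEX CUSTOMER_PI;

-- Collect statistics on the NUSI column so that the Optimizer can
-- estimate how many rows match a given C_NATIONKEY value.
COLLECT STATISTICS ON CUSTOMER_PI INDEX (C_NATIONKEY);

-- Re-check the plan; whether the index is chosen is up to the Optimizer.
EXPLAIN SELECT * FROM CUSTOMER_PI WHERE C_NATIONKEY = 17;
```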

11 Creating JOIN Index
1) Explain the query below:

    EXPLAIN SELECT SUM(C_ACCTBAL) FROM CUSTOMER WHERE C_NATIONKEY = 10;

2) Create a single-table join index:

    CREATE JOIN INDEX CUSTOMER_JI AS
    SELECT C_CUSTKEY, C_NATIONKEY, C_ACCTBAL, C_MKTSEGMENT
    FROM CUSTOMER
    PRIMARY INDEX (C_NATIONKEY);

3) Run the same explain plan again.

EXPLAIN – WITHOUT JI
  1) First, we lock a distinct TRAINER."pseudo table" for read on a RowHash to prevent global deadlock for TRAINER.CUSTOMER.
  2) Next, we lock TRAINER.CUSTOMER for read.
  3) We do an all-AMPs SUM step to aggregate from TRAINER.CUSTOMER by way of an all-rows scan with a condition of ("TRAINER.CUSTOMER.C_NATIONKEY = 10"). Aggregate Intermediate Results are computed globally, then placed in Spool 3. The size of Spool 3 is estimated with high confidence to be 1 row. The estimated time for this step is 0.13 seconds.
  4) We do an all-AMPs RETRIEVE step from Spool 3 (Last Use) by way of an all-rows scan into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with high confidence to be 1 row. The estimated time for this step is seconds.
  -> The contents of Spool 1 are sent back to the user as the result of statement 1.

EXPLAIN – WITH JI
  1) First, we do a single-AMP SUM step to aggregate from TRAINER.CUSTOMER_JI by way of the primary index "TRAINER.CUSTOMER_JI.C_NATIONKEY = 10" with no residual conditions, and the grouping identifier in field1. Aggregate Intermediate Results are computed locally, then placed in Spool 3. The size of Spool 3 is estimated with high confidence to be 1 row. The estimated time for this step is 0.03 seconds.
  2) Next, we do a single-AMP RETRIEVE step from Spool 3 (Last Use) by way of the primary index "TRAINER.CUSTOMER_JI.C_NATIONKEY = 10" into Spool 1 (one-amp), which is built locally on that AMP. The size of Spool 1 is estimated with high confidence to be 1 row. The estimated time for this step is 0.03 seconds.
  -> The contents of Spool 1 are sent back to the user as the result of statement 1.
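The faster plan comes at the cost of extra storage, because the join index keeps its own copy of the selected columns. The statements below are a sketch for inspecting that cost, assuming the join index is reported in DBC.TABLESIZE under its own name in the TRAINER database. The alias TotalPerm is illustrative, and the DROP statement is left commented out as an optional cleanup step.

```sql
-- Sketch: display the stored definition of the join index.
SHOW JOIN INDEX TRAINER.CUSTOMER_JI;

-- Compare the space used by the base table and by the join index.
SELECT TableName,
       SUM(CurrentPerm) AS TotalPerm
FROM DBC.TABLESIZE
WHERE DatabaseName = 'TRAINER'
  AND TableName IN ('CUSTOMER', 'CUSTOMER_JI')
GROUP BY TableName
ORDER BY TableName;

-- Optional cleanup once the exercise is finished:
-- DROP JOIN INDEX TRAINER.CUSTOMER_JI;
```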

13 Collect Statistics
1) Run the Explain below:

    EXPLAIN SELECT * FROM CUSTOMER_FB WHERE C_NATIONKEY = 10;

   Note that the estimate is made with no confidence and the number of rows is 707.

2) Collect statistics:

    COLLECT STATISTICS ON CUSTOMER_FB COLUMN(C_NATIONKEY);

3) Explain again. Note the HIGH confidence and the number of rows, 290.

4) Compare with the actual count in the table:

    SELECT COUNT(1) FROM CUSTOMER_FB WHERE C_NATIONKEY = 10;

    Count(1)
    246

EXPLAIN – WITHOUT STATISTICS
  1) First, we lock a distinct TRAINER."pseudo table" for read on a RowHash to prevent global deadlock for TRAINER.CUSTOMER_FB.
  2) Next, we lock TRAINER.CUSTOMER_FB for read.
  3) We do an all-AMPs RETRIEVE step from TRAINER.CUSTOMER_FB by way of an all-rows scan with a condition of ("TRAINER.CUSTOMER_FB.C_NATIONKEY = 10") into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with no confidence to be 707 rows. The estimated time for this step is 0.14 seconds.
  -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.14 seconds.

EXPLAIN – WITH STATISTICS
  1) First, we lock a distinct TRAINER."pseudo table" for read on a RowHash to prevent global deadlock for TRAINER.CUSTOMER_FB.
  2) Next, we lock TRAINER.CUSTOMER_FB for read.
  3) We do an all-AMPs RETRIEVE step from TRAINER.CUSTOMER_FB by way of an all-rows scan with a condition of ("TRAINER.CUSTOMER_FB.C_NATIONKEY = 10") into Spool 1 (group_amps), which is built locally on the AMPs. The size of Spool 1 is estimated with high confidence to be 290 rows. The estimated time for this step is 0.13 seconds.
  -> The contents of Spool 1 are sent back to the user as the result of statement 1. The total estimated time is 0.13 seconds.
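Once statistics exist, they can be listed and refreshed as the data changes. The statements below are a small sketch using the same CUSTOMER_FB table. How often to refresh is a site-specific decision and is not covered by the original deck.

```sql
-- Sketch: list the statistics currently defined on the table,
-- including when each one was last collected.
HELP STATISTICS TRAINER.CUSTOMER_FB;

-- Re-collect every statistic already defined on the table,
-- for example after a large load has changed the row counts.
COLLECT STATISTICS ON TRAINER.CUSTOMER_FB;
```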

