SQL Log Analysis An Exploration of XQuery as a Tool to Analyze SQL Parse Trees Nathan Bales, Mary Fernandez, Lukasz Golab, Ted Johnson.

1 SQL Log Analysis
An Exploration of XQuery as a Tool to Analyze SQL Parse Trees
Nathan Bales, Mary Fernandez, Lukasz Golab, Ted Johnson

2 History
- Query logs from internal at&t source
  - Teradata warehouse
- Ted Johnson asked to analyze logs
  - Data to be migrated to new DBMS
  - Owners want to:
    - Find out which data is ‘live’
    - Inform new database design
  - Proprietary tools insufficient
- Ted’s Approach
  - Parse to memory AST, C++ to walk tree
  - Extract components to flat files
Copyright 2007 at&t

3 Analysis Goals
- Index suggestion
  - Find and count predicates over single tables
  - What comparison operators are used on an attribute?
- Determine usefulness of views
  - Which are used? Not used? Joined?
- Discover hidden schemata
  - Produce various interpretations of join graph
  - Claim facts about structure
- Identify query sources
  - Find queries written using the same tool or template
- etc... many more possible
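As a rough illustration of the index-suggestion goal, the Python sketch below counts which comparison operators are applied to each attribute in single-table predicates. The XML layout is an assumption made for this example: only the element names predicate, column, table and name appear in the slides, while the op attribute, the literal element and the nesting are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical parse-tree fragment; the real schema is not shown in the slides.
LOG_XML = """
<log>
  <query>
    <predicate op="=">
      <column><table><name>r</name></table><name>a</name></column>
      <literal>5</literal>
    </predicate>
    <predicate op="=">
      <column><table><name>r</name></table><name>b</name></column>
      <column><table><name>s</name></table><name>b</name></column>
    </predicate>
  </query>
  <query>
    <predicate op="LIKE">
      <column><table><name>r</name></table><name>a</name></column>
      <literal>T%</literal>
    </predicate>
  </query>
</log>
"""


def operator_counts(log):
    """Count (table, column, operator) triples over single-table predicates."""
    counts = Counter()
    for pred in log.iter("predicate"):
        tables = {t.findtext("name") for t in pred.iter("table")}
        if len(tables) != 1:
            continue  # predicates spanning several tables are joins, not index candidates
        for col in pred.iter("column"):
            key = (col.findtext("table/name"), col.findtext("name"), pred.get("op"))
            counts[key] += 1
    return counts


counts = operator_counts(ET.fromstring(LOG_XML))
for (table, column, op), n in sorted(counts.items()):
    print(f"{table}.{column} {op}: {n}")
```

Attributes that show up often with equality or range operators would be the natural candidates to look at first; the join predicate on r.b and s.b is deliberately skipped here.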

4 Observations on Query Logs
- Logs are large
  - 50,000 queries per month; several months
- Arbitrary complexity
  - Queries have thousands of terms
  - Hide complexity in views
  - Teradata may not materialize
- Natural tree structure
  - Other analysis methods analyze text or data
  - XML and XQuery are good tools for tree-structured data

5 Example query from log SELECT FTV_FINANCIAL_TRANSACTION.RESPONSIBILITY_CHARGED_CD, FTV_FINANCIAL_TRANSACTION.PROJECT_NBR, FTV_FINANCIAL_TRANSACTION.TRNSCTN_EXPENDITURE_TYPE_CD, FTV_FINANCIAL_TRANSACTION.PURCHASE_CARD_VENDOR_NM, SUM(FTV_FINANCIAL_TRANSACTION.TRANSACTION_AMT), FTV_VENDOR.VENDOR_NM, FTV_FINANCIAL_TRANSACTION.TRANSACTION_SOURCE_NM, FTV_TRNSCTN_EXPNDTR_TY_FDW.TRNSCTN_EXPENDITURE_TYPE_DESC, FTV_FINANCIAL_TRANSACTION.RESPONSIBILITY_ORIGINATING_CD, FTV_FINCL_TRNSCTN_INVOICE.INVOICE_NBR, FTV_FINCL_TRNSCTN_INVOICE.INVOICE_RECORDED_SBC_USERID, FTV_XC_MR2000_FDW_TL.FIN_CD, FTV_ACCOUNT_SERIES_FDW.ACCOUNT_NM, CASE WHEN (FTV_FINANCIAL_TRANSACTION.SUB_ACCOUNT_CD IS NOT NULL) AND (FTV_FINANCIAL_TRANSACTION.SUB_ACCOUNT_CD <> ' ') THEN FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD || '.' || TRIM (TRAILING FROM FTV_FINANCIAL_TRANSACTION.SUB_ACCOUNT_CD) || TRIM (TRAILING FROM FTV_FINANCIAL_TRANSACTION.ACCOUNT_LETTER_CD) ELSE FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD END, FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD, TRIM (TRAILING FROM FTV_FINANCIAL_TRANSACTION.SUB_ACCOUNT_CD), FTV_ACCOUNT_SERIES_FDW.PLANT_CLASS_CD, FTV_FINANCIAL_TRANSACTION.ACCOUNTED_IND, FTV_FINANCIAL_TRANSACTION.ACTIVITY_CD, FTV_FINANCIAL_TRANSACTION.FINANCIAL_APPLICATION_CD, FTV_FINANCIAL_TRANSACTION.TRANSACTION_COMMENT1_TXT, FTV_FINANCIAL_TRANSACTION.TRANSACTION_COMMENT2_TXT, FTV_FINANCIAL_TRANSACTION.TRANSACTION_DESC, FTV_FINANCIAL_TRANSACTION.EMPLOYEE_SBC_USERID, FTV_FINANCIAL_TRANSACTION.INVENTORY_PRODUCT_ID, FTV_FINANCIAL_TRANSACTION.JOB_ACTIVITY_CD, FTV_FINANCIAL_TRANSACTION.TRANSACTION_ENTRY_TYPE_DESC, FTV_FINANCIAL_TRANSACTION.REFERENCE_NBR, FTV_FINANCIAL_TRANSACTION.TRANSACTION_TYPE_NM, FTV_FINANCIAL_TRANSACTION.DATA_YEAR_MONTH_FMT_DT, CAST(CAST(FTV_FINANCIAL_TRANSACTION.DATA_YEAR_MONTH_DT AS FORMAT 'YYYY') AS CHAR(4)), CAST(CAST(FTV_FINANCIAL_TRANSACTION.DATA_YEAR_MONTH_DT AS FORMAT 'MM') AS CHAR(2)), FTV_GEOGRAPHIC_LOCATION_1.LOCATION_CLLI_CD, FTV_FINANCIAL_TRANSACTION.BUDGET_LOCATION_CD, 
FTV_GEOGRAPHIC_LOCATION_1.WIRE_CENTER_CLLI_CD, FTV_COMPANY_CODE_FDW.REGIONAL_COMPANY, FTV_FINANCIAL_TRANSACTION.COMPANY_CD, CASE WHEN (FTV_FINANCIAL_TRANSACTION.INSTANCE_CD = 'WL') THEN (CASE WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '1') THEN 'AST' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '2') THEN 'LIB' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '3') THEN 'EQT' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '4') THEN 'REV' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '5') THEN 'CGS' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '6') THEN 'EXP' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '7') THEN 'EXP' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '8') THEN 'INC' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '9') THEN 'EXT' ELSE ' ' END) WHEN (FTV_FINANCIAL_TRANSACTION.INSTANCE_CD = 'TL') THEN (CASE WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '1') THEN 'AST' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '4') THEN 'LIB' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '5') THEN 'REV' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '6') THEN 'EXP' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '7') THEN 'INC' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '8') THEN 'CLR' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '9') THEN 'CLR' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '3') THEN 'DEP' WHEN (FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD IN ('2002','2003','2004')) THEN (CASE WHEN (FTV_FINANCIAL_TRANSACTION.ACTIVITY_CD LIKE '5%') THEN 'OCP' ELSE 'CON' END) WHEN (FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD IN ('2005','2006','2007')) THEN 'OCP' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '2') AND (FTV_FINANCIAL_TRANSACTION.ACTIVITY_CD LIKE '5%') THEN 'OCP' WHEN 
(SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '2') THEN 'CON' ELSE ' ' END) ELSE ' ' END, CASE WHEN FTV_FINANCIAL_TRANSACTION.SUB_ACCOUNT_CD = ' ' THEN ' ' WHEN FTV_FINANCIAL_TRANSACTION.ACCOUNT_LETTER_CD = ' ' THEN ' ' ELSE TRIM (TRAILING FROM FTV_FINANCIAL_TRANSACTION.SUB_ACCOUNT_CD) || TRIM (TRAILING FROM FTV_FINANCIAL_TRANSACTION.ACCOUNT_LETTER_CD) END, FTV_PROJECT.PROJECT_DESC, FTV_PROJECT.PROJECT_NM, FTV_PROJECT.PROJECT_TYPE_NM FROM FDW_ACCESS_VIEWS.VZFW506_TRNSCTN_EXPNDTR_TY_FDW FTV_TRNSCTN_EXPNDTR_TY_FDW RIGHT JOIN FINANCE_ACCESS_VIEWS.VSTF005_FINANCIAL_TRANSACTION FTV_FINANCIAL_TRANSACTION ON FTV_TRNSCTN_EXPNDTR_TY_FDW.INSTANCE_CD=FTV_FINANCIAL_TRANSACTION.INSTANCE_CD AND FTV_TRNSCTN_EXPNDTR_TY_FDW.TRNSCTN_EXPENDITURE_TYPE_CD=FTV_FINANCIAL_TRANSACTION.TRNSCTN_EXPENDITURE_TYPE_CD LEFT JOIN ACCESS_VIEWS.VCCR038_GEOGRAPHIC_LOCATION FTV_GEOGRAPHIC_LOCATION_1 ON FTV_GEOGRAPHIC_LOCATION_1.INSTANCE_CD=FTV_FINANCIAL_TRANSACTION.INSTANCE_CD AND FTV_GEOGRAPHIC_LOCATION_1.GEOGRAPHIC_LOCATION_CD=FTV_FINANCIAL_TRANSACTION.BUDGET_LOCATION_CD LEFT JOIN FINANCE_ACCESS_VIEWS.VCTF003_FINCL_TRNSCTN_INVOICE FTV_FINCL_TRNSCTN_INVOICE ON FTV_FINCL_TRNSCTN_INVOICE.INVOICE_ID=FTV_FINANCIAL_TRANSACTION.INVOICE_ID AND FTV_FINCL_TRNSCTN_INVOICE.INSTANCE_CD=FTV_FINANCIAL_TRANSACTION.INSTANCE_CD LEFT JOIN ACCESS_VIEWS.VCCR037_PROJECT FTV_PROJECT ON FTV_FINANCIAL_TRANSACTION.INSTANCE_CD=FTV_PROJECT.INSTANCE_CD AND FTV_FINANCIAL_TRANSACTION.PROJECT_NBR=FTV_PROJECT.PROJECT_NBR AND FTV_FINANCIAL_TRANSACTION.VALID_PROJECT_CD=FTV_PROJECT.VALID_PROJECT_CD LEFT JOIN ACCESS_VIEWS.VCCR039_VENDOR FTV_VENDOR ON FTV_VENDOR.VENDOR_ID=FTV_FINANCIAL_TRANSACTION.VENDOR_ID AND FTV_VENDOR.INSTANCE_CD=FTV_FINANCIAL_TRANSACTION.INSTANCE_CD LEFT JOIN FDW_ACCESS_VIEWS.VZFW500_COMPANY_CODE_FDW FTV_COMPANY_CODE_FDW ON FTV_FINANCIAL_TRANSACTION.COMPANY_CD=FTV_COMPANY_CODE_FDW.COMPANY_CD LEFT JOIN FDW_ACCESS_VIEWS.VZFW501_ACCOUNT_SERIES_FDW FTV_ACCOUNT_SERIES_FDW ON 
FTV_FINANCIAL_TRANSACTION.INSTANCE_CD=FTV_ACCOUNT_SERIES_FDW.INSTANCE_CD AND FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD=FTV_ACCOUNT_SERIES_FDW.MAIN_ACCOUNT_CD AND FTV_FINANCIAL_TRANSACTION.SUB_ACCOUNT_CD=FTV_ACCOUNT_SERIES_FDW.SUB_ACCOUNT_CD AND FTV_FINANCIAL_TRANSACTION.ACCOUNT_LETTER_CD=FTV_ACCOUNT_SERIES_FDW.ACCOUNT_LETTER_CD LEFT JOIN FDW_ACCESS_VIEWS.VZFW507_XC_MR2000_FDW FTV_XC_MR2000_FDW_TL ON FTV_FINANCIAL_TRANSACTION.INSTANCE_CD=FTV_XC_MR2000_FDW_TL.INSTANCE_CD AND FTV_FINANCIAL_TRANSACTION.TRNSCTN_EXPENDITURE_TYPE_CD=FTV_XC_MR2000_FDW_TL.XC_CD WHERE (FTV_FINANCIAL_TRANSACTION.INSTANCE_CD = 'TL') AND ((FTV_FINANCIAL_TRANSACTION.COMPANY_CD LIKE ('T%'))) AND (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) ^= '5' ) AND ( FTV_FINANCIAL_TRANSACTION.DATA_YEAR_MONTH_FMT_DT BETWEEN 'JAN-2005' AND 'DEC-2005' AND SUBSTR(FTV_FINANCIAL_TRANSACTION.RESPONSIBILITY_CHARGED_CD,1,3) = 'S0S' AND FTV_FINANCIAL_TRANSACTION.PROJECT_NBR IN ('5698427', '5703138', '5703064', '5698563', '5702834') AND (CASE WHEN (FTV_FINANCIAL_TRANSACTION.INSTANCE_CD = 'WL') THEN (CASE WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '1') THEN 'AST' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '2') THEN 'LIB' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '3') THEN 'EQT' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '4') THEN 'REV' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '5') THEN 'CGS' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '6') THEN 'EXP' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '7') THEN 'EXP' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '8') THEN 'INC' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '9') THEN 'EXT' ELSE ' ' END) WHEN (FTV_FINANCIAL_TRANSACTION.INSTANCE_CD = 'TL') THEN (CASE WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '1') THEN 'AST' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '4') 
THEN 'LIB' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '5') THEN 'REV' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '6') THEN 'EXP' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '7') THEN 'INC' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '8') THEN 'CLR' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '9') THEN 'CLR' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '3') THEN 'DEP' WHEN (FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD IN ('2002','2003','2004')) THEN (CASE WHEN (FTV_FINANCIAL_TRANSACTION.ACTIVITY_CD LIKE '5%') THEN 'OCP' ELSE 'CON' END) WHEN (FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD IN ('2005','2006','2007')) THEN 'OCP' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '2') AND (FTV_FINANCIAL_TRANSACTION.ACTIVITY_CD LIKE '5%') THEN 'OCP' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '2') THEN 'CON' ELSE ' ' END) ELSE ' ' END = 'CON' OR CASE WHEN (FTV_FINANCIAL_TRANSACTION.INSTANCE_CD = 'WL') THEN (CASE WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '1') THEN 'AST' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '2') THEN 'LIB' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '3') THEN 'EQT' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '4') THEN 'REV' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '5') THEN 'CGS' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '6') THEN 'EXP' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '7') THEN 'EXP' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '8') THEN 'INC' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '9') THEN 'EXT' ELSE ' ' END) WHEN (FTV_FINANCIAL_TRANSACTION.INSTANCE_CD = 'TL') THEN (CASE WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '1') THEN 'AST' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '4') THEN 'LIB' WHEN 
(SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '5') THEN 'REV' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '6') THEN 'EXP' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '7') THEN 'INC' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '8') THEN 'CLR' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '9') THEN 'CLR' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '3') THEN 'DEP' WHEN (FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD IN ('2002','2003','2004')) THEN (CASE WHEN (FTV_FINANCIAL_TRANSACTION.ACTIVITY_CD LIKE '5%') THEN 'OCP' ELSE 'CON' END) WHEN (FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD IN ('2005','2006','2007')) THEN 'OCP' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '2') AND (FTV_FINANCIAL_TRANSACTION.ACTIVITY_CD LIKE '5%') THEN 'OCP' WHEN (SUBSTR(FTV_FINANCIAL_TRANSACTION.MAIN_ACCOUNT_CD,1,1) = '2') THEN 'CON' ELSE ' ' END) ELSE ' ' END = 'EXP') ) GROUP BY 1, 2, 3, 4, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42

6 Summer Progress
- Background work
  - Design XML schema for SQL query parse trees
  - Parse logs to XML
  - Give qualified names context
  - Inline views
- Exploration phase
  - Write simple analysis over XML log
    - Poor performance
  - Materialize views of interesting facts
    - Increased complexity
  - Annotate XML log with interesting facts
    - Adding to schema breaks old analyses
- Result: Need sound, robust conceptual model

7 Example Analysis
    { for $att in $log//column
      where exists($log//predicate[.//column === $att])
        and empty($log//select_expression[empty(./ancestor::subquery)])
      return { $att } }

8 Research Issue 1 of 2
- Conceptual model for query analysis
- Logical level:
  - Identify interesting aspects of queries
  - Aspect defined by XQuery function f(x)
  - Find or group queries with specific aspects
- Physical level:
  - What is an aspect?
    - Concrete component: any sub-parse-tree
    - Abstract component: result of applying f(x) to sub-parse-tree
  - Aspect index
    - Indexes concrete components in log
    - Keyed on abstract component values

9 Research Issue 1 of 2 (Example)
- Aspect: Join Edge
- Concrete part: a predicate
    $predicate in $log//predicate
- Abstract part: join edge
    { for $table in $predicate//table/name/text()
      order by $table ascending
      return { $table } }
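The aspect-index idea can be sketched in a few lines of Python. This is not the system from the talk, only a minimal illustration under assumptions: the log fragment, the query ids and the column layout are made up, and the helper names join_edge and build_aspect_index are hypothetical. As on the slide, the concrete component is a predicate and the abstract component is the sorted list of table names it mentions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical log fragment; only predicate and table/name come from the slides.
LOG_XML = """
<log>
  <query id="q1">
    <predicate>
      <column><table><name>s</name></table><name>k</name></column>
      <column><table><name>r</name></table><name>k</name></column>
    </predicate>
  </query>
  <query id="q2">
    <predicate>
      <column><table><name>r</name></table><name>k</name></column>
      <column><table><name>s</name></table><name>k</name></column>
    </predicate>
    <predicate>
      <column><table><name>t</name></table><name>x</name></column>
    </predicate>
  </query>
</log>
"""


def join_edge(predicate):
    """Abstract component f(x): sorted table names referenced by a predicate."""
    return tuple(sorted(n.text for n in predicate.findall(".//table/name")))


def build_aspect_index(log, concrete_path, f):
    """Index concrete components (sub-trees), keyed on their abstract value."""
    index = defaultdict(list)
    for component in log.findall(concrete_path):
        index[f(component)].append(component)
    return index


log = ET.fromstring(LOG_XML)
index = build_aspect_index(log, ".//predicate", join_edge)
for key, components in sorted(index.items()):
    print(key, len(components))
```

Because the table names are sorted, r.k = s.k and s.k = r.k land under the same key, so queries sharing a join edge can be found or grouped with a single lookup.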

10 System Diagram

11 Conclusion
- Will patent
- Hope to publish
- Exposed open optimization problems
- Things I learned:
  - Vastly different approaches to computer science research can be very successful
  - How industrial problems motivate research
  - Let the research motivate the paper, not vice versa
  - 10.5 weeks 3,000 miles from fiancée = not healthy

12 Q&A

13 Research Issue 2 of 2
- Extending the model for similarity analysis
  - Leverage structural similarity in addition to textual
  - Use understood properties of SQL to improve score
- Example:
  - SELECT a FROM r
  - Which is more similar?
    - SELECT a, b, c FROM r
    - SELECT a, d FROM r, s
- Consider similarities of multiple aspects in a single query
- Query optimization could break scores
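One way to make the "which is more similar?" question concrete is to score each aspect separately and combine the scores. The Python sketch below does this for the three example queries, with the column and table sets written out by hand. The choice of Jaccard similarity and of an unweighted average is an assumption made for illustration; the slides do not say how the score is computed.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity(q1, q2):
    """Unweighted mean of per-aspect Jaccard scores over the shared aspects."""
    aspects = q1.keys() & q2.keys()
    return sum(jaccard(q1[k], q2[k]) for k in aspects) / len(aspects)


# Aspect values extracted by hand from the example queries.
base = {"columns": {"a"}, "tables": {"r"}}                  # SELECT a FROM r
cand1 = {"columns": {"a", "b", "c"}, "tables": {"r"}}       # SELECT a, b, c FROM r
cand2 = {"columns": {"a", "d"}, "tables": {"r", "s"}}       # SELECT a, d FROM r, s

score1 = similarity(base, cand1)
score2 = similarity(base, cand2)
print(round(score1, 3), round(score2, 3))
```

Under these assumptions the first candidate scores higher because it reads from exactly the same table, even though it selects more extra columns. A different weighting of the aspects could reverse the ranking, which is the kind of design question the slide raises.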

14 Related Work
- Vendor analysis tools
  - DB2’s index advisor (others)
- Practical Query Analysis (http://pqa.projects.postgresql.org)
  - Ruby tool for MySQL and PostgreSQL
  - Aggregate text after some normalization
- SQL Text Mining (Vik Singh, Jim Gray, Mark Manasse – MSR Tech Report)
  - Normalize query text
  - Cluster with known text similarity methods
  - Goals
    - Bot detection
    - Query recommendation
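For contrast with the parse-tree approach, the text-based idea of aggregating queries after normalization can be sketched as follows. This is a generic illustration in Python, not the actual normalization used by Practical Query Analysis or the SQL Text Mining report; the regular expressions and the sample queries are assumptions.

```python
import re
from collections import Counter


def normalize(sql):
    """Reduce a query to a template: mask literals, collapse whitespace, uppercase."""
    sql = re.sub(r"'(?:[^']|'')*'", "?", sql)        # string literals
    sql = re.sub(r"\b\d+(?:\.\d+)?\b", "?", sql)     # standalone numeric literals
    return " ".join(sql.split()).upper()


# Made-up log entries for illustration.
queries = [
    "SELECT a FROM r WHERE b = 5",
    "select a  from r where b = 17",
    "SELECT a FROM r WHERE c = 'x'",
]

templates = Counter(normalize(q) for q in queries)
for template, n in templates.most_common():
    print(n, template)
```

The first two queries collapse into one template, while the third stays separate. Text normalization of this kind sees nothing of the tree structure, so two queries that differ only in the order of their join predicates would still be counted as different templates.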

