1 Tough Choices Materialize nothing. Compute every cell on demand. Worst query response time. No space requirements. Materialize part of the data cube.

Presentation transcript:

1 Tough Choices Materialize nothing. Compute every cell on demand. Worst query response time. No space requirements. Materialize part of the data cube. Many cells are computable from other cells. But which cells to materialize? More cells = better query performance. Materialize the entire data cube. Best query response time. Excessive space requirements.

2 Data Value Hypercube versus ordinary data cubes DATA VALUE HYPERCUBES store data-record indices, whereas existing data cubes can only store data aggregates. DATA VALUE HYPERCUBES are generated as quickly as existing data cubes.

3 Remember this? Now it doesn’t matter. OLTP OLAP UNSTRUCTURED DATA STRUCTURED DATA Multi- Dimensional Databases XML EDI Spreadsheets Web Pages RSS Web Log Voice recognition Instant Messaging Wikis Content Management Document Management Taxonomies, Ontologies Multimedia Legacy Databases Relational Databases Main Frame Databases +80% -80%

4 Hypercube Hypercubes are constructed so that each cell corresponds to a unique combination of database attribute values. 3 attributes require at least 8 cells.
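One reading of the count of 8, consistent with the eight labelled tables on the following slides, is that every subset of the 3 attributes (including the empty one, shown as None) gets its own table, giving 2^3 = 8. The sketch below enumerates those subsets. The function name `cuboids` is hypothetical and does not come from the slides.

```python
from itertools import combinations

def cuboids(dimensions):
    """Return every subset of the dimensions, from the empty group-by to the full one."""
    result = []
    for size in range(len(dimensions) + 1):
        for subset in combinations(dimensions, size):
            result.append(subset)
    return result

views = cuboids(("Customer", "Part", "Supplier"))
for view in views:
    # The empty subset is the fully aggregated table, labelled "None" on the slides.
    print("".join(view) or "None")
print(len(views))
```

With d attributes the same enumeration yields 2^d tables, which is why materializing everything becomes expensive quickly.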

5

6 CustomerPart Customer Supplier None PartSupplier Part CustomerPartSupplier

7 CustomerSupplier Boeing Delta FedEx Lockheed Delta FedEx CustomerPartSupplier Boeing Cockpit Delta FedEx Lockheed Cockpit Delta FedEx Boeing Jet Engine Delta FedEx Lockheed Jet Engine Delta FedEx Boeing Wing Delta FedEx Lockheed Wing Delta FedEx PartSupplier Boeing Cockpit Jet Engine Wing Lockheed Cockpit Jet Engine Wing CustomerPart Cockpit Delta FedEx Jet Engine Delta FedEx Wing Delta FedEx Supplier Boeing Lockheed Customer Delta FedEx None Cockpit Jet Engine Wing Part

8 CustomerSupplier Boeing Delta FedEx Lockheed Delta FedEx CustomerPartSupplier Boeing Cockpit Delta FedEx Lockheed Cockpit Delta FedEx Boeing Jet Engine Delta FedEx Lockheed Jet Engine Delta FedEx Boeing Wing Delta FedEx Lockheed Wing Delta FedEx PartSupplier Boeing Cockpit Jet Engine Wing Lockheed Cockpit Jet Engine Wing CustomerPart Cockpit Delta FedEx Jet Engine Delta FedEx Wing Delta FedEx Supplier Boeing Lockheed Customer Delta FedEx None Cockpit Jet Engine Wing Part 3 attributes require at least 8 cells.

9 This is entirely fictional data.
CustomerPartSupplier (Supplier, Part, Customer: Sales): Boeing, Cockpit, Delta: $10; Boeing, Cockpit, FedEx: $20; Lockheed, Cockpit, Delta: $30; Lockheed, Cockpit, FedEx: $40; Boeing, Jet Engine, Delta: $50; Boeing, Jet Engine, FedEx: $60; Lockheed, Jet Engine, Delta: $70; Lockheed, Jet Engine, FedEx: $80; Boeing, Wing, Delta: $90; Boeing, Wing, FedEx: $100; Lockheed, Wing, Delta: $110; Lockheed, Wing, FedEx: $120.
PartSupplier (Supplier, Part: Sales): Boeing, Cockpit: $30; Boeing, Jet Engine: $110; Boeing, Wing: $190; Lockheed, Cockpit: $70; Lockheed, Jet Engine: $150; Lockheed, Wing: $230.
CustomerPart (Part, Customer: Sales): Cockpit, Delta: $40; Cockpit, FedEx: $60; Jet Engine, Delta: $120; Jet Engine, FedEx: $140; Wing, Delta: $200; Wing, FedEx: $220.
CustomerSupplier (Supplier, Customer: Sales): Boeing, Delta: $150; Boeing, FedEx: $180; Lockheed, Delta: $210; Lockheed, FedEx: $240.
Part: Cockpit: $100; Jet Engine: $260; Wing: $420.
Supplier: Boeing: $330; Lockheed: $450.
Customer: Delta: $360; FedEx: $420.
All: $780.

10 Lattice Notation A lattice is denoted as (L, <=). L = the set of elements (queries). <= is the dependence relation. ancestor(a) = {b | a <= b}. descendant(a) = {b | b <= a}. Every element is its own descendant and ancestor. next(a) = the immediate proper ancestors of a. next(a) = {b | a < b, and there exists no c such that a < c and c < b}.

11 Lattice Diagrams Lattice diagrams are graphs. Elements are nodes. There is an edge from a to b iff b is in next(a). There is a path downward from y to x iff x <= y.
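The lattice operations can be sketched in a few lines, assuming each query is modelled as the set of dimensions it groups by and that a <= b means "a is a subset of b". Here next(a) is taken to be the set of b with a < b and no c strictly between a and b. The names `next_of` and `EDGES` are hypothetical, chosen for this illustration only.

```python
from itertools import combinations

DIMS = ("part", "supplier", "customer")

# L: every group-by query, modelled as a frozenset of dimension names (an assumption).
L = [frozenset(c) for r in range(len(DIMS) + 1) for c in combinations(DIMS, r)]

def ancestor(a):
    """ancestor(a) = {b | a <= b}; includes a itself."""
    return {b for b in L if a <= b}

def descendant(a):
    """descendant(a) = {b | b <= a}; includes a itself."""
    return {b for b in L if b <= a}

def next_of(a):
    """Immediate proper ancestors: a < b with no c strictly in between."""
    return {b for b in L if a < b and not any(a < c and c < b for c in L)}

# Lattice diagram: an edge from a to b iff b is in next(a).
EDGES = {(a, b) for a in L for b in next_of(a)}

part = frozenset({"part"})
print(sorted(sorted(b) for b in next_of(part)))
print(len(EDGES))
```

For three dimensions the diagram is a cube: 8 nodes and 12 edges.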

12 Hypercube Algebra Simple data warehouse example. Parts are purchased from suppliers and then sold to customers. Three dimensions: Part, Supplier, and Customer. The measure of interest is total sales. For each cell (p, s, c), store the total sales of part p that was bought from supplier s and sold to customer c. Users are interested in consolidated sales. Example: what are the total sales of a given part p to a given customer c? This query is answered by looking up the value in cube cell (p, ALL, c). Many cells are computable from other cells (dependent cells). Example: cell (p, ALL, c) is the sum of cells (p, s1, c), …, (p, sn, c). CustomerPart (Part, Customer: Sales): Cockpit, Delta: $40; Cockpit, FedEx: $60; Jet Engine, Delta: $120; Jet Engine, FedEx: $140; Wing, Delta: $200; Wing, FedEx: $220.
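As a minimal sketch of how a dependent cell such as (p, ALL, c) follows from the base cells, the code below sums over every supplier when ALL appears in the supplier position. It rebuilds the fictional sales figures from the earlier slides; the function name `cell` and the string marker `"ALL"` are assumptions for illustration. It computes the value on demand rather than looking it up in a materialized cell.

```python
from itertools import product

ALL = "ALL"  # wildcard marker, as in cell (p, ALL, c)

PARTS = ("Cockpit", "Jet Engine", "Wing")
SUPPLIERS = ("Boeing", "Lockheed")
CUSTOMERS = ("Delta", "FedEx")

# Base cells (p, s, c) -> sales, filled with the fictional $10..$120 figures.
BASE = {
    key: 10 * (i + 1)
    for i, key in enumerate(product(PARTS, SUPPLIERS, CUSTOMERS))
}

def cell(p, s, c):
    """Value of cube cell (p, s, c); ALL in a position sums over that dimension."""
    wanted = (p, s, c)
    return sum(
        sales
        for key, sales in BASE.items()
        if all(w == ALL or w == k for w, k in zip(wanted, key))
    )

print(cell("Cockpit", ALL, "Delta"))  # 40, the CustomerPart entry for Cockpit, Delta
print(cell(ALL, ALL, ALL))            # 780, the grand total
```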

13 The Dependence Relation on Queries Consider two queries Q1 and Q2. Q1 <= Q2 iff Q1 can be answered using only the results of Q2. Q1 is then dependent on Q2. For example, the query (part) can be answered using only the query (part, customer), so (part) <= (part, customer). Some queries are not comparable with each other using the <= operator. For example, (part) !<= (customer) and (customer) !<= (part). CustomerPart (Part, Customer: Sales): Cockpit, Delta: $40; Cockpit, FedEx: $60; Jet Engine, Delta: $120; Jet Engine, FedEx: $140; Wing, Delta: $200; Wing, FedEx: $220.
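The dependence test can be illustrated with a short sketch, assuming a query is identified by the tuple of dimensions it groups by, so that Q1 <= Q2 reduces to a subset check. The helper names `depends_on` and `answer_from` are hypothetical. The second helper answers (part) using only the rows of the (part, customer) result.

```python
def depends_on(q1, q2):
    """Q1 <= Q2: every dimension of Q1 also appears in Q2."""
    return set(q1) <= set(q2)

def answer_from(view, view_dims, query_dims):
    """Answer query_dims using only the already-computed view."""
    if not depends_on(query_dims, view_dims):
        raise ValueError(f"{query_dims} cannot be answered from {view_dims}")
    positions = [view_dims.index(d) for d in query_dims]
    result = {}
    for key, sales in view.items():
        group = tuple(key[i] for i in positions)
        result[group] = result.get(group, 0) + sales
    return result

# The (part, customer) result from the slide.
PART_CUSTOMER = {
    ("Cockpit", "Delta"): 40, ("Cockpit", "FedEx"): 60,
    ("Jet Engine", "Delta"): 120, ("Jet Engine", "FedEx"): 140,
    ("Wing", "Delta"): 200, ("Wing", "FedEx"): 220,
}

print(answer_from(PART_CUSTOMER, ("part", "customer"), ("part",)))
print(depends_on(("part",), ("customer",)))  # False: the two are not comparable
```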

14 B-TREE LOGIC EASIER THAN IT LOOKS ACEGIKMOQSUWYZ BFJNRVX DLT HP

15 B-TREE LOGIC B IS FOR BALANCED GIVEN 3 RD ORDER B TREE WITH THE NUMBERS: INSERT INSERT INSERT 51 Insert any number < 20 and becomes the root. Insert any number > 50 and becomes the root. Insert any number > 20 and < 50 and it becomes the root

16 B-Tree Forest Construction time for the tree forest is O(Σ_{1≤i≤d} log n_i), where d is the number of query dimensions and n_i is the number of attributes in the database at level i.

17 B-Tree Forest A Balanced B-Tree Forest is the data structure that is used to represent a Hypercube. Each dimension in the Hypercube is represented by a separate B-Tree. B-Trees are great for storing sparse data and have fast insertion and search characteristics: O(log n) per operation, so O(n log n) to insert n keys.

18 B-Tree Forest A binary tree forest consists of multiple levels of binary trees. Each level represents a cube dimension. A binary tree consists of nodes: stems or leaves. Stem nodes point to left and right binary trees. Leaf nodes point to a linked list of fact table IDs. A linked list of fact table IDs points to fact table entries with identical attribute values. A depth-first search on a binary tree forest results in a GROUP BY clause.
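The structure described above can be sketched as follows. This is a simplified illustration, not the presenter's implementation: it uses an unbalanced binary search tree, a plain Python list in place of a linked list, and lets any node at the last level hold fact IDs. The names `Node`, `insert`, and `group_by` are hypothetical. The fact IDs follow the 12-row example, in which ID 0 is the $10 record and ID 11 the $120 record.

```python
from itertools import product

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None      # smaller keys at the same level
        self.right = None     # larger keys at the same level
        self.subtree = None   # root of the next level's tree (None at the last level)
        self.fact_ids = []    # fact table IDs; filled only at the last level

def insert(root, keys, fact_id):
    """Insert one fact; keys holds one attribute value per level. Returns the root."""
    if root is None:
        root = Node(keys[0])
    node = root
    while node.key != keys[0]:
        side = "left" if keys[0] < node.key else "right"
        if getattr(node, side) is None:
            setattr(node, side, Node(keys[0]))
        node = getattr(node, side)
    if len(keys) == 1:
        node.fact_ids.append(fact_id)
    else:
        node.subtree = insert(node.subtree, keys[1:], fact_id)
    return root

def group_by(root, prefix=()):
    """Depth-first, in-order walk yielding (group key, fact IDs) like a GROUP BY."""
    if root is None:
        return
    yield from group_by(root.left, prefix)
    path = prefix + (root.key,)
    if root.subtree is None:
        yield path, root.fact_ids
    else:
        yield from group_by(root.subtree, path)
    yield from group_by(root.right, prefix)

PARTS = ("Cockpit", "Jet Engine", "Wing")
SUPPLIERS = ("Boeing", "Lockheed")
CUSTOMERS = ("Delta", "FedEx")

# Two-level forest: Part at level 1, Customer at level 2.
forest = None
for fact_id, (part, supplier, customer) in enumerate(product(PARTS, SUPPLIERS, CUSTOMERS)):
    forest = insert(forest, (part, customer), fact_id)

groups = list(group_by(forest))
for key, ids in groups:
    print(key, ids)
```

Each group carries the IDs of the fact rows that share the same Part and Customer values, so the rows themselves can be fetched without a further query.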

19 B-Tree Forest in Reverse: A primer [Figure: the sales tables from slide 9 shown alongside three trees. Supplier Tree: Boeing, Lockheed. Parts Tree: Cockpit, Jet Engine, Wing. Customer Tree: Delta, FedEx.]

20 Extensive B-Trees Are Common Suppliers: BOEING, GENERAL DYNAMICS, LOCKHEED MARTIN, HONEYWELL INT’L, NORTHROP GRUMMAN, UNITED TECHNOLOGIES. Parts: AVIONICS, ELEVATOR, JET ENGINE, AILERON, FLIGHT CONTROLS, STABILIZER, COCKPIT, FIN, FUSELAGE, RUDDER, WING, LANDING GEAR. Customers: SOUTHWEST, DHL, DELTA, VIRGIN, FED EX. But let’s keep it simple for now.

21 Incoming Data Stream [Figure: the sales tables from slide 9, with the twelve CustomerPartSupplier records arriving as a DATA FLOW in 2 intervals. Chunk 1 carries the Cockpit records and the Boeing Jet Engine records (sales $10 to $60); Chunk 2 carries the Lockheed Jet Engine records and the Wing records (sales $70 to $120).]

22 Setting up Fact & Dimension Tables
Global String Table (String: ID): Boeing 0, Lockheed 1, Cockpit 2, Jet Engine 3, Wing 4, Delta 5, FedEx 6.
Supplier Dimension Table (String, String ID, ID): Boeing 0 0; Lockheed 1 1.
Part Dimension Table (String, String ID, ID): Cockpit 2 0; Jet Engine 3 1; Wing 4 2.
Customer Dimension Table (String, String ID, ID): Delta 5 0; FedEx 6 1.
Fact Table columns: ID, Supplier, Part, Customer, Sales.
[Figure: the Chunk 1 and Chunk 2 records are encoded into these tables. The figure labels the incoming strings UNSORTED and the dimension tables SORTED.]
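A minimal sketch of this encoding step, under stated assumptions: global string IDs are handed out column by column in first-seen order (which happens to reproduce the IDs 0 to 6 shown on the slide), each dimension table is sorted alphabetically, and the fact table stores the per-dimension IDs. The function name `build_tables` and the tuple layouts are hypothetical.

```python
from itertools import product

PARTS = ("Cockpit", "Jet Engine", "Wing")
SUPPLIERS = ("Boeing", "Lockheed")
CUSTOMERS = ("Delta", "FedEx")
DIMENSIONS = ("Supplier", "Part", "Customer")

# Records in arrival order: (supplier, part, customer, sales), fictional data.
RECORDS = [
    (s, p, c, 10 * (i + 1))
    for i, (p, s, c) in enumerate(product(PARTS, SUPPLIERS, CUSTOMERS))
]

def build_tables(records):
    # Global string table: string -> global ID (assumed first-seen, column by column).
    string_table = {}
    for col in range(len(DIMENSIONS)):
        for rec in records:
            string_table.setdefault(rec[col], len(string_table))

    # Dimension tables: string -> (global string ID, sorted per-dimension ID).
    dimension_tables = {}
    for col, name in enumerate(DIMENSIONS):
        values = sorted({rec[col] for rec in records})
        dimension_tables[name] = {
            v: (string_table[v], local_id) for local_id, v in enumerate(values)
        }

    # Fact table rows: (ID, Supplier ID, Part ID, Customer ID, Sales).
    fact_table = [
        (fact_id,)
        + tuple(dimension_tables[n][rec[c]][1] for c, n in enumerate(DIMENSIONS))
        + (rec[-1],)
        for fact_id, rec in enumerate(records)
    ]
    return string_table, dimension_tables, fact_table

strings, dims, facts = build_tables(RECORDS)
print(strings)
print(dims["Part"])
print(facts[0], facts[-1])
```

Replacing strings with small integers keeps the fact table compact and makes tree comparisons cheap.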

23 Let’s just say ‘Parts’ is the most significant data of interest. Fact Table (ID: Sales): 0 $10, 1 $20, 2 $30, 3 $40, 4 $50, 5 $60, 6 $70, 7 $80, 8 $90, 9 $100, 10 $110, 11 $120, with Supplier, Part, and Customer columns.

24 Understanding Nested B-Trees Fact Table (ID: Sales): 0 $10, 1 $20, 2 $30, 3 $40, 4 $50, 5 $60, 6 $70, 7 $80, 8 $90, 9 $100, 10 $110, 11 $120, with Supplier, Part, and Customer columns.

25 Understanding Nested B-Trees [Figure: the fact table (IDs 0 to 11, sales $10 to $120) with its Supplier, Part, and Customer columns, the three dimension tables from slide 22, and a tree over the parts Cockpit, Jet Engine, and Wing. Supplier and Customer values are abbreviated to B/L and D/F.]

26 Making a B-Tree Forest [Figure: a Part tree (Cockpit, Jet Engine, Wing) whose nodes each lead to a Supplier tree (Boeing, Lockheed), whose nodes in turn each lead to a Customer tree (Delta, FedEx); the Customer nodes point into the fact table (IDs 0 to 11, sales $10 to $120).] Drilling down the Hypercube to a Single Data Value

27 Data Structure & Concept Side by Side Do you see the Data Value Hypercube to the left? [Figure: the nested forest of Part (Cockpit, Jet Engine, Wing), Supplier (Boeing, Lockheed), and Customer (Delta, FedEx) trees, shown next to the eight cuboids from slide 7: CustomerPartSupplier, CustomerSupplier, PartSupplier, CustomerPart, Supplier, Customer, Part, and None.]

28 Network Data Stream
Record fields: Protocol, Content, ID, Destination IP, Source IP, Time Stamp.
GLOBAL String Table (String: ID): SMB 0, LDAP 1, SSH 2, AOL 3, JPEG 4, ENGLISH 5, ZIP 6, COMPRESS 7, GIFF 8, POP 9, SMTP 10, IMAP 11, FTP 12, TELNET 13, SKYPE 14, CMS 15, FRENCH 16, RUSSIAN 17, BMP 18, BASIC SOURCE 19, C SOURCE 20, DISCOVER 21.
CONTENT Dimension Table (String, String Table ID, ID): BASIC SOURCE 19 0; BMP 18 1; C SOURCE 20 2; CMS 15 3; COMPRESS 7 4; DISCOVER 21 5; ENGLISH 5 6; FRENCH 16 7; GIFF 8 8; JPEG 4 9; RUSSIAN 17 10; ZIP 6 11.
PROTOCOL Dimension Table (String, String Table ID, ID): AOL 3 0; FTP 12 1; IMAP 11 2; LDAP 1 3; POP 9 4; SKYPE 14 5; SMB 0 6; SMTP 10 7; SSH 2 8; TELNET 13 9.
Only showing 2 out of 16 NETWORK DATA STREAM Dimensions

29 B-TREE Notation Example: FTP B (1,3). FTP = Attribute Name; B = Node; (1,3) = (Level, Record Number).

30 NETWORK DATA STREAM “Protocols” B-TREE: POP B (1,9); AOL B (1,7); IMAP B (1,8); SKYPE B (1,4); FTP B (1,3); LDAP B (1,1); TELNET B (1,6); SMTP B (1,5); SSH B (1,2); SMB B (1,0).

31 Notation Example: BMP 4 B (7,9)(7,9)(7,9)(7,9). BMP = Attribute Name; 4 = Record Count; each (7,9) = (Chunk, Record Number). Tree nodes not only contain data aggregates but also a linked list of data record indices.
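The node notation can be mirrored by a small sketch in which each node keeps both an aggregate (the record count) and the (chunk, record number) indices behind it. This is an illustration only: a Python list stands in for the linked list, a dictionary stands in for the tree, and the names `ValueNode` and `index_stream` are hypothetical. The sample entries are the first five Content records shown for chunk 1 on the "Content" B-Trees slide.

```python
from dataclasses import dataclass, field

@dataclass
class ValueNode:
    name: str                                     # attribute name, e.g. "BMP"
    records: list = field(default_factory=list)   # (chunk, record number) pairs

    @property
    def count(self):
        """The data aggregate: how many records carry this attribute value."""
        return len(self.records)

    def __str__(self):
        refs = " ".join(f"({chunk},{number})" for chunk, number in self.records)
        return f"{self.name} {self.count} {refs}"

def index_stream(stream):
    """Group (chunk, record number, attribute) triples into one node per attribute."""
    nodes = {}
    for chunk, number, name in stream:
        nodes.setdefault(name, ValueNode(name)).records.append((chunk, number))
    return nodes

STREAM = [
    (1, 0, "BASIC SOURCE"), (1, 1, "BASIC SOURCE"), (1, 2, "BASIC SOURCE"),
    (1, 3, "BMP"), (1, 4, "C SOURCE"),
]

nodes = index_stream(STREAM)
for node in nodes.values():
    print(node)
```

Because the indices are stored with the aggregate, the underlying records can be located directly from the node.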

32 “Content” B-Trees ZIP 3 (2,10) (2,11) (2,12) C SOURCE 4 (2,3) (2,4) (2,5) (2,6) BMP 1 (2,2) BASIC SOURCE 3 (1,15) (2,0) (2,1) RUSSIAN 3 (2,7) (2,8) (2,9) B (1,8) SSH C SOURCE 1 (1,4) BMP 1 (1,3) BASIC SOURCE 3 (1,0) (1,1) (1,2) B (1,0) AOL CMS 1 (1,5) B (1,1) FTP COMPRESS 1 (1,6) B (1,2) IMAP DISCOVER 2 (1,7) (1,8) B (1,3) LDAP FRENCH 1 (1,9) B (1,4) POP GIFF 1 (1,10) B (1,5) SKYPE JPEG 2 (1,11) (1,12) B (1,6) SMB RUSSIAN 1 (1,14) B (1,7) AOL

33 B-Tree Forest POP B (1,9) AOL B (1,7) IMAP B (1,8) SKYPE B (1,4) FTP B (1,3) LDAP B (1,1) TELNET B (1,6) SMTP B (1,5) SSH B (1,2) SMB B (1,0) Pointer C SOURCE 1 (1,4) BMP 1 (1,3) BASIC SOURCE 3 (1,0) (1,1) (1,2) B (1,0) AOL Level Index of Tree at the same level

34 ZIP 3 (2,10) (2,11) (2,12) C SOURCE 4 (2,3) (2,4) (2,5) (2,6) BMP 1 (2,2) BASIC SOURCE 3 (1,15) (2,0) (2,1) RUSSIAN 3 (2,7) (2,8) (2,9) B (1,8) SSH C SOURCE 1 (1,4) BMP 1 (1,3) BASIC SOURCE 3 (1,0) (1,1) (1,2) B (1,0) AOL CMS 1 (1,5) B (1,1) FTP COMPRESS 1 (1,6) B (1,2) IMAP DISCOVER 2 (1,7) (1,8) B (1,3) LDAP FRENCH 1 (1,9) B (1,4) POP GIFF 1 (1,10) B (1,5) SKYPE JPEG 2 (1,11) (1,12) B (1,6) SMB RUSSIAN 1 (1,14) B (1,7) AOL POP B (1,9) AOL B (1,7) IMAP B (1,8) SKYPE B (1,4) FTP B (1,3) LDAP B (1,1) TELNET B (1,6) SMTP B (1,5) SSH B (1,2) SMB B (1,0)

35 Conclusion B-tree forests are limited to data aggregates. Data aggregates only identify the existence of a dimensional combination. They do not provide access to complete data records. With current OLAP implementations, examining data records requires issuing additional database queries, which is inefficient. We solve this problem by extending a balanced B-tree forest to include references to data records. We call this new type of hypercube the data value cube. Thus, for our data cube, tree nodes contain not only data aggregates but also a linked list of data record indices.

36 THE Q&A Stephen A. Broeker