Nguyễn Phạm Luân Tiến 50702449, Trần Đình Hương Trà 50702573, Dương Bách Tùng 50702839.

Presentation transcript:

Nguyễn Phạm Luân Tiến, Trần Đình Hương Trà, Dương Bách Tùng

Content
1. Introduction to OLAP Systems
2. Security Requirements in OLAP Systems
3. Some Security Issues

Introduction to OLAP Systems
Nowadays, databases are used in two main contexts:
1. OLTP: On-Line Transaction Processing
2. OLAP / DS: On-Line Analytical Processing / Decision Support

OLTP vs OLAP

                   | OLTP                             | OLAP
Function           | Continuous transaction handling  | Decision support
Database design    | Application-oriented             | Subject-oriented
Data               | Current, up-to-date, detailed    | Historical, aggregated, multidimensional
Access             | Read / Write / Index             | Read many times
Unit of work       | Short, single transactions       | Complex queries
# Records accessed | [value lost in extraction]       | 10k
# Users            | [value lost in extraction]       | [value lost in extraction]
Database size      | 100 MB to GB                     | 100 GB to TB

Data Warehouse
A data warehouse (DW) is a database used for reporting. The data is offloaded from the operational systems for reporting. A DW collects data in support of managers' decision-making process. Its data is:
- Subject-oriented
- Integrated
- Time-variant
- Non-volatile

Subject Oriented
Data is categorized and stored by business subject rather than by application.
[Figure: operational systems (Savings, Shares, Loans, Insurance, Equity Plans) feed the data warehouse subject area (Customer, Product, Sales Information).]

Integrated
[Figure: in the operational environment, the subject "Customer" is spread across the Savings, Current Accounts and Loans applications; in the data warehouse it is stored with no application flavor.]

Time Variant
Data is stored as a series of snapshots, each representing a period of time.
[Figure: data warehouse snapshots over time: 01/97 (data for January), 02/97 (data for February), 03/97 (data for March).]

Non Volatile
Typically, data in the data warehouse is not updated or deleted.
[Figure: operational databases accept INSERT, Read, UPDATE and DELETE; the warehouse database accepts only Load and Read.]

OLAP
In computing, online analytical processing (OLAP) is an approach to swiftly answering multi-dimensional analytical queries. The OLAP database is usually updated in batch, often from multiple sources. What most users want from their applications is a consistently fast response time. OLAP is a technology for processing business data: it performs multidimensional analysis of business data and provides the capability for complex calculations, trend analysis, and sophisticated data modeling.

OLAP SERVICES

OLAP ARCHITECTURES
Popular architectures of OLAP systems include ROLAP (relational OLAP) and MOLAP (multidimensional OLAP).
1) ROLAP provides a front-end tool that translates multidimensional queries into corresponding SQL queries to be processed by the relational backend.
2) MOLAP does not rely on the relational model but instead materializes the multidimensional views.
3) Using MOLAP for dense parts of the data and ROLAP for the others leads to a hybrid architecture, namely HOLAP (hybrid OLAP).
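To make the ROLAP idea concrete, here is a minimal sketch of a front end that turns a request such as "total sales by year" into SQL over a star schema. The table names (sales, time_dim, store_dim), the amount measure and the to_sql helper are all hypothetical and chosen only for illustration; a real ROLAP tool is far more elaborate.

```python
import sqlite3

# Hypothetical star schema: dimension name -> (dimension table, join key).
DIMENSIONS = {
    "time": ("time_dim", "time_key"),
    "store": ("store_dim", "store_key"),
}

def to_sql(levels):
    """Translate 'SUM(amount) grouped by these dimension levels' into SQL.

    levels maps a dimension name to the attribute to group by,
    for example {"time": "year"}.
    """
    joins, columns = [], []
    for dim, attribute in levels.items():
        table, key = DIMENSIONS[dim]
        joins.append(f"JOIN {table} ON sales.{key} = {table}.{key}")
        columns.append(f"{table}.{attribute}")
    select = ", ".join(columns + ["SUM(sales.amount) AS total"])
    sql = f"SELECT {select} FROM sales {' '.join(joins)}"
    if columns:
        grouped = ", ".join(columns)
        sql += f" GROUP BY {grouped} ORDER BY {grouped}"
    return sql

# Tiny in-memory relational backend with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE time_dim (time_key INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE store_dim (store_key INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE sales (time_key INTEGER, store_key INTEGER, amount INTEGER);
INSERT INTO time_dim VALUES (1, 2009), (2, 2010);
INSERT INTO store_dim VALUES (1, 'Hanoi'), (2, 'Hue');
INSERT INTO sales VALUES (1, 1, 100), (1, 2, 50), (2, 1, 70), (2, 1, 30);
""")

by_year = conn.execute(to_sql({"time": "year"})).fetchall()
grand_total = conn.execute(to_sql({})).fetchall()
print(by_year, grand_total)
```

Grouping by no dimension at all gives the grand total, which corresponds to the apex of the cube.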

ROLAP

[Figure: a relational table made of columns and rows, with key values used to join tables.]

KEY IN ROLAP
[Figure: the Time, Product and Store tables each have a single-column key (time key, product key, store key); the three keys together form a composite key.]

Star schema – 4 dimensions

Snowflake schema

MOLAP

Cube: Lattice of Cuboids
- 0-D (apex) cuboid: all
- 1-D cuboids: time; item; city; supplier
- 2-D cuboids: (time, item); (time, city); (time, supplier); (item, city); (item, supplier); (city, supplier)
- 3-D cuboids: (time, item, city); (time, item, supplier); (time, city, supplier); (item, city, supplier)
- 4-D (base) cuboid: (time, item, city, supplier)
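The lattice can be enumerated mechanically: when no dimension hierarchies are considered, every subset of the dimensions is one cuboid, so n dimensions give 2^n cuboids. The short sketch below (the function name cuboid_lattice is ours, not from the slides) groups them by dimensionality.

```python
from itertools import combinations

def cuboid_lattice(dimensions):
    """Group every subset of the dimensions by size: 0-D apex up to n-D base."""
    return {
        k: list(combinations(dimensions, k))
        for k in range(len(dimensions) + 1)
    }

lattice = cuboid_lattice(["time", "item", "city", "supplier"])
for k, cuboids in lattice.items():
    print(f"{k}-D cuboids: {cuboids}")
```

For the four dimensions on the slide this yields 1 + 4 + 6 + 4 + 1 = 16 cuboids.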

HIERARCHICAL

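ROLLING UP
[Figure: dimension hierarchies that are climbed when rolling up.]
- Product: Item -> Type -> Category -> All
- Geography: City -> State -> Country -> All
- Time: Day -> Week / Month -> Quarter -> Year -> All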

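DRILLING DOWN
[Figure: the same dimension hierarchies, descended when drilling down.]
- Product: All -> Category -> Type -> Item
- Geography: All -> Country -> State -> City
- Time: All -> Year -> Quarter -> Month / Week -> Day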
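As a rough illustration of rolling up along the Geography hierarchy, the sketch below re-aggregates city totals to states and then to a country. The cities, the figures and the roll_up helper are invented for the example.

```python
from collections import defaultdict

# Hypothetical geography hierarchy: city -> state -> country.
CITY_TO_STATE = {"San Jose": "California", "Los Angeles": "California", "Austin": "Texas"}
STATE_TO_COUNTRY = {"California": "USA", "Texas": "USA"}

def roll_up(totals, parent_of):
    """Aggregate totals one level up the hierarchy described by parent_of."""
    rolled = defaultdict(int)
    for member, value in totals.items():
        rolled[parent_of[member]] += value
    return dict(rolled)

by_city = {"San Jose": 120, "Los Angeles": 80, "Austin": 50}
by_state = roll_up(by_city, CITY_TO_STATE)
by_country = roll_up(by_state, STATE_TO_COUNTRY)
print(by_state, by_country)
```

Drilling down is the reverse navigation. It cannot be computed from the coarser totals alone, so the finer level (here by_city) has to be stored or recomputed from the base data.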

SLICE / DICE

PIVOT (ROTATE)

ROLAP vs MOLAP

                    | Relational      | Multidimensional
Data representation | Two dimensions  | Multiple dimensions
Data extraction     | Specific rows   | Specific dimensions
Computations        | Functions       | High-speed matrix
Results             | Tool specific   | Matrix

HOLAP

OLAP ARCHITECTURES

                             | MOLAP   | ROLAP            | HOLAP
Underlying data storage      | Cube    | Relational table | Relational table
Aggregate data storage       | Cube    | Relational table | Cube
Query performance            | Fastest | Slowest          | Fast
Consumption of storage space | High    | Low              | Normal
Maintenance cost             | High    | Low              | Normal

Security requirement in OLAP Systems
Threat of Inferences
OLAP systems depend heavily on aggregates of data, so they are very vulnerable to indirect inferences of protected data. The threat is illustrated through four examples:
1. One-dimensional inference (1-d inference).
2. Multi-dimensional inference (m-d inference) with SUM only.
3. M-d inference with MAX only.
4. M-d inference with SUM, MAX and MIN.

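One-Dimensional Inference (1-d Inference)
Suppose that the adversary:
- cannot access the cuboid [...] but is allowed to access [...];
- knows the values of the empty cells through outbound (out-of-band) channels.
Then he can infer that [...] has exactly the same value as [...].
[Figure: example data cube with an Organization axis. The cuboid and cell names marked [...] were shown as images on the slide and were lost in extraction.]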

Multi-Dimensional Inference (m-d Inference) with SUM
Suppose that the adversary can:
- access only [...] and [...];
- learn the values of the empty cells through outbound (out-of-band) channels.
An m-d inference is then possible as follows: he first sums the cells [...] and then subtracts the cells [...] and [...]. The final result yields a sensitive cell: [...] = [...].
[Figure: example data cube with Time and Organization axes. The cuboid and cell names marked [...] were shown as images on the slide and were lost in extraction.]
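Because the slide's own figures were lost, here is a small made-up instance of the same add-then-subtract trick. The names (Alice, Bob, Q1, Q2) and every number are hypothetical. The adversary never reads a core cell directly; he only sees SUM aggregates along two dimensions and knows that one cell is empty.

```python
# Hypothetical core cuboid, hidden from the adversary. None marks an empty cell.
core = {
    ("Alice", "Q1"): 4000, ("Alice", "Q2"): None,
    ("Bob", "Q1"): 3000,   ("Bob", "Q2"): 2000,
}

def sum_query(cells):
    """Answer a SUM aggregate; empty cells contribute nothing."""
    return sum(core[c] or 0 for c in cells)

# Aggregates the adversary is allowed to see.
q1_total = sum_query([("Alice", "Q1"), ("Bob", "Q1")])   # SUM over quarter Q1
q2_total = sum_query([("Alice", "Q2"), ("Bob", "Q2")])   # SUM over quarter Q2
bob_total = sum_query([("Bob", "Q1"), ("Bob", "Q2")])    # SUM over employee Bob

# Alice/Q2 is known to be empty, so q2_total is exactly Bob/Q2.
bob_q1 = bob_total - q2_total
alice_q1 = q1_total - bob_q1
print(alice_q1)
```

The sensitive cell for Alice in Q1 is recovered even though no single answered query exposes it, which is why the inference is called multi-dimensional.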

Multi-Dimensional Inference (m-d Inference) with MAX
Now the adversary does not know the values of the empty cells (the core cuboid is full of unknown values). The cube is then free of inferences from SUM aggregations. However, he can still make an m-d inference with MAX aggregations as follows:
- He obtains the MAX values of cells [...] and [...] (that is, 6000 and 5000).
- From these he can infer that one of three cells, [...], [...] or [...], is 6000.
- Neither [...] nor [...] can be 6000, so [...] = 6000.

Multi-Dimensional Inference (m-d Inference) with SUM, MAX and MIN

Multi-Dimensional Inference (m-d Inference) with SUM, MAX and MIN
Now suppose that the adversary can ask queries using SUM, MAX and MIN on the data cube.
- Following the last example, he can infer [...] = [...].
- The SUM, MAX and MIN of [...] are 11000, 6000 and [...].
- From these he can infer that [...], [...] and [...] must be 5000 and two zeros (but he does not know exactly which is which).
- With the SUM, MAX and MIN of [...], [...] and [...], he can conclude that [...] must be 5000 and the others are zeros.
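The reasoning above can be mimicked by brute force: list every assignment of values consistent with the published aggregates and see which cells come out the same in all of them. The sketch below is hypothetical throughout. The cell names a to f, the assumption that values are multiples of 1000, the MIN of 0 and the two crossing MAX aggregates are ours; only the figures 11000, 6000 and 5000 echo the slide.

```python
from itertools import product

# Hypothetical setup: six hidden cells, each assumed to be a multiple of 1000.
CELLS = ["a", "b", "c", "d", "e", "f"]
DOMAIN = range(0, 7000, 1000)
ROW = ["a", "b", "c", "d"]

def aggregate(names, fn, expected):
    """Build a check that fn over the named cells equals the published answer."""
    return lambda cube: fn(cube[n] for n in names) == expected

CONSTRAINTS = [
    aggregate(ROW, sum, 11000),
    aggregate(ROW, max, 6000),
    aggregate(ROW, min, 0),          # assumed MIN value
    aggregate(["a"], max, 6000),     # a = 6000, known from an earlier MAX inference
    aggregate(["c", "e"], max, 0),   # assumed crossing aggregates
    aggregate(["d", "f"], max, 0),
]

def determined_cells(cells, domain, constraints):
    """Return the cells that take the same value in every consistent assignment."""
    fixed = None
    for values in product(domain, repeat=len(cells)):
        cube = dict(zip(cells, values))
        if not all(check(cube) for check in constraints):
            continue
        if fixed is None:
            fixed = dict(cube)
        else:
            fixed = {n: v for n, v in fixed.items() if cube[n] == v}
    return fixed or {}

row_only = determined_cells(ROW, DOMAIN, CONSTRAINTS[:4])
everything = determined_cells(CELLS, DOMAIN, CONSTRAINTS)
print(row_only, everything)
```

With the row aggregates alone only cell a is pinned down; adding the crossing aggregates fixes every cell, which is the kind of escalation the slide describes.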

Requirement
A security solution for OLAP systems must combine access control and inference control to remove security threats. A practical solution must strike a balance among the following objectives:
- Security: sensitive data should be guarded against both unauthorized access and malicious inference.
- Applicability: the solution should not rely on unrealistic assumptions and should cover a wide range of scenarios without the need for significant modifications.

- Efficiency: queries should be answered in a matter of seconds or minutes, so a security solution must be computationally efficient, especially with respect to on-line overhead.
- Availability: data should be available to legitimate users who have sufficient privileges.
- Practicality: the solution should not demand significant modifications to the existing infrastructure of an OLAP system.
The main challenge is the inherent tradeoff among the above objectives.

Some Security Issues
- Three-Tier Security Architecture
- Securing Data Cubes in OLAP Systems
  - Sum-Only Data Cubes
  - Generic Data Cubes

Three-Tier Security Architecture
Security in statistical databases usually has two tiers, sensitive data and aggregation queries, with inference control between them. Inference control mechanisms check each aggregation query to decide whether answering it, together with the previously answered queries, may disclose protected data.

Three-Tier Security Architecture
Applying the two-tier architecture to OLAP has some inherent drawbacks:
- Checking queries for inferences at run time may cause unacceptable delays in query processing.
- The complexity of this checking is usually high.
- Inference control methods cannot take advantage of the special characteristics of OLAP systems.

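Three-Tier Security Architecture
This architecture has:
- 3 tiers.
- 3 relations.
- 3 properties satisfied by the aggregation tier.
[Figure: User Queries, Pre-defined Aggregations and Data Set, with Access Control between user queries and pre-defined aggregations and Inference Control between pre-defined aggregations and the data set.]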

Securing Data Cubes in OLAP Systems
SUM-Only Data Cubes:
- As an inherited limitation of statistical databases, only SUMs are considered.
- Only the core cuboid is considered sensitive.
- Two methods: the cardinality-based method and the parity-based method.

Cardinality-Based Method
The method is based on the number of empty cells. For 1-d inferences, the existence of inferences can be determined only in two cases:
- The core cuboid has no empty cell.
- The core cuboid has fewer non-empty cells than the given bound $2^{k-1} \cdot d_{\max}$.

Cardinality-Based Method
M-d inferences:
- The core cuboid has no empty cell.
- A data cube is free of inferences if it has fewer empty cells than the given upper bound.
- A data cube having more empty cells than the given bound always has inferences.
- Upper bound: $2(d_u - 4) + 2(d_v - 4) - 1$, where $d_u$ and $d_v$ are the two smallest of the $d_i$, and $d_i$ is the number of values of the $i$-th attribute in the core cuboid.
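The two bounds quoted for the cardinality-based method are simple to evaluate. The sketch below assumes that k is the number of dimensions and that d_max is the largest number of attribute values; the slides do not define either symbol, so treat both as our reading. The function names are also ours.

```python
def one_d_bound(domain_sizes):
    """Bound 2^(k-1) * d_max quoted for 1-d inferences.

    Assumes k = number of dimensions and d_max = largest domain size.
    """
    k = len(domain_sizes)
    return 2 ** (k - 1) * max(domain_sizes)

def m_d_bound(domain_sizes):
    """Bound 2(d_u - 4) + 2(d_v - 4) - 1 on empty cells for m-d inferences,
    where d_u and d_v are the two smallest domain sizes."""
    d_u, d_v = sorted(domain_sizes)[:2]
    return 2 * (d_u - 4) + 2 * (d_v - 4) - 1

sizes = [7, 5, 6]  # hypothetical numbers of values per dimension
print(one_d_bound(sizes), m_d_bound(sizes))
```

For these made-up sizes the 1-d bound is 4 * 7 = 28 non-empty cells and the m-d bound is 2 * 1 + 2 * 2 - 1 = 5 empty cells.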

Cardinality-Based Method
The above results can be used to compute inference-free aggregations based on the three-tier architecture:
- The data tier corresponds to the core cuboid.
- The aggregation tier corresponds to a collection of cells in aggregation cuboids that are free of inferences.
- The query tier includes any query that can be rewritten using the cells in the aggregation tier.
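A minimal sketch of how the three tiers might fit together is given below. It assumes that some inference-control method (not shown) has already chosen which groups of core cells are safe to aggregate. The class name, cell names and values are hypothetical, and "rewritable" is simplified to "a list of aggregation-tier names".

```python
class ThreeTierCube:
    """Toy three-tier model: data tier -> aggregation tier -> query tier."""

    def __init__(self, core, safe_groups):
        # Aggregation tier: SUMs precomputed off-line over groups of core cells
        # that an inference-control step (not shown) has declared safe.
        self._aggregation_tier = {
            name: sum(core[cell] for cell in cells)
            for name, cells in safe_groups.items()
        }

    def query(self, names):
        """Answer a SUM query only if it is built from aggregation-tier cells."""
        unknown = [n for n in names if n not in self._aggregation_tier]
        if unknown:
            raise PermissionError(f"not derivable from the aggregation tier: {unknown}")
        return sum(self._aggregation_tier[n] for n in names)

# Hypothetical data tier (core cuboid) and safe groups.
core = {"x1": 100, "x2": 200, "x3": 300, "x4": 400}
cube = ThreeTierCube(core, {"g12": ["x1", "x2"], "g34": ["x3", "x4"]})

total = cube.query(["g12", "g34"])
try:
    cube.query(["x1"])          # a raw core cell is not in the aggregation tier
    refused = False
except PermissionError:
    refused = True
print(total, refused)
```

The point of the design is that the expensive inference check happens once, off-line, between the data and aggregation tiers, while the run-time check on user queries is a cheap lookup.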

Parity-Based Method
The method is based on the simple fact that the even numbers are closed under addition and subtraction. Suppose that every query set includes an even number of cells. Adding and subtracting such sets to isolate a single cell then becomes more difficult.

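Parity-Based Method
[Figure: the query sets X1+X2+X3+X4+X5+X6, X1+X2, X4+X5, X5+X6 and X3+X5 are combined to derive X5 = [...] = 2500. The intermediate expression was lost in extraction.]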
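One way to combine the five query sets listed on the slide is worked out below. Every set has an even number of cells, yet adding three of them, subtracting the total and adding back X1+X2 leaves exactly 2 * X5. The hidden values are hypothetical, apart from X5 = 2500, which echoes the slide; the slide's own derivation may have differed.

```python
# Hypothetical hidden cell values; the adversary only sees answers from ask().
hidden = {"X1": 1000, "X2": 1500, "X3": 2000, "X4": 500, "X5": 2500, "X6": 3000}

def ask(*cells):
    """Answer a SUM query over the named cells."""
    return sum(hidden[c] for c in cells)

total = ask("X1", "X2", "X3", "X4", "X5", "X6")
q12 = ask("X1", "X2")
q45 = ask("X4", "X5")
q56 = ask("X5", "X6")
q35 = ask("X3", "X5")

# (X4+X5) + (X5+X6) + (X3+X5) = X3 + X4 + X6 + 3*X5
# total - (X1+X2)            = X3 + X4 + X5 + X6
# The difference of the two lines is 2*X5.
x5 = (q45 + q56 + q35 - (total - q12)) // 2
print(x5)
```

So even-sized query sets alone do not rule out inference; the parity-based method needs the further conditions that the following slides discuss.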

Parity-Based Method
- If a set of queries (set 2) is derivable from another set (set 1), then the answers to set 2 can be computed from the answers to set 1.
- If set 1 is free of inferences, then so is set 2.
- To detect inferences caused by sets of MDR (multi-dimensional range) queries (Q*), we find another collection of queries that is equivalent to Q* and whose inferences are easier to detect.

Securing Data Cubes in OLAP Systems Parity-Based Method

Parity-Based Method
This method can be enforced based on the three-tier inference control architecture described earlier:
- A partition of the core cuboid based on dimension hierarchies composes the data tier.
- The parity-based method is applied to each block in the partition to compute the aggregation tier.
- The query tier includes any query that is derivable from the aggregation tier.

Generic Data Cubes
A method that does not directly detect inferences, but first prevents m-d inferences and then removes 1-d inferences. It is able to deal with data cubes that use generic aggregation types. It has two parts:
- Access control.
- Lattice-based inference control.

Access Control
Limiting access control to the core cuboid is not always appropriate. Values in aggregation cuboids may also carry sensitive information.

Access Control
Describing an object:
- The function Below() partitions the data cube along the dependency lattice.
- The function Slice() partitions the data cube along the dimensions.
- An object is simply the intersection of the two.
Example: Object(L, S), where L = [...] and S includes all the cells in the first four quarters of the core cuboid ([...]).

Lattice-Based Inference Control
Given two sets of cells in a data cube (S and T):
- A cell c is redundant to T if S includes c and its ancestors in any single cuboid.
- A cell c is non-comparable to T if, for every c' ∈ T, c is neither an ancestor nor a descendant of c'.

Lattice-Based Inference Control
Consider an Object(L, S):
- This object is the union of the cuboids in Below(L).
- Let T be the object and S be its complement in the data cube.
- To remove inferences from S to T, we find a subset of S that is free of m-d inferences to T.

Lattice-Based Inference Control Securing Data Cubes in OLAP Systems

Lattice-Based Inference Control
After m-d inferences are prevented, 1-d inferences still need to be removed. The procedure is:
- Check each cell and add those that cause 1-d inferences to the object, so that they will be prohibited by access control.
- Control m-d inferences to this new object by applying the previous results.
- Repeat these steps until all 1-d inferences are removed.
The final set of cells is free of inferences to the object.

Lattice-Based Inference Control
This method can be implemented based on the three-tier security model:
- The authorization object computed through the above process comprises the data tier.
- The complement of the object is the aggregation tier, because it does not cause any inferences to the data tier.
- Users are free to submit queries to the query tier.

THANK YOU !