Nguyễn Phạm Luân Tiến 50702449, Trần Đình Hương Trà 50702573, Dương Bách Tùng 50702839.


1 Nguyễn Phạm Luân Tiến 50702449, Trần Đình Hương Trà 50702573, Dương Bách Tùng 50702839

2 Content 1. Introduction to OLAP Systems. 2. Security requirement in OLAP Systems. 3. Some Security Issues.

3 Introduction of OLAP Systems Nowadays, databases are used in two main contexts: 1. OLTP: On-Line Transaction Processing 2. OLAP / DS: On-Line Analytical Processing / Decision Support

4 OLTP vs OLAP
Feature | OLTP | OLAP
Function | Constant handling | Decision support
Database design | Application-oriented | Subject-oriented
Data | Current, up to date, detailed, … | Historical, aggregated over multiple dimensions
Access | Read / Write / Index | Review many times
Unit of work | Short single transactions | Complex queries
# Records accessed | k × 10 | k × 10^6
# Users | k × 10^3 | k × 10^2
Database size | 100 MB – GB | 100 GB – TB

5 Data Warehouse A data warehouse (DW) is a database used for reporting. The data is offloaded from the operational systems for reporting. A DW collects data in support of managers' decision-making process. Subject-oriented. Integrated. Time-variant. Non-volatile.

6 Subject Oriented Data is categorized and stored by business subject rather than by application. [Figure: operational systems (Savings, Shares, Loans, Insurance, Equity Plans) on one side; data warehouse subject areas (Customer, Product, Sales Information) on the other.]

7 Integrated [Figure: in the operational environment, the Savings, Current Accounts and Loans applications are separate; in the data warehouse they are integrated under Subject = Customer, with no application flavor.]

8 Time Variant Data is stored as a series of snapshots, each representing a period of time. [Figure: data warehouse snapshots by time: 01/97 = data for January, 02/97 = data for February, 03/97 = data for March.]

9 Non Volatile Typically data in the data warehouse is not updated or deleted. [Figure: operational databases support INSERT, UPDATE, DELETE and Read; the warehouse database supports only Load and Read.]

10 OLAP In computing, online analytical processing, or OLAP, is an approach to swiftly answering multi-dimensional analytical queries. The OLAP database is usually updated in batch, often from multiple sources. What most people want from their applications is consistently fast response time. OLAP is a technology for processing business data. OLAP performs multidimensional analysis of business data and provides the capability for complex calculations, trend analysis, and sophisticated data modeling.

11 OLAP SERVICES

12 OLAP ARCHITECTURES Popular architectures of OLAP systems include ROLAP (relational OLAP) and MOLAP (multidimensional OLAP). 1) ROLAP provides a front-end tool that translates multidimensional queries into corresponding SQL queries to be processed by the relational backend. 2) MOLAP does not rely on the relational model but instead materializes the multidimensional views. 3) Using MOLAP for dense parts of the data and ROLAP for the others leads to a hybrid architecture, namely, the HOLAP or hybrid OLAP.
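
The ROLAP idea of translating a multidimensional query into SQL can be sketched roughly as follows. This is only an illustration under stated assumptions: the `sales` table, its columns, the sample rows, and the helper `to_sql` are all hypothetical and do not come from the slides, and a real ROLAP front end would validate column names rather than interpolate them.

```python
import sqlite3

def to_sql(measure, group_by, filters):
    """Translate a tiny multidimensional query into a SQL string.

    measure  -- column aggregated with SUM
    group_by -- list of dimension columns to keep
    filters  -- dict of dimension -> required value (a slice)
    Column names are trusted here; this is a sketch, not a safe query builder.
    """
    select_cols = ", ".join(group_by + [f"SUM({measure}) AS total"])
    sql = f"SELECT {select_cols} FROM sales"
    if filters:
        sql += " WHERE " + " AND ".join(f"{dim} = ?" for dim in filters)
    if group_by:
        cols = ", ".join(group_by)
        sql += f" GROUP BY {cols} ORDER BY {cols}"
    return sql, list(filters.values())

# Hypothetical relational backend holding the detailed data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (quarter TEXT, item TEXT, city TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [("Q1", "tv", "Hanoi", 100), ("Q1", "pc", "Hanoi", 200),
     ("Q2", "tv", "Hanoi", 150), ("Q1", "tv", "Hue", 50)],
)

# "Total amount per quarter, sliced on city = Hanoi".
sql, params = to_sql("amount", ["quarter"], {"city": "Hanoi"})
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

A MOLAP engine would instead look the same totals up in a materialized multidimensional array, which is why the slides treat the two as different storage choices rather than different query models.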

13 ROLAP

14 [Figure: a relational table made of columns and rows, with key values used to join.]

15 KEY IN ROLAP [Figure: the Time, Product and Store dimensions each have a single-column key (time key, product key, store key); together they form the composite key.]

16 Star schema – 4 dimensions

17 Snowflake schema

18 MOLAP

19

20

21 Cube: Lattice of Cuboids
0-D (apex) cuboid: all
1-D cuboids: time; item; city; supplier
2-D cuboids: time,item; time,city; time,supplier; item,city; item,supplier; city,supplier
3-D cuboids: time,item,city; time,item,supplier; time,city,supplier; item,city,supplier
4-D (base) cuboid: time,item,city,supplier
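
A small sketch of how the lattice on this slide can be enumerated: every subset of the dimension set is one cuboid, grouped by how many dimensions it keeps. The function name `cuboid_lattice` is an assumption made for illustration, and the sketch ignores dimension hierarchies.

```python
from itertools import combinations

dimensions = ["time", "item", "city", "supplier"]

def cuboid_lattice(dims):
    """Return {level: [cuboids]}, where a cuboid is the tuple of dimensions it keeps."""
    return {k: list(combinations(dims, k)) for k in range(len(dims) + 1)}

lattice = cuboid_lattice(dimensions)
for level, cuboids in lattice.items():
    # The empty tuple is the apex cuboid, which aggregates over everything.
    names = [",".join(c) if c else "all" for c in cuboids]
    print(f"{level}-D: {names}")

# With n dimensions and no hierarchies there are 2**n cuboids in total.
total = sum(len(cuboids) for cuboids in lattice.values())
print(total)
```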

22 HIERARCHICAL

23 ROLLING UP
Product: Item → Type → Category → All
Geography: City → State → Country → All
Time: Day → Month → Quarter → Year → All (Week is an alternative level above Day)

24 DRILLING DOWN
Product: All → Category → Type → Item
Geography: All → Country → State → City
Time: All → Year → Quarter → Month → Day (Week is an alternative level above Day)
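
Rolling up and drilling down along the Geography hierarchy can be sketched as below. The city names, the sales figures, and the helpers `roll_up` and `drill_down` are hypothetical and only meant to illustrate the two operations, with SUM assumed as the aggregate.

```python
from collections import defaultdict

# Hypothetical child -> parent maps for the City -> State -> Country levels.
CITY_TO_STATE = {"San Jose": "California", "Los Angeles": "California", "Austin": "Texas"}
STATE_TO_COUNTRY = {"California": "USA", "Texas": "USA"}

# Hypothetical city-level measure (the most detailed level in this sketch).
city_sales = {"San Jose": 120, "Los Angeles": 300, "Austin": 80}

def roll_up(cells, parent_of):
    """Aggregate a {member: value} cuboid one level up the hierarchy using SUM."""
    result = defaultdict(int)
    for member, value in cells.items():
        result[parent_of[member]] += value
    return dict(result)

def drill_down(parent, cells, parent_of):
    """Return the more detailed cells that sit under one parent member."""
    return {m: v for m, v in cells.items() if parent_of[m] == parent}

state_sales = roll_up(city_sales, CITY_TO_STATE)        # City -> State
country_sales = roll_up(state_sales, STATE_TO_COUNTRY)  # State -> Country
california_detail = drill_down("California", city_sales, CITY_TO_STATE)

print(state_sales)
print(country_sales)
print(california_detail)
```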

25 SLICE / DICE

26 PIVOT (ROTATE)

27 ROLAP vs MOLAP
Feature | Relational | Multidimensional
Data representation | Two dimensions | Multiple dimensions
Data extraction | Specific rows | Specific dimensions
Computations | Functions | High-speed matrix
Results | Tool specific | Matrix

28 HOLAP

29 OLAP ARCHITECTURES
Feature | MOLAP | ROLAP | HOLAP
Underlying data storage | Cube | Relational table | Relational table
Aggregate data storage | Cube | Relational table | Cube
Query performance | Fastest | Slowest | Fast
Storage space consumption | High | Low | Normal
Maintenance cost | High | Low | Normal

30 Security requirement in OLAP Systems OLAP systems depend heavily on aggregates of data, so they are very vulnerable to indirect inferences of protected data. Threat of Inferences The threat is illustrated through 4 examples: 1. One-dimensional inference (1-d inference). 2. Multi-dimensional inference (m-d inference) with SUM only. 3. M-d inference with MAX only. 4. M-d inference with SUM, MAX and MIN.

31 Security requirement in OLAP Systems One-Dimensional Inference (1-d Inference): Suppose that the adversary: Cannot access the cuboid […] but is allowed to access […]. Knows the values of the empty cells through outbound channels. Then he can infer that […] has exactly the same value as […]. [Figure axis: Organization.]

32 Security requirement in OLAP Systems Multi-Dimensional Inferences (m-d Inference) with SUM Suppose that the adversary: Can only access […] and […]. Knows the values of the empty cells through outbound channels. An m-d inference is possible as follows: He first sums the cells […] and then subtracts the cells […] and […]. The final result yields a sensitive cell: […] = 1500. [Figure axes: Time, Organization.]
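
The cell names on this slide were lost with the figure, so the following is only a hypothetical reconstruction of the same kind of attack, not the slide's own data. It assumes a 4×4 core cuboid (quarter × employee) in which the adversary sees only per-quarter and per-employee SUMs and knows which cells are empty; all names and values are made up, chosen so that the inferred cell is 1500.

```python
# Hypothetical core cuboid; None marks a cell the adversary knows to be empty.
core = {
    "Q1": {"Alice": 1000, "Bob": 2000, "Carol": None, "Dave": None},
    "Q2": {"Alice": 3000, "Bob": 500,  "Carol": 1500, "Dave": None},
    "Q3": {"Alice": None, "Bob": None, "Carol": 700,  "Dave": 900},
    "Q4": {"Alice": None, "Bob": None, "Carol": 400,  "Dave": 600},
}
employees = ["Alice", "Bob", "Carol", "Dave"]

def row_sum(quarter):
    """SUM over all employees for one quarter (an aggregate the adversary may query)."""
    return sum(v for v in core[quarter].values() if v is not None)

def col_sum(employee):
    """SUM over all quarters for one employee (an aggregate the adversary may query)."""
    return sum(row[employee] for row in core.values() if row[employee] is not None)

def has_1d_inference():
    """True if some single aggregate covers exactly one non-empty cell."""
    row_counts = [sum(v is not None for v in row.values()) for row in core.values()]
    col_counts = [sum(row[e] is not None for row in core.values()) for e in employees]
    return 1 in row_counts or 1 in col_counts

# Q1 + Q2 totals cover Alice's and Bob's cells plus (Q2, Carol);
# subtracting Alice's and Bob's totals leaves exactly that one cell.
inferred = row_sum("Q1") + row_sum("Q2") - col_sum("Alice") - col_sum("Bob")

print(has_1d_inference())
print(inferred)
```

No single aggregate exposes a cell here, yet combining four of them does, which is the difference between a 1-d and an m-d inference.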

33 Security requirement in OLAP Systems Multi-Dimensional Inferences (m-d Inference) with MAX Now the adversary does not know the values of the empty cells (the core cuboid is full of unknown values). The cube will be free of inferences with SUM aggregations, but he can make an m-d inference with MAX aggregations as follows: - The MAX values in cells […] and […] are 6000 and 5000. - From here, he can infer that one of the 3 cells […], […] or […] is 6000. - Neither […] nor […] can be 6000. Therefore […] = 6000.

34 Security requirement in OLAP Systems Multi-Dimensional Inferences (m-d Inference) with SUM, MAX and MIN

35 Multi-Dimensional Inferences (m-d Inference) with SUM, MAX and MIN: Now suppose that the adversary can ask queries using SUM, MAX and MIN on the data cube. Following the last example, he can infer […] = 6000. The SUM, MAX and MIN of […] are 11000, 6000 and 5000. From here, he can infer that […] must be 5000 and two zeros (but he does not know exactly which is which). With the SUM, MAX and MIN of […], […] and […], he can conclude that […] must be 5000 and the others are zeros.

36 Security requirement in OLAP Systems Requirement A security solution for OLAP systems must combine access control and inference control to remove threats. A practical solution must achieve a balance among the following objectives: Security - Sensitive data should be guarded from both unauthorized accesses and malicious inferences. Applicability - The solution should not rely on any unrealistic assumptions and should cover a wide range of scenarios without the need for significant modifications.

37 Security requirement in OLAP Systems Efficiency - Queries should be answered in a matter of seconds or minutes. - A desired security solution must be computationally efficient, especially with respect to on-line overhead. Availability - Data should be available to legitimate users who have sufficient privileges. Practicality - The solution should not demand significant modifications to the existing infrastructure of an OLAP system. The main challenge is the inherent tradeoff between the above objectives.

38 Some Security Issues Three-Tier Security Architecture. Securing Data Cubes in OLAP Systems: SUM-Only Data Cubes; Generic Data Cubes.

39 Three-Tier Security Architecture Security in statistical databases usually has 2 tiers, sensitive data and aggregation queries, with inference control between them. Inference control mechanisms check each aggregation query to decide whether to answer it. Through previously answered queries, much protected data may be disclosed.

40 Three-Tier Security Architecture Applying the two-tier architecture to OLAP has some inherent drawbacks: Checking queries for inferences at run time may cause unacceptable delays in query processing. The complexity of this checking is usually high. Inference control methods cannot take advantage of the special characteristics of OLAP systems.

41 Three-Tier Security Architecture This architecture has: 3 tiers. 3 relations. 3 properties satisfied by the aggregation tier. [Figure: User Queries, then Access Control, then Pre-defined Aggregations, then Inference Control, then Data Set.]
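
The three tiers can be sketched as follows, assuming the aggregation tier is computed off-line and has already been vetted by some inference control method; the vetting itself is not shown. The data values, the choice of per-quarter SUMs as the pre-defined aggregations, and the function names are hypothetical.

```python
# Data tier: sensitive core cells (hypothetical values), never queried directly.
DATA = {
    ("Q1", "Alice"): 1000, ("Q1", "Bob"): 2000,
    ("Q2", "Alice"): 3000, ("Q2", "Bob"): 500,
}

def precompute_aggregations(data):
    """Aggregation tier: built off-line and assumed to be free of inferences."""
    agg = {}
    for (quarter, _employee), value in data.items():
        key = ("quarter", quarter)
        agg[key] = agg.get(key, 0) + value
    return agg

AGGREGATIONS = precompute_aggregations(DATA)

def answer(query_keys):
    """Query tier: answer only queries that can be rewritten as a sum of
    pre-defined aggregations; everything else is refused at run time."""
    if not all(key in AGGREGATIONS for key in query_keys):
        raise PermissionError("query cannot be rewritten over the aggregation tier")
    return sum(AGGREGATIONS[key] for key in query_keys)

print(answer([("quarter", "Q1")]))
print(answer([("quarter", "Q1"), ("quarter", "Q2")]))
```

The run-time check is a cheap membership test, which is the point of moving the expensive inference analysis to the off-line step.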

42 Securing Data Cubes in OLAP Systems SUM-Only Data Cubes: As a limitation inherited from statistical databases, only SUMs are considered. Only the core cuboid is considered sensitive. 2 methods: Cardinality-Based Method. Parity-Based Method.

43 Securing Data Cubes in OLAP Systems Cardinality-Based Method Number of Empty Cells. The existence of 1-d inferences can only be determined in 2 cases: The core cuboid has no empty cell. The core cuboid has fewer non-empty cells than the given upper bound $2^{k-1} \cdot d_{\max}$.

44 Cardinality-Based Method Number of Empty Cells. 1-d Inferences: The core cuboid has no empty cell. The core cuboid has fewer non-empty cells than the given upper bound $2^{k-1} \cdot d_{\max}$.

45 Securing Data Cubes in OLAP Systems Cardinality-Based Method M-d Inferences: The core cuboid has no empty cell. A data cube is free of inferences if it has fewer empty cells than the given upper bound. A data cube having more empty cells than the given bound always has inferences. Upper bound: $2(d_u - 4) + 2(d_v - 4) - 1$, where $d_u$ and $d_v$ are the 2 smallest among the $d_i$ values, and $d_i$ is the number of values of the $i$-th attribute in the core cuboid.
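
The two cardinality bounds quoted on these slides are simple arithmetic and can be sketched as below. The slides do not define k or d max, so this sketch assumes k is the number of dimensions and d max is the largest dimension size; the function names and the example sizes are hypothetical.

```python
def one_d_bound(dim_sizes):
    """Bound on non-empty cells for 1-d inferences: 2^(k-1) * d_max.

    Assumes k = number of dimensions and d_max = largest dimension size.
    """
    k = len(dim_sizes)
    return 2 ** (k - 1) * max(dim_sizes)

def m_d_bound(dim_sizes):
    """Bound on empty cells for m-d inferences: 2(d_u - 4) + 2(d_v - 4) - 1,
    where d_u and d_v are the two smallest dimension sizes."""
    d_u, d_v = sorted(dim_sizes)[:2]
    return 2 * (d_u - 4) + 2 * (d_v - 4) - 1

# Hypothetical core cuboid with three dimensions of 5, 6 and 10 values.
sizes = [5, 6, 10]
print(one_d_bound(sizes))  # 2^2 * 10
print(m_d_bound(sizes))    # 2*(5-4) + 2*(6-4) - 1
```

Both checks only need counts of empty or non-empty cells, which is what makes the method cheap to apply.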

46 Securing Data Cubes in OLAP Systems Cardinality-Based Method The above results can be used to compute inference-free aggregations based on the three-tier architecture. The data tier corresponds to the core cuboid. The aggregation tier corresponds to a collection of cells in aggregation cuboids that are free of inferences. The query tier includes any query that can be rewritten using the cells in the aggregation tier.

47 Securing Data Cubes in OLAP Systems Parity-Based Method Based on the simple fact that the even numbers are closed under addition and subtraction. Suppose now that all the sets of queries include an even number of cells. Adding and subtracting these sets to get one cell would be more difficult.

48 Parity-Based Method Queries: X1+X2+X3+X4+X5+X6; X1+X2; X4+X5; X5+X6; X3+X5. Result: X5 = … = 2500
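
The derivation on this slide is missing from the transcript, so the following is one plausible reading rather than the authors' own working. All five queries cover an even number of cells, yet adding the four pair sums and subtracting the six-cell sum leaves 2·X5, so X5 can still be recovered. The cell values below are hypothetical, chosen so that X5 = 2500.

```python
# Hypothetical cell values; the adversary sees only the query answers.
x = {"X1": 100, "X2": 200, "X3": 300, "X4": 400, "X5": 2500, "X6": 600}

queries = [
    ["X1", "X2", "X3", "X4", "X5", "X6"],
    ["X1", "X2"],
    ["X4", "X5"],
    ["X5", "X6"],
    ["X3", "X5"],
]

# Every query covers an even number of cells.
all_even = all(len(q) % 2 == 0 for q in queries)

answers = [sum(x[cell] for cell in q) for q in queries]
total, *pairs = answers

# (X1+X2) + (X4+X5) + (X5+X6) + (X3+X5) - (X1+...+X6) = 2 * X5
inferred_x5 = (sum(pairs) - total) // 2

print(all_even)
print(inferred_x5)
```

This suggests why restricting queries to even sizes makes single-cell inference harder without, on its own, ruling it out.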

49 Securing Data Cubes in OLAP Systems Parity-Based Method If a set of queries (set 2) is derivable from another set (set 1), then the answers to set 2 can be computed using the answers to set 1. If set 1 is free of inferences, then so is set 2. To detect inferences caused by sets of MDR queries (Q*), we find another collection of queries that is equivalent to Q* and whose inferences are easier to detect.

50 Securing Data Cubes in OLAP Systems Parity-Based Method

51 Securing Data Cubes in OLAP Systems Parity-Based Method This method can be enforced based on the three-tier inference control architecture described earlier: A partition of the core cuboid based on dimension hierarchies composes the data tier. The parity-based method is applied to each block in the partition to compute the aggregation tier. The query tier includes any query that is derivable from the aggregation tier.

52 Securing Data Cubes in OLAP Systems Generic Data Cubes A method that does not directly detect inferences, but prevents m-d inferences and then removes 1-d inferences. It is able to deal with data cubes with generic aggregation types. Access Control. Lattice-Based Inference Control.

53 Securing Data Cubes in OLAP Systems Access Control Limiting access control to the core cuboid is not always appropriate. Values in aggregation cuboids may also carry sensitive information.

54 Securing Data Cubes in OLAP Systems Access Control Describing an object: Function Below() partitions the data cube along the dependency lattice. Function Slice() partitions the data cube along dimensions. An object is simply the intersection of the two. Example: Object(L, S), where L = […] and S includes all the cells in the first four quarters of the core cuboid […].

55 Securing Data Cubes in OLAP Systems Lattice-Based Inference Control Given two sets of cells in a data cube (S and T): Cell c is redundant to T if S includes c and its ancestors in any single cuboid. Cell c is non-comparable to T if, for every c' ∈ T, c is neither an ancestor nor a descendant of c'.

56 Securing Data Cubes in OLAP Systems Lattice-Based Inference Control Consider an Object(L, S): This object is the union of the cuboids in Below(L). Let T be the object and S be its complement in the data cube. To remove inferences from S to T, we find a subset of S that is free of m-d inferences to T.

57 Securing Data Cubes in OLAP Systems Lattice-Based Inference Control

58 Securing Data Cubes in OLAP Systems Lattice-Based Inference Control After m-d inferences are prevented, we need to remove 1-d inferences. Procedure to remove 1-d inferences: Check each cell and add those that cause 1-d inferences to the object, so that they will be prohibited by access control. Control m-d inferences to this new object by applying the previous results. Repeat these steps until all 1-d inferences are removed. The final set of cells is free of inferences to the object.

59 Securing Data Cubes in OLAP Systems Lattice-Based Inference Control This method can be implemented based on the three-tier security model: The authorization object computed through the above process comprises the data tier. The complement of the object is the aggregation tier, because it does not cause any inferences to the data tier. Users are free to submit queries to the query tier.

60 THANK YOU !

