Security in Data Warehousing Janani. S 09MCS102.


1 Security in Data Warehousing Rajalakshmi @ Janani. S 09MCS102

2 CONTENTS
– Introduction
– Characteristics of DW Data
– ETL
– Ideas towards Warehouse Security
– Achieving proactive security requirements
– Security Areas
– Case Studies

3 Introduction
– Data warehouse: an integrated repository.
– Processes involved: read, clean, aggregate, and store.
– Varied number of applications.
– Tools to access the warehouse.

4 Characteristics of DW Data
Subject-oriented:
– Data is organized around subjects or business dimensions, such as sales, customers, orders, claims, accounts, employees, etc.
Integrated:
– Data is collected from several transactional databases and integrated to provide a unified picture of each subject over time.
– Data from different databases is transformed into a common schema, measurement, code, and data type.
Aggregated:
– Data stored is not transaction-level, but aggregated by products, regions, months/years, or some other business dimension.

5 Characteristics of DW Data
Historical:
– Data is updated at some time interval: weekly, monthly, etc.
– Data is stored by weeks, months, etc. for historical comparison and trend analysis.
Time variant:
– Data always includes a timestamp (e.g., sales by weeks, months, quarters, or years).
Non-volatile:
– Data is historical and does not change with time.
Denormalized:
– Denormalized data is used to improve query performance, though it also increases update time and introduces data integrity problems.
– This works because historical data in the data warehouse is rarely updated.

6 Overview of Data Warehouse
[Diagram. Labels: Archive, BI, Backup, Purge, Recovery, Repository, ETL, Collaborative Review Model]

7 ETL
Data extraction:
– Process of copying relevant data from a variety of transactional databases for inclusion in a DW.
– May occur at regular intervals (e.g., weekly, monthly) to add new data.
– Data from incompatible databases, flat files, text documents, etc. must be filtered through appropriate APIs (application programming interfaces) as needed.
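The periodic extraction step above can be sketched as an incremental copy. This is only an illustration: the `created_on` field and the `extract_new_rows` helper are hypothetical names, not part of the presentation, and a real source would be a database or file rather than an in-memory list.

```python
from datetime import date

def extract_new_rows(source_rows, last_extracted_on):
    """Copy only the rows added since the previous extraction run."""
    return [dict(row) for row in source_rows
            if row["created_on"] > last_extracted_on]

# Hypothetical rows in a transactional source system.
source_rows = [
    {"order_id": 1, "created_on": date(2009, 1, 10)},
    {"order_id": 2, "created_on": date(2009, 2, 2)},
    {"order_id": 3, "created_on": date(2009, 2, 15)},
]

# Monthly run: the previous extraction happened on 31 January 2009.
new_rows = extract_new_rows(source_rows, date(2009, 1, 31))
```

Copying each row with `dict(row)` keeps the extracted data independent of later changes in the source.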

8 Data transformation/cleaning:
– Data extracted from transactional databases must be cleaned (“scrubbed”) and transformed before loading into a DW.
– Format differences across different tables/databases must be reconciled.
– Missing or misspelled data values must be resolved.
– Erroneous data are identified using application programs, and scrutinized/corrected by DW analysts using system-generated exception reports.
– Transaction-level data is aggregated by business dimensions.
– Key step in DW construction, since a DW is very sensitive to data errors.
Figure labels (one record per source database):
– Life Insurance Database: PK: SS# (123-45-6789), Name (Robert G. Smith)
– Auto Insurance Database: PK: DL# (FL-B12345678), Name (Bob Smith)
– Home Insurance Database: PK: Acc# (12345678905), Name (R. G. Smith)
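A minimal sketch of two of the steps above, reconciling a format difference and aggregating transaction-level data by a business dimension. The row layout, the product/month dimension, and the function names are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction-level rows extracted from a source system.
transactions = [
    {"product": "Widget", "sold_on": date(2009, 1, 5), "amount": 10.0},
    {"product": "widget ", "sold_on": date(2009, 1, 20), "amount": 5.5},
    {"product": "Gadget", "sold_on": date(2009, 2, 3), "amount": 7.0},
]

def transform(row):
    """Reconcile format differences: trim and lower-case the product code."""
    return {**row, "product": row["product"].strip().lower()}

def aggregate_by_product_month(rows):
    """Roll transaction amounts up to (product, year, month) totals."""
    totals = defaultdict(float)
    for row in map(transform, rows):
        key = (row["product"], row["sold_on"].year, row["sold_on"].month)
        totals[key] += row["amount"]
    return dict(totals)

monthly_sales = aggregate_by_product_month(transactions)
```

Without the `transform` step, "Widget" and "widget " would be aggregated as two different products, which is the kind of data error the slide warns about.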

9 Data Cleaning Example
Good Reading Bookstores
– Questionable data: Is book quantity correct?
– Out-of-range data: A single book can’t cost $3,200.99.
– Referential integrity problem: Customer# 12738 does not exist in Customers table.
– Possible misspelling: Do rows 3 & 8 refer to the same person?
– Missing data: City is blank.
– Questionable data: State for rows 2 & 6 could be the same.
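Several of the checks listed above can be automated to produce an exception report for analysts. The sketch below is illustrative only: the sample rows, the field names, and both thresholds are assumptions, not data from the Good Reading Bookstores table.

```python
MAX_BOOK_PRICE = 500.00  # assumed upper bound for a single book
MAX_QUANTITY = 100       # assumed threshold for a questionable quantity

def exception_report(orders, known_customers):
    """Return (row number, problem) pairs for rows that need analyst review."""
    report = []
    for row_number, order in enumerate(orders, start=1):
        if order["price"] > MAX_BOOK_PRICE:
            report.append((row_number, "out-of-range price"))
        if order["quantity"] > MAX_QUANTITY:
            report.append((row_number, "questionable quantity"))
        if order["customer_id"] not in known_customers:
            report.append((row_number, "unknown customer"))
        if not order["city"].strip():
            report.append((row_number, "missing city"))
    return report

# Hypothetical order rows and customer keys.
orders = [
    {"customer_id": 10001, "price": 24.99, "quantity": 1, "city": "Miami"},
    {"customer_id": 12738, "price": 3200.99, "quantity": 2, "city": ""},
    {"customer_id": 10002, "price": 15.00, "quantity": 700, "city": "Tampa"},
]
known_customers = {10001, 10002}

report = exception_report(orders, known_customers)
```

Possible misspellings (such as two rows that may refer to the same person) usually need fuzzy matching or manual review and are left out of this sketch.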

10 Ideas towards Warehouse Security
1. Replication Control:
– An old copy can be considered a replica of the current copy of the data.
2. Aggregation and Generalization:
– Generalization can be used to give users high-level information at first.
– Lower-level details can be given after the security constraints are satisfied.

11 Ideas..
3. Exaggeration and Misleading:
– The quality of views may depend on the user involved, and a user can be given an exaggerated view of the data.
– Misleading information: information that may be partially incorrect, or whose correctness is difficult to verify.

12 Ideas..
4. Anonymity:
– Encryption is to be used to secure the connection between the users and the warehouse, so that no outside user (a user who has not registered with the warehouse) can access the warehouse.
5. User Profile Based Security:
– The user profile must describe what is to be represented, and how, according to the user's information and security-level authorization needs.
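Two of the ideas above, generalization (idea 2) and user-profile-based security (idea 5), can be combined in a small sketch. Everything here is hypothetical: the `UserProfile` fields, the `warehouse_view` function, and the sample rows are illustrative names, not a mechanism described in the presentation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class UserProfile:
    name: str
    cleared_for_detail: bool    # assumed flag: security constraints satisfied
    visible_columns: frozenset  # columns this profile is allowed to see

def warehouse_view(rows, profile):
    """Give a generalized view first; give detail only to cleared profiles."""
    if not profile.cleared_for_detail:
        totals = {}
        for row in rows:
            totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
        return [{"region": r, "total": t} for r, t in totals.items()]
    return [{k: v for k, v in row.items() if k in profile.visible_columns}
            for row in rows]

# Hypothetical warehouse rows.
rows = [
    {"region": "North", "customer": "A", "amount": 100},
    {"region": "North", "customer": "B", "amount": 50},
    {"region": "South", "customer": "C", "amount": 70},
]

guest = UserProfile("guest", False, frozenset())
analyst = UserProfile("analyst", True, frozenset({"region", "amount"}))

summary = warehouse_view(rows, guest)    # regional totals only
detail = warehouse_view(rows, analyst)   # row-level, without customer names
```

The profile decides both the level of detail and which columns are shown, so the same query yields different views for different users.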

13 What is the organization doing to identify, classify, quantify, and protect its valuable information assets?

14 Achieving proactive security requirements
Phase One - Identifying the Data
– It entails taking a complete inventory of all the data that is available to the DW end-users.
Phase Two - Classifying the Data
– Data is generally classified on the basis of criticality or sensitivity to disclosure, modification, and destruction:
PUBLIC (Least Sensitive Data)
CONFIDENTIAL (Moderately Sensitive Data)
TOP SECRET (Most Sensitive Data)
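Phases One and Two can be sketched as a data inventory tagged with the three sensitivity levels. The inventory entries and the `readable_items` helper are hypothetical examples; only the level names come from the slide.

```python
from enum import IntEnum

class Sensitivity(IntEnum):
    PUBLIC = 1        # least sensitive data
    CONFIDENTIAL = 2  # moderately sensitive data
    TOP_SECRET = 3    # most sensitive data

# Phase One: inventory of the data available to DW end-users (hypothetical).
# Phase Two: each item is tagged with its classification.
inventory = {
    "course_catalog": Sensitivity.PUBLIC,
    "staff_salaries": Sensitivity.CONFIDENTIAL,
    "student_ssn": Sensitivity.TOP_SECRET,
}

def readable_items(inventory, clearance):
    """List inventory items at or below the given clearance level."""
    return sorted(name for name, level in inventory.items()
                  if level <= clearance)

confidential_reader = readable_items(inventory, Sensitivity.CONFIDENTIAL)
```

Using an ordered enumeration makes the "least to most sensitive" ranking explicit, so a clearance check is a simple comparison.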

15 Security Requirements
Phase Three - Quantifying the Value of Data
– Demands to see the Smoking Gun
– Assigning "street value" to data grouped under different sensitivity categories

16 Security Requirements
Phase Four - Identifying Data Vulnerabilities
– Identification and documentation of vulnerabilities associated with the DW environment:
In-built DBMS Security
DBMS Limitations
Dual Security Engines
Inference Attacks
Availability Factor
Human Factors
Insider Threats
Outsider Threats
Natural Factors
Utility Factors

17 Security Requirements
Phase Five - Identifying Protective Measures and Their Costs
– The Human Wall
– Access Users Classification
– Access Controls
– Integrity Controls
– Data Encryption
– Partitioning
– Development Controls

18 Security Requirements
Phase Six - Selecting Cost-Effective Security Measures
– Economy of mechanism
– Adequate data protection
Phase Seven - Evaluating the Effectiveness of Security Measures
– Encryption Requirements
– Encryption Constraints
– Data Warehouse Administration
– Control Reviews

19 Security Areas

20 Case study – Florida DoE
Since the mid-1980s, the Florida Department of Education (DoE) has collected student, staff, and workforce data and used it to guide program development and funding. As data volumes grew, so did demand for the data from teachers, administrators, legislators, and parents. But DoE had limited success with Oracle.
2004 – Public portal called Sunshine Connections, using Microsoft Office SharePoint® Portal Server 2003.

21 Florida DoE
2007 – Microsoft Services provided reporting, identity, and security infrastructure for Sunshine Connections using Microsoft Office PerformancePoint® Server 2007.
2008 – All users have access to data on Sunshine Connections.
– Edge Security Gateway.
– Sunshine Connections: Public Area, Restricted Area.
– Secure and rapid online access to terabytes of student-level data.
– Benefits that DoE enjoys.

