Presentation is loading. Please wait.

Presentation is loading. Please wait.

Data Warehousing Alex Ostrovsky CS157B Spring 2007.

Similar presentations


Presentation on theme: "Data Warehousing Alex Ostrovsky CS157B Spring 2007."— Presentation transcript:

1 Data Warehousing Alex Ostrovsky CS157B Spring 2007

2 Introduction ► Data warehouse is a main repository of corporate data ► Multiple databases are employed per specific purpose ► Contains raw events and unprocessed data, although separate tables might exist for processed information displaying meaningful data

3 What is it used for? ► Data analysis ► Data mining ► Complex queries with multiple table join ► Forecasting ► Historical reporting ► OLAP (Online Analytical Processing)

4 High level view

5 Key Concepts and Features ► Data is not required to be heavily normalized ► Transaction Processing is done mostly offline, thus processing time is not very critical. Although, this might depend on amount of data, normalization, query complexity, and application specifications.

6 Key Concepts and Features (cont.) ► Unlike regular OLTP real-time databases data is subject-oriented ► Non-volatile, i.e. data is essentially stored forever without being pruned or deleted. ► Heavily integrated: contains data from majority of organization’s applications ► Time-variant: most of the data has some time reference for the purpose of producing the reports

7 Types of data warehousing DBs ► Offline operational database: similar to regular data replication. Used to minimize the impact of queries on a running primary operational system ► Offline data warehouse: heavily integrated, reporting-oriented warehouse databases which are updated with data from operational databases on regular time intervals

8 Types of data warehousing DBs (cont) ► Real-time data warehouse: database data is updated instantaneously as soon as transaction happens ► Integrated data warehouse: database is integrated with primary operational system for immediate decision making and reporting.

9 Benefits of Data Warehousing ► No need to stress operational database with complex queries ► Separation of processing and business logic ► Very flexible, multiple distinct relations can be defined from a set of data ► Can be customer or object specific ► Persistent – once result is computed from the raw events, it doesn’t need to be recomputed again, giving faster response time on subsequent queries.

10 Dangers of Data Warehousing ► Heavy processing requires physically separate database machines for warehousing and OLTP ► Must be optimized for novice users, complex queries might take a very long time ► Much more complex multidimensional design compared to regular relational databases ► Errors in computational logic can cause serious financial losses and computational recalculations. ► Data representation ► Relatively difficult to perform data migration

11 Database Design ► Data warehousing databases mostly utilize complex multidimensional design ► Relationships must be meaningful and represent clear patterns and trends of unprocessed data. More data and relationships you have more dimensions database will have. ► Information is viewed along one common dimensional position. Can be thought of as intersection of a few planes.

12 OLAP Market

13 References ► http://en.wikipedia.org/wiki/Data_warehouse http://en.wikipedia.org/wiki/Data_warehouse ► http://en.wikipedia.org/wiki/OLAP http://en.wikipedia.org/wiki/OLAP ► http://dmoz.org/Computers/Software/Databases/D ata_Warehousing/ http://dmoz.org/Computers/Software/Databases/D ata_Warehousing/ http://dmoz.org/Computers/Software/Databases/D ata_Warehousing/ ► http://dmoz.org/Computers/Software/Databases/D ata_Warehousing/Articles/ http://dmoz.org/Computers/Software/Databases/D ata_Warehousing/Articles/ http://dmoz.org/Computers/Software/Databases/D ata_Warehousing/Articles/ ► http://en.wikipedia.org/wiki/Multidimensional_data base http://en.wikipedia.org/wiki/Multidimensional_data base http://en.wikipedia.org/wiki/Multidimensional_data base ► http://www.olapreport.com/market.htm http://www.olapreport.com/market.htm


Download ppt "Data Warehousing Alex Ostrovsky CS157B Spring 2007."

Similar presentations


Ads by Google