LAB EXERCISE #1 Oracle Data Warehousing Goal: Develop an application to implement defining subject area, design of fact dimension table, data mart. Requirements: Oracle 11g Release 2, SQL Plus.
A data mart: is a subset of an organizational data store, usually oriented to a specific purpose or major data subject, that may be distributed to support business needs. Data marts are analytical data stores designed to focus on specific business functions for a specific community within an organization. Data marts are often derived from subsets of data in a data warehouse, though in the bottomup data warehouse design methodology the data warehouse is created from the union of organizational data marts. LAB EXERCISE #1 Oracle Data Warehousing
Reasons for creating a data mart Easy access to frequently needed data Creates collective view by a group of users Improves end user response time Ease of creation Lower cost than implementing a full Data warehouse Potential users are more clearly defined than in a full Data warehouse LAB EXERCISE #1 Oracle Data Warehousing
Design schemas star schema snowflake schema Star schema architecture Star schema architecture is the simplest data warehouse design. The main feature of a star schema is a table at the center, called the fact table and the dimension tables which allow browsing of specific categories, summarizing, drilldowns and specifying criteria. Typically, most of the fact tables in a star schema are in database third normal form, while dimensional tables are denormalized (second normal form). Despite the fact that the star schema is the simplest data warehouse architecture, it is most commonly used in the data warehouse implementations across the world today (about 9095% cases). LAB EXERCISE #1 Oracle Data Warehousing
Fact table The fact table is not a typical relational database table as it is denormalized on purpose to enhance query response times. The fact table typically contains records that are ready to explore, usually with ad hoc queries. Records in the fact table are often referred to as events, due to the timevariant nature of a data warehouse environment. The primary key for the fact table is a composite of all the columns except numeric values / scores (like QUANTITY, TURNOVER, exact invoice date and time). Typical fact tables in a global enterprise data warehouse are (usually there may be additional company or business specific fact tables): sales fact table contains all details regarding sales orders fact table in some cases the table can be split into open orders and historical orders. Sometimes the values for historical orders are stored in a sales fact table. budget fact table usually grouped by month and loaded once at the end of a year. forecast fact table usually grouped by month and loaded daily, weekly or monthly. inventory fact table report stocks, usually refreshed daily LAB EXERCISE #1 Oracle Data Warehousing
Dimension table Nearly all of the information in a typical fact table is also present in one or more dimension tables. The main purpose of maintaining Dimension Tables is to allow browsing the categories quickly and easily. The primary keys of each of the dimension tables are linked together to form the composite primary key of the fact table. In a star schema design, there is only one denormalized table for a given dimension. Typical dimension tables in a data warehouse are: time dimension table customers dimension table products dimension table key account managers (KAM) dimension table sales office dimension table LAB EXERCISE #1 Oracle Data Warehousing
Snowflake schema architecture: Snowflake schema architecture is a more complex variation of a star schema design. The main difference is that dimensional tables in a snowflake schema are normalized, so they have a typical relational database design. Snowflake schemas are generally used when a dimensional table becomes very big and when a star schema can’t represent the complexity of a data structure. For example if a PRODUCT dimension table contains millions of rows, the use of snowflake schemas should significantly improve performance by moving out some data to other table (with BRANDS for instance). The problem is that the more normalized the dimension table is, the more complicated SQL joins must be issued to query them. This is because in order for a query to be answered, many tables need to be joined and aggregates generated. LAB EXERCISE #1 Oracle Data Warehousing
Oracle SH Schema LAB EXERCISE #1 Oracle Data Warehousing
Star Query An example of an end user query is: "What were the sales and profits for the internet and catalog departments of the stores in the CA state districts over the first quarters of the year 1999?" Consider the following star query: SELECT ch.channel_class, c.cust_city, t.calendar_quarter_desc, SUM(s.amount_sold) sales_amount FROM sales s, times t, customers c, channels ch WHERE s.time_id = t.time_id AND s.cust_id = c.cust_id AND s.channel_id = ch.channel_id AND c.cust_state_province = 'CA' AND ch.channel_desc in ('Internet','Catalog') AND t.calendar_quarter_desc IN ('1999-Q1','1999-Q2') GROUP BY ch.channel_class, c.cust_city, t.calendar_quarter_desc; LAB EXERCISE #1 Oracle Data Warehousing
Query Result in SQL PLUS: LAB EXERCISE #1 Oracle Data Warehousing
Example Using Aggregates: Another example of an end user query is: What were the weekly amount sold in the stores for each customers ? SELECT p.prod_id, t.week_ending_day, s.cust_id, SUM(s.amount_sold) AS sum_amount_sold FROM sales s, products p, times t WHERE s.time_id=t.time_id AND s.prod_id=p.prod_id GROUP BY p.prod_id, t.week_ending_day, s.cust_id; LAB EXERCISE #1 Oracle Data Warehousing
Joins only: SELECT p.prod_id, p.prod_name, t.time_id, t.week_ending_day, s.channel_id, s.promo_id, s.cust_id, s.amount_sold FROM sales s, products p, times t WHERE s.time_id=t.time_id AND s.prod_id = p.prod_id; LAB EXERCISE #1 Oracle Data Warehousing
Your consent to our cookies if you continue to use this website.