Chapter 4 Tutorial.

1 Chapter 4 Tutorial

2 Modeling Data Warehouse
A data warehouse is based on a multidimensional data model, which views data in the form of a data cube.
A data cube allows data to be modeled and viewed in multiple dimensions.
It is defined by dimension tables and a fact table; the fact table contains the measures and the keys to each of the related dimension tables.

3 Cont.
Star schema: a fact table in the middle connected to a set of dimension tables.
Snowflake schema: represents the dimensional hierarchy by normalizing the dimension tables; this saves storage but reduces the effectiveness of browsing.
Fact constellation: multiple fact tables share dimension tables.

4 OLAP Operations
Roll-up (drill-up): summarize data by climbing up a concept hierarchy or by dimension reduction.
Drill-down (roll-down): the reverse of roll-up; move from a higher-level summary to a lower-level summary or detailed data, or introduce new dimensions.
Slice and dice: project and select.
Pivot (rotate): reorient the cube for visualization, for example turning a 3-D cube into a series of 2-D planes.
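The four operations can be sketched on a tiny in-memory cube. This is only an illustration: the cube contents, the (day, city, item) dimensions, and the helper name `roll_up` are assumptions made for the example and do not come from the slides.

```python
from collections import defaultdict

# Hypothetical base cuboid: (day, city, item) -> units sold.
cube = {
    ("2010-01-05", "Vancouver", "phone"): 3,
    ("2010-01-05", "Toronto", "phone"): 2,
    ("2010-02-11", "Vancouver", "laptop"): 1,
    ("2011-03-02", "Vancouver", "phone"): 4,
}

def roll_up(cells, key_fn):
    """Aggregate cells after mapping each key to a coarser key."""
    out = defaultdict(int)
    for key, value in cells.items():
        out[key_fn(key)] += value
    return dict(out)

# Roll-up by climbing the time hierarchy: day -> year.
by_year = roll_up(cube, lambda k: (k[0][:4], k[1], k[2]))

# Roll-up by dimension reduction: drop the item dimension.
by_year_city = roll_up(by_year, lambda k: (k[0], k[1]))

# Slice: select a single value on one dimension (year = 2010).
slice_2010 = {k: v for k, v in by_year_city.items() if k[0] == "2010"}

# Dice: select on two or more dimensions at once.
dice = {k: v for k, v in by_year.items()
        if k[0] == "2010" and k[1] == "Vancouver"}

# Pivot: reorient the 2-D view from (year, city) to (city, year).
pivoted = {(city, year): v for (year, city), v in by_year_city.items()}
```

Drill-down is the reverse direction: going from `by_year` back to the finer-grained `cube`, which is only possible because the detailed cells are still stored.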

9 Q3
Suppose that a data warehouse consists of the three dimensions time, doctor, and patient, and the two measures count and charge, where charge is the fee that a doctor charges a patient for a visit.
(a) Enumerate three classes of schemas that are popularly used for modeling data warehouses.
Star schema, snowflake schema, and fact constellation schema.

10 Q3 cont. (b) Draw a schema diagram for the above data warehouse using one of the schema classes listed in (a). Using a star schema.

11 Q3 cont. Star Schema
Fact table: time_key, doctor_id, patient_id (dimension keys); charge, count (measures)
time dimension: time_key, day, day_of_the_week, month, quarter, year
doctor dimension: doctor_id, doctor_name, phone #, address, gender
patient dimension: patient_id, patient_name, phone #, address, gender
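As a rough sketch, the star schema for this warehouse could be declared with Python's built-in `sqlite3` module as follows. The table name `visit_fact`, the column name `phone`, and all column types are assumptions chosen for the example; the slide only names the attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables (column types are assumed, not given on the slide).
CREATE TABLE time (
    time_key INTEGER PRIMARY KEY,
    day INTEGER, day_of_the_week TEXT,
    month INTEGER, quarter INTEGER, year INTEGER
);
CREATE TABLE doctor (
    doctor_id INTEGER PRIMARY KEY,
    doctor_name TEXT, phone TEXT, address TEXT, gender TEXT
);
CREATE TABLE patient (
    patient_id INTEGER PRIMARY KEY,
    patient_name TEXT, phone TEXT, address TEXT, gender TEXT
);
-- Fact table: one key per dimension plus the two measures.
CREATE TABLE visit_fact (
    time_key INTEGER REFERENCES time(time_key),
    doctor_id INTEGER REFERENCES doctor(doctor_id),
    patient_id INTEGER REFERENCES patient(patient_id),
    charge REAL,
    count INTEGER
);
""")

# List the tables and the fact table's foreign keys to confirm the star shape.
tables = {row[0] for row in
          conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")}
fact_links = {row[2] for row in
              conn.execute("PRAGMA foreign_key_list(visit_fact)")}
```

Every dimension table is referenced directly by the fact table and by nothing else, which is what distinguishes a star schema from a snowflake schema.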

12 Q3 cont.
(c) Starting with the base cuboid [day, doctor, patient], what specific OLAP operations should be performed in order to list the total fee collected by each doctor in 2010?
The operations to be performed are:
Roll-up on time from day to year.
Slice for time = 2010.
Roll-up on patient from individual patient to all.
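The three operations can be traced step by step on a handful of made-up visits. The doctor names, patient ids, dates, and fees below are hypothetical sample data, not part of the exercise.

```python
from collections import defaultdict

# Hypothetical base-cuboid cells: (day, doctor, patient, charge).
visits = [
    ("2010-03-01", "Dr. Lee", "p1", 80.0),
    ("2010-03-01", "Dr. Lee", "p2", 120.0),
    ("2010-07-15", "Dr. Kim", "p1", 60.0),
    ("2011-01-10", "Dr. Lee", "p3", 200.0),
]

# Step 1: roll-up on time from day to year.
by_year = defaultdict(float)
for day, doctor, patient, charge in visits:
    by_year[(day[:4], doctor, patient)] += charge

# Step 2: slice for time = 2010 (the time dimension is now fixed).
in_2010 = {(doctor, patient): charge
           for (year, doctor, patient), charge in by_year.items()
           if year == "2010"}

# Step 3: roll-up on patient from individual patient to all.
total_fee = defaultdict(float)
for (doctor, patient), charge in in_2010.items():
    total_fee[doctor] += charge

for doctor, fee in sorted(total_fee.items()):
    print(doctor, fee)
```

The order of the slice and the patient roll-up could be swapped without changing the result, because summing charge is distributive over both dimensions.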

13 Q4 Suppose that a data warehouse for Big-University consists of the following four dimensions: student, course, semester, and instructor, and two measures count and avg. grade. When at the lowest conceptual level (e.g., for a given student, course, semester, and instructor combination), the avg grade measure stores the actual course grade of the student. At higher conceptual levels, avg grade stores the average grade for the given combination.

14 Q4 cont. Snowflake Schema
Fact table: course_id, student_id, instructor_id, semester_id (dimension keys); count, avg. grade (measures)
Course dimension: course_id, course_name, department
Student dimension: student_id, student_name, area_id, major, status, university
Area table (normalized out of Student): area_id, city, state, country
Semester dimension: semester_id, semester, year
Instructor dimension: instructor_id, department, rank

15 Q4 cont.
Starting with the base cuboid [student, course, semester, instructor], what specific OLAP operations (e.g., roll-up from semester to year) should one perform in order to list the average grade of CS courses for each Big-University student?
Roll-up on course from course id to department.
Roll-up on semester from semester id to all.
Roll-up on instructor from instructor id to all.
Slice for course = "CS" (at the department level).
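A small sketch of this query follows. Because an average cannot be obtained by averaging averages, the roll-up carries a (sum, count) pair per cell and divides only at the end. The students, courses, and grades are hypothetical sample data, and collapsing the instructor dimension is an assumption of this sketch, made because a per-student average needs it.

```python
from collections import defaultdict

# Hypothetical base-cuboid rows:
# (student, course_id, department, semester, instructor, grade)
rows = [
    ("ann", "CS101", "CS", "2010-fall", "i1", 3.0),
    ("ann", "CS201", "CS", "2011-spring", "i2", 4.0),
    ("ann", "MA101", "Math", "2010-fall", "i3", 2.0),
    ("bob", "CS101", "CS", "2010-fall", "i1", 3.5),
]

# Roll up course to department, and semester and instructor to "all",
# keeping [sum, count] so the average stays exact.
acc = defaultdict(lambda: [0.0, 0])
for student, course_id, dept, semester, instructor, grade in rows:
    cell = acc[(student, dept)]
    cell[0] += grade
    cell[1] += 1

# Slice on the course dimension at the department level: department = "CS".
avg_cs = {student: total / n
          for (student, dept), (total, n) in acc.items()
          if dept == "CS"}
```

The student dimension is left at its lowest level, since the question asks for one average per student.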

16 Q5 Suppose that a data warehouse consists of the four dimensions, date, spectator, location, and game, and the two measures, count and charge, where charge is the fare that a spectator pays when watching a game on a given date. Spectators may be students, adults, or seniors, with each category having its own charge rate. Draw a star schema diagram for the data warehouse.

17 Q5 cont. Star Schema
Sales fact table: date_id, spectator_id, location_id, game_id (dimension keys); charge, count (measures)
date dimension: date_id, day, month, quarter, year
spectator dimension: spectator_id, spectator_name, phone #, address, status, charge rate
location dimension: location_id, phone #, street, city, province, country
game dimension: game_id, game_name, description, producer

18 Q5 cont.
Starting with the base cuboid [date, spectator, location, game], what specific OLAP operations should one perform in order to list the total charge paid by student spectators at GM Place in 2010?
The specific OLAP operations to be performed are:
Roll-up on date from date id to year.
Roll-up on game from game id to all.
Roll-up on location from location id to location name.
Roll-up on spectator from spectator id to status.
Dice with status = "students", location name = "GM Place", and year = 2010.
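The roll-ups and the final dice can be sketched as below. The fact rows and the two lookup dictionaries (`spectator_status`, `location_name`) are hypothetical stand-ins for the spectator and location dimension tables.

```python
from collections import defaultdict

# Hypothetical dimension lookups: id -> higher-level attribute.
spectator_status = {1: "students", 2: "adults", 3: "students"}
location_name = {10: "GM Place", 11: "BC Place"}

# Hypothetical fact rows: (date_id, spectator_id, location_id, game_id, charge).
facts = [
    ("2010-05-01", 1, 10, "g1", 15.0),
    ("2010-06-12", 3, 10, "g2", 15.0),
    ("2010-06-12", 2, 10, "g2", 30.0),
    ("2010-06-20", 1, 11, "g3", 12.0),
    ("2011-02-03", 1, 10, "g1", 16.0),
]

# Roll-ups: date id -> year, spectator id -> status,
# location id -> location name, game id -> all (game is dropped).
rolled = defaultdict(float)
for date_id, spectator_id, location_id, game_id, charge in facts:
    key = (date_id[:4], spectator_status[spectator_id],
           location_name[location_id])
    rolled[key] += charge

# Dice: select on three dimensions at once.
diced = {k: v for k, v in rolled.items()
         if k == ("2010", "students", "GM Place")}
total_charge = sum(diced.values())
```

With this sample data, the two student tickets sold at GM Place in 2010 are the only cells that survive the dice.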

