Data Management for Decision Support Session-2 Prof. Bharat Bhasker.

Slides:



Advertisements
Similar presentations
Data Warehousing CPS216 Notes 13 Shivnath Babu. 2 Warehousing l Growing industry: $8 billion way back in 1998 l Range from desktop to huge: u Walmart:
Advertisements

Data Mining Glen Shih CS157B Section 1 Dr. Sin-Min Lee April 4, 2006.
Data Warehouse Architecture Sakthi Angappamudali Data Architect, The Oregon State University, Corvallis 16 th May, 2005.
Database – Part 3 Dr. V.T. Raja Oregon State University External References/Sources: Data Warehousing – Mr. Sakthi Angappamudali.
CPSC-608 Database Systems Fall 2011 Instructor: Jianer Chen Office: HRBB 315C Phone: Notes #15.
ICS 421 Spring 2010 Data Warehousing (1) Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 3/18/20101Lipyeow.
CSE6011 Data Warehouse and OLAP  Why data warehouse  What’s data warehouse  What’s multi-dimensional data model  What’s difference between OLAP and.
Introduction to Data Warehousing. From DBMS to Decision Support DBMSs widely used to maintain transactional data Attempts to use of these data for analysis,
Database – Part 2b Dr. V.T. Raja Oregon State University External References/Sources: Data Warehousing – Sakthi Angappamudali at Standard Insurance; BI.
Data Mining and Data Warehousing – a connected view.
Database – Part 2 Dr. V.T. Raja Oregon State University.
13 Chapter 13 The Data Warehouse Hachim Haddouti.
Introduction to Data Warehousing Enrico Franconi CS 636.
Chapter 13 The Data Warehouse
1 © Prentice Hall, 2002 Chapter 11: Data Warehousing.
Data Warehousing DSCI 4103 Dr. Mennecke Introduction and Chapter 1.
Data Warehousing: Defined and Its Applications Pete Johnson April 2002.
A Comparsion of Databases and Data Warehouses Name: Liliana Livorová Subject: Distributed Data Processing.
Chapter 13 – Data Warehousing. Databases  Databases are developed on the IDEA that DATA is one of the critical materials of the Information Age  Information,
Enterprise systems infrastructure and architecture DT211 4
Joachim Hammer 1 Data Warehousing Overview, Terminology, and Research Issues Joachim Hammer.
Basic Concepts of Datawarehousing An Overview Prasanth Gurram.
Week 6 Lecture The Data Warehouse Samuel Conn, Asst. Professor
D ATABASE S YSTEMS D ATA W AREHOUSING I Asma Ahmad 29 th April, 2011.
Enhancing Management Decision Making Week-10 Prof. Bharat Bhasker.
Data Warehouse Overview September 28, 2012 presented by Terry Bilskie.
AN OVERVIEW OF DATA WAREHOUSING
Database Design Part of the design process is deciding how data will be stored in the system –Conventional files (sequential, indexed,..) –Databases (database.
Data warehousing and online analytical processing- Ref Chap 4) By Asst Prof. Muhammad Amir Alam.
Data Warehouse Fundamentals Rabie A. Ramadan, PhD 2.
1 Data Warehouses BUAD/American University Data Warehouses.
OLAP & DSS SUPPORT IN DATA WAREHOUSE By - Pooja Sinha Kaushalya Bakde.
Data Warehousing Data Mining Privacy. Reading Bhavani Thuraisingham, Murat Kantarcioglu, and Srinivasan Iyer Extended RBAC-design and implementation.
1 Reviewing Data Warehouse Basics. Lessons 1.Reviewing Data Warehouse Basics 2.Defining the Business and Logical Models 3.Creating the Dimensional Model.
1 Categories of data Operational and very short-term decision making data Current, short-term decision making, related to financial transactions, detailed.
5 - 1 Copyright © 2006, The McGraw-Hill Companies, Inc. All rights reserved.
Building Data and Document-Driven Decision Support Systems How do managers access and use large databases of historical and external facts?
6.1 © 2010 by Prentice Hall 6 Chapter Foundations of Business Intelligence: Databases and Information Management.
Sachin Goel (68) Manav Mudgal (69) Piyush Samsukha (76) Rachit Singhal (82) Richa Somvanshi (85) Sahar ( )
1 Categories of data Operational and very short-term decision making data Current, short-term decision making, related to financial transactions, detailed.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
MAIN BOOKS 1. DATA WAREHOUSING IN THE REAL WORLD : Sam Anshory & Dennis Murray, Pearson 2. DATA MINING CONCEPTS AND TECHNIQUES : Jiawei Han & Micheline.
DATA RESOURCE MANAGEMENT
Managing Data for DSS II. Managing Data for DS Data Warehouse Common characteristics : –Database designed to meet analytical tasks comprising of data.
Advanced Database Concepts
Data Warehousing/Mining 1 Data Warehousing/Mining Introduction.
The Need for Data Analysis 2 Managers track daily transactions to evaluate how the business is performing Strategies should be developed to meet organizational.
2 Copyright © 2006, Oracle. All rights reserved. Defining Data Warehouse Concepts and Terminology.
1 Data Warehousing Data Warehousing. 2 Objectives Definition of terms Definition of terms Reasons for information gap between information needs and availability.
Chapter 3 Building Business Intelligence Chapter 3 DATABASES AND DATA WAREHOUSES Building Business Intelligence 6/22/2016 1Management Information Systems.
Data Mining and Data Warehousing: Concepts and Techniques What is a Data Warehouse? Data Warehouse vs. other systems, OLTP vs. OLAP Conceptual Modeling.
Data Warehouse and OLAP
Data warehouse.
Data Warehousing CIS 4301 Lecture Notes 4/20/2006.
Data warehouse and OLAP
Chapter 13 The Data Warehouse
Data Warehouse.
Chapter 13 – Data Warehousing
Data Warehouse Overview September 28, 2012 presented by Terry Bilskie
An Introduction to Data Warehousing
Introduction to Data Warehousing
Data Warehouse and OLAP
Data Warehousing Data Model –Part 1
Data Warehouse.
Data Warehousing Data Mining Privacy
Data Warehousing Concepts
Kenneth C. Laudon & Jane P. Laudon
Data Warehouse and OLAP Technology
Presentation transcript:

Data Management for Decision Support Session-2 Prof. Bharat Bhasker

What is a Data Warehouse? A Practitioners Viewpoint “A data warehouse is simply a single, complete, and consistent store of data obtained from a variety of sources and made available to end users in a way they can understand and use it in a business context.” -- Barry Devlin, IBM Consultant

Introduction- Business Problem Definition Business Application domain of Decision Support Customer Retention- modeling those who defected to identify patterns that led their defection. Apply it to present set and devise prevention. Sales & Customer Service - By aggregating the proper information to frontline sales and service professionals. Based on customer profile, rule based recommendation of products, service plans. Cross-selling Marketing- Accurate information for retention campaigns, lifetime value analysis, trending, targeted promotions Risk assessment and fraud detection- Accessible customer base reduces the risk. A mail-order company can identify payment patterns from a mailing address. Insurance company can aggregate all drawn out policies, thus total exposure. A bank can identify risky companies

Introduction- Business Problem Definition Class of Problems Retrospective Analysis - Analysis of past and present events. Analysis of zone wise, product wise sales Predictive Analysis- Predicting certain events or behavior based on historical data. Attrition rate of customers due to competition

Introduction- Business Problem Classifications Application Techniques for Business Problems Classification- based on predefined criteria, examples credit ratings of individuals Clustering and Segmentation - Segment a database in subsets or clusters based on a set of attributes. In the process of understanding customer base, an organization may segment the known population. Thus, discover hitherto unknown attributes (school attended, # of vacations per year). Clustering may utilize statistical or AI techniques Associations- These techniques identify affinity among the collection reflected in the examined records. These affinities are often referred to as rules. Foe example, 60% of all record that contain A & B also contain C & D. Product affinity in Market basket analysis Sequencing- Identifies patterns over time,. For example a analysis of customer visits may reveal that a person who buys engine oil & filter in one visit, buys gas additive in next visit. ====>

Introduction- Business Problem Classifications The organizations are faced with a wealth of data stored in archives,. It is the inability to discover, often previously unknown, information hidden in data prevents them from deriving the knowledge and value for the organization. The objective, therefore, should be to extract valid, previously unknown, and comprehensible knowledge/information from large database and use it for profit (Competitive advantage). To meet these objectives- –Capture and integrate both external and internal data into a comprehensive view that encompasses whole organization –Organize the present data and extracted information in ways that expedites the decision making –Mine the integrated data for information

Introduction- Operational Data Stores For variety of On Line Transaction Processing (OLTP) systems (Financial, Order processing, Point of sale applications) the organizations maintain an operational data store. For example, bank withdrawals and deposits. –Organized by application –Daily business processing on a detailed transactional level –Update intensive –Current data –Optimized for high performance –Access is few records per transaction, often on primary key –Large number of short duration transactions –Large number of concurrent applications

Introduction- Informational Data Stores Need to organized around subjects such as customers, vendors, products. Focuses on answering questions like “ What two products resulted in most frequent calls on hotline?”. Tend to have redundant and non-updateable. Characteristics exhibited include: –Data Model- to meet End-use analysis needs. No focus on ACID –Data Access- Ad hoc queries –Time base- Recent, Aggregated, derived and historical data –Data changes- Periodic, scheduled batch updates. –Unit of Work- Queries rather than concurrent updates –Record range accessed- Millions for a query vs. few for operational –Number of Concurrent users- Low for IDS vs. High for ODS. –Transaction Volumes- Low for IDS vs. High for ODS. –Type of users- Analytical & Managerial –Number of Indexes- Often many complex, compound vs. few, simple.

Introduction- Managing Data for DS Data Warehouse Common characteristics : –Database designed to meet analytical tasks comprising of data from multiple applications –Small number of users with intense and long interactions –Read intensive usage –Periodic updates to the contents –Consists of current as well as historical data –Relatively fewer but large tables –Queries results is large results sets, involving full table scan and joins spanning several tables –Aggregation, vector operation and summarization are common –The data frequently resides in external heterogeneous sources

Introduction- Terminology Current Detail Data- data acquired directly from operational databases, often representing entire enterprise Old Detail Data- Aged current detail data, historical data organized by subjects, it helps in trend analysis Data Marts- A large data store for informational needs where scope is limited to a department, SBUs etc., In a phased implementation data marts are a way to build a warehouse. Summarized Data- Aggregated data along the lines required for executive reporting,trend analysis and decision support. Drill Down- Ability of knowledge worker to perform business analysis in a top down fashion,starting from the highly summarized data to the current and old detail data. Metadata- It is data about the data, description of contents, location, structure, end-user views, identification of authoritative data, history of updates, security authorizations

Introduction- Architecture Extract, Cleanup & Load External Currentl Current Repository Meta data Realized or Virtual MDDB Management Information Delivery System Report, Query & EIs OLAP Tools Data Mining Tools

What is a Data Warehouse? An Alternative Viewpoint “A DW is a –subject-oriented, –integrated, –time-varying, –non-volatile collection of data that is used primarily in organizational decision making.” -- W.H. Inmon, Building the Data Warehouse, 1992

A Data Warehouse is... Stored collection of diverse data –A solution to data integration problem –Single repository of information Subject-oriented –Organized by subject, not by application –Used for analysis, data mining, etc. Optimized differently from transaction- oriented db User interface aimed at executive

… Cont’d Large volume of data (Gb, Tb) Non-volatile –Historical –Time attributes are important Updates infrequent May be append-only Examples –All transactions ever at WalMart –Complete client histories at insurance firm –Stockbroker financial information and portfolios

Summary Operational Systems Enterprise Modeling Business Information Guide Data Warehouse Catalog Data Warehouse Population Data Warehouse Business Information Interface

Warehouse is a Specialized DB Standard DB Mostly updates Many small transactions Mb - Gb of data Current snapshot Index/hash on p.k. Raw data Thousands of users (e.g., clerical users) Warehouse Mostly reads Queries are long and complex Gb - Tb of data History Lots of scans Summarized, reconciled data Hundreds of users (e.g., decision- makers, analysts)

Warehousing and Industry Warehousing is big business –$2 billion in 1995 –$3.5 billion in early 1997 – $8 billion in 1998 WalMart has largest warehouse –900-CPU, 2,700 disk, 23 TB Teradata system –~7 TB in warehouse –40-50GB per day