Data Mining as a BI Tool

Presentation transcript:

Data Mining as a BI Tool
[Diagram: Business Intelligence. Labels: Data Extraction (Collecting / Transforming); Data Storage (Storing / Aggregating / Historising); Data Analysis (Reporting / EIS / MIS, OLAP, Data Mining); Visualisation, Exploration, Discovery]

OLAP vs. Data Mining
- OLAP verifies hypotheses: the analyst intuits the result and guides the process
- Data Mining discovers hypotheses: the data determine the results

Input-Output View
[Diagram: Data Mining, with the labels Business Knowledge, Data (internal & external), Objective(s), Decision Models, Reports, New Knowledge]

What Kind of Output?
- Decision trees
- Rules
- Web

Data Mining
- Operationalization of Machine Learning, with two specific emphases
  - Emphasis on process
  - Emphasis on action

From Data to Action
- Data (raw): Lifestyle, Transactions, Socio-demographics
- Information:
  - Mrs X buys product Y
  - Product X costs Y francs
  - Mr X drives a car of type Y
  - Dr X performed Y operations of type T
- Knowledge:
  - People who buy product X also buy product Y, P% of the time
  - Doctors who perform in excess of N operations of type T per month may be fraudulent
  - Molecules of class X are most likely carcinogenic
- Actions:
  - Offer product Y to owners of product X
  - Investigate potential frauds
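The knowledge pattern "people who buy product X also buy product Y, P% of the time" can be computed directly from transaction data. The following is a minimal sketch, assuming baskets are represented as Python sets; the function name `rule_confidence` and the sample baskets are hypothetical and do not come from the slides.

```python
def rule_confidence(baskets, x, y):
    """Return the share of baskets containing x that also contain y."""
    with_x = [b for b in baskets if x in b]
    if not with_x:
        return 0.0
    with_both = [b for b in with_x if y in b]
    return len(with_both) / len(with_x)


# Hypothetical transactions (raw data), one set of products per basket.
baskets = [
    {"X", "Y"},
    {"X", "Y", "Z"},
    {"X"},
    {"Y", "Z"},
    {"X", "Y"},
]

# Knowledge: how often buyers of X also buy Y.
p = rule_confidence(baskets, "X", "Y")
print(f"People who buy X also buy Y {p:.0%} of the time")
```

A high value of `p` is what would motivate the action "offer product Y to owners of product X".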

Process View
- Business Problem Formulation (example: determine creditworthiness)
- Understanding Domain & Data (example: learn about loans, repayments, etc.; collect data about past performance)
- Data Pre-processing (example: aggregate individual incomes into household income)
- Model Building (example: build a decision tree)
- Interpretation & Evaluation (example: check against hold-out set)
- Dissemination & Deployment
- Data flow: Raw Data, Selected Data, Pre-processed Data, Patterns / Models
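The creditworthiness example can be walked through in a few lines. This is only a sketch under stated assumptions: the records are invented, and a single income threshold (a one-split "stump") stands in for the decision tree mentioned on the slide. The names `fit_stump`, `incomes`, and `repaid` are hypothetical.

```python
from collections import defaultdict

# Hypothetical raw data: (household id, individual income).
incomes = [
    ("h1", 30000), ("h1", 25000), ("h2", 18000), ("h3", 40000),
    ("h3", 35000), ("h4", 12000), ("h5", 50000), ("h6", 15000),
]
# Hypothetical past performance: did the household repay its loan?
repaid = {"h1": True, "h2": False, "h3": True,
          "h4": False, "h5": True, "h6": False}

# Data pre-processing: aggregate individual incomes into household income.
household_income = defaultdict(int)
for household, income in incomes:
    household_income[household] += income

# Keep a hold-out set aside; the model never sees it during building.
train_ids = ["h1", "h2", "h3", "h4"]
holdout_ids = ["h5", "h6"]


def accuracy(threshold, ids):
    """Share of households where 'income >= threshold' matches repayment."""
    hits = sum((household_income[h] >= threshold) == repaid[h] for h in ids)
    return hits / len(ids)


def fit_stump(ids):
    """Model building: pick the income threshold that best splits the data."""
    values = sorted(household_income[h] for h in ids)
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    return max(candidates, key=lambda t: accuracy(t, ids))


threshold = fit_stump(train_ids)

# Interpretation & evaluation: check against the hold-out set.
holdout_accuracy = accuracy(threshold, holdout_ids)
print(threshold, holdout_accuracy)
```

With real data the hold-out set would be much larger and chosen at random; the fixed split here just keeps the example reproducible.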

Key Success Factors
- Have a clearly articulated business problem that needs to be solved and for which Data Mining is the appropriate technology
- Ensure that the problem being pursued is supported by the right type of data, of sufficient quality and in sufficient quantity
- Recognise that Data Mining is a process with many components and dependencies
- Plan to learn from the Data Mining process whatever the outcome

Myths (I)
- Myth: Data Mining produces surprising results that will utterly transform your business
  Reality:
  - Early results = scientific confirmation of human intuition.
  - Beyond = steady improvement to an already successful organisation.
  - Occasionally = discovery of one of those rare « breakthrough » facts.
- Myth: Data Mining techniques are so sophisticated that they can substitute for domain knowledge or for experience in analysis and model building
  Reality:
  - Data Mining = joint venture.
  - Close cooperation between experts in modeling and using the associated techniques, and people who understand the business.

Myths (II)
- Myth: Data Mining is useful only in certain areas, such as marketing, sales, and fraud detection
  Reality:
  - Data mining is useful wherever data can be collected.
  - All that is really needed is data and a willingness to « give it a try. » There is little to lose…
- Myth: Only massive databases are worth mining
  Reality:
  - A moderately-sized or small data set can also yield valuable information.
  - It is not only the quantity, but also the quality of the data that matters (example: characterising mutagenic compounds)

Myths (III)
- Myth: The methods used in Data Mining are fundamentally different from the older quantitative model-building techniques
  Reality:
  - All methods now used in data mining are natural extensions and generalisations of analytical methods known for decades.
  - What is new in data mining is that we are now applying these techniques to more general business problems.
- Myth: Data Mining is an extremely complex process
  Reality:
  - The algorithms of data mining may be complex, but new tools and well-defined methodologies have made those algorithms easier to apply.
  - Much of the difficulty in applying data mining comes from the same data organisation issues that arise when using any modeling technique.

OLAP vs. DM Illustration

Data Mining with OLAP (I)
- Formulate hypothesis
  - Beer and fish sell well together
- Issue corresponding queries
  - TC = select COUNT of all baskets containing both beer and fish
- Decide on validity
  - Ratio of TC over baskets containing only beer or only fish, AND other possible associations
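The hypothesis check described above can be sketched as follows. The baskets are invented for illustration, and "only beer or only fish" is read here as "one of the two products but not both", which is an interpretation rather than something the slides spell out.

```python
# Hypothetical baskets; each basket is a set of product names.
baskets = [
    {"beer", "fish"},
    {"beer", "fish", "bread"},
    {"beer"},
    {"fish", "milk"},
    {"beer", "chips"},
    {"bread", "milk"},
]

# Query: TC = count of all baskets containing both beer and fish.
tc = sum(1 for b in baskets if {"beer", "fish"} <= b)

# Baskets with beer but no fish, and with fish but no beer.
only_beer = sum(1 for b in baskets if "beer" in b and "fish" not in b)
only_fish = sum(1 for b in baskets if "fish" in b and "beer" not in b)

# Decide on validity: compare TC with the baskets holding just one of the two.
ratio = tc / (only_beer + only_fish)
print(tc, only_beer, only_fish, round(ratio, 2))
```

The analyst still has to judge whether the ratio is high enough, and repeat the same queries for every other association worth comparing against.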

Data Mining with OLAP (II)
- Assume 11 possible products in any one basket and restrict to associations of at most 4 products
  - 55 possible associations of 2 products
  - 165 possible associations of 3 products
  - 330 possible associations of 4 products
- Must issue 550 queries and compare the results!
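The counts on this slide are binomial coefficients: the number of ways to choose k products out of 11. A short check, using only the standard library:

```python
from math import comb

n_products = 11

# Number of possible associations of 2, 3 and 4 products.
counts = {k: comb(n_products, k) for k in range(2, 5)}
total = sum(counts.values())

for k, count in counts.items():
    print(f"{count} possible associations of {k} products")
print(f"{total} queries in total")
```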

Data Mining Instead of OLAP
- Only two alternatives with OLAP:
  - Brute force: prohibitive!
  - Intuition: speculative!
- Data Mining strikes a balance:
  - Try most associations
  - Use heuristics to guide the search
- DM increases chances of useful discovery!
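The slides do not say which heuristics are meant. One well-known option, used here purely as an illustration, is the Apriori-style pruning rule: an association can only be frequent if all of its smaller sub-associations are frequent, so infrequent ones are never extended. The sketch below assumes invented baskets and a hypothetical function `frequent_itemsets`, and compares the number of associations actually counted with the brute-force total.

```python
from itertools import combinations
from math import comb


def frequent_itemsets(baskets, min_support, max_size):
    """Level-wise search that only extends itemsets already found frequent."""
    def support(itemset):
        return sum(1 for b in baskets if itemset <= b)

    items = sorted({item for b in baskets for item in b})
    level = {frozenset([item]) for item in items}
    frequent, evaluated = {}, 0
    for size in range(1, max_size + 1):
        kept = set()
        for candidate in level:
            evaluated += 1
            count = support(candidate)
            if count >= min_support:
                frequent[candidate] = count
                kept.add(candidate)
        # Next level: merge kept sets, keeping a candidate only if every
        # subset of the current size is itself frequent (the pruning step).
        level = {
            a | b
            for a in kept for b in kept
            if len(a | b) == size + 1
            and all(frozenset(s) in kept for s in combinations(a | b, size))
        }
        if not level:
            break
    return frequent, evaluated


# Hypothetical baskets.
baskets = [
    {"beer", "fish", "chips"},
    {"beer", "fish"},
    {"beer", "fish", "bread"},
    {"milk", "bread"},
    {"beer", "chips"},
    {"milk", "eggs"},
]

frequent, evaluated = frequent_itemsets(baskets, min_support=2, max_size=3)
n_items = len({item for b in baskets for item in b})
brute_force = sum(comb(n_items, k) for k in range(1, 4))
print(f"counted {evaluated} associations instead of {brute_force}")
```

On this toy data the search counts 16 candidate associations where brute force would count 41; the gap widens quickly as the number of products grows.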