Data Mining, by Kelby Lee. Overview: Transaction Database; What is Data Mining; Data Mining Primitives; Data Mining Objectives; Predictive Modeling.


3-1 Data Mining Kelby Lee

3-2 Overview
- Transaction Database
- What is Data Mining
- Data Mining Primitives
- Data Mining Objectives
- Predictive Modeling
- Knowledge Discovery
- Other Objectives of Data Mining
- What Data Mining is Not
- Other Factors in Data Mining Categorization
- Conclusion

3-3 Transaction Database
- A relation consisting of transactions
- TID (Transaction Identifier)
- Regularities in transaction behavior

3-4 Transaction Database
Table 1.1 Transaction Database

TID | Customer | Item         | Date   | Price | Quantity
    | C1       | chocolate    | 01/11/ |       |
    | C1       | ice cream    | 01/11/ |       |
    | C2       | chocolate    | 01/12/ |       |
    | C2       | candy bar    | 01/12/ |       |
    | C2       | jackets      | 01/12/ |       |
    | C3       | jackets      | 01/14/ |       |
    | C3       | color shirts | 01/14/ |       |
    | C4       | jackets      | 01/15/ |       |

3-5 Association Rules
- A customer who buys chocolate will likely also buy a candy bar
- Finding such rules is one type of data mining task

3-6 Discovered Rules
Table 1.2 Discovered Rules

Rule | Bought this... | ...also bought that
1    | chocolate      | ice cream
2    | candy bar      | chocolate
3    | ski pants      | colored shirt
4    | beer           | diaper
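Rules like those in Table 1.2 are usually ranked by two standard measures, support and confidence. The slides do not name them, so the sketch below is an illustration rather than the method behind the table. It assumes one basket per customer, because the TID values are not available, and the thresholds `min_support` and `min_confidence` are hypothetical.

```python
from itertools import permutations

# Baskets built from Table 1.1, assuming one basket per customer
# (an assumption: the TID column is not available).
baskets = [
    {"chocolate", "ice cream"},             # C1
    {"chocolate", "candy bar", "jackets"},  # C2
    {"jackets", "color shirts"},            # C3
    {"jackets"},                            # C4
]


def support(items, baskets):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in baskets) / len(baskets)


def confidence(lhs, rhs, baskets):
    """Estimated probability that a basket with `lhs` also holds `rhs`."""
    return support({lhs, rhs}, baskets) / support({lhs}, baskets)


def discover_rules(baskets, min_support=0.25, min_confidence=0.5):
    """Return (lhs, rhs, support, confidence) for single-item rules."""
    all_items = sorted(set().union(*baskets))
    rules = []
    for lhs, rhs in permutations(all_items, 2):
        s = support({lhs, rhs}, baskets)
        if s < min_support:
            continue
        c = confidence(lhs, rhs, baskets)
        if c >= min_confidence:
            rules.append((lhs, rhs, s, c))
    return rules


for lhs, rhs, s, c in discover_rules(baskets):
    print(f"{lhs} -> {rhs}  support={s:.2f}  confidence={c:.2f}")
```

On this tiny data set the rule "candy bar -> chocolate" has confidence 1.0, while "chocolate -> candy bar" has only 0.5, which shows that a rule and its reverse can differ in strength.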

3-7 What is Data Mining
- Retrieve individual elements
  - Given the name of a product, find its price and producer
- Analysis
  - Average monthly sales amount and deviation
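The two conventional uses of a database on this slide can be sketched as follows. The catalog, the producer name, and the sales figures are all invented for illustration, and reading the second analysis quantity as a standard deviation is an assumption.

```python
from statistics import mean, pstdev

# Hypothetical catalog and monthly sales; every value here is invented.
products = {"chocolate": {"price": 1.50, "producer": "Acme Sweets"}}
monthly_sales = [120.0, 135.0, 150.0, 135.0]


def lookup(name):
    """Retrieval: return the stored price and producer of one product."""
    record = products[name]
    return record["price"], record["producer"]


def summarize(sales):
    """Analysis: average monthly sales and population standard deviation."""
    return mean(sales), pstdev(sales)


print(lookup("chocolate"))
print(summarize(monthly_sales))
```

Both functions answer a question the user already knows how to ask. Neither one uncovers a pattern the user did not specify in advance.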

3-8 Advances Allow For
- Large amounts of data to be handled
- Aspect of analysis
- “Data Rich” but “Knowledge Poor”

3-9 Discover Patterns
- Improve business performance
  - Exploit favorable patterns
  - Avoid problematic patterns
- Increase understanding
- Predict outcome

3-10 Answer the Key Business Questions
- Who will buy? What will they buy? How much?
  - Classification and prediction
- What are the different types of customers?
  - Segmentation of customers

3-11 Answer the Key Business Questions
- What relationships exist between customers or Website visitors and the products?
  - Association
- What are the groupings hidden in the data?
  - Clustering analysis

3-12 Data Mining Definition
The nontrivial extraction of implicit, previously unknown, interesting, and potentially useful information from data.

3-13 Different Types of Data Mining
- Business data mining
- Scientific data mining
- Internet data mining

3-14 Data Mining Applications
- Medical
- Control theory
- Engineering
- Public administration
- Marketing and finance
- Data mining on the Web
- Scientific databases
- Fraud detection

3-15 Data Mining Primitives
- Fundamental elements needed to define a data mining task
- Eight elements (P, D, K, B, T, M, I, U)
  - An 8-tuple

3-16 Elements
- P - Problem specification
- D - Task-relevant data
- K - Kind of knowledge to be mined
- B - Background knowledge
- T - Specific algorithms or techniques
- M - Models developed or knowledge patterns extracted
- I - Interestingness
- U - User

3-17 Diagram

3-18 Relationship between Elements
- The user defines the problem (P) and specifies interestingness (I)
- The data miner has K and T as its core elements, uses D and B, and incorporates I
- The data miner produces M
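One way to picture the 8-tuple and the relationships above is as a record with one field per primitive. This is only a sketch: the class name `MiningTask`, the field types, and the `run` method are assumptions, not part of the original model. M is listed last only because it is the output and starts empty.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Callable, Sequence


@dataclass
class MiningTask:
    """The eight primitives (P, D, K, B, T, M, I, U); types are assumptions."""
    problem: str                                       # P, set by the user
    data: Sequence[Any]                                # D
    knowledge_kind: str                                # K
    background: dict                                   # B
    technique: Callable[[Sequence[Any], dict], list]   # T
    interestingness: Callable[[Any], bool]             # I, set by the user
    user: str                                          # U
    models: list = field(default_factory=list)         # M, produced by run()

    def run(self):
        """Apply T to D and B, then keep only what I accepts."""
        candidates = self.technique(self.data, self.background)
        self.models = [m for m in candidates if self.interestingness(m)]
        return self.models


def count_items(data, background):
    """Hypothetical technique: count how many baskets contain each item."""
    counts = {}
    for basket in data:
        for item in basket:
            counts[item] = counts.get(item, 0) + 1
    return sorted(counts.items())


task = MiningTask(
    problem="Which items appear in more than one transaction?",
    data=[{"chocolate", "ice cream"},
          {"chocolate", "candy bar", "jackets"},
          {"jackets"}],
    knowledge_kind="frequent items",
    background={},
    technique=count_items,
    interestingness=lambda pair: pair[1] >= 2,
    user="analyst",
)
print(task.run())
```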

3-19 Data Mining Objectives
- Discovery
  - Finding human-interpretable patterns describing the data
- Prediction
  - Using some variables or fields in the database to predict unknown or future values of other variables of interest

3-20 Data Mining Objectives
- Knowledge Discovery
  - A stage somewhat prior to prediction, where information is insufficient
  - Closer to decision support

3-21 Predictive Modeling
- Predict values based on similar groups of data
- Submit records with some unknown fields, and the system will predict their values

3-22 Predictive Modeling
- Pattern Recognition
  - Association of an observation to past experience or knowledge
  - Interchangeable with classification

3-23 Predictive Modeling
- Classification
  - Process of assigning one of a finite set of labels to an observation
- Estimation
  - Assigning one of an infinite number of possible numeric labels to an observation
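The difference between classification and estimation can be sketched with a nearest-neighbor predictor, which also matches the idea of predicting from similar groups of data. Nearest neighbors is a technique chosen here for illustration only, since the slides do not prescribe one. The customer records and field meanings are invented.

```python
from collections import Counter

# Hypothetical records: ((age, monthly_visits), label, monthly_spend).
known = [
    ((25, 8), "buyer", 40.0),
    ((30, 6), "buyer", 35.0),
    ((55, 1), "non-buyer", 5.0),
    ((60, 2), "non-buyer", 9.0),
]


def nearest(record, k):
    """The k known records closest to `record` (squared Euclidean distance)."""
    def dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], record))
    return sorted(known, key=dist)[:k]


def classify(record, k=3):
    """Classification: the most common label among the neighbors."""
    labels = [label for _, label, _ in nearest(record, k)]
    return Counter(labels).most_common(1)[0][0]


def estimate(record, k=3):
    """Estimation: the average numeric value among the neighbors."""
    values = [value for _, _, value in nearest(record, k)]
    return sum(values) / len(values)


print(classify((28, 7)))       # a label drawn from a finite set
print(estimate((28, 7), k=2))  # a number drawn from a continuous range
```

The same neighbors feed both functions. Only the kind of answer changes: a label from a fixed set versus a numeric value.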

3-24 Knowledge Discovery
- Find patterns in the database
  - If someone buys one thing, what else will they buy?
- Interesting + Certain = Knowledge
  - The output is called discovered knowledge
- KDD - Knowledge Discovery in Databases

3-25 Data Mining
- Is about why, about hidden regularities, an important aspect related to perception, learning, and evolving
- A decision support process in which we search for patterns of information in data
  - Once found, display them in a suitable format

3-26 Four Points of KDD
- Discovered knowledge is represented in a high-level language
- Discoveries accurately portray the contents of the database
- Discoveries are interesting to the user
- The process is efficient

3-27 Important Issues
- Human Centered
  - Under control of a human user to meet human needs
- Incorporate Interestingness
- Provide Various Types
- Provide Visualization

3-28 Other Objectives
- Forensic analysis
  - Applying extracted patterns to find anomalous or unusual data elements, largely in business applications
  - Find out what the norm is, then find the elements that deviate from it
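The "find the norm, then find what deviates" idea can be sketched with a simple z-score rule. Treating the mean as the norm and using a threshold of 2 standard deviations are both assumptions made for illustration, and the transaction amounts are invented.

```python
from statistics import mean, pstdev


def find_deviations(values, threshold=2.0):
    """Return values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # every value equals the norm, so nothing deviates
    return [v for v in values if abs(v - mu) / sigma > threshold]


# Hypothetical transaction amounts with one unusual entry.
amounts = [20, 22, 19, 21, 20, 23, 18, 21, 20, 250]
print(find_deviations(amounts))
```

A single extreme value also shifts the mean and inflates the standard deviation, so real systems often use more robust measures of the norm such as the median.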

3-29 What Data Mining is Not
- Analysis vs Monitoring
  - Analysis - works on previously collected information
  - Monitoring - collect data as it comes in and compare it to a set of conditions
- Unexpected Discovery
  - Must have a general goal in mind
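To make the contrast concrete, monitoring can be sketched as checking each incoming record against fixed conditions, with no pattern search involved. The function name `monitor`, the condition names, and the records are all hypothetical.

```python
def monitor(stream, conditions):
    """Yield (record, condition_name) as each incoming record is checked."""
    for record in stream:
        for name, check in conditions.items():
            if check(record):
                yield record, name


# Hypothetical conditions, fixed in advance by the user.
conditions = {
    "large order": lambda r: r["quantity"] > 100,
    "negative price": lambda r: r["price"] < 0,
}

# Hypothetical incoming records, processed one at a time.
incoming = [
    {"item": "chocolate", "price": 1.5, "quantity": 2},
    {"item": "jackets", "price": 60.0, "quantity": 500},
    {"item": "candy bar", "price": -1.0, "quantity": 1},
]

alerts = list(monitor(iter(incoming), conditions))
for record, name in alerts:
    print(name, record["item"])
```

The conditions are written before any data arrives. Data mining, by contrast, analyzes data already collected in order to find the patterns themselves.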

3-30 Other Factors in Categorization
- Data Retention
  - Data is retained for future pattern matching
- Pattern Distillation
  - Analyse the data, extract a pattern, and leave the data behind
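The two approaches can be contrasted in a small sketch. Here retention is illustrated with a nearest-record lookup and distillation with a least-squares line. Both techniques, the class names, and the data are assumptions chosen for illustration.

```python
class RetainedData:
    """Data retention: keep every record and match new cases against them."""

    def __init__(self, records):
        self.records = list(records)

    def predict(self, x):
        # Reuse the outcome of the stored record with the closest input.
        closest = min(self.records, key=lambda r: abs(r[0] - x))
        return closest[1]


class DistilledPattern:
    """Pattern distillation: fit y = a*x + b, then leave the data behind."""

    def __init__(self, records):
        n = len(records)
        mean_x = sum(x for x, _ in records) / n
        mean_y = sum(y for _, y in records) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in records)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in records)
        self.a = sxy / sxx               # slope
        self.b = mean_y - self.a * mean_x  # intercept
        # Only the two coefficients are stored; the records are not kept.

    def predict(self, x):
        return self.a * x + self.b


# Hypothetical (input, outcome) pairs.
records = [(1, 2.0), (2, 4.0), (3, 6.0), (4, 8.0)]

retained = RetainedData(records)
distilled = DistilledPattern(records)
print(retained.predict(2.9))
print(distilled.predict(5))
```

The retained version needs the full data set at prediction time, while the distilled version needs only the extracted pattern and can extrapolate beyond the stored inputs.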

3-31 Conclusion
- Transaction Database
- What is Data Mining
- Data Mining Primitives
- Data Mining Objectives
- Predictive Modeling
- Knowledge Discovery
- Other Objectives of Data Mining
- What Data Mining is Not
- Other Factors in Data Mining Categorization