Data Mining – Introduction (contd…) Compiled By: Umair Yaqub Lecturer Govt. Murray College Sialkot.



Data Mining Functionalities
- Data mining functionalities specify the kinds of patterns to be found in data mining tasks.
- Data mining tasks
    - Descriptive data mining: characterizes the general properties of the data in the database
    - Predictive data mining: performs inference on the current data to make predictions
- A data mining system should
    - Mine multiple kinds of patterns
    - Be able to discover patterns at various levels of abstraction
    - Allow users to guide or focus the search

Data Mining Functionalities (contd…)
- Characterization: summarizes the characteristics of the data in a target class in general terms
- Discrimination: compares the target class with one or more contrasting classes

Data Mining Functionalities (contd…)
- Data characterization
    - Example: A data mining system should be able to produce a description summarizing the characteristics of customers who spend more than $1,000 a year at AllElectronics. The result could be a general profile of these customers, for instance that they are 40–50 years old, employed, and have excellent credit ratings. The system should allow users to drill down on any dimension, such as occupation, in order to view these customers according to their type of employment.
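The characterization example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the customer records, the field names (`age`, `occupation`, `credit`, `annual_spend`) and the helper `characterize` are hypothetical and do not come from the slides.

```python
from collections import Counter

# Hypothetical customer records, invented for illustration only.
customers = [
    {"age": 45, "occupation": "engineer", "credit": "excellent", "annual_spend": 1800},
    {"age": 42, "occupation": "teacher", "credit": "excellent", "annual_spend": 1250},
    {"age": 29, "occupation": "student", "credit": "fair", "annual_spend": 300},
    {"age": 48, "occupation": "engineer", "credit": "excellent", "annual_spend": 2100},
    {"age": 35, "occupation": "clerk", "credit": "good", "annual_spend": 700},
]


def characterize(records, min_spend=1000):
    """Summarize the target class: customers spending more than min_spend."""
    target = [r for r in records if r["annual_spend"] > min_spend]
    if not target:
        return {"count": 0}
    ages = [r["age"] for r in target]
    return {
        "count": len(target),
        "age_range": (min(ages), max(ages)),
        "credit": Counter(r["credit"] for r in target),
        # Drill-down on one dimension: the same customers split by occupation.
        "by_occupation": Counter(r["occupation"] for r in target),
    }


profile = characterize(customers)
print(profile)
```

The `by_occupation` counter plays the role of the drill-down: the target class stays the same, but it is viewed along one extra dimension.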

Data Mining Functionalities (contd…)
- Data discrimination
    - Example: A data mining system should be able to compare two groups of AllElectronics customers, such as those who shop for computer products regularly (more than two times a month) versus those who rarely shop for such products (i.e., fewer than three times a year). The resulting description provides a general comparative profile of the customers: for instance, 80% of the customers who frequently purchase computer products are between 20 and 40 years old and have a university education, whereas 60% of the customers who infrequently buy such products are either seniors or youths and have no university degree.
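A discrimination query of this kind amounts to computing the same statistic on a target class and on a contrasting class. The sketch below assumes hypothetical records with `age`, `degree` and `purchases_per_year` fields; the data and the helper `share` are invented for illustration, and "more than two times a month" is approximated as more than 24 purchases a year.

```python
# Hypothetical shopper records, invented for illustration only.
shoppers = [
    {"age": 25, "degree": True, "purchases_per_year": 30},
    {"age": 33, "degree": True, "purchases_per_year": 40},
    {"age": 38, "degree": True, "purchases_per_year": 26},
    {"age": 52, "degree": False, "purchases_per_year": 36},
    {"age": 67, "degree": False, "purchases_per_year": 1},
    {"age": 17, "degree": False, "purchases_per_year": 2},
    {"age": 45, "degree": True, "purchases_per_year": 0},
    {"age": 30, "degree": True, "purchases_per_year": 10},
]


def share(records, predicate):
    """Fraction of records that satisfy predicate (0.0 for an empty group)."""
    if not records:
        return 0.0
    return sum(1 for r in records if predicate(r)) / len(records)


# Target class: frequent buyers. Contrasting class: rare buyers.
frequent = [r for r in shoppers if r["purchases_per_year"] > 24]
rare = [r for r in shoppers if r["purchases_per_year"] < 3]


def young_graduate(r):
    return 20 <= r["age"] <= 40 and r["degree"]


print("frequent buyers:", share(frequent, young_graduate))
print("rare buyers:    ", share(rare, young_graduate))
```

Customers who fall into neither group (here, the one with 10 purchases a year) are simply left out of the comparison.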

Data Mining Functionalities (contd…)
- Mining frequent patterns, associations, and correlations
    - Frequent patterns: patterns that occur frequently in data
        - Frequent itemset
        - Frequent sequential pattern
        - Frequent structured pattern
    - Single-dimensional vs. multi-dimensional rules
    - Support and confidence thresholds

Data Mining Functionalities (contd…)
- Mining frequent patterns example: A frequent itemset typically refers to a set of items that frequently appear together in a transactional data set, such as milk and bread.
- Association analysis
    - Example: Suppose that, as a marketing manager of AllElectronics, you would like to determine which items are frequently purchased together within the same transactions. An example of such a rule, mined from the AllElectronics transactional database, is
      buys(X, “computer”) ⇒ buys(X, “software”) [support = 1%, confidence = 50%]
      where X is a variable representing a customer. A confidence, or certainty, of 50% means that if a customer buys a computer, there is a 50% chance that he or she will buy software as well. A support of 1% means that 1% of all the transactions under analysis showed that computer and software were purchased together.
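Support and confidence are straightforward to compute by counting. Support of an itemset is the fraction of transactions that contain every item in it, and the confidence of a rule A ⇒ B is support(A ∪ B) divided by support(A). The sketch below uses a tiny hypothetical transaction list (not AllElectronics data), so the resulting numbers differ from the 1% and 50% quoted in the example.

```python
# Hypothetical transactions, invented for illustration only.
transactions = [
    {"computer", "software"},
    {"computer", "printer"},
    {"milk", "bread"},
    {"computer", "software", "printer"},
    {"bread"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimated probability of consequent given antecedent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = support(set(antecedent) | set(consequent), transactions)
    return both / base


rule_support = support({"computer", "software"}, transactions)
rule_confidence = confidence({"computer"}, {"software"}, transactions)
print(f"support = {rule_support:.0%}, confidence = {rule_confidence:.0%}")
```

Here two of the five transactions contain both items, and two of the three transactions containing a computer also contain software.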

Data Mining Functionalities (contd…)
- Classification and prediction
    - Construct models (functions) that describe and distinguish classes or concepts, for use in future prediction
        - e.g., classify countries based on climate, or classify cars based on gas mileage
    - Predict unknown or missing numerical values

Data Mining Functionalities (contd…)
- Classification and prediction
    - Example: Suppose that, as sales manager of AllElectronics, you would like to classify a large set of items in the store, based on three kinds of responses to a sales campaign: good response, mild response, and no response. You would like to derive a model for each of these three classes based on the descriptive features of the items, such as price, brand, place made, type, and category.
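As a rough illustration of deriving a model from descriptive features, the sketch below trains a one-rule (OneR) classifier: for each feature it builds a table mapping every feature value to its most common class, then keeps the feature whose table makes the fewest training errors. OneR is a stand-in chosen for brevity, not the method the slides prescribe, and the items, the `price_band` feature and all labels are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical labelled items: (descriptive features, campaign response).
items = [
    ({"brand": "A", "category": "laptop", "price_band": "high"}, "good"),
    ({"brand": "B", "category": "laptop", "price_band": "high"}, "good"),
    ({"brand": "A", "category": "phone", "price_band": "mid"}, "mild"),
    ({"brand": "B", "category": "phone", "price_band": "mid"}, "mild"),
    ({"brand": "A", "category": "cable", "price_band": "low"}, "none"),
    ({"brand": "B", "category": "cable", "price_band": "low"}, "none"),
]


def one_r(training, features):
    """Return (feature, rule table, training errors) for the best single feature."""
    best = None
    for feature in features:
        counts = defaultdict(Counter)
        for attrs, label in training:
            counts[attrs[feature]][label] += 1
        # Map each feature value to its most frequent class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(1 for attrs, label in training if rule[attrs[feature]] != label)
        if best is None or errors < best[2]:
            best = (feature, rule, errors)
    return best


def predict(model, attrs, default="none"):
    """Classify a new item; unseen feature values fall back to default."""
    feature, rule, _ = model
    return rule.get(attrs[feature], default)


model = one_r(items, ["brand", "category", "price_band"])
new_item = {"brand": "C", "category": "phone", "price_band": "high"}
print(model[0], predict(model, new_item))
```

On this toy data `brand` carries no information about the response, so the learner settles on `category`, which separates the three classes without error.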

Data Mining Functionalities (contd…)
- Cluster analysis
    - The class label is unknown: group data to form new classes, e.g., cluster houses to find distribution patterns
    - Principle: maximize intra-class similarity and minimize inter-class similarity

Data Mining Functionalities (contd…)
- Cluster analysis example: Cluster analysis can be performed on AllElectronics customer data in order to identify homogeneous subpopulations of customers. These clusters may represent individual target groups for marketing. The figure shows a 2-D plot of customers with respect to customer locations in a city; three clusters of data points are evident.
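One common way to find such groups is k-means, sketched below for 2-D locations. The slides do not name a clustering algorithm, so k-means is an assumption here, as are the coordinates and the choice of three starting centroids. The sketch needs Python 3.8 or later for `math.dist`.

```python
from math import dist

# Hypothetical customer locations (x, y) in a city, invented for illustration.
locations = [
    (1, 1), (1, 2), (2, 1),
    (8, 8), (8, 9), (9, 8),
    (1, 8), (2, 9), (1, 9),
]


def kmeans(points, centroids, max_iter=100):
    """Plain k-means: assign each point to its nearest centroid, then recompute."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        labels = [
            min(range(len(centroids)), key=lambda i: dist(p, centroids[i]))
            for p in points
        ]
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                # Mean of the member coordinates, one axis at a time.
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Starting centroids are picked by hand here; real runs usually randomize them.
labels, centroids = kmeans(locations, [locations[0], locations[3], locations[6]])
print(labels)
```

Each point ends up closer to its own cluster's centroid than to any other, which is the "high intra-class, low inter-class similarity" idea expressed in terms of distance.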

Data Mining Functionalities (contd…)
- Outlier analysis
    - Outlier: a data object that does not comply with the general behavior of the data
    - Noise or exception?
    - Useful in fraud detection and rare events analysis
- Outlier analysis example: Outlier analysis may uncover fraudulent usage of credit cards by detecting purchases of extremely large amounts for a given account number in comparison to the regular charges incurred by the same account.
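A minimal sketch of the credit card example follows. It flags charges whose modified z-score, based on the median and the median absolute deviation (MAD), exceeds a cutoff. This particular test, the 3.5 cutoff and the charge amounts are assumptions made for illustration; the slides do not specify a detection method. The median is used instead of the mean because a single huge charge would otherwise inflate the very statistics it is compared against.

```python
from statistics import median

# Hypothetical charges on one account, invented for illustration.
charges = [40, 55, 38, 60, 45, 52, 4800]


def flag_outliers(values, threshold=3.5):
    """Return values whose modified z-score exceeds threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to compare against
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


suspicious = flag_outliers(charges)
print(suspicious)
```

Whether a flagged charge is noise or a genuine exception still has to be decided by someone who knows the account.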

Data Mining Functionalities (contd…)
- Trend and evolution analysis
    - Trend and deviation
    - Periodicity analysis
- Evolution analysis example: Suppose that you have the major stock market (time-series) data of the last several years available from the New York Stock Exchange, and you would like to invest in shares of high-tech industrial companies. A data mining study of stock exchange data may identify stock evolution regularities for overall stocks and for the stocks of particular companies. Such regularities may help predict future trends in stock market prices, contributing to your decision making regarding stock investments.
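The simplest form of trend analysis on a time series is smoothing with a moving average and then checking the direction of the smoothed curve. The sketch below assumes a short hypothetical price series and a window of three observations; it illustrates the idea only and is not a forecasting method.

```python
# Hypothetical daily closing prices, invented for illustration.
prices = [10, 11, 12, 11, 13, 14, 15]


def moving_average(series, window):
    """Average of each run of `window` consecutive values."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


def trend_direction(smoothed):
    """Compare the last smoothed value with the first one."""
    if smoothed[-1] > smoothed[0]:
        return "up"
    if smoothed[-1] < smoothed[0]:
        return "down"
    return "flat"


smoothed = moving_average(prices, window=3)
print(smoothed, trend_direction(smoothed))
```

The dip from 12 to 11 in the raw series is a deviation that the smoothing absorbs, leaving the overall upward trend visible.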

Are All Patterns Interesting?
- Interesting
    - Easily understood
    - Valid on new data
    - Potentially useful
    - Novel
    - Validates a hypothesis that the user sought to confirm
    - Unexpected
- Objective measures: require thresholds
- Subjective measures: based on user beliefs in the data
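Objective measures lend themselves to a simple filter: keep only the rules that meet user-supplied thresholds. In the sketch below the first rule reuses the support and confidence figures from the association example, while the other two rules, the threshold values and the helper `interesting` are hypothetical.

```python
# Candidate rules with objective measures; only the first comes from the slides.
rules = [
    {"rule": "computer => software", "support": 0.01, "confidence": 0.50},
    {"rule": "milk => bread", "support": 0.20, "confidence": 0.80},
    {"rule": "printer => scanner", "support": 0.001, "confidence": 0.90},
]


def interesting(rules, min_support, min_confidence):
    """Keep rules that meet both objective thresholds."""
    return [
        r["rule"]
        for r in rules
        if r["support"] >= min_support and r["confidence"] >= min_confidence
    ]


kept = interesting(rules, min_support=0.01, min_confidence=0.5)
print(kept)
```

Subjective measures such as unexpectedness cannot be reduced to a threshold in this way, because they depend on what the user already believes about the data.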