Chapter 4: Data Mining for Business Intelligence

Slides:



Advertisements
Similar presentations
DATA, TEXT, AND WEB MINING
Advertisements

Week 9 Data Mining System (Knowledge Data Discovery)
Data Mining Knowledge Discovery in Databases Data 31.
Data Mining By Archana Ketkar.
Business Intelligence: A Managerial Perspective on Analytics (3rd Edition) Chapter 4: Data Mining.
Data Mining – Intro.
Chapter 5 Data mining : A Closer Look.
Data Mining & Data Warehousing PresentedBy: Group 4 Kirk Bishop Joe Draskovich Amber Hottenroth Brandon Lee Stephen Pesavento.
TURKISH STATISTICAL INSTITUTE INFORMATION TECHNOLOGIES DEPARTMENT (Muscat, Oman) DATA MINING.
Chapter 5: Data Mining for Business Intelligence
Chapter 4 Data, Text, and Web Mining
Data Mining By Andrie Suherman. Agenda Introduction Major Elements Steps/ Processes Tools used for data mining Advantages and Disadvantages.
Decision Support and Business Intelligence Systems (9 th Ed., Prentice Hall) Chapter 5: Data Mining for Business Intelligence.
OLAM and Data Mining: Concepts and Techniques. Introduction Data explosion problem: –Automated data collection tools and mature database technology lead.
Data Mining Week 10.
Chapter 5: Data Mining for Business Intelligence
Knowledge Discovery & Data Mining process of extracting previously unknown, valid, and actionable (understandable) information from large databases Data.
Chapter 5: Data Mining for Business Intelligence
Chapter 5: Data Mining for Business Intelligence
Data Mining Techniques
MAKING THE BUSINESS BETTER Presented By Mohammed Dwikat DATA MINING Presented to Faculty of IT MIS Department An Najah National University.
Data Mining. 2 Models Created by Data Mining Linear Equations Rules Clusters Graphs Tree Structures Recurrent Patterns.
Decision Support and Business Intelligence Systems (9 th Ed., Prentice Hall) Chapter 5: Data Mining for Business Intelligence.
Intelligent Systems Lecture 23 Introduction to Intelligent Data Analysis (IDA). Example of system for Data Analyzing based on neural networks.
Kansas State University Department of Computing and Information Sciences CIS 830: Advanced Topics in Artificial Intelligence From Data Mining To Knowledge.
Data Mining Techniques As Tools for Analysis of Customer Behavior
Chapter 5: Data Mining for Business Intelligence
Chapter 7 DATA, TEXT, AND WEB MINING Pages , 311, Sections 7.3, 7.5, 7.6.
Chapter 4: Data Mining for Business Intelligence
Data Mining CS157B Fall 04 Professor Lee By Yanhua Xue.
Introduction to Data Mining Group Members: Karim C. El-Khazen Pascal Suria Lin Gui Philsou Lee Xiaoting Niu.
Knowledge Discovery and Data Mining Evgueni Smirnov.
Data Mining Chapter 1 Introduction -- Basic Data Mining Tasks -- Related Concepts -- Data Mining Techniques.
Knowledge Discovery and Data Mining Evgueni Smirnov.
Guest Lecture Introduction to Data Mining Dr. Bhavani Thuraisingham September 17, 2010.
Advanced Database Course (ESED5204) Eng. Hanan Alyazji University of Palestine Software Engineering Department.
Data mining. Data mining, at its core, is the transformation of large amounts of data into meaningful patterns and rules.
CRM - Data mining Perspective. Predicting Who will Buy Here are five primary issues that organizations need to address to satisfy demanding consumers:
Chapter 5: Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization DECISION SUPPORT SYSTEMS AND BUSINESS.
Data Mining BY JEMINI ISLAM. Data Mining Outline: What is data mining? Why use data mining? How does data mining work The process of data mining Tools.
CHAPTER 4 Data Warehousing, Access, Analysis, Mining, and Visualization 2 1.
Decision Support and Business Intelligence Systems (9 th Ed., Prentice Hall) Chapter 5: Data Mining for Business Intelligence.
Chapter 14 Data Mining Transparencies. 2 Chapter Objectives u The concepts associated with data mining. u The main features of data mining operations,
Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 5-1 Data Mining Methods: Classification Most frequently used DM method Employ supervised.
MIS2502: Data Analytics Advanced Analytics - Introduction.
Academic Year 2014 Spring Academic Year 2014 Spring.
Monday, February 22,  The term analytics is often used interchangeably with:  Data science  Data mining  Knowledge discovery  Extracting useful.
Chapter 5: Data Mining.
Artificial Neural Networks for Data Mining. Copyright © 2011 Pearson Education, Inc. Publishing as Prentice Hall 6-2 Learning Objectives Understand the.
Chapter 2 Data, Text, and Web Mining. Data Mining Concepts and Applications  Data mining (DM) A process that uses statistical, mathematical, artificial.
Decision Support and Business Intelligence Systems (9 th Ed., Prentice Hall) Chapter 5: Data Mining for Business Intelligence.
Copyright © 2014 Pearson Education, Inc. 5-1 DATA MINING.
Copyright © 2011 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Chapter 28 Data Mining Concepts.
Business Intelligence and Decision Support Systems (9 th Ed., Prentice Hall) Chapter 6: Artificial Neural Networks for Data Mining.
Decision Support Systems Data Mining for Business Intelligence.
Decision Support and Business Intelligence Systems (9 th Ed., Prentice Hall) Chapter 5: Data Mining for Business Intelligence.
Decision Support and Business Intelligence Systems (9 th Ed., Prentice Hall) Chapter 5: Data Mining for Business Intelligence.
Business Intelligence and Decision Support Systems (9 th Ed., Prentice Hall) Chapter 6: Artificial Neural Networks for Data Mining.
Business Intelligence: A Managerial Approach (2nd Edition)
MIS2502: Data Analytics Advanced Analytics - Introduction
DATA MINING © Prentice Hall.
Data and Applications Security Introduction to Data Mining
Chapter 5: Data Mining for Business Intelligence
Predictive Analytics I: Data Mining Process, Methods, and Algorithms
Supporting End-User Access
Business Intelligence: A Managerial Approach (2nd Edition)
Data Warehousing Data Mining Privacy
Business Intelligence
Presentation transcript:

Chapter 4: Data Mining for Business Intelligence

Data Mining Concepts and Definitions Why Data Mining? More intense competition at the global scale Recognition of the value in data sources Availability of quality data on customers, vendors, transactions, Web, etc. Consolidation and integration of data repositories into data warehouses The exponential increase in data processing and storage capabilities; and decrease in cost Movement toward conversion of information resources into nonphysical form

Definition of Data Mining The nontrivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data stored in structured databases - Fayyad et al., (1996) Keywords in this definition: Process, nontrivial, valid, novel, potentially useful, understandable Data mining: a misnomer? Other names: knowledge extraction, pattern analysis, knowledge discovery, information harvesting, pattern searching, data dredging

Data Mining at the Intersection of Many Disciplines

Data Mining Characteristics/Objectives Source of data for DM is often a consolidated data warehouse (not always!). DM environment is usually a client-server or a Web-based information systems architecture. Data is the most critical ingredient for DM which may include soft/unstructured data. The miner is often an end user. Striking it rich requires creative thinking. Data mining tools’ capabilities and ease of use are essential (Web, Parallel processing, etc.).

Data in Data Mining Data: a collection of facts usually obtained as the result of experiences, observations, or experiments Data may consist of numbers, words, and images Data: lowest level of abstraction (from which information and knowledge are derived) DM with different data types? - Other data types?

What Does DM Do? How Does it Work? DM extracts patterns from data Pattern? A mathematical (numeric and/or symbolic) relationship among data items Types of patterns Association: (Beer & diapers in a markets basket analysis) Prediction: Predicts future occurrences based on the past (Super Bowl winner, temperature on a specific day) Cluster: (segmentation based on demographics or past purchase behavior) Sequential (or time series) relationships: existing bank customer with checking account will open savings account within a year

A Taxonomy for Data Mining Tasks

Other Data Mining Tasks These are in addition to the primary DM tasks (prediction, association, clustering) Time-series forecasting Part of sequence or link analysis? Visualization Another data mining task? Types of DM Hypothesis-driven data mining Discovery-driven data mining

Data Mining Applications Customer Relationship Management Maximize return on marketing campaigns Improve customer retention (churn analysis) Maximize customer value (cross- or up-selling) Identify and treat most valued customers Banking & Other Financial Automate the loan application process Detecting fraudulent transactions Maximize customer value (cross- and up-selling) Optimizing cash reserves with forecasting

Data Mining Applications (cont.) Retailing and Logistics Optimize inventory levels at different locations Improve the store layout and sales promotions Optimize logistics by predicting seasonal effects Minimize losses due to limited shelf life Manufacturing and Maintenance Predict/prevent machinery failures Identify anomalies in production systems to optimize manufacturing capacity Discover novel patterns to improve product quality

Data Mining Applications (cont.) Brokerage and Securities Trading Predict changes on certain bond prices Forecast the direction of stock fluctuations Assess the effect of events on market movements Identify and prevent fraudulent activities in trading Insurance Forecast claim costs for better business planning Determine optimal rate plans Optimize marketing to specific customers Identify and prevent fraudulent claim activities

Data Mining Applications (cont.) Computer hardware and software Science and engineering Government and defense Homeland security and law enforcement Travel industry Healthcare Medicine Entertainment industry Sports Etc. Highly popular application areas for data mining

Data Mining Methods: Classification Most frequently used DM method Part of the machine-learning family Employ supervised learning Learn from past data, classify new data The output variable is categorical (nominal or ordinal) in nature Classification versus regression? Classification versus clustering?

Classification Techniques Decision tree analysis Statistical analysis Neural networks Support vector machines Case-based reasoning Bayesian classifiers Genetic algorithms Rough sets

Decision Trees Employs the divide and conquer method Recursively divides a training set until each division consists of examples from one class Create a root node and assign all of the training data to it. Select the best splitting attribute. Add a branch to the root node for each value of the split. Split the data into mutually exclusive subsets along the lines of the specific split. Repeat the steps 2 and 3 for each and every leaf node until the stopping criteria is reached. A general algorithm for decision tree building

Source: KDNuggets.com, May 2009 Data Mining Software Commercial IBM SPSS Modeler (formerly Clementine) SAS – Enterprise Miner IBM – Intelligent Miner StatSoft – Statistica Data Miner … many more Free and/or Open Source RapidMiner Weka Source: KDNuggets.com, May 2009

Data Mining Myths Data mining … provides instant solutions/predictions. is not yet viable for business applications. requires a separate, dedicated database. can only be done by those with advanced degrees. is only for large firms that have lots of customer data. is another name for good-old statistics.

Common Data Mining Blunders Selecting the wrong problem for data mining Ignoring what your sponsor thinks data mining is and what it really can/cannot do Not leaving sufficient time for data acquisition, selection and preparation Looking only at aggregated results and not at individual records/predictions Being sloppy about keeping track of the data mining procedure and results

Common Data Mining Mistakes Ignoring suspicious (good or bad) findings and quickly moving on Running mining algorithms repeatedly and blindly, without thinking about the next stage Naively believing everything you are told about the data Naively believing everything you are told about your own data mining analysis Measuring your results differently from the way your sponsor measures them