
Presentation transcript:

Chapter Structure
- 1 Course Introduction
- 2 Oracle Introduction
- 3 Objects (Views, Synonyms, Sequences)
- 4 PL/SQL blocks
- 5 Procedures, Triggers
- 6 Enhanced SQL programming
- 7 SQL & .NET applications
- 8 OEM, DB structure
- 9 DB security
- 10 Backup, Recovery
- 11 Large object
- 12 Transaction Management
- 14 Data Mining
- 15 Data Warehousing
Chapter groups shown on the slide: Advanced SQL, DB Admin, Advanced DB Concepts, New Trends.

Data Mining – Business Intelligence
- Data explosion problem: we are drowning in data, but starving for knowledge!
- Finding interesting structure in data (data-driven decision-making practices; see BBC Horizon, "Age of Big Data")
- Structure refers to statistical patterns, predictive models, and hidden relationships
- The aim is to provide knowledge that gives a company a competitive advantage, enabling it to earn a greater profit

Purpose of Data Mining
Goals of data mining:
- Predict the future behavior of attributes
- Classify items, placing them in the proper categories
- Identify the existence of an activity or an event
- Optimize the use of the organization's resources

Applications of Data Mining
- Retailing
  - Customer relations management (CRM)
  - Advertising campaign management
- Banking and Finance
  - Credit scoring
  - Fraud detection and prevention
- Manufacturing
  - Optimizing use of resources
  - Manufacturing process optimization
  - Product design
- Medicine
  - Determining effectiveness of treatments
  - Analyzing effects of drugs
  - Finding relationships between patient care and outcomes
- Higher Education (academic analytics)
  - Which students will enroll in particular course programs
  - Which students will need assistance in order to graduate

Commercial Support and Job Market
- Many data mining tools
- Database systems with data mining support
  - Oracle 10g, 11g
  - SQL Server 2005, 2008
- Hot topic
  - members by April 14, 2009

BI Market
- Worldwide BI software revenue is forecast to reach almost US$12.5 billion in 2012, up 7.2 percent over last year.
- The global BI software and services market will rapidly expand from $79 billion in 2012 to $143 billion in 2016.

Data Mining and Business Intelligence
Layers of the diagram, listed from bottom to top in order of increasing potential to support business decisions:
- Data Sources: paper, files, database systems, OLTP, WWW
- Data Warehouses/Data Marts: OLAP, MDA
- Data Exploration: statistical analysis, reporting
- Data Mining: information discovery
- Data Presentation: visualization
- Making Decisions
Roles shown alongside the layers: End User, Business Analyst, Data Analyst, DBA.

Data Mining Methods (6 basic classes)
- Associations
  - Finding rules like "if the customer buys frozen pizza, sausage, and beer, then the probability that he/she buys potato chips is 50%"
- Classifications
  - Classify data based on the values of the decision attribute, e.g. classify patients based on their "state"
- Clustering
  - Group data to form new classes, e.g. cluster customers based on their behavior to find common patterns

Data Mining Methods
- Sequential patterns
  - Finding rules like "if the customer buys a TV and, a few days later, a camera, then the probability that he/she will buy a video within 1 month is 50%"
- Time-series similarities
  - Finding similar sequences (or subsequences) in time series (e.g. stock analysis)
- Deviation detection
  - Finding anomalies/exceptions/deviations in data

Association and Classification Rules
- Association rules have the form {x} → {y}, where x and y are events that occur at the same time.
  - They have measures of support and confidence.
  - Support is the percentage of transactions that contain all items included in both the left and right sides.
  - Confidence is how often the rule proves to be true: of the transactions in which the left-hand side of the implication is present, the percentage in which the right-hand side is present as well.
- Classification rules place instances into the correct one of several possible categories.
  - They are developed using a training set: past instances for which the correct classification is known.
  - The system develops a method for correctly classifying a new item whose class is currently unknown.
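The support and confidence measures defined above can be sketched in a few lines of Python. This is only a minimal illustration: the baskets, the item names, and the helper functions `support` and `confidence` are made up for the example and do not come from the slides.

```python
# Toy market baskets; items and counts are invented for illustration only.
transactions = [
    {"pizza", "beer", "chips"},
    {"pizza", "beer"},
    {"pizza", "chips"},
    {"beer", "chips"},
    {"pizza", "beer", "chips", "sausage"},
]


def support(lhs, rhs, baskets):
    """Fraction of all baskets that contain every item of lhs and rhs."""
    both = lhs | rhs
    return sum(both <= basket for basket in baskets) / len(baskets)


def confidence(lhs, rhs, baskets):
    """Among baskets that contain lhs, the fraction that also contain rhs."""
    with_lhs = [basket for basket in baskets if lhs <= basket]
    if not with_lhs:
        return 0.0  # the rule never fires, so there is nothing to measure
    return sum(rhs <= basket for basket in with_lhs) / len(with_lhs)


# Rule under test: {pizza, beer} -> {chips}
lhs, rhs = {"pizza", "beer"}, {"chips"}
rule_support = support(lhs, rhs, transactions)
rule_confidence = confidence(lhs, rhs, transactions)
print(f"support={rule_support:.2f} confidence={rule_confidence:.2f}")
```

Here two of the five baskets contain all three items, and two of the three baskets containing pizza and beer also contain chips, so the rule has lower support than confidence.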

Sequential Patterns
- Sequential patterns: e.g. the prediction that a customer who buys a particular product in one transaction will purchase a related product in a later transaction
- Can involve a set of products
- Patterns are represented as sequences {S1}, {S2}
- The first subsequence {S1} is a predictor of the second subsequence {S2}
- Support is the percentage of times such a sequence occurs in the set of transactions
- Confidence is the probability that when {S1} occurs, {S2} will occur on a subsequent transaction; it can be calculated from observed data
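As a rough sketch of how these two measures could be computed from observed data, the Python below counts, per customer, whether {S1} is followed by {S2} in a later transaction. The purchase histories are invented, and counting once per customer history is an assumption made for this example; other counting conventions are possible.

```python
# Each customer maps to an ordered list of transactions (sets of items).
# The data is made up for illustration.
histories = {
    "c1": [{"tv"}, {"camera"}, {"video"}],
    "c2": [{"tv"}, {"cables"}],
    "c3": [{"camera"}, {"tv"}],
    "c4": [{"tv", "cables"}, {"camera"}],
    "c5": [{"radio"}],
}


def first_index(history, itemset):
    """Index of the first transaction containing itemset, or None."""
    for i, transaction in enumerate(history):
        if itemset <= transaction:
            return i
    return None


def follows(history, s1, s2):
    """True if s2 appears in a transaction strictly after the first s1."""
    i = first_index(history, s1)
    if i is None:
        return False
    return any(s2 <= transaction for transaction in history[i + 1:])


def sequence_measures(all_histories, s1, s2):
    """Return (support, confidence) for the pattern s1 followed by s2."""
    total = len(all_histories)
    with_s1 = [h for h in all_histories.values() if first_index(h, s1) is not None]
    with_both = [h for h in with_s1 if follows(h, s1, s2)]
    support = len(with_both) / total
    confidence = len(with_both) / len(with_s1) if with_s1 else 0.0
    return support, confidence


seq_support, seq_confidence = sequence_measures(histories, {"tv"}, {"camera"})
print(f"support={seq_support:.2f} confidence={seq_confidence:.2f}")
```

Customer c3 bought both items but in the wrong order, so that history counts toward the {S1} total without supporting the pattern.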

Time Series Patterns
- A time series is a sequence of events that are all of the same type
- Sales figures, stock prices, interest rates, inflation rates, and many other quantities can be analyzed using time series
- Time series data can be studied to discover patterns and sequences
- For example, we can look at the data to find the longest period when the figures continued to rise each month, or find the steepest decline from one month to the next
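The two example queries above (longest rising period, steepest month-to-month decline) can be sketched as follows. The monthly figures and the function names are illustrative assumptions, and both functions assume the series has at least two values.

```python
# Invented monthly sales figures, one value per month.
monthly_sales = [120, 125, 131, 128, 135, 142, 150, 149, 110, 115]


def longest_rising_run(values):
    """Return (start, end) indices of the longest strictly rising stretch."""
    best_start, best_end = 0, 0
    start = 0
    for i in range(1, len(values)):
        if values[i] <= values[i - 1]:
            start = i  # the rise is broken, so a new run starts here
        if i - start > best_end - best_start:
            best_start, best_end = start, i
    return best_start, best_end


def steepest_decline(values):
    """Return (index, drop) for the largest fall from values[index] to the next value."""
    drops = [(values[i] - values[i + 1], i) for i in range(len(values) - 1)]
    drop, index = max(drops)
    return index, drop


run_start, run_end = longest_rising_run(monthly_sales)
drop_index, drop_size = steepest_decline(monthly_sales)
print("longest rise:", monthly_sales[run_start:run_end + 1])
print("steepest decline:", monthly_sales[drop_index], "->", monthly_sales[drop_index + 1])
```

If the series never falls, `steepest_decline` returns a drop that is zero or negative, which a caller would need to check for.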

Data Mining Methods: Regression
- A statistical method for predicting the value of an attribute, Y (the dependent variable), given the values of attributes X1, X2, …, Xn (the independent variables)
- Statistical packages allow users to identify potential factors for predicting the value of the dependent variable
- Using linear regression, the package finds the contribution or weight of each independent variable, as coefficients a0, a1, …, an for a linear function Y = a0 + a1 X1 + a2 X2 + … + an Xn
- Can also use non-linear regression, using curve-fitting: finding the equation of the curve that fits the observed values
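To make the linear function concrete, here is a minimal sketch for the simplest case, a single independent variable (n = 1), using the closed-form least-squares estimates for a0 and a1. The data points and the function name `fit_line` are invented for the example; a statistical package would handle the general case with several variables.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a0, a1) for the model Y = a0 + a1 * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of X, and of co-deviations of X and Y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a1 = sxy / sxx           # weight of the independent variable
    a0 = mean_y - a1 * mean_x  # intercept
    return a0, a1


# Made-up observations that lie roughly on the line Y = 3 + 2X.
xs = [1, 2, 3, 4, 5]
ys = [5.1, 6.9, 9.2, 10.8, 13.0]

a0, a1 = fit_line(xs, ys)
predicted = a0 + a1 * 6  # predict Y for an unseen value X = 6
print(f"Y = {a0:.2f} + {a1:.2f} X, prediction at X=6: {predicted:.2f}")
```

The function divides by the spread of X, so it assumes the X values are not all identical.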

Neural Networks
- Methods from AI using a set of samples to find the strongest relationships between variables and observations
- Use a learning method, adapting as they learn new information
- Hidden layers are developed by the system as it examines cases, using a generalized regression technique
- The system refines its hidden layers until it has learned to predict correctly a certain percentage of the time; then test cases are provided to evaluate it
- Problems:
  - Overfitting the curve: the prediction function fits the training set values too perfectly, even ones that are incorrect (data noise)
  - Knowledge of how the system makes its predictions is in the hidden layers
  - Output may be difficult to understand and interpret
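The train-until-accurate-then-test cycle described above can be illustrated with a deliberately simplified stand-in: a single artificial neuron (a perceptron) with no hidden layers at all. This is not a full neural network, and the training cases, test cases, target accuracy, and function names are all assumptions made for the sketch.

```python
def predict(weights, bias, x):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else 0."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0


def accuracy(weights, bias, cases):
    """Fraction of (inputs, label) cases predicted correctly."""
    hits = sum(predict(weights, bias, x) == label for x, label in cases)
    return hits / len(cases)


def train(cases, target_accuracy, max_epochs=100, rate=1.0):
    """Adjust weights case by case until the training accuracy reaches the target."""
    weights, bias = [0.0] * len(cases[0][0]), 0.0
    for epoch in range(1, max_epochs + 1):
        for x, label in cases:
            error = label - predict(weights, bias, x)
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
            bias += rate * error
        if accuracy(weights, bias, cases) >= target_accuracy:
            return weights, bias, epoch
    return weights, bias, max_epochs


# Training set: output is 1 only when both inputs are 1 (invented example).
training_cases = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
# Held-out test cases with made-up labels, used only for evaluation.
test_cases = [((0.9, 0.9), 1), ((0.1, 0.2), 0), ((1.0, 0.1), 0), ((0.2, 1.0), 0)]

weights, bias, epochs_used = train(training_cases, target_accuracy=1.0)
train_accuracy = accuracy(weights, bias, training_cases)
test_accuracy = accuracy(weights, bias, test_cases)
print(f"epochs={epochs_used} train={train_accuracy:.2f} test={test_accuracy:.2f}")
```

Comparing the training accuracy with the accuracy on the held-out cases is the step that reveals overfitting: a model can score perfectly on the cases it learned from and still misjudge new ones.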

Clustering
- Methods used to place cases into clusters or groups that can be disjoint or overlapping
- Using a training set, the system identifies a set of clusters into which the tuples of the database can be grouped
- Tuples in each cluster are similar, and they are dissimilar to tuples in other clusters
- Similarity is measured by using a distance function defined for the data
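One common way to form disjoint clusters with a distance function is k-means, sketched below. The slides do not name a specific algorithm, so k-means is a swapped-in choice here; the points, the starting centroids, and the use of Euclidean distance (`math.dist`, available from Python 3.8) are likewise assumptions for the example.

```python
import math


def assign(points, centroids):
    """Put each point into the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for point in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(point, centroids[i]))
        clusters[nearest].append(point)
    return clusters


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, centroids, max_rounds=20):
    """Alternate assignment and centroid update until the centroids stop moving."""
    clusters = assign(points, centroids)
    for _ in range(max_rounds):
        updated = [mean_point(c) if c else old
                   for c, old in zip(clusters, centroids)]
        if updated == centroids:
            break
        centroids = updated
        clusters = assign(points, centroids)
    return centroids, clusters


# Made-up tuples with two numeric attributes each.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
# Starting centroids picked by hand; real tools usually choose them randomly.
centroids, clusters = k_means(points, [(1, 1), (8, 8)])
for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

The choice of distance function drives the result: swapping `math.dist` for another measure would change which tuples count as similar.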

Data Mining Process
- Data preprocessing
  - Data selection: identify target datasets and relevant fields
  - Data cleaning: remove noise and outliers
  - Data transformation: create common units; generate new fields
- Data mining model construction
- Model evaluation
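The three preprocessing steps can be sketched as a small pipeline. Everything specific here is hypothetical: the patient-style records, the field names, the plausible height range used to drop outliers, and the derived `bmi` field are chosen only to show selection, cleaning, and transformation in order.

```python
# Invented raw records; "note" is an irrelevant field, row 3 has a missing
# weight, and row 4 has an implausible height (a likely typing error).
raw_rows = [
    {"id": 1, "weight": 70.0, "unit": "kg", "height_cm": 175, "note": "x"},
    {"id": 2, "weight": 154.0, "unit": "lb", "height_cm": 160, "note": ""},
    {"id": 3, "weight": None, "unit": "kg", "height_cm": 180, "note": ""},
    {"id": 4, "weight": 68.0, "unit": "kg", "height_cm": 1700, "note": ""},
]

RELEVANT_FIELDS = ("id", "weight", "unit", "height_cm")
KG_PER_LB = 0.45359237
PLAUSIBLE_HEIGHT_CM = (50, 250)  # assumed range for this example


def select(rows):
    """Data selection: keep only the relevant fields."""
    return [{field: row[field] for field in RELEVANT_FIELDS} for row in rows]


def clean(rows):
    """Data cleaning: drop rows with missing values or out-of-range heights."""
    low, high = PLAUSIBLE_HEIGHT_CM
    return [row for row in rows
            if row["weight"] is not None and low <= row["height_cm"] <= high]


def transform(rows):
    """Data transformation: convert weights to kilograms and add a new field."""
    result = []
    for row in rows:
        factor = KG_PER_LB if row["unit"] == "lb" else 1.0
        weight_kg = row["weight"] * factor
        height_m = row["height_cm"] / 100
        result.append({
            "id": row["id"],
            "weight_kg": weight_kg,
            "height_cm": row["height_cm"],
            "bmi": weight_kg / height_m ** 2,  # generated field
        })
    return result


prepared = transform(clean(select(raw_rows)))
for row in prepared:
    print(row)
```

The prepared rows would then feed the model construction and evaluation steps, which are outside the scope of this sketch.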