Waqas Haider Bangyal. Source materials: “Data Mining: Concepts and Techniques” by Jiawei Han & Micheline Kamber, Second Edition, Morgan Kaufmann, 2006.


Waqas Haider Bangyal

Source Materials
- “Data Mining: Concepts and Techniques” by Jiawei Han & Micheline Kamber, Second Edition, Morgan Kaufmann, 2006
- “Data Mining: Introductory and Advanced Topics” by Margaret H. Dunham, Prentice Hall, 2003

What Is Data Mining?
- Data mining is the process of sorting through large amounts of data and picking out relevant information.
- The extraction of knowledge from data is called data mining.
- Data mining can also be defined as the exploration and analysis of large quantities of data in order to discover meaningful patterns and rules.
- The ultimate goal of data mining is to discover knowledge.

Data Rich, Information Poor

Motivation
- Lots of data is being collected and warehoused:
  - Web data, e-commerce
  - Purchases at department/grocery stores
  - Bank/credit card transactions
- Computers have become cheaper and more powerful.
- Data is collected and stored at enormous speeds (GB/hour):
  - Remote sensors on a satellite
  - Telescopes scanning the skies

Motivation
- Traditional techniques are infeasible for raw data.
- Human analysts may take weeks to discover useful information.
- We are drowning in data, but starving for knowledge!
- Data mining may help scientists in classifying and segmenting data.

Motivation
- To which class does this star belong?
- Such an analysis can no longer be conducted manually.
- Huge amounts of data are automatically collected.

Why is data mining important?
- Rapid computerization of businesses produces huge amounts of data.
- How can we make the best use of that data?
- A growing realization: knowledge discovered from data can be used for competitive advantage.

Evolution of Database Technology
- 1960s: Data collection, database creation, IMS and network DBMS
- 1970s: Relational data model, relational DBMS implementation
- 1980s: RDBMS, advanced data models (extended-relational, OO, deductive, etc.) and application-oriented DBMS (spatial, scientific, engineering, etc.)
- 1990s to 2000s: Data mining and data warehousing, multimedia databases, and Web databases

Evolution of Database Technology

| Evolutionary Step | Business Question | Enabling Technologies | Product Providers | Characteristics |
| --- | --- | --- | --- | --- |
| Data Collection (1960s) | "What was my total revenue in the last five years?" | Computers, tapes, disks | IBM | Static data delivery |
| Data Access (1980s) | "What were unit sales in New England last March?" | Relational databases (RDBMS), Structured Query Language (SQL), ODBC | Oracle, Sybase, Informix, IBM, Microsoft | Dynamic data delivery at record level |
| Data Warehousing (1990s) | "What were unit sales in New England last March? Drill down to Boston." | Multidimensional databases, data warehouses | Oracle, Pilot | Dynamic data delivery at multiple levels |
| Data Mining (Emerging Today) | "What's likely to happen to Boston unit sales next month? Why?" | Advanced algorithms, massive databases | Pilot, Lockheed, IBM, SGI, numerous startups (nascent industry) | Prospective, proactive information delivery |

Data Warehouse Example
- Data warehousing is defined as a process of centralized data management and retrieval.
- A data warehouse is a repository of information collected from multiple sources, stored under a unified schema, and usually residing at a single site.

The Process of Data Mining
There are 3 main steps in the data mining process:
1. Preparation: data is selected from the warehouse and “cleansed”.
2. Processing: algorithms are used to process the data. This step uses modeling to make predictions.
3. Analysis: output is evaluated.
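
As a rough illustration of the three steps, here is a minimal Python sketch. The records, the field names (`age`, `spend`), and the helper functions `prepare`, `process`, and `analyze` are hypothetical and do not come from the source materials. The "model" is only a mean-based threshold standing in for a real algorithm.

```python
from statistics import mean

# Hypothetical raw records, standing in for data selected from a warehouse.
RAW = [
    {"age": 25, "spend": 120.0},
    {"age": 41, "spend": None},   # missing value, dropped during cleansing
    {"age": 33, "spend": 310.0},
    {"age": 58, "spend": 95.0},
    {"age": 33, "spend": 310.0},  # exact duplicate, dropped during cleansing
]

def prepare(records):
    """Preparation: drop incomplete rows and exact duplicates."""
    seen, clean = set(), []
    for r in records:
        key = (r["age"], r["spend"])
        if None in key or key in seen:
            continue
        seen.add(key)
        clean.append(r)
    return clean

def process(records):
    """Processing: fit a toy model (a spend threshold at the mean)."""
    threshold = mean(r["spend"] for r in records)
    return lambda r: "high" if r["spend"] > threshold else "low"

def analyze(records, model):
    """Analysis: summarize the model output so it can be evaluated."""
    counts = {"high": 0, "low": 0}
    for r in records:
        counts[model(r)] += 1
    return counts

clean = prepare(RAW)
model = process(clean)
summary = analyze(clean, model)
print(summary)
```

In practice each step is far more involved, but the shape is the same: cleansed data goes in, a model is fitted, and its output is inspected before anyone acts on it.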

Reasons for Growing Popularity
- Growing data volume: an enormous amount of existing and newly arriving data requires processing.
- Limitations of human analysis: humans can lack objectivity when analyzing data.
- Low cost of machine learning: the data mining process costs less than hiring highly trained professionals to analyze data.

Applications of Data Mining
Data mining is applied in the following areas:
- Prediction of the stock market: predicting future trends.
- Bankruptcy prediction: prediction based on computer-generated rules, using models.
- Foreign exchange market: data mining is used to identify trading rules.
- Fraud detection: construction of algorithms and models that help recognize a variety of fraud patterns.

Results of Data Mining Include:
- Forecasting what may happen in the future
- Classifying people or things into groups by recognizing patterns
- Clustering people or things into groups based on their attributes
- Associating what events are likely to occur together
- Sequencing what events are likely to lead to later events
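
To make the "associating" result concrete, the sketch below counts how often pairs of items appear together. It is an illustration under assumptions: the baskets are invented, the function name `pair_support` is hypothetical, and the term "support" (the fraction of baskets containing a pair) does not appear in the slides.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets; the items are made up for illustration.
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "milk"},
    {"butter", "jam"},
]

def pair_support(baskets):
    """Return the fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always counted under the same key.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: n / len(baskets) for pair, n in counts.items()}

support = pair_support(BASKETS)
for pair, value in sorted(support.items()):
    print(pair, value)
```

Pairs with high support, such as bread and milk here, are candidates for "events that are likely to occur together".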

Data Mining Functions
Two types of model:
- Predictive models predict unknown values based on known data.
- Descriptive models identify patterns in data.
Each type has several sub-categories, each of which has many algorithms. We won't have time to look at ALL of them in detail.
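
The difference between the two model types can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the data set, the function names `describe` and `predict`, and the choice of a nearest-neighbour rule as the predictive technique, which the slides do not name.

```python
from statistics import mean

# Hypothetical labelled data: (hours_studied, passed_exam).
KNOWN = [(1.0, False), (2.0, False), (6.0, True), (8.0, True)]

def describe(data):
    """Descriptive: summarize a pattern already present in the data."""
    passed = [hours for hours, ok in data if ok]
    failed = [hours for hours, ok in data if not ok]
    return {
        "mean_hours_passed": mean(passed),
        "mean_hours_failed": mean(failed),
    }

def predict(data, hours):
    """Predictive: guess an unknown label from the nearest known example."""
    nearest = min(data, key=lambda row: abs(row[0] - hours))
    return nearest[1]

print(describe(KNOWN))
print(predict(KNOWN, 5.0))
```

`describe` only reports what the known data looks like, while `predict` produces a value for a case that is not in the data.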

Data Mining Functions

Thanks