Department of Computer Science Sir Syed University of Engineering & Technology, Karachi-Pakistan. Presentation Title: DATA MINING Submitted By.



• What is data mining?
• Data mining consists of five major elements
• Why mine data?
  • Commercial viewpoint
  • Scientific viewpoint
• Some of the techniques used for data mining

• Data mining, also known as Knowledge Discovery in Databases (KDD), is the process of automatically searching large volumes of data for patterns.
• It is the process of extracting knowledge from large datasets.
  • Extremely large datasets.
  • Useful knowledge that can improve processes.

• Extract, transform, and load transaction data onto the data warehouse system.
• Store and manage the data in a multidimensional database system.
• Provide data access to business analysts and information technology professionals.
• Analyze the data with application software.
• Present the data in a useful format, such as a graph or table.

• Lots of data is being collected and warehoused
  • Web data, e-commerce
  • Purchases at department/grocery stores
  • Bank/credit card transactions
• Computers have become cheaper and more powerful
• Competitive pressure is strong
  • Provide better, customized services for an edge (e.g. in Customer Relationship Management)

• Data collected and stored at enormous speeds (GB/hour)
  • Remote sensors on a satellite
  • Telescopes scanning the skies
  • Microarrays generating gene expression data
  • Scientific simulations generating terabytes of data
• Traditional techniques infeasible for raw data
• Data mining may help scientists
  • In classifying and segmenting data

• Artificial neural networks - Neural networks are useful for pattern recognition and data classification through a learning process. They are non-linear predictive models that learn through training and resemble biological neural networks in structure.

• Neural networks map a set of input nodes to a set of output nodes
• The number of inputs/outputs is variable
• The network itself is composed of an arbitrary number of nodes with an arbitrary topology
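As a rough illustration of how a network maps input nodes to output nodes, here is a minimal sketch of a forward pass. It assumes a layered (feedforward) topology and a sigmoid activation, neither of which is stated in the slides, and the weights are hypothetical hand-picked values rather than trained ones.

```python
import math


def sigmoid(x):
    """Squash a real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, layers):
    """Propagate input values through a list of layers.

    Each layer is a list of (weights, bias) pairs, one pair per node.
    Assumption: a plain feedforward topology; the slides allow any topology.
    """
    activations = inputs
    for layer in layers:
        activations = [
            sigmoid(sum(w * a for w, a in zip(weights, activations)) + bias)
            for weights, bias in layer
        ]
    return activations


# Hypothetical network: 2 input nodes, 2 hidden nodes, 1 output node.
hidden = [([1.0, -1.0], 0.0), ([0.5, 0.5], -0.5)]
output = [([1.0, 1.0], 0.0)]
result = forward([1.0, 0.0], [hidden, output])
```

Changing the number of `(weights, bias)` pairs in the first or last layer changes the number of inputs or outputs, which is the sense in which that number is variable.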

• Decision trees - Tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset.

height  hair   eyes   class
short   blond  blue   A
tall    blond  brown  B
tall    red    blue   A
short   dark   blue   B
tall    dark   blue   B
tall    blond  blue   A
tall    dark   brown  B
short   blond  brown  B

[Figure: decision tree for the table above]
hair
  dark  -> B
  red   -> A
  blond -> eyes
             blue  -> A
             brown -> B
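The tree can be read as a set of classification rules. The sketch below writes those rules as a plain function and checks them against the eight records from the table; the function and variable names are illustrative only.

```python
# The eight records from the slide: (height, hair, eyes, class).
records = [
    ("short", "blond", "blue", "A"),
    ("tall", "blond", "brown", "B"),
    ("tall", "red", "blue", "A"),
    ("short", "dark", "blue", "B"),
    ("tall", "dark", "blue", "B"),
    ("tall", "blond", "blue", "A"),
    ("tall", "dark", "brown", "B"),
    ("short", "blond", "brown", "B"),
]


def classify(height, hair, eyes):
    """Apply the tree: split on hair first, then on eyes for blond hair."""
    if hair == "dark":
        return "B"
    if hair == "red":
        return "A"
    # Remaining case is blond hair, which is decided by eye color.
    return "A" if eyes == "blue" else "B"


# Count how many records the rules classify correctly.
correct = sum(classify(h, hr, e) == c for h, hr, e, c in records)
```

Note that `height` is never consulted: in this dataset hair and eye color are enough to separate the two classes.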

• Nearest neighbor method - A classification technique that classifies each record based on the records most similar to it in a historical database.
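A minimal sketch of this idea follows. It assumes numeric features, Euclidean distance as the similarity measure, and a majority vote among the k closest records; the slides do not fix any of these choices, and the `history` data is made up for illustration.

```python
import math
from collections import Counter


def knn_classify(history, query, k=3):
    """Label `query` by majority vote among its k nearest historical records.

    history: list of (features, label) pairs; features are numeric tuples.
    """
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical historical database with two well-separated classes.
history = [
    ((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
    ((5.0, 5.0), "B"), ((5.2, 4.9), "B"), ((4.8, 5.1), "B"),
]

label_near_a = knn_classify(history, (1.1, 1.0))
label_near_b = knn_classify(history, (5.1, 5.0))
```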

CLUSTERING

Clustering can be considered the most important unsupervised learning technique. Like every other problem of this kind, it deals with finding structure in a collection of unlabeled data. Clustering is “the process of organizing objects into groups whose members are similar in some way”. A cluster is therefore a collection of objects that are “similar” to one another and “dissimilar” to the objects belonging to other clusters.

The greater the similarity (or homogeneity) within a group, and the greater the difference between groups, the “better” or more distinct the clustering.
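One possible way to put numbers on this is sketched below. It assumes two measures that the slides do not name: the within-cluster sum of squared distances to the centroid (smaller means more homogeneous groups) and the distance between centroids (larger means more distinct groups). The four points are hypothetical.

```python
import math


def centroid(points):
    """Mean of a list of equal-length numeric tuples."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def within_ss(clusters):
    """Total squared distance from each point to its own cluster centroid."""
    total = 0.0
    for pts in clusters:
        c = centroid(pts)
        total += sum(math.dist(p, c) ** 2 for p in pts)
    return total


def separation(cluster_a, cluster_b):
    """Distance between the centroids of two clusters."""
    return math.dist(centroid(cluster_a), centroid(cluster_b))


# The same four points grouped two different ways.
good = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
bad = [[(0.0, 0.0), (10.0, 0.0)], [(0.0, 2.0), (10.0, 2.0)]]
```

For these points, the `good` grouping has a within-cluster sum of 4 and a centroid separation of 10, while the `bad` grouping has 100 and 2, so the first grouping is both tighter and more distinct.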

A few good reasons...
• Simplifications
• Pattern detection

Basic K-means algorithm for finding K clusters:
1. Select K points as the initial centroids.
2. Assign all points to the closest centroid.
3. Recompute the centroid of each cluster.
4. Repeat steps 2 and 3 until the centroids don’t change.
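The four steps can be sketched in plain Python as follows. This is an illustrative version under a few assumptions: distances are Euclidean, the initial centroids are passed in by the caller (step 1) so the example is deterministic, an empty cluster keeps its previous centroid, and `max_iter` is a safety limit that the slides do not mention.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Basic K-means; K is the number of initial centroids supplied."""
    centroids = list(initial_centroids)  # Step 1: chosen by the caller.
    clusters = []
    for _ in range(max_iter):
        # Step 2: assign every point to its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            idx = min(range(len(centroids)),
                      key=lambda i: math.dist(p, centroids[i]))
            clusters[idx].append(p)
        # Step 3: recompute each centroid as the mean of its cluster.
        new_centroids = [
            tuple(sum(coord) / len(pts) for coord in zip(*pts))
            if pts else centroids[i]
            for i, pts in enumerate(clusters)
        ]
        # Step 4: stop once the centroids no longer change.
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical data: two obvious groups, K = 2.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 0.0)])
```

The result depends on the initial centroids: starting this same data from `(0.0, 0.0)` and `(0.0, 2.0)` converges to a different, poorer grouping, which is why implementations commonly try several starting points.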