CSE591: Data Mining by H. Liu

Presentation transcript:

Data Mining, Data Warehousing & Web Mining
Huan Liu, CSE, CEAS, ASU
http://www.public.asu.edu/~hliu

Agenda
- Self-introduction of all
- What is the course about?
- Questions and answers
- Organization

7/30/2019 CSE591: Data Mining by H. Liu

- Contents: Classification, Clustering, Association, Data Warehousing, Web, and Applications
- Format: a seminar course (paper reading, discussion, project, presentation)
- Assessment: class participation, project proposal, presentation, exam or quiz (to be discussed)

Course Format
- Research papers are the main source; they can be found on the course web site.
- One textbook is listed, but it has not arrived yet.
- More material will be added so that related subjects are easier to access.
- Everyone is expected to read the papers and participate in class discussion.
- Presenters will be evaluated on the spot.

Paper presentation
- Each student will be responsible for one paper.
- All are expected to read the paper before the presentation.
  - What is it about?
  - What are the points to discuss and improve?
  - What can we do with it?
- Each presentation is about 45 minutes, including discussion and questions and answers.

Project
- Proposal: presentation, discussion, revision
- A project should be completed within a semester
- Project presentation and demo
- Report

Topic Distribution

Your first assignment
- It is on the course web site.
- Registered students: please sign up for the course mailing list (use an email account you check frequently so you won’t miss important announcements).
- Try to complete your first assignment before the second class.

Introduction
- The need for data mining
- Data mining
- Data warehousing
- Web mining
- Applications

What is data mining
- Data mining is the extraction of useful patterns from data sources, e.g., databases, texts, the web, images.
- Patterns must be:
  - valid
  - novel
  - potentially useful
  - understandable

Patterns are used for
- prediction or classification
- describing the existing data
- segmenting the data (e.g., the market)
- profiling the data (e.g., your customers)
- etc.

Some DM tasks
- Classification: mining patterns that can classify future data into known classes.
- Association rule mining: mining any rule of the form X → Y, where X and Y are sets of data items.
- Clustering: identifying a set of similarity groups in the data.
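To make the rule form X → Y concrete, here is a minimal sketch of how such a rule is commonly scored, using the standard support and confidence measures. The slide itself does not define these measures, and the function names and the market-basket data below are invented for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, x, y):
    """Estimate of P(Y | X), computed as support(X and Y together) / support(X)."""
    x, y = frozenset(x), frozenset(y)
    support_x = support(transactions, x)
    if support_x == 0:
        return 0.0  # X never occurs, so the rule has no evidence
    return support(transactions, x | y) / support_x


# Hypothetical market-basket data (an assumption, not from the slides).
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

# Score the rule {bread} -> {milk}.
rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```

A real association rule miner would also enumerate candidate itemsets and prune them by minimum support; this sketch only scores one rule that is already given.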

- Sequential pattern mining: a sequential rule A → B says that event A will be immediately followed by event B with a certain confidence.
- Deviation detection: discovering the most significant changes in data.
- Data visualization: using graphical methods to show patterns in data.
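The confidence of a sequential rule A → B can be estimated from an event log. The sketch below assumes one simple reading of the definition above: the share of occurrences of A whose very next event is B. The function name and the event log are hypothetical.

```python
def sequential_confidence(sequence, a, b):
    """Share of occurrences of `a` that are immediately followed by `b`.

    An occurrence of `a` at the very end of the sequence has no successor
    and is not counted.
    """
    total = 0
    followed = 0
    for current, nxt in zip(sequence, sequence[1:]):
        if current == a:
            total += 1
            if nxt == b:
                followed += 1
    return followed / total if total else 0.0


# Hypothetical event log for one user (an assumption, not from the slides).
events = ["login", "search", "buy", "login", "search", "logout", "search", "buy"]

# "search" occurs three times and is followed by "buy" twice.
conf_search_buy = sequential_confidence(events, "search", "buy")
print(conf_search_buy)
```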

Why data mining
- Rapid computerization of businesses produces huge amounts of data.
- How can we make the best use of the data?
- A growing realization: knowledge discovered from data can be used for competitive advantage.

Make use of your data assets
- Many interesting things you want to find cannot be found using database queries:
  - “Find me people likely to buy my products”
  - “Who is likely to respond to my promotion?”
- Quickly identify underlying relationships and respond to emerging opportunities.

Why now
- The data is abundant.
- The data is being warehoused.
- The computing power is affordable.
- The competitive pressure is strong.
- Data mining tools have become available.

DM fields
Data mining is an emerging multi-disciplinary field:
- Statistics
- Machine learning
- Databases
- Visualization
- OLAP and data warehousing
- etc.

Summary
- What is data mining?
  - KDD (knowledge discovery in databases): non-trivial extraction of implicit, previously unknown, and potentially useful information
- Why do we need data mining?
  - Wide use of computer systems - data explosion - knowledge is power - but we’re data rich, knowledge lean - actionability ...

Data Warehousing
- What is data warehousing?
  - A repository of integrated, analysis-oriented, historical, read-only data, designed for decision support and KDD systems
- Why do we need data warehousing?
  - Operational systems were never designed for KDD; they are numerous, of different types, with overlapping or contrary definitions

In today’s world, the competitive edge is coming less from optimization and more from the proactive use of the information that these systems have been collecting over the years. Companies are beginning to realize the vast potential of the information that they hold in their organizations. If they can tap into this information, they can significantly improve the quality of their decision making and the profitability of the organization through focused actions. The problem for most companies, though, is that their operational systems were never designed to support this kind of activity, and probably never can be. Operational systems also tend to be numerous, with overlapping and sometimes contrary definitions.

An Overview of the KDD Process (Guess which is which)
- Various databases
- Integration
- Data warehouse
- Preprocessing
- Data mining
- Post-processing
- Knowledge
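The stages named on this slide can be sketched as a small pipeline. This is only an illustration under stated assumptions: the stage order follows the order in which the labels appear in the transcript (the slide itself asks the audience to guess the mapping), the "mining" step is a trivial frequency count standing in for a real algorithm, and all function names and records are hypothetical.

```python
from collections import Counter


def integrate(sources):
    """Integration: merge records from several databases into one collection."""
    return [record for source in sources for record in source]


def preprocess(records):
    """Preprocessing: drop records with a missing field and normalize item case."""
    return [
        {"customer": r["customer"], "item": r["item"].lower()}
        for r in records
        if r.get("customer") and r.get("item")
    ]


def mine(records):
    """Data mining (stand-in): count how often each item occurs."""
    return Counter(r["item"] for r in records)


def postprocess(counts, min_count):
    """Post-processing: keep only patterns frequent enough to report."""
    return {item: n for item, n in counts.items() if n >= min_count}


# Two hypothetical operational databases (assumptions, not from the slides).
sales_db = [{"customer": "c1", "item": "Milk"}, {"customer": "c2", "item": "bread"}]
web_db = [{"customer": "c3", "item": "milk"}, {"customer": None, "item": "milk"}]

warehouse = integrate([sales_db, web_db])   # various databases -> data warehouse
cleaned = preprocess(warehouse)             # the record with no customer is dropped
knowledge = postprocess(mine(cleaned), min_count=2)
print(knowledge)
```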

Web mining
- The Web is a massive database
- Semi-structured data
- XML and RDF
- Web mining
  - Content
  - Structure
  - Usage
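As a rough illustration of the three kinds of web mining listed above, the sketch below pulls out the raw material each one would start from: page text for content mining, hyperlinks for structure mining, and an access log for usage mining. It uses only the Python standard library; the page, the log, and the class name `PageParser` are invented for this example, and real web mining applies far richer analysis than simple counting.

```python
from collections import Counter
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects the words (content) and outgoing links (structure) of a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.words = []

    def handle_starttag(self, tag, attrs):
        # Structure: record the target of every <a href="..."> element.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Content: collect lower-cased words from the visible text.
        self.words.extend(data.lower().split())


# Hypothetical page and access log (assumptions, not from the slides).
page = (
    "<html><body><h1>Data Mining</h1>"
    '<a href="/kdd">KDD</a> <a href="/web">Web mining</a>'
    "</body></html>"
)
access_log = ["/kdd", "/web", "/kdd", "/index", "/kdd"]

parser = PageParser()
parser.feed(page)
parser.close()

content_terms = Counter(parser.words)   # content mining input: term frequencies
out_links = parser.links                # structure mining input: link targets
usage_counts = Counter(access_log)      # usage mining input: page visit counts
print(content_terms.most_common(1), out_links, usage_counts.most_common(1))
```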