Introduction to Web Mining Spring 2013


What is data mining?
Data mining is the extraction of useful patterns from data sources such as databases, text, the web, and images.
Patterns must be valid, novel, potentially useful, and understandable.

Classic data mining tasks
Classification: mining patterns that can classify future (new) data into known classes.
Association rule mining: mining any rule of the form X → Y, where X and Y are sets of data items.
Clustering: identifying a set of similarity groups in the data.
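To make the rule form X → Y concrete, here is a minimal sketch of how such a rule is commonly scored. The slides do not define any measures for association rules; support and confidence are the usual ones, and the basket data and function names below are hypothetical.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(x, y, transactions):
    """Fraction of transactions containing X that also contain Y (rule X -> Y)."""
    support_x = support(x, transactions)
    if support_x == 0:
        return 0.0
    return support(set(x) | set(y), transactions) / support_x


# Hypothetical market-basket data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]

# Score the rule {bread} -> {milk}.
rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(rule_support, rule_confidence)
```

A mining algorithm would search for all rules whose support and confidence exceed user-chosen thresholds; this sketch only scores one given rule.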

CS583, Bing Liu, UIC
Classic data mining tasks (contd.)
Sequential pattern mining: a sequential rule A → B says that event A will be immediately followed by event B with a certain confidence.
Deviation detection: discovering the most significant changes in data.
Data visualization.
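The confidence of a sequential rule A → B can be estimated by counting, as in the sketch below. It assumes one reasonable definition (the share of occurrences of A that are immediately followed by B); the session data and the function name are hypothetical, and the slides do not give a formula.

```python
def sequential_confidence(a, b, sequences):
    """Share of occurrences of event `a` that are immediately followed by `b`."""
    occurrences = 0
    followed = 0
    for seq in sequences:
        for i, event in enumerate(seq):
            if event == a:
                occurrences += 1
                # Only the very next event counts as "immediately followed".
                if i + 1 < len(seq) and seq[i + 1] == b:
                    followed += 1
    return followed / occurrences if occurrences else 0.0


# Hypothetical click sessions: each list is an ordered sequence of page visits.
sessions = [
    ["home", "search", "product", "cart"],
    ["home", "product", "cart"],
    ["search", "product", "home"],
    ["product"],
]

conf = sequential_confidence("product", "cart", sessions)
print(conf)
```

An occurrence of A at the end of a sequence counts against the rule here; a different convention would exclude such occurrences from the denominator.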

Why is data mining important?
There is a huge amount of data; the question is how to make the best use of it.
Knowledge discovered from data can be used for competitive advantage.
Many interesting things that one wants to find cannot be found using database queries, e.g., "find people likely to buy my products".


WWW
The Web is an Internet-based information system that lets users of one computer access information stored on another over the Internet.
It uses a client-server model and hypertext documents.
Invented in 1989 by Tim Berners-Lee at CERN, with HTTP and HTML.
Early browsers: Mosaic (1993), Netscape (1994), Internet Explorer (1995).
Related to the Internet (ARPANET, TCP/IP).

Web mining
Traditional data mining: data is structured and relational, with well-defined tables, columns, rows, keys, and constraints.
Web data: readily available, rich in features and patterns, and made up of content, link, and usage data.
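As an illustration of how content data and link data differ, the sketch below separates the visible text of a page from its hyperlinks using Python's standard `html.parser` module. The sample page and the `PageParser` class are hypothetical. Usage data is not shown, since it would come from server logs rather than from the page itself.

```python
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects hyperlink targets (link data) and visible text (content data)."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Anchor tags carry the link structure of the Web.
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Keep non-empty text fragments as the page content.
        if data.strip():
            self.text_parts.append(data.strip())


# Hypothetical page; a real crawler would download this over HTTP.
page = (
    "<html><body><h1>Web mining</h1>"
    '<p>See <a href="/crawling.html">crawling</a> and '
    '<a href="/search.html">search</a>.</p></body></html>'
)

parser = PageParser()
parser.feed(page)
parser.close()

links = parser.links
text = " ".join(parser.text_parts)
print(links)
print(text)
```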

Topic description
Introduction to basic data mining: association and sequential mining, classification, clustering
Crawling, Web search, and information retrieval
Social network analysis
Structured data extraction, information integration
Opinion mining and sentiment analysis
Web usage mining

Related fields
Web mining is a multidisciplinary field that draws on machine learning, statistics, databases, information retrieval, visualization, and natural language processing, among others.