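Department of Computer Science and Information Engineering, Southern Taiwan University of Science and Technology. A web page usage prediction scheme using sequence indexing and clustering techniques. Adviser: Yu-Chiang Li. Speaker: Gung-Shian Lin. Date: 2010/10/15.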

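Department of Computer Science and Information Engineering, Southern Taiwan University of Science and Technology. A web page usage prediction scheme using sequence indexing and clustering techniques. Adviser: Yu-Chiang Li. Speaker: Gung-Shian Lin. Date: 2010/10/15. Source: Data & Knowledge Engineering, Vol. 69, No. 4, 2010.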

2 Outline  1. Introduction  2. Definitions and background  3. Prediction/recommendation model  4. WST utilization – recommendation/prediction method  5. Evaluation  6. Conclusions and open issues

3 1. Introduction  We consider the problem of web page usage prediction in a web site by modeling users’ navigation history and web page content with weighted suffix trees.  We focus on web usage mining, the area of web data mining that exploits users’ navigational traces to extract knowledge about their preferences and behavior.

4 1. Introduction  We propose two novel methods for modeling the user navigation history.  The first method exploits knowledge extracted only from the user access sequences in the web server log file.  The second method enhances the first by also using web page content during the access pattern extraction phase.

5 2. Definitions and background

6 3. Prediction/recommendation

7  WAS maintenance  Web access sequences (WAS) can be obtained in two ways: either the web server is configured to store each WAS in a separate repository, or an extraction process is run over the log files at the beginning of the preprocessing procedure. For the sake of description, assume there are N sequences forming the set S = {WAS_1, WAS_2, ..., WAS_N}.
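
The log-extraction option can be illustrated with a minimal sketch. It assumes the log has already been parsed into (session_id, timestamp, page) records; that record layout and the function name extract_was are hypothetical and are not taken from the slides.

```python
from collections import defaultdict


def extract_was(records):
    """Group (session_id, timestamp, page) records into one page sequence per session.

    The record layout is an assumption made for this sketch.
    """
    by_session = defaultdict(list)
    for session_id, timestamp, page in records:
        by_session[session_id].append((timestamp, page))
    # Order each session's visits by timestamp and keep only the page identifiers.
    return {
        session_id: [page for _, page in sorted(visits)]
        for session_id, visits in by_session.items()
    }


records = [
    ("s1", 3, "C"), ("s1", 1, "A"), ("s2", 1, "B"),
    ("s1", 2, "B"), ("s2", 2, "C"),
]
S = extract_was(records)  # the set S of web access sequences, keyed by session
```

Each value of S is one WAS; real logs would additionally need session identification and cleaning, which this sketch leaves out.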

8 3. Prediction/recommendation  WAS clustering  We chose k-windows as the clustering method for several reasons, such as the enhanced quality of the clusters it produces and its inherently parallel nature. (a) Sequential movements M2, M3, M4 of initial window M1. (b) Sequential enlargements E1, E2 of window M4.

9 3. Prediction/recommendation (a) W1 and W2 satisfy the similarity condition and W1 is deleted. (b) W3 and W4 satisfy the merge operation and are considered to belong to the same cluster. (c) W5 and W6 have a small overlap and capture two different clusters.

10 3. Prediction/recommendation An example of the application of the k-windows algorithm.
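
The window movement step shown in the figures can be sketched as follows. This is only an illustration under stated assumptions: items are taken to be numeric points (how a WAS is mapped to such a point is not given in the slides), the window is a square of half-width half_width, and the names points_in_window and move_window are hypothetical. Enlargement and merging are not covered.

```python
def points_in_window(points, center, half_width):
    """Return the points lying inside the axis-aligned window around center."""
    return [
        p for p in points
        if all(abs(x - c) <= half_width for x, c in zip(p, center))
    ]


def move_window(points, center, half_width, tol=1e-6, max_iter=100):
    """Repeatedly re-center the window on the mean of the points it contains."""
    for _ in range(max_iter):
        inside = points_in_window(points, center, half_width)
        if not inside:
            break  # an empty window cannot be moved any further
        new_center = tuple(sum(coords) / len(inside) for coords in zip(*inside))
        shift = max(abs(n - c) for n, c in zip(new_center, center))
        center = new_center
        if shift <= tol:
            break  # the window has stabilized
    return center


points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0)]
final_center = move_window(points, center=(0.5, 0.5), half_width=1.0)
```

In this toy run the window drifts from (0.5, 0.5) toward the three nearby points and ignores the distant one.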

11 3. Prediction/recommendation  WAS clustering exploiting web page content  Direct sequence alignment (DSA): in the (global or local) alignment of a pair of sequences, the score for aligning two characters (web pages) combines the importance label of each page with the similarity metric between them.
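
A minimal sketch of a global alignment score with such a combined scoring function is given below. The slides do not state how importance and similarity are combined, so the formula used here (similarity scaled by the mean importance of the two pages), the gap penalty, and the name align_score are all assumptions for illustration.

```python
def align_score(seq_a, seq_b, importance, similarity, gap=-1.0):
    """Global (Needleman-Wunsch style) alignment score of two page sequences."""

    def pair_score(p, q):
        # Assumed combination: content similarity scaled by mean page importance.
        return similarity(p, q) * (importance[p] + importance[q]) / 2.0

    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap  # prefix of seq_a aligned against gaps only
    for j in range(1, cols):
        score[0][j] = j * gap  # prefix of seq_b aligned against gaps only
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(seq_a[i - 1], seq_b[j - 1]),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    return score[-1][-1]


# Hypothetical importance labels and a toy similarity metric.
importance = {"A": 1.0, "B": 2.0, "C": 1.0}
similarity = lambda p, q: 1.0 if p == q else -1.0

same = align_score(["A", "B", "C"], ["A", "B", "C"], importance, similarity)
diff = align_score(["A", "B"], ["A", "C"], importance, similarity)
```

With these toy inputs, matching on the more important page B contributes more to the score than matching on A or C.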

12 3. Prediction/recommendation  Sequence alignment with clustering preprocess (SACP)  Another way to incorporate web page content into the sequence alignment algorithm is to first cluster the web pages by content.
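
One plausible reading of this preprocessing step is that pages are replaced by the label of their content cluster before the sequences are compared. The sketch below assumes exactly that; the cluster assignments, URLs, and the name relabel_by_cluster are hypothetical.

```python
def relabel_by_cluster(sequence, page_cluster):
    """Replace every page in a WAS with the id of its content cluster."""
    return [page_cluster[page] for page in sequence]


# Hypothetical content clusters, e.g. the output of any document clustering step.
page_cluster = {
    "/news/a.html": "news", "/news/b.html": "news",
    "/shop/x.html": "shop", "/shop/y.html": "shop",
}
was_1 = ["/news/a.html", "/shop/x.html"]
was_2 = ["/news/b.html", "/shop/y.html"]

labels_1 = relabel_by_cluster(was_1, page_cluster)
labels_2 = relabel_by_cluster(was_2, page_cluster)
```

The two sequences share no page, yet they become identical at the cluster level, so an alignment over cluster labels would treat them as similar.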

13 3. Prediction/recommendation  WAS cluster representation  When the WAS clustering procedure is over, each cluster is expressed as a weighted sequence.  Alternatively, progressive or iterative pairwise alignment could be used to produce the multiple sequence alignment.
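
To make the notion of a weighted sequence concrete, the sketch below turns an already aligned cluster into per-position page frequencies. It assumes the sequences have equal length after alignment and that the gap symbol "-" is counted like any other symbol; both choices, and the name weighted_sequence, are assumptions of this illustration.

```python
from collections import Counter


def weighted_sequence(aligned):
    """Return, for each aligned position, a mapping from symbol to relative frequency."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    columns = []
    for column in zip(*aligned):
        counts = Counter(column)
        columns.append({sym: n / len(aligned) for sym, n in counts.items()})
    return columns


cluster = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["A", "-", "C"],
    ["A", "B", "C"],
]
ws = weighted_sequence(cluster)
```

Here every sequence starts with A, three of the four visit B second, and the last position is split between C and D.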

14 4. WST utilization – recommendation/prediction method  The recommendation/prediction algorithm works as follows: when a new user arrives in the system, he is assigned to the root of the generalized weighted suffix tree (gWST). Weighted suffix tree navigation

15 4. WST utilization – recommendation/prediction method  A sample run of the recommendation algorithm is shown. Recommendation method run. Numbers in the nodes express their weight.
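
The idea of starting at the root and following the user's clicks can be sketched with a much simpler structure than a gWST. The code below swaps in a plain prefix tree whose node weights are visit counts; this substitution, the ranking rule, and the names Node, insert and recommend are assumptions for illustration and not the method from the slides.

```python
class Node:
    """Prefix-tree node; weight counts the sequences passing through it."""

    def __init__(self):
        self.weight = 0
        self.children = {}


def insert(root, sequence):
    """Add one access sequence to the tree, incrementing weights along its path."""
    node = root
    for page in sequence:
        node = node.children.setdefault(page, Node())
        node.weight += 1


def recommend(root, history, top_k=2):
    """Follow the user's history from the root and rank the next pages by weight."""
    node = root
    for page in history:
        if page not in node.children:
            return []  # unseen path: nothing to recommend in this sketch
        node = node.children[page]
    ranked = sorted(node.children.items(), key=lambda kv: (-kv[1].weight, kv[0]))
    return [page for page, _ in ranked[:top_k]]


root = Node()
for seq in (["A", "B", "C"], ["A", "B", "D"], ["A", "B", "C"], ["A", "C"]):
    insert(root, seq)

after_ab = recommend(root, ["A", "B"])
after_a = recommend(root, ["A"], top_k=1)
```

A user who has visited A then B is offered C before D, because C follows that path more often in the stored sequences.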

16 5. Evaluation  Evaluation of the access-based method  We compare our experimental results with those of “A web page prediction model based on click-stream tree representation of user behavior”.

17 5. Evaluation  Evaluation of the access- and content-based methods  The experimental setting was exactly the same as in the evaluation described in the previous section.

18 6. Conclusions and open issues  We have proposed various techniques for predicting web page usage patterns by modeling the users’ navigation history with string processing techniques.  Future work includes different ways of modeling web user access patterns.
