WebKDD 2001 Aristotle University of Thessaloniki 1 Effective Prediction of Web-user Accesses: A Data Mining Approach Nanopoulos Alexandros Katsaros Dimitrios.

Presentation transcript:

Effective Prediction of Web-user Accesses: A Data Mining Approach
Alexandros Nanopoulos, Dimitrios Katsaros, Yannis Manolopoulos
Aristotle Univ. of Thessaloniki, Greece
Presentation: Spyros Papadimitriou, Carnegie Mellon Univ.
WebKDD 2001

Introduction (1/2)
Web prefetching: deducing forthcoming user accesses based on log information
Focus on:
 – Predictive prefetching (use of history)
 – Server-initiated prefetching (the server makes predictions and piggybacks them to the clients)

Introduction (2/2)
Within a site, users navigate by following links [5]
For server-initiated predictive prefetching, the access patterns of interest are those that reflect this behavior

Outline
Motivation & Related work
Proposed method
Comparative performance evaluation
Conclusions

Requirements
Site structure and contents impose:
 1. The order of dependencies (first or higher) among the documents
 2. The interleaving of documents belonging to patterns with random visits (noise)
Discovered patterns should respect these factors

Related work
Dependency graph (DG) [9]
 – A graph maintains pairwise accesses
Prediction by Partial Match (PPM) [10]
 – A trie maintains sequences of consecutive accesses
LBOT [6]
 – Special form of association rules of length 2
Others (variations of the above) [3,11]
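
To make "a graph maintains pairwise accesses" concrete, here is a minimal Python sketch of a dependency graph built from sessions. The slide does not give the construction, so the lookahead `window`, the arc weight (co-occurrence count divided by the count of the source page), and the function name `build_dependency_graph` are all assumptions made for illustration.

```python
from collections import defaultdict


def build_dependency_graph(sessions, window=2):
    """Return {(a, b): weight} for pages b seen within `window` requests after a.

    Assumed weighting: (# times b follows a within the window) / (# accesses of a).
    """
    node_count = defaultdict(int)
    arc_count = defaultdict(int)
    for session in sessions:
        for i, page in enumerate(session):
            node_count[page] += 1
            # Count each follower once per occurrence of `page`.
            followers = set(session[i + 1 : i + 1 + window])
            followers.discard(page)
            for nxt in followers:
                arc_count[(page, nxt)] += 1
    return {arc: c / node_count[arc[0]] for arc, c in arc_count.items()}


# Hypothetical sessions; window=1 keeps only immediate successors.
dg = build_dependency_graph([["A", "B", "C"], ["A", "C"]], window=1)
```

Because only pairs are stored, such a graph captures first-order dependencies, while a wider window lets it skip over interleaved noise.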

Motivation
            Order (1st Req.)   Noise (2nd Req.)
DG          No                 Yes
PPM         Yes                No
LBOT        No
Proposed    Yes                Yes

Proposed Method (1)
Novel Web log mining algorithm (WMo)
 – Apriori-like
 – Effective
    Immune to noise
    Considers high-order dependencies
 – Efficient
    Significant reduction in the number of candidates

Proposed Method (2)
Session (or transaction): a sequence of requests that occur within a specified time interval of each other [2]
The containment relationship addresses the 2nd requirement (avoiding noise)
Example:
 T = ⟨A, X, B, Y, C⟩ (X, Y are noise)
 S = ⟨A, B, C⟩ (the pattern)
 S is contained in T
Comment: with contiguous subsequences, based only on support, the pattern S will be missed.
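
The containment example can be checked with a short Python sketch. The function names `is_contained` and `is_contiguous` are hypothetical; the first tests order-preserving (possibly gapped) subsequence containment, the second tests the stricter contiguous match that the comment on the slide warns about.

```python
def is_contained(pattern, transaction):
    """True if `pattern` appears in `transaction` in order, gaps allowed."""
    it = iter(transaction)
    # `page in it` advances the iterator, so the order of pages is respected.
    return all(page in it for page in pattern)


def is_contiguous(pattern, transaction):
    """True if `pattern` appears in `transaction` as consecutive requests."""
    n = len(pattern)
    return any(
        list(transaction[i : i + n]) == list(pattern)
        for i in range(len(transaction) - n + 1)
    )


T = ["A", "X", "B", "Y", "C"]  # X and Y are noise
S = ["A", "B", "C"]            # the pattern

contained = is_contained(S, T)
contiguous = is_contiguous(S, T)
```

Here `contained` is true while `contiguous` is false, which is why a method that counts only contiguous subsequences would not credit T towards the support of S.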

Proposed Method (3)
Candidate generation respects the ordering of accesses in transactions
 – Example: ⟨A,B⟩ ≠ ⟨B,A⟩
 – This causes a dramatic increase in the number of candidates
The site structure is exploited for pruning [7,8]
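
A small Python count illustrates why ordering inflates the candidate set and how the site structure can shrink it again. The four pages and the arc set below are made up for the example and are not taken from the paper.

```python
from itertools import permutations

pages = ["A", "B", "C", "D"]                             # hypothetical site pages
arcs = {("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")}  # hypothetical links

# Unordered pairs, as in classical itemset mining: n * (n - 1) / 2.
unordered_pairs = len(pages) * (len(pages) - 1) // 2

# Ordered pairs: <A,B> and <B,A> are distinct candidates.
ordered_pairs = list(permutations(pages, 2))

# Keep only ordered pairs that follow an existing link of the site.
structure_pairs = [p for p in ordered_pairs if p in arcs]
```

For these four pages there are 6 unordered pairs, 12 ordered pairs, and only 4 ordered pairs that correspond to actual links.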

Proposed Method (4)
Algorithm genCandidates(L_k, G)
// L_k: the set of large k-paths; G: the graph
begin
  foreach L = ⟨l_1, …, l_k⟩, L ∈ L_k {
    N+(l_k) = {v | ∃ arc l_k → v ∈ G}
    foreach v ∈ N+(l_k) {
      // apply modified apriori pruning
      if v ∉ L and L' = ⟨l_2, …, l_k, v⟩ ∈ L_k {
        C = ⟨l_1, …, l_k, v⟩
        if (∀ S ⊂ C, S ≠ L' ⇒ S ∈ L_k)
          insert C in the candidate-trie
      }
    }
  }
end
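
The pseudocode can be turned into a runnable Python sketch under a few stated assumptions: paths are tuples, the graph is a dict mapping each node to its set of successors, a plain set stands in for the candidate-trie, and S ranges over the length-k subsequences of C obtained by dropping one element (the slide does not say which subsequences are meant). The name `gen_candidates` is hypothetical.

```python
from itertools import combinations


def gen_candidates(large_k, graph):
    """Extend each large k-path by one arc of the graph and prune apriori-style.

    large_k: set of k-tuples (the large k-paths)
    graph:   dict mapping a node to the set of its successor nodes
    """
    candidates = set()  # stands in for the candidate-trie
    for path in large_k:
        k = len(path)
        for v in graph.get(path[-1], ()):  # N+(l_k): successors of the last node
            suffix = path[1:] + (v,)       # L' = <l_2, ..., l_k, v>
            if v in path or suffix not in large_k:
                continue
            cand = path + (v,)             # C = <l_1, ..., l_k, v>
            # Assumed reading: every length-k subsequence other than L' must be large.
            others = (s for s in combinations(cand, k) if s != suffix)
            if all(s in large_k for s in others):
                candidates.add(cand)
    return candidates


# Hypothetical site graph and large 2-paths.
site = {"A": {"B", "C"}, "B": {"C"}, "C": set()}
kept = gen_candidates({("A", "B"), ("B", "C"), ("A", "C")}, site)
pruned = gen_candidates({("A", "B"), ("B", "C")}, site)
```

In the first call ⟨A,B,C⟩ survives because all of its 2-subsequences are large; in the second it is pruned because ⟨A,C⟩ is not large.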

Discussion
Sequential patterns [1]
 – Reduction when “customer-sequence” = “user-session”
 – Suffers from a large number of candidates (by not considering the site structure)
Path fragments [4] (the containment relationship is expressed with regular expressions and the “*” label)
 – Focus on semantics (recommendation systems)
Prefetching: patterns are for system and not for human consumption
WMo focuses on efficiency/effectiveness rather than on expressiveness (semantics)

Methodology
Synthetic data (sample site with 1000 nodes)
 – Synthetic data generator (see the paper), modeling site nodes, site linkage, size of documents
Real data sets (see the paper)
Examine the impact of:
 – noise
 – order
 – client cache (see the paper)
 – efficiency

Accuracy w.r.t. noise

Usefulness w.r.t. noise

Traffic w.r.t. noise

Accuracy w.r.t. order

Usefulness w.r.t. order

Traffic w.r.t. order

Efficiency (see also [7,8])

Conclusions
Factors that influence Web prefetching
 – Noise
 – Order
A new algorithm, WMo, based on data mining, was presented
WMo compares favorably with previously proposed algorithms
WMo is an effective and efficient Web prefetching algorithm

References
1. R. Agrawal, R. Srikant, Mining Sequential Patterns, ICDE
2. R. Cooley, B. Mobasher, J. Srivastava, Data Preparation for Mining World Wide Web Browsing Patterns, KAIS, 1(1), pp. 5-32
3. M. Deshpande, G. Karypis, Selective Markov Models for Predicting Web-page Accesses, SIAM Data Mining
4. W. Gaul, L. T. Schmidt-Thieme, Mining Web Navigation Path Fragments, WebKDD
5. B. A. Huberman, P. Pirolli, J. Pitkow, R. J. Lukose, Strong Regularities in World Wide Web Surfing, Science, 280
6. B. Lan, S. Bressan, B. C. Ooi, Y. Tay, Making Web Servers Pushier, WebKDD
7. A. Nanopoulos, Y. Manolopoulos, Finding Generalized Path Patterns for Web Log Data Mining, ADBIS-DASFAA
8. A. Nanopoulos, Y. Manolopoulos, Mining Patterns from Graph Traversals, DKE 37(3)
9. V. Padmanabhan, J. Mogul, Using Predictive Prefetching to Improve World Wide Web Latency, ACM SIGCOMM Computer Communication Review, 26(3)
10. T. Palpanas, A. Mendelzon, Web Prefetching Using Partial Match Prediction, WCW
11. J. Pitkow, P. Pirolli, Mining Longest Repeating Subsequences to Predict World Wide Web Surfing, USITS
12. L. T. Schmidt-Thieme, W. Gaul, Recommender Systems Based on Navigation Path Features, WebKDD