Towards Effective Browsing of Large Scale Social Annotations (WWW 2007). Presented by Y.H. Chang, 2008/06/06.

Presentation transcript:

Towards Effective Browsing of Large Scale Social Annotations
WWW 2007
Rui Li, Shenghua Bao, Yong Yu, Zhong Su, and Ben Fei
Shanghai Jiao Tong University; IBM China Research Lab
Advisor: Hsin-Hsi Chen
Reporter: Y.H. Chang

Outline
- Introduction
- ELSABer overview
- Components of ELSABer
- Enhanced models
- Experimental results
- Conclusion

Introduction
- Many services (e.g., Del.icio.us, Flickr) now help users manage and share their favorite URLs and photos through social annotations.
- How to effectively find desired resources in large annotation data is a new problem.
- In this paper, we propose a novel algorithm, the Effective Large Scale Annotation Browser (ELSABer), for browsing large-scale social annotation data.

Introduction
- ELSABer helps users browse a huge number of annotations in a semantic, hierarchical, and efficient way.
- By incorporating personal and time information, ELSABer can be further extended to personalized and time-related browsing.

The prototype system based on ELSABer
[Figure: screenshot of the prototype. Callouts: sub-tags (sub-categories) of “programming”; a set of pages related to the current annotation “programming”.]

ELSABer overview
Input: an empty concept set S_C
Step 1: Output the initial view of annotations
  - Generate the top 100 tags from the 2000 most frequent URLs and tags.
  - These tags are the roots of the hierarchical browsing.
Loop: the user selects a tag T_i
  Step 2: Concept matching
    - Add tag T_i to set S_C
    - Calculate the related tag set and URL set
  Step 3 (optional): Sample the URL set and the tag set
  Step 4: Hierarchical browsing
    - 4-1 Calculate candidate sub-tags
    - 4-2 Rank the sub-tags by Infor score
  IF the termination condition is satisfied: return; ELSE loop
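The control flow of the overview can be sketched roughly as follows. This is only an illustration: the function name `browse`, the callables it receives, and the `min_urls` stopping rule are all hypothetical, since the slide does not say what the termination condition is.

```python
def browse(clicks, build_concept, related_urls, related_tags, rank_sub_tags,
           min_urls=1):
    """Replay a click path and return the ranked sub-tags shown after each click.

    All five callables are placeholders for the components described on the
    later slides; min_urls is an ASSUMED termination condition.
    """
    concepts = []  # the concept set S_C starts empty
    views = []
    for tag in clicks:                        # Loop: the user selects a tag
        concepts.append(build_concept(tag))   # Step 2: concept matching
        urls = related_urls(concepts)
        if len(urls) < min_urls:              # hypothetical termination test
            break
        tags = related_tags(urls)
        # Step 3 (optional sampling) is omitted in this sketch.
        views.append(rank_sub_tags(tag, tags, urls))  # Step 4
    return views


# Toy data: URL -> set of tags (hypothetical).
toy = {
    "u1": {"programming", "java"},
    "u2": {"programming", "python"},
    "u3": {"design"},
}

views = browse(
    clicks=["programming"],
    build_concept=lambda t: {t},
    related_urls=lambda cs: {u for u, ts in toy.items() if all(ts & c for c in cs)},
    related_tags=lambda urls: set().union(*(toy[u] for u in urls)),
    rank_sub_tags=lambda t, tags, urls: sorted(tags - {t}),
)
```

Passing the components in as callables keeps the loop independent of how similarity, sampling, and ranking are actually computed.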

Components of ELSABer
- Data setup and representation
- Semantic Browsing
  - a. Annotation Similarity Estimation
  - b. Generating the Semantic Concept
- Hierarchical Browsing
  - c. Sub-Tag Generation
  - d. Sub-Tag Clustering
- Efficient Browsing

Data setup and representation
- Dataset: Del.icio.us (May 2006)
- We define an annotation as a quadruple: (User, URL, Tag, Time).
- Associated matrix M_{m×n}, where m and n are the total numbers of tags and URLs
  - |URL(t_i)| is the number of URLs annotated with tag t_i.
  - C_ij is the number of users who annotate the j-th URL with the i-th tag.
  - The weighting is similar to TF-IDF in IR.

Data setup and representation
Given the associated matrix M_{m×n} (rows T_1, T_2, ..., T_m; columns U_1, U_2, ..., U_n):
- a tag can be represented as a row vector T_i = (u_1, u_2, ..., u_n) of M
- a URL can be represented as a column vector U_i = (t_1, t_2, ..., t_m) of M
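As a rough illustration of this representation, the sketch below builds a tag-by-URL matrix from (User, URL, Tag, Time) quadruples and reads tags as rows and URLs as columns. The slides only say the weighting is like TF-IDF, so the specific weight C_ij · log(n / |URL(t_i)|) is an assumption, as are the function names.

```python
import math
from collections import defaultdict


def build_matrix(annotations):
    """Build a tag-by-URL matrix from (user, url, tag, time) quadruples."""
    # users_of[(tag, url)] collects the distinct users counted by C_ij.
    users_of = defaultdict(set)
    for user, url, tag, _time in annotations:
        users_of[(tag, url)].add(user)

    tags = sorted({tag for tag, _url in users_of})
    urls = sorted({url for _tag, url in users_of})

    # |URL(t_i)|: number of distinct URLs annotated with each tag.
    url_count = defaultdict(int)
    for tag, _url in users_of:
        url_count[tag] += 1

    n = len(urls)
    matrix = []
    for tag in tags:
        idf = math.log(n / url_count[tag])  # ASSUMED IDF-style factor
        matrix.append([len(users_of.get((tag, url), ())) * idf for url in urls])
    return tags, urls, matrix


def tag_vector(matrix, i):
    """Row i of M: the i-th tag described by its weights over all URLs."""
    return matrix[i]


def url_vector(matrix, j):
    """Column j of M: the j-th URL described by its weights over all tags."""
    return [row[j] for row in matrix]
```

With this assumed weighting, a tag that appears on every URL gets weight zero everywhere, which mirrors how IDF discounts terms that do not discriminate between documents.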

Semantic Browsing
a. Annotation Similarity Estimation
- [Equation: similarity between two tags]
- Special case 1 (stemming): e.g., Programs & Programming => add 0.1 weight
- Special case 2 (punctuation): e.g., Web-dev & WebDev => add 0.08 weight
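The two special cases can be illustrated with a small sketch. The base similarity formula is not recoverable from the transcript, so cosine similarity between tag vectors is used here purely as an assumption; the stemmer is a deliberately crude stand-in, and only the 0.1 and 0.08 bonuses come from the slide.

```python
import math
import re

STEM_BONUS = 0.1          # from the slide: Programs & Programming
PUNCTUATION_BONUS = 0.08  # from the slide: Web-dev & WebDev


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (ASSUMED base measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def normalize(tag):
    """Lower-case a tag and drop punctuation: 'Web-dev' -> 'webdev'."""
    return re.sub(r"[^a-z0-9]", "", tag.lower())


def crude_stem(word):
    """Hypothetical toy stemmer: strip one common suffix, undouble the ending."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            word = word[: -len(suffix)]
            break
    if len(word) > 2 and word[-1] == word[-2]:
        word = word[:-1]
    return word


def similarity(tag_a, vec_a, tag_b, vec_b):
    """Vector similarity of two distinct tags plus the slide's bonuses."""
    score = cosine(vec_a, vec_b)
    a, b = normalize(tag_a), normalize(tag_b)
    if a == b:                            # differ only in case or punctuation
        score += PUNCTUATION_BONUS
    elif crude_stem(a) == crude_stem(b):  # share a stem
        score += STEM_BONUS
    return score
```

A real system would likely use a proper stemmer; the toy one is only enough to make the slide's own examples match.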

Semantic Browsing
b. Generating the Semantic Concept
- Given the selected tag t_i, we choose a tag set ST_i that is most related to t_i by the following rules:
  - 1. t_j should be among the N most similar tags related to t_i.
  - 2. The similarity should be larger than a threshold θ.
  - N = 4, θ = 0.7
- Semantic concept: C_i = ST_i ∪ {t_i}
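The two selection rules translate fairly directly into code. In this minimal sketch the similarity scores are assumed to be precomputed and passed in as a dictionary; the function name and the toy scores are hypothetical, while N = 4 and θ = 0.7 are the values from the slide.

```python
def semantic_concept(tag, similarities, n=4, theta=0.7):
    """Return C_i = ST_i ∪ {t_i} for the selected tag.

    similarities maps every other tag t_j to its similarity with `tag`.
    """
    ranked = sorted(
        ((score, other) for other, score in similarities.items() if other != tag),
        key=lambda pair: (-pair[0], pair[1]),  # best first, ties broken by name
    )
    # Rule 1: among the N most similar tags. Rule 2: similarity above theta.
    related = {other for score, other in ranked[:n] if score > theta}
    return related | {tag}


# Toy scores (hypothetical).
scores = {
    "coding": 0.90,
    "development": 0.80,
    "software": 0.75,
    "code": 0.72,
    "dev": 0.71,   # above theta, but not among the top N = 4
    "java": 0.40,  # below theta
}
concept = semantic_concept("programming", scores)
```

Both rules must hold at once, so a tag can be excluded either for being outside the top N or for falling below θ.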

Semantic Browsing
b. Generating the Semantic Concept
- The path of the user's clicks t_1, t_2, ..., t_L yields a sequence of concepts C_1, C_2, ..., C_L.
- Let the concept set be S_C = {C_1, C_2, ..., C_L}.
- The related URLs:
  - ReURL(S_C) = {u | ∀C ∈ S_C, T(u) ∩ C ≠ ∅}
  - T(u) is the set of annotations given to URL u.
- The related tags are all the tags given to the URLs in ReURL(S_C):
  - ReTag(S_C) = {t | u ∈ ReURL(S_C), t ∈ T(u)}
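The two set definitions can be written almost verbatim as set comprehensions. In this sketch `tags_of` plays the role of T(u), mapping each URL to its set of tags; the function names and toy data are hypothetical.

```python
def related_urls(concepts, tags_of):
    """ReURL(S_C): URLs whose tag set intersects every concept in S_C."""
    return {
        url
        for url, tags in tags_of.items()
        if all(tags & concept for concept in concepts)
    }


def related_tags(urls, tags_of):
    """ReTag(S_C): every tag given to any of the related URLs."""
    return set().union(*(tags_of[url] for url in urls))


# Toy data: T(u) for four URLs (hypothetical).
tags_of = {
    "u1": {"programming", "java", "tutorial"},
    "u2": {"coding", "python"},
    "u3": {"java", "coffee"},
    "u4": {"design"},
}
# Click path "programming" then "java"; each concept also holds similar tags.
concepts = [{"programming", "coding"}, {"java"}]
urls = related_urls(concepts, tags_of)
tags = related_tags(urls, tags_of)
```

Each extra concept in the click path adds another intersection constraint, so the related URL set can only shrink as the user drills down.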

Hierarchical Browsing
c. Sub-Tag Generation
- If the intersection URL set is the main part of all the URLs of t_i but only a small part of the URLs of t_j, we can infer that t_i is a sub-tag of t_j.
- [Figure: 40 related tags of “google”]
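The intuition behind the sub-tag rule can be shown with a simple threshold test. Note that this is a simplification for illustration: the presentation predicts sub-tag relations with a C4.5 decision tree over several features, and the two cut-off values below are hypothetical.

```python
def is_sub_tag(urls_i, urls_j, main_part=0.6, small_part=0.4):
    """Guess whether t_i is a sub-tag of t_j from their URL sets.

    main_part and small_part are ASSUMED cut-offs, not values from the slides.
    """
    if not urls_i or not urls_j:
        return False
    common = len(urls_i & urls_j)
    covers_most_of_i = common / len(urls_i) >= main_part
    covers_little_of_j = common / len(urls_j) <= small_part
    return covers_most_of_i and covers_little_of_j


# Toy data: every "gmail" URL is also tagged "google", but not vice versa.
google = {f"u{k}" for k in range(1, 11)}
gmail = {"u1", "u2", "u3"}
```

Here `is_sub_tag(gmail, google)` holds while `is_sub_tag(google, gmail)` does not, which matches the asymmetry the rule is meant to capture.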

Hierarchical Browsing
c. Sub-Tag Generation
Features:
- Coverage of Tags
- ICR
- Intersection Rate
- IR'
- IRR (by IR rank): top 1~30 = 1; top 30~60 = 2; top 60~ = 3
U(t_i) denotes the number of URLs tagged with t_i.

Hierarchical Browsing
c. Sub-Tag Generation
- Given the features above, each related tag is represented as a feature vector.
- A decision tree can be derived from the manually labeled data set, using C4.5, to predict the sub-tag relations.

Hierarchical Browsing
d. Sub-Tag Clustering

Hierarchical Browsing
d. Sub-Tag Clustering
- Infor(t) = w1·TFIDF(t) + w2·ICS(t) + w3·TE(t)
- Intra-Cluster Similarity (ICS):
  - o_t denotes the centroid of all the URLs associated with the tag.
- Tag Entropy (TE)
- In our experiment, the weights w1, w2, and w3 are 0.58, 0.27, and 0.13, respectively.
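The ranking score is a plain weighted sum, which the sketch below reproduces with the weights from the slide. The definitions of ICS and TE are not recoverable from the transcript, so the three component values are simply taken as inputs; the function names and toy feature values are hypothetical.

```python
WEIGHTS = (0.58, 0.27, 0.13)  # w1, w2, w3 as reported on the slide


def infor(tfidf, ics, te, weights=WEIGHTS):
    """Infor(t) = w1*TFIDF(t) + w2*ICS(t) + w3*TE(t)."""
    w1, w2, w3 = weights
    return w1 * tfidf + w2 * ics + w3 * te


def rank_sub_tags(features):
    """Order tags by Infor score, highest first.

    features maps tag -> (TFIDF, ICS, TE), all assumed to be precomputed.
    """
    return sorted(features, key=lambda tag: infor(*features[tag]), reverse=True)


# Toy feature values (hypothetical).
features = {
    "misc": (0.1, 0.2, 0.9),
    "java": (0.9, 0.5, 0.2),
}
ranking = rank_sub_tags(features)
```

Because TFIDF carries the largest weight, a tag with a high entropy term alone does not outrank one that is strong on TFIDF.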

Efficient Browsing
- Observation: people use popular tags to annotate URLs, and popular URLs are annotated with the majority of tags.

Efficient Browsing
- So we can get good results efficiently by running our algorithm in a small tagging sub-space.
- In our experiment, we sample the 2000 most frequently annotated URLs and the 2000 most frequently used tags, so the size of M is 2000 × 2000.
- After a sequence of clicks, the user's intention becomes more specific, which decreases the number of related URLs or related tags.
- When that number is less than 2000, all the tags and URLs are used in the calculation.
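The sampling step amounts to keeping the most frequent URLs and tags. A minimal sketch follows, assuming annotations are (user, url, tag, time) tuples held in a list and that frequency means the number of annotations; the function name is hypothetical.

```python
from collections import Counter


def sample_space(annotations, k=2000):
    """Keep the k most frequently annotated URLs and the k most frequent tags.

    If fewer than k distinct URLs or tags exist, all of them are kept, which
    matches the fallback described for narrow click paths.
    """
    url_freq = Counter(url for _user, url, _tag, _time in annotations)
    tag_freq = Counter(tag for _user, _url, tag, _time in annotations)
    top_urls = {url for url, _count in url_freq.most_common(k)}
    top_tags = {tag for tag, _count in tag_freq.most_common(k)}
    return top_urls, top_tags
```

Capping both dimensions at k bounds the matrix at k × k regardless of how large the full annotation set is.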

Enhanced Models
- User's profile: the annotations and resources the user is interested in can be found as follows:
  - [Equation: user profile]
  - R_i denotes the vector representation of a resource, and T_i denotes the vector representation of A_i.
- Adjust the sampling and ranking algorithms according to the user's preference:
  - Infor(t, U) = α × Infor(t) + β × UI(t | P(U))
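The personalized score is a linear blend of the generic Infor score and a user-interest term. The sketch below shows how such a blend can reorder candidates; the values of α and β are not given in the transcript, so the defaults here are hypothetical, and UI(t | P(U)) is taken as a precomputed input.

```python
def personalized_infor(infor_score, user_interest, alpha=0.7, beta=0.3):
    """Infor(t, U) = alpha * Infor(t) + beta * UI(t | P(U)).

    alpha and beta are ASSUMED values chosen only for illustration.
    """
    return alpha * infor_score + beta * user_interest


def rerank(candidates, alpha=0.7, beta=0.3):
    """Order tags by personalized score, highest first.

    candidates maps tag -> (Infor(t), UI(t | P(U))).
    """
    return sorted(
        candidates,
        key=lambda tag: personalized_infor(*candidates[tag], alpha, beta),
        reverse=True,
    )


# Toy scores (hypothetical): this user cares far more about "python".
candidates = {"java": (0.6, 0.1), "python": (0.5, 0.9)}
```

Setting β to zero recovers the non-personalized ranking, so the same code path can serve both modes.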

Enhanced Models
- Given the user's required time interval TI = [t_s, t_e], we define the match between a URL's time sequence TS and TI as follows:
  - [Equation: match between TS and TI]
  - θ = 0.5

Experimental results
- Scale of the dataset: Del.icio.us (May 2006), 1,736,268 web pages and 269,566 different annotations
- Machine: Intel Pentium IV 3.0 GHz, 1 GB memory, 2 processors
- The Java Lucene API is also used to build the URL and tag indexes.

Experimental results
[Figure: browsing result. Legend: red tag = owned by the user; orange tag = recommended.]

Experimental results

Conclusion
Our main contributions:
- An effective algorithm, ELSABer, based on an analysis of the characteristics of social annotations.
- Enhanced models for personalized and time-related browsing.

Future work
- Conduct more user studies.
- Focus on how to find more qualified URL resources.
- Use existing hierarchical structures such as ODP and WordNet to help construct more meaningful hierarchies for social annotations.

Thank you!!