Christian Körner, Dominik Benz, Andreas Hotho, Markus Strohmaier, Gerd Stumme: Stop thinking, start tagging: Tag Semantics arise from Collaborative Verbosity


1 Stop thinking, start tagging: Tag Semantics arise from Collaborative Verbosity
Christian Körner (1), Dominik Benz (2), Andreas Hotho (3), Markus Strohmaier (1), Gerd Stumme (2)
(1) Knowledge Management Institute and Know Center, Graz University of Technology, Austria
(2) Knowledge and Data Engineering Group (KDE), University of Kassel, Germany
(3) Data Mining and Information Retrieval Group, University of Würzburg, Germany

2 Where do Semantics come from?
30.04.2010, Körner, Benz et al.: Tag Semantics arise from Collaborative Verbosity @ WWW2010
- Semantically annotated content is the "fuel" of the next generation World Wide Web, but where is the petrol station?
- Expert-built: expensive
- Evidence for emergent semantics in Web 2.0 data: built by the crowd!
- Which factors influence the emergence of semantics?
- Do certain users contribute more than others?

3 The Story
- Emergent Tag Semantics
- Pragmatics of tagging
- Semantic Implications of Tagging Pragmatics
- Conclusions

4 Emergent Tag Semantics
- Tagging is a simple and intuitive way to organize all kinds of resources
- Uncontrolled vocabulary, tags are "just strings"
- Formal model: folksonomy F = (U, T, R, Y)
  - Users U, Tags T, Resources R
  - Tag assignments Y ⊆ U × T × R
- Evidence of emergent semantics
  - Tag similarity measures can identify e.g. synonym tags (web2.0, web_two)
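The formal model on this slide can be made concrete with a small data structure. The following is only a minimal sketch: the names `Folksonomy` and `build_folksonomy` and the example triples are illustrative assumptions, and deriving U, T and R from Y is a simplification that ignores users, tags or resources without any assignment.

```python
from typing import NamedTuple


class Folksonomy(NamedTuple):
    """F = (U, T, R, Y) with Y a set of (user, tag, resource) triples."""
    users: frozenset
    tags: frozenset
    resources: frozenset
    assignments: frozenset


def build_folksonomy(assignments):
    """Derive U, T and R from an iterable of (user, tag, resource) triples."""
    y = frozenset(tuple(a) for a in assignments)
    return Folksonomy(
        users=frozenset(u for u, _, _ in y),
        tags=frozenset(t for _, t, _ in y),
        resources=frozenset(r for _, _, r in y),
        assignments=y,
    )


# Made-up example data: two users, three tags, two resources.
example = build_folksonomy([
    ("alice", "web2.0", "http://example.org/a"),
    ("alice", "java", "http://example.org/b"),
    ("bob", "web_two", "http://example.org/a"),
])
```

Tags stay plain strings here, which mirrors the "just strings" point above: nothing in the structure says that web2.0 and web_two mean the same thing.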

5 Tag Similarity Measures: Tag Context Similarity
- Tag Context Similarity is a scalable and precise tag similarity measure [Cattuto2008, Markines2009]:
  - Describe each tag as a context vector
  - Each dimension of the vector space corresponds to another tag; the entry denotes the co-occurrence count
  - Compute similar tags by cosine similarity
[Figure: example context vector for the tag JAVA over the dimensions design, software, blog, web, programming, …]
- Will be used as an indicator of emergent semantics!
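The three steps above can be sketched in a few lines. This is a hedged illustration, not the implementation from [Cattuto2008]: it assumes that two tags co-occur when one user assigns both to the same resource, and the function names and example data are made up.

```python
from collections import Counter, defaultdict
from itertools import permutations
from math import sqrt


def context_vectors(assignments):
    """Map each tag to a Counter of co-occurrence counts with other tags.

    Assumption: two tags co-occur when the same user assigns both of
    them to the same resource (i.e. within one post).
    """
    posts = defaultdict(set)
    for user, tag, resource in assignments:
        posts[(user, resource)].add(tag)
    vectors = defaultdict(Counter)
    for tags in posts.values():
        for a, b in permutations(tags, 2):  # a != b, so no self entries
            vectors[a][b] += 1
    return vectors


def cosine(v, w):
    """Cosine similarity of two sparse vectors stored as Counters."""
    dot = sum(v[k] * w[k] for k in v.keys() & w.keys())
    norm = sqrt(sum(x * x for x in v.values())) * sqrt(sum(x * x for x in w.values()))
    return dot / norm if norm else 0.0


def most_similar(tag, vectors):
    """Return the tag whose context vector is closest to that of `tag`."""
    if tag not in vectors:
        return None
    target = vectors[tag]
    candidates = [(cosine(target, vec), other)
                  for other, vec in vectors.items() if other != tag]
    return max(candidates)[1] if candidates else None


# Made-up posts: java and python never co-occur but share their context.
vectors = context_vectors([
    ("u1", "java", "r1"), ("u1", "programming", "r1"), ("u1", "software", "r1"),
    ("u2", "python", "r2"), ("u2", "programming", "r2"), ("u2", "software", "r2"),
    ("u3", "design", "r3"), ("u3", "blog", "r3"),
])
```

In the toy data, java and python are never used together, yet they end up as each other's most similar tag because they appear in the same context. This is the property that lets the measure pick up synonym-like pairs.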

6 Assessing the Quality of Tag Semantics
[Figure: folksonomy tags are mapped to synsets in the WordNet hierarchy; example pair with TagCont(t, t_sim) = 0.74 and JCN(t, t_sim) = 3.68]
- Average JCN(t, t_sim) over all tags t: "quality of semantics"
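The averaging step can be sketched as follows. This is an illustration under assumptions: `distance` stands in for a WordNet-based JCN distance and is passed in as a plain callable, pairs that cannot be mapped to WordNet are assumed to be skipped, and the toy distance values are made up (only 3.68 is taken from the slide's example).

```python
def average_distance(most_similar, distance):
    """Average distance(t, t_sim) over all tags t that can be evaluated.

    most_similar: dict mapping each tag t to its most similar tag t_sim.
    distance: callable returning a float, or None if the pair cannot be
              mapped to the reference hierarchy (assumption: such pairs
              are left out of the average).
    """
    values = []
    for tag, similar in most_similar.items():
        d = distance(tag, similar)
        if d is not None:
            values.append(d)
    return sum(values) / len(values) if values else None


# Hypothetical stand-in for a WordNet-based JCN distance.
toy_table = {("java", "python"): 3.68, ("web", "internet"): 2.0}


def toy_distance(a, b):
    return toy_table.get((a, b))


pairs = {"java": "python", "web": "internet", "barty": "bart"}
avg = average_distance(pairs, toy_distance)
```

Because JCN is a distance, a lower average means that the similar tags found in the folksonomy are semantically closer in WordNet.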

7 The Story
- Tag similarity measures can capture emergent tag semantics
- Pragmatics of tagging
- Semantic Implications of Tagging Pragmatics
- Conclusions

8 Tagging motivation
- Evidence of different ways HOW users tag (tagging pragmatics)
- Broad distinction by tagging motivation [Strohmaier2009]:
[Figure: example tag sets: donuts, duff, marge, beer, bart, barty, Duff-beer; bev, alc, nalc, beer, wine]
"Categorizers" …
- use a small controlled tag vocabulary
- goal: "ontology-like" categorization by tags, for later browsing
- tags as a replacement for folders
"Describers" …
- tag "verbosely" with freely chosen words
- vocabulary not necessarily consistent (synonyms, spelling variants, …)
- goal: describe content, ease retrieval

9 Tagging Pragmatics: Measures
- How to distinguish between the two types of taggers?
- Intuition: Describers use an open set of many tags, Categorizers use a small set of controlled tags:
  - Vocabulary size (vocab)
  - Tag / resource ratio (trr)
  - Average number of tags per post (tpp)
- For all three measures: high values indicate Describers, low values indicate Categorizers
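The formulas for these measures are not part of the transcript, so the sketch below reads them off the measure names and should be treated as an assumption: vocabulary size as the number of distinct tags of a user, tag / resource ratio as distinct tags divided by distinct resources, and tags per post as tag assignments divided by resources. The function name and the example users are made up.

```python
from collections import defaultdict


def pragmatic_measures(assignments):
    """Compute vocab, trr and tpp per user from (user, tag, resource) triples.

    Assumed definitions (not taken from the slides):
      vocab = number of distinct tags of the user
      trr   = distinct tags / distinct resources
      tpp   = tag assignments / distinct resources (one post per resource)
    """
    tags = defaultdict(set)
    resources = defaultdict(set)
    count = defaultdict(int)
    for user, tag, resource in set(assignments):  # ignore duplicate triples
        tags[user].add(tag)
        resources[user].add(resource)
        count[user] += 1
    return {
        user: {
            "vocab": len(tags[user]),
            "trr": len(tags[user]) / len(resources[user]),
            "tpp": count[user] / len(resources[user]),
        }
        for user in tags
    }


# Made-up users: one tags verbosely, one reuses two category tags.
measures = pragmatic_measures([
    ("describer", "a", "r1"), ("describer", "b", "r1"), ("describer", "c", "r1"),
    ("describer", "c", "r2"), ("describer", "d", "r2"),
    ("categorizer", "x", "r1"), ("categorizer", "x", "r2"), ("categorizer", "y", "r3"),
])
```

In this toy data the verbose user scores higher on all three measures, which matches the high / low reading on the slide.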

10 Tagging Pragmatics: Measures
- Next intuition: Describers don't care about "abandoned" tags, Categorizers do
  - Orphan ratio (orphan)
  - R(t): set of resources tagged by user u with tag t
- High values indicate Describers, low values indicate Categorizers
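The orphan ratio formula itself is missing from the transcript. A minimal sketch, assuming it is the share of a user's tags that are used on at most `threshold` resources: the default threshold below (one percent of the usage of the user's most frequent tag, rounded up) is an assumption, as are the function name and the example data.

```python
from collections import defaultdict
from math import ceil


def orphan_ratio(user_assignments, threshold=None):
    """Share of one user's tags that are rarely used ("orphans").

    user_assignments: iterable of (tag, resource) pairs of ONE user.
    threshold: a tag t counts as an orphan if |R(t)| <= threshold.
               Assumed default: ceil(max_t |R(t)| / 100).
    """
    resources_by_tag = defaultdict(set)
    for tag, resource in user_assignments:
        resources_by_tag[tag].add(resource)
    if not resources_by_tag:
        return 0.0
    usage = {tag: len(res) for tag, res in resources_by_tag.items()}
    if threshold is None:
        threshold = ceil(max(usage.values()) / 100)
    orphans = [tag for tag, n in usage.items() if n <= threshold]
    return len(orphans) / len(usage)


# Made-up user: two heavily used tags and two nearly abandoned ones.
pairs = (
    [("web", f"r{i}") for i in range(200)]
    + [("java", f"r{i}") for i in range(50)]
    + [("todo", "r0"), ("misc", "r1"), ("misc", "r2")]
)
ratio = orphan_ratio(pairs)
```

With the assumed default, the threshold for this user is 2, so todo and misc count as orphans and the ratio is 0.5.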

11 Tagging pragmatics: Limitations of measures
- Real users: no "perfect" Categorizers / Describers, but "mixed" behaviour
- Possibly influenced by user interfaces / recommenders
- Measures are correlated
- But: independent of semantics; measures capture usage patterns

12 The Story
- Tag similarity measures can capture emergent tag semantics
- Measures of tagging pragmatics differentiate users by tagging motivation
- Semantic Implications of Tagging Pragmatics
- Conclusions

13 Influence of Tagging Pragmatics on Emergent Semantics
- Idea: Can we learn the same (or even better) semantics from the folksonomy induced by a subset of describers / categorizers?
[Figure: users of the complete folksonomy ordered from extreme Categorizers to extreme Describers; highlighted example: subset of 30% categorizers]

14 Experimental setup
1. Apply the pragmatic measures vocab, trr, tpp, orphan to each user
2. Systematically create "sub-folksonomies" CF_i / DF_i by successively adding i % of Categorizers / Describers (i = 1, 2, …, 25, 30, …, 100)
3. Compute similar tags based on each subset (Tag Context Similarity)
4. Assess the (semantic) quality of similar tags by average JCN distance
[Figure: example sub-folksonomies DF_20 and CF_5, each evaluated via TagCont(t, t_sim) = … and JCN(t, t_sim) = …]
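Step 2 of the setup can be sketched as a ranking and cut-off operation. This is an illustration only: the function names are hypothetical, a high score is assumed to indicate a Describer, and the step width of 5 between 30 and 100 is an assumption about the elided part of the sequence.

```python
from math import ceil


def sub_folksonomy(assignments, scores, percent, describers=True):
    """Keep the tag assignments of the `percent` % most extreme users.

    assignments: list of (user, tag, resource) triples.
    scores: dict user -> value of one pragmatic measure (e.g. trr).
    describers: True ranks from the highest score down (DF_i),
                False ranks from the lowest score up (CF_i).
    """
    ranked = sorted(scores, key=scores.get, reverse=describers)
    keep = set(ranked[:ceil(len(ranked) * percent / 100)])
    return [a for a in assignments if a[0] in keep]


def percent_steps():
    """i = 1, 2, ..., 25, 30, ..., 100 (step of 5 after 25 is assumed)."""
    return list(range(1, 26)) + list(range(30, 101, 5))


# Made-up scores for four users and one assignment per user.
scores = {"a": 3.0, "b": 1.0, "c": 2.0, "d": 0.5}
assignments = [("a", "t1", "r1"), ("b", "t2", "r1"),
               ("c", "t3", "r2"), ("d", "t4", "r3")]
df_50 = sub_folksonomy(assignments, scores, 50, describers=True)
cf_50 = sub_folksonomy(assignments, scores, 50, describers=False)
```

Each resulting list of triples can then be fed into the similarity computation and the quality assessment of steps 3 and 4.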

15 Dataset
- From the social bookmarking site Delicious in 2006: ORIGINAL
- Two filtering steps (to make the measures more meaningful):
  - Restrict to the top 10,000 tags: FULL
  - Keep only users with > 100 resources: MIN100RES

dataset      |T|          |U|        |R|           |Y|
ORIGINAL     2,454,546    667,128    18,782,132    140,333,714
FULL         10,000       511,348    14,567,465    117,319,016
MIN100RES    9,944        100,363    12,125,176    96,298,409

16 Results – adding Describers (DF_i)
- Almost all sub-folksonomies are better than randomly picked ones
- 40% of describers according to trr outperform the complete data!
- Optimal performance for 70% describers (trr)
[Figure: semantic quality of the sub-folksonomies DF_i; axis labels: more describers, better semantics]

17 Results – adding Categorizers (CF_i)
- Almost all sub-folksonomies are worse than randomly picked ones
- Global optimum for 90% categorizers (tpp), i.e. removing the 10% most extreme describers! (Spammers?)
[Figure: semantic quality of the sub-folksonomies CF_i; axis labels: more categorizers, better semantics]

18 The Story
- Tag similarity measures can capture emergent tag semantics
- Measures of tagging pragmatics differentiate users by tagging motivation
- Sub-folksonomies introduced by measures of pragmatics show different semantic qualities
- Conclusions

19 Summary & Conclusions
- Introduction of measures of users' tagging motivation (Categorizers vs. Describers)
- Evidence for a causal link between tagging pragmatics (HOW people use tags) and tag semantics (WHAT tags mean)
- "Mass matters" for the "wisdom of the crowd", but the composition of the crowd makes a difference ("verbosity" of describers is in general better, but with a limitation)
- Relevant for tag recommendation and ontology learning algorithms

20 Guess which of the authors is a Categorizer

21 Thanks for the attention! Questions?
Be verbose!
- Tag similarity measures can capture emergent tag semantics
- Measures of tagging pragmatics differentiate users by tagging motivation
- Sub-folksonomies introduced by measures of pragmatics show different semantic qualities
- Evidence of a causal link between pragmatics and semantics of tagging!
christian.koerner@tugraz.at, benz@cs.uni-kassel.de

22 References
- [Cattuto2008] Ciro Cattuto, Dominik Benz, Andreas Hotho, Gerd Stumme: Semantic Grounding of Tag Relatedness in Social Bookmarking Systems. In: Proc. 7th Intl. Semantic Web Conference (2008), p. 615-631
- [Markines2009] Benjamin Markines, Ciro Cattuto, Filippo Menczer, Dominik Benz, Andreas Hotho, Gerd Stumme: Evaluating Similarity Measures for Emergent Semantics of Social Tagging. In: Proc. 18th Intl. World Wide Web Conference (2009), p. 641-641
- [Strohmaier2009] Markus Strohmaier, Christian Körner, Roman Kern: Why do users tag? Detecting users' motivation for tagging in social tagging systems. Technical Report, Knowledge Management Institute, Graz University of Technology (2009)

