On Stability, Clarity, and Co-occurrence of Self-Tagging Aixin Sun and Anwitaman Datta Nanyang Technological University Singapore.


1 On Stability, Clarity, and Co-occurrence of Self-Tagging Aixin Sun and Anwitaman Datta Nanyang Technological University Singapore

2 Outline
- Collaborative-tagging vs. Self-tagging
- Dataset overview and characteristics
- Experiments
  - Tag Usage and Stability
  - Tag Clarity vs. Popularity
  - Tag Co-occurrence vs. Semantic distance
- Conclusion
- Questions/suggestions forwarded

3 Collaborative-tagging vs. Self-tagging
- Collaborative tagging
  - A resource may be tagged by multiple users with multiple tags, e.g., del.icio.us and CiteULike.
- Self-tagging
  - A resource can only be tagged by its creator, e.g., most blog posts.
- Questions
  - Are there any differences in tagging behavior?
  - Do observations made on collaborative tagging hold in self-tagging?
  - When tags are used in an application (e.g., tag recommendation, classification/clustering), should the two systems be treated differently?

4 Dataset
- Overview:
  - Blogs listed in http://dir.blogflux.com/ and hosted by blogspot.com
  - Categories: Academic – Zookeeping
  - Blogs: 15,244; Posts: 3.3M
  - Posts with tag(s): 983K
  - Distinct tags: 29K
- Characteristics [Marlow06]

5 Tag Usage

6 Tag Dynamics
- Collaborative tagging systems [Halpin07]
  - The distribution of tags used to collaboratively annotate a particular resource becomes stable after a certain time period.
  - Tags that describe the resource well are repeatedly received from multiple users.
- Possible reasons [Golder06]:
  - Imitation of others
  - Shared knowledge
- Self-tagging systems?
  - No direct interaction through which bloggers can influence and imitate each other
  - Bloggers may read each other's posts and tags
  - Shared background?
  - An implicit consensus of tag usage.

7 Tag Stability
- A relatively small set of tags is used to annotate most blog posts
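One way to check the claim above on a tagged corpus is to measure how many tagged posts carry at least one of the k most frequent tags. The sketch below is only an illustration: the function name `tag_coverage` and the toy posts are hypothetical, and the slides do not say how the authors measured this.

```python
from collections import Counter


def tag_coverage(post_tags, top_k):
    """Fraction of tagged posts that carry at least one of the top_k most frequent tags.

    post_tags: list of tag lists, one list per post (hypothetical input format).
    """
    # Count each tag once per post so repeated tags within a post do not inflate it.
    counts = Counter(tag for tags in post_tags for tag in set(tags))
    top = {tag for tag, _ in counts.most_common(top_k)}
    tagged = [tags for tags in post_tags if tags]
    if not tagged:
        return 0.0
    covered = sum(1 for tags in tagged if top & set(tags))
    return covered / len(tagged)


# Toy data, invented for illustration only.
posts = [["music", "live"], ["music"], ["travel"], ["music", "travel"], ["cooking"]]
for k in (1, 2):
    print(k, tag_coverage(posts, k))
```

If coverage climbs quickly and then flattens as k grows, a small tag vocabulary accounts for most posts.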

8 Tag Clarity
- Question
  - Does the same tag tend to be assigned to topically similar blog posts?
- Tag clarity:
  - A tag receives a high clarity score if all posts annotated by the tag are topically cohesive
  - Inspired by the query clarity score in ad-hoc retrieval [Cronen-Townsend02]
  - The clarity score of a tag is the distance between the tag language model and the collection language model
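A minimal sketch of the clarity score follows. It assumes the "distance" is the KL divergence used for query clarity in [Cronen-Townsend02], that is, clarity(t) = sum over words w of P(w|t) * log2(P(w|t) / P(w|C)). It also assumes plain maximum-likelihood unigram models with no smoothing. The slides do not state how the language models were estimated, so the function names and the toy collection are hypothetical.

```python
import math
from collections import Counter


def unigram_model(docs):
    """Maximum-likelihood unigram model over a list of tokenized documents."""
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def tag_clarity(tag_docs, all_docs):
    """KL divergence (in bits) from the collection model to the tag model.

    Assumes tag_docs is a subset of all_docs, so every word seen under the
    tag also has non-zero probability in the collection model.
    """
    p_tag = unigram_model(tag_docs)
    p_coll = unigram_model(all_docs)
    return sum(p * math.log2(p / p_coll[word]) for word, p in p_tag.items())


# Toy collection of tokenized posts, invented for illustration only.
collection = [
    ["guitar", "chord", "song"],
    ["guitar", "song", "band"],
    ["pasta", "sauce", "recipe"],
    ["flight", "hotel", "recipe"],
]
cohesive = collection[:2]                 # posts on one topic
mixed = [collection[0], collection[3]]    # posts on unrelated topics
print(tag_clarity(cohesive, collection), tag_clarity(mixed, collection))
```

A tag whose posts share vocabulary drifts further from the collection model, so it scores higher than a tag spread over unrelated posts.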

9 Tag Clarity vs. Tag Popularity
- The number of tags decreases as tag popularity increases
- Clarity scores of tags decrease as popularity increases

10 Tag Clarity vs. Tag Popularity
- Less popular tags have clarity scores close to those of dummy tags
- More popular tags have higher clarity scores than dummy tags

11 Tag Co-occurrence vs. Semantic distance
- Co-occurrence
- Semantic distance:
  - KL-divergence between the two tag language models
- Question:
  - If tags co-occur in annotating blog posts, is their semantic distance small?
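The semantic distance above can be sketched as a KL divergence between two unigram tag language models. KL divergence is undefined when the second model assigns zero probability to a word, and the slides do not say how this was handled. The additive smoothing over the joint vocabulary used here (parameter `alpha`) is therefore an assumption, as are the function names and the toy posts.

```python
import math
from collections import Counter


def smoothed_model(docs, vocab, alpha=0.5):
    """Unigram model over vocab with additive smoothing (alpha is an assumed choice)."""
    counts = Counter(word for doc in docs for word in doc)
    total = sum(counts.values()) + alpha * len(vocab)
    return {word: (counts[word] + alpha) / total for word in vocab}


def kl_divergence(p, q):
    """KL(p || q) in bits; p and q must be defined over the same vocabulary."""
    return sum(p[word] * math.log2(p[word] / q[word]) for word in p)


def semantic_distance(docs_a, docs_b, alpha=0.5):
    """KL divergence between the language models of two tags' posts.

    Note that this is asymmetric: swapping the arguments can change the value.
    """
    vocab = {word for doc in docs_a + docs_b for word in doc}
    p = smoothed_model(docs_a, vocab, alpha)
    q = smoothed_model(docs_b, vocab, alpha)
    return kl_divergence(p, q)


# Toy tokenized posts per tag, invented for illustration only.
rock = [["guitar", "band", "song"]]
music = [["guitar", "song", "album"]]
food = [["pasta", "sauce", "recipe"]]
print(semantic_distance(rock, music), semantic_distance(rock, food))
```

Tags whose posts share vocabulary end up closer than tags whose posts do not.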

12 Tag Co-occurrence vs. Semantic distance

13 Tag Co-occurrence vs. Semantic distance
- Observations
  - The co-occurrence of two tags does not suggest any semantic relationship between the two tags (correlation coefficient = 0.017).
  - Tag pairs (e.g.,, ) are much clearer in describing posts, as supported by their clarity scores.
  - Tag pairs are likely to be semantically orthogonal, partially consistent with [Weinberger08].
- Possible reasons:
  - Tags are more for personal use than for others' benefit.
  - When a blogger has a clear understanding of her post, she does not need to tag it with many similar tags. Rather, she may tag the post with tags from different perspectives.
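The correlation coefficient quoted above relates two per-pair quantities: a co-occurrence score and a semantic distance. The slides do not name the coefficient used. The sketch below assumes Pearson correlation, and all the numbers in it are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (norm_x * norm_y)


# Hypothetical per-tag-pair values, not taken from the study.
cooccurrence = [1.0, 2.0, 3.0, 4.0]
distance = [1.0, -1.0, -1.0, 1.0]
print(pearson(cooccurrence, distance))
```

A value near zero, as reported on the slide, means that knowing how often two tags co-occur says little about how far apart their language models are.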

14 Tag Clarity vs. Tag Popularity (Revisit)

15 Conclusion
- A preliminary study on tags in a self-tagging system
  - Tag dynamics
  - Tag clarity vs. popularity
  - Tag co-occurrence vs. semantic distance
- Observations:
  - As measured by tag clarity, tags are often assigned to topically similar blog posts.
  - Co-occurring tags are not necessarily semantically similar to each other, and are likely to be semantically orthogonal.

16 Questions/suggestions forwarded
- For resources tagged only by their owner, people will avoid redundancy but provide different aspects of a single resource. How does this feature influence applications on such a system?
- Can we expect that different facets can be extracted from a self-tagging system?
- This system allows only one user to tag each resource, and allows the user to use multi-word/phrase tags. It must be sparsely linked data, and the co-occurrence of tags must be lower than in a free tagging system. Could we expect some differences from this point of view?

17 More questions/suggestions
- How does this difference make research and applications on self-tagging systems challenging?
- I wonder if the convergence of tags to the final set of tags is driven primarily by the dominance of a few tags. If you omit the handful of most common topics, do the remainder also converge?
- Several blogging systems show author tags and reader tags separately. It would be interesting to see the overlap between these and the effect of one on the other.

18 Acknowledgement
- This work was supported by A*STAR Public Sector R&D, Singapore

19 References
- [Cronen-Townsend02] S. Cronen-Townsend, Y. Zhou, and W. B. Croft. Predicting query performance. In Proc. of SIGIR'02, pages 299–306, Tampere, Finland, 2002.
- [Golder06] S. A. Golder and B. A. Huberman. Usage patterns of collaborative tagging systems. Journal of Information Science, 32(2):198–208, 2006.
- [Halpin07] H. Halpin, V. Robu, and H. Shepherd. The complex dynamics of collaborative tagging. In Proc. of WWW'07, pages 211–220, Banff, Alberta, Canada, 2007.
- [Marlow06] C. Marlow, M. Naaman, D. Boyd, and M. Davis. HT06, tagging paper, taxonomy, Flickr, academic article, to read. In Proc. of ACM HyperText'06, pages 31–40, Odense, Denmark, 2006.
- [Weinberger08] K. Weinberger, M. Slaney, and R. van Zwol. Resolving tag ambiguity. In ACM Multimedia, Vancouver, Canada, 2008.

20 Thank you

