Harvesting Social Knowledge from Folksonomies Harris Wu, Mohammad Zubair, Kurt Maly, Harvesting social knowledge from folksonomies, Proceedings of the.

1 Harvesting Social Knowledge from Folksonomies
Harris Wu, Mohammad Zubair, Kurt Maly. Harvesting social knowledge from folksonomies. Proceedings of the seventeenth conference on Hypertext and hypermedia, August 22-25, 2006, Odense, Denmark.

2 Folksonomy & Collaborative Tagging Defined
Folksonomy: “a collaboratively generated, open-ended labeling system that enables Internet users to categorize content such as web pages, online photographs and Web links.”
Collaborative tagging: “Tagging a collection of documents commonly accessible to a large group rather than tagging contents located all over the web, which is instead called social bookmarking.”

3 Benefits of Collaborative Tagging
Tags reveal an individual’s structural knowledge about documents, that is, knowledge of how concepts in a domain are interrelated.
Tags codify the knowledge relationships between documents and the concepts the tags represent.
Tagging is low cost: the work is spread over large groups of people.
There is no complicated, hierarchical nomenclature to learn; users tag on the fly in plain language.

4 Benefits of Collaborative Tagging
Open ended: tagging responds quickly to changes and developments in the way users group content.
It is considered democratic metadata generation: tags are generated by both content authors and users.
Users can search the content they have tagged using a personal vocabulary.
Users who share interests share vocabularies, so tags made by one user are helpful to another.
Tags can use low-frequency keywords not served by a controlled vocabulary.
Tags provide dynamic hyperlinks between tags, documents and users.

5 Challenges
Folksonomies can be seen as an emergent knowledge taxonomy, but the lack of a hierarchy prevents them from being widely adopted by enterprises.
They suffer from polysemy (words having multiple related meanings) and synonymy (different words having the same or similar meanings); controlled vocabularies are not vulnerable to this.
They invite idiosyncratic tagging, which can create meta-noise and decrease the usability of the system.

6 Design Components: Community Identification
Much research in the WWW community has been dedicated to evolving topical communities of users and documents.
Existing community identification techniques fit into 3 categories:
Spectral (global): apply singular value decomposition to large matrices representing relationships among the elements of a large collection; attempts to identify all communities in the collection.
Bibliometrics (local): identifies pairwise affinity among users.
Network-flow based (hybrid): can identify broader communities containing a known existing community.
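
As a rough illustration of the spectral category, the sketch below applies singular value decomposition to a tiny user-by-tag matrix and groups users by the singular vector they load on most strongly. It is not the authors' procedure: the users, tags, matrix values and the function name `spectral_communities` are all made up for this example, and the grouping rule is one simple choice among many.

```python
import numpy as np

# Hypothetical users and tags, invented for illustration only.
users = ["alice", "bob", "carol", "dave", "erin"]
tags = ["python", "numpy", "pandas", "cooking", "baking"]

# usage[i, j] = 1 if user i created or accessed tag j (made-up data).
usage = np.array([
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
], dtype=float)


def spectral_communities(matrix, k):
    """Label each row with the index of the top-k left singular vector
    on which it has the largest absolute loading."""
    left, _singular_values, _ = np.linalg.svd(matrix, full_matrices=False)
    return np.abs(left[:, :k]).argmax(axis=1).tolist()


labels = spectral_communities(usage, k=2)
communities = dict(zip(users, labels))
```

In this toy matrix the programming taggers and the cooking taggers share no tags, so the two leading singular vectors separate them cleanly; real tagging data overlaps far more and would need a more careful clustering step.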

7 Design Components: Community Identification
The design in this paper uses a spectral technique to identify global communities from the authorship and usage of tags and documents.
Documents, tags and users are all nodes in a network.
A link is added from each tag to every associated document.
A link is added from every user to each tag that user created or accessed.
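
The two link rules on this slide can be sketched as a small graph-building routine. This is a minimal sketch under assumptions: the input record shapes, the `("user" | "tag" | "doc", name)` node encoding and the function name `build_network` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_network(tag_assignments, tag_accesses=()):
    """Build directed links between user, tag and document nodes.

    tag_assignments: iterable of (user, tag, document) records, meaning the
        user created/applied the tag on the document (assumed input shape).
    tag_accesses: iterable of (user, tag) records, meaning the user accessed
        an existing tag (assumed input shape).
    """
    links = defaultdict(set)
    for user, tag, document in tag_assignments:
        # Rule 1: link from each tag to every associated document.
        links[("tag", tag)].add(("doc", document))
        # Rule 2: link from every user to each tag they created.
        links[("user", user)].add(("tag", tag))
    for user, tag in tag_accesses:
        # Rule 2 (continued): link from a user to each tag they accessed.
        links[("user", user)].add(("tag", tag))
    return {node: sorted(targets) for node, targets in links.items()}


# Made-up example data.
network = build_network(
    tag_assignments=[("alice", "folksonomy", "paper.pdf"),
                     ("bob", "folksonomy", "slides.ppt")],
    tag_accesses=[("carol", "folksonomy")],
)
```

Keeping the node type in the key avoids collisions when a user, a tag and a document happen to share a name.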

8 Design Components: User and Document Recommendation
People need to be able to find high-quality sources; sources can be documents or people.
Experts tend to use high-quality documents and can better associate documents with concepts.
Existing collaborative tagging systems are limited in identifying experts and quality documents because they rely on tallying tags or frequency of usage.
The authors use a modified version of the HITS algorithm to obtain expert hubs and high-quality documents (authorities) related to a keyword, based on usage and tag structure.
HITS is an algorithm known to be effective in finding high-quality sources in hypertext environments.

9 Design Components: User and Document Recommendation
The root set includes the documents tagged with the keyword.
The set is expanded to include all tags associated with any document in the root set, the documents under these tags, and the users who have accessed these tags.
A link is added from each keyword to every document tagged with that keyword, and from each user to every tag they have assigned or used.
The link structure is captured in a matrix A, where A_ij indicates whether there is a link from node i to node j (a node is a document, tag or user).
Users are sources (nodes with outgoing links only).
Documents are sinks (nodes with incoming links only).
The computed hubs are therefore guaranteed to be users, and the authorities documents.
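
For readers unfamiliar with HITS, the sketch below runs the plain hub/authority iteration on a small made-up graph. It is not the authors' modified version, whose details are not given in these slides. As a simplifying assumption of this example only, the tag layer is collapsed so that each user links directly to the documents reachable through their tags; the names `hits`, `user_tags` and `tag_docs` are hypothetical.

```python
import math


def hits(links, iterations=50):
    """Plain HITS: links maps each source node to a set of target nodes."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of hub scores of the nodes linking in.
        auth = {n: 0.0 for n in nodes}
        for source, targets in links.items():
            for target in targets:
                auth[target] += hub[source]
        # Hub score: sum of authority scores of the nodes linked to.
        hub = {n: sum(auth[t] for t in links.get(n, ())) for n in nodes}
        # Normalize both score vectors to unit length.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for n in scores:
                scores[n] /= norm
    return hub, auth


# Made-up data: which tags each user used, and which documents carry each tag.
user_tags = {"alice": {"hits", "svd"}, "bob": {"hits"}, "carol": {"svd"}}
tag_docs = {"hits": {"doc1", "doc2"}, "svd": {"doc2", "doc3"}}

# Assumption for this sketch: collapse tags into direct user -> document links.
user_docs = {
    user: set().union(*(tag_docs[tag] for tag in used))
    for user, used in user_tags.items()
}

hub, auth = hits(user_docs)
top_expert = max(hub, key=hub.get)
top_document = max(auth, key=auth.get)
```

Because users only have outgoing links and documents only incoming ones in this collapsed graph, every user ends with an authority score of zero and every document with a hub score of zero, which mirrors the source/sink observation on the slide.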

10 Design Components: Ontology Generation
An ontology or hierarchy is a useful structure when navigating.
Ontologies can assist with keyword search.
An ontology can be used to create a common hierarchy for a large collection of documents.
A person’s tags represent their structural knowledge about the documents they have viewed.
A common hierarchy represents a form of global knowledge about the larger document collection.

11 Design Components: Discussion
The discussion describes a framework that collects social knowledge from folksonomies.
It gains social knowledge from associations between tags and documents, as well as from links and user behavior.
The end result is a taxonomy of documents rather than a taxonomy of tags.
Synonymous and polysemous tags do not present a problem, since they change the associative routes but not the spectral analyses.
Tags are intermediate objects between documents and users.

12 Evaluation
To test how effective the system was compared with alternative choices, the authors used 3 types of evaluation.
Offline studies:
Paper-based questionnaires and interviews.
Participants tagged a set of documents; taxonomies were then generated both with different techniques and manually.
According to the participants, the hierarchy produced by the authors’ technique gave better results.

13 Evaluation
Test websites:
The authors used existing websites with users and documents.
They added a tagging system and a feedback system for rating a tag as “useful” or “not useful”.
Results based on tags and clicks showed that the tags and expert users identified by the system had higher-than-average user ratings.
The experts identified by the system also had higher-than-average scores, suggesting that knowledgeable users tend to create high-quality tags.

14 Evaluation
Pilot systems:
The design was applied to the ARCHON digital library, a large knowledge environment.
For internal validation: evaluate the algorithms against the original data.
For external validation: ask human subjects to evaluate results from the design solutions through online feedback, questionnaires and interviews.
To test scalability: simulate large amounts of user input data.
To test robustness: study the impact of statistical sampling and disturbance of the input data.

15 Conclusion
Collaborative tagging systems have the potential to become infrastructure for gathering social knowledge.

16 Questions
In what situations have you used collaborative tagging? Did it add value? Why or why not?
Collaborative tagging has the potential to expose new ways in which people think about ideas and how they relate. Is there a way to do this with a controlled vocabulary?
How can folksonomies contribute to well-established hierarchies and ontologies?

