Harnessing manpower for creating semantics (doctoral dissertation)
Jakub Šimko, jsimko@fiit.stuba.sk
Institute of Informatics and Software Engineering, Faculty of Informatics and Information Technologies, Slovak University of Technology in Bratislava
Supervised by: prof. Mária Bieliková
July 4th, 2013

Games with a purpose (GWAP) for semantics acquisition

Games with a purpose Cheap (once they are created) Difficult to create [Quinn & Bederson. Human computation: a survey and taxonomy of a growing field. CHI’11, 2011]

ESP Game: image metadata acquisition
What is in the image?
Player 1: water, sky, bridge, Mostar
Player 2: night, river, bridge, Bosnia
The players must blindly match
Banned words: blue, towers
[Von Ahn & Dabbish. Designing games with a purpose. Commun. ACM, 2008.]
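The blind matching on this slide can be sketched in a few lines. This is a minimal illustration only, not the ESP Game's actual implementation: the function name `esp_match` is hypothetical, and the sketch ignores the timing of guesses, which a real two-player game would take into account.

```python
def esp_match(guesses_p1, guesses_p2, banned):
    """Return the first guess of player 1 that player 2 also made.

    Banned (taboo) words never count as a match. Comparison is
    case-insensitive. Returns None when the players do not agree.
    """
    banned = {word.lower() for word in banned}
    allowed_p2 = {guess.lower() for guess in guesses_p2} - banned
    for guess in guesses_p1:
        guess = guess.lower()
        if guess not in banned and guess in allowed_p2:
            return guess
    return None


# Guesses taken from the slide example.
player_1 = ["water", "sky", "bridge", "Mostar"]
player_2 = ["night", "river", "bridge", "Bosnia"]
agreed = esp_match(player_1, player_2, banned={"blue", "towers"})
print(agreed)
```

With the slide's example, the only word both players typed is "bridge", so that word would become the image tag.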

Motivation
Open issues in semantics acquisition
◦ Modelling of specific domains
◦ Personal multimedia metadata acquisition
◦ Metadata upkeep
Games with a purpose (GWAPs): design issues
◦ In general: no design methodology (young problem area)
◦ Cold start problems
◦ Quality management, effectiveness of work allocation

Thesis goals
1. Create new, GWAP-based approaches to semantics creation, particularly for specific domains
2. Bring in generally applicable improvements to GWAP design, focusing on selected problems

Work overview
State of the art: GWAP taxonomy and design space
GWAPs we created:
◦ Little Search Game: term network acquisition
◦ PexAce: (personal) imagery tag acquisition
◦ CityLights: validation of music metadata
General GWAP design improvements:
◦ Helper artifacts: cold start problem reduction
◦ Player competences: improving GWAP output quality

Our taxonomy of GWAPs

GWAP design
A relatively new area (<10 years)
No holistic design methodology exists
◦ GWAPs are created ad hoc
Few works aimed at particular design issues
◦ [Von Ahn, 2008] Player agreement schemes
◦ [Chiou, 2011] Suggested considering player skills in GWAPs
Our contribution: GWAP design dimensions
◦ following the idea of design lenses [Schell, 2008]
[Von Ahn & Dabbish. Designing games with a purpose. Commun. ACM, 2008.]
[Chiou & Hsu. Capability-aligned matching: improving quality of games with a purpose. AAMAS ’11]
[J. Schell. The Art of Game Design: A Book of Lenses. Elsevier/Morgan Kaufmann, 2008.]

Our GWAP design dimensions

Existing GWAPs in our design space

PexAce
Goal: acquire (personal) image tags
New artifact validation model
Quality management through player modelling
International Journal of Human-Computer Studies [In press]
- Šimko, J., Tvarožek, M., Bieliková, M.: Human Computation: Single-player Annotation Game for Image Metadata.
SMAP 2011 (IEEE CS Press)
- Šimko, J., Bieliková, M.: Games with a Purpose: User Generated Valid Metadata for Personal Archives.
I-Semantics 2012 (ACM)
- Šimko, J., Bieliková, M.: Personal Image Tagging: a Game-based Approach. I-Semantics, 2012

PexAce: acquisition of image metadata
Cards: an image-pair-seeking memory game
Players create image annotations to aid their memory

PexAce: general domain deployment
(Standard) Corel 5K dataset: photos + tags + our tags
107 players, 814 games, 2 792 images
22 176 annotations, 5 723 tags
Gold standard comparison: 73% precision
A posteriori evaluation: 94% precision
Automated methods: ~70% *
◦ Limited set of tags
* [Duygulu et al. Object recognition as machine translation: Learning a lexicon for a fixed image vocabulary. Springer-Verlag, 2002.]

PexAce for personal images
Personal image metadata: virtually impossible to get
Personal images instead of general images in PexAce
◦ Players like that more
◦ They provide specific annotations (metadata)
Experiments: 2 x 2-player groups, 50 images each
Correctness: 94%
◦ 44% specific tags
Tag types: Persons (53%), Events (21%), Places (15%), Other (11%)

„Benevolent“ artifact validation model
Original mutual player supervision
Less strict heuristics
Annotations decomposed to votes: P - players, T - terms, I - images
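The decomposition of annotations into (player, term, image) votes might look roughly like the sketch below. The slide does not give the actual heuristics, so everything here is an assumption made for illustration: the whitespace tokenization, the `min_players` threshold and the function name `validate_tags` are all hypothetical.

```python
from collections import defaultdict


def validate_tags(annotations, min_players=2):
    """Turn free-text annotations into votes and keep the supported terms.

    annotations: iterable of (player, image, text) triples.
    A term is accepted for an image once at least `min_players`
    distinct players used it (an assumed, illustrative heuristic).
    Returns {image: set of accepted terms}.
    """
    voters = defaultdict(set)  # (term, image) -> players who voted for it
    for player, image, text in annotations:
        for term in set(text.lower().split()):
            voters[(term, image)].add(player)

    valid = defaultdict(set)
    for (term, image), players in voters.items():
        if len(players) >= min_players:
            valid[image].add(term)
    return dict(valid)


sample = [
    ("p1", "img1", "old bridge"),
    ("p2", "img1", "bridge river"),
    ("p1", "img2", "dog"),
]
accepted = validate_tags(sample)
print(accepted)
```

Counting distinct players rather than raw votes keeps one player from validating a term alone by repeating it.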

Artifact validation and cold start problem: a general GWAP issue
„How can a result of a human intelligence task be automatically evaluated?“
GWAPs use:
◦ Approximate or exact automated evaluation (case dependent)
◦ Mutual player supervision
Threat to multiplayer validation schemes: COLD START
“The requirement is to have multiple players online at the same time, sometimes with a requirement that they cannot communicate.”
Keep the games single-player

Helper artifacts: a new artifact validation principle
Helper artifacts:
◦ Decouple scoring from task solving; instead, motivate players to solve tasks to help themselves progress in the game
◦ E.g. in PexAce, a player may win the game well enough even without the annotations
◦ Potential of general applicability (to any existing game)

Quality management in GWAPs: considering differences in player competences
1. Quantify player skills: player model (e.g. player’s task-solving expertise for each sub-domain)
2. Apply the model in
a) “post-processing”: solution filtering (e.g. vote weighting)
b) “pre-processing”: task assignment (e.g. match task sub-domain to expertise areas)
3. Speed up the process and/or retrieve higher quality results
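As a rough illustration of the "post-processing" option, vote weighting could be sketched as follows. This is not the weighting scheme used in the thesis: the competence scale (0 to 1), the acceptance `threshold`, the `default` weight for unknown players and the name `weighted_tags` are assumptions chosen for the example.

```python
from collections import defaultdict


def weighted_tags(votes, competence, threshold=1.0, default=0.5):
    """Accept (image, term) pairs whose competence-weighted support is high enough.

    votes: iterable of (player, image, term) triples.
    competence: {player: weight}, assumed here to lie in [0, 1].
    Players missing from the model get the `default` weight.
    """
    support = defaultdict(float)
    for player, image, term in votes:
        support[(image, term)] += competence.get(player, default)
    return {key for key, value in support.items() if value >= threshold}


votes = [("a", "img", "cat"), ("b", "img", "cat"), ("c", "img", "dog")]
model = {"a": 0.9, "b": 0.3, "c": 0.8}
accepted_pairs = weighted_tags(votes, model)
print(accepted_pairs)
```

In this example "cat" collects 0.9 + 0.3 and passes the threshold, while "dog" has a single vote of 0.8 and does not. With equal weights the same function reduces to plain vote counting.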

Measuring player competences: PexAce data
Usefulness (delivery of correct artifacts)
Consensus ratio (agreement with other players)
Correlation: 0.496
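The two measures and their correlation could be computed along the lines below. The slide names the measures but does not define them precisely, so the definitions here are assumptions: usefulness as the share of a player's artifacts judged correct, consensus ratio as the share also produced by other players, and Pearson's coefficient as the correlation, presumably taken between the two measures across players. The function names are hypothetical.

```python
from math import sqrt


def competence_ratios(player_tags, correct, others):
    """Return (usefulness, consensus ratio) for one player.

    player_tags: set of (image, term) pairs created by the player.
    correct: set of pairs judged correct.
    others: set of pairs created by other players.
    """
    total = len(player_tags)
    usefulness = len(player_tags & correct) / total
    consensus = len(player_tags & others) / total
    return usefulness, consensus


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    dev_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (dev_x * dev_y)


tags = {("i", "a"), ("i", "b"), ("i", "c"), ("i", "d")}
print(competence_ratios(tags, correct={("i", "a"), ("i", "b"), ("i", "c")},
                        others={("i", "a")}))
```

Consensus can be measured without human judges, so it is the cheaper signal; whether it is a good enough proxy for usefulness depends on how strongly the two correlate.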

Little Search Game
Goal: acquire lightweight term network
◦ statistically unsupported, yet valid term relationships
◦ specific domain use
Int. J. on Semantic Web and Information Systems
- Šimko, J., Tvarožek, M., Bieliková, M.: Semantics Discovery via Human Computation Games. In: International Journal on Semantic Web and Information Systems (2011)
Hypertext 2011 (ACM)
- Šimko, J., Tvarožek, M., Bieliková, M.: Little Search Game: Term Network Acquisition via a Human Computation Game. Hypertext, 2011

Little Search Game (negative search game)
Search query: „Star –movie –war –death“
Creation of lightweight term network
Player’s task: reduce number of results with negative search
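A negative search query and the resulting term network might be sketched like this. The slide only states that players reduce the result count with negative terms, so the aggregation rule is an assumption: an edge is kept when at least `min_players` distinct players paired the same two terms. Both function names are hypothetical, and the sketch uses a plain hyphen as the exclusion operator.

```python
from collections import Counter


def negative_query(term, negatives):
    """Build a query such as 'star -movie -war -death'."""
    return " ".join([term] + [f"-{negative}" for negative in negatives])


def term_network(games, min_players=2):
    """Aggregate players' negative terms into weighted term-term edges.

    games: iterable of (player, query_term, negatives) triples.
    Each player counts at most once per edge.
    Returns {(query_term, negative_term): number of players}.
    """
    seen = set()
    for player, term, negatives in games:
        for negative in negatives:
            seen.add((player, term.lower(), negative.lower()))
    counts = Counter((term, negative) for _, term, negative in seen)
    return {edge: n for edge, n in counts.items() if n >= min_players}


query = negative_query("star", ["movie", "war", "death"])
print(query)
network = term_network([
    ("p1", "star", ["movie", "war"]),
    ("p2", "Star", ["movie", "sun"]),
])
print(network)
```

Here only the pair (star, movie) is suggested by both players, so it is the only edge that enters the network.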

LSG term network evaluation
A posteriori evaluation: 91% correctness
A potential to add term relationships to existing bases
◦ 59% of LSG relationships do not exist in the ConceptNet * corpus
◦ …including demanded non-taxonomic relationships
* [Liu & Singh. ConceptNet: A Practical Commonsense Reasoning Tool-Kit. BT Technology Journal, 2004]

Hidden term relationships – hard for automated discovery (40% of LSG term network)

LSG modification: TermBlaster (harvesting relationships for the software design domain)
Specific domain
No text typing
71% correct, 21% „hidden relationships“

CityLights
Goal: validate existing music tags
Quality management through confidence expression
I-Semantics 2012 (ACM)
- Dulačka, P., Šimko, J., Bieliková, M.: Validation of Music Metadata via Game with a Purpose. I-Semantics 2012

CityLights: music tag validation (a concept of validation question)
Validation question: “Which of these tag groups characterizes the music track you hear?”
1. Rockabilly, USA, 60ties
2. Seasonal, rich oldies, xmas
3. February 08 love, oldies, 60 musik
Tag support value:
+ increases when the player selects the tag's group
- decreases when the player doesn’t select the group or rules out the tag
Wrong and correct tags bubble out
Positive and negative thresholds
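The tag support update can be illustrated with a small sketch. The step size, the threshold values and the function names below are assumptions made for the example; the slide does not give the parameters CityLights actually uses.

```python
def apply_answer(support, groups, selected, ruled_out=(), step=1.0):
    """Update tag support values after one validation question.

    support: {tag: value}, modified in place and returned.
    groups: list of tag groups shown to the player.
    selected: index of the group the player picked.
    ruled_out: tags the player explicitly rejected.
    """
    for index, group in enumerate(groups):
        delta = step if index == selected else -step
        for tag in group:
            support[tag] = support.get(tag, 0.0) + delta
    for tag in ruled_out:
        support[tag] = support.get(tag, 0.0) - step
    return support


def classify(support, positive=3.0, negative=-3.0):
    """Split tags into (correct, wrong) once they cross a threshold."""
    correct = {tag for tag, value in support.items() if value >= positive}
    wrong = {tag for tag, value in support.items() if value <= negative}
    return correct, wrong


state = apply_answer({}, [["rockabilly", "usa"], ["seasonal", "xmas"]],
                     selected=0, ruled_out=["xmas"])
print(state)
print(classify(state, positive=1.0, negative=-2.0))
```

Tags between the two thresholds stay undecided and keep appearing in questions until enough feedback accumulates.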

CityLights: experiments
LastFM dataset
875 games, 4933 questions, 1492 tags
Feedback actions per tag:
◦ 17.75 implicit
◦ 5.29 explicit
Optimized parameter configuration
◦ 68% correctness

Betting mechanism: measuring competence through confidence
Betting mechanism within a GWAP
Through the size of the bet, the player expresses confidence in his or her task solution
CityLights case: the impact on the tag validity value scales with the size of the bet
Helps with the cold start problem associated with user modeling
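One simple way to let a bet scale the impact of an answer is a linear mapping, sketched below. The linear form, the `max_bet` of 100 and the name `apply_bet` are assumptions for illustration; the slide only says that the bet aligns with the impact on the tag validity value.

```python
def apply_bet(support, selected_tags, bet, max_bet=100, base_step=1.0):
    """Raise the support of the selected tags in proportion to the bet.

    support: {tag: value}, modified in place and returned.
    bet: amount staked by the player, between 0 and max_bet.
    A full bet adds base_step; a zero bet adds nothing.
    """
    if max_bet <= 0 or not 0 <= bet <= max_bet:
        raise ValueError("bet must lie in [0, max_bet]")
    delta = base_step * bet / max_bet
    for tag in selected_tags:
        support[tag] = support.get(tag, 0.0) + delta
    return support


values = apply_bet({}, ["oldies"], bet=50)
values = apply_bet(values, ["oldies"], bet=100)
print(values)
```

Because the confidence comes with each answer, no history of the player is needed, which is why the approach suits new players.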

Main contributions
Definition of the GWAP design space
GWAPs for semantics acquisition
◦ For specific domains (personal images, software engineering)
◦ For otherwise hardly discoverable semantics (hidden relationships)
New GWAP design principles
◦ Helper artifacts for cold start reduction
◦ Metrics for long-term player competence modeling
◦ Betting mechanism for short-term player competence acquisition
◦ Metadata validation GWAP concept

Summary
GWAP taxonomy and design dimensions
◦ [survey paper prepared]
Little Search Game: lightweight term network acquisition, hidden term relationships
◦ Hypertext 2011, ACM
◦ Int. J. of Semantic Web and Information Systems, 2011 (CC, IGI)
PexAce: personal image metadata acquisition, helper artifacts, competence measures
◦ SMAP 2011, IEEE
◦ I-Semantics 2012, ACM
◦ Int. J. of Human-Computer Studies, 2013 (CC, Elsevier)
CityLights: music metadata validation, betting mechanics (player competence through confidence)
◦ I-Semantics 2012b, ACM

Selected publications
◦ Semantics Discovery via Human Computation Games. In: International Journal on Semantic Web and Information Systems. 2011
◦ Human Computation: Single-player Annotation Game for Image Metadata. International Journal of Human-Computer Studies. 2012 [In press].
◦ Validation of Music Metadata via Game with a Purpose. I-Semantics 2012 (ACM)
◦ Games with a Purpose: User Generated Valid Metadata for Personal Archives. SMAP 2011 (IEEE CS)
◦ Little Search Game: Term Network Acquisition via a Human Computation Game. Hypertext 2011 (ACM)
◦ Personal Image Tagging: a Game-based Approach. I-Semantics 2012 (ACM)

Harnessing manpower for creating semantics

LSG evaluation: relationship types
Data
◦ 400 nodes, 560 edges
◦ ConceptNet lightweight dataset
Method: identify relationship types
◦ A posteriori (2 judges)
◦ Reference dataset
Results
◦ Not all LSG relationships were present in ConceptNet
◦ Dominant relationship types: unlabelled, hasProperty, hasA, atLocation

Semantics acquisition
Semantics needed everywhere
Resource metadata acquisition
◦ Resource types: texts, multimedia, websites
Domain modelling
◦ Concept identification, relationship identification, labelling, interconnecting of datasets

Semantics acquisition
[Chart: approaches compared by output quality and output quantity]
Automated: quick, inexpensive (once created), scalable [3,4]
Crowdsourcing: human based, scalable, no specific problems, we still need to pay [5,6]
Expert: expensive, essential for certain tasks [1,2]

PexAce: image annotation Annotations Currently disclosed pair

PexAce: image annotation Annotation “tooltip”

Personal images: Experiments
Two social groups, in each:
◦ 2 players, 1 judge
◦ A set of 48 images in albums: Portraits, Groups, Situational and Non-person (other)
◦ One group was aware of the purpose, the other was not
Each player played 3 games
Each image was featured twice for a single player
Measured properties of tags
◦ Correctness
◦ Specificity
◦ Understandability
◦ Type of tag (person, event, place, other)

Personal images: Experiments

             Aware (253 tags)       Unaware (108 tags)
             Corr.  Spec.  Und.     Corr.  Spec.  Und.
Portraits    0.98   0.61   0.71     0.77   0.53   0.87
Groups       0.97   0.57   0.74     0.76   0.45   1.00
Situations   0.92   0.41   0.77     0.93   0.19   1.00
Other        0.98   0.18   0.82     0.88   0.15   1.00
Average      0.96   0.44   0.76     0.84   0.33   0.97

Tag types: Persons (53%), Events (21%), Places (15%), Other (11%)
