Topics in AI: Applied Natural Language Processing Information Extraction and Recommender Systems for Video Games Supervised by Dr. Noriko Tomuro Fall –


1 Topics in AI: Applied Natural Language Processing Information Extraction and Recommender Systems for Video Games Supervised by Dr. Noriko Tomuro Fall – 2009/2010

2 Project Goals
A. Videogame Search/Retrieval System
   Allows users to search videogames by multiple criteria:
   - Basic information of a game (e.g. developer, publisher, genre, platform), plus theme/concept. This part is nothing new.
   - “Qualitative features” of a game, based on the content (e.g. gameplay, visual style, sound & music). This is what makes our system unique.
   - Incorporate rating scores.
B. Videogame Recommendation System
   - Recommends games which are “similar” to a given game.
   - Similarity is measured (and ranked) by the multiple criteria above.
   - Personalizes search and retrieval results for individual users.

3 Steps & Tasks
1. Construct a “Videogame Lexicon”
   - Lists of titles, characters, locations, concepts/themes, extracted from GiantBomb and Gamespot
2. Associate those features with each game
   - Traverse links in GiantBomb => Create a relational database
At this point, we can have a preliminary system which allows users to search games by basic information and theme/concept features. Start designing the interface of the system.

4 Relational DB structure
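
A minimal sketch of how lexicon features could be linked to games in a relational database. The table and column names (`game`, `concept`, `game_concept`) and the sample rows are hypothetical and are not taken from the project's actual schema; the sketch only illustrates a many-to-many link between games and concepts/themes, queried by concept.

```python
import sqlite3

# Hypothetical schema: one table per entity type plus a link table.
SCHEMA = """
CREATE TABLE game (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    developer TEXT,
    genre TEXT,
    platform TEXT
);
CREATE TABLE concept (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE game_concept (
    game_id INTEGER NOT NULL REFERENCES game(id),
    concept_id INTEGER NOT NULL REFERENCES concept(id),
    PRIMARY KEY (game_id, concept_id)
);
"""

def games_with_concept(conn, concept_name):
    """Return the titles of all games linked to the given concept, sorted by title."""
    rows = conn.execute(
        "SELECT g.title FROM game g "
        "JOIN game_concept gc ON gc.game_id = g.id "
        "JOIN concept c ON c.id = gc.concept_id "
        "WHERE c.name = ? ORDER BY g.title",
        (concept_name,),
    )
    return [title for (title,) in rows]

# Illustrative data only.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO game (id, title) VALUES (?, ?)",
                 [(1, "Game A"), (2, "Game B"), (3, "Game C")])
conn.executemany("INSERT INTO concept (id, name) VALUES (?, ?)",
                 [(1, "military"), (2, "stealth")])
conn.executemany("INSERT INTO game_concept (game_id, concept_id) VALUES (?, ?)",
                 [(1, 1), (2, 1), (3, 2)])
```

Calling `games_with_concept(conn, "military")` returns the titles linked to that concept; the same link-table pattern would apply to characters and locations.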

5 Steps & Tasks (cont.)
3. For qualitative features, use NLP techniques to obtain information from Gamespot review texts. We start with the feature ‘gameplay’ and tackle other features if time permits.
   1. As pre-processing, annotate all Gamespot review texts with the words in our videogame lexicon (i.e., named entities of various types, such as titles and characters) => Re-generate the data so that named entities are marked and multi-word named entities are concatenated, e.g. “Mario_Bros./TTL”
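
The annotation step could be sketched roughly as below. This is an illustration under assumptions, not the project's tagger: the `LEXICON` entries and the `CHR` tag are hypothetical (only the `TTL` tag and the `Mario_Bros./TTL` output form appear on the slide), and matching is plain longest-first string lookup.

```python
import re

# Hypothetical lexicon: surface form -> entity tag (TTL = title, CHR = character).
LEXICON = {
    "Mario Bros.": "TTL",
    "Mario": "CHR",
    "Operation Flashpoint": "TTL",
}

def annotate(text, lexicon):
    """Rewrite each lexicon entry found in text as 'Joined_Form/TAG'."""
    # Longest entries first, so "Mario Bros." is preferred over "Mario".
    entries = sorted(lexicon, key=len, reverse=True)
    # The lookarounds stop an entry from matching inside a longer word.
    pattern = re.compile(
        "|".join(r"(?<!\w)" + re.escape(e) + r"(?!\w)" for e in entries)
    )

    def tag(match):
        surface = match.group(0)
        # Concatenate multi-word entities with underscores, then append the tag.
        return surface.replace(" ", "_") + "/" + lexicon[surface]

    return pattern.sub(tag, text)
```

For example, `annotate("I played Mario Bros. with Mario.", LEXICON)` yields `I played Mario_Bros./TTL with Mario/CHR.`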

6 Steps & Tasks (cont.)
Example:
Short for Armed Assault (which would have made an infinitely better title), it's much easier to think of ArmA as the spiritual sequel to 2001's critically acclaimed Cold War Crisis, an innovative military themed game that's as much simulation as it is shooter. That's because ArmA is the product of Bohemia Interactive, the European developer responsible for Operation Flashpoint.….
Legend: Title, Title abbreviation, Developer or Publisher, Concept/Theme, Genre

7 Steps & Tasks (cont.)
   2. Extract sentences in Gamespot reviews which express gameplay
      - Generate a set of adjectives which modify “gameplay” (independently, from Google n-gram data), and cluster them.
      - Each cluster will represent a semantic category of the member adjectives, e.g. speed, difficulty.
      addictive gameplay - 40106
      good gameplay - 28547
      unique gameplay - 22537
      excellent gameplay - 11578
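
Collecting the words that precede “gameplay” in bigram data might look like the sketch below. It assumes each input line has the form `word1 word2<TAB>count`, which is an assumption about the data layout, and it does not check part of speech, so non-adjectives would still need to be filtered out. The function name `gameplay_modifiers` is hypothetical.

```python
from collections import Counter

def gameplay_modifiers(lines, head="gameplay"):
    """Sum counts of the first word in every bigram whose second word is `head`."""
    counts = Counter()
    for line in lines:
        ngram, _, count = line.rstrip("\n").partition("\t")
        words = ngram.split()
        # Keep only well-formed bigrams of the form "<modifier> gameplay".
        if len(words) == 2 and words[1].lower() == head and count.isdigit():
            counts[words[0].lower()] += int(count)
    return counts

# Sample lines built from the counts listed on the slide, plus one non-matching bigram.
sample = [
    "addictive gameplay\t40106",
    "good gameplay\t28547",
    "unique gameplay\t22537",
    "excellent gameplay\t11578",
    "gameplay video\t999",
]
modifiers = gameplay_modifiers(sample)
```

`modifiers.most_common()` then lists the candidate adjectives by frequency, ready for clustering.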

8 Steps & Tasks (cont.)
   3. Manually inspect Gamespot reviews and identify other words/phrases/patterns (besides the words “gameplay” or “play”). Automatically extract all sentences which have those adjectives and/or match the patterns => Manually filter incorrect sentences.
   4. For each game, assign the adjectival semantic categories/clusters and/or the gameplay expression patterns => Those are the values of the ‘gameplay’ feature of the game. Store them in the database for the game.
   5. We may also want to do similar clustering for concepts/themes (which we extracted from GiantBomb).

9 Clustering ‘Gameplay’ Adjectives
Mutual Information:

             Noun 1   Noun 2   Noun 3   Noun 4   Noun 5
addictive    0.015    0.715    0.027    0.481    0.901
good         0.866    0.011    0.595    0.003    0.102
unique       0.027    0.595    0.113    0.1      0.107
actual       0.481    0.343    0.203    0.831    0.716
excellent    0.761    0.683    0.722    0.125    0.17
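
One way to compare adjectives before clustering is to treat each row of mutual-information scores as a vector and measure cosine similarity between rows. The sketch below uses the five rows from this slide; the choice of cosine similarity and the helper names `cosine` and `most_similar` are assumptions, since the slides do not say which similarity measure or clustering algorithm was used.

```python
from math import sqrt

# Mutual-information scores against Noun 1..Noun 5, as listed on the slide.
VECTORS = {
    "addictive": [0.015, 0.715, 0.027, 0.481, 0.901],
    "good":      [0.866, 0.011, 0.595, 0.003, 0.102],
    "unique":    [0.027, 0.595, 0.113, 0.1,   0.107],
    "actual":    [0.481, 0.343, 0.203, 0.831, 0.716],
    "excellent": [0.761, 0.683, 0.722, 0.125, 0.17],
}

def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def most_similar(word, vectors):
    """Return the other word whose vector has the highest cosine similarity to `word`."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine(vectors[word], vectors[w]))
```

A clustering algorithm (hard or soft) would then group adjectives whose vectors are close under this measure.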

10 ‘Gameplay’ Feature
Addictive: Obsessive, Hooking, Enslaving, Habit-forming, Involving
Originative: Original, Ingenious, New, Leading-edge, Innovative
Game Review Match

11 Other Tasks
a) [Step 2] Cluster adjectives to derive the adjectival semantic categories
   - Use the Google n-gram data; extract adjectives from the bi-grams “XX gameplay”.
   - Other patterns from 3-grams and 4-grams.
   - Try clustering into 5-10 categories/clusters. Also, soft clustering.
b) Apply a named entity tagger (e.g. Stanford NER) to all Gamespot reviews
   - In order to pick up more named entities
   - But we need to train the tool…
c) Apply partial parsing to the extracted sentences (which express gameplay)
   - In order to accurately identify the relevant adjective(s) in a sentence

12 Other Tasks (cont.)
d) Incorporate rating scores
e) Derive weights on the features
   - Which features are more important than others?
   - Feature weighting will help the ranking of the matched results (in search/retrieval) and the recommendations.

13 Game Recommendation Process
[Figure: diagram with the labels “Recommendation List” and “Game Clusters”]

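14 Clustering Games using User Ratings
Ratings matrix: rows are games, columns are User 1, User 2, User 3, User 4, User 5. Ratings listed per game (column positions not preserved):
Game 1: 5, 3, 1
Game 2: 5, 3, 1
Game 3: 5, 5, 3, 2
Game 4: 1, 4, 3
Game 5: 5, 4, 3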
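
Grouping games by user ratings could be sketched as follows. Everything here is an assumption for illustration: the distance (mean absolute rating difference over users who rated both games), the single-link grouping rule, the `max_distance` threshold, and the sample ratings. The slides do not state which clustering method was used.

```python
def rating_distance(r1, r2):
    """Mean absolute rating difference over shared users; None if no user rated both."""
    shared = set(r1) & set(r2)
    if not shared:
        return None
    return sum(abs(r1[u] - r2[u]) for u in shared) / len(shared)

def cluster_games(ratings, max_distance=1.0):
    """Single-link grouping: games within max_distance of each other share a cluster."""
    parent = {g: g for g in ratings}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    games = sorted(ratings)
    for i, a in enumerate(games):
        for b in games[i + 1:]:
            d = rating_distance(ratings[a], ratings[b])
            if d is not None and d <= max_distance:
                parent[find(a)] = find(b)

    clusters = {}
    for g in games:
        clusters.setdefault(find(g), []).append(g)
    return sorted(clusters.values())

# Hypothetical sparse ratings: game -> {user: rating}.
ratings = {
    "A": {"u1": 5, "u2": 4},
    "B": {"u1": 5, "u2": 5},
    "C": {"u1": 1, "u2": 2},
    "D": {"u3": 3},
}
```

With these sample ratings, games A and B fall into one cluster because users rated them almost identically, while C and D remain on their own.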

15 Measuring Game Similarity
Game 1 and Game 2 are compared on: Developer ID, Genre ID, ESRB ID, Character ID, Location ID, People ID
Clustering by Match
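
Matching two games on their feature IDs might be scored as in the sketch below. The use of Jaccard overlap per feature and a weighted average across features is an assumption; the slide only names the features being matched. The feature keys and ID values in the example are hypothetical.

```python
def jaccard(a, b):
    """Share of IDs common to both sets; 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def game_similarity(g1, g2, weights=None):
    """Weighted average of per-feature Jaccard overlap between two games."""
    features = sorted(set(g1) | set(g2))
    if weights is None:
        # Default: every feature counts equally.
        weights = {f: 1.0 for f in features}
    total = sum(weights.get(f, 0.0) for f in features)
    if total == 0:
        return 0.0
    score = sum(
        weights.get(f, 0.0) * jaccard(g1.get(f, ()), g2.get(f, ()))
        for f in features
    )
    return score / total

# Hypothetical games: feature name -> set of IDs.
game_1 = {"developer": {1}, "genre": {10, 11}}
game_2 = {"developer": {1}, "genre": {11, 12}}
```

Here the developer matches fully and the genres share one ID out of three, so `game_similarity(game_1, game_2)` averages 1 and 1/3. Passing a `weights` dictionary lets some features count more than others.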

16 Idea for Deriving Feature Weights
Clustering by specific quality feature match
Clustering by User Ratings
% of Overlap between the two clusterings => Feature Importance
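
One possible reading of the overlap idea is sketched below: for a single feature, take every pair of games that the feature-based clustering puts together and measure what fraction of those pairs the rating-based clustering also puts together. This pair-counting definition, the function names, and the sample assignments are assumptions; the slide does not define how overlap is computed.

```python
from itertools import combinations

def co_clustered_pairs(assignment):
    """Set of game pairs that share a cluster label in a game -> label mapping."""
    return {
        frozenset((a, b))
        for a, b in combinations(sorted(assignment), 2)
        if assignment[a] == assignment[b]
    }

def overlap(feature_clusters, rating_clusters):
    """Fraction of feature-clustered pairs that are also clustered together by ratings."""
    feature_pairs = co_clustered_pairs(feature_clusters)
    rating_pairs = co_clustered_pairs(rating_clusters)
    if not feature_pairs:
        return 0.0
    return len(feature_pairs & rating_pairs) / len(feature_pairs)

# Hypothetical cluster assignments: game -> cluster label.
by_ratings = {"g1": 0, "g2": 0, "g3": 1, "g4": 1}
by_genre = {"g1": 0, "g2": 0, "g3": 0, "g4": 1}
```

A feature whose clustering agrees more often with the rating-based clustering would receive a higher weight under this reading.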

17 Recommendation Generation Using Google Page-Rank Algorithm
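
A basic PageRank power iteration is sketched below. How the project builds its graph is not stated in the slides; the sketch assumes a directed graph in which each game points to the games it is considered similar to, and the node names, damping factor, and iteration count are illustrative choices.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict mapping node -> list of outgoing neighbours."""
    nodes = sorted(set(graph) | {t for targets in graph.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node starts each round with the teleportation share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                share = damping * rank[node] / n
                for target in nodes:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical similarity graph between three games.
graph = {"A": ["B"], "B": ["A"], "C": ["A"]}
ranks = pagerank(graph)
```

Sorting `ranks` by value in descending order gives a candidate recommendation ordering; in this small graph, A ranks highest because both B and C point to it.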

