Classifying Movie Scripts by Genre Alex Blackstock Matt Spitz 6/9/08.

Presentation transcript:

1 Classifying Movie Scripts by Genre Alex Blackstock Matt Spitz 6/9/08

2 Overview
- Motivation
  - classifying movie scripts may identify box office flops and successes before they're even produced!
- Data
  - freely available movie scripts (DailyScripts.com, etc.)
  - IMDB genres (several labels/movie)
- Tools
  - Lucene
  - MEMM from PA3
  - jBNC (naïve Bayes classifier)
  - Stanford Named Entity Recognizer
  - Stanford Part-Of-Speech Tagger

3 Processing Scripts

4 Features
- Non-NLP
  - dialogue shape
  - character information
- NLP
  - POS ratios
  - Named Entity appearances
- Character-Based NLP
  - analyze individual characters
  - exclamations
  - main vs. secondary
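
Two of the features listed on this slide, POS ratios and per-character exclamations, can be sketched in a few lines. This is only an illustration under stated assumptions: the slide does not define either feature precisely, so the function names `pos_ratios` and `exclamation_rates`, the input shapes, and the "fraction of lines containing an exclamation mark" definition are all hypothetical. The sample tokens and dialogue lines are made up, and the tags are assumed to come from some external POS tagger.

```python
from collections import Counter


def pos_ratios(tagged_tokens):
    """Share of each POS tag among all tokens.

    tagged_tokens: list of (token, tag) pairs produced by any POS tagger
    (assumed input format, not taken from the slides).
    """
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}


def exclamation_rates(dialogue):
    """Fraction of each character's lines that contain an exclamation mark.

    dialogue: list of (character, line) pairs (assumed input format).
    """
    lines = Counter()
    exclaimed = Counter()
    for character, line in dialogue:
        lines[character] += 1
        if "!" in line:
            exclaimed[character] += 1
    return {c: exclaimed[c] / lines[c] for c in lines}


# Made-up inputs, purely for illustration.
tagged = [("Run", "VB"), ("now", "RB"), ("Go", "VB"), ("fast", "RB")]
script = [("BLADE", "Move!"), ("BLADE", "Quiet."), ("WHISTLER", "Hm.")]

print(pos_ratios(tagged))
print(exclamation_rates(script))
```

A main vs. secondary split could then be approximated by ranking characters by line count, though the slides do not say how the authors drew that line.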

5 Evaluation Metrics
- Example output: Blade II (gold labels: Action, Thriller, Horror)
  - guessed labels: Action, Adventure, Horror, Thriller, ...
- F1 Score
  - per genre
  - weighted average over all genres
  - # of guesses allowed = # of gold labels
- Partial Credit Score
  - allows for some error
  - # of guesses allowed = # of gold labels * 1.5
  - penalized for guesses that are beyond # of gold labels, but still get points
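
The two metrics on this slide can be sketched roughly as follows. The F1 part follows the slide: guesses are cut off at the number of gold labels, F1 is computed per genre, and the per-genre scores are averaged with weights equal to each genre's gold count. The Partial Credit part is only a guess at the idea, since the slide gives no formula: rounding the 1.5x allowance down, awarding full credit inside the first `n_gold` guesses, and the `extra_weight=0.5` penalty for later guesses are all assumptions, as are the function names.

```python
from collections import defaultdict
from math import floor


def truncate_guesses(ranked, n_gold, factor=1.0):
    """Keep the top-ranked guesses; allowance = n_gold * factor, rounded down (assumption)."""
    return ranked[: max(1, floor(n_gold * factor))]


def weighted_f1(examples):
    """examples: list of (ranked_guesses, gold_labels) pairs.

    Per-genre F1, averaged with weights equal to each genre's gold count.
    """
    tp, fp, fn = defaultdict(int), defaultdict(int), defaultdict(int)
    for ranked, gold in examples:
        gold = set(gold)
        guessed = set(truncate_guesses(ranked, len(gold)))
        for g in guessed & gold:
            tp[g] += 1
        for g in guessed - gold:
            fp[g] += 1
        for g in gold - guessed:
            fn[g] += 1
    weighted_sum, total_support = 0.0, 0
    for genre in set(tp) | set(fn):  # genres with no gold support carry zero weight
        support = tp[genre] + fn[genre]
        denom = 2 * tp[genre] + fp[genre] + fn[genre]
        f1 = 2 * tp[genre] / denom if denom else 0.0
        weighted_sum += support * f1
        total_support += support
    return weighted_sum / total_support if total_support else 0.0


def partial_credit(ranked, gold, extra_weight=0.5):
    """Hypothetical Partial Credit score: correct guesses past the first
    n_gold slots still earn points, but only extra_weight each (assumption)."""
    gold = set(gold)
    n_gold = len(gold)
    if n_gold == 0:
        return 0.0
    allowed = truncate_guesses(ranked, n_gold, factor=1.5)
    score = 0.0
    for rank, genre in enumerate(allowed):
        if genre in gold:
            score += 1.0 if rank < n_gold else extra_weight
    return score / n_gold


# The Blade II example from the slide, using only the four guesses shown.
ranked = ["Action", "Adventure", "Horror", "Thriller"]
gold = ["Action", "Thriller", "Horror"]
print(weighted_f1([(ranked, gold)]))
print(partial_credit(ranked, gold))
```

Under these assumptions, the F1 cutoff keeps only Action, Adventure and Horror, so Thriller counts as a miss, while the Partial Credit allowance of four guesses lets Thriller earn reduced credit.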

6 Conclusions
- Success!
  - best feature set: basic NLP & POS tagging
  - PC Score: 0.601
  - F1 Score: 0.551
- Classifier comparison (jBNC)
- N-way classification problem
  - 22 genres
  - average of 3.02 genres/datum
- Dataset Issues
  - consistency
  - diversity
  - size

