Presentation is loading. Please wait.

Presentation is loading. Please wait.

S AY A NYTHING Selma P. and Parth P.. S AY W HAT ? Data-driven system that allows for interactive storytelling Human author takes turns with computer.

Similar presentations


Presentation on theme: "S AY A NYTHING Selma P. and Parth P.. S AY W HAT ? Data-driven system that allows for interactive storytelling Human author takes turns with computer."— Presentation transcript:

1 S AY A NYTHING Selma P. and Parth P.

2 S AY W HAT ? Data-driven system that allows for interactive storytelling Human author takes turns with computer author

3 E XAMPLE H: You’ll never believe what happened last night. C: Leigh laughed at my joke but I couldn’t help but think ‘liz would have laughed harder.’ H: The joke wasn’t very funny in a ‘ha ha’ kind of way C: “It wasn’t anything like that, I thought he was going to give me a good night kiss but he ended up licking my cheek,” she declared. H: It made me sneeze and snort out loud. C: And now my nose hurts form the snorting.

4 E XAMPLE H: You’ll never believe what happened last night. C: Leigh laughed at my joke but I couldn’t help but think ‘liz would have laughed harder.’ H: The joke wasn’t very funny in a ‘ha ha’ kind of way C: “It wasn’t anything like that, I thought he was going to give me a good night kiss but he ended up licking my cheek,” she declared. H: It made me sneeze and snort out loud. C: And now my nose hurts form the snorting.

5 E XAMPLE H: You’ll never believe what happened last night. C: Leigh laughed at my joke but I couldn’t help but think ‘liz would have laughed harder.’ H: The joke wasn’t very funny in a ‘ha ha’ kind of way C: “It wasn’t anything like that, I thought he was going to give me a good night kiss but he ended up licking my cheek,” she declared. H: It made me sneeze and snort out loud. C: And now my nose hurts form the snorting.

6 W HY DO THIS ? Intriguing Blends structure with creativity Void Game of sorts Fits in the space between language games and more graphically oriented video games Foundational – work in progress Research

7 N ARRATIVE M ODELS (M AJEWSKI 2003) Five ways: Linear String of pearls Branching Amusement park Building blocks

8 W HAT ’ S BEEN DONE BEFORE ? Façade Top – Down Approach Strict domains Technical/ non-inclusive

9 H OW IT WORKS 1. User enters sentence 2. User’s sentence is used to search corpus using term frequency - inverse document frequency (tf- idf) algorithm 3. Highest scored match is retrieved. Sentence after best match is outputted by computer 4. Repeat

10 H OW DID THEY DO IT ?

11 G ETTING D ATA ( DB ) Considered Manual (StoryCorps/ Fed. Writers) Well curated biased Favored blog posts (3.4 million) 1.06 billion words Extraction Only 17% textual material on weblogs is narrational 3.7 million story segments 66.5 million sentences Favored Recall over Precision FN preferred over FP

12 G ETTING D ATA Spinn3r.com – 44 million weblog. Sampled / hand annotated 5,270 blogs. Annotated validations set

13 G ETTING D ATA Randomized training/ testing Supervised machine learning Binary classification problem Confidence Weighted Linear Classifier [Dredze 2008]

14 G ETTING D ATA Rando. Data Set annotation training/ testing F(x)F(x) Sampling

15 C ORPUS ( DB ) C REATION Crawled blogs and applied algorithm: (2012): Post- Processing Parse trees Verb tenses First personal pronouns

16 C ORPUS ( DB ) C REATION Rando. Data Set crawl blogs annotation training/ testing F(x)F(x) Sampling Blog data F(x)F(x) Story data

17 A PPLICATION Querying Corpus ( ) Optimized: Return a list of stories that contain any matching words with user input Use TF-IDF ! Story data

18 A PPLICATION Rando. Data Set crawl blogs annotation training/ testing F(x)F(x) Sampling Blog data F(x)F(x) Story data “User Input” Query Story data Match Algorithm (tf – id) “Computer Output”

19 T ERM FREQUENCY – INVERSE DOCUMENT FREQUENCY ( TF - IDF ) TF-IDF is a numerical statistic that reflects how important a word is to a document in a corpus The term frequency measures how often a word appears in a document The inverse document frequency is a measure of how common a word is within the corpus as a whole. It tells us how much information a word provides.

20 T ERM FREQUENCY – INVERSE DOCUMENT F REQUENCY ( TF - IDF ) Image credit: Li(2011)

21 R ESULTS

22 A REAS TO I MPROVE Metrics? Entertainment Coherence

23 A REAS TO I MPROVE Metrics? Entertainment Coherence Believability / Usability Compare how well with next sentence user would have written

24 A REAS TO I MPROVE Fail to use all preceding sentences. Only returns highest ranked search.

25 A REAS TO I MPROVE Fail to use all preceding sentences. Only returns highest ranked search.

26 A REAS TO I MPROVE Fail to use all preceding sentences. Only returns highest ranked search.

27 F UTURE WORK No narrative plot

28 W ORK IN P ROGRESS Foundational Last article was 2012

29 T HANKS

30 D ISCUSSION Q’ S

31 I MPROVEMENTS ?

32 Q UALITY A SSESSMENT

33 M ODIFY C ORPUS

34 P APER C ONTENT ?


Download ppt "S AY A NYTHING Selma P. and Parth P.. S AY W HAT ? Data-driven system that allows for interactive storytelling Human author takes turns with computer."

Similar presentations


Ads by Google