Extending SASI to Satirical Product Reviews: A Preview Bernease Herman University of Michigan Monday, April 22, 2013.



Satirical Amazon Reviews
For a fun list:

Defining Irony, Sarcasm and Satire
Irony: “the use of words to convey a meaning that is the opposite of its literal meaning”
Sarcasm: “a sharply ironical taunt; sneering or cutting remark”
Satire: “the use of irony, sarcasm, ridicule, or the like, in exposing, denouncing, or deriding vice, folly, etc.”

Sarcastic Review: Shure SE110 Sound Isolating Earphones

Satirical Review: BIC Cristal For Her ballpoint pens

Satirical Review: Zenith Men’s Defy Xtreme Titanium Watch

Semi-supervised Algorithm for Sarcasm Identification (SASI): Overview
Outline: Overview, Data preprocessing, Data enrichment, Pattern features, Punctuation features, Additional features, Classification, Baseline options, Summary
SASI detects sarcasm in individual sentences using a k-Nearest-Neighbors-style algorithm.
Features include pattern matching and punctuation.
Satire calls for additional features that are not present in the sarcasm model.
The classification baseline still needs to be chosen from several options.
SASI is a sentence-level sarcasm detector; it does not classify full documents.

SASI: Data preprocessing
Jindal and Liu (2008) provide a data set of 66,000 book and product reviews.
Filatova (2012) provides a corpus of Amazon reviews labeled ironic, sarcastic, both, or regular.
Specific products, authors, companies, and book titles were replaced with [product], [author], etc.
HTML and special symbols were removed from the text.
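
The preprocessing steps above could look roughly like the following. This is only a sketch under assumptions: the function name `preprocess` and the `entities` mapping are hypothetical, the names to mask are assumed to be known in advance (for example from review metadata), and "special symbols" is interpreted narrowly as HTML tags and entities.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")   # crude HTML tag matcher, enough for a sketch
WS_RE = re.compile(r"\s+")


def preprocess(text, entities):
    """Strip HTML and mask known names with generic placeholders.

    `entities` is a hypothetical mapping such as
    {"[product]": ["Hutzler 571"], "[author]": ["Jane Doe"]}.
    """
    text = TAG_RE.sub(" ", text)      # drop tags first
    text = html.unescape(text)        # then decode entities like &amp;
    for placeholder, names in entities.items():
        # Replace longer names first so substrings do not clobber them.
        for name in sorted(names, key=len, reverse=True):
            text = re.sub(re.escape(name), lambda _m: placeholder, text,
                          flags=re.IGNORECASE)
    return WS_RE.sub(" ", text).strip()


if __name__ == "__main__":
    raw = "<p>I love my <b>Hutzler 571</b> &amp; so will you!</p>"
    print(preprocess(raw, {"[product]": ["Hutzler 571"]}))
```

Masking names this way keeps later pattern features from memorizing individual products.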

SASI: Data enrichment
Tsur et al. (2010) posited that sarcastic sentences co-appear with other sarcastic sentences.
They gathered nearby sentences using the Yahoo! BOSS API, with seed sentences as queries.
This assumption seems to hold for satirical reviews, but not for sarcastic ones.
[Figure: examples labeled Sarcasm and Satire]

SASI: Pattern features
Via Davidov and Rappoport (2006, 2008), words are split into:
High-frequency words (HFWs)
Content words (CWs)
Example sentence: What can I say about the 571B Banana Slicer that hasn't already been said about the wheel, penicillin or the iPhone…
Example patterns:
“What can I CW CW the”
“I CW CW the [product]”
“[product] that hasn’t CW been CW about”
“about the CW”
“CW or the CW”
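
One way to picture the pattern extraction is the simplified sketch below. It assumes a fixed, hand-picked set of high-frequency words, whereas the cited work derives HFWs and CWs from corpus frequency thresholds, so a word can behave differently there. The function names and the window limits (`min_len`, `max_len`, `min_hfw`, `min_cw`) are illustrative assumptions.

```python
import re

# Placeholders such as [product] are kept as single tokens.
TOKEN_RE = re.compile(r"\[\w+\]|\w+(?:'\w+)?")


def to_template(sentence, high_frequency_words):
    """Replace every token that is not an HFW or a placeholder with 'CW'."""
    tokens = TOKEN_RE.findall(sentence.lower())
    return [
        tok if tok in high_frequency_words or tok.startswith("[") else "CW"
        for tok in tokens
    ]


def extract_patterns(template, min_len=3, max_len=7, min_hfw=2, min_cw=1):
    """Collect contiguous windows with enough HFWs and CW slots."""
    patterns = set()
    for start in range(len(template)):
        last_end = min(start + max_len, len(template))
        for end in range(start + min_len, last_end + 1):
            window = template[start:end]
            n_cw = window.count("CW")
            if n_cw >= min_cw and len(window) - n_cw >= min_hfw:
                patterns.add(" ".join(window))
    return patterns


if __name__ == "__main__":
    hfws = {"what", "can", "i", "the", "that", "hasn't", "been", "about", "or"}
    sentence = ("What can I say about the [product] that hasn't already "
                "been said about the wheel")
    template = to_template(sentence, hfws)
    for pattern in sorted(extract_patterns(template)):
        print(pattern)
```

Each extracted pattern would then become one feature in a sentence's feature vector.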


SASI: Punctuation features
Generic punctuation-related features, all normalized to [0, 1]:
Sentence length in words
Number of “!” characters
Number of “?” characters
Number of quotes in the sentence
Number of capitalized words or words in all capitals
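
The five features listed above can be computed along these lines. The sketch assumes whitespace tokenization and normalizes each feature by its maximum over the data set, which is one simple way to land in [0, 1]; the slides do not specify the exact normalization, and the function names are hypothetical.

```python
QUOTE_CHARS = '"\u201c\u201d'  # straight and curly double quotes


def raw_punctuation_features(sentence):
    """Return the five raw counts in the order listed on the slide."""
    words = sentence.split()
    return [
        len(words),                                    # sentence length in words
        sentence.count("!"),                           # number of "!"
        sentence.count("?"),                           # number of "?"
        sum(sentence.count(q) for q in QUOTE_CHARS),   # number of quote characters
        sum(1 for w in words if w[:1].isupper()),      # capitalized or ALL-CAPS words
    ]


def normalize(rows):
    """Scale every column to [0, 1] by dividing by its maximum (assumed scheme)."""
    maxima = [max(col) or 1 for col in zip(*rows)]  # avoid dividing by zero
    return [[value / m for value, m in zip(row, maxima)] for row in rows]


if __name__ == "__main__":
    sentences = ["GREAT product!!", 'Is this "it"?']
    rows = [raw_punctuation_features(s) for s in sentences]
    print(normalize(rows))
```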

SASI: Additional features
Burfoot and Baldwin (2009) introduced a notion of validity, which models absurdity via a measure close to pointwise mutual information (PMI).
Validity is related to the number of made-up or mismatched named entities.
It works well with satire, but not in this product-review setting.
Features to consider for satire:
Absurdity of the product
Relevancy of the product
How often the product is reviewed
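
For readers unfamiliar with PMI, here is a small sketch of the standard definition computed from co-occurrence counts. It illustrates plain PMI only, not the validity measure of Burfoot and Baldwin, and the function name and count arguments are hypothetical.

```python
import math


def pmi(count_xy, count_x, count_y, total):
    """Pointwise mutual information in bits: log2(p(x, y) / (p(x) * p(y))).

    Counts are occurrences out of `total` observations. Returns -inf when
    any count is zero, since the ratio is then zero or undefined.
    """
    if min(count_xy, count_x, count_y) <= 0:
        return float("-inf")
    # Algebraically equal to (count_xy/total) / ((count_x/total)*(count_y/total)).
    return math.log2(count_xy * total / (count_x * count_y))


if __name__ == "__main__":
    # Two entities that co-occur more often than chance score above zero.
    print(pmi(count_xy=50, count_x=100, count_y=100, total=1000))
```

A low score for a pair of named entities suggests they rarely appear together, which is the intuition behind treating such pairs as absurd.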

SASI: Classification
Feature vectors are built with one feature per pattern found in the training set.
Euclidean distance is computed between a test vector and each training vector that shares at least one pattern with it.
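
The classification step can be sketched as follows. This is a simplified illustration, not the exact SASI procedure: it averages the labels of the k nearest neighbors without any weighting, represents vectors as dictionaries, marks pattern features with a hypothetical `pattern:` prefix, and returns a caller-supplied default when no training vector shares a pattern.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two sparse vectors stored as dicts."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))


def active_patterns(features):
    """Names of pattern features with a non-zero value."""
    return {name for name, value in features.items()
            if name.startswith("pattern:") and value > 0}


def knn_score(query, training, k=5, default=None):
    """Average label of the k nearest training vectors sharing a pattern.

    `training` is a list of (feature_dict, numeric_label) pairs.
    """
    query_patterns = active_patterns(query)
    candidates = [
        (euclidean(query, feats), label)
        for feats, label in training
        if query_patterns & active_patterns(feats)
    ]
    if not candidates:
        return default
    nearest = sorted(candidates, key=lambda pair: pair[0])[:k]
    return sum(label for _, label in nearest) / len(nearest)


if __name__ == "__main__":
    training = [
        ({"pattern:about the CW": 1.0, "punct:!": 0.5}, 5),
        ({"pattern:about the CW": 1.0, "punct:!": 0.0}, 1),
        ({"pattern:CW or the CW": 1.0}, 3),
    ]
    query = {"pattern:about the CW": 1.0, "punct:!": 0.4}
    print(knn_score(query, training, k=2))
```

Restricting candidates to vectors that share a pattern keeps unrelated sentences from influencing the score.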

SASI: Baseline options
Because the approach is semi-supervised, the classification algorithm takes advantage of the definition of sarcasm.
It assumes a low star rating combined with text whose literal meaning is positive.
This is not as clear-cut with satire. Options:
Variation in rating for the product
Purchases vs. page views of the product
People finding the review helpful
Other heuristics
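
The sarcasm heuristic described above (low star rating, literally positive text) might be approximated like this. The tiny word list, the thresholds, and the function name are all hypothetical placeholders; a real baseline would rely on a proper sentiment lexicon or classifier.

```python
# Hypothetical toy lexicon, for illustration only.
POSITIVE_WORDS = frozenset(
    {"great", "love", "amazing", "perfect", "best", "wonderful"}
)


def looks_sarcastic(stars, text, max_stars=2, min_positive=1):
    """Flag a review whose rating is low but whose wording is positive."""
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    return stars <= max_stars and len(words & POSITIVE_WORDS) >= min_positive


if __name__ == "__main__":
    print(looks_sarcastic(1, "Great idea, now try to make it work!"))
    print(looks_sarcastic(5, "Great product"))
```

No equally simple rule is apparent for satire, which is why the options listed above are still open.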

SASI: Summary
Satire seems to have a distinct advantage over sarcasm in the data enrichment phase.
Satire seems to have a large disadvantage compared to sarcasm in the baseline options for classification.
This is the detail that must be worked out before moving forward with implementation.

Future Goals
Following the end of the course, I wish to implement SASI, taking the features mentioned today into account.
Extend the model to sarcasm in other domains.
Any questions or comments?