Presentation transcript:

1 LightSIDE. Carolyn Penstein Rosé, Language Technologies Institute, Human-Computer Interaction Institute, School of Computer Science. With funding from the National Science Foundation and the Office of Naval Research.

4 Click here to load a file

5 Select Heteroglossia as the predicted category

6 Make sure the text field is selected as the source for text feature extraction

7
• Punctuation can be a “stand in” for mood
  - “you think the answer is 9?”
  - “you think the answer is 9.”
• Bigrams capture simple lexical patterns
  - “common denominator” versus “common multiple”
• Trigrams (just like bigrams, but with 3 words next to each other)
  - “Carnegie Mellon University”
• POS bigrams capture syntactic or stylistic information
  - “the answer which is …” vs “which is the answer”
• Line length can be a proxy for explanation depth
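The lexical feature types listed above can be sketched in a few lines of Python. This is a minimal illustration, not LightSIDE's own extractor: the function names, the `BIGRAM_`/`TRIGRAM_`/`PUNCT_` prefixes, and the tokenization rule are all assumptions made for the example. POS bigrams are left out because they need a part-of-speech tagger.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens and single punctuation marks."""
    return re.findall(r"[a-z0-9']+|[^\sa-z0-9']", text.lower())

def ngrams(tokens, n):
    """Return every run of n adjacent tokens, joined with underscores."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def extract_features(text):
    """Return (feature set, line length in words) for one line of text.

    Feature names and prefixes are hypothetical, chosen for this sketch.
    """
    tokens = tokenize(text)
    words = [t for t in tokens if re.match(r"[a-z0-9']", t)]
    features = set(words)                                      # unigrams
    features.update("BIGRAM_" + g for g in ngrams(words, 2))   # adjacent word pairs
    features.update("TRIGRAM_" + g for g in ngrams(words, 3))  # adjacent word triples
    features.update("PUNCT_" + t for t in tokens if t not in words)
    return features, len(words)

question, _ = extract_features("you think the answer is 9?")
statement, _ = extract_features("you think the answer is 9.")
# The two lines share every word feature and differ only in punctuation.
print(question - statement, statement - question)
```

With this representation the only thing separating the question from the statement is the punctuation feature, which is why punctuation is worth keeping.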

8
• The “contains non-stop word” feature can predict whether a conversational contribution is contentful
  - “ok sure” versus “the common denominator”
• The “remove stop words” option removes some distracting features
• Stemming allows some generalization
  - Multiple, multiply, multiplication
• Removing rare features is a cheap form of feature selection
  - Features that occur only once or twice in the corpus won’t generalize, so including them in the vector space is a waste of time
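Two of these options are easy to sketch in Python. The stop word list below is a toy list invented for the example (a real list is much longer), and the threshold of 3 is an assumption that matches "once or twice" being too rare. Stemming is not shown because it needs a stemmer implementation.

```python
from collections import Counter

# Toy stop word list, hypothetical and far shorter than a real one.
STOP_WORDS = {"ok", "sure", "the", "a", "an", "is", "of", "yes", "no"}

def contains_non_stop_word(tokens):
    """True if at least one token is not a stop word."""
    return any(t not in STOP_WORDS for t in tokens)

def remove_rare_features(documents, min_count=3):
    """Drop tokens that occur fewer than min_count times in the whole corpus.

    documents is a list of token lists; the result has the same shape.
    """
    counts = Counter(t for doc in documents for t in doc)
    keep = {t for t, c in counts.items() if c >= min_count}
    return [[t for t in doc if t in keep] for doc in documents]

print(contains_non_stop_word(["ok", "sure"]))                   # no content words
print(contains_non_stop_word(["the", "common", "denominator"]))  # has content words
```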

9
• Think like a computer!
  - Machine learning algorithms look for features that are good predictors, not features that are necessarily meaningful
• Look for approximations
  - If you want to find questions, you don’t need to do a complete syntactic analysis
  - Look for question marks
  - Look for wh-terms that occur immediately before an auxiliary verb
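The question-finding approximation can be written as two cheap checks. The sketch below is illustrative only: the wh-term and auxiliary verb lists are short, hand-picked assumptions, not lists taken from LightSIDE.

```python
import re

# Hypothetical, deliberately short word lists for the sketch.
WH_TERMS = r"(?:who|what|when|where|why|which|how)"
AUXILIARIES = r"(?:is|are|was|were|do|does|did|can|could|will|would|should|has|have|had)"
WH_BEFORE_AUX = re.compile(rf"\b{WH_TERMS}\s+{AUXILIARIES}\b", re.IGNORECASE)

def looks_like_question(text):
    """Approximate question detection without any syntactic analysis."""
    return "?" in text or WH_BEFORE_AUX.search(text) is not None

for line in ["you think the answer is 9?",
             "you think the answer is 9.",
             "what is the common denominator"]:
    print(line, "->", looks_like_question(line))
```

Being an approximation, the wh-term check also fires on relative clauses such as “the answer which is …”, so it is a predictive feature for a learner to weigh rather than a definitive test.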

10 Click to extract text features

11 Select Logistic Regression as the Learner

12 Evaluate result by cross validation over sessions
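One reading of cross-validation over sessions is that all lines from a session go into the same fold, so that no session contributes to both training and testing. The sketch below assumes that reading; the function name and the round-robin fold assignment are choices made for the example, not LightSIDE's documented behavior.

```python
def session_folds(session_ids, n_folds):
    """Yield (train_indices, test_indices) with whole sessions held out together.

    session_ids[i] is the session that instance i belongs to. Sessions are
    assigned to folds round-robin in order of first appearance.
    """
    sessions = list(dict.fromkeys(session_ids))  # distinct sessions, first-seen order
    fold_of = {s: i % n_folds for i, s in enumerate(sessions)}
    for fold in range(n_folds):
        test = [i for i, s in enumerate(session_ids) if fold_of[s] == fold]
        train = [i for i, s in enumerate(session_ids) if fold_of[s] != fold]
        yield train, test

ids = ["s1", "s1", "s2", "s3", "s3", "s4"]
for train, test in session_folds(ids, n_folds=2):
    print("train:", train, "test:", test)
```

Holding out whole sessions keeps the estimate honest when lines within a session resemble each other more than lines across sessions.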

13 Run the experiment

15
• A sequence of 1 to 6 categories
• May include GAPs
  - A GAP can cover any symbol
  - A GAP+ may cover any number of symbols
• Must not begin or end with a GAP
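The pattern rules above can be made concrete with a small matcher. This is a sketch under stated assumptions, not the tool's implementation: it treats GAP as exactly one symbol, treats GAP+ as one or more symbols, and counts gaps toward the length limit of 6. The string markers `"GAP"` and `"GAP+"` and all function names are invented for the example.

```python
GAP, GAP_PLUS = "GAP", "GAP+"

def is_valid_pattern(pattern):
    """Check the length limit and that neither end of the pattern is a gap."""
    if not 1 <= len(pattern) <= 6:
        return False
    gaps = (GAP, GAP_PLUS)
    return pattern[0] not in gaps and pattern[-1] not in gaps

def matches_at(pattern, symbols, start):
    """True if the pattern matches the symbols beginning exactly at start."""
    if not pattern:
        return True
    if start >= len(symbols):
        return False
    head, rest = pattern[0], pattern[1:]
    if head == GAP:        # skip exactly one symbol
        return matches_at(rest, symbols, start + 1)
    if head == GAP_PLUS:   # skip one or more symbols (assumed reading)
        return any(matches_at(rest, symbols, nxt)
                   for nxt in range(start + 1, len(symbols) + 1))
    return symbols[start] == head and matches_at(rest, symbols, start + 1)

def pattern_occurs(pattern, symbols):
    """True if a valid pattern matches anywhere in the symbol sequence."""
    if not is_valid_pattern(pattern):
        raise ValueError("pattern must have 1 to 6 elements and no gap at either end")
    return any(matches_at(pattern, symbols, i) for i in range(len(symbols)))

line = "would have been more interesting".split()
print(pattern_occurs(["would", GAP_PLUS, "interesting"], line))
print(pattern_occurs(["would", GAP, "interesting"], line))
```

The first pattern matches because GAP+ stretches over three words; the second does not, because a single GAP covers only one.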

19
• Identify large error cells
• Make comparisons
  - Ask yourself how the misclassified instance is similar to the instances that were correctly classified with the same class (vertical comparison)
  - Ask how it is different from the instances of the class it was incorrectly not assigned to (horizontal comparison)
[Figure labels: Positive, Negative]
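The first step of this error analysis can be automated. The sketch below assumes a confusion matrix with actual classes as rows and predicted classes as columns, so that "vertical" means the same predicted column and "horizontal" means the same actual row; that orientation and all function names are assumptions made for the example.

```python
from collections import Counter

def largest_error_cell(actual, predicted):
    """Return ((actual_class, predicted_class), count) for the biggest off-diagonal cell."""
    cells = Counter(zip(actual, predicted))
    errors = {cell: n for cell, n in cells.items() if cell[0] != cell[1]}
    if not errors:
        return None
    return max(errors.items(), key=lambda item: item[1])

def comparison_sets(actual, predicted, cell):
    """Return indices of the errors in a cell and of the two comparison groups."""
    true_class, guessed_class = cell
    pairs = list(enumerate(zip(actual, predicted)))
    errors = [i for i, (a, p) in pairs if a == true_class and p == guessed_class]
    # Vertical: correctly classified instances of the class that was guessed.
    vertical = [i for i, (a, p) in pairs if a == guessed_class and p == guessed_class]
    # Horizontal: correctly classified instances of the class it should have been.
    horizontal = [i for i, (a, p) in pairs if a == true_class and p == true_class]
    return errors, vertical, horizontal

actual = ["pos", "pos", "neg", "neg", "neg", "pos"]
predicted = ["pos", "neg", "neg", "pos", "pos", "pos"]
cell, count = largest_error_cell(actual, predicted)
print(cell, count, comparison_sets(actual, predicted, cell))
```

Reading the error instances next to the two comparison groups is where the actual analysis happens; the code only picks out which instances to read.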

21-24 Error Analysis on Development Set

27
• Positive: is interesting, an interesting scene
• Negative: would have been more interesting, potentially interesting, etc.

33 * Note that in this case we get no benefit if we use feature selection over the original feature space.

34 [Figure labels: General, Domain A, Domain B] Why is this nonlinear? It represents the interaction between each feature and the Domain variable. Now that the feature space represents the nonlinearity, the algorithm to train the weights can be linear.
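A minimal sketch of this idea, assuming the General / Domain A / Domain B layout means that every feature is copied once into a shared "general" slot and once into a slot for the instance's own domain. The feature naming scheme and the weights are made up for illustration; they are not learned values.

```python
def augment(features, domain):
    """Copy each feature into a general version and a domain-specific version."""
    augmented = {}
    for name, value in features.items():
        augmented["General:" + name] = value
        augmented[domain + ":" + name] = value
    return augmented

def linear_score(weights, features):
    """Plain weighted sum, as any linear model would compute."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())

# Hypothetical weights: the word is positive in general but negative in Domain B.
weights = {"General:interesting": 1.0, "DomainB:interesting": -2.0}

in_a = augment({"interesting": 1}, "DomainA")
in_b = augment({"interesting": 1}, "DomainB")
print(linear_score(weights, in_a), linear_score(weights, in_b))
```

The scoring function stays linear, yet the same word ends up with a different effective weight in each domain, because the feature-by-domain interaction is encoded in the feature names themselves.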

35-43 Healthcare Bill Dataset