Automated Essay Grading
Resources: Introduction to Information Retrieval, Manning, Raghavan, Schütze (Chapters 6 and 18); Automated Essay Scoring with e-rater V.2, Attali.

Content Vector Analysis (CVA)
[Diagram] The essay to be graded is compared against higher-quality and lower-quality reference essays; greater similarity to the higher-quality essays yields a higher grade, greater similarity to the lower-quality essays a lower grade.

- Vector space model
  - An essay is a vector of weighted terms
  - Similarity is measured in the vector space
- Latent Semantic Analysis
  - Dimensionality reduction

[Diagram] The input essay is compared against a collection of human-scored essays and receives the score of the most similar ones.

- The vector representation doesn't consider the ordering of words in an essay
- "John is quicker than Mary" and "Mary is quicker than John" have the same vectors
- This is called the bag of words model
- In a sense, this is a step back: a positional index is able to distinguish these two documents
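A tiny sketch (illustrative only, plain Python, not from the slides) showing that the two sentences above produce identical bag-of-words counts:

```python
from collections import Counter

s1 = "John is quicker than Mary"
s2 = "Mary is quicker than John"

# Word order is discarded, so both sentences yield the same multiset of terms.
assert Counter(s1.lower().split()) == Counter(s2.lower().split())
```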

- The term frequency tf_{t,e} of term t in essay e is defined as the number of times that t occurs in e
- We want to use tf when computing the match score between an input essay and the score-specific vocabulary model. But how?
- Raw term frequency is not what we want:
  - An essay with 10 occurrences of a term may be more relevant than an essay with one occurrence of the term
  - But not 10 times more relevant
- Similarity does not increase proportionally with term frequency
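A minimal sketch of the standard sublinear tf scaling (1 + log10 tf) from the textbook; whether e-rater uses exactly this scheme is not stated in the slides:

```python
import math

def log_tf_weight(tf: int) -> float:
    """Sublinear tf weight: 1 + log10(tf) for tf > 0, else 0.

    Ten occurrences of a term count for more than one occurrence,
    but not ten times more (tf=1 -> 1.0, tf=10 -> 2.0, tf=100 -> 3.0).
    """
    return 1.0 + math.log10(tf) if tf > 0 else 0.0
```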

- Rare terms are more informative than frequent terms (recall stop words)
- Consider a term in the essay that is rare in the human-scored collection for the i-th score point (e.g., virtualization)
- An essay containing this term is very likely to be assigned the score of the human-scored essays that contain virtualization
  - We want a higher weight for rare terms like virtualization

- Consider a term that is frequent in a collection (e.g., high, increase, line)
- An essay containing such a term is more likely to match a score point than one that doesn't, but it is not a sure indicator of a match
- For frequent terms, we want positive weights for words like high, increase, and line, but lower weights than for rare terms
- We will use document frequency (df) to capture this in the score
- df_t (≤ N) is the number of documents (essays) that contain the term
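Combining the two ideas gives the familiar tf-idf weight; a short sketch following the textbook definition idf_t = log10(N / df_t), which the slides imply but do not spell out:

```python
import math

def idf(df: int, N: int) -> float:
    """Inverse document frequency: rare terms (small df) get a high weight."""
    return math.log10(N / df)

def tf_idf(tf: int, df: int, N: int) -> float:
    """tf-idf weight of a term: sublinear tf scaling times idf; 0 if absent."""
    if tf == 0:
        return 0.0
    return (1.0 + math.log10(tf)) * idf(df, N)
```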

- So we have a |V|-dimensional vector space
- Terms are the axes of the space
- Essays are points or vectors in this space
- Very high-dimensional: hundreds of millions of dimensions
- These vectors are very sparse: most entries are zero
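Because almost all entries are zero, essay vectors are usually stored sparsely; a minimal sketch (the dictionary representation here is an assumption, not the slides' implementation):

```python
# Sparse vector: only non-zero term weights are stored, keyed by term.
essay_vector = {"virtualization": 3.2, "hypervisor": 2.7, "increase": 0.4}

def weight(vector: dict, term: str) -> float:
    """Weight of a term; implicitly 0.0 for any term not stored."""
    return vector.get(term, 0.0)
```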

- First cut: distance between two points (i.e., the distance between the end points of the two vectors)
- Euclidean distance?
- Euclidean distance is a bad idea... because it is large for vectors of different lengths

[Figure] Why distance is a bad idea: instead, measure the angle between the two vectors.

- A vector can be (length-) normalized by dividing each of its components by its length; for this we use the L2 norm: $\|x\|_2 = \sqrt{\sum_i x_i^2}$
- Dividing a vector by its L2 norm makes it a unit (length) vector
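A short sketch (assumed NumPy implementation, not taken from the slides) of length normalization and the resulting cosine similarity between two essay vectors:

```python
import numpy as np

def cosine_similarity(x: np.ndarray, y: np.ndarray) -> float:
    """Cosine of the angle between two term-weight vectors.

    Each vector is divided by its L2 norm (making it a unit vector),
    so the similarity reduces to a dot product.
    """
    x_unit = x / np.linalg.norm(x)
    y_unit = y / np.linalg.norm(y)
    return float(np.dot(x_unit, y_unit))

# Two essays over the same 4-term vocabulary; 1.0 = identical direction.
essay_a = np.array([0.0, 2.3, 1.0, 0.5])
essay_b = np.array([1.2, 2.0, 0.0, 0.4])
print(cosine_similarity(essay_a, essay_b))
```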

Two problems that arise when using the vector space model:
- Synonymy: many ways to refer to the same object, e.g. car and automobile
  - Can unfairly penalize an essay
- Polysemy: most words have more than one distinct meaning, e.g. model, python, chip
  - Can falsely inflate the score

Example: Vector Space Model (from Lillian Lee)
- Doc 1: auto, engine, bonnet, tyres, lorry, boot
- Doc 2: car, emissions, hood, make, model, trunk
- Doc 3: make, hidden, Markov, model, emissions, normalize
- Synonymy: Docs 1 and 2 will have a small cosine but are related
- Polysemy: Docs 2 and 3 will have a large cosine but are not truly related

[Diagram] Similar? Comparing the input essay to the prompt.

Semantics: relating words with other words
- Explicit semantic mapping: using an external knowledge base (a WordNet lookup sketch follows below)
  - WordNet
  - Ontologies
- Implicit semantic mapping: extracting hidden (latent) semantics
  - Uses implicit co-occurrence
  - Projects the essay into an abstract space
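A small sketch of explicit semantic mapping, assuming NLTK with the WordNet corpus installed (this specific lookup is an illustration, not the slides' method):

```python
# Requires: pip install nltk, then nltk.download("wordnet")
from nltk.corpus import wordnet as wn

# Expand a term with its WordNet synonyms, so that, e.g., "car" and
# "automobile" can be matched against each other.
synonyms = {lemma for syn in wn.synsets("car") for lemma in syn.lemma_names()}
print(synonyms)  # includes 'car', 'auto', 'automobile', 'machine', ...
```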

- Dimensionality reduction through a lower-order (low-rank) approximation
- Extracts the hidden semantics of a document
- The semantic space has a lower dimension than the term space
  - Redundant term dimensions are removed

[Diagram] Two training setups: prompt-specific training builds a term-by-prompt matrix (Term 1 ... Term t x Prompt 1 ... Prompt N); score-point-specific training builds a term-by-score-point matrix (Term 1 ... Term t x Score 1 ... Score S).

Eigenvectors: for a square matrix $S$, a non-zero vector $\mathbf{v}$ is an eigenvector with eigenvalue $\lambda$ if $S\mathbf{v} = \lambda\mathbf{v}$.

- Smaller eigenvalues have less effect on the matrix product
- A lower-rank approximation can therefore be obtained by ignoring the small eigenvalues
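A small sketch (assumed NumPy code, not from the slides) of building a rank-k approximation of a term-document matrix by keeping only the k largest singular values:

```python
import numpy as np

def low_rank_approx(C: np.ndarray, k: int) -> np.ndarray:
    """Best rank-k approximation of C (in the Frobenius norm) via truncated SVD."""
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    # Keep only the k largest singular values; the rest are treated as noise.
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Toy 5-term x 4-document matrix; a rank-2 approximation keeps the dominant
# latent structure while discarding the smaller singular values.
C = np.random.rand(5, 4)
C_2 = low_rank_approx(C, k=2)
```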

Matrix diagonalization theorem: a square matrix $S$ with $m$ linearly independent eigenvectors can be written as $S = U \Lambda U^{-1}$, where the columns of $U$ are the eigenvectors of $S$ and $\Lambda$ is the diagonal matrix of its eigenvalues.

Symmetric diagonalization theorem: a symmetric square matrix $S$ can be decomposed as $S = Q \Lambda Q^{T}$, where the columns of $Q$ are the orthogonal, normalized eigenvectors of $S$ and $\Lambda$ is the diagonal matrix of its eigenvalues.

Symmetric diagonal decomposition applied to the term-document matrix $C$: entry $(i, j)$ of $C C^{T}$ is the number of documents in which both term $i$ and term $j$ co-occur.
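A quick sketch (assumed NumPy code) illustrating that, for a Boolean term-document matrix, C·Cᵀ counts document co-occurrences of term pairs:

```python
import numpy as np

# Rows = terms, columns = documents; 1 means the term occurs in the document.
C = np.array([
    [1, 0, 1],   # term 0
    [1, 1, 0],   # term 1
    [0, 1, 1],   # term 2
])
cooccurrence = C @ C.T
# cooccurrence[i, j] = number of documents containing both term i and term j;
# the diagonal holds each term's document frequency.
print(cooccurrence)
```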

[Figure] Singular value decomposition of the term-document matrix: $C = U \Sigma V^{T}$.