Presentation transcript:

Information Retrieval System (diagram): a query string and a document corpus are fed into the IR system, which returns ranked documents (1. Doc1, 2. Doc2, 3. Doc3).

Vector-Space (Distributional) Lexical Semantics
Represent word meanings as points (vectors) in a (high-dimensional) Euclidean space.
Dimensions encode aspects of the context in which the word appears (e.g., how often it co-occurs with another specific word).
– "You shall know a word by the company it keeps." (J.R. Firth, 1957)
Semantic similarity is defined as distance between points in this semantic space.

Sample Lexical Vector Space (figure): words plotted as points in a two-dimensional space; labeled points include dog, cat, man, woman, bottle, cup, water, rock, computer, robot.

Simple Word Vectors
For a given target word, w, create a bag-of-words "document" of all of the words that co-occur with the target word in a large corpus.
– Window of k words on either side.
– All words in the sentence, paragraph, or document.
For each target word, create a (tf-idf weighted) vector from the "document" for that word.
Compute the semantic relatedness of two words as the cosine similarity of their vectors.
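A minimal sketch of this pipeline, using scikit-learn's TfidfVectorizer; the toy corpus, the window size k = 2, and the dog/cat comparison are illustrative assumptions, not from the slides.

```python
from collections import defaultdict
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "the dog chased the cat across the yard",
    "the cat drank milk from the cup",
    "the man poured water into the bottle",
]
k = 2  # co-occurrence window: k words on either side of the target

# Build one bag-of-words "document" per target word from its context windows.
contexts = defaultdict(list)
for sentence in corpus:
    tokens = sentence.split()
    for i, target in enumerate(tokens):
        window = tokens[max(0, i - k):i] + tokens[i + 1:i + 1 + k]
        contexts[target].extend(window)

words = sorted(contexts)
pseudo_docs = [" ".join(contexts[w]) for w in words]

# tf-idf weight each word's pseudo-document, then compare words by cosine similarity.
vectors = TfidfVectorizer().fit_transform(pseudo_docs)
sims = cosine_similarity(vectors)
i, j = words.index("dog"), words.index("cat")
print(f"sim(dog, cat) = {sims[i, j]:.3f}")
```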

Other Contextual Features
Use syntax to move beyond simple bag-of-words features.
Produce typed (edge-labeled) dependency parses for each sentence in a large corpus.
For each target word, produce features for its specific dependency links to specific other words (e.g., subj=dog, obj=food, mod=red).
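One way to extract such typed-dependency features is with an off-the-shelf parser. The sketch below assumes spaCy and its en_core_web_sm model are available (the slides do not name a parser), and the example sentences and feature naming are illustrative.

```python
from collections import defaultdict
import spacy

nlp = spacy.load("en_core_web_sm")

# Collect dependency-labeled features for each head word, e.g. nsubj=dog, dobj=food, amod=red.
features = defaultdict(list)
for doc in nlp.pipe(["The dog ate the red food.", "The boss fired the secretary."]):
    for token in doc:
        if token.dep_ in ("nsubj", "dobj", "amod"):
            features[token.head.lemma_].append(f"{token.dep_}={token.lemma_}")

for head, feats in features.items():
    print(head, feats)   # e.g. eat ['nsubj=dog', 'dobj=food']
```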

Other Feature Weights (figure)

Dimensionality Reduction
Word-based features produce extremely high-dimensional spaces that can easily lead to over-fitting.
Reduce the dimensionality of the space by using mathematical techniques that create a smaller set of k new dimensions accounting for most of the variance in the data.
– Singular Value Decomposition (SVD), used in Latent Semantic Analysis (LSA)
– Principal Component Analysis (PCA)
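A sketch of the truncated-SVD step used in LSA; the matrix below is a random stand-in for a real word-by-context matrix, and k = 10 is chosen arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((50, 300))   # 50 words x 300 context features (stand-in data)
k = 10                      # number of latent dimensions to keep

# Truncated SVD: keep only the k largest singular values and their vectors.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_reduced = U[:, :k] * s[:k]        # k-dimensional word representations
X_approx = X_reduced @ Vt[:k, :]    # best rank-k approximation of X

print(X_reduced.shape)                  # (50, 10)
print(np.linalg.norm(X - X_approx))     # reconstruction error
```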

Sample Dimensionality Reduction (figures)

Evaluation of Vector-Space Lexical Semantics
Have humans rate the semantic similarity of a large set of word pairs.
– (dog, canine): 10; (dog, cat): 7; (dog, carrot): 3; (dog, knife): 1
Compute the vector-space similarity of each pair.
Compute the correlation coefficient (Pearson or Spearman) between the human and machine ratings.
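A minimal sketch of this evaluation, assuming the human ratings and the model's cosine similarities are already available as parallel lists; the model similarities below are invented for illustration.

```python
from scipy.stats import pearsonr, spearmanr

word_pairs = [("dog", "canine"), ("dog", "cat"), ("dog", "carrot"), ("dog", "knife")]
human_ratings = [10, 7, 3, 1]                   # human judgments (from the slide)
model_similarities = [0.92, 0.71, 0.25, 0.08]   # hypothetical cosine similarities

pearson_r, _ = pearsonr(human_ratings, model_similarities)
spearman_rho, _ = spearmanr(human_ratings, model_similarities)
print(f"Pearson r = {pearson_r:.3f}, Spearman rho = {spearman_rho:.3f}")
```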

TOEFL Synonymy Test
LSA has been shown to be able to pass the TOEFL synonymy test.
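Each TOEFL item gives a target word and four candidates, and a vector-space model answers by picking the candidate closest to the target. A sketch under that assumption; the question and the tiny 2-dimensional vectors are invented stand-ins for a real LSA space.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def answer_synonym_question(target, candidates, vec):
    """Pick the candidate most similar to the target in the vector space."""
    return max(candidates, key=lambda c: cosine(vec(target), vec(c)))

# Hypothetical lookup table; in practice vec would index into an LSA matrix.
toy_vectors = {"enormous": np.array([0.9, 0.1]), "huge": np.array([0.85, 0.2]),
               "tiny": np.array([0.1, 0.9]), "wooden": np.array([0.4, 0.6]),
               "angry": np.array([0.5, 0.5])}
print(answer_synonym_question("enormous", ["huge", "tiny", "wooden", "angry"],
                              toy_vectors.get))   # -> huge
```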

Vector-Space Word Sense Induction
Create a context vector for each individual occurrence of the target word, w.
Cluster these vectors into k groups.
Assume each group represents a "sense" of the word, and compute a vector for that sense by taking the mean of each cluster.
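A minimal sketch of the clustering step with scikit-learn's KMeans, assuming the per-occurrence context vectors have already been built (random stand-ins here) and that k = 2 senses are expected.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
occurrence_vectors = rng.random((40, 100))  # one context vector per occurrence of the target word
k = 2                                       # assumed number of senses

kmeans = KMeans(n_clusters=k, n_init=10, random_state=0).fit(occurrence_vectors)
sense_vectors = kmeans.cluster_centers_     # mean of each cluster = vector for that "sense"
print(sense_vectors.shape)                  # (2, 100)
```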

Sample Word Sense Induction (figure): occurrence vectors for "bat", with context words such as hit, flew, wooden, player, cave, vampire, ate, baseball.

Sample Word Sense Induction (figure, continued): the same occurrence vectors for "bat" with the two cluster centers marked (+).

Vector-Space Word Meaning in Context
Compute a semantic vector for an individual occurrence of a word based on its context.
Combine a standard vector for the word with vectors representing the immediate context.

Example Using Dependency Context
(Figure: two dependency parses of "fired", one with nsubj "hunter" and dobj "gun", the other with nsubj "boss" and dobj "secretary".)
Compute the vector for nsubj-boss by summing the contextual vectors for all word occurrences that have "boss" as a subject.
Compute the vector for dobj-secretary by summing the contextual vectors for all word occurrences that have "secretary" as a direct object.
Compute the "in context" vector for "fire" in "boss fired secretary" by adding the nsubj-boss and dobj-secretary vectors to the general vector for "fire".
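A sketch of this composition step in plain NumPy. The lexical vectors and the lists of verbs observed with "boss" as subject and "secretary" as object are invented stand-ins; in practice they would be derived from a parsed corpus as described above.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50

# Assumed corpus-derived vectors for a handful of words (random stand-ins).
vec = {w: rng.random(dim) for w in ["fire", "hire", "praise", "type", "boss", "secretary"]}

# Contextual vector for nsubj-boss: sum over words observed with "boss" as their subject.
nsubj_boss = sum(vec[v] for v in ["fire", "hire", "praise"])
# Contextual vector for dobj-secretary: words observed with "secretary" as direct object.
dobj_secretary = sum(vec[v] for v in ["fire", "hire", "type"])

# "In context" vector for "fire" in "boss fired secretary".
fire_in_context = vec["fire"] + nsubj_boss + dobj_secretary
print(fire_in_context[:5])
```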

Conclusions
A word's meaning can be represented as a vector that encodes distributional information about the contexts in which the word tends to occur.
Lexical semantic similarity can be judged by comparing vectors (e.g., cosine similarity).
Vector-based word senses can be automatically induced by clustering contexts.
Contextualized vectors for word meaning can be constructed by combining lexical and contextual vectors.