Machine Learning in Natural Language Processing


Machine Learning in Natural Language Processing Noriko Tomuro November 16, 2006

What is NLP? Natural Language Processing (NLP) is a field in Computer Science devoted to creating computers that use natural language as input and/or output.

NLP is AI-complete “The most difficult problems in AI manifest themselves in human language phenomena.” Use of language is the touchstone of intelligent behavior.

NLP Tasks NLP systems can be stand-alone applications or components embedded in other systems. At its core, NLP is the linguistic analysis of (natural language) sentences. It involves (minimally): Determining the part-of-speech category of words (e.g. noun, verb). Identifying the syntactic structure of a sentence. Deriving the meaning of the sentence.

NLP Applications Speech Recognition Sentence Analysis Parsing for syntactic analysis Word sense disambiguation for semantic analysis Dialogue Systems, e.g. tutoring systems Question-Answering Systems Machine Translation Automatic Text Summarization (e.g. the Newsblaster system at Columbia University) and much more

Part-Of-Speech (POS) Identifying POS is not as easy as one might think – because of ambiguity. “He authored the book.” “he” – PRO “author” – N or V “the” – DET “book” – N or V Without knowing the structure (or sometimes the meaning) of the sentence, ambiguous categories cannot be correctly disambiguated.

Stemming is NOT NLP Note that stemming (e.g. Porter stemmer) is a technique in Information Retrieval; it’s NOT NLP. Stemming only removes the common morphological and inflectional endings from words. “police”, “policy”, “policies” => “polic” “went” => “went” Whereas proper NLP analysis yields: “police” => “police”, “policies” => “policy” “went” => “go”
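The contrast between stemming and lemmatization can be sketched in a few lines. This is an illustrative toy, not the Porter stemmer or a real lemmatizer: the suffix list and the tiny lemma lookup table are made-up stand-ins for what a real system (e.g. Porter, or a WordNet-backed lemmatizer) would do.

```python
# Toy contrast: stemming chops suffixes blindly; lemmatization maps each
# word form to its dictionary headword (lemma), handling irregular forms.

def crude_stem(word):
    """Strip common suffixes with no linguistic knowledge at all."""
    for suffix in ("ies", "es", "s", "ed", "ing", "e", "y"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# A real lemmatizer consults a dictionary; this tiny table stands in for it.
LEMMAS = {"police": "police", "policies": "policy", "went": "go"}

def lemmatize(word):
    return LEMMAS.get(word, word)

print(crude_stem("police"), "vs", lemmatize("police"))      # polic vs police
print(crude_stem("policies"), "vs", lemmatize("policies"))  # polic vs policy
print(crude_stem("went"), "vs", lemmatize("went"))          # went vs go
```

Note how the stemmer conflates “police” and “policies” into the non-word “polic”, while the lemmatizer keeps them apart and recovers “go” from the irregular “went” – exactly the difference the slide describes.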

Syntactic Structure Grammar rules (often in a CFG) specify the grammatically correct ways to form phrases (and a sentence). Parse tree for “John ate the cake”: S → (NP “John”) (VP (V “ate”) (NP (Det “the”) (N “cake”))) Grammar: S → NP VP NP → “John” NP → Det N VP → V NP V → “ate” Det → “the” N → “cake”

Parsing Algorithms Computational complexity (with no optimization) is exponential. Various parsing algorithms: Top-down Parsing – (top-down) derivation Bottom-up Parsing Chart Parsing Earley’s Algorithm – most efficient, O(n³) Left-corner Parsing – optimization of Earley’s and lots more…
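A minimal top-down parser for the toy grammar from the syntax slide can be written as a recursive-descent procedure with backtracking. This is a sketch of the top-down strategy listed above, not the efficient chart or Earley parsers (and not the author’s own CF parser demoed below); the dictionary grammar encoding is an assumption for illustration.

```python
# Top-down (recursive-descent, backtracking) parsing of "John ate the cake"
# with the toy CFG from the previous slide. Exponential in the worst case,
# which is exactly why chart parsers like Earley's exist.

GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["John"], ["Det", "N"]],
    "VP":  [["V", "NP"]],
    "Det": [["the"]],
    "N":   [["cake"]],
    "V":   [["ate"]],
}

def parse(symbol, tokens, pos):
    """Yield (tree, next_pos) for every way `symbol` derives tokens[pos:]."""
    if symbol not in GRAMMAR:                      # terminal symbol
        if pos < len(tokens) and tokens[pos] == symbol:
            yield symbol, pos + 1
        return
    for production in GRAMMAR[symbol]:
        # expand the production left to right, threading the input position
        def expand(i, p, kids):
            if i == len(production):
                yield (symbol, kids), p
                return
            for tree, q in parse(production[i], tokens, p):
                yield from expand(i + 1, q, kids + [tree])
        yield from expand(0, pos, [])

tokens = "John ate the cake".split()
trees = [t for t, end in parse("S", tokens, 0) if end == len(tokens)]
print(trees[0])
```

The generator enumerates all complete derivations; for this unambiguous sentence there is exactly one parse tree.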

Demo using my CF parser

Semantics Derive the meaning of a sentence. Often applied on the result of syntactic analysis. “John ate the cake.” NP V NP ((action INGEST) ; syntactic verb (actor JOHN-01) ; syntactic subj (object FOOD)) ; syntactic obj To do semantic analysis, we need a (semantic) dictionary (e.g. WordNet, http://www.cogsci.princeton.edu/~wn/).
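The mapping from syntactic roles to the semantic frame shown above can be sketched with a toy lexicon. The lexicon entries here are hypothetical stand-ins for what a real system would retrieve from a semantic dictionary such as WordNet.

```python
# Toy semantic analysis: map the syntactic roles of "John ate the cake"
# (subject, verb, object) to the frame from the slide.

LEXICON = {
    "ate":  {"action": "INGEST"},
    "John": {"concept": "JOHN-01"},
    "cake": {"concept": "FOOD"},
}

def semantic_frame(subj, verb, obj):
    """Fill frame slots from the syntactic roles found by the parser."""
    return {
        "action": LEXICON[verb]["action"],    # syntactic verb
        "actor":  LEXICON[subj]["concept"],   # syntactic subject
        "object": LEXICON[obj]["concept"],    # syntactic object
    }

print(semantic_frame("John", "ate", "cake"))
```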

NLP is Hard… Understanding natural languages is hard … because of inherent ambiguity Engineering NLP systems is also hard … because of Huge amount of resources needed (e.g. grammar, dictionary, documents to extract statistics from) Computational complexity (intractable) of analyzing a sentence

Ambiguity “At last, a computer that understands you like your mother.” Three possible (syntactic) interpretations: It understands you as well as your mother understands you. It understands that you like your mother. It understands you as well as it understands your mother.

Empirical Approaches to NLP Formal analysis of natural languages is too hard and computationally expensive. Abandon complete/correct solutions, and shoot for “approximate” solutions using cheaper/faster techniques. Collect data from documents (a corpus), rather than building linguistic resources manually.

Part-Of-Speech Tagging Typically, from a corpus where words are already tagged with POS categories, collect statistics of n-word sequences (n-grams) => probabilities of POS sequences. For a given untagged/test sentence, assign the most probable POS tag to each word, starting from the beginning, based on the POS tags of the preceding words. Popular techniques: Hidden Markov Models (HMMs) – with Viterbi decoding and parameter estimation via the EM algorithm Transformation rules (e.g. the Brill tagger) Various ML classification algorithms
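The HMM approach can be sketched end to end. The transition and emission probabilities below are hand-set toy numbers (a real tagger estimates them from a tagged corpus), but the Viterbi recursion is the standard one: the best tag sequence ending in each tag is extended one word at a time.

```python
# Bigram HMM POS tagging with Viterbi decoding, on toy probabilities.
import math

TAGS = ["DET", "N", "V"]
# transition p(tag_i | tag_{i-1}); "<s>" marks the sentence start
TRANS = {
    ("<s>", "N"): 0.5, ("<s>", "DET"): 0.4, ("<s>", "V"): 0.1,
    ("N", "V"): 0.6, ("N", "N"): 0.2, ("N", "DET"): 0.2,
    ("V", "DET"): 0.6, ("V", "N"): 0.3, ("V", "V"): 0.1,
    ("DET", "N"): 0.8, ("DET", "V"): 0.1, ("DET", "DET"): 0.1,
}
# emission p(word | tag); note "book" is ambiguous between N and V
EMIT = {
    ("N", "he"): 0.3, ("N", "book"): 0.3,
    ("V", "authored"): 0.4, ("V", "book"): 0.2,
    ("DET", "the"): 0.9,
}

def viterbi(words):
    # best[tag] = (log prob of the best path ending in tag, that path)
    best = {t: (math.log(TRANS.get(("<s>", t), 1e-9))
                + math.log(EMIT.get((t, words[0]), 1e-9)), [t]) for t in TAGS}
    for w in words[1:]:
        best = {t: max(
            (lp + math.log(TRANS.get((prev, t), 1e-9))
                + math.log(EMIT.get((t, w), 1e-9)), path + [t])
            for prev, (lp, path) in best.items()) for t in TAGS}
    return max(best.values())[1]

print(viterbi(["he", "authored", "the", "book"]))  # ['N', 'V', 'DET', 'N']
```

The context resolves the N/V ambiguity of “book”: following a determiner, the noun reading wins, echoing the earlier “He authored the book” example.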

Probabilistic Parsing For ambiguous sentences, we’d like to know which parse tree is more likely than others. So we must assign a probability to each parse tree … but how? The probability of a parse tree t is p(t) = ∏_{r ∈ t} p(r), where r ranges over the rules used in t, and each p(r) is obtained from an (annotated) corpus. Again, parameter estimation is done with various ML techniques.
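Computing p(t) = ∏_{r ∈ t} p(r) is then a one-line product over the rules in the tree. The rule probabilities below are illustrative numbers, not estimates from a real treebank.

```python
# Probability of a parse tree as the product of its rule probabilities.

RULE_PROB = {  # toy PCFG: probabilities of rules sharing a left-hand side sum to 1
    "S -> NP VP": 1.0,
    "NP -> John": 0.3,
    "NP -> Det N": 0.7,
    "VP -> V NP": 0.6,
    "Det -> the": 0.5,
    "N -> cake": 0.1,
    "V -> ate": 0.2,
}

def tree_prob(rules_used):
    p = 1.0
    for r in rules_used:
        p *= RULE_PROB[r]
    return p

# rules used in the parse tree of "John ate the cake"
rules = ["S -> NP VP", "NP -> John", "VP -> V NP",
         "NP -> Det N", "Det -> the", "N -> cake", "V -> ate"]
print(tree_prob(rules))  # roughly 0.00126
```

For an ambiguous sentence one would compute this product for each candidate tree and keep the highest-scoring one.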

Partial (Chunk) Parsing Abandon full parsing; instead, aim to obtain just phrases (chunks). Build a flat structure (instead of the hierarchical tree of full parsing). Each chunk is identified by a regular expression – a finite-state automaton => polynomial time complexity. A chunker is then a cascade of finite-state automata.
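One stage of such a cascade can be sketched as a small finite-state scanner. This hand-coded automaton recognizes the NP pattern DET? ADJ* N+ over a POS-tagged sentence; the tag set and the pattern are simplified assumptions, not a standard chunk grammar.

```python
# A finite-state NP chunker: recognize DET? ADJ* N+ over POS-tagged tokens,
# producing a flat list of chunks instead of a hierarchical parse tree.

tagged = [("the", "DET"), ("big", "ADJ"), ("dog", "N"),
          ("chased", "V"), ("a", "DET"), ("cat", "N")]

def np_chunks(tagged):
    """Greedy left-to-right scan: optional DET, any ADJs, one or more Ns."""
    chunks, i = [], 0
    while i < len(tagged):
        j = i
        if j < len(tagged) and tagged[j][1] == "DET":
            j += 1
        while j < len(tagged) and tagged[j][1] == "ADJ":
            j += 1
        k = j
        while k < len(tagged) and tagged[k][1] == "N":
            k += 1
        if k > j:   # at least one noun: accept the chunk
            chunks.append(" ".join(w for w, _ in tagged[i:k]))
            i = k
        else:       # no NP starts here; move on one token
            i += 1
    return chunks

print(np_chunks(tagged))  # ['the big dog', 'a cat']
```

Each pass over the sentence is linear in its length, which is where the polynomial (here linear) time bound comes from; a full chunker would cascade several such automata (NP, VP, PP, …).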

Semantic Analysis Natural language sentences are ambiguous, largely because a word often has several meanings/senses. e.g. “bass” (n) has two senses: a type of fish tones of low frequency Which sense is used? “The bass part of the song is very moving.” “I went fishing for some sea bass.” It’s easy for humans, but not for computers.

Word Sense Disambiguation A task of assigning the proper sense to words in a sentence (thus a classification task). Using a training corpus annotated with proper senses, obtain statistics of n-word sequences (their word senses). Apply classification algorithms, such as: Decision Tree Support Vector Machine Conditional Random Fields (CRF) Difficulty with data sparseness. Techniques: Smoothing Backoff models
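A naive-Bayes sense classifier with add-one smoothing for the “bass” example can be sketched as follows. The tiny “sense-tagged” training sentences are made up for illustration; a real system would train on an annotated corpus such as SemCor.

```python
# Naive-Bayes word sense disambiguation for "bass" (fish vs. music),
# with add-one (Laplace) smoothing to handle data sparseness.
from collections import Counter
import math

train = [  # (sense, context words) -- a made-up sense-tagged mini-corpus
    ("fish",  "caught a huge bass in the lake".split()),
    ("fish",  "went fishing for sea bass".split()),
    ("music", "the bass line of the song".split()),
    ("music", "turn up the bass frequency".split()),
]

senses = {s for s, _ in train}
counts = {s: Counter() for s in senses}
prior = Counter()
for sense, words in train:
    prior[sense] += 1
    counts[sense].update(words)
vocab = {w for _, ws in train for w in ws}

def classify(context):
    def score(sense):
        total = sum(counts[sense].values())
        lp = math.log(prior[sense] / len(train))
        for w in context:
            # add-one smoothing keeps unseen words from zeroing the product
            lp += math.log((counts[sense][w] + 1) / (total + len(vocab)))
        return lp
    return max(senses, key=score)

print(classify("went fishing for some sea bass".split()))   # fish
print(classify("the bass part of the song".split()))        # music
```

Without the +1 terms, any context word unseen with a sense would drive that sense’s probability to zero: that is the data-sparseness problem, and smoothing (or backoff) is the standard remedy.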

Other NLP’ish ML Tasks Text Categorization – Classify a document into one of the document categories (see textbook on Naïve Bayes). Multi-lingual Retrieval – Enter a query in one language, and retrieve relevant documents in ANY language; the machine translation part is done by ML. Automatic Construction of Domain Ontology – Build a conceptual hierarchy of a specific domain. …and many, many more!

Analyzing Web Documents Recently there have been many NLP applications that analyze (not just retrieve) web documents: Blogs – for semantic analysis and sentiment (polarity/opinion) identification Email Spam Filtering – though most systems utilize simple word probabilities A general approach: “Web as a corpus” – the web as a vast collection of documents