CS 4705 Natural Language Processing: Summing Up
What is Natural Language Processing?
The study of human languages and how they can be represented computationally and analyzed, recognized, and generated algorithmically
Studying NLP involves studying natural language, formal representations, and algorithms for their manipulation
The cats sat on their mat.
Syntax:
[S [NP [Det [the]] [Nom [cats]]] [VP [V [sat]] [PP [Prep [on]] [NP [Det [their]] [Nom [mat]]]]]]
the/DET cats/N sat/VBD on/Prep their/Pro mat/N
[^ the] [the cats] [cats sat] [sat on] [on their] [their mat] [mat $]
Morphology: the cat+pl sit+past on pro+pl+poss mat+sing
Phonology: /dhe kaetz saet ahn dhEr maet/
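The bigram listing on this slide ([^ the] through [mat $]) can be reproduced with a few lines of Python. This is a minimal sketch, not the course's own code: the function name `bigrams`, whitespace tokenization, lowercasing, and stripping of the final period are all assumptions made for illustration; only the `^` and `$` boundary markers come from the slide.

```python
def bigrams(sentence, start="^", end="$"):
    """Return adjacent word pairs, padded with sentence-boundary markers.

    Assumptions (not from the slides): tokens are split on whitespace,
    lowercased, and a trailing period is dropped.
    """
    tokens = [start] + sentence.lower().rstrip(".").split() + [end]
    # Pair each token with its right-hand neighbour.
    return list(zip(tokens, tokens[1:]))


if __name__ == "__main__":
    for left, right in bigrams("The cats sat on their mat."):
        print(f"[{left} {right}]")
```

Padding with boundary markers means that a six-word sentence yields seven bigrams, so the first and last words each get a context on both sides.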
Semantics: on (mat, cats) & own (mat, cats)
event: sitting agent: cats patient: mat
Entity extraction: superior creatures [the cats] sat on their mat
Collocations:
WSD:
Pragmatic/Discourse:
Information Status: They/DG/HG warily watched the dog/DN/HN.
Discourse Structure:
DS1[The cats sat on their mat.] DS2[They warily watched the dog.]
Nuc1[The cats sat on their mat.] Nuc2[They warily watched the dog.]
Sequence(Nuc1,Nuc2)
Reference: their [cats], they [cats]
Cp=cats, Cf={cats,mat}, Cb={}
Applications:
IR: cat mat
Speech recognition: A cat is set on a match.
TTS: The cats sat on their mat.
Spoken Dialogue Systems: A: Meow? B: Meooooowww…
Story Generation: There was once a lonely cat. She was looking for a nice, trusting mouse.
MT: Había una vez un gato solo.
Summarization: A cat looked for a mouse
NLP Applications
Speech Synthesis
Dialogue Systems
–Text (Eliza)
–Spoken (TOOT)
Machine Translation (SYSTRAN)
–Nice Dr. Fish works on a bank of the Rhone River.
Summarization (NewsBlaster)
Grand Challenges
Faster, more accurate ‘real’ parsing
Richer POS tagging and ‘shallow’ parsing
New semantic representations
Data Mining in text and speech
–e.g. “find friends”: X’s long time associate Y, X and Y have been friends, X intimate Y,…
Extracting more entity types with less labeling
Emotional Speech recognition and production
Self-paced language instruction that uses ASR and TTS
Recognizing and making use of disfluencies, back-channels in ASR and understanding
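The “find friends” item in the challenges list describes mining text for surface patterns such as “X’s long time associate Y” and “X and Y have been friends”. A minimal sketch of that idea with regular expressions follows. Everything beyond the two quoted patterns is an assumption made for illustration: the function name `find_friends`, the `NAME` pattern (a single capitalized word standing in for a real named-entity recognizer), and the sample sentence are all hypothetical.

```python
import re

# Assumption: a "name" is one capitalized word. A real system would use
# a named-entity recognizer rather than this crude stand-in.
NAME = r"[A-Z][a-z]+"

# Two of the surface patterns listed on the slide.
PATTERNS = [
    re.compile(rf"(?P<x>{NAME})['’]s long[- ]time associate (?P<y>{NAME})"),
    re.compile(rf"(?P<x>{NAME}) and (?P<y>{NAME}) have been friends"),
]


def find_friends(text):
    """Return (X, Y) pairs matched by any pattern, grouped by pattern."""
    pairs = []
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            pairs.append((match.group("x"), match.group("y")))
    return pairs


if __name__ == "__main__":
    sample = ("Smith's long time associate Jones spoke first. "
              "Lee and Kim have been friends for years.")
    print(find_friends(sample))
```

Hand-written patterns like these are brittle, which is part of why the slide lists this as an open challenge: each new phrasing of the relation needs its own pattern unless the patterns are learned from data.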
Final and Papers
Final examination: covers second half of course
Grad student papers: due at the final