1/7 INFO60021 Natural Language Processing Harold Somers Professor of Language Engineering.
2/7 What is NLP?
Using computer to “do” linguistics
Getting computer to handle language
Synonyms (or components?)
 –Computational Linguistics
 –Language Engineering
Basic tools, techniques and models
Applications
3/7 What is NLP?
[Diagram relating NLP to neighbouring fields: Linguistics, Psychology, Electrical Engineering, Computer Science, Artificial Intelligence, Language Engineering, HCI, Signal Processing, Phonetics, Philosophy, Logic]
4/7 Some essential concepts
Grammar as data
Programs do something with the data
 –Program uses the data to handle the input
 –“Declarative” vs. “procedural” information (what vs. how)
Formalisms
 –If declarative: independent of algorithm, thus reusable
Algorithms
 –Usually seen as “searching” a defined space for an answer
Data structures
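The split between declarative grammar and procedural algorithm can be sketched in a few lines of Python. This is only an illustration: the toy grammar, the words in it, and the function names `parse` and `recognise` are invented here and are not taken from the course's grammar writing package. The grammar is plain data; the program searches the space of expansions for one that consumes the whole input.

```python
# Declarative part: the grammar is plain data (a toy fragment invented for illustration).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"], ["V"]],
    "Det": [["the"]],
    "N": [["dog"], ["cat"]],
    "V": [["sees"], ["sleeps"]],
}


def parse(symbols, words):
    """Procedural part: depth-first search over rule expansions.

    Yields the words left over each time `symbols` matches a prefix of `words`.
    """
    if not symbols:
        yield words
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Non-terminal: try every expansion listed in the data.
        for expansion in GRAMMAR[first]:
            yield from parse(expansion + rest, words)
    elif words and words[0] == first:
        # Terminal: it must match the next input word.
        yield from parse(rest, words[1:])


def recognise(sentence):
    """True if some analysis of S consumes every word of the sentence."""
    return any(left == [] for left in parse(["S"], sentence.split()))


print(recognise("the dog sees the cat"))
print(recognise("dog the sees"))
```

Because the rules live in `GRAMMAR` rather than in the code, a different search strategy could reuse the same data unchanged. Note that this simple top-down search would loop forever on a left-recursive rule, which the toy grammar avoids.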
5/7 More essential concepts
Analysis (parsing) vs. synthesis (generation)
 –Using the grammar to certify input
 –Using the grammar to produce output
Transducer
 –Specifies both input and output (defines a “mapping”)
 –Associated algorithm has three uses:
   Analysis
   Synthesis
   Verification
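The three uses of a transducer can be sketched with an explicit list of pairs. This is a simplification under stated assumptions: a real transducer is normally a finite-state machine rather than an enumerated relation, and the word forms, the tag notation (`+N+PL` and so on), and the function names are invented for illustration.

```python
# The "mapping": pairs of (surface form, lexical analysis). Toy data, invented here.
PAIRS = [
    ("cats", "cat+N+PL"),
    ("cat", "cat+N+SG"),
    ("walked", "walk+V+PAST"),
    ("walks", "walk+V+3SG"),
    ("walks", "walk+N+PL"),
]


def analyse(surface):
    """Analysis: from a surface form to all of its lexical analyses."""
    return [lexical for s, lexical in PAIRS if s == surface]


def synthesise(lexical):
    """Synthesis: from a lexical analysis to all of its surface forms."""
    return [s for s, lex in PAIRS if lex == lexical]


def verify(surface, lexical):
    """Verification: is this pair in the mapping?"""
    return (surface, lexical) in PAIRS


print(analyse("walks"))
print(synthesise("cat+N+PL"))
print(verify("walked", "walk+V+PAST"))
```

The same data serves all three functions; only the direction in which it is consulted changes.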
6/7 Even more essential concepts
Resources
 –Grammars
 –Lexicons, word lists (WordNet)
 –Secondary (re)sources
   Corpora
   Human-oriented dictionaries
Empirical approaches
 –Statistical, data-derived
 –Machine learning
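As a minimal sketch of what "statistical, data-derived" can mean, the following estimates bigram probabilities by counting over a corpus. The ten-token corpus and the function name `bigram_prob` are made up for illustration; a real corpus would be far larger and the estimate would usually be smoothed.

```python
from collections import Counter

# A tiny stand-in for a corpus (invented for illustration).
corpus = "the dog sees the cat . the cat sleeps .".split()

# Count adjacent word pairs, and how often each word starts a pair.
bigrams = Counter(zip(corpus, corpus[1:]))
firsts = Counter(corpus[:-1])


def bigram_prob(prev, word):
    """Relative-frequency estimate of P(word | prev), with no smoothing."""
    if firsts[prev] == 0:
        return 0.0
    return bigrams[(prev, word)] / firsts[prev]


print(bigram_prob("the", "cat"))
print(bigram_prob("the", "dog"))
```

Here nothing about the language is written by hand: the numbers come entirely from the data, which is the contrast with a hand-written grammar or lexicon.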
7/7 Syllabus
Survey of applications
Elements of linguistics, levels of linguistic processing
Resources: machine-readable dictionaries, corpora
Computational morphology
 –Finite state models
 –Morphological analysis vs. Stemming
Structural analysis
 –N-grams, tagging, chunking
 –Context-free parsing
 –Probabilistic parsing
Lexical relations
 –WordNet, similarity in context
Named-entity recognition
Assignment 1
Assignment 2
8/7 Assessment
Examination: 80%
 –3 questions from 5
Course work: 20%
 –Two practical assignments, due weeks 5 and 8
 –Will be based on a grammar-writing package, which will be explained in class