LING 581: Advanced Computational Linguistics Lecture Notes February 9th.

tregex
Pattern matching for passives:
– using variable names and regex group numbering for coindexation
– matching for passives (NP-SBJ-i and object of VP [NP [-NONE- [*-i]]])
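The coindexation idea on this slide can be illustrated with an ordinary regular expression. The sketch below is not tregex syntax: it is a minimal Python illustration, assuming the tree is available as a single bracketed string in Penn Treebank style. The pattern name `PASSIVE`, the helper `passive_index`, and the sample trees are made up for the example. Group 1 captures the index on NP-SBJ, and the backreference `\1` requires the empty object NP to carry the same index.

```python
import re
from typing import Optional

# Group 1 captures the index i on NP-SBJ-i; \1 later demands the same index
# on the trace inside the empty object NP. This is plain string matching, so
# unlike tregex it does not check real dominance relations in the tree.
PASSIVE = re.compile(
    r"\(NP-SBJ-(\d+)\b"            # subject NP with a numeric index
    r".*\(VP\b"                    # some VP further to the right
    r".*\(NP \(-NONE- \*-\1\)\)",  # empty NP whose trace reuses the index
    re.DOTALL,
)


def passive_index(tree: str) -> Optional[str]:
    """Return the shared index if the subject and an NP trace are coindexed."""
    match = PASSIVE.search(tree)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Hypothetical example trees, written in Penn Treebank bracket notation.
    passive = ("(S (NP-SBJ-1 (DT The) (NN bill)) "
               "(VP (VBD was) (VP (VBN passed) (NP (-NONE- *-1)))))")
    active = ("(S (NP-SBJ (DT The) (NN senate)) "
              "(VP (VBD passed) (NP (DT the) (NN bill))))")
    print(passive_index(passive))
    print(passive_index(active))
```

Because the match is on the flat string, a trace with the right index anywhere to the right of the subject would be accepted; a tree-aware tool such as tregex is needed to insist that the empty NP really is the object of the VP.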

Homework Task Report
Bracketing guide
– TREEBANK_3/docs/prsguid1.pdf
Pattern matching for selected constructions in
– wsj tregex.mrg

Bikel Collins
From treebank search to stochastic parsers trained on the WSJ Penn treebank
Java re-implementation of Collins’ parser
Paper
– Daniel M. Bikel. Intricacies of Collins’ Parsing Model. (PS) (PDF) in Computational Linguistics, 30(4), pp
– intricacies.pdf
Software
– parser

Bikel Collins
Some TCL/TK code (I wrote for research use) makes it easy to work with the parser without memorizing the command line options

Bikel Collins
The wrapper is syntactic sugar for various commands
Scripting language is TCL/TK (“tickle T K”)
Assume variables
– set prefix "/Users/sandiway/research/"
– set dbprefix "$prefix/dbparser"
– set tbvprefix "/Applications/treebankviewer.app/Contents/MacOS"
POS tagging (MXPOST, in directory jmx)
– $prefix/jmx/mxpost $prefix/jmx/tagger.project /tmp/err.txt
Parsing
– $dbprefix/bin/parse 400 $dbprefix/settings/$properties $dbprefix/bin/$ddf /tmp/test2.txt stdout
Training
– $dbprefix/bin/train 800 $dbprefix/settings/$properties $dbprefix/bin/$mrg stdout

Bikel Collins
POS tagging (MXPOST, in directory jmx)
– tagger_input
– $prefix/jmx/mxpost $prefix/jmx/tagger.project /tmp/err.txt
Parsing
– set ddf "wsj obj.gz"
– set properties "collins.properties"
– parser_input
– $dbprefix/bin/parse 400 $dbprefix/settings/$properties $dbprefix/bin/$ddf /tmp/test2.txt stdout
Training
– set mrg "wsj mrg"
– set properties "collins.properties"
– $dbprefix/bin/train 800 $dbprefix/settings/$properties $dbprefix/bin/$mrg stdout
Unix file descriptors
0  Standard input (stdin)
1  Standard output (stdout)
2  Standard error (stderr)
GUI components
    frame .input
    text .input.t -height 4 -yscrollcommand {.input.s set}
    scrollbar .input.s -command {.input.t yview}
    frame .tagged
    text .tagged.t -height 9 -yscrollcommand {.tagged.s set}
    scrollbar .tagged.s -command {.tagged.t yview}
Code
    # Save the text of the input pane to /tmp/test.txt for the tagger,
    # with trailing whitespace trimmed and no final newline.
    proc tagger_input {} {
        set lines [.input.t get 1.0 end]
        set infile [open "/tmp/test.txt" w]
        puts -nonewline $infile [string trimright $lines]
        close $infile
    }

    # Save the text of the tagged pane to /tmp/test2.txt for the parser.
    proc parser_input {} {
        set lines [.tagged.t get 1.0 end]
        set infile [open "/tmp/test2.txt" w]
        puts -nonewline $infile [string trimright $lines]
        close $infile
    }
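The pattern on this slide (dump the text widget to a temporary file, then run an external program with its standard streams redirected) can be sketched outside Tcl as well. The following is a minimal Python illustration, not the wrapper's actual code. The function names `write_input` and `run_with_redirects` are hypothetical, and `STAND_IN` is a made-up placeholder command that stands in for the tagger so the example runs without MXPOST installed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Sequence


def write_input(text: str, path: Path) -> None:
    """Save text with trailing whitespace trimmed and no final newline,
    mirroring `puts -nonewline $infile [string trimright $lines]`."""
    path.write_text(text.rstrip(), encoding="utf-8")


def run_with_redirects(cmd: Sequence[str], stdin_path: Path, stderr_path: Path) -> str:
    """Run cmd with descriptor 0 read from stdin_path and descriptor 2
    written to stderr_path; return what the command wrote to descriptor 1."""
    with open(stdin_path, "rb") as fin, open(stderr_path, "wb") as ferr:
        result = subprocess.run(
            list(cmd), stdin=fin, stdout=subprocess.PIPE, stderr=ferr, check=True
        )
    return result.stdout.decode("utf-8")


# Hypothetical stand-in for an external tool: it reads stdin, reports a
# message on stderr, and writes the upper-cased input to stdout.
STAND_IN = [
    sys.executable,
    "-c",
    "import sys; data = sys.stdin.read(); "
    "sys.stderr.write('read %d chars\\n' % len(data)); "
    "sys.stdout.write(data.upper())",
]

if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    infile, errfile = workdir / "test.txt", workdir / "err.txt"
    write_input("the cat sat \n\n", infile)
    print(run_with_redirects(STAND_IN, infile, errfile))
    print(errfile.read_text())
```

Keeping standard error in its own file means diagnostic messages from the external tool do not get mixed into the output that is shown to the user or passed to the next stage.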

Bikel Collins
There’s also a simple tree viewer I wrote but it may not run on your system…

Bikel Collins
Relevant files and directories
bikeldemo
– wrapper2.tcl (prefix set to /Users/sandiway)
jmx
– mxpost (shell script)
– mxpost.jar (Java code)
dbparser
– dbparser/bin/parse (shell script)
– dbparser/bin/train (shell script)
– dbparser/dbparser.jar (Java code)
– dbparser/userguide/guide.pdf