Testing with the Finite-State Calculus
Thursday AM
Kenneth R. Beesley
Xerox Research Centre Europe

Testing NLP Systems is a Traditional Nightmare
- Traditional NLP systems coded in C/C++, or whatever, accept a language, but which one? The set of words handled or not handled by such a system is epiphenomenal.
- When you change anything in a coded system, how do you test the effect, especially when your original system handled millions of words? How can you do general regression testing?
But luckily …
- In finite-state systems, we can easily identify and test the languages and relations handled by our networks.
- Using the finite-state calculus itself, we can compare the coverage of one system or version against another.

Testing with the Finite-State Calculus
- Review relevant xfst operations/commands
- Reiterate some mathematical restrictions
- Introduce some fundamental testing methods
    - Checking the alphabet
    - Regression testing via subtraction
    - Using a “Lexical Grammar”

Review of xfst Operations
Most xfst operations like ‘union net’, ‘compose net’, and ‘intersect net’ pop their arguments off the stack one by one, compute the result, and push the result back onto the stack.
More stack operations:
    pop stack        pop off and throw away the top network
    push Variable    equivalent to: read regex Variable ;
    clear stack      pop all networks from the stack
    rotate stack     pop off the top network, reinsert it magically at the bottom of the stack
    turn stack       flip the entire stack upside down
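
The stack discipline described above can be mimicked with a toy model. This is only a sketch in Python, not xfst: the class name NetStack and its methods are hypothetical, and the "networks" are plain Python sets standing in for finite languages.

```python
class NetStack:
    """Toy model of the xfst stack (hypothetical, for illustration only)."""

    def __init__(self):
        self._items = []  # the last element is the top of the stack

    def push(self, net):
        self._items.append(net)

    def pop(self):
        # Like 'pop stack': remove the top network and return it.
        return self._items.pop()

    def clear(self):
        # Like 'clear stack': remove every network.
        self._items.clear()

    def rotate(self):
        # Like 'rotate stack': the top network is moved to the bottom.
        if self._items:
            top = self._items.pop()
            self._items.insert(0, top)

    def turn(self):
        # Like 'turn stack': flip the whole stack upside down.
        self._items.reverse()

    def apply_binary(self, operation):
        # Binary operations pop two arguments and push one result.
        first = self._items.pop()
        second = self._items.pop()
        self._items.append(operation(first, second))

    def top_to_bottom(self):
        # List the stack contents, top network first.
        return list(reversed(self._items))
```

Pushing A, B, C and then calling rotate() leaves B on top and C at the bottom; a following turn() puts C back on top. Calling apply_binary with a set union plays the role of ‘union net’ on two finite languages.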

Some More Operations on Networks
    reverse net    reverses a network left-to-right
    invert net     turns a network upside-down
    lower-side     pops top FST, pushes the one-level FSM for its lower side
    upper-side     pops top FST, pushes the one-level FSM for its upper side
    sigma          print the “alphabet” of the machine
    labels         print the labels actually used on the arcs of the network

Peeking at your Network Languages
    xfst[]: print words           print all words/paths
    xfst[]: print random-words    print a random selection of words/paths
    xfst[]: print random-upper    print a random selection of upper words
    xfst[]: print random-lower    print a random selection of lower words
    xfst[]: print net             show a list of states/arcs
    xfst[]: print vcg net         graphic display of the net, using VCG (not installed outside Xerox)
These commands operate on the top network on The Stack.

Some Mathematical Restrictions
- With simple (one-level) FSMs denoting regular languages, you can: union, concatenate, intersect, subtract, negate (take the complement).
- With FSTs denoting relations, you can: union and concatenate, but not intersect, subtract, or take the complement.
- You can compose FSTs, which involves a kind of one-sided intersection.
- When a simple (one-level) FSM is used in a composition, it is automatically interpreted as an identity transducer.
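
The last two points can be illustrated on small finite examples. The following Python sketch is not xfst and makes strong simplifying assumptions: a relation is a finite set of (upper, lower) string pairs, a language is a finite set of strings, and the function names are hypothetical. Real networks can encode infinite languages and relations, which plain sets cannot.

```python
def identity(language):
    """Interpret a simple language as an identity relation: each word maps to itself."""
    return {(word, word) for word in language}


def compose(upper_relation, lower_relation):
    """Pair x with z whenever x:y is in the first relation and y:z is in the second."""
    return {
        (upper, lower)
        for (upper, middle_a) in upper_relation
        for (middle_b, lower) in lower_relation
        if middle_a == middle_b
    }


def upper_side(relation):
    """The set of upper strings of a relation."""
    return {upper for upper, _ in relation}


def lower_side(relation):
    """The set of lower strings of a relation."""
    return {lower for _, lower in relation}
```

Composing a relation with identity(some_language) on its lower side keeps only the pairs whose lower string is in that language, which is the "one-sided intersection" the slide mentions.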

“Checking the Alphabet”
Assume that you have created mylang.fst, and it isn’t working as expected. The first and most fundamental kind of debugging is simply to “check the alphabet”.
    xfst[]: load stack mylang.fst
    xfst[]: upper-side
    xfst[]: sigma
    xfst[]: clear stack
    xfst[]: load stack mylang.fst
    xfst[]: lower-side
    xfst[]: sigma
Very often you will discover that you have inadvertently introduced an undesired multicharacter symbol (probably in a regular expression) or failed (in lexc) to declare a Multichar_Symbol.

Inadvertent Multicharacter Symbols
    xfst[0]: read regex [ dog | cat | rat ] ;
    Problems:             ^^^   ^^^   ^^^
    xfst[0]: read regex [ {work} | {talk} | {walk} ] "+Verb":0
             [ "+Bare":0 | "+3PS":s | "+Past":{ed} | "+PstPrt":"ing" ] ;
    Problem:                                                    ^^^

Tracking Down a Strange Symbol
Suppose that “ing” appears as a multicharacter symbol in the lower-side language. You introduced it by mistake somewhere in your grammar, probably in a regular expression. The following idiom can often help you isolate where it came from.
    xfst[]: read regex @"mylang.fst" .o. $[ing] ;
    xfst[]: print random-words
    xfst[]: print random-upper

Forgetting to Declare a Multichar_Symbol in lexc
    Multichar_Symbols +Noun
    LEXICON Root
            Adjectives ;
    LEXICON Adjectives
    dark    Adj ;
    quick   Adj ;
    LEXICON Adj
    +Adj:0  # ;
Draw the network produced by the lexc compiler.
BEWARE: If what you intended to be a multicharacter symbol is not declared, lexc will happily and silently explode it into a concatenation of symbols.
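
The effect of a missing declaration can be imitated in a few lines of Python. This is a sketch under a stated assumption, namely that symbols are split left to right with the longest declared multicharacter symbol winning; it is meant to show the "exploding" behavior, not to reproduce the lexc compiler. The function name tokenize is hypothetical.

```python
def tokenize(text, multichar_symbols):
    """Split text into symbols, preferring the longest declared multichar symbol."""
    # Try longer symbols first so that a longer tag wins over its own prefix.
    declared = sorted((s for s in multichar_symbols if s), key=len, reverse=True)
    symbols = []
    position = 0
    while position < len(text):
        for symbol in declared:
            if text.startswith(symbol, position):
                symbols.append(symbol)
                position += len(symbol)
                break
        else:
            # No declared symbol matches here: fall back to a single character.
            symbols.append(text[position])
            position += 1
    return symbols
```

With only +Noun declared, as in the grammar above, tokenize("+Adj", {"+Noun"}) yields the four separate symbols +, A, d, j. Declaring +Adj as well yields the single symbol +Adj, and the stray plus sign disappears from the alphabet.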

You Forgot to Declare a Multichar_Symbol
It’s very common to forget, in lexc, to declare a Multichar_Symbol. By convention at Xerox, “tag” names contain a special character, e.g. [Verb] or +Verb. If the plus sign (or [ or ]) shows up as a separate symbol in the alphabet of the upper-side language, this can be a sign that you forgot to declare a tag symbol.
    xfst[]: read regex $[%+] .o. @"mylang.fst" ;
    xfst[]: print random-words
    xfst[]: print random-upper
Avoid using the plus sign (and other easily typed punctuation symbols) as delimiter characters. Use multicharacter symbols instead.

Regression Testing
Your morphological analyzer will be built from a large lexc grammar and perhaps dozens of rules. It will typically analyze millions of words. When you change your system in any way, how do you know if you have broken anything?
[Diagram: old.fst becomes new.fst after changes to dictionaries and rules]

Regression Testing
Assume that you have an old.fst and a new.fst. To see what has changed on the lower side …
    xfst[0]: load stack old.fst
    xfst[1]: lower-side
    xfst[1]: define old
    xfst[0]: load stack new.fst
    xfst[1]: lower-side
    xfst[1]: define new
    xfst[0]: read regex new - old ;
    xfst[1]: words
    xfst[1]: read regex old - new ;
    xfst[2]: words

Regression Testing
Often it’s convenient or necessary to write the words to file:
    xfst[]: read regex new - old ;
    xfst[]: write text > words.added
    xfst[]: read regex old - new ;
    xfst[]: write text > words.lost
‘write text’ prints out the words in the top network, one to a line.
In some cases, you might want to take the upper-side of old.fst and new.fst and compare them rather than the surface languages.
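
For readers who want to see the subtraction idiom outside xfst, here is a minimal Python sketch on finite word lists. It assumes the two languages have already been enumerated as lists of strings; the function name regression_diff is hypothetical. Unlike network subtraction in xfst, it cannot handle infinite languages.

```python
def regression_diff(old_words, new_words):
    """Return (added, lost): the words in new - old and the words in old - new."""
    old = set(old_words)
    new = set(new_words)
    added = sorted(new - old)  # counterpart of: read regex new - old ;
    lost = sorted(old - new)   # counterpart of: read regex old - new ;
    return added, lost
```

Both result lists should be inspected after every change: words in the "added" list may be overgeneration, and words in the "lost" list are coverage that the new version no longer has.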

Testing against an External Wordlist
Assume that you find spanwords.txt, a list of Spanish words, on the Internet. To compare this wordlist against the lower side of your spanish.fst...
    xfst[]: read text spanwords.txt
    xfst[]: define wordlist
    xfst[]: load stack spanish.fst
    xfst[]: lower-side
    xfst[]: define mylow
    xfst[]: read regex wordlist - mylow ;
    xfst[]: write text > missing.txt
    xfst[]: read regex mylow - wordlist ;
    xfst[]: write text > suspect.txt

The Upper-Side Language
- The surface language is usually defined for you by the standard orthography.
- The upper-side “analysis” language must be defined by the linguist.
- Some care and planning should go into that design. Ask yourself early in the project what you want to see when you analyze a word.
- Anyone who wants to use your system for generation will need to know exactly what the upper-side strings look like, including the correct tag spellings and tag orders.
- The upper-side language needs to be documented.
- You need to be able to test to make sure that the upper-side language conforms to your documentation.

Writing a “Lexical Grammar”
It’s a very good idea to write a “lexical grammar” that covers all possible lexical strings, using a pseudo-baseform.
    [a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z]+
    [
        %+Noun ( %+Aug | %+Dim ) [ %+Masc | %+Fem ] [ %+Sg | %+Pl ]
      | %+Adj ( %+Comp | %+Sup ) [ %+Masc | %+Fem ] [ %+Sg | %+Pl ]
      | %+Verb [ %+Past | %+Pres | %+Fut ] [ %+1P | %+2P | %+3P ] [ %+Sg | %+Pl ]
    ] ;

Using a Lexical Grammar
Find all the upper-level strings in your FST that are not described by your lexical grammar.
    xfst[0]: read regex < lexgram.regex
    xfst[1]: define lexgram
    xfst[0]: load stack mysystem.fst
    xfst[1]: upper-side
    xfst[1]: define us
    xfst[0]: read regex us - lexgram ;
    xfst[1]: size
    xfst[1]: words
Then fix your grammar and/or the Lexical Grammar, until they are compatible.
Why is the reverse subtraction impossible?
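
The same check can be approximated on an enumerated list of upper-side strings. The Python sketch below transcribes the example lexical grammar from the previous slide into a regular expression for Python's re module; it assumes tags are spelled as plain text (for example +Noun) and that the upper strings are available as a finite list. The names LEXGRAM and illegal_upper_strings are hypothetical.

```python
import re

# Transcription of the example lexical grammar; (?: ... )? marks an optional tag.
LEXGRAM = re.compile(
    r"""
    [a-z]+
    (?:
        \+Noun (?:\+Aug|\+Dim)?   (?:\+Masc|\+Fem)    (?:\+Sg|\+Pl)
      | \+Adj  (?:\+Comp|\+Sup)?  (?:\+Masc|\+Fem)    (?:\+Sg|\+Pl)
      | \+Verb (?:\+Past|\+Pres|\+Fut) (?:\+1P|\+2P|\+3P) (?:\+Sg|\+Pl)
    )
    """,
    re.VERBOSE,
)


def illegal_upper_strings(upper_strings):
    """Return the strings not covered by the lexical grammar (like: us - lexgram)."""
    return [s for s in upper_strings if LEXGRAM.fullmatch(s) is None]
```

A string such as casa+Noun+Sg+Fem is reported because its tags are in the wrong order, and alto+Adj+Masc because the number tag is missing. An empty result means every listed upper string conforms to the grammar.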

The Lexical Grammar and Documentation
- The Lexical Grammar describes all the possible multicharacter tags and sequences of tags in lexical strings.
- It is all too easy to introduce errors in lexc or regular-expression source code.
- The Lexical Grammar should be used regularly to check the well-formedness of the upper-side language of your transducer.
- A well-commented Lexical Grammar, just as it appears in the file, should be part of the documentation of your system. (Include comments!)

Checking the Surface Language
In some cases, it may be possible to use composition and/or subtraction to test your surface language for illegal words. For example, if the presence of two orthographically accented vowels in the same word is illegal, try
    xfst[]: define accvow [à|á|â|ã|è|é|ê|ë|ì|í|î|ï|ò|ó|ô|õ|ö|ù|ú|û|ü] ;
    xfst[]: define surffilter $[accvow ?* accvow] ;
    xfst[]: read regex @"mylang.fst" .o. surffilter ;
    xfst[]: size
    xfst[]: lower-side
    xfst[]: write words > multipleaccents.txt
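
A comparable filter over an enumerated surface wordlist might look like the following Python sketch. It assumes the words are plain strings with precomposed accented characters (the same vowel set as accvow above); decomposed Unicode input would need normalizing first. The names ACCENTED_VOWELS and words_with_multiple_accents are hypothetical.

```python
import re

# Same vowel inventory as the accvow definition on the slide.
ACCENTED_VOWELS = "àáâãèéêëìíîïòóôõöùúûü"

# Counterpart of $[accvow ?* accvow]: two accented vowels anywhere in the word.
TWO_ACCENTS = re.compile(f"[{ACCENTED_VOWELS}].*[{ACCENTED_VOWELS}]")


def words_with_multiple_accents(words):
    """Return the words that contain at least two accented vowels."""
    return [word for word in words if TWO_ACCENTS.search(word)]
```

Words returned by this function correspond to the paths that survive composition with surffilter, so in a language where double accents are illegal each one points to an error in the lexicon or rules.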

Summary: Testing I
- Think in terms of Languages and Relations.
- Draw and maintain a graphic wall chart showing the overall organization of your system. Be able to characterize the language at each “level”.
- Learn and use the finite-state testing idioms:
    - Check the alphabet
    - Regression testing (compare previous versions or external wordlists via subtraction)
    - Write a “lexical grammar” and use it regularly to find illegal upper strings
    - If you notice illegal surface strings, use filters to search for others

Summary: Testing II
- Use makefiles.
- Keep source files in a separate directory.
- Put all your edited source files under Version Control.
- Keep backups, especially of delivered versions. Do it yourself if necessary.
