Chiara Moraglia

A branch of computational linguistics: the study of mathematical structures and methods that pertain to linguistics. It combines aspects of computer science, mathematics, and linguistics.

Words: anchor
Sentences: Cleaning fluid can be dangerous. Claire kicked the bucket.

Machine translation that keeps the problem of ambiguity in mind. A translation is a sequence of reordering decisions and word translation decisions, each assigned a probability based on linguistic data. Two main reordering models:
1) phrase-based models: re-align phrases (strings of words)
2) syntax-based models: can use tree transducers to permute trees (syntactic structure) with words as leaves
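One way to picture scoring a sequence of decisions is to multiply the probabilities of the individual decisions. The sketch below does this under the simplifying assumption (not stated in the slides) that decisions are independent. The decision descriptions and probability values are invented for illustration.

```python
import math

def derivation_probability(decisions):
    """Return the product of the probabilities of all decisions.

    `decisions` is a list of (description, probability) pairs.
    Treating the decisions as independent is a simplifying assumption.
    """
    # Sum log-probabilities to avoid underflow on long sequences.
    log_prob = sum(math.log(p) for _, p in decisions)
    return math.exp(log_prob)

# Hypothetical decisions with made-up probabilities.
decisions = [
    ("reorder: swap two phrases", 0.8),
    ("translate: 'anchor' as the ship's anchor", 0.5),
    ("translate: 'kicked the bucket' literally", 0.25),
]
probability = derivation_probability(decisions)
```

A real system would estimate these probabilities from data and would typically condition each decision on context rather than treating them as independent.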

Generalize the work on tree automata and tree transductions to non-deterministic models and explore the equivalence properties that were proven to hold in the deterministic case.

A hierarchical collection of labeled nodes connected by edges, starting at a root node.
Image: https://upload.wikimedia.org/wikipedia/commons/f/f7/Binary_tree.svg
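A minimal sketch of such a tree as a data structure, assuming each node stores a label and an ordered list of children. The names `Node`, `preorder`, and `height` are illustrative choices, not taken from the slides.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """A labeled tree node; `children` holds its ordered subtrees."""
    label: str
    children: list = field(default_factory=list)

def preorder(node):
    """Return the labels in root-first, left-to-right order."""
    labels = [node.label]
    for child in node.children:
        labels.extend(preorder(child))
    return labels

def height(node):
    """Number of edges on the longest path from this node down to a leaf."""
    if not node.children:
        return 0
    return 1 + max(height(child) for child in node.children)

# A small hypothetical tree: root "r" with children "x" and "y"; "x" has child "z".
tree = Node("r", [Node("x", [Node("z")]), Node("y")])
```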

A tree transducer is a 5-tuple (F, H, Q, q_in, R) where
i) F is a functional signature of input symbols
ii) H is a functional signature of output symbols
iii) Q is a finite set of states
iv) q_in ∈ Q is the initial state
v) R is a finite set of rules ζ where ζ is 1) 2) h(,…, )
Φ gives the conditions the current node must satisfy; Ψ says which node to go to from the current node (Courcelle & Engelfriet, 2012).

A functional signature is a set of function symbols, each with an associated arity ρ(f) (the number of arguments the function takes).
E.g. f(x), ρ(f) = 1; h(x, y, z), ρ(h) = 3 (Courcelle & Engelfriet, 2012)
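A signature can be sketched as a mapping from each symbol to its arity, with a check that every node of a term has exactly ρ(symbol) children. The sketch reuses f and h from the example and adds a hypothetical constant c (arity 0) so that terms can terminate. Terms are written as nested tuples, which is an assumption made here for brevity.

```python
# A functional signature as a mapping from symbol to arity ρ.
# "c" is a hypothetical constant added so that terms have leaves.
SIGNATURE = {"f": 1, "h": 3, "c": 0}

def is_well_formed(term, signature):
    """Check that every node has exactly ρ(symbol) children.

    A term is a tuple (symbol, child_1, ..., child_k).
    Symbols missing from the signature make the term ill-formed.
    """
    symbol, *children = term
    if signature.get(symbol) != len(children):
        return False
    return all(is_well_formed(child, signature) for child in children)

good = ("h", ("f", ("c",)), ("c",), ("c",))   # h(f(c), c, c)
bad = ("f", ("c",), ("c",))                   # f given two arguments, but ρ(f) = 1
```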

i) F = {f, a, b} where ρ(f) = 2, ρ(a) = ρ(b) = 0
ii) H = {a, b, ε} where ρ(a) = ρ(b) = 1, ρ(ε) = 0
iii) Q = {q_in, q_1, q_2}
iv) q_in ∈ Q is the initial state
v) R = 1) 2) x( ) 3) 4) 5) ε 6) x( )
(Courcelle & Engelfriet, 2012)

Diagram: states q_in, q_1, q_2; outputs a( ) or b( ), and ε.

Input tree: f(a, b). Output tree: a(b(ε)).
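Reading the figure as the input f(a, b) with output a(b(ε)), the example appears to list the leaves of the input tree from left to right as a chain of unary symbols ending in ε. The sketch below computes that mapping directly with a recursive function. It does not reproduce the transducer's rules or states, and the tuple encoding of terms and the function names are assumptions.

```python
def leaves(term):
    """Return the leaf symbols of a term (symbol, *children), left to right."""
    symbol, *children = term
    if not children:
        return [symbol]
    result = []
    for child in children:
        result.extend(leaves(child))
    return result

def to_chain(symbols):
    """Build the unary chain s1(s2(...(ε))) from a list of symbols."""
    chain = ("ε",)
    for symbol in reversed(symbols):
        chain = (symbol, chain)
    return chain

def transduce(term):
    """Map an input tree to the chain of its leaves, ending in ε."""
    return to_chain(leaves(term))

# f(a, b) is mapped to a(b(ε)).
output = transduce(("f", ("a",), ("b",)))
```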

A tree transducer is deterministic if the state and the position in the tree uniquely determine which rule should be applied. Otherwise, it is non-deterministic. E.g.

Diagram: state q_in, symbol a, and the outputs g( ) and h( ). Modified from (Fülöp, 1981)

Figure: an input tree and its possible output trees, over the symbols f, g, h, and a.
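To illustrate how a non-deterministic transducer can yield several outputs for one input, the sketch below uses a simplified top-down model in which a rule is selected by the current state and the node's symbol. The rule table is hypothetical and is not the example from Fülöp (1981); the input symbol s and the single state q are invented for illustration.

```python
from itertools import product

# Hypothetical rules: (state, input symbol) -> list of alternatives.
# Each alternative is (output symbol, states for the children, in order).
RULES = {
    ("q", "f"): [("f", ["q", "q"])],
    ("q", "s"): [("g", ["q"]), ("h", ["q"])],  # two choices: non-deterministic
    ("q", "a"): [("a", [])],
}

def is_deterministic(rules):
    """True if every (state, symbol) pair has at most one applicable rule."""
    return all(len(alternatives) <= 1 for alternatives in rules.values())

def outputs(state, term, rules):
    """Enumerate every output tree for `term` processed in `state`."""
    symbol, *children = term
    results = []
    for out_symbol, child_states in rules.get((state, symbol), []):
        # All possible outputs for each child, in its assigned state.
        child_options = [
            outputs(s, child, rules) for s, child in zip(child_states, children)
        ]
        # Combine one choice per child in every possible way.
        for combination in product(*child_options):
            results.append((out_symbol, *combination))
    return results

# Input f(s(a), s(a)): each s may become g or h, giving four outputs.
possible = outputs("q", ("f", ("s", ("a",)), ("s", ("a",))), RULES)
```

In a translation setting, each alternative in the rule table would carry a probability, so that every enumerated output tree receives a score.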

The possible output trees would be assigned probabilities. Then the words would be translated into the target language.

