Constraint based Dependency Telugu Parser
Guided by - Dr. Rajeev Sangal, Dr. Dipti Misra, Samar Hussain
Team members - Phani Chaitanya, Ravi Kiran

Overview
– Motivation
– A word about the language
– Overview of constraint based parser
– Analysis of special cases
   – Genitives
   – Copula
   – “ani” construction
   – Conjuncts
– Future work

Motivation
– We had in mind a question answering system in Telugu, mainly for the medical and tourism domains, that could help native Telugu speakers (as a preliminary diagnosis tool and a travel guide).
– Building such a system is much easier with a parser, and none was available to us.

A word about the language
Telugu is a South Asian language.
Features
– Morphologically rich
– Free word order
– Agglutinative
Challenges
– No treebank
– No parser
– No wordnet

Overview of constraint based parser
Pipeline
1. Raw sentence
2. POS tagging and chunking
3. Identify source and demand groups
4. Load frames (demand and transformation)
5. Identify source groups satisfying demands and draw arcs
6. Apply the 3 constraints and form equations for each demand
7. Integer programming module (solves the equations)
8. Final parse
Example
Telugu : rAmudu iMtiki vaccAka paMdu ni wiMtAdu
Gloss : Rama home after_coming apple eats
English : Ram eats an apple after coming home

Overview of constraint based parser contd.. (POS tagging and chunking)
1     ((         NP
1.1   rAmudu     NN
      ))
2     ((         NP
2.1   iMtiki     NN
      ))
3     ((         VG
3.1   vaccAka    VRB
      ))
4     ((         NP
4.1   paMdu      NN
4.2   ni         PREP
      ))
5     ((         VG
5.1   wiMtAdu    VFM
5.2   .          SYM
      ))

Overview of constraint based parser contd.. (Identify source and demand groups)
1     ((         NP    Source
1.1   rAmudu     NN
      ))
2     ((         NP    Source
2.1   iMtiki     NN
      ))
3     ((         VG    Demand
3.1   vaccAka    VRB
      ))
4     ((         NP    Source
4.1   paMdu      NN
4.2   ni         PREP
      ))
5     ((         VG    Demand
5.1   wiMtAdu    VFM
5.2   .          SYM
      ))
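The labelling on this slide can be sketched in a few lines of Python. This is only an illustration: the chunk list is typed in by hand, and the rule "NP chunks are sources, VG chunks are demands" is an assumption read off this one example, not a description of the parser's actual code.

```python
# Hand-written chunk list for the example sentence: (chunk_id, chunk_tag, words).
CHUNKS = [
    (1, "NP", ["rAmudu"]),
    (2, "NP", ["iMtiki"]),
    (3, "VG", ["vaccAka"]),
    (4, "NP", ["paMdu", "ni"]),
    (5, "VG", ["wiMtAdu", "."]),
]

# Assumed rule for this sketch only: noun chunks supply arguments (sources),
# verb groups ask for them (demands).
ROLE_BY_TAG = {"NP": "Source", "VG": "Demand"}


def label_groups(chunks):
    """Return {chunk_id: role} for every chunk whose tag has a known role."""
    return {cid: ROLE_BY_TAG[tag] for cid, tag, _ in chunks if tag in ROLE_BY_TAG}


groups = label_groups(CHUNKS)
for cid, tag, words in CHUNKS:
    print(cid, tag, " ".join(words), groups.get(cid, "-"))
```

Chunks with a tag outside the table are simply left unlabelled, so the sketch does not guess a role for them.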

Overview of constraint based parser contd.. (Load frames: demand and transformation)

Frame for winu (eat; in basic form, so no transformation is required)
-------------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
-------------------------------------------------------------------
k1        | m         | 0        | n       | l    | c
k2        | m         | ni       | n       | l    | c
-------------------------------------------------------------------

Frame for vaccu (come)
-------------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
-------------------------------------------------------------------
k1        | m         | 0        | n       | l    | c
k2        | m         | ki       | n       | l    | c
-------------------------------------------------------------------

Transformation chart [ina_aka (after + -ing)]
----------------------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln | op
----------------------------------------------------------------------------
k1        | m         | 0        | n       | l    | c    | remove
vmod      | m         | -        | v       | r    | p    | insert
----------------------------------------------------------------------------

[Figure: two trees with arcs labelled k1 and k2. winu[wa] (eat) over rAmudu (Ram) and paMdu (fruit); vaccu[ina_aka] (after coming) over iMtiki (house) and rAmudu.]

Overview of constraint based parser contd.. (Identify source groups satisfying demands and draw arcs)

Frame for vaccAka (after transformation)
-------------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
-------------------------------------------------------------------
k2        | m         | ki       | n       | l    | c
vmod      | m         | -        | v       | r    | p
-------------------------------------------------------------------

Frame for winu
-------------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
-------------------------------------------------------------------
k1        | m         | 0        | n       | l    | c
k2        | m         | ni       | n       | l    | c
-------------------------------------------------------------------

[Figure: candidate arcs drawn over the sentence “rAmudu iMtiki vaccAka paMdu ni wiMtAdu”, one variable per arc. X1: k1, X2: k2, X3: k2, X4: vmod.]
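How the ina_aka chart turns the base frame of vaccu into the frame of vaccAka can be sketched as follows. The data structure, the function name apply_chart and the choice to match rows by arc label are assumptions made for this illustration; the slides only give the tables themselves. Labels are written in lower case throughout.

```python
from typing import NamedTuple


class Row(NamedTuple):
    """One line of a demand frame."""
    label: str
    necessity: str
    vibhakti: str
    lextype: str
    posn: str
    reln: str


# Base demand frame for vaccu (come), copied from the frame table.
BASE_VACCU = [
    Row("k1", "m", "0", "n", "l", "c"),
    Row("k2", "m", "ki", "n", "l", "c"),
]

# Transformation chart for ina_aka: (operation, row).
INA_AKA = [
    ("remove", Row("k1", "m", "0", "n", "l", "c")),
    ("insert", Row("vmod", "m", "-", "v", "r", "p")),
]


def apply_chart(frame, chart):
    """Apply remove/insert operations to a copy of the frame.

    Assumption: a 'remove' drops every row with the same arc label,
    and an 'insert' appends the row at the end.
    """
    result = list(frame)
    for op, row in chart:
        if op == "remove":
            result = [r for r in result if r.label != row.label]
        elif op == "insert":
            result.append(row)
        else:
            raise ValueError(f"unknown operation: {op}")
    return result


transformed = apply_chart(BASE_VACCU, INA_AKA)
for row in transformed:
    print(*row)
```

The result has the two rows shown for vaccAka: the k2 demand is kept, k1 is dropped, and a vmod demand pointing to a verb on the right is added.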

Overview of constraint based parser contd.. (Apply the 3 constraints and form equations for each demand)
C1 : For each mandatory karaka in the karaka chart of each demand group, there should be exactly one outgoing edge from the demand group labeled with that karaka.
C2 : For each optional or desirable karaka in the karaka chart of each demand group, there should be at most one outgoing edge from the demand group labeled with that karaka.
C3 : There should be exactly one incoming arc into each source group.
Equations formed by applying the above constraints:
C1 : X1 = 1, X2 = 1, X3 = 1, X4 = 1
C2 : No optional field found
C3 : X1 = 1, X2 = 1, X3 = 1, X4 = 1
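The three constraints can be written down as equations over 0/1 variables and checked mechanically. The sketch below is an illustration under stated assumptions: the mapping of X1..X4 to particular words is our reading of the example (the slides only give the labels), the other end of the vmod arc is counted as a source so that C3 yields the four equations listed, and a brute-force search over all 0/1 assignments stands in for the integer programming module.

```python
from itertools import product

# Candidate arcs: variable -> (demand group, arc label, source group).
# Assumed mapping for illustration; the slides only list X1..X4 with labels.
ARCS = {
    "X1": ("wiMtAdu", "k1", "rAmudu"),
    "X2": ("vaccAka", "k2", "iMtiki"),
    "X3": ("wiMtAdu", "k2", "paMdu ni"),
    "X4": ("vaccAka", "vmod", "wiMtAdu"),
}
MANDATORY = {"wiMtAdu": ["k1", "k2"], "vaccAka": ["k2", "vmod"]}
OPTIONAL = {}  # no optional karakas in this example


def build_constraints(arcs, mandatory, optional):
    """Return constraints as (variable_names, relation, bound) triples."""
    constraints = []
    # C1: exactly one arc per mandatory karaka of each demand group.
    for demand, labels in mandatory.items():
        for label in labels:
            names = [v for v, (d, l, _) in arcs.items() if d == demand and l == label]
            constraints.append((names, "==", 1))
    # C2: at most one arc per optional karaka of each demand group.
    for demand, labels in optional.items():
        for label in labels:
            names = [v for v, (d, l, _) in arcs.items() if d == demand and l == label]
            constraints.append((names, "<=", 1))
    # C3: exactly one incoming arc per source group.
    for source in sorted({s for _, _, s in arcs.values()}):
        names = [v for v, (_, _, s) in arcs.items() if s == source]
        constraints.append((names, "==", 1))
    return constraints


def solve(arcs, constraints):
    """Enumerate every 0/1 assignment and keep those meeting all constraints."""
    names = list(arcs)
    solutions = []
    for values in product((0, 1), repeat=len(names)):
        assignment = dict(zip(names, values))
        satisfied = True
        for variables, relation, bound in constraints:
            total = sum(assignment[v] for v in variables)
            if (relation == "==" and total != bound) or (relation == "<=" and total > bound):
                satisfied = False
                break
        if satisfied:
            solutions.append(assignment)
    return solutions


constraints = build_constraints(ARCS, MANDATORY, OPTIONAL)
solutions = solve(ARCS, constraints)
print(solutions)
```

Exhaustive search is exponential in the number of arcs, so it is only suitable for a toy example like this one; a real integer programming solver would take its place in practice.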

Analysis of special cases
– Genitives
– Copula
– “ani” construction
– Conjuncts

Genitives
The genitive is the case that marks a noun as the possessor of another noun (e.g. his, her, its, etc.).
Cases
– Genitive marker exists
   Telugu : rAmudi yoVkka puswakaM
   Gloss : Ram 's book
   When the marker is present the analysis is straightforward: the noun preceding “yoVkka” holds an r6 relation with the noun following “yoVkka”.
– Genitive marker is dropped
   Telugu : rAmudi puswakaM
   Gloss : Ram book
   Here it is the suffix “udi” in “rAmudi” that signals the genitive.

Genitive contd..
Exceptions when the genitive marker is dropped
Telugu : raGu puswakaM rAmudiki icCAdu
Gloss : Raghu book to_Ram gave
English (sense 1) : Raghu gave the book to Ram.
English (sense 2) : Raghu’s book was given to Ram.
For nouns such as Raghu and Sita, which lack the masculine “-udu” ending, Telugu has no overt marking for the genitive. So we output all possible parses in this case. The parses are:
Parse 1 : icCAdu takes raGu (k1), puswakam (k2) and rAmudiki (k4).
Parse 2 : icCAdu takes puswakam (k2) and rAmudiki (k4); puswakam takes raGu (r6).

Copula
Examples: is, are, were, etc.
The copula is generally dropped in Telugu. For example:
– Telugu : rAmudu maMci bAludu
– Gloss : Ram good boy
– English : Ram is a good boy.
We handle these cases by introducing a “NULL_VG”.

Frame for NULL_VG
-------------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
-------------------------------------------------------------------
k1        | m         | 0        | n       | l    | c
k1S       | m         | 0        | n       | l    | c
-------------------------------------------------------------------
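A minimal sketch of the NULL_VG idea, under assumptions: the slides do not say when or where the null verb group is inserted, so the rule used here (append it at the end when the sentence has no VG chunk) and the chunking of the example are ours. Appending at the end is at least consistent with the frame, where both k1 and k1S are expected to the left (posn l).

```python
def add_null_vg(chunks):
    """Return a copy of the chunk list with a NULL_VG appended if no VG is present.

    chunks is a list of (chunk_tag, words) pairs. The insertion rule is an
    assumption made for this sketch, not the parser's documented behaviour.
    """
    if any(tag == "VG" for tag, _ in chunks):
        return list(chunks)
    return list(chunks) + [("NULL_VG", [])]


# Assumed chunking of "rAmudu maMci bAludu" (Ram is a good boy).
copula_less = [("NP", ["rAmudu"]), ("NP", ["maMci", "bAludu"])]
with_null = add_null_vg(copula_less)
print(with_null)

# A sentence that already has a verb group is returned unchanged.
verbal = [("NP", ["rAmudu"]), ("VG", ["vaccAdu"])]
print(add_null_vg(verbal))
```

Once the NULL_VG chunk is present it can be treated as an ordinary demand group, with the frame above supplying its k1 and k1S demands.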

“ani” construction
“ani” in Telugu is sometimes similar to “that” in English. There are three different ways of using “ani”:
– Used as a complementizer
   Telugu : rAmudu paMdu wiMtAdu ani mohan ceVppAdu.
   Gloss : Ram fruit will_eat that Mohan said.
   English : Mohan said that Ram will eat a fruit.
– Used as a verb
   Telugu : mohan rAmudu paMdu wiMtAdu ani vellipoyAdu.
   English : Mohan left saying Ram eats a fruit.
– Used to state a reason
   Telugu : mohan rAmudu paMdu winnAdani vellipoyAdu.
   Gloss : Mohan Ram fruit had_eaten went.
   English : Mohan went because Ram had eaten the fruit.

“ani” construction contd..
So we created a demand frame for “ani”.

Frame for ani
-------------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
-------------------------------------------------------------------
Ccof      | m         | -        | v_fin   | l    | c
Ccof      | m         | -        | v_fin   | r    | p
-------------------------------------------------------------------

Conjuncts
In Telugu, conjuncts occur as suffixes (TAM of the verb), as DheergAs, and as lexical items such as “inkA”, “anduke”, “mariyu”, “kAni”, “aiwe” and “anwe”.
Suffixes :
Here, applying the corresponding transformation chart of the verb is enough to handle the case.
Telugu : nenu iMtiki velwe nixrapowAnu.
Gloss : I home if_go will_sleep.
English : I will sleep if I go home.

Conjuncts contd..
Lexical items :
Here we have a frame for each lexical entry, which does the corresponding job. In the case of “mariyu”:

Frame 1 :
-------------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
-------------------------------------------------------------------
Ccof      | m         | -        | v       | l    | c
Ccof      | m         | -        | v       | r    | c
-------------------------------------------------------------------

Frame 2 :
-------------------------------------------------------------------
arc-label | necessity | vibhakti | lextype | posn | reln
-------------------------------------------------------------------
Ccof      | m         | -        | n       | l    | c
Ccof      | m         | -        | n       | r    | c
-------------------------------------------------------------------

Conjuncts contd..
DheergAs :
Often the vowel at the end of a lexical item is lengthened, and the conjunct information is implicit in that lengthening, with no need for an explicit lexical entry such as “mariyu”.
Telugu : rAmudU siwA iMtiki vellAru.
Gloss : Ram (implicit conj) Sita home went.
English : Ram and Sita went home.
In such cases a NULL_CCP is introduced, which serves as an explicit conjunct lexical entry, and we have frames for the NULL_CCP similar to those on the previous slide.

Future work !!
– A thorough analysis of relative clauses.
– Analysis and handling of NULL verbs in complex constructions, and their implementation.
– Verb and TAM classification.

THANKS !!

Any Queries ??
