Presentation is loading. Please wait.

Presentation is loading. Please wait.

CS626-449: NLP, Speech and Web-Topics-in-AI Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 36: UNL.

Similar presentations


Presentation on theme: "CS626-449: NLP, Speech and Web-Topics-in-AI Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 36: UNL."— Presentation transcript:

1 CS626-449: NLP, Speech and Web-Topics-in-AI Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 36: UNL

2 2 Internet for the Masses Internet Spanish interface English viewer Hindi viewer English interface Spanish viewer Hindi interface

3 Vaquious Triangle 3 Analysis Generation Transfer Based (do deep semantic process Before entering the target language) Direct (enter the target Language immediately Through a dictionary) Interlingua based (do deep semantic process Before entering the target language) Vaquious: an eminent French Machine Translation Researcher- Originally a Physicist

4 4 UNL representation Ram is reading the newspaper Representation of Knowledge

5 5 Knowledge Representation Ram read newspaper agt obj UNL Graph - relations

6 6 Knowledge Representation Ram(iof>person) read(icl>interpret) newspaper(icl>print_media) UNL Graph - UWs agt obj

7 7 The noun read has 1 sense (no senses from tagged texts) 1. read -- (something that is read; "the article was a very good read") The verb read has 11 senses (first 8 from tagged texts) 1. (117) read -- (interpret something that is written or printed; "read the advertisement"; "Have you read Salman Rushdie?") 2. (17) read, say -- (have or contain a certain wording or form; "The passage reads as follows"; "What does the law say?") 3. (15) read -- (look at, interpret, and say out loud something that is written or printed; "The King will read the proclamation at noon") 4. (5) read, scan -- (obtain data from magnetic tapes; "This dictionary can be read by the computer“)

8 8 5. (5) read -- (interpret the significance of, as of palms, tea leaves, intestines, the sky; also of human behavior; "She read the sky and predicted rain"; "I can't read his strange behavior"; "The fortune teller read his fate in the crystal ball") 6. (5) take, read -- (interpret something in a certain way; convey a particular meaning or impression; "I read this address as a satire"; "How should I take this message?"; "You can't take credit for this!") 7. (4) read, register, show, record -- (indicate a certain reading; of gauges and instruments; "The thermometer showed thirteen degrees below zero"; "The gauge read `empty'") 8. (4) learn, study, read, take -- (be a student of a certain subject; "She is reading for the bar exam") 9. read -- (audition for a stage role by reading parts of a role; "He is auditioning for `Julius Caesar' at Stratford this year") 10. read -- (to hear and understand; "I read you loud and clear!") 11. understand, read, interpret, translate -- (make sense of a language; "She understands French"; "Can you read Greek?")

9 9 The boy who works here went to school plt agt @ entry @ past school(icl>institution) go(icl>move) boy(icl>person) work(icl>do) here @ entry agt plc :01 Another Example

10 10 UNL System

11 11 The UNL System: An Overview

12 Dependency Parsers UNL generator is a kind of dependency parser; other such parsers are: Link Parser Mini Parser Stanford Dependency Parser Mat Parser MaxEnt Parser

13 13 Universal Networking Language Universal Words (UWs) Relations Attributes Knowledge Base

14 14 UNL Graph obj agt @ entry @ past minister(icl>person) forward(icl>send) mail(icl>collection) He(icl>person) @def gol He forwarded the mail to the minister.

15 15 UNL Expression agt (forward(icl>send).@ entry @ past, he(icl>person)) obj (forward(icl>send).@ entry @ past, minister(icl>person)) gol (forward(icl>send ).@ entry @ past, mail(icl>collection). @def)

16 16 Universal Word (UW) What is a Universal Word (UW)? What are the features of a UW? How to create UWs?

17 17 What is a Universal Word (UW)? Words of UNL Constitute the UNL vocabulary, the syntactic- semantic units to form UNL expressions A UW represents a concept Basic UW (an English word/compound word/phrase with no restrictions or Constraint List) Restricted UW (with a Constraint List ) Examples: “crane(icl>device)” “crane(icl>bird)”

18 18 The Features of a UW Every concept existing in any language must correspond to a UW The constraint list should be as small as necessary to disambiguate the headword Every UW should be defined in the UNL Knowledge-Base

19 19 Restricted UWs Examples He will hold office until the spring of next year. The spring was broken. Restricted UWs, which are Headwords with a constraint list, for example: “spring(icl>season)” “spring(icl>device)” “spring(icl>jump)” “spring(icl>fountain)”

20 20 How to create UWs? Pick up a concept the concept of “crane" as "a device for lifting heavy loads” or as “a long-legged bird that wade in water in search of food” Choose an English word for the concept. In the case for “crane", since it is a word of English, the corresponding word should be ‘crane' Choose a constraint list for the word. [ ] ‘crane(icl>device)' [ ] ‘crane(icl>bird)'

21 21 UNL Relations Constitute the syntax of UNL Expresse how concepts(UWs) constitute a sentence Represented as strings of 3 characters or less A set of 41 relations specified in UNL (e.g., agt, aoj, ben, gol, obj, plc, src, tim,…) Refer to a semantic role between two lexical items in a sentence E.g., John has composed this poem.

22 22 AGT / AOJ / OBJ AGT (Agent) Definition: Agt defines a thing which initiates an action AOJ (Thing with attribute) Definition: Aoj defines a thing which is in a state or has an attribute OBJ (Affected thing) Definition: Obj defines a thing in focus which is directly affected by an event or state

23 23 Examples John broke the window. agt ( break.@entry.@past, John) This flower is beautiful. aoj ( beautiful.@entry, flower) He blamed John for the accident. obj ( blame.@entry.@past, John)

24 24 BEN BEN (Beneficiary) Definition: Ben defines a not directly related beneficiary or victim of an event or state Can I do anything for you? ben ( do.@entry.@interrogation.@politeness, you ) obj ( do.@entry.@interrogation.@politeness, anything ) agt (do.@entry.@interrogation.@politeness, I )

25 25 BEN : UNL Graph obj agt @ entry @ past baby(icl>child) carve(icl>cut) toy(icl>plaything) he(iof>person) @def ben He carved a toy for the baby.

26 26 GOL / SRC GOL (Goal : final state) Definition: Gol defines the final state of an object or the thing finally associated with an object of an event SRC (Source : initial state) Definition: Src defines the initial state of object or the thing initially associated with object of an event

27 27 GOL I deposited my money in my bank account. objagt @ entry @ past account(icl>statement) deposit(icl>put) money(icl>currency ) I gol bank(icl>possession) mo d I I

28 28 SRC They make a small income from fishing. obj agt @ entry @ present fishing(icl>business ) make(icl>do) income(icl>gain ) they(icl>persons ) src small(aoj>thing ) mo d

29 29 PUR PUR (Purpose or objective) Definition: Pur defines the purpose or objectives of the agent of an event or the purpose of a thing exist This budget is for food. pur ( food.@entry, budget ) mod ( budget, this )

30 30 RSN RSN(Reason) Definition: Rsn defines a reason why an event or a state happens They selected him for his honesty. agt(select(icl>choose).@entry, they) obj(select(icl>choose).@entry, he) rsn (select(icl>choose).@entry, honesty)

31 31 TIM TIM (Time) Definition: Tim defines the time an event occurs or a state is true I wake up at noon. agt ( wake up.@entry, I ) tim ( wake up.@entry, noon(icl>time))

32 32 TMF TMF (Initial time) Definition: Tmf defines a time an event starts The meeting started from morning. obj ( start.@entry.@past, meeting.@def ) tmf ( start.@entry.@past, morning(icl>time) )

33 33 TMT TMT (Final time) Definition: Tmt defines a time an event ends The meeting continued till evening. obj ( continue.@entry.@past, meeting.@def ) tmt ( continue.@entry.@past,evening(icl>time) )

34 34 PLC PLC (Place) Definition: Plc defines the place an event occurs or a state is true or a thing exists He is very famous in India. aoj ( famous.@entry, he ) man ( famous.@entry, very) plc ( famous.@entry, India)

35 35 PLF PLF (Initial place) Definition: Plf defines the place an event begins or a state becomes true Participants come from the whole world. agt ( come.@entry, participant.@pl ) plf ( come.@entry, world ) mod ( world, whole)

36 36 PLT PLT (Final place) Definition: Plt defines the place an event ends or a state becomes false We will go to Delhi. agt ( go.@entry.@future, we ) plt ( go.@entry.@future, Delhi)

37 37 INS INS (Instrument) Definition: Ins defines the instrument to carry out an event I solved it with computer agt ( solve.@entry.@past, I ) ins ( solve.@entry.@past, computer ) obj ( solve.@entry.@past, it )

38 38 INS : UNL Graph obj agt @ entry @ past blanket(icl>object ) cover(icl>do) baby(icl>child ) John(iof>person ) @def ins John covered the baby with a blanket.

39 39 Attributes Constitute syntax of UNL Play the role of bridging the conceptual world and the real world in the UNL expressions Show how and when the speaker views what is said and with what intention, feeling, and so on Seven types: Time with respect to the speaker Aspects Speaker’s view of reference Speaker’s emphasis, focus, topic, etc. Convention Speaker’s attitudes Speaker’s feelings and viewpoints

40 40 Tense: @past The past tense is normally expressed by @past {unl} agt(go.@entry.@past, he) … {/unl} He went there yesterday

41 41 Aspects: @progress {unl} man ( rain.@entry.@present.@progress, hard ) {/unl} It’s raining hard.

42 42 Speaker’s view of reference @def (Specific concept (already referred)) The house on the corner is for sale. @indef (Non-specific class) There is a book on the desk @not is always attached to the UW which is negated. He didn’t come. agt ( come.@entry.@past.@not, he )

43 43 Speaker’s emphasis @emphasis John his name is. mod ( name, he ) aoj ( John.@emphasis.@entry, name ) @entry d enotes the entry point or main UW of an UNL expression

44 Tense & Aspect Present Single Continuous / Progressive Perfect / Complete TenseAspect

45 3 Types of Verbs Transitive Intransitive Stative

46 @entry attribute @entry – the attribute indicating the main Predicate. Within a sub-graph the “entry” node is the type of semantic relation.

47 47 UNL Knowledge Base (UNLKB) What is the UNL Knowledge Base? Linguistic Background How to define the UWs in the UNL Knowledge-Base?

48 48 What is the UNL Knowledge Base? A semantic network comprising every directed binary relation between UWs Categorized according to the role of a concept to other concepts

49 49 Linguistics Background Ungrammaticality The boy relied on the girl. * The boy relied the girl. *The boy relied. Grammatically Sound but Semantically Odd *The boy frightens sincerity. *Sincerity kicked the boy.

50 50 Ungrammaticality Given sentences: The boy relied on the girl. * The boy relied the girl. *The boy relied. PS Rules: VP  V (NP) (PP) NP  Det N V  rely Det  the N  boy, girl

51 51 Subcategorization Frames Specify the categorial class of the lexical item. Specify the environment. Examples: kick: [V; _ NP] cry: [V; _ ] rely: [V; _PP] put: [V; _ NP PP] think: : [V; _ S` ]

52 52 Subcategorization Rules V y / _NP] _ ] _PP] _NP PP] _S`] Subcategorization Rule:

53 53 Subcategorization Rules 1. S  NP VP 2. VP  V (NP) (PP) (S`)… 3. NP  Det N 4. V  rely / _PP] 5. P  on / _NP] 6. Det  the 7. N  boy, girl The boy relied on the girl.

54 54 Semantically Odd Constructions Can we exclude these two ill-formed structures ? *The boy frightened sincerity. *Sincerity kicked the boy. Selectional Restrictions

55 55 Selectional Restrictions Inherent Properties of Nouns: [+/- ABSTRACT], [+/- ANIMATE] E.g., Sincerity [+ ABSTRACT] Boy [+ANIMATE]

56 56 Selectional Rules A selectional rule specifies certain selectional restrictions associated with a verb. V y / [+/-ABSTARCT] [+/-ANIMATE] V frighten / [+/-ABSTARCT] [+ANIMATE] __

57 57 Subcategorization Frame forward V __ NP PP invitation N __ PP accessible A __ PP e.g., An invitation to the party e.g., A program making science is more accessible to young people e.g., We will be forwarding our new catalogue to you

58 58 Thematic Roles The man forwarded the mail to the minister. forward V __ NP PP Event FORWARD [ Thing THE MAN ], [ Thing THE MAIL ], [ Path TO THE MINISTER ] ( )

59 59 How to define the UWs in UNL Knowledge-Base? Nominal concept Abstract Concrete Verbal concept Do Occur Be Adjective concept Adverbial concept

60 60 Nominal Concept: Abstract thing abstract thing{(icl>thing)} culture(icl>abstract thing) civilization(icl>culture{>abstract thing}) direction(icl>abstract thing) east(icl>direction{>abstract thing}) duty(icl>abstract thing) mission(icl>duty{>abstract thing}) responsibility(icl>duty{>abstract thing}) accountability{(icl>responsibility>duty)} event(icl>abstract thing{,icl>time>abstract thing}) meeting(icl>event{>abstract thing,icl>group>abstract thing}) conference(icl>meeting{>event}) TV conference{(icl>conference>meeting)}

61 61 Nominal Concept: Concrete thing concrete thing{(icl>thing,icl>place>thing)} building(icl>concrete thing) factory(icl>building{>concrete thing}) house(icl>building{>concrete thing}) substance(icl>concrete thing) cloth(icl>substance{>concrete thing}) cotton(icl>cloth{>substance}) fiber(icl>substance{>concrete thing}) synthetic fiber{(icl>fiber>substance)} textile fiber{(icl>fiber>substance)} liquid(icl>substance{>concrete thing}) beverage(icl>food,icl>liquid>substance}) coffee(icl>beverage{>food}) liquor(icl>beverage{>food}) beer(icl>liquor{>beverage})

62 62 Verbal concept: do do({icl>do,}agt>thing,gol>thing,obj>thing) express({icl>do(}agt>thing,gol>thing,obj>thing{)}) state(icl>express(agt>thing,gol>thing,obj>thing)) explain(icl>state(agt>thing,gol>thing,obj>thing)) add({icl>do(}agt>thing,gol>thing,obj>thing{)}) change({icl>do(}agt>thing,gol>thing,obj>thing{)}) convert(icl>change(agt>thing,gol>thing,obj>thing) classify({icl>do(}agt>thing,gol>thing,obj>thing{)}) divide(icl>classify(agt>thing,gol>thing,obj>thing))

63 63 Verbal concept: occur and be occur({icl>occur,}gol>thing,obj>thing) melt({icl>occur(}gol>thing,obj>thing{)}) divide({icl>occur(}gol>thing,obj>thing{)}) arrive({icl>occur(}obj>thing{)}) be({icl>be,}aoj>thing{,^obj>thing}) exist({icl>be(}aoj>thing{)}) born({icl>be(}aoj>thing{)})

64 64 How to define the UWs in UNL Knowledge Base? In order to distinguish among the verb classes headed by 'do', 'occur' and 'be', the following features are used: UWUW [ need an agent ] [ need an object ] English 'do'++"to kill" 'occur'-+"to fall" 'be'--"to know"

65 65 The verbal UWs (do, occur, be) also take some pre-defined semantic cases, as follows: How to define the UWs in UNL Knowledge- Base? UWPRE-DEFINED CASESEnglish 'do'takes necessarily agt>thing"to kill" 'occur'takes necessarily obj>thing"to fall" 'be'takes necessarily aoj>thing"to know"

66 66 Complex sentence I want to watch this movie. movie(icl>) want (icl>) @entry.@past obj @def :01 I (iof>person) watch (icl>do) @entry.@inf obj agt I (iof>person)

67 67 BREAK

68 68 Enconversion Input sentence UNL expression

69 69 Operations in Analysis Movement of heads Addition of two nodes Deletion of a node Creating relation between two nodes Adding dynamically inferred attributes to node

70 70 EnConverter Simultaneously does Morphological Syntactic Semantic analysis Works like Turing machine with many heads Heads move over the input to and fro

71 71 Enco - Operation Analysis windows - Two in number Left Analysis Window (LAW) Right Analysis Window (RAW) Condition windows - Many in number Left Condition Window (LCW) Right Condition Window (RAW) LAW Word 2 Word 1 Word 4 … RAW RCW Word n LCW Word 3 sentence windows

72 72 Deconversion Output sentence UNL expression

73 73 Operations in DeConverter Insertion of node Changing attributes Movement of windows Copying of node

74 74 Process Syntax Planning Parent child placement Morphology For Noun, Verb and Adjectives

75 75 The Lexicon Format of the dictionary entry [minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN); Head word Universal word Attributes Morphological- Pl(plural), V_ed(past tense form) Syntactic - V(verb),VOA(verb of action) Semantic - ANIMT(animate), PLACE, TIME [headword] {} “Universal word“ (Attribute list);

76 76 The Lexicon Content words: [forward] {} “forward(icl>send)” (V,VOA) ; [mail] {} “mail(icl>collection)” (N,PHSCL,INANI) ; [minister] {} “minister(icl>person)” (N,ANIMT,PHSCL,PRSN) ; HeadwordUniversal WordAttributes He forwarded the mail to the minister.

77 77 The Lexicon Closed words: [He] {} “he” (PRON,SUB,SING,3RD) ; [the] {} “the” (ART,THE) ; [to] {} “to” (PRE,#TO) ; HeadwordUniversal Word Attributes He forwarded the mail to the minister.

78 78 Rule-base If (condition) then action Conditions : matching attributes or the HW or UW Actions Create relation Add attributes Move windows Priority

79 79 Rule format > (SHEAD) {N,ANI : : agt :} {V,^VAUX:+AGTRES ::} (PRE,#ON) (BLK) P20 ; Type of rule Condition windows Analysis windows Priority End marker

80 80 Condition window Format (attribute 1,attribute 2,…) E.g. (N,TIME) - has noun attribute and TIME attribute

81 81 Analysis window Format { conditions : +/- attributes : relation :} If conditions are matched then do the action i.e., to add(+)/remove(-) its attributes Or create the specified relation with the other analysis window Or move the heads according to the type of rule

82 82 Rule-Priority Indicated by number 1 to 255 lowest 1 highest 255 General conditions – Low priority (N) (V) Priority is 20 Specific conditions – High priority (N,ANIMT) (V,VOA) Priority is 30

83 83 UNL Rule ;Right shift R{V,VOA:::}{N,ANIMT,PRSN:::}(PRE,#OF)P60; IF the left analysis window is on a verb (V) which is also a veb of action (VOA) AND the right analysis window is on a noun(N) which is animate(ANIMT) and a person(PRSN) AND the preposition-of follows the noun(N) indicated by (PRE,#OF) THEN shift right (indicated by R at the start of the rule)

84 84 askeddirector Right shift R{V,VOA:::}{N,ANIMT,PRSN:::}(PRE,#OF)P60; of... Before application of rule asked director... of After application of rule

85 85 UNL Rule for a Semantic Relation ; Create relation between V and N2, after resolving the preposition preceding N2 < {V,VOA,:::}{N,TIME,DAY,ONRES,PRERES::tim:}P25; IF the left analysis window is on a verb(V) which is verb of action (VOA) AND the right analysis window is on a noun (N) and has TIME, DAY attribute for which the preceding preposition (on) has been processed and deleted THEN set up the tim relation between V and N 2. (indicated by < at the start of the rule)

86 86 camemonday Semantic relation < {VRB,VOA,:::}{N,TIME,DAY,ONRES,PRERES::tim:}P25;... Before application of rule came After application of rule... …

87 87 UNL expression [S] ;he is playing chess {unl} obj(play(icl>compete):06.@entry.@present.@progress, chess(icl>game):0E) agt(play(icl>compete):06.@entry.@present.@progress, he:00) {/unl} [/S]

88 88 Suggested Readings UNDL Foundation. 2003. The Universal Networking Language (UNL), Specifications, Version 3 Edition 2. http://www.undl.org

89 89 M.Tech Seminar Topics: 2009

90 90 1. Cross Lingual Information Retrieval The goal is to study the techniques of information retrieval when the language of the query is different from the language of the web page. After covering basics of information retrieval (crawling, indexing, ranking etc.), the student is expected to study the complexities of cross lingual search (see http://clef- campaign.org).http://clef- campaign.org Considerable work on this has happened at IIT Bombay which leads the national project on CLIR (http://www.clia.iitb.ac.in/clia-beta-ext; the site may not be up always).http://www.clia.iitb.ac.in/clia-beta-ext 2. Semantic Role Labeling (UNL) Sentences have inherent structure in terms of agent, object, instrument, time, place and such relationships between words. When extracted, such information finds usage in a large number of applications like machine translation, entailment, question answering, information extraction etc. We will study semantic roles in the form of Universal Networking Language (UNL; http://www.undl.org) and the ongoing work on UNL extraction spanning over last several years (visit http://www.cse.iitb.ac.in/~pb, click on publications and see publications on “Semantics”). http://www.undl.orghttp://www.cse.iitb.ac.in/~pb

91 91 3. Empirical Methods in NLP-ML This is a foundational topic proposing to investigate mathematical and statistical principles and methods cutting across different areas of NLP and Machine Learning. Examples of these are Spectral Analysis, Expectation Maximization, Graphical Models, Lexical Semantic Association Techniques and so on. Additional topics of study will be NLP-suitable probability distributions, e.g., Dirichlet distribution. Specific NLP and ML applications will always be kept in focus. Browse proceedings of EMNLP conference (http://www.isca-students.org/emnlp_2009) and CoNLL conference (http://www.cnts.ua.ac.be/conll/).http://www.isca-students.org/emnlp_2009http://www.cnts.ua.ac.be/conll/ 4. Text Entailment This studies the methodologies involved in deciding if a piece of text (H) is inferrable from another (e.g., “France beat Brazil in the FIFA World Cup semi final” entails “Brazil bowed out of the World Cup”). Grounded on predicate calculus, there has been considerable work recently on using machine learning techniques for entailment (http://pascallin.ecs.soton.ac.uk/Challenges/RTE/). A number of students have worked with me on this topic and the use of UNL in entailment.http://pascallin.ecs.soton.ac.uk/Challenges/RTE/

92 92 5. Foundation of Artificial Intelligence We will study work by Minsky, Newel and Simon, McCarthy, Penrose, Dreyfus, Hofstadter, Marr, Chomsky and so on, including Indian thoughts on intelligence and consciousness. 6. Machine Translation, Principles and Paradigms Starting with Interligua Based Machine Translation using UNL, we have made inroads in Statistical Machine Translation (http://www.cse.iitb.ac.in/~pb/papers/acl09-smt.pdf).http://www.cse.iitb.ac.in/~pb/papers/acl09-smt.pdf This topic will study many different approaches to machine translation which is a key problem (along with CLIR) for a multilingual country like India. We are members of large national projects on English-to-Indian-Language and Indian- language-to-Indian-Language machine translation. 7. Shallow Parsing for Indian Languages This is ongoing critical work attempting to build common platforms for morphology, POS tag and chunk processing for Indian Languages- especially Hindi and Marathi. Talk to seniors Mugdha Bapat (mbapat@cse) and Harshada Gune (harshada@cse).


Download ppt "CS626-449: NLP, Speech and Web-Topics-in-AI Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 36: UNL."

Similar presentations


Ads by Google