PrepNet: a Framework for Describing Prepositions: Preliminary Investigation results Patrick Saint-Dizier IRIT-CNRS, France.

1 PrepNet: a Framework for Describing Prepositions: Preliminary Investigation results Patrick Saint-Dizier IRIT-CNRS, France

2 Long-term objectives
– Construct a repository of the syntactic and semantic behaviors of prepositions.
– Develop a multi-level approach, from prototypical uses to unexpected ones, that accounts for the diversity of preposition uses and for their polysemy.
– Develop a relatively shallow semantic characterization based on frames.
– Investigate verb-preposition-NP relations: restrictions and compositionality.
– Develop a multilingual approach.
Applications: MT, knowledge extraction, QA, etc.

3 This paper: basic elements of a preliminary approach
– Introduce a general characterization of preposition senses, viewed as abstract notions.
– Characterize these abstract notions by means of frames (viewed as linguistic or conceptual macros).
– Populate preposition frames from corpora, then validate them.
– Develop a multi-level characterization of preposition uses, to organize the diversity of their uses in language.
– Raise a few questions about multilinguality (in some languages prepositions are realized by other categories or by morphology).
– Investigate evaluation methods, both in abstracto and via applications.

4 Related work
– Very little in CL circles compared to verbs and nouns, although prepositions are needed in a number of applications (MT, IE, QA, …).
– Almost nothing in EuroWordNet (EWN), FrameNet or VerbNet.
– Some valuable work in AI, e.g. temporal and spatial reasoning.
– A few isolated works in linguistics, each on a given preposition.
– Quite a lot of work in psycholinguistics.
– Other resources: B. Dorr's large description for English, with MT in view (about 500 entries).

5 Why is that so?
– High polysemy (though maybe no more than adjectives, and prepositions are fewer: 95 in French plus compounds, 32 in Spanish; there is not always agreement on what a preposition is).
– Linguistic realizations are very difficult to predict: large number of idiosyncratic uses and cross-linguistic differences.
– Syntactic difficulties due to the V-Prep-N chain, e.g. PP-attachment problems, VPC.
– Deep level in the semantic-cognitive structure: prepositions are often used in metalanguages as primitives.
Only compositional uses of prepositions are studied here.

6 Global architecture of the proposal
– Preposition senses: a 3-level set of abstract notions
– Shallow semantic representation with strata
– Uses in language 1, uses in language 2, etc.

7 General architecture (1): categorizing preposition senses
– Preposition categorization on 3 levels:
  – Family (roughly thematic roles): localization, manner, quantity, etc.
  – Facets, e.g. for localization: source, position, destination, etc.
  – Modalities.
– Facets are viewed as the abstract notions on which PrepNet is based.
– 12 families defined.

8 Families / facets
– Quantity: numerical / frequency / proportion
– Accompaniment: adjunction / simultaneity / inclusion / exclusion
– Manner: means / manners and attitudes / imitation or analogy
– Localization: source / destination / via / fixed position
– Choice and exchange: exchange / choice or alternative / substitution
– Causality: cause / goal or consequence / intention
– Opposition
– Ordering: priority / subordination / hierarchy / ranking / degree of importance
– Minor elements: about, in spite of, comparison (see examples in paper)
Open question: what is the conceptual / ontological status of these distinctions?
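The family / facet inventory above can be held in a simple mapping. The following is a minimal sketch, not PrepNet's actual storage format: the names `FAMILIES` and `family_of` are hypothetical, and only the families named on this slide are included (the slide set mentions 12 families but lists fewer).

```python
from typing import Dict, List, Optional

# Families and facets exactly as listed on the slide. The families that
# the slide does not name are deliberately left out rather than guessed.
FAMILIES: Dict[str, List[str]] = {
    "quantity": ["numerical", "frequency", "proportion"],
    "accompaniment": ["adjunction", "simultaneity", "inclusion", "exclusion"],
    "manner": ["means", "manners and attitudes", "imitation or analogy"],
    "localization": ["source", "destination", "via", "fixed position"],
    "choice and exchange": ["exchange", "choice or alternative", "substitution"],
    "causality": ["cause", "goal or consequence", "intention"],
    "opposition": [],  # no facets given on the slide
    "ordering": ["priority", "subordination", "hierarchy", "ranking",
                 "degree of importance"],
    "minor elements": ["about", "in spite of", "comparison"],
}


def family_of(facet: str) -> Optional[str]:
    """Return the family that contains the given facet, or None if unknown."""
    for family, facets in FAMILIES.items():
        if facet in facets:
            return family
    return None
```

For instance, `family_of("via")` returns `"localization"`, which is the path from a facet (abstract notion) back up to its family.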

9 Families correspond to 'superframes', which state general principles and restrictions. Facets correspond to frames and strata to subframes, with some general forms of inheritance and property consistency. Whenever appropriate, modalities are also represented as subframes. Frames are viewed as linguistic macros, to be interpreted; so far they are shallow, coarse-grained representations. Language realizations are a priori associated with the lower-level frame nodes.

10 (2): a conceptual, prelexical structure
[Diagram: the frame of an abstract notion (name + gloss, shallow restrictions, simplified LCS representation) dominates the strata of the abstract notion, i.e. the subframes SF 1, SF 2, SF 3.]

11 Structure of a frame
– Structure:
  – Number, name, gloss
  – Frame with shallow constraints: X Y [Number] Z
  – Conceptual representation in simplified LCS (kind of LST)
  – In the future: inferential patterns (within a frame or among frames)
– 195 senses / abstract notions described using 65 primitives.
– Shallow constraints:
  (1) generic semantic types
  (2) generic verb class types from WordNet
  (3) generic semantic fields from the LCS: temp, poss, loc, psy, epist, perc, amount, comm, prop, abs, etc.
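One way to picture the frame structure just described is as a small record type. This is only an illustrative sketch: the class `Frame`, its field names, and the choice to keep the LCS representation as a plain string are assumptions, not PrepNet's implementation. The sample values are taken from the generic 'via' frame given as Example 1 in this presentation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Frame:
    """Hypothetical record for one abstract notion (frame or stratum)."""
    number: str                   # e.g. "1" or "1.2.1"
    name: str
    gloss: str
    pattern: str                  # frame with shallow constraints, e.g. "X [1] Y"
    constraints: Dict[str, str]   # slot -> generic semantic type or verb class
    representation: str           # simplified LCS, kept as an opaque string here
    synsets: Dict[str, List[str]] = field(default_factory=dict)  # language -> prepositions

    def depth(self) -> int:
        """0 for a generic frame, 1 or more for strata such as 1.1 or 1.2.1."""
        return self.number.count(".")


# The generic 'via' frame, with values copied from Example 1.
via = Frame(
    number="1",
    name="VIA - generic",
    gloss="An entity X moving via a location Y",
    pattern="X [1] Y",
    constraints={"X": "concrete entity", "ACTION": "movement verb", "Y": "location"},
    representation="X : via(loc, Y)",
    synsets={"fr": ["par", "via"]},
)
```

Keeping `synsets` keyed by language reflects the idea that one abstract notion is shared and only its realizations vary across languages.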

13 Example 1: 'via'
[1] VIA - generic. 'An entity X moving via a location Y'
  X [1] Y
  X: concrete entity, ACTION: movement verb, Y: location
  representation: X : via(loc, Y)
  French synset: {par, via}
  example: Jean rentre par la porte
Stratification 1:
[1.1] VIA - narrow passage. 'An entity X moving via / an action that uses a narrow passage in an object Y'
  X [1.1] Y
  X: concrete entity, ACTION: perception verb, Y: location with a narrow passage
  representation: X : through(loc or temp, Y)
  French synset: {à travers, au travers de, dans}
  example: Jean regarde à travers la grille / dans les jumelles.

14 Example 1, cont'd
Stratification 2:
[1.2.1] VIA UNDER - from generic. 'An entity X moving via under a location Y'
  X [1.2.1] Y
  X: concrete entity, ACTION: movement verb, Y: location with a form of passage under it
  representation: X : via(loc, under(loc,Y))
  French synset: {par dessous}
  example: Jean passe par dessous le pont.
[1.2.2] VIA ABOVE - from generic. etc.
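The strata of the 'via' example can be read as subframes that inherit constraints from the generic frame and override some of them. The sketch below illustrates that reading under explicit assumptions: the slides only speak of "some general forms of inheritance", so the override-by-slot mechanism, the store `FRAMES`, and the helpers `parent` and `resolved_constraints` are all hypothetical. Each stratum records only the slots in which it differs from the generic frame.

```python
from typing import Dict, List, Optional

# Constraints per frame number; strata list only the slots that differ
# from their ancestor (an assumption made for this sketch).
FRAMES: Dict[str, Dict[str, str]] = {
    "1": {"X": "concrete entity", "ACTION": "movement verb", "Y": "location"},
    "1.1": {"ACTION": "perception verb", "Y": "location with a narrow passage"},
    "1.2.1": {"Y": "location with a form of passage under it"},
}


def parent(number: str) -> Optional[str]:
    """Nearest ancestor that exists in FRAMES (1.2.1 falls back to 1,
    because no frame numbered 1.2 is described)."""
    parts = number.split(".")
    while len(parts) > 1:
        parts.pop()
        candidate = ".".join(parts)
        if candidate in FRAMES:
            return candidate
    return None


def resolved_constraints(number: str) -> Dict[str, str]:
    """Merge constraints from the generic frame down to the given stratum."""
    chain: List[str] = []
    current: Optional[str] = number
    while current is not None:
        chain.append(current)
        current = parent(current)
    merged: Dict[str, str] = {}
    for frame_number in reversed(chain):  # generic first, most specific last
        merged.update(FRAMES[frame_number])
    return merged
```

With this store, resolving `"1.2.1"` yields a movement verb for ACTION (inherited from the generic frame), while resolving `"1.1"` yields a perception verb, in line with the constraints listed in the example.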

15 Example 2: instruments
Stratification requires taking into account 2 relations, characterized by means of primitives (Mari and Saint-Dizier 03):
– Actor / instrument: undergo (no control), select (controls another prop.), control
– Instrument / V+NP object: be (passive, but participates), react (other prop. than controlled by the agent), act (full participation)
Contrast: cut the bread with a knife / eat soup with a spoon / John burned himself with boiling oil.
A generic entry for instruments and, potentially, 9 strata (combinations), depending on the language.
4 strata for French.
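The figure of 9 potential strata follows from combining the three primitives of each relation. A short sketch of that enumeration (the constant names are hypothetical; the primitive names come from the slide):

```python
from itertools import product

# Primitives for the two relations, as named on the slide.
ACTOR_INSTRUMENT = ("undergo", "select", "control")
INSTRUMENT_OBJECT = ("be", "react", "act")

# Every pairing of one actor/instrument primitive with one
# instrument/object primitive is a potential stratum.
POTENTIAL_STRATA = list(product(ACTOR_INSTRUMENT, INSTRUMENT_OBJECT))

for actor_rel, object_rel in POTENTIAL_STRATA:
    print(f"{actor_rel} + {object_rel}")
```

A given language is expected to realize only some of these combinations as distinct strata.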

16 (2) cont'd
[5] MANNER - MEANS - Instrument. 'Someone X doing an action Y using instrument Z.'
  X Y [5] Z
  X: human, ACTION: verb of change, Y: object, Z: instrument
  representation: X : by-means-of(_, Z)
Followed by a priori 9 strata. Example: application to French:
1. Be(X,Z) ∧ Undergo(Z, Action+Y): synset: {grâce à}, restrictions…
2. Be(X,Z) ∧ Select(Z, Action+Y): synset: {par}, restrictions…
3. Select(X,Z) ∧ React(Z, Action+Y): synset: {avec}, restrictions…
4. Act(X,Z) ∧ Control(Z, Action+Y): synset: {avec, au moyen de}, …

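17 (3) The language realization level
[Diagram: a subframe SF i (= lower frame level) is linked to a multi-level partitioning of its realizations, derived from usage norms: direct uses, indirect uses, etc. Each partition carries restrictions (restr1, restr2, restr3, derived types, …) and a synset (synset1, synset3, further synsets still open), plus frequency measures.]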

18 Populating preposition frames from corpora
Conceptual frames are associated with shallow constraints; the next step is the language level. Elements of a method, for a given language:
– Associate each frame stratum with corpus and dictionary observations.
– Manual analysis: identify prototypical uses and promote usage norms, leading to a multi-level partitioning of realizations.
– Contrast, if possible, direct versus indirect (mainly metaphorical) realization levels.
– Elaborate the conceptual / ontological status of categorizations and related constraints (mainly semantic types).

19 A few notes
– Multi-level architecture: helps to account for the large variety of (compositional) behaviors. Partitioning strategies need to be investigated in more depth: is incremental depth, to get finer-grained analyses, worth pursuing?
– For each synset: develop frequency measures and identify contexts of use (from syntactic context to type of text). Frequency rates are very diverse (some uses are only found in dictionaries!).
– Populate, but then validate on new corpora: develop several forms of corpus annotation (the frame; the relation with the head, with the NP, etc.).
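The frequency measures mentioned above could, in the simplest case, be relative frequencies of each preposition within a frame, computed over frame-annotated sentences. The sketch below assumes an annotation format of (sentence, frame number, preposition); that format, the name `ANNOTATIONS`, and the function `relative_frequencies` are hypothetical. The four sentences are the examples from this presentation and serve only as a toy sample, not as corpus results.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Toy annotated sample: (sentence, frame number, preposition used).
ANNOTATIONS: List[Tuple[str, str, str]] = [
    ("Jean rentre par la porte", "1", "par"),
    ("Jean regarde à travers la grille", "1.1", "à travers"),
    ("Jean regarde dans les jumelles", "1.1", "dans"),
    ("Jean passe par dessous le pont", "1.2.1", "par dessous"),
]


def relative_frequencies(
    annotations: List[Tuple[str, str, str]], frame: str
) -> Dict[str, float]:
    """Share of each preposition among the realizations of one frame."""
    counts = Counter(prep for _, number, prep in annotations if number == frame)
    total = sum(counts.values())
    if total == 0:
        return {}  # frame not attested in this sample
    return {prep: n / total for prep, n in counts.items()}
```

An empty result for a frame corresponds to the case noted above where a use is found in dictionaries but not in the corpus at hand.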

21 Looking at other languages
– Hypothesis: given an abstract notion (interlingua), translations are constructed on the basis of the restrictions that hold on the corresponding synsets.
– BUT: large realization variations are in general observed, even for closely related languages. Up to what point are these just surface language contrasts, or are they also conceptual? E.g. regarder dans le microscope / look through the microscope (durch; a través de).
– Some languages do not use pre-/postpositions so much, but rather other categories, incorporation in heads, or just case marks.
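The interlingua hypothesis can be sketched as a lookup that goes from a source preposition to the abstract notion and from there to the target-language synset. This is a deliberately naive illustration: the table `REALIZATIONS` and the function `translate` are hypothetical, the restrictions that would actually select among candidates are not modeled, and attaching the microscope example to the narrow-passage frame [1.1] is an assumption made by analogy with the binoculars example.

```python
from typing import Dict, List

# Abstract notion -> language -> prepositions. French comes from the
# synset of frame [1.1]; the other languages come from the microscope
# example, assumed here to realize the same notion.
REALIZATIONS: Dict[str, Dict[str, List[str]]] = {
    "1.1": {
        "fr": ["à travers", "au travers de", "dans"],
        "en": ["through"],
        "de": ["durch"],
        "es": ["a través de"],
    },
}


def translate(prep: str, source: str, target: str) -> List[str]:
    """Candidate target prepositions reached via any shared abstract notion."""
    candidates: List[str] = []
    for by_language in REALIZATIONS.values():
        if prep in by_language.get(source, []):
            for target_prep in by_language.get(target, []):
                if target_prep not in candidates:
                    candidates.append(target_prep)
    return candidates
```

Here `translate("dans", "fr", "en")` gives `["through"]` rather than a literal "in", which is exactly the kind of realization contrast the microscope example points at.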

23 Preliminary conclusions
– Preliminary investigation to identify difficulties and organize the research.
– The global architecture looks like an interesting approach.
– Abstract notion definitions seem to be quite stable; the status of strata needs further investigation.
– The multi-level approach to language realizations seems a good direction, but needs much larger testing on a number of languages and a clearer method to organize sets of realizations.
– Implement an open system on the Web.

24 Some obvious research directions
– Ontological / conceptual status of categorizations and restrictions.
– Investigate integration with other frameworks: VerbNet, FrameNet.
– Investigate preposition polysemy and derived uses in more depth, and ways to characterize them.
– Head-preposition-NP relations and compositionality (the head is often a verb, but can be any other kind of predicate); some PPs have wider scope over the proposition.
– Inferential patterns associated with prepositions (e.g. for approximation notions, spatial notions, etc.).

