The Italian CLIPS Lexicon and its reuse in a bilingual environment Nilda Ruimy ILC CNR, Pisa september 2004.


1 The Italian CLIPS Lexicon and its reuse in a bilingual environment Nilda Ruimy ILC CNR, Pisa september 2004

2 Outline
  Part I
    The origin of the CLIPS lexicon
    The PAROLE-SIMPLE model
    General encoding criteria
    Phonological and morphological levels
    Syntactic level: information content
    The semantic lexicon
    Theoretical background: GL theory
    The original Qualia Structure
    The SIMPLE ontology
    The Extended Qualia Structure
    Semantic level: information content
    Predicative structure
    Syntax-semantics mapping
    Encoding methodology
    CLIPS essential features & applications
  Part II
    Creating a bilingual resource
    The two scenarios
    Scenario I
    Drawbacks
    Scenario II
    The cognate approach
    The sense indicator approach
    Results
    Concluding remarks

3 CLIPS: a bit of genealogy
  PAROLE European project (PAROLE lexicons; source: PAROLE Corpus lexical units)
    morphology: 20,000 entries
    syntax: 20,000 lemmas
  SIMPLE European project (SIMPLE lexicons; Semantic Information for Multifunctional Plurilingual Lexica)
    semantics: 10,000 senses
  12 harmonized lexicons
  Italy: enlargement of these core lexicons in a national follow-up project
  CLIPS lexicon (XML format; sources: PAROLE Corpus lexical units, DMI phonology)
    phonology: 374,000 entries
    morphology: 49,000 entries
    syntax: 55,000 lemmas
    semantics: 55,000 senses

4 GENELEX-PAROLE Representational Model PAROLE-SIMPLE Theoretical model EAGLES recommendations Extended GENELEX model Results from EU projects: EUROWORDNET ACQUILEX DELIS GENERATIVE LEXICON The PAROLE-SIMPLE Model september 2004 Nilda Ruimy

5 The Linguistic Model
  PAROLE-SIMPLE lexicons: common EAGLES-conformant model; common representation language; common building methodology
  Innovative; tackles misrepresented areas of knowledge; extendible and multifunctional; multilingual perspective
  REUSABILITY

6 Representational Model (1)
  Entity/Relationship Model, implemented through a DTD that defines:
    the structure of every descriptive element
    the relationships holding among the various descriptive elements, as well as their co-occurrence restrictions
  Non-redundant data representation

7 Representational Model (2)
  Specific representational structures for every level of linguistic description
  Links among the different levels, although the information encoded at each level is perfectly autonomous

8 General encoding criteria
  Reduce the lexicographer's margin of subjectivity by setting precise guidelines for the treatment of particular phenomena
  Base the encoding on corpus data as much as possible
  Find a balance between encoding attested structures / senses only and an exhaustive encoding that also includes rare structures / senses

9 september 2004 Splitting entries Avoid both redundancy and over-powerful gatherings Use criteria strictly relevant to the description level, e.g. at the syntactic level, syntactic-driven criteria: arity syntactic function: disporre i libri negli scaffali / disporre di due auto complement optionality: attraversare (la strada) (lit. sense) / attraversare un momento difficile different (non alternative) realization of complements: Leo evita Lia / L. ha evitato di guardare L., che L. si ferisse Encode, at the semantic level, most common senses distinguished in average size dictionaries (ca.150,000 words) Nilda Ruimy

10 The four-level architecture: the first three levels
  Phonological Unit: stress position; vowel openness; cons. pronunciation
  Corresp. PhnU-MrphU
  Morphological Unit: PoS & subcat.; inflectional paradigm
  Corresp. MrphU-SynU
  Syntactic Unit:
    syntactic structure 1: a. head properties; b. subcat. frame (position, synt. restr.)
    syntactic structure 2: a. head properties; b. subcat. frame (position, synt. restr.)
    Frameset

11 Syntactic entry information content
  Each syntactic frame records: specific properties of the entry in the syntactic context described; subcategorization frame; link between syntactic structures
  Aumentare (to increase): complex synt. entry
    MAIN syntactic frame: main verb, aux.: avere
      P0 optional subject NP
      P1 oblig. object NP
      P2 optional adverbial di_PP
    RELATED syntactic frame:
      P0 subject NP, optional
      P1 adverbial di_PP, optional
    FRAMESET relating systematic frame alternations (decausativization, locative alternation, reciprocal altern., symmetrical altern.):
      relates main syntactic frame to alternating one
      relates respective frame positions
  Examples:
    Il governo ha aumentato i prezzi del 3%. (The government has increased the prices by 3%.)
    I prezzi sono aumentati del 3%. (Prices have increased by 3%.)

12 The semantic lexicon
  Theoretical linguistic background: extended version of Pustejovsky's Generative Lexicon (GL) theory

13 Generative Lexicon theory
  Lexical meanings of various levels of complexity:
    simplest ones: definable by a taxonomic relation
    more complex ones: hypernymic relation not sufficient
  Examples:
    bambino (child): HUMAN, age (childhood), sex (male)
    dottore (doctor): HUMAN, age (adult), sex (male), function
    giornale (newspaper), polysemy: 1. printed paper, 2. location, 3. institution, 4. human group
  Qualia Structure allows:
    to coherently model the pluridimensionality of meaning
    to represent uniformly semantic units of different degrees of complexity
    to capture the relationships holding between semantic units

14 The Original Qualia Structure
  Consists of four roles:
    formal role: distinguishes the denoted entity from others (what is X?)
    constitutive role: expresses its components (what is X made of?)
    agentive role: expresses its coming about (how does X come about?)
    telic role: specifies its function (what is X's function?)

15 The SIMPLE ontology (1)
  Lexicon structured on the basis of a type ontology:
    Core Ontology: top-level, general types; large consensus; provide essential information; mappable on EuroWordNet ontology
    Recommended Ontology: hierarchically lower and more specific types; provide finer-grained information
    Possible creation of language- / application-specific types

16 The SIMPLE ontology (2)
  157 language-independent semantic types
  Simple types (one-dimensional): can be fully characterized in terms of a hypernymic relation, e.g.
    Entity > Concrete_entity > Living_entity > Animal > Earth_Animal

17 The SIMPLE ontology (3)
  Unified types (multi-dimensional): can only be defined through the combination of:
    the relation to their supertype
    the reference to orthogonal dimensions of meaning
  e.g. Institution: supertype Abstract_Entity (under Entity); dimensions Agentive, Telic

18 The SIMPLE ontology (4) september 2004 Simple Ontology: multidimensional type hierarchy based on both hierarchical and non-hierarchical conceptual relations Nilda Ruimy
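
The distinction between simple and unified types on the ontology slides can be pictured with a small sketch. It is only an illustration, not the CLIPS implementation: the names `SUPERTYPE`, `DIMENSIONS`, `ancestors`, `subsumes` and `is_unified` are hypothetical, and only the handful of types quoted on the slides is included.

```python
# Hierarchical (is-a) backbone, restricted to the types quoted on the slides.
SUPERTYPE = {
    "Concrete_entity": "Entity",
    "Living_entity": "Concrete_entity",
    "Animal": "Living_entity",
    "Earth_Animal": "Animal",
    "Abstract_Entity": "Entity",
    "Institution": "Abstract_Entity",
}

# Orthogonal (non-hierarchical) dimensions carried by unified types.
DIMENSIONS = {"Institution": {"Agentive", "Telic"}}


def ancestors(type_name):
    """Return the chain of supertypes, nearest first."""
    chain = []
    while type_name in SUPERTYPE:
        type_name = SUPERTYPE[type_name]
        chain.append(type_name)
    return chain


def subsumes(general, specific):
    """True if `general` is `specific` itself or one of its supertypes."""
    return general == specific or general in ancestors(specific)


def is_unified(type_name):
    """A type counts as unified here if it carries at least one extra dimension."""
    return bool(DIMENSIONS.get(type_name))


print(ancestors("Earth_Animal"))
print(is_unified("Earth_Animal"), is_unified("Institution"))
```

A simple type such as Earth_Animal is fully described by its supertype chain, whereas Institution also needs the Agentive and Telic dimensions.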

19 Semantic types september 2004 In the SIMPLE ontology, types are not mere labels but the repository of a specific set of structured semantic information Nilda Ruimy

20 Some semantic types for abstract & concrete entities: Concrete_entity, Abstract_entity, Property, Representation, TELIC, Furniture, Instrument, Clothing, Artwork, Sign, Language, Information, ..., Living_entity, Human, Animal, Vegetal_entity, Artifact, Substance, Location, Food, Material, Quality, Quantity, Physical_prop, Psychol_prop, ..., Convention, Cognitive_fact, ..., Artifactual_material, Artifact, TOP, AGENTIVE, CONSTITUTIVE, ENTITY, Event...

21 Phenomenon Change Psych_event Aspectual State Act EVENT Cause_change Relational_state Non_relational_act Relational_act Move Cause_act Relational_change Change_possession Change_location Acquire_knowledge Natural_transition... Creation... Speech_act... some semantic types for events september 2004 Nilda Ruimy

22 Some semantic types for adjectives: TOP, Extensional, Intensional, Psychological_prop, Social_prop, Physical_prop, Intensifying_prop, Temporal_prop, Relational_prop, Temporal, Modal, Emotive, Manner, Object_related, Emphasizer

23 Features: PlusHuman, PlusCollective,.. Relations between semantic units: R (, ) Descriptive elements september 2004 Nilda Ruimy

24 Extended Qualia Structure: extended roles
  Formal: isa, antonym_comp, antonym_grad, mult_opposition
  Agentive (groups labelled AGENTIVE, ARTIFACTUAL AGENTIVE): result_of, agentive_prog, agentive_cause, agentive_experience, caused_by, source, created_by, derived_from
  Telic (groups labelled TELIC, DIRECT TELIC, INSTRUMENTAL, ACTIVITY): used_for, used_as, used_by, used_against, indirect_telic, purpose, object_of_activity, is_the_activity_of, is_the_ability_of, is_the_habit_of
  Constitutive (groups labelled CONSTITUTIVE, PROPERTY, LOCATION): made_of, is_a_follower_of, has_as_member, is_a_member_of, has_as_part, instrument, kinship, is_a_part_of, resulting_state, relates, uses, causes, concerns, affects, constitutive_activity, contains, has_as_colour, has_as_effect, has_as_property, measured_by, measures, produces, produced_by, property_of, quantifies, related_to, successor_of, precedes, typical_of, feeling, is_in, lives_in, typical_location

25 Extended Qualia Structure: the relation inventory of slide 24, illustrated with word pairs
  proiettile, colpire (projectile, hit)
  antitarmico, tarma (moth balls, moth)
  bisturi, chirurgo (lancet, surgeon)
  metano, combustibile (methane, fuel)
  casa, costruire (house, build)
  mohair, capra (mohair, goat)
  manubrio, bicicletta (handlebar, bicycle)
  abbaiare, cane (bark, dog)
  arancio, arancia (orange tree, orange)
  medico, curare (doctor, cure)
  fumatore, fumare (smoker, smoke)
  disgusto, provare (disgust, feel)
  senato, senatore (senate, senator)
  pane, farina (bread, flour)

26 Formal role Agentive role Telic role Constitutive role instrument is_a used_for created_by is_made_of Orthogonal dimensions of meaning september 2004 Nilda Ruimy

27 Orthogonal dimensions of meaning
  violin
    Formal role: is_a musical_instrument
    Telic role: used_for playing
    Agentive role: created_by make
    Constitutive role: has_as_part strings; is_made_of wood
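
The violin entry above can be written down as a small data structure. This is a minimal sketch in Python: the class `SemanticUnit` and its methods are hypothetical names chosen for illustration and do not reflect the actual CLIPS XML encoding.

```python
from dataclasses import dataclass, field

# The four Qualia roles named on the slides.
ROLES = ("formal", "constitutive", "agentive", "telic")


@dataclass
class SemanticUnit:
    """Hypothetical container for one word sense and its Qualia relations."""
    name: str
    # Maps each role to a list of (relation, target) pairs.
    qualia: dict = field(default_factory=lambda: {role: [] for role in ROLES})

    def add(self, role, relation, target):
        if role not in self.qualia:
            raise ValueError(f"unknown Qualia role: {role}")
        self.qualia[role].append((relation, target))

    def targets(self, relation):
        """Return every target reached through the given relation, in any role."""
        return [t for pairs in self.qualia.values() for rel, t in pairs if rel == relation]


violin = SemanticUnit("violin")
violin.add("formal", "is_a", "musical_instrument")
violin.add("telic", "used_for", "playing")
violin.add("agentive", "created_by", "make")
violin.add("constitutive", "has_as_part", "strings")
violin.add("constitutive", "is_made_of", "wood")

print(violin.targets("used_for"))
```

Keeping the relations grouped by role makes the orthogonality visible: each role answers a different question about the same sense.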

28 botte (barrel): traditional dictionary definition and the meaning dimensions expressed by Qualia relations
  recipiente (container): Formal: isa
  di legno (made of wood): Constitutive: made_of
  fatto (made): Agentive: created_by
  di doghe arcuate tenute unite da cerchi di ferro (of curved staves held together by iron hoops): Constitutive: made_of
  che serve per la conservazione e il trasporto (used for storing and transporting): Telic: used_for
  di liquidi, specialmente vino (liquids, especially wine): Constitutive: contains

29 Within a semantic type population, further clusterings can be made through the is-a relation: september 2004 Qualia informative power (1) Nilda Ruimy

30 Qualia informative power (2)
  INSTRUMENT
    utensile (is-a): graticola, colabrodo, frusta; used for: cucinare
    posata (is-a): coltello, forchetta; used for: mangiare
  CONTAINER
    contenitore (is-a): pentola, tegame, padella

31 Semantic level: information content
  Phonological Unit: stress position; vowel openness; cons. pronunciation
  Corresp. PhnU-MrphU
  Morphological Unit: PoS & subcat.; inflectional paradigm
  Corresp. MrphU-SynU
  Syntactic Unit:
    syntactic structure 1: a. head properties; b. subcat. frame (position, synt. restr.)
    syntactic structure 2: a. head properties; b. subcat. frame (position, synt. restr.)
    Frameset
  Corresp. SynU-SemU
  Semantic Unit:
    ontological type; domain; semant. class; event type; semant. features; semant. relations; synonymy; derivation; regular polysemy
    Extended Qualia Structure: formal role; constitutive role; agentive role; telic role
    predicative represent.: predicate; type of link; arguments; sem. restr.

32 Predicative Representation
  Assigned to predicative semantic units
  Describes the semantic scenario a word sense is involved in:
    assignment of a lexical predicate
    type of link holding between entry and predicate
    predicate argument structure
    semantic role of arguments
    selection restrictions of arguments
    link semantic arguments / syntactic complements

33 september 2004 Assignment of a lexical predicate verbs; predicative nouns: deverbals (costruzione) and collective simple nouns (gruppo), nouns denoting a relation (madre), quantity (bottiglia), part (fetta), unit of measurement (metro), property (bellezza); adjectives; some adverbs (indipendentemente da) Nilda Ruimy

34 Predicate-semantic unit link
  PRED_ACCUSARE
    accusare (to accuse): master
    accusatore (accuser): agent nominalisation
    accusa (accusation): process nominalisation
    accusato (accused): patient nominalisation

35 ProtoAgent: volitional subject of verb: ARG0 of kill ProtoPatient: object undergoing an action: ARG1 of kill 2ndParticipant: indirect object: ARG2 of give SoA (State of Affair): sentential complement: ARG2 of ask Location: ARG2 of put Direction: ARG2 of move Origin: ARG1 of move Kinship: ARG0 of father HeadQuantified: ARG0 of metre, bottle september 2004 Semantic arguments: thematic roles Nilda Ruimy

36 Semantic arguments: selectional restrictions
  Not proper restrictions, but rather preferences of combinations in prototypical situations.
  Expressible through: semantic types; features; notions (combination of types or type + feature…); semantic units
  Features, used transversely across semantic types (e.g. PlusEdible), capture wider preferences than single semantic types do:
    ARG1 of eat: [PlusEdible] / ARG1 of eat: [FOOD]
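
The difference between a type-based and a feature-based preference can be shown with a toy check. Everything in the sketch is an assumption made for illustration: the three sample senses, their typing, and the function names `fits` and `accepted` are not taken from CLIPS.

```python
# Toy senses; the entries and their typing are illustrative assumptions only.
SENSES = {
    "pane": {"type": "Food", "features": {"PlusEdible"}},
    "fungo": {"type": "Vegetal_entity", "features": {"PlusEdible"}},
    "violino": {"type": "Instrument", "features": set()},
}


def fits(sense_name, preference):
    """Check one sense against a preference given as (kind, value)."""
    kind, value = preference
    sense = SENSES[sense_name]
    if kind == "type":
        return sense["type"] == value
    if kind == "feature":
        return value in sense["features"]
    raise ValueError(f"unknown preference kind: {kind}")


def accepted(preference):
    """List, in alphabetical order, the senses that satisfy the preference."""
    return sorted(name for name in SENSES if fits(name, preference))


print(accepted(("type", "Food")))
print(accepted(("feature", "PlusEdible")))
```

With the type preference only the sense typed Food passes, while the feature preference also admits an edible sense that belongs to another semantic type.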

37 Semantic entry information content (1)
  aumento (increase): l'aumento dei prezzi da parte del governo (the increase of prices by the government)
  ONTOLOGICAL INFO.
    Semantic type: Cause_change_of_value
    Supertype: Cause_relational_change
    Eventype: transition
    Domain: general, economics
    Gloss: accrescimento in dimensione o quantità (growth in size or quantity)
  EXTENDED QUALIA INFO.
    aumento isa cambiamento
    aumento resulting_state maggiore
    Agentivecause: yes
    Direction: up
  PREDICATIVE REPRESENTATION
    Morphological derivation: Eventverb aumentare
    Lexical semantic predicate: PRED_aumentare
    Type of link: event nominalization
    Predicate arg. struct. (range, semantic role & selectional restrictions of args.):
      Arg0 Protoagent Human / Institution
      Arg1 ProtoPatient Entity
      Arg2 Quantifier Amount

38 Semantic entry information content (2)
  vaporizzatore (spray): spruzzare acqua con un vaporizzatore (to spray water with a spray)
  ONTOLOGICAL INFO.
    Semantic type: Instrument
    Supertype: Artifact
    Eventype: ===
    Domain: general, cleaning, gardening, cosmetics
    Gloss: apparecchio usato per ridurre in minuscole particelle un liquido (device used to reduce a liquid to tiny particles)
  EXTENDED QUALIA INFO.
    vaporizzatore isa apparecchio
    vaporizzatore has_as_part pulsante
    vaporizzatore used_for atomizzare
    vaporizzatore created_by fabbricare
    Synonymy: nebulizzatore
  PREDICATIVE REPRESENTATION
    Morphological derivation: Eventverb vaporizzare
    Lexical semantic predicate: PRED_vaporizzare
    Type of link: instrument nominalization
    Predicate arg. struct. (range, semantic role & selectional restrictions of args.):
      Arg0 Protoagent Human / Instrument
      Arg1 ProtoPatient +liquid
      Arg2 Location Concrete_entity

39 domain semant. class a. head properties b. subcat. frame position synt. restr. syntactic structure 1 ontological type Corresp. SynU-SemU event type semant. features semant. relations Extended Qualia Structure regular polysemy sem. restr. arguments predicate predicative represent. Corresp. Syntax-Semantics type of link Semantic Unit synonymy derivation constitutive role formal role telic role agentive role syntactic structure 2 position synt. restr. Frameset a. head properties b. subcat. frame Syntactic Unit Syntax-semantics mapping (1) september 2004 Nilda Ruimy

40 SynU_migliorare Transitive structure P0 P1 Intransitive structure P0 Frameset SYNTACTIC LEVEL SEMANTIC LEVEL SemU2_migliorare CHANGE_OF_STATE SemU1_migliorare CAUSE_CHANGE_OF_STATE to improve PRED_ migliorare ARG0 : Agent ARG1 : Patient SEMANTIC PREDICATE LINK PREDICATE-SEMANTIC UNIT september 2004 Syntax-semantics mapping (2)

41 Syntax-semantics mapping (2)
  SynU_migliorare (to improve)
    Frameset: Transitive structure (P0, P1); Intransitive structure (P0)
  SemU1_migliorare: CAUSE_CHANGE_OF_STATE
  SemU2_migliorare: CHANGE_OF_STATE
  PRED_migliorare: ARG0: Agent; ARG1: Patient
  CORRESPONDENCE SYNTACTIC-SEMANTIC FRAME: isomorphic / non-isomorphic
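
One way to read the isomorphic / non-isomorphic distinction is sketched below. The slides do not spell out which argument maps to which position, so the two mappings are an assumption (the causative sense projects ARG0 and ARG1 onto P0 and P1, the non-causative sense projects ARG1 onto P0), as is the definition of isomorphism as identical argument and position indices. All names are hypothetical.

```python
# Positions offered by each structure of the frameset (from the slide).
FRAMES = {"transitive": ["P0", "P1"], "intransitive": ["P0"]}

# Assumed argument-to-position mappings; not stated explicitly in the slides.
MAPPINGS = {
    "SemU1_migliorare": ("transitive", {"ARG0": "P0", "ARG1": "P1"}),
    "SemU2_migliorare": ("intransitive", {"ARG1": "P0"}),
}


def is_isomorphic(arg_to_pos):
    """Isomorphic here means: ARGn is always realized by position Pn."""
    return all(arg[len("ARG"):] == pos[len("P"):] for arg, pos in arg_to_pos.items())


def correspondence(semu):
    """Classify the syntax-semantics correspondence of one semantic unit."""
    frame, arg_to_pos = MAPPINGS[semu]
    missing = set(arg_to_pos.values()) - set(FRAMES[frame])
    if missing:
        raise ValueError(f"positions not in frame {frame}: {sorted(missing)}")
    return "isomorphic" if is_isomorphic(arg_to_pos) else "non-isomorphic"


for unit in MAPPINGS:
    print(unit, correspondence(unit))
```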

42 Template-driven encoding methodology
  A template is a schema providing, for each semantic type, a set of structured information deemed crucial to its definition
  Twofold function:
    interface between ontology and lexicon
    guide for the lexicographer
  Ensures systematicity, consistency and uniformity of representation of the lexical meaning

43 A template september 2004 Nilda Ruimy

44 CLIPS key features
  Generic lexicon, large coverage (vocabulary and synt. structures)
  Based on a rich and multifunctional linguistic and representational model shared by 11 other European lexica
  Fine-grained, highly structured, innovative information, most useful for HLT applications
  The largest electronic, multilevel lexical resource of the Italian language
  Lexical description conformant to international standards
  Respect of the principles of uniformity, consistency and exhaustivity
  High level of reusability
  4 description levels: phonology, morphology, syntax, semantics
  55,000 words encoded

45 Application fields
  surface and deep analysis of texts
  information retrieval
  machine translation
  natural language understanding, etc.
  The wealth of information the lexicon contains allows:
    building semantic networks
    extracting the vocabulary of a specific domain
    NP recognition: disambiguating the semantic contribution of some PPs in complex nominals

46 To lend itself to further uses, a lexicon must have:
    flexible model
    generic database
    uniformly structured data
    precise and explicit linguistic description
  Like the PAROLE and SIMPLE lexicons, CLIPS does meet these requirements

47 Creating a bilingual electronic lexical resource
  Strategy I:
    1) Use CLIPS and the PAROLE-SIMPLE French lexicon
    2) Perform a semi-automatic linking of their respective entries

48 Creating a bilingual electronic lexical resource
  Strategy II:
    1) Derive, in a semi-automatic way, a semantically annotated French lexicon from CLIPS
    2) Use source and derived lexicons as a basis for building a bilingual resource

49 Strategy I (diagram)
  CLIPS: capo_1 (phon, morph, syn, sem), capo_2, ufficio_1, ...; other sample lemmas: gentile, residenza, tessere, pompa, scrivere, tessuto, vestibolo, testo, amministratore, vincere
  PAR-SIMPLE French lex.: tête_1 (morph, syn, sem), tête_2, tête_3, bureau_1, ...
  Bilingual dictionary IT-FR & FR-IT:
    capo: tête, chef, bout
    ufficio: bureau, charge, ...
    tête: testa, capo, faccia, cima
    bureau: ufficio, scrivania, ...
  ALGORITHM: links the CLIPS senses to the French senses (open links marked "?" on the slide)

50 Analysis of the inherent properties of the SL & TL senses:
    identity of ontological classification, or subsumption relation between the semantic types of the SL & TL senses
    identity of semantic class, or subsumption relation between their semantic classes
    identity of domain, or subsumption relation between their domain info.
    identity / correspondence of semantic features
    identity / correspondence of semantic relations
  Analysis of their contextual properties:
    compatibility of syntactic valency: function and grammatical instantiation of complements
    compatibility of semantic valency: semantic role and semantic restrictions of arguments
  cf. Villegas et al., LREC 2000, Athens

51 Examples (Tipo semantico = semantic type; Classe semantica = semantic class)
  evento / évènement
    evento: freedefinition=ciò che è accaduto o potrà accadere, avvenimento; Tipo semantico: EVENT; Supertype: ENTITY; Classe semantica: EVENT
    évènement: freedefinition="something that happens at a given place and time"; Tipo semantico: EVENT; Supertype: -----; Classe semantica: EVENT
  scrivere / écrire
    scrivere: freedefinition=creare qualcosa di scritto; Tipo semantico: SYMBOLIC_CREATION; Supertype: CREATION; Classe semantica: CREATION; Domain: CREATIVE_WRITING
    écrire: freedefinition=create written works & semi; Tipo semantico: CREATION; Supertype: -----; Classe semantica: CREATION; Domain: ----
  pompa / pompe
    pompa: freedefinition=macchina o apparecchio usato per sollevare liquidi o comprimere gas; Tipo semantico: INSTRUMENT; UnificationPath: ConcreteEntity-Artifactagentive-Materialtelic; Classe semantica: APPARATUS
    pompe: freedefinition="a device that moves fluid or gas by pressure or suction"; Tipo semantico: -----; UnificationPath: -----; Classe semantica: APPARATUS
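
The first three criteria of the analysis (identity or subsumption of semantic type, semantic class and domain) can be sketched on the scrivere / écrire pair shown above. This is a hypothetical illustration rather than the algorithm actually used: the function names, the `PARENTS` table and the convention of returning `None` when one side carries no information are all assumptions.

```python
def chain(label, parents):
    """Return the label followed by all of its ancestors."""
    seen = [label]
    while label in parents:
        label = parents[label]
        seen.append(label)
    return seen


def related(a, b, parents):
    """True if a and b are identical or one subsumes the other; None if undecidable."""
    if a is None or b is None:
        return None  # information missing on one side
    return a in chain(b, parents) or b in chain(a, parents)


def compare(sl, tl, parents):
    """Compare a source-language and a target-language sense field by field."""
    keys = ("sem_type", "sem_class", "domain")
    return {key: related(sl.get(key), tl.get(key), parents) for key in keys}


# Supertype link quoted on the slide: SYMBOLIC_CREATION has supertype CREATION.
PARENTS = {"SYMBOLIC_CREATION": "CREATION"}

scrivere = {"sem_type": "SYMBOLIC_CREATION", "sem_class": "CREATION", "domain": "CREATIVE_WRITING"}
ecrire = {"sem_type": "CREATION", "sem_class": "CREATION", "domain": None}

report = compare(scrivere, ecrire, PARENTS)
print(report)
```

The types are related by subsumption and the classes are identical, while the domain cannot be checked because the French entry leaves it empty.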

52 Examples (continued)
  vincere / vaincre
    vincere (PREDICATE_vincere_1): freedefinition=portare a termine con successo; Tipo semantico: RELATIONAL_ACT; Classe semantica: ACTIVITY; Rel.Sem: ----
    vaincre (PREDICATE_vaincre_2): freedef.=be the winner in contest/competition; Tipo semantico: CAUSE_RELAT.-CHANGE; Classe semantica: CHANGE; Rel.Sem: Resulting_action/state: victoire; Agentive_cause: cause
  texte / testo_1 / testo_2 (descriptions in the order they appear on the slide)
    Tipo semantico: RELATIONAL_ACT; Supertype: -----; Classe semantica: OBJECT; Domain: ----; Tratto distintivo: PLUS_SEMIOTIC
    Tipo semantico: INFORMATION; Supertype: REPRESENTATION; Classe semantica: ABSTRACT; Domain: MEDIA; Tratto distintivo: PLUS_SEMIOTIC
    Tipo semantico: SEMIOTIC_ARTIFACT; UnificationPath: ConcreteEntity-Artifactagentive-Telic; Classe semantica: ARTIFACT; Domain: MEDIA; Tratto distintivo: PLUS_SEMIOTIC

53 Drawbacks of this strategy
  Discrepancy of lexical coverage between the lexicons => method applicable to 10,000 senses only
  SIMPLE-FR does not always encode all information => necessity of manual intervention wherever SL and TL entries have NO corresponding element, due to:
    encoding error
    lack of information
    having privileged different although complementary aspects of meaning, e.g. imprigionare: PURPOSE_ACT vs. emprisonner: CAUSE_RELATIONAL_CHANGE

54 Strategy II – Phase 1: Deriving a FR lexicon from CLIPS
  Feasibility study for deriving a semantically annotated French lexicon using CLIPS lexical knowledge
  Crucial step for deriving the French entries: correctly pair off each French word sense with the relevant CLIPS semantic unit, whose information we ultimately want to assign to the French entry

55 From CLIPS to a semantically annotated French lexicon: two approaches
  Cognate approach: exploits the cognateness of Italian and French endings to relate the FR word to the IT CLIPS entry and infer the FR entry
    e.g. villaggio: 1. (piccolo centro abitato) village 2. (complesso urbanistico) village
  Sense indicator approach: matches onto the CLIPS data the information provided in bilingual dictionaries by sense indicators, in order to identify the relevant CLIPS entry
    e.g. capo: 1. (testa) tête; 2. (persona che...) chef...

56 look-up september 2004 Nilda Ruimy naming="village" weightvalsemfeaturel= « Geopolitical_Location» […] naming="village" weightvalsemfeaturel=«Human_group» […] FR–LEX naming="villaggio" weightvalsemfeatrel=«Geopolitical_Location» […]

57 The sense indicator approach
  Extracted from bilingual dictionary; analysis & classification of sense indicators
  FR word      SENSE INDICATOR            IT word
  compagnie    (presenza)                 compagnia
  compagnie    (gruppo)                   compagnia
  asphalte     (per rivestire)            asfalto
  sentir       (percepire)                avvertire
  prévenir     (avvisare)                 avvertire
  aspirer à    intr. (avere) prep. a      aspirare
  aspirer      tr. (inalare)              aspirare
  aspirer      LING.                      aspirare
  aspirer      tr. (con un tubo)          aspirare
  tête         (testa)                    capo
  chef         (persona che…)             capo
  …

58 Types of sense indicators (1)
  Indicators conveying morphosyntactic information: verb subclass, auxiliary selection, plural form of nouns, typical subject / object, PP type, etc.
  Italian–French example:
    COVARE
    A. v.tr.
      1 (di uccelli) [dar calore col proprio corpo alle uova per sviluppare l'embrione] couver
      2 (fig.) [custodire con gelosia] couver
      3 (fig.) [nutrire, alimentare in segreto dentro di sé] nourrir, mijoter; [tramare, macchinare in segreto] couver; [incubare] couver: covare un malanno
    B. v.intr. (aus. avere) (fig.) [stare chiuso, nascosto] couver: il fuoco cova sotto la cenere
  Callouts on the slide: verbal class; auxiliary; typical subj.
  Atkins, Bouillon, 2003

59 Types of sense indicators (2)
  Indicators conveying inferential information: synonyms, hypernyms, meronyms; domain of use
  Italian–French example:
    CAPO
    I (persone)
      1 [testa] tête
      2 (fig.) [mente, intelligenza] tête
      3 [persona investita di comando, di potere] chef
    II (animali)
      1 (raro) -> testa
      2 spec. al plur [ciascun individuo di una specie determinata] têtes, pièces
    III (cose)
      1 [la parte più grossa e più sporgente di un oggetto] tête
      2 [la parte più alta] haut
      3 [ciascuna delle due estremità di qlco.] bout, tête
      4 [inizio, principio] début
      5 [fine, conclusione; sbocco] bout
      6 loc. …..
      7 (nei filati) fil
      8 [singolo oggetto appartenente ad una serie] pièce
      9 (geog.) cap
  Callouts on the slide: synonym; hypernym; domain of use

60 Sense indicators used as search keys for identifying, in CLIPS, the semantic entry relevant to the IT sense of the bilingual pair
  FR word      SENSE INDICATOR            IT word
  bijouterie   (arte)                     gioielleria
  bijouterie   (negozio)                  gioielleria
  asphalte     (per rivestire)            asfalto
  sentir       (percepire)                avvertire
  prévenir     (avvisare)                 avvertire
  aspirer à    intr. (avere) prep. a      aspirare
  aspirer      tr. (inalare)              aspirare
  aspirer      LING.                      aspirare
  aspirer      tr. (con un tubo)          aspirare
  tête         (testa)                    capo
  chef         (persona che…)             capo
  …

61 Using sense indicators
  Indicators usable straightforwardly
  Indicators to be converted into the descriptive language of CLIPS:
    illuminare (rendere luminoso) illuminer (to make luminous): sem. type of illuminare belongs to the causative types hierarchy
    analizzatore (chi effettua analisi) analyste (who performs analyses): sem. type of analizzatore belongs to the HUMAN hierarchy

62 Rule types
  Search for a CLIPS entry containing the s.i. as target of:
    the synonymic relation, e.g. testa / capo: synonym_rel
    the hypernymic relation, e.g. negozio / gioielleria: isa_rel
    any qualia relation
  Search for a CLIPS entry sharing properties with the entry of the s.i.:
    shared hypernym, e.g. comunicare (notificare): isa_rel dire
    shared semantic type
  Search for a CLIPS entry containing information inferred from the s.i.:
    specific type, e.g. avvertire (percepire): semtype EXP._EVENT
    specific relation or feature (esp. domain info.)
    specific syntactic structure, e.g. conoscere (pron. (reciprocamente)): reciprocal syn. struct.
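
The first rule family (the sense indicator appears as the target of a relation) lends itself to a simple cascade in which rules are tried in a fixed order and the first match wins. The sketch below is an assumption-laden toy: the entry identifiers, the relation targets `persona` and `rivestire`, and all function names are invented for illustration; only the lemma / indicator pairs come from the slides.

```python
# Toy stand-in for CLIPS semantic units; identifiers and targets are hypothetical.
LEXICON = [
    {"id": "capo#body_part", "lemma": "capo", "relations": [("synonym", "testa")]},
    {"id": "capo#role", "lemma": "capo", "relations": [("isa", "persona")]},
    {"id": "gioielleria#shop", "lemma": "gioielleria", "relations": [("isa", "negozio")]},
    {"id": "asfalto#material", "lemma": "asfalto", "relations": [("used_for", "rivestire")]},
]


def target_of(relation_name):
    """Build a rule: the indicator is the target of one specific relation."""
    def rule(entry, indicator):
        return any(rel == relation_name and tgt == indicator for rel, tgt in entry["relations"])
    return rule


def target_of_any(entry, indicator):
    """Rule: the indicator is the target of any qualia relation."""
    return any(tgt == indicator for _, tgt in entry["relations"])


# Rules in application order; earlier rules are treated as more reliable.
RULES = [
    ("synonym target", target_of("synonym")),
    ("hypernym target", target_of("isa")),
    ("any qualia target", target_of_any),
]


def find_sense(lemma, indicator):
    """Return (entry id, rule name) for the first rule that matches, else None."""
    for rule_name, rule in RULES:
        for entry in LEXICON:
            if entry["lemma"] == lemma and rule(entry, indicator):
                return entry["id"], rule_name
    return None


print(find_sense("capo", "testa"))
print(find_sense("gioielleria", "negozio"))
print(find_sense("asfalto", "rivestire"))
```

Returning the name of the rule that fired keeps track of how reliable each pairing is, which matters when rules are ranked.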

63 Sense indicators matched against CLIPS
  FR word      SENSE INDICATOR            IT word
  compagnie    (presenza)                 compagnia
  compagnie    (gruppo)                   compagnia
  asphalte     (per rivestire)            asfalto
  sentir       (percepire)                avvertire
  prévenir     (avvisare)                 avvertire
  aspirer à    intr. (avere) prep. a      aspirare
  aspirer      tr. (inalare)              aspirare
  aspirer      LING.                      aspirare
  aspirer      tr. (con un tubo)          aspirare
  tête         (testa)                    capo
  chef         (persona che…)             capo
  …
  CLIPS entries identified:
    SemU61397 capo, sem. type= Body_part, where synonym
    SemU3615 capo, sem. type= Role, where isa
    SemU68603 asfalto, sem. type= Artifact_Material, where used_for
    SemU79372 aspirare, sem. type= Speech_act, where domain : phonetics
    SemU7040 aspirare, sem. type= Modal_event, linked to SynUaspirare, intr. pp_a

64 Cognate approach: results
  (A) IT constructed words whose different senses are translated by a unique FR constructed word
  (B) IT constructed words having more than one translation
  Suffix    (A)       (B)
  -aggio    89.9 %    10.1 %
  -tà       77.4 %    22.6 %
  -zione    80.4 %    19.6 %
  FR constructed words sharing the IT CLIPS entries (recall ratio):
  -aggio    99.97 %
  -tà       99.98 %
  -zione    99.98 %
  Small percentage of errors, due to a different granularity of sense distinctions in CLIPS and in the bilingual dictionary
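
The idea behind the cognate approach can be sketched as a suffix substitution followed by a look-up in a French word list. The slides name only the Italian endings; the French counterparts used here (-age, -té, -tion) are an assumption based on ordinary cognate pairs such as villaggio / village, and the function names are hypothetical.

```python
# Italian ending -> assumed French cognate ending (not listed in the slides).
SUFFIX_MAP = {"aggio": "age", "zione": "tion", "tà": "té"}


def french_candidate(italian_word):
    """Propose a French form by swapping a known Italian ending, else None."""
    for it_suffix, fr_suffix in SUFFIX_MAP.items():
        if italian_word.endswith(it_suffix):
            return italian_word[: -len(it_suffix)] + fr_suffix
    return None


def link(italian_word, french_words):
    """Keep the candidate only if it is attested in the French word list."""
    candidate = french_candidate(italian_word)
    return candidate if candidate in french_words else None


french_words = {"village", "nation", "liberté", "tête"}
for word in ("villaggio", "nazione", "libertà", "capo"):
    print(word, "->", link(word, french_words))
```

Words without one of the listed endings, such as capo, fall outside this method and are left to the sense indicator approach.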

65 Sense indicator approach: results
  IT word – sense indicator – FR word: X – A – Y
  Rule types:
    1 search for an entry of X containing string A
    2 search for an entry of X sharing properties with an entry of A
    3 search for an entry of X containing information inferred from A
  Rule type   Investigated lex. data    Application order   Success rate
  1           target of syn. rel.       1                   16.6%
  1           target of hyper. rel.     2                   26.8%
  1           target of any qualia      9                   0.92%
  2           shared hypernym           7                   8.9%
  2           shared semtype            8                   5.8%
  3           specific semtype          6                   3.9%
  3           specific domain           3                   12.3%
  3           specific feat/rel         5                   9.2%
  3           specific syn.struct       4                   15.4%
  The higher the rule rank, the more reliable the result

66 distribution of success rate over the algorithm rules recall ratio: 69% september 2004 Nilda Ruimy

67 Combining the two methods
  Constructed words represent 68.2% of the vocabulary
  Successful handling of: 95% of constructed words + 69% of non-constructed words
  Results may be enhanced by gleaning the most informative sense indicators from different sources
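
The figures above suggest an overall coverage estimate if they are combined as a weighted average. The slides do not state such a total, so the calculation below is an assumption: it supposes that both success rates refer to the same vocabulary and that every word is either constructed or non-constructed.

```python
# Figures quoted on the slide.
constructed_share = 0.682      # constructed words as a share of the vocabulary
constructed_success = 0.95     # success rate on constructed words
other_success = 0.69           # success rate on non-constructed words

# Weighted average over the two word classes (assumed to partition the vocabulary).
overall = constructed_share * constructed_success + (1 - constructed_share) * other_success

print(f"estimated overall success: {overall:.1%}")
```

Under these assumptions the two methods together would handle roughly 87% of the vocabulary.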

68 Concluding remarks
  Deriving new lexical resources from existing ones: a worthwhile venture in terms of time and effort
  Derived lexicon building process is simplified and shortened
  Such practice entails coverage and consistency assessment of the source lexical resource
  Source and derived lexicons constitute a most reliable basis for developing a bilingual resource
  Approaches taken are applicable to other language pairs sharing similarities in terms of morphological structure

