COLING Workshop - 2002 Nicoletta Calzolari ILC - CNR - Pisa, Italy Language Resources & Semantic Web.


1 COLING Workshop - 2002 Nicoletta Calzolari ILC - CNR - Pisa, Italy Language Resources & Semantic Web

2 COLING Workshop - 2002 To make the Semantic Web a reality... …need to tackle the twofold challenge of content availability and multilinguality. Natural convergence with HLT: multilingual semantic processing; ontologies; semantic-syntactic computational lexicons.

3 COLING Workshop - 2002 Computational Multilingual Lexicons: an essential component for the Semantic Web. Language - & lexicons - are the gateway to knowledge. Semantic Web developers need repositories of words & terms - & knowledge of their relations in language use & ontological classification. The cost of adding this structured and machine-understandable lexical information can be one of the factors that delays its full deployment. The effort of making available millions of ‘words’ for dozens of languages is something that no small group is able to afford. A radical shift in the lexical paradigm - whereby many participants add linguistic content descriptions in an open distributed lexical framework - is required to make the Web usable.

4 COLING Workshop - 2002 But … they will never be “complete”. Semantic network: Euro-/ItalWordNet; Lexicons: PAROLE/SIMPLE/CLIPS; TreeBank; +sw. Infrastructure of Language Resources... Lexical acquisition systems (syntactic & semantic) from text corpora; Robust systems of morphosyntactic & syntactic analysis; Word-sense disambiguation systems. ...static …dynamic. International Standards.

5 COLING Workshop - 2002 Italian Semantic Network: Italian module of EuroWordNet (http://www.hum.uva.nl/~ewn/). ~ 50,000 lemmas organized in synonym groups (synsets), structured in hierarchies & linked by ~ 130,000 semantic relations: ~ 50,000 hyperonymy/hyponymy relations; ~ 16,000 relations among different POS (role, cause, derivation, etc.); ~ 2,000 part-whole relations; ~ 1,500 antonymy relations, …etc. Synsets linked to the InterLingual Index (ILI=Princeton WordNet). Through the ILI, link to all the European WordNets (de-facto standard) & to the common Top Ontology. Possibility of plug-in with domain terminological lexicons. Usable in IR, CLIR, IE, QA,...
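
The structure described on this slide (synsets, typed relations between them, and a link to the InterLingual Index) can be illustrated with a small sketch. This is not the EuroWordNet data format or API: the class names, the relation label `has_hyperonym`, the synset identifiers, and the ILI record id `ili-apple` are all hypothetical, chosen only to show how a shared ILI record could connect synsets of different languages.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Synset:
    synset_id: str                      # hypothetical local identifier
    language: str                       # e.g. "it", "en"
    lemmas: List[str]                   # the synonym group
    ili: Optional[str] = None           # hypothetical InterLingual Index record id
    relations: Dict[str, List[str]] = field(default_factory=dict)


class WordNetSketch:
    """Toy container for synsets; an illustration, not the real resource."""

    def __init__(self) -> None:
        self.synsets: Dict[str, Synset] = {}
        self.by_ili: Dict[str, List[str]] = defaultdict(list)

    def add(self, synset: Synset) -> None:
        self.synsets[synset.synset_id] = synset
        if synset.ili is not None:
            self.by_ili[synset.ili].append(synset.synset_id)

    def relate(self, source_id: str, relation: str, target_id: str) -> None:
        # Store a typed, directed relation between two synsets.
        self.synsets[source_id].relations.setdefault(relation, []).append(target_id)

    def hypernym_chain(self, synset_id: str) -> List[str]:
        # Follow the first hyperonym link upwards until the top (or a cycle).
        chain: List[str] = []
        seen = {synset_id}
        current = synset_id
        while True:
            parents = self.synsets[current].relations.get("has_hyperonym", [])
            if not parents or parents[0] in seen:
                return chain
            current = parents[0]
            seen.add(current)
            chain.append(current)

    def equivalents(self, synset_id: str, language: str) -> List[str]:
        # Synsets in another language that share the same ILI record.
        ili = self.synsets[synset_id].ili
        if ili is None:
            return []
        return [
            other
            for other in self.by_ili[ili]
            if other != synset_id and self.synsets[other].language == language
        ]


def build_example() -> WordNetSketch:
    net = WordNetSketch()
    net.add(Synset("it-frutto", "it", ["frutto"]))
    net.add(Synset("it-mela", "it", ["mela"], ili="ili-apple"))
    net.add(Synset("en-apple", "en", ["apple"], ili="ili-apple"))
    net.relate("it-mela", "has_hyperonym", "it-frutto")
    return net
```

Calling `build_example().equivalents("it-mela", "en")` looks up the English synset through the shared ILI record rather than through a direct Italian-English link, which is the design point of an interlingual index: each wordnet maps only to the index, not to every other language.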

6 COLING Workshop - 2002 Domain - Semantic class mangiare

7 COLING Workshop - 2002 Domain - Semantic class: mangiare. [Diagram: semantic relations around "mangiare" and "cucinare". Relations: Used_for, Object_of_the_activity, Is_the_activity_of, Created_by. Roles: AGENTIVE, TELIC. Semantic classes: FURNITURE, INSTRUMENT, BUILDING, CONTAINER, PROFESSION, ARTIFACT_FOOD, VEGETABLES, FRUIT, FOOD, SUBSTANCE_FOOD, VEGETAL_ENTITY, FLAVOURING, NATURAL_SUBSTANCE. Feature: +edible. Lexical items: tavola, forchetta, posata, ristorante, mestolo, pentola, friggitrice, bollitore, pesce, pesciera, cuoco, coniglio, carne, mela, carota, arrosto, zucchero, alloro, tartufo; verbs: mangiare, cucinare, cuocere, arrostire, bollire, lessare, stufare, friggere, rosolare, grigliare, ……]

8 COLING Workshop - 2002 machine language learning

9 COLING Workshop - 2002 machine language learning: development of conceptual networks; linguistic learning; adaptive classification systems; information extraction; bootstrapping of grammars; linguistic change models; language usage models; bootstrapping of lexical information.

10 COLING Workshop - 2002 Beyond MILE: towards open & distributed lexicons. Semantic Lexicon: URI = http://www.xxx…; Syntactic Constructions: URI = http://www.yyy…; Ontology: URI = http://www.zzz…. Monolingual/Multilingual Lexicon. Lex_object: semFeature, URI = http://www.xxx…#HUMAN; Lex_object: syntagmaNT, URI = http://www.zzz…#NP
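
The idea on this slide, lexical objects that point by URI into separately hosted resources, can be sketched as follows. This is only an illustration under stated assumptions: the slide elides the real addresses, so the `http://example.org/...` base URIs below are placeholders, and `LexObject`, `RESOURCES`, and `owning_resource` are hypothetical names, not part of MILE.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple
from urllib.parse import urldefrag

# Placeholder base URIs (assumption): the slide does not give the real ones.
RESOURCES: Dict[str, str] = {
    "semantic_lexicon": "http://example.org/semantic-lexicon",
    "syntactic_constructions": "http://example.org/syntactic-constructions",
    "ontology": "http://example.org/ontology",
}


@dataclass(frozen=True)
class LexObject:
    kind: str   # e.g. "semFeature" or "syntagmaNT", as named on the slide
    uri: str    # resource URI plus a fragment naming the referenced item

    def resolve(self) -> Tuple[str, str]:
        # Split "base#fragment" into the hosting resource and the item name.
        base, fragment = urldefrag(self.uri)
        return base, fragment


def owning_resource(obj: LexObject, resources: Dict[str, str]) -> Optional[str]:
    # Find which registered resource hosts the item this object points to.
    base, _ = obj.resolve()
    for name, uri in resources.items():
        if uri == base:
            return name
    return None


# Mirrors the two Lex_object entries on the slide, with placeholder bases.
human = LexObject("semFeature", RESOURCES["semantic_lexicon"] + "#HUMAN")
noun_phrase = LexObject("syntagmaNT", RESOURCES["ontology"] + "#NP")
```

Because each object carries only a reference, a lexicon entry can reuse a feature such as HUMAN without copying its definition, and the hosting resource can be maintained by a different group.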

11 COLING Workshop - 2002 Target….. Multilingual Knowledge Management. Technical Feasibility. Prerequisite: is a commonly agreed text/lexicon annotation protocol, also for the semantic/conceptual level, an achievable goal (to be able to automatically establish links among different languages)? Yes, at the lexical level. More complex for corpus annotation? EAGLES/ISLE

12 COLING Workshop - 2002 A few Issues for discussion: lexicon standards. Semantic Web standards and the needs of content processing technologies: (1) importance of reaching consensus on (linguistic and non-linguistic) “content”, in addition to agreement on formats and encoding issues (…words convey content & knowledge); (2) short/medium term requirements with respect to standards for multilingual lexicons & content encoding, also industrial requirements. Relation with Spoken language community. MILE & Asian languages: how to cooperate concretely? Define further steps necessary to converge on common priorities ….

13 COLING Workshop - 2002 A few Issues for discussion: “content”, priorities... In which type of resources to invest, with respect to short vs. medium term results? Need for robust systems, able to acquire/tune lexical/linguistic (also multilingual) knowledge, to auto-enrich static basic resources? What is the relation between lexical standards and text annotation protocols? Knowledge management is critical. For “content” interoperability, is the field ‘mature’ enough to converge around agreed standards also for the semantic/conceptual level (e.g. to automatically establish links among different languages)? Is the field of multilingual lexical resources ready to tackle the challenges set by the Semantic Web development? Towards a new paradigm??

14 COLING Workshop - 2002 A new paradigm for LR? Where the focus is on cooperation. New Strategic Vision? towards a Distributed Open Lexical Infrastructure? for distributed & cooperative creation, management, etc. of Lexical Resources; technical & organisational requirements.

15 COLING Workshop - 2002 “ELITE” (expression of interest for the 6th FP): “European Lexical Infrastructure and Technology”. New proposed paradigm for lexicon development: Open & Distributed Lexical Infrastructure for content description and content interoperability, to make lexical resources usable within the emerging Semantic Web scenario. Language Resources & Semantic Web

