ISO/TC37/SC4/TDG6 Language Resource Ontologies 2008-09-27, Pisa HASIDA Koiti CfSR, AIST, Japan.


1 ISO/TC37/SC4/TDG6 Language Resource Ontologies 2008-09-27, Pisa HASIDA Koiti hasida.k@aist.go.jp CfSR, AIST, Japan

2 Ontologization
- reformulation in terms of ontology
  - provide standard way to convert annotations to labeled directed graphs
  - DCR, LAF, LMF, FS, MAF, SemAF, SynAF, MLIF, etc.
  - Cf. LMF and MAF have UML-based schemas.
- not XML but RDF as base description and modeling tool
  - standard semantic interpretation for RDF
  - highlight semantics rather than syntax

3 Purposes of Ontologization
- interoperability
  - among ISO/TC37 standards
  - with ontologies from elsewhere
  - with any data containing linguistic content
    - RDF data are easier to integrate than XML data.
    - e.g. external annotation of texts in SMIL data without including linguistic description in SMIL specification
- fuller formalization of IS specifications
- semantic extension of DCR

4 Semantic Extension of DCR
- sorts of DCs
  - unary predicate → class
  - binary relation → property
  - symmetric binary relation, etc.
- types of the domain (1st arg.) and the range (2nd arg.) of binary relations (properties)

5 XML Mess
- Semantic interpretation of XML is not standardized but defined ad hoc.
- Many inconsistent 'standards' on overlapping issues.
- Huge standards containing many different semantic interpretation manners.
  - e.g., MPEG-7 > 2000 pages

6 RDF
- Resource Description Framework
- labeled directed graph
- W3C recommendation http://www.w3.org/RDF/
- Schemas are provided by RDFS, OWL, etc.
- textual representation
  - XML, N3, etc.

7 RDF Graph
[Figure: an RDF graph. Nodes: http://www.example.org/people#fred, http://meetings.example.com/cal#m1, http://meetings.example.com/m1/hp, "Fred", mailto:fred@example.com. Edge labels: m:attending, m:homePage, m:givenName, m:hasEmail.]
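
An RDF graph like the one on this slide can be written down as a set of (subject, predicate, object) triples. The following is a minimal sketch in plain Python. The transcript preserves only the node and edge labels, so the way the edges connect the nodes (Fred attends meeting m1, whose home page is m1/hp) is an assumption, and the namespace URI bound to the `m:` prefix is hypothetical.

```python
# Hypothetical namespace for the m: prefix (the slide does not give one).
M = "http://example.org/meeting-vocab#"

FRED = "http://www.example.org/people#fred"
MEETING = "http://meetings.example.com/cal#m1"
HOME_PAGE = "http://meetings.example.com/m1/hp"

# Each triple is one labeled, directed edge: subject --predicate--> object.
# The connections below are an assumed reading of the figure.
triples = {
    (FRED, M + "attending", MEETING),
    (FRED, M + "givenName", "Fred"),
    (FRED, M + "hasEmail", "mailto:fred@example.com"),
    (MEETING, M + "homePage", HOME_PAGE),
}


def objects(graph, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


# Follow two edges: which home page belongs to the meeting Fred attends?
for meeting in objects(triples, FRED, M + "attending"):
    print(objects(triples, meeting, M + "homePage"))
```

Because a graph is just a set of triples, merging two RDF graphs amounts to a set union, which is one way to read the earlier claim that RDF data are easier to integrate than XML data.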

8 Conversion of XML to RDF
- AnyURI- and IDREF(S)-type attribute → object property (link)
- other attribute → datatype property
- embedded element → object/datatype property
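
The three conversion rules on this slide can be sketched with the Python standard library. This is an illustration under stated assumptions, not the conversion defined by any ISO document: without a schema the code cannot know attribute types, so the sets `IDREF_ATTRS` and `URI_ATTRS` are hypothetical stand-ins for "attributes declared as IDREF(S) or anyURI", nodes are named after their `id` attribute, and a text-only child element is treated as a datatype property.

```python
import itertools
import xml.etree.ElementTree as ET

# Hypothetical attribute typing; a real converter would read this from the schema.
IDREF_ATTRS = {"synonym"}
URI_ATTRS = {"seeAlso"}


def xml_to_triples(xml_text):
    """Convert an XML fragment to (subject, property, (kind, value)) triples.

    kind is "ref" for object properties (links) and "lit" for datatype properties.
    """
    counter = itertools.count()
    triples = []

    def node_id(elem):
        # Name the node after its id attribute, or invent a blank-node label.
        ident = elem.get("id")
        return "#" + ident if ident else "_:b%d" % next(counter)

    def visit(elem):
        subject = node_id(elem)
        for name, value in elem.attrib.items():
            if name == "id":
                continue
            if name in IDREF_ATTRS:
                # IDREF(S) attribute -> one object property per referenced id
                for target in value.split():
                    triples.append((subject, name, ("ref", "#" + target)))
            elif name in URI_ATTRS:
                # anyURI attribute -> object property
                triples.append((subject, name, ("ref", value)))
            else:
                # any other attribute -> datatype property
                triples.append((subject, name, ("lit", value)))
        for child in elem:
            if len(child) == 0 and not child.attrib:
                # embedded element holding only text -> datatype property
                text = (child.text or "").strip()
                triples.append((subject, child.tag, ("lit", text)))
            else:
                # embedded structured element -> object property
                triples.append((subject, child.tag, ("ref", visit(child))))
        return subject

    visit(ET.fromstring(xml_text))
    return triples


doc = (
    '<entry id="e1" lang="fr" seeAlso="http://example.org/x" synonym="e2 e3">'
    "<form>pomme</form>"
    '<sense id="s1" domain="food"/>'
    "</entry>"
)
for triple in xml_to_triples(doc):
    print(triple)
```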

9 24610: Feature Structure
- typed feature structure as in HPSG, etc.
- ISO 24610-1: Feature Structure Representation
- ISO 24610-2: Feature System Declaration
- labeled directed graph
- AVM (attribute-value matrix)
- textual encoding by XML

10 FS Graph = RDF Graph
[Figure: a feature structure drawn as a labeled directed graph. Node labels: determiner, la, noun, pomme, singular. Edge labels: SPECIFIER, HEAD, POS, ORTH, AGR (twice), NUMBER.]

11 FS in AVM
[ SPECIFIER  [ POS   determiner
               ORTH  'la'
               AGR   [1] [ NUMBER singular ] ]
  HEAD       [ POS   noun
               ORTH  'pomme'
               AGR   [1] ] ]
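
The correspondence between the AVM and a labeled directed graph can be made concrete by flattening the feature structure into triples. In this sketch the feature structure is modeled as nested Python dicts, and the re-entrancy tag [1] is modeled by using the same dict object under both AGR features. The node names `n0`, `n1`, ... are arbitrary labels invented by the code.

```python
import itertools


def fs_to_triples(fs):
    """Flatten a feature structure (nested dicts) into (node, feature, value) triples.

    Dicts that are the very same Python object receive the same node name,
    which is how a shared (re-entrant) substructure such as [1] is preserved.
    """
    names = {}                 # id(dict) -> node name
    counter = itertools.count()
    triples = []

    def visit(node):
        key = id(node)
        if key in names:       # already seen: re-entrant structure
            return names[key]
        names[key] = "n%d" % next(counter)
        for feature, value in node.items():
            if isinstance(value, dict):
                triples.append((names[key], feature, visit(value)))
            else:
                # atomic value, kept as a plain string
                triples.append((names[key], feature, value))
        return names[key]

    visit(fs)
    return triples


agr = {"NUMBER": "singular"}   # the structure tagged [1] in the AVM
la_pomme = {
    "SPECIFIER": {"POS": "determiner", "ORTH": "la", "AGR": agr},
    "HEAD": {"POS": "noun", "ORTH": "pomme", "AGR": agr},
}

for triple in fs_to_triples(la_pomme):
    print(triple)
```

Both AGR edges end at the same node, so the graph has one NUMBER edge rather than two copies of it.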

12 Ontologies Subsume Feature Systems
- Features are partial functions, whereas RDF properties are relations in general (possibly partial functions).
- Usual feature systems have no taxonomy of features, whereas usual ontologies have taxonomies of properties (e.g., due to rdfs:subPropertyOf).
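
Both points on this slide can be illustrated over a set of triples. The sketch below checks whether a property behaves as a partial function (at most one value per subject) and applies a property taxonomy in the manner of rdfs:subPropertyOf. The data and the taxonomy (`HEAD` and `SPECIFIER` as sub-properties of a hypothetical `DAUGHTER` property) are invented for illustration and do not come from the slides.

```python
from collections import defaultdict


def is_functional(triples, prop):
    """True if every subject has at most one value for prop (a partial function)."""
    values = defaultdict(set)
    for s, p, o in triples:
        if p == prop:
            values[s].add(o)
    return all(len(v) <= 1 for v in values.values())


def with_superproperties(triples, sub_property_of):
    """Add (s, q, o) whenever (s, p, o) holds and p is a sub-property of q.

    sub_property_of maps each property to a single parent (a simplification);
    chains of sub-properties are followed until nothing new is added.
    """
    result = set(triples)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(result):
            parent = sub_property_of.get(p)
            if parent is not None and (s, parent, o) not in result:
                result.add((s, parent, o))
                changed = True
    return result


data = {
    ("np1", "HEAD", "w2"),
    ("np1", "SPECIFIER", "w1"),
    ("w2", "ORTH", "pomme"),
}
taxonomy = {"HEAD": "DAUGHTER", "SPECIFIER": "DAUGHTER"}  # hypothetical

closed = with_superproperties(data, taxonomy)
print(is_functional(closed, "HEAD"))      # a feature: one value per node
print(is_functional(closed, "DAUGHTER"))  # a general relation: two values for np1
```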

13 Feature-System Declaration
[Figure: a feature-system declaration drawn as an RDF graph. Nodes: word, sign, orth, string, owl:FunctionalProperty, and the comment literals "The fundamental type for individual words" and "The orthographic representation for this word". Edge labels: rdfs:subClassOf, rdfs:domain, rdfs:range, rdfs:comment (twice), rdf:type.]
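
A feature-system declaration expressed with RDFS vocabulary also licenses inferences. The sketch below encodes one plausible reading of the figure as triples (which edge connects which nodes is an assumption, since the transcript keeps only the labels) and applies two standard RDFS entailment rules: a property's rdfs:domain types its subject, and rdfs:subClassOf propagates that type upward. The instance `w1` is hypothetical.

```python
# Assumed reading of the figure: word is a subclass of sign, and orth is a
# functional property from word to string.
SCHEMA = {
    ("word", "rdfs:subClassOf", "sign"),
    ("word", "rdfs:comment", "The fundamental type for individual words"),
    ("orth", "rdfs:domain", "word"),
    ("orth", "rdfs:range", "string"),
    ("orth", "rdfs:comment", "The orthographic representation for this word"),
    ("orth", "rdf:type", "owl:FunctionalProperty"),
}


def infer_types(data, schema):
    """Infer rdf:type triples from rdfs:domain and rdfs:subClassOf."""
    domains = {s: o for s, p, o in schema if p == "rdfs:domain"}
    supers = {}
    for s, p, o in schema:
        if p == "rdfs:subClassOf":
            supers.setdefault(s, set()).add(o)

    # Rule 1: (x, p, y) and (p, rdfs:domain, C) give (x, rdf:type, C).
    types = {(s, "rdf:type", domains[p]) for s, p, o in data if p in domains}

    # Rule 2: (x, rdf:type, C) and (C, rdfs:subClassOf, D) give (x, rdf:type, D).
    frontier = list(types)
    while frontier:
        subject, _, cls = frontier.pop()
        for parent in supers.get(cls, ()):
            inferred = (subject, "rdf:type", parent)
            if inferred not in types:
                types.add(inferred)
                frontier.append(inferred)
    return types


data = {("w1", "orth", "pomme")}  # hypothetical instance data
for triple in sorted(infer_types(data, SCHEMA)):
    print(triple)
```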

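14 Constraint (Conditional)
[Figure: a conditional constraint drawn as a graph. Condition (cond): node X with an inv edge to true. Consequence: node X with an aux edge to true and a vform edge to fin.]
SWRL representation: inv(?X,true) -> aux(?X,true) & vform(?X,fin)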
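
The SWRL rule on this slide can be read as a forward-chaining step over triples: whenever a node satisfies the condition, the consequences are added. The sketch below hard-codes that single-variable rule shape in plain Python; it is not a SWRL engine, and the node names `v1` and `v2` are hypothetical.

```python
# inv(?X,true) -> aux(?X,true) & vform(?X,fin)
INVERSION_RULE = {
    "if": [("inv", "true")],
    "then": [("aux", "true"), ("vform", "fin")],
}


def apply_rule(triples, rule):
    """Return the triples plus everything the rule derives for each subject X."""
    facts = set(triples)
    subjects = {s for s, _, _ in facts}
    for x in subjects:
        # The condition holds for x if every (property, value) pair is present.
        if all((x, p, v) in facts for p, v in rule["if"]):
            facts.update((x, p, v) for p, v in rule["then"])
    return facts


verbs = {
    ("v1", "inv", "true"),   # inverted: must be a finite auxiliary
    ("v2", "inv", "false"),  # not inverted: the rule says nothing
}
for triple in sorted(apply_rule(verbs, INVERSION_RULE)):
    print(triple)
```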

15 FS Ontologization (Summary)
- RDF ⊃ FS
- Use ontologies for feature-system declarations.
- SWRL to encode constraints
- Defaults are outside of ontology.

16 24612: Linguistic Annotation Framework

17 GrAF in RDF
[Figure: a GrAF annotation graph for "The clock" drawn as an RDF graph, possibly stand-off annotation. Node labels: NP, TOKEN, The, clock, SING, THE, DET, NN, CLOCK. Edge labels: rdf:type (three times), NUMBER, POS (twice), BASE (twice).]

18 SemAF-DActs
[Figure: UML class diagram. Classes: Dialogue, Turn, Utterance, Agent, DialogueAct. Association labels: sender, addressee, overhearer, func.dep. Multiplicities shown: 1..*, 0..*, 1..1.]

19 TODOs (projects in TDG6?)
- include ontologies in documents
- FSD
- just check UML (as far as no property hierarchy is necessary)
- LMF, MAF
- finish ontologization (possibly in UML)
- SynAF
- ontologize from scratch, forgetting XML
- DCR, SemAF-Time, SemAF-DActs, MLIF, etc.

20 Issues
- Who should ontologize individual WIs?
  - ontologize future WIs from the beginning
  - TDG6 should exemplify how.
  - whether and how to make ontologization mandatory?
- Where to include ontologies of ongoing WIs?
  - depending on their stages (WD, CD,...)
- How to keep ontologizing DCs?
  - replace DC metamodel by ontology?
  - modify ISOCat?

