Semantic Web The Story So Far Ian Horrocks Oxford University Computing Laboratory.


1 Semantic Web The Story So Far Ian Horrocks Oxford University Computing Laboratory

2 Semantic Web

3 Semantic Web According to W3C –“an evolving extension of the World Wide Web in which web content can be … read and used by software agents, thus permitting them to find, share and integrate information more easily” Data will use uniform syntactic structure (RDF) (OWL) ontologies will provide –Schemas for data –Vocabulary for annotations Ultimate goal is a “more intelligent web”

4 The Semantic Web

5 Web “invented” by Tim Berners-Lee (amongst others) –(Conceptual) simplicity of web has contributed to success, but is also a limiting factor Tim has ambitious goals for future of the web –Objective is to overcome existing limitations This vision of the future of the Web has become known as the Semantic Web What is it? “… a consistent logical web of data …” “… information is given well-defined meaning …”

6 Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois Why do we want it? Many tasks are difficult or impossible using existing web:

7 Why do we want it? Many tasks are difficult or impossible using existing web: Complex queries involving background knowledge –Find information about “animals that use sonar but are neither bats nor dolphins” Locating information in data repositories –Travel enquiries –Prices of goods and services –Results of human genome experiments Finding and using “web services” –Given DNA sequence, identify genes, determine proteins they produce, and hence biological processes they control, e.g., Barn Owl

8 What is the Problem? Consider a typical web page: Markup consists of: –rendering information (e.g., font size and colour) –Hyper-links to related content Semantic content is accessible to humans, but not (easily) to computers…

9 How Will It Work? Add semantic annotations to web resources Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois Dr. Alan Rector, Professor of Computer Science, University of Manchester Rev. Alan M. Gates, Associate Rector of the Church of the Holy Spirit, Lake Forest, Illinois

10 How Will It Work? Now... that should clear up a few things around here

11 Giving Semantics to Annotations Agree on meaning of a set of annotation tags E.g., Dublin Core –Limited flexibility and extensibility –Limited number of things can be expressed Agree on language used to define meanings E.g., an ontology language –Flexible and extensible New terms can be formed by combining existing ones –Meaning (semantics) of such terms is formally specified

12 The Web Ontology Language OWL

13 Web Ontology Language OWL Semantic Web led to requirement for a “web ontology language” W3C set up Web-Ontology (WebOnt) Working Group –WebOnt developed OWL language –OWL based on earlier languages RDF, OIL and DAML+OIL –OWL now a W3C recommendation (i.e., a standard) OWL is a family of 3 languages: OWL Lite, OWL DL and OWL Full OIL, DAML+OIL and OWL (DL & Lite) based on Description Logics –Has facilitated development of wide range of high quality tools & infrastructure OWL now language of choice in many applications

14 What Are Description Logics? A family of logic based Knowledge Representation formalisms –Descendants of semantic networks and KL-ONE –Describe domain in terms of concepts (classes), roles (properties, relationships) and individuals –Operators allow for composition of complex concepts –Names can be given to complex concepts, e.g.: HappyParent ≡ Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic)
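To make the constructors concrete, here is a minimal sketch that evaluates the HappyParent definition over one small finite interpretation. The individuals and facts are invented for illustration, and the code only checks a single model; it is not a DL reasoner, which would have to consider all models.

```python
# Toy finite interpretation: a domain plus extensions for concept and role names.
# All individuals and facts below are made up for illustration.
domain = {"ann", "bob", "cat", "dan"}
concepts = {
    "Parent": {"ann", "bob"},
    "Intelligent": {"cat"},
    "Athletic": set(),
}
roles = {"hasChild": {("ann", "cat"), ("bob", "cat"), ("bob", "dan")}}


def conj(c, d):
    """C ⊓ D: intersection of the two extensions."""
    return c & d


def disj(c, d):
    """C ⊔ D: union of the two extensions."""
    return c | d


def forall(role, c):
    """∀R.C: individuals all of whose R-successors lie in C."""
    return {
        x for x in domain
        if all(y in c for (s, y) in roles[role] if s == x)
    }


# HappyParent ≡ Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic)
happy_parent = conj(
    concepts["Parent"],
    forall("hasChild", disj(concepts["Intelligent"], concepts["Athletic"])),
)
print(sorted(happy_parent))  # ['ann']
```

Note that ∀hasChild.C also holds vacuously for individuals with no children (cat and dan here); it is the conjunction with Parent that excludes them.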

15 Why (Description) Logic? OWL exploits results of 15+ years of DL research –Well defined (model theoretic) semantics –Most DLs are subsets of C2, i.e., decidable fragments of FOL

16 Why (Description) Logic? OWL exploits results of 15+ years of DL research –Well defined (model theoretic) semantics –Formal properties well understood (complexity, decidability) [Garey & Johnson. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, 1979.] I can’t find an efficient algorithm, but neither can all these famous people.

17 Why (Description) Logic? OWL exploits results of 15+ years of DL research –Well defined (model theoretic) semantics –Formal properties well understood (complexity, decidability) –Known reasoning algorithms

18 Why (Description) Logic? OWL exploits results of 15+ years of DL research –Well defined (model theoretic) semantics –Formal properties well understood (complexity, decidability) –Known reasoning algorithms –Implemented systems (highly optimised) Pellet KAON2 CEL

19 Class/Concept Constructors Concept can be thought of as a FOL formula with one free variable

20 Knowledge Base / Ontology Axioms

21 OWL RDF/XML Exchange Syntax E.g., Parent ⊓ ∀hasChild.(Intelligent ⊔ Athletic):

22 Ontology based Information Systems Similar to relational databases –Ontology ≈ schema; instances ≈ data Some important (dis)advantages +(Relatively) easy to maintain and update schema Both schema and data are “self organising” +Query answers reflect both schema and data +Able to answer both intensional and extensional queries –Semantics may be counter-intuitive or even inappropriate Open -v- closed world; axioms -v- constraints –Query answering (logical entailment) much more difficult Can lead to scalability problems

23 Ontology based Information Systems Similar to relational databases –Ontology ≈ schema; instances ≈ data Some important (dis)advantages +(Relatively) easy to maintain and update schema Both schema and data are “self organising” +Query answers reflect both schema and data +Able to answer both intensional and extensional queries –Semantics may be counter-intuitive or even inappropriate Open -v- closed world; axioms -v- constraints –Query answering (logical entailment) much more difficult Can lead to scalability problems Very useful, but no miracles!

24 Ontologies and Reasoning

25 Support for Ontology Engineering Developing and maintaining quality ontologies is very challenging Users need tools and services, e.g., to help check if ontology is: –Meaningful: all named classes can have instances

26 Support for Ontology Engineering Developing and maintaining quality ontologies is very challenging Users need tools and services, e.g., to help check if ontology is: –Meaningful: all named classes can have instances –Correct: captures intuitions of domain experts

27 Support for Ontology Engineering Developing and maintaining quality ontologies is very challenging Users need tools and services, e.g., to help check if ontology is: –Meaningful: all named classes can have instances –Correct: captures intuitions of domain experts –Minimally redundant: no unintended synonyms, e.g., Banana split, Banana sundae

28 Support for Ontology Engineering Range of new “non-standard” services supporting, e.g.: –Modular design and integration What is the effect of merging O₂ into O₁? In general, check that O₁ ∪ O₂ ⊨ C iff O₁ ⊨ C for any concept C constructed using vocabulary occurring in O₁ –Module Extraction Extract a (small) module from O capturing all “relevant” information about some vocabulary V In general, find O′ ⊆ O s.t. O′ ⊨ C iff O ⊨ C for any concept C constructed using terms from V –Bottom-up design Find a (small and specific) concept describing a set of individuals In general, find most specific C s.t. O ⊨ C(i₁) ∧ … ∧ C(iₙ) –Where C may be “small” and/or in a sub-language (of O)

29 Support for Ontology Engineering Range of new “non-standard” services supporting, e.g.: –Error diagnosis and repair

30 Support for Query Answering In an Ontology based Information System (OIS), Query answering ≈ computing logical entailment –Reasoner needed in order to answer queries, e.g.: C is a sub-class of D iff O ⊨ ∀x. C(x) → D(x) a is an instance of C iff O ⊨ C(a) OIS with no reasoner ≈ DBMS with no query engine
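The gap between looking up asserted facts and computing entailments can be sketched with a deliberately tiny "reasoner" that handles only atomic subclass axioms and class assertions. The ontology contents and function names below are hypothetical, and real OWL reasoners deal with far richer axioms than this transitive-closure check.

```python
# Hypothetical toy ontology: only atomic subclass axioms and class assertions.
subclass_axioms = {("Bat", "Mammal"), ("Mammal", "Animal"), ("Dolphin", "Mammal")}
class_assertions = {("batty", "Bat"), ("flipper", "Dolphin")}


def superclasses(cls):
    """All D such that cls is a sub-class of D, by reflexivity and transitivity."""
    seen = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for sub, sup in subclass_axioms:
            if sub == current and sup not in seen:
                seen.add(sup)
                frontier.append(sup)
    return seen


def is_subclass(c, d):
    """Entailed sub-class check for this restricted axiom shape."""
    return d in superclasses(c)


def is_instance(a, c):
    """Entailed instance check: some asserted class of a is a sub-class of c."""
    return any(
        ind == a and is_subclass(asserted, c)
        for ind, asserted in class_assertions
    )


def asserted_instance(a, c):
    """Plain lookup with no reasoning, for comparison."""
    return (a, c) in class_assertions


print(is_instance("batty", "Animal"))        # True
print(asserted_instance("batty", "Animal"))  # False
```

The lookup misses the answer that the entailment check finds, which is the point of the "DBMS with no query engine" comparison.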

31 Example Applications

32 e-Science E.g., for “in silico” investigations and “hypothesis testing” –Comparing data (e.g., on proteins) to (model of) biological knowledge –Characteristics of proteins captured in an ontology O Goal is to identify protein instances based on characteristics

33 e-Science E.g., for “in silico” investigations and “hypothesis testing” –Comparing data (e.g., on proteins) to (model of) biological knowledge –Characteristics of proteins captured in an ontology O Goal is to identify protein instances based on characteristics –Equivalent to answering queries of form: O ⊨ P(i)? for protein P and instance i –Result may be discovery of new kinds of protein And these may be potential drug targets if unique to a pathogen –Result may also be discovery of errors in model Which may reflect gaps/errors in existing knowledge

34 Healthcare UK NHS has a £6.2 billion “Connecting for Health” IT programme Key component is Care Records Service (CRS) –“Live, interactive patient record service accessible 24/7” –Patient data distributed across local centres in 5 regional clusters, and a national DB Detailed records held by local service providers Diverse applications support radiology, pharmacy, etc Applications exchange messages containing “semantically rich clinical information” Summaries sent to national database –SNOMED-CT ontology provides common vocabulary for data Clinical data uses terms drawn from ontology

35 SNOMED Over 400,000 concepts

36 SNOMED Over 400,000 concepts Schema only (no instances) Language used is a (well known) fragment of OWL NHS version extended with 1,000s of additional classes –OWL reasoner (FaCT++) used to classify and check ontology Currently takes ≈ 4 hours –180 missing subClass relationships were found, e.g.: Periocular_dermatitis subClassOf Disease_of_face Fibrin_measurement subClassOf Coagulation_factor_assay

37 SNOMED Vocabulary is extensible at point of use: “post coordination” –Users (e.g. clinicians) may add/define new vocabulary –Terminology service (reasoner) used to insert in ontology Typical new term: – almond_allergy ≡ “allergy caused_by almond” –OWL reasoner (FaCT++) used to classify new term Takes <10 ms –Classified as a kind of “nut allergy” Clearly of crucial importance to recognise patients with allergy caused by almond as kinds of patient with nut allergy
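How a post-coordinated term ends up under "nut allergy" can be illustrated with a simple structural comparison of definitions. This is a sketch under strong assumptions: every defined concept is a conjunction of atomic names and existential restrictions with atomic fillers, the axiom almond ⊑ nut and the definition of nut_allergy are invented here, and the check is sound but not complete. It is not the algorithm used by FaCT++.

```python
# Assumed told axioms between atomic names (almond is a kind of nut).
told = {("almond", "nut")}

# Assumed definitions: name ≡ (atomic conjuncts, existential restrictions).
# ("caused_by", "nut") stands for ∃caused_by.nut.
definitions = {
    "nut_allergy": ({"allergy"}, {("caused_by", "nut")}),
    "almond_allergy": ({"allergy"}, {("caused_by", "almond")}),
}


def atomic_subsumed(a, b):
    """True if a is a sub-class of b via the reflexive-transitive closure of told."""
    seen, frontier = {a}, [a]
    while frontier:
        current = frontier.pop()
        for sub, sup in told:
            if sub == current and sup not in seen:
                seen.add(sup)
                frontier.append(sup)
    return b in seen


def subsumed(c, d):
    """True if defined concept c is a sub-class of d by structural comparison."""
    c_atoms, c_exists = definitions[c]
    d_atoms, d_exists = definitions[d]
    # Every atomic conjunct of d must be implied by some atomic conjunct of c.
    atoms_ok = all(any(atomic_subsumed(a, b) for a in c_atoms) for b in d_atoms)
    # Every ∃s.g in d must be implied by some ∃r.f in c (same role, f below g).
    exists_ok = all(
        any(r == s and atomic_subsumed(f, g) for (r, f) in c_exists)
        for (s, g) in d_exists
    )
    return atoms_ok and exists_ok


print(subsumed("almond_allergy", "nut_allergy"))  # True
print(subsumed("nut_allergy", "almond_allergy"))  # False
```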

38 Online Self-Medication Advice Self-medication is pervasive, but can be hazardous –180 deaths in the USA in 2006 French project to provide on-line advice –Will be made available to 20 million customers of French health insurance companies –Patients have their own simple health care record (SEHR) –Diagnosis system considers symptom descriptions, SEHR, Q&A and self-medication KB –Uses an ontology for vocabulary and knowledge (axioms) about treatments, contra-indications, side-effects, etc. E.g., do not take x if patient suffers from y; side-effects of x may include z

39 Online Self-Medication Advice Self-medication is pervasive, but can be hazardous –180 deaths in the USA in 2006 French project to provide on-line advice –Will be made available to 20 million customers of French health insurance companies –Patients have their own simple health care record (SEHR) –Diagnosis system considers symptom descriptions, SEHR, Q&A and self-medication KB –Uses OWL reasoner to advise on treatment, and check for contra-indications, side-effects, etc. E.g., do not take x if patient suffers from y; side-effects may include z

40 Online Self-Medication Application Data taken from drug terminologies, e.g.: –European Pharmaceutical Market Research Association (EphMRA) –Anatomical Therapeutic Chemical (ATC) Data transformed into OWL ontology –Expert uses reasoner to check and enhance ontology OWL reasoner also used to check and enhance data –Combined with induction and interaction with expert –Corrected missing/incorrect information on interactions, contra-indications, allergies, side-effects, etc. –Quality of data improved by factor of 8%

41 Columbia Presbyterian Medical Center Ontology used in analysis of results in path lab OWL reasoner used to check this ontology Several errors and omissions found that: “ would have led to missed test results ”

42 Recent Developments

43 OWL 1.1 Is an extension of OWL –Addresses deficiencies identified by users and developers (at OWLED workshop) Is based on more expressive DL: SROIQ –(OWL is based on SHOIN ) W3C working group now chartered –Will develop recommendation based on existing member submission Already supported by popular OWL tools –Protégé, Swoop, TopBraid, FaCT++, Pellet

44 What’s New in OWL 1.1? Four kinds of features: More expressive logic ( SROIQ ) –qualified cardinality restrictions (≥n R.C) and (≤n R.C), e.g: Person ⊑ Animal ⊓ =2 hasPart.Legs Car ⊑ =4 hasComponent.Wheel Person ⊑ ≤1 bioParent.Male (OWL/SHOIN only allows for concepts (≥n R) and (≤n R))
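The difference between qualified and unqualified cardinality restrictions shows up clearly when both are evaluated over one finite interpretation. The sketch below uses invented individuals and, as before, checks a single model rather than performing reasoning.

```python
# Invented finite interpretation: one car with four wheels and a radio,
# and one tricycle with three wheels.
domain = {"car1", "trike", "w1", "w2", "w3", "w4", "radio", "t1", "t2", "t3"}
wheel = {"w1", "w2", "w3", "w4", "t1", "t2", "t3"}
has_component = (
    {("car1", w) for w in ("w1", "w2", "w3", "w4")}
    | {("car1", "radio")}
    | {("trike", t) for t in ("t1", "t2", "t3")}
)


def count_fillers(x, role, concept):
    """Number of distinct role-successors of x that belong to concept."""
    return sum(1 for (s, y) in role if s == x and y in concept)


def at_least(n, role, concept):
    """(≥n R.C)"""
    return {x for x in domain if count_fillers(x, role, concept) >= n}


def at_most(n, role, concept):
    """(≤n R.C)"""
    return {x for x in domain if count_fillers(x, role, concept) <= n}


def exactly(n, role, concept):
    """(=n R.C), i.e. (≥n R.C) ⊓ (≤n R.C)"""
    return at_least(n, role, concept) & at_most(n, role, concept)


# Qualified: exactly four components that are wheels.
print(sorted(exactly(4, has_component, wheel)))   # ['car1']
# Unqualified (filler is the whole domain): car1 has five components, so no match.
print(sorted(exactly(4, has_component, domain)))  # []
```

An unqualified restriction counts every component, so the radio makes "=4 hasComponent" fail for car1 even though "=4 hasComponent.Wheel" holds.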

45 What’s New in OWL 1.1? Four kinds of features: More expressive logic ( SROIQ ) –Expressive role axioms (R), e.g., complex role inclusions: R1 ∘ … ∘ Rn ⊑ S; R1 ∘ … ∘ Rn ∘ S ⊑ S; S ∘ R1 ∘ … ∘ Rn ⊑ S (with some restrictions on cycles) –useful, e.g., for owns ∘ hasPart ⊑ owns ⇒ ∃owns.Bicycle ⊑ ∃owns.Wheels; partOf ∘ locatedIn ⊑ locatedIn ⇒ Fracture ⊓ ∃locatedIn.FemurShaft ⊑ Fracture ⊓ ∃locatedIn.Femur; hasParent ∘ hasBrother ⊑ hasUncle
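On ground facts, a role inclusion such as owns ∘ hasPart ⊑ owns behaves like a rule that can be applied until nothing new is derived. The following sketch saturates a small, invented set of role assertions this way; it only covers two-step chains over explicit facts and says nothing about concept-level consequences such as ∃owns.Bicycle ⊑ ∃owns.Wheels, which need a real reasoner.

```python
# Invented role assertions.
role_facts = {
    "owns": {("ian", "bike")},
    "hasPart": {("bike", "wheel1"), ("wheel1", "spoke1")},
}

# Each triple (R1, R2, S) stands for the axiom R1 ∘ R2 ⊑ S.
chain_axioms = [("owns", "hasPart", "owns")]


def saturate(facts, axioms):
    """Apply the chain axioms to the ground facts until a fixed point is reached."""
    facts = {role: set(pairs) for role, pairs in facts.items()}
    changed = True
    while changed:
        changed = False
        for r1, r2, s in axioms:
            derived = {
                (x, z)
                for (x, y) in facts.get(r1, set())
                for (y2, z) in facts.get(r2, set())
                if y == y2
            }
            new = derived - facts.setdefault(s, set())
            if new:
                facts[s] |= new
                changed = True
    return facts


closed = saturate(role_facts, chain_axioms)
print(sorted(closed["owns"]))
# [('ian', 'bike'), ('ian', 'spoke1'), ('ian', 'wheel1')]
```

Ownership of the spoke is derived in two rounds even though hasPart itself is not declared transitive, because the derived fact (ian, wheel1) feeds back into the same axiom.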

46 What’s New in OWL 1.1? Four kinds of features: More expressive logic ( SROIQ ) –Expressive role axioms (R), e.g., asymmetry, reflexivity, etc: Tra(R) (supported by SHOIN ) Asy(R) e.g., Asy(properPartOf), Asy(hasParent) Sym(R) (supported by SHOIN ) Refl(R) e.g., Refl(knows) Irrefl(R) e.g., Irrefl(properPartOf), Irrefl(hasParent) Disj(R S) e.g., Disj(hasParent hasSibling) ObjectExistsSelf(likes) [for narcissists]

47 What’s New in OWL 1.1? Four kinds of features: More expressive datatypes –OWL 1.1 allows for user-defined datatypes: over18 ≡ base(xsd:integer) minInclusive("18"^^xsd:integer) Adult ≡ Person ⊓ ∃age.over18 –and n-ary datatype predicates: Spendthrift ≡ ∃spends,earns.> –BUT, still cannot: define complex relationships between data properties on different individuals, e.g., Women who earn more than their husbands. declare a datatype property as inverse-functional (keys).
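Read over a fixed set of data, the user-defined datatype and the binary datatype predicate amount to simple value tests. The sketch below assumes invented individuals, treats age, spends and earns as single-valued for simplicity, and evaluates the two definitions over the given data only.

```python
def over18(value):
    """over18: integers restricted by minInclusive 18."""
    return isinstance(value, int) and not isinstance(value, bool) and value >= 18


# Invented data; "rex" has an age but is not asserted to be a Person.
persons = {"alice", "bob"}
age = {"alice": 34, "bob": 12, "rex": 20}
spends = {"alice": 900, "bob": 50}
earns = {"alice": 800, "bob": 60}

# Adult ≡ Person ⊓ ∃age.over18
adults = {x for x in persons if over18(age.get(x))}

# Spendthrift ≡ ∃spends,earns.>  (both values belong to the same individual)
spendthrifts = {x for x in spends if x in earns and spends[x] > earns[x]}

print(sorted(adults))        # ['alice']
print(sorted(spendthrifts))  # ['alice']
```

Both comparisons involve values attached to one individual; comparing a value on one individual with a value on another (the "women who earn more than their husbands" case) is exactly what the slide says cannot be expressed.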

48 What’s New in OWL 1.1? Four kinds of features: Metamodelling and annotations –Names can be used as any or all of an individual, a class, or a property –Allows for a restricted form of metamodelling (“punning”), e.g.: SubClassOf(SnowLeopard BigCat) ClassAssertion(SnowLeopard EndangeredSpecies) –Annotations of axioms as well as entities ClassAssertion(Comment("source: WWF") SnowLeopard EndangeredSpecies)

49 What’s New in OWL 1.1? Four kinds of features: Syntactic sugar (make things easier to say) –Disjoint unions, e.g.: DisjointUnion(Element Earth Wind Fire Water) –Negative assertions, e.g.: NegativeObjectPropertyAssertion(Ian hasChild Mary) NegativeDataPropertyAssertion(Ian hasAge 21)

50 Tractable Fragments OWL defines only one fragment (OWL Lite) –And it isn’t very tractable! OWL 1.1 defines several different fragments with useful computational properties –E.g., reasoning complexity in range LOGSPACE to PTIME –Smaller fragments implementable using RDBs

51 Tractable Fragments

52 Tools and Methodologies OWL 1.1 support already added to several tools: –Protégé, Swoop, TopBraid Composer, FaCT++, Pellet New features available (soon) in OWL tools: –Diagnosis and semi-automatic repair of errors –Support for integration and modular design –Incremental classification (addition and retraction) –Support for bottom up design

53 Diagnosis Editing tools use reasoner to identify inconsistent classes May not be very useful without some explanation facility

54 Modularity in Software Engineering Typically referred to as the extent to which software is divided into components with: –high internal cohesion –controlled coupling between each other through simple interfaces (encapsulation) Benefits of modular software design: –software maintainability –software understandability

55 Modularity in Ontology Engineering Benefits of a modular ontology design: to simplify ontology refinement/update modifying a module should not lead to modifications in parts of the ontology that are not conceptually related understanding relationships between different modules in an ontology controlled and well-understood integration with other ontologies no unexpected consequences partial reuse reuse only the relevant part/module of an ontology

56 Tool Support for Modular Design Check when integration of modules is “safe” –Interface between modules via exported vocabulary –Information flows from imported to importing ontology –No information flows back the other way Formalised using conservative extensions –What is the effect of merging O₂ into O₁? –In general, check that O₁ ∪ O₂ ⊨ C iff O₁ ⊨ C for any concept C constructed using vocabulary occurring in O₁ [Cuenca Grau & Kazakov, IJCAI-07 and WWW-07]

57 Tool Support for Modular Design Extract smaller modules from large ontologies –E.g., starting with FMA, extract module for “Heart” –Tool should ensure that module Is as small as possible, but Still contains all relevant knowledge More formally: –Extract a (small) module from O capturing all “relevant” information about some vocabulary V –In general, find O′ ⊆ O s.t. O′ ⊨ C iff O ⊨ C for any concept C constructed using terms from V

58 Incremental Reasoning Modules can also be used to support incremental addition and retraction of axioms, e.g: –When retracting C ⊑ D, reclassify only concepts whose module includes this axiom –Typically this is only a very small subset of all concepts Prototype now implemented in Swoop editor

59 Tool Support for Bottom-up Design Bottom-up design –Find a (small and specific) concept describing a set of individuals –In general, find most specific C s.t. O ⊨ C(i₁) ∧ … ∧ C(iₙ) Where C may be “small” and/or in a sub-language (of O) –Prototype: SONIC system [Turhan et al]

60 Extending Expressive Power Database style keys [Lutz et al, JAIR 2004] –E.g., make + model + chassis-number is a key for Vehicles Rule language extensions –W3C RIF WG (see http://www.w3.org/2005/rules/) –First order extensions (e.g., SWRL) [Horrocks et al, JWS, 2005] –Hybrid language extensions, e.g., [Eiter et al, KR-04; Motik et al, ISWC-04; Rosati, JoWS, 2005] –LP/F-Logic/Common Logic [Chen et al, JLP, 1993; de Bruijn et al, WWW-05] Other extensions –Temporal –Fuzzy –Extended annotation framework –Macro language –…

61 Improving Scalability Optimisation techniques –Improve performance of DL reasoners, e.g., [Tsarkov et al, JAR, ] New Reasoning Techniques –Reduction to disjunctive Datalog [Motik et al, KR-04] Transform SHOIN ontology to Datalog∨ rules Use LP techniques to deal with large numbers of ground facts –Hybrid DL-DB systems [Horrocks et al, CADE-05] Use DB to store “Abox” (individual) axioms Cache inferences and use DB queries to answer/scope logical queries –Hypertableau based algorithms [Motik et al, CADE-07] Prototypical implementation in HermiT system Polynomial time algorithms for sub-ALC logics –Graph based techniques for EL+ [Baader et al, IJCAI-05] –Database techniques for DL-Lite [Calvanese et al, AAAI-05]

62 Developing Tools and Infrastructure Editors/environments –Oiled, Protégé, Swoop, TopBraid, Ontotrack, …

63 Developing Tools and Infrastructure Editors/environments –Oiled, Protégé, Swoop, TopBraid, Ontotrack, … Reasoning systems –Cerebra, FaCT++, Kaon2, Pellet, Racer, CEL, … Pellet KAON2 CEL

64 Developing Tools and Infrastructure Editors/environments –Oiled, Protégé, Swoop, TopBraid, Ontotrack, … Reasoning systems –Cerebra, FaCT++, Kaon2, Pellet, Racer, CEL, … Design methodologies –Foundational ontologies, etc. [Figure: taxonomy diagram with nodes Entity, Endurant, Perdurant, Substantial, Quality, Event, Stative, Achievement, Accomplishment]

65 Summary Semantic Web aims to make web content more accessible to automated processes –Adds semantic annotations to web resources OWL Ontologies provide vocabulary for annotations –Terms have well defined meaning OWL now being used in a wide range of applications –e-Science, medicine, geography, geology, … Reasoning enabled tools are of crucial importance –For both design and deployment of ontologies Active research area –Expressive power, scalability, methodologies, tools, …

66 Thank you for listening

67 Any questions? FRAZZ: © Jeff Mallett/Dist. by United Feature Syndicate, Inc.

