Introduction to Semantic Web Dr. Bhavani Thuraisingham The University of Texas at Dallas January 2013.

1 Introduction to Semantic Web Dr. Bhavani Thuraisingham The University of Texas at Dallas January 2013

2 Outline
– Introduction
– XML
– RDF
– OWL
– RULES
Reference: G. Antoniou and F. van Harmelen, A Semantic Web Primer, MIT Press, 2004 (second edition, 2008)

3 From Today's Web to the Semantic Web
Today's web
– High recall, low precision: searches return too many pages, many of them irrelevant
– Sometimes low recall
– Results are sensitive to vocabulary: different words with the same meaning do not return the same pages
– Results are single web pages, not linked web pages
Semantic web
– Machine-understandable web pages
– Activities on the web, such as searching, with little or no human intervention
– Technologies for knowledge management, e-commerce, and interoperability
– Solutions to the problems faced by today's web

4 Applications
Knowledge management
– Corporate need: searching, extracting, and maintaining information; uncovering hidden dependencies; viewing information
– Semantic web for knowledge management: organizing knowledge, automated tools for maintaining knowledge, question answering, querying multiple documents, controlling access to documents
Personal agent
– John is the president of a company. He needs surgery for a serious but not critical illness. With the current web he has to check each web page for relevant information and make decisions based on the information provided
– With the semantic web, the agent retrieves all the relevant information, synthesizes it, asks John for input if needed, and then presents the various options and makes recommendations

5 Applications: E-Commerce
Business to consumer
– Users shop on the web; wrapper technology is used to extract information about user preferences etc. and display products to the user
– Use of the semantic web: develop software agents that can interpret privacy requirements, pricing, and product information and display timely and correct information to the user; agents can also provide information about the reputation of shops
Business to business
– Organizations work together and carry out transactions, such as collaborating on a product or on supply chains. Today's web lacks standards for data exchange
– Use of the semantic web: XML is a big improvement, but parties still need to agree on a vocabulary. The future lies in using ontologies to agree on meanings and interpretations

6 Layered Approach: Tim Berners-Lee's Vision

XML

7 The XML Language
An XML document consists of
– a prolog
– a number of elements
– an optional epilog
The prolog consists of an XML declaration and an optional reference to external structuring documents

8 XML Elements
Content may be text, other elements, or nothing, e.g. <lecturer>David Billington</lecturer>
If there is no content, the element is called empty; it is abbreviated as follows: <lecturer/> for <lecturer></lecturer>

9 XML Attributes
An empty element is not necessarily meaningless
– It may have some properties in terms of attributes
An attribute is a name-value pair inside the opening tag of an element

10 Well-Formed XML Documents
Well-formed documents are syntactically correct. Some syntactic rules:
– Only one outermost element (called the root element)
– Each element contains an opening and a corresponding closing tag
– Tags may not overlap, as they do in <author><name>Lee Hong</author></name>
– Attributes within an element have unique names
– Element and tag names must be permissible
An XML document is valid if
– it is well-formed
– it respects the structuring information it uses
There are two ways of defining the structure of XML documents:
– DTDs (the older and more restricted way)
– XML Schema (offers extended possibilities)
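
The well-formedness rules can be checked mechanically by any XML parser. The following is a minimal Python sketch, assuming the standard library's xml.etree.ElementTree; the sample documents and the helper name is_well_formed are illustrative and not part of the slides. Note that it tests well-formedness only, not validity against a DTD or schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Illustrative documents (hypothetical, not taken from the slides).
GOOD = '<lecturer name="David Billington"><phone/></lecturer>'
OVERLAPPING = "<a><b>text</a></b>"      # tags overlap
TWO_ROOTS = "<a/><b/>"                  # more than one outermost element
DUPLICATE_ATTR = '<a x="1" x="2"/>'     # attribute names are not unique

for doc in (GOOD, OVERLAPPING, TWO_ROOTS, DUPLICATE_ATTR):
    print(is_well_formed(doc), doc)
```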

11 The Tree Model of XML Documents: An Example

12 The Tree Model of XML Documents: An Example (2)

13 DTD: Element Type Definition
Example: a lecturer element containing a name element (David Billington) and a phone element
DTD for the above element (and all lecturer elements):
<!ELEMENT lecturer (name, phone)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT phone (#PCDATA)>
– The element types lecturer, name, and phone may be used in the document
– A lecturer element contains a name element and a phone element, in that order (sequence)
– A name element and a phone element may have any content
– In DTDs, #PCDATA is the only atomic type for elements

14 XML Schema
A significantly richer language for defining the structure of XML documents
Its syntax is based on XML itself
– No separate tools need to be written
Reuse and refinement of schemas
– Extend or restrict already existing schemas
A sophisticated set of data types, compared to DTDs (which support only strings)
An XML schema is an element with an opening tag like <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
Structure of schema elements
– Element and attribute types are defined using data types

15 Data Types
There is a variety of built-in data types
– Numerical data types: integer, short, etc.
– String types: string, ID, IDREF, CDATA, etc.
– Date and time data types: time, gMonth, etc.
There are also user-defined data types
– simple data types, which cannot use elements or attributes
– complex data types, which can use these
Complex data types are defined from already existing data types by defining some attributes (if any) and using:
– sequence, a sequence of existing data type elements (order is important)
– all, a collection of elements that must appear (order is not important)
– choice, a collection of elements, of which one will be chosen

16 A Data Type Example

17 XML Schema: The Example

18 Namespaces
An XML document may use more than one DTD or schema
Since each structuring document was developed independently, name clashes may appear
The solution is to use a different prefix for each DTD or schema
– prefix:name
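
To illustrate how prefixes resolve name clashes, here is a small Python sketch using the standard library's xml.etree.ElementTree. The two namespace URIs, the uni and lib prefixes, and the document itself are hypothetical; the point is only that two elements both called title remain distinguishable.

```python
import xml.etree.ElementTree as ET

# Hypothetical document mixing two vocabularies that both define "title".
DOC = """
<doc xmlns:uni="http://example.org/university"
     xmlns:lib="http://example.org/library">
  <uni:title>Associate Professor</uni:title>
  <lib:title>A Semantic Web Primer</lib:title>
</doc>
"""

# Prefix-to-URI map used when searching; the prefixes are local shorthand.
NS = {
    "uni": "http://example.org/university",
    "lib": "http://example.org/library",
}

root = ET.fromstring(DOC)
job_title = root.find("uni:title", NS).text
book_title = root.find("lib:title", NS).text

# ElementTree stores expanded names of the form {namespace-URI}local-name.
tags = [child.tag for child in root]
print(job_title, "|", book_title)
print(tags)
```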

19 An Example

20 Addressing and Querying XML Documents: XPath
In relational databases, parts of a database can be selected and retrieved using SQL
– The same is necessary for XML documents
– Query languages: XQuery, XQL, XML-QL
The central concept of XML query languages is the path expression
– It specifies how a node, or a set of nodes, in the tree representation of the XML document can be reached
XPath is the core of XML query languages: a language for addressing parts of an XML document
– It operates on the tree data model of XML
– It has a non-XML syntax
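
A path expression can be tried out with Python's standard library, which implements a limited subset of XPath in xml.etree.ElementTree. This is a sketch under that assumption; the library document below is invented for illustration (only the Primer's title and year come from the slides), and a full XPath engine supports many more constructs.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the second book is a made-up placeholder.
LIBRARY = """
<library>
  <book year="2004"><title>A Semantic Web Primer</title><author>G. Antoniou</author></book>
  <book year="1999"><title>Hypothetical Title</title><author>A. Author</author></book>
</library>
"""

root = ET.fromstring(LIBRARY)

# Child steps: every title that is a child of a book child of the root.
all_titles = [t.text for t in root.findall("./book/title")]

# Attribute predicate: only books whose year attribute equals 2004.
titles_2004 = [t.text for t in root.findall("./book[@year='2004']/title")]

# Descendant step: author elements at any depth below the root.
authors = [a.text for a in root.findall(".//author")]

print(all_titles, titles_2004, authors)
```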

21 XSL Transformations (XSLT)
XSLT specifies rules with which an input XML document is transformed to
– another XML document, an HTML document, or plain text
The output document may use the same DTD or schema, or a completely different vocabulary
XSLT can be used independently of the formatting language
The same content (here an author, Grigoris Antoniou, affiliated with the University of Bremen) may be displayed in different ways

22 Summary
– XML is a metalanguage that allows users to define markup
– XML separates content and structure from formatting
– XML is the de facto standard for the representation and exchange of structured information on the Web
– XML is supported by query languages

RDF

23 Drawbacks of XML
XML is a universal metalanguage for defining markup
It provides a uniform framework for the interchange of data and metadata between applications
However, XML does not provide any means of talking about the semantics (meaning) of data
– E.g., there is no intended meaning associated with the nesting of tags
– It is up to each application to interpret the nesting

24 Basic Ideas of RDF
Basic building block: the object-attribute-value triple
– It is called a statement
– A sentence about Billington is such a statement
RDF has been given a syntax in XML
– This syntax inherits the benefits of XML
– Other syntactic representations of RDF are possible
The fundamental concepts of RDF are:
– resources
– properties
– statements

25 Resources
We can think of a resource as an object, a thing we want to talk about
– E.g. authors, books, publishers, places, people, hotels
Every resource has a URI, a Uniform Resource Identifier
A URI can be
– a URL (Web address) or
– some other kind of unique identifier

26 Properties
Properties are a special kind of resource
They describe relations between resources
– E.g. written by, age, title, etc.
Properties are also identified by URIs
Advantages of using URIs:
– A global, worldwide, unique naming scheme
– Reduces the homonym problem of distributed data representation

27 Statements
Statements assert the properties of resources
A statement is an object-attribute-value triple
– It consists of a resource, a property, and a value
Values can be resources or literals
– Literals are atomic values (strings)
Three views of a statement
– A triple
– A piece of a graph
– A piece of XML code
Thus an RDF document can be viewed as
– a set of triples
– a graph (semantic net)
– an XML document

28 Statements as Triples
Example: a triple whose value is #David Billington
The triple (x, P, y) can be considered as a logical formula P(x, y)
– The binary predicate P relates object x to object y
– RDF offers only binary predicates (properties)
A set of triples can be viewed as a semantic net

29 RDF Statements in XML
An RDF document is represented by an XML element with the tag rdf:RDF
The content of this element is a number of descriptions, which use rdf:Description tags
Every description makes a statement about a resource, identified in one of three ways:
– an about attribute, referencing an existing resource
– an ID attribute, creating a new resource
– without a name, creating an anonymous resource

30 Reification
In RDF it is possible to make statements about statements
– E.g. Grigoris believes that David Billington is the creator of a given resource
Such statements can be used to describe belief or trust in other statements
The solution is to assign a unique identifier to each statement
– It can be used to refer to the statement
Introduce an auxiliary object (e.g. belief1) and relate it to each of the three parts of the original statement through the properties subject, predicate, and object
In the preceding example
– the subject of belief1 is David Billington
– the predicate of belief1 is creator
– the object of belief1 is the resource he created
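
Reification can be sketched with plain Python tuples standing in for triples. The ex: identifiers, the ex:somePage resource, and the helper name reify are assumptions made for illustration; only rdf:Statement, rdf:subject, rdf:predicate, and rdf:object are standard RDF vocabulary.

```python
def reify(statement_id, triple):
    """Describe `triple` through an auxiliary resource `statement_id`."""
    subject, predicate, obj = triple
    return {
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    }


# The original statement; ex:somePage is a placeholder for the created resource.
original = ("ex:DavidBillington", "ex:creator", "ex:somePage")

# belief1 is the auxiliary object named on the slide.
graph = reify("ex:belief1", original)

# A statement about the statement: Grigoris believes belief1.
graph.add(("ex:Grigoris", "ex:believes", "ex:belief1"))

for t in sorted(graph):
    print(t)
```

The original triple itself is deliberately not added to the graph: describing a statement does not assert it.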

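31 The rdf:resource Attribute
Example: the description of the course Discrete Mathematics refers, through an rdf:resource attribute, to the separately described lecturer David Billington (Associate Professor)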

32 Container Elements
Containers collect a number of resources or attributes about which we want to make statements as a whole
– E.g., we may wish to talk about the courses given by a particular lecturer
The contents of container elements are named rdf:_1, rdf:_2, etc.
– Alternatively rdf:li
Three types of container elements
– rdf:Bag, an unordered container, allowing multiple occurrences. E.g. members of the faculty board, documents in a folder
– rdf:Seq, an ordered container, which may contain multiple occurrences. E.g. modules of a course, items on an agenda, an alphabetized list of staff members (order is imposed)
– rdf:Alt, a set of alternatives. E.g. the document home and mirrors, translations of a document in various languages

33 Example for a Bag

34 Example for Alternative

35 RDF Collections
A limitation of these containers is that there is no way to close them
– i.e. to say "these are all the members of the container"
RDF provides support for describing groups containing only the specified members, in the form of RDF collections
– A list structure in the RDF graph
– Constructed using a predefined collection vocabulary: rdf:List, rdf:first, rdf:rest, and rdf:nil
Shorthand syntax:
– the value "Collection" for the rdf:parseType attribute
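
The list structure behind a collection can be sketched as a chain of nodes linked by rdf:first and rdf:rest and terminated by rdf:nil. The Python below is an illustrative sketch: the blank-node naming scheme (_:list0, _:list1, ...) and both function names are assumptions, and the rdf:type rdf:List triples are omitted for brevity.

```python
def make_collection(members, prefix="_:list"):
    """Return (head, triples) encoding `members` as an rdf:first/rdf:rest chain."""
    triples = []
    head = "rdf:nil"  # an empty collection is just rdf:nil
    # Build from the back so each node can point at the rest of the list.
    for i in range(len(members) - 1, -1, -1):
        node = f"{prefix}{i}"
        triples.append((node, "rdf:first", members[i]))
        triples.append((node, "rdf:rest", head))
        head = node
    return head, triples


def read_collection(head, triples):
    """Follow the chain from `head` until rdf:nil closes the list."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    members = []
    while head != "rdf:nil":
        members.append(first[head])
        head = rest[head]
    return members


# Hypothetical course identifiers.
head, triples = make_collection(["ex:CIT1111", "ex:CIT2112", "ex:CIT3116"])
print(read_collection(head, triples))
```

Because the chain ends in rdf:nil, a reader knows it has seen every member, which is exactly what the open-ended containers cannot express.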

36 Basic Ideas of RDF Schema
RDF is a universal language that lets users describe resources in their own vocabularies
– RDF does not assume, nor does it define, the semantics of any particular application domain
The user can do so in RDF Schema using:
– Classes and properties
– Class hierarchies and inheritance
– Property hierarchies

37 Classes and their Instances
We must distinguish between
– Concrete things (individual objects) in the domain: Discrete Maths, David Billington, etc.
– Sets of individuals sharing properties, called classes: lecturers, students, courses, etc.
Individual objects that belong to a class are referred to as instances of that class
The relationship between instances and classes in RDF is expressed through rdf:type

38 Inheritance in Class Hierarchies
Range restriction: courses must be taught by academic staff members only
Michael Maher is a professor
– He inherits the ability to teach from the class of academic staff members
This is done in RDF Schema by fixing the semantics of "is a subclass of"
– It is not up to an application (RDF processing software) to interpret "is a subclass of"

39 Property Hierarchies
Hierarchical relationships for properties
– E.g., "is taught by" is a subproperty of "involves"
– If a course C is taught by an academic staff member A, then C also involves A
The converse is not necessarily true
– E.g., A may be the teacher of the course C, or
– a tutor who marks student homework but does not teach C
P is a subproperty of Q if Q(x, y) is true whenever P(x, y) is true

40 RDF Schema in RDF
The modeling primitives of RDF Schema are defined using resources and properties (RDF itself is used!)
To declare that "lecturer" is a subclass of "academic staff member":
– Define the resources lecturer, academicStaffMember, and subClassOf
– Define subClassOf to be a property
– Write the triple (lecturer, subClassOf, academicStaffMember)
We use the XML-based syntax of RDF

41 Core Classes
– rdfs:Resource, the class of all resources
– rdfs:Class, the class of all classes
– rdfs:Literal, the class of all literals (strings)
– rdf:Property, the class of all properties
– rdf:Statement, the class of all reified statements

42 Core Properties
rdf:type, which relates a resource to its class
– The resource is declared to be an instance of that class
rdfs:subClassOf, which relates a class to one of its superclasses
– All instances of a class are instances of its superclass
rdfs:subPropertyOf, which relates a property to one of its superproperties
rdfs:domain, which specifies the domain of a property P
– The class of those resources that may appear as subjects in a triple with predicate P
– If the domain is not specified, then any resource can be the subject
rdfs:range, which specifies the range of a property P
– The class of those resources that may appear as values in a triple with predicate P

43 Semantics Based on Inference Rules
The semantics is given in terms of RDF triples, instead of restating RDF in terms of first-order logic, together with a sound and complete inference system
This inference system consists of inference rules of the form:
IF E contains certain triples THEN add to E certain additional triples
where E is an arbitrary set of RDF triples
Examples
– IF E contains the triple (?x, ?p, ?y) THEN E also contains (?p, rdf:type, rdf:Property)
– IF E contains the triples (?u, rdfs:subClassOf, ?v) and (?v, rdfs:subClassOf, ?w) THEN E also contains the triple (?u, rdfs:subClassOf, ?w)
– IF E contains the triples (?x, rdf:type, ?u) and (?u, rdfs:subClassOf, ?v) THEN E also contains the triple (?x, rdf:type, ?v)
Any resource ?y which appears as the value of a property ?p can be inferred to be a member of the range of ?p
– IF E contains the triples (?x, ?p, ?y) and (?p, rdfs:range, ?u) THEN E also contains the triple (?y, rdf:type, ?u)
– This shows that range definitions in RDF Schema are not used to restrict the range of a property, but rather to infer membership of the range
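
The four example rules can be applied by naive forward chaining: keep adding the triples the rules demand until nothing new appears. The Python below is a sketch of that idea only, not a complete RDFS reasoner; the function name rdfs_closure and the ex: identifiers in the sample data are hypothetical.

```python
def rdfs_closure(triples):
    """Apply the four example rules until the set E stops growing."""
    E = set(triples)
    while True:
        new = set()
        sub = [(u, v) for (u, p, v) in E if p == "rdfs:subClassOf"]
        ranges = [(p, u) for (p, q, u) in E if q == "rdfs:range"]
        for (x, p, y) in E:
            # Rule 1: every predicate in use is a property.
            new.add((p, "rdf:type", "rdf:Property"))
            # Rule 3: instances of a class are instances of its superclasses.
            if p == "rdf:type":
                for (u, v) in sub:
                    if y == u:
                        new.add((x, "rdf:type", v))
            # Rule 4: values of a property belong to its range.
            for (q, u) in ranges:
                if p == q:
                    new.add((y, "rdf:type", u))
        # Rule 2: rdfs:subClassOf is transitive.
        for (u, v) in sub:
            for (v2, w) in sub:
                if v == v2:
                    new.add((u, "rdfs:subClassOf", w))
        if new <= E:
            return E
        E |= new


# Hypothetical input data in the spirit of the slides' university examples.
facts = {
    ("ex:lecturer", "rdfs:subClassOf", "ex:academicStaffMember"),
    ("ex:academicStaffMember", "rdfs:subClassOf", "ex:staffMember"),
    ("ex:david", "rdf:type", "ex:lecturer"),
    ("ex:isTaughtBy", "rdfs:range", "ex:academicStaffMember"),
    ("ex:CIT1111", "ex:isTaughtBy", "ex:maher"),
}
closure = rdfs_closure(facts)
for t in sorted(closure - facts):
    print(t)
```

Note how ex:maher is never declared to be a staff member: the range rule infers it, which is the sense in which ranges infer membership rather than restrict values.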

44 SPARQL RDF Query Language
SPARQL is based on matching graph patterns
The simplest graph pattern is the triple pattern
– It is like an RDF triple, but with the possibility of a variable instead of an RDF term in the subject, predicate, or object position
Combining triple patterns gives a basic graph pattern, where an exact match to a graph is needed to fulfill the pattern
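
Triple-pattern matching can be imitated in a few lines of Python. This is a toy sketch of the matching idea, not SPARQL itself: variables are strings starting with "?", the graph is a set of tuples, and the names match_pattern and query as well as the ex: data are assumptions made for illustration.

```python
def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`; None if impossible."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended


def query(patterns, graph):
    """Match a basic graph pattern: every triple pattern must be satisfied."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for binding in solutions:
            for triple in graph:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Hypothetical graph.
graph = {
    ("ex:CIT1111", "ex:isTaughtBy", "ex:david"),
    ("ex:CIT2112", "ex:isTaughtBy", "ex:maher"),
    ("ex:david", "ex:title", "Associate Professor"),
}

# Which courses are taught by someone with a known title?
results = query(
    [("?course", "ex:isTaughtBy", "?who"), ("?who", "ex:title", "?title")],
    graph,
)
print(results)
```

The shared variable ?who joins the two triple patterns, so only courses whose lecturer also has a title survive the second step.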

45 Summary
– RDF provides a foundation for representing and processing metadata
– RDF has a graph-based data model
– RDF has an XML-based syntax to support syntactic interoperability; XML and RDF complement each other because RDF supports semantic interoperability
– RDF has a decentralized philosophy and allows incremental building of knowledge, and its sharing and reuse
– RDF is domain-independent; RDF Schema provides a mechanism for describing specific domains
– RDF Schema is a primitive ontology language that offers certain modelling primitives with fixed meaning
– Key concepts of RDF Schema are class, subclass relations, property, subproperty relations, and domain and range restrictions
– There exist query languages for RDF and RDFS, including SPARQL

OWL

46 Requirements for Ontology Languages
Ontology languages allow users to write explicit, formal conceptualizations of domain models
The main requirements are:
– a well-defined syntax
– efficient reasoning support
– a formal semantics
– sufficient expressive power
– convenience of expression

47 Reasoning About Knowledge in Ontology Languages
Class membership
– If x is an instance of a class C, and C is a subclass of D, then we can infer that x is an instance of D
Equivalence of classes
– If class A is equivalent to class B, and class B is equivalent to class C, then A is equivalent to C, too
Consistency
– Suppose x is declared an instance of classes A and B, but A and B are disjoint
– This is an indication of an error in the ontology
Classification
– Certain property-value pairs are a sufficient condition for membership in a class A; if an individual x satisfies such conditions, we can conclude that x must be an instance of A
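
Two of these reasoning tasks, class membership and consistency checking, can be sketched in Python with dictionaries standing in for an ontology. This is an illustration under simplifying assumptions (only subclass and disjointness axioms, no description-logic reasoner); the class names echo the wildlife slides, while the individuals and function names are hypothetical.

```python
def superclasses(cls, subclass_of):
    """All classes reachable from `cls` by following subclass links."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(individual, types, subclass_of):
    """Class membership: declared classes plus all their superclasses."""
    result = set()
    for cls in types.get(individual, ()):
        result.add(cls)
        result |= superclasses(cls, subclass_of)
    return result


def inconsistencies(types, subclass_of, disjoint):
    """Consistency: individuals that end up in two disjoint classes."""
    problems = []
    for individual in sorted(types):
        classes = inferred_types(individual, types, subclass_of)
        for a, b in disjoint:
            if a in classes and b in classes:
                problems.append((individual, a, b))
    return problems


subclass_of = {"Tree": {"Plant"}, "Giraffe": {"Animal"}}
disjoint = [("Plant", "Animal")]
# Hypothetical individuals; "oddity" is deliberately mis-modelled.
types = {"acacia1": {"Tree"}, "oddity": {"Tree", "Giraffe"}}

print(inferred_types("acacia1", types, subclass_of))
print(inconsistencies(types, subclass_of, disjoint))
```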

48 Reasoning Support for OWL
Semantics is a prerequisite for reasoning support
Formal semantics and reasoning support are usually provided by
– mapping an ontology language to a known logical formalism
– using automated reasoners that already exist for those formalisms
OWL is (partially) mapped onto a description logic, and makes use of reasoners such as FaCT and RACER
Description logics are a subset of predicate logic for which efficient reasoning support is possible

49 Some Limitations of the Expressive Power of RDF Schema
Local scope of properties
– rdfs:range defines the range of a property (e.g. eats) for all classes
– In RDF Schema we cannot declare range restrictions that apply to some classes only
– E.g. we cannot say that cows eat only plants, while other animals may eat meat, too
Disjointness of classes
– Sometimes we wish to say that classes are disjoint (e.g. male and female)
Boolean combinations of classes
– Sometimes we wish to build new classes by combining other classes using union, intersection, and complement
– E.g. person is the disjoint union of the classes male and female
Cardinality restrictions
– E.g. a person has exactly two parents; a course is taught by at least one lecturer
Special characteristics of properties
– Transitive property (like "greater than")
– Unique property (like "is mother of")
– A property that is the inverse of another property (like "eats" and "is eaten by")

50 Three Species of OWL
The W3C's Web Ontology Working Group defined OWL as three different sublanguages:
– OWL Full
– OWL DL
– OWL Lite
Each sublanguage is geared toward fulfilling different aspects of the requirements
OWL uses XML and RDF for syntax
OWL is based on Description Logic
– Description Logic is a fragment of first-order logic
OWL inherits from Description Logic
– The open-world assumption
– The non-unique-name assumption

51 OWL
OWL Full
– It uses all the OWL language primitives and allows the combination of these primitives in arbitrary ways with RDF and RDF Schema
– OWL Full is fully upward-compatible with RDF, both syntactically and semantically
– OWL Full is so powerful that it is undecidable: there is no complete (or efficient) reasoning support
OWL DL (Description Logic)
– It is a sublanguage of OWL Full that restricts the application of the constructors from OWL and RDF
– Application of OWL's constructors to each other is disallowed; it therefore corresponds to a well-studied description logic
– OWL DL permits efficient reasoning support, but we lose full compatibility with RDF
– Not every RDF document is a legal OWL DL document, but every legal OWL DL document is a legal RDF document
OWL Lite
– An even further restriction limits OWL DL to a subset of the language constructors
– OWL Lite excludes enumerated classes, disjointness statements, and arbitrary cardinality
– The advantage is a language that is easier to grasp (for users) and to implement (for tool builders)
– The disadvantage is restricted expressivity

52 owl:Ontology
Example: an owl:Ontology header with the comment "An example OWL ontology" and the label "University Ontology"
owl:imports is a transitive property

53 Classes
Classes are defined using owl:Class
– owl:Class is a subclass of rdfs:Class
Disjointness is defined using owl:disjointWith
owl:equivalentClass defines equivalence of classes
owl:Thing is the most general class, which contains everything
owl:Nothing is the empty class

54 Properties
In OWL there are two kinds of properties
– Object properties, which relate objects to other objects. E.g. isTaughtBy, supervises
– Data type properties, which relate objects to datatype values. E.g. phone, title, age, etc.
Data type properties: OWL makes use of XML Schema data types, using the layered architecture of the Semantic Web
– This includes user-defined data types

55 An African Wildlife Ontology – Properties

56 An African Wildlife Ontology – Plants and Trees
Plants form a class disjoint from animals. Trees are a type of plant.

57 An African Wildlife Ontology – Branches
Branches are parts of trees.

58 An African Wildlife Ontology – Leaves
Leaves are parts of branches.

59 An African Wildlife Ontology – Carnivores
Carnivores are exactly those animals that eat animals.

60 An African Wildlife Ontology – Giraffes
Giraffes are herbivores, and they eat only leaves.

61 Summary
– OWL is the proposed standard for Web ontologies
– OWL builds upon RDF and RDF Schema: (XML-based) RDF syntax is used, instances are defined using RDF descriptions, and most RDFS modeling primitives are used
– Formal semantics and reasoning support are provided through the mapping of OWL onto logics; predicate logic and description logics have been used for this purpose
– While OWL is sufficiently rich to be used in practice, extensions are in the making; they will provide further logical features, including rules

RULES

62 Semantic Web Rule Language
A rule in SWRL has the form
– B1, …, Bn → A1, …, Am
– Commas denote conjunction on both sides
– A1, …, Am, B1, …, Bn can be of the form C(x), P(x,y), sameAs(x,y), or differentFrom(x,y), where C is an OWL description, P is an OWL property, and x, y are Datalog variables, OWL individuals, or OWL data values
If the head of a rule has more than one atom, the rule can be transformed to an equivalent set of rules with one atom in the head
Expressions, such as restrictions, can appear in the head or body of a rule
– This feature adds significant expressive power to OWL, but at the high price of undecidability

63 Non-monotonic Rules
In nonmonotonic rule systems, a rule may not be applied even if all its premises are known, because we have to consider contrary reasoning chains
We now consider defeasible rules, which can be defeated by other rules
Negated atoms may occur in the head and the body of rules, to allow for conflicts
– p(X) ⇒ q(X)
– r(X) ⇒ ¬q(X)

64 The Potential Buyer
Carlos's requirements
– At least 45 sq m with at least 2 bedrooms
– An elevator if on the 3rd floor or higher
– Pet animals must be allowed
Carlos is willing to pay:
– $300 for a centrally located 45 sq m apartment
– $250 for a similar flat in the suburbs
– An extra $5 per square meter for a larger apartment
– An extra $2 per square meter for a garden
– He is unable to pay more than $400 in total
If given the choice, he would go for the cheapest option
– His second priority is the presence of a garden
– His lowest priority is additional space

65 Formalization of Carlos's Requirements – Rules
r1: ⇒ acceptable(X)
r2: bedrooms(X,Y), Y < 2 ⇒ ¬acceptable(X)
r3: size(X,Y), Y < 45 ⇒ ¬acceptable(X)
r4: ¬pets(X) ⇒ ¬acceptable(X)
r5: floor(X,Y), Y > 2, ¬lift(X) ⇒ ¬acceptable(X)
r6: price(X,Y), Y > 400 ⇒ ¬acceptable(X)
r2 > r1, r3 > r1, r4 > r1, r5 > r1, r6 > r1
r7: size(X,Y), Y ≥ 45, garden(X,Z), central(X) ⇒ offer(X, 300 + 2*Z + 5*(Y − 45))
r8: size(X,Y), Y ≥ 45, garden(X,Z), ¬central(X) ⇒ offer(X, 250 + 2*Z + 5*(Y − 45))
r9: offer(X,Y), price(X,Z), Y < Z ⇒ ¬acceptable(X)
r9 > r1

66 Representation of Available Apartments
Flat  Bedrooms  Size  Central  Floor  Lift  Pets  Garden  Price
a1    1         50    yes      1      no    yes   0       300
a2    2         45    yes      0      no    yes   0       335
a3    2         65    no       2            yes   0       350
a4    2         55    no       1      yes   no    15      330
a5    3         55    yes      0      no    yes   15      350
a6    2         60    yes      3      no          0       370
a7    3         65    yes      1      no    yes   12      375

67 Determining Acceptable Apartments
If we match Carlos's requirements against the available apartments, we see that
– flat a1 is not acceptable because it has only one bedroom (rule r2)
– flats a4 and a6 are unacceptable because pets are not allowed (rule r4)
– for a2, Carlos is willing to pay $300, but the price is higher (rules r7 and r9)
– flats a3, a5, and a7 are acceptable (rule r1)
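
The outcome above can be reproduced with a short Python sketch. It hard-codes the priority relation (every rule r2 to r6 and r9 defeats r1) instead of implementing a general defeasible-logic engine, and all function and variable names are illustrative. Two table cells are blank in the transcript: a3's lift is left as None (it cannot matter, since a3 is on floor 2), and a6's pets value is taken as "no" because slide 67 says pets are not allowed there. The final step applies Carlos's stated preferences: lowest price, then garden size, then floor space.

```python
COLUMNS = ("bedrooms", "size", "central", "floor", "lift", "pets", "garden", "price")
ROWS = {
    "a1": (1, 50, True, 1, False, True, 0, 300),
    "a2": (2, 45, True, 0, False, True, 0, 335),
    "a3": (2, 65, False, 2, None, True, 0, 350),    # lift blank in the transcript
    "a4": (2, 55, False, 1, True, False, 15, 330),
    "a5": (3, 55, True, 0, False, True, 15, 350),
    "a6": (2, 60, True, 3, False, False, 0, 370),   # pets assumed "no" (slide 67)
    "a7": (3, 65, True, 1, False, True, 12, 375),
}
FLATS = {name: dict(zip(COLUMNS, row)) for name, row in ROWS.items()}


def offer(flat):
    """Rules r7 and r8: what Carlos is willing to pay (needs size >= 45)."""
    if flat["size"] < 45:
        return None
    base = 300 if flat["central"] else 250
    return base + 2 * flat["garden"] + 5 * (flat["size"] - 45)


def defeaters(flat):
    """Rules that fire against acceptability; each one overrides r1."""
    fired = []
    if flat["bedrooms"] < 2:
        fired.append("r2")
    if flat["size"] < 45:
        fired.append("r3")
    if not flat["pets"]:
        fired.append("r4")
    if flat["floor"] > 2 and not flat["lift"]:
        fired.append("r5")
    if flat["price"] > 400:
        fired.append("r6")
    bid = offer(flat)
    if bid is not None and bid < flat["price"]:
        fired.append("r9")
    return fired


def acceptable(flat):
    """r1 holds by default unless a higher-priority rule defeats it."""
    return not defeaters(flat)


accepted = sorted(name for name, flat in FLATS.items() if acceptable(flat))

# Preferences: cheapest first, then larger garden, then larger size.
best = min(
    accepted,
    key=lambda n: (FLATS[n]["price"], -FLATS[n]["garden"], -FLATS[n]["size"]),
)
print(accepted, best)
```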

68 Summary
– Horn logic is a subset of predicate logic that allows efficient reasoning; it is orthogonal to description logics
– Horn logic is the basis of monotonic rules
– DLP and SWRL are two important ways of combining OWL with Horn rules; DLP is essentially the intersection of OWL and Horn logic, whereas SWRL is a much richer language
– Nonmonotonic rules are useful in situations where the available information is incomplete; they are rules that may be overridden by contrary evidence
– Priorities are used to resolve some conflicts between rules
– Representation in XML-like languages is straightforward
