Dr. Bhavani Thuraisingham The University of Texas at Dallas Lecture #8 Trustworthy Semantic Webs February 2011 Data and Applications Security Developments.

Dr. Bhavani Thuraisingham The University of Texas at Dallas Lecture #8 Trustworthy Semantic Webs February 2011 Data and Applications Security Developments and Directions

Outline
• Semantic web
• XML and XML security
• RDF and RDF security
• Ontologies
• Rules
• Applications
• Reference: Building Trustworthy Semantic Webs, Thuraisingham, CRC Press, 2007

From Today’s Web to Semantic Web
• Today’s web
  - High recall, low precision: searches return too many web pages, many of them not relevant
  - Sometimes low recall
  - Results are sensitive to vocabulary: different words do not return the same web pages even when they mean the same thing
  - Results are single web pages, not linked web pages
• Semantic web
  - Machine-understandable web pages
  - Activities on the web, such as searching, with little or no human intervention
  - Technologies for knowledge management, e-commerce, interoperability
  - Solutions to the problems faced by today’s web

Knowledge Management and Personal Agents
• Knowledge Management
  - Corporate need: searching, extracting and maintaining information, uncovering hidden dependencies, viewing information
  - Semantic web for knowledge management: organizing knowledge, automated tools for maintaining knowledge, question answering, querying multiple documents, controlling access to documents
• Personal Agent
  - John is the president of a company. He needs surgery for a serious but not critical illness. With the current web he has to check each web page for relevant information and make decisions depending on the information provided
  - With the semantic web, the agent will retrieve all the relevant information, synthesize it, ask John if needed, and then present the various options and make recommendations

E-Commerce
• Business to Consumer
  - Users shop on the web; wrapper technology is used to extract information about user preferences etc. and display the products to the user
  - Use of semantic web: develop software agents that can interpret privacy requirements, pricing and product information and display timely and correct information to the user; the agents also provide information about the reputation of shops
• Business to Business
  - Organizations work together and carry out transactions such as collaborating on a product, supply chains etc. Today’s web lacks standards for data exchange
  - Use of semantic web: XML is a big improvement, but the parties need to agree on vocabulary. The future will be the use of ontologies to agree on meanings and interpretations

Semantic Web Technologies
• Explicit metadata
  - Metadata is data about data; metadata needs to be explicitly specified so that different groups and organizations will know what is on the web
  - Metadata specification languages include XML and RDF
• Ontologies
  - An ontology is an explicit and formal specification of a conceptualization; it describes a domain of discourse and its relationships
  - Ontology languages include XML, RDF, OWL
• Logic
  - Logic can be used to specify facts as well as rules; new facts are derived from existing facts based on the inference rules
  - Description logic is the type of logic that has been developed for semantic web applications

Layered Approach: Tim Berners-Lee’s Vision

What is XML all about?
• XML is needed due to the limitations of HTML and the complexities of SGML
• It is an extensible markup language specified by the W3C (World Wide Web Consortium)
• Designed to make the interchange of structured documents over the Internet easier
• Key to XML used to be Document Type Definitions (DTDs)
  - A DTD defines the role of each element of text in a formal model
• XML schemas have now become critical to specify the structure
  - XML schemas are also XML documents

XML Elements
XML statement: John Smith is a Professor in Texas. This can be expressed as follows:
<Professor> <Name> John Smith </Name> <State> Texas </State> </Professor>

XML Elements
Now suppose this data can be read by anyone. Then we can augment the XML statement with an additional element called access, as follows:
<Professor> <Name> John Smith </Name> <State> Texas </State> <access> All, Read </access> </Professor>

XML Elements
If only HR can update this XML statement, then we have the following:
<Professor> <Name> John Smith </Name> <State> Texas </State> <access> HR department, Write </access> </Professor>

XML Elements
We may not wish for everyone to know that John Smith is a professor, but we can give out the information that this professor is in Texas. This can be expressed as:
<Professor> <Name> John Smith, Govt-official, Read </Name> <State> Texas, All, Read </State> <access> HR department, Write </access> </Professor>

XML Attributes
Suppose we want to specify access based on attribute values. One way to specify such access is given below.
<Professor
  Name = “John Smith”, Access = All, Read
  Salary = “60K”, Access = Administrator, Read, Write
  Department = “Security”, Access = All, Read>
</Professor>
Here we assume that everyone can read the name John Smith and the department Security, but only the administrator can read and write the salary attribute.
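One way to read the attribute-level policy above is as a lookup table consulted before a view is returned. The following is a minimal Python sketch under stated assumptions: the ACCESS table, the role name "Student" and both helper functions are hypothetical and are not part of the lecture.

```python
# Hypothetical policy table mirroring the slide: attribute -> {role: privileges}.
PROFESSOR = {"Name": "John Smith", "Salary": "60K", "Department": "Security"}

ACCESS = {
    "Name": {"All": {"Read"}},
    "Salary": {"Administrator": {"Read", "Write"}},
    "Department": {"All": {"Read"}},
}


def allowed(role, attribute, privilege):
    """True if the role, or the catch-all role 'All', holds the privilege."""
    grants = ACCESS.get(attribute, {})
    return privilege in grants.get(role, set()) or privilege in grants.get("All", set())


def readable_view(record, role):
    """Return only the attributes that the given role may read."""
    return {k: v for k, v in record.items() if allowed(role, k, "Read")}


student_view = readable_view(PROFESSOR, "Student")        # hypothetical role
admin_view = readable_view(PROFESSOR, "Administrator")
```

Here a role with no explicit grant sees only Name and Department, while the administrator also sees Salary.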

XML DTD
DTDs essentially specify the structure of XML documents. Consider the following DTD for Professor with elements Name and State. This will be specified as:
<!ELEMENT Professor (Name, State)> <!ELEMENT Name (#PCDATA)> <!ELEMENT State (#PCDATA)>

XML Schema
While DTDs were the early attempt to specify structure for XML documents, XML schemas are a more elegant way to specify structure. Unlike DTDs, XML schemas use XML syntax for the specification. Consider the following example:
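A minimal sketch of the point that XML schemas are themselves XML documents. The schema text below is hypothetical (the element names Professor, Name and State are taken from the DTD slide); Python's standard xml.etree.ElementTree module reads it like any other XML document.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema for the Professor example; not taken from the lecture.
PROFESSOR_XSD = """<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Professor">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Name" type="xsd:string"/>
        <xsd:element name="State" type="xsd:string"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
"""

XSD_NS = "{http://www.w3.org/2001/XMLSchema}"

# Because the schema is itself XML, an ordinary XML parser can read it.
schema_root = ET.fromstring(PROFESSOR_XSD)

# Collect the declared element names in document order.
declared = [el.get("name") for el in schema_root.iter(XSD_NS + "element")]
```

A DTD, by contrast, has its own non-XML syntax and could not be loaded this way.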

XML Namespaces
Namespaces are used for disambiguation.
<CountryX:Academic-Institution
  xmlns:CountryX = “ DTD”
  xmlns:USA = “ DTD”
  xmlns:UK = “ DTD”>
<USA:Title = College USA:Name = “University of Texas at Dallas” USA:State = “Texas”>
<UK:Title = University UK:Name = “Cambridge University” UK:State = Cambs>

Federations/Distribution
Site 1 document: 111 John Smith Texas
Site 2 document: K

Credentials in XML
Alice Brown, University of X, CS, Security
John James, University of X, CS, Senior

Policies in XML
<policy-spec cred-expr = “//Professor[department = ‘CS’]” target = “annual_report.xml” path = = ‘CS’]//Node()” priv = “VIEW”/>
<policy-spec cred-expr = “//Professor[department = ‘CS’]” target = “annual_report.xml” path = = ‘EE’] /Short-descr/Node() and //Patent = ‘EE’]/authors” priv = “VIEW”/>
<policy-spec cred-expr =
Explanation: CS professors are entitled to access all the patents of their department. They are entitled to see only the short descriptions and authors of patents of the EE department.
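The credential expression in these policy specs is a path expression evaluated against the subject's credentials. The sketch below illustrates that matching step only, using the limited XPath subset in Python's standard xml.etree.ElementTree. The credential tag names, the subject "Carol White" and the dictionary form of the policy are assumptions for illustration, not the lecture's format.

```python
import xml.etree.ElementTree as ET

# Hypothetical credential base; tag names are chosen to match the slide's
# credential expression //Professor[department = 'CS'].
CREDENTIALS = ET.fromstring("""
<credentials>
  <Professor><name>Alice Brown</name><department>CS</department></Professor>
  <Professor><name>Carol White</name><department>EE</department></Professor>
  <Secretary><name>John James</name><department>CS</department></Secretary>
</credentials>
""")

# One policy: subjects matching cred_expr receive priv on target.
POLICY = {
    "cred_expr": ".//Professor[department='CS']",  # ElementTree XPath subset
    "target": "annual_report.xml",
    "priv": "VIEW",
}


def subjects_granted(policy, credentials):
    """Names of subjects whose credential satisfies the policy expression."""
    return [c.findtext("name") for c in credentials.findall(policy["cred_expr"])]


def may(subject, priv, target, policy, credentials):
    """True if the policy grants the privilege on the target to the subject."""
    return (
        policy["priv"] == priv
        and policy["target"] == target
        and subject in subjects_granted(policy, credentials)
    )
```

For example, may("Alice Brown", "VIEW", "annual_report.xml", POLICY, CREDENTIALS) holds, while the CS secretary and the EE professor are not matched by this policy.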

Access Control Strategy
• Subjects request access to XML documents under two modes: browsing and authoring
  - With browsing access, a subject can read and navigate documents
  - Authoring access is needed to modify, delete or append to documents
• The access control module checks the policy base and applies the policy specs
• Views of the document are created based on credentials and policy specs
• In case of conflict, the least access privilege rule is enforced
• Works for Push/Pull modes
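The view-creation and conflict-resolution steps can be sketched as follows. This is an illustration under assumptions: the privilege names, their ordering, and the reading of "least access privilege" as "keep the most restrictive of the conflicting grants" are not spelled out in the lecture.

```python
# Assumed privilege ordering (not from the slides): lower rank = less access.
RANK = {"NONE": 0, "BROWSE": 1, "AUTHOR": 2}


def resolve(privileges):
    """Least access privilege: among conflicting grants keep the most restrictive."""
    return min(privileges, key=RANK.__getitem__)


def build_view(document, applicable):
    """Keep only the parts of the document the subject may at least browse.

    document:   {part name: content}
    applicable: {part name: privileges from every matching policy spec}
    Parts with no matching policy spec are withheld.
    """
    view = {}
    for part, content in document.items():
        grants = applicable.get(part)
        if grants and RANK[resolve(grants)] >= RANK["BROWSE"]:
            view[part] = content
    return view


# Hypothetical document parts and the grants collected for one subject.
report = {"summary": "Annual summary", "patents": "Patent list", "budget": "Budget figures"}
applicable = {
    "summary": ["BROWSE", "AUTHOR"],
    "patents": ["BROWSE"],
    "budget": ["AUTHOR", "NONE"],
}
view = build_view(report, applicable)
```

The budget part is withheld because one of its conflicting policy specs denies access, and the most restrictive grant wins.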

System Architecture for Access Control
[Figure: User; Pull/Query; Push/result; XML Documents; X-Access; X-Admin; Admin Tools; Policy base; Credential base]

Third-Party Architecture
[Figure: Owner; Publisher; User/Subject; XML Source; Credential base; Policy base; SE-XML; credentials; Query; Reply document]
• The Owner is the producer of information; it specifies access control policies
• The Publisher is responsible for managing (a portion of) the Owner’s information and answering subject queries
• Goal: Untrusted Publisher with respect to authenticity and completeness checking

XML Databases
• Data is presented as XML documents
• Query language: XML-QL
• Query optimization
• Managing transactions on XML documents
• Metadata management: XML schemas/DTDs
• Access methods and index strategies
• XML security and integrity management

Inference/Privacy Control
[Figure: Policies, Ontologies, Rules; XML Database; XML Documents; Web Pages, Databases; Inference Engine/Rules Processor; Interface to the Semantic Web; Technology by UTD]

Why RDF?
• XML cannot be used to specify semantics
• Example:
  - Professor is a subclass of Academic Staff
  - Professor inherits all properties of Academic Staff
• RDF was specified so that the inadequacies of XML could be handled
• RDF uses XML syntax
• Additional constructs are needed for RDF

RDF
• Resource Description Framework is the essence of the semantic web
• Adds semantics with the use of ontologies, XML syntax
• RDF Concepts
  - Basic Model
    • Resources, Properties and Statements
  - Container Model
    • Bag, Sequence and Alternative

RDF Basics
• Resource: everything is a resource
  - Person, Vehicle, etc.
• Property: properties describe relationships between resources
  - E.g., Invented
• Statement: an (Object, Property, Value) triple
  - Berners-Lee invented the Semantic Web
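The basic model can be sketched in a few lines of Python. This is an illustration only: the Statement type and the values_of helper are hypothetical names, and the second statement is added just to have more than one triple in the store.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One RDF statement, using the slide's (Object, Property, Value) wording."""

    obj: str    # the resource being described
    prop: str   # the property relating the two
    value: str  # another resource or a literal


statements = [
    Statement("Berners-Lee", "invented", "Semantic Web"),
    Statement("Berners-Lee", "type", "Person"),  # hypothetical extra statement
]


def values_of(obj, prop, store):
    """All values v such that (obj, prop, v) is in the store."""
    return [s.value for s in store if s.obj == obj and s.prop == prop]


invented = values_of("Berners-Lee", "invented", statements)
```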

RDF Container Model
• Bag: unordered container, may contain multiple occurrences
  - rdf:Bag
• Seq: ordered container, may contain multiple occurrences
  - rdf:Seq
• Alt: a set of alternatives
  - rdf:Alt

RDF Specification
<rdf:RDF xmlns:rdf = “ xmlns:xsd = “ xmlns:uni = “
<rdf:Description rdf:about = “949352” Professor
<rdf:Description rdf:about = “ZZZ” semantic web

RDF Specification
• RDF specifications have been given for attributes, types, nesting, containers, etc.
• How can security policies be included in the specification?
• Example: consider the statement “Berners-Lee is the author of the book Semantic Web”
• Do we allow access to the connection between author and book? Do we allow access to the connection but not to the author name and book name?

RDF Policy Specification
<rdf:RDF xmlns:rdf = “ xmlns:xsd = “ xmlns:uni = “
<rdf:Description rdf:about = “949352” Professor Level = L1
<rdf:Description rdf:about = “ZZZ” semantic web Level = L2
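One possible reading of the Level labels is that each description carries a sensitivity level and a reader sees only what their clearance allows. The sketch below assumes that L1 is less sensitive than L2 and uses hypothetical property names; the lecture states neither.

```python
# Assumed ordering of the slide's labels: L1 is less sensitive than L2.
LEVELS = {"L1": 1, "L2": 2}

# (subject, property, value, level). The property names are hypothetical;
# the slide shows only the descriptions "949352" and "ZZZ" and their levels.
LABELLED = [
    ("949352", "title", "Professor", "L1"),
    ("ZZZ", "bookname", "semantic web", "L2"),
]


def visible(statements, clearance):
    """Statements whose level does not exceed the reader's clearance."""
    limit = LEVELS[clearance]
    return [(s, p, v) for s, p, v, level in statements if LEVELS[level] <= limit]


low_view = visible(LABELLED, "L1")
high_view = visible(LABELLED, "L2")
```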

RDF Schema
• RDF Schema is needed to specify statements such as “professor is a subclass of academic staff”
<rdfs:Class rdf:ID = “professor”> The class of Professors. All professors are Academic Staff Members.

RDF Schema: Security Policies
• How can security policies be specified?
<rdfs:Class rdf:ID = “professor”> The class of Professors. All professors are Academic Staff Members. Level = L

RDF Axiomatic Semantics
• First-order logic is used to specify formulas and inferencing
  - Built-in functions (First) and predicates (Type)
  - Modus ponens: from A and “if A then B”, deduce B
• Example: all containers are resources
  - Type(?c, Container) → Type(?c, Resource)
  - If we have Type(A, Container), then we can infer Type(A, Resource)

RDF Inferencing
• While first-order logic provides a proof system, it is computationally infeasible
• As a result, Horn clause logic was developed for logic programming; this is still computationally expensive
• RDF uses if-then rules
• IF E contains the triples (?u, rdfs:subClassOf, ?v) and (?v, rdfs:subClassOf, ?w) THEN E also contains the triple (?u, rdfs:subClassOf, ?w)
  - That is, if u is a subclass of v, and v is a subclass of w, then u is a subclass of w
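The if-then rule above can be applied repeatedly until nothing new is derived. The following Python sketch does this naively; the class names in E are hypothetical, and a real RDF store would use a far more efficient procedure.

```python
SUBCLASS = "rdfs:subClassOf"


def subclass_closure(triples):
    """Apply the transitivity rule until no new triple can be added."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        # Snapshot of the current subclass links (u, v).
        links = [(u, v) for (u, p, v) in closed if p == SUBCLASS]
        for u, v in links:
            for v2, w in links:
                new = (u, SUBCLASS, w)
                if v == v2 and new not in closed:
                    closed.add(new)
                    changed = True
    return closed


# Hypothetical class names for illustration.
E = {
    ("FullProfessor", SUBCLASS, "Professor"),
    ("Professor", SUBCLASS, "AcademicStaff"),
    ("AcademicStaff", SUBCLASS, "Staff"),
}
closure = subclass_closure(E)
```

Starting from three asserted triples, the closure also contains the three derived ones, for example that FullProfessor is a subclass of Staff.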

RDF Query
• One can query RDF using XML query languages, but this is very difficult as RDF is much richer than XML
• Is there an analogy between, say, XQuery and a query language for RDF?
• RQL, an SQL-like language, has been developed for RDF
• Select from “RDF document” where some “condition”
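The select-from-where shape of such a query can be imitated over a list of triples. The sketch below is plain Python, not RQL syntax; the select function and the contents of rdf_document are hypothetical.

```python
def select(triples, where):
    """Return the triples for which the condition holds (select-from-where)."""
    return [t for t in triples if where(*t)]


# Hypothetical document contents.
rdf_document = [
    ("Berners-Lee", "invented", "Semantic Web"),
    ("Berners-Lee", "authorOf", "Semantic Web"),
    ("John Smith", "worksIn", "Texas"),
]

# "Select the subject from rdf_document where the property is 'invented'."
inventors = [s for s, _, _ in select(rdf_document, lambda s, p, v: p == "invented")]
```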

Policies in RDF
• How can policies be specified?
• Should policies be specified as shown in the examples, as extensions to RDF syntax?
• Should policies be specified as RDF documents?
• Is there an analogy to XPath expressions for RDF policies?

Ontology
• Common definitions for any entity, person or thing
• Several ontologies have been defined and are available for use
• Defining a common ontology for an entity is a challenge
• Mappings have to be developed for multiple ontologies
• Specific languages have been developed for ontologies

Why is RDF not sufficient?
• RDF was developed because XML is not sufficient to specify semantics
  - E.g., class/subclass relationship
• RDF also has issues
  - It cannot express several other properties such as union, intersection, relationships, etc.
• A richer language is needed
• Ontology languages were developed by the semantic web community for this purpose
• Essentially, RDF is not sufficient to specify ontologies

Security and Ontology
• Ontologies used to specify security policies
  - Example: OWL to specify security policies
  - Choice between XML, RDF, OWL, RuleML, etc.
• Security for ontologies
  - Access control on ontologies
    • Give access to certain parts of the ontology

OWL: Background
• It is a language for ontologies and relies on RDF
• DARPA (Defense Advanced Research Projects Agency) developed the early language DAML (DARPA Agent Markup Language)
• Europeans developed OIL (Ontology Inference Layer)
• DAML+OIL combines both and was the starting point for OWL
• OWL was developed by the W3C

OWL Features
• Subclass relationship
• Class membership
• Equivalence of classes
• Classification
• Consistency checking (e.g., it is inconsistent to state that x is an instance of A, A is a subclass of B, and x is not an instance of B)
• Three types of OWL: OWL Full, OWL DL, OWL Lite
• Automated tools for managing ontologies
  - Ontology engineering
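The consistency example above can be checked mechanically. The sketch below is a toy Python check, not an OWL reasoner; the function names and the dictionary encoding of the subclass links are assumptions for illustration.

```python
def superclasses(cls, subclass_of):
    """cls together with every class reachable through subclass links."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def inconsistencies(instance_of, not_instance_of, subclass_of):
    """Pairs (x, B) where x is entailed to be in B but is asserted not to be."""
    clashes = []
    for x, cls in instance_of:
        for b in superclasses(cls, subclass_of):
            if (x, b) in not_instance_of:
                clashes.append((x, b))
    return clashes


# The slide's example: x is an instance of A, A is a subclass of B,
# and x is asserted not to be an instance of B.
subclass_of = {"A": ["B"]}
clashes = inconsistencies({("x", "A")}, {("x", "B")}, subclass_of)
```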

OWL Specification (e.g., Classes)
Faculty and Academic Staff Member are the same
Associate Professor is not a Professor
Associate Professor is not an Assistant Professor

OWL Specification (e.g., Property)
Courses are taught by Academic Staff Members

OWL Specification (e.g., Property Restriction)
All first year courses are taught only by professors

Policies in OWL
• How can policies be specified?
• Should policies be specified as shown in the examples, as extensions to OWL syntax?
• Should policies be specified as OWL documents?
• Is there an analogy to XPath expressions for OWL policies?

Policies in OWL: Example Level = L1 Level = L2

Logic and Inference
• First-order predicate logic
• High-level language to express knowledge
• Well-understood semantics
• Logical consequence: inference
• Proof systems exist
• Sound and complete
• OWL is based on a subset of logic: description logic

Why Rules?
• RDF is built on XML and OWL is built on RDF
• We can express subclass relationships in RDF; additional relationships can be expressed in OWL
• However, reasoning power is still limited in OWL
• Therefore there is a need for rules, and subsequently for a markup language for rules so that machines can understand them

Example Rules
• Studies(X,Y), Lives(X,Z), Loc(Y,U), Loc(Z,U) → HomeStudent(X)
• I.e., if John studies at UTDallas, John lives on Campbell Road, and both Campbell Road and UTDallas are located in Richardson, then John is a home student
• Note that Person(X) → Man(X) or Woman(X) is not a rule in the sense used here, because its conclusion is a disjunction. It says that if X is a person then X is either a man or a woman. This can be expressed in OWL. However, we can have a rule of the form Person(X) and Not Man(X) → Woman(X)
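The HomeStudent rule can be evaluated directly over a small fact base. The sketch below is a hand-written Python check of this one rule, not a general rule engine; the facts about Mary and Elm Street are hypothetical and are added only to show a case where the rule does not fire.

```python
from itertools import product

# Facts stored as (predicate, arg1, arg2).
facts = {
    ("Studies", "John", "UTDallas"),
    ("Lives", "John", "Campbell Road"),
    ("Loc", "UTDallas", "Richardson"),
    ("Loc", "Campbell Road", "Richardson"),
    # Hypothetical second student who lives in a different city.
    ("Studies", "Mary", "UTDallas"),
    ("Lives", "Mary", "Elm Street"),
    ("Loc", "Elm Street", "Dallas"),
}


def pairs(predicate):
    """All argument pairs recorded for the given predicate."""
    return [(a, b) for (p, a, b) in facts if p == predicate]


def home_students():
    """Studies(X,Y), Lives(X,Z), Loc(Y,U), Loc(Z,U) -> HomeStudent(X)."""
    loc = set(pairs("Loc"))
    result = set()
    for (x, y), (x2, z) in product(pairs("Studies"), pairs("Lives")):
        if x != x2:
            continue  # both atoms must be about the same student X
        cities_y = {u for (place, u) in loc if place == y}
        cities_z = {u for (place, u) in loc if place == z}
        if cities_y & cities_z:  # some U with Loc(Y,U) and Loc(Z,U)
            result.add(x)
    return result


students = home_students()
```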

Monotonic Rules
• → Mother(X,Y)
• Mother(X,Y) → Parent(X,Y)
If Mary is the mother of John, then Mary is the parent of John
Syntax: facts and rules
A rule is of the form B1, B2, ..., Bn → A
That is, if B1, B2, ..., Bn hold, then A holds

Logic Programming
• Deductive logic programming is based on deduction
  - I.e., deduce data from existing data and rules
  - E.g., the father of a father is a grandfather; John is the father of Peter and Peter is the father of James; therefore John is the grandfather of James
• Inductive logic programming induces rules from the data
  - E.g., John is the father of Peter, Peter is the father of James, John is the grandfather of James, James is the father of Robert, Peter is the grandfather of Robert
  - From the above data, induce that the father of a father is a grandfather
• Popular in Europe and Japan
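Both directions can be illustrated with the slide's own data. In the Python sketch below, grandfathers performs the deductive step; rule_fits is only a stand-in for induction, since it checks whether one candidate rule reproduces the observed data and does not search for rules. Both function names are hypothetical.

```python
# Father facts from the slide: (father, child).
father = {("John", "Peter"), ("Peter", "James"), ("James", "Robert")}


def grandfathers(father_facts):
    """Deduction: Father(X,Y), Father(Y,Z) -> Grandfather(X,Z)."""
    return {
        (x, z)
        for (x, y) in father_facts
        for (y2, z) in father_facts
        if y == y2
    }


def rule_fits(father_facts, grandfather_facts):
    """Check that the candidate rule reproduces exactly the observed data."""
    return grandfathers(father_facts) == set(grandfather_facts)


derived = grandfathers(father)

# Grandfather facts observed in the slide's inductive example.
observed = {("John", "James"), ("Peter", "Robert")}
fits = rule_fits(father, observed)
```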

Nonmonotonic Rules
• If we have X and NOT X, we do not treat them as inconsistent, as we would in monotonic reasoning
• For example, consider an apartment that is acceptable to John. In general John is prepared to rent an apartment unless it has fewer than two bedrooms, does not allow pets, etc. This can be expressed as follows:
• → Acceptable(X)
• Bedroom(X,Y), Y<2 → NOT Acceptable(X)
• NOT Pets(X) → NOT Acceptable(X)
• Note that there could be a contradiction, but with nonmonotonic reasoning this is allowed
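The apartment rules can be read as a default with exceptions. The Python sketch below assumes that the two exception rules take priority over the default; the slide does not state a priority order, and the apartment data is hypothetical.

```python
def acceptable(apartment):
    """Default rule with exceptions; the exceptions are assumed to win."""
    # Exception: Bedroom(X,Y), Y<2 -> NOT Acceptable(X)
    if apartment["bedrooms"] < 2:
        return False
    # Exception: NOT Pets(X) -> NOT Acceptable(X)
    if not apartment["pets"]:
        return False
    # Default: -> Acceptable(X)
    return True


# Hypothetical apartments.
apartments = {
    "a1": {"bedrooms": 1, "pets": True},
    "a2": {"bedrooms": 2, "pets": False},
    "a3": {"bedrooms": 3, "pets": True},
}
shortlist = [name for name, apt in apartments.items() if acceptable(apt)]
```

Learning a new fact about an apartment, for example that it does not allow pets, can withdraw a conclusion that held before, which is what makes the reasoning nonmonotonic.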

Rule Markup
• The various components of logic are expressed in the Rule Markup Language (RuleML)
• Both monotonic and nonmonotonic rules can be represented
• Example representation of the fact P(a)
  - a is a parent
p a

Policies in RuleML p a Level = L

An Application: Horizontal Information Products at Elsevier
• Elsevier is a publishing company based in Amsterdam
  - E.g., publisher of the journal Computer Standards and Interfaces, which has papers on all kinds of computer-related standards
• Currently the journals and books are grouped by topics such as operating systems, databases, etc. (or at a higher level, Biology, Chemistry, etc.)
• Where do we then put the journal Computer Standards and Interfaces?
• Horizontal groupings are also needed

Horizontal Information Products at Elsevier
• Semantic web technologies are being used by Elsevier
  - RDF for document representation
  - RDF for ontologies
  - Query language based on RDF to query the documents and the ontologies
  - E.g., life science thesaurus EMTREE
  - Other publishing companies are following in Elsevier’s direction

Common Threads and Challenges
• Common Threads
  - Building ontologies for semantics
  - XML for syntax
• Challenges
  - Scalability, resolvability
  - Security policy specification, securing the documents and ontologies
  - Developing applications for secure semantic web technologies
  - Automated tools for ontology management
    • Creating, maintaining, evolving and querying ontologies