Presentation on theme: "Intro to Semantic Web Programming From Semantic Web Programming by J. Hebeler et al.,"— Presentation transcript:
Intro to Semantic Web Programming From Semantic Web Programming by J. Hebeler et al.,
Semantic Web Represents data in formats amenable to automated processing, integration, and reasoning. Data is king and it provides even greater value when connected with other data sources to create a linked data web. Sem Web standards include RDF, OWL, and SPARQL The Linking Open Data Initiative: 3/09 and 2/10 http://www.ted.com/talks/tim_berners_lee_on_the_next_web. html http://www.ted.com/talks/tim_berners_lee_on_the_next_web. html http://blog.ted.com/2010/03/08/the_year_open_d/ 2008 Article: http://tomheath.com/papers/bizer-heath-berners- lee-ijswis-linked-data.pdf
Introduction Semantics offer the leverage to make more information better and not overwhelmingly worse Offers a new approach to extremely tough but lucrative challenges that employ vast amount of information and services Awareness of Sem Web is required for any solution that depends on dynamic information and service resources Part 1: Chapters 1 and 2: “Introducing Semantic Web Programming” - Chapter 2 implements “Hello Semantic Web World” Two main areas drive a semantic web App: Knowledge representation and Application Integration.
Introduction (cont.d) Part 2: Chapters 3 to 7: “Foundations of Semantic Web Programming” - details Knowledge Representation – Ch. 3: “Modeling Information” shows data modeling with RDF; – Ch. 4: “Incorporating Semantics” – creates an Ontology knowledge model with RDFS and OWL; – Ch. 5: “Modeling Knowledge in the Real World” – exercises an ontology via Appl. Frameworks and reasoners, such as Pellet – Ch. 6: “Discovering Information” extracts useful information from a knowledge model with SPARQL – Ch. 7: “Adding Rules” : Shows the use of SWRL, a W3C standard.
Introduction (Cont.d) Part 3: “Building Semantic Web Applications” – Chapters 8-11 – Integrates the knowledge base with an application that acts upon it – Solid programming base for the semantic web – Ch. 8: “Applying a Programming Framework” – explores Jena semantic web framework – Ch. 9: “Combining Information” – ports info from sources such as relational databases, web services, and other formats – Ch. 10: ”Aligning Information” – aligning data with ontological concepts to unify disparate information – Ch. 11: “Sharing Information” – outputs the info in many formats, viz., RDFa, microformats, SPARQL endpoints, etc. – Uses FriendTracker App to directly show the programming concepts
Introduction (Continued) Part 4: “Expanding Semantic Web Programming” – Ch. 12-15 – Ch. 12: “Developing and Using Semantic Services” – adds semantics to services to allow them to participate in the Semantic Web – Ch. 13: “Managing Space and Time” – adds space and time considerations to the knowledge representation – Ch. 14: “Applying Patterns and Best Practices” – shows architectural patterns for constructing various Semantic Web Apps – Ch. 15: “Moving Forward” – discussed the future – focuses on four critical evolving areas: ontology management, advanced integration and distribution, advanced reasoning, and visualization. Website: http://semwebprogramming.org
Chapter 1: Preparing to Program a Semantic Web of Data Make useable sense out of large, distributed information found throughout the WWW Objectives of the chapter: – Form a useful, pragmatic definition of Sem Web – ID the major components and relate them – Outline how Sem Web impacts Apps – Discuss myths and hype – Explore the origin and foundation of Sem Web – Find out real world applications
Semantic Web Programming: Chapter 1 Map Semantic Web (SW) – Origin Foundation – DL and Graph Theory – Components SW – Statements, Ontology, Instance, and Language SW Tools – Frameworks, IDE, Reasoner, and KB – Features – Expressiveness, Inference, Integration, and Unique Programming – Examples – Impacts – Data centric, Sharing Data, Dynamic Data, Expressive Data – Roadblocks – Myths and Hype
Definition: Sem Web A web of data described and linked in ways to establish context or semantics that adhere to defined grammar and language constructs No formal standard for such programmed semantics; Also, aggregation, sharing, and validation are not easy. E.g., Building SW – addresses semantics through standardized connections to related info by labeling data uniquely and addressable Figure 1-2: Isolation Vs the Semantic Web
FeatureWWWSemantic Web Fundamental ComponentUnstructured ContentFormal Statements Primary AudienceHumansApplications LinksIndicate LocationIndicate Location and Meaning Primary VocabularyFormatting InstructionsSemantics and Logic LogicInformal/nonstandardDescription Logic
Sem Web SW statements allow the definition and organization of info to form rich expressions, simplify integration and sharing, enable inference, and allow meaningful information extractions While the info remains distributed, dynamic, and diverse In summary, SW improves your App’s ability to effectively utilize large amounts of diverse info on the scale of the WWW
SW Relationships Include definitions, associations, aggregations, and restrictions Figure 1.3 – Example Graph Your own “good” personal secretary? – Family, Friends, Associates, Suppliers, Distributors, Employees, Bank,.. Establish both concepts (e.g., a Person has a birth date) and instances (e.g., John is a friend of Bill) The former defines an ontology; the latter – instance data Statements can be asserted or inferred; the former is created directly, while the later needs a reasoner to infer additional statements logically (dashed lines)
SW Sw statements employ a SW vocabulary and language to identify different types of stmnts and relationships. SW offers several languages; Choice to balance user’s needs for performance, integration, and expressiveness. The Stmnts are in two forms: Knowledgebases and files; KB offer dynamic, extensible storage like relational DB; Files contain static stmnts. Example: http://www.geonames.org/ontologyhttp://www.geonames.org/ontology http://en.wikipedia.org/wiki/GeoNames Can enhance existing data sources (relational DB, web page, web service) and applications (standalone desktop App, mission-critical enterprise App, and large scale web app/svc).
Links to Discuss Geonames Feature Codes: http://www.geonames.org/statistics/total.html http://www.geonames.org/statistics/total.html Linked data: http://linkeddata.org/http://linkeddata.org/ Sem Web Blog: http://www.geospatialsemanticweb.com/2006/10/14/ge onames-ontology-in-owl http://www.geospatialsemanticweb.com/2006/10/14/ge onames-ontology-in-owl Web Services Overview: http://www.geonames.org/export/ws-overview.html http://www.geonames.org/export/ws-overview.html Collective Intelligence: Algorithms of the Intelligent Web, Marmanis and Babenko, 2009 (Thanks to Luis Atencio) MarmanisBabenko
Source: http://www.geonames.org/ontology/ontology_v2.2.1.rdf - a conspicuous, isolated rocky mass rock
Source: http://www.geonames.org/ontology/ontology_v2.2.1.rdf - a building providing lodging and/or meals for the public hotel
Source: http://en.wikipedia.org/wiki/GeoNames GeoNames is a geographical data base available and accessible through various Web services, under a Creative Commons attribution license.Web servicesCreative Commons Database and web services The GeoNames database contains over 10,000,000 geographical names corresponding to over 7,500,000 unique features.  All features are categorized into one out of nine feature classes and further subcategorized into one out of 645 feature codes. Beyond names of places in various languages, data stored include latitude, longitude, elevation, population, administrative subdivision and postal codes. All coordinates use the World Geodetic System 1984 (WGS84).geographical names latitudelongitudepostal codescoordinatesGeodeticWGS84 Those data are accessible free of charge through a number of Web services and a daily database export.  The Web services include direct and reverse geocoding, finding places through postal codes, finding places next to a given place, and finding Wikipedia articles about neighbouring places. geocodingWikipedia Wiki interface The core of GeoNames data base is provided by official public sources, of which quality may vary. Through a wiki interface, users are invited to manually edit and improve the data base by adding or correcting names, move existing features, add new features, etc.wiki Semantic Web integration Each GeoNames feature is represented as a Web resource identified by a stable URI. This URI provides access, through content negotiation, either to the HTML wiki page, or to a RDF description of the feature, using elements of the GeoNames ontology.  This ontology describes the GeoNames features properties using the Web Ontology Language, the feature classes and codes being described in the SKOS language. Through Wikipedia articles URL linked in the RDF descriptions, GeoNames data are linked to DBpedia data and other RDF Linked Data.Web resourceURIcontent negotiationRDFontology Web Ontology LanguageSKOSDBpediaLinked Data
Comparison of Relational DB and KB FeatureRelational DatabaseKnowledge Base StructureSchemaOntology Statements DataRowsInstance Statements Admin LanguageDDLOntology Statements Query LanguageSQLSPARQL RelationshipsForeign KeysMultidimensional LogicExternal to DB, with triggers Formal Logic Statements UniquenessKey for tableURI
Major Programming Components SW Statement: A triple (subject-Predicate-Object) – thousands of these simple 3-tuples are combined in an App. URI- unique name across the entire Internet. Each SW Statement has an URI. Could include a URL, an abstract URN, or IRI (Intl Resrc ID) SW Languages – Has a set of keywords to instruct the SW tools. An Ontology – Has statements that define concepts, relationships, and constraints. Similar to a DB schema or OO class diagram. Use the many existing Ontologies, for better quality and speed. Instance Data – Statements about specific instances; not a generic concept. This forms the bulk of the SW – an ontology containing the concept ‘person’ may be used by millions of instances of ‘person’ Example: The nces site: http://nces.ed.gov/http://nces.ed.gov/
Programming Concepts (Contd) Construction tools: To construct and integrate a SW through ontology/instance creation/import. GUI-based SW editor to see and explore; APIs Interrogation tools: Navigate thru the SW to return a requested response. Query language Reasoners: Add inference to SW; logical additions to gain classification and realization (same as relationship). Leverage assertions Rule Engines: Support inference typically beyond that can be deduced from description logic. Merge ontologies, count and string searches. SW Frameworks: package all the tools above to have an integrated flow. We use open source alternatives for both GUI and API. So, you can start right away.
Impacts on Programming Web Data-Centric: Place data at its center; Data is key Semantic Data: Place meaning directly within the data Vs in programming instructions Data Integration/Sharing: Access and share rich info resources throughout the WWW Dynamic Data: Enable dynamic, run-time changes to the structure and contents of your info Add comments on each later on
Roadblocks, Myths, and Hype Roadblocks: web-centric development; accepting of dirty, conflicting data; and dynamic addition of new data. SW perspective: data-centric programming and distributed information programming Traditional approach: ETL (extract, transform, and load) does not scale; has a single point of failure WWW is full of data- good, bad, and wrong data. To fix, we need to know the truth; not easy. SW can define reputation and manage conflicting info Remaining open to new data is not an easy perspective. SW maintains a nimble, agile view of data.
Myths Pursuit of one big info model – SW supports a multitude of distributed info sources with a multitude of perspectives. Your solution and you need to have that perspective. One view – Allow any view for info analysis. Get started quickly and adapt/evolve. Encourages agility, and decoupled, modular design. Look for existing SW sources and use Acceptance – New technologies scare people, as they should. Change also scares people.
Hype Hype doomed AI; SW, does incorporate excellent AI research of the past; and offers a useful improvement in leveraging info. SW is an evolutionary step in making info work harder for us Hype may make complex technology look too simple. Tough, challenging problems demand complex technical solutions. SW reduces unnecessary complexity to focus on necessary complexity, that of managing info and knowledge we produce Hype can overpromise the inherent challenges in using a new technology. Do not expect a perfect world of tools, frameworks, and the SW itself. You may hit sum bumps, but hang in there, it will be worth it.
SW Origins Scientific American article by Tim Berners-Lee et al., discussed SW for machine readability, easy info integration, info inference, unique naming, and rich representations, … Graph Theory – 1736/Euler - nodes and relationships. Graphs imply the answer mathematically; brute force exhaustive analysis otherwise. Description Logic – 1980s; rules to construct valid, useful knowledge representations, knowledge representations that are decidable and can actually produce an answer. From first order logic. From AI research. Info not tacit (If else). The externalized form reveals the info for verification, integration, reasoning, and interrogation. Relationships beyond inheritance to seek dependable logic.
Links Haiti - http://topics.nytimes.com/top/news/internati onal/countriesandterritories/haiti/index.html http://topics.nytimes.com/top/news/internati onal/countriesandterritories/haiti/index.html http://nces.ed.gov/ SEC and XBRL: http://www.cpa2biz.com/Content/media/PRO DUCER_CONTENT/Newsletters/Articles_2010/ CorpFin/SEC_XBRL.jsp http://www.cpa2biz.com/Content/media/PRO DUCER_CONTENT/Newsletters/Articles_2010/ CorpFin/SEC_XBRL.jsp