Published by Mia Alexander
Modified over 4 years ago
© Copyright 2011 TopQuadrant Inc.
Slide 1: Evolving Practices of Linked Data
Irene Polikoff, TopQuadrant
June 29-30, 2011
W3C Government Linked Data Working Group
Slide 2: What is data?
Data has:
- value
- type
- structure
- units of measure
- encoding
- bit and byte order
Not a topic of this presentation, but many questions relevant to the interpretation of data depend on the attributes of the data.
Slide 3: What is Linked Data?
- A set of best practices for publishing and connecting structured data on the Web (http://linkeddata.org/faq)
- A method of publishing structured data so that it can be interlinked and become more useful. It builds upon standard Web technologies such as HTTP and URIs, but rather than using them to serve web pages for human readers, it extends them to share information in a way that can be read automatically by computers. This enables data from different sources to be connected and queried. (http://en.wikipedia.org/wiki/Linked_Data)
Slide 4: How is LD publishing being done today?
- SPARQL endpoints
- Making static serialized RDF available at a URL (a URL that corresponds to the base namespace?)
- Content negotiation (a person gets an HTML document, a machine gets RDF)
- Structured markup embedded in HTML (RDFa, microdata, microformats)
- A meta tag link in an HTML page pointing to the corresponding RDF file
- Zipped RDF files downloadable from the web
- ???
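The "link in an HTML page pointing to the corresponding RDF file" option can be illustrated from the consumer side. The sketch below assumes one common convention, a `<link rel="alternate">` element whose `type` is an RDF media type; the slide does not say which markup is meant, and the sample page and its `example.org` URL are hypothetical.

```python
from html.parser import HTMLParser

# Media types treated as RDF in this sketch (an assumption, not a complete list).
RDF_TYPES = {"application/rdf+xml", "text/turtle"}


class RdfLinkFinder(HTMLParser):
    """Collects href values of <link rel="alternate"> tags that point at RDF."""

    def __init__(self):
        super().__init__()
        self.rdf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        media_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rels and media_type in RDF_TYPES and href:
            self.rdf_links.append(href)


def find_rdf_links(html):
    """Return the RDF alternates advertised by an HTML page, in document order."""
    finder = RdfLinkFinder()
    finder.feed(html)
    return finder.rdf_links


# Hypothetical page used only for illustration.
PAGE = """<html><head>
<title>Recreation</title>
<link rel="stylesheet" type="text/css" href="style.css">
<link rel="alternate" type="application/rdf+xml"
      href="http://example.org/data/recreation.rdf">
</head><body></body></html>"""

print(find_rdf_links(PAGE))
```

A client that finds no such link would have to fall back on one of the other approaches listed on the slide, such as content negotiation.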
Slide 5: What factors influence LD publishing decisions?
- Available infrastructure and its constraints
- Cost
- Data consumers' preferences
- Size of the data being published
- Frequency of change
- Skills and knowledge of the data publisher
- W3C recommendations
- ???
Slide 6: A data consumer viewpoint: in favor of a SPARQL endpoint
At the latest EBI Industry Day, industry reps requested that EBI curated content be made available through a SPARQL endpoint as opposed to, e.g., being published as a large download or re-hosted by a third party.
The following arguments were made:
- Ease of access. The datasets are very large and are updated regularly. Downloading an entire dataset is time consuming and costly. A (high-performance) SPARQL endpoint allows a client to specify just the data it wants and get it in a just-in-time manner.
- Currency. The datasets change often; users want to know that they have the latest version without having to perform tedious checks at every access.
- Authority. The users of this data trust the EBI curation and don't know whether they can trust a third party. Was the data corrupted? Is it the version it claims to be?
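The "ease of access" argument can be made concrete: instead of downloading a whole dataset, a client sends a query that names only what it needs. The sketch below builds such a request URL using the `query` parameter of the SPARQL Protocol. The endpoint address and the query itself are hypothetical and are not taken from EBI; no network call is made.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; a real publisher would document its own address.
ENDPOINT = "http://example.org/sparql"

# Ask only for ten concepts and their preferred labels, not the full dataset.
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?label WHERE {
  ?concept skos:prefLabel ?label .
} LIMIT 10
""".strip()


def build_query_url(endpoint, query):
    """Return a GET URL carrying the query in the 'query' parameter."""
    return endpoint + "?" + urlencode({"query": query})


url = build_query_url(ENDPOINT, QUERY)
print(url)
```

Because every request goes to the publisher's own endpoint, the client also gets the currency and authority benefits described on the slide: the answer reflects the current data, served by the party that curates it.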
Slide 7: Each publishing approach requires guidance on best practices
For example, for content negotiation:
- How does a client identify its requirements (RDF/XML, Turtle, HTML, SPARQL endpoint)? The Turtle submission suggests the MIME type text/turtle for Turtle.
- What types of content can be negotiated (SPARQL endpoint? RDF/XML? Turtle? N-Triples? OWL/XML)?
- Must all negotiated variants contain the same information? What does this mean when different formats have different interpretations (e.g., OWL/XML vs. Turtle)?
- Must all negotiated variants have the same prefix definitions? What about forms that have no notion of prefixes (N-Triples, HTML)?
And in a more general sense:
- How are versions managed (e.g., using owl:versionInfo)? How are the URLs for the various versions managed?
- If one dataset uses resources from another, how does it indicate this? Just use it? rdfs:seeAlso? owl:imports? What is the appropriate behavior of a client in these situations?
- Is there any relationship between the location at which a file is found and the URIs it describes? How about its base URI, owl:Ontology or default namespace?
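One possible answer to "how does a client identify its requirements" is the HTTP Accept header. The sketch below shows a deliberately simplified server-side selection: it honours `q` values and the `*/*` wildcard but ignores partial ranges such as `text/*`. The function names and the list of available variants are assumptions made for illustration.

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep the client's original order.
    return sorted(prefs, key=lambda pref: -pref[1])


def negotiate(header, available):
    """Pick the variant to serve, or None if nothing acceptable is available."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return available[0]  # server default
        if media_type in available:
            return media_type
    return None


# Hypothetical set of variants a publisher might offer for one resource.
AVAILABLE = ["text/html", "text/turtle", "application/rdf+xml"]

print(negotiate("text/turtle, application/rdf+xml;q=0.8, */*;q=0.1", AVAILABLE))
print(negotiate("text/html,application/xhtml+xml;q=0.9", AVAILABLE))
print(negotiate("application/n-triples", AVAILABLE))
```

The sketch only chooses a format. It does not address the harder questions on the slide, such as whether the Turtle and RDF/XML variants must carry the same information.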
Slide 8: An example of what we may see when we look at the published data

"http://my.site.com/#Recoveries"
Recoveries; Recouvrement; Bankruptcies; Debtors; Seizure (of property)
The regaining of something of value, such as property or funds lent, as a result of special efforts by the owner or creditor.
EC Economics and Industry

"http://my.site.com/#Recovery%20plans%20%28Environment%29"
Recovery plans (Environment); Environmental management
NE Nature and Environment

"http://my.site.com/#Recreation"
Recreation; Loisir; Entertainment; Hobbies; Leisure; Recreational activities; Games; Recreational facilities; Sports; Tourism; Toys; Outdoor recreation
An activity that diverts, amuses or stimulates usually done in one's spare time.
SO Society and Culture

Source: Government of Canada Core Subject Thesaurus, http://en.thesaurus.gc.ca/default.asp?lang=En&n=EAEAD1E6-1
Slide 9: Issues with the example
- Minting new URIs in someone else's namespace, e.g., skos:UsedFor, skos:SubjectCategory, etc.
- Providing no type definitions for the new URIs
- (Possibly) making errors in URIs: did they mean skos:scopeNote or skos:ScopeNote?
- (Potentially) misusing URIs: did they mean skos:narrower when they said skos:NarrowerTerm? If so, it is an object property
- Inventing a way to do language tags, e.g., skos:French, perhaps because they are not aware of how to do this correctly
- Not following the convention of lower camel case for properties
- Not linking their own data: the values of skos:NarrowerTerm and skos:RelatedTerm are all strings
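Several of these issues can be caught mechanically by comparing each term used in the SKOS namespace against the terms SKOS actually defines. The sketch below is one way to do that; the term list is hard-coded from the SKOS core vocabulary, and the function name and message wording are assumptions made for illustration.

```python
SKOS_NS = "http://www.w3.org/2004/02/skos/core#"

# Classes and properties defined by the SKOS core vocabulary.
SKOS_TERMS = {
    "Concept", "ConceptScheme", "Collection", "OrderedCollection",
    "prefLabel", "altLabel", "hiddenLabel", "notation",
    "note", "changeNote", "definition", "editorialNote",
    "example", "historyNote", "scopeNote",
    "semanticRelation", "broader", "narrower", "related",
    "broaderTransitive", "narrowerTransitive",
    "inScheme", "hasTopConcept", "topConceptOf",
    "member", "memberList",
    "mappingRelation", "broadMatch", "narrowMatch",
    "closeMatch", "exactMatch", "relatedMatch",
}


def check_skos_term(uri):
    """Classify a URI: fine, a probable case error, or minted in the SKOS namespace."""
    if not uri.startswith(SKOS_NS):
        return "not in SKOS namespace"
    local_name = uri[len(SKOS_NS):]
    if local_name in SKOS_TERMS:
        return "ok"
    # A case-insensitive match suggests a capitalization error.
    by_lower = {term.lower(): term for term in SKOS_TERMS}
    suggestion = by_lower.get(local_name.lower())
    if suggestion:
        return f"unknown term; did you mean skos:{suggestion}?"
    return "unknown term; not defined by SKOS"


# Terms taken from the issues listed on the slide.
for name in ["scopeNote", "ScopeNote", "NarrowerTerm", "UsedFor", "French"]:
    print(f"skos:{name}: {check_skos_term(SKOS_NS + name)}")
```

A check like this flags skos:ScopeNote as a likely case error and skos:NarrowerTerm, skos:UsedFor and skos:French as terms minted in someone else's namespace. It cannot tell whether skos:narrower was the intended property; that still needs human judgment.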
Slide 10: One possible guideline or test
Assuming that information about a resource should be found at the place it resolves to, a resource like skos:RelatedTerm should be available at http://www.w3.org/2004/02/skos/core#, which it isn't.
Slide 11: Looking at examples helps
- There will be issues the working group would not have thought possible
- Understanding these will provide the needed scope/level of detail for best practices
- One good resource is the Pedantic Web Group: http://pedantic-web.org/ and http://groups.google.com/group/pedantic-web
Slide 12: More questions to address - 1
- If you use someone else's vocabulary, do you include type declarations, effectively replicating information? It is very common to see something like: foaf:Person a owl:Class
- What role do imports play, if any, in Linked Data publishing?
- What name do we give to a set of graphs (ontologies) that belong together, e.g., skos and skos-xl, or the QUDT ontology collection? What should be the relationship between their URIs/namespaces? TQ has built grammars to resolve this
Slide 13: More questions to address - 2
- What information should be returned for a resource? All triples that it is a subject of? What about back links? What if the resource is a class?
- How to express vocabulary and data mappings? owl:sameAs, owl:equivalentClass, etc. are commonly used, sometimes without understanding the semantic commitment. SKOS mapping properties are an alternative. What about more complex mappings? At TQ, we use SPIN (SPARQL) maps
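The first question, what to return for a resource, comes down to a choice between two sets of triples. The sketch below contrasts them over a tiny in-memory graph: triples where the resource is the subject, and optionally "back links" where it is the object. The `example.org` data and the `describe` function are hypothetical, and this is only one of several possible definitions of a resource description.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/thesaurus#"  # hypothetical namespace

# Each triple is (subject, predicate, object); objects are URIs or plain strings.
TRIPLES = [
    (EX + "Recreation", SKOS + "prefLabel", "Recreation"),
    (EX + "Recreation", SKOS + "related", EX + "Sports"),
    (EX + "Sports", SKOS + "prefLabel", "Sports"),
    (EX + "Tourism", SKOS + "related", EX + "Recreation"),
]


def describe(resource, triples, backlinks=False):
    """Return triples with the resource as subject, plus back links if requested."""
    description = [t for t in triples if t[0] == resource]
    if backlinks:
        # Back links: triples in which the resource appears as the object.
        description += [t for t in triples if t[2] == resource and t[0] != resource]
    return description


recreation = EX + "Recreation"
print(len(describe(recreation, TRIPLES)))
print(len(describe(recreation, TRIPLES, backlinks=True)))
```

Without back links, a client looking at the Recreation concept never learns that Tourism points to it; with them, the response can grow without bound for heavily referenced resources such as classes, which is part of why the slide raises the question.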
Slide 14: Thank You
Irene Polikoff
E-mail: email@example.com@topquadrant.com
Twitter: @topquadrant, @oegovnews