Uniform Resource Identifiers Jacek Kopecký WSML Working Group June 2004.


1 Uniform Resource Identifiers Jacek Kopecký WSML Working Group June 2004

2 Overview
Jacek Kopecký, jacek.kopecky@deri.org
- History of URIs
- URI syntax
- URI references and their resolution
- Good practices for creating URIs
- Interesting issues

3 URI History
- Universal Resource Identifiers (RFC 1630, June 1994)
- Uniform Resource Locators and Names
- RFC 2396, August 1998
- 2396bis in development
- Originally “Universal”, later “Uniform” as a compromise
- “Universal” again preferred by TimBL

4 URLs and URNs
- Locators (addresses) vs. Names
- URNs not easily dereferencable
- URNs can be made dereferencable by infrastructure
- URLs perceived as less persistent
- URLs and URNs drifting towards middle ground
  - http://www.w3.org/DesignIssues/NameMyth.html
- No point in making the distinction any more

5 Uniform Resource Identifiers
- URIs “identify” “resources”
  - Identification doesn’t imply interaction
- Resource is a sameness of characteristics over time
  - Latest blog rant
  - Latest blog rant on politics
  - Blog rant on politics from 2004-6-22
- Resource need not be accessible when URI is created
  - Pictures from my future trip to London will be at http://jacek.cz/photos/2004-08-london

6 URI Syntax
- According to 2396bis
  - http://www.apache.org/~fielding/uri/rev-2002/rfc2396bis.html
- Examples
  - http://www.ietf.org/rfc/rfc2396.txt
  - mailto:John.Doe@example.com
  - news:comp.infosystems.www.servers.unix
  - telnet://melvyl.ucop.edu/
- URI Syntax - simplified
  - scheme: [//authority] [/path] [?query] [#fragid]
- Relative URI without “scheme:”
- Dot path segments (‘.’ and ‘..’) treated specially
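The simplified syntax above can be tried out with Python's standard `urllib.parse.urlsplit`. This is only a sketch: the `components` helper and the third example URI are illustrative assumptions, not part of the slides.

```python
from urllib.parse import urlsplit

def components(uri):
    """Split a URI into the five parts of the simplified syntax (hypothetical helper)."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "query": parts.query,
        "fragid": parts.fragment,
    }

# Two examples from the slide, plus an assumed one that uses every component.
web = components("http://www.ietf.org/rfc/rfc2396.txt")
mail = components("mailto:John.Doe@example.com")
full = components("http://example.org/search?q=uri#results")

for name, result in (("web", web), ("mail", mail), ("full", full)):
    print(name, result)
```

Note that the `mailto:` URI has no authority; everything after the scheme is path.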

7 URI Syntax cont’d
- Reserved characters (like /:?#@$&+* )
- Many allowed characters
- Rest of UNICODE percent-encoded from UTF-8
  - http://google.com/search?q=kopeck%C3%BD
- Percent-encoding allowed characters creates equivalent URIs
  - But namespaces compared char-by-char
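A small sketch of the percent-encoding points above, using `quote` and `unquote` from Python's `urllib.parse`. The `example.org` namespace strings are assumptions made up for illustration.

```python
from urllib.parse import quote, unquote

# Non-ASCII characters are encoded as UTF-8 bytes, one %XX triplet per byte.
encoded = quote("kopecký")
url = "http://google.com/search?q=" + encoded

# Percent-encoding an allowed character ("a" as %61) gives an equivalent URI...
plain = "http://example.org/ns/car"        # assumed example namespace name
escaped = "http://example.org/ns/c%61r"    # same URI, needlessly escaped
equivalent_after_decoding = unquote("c%61r") == "car"

# ...but a char-by-char comparison, as used for namespaces, sees two different strings.
same_char_by_char = plain == escaped

print(url)
print(equivalent_after_decoding, same_char_by_char)
```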

8 URI Reference Resolution
- Resolving URI A against base URI B
- Going from the left, keep as much from B as is undefined in A
- First part of A replaces that part from B
  - Path resolution special
    - If A has absolute path, that is taken
    - Relative path from A resolved against path from B, removing dot segments from result
- Everything after first part of A taken from A
- Fragment always taken from A
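The special path step can be sketched by hand as follows. This is a simplified illustration, not the normative algorithm from 2396bis; the function names `remove_dot_segments` and `resolve_path` are chosen here, and the sketch assumes the base path is absolute (starts with "/").

```python
def remove_dot_segments(path):
    """Drop '.' segments and let each '..' cancel the segment before it."""
    kept = []
    for segment in path.split("/"):
        if segment == ".":
            continue
        if segment == "..":
            if len(kept) > 1:      # never climb above the root
                kept.pop()
            continue
        kept.append(segment)
    # A trailing '.' or '..' names a directory, so keep the trailing slash.
    if path.endswith(("/.", "/..")):
        kept.append("")
    return "/".join(kept)

def resolve_path(base_path, ref_path):
    """Resolve the path of reference A against the path of base B."""
    if ref_path.startswith("/"):
        merged = ref_path          # absolute path in A is taken as is
    else:
        # Replace everything after the last '/' in B with the relative path.
        merged = base_path[: base_path.rfind("/") + 1] + ref_path
    return remove_dot_segments(merged)

print(resolve_path("/b/c/d", "../g"))
```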

9 URI Ref. Resolution Examples
Base URI: http://a/b/c/d?e#f
1. g           = http://a/b/c/g
2. .           = http://a/b/c/
3. ./          = http://a/b/c/
4. ./g         = http://a/b/c/g
5. ..          = http://a/b/
6. ../         = http://a/b/
7. ../g        = http://a/b/g
8. ../../g     = http://a/g
9. ../../../g  = http://a/g

10 URI Ref. Resolution Examples
Base URI: http://a/b/c/d?e#f
10. /./g    = http://a/g
11. //g     = http://g
12. #s      = http://a/b/c/d?e#s
13. g#s     = http://a/b/c/g#s
14. ?y      = http://a/b/c/d?y
15. g?y     = http://a/b/c/g?y
16. g?y#s   = http://a/b/c/g?y#s
17. g:h     = g:h
18. ./g:h   = http://a/b/c/g:h
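A selection of these examples can be checked with `urljoin` from Python's standard library, which in current Python 3 versions follows the RFC 3986 resolution rules. The `EXPECTED` table below is just a convenience for this sketch.

```python
from urllib.parse import urljoin

BASE = "http://a/b/c/d?e#f"

# Reference -> expected result, copied from the example slides.
EXPECTED = {
    "g": "http://a/b/c/g",
    ".": "http://a/b/c/",
    "..": "http://a/b/",
    "../../../g": "http://a/g",
    "/./g": "http://a/g",
    "//g": "http://g",
    "#s": "http://a/b/c/d?e#s",
    "?y": "http://a/b/c/d?y",
    "g?y#s": "http://a/b/c/g?y#s",
    "g:h": "g:h",
}

results = {ref: urljoin(BASE, ref) for ref in EXPECTED}
for ref, resolved in results.items():
    print(f"{ref:12} = {resolved}")
```

The last entry shows why a leading "./" matters: without it, "g:h" parses as a URI with scheme "g" rather than as a relative path.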

11 Base URIs
- Necessary when resolving URI references
1. Explicit base URI embedded in content
2. URI of the document
   - Usual in HTML files on the web
3. App-dependent base URI default

12 URI Equivalence
- Do two URIs identify the same resource?
- Comparing without accessing the resources
- Various applications for URI comparison
  - Increasing cache efficiency
  - Comparing the namespaces of two symbols
- Algorithms must avoid false positives
- False negatives unavoidable
  - http://weather.example.com/innsbruck
  - http://jacek.cz/innsbruckweather redirects to the above
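One way to compare URIs without accessing the resources is to normalize both before comparing strings. The sketch below is an assumption about what such a comparison might look like, not an algorithm given in the talk: it lowercases the scheme and host, decodes percent-encoded unreserved characters, and uppercases the remaining escapes. It deliberately leaves escaped reserved characters alone to avoid false positives.

```python
import re
from urllib.parse import urlsplit, urlunsplit

UNRESERVED = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789-._~"
)

def _normalize_escapes(text):
    """Decode %XX for unreserved characters; uppercase the hex digits otherwise."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else "%" + match.group(1).upper()
    return re.sub(r"%([0-9A-Fa-f]{2})", fix, text)

def normalize(uri):
    """Hypothetical helper: a conservative syntactic normal form of a URI."""
    parts = urlsplit(uri)
    # Userinfo is case-sensitive, so only lowercase authorities without it.
    netloc = parts.netloc if "@" in parts.netloc else parts.netloc.lower()
    return urlunsplit((
        parts.scheme.lower(),
        netloc,
        _normalize_escapes(parts.path),
        _normalize_escapes(parts.query),
        _normalize_escapes(parts.fragment),
    ))

print(normalize("HTTP://Example.COM/%7Ejacek"))
```

The redirect example from the slide stays a false negative: no syntactic comparison can tell that the two weather URIs lead to the same resource.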

13 Uses of URIs
- Addresses on the Web
- Namespaces in XML QNames
- Namespaces in QNames in other languages
- Identifiers of things and concepts (e.g. RDF)
- Unique keys (e.g. MIME message ID)

14 QName
- Introduced in XML Namespaces
  - Name of an XML namespace-qualified element
- RDF uses QNames for brevity of URI notation
- XML Schema expanded use of QNames to further things (6 symbol spaces)
- Every following language uses QNames as identifiers
- Number of independent symbol spaces => Turning QNames into URIs is cumbersome
- Should have been as simple as in RDF (IMHO)
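The RDF approach referred to above turns a QName into a URI by concatenating the namespace URI and the local name. A minimal sketch, where the `ex` prefix, its namespace URI, and the `qname_to_uri` helper are all assumptions for illustration:

```python
# Hypothetical prefix table; the namespace URI is made up for this example.
PREFIXES = {"ex": "http://example.org/vocab#"}

def qname_to_uri(qname, prefixes):
    """RDF-style expansion: namespace URI followed directly by the local name."""
    prefix, _, local = qname.partition(":")
    return prefixes[prefix] + local

car_uri = qname_to_uri("ex:car", PREFIXES)
print(car_uri)
```

With several independent symbol spaces per namespace, as in XML Schema, the pair (namespace, local name) alone does not identify one thing, which is why the mapping becomes cumbersome there.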

15 Creating URIs for Web Resources
- Versioning approach for persistence
  - http://w3.org/TR/soap vs. http://w3.org/TR/soap12 vs. http://w3.org/TR/2003/REC-soap12-part1-20030624/
- Simple, memorable URIs
  - http://jacek.cz/blog
  - Scribbled on a napkin
  - Correcting spelling and case helps – mod_speling
  - Making the “www.” prefix optional (both ways) helps
  - Content negotiation – drop .html (.php, .asp)
- URI changes harmful

16 Creating Example URIs
- http://example.com http://example.net http://example.org
  - Reserved for precisely this purpose
- Or use own domain (deri.org, wsmo.org)
- http://foo.com not good

17 Creating URIs for Namespaces
- Dereferencable, ending with ‘/’ or ‘#’
- Canonical URIs – no unnecessary dot segments or percent-encoding
  - Namespaces compared char-by-char
- Namespace document
  - Preferably in the language that uses the namespace – enables automatic discovery
  - With human-oriented descriptions
  - To allow for the above, don’t share namespace URIs for schema and WSDL

18 Creating URIs for Concepts
- Group concepts in a common, dereferencable namespace
- Each concept identified by its fragID
- In RDF/XML, namespace ends with ‘#’
- Namespace document describes the concepts
- Two problems
  - FragIDs depend on media types
  - Can http://example.com/#car identify a car?
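The split between the namespace document and the concept's fragID can be illustrated with `urldefrag` from Python's `urllib.parse`. Only the part before the ‘#’ is what a client would dereference; the variable names are chosen for this sketch.

```python
from urllib.parse import urldefrag

concept = "http://example.com/#car"

# The document URI is what gets retrieved; the fragID is interpreted
# afterwards, according to the media type of the retrieved document.
document_uri, fragid = urldefrag(concept)

# In RDF/XML the namespace URI itself ends with '#'.
rdf_namespace = document_uri + "#"

print(document_uri, fragid, rdf_namespace)
```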

19 Fragment IDs in URIs
- Fragment ID identifies a secondary resource
- Interpretation of fragment IDs depends on media type
  - In HTML
  - In XML
  - No meaning in JPEG
- xml:id in development
  - So far language-dependent (often DTD) solutions
- Fragment IDs should mean the same thing across media types with content negotiation

20 Range of HTTP URIs?
- Open W3C TAG issue
- Can http: URI identify a car?
- Can I say http://jacek.cz/dragstar/ is my motorbike?
- TimBL doesn’t seem to think so
- Is it necessary to distinguish between a thing and a description of that thing?

21 Other Interesting Issues
- data: URI scheme – the URI is the resource
  - RFC 2397
  - data:image/gif;base64,R0lGODdhMAAwAPAA…
- mailto: scheme a misnomer
  - URIs don’t specify actions but identifiers
- uuid: scheme for unique identifiers
  - Good for transient identification in closed systems
- Mismatches between perceived and intended meaning of a resource
  - http://w3.org/tr/soap
- Should URIs be human-readable?
  - http://www.bscw.semanticweb.org/bscw/bscw.cgi/0/21621
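To make "the URI is the resource" concrete, here is a sketch that packs a few bytes into a base64 `data:` URI and reads them back. The helper names and the `text/plain` payload are assumptions; the sketch handles only the `;base64` form and ignores other parameters that RFC 2397 allows.

```python
import base64

def make_data_uri(payload, media_type):
    """Embed the bytes themselves in the URI (hypothetical helper)."""
    return "data:" + media_type + ";base64," + base64.b64encode(payload).decode("ascii")

def read_data_uri(uri):
    """Recover (media type, bytes) from a base64 data: URI; no network access needed."""
    header, _, body = uri.partition(",")
    if not (header.startswith("data:") and header.endswith(";base64")):
        raise ValueError("only base64 data: URIs are handled in this sketch")
    media_type = header[len("data:"):-len(";base64")]
    return media_type, base64.b64decode(body)

uri = make_data_uri(b"Hello", "text/plain")
media_type, payload = read_data_uri(uri)
print(uri, media_type, payload)
```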

22 Main Points
- Cool URIs don’t change
- URIs can be (and are) scribbled on napkins
- URIs don’t (necessarily) point to documents
- Dereferencable URIs also good as names
- URLs, URNs obsolete

23 References
- http://www.apache.org/~fielding/uri/rev-2002/rfc2396bis.html
- http://www.ietf.org/rfc/rfc2396.txt
- http://www.w3.org/Provider/Style/URI
- http://www.w3.org/DesignIssues/Architecture.html
- http://www.w3.org/DesignIssues/Axioms.html
- http://www.w3.org/DesignIssues/NameMyth.html

24 Hope it Helped
- Thanks for your attention
- Questions? Comments? jacek.kopecky@deri.org

