What are GUIDs and Why Do We Need Them ??? Steve Baskauf Vanderbilt Dept. of Biological Sciences http://bioimages.vanderbilt.edu/
What is a GUID? A globally unique identifier (GUID) should be: 1. globally unique 2. actionable 3. persistent
1. How do you make an identifier globally unique? (part 1) Make it locally unique within your institution A common strategy: – identifier (catalog number) unique within a collection, e.g. 66920 – namespace (collection code) unique within the institution, e.g. ind-baskauf Unique local identifier: ind-baskauf/66920, ind-baskauf:66920, ind-baskauf_66920, etc.
How do you make an identifier globally unique? (part 2) Make your local identifier globally unique Use your institution code? TENN, BOON, bioimages? No! How do you know that is globally unique? Consensus: use a domain (or subdomain) name, e.g. www.biology.appstate.edu, tenn.bio.utk.edu, or bioimages.vanderbilt.edu
Some identifiers that are globally unique bioimages.vanderbilt.edu_ind-baskauf_66920 urn:lsid:bioimages.vanderbilt.edu:baskauf:66920 http://bioimages.vanderbilt.edu/ind-baskauf/66920 Do these qualify as GUIDs??? – globally unique – actionable???? What happens if you put them in a web browser?
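As a rough illustration of the two-step recipe above (a locally unique identifier, then a domain name in front of it), the following Python sketch assembles the three identifier forms listed on this slide. The function name, its parameters, and the dictionary keys are assumptions made for this example only. Note that the LSID on the slide uses the shorter namespace baskauf rather than ind-baskauf, so the sketch takes that as a separate argument.

```python
def build_identifiers(domain, namespace, catalog_number, lsid_namespace=None):
    """Combine a domain name with a locally unique identifier.

    All names here are hypothetical; they only mirror the parts named
    on the slides (domain, namespace/collection code, catalog number).
    """
    # Fall back to the ordinary namespace if no LSID-specific one is given.
    lsid_ns = namespace if lsid_namespace is None else lsid_namespace
    return {
        "plain": f"{domain}_{namespace}_{catalog_number}",
        "lsid": f"urn:lsid:{domain}:{lsid_ns}:{catalog_number}",
        "http_uri": f"http://{domain}/{namespace}/{catalog_number}",
    }


ids = build_identifiers(
    "bioimages.vanderbilt.edu", "ind-baskauf", "66920", lsid_namespace="baskauf"
)
for form, value in ids.items():
    print(form, value)
```

All three strings are globally unique for the same reason: the domain name is. Only the last one does anything when pasted into a web browser.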
2. How do you make an identifier actionable? Something has to happen when the identifier is put in a web browser. LSIDs – need a special browser plugin that nobody has. – need a special system for their resolvers to talk to each other HTTP URIs – work in any web browser – DNS nameservers already talk to each other
Can a material or conceptual object have an HTTP URI? We know a web page can have a URI that the web browser uses to find the HTML document… But physical objects (specimens, living plants) and conceptual entities (species) can also have HTTP URIs!
CAN I HAVE A URI??? Yes! Here it is: http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me
How is my URI actionable??? If I put that HTTP URI in a web browser, does it deliver me to the user, like a web page? http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me
Darn, no transporter technology! What should I use for my HTTP URI? firstname.lastname@example.org https://medschool.mc.vanderbilt.edu/biosci/bio_fac.php?id3=13257 http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me The web server doesn’t do anything with the fragment identifier (#me), but it makes the URI different from the RDF metadata file. URIs for objects must be different from the URIs of other things that represent them. A URI is a Uniform Resource Identifier, not a URL (Uniform Resource Locator). It identifies me, but doesn’t deliver me.
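To make the role of the fragment identifier concrete, here is a small Python sketch using the standard library's urllib.parse.urldefrag. It shows that the URI for the person and the URL of the RDF file differ only in the #me fragment, which a client strips off before it contacts the server. The variable names are assumptions for this example.

```python
from urllib.parse import urldefrag

# The URI that identifies the person (taken from the slide).
person_uri = "http://people.vanderbilt.edu/~steve.baskauf/foaf.rdf#me"

# Split off the fragment; the first part is what is actually requested
# from the web server, the second part never leaves the client.
document_url, fragment = urldefrag(person_uri)

print("requested from server:", document_url)
print("kept by the client:   ", fragment)

# The two identifiers are different strings, so the person and the
# metadata file about the person are not confused with each other.
print("same identifier?", person_uri == document_url)
```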
Back to the tree… http://bioimages.vanderbilt.edu/ind-baskauf/66920.htm = a URI and URL for a web page about the tree http://bioimages.vanderbilt.edu/ind-baskauf/66920.rdf = a URI and URL for an RDF metadata file about the tree http://bioimages.vanderbilt.edu/baskauf/66921.jpg = a URI and URL for an image of the tree http://bioimages.vanderbilt.edu/ind-baskauf/66920 = a URI for the tree itself
How did the web server know what to do with the HTTP URI? Content negotiation = rules about which representation of a resource a web server should send when a non-information URI is sent to it. Apache web servers can do it if set up properly. Web browsers ask for HTML content Computers (“semantic web user-agents”) ask for RDF/XML content
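The decision described on this slide can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how Apache implements content negotiation: it ignores q-values, simply takes the first recognized media type in the Accept header, and assumes that the .htm and .rdf documents sit next to the resource URI as in the tree example. The function name and the HTML default are assumptions.

```python
def choose_representation(resource_uri, accept_header):
    """Pick the document to serve for a non-information URI.

    Simplified sketch: q-values are ignored and the first recognized
    media type in the Accept header wins.
    """
    rdf_types = {"application/rdf+xml"}
    html_types = {"text/html", "application/xhtml+xml"}

    for item in accept_header.split(","):
        # Drop parameters such as ";q=0.9" and normalize the media type.
        media_type = item.split(";")[0].strip().lower()
        if media_type in rdf_types:
            return resource_uri + ".rdf"
        if media_type in html_types:
            return resource_uri + ".htm"

    # Assumed default: fall back to the human-readable page.
    return resource_uri + ".htm"


tree = "http://bioimages.vanderbilt.edu/ind-baskauf/66920"
print(choose_representation(tree, "text/html,application/xhtml+xml"))
print(choose_representation(tree, "application/rdf+xml"))
```

A browser-style Accept header selects the web page about the tree, while a semantic web client asking for RDF/XML gets the metadata file. The URI of the tree itself stays the same in both cases.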
What the heck’s the Semantic Web? not the same thing as “Web 2.0” an idea pushed by Tim Berners-Lee (inventor of the Web) a way for programs like web crawlers (e.g. GoogleBot) to know rather than guess. Disco = an RDF browser http://www4.wiwiss.fu-berlin.de/rdf_browser/ http://bioimages.vanderbilt.edu/ind-baskauf/66920
3. What is a persistent HTTP URI? One of my favorite websites: http://tenn.bio.utk.edu/vascular/vascular.html Oops. It’s now: http://tenn.bio.utk.edu/vascular/vascular.shtml
Unchanging local file names http://bioimages.vanderbilt.edu/baskauf/66921.htm vs. http://bioimages.vanderbilt.edu/metadata.htm?baskauf/66921/metadata/img/3456/2304 What’s in the HTML of the first URI? window.location.replace("../metadata.htm?baskauf/66921/metadata/img/3456/2304"); The first URI is also a “cool” URI (easy to remember).
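One way to picture this arrangement is a tiny generator for such stub pages: the short, stable file name never changes, and only the redirect target inside it is rewritten if the site's internals change. The Python sketch below is hypothetical; the function name and the surrounding HTML skeleton are assumptions, and only the window.location.replace(...) line comes from the slide.

```python
def redirect_stub(target):
    """Return a minimal HTML page that forwards the browser to `target`.

    The HTML wrapper is an assumed skeleton; the slide shows only the
    JavaScript statement itself.
    """
    return (
        "<html><head><script type=\"text/javascript\">\n"
        f'window.location.replace("{target}");\n'
        "</script></head><body></body></html>\n"
    )


# Stable public name: baskauf/66921.htm; changeable internal target below.
page = redirect_stub("../metadata.htm?baskauf/66921/metadata/img/3456/2304")
print(page)
```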
Unchanging domain names http://www.bioblitznashville.org/ vs. http://bioimages.vanderbilt.edu/ If I die, get fired, or lose interest in Bioimages, the HTTP URIs could still continue to be resolved for a long time.
How long is “persistent”? Forever is a pretty long time. The Internet is only 40 years old and the Web only 20. I say if you can foresee your institution and domain name lasting 10 years, go for it! Alternative? tdwg.org subdomain (but GUID review is 188 days old!)
Why do we need GUIDs? They provide a convenient way to cite ANYTHING and allow a reader to obtain further information with only a Web browser. They allow metadata about a resource to unambiguously refer to other resources at other institutions (e.g. duplicate specimens, live plant images and specimens). They make it possible to have a system that can update itself automatically.
STOP WAITING and go for it! There is nothing that would stop most of us from starting to use HTTP URI GUIDs within a month. Forget about LSIDs. If you are afraid of RDF, ignore it and worry about it later. Rules were made to be broken. See http://bioimages.vanderbilt.edu/ for more information about everything here and examples. Also a link to the Apache page on content negotiation.