Will It All Fit Together? The need for standards and the technical challenges Brian Kelly and Paul Miller UK Web Focus Interoperability Focus UKOLN University.

Presentation transcript:

Will It All Fit Together? The need for standards and the technical challenges Brian Kelly and Paul Miller UK Web Focus Interoperability Focus UKOLN University of Bath Bath, BA2 7AY UKOLN is funded by the Library and Information Commission, the Joint Information Systems Committee (JISC) of the Higher Education Funding Councils, as well as by project funding from the JISC and the European Union. UKOLN also receives support from the University of Bath where it is based B

2 Contents Introduction Background: The Web Library Information Problems Solutions Deployment Challenges Conclusions B

3 About Us UK Web Focus Advises UK HE community on web developments JISC-funded Represents JISC on W3C Interoperability Focus Advises on issues related to the deployment of ‘interoperable’ services across libraries, museums, archives, etc. JISC and LIC funded Represents community on various international metadata and standardisation initiatives B

4 About You How many are in the following groups: “Webmasters” Library catalogue/system Managers Others What do you hope to gain from the session? …and if we use terms you don’t understand… ask! B

5 Aims of this Session The aims of this session are: To provide an update on web developments To illustrate ways in which the web relates to other library–based electronic information To outline some of the advantages of adopting a standardised solution to problems To look at the ways in which things might move in the near future B

6 Standardisation Community Library groups Cultural Heritage Government W3C Produces W3C Recommendations Managed approach Protocols initially developed by W3C members Decisions made by W3C, influenced by member & public review IETF Produces Internet Drafts on Internet protocols Bottom-up approach to developments Protocols developed by interested individuals "Rough consensus and working code" Formal Formal international/national standards processes ISO, CEN, NISO, ECMA, ANSI, BSI… Can be slow-moving and bureaucratic Produce robust standards PNG HTML HTTP / HTTP URN whois++ Proprietary De facto standards Often initially appealing (cf PowerPoint, PDF) May emerge as standards PNG HTML Java Relevant Bodies B

7 Background to the Web The web was initially very successful due to its simplicity Client Netscape IE Lynx HTML Server Apache IIS... Give me foo.html from Here it is The web is based on three key architectural components: Data Format: HTML (HyperText Markup Language) Addressing: URLs (Uniform Resource Locators) Transport: HTTP (Hypertext Transfer Protocol) B
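The "Give me foo.html" / "Here it is" exchange on this slide can be sketched in a few lines. This is a minimal illustration, not part of the original slides: the host www.example.org and the helper names split_url and build_get_request are assumptions made for the example. It shows how the three components meet: a URL is split into a host and a path, and HTTP carries a request for an HTML file.

```python
from urllib.parse import urlsplit


def split_url(url: str) -> tuple[str, str]:
    """Addressing: break a URL into the server to contact and the path to ask for."""
    parts = urlsplit(url)
    return parts.netloc, parts.path or "/"


def build_get_request(host: str, path: str) -> str:
    """Transport: the text a client sends to say 'give me this resource'."""
    lines = [
        f"GET {path} HTTP/1.0",
        f"Host: {host}",
        "",  # blank line marks the end of the request headers
        "",
    ]
    return "\r\n".join(lines)


# Hypothetical URL, used only for illustration.
host, path = split_url("http://www.example.org/foo.html")
request_text = build_get_request(host, path)
```

The server's reply ("Here it is") would be a status line, headers and then the HTML document itself, which is the data format the browser renders.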

8 Background to Library Information Long tradition of categorising information Card catalogue (local) OPAC (local-ish) WebPAC (potentially global) Proven track record on formalising practice AACR (rules for cataloguing) MARC (rules for transfer) Z39.50 (linking and access) P

9 Problems With the Web Although the web has been successful, there are problems: Performance - the web is too slow Resource discovery - lack of a metadata architecture HTML’s lack of arbitrary structure Accessibility - difficulties of accessing information by visually impaired, people using PDAs, etc. Functionality - difficult to deploy interactive applications on the web Addressing etc. B

10 Solutions (Today) HTML 4.0 used in conjunction with CSS 2.0 (Cascading Style Sheets) and the DOM provides an architecturally pure, yet functionally rich environment HTML W3C-Rec Improved forms Hooks for stylesheets Hooks for scripting languages Table enhancements Better printing CSS W3C-Rec Support for all HTML formatting Positioning of HTML elements Multiple media support Problems Changes during CSS development Netscape & IE incompatibilities Continued use of browsers with known bugs DOM - W3C-Rec Document Object Model Hooks for scripting languages Permits changes to HTML & CSS properties and content B

11 HTML's Limitations HTML 4.0 / CSS 2.0 have limitations: Difficulties in introducing new elements –Time-consuming standardisation process ( ) –Dictated by browser vendor (, ) Area may be inappropriate for standardisation: –Covers specialist area (maths, music,...) –Application-specific ( ) HTML is a display (output) format HTML's lack of arbitrary structure limits functionality: –Find all memos copied to John Smith –How many unique tracks on Jackson Browne CDs B

12 XML XML: Extensible Markup Language A lightweight SGML designed for network use Addresses HTML's lack of evolvability Arbitrary elements can be defined (,, etc) Agreement achieved quickly - XML 1.0 became W3C Recommendation in Feb 1998 Support from industry (SGML vendors, Microsoft, etc.) Support in Netscape 5 and IE 5 B
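To make the point about arbitrary structure concrete, here is a small sketch of the "find all memos copied to John Smith" query from the previous slide. The element names (memos, memo, to, cc, subject) and the sample data are invented for illustration; the slides do not define a memo vocabulary. Because the markup says what each piece of text is, the question becomes a simple filter.

```python
import xml.etree.ElementTree as ET

# Hypothetical memo markup; the element names are assumptions for this example.
MEMOS_XML = """
<memos>
  <memo><to>Jane Doe</to><cc>John Smith</cc><subject>Budget</subject></memo>
  <memo><to>John Smith</to><subject>Agenda</subject></memo>
  <memo><to>Ann Lee</to><cc>John Smith</cc><cc>Jane Doe</cc><subject>Minutes</subject></memo>
</memos>
"""


def memos_copied_to(xml_text, name):
    """Return the subjects of all memos that list `name` in a <cc> element."""
    root = ET.fromstring(xml_text.strip())
    return [
        memo.findtext("subject")
        for memo in root.iter("memo")
        if any(cc.text == name for cc in memo.findall("cc"))
    ]


copied = memos_copied_to(MEMOS_XML, "John Smith")
```

The same question asked of an HTML page would have to guess from layout which text is a recipient and which is a subject.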

13 XML Deployment Ariadne issue 15 has article on "What Is XML?" Describes how XML support can be provided: Natively by new browsers Back end conversion of XML - HTML Client-side conversion of XML - HTML / CSS Java rendering of XML Examples of intermediaries See B

14 Namespaces and Linking XML Namespaces What if an XML document contains a for the document and a for the name of a book? XML Namespaces enable such clashes to be resolved The naming conventions are defined at a URL XSL stylesheet language will provide extensibility and transformation facilities (e.g. create a table of contents) B
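The clash described on this slide, a title for the document and a title for a book, can be sketched as follows. The namespace URLs under example.org and the prefixes doc and book are placeholders chosen for the example, not names from the slides. Each title is qualified by the namespace it belongs to, so a program can ask for one without picking up the other.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URLs; real vocabularies would publish their own.
NS = {
    "doc": "http://example.org/ns/document",
    "book": "http://example.org/ns/book",
}

RECORD = """
<doc:record xmlns:doc="http://example.org/ns/document"
            xmlns:book="http://example.org/ns/book">
  <doc:title>Reading list for new students</doc:title>
  <book:title>An Example Book</book:title>
</doc:record>
"""


def read_titles(xml_text):
    """Return the document title and the book title as separate values."""
    root = ET.fromstring(xml_text.strip())
    return {
        "document": root.findtext("doc:title", namespaces=NS),
        "book": root.findtext("book:title", namespaces=NS),
    }


titles = read_titles(RECORD)
```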

15 Challenges facing library information Amazon.co.uk Many–MARC Integration with other scholarly resources AHDS Gateway SOSIG Web of Science Alternative delivery on–line document delivery? Competition? Obfuscation? Complication! P

16 Addressing (Problems) URLs (e.g. poly.ac.uk/depts/music/ ) have limitations: Lack of long-term persistency –Organisation changes name –Department shut down or merged –Directory structure reorganised Inability to support multiple versions of resources (mirroring) ISBN/ISSN also problematic: Not tied to the work Nor to the item at hand P

17 Addressing (Solutions) DOIs (Digital Object Identifiers): Proposed by publishing industry as a solution Aimed at supporting rights ownership Business model needed Do two copies of a digital object get separate DOIs? PURLs (Persistent URLs): Provide single level of redirection P
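The "single level of redirection" that PURLs provide can be sketched as a lookup table held by a resolver. This is an illustrative model only: the class name PurlResolver, the paths and the target URLs are all invented, and a real resolver would answer over HTTP rather than return a tuple. The published address stays fixed while the target it points to is updated when a resource moves.

```python
class PurlResolver:
    """Toy model of a persistent-URL resolver (hypothetical, for illustration)."""

    def __init__(self):
        self._targets = {}

    def register(self, purl_path, target_url):
        """Create or update the target for a persistent path."""
        self._targets[purl_path] = target_url

    def resolve(self, purl_path):
        """Return (status, location), mimicking an HTTP redirect response."""
        target = self._targets.get(purl_path)
        if target is None:
            return 404, None
        return 302, target


resolver = PurlResolver()
resolver.register("/ukoln/metadata", "http://www.example.org/depts/music/metadata.html")

# The department is reorganised: only the resolver entry changes,
# while every published link to /ukoln/metadata keeps working.
resolver.register("/ukoln/metadata", "http://www.example.org/arts/metadata.html")
```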

18 Joined–up thinking Users can be anywhere. They need to search anywhere Physical locations at which digital data are stored should not impinge upon access Disciplinary boundaries should not be a barrier P

19 Z39.50 International Standard (ISO 23950) Permits remote searching of databases Access via Z client or over web Relies upon ‘Profiles’ Used outside the library See P
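As a rough illustration of what a Z39.50 search looks like to a client, the sketch below builds a query string in Prefix Query Format (PQF), the notation used by some Z39.50 toolkits. It is an assumption that the reader's client accepts PQF, and the helper names are invented; the Use attribute numbers shown (4 for title, 7 for ISBN, 1003 for author) come from the Bib-1 attribute set. No network connection is made here.

```python
# A few Bib-1 "Use" attributes (attribute type 1): which access point to search.
BIB1_USE = {
    "title": 4,
    "isbn": 7,
    "author": 1003,
}


def pqf_term(access_point, term):
    """Build a single PQF search term such as: @attr 1=4 "metadata"."""
    use = BIB1_USE[access_point]
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'@attr 1={use} "{escaped}"'


def pqf_and(left, right):
    """Combine two PQF expressions; PQF puts the operator first."""
    return f"@and {left} {right}"


query = pqf_and(pqf_term("author", "Miller"), pqf_term("title", "metadata"))
```

A profile, in the sense used on the slide, is in part an agreement about which of these attributes every participating server will support.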

20 Z39.50 Challenges Profiles for each discipline Defeats interoperability? Bib–1 bloat Largely invisible Seen as complicated Seen as expensive Seen as old–fashioned Surely no match for XML/RDF/whatever P

21 Z39.50 Futures International Interoperability Profile Cross–Domain Attribute Set Attribute Architecture Bib–2 XER DNER/RDNC/NGDF/ New Library? P

22 When to use it? To provide remote access to a large catalogue of material (an OPAC, a museum collection management system…) To facilitate/allow searching of your resources alongside like resources from elsewhere P

23 What is ‘Metadata’? –meaningless jargon –or a fashionable term for what we’ve always done –or “a means of turning data into information” –and “data about data” –and the name of a film director (‘Luc Besson’) –and the title of a book (‘The Lord of the Flies’). P

24 What is ‘Metadata’? Metadata exists for almost anything; People Places Objects Concepts Web pages Databases. P

25 What is ‘Metadata’? Metadata fulfils three main functions; Description of resource content –“What is it?” Description of resource form –“How is it constructed?” Description of resource use –“Can I afford it?”. P

26 Introducing the Dublin Core An attempt to improve resource discovery on the Web –now adopted more broadly Building an interdisciplinary consensus about a core element set for resource discovery –simple and intuitive –cross–disciplinary –international –flexible.

27 15 elements of descriptive metadata All elements optional All elements repeatable The whole is extensible –offers a starting point for semantically richer descriptions Interdisciplinary –libraries, museums, archives… International –available in 20 languages, with more on the way... Introducing the Dublin Core

28 Title Creator Subject Description Publisher Contributor Date Type Format Identifier Source Language Relation Coverage Rights Introducing the Dublin Core

29 Implementing the Dublin Core Normally thought of as being HTML Most recently possible in XML/RDF Dublin Core ‘view’ onto richer databases DC elements in Bib–1 DC elements form basis of XD attribute set DC closely mapped to GILS See datamodel/WD–dc–rdf/ See P
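The HTML route mentioned first on this slide can be sketched as a function that turns a simple record into meta elements using the common DC.Element naming convention. The function name dc_meta_tags and the sample record are assumptions for the example. The record maps each element to a list of values, which reflects the rules stated earlier: every element is optional and every element is repeatable.

```python
from html import escape

# The fifteen Dublin Core elements listed on the earlier slide.
DC_ELEMENTS = {
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
}


def dc_meta_tags(record):
    """Render a {element: [values]} record as HTML <meta> elements."""
    tags = []
    for element, values in record.items():
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element}")
        for value in values:  # elements are repeatable
            content = escape(value, quote=True)
            tags.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(tags)


# Sample record built from the title slide of this presentation.
sample = {
    "Title": ["Will It All Fit Together?"],
    "Creator": ["Brian Kelly", "Paul Miller"],
}
html_head = dc_meta_tags(sample)
```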

30 RDF RDF - the metadata framework Based on a formal data model (directed labelled graphs) Syntax for interchange of data Schema model [Figure: RDF Data Model. The resource page.html has two properties, each with a PropertyType and a Value: Cost = £0.05 and ValidUntil = 11-May-98. A second diagram shows the same statements using nodes labelled Property, InstanceOf, PropObj, PropName and Value.] P
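The data model in the figure (a resource, a property type and a value) can be sketched as a list of three-part statements. This is a simplified model for illustration, not RDF syntax: the Triple class and the values_of helper are invented names, and the data is the page.html example from the slide.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a resource has a property type with a value."""
    resource: str
    property_type: str
    value: str


# The two statements from the slide's diagram.
statements = [
    Triple("page.html", "Cost", "£0.05"),
    Triple("page.html", "ValidUntil", "11-May-98"),
]


def values_of(triples, resource, property_type):
    """Follow the labelled arcs from a resource and collect the values."""
    return [
        t.value
        for t in triples
        if t.resource == resource and t.property_type == property_type
    ]


cost = values_of(statements, "page.html", "Cost")
```

Each statement is one arc in the directed labelled graph: the resource and the value are the nodes, and the property type is the label on the arc.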

31 Authentication We can’t (can we?) just make all these resources available for free. Users need to authenticate. ATHENS / Digital Signatures … Authenticate once per site ? Authenticate once per query per site ? Complicated by Z39.50 searches authenticate once per Target queried ?! Ideally, authenticate once when you log on in the morning! P

32 Deployment How do I deploy “the new stuff” in the real world? Barriers: Browser x doesn’t do CSS, … Authoring tools don’t do RDF I prefer the web as it is I haven’t the time to learn anything new This Z39.50 thing is just too hard B

33 Approaches to Deployment Various interesting new technologies have been outlined How can they be deployed in our environment? Should we: Ignore them? Accept them fully? Accept them partly? B

34 Ignore New Developments We can choose to ignore new developments, and continue to use, say, HTML 3.2: Safe option, with no new training, support or software costs Experience in effectiveness, limitations, etc. • Fails to address current performance problems • Fails to address accessibility problems • Fails to provide new functionality • Service likely to look "old-fashioned" compared with competition B

35 Fully Accept New Developments We can choose to move wholesale to, say, HTML 4.0 and CSS 2.0: Can be exciting to be at leading edge Performance benefits Accessibility benefits Based on open standards Provides motivation for users to upgrade browsers Likely to be solution at some point (cf. Gopher) • Backwards compatibility problems with old browsers • Costly to deploy new authoring tools, training, ... • Likely to be bugs and incompatibilities with new tools and browsers B

36 Implement "Safe" Solutions An alternative is to use "safe" parts of technologies which are backwards compatible and avoid major browser bugs Attractive sounding compromise position • Lose some functionality, but not all • Can be difficult or expensive to find "safe" options (does margin-left work on IE on SGI?) • Tools may not allow safe options to be chosen • Lack of validation tools for checking conformance with restricted set of specification Note See for unsafe CSS 2.0 properties B

37 Decision Time What would you opt for? Stick with current technologies Cheap, default option. Continuation of performance and accessibility problems. Unlikely to be long term solution. Deploy new technologies More expensive option. Functionality, performance and accessibility benefits. Access problems for old browsers. Use "safe" new technologies May require home-grown tools and support. Avoids some of the problems of other solutions B

38 An Alternative An alternative approach to deploying new technologies is available: Use more intelligent server-side software Use "proxies" to address limitations of browser technologies. The term intermediary was used in a paper [1] at the WWW 7 conference to describe this approach Protocol solutions, such as Transparent Content Negotiation (TCN) and Composite Capability/Preference Profiles (CC/PP) [1] "Intermediaries: New Places For Producing and Manipulating Web Content" B

39 Intelligent Server Software Simple model: Server receives request for resource Server delivers resource to client More sophisticated model: Server receives request for resource Server processes header information from client Server delivers resource to client based on client information Can be implemented using server add-ons such as PHP/FI and MS Active Server Pages or by use of Content Management systems B
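The "more sophisticated model" can be sketched as a function that inspects the request headers before deciding which version of a resource to send. The file names and the matching rules below are purely illustrative assumptions; a production service would need a maintained list of browser capabilities rather than two substring tests.

```python
def choose_variant(headers):
    """Pick a version of a page based on what the client says about itself.

    `headers` is a dict of request header names to values. The variant
    file names are hypothetical.
    """
    agent = headers.get("User-Agent", "").lower()
    if "lynx" in agent:
        # Text-only browser: send plain HTML with no style sheet.
        return "page-text.html"
    if "mozilla/4" in agent:
        # Fourth-generation browsers: assume some CSS support.
        return "page-css.html"
    # Unknown or older client: fall back to the most conservative markup.
    return "page-basic.html"


variant = choose_variant({"User-Agent": "Lynx/2.8.1rel.2 libwww-FM/2.14"})
```

Keeping the decision on the server means the content itself can be authored once, with the per-browser differences handled at delivery time.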

40 Web Conclusions To conclude: New web protocols are still being developed Deployment of new technologies can be expensive or time-consuming, but is likely to be needed Various deployment models: • Don't implement • Implement fully • Implement via proxy • Other solutions We can't do it all ourselves Experience in developing (wide-area) web applications will help in developing intermediaries B

41 Non–Web Conclusions Cross–domain interoperability is a laudable goal Technical developments continue in a rapidly shifting environment Libraries are not alone To make an OPAC more widely available, look at Z39.50 To raise awareness of library web pages, or to describe particular resources, look at a ‘metadata’ solution like Dublin Core We need to move beyond ‘traditional’ users (who know where the library is and what it offers)… P