Integrating the DOI with Intra-organization Legacy Systems WWW8 Conference - DOI Workshop Toronto, May 11, 1999 Andy Stevens John Wiley & Sons, Inc., New York
Topics for discussion Business and technical reasons for using DOIs Choosing numbering schemes, granularity, bookkeeping Technical implementation issues
Wiley as an example Wiley is a medium-size independent publisher with annual revenue around US$450 million We mostly publish technical and scientific books and journals: –1000 new book titles per year –400 academic journals –various Web sites, CD-ROMs, software
Business reasons for using DOIs Need a way to keep track of e-commerce. How can you track sales for an online item unless you have a way to track the item itself? Similar reasoning applies for tracking rights and royalties. Facilitates rebundling of online content. The DOI cuts across product lines. For Wiley, this means journals, books, Internet-only content.
Technical reasons for using DOIs Persistence of URLs is a problem. For example, as our online journals mature, they get shuffled from system to system. We need a way to do bookkeeping on our own content. We need a way to link to other people's content (central metadata database).
Applying DOIs to Wiley journals We initially chose to concentrate on our journal content due to the well-regimented production process and also the existence of our online journals project (Wiley InterScience) Our journals were already assigning identifiers to content (SICI numbers)
Applying DOIs to Wiley books More difficult because the book production process is much more varied within the company: many different platforms (Quark, FrameMaker, etc.); many different vendors and compositors; not much of an online presence for book content (yet), though this is changing (e.g. sample chapters for Amazon)
Choosing a numbering scheme Q: What's a SICI number? A: A Serial Item and Contribution Identifier, of course! NISO Z (199612)47: CO;2-J –compact identifier containing ISSN, publication date, volume, issue, page, component, format, check digit
Choosing a numbering scheme (cont.) We are assigning SICI numbers to all of our abstracts, articles, and issues anyway, so we might as well just use them for DOIs: /(SICI) (199612)47: CO;2-J For the journals themselves, just use ISSN: /(ISSN) Don't forget: the DOI is a dumb number!
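As a rough illustration of the scheme on this slide, the sketch below builds a DOI by putting a scheme label and an existing identifier after the registrant prefix. The prefix `10.9999` and the sample ISSN are placeholders, since the actual prefix does not appear in the transcript, and the function names are hypothetical. In keeping with the "dumb number" rule, `split_doi` separates only prefix from suffix and never tries to interpret the suffix.

```python
# Placeholder registrant prefix: the real prefix is not shown in the slides.
ASSUMED_PREFIX = "10.9999"

# Scheme labels mentioned on the slide.
KNOWN_SCHEMES = {"SICI", "ISSN"}


def make_doi(scheme: str, identifier: str, prefix: str = ASSUMED_PREFIX) -> str:
    """Build a DOI of the form prefix/(SCHEME)identifier."""
    scheme = scheme.strip().upper()
    if scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unsupported scheme: {scheme!r}")
    identifier = identifier.strip()
    if not identifier:
        raise ValueError("identifier must not be empty")
    return f"{prefix}/({scheme}){identifier}"


def split_doi(doi: str) -> tuple:
    """Split at the first slash only; the suffix is treated as opaque."""
    prefix, _, suffix = doi.partition("/")
    return prefix, suffix


if __name__ == "__main__":
    journal_doi = make_doi("issn", "1234-5679")  # made-up ISSN
    print(journal_doi)
    print(split_doi(journal_doi))
```

Splitting at the first slash matters because the suffix may itself contain slashes; nothing downstream should depend on what the suffix looks like.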
Choosing a numbering scheme (cont.) For books, use BICI number or some similar scheme (not yet decided). Will probably be finalized soon, as encyclopedias come online.
Granularity For our journals, we chose abstract, article, issue (TOC), journal. No need to get more specific, as there is currently no business need to identify down to the paragraph or diagram level. For books, we will automatically assign DOIs down to the chapter level, and then have a manual procedure for creating finer DOIs as needed.
The need for a new system Given that legacy systems generally do not cross product lines (e.g. books, journals, and rights are all produced/accounted in separate systems), we decided to create a new system to maintain DOIs. The Wiley DOI Server has open network-based APIs (via HTTP or FTP) for integration with external systems, old or new.
Wiley DOI Server What is it? –Initially, a Sun Sparc 5 with mSQL database on my desktop (Summer '97) –Currently, a dedicated Sun Sparc E450 with Sybase and Verity (Fall '98) –Eventually, just another application on a big Sun server in our IT department
Resolution of Wiley DOIs CNRI Directory (dx.doi.org): 1. Request Wiley DOI; 2. Redirect to URL for response page on Wiley DOI Server. Wiley DOI Server (doi.wileynpt.com): 3. Request DOI response page for requested Wiley DOI; 4. Link user to content (if available). Content Web Server: 5. Request content associated with requested DOI; 6. Return content.
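The six numbered steps on this slide can be traced with a small in-memory simulation. This is only a sketch: the two host names come from the slide, but the response-page path, the lookup table, the sample DOI, and every function name are assumptions made for illustration, and no network requests are made.

```python
from typing import Dict, List, Optional
from urllib.parse import quote

# Host from the slide; the path and query parameter are hypothetical.
RESPONSE_PAGE_BASE = "http://doi.wileynpt.com/response?doi="

# Hypothetical table held by the publisher's DOI server: DOI -> content URL.
CONTENT_URLS: Dict[str, str] = {
    "10.9999/(ISSN)1234-5679": "http://content.example.org/journals/1234-5679/",
}


def central_directory_redirect(doi: str) -> str:
    """Steps 1-2: the central directory redirects to the publisher's response page."""
    return RESPONSE_PAGE_BASE + quote(doi, safe="")


def response_page_link(doi: str) -> Optional[str]:
    """Steps 3-4: the response page links to the content, if it is available."""
    return CONTENT_URLS.get(doi)


def resolve(doi: str) -> List[str]:
    """Return a human-readable trace of the resolution steps for one DOI."""
    trace = [f"1. request {doi} from the central directory"]
    trace.append(f"2. redirect to {central_directory_redirect(doi)}")
    trace.append("3. request the response page from the DOI server")
    content_url = response_page_link(doi)
    if content_url is None:
        trace.append("4. response page shown without a content link")
        return trace
    trace.append(f"4. response page links to {content_url}")
    trace.append(f"5. request {content_url} from the content server")
    trace.append("6. content returned")
    return trace


if __name__ == "__main__":
    for step in resolve("10.9999/(ISSN)1234-5679"):
        print(step)
```

Keeping the content URL in the publisher's own table is what lets content move between systems without touching the central directory entry.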
Functions of Wiley DOI Server Provides response page for DOI requests –from user perspective: interruption –from publisher perspective: opportunity to do authentication and branding, and possibility to provide additional services (make the interruption a feature, not a bug). Provides FTP site for new DOIs as they are created by other Wiley systems. Current delivery format: XML files
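To make the XML delivery concrete, here is a minimal sketch of how a batch of new DOIs might be written and read back. The slides do not show the real file layout, so the element names (`doi_batch`, `doi_record`, `doi`, `url`), the function names, and the sample record are all invented for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def build_doi_batch(records: List[Dict[str, str]]) -> str:
    """Serialize DOI/URL pairs into one XML document (hypothetical layout)."""
    root = ET.Element("doi_batch")
    for record in records:
        item = ET.SubElement(root, "doi_record")
        ET.SubElement(item, "doi").text = record["doi"]
        ET.SubElement(item, "url").text = record["url"]
    return ET.tostring(root, encoding="unicode")


def parse_doi_batch(xml_text: str) -> List[Dict[str, str]]:
    """Read the same hypothetical layout back into a list of dictionaries."""
    root = ET.fromstring(xml_text)
    return [
        {"doi": item.findtext("doi"), "url": item.findtext("url")}
        for item in root.findall("doi_record")
    ]


if __name__ == "__main__":
    batch = build_doi_batch(
        [{"doi": "10.9999/(SICI)sample<1>", "url": "http://content.example.org/a1"}]
    )
    print(batch)
    print(parse_doi_batch(batch))
```

Using an XML library rather than string concatenation matters here because SICI-based identifiers can contain angle brackets, which must be escaped inside element text.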
Functions of Wiley DOI Server (cont.) Provides place to do maintenance –ping test to check for dead URLs –register new DOIs in central directory Stores metadata about DOIs to allow reverse DOI lookup. Hot Topic!! –Currently, we use a custom metadata field set geared to Wiley journals –defined API for batch lookups (similar to PubMed)
The need for a metadata database For the DOI to be useful as a linking mechanism, we need to be able to look up DOIs based on item metadata (just like we would look up an ISBN for a given book title). Several initiatives underway to develop technical infrastructure and to decide field set(s).
Ongoing issues Support from management is essential. Authentication: most DOI-assigned content is controlled by some kind of online authentication system. How do we gracefully deal with multiple logins without burdening the user too much? Metadata and links are valuable; how do we police them?
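A reverse lookup of this kind can be sketched as an index from citation metadata to DOI. The key fields used below (ISSN, volume, first page) are an assumed field set chosen for illustration, not the custom field set described in the talk, and the class and sample values are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Assumed lookup key: (ISSN, volume, first page), all as strings.
Key = Tuple[str, str, str]


class MetadataIndex:
    """In-memory index from item metadata to DOI (illustrative only)."""

    def __init__(self) -> None:
        self._by_key: Dict[Key, str] = {}

    @staticmethod
    def _key(issn: str, volume: str, first_page: str) -> Key:
        return (issn.strip().upper(), volume.strip(), first_page.strip())

    def add(self, doi: str, issn: str, volume: str, first_page: str) -> None:
        self._by_key[self._key(issn, volume, first_page)] = doi

    def lookup(self, issn: str, volume: str, first_page: str) -> Optional[str]:
        """Return the DOI for one citation, or None if it is not indexed."""
        return self._by_key.get(self._key(issn, volume, first_page))

    def batch_lookup(self, queries: Iterable[Key]) -> List[Optional[str]]:
        """Resolve many citations at once, preserving the input order."""
        return [self.lookup(*query) for query in queries]


if __name__ == "__main__":
    index = MetadataIndex()
    index.add("10.9999/(SICI)sample-1", "1234-5679", "47", "943")
    print(index.batch_lookup([("1234-5679", "47", "943"), ("1234-5679", "47", "1")]))
```

Returning `None` for misses in a batch, rather than raising, lets a linking partner submit a whole reference list and keep whatever matches.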
Ongoing issues (cont.) Still need to implement some features of Wiley DOI server: –automatic registration of new DOIs into central directory –dual ISSNs for journals (print and electronic) can confuse searching –foreign characters currently confuse searching
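One common way to approach the last two items is to normalize both the stored metadata and the query before comparing them. The sketch below is a generic technique, not what the Wiley DOI Server does: it folds accented characters to their base letters and maps an electronic ISSN to its print counterpart. The alias table and both ISSNs are made up.

```python
import unicodedata

# Hypothetical alias table: electronic ISSN -> print ISSN for the same journal.
ISSN_ALIASES = {"2345-6789": "1234-5679"}


def fold_text(text: str) -> str:
    """Strip combining accents and case so that variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    without_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return without_marks.casefold()


def canonical_issn(issn: str) -> str:
    """Map either ISSN of a journal to a single canonical value."""
    issn = issn.strip().upper()
    return ISSN_ALIASES.get(issn, issn)


if __name__ == "__main__":
    print(fold_text("Schr\u00f6dinger") == fold_text("SCHRODINGER"))
    print(canonical_issn("2345-6789"), canonical_issn("1234-5679"))
```

Accent folding only handles characters that decompose into a base letter plus a combining mark; letters such as "ø" have no such decomposition and would need an explicit mapping.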
Summary The business needs to recognize the usefulness of, and the need for, the DOI for it to succeed within a company. Coexistence with legacy systems is possible in a networked environment. Many cross-linking opportunities exist, but depend on a good metadata database.