MELLON E-JOURNAL ARCHIVING PROJECT January 20, 2002


1 MELLON E-JOURNAL ARCHIVING PROJECT January 20, 2002

2 DIGITAL PRESERVATION

3 THE BIG ISSUE IN DIGITAL LIBRARIES
Digital is inherently fragile
  – constant technological change yields short life for all digital materials
Nothing will be saved passively
  – requires constant and conscious action to preserve
A core role for research libraries in the digital era?

4 JOURNAL ARCHIVING IN THE PAPER ERA
Large-scale redundancy
Access copy and archival copy usually the same
Not just storage, but preservation
  – includes environmental control, library binding, repair, reformatting...
Deliberate, long-term archiving largely the role of national and research libraries

5 E-JOURNAL MODEL IS DIFFERENT
“Copies” are remote, held in publisher systems
  – not replicated across different institutions
Perpetual license provides limited comfort in the absence of independent copies
Long-term preservation involves very different issues than day-to-day access

6 LACK OF ARCHIVING A GROWING PROBLEM
Libraries bearing double costs
  – the e-journals users prefer
  – the paper for preservation
Publishers cannot convert totally to digital
  – authors and editors distrust e-only journals because of concerns about persistence
  – libraries demand paper for preservation
Libraries preserving paper version, but electronic more complete, increasingly the copy of record

7 MELLON E-JOURNAL ARCHIVING PROGRAM
13 institutions invited to submit proposals for a one-year planning project
Six planning proposals were selected and funded in December 2000
  – additional project focused on technology (LOCKSS) also funded
Second round of Mellon grants to be announced in June will fund actual implementation

8 SIX PLANNING PROJECTS
Publisher-based
  – Harvard (Wiley, Blackwell, University of Chicago Press)
  – Penn (Oxford and Cambridge University Presses)
  – Yale (Elsevier)
Discipline-based
  – Cornell (agriculture)
  – NYPL (performing arts)
Dynamic e-journals
  – MIT

9 SOME BASIC ASSUMPTIONS
Archive should be independent of publishers
  – responsibility of institutions for whom archiving is a core mission
Archiving requires active publisher partnership
Address long timeframes (100 years?)
Archive design based on Open Archival Information System (OAIS) model

10 OBJECTIVES FOR PLANNING PROJECTS
Develop draft archiving agreements with publisher partners
Design technical architecture for an archive
Formulate an acquisitions and growth plan
Articulate access policies
Address validation/certification
Design an organizational model, staffing, long-term funding model

11 Key planning issues/decisions…

12 BASE ON DL INFRASTRUCTURE
Use existing infrastructure for storage, management, preservation, access
Enhanced to comply with OAIS model
New ingest and rendering functions

13 ARCHIVING AGREEMENT
Explicit archiving license with publisher
License addresses what content is archived, responsibilities of parties, conditions of use, economics
Not always an easy negotiation
  – archiving involves handing publisher’s intellectual property to independent party

14 PUSH MODEL
Publishers will “push” content to be archived to Harvard
  – ongoing regular deposit following on-line publication of issue (what happens when issues disappear?)

15 WHAT CONTENT IS DEPOSITED?
“Journal issues” are complex
  – publishers do not treat all journal content the same (e.g., “front matter” treated as web pages, not objects in content management systems)
  – “associated materials” (datasets, images, tables, etc.) not in the print versions
  – advertising usually dynamic, and can involve country-specific complexities

16 SOME COMMON STUFF
Journal description
Editorial board
Instructions to authors
Rights and usage terms
Copyright statement
Ordering information
Reprint information
Indexes
Career information
News
Events lists
Discussion fora
Editorials
Errata
Reviewers
Conference announcements

17 ARCHIVE MOST CONTENT
Exclude little except advertisements
  – different from most “local loading”
Articles include supplementary materials
Include an “issue object” in addition to the article components
  – masthead, news, jobs, meetings, etc.
Reference links problematic
  – dynamic, frequently separate from article

18 STANDARD ARCHIVAL ARTICLE DTD
Publisher’s SGML formats vary widely
Consultant report on practicality of common archival XML DTD
Dramatically reduces archive complexity
Issues include
  – how low a common denominator
  – extended character sets, formulae, etc.
  – sacrifice functionality and original appearance
  – transformations involve risks

19 DEPOSIT MORE THAN ONE FORMAT?
Archive must accept PDF in any case
  – so include both SGML and PDF when available? (belt and suspenders)
  – inclined to do this
Accept publisher’s original SGML also?
  – conversion to archival DTD will result in loss
  – inclined not to do this

20 “DARK-TO-LIGHT”
Archived material not accessible at deposit
  – do not compete with publishers
Content becomes accessible after “trigger event”
  – default then is universal access
But how do you know “dark” archival content is still good?
  – it would be better if there were some ongoing access...
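One common way to check that "dark" content is still good without opening it to readers is a periodic fixity audit: record a checksum for each file at deposit and recompute it later. The slides do not say how the archive would audit its holdings, so the following is only a minimal sketch under that assumption; the function names `sha256_of` and `audit`, the manifest layout, and the sample file name are all hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(manifest: dict, root: Path) -> list:
    """Return relative paths that are missing or whose bytes no longer match.

    `manifest` maps a relative path to the checksum recorded at deposit
    (a hypothetical layout, not something the slides specify).
    """
    damaged = []
    for relative_path, expected in manifest.items():
        target = root / relative_path
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(relative_path)
    return damaged


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        article = root / "article.xml"  # hypothetical deposited file
        article.write_bytes(b"<article>deposited content</article>")
        manifest = {"article.xml": sha256_of(article)}

        print(audit(manifest, root))   # nothing damaged yet
        article.write_bytes(b"<article>silently altered</article>")
        print(audit(manifest, root))   # the altered file is reported
```

A checksum audit only shows that the bits are unchanged; it does not show that the content can still be rendered, which is the concern the slide raises about ongoing access.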

21 ACCESS MODEL
Archived content always accessible to anyone with appropriate license from publisher
  – might be satisfied by batch export
After trigger, simple on-line functionality
  – assume same functionality for auditors

22 TRIGGER EVENTS
“N” years after deposit
  – “N” set by publisher title-by-title
When title/year no longer commercially accessible on the Internet
  – still problematic with some publishers
When content enters public domain
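The three trigger events can be read as a simple rule: content turns "light" as soon as any one of them has occurred. The sketch below expresses that reading; the `ArchivedVolume` record, its field names, and the assumption that any single trigger suffices are illustrative and not taken from the slides.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class ArchivedVolume:
    """Hypothetical record for one title/year held in the archive."""
    title: str
    deposited: date
    embargo_years: Optional[int]   # "N", set by the publisher per title; None if not agreed
    commercially_available: bool   # still offered by the publisher on the Internet
    public_domain: bool


def add_years(start: date, years: int) -> date:
    """Shift a date by whole years, mapping 29 February to 28 February if needed."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def is_light(volume: ArchivedVolume, today: date) -> bool:
    """Return True if any trigger event has occurred for this volume."""
    if volume.public_domain:
        return True
    if not volume.commercially_available:
        return True
    if volume.embargo_years is not None:
        return today >= add_years(volume.deposited, volume.embargo_years)
    return False


if __name__ == "__main__":
    volume = ArchivedVolume(
        title="Example Journal, 2002",   # hypothetical title
        deposited=date(2002, 1, 20),
        embargo_years=5,
        commercially_available=True,
        public_domain=False,
    )
    print(is_light(volume, date(2006, 12, 31)))
    print(is_light(volume, date(2007, 1, 20)))
```

The hard part in practice is the second trigger: deciding when a title is "no longer commercially accessible" is a judgment the slide itself flags as problematic, so the boolean field here hides a negotiated definition.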

23 PRESERVATION
Format-by-format issue
Archive specifies preferred formats, which will be kept renderable
Just maintain bits for others
  – e.g., “associated materials” (datasets, models, etc.) generally accepted in ANY format
    maintaining the viability of such wildly heterogeneous materials unrealistic
  – keep unaltered for future “digital archeology”
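The format-by-format policy amounts to a two-level lookup: files in a preferred format get a commitment to stay renderable, and everything else gets bit-level storage only. A minimal sketch follows; the set of preferred formats (`xml`, `pdf`) is an assumption based on the formats discussed in earlier slides, and identifying a format by file extension is a simplification.

```python
from enum import Enum


class PreservationLevel(Enum):
    RENDERABLE = "kept renderable"
    BIT_LEVEL = "bits maintained unaltered"


# Assumed for illustration: the archival XML and PDF formats mentioned
# earlier in the deck. The slides do not give an actual list.
PREFERRED_FORMATS = {"xml", "pdf"}


def preservation_level(filename: str) -> PreservationLevel:
    """Choose a preservation level from the file extension (a simplification)."""
    extension = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    if extension in PREFERRED_FORMATS:
        return PreservationLevel.RENDERABLE
    return PreservationLevel.BIT_LEVEL


if __name__ == "__main__":
    for name in ("article.xml", "article.PDF", "survey_data.sav", "model.bin"):
        print(name, "->", preservation_level(name).value)
```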

24 ECONOMIC MODEL
First question is not who pays, but what will it cost...
  – reducing costs to the minimum is critical
In general publishers expected to bear preparation costs for archived objects
Process automation critical to keeping costs low
  – ingest process
  – auditing

25 PAYMENT WITH DEPOSIT
Two-part fee
  – ingest fee to cover up-front costs
    varies with publisher effort to create easily archived objects?
  – “dowry” to create maintenance endowment
Sources include subscribers, authors, societies
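To make the two-part fee concrete, one simple way to size a "dowry" is the perpetuity formula: the endowment whose annual real return equals the annual maintenance cost. The slides do not say how the dowry would be calculated, so this is only an illustrative assumption, and every figure in the usage example is invented.

```python
def dowry(annual_maintenance_cost: float, real_return_rate: float) -> float:
    """Endowment whose yearly return covers maintenance forever (perpetuity formula).

    Assumes a constant real rate of return and constant real costs,
    which is a strong simplification.
    """
    if real_return_rate <= 0:
        raise ValueError("real_return_rate must be positive")
    return annual_maintenance_cost / real_return_rate


def deposit_fee(ingest_cost: float, annual_maintenance_cost: float,
                real_return_rate: float) -> float:
    """Two-part fee: one-time ingest cost plus the maintenance dowry."""
    return ingest_cost + dowry(annual_maintenance_cost, real_return_rate)


if __name__ == "__main__":
    # Invented numbers, for illustration only.
    fee = deposit_fee(ingest_cost=50.0, annual_maintenance_cost=4.0,
                      real_return_rate=0.04)
    print(f"one-time fee per deposited issue: {fee:.2f}")
```

Under this model the dowry is very sensitive to the assumed rate of return, which is one reason the slide's first question (what will it cost?) has to be answered before deciding who pays.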

26 NEXT...
Proposal to Mellon by April 1 for funding to implement an archive
  – particular parameters of the call-for-proposals still uncertain
Original plan suggested 3 or 4 year projects
Intent is to implement archive, contract for deposit, begin operations
  – learn by getting dirty hands
  – help understand issues, costs

