1 Digital Libraries and the Open Archives Initiative Louisiana State University June 30, 2000 Edward A. Fox fox@vt.edu http://fox.cs.vt.edu CC CS DLRL Internet TIC Virginia Tech, Blacksburg, VA, USA

2 Acknowledgements (Selected)
• Sponsors: ACM, Adobe, ARL, Belgian Science Found., CLIR, DARPA, IBM, LANL, Microsoft, NSF, OCLC, SPARC, US Dept. of Ed. (FIPSE), …
• VT Faculty/Staff: Tony Atkins, Thomas Dunbar, John Eaton, Gwen Ewing, Peter Haggerty, Gary Hooper, Gail McMillan, Len Peters, James Powell, …
• VT Students: Emilio Arce, Fernando Das Neves, Brian DeVane, Robert France, Marcos Goncalves, Scott Guyer, Robert Hall, Neill Kipp, Paul Mather, Tim McGonigle, Todd Miller, Constantinos Phanouriou, William Schweiker, Ohm Sornil, Hussein Suleman, Patrick Van Metre, Laura Weiss, …

3 Virginia Tech Background
• Largest university in Virginia, land-grant, football, town population 35K plus 25K students
• Blacksburg Electronic Village, since 1992, with > 80% of community on Internet
• Net.Work.Virginia, largest ATM network, with over 750 sites, for education, research, government
• LMDS, Local Multipoint Distribution Service, gigabit wireless networking - 1/3 of Virginia
• Math Emporium, 500 workstations
• Faculty Development Initiative, round 2
• Hosting First Joint Conference on Digital Libraries, www.jcdl.org, Summer 2001 @ Hotel Roanoke, VA

4 Remember!
• Digital libraries introduction
• Digital libraries to enhance learning
• OAi (help establish enormous international cooperative of data and service providers)

5 Digital Libraries SGML (1985) PDF (1992) NSF DLI (1994) Library Cancellations (1988) University Scholarly Electronic Pub. (1988) Info. Literacy (1995) Improving Education Internet (1984) WWW (1994) Multimedia (1986)

6 Digital Libraries Shorten the Chain from Editor Publisher A&I Consolidator Library Reviewer

7 DLs Shorten the Chain to Author Reader Digital Library Editor Reviewer Teacher Learner Librarian

8 How do universities and digital libraries relate?
• Each U. will have its own digital libraries. Hence there will be large numbers (i.e., critical mass).
• All students will learn how to use and how to “feed” digital libraries (and bring those habits to future work as needs and skills).
• All digital library problems (esp. federation, flexibility, personalization) appear at U’s (so they are a good type of testbed, with willing collaborators in-place for developing solutions).
• Start with NDLTD, extend to NUDL

9 Digital Libraries --- Virginia Tech
• MARIAN (NLM)
• CS DL Prototype - ENVISION (NSF, ACM)
• TULIP (Elsevier, OCLC)
• BEV History Base (NSF, Blacksburg)
• DL for CS Education - EI (NSF, ACM)
• WATERS, NCSTRL (NSF)
• NDLTD (SURA, US Dept. of Education)
• CSTC (NSF, ACM), CRIM (NSF, SIGMM)
• WCA (Log) Repository (W3C)
• VT-PetaPlex-1 (Knowledge Systems)

10 NCSTRL
• http://www.ncstrl.org
• Networked Computer Science Technical Reference Library
• CS Technical Reports
• 1994 merger of CSTR + WATERS
• 1998 integration with LANL server (CoRR)
• Federated search, mirrors, Dienst protocol

11 Digital Libraries --- Objectives
• World Lit.: 24hr / 7day / from desktop
• Integrated “super” information systems: 5S: streams, structures, spaces, scenarios, societies
• Ubiquitous, Higher Quality, Lower Cost
• Education, Knowledge Sharing, Discovery
• Disintermediation -> Collaboration
• Universities Reclaim Property
• Interactive Courseware, Student Works
• Scalable, Sustainable, Usable, Useful

12 Benefits
• Ease of use
• Effectiveness
• “The benefits of digital libraries will not be appreciated unless they are easy to use effectively.” - IITA Workshop report

13 DLs: Why of Global Interest?
• National projects can preserve antiquities and heritage: cultural, historical, linguistic, scholarly
• Knowledge and information are essential to economic and technological growth, education
• DL - a domain for international collaboration
  – wherein all can contribute and benefit
  – which leverages investment in networking
  – which provides useful content on Internet & WWW
  – which will tie nations and peoples together more strongly and through deeper understanding

14 DL Challenges
• Preservation - so people will trust DLs
• Supporting infrastructure - networks, ...
• Scalability, sustainability, interoperability
• DL industry - critical mass by covering libraries, archives, museums, corporate info, govt info, personal info - “quality WWW” integrating IR, HT, MM, ...
  – Need tools & methods to make them easier to build

15 Locating Digital Libraries in Computing and Communications Technology Space
[Figure: axes labeled Computing (flops), Digital content, and Communications (bandwidth, connectivity), with a scale running from less to more]
Digital Libraries technology trajectory: intellectual access to globally distributed information

17 Definitions
• Library ++ (library+archive+museum+…)
• Distributed information system + organization + effective interface
• User community + collection + services
• Digital objects, repositories, IPR management, handles, indexes, federated search, hyperbase, annotation

18 Definition: Digital Libraries are complex systems that
• help satisfy info needs of users (societies)
• provide info services (scenarios)
• organize info in usable ways (structures)
• present info in usable ways (spaces)
• communicate info with users (streams)

19 5S Layers: Societies, Scenarios, Spaces, Structures, Streams

20 Document Models, Representations, and Accesses
• Doc = stream + structure + use-scenario; hybrid (paper/electronic), digital only
• Multilingual: content, summary, metadata
• Multimedia: structure, quality (QoS), search
• Structured: MARC, SGML, by user: MVD
• Distributed collection: Kleisli, CIMI, Z39.50
• Federated search: collecting, picking site(s), parallel search / fall-back, fusing results
• Access: IPR, payment, security, scenarios
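The federated search steps named on this slide (picking sites, parallel search with fall-back, fusing results) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the in-memory `SITES` dictionary stands in for remote collections, the term-count score is a toy ranking, and keeping the best score per document is only one of many possible fusion rules.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "sites": each maps a document id to its text.
# A value of None simulates a site that is currently unreachable.
SITES = {
    "site_a": {"a1": "digital libraries and open archives", "a2": "wireless networking"},
    "site_b": {"b1": "open archives metadata harvesting", "a1": "digital libraries and open archives"},
    "site_c": None,
}


def search_site(name, query):
    """Return (doc_id, score) pairs; score = number of query terms matched."""
    collection = SITES[name]
    if collection is None:
        raise ConnectionError(f"{name} is unreachable")
    terms = query.lower().split()
    hits = []
    for doc_id, text in collection.items():
        words = text.split()
        score = sum(term in words for term in terms)
        if score:
            hits.append((doc_id, score))
    return hits


def federated_search(query, sites):
    """Query the chosen sites in parallel, skip failed ones, fuse the results."""
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        futures = {name: pool.submit(search_site, name, query) for name in sites}
    fused = {}
    for name, future in futures.items():
        try:
            hits = future.result()
        except ConnectionError:
            continue  # fall-back: carry on with the sites that answered
        for doc_id, score in hits:
            # Fusion rule (assumed): keep the best score seen for each document,
            # which also removes duplicates held by more than one site.
            fused[doc_id] = max(score, fused.get(doc_id, 0))
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


results = federated_search("open archives", ["site_a", "site_b", "site_c"])
print(results)
```

A real federation would also need the metrics mentioned under architectural issues, for example to compare fusion rules or to decide which sites are worth querying.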

21 Architectural Issues
• Internet middleware
• Independent system / part of federation
• Decompositions vary
  – search engine, browser, DBMS, MM support
  – repository, handle server, client
  – information resources + mediators, bus or agent collection + client with workspace/environment
• Metrics: e.g., for federated search

22 Standards
• Protocols/federation
  – Z39.50, CIMI
  – Dienst, NCSTRL
  – OAi protocol
• Metadata
  – TEI: inline, detailed (structure in stream)
  – MARC: two-level, fine-grained
  – Dublin Core: high-level, 15 elements
  – RDF: describing resources/collections, annotation
  – OAMS and others used in OAi
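To make "Dublin Core: high-level, 15 elements" concrete, here is a minimal sketch that builds a Dublin Core description in XML using Python's standard library. The 15 element names and the namespace URI are those of the Dublin Core element set; the `record` wrapper element and the `make_dc_record` helper are hypothetical and are not taken from OAMS or any OAi specification.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The 15 elements of the Dublin Core element set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def make_dc_record(fields):
    """Build an XML element holding one Dublin Core description.

    `fields` is a list of (element_name, value) pairs; elements may repeat.
    The <record> wrapper is a hypothetical container chosen for this sketch.
    """
    record = ET.Element("record")
    for name, value in fields:
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


record = make_dc_record([
    ("title", "Digital Libraries and the Open Archives Initiative"),
    ("creator", "Edward A. Fox"),
    ("date", "2000-06-30"),
    ("language", "en"),
])
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

The contrast with MARC on the same slide is that every element here is optional, repeatable, and coarse-grained, which is what makes it cheap to produce for harvesting.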

23 Remember!
• Digital libraries introduction
• Digital libraries to enhance learning
• OAi (help establish enormous international cooperative of data and service providers)

24 Enhancing Learning with DLs

27 NSF Education Innovation (EI)
• NSF “Interactive Learning with a Digital Library in Computer Science” (1993-98)
• 45 online courses (esp. Internet, IR, MM, Professionalism, overall EI project pages): 100+K accesses/wk
• Tools: SWAN (visualization), QUIZIT
• Evaluation
  – traditional
  – network logging and analysis
  – tools for visualization

28 Digital Library Courseware
• http://ei.cs.vt.edu/~dlib/
• WWW pages or large PDF copy files
• Online quizzes based on book by Michael Lesk (Morgan Kaufmann Publishers)
• Contents based on book, with several other popular topics added (e.g., agents)
• Separate pages to supplement: Definitions, Resources (People, Projects), and References

29 CS -> CSTC -> CRIM
• NSF and ACM Education Committee are funding a 2 year project “A Computer Science Teaching Center” - CSTC - http://www.cstc.org/
• College of NJ, U. Ill. Springfield, Virginia Tech
• Focus initially on labs, visualization, multimedia
• Multimedia part is also supported by a 2nd grant to Virginia Tech and The George Washington University: http://www.cstc.org/~crim/ (with curricular guidelines also under development)

30 CS Teaching Center (CSTC)
• Instead of building large, expensive multimedia packages that become obsolete and are difficult to re-use, concentrate on small knowledge units.
• Learners benefit from having well-crafted modules that have been reviewed and tested.
• Use digital libraries to build a powerful base of support for learners, upon which a variety of courses, self-study tutorials & reference resources can be built. [See NSF NSDL - National Science (math, engineering, technology education) Digital Library (formerly SMETE-lib) at http://www.dlib.org/smete/public/smete-public.html]
• ACM Education Board and SIG support, new NSF grant with COLLEGIS Research Institute and others …

32 Browsing (1)

33 Browsing (2)

34 CRIM Rationale
• MM field needs properly trained personnel
• Support this with resources + curricula
• Together these help us move toward a DL for Interactive MM -> CS -> NSDL
• Benefits will go to teachers (who have more to build upon) and students (who will have a richer environment for learning)

36 CRIM Project Activities
• Workshops, other ways to involve community
• WWW site including DL in CSTC re MM
  – Devised cataloging schema, designed interface
  – Referring to all MM syllabi and curriculum
  – Inviting learning resources for the CRIM DL, with reviews, reuse certifications
• Publish report on MM curriculum through ACM and IEEE, after careful review
• Introducing into CC2001: information retrieval, hypertext/hypermedia, multimedia, digital libraries

37 Curriculum Resources in Interactive Multimedia (CRIM)
• MM field needs properly trained personnel
• Support this with resources + curricula
• Benefits will go to teachers (who have more to build upon) and students (who will have a richer environment for learning)
• CSTC, CRIM have led to ACM Journal of Educational Resources in Computing, JERIC
• Together these help us move forward: DL for Interactive MM -> CS -> NSDL

38 SMETE Library -> NSDL (from www.dlib.org to NSF DLI-2)
• Context: Global movement toward Digital Libraries (see April 1998 CACM)
• NSF effort: Science, Mathematics, Engineering, and Technology Education Digital Library (focussed on undergraduates)
  – 3 workshops, yearly increasing funds / new calls
• NSDL will operate as a distributed federation, with separate parts for each key discipline, and should lead to a global effort.

39 Selected NSDL Projects/Topics
COLLEGIS Res. Inst.: IMS, CS, Math, Viz., …
Columbia University: Earth sciences
Stanford University: Medicine (images)
U. California Berkeley: Engineering
University of Maryland: K-12 education
U. Texas at Austin: Physical anthropology

40 Remember!
• Digital libraries introduction
• Digital libraries to enhance learning
• OAi (help establish enormous international cooperative of data and service providers)

41 Open Archives initiative OAi www.openarchives.org openarchives@openarchives.org

42 OAi Philosophy
• Self-archiving = submission mechanism
• Long-term storage system = archive
• Open interface = harvesting mechanism
• Data provider + service provider
• Start with “gray literature”
  – e-prints/pre-prints, reports, dissertations, …

43 Tiered Model of Interoperability
Tiers: Mediator services; Metadata harvesting; Document models

44 Repository of Digital Objects Repository Access Protocol handle Digital object terms and conditions

45 Open Archives initiative History
• xxx at LANL = Los Alamos National Laboratory (Ginsparg) for high-energy physics - 1991
• CSTR + WATERS = NCSTRL (Lagoze) - 1994
• xxx + NCSTRL = CoRR collaboration - 1998
• UPS (Universal Preprint Service) - 1999 mtg
  – Herbert Van de Sompel (U. Ghent, SFX) …
  – Dublin Core (DC), XML
  – Dienst protocol and software (Lagoze)
• Renamed late 1999 as OAi

46 Open Archives (protoproto)
• ArXiv & Los Alamos National Lab
• CogPrints & U. Southampton
• NACA & NASA (reports)
• NCSTRL & Cornell U.
• NDLTD & Virginia Tech
• RePEc & U. Surrey
• Total of around 200K records

47 Original Open Archives Members
• Caroline Arms, Library of Congress
• Leslie Carr, University of Southampton
• Mark Doyle, American Physical Society
• Dale Flecker, Harvard University
• Edward A. Fox, Virginia Tech
• Michael Friedman, HighWire Press, Stanford U.
• Paul M. Gherman, Vanderbilt U. & SPARC
• Paul Ginsparg, Los Alamos National Lab. & xxx
• Stevan Harnad, University of Southampton
• Thomas Krichel, University of Surrey & RePEc
• Carl Lagoze, Cornell University …

48 Original Open Archives Members cont’d
• Rick Luce, Los Alamos National Laboratory
• Clifford Lynch, Coalition for Networked Info.
• Kurt Maly, Old Dominion University
• Michael Nelson, NASA Langley Research Center
• John Ober, California Digital Library
• Bob Parks, Washington University & EconWPA
• Herbert Van de Sompel, University of Ghent
• Eric F. Van de Velde, Caltech
• Don Waters, The Andrew W. Mellon Foundation
• Ken Weiss, California Digital Library

49 Open Archives Future
• EconWPA (Washington University)
• e-biomed -> PubMed Central (NIH)
• PubScience (DOE)
• Clinical Medicine Netprints (+ other HighWire Press holdings)
• University ePub (California Digital Library)
• All public e-prints (MIT)
• Scholar’s Forum (Caltech)
• Int’l: CERN, Germany, India, Mexico, …
• Goal: millions of books/articles/reports / yr

50 Approaches to Open Archives Build By Discipline Build By Institution

51 Approaches to Open Archives Build By Discipline Build By Institution Author Category Interdisciplinary Year Language Query …

52 Open Archives initiative (OAi)
• xxx@LANL, high-energy physics (Ginsparg, 1991)
• CSTR + WATERS = NCSTRL (Lagoze, 1994)
• xxx + NCSTRL = CoRR collaboration (1998)
• Universal Preprint Service protoproto, Oct. 21-22, 1999, Santa Fe - led by LANL, CNI, DLF, Mellon --> OAi
• Santa Fe Convention (see Feb. D-Lib Magazine article)
• Follow-on mtgs: 6/3@San Antonio, 9/21@Lisbon (ECDL)
• Archives -> Open Archives
  – Support unique archive identifiers
  – Implement Open Archives Metadata Set (DC-based, using XML)
  – Implement Dienst harvesting interface (based on Dienst protocol)
  – Register the archive
• Build tools, layer other services: linking, searching, …
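The path from archive to open archive, and the data provider / service provider split behind it, can be sketched in a few lines. This is a conceptual illustration only: the class and method names (`DataProvider`, `ServiceProvider`, `list_records`, `harvest`) and the archive identifier `vt-etd` are invented here and do not reproduce the Dienst harvesting interface or the Santa Fe Convention. The sketch shows globally unique identifiers built from an archive identifier, date-based selective harvesting, and a search service layered on the harvested metadata.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    local_id: str
    datestamp: date  # when the record was added or last changed
    title: str


class DataProvider:
    """An archive that exposes its metadata for harvesting (hypothetical API)."""

    def __init__(self, archive_id, records):
        self.archive_id = archive_id  # unique archive identifier
        self._records = list(records)

    def full_id(self, record):
        # Archive identifier + local identifier gives a globally unique id.
        return f"{self.archive_id}:{record.local_id}"

    def list_records(self, since=None):
        """Yield (full_id, record) for records changed on or after `since`."""
        for record in self._records:
            if since is None or record.datestamp >= since:
                yield self.full_id(record), record


class ServiceProvider:
    """Harvests metadata from data providers and layers a search service on it."""

    def __init__(self):
        self.index = {}

    def harvest(self, provider, since=None):
        count = 0
        for full_id, record in provider.list_records(since):
            self.index[full_id] = record.title
            count += 1
        return count

    def search(self, term):
        term = term.lower()
        return sorted(i for i, title in self.index.items() if term in title.lower())


archive = DataProvider("vt-etd", [
    Record("0001", date(2000, 5, 1), "Federated search in digital libraries"),
    Record("0002", date(2000, 6, 20), "Harvesting metadata from open archives"),
])
service = ServiceProvider()
first_pass = service.harvest(archive)                       # full harvest
second_pass = service.harvest(archive, date(2000, 6, 1))    # incremental harvest
print(first_pass, second_pass, service.search("archives"))
```

The design point is that the data provider only has to answer "what changed since date X"; linking, searching, and other services live entirely on the service provider side.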

54 Mechanisms
• Sharing
  – Join federation, run software
  – Make metadata and archive available
• Aggregating
  – By discipline
  – By institution
  – By genre
• Automating
  – Workflow
  – Harvesting and providing services
  – Federated searching
  – Dynamic linking (e.g., with SFX)

55 Report on Open Archives work in progress at Virginia Tech
With students:
Hussein Suleman (hussein@vt.edu)
Dave Watkins (dwatkins@cs.vt.edu)
Robert France (france@vt.edu)
Marcos Andre Goncalves (mgoncalv@cs.vt.edu)

56 VT View of the Open Archives initiative (OAi)
• Enable sharing of publication metadata and full-text by digital libraries
• Standardize low-level mechanisms to share contents of libraries
• Build higher-level user-centric and administrative services in meta-libraries
• Install organizational mechanisms to support the technical processes

57 Virginia Tech Projects
• MARC XML-DTD
• Computer Science Teaching Centre (CSTC)
• W3C Web Characterization Repository
• OAi Repository Explorer
• Networked Digital Library of Theses and Dissertations (NDLTD)

58 MARC XML-DTD
• XML transport format for US-MARC records
• Standardized metadata exchange format for traditional library services joining OAi

59 CS Teaching Center (CSTC)
• Collection of reviewed online resources used to aid in teaching of Computer Science
• Supports author submission and peer-review process for new ACM Journal of Educational Resources In Computing (JERIC)
• Connected with NSDL (NSF 00-44)
• http://www.cstc.org

60 W3C Web Characterization Repository
• Online database of metadata related to publications, tools and data sets dealing with Web characterization
• Project of the Web Characterization Activity working group of the World-Wide-Web Consortium (www.w3c.org/WCA)
• http://purl.org/net/repository

61 OAi Repository Explorer
• Serves as a compliance test
• Allows browsing of open archives using only the OAi protocol
• Sends requests on behalf of the user, parses and checks responses, and displays a browsable interface
• Will detect most discrepancies in protocol
• http://purl.org/net/explorer
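The "parses and checks responses" step can be pictured with a small sketch. This is not the Repository Explorer's code, and the response layout assumed here (a root element containing `record` elements, each with `identifier`, `datestamp`, and `metadata` children) is a hypothetical format chosen for illustration rather than the OAi protocol's actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical set of children every <record> is expected to carry.
REQUIRED = ("identifier", "datestamp", "metadata")


def check_response(xml_text):
    """Return a list of problems found in a harvesting response (empty if none)."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]

    problems = []
    records = root.findall("record")
    if not records:
        problems.append("no <record> elements found")
    for pos, rec in enumerate(records, start=1):
        for name in REQUIRED:
            child = rec.find(name)
            if child is None:
                problems.append(f"record {pos}: missing <{name}>")
            elif name != "metadata" and not (child.text or "").strip():
                problems.append(f"record {pos}: empty <{name}>")
    return problems


good = ("<records><record><identifier>x:1</identifier>"
        "<datestamp>2000-06-30</datestamp><metadata/></record></records>")
bad = "<records><record><identifier></identifier><metadata/></record></records>"

print(check_response(good))
print(check_response(bad))
print(check_response("<records><record>"))
```

A tester built this way reports every discrepancy it can find in one pass instead of stopping at the first, which suits an archive maintainer working toward compliance.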

62 NDLTD
• Work has begun on interoperability between Virginia Tech and partners in Germany
• Wrappers have been created to harvest data from remote sites which use other protocols
• Harvested data to be stored in a central OAi-compliant database (work in progress)

63 Extending Services - 1 of 2
• Working with publishers
  – Motivate students: awards, …
  – Publicize support of NDLTD
    · ACM, ACS, IEEE-CS, Elsevier, …
  – Allow students to increase level of access
• Arranging preservation
  – Mirroring worldwide
  – Involving long-term trusted parties

64 Extending Services - 2 of 2
• Adding services currently prototyped
  – annotation and SDI (routing) capabilities
  – Dublin Core metadata, crosswalk to MARC
  – support for XML, *ML, preservation
  – harvesting, federated search
• Adding other services planned
  – building/using citation DB (CiteSeer, SFX, …)
  – implementing plagiarism check (like “SCAM”)
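A metadata crosswalk such as "Dublin Core to MARC" is essentially a mapping table plus a rule for what to do with elements that have no target. The sketch below is illustrative only: the tag and subfield pairings follow commonly cited Dublin Core to MARC correspondences, but they should be checked against an official crosswalk before use, and the tuple-based record shapes are simplifications invented for this example.

```python
# Assumed mapping from Dublin Core element to (MARC tag, subfield code).
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "publisher": ("260", "b"),
    "date": ("260", "c"),
}


def crosswalk(dc_record):
    """Convert (element, value) pairs into (tag, subfield, value) triples.

    Returns the MARC-like fields sorted by tag, plus the list of Dublin Core
    elements that had no mapping, so nothing is dropped silently.
    """
    fields = []
    unmapped = []
    for element, value in dc_record:
        target = DC_TO_MARC.get(element)
        if target is None:
            unmapped.append(element)
            continue
        tag, code = target
        fields.append((tag, code, value))
    fields.sort(key=lambda field: field[0])  # stable: keeps input order within a tag
    return fields, unmapped


etd = [
    ("title", "An Example Thesis"),
    ("creator", "A. Student"),
    ("publisher", "Virginia Tech"),
    ("date", "2000"),
    ("rights", "unrestricted"),
]
marc_fields, unmapped = crosswalk(etd)
print(marc_fields)
print(unmapped)
```

Because Dublin Core is high-level and MARC is fine-grained, the crosswalk in this direction loses little, while going from MARC back to Dublin Core would collapse many distinct fields into one element.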

65 Remember!
• Digital libraries introduction
• Digital libraries to enhance learning
• OAi (help establish enormous international cooperative of data and service providers)

