Digital Libraries: From Theory to Applications in Education and Business ICADL 2000 – Seoul, Korea December 7, 2000 Edward A. Fox


1 Digital Libraries: From Theory to Applications in Education and Business
ICADL 2000 – Seoul, Korea, December 7, 2000
Edward A. Fox, fox@vt.edu, http://fox.cs.vt.edu
CS, DLRL, Internet TIC
Virginia Tech, Blacksburg, VA, USA

2 Outline
• Introduction (5S)
• Education (CSTC, NDLTD)
• OAI
• MARIAN
• Conclusions

3 Acknowledgements (Selected)
• Conference Organizers and Sponsors
• Mentors: JCR Licklider, Michael Kessler, Gerard Salton
• Sponsors: Advance Auto Parts, CNI, DLF, IBM, NLM, NSF, OCLC, UNESCO, US Dept. of Ed. (FIPSE), …
• VT Faculty/Staff: Tony Atkins, Debra Dudley, John Eaton, Jim Hicks, Lance Matheson, Gail McMillan, James Powell, …
• VT Students: Fernando Das Neves, Robert France, Marcos Goncalves, Neill Kipp, Paul Mather, Ryan Richardson, Ohm Sornil, Hussein Suleman, Omar Vasnaik, Marc Vass, …
• Visitors: Mann-Ho Lee (Korea), Byongsun Kim (Korea), Shalini Urs (India), Akira Maeda (Japan)

4 Internet Technology Innovation Center
Supported by Virginia’s Center for Innovative Technology
Statewide University Partners - Governing Board:
• Christopher Newport University
– William Winter, William Muir, Virginia Electronic Commerce Technology Center / Southeastern Virginia Network (VECTEC/SEVAnet)
• George Mason University
– Scott Martin, Internet Multimedia Center (ICM)
– Steven Ruth, International Center for Applied Studies in IT (ICASIT)
• University of Virginia
– Alf Weaver, Internet Commerce Group (InterCom)
– Jim French, Internet Digital Library
• Virginia Tech
– Edward Fox, Digital Library Research Laboratory (DLRL), CC, CS
– Scott Midkiff, Center for Wireless Telecomm. (CWT), VTISC, ECpE

5 JCDL 2001
First Joint ACM/IEEE Conference on Digital Libraries (+ NSF DLI-2 PI mtg)
• http://www.jcdl.org
• June 24-28, 2001 in Roanoke, VA
• Conference Committee:
– General Chair: Edward A. Fox, Virginia Tech
– Program Chair: Christine Borgman, UCLA
– Treasurer: Neil Rowe, Naval Postgraduate School
– Posters Chair: Craig Nevill-Manning, Rutgers U.

6 URLs
• http://fox.cs.vt.edu
• http://www.dlib.vt.edu (DLRL)
• http://ei.cs.vt.edu/~dlib (Courseware)
• www.ndltd.org & www.theses.org
• www.cstc.org (CSTC and JERIC)
• www.openarchives.org (OAI)
• www.jcdl.org (JCDL’2001 – June 24-28)

7 Collaboration!
U.S. – Korea Joint Workshop on Digital Libraries
San Diego Supercomputer Center, August 10 & 11, 2000
Sponsored by:
• National Science Foundation, USA
• Ministry of Information & Communication, Korea
• Institute of Information Tech. Assessment, Korea
• San Diego Supercomputer Center
• University of Maryland
• Virginia Tech

8 Workshop Participants (1 of 3)
Name, Affiliation, Email
• Robert Allen, University of Maryland, rba@GLUE.UMD.EDU
• Dookwon Baik, Korea University, baik@SWSYS2.KOREA.AC.KR
• Ching-Chih Chen, Simmons College, Boston, chen@SIMMONS.EDU
• Su-Shing Chen, University of Missouri - Columbia, schen@ECN.MISSOURI.EDU
• Jonghoon Chun, Myongji University, jchun@WH.MYONGJI.AC.KR
• Gregory Crane, Tufts University, gcrane@PERSEUS.TUFTS.EDU
• Lois Delcambre, Oregon Graduate Institute, lmd@CSE.OGI.EDU
• Edward Fox, Virginia Tech, fox@VT.EDU
• Michael Gertz, University of California, Davis, gertz@CS.UCDAVIS.EDU
• Stephen Helmreich, New Mexico State University, shelmrei@CRL.NMSU.EDU

9 Workshop Participants (2 of 3)
Name, Affiliation, Email
• Ulf Hermjakob, USC Information Sciences Institute, ulf@ISI.EDU
• Soon Joo Hyun, Information & Communications University (ICU), shyun@ICU.AC.KR
• Hyeon Kim, Korea Research & Development Information Center, hyeon@KORDIC.RE.KR
• Sung-Hyuk Kim, Sookmyung Women’s University, ksh@SOOKMYUNG.AC.KR
• Yongchae Kim, Ministry of Information & Communication, yongari@MIC.GO.KR
• Ron Larsen, University of Maryland, rlarsen@DEANS.UMD.EDU
• Sang-goo Lee, Seoul National University, sglee@MARS.SNU.AC.KR
• Sang Ho Lee, Soongsil University, shlee@COMPUTING.SOONGSIL.AC.KR
• Young-Suk Lee, MIT, Lincoln Laboratory, ysl@SST.LL.MIT.EDU
• Karl Lo, University of California, San Diego, klo@UCSD.EDU

10 Workshop Participants (3 of 3)
Name, Affiliation, Email
• Bruce Miller, University of California, San Diego, Rbmiller@UCSD.EDU
• Sung Been Moon, Yonsei University, sbmoon@YONSEI.AC.KR
• Reagan Moore, San Diego Supercomputer Center, moore@SDSC.EDU
• Sung Hyon Myaeng, Chungnam National University, shmyaeng@CS.CHUNGNAM.AC.KR
• Gang-Tak Oh, National Computerization Agency, Seoul, okt@NCA.OR.KR
• Sam-Gyun Oh, SungKyunKwan University, samgyun@YAHOO.COM, samoh@YURIM.SKKU.AC.KR
• Hae-Chang Rim, Korea University, rim@NLP.KOREA.AC.KR
• Shalini Urs, University of Mysore, shaliniurs@HOTMAIL.COM
• Lee Zia, National Science Foundation, lzia@NSF.GOV

11 Some Observations
• So many conferences! Lots of R&D!
• Exhibits: a DL industry is emerging.
• But: we don’t cite each other’s works;
• nobody is asking “Why”;
• we are not connecting theory + projects;
• nobody is talking about OAI.
• So, I’ve redone my talk, since you can see:
– paper in proceedings
– demo tomorrow (p. 327) and online
– see tutorial notes (in book) and online

12 DL = Users Direct (Organized Artifact Mediated Communication)
Diagram labels: Author, Reader, Digital Library, Editor, Reviewer, Teacher, Learner, Librarian, Sponsor, Publisher

13 DL = Users Direct (Organized Artifact Mediated Communication) Sales Agent Inventory Digital Library Sales Partners Parts Supplier Training Home Garages Shopper Store Repair Manuals B2B B2C Staff

14 CS 6604: Digital Libraries (Fall 2000)
http://scholar.lib.vt.edu/imagebase/
DL of Images of Birds for Virginia Tech Museum of Natural History
Student Team: Ameya Datey, Aniket Sule, Supriya Angle, Balaprasuna Chennupati, and the Eagle Scouts
Under the guidance of Dr. Edward Fox, Ms. Llyn Sharp (VT Museum of Natural History), Mr. Anthony Atkins (Digital Library and Archives)
Plus, 3-D VTMNH minerals in UH3004

15 Libraries of the Future
JCR Licklider, 1965, MIT Press: Unified Theory?
• Not ready in 1960s
• Analogy: unified field theory in physics
• “Mess” today: segmented field, specialities
– Database, Knowledge, Content Mgmt
– Multimedia, Hypermedia, Hypertext
– Logic, Algebra, Artificial Intelligence, …
• Expensive, annoying for users
– Don’t know where to look
– Don’t know how to use services

16 5S Layers
Societies, Scenarios, Spaces, Structures, Streams

17 Definition: Digital Libraries are complex systems that
• help satisfy info needs of users (societies)
• provide info services (scenarios)
• organize info in usable ways (structures)
• present info in usable ways (spaces)
• communicate info with users (streams)

18 Definition: 5S Framework
• Societies: interacting people (and computers)
• Scenarios: services, functions, operations, methods
• Spaces: domains + constraints (e.g., distance, adjacency): 2D, vector, probability
• Structures: relations, trees, nodes and arcs
• Streams: sequences of items (text, audio, video, network traffic)
• (5 Element System: Fire, Wood, Earth, Metal, Water)

19 5S: Combinations
• Societies + Scenarios = user model
• Societies + Scenarios + Spaces = user interface
• Streams + Structures = markup
• Streams + Structures + Scenarios = object
• Structures + Scenarios = DBMS

20 Outline
• Introduction (5S)
• Education (CSTC, NDLTD)
• OAI
• MARIAN
• Conclusions

21 NSDL Spine (diagram; slide from Dave Fulker, Bill Arms – 11/2/2000)
• Portals & Clients
• NSDL Collections: full-service collections; referenced items & collections
• CI Services: discussion, personalization, topic-map registry, query transform
• Core Collection-Usage Services: annotation
• Core Collection-Building Services: protocol mediation, persistence, harvesting
• Other NSDL Services

22 ARIADNE Screens (E. Duval)

23 CS Teaching Center (CSTC)
• Instead of building large, expensive multimedia packages that become obsolete and are difficult to re-use, concentrate on small knowledge units.
• Learners benefit from having well-crafted modules that have been reviewed and tested.
• Use digital libraries to build a powerful base of support for learners, upon which a variety of courses, self-study tutorials & reference resources can be built.
• ACM Education Board and SIG support, new NSF grant with UNCW, Eduprise, TCNJ, … - iLumina Project
• ACM J. of Educational Resources in Computing (JERIC)


25 Browsing (1)

26 Browsing (2)


30 A Digital Library Case Study
• Domain: graduate education, research
• Genre: ETDs = electronic theses & dissertations
• Submission: http://etd.vt.edu
• Collection: http://www.theses.org
Project: Networked Digital Library of Theses & Dissertations, http://www.ndltd.org
(NDLTD – remember: ND LTD / NDL TD)
(also, newer NUDL: Networked University Digital Library, with e-courseware, etc.)

31 ETD Initiative (and UMI)
• Students: learn about DL, EPub
• TDs become more expressive
• N. Amer. (T)Ds are accessible, archived
• Global TDs become more accessible, archived
Diagram labels: UMI, Universities

32 Library Catalogs ETD, Access is Opened to the New Research WWW NDLTD

33 What are the long term goals?
• Attract all TDs/yr: 50K D-US, 25K D-Germany, 10K TD-Canada, …
• >200K/yr rich hypermedia ETDs that may turn into electronic portfolios (images, video, audio, …)
• Dramatic increase in knowledge sharing: literature reviews, bibliographies, …
• Services providing lifelong access for students: browse, search, prior searches, citation links
• Hundreds/thousands of downloads / year / work

34 The Networked Digital Library of Theses and Dissertations
www.NDLTD.org
Leader of the Worldwide ETD (Electronic Thesis and Dissertation) Initiative
• Training Authors
• Expanding Access
• Preserving Knowledge
• Improving Graduate Education
• Enhancing Scholarly Communication
• Empowering Students & Universities

35 Outline
• Introduction (5S)
• Education (CSTC, NDLTD)
• OAI
• MARIAN
• Conclusions

36 Why do we need the Open Archives Initiative?
• Current standards are too complicated
• Information wants to be free!
• We can decouple
– Running an archive (DL content collection)
– Running a service (DL system / operation)
• So we can have more and better archives that build on each other
• So we can have better services that work on multiple collections

37 OAI: Archives of Digital Objects Archive Access Protocol Handle (ID) Digital object terms and conditions

38 The Open Archives Initiative: a technical introduction
www.openarchives.org
Hussein Suleman (hussein@vt.edu)
Virginia Tech DLRL, December 2000

39 History
• Santa Fe Convention (October 1999)
– Electronic pre-print community
• San Antonio (July 2000), Lisbon (Sept. 2000)
– Broader interest from other parties
• Ithaca Meeting (September 2000)
– Formulation of general-purpose protocol
• OAI Open Meetings (January–February 2001)
– Public release of specifications

40 Federation vs. OAI Harvesting
• Federation
– Sending out queries to remote sites and combining results
• Harvesting
– Gathering all metadata from remote sites into a central search system
– Lightweight protocol
– Robust
– Less network traffic
– Redundant servers
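
The contrast between federation and harvesting can be sketched in a few lines. This is an illustration only: the in-memory `REMOTE_SITES` dictionary and all function names are hypothetical stand-ins for real remote archives, and no network access is involved.

```python
from typing import Dict, List

# Hypothetical stand-ins for remote archives: record id -> title.
REMOTE_SITES: Dict[str, Dict[str, str]] = {
    "site_a": {"a1": "Digital libraries in education", "a2": "Wireless networks"},
    "site_b": {"b1": "Electronic theses and digital libraries"},
}


def search_site(records: Dict[str, str], term: str) -> List[str]:
    """Return the sorted ids of records whose title contains the term."""
    return sorted(rid for rid, title in records.items() if term.lower() in title.lower())


def federated_search(term: str) -> List[str]:
    """Federation: query every remote site at search time and combine results."""
    hits: List[str] = []
    for name in sorted(REMOTE_SITES):
        hits.extend(search_site(REMOTE_SITES[name], term))
    return hits


def harvest() -> Dict[str, str]:
    """Harvesting: copy all metadata into one central store ahead of time."""
    central: Dict[str, str] = {}
    for records in REMOTE_SITES.values():
        central.update(records)
    return central


def harvested_search(central: Dict[str, str], term: str) -> List[str]:
    """Search only the central store; no remote site is contacted."""
    return search_site(central, term)
```

In this toy setup both approaches return the same hits; the difference is that `federated_search` touches every site on every query, while `harvested_search` only reads the local copy built by `harvest`.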

41 Black Box OAI-ETD Perspective
Diagram labels: ISTEC (Ibero America), PhysDis, NSYSU (Taiwan), ADT (Australia), BN.PT (Portugal), www.theses.org, CyberTheses (Francophone), VT, Dissert.Online (Germany), MIT, OhioLINK, CBUC (Catalunya), NDC (Greece), SEALS (S. Africa), CIC, U. Bergen (Norway), …

42 Splitting Data & Services
• Data Provider
– Implements the OAI protocol on archive to allow external access to data
• Service Provider
– Uses the OAI protocol to access external archives and provide services (such as searching or linking) on their metadata

43 The Big Picture
Diagram labels: DL, Repository 1, Repository 2, Repository 3, Repository 4

44 Requirements for OAI Protocol
• Unique identifiers (URNs) for each record
• Date-stamp for each record when last modified/created/deleted
• HTTP server with scripting ability
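
A minimal sketch of what the first two requirements imply on the data-provider side: each record carries a unique identifier and a date-stamp, which is enough to answer date-restricted requests. The `Record` class, its field names, and the inclusive date bounds are assumptions made for illustration, not part of the protocol text on the slide.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical record: a unique identifier plus a last-change date-stamp."""
    identifier: str
    datestamp: date
    deleted: bool = False


def select_records(records: List[Record],
                   from_date: Optional[date] = None,
                   until: Optional[date] = None) -> List[Record]:
    """Return records whose date-stamp lies in [from_date, until] (bounds assumed inclusive)."""
    return [
        r for r in records
        if (from_date is None or r.datestamp >= from_date)
        and (until is None or r.datestamp <= until)
    ]
```

Keeping a date-stamp on deleted records as well lets a harvester learn that a record was removed since its last visit.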

45 OAI Harvesting Protocol v1
• Operates over HTTP
• HTTP Requests and XML Responses
• HTTP Error codes
• 6 Service requests (verbs):
– Identify, ListMetadataFormats, ListSets
– ListIdentifiers, GetRecord, ListRecords
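
Since every request is an HTTP request naming one of the six verbs, building a request reduces to composing a URL. The sketch below assumes the verb and its arguments travel as query-string parameters; the base URL, the helper name `build_request`, and the example identifier and metadata prefix are hypothetical.

```python
from urllib.parse import urlencode

# The six verbs listed on the slide.
VERBS = (
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "GetRecord", "ListRecords",
)


def build_request(base_url: str, verb: str, **params: str) -> str:
    """Compose a request URL for one verb; rejects verbs outside the list above."""
    if verb not in VERBS:
        raise ValueError(f"unknown verb: {verb}")
    # Put the verb first, then the remaining arguments in a stable order.
    query = urlencode([("verb", verb)] + sorted(params.items()))
    return f"{base_url}?{query}"
```

For example, `build_request("http://archive.example.org/oai", "Identify")` yields a URL ending in `?verb=Identify`, which a harvester would then fetch over HTTP and parse as XML.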

46 Identify - Response

47 ListMetadataFormats - Response

48 GetRecord - Response

49 Verb: ListRecords
• Retrieves metadata for multiple records
• Parameters
– from: start date (O)
– until: end date (O)
– set: set to harvest from (O)
– resumptionToken: flow control mechanism (X)
– metadataPrefix: metadata format (R)
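
The parameter rules above can be expressed as a small validation routine. This sketch reads the slide's (O), (R), and (X) markers as optional, required, and exclusive, which is an interpretation rather than something the slide states; the function name and Python argument names are hypothetical.

```python
from typing import Dict, Optional


def list_records_params(metadata_prefix: Optional[str] = None,
                        from_date: Optional[str] = None,
                        until: Optional[str] = None,
                        set_spec: Optional[str] = None,
                        resumption_token: Optional[str] = None) -> Dict[str, str]:
    """Build the argument dictionary for a ListRecords request."""
    if resumption_token is not None:
        # Exclusive (X): a resumption token may not be combined with other arguments.
        if any(v is not None for v in (metadata_prefix, from_date, until, set_spec)):
            raise ValueError("resumptionToken must be the only argument")
        return {"verb": "ListRecords", "resumptionToken": resumption_token}
    if metadata_prefix is None:
        # Required (R): the metadata format must be named on a fresh request.
        raise ValueError("metadataPrefix is required")
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # Optional (O): include only the arguments that were supplied.
    for name, value in (("from", from_date), ("until", until), ("set", set_spec)):
        if value is not None:
            params[name] = value
    return params
```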

50 ListRecords - Response

51 Feature: Different Metadata

52 Feature: Date Ranges

53 Feature: Resumption Token
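
The resumption token is described earlier as a flow control mechanism: a large result is returned in pieces, and each partial response carries a token for requesting the next piece. A minimal sketch of the harvester-side loop follows; `fetch_page` and `make_fake_archive` are hypothetical, and the fake archive simply uses a list offset as its token.

```python
from typing import Callable, List, Optional, Tuple

Page = Tuple[List[int], Optional[str]]


def harvest_all(fetch_page: Callable[[Optional[str]], Page]) -> List[int]:
    """Keep requesting pages until the archive stops returning a token."""
    records: List[int] = []
    token: Optional[str] = None
    while True:
        page, token = fetch_page(token)
        records.extend(page)
        if not token:
            return records


def make_fake_archive(items: List[int], page_size: int) -> Callable[[Optional[str]], Page]:
    """Hypothetical archive that serves items in pages of page_size."""
    def fetch_page(token: Optional[str]) -> Page:
        start = int(token) if token else 0
        end = start + page_size
        # Hand out a token only while more items remain.
        next_token = str(end) if end < len(items) else None
        return items[start:end], next_token
    return fetch_page
```

The archive, not the harvester, decides the page size, which is what lets a data provider protect itself from very large requests.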

54 Repository Explorer

55 ODU Search Service

56 What Next?
• In General
– Cross-archive searching
– Cross-archive linking, de-duping, threading
– Selective Filtering
– Open-DL in a Box?
• VT
– The VT Digital Library
– NDLTD Union Catalog

57 the Open Archives Initiative [acknowledgements]
Carl Lagoze, Herbert Van de Sompel
Cornell University -- Computer Science
DLF FALL FORUM 2000 – Chicago – November 18th 2000

58 Actions (herbert van de sompel)
• establish organizational stability for the OAI:
– institutional backing from CNI & DLF
– steering committee: policy guidance
– technical committee: technical specifications
– executive group: day to day coordination
– workshops: public dissemination, feedback
• revise specifications to allow adoption beyond preprints

59 low-barrier interop umbrella
Diagram labels: metadata; OPAC, image, FTXT, A&I, e-print

60 low-barrier interop umbrella
Diagram labels: metadata; OPAC, image, FTXT, A&I, e-print; Author, Title, Abstract, Identifier

61 OAI harvesting tools
Diagram labels: service provider (harvester); data provider (repository, repository, repository); Datestamp, Identifier, Set, Records

62 revision of specifications
• publication of specifications: January 2001
– US Open Day, January 23rd, Washington DC
– EC Open Day, February 2001, Berlin
• freeze specifications for 1 year:
– stable for experimentation; not definitive
– minimize risk for early adopters
– maximize chances for future interoperability across communities

63 alpha test of specs (11/2000-01/2001)
data providers:
• arXiv -- Los Alamos
• NACA -- NASA
• CogPrints -- U Southampton
• ETD -- Virginia Tech
• Thesis & Dissertations from WorldCat -- OCLC

64 alpha test of specs (11/2000-01/2001)
data providers:
• HeinOnline law journals -- Cornell U
• TEI-lite collection -- U Tennessee
• STM publisher metadata -- U Illinois
• Resource Discovery Network -- UKOLN
• Open Language Archives -- U Pennsylvania
• Open Video Project -- U North Carolina
• Museum info. -- CIMI

65 alpha test of specs (11/2000-01/2001)
software:
• OAI harvesting interface to Ex Libris Aleph 500 Integrated Library System -- Ex Libris
• OAI harvester -- Cornell U
• OAI harvester -- Virginia Tech
• Open-source software capable of creating a merged catalog of metadata harvested from OAI-servers -- OCLC

66 alpha test of specs (11/2000-01/2001)
service providers:
• Repository explorer -- Virginia Tech
• MARIAN DL -- Virginia Tech
• ARC service -- Old Dominion U

67 New OAI mission statement
The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content.
The Open Archives Initiative has its roots in an effort to enhance access to e-print archives as a means of increasing the availability of scholarly communication. Continued support of this work remains a cornerstone of the Open Archives program.

68 New OAI mission statement
The fundamental technological framework and standards that are developing to support this work are, however, independent of both the type of content offered and the economic mechanisms surrounding that content, and promise to have much broader relevance in opening up access to a range of digital materials. [...]

69 Harvesting Document Metadata for Federated Search
CS6604 Fall 2000 Project
Presented by Avnish Kumar Chhabra

70 Benefits of Harvesting
• Limited storage requirement
• Fast search
• Consistently ranked results
• Improved reliability
• Distributed collections are transparent to user.
• Efficient use of network resources.

71 Design of the Solution
Diagram labels: Digital Library collection; OAI wrapper; Z39.50 Wrapper; Update Scheduling; Query Generation; Parser/Updater; Queries; Replies; New Metadata; MARIAN Metadata Database; Boundary of System developed

72 Implementation
• Main scheduler thread: Server, Protocol, Update Frequency (SiteInfo, Schedule File)
• OAI harvester class: OAIInterface, instantiated with URL of OAI site and scheduling frequency
• HarvestorMonitor: monitor for arbitrating access to network resources
• OAIHandler: XML Document Event Handler class (Auth, Sub, Abs)
• DL Collection
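
The slide describes OAIHandler as an XML document event handler class. A minimal sketch of that event-driven style, using Python's standard `xml.sax` module, is shown below. It is not the project's implementation: the class name `MetadataHandler` and the element names `author`, `subject`, and `abstract` are assumptions standing in for the slide's Auth, Sub, and Abs labels.

```python
import xml.sax
from typing import Dict, List, Optional


class MetadataHandler(xml.sax.ContentHandler):
    """Collects the text of a few metadata elements as parse events arrive."""

    FIELDS = ("author", "subject", "abstract")  # assumed element names

    def __init__(self) -> None:
        super().__init__()
        self.record: Dict[str, str] = {}
        self._current: Optional[str] = None
        self._buffer: List[str] = []

    def startElement(self, name, attrs):
        # Start buffering when an element of interest opens.
        if name in self.FIELDS:
            self._current = name
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._current is not None:
            self._buffer.append(content)

    def endElement(self, name):
        if name == self._current:
            self.record[name] = "".join(self._buffer).strip()
            self._current = None


def parse_metadata(xml_text: str) -> Dict[str, str]:
    """Parse one XML document and return the collected fields."""
    handler = MetadataHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.record
```

An event handler of this kind never holds the whole document in memory, which suits harvesting responses that may contain many records.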

73 Features of the system developed
• Per-collection execution thread
• Schedules updates
• Encapsulation of protocol specific details
• Extensibility
• Control over active execution threads
• Fault tolerance
– Server unreachable
– Failure / timeout of individual connections
• Time zones and date ambiguity considered
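
Two of the features above, one execution thread per collection and tolerance of unreachable servers or timed-out connections, can be sketched as follows. This is an assumption-laden illustration rather than the system's design: the function name `harvest_collections`, the retry count, and the "ok"/"failed" outcomes are all hypothetical.

```python
import threading
from typing import Callable, Dict


def harvest_collections(harvesters: Dict[str, Callable[[], None]],
                        retries: int = 2) -> Dict[str, str]:
    """Run each collection's harvest in its own thread and report the outcome."""
    results: Dict[str, str] = {}
    lock = threading.Lock()

    def worker(name: str, harvest: Callable[[], None]) -> None:
        outcome = "failed"
        for _ in range(retries):
            try:
                harvest()
                outcome = "ok"
                break
            except (ConnectionError, TimeoutError):
                # Server unreachable or connection timed out: try again.
                continue
        with lock:
            results[name] = outcome

    threads = [threading.Thread(target=worker, args=(name, harvest))
               for name, harvest in harvesters.items()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because each collection has its own thread, a failing site only affects its own entry in the result and does not block the others.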

74 Outline
• Introduction (5S)
• Education (CSTC, NDLTD)
• OAI
• MARIAN
• Conclusions

75 MARIAN Layers
• Database Layer
• Search Engine Layer
• User Information Layer
• User Interface Layer
• User


83 Part of Hierarchy of MARIAN Classes

84 Relevant Document Structure


87 MARIAN-Phronesis Interoperability
CS6604 Fall 2000 Project
Tracy Lewis, Ryan Richardson, Kim Woods

88 MARIAN-Phronesis V1 Architectural Diagram
Diagram labels: Search Page; Create object instance; Marian Query CGI Script; Phron Query CGI Script; MARIAN; PHRONESIS; Phron Results; Display to user

89 MARIAN-Phronesis Login Page

90 Query in Español

91 Outline
• Introduction (5S)
• Education (CSTC, NDLTD)
• OAI
• MARIAN
• Conclusions

92 Conclusions
• Education is an important application of DLs
• Having a framework and theory may lead to better (more effective) systems and broader applicability
– 5S
– MARIAN
• Interoperability is part of the DL grand challenge
– OAI

