Published by Gwen Bridges. Modified over 8 years ago.
1
Digital Libraries: From Theory to Applications in Education and Business
ICADL 2000 – Seoul, Korea
December 7, 2000
Edward A. Fox
fox@vt.edu
http://fox.cs.vt.edu
CS, DLRL, Internet TIC
Virginia Tech, Blacksburg, VA, USA
2
Outline
• Introduction (5S)
• Education (CSTC, NDLTD)
• OAI
• MARIAN
• Conclusions
3
Acknowledgements (Selected)
• Conference Organizers and Sponsors
• Mentors: JCR Licklider, Michael Kessler, Gerard Salton
• Sponsors: Advance Auto Parts, CNI, DLF, IBM, NLM, NSF, OCLC, UNESCO, US Dept. of Ed. (FIPSE), …
• VT Faculty/Staff: Tony Atkins, Debra Dudley, John Eaton, Jim Hicks, Lance Matheson, Gail McMillan, James Powell, …
• VT Students: Fernando Das Neves, Robert France, Marcos Goncalves, Neill Kipp, Paul Mather, Ryan Richardson, Ohm Sornil, Hussein Suleman, Omar Vasnaik, Marc Vass, …
• Visitors: Mann-Ho Lee (Korea), Byongsun Kim (Korea), Shalini Urs (India), Akira Maeda (Japan)
4
Internet Technology Innovation Center
Supported by Virginia’s Center for Innovative Technology
Statewide University Partners - Governing Board:
• Christopher Newport University
  – William Winter, William Muir, Virginia Electronic Commerce Technology Center / Southeastern Virginia Network (VECTEC/SEVAnet)
• George Mason University
  – Scott Martin, Internet Multimedia Center (ICM)
  – Steven Ruth, International Center for Applied Studies in IT (ICASIT)
• University of Virginia
  – Alf Weaver, Internet Commerce Group (InterCom)
  – Jim French, Internet Digital Library
• Virginia Tech
  – Edward Fox, Digital Library Research Laboratory (DLRL), CC, CS
  – Scott Midkiff, Center for Wireless Telecomm. (CWT), VTISC, ECpE
5
JCDL 2001: First Joint ACM/IEEE Conference on Digital Libraries (+ NSF DLI-2 PI mtg)
• http://www.jcdl.org
• June 24-28, 2001 in Roanoke, VA
• Conference Committee:
  – General Chair: Edward A. Fox, Virginia Tech
  – Program Chair: Christine Borgman, UCLA
  – Treasurer: Neil Rowe, Naval Postgraduate School
  – Posters Chair: Craig Nevill-Manning, Rutgers U.
6
URLs
• http://fox.cs.vt.edu
• http://www.dlib.vt.edu (DLRL)
• http://ei.cs.vt.edu/~dlib (Courseware)
• www.ndltd.org & www.theses.org
• www.cstc.org (CSTC and JERIC)
• www.openarchives.org (OAI)
• www.jcdl.org (JCDL’2001 – June 24-28)
7
Collaboration!
U.S. – Korea Joint Workshop on Digital Libraries
San Diego Supercomputer Center, August 10 & 11, 2000
Sponsored by:
• National Science Foundation, USA
• Ministry of Information & Communication, Korea
• Institute of Information Tech. Assessment, Korea
• San Diego Supercomputer Center
• University of Maryland
• Virginia Tech
8
Workshop Participants (1 of 3)
Name | Affiliation | Email
Robert Allen | University of Maryland | rba@GLUE.UMD.EDU
Dookwon Baik | Korea University | baik@SWSYS2.KOREA.AC.KR
Ching-Chih Chen | Simmons College, Boston | chen@SIMMONS.EDU
Su-Shing Chen | University of Missouri - Columbia | schen@ECN.MISSOURI.EDU
Jonghoon Chun | Myongji University | jchun@WH.MYONGJI.AC.KR
Gregory Crane | Tufts University | gcrane@PERSEUS.TUFTS.EDU
Lois Delcambre | Oregon Graduate Institute | lmd@CSE.OGI.EDU
Edward Fox | Virginia Tech | fox@VT.EDU
Michael Gertz | University of California, Davis | gertz@CS.UCDAVIS.EDU
Stephen Helmreich | New Mexico State University | shelmrei@CRL.NMSU.EDU
9
Workshop Participants (2 of 3)
Name | Affiliation | Email
Ulf Hermjakob | USC Information Sciences Institute | ulf@ISI.EDU
Soon Joo Hyun | Information & Communications University (ICU) | shyun@ICU.AC.KR
Hyeon Kim | Korea Research & Development Information Center | hyeon@KORDIC.RE.KR
Sung-Hyuk Kim | Sookmyung Women’s University | ksh@SOOKMYUNG.AC.KR
Yongchae Kim | Ministry of Information & Communication | yongari@MIC.GO.KR
Ron Larsen | University of Maryland | rlarsen@DEANS.UMD.EDU
Sang-goo Lee | Seoul National University | sglee@MARS.SNU.AC.KR
Sang Ho Lee | Soongsil University | shlee@COMPUTING.SOONGSIL.AC.KR
Young-Suk Lee | MIT, Lincoln Laboratory | ysl@SST.LL.MIT.EDU
Karl Lo | University of California, San Diego | klo@UCSD.EDU
10
Workshop Participants (3 of 3)
Name | Affiliation | Email
Bruce Miller | University of California, San Diego | Rbmiller@UCSD.EDU
Sung Been Moon | Yonsei University | sbmoon@YONSEI.AC.KR
Reagan Moore | San Diego Supercomputer Center | moore@SDSC.EDU
Sung Hyon Myaeng | Chungnam National University | shmyaeng@CS.CHUNGNAM.AC.KR
Gang-Tak Oh | National Computerization Agency, Seoul | okt@NCA.OR.KR
Sam-Gyun Oh | SungKyunKwan University | samgyun@YAHOO.COM, samoh@YURIM.SKKU.AC.KR
Hae-Chang Rim | Korea University | rim@NLP.KOREA.AC.KR
Shalini Urs | University of Mysore | shaliniurs@HOTMAIL.COM
Lee Zia | National Science Foundation | lzia@NSF.GOV
11
Some Observations
• So many conferences! Lots of R&D!
• Exhibits: a DL industry is emerging.
• But: we don’t cite each other’s works;
• nobody is asking “Why”;
• we are not connecting theory + projects;
• nobody is talking about OAI.
• So, I’ve redone my talk, since you can see:
  – paper in proceedings
  – demo tomorrow (p. 327) and online
  – see tutorial notes (in book) and online
12
DL = Users Direct (Organized Artifact Mediated Communication)
Diagram: Digital Library at the center, surrounded by Author, Reader, Editor, Reviewer, Teacher, Learner, Librarian, Sponsor, Publisher
13
DL = Users Direct (Organized Artifact Mediated Communication) Sales Agent Inventory Digital Library Sales Partners Parts Supplier Training Home Garages Shopper Store Repair Manuals B2B B2C Staff
14
CS 6604: Digital Libraries (Fall 2000)
http://scholar.lib.vt.edu/imagebase/
DL of Images of Birds for Virginia Tech Museum of Natural History
Student Team: Ameya Datey, Aniket Sule, Supriya Angle, Balaprasuna Chennupati, and the Eagle Scouts
Under the guidance of: Dr. Edward Fox; Ms. Llyn Sharp (VT Museum of Natural History); Mr. Anthony Atkins (Digital Library and Archives)
Plus, 3-D VTMNH minerals in UH3004
15
Libraries of the Future
JCR Licklider, 1965, MIT Press: Unified Theory?
• Not ready in 1960s
• Analog: unified field theory in physics
• “Mess” today: segmented field, specialities
  – Database, Knowledge, Content Management
  – Multimedia, Hypermedia, Hypertext
  – Logic, Algebra, Artificial Intelligence, …
• Expensive, annoying for users
  – Don’t know where to look
  – Don’t know how to use services
16
5S Layers
• Societies
• Scenarios
• Spaces
• Structures
• Streams
17
Definition: Digital Libraries are complex systems that
• help satisfy info needs of users (societies)
• provide info services (scenarios)
• organize info in usable ways (structures)
• present info in usable ways (spaces)
• communicate info with users (streams)
18
Definition: 5S Framework
• Societies: interacting people (and computers)
• Scenarios: services, functions, operations, methods
• Spaces: domains + constraints (e.g., distance, adjacency): 2D, vector, probability
• Structures: relations, trees, nodes and arcs
• Streams: sequences of items (text, audio, video, network traffic)
• (5 Element System: Fire, Wood, Earth, Metal, Water)
19
5S: Combinations
• Societies + Scenarios = user model
• Societies + Scenarios + Spaces = user interface
• Streams + Structures = markup
• Streams + Structures + Scenarios = object
• Structures + Scenarios = DBMS
20
Outline
• Introduction (5S)
• Education (CSTC, NDLTD)
• OAI
• MARIAN
• Conclusions
21
NSDL Spine (diagram)
• NSDL Collections: full-service collections
• Referenced Items & Collections
• Other NSDL Services
• CI Services: discussion, personalization, topic-map registry, query transform
• Core Collection-Usage Services: annotation
• Core Collection-Building Services: protocol mediation, persistence, harvesting
• Portals & Clients
(Slide from Dave Fulker, Bill Arms – 11/2/2000)
22
ARIADNE Screens (E. Duval)
23
CS Teaching Center (CSTC)
• Instead of building large, expensive multimedia packages that become obsolete and are difficult to re-use, concentrate on small knowledge units.
• Learners benefit from having well-crafted modules that have been reviewed and tested.
• Use digital libraries to build a powerful base of support for learners, upon which a variety of courses, self-study tutorials & reference resources can be built.
• ACM Education Board and SIG support, new NSF grant with UNCW, Eduprise, TCNJ, … - iLumina Project
• ACM J. of Educational Resources in Computing (JERIC)
25
Browsing (1)
26
Browsing (2)
30
A Digital Library Case Study
• Domain: graduate education, research
• Genre: ETDs = electronic theses & dissertations
• Submission: http://etd.vt.edu
• Collection: http://www.theses.org
Project: Networked Digital Library of Theses & Dissertations, http://www.ndltd.org
(NDLTD – remember: ND LTD / NDL TD)
(also, newer NUDL: Networked University Digital Library, with e-courseware, etc.)
31
ETD Initiative (and UMI)
• Students learn about DL, EPub
• TDs become more expressive
• N. Amer. (T)Ds are accessible, archived
• Global TDs become more accessible, archived
Diagram labels: UMI, Universities
32
Library Catalogs ETD, Access is Opened to the New Research WWW NDLTD
33
What are the long term goals?
• Attract all TDs/yr: 50K D-US, 25K D-Germany, 10K TD-Canada, …
• >200K/yr rich hypermedia ETDs that may turn into electronic portfolios (images, video, audio, …)
• Dramatic increase in knowledge sharing: literature reviews, bibliographies, …
• Services providing lifelong access for students: browse, search, prior searches, citation links
• Hundreds/thousands of downloads / year / work
34
The Networked Digital Library of Theses and Dissertations
www.NDLTD.org
Leader of the Worldwide ETD (Electronic Thesis and Dissertation) Initiative
• Training Authors
• Expanding Access
• Preserving Knowledge
• Improving Graduate Education
• Enhancing Scholarly Communication
• Empowering Students & Universities
35
Outline
• Introduction (5S)
• Education (CSTC, NDLTD)
• OAI
• MARIAN
• Conclusions
36
Why do we need the Open Archives Initiative?
• Current standards are too complicated
• Information wants to be free!
• We can decouple
  – Running an archive (DL content collection)
  – Running a service (DL system / operation)
• So we can have more and better archives that build on each other
• So we can have better services that work on multiple collections
37
OAI: Archives of Digital Objects Archive Access Protocol Handle (ID) Digital object terms and conditions
38
The Open Archives Initiative: a technical introduction
www.openarchives.org
Hussein Suleman (hussein@vt.edu)
Virginia Tech DLRL
December 2000
39
History
• Santa Fe Convention (October 1999)
  – Electronic pre-print community
• San Antonio (July 2000), Lisbon (Sept. 2000)
  – Broader interest from other parties
• Ithaca Meeting (September 2000)
  – Formulation of general-purpose protocol
• OAI Open Meetings (January – Feb. 2001)
  – Public release of specifications
40
Federation vs. OAI Harvesting
• Federation
  – Sending out queries to remote sites and combining results
• Harvesting
  – Gathering all metadata from remote sites into a central search system
  – Lightweight protocol
  – Robust
  – Less network traffic
  – Redundant servers
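The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two "sites" are in-memory lists standing in for remote archives, and the names `SITES`, `search_site`, `federated_search` and `harvest` are hypothetical, not part of any OAI software.

```python
# Hypothetical in-memory stand-ins for two remote archives.
SITES = {
    "site_a": [
        {"id": "a1", "title": "Digital libraries in education"},
        {"id": "a2", "title": "Metadata harvesting"},
    ],
    "site_b": [
        {"id": "b1", "title": "Electronic theses and digital libraries"},
    ],
}


def search_site(records, term):
    """Return the ids of records whose title contains the term."""
    term = term.lower()
    return [r["id"] for r in records if term in r["title"].lower()]


def federated_search(sites, term):
    """Federation: send the query to every site at search time, then merge."""
    hits = []
    for name in sorted(sites):
        # In a real system this would be one remote call per site per query.
        hits.extend(search_site(sites[name], term))
    return hits


def harvest(sites):
    """Harvesting: copy all metadata into one central index ahead of time."""
    central = []
    for name in sorted(sites):
        central.extend(sites[name])
    return central


central_index = harvest(SITES)
print(federated_search(SITES, "digital libraries"))
print(search_site(central_index, "digital libraries"))
```

Both calls find the same records; the difference is that the harvested index answers the query locally, so remote sites are contacted only when metadata is refreshed.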
41
Black Box OAI-ETD Perspective
Sites shown: ISTEC (Ibero America), PhysDis, NSYSU (Taiwan), ADT (Australia), BN.PT (Portugal), www.theses.org, CyberTheses (Francophone), VT, Dissert.Online (Germany), MIT, OhioLINK, CBUC (Catalunya), NDC (Greece), SEALS (S. Africa), CIC, U. Bergen (Norway), …
42
Splitting Data & Services
• Data Provider
  – Implements the OAI protocol on archive to allow external access to data
• Service Provider
  – Uses the OAI protocol to access external archives and provide services (such as searching or linking) on their metadata
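A minimal sketch of this split, assuming a much-simplified interface: the data provider answers three of the protocol's verbs with plain Python values instead of XML, and the service provider builds a search index using only those verbs. The class names, the record identifiers and the `title` field are hypothetical.

```python
class DataProvider:
    """Exposes an archive's metadata through OAI-style verbs (simplified)."""

    def __init__(self, name, records):
        self.name = name
        self._records = records  # identifier -> metadata dict

    def handle(self, verb, **args):
        if verb == "Identify":
            return {"repositoryName": self.name}
        if verb == "ListIdentifiers":
            return sorted(self._records)
        if verb == "GetRecord":
            return self._records[args["identifier"]]
        raise ValueError(f"unsupported verb: {verb}")


class SearchService:
    """Service provider: sees only the verb interface, never archive internals."""

    def __init__(self, providers):
        self.index = {}
        for provider in providers:
            for ident in provider.handle("ListIdentifiers"):
                record = provider.handle("GetRecord", identifier=ident)
                self.index[ident] = record["title"]

    def search(self, term):
        term = term.lower()
        return sorted(i for i, t in self.index.items() if term in t.lower())


etds = DataProvider("ETD archive", {"etd:1": {"title": "Harvesting metadata"}})
preprints = DataProvider("Preprints", {"pre:7": {"title": "Metadata for e-prints"}})
service = SearchService([etds, preprints])
print(service.search("metadata"))
```

Because the service depends only on `handle`, either archive could change its internal storage without affecting the search service.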
43
The Big Picture
Diagram: DL; Repository 1, Repository 2, Repository 3, Repository 4
44
Requirements for OAI Protocol
• Unique identifiers (URNs) for each record
• Date-stamp for each record when last modified/created/deleted
• HTTP server with scripting ability
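The first two requirements can be pictured as a record type carrying an identifier and a date-stamp, which is what makes date-based selection possible. The sketch below is illustrative only: the `Record` class, the `select` helper and the `oai:example:` identifiers are hypothetical, and treating both date bounds as inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Record:
    identifier: str         # unique per record (hypothetical URN-style value)
    datestamp: date         # when the record was last created/modified/deleted
    deleted: bool = False


def select(records, from_date=None, until=None):
    """Return records whose datestamp lies in [from_date, until] (assumed inclusive)."""
    return [
        r for r in records
        if (from_date is None or r.datestamp >= from_date)
        and (until is None or r.datestamp <= until)
    ]


RECORDS = [
    Record("oai:example:1", date(2000, 10, 1)),
    Record("oai:example:2", date(2000, 11, 15)),
    Record("oai:example:3", date(2000, 12, 1), deleted=True),
]

changed = select(RECORDS, from_date=date(2000, 11, 1))
print([r.identifier for r in changed])
```

A harvester that remembers the date of its last visit can pass it as `from_date` and fetch only what changed since then, including deletions.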
45
OAI Harvesting Protocol v1
• Operates over HTTP
• HTTP Requests and XML Responses
• HTTP Error codes
• 6 Service requests (verbs):
  – Identify, ListMetadataFormats, ListSets
  – ListIdentifiers, GetRecord, ListRecords
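Since every request is an HTTP request naming one of the six verbs, building a request amounts to building a URL. The sketch below shows one way to do that with Python's standard library. The base URL is hypothetical, `build_request` is an illustrative helper rather than part of any OAI toolkit, and the `oai_dc` metadata prefix is used only as an example value.

```python
from urllib.parse import urlencode

# The six verbs named on the slide.
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "GetRecord", "ListRecords",
}


def build_request(base_url, verb, **params):
    """Return the HTTP GET URL for one OAI request."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI verb: {verb}")
    query = {"verb": verb}
    # Leave out optional parameters that were not supplied.
    query.update({k: v for k, v in params.items() if v is not None})
    return f"{base_url}?{urlencode(query)}"


BASE = "http://archive.example.org/oai"  # hypothetical data provider
print(build_request(BASE, "Identify"))
print(build_request(BASE, "ListRecords", metadataPrefix="oai_dc"))
```

The response to such a URL would be an XML document, which a harvester then parses; that step is omitted here.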
46
Identify - Response
47
ListMetadataFormats - Response
48
GetRecord - Response
49
Verb: ListRecords
• Retrieves metadata for multiple records
• Parameters
  – from – start date (O)
  – until – end date (O)
  – set – set to harvest from (O)
  – resumptionToken – flow control mechanism (X)
  – metadataPrefix – metadata format (R)
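The parameter rules can be expressed as a small validation function. This sketch reads the slide's markers as (R) required, (O) optional and (X) exclusive, meaning the resumption token is sent on its own; that reading is an interpretation, and `list_records_args` is a hypothetical helper.

```python
def list_records_args(metadata_prefix=None, from_date=None, until=None,
                      set_spec=None, resumption_token=None):
    """Build the argument dict for a ListRecords request, enforcing the rules."""
    others = (metadata_prefix, from_date, until, set_spec)
    if resumption_token is not None:
        # (X): assumed exclusive, so no other parameter may accompany it.
        if any(value is not None for value in others):
            raise ValueError("resumptionToken cannot be combined with other parameters")
        return {"verb": "ListRecords", "resumptionToken": resumption_token}
    if metadata_prefix is None:
        # (R): required on every request that does not carry a token.
        raise ValueError("metadataPrefix is required")
    args = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # (O): optional filters are included only when supplied.
    optional = {"from": from_date, "until": until, "set": set_spec}
    args.update({k: v for k, v in optional.items() if v is not None})
    return args


print(list_records_args("oai_dc", from_date="2000-11-01"))
print(list_records_args(resumption_token="abc"))
```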
50
ListRecords - Response
51
Feature: Different Metadata
52
Feature: Date Ranges
53
Feature: Resumption Token
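As a rough illustration of resumption tokens as flow control, the loop below keeps requesting pages until the repository stops returning a token. It is a sketch under assumptions: `fetch_page` is a hypothetical callable standing in for one HTTP request plus XML parsing, and the three-page repository is an in-memory fake.

```python
def harvest_all(fetch_page, metadata_prefix):
    """Collect every record by following resumption tokens.

    fetch_page(args) performs one ListRecords request and returns
    (records, next_token), where next_token is None on the last page.
    """
    records = []
    args = {"metadataPrefix": metadata_prefix}
    while True:
        page, token = fetch_page(args)
        records.extend(page)
        if not token:
            return records
        # Follow-up requests carry only the token.
        args = {"resumptionToken": token}


# In-memory stand-in for a repository that answers in three pages.
PAGES = {
    None: (["rec1", "rec2"], "t1"),
    "t1": (["rec3", "rec4"], "t2"),
    "t2": (["rec5"], None),
}


def fake_fetch(args):
    return PAGES[args.get("resumptionToken")]


print(harvest_all(fake_fetch, "oai_dc"))
```

The repository decides the page size, so a large collection can be delivered in manageable pieces without the harvester knowing the total in advance.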
54
Repository Explorer
55
ODU Search Service
56
What Next?
• In General
  – Cross-archive searching
  – Cross-archive linking, de-duping, threading
  – Selective Filtering
  – Open-DL in a Box?
• VT
  – The VT Digital Library
  – NDLTD Union Catalog
57
[acknowledgements] Carl Lagoze the Open Archives Initiative Herbert Van de Sompel Cornell University -- Computer Science DLF FALL FORUM 2000 – Chicago – November 18th 2000
58
Actions (Herbert Van de Sompel)
Establish organizational stability for the OAI:
• institutional backing from CNI & DLF
• steering committee: policy guidance
• technical committee: technical specifications
• executive group: day-to-day coordination
• workshops: public dissemination, feedback
Revise specifications to allow adoption beyond preprints
59
low-barrier interop umbrella
metadata: OPAC, image, FTXT, A&I, e-print
60
low-barrier interop umbrella
metadata: OPAC, image, FTXT, A&I, e-print
Author, Title, Abstract, Identifier
61
OAI harvesting tools
Diagram labels: service provider (harvester); data provider (repository, repository, repository); Datestamp, Identifier, Set; Records
62
revision of specifications
• publication of specifications: January 2001
  – US Open Day, January 23rd, Washington DC
  – EC Open Day, February 2001, Berlin
• freeze specifications for 1 year:
  – stable for experimentation; not definitive
  – minimize risk for early adopters
  – maximize chances for future interoperability across communities
63
alpha test of specs (11/2000-01/2001)
data providers:
• arXiv -- Los Alamos
• NACA -- NASA
• CogPrints -- U Southampton
• ETD -- Virginia Tech
• Thesis & Dissertations from WorldCat -- OCLC
64
alpha test of specs (11/2000-01/2001)
data providers:
• HeinOnline law journals -- Cornell U
• TEI-lite collection -- U Tennessee
• STM publisher metadata -- U Illinois
• Resource Discovery Network -- UKOLN
• Open Language Archives -- U Pennsylvania
• Open Video Project -- U North Carolina
• Museum info. -- CIMI
65
alpha test of specs (11/2000-01/2001)
software:
• OAI harvesting interface to Ex Libris Aleph 500 Integrated Library System -- Ex Libris
• OAI harvester -- Cornell U
• OAI harvester -- Virginia Tech
• Open-source software capable of creating a merged catalog of metadata harvested from OAI servers -- OCLC
66
alpha test of specs (11/2000-01/2001)
service providers:
• Repository explorer -- Virginia Tech
• MARIAN DL -- Virginia Tech
• ARC service -- Old Dominion U
67
New OAI mission statement
The Open Archives Initiative develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content.
The Open Archives Initiative has its roots in an effort to enhance access to e-print archives as a means of increasing the availability of scholarly communication. Continued support of this work remains a cornerstone of the Open Archives program.
68
New OAI mission statement (continued)
The fundamental technological framework and standards that are developing to support this work are, however, independent of both the type of content offered and the economic mechanisms surrounding that content, and promise to have much broader relevance in opening up access to a range of digital materials. [...]
69
Harvesting Document Metadata for Federated Search CS6604 Fall 2000 Project Presented By Avnish Kumar Chhabra
70
Benefits of Harvesting
• Limited storage requirement
• Fast search
• Consistently ranked results
• Improved reliability
• Distributed collections are transparent to user.
• Efficient use of network resources.
71
Design of the Solution OAI wrapper Z39.50 Wrapper Update Scheduling Query Generation Digital Library collection Parser/Updater Queries Replies New Metadata MARIAN Metadata Database Boundary of System developed
72
Implementation
• Main scheduler thread: Server, Protocol, Update Frequency
• SiteInfo; Schedule File
• OAI harvester class: OAIInterface, instantiated with URL of OAI site and scheduling frequency
• HarvestorMonitor: monitor for arbitrating access to network resources
• DL Collection
• OAIHandler: XML document event handler class
• Auth, Sub, Abs
73
Features of the system developed
• Per-collection execution thread
• Schedules updates
• Encapsulation of protocol specific details
• Extensibility
• Control over active execution threads
• Fault tolerance
  – Server unreachable
  – Failure / timeout of individual connections
• Time zones and date ambiguity considered
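Three of these features (a thread per collection, control over active threads, and tolerance of unreachable servers) can be sketched with Python's `threading` module. This is not the project's code: `HarvestMonitor`, `harvest_collection` and `run_once` are hypothetical names, the fetch functions are fakes, and update scheduling and time-zone handling are left out.

```python
import threading


class HarvestMonitor:
    """Limits how many harvests may use the network at the same time."""

    def __init__(self, max_active):
        self._slots = threading.Semaphore(max_active)

    def __enter__(self):
        self._slots.acquire()
        return self

    def __exit__(self, *exc):
        self._slots.release()
        return False  # never swallow exceptions


def harvest_collection(name, fetch, monitor, results, retries=2):
    """Harvest one collection, recording success or the last failure."""
    for _ in range(retries):
        try:
            with monitor:
                results[name] = ("ok", fetch())
            return
        except (ConnectionError, TimeoutError) as err:
            # Fault tolerance: note the failure and try again.
            results[name] = ("failed", str(err))


def run_once(collections, max_active=2):
    """Start one thread per collection and wait for all of them."""
    monitor = HarvestMonitor(max_active)
    results = {}
    threads = [
        threading.Thread(target=harvest_collection,
                         args=(name, fetch, monitor, results))
        for name, fetch in collections.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


def fetch_ok():
    return ["r1", "r2"]


def fetch_down():
    raise ConnectionError("server unreachable")


print(run_once({"etd": fetch_ok, "preprints": fetch_down}))
```

One failing collection does not stop the others, since each runs in its own thread and reports its own status.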
74
Outline
• Introduction (5S)
• Education (CSTC, NDLTD)
• OAI
• MARIAN
• Conclusions
75
MARIAN Layers
• Database Layer
• Search Engine Layer
• User Information Layer
• User Interface Layer
• User
83
Part of Hierarchy of MARIAN Classes
84
Relevant Document Structure
87
MARIAN-Phronesis Interoperability CS6604 Fall 2000 Project Tracy Lewis Ryan Richardson Kim Woods
88
MARIAN-Phronesis V1 Architectural Diagram Phron Query CGI Script Search Page Display to user Create object instance MARIANPHRONESIS Marian Query CGI Script Phron Results
89
MARIAN-Phronesis Login Page
90
Query in Español
91
Outline
• Introduction (5S)
• Education (CSTC, NDLTD)
• OAI
• MARIAN
• Conclusions
92
Conclusions
• Education is an important application of DLs
• Having a framework and theory may lead to better (more effective) systems and broader applicability
  – 5S
  – MARIAN
• Interoperability is part of the DL grand challenge
  – OAI