
1 DL.Org (Digital Library Interoperability, Best Practices and Modeling Foundations) Functionality Working Group Mtg, 29-30 June 2009, Athens: “Functionality modeling and functionality interoperability, Session 1”. Functionality and Interoperability with 5S, by Edward A. Fox, fox@vt.edu, http://fox.cs.vt.edu, Dept. of Computer Science, Virginia Tech, Blacksburg, VA 24061 USA

2 Acknowledgements
Mentors (Licklider, Kessler, Salton)
Virginia Tech, CS, Digital Library Research Laboratory
NSF and other sponsors, e.g., grants
–DUE-0840719, CCF-0722259, IIS-0535057, IIS-0325579
Students, colleagues, co-investigators
–Robert France, Marcos André Gonçalves, Doug Gorton, Yi Ma, Uma Murthy, Rao Shen, Hussein Suleman, Ricardo da Silva Torres,...
–Barbara Wildemuth, Jeffrey Pomerantz, Sanghee Oh, Seungwon Yang

3 Theses and Dissertations
Douglas Gorton, "Practical Digital Library Generation into DSpace with the 5S Framework", April 2007, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-04252007-161736/
Rao Shen, "Applying the 5S Framework To Integrating Digital Libraries", April 2006, PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-04212006-135018/
Ananth Raghavan, "Schema Mapper: A Visualization Tool for Incremental Semi-automatic Mapping-based Integration of Heterogeneous Collections into Archaeological Digital Libraries: The ETANA-DL Case Study", May 2005, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-05182005-114155/
Marcos Andre Goncalves, "Streams, Structures, Spaces, Scenarios, and Societies (5S): A Formal Digital Library Framework and Its Applications", Nov. 2004, PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-12052004-135923/
Rohit Dilip Kelapure, "Scenario-Based Generation of Digital Library Services", June 2003, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-06182003-055012/
Hussein Suleman, "Open Digital Libraries", Nov. 2002, PhD dissertation, http://scholar.lib.vt.edu/theses/available/etd-11222002-155624/
Qinwei Zhu, "5SGraph: A Modeling Tool for Digital Libraries", Nov. 2002, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-11272002-210531/
Jun Wang, "VIDI: A Lightweight Protocol Between Visualization Systems and Digital Libraries", May 2002, MS thesis, http://scholar.lib.vt.edu/theses/available/etd-07012002-145841/

4 Other Selected References
Marcos Andre Goncalves, Robert K. France, Edward A. Fox. MARIAN: Flexible Interoperability for Federated Digital Libraries. ECDL 2001, 173-186
Hussein Suleman and Edward Fox. The Open Archives Initiative: Realizing Simple and Effective Digital Library Interoperability. J. Library Automation, 35(1/2): 125-145, 2002
Marcos Andre Goncalves, Edward A. Fox. 5SL - A Language for Declarative Specification and Generation of Digital Libraries. JCDL 2002, 263-272
Marcos Andre Goncalves, Ming Luo, Rao Shen, Mir Farooq Ali, Edward A. Fox. An XML Log Standard and Tool for Digital Library Logging Analysis. ECDL 2002, 129-143
Marcos Andre Goncalves, Ganesh Panchanathan, Unnikrishnan Ravindranathan, Aaron Krowne, Edward A. Fox, Filip Jagodzinski, Lillian Cassel. The XML Log Standard for Digital Libraries: Analysis, Evolution, and Deployment. JCDL 2003, 312-314
Hussein Suleman, Edward A. Fox, Rohit Kelapure, Aaron Krowne, Ming Luo. Building digital libraries from simple building blocks. Online Information Review, 27(5): 301-310, 2003
M. Goncalves, E. Fox, L. Watson, N. Kipp. Streams, Structures, Spaces, Scenarios, Societies (5S): A Formal Model for Digital Libraries. TOIS, 22(2): 270-312, 2004
Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Ricardo da S. Torres, E. A. Fox. Exploring Digital Libraries: Integrating Browsing, Searching, and Visualization. JCDL 2006, 1-10
Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Edward A. Fox. What is a Successful Digital Library? ECDL 2006, 208-219

5 Other Selected References - 2
Jeffrey Pomerantz, Sanghee Oh, Seungwon Yang, Edward A. Fox, Barbara M. Wildemuth. The Core: Digital Library Education in Library and Information Science Programs. D-Lib Magazine, 12(11), Nov. 2006
Marcos Andre Goncalves, Barbara L. Moreira, Edward A. Fox, Layne T. Watson. "What is a good digital library?" - A quality model for digital libraries. Information Processing and Management, 43(5): 1416-1437, 2007
Uma Murthy, Douglas Gorton, Ricardo Torres, Marcos Goncalves, Edward Fox, Lois Delcambre. Extending the 5S Digital Library (DL) Framework: From a Minimal DL towards a DL Reference Model. JCDL 2007 Workshop on Digital Library Foundations
Barbara L. Moreira, Marcos A. Goncalves, Alberto H. F. Laender, Edward A. Fox. Evaluating Digital Libraries with 5SQual. ECDL 2007, 466-470
Yi Ma, Edward A. Fox, Marcos A. Goncalves. Personal Digital Library: PIM upon 5S Framework. CIKM 2007 Workshop: PIKM07, Lisbon, Nov. 2007, 117-124
Marcos Andre Goncalves, Edward A. Fox, Layne T. Watson. Towards a Digital Library Theory: A Formal Digital Library Ontology. Int. J. Digital Libraries, 8(2): 91-114, 2008
Rao Shen, Naga Srinivas Vemuri, Weiguo Fan, Edward A. Fox. Integration of Complex Archaeology Digital Libraries: An ETANA-DL Experience. Information Systems, 33(7-8): 699-723, 2008
Barbara L. Moreira, Marcos Andre Goncalves, Alberto H. F. Laender, Edward A. Fox. Automatic Evaluation of Digital Libraries with 5SQual. J. Informetrics, 3(2): 102-123, 2009

6 Outline Contextual Background –DL Definitions, Scope –DL Curricula Efforts –Interoperability Approaches 5S 5S Services Work International Repository Infrastructure Workshop (Amsterdam, Mar 16-17, 2009) Discussion Topics

7 DL Definitions Issues and Spectra –Collection vs. Institution –Content vs. System –Access vs. Preservation –“Free” vs. Quality –Managed vs. Comprehensive –Centralized vs. Distributed

8 Borgman et al.: Workshop Report on Social Aspects of Digital Libraries: http://www-lis.gseis.ucla.edu/DL/ Information Life Cycle

9 Information Life Cycle: Authoring, Modifying, Organizing, Indexing, Storing, Retrieving, Distributing, Networking, Retention / Mining, Accessing, Filtering, Using, Creating

10 Digital Libraries Shorten the Chain from: Editor, Publisher, A&I, Consolidator, Library, Reviewer

11 DLs Shorten the Chain to: Author, Reader, Digital Library, Editor, Reviewer, Teacher, Learner, Librarian

12 DL Curric. Project NSF awards to VT and UNC-CH CS and LIS http://curric.dlib.vt.edu/ http://curric.dlib.vt.edu/wiki/index.php/Main_Page http://curric.dlib.vt.edu/modDev/modDev.html

13 DL Curriculum Framework

14 DL Curric. Modules - 1
Module 1-b: History of digital libraries and library automation
Module 2-c: File Formats, Transformation, and Migration
Module 3-b: Digitization
Module 4-b: Metadata
Module 5-a: Architecture overviews

15 DL Curric. Modules - 2
Module 5-b: Application software
Module 5-d: Protocols
Module 6-a: Information needs/relevance
Module 6-b: Online information seeking behaviors and search strategies
Module 6-d: Interaction design and usability assessment

16 DL Curric. Modules - 3
Module 7-b: Reference Services
Module 7-g: Personalization
Module 8-b: Web Archiving
Module 9-c: Digital library evaluation, user studies

17 Interoperability Approaches Browsers (Mosaic) Federation Heterogeneous, Homogeneous Protocols (OAI-PMH) Repositories Content Standards (XML), Mapping Integration (ETANA) Services (Superimposed Information)

18 Integration: Challenges “Semantic Web” is vision, not reality. How can we integrate without a theory? How can we interoperate without a common framework? How can we have a science of DLs if we lack agreement on definitions (so we can reason and discuss) and measures of quality (so we can compare and improve)?

19 Informal 5S & DL Definitions DLs are complex systems that –help satisfy info needs of users (societies) –provide info services (scenarios) –organize info in usable ways (structures) –present info in usable ways (spaces) –communicate info with users (streams)

20 5S Layers: Societies, Scenarios, Spaces, Structures, Streams

21 5Ss
Ss | Examples | Objectives
Streams | Text; video; audio; image | Describes properties of the DL content such as encoding and language for textual material or particular forms of multimedia data
Structures | Collection; catalog; hypertext; document; metadata | Specifies organizational aspects of the DL content
Spaces | Measure; measurable, topological, vector, probabilistic | Defines logical and presentational views of several DL components
Scenarios | Searching, browsing, recommending | Details the behavior of DL services
Societies | Service managers, learners, teachers, etc. | Defines managers, responsible for running DL services; actors, that use those services; and relationships among them

22 5S Overview 5S and Generating DLs –5S Framework –5S definitions, services taxonomy, ontology –5SL –5SGraph –5SGen (and DL development) –DL development of union DL, DL integration –5SGen into DSpace 5S Metamodels –Minimal DL –Archaeology DL –CBIR DL –Union DL

23 Streams

24 Structure (Degrees, Terminology): Chaotic (Web), Organized (DLs), Structured (DBs)

25 Digital Objects (DOs) Born digital Digitized version of “real” object –Is the DO version the same, better, or worse? –Decision for ETDs: structured + rendered Surrogate for “real” object –Not covered explicitly in metamodel for a minimal DL –Crucial in metamodel for archaeology DL

26 Databases 5S perspective: structures, streams, scenarios Extending database technology Structured and unstructured info Multimedia databases Link databases Performance, transaction processing Replicated storage, rollback/recovery

27 Spaces User interfaces and visualization 2D interfaces 3D interfaces GIS Other paradigms

28 Scenarios Services (see later) Scenario-based design, use cases Functionality Representation and processing for humans and machines

29 Societies User communities –Authors, editors, teachers, students, readers –Personal(ization), group(ware), community, global –Accessibility, universal access Librarians: reference, acquisition, operations Research community –Associations, conferences, publications, labs, projects Economics –Copyright, intellectual property rights, digital rights management, authorization, authentication, security, privacy, self-archiving (eprints) –Publishers, catalogers, distributors, sustainability –Open source, commercial, hybrid

30 Higher DL Constructs: Collections, Catalogs, Repositories and Archives, Services, Systems, Case Studies

31 Collections Terminology: set, “database” Distributed: basis, efficiency/effectiveness Parallelism: federation, harvesting Scale: object size, compression, replication, stream splitting Intelligence/processing granularity: object, cluster, collection, repository

32 NSDL Collections Discovery of content Classification and cataloguing Acquisition and/or linking; referencing Disciplinary-based themes define a natural body of content, but other possibilities are also encouraged Access to massive real-time or archived datasets Software tool suites for analysis, modeling, simulation, or visualization Reviewed commentary on learning materials and pedagogy

33 Catalogs OPACs Distributed vs. centralized Coverage, breadth Specificity, depth Management: versioning, works

34 Repositories and Archives Naming, identifiers Architectures, interoperability –OAI: harvesting –SRU/SRW: federation Preservation, archives –LOCKSS, UVC, emulation/migration Scalability, storage Institutional repositories, Open Access
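
As a rough illustration of the harvesting side of OAI mentioned on this slide, the sketch below builds an OAI-PMH `ListRecords` request URL and extracts record identifiers and the resumption token from a response. The base URL and the sample response are made up for illustration; the verb name, the `metadataPrefix` and `resumptionToken` arguments, and the XML namespace follow the OAI-PMH 2.0 specification. No network access is performed.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 specification.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL.

    When resuming a partial list, resumptionToken is an exclusive argument,
    so metadataPrefix is omitted in that case.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (record identifiers, next resumption token or None)."""
    root = ET.fromstring(xml_text)
    ids = [e.text for e in
           root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)]
    token = root.find(".//oai:resumptionToken", OAI_NS)
    next_token = token.text if token is not None and token.text else None
    return ids, next_token


# Hypothetical, heavily trimmed response used only to exercise the parser.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

ids, next_token = parse_list_records(SAMPLE)
next_url = list_records_url("http://example.org/oai", resumption_token=next_token)
```

A harvester would repeat the request with each returned token until none is returned; a federated approach such as SRU/SRW instead forwards the user's query to each source at search time.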

35 Services NSDL Services Taxonomy of services Ontology, composition, reuse Evaluation Key services in-depth: –Crawling, indexing –Clustering, classifying –Recommending, using social networks –Logging

36 NSDL Services Help services, frequently asked questions, etc. Synchronous/asynchronous collaborative learning environments using shared resources Mechanisms for building personal annotated digital information spaces Reliability testing for applets or other digital learning objects Audio, image, and video search capability Metadata system translation Community feedback mechanisms

38 Services Ontology: Applications

39 Ontology: Applications Expand definition of minimal DL by characterizing –typical DL services –in the context of “employs” and “produces” relationships Use characterization to reason about: –How DL services can be built from other DL components –How they can be composed with other services through extension or reuse

41 5S and DL formal definitions and compositions (April 2004 TOIS)

43 XML-based DL Log Standard
Log analysis
–Is a source of information on how patrons really use DL services and how systems behave while supporting user information seeking activities
–Is used to evaluate and enhance services and to guide allocation of resources
Common practice in the web setting
–Supported by web servers, proxy caches
DL logging can be more detailed

44 The XML Log Format [figure: element hierarchy; recoverable element names: Log, SessionId, MachineInfo, Statement, Transaction, Timestamp, SessionInfo, RegisterInfo, Event, Action, Search, Browse, Store, SysInfo, Update, SearchBy, QueryString, Catalog, Collection, PresentationInfo, StatusInfo, Timeout]
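
To give a feel for what one search event might look like in such a log, here is a minimal sketch that assembles an entry with Python's standard `xml.etree.ElementTree`. The element names are taken from the slide, but the nesting shown here is an assumption made for illustration (the slide's hierarchy did not survive extraction), and the sample values are invented. It should not be read as the actual schema of the XML log standard.

```python
import xml.etree.ElementTree as ET


def search_log_entry(session_id, machine, timestamp, collection, query):
    """Build one log entry for a search event.

    Element names come from the slide; their nesting is ASSUMED here.
    """
    log = ET.Element("Log")
    txn = ET.SubElement(log, "Transaction")
    ET.SubElement(txn, "SessionId").text = session_id
    ET.SubElement(txn, "MachineInfo").text = machine
    ET.SubElement(txn, "Timestamp").text = timestamp
    stmt = ET.SubElement(txn, "Statement")
    event = ET.SubElement(stmt, "Event")
    action = ET.SubElement(event, "Action")
    search = ET.SubElement(action, "Search")
    ET.SubElement(search, "Collection").text = collection
    search_by = ET.SubElement(search, "SearchBy")
    ET.SubElement(search_by, "QueryString").text = query
    return log


# Invented sample values.
entry = search_log_entry("s42", "host.example.org", "2009-06-29T10:00:00Z",
                         "ETD", "digital libraries")
xml_text = ET.tostring(entry, encoding="unicode")
```

Recording the action type and its parameters as separate elements, rather than as a raw request line, is what lets DL logs be more detailed than web server logs.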

45 Systems Architectures –Client-server, service-oriented –P2P, Grid System descriptions and comparisons –Personal DLs; Institutional to global –DSpace, Eprints, Fedora, Greenstone, Kepler ODL 5S Suite: language, visualization, generation, logging

46 Architectural Issues Independent system vs. part of federation Centralized vs. distributed vs. open services Monolithic vs. modular vs. componentized Topologies: bus vs. star vs. hierarchical vs. network Decompositions vary –search engine, browser, DBMS, MM support –repository, handle server, client –information resources + mediators, bus or agent collection + client with workspace/environment

47 NSDL Information Architecture (essentially as developed by the Technical Infrastructure Workgroup) [figure labels: referenced items & collections; special databases; NSDL services; other NSDL services; CI services (annotation, discussion, personalization, authentication, browsing); core services (information retrieval, metadata gathering); core collection-building services (harvesting, protocols); portals & clients; usage enhancement; collection building; user interfaces; NSDL collections; core NSDL “bus”]

48 5S Modeling -> Systems

49 Tools/Applications

51 5SL: a DL design language Domain specific languages –Address a particular class of problems by offering specific abstractions and notations for the domain at hand –Advantages: domain-specific analysis, program management, visualization, testing, maintenance, modeling, and rapid prototyping. XML-based realization of 5S –Interoperability –Use of many sub-languages (e.g., MIME types, XML Schemas, UML notations)

52 5SL – The Minimal DL Metamodel

53 5SL examples (fragments as they survive from the slide)
Example of Document declaration in the Structures Model: <stream value='ETDText'> <stream value='ETDAudio'> ... %XMLSchema%
Example of Actors declaration in the Societies Model: <Attribute name='name' type='String'/> <Attribute name='ID' type='Integer'/> Converting, Reviewing, Cataloguing, ...
Example of Service declaration in the Scenario Model: simple scenario for an NDLTD site searching service: Patron, InterfaceManager, collection, query; InterfaceManager, SearchManager, collection, query; SearchManager, InterfaceManager, WtdSet; ...
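
The searching scenario on this slide can be read as a short sequence of events passed between actors and service managers. The sketch below models it that way in Python; it is not 5SL syntax. Reading each fragment as "sender, receiver, parameters" is an assumption, and the `Event` type and helper functions are hypothetical names introduced only for illustration.

```python
from typing import List, NamedTuple, Tuple


class Event(NamedTuple):
    """One step of a scenario: who sends what to whom (assumed reading)."""
    sender: str
    receiver: str
    payload: Tuple[str, ...]


# The three steps that survive on the slide, in order.
SEARCH_SCENARIO: List[Event] = [
    Event("Patron", "InterfaceManager", ("collection", "query")),
    Event("InterfaceManager", "SearchManager", ("collection", "query")),
    Event("SearchManager", "InterfaceManager", ("WtdSet",)),
]


def is_chained(scenario: List[Event]) -> bool:
    """True if each event is sent by the receiver of the previous event."""
    return all(a.receiver == b.sender for a, b in zip(scenario, scenario[1:]))


def participants(scenario: List[Event]) -> List[str]:
    """All actors and managers that appear in the scenario, sorted by name."""
    names = {e.sender for e in scenario} | {e.receiver for e in scenario}
    return sorted(names)
```

A declarative description of this kind is what allows a generator to wire components together: the participants become components and the events become the calls between them.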

54 5SGraph: A DL Modeling Tool Helps users model their own instances of a digital library (DL) in the 5S language (5SL). A simple modeling process which enables rapid generation of digital libraries Features –5SGraph loads and displays a metamodel in a structured toolbox. –The structured editor of 5SGraph provides a top-down visual building environment for the DL designer. –5SGraph produces syntactically correct 5SL files according to the visual model built by the designer.

55 Overview of 5SGraph [figure labels: workspace (instance model); structured toolbox (metamodel)]

57 5SGen Version 1 -- MARIAN as the target system –Focused on rich structures: semantic networks –Behavior attached to nodes/links Version 2 -- Shifted for later work to componentized (ODL) approach –Focused on scenarios/societies –Structures/Spaces encapsulated within components (e.g., relational tables, indexes) –Only textual streams supported Version 3 -- Practical DL (w. DSpace) – Doug Gorton

58 5SLGen – Version 2: ODL, Services, Scenarios

59 5SSuite [figure labels: 5S Meta Model; 5SGraph; DL Expert; DL Designer; 5SL DL Model; 5SLGen; Practitioner; Researcher; Teacher; Tailored DL Services; component pool (ODLSearch, ODLBrowse, ODLRate, ODLReview, ...); Requirements (1); Analysis (2); Design (3); Implementation (4); 5SGraph; 5SGen; Mapping Tool]

60 Describing Quality in Digital Libraries What’s a “good” digital library? –Central Concept: Quality! –Hypotheses of this work: Formal theory can help to define “what’s a good digital library” by: New formalizations of quality indicators for DLs within our 5S framework Contextualizing these measures within the Information Life Cycle

61 Quality and the Information Life Cycle

62 Quality Dimensions

63 Services: Efficiency / Effectiveness
Effectiveness
–Very common measures: Precision, Recall, F1, 10-precision, R-Precision
–Other services may have different measures: e.g., Recommending, etc.
Efficiency
–Let $t(e)$ be the time of an event $e$
–Let $e_{i_x}$ and $e_{f_x}$ be the initial and the final event of service $se_x$
–For service $se_x$, efficiency is defined as: $\mathrm{Efficiency}(se_x) = t(e_{f_x}) - t(e_{i_x})$
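
The measures on this slide are straightforward to compute once a service run has been logged. The sketch below uses the standard set-based definitions of precision, recall, and F1, and the slide's definition of efficiency as the time between a service's initial and final events. The function names and the sample document identifiers are illustrative assumptions.

```python
def precision_recall_f1(retrieved, relevant):
    """Standard set-based effectiveness measures for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def precision_at_k(ranked, relevant, k=10):
    """Precision over the first k ranked results (10-precision when k=10)."""
    relevant = set(relevant)
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def efficiency(t_initial, t_final):
    """Efficiency(se_x) = t(e_fx) - t(e_ix): elapsed time of the service."""
    return t_final - t_initial


# Invented example: four documents retrieved, three relevant overall.
p, r, f1 = precision_recall_f1(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"])
elapsed = efficiency(10.0, 10.25)
```

Note that under this definition a smaller efficiency value means a faster service, so comparisons between services should treat lower as better.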

64 DL Integration What is “DL Integration” –Hide distribution –Hide heterogeneity –Enable autonomy of individual component Why Integration –island-DLs –inability to seamlessly and transparently access knowledge across DLs Utilize various autonomous DLs in concert

65 Integration: Urgency, Longevity If we collect, capture, acquire, or produce information, will it be usable in 100 years? NSF Digital Archiving Program Library of Congress National Digital Information Infrastructure and Preservation Program

66 DL interoperability approach [concept map; recoverable node and link labels: intermediary-based; mapping-based; consists of; mediator; wrapper; agent; use; two architectures; federation; union archiving; used in; hybrid mapper; composite mapper; schema mapping; interrelated with; GA; trained by; DL integration formalization; based on]

67 Union DL Definitions A Minimal Union Digital Library integrated from n DLs is given as a four-tuple: MinUnionDL=(Union Repository, Union Catalog, Minimal Union Services, Union Society). DL Integration Problem Definition: Given n individual digital libraries (DL1, DL2, …, DLn), each defined as described above, to integrate the n DLs is to create a Union DL.
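
A minimal sketch of the four-tuple above, assuming each member DL can be reduced to a repository, a catalog, a set of services, and a society. The `DL` class, its field types, and the naive `integrate` function are hypothetical simplifications. In particular, taking the plain union of member services is an assumption made here and is not the formal definition of Minimal Union Services.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class DL:
    """Simplified stand-in for a minimal DL (names are illustrative)."""
    name: str
    repository: Dict[str, str]        # object id -> digital object (e.g. a file)
    catalog: Dict[str, dict]          # object id -> metadata record
    services: Set[str]
    society: Set[str]


def integrate(dls: List[DL]) -> DL:
    """Naively build a union DL from n member DLs.

    Object ids are prefixed with the member DL's name so that identical
    local ids from different DLs do not collide.
    """
    repo, cat, services, society = {}, {}, set(), set()
    for dl in dls:
        for oid, obj in dl.repository.items():
            repo[f"{dl.name}:{oid}"] = obj
        for oid, record in dl.catalog.items():
            cat[f"{dl.name}:{oid}"] = record
        services |= dl.services
        society |= dl.society
    return DL("Union", repo, cat, services, society)


# Invented member DLs with clashing local ids.
vn = DL("VN", {"1": "potsherd.jpg"}, {"1": {"type": "pottery"}},
        {"search"}, {"archaeologist"})
hd = DL("HD", {"1": "bone.jpg"}, {"1": {"type": "bone"}},
        {"browse"}, {"student"})
union = integrate([vn, hd])
```

The sketch sidesteps the hard part of integration, which is that member catalogs use different metadata formats; that is what the mapping-based approaches address.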

68 Union Catalog Quality Measurement Complete –All the catalogs to be integrated are complete. Consistent –All the catalogs to be integrated are consistent. –Each descriptive metadata specification in the union catalog describes only one digital object.
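
These two conditions can be checked mechanically. The sketch below assumes a catalog is a list of (metadata specification id, digital object id) pairs, and reads "complete" as "every digital object in the collection has at least one metadata specification". That reading and all function names are assumptions made for illustration. The consistency check implements the slide's condition that each metadata specification describes only one digital object.

```python
def catalog_is_complete(object_ids, catalog):
    """True if every object is described by at least one metadata spec.

    catalog: iterable of (metadata_spec_id, object_id) pairs (assumed shape).
    """
    described = {oid for _, oid in catalog}
    return set(object_ids) <= described


def catalog_is_consistent(catalog):
    """True if no metadata spec is attached to more than one object."""
    seen = {}
    for spec_id, oid in catalog:
        if seen.setdefault(spec_id, oid) != oid:
            return False
    return True


def union_catalog_ok(members):
    """members: list of (object_ids, catalog) pairs, one per member DL."""
    return all(catalog_is_complete(objs, cat) and catalog_is_consistent(cat)
               for objs, cat in members)


# Invented catalogs: the second reuses spec "m1" for two different objects.
good = (["o1", "o2"], [("m1", "o1"), ("m2", "o2")])
bad = (["o1", "o2"], [("m1", "o1"), ("m1", "o2")])
```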

69 Member DLs of ETANA-DL

70 Architecture of ETANA-DL, with centralized catalog and partially decentralized repository

71 [figure panels: Mapping confirmation; Mapping history]

72 Union Catalog Integration [figure labels: Virtual Nimrin (VN); VN Catalog; VN Metadata Format; Halif DigMaster (HD); HD Catalog; HD Metadata Format; Mapping Tool; Wrapper; Global Metadata Format; Union Catalog; Union ArchDL]
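
The idea behind this figure is that each member DL's records are translated from its local metadata format into a global format before entering the union catalog. The sketch below shows that translation as a simple field-renaming step. All field names in both mappings are invented for illustration; the actual VN, HD, and ETANA-DL schemas are not given in the slides.

```python
# Hypothetical local-to-global field mappings (not the real schemas).
VN_TO_GLOBAL = {"objectType": "type", "locus": "context", "siteName": "site"}
HD_TO_GLOBAL = {"category": "type", "findSpot": "context", "site": "site"}


def to_global(record, mapping):
    """Translate one local record into the global metadata format.

    Returns the translated record plus the local fields that had no
    mapping, so a person can review them and extend the mapping.
    """
    translated, unmapped = {}, []
    for field, value in record.items():
        if field in mapping:
            translated[mapping[field]] = value
        else:
            unmapped.append(field)
    return translated, unmapped


def build_union_catalog(sources):
    """sources: list of (dl_name, records, mapping) triples."""
    union = []
    for dl_name, records, mapping in sources:
        for record in records:
            translated, _ = to_global(record, mapping)
            translated["source"] = dl_name   # keep provenance of each record
            union.append(translated)
    return union


# Invented sample records.
vn_records = [{"objectType": "pottery", "locus": "L12", "notes": "rim sherd"}]
hd_records = [{"category": "bone", "findSpot": "F3", "site": "Halif"}]
union_catalog = build_union_catalog([
    ("VN", vn_records, VN_TO_GLOBAL),
    ("HD", hd_records, HD_TO_GLOBAL),
])
```

Reporting unmapped fields rather than silently dropping them fits an incremental, semi-automatic mapping process in which a person confirms each proposed correspondence.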

73 [figure labels: 5SGraph; 5S Archaeology MetaModel; ArchDL Expert; ArchDL Designer; Structure Sub-model; Scenario Sub-model; ETANA-DL Union Services Descriptions (Harvesting, Mapping, Searching, Browsing, ...); VN Metadata Format; HD Metadata Format; ETANA-DL Metadata Format; Mapping Tool; Wrapper4VN; Wrapper4HD; VN Catalog; HD Catalog; Union Catalog; XOAI; Inverted Files; Services DB; Index; Browse DB; Browse Service; Search Service; Other ETANA-DL Services; Web Interface; 5SGen; Component Pool]

74 5S definitional structure

75 Minimal archaeological DL in the 5S framework (A.i is from minimal DL, j is new)

76 Minimal CBIR DL [figure; columns: Stream, Structure, Space, Service, Society; labels: Image Stream; Feature Vector; Image Descriptor; Structured Feature Vector; Image Content Description; Image Digital Object; Image Object; User Info Need; Image Collection; Visualization Operation; Content-based Image Searching Service; Image Descriptor Metadata Catalog; Composite Descriptor; KNNQ; RQ]

77 DL Ref. Model Concepts - 5S (see II.4.2) User -> Societies –Human and machine actors –End-users, Designers, Administrators, Application Developers + Librarians (DL curric) Content -> Streams, Structures Functionality -> Services -> Scenarios Quality -> Services (recall 5SQual) Policy -> Scenarios, Societies Architecture -> Scenarios, Structures, Spaces (components, protocols, standards, specs)

78 International Repository Infrastructure Workshop (Amsterdam, Mar 16-17, 2009) How can we strengthen the infrastructure for repositories: key solvable problems: Citation services - making citation data more easily available from repositories Repository handshake – talking to each other, user deposit into several at once Interoperable identification infrastructure – unambiguous people, documents (FRBR)

79 International Repository Infrastructure Workshop – and DL.org How are these 2 related? Can we learn from the Amsterdam meeting and focus on some important and solvable issues immediately?

80 Discussion Topics Faced in MARIAN, NCSTRL, CITIDEL, Ensemble, NSDL, ETANA Already solved: OAI-PMH Focus –Superimposed information / annotation –Citation information Approaches –5S: 5SL, 5SGen, 5SQual –XML representations –Protocols (VIDI)

81 Summary Contextual Background –DL Definitions, Scope –DL Curricula Efforts –Interoperability Approaches 5S 5S Services Work International Repository Infrastructure Workshop (Amsterdam, Mar 16-17, 2009) Discussion Topics

82 Questions? Discussion? Thank You!

