Prototyping Digital Libraries Handling Heterogeneous Data Sources – An ETANA-DL Case Study Unni Ravindranathan, Rao Shen, Marcos André Gonçalves, Weiguo.


1 Prototyping Digital Libraries Handling Heterogeneous Data Sources – An ETANA-DL Case Study Unni Ravindranathan, Rao Shen, Marcos André Gonçalves, Weiguo Fan, Edward A. Fox, James W. Flanagan fox@vt.edu http://fox.cs.vt.edu Virginia Tech, Blacksburg, VA, USA (and CWRU) ECDL 2004, Bath, England, September 2004

2 Acknowledgements (Selected) Sponsors: NSF grant ITR-0325579; AOL, ASOR, CWRU, ETANA, Vanderbilt U., Virginia Tech Faculty/Staff: Lillian Cassel, Debra Dudley, Roger Ehrich, Manuel Perez, Naren Ramakrishnan VT (Former) Students: Aaron Krowne, Ming Luo, Fernando Das Neves, Ricardo Torres, Hussein Suleman

3 Acknowledgements (contd.) Karen Borstad, MPP Douglas Clark, Walla Walla College Joanne Eustis, CWRU Nick Fischio, CWRU Paul Gherman, Vanderbilt U. Andrew Graham, U. Toronto Tim Harrison, U. Toronto Larry Herr, Canadian University College Christopher Holland, LRP Paul Jacobs, Mississippi State U. Douglas Knight, Vanderbilt U. Stan LaBianca, Andrews U. David McCreery, Willamette U. Eric Meyers, Duke U. Adam Porter, Illinois College Jack Sasson, Vanderbilt U. Tom Schaub, Indiana U. of Penn. Randall Younker, Andrews U.

4 Outline Problems Background Approach ETANA-DL ETANA-DL Prototype System Modeling ETANA-DL ETANA-DL Services Analysis Conclusions Future Work

5 Problems Interoperability among heterogeneous archaeological systems Delay in publication of primary archaeological data Lack of sustainable solutions to long-term preservation of valuable information Lack of services useful to the archaeology community, including “traditional DL services” Difficulty in understanding complex archaeological information systems Difficulty in requirements elicitation for archaeological systems

6 Outline Problems Background Approach ETANA-DL ETANA-DL Prototype System Modeling ETANA-DL ETANA-DL Services Analysis Conclusions Future Work

7 Open Archives Initiative Promotes interoperability among DLs. Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Data Providers possess metadata and share it (internally / externally) via well-defined OAI protocols (e.g., database servers). Service Providers harvest data from Data Providers and provide higher-level services to users.
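The harvesting step between a Service Provider and a Data Provider can be sketched roughly as follows. The `ListRecords` verb and the `metadataPrefix=oai_dc` argument come from the OAI-PMH specification; the base URL, the helper names, and the sample response are assumptions made up for illustration and are not ETANA-DL endpoints or code.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"  # namespace used by OAI-PMH responses


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the ListRecords request a service provider sends to a data provider."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def record_identifiers(response_xml):
    """Pull the record identifiers out of a ListRecords response."""
    root = ET.fromstring(response_xml)
    path = f".//{{{OAI_NS}}}header/{{{OAI_NS}}}identifier"
    return [element.text for element in root.findall(path)]


# Hypothetical, heavily trimmed response used only to exercise the parser.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:bone-1</identifier></header></record>
    <record><header><identifier>oai:example.org:seed-7</identifier></header></record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://example.org/oai"))
print(record_identifiers(SAMPLE))
```

A real harvester would also follow resumption tokens and handle error responses, which this sketch leaves out.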

8 Traditional Digital Libraries [Figure labels: Users; Digital Library (monolithic and/or custom-built web-based application); Digital Objects (Program, Document, Image, Video)]

9 Introduction to ODL (Open Digital Libraries) Open Digital Libraries Framework for componentized Digital Libraries Design principles for components Protocols for inter-component communications Built upon OAI

10 Open Digital Libraries Approach [Figure labels: Users; ETANA-DL; Sites; components: User Interface, Search, Browse, Recent, Union, Filter; digital objects: Bone, Seed, Figurine, Pottery]

11 Basic ODL Model: An application for Archaeology [Diagram labels: Nimrin (OAI Data Provider); OAI-PMH; ETANA-DL Union Catalog; OAI-PMH; ETANA-DL Search Engine (ODL Service Provider Component); ODL Protocol; User Interface (WWW Interface); ODL Protocol]

12 Componentized services example [Diagram labels: User; User Interface; Search Handler Servlet; IRDB Search Engine; Index DB. Flow labels: Query; Query in the IRDB query language; Results in XML; Parsed XML; Results]
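The flow named on this slide (a handler passes a query to a search component and gets XML back, which it then parses) can be illustrated with a toy sketch. Everything here is an assumption for illustration: `ToySearchEngine` is a stand-in with a tiny in-memory inverted index, not the IRDB engine, and it does not use the IRDB query language.

```python
import xml.etree.ElementTree as ET


class ToySearchEngine:
    """Stand-in for a search component; keeps a small inverted index in memory."""

    def __init__(self, documents):
        self.index = {}
        for doc_id, text in documents.items():
            for term in text.lower().split():
                self.index.setdefault(term, set()).add(doc_id)

    def query(self, terms):
        """Return ids of documents containing every term, serialized as XML."""
        postings = [self.index.get(term.lower(), set()) for term in terms]
        hits = sorted(set.intersection(*postings)) if postings else []
        root = ET.Element("results")
        for doc_id in hits:
            ET.SubElement(root, "hit", id=doc_id)
        return ET.tostring(root, encoding="unicode")


def handle_search(engine, user_query):
    """Handler: split the user's query into terms, call the engine, parse the XML."""
    xml_results = engine.query(user_query.split())
    return [hit.get("id") for hit in ET.fromstring(xml_results)]


engine = ToySearchEngine({
    "bone-1": "sheep bone Nimrin",
    "seed-7": "barley seed Nimrin",
})
print(handle_search(engine, "nimrin"))
```

Keeping XML as the only contract between handler and engine is what lets either side be swapped out independently.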

13 5S Model – Informally Digital libraries are complex information systems that: help satisfy info needs of users (societies) provide info services (scenarios) organize info in usable ways (structures) present info in usable ways (spaces) communicate info with users (streams)

14 Outline Problems Background Approach ETANA-DL ETANA-DL Prototype System Modeling ETANA-DL ETANA-DL Services Analysis Conclusions Future Work

15 Solution – our approach Applying and extending Digital Library (DL) techniques to solve the following problems: interoperability, making primary data available, data preservation. Modeling archaeological information systems using 5S theory to better understand the domain and to design the system and its supported services. Rapidly prototyping DLs that handle heterogeneous archaeological data using componentized frameworks, to support requirements elicitation and provide useful services.

16 Outline Problems Background Approach ETANA-DL ETANA-DL Prototype System Modeling ETANA-DL ETANA-DL Services Analysis Conclusions Future Work

17 ETANA-DL Archaeological Digital Library. Applies and extends the OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting). Design considerations: Componentized; Distributed architecture; Extensible; Portable

18 ETANA Digital Library Core Components - DigBase DigBase (DB) Central repository - stores metadata Union catalog - for the collections in ETANA-DL Various kinds of digital objects – excavation records, images, text collections, etc. General services - Search, Browse, Annotate, Recommend, etc. Archaeology-specific services - artifact analysis, visualizations, artifact interpretation, workflows, etc.

19 ETANA Digital Library Core Components - DigKit DigKit (DK) A suite of tools for collecting and recording archaeological data in the field, that can be used for a new dig Metadata will migrate to DigBase (DB). Real-time collaborative archaeology: Metadata in DB will be rapidly available to others.

20 Outline Problems Background Approach ETANA-DL ETANA-DL Prototype System Modeling ETANA-DL ETANA-DL Services Analysis Conclusions Future Work

21 Architecture [Diagram labels: Archaeological Site; DigKit; DigBase DB; Data Mapping Component; OAI Data Provider; OAI; XOAI; Union Catalog; Index; Inverted Files; Search Component; Browse Engine; Browse DB; DB used by Services; Other ETANA-DL Services; Web Interface; Configure; ETANA-DL]

22 Modeling ETANA-DL – An Archaeological DL Meta-model [Diagram labels. Stream model: Text, Video, Audio, Drawing, Photo, 3D. Structure model: Region, *Site, *Partition, *Sub-partition, *Locus, *Container, *Artifact; Metadata; Taxonomies: Temporal, Spatial, Artifact-specific. Space model: Geographic space, Metric space, User interface. Society model: Archaeologist, General public, Service Manager. Scenario model (Services): Information Satisfaction, Value added, Repository building, Domain specific]

23 Modeling ETANA-DL – The ETANA-DL model [Diagram labels. Stream model: Field record, locus sheet; Figurine image (photo); Site/field plan (drawing); Preliminary/Final Report (application/pdf). Structure model: Jordan, Umayri, *Field, *Square, *Locus, *Pail; Jordan Valley, Nimrin, *Quadrant, *Square, *Locus, *Bag; Southern Israel, Halif, *Field, *Area, *Locus, *Basket; artifacts: *Bone, *Seed, *Figurine. Taxonomies: Archaeological periods, Bone type, Seed species, Spatial. Space model: Site-specific coordinate system, Web interface, Vector space. Society model: Archaeologist, Generic public, ETANA-DL Service Manager. Scenario model (Services): Searching, Browsing; Annotation, binding; Harvesting, Converting; Object comparison, marking item for analysis]

24 Modeling ETANA-DL – Mapping heterogeneous data to the structural model
Site | Partition | Sub-partition | Locus | Container
Lahav | Field I | Area A8 | Locus A8074 | Basket 224
Nimrin | Quadrant NW | Quadrant Value N25/W50 | Locus 96 | Bag 240
Umayri | Field A | Square 7J59 | Locus 001 | Pail 12
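The mapping on this slide can be read as a per-site renaming of local terms onto the common levels (partition, sub-partition, locus, container). The sketch below is one possible way to express that, assuming a simple lookup table; the names `SITE_TERMS` and `to_common_record` are hypothetical and do not come from ETANA-DL.

```python
# Local term used at each site for every level of the common structure,
# as listed on the slide.
SITE_TERMS = {
    "Lahav": {"Field": "partition", "Area": "subpartition",
              "Locus": "locus", "Basket": "container"},
    "Nimrin": {"Quadrant": "partition", "Quadrant Value": "subpartition",
               "Locus": "locus", "Bag": "container"},
    "Umayri": {"Field": "partition", "Square": "subpartition",
               "Locus": "locus", "Pail": "container"},
}


def to_common_record(site, local_record):
    """Rename a site's local field names to the common structural levels."""
    common = {"site": site}
    for local_name, value in local_record.items():
        common[SITE_TERMS[site][local_name]] = value
    return common


nimrin = to_common_record(
    "Nimrin",
    {"Quadrant": "NW", "Quadrant Value": "N25/W50", "Locus": "96", "Bag": "240"},
)
print(nimrin)
```

Adding a new site then means adding one entry to the lookup table rather than changing the services that consume the common record.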

25 Data Mapping

26 ETANA-DL Schema Design [Diagram labels. ETANA-DL Object: Owner, Subpartition, Partition, Locus, ID, Container, Collection, …… Bone: Count, Animal, …… Seed: Species Name, …… Figurine: Description, Dimensions, ……]
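One way to read this schema slide is as a shared base record extended by artifact-specific fields. The sketch below assumes that reading; the grouping of attributes under Bone, Seed, and Figurine is inferred from the order of labels on the slide, and the class names, field names, and types are hypothetical rather than taken from the actual ETANA-DL schema.

```python
from dataclasses import dataclass


@dataclass
class EtanaObject:
    """Attributes assumed to be shared by every harvested object."""
    id: str
    owner: str
    collection: str
    partition: str
    subpartition: str
    locus: str
    container: str


@dataclass
class Bone(EtanaObject):
    animal: str = ""   # assumed to belong to Bone
    count: int = 0     # assumed to belong to Bone


@dataclass
class Seed(EtanaObject):
    species_name: str = ""   # assumed to belong to Seed


@dataclass
class Figurine(EtanaObject):
    description: str = ""   # assumed to belong to Figurine
    dimensions: str = ""    # assumed to belong to Figurine


bone = Bone(id="b1", owner="Nimrin", collection="bones", partition="NW",
            subpartition="N25/W50", locus="96", container="240",
            animal="sheep", count=3)
print(bone)
```

Services that only need the shared fields (for example, browsing by site structure) can then work against `EtanaObject` without knowing the artifact type.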

27 Outline Problems Background Approach ETANA-DL ETANA-DL Prototype System Modeling ETANA-DL ETANA-DL Services Analysis Conclusions Future Work

28 ETANA-DL Services: Categories Information satisfaction Searching Browsing Recommendation Archaeology (Domain) specific Object comparison Marking items Value-added Annotation Items of interest (Binding service) Recent searches/discussions User management

29 Searching: Search Interface

30 Searching: Search Results

31 Searching: Advanced Search

32 Searching: Advanced Search Results

33 Multi Dimensional Browsing Site structure Temporal Object-specific User context

34 Searching within a Context

35 Searching within a Context: Search Results

36 Restoring Browsing Contexts

37 Object Comparison: Selecting Objects for Comparison

38 Object Comparison: Editing Attributes

39

40 Object Comparison: Comparing Objects

41 Object Comparison: Comparison Results

42 Marking items

43 Viewing marked items

44 Remarking items

45 Discussion Board (Annotation): View Messages

46 Discussion Board (Annotation): Post Messages/Replies

47 Collections Description

48 Other services Items of Interest (Binding service) Recent searches/discussions Recommendation User management Account creation Login

49 Items of Interest: Binding Service

50 Recent Searches/Discussions

51 Recommendation

52 User Management: New User Account

53 User Management: Login

54 User Management: Navigations

55 Outline Problems Background Approach ETANA-DL ETANA-DL Prototype System Modeling ETANA-DL ETANA-DL Services Analysis Conclusions Future Work

56 Heterogeneous data handling
Site | Artifact Type | Original data source | Number of attributes in original record | Number of attributes in harvested record | Number of records harvested
Lahav | Figurine | Tab-delimited text file | 15 | 18 | 564
Nimrin | Bone field record | Table in Oracle DB | 21 | 24 | 7420
Nimrin | Seed field record | Table in Oracle DB | 12 | 15 | 430
Umayri | Bone field record | 2 tables in Access DB | 8 | 24 | 2123
Total | | | | | 10537

57 Heterogeneous data handling
Site | Data Analysis (in hours) | Data Mapping (in hours) | Data Provider Implementation (in hours) | Service Provider Implementation (in hours)
Lahav | 48 | 144 | 4 | 1
Nimrin | 48 | 48 | 4 | 1
Umayri | 24 | 48 | 4 | 1
Total | 120 | 240 | 12 | 3

58 Heterogeneous data handling

59 Rapid prototyping: Lines of Code
Type of Service | LOC for implementing service | LOC reused from components | Total LOC | Reuse Percentage
Componentized | 350 | 3630 | 3980 | 91
Non-componentized | 7950 | - | - |
Total | 8300 | 3630 | 11930 | 30.4
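The reuse percentages on this slide appear to be reused lines divided by total lines. A short check, assuming that definition and reading the slide's figures as 350 new and 3630 reused lines for componentized services and 8300 new lines overall (the function name is illustrative):

```python
def reuse_percentage(reused_loc, total_loc):
    """Share of the total lines of code that came from reused components."""
    return 100.0 * reused_loc / total_loc


# Componentized services: 350 new lines plus 3630 reused lines.
componentized = reuse_percentage(3630, 350 + 3630)

# Whole prototype: 8300 new lines plus the same 3630 reused lines.
overall = reuse_percentage(3630, 8300 + 3630)

print(round(componentized), round(overall, 1))
```

Under this definition the two results round to the 91 and 30.4 shown on the slide.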

60 Rapid prototyping: Service development times Componentized Services Non-componentized Services

61 User Analysis Initial comments from all 3 projects, plus others interested in ETANA-DL Positive feedback – users liked: Data integration Prototype cross-collection information access services Information structuring Utility of supported services Negative feedback – user concerns: Need for service enhancements Usability

62 Outline Problems Background Approach ETANA-DL ETANA-DL Prototype System Modeling ETANA-DL ETANA-DL Services Analysis Conclusions Future Work

63 Conclusions Applied 5S to the archaeological domain. Identified requirements for future versions of the system. Developed an extensible and componentized approach for handling heterogeneous archaeological data from disparate sources. Rapidly generated a prototype archaeological DL. Made primary archaeological data available without significant delay.

64 Outline Problems Background Approach ETANA-DL ETANA-DL Prototype System Modeling ETANA-DL ETANA-DL Services Analysis Conclusions Future Work

65 Future Work Componentizing current DL services. Creating next-generation DL services from an expanding set of requirements. Integrating richer content. (Semi-)automatic data mapping. Automating the ingest of DL content. Enhancing interface capabilities. Formal usability studies.

66 Visual Browsing Visual Browse By sites

67 Visual Browsing: Topographical Drawings Full site; North west quadrant; Square: N40/W20

68 Visual Browsing: Square information Loci layout Square: N40/W20 Locus: 86

69 Visual Browsing: locus sheet

70 Publications
1. U. Ravindranathan, R. Shen, M. A. Goncalves, W. Fan, E. A. Fox, J. W. Flanagan. ETANA-DL: A Digital Library for Integrated Handling of Heterogeneous Archaeological Data. To be presented at the ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.
2. U. Ravindranathan, R. Shen, M. A. Goncalves, W. Fan, E. A. Fox, J. W. Flanagan. ETANA-DL: Managing Complex Information Applications – An Archaeology Digital Library. Demo to be presented at the ACM-IEEE Joint Conference on Digital Libraries (JCDL 2004), Tucson, AZ, June 7-11, 2004.
3. U. Ravindranathan, R. Shen, M. A. Goncalves, W. Fan, E. A. Fox, J. W. Flanagan. Prototyping Digital Libraries Handling Heterogeneous Data Sources – The ETANA-DL Case Study. European Conference on Digital Libraries (ECDL 2004), Bath, U.K., September 12-17, 2004 (submitted).

71 Questions/Feedback ??

