Adding Value to Open Scholarly Content: How Services and Search Expose the Value of the Perseus Digital Library



What is Perseus?
Mission
–To increase accessibility to and interest in the humanities.
What makes our content interesting?
–We give this content away yet still maintain a user base that finds value in its offerings.

Perseus’ Static & Dynamic Services
Perseus gives away its static content.
Perseus also makes its content dynamically accessible.
–Allows for interconnections among Perseus’ objects.
–This allows us to build up a network of associations between primary and secondary sources of information.
    Named Entity Extraction
    Morphological Analysis
–The more content we have, the more associations between objects we can offer.

Text Services
Increasing the value of Perseus’ texts
–The concepts behind the Canonical Text Services protocol (CTS)
CTS will allow us to interconnect our objects.
–Intra-connecting: Making associations within our own content
–Inter-connecting: Making associations between our content and external services/content.
The role of search
In a time when “scholarly content is increasingly being seen as a public resource,” what is the role of search engines in conceiving and delivering texts?

Goals
By the end of the talk we will see:
–A Service for Referencing Text
    CTS
–The Value of Associations
    CTS URNs, a syntax for intra and inter-connecting texts
–Perseus’ other sources of value
    Perseus’ logical architecture

A Service For Referencing Text

Concepts Behind CTS: Author to Edition
Hierarchical Ontology of Text Organization
–An author’s works
    Get me all works by Julius Caesar
    urn:cts:latinLit:stoa0069
–A particular work of an author
    Get me Caesar’s The Gallic War
    urn:cts:latinLit:stoa0069:stoa002
–An edition or translation of a work
    Get me a specific English translation of Caesar’s The Gallic War
    urn:cts:latinLit:stoa0069:stoa002:1999_02_0001

Concepts Behind CTS: Edition to Character
–A logical component of text from an edition or translation in terms of its citation scheme
    Get me Book 1, Chapter 1 of Caesar’s Gallic War from this English translation
    urn:cts:latinLit:stoa0069:stoa002:1999_02_0001:1.1
–A paragraph, quotation, or single character within a text
    “All Gaul is divided into three parts”
    urn:cts:latinLit:stoa0069:stoa002:1999_02_0001:1.25: All:0-parts:0
–A range of text
    Give me Book 1, Chapter 37 through Book 2, Chapter 5 of Caesar’s Gallic War
    urn:cts:latinLit:stoa0069:stoa002:
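The hierarchy on these two slides can be sketched in a few lines of Python. This is a minimal illustration, not the CTS implementation: it assumes the colon-delimited URN layout shown on the slides, and the names `CtsUrn`, `parse_cts_urn`, and `deepest_level` are hypothetical.

```python
from typing import NamedTuple, Optional


class CtsUrn(NamedTuple):
    """One field per tier of the hierarchy shown on the slides."""
    namespace: str
    author: Optional[str] = None
    work: Optional[str] = None
    edition: Optional[str] = None
    passage: Optional[str] = None


def parse_cts_urn(urn: str) -> CtsUrn:
    """Split a colon-delimited CTS URN into its tiers (assumed layout)."""
    parts = urn.split(":")
    if parts[:2] != ["urn", "cts"] or len(parts) < 3:
        raise ValueError(f"not a CTS URN: {urn!r}")
    fields = parts[2:]
    if len(fields) > 5:
        # Anything after the edition is treated as one passage reference,
        # since sub-references may themselves contain colons.
        fields = fields[:4] + [":".join(fields[4:])]
    return CtsUrn(*fields)


def deepest_level(urn: CtsUrn) -> str:
    """Name of the most specific tier the URN addresses."""
    filled = [name for name, value in zip(urn._fields, urn) if value is not None]
    return filled[-1]


author = parse_cts_urn("urn:cts:latinLit:stoa0069")
chapter = parse_cts_urn("urn:cts:latinLit:stoa0069:stoa002:1999_02_0001:1.1")
print(deepest_level(author), deepest_level(chapter))
```

Each additional component narrows the reference by one tier, which is what lets the same syntax address anything from an author's whole output down to a single chapter.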

CTS And Content Delivery
–CTS URNs can be thought of as a syntax for a “new and emerging content delivery mechanism”
–Through URNs we can “break down the content into component parts, each of which can be manipulated…separately”
–CTS adds value to the raw data/content we give away.
    Logical referencing
    Enables associations

The Value of Associations

Intra-Connecting Content: The Role of Index Services
–Associations between data add value.
    Google Page Rank
–Index services let us construct associations with semantic precision.
    Named entity disambiguation
    Citations
    Morphological Information
–Associations add context and increase understanding of the underlying content.
    Occurrence of Gaul in a text to its definition
    Occurrence of Gaul on this slide to previous examples.
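As a rough sketch of what an index service does, the example below links an occurrence of a term to the passages that contain it and to a definition. It is an illustration under stated assumptions, not Perseus' index services: the URN uses a made-up `demo` namespace, the definition is a placeholder, and `build_index` and `associations` are hypothetical names.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Hypothetical data: a demo URN keyed to the quotation used on the slides.
PASSAGES = {
    "urn:cts:demo:author1:work1:edition1:1.1": "All Gaul is divided into three parts",
}
# Placeholder lexicon entry, standing in for a real definition.
DEFINITIONS = {"gaul": "placeholder definition entry for Gaul"}


def build_index(passages: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each lower-cased word to the URNs of the passages it occurs in."""
    index: Dict[str, List[str]] = defaultdict(list)
    for urn, text in passages.items():
        for word in sorted(set(re.findall(r"[a-z]+", text.lower()))):
            index[word].append(urn)
    return dict(index)


def associations(term: str, index: Dict[str, List[str]],
                 definitions: Dict[str, str]) -> dict:
    """Collect what is known about a term: where it occurs and its definition."""
    key = term.lower()
    return {
        "term": term,
        "occurs_in": index.get(key, []),
        "definition": definitions.get(key),
    }


result = associations("Gaul", build_index(PASSAGES), DEFINITIONS)
print(result["occurs_in"])
```

The value comes from the links rather than the stored text: the same passage becomes more useful once it is reachable from every term it contains.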

Inter-Connecting Content: The Role of Search Engines
–Perseus can increase the value of its content even further by connecting its highly-structured data with external services (like search engines) providing less-structured data
–We’ve seen this idea before…
    Google Earth: Search and display results
        –Longitude and latitude (Geographic coordinates)
    CTS-aware searching: Search and display results
        –CTS URNs (textual coordinates)

CTS URNs and Search
–What Perseus is doing now (experimental): Using Google Base and CTS-URNs to find Perseus’ highly-structured content with semantic precision.
–Search texts at any tier of the hierarchical structure by expanding or truncating the URN.
–Examples:
    Get me all works by Julius Caesar visible to this search.
    Get me Caesar’s The Gallic War
    Get me a Perseus-edition English translation of Caesar’s Gallic War
    Get me Book 1, Chapter 1 of Caesar’s The Gallic War from the English Translation
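Searching at any tier by truncating the URN can be sketched as a match on whole URN components. This is a local illustration only and does not use Google Base; the catalog entries marked hypothetical, and the names `truncate_urn` and `search_by_urn`, are assumptions made for the example.

```python
from typing import List

# Number of components kept after "urn:cts:<namespace>" for each tier.
TIERS = {"author": 1, "work": 2, "edition": 3, "passage": 4}

CATALOG = [
    "urn:cts:latinLit:stoa0069:stoa002:1999_02_0001:1.1",  # from the slides
    "urn:cts:latinLit:stoa0069:stoa002:1999_02_0001:1.2",  # hypothetical entry
    "urn:cts:latinLit:stoa0069:stoa0021:ed1:1.1",          # hypothetical entry
]


def truncate_urn(urn: str, tier: str) -> str:
    """Drop every component below the requested tier."""
    parts = urn.split(":")
    return ":".join(parts[: 3 + TIERS[tier]])


def search_by_urn(catalog: List[str], query: str) -> List[str]:
    """Return entries at or beneath the query URN.

    Matching on a whole component (query + ":") keeps "stoa002"
    from also matching the unrelated identifier "stoa0021".
    """
    return [u for u in catalog if u == query or u.startswith(query + ":")]


work_urn = truncate_urn(CATALOG[0], "work")
print(work_urn)
print(search_by_urn(CATALOG, work_urn))
```

Truncating widens the search and expanding narrows it, so one identifier scheme covers the author, work, edition, and passage queries listed in the examples.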

CTS as a Value-Added Service
–We have a standard mechanism for referencing and retrieving texts
–We have a mechanism for tracking our audience.
–A syntax for aggregation of content (Shore)
–A well-defined API implementing an open standard (Shore)
–Handles multi-lingual content
    Provides a syntax for datasets of aligned texts.
–A notation for semantically precise associations.

Perseus’ Other Sources of Value

Perseus’ Logical Architecture: Identifying Sources of Value
DATA LAYER: TEI-XML texts, databases, raw data. Perseus gives away this raw data under the Creative Commons License.
–Perseus as a data source to the community
–Perseus understands how to create this data and can help others to do so as well
DOMAIN LAYER: The objects that encapsulate the data and add a set of behaviors.
–The knowledge and experience gained while creating this layer, and coming to understand the objects of the domain.
–Working in the domain of Classical texts provides Perseus with a unique perspective on the nature of text that others may find useful.

SERVICE LAYER: The service layer provides an API implementing a series of protocols for each of the types of data Perseus serves.
–Others are free to repurpose Perseus’ content through an API that encodes domain knowledge.
–The community using the API becomes a source of information and value
DISPLAY LAYER: The user interface. Think widgets, HTML web pages, PDFs, etc.
–Convenience & Ease of Use
–Expertise: The UI reflects the knowledge about the content gained when building the other layers.
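The four layers can be illustrated with a toy end-to-end example. This is a sketch under stated assumptions, not Perseus' code: the XML is a simplified fragment rather than valid TEI, and `Passage`, `load_passages`, `get_passage`, and `render_html` are hypothetical names.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from html import escape
from typing import List, Optional

# DATA LAYER: raw marked-up text (simplified, hypothetical fragment).
SAMPLE_XML = (
    '<text><div type="book" n="1">'
    '<div type="chapter" n="1">All Gaul is divided into three parts</div>'
    "</div></text>"
)


# DOMAIN LAYER: an object that encapsulates the data.
@dataclass(frozen=True)
class Passage:
    citation: str
    text: str


def load_passages(xml_source: str) -> List[Passage]:
    """Turn book/chapter divisions into Passage objects keyed by citation."""
    root = ET.fromstring(xml_source)
    passages = []
    for book in root.findall("div"):
        for chapter in book.findall("div"):
            citation = f"{book.get('n')}.{chapter.get('n')}"
            passages.append(Passage(citation, (chapter.text or "").strip()))
    return passages


# SERVICE LAYER: a small API that others could call.
def get_passage(passages: List[Passage], citation: str) -> Optional[Passage]:
    """Look up a passage by its citation, or return None if absent."""
    return next((p for p in passages if p.citation == citation), None)


# DISPLAY LAYER: one possible rendering of the service's result.
def render_html(passage: Passage) -> str:
    return f'<p data-citation="{escape(passage.citation)}">{escape(passage.text)}</p>'


found = get_passage(load_passages(SAMPLE_XML), "1.1")
print(render_html(found))
```

Because each layer only depends on the one beneath it, the raw data can be given away while the domain objects, the API, and the interface remain separate places to add value.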

Closing Points
–The idea: Perseus can give away its static data because it adds value by providing semantically rich associations that add context to the content.
–An Example Service: The Canonical Text Services protocol offers a new way to conceive of, reference, and deliver texts
–Associations Add Value: Perseus’ value stems from these associations; the value is not inherent in the raw data, but comes from creating relationships among the data. Search engines give Perseus the opportunity to create semantically precise associations from less-structured, external content to highly-structured Perseus content. This is accomplished by augmenting search queries with the ‘textual coordinates’ of the CTS URN.
–Perseus Offers More Than Services: In giving away our raw data, we hope to encourage others to create their own associations, increasing our value as a data provider and as service developers. For the majority of users, however, our value stems from providing highly structured texts with rich associations in a simple user interface.

Resources
–People
Blossom, John. “Shoreviews. Content Industry Outlook 2007: Reality Checks.” Shore Communications Inc. 8 Feb
Crane, Gregory. Conversations and being in his general vicinity Present.
    –Interconnecting primary and secondary sources
    –The Perseus Digital Library
Smith, Neel. Conversations and being in his general vicinity Present.
    –“An Architecture for a distributed library incorporating open-source critical editions.” OSCE position paper.
Weaver, Michael. Conversations and being in his general vicinity Present.
    –Slide layout based on his HTML design
    –Logical layers of an application as a model for business processes (
–Relevant Links
    CTS:
    Google Base:
    Perseus: