The Future of Isite - Growing GILS Archie Warnock A/WWW Enterprises



What Is Isite?
- Isite is a standards-based Internet toolkit for information search and retrieval (Z39.50)
- Isite was developed by MCNC/CNIDR
- Isite was intended as a replacement for freeWAIS
- Funded by a US NSF grant
- There are other good Z39.50 toolkits, too

Isite Architecture
- Isite is written in C++ to utilize the usual object-oriented advantages
- Major components
    - Isearch - the search and retrieval engine
    - SAPI - the Z39.50 search engine API
    - Zdist - the Z39.50 implementation

Isite Architecture - Example Programs
- Iindex, Isearch, Iutil - the search engine
- Isearch-cgi - the CGI gateway to Isearch
- zclient, izclient, zping, zbatch - the Z39.50 clients
- zserver, zserverNT - the Z39.50 servers
- zcon & zgate - the WWW-to-Z39.50 gateway

Current Status of Isite
- MCNC/CNIDR funding from NSF is finished
    - Successful completion of 3 year grant
    - Jim Fullton, PI, is now at WIPO in Geneva
    - No additional support is anticipated
- Other projects are supporting customization
    - FGDC, US Dept. of Commerce, US Patent & Trademark Office, CEO, STScI, World Bank, BSn

Isite Strengths
- Powerful and flexible search engine
- Community-based development of a reference implementation
- Freely distributed and widely available for any use
- Source code included
- Powerful search engine interface
- Ported to Windows NT with threaded Z39.50 server

Isearch Features
- Full text search
- Search on text fields
- Search on numeric fields with appropriate relations (>, <, =)
- Search on date fields with appropriate relations (before, during, after)
- Search on geospatial bounding box
- Boolean searches
- Phrase searching
- Right truncation
- Proximity searching (within N characters)
- Case insensitive searching, punctuation ignored
- Configurable stopword list
- Customizable results presentation
- Relevance ranked scores
- Term weighting
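A few of the text-matching features listed above can be illustrated with a toy sketch. This is not Isearch's implementation: the function names, the stopword list, and the choice to measure proximity between the starting offsets of two terms are all assumptions made for illustration.

```python
import re

# Hypothetical stopword list; Isearch's list is configurable.
STOPWORDS = {"the", "of", "a", "and"}

def tokenize(text):
    """Return (term, offset) pairs: lowercased, punctuation ignored, stopwords dropped."""
    return [(m.group().lower(), m.start())
            for m in re.finditer(r"[A-Za-z0-9]+", text)
            if m.group().lower() not in STOPWORDS]

def matches(term, pattern):
    """Right truncation: 'retriev*' matches any term that starts with 'retriev'."""
    pattern = pattern.lower()
    if pattern.endswith("*"):
        return term.startswith(pattern[:-1])
    return term == pattern

def find(text, pattern):
    """Character offsets of every term in the text that matches the pattern."""
    return [pos for term, pos in tokenize(text) if matches(term, pattern)]

def near(text, left, right, n):
    """Proximity: True if a match of `left` starts within n characters of a match of `right`."""
    return any(abs(a - b) <= n
               for a in find(text, left)
               for b in find(text, right))

doc = "Isite: a toolkit for Information Search and Retrieval."
print(find(doc, "retriev*"))
print(near(doc, "search", "RETRIEVAL", 15))
```

A real engine would answer these queries from an index built ahead of time rather than rescanning the text for every query.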

Isearch Document Types
- ASCII text
- USMARC records
- Electronic mail folders
- Usenet news archives
- US patents
- IAFA templates
- BIBTeX
- Filenames
- First line in file
- SGML tagged fields
    - HTML
    - GILS templates
    - FGDC templates
- Colon delimited fields
    - GCMD DIF templates
- whois++ templates
- Multi-file documents
- Medline
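To make the "colon delimited fields" document type concrete, here is a minimal sketch of how such a record could be split into searchable fields. The field names and the rule that indented lines continue the previous field are assumptions for illustration, not the actual DIF layout or the Isearch parser.

```python
def parse_colon_record(text):
    """Parse 'Name: value' lines into a dict of lists.

    Repeated field names accumulate; an indented line is treated as a
    continuation of the previous field (an assumption of this sketch).
    """
    fields = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace() and current is not None:
            fields[current][-1] += " " + line.strip()
            continue
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a field line; ignored in this sketch
        current = name.strip()
        fields.setdefault(current, []).append(value.strip())
    return fields

# A made-up record; the field names are illustrative only.
record = """\
Entry_Title: Sample sea surface temperature data
Keyword: OCEANS
Keyword: TEMPERATURE
Summary: A made-up record used only
  to illustrate the parsing.
"""
parsed = parse_colon_record(record)
print(parsed["Keyword"])
```

Once a record is split this way, each field name can serve as a target for the fielded searches described under Isearch Features.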

Isite Weaknesses
- Modest Z39.50 implementation
    - needs GRS-1
    - better USMARC support
    - data structures
- All examples are console applications
- No real end-user applications
- No GUI interface
- Difficult configuration
- Requires programming for extensions
- Needs optimization & performance enhancement
- Needs more documentation

What The Future Holds For Isite
- New Projects want (and will get):
    - Distributed document collections
    - Distributed searching
    - Automated information extraction (centroids, templates)
    - Searching and referrals
    - Additional Z39.50 support (lots of Z39.50 details are not supported now)
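The link between centroids and referrals can be sketched as follows. In the whois++ sense, a centroid is a summary of the unique terms a server holds; a query is referred only to servers whose centroid could satisfy it. The node names, documents, and function names below are hypothetical, and this is not a description of how Isite does or will do it.

```python
import re

def terms(text):
    """Lowercased alphanumeric terms in a piece of text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def build_centroid(documents):
    """A centroid here is simply the set of unique terms across a node's documents."""
    centroid = set()
    for doc in documents:
        centroid |= terms(doc)
    return centroid

def refer(query, centroids):
    """Names of nodes whose centroid contains every query term, in sorted order."""
    wanted = terms(query)
    return sorted(name for name, centroid in centroids.items()
                  if wanted <= centroid)

# Hypothetical nodes and documents.
nodes = {
    "node-a": ["Wetland maps of coastal counties", "Coastal erosion survey"],
    "node-b": ["Patent abstracts for pumps", "Trademark filings"],
    "node-c": ["Coastal water quality survey"],
}
centroids = {name: build_centroid(docs) for name, docs in nodes.items()}
print(refer("coastal survey", centroids))
```

A centroid only says that a node might hold a match: "wetland survey" is referred to node-a even though no single document there contains both words, so the node itself still has to run the real search.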

GILS and the Advanced Search Facility
- ASF is a US Dept. of Commerce project, to be built by Pilot Research, MCNC and A/WWW Enterprises
- "GILSnet" - a network of cooperative, low-impact, distributed nodes
- The basic interchange will be GILS templates
- Search on full text and GILS records

GILS, Dublin Core and Everyone Else
- Dublin Core is a minimal (15 fields) generic metadata scheme for virtually any kind of document
- GILS represents a more detailed approach, including most of DC, providing greater interoperability
- GILS is less bibliographically oriented than BIB-1
- GILS is lightweight compared to GEO and CIP (which have specific functional requirements)

What GILS Means To Me - 1
- Fewer fields
    - More documents
    - More metadata records
    - Skinnier metadata records
    - Easier abstraction
- More fields
    - Fewer documents
    - Fewer metadata records
    - Fatter metadata records
    - Less abstraction
- GILS is a good, general compromise

What GILS Means To Me - 2
- Think of the GILS profile as defining a language
    - At some level, Z39.50 is a detail
    - Protocols are about communication, profiles are about abstraction and GILS is about content
    - Z39.50 guarantees that the user's query can be unambiguously decoded - no guarantees about content
    - We could implement the profile over any protocol - http, CORBA, etc.
    - Does GILS have to use Z39.50? No, but the abstraction is required
    - Z39.50 already includes the abstraction model
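The split between profile and protocol can be sketched in a few lines: the query is stated once against abstract field names, then encoded for two different transports. Everything here is an illustrative assumption. The attribute numbers 4 (Title) and 1016 (Any) come from the Bib-1 attribute set, not from the GILS profile; the prefix notation is the style used by some Z39.50 toolkits; the HTTP URL layout is invented.

```python
from urllib.parse import urlencode

# Hypothetical profile: abstract field names mapped to numeric use attributes.
# 4 (Title) and 1016 (Any) follow Bib-1; a GILS deployment would take its
# values from the GILS profile instead.
PROFILE = {"title": 4, "any": 1016}

def to_z3950_prefix(field, term):
    """Encode the query in a prefix-query style string for a Z39.50 client."""
    return '@attr 1={} "{}"'.format(PROFILE[field], term)

def to_http(field, term, base="http://example.org/search"):
    """Encode the same query as an HTTP GET URL (invented URL layout)."""
    if field not in PROFILE:
        raise KeyError(field)
    return base + "?" + urlencode({"field": field, "term": term})

print(to_z3950_prefix("title", "wetlands"))
print(to_http("title", "wetlands"))
```

Both encoders reject a field the profile does not define, which is the point of the slide: the shared abstraction is what makes the query unambiguous, whichever protocol carries it. The sketch does not escape quotes inside the search term.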

Related Documents
- Getting Isite
    - ftp://ftp.cnidr.org/pub/software/Isite
    - ftp://ftp.clark.net/pub/warnock/Software (pre)
- A/WWW Enterprises ml US Phone/FAX: