The Future of Isite - Growing GILS Archie Warnock A/WWW Enterprises

1 The Future of Isite - Growing GILS
Archie Warnock
A/WWW Enterprises
http://www.clark.net/pub/warnock/awww.html
warnock@clark.net

2 What Is Isite?
- Isite is a standards-based Internet toolkit for information search and retrieval (Z39.50)
- Isite was developed by MCNC/CNIDR
- Isite was intended as a replacement for freeWAIS
- Funded by a US NSF grant
- There are other good Z39.50 toolkits, too

3 Isite Architecture
- Isite is written in C++ to utilize the usual object-oriented advantages
- Major components
  - Isearch - the search and retrieval engine
  - SAPI - the Z39.50 search engine API
  - Zdist - the Z39.50 implementation

4 Isite Architecture - Example Programs
- Iindex, Isearch, Iutil - the search engine
- Isearch-cgi - the CGI gateway to Isearch
- zclient, izclient, zping, zbatch - the Z39.50 clients
- zserver, zserverNT - the Z39.50 servers
- zcon & zgate - the WWW-to-Z39.50 gateway

5 Current Status of Isite
- MCNC/CNIDR funding from NSF is finished
  - Successful completion of 3-year grant
  - Jim Fullton, PI, is now at WIPO in Geneva
  - No additional support is anticipated
- Other projects are supporting customization
  - FGDC, US Dept. of Commerce, US Patent & Trademark Office, CEO, STScI, World Bank, BSn

6 Isite Strengths
- Powerful and flexible search engine
- Community-based development of a reference implementation
- Freely distributed and widely available for any use
- Source code included
- Powerful search engine interface
- Ported to Windows NT with threaded Z39.50 server

7 Isearch Features
- Full text search
- Search on text fields
- Search on numeric fields with appropriate relations (>, <, =)
- Search on date fields with appropriate relations (before, during, after)
- Search on geospatial bounding box
- Boolean searches
- Phrase searching
- Right truncation
- Proximity searching (within N characters)
- Case insensitive searching, punctuation ignored
- Configurable stopword list
- Customizable results presentation
- Relevance ranked scores
- Term weighting
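
Several of the features above (right truncation, proximity within N characters, case-insensitive matching with punctuation ignored, and a stopword list) can be illustrated with a small in-memory sketch. This is not Isearch's code or API: the function names, the stopword list, and the choice to measure proximity between the starting character offsets of two terms are all assumptions made for illustration.

```python
import re

# Hypothetical stopword list; Isearch's list is configurable.
STOPWORDS = {"the", "of", "a", "and"}

def tokenize(text):
    """Return (term, char_offset) pairs: lowercased, punctuation ignored,
    stopwords dropped."""
    return [(m.group().lower(), m.start())
            for m in re.finditer(r"[A-Za-z0-9]+", text)
            if m.group().lower() not in STOPWORDS]

def matches(term, pattern):
    """Right truncation: 'inform*' matches any term starting with 'inform'."""
    pattern = pattern.lower()
    if pattern.endswith("*"):
        return term.startswith(pattern[:-1])
    return term == pattern

def positions(tokens, pattern):
    """Character offsets of every token matching the pattern."""
    return [offset for term, offset in tokens if matches(term, pattern)]

def within(tokens, left, right, n_chars):
    """Proximity: True if some match of `left` and some match of `right`
    start no more than n_chars characters apart (an assumed definition)."""
    return any(abs(a - b) <= n_chars
               for a in positions(tokens, left)
               for b in positions(tokens, right))

doc = ("The Government Information Locator Service (GILS) "
       "describes information resources.")
tokens = tokenize(doc)

print(positions(tokens, "inform*"))                    # both spellings of "information"
print(within(tokens, "GILS", "locat*", 20))            # nearby terms
print(within(tokens, "government", "resources", 20))   # far-apart terms
```

A real engine would answer these queries from an inverted index built ahead of time (the role Iindex plays in Isite) rather than rescanning the text for every query.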

8 Isearch Document Types
- ASCII text
- USMARC records
- Electronic mail folders
- Usenet news archives
- US patents
- IAFA templates
- BibTeX
- Filenames
- First line in file
- SGML tagged fields
  - HTML
  - GILS templates
  - FGDC templates
- Colon delimited fields
  - GCMD DIF templates
- whois++ templates
- Multi-file documents
- Medline

9 Isite Weaknesses
- Modest Z39.50 implementation
  - needs GRS-1
  - better USMARC support
  - data structures
- All examples are console applications
- No real end-user applications
- No GUI
- Difficult configuration
- Requires programming for extensions
- Needs optimization & performance enhancement
- Needs more documentation

10 What The Future Holds For Isite
- New projects want (and will get):
  - Distributed document collections
  - Distributed searching
  - Automated information extraction (centroids, templates)
  - Searching and referrals
  - Additional Z39.50 support (lots of Z39.50 details are not supported now)

11 GILS and the Advanced Search Facility
- ASF is a US Dept. of Commerce project, to be built by Pilot Research, MCNC and A/WWW Enterprises
- “GILSnet” - a network of cooperative, low-impact, distributed nodes
- The basic interchange will be GILS templates
- Search on full text and GILS records

12 GILS, Dublin Core and Everyone Else
- Dublin Core is a minimal (15 fields) generic metadata scheme for virtually any kind of document
- GILS represents a more detailed approach, including most of DC, providing greater interoperability
- GILS is less bibliographically oriented than BIB-1
- GILS is lightweight compared to GEO and CIP (which have specific functional requirements)

13 What GILS Means To Me - 1
- Fewer fields
  - More documents
  - More metadata records
  - Skinnier metadata records
  - Easier abstraction
- More fields
  - Fewer documents
  - Fewer metadata records
  - Fatter metadata records
  - Less abstraction
- GILS is a good, general compromise

14 What GILS Means To Me - 2
- Think of the GILS profile as defining a language
  - At some level, Z39.50 is a detail
  - Protocols are about communication, profiles are about abstraction and GILS is about content
  - Z39.50 guarantees that the user’s query can be unambiguously decoded - no guarantees about content
  - We could implement the profile over any protocol - HTTP, CORBA, etc.
  - Does GILS have to use Z39.50? No, but the abstraction is required
  - Z39.50 already includes the abstraction model
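
The point that a profile could be carried over any protocol can be made concrete with a toy sketch: one abstract query, expressed only in terms the profile defines, rendered by two interchangeable encoders. Everything here is hypothetical, including the `ProfileQuery` class, the field name `Title`, and both wire formats; neither encoder reproduces Z39.50 or any real GILS gateway.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

@dataclass(frozen=True)
class ProfileQuery:
    """An abstract query stated in profile terms, independent of transport."""
    field: str     # abstract field name agreed in the profile (hypothetical)
    relation: str  # e.g. "=", "<", ">"
    term: str

def encode_http(query):
    """Toy encoding as an HTTP query string (assumed parameter names)."""
    return "?" + urlencode({"field": query.field,
                            "relation": query.relation,
                            "term": query.term})

def encode_prefix(query):
    """Toy prefix-notation encoding standing in for some other protocol."""
    return f'({query.relation} {query.field} "{query.term}")'

q = ProfileQuery("Title", "=", "water quality")
print(encode_http(q))    # ?field=Title&relation=%3D&term=water+quality
print(encode_prefix(q))  # (= Title "water quality")
```

Both encoders start from the same `ProfileQuery`, so the shared meaning lives in the abstraction and the wire format is replaceable, which is the slide's argument.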

15 Related Documents
- Getting Isite
  - ftp://ftp.cnidr.org/pub/software/Isite
  - ftp://ftp.clark.net/pub/warnock/Software (pre)
- A/WWW Enterprises
  - warnock@clark.net
  - http://www.clark.net/pub/warnock/awww.html
  - US Phone/FAX: 301-854-2987

