MCNC/CNIDR & A/WWW Enterprises Introduction to CNIDR’s Isite Jim Fullton - MCNC/CNIDR Archie Warnock - A/WWW Enterprises.

Presentation transcript:

1 Introduction to CNIDR’s Isite
Jim Fullton - MCNC/CNIDR
Archie Warnock - A/WWW Enterprises

2 What is Isite?
- A freely available implementation of the Z39.50 search/retrieval protocol
- It includes a Unix-based server, a WWW gateway, a command-line client and a sophisticated text search engine
- ftp://ftp.cnidr.org/pub/NIDR.tools/Isite
- http://vinca.cnidr.org/software/Isite/Isite.html

3 What is Isearch?
- Isearch is the successor to freeWAIS
- Isearch is a sophisticated full-text search and retrieval system
- Isearch is a component of Isite, an implementation of the NISO standard protocol Z39.50 for information search and retrieval
- ftp://ftp.cnidr.org/pub/NIDR.tools/Isearch
- http://vinca.cnidr.org/software/Isearch/Isearch.html

4 System Components - I
- Iindex, the Text Indexer - builds searchable version of the document collection
  - Implements fast word-based searching
  - Document parser - recognize start/end of individual documents
  - Field parser - recognize start/end of fields within individual documents
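The indexing steps above can be sketched in a few lines. This is only an illustration of the idea, not Iindex's actual code or file format: the `%%` document separator, the `name: value` field layout, and the function names are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical input format (not Iindex's own): documents are separated
# by "%%" and each field is a line of the form "name: value".
DOC_SEPARATOR = "%%"


def parse_documents(text):
    """Document parser: split the raw collection into individual documents."""
    chunks = [chunk.strip() for chunk in text.split(DOC_SEPARATOR)]
    return [chunk for chunk in chunks if chunk]


def parse_fields(doc):
    """Field parser: return {field_name: field_text} for one document."""
    fields = {}
    for line in doc.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip().lower()] = value.strip()
    return fields


def build_index(text):
    """Word-based index: map (field, word) to the set of document numbers."""
    index = defaultdict(set)
    for doc_id, doc in enumerate(parse_documents(text)):
        for field, value in parse_fields(doc).items():
            for word in re.findall(r"[a-z0-9]+", value.lower()):
                index[(field, word)].add(doc_id)
    return index


collection = """title: Sparse matrix methods
author: Smith
%%
title: Patent search systems
author: Jones
"""
index = build_index(collection)
print(sorted(index[("title", "matrix")]))
```

Looking a word up is then a single dictionary access, which is what makes word-based searching fast once the index has been built.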

5 System Components - II
- Isearch, the Search engine - searches a document collection based on user-supplied query
  - Command line search
    - Primarily used for testing
  - WWW gateway (using CGI)
    - End-user interface using forms
  - Z39.50 gateway

6 Isearch Capabilities
- Fast full-text search
  - US AIDS Patent Collection - can search ~250,000 patents in < 1 second
- Fielded search
  - Can restrict searches to title, author, abstract, other fields
- Relevance ranking
  - Search “hits” are assigned scores & sorted
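To make "assigned scores & sorted" concrete, here is a minimal sketch that scores each document by how often the query terms occur in it. The slides do not say how Isearch computes its scores, so plain term counting is a stand-in chosen for the example, and the function names are hypothetical.

```python
import re
from collections import Counter


def score(document, query_terms):
    """Score a document as the total number of query-term occurrences."""
    counts = Counter(re.findall(r"[a-z0-9]+", document.lower()))
    return sum(counts[term.lower()] for term in query_terms)


def ranked_hits(documents, query_terms):
    """Return (score, doc_id) pairs for matching documents, best first."""
    scored = [(score(doc, query_terms), doc_id)
              for doc_id, doc in enumerate(documents)]
    hits = [pair for pair in scored if pair[0] > 0]
    # Highest score first; ties are broken by document number.
    return sorted(hits, key=lambda pair: (-pair[0], pair[1]))


docs = ["patent search", "search engine for patent search", "weather report"]
print(ranked_hits(docs, ["search", "patent"]))  # [(3, 1), (2, 0)]
```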

7 Isearch Capabilities
- Word truncation
  - search for “matri*” matches “matrix” and “matrices”
- Boolean functions
  - AND, OR and ANDNOT combinations of different fields
- Customized presentation of results
- Phrase searching (coming soon)
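Truncation and the Boolean functions can be illustrated with set operations over a toy index. The index contents and function names below are made up for the example, and `fnmatchcase` from the Python standard library stands in for whatever matching Isearch performs internally (it also treats `?` and `[...]` as wildcards, which the slide does not mention).

```python
from fnmatch import fnmatchcase

# Toy inverted index: (field, word) -> set of document numbers (made-up data).
INDEX = {
    ("title", "matrix"): {0},
    ("title", "matrices"): {1},
    ("title", "patent"): {2},
    ("author", "smith"): {0, 2},
}


def search(field, term):
    """Return documents whose field holds a word matching term ('*' truncates)."""
    hits = set()
    for (indexed_field, word), docs in INDEX.items():
        if indexed_field == field and fnmatchcase(word, term.lower()):
            hits |= docs
    return hits


def op_and(a, b):
    return a & b


def op_or(a, b):
    return a | b


def op_andnot(a, b):
    return a - b


truncated = search("title", "matri*")        # {0, 1}
by_smith = search("author", "smith")         # {0, 2}
print(op_and(truncated, by_smith))           # {0}
print(op_andnot(truncated, by_smith))        # {1}
```

Because each operand is the result of a fielded search, combining them with set operations gives the "combinations of different fields" mentioned above.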

8 Isearch Customization
- What’s needed to customize Isearch?
  - Isearch is written in C++
  - Documents are C++ objects - data & procedures
    - Already have SGML & HTML, among others
  - Object technology allows code reusability, customizing only where differences from existing objects occur

9 Isearch Customization
- What’s needed to make arbitrary documents searchable?
  - Code to parse documents
  - Code to parse fields
  - Code to build brief and full result records
  - Yes, it requires programming
  - But, many of these are derived from existing procedures
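The pattern described on these two slides, deriving a new document type and overriding only what differs, might look roughly like this. Isearch itself is written in C++; the sketch uses Python for brevity, and the class and method names are hypothetical rather than Isearch's real class hierarchy.

```python
import re


class DocType:
    """Base document type: blank-line-separated documents, no fields."""

    def parse_documents(self, text):
        return [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]

    def parse_fields(self, doc):
        return {}

    def brief_record(self, doc):
        # Default brief record: the first line of the document.
        return doc.splitlines()[0]

    def full_record(self, doc):
        return doc


class ColonTagged(DocType):
    """Derived type: overrides only field parsing and the brief record."""

    def parse_fields(self, doc):
        fields = {}
        for line in doc.splitlines():
            name, sep, value = line.partition(":")
            if sep:
                fields[name.strip().lower()] = value.strip()
        return fields

    def brief_record(self, doc):
        # Prefer the title field; fall back to the inherited behaviour.
        return self.parse_fields(doc).get("title", super().brief_record(doc))


text = "Title: Isite overview\nAuthor: Fullton\n\nTitle: Isearch notes\n"
doctype = ColonTagged()
documents = doctype.parse_documents(text)   # inherited, unchanged
print([doctype.brief_record(d) for d in documents])
```

Document splitting and the full record come from the base class untouched, so the new type needs code only for the parts of the format that actually differ.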

10 Introduction to Z39.50
- Developed for search and retrieval
- Networked, client/server environment
- Tested by working information scientists (Z39.50 Implementor’s Group)
- Commercial & public domain support (Isite from CNIDR)
- http://www.ds.internic.net/z3950/z3950.html

11 Attribute Sets
- Attributes define how the query is specified
  - Use: field names
  - Relation: comparisons
  - Position: location in field
  - Structure: word/phrase/key/ etc
  - Truncation: left/right/none/ etc
  - Completeness: subfield/field
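As a rough illustration of how the six attribute types qualify a search term, the sketch below attaches numeric attributes to a term and prints them in the prefix notation (`@attr type=value`) used by some Z39.50 toolkits. The type numbers 1 to 6 follow the Bib-1 ordering listed above; the specific values in the example (Use 4 for title, Truncation 1 for right truncation) are Bib-1 values as I recall them and should be checked against the attribute set definition. The function name is hypothetical, and the notation is used here only for display, not as something Isite itself accepts.

```python
# Bib-1 attribute type numbers, in the order listed on the slide.
ATTRIBUTE_TYPES = {
    "use": 1,
    "relation": 2,
    "position": 3,
    "structure": 4,
    "truncation": 5,
    "completeness": 6,
}


def attributes_to_prefix(term, **attrs):
    """Render a single-word term plus its attributes as '@attr type=value ... term'."""
    ordered = sorted(attrs.items(), key=lambda item: ATTRIBUTE_TYPES[item[0]])
    parts = [f"@attr {ATTRIBUTE_TYPES[name]}={value}" for name, value in ordered]
    parts.append(term)
    return " ".join(parts)


# Assumed Bib-1 values: Use 4 = title, Truncation 1 = right truncation.
print(attributes_to_prefix("matri", truncation=1, use=4))
```

Under those assumed values this prints `@attr 1=4 @attr 5=1 matri`, the attribute form of a title search for “matri*”.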

12 Attributes & Element Sets
- Supported Attribute Sets
  - BIB-1
  - GILS
  - GEO
  - STAS
- Element Sets define retrievable sets of use attributes
  - Brief record
  - Full record
  - Summary record (GEO)

13 Record Syntaxes
- Z39.50 allows specification of a “Preferred Record Syntax” for results
  - SUTRS (unstructured text)
  - HTML
  - USMARC
  - GRS-1 (tagged, generalized syntax)
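The effect of a preferred record syntax can be shown by rendering the same record two ways. This sketch covers only the presentation side for SUTRS-style plain text and HTML; the record layout and function names are assumptions for the example, and nothing here models the Z39.50 protocol exchange or the USMARC and GRS-1 encodings.

```python
from html import escape


def as_sutrs(record):
    """Render a record as unstructured text, one 'name: value' line per field."""
    return "\n".join(f"{name}: {value}" for name, value in record.items())


def as_html(record):
    """Render the same record as an HTML definition list."""
    rows = "".join(
        f"<dt>{escape(name)}</dt><dd>{escape(value)}</dd>"
        for name, value in record.items()
    )
    return f"<dl>{rows}</dl>"


# Hypothetical lookup from the requested syntax name to a formatter.
FORMATTERS = {"SUTRS": as_sutrs, "HTML": as_html}


def present(record, preferred_syntax):
    """Format a record in the syntax the client asked for."""
    return FORMATTERS[preferred_syntax](record)


record = {"Title": "Maps & Charts", "Author": "Warnock"}
print(present(record, "SUTRS"))
print(present(record, "HTML"))
```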

14 Profiles - GEO and Otherwise
- Profiles define allowed attributes and element sets
- Usually domain specific - ATS-1, GILS, WAIS, GEO, Digital Collections, Museum Collections
- Supported by external agreement between client & server (currently)
  - i.e., a GEO client talks to a GEO server
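A profile can be thought of as a named pair of allow-lists that both client and server agree on in advance. The sketch below models that idea only; the attribute and element set names placed in the example profile are illustrative, not the actual contents of the GEO profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """A named agreement on which attribute sets and element sets are allowed."""
    name: str
    attribute_sets: frozenset
    element_sets: frozenset

    def allows(self, attribute_set, element_set):
        """True if a request using these sets stays within the profile."""
        return (attribute_set in self.attribute_sets
                and element_set in self.element_sets)


# Illustrative contents only; consult the real profile for the actual lists.
GEO_PROFILE = Profile(
    name="GEO",
    attribute_sets=frozenset({"BIB-1", "GEO"}),
    element_sets=frozenset({"brief", "full", "summary"}),
)

print(GEO_PROFILE.allows("GEO", "summary"))   # True
print(GEO_PROFILE.allows("STAS", "brief"))    # False
```

Since the agreement is external, a client built around one profile has no way to discover at run time what a server for a different profile allows, which is why a GEO client is paired with a GEO server.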

15 FGDC Enhancements
- Search Engine (Iindex/Isearch)
  - Field types (text, numeric, date, others)
  - Search in nested fields
  - Search in numeric fields
  - Date & Date Range Searching
  - Spatial Searching
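Date-range and spatial searching both reduce to interval overlap tests. The sketch below shows the simplest form of each, with a bounding box as the spatial region. The class and function names are hypothetical, this is not Isearch's implementation, and boxes that cross the 180° meridian are deliberately not handled.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in decimal degrees; must not cross the 180° meridian."""
    west: float
    south: float
    east: float
    north: float

    def overlaps(self, other):
        """True if the two boxes share at least one point."""
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


def date_ranges_overlap(start_a, end_a, start_b, end_b):
    """True if the closed date ranges [start_a, end_a] and [start_b, end_b] intersect."""
    return start_a <= end_b and start_b <= end_a


north_carolina = BoundingBox(west=-84.3, south=33.8, east=-75.5, north=36.6)
query_box = BoundingBox(west=-80.0, south=35.0, east=-78.0, north=37.0)
print(north_carolina.overlaps(query_box))  # True

print(date_ranges_overlap(date(1994, 1, 1), date(1994, 12, 31),
                          date(1995, 1, 1), date(1995, 6, 30)))  # False
```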

16 FGDC Enhancements
- Z39.50 Implementation (ZDist)
  - Support for GEO attributes & element sets
  - GRS-1 record syntax
  - Support for additional (non-Isearch) search engines
  - Syntax to support nested query

17 Outstanding Issues
- User Interface
  - What fields are searchable and how does the user indicate them?
  - How complex can the geographic queries be? Bounding box only? Complex regions?
