Published by Gordon Hancock. Modified over 9 years ago.
1
MCNC/CNIDR & A/WWW Enterprises
Introduction to CNIDR’s Isite
Jim Fullton - MCNC/CNIDR
Archie Warnock - A/WWW Enterprises
2
What is Isite?
- A freely available implementation of the Z39.50 search/retrieval protocol
- It includes a Unix-based server, a WWW gateway, a command-line client and a sophisticated text search engine
- ftp://ftp.cnidr.org/pub/NIDR.tools/Isite
- http://vinca.cnidr.org/software/Isite/Isite.html
3
What is Isearch?
- Isearch is the successor to freeWAIS
- Isearch is a sophisticated full-text search and retrieval system
- Isearch is a component of Isite, an implementation of the NISO standard protocol Z39.50 for information search and retrieval
- ftp://ftp.cnidr.org/pub/NIDR.tools/Isearch
- http://vinca.cnidr.org/software/Isearch/Isearch.html
4
System Components - I
- Iindex, the text indexer - builds a searchable version of the document collection
  - Implements fast word-based searching
  - Document parser - recognizes the start/end of individual documents
  - Field parser - recognizes the start/end of fields within individual documents
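The indexing steps above can be sketched in a few lines of Python. This is only an illustration of the idea (document parser, field parser, word index), not Iindex's actual code or file format. The `---` separator, the `name: value` field syntax and all function names are assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical delimiters for a toy format; real Iindex document types differ.
DOC_SEPARATOR = "\n---\n"
FIELD_PATTERN = re.compile(r"^(\w+):\s*(.*)$")

def parse_documents(text):
    """Document parser: split the raw collection into individual documents."""
    return [d.strip() for d in text.split(DOC_SEPARATOR) if d.strip()]

def parse_fields(document):
    """Field parser: return {field_name: field_text} for 'name: value' lines."""
    fields = {}
    for line in document.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            fields[match.group(1).lower()] = match.group(2)
    return fields

def build_index(text):
    """Word index: map (field, word) to the set of document numbers containing it."""
    index = defaultdict(set)
    for doc_id, doc in enumerate(parse_documents(text)):
        for field, value in parse_fields(doc).items():
            for word in re.findall(r"\w+", value.lower()):
                index[(field, word)].add(doc_id)
    return index

collection = (
    "title: Sparse matrix methods\nauthor: Smith"
    "\n---\n"
    "title: Network protocols\nauthor: Jones"
)
index = build_index(collection)
print(sorted(index[("title", "matrix")]))  # [0]
```

Keying the index on (field, word) pairs is one simple way to make the same structure serve both full-text and fielded lookups.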
5
System Components - II
- Isearch, the search engine - searches a document collection based on a user-supplied query
  - Command line search
    - Primarily used for testing
  - WWW gateway (using CGI)
    - End-user interface using forms
  - Z39.50 gateway
6
Isearch Capabilities
- Fast full-text search
  - US AIDS Patent Collection - can search ~250,000 patents in under 1 second
- Fielded search
  - Can restrict searches to title, author, abstract, other fields
- Relevance ranking
  - Search “hits” are assigned scores & sorted
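As a rough illustration of "assigned scores & sorted", the sketch below scores each document by how often the query terms occur in it and returns hits in descending score order. The slides do not say how Isearch computes its scores, so the term-count formula and the function names here are assumptions chosen for simplicity.

```python
import re

def score(document, query_terms):
    """Toy relevance score: total occurrences of the query terms in the document."""
    words = re.findall(r"\w+", document.lower())
    return sum(words.count(term.lower()) for term in query_terms)

def rank(documents, query_terms):
    """Return (score, document_number) pairs for matching documents, best first."""
    scored = [(score(doc, query_terms), i) for i, doc in enumerate(documents)]
    hits = [(s, i) for s, i in scored if s > 0]
    # Sort by descending score; ties keep the lower document number first.
    return sorted(hits, key=lambda hit: (-hit[0], hit[1]))

documents = ["patent for a vaccine", "vaccine vaccine trial", "unrelated text"]
print(rank(documents, ["vaccine"]))  # [(2, 1), (1, 0)]
```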
7
Isearch Capabilities
- Word truncation
  - A search for “matri*” matches “matrix” and “matrices”
- Boolean functions
  - AND, OR and ANDNOT combinations of different fields
- Customized presentation of results
- Phrase searching (coming soon)
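Word truncation and the three Boolean operators map naturally onto prefix matching and set operations. The sketch below assumes a hypothetical in-memory index from words to document numbers. It illustrates the semantics only and says nothing about how Isearch implements them.

```python
def expand(term, vocabulary):
    """Expand a right-truncated term such as 'matri*' to the matching indexed words."""
    if term.endswith("*"):
        prefix = term[:-1]
        return {word for word in vocabulary if word.startswith(prefix)}
    return {term} & set(vocabulary)

def lookup(term, index):
    """Return the set of documents containing any word the term expands to."""
    docs = set()
    for word in expand(term, index.keys()):
        docs |= index[word]
    return docs

# Hypothetical index: word -> set of document numbers.
index = {"matrix": {1}, "matrices": {2}, "matter": {3}, "sparse": {1, 3}}

matri = lookup("matri*", index)
sparse = lookup("sparse", index)

print(sorted(matri))           # [1, 2]
print(sorted(matri & sparse))  # AND    -> [1]
print(sorted(matri | sparse))  # OR     -> [1, 2, 3]
print(sorted(matri - sparse))  # ANDNOT -> [2]
```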
8
Isearch Customization
- What’s needed to customize Isearch?
  - Isearch is written in C++
  - Documents are C++ objects - data & procedures
    - Already have SGML & HTML, among others
  - Object technology allows code reusability, customizing only where differences from existing objects occur
9
Isearch Customization
- What’s needed to make arbitrary documents searchable?
  - Code to parse documents
  - Code to parse fields
  - Code to build brief and full result records
  - Yes, it requires programming
  - But many of these are derived from existing procedures
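The "derive from existing procedures" idea can be shown with a small class hierarchy. Isearch itself is written in C++. The Python below is only a sketch of the pattern, and the class and method names (`DocType`, `ColonTagged`, `brief_record`, and so on) are hypothetical, not Isearch's real class names.

```python
class DocType:
    """Generic document type: supplies default behavior for every step."""

    def parse_documents(self, text):
        # Default: documents are separated by blank lines.
        return [doc for doc in text.split("\n\n") if doc.strip()]

    def parse_fields(self, document):
        # Default: the whole document is one unnamed text field.
        return {"text": document}

    def brief_record(self, document):
        # Default: use the title field if present, else the first line.
        fields = self.parse_fields(document)
        return fields.get("title", document.splitlines()[0])

    def full_record(self, document):
        return document


class ColonTagged(DocType):
    """Derived type: only field parsing differs, everything else is inherited."""

    def parse_fields(self, document):
        fields = {}
        for line in document.splitlines():
            name, sep, value = line.partition(":")
            if sep:
                fields[name.strip().lower()] = value.strip()
        return fields


doc = "Title: Isite overview\nAuthor: Fullton"
print(DocType().brief_record(doc))      # Title: Isite overview
print(ColonTagged().brief_record(doc))  # Isite overview
```

The derived class overrides a single method and inherits document splitting and record building unchanged.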
10
Introduction to Z39.50
- Developed for search and retrieval
- Networked, client/server environment
- Tested by working information scientists (Z39.50 Implementor’s Group)
- Commercial & public domain support (Isite from CNIDR)
- http://www.ds.internic.net/z3950/z3950.html
11
Attribute Sets
- Attributes define how the query is specified
  - Use: field names
  - Relation: comparisons
  - Position: location in field
  - Structure: word, phrase, key, etc.
  - Truncation: left, right, none, etc.
  - Completeness: subfield/field
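One way to picture the six attribute types is as numbered slots attached to each query term. The sketch below uses the BIB-1 type numbers (Use = 1 through Completeness = 6). The example values (Use 4 for title, Truncation 1 for right truncation) are BIB-1 values as best I recall them and should be checked against the attribute set definition. The `QueryTerm` class and the `type=value` output notation are assumptions for illustration, not part of Isite.

```python
from dataclasses import dataclass, field

# BIB-1 attribute type numbers.
ATTRIBUTE_TYPES = {
    "use": 1,
    "relation": 2,
    "position": 3,
    "structure": 4,
    "truncation": 5,
    "completeness": 6,
}

@dataclass
class QueryTerm:
    """A search term plus its attributes (hypothetical helper class)."""
    term: str
    attributes: dict = field(default_factory=dict)  # type number -> value

    def with_attribute(self, type_name, value):
        self.attributes[ATTRIBUTE_TYPES[type_name]] = value
        return self

    def describe(self):
        # Illustrative notation only: "type=value" pairs followed by the term.
        pairs = " ".join(f"{t}={v}" for t, v in sorted(self.attributes.items()))
        return f"{pairs} {self.term}".strip()

# Title search (Use=4) with right truncation (Truncation=1); values assumed.
query = QueryTerm("matri").with_attribute("use", 4).with_attribute("truncation", 1)
print(query.describe())  # 1=4 5=1 matri
```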
12
Attributes & Element Sets
- Supported Attribute Sets
  - BIB-1
  - GILS
  - GEO
  - STAS
- Element Sets define retrievable sets of use attributes
  - Brief record
  - Full record
  - Summary record (GEO)
13
Record Syntaxes
- Z39.50 allows specification of a “Preferred Record Syntax” for results
  - SUTRS (unstructured text)
  - HTML
  - USMARC
  - GRS-1 (tagged, generalized syntax)
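A server-side view of the preferred record syntax might look like the sketch below: honor the client's preference when the server supports it, otherwise fall back to a default, then render the record accordingly. The fallback to SUTRS is an assumption made for the example (a real server might return a diagnostic instead), and the function names are hypothetical.

```python
import html

def choose_record_syntax(preferred, supported, default="SUTRS"):
    """Honor the client's preferred syntax if supported, else use the default."""
    return preferred if preferred in supported else default

def render(record, syntax):
    """Render a {field: value} record as HTML or as plain text (SUTRS-style)."""
    if syntax == "HTML":
        rows = "".join(
            f"<dt>{html.escape(name)}</dt><dd>{html.escape(value)}</dd>"
            for name, value in record.items()
        )
        return f"<dl>{rows}</dl>"
    # Unstructured text: one "name: value" line per field.
    return "\n".join(f"{name}: {value}" for name, value in record.items())

supported = {"SUTRS", "HTML"}
record = {"title": "Isite", "author": "CNIDR"}

syntax = choose_record_syntax("USMARC", supported)
print(syntax)                  # SUTRS
print(render(record, syntax))
print(render(record, choose_record_syntax("HTML", supported)))
```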
14
Profiles - GEO and Otherwise
- Profiles define allowed attributes and element sets
- Usually domain specific - ATS-1, GILS, WAIS, GEO, Digital Collections, Museum Collections
- Supported by external agreement between client & server (currently)
  - i.e., a GEO client talks to a GEO server
15
FGDC Enhancements
- Search Engine (Iindex/Isearch)
  - Field types (text, numeric, date, others)
  - Search in nested fields
  - Search in numeric fields
  - Date & date range searching
  - Spatial searching
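Date range and spatial searching both reduce to interval overlap tests in the simplest case. The sketch below assumes records carry a start/end date and a bounding box in decimal degrees. It ignores boxes that cross the 180° meridian, and the names used are hypothetical rather than taken from Isearch.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in decimal degrees (assumes west <= east, south <= north)."""
    west: float
    east: float
    south: float
    north: float

    def overlaps(self, other):
        # Two boxes overlap when their longitude and latitude intervals both overlap.
        return (
            self.west <= other.east
            and other.west <= self.east
            and self.south <= other.north
            and other.south <= self.north
        )

def date_ranges_overlap(start_a, end_a, start_b, end_b):
    """True when the closed date ranges [start_a, end_a] and [start_b, end_b] intersect."""
    return start_a <= end_b and start_b <= end_a

query_box = BoundingBox(west=-80.0, east=-75.0, south=34.0, north=37.0)
record_box = BoundingBox(west=-78.0, east=-70.0, south=36.0, north=40.0)
far_box = BoundingBox(west=10.0, east=20.0, south=40.0, north=50.0)

print(query_box.overlaps(record_box))  # True
print(query_box.overlaps(far_box))     # False
print(date_ranges_overlap(date(1994, 1, 1), date(1994, 12, 31),
                          date(1994, 6, 1), date(1995, 6, 1)))  # True
```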
16
FGDC Enhancements
- Z39.50 Implementation (ZDist)
  - Support for GEO attributes & element sets
  - GRS-1 record syntax
  - Support for additional (non-Isearch) search engines
  - Syntax to support nested queries
17
Outstanding Issues
- User Interface
  - What fields are searchable and how does the user indicate them?
  - How complex can the geographic queries be? Bounding box only? Complex regions?