1 Text information storage and retrieval and the CDS/ISIS program ***
Paul NIEUWENHUYSEN
University Library, Vrije Universiteit Brussel, Pleinlaan 2, B-1050 Brussel, Belgium
2 What is a database? ***
A database is a collection of similar data records stored in a common file (or collection of files).
3 Software type = information retrieval software ***
- Software for information storage and retrieval (ISR software)
- Text(-oriented) database management systems (Text-DBMS)
- Text information management systems (TIMS)
- Document retrieval systems
- Document management systems
4 Information retrieval: via a database to the user ***
[Diagram: Information content, Linear file, Inverted file, Database, Search engine, Search interface, User]
5 Information retrieval: the basic processes in search systems ***
[Diagram: Information problem, Text documents, Representation (twice), Evaluation and feedback, Query, Indexed documents, Comparison, Retrieved documents]
6 Information retrieval systems: many components make up a system ***
- Any retrieval system is built up of many more or less independent components.
- These components can be modified more or less independently to increase the quality of the results.
7 Information retrieval systems: important components ***
- the information content
- a system to describe formal aspects of information items
- a system to describe the subjects of information items
- concrete descriptions of information items (= application of the information description systems used)
- information storage and retrieval computer program(s)
- computer system used for retrieval
- type of medium or information carrier used for distribution
8 Information retrieval systems: the information content ***
- The information content is the information that is created or gathered by the producer.
- The information content is independent of software and of distribution media.
- The information content is input into the retrieval system using
  - a system (rules) to describe the formal aspects
  - a system (rules) to describe the contents (classification, thesaurus,...)
9 Information retrieval systems: media used for distribution ***
- Hard copy (for information retrieval systems only in the broad sense)
  - Print
  - Microfiche
- For computers (for information retrieval systems stricto sensu)
  - Magnetic tape
  - Floppy disk; optical disk (CD-ROM, CD-i, Photo-CD,...)
  - Online
10 Information retrieval systems: the computer program ***
The information retrieval program consists of several modules, including:
- The module that allows the creation of the inverted file(s) = index file(s) = dictionary file(s).
- The search engine, which provides the search features and power that allow the inverted file(s) to be searched.
- The interface between the system and the user, which determines how they (can) interact to search the database (using menus and/or icons and/or templates and/or commands).
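A minimal Python sketch of the first two modules listed above: building an inverted file from a linear file, then looking a term up in it. The sample records and the function name `build_inverted_file` are assumptions made for illustration; this is not how CDS/ISIS itself is implemented.

```python
from collections import defaultdict

def build_inverted_file(records):
    """Map each lower-cased word to the sorted list of record numbers containing it."""
    index = defaultdict(set)
    for number, text in records.items():
        for word in text.lower().split():
            index[word.strip(".,;:()")].add(number)
    return {term: sorted(numbers) for term, numbers in index.items()}

# Hypothetical linear file: record number -> text
linear_file = {
    1: "Text information storage and retrieval",
    2: "Thesaurus construction for retrieval systems",
}
inverted_file = build_inverted_file(linear_file)

# The "search engine" step is then a simple dictionary lookup.
print(inverted_file["retrieval"])   # [1, 2]
print(inverted_file["thesaurus"])   # [2]
```

The lookup never scans the record texts, which is why searching via an inverted file stays fast as the linear file grows.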
11 What determines the results of a search in a retrieval system? ***
- the information retrieval system (= contents + system)
- the user of the retrieval system and the search strategy applied to the system
Together these determine the result of a search.
12 Characteristics / definition of structured text-information ***
- The text information is structured (files, records, fields, sub-fields, links/relations among records,...).
- The length of records and fields can be “long”.
- Some fields are multi-valued, i.e. they occur more than once.
13 Layered structure of a database ***
- Database
- File
- Records
- Fields
- Characters
+ in many systems: relations / links between records
14 Structure of a bibliographic file ***
Record No. 1
- Title
- Author 1: name + first name
- Author 2: ...
- Source
- Descriptor 1
- Descriptor 2
Record No. 2
Labels in the diagram: Sub-fields (name + first name); Repeated (Author, Descriptor)
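The record structure above can be sketched as a small Python data structure, with authors and descriptors as repeatable fields and the author name split into sub-fields. The class names and all sample values are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Author:
    name: str        # sub-field: family name
    first_name: str  # sub-field: first name

@dataclass
class BibliographicRecord:
    number: int
    title: str
    source: str
    authors: list = field(default_factory=list)      # repeatable field
    descriptors: list = field(default_factory=list)  # repeatable field

# Hypothetical record No. 1
record = BibliographicRecord(
    number=1,
    title="An example title",
    source="An example journal, 1 (1990)",
    authors=[Author("Smith", "Anna"), Author("Jones", "Ben")],
    descriptors=["information retrieval", "databases"],
)
print(len(record.authors), len(record.descriptors))
```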
15 Thesaurus: description ***
- Thesaurus = system to control a vocabulary + the contents of this vocabulary
- Thesaurus program = program to create, manage, modify and/or search a thesaurus using a computer
16 Thesaurus relations ***
- BT (= Broader Term): term(s) with broader meaning
- NT (= Narrower Term): term(s) with narrower meaning
- RT (= Related Term): other term(s)
- UF (= Use For): synonym(s)
17 Thesaurus applications ***
- To find/choose index terms to add to items, when terms are taken from a controlled vocabulary
- To find more and/or better terms to search a database (to increase recall and precision)
- To find more and/or better terms during writing
- To understand the meaning of a term, by inspecting
  - the scope note of the term and/or
  - the relations with other terms
18 Thesaurus examples **-
- General systems / universal systems / on all subjects
  - Library of Congress Subject Headings (LCSH)
- Focused on a particular subject domain
  - ERIC
  - INSPEC
  - Medical Subject Headings (MeSH)
  - Psychological Abstracts / PsycInfo
  - Sociological Abstracts / SocioFile
19 Database systems: why study this subject briefly? ***
- To achieve a better understanding of the inner workings of the external information retrieval systems that you use, so that you can exploit these more efficiently
- To be able to evaluate the quality of database systems you are confronted with, so that you can
  - make better choices among available systems,
  - offer constructive suggestions to the manager,
  - ...
20 Database systems: why study this subject in detail? **-
- To acquire the knowledge and skills to create / set up / manage your own local database system on a computer
21 Database systems: definition ***
A database (management) system is a program or set of programs, providing a means by which a user can easily store and retrieve data in the form of “databases”.
22 Information retrieval software: related terms **-
- Software for information storage and retrieval (ISR software)
- Text(-oriented) database management systems (Text-DBMS)
- Text information management systems (TIMS)
- Document retrieval systems
- Document management systems
25 Cataloguing: hard copy versus computer-based **-
- Hard copy
  - “Input”, i.e. cataloguing, on cards directly determines the “output”, i.e. the format of the data on the card as presented to the user
  - Summarized: INPUT = OUTPUT
- Computer-based
  - Input in the database in fields allows later output in various formats for presentation
  - Summarized: 1. INPUT, 2. various OUTPUTs
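The "1. INPUT, 2. various OUTPUTs" idea can be sketched in Python: one record entered once in fields, then rendered in several presentation formats. The field names, sample values and the three formats are assumptions chosen for illustration.

```python
# Hypothetical record, input once in separate fields
record = {"author": "Smith, Anna", "title": "An example title", "year": "1990"}

def as_catalogue_card(rec):
    """Card-like layout: author on the first line, indented title and year below."""
    return f"{rec['author']}\n    {rec['title']}. {rec['year']}."

def as_short_citation(rec):
    """Family name and year only."""
    return f"{rec['author'].split(',')[0]} ({rec['year']})"

def as_tagged(rec):
    """One line per field, with the field name as a label."""
    return "\n".join(f"{name.upper()}: {value}" for name, value in rec.items())

for render in (as_catalogue_card, as_short_citation, as_tagged):
    print(render(record))
    print("---")
```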
26 Text-information management systems: characteristics and definition ***
The information in the database is text oriented. Therefore, several features are required:
- ability to store relatively long blocks of text
- ability to retrieve items in which specific words or terms occur anywhere
27 Text-information management: from free-form to structure **-
- Free-form text information without structure
- Text database with information structured in files, records, fields, sub-fields, with links/relations among records,... (Ideally, each field is repeatable = can be multi-valued = can occur more than once in each record.)
28 Text-information management: types of software ***
Software type: features
- Word processing software: must be learnt anyway; slow sequential searching.
- Free-form or structured text information database software: additional software to be purchased and learnt; fast searching via index(es).
29 Advantages of structured text-retrieval versus X-base systems **-
Features offered by text retrieval (Yes) but not by X-base systems (No):
- Many long fields, forming long records
- Repeatable fields
- Subfields
- Variable field lengths
- Fast searching of any word in all fields
- Thesaurus to help searching
30 Hierarchy in the use of a database ***
- Database structure
- Input / Editing
- Searching / Output
31 Functions of database management software ***
- Input / edit using keyboard or batch input
- Indexing of the database(s)
- Browse / Search / Select / Retrieve data from database
- Output (Sort / Display / Print to file / Print to paper) + Export / Import
32 The various formats of records in a database **-
- Format to facilitate retrieval = inverted file
- Input format = Edit format
- Internal format for long term storage
- Display format for output to display, printer, file
- Format for exchange purposes
33 Structure of records / Field tags **-
Field tags: examples from the Common Communication Format (supported by UNESCO):
- Title
- Author(s)
- Notes
- Abstract
34 !? Question !? Task !? Problem !? ***
Which advantages does a document management system on computer offer?
35 Advantages of a document system on computer, for the user(s) ***
- Access to information is easier.
- Access to information is faster.
- Online access is possible even when the centre is closed.
- Online access is possible from a distance.
- The search module can be integrated with data on loan status.
- More elements of the records can serve as search terms.
- Combinations of search terms can be used.
- Results / selections can be stored as computer files.
36 Advantages of a document system on computer, for the manager(s) **-
- Multiplication / distribution / exchange is easier.
- Available computer data can be input / incorporated.
- Global changes are easier.
- The system takes less physical space.
- Output to printer allows production of cards, listings,...
- Sorting of records in output is easier.
- The system is resistant to physical aging.
- Parts are better integrated (for instance books and loans).
- The system can offer statistical information.
37 Drawbacks of a document system on computer **-
- The costs of software and hardware can be high.
- Training related to computers is required.
- Evolution in computer applications should be considered.
- Some systems do not provide a backup.
38 Unesco’s involvement with information management **-
- Unesco - General Information Programme
  - Computer programs for information management
  - Standards (for instance CCF)
  - Various subject-oriented information projects
  - Libraries and archives
- Unesco’s other programmes and divisions
39 Tools for information management by Unesco **-
- IDAMS: DOS program for numeric data analysis
- CDS/ISIS: program for storage and retrieval of structured text-oriented information
- Interface between IDAMS and CDS/ISIS: to use both programs efficiently in one project
- Common Communication Format (CCF): guidelines on how to format a database
40 The IDAMS analysis program *--
- Software to create, manage and analyse local, in-house, number-oriented databases
- For DOS (or systems emulating DOS)
- Developed by an international team
- Distributed free of charge by the Unesco - General Information Programme (PGI)
- Detailed manual is available
41 The CDS/ISIS text database management program **-
- Software to create and manage local, in-house databases with primarily structured text as contents (NOT numbers, graphics, sound,...)
- Versions available for
  - Mainframes (IBM)
  - Minicomputers (Digital VAX)
  - Microcomputers (DOS)
42 Micro-CDS/ISIS: options on the original main menu *--
    Micro CDS/ISIS - Version 3.0
    C - Change data base
    L - Change dialogue language
    E - ISISENT - Data entry services
    S - ISISRET - Information retrieval services
    P - ISISPRT - Sorting and printing services
    I - ISISINV - Inverted file services
    D - ISISDEF - Data base definition services
    M - ISISXCH - Master file services
    U - ISISUTL - System utility services
    A - ISISPAS - Advanced programming services
    X - Exit (to MSDOS)
43 Micro-CDS/ISIS: original main menu on the display *--
44 Micro-CDS/ISIS running in Microsoft Windows *--
45 Micro-CDS/ISIS running in Microsoft Windows (full screen)
46 CDS/ISIS: general features *--
- Available for several operating systems
- Multi-user editing and searching in a network
- Unlimited number of databases can be stored
- No practical limitation in the number of records per database
- Multiple field-occurrences are possible
- Only a few limitations in a database structure
- Can be applied on CD-ROM
47 CDS/ISIS: input, indexing, searching, output *--
- More than one input worksheet can be applied
- Powerful word-, phrase- and field-indexing
- Powerful, fast searching
- Powerful in output formats
48 CDS/ISIS: positive non-technical characteristics *--
- Good, detailed manual in English, French,...
- Used in more than 4000 institutes
- Used internationally, worldwide
- National user-groups are active in many countries
- User interface available in English, French, Spanish, Arabic, Chinese,...
- Suitable database structures are available
- Free forum about CDS/ISIS by electronic mail
49 CDS/ISIS is available free of charge *--
- From national distributors approved by Unesco
- From subject-oriented distributors approved by Unesco
- From the Unesco - General Information Programme in Paris
- (From the secretariat of the Unesco - International Hydrological Programme in Paris, for water-related projects)
50 CDS/ISIS database structures *--
- example database structured according to CCF, on diskette, distributed by the Unesco - General Information Programme
- example database structured according to CCF, published in a guide by the Unesco - International Hydrological Programme
- databases and manuals by other organisations
51 Interface to link CDS/ISIS with IDAMS *--
To use both packages efficiently together, an interface program has also been developed by the Unesco - General Information Programme (PGI).
52 Important new features in CDS/ISIS version 3 *--
- More than one user can edit data at the same time in a computer network.
- User / program can call external, non-CDS/ISIS programs.
- Better Pascal programming language is included in CDS/ISIS.
53 CDS/ISIS and libraries: examples of applications *--
[Diagram: a network linking]
- A library automated with CDS/ISIS
- Central library providing bibliographic data for computers
- Another library automated with CDS/ISIS
- Smaller department libraries and documentation systems automated using CDS/ISIS
54 CDS/ISIS database structure *--
General      CDS/ISIS                          Limit
Database     Name < 5 characters               Unlimited
Record       MFN = Master File Number          16 million per database
Field        Tag = 1, 2, 3, ..., 999           250 per record
Subfield     ^a ^b ^c ...                      -
Character    ABC...abc, accented characters    8000 per record, 8000 per field
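The subfield notation `^a ^b ^c` shown above can be illustrated with a short Python sketch that splits a field value on the `^` delimiter. The function name and the sample field value are hypothetical, and this is only a sketch of the notation, not of CDS/ISIS internals.

```python
def split_subfields(field_value):
    """Split 'text^avalue^bvalue' into a dict keyed by subfield code."""
    head, *parts = field_value.split("^")
    subfields = {}
    if head:
        subfields[""] = head          # text before the first delimiter
    for part in parts:
        if part:                      # skip empty pieces produced by '^^'
            subfields[part[0]] = part[1:]
    return subfields

# Hypothetical field value with three subfields
print(split_subfields("^aParis^bUnesco^c1989"))
# {'a': 'Paris', 'b': 'Unesco', 'c': '1989'}
```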
55 CDS/ISIS database files, defined by the user *--
Name in DOS: full name
- .ANY: ANY file
- .FDT: Field Definition Table
- .FMT: Worksheet(s)
- .FST: Field Select Table(s)
- .PFT: Print Format(s)
- .STW: STOPword list
# used with 1 database: 1, > 1, = or > 1
Purpose: Searching, Structure, Input, Indexing, Output and sorting
56 CDS/ISIS database files: the database contents *--
Name in DOS: full name
- .MST: Master file
- .XRF: Cross reference file
- .IFP, .L01, .L02, .N01, .N02: B-tree index files
57 CDS/ISIS database files to change using a text editing program *--
Where dbn = data base name
- dbn.ANY: ANY file; # used with a database: 1; purpose: Searching
- dbn.STW: STOPword list; purpose: Indexing and sorting
58 CDS/ISIS database files, which can be changed using a text editing program *--
- .FDT: Field Definition Table; # used with a database: 1; purpose: Structure
- .FST: Field Select Table(s); # used with a database: 1,...; purpose: Indexing
- .PFT: Print (or display) Format(s); # used with a database: 1 or several; purpose: Output
59 Advantages of using CDS/ISIS with Windows (Part 1) *--
Multitasking in several windows: CDS/ISIS and other programs
- start CDS/ISIS from the program manager
- view CDS/ISIS and the file manager at the same time
- search or edit a CDS/ISIS database together with a thesaurus or classification scheme in another program
- switch easily between CDS/ISIS and a word processing program to produce output
- ...
60 Advantages of using CDS/ISIS with Windows (Part 2) *--
Multitasking in several windows: multiple instances of CDS/ISIS
- view several databases at the same time in several instances of CDS/ISIS running at the same time
- ...
61 Advantages of using CDS/ISIS with Windows (Part 3) *--
Copy and paste
- copy data from a document in a program for text processing, and paste into a CDS/ISIS database
- copy data from a CDS/ISIS database displayed on screen by CDS/ISIS, and paste into a document in another program
- copy data from one CDS/ISIS database displayed on screen, and paste into another database through the editing worksheet
- ...
62 Advantages of using CDS/ISIS with Windows (Part 4) *--
Associations of file name extensions with programs
- associate the following CDS/ISIS file name extensions with a program for word processing: .any, .fdt, .par, .pft, .stw
63 CDS/ISIS: some wishes of users (Part 1) *--
General aspects:
- better use of Microsoft Windows
- possibility to open and work with more than one file/database on screen simultaneously, and to exchange data among those open files
- client-server architecture for the database management system
- better availability of CDS/ISIS applications developed by other users (including additional ISIS-Pascal programs)
64 CDS/ISIS: some wishes of users (Part 2) *--
Database structure:
- records longer than 8 KBytes
- support of non-text fields (graphics, audio,...)
- better availability of database structures developed by users
Input:
- access to and copy from one or more authority files
- spell-check of database contents (language independent)
- direct import of non-ISO-structured ASCII files
65 CDS/ISIS: some wishes of users (Part 3) *--
Indexing:
- multiple inverted files
Searching:
- save search statements (queries) for future runs
Output:
- emphasis of search terms in the output on display and paper
- better use of the features of various printers
- direct output to PostScript printers
66 CDS/ISIS: some wishes of users (Part 4) *--
Interface with user:
- more help messages in context
- online tutorial
- support for mouse
67 CDS/ISIS: further development going on *--
- versions for various Unix computers
- version for Windows
- splitting into a client and a server package
68 CDS/ISIS database definition services: display menu *--
69 CDS/ISIS database modification services: display menu *--
70 CDS/ISIS database definition table: display of an example *--
71 Copying and renaming a CDS/ISIS database structure *--
- Copy: XCOPY (not COPY) all files, using DOS, or copy using Windows
- If required: rename FDT file, FMT file(s), PFT file(s), ANY file, FST file(s),...
- AND (!): change the first lines in xxxxx.FDT to make these agree with the new names of FST(s), FMT(s), PFT(s)
72 Limitations to the structure of a database **-
- What is the maximum number
  - of databases managed by the program?
  - of records in a database?
  - of characters in a record?
  - of fields in a record?
  - of characters in a field?
- Can fields contain subfields?
- Can the user define / modify the structure?
- Is the amount of memory taken on disk by each record minimal or fixed?
- ...
74 Types of field contents to control input **-
Type (Example)
- Alphanumeric (Title)
- Alphabetic (Country code)
- Numeric (Year)
- Pattern (Date)
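A minimal Python sketch of input control by field type, using one regular expression per type named on the slide. The rules themselves are assumptions made for illustration, in particular the date pattern YYYY-MM-DD; a real system would let the database designer define them.

```python
import re

# Hypothetical validation rules, one per field type named on the slide.
FIELD_TYPES = {
    "alphanumeric": re.compile(r".+", re.DOTALL),           # any non-empty text
    "alphabetic": re.compile(r"[A-Za-z]+"),                 # letters only
    "numeric": re.compile(r"[0-9]+"),                       # digits only
    "pattern": re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}"),   # assumed date pattern YYYY-MM-DD
}

def is_valid(field_type, value):
    """True if the whole value matches the rule for this field type."""
    return FIELD_TYPES[field_type].fullmatch(value) is not None

print(is_valid("alphabetic", "BE"))        # True
print(is_valid("numeric", "19x0"))         # False
print(is_valid("pattern", "1995-03-01"))   # True
```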
75 CDS/ISIS manual data entry, editing / input services: display menu *--
76 Manual inputting and editing using a keyboard **-
- Can more than 1 form / worksheet be used for a database?
- Are pre-fabricated database structures with input worksheets included and / or available?
- Can more than one user edit the same database at the same time?
77 Batch input / Import **-
- Is batch input possible?
- Is a format conversion program included or available?
- ...
78 Activities related to indexing **-
Activity: who does it? Concrete action
- Intellectual, human indexing: database producer / thesaurus producer. Attribute subject terms to records.
- Develop an automatic indexing method: database producer / software features. Making an index method file.
- Automatic indexing: computer with program. Making inverted file(s).
79 Indexes in books and databases: a comparison **-
Book (printed):
Index_term_1 page x1, y1, z1,...
Index_term_2 page x2, y2, z2,...
...
Database (invisible):
Index_term_1 record nr. x1 / field type nr. x1 / field occurrence x1 / position x1
             record nr. y1 / field type nr. y1 / field occurrence y1 / position y1
             ...
Index_term_2 record nr. x2 / field type nr. x2 / field occurrence x2 / position x2
             record nr. y2 / field type nr. y2 / field occurrence y2 / position y2
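The database side of this comparison can be sketched in Python: each index term points to postings made of record number, field tag, field occurrence and word position. The field tags 24 and 70 and the sample texts are hypothetical, and the actual on-disk layout of CDS/ISIS is not shown here.

```python
from collections import defaultdict, namedtuple

Posting = namedtuple("Posting", "record tag occurrence position")

def index_records(records):
    """records: {record_nr: {field_tag: [occurrence_text, ...]}}"""
    index = defaultdict(list)
    for record_nr, fields in records.items():
        for tag, occurrences in fields.items():
            for occ_nr, text in enumerate(occurrences, start=1):
                for pos, word in enumerate(text.lower().split(), start=1):
                    index[word].append(Posting(record_nr, tag, occ_nr, pos))
    return index

# Hypothetical records: tag 24 holds a title, tag 70 a repeatable author field
records = {
    1: {24: ["Water quality management"], 70: ["Smith", "Jones"]},
    2: {24: ["Quality of groundwater"]},
}
index = index_records(records)
for posting in index["quality"]:
    print(posting)
```

Every word occurrence produces its own posting, which also suggests why an index can grow large compared with the database it describes.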
80 Index in a text retrieval system (such as CDS/ISIS) **-
Terminology: Index = Inverted file = Dictionary
- database: dictionary on display
- database: complete inverted file
81 Methods of inverted file creation **-
- Word indexing
  + Simple / automatic / no indication required
  - Loss of word context
  + A field structure is not required
- Phrase indexing
  - Indication of phrases during input is required
  + Richer than separate words
- Field indexing
  + Context is better preserved
  - A field structure is required
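The three indexing methods can be contrasted in a short Python sketch. It assumes, for illustration only, that phrases are marked with angle brackets during input; the function names and sample texts are hypothetical.

```python
import re

def word_terms(text):
    """Word indexing: every word becomes an index term."""
    return re.findall(r"[a-z0-9]+", text.lower())

def phrase_terms(text):
    """Phrase indexing: only phrases marked as <...> during input are indexed."""
    return [phrase.lower() for phrase in re.findall(r"<([^<>]+)>", text)]

def field_terms(text):
    """Field indexing: the whole field content is one index term."""
    return [text.lower()]

field = "<Information retrieval> in <small libraries>"
print(word_terms(field))    # ['information', 'retrieval', 'in', 'small', 'libraries']
print(phrase_terms(field))  # ['information retrieval', 'small libraries']
print(field_terms("Vrije Universiteit Brussel"))  # ['vrije universiteit brussel']
```

Word indexing needs no markup but loses context; phrase indexing keeps "information retrieval" together at the cost of marking it during input.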
82 CDS/ISIS inverted file services: display menu *--
83 Automatic indexing (file inversion) **-
For each method: Possible? Obligatory?
- Word indexing? With proximity indexing?
- Field indexing?
- Sub-field indexing?
- Phrase indexing?
- Maximum length of index entry?
- List of stopwords available?
- Immediately after input or in batch? (Slow down...?)
- Indexing speed?
- Adding prefixes/tags possible?
- Modification of indexing possible?
84 !? Question !? Task !? Problem !? **-
Why can the index of a database be so large in comparison with the size of the database?
85 CDS/ISIS information retrieval services: display menu *--
86 CDS/ISIS information retrieval: example of a dictionary on the display *--
87 CDS/ISIS: features related to retrieval *--
- Browsing
- Dictionary > Searching / selecting
- Direct searching
+ mix of these methods
- Boolean operators: + (OR), * (AND), ^ (NOT)
- Previous search result: #n
- Field qualifier: search term / (n,m,...)
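Boolean combination of search terms can be sketched in Python as set operations on the record numbers that an inverted file returns for each term. The inverted file below is hypothetical, and the sketch illustrates the principle only, not the CDS/ISIS query parser.

```python
# Hypothetical inverted file: term -> set of record numbers
inverted_file = {
    "water": {1, 2, 5},
    "quality": {2, 3, 5},
    "soil": {3, 4},
}

def postings(term):
    """Record numbers for a term; an unknown term matches nothing."""
    return inverted_file.get(term, set())

any_of = postings("water") | postings("soil")       # OR
both = postings("water") & postings("quality")      # AND
without = postings("water") - postings("quality")   # NOT

print(sorted(any_of))   # [1, 2, 3, 4, 5]
print(sorted(both))     # [2, 5]
print(sorted(without))  # [1]
```

A stored result set such as `both` could be reused in a later query, which is the idea behind referring back to a previous search result.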
88 CDS/ISIS: an additional, user-friendly search interface program, Heurisko *--
- Offers a more limited but more user-friendly interface with drop-down menus,
  - to choose among available CDS/ISIS databases
  - to search the chosen database and to display selected records on the video display
  - to print search results / selections
- Has been available since 1993, free of charge from CDS/ISIS distributors, with a manual.
89 Interactive searching of a database / Retrieval **-
- Browse in index(es)? Select from index(es)?
- Combine search terms?
- Proximity operators? (Adjacency / Same paragraph / ...)
- Truncated search term(s)?
- Limit search to specific field(s)?
- Highlighting of search terms in selected records?
- Ranking of output?
- Speed?
- Save search strategy?
90 Output from a database to various “devices” **-
- to video display
- to printer
- to computer file (“printing” to a file)
91 CDS/ISIS output (sorting and printing) services: display menu *--
92 CDS/ISIS printing worksheet: display of an example *--
93 CDS/ISIS sorting worksheet: display of an example *--
94 Formatting of output from a database *--
Formatting aspects / levels, and the corresponding tool in CDS/ISIS:
- data in each record: .PFT file(s)
- lay-out of records on the printed page or in the output computer file: printing worksheet(s)
- sorting of records in output: sorting worksheet(s)
95 Formatting of data within each record in output **-
Independent of output device:
- Determine the sequence of the fields in each record.
- Omit specific fields from each record.
- Add field names or tags to the fields in each record.
- Indicate the search term(s) in each record.
Dependent on output device:
- Specify character formats in each (sub)field: typeface + size + bold/italic/underline
96 Sorting / arranging of records in the whole output **-
- Can the user determine the sequence of the records?
- Which elements can be used as a basis for sorting?
- Can stopwords be omitted as a basis for sorting?
- What is the maximum number of sort levels?
- Can the user choose between ascending or descending order?
- Can duplicate records be eliminated? (If yes: can the user determine the meaning of duplicate?)
- Can output formats (styles) be stored?
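Several of these sorting questions can be made concrete with a small Python sketch: two sort levels, a descending first level, and leading stopwords ignored in the second. The stopword list, field names and sample records are hypothetical.

```python
STOPWORDS = {"a", "an", "the"}  # hypothetical stopword list

def sort_key(title):
    """Drop leading stopwords and ignore case when sorting on a title."""
    words = title.lower().split()
    while words and words[0] in STOPWORDS:
        words.pop(0)
    return " ".join(words)

records = [
    {"year": 1990, "title": "The use of thesauri"},
    {"year": 1992, "title": "A guide to indexing"},
    {"year": 1990, "title": "An inverted file primer"},
]

# Two sort levels: year descending, then title ascending without stopwords.
ordered = sorted(records, key=lambda r: (-r["year"], sort_key(r["title"])))
for r in ordered:
    print(r["year"], r["title"])
```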
97 Advanced and experimental retrieval systems *--
- The system accords weights to terms in the database: frequency of occurrence in the database + ...
- The searcher accords weights to terms in his query: based on importance of the term
- Natural language interface between user and system: the system derives word stems + word meanings + ...
- Relevance feedback and query reformulation: the user assesses relevance and the system refines the query
- Dynamic user profile is a part of the system: the system understands the user and his query better
98 Additional programs for CDS/ISIS *--
[Diagram: CDS/ISIS; CDS/ISIS Pascal programming language compiler; Additional program(s); Source code of additional program(s) in CDS/ISIS Pascal]
99 Global modification program for CDS/ISIS *--
GMOD.PAS allows the modification of a string in a specific field of all records, throughout the whole CDS/ISIS database.
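The kind of global modification described here can be sketched in Python as a string replacement restricted to one field across all records. This is an illustration of the idea only and is not the code of GMOD.PAS; the function name, field names and sample data are hypothetical.

```python
def global_modify(database, field_name, old, new):
    """Replace `old` by `new` in one field of every record; return the number of records changed."""
    changed = 0
    for record in database:
        values = record.get(field_name, [])
        updated = [value.replace(old, new) for value in values]
        if updated != values:
            record[field_name] = updated
            changed += 1
    return changed

# Hypothetical database: each record maps a field name to its occurrences
database = [
    {"title": ["Data bases in libraries"], "notes": ["About data bases"]},
    {"title": ["Thesaurus construction"]},
]
changed = global_modify(database, "title", "Data bases", "Databases")
print(changed)               # 1
print(database[0]["title"])  # ['Databases in libraries']
```

Only the chosen field is touched: the same string in the notes field of the first record is left as it was.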
100 Thesaurus program module: purpose **-
- Does the database management program offer a thesaurus module which allows the user to create, modify, store, and delete relations between terms used in the database?
- This is mainly used to establish relations among controlled subject indexing terms.
- If more than one controlled vocabulary is used, these should be managed separately.
101 Structure of a thesaurus database record (Fields for “good” terms) **-
- “Good” term
- Controlled vocabulary to which the term belongs (if more than 1 is used in the same database)
- Scope note (= definition of the controlled term)
- Date of creation or modification of the term
- Notes
102 Structure of a thesaurus database record (Fields for relations) **-
- BT (= broader term): term(s) with broader meaning
- TT (= top term): term highest in the hierarchy
- NT (= narrower term): term(s) with narrower meaning
- RT (= related term): other term(s) related to this one
- UF (= use for): synonym(s)
103 Structure of a thesaurus database record (Fields for forbidden terms) **-
- Forbidden term
- US (= use instead): “good” term in the controlled vocabulary
104 Structure of a thesaurus database record (Fields for candidate terms) **-
- Candidate “good” term in the controlled vocabulary
- (Other fields as in the case of “good” terms)
105 Structure of a multilingual thesaurus database record **-
Each type of field in a thesaurus record occurs for each language.
106 Thesaurus program: desirable properties (Part 1) **-
- Multilingual user interface = menus and messages in more than 1 language
- Multilingual contents = terms in more than 1 language
- When a term in the thesaurus database is added, changed or deleted, the program automatically makes the corresponding changes throughout the whole thesaurus database, wherever that term occurs
- The program controls the creation of impossible (= forbidden) or undesirable relations
107 Thesaurus program: desirable properties (Part 2) **-
- Can the thesaurus contents be formatted and printed or sent to file?
- Can more than 1 thesaurus be managed, linked to the same database?
- Can a thesaurus database be used with more than 1 primary database?
- Can the program signal the presence of orphan terms (= terms without relation)?
108 Thesaurus program: integration with input/editing of the primary database **-
How simply and quickly can the user
- search the thesaurus during manual input/editing? (for instance to use it as an authority list)
- copy a term from a thesaurus and paste into a database record?
- copy a term from the database and paste into a thesaurus?
- ...
109 Thesaurus program: integration with searching of the primary database **-
- Can the user browse the thesaurus during a search in the database?
- Can the program automatically formulate a query, when the user selects terms in the thesaurus module?
- Does the program allow synonyms, narrower terms and broader terms to be included easily and quickly in a query?
- ...
110 Automatic creation, deletion or adaptation of the reciprocal relation **-
Does a change by the user of a relation in one record cause an automatic change by the thesaurus program of the reciprocal relation in the corresponding record of the thesaurus database? Examples:
- change of BT changes NT in the corresponding record
- change of NT changes BT in the corresponding record
- change of RT changes RT in the corresponding record
- change of UF changes US in the corresponding record
- change of US changes UF in the corresponding record
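The reciprocal updates listed above can be sketched in Python with a lookup table of inverse relations. The storage layout (a dict of dicts of sets), the function names and the sample terms are assumptions made for this sketch.

```python
# Inverse of each relation type, as listed in the examples
RECIPROCAL = {"BT": "NT", "NT": "BT", "RT": "RT", "UF": "US", "US": "UF"}

def add_relation(thesaurus, term, relation, target):
    """Store `term relation target` and the reciprocal relation in the target record."""
    thesaurus.setdefault(term, {}).setdefault(relation, set()).add(target)
    thesaurus.setdefault(target, {}).setdefault(RECIPROCAL[relation], set()).add(term)

def remove_relation(thesaurus, term, relation, target):
    """Delete a relation and its reciprocal, if present."""
    thesaurus.get(term, {}).get(relation, set()).discard(target)
    thesaurus.get(target, {}).get(RECIPROCAL[relation], set()).discard(term)

thesaurus = {}
add_relation(thesaurus, "dogs", "BT", "mammals")   # also stores: mammals NT dogs
add_relation(thesaurus, "dogs", "UF", "canines")   # also stores: canines US dogs
print(thesaurus["mammals"]["NT"])  # {'dogs'}
print(thesaurus["canines"]["US"])  # {'dogs'}
```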
111 Automatic control of the creation of impossible or undesirable relations **-
Does the thesaurus program avoid the creation of impossible or undesirable relations, or does it warn the user? Examples of this kind of relation:
- circular hierarchy (a NT b, b NT c, c NT a, or longer)
- circular synonym relation (a UF b, b UF a)
- iterative synonym relations (a US b, b US c, or longer)
- incomplete relations (a RT b, while b does not exist)
- term related to itself (for instance: a NT a)
- ...
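Two of these checks can be sketched in Python: detecting a circular hierarchy by following NT links, and listing incomplete RT relations. The data layout (a dict from term to a list of target terms) and the function names are hypothetical.

```python
def has_circular_hierarchy(narrower, start):
    """True if following NT links from `start` leads back to `start`."""
    stack, seen = list(narrower.get(start, ())), set()
    while stack:
        term = stack.pop()
        if term == start:
            return True
        if term not in seen:
            seen.add(term)
            stack.extend(narrower.get(term, ()))
    return False

def dangling_relations(related):
    """Pairs (a, b) where a RT b but b has no record of its own."""
    return [(a, b) for a, targets in related.items() for b in targets if b not in related]

narrower = {"a": ["b"], "b": ["c"], "c": ["a"]}
print(has_circular_hierarchy(narrower, "a"))        # True
print(has_circular_hierarchy({"a": ["b"]}, "a"))    # False
print(dangling_relations({"a": ["b"]}))             # [('a', 'b')]
```

The same traversal also catches a term related to itself, since `a NT a` leads back to the start in one step.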
112 Trilingual thesaurus program module for CDS/ISIS: properties *--Trilingual thesaurus program module for CDS/ISIS: properties It is an additional program written in the CDS/ISIS Pascal language. Usage is free of charge, as in the case of CDS/ISIS. Thesaurus database management is based on CDS/ISIS. The thesaurus program, as well as CDS/ISIS, offers a user interface in English, French, and Spanish. The contents of a thesaurus database are trilingual: each term in English, French, and Spanish (each one replaceable by another language).
113 Trilingual thesaurus program for CDS/ISIS: the relations among terms *--Trilingual thesaurus program for CDS/ISIS: the relations among terms The available relations are: US, UF, NT, BT, TT, RT. Each type of relation can occur an unlimited number of times in each record. After a change of a relation, the program automatically adapts the reciprocal relation in the corresponding thesaurus term records.
114 Trilingual thesaurus program for CDS/ISIS: control of relations *--Trilingual thesaurus program for CDS/ISIS: control of relations The program avoids the creation of some impossible or undesirable relations: circular synonym relation (a UF b, b UF a); iterative synonym relations (a US b, b US c, or longer); incomplete relations (a RT b, while b does not exist).
115 Trilingual thesaurus for CDS/ISIS: integration with searching *--Trilingual thesaurus for CDS/ISIS: integration with searching The user can browse the thesaurus during a search in the primary database. The program automatically formulates a query in the primary database when the user selects terms in the thesaurus module. The program lets the user include synonyms, narrower terms and broader terms in a query easily and quickly. The thesaurus database can be used for searching with more than 1 primary database.
116 Trilingual thesaurus program module for CDS/ISIS: further properties *--Trilingual thesaurus program module for CDS/ISIS: further properties In each record describing a term, a field for a scope note is present. A field for date of term creation is present. Several printout formats are included.
117 How to obtain the trilingual thesaurus program for CDS/ISIS? *--How to obtain the trilingual thesaurus program for CDS/ISIS? the national distributor in your country; UNESCO Headquarters, General Information Programme, 1 rue Miollis, Paris, France; ...
118 Trilingual thesaurus program module for CDS/ISIS: conclusions *--Trilingual thesaurus program module for CDS/ISIS: conclusions - Negative: not well integrated with the input/editing module of CDS/ISIS. + Positive: exceptionally interesting price/quality ratio.
119 Security / privacy / protection of databases **-Security / privacy / protection of databases Password for searching specific database(s) and/or fields and/or records; password for editing specific database(s) and/or fields and/or records; password for changing: database structure, input and modification worksheets, sort and print formats of data in records, sort and print formats of records in a selection.
120 Security / privacy / protection provided by DOS *--Security / privacy / protection provided by DOS DOS can make files: read-only; hidden.
121 Security / privacy / protection in CDS/ISIS *--Security / privacy / protection in CDS/ISIS SYSPAR.PAR file (entry 0) asks for a password, which can limit access to a particular: database; set of worksheets; set of menus; set of additional CDS/ISIS programs. Using the read-only version, named ISISCD.EXE, prevents modifications. Menus can be changed or removed to prevent access.
122 Passwords and usage tracking **-Passwords and usage tracking Does the use of passwords linked to users or user groups allow usage tracking by a systems manager? “Usage” = for instance, number and types of search and/or edit actions. This can be useful for studies and system management.
123 Data export in the case of CDS/ISIS *--Data export in the case of CDS/ISIS [Diagram: a CDS/ISIS database (database structure and contents); three export routes (copy of all database files; “Export” of data; “Print” data to file); three kinds of recipient (other CDS/ISIS user without the database; other CDS/ISIS user with the same database structure; other database management system).]
124 Manual versus batch import of data in a database **-Manual versus batch import of data in a database [Diagram: information items; manual input; batch input.]
125 Conversion and batch input in the case of a CDS/ISIS database *--Conversion and batch input in the case of a CDS/ISIS database File with database records in ASCII with field tags -> Fangorn program + conversion specification file -> File with records in format of the CDS/ISIS database -> Import module in CDS/ISIS -> Records in the CDS/ISIS database
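The conversion step in the chain above can be illustrated in general terms. The sketch below does not reproduce Fangorn, its conversion specification syntax, or the CDS/ISIS import format. It assumes a hypothetical input layout (one `LABEL: value` pair per line, blank line between records) and a hypothetical mapping from labels to numeric field tags, and it returns the records as Python dictionaries.

```python
def convert(text, tag_map):
    """Turn tagged ASCII records into dictionaries keyed by numeric field tag."""
    records, current = [], {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current record.
            if current:
                records.append(current)
                current = {}
            continue
        label, _, value = line.partition(":")
        tag = tag_map.get(label.strip())
        if tag is not None:
            # Repeated labels become repeated occurrences of the field.
            current.setdefault(tag, []).append(value.strip())
    if current:
        records.append(current)
    return records

# Hypothetical input and label-to-tag mapping, for illustration only.
text = (
    "AU: Smith, J.\nAU: Brown, A.\nTI: Text retrieval\n"
    "\n"
    "AU: Jones, B.\nTI: Thesauri\nXX: not mapped, so dropped\n"
)
tag_map = {"AU": 70, "TI": 24}
converted = convert(text, tag_map)
```

Keeping the mapping in a separate table mirrors the idea of a conversion specification file: the same program can serve many source formats.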
126 Format conversion program Fangorn *--Format conversion program Fangorn Authors: Besemer and Nieuwenhuysen. Available via anonymous ftp from: PCWS1.SCI.SNS.IT; ftp.vub.ac.be in the directory \pub\projects\Docinfo\paul\cursus\isis\…
127 *--Specification of a format conversion in the case of Fangorn for CDS/ISIS
128 !? Question !? Task !? Problem !? **- Which software packages for storage and retrieval of structured text do YOU know?
129 **-Examples Microcomputer software packages for structured text retrieval: examples: askSam, Bib-Search, CAIRS, Cardbox-Plus, CDS/ISIS, Headfast, IdeaList, Inmagic, Notes (Lotus / IBM), Personal Librarian, Pro-Cite, Reference Manager, Strix, STATUS, Topic (Verity), ...
130 !? Question !? Task !? Problem !? **- How can you use a word processing program together with a text retrieval system?
131 Word processing program to assist a retrieval program **-Word processing program to assist a retrieval program To polish text data before import into the database managed by the retrieval program; to inspect output to printer before real printing; to accept output from the retrieval program for further and better formatting, followed by printing.
132 Which benefits does a field structure offer to databases? **-!? Question !? Task !? Problem !? Which benefits does a field structure offer to databases?
133 Field structure in records: benefits concerning input **-Field structure in records: benefits concerning input The indication of fields in input worksheets guides the input. Default values can be assigned to fields, which can avoid errors and make input faster. The existence of fields allows control of the format of the contents of each specific field during input. ...
134 Field structure in records: benefits concerning searching **-Field structure in records: benefits concerning searching User can limit search to specific fields. Field type adds information to contents. Field-indexing keeps data together in index. ...
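The first two benefits above can be shown with a small inverted file keyed by field and word. This is a minimal sketch under assumptions: the record layout, the field names, and the functions `build_index` and `search` are hypothetical and unrelated to how CDS/ISIS builds its inverted file.

```python
from collections import defaultdict

def build_index(records):
    """Inverted file: (field, word) -> set of record numbers."""
    index = defaultdict(set)
    for record_id, record in records.items():
        for field, value in record.items():
            for word in value.lower().split():
                index[(field, word)].add(record_id)
    return index

def search(index, word, field=None):
    """Search all fields, or only `field` when one is given."""
    word = word.lower()
    if field is not None:
        return set(index.get((field, word), set()))
    return set().union(*(ids for (f, w), ids in index.items() if w == word))

# Hypothetical records: "Brown" is an author in one, a title word in the other.
records = {
    1: {"author": "Brown", "title": "Text retrieval"},
    2: {"author": "Smith", "title": "Brown bears"},
}
index = build_index(records)
anywhere = search(index, "brown")
as_author = search(index, "brown", field="author")
```

Without fields, both records match "brown"; limiting the search to the author field removes the record about bears, which is the sense in which the field type adds information to the contents.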
135 Field structure in records: benefits concerning output **-Field structure in records: benefits concerning output Field structure makes output easier to understand. In output, each field can be indicated with tag/prefix. Records can be sorted based on contents of a field. In output, the fields can be sorted in each record. In output, some fields can be omitted. ...
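The output benefits above (prefixes, sorting on a field, field order, omitted fields) can be sketched in a few lines. The function name `format_records`, the field names, and the sample records are assumptions for illustration; the sketch is not a CDS/ISIS print format.

```python
def format_records(records, sort_field, show_fields):
    """Sort records on one field and print only the chosen fields, each prefixed."""
    lines = []
    for record in sorted(records, key=lambda r: r.get(sort_field, "")):
        # The order of `show_fields` fixes the order of fields in the output.
        for field in show_fields:
            if field in record:
                lines.append(f"{field}: {record[field]}")
        lines.append("")  # blank line between records
    return "\n".join(lines).rstrip()

# Hypothetical records, for illustration only.
records = [
    {"author": "Smith", "title": "Thesauri", "notes": "internal remark"},
    {"author": "Brown", "title": "Text retrieval"},
]
listing = format_records(records, sort_field="author", show_fields=("title", "author"))
```

Here the records are sorted by author, the title is printed before the author, and the notes field is omitted simply by leaving it out of `show_fields`.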
136 !? Question !? Task !? Problem !? **- Besides all the benefits offered by a field structure in a database, which problems does this cause?
137 Field structure in records: problems (Part 1) **-Field structure in records: problems (Part 1) In the short term, a field structure is more expensive and time-consuming than handling less structured data. Initially, the database manager who wants to create a new database has to make decisions: which fields to create to subdivide the database records; which field tags or names to use for the internal housekeeping of the database by the chosen database management software package.
138 Field structure in records: problems (Part 2) **-Field structure in records: problems (Part 2) The exchange of data, i.e. importing into one database data that have been exported from another, is hindered when the database structures are not identical or compatible. ...
139 Exchange formats and standards for text database systems ***Exchange formats and standards for text database systems Usage and aims: to allow efficient exchange of information among databases without loss of structural information; to guide database managers in the creation of a database structure (records divided in fields and subfields). Examples (MARC = machine-readable cataloguing): LC-MARC (= Library of Congress MARC); UNIMARC; Common Communication Format (of UNESCO); SGML.
140 Common Communication Format (CCF): description **-Common Communication Format (CCF): description Developed by the Unesco - General Information Programme for international application. Includes a system of numeric tags indicating: the location of fields and subfields in the records; the meaning of the fields and subfields.
141 Common Communication Format (CCF): availability **-Common Communication Format (CCF): availability Published and made available free of charge by the Unesco - General Information Programme: printed manuals; printed implementation notes; example CDS/ISIS database structured according to the Common Communication Format.
142 Exchange of data among systems: requirements **-Exchange of data among systems: requirements Subject thesaurus (relation-structure + contents); subject classification scheme + level of usage; contents of fields (and subfields) in the records (in the case of bibliographic databases: cataloguing input rules); database structure: records, fields, subfields,... as seen by the database manager; version of the program for database management; type of program for database management; alphabet used for the data.
143 Compatibility among databases: an example: Library of Congress Subject Headings (LCSH) (a thesaurus); Universal Decimal Classification (UDC); Anglo-American Cataloguing Rules (AACR); Common Communication Format (CCF); Version 3.0; CDS/ISIS program; Extension of ASCII by IBM; ISO standard for record storage!