Download presentation
Presentation is loading. Please wait.
Published byDina Parker Modified over 10 years ago
1
Text information storage and retrieval and the CDS/ISIS program
*** Text information storage and retrieval and the CDS/ISIS program Paul NIEUWENHUYSEN University Library, Vrije Universiteit Brussel, Pleinlaan 2, B-1050 Brussel, Belgium
2
*** What is a database? A database is a collection of similar data records stored in a common file (or collection of files).
3
Software type = information retrieval software
*** Software type = information retrieval software Software for information storage and retrieval (ISR software) Text(-oriented) database management systems (Text-DBMS) Text information management systems (TIMS) Document retrieval systems Document management systems
4
Information retrieval: via a database to the user
*** Information retrieval: via a database to the user Information content Linear file Inverted file Database Search engine Search interface User
5
Information retrieval: the basic processes in search systems
*** Information retrieval: the basic processes in search systems Information problem Text documents Representation Representation Evaluation and feedback Query Indexed documents Comparison Retrieved documents
6
Information retrieval systems: many components make up a system
*** Information retrieval systems: many components make up a system Any retrieval system is built up of many more or less independent components. These components can be modified to increase the quality of the results more or less independently.
7
Information retrieval systems: important components
*** Information retrieval systems: important components the information content system to describe formal aspects of information items system to describe the subjects of information items concrete descriptions of information items = application of the used information description systems information storage and retrieval computer program(s) computer system used for retrieval type of medium or information carrier used for distribution
8
Information retrieval systems: the information content
*** Information retrieval systems: the information content The information content is the information that is created or gathered by the producer. The information content is independent of software and of distribution media. The information content is input into the retrieval system using a system (rules) to describe the formal aspects a system (rules) to describe the contents (classification, thesaurus,...)
9
Information retrieval systems: media used for distribution
*** Information retrieval systems: media used for distribution Hard copy (for information retrieval systems only in the broad sense) Print Microfiche For computers: (for information retrieval systems strictu sensu) Magnetic tape Floppy disk; optical disk (CD-ROM, CD-i, Photo-CD,...) Online
10
Information retrieval systems: the computer program
*** Information retrieval systems: the computer program The information retrieval program consists of several modules, including: The module that allows the creation of the inverted file(s) = index file(s) = dictionary file(s). The search engine provides the search features and power that allow the inverted file(s) to be searched. The interface between the system and the user determines how they (can) interact to search the database (using menus and/or icons and/or templates and/or commands).
11
What determines the results of a search in a retrieval system?
*** What determines the results of a search in a retrieval system? the information retrieval system ( = contents + system) the user of the retrieval system and the search strategy applied to the system Result of a search
12
Characteristics / definition of structured text-information
*** Characteristics / definition of structured text-information The text information is structured. (files, records, fields, sub-fields, links/relations among records,...) The length of records and fields can be “long”. Some fields are multi-valued, i.e. they occur more than once.
13
Layered structure of a database
*** Layered structure of a database Database File Records Fields Characters + in many systems: relations / links between records
14
Structure of a bibliographic file
*** Structure of a bibliographic file Record No. 1 Title Author 1: name + first name Author 2: ... Source Descriptor 1 Descriptor 2 Record No. 2 Sub- fields Repeated
15
Thesaurus: description
*** Thesaurus: description Thesaurus = system to control a vocabulary + the contents of this vocabulary Thesaurus program = program to create, manage, modify and/or search a thesaurus using a computer
16
Thesaurus relations *** Term(s) with broader meaning
BT (= Broader Term) RT (Related Term) UF (= Use For) Other term(s) Term Synonym(s) NT (= Narrower Term) Term(s) with narrower meaning
17
Thesaurus applications
*** Thesaurus applications To find/choose index terms to add these to items, when terms are taken from a controlled vocabulary To find more and/or better terms to search a database (to increase recall and precision) To find more and/or better terms during writing To understand the meaning of a term, by inspecting the scope note of the term and/or the relations with other terms
18
Thesaurus examples **-Examples
General systems / universal systems / on all subjects Library of Congress Subject Headings (LCSH) Focused on a particular subject domain ERIC INSPEC Medical Subject Headings (MeSH) Psychological Abstracts / PsycInfo Sociological Abstracts / SocioFile
19
Database systems: why study this subject briefly ?
*** Database systems: why study this subject briefly ? To achieve a better understanding of the inner workings of the external information retrieval systems that you use, so that you can exploit these more efficiently To be able to evaluate the quality of database systems you are confronted with, so that you can make better choices among available systems, offer constructive suggestions to the manager, ...
20
Database systems: why study this subject in detail?
**- Database systems: why study this subject in detail? To acquire the knowledge and skills to create / set up / manage your own local database system on a computer
21
Database systems: definition
*** Database systems: definition A database (management) system is a program or set of programs, providing a means by which a user can easily store and retrieve data in the form of “databases”.
22
Information retrieval software: related terms
**- Information retrieval software: related terms Software for information storage and retrieval (ISR software) Text(-oriented) database management systems (Text-DBMS) Text information management systems (TIMS) Document retrieval systems Document management systems
23
Information retrieval software: applications (Part 1)
**- Information retrieval software: applications (Part 1) Documentation centres Archives Libraries Musea Medical files Marketing departments Schools Bibliographic databases Documents Archived documents Books / Documents Objects / Books / ... Patient’s histories Clients / Potential clients Courses / Teachers Publications / ...
24
Information retrieval software: applications (Part 2)
**- Information retrieval software: applications (Part 2) Meeting calendars Product information Laboratories Personal documentation Patent office Co-operating information networks ... Meetings = conferences Product descriptions Recipes Documents Patents Documents / Persons / Institutes / Events / ...
25
Cataloguing: hard copy versus computer-based
**- Cataloguing: hard copy versus computer-based Hard copy “Input” , i.e. cataloguing, on cards determines directly the “ouput”, i.e. the format of the data on the card as presented to the user Summarized: INPUT=OUTPUT Computer-based Input in the database in fields allows later output in various formats for presentation Summarized: 1. INPUT, 2. various OUTPUTs
26
Text-information management systems: characteristics and definition
*** Text-information management systems: characteristics and definition The information in the database is text oriented. Therefore, several features are required: ability to store relatively long blocks of texts ability to retrieve items in which specific words or terms occur anywhere
27
Text-information management: from free-form to structure
**- Text-information management: from free-form to structure Free form text information without structure Text database with information structured in files, records, fields, sub-fields, with links/relations among records,... (Ideally, each fields is repeatable = can be multi-valued, = can occur more than once in each record.)
28
Text-information management: types of software
*** Text-information management: types of software Software type Word processing software Free-form or structured text information database software Features Must be learnt anyway. Slow sequential searching. Additional software to be purchased and learnt. Fast searching via index(es).
29
Advantages of structured text-retrieval versus X-base systems
**- Advantages of structured text-retrieval versus X-base systems Feature Many long fields, forming long records Repeatable fields Subfields Variable field lengths Fast searching any word in all fields Thesaurus to help searching Text- retrieval Yes X-base systems No
30
Hierarchy in the use of a database
*** Hierarchy in the use of a database Database structure Input / Editing Searching / Output
31
Functions of database management software
*** Functions of database management software Input / edit using keyboard or batch input Indexing of the database(s) Browse / Search / Select / Retrieve data from database Output (Sort / Display / Print to file / Print to paper) + Export / Import
32
The various formats of records in a database
**- The various formats of records in a database Format to facilitate retrieval = inverted file Input format = Edit format Internal format for long term storage Display format for output to display, printer, file Format for exchange purposes
33
Structure of records / Field tags
**- Structure of records / Field tags Field tags: Examples from the Common Communication Format (supported by UNESCO): Title Author(s) Notes Abstract
34
Which advantages offers a document management system on computer?
*** !? Question !? Task !? Problem !? Which advantages offers a document management system on computer?
35
Advantages of a document system on computer, for the user(s)
*** Advantages of a document system on computer, for the user(s) Access to information is easier. Access to information is faster. Online access is possible even when centre is closed. Online access is possible from a distance. Integration in search module with data on loan status. More elements of the records can serve as search term. Combinations of search terms can be used. Results /selections can be stored as computer files.
36
Advantages of a document system on computer, for the manager(s)
**- Advantages of a document system on computer, for the manager(s) Multiplication / distribution / exchange is easier. Available computer data can be input / incorporated. Global changes are easier. The system takes less physical space. Output to printer allows production of cards, listings,... Sorting of records in output is easier. The system is resistant to physical aging. Parts are better integrated (for instance books and loans) The system can offer statistical information.
37
Drawbacks of a document system on computer
**- Drawbacks of a document system on computer L The costs of software and hardware can be high. Training related to computers is required. Evolution in computer applications should be considered. Some systems do not provide a backup.
38
Unesco’s involvement with information management
**- Unesco’s involvement with information management Unesco - General Information Programme Computer programs for information management Standards (for instance CCF) Various subject-oriented information projects Libraries and archives Unesco’s other programmes and divisions
39
Tools for information management by Unesco
**- Tools for information management by Unesco IDAMS DOS program for numeric data analysis CDS/ISIS Program for storage and retrieval of structured text-oriented information Interface between IDAMS and CDS/ISIS To use both programs efficiently in one project Common Communication Format (CCF) Guidelines on how to format a database
40
The IDAMS analysis program
*-- The IDAMS analysis program Software to create, manage and analyse local, in-house, number-oriented databases For DOS (or systems emulating DOS) Developed by an international team Distributed free of charge by the Unesco - General Information Programme (PGI) Detailed manual is available
41
The CDS/ISIS text database management program
**- The CDS/ISIS text database management program Software to create and manage local, in-house databases with primarily structured text as contents (NOT numbers, graphics, sound,...) Versions available for Mainframes (IBM) Minicomputers (Digital VAX) Microcomputers (DOS )
42
Micro-CDS/ISIS: options on the original main menu
*-- Micro-CDS/ISIS: options on the original main menu ________________________________________________________________________________ ________________________ Micro CDS/ISIS - Version 3.0 ________________________ C - Change data base L - Change dialogue language E - ISISENT - Data entry services S - ISISRET - Information retrieval services P - ISISPRT - Sorting and printing services I - ISISINV - Inverted file services D - ISISDEF - Data base definition services M - ISISXCH - Master file services U - ISISUTL - System utility services A - ISISPAS - Advanced programming services X - Exit (to MSDOS)
43
Micro-CDS/ISIS: original main menu on the display
*-- Micro-CDS/ISIS: original main menu on the display
44
Micro-CDS/ISIS running in Microsoft Windows
*-- Micro-CDS/ISIS running in Microsoft Windows
45
Micro-CDS/ISIS running in Microsoft Windows (full screen)
46
CDS/ISIS: general features
*-- CDS/ISIS: general features Available for several operating systems Multi-user editing and searching in a network Unlimited number of databases can be stored No practical limitation in the number of records per database Multiple field-occurrences are possible Only few limitations in a database structure Can be applied on CD-ROM
47
CDS/ISIS: input, indexing, searching, output
*-- CDS/ISIS: input, indexing, searching, output More than one input worksheet can be applied Powerful word-, phrase- and field-indexing Powerful, fast searching Powerful in output formats
48
CDS/ISIS: positive non-technical characteristics
*-- CDS/ISIS: positive non-technical characteristics Good, detailed manual in English, French,... Used in more than 4000 institutes Used internationally, worldwide National user-groups are active in many countries User interface available in English, French, Spanish, Arabian, Chinese,... Suitable database structures are available Free forum about CDS/ISIS by electronic mail
49
CDS/ISIS is available free of charge
*-- CDS/ISIS is available free of charge From National distributors approved by Unesco From subject-oriented distributors approved by Unesco From the Unesco - General Information Programme in Paris (From the secretariat of the Unesco - International Hydrological Programme in Paris, for water-related projects)
50
CDS/ISIS database structures
*-- CDS/ISIS database structures example database structured according to CCF, on diskette, distributed by the Unesco - General Information Programme example database structured according to CCF, published in a guide by the Unesco - International Hydrological Programme databases and manuals by other organisations
51
Interface to link CDS/ISIS with IDAMS
*-- Interface to link CDS/ISIS with IDAMS To use both packages efficiently together, an interface program has also been developed by the Unesco - General Information Programme (PGI)
52
Important new features in CDS/ISIS version 3
*-- Important new features in CDS/ISIS version 3 More than one user can edit data at the same time in a computer network. User / program can call external, non-CDS/ISIS programs. Better Pascal programming language is included in CDS/ISIS .
53
CDS/ISIS and libraries: examples of applications
*-- CDS/ISIS and libraries: examples of applications A library automated with CDS/ISIS Central library providing bibliographic data for computers Another library automated with CDS/ISIS Smaller department libraries and documentation systems automated using CDS/ISIS Network
54
CDS/ISIS database structure
*-- CDS/ISIS database structure General CDS/ISIS Database Record Field Subfield Character Name < 5 characters MFN = Master File Number Tag = 1, 2, 3, ..., 999 ^a ^b ^c ... ABC...abc accented characters Unlimited 16 million per database 250 per record - 8000 per record 8000 per field
55
CDS/ISIS database files, defined by the user
*-- CDS/ISIS database files, defined by the user Name in DOS .ANY .FDT .FMT .FST .PFT .STW # used with 1 database 1 > 1 = or > 1 Full name ANY file Field Definition Table Worksheet(s) Field Select Table(s) Print Format(s) STOPword list Purpose Searching Structure Input Indexing Output and sorting
56
CDS/ISIS database files: the database contents
*-- CDS/ISIS database files: the database contents Name in DOS .MST .XRF .IFP .L01 .L02 .N01 .N02 Full name Master file Cross reference file B-tree index files
57
CDS/ISIS database files to change using a text editing program
*-- CDS/ISIS database files to change using a text editing program Where dbn = data base name Name in DOS dbn.ANY dbn.STW # used with a database 1 Full name ANY file STOPword list Purpose Indexing and sorting Searching
58
*-- CDS/ISIS database files, which can be changed using a text editing program Name in DOS .FDT .FST .PFT # used with a database 1 1,... 1 or several Full name Field Definition Table Field Select Table(s) Print (or display )Format(s) Purpose Structure Indexing Output
59
Advantages of using CDS/ISIS with Windows (Part 1)
*-- Advantages of using CDS/ISIS with Windows (Part 1) Multitasking in several windows: CDS/ISIS and other programs start CDS/ISIS from the program manager view CDS/ISIS and the file manager at the same time search or edit a CDS/ISIS database together with a thesaurus or classification scheme in another program switch easily between CDS/ISIS and a word processing program to produce output ...
60
Advantages of using CDS/ISIS with Windows (Part 2)
*-- Advantages of using CDS/ISIS with Windows (Part 2) Multitasking in several windows: multiple instances of CDS/ISIS view several databases at the same time in several instances of CDS/ISIS running at the same time ...
61
Advantages of using CDS/ISIS with Windows (Part 3)
*-- Advantages of using CDS/ISIS with Windows (Part 3) Copy and paste copy data from a document in a program for text processing, and paste into a CDS/ISIS database copy data from a CDS/ISIS database displayed on screen by CDS/ISIS, and paste into a document in another program copy data from one CDS/ISIS database displayed on screen, and paste into another database through the editing worksheet ...
62
Advantages of using CDS/ISIS with Windows (Part 4)
*-- Advantages of using CDS/ISIS with Windows (Part 4) Associations of file name extensions with programs associate the following CDS/ISIS file name extensions with a program for word processing: .any .fdt .par .pft .stw
63
CDS/ISIS: some wishes of users (Part 1)
*-- CDS/ISIS: some wishes of users (Part 1) General aspects: better use of Microsoft Windows possibility to open and work with more than one file/database on screen simultaneously, and to exchange data among those open files client-server architecture for the database management system better availability of CDS/ISIS applications developed by other users (including additional ISIS-Pascal programs)
64
CDS/ISIS: some wishes of users (Part 2)
*-- CDS/ISIS: some wishes of users (Part 2) Database structure: records longer than 8 KBytes support of non-text fields (graphics, audio,...) better availability of database structures developed by users Input: access to and copy from one or more authority files spell-check of database contents (language independent) direct import of non-ISO-structured ASCII files
65
CDS/ISIS: some wishes of users (Part 3)
*-- CDS/ISIS: some wishes of users (Part 3) Indexing: multiple inverted files Searching: save search statements (queries) for future runs Output: emphasis of search terms in the output on display and paper better use of the features of various printers direct output to PostScript printers
66
CDS/ISIS: some wishes of users (Part 4)
*-- CDS/ISIS: some wishes of users (Part 4) Interface with user more help messages in context online tutorial support for mouse
67
CDS/ISIS: further development going on
*-- CDS/ISIS: further development going on versions for various Unix computers version for Windows splitting into a client and a server package
68
CDS/ISIS database definition services: display menu
*-- CDS/ISIS database definition services: display menu
69
CDS/ISIS database modification services: display menu
*-- CDS/ISIS database modification services: display menu
70
CDS/ISIS database definition table: display of an example
*-- CDS/ISIS database definition table: display of an example
71
Copying and renaming a CDS/ISIS database structure
*-- Copying and renaming a CDS/ISIS database structure Copy XCOPY (not COPY) all files, using DOS or Copy using Windows If required: rename FDT file, FMT file(s), PFT file(s), ANY file, FST file(s),... AND (!): Change first lines in xxxxx.FDT to make these agree with new names of FST(s), FMT(s), PFT(s)
72
Limitations to the structure of a database
**- Limitations to the structure of a database What is the maximum number of databases managed by the program? of records in a database? of characters in a record? of fields in a record? of characters in a field? Can fields contain subfields? Can the user define / modify the structure? Is the amount of memory taken on disk by each record minimal or fixed? ...
73
Limitations to the contents of a database
**- Limitations to the contents of a database Only plain 7-bit or extended ASCII? §©¥¤£¢ Can documents be managed, such as texts from word processing programs with formatting codes? pictures / graphics / bitmaps ? movies / video /...? ...
74
Types of field contents to control input
**- Types of field contents to control input Type (Example) Alphanumeric (Title) Alphabetic (Country code) Numeric (Year) Pattern (Date)
75
CDS/ISIS manual data entry, editing / input services: display menu
*-- CDS/ISIS manual data entry, editing / input services: display menu
76
Manual inputting and editing using a keyboard
**- Manual inputting and editing using a keyboard Can more than 1 form / worksheet be used for a database? Are pre-fabricated database structures with input worksheets included and / or available? Can more than one user edit the same database at the same time?
77
Batch input / Import **- Is batch input possible?
Is a format conversion program included or available? ...
78
Activities related to indexing
**- Activities related to indexing Activity Intellectual, human indexing Develop an automatic indexing method Automatic indexing Who does it? Database producer / Thesaurus producer Database producer / Software features Computer with program Concrete action Attribute subject terms to records Making an index method file Making inverted file(s)
79
Indexes in books and databases: a comparison
**- Indexes in books and databases: a comparison Book Database Index_term_1 page x1, y1, z1,... Index_term_2 page x2, y2, z2,... ... Printed Invisible Index_term_1 record nr. x1 / field type nr. x1 / field occurrence x1 / position x1 record nr. y1 / field type nr. y1 / field occurrence x1 / position y1 ... Index_term_2 record nr. x2 / field type nr. x2 / field occurrence x2 / position x2 record nr. x2 / field type nr. x2 / field occurrence x2 / position x2
80
Index in a text retrieval system (such as CDS/ISIS)
**- Index in a text retrieval system (such as CDS/ISIS) Terminology: Index = Inverted file = Dictionary database dictionary on display database complete inverted file
81
Methods of inverted file creation
**- Methods of inverted file creation Æ Word indexing J Simple / automatic / no indication required L Loss of word context J A field structure is not required Æ Phrase indexing L Indication of phrases during input is required J Richer than separate words Æ Field indexing J Context is better preserved L A field structure is required
82
CDS/ISIS inverted file services: display menu
*-- CDS/ISIS inverted file services: display menu
83
Automatic indexing (file inversion)
**- Automatic indexing (file inversion) Word indexing? with proximity indexing? Field indexing? Sub-field indexing? Phrase indexing? Æ Maximum length of index entry? Æ List of stopwords available? Æ Immediately after input or in batch? (Slow down...?) Æ Indexing speed? Æ Adding prefixes/tags possible? Æ Modification of indexing possible? Possible? Obligatory?
84
!? Question !? Task !? Problem !? **-
Why can the index of a database be so large in comparison with the size of the database?
85
CDS/ISIS information retrieval services: display menu
*-- CDS/ISIS information retrieval services: display menu
86
CDS/ISIS information retrieval: example of a dictionary on the display
*-- CDS/ISIS information retrieval: example of a dictionary on the display
87
CDS/ISIS: features related to retrieval
*-- CDS/ISIS: features related to retrieval Browsing Dictionary > Searching /selecting Direct searching + mix of these methods Boolean operators: OR * AND (^ NOT) Previous search result: #n Field qualifier: search term / (n,m,...)
88
*-- CDS/ISIS: an additional, user-friendly search interface program, Heurisko Offers a more limited but more user-friendly interface with drop-down menus, to choose among available CDS/ISIS databases to search the chosen database and to display selected records on the video display to print search results / selections Is available since 1993, free of charge from CDS/ISIS distributors, with a manual.
89
Interactive searching of a database / Retrieval
**- Interactive searching of a database / Retrieval Browse in index(es)? Select from index(es)? Combine search terms? Proximity operators? (Adjacency / Same paragraph / ...) Truncated search term(s)? Limit search to specific field(s)? Highlighting of search terms in selected records? Ranking of output? Speed? Save search strategy?
90
Output from a database to various “devices”
**- Output from a database to various “devices” to video display to printer to computer file (“printing” to a file) =< ;
91
CDS/ISIS output (sorting and printing) services: display menu
*-- CDS/ISIS output (sorting and printing) services: display menu
92
CDS/ISIS printing worksheet: display of an example
*-- CDS/ISIS printing worksheet: display of an example
93
CDS/ISIS sorting worksheet: display of an example
*-- CDS/ISIS sorting worksheet: display of an example
94
Formatting of output from a database
*-- Formatting of output from a database Formatting aspects / levels Æ data in each record Æ lay-out of records on the printed page or in the output computer file Æ sorting of records in output In CDS/ISIS Æ .PFT file(s) Æ printing worksheet(s) Æ sorting worksheet(s)
95
Formatting of data within each record in output
**- Formatting of data within each record in output Independent of output device: Determine the sequence of the fields in each record. Omit specific fields from each record. Add field names or tags to the fields in each record. Indicate the search term(s) in each record. Dependent of output device: Specify character formats in each (sub)field: typeface + size + bold/italic/underline
96
Sorting / arranging of records in the whole output
**- Sorting / arranging of records in the whole output Can the user determine the sequence of the records? Which elements can be used as a basis for sorting? Can stopwords be omitted as a basis for sorting? What is the maximum number of sort levels? Can the user choose between ascending or descending order? Can duplicate records be eliminated? (If yes: Can the user determine the meaning of duplicate?) Can output formats (styles) be stored?
97
Advanced and experimental retrieval systems
*-- Advanced and experimental retrieval systems The system accords weights to terms in the database Frequency of occurrence in the database + ... The searcher accords weights to terms in his query Based on importance of the term Natural language interface between user and system The system derives word stems + word meanings + ... Relevance feedback and query reformulation User assesses relevance and the system refines query Dynamic user profile is a part of the system System understands the user and his query better
98
Additional programs for CDS/ISIS
*-- Additional programs for CDS/ISIS CDS/ISIS CDS/ISIS Pascal programming language compiler Additional program(s) Source code of additional program(s) in CDS/ISIS Pascal
99
Global modification program for CDS/ISIS
*-- Global modification program for CDS/ISIS GMOD.PAS allows the modification of a string in a specific field of all records, throughout the whole CDS/ISIS database.
100
Thesaurus program module: purpose
**- Thesaurus program module: purpose Does the database management program offer a thesaurus module which allows the user to create, modify, store, and delete relations between terms used in the database? This is mainly used to establish relations among controlled subject indexing terms. If more than one controlled vocabulary is used, these should be managed separately.
101
Structure of a thesaurus database record (Fields for “good” terms)
**- Structure of a thesaurus database record (Fields for “good” terms) “Good” term Controlled vocabulary to which the term belongs (if more than 1 is used in the same database) Scope note (= definition of the controlled term) Date of creation or modification of the term Notes
102
Structure of a thesaurus database record (Fields for relations)
**- Structure of a thesaurus database record (Fields for relations) BT (= broader term) term(s) with broader meaning TT (= top term) term highest in the hierarchy NT (= narrower term) term(s) with narrower meaning RT (= related term) other term(s) related to this one UF (= use for) synonym(s)
103
Structure of a thesaurus database record (Fields for forbidden terms)
**- Structure of a thesaurus database record (Fields for forbidden terms) Forbidden term US (= use instead) “good” term in the controlled vocabulary
104
Structure of a thesaurus database record (Fields for candidate terms)
**- Structure of a thesaurus database record (Fields for candidate terms) Candidate “good” term in the controlled vocabulary (Other fields as in the case of “good” terms)
105
Structure of a multilingual thesaurus database record
**- Structure of a multilingual thesaurus database record Each type of field in a thesaurus record occurs for each language.
106
Thesaurus program: desirable properties (Part 1)
**- Thesaurus program: desirable properties (Part 1) Multilingual user interface = menus and messages in more than 1 language Multilingual contents = terms in more than 1 language When a term in the thesaurus database is added, changed or deleted, the program automatically makes the corresponding changes throughout the whole thesaurus database, there where that term occurs The program controls the creation of impossible (= forbidden) or undesirable relations
107
Thesaurus program: desirable properties (Part 2)
**- Thesaurus program: desirable properties (Part 2) Can the thesaurus contents be formatted and printed or sent to file? Can more than 1 thesaurus be managed, linked to the same database? Can a thesaurus database can be used with more than 1 primary database? Can the program signal the presence of orphan terms (= terms without relation)?
108
**- Thesaurus program: integration with input/editing of the primary database How simply and quickly can the user search the thesaurus during manual input/editing? (for instance to use it as an authority list) copy a term from a thesaurus and paste into a database record? copy a term from the database and paste into a thesaurus? ...
109
Thesaurus program: integration with searching of the primary database
**- Thesaurus program: integration with searching of the primary database Can the user browse the thesaurus during a search in the database? Can the program automatically formulate a query, when the user selects terms in the thesaurus module? Does the program allow to include easily and quickly synonyms, narrower terms and broader terms in a query? ...
110
Automatic creation, deletion or adaptation of the reciprocal relation
**- Automatic creation, deletion or adaptation of the reciprocal relation Does a change by the user of a relation in one record cause an automatic change by the thesaurus program of the reciprocal relation in the corresponding record of the thesaurus database? Examples: change of BT changes NT in the corresponding record change of NT changes BT in the corresponding record change of RT changes RT in the corresponding record change of UF changes US in the corresponding record change of US changes UF in the corresponding record
111
**- Automatic control of the creation of impossible or undesirable relations Does the thesaurus program avoid the creation of impossible or undesirable relations, or does it warn the user? Examples of this kind of relations: circular hierarchy (a NT b, b NT c, c NT a, or longer) circular synonym relation (a UF b, b UF a) iterative synonym relations (a US b, b US c, or longer) incomplete relations (a RT b, while b does not exist) term related to itself (for instance: a NT a) ...
112
Trilingual thesaurus program module for CDS/ISIS: properties
*-- Trilingual thesaurus program module for CDS/ISIS: properties It is an additional program in CDS/ISIS Pascal language Usage is free of charge, as in the case of CDS/ISIS Thesaurus database management is based on CDS/ISIS The thesaurus program, as well as CDS/ISIS, offers a user interface in English, French, and Spanish The contents of a thesaurus database is trilingual : each term in English, French, and Spanish (each one replaceable by another language)
113
Trilingual thesaurus program for CDS/ISIS: the relations among terms
*-- Trilingual thesaurus program for CDS/ISIS: the relations among terms The available relations are: US, UF, NT, BT, TT, RT Unlimited number of occurrences for each type of relations in each record After a change of a relation, the program automatically adapts the corresponding relation in the corresponding thesaurus term records
114
Trilingual thesaurus program for CDS/ISIS: control of relations
*-- Trilingual thesaurus program for CDS/ISIS: control of relations The program avoids the creation of some impossible or undesirable relations: circular synonym relation (a UF b, b UF a) iterative synonym relations (a US b, b US c, or longer) incomplete relations (a RT b, while b does not exist)
115
Trilingual thesaurus for CDS/ISIS: integration with searching
*-- Trilingual thesaurus for CDS/ISIS: integration with searching The user can browse the thesaurus during a search in the primary database. The program automatically formulates a query in the primary database, when the user selects terms in the thesaurus module. The program allows to include easily and quickly synonyms, narrower terms and broader terms in a query. The thesaurus database can be used for searching with more than 1 primary database.
116
Trilingual thesaurus program module for CDS/ISIS: further properties
*-- Trilingual thesaurus program module for CDS/ISIS: further properties In each record describing a term, a field for a scope note is present. A field for date of term creation is present. Several printout formats are included.
117
How to obtain the trilingual thesaurus program for CDS/ISIS?
*-- How to obtain the trilingual thesaurus program for CDS/ISIS? the national distributor in your country UNESCO Headquarters, General Information Programme, 1 rue Miollis, Paris, France ...
118
Trilingual thesaurus program module for CDS/ISIS: conclusions
*-- Trilingual thesaurus program module for CDS/ISIS: conclusions - Negative: Not well integrated with the input/editing module of CDS/ISIS + Positive: Exceptionally interesting price/quality ratio
119
Security / privacy / protection of databases
**- Security / privacy / protection of databases Password for searching specific database(s) and / or fields and / or record Password for editing specific database(s) and / or fields and / or records Password for changing database structure input and modification work sheets sort and print formats of data in records sort and print formats of records in a selection
120
Security / privacy / protection provided by DOS
*-- Security / privacy / protection provided by DOS DOS can make files read-only hidden
121
Security / privacy / protection in CDS/ISIS
*-- Security / privacy / protection in CDS/ISIS SYSPAR.PAR file (entry 0) asks for a password, which can limit access to a particular database set of worksheets set of menus set of additional CDS/ISIS programs Using the read-only version, named ISISCD.EXE, prevents modifications. Menus can be changed or removed to prevent access.
122
Passwords and usage tracking
**- Passwords and usage tracking Does the use of passwords linked to users or user groups allow usage tracking by a systems manager? “Usage” = for instance, number and types of search and/or edit actions. This can be useful for studies and system management.
123
Data export in the case of CDS/ISIS
*-- Data export in the case of CDS/ISIS CDS/ISIS Database Database structure Contents Copy of all database files “Export” of data “Print” data to file Other CDS/ISIS user without database Other CDS/ISIS user with same database structure Other database management system
124
Manual versus batch import of data in a database
**- Manual versus batch import of data in a database Information items Manual input Batch input
125
Conversion and batch input in the case of a CDS/ISIS database
*-- Conversion and batch input in the case of a CDS/ISIS database File with database records in ASCII with field tags Fangorn program + Conversion specification file File with records in format of the CDS/ISIS database Import module in CDS/ISIS Records in the CDS/ISIS database
126
Format conversion program Fangorn
*-- Format conversion program Fangorn Authors: Besemer and Nieuwenhuysen Available via anonymous ftp from PCWS1.SCI.SNS.IT ftp.vub.ac.be in the directory \pub\projects\Docinfo\paul\cursus\isis\ …
127
*-- Specification of a format conversion in the case of Fangorn for CDS/ISIS
128
!? Question !? Task !? Problem !? **-
Which software packages for storage and retrieval of structured text do YOU know?
129
**-Examples Microcomputers software packages for for structured text retrieval: examples askSam Bib-Search CAIRS Cardbox-Plus CDS / ISIS Headfast IdeaList Inmagic Notes (Lotus / IBM) Personal Librarian Pro-Cite Reference Manager Strix STATUS Topic (Verity) ...
130
!? Question !? Task !? Problem !? **-
How can you use a word processing program together with a text retrieval system?
131
Word processing program to assist a retrieval program
**- Word processing program to assist a retrieval program F To polish text data before import in the database managed by the retrieval program $ To inspect output to printer before real printing 2 To accept output from the retrieval program for further and better formatting, followed by printing
132
Which benefits offers a field structure to databases?
**- !? Question !? Task !? Problem !? Which benefits offers a field structure to databases?
133
Field structure in records: benefits concerning input
**- Field structure in records: benefits concerning input The indication of fields in input worksheets guides the input. Default values can be assigned to fields which can avoid errors and can make input faster. The existence of fields allows control of the contents format of each specific field during input. ...
134
Field structure in records: benefits concerning searching
**- Field structure in records: benefits concerning searching User can limit search to specific fields. Field type adds information to contents. Field-indexing keeps data together in index. ...
135
Field structure in records: benefits concerning output
**- Field structure in records: benefits concerning output Field structure makes output easier to understand. In output, each field can be indicated with tag/prefix. Records can be sorted based on contents of a field. In output, the fields can be sorted in each record. In output, some fields can be omitted. ...
136
!? Question !? Task !? Problem !? **-
Besides all the benefits offered by a field structure in a database, which problems does this cause?
137
Field structure in records: problems (Part 1)
**- Field structure in records: problems (Part 1) In the short term, it is more expensive and time consuming, than handling less structured data. Initially, the database manager who wants to create a new database has to make decisions: which fields to create to subdivide the database records, which field tags or names to use for the internal housekeeping of the database by the chosen database management software package.
138
Field structure in records: problems (Part 2)
**- Field structure in records: problems (Part 2) The exchange of data, i.e. importing data in a database, which have been exported from another database, is hindered when the databases structures are not identical or compatible. ...
139
Exchange formats and standards for text database systems
*** Exchange formats and standards for text database systems Usage and aims: to allow efficient exchange of information among databases without loss of structural information to guide database managers in the creation of a database structure (records divided in fields and subfields) Examples: (MARC = machine readable catalogue) LC-MARC (=Library of Congress MARC); UNIMARC Common Communication Format (of UNESCO) SGML
140
Common Communication Format (CCF): description
**- Common Communication Format (CCF): description Developed by the Unesco - General Information Programme for international application Includes a system of numeric tags indicating the location of fields and subfields in the records the meaning of the fields and subfields
141
Common Communication Format (CCF): availability
**- Common Communication Format (CCF): availability Published and made available free of charge by the Unesco - General Information Programme Printed manuals Printed implementation notes Example CDS/ISIS database structured according to the Common Communication Format
142
Exchange of data among systems: requirements
**- Exchange of data among systems: requirements Subject thesaurus (relation-structure + contents) Subject classification scheme + level of usage Contents of fields (and subfields) in the records (in the case of bibliographic databases: cataloguing input rules) Database structure: records, fields, subfields,... as seen by the database manager Version of the program for database management Type of program for database management Alphabet used for the data
143
Compatibility among databases: an example
Library of Congress Subject Headings (LCSH) (a thesaurus) Universal Decimal Classification (UDC) Anglo American Cataloguing Rules (AACR) Common Communication Format (CCF) Version 3.0 CDS/ISIS program Extension of ASCII by IBM ISO standard for record storage !
Similar presentations
© 2025 SlidePlayer.com Inc.
All rights reserved.