Revisiting PRECIS: The Preserved Context Index System
Barbara H. Kwasnik, School of Information Studies, Syracuse University
November 13, 2004, ASIST SIG/CR Workshop
Representation and Meaning
An indexer analyzes a text and strives to ascertain its meaning. Ideally this analysis anticipates a searcher at some future time, looking for text with the same meaning. But meaning is not fixed at either end of this process. And even if the meaning is relatively unambiguous or stable, the terms used to represent it are not.
The Dilemma
Thus, most indexing processes encounter a dilemma at two levels:
Interpreting meaning as intended by the author and as construed by the potential user;
Choosing terms that represent that meaning and that enable this communication to be as clear and as true as it can be (bearing in mind that such fidelity is a relative thing to begin with).
Interacting Layers of Meaning
Meaning is ascertained through several layers. These layers interact and inform each other:
The lexical and morphemic level – words and their forms;
The semantic level – the meaning of the words;
The syntactic level – the relationship of the words to each other, known as grammar;
The discourse level – words interpreted in the context of text that is greater than the single sentence; and
The pragmatic level – words embedded in world knowledge, that is, the way they are used.
Meaning in Texts
The meanings created through texts are often complex and not readily reducible to a single concept. Representing them in too simple a way reduces the richness and fidelity of the representation. But representing complexity is very difficult, especially if we want to build in some stability through standardization.
What to Do?
Because words can be ambiguous, can have multiple senses, and can change those senses over time, humans employ a range of strategies to work around this problem. One of the most useful and natural is the inclusion of context to disambiguate potential
The Role of Context
Indexers have employed many strategies to enhance the richness of representation. One of these techniques is to add contextual cues, which may help disambiguate a term's possible multiple senses and reveal how the term is being used, that is, its role in the text.
Back-of-the-Book Indexes
Back-of-the-book indexes are replete with context. In fact, a good index can be read and will give a fairly good indication of the content and scope of the text.
Librarians
  education of
  job satisfaction of
  poor pay for
The retention of natural order and prepositions helps make the meaning of individual terms clear (although not always). But these indexes are usually unique to the text to which they point and are quite difficult to maintain on a large scale.
Traditional Thesauri
A collection of subject terms structured as a hierarchy, with equivalence and associative relationships also noted.
Community-college librarians
  UF Junior-college librarians
  BT Academic librarians
  RT University librarians
These types of structures offer a semantic context. But typically only one aspect of meaning is revealed at a time, and the representations only account for nouns. Associations among terms can only imply syntactic relationships, e.g., between pasteurization and milk.
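The thesaurus record on this slide can be read as a small data structure. The sketch below is only an illustration: the dictionary layout and the function names `preferred_term` and `semantic_context` are assumptions, not part of any particular thesaurus software. It shows how a UF ("used for") link lets a searcher's variant term lead to the preferred term, and how BT and RT links supply the semantic context.

```python
# Hypothetical in-memory thesaurus. The relationship codes UF, BT and RT come
# from the slide; the dictionary layout is an assumption for illustration.
THESAURUS = {
    "Community-college librarians": {
        "UF": ["Junior-college librarians"],  # equivalence (non-preferred variants)
        "BT": ["Academic librarians"],        # hierarchy (broader term)
        "RT": ["University librarians"],      # association (related term)
    },
}

# Reverse index built from the UF links: variant term -> preferred term.
USE = {
    variant: preferred
    for preferred, links in THESAURUS.items()
    for variant in links.get("UF", [])
}


def preferred_term(term):
    """Map a searcher's term to the preferred term, if a UF link covers it."""
    return USE.get(term, term)


def semantic_context(term):
    """Return the broader and related terms recorded for a term."""
    links = THESAURUS.get(preferred_term(term), {})
    return {"BT": links.get("BT", []), "RT": links.get("RT", [])}


print(preferred_term("Junior-college librarians"))
print(semantic_context("Junior-college librarians"))
```

Note that nothing in this structure records a syntactic role: the links say that two terms are related, not how one acts on the other.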
Facet Analysis
Strives to remedy the limitations of one-dimensionality by enabling representation from a number of perspectives. Using Ranganathan's classic dimensions we produce the string:
Time: 12th Century
Space: Celtic
Energy: Embroidered
Matter: Felt
Personality: Slippers
These strings can be presented in permuted order for access by any of the facets.
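Permuted access to a faceted string can be sketched in a few lines. The facet names and values below are taken from the slide. The permutation rule (move one facet to the front and keep the others in their original order) and the " / " separator are assumptions chosen for illustration; other schemes are possible.

```python
# Facets and values as listed on the slide, in the order given there.
FACETS = [
    ("Time", "12th Century"),
    ("Space", "Celtic"),
    ("Energy", "Embroidered"),
    ("Matter", "Felt"),
    ("Personality", "Slippers"),
]


def permuted_strings(facets):
    """Return one display string per facet, with that facet moved to the front.

    The remaining facets keep their original relative order (an assumed rule).
    """
    strings = []
    for i, lead in enumerate(facets):
        rest = facets[:i] + facets[i + 1:]
        ordered = [lead] + rest
        strings.append(" / ".join(f"{name}: {value}" for name, value in ordered))
    return strings


for line in permuted_strings(FACETS):
    print(line)
```

Each facet gets its own access point, but the string carries no statement of how the facets relate to one another.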
PRECIS: Preserved Context Index System
Developed by Derek Austin in the early 1970s for subject indexing for the British National Bibliography. Subsequently developed by him, with the assistance of Mary Dykstra, into an adaptable method of linking both the semantics and syntax of indexing terms. The goal was to represent meaning without disturbing the user's immediate understanding.
PRECIS Indexing Process (Incredibly Simplified)
The indexer:
examines the document, asking the following questions (from Dykstra, 1987, p. 9):
  Did anything happen?
  If yes, to whom or what did it happen?
  Who or what did it?
  Where did it happen?
mentally formulates a title-like phrase, e.g., "recruitment of teachers in American library schools";
analyzes the terms syntactically;
PRECIS Indexing Process (Incredibly Simplified)
determines the role of each term (e.g., agent, location);
selects the appropriate role operator;
chooses lead terms.
Term order is achieved by the operators and is based on context dependency. This means that each term in the string sets the next term into its obvious context (e.g., Teachers. Library schools.).
Producing the following entries:
United States
  Library schools. Teachers. Recruitment
Library schools. United States
  Teachers. Recruitment
Teachers. Library schools. United States
  Recruitment
Recruitment. Teachers. Library schools. United States
(from Austin, JDoc, 1974, p. 49-51)
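The entries above follow a regular pattern: each term takes a turn as the lead, the terms that precede it in the string follow the lead in reverse order, and the terms that come after it are displayed in their original order. The sketch below reproduces only that mechanical step for this one example. It assumes the indexer has already ordered the string and marked every term as a lead, and it ignores role operators entirely, so it is not a full PRECIS implementation. The function names are hypothetical.

```python
def precis_entries(terms):
    """Generate (lead, qualifier, display) triples from a context-ordered string.

    lead:      the term under which the entry files
    qualifier: the wider-context terms, in reverse order, following the lead
    display:   the narrower-context terms, in original order
    """
    entries = []
    for i, lead in enumerate(terms):
        qualifier = list(reversed(terms[:i]))
        display = terms[i + 1:]
        entries.append((lead, qualifier, display))
    return entries


def format_entry(lead, qualifier, display):
    """Render one entry: lead and qualifier on the first line, display below."""
    heading = ". ".join([lead] + qualifier)
    if display:
        return heading + "\n  " + ". ".join(display)
    return heading


# String from the slide, ordered from widest to narrowest context.
string = ["United States", "Library schools", "Teachers", "Recruitment"]

for entry in precis_entries(string):
    print(format_entry(*entry))
```

Because every entry is built from the same full string, the whole indexing statement stays visible at each access point.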
Aspects of PRECIS Indexing:
Context is preserved: the entire indexing statement appears at each lead term;
The permuted entries read naturally, which is achieved by the prescribed order of the role operators;
The terms are linked to a machine-held thesaurus (not described in this presentation), thereby providing possible "see" and "see also" references;
According to Austin, PRECIS can be adapted to other languages, e.g., those with inflection.
The indexer determines meaning and codes the roles and lead terms, but the computer takes care of the permutations.
Some Challenges
Indexing with PRECIS requires a good knowledge of grammar;
In my opinion, the bottleneck comes at the first step: articulating the title-like phrase.
It's not clear how the terms provided by the indexer are harmonized with the thesaurus to produce consensual meaning.
PRECIS as a Bridge
PRECIS can take advantage of the semantic richness of a thesaurus AND the contextual richness of the natural-like permuted phrases of back-of-the-book indexes.
It could potentially add to the power of a faceted-string display by adding some explicit notion of operators among the facets.
And it could take advantage of NLP techniques, which at this point are able to parse most syntactic roles, as well as phrases and names, with about 80% accuracy without too much work (personal communication, Liz Liddy).
References
Austin, Derek. PRECIS: A Manual of Concept Analysis and Subject Indexing. 2nd ed. London: British Library Bibliographic Services Division,
Austin, Derek. The development of PRECIS: A theoretical and technical history. Journal of Documentation 30 (1) 1974:
Dykstra, Mary. PRECIS: A Primer. Rev. reprint. Metuchen, NJ & London: Scarecrow Press, 1987.