1 Revisiting PRECIS: The Preserved Context Index System
Barbara H. Kwasnik
School of Information Studies
Syracuse University
November 13, 2004
ASIST SIG/CR Workshop
2 Representation and Meaning
An indexer analyzes a text and strives to ascertain meaning. Ideally this analysis anticipates a searcher at some future time, looking for text with the same meaning.
But meaning is not fixed at either end of this process.
And even if the meaning is relatively unambiguous or stable, the terms used to represent it are not.
3 The Dilemma
Thus, most indexing processes encounter a dilemma at two levels:
  Interpreting meaning as intended by the author and as construed by the potential user;
  Choosing the terms to represent that meaning and that will enable this communication to be as clear and as true as it can be. (Bearing in mind that such fidelity is a relative thing to begin with.)
4 Interacting Layers of Meaning
Meaning is ascertained through several layers. These layers interact and inform each other:
  The lexical and morphemic level – words and their forms
  The semantic level – the meaning of the words
  The syntactic level – the relationship of the words to each other, known as grammar
  The discourse level – words interpreted in the context of text that is greater than the single sentence, and
  The pragmatic level – words embedded in world knowledge, that is, the way they are used
5 Meaning in Texts
The meanings created through texts are often complex – not readily reducible to a single concept. Representing them in too simple a way reduces the richness and fidelity of the representation.
But representing complexity is very difficult, especially if we want to build in some stability through standardization.
6 What to Do?
Because words can be ambiguous, can have multiple senses, and can change those senses over time, humans employ a range of strategies to work around this problem.
One of the most useful and “natural” is the inclusion of context to disambiguate potential
7 The Role of Context
Indexers have employed many strategies to enhance the richness of representation. One of these techniques is to add contextual cues, which may
  help disambiguate the term’s possible multiple senses, and
  reveal how the term is being used, that is, its role in the text.
8 Back-of-the-Book Indexes
B.o.b.’s are replete with context. In fact, a good index can be “read” and will give a fairly good indication of the content and scope of the text.
  Librarians
    education of
    job satisfaction of
    poor pay for
The retention of natural order and prepositions helps make the meaning of individual terms clear (although not always).
But these indexes are usually unique to the text to which they point and are quite difficult to maintain on a large scale.
9 Traditional Thesauri
A collection of subject terms structured as a hierarchy, with equivalence and associative relationships also noted.
  Community-college librarians
    UF Junior-college librarians
    BT Academic librarians
    RT University librarians
These types of structures offer a semantic context.
But typically only one aspect of meaning is revealed at a time, and the representations only account for nouns.
Associations among terms can only imply syntactic relationships, e.g., “pasteurization” and “milk.”
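The thesaurus record on this slide can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `ThesaurusEntry`, its field names, and the helper `build_lookup` are hypothetical and do not come from any particular thesaurus software. It shows how the UF (used for) relationship lets a non-preferred term lead a searcher to the preferred one, while BT and RT supply semantic context.

```python
from dataclasses import dataclass, field


@dataclass
class ThesaurusEntry:
    """One preferred term with its equivalence, hierarchical, and associative links."""
    preferred: str
    used_for: list = field(default_factory=list)   # UF: non-preferred synonyms
    broader: list = field(default_factory=list)    # BT: broader terms
    related: list = field(default_factory=list)    # RT: related terms


def build_lookup(entries):
    """Map every preferred and non-preferred term (lowercased) to its entry."""
    lookup = {}
    for entry in entries:
        lookup[entry.preferred.lower()] = entry
        for variant in entry.used_for:
            lookup[variant.lower()] = entry
    return lookup


# The record shown on the slide.
record = ThesaurusEntry(
    preferred="Community-college librarians",
    used_for=["Junior-college librarians"],
    broader=["Academic librarians"],
    related=["University librarians"],
)

lookup = build_lookup([record])
hit = lookup["junior-college librarians"]
print(f"USE {hit.preferred}; BT {hit.broader}; RT {hit.related}")
```

Note that nothing in this structure says how two terms relate syntactically, which is the limitation the slide points out.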
10 Facet Analysis
Strives to remedy limitations of one-dimensionality by enabling representation from a number of perspectives.
Using Ranganathan’s classic dimensions we produce the string:
  Time: 12th Century
  Space: Celtic
  Energy: Embroidered
  Matter: Felt
  Personality: Slippers
These strings can be presented in permuted order for access by any of the facets.
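One possible reading of "permuted order" is sketched below: each facet takes a turn as the access point, and the remaining facets follow in their original order. The function names `facet_entries` and `format_entry` are hypothetical, and other permutation schemes (for example, full cyclic rotation) would be equally consistent with the slide.

```python
# The faceted string from the slide, as (facet, value) pairs.
FACETS = [
    ("Time", "12th Century"),
    ("Space", "Celtic"),
    ("Energy", "Embroidered"),
    ("Matter", "Felt"),
    ("Personality", "Slippers"),
]


def facet_entries(facets):
    """Return one entry per facet, with that facet moved to the front."""
    entries = []
    for i, lead in enumerate(facets):
        rest = facets[:i] + facets[i + 1:]  # remaining facets keep their order
        entries.append([lead] + rest)
    return entries


def format_entry(entry):
    """Render an entry as a single display line."""
    return "; ".join(f"{name}: {value}" for name, value in entry)


for entry in facet_entries(FACETS):
    print(format_entry(entry))
```

Each output line carries the full string, so a searcher entering under any facet still sees the other four.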
11 PRECIS: Preserved Context Index System
Developed by Derek Austin in the early 1970s for subject indexing for the British National Bibliography.
Subsequently developed by him, with the assistance of Mary Dykstra, into an adaptable method of linking both the semantics and syntax of indexing terms.
Goal was to represent meaning without “disturbing the user’s immediate understanding.”
12 PRECIS Indexing Process (Incredibly Simplified)
The indexer:
  examines document, asking the following questions:
    Did anything happen?
    If yes, to whom or what did it happen?
    Who or what did it?
    Where did it happen?
    (from Dykstra, 1987, p. 9)
  mentally formulates a title-like phrase
    E.g., “recruitment of teachers in American library schools”
  analyzes terms syntactically
13 PRECIS Indexing Process (Incredibly Simplified)
  determines role of each term (e.g., agent, location);
  selects appropriate role operator;
  chooses lead terms.
Term order is achieved by the operators and is based on context dependency. This means that each term in the string sets the next term into its obvious context.
(e.g., Teachers. Library schools.)
14 Producing the following entry:
  United States
    Library schools. Teachers. Recruitment
  Library schools. United States
    Teachers. Recruitment
  Teachers. Library schools. United States
    Recruitment
  Recruitment. Teachers. Library schools. United States
(from Austin, JDoc, 1974, p. 49-51)
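The pattern in these four entries can be sketched in a few lines. This is a deliberately reduced illustration, not Austin's algorithm: it assumes the terms are already in context-dependent order and that every term is marked as a lead, and it ignores role operators entirely. The function names `shunt` and `format_entry` are hypothetical. For each lead term, the earlier (wider-context) terms follow it in reverse order on the heading line, and the later (narrower-context) terms appear on the indented line below.

```python
def shunt(terms):
    """Build (lead, qualifier, display) triples from a context-ordered string.

    qualifier: terms before the lead, reversed (narrowest context first)
    display:   terms after the lead, in original order
    """
    entries = []
    for i, lead in enumerate(terms):
        qualifier = list(reversed(terms[:i]))
        display = terms[i + 1:]
        entries.append((lead, qualifier, display))
    return entries


def format_entry(lead, qualifier, display):
    """Render one entry: heading line, then an indented display line if any."""
    heading = ". ".join([lead] + qualifier)
    if display:
        return heading + "\n    " + ". ".join(display)
    return heading


# Context-ordered string for the example on the slide.
string = ["United States", "Library schools", "Teachers", "Recruitment"]

for lead, qualifier, display in shunt(string):
    print(format_entry(lead, qualifier, display))
```

Because every entry is derived from the same string, the full indexing statement is present under each lead term, only rearranged.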
15 Aspects of PRECIS Indexing:
  Context is preserved: the entire indexing statement appears at each lead term;
  The permuted entries read naturally, which is achieved by the prescribed order of the role operators;
  The terms are linked to a machine-held thesaurus (not described in this presentation), thereby providing possible see’s and see also’s;
  According to Austin, PRECIS can be adapted to other languages, e.g., those with inflection.
The indexer determines meaning and codes the roles and lead terms, but the computer takes care of the permutations.
16 Some Challenges
  Indexing with PRECIS requires a good knowledge of grammar;
  In my opinion, the bottleneck comes at the first step: articulating the title-like phrase.
  It’s not clear how the terms provided by the indexer are harmonized with the thesaurus to produce “consensual meaning.”
17 PRECIS as a Bridge
PRECIS can take advantage of the semantic richness of a thesaurus AND the contextual richness of the natural-like permuted phrases of back-of-the-book indexes.
It could potentially add to the power of a faceted-string display by adding some explicit notion of operators among the facets.
And it could take advantage of NLP techniques, which at this point are able to parse most syntactic roles, as well as phrases and names, with about 80% accuracy without too much “work.” (personal communication, Liz Liddy)
18 References
Austin, Derek. PRECIS: A Manual of Concept Analysis and Subject Indexing. 2nd ed. London: British Library Bibliographic Services Division, 1984.
Austin, Derek. The development of PRECIS: A theoretical and technical history. Journal of Documentation 30 (1) 1974:
Dykstra, Mary. PRECIS: A Primer. Rev. reprint. Metuchen, NJ & London: Scarecrow Press, 1987.