Knowledge Organization in Digital Libraries (II) Digital Libraries INFO 653 Week 6 Xia Lin College of Information Science and Technology Drexel University.

1 Knowledge Organization in Digital Libraries (II) Digital Libraries INFO 653 Week 6 Xia Lin College of Information Science and Technology Drexel University

2 Approaches
Keyword indexing: making search engines functional. Metadata (bottom-up): extending traditional subject indexing. Classification (top-down): using a structured classification frame to provide hierarchical browsing and access. Ontology approach.

3 Keyword Indexing Highly automated process.
Use every meaningful word to index documents. Make search engines functional. Make large amounts of information accessible.
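As a minimal sketch of what "use every meaningful word to index documents" can look like in practice, the following builds an inverted index and answers an AND query. The stop list, the tokenizer, and the sample documents are illustrative assumptions, not part of the lecture.

```python
import re
from collections import defaultdict

# Hypothetical stop list; a real system would use a much longer one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "for"}

def tokenize(text):
    """Lowercase the text and keep only words that are not stop words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS]

def build_index(docs):
    """Map every meaningful word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents that contain every meaningful query word."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result

# Illustrative documents (assumed, not from the slides).
docs = {
    1: "Metadata for digital libraries",
    2: "Indexing the Web",
    3: "Digital preservation of metadata",
}
index = build_index(docs)
matches = search(index, "digital metadata")
```

The process is fully automatic, which is why it scales to large collections, but it records no relationships between words; that gap is what the other approaches address.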

4 MetaData Approach Digital Object Identifiers Dublin Core
Subject tag Description tag RDF Data model Resource

5 Classification Approach
Use a current classification scheme: LC Classification, Dewey Classification. Most projects are not completed: a mile wide and an inch deep. Use ad hoc classification schemes: Yahoo-style hierarchical lists. Use automatic classification.

6 Ontology Approach Ontologies
Define not only concepts but also relationships of concepts. Define both links and types of links.
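One way to picture "both links and types of links" is as a list of typed triples. The sketch below is only an illustration: the link types (example-of, sponsors, write) appear in the Work Force ontology slide, but the particular source/target pairings here are assumed for the example.

```python
from collections import namedtuple

# A typed link: source concept, type of link, target concept.
Link = namedtuple("Link", ["source", "link_type", "target"])

# Illustrative links; which concepts each link joins is an assumption.
LINKS = [
    Link("Guides, Handbooks", "example-of", "Documents"),
    Link("Presentations", "example-of", "Documents"),
    Link("Organizations", "sponsors", "Events"),
    Link("People", "write", "Documents"),
]

def targets(links, source, link_type=None):
    """Concepts reachable from `source`, optionally restricted to one link type."""
    return [
        link.target
        for link in links
        if link.source == source and (link_type is None or link.link_type == link_type)
    ]

def link_types(links):
    """All distinct link types used in the ontology, sorted alphabetically."""
    return sorted({link.link_type for link in links})

written = targets(LINKS, "People", "write")
kinds = link_types(LINKS)
```

Because the link type is data rather than an implicit convention, a program can ask for "everything that People write" and not just "everything linked to People".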

7 Ontology An ontology is a specification of a conceptualization.
An ontology is a description (like a formal specification of a program) of the concepts and relationships that can exist for an agent or a community of agents. An ontology is a commitment to use the shared vocabulary in a coherent and consistent manner.

8 Work Force Digital Library Ontology
[Figure: ontology diagram of concepts (taxonomy and ontology). Concepts shown: Workforce Programs; Policy and regulation; Documents; Projects; Info Resources; Government; Guides, Handbooks; Document; Organizations; People (example: Peter Creticos); Presentations; Events (conferences, workshops, ...); Cases that worked; Lessons learned. Link types shown: example-of, represents, describes, refers-to, sponsors, uses, is-part-of, initiates, is-related-to, write, includes.]

9 Why Develop an Ontology?
To enable a machine to use the knowledge in some application. To enable multiple machines to share their knowledge. To help yourself understand some area of knowledge better. To help other people understand some area of knowledge. To help people reach a consensus in their understanding of some area of knowledge.

10 Ontology and thesaurus
Ontology inherits the ideas, purposes, and functions of the thesaurus. Ontology extends relationships among concepts beyond those in a thesaurus (NT, BT, RT, synonyms). Ontology is intended to be consumed by both humans and machines.

11 Topic Maps
A key component of the Semantic Web. A new ISO standard: ISO Topic Maps. XML-like syntax; XML Schema. XTM: XML Topic Maps.

12 XTM Topic Maps
XML Topic Maps (XTM) defines an abstract model and an XML grammar for topic maps. XTM does not define topic maps at the implementation level. Each implementation may interpret XTM differently or define its own “metadata” within the framework of XTM.

13 TAO of Topic Maps (Topics, Associations, Occurrences)
Elements inside <topicmap> ... </topicmap>: TOPIC (topname: basename, dispname, sortname); OCCURS; ASSOC (assocrl); facet (fvalue); addthms

14

15 Topic Maps for Knowledge Representation
Establishing an associative network between resources which represent concepts Organizing legacy resources into a new information/knowledge space, by relating them to topics, and associating those topics, in a structured way Enabling disparate sets of information resources to be used together, by interrelating them using a unifying conceptual framework

16 Topic Map Implementation
Why is topic map implementation hard? There are no “magic” solutions for content representation. Creating a complete TAO is labor-intensive and involves many manual activities. There are no good tools for topic map creation. XML is not designed to let end users work directly on objects contained in an XML file.

17 Topic Maps and Thesaurus
Different Directions of indexing Thesaurus: assign descriptors to documents Topic maps: associate occurrences to terms Different structures Thesaurus: mainly a hierarchy plus some cross-references Topic Maps: more link types
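The two indexing directions can be seen as the same data viewed from opposite ends. As a rough sketch (file names and descriptors are made up for illustration), the thesaurus view maps each document to its descriptors, and inverting that mapping gives the topic-map view, where each term lists its occurrences.

```python
from collections import defaultdict

# Thesaurus direction: descriptors assigned to each document (illustrative data).
doc_descriptors = {
    "doc1.html": ["Metadata", "Indexing"],
    "doc2.html": ["Metadata"],
}

def to_occurrences(doc_descriptors):
    """Topic-map direction: for each term, list the documents where it occurs."""
    occurrences = defaultdict(list)
    for doc, terms in doc_descriptors.items():
        for term in terms:
            occurrences[term].append(doc)
    return dict(occurrences)

occurrences = to_occurrences(doc_descriptors)
```

The inversion only captures the direction of indexing; the richer link types of topic maps need an explicit association structure on top of it.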

18 ALL Together – Libraries Keyword indexing Classification Thesaurus
Metadata Knowledge Organizing Ontology XML RDF Topic Maps Semantic Web

19 Personal Research Projects
Explore solutions to make knowledge organizing practical Knowledge Class KEPT Knowledge Middleware

20 Knowledge Class Purposes
to customize knowledge organization and access, to supplement and complement existing devices for Web users, and to explore the possibility of combining existing methods of knowledge organization with advanced Web technology.

21 Knowledge Class Design Principles balance of browsing and searching
balance of manual indexing and automatic indexing balance of personal (topical) information space and the whole web space

22 Knowledge Class Three components an organizing framework
a dynamic web interface Search strategies for each term

23

24 Knowledge Class Features
A hierarchical structure of subject terms constructed on classification principles. Multiple levels of knowledge organization: expandable and contractible branches of the hierarchy allow varying levels of depth. Static links to remote resources and related sites or pages. Dynamic links to target information through search engines such as Google, AltaVista, InfoSeek, Yahoo!, and Lycos. Coded search strategies for terms. Use of scope terms for classes and for branches.

25 Knowledge Class Features
Referral links among terms within a knowledge class, and potentially among knowledge classes, to assist cross-referencing. Instant switching among search engines available over the Web to allow access to the variety of resources covered by different search engines.

26 A Knowledge Class for Digital Libraries
Developed by students two years ago

27 Yahoo Categories: References – Libraries – Digital Libraries:
Cataloging Electronic Conferences (5) Electronic Electronic Theses and Dissertations (ETDs) (14) Organizations (2) Projects and Collections (33)

28 IFLA page: Resources and Projects
Cataloguing & Indexing of Electronic Resources Electronic Text & Journal Archives Metadata Resources

29 Digital Libraries: a Selected Resource Guide
Overview and general resources Project planning & management Architecture Technology Standards and guidelines Archiving & Preservation Metadata Intellectual property rights.

30 Northern Light folders
Digital Libraries Special collections Conferences dlib.org dlib.org.ar uh.edu rutgers.edu stanford.edu stfx.ca vt.edu uni-trier.de ucla.edu Class notes & Assignments all others...

31 Digital libraries by William Y. Arms: Table of Contents
1 Libraries, Technology, and People 2 The Internet and the World Wide Web 3 Libraries and Publishers 4 Innovation and Research 5 People, Organizations, and Change 6 Economic and Legal Issues 7 Access Management and Security 8 User Interfaces and Usability 9 Text 10 Information Retrieval and Descriptive Metadata 11 Distributed Information Discovery 12 Object Models, Identifiers, and Structural Metadata 13 Repositories and Archives 14 Digital Libraries and Electronic Publishing Today

32 Practical Digital Libraries: Books, Bytes, and Bucks by Michael Lesk
1. Evolution of Libraries 2. Text Access Methods 3. Images of Pages 4. Multimedia Storage and Access 5. Knowledge Representation Methods 6 Distribution 7 Usability and Retrieval Evaluation 8 Collections and Preservation 9 Economics 10 Intellectual Property Rights 11 International Activities 12 Future: Ubiquity, Diversity, Creativity, and Public Policy

33 How do I build a Thesaurus
Use existing dictionaries and thesauri to decide on the terms and their relationships. Collect a set of representative documents and try to index them; take the set of indexing terms as your preliminary list. Review and organize the preliminary term set: decide on preferred terms and make Use references from the variants and synonyms; build hierarchical and associative relationships among the preferred terms. Produce a draft list, test and revise.
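The result of these steps can be held in a very small data structure. The sketch below is one possible shape, not a prescribed design: USE references point from variants to preferred terms, and BT/RT links connect preferred terms. The sample terms are borrowed from examples elsewhere in these slides, but the specific relationships entered here are for illustration only.

```python
class Thesaurus:
    """A toy thesaurus: USE references plus BT/NT/RT links among preferred terms."""

    def __init__(self):
        self.use = {}      # variant or synonym -> preferred term
        self.broader = {}  # preferred term -> set of broader terms (BT)
        self.related = {}  # preferred term -> set of related terms (RT)

    def add_use(self, variant, preferred):
        """Record a USE reference from a variant to its preferred term."""
        self.use[variant] = preferred

    def add_broader(self, term, broader_term):
        """Record that `broader_term` is a BT of `term`."""
        self.broader.setdefault(term, set()).add(broader_term)

    def add_related(self, a, b):
        """RT is symmetric, so record it in both directions."""
        self.related.setdefault(a, set()).add(b)
        self.related.setdefault(b, set()).add(a)

    def preferred(self, term):
        """Follow a USE reference if one exists; otherwise return the term itself."""
        return self.use.get(term, term)

    def narrower(self, term):
        """NT is derived from BT: every term that lists `term` as broader."""
        return sorted(t for t, bts in self.broader.items() if term in bts)

thesaurus = Thesaurus()
thesaurus.add_use("Unit-trusts", "mutual funds")
thesaurus.add_use("Investment-trusts", "mutual funds")
thesaurus.add_broader("Recycling", "Sanitation")
thesaurus.add_related("Recycling", "Pollution")
```

Deriving NT from BT keeps the two hierarchical directions consistent during the "test and revise" step, since only one of them is ever edited by hand.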

34 Scope terms
Each knowledge class can have one scope term to limit the search scope: in the knowledge class Digital Libraries, the term Technology is searched as technologies AND “digital libraries”. Each branch of a knowledge class can also have one scope term: the term Issues in the Technology branch is searched as “Issues AND Technology AND digital libraries”.

35 Data Format –first year
Example record: --, mutual funds, mutual-funds Investment-trusts Unit-trusts, 1
Fields: 1. Hierarchical level 2. Display term 3. Search term (synonyms) 4. URL 5. Search strategy code
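A record in this first-year format could be read with a few lines of code. The sketch below rests on several assumptions that the slide does not state: fields are comma-separated, the hierarchical level is the number of leading hyphens, synonyms are separated by spaces, and the URL field may be left out (as it appears to be in the example record). The names `TermRecord` and `parse_record` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class TermRecord:
    level: int               # hierarchical level (assumed: count of hyphens)
    display_term: str
    search_terms: List[str]  # search term plus synonyms
    url: Optional[str]       # None when the record has no URL field
    strategy_code: int

def parse_record(line):
    """Parse one comma-separated record with four or five fields."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) == 4:
        level, display, search, code = parts
        url = None
    elif len(parts) == 5:
        level, display, search, url, code = parts
        url = url or None
    else:
        raise ValueError(f"expected 4 or 5 fields, got {len(parts)}")
    return TermRecord(
        level=level.count("-"),
        display_term=display,
        search_terms=search.split(),
        url=url,
        strategy_code=int(code),
    )

record = parse_record("--, mutual funds, mutual-funds Investment-trusts Unit-trusts, 1")
```

A format like this is easy to type by hand, but it cannot carry commas inside a field, which is one practical reason to move to an XML representation.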

36 Second year-- Last Year’s student project
<topicmap title="Digital Libraries">
  <topic id="General Resources" type="Main category">
    <topic id="Bibliography">
      <topname>
        <basename>Bibliography</basename>
        <dispname>Bibliography</dispname>
        <sortname></sortname>
      </topname>
      <occurs> </occurs>
      <topic id="IFLA bibliography" type="reference">
        <basename>IFLA bibliography</basename>
        <dispname>IFLA bibliography</dispname>
        <occurs> type="website" href=" </occurs>
      </topic>
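To show how a file in this style might be consumed, here is a small sketch that walks nested topics with Python's standard `xml.etree.ElementTree`. The sample document is a simplified, well-formed stand-in written for this example; it reuses the element names from the slide, but its structure and the `http://example.org/...` address are assumptions, not the students' actual data.

```python
import xml.etree.ElementTree as ET

# Hypothetical well-formed sample using the slide's element names.
SAMPLE = """
<topicmap title="Digital Libraries">
  <topic id="General Resources" type="Main category">
    <topname><dispname>General Resources</dispname></topname>
    <topic id="Bibliography">
      <topname><dispname>Bibliography</dispname></topname>
      <occurs type="website" href="http://example.org/bibliography"/>
    </topic>
  </topic>
</topicmap>
"""

def outline(element, depth=0):
    """Return (depth, display name, occurrence links) for every nested topic."""
    rows = []
    for topic in element.findall("topic"):
        # Fall back to the id attribute when no display name is given.
        name = topic.findtext("topname/dispname", default=topic.get("id"))
        hrefs = [o.get("href") for o in topic.findall("occurs") if o.get("href")]
        rows.append((depth, name, hrefs))
        rows.extend(outline(topic, depth + 1))
    return rows

root = ET.fromstring(SAMPLE.strip())
rows = outline(root)
```

The depth value is enough to render the expandable hierarchy, and the occurrence links supply the static links for each display term.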

37 Third year: Visual Editing

38 Search Strategy
Keyword search:
0 search term + branch scope term + class scope term
1 search term + class scope term
2 search term only
Phrase search:
3 search term (as a phrase) + branch scope term + class scope term
4 search term (as a phrase) + class scope term
5 search term (as a phrase)
Hierarchical search:
6 search term + all its children + branch scope term + class scope term
7 search term + all its children + class scope term
8 search term + all its children
No search:
9 no search and no link for this display term; label only
Search terms + display term:
10 same as 0, except that the display term is also added to the query
11 same as 1, except that the display term is also added to the query
12 … …
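The codes 0 through 9 follow a regular pattern, which the sketch below exploits: the code divided by 3 picks the search style, and the remainder picks how many scope terms to add. The query syntax is an assumption (terms joined with AND, children joined with OR, multi-word terms quoted); the actual syntax sent to each search engine is not given in the slides, and codes 10 and above are left out.

```python
def quote(term):
    """Quote multi-word terms so they are treated as one unit."""
    return f'"{term}"' if " " in term else term

def build_query(code, term, class_scope, branch_scope=None, children=()):
    """Build a query string for strategy codes 0-9; return None for code 9."""
    if code == 9:
        return None  # label only: no search, no link
    if not 0 <= code <= 8:
        raise ValueError(f"unsupported strategy code: {code}")
    style, scope_level = divmod(code, 3)
    if style == 0:    # keyword search
        head = term
    elif style == 1:  # phrase search
        head = f'"{term}"'
    else:             # hierarchical search: the term plus all its children
        head = "(" + " OR ".join(quote(t) for t in [term, *children]) + ")"
    parts = [head]
    if scope_level == 0 and branch_scope:
        parts.append(quote(branch_scope))
    if scope_level <= 1:
        parts.append(quote(class_scope))
    return " AND ".join(parts)

q0 = build_query(0, "Issues", "digital libraries", branch_scope="Technology")
q8 = build_query(8, "Metadata", "digital libraries", children=["Dublin Core", "RDF"])
```

For example, code 0 for the term Issues in the Technology branch yields `Issues AND Technology AND "digital libraries"`, which matches the scope-term example given earlier.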

39 Digital Libraries General Resources Technology Projects
Indexing & Cataloging Knowledge representation Metadata Resources Collections and Repositories Digital Preservation Economic and legal issues Intellectual Property Rights People and organizations

40 Next Version Convert to XML Use topic map standards
Improve the editing tool

41 Next Integration: KEPT
Knowledge-Enabled Personalization Tool (KEPT).
[Figure: KEPT architecture diagram. Labels shown: RDF-ISO Standards; OAI protocol; Knowledge Repository; Topic Map Editor; Information Resources; Drag and drop; Relational Database; Thesauri, Ontologies, Topic maps, ...; Hierarchical Generator; Co-occurrence Mapping; Web Browser; Schema; XML; XSLT; Searching/Browsing Interface; Search engines; XML Application Server; HTTP Server.]

42 New Interface Search: Primary Source: TopicMap Recycling
[Screenshot: search interface. Source options shown: TopicMap, ERIC Thesaurus, ERIC Database; secondary source: MeSH. Search term: Recycling.]
Related Terms: Conservation (Environment); Depleted Resources; Ecology; Natural Resources; Pollution; Recycling; Solid Wastes; Waste Disposal; Waste Water; Wastes; Water Treatment
Broader Terms: Sanitation
Co-occurrence Terms: Environmental Education; Waste Disposal; Conservation (Environment); Science Education; Natural Resources; Solid Wastes; Ecology; Pollution; Learning Activities; Higher Education; Wastes; Instructional Materials; Conservation Education; Energy; Environment
MeSH Terms matched “Pollution”: Air Pollution; Air Pollution, Indoor; Indoor Air Pollution; Air Pollution, Radioactive; Environmental Pollution; Pollution, Environmental; Tobacco Smoke Pollution; Air Pollution, Tobacco Smoke; Environmental Pollution, Tobacco Smoke; Environmental Smoke Pollution, Tobacco; Environmental Tobacco Smoke Pollution; Water Pollution; Thermal Water Pollution; Water Pollution, Thermal; Water Pollution, Chemical; Chemical Water Pollution; Water Pollution, Radioactive
Other terms shown: Recycling Ecology Wastes Waste Water Waste disposal Pollution Air pollution Water pollution Indoor pollution Energy Natural Resources Water Power Conservation Education Attitudes Motivations ……

43 Next Level: Building a Knowledge Middleware

44 The Knowledge Middleware
A centralized repository that integrates diverse knowledge structures; a set of mapping tools and protocols for crosswalks among various thesauri; a dynamic knowledge base for semantic neighborhoods that uses term occurrences and co-occurrences; a web-based authoring and editing tool for building personalized topic maps from existing knowledge structures in the repository; a visual search interface for content-based searching with the help of knowledge structures in the repository.
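The "semantic neighborhoods" component can be approximated with simple co-occurrence counts over indexed records. The sketch below is one plausible reading, not the project's actual method: it counts how often two index terms are assigned to the same record and ranks a term's neighbors by that count. The sample records are invented for illustration.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(records):
    """Count how many records each unordered pair of terms shares."""
    pairs = Counter()
    for terms in records:
        # Sort so each pair has one canonical order; set() ignores repeats.
        for a, b in combinations(sorted(set(terms)), 2):
            pairs[(a, b)] += 1
    return pairs

def neighbors(pairs, term, top=5):
    """Terms that co-occur with `term`, most frequent first."""
    scores = Counter()
    for (a, b), count in pairs.items():
        if a == term:
            scores[b] += count
        elif b == term:
            scores[a] += count
    return scores.most_common(top)

# Illustrative records: each is the list of index terms for one document.
records = [
    ["Recycling", "Pollution", "Ecology"],
    ["Recycling", "Pollution"],
    ["Ecology", "Energy"],
]
pairs = cooccurrence(records)
nearby = neighbors(pairs, "Recycling")
```

Raw counts favor very common terms, so a production system would likely normalize them; the slides do not say which measure was used.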

45 A semantic map for “Digital Libraries” in INSPEC database

46 Conclusions
Knowledge organizing is one of the major challenges of digital libraries. There is increasing demand for formalized (marked-up) knowledge. There is a growing number of tools and specifications for subject access (or knowledge access) to the Web and to digital libraries.

