
Strategies for subject navigation of linked Web sites using RDF topic maps Carol Jean Godby Devon Smith OCLC Online Computer Library Center Knowledge Technologies.


1 Strategies for subject navigation of linked Web sites using RDF topic maps
Carol Jean Godby, Devon Smith
OCLC Online Computer Library Center
Knowledge Technologies 2002, Seattle, WA

2 Complex Web sites
- Many institutions are struggling to solve problems with their official Web sites.
- But:
  - The contents constantly change.
  - The editors can’t exercise sufficient control.
- One result: an institution’s major presence on the Web is difficult to navigate.

3 The Semantic Web
Tim Berners-Lee’s vision:
- “The current Web has documents for people, not computers. By augmenting Web pages with data designed for automated processing, users will transform the Web into the Semantic Web.”
- “Computers will find the meaning of semantic data by following hyperlinks to definitions of key terms and rules for reasoning about them logically.”

4 The Semantic Web: An Architecture
[Figure: layered architecture diagram. Layers: Unicode and URI; XML + XML namespaces + XML Schema; RDF + RDF Schema; Ontology vocabulary; Logic; Proof; Digital signature; Trust. Annotations: Self-describing documents, Data, Rules.]
Source: Tim Berners-Lee

5 The promise of the Semantic Web
- A common data model
- Conceptual links
- Limited inferences

6 Our demo: goals
- Represent subject/topic information obtained from different sources.
- Demonstrate the value of hypothetical metadata-based navigation for a collection of related Web sites:
  - oclc.org
  - Portions of w3c.org
  - dublincore.org
- Develop and evaluate the utility of Open Source prototyping tools based on RDF.

7 Some common topics
[Figure: topics found across oclc.org, w3c.org, and dublincore.org. Topics shown: digital library, xml, dublin core, xml namespace, xml schema, metadata, xml fragment, xml stylesheet, element node, dc element, syntax, library automation, classification, traditional library, library users, library network, xml profile, schema processor, uri syntax.]

8 Sources of subject/topic metadata
- HTML keywords
- Subject lines in email messages
- An index of library/information science terms
- Terms extracted automatically from text using natural-language-processing algorithms

9 Some term relationships
- Singular/Plural: Library, libraries
- Acronyms:
  - Standard Generalized Markup Language -- SGML
  - Library of Congress Subject Headings -- LCSH
- Coordination:
  - library and information science -- library science, information science
  - information storage and retrieval -- information storage, information retrieval
- Broad/Narrow:
  - Computational linguistics -- linguistics
  - Classification scheme -- classification
- Type-of: Library -- digital library, traditional library
- Related: Library -- library classification scheme, library automation
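Three of these relationships (singular/plural, acronyms, coordination) can be approximated with simple string rules. The following is a minimal sketch, not the presenters' implementation: the function names, the stopword list, and the suffix rules are all assumptions made for illustration, and the plural rule is far too crude for general English.

```python
# Hypothetical helpers illustrating three term relationships from the slide.
STOPWORDS = {"of", "and", "the", "for", "in"}  # assumed list, not from the source


def singularize(term):
    """Very rough singular form: handles '-ies' and a trailing '-s' only."""
    t = term.lower()
    if t.endswith("ies") and len(t) > 3:
        return t[:-3] + "y"
    if t.endswith("s") and not t.endswith("ss"):
        return t[:-1]
    return t


def same_lemma(a, b):
    """Singular/Plural: two terms match if their rough singular forms agree."""
    return singularize(a) == singularize(b)


def acronym_of(phrase):
    """Acronyms: first letter of each word, skipping stopwords."""
    return "".join(w[0].upper() for w in phrase.split() if w.lower() not in STOPWORDS)


def expand_coordination(phrase):
    """Coordination: split 'X and Y' into the two phrases it abbreviates.

    Returns a list of candidate readings, each a list of two phrases.
    """
    left, sep, right = phrase.partition(" and ")
    if not sep:
        return []
    left_words, right_words = left.split(), right.split()
    readings = []
    if len(right_words) > 1:
        # Shared head: 'library and information science'
        head = right_words[1:]
        readings.append([" ".join(left_words + head), " ".join(right_words)])
    if len(left_words) > 1:
        # Shared modifier: 'information storage and retrieval'
        modifier = left_words[:-1]
        readings.append([" ".join(left_words), " ".join(modifier + right_words)])
    return readings


print(same_lemma("Library", "libraries"))
print(acronym_of("Library of Congress Subject Headings"))
print(expand_coordination("information storage and retrieval"))
```

When both sides of the conjunction contain several words, the function returns two readings; choosing between them would need a vocabulary lookup or corpus evidence.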

10 An RDF encoding
[RDF/XML example for the topic "classification"; the element markup was not preserved apart from the closing </Topic> tag.]

11 Connected RDF encodings
[RDF/XML examples for the topics "resource discovery" and "resource"; the element markup was not preserved apart from the closing </Topic> tags.]
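As a rough illustration of what connected topic encodings could look like, the sketch below serializes two topics as RDF/XML with Python's standard library. Only the `Topic` element name and the topic labels come from the slides; the `tm` namespace URI, the `label` and `related` property names, the topic URIs, and the choice of a "related" link between the two topics are assumptions, not the vocabulary used in the demo.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TM_NS = "http://example.org/topicmap#"  # hypothetical vocabulary namespace

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("tm", TM_NS)


def topic_uri(label):
    """Hypothetical URI scheme: one URI per topic label."""
    return "http://example.org/topics/" + label.replace(" ", "_")


def build_topic_rdf(topics):
    """Build an rdf:RDF tree; topics maps a label to (relation, other_label) pairs."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    for label, links in topics.items():
        topic = ET.SubElement(
            root, f"{{{TM_NS}}}Topic", {f"{{{RDF_NS}}}about": topic_uri(label)}
        )
        name = ET.SubElement(topic, f"{{{TM_NS}}}label")
        name.text = label
        for relation, other in links:
            # Each link points at the other topic's URI, connecting the encodings.
            ET.SubElement(
                topic, f"{{{TM_NS}}}{relation}", {f"{{{RDF_NS}}}resource": topic_uri(other)}
            )
    return root


root = build_topic_rdf({
    "resource": [("related", "resource discovery")],  # assumed relationship
    "resource discovery": [],
})
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

The two `Topic` descriptions are connected only through the shared URI in `rdf:resource`, which is what lets separately produced encodings merge into one graph.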

12 A graphical representation of relationships
[Figure: graph of topics linked by labeled relationships. Topics: classification codes, automatic classification, resource discovery and classification, resource discovery, resource, resource description framework, rdf. Relationship labels: Coordination, Broad/Narrow, Type_of, Related, Acronym.]
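A graph like this can be held as a list of labeled edges and walked to find topics near a starting point. The sketch below is illustrative only: the topic names and relationship labels come from the slide, but which label connects which pair of topics is an assumed reading (the Acronym and Coordination edges follow the patterns on the term-relationships slide; the others are guesses), and the function names are hypothetical.

```python
from collections import defaultdict, deque

# (topic, relationship, topic) edges; the pairing of labels to topics is assumed.
EDGES = [
    ("resource discovery and classification", "Coordination", "resource discovery"),
    ("resource discovery and classification", "Coordination", "classification"),
    ("resource description framework", "Acronym", "rdf"),
    ("resource", "Related", "resource discovery"),
    ("resource", "Related", "resource description framework"),
    ("classification", "Type_of", "automatic classification"),
    ("classification", "Broad/Narrow", "classification codes"),
]


def build_graph(edges):
    """Index edges in both directions so navigation can go either way."""
    graph = defaultdict(set)
    for source, relation, target in edges:
        graph[source].add((relation, target))
        graph[target].add((relation, source))
    return graph


def related_topics(graph, start, max_hops=1):
    """Breadth-first search: topics reachable from start within max_hops edges."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        topic, hops = queue.popleft()
        if hops == max_hops:
            continue
        for _relation, neighbor in graph.get(topic, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, hops + 1))
    seen.discard(start)
    return seen


graph = build_graph(EDGES)
print(sorted(related_topics(graph, "rdf", max_hops=2)))
```

Limiting the number of hops is one simple way to keep a navigation view focused on nearby topics.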

13 The philosophy of our system
- Modular
- Open Source
- Project Web site accessible at: topicmap.oclc.org:5000

14 System architecture: 1
Normalized HTML data -> Extract terms -> Filter terms -> Structure terms -> RDF graph
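The three stages can be sketched as small functions chained together. This is a minimal illustration under stated assumptions, not the system described in the talk: it extracts terms only from HTML `keywords` meta tags (one of the metadata sources named earlier), the filter is a placeholder length and duplicate check, and the `hasTopic` predicate and all function names are hypothetical.

```python
from html.parser import HTMLParser


class KeywordExtractor(HTMLParser):
    """Collects comma-separated terms from <meta name="keywords" content="...">."""

    def __init__(self):
        super().__init__()
        self.terms = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "keywords":
            content = attributes.get("content") or ""
            self.terms += [t.strip().lower() for t in content.split(",") if t.strip()]


def extract_terms(html):
    """Stage 1: pull candidate terms out of normalized HTML."""
    parser = KeywordExtractor()
    parser.feed(html)
    return parser.terms


def filter_terms(terms, min_length=3):
    """Stage 2 (placeholder): drop very short terms and duplicates, keeping order."""
    kept = []
    for term in terms:
        if len(term) >= min_length and term not in kept:
            kept.append(term)
    return kept


def structure_terms(terms, page_url):
    """Stage 3: emit (subject, predicate, object) triples for an RDF graph."""
    return [(page_url, "hasTopic", term) for term in terms]  # hypothetical predicate


page = '<html><head><meta name="keywords" content="Metadata, XML, xml, Dublin Core, a"></head></html>'
triples = structure_terms(filter_terms(extract_terms(page)), "http://example.org/page")
print(triples)
```

Keeping each stage a plain function over lists makes it easy to swap in a different extractor or filter without touching the others.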

15 Term filters: using knowledge encoded in the text
Positive contexts for terms: study of, information about, professor of, department of information science, metadata applications, data processing, automatic classification, computational linguistics, internet resources
Negative contexts for terms: very different things, few messages, good point, interesting example, appealing idea, small extension, terse document, simple kind
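One possible reading of these lists is that a candidate phrase is kept when it follows a cue such as "professor of" and rejected when it begins with an evaluative or quantity word such as "good" or "few". The sketch below implements that reading only; it is an assumption about how the contexts were used, and the cue and modifier sets are taken from the examples on the slide rather than from the actual filter.

```python
# Cues and modifiers drawn from the slide's examples; how they were applied
# in the real system is not stated, so this logic is an assumed interpretation.
POSITIVE_CUES = ("study of", "information about", "professor of", "department of")
NEGATIVE_MODIFIERS = {
    "very", "few", "good", "interesting", "appealing", "small", "terse", "simple",
}


def keep_term(candidate, text):
    """Keep a candidate phrase only if it passes both context checks."""
    words = candidate.lower().split()
    if not words or words[0] in NEGATIVE_MODIFIERS:
        return False  # negative context: phrase starts with an evaluative word
    lowered = text.lower()
    # Positive context: some cue appears immediately before the candidate.
    return any(f"{cue} {candidate.lower()}" in lowered for cue in POSITIVE_CUES)


sentence = "She is a professor of computational linguistics and made a good point."
for phrase in ("computational linguistics", "good point"):
    print(phrase, keep_term(phrase, sentence))
```

A production filter would more likely score candidates across many occurrences than apply a single yes/no test to one sentence.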

16 System architecture: 2
Harvester (Perl) -> File System (HTML) -> Metadata Scraper (Perl) -> File System (Normalized HTML) -> Term manipulator (Java) -> File System (XML/RDF) -> XML/RDF Loader -> Database

23 Open issues
- RDF knowledge in the user interface.
- Encoding in RDF or XML?
- The construction of knowledge ontologies.

25 Conclusions
- The enterprise succeeds or fails on the strength of the knowledge ontology.
- RDF and the XTM standard are descriptively equivalent for our work.
- Sophisticated user interface design is required to exploit all of the encoded information.

26 For more information
- Sharon Caraballo. Automatic Construction of a Hypernym-Labeled Noun Hierarchy. PhD dissertation. Brown University, 2001.
- Carol Jean Godby. A Computational Study of Lexicalized Noun Phrases in English. PhD dissertation. The Ohio State University, 2002.

