INLS 520 Erik Mitchell INLS 520 Information Organization.

1 INLS 520 Erik Mitchell INLS 520 Information Organization

2 INLS 520 Erik Mitchell Review Controlled vocabularies –Term Lists, Hierarchies, Trees, Paradigms, Facets, Folksonomies Knowledge organization systems –Term Lists, Thesauri, Taxonomies, Ontologies

3 INLS 520 Erik Mitchell Today Protégé tutorial –Create a thesaurus –Create an ontology Ontologies –Basic concept –Building in Protégé –RDF (?) –OWL (?)

4 INLS 520 Erik Mitchell Assignment 1 recap Required XML tags – Required DC elements –None, need a content wrapper and at least one element,, etc. Advanced Concepts –Namespaces –Schemas/DTDs MARC & DC –Advantages / disadvantages Techniques for discovering data –View Source –DC DOT Metadata generator

5 INLS 520 Erik Mitchell CV Concepts & definitions Controlled Vocabularies –Organized Lists –Relationships between concepts Knowledge organization systems –Typed relationships –Direct / inferable knowledge

6 INLS 520 Erik Mitchell Thesauri Definitions –“Guide to use of terms, showing relationships between them, for the purpose of providing standardized, controlled vocabulary for information storage and retrieval” (Monash) –“A list of words showing similarities, differences, dependencies, and other relationships to each other” (USG)

7 INLS 520 Erik Mitchell Thesauri Concepts Preferred terms Non-preferred terms Semantic relations between terms How to apply terms (guidelines, rules) Scope notes Adding terms (How to produce terms that are not listed explicitly in the thesaurus)

8 INLS 520 Erik Mitchell Common thesaural identifiers SN Scope Note –Instruction, e.g. don’t invert phrases USE Use (another term in preference to this one) UF Used For BT Broader Term NT Narrower Term RT Related Term
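
The identifiers above can be sketched as a small data structure. This is a minimal illustration, not a standard implementation: the `Term` and `Thesaurus` classes, their method names, and the sample terms ("Beer", "Brew", "Beverages", "Brewing") are all assumptions made up for this example. The point it shows is that BT/NT and USE/UF are reciprocal pairs, so recording one side should record the other.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Term:
    """One thesaurus entry (hypothetical structure for illustration)."""
    label: str
    scope_note: str = ""                              # SN
    use: Optional[str] = None                         # USE: preferred term
    used_for: set[str] = field(default_factory=set)   # UF
    broader: set[str] = field(default_factory=set)    # BT
    narrower: set[str] = field(default_factory=set)   # NT
    related: set[str] = field(default_factory=set)    # RT


class Thesaurus:
    def __init__(self):
        self.terms = {}

    def add(self, label, scope_note=""):
        # Return the existing entry, or create one if the label is new.
        if label not in self.terms:
            self.terms[label] = Term(label, scope_note)
        return self.terms[label]

    def add_broader(self, narrow, broad):
        # BT on the narrow term implies NT on the broad term.
        self.add(narrow).broader.add(broad)
        self.add(broad).narrower.add(narrow)

    def add_use(self, non_preferred, preferred):
        # USE on the non-preferred term implies UF on the preferred term.
        self.add(non_preferred).use = preferred
        self.add(preferred).used_for.add(non_preferred)

    def add_related(self, a, b):
        # RT is symmetric.
        self.add(a).related.add(b)
        self.add(b).related.add(a)

    def preferred(self, label):
        # Follow USE if present, otherwise the label is already preferred.
        term = self.terms[label]
        return term.use if term.use else term.label


t = Thesaurus()
t.add("Beverages", scope_note="Drinks intended for human consumption")
t.add_broader("Beer", "Beverages")
t.add_use("Brew", "Beer")
t.add_related("Beer", "Brewing")
```

With this in place, `t.preferred("Brew")` resolves a non-preferred entry term to the term an indexer should actually assign.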

9 INLS 520 Erik Mitchell Thesauri Guides National Information Standards Organization. (2005). Guidelines for the construction, format, and management of monolingual thesauri. ANSI/NISO Z39.19-2005. Bethesda, MD: NISO Press. –http://www.niso.org/standards/resources/Z39-19-2005.pdf?CFID=5559601&CFTOKEN=31747314 Aitchison, Jean & Gilchrist, Alan. Thesaurus Construction: A Practical Guide. 3rd ed. London: Aslib, 1997. Willpower Information Management Consultants –http://www.willpower.demon.co.uk/thesprin.htm

10 INLS 520 Erik Mitchell Ontology Definitions “The study of being or existence” “A specification of a conceptualization” (Gruber) “An ontology formally defines a common set of terms that are used to describe and represent a domain.” (OWL)

11 INLS 520 Erik Mitchell Webster’s Dictionary Webster’s Third New International Dictionary defines Ontology as: 1. A science or study of being, specifically a branch of metaphysics* relating to the nature and relations of being. 2. A theory concerning the kinds of entities and specifically the kinds of abstract entities that are to be admitted to a language system. *Metaphysics: Nature of being “or” existence.

12 INLS 520 Erik Mitchell Ontology Concepts Classes –Names of objects in the domain Relationships between classes Connections between classes Properties of classes Background or identifying knowledge of these objects Constraints on these properties & relationships Limits and parameters of the relationships
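
A rough sketch of how classes, properties, and constraints fit together, written in plain Python rather than in any ontology tool. `OntClass`, `make_instance`, and the food/pizza example are hypothetical names chosen for illustration; "slots" follows the Protégé term used later in these slides. A class inherits the slots of its parents, and a constraint limits which values a slot may take.

```python
class OntClass:
    """A class in a toy ontology (illustrative only)."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)
        self.slots = {}  # slot name -> (value type, allowed values or None)

    def add_slot(self, name, value_type, allowed=None):
        self.slots[name] = (value_type, allowed)

    def all_slots(self):
        # Slots are inherited from every parent, then overridden locally.
        merged = {}
        for parent in self.parents:
            merged.update(parent.all_slots())
        merged.update(self.slots)
        return merged


def make_instance(cls, **values):
    """Create an instance, enforcing the slot constraints of its class."""
    slots = cls.all_slots()
    for key, value in values.items():
        if key not in slots:
            raise ValueError(f"{cls.name} has no slot named {key!r}")
        value_type, allowed = slots[key]
        if not isinstance(value, value_type):
            raise TypeError(f"{key!r} must be of type {value_type.__name__}")
        if allowed is not None and value not in allowed:
            raise ValueError(f"{value!r} is not allowed for {key!r}")
    return {"type": cls.name, **values}


food = OntClass("Food")
food.add_slot("name", str)

pizza = OntClass("Pizza", parents=[food])
pizza.add_slot("crust", str, allowed={"thin", "thick"})

margherita = make_instance(pizza, name="Margherita", crust="thin")
```

Here `pizza` never declares a `name` slot itself; it gets one through its relationship to `food`, while the `crust` constraint rejects any value outside the allowed set.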

13 INLS 520 Erik Mitchell Class exercise Protégé overview –Orientation –Object types (Classes, Slots, Instances) –Relationships (hierarchies, associative) As a group, we will work through the Protégé training guide –http://protege.stanford.edu/doc/tutorial/get_started/get-started.pdf

14 INLS 520 Erik Mitchell What is the semantic web URI (Uniform Resource Identifier) OWL/RDFS All built on top of regular web RDF underlying language of semantic web –XML represents data (document based) –RDF represents pure information (anyone can use, re-harvestable), you could call this knowledge Examples –Swoogle –Goog411

15 INLS 520 Erik Mitchell Ontologies (review) “A common set of terms that are used to describe and represent a domain” –Classes, Relationships, Properties, Constraints A formal organization of knowledge –The primary role of an ontology is to define a language which people and computers in a given domain can share

16 INLS 520 Erik Mitchell A good ontology has Features: –Meaningful – all classes have instances –Accurate / correct –Non-redundant – each class/instance is represented in a single way –Rich in description – context, content Enabled functionality: –Able to use queries to connect new pieces of information –Use XML & definitions to integrate knowledge across domains

17 INLS 520 Erik Mitchell Ontology Continuum Keyword Lists Basic Thesauri Complex Thesauri Taxonomies Simple Ontologies (WordNet) Complex Ontologies (OWL)

18 INLS 520 Erik Mitchell SHOE Ontology project – Possible to build an ontology for anything –Simple HTML Ontology Extensions (SHOE) Project http://www.cs.umd.edu/projects/plus/SHOE/ http://www.cs.umd.edu/projects/plus/SHOE/html-pages.html Sample projects –Beer Ontology http://www.cs.umd.edu/projects/plus/SHOE/onts/index.html#beer –Document Ontology http://www.cs.umd.edu/projects/plus/SHOE/onts/docmnt1.0.html

19 INLS 520 Erik Mitchell Ontology Concepts Multiple inheritance Vertical and horizontal relationships Decomposed subject/object Predicate based description (isRelatedto, hasVersion) First Order Predicate Logic –Statements broken down into subjects/predicates Proposition –All men are mortal, Socrates is a man Therefore –Socrates is mortal
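
The Socrates syllogism above can be sketched as subject/predicate/object statements plus one rule. This is a toy forward-chaining loop, not a real reasoner: the `infer` function, the `is_a` predicate name, and the rule encoding are assumptions made for this example. Each rule reads "anything that has this (predicate, object) also has that (predicate, object)".

```python
# Known statements as (subject, predicate, object) triples.
facts = {("Socrates", "is_a", "man")}

# "All men are mortal": whatever is_a man also is_a mortal.
rules = [(("is_a", "man"), ("is_a", "mortal"))]


def infer(facts, rules):
    """Apply the rules repeatedly until no new statement appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for (if_pred, if_obj), (then_pred, then_obj) in rules:
            for subject, predicate, obj in list(known):
                if predicate == if_pred and obj == if_obj:
                    conclusion = (subject, then_pred, then_obj)
                    if conclusion not in known:
                        known.add(conclusion)
                        changed = True
    return known


conclusions = infer(facts, rules)
```

After the call, `conclusions` contains the original statement and the derived one, `("Socrates", "is_a", "mortal")`. The derived statement was never entered directly, which is the sense of "inferable knowledge" used earlier in the slides.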

20 INLS 520 Erik Mitchell Creating a CV review Design methods –Re-use existing, start with content & desired use ideas –Committee / community approach Top-down –Concept driven Bottom-up –Document driven –Empirical approach Deductive approach –Select terms, create relationships, perform term control Inductive approach –Establish CV at outset, build hierarchies on as needed basis

21 INLS 520 Erik Mitchell Creating a CV review (2) Top-Down –Identify audience –Identify all topics, concepts, uses, and context of the domain –Sort topics identified into an appropriate organization scheme (enumerative, hierarchical, faceted) –Solidify structure and clean up gaps & redundancies –Assign documents to categories, test retrieval Bottom-up –Identify audience –Survey documents for topics/concepts. –Build system on the fly – let content drive structure and limits of system –Identify gap & redundancies in system –Test retrieval

22 INLS 520 Erik Mitchell Creating a CV review (3) Think about scope, use, content, maintenance Gather Terms –Based on existing systems, content –Based on user needs/expectations –Investigate issues of specificity, exhaustivity, granularity Build hierarchies, relationships –Broader/narrower terms, Related terms, Use/Use for, see/see also Establish Rules Implement Evaluate Maintain http://www.boxesandarrows.com/view/creating_a_controlled_vocabulary

23 INLS 520 Erik Mitchell Creating an Ontology Determine Scope of field, define boundaries Check for existing ontologies, vocabularies Select a top-down/bottom-up approach –Identify concepts, vocabulary, parameters, constraints Identify relationships –Multiple hierarchies, inheritance Build, test, maintain

24 INLS 520 Erik Mitchell Class exercise Design your own ontology –In Groups, pick a domain of knowledge Type of food (pizza, soup, beer), field of study (library science, math), etc Come up with a basic ontological framework and begin creating it in Protégé Be prepared to share a brief overview with the class which will include –Domain area –Top level classes –Instance definitions –Relationships

25 INLS 520 Erik Mitchell Assignment 2 Overview In this assignment you will create an ontology on a topic of your choice. Your ontology should contain multiple classes and instances and be focused on a specific purpose. This assignment includes an implementation of the ontology in Protégé and a brief paper explaining your ontology. Guidelines Select a topic of interest and determine the top level (i.e. Basketball, Chocolate, etc). Define the scope (depth/breadth) and purpose of the ontology. Define specific classes and facets (known as slots in Protégé) that describe those classes. Your ontology should have between 5-10 classes with multiple (2-5) slots for each class. Think about the use of hierarchy and multiple inheritance in your ontology. Summarize your ontology in a short paper (no more than two pages). Outline your ontology and discuss your rationale and key decisions (e.g. scope, purpose, classes and slots, defining relationships) Implement the ontology in Protégé. Define your classes and instances. Create two queries that illustrate ways in which the data could be retrieved. Dates & groupwork Due – November 6th Groupwork is acceptable

26 INLS 520 Erik Mitchell RDF Subject, property, object triples Transmitted in XML RDFS extends RDF with an ontology language –Properties, specialization OWL –More powerful extension of RDFS –Uses the same syntax as RDF

27 INLS 520 Erik Mitchell RDF Model –Subject (Resource): Webpage: http://www.stuff.com –Predicate (Property type): Author –Object (Value): “Saki Knafo” “The author of the stuff webpage is Saki Knafo” - A literal, a triple, a statement
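
The statement on this slide can be written down as a single triple. The sketch below uses a plain Python named tuple and prints the triple in N-Triples style. Two assumptions are made here: the slide only says "Author", so the Dublin Core `creator` URI is used as a stand-in for the predicate, and the `Triple` type and `to_ntriples` helper are hypothetical names for this example.

```python
from collections import namedtuple

# A statement is a (subject, predicate, object) triple.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Assumption: the slide's "Author" property is mapped to Dublin Core creator.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

statement = Triple(
    subject="http://www.stuff.com",   # the resource being described
    predicate=DC_CREATOR,             # the property type
    object="Saki Knafo",              # the value, a literal string
)


def to_ntriples(triple):
    """Render a triple whose object is a plain literal as one N-Triples line."""
    literal = triple.object.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{triple.subject}> <{triple.predicate}> "{literal}" .'


line = to_ntriples(statement)
```

The subject and predicate are URIs while the object here is a literal, which is why only the object is quoted in the rendered line.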

28 INLS 520 Erik Mitchell How is RDF different? RDF is a descriptive model that –Allows variable contextualized description –Deconstructs the descriptive process –Allows more granular automated processing of data –Uses exact markup to indicate the context of values (namespaces, schemas)

29 INLS 520 Erik Mitchell Encoding RDF in XML <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"> The Hang: The Island of Black Jeans SAKI KNAFO Sun, 16 Sep 2007 01:04:40 GMT descriptive content
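
The element tags on this slide did not survive in the transcript, so the following is a sketch of what such a record might look like, not a reconstruction of the original markup. It assumes the four values map to the Dublin Core elements `title`, `creator`, `date`, and `description`, and that the described resource is the `http://www.stuff.com` page from the RDF Model slide. The `describe` helper is a hypothetical name. It uses only Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to serialize these namespaces with their usual prefixes.
ET.register_namespace("rdf", RDF)
ET.register_namespace("dc", DC)


def describe(about, **dc_fields):
    """Build an rdf:RDF tree with one rdf:Description of `about`."""
    root = ET.Element(f"{{{RDF}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF}}}Description", {f"{{{RDF}}}about": about}
    )
    for name, value in dc_fields.items():
        # Each keyword becomes a Dublin Core property of the resource.
        ET.SubElement(description, f"{{{DC}}}{name}").text = value
    return root


# Assumed mapping of the slide's values to DC elements.
record = describe(
    "http://www.stuff.com",
    title="The Hang: The Island of Black Jeans",
    creator="SAKI KNAFO",
    date="Sun, 16 Sep 2007 01:04:40 GMT",
    description="descriptive content",
)
xml_text = ET.tostring(record, encoding="unicode")
```

Each child of `rdf:Description` is one triple: the `rdf:about` URI is the subject, the element name is the predicate, and the element text is the object.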

30 INLS 520 Erik Mitchell Iterative RDF description <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:vcard="http://dli.grainger.uiuc.edu/publications/metadatacasestudy/dc_schemas/vcard.xsd" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"> The Hang: The Island of Black Jeans http://www.stuff.com Sun, 16 Sep 2007 01:04:40 GMT descriptive content rdf:about="http://dli.grainger.uiuc.edu/publications/metadatacasestudy/dc_,,,"> Saki Knafo knafo@www.nytimes.com

31 INLS 520 Erik Mitchell RDFS RDF Schema –Defines additional rdf elements that help type relationships Special Classes –Based on RDF Classes / Properties / Attributes with additional http://www.w3schools.com/rdf/rdf_reference.asp Allows the creation of vocabularies / ontologies
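
One way RDFS "types" relationships is through `rdfs:subClassOf`: an instance of a class is also an instance of every class above it. The sketch below illustrates that one entailment over a handful of triples. It is not an RDFS reasoner, the `ex:` names are invented for the example, and `superclasses` and `types_of` are hypothetical helper names.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

# Invented example data, written as prefixed names for readability.
triples = {
    ("ex:Pizza", SUBCLASS, "ex:Food"),
    ("ex:Margherita", SUBCLASS, "ex:Pizza"),
    ("ex:myLunch", TYPE, "ex:Margherita"),
}


def superclasses(cls, triples):
    """Collect every class reachable from `cls` by following subClassOf."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for subject, predicate, obj in triples:
            if subject == current and predicate == SUBCLASS and obj not in found:
                found.add(obj)
                stack.append(obj)
    return found


def types_of(resource, triples):
    """Return the stated types of a resource plus the inherited ones."""
    direct = {o for s, p, o in triples if s == resource and p == TYPE}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, triples)
    return inferred


lunch_types = types_of("ex:myLunch", triples)
```

Only one `rdf:type` statement was entered for `ex:myLunch`, yet `lunch_types` also includes `ex:Pizza` and `ex:Food` because the schema says how the classes relate.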

32 INLS 520 Erik Mitchell OWL (Web Ontology Language) An ontology language that is geared towards representing information on the web –Classes, properties, and relationships that describe URIs and their facets. Based on the Triple concept –Subject, Predicate, Object –3 versions: OWL-Lite, OWL-DL, OWL-Full Formatted in RDF/XML –Uses RDF and RDFS as a foundation –Adds new elements in the owl namespace

33 INLS 520 Erik Mitchell OWL Versions OWL-Lite –Simple hierarchies, constraints OWL-DL –Uses description logics Logic-based semantic markup based on first-order predicate logic –Still guarantees finite relationship processing –Best suited for automation OWL-Full –Most complex –Open ended, possible to get into infinite processing

34 INLS 520 Erik Mitchell OWL Example <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns="http://www.w3.org/2001/sw/BestPractices/OEP/SimplePartWhole/part.owl#" xml:base="http://www.w3.org/2001/sw/BestPractices/OEP/SimplePartWhole/part.owl"> 1.0 An ontology containing the basic part relations: partOf, hasPart, partOf_directly, and hasPart_directly. These are described in the accompanying note. Author: Chris Welty (Chris Welty)

35 INLS 520 Erik Mitchell More OWL Examples Airport Pizza

36 INLS 520 Erik Mitchell Next Week(s) Fall Break – Enjoy 10/30 – Guest speaker Lorrie Eakin 11/6 – First Group presentations

