COULD WE CREATE A SEMANTIC WEB DATA MODEL FOR SUBJECT CATALOGING?

1 COULD WE CREATE A SEMANTIC WEB DATA MODEL FOR SUBJECT CATALOGING?

2 BY Martha M. Yee Cataloging Supervisor UCLA Film & Television Archive myee@ucla.edu http://myee.bol.ucla.edu

3 HOW I GOT STARTED DOING RESEARCH MLIS, 1978-1980, UCLA Graduate School of Library and Information Science Ph.D., 1993, UCLA Graduate School of Library and Information Science Try to ask a research question that is of interest beyond your institution

4 INTRODUCTION 1. The vision 2. The experiment 3. Definition of terms 4. The current approach to linking two different concepts or objects in a subject relationship

5 INTRODUCTION 5. Research questions 6. The RDF model so far—the schema 7. The RDF model so far—an example (instance) 8. Potential problems with RDF

6 THE VISION The Web as shared database instead of shared document store

7 THE VISION Instead of records, URIs (Uniform Resource Identifiers) for entities: a URI for each work, containing all work attributes

8 THE VISION URI for subject entity (concept or object), containing all entity attributes, including preferred name, variant names, but also much more data about the concept or object than our current authority records do

9 THE VISION URI for disciplinary approach/perspective entity (currently a classification number), containing all entity attributes, including preferred name, variant names, but also much more data about the disciplinary approach/perspective than our current authority records do

10 THE VISION URIs for persons, corporate bodies, places, etc., including preferred name, variant names, but also much more data about the person, corporate body, or place than our current authority records do

11 THE VISION If any data about a particular entity needed to be changed, it would be changed once at the URI and immediately accessible to all users, libraries and library staff by means of links down to local data such as circulation, acquisitions, and binding data

12 THE EXPERIMENT A set of cataloging rules that incorporate both descriptive and subject cataloging rules, differing from RDA in being more FRBR-ized. Today, the focus is on the subject cataloging rules.

13 THE EXPERIMENT I am now in the process of trying to model my cataloging rules in the form of an RDF/RDFS/OWL/SKOS model. Today, the focus is on the subject part of this model.

14 THE EXPERIMENT I don’t seriously expect anyone to adopt these rules!

15 THE EXPERIMENT You can find these rules and the data model (including the RDF schema and some RDF examples) at: http://myee.bol.ucla.edu

16 SOME DEFINITIONS The semantic web: a way to represent knowledge; a knowledge representation language that provides ways of expressing meaning that are amenable to computation; a means of constructing maps of domains of knowledge consisting of class and property axioms with a formal semantics

17 SOME DEFINITIONS The semantic web: the web as a huge shared database; hyperdata replacing hypertext

18 SOME DEFINITIONS RDF (Resource Description Framework): a family of specifications for methods of modeling information that underpins the semantic web through a variety of syntax formats

19 SOME DEFINITIONS RDF (Resource Description Framework) Data encoded as: the subject of a triple (New York); the predicate of a triple (has the postal abbreviation); the object of a triple (NY)
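
The triple on this slide can be sketched in a few lines of Python. This is only an illustration of the subject/predicate/object shape, not an RDF library; the `ex:` identifiers are hypothetical stand-ins for real URIs, and `Triple` and `objects` are names invented for the sketch.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# The slide's example; "ex:" names are hypothetical placeholders for URIs.
triples = [Triple("ex:NewYork", "ex:hasPostalAbbreviation", "NY")]


def objects(graph, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in graph if t.subject == subject and t.predicate == predicate]


print(objects(triples, "ex:NewYork", "ex:hasPostalAbbreviation"))
```

A query is then just a filter over the list of statements, which is the sense in which the web becomes a shared database.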

20 SOME DEFINITIONS RDF (Resource Description Framework) XML is commonly used to express RDF, but is not a necessity

21 SOME DEFINITIONS RDF (Resource Description Framework) RDFS or RDF Schema is an extensible knowledge representation language providing basic elements for the description of ontologies, AKA RDF vocabularies

22 SOME DEFINITIONS RDF (Resource Description Framework) RDFS data encoded as: Class (= Entity); the subject of a triple (e.g., “New York”) Class relationship (semantic linkage); the predicate of a triple (e.g., “has the postal abbreviation”) Class property (= Attribute); the object of a triple (e.g., “NY”)

23 SOME DEFINITIONS RDF (Resource Description Framework) OWL (Web Ontology Language): a family of knowledge representation languages for authoring ontologies compatible with RDF

24 SOME DEFINITIONS RDF (Resource Description Framework) SKOS (Simple Knowledge Organization System): a family of formal languages built upon RDF and designed for representation of thesauri, classification schemes, taxonomies or subject-heading systems

25 A CONTROVERSY A controversy over what URIs identify: 1. the name for a concept? 2. the concept itself? 3. a web location? 4. or a document instance?

26 LINKING SUBJECT ENTITIES Current approach to linking two different subject entities (concepts or objects): 1. Compound headings: Comic books and children; African Americans on television. 2. Heading-subdivision combinations: Birds--Effect of pesticides on; Women--Employment.

27 RESEARCH QUESTION 1 1. Is it possible to fit our subject cataloging, genre/form, and classification system data into RDF/RDFS/OWL/SKOS?

28 RESEARCH QUESTION 2 2. If it is, is it possible to use that data to design indexes and displays that meet the objectives of the catalog (providing an efficient instrument to allow a user to find all of the works in a given genre or form, or all of the works on a particular subject)?

29 RESEARCH QUESTION 3 3. Would it be possible to create and control a list of types of relationships between concepts and objects that currently make up main heading-subdivision combinations in LCSH?

30 RESEARCH QUESTION 3 Ability, Types of (free-floating scope note, Ability testing, H1095, p. 4); Activities, Types of (free-floating scope note, Equipment and supplies, H1095, p. 22); Animals, Individuals (pattern heading, H1147); Animals, Groups of (pattern heading, H1147); Animals, Types of (free-floating scope note, Equipment and supplies, H1095, p. 22); Archaeological sites, Individual (free-floating scope note, Catalogs, H1095, p. 12)

31 RESEARCH QUESTION 4 4. Would it be possible to create and control a list of types of relationships between concepts and objects that currently make up compound headings? Perhaps these types of relationships could be made more granular, e.g.

32 RESEARCH QUESTION 4 Subject to subject relationship--Activity of entity relationship. Example: Child artists. Subject to subject relationship--Audience for activity. Example: Art therapy for children.

33 RESEARCH QUESTION 4 Subject to subject relationship--Created by. Example: Films by children. Subject to subject relationship--Depiction of. Example: Children in art.

34 RESEARCH QUESTION 4 Subject to subject relationship--Effect on. Example: Television and children. Subject to subject relationship--Material made of. Example: Brick chimneys.

35 RESEARCH QUESTION 4 Subject to subject relationship--Participation in. Example: Women in television broadcasting. Subject to subject relationship--Regulation of. Example: Railroads and state.
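
As a rough sketch of what more granular relationships could look like in data, each compound heading might be replaced by an explicit typed link between two entities. The relationship names and headings below come from the slides; the `SubjectLink` structure, the `links_involving` helper, and the choice of which entity is source and which is target are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubjectLink:
    source: str
    relationship: str
    target: str
    heading: str  # the compound heading this link is meant to replace


# Direction of each link (source -> target) is an assumption of this sketch.
LINKS = [
    SubjectLink("Television", "effect on", "Children", "Television and children"),
    SubjectLink("Chimneys", "material made of", "Brick", "Brick chimneys"),
    SubjectLink("Films", "created by", "Children", "Films by children"),
    SubjectLink("Women", "participation in", "Television broadcasting",
                "Women in television broadcasting"),
]


def links_involving(links, entity):
    """Return every link in which the entity appears at either end."""
    return [link for link in links if entity in (link.source, link.target)]


for link in links_involving(LINKS, "Children"):
    print(f"{link.source} --{link.relationship}--> {link.target}")
```

With the relationship type stored explicitly, an index could group everything linked to Children by kind of relationship rather than by the accident of heading wording.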

36 RESEARCH QUESTION 5 Would it be possible to use the same type of relationship properties to link objects/concepts to place or period more explicitly or in a more granular way than heretofore?

37 RESEARCH QUESTION 5 For example, a geographic subdivision may refer to: the place of origin of an object, person, corporate body, etc.; the place in which an event or activity occurred; or the place in which an object, person, corporate body, etc. is now found. Current use of geographic subdivisions can be ambiguous as to which of the above meanings is intended.

38 RESEARCH QUESTION 6 Would it be possible to use RDF to encode broader and narrower hierarchical relationships such as those found in both subject heading lists and classification schemes?

39 RDF MODEL SO FAR--SCHEMA Domain (RDFS): A global restriction on a property, used to infer a subject's membership in a class or classes. Range (RDFS): A global restriction on a property, used to infer an object's membership in a class or classes.
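
A minimal sketch of how domain and range drive inference, assuming a hypothetical property `ex:hasSubject` with a narrower domain and range than the model in this talk uses (where both are simply Resource). The function name `infer_types` and all `ex:` identifiers are invented for the example.

```python
RDF_TYPE = "rdf:type"

# Hypothetical schema, for illustration only.
domain = {"ex:hasSubject": "ex:Work"}
range_ = {"ex:hasSubject": "ex:Concept"}

data = [("ex:book1", "ex:hasSubject", "ex:Fishes")]


def infer_types(triples, domain, range_):
    """Infer class membership of subjects (via domain) and objects (via range)."""
    inferred = set()
    for s, p, o in triples:
        if p in domain:
            inferred.add((s, RDF_TYPE, domain[p]))
        if p in range_:
            inferred.add((o, RDF_TYPE, range_[p]))
    return inferred


for triple in sorted(infer_types(data, domain, range_)):
    print(triple)
```

Note that domain and range do not reject data that fails to fit; they add conclusions about class membership, which is why a domain of Resource places no practical constraint on a property.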

40 RDF MODEL SO FAR--SCHEMA Subclass (RDFS): Used to create a hierarchy below the class level; all things in a subclass are also in its class. Subproperty (RDFS): Used to create a hierarchy below the property level; use of a subproperty implies the use of its parent property.
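
The subproperty rule can be sketched as follows, using the one subproperty link that appears in the schema slides of this talk (`resworksubjabout` under `resworksubjrel`). The instance identifiers `ex:Fishes` and `ex:work1`, the direction of the triple, and the helper names are assumptions of the sketch.

```python
# Child property -> parent property, as stated in the schema slides.
SUBPROPERTY_OF = {"ycrschema:resworksubjabout": "ycrschema:resworksubjrel"}


def superproperties(prop, subproperty_of):
    """Yield the ancestors of prop by following subproperty links upward."""
    seen = set()
    while prop in subproperty_of and prop not in seen:
        seen.add(prop)
        prop = subproperty_of[prop]
        yield prop


def expand(triples, subproperty_of):
    """Add the triples implied by every subproperty statement."""
    out = set(triples)
    for s, p, o in triples:
        for parent in superproperties(p, subproperty_of):
            out.add((s, parent, o))
    return out


# Hypothetical instance data; the subject-to-work direction is assumed.
data = [("ex:Fishes", "ycrschema:resworksubjabout", "ex:work1")]
for triple in sorted(expand(data, SUBPROPERTY_OF)):
    print(triple)
```

A search on the general subject relationship would then also retrieve works linked by the more specific "about" relationship.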

41 RDF MODEL SO FAR--SCHEMA Disjoint with (OWL): Used to assert that two or more classes have no members in common; in this model it is applied to sibling classes that share the same parent class. An instance that is a member of one of these classes cannot also be a member of the other class(es).
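
A disjointness assertion can be checked mechanically. The sketch below uses the four mutually disjoint subject classes named in the schema slides of this talk; the instance identifiers and the `disjointness_violations` helper are hypothetical.

```python
from itertools import combinations

# Mutually disjoint classes named in the Concept and Object schema slides.
DISJOINT_GROUP = {
    "ycrschema:Concept",
    "ycrschema:Object",
    "ycrschema:Placeassubj",
    "ycrschema:Eventassubj",
}


def disjointness_violations(types_by_instance, disjoint_group):
    """List (instance, class_a, class_b) for each pair of clashing classes."""
    violations = []
    for instance, types in sorted(types_by_instance.items()):
        clashing = sorted(types & disjoint_group)
        for a, b in combinations(clashing, 2):
            violations.append((instance, a, b))
    return violations


# Hypothetical instance data: ex:thing is wrongly typed as two disjoint classes.
types_by_instance = {
    "ex:Fishes": {"ycrschema:Concept"},
    "ex:thing": {"ycrschema:Concept", "ycrschema:Object"},
}
print(disjointness_violations(types_by_instance, DISJOINT_GROUP))
```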

42 RDF MODEL SO FAR--SCHEMA Class: Work; URI: http://myee.bol.ucla.edu/ycrschema#Work; Label: work; Disjoint with: ycrschema#Expression, ycrschema#Title-Manifestation, ycrschema#SerialTitle, ycrschema#Manifestation and ycrschema#Item; Subclass of: rdf-schema#Resource

43 RDF MODEL SO FAR--SCHEMA Class: Concept; URI: http://myee.bol.ucla.edu/ycrschema#Concept; Label: concept; Disjoint with: ycrschema:Object, ycrschema:Placeassubj and ycrschema:Eventassubj; Subclass of: rdf-schema#Resource

44 RDF MODEL SO FAR--SCHEMA Class: Object; URI: http://myee.bol.ucla.edu/ycrschema#Object; Label: object; Disjoint with: ycrschema:Eventassubj, ycrschema#Placeassubj and ycrschema:Concept; Subclass of: rdf-schema#Resource

45 RDF MODEL SO FAR--SCHEMA Property: Resource to Work Subject Relationships; URI: http://myee.bol.ucla.edu/ycrschema#resworksubjrel; Label: resource to work subject relationships; Domain: rdf-schema#Resource; Range: rdf-schema#Resource

46 RDF MODEL SO FAR--SCHEMA Property: Resource to Work Subject Relationship--About (Nonfiction); URI: http://myee.bol.ucla.edu/ycrschema#resworksubjabout; Label: resource to work subject relationship--about (nonfiction); Domain: rdf-schema#Resource; Range: rdf-schema#Resource; Subproperty of: ycrschema:resworksubjrel

47 RDF MODEL SO FAR--SCHEMA Property: Subject to Subject Relationship--Effect on; URI: http://myee.bol.ucla.edu/ycrschema#subjsubjeffect; Label: subject to subject relationship--effect on; Domain: rdf-schema#Resource; Subproperty of: ycrschema:subjsubjrel

48 THE RDF MODEL SO FAR—AN EXAMPLE (INSTANCE)? Fishes sh85048726

49 THE RDF MODEL SO FAR—AN EXAMPLE (INSTANCE)? Effect of pesticides on sh00002520

50 SOME PROBLEMS? Can we do what we need to do within the context of the semantic web?

51 SOME PROBLEMS? There is a cross reference from Blimps to Airships, but not from Blimps--Drama to Airships--Drama. For that reason, a search for Blimps in the subject index of any OPAC in which the main heading Airships is not in use without subdivision will fail, even if the library or archive has material on blimps under the heading Airships with various subdivisions.

52 SOME PROBLEMS? The solution to this problem is to define a transitive or inheritance relationship between a main heading and its subject subdivisions.
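
A minimal sketch of the inheritance idea, assuming headings are plain strings with `--` separating main heading from subdivisions. The `search` function is hypothetical, and `Airships--History` is an invented second heading added only to show that every subdivision benefits.

```python
# Variant term -> preferred main heading (the cross reference from the slide).
SEE_REFERENCES = {"Blimps": "Airships"}

# Headings actually assigned in the catalog; note that bare "Airships" is absent.
# "Airships--History" is a hypothetical example heading.
HEADINGS_IN_USE = ["Airships--Drama", "Airships--History"]


def search(term, headings, see_refs):
    """Resolve the term to its preferred heading, then match subdivisions too."""
    main = see_refs.get(term, term)
    return [h for h in headings if h == main or h.startswith(main + "--")]


print(search("Blimps", HEADINGS_IN_USE, SEE_REFERENCES))
```

Because the cross reference is resolved at the level of the main heading and then inherited by its subdivisions, the search succeeds even though no record carries the unsubdivided heading.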

53 SOME PROBLEMS? Unfortunately, RDF seems to resist hierarchical relationships. It assumes that you just need to connect everything to everything else without needing to express any hierarchy.

54 SOME PROBLEMS? This is bad news for bibliographic data which is rife with hierarchical relationships. Hierarchy is one of our major tools for expressing meaning to our users.

55 SOME PROBLEMS? In order to recognize the fact that the subject of a book or a film could be a work, a person, a concept, an object, an event, or a place, all classes in the model, it was necessary to define subject itself as a property (a relationship) rather than a class in its own right. All subject properties are defined as having a domain of resource, meaning there is no constraint as to the class to which these subject properties apply. I'm not sure whether there will be any fallout from that modelling decision.

56 SOME PROBLEMS? In order to distinguish between the corporate behavior of a jurisdiction and the subject behavior of a geographical location, I have defined two different classes for place: Place as Jurisdictional Corporate Body, and Place as Geographic Area. Will this cause problems in the model?

57 SOME PROBLEMS? Some events are corporate bodies (e.g. conferences that publish papers) and some are a kind of subject (e.g. an earthquake). I have defined two different classes for event: Conference or Other Event as Corporate Body Creator, and Event as Subject. Will this cause problems in the model?

58 SOME PROBLEMS? If subject itself is a property, a relationship between two subjects becomes a property of a property. Technically this is possible in RDF but it becomes very complex.
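
One possible reading of "a property of a property", sketched with RDF's standard reification vocabulary (`rdf:Statement`, `rdf:subject`, `rdf:predicate`, `rdf:object`): each subject statement is turned into a resource of its own so that the two statements can be linked. The `ex:` identifiers, the direction of `resworksubjabout`, and the use of `subjsubjeffect` between statements are assumptions of this sketch, not something the model in this talk prescribes.

```python
def reify(statement_id, triple):
    """Describe one triple as an rdf:Statement resource so it can be linked to."""
    subject, predicate, obj = triple
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]


# Hypothetical identifiers; the direction of resworksubjabout is assumed.
about_pesticides = ("ex:Pesticides", "ycrschema:resworksubjabout", "ex:work1")
about_fishes = ("ex:Fishes", "ycrschema:resworksubjabout", "ex:work1")

graph = reify("ex:stmt1", about_pesticides) + reify("ex:stmt2", about_fishes)
# The relationship between the two subject statements of the same work.
graph.append(("ex:stmt1", "ycrschema:subjsubjeffect", "ex:stmt2"))

print(len(graph))
```

Two subject statements and one link between them expand to nine triples, which gives a sense of the complexity the slide refers to.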

59 SOME PROBLEMS? I have treated genre/form as a class, but RDA treats it as a property of work. Which approach is best?

60 CONCLUSION Our need for hierarchy and our need for properties of properties may in the end dictate that RDF is not yet sophisticated enough to encode our data efficiently and then use it for efficient displays and efficient indexes. My goal in doing this work is to find out whether or not that is the case, and if it is, to try to imagine how a more sophisticated system could be devised that would support hierarchy and complex relationships and still allow our data to live on the web outside of database software.

61 READ MORE ABOUT IT Yee, Martha M. "Can Bibliographic Data be Put Directly Onto the Semantic Web?" Information Technology and Libraries 28:2 (June, 2009): 55-80. Also available on the Web at: http://repositories.cdlib.org/postprints/3369

62 READ MORE ABOUT IT McGrath, Kelley. "Facet-Based Search and Navigation with LCSH: Problems and Opportunities." Code{4}lib Journal 1 (December 17, 2007). Available on the Web at: http://journal.code4lib.org/articles/23

63 READ MORE ABOUT IT Coyle, Karen. LCSH as Linked Data. On Coyle’s Information (blog) at: http://kcoyle.blogspot.com/2009/05/lcsh-as-linked-data-beyond-dash-dash.html

