INFM 700: Session 6 Taxonomies and Metadata


1 INFM 700: Session 6 Taxonomies and Metadata
Paul Jacobs
The iSchool, University of Maryland
Monday, October 26, 2009
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States license. See for details

2 Today’s Topics
Nature and types of metadata
General-purpose taxonomies (ontologies, thesauri, …)
Special-purpose taxonomies & thesauri
Practical use of taxonomies and metadata
Metadata | Taxonomies & Thesauri | Practical Uses

3 Metadata
Literally “data about data”
“a set of data that describes and gives information about other data” (Oxford English Dictionary)
Why do we need this?
Types of metadata:
  Descriptive/subjective/content (e.g., author, subject, keywords, …)
  Administrative (e.g., owner, rights, cost, creation date, version, …)
  Technical (e.g., format, size, dependencies, programs)
In practical terms:
  Metadata helps users locate, navigate, and interpret content
  Metadata helps organizations manage content
  Metadata helps systems manipulate content

4 Data without Metadata…
Who: authored it? to contact about data?
What: are contents of database?
When: was it collected? processed? finalized?
Where: was the study done?
Why: was the data collected?
How: were data collected? processed? verified?
… can be pretty useless!

5 Early Example of Metadata

6 Related Terms & Techniques
Listed in order of increasing complexity and richness:
Taxonomies
  Anything organized in some sort of hierarchical structure
Tagging
  Adding almost any kind of metadata to content, but now often descriptive and user-provided
Thesauri
  Focus on relations between terms
  Focus on “concepts”
Ontologies
  Usually model a specific domain or part of the world
  Generally machine-readable

7 Menagerie of Terms
Classification, Hierarchies, Epistemology, Directories, Controlled vocabularies, Knowledge representation
Let’s focus on significant differences.
Let’s focus on advantages/disadvantages.
Let’s focus on how each is useful.
Let’s not quibble over exactly what to call each.

8 Segue – Metadata to Taxonomies
What do taxonomies, thesauri, etc., have to do with meta-data?

9 Taxonomies
Organization of objects according to some principle
Familiar examples:
  Linnaean taxonomy (for living organisms)
  Web directories (e.g., Yahoo or ODP)
  Corporate directories
  Organization charts
  Organizational structures previously discussed

10 Thesauri: Motivation
“Semantic gap” between concepts and words
Words are used to evoke concepts
  Concrete objects: MacBook Pro, iPhone
  Abstract ideas: freedom, peace
(Diagram labels: Concepts, Ideas, Words, Meaning)

11 Words and concepts
The semantic gap: What’s the problem?
  Synonymy: roughly, different words or phrases can be used to express similar ideas (e.g., “notebook”, “laptop”)
  Polysemy: roughly, the same word can have different meanings (e.g., “line”: fishing, code, queue, …)
Taxonomies try to group similar concepts
“Tags” often assign words to concepts, making it easier to find related concepts
Controlled vocabularies avoid ambiguity (like a specific tag set)
Thesauri represent attempts to better organize mappings between words and concepts
Do these present precision or recall problems?

12 Some Real Examples
Content tagging and social media (e.g., Flickr, del.icio.us)
Special-purpose classification schemes and thesauri (e.g., Art & Architecture Thesaurus (AAT), UMLS)
General semantic tools and classification schemes (e.g., Princeton WordNet, Roget’s Thesaurus)

13 Think for a sec…
You are developing a content-rich site and need organization and labeling schemes to help users view/browse/learn/find stuff – what do you do?
  Define your own tagging/organization scheme?
  Let the users define their own?
  Leave it all to a search engine?
  Use some existing scheme?
  . . .

14 Flickr – popular tags

15 Flickr – related tags

16 Del.icio.us – related tags

17 Art & Architecture Thesaurus

18 UMLS (Unified Medical Language System)
Source: National Library of Medicine (NIH)
3 Knowledge Sources, used separately or together:
  Metathesaurus: 1 million+ biomedical concepts from over 100 sources
  Semantic Network: 135 broad categories and 54 relationships between them
  SPECIALIST Lexicon + Tools: lexical information and programs for language processing

19 UMLS (Unified Medical Language System)
Source: National Library of Medicine (NIH)
Began in 1986 as a long-term R&D project
Designed for systems developers
Develop multi-purpose tools to enhance understanding of medical meaning across systems
Overcome barriers to effective retrieval of machine-readable information
Overcome the variety of ways the same concepts are expressed in machine-readable and human language

20 UMLS Uses
Source: National Library of Medicine (NIH)
Information retrieval
Thesaurus construction
Natural language processing
Automated indexing
Electronic health records (EHR)
Distribution mechanism for:
  HIPAA, CHI, PHIN regulatory standards
  SNOMED CT

21 UMLS Metathesaurus http://www.nlm.nih.gov/research/umls/

22 UMLS Metathesaurus http://www.nlm.nih.gov/research/umls/

23 UMLS Thesaurus Browser

24 Think for a sec…
You are developing a content-rich site and need organization and labeling schemes to help users view/browse/learn/find stuff – what do you do?
  Define your own tagging/organization scheme?
  Let the users define their own?
  Leave it all to a search engine?
  Use some existing scheme?
  . . .

25 Applying IA Principles
Focus on users and user needs – users are different, and have different models
Focus on content – concepts are different, too – different levels, words, complexity, vagueness
Examples:
  What’s the difference between laptop, PDA, phone, and convergence device?
  When is “cancer research” “oncology”?
  When a user browses a furniture catalog for chairs, do you show them ottomans and footstools?

26 Standard Thesaurus Structure
Broader Terms: Computer
Preferred: Notebook (IS-A Computer)
Synonyms (variants): Laptop (AKA Notebook)
Narrower Terms: Desktop Replacement, Ultraportable, Tablet PC (each IS-A Notebook)
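
The thesaurus structure on this slide can be sketched as a small data structure. This is only an illustration: the `ThesaurusEntry` class and the `preferred_term` and `expand_query` helpers are hypothetical names, not part of the lecture, and the single entry simply reuses the Notebook/Laptop example.

```python
from dataclasses import dataclass, field


@dataclass
class ThesaurusEntry:
    """One concept: a preferred term plus its relations (hypothetical layout)."""
    preferred: str
    variants: list = field(default_factory=list)   # synonyms (AKA)
    broader: list = field(default_factory=list)    # IS-A parents
    narrower: list = field(default_factory=list)   # IS-A children


ENTRIES = [
    ThesaurusEntry(
        preferred="Notebook",
        variants=["Laptop"],
        broader=["Computer"],
        narrower=["Desktop Replacement", "Ultraportable", "Tablet PC"],
    ),
]

# Index every preferred term and variant (case-insensitively) to its entry.
INDEX = {}
for entry in ENTRIES:
    for term in [entry.preferred, *entry.variants]:
        INDEX[term.lower()] = entry


def preferred_term(word):
    """Map a user's word to the preferred term, or None if it is unknown."""
    entry = INDEX.get(word.lower())
    return entry.preferred if entry else None


def expand_query(word):
    """Return the preferred term, its variants, and its narrower terms."""
    entry = INDEX.get(word.lower())
    if entry is None:
        return [word]
    return [entry.preferred, *entry.variants, *entry.narrower]
```

Mapping variants to one preferred term is what lets an index use a single label per concept, while query expansion over variants and narrower terms is one way a search feature might use the same entry.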

27 IA Uses of Thesauri
For organization
For navigation
For indexing content
For searching

28 Poly-Hierarchies
Concepts can have multiple parents
Example:
  Parents: Cracow (Poland : Voivodship); German death camps
  Concept: Auschwitz II-Birkenau (Poland : Death Camp)
  Children: Block 25 (Auschwitz II-Birkenau); Kanada (Auschwitz II-Birkenau)
From Shoah Foundation’s thesaurus of Holocaust terms
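
A poly-hierarchy can be represented by letting each term point to a list of parents rather than a single parent. The sketch below assumes the parent/child arrangement suggested by the slide's example; the `PARENTS` mapping and the `ancestors` helper are hypothetical names used only for illustration.

```python
# Each term maps to a list of parents, so a term may have more than one.
# The arrangement below is an assumption read off the slide's example.
CAMP = "Auschwitz II-Birkenau (Poland : Death Camp)"
PARENTS = {
    CAMP: ["Cracow (Poland : Voivodship)", "German death camps"],
    "Block 25 (Auschwitz II-Birkenau)": [CAMP],
    "Kanada (Auschwitz II-Birkenau)": [CAMP],
}


def ancestors(term):
    """Collect every term reachable by following parent links upward."""
    found = []
    stack = list(PARENTS.get(term, []))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.append(parent)
            stack.extend(PARENTS.get(parent, []))
    return found
```

Because a term can be reached along more than one path, content indexed under Block 25 would be findable both by browsing geographically and by browsing camp types.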

29 Poly-Hierarchies
What are the advantages and disadvantages?
What’s the relationship to polysemy?

30 Practical Uses & Implementation
What are we trying to do (e.g., help users find stuff)?
What tools are at our disposal (e.g., tags, XML, databases)?
Given the above, how do we use/implement hierarchies and thesauri?

31 Faceted Hierarchies
Alternative to single and poly-hierarchies
Basic idea:
  Describe objects along multiple facets
  Each facet has its associated hierarchy
Issues:
  What’s a facet?
  How do you navigate faceted hierarchies?

32 Faceted Browsing Example

33 Faceted Browsing Example

34 Faceted Browsing Example
Demo:

35 Advantages of Facets
Integrates searching and browsing
Easy to build complex queries
Easy to narrow, broaden, shift focus
Helps users avoid getting lost
Helps to prevent “categorization wars”

36 Relationship to IA?
Architecture layers: Database, Application Server, Web Server, Network
Ontologies are implicitly “hidden” here!!!
Example ontology (diagram):
  Concepts: Trip, Flight, Airplane (Type, Capacity)
  Relations: Part-of, Equipment
  Attributes shown: From, To; Departure Time, Arrival Time; Origin, Destination
  Rule: Arrival Time is always after Departure Time
  Rule: Distance from Origin to Destination is typically > 100 miles
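
The two rules on this slide are the kind of domain knowledge that usually ends up buried in application code. The sketch below makes that explicit. It is only an assumption about how such checks might look: the `Flight` class, its field names (including `distance_miles`), and the `violations` helper are all hypothetical, and the times are treated as being in a single time zone.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Flight:
    """Hypothetical record; field names are invented for illustration."""
    origin: str
    destination: str
    departure_time: datetime  # assumed to share one time zone with arrival_time
    arrival_time: datetime
    distance_miles: float


def violations(flight):
    """Return a list of messages for each slide rule the flight breaks."""
    problems = []
    # Rule: Arrival Time is always after Departure Time (a hard constraint).
    if flight.arrival_time <= flight.departure_time:
        problems.append("arrival time must be after departure time")
    # Rule: distance is typically > 100 miles (a soft expectation, so a warning).
    if flight.distance_miles <= 100:
        problems.append("distance is unusually short (100 miles or less)")
    return problems
```

The first rule is a constraint that should never fail, while the second is only a typical case, which is why a real system might treat them differently (for example, rejecting versus flagging a record).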

37 Putting it all together…
Two-Layer Architecture: Database (MySQL), Web Server (Apache, PHP), Network
Three-Layer Architecture: Database, Application Server, Web Server, Network

38 Popular Implementation
Presentation: PHP/HTML
Content and Metadata: SQL Database

39 Encoding Hierarchies
Store in RDBMS
Table: Hierarchy (columns: Child, Parent), one row per node and its parent, e.g., (B, A)
Tree nodes: A (root), B, C, D, E, F, G, H
Finding children of A: Select child from Hierarchy where parent = 'A' → B, C
Finding parent of G: Select parent from Hierarchy where child = 'G' → D
Finding siblings of D: find parent, and then find its children
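
The three lookups on this slide can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, not the course's implementation: the exact tree is an assumption (A has children B and C; C has D and E; D has F, G and H), chosen to agree with the query results shown on the slide, and the helper names are hypothetical.

```python
import sqlite3

# Assumed tree: A -> B, C; C -> D, E; D -> F, G, H.
ROWS = [("B", "A"), ("C", "A"), ("D", "C"), ("E", "C"),
        ("F", "D"), ("G", "D"), ("H", "D")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Hierarchy (child TEXT PRIMARY KEY, parent TEXT)")
conn.executemany("INSERT INTO Hierarchy VALUES (?, ?)", ROWS)


def children(node):
    """Select child from Hierarchy where parent = node."""
    rows = conn.execute(
        "SELECT child FROM Hierarchy WHERE parent = ? ORDER BY child", (node,))
    return [r[0] for r in rows]


def parent(node):
    """Select parent from Hierarchy where child = node (None for the root)."""
    row = conn.execute(
        "SELECT parent FROM Hierarchy WHERE child = ?", (node,)).fetchone()
    return row[0] if row else None


def siblings(node):
    """Find the parent, then its children, leaving out the node itself."""
    p = parent(node)
    return [] if p is None else [c for c in children(p) if c != node]
```

Making `child` the primary key enforces a single parent per node, so this table encodes a strict hierarchy. A poly-hierarchy would need a key on the (child, parent) pair instead.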

40 Encoding Metadata
Table: Items (columns: ID, Label, Attributes …)
Rows have IDs 0001 through 0006; each Label refers to a node in the hierarchy (e.g., 0001 → B, 0003 → C, 0004 → D, 0006 → E)

41 Content → Presentation
Page elements:
  You are here: A > C > D
  Related
    - D
    - E
  Contents at D
Backing tables:
  Hierarchy(child, parent)
  Content(id, attribute1, attribute2, attribute3, …)
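
The page elements on this slide can all be derived from the Hierarchy(child, parent) rows. The sketch below holds those rows in a plain dictionary to stay short. The tree shape is an assumption consistent with the "A > C > D" trail, and `breadcrumb`, `related`, and `contents_at` are hypothetical helper names.

```python
# Hierarchy(child, parent) rows held as a dict: child -> parent.
# Assumed tree: A -> B, C; C -> D, E; D -> F, G, H.
PARENT_OF = {"B": "A", "C": "A", "D": "C", "E": "C",
             "F": "D", "G": "D", "H": "D"}


def breadcrumb(node):
    """Walk parent links up to the root and format a 'You are here' trail."""
    path = [node]
    while path[-1] in PARENT_OF:
        path.append(PARENT_OF[path[-1]])
    return " > ".join(reversed(path))


def contents_at(node):
    """Children of the node: what the page lists under 'Contents at'."""
    return sorted(c for c, p in PARENT_OF.items() if p == node)


def related(node):
    """The node and its siblings: everything sharing the node's parent."""
    if node not in PARENT_OF:
        return [node]
    return contents_at(PARENT_OF[node])
```

With this data, `breadcrumb("D")` yields the trail shown on the slide and `related("D")` yields the D and E entries of the Related list.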

42 Faceted Browsing
Page elements:
  Matching Results
  Filter by
    - Facet1 (possible values)
Backing tables:
  Hierarchy(child, parent)
  Content(id, attribute1, attribute2, attribute3, …)
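
Faceted browsing over a Content table comes down to two operations: filtering items by the selected facet values, and counting the remaining values of each facet so the "Filter by" list can show what is still available. The sketch below is an assumption about one simple way to do this in memory. The catalog rows, facet names, and helper functions are all invented for illustration.

```python
from collections import Counter

# Hypothetical catalog rows; facet names and values are invented.
ITEMS = [
    {"id": "0001", "type": "chair", "material": "wood", "room": "dining"},
    {"id": "0002", "type": "chair", "material": "metal", "room": "office"},
    {"id": "0003", "type": "ottoman", "material": "leather", "room": "living"},
    {"id": "0004", "type": "chair", "material": "wood", "room": "office"},
]
FACETS = ["type", "material", "room"]


def filter_items(items, selections):
    """Keep items whose value matches every selected facet."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selections.items())
    ]


def facet_counts(items):
    """For each facet, count how many of the given items carry each value."""
    return {facet: Counter(item[facet] for item in items) for facet in FACETS}
```

Selecting `{"type": "chair", "material": "wood"}` narrows the results, and calling `facet_counts` on the narrowed list gives the values to offer for the next refinement. Dropping a key from the selection broadens the results again.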

43 Recap
Meta-data
  General function
  Types of meta-data
Taxonomies and Thesauri
  Role in organizing, navigating and searching content
  General-purpose taxonomies
  Special-purpose taxonomies
Practical use & implementation

