INFM 700: Session 6 Taxonomies and Metadata Paul Jacobs The iSchool University of Maryland Wednesday, Feb. 29, 2012 This work is licensed under a Creative.


1 INFM 700: Session 6
Taxonomies and Metadata
Paul Jacobs
The iSchool, University of Maryland
Wednesday, Feb. 29, 2012
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States license. See http://creativecommons.org/licenses/by-nc-sa/3.0/us/ for details.

2 iSchool Today’s Topics
- Nature and types of metadata
- General-purpose taxonomies (ontologies, thesauri, …)
- Special-purpose taxonomies & thesauri
- Practical use of taxonomies and metadata
Metadata | Taxonomies & Thesauri | Practical Uses

3 Metadata
- Literally “data about data”
- “a set of data that describes and gives information about other data” (Oxford English Dictionary)
- Why do we need this?
- Types of metadata
  - Descriptive/subjective/content (e.g., author, subject, keywords, …)
  - Administrative (e.g., owner, rights, cost, creation date, version, …)
  - Technical (e.g., format, size, dependencies, programs, …)
- In practical terms:
  - Metadata helps users locate, navigate, interpret content
  - Metadata helps organizations manage content
  - Metadata helps systems manipulate content

4 Data without Metadata…
- Who: authored it? to contact about data?
- What: are contents of database?
- When: was it collected? processed? finalized?
- Where: was the study done?
- Why: was the data collected?
- How: were data collected? processed? verified?
… can be pretty useless!

5 Early Example of Metadata

6 Related Terms & Techniques
- Taxonomies
  - Anything organized in some sort of hierarchical structure
- Tagging
  - Adding almost any kind of metadata to content, but now often descriptive and user-provided
- Thesauri
  - Focus on relations between terms
  - Focus on “concepts”
- Ontologies
  - Usually model a specific domain or part of the world
  - Generally machine-readable
Increasing complexity and richness

7 Menagerie of Terms
- Classification
- Hierarchies
- Epistemology
- Directories
- Controlled vocabularies
- Knowledge representation
Let’s focus on significant differences.
Let’s focus on advantages/disadvantages.
Let’s focus on how each is useful.
Let’s not quibble over exactly what to call each.

8 Segue: Metadata to Taxonomies
What do taxonomies, thesauri, etc., have to do with metadata?

9 Taxonomies
- Organization of objects according to some principle
- Familiar examples:
  - Linnaean taxonomy (for living organisms)
  - Web directories (e.g., Yahoo or ODP)
  - Corporate directories
  - Organization charts
- Organizational structures previously discussed

10 Thesauri: Motivation
- “Semantic gap” between concepts and words
- Words are used to evoke concepts
  - Concrete objects: MacBook Pro, iPhone
  - Abstract ideas: freedom, peace
[Diagram labels: Concepts, Words, Ideas, Meaning]

11 Words and concepts
- The semantic gap: What’s the problem?
  - Synonymy: roughly, different words or phrases can be used to express similar ideas (e.g., “notebook”, “laptop”)
  - Polysemy: roughly, the same word can have different meanings (e.g., “line”: fishing, code, queue, ...)
- Taxonomies try to group similar concepts
- “Tags” often assign words to concepts, making it easier to find related concepts
- Controlled vocabularies avoid ambiguity (like a specific tag set)
- Thesauri represent attempts to better organize mappings between words and concepts
- Do these present precision or recall problems?

12 Some Real Examples
- Content tagging and social media (e.g., Flickr, del.icio.us)
- Special-purpose classification schemes and thesauri (e.g., Art & Architecture Thesaurus (AAT), UMLS)
- General semantic tools and classification schemes (e.g., Princeton WordNet, Roget’s Thesaurus)

13 Think for a sec…
You are developing a content-rich site and need organization and labeling schemes to help users view/browse/learn/find stuff. What do you do?
- Define your own tagging/organization scheme?
- Let the users define their own?
- Leave it all to a search engine?
- Use some existing scheme?
- ...

14 Flickr: popular tags

15 Flickr: related tags

16 Del.icio.us: related tags

17 Art & Architecture Thesaurus
http://www.getty.edu/research/conducting_research/vocabularies/aat/

18 UMLS (Unified Medical Language System)
3 Knowledge Sources, used separately or together:
- Metathesaurus: 1 million+ biomedical concepts from over 100 sources
- Semantic Network: 135 broad categories and 54 relationships between them
- SPECIALIST Lexicon + Tools: lexical information and programs for language processing
Source: National Library of Medicine (NIH)

19 UMLS (Unified Medical Language System)
- Began in 1986 as long-term R&D project
- Designed for systems developers
- Develop multi-purpose tools to enhance understanding of medical meaning across systems
- Overcome barriers to effective retrieval of machine-readable information
- Overcome variety of ways the same concepts are expressed in machine-readable and human language
Source: National Library of Medicine (NIH)

20 UMLS Uses
- Information retrieval
- Thesaurus construction
- Natural language processing
- Automated indexing
- Electronic health records (EHR)
- Distribution mechanism for
  - HIPAA, CHI, PHIN regulatory standards
  - SNOMED CT
Source: National Library of Medicine (NIH)

21 UMLS Metathesaurus
http://www.nlm.nih.gov/research/umls/

22 UMLS Metathesaurus
http://www.nlm.nih.gov/research/umls/

23 UMLS Thesaurus Browser
http://www.nlm.nih.gov/research/umls/

24 Think for a sec…
You are developing a content-rich site and need organization and labeling schemes to help users view/browse/learn/find stuff. What do you do?
- Define your own tagging/organization scheme?
- Let the users define their own?
- Leave it all to a search engine?
- Use some existing scheme?
- ...

25 Applying IA Principles
- Focus on users and user needs: users are different, and have different models
- Focus on content: concepts are different, too (different levels, words, complexity, vagueness)
- Examples:
  - What’s the difference between laptop, PDA, phone, and convergence device?
  - When is “cancer research” “oncology”?
  - When a user browses a furniture catalog for chairs, do you show them ottomans and footstools?

26 Standard Thesaurus Structure
[Diagram, flattened in this transcript]
- Broader Terms (IS-A): Computer
- Synonyms (variants, AKA): Notebook, Laptop (one of the two is marked Preferred)
- Narrower Terms: Desktop Replacement, Ultraportable, Tablet PC
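To make the thesaurus structure concrete, here is a minimal Python sketch of one entry with a preferred term, variants, and broader/narrower terms. The class and function names are hypothetical, and the choice of "Laptop" as the preferred term is an assumption; the transcript does not say which of the two synonyms the slide marks as preferred.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One thesaurus entry: a preferred label plus its relations."""
    preferred: str
    synonyms: set = field(default_factory=set)   # AKA / variants
    broader: set = field(default_factory=set)    # IS-A parents
    narrower: set = field(default_factory=set)   # more specific terms


# Assumption: "Laptop" is the preferred term and "Notebook" its variant.
ENTRIES = [
    Term(
        preferred="Laptop",
        synonyms={"Notebook"},
        broader={"Computer"},
        narrower={"Desktop Replacement", "Ultraportable", "Tablet PC"},
    ),
]

# Index every preferred term and variant (case-insensitively) to its entry.
INDEX = {}
for entry in ENTRIES:
    INDEX[entry.preferred.lower()] = entry
    for variant in entry.synonyms:
        INDEX[variant.lower()] = entry


def normalize(word):
    """Map a variant to its preferred term (useful when indexing content)."""
    entry = INDEX.get(word.lower())
    return entry.preferred if entry else word


def expand_query(word):
    """Expand a search word with its synonyms and narrower terms."""
    entry = INDEX.get(word.lower())
    if entry is None:
        return [word]
    return sorted({entry.preferred} | entry.synonyms | entry.narrower)
```

Normalizing at indexing time addresses synonymy by mapping variants onto a single preferred term; expanding at query time trades some precision for recall by also matching narrower terms.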

27 IA Uses of Thesauri
- For organization
- For navigation
- For indexing content
- For searching

28 Poly-Hierarchies
- Concepts can have multiple parents
- Example, from the Shoah Foundation’s thesaurus of Holocaust terms (diagram nodes):
  - Cracow (Poland : Voivodship)
  - German death camps
  - Auschwitz II-Birkenau (Poland : Death Camp)
  - Block 25 (Auschwitz II-Birkenau)
  - Kanada (Auschwitz II-Birkenau)
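A poly-hierarchy can be sketched as a graph in which a node keeps a set of parents rather than a single one. The transcript preserves only the node labels from the slide, so the edges below are an assumption made for illustration (Auschwitz II-Birkenau placed under both a place and a category), not a statement of how the Shoah Foundation's thesaurus is actually organized.

```python
from collections import defaultdict

AUSCHWITZ = "Auschwitz II-Birkenau (Poland : Death Camp)"

# ASSUMED (child, parent) edges; the slide's actual arrows are not in the transcript.
EDGES = [
    (AUSCHWITZ, "Cracow (Poland : Voivodship)"),
    (AUSCHWITZ, "German death camps"),
    ("Block 25 (Auschwitz II-Birkenau)", AUSCHWITZ),
    ("Kanada (Auschwitz II-Birkenau)", AUSCHWITZ),
]

# Each node maps to a SET of parents, which is what makes this a poly-hierarchy.
parents = defaultdict(set)
for child, parent in EDGES:
    parents[child].add(parent)


def ancestors(node):
    """Return every node reachable by following parent links upward."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

With multiple parents, a user can reach the same concept by browsing from either the geographic branch or the category branch, at the cost of a node no longer having a single breadcrumb path.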

29 Poly-Hierarchies
- What are the advantages and disadvantages?
- What’s the relationship to polysemy?

30 Practical Uses & Implementation
- What are we trying to do (e.g., help users find stuff)?
- What tools are at our disposal (e.g., tags, XML, databases)?
- Given the above, how do we use/implement hierarchies and thesauri?

31 Faceted Hierarchies
- Alternative to single and poly-hierarchies
- Basic idea:
  - Describe objects along multiple facets
  - Each facet has its associated hierarchy
- Issues:
  - What’s a facet?
  - How do you navigate faceted hierarchies?

32 Faceted Browsing Example

33 Faceted Browsing Example
Demo: http://flamenco.berkeley.edu/demos.html

34 Advantages of Facets
- Integrates searching and browsing
- Easy to build complex queries
- Easy to narrow, broaden, shift focus
- Helps users avoid getting lost
- Helps to prevent “categorization wars”

35 Relationship to IA?
[Architecture diagram: Database, Web Server, Application Server, Network]
Ontologies are implicitly “hidden” here!!!
[Ontology diagram, flattened in this transcript]
- Concepts: Trip, Flight, Airplane
- Relations: Part-of, Equipment
- Attributes: From:, To:, Departure Time:, Arrival Time:, Origin:, Destination:, Type:, Capacity:
- Rule: Arrival Time is always after Departure Time
- Rule: Distance from Origin to Destination typically > 100 miles
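One way to see how such an ontology ends up "hidden" in application code is to write the concepts as classes and the rules as checks. The sketch below is illustrative only: the grouping of attributes under Flight and Airplane is an assumption (the transcript flattens the diagram), `distance_miles` is a hypothetical field added so the second rule can be checked, and both times are assumed to be in the same time zone.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Airplane:
    type: str
    capacity: int


@dataclass
class Flight:
    origin: str
    destination: str
    departure_time: datetime   # assumed to share a time zone with arrival_time
    arrival_time: datetime
    equipment: Airplane        # the "Equipment" relation from the slide
    distance_miles: float      # hypothetical field, not on the slide

    def violations(self):
        """Hard rule: arrival time is always after departure time."""
        problems = []
        if self.arrival_time <= self.departure_time:
            problems.append("arrival time must be after departure time")
        return problems

    def warnings(self):
        """Soft rule: distance is typically more than 100 miles."""
        notes = []
        if self.distance_miles <= 100:
            notes.append("distance is unusually short for a flight")
        return notes
```

The first rule is modeled as a violation because the slide says "always"; the second is only a warning because the slide says "typically".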

36 Putting it all together…
- Three-Layer Architecture: Database, Application Server, Web Server, Network
- Two-Layer Architecture: Database, Web Server, Network
- Example components: Apache, MySQL, PHP

37 Popular Implementation
- Content and Metadata: SQL Database
- Presentation: PHP/HTML

38 Encoding Hierarchies
Tree: A has children B, C; C has children D, E, F; D has children G, H
Store in RDBMS

Table: Hierarchy
Child   Parent
B       A
C       A
D       C
E       C
F       C
G       D
H       D

- Finding children of A: SELECT child FROM Hierarchy WHERE parent = 'A' → B, C
- Finding parent of G: SELECT parent FROM Hierarchy WHERE child = 'G' → D
- Finding siblings of D: find parent, and then find its children
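The three lookups on this slide can be tried out with a small runnable sketch. It uses Python's built-in sqlite3 module with an in-memory database rather than the MySQL/PHP stack the deck mentions; the table and column names come from the slide, while the function names are hypothetical. The sibling query is one possible way to combine "find parent, then find its children" into a single statement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# child is the primary key, so each node has at most one parent (a strict hierarchy).
conn.execute("CREATE TABLE Hierarchy (child TEXT PRIMARY KEY, parent TEXT)")
conn.executemany(
    "INSERT INTO Hierarchy (child, parent) VALUES (?, ?)",
    [("B", "A"), ("C", "A"), ("D", "C"), ("E", "C"), ("F", "C"), ("G", "D"), ("H", "D")],
)


def children(node):
    rows = conn.execute(
        "SELECT child FROM Hierarchy WHERE parent = ? ORDER BY child", (node,)
    )
    return [row[0] for row in rows]


def parent(node):
    row = conn.execute(
        "SELECT parent FROM Hierarchy WHERE child = ?", (node,)
    ).fetchone()
    return row[0] if row else None


def siblings(node):
    # Find the node's parent, then that parent's other children.
    rows = conn.execute(
        "SELECT child FROM Hierarchy "
        "WHERE parent = (SELECT parent FROM Hierarchy WHERE child = ?) "
        "AND child <> ? ORDER BY child",
        (node, node),
    )
    return [row[0] for row in rows]
```

Parameterized queries (the `?` placeholders) are used instead of string concatenation so that node names coming from a URL or form cannot alter the SQL.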

39 Encoding Metadata
Tree: A has children B, C; C has children D, E, F; D has children G, H

Table: Items
ID     Attributes …   Label
0001   …              B
0002   …              B
0003   …              C
0004   …              D
0005   …              D
0006   …              E
…      …              …

40 Content → Presentation
Tables: Hierarchy(child, parent); Content(id, attribute 1, attribute 2, attribute 3, …)
Tree: A has children B, C; C has children D, E, F; D has children G, H
Page elements:
- You are here: A > C > D
- Contents at D
- Related
  - D
  - E
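As a rough illustration of how the two tables could drive the page elements on this slide, the sketch below builds the "You are here" trail by walking parent links and lists the content filed under a node. It uses Python's sqlite3 module for self-containment. The `label` column linking a content item to a hierarchy node is an assumption based on the Items table of the previous slide, and the function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Hierarchy (child TEXT PRIMARY KEY, parent TEXT);
    -- Assumption: each content item carries the label of one hierarchy node.
    CREATE TABLE Content (id TEXT PRIMARY KEY, label TEXT);
    INSERT INTO Hierarchy VALUES
        ('B','A'), ('C','A'), ('D','C'), ('E','C'), ('F','C'), ('G','D'), ('H','D');
    INSERT INTO Content VALUES
        ('0001','B'), ('0002','B'), ('0003','C'), ('0004','D'), ('0005','D'), ('0006','E');
    """
)


def breadcrumb(node):
    """Walk parent links up to the root and render the trail root-first."""
    path = [node]
    while True:
        row = conn.execute(
            "SELECT parent FROM Hierarchy WHERE child = ?", (path[-1],)
        ).fetchone()
        if row is None:  # reached a node with no parent: the root
            break
        path.append(row[0])
    return " > ".join(reversed(path))


def contents_at(node):
    """Return the ids of content items labeled with this node."""
    rows = conn.execute(
        "SELECT id FROM Content WHERE label = ? ORDER BY id", (node,)
    )
    return [row[0] for row in rows]
```

The loop terminates only because the data is a tree; a production version would want a guard against cycles in the Hierarchy table.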

41 Faceted Browsing
Tables: Hierarchy(child, parent); Content(id, attribute 1, attribute 2, attribute 3, …)
Page elements:
- Filter by
  - Facet 1 (possible values)
  - Facet 2 (possible values)
- Matching Results
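The "Filter by" and "Matching Results" elements can be sketched in a few lines of Python. Everything in the data below is hypothetical (the facet names, values, and item ids are invented for illustration), and each facet is treated as a flat list of values; in the faceted hierarchies described earlier, each facet could carry its own hierarchy.

```python
from collections import Counter

# Hypothetical items, each described along two facets.
ITEMS = [
    {"id": "0001", "material": "wood", "room": "kitchen"},
    {"id": "0002", "material": "wood", "room": "office"},
    {"id": "0003", "material": "metal", "room": "office"},
    {"id": "0004", "material": "metal", "room": "kitchen"},
]
FACETS = ("material", "room")


def matching(selected):
    """Return the items that match every selected facet value."""
    return [
        item for item in ITEMS
        if all(item[facet] == value for facet, value in selected.items())
    ]


def facet_counts(selected):
    """For each facet not yet selected, count the values among the current results."""
    results = matching(selected)
    return {
        facet: Counter(item[facet] for item in results)
        for facet in FACETS
        if facet not in selected
    }
```

Showing counts only for values that occur in the current results is what keeps users from narrowing their way to an empty result set.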

42 Recap
- Metadata
  - General function
  - Types of metadata
- Taxonomies and Thesauri
  - Role in organizing, navigating and searching content
  - General-purpose taxonomies
  - Special-purpose taxonomies
- Practical use & implementation

