IBE312: Information Architecture 2013 Ch. 9 – Metadata

1 IBE312: Information Architecture 2013 Ch. 9 – Metadata Many of the slides in this slideset are reproduced and/or modified content from publicly available slidesets by Paul Jacobs (2012), The iSchool, University of Maryland, http://terpconnect.umd.edu/~psjacobs/s12/INFM700s12.htm. These materials were made available and licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 United States license. See http://creativecommons.org/licenses/by-nc-sa/3.0/us/ for details.

2 Metadata “Data about data” - Definitional and descriptive documentation/information about data… From Free On-line Dictionary of Computing: Data about data. In data processing, meta-data is definitional data that provides information about or documentation of other data managed within an application or environment. For example, meta-data would document data about data elements or attributes (name, size, data type, etc.), data about records or data structures (length, fields, columns, etc.) and data about data (where it is located, how it is associated, ownership, etc.). Meta-data may include descriptive information about the context, quality and condition, or characteristics of the data. (Some other definitions.)

3 Metadata Why do we need this? Types of metadata – Descriptive/subjective/content (e.g. author, subject, keywords, …) – Administrative (e.g. owner, rights, cost, creation date, version, …) – Technical (e.g. format, size, dependencies, programs) –.... In practical terms: – Metadata helps users locate, navigate, interpret content – Metadata helps organizations manage content – Metadata helps systems manipulate content

4 Data without Metadata… Who: authored it? to contact about data? What: are contents of database? When: was it collected? processed? finalized? Where: was the study done? Why: was the data collected? How: were data collected? processed? Verified? … can be pretty useless!

5 Early Example of Metadata

6 Menagerie of Terms Classification Hierarchies Epistemology Directories Controlled vocabularies Knowledge representation Let’s focus on significant differences. Let’s focus on advantages/disadvantages. Let’s focus on how each is useful.

7 Controlled Vocabulary Any defined subset of natural language List of equivalent terms (synonym rings) – Use search logs. List of preferred terms (authority files) – Commonly also include variant terms – Educating users, enabling browsing – Term rotation (pointers in index) p.201 Classification scheme / taxonomy – Hierarchical relationships (narrower/broader)

8 Controlled Vocabulary Queries can be “exploded” to increase recall
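
One way to picture an "exploded" query is as a search that also matches every narrower term of the query term. The sketch below is a minimal illustration under that reading; the term hierarchy, the documents, and the function names `explode` and `search` are all invented for the example, and matching is plain substring matching rather than real indexing.

```python
# Hypothetical toy hierarchy: each term maps to its narrower terms.
NARROWER = {
    "computer": ["notebook"],
    "notebook": ["desktop replacement", "ultraportable", "tablet pc"],
}

# Hypothetical document collection: id -> text.
DOCS = {
    1: "Desktop replacement machines compared",
    2: "Ultraportable buying guide",
    3: "Garden furniture",
}


def explode(term, narrower=NARROWER):
    """Return the term followed by all of its narrower terms, recursively."""
    expanded = [term]
    for child in narrower.get(term, []):
        expanded.extend(explode(child, narrower))
    return expanded


def search(query_term, documents, exploded=True):
    """Return ids of documents mentioning the term (or any narrower term)."""
    terms = explode(query_term) if exploded else [query_term]
    return [
        doc_id
        for doc_id, text in documents.items()
        if any(t in text.lower() for t in terms)
    ]
```

With these toy data, `search("computer", DOCS, exploded=False)` finds nothing, while the exploded query also retrieves the documents indexed under narrower terms. That is the recall gain; the cost is that some of the extra hits may be less relevant.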

9 Controlled Vocabulary Authority file – inclusive; the preferred term can serve as the unique identifier for a collection of terms and helps educate users
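
A minimal sketch of an authority file as a lookup table: each preferred term is the key for its variant terms, and any variant a user types is normalized to that key. The terms and the helper name `normalize` are assumptions made for illustration.

```python
# Hypothetical authority file: preferred term -> variant terms.
AUTHORITY_FILE = {
    "notebook": ["laptop", "notebook computer", "portable computer"],
}

# Inverted index so that each variant points back to its preferred term.
VARIANT_TO_PREFERRED = {
    variant: preferred
    for preferred, variants in AUTHORITY_FILE.items()
    for variant in variants
}


def normalize(term):
    """Map a user's term to the preferred term, or None if it is unknown."""
    key = term.strip().lower()
    if key in AUTHORITY_FILE:
        return key
    return VARIANT_TO_PREFERRED.get(key)
```

Because every variant resolves to one key, content only needs to be indexed under the preferred term, and the interface can show the user which term the system actually uses.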

10 Related Terms & Techniques Taxonomies – Anything organized in some sort of hierarchical structure Tagging – Adding almost any kind of metadata to content, but now often descriptive and user-provided Thesauri – Focus on relations between terms – Focus on “concepts” Ontologies – Usually model a specific domain or part of the world – Generally machine-readable Increasing complexity and richness Metadata Taxonomies & Thesauri Practical Uses

11 How are taxonomies, tagging, controlled vocabularies and thesauri used? The semantic gap: What’s the problem? – Synonymy – roughly, different words or phrases can be used to express similar ideas (e.g. “notebook”, “laptop”) – Polysemy – roughly, the same word can have different meanings (e.g., “line” (fishing, code, queue,...) ) Taxonomies try to group similar concepts “Tags” often assign words to concepts, making it easier to find related concepts Controlled vocabularies avoid ambiguity (like a specific tag set) Thesauri represent attempts to better organize mappings between words and concepts Do these present precision or recall problems?

12 Taxonomies – Organization of objects according to some principle – Familiar examples: Linnaean taxonomy (for living organisms) Web directories (e.g., Yahoo or ODP) Corporate directories Organization charts Organizational structures previously discussed

13 Tagging – e.g. Flickr – popular tags

14 Flickr – related tags

15 Del.icio.us – related tags

16 Thesauri: Motivation “Semantic gap” between concepts and words Online thesauri help map many synonyms or word variants onto one preferred term – improve precision in retrieval (p.203) Words are used to evoke concepts – Concrete objects: MacBook Pro, iPhone – Abstract ideas: freedom, peace [Diagram labels: Concepts, Words, Ideas, Meaning]

17 Thesauri Book of synonyms, often including related and contrasting words and antonyms. In this class: – A controlled vocabulary in which equivalence, hierarchical, and associative relationships are identified for purposes of improved retrieval. Technical lingo … Thesauri standards: ISO 2788, …

18 Thesauri Types

19 IA Uses of Thesauri For organization For navigation For indexing content For searching

20 Applying IA Principles Focus on users and user needs – users are different, and have different models Focus on content – concepts are different, too – different levels, words, complexity, vagueness Examples: – What’s the difference between laptop, PDA, phone, and convergence device? – When is “cancer research” “oncology”? – When a user browses a furniture catalog for chairs, do you show them ottomans and footstools?

21 Standard Thesaurus Structure Preferred term: Notebook – Broader term (IS-A): Computer – Synonyms / variants (AKA): Laptop – Narrower terms: Desktop Replacement, Ultraportable, Tablet PC

22 Semantic relationships in a thesaurus (pp. 204-205): Abbreviations: PT (preferred term), VT (variant term), BT (broader term), NT (narrower term), RT (related term) – Use (U): points a VT to its PT – Use For (UF): full list of VTs on the PT record – Scope Note (SN): meaning of the term, to rule out ambiguity.
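
The abbreviations above can be read as fields on a term record. The following is a rough sketch, not a standard data model: the class name `TermRecord`, the helper functions, and the sample terms are all illustrative assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class TermRecord:
    preferred: str                                  # PT
    scope_note: str = ""                            # SN
    use_for: list = field(default_factory=list)     # UF: the variant terms (VT)
    broader: list = field(default_factory=list)     # BT
    narrower: list = field(default_factory=list)    # NT
    related: list = field(default_factory=list)     # RT


# Hypothetical record for illustration only.
RECORDS = {
    "notebook": TermRecord(
        preferred="notebook",
        scope_note="Portable personal computer; not a paper notebook.",
        use_for=["laptop"],
        broader=["computer"],
        narrower=["ultraportable", "tablet pc"],
        related=["docking station"],
    ),
}


def build_use_index(records):
    """Build the 'Use' pointers: each variant term maps to its preferred term."""
    return {vt: rec.preferred for rec in records.values() for vt in rec.use_for}


def format_record(rec):
    """Render a record in the conventional label-per-line thesaurus layout."""
    lines = [rec.preferred.upper()]
    if rec.scope_note:
        lines.append(f"  SN {rec.scope_note}")
    for label, terms in (
        ("UF", rec.use_for),
        ("BT", rec.broader),
        ("NT", rec.narrower),
        ("RT", rec.related),
    ):
        lines.extend(f"  {label} {term}" for term in terms)
    return "\n".join(lines)
```

Note that UF and Use are two views of the same equivalence relationship: UF is stored once on the preferred term's record, and the Use pointers are derived from it.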

23 Semantic relationships of a wine thesaurus, p. 206

24 Some Real Examples Content tagging and social media (e.g. Flickr, Del.icio.us) Special-purpose classification schemes and thesauri (e.g. Art & Architecture Thesaurus – AAT, UMLS) General semantic tools and classification schemes (e.g., Princeton WordNet, Roget’s Thesaurus)

25 Art & Architecture Thesaurus http://www.getty.edu/research/conducting_research/vocabularies/aat/

26 UMLS (Unified Medical Language System) Source: National Library of Medicine (NIH) 3 Knowledge Sources, used separately or together: – Metathesaurus: 1 million+ biomedical concepts from over 100 sources – Semantic Network: 135 broad categories and 54 relationships between them – SPECIALIST Lexicon + Tools: lexical information and programs for language processing

27 E.g. UMLS (Unified Medical Language System) Source: National Library of Medicine (NIH) Began in 1986 as a long-term R&D project – Designed for systems developers – Develop multi-purpose tools to enhance understanding of medical meaning across systems – Overcome barriers to effective retrieval of machine-readable information – Overcome the variety of ways the same concepts are expressed in machine-readable and human language

28 UMLS Uses Source: National Library of Medicine (NIH) – Information retrieval – Thesaurus construction – Natural language processing – Automated indexing – Electronic health records (EHR) – Distribution mechanism for: HIPAA, CHI, PHIN regulatory standards; SNOMED CT

29 UMLS Metathesaurus http://www.nlm.nih.gov/research/umls/

30 UMLS Metathesaurus http://www.nlm.nih.gov/research/umls/

31 UMLS Thesaurus Browser http://www.nlm.nih.gov/research/umls/

32 Semantic Relationships Equivalence (PT = VT) Hierarchical: Generic (Bird NT Magpie), whole-part (Foot NT big toe) or instance (Seas NT Mediterranean Sea) – Faceted / multiple hierarchies Associative – Related terms (hammer RT nail) Preferred terms: – Form, selection, definition and specificity Polyhierarchy (Medline cross-lists viral pneumonia under both...Fig 9-25, p. 220) Faceted classification – multiple taxonomies that focus on different dimensions of the content (e.g. wine.com, pp. 223-224).

33 Associative Term

34 Poly-Hierarchies Concepts can have multiple parents What are the advantages and disadvantages? What’s the relationship to polysemy? Example, from the Shoah Foundation’s thesaurus of Holocaust terms: Cracow (Poland : Voivodship) Auschwitz II-Birkenau (Poland : Death Camp) Block 25 (Auschwitz II-Birkenau) German death camps Kanada (Auschwitz II-Birkenau)
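
A poly-hierarchy can be sketched as a mapping from each term to a list of parents rather than a single parent. The example below uses invented computing terms (not the thesaurus shown on the slide) and a hypothetical helper `paths_to_root` that lists every route by which a term can be reached from the top.

```python
# Hypothetical poly-hierarchy: each term lists ALL of its parents.
PARENTS = {
    "computer": [],
    "notebook": ["computer"],
    "touchscreen device": ["computer"],
    "tablet pc": ["notebook", "touchscreen device"],  # two parents
}


def paths_to_root(term, parents=PARENTS):
    """Return every root-to-term path as a list of terms."""
    ups = parents.get(term, [])
    if not ups:
        return [[term]]
    return [
        path + [term]
        for parent in ups
        for path in paths_to_root(parent, parents)
    ]
```

Here `paths_to_root("tablet pc")` yields two paths, one through each parent. This hints at the trade-off raised on the slide: users can find the term from more than one starting point, but breadcrumbs and "where am I" cues are no longer unique.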

35 Faceted Hierarchies Alternative to single and poly-hierarchies Basic idea: – Describe objects along multiple facets – Each facet has its associated hierarchy Issues: – What’s a facet? – How do you navigate faceted hierarchies?

36 Faceted Browsing Example

37 Demo: http://flamenco.berkeley.edu/demos.html

38 Advantages of Facets Integrates searching and browsing Easy to build complex queries Easy to narrow, broaden, shift focus Helps users avoid getting lost Helps to prevent “categorization wars”

39 Relationship to IA? Database Web Server Application Server Network Ontologies are implicitly “hidden” here! Flight Trip From: Part-of Airplane Equipment To: Departure Time: Arrival Time: Origin: Destination: Type: Capacity: Rule: Arrival Time is always after Departure Time Rule: Distance from Origin to Destination is typically > 100 miles
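
To make the "hidden ontology" idea concrete, here is one possible reading of the flight example as application code. The assignment of attributes to `Flight` and `Airplane`, the added `distance_miles` field, and the choice to treat the first rule as a hard constraint and the second as a soft warning are all assumptions, not something the slide specifies.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Airplane:
    type: str
    capacity: int


@dataclass
class Flight:
    origin: str
    destination: str
    departure_time: datetime   # assumed to be timezone-aware
    arrival_time: datetime     # assumed to be timezone-aware
    equipment: Airplane        # the "Equipment" relationship
    distance_miles: float      # assumed field, needed for the distance rule

    def __post_init__(self):
        # Hard rule: arrival time is always after departure time.
        if self.arrival_time <= self.departure_time:
            raise ValueError("arrival time must be after departure time")

    def warnings(self):
        """Soft rule: distance is typically (not always) over 100 miles."""
        if self.distance_miles <= 100:
            return ["distance is unusually short for a flight"]
        return []
```

The point of the sketch is that the classes, their relationships, and the validation rules together encode domain knowledge, even though nothing in the code is labeled as an ontology.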

40 Putting it all together… Three-Layer Architecture: Database, Application Server, Web Server, Network – Two-Layer Architecture: Database, Web Server, Network – Example stack: Apache, MySQL, PHP

41 Popular Implementation Content, Metadata: SQL Database – Presentation: PHP/HTML

42 Content → Presentation [Diagram: tree of nodes A to H] You are here: A > C > D Contents at D Related - D - E Tables: Hierarchy(child, parent); Content(id, attribute 1, attribute 2, attribute 3, …)
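
A "You are here" trail can be derived from a Hierarchy(child, parent) table by walking from the current node up to the root. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the rows are assumptions chosen to reproduce the A > C > D trail, and the walk assumes a single-parent hierarchy with no cycles.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per node; the root has no parent (NULL).
conn.execute("CREATE TABLE hierarchy (child TEXT PRIMARY KEY, parent TEXT)")
conn.executemany(
    "INSERT INTO hierarchy (child, parent) VALUES (?, ?)",
    [("A", None), ("B", "A"), ("C", "A"), ("D", "C"), ("E", "C")],
)


def breadcrumb(conn, node):
    """Walk parent links from node up to the root and format the trail."""
    trail = []
    while node is not None:
        trail.append(node)
        row = conn.execute(
            "SELECT parent FROM hierarchy WHERE child = ?", (node,)
        ).fetchone()
        node = row[0] if row else None
    return " > ".join(reversed(trail))


def children(conn, node):
    """Return the direct children of a node, for a local navigation menu."""
    rows = conn.execute(
        "SELECT child FROM hierarchy WHERE parent = ? ORDER BY child", (node,)
    )
    return [row[0] for row in rows]
```

The same table therefore drives both the breadcrumb and the browse-down links, while the Content table only needs to record which node each item belongs to.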

43 Faceted Browsing Matching Results Filter by - Facet 1 (possible values) - Facet 2 (possible values) Tables: Hierarchy(child, parent); Content(id, attribute 1, attribute 2, attribute 3, …)
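
A minimal sketch of the "matching results plus filter-by lists" pattern: content items carry one attribute per facet, selections narrow the result set, and the remaining values are counted so the interface can show only filters that still lead somewhere. The wine records, facet names, and function names are invented for illustration.

```python
from collections import Counter

# Hypothetical Content rows: id plus one attribute per facet.
CONTENT = [
    {"id": 1, "type": "red", "region": "France", "price": "under $20"},
    {"id": 2, "type": "red", "region": "California", "price": "$20-$50"},
    {"id": 3, "type": "white", "region": "France", "price": "under $20"},
    {"id": 4, "type": "white", "region": "California", "price": "under $20"},
]
FACETS = ["type", "region", "price"]


def filter_items(items, selections):
    """Keep the items that match every selected facet value."""
    return [
        item
        for item in items
        if all(item[facet] == value for facet, value in selections.items())
    ]


def facet_counts(items, facets=FACETS):
    """Count the remaining values of each facet among the given items."""
    return {facet: Counter(item[facet] for item in items) for facet in facets}
```

Selecting `{"region": "France"}` leaves items 1 and 3; recomputing `facet_counts` on that subset shows one red and one white, which is what lets users narrow, broaden, or shift focus without hitting empty result pages.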

44 Summary Meta-data – General function – Types of meta-data Taxonomies and Thesauri – Role in organizing, navigating and searching content – General-purpose taxonomies – Special-purpose taxonomies Practical use & implementation

