
1 Introduction (Tuesday, 8th June 2004)
Margaret Hanley
Business Analyst/Senior Information Architect, BBC
Worked on three continents – Australia, USA and UK
Been both a consultant and internal staff to companies like Sensis (Yellow Pages in Australia), Argus Associates (US), Ingenta (UK) and BBC (UK)

2 CVs and Metadata
Exercise
Definition, types, and uses
Controlled vocabularies and thesauri
How to create them

3 Metadata exercise
Take a paper bag from the back of the room
Each bag will have a sheet of paper and a goodie
Two colours of sheets of paper – organise yourselves into groups of 5 with the same colour sheet

4 Metadata: what is metadata?
Data about data
Information which describes a document, a file or a CD
Common metadata
  – CD information: title, composer, artist, date
  – MS Word document properties: time last saved, company, author

5 Metadata: metadata on the Web
Used in the header portion of an HTML document
  – Common schemes on the web: Dublin Core, RDF and Topic Maps
In databases, to describe the chunks of information used to create pages
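
As an illustration of metadata in the header of an HTML document, here is a minimal Python sketch that renders a few Dublin Core elements as <meta> tags using the common "DC." name prefix. The function name dublin_core_meta and the sample page values are assumptions made for this example; they are not from the presentation.

```python
from html import escape


def dublin_core_meta(fields):
    """Render a dict of Dublin Core elements as HTML head tags.

    Keys are Dublin Core element names (title, creator, subject, date, ...).
    """
    # The schema link declares what the "DC." prefix refers to.
    lines = ['<link rel="schema.DC" href="http://purl.org/dc/elements/1.1/">']
    for element, value in fields.items():
        content = escape(value, quote=True)  # keep quotes and & safe in attributes
        lines.append(f'<meta name="DC.{element}" content="{content}">')
    return "\n".join(lines)


# Hypothetical page description, for illustration only.
page = {
    "title": "Night blindness: an overview",
    "creator": "Example Author",
    "subject": "Nyctalopia",
    "date": "2004-06-08",
}
head_html = dublin_core_meta(page)
print(head_html)
```

The output would be pasted (or templated) into the head section of a static page; a CMS would typically generate the same tags from its stored fields.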

6 Metadata: types
Intrinsic: metadata that the file holds about itself (e.g., file name or size)
Descriptive: metadata that describes the file (e.g., subject, title, or audience)
Administrative: metadata used to manage the file (e.g., time last saved, review date, author)
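
One way to make the three types concrete is to tag each field with its type and split a record accordingly. This is a sketch only: the field names follow the examples on the slide, while FIELD_TYPES, group_by_type and the sample record are hypothetical.

```python
# Field-to-type mapping, using the examples given on the slide.
FIELD_TYPES = {
    "file_name": "intrinsic",
    "size": "intrinsic",
    "subject": "descriptive",
    "title": "descriptive",
    "audience": "descriptive",
    "time_last_saved": "administrative",
    "review_date": "administrative",
    "author": "administrative",
}


def group_by_type(record):
    """Split a flat metadata record into intrinsic/descriptive/administrative."""
    groups = {"intrinsic": {}, "descriptive": {}, "administrative": {}}
    for name, value in record.items():
        kind = FIELD_TYPES.get(name)
        if kind is None:
            raise KeyError(f"unclassified metadata field: {name}")
        groups[kind][name] = value
    return groups


# Hypothetical record, for illustration only.
record = {
    "file_name": "report.doc",
    "size": 48128,
    "title": "Quarterly report",
    "audience": "manager",
    "author": "Example Author",
}
groups = group_by_type(record)
```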

7 Metadata: uses
Search: can limit the search to a part of the metadata, like title or keyword
Browse: create topical indexes by aggregating pages with the same metadata
Personalization and customization: show content to an employee based on their role or position in the company, e.g. engineer or manager
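
The three uses above can be sketched over a small list of page records. The pages, field names and function names below are invented for illustration; a real site would run these queries in its search engine or database.

```python
from collections import defaultdict

# Hypothetical page records; each dict is the metadata for one page.
pages = [
    {"url": "/steam", "title": "Steam engines explained", "keyword": "Train", "audience": "engineer"},
    {"url": "/timetable", "title": "Train timetable", "keyword": "Train", "audience": "manager"},
    {"url": "/trams", "title": "Tram routes", "keyword": "Tram", "audience": "engineer"},
]


def search_field(pages, field, query):
    """Search: match the query against one metadata field only."""
    q = query.lower()
    return [p["url"] for p in pages if q in p.get(field, "").lower()]


def browse_index(pages, field):
    """Browse: aggregate pages that share the same metadata value."""
    index = defaultdict(list)
    for p in pages:
        if field in p:
            index[p[field]].append(p["url"])
    return dict(index)


def for_role(pages, role):
    """Personalization: show only pages aimed at the given role."""
    return [p["url"] for p in pages if p.get("audience") == role]
```

For example, search_field(pages, "title", "train") looks only at titles, while browse_index(pages, "keyword") builds a topical index keyed on the keyword field.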

8 Metadata: controlled vocabularies
To support search, browse and personalization, the metadata values need to be the same or at least be related to each other
A controlled vocabulary allows only a defined set of words to be used to describe content, so that pieces of content can be related to each other

9 Metadata: what is vocabulary control?
Controlled Vocabulary
  – A list of preferred and variant terms
  – A subset of natural language
Preferred    Variants                          Authority
AZ           Ariz, Arizona, 85XXX              US Postal Service
IBM          Intl Bus Machines, Big Blue       NY Stock Exchange
Nyctalopia   Night blindness, Moon blindness   National Library of Medicine
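
A minimal sketch of vocabulary control in Python, using the preferred and variant terms from the table above: every variant is mapped back to its preferred term. The names CONTROLLED_VOCABULARY, build_lookup and normalise are assumptions for this example, and the 85XXX ZIP code pattern is left out because it would need pattern matching rather than a plain lookup.

```python
# Preferred term -> variant terms, taken from the table on the slide.
CONTROLLED_VOCABULARY = {
    "AZ": ["Ariz", "Arizona"],
    "IBM": ["Intl Bus Machines", "Big Blue"],
    "Nyctalopia": ["Night blindness", "Moon blindness"],
}


def build_lookup(vocabulary):
    """Map every preferred and variant term (case-insensitively) to its preferred term."""
    lookup = {}
    for preferred, variants in vocabulary.items():
        for term in [preferred, *variants]:
            lookup[term.casefold()] = preferred
    return lookup


LOOKUP = build_lookup(CONTROLLED_VOCABULARY)


def normalise(term):
    """Return the preferred term, or None if the term is not in the vocabulary."""
    return LOOKUP.get(term.strip().casefold())


print(normalise("big blue"))         # IBM
print(normalise("Night blindness"))  # Nyctalopia
```

Returning None for unknown terms is a deliberate choice in this sketch: unmatched terms are useful candidates for new variants.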

10 Metadata: why control vocabulary? 1/2
Language is ambiguous
  – Synonyms, homonyms, antonyms, contronyms, etc.
In the Oxford English Dictionary:
  – "Round" takes 7½ pages or 15,000 words to define.
  – "Set" has 58 uses as a noun, 126 as a verb, 10 as an adjective.
Source: The Mother Tongue: English & How It Got That Way by Bill Bryson

11 Metadata: why control vocabulary? 2/2
…so your users don't have to!

12 Metadata: semantic relationships
Three types
  1. Equivalence
  2. Hierarchical
  3. Associative
Diagram: relationships around the preferred term "Train"
  – Variant (1, equivalence): Locomotive, Choo choo
  – Broader (2, hierarchical): Transport
  – Narrower (2, hierarchical): Steam engine
  – Related (3, associative): Bus, Tram
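
The Train example can be captured in a small data structure, one set per kind of link. This is a sketch under assumptions: the Term class and the relationship function are illustrative names, not part of the presentation.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """A preferred term and the terms linked to it."""
    preferred: str
    variants: set = field(default_factory=set)  # equivalence
    broader: set = field(default_factory=set)   # hierarchical (up)
    narrower: set = field(default_factory=set)  # hierarchical (down)
    related: set = field(default_factory=set)   # associative


train = Term(
    "Train",
    variants={"Locomotive", "Choo choo"},
    broader={"Transport"},
    narrower={"Steam engine"},
    related={"Bus", "Tram"},
)


def relationship(term, other):
    """Name the relationship type that links `other` to `term`, if any."""
    if other in term.variants:
        return "equivalence"
    if other in term.broader or other in term.narrower:
        return "hierarchical"
    if other in term.related:
        return "associative"
    return None
```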

13 Metadata: levels of control

14 Metadata: what is a thesaurus?
Traditional use
  – Dictionary of synonyms (Roget's)
  – From one word to many words
Information retrieval context
  – A controlled vocabulary in which equivalence, hierarchical, and associative relationships are identified for purposes of improved retrieval
  – Many words to one concept

15 Metadata: thesaurus terminology
Preferred terms (UF subject headings, descriptors)
  – SN: Scope Notes
  – UF: Used For
  – BT: Broader Term
  – NT: Narrower Term
  – RT: Related Terms (See Also)
Variant terms (UF non-preferred, entry terms)
  – USE: (See)
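
To show how these codes appear in a printed thesaurus entry, here is a small Python sketch that formats a preferred term with its SN/UF/BT/NT/RT lines and a variant term with its USE reference. The function names, the indentation and the sample terms (borrowed from the Train example) are assumptions for illustration.

```python
def format_entry(term, sn=None, uf=(), bt=(), nt=(), rt=()):
    """Format a preferred term with its scope note and relationship lines."""
    lines = [term]
    if sn:
        lines.append(f"  SN {sn}")
    for code, values in (("UF", uf), ("BT", bt), ("NT", nt), ("RT", rt)):
        for value in sorted(values):
            lines.append(f"  {code} {value}")
    return "\n".join(lines)


def format_variant(variant, preferred):
    """Format a variant (entry) term that points to its preferred term."""
    return f"{variant}\n  USE {preferred}"


entry = format_entry(
    "Train",
    uf=["Locomotive"],
    bt=["Transport"],
    nt=["Steam engine"],
    rt=["Tram", "Bus"],
)
print(entry)
print(format_variant("Locomotive", "Train"))
```

UF on the preferred term and USE on the variant are two views of the same equivalence link, so a maintenance tool would normally generate one from the other.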

16 Metadata: types of thesauri

17 Metadata: visibility
Classic Use
  – Both indexers and searchers explicitly map natural language terms onto controlled vocabularies
Web Environment
  – Able to choose level of visibility (implicit use, thesaural browsers)
  – Opportunity to educate users (terminology, associative learning)

18 Metadata: niche applications (hypothetical example)

19 Metadata: controlled vocabulary statistics
Principle of unlimited aliasing: by leveraging synonyms, recall went from 20% to 80% (in a small collection).
  – The Trouble with Computers; research study at Bellcore (Furnas et al. 1987)
The findings indicate that a hypertext index with multiple access points for each concept…led to greater effectiveness and efficiency of retrieval on almost all measures.
  – A Usability Assessment of Online Indexing Structures, by Carol A. Hert, Elin K. Jacob, and Patrick Dawson, Journal of the American Society for Information Science (September 2000)
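
Recall is the share of relevant documents that a search actually retrieves. The toy sketch below shows why aliasing raises it: expanding a query with its variant terms finds documents that use different words for the same concept. The documents, the alias table and the resulting figures are made up for illustration and do not reproduce the Bellcore results.

```python
def recall(retrieved, relevant):
    """Fraction of the relevant documents that were retrieved."""
    if not relevant:
        return 0.0
    return len(retrieved & relevant) / len(relevant)


# Toy collection: document id -> text.
docs = {
    1: "night blindness in adults",
    2: "nyctalopia treatment",
    3: "moon blindness in horses",
    4: "nyctalopia and vitamin a",
    5: "colour blindness tests",
}
relevant = {1, 2, 3, 4}  # documents that are about nyctalopia

# Alias table: query term -> all terms treated as equivalent to it.
aliases = {"nyctalopia": {"nyctalopia", "night blindness", "moon blindness"}}


def search(query, expand=False):
    """Return ids of documents containing the query, or any alias if expand is set."""
    terms = aliases.get(query, {query}) if expand else {query}
    return {doc_id for doc_id, text in docs.items() if any(t in text for t in terms)}


plain = recall(search("nyctalopia"), relevant)
expanded = recall(search("nyctalopia", expand=True), relevant)
print(plain, expanded)  # 0.5 1.0
```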

20 Metadata: Creating CVs
Understand your content (content audits and inventories)
Understand your business requirements
Understand what users are looking for
Decide on the ways the metadata will be used in the organisation

21 Metadata: defining the fields
By understanding the content, users and context, you should be getting an idea of the ways to describe content so that it
  – is more accessible for users
  – is able to connect to other content
  – meets the business needs
The fields will reflect this

22 Metadata: the fields
Say you decided on
  – Product name (because the users kept searching for it)
  – Subject (to link content together)
  – Audience (because the business wanted to target specific audiences)
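
Once the fields are chosen, each one can be backed by its own controlled vocabulary and checked when content is tagged. The sketch below assumes the three fields from this slide; the allowed values (product names, subjects, audiences) and the validate function are hypothetical.

```python
# Field name -> allowed values. The values are invented for illustration.
FIELDS = {
    "product_name": {"Widget Pro", "Widget Lite"},
    "subject": {"Installation", "Pricing"},
    "audience": {"Engineer", "Manager"},
}


def validate(record):
    """Return a list of problems; an empty list means the record is valid."""
    errors = []
    for name, allowed in FIELDS.items():
        value = record.get(name)
        if value is None:
            errors.append(f"{name}: missing")
        elif value not in allowed:
            errors.append(f"{name}: {value!r} is not in the controlled vocabulary")
    return errors


good = {"product_name": "Widget Pro", "subject": "Pricing", "audience": "Manager"}
bad = {"product_name": "Widget Pro", "subject": "Costs"}
print(validate(good))  # []
print(validate(bad))
```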

23 Metadata: Use existing CVs 1/2
Identify any CVs that exist within the organisation
Identify any CVs that exist outside of the organisation that could be useful
See if any will meet your needs with modification
It is ALWAYS better to modify a CV than come up with it yourself

24 Metadata: Use existing CVs 2/2
License the CVs with the ability to make changes – ensure that updates to the CV are included within the licensing fee
Add more preferred terms, if the CV is incomplete for your collection
Add more variant terms (your users' and organisation's words)
Restructure (but only if necessary)

25 Metadata: Creating your own
If no CVs exist, create your own
Collect terms that could be used in the CV – from users, content and the business
Identify CV structure from the terms collected
Start to create
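
The term-collection step can be sketched as pooling terms from the three sources and keeping those that turn up in more than one of them. The threshold, the function name collect_candidates and the sample terms are assumptions for this example; deciding which candidate becomes the preferred term remains an editorial judgement.

```python
from collections import Counter


def collect_candidates(sources, min_sources=2):
    """Return terms (lower-cased) that appear in at least `min_sources` sources.

    `sources` maps a source name (users, content, business) to a list of terms.
    """
    seen_in = Counter()
    for terms in sources.values():
        # Count each term once per source, ignoring case and stray spaces.
        for term in {t.strip().lower() for t in terms}:
            seen_in[term] += 1
    return sorted(t for t, n in seen_in.items() if n >= min_sources)


# Hypothetical input: search-log terms, terms found in content, business terms.
sources = {
    "users": ["Train", "choo choo", "tram"],
    "content": ["train", "steam engine", "Tram"],
    "business": ["Transport", "train"],
}
candidates = collect_candidates(sources)
print(candidates)  # ['train', 'tram']
```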

26 Metadata: Using it in your site
Static HTML sites
  – In the header
CMS, page-based systems
  – In the header
CMS, object-based systems
  – With each object
Databases
  – With each record

27 Metadata: Power in the site
Ability to do contextual linking to web sites and applications
Ability to find content
Syndication
Personalisation
Recommendation engines
Pervasive state for users across applications
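
Contextual linking and simple recommendations both come down to finding content that shares metadata with the page being viewed. A minimal sketch, assuming each page carries a set of subject terms from the controlled vocabulary; the page URLs, terms and the related_pages function are invented for illustration.

```python
# Hypothetical pages: URL -> subject terms from the controlled vocabulary.
pages = {
    "/trains": {"Train", "Transport"},
    "/trams": {"Tram", "Transport"},
    "/steam": {"Steam engine", "Train"},
    "/recipes": {"Cooking"},
}


def related_pages(url, pages, limit=3):
    """Rank other pages by how many subject terms they share with `url`."""
    own = pages[url]
    scored = [
        (len(own & terms), other)
        for other, terms in pages.items()
        if other != url and own & terms
    ]
    # Most shared terms first; ties broken alphabetically by URL.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:limit]]


print(related_pages("/trains", pages))  # ['/steam', '/trams']
```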

28 Thank you
Questions or comments?
Margaret Hanley
mairead@yahoo.com

