The Economy of Distributed Metadata Authoring


1 The Economy of Distributed Metadata Authoring
by Stefano Mazzocchi
The presentation will sketch the differences between data creation and metadata creation, outlining the impact of these differences on the economy of distributed content creation and consumption. It will be shown how these economic effects might impact both semantically-enhanced distributed technologies and the communities of users of these technologies. Finally, it will be suggested how the economic and social projections can be used as a metric for the feasibility of proposed technologies that involve highly distributed environments.
Experts' Workshop - Perspectives on networked knowledge spaces, 25/26 October 2002, Sankt Augustin, Germany
Organised by: MARS Exploratory Media Lab at the Fraunhofer Institut für Medienkommunikation

2 What is Metadata
Metadata is information about information

3 Classic Examples of Metadata
Keywords
Author
Date of creation/modification
Address/Identifier

4 More provocative examples
Punctuation in text
Layout on a page
Font size/weight/style in text
Commentary audio tracks on DVD

5 General Metadata Properties
Metadata is data about data, but it’s still data
Metadata should be semantically orthogonal: data should be understandable even without metadata

6 Markup and Metadata
Markup languages can be seen as metadata-driven languages.
Markup syntax is designed to keep data and metadata orthogonal

7 The Importance of Metadata
Key to semantic analysis
Key to multidimensional augmentation of information
Key to information relationability
In short: key to more powerful datamining

8 Types of Metadata
Human authored
Automatically inferred

9 Human Authoring (1)
In-process: data and metadata are created at the same time
Out-of-process: metadata is added after data has been created

10 Human Authoring (2)
By the data author: data and metadata are written by the same person
By another author: data and metadata are created by different people

11 Automatic Inference
Recognition of patterns and trends in data
Semantic assumption of data-metadata correlations

12 Types of Automatic Inference
Heuristic: an algorithm analyzes the data set (an artificial reproduction of intelligent behavior)
Transparent: mechanically extracted information is transparently associated with metadata that results from human semantic analysis

13 Transparent Inference Examples
Google’s PageRank
Amazon’s related items
NEC’s CiteSeer

14 Google
PageRank is the system that ranks the pages found after a query against their database
It works on hyperlink topology analysis
Metadata is inferred from the hyperlinks contained in the page
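
The idea of inferring relevance metadata from hyperlink topology can be illustrated with a minimal sketch. It is a simplified power-iteration ranking, not Google's production system; the function name `pagerank`, the damping value 0.85, the iteration count, and the toy link graph are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages from link topology alone (simplified sketch).

    links maps each page to the list of pages it links to.
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        # Every page keeps a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank evenly along its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page without outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical three-page web: "a" and "b" both link to "c", "c" links back to "a".
toy_links = {"a": ["c"], "b": ["c"], "c": ["a"]}
toy_ranks = pagerank(toy_links)
```

Nobody in this toy web wrote a "relevance" field: the ordering emerges from links that authors created for their own reasons, which is what makes the inference transparent.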

15 Amazon
Relation between items is inferred from the analysis of the articles bought by the other users
The act of a user buying two products is assumed to be a sign of relation between the items
Simply by buying, the users are collectively filling up product metadata on relations
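
A minimal sketch of how purchases can fill in relation metadata, assuming each order is simply a list of item names. This is not Amazon's actual algorithm; the helper names `co_purchase_counts` and `related_items` and the sample orders are hypothetical.

```python
from collections import Counter
from itertools import combinations


def co_purchase_counts(orders):
    """Count how often each pair of items appears in the same order."""
    counts = Counter()
    for order in orders:
        # Sort so that each unordered pair always has the same key.
        for a, b in combinations(sorted(set(order)), 2):
            counts[(a, b)] += 1
    return counts


def related_items(item, counts, top=3):
    """Return the items most often bought together with `item`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a == item:
            scores[b] += n
        elif b == item:
            scores[a] += n
    return [other for other, _ in scores.most_common(top)]


# Hypothetical purchase history.
sample_orders = [
    ["book", "lamp"],
    ["book", "lamp", "pen"],
    ["lamp", "mug"],
]
sample_counts = co_purchase_counts(sample_orders)
sample_related = related_items("book", sample_counts)
```

The buyers never author a "related to" field; the relation is a by-product of an action they were going to take anyway, so the authoring cost is effectively zero.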

16 CiteSeer
Digital Library of IT papers
Ranks searches on ‘citations’ topology analysis
Bibliographies become the source of relevance metadata
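
As a rough illustration of bibliographies acting as relevance metadata, the sketch below orders search matches by how many other papers cite them. Plain citation counting is an assumption chosen for brevity and is not a description of CiteSeer's real ranking; the names `citation_counts` and `rank_results` and the sample papers are hypothetical.

```python
from collections import Counter


def citation_counts(bibliographies):
    """Count incoming citations per paper.

    bibliographies maps each paper to the list of papers it cites.
    """
    counts = Counter()
    for paper, cited in bibliographies.items():
        # Count each cited paper once per bibliography and ignore self-citations.
        for ref in set(cited):
            if ref != paper:
                counts[ref] += 1
    return counts


def rank_results(matches, counts):
    """Order matching papers by citation count, most cited first."""
    # Ties are broken alphabetically to keep the order deterministic.
    return sorted(matches, key=lambda p: (-counts[p], p))


# Hypothetical bibliographies.
sample_bibliographies = {
    "paper_a": ["paper_c", "paper_d"],
    "paper_b": ["paper_c"],
    "paper_c": ["paper_d"],
    "paper_d": [],
}
sample_ranking = rank_results(
    ["paper_a", "paper_c", "paper_d"],
    citation_counts(sample_bibliographies),
)
```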

17 The Issues with Metadata
Quality of metadata heavily influences the quality of all search/retrieval systems

18 First Law of Metadata Quality
Artificial intelligence is just that: artificial!
So: for a system that feels smart to humans, you need human-created metadata

19 First Law of Metadata Quantity
The more high-quality metadata, the better.
But: the more human-created metadata, the more expensive the authoring process gets.

20 Metrics
In order to estimate the value of proposed technological solutions, a metric is required
Economic feasibility is one possible metric

21 Consequences
All current markup-based semantic web solutions (RDF, topic maps, ontologies) are economically infeasible.
The best semantic solutions are those based on transparent inference

22 Suggestions (1)
Plan the impact of metadata authoring costs on technology decisions.
Don’t underestimate the importance of user feeling.
Think about what can be inferred transparently without requiring heuristics

23 Suggestions (2)
Make every effort to give an instant return on the investment of metadata authoring
Don’t ask too much
Be smart but not smarter

24 Thanks!

