Softwaretechnologie für Fortgeschrittene Teil Thaller Stunde VI: Information revisited … Köln 9. Januar 2014.

Slides:



Advertisements
Similar presentations
PC in TB Manfred Thaller PLANETS TB meeting, DenHaag, Sept 28th. '06.
Advertisements

PC/4 Manfred Thaller PLANETS TB meeting, DenHaag, Sept 29th. '06.
Characterisation Adrian Brown The National Archives, UK.
Lecture 2: Basic Information Theory TSBK01 Image Coding and Data Compression Jörgen Ahlberg Div. of Sensor Technology Swedish Defence Research Agency (FOI)
Ontologies: Dynamic Networks of Formally Represented Meaning Dieter Fensel: Ontologies: Dynamic Networks of Formally Represented Meaning, 2001 SW Portal.
CHAPTER 2 GC101 Program’s algorithm 1. COMMUNICATING WITH A COMPUTER  Programming languages bridge the gap between human thought processes and computer.
Chapter 7 User-Defined Methods. Chapter Objectives  Understand how methods are used in Java programming  Learn about standard (predefined) methods and.
What is a text within the Digital Humanities, or some of them at least? Manfred Thaller, Universität zu Köln Digital Humanities 2012, July 20 th 2012.
CS292 Computational Vision and Language Pattern Recognition and Classification.
If You Missed Last Week Go to Click on Syllabus, review lecture 01 notes, course schedule Contact your TA ( on website) Schedule.
Introduction to Gröbner Bases for Geometric Modeling Geometric & Solid Modeling 1989 Christoph M. Hoffmann.
Case-based Reasoning System (CBR)
Foundations This chapter lays down the fundamental ideas and choices on which our approach is based. First, it identifies the needs of architects in the.
55:148 Digital Image Processing Chapter 11 3D Vision, Geometry Topics: Basics of projective geometry Points and hyperplanes in projective space Homography.
Introduction to Database Systems 1.  Assignments – 3 – 9%  Marked Lab – 5 – 10% + 2% (Bonus)  Marked Quiz – 3 – 6%  Mid term exams – 2 – (30%) 15%
The PLANETS-Ontology in the context of the PLANETS-Testbed and the XCL-Software.
An Information Theory based Modeling of DSMLs Zekai Demirezen 1, Barrett Bryant 1, Murat M. Tanik 2 1 Department of Computer and Information Sciences,
Unit 2: Engineering Design Process
Electromagnetism Lecture#1 [Introduction] Instructor: Engr. Muhammad Mateen Yaqoob.
Purpose of study A high-quality computing education equips pupils to use computational thinking and creativity to understand and change the world. Computing.
Fundamentals of Information Systems, Fifth Edition
Chapter 1 Introduction Dr. Frank Lee. 1.1 Why Study Compiler? To write more efficient code in a high-level language To provide solid foundation in parsing.
Foundations of Computer Science Computing …it is all about Data Representation, Storage, Processing, and Communication of Data 10/4/20151CS 112 – Foundations.
Digital Image Processing CCS331 Basic Transformations 1.
Information Retrieval and Web Search Lecture 1. Course overview Instructor: Rada Mihalcea Class web page:
The XCL Languages Digital Preservation – The Planets Way Dresden, April 23 rd 2010 Manfred Thaller, Universität zu Köln.
© 2005 Prentice Hall, Decision Support Systems and Intelligent Systems, 7th Edition, Turban, Aronson, and Liang 5-1 Chapter 5 Business Intelligence: Data.
1 What is HTML? Standardized codes Web pages SGML Descriptive markup Tags.
Discrete Mathematics 이재원 School of Information Technology
The Beauty and Joy of Computing Lecture #3 : Creativity & Abstraction UC Berkeley EECS Lecturer Gerald Friedland.
EXtensible Characterisation Languages (XCL) Manfred Thaller, (University at Cologne) DPP meeting, Glasgow, Nov. 23 rd 2006.
Lesson 1 – Number Sets & Set Notation Math 2 Honors – Mr Santowski 10/18/20151 Math 2 Honors - Santowski.
Scientific Applications of XML Arvind Hulgeri, Shantanu Godbole
Visual Information Systems Recognition and Classification.
PROBLEM AREAS IN MATHEMATICS EDUCATION By C.K. Chamasese.
1 Digital Preservation Testbed Database Preservation Issues Remco Verdegem Bern, 9 April 2003.
1 A Historical Perspective on Conceptual Modelling (Based on an article and presentation by Janis Bubenko jr., Royal Institute of Technology, Sweden. June.
Chapter 5: Business Intelligence: Data Warehousing, Data Acquisition, Data Mining, Business Analytics, and Visualization DECISION SUPPORT SYSTEMS AND BUSINESS.
02. November 2007 Florian Probst Data and Knowledge Modelling for the Geosciences - Chris Date Seminar - e-Science Institute, Edinburgh Semantic Reference.
SOFTWARE DESIGN. INTRODUCTION There are 3 distinct types of activities in design 1.External design 2.Architectural design 3.Detailed design Architectural.
3.2 Semantics. 2 Semantics Attribute Grammars The Meanings of Programs: Semantics Sebesta Chapter 3.
Computers, Variables and Types Engineering 1D04, Teaching Session 2.
Description of Bibliographic Items. Review Encoding = Markup. The library cataloging “markup” language is MARC. Unlike HTML, MARC tags have meaning (i.e.,
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
1 / 48 Formal a Language Theory and Describing Semantics Principles of Programming Languages 4.
Mestrado em Ciência de Computadores Mestrado Integrado em Engenharia de Redes e Sistemas Informáticos VC 15/16 – TP14 Pattern Recognition Miguel Tavares.
Cooperative Computing & Communication Laboratory A Survey on Transformation Tools for Model-Based User Interface Development Robbie Schäfer – Paderborn.
XCL-Tools in relation to Significant characteristics in Planets Manfred Thaller Universität zu* Köln *University at not of Cologne.
College of Computer Science, SCU Computer English Lecture 1 Computer Science Yang Ning 1/46.
1 Prints: The Language of Industry. Learning Objectives Identify the importance of prints. Discuss historical processes and technologies related to prints.
CS223: Software Engineering
Connecting Architecture Reconstruction Frameworks Ivan Bowman, Michael Godfrey, Ric Holt Software Architecture Group University of Waterloo CoSET ‘99 May.
Key understandings in mathematics: synthesis of research Anne Watson NAMA 2009 Research with Terezinha Nunes and Peter Bryant for the Nuffield Foundation.
Auszug aus: What is a text within the Digital Humanities, or some of them at least? Manfred Thaller, Universität zu Köln Digital Humanities 2012, July.
Convergences between modern languages and language(s) of schooling – Sweden –
What We Need to Know to Help Students’ Scores Improve What skills the test measures How the test relates to my curriculum What skills my students already.
Technical Reports ELEC422 Design II. Objectives To gain experience in the process of generating disseminating and sharing of technical knowledge in electrical.
Victorian Curriculum F–10 Online professional learning session Unpacking Digital Technologies Paula Christophersen Digital Technologies, Curriculum Manager.
What makes a good HEFA report? Jennifer French
Computer Systems Architecture Edited by Original lecture by Ian Sunley Areas: Computer users Basic topics What is a computer?
Describing Syntax and Semantics
CSCI-235 Micro-Computer Applications
Compiler Construction
ELECTRONIC MAIL SECURITY
Theory of Computation Turing Machines.
ELECTRONIC MAIL SECURITY
CHAPTER 1: THE DATABASE ENVIRONMENT AND DEVELOPMENT PROCESS
Respectful Type Converters
UNIT-II CHAPTER-4 SOFTWARE REQUIREMENT DEFINITION
Presentation transcript:

Softwaretechnologie für Fortgeschrittene Teil Thaller Stunde VI: Information revisited … Köln 9. Januar 2014

Beobachtungen: Bilder und Texte

Süddeutsche Zeitung, Nr. 58, 9./10. März 2013, Wochenendbeilage, p. V2/2

Bilder und Texte sind 2013 so ubiquitär, dass sie in modernen Informationssystemen nicht mehr sinnvoll getrennt gehandhabt werden können. Hintergrundthese

Bilder und Texte in geisteswissenschaftlichen Informationssystemen: State of the Art

κλειω – Bild / Textverknüpfungen, Stand 1992 Text – Bild – Editoren

Meta Image, ( ) Text – Bild – Editoren

( ) Text – Bild – Editoren

( ) Text – Bild – Editoren

Magisterarbeiten J. Schnasse (2006), E. Weiper (2007) Text – Bild – Editoren

TILE (Text Image Linking Evironment) ( ) Text – Bild – Editoren

TextGrid ( ) Text – Bild – Editoren

ELAN ( ) Text – Bild – Editoren

 Text- und Bildabschnitte, sowie Punkte in strukturierten Wissensdarstellungen (Ontologien) können in beliebiger Granularität miteinander verbunden werden.  Textabschnitte können aus Mengen diskontinuierlicher Abschnitte beliebiger Granularität bestehen.  Bildabschnitte können aus Mengen diskontinuierlicher Abschnitte beliebiger Form bestehen.  Es stehen Werkzeuge zur Formerkennung zur Verfügung.  Suchabfragen können datentypspezifische Frageformen aller drei Domänen miteinander verbinden.  Alle Funktionen sind ohne explizite Installation von Clientsoftware über Standardbrowser zugänglich.  Beschreibungen können kollaborativ nach unterschiedlichen Rollenmodellen angelegt werden. Idealtypische Anforderungen …

Bilder und Texte - Information

What is a text within the Digital Humanities, or some of them at least? Manfred Thaller, Universität zu Köln Digital Humanities 2012, July 20 th 2012

Information I

Claude Shannon: "A Mathematical Theory of Communication", Bell System Technical Journal, Shannon 19

The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. (Shannon, 1948, 379) Shannon 20

Shannon It is wet outside. It must be raining … 21

The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem. (Shannon, 1948, 379) Shannon 22

Shannon It is wet outside. It must be raining … 23

Shannon It is wet outside. It is wet outside … 24

Shannon

„Ladder of Knowledge“ wisdom knowledge information data 26

Information 27

Data 28

Data are stored. E.g.: 22°C. Information are data interpreted within a context: "In this lecture hall the temperature is 22°C". This context is fixed and identical for all recipients of information. Data  Information 29

Knowledge is the result of a more complex process. E.g. the decision, derived from the room temperature of 22 ° centigrade, to get out of your jacket; or not. This context is different between recipients of information. Information  Knowledge 30

Data 22 ° C 22 ‘ ’ So … Information 22 ° C in lecture hall M 22 ° 22 [ NOT ASCII { 0, 22 } ] 31

Langefors “Infological Equation”: original I = i (D, S, t) I ::= Information i() ::= interpretative process D ::= Data S ::= Previous knowledge t ::= time Börje Langefors, Essays on Infology, Studentliteratur: Lund, 1995 Langefors 32

Information II

Receiving information … The kink has been kiled. Oh, that was John II. Very strange spelling, even for that time.. 34

Receiving information … Oh, that was John II. Very strange spelling, even for that time.. Notice: We can not consult the sender any more …. ? 35

Langefors “Infological Equation”: original I = i (D, S, t) I ::= Information i() ::= interpretative process D ::= Data S ::= Previous knowledge t ::= time Börje Langefors, Essays on Infology, Studentliteratur: Lund, 1995 Langefors 36

Langefors “Infological Equation”: generalization 1 I 2 = i (I 1, S 2, t) I ::= Information i() ::= interpretative process D ::= Data S ::= Previous knowledge t ::= time Börje Langefors, Essays on Infology, Studentliteratur: Lund, 1995 Langefors 37

Receiving information … Oh, that was John II. Very strange spelling, even for that time.. Notice: We can not consult the sender any more …. ? 38

Receiving information … Or, was it 100 years earlier? Very strange spelling, even for that time.. Notice: We can not consult the sender any more …. ? 39

Langefors “Infological Equation”: generalization 2 I x = i (I x-1, S x, t) I ::= Information i() ::= interpretative process D ::= Data S ::= Previous knowledge t ::= time Börje Langefors, Essays on Infology, Studentliteratur: Lund, 1995 Langefors 40

Langefors “Infological Equation”: generalization 3 S x = s (I x-1, t) I ::= Information i() ::= interpretative process D ::= Data S ::= Previous knowledge, s() = knowledge generating process t ::= time Börje Langefors, Essays on Infology, Studentliteratur: Lund, 1995 Langefors 41

Langefors “Infological Equation”: generalization 4 I x = i (I x-α, S x-β, t) I ::= Information i() ::= interpretative process D ::= Data S ::= Previous knowledge t ::= time Börje Langefors, Essays on Infology, Studentliteratur: Lund, 1995 Langefors 42

Langefors “Infological Equation”: generalization 5 I x = i (I x-α, s(I x-β, t), t) I ::= Information i() ::= interpretative process D ::= Data S ::= Previous knowledge t ::= time Börje Langefors, Essays on Infology, Studentliteratur: Lund, 1995 Langefors 43

Data 22 ° C 22 ‘ ’ Remember … Information 22 ° C in lecture hall M 22 ° 22 [ NOT ASCII { 0, 22 } ] 44

Changeable datatypes int myVariable; char myVariable; temperature myVariable; obj myVariable; myVariable.useAsInt(); myVariable.useAsChar(); myVariable.addInterpretation(temperature,Centigrade); 45

Notes: (1)If this is so, the assumption of Comp. Sci., that information is represented by structures on which algorithms operate, can be replaced by a more general understanding, according to which information is a state of a set of perpetually active algorithms. (2) Has that any practical meaning? Langefors 46

A practical interlude

► Photoshop ► Planets: the problem 48

png tiff Extractor Comparator image info 2 image info 1 the same? Format conversion png rulestiff rules Planets: the vision 1 49

Obj 1 Obj 2 Extractor Comparator object info 2 object info 1 the same? Format conversion rule set 1rule set 2 Planets: the vision 2 50

Planets: the vision 3 Obj 1 Obj 2 Extractor Comparator XCDL 2 XCDL 1 the same? Format conversion XCEL 1XCEL 2 Specification of „similiarity“ to be used: „comparator comparison [Language] “ (coco). Specification of „similiarity“ observed: „comparator results [Language] “ (copra). Abstract description of file content: „eXtensible Characterisation Definition Language“ (XCDL), able to describe the content of digital objects (=1 + n more files), processible by a software tool for further analysis. Machine readable form of a file format specification: „eXtensible Characterisation Extraction Language“ (XCEL), able to describe any machine readable format in a formal language, processible by a software tool for extraction of content as XCDL. 51

This is a text … fontsize 48 unsignedInt8 Text in XCDL 52

7A 11 9B F4 DA 9C B … … title Ebstorf Mappa Mundi ASCII Image in XCDL 53

Generalizing the practical solution

Allows to make statements about the proximity of two objects on the "y" axis. Irrespective of the "shape" of the object. Dimensions: geometry 55

Allows to make statements about the proximity of two objects on the "y" axis. Irrespective of the “object" that is at the abstract position. Dimensions: textual / conceptual 56

Dimensions are by definition orthogonal. Dimensions can have any sort of metric:  Rational: { - ∞ … + ∞ }  Integer range: { 0 … 100 }  Nominal: { medieval, early modern, modern }  Image: {, }  … Dimensions: metrics 57

(1) Biggin (2) Biggin (3) Biggin (4) Biggin Which of the chunks are more similar to each other: (1) and (2) or (1) and (3)? Four texts … 58

… in a coordinate space. 59

Liber exodi glosatus An image in a textual coordinate space 60

Liber exodi glosatus An text in an image coordinate space 61

An image in a semantic coordinate space Bishop Cardinal Monk Priest Monk 62

Semantics in an image coordinate space Bishop Cardinal Monk Priest Monk 63

Generalization 1 Biggin Visualization {bold, italic} Interpretation {surname, topographic name} 64

Generalization 2 Series of atomic content tokens Conceptual dimension 1 Conceptual dimension 2 65

Generalization 3 { T, C 1, C 2 } 66

Generalization 4 { T, { C 1, C 2, …, C n } } 67

Generalization 5 { T, C n } (1) Texts are sequences of content carrying atomic tokens. (2) Each of these tokens has a position in an n-dimensional conceptual universe. 68

Generalization 6 { X, Y, C n } 69

Generalization 7 { T 1, T 2, C n } (1) Images are planes of content carrying atomic tokens. (2) Each of these tokens has a position in an n- dimensional conceptual universe. 70

Generalization 8 I ::= { { T 1, T 2, … T m }, C n } (1) Information objects are m-dimensional arrangements of content carrying atomic tokens. (2) Each of these tokens has a position in an n- dimensional conceptual universe. 71

Generalization 9 I ::= {T m, C n } (1) Information objects are m-dimensional arrangements of content carrying atomic tokens. (2) Each of these tokens has a position in an n- dimensional conceptual universe. (3) All of this, of course, is recursive … 72

Another practical interlude

Virtual Research Environments  Virtuelles deutsches Urkundennetz (Virtual network of German charters) 74

digitization editing research transcription publication symbol manipulation teachteach A model of historical research

base image editorial coordinates semantic coordinates textual coordinates publication coordinates symbol coordinates didactic coordinates A model of historical research

Conclusion

Summary (1)All texts, for which we cannot consult the producer, should be understood as a sequence of tokens, where we should keep the representation of the tokens and the representation of our interpretation thereof completely separate. (2)Such representations can be grounded in information theory. (3)These representations are useful as blueprints for software on highly divergent levels of abstraction. 78

Thank you! 79