Mercury ain't what he used to be, but was he ever? Or do electronic scholarly editions have a mercurial attitude? The Marriage of Mercury and Philology: Problems and Outcomes in Digital Philology - 26/03/ e-Science Institute - Federico Meschini
Mercury is for a lot of things... [http://en.wikipedia.org/wiki/Mercury] [He | It] is a deity, a planet, an element, a plant, a city, a programming language and many other things, and is also the root of a day of the week and of words such as mercurial and probably hermeneutics. [His | Its] several domains and traits are communication, ingenuity, trade, flexibility, magic, speed, and crossroads.
Mercury Updated: There is also a modern version of Mercury... well, more than one! This name is also polysemic: in electronic publishing it indicates a graphic format which is the basis for the so-called Rich Internet Applications.
About Marriages... Is marriage the right metaphor for the relationship and the synthesis between technique and culture? What at first might seem a dichotomic opposition presents a lot of nuances. - Techné, the Greek root of both technology and technique, literally means Art. - The product of a marriage has traits from both partners. - Following Jung's theory of anima/animus, male and female are present and reflected in each other.
Principles: What principles should be followed, and what (when possible) avoided, when creating an electronic scholarly edition? - Incompatibility vs. Semantic umbrella/glue - Sonic screwdriver vs. Lego Bricks - Blob vs. Snow Crystal - Incompleteness vs. Extensibility
Incompatibility: What are the main factors causing this difference? - Nature of the primary materials (both form and content) - Scholarly vision of the general editor - Technical vision of the computer scientist - Level of understanding between the two... The variables are many and their combination can produce very different results. Most digital critical editions are architecturally dysmorphic with respect to each other.
Incompatibility: - Two different encoders will encode the same text in two different ways. - Two different programmers will write two different programs, even to solve the same problem. - Two different editors will produce two different editions, even if based on the same text.
Semantic Glue/Umbrella: Possible solution? Using semantic metadata, or even better an ontology, imposed from the top (umbrella) and/or as part of the edition itself (glue), to smooth out the differences. This solution has been used by both NINES and DISCOVERY. Moreover, Discovery includes both a broad, general structural ontology and a specific domain ontology customized for the particular content of each archive, since the archives span from Greek to modern philosophy.
Sonic Screwdriver: Is it possible to have only one tool/framework that will solve all the issues of electronic editions? This can be defined as the sonic screwdriver syndrome, after an impossible fictional tool which is used to solve every possible task.
Lego Bricks: "Then, there was a tendency for software to be greedy: to try to do everything possible. […] Now, the tendency is the reverse: for software tools to try to do one thing only, really well, and to cooperate with other tools which do other things." (P. Robinson, Anastasia and Collate Blog) "A modular approach to the functions of an electronic edition/archive/knowledge site may help us achieve the flexibility and compatibility we want." (P. Shillingsburg, From Gutenberg to Google)
Lego Bricks: The magic behind Lego Bricks is a standardized stud mechanism. Is it possible to apply the same principle, given that every edition has tailor-made requirements? For a software component this would mean having at least: exposed methods, configuration parameters and standardized input-output formats. A new Digital Library System, called BRICKS, is built following this paradigm.
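As a rough illustration of the three requirements named on this slide (exposed methods, configuration parameters, standardized input-output formats), a component contract could be sketched in Python as follows. The names `EditionComponent`, `Tokenizer`, `TokenCounter` and `run_pipeline` are hypothetical and are not taken from BRICKS or any other existing system; a plain dictionary stands in for a standardized exchange format.

```python
from typing import Any, Protocol


class EditionComponent(Protocol):
    """Hypothetical 'stud' contract that every brick agrees to expose."""

    def configure(self, **params: Any) -> None: ...

    def process(self, document: dict) -> dict: ...


class Tokenizer:
    """Brick 1: splits the text into tokens."""

    def __init__(self) -> None:
        self.separator = None  # None means "split on any whitespace"

    def configure(self, **params: Any) -> None:
        self.separator = params.get("separator", self.separator)

    def process(self, document: dict) -> dict:
        tokens = document["text"].split(self.separator)
        return {**document, "tokens": tokens}


class TokenCounter:
    """Brick 2: relies only on the shared dictionary format, not on Tokenizer."""

    def configure(self, **params: Any) -> None:
        pass  # this brick has nothing to configure

    def process(self, document: dict) -> dict:
        return {**document, "token_count": len(document["tokens"])}


def run_pipeline(components, document: dict) -> dict:
    """Snap the bricks together: each output is the next input."""
    for component in components:
        document = component.process(document)
    return document


result = run_pipeline([Tokenizer(), TokenCounter()], {"text": "To be, or not to be"})
print(result["token_count"])
```

Because each brick only reads and writes the shared dictionary, one of them can be swapped for another implementation without touching the rest, which is the point of the stud analogy.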
Blob vs. Snow Crystal: When the edition starts to grow, adding new contents, functionalities and components, even using Lego Bricks, what kind of pattern should it follow? There are two extremes: on one side the blob, and on the other the snow crystal. Every electronic edition will surely change sooner or later, or it will be used for some other purpose. The less the blob, the better.
Incompleteness vs. Extensibility: "Axiom 3. No finite markup language can be complete." (C. M. Sperberg-McQueen, Text in the Electronic Age) By extension, no finite [electronic] scholarly edition can be complete. Therefore it should be extensible, and here the electronic medium surely has some advantages over print. New standards and technologies can be used to extend the edition, even beyond their main scope: see CIDOC-CRM for [formalizing | representing | encoding] the textual process.
Digital Edition Layers: What are the main ingredients/layers of an electronic scholarly edition? - Operating logic - Structured data - Raw data - User Interface. Every simple action, such as turning pages, involves all of these layers.
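To make the claim about page turning concrete, here is a deliberately tiny sketch in which one "turn page" action passes through all four layers. Everything in it is an assumption made for illustration: the file names, the `Reader` class, the `render` function and the placeholder text of the second page are hypothetical.

```python
# Raw data layer: page transcriptions keyed by hypothetical file names.
RAW_DATA = {
    "page-1.txt": "Whan that Aprille, with...",
    "page-2.txt": "[placeholder text of page 2]",
}

# Structured data layer: ordered page records that point into the raw data.
STRUCTURED_DATA = [
    {"n": 1, "source": "page-1.txt"},
    {"n": 2, "source": "page-2.txt"},
]


class Reader:
    """Operating logic layer: keeps track of the current page."""

    def __init__(self) -> None:
        self.index = 0

    def turn_page(self, step: int = 1) -> dict:
        # Clamp the index so the reader cannot leave the edition.
        last = len(STRUCTURED_DATA) - 1
        self.index = max(0, min(last, self.index + step))
        return STRUCTURED_DATA[self.index]


def render(record: dict) -> str:
    """User interface layer: turns a page record into what the user sees."""
    return f"[page {record['n']}] {RAW_DATA[record['source']]}"


reader = Reader()
shown = render(reader.turn_page())  # one user action, four layers involved
print(shown)
```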
Reference Model: Would a reference model for assembling an edition be useful? When talking about theoretical abstract models there are two quotes which should always be kept in mind: "There is nothing more practical than a good theory" (J. C. Maxwell) and "Beware the Jabberwock, my son!" (Lewis Carroll, Jabberwocky).
Reference Model: First rule of programming: do not reinvent the wheel. Currently there are two main models for Digital Libraries (more people/resources involved): - Delos Reference Model - 5S Model. What is the relationship between Digital Libraries and Electronic Editions, if any? Holonymy, hyponymy, overlapping? Are they two species in the same phylum?
Delos Reference Model: Based on a Concepts-Relationships approach. The DELOS Digital Library Reference Model - Version 0.98, Candela, L.; Castelli, D.; Ferro, N.; Ioannidis, Y.; Koutrika, G.; Meghini, C.; Pagano, P.; Ross, S.; Soergel, D.; Agosti, M.; Dobreva, M.; Katifori, V.; Schuldt, H. (February 2008).
5S Model: Based mostly on linear algebra, graph theory and set theory, to express in a formal and unambiguous way what a Digital Library is. Streams, structures, spaces, scenarios, societies (5S): A formal model for digital libraries, Gonçalves M. A., Fox E. A., Watson L. T., Kipp N. A., April 2004.
Libraries vs. Editions: At the same time similar and different. They share some basic functions (indexing, browsing, searching) and are built with the same basic ingredients (Software Frameworks, Databases, XML). Critical Digital Editions have a more granular level and structure: think about the encoding and representation of textual variants. Digital libraries' main functions are preservation and retrieval. An electronic edition is more like an environment... I thought that this was an original idea, until I read Vanhoutte's Where is the editor. Functional Requirements: Library and Information Science vs. Textual Scholarship
Complex Digital Objects: Electronic Scholarly Editions are: - Composed of complex digital objects and related metadata (see MTPO Overview) - Complex digital objects themselves. What about the preservation of the edition itself? In which way can its architecture and features be preserved, using a common vocabulary, so that the edition could be recreated or improved?
[Semiotic] Digital Stack: What are the different levels of digital encoding? [Figure: stack diagram; recoverable labels: CONTENT, EXPRESSION, FORM and SUBSTANCE (the last two appear twice), FRBR Work, FRBR Expression, FRBR Manifestation; sample text: "Whan that Aprille, with..."] Annotations, mapping and comparison between these layers.
[Semantic] Web [2.0]: Are Semantic Web and Web 2.0 relevant for Electronic Scholarly Editions? In two words... - Semantic Web: allows the formal representation of knowledge, which is the actual mission of a critical edition (both printed and electronic). - Web 2.0: allows for a distributed application platform, advanced user interfaces and user-generated content.
Text encoding: Text (or better, chars and strings) is one of the fundamental data types for computers. The issue is not (thanks also to Unicode) the horizontal (syntagmatic) level, but the vertical (paradigmatic) one, in particular the structures, when they are not perfectly contiguous, or in other words overlapping... "To be, or not to be: that is the question: / Whether 'tis nobler in the mind to suffer / The slings and arrows of outrageous fortune, / Or to take arms against a sea of troubles, / And by opposing end them? To die: to sleep;"
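The overlap in the Hamlet passage can be checked mechanically. The sketch below is only an illustration under simple assumptions: verse lines and sentences are both represented as character-offset spans over one string, and sentences are detected naively by splitting at `.`, `?` or `!`. The function names are hypothetical.

```python
LINES = [
    "To be, or not to be: that is the question:",
    "Whether 'tis nobler in the mind to suffer",
    "The slings and arrows of outrageous fortune,",
    "Or to take arms against a sea of troubles,",
    "And by opposing end them? To die: to sleep;",
]
TEXT = "\n".join(LINES)


def line_spans(lines):
    """Metrical hierarchy: one (start, end) span per verse line."""
    spans, pos = [], 0
    for line in lines:
        spans.append((pos, pos + len(line)))
        pos += len(line) + 1  # +1 for the newline joining the lines
    return spans


def sentence_spans(text):
    """Syntactic hierarchy: naive split at sentence-final punctuation."""
    spans, start = [], 0
    for i, ch in enumerate(text):
        if ch in ".?!":
            spans.append((start, i + 1))
            start = i + 1
            while start < len(text) and text[start].isspace():
                start += 1
    if start < len(text):
        spans.append((start, len(text)))
    return spans


def overlaps(a, b):
    """True when two spans intersect but neither contains the other."""
    (a0, a1), (b0, b1) = a, b
    intersect = a0 < b1 and b0 < a1
    a_in_b = b0 <= a0 and a1 <= b1
    b_in_a = a0 <= b0 and b1 <= a1
    return intersect and not (a_in_b or b_in_a)


lines = line_spans(LINES)
sentences = sentence_spans(TEXT)
conflicts = [
    (li, si)
    for li, line in enumerate(lines)
    for si, sentence in enumerate(sentences)
    if overlaps(line, sentence)
]
print(conflicts)  # pairs (line index, sentence index) that cannot nest
```

The first sentence starts in the first verse line and ends in the middle of the last one, so a single tree cannot hold both the line and the sentence as properly nested elements.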
TEI: TEI is the de facto standard for the electronic encoding of literary texts. It is loved, hated, criticized, praised, but cannot be ignored. TEI is not XML: TEI currently uses XML to express its Guidelines, currently at the P5 version (Chapters 11 and 12 for the transcription of primary sources and the critical apparatus). Earlier (from 1987 to 2001) SGML was used, therefore TEI has always had an OHCO approach: Ordered Hierarchy of Content Objects, theorized by DeRose et al. in What is Text, Really? (1990).
OHCO: OHCO is one vision of the textual phenomenon and its organization; it is not the only one, but it is probably the most widespread and general. Charles Goldfarb was a lawyer, and this influenced his vision of textual structures in designing SGML. One of the most peculiar features of literary texts is the overlap between hierarchies. Renear et al., Refining our Notion of What Text Really Is: The Problem of Overlapping Hierarchies (1993).
Overlapping: OHCO and TEI have at the same time a lot of supporters and critics. Solutions to overlapping? In theory a lot: using XML (Segment-Boundary Elements | Delimiters, COLT) or other syntaxes (Concur, MECS, GODDAG, LMNL, HORSE). Every solution has its own pros and cons.
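One of the XML-based options, segment-boundary elements, can be sketched as follows: verse lines stay ordinary container elements, while sentences are marked only by empty start and end markers, so they may cross line boundaries. The element names `sStart` and `sEnd` are hypothetical, chosen for this example and not taken from the TEI Guidelines; the parsing uses Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

XML = """<sp>
<l n="1"><sStart n="1"/>To be, or not to be: that is the question:</l>
<l n="2">Whether 'tis nobler in the mind to suffer</l>
<l n="3">The slings and arrows of outrageous fortune,</l>
<l n="4">Or to take arms against a sea of troubles,</l>
<l n="5">And by opposing end them?<sEnd n="1"/> <sStart n="2"/>To die: to sleep;<sEnd n="2"/></l>
</sp>"""


def sentences(root):
    """Rebuild the sentence hierarchy from the empty boundary markers."""
    collected = {}
    current = None  # id of the sentence that is currently open, if any
    for line in root.iter("l"):
        # Text chunks and marker elements of this line, in document order.
        events = [line.text]
        for child in line:
            events.extend([child, child.tail])
        for event in events:
            if event is None:
                continue
            if isinstance(event, str):
                if current is not None:
                    collected[current].append(event)
            elif event.tag == "sStart":
                current = event.get("n")
                collected[current] = []
            elif event.tag == "sEnd":
                current = None
        if current is not None:
            collected[current].append(" ")  # a line break inside a sentence
    return {n: "".join(parts).strip() for n, parts in collected.items()}


result = sentences(ET.fromstring(XML))
for n, sentence in result.items():
    print(n, sentence)
```

The cost of this approach is visible in the code: the sentence hierarchy is no longer available through the XML tree itself and has to be reconstructed by the application.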
Stand-Off Markup: Stand-off markup separates the text from the annotation. [Figure: the base text "Whether 'tis nobler in the mind to suffer / The slings and arrows of outrageous fortune," with the selected portion "to suffer / The slings and arrows"]
Stand-Off Markup: XPointer is just one of the possibilities for indicating the portions of the base text to be used to create the new text. Both Buzzetti and Sperberg-McQueen, when writing about markup theory, talk about having multiple views of the same text, and Buzzetti underlines the advantages of having an external markup instead of only an embedded one (D. Buzzetti, Digital Representation and the Text Model, 2002).
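A minimal sketch of the stand-off idea, assuming plain character offsets as the pointing mechanism instead of XPointer: the base text is never modified, and each annotation is an external record that points into it. The `Annotation` class and the helper functions are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass

# The base text is stored once and never changed by any annotation.
BASE_TEXT = (
    "Whether 'tis nobler in the mind to suffer "
    "The slings and arrows of outrageous fortune,"
)


@dataclass(frozen=True)
class Annotation:
    start: int  # offset of the first character of the portion
    end: int    # offset just past the last character
    label: str


def annotate(text: str, fragment: str, label: str) -> Annotation:
    """Create an external annotation for the first occurrence of fragment."""
    start = text.index(fragment)
    return Annotation(start, start + len(fragment), label)


def extract(text: str, annotation: Annotation) -> str:
    """Resolve an annotation back to the portion of base text it points at."""
    return text[annotation.start:annotation.end]


line_1 = annotate(BASE_TEXT, "Whether 'tis nobler in the mind to suffer", "verse-line")
line_2 = annotate(BASE_TEXT, "The slings and arrows of outrageous fortune,", "verse-line")
phrase = annotate(BASE_TEXT, "to suffer The slings and arrows", "phrase")

print(extract(BASE_TEXT, phrase))
```

Since the annotations live outside the text, the phrase can straddle the boundary between the two verse lines without any conflict, and further views of the same text can be added as additional annotation sets.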
JITM: Just In Time Markup, an implementation paradigm of stand-off markup developed in the late nineties at the Australian Defence Force Academy by Phil Berrie et al. JITM was based on TEI P3 (SGML) and HyTime. A technology update could be made using TEI P5 (XML) and RDF (or Topic Maps). JITM could also be integrated with other parallel projects, in particular Tummarello et al.'s RDF Text Encoding and Desmond Schmidt's data structure for textual variants.
[Semantic] Stand-off Markup: JITM's original DTD provides only for syntactic markup. What about adding, if using RDF, an OWL layer, thereby also providing semantic markup? Sounds original? It should not: see the BECHAMEL Project, Sperberg-McQueen et al. Cons of stand-off markup: complex processing (both encoding and publishing). A suitable editor would solve at least the encoding issue.
[Semantic] Stand-off Markup: [Diagram: classes Paragraph, Heading, Text and Citation, linked by Has-a and Contains relations; element labels as extracted: P HETXT CITTXT] (Renear et al., Towards a Semantics for XML Markup, 2002)
[Semantic] Stand-off Markup: The elements of a markup language are defined as subjects or objects of RDF triples, linked by predicates. These elements can thereby be organized in classes, together with the related semantic relationships. These relationships would also have several properties: contextualization, deixis, distribution, legacy, overridable.
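The idea of markup elements as subjects and objects of triples can be sketched without any RDF library, using plain Python tuples as stand-ins for RDF statements. The `ex:` names and the particular containment statements are assumptions made for this example; they are not copied from Renear et al. or from the JITM DTD, and the relationship properties listed on the slide are not modelled.

```python
# Each statement is a (subject, predicate, object) tuple.
TRIPLES = {
    ("ex:Paragraph", "rdf:type", "owl:Class"),
    ("ex:Heading", "rdf:type", "owl:Class"),
    ("ex:Citation", "rdf:type", "owl:Class"),
    ("ex:Text", "rdf:type", "owl:Class"),
    ("ex:Paragraph", "ex:contains", "ex:Citation"),  # illustrative only
    ("ex:Citation", "ex:contains", "ex:Text"),       # illustrative only
    ("ex:Heading", "ex:contains", "ex:Text"),        # illustrative only
}


def classes(triples):
    """All subjects declared as classes."""
    return sorted(s for s, p, o in triples if p == "rdf:type" and o == "owl:Class")


def reachable(triples, start, predicate):
    """Everything linked to start through one or more predicate steps."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for s, p, o in triples:
            if s == node and p == predicate and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


inside_paragraph = reachable(TRIPLES, "ex:Paragraph", "ex:contains")
print(classes(TRIPLES))
print(sorted(inside_paragraph))
```

Once the relationships are explicit statements rather than implicit nesting, questions such as "what can occur, directly or indirectly, inside a paragraph?" become simple queries over the triples.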
What's Hot: A [subjective | limited] list of recent relevant developments in electronic textual editing and related fields. - Peter Robinson's Distributed Edition - Talia, the new platform developed by the Discovery project - Desmond Schmidt's Directed Acyclic Graph for representing overlapping textual layers - Open Source Critical Editions Initiative - Interedition COST Action - OAI-ORE protocol for the exchange and reuse of digital objects