Digital Media Technology

1 Digital Media Technology
Week 3: Introduction to TEI
Peter Verhaar

2 eXtensible Markup Language
<title>La Biblioteca de Babel</title> is a short story written by <persName>Jorge Luis Borges</persName>.

3 XML: General rules which determine well-formedness (e.g. proper nesting, single root element, case sensitivity)
DTD: Rules for a particular language which determine validity
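
The well-formedness rules named on this slide can be tried out with a short Python sketch. It relies only on the standard library's `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are assumptions made for illustration and do not come from the slides. Note that this parser checks well-formedness only. Checking validity against a DTD needs a validating parser, which is not shown here.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the parser accepts xml_text as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, one per rule mentioned on the slide.
samples = {
    "properly nested": "<text><p>Gentlemen,</p></text>",
    "overlapping tags": "<text><p>Gentlemen,</text></p>",
    "two root elements": "<p>one</p><p>two</p>",
    "case mismatch": "<Title>Wanda</title>",
}

for label, xml_text in samples.items():
    print(f"{label}: {is_well_formed(xml_text)}")
```

Only the first sample is accepted; the other three each break one of the general rules.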

4 DTD or XML Schema and Document Instance
Validation rules: DTD or XML Schema
Document instance:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE tei SYSTEM "tei.dtd">
<tei>
  <text>
    <salute>Gentlemen,</salute>
    <body>I reply to your letter of the <date>29th Ulto</date>, offering 30 £ for an early copy of the novel (…)</body>
  </text>
</tei>

5 Booktrade Correspondence Project
Application of text encoding
Study of correspondence from the Dutch book trade in the 19th century
Primary materials:
Archive of De Erven F. Bohn
Archive of A.W. Sijthoff

6 De Erven F. Bohn
Founded in Haarlem, 1752 by Christoph Heinrich Bohn ( )
>1784: son François Bohn ( )
1819: public auction. Name changed to De Erven F. Bohn.
1875: no more bookselling
1876: J.K. Tadema
Today: Bohn Stafleu van Loghum

7 Archives
Ca. 20,000 books
Financial administration
Ca. 70,000 letters, often with well-known authors and publishers
Contracts, Reviews, Illustrations
Correspondence section has been digitised

8 Research questions
Social network of Bohn
Which book titles are mentioned in the correspondence?
How international was the Dutch book trade in the 19th century?
Who were Bohn’s and Sijthoff’s competitors?

9 Dear Sirs, I will accept / £10 for the / rights to make a / translation into / Dutch of my / novel entitled / Wanda //

10 Printers will / send you entire / proofs from London / instantly. Please / to send money / on receipt of this / Address Madame / Ouida. ~c. 2 words illegible~/ ~c. 1 word illegible~ Ouida / L. de la Ramée

11 Example of a transcription
Gentlemen,
I reply to your letter of the 29th Ulto, offering 30 £ for an early copy of my late father's forthcoming novel Kenelm Chillingly. I beg to inform you that I have simultaneously received from another Dutch Firm, precisely the same offer, viz. 30 £ for an early copy of that work, with a view to a Dutch translation of it (…).
Your obedt. Servt,
Lytton
Knebworth Park
Stevenage
Herts

12 Encoded text
Gentlemen,
I reply to your letter of the <date>29th Ulto</date>, offering 30 £ for an early copy of my late father's forthcoming novel <title>Kenelm Chillingly</title>. I beg to inform you that I have simultaneously received from another Dutch Firm, precisely the same offer, viz. 30 £ for an early copy of that work, with a view to a Dutch translation of it (…).
Your obedt. Servt,
<persName>Lytton</persName>
Knebworth Park
<placeName>Stevenage</placeName>
Herts

13 Ontologies
Models are based on an ontology
The properties of the original which are represented in the model
Models “inevitably lie, by omission at least”
A DTD can be viewed as an ontology
John Unsworth, 'What is Humanities Computing and What is Not?', in: Melissa Terras, Julianne Nyhan, & Edward Vanhoutte (eds.), Defining digital humanities: a reader, 2013, pp. 36–37.
R. Davis, H. Shrobe & P. Szolovits, 'What is a Knowledge Representation?', AI Magazine, 14:1 (1993).

14 Deconstruction

15 letter closer salute body p persName title

16 <?xml version="1.0" encoding="UTF-8"?>
<letter>
  <salute>Dear Sirs,</salute>
  <body>
    <p>I will accept £10 for the rights to make a translation into Dutch of my novel entitled <title>Wanda</title></p>
    <p>Printers will send you entire proofs from London instantly. Please to send money on receipt of this / Address Madame Ouida. ~c. 2 words illegible~ ~c. 1 word illegible~</p>
  </body>
  <closer>Ouida L. de la Ramée</closer>
</letter>
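
Once a letter is encoded like this, the markup can be queried by a program, which is what makes a research question such as "Which book titles are mentioned in the correspondence?" tractable. The following Python sketch uses the standard library's `xml.etree.ElementTree` on a shortened copy of the letter. The variable `LETTER` and the helper `outline` are names chosen for this illustration, not part of the project's actual tooling.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the encoded letter from the slide (second paragraph omitted).
LETTER = """<letter>
  <salute>Dear Sirs,</salute>
  <body>
    <p>I will accept £10 for the rights to make a translation into Dutch
       of my novel entitled <title>Wanda</title></p>
  </body>
  <closer>Ouida L. de la Ramée</closer>
</letter>"""

root = ET.fromstring(LETTER)


def outline(element, depth=0):
    """Return the element hierarchy as indented lines, one line per element."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


# Collect the text of every <title> element, wherever it is nested.
titles = [title.text for title in root.iter("title")]

print("\n".join(outline(root)))
print(titles)
```

The printed outline mirrors the element tree of slide 15 (letter, salute, body, p, title, closer), and `titles` contains the single title mentioned in this letter.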

17 More than 500 elements
Developed by consortium of scholars
First established in 1987
Text in general: “texts in any natural language, of any date, in any literary genre”

18 XML HTML TEI SVG EAD EPUB

19

20

21

22 ASCII
Character encoding scheme
American Standard Code for Information Interchange
e.g.: A =
Uses 7 bits (128 characters)
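
As a small illustration of the 7-bit scheme, the Python sketch below prints the ASCII code of a character in decimal and as a 7-bit binary string. The helper name `ascii_bits` is an assumption made for this example. The value that followed "A =" on the original slide is not preserved in this transcript, so the output here is simply what Python reports.

```python
def ascii_bits(char: str) -> str:
    """Return the 7-bit binary representation of a single ASCII character."""
    code = ord(char)
    if code > 127:
        # Characters beyond code 127 do not fit in the 7 bits ASCII provides.
        raise ValueError(f"{char!r} is not an ASCII character")
    return format(code, "07b")


for char in "TEI":
    print(char, ord(char), ascii_bits(char))

print("A", ord("A"), ascii_bits("A"))  # A 65 1000001
```

Characters such as "é" or "£", which occur in the letters, fall outside the 128 ASCII characters and make `ascii_bits` raise an error.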

23 Unicode
16 bits
UTF-8
1,112,064 characters
α: α
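
The figures on this slide can be checked with a few lines of Python. The sketch below shows the Unicode code point and UTF-8 byte sequence for some characters that appear in the transcriptions, and derives the number 1,112,064 as the count of code points available to UTF-8 (the full code space minus the surrogate range). The helper name `utf8_info` and the choice of sample characters are assumptions made for this illustration.

```python
def utf8_info(char: str):
    """Return (code point label, number of UTF-8 bytes, bytes as hex) for one character."""
    encoded = char.encode("utf-8")
    return f"U+{ord(char):04X}", len(encoded), encoded.hex()


for char in ["A", "é", "£", "α"]:
    print(char, *utf8_info(char))

# Code space U+0000..U+10FFFF minus the surrogates U+D800..U+DFFF,
# which cannot be encoded in UTF-8.
encodable_code_points = 0x110000 - (0xDFFF - 0xD800 + 1)
print(encodable_code_points)  # 1112064
```

ASCII characters keep their single-byte form in UTF-8, while "é", "£" and "α" each take two bytes.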

24 <p>En réponse à votre lettre du 30 Janvier nous avons <lb/> l'honneur de vous informer que nous avons payé Mon-<lb/> sieur Midderigh déjà depuis longtemps et presque toujours <lb/> d'avance.</p>

25 If the premises “A > B” and “A” are true, we can conclude B
Book & Digital Media Studies

26 Entities
<p>This sentence is in the &lt;p&gt; element.</p>
&gt; Greater than (>)
&lt; Less than (<)
&quot; Quotation mark (")
&amp; Ampersand (&)
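
The need for these entities can be demonstrated with the example sentences from the two slides above. This Python sketch uses `escape` from the standard library's `xml.sax.saxutils` to replace the reserved characters, then parses the result to confirm that the original text comes back unchanged. The helper names `to_element_text` and `parses` are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def to_element_text(raw: str) -> str:
    """Escape &, < and > (and, optionally here, double quotes) for use in XML text."""
    return escape(raw, {'"': "&quot;"})


def parses(xml_text: str) -> bool:
    """Return True if xml_text is accepted as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


raw_samples = [
    'If the premises "A > B" and "A" are true, we can conclude B',
    "Book & Digital Media Studies",
    "This sentence is in the <p> element.",
]

for raw in raw_samples:
    escaped = to_element_text(raw)
    # The parser turns the entities back into the original characters.
    round_trip = ET.fromstring(f"<p>{escaped}</p>").text
    print(escaped, round_trip == raw)

# A bare "&" or "<" in text breaks well-formedness; a bare ">" is tolerated
# by parsers but is usually escaped as well for symmetry.
print(parses("<p>Book & Digital Media Studies</p>"))
print(parses("<p>Book &amp; Digital Media Studies</p>"))
```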

27 Comments Used to improve the readability of the XML document:
<!-- The next section contains the transcription -->

28 Terminology Elements Attributes DTD Well-formed XML Valid XML Entities
Unicode ASCII Meta-language

