

The TEI: an overview

Basic concepts
• The TEI is a modular system, built like a Chicago pizza
• Modules
  • Base modules (choose one)
  • Additional modules (choose zero or more)
  • Core modules (no choice)
  • User-supplied extensions
• Each module defines specific elements and attributes
  • elements are classified semantically and structurally

For example
• TEI Lite
  • our guess at what most people want, most of the time
  • realistic for existing texts, and for new document production, e.g. TEI technical documentation
• Modules:
  • core modules
  • prose base
  • additional modules for figures, for linking, for analysis
  • core tags
  • a few omissions

Basic structure(s)
• Every TEI-conformant document comprises a header followed by (at least one) text
• The TEI header contains essential metadata for:
  • bibliographic control and identification
  • resource documentation and description
• Based on library practice, but extensible

Metadata requirements: the scope
• Identification of the object
• Documentation of its structure and organization
• Statement of rights (reproduction, ownership etc.)
• Statement of intended usage
• Documentation of interpretive scheme/s applied
• Brief characterization for search engines
• Any kind of description

The TEI header
• Based on AACR2 practice, the header contains:
  • mandatory file description
  • optional encoding, profile and revision descriptions
• Content may be free text or highly structured
• Specific extensions for
  • language corpora
  • manuscript description

For example
Thomas Paine: Common sense, a machine-readable transcript compiled by Jon K Adams
Oxford Text Archive
The complete writings of Thomas Paine, collected and edited by Phillip S. Foner (New York, Citadel Press, 1945)
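
The tags of this example did not survive in the transcript. A minimal sketch of how such a header could be marked up, assuming the P4-era element names used elsewhere in these slides; the exact tagging on the original slide is an assumption, and only the mandatory file description is shown.

```xml
<teiHeader>
  <!-- fileDesc: the one mandatory part of the header -->
  <fileDesc>
    <titleStmt>
      <title>Thomas Paine: Common sense, a machine-readable transcript</title>
      <respStmt>
        <resp>compiled by</resp>
        <name>Jon K Adams</name>
      </respStmt>
    </titleStmt>
    <publicationStmt>
      <distributor>Oxford Text Archive</distributor>
    </publicationStmt>
    <!-- sourceDesc: the printed edition the transcript was made from -->
    <sourceDesc>
      <bibl>The complete writings of Thomas Paine, collected and edited
        by Phillip S. Foner (New York, Citadel Press, 1945)</bibl>
    </sourceDesc>
  </fileDesc>
</teiHeader>
```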

Structure of a TEI text
• A text may be unitary or composite
• a unitary text contains
  • front matter
  • back matter
  • a body
• in a composite text, the body is a group of texts (or nested groups)

TEI basic structure
[Diagram: a teiCorpus.2 holds a teiHeader and one or more TEI.2 elements; each TEI.2 holds a teiHeader and a text; a text holds front, body and back; a body holds divs, or a group of further texts]
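
The nesting named in the diagram labels can be sketched as a skeleton. This is an illustration only: the comments stand in for required content (a real teiHeader needs at least a file description), and the choice of one unitary and one composite text is an assumption made to show both cases.

```xml
<teiCorpus.2>
  <teiHeader><!-- metadata for the corpus as a whole --></teiHeader>

  <!-- a unitary text: front, body, back -->
  <TEI.2>
    <teiHeader><!-- metadata for this text --></teiHeader>
    <text>
      <front><!-- title page, preface ... --></front>
      <body>
        <div><!-- chapters, sections ... --></div>
      </body>
      <back><!-- appendices, index ... --></back>
    </text>
  </TEI.2>

  <!-- a composite text: a group of texts takes the place of the body -->
  <TEI.2>
    <teiHeader><!-- metadata for this text --></teiHeader>
    <text>
      <front><!-- front matter of the collection --></front>
      <group>
        <text><body><div><!-- first member text --></div></body></text>
        <text><body><div><!-- second member text --></div></body></text>
      </group>
      <back><!-- back matter of the collection --></back>
    </text>
  </TEI.2>
</teiCorpus.2>
```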

A text usually has divisions
• generic, hierarchic subdivisions
• vanilla or numbered
• type attribute
• associated head and trailer elements from the divtop class

for example...
Book I. Of writing lives in general,...
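
A sketch of how this fragment might be encoded with numbered divisions. The id and n values are hypothetical, and the split into a book-level and a chapter-level heading is an assumption based on the text shown.

```xml
<div1 type="book" n="I" id="JA0100">
  <head>Book I.</head>
  <div2 type="chapter" n="1" id="JA0101">
    <head>Of writing lives in general,...</head>
    <!-- the paragraphs of the chapter would follow here -->
  </div2>
</div1>
```

The same structure could use unnumbered div elements nested inside one another; the type attribute carries the distinction between book and chapter either way.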

TEI global attributes
• Defined in the core module
  • id for unique identification
  • n for (non-unique) name or number
  • rend for rendition (appearance)
  • lang for language
• Defined in the linking module
  • corresp, synch, ana for specific association types
  • next, prev for aggregating fragmented elements
• Nonglobal, but pervasive
  • type for subclassification

Character Encoding Recommendations
• non-normative
• extend, using standard entity sets or transliteration
• document transliteration scheme with formal Writing System Declaration
a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
" % & ' ( ) * + , - . / : ; ? _ (space)

Text components (prose base)
• What are divisions composed of?
  • prose is mostly paragraphs (<p>)
  • verse is mostly lines (<l>), sometimes in hierarchic groups (<lg>)
  • drama is mostly speeches (<sp>) containing <p> or <l> and interspersed with stage directions (<stage>)
• These may be mixed, and may also appear directly within undivided texts.

Verse: an example
Summer grass — all that's left of warriors' dreams.
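
A sketch of the verse example as a line group. The division into three lines and the type value are assumptions, since the transcript keeps only the running text; the dash is written as a character reference.

```xml
<lg type="haiku">
  <l>Summer grass &#x2014;</l>
  <l>all that's left</l>
  <l>of warriors' dreams.</l>
</lg>
```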

Drama: an example
Enter Barnardo and Francisco, two Sentinels, at several doors
Barnardo: Who's there?
Francisco: Nay, answer me. Stand and unfold yourself.
Barnardo: Long live the king!
Francisco: Barnardo?
Barnardo: He.
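
A sketch of how the scene could be tagged with stage directions and speeches. Treating each speech as a verse line, and the type values on div and stage, are assumptions for illustration.

```xml
<div type="scene">
  <stage type="entrance">Enter Barnardo and Francisco, two Sentinels,
    at several doors</stage>
  <!-- each speech: a speaker label plus one or more lines or paragraphs -->
  <sp><speaker>Barnardo</speaker><l>Who's there?</l></sp>
  <sp><speaker>Francisco</speaker>
    <l>Nay, answer me. Stand and unfold yourself.</l></sp>
  <sp><speaker>Barnardo</speaker><l>Long live the king!</l></sp>
  <sp><speaker>Francisco</speaker><l>Barnardo?</l></sp>
  <sp><speaker>Barnardo</speaker><l>He.</l></sp>
</div>
```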

Texts are not just words...
• … but probably only people know that
• an encoding may claim to capture
  • just visual salience,
  • just its assumed causes
  • both
• encoding makes explicit one (or more) sets of interpretations

For example... And this Indenture further witnesseth that the said Walter Shandy, merchant, in consideration of the said intended marriage...

…or... And this Indenture further witnesseth that the said Walter Shandy, merchant, in consideration of the said intended marriage...
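
The two slides above showed the same sentence encoded twice, but the tags are missing from the transcript. A sketch of the contrast, first recording only appearance and then recording an interpretation; the rend values, the seg type and the choice of highlighted phrases are all assumptions.

```xml
<div type="sketch">
  <!-- (a) visual salience only: what the page looks like -->
  <p><hi rend="gothic">And this Indenture further witnesseth</hi>
    that the said <hi rend="italic">Walter Shandy</hi>, merchant,
    in consideration of the said intended marriage...</p>

  <!-- (b) assumed causes: why the words are set apart -->
  <p><seg type="formula">And this Indenture further witnesseth</seg>
    that the said <name type="person">Walter Shandy</name>, merchant,
    in consideration of the said intended marriage...</p>
</div>
```

An encoding can also combine the two, for example by adding a rend attribute to the name element in version (b).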

Who does the work?
• TEI scheme allows for close reading, and the reverse
• can tag very detailed features of discourse function
• can normalise or simplify (e.g. dates, numbers, names)
• … or leave well alone

Core phrase level elements include...
• phrases that are conventionally typographically distinct
• “data-like” (names, numbers, dates, times, addresses)
• editorial intervention (corrections, regularizations, additions, omissions...)
• cross references and links

for example...
Of writing lives in general, and particularly of Pamela, with a word by the bye of Colley Cibber and others.
It is a trite but true observation, that examples work more forcibly on the mind than precepts.…
Mr. Joseph Andrews, the hero of our ensuing history, was esteemed to be...

Direct speech
• Use the who attribute to show speakers
• Speeches can be nested in other speeches
Spaulding, he came down into the office just this day eight weeks with this very paper in his hand, and he says:— I wish to the Lord, Mr. Wilson, that I was a red-headed man.
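
A sketch of nested direct speech using the who attribute. The speaker identifiers are hypothetical, and the use of q for the speeches is an assumption; the dash is written as a character reference.

```xml
<q who="Wilson">Spaulding, he came down into the office just this day
  eight weeks with this very paper in his hand, and he says:&#x2014;
  <!-- a speech reported inside another speech -->
  <q who="Spaulding">I wish to the Lord, Mr. Wilson, that I was a
    red-headed man.</q></q>
```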

Foreign language phrases
• The lang attribute may be attached to any element
• Use <foreign> if nothing else is available
• Define each language in <langUsage> in header
Have you read Die Dreigroschenoper ?
Savoir-faire is French for know-how.
John has real savoir-faire.
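
A sketch of the three sentences with language marking. These are two separate fragments, not one document: the first belongs in the header and the second in the text. The identifiers de and fr, and the choice of title and mentioned as the carrying elements, are assumptions.

```xml
<!-- in the header: declare each language once -->
<langUsage>
  <language id="de">German</language>
  <language id="fr">French</language>
</langUsage>
```

```xml
<!-- in the text: lang goes on whichever element already fits -->
<div type="sketch">
  <p>Have you read <title lang="de">Die Dreigroschenoper</title>?</p>
  <p><mentioned lang="fr">Savoir-faire</mentioned> is French for know-how.</p>
  <!-- foreign is the fallback when no other element applies -->
  <p>John has real <foreign lang="fr">savoir-faire</foreign>.</p>
</div>
```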

Names and other referring strings
• The <rs> (referring string) element is used for any kind of name or reference
My dear Mr. Bennet, said his lady to him one day, have you heard that Netherfield Park is let at last?
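
A sketch of the sentence with referring strings marked. The type values and the use of q for the spoken parts are assumptions.

```xml
<p><q>My dear <rs type="person">Mr. Bennet</rs>,</q> said
  <rs type="person">his lady</rs> to him one day,
  <q>have you heard that <rs type="place">Netherfield Park</rs>
  is let at last?</q></p>
```

Note that "his lady" is tagged even though it is not a proper name: rs covers any string that refers to a person or place.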

Dates, times, numbers
• attributes can be used to quantify <date> and <dateRange> expressions
• similarly, times, and numbers
Today is Tuesday 29th.
One afternoon in late November..
One afternoon in <dateRange from=' ' to=' exact= ' to ' > late November..
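
The attribute values were lost from the transcript, so the sketch below uses hypothetical ones: the year, the month and the range boundaries are invented for illustration, and the attribute names (value, from, to, exact) are assumed from P4-era practice.

```xml
<div type="sketch">
  <!-- value gives a normalized form of a vague written date -->
  <p>Today is <date value="1999-06-29">Tuesday 29th</date>.</p>

  <!-- from/to bound the period; exact says which bound, if any, is precise -->
  <p>One afternoon in
    <dateRange from="1999-11-15" to="1999-11-30" exact="none">late
    November</dateRange>..</p>
</div>
```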

Correction and Regularization
• <corr> and <sic> for correction (or non-correction)
• <reg> and <orig> for normalization (or the reverse)
.. for his nose was as sharp as a pen and a’ table of green feelds..
.. for his nose was as sharp as a pen and he babbl'd of green fields
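
A sketch of the two readings encoded both ways round, assuming the P4-era convention in which the alternative reading is carried in an attribute. Which words the original slide tagged is an assumption.

```xml
<div type="sketch">
  <!-- keep the source reading in the text, record the fix in an attribute -->
  <p>.. for his nose was as sharp as a pen and
    <sic corr="he babbl'd">a' table</sic> of green
    <orig reg="fields">feelds</orig>..</p>

  <!-- print the corrected reading, keep the source in an attribute -->
  <p>.. for his nose was as sharp as a pen and
    <corr sic="a' table">he babbl'd</corr> of green
    <reg orig="feelds">fields</reg></p>
</div>
```

Either form preserves both readings, so the choice only decides which one a reader sees by default.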

Omissions, Deletions, Additions
• <gap> omission by transcriber
• <del> cancellation in source or by editor
• <add> or <supplied> insertion in source or by editor
• <unclear> material uncertain because illegible
• <damage> physical damage to text carrier

The multiple hierarchy problem
• SGML allows only one hierarchy at a time
• Is a document
  • chapter-paragraph-phrase
  • gathering-page-leaf
  • or both?
• discontinuous segments
• links and milestones

Boundary markers
• page, column, and line breaks (<pb>, <cb>, <lb>)
• generic <milestone>
Diana and Mary approved the step unreservedly. Dia na announced that...
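
A sketch of a page break falling inside a word, which is presumably why "Dia na" appears split in the transcript. The page number is hypothetical.

```xml
<p>Diana and Mary approved the step unreservedly.
  Dia<pb n="475"/>na announced that...</p>
```

Because pb is an empty element it marks a point rather than enclosing content, so the physical page hierarchy can cut across the paragraph hierarchy without breaking it.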

Some chunks are also phrases
• lists of all kinds
• notes (authorial or editorial)
• pictures or figures
• formulae
• tables
• bibliographic descriptions

Lists
• use <list> for lists of any kind (use type attribute to distinguish)
• use <label> in two-column lists as alternative to n attribute
• may be nested as necessary

for example...
For my true love:
  * three calling birds
  * two french hens
  * a partridge in a pear tree
For Uncle Joe:
  socks as usual
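
A sketch of this list as a labelled (two-column) list with a nested simple list. The type values are assumptions.

```xml
<list type="gloss">
  <label>For my true love</label>
  <item>
    <!-- a list nested inside an item -->
    <list type="bulleted">
      <item>three calling birds</item>
      <item>two french hens</item>
      <item>a partridge in a pear tree</item>
    </list>
  </item>
  <label>For Uncle Joe</label>
  <item>socks as usual</item>
</list>
```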

Figures and graphics
• The presence of a graphic is indicated by the <figure> element
• The title of the graphic is tagged as a <head>
• A description of the graphic may be supplied (as a <figDesc>) for use by software unable to render the graphic
• The graphic itself is specified as an external entity

for example...
<!ENTITY fezziPic SYSTEM "fezz.gif" NDATA GIF>
Mr Fezziwig's Ball
A Cruikshank engraving showing Mr Fezziwig leading a group of revellers.
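
A sketch that puts the entity declaration and the figure together. The NOTATION declaration and its system identifier are additions assumed here so that the fragment stands alone; in a real document the declarations would sit in the document type declaration of the whole TEI.2 file, not of a lone figure.

```xml
<!DOCTYPE figure [
  <!-- assumed notation declaration for the GIF format -->
  <!NOTATION GIF SYSTEM "image/gif">
  <!-- the image file, declared as an unparsed external entity -->
  <!ENTITY fezziPic SYSTEM "fezz.gif" NDATA GIF>
]>
<figure entity="fezziPic">
  <head>Mr Fezziwig's Ball</head>
  <figDesc>A Cruikshank engraving showing Mr Fezziwig leading a group
    of revellers.</figDesc>
</figure>
```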

Tables
• a <table> element contains <row>s of <cell>s
• spanning is indicated by rows and cols attributes
• role attribute indicates whether row or column holds data or a label
• embedded tables are permitted

for example...
A three column table
Row Row2 abc defgh
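
The cell layout of this example cannot be recovered from the transcript. The sketch below is therefore a guess at a three-column table that exercises the attributes named on the previous slide; the first row's content and the spanning cell are hypothetical.

```xml
<table rows="2" cols="3">
  <head>A three column table</head>
  <row>
    <cell role="label">Row 1</cell>
    <!-- one cell spanning the two remaining columns (hypothetical) -->
    <cell cols="2">spanning cell</cell>
  </row>
  <row>
    <cell role="label">Row 2</cell>
    <cell>abc</cell>
    <cell>defgh</cell>
  </row>
</table>
```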

Bibliography
• Use simple <bibl> with optional subcomponents:
  • <respStmt> (for any kind of responsibility) or <author>, <editor>, etc.
  • <title> with optional level attribute
  • <imprint> groups publication details
  • <biblScope> adds page references etc.
• Use <listBibl> for list of references

Bibliography: for example...
Ed Regis Great Mambo Chicken and the Trans-Human Experience London Penguin Books 1992 pp 144 ff
See for example Regis (1992)....
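
A sketch of the citation as a structured bibl, with a cross reference pointing at it. The id value REG92, the level value and the wrapping div are assumptions.

```xml
<div type="sketch">
  <listBibl>
    <bibl id="REG92">
      <author>Ed Regis</author>
      <title level="m">Great Mambo Chicken and the Trans-Human
        Experience</title>
      <!-- imprint groups the publication details -->
      <imprint>
        <pubPlace>London</pubPlace>
        <publisher>Penguin Books</publisher>
        <date>1992</date>
      </imprint>
      <biblScope>pp 144 ff</biblScope>
    </bibl>
  </listBibl>

  <!-- elsewhere in the text: refer to the entry by its identifier -->
  <p>See for example <ref target="REG92">Regis (1992)</ref>....</p>
</div>
```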

Notes
• Use <note> for notes of any kind (editorial or authorial)
• if in-line, use place attribute to specify location
• if out of line, either
  • use target attribute to specify attachment point
  • or mark attachment point as a <ptr>

for example...
The self-same moment I could pray
And from my neck so free
The albatross fell off, and sank
Like lead into the sea.
The spell begins to break.
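
A sketch of the stanza with its gloss encoded as an in-line note. Placing the note at the end of the line group and the value margin are assumptions; the transcript shows only that the gloss follows the stanza.

```xml
<lg>
  <l>The self-same moment I could pray</l>
  <l>And from my neck so free</l>
  <l>The albatross fell off, and sank</l>
  <l>Like lead into the sea.</l>
  <!-- place records where the note appears in the source -->
  <note place="margin">The spell begins to break.</note>
</lg>
```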

Out of line bibliographic notes
Blenkinsop,(p. 322) remarks …
Blenkinsop, Basil Thoughts on This and That Oxford, 1997 …

Links and pointers
Generic problems call for generic solutions
• cross-referencing
• association of text and annotation
• association of image and text or audio and transcript
• alignment of text and translation...

TEI Linking terminology
• A pointer points from here (where it is) to there (somewhere else)
• A ref does the same, but has some content
• A link points to two or more places and asserts some (linking) relation between them. Its own location is not significant
• An anchor exists only to be pointed at

Cross References
• Use <ptr> (empty element) or <ref> (with content)
• use target to specify an identifier (ID value)
See especially section 12 on page 34.
See especially....
Concerning Identifiers
But what if the target is not in the current document?
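
A sketch of the two forms of cross reference and their shared target. The identifier SEC12 and the choice of div1 for the target section are assumptions.

```xml
<div type="sketch">
  <!-- ref carries its own reference text -->
  <p>See especially <ref target="SEC12">section 12 on page 34</ref>.</p>

  <!-- ptr is empty: the reference text must be generated on output -->
  <p>See especially <ptr target="SEC12"/>.</p>

  <!-- the target: any element bearing the matching id -->
  <div1 id="SEC12">
    <head>Concerning Identifiers</head>
  </div1>
</div>
```

Both forms depend on an ID within the same document, which is what motivates the extended pointers on the next slide.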

TEI X-pointers
• TEI defined a "location ladder" style syntax later adapted by W3C as XPath
• Syntax now under review
• Basic notion: tree navigation
see especially <xptr doc='doc2' from="DESCENDANT (2 DIV1) (4 P) CHILD (1 QUOTE LANG LAT)">

... and links
• freestanding links can associate anything that has an ID, including x-pointers
• can also be grouped and typed

A three way alignment
The Study is a place where a Student, a part from men, sitteth alone, addicted to his Studies, whilst he readeth Books,
Muséum Museum est locus ubi Studiosus, secretus ab hominibus, studiis deditus, dum lectitat
<xptr n='2' id=p982 doc=com98 from='space (2d) (75 5) (133 75)'>
<xptr n='3' id=p983 doc=com98 from='space (2d) (55 42) (90 60)'>
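
A sketch of how one item might be aligned across the English text, the Latin text and a region of the picture. The seg elements and their identifiers (e982, l982), the link group and its type are assumptions; only the xptr is taken from the slide, rewritten with quoted attribute values and an empty-element close so that it is well-formed XML.

```xml
<div type="sketch">
  <p>The Study is a place where <seg id="e982">a Student</seg>, a part
    from men, sitteth alone, addicted to his Studies, whilst he readeth
    Books,</p>
  <p>Museum est locus ubi <seg id="l982">Studiosus</seg>, secretus ab
    hominibus, studiis deditus, dum lectitat</p>

  <!-- a rectangular region of the image held in document com98 -->
  <xptr n="2" id="p982" doc="com98" from="space (2d) (75 5) (133 75)"/>

  <!-- one freestanding link ties the three locations together -->
  <linkGrp type="alignment">
    <link targets="e982 l982 p982"/>
  </linkGrp>
</div>
```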

Not covered here...
• specialised front and back matter
• analytic tagging
  • segmentation
  • interpretations
• the header
• tags for documentation

Summary
• How TEI Lite handles…
  • Structural divisions
  • Rendition vs. interpretation
  • Phrases, chunks, and chunky phrases
  • Pointers and links
• Any DTD dealing with ordinary text will need a similar range