1. Intro and XML Rules Spring Summer 2010 Marcus Bingenheimer TEI Workshop.


1 1. Intro and XML Rules Spring Summer 2010 Marcus Bingenheimer TEI Workshop

2 Humanities Computing... ● ...is the attempt to model human culture with the help of IT ● Publish, research and preserve all aspects of human culture: oral & written text, images & architecture, music, performance & ritual

3 "Digitization" ● At the most basic level, this means the transformation of analog information into a digitally usable format. ● Initially, this means simply taking information from a "hard" medium, such as text, film, or music, and turning it into a computer file that can be read by software.

4 "Digitization" ● But to really benefit from the material being in digital format, we need to do more than simply convert it to bits (for example, by saving it as a PDF). ● Once we convert a document to digital format, we should attempt to make greater use of the information it contains, to create something more useful.

5 Encoding/Markup ● The way this is done with textual data is through a process called Encoding, or Markup. ● Markup is done by applying tags to textual information that encode information about that content. The name of the tagging language that is used for this work is XML, which stands for eXtensible Markup Language.

6 What is XML? ● eXtensible Markup Language ● A set of rules to generate other markup standards (such as TEI, XHTML...) ● Structural and semantic encoding of a |text|

7 XML vs. HTML ● XML and HTML are "sister languages," both developed from a parent language called SGML (Standard Generalized Markup Language) ● Thus the appearance of these two languages, and the rules governing them, are similar: both use bracketed tags to encode information.

8 XML vs. HTML ● The difference between them is that HTML is primarily used to encode style (and a bit of structure), whereas XML encodes structure and semantic content.

9 HTML: New Japanese-English Dictionary. Ed. by Koh Masuda. Tokyo: Kenkyusha, 2000

10 XML: New Japanese-English Dictionary, Koh Masuda, Tokyo, Kenkyusha, 2000
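
The tags on slides 9 and 10 did not survive in this transcript. As a rough sketch of the contrast they illustrate, the Python snippet below marks up the same dictionary entry twice. The element names (`book`, `editor`, `pubPlace` and so on) and the exact HTML tags are assumptions for illustration, not the markup from the original slides.

```python
import xml.etree.ElementTree as ET

# Presentational markup: the tags say how the text should look.
# (Hypothetical reconstruction, written as well-formed XHTML so it parses.)
html_entry = (
    "<p><b>New Japanese-English Dictionary</b><br/>"
    "Ed. by <i>Koh Masuda</i><br/>"
    "Tokyo: Kenkyusha, 2000</p>"
)

# Semantic markup: the tags say what each piece of text is.
# (Element names are illustrative assumptions.)
xml_entry = (
    "<book>"
    "<title>New Japanese-English Dictionary</title>"
    "<editor>Koh Masuda</editor>"
    "<pubPlace>Tokyo</pubPlace>"
    "<publisher>Kenkyusha</publisher>"
    "<date>2000</date>"
    "</book>"
)

html_root = ET.fromstring(html_entry)
xml_root = ET.fromstring(xml_entry)

# With semantic tags, "who is the editor?" is a direct lookup.
editor = xml_root.findtext("editor")
publisher = xml_root.findtext("publisher")

# With presentational tags, the best we can ask for is "the italic text".
italic_text = html_root.findtext("i")

print(editor, publisher, italic_text)
```

Both versions carry the same words, but only the second lets software ask for the editor or the publisher by name.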

11 Well-formed & Valid Every XML document ➨ 1. must be well-formed 2. can in principle be validated (驗證) against a document model (文件模型: DTD / Schema)

12 Well-formed (正確) means: ● that the document conforms to the XML rules. E.g.: ● One Root Element - The XML document may only have one root element. ● All start-tags have end-tags ● Each element is properly nested within the root element ("nesting"). ● Names are always case sensitive

13 Broken XML Code Mr. Garcia Hello there! How are we today?

14 Well-formed XML Code Mr. Garcia Hello there! How are we today?
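
The markup of slides 13 and 14 was lost in this transcript, so the following is only a sketch of the idea: a small Python helper that reports whether a string is well-formed XML. The helper name `is_well_formed` and the element names `speech`, `speaker` and `line` are hypothetical; only the text content comes from the slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(document: str) -> bool:
    """Return True if the parser accepts the string as well-formed XML."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Broken: <speaker> and <line> overlap, and the last <line> is never closed.
broken = (
    "<speech><speaker>Mr. Garcia<line>Hello there!</speaker></line>"
    "<line>How are we today?</speech>"
)

# Well-formed: one root, every start-tag closed, elements properly nested.
well_formed = (
    "<speech><speaker>Mr. Garcia</speaker>"
    "<line>Hello there!</line>"
    "<line>How are we today?</line></speech>"
)

# Two further violations of the rules listed on slide 12.
two_roots = "<speaker>Mr. Garcia</speaker><line>Hello there!</line>"
case_mismatch = "<line>Hello there!</Line>"

for name, doc in [
    ("broken", broken),
    ("well_formed", well_formed),
    ("two_roots", two_roots),
    ("case_mismatch", case_mismatch),
]:
    print(name, is_well_formed(doc))
```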

15 Valid (可驗證) means: ● that the document conforms to the vocabulary and syntax of a markup standard (e.g. TEI, XHTML, Music ML, MathML) expressed in a document schema (written in DTD, W3C XML Schema, or RELAX NG). ● When working with the default template in oXygen, your schema is in your oXygen folder under frameworks/tei/xml/tei/custom/schema/relaxng/tei_all.rng.

16 Parsing: ● Parser step 1 checks the XML document (XML 文件): well-formed / not well-formed ● Parser step 2 checks it against the document model (文件模型: DTD / Schema, e.g. TEI): valid / not valid
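
The two parser steps can be sketched in Python as follows. Step 1 uses the real standard-library parser. Step 2 is only a toy stand-in: it checks element names against a hypothetical list `ALLOWED_ELEMENTS`, whereas real validation checks a document against a DTD or schema and needs a validating tool (the Python standard library does not include one).

```python
import xml.etree.ElementTree as ET

# Hypothetical, drastically simplified "document model": a set of allowed names.
ALLOWED_ELEMENTS = {"speech", "speaker", "line"}


def check(document: str) -> str:
    # Step 1: is the document well-formed XML at all?
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        return "not well-formed"

    # Step 2 (toy stand-in for validation): does it only use known elements?
    for element in root.iter():
        if element.tag not in ALLOWED_ELEMENTS:
            return "well-formed, not valid"
    return "well-formed, valid"


print(check("<speech><line>Hello there!</line></speech>"))
print(check("<speech><song>Hello there!</song></speech>"))
print(check("<speech><line>Hello there!</speech>"))
```

Note the order: a document that fails step 1 never reaches step 2, which matches the slide's point that well-formedness is mandatory while validation is a further, optional check.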

17 XML rules 1 (Declaration) ● An XML document should begin with an XML declaration ● the declaration (宣告) has the form: <?xml version="1.0" encoding="UTF-8" standalone="yes"?> (encoding and standalone are optional)
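
A small Python sketch of how a parser treats the declaration, assuming the standard-library `xml.etree.ElementTree` parser. The `text` element and the helper `parses` are made up for the example.

```python
import xml.etree.ElementTree as ET

# The same tiny document with and without an XML declaration.
with_declaration = (
    b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
    b"<text>Hello</text>"
)
without_declaration = b"<text>Hello</text>"

root_a = ET.fromstring(with_declaration)
root_b = ET.fromstring(without_declaration)

# In XML 1.0 the declaration is recommended but optional:
# both versions parse to the same content.
same_content = (root_a.tag, root_a.text) == (root_b.tag, root_b.text)


def parses(data: bytes) -> bool:
    """Return True if the parser accepts the bytes as well-formed XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


# If a declaration is present it must be the very first thing in the file;
# even a single leading space makes the document not well-formed.
leading_space_ok = parses(b" " + with_declaration)

print(same_content, leading_space_ok)
```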

18 XML rules 2 (root element 根元素) ● An XML document has one, and only one, root element, which contains all other elements and the character data

19 XML rules 3 (end-tags) ● Every start-tag (起始标签) must have a matching end-tag (结束标签) ● Exception: empty elements (空白的/无内容的元素)
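
As a sketch of the empty-element exception: an element without content may be written either as a start-tag followed immediately by an end-tag, or as a single self-closing tag. The example uses `lb` (the TEI line-break element) inside a `p`; the sample text is invented.

```python
import xml.etree.ElementTree as ET

# Two spellings of the same empty element.
with_end_tag = ET.fromstring("<p>first line<lb></lb>second line</p>")
self_closing = ET.fromstring("<p>first line<lb/>second line</p>")

# After parsing, the two documents are indistinguishable.
same = ET.tostring(with_end_tag) == ET.tostring(self_closing)

# The empty element has no text and no children.
lb = self_closing.find("lb")
is_empty = lb.text is None and len(lb) == 0

# Leaving out both the end-tag and the slash is still an error.
try:
    ET.fromstring("<p>first line<lb>second line</p>")
    unclosed_ok = True
except ET.ParseError:
    unclosed_ok = False

print(same, is_empty, unclosed_ok)
```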

20 XML rules 4 (nesting) ● Elements must be properly nested; overlapping (crossed) nesting is not allowed. Like this: <a><b>...</b></a> not like this: <a><b>...</a></b>
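
The nesting rule can be tried out with a parser. In this sketch the element names `p`, `hi` and `name` and the sample text are illustrative assumptions; the parser is Python's standard-library `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Properly nested: <name> opens and closes inside <hi>.
nested = "<p><hi>published by <name>Kenkyusha</name></hi></p>"

# Overlapping: <hi> is closed while <name> is still open.
overlapping = "<p><hi>published by <name>Kenkyusha</hi></name></p>"

root = ET.fromstring(nested)
inner_name = root.find("hi/name").text

error = None
try:
    ET.fromstring(overlapping)
except ET.ParseError as exc:
    # The parser reports what went wrong and where (line, column).
    error = exc
    print("not well-formed:", exc, "at", exc.position)

print(inner_name)
```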

21 XML rules 5 (XML names) ● XML is case-sensitive ● Element names must start with a letter (including CJK 漢字) or the underscore “_” ● Names may contain only alphanumeric characters (letters and digits) and “_” “-” “.” ● the colon “:” is reserved for namespaces (命名空間)
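
The naming rules as stated on this slide can be approximated with a regular expression. This is a sketch of the slide's simplified rule, not the full name grammar of the XML specification, and the function name `is_simple_xml_name` is made up for the example.

```python
import re

# First character: a letter or "_" (any word character that is not a digit).
# Remaining characters: letters, digits, "_", "-" or ".".
# The colon is deliberately excluded because it is reserved for namespaces.
SIMPLE_NAME = re.compile(r"[^\W\d][\w.\-]*")


def is_simple_xml_name(name: str) -> bool:
    """Check a name against the simplified rules from the slide."""
    return SIMPLE_NAME.fullmatch(name) is not None


for candidate in ["title", "_note", "pub-place.v2", "漢字", "1stEdition", "tei:title", "my name"]:
    print(candidate, is_simple_xml_name(candidate))

# Case sensitivity: these are two different names.
print("Title" == "title")
```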

22 Workflow diagram: ● ENCODING (標記): <XML Document>, governed by a Document Model (文件模型): DTD, RELAX NG, XML Schema ● TRANSFORM (轉換): XSLT, XPath, XQuery, XSL-FO, CSS, JScript ● OUTPUT (輸出): HTML, PDF, ePub, any XML

23 Practice ● Write a “firstdocument.xml” ● Open it in Firefox ● Attach a stylesheet declaration like: <?xml-stylesheet type="text/css" href="styletest.css"?> ● Make “styletest.css” (an empty file is enough) ● Try it in Firefox again
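
The exercise is meant to be done by hand in an editor, but the two files it produces can be sketched with a short Python script. The file names come from the slide; the `greeting` element, its text, and the use of a temporary directory are assumptions made for the example.

```python
import tempfile
from pathlib import Path
import xml.etree.ElementTree as ET

# A minimal first document: declaration, stylesheet link, one root element.
# The root element and its text are invented for illustration.
XML_TEXT = """<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/css" href="styletest.css"?>
<greeting>Hello, XML!</greeting>
"""

# Write both files into a fresh temporary directory.
workdir = Path(tempfile.mkdtemp())
xml_path = workdir / "firstdocument.xml"
css_path = workdir / "styletest.css"

xml_path.write_text(XML_TEXT, encoding="utf-8")
css_path.write_text("", encoding="utf-8")  # an empty stylesheet is enough

# Confirm the document is well-formed before opening it in a browser.
root = ET.parse(str(xml_path)).getroot()
print("wrote", xml_path, "with root element", root.tag)
```

Opening the resulting `firstdocument.xml` in a browser, first without and then with the stylesheet line, shows how the browser switches from displaying the element tree to rendering the text.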

24 Marcus Bingenheimer, Charles Muller 2005-2010

