Typography for Automatic Markup Liam Quin Barefoot Computing, 1999.


First Things First Your Instructor: Liam Quin
Your Coffee: get it while it’s hot. Your Laptop: if you don’t have one, don’t worry. It is optional. So are socks. Stop and ask Questions. Tell me when you are confused. If you’re in the wrong room, why not say?

1: Introducing Graphic Design
Overview 1. Page Layout 2. Text and Fonts 3. Text and Paragraphs 4. Discussion and Exercises 5. Break

1.1: Page Layout Ratios & Proportions, Motion & Rest
The Golden Section, 3:4, 1:2. White space adds rest. Use space to draw attention. Use a paper size suited to the task. Consider printers, binders, consumers. Paper sizes used (US Letter, ISO A4 etc.). 1:2:3:4 ratios for margins.

1.1 Page Layout: The Grid Why use a grid?
Fewer decisions: simplifies page design. Helps increase consistency. Pages treated differently stand out. Precise control over ratios. Reduce paper backing problems. Protect and perpetuate the design.

Page Layout: The Grid (2) Designing a Grid
Leave room for margins, binding, gutters. Position illustrations vertically & horizontally. Choose sizes for main design elements. Make sure text areas are exact numbers of lines. Consider paper show-through (back-up). Use pleasing proportions (Tschichold).

Page Layout: The Title Page
Use clumps and not clutter. If you don’t centre it, make your design obviously not centred. Check for legal wording requirements. The optical centre of the page is above the actual centre; this applies to all aspects of page design.

Avoid Clutter Rows of dots like dripping blood litter the page. Leave things out to make it plainer: people won’t read a long contents page. Use per-chapter tables if necessary. Put page numbers on the left or right-align the chapter titles so they are next to the numbers. Don’t try to show too much information!

Page Layout: Generated Pages
Some systems generate a table of contents at the end of the output, and you have to move it; problem for PDF, but much faster. Make sure your designs allow for extra long titles; “A Rose by Any Other Name Would Smell as Sweet; But What if You Called both a Rose and a Pansy by the Same Name?” Writers will gladly supply multiple versions of titles or other text.

Page Layout: Looking Inside
The human eye detects lumps: Put related things near to one another; Put unrelated things far apart. Robin Williams calls this the Rule of Proximity.

Page Layout: Looking Inside
Between the Lumps Are Spaces… White space is difficult in XML and SGML. Multiple blanks must be coalesced; use space before/after elements (or flow objects) to control gaps. Remember to keep vertical space a multiple of half the vertical line spacing (e.g. 3½ times). If in doubt, use more white space.
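
The half-line rule above can be sketched in a few lines of Python. This is only an illustration: the function name and the choice to round to the nearest step (and never go below one half-line) are assumptions, not something the slides prescribe.

```python
def snap_vertical_space(wanted_pt, line_spacing_pt):
    """Round a gap to a whole number of half line spacings (at least one)."""
    half = line_spacing_pt / 2
    steps = max(1, round(wanted_pt / half))
    return steps * half


if __name__ == "__main__":
    # With 12pt line spacing, a requested 20pt gap becomes 3 half-lines.
    print(snap_vertical_space(20, 12))
```

With 12pt line spacing, asking for 41pt gives 42pt, which is the 3½ line spacings mentioned on the slide.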

Page Layout: Alignment
The eye detects misalignment of 0·002 inches; something not quite aligned looks wrong. The more things that line up, the stronger the design. Western eyes move from left to right, top to bottom; use alignment to lead the reader in the desired order.

Page Layout: Summary Line things up (alignment) with a Grid
Keep unaligned things well apart. Group related things (titles and text). Treat white space with care. Always strive to attain pleasing proportions.

First Brief Intermission
More coffee. Don’t spill the coffee on the examples. Competition: identify this typeface. Questions. What shall we talk about next?

Text and Type Outline: A Brief History of Type
Typographic Colour and Type Styles. L e t t e r s p a c i n g and legibility. Type Families. Special Characters. Putting on the Right Face.

A Brief History of Type Gutenberg: Moveable type that didn’t move
Bits of metal with strips of lead between the lines (leading). Pointy feet (serifs) made by pen strokes. A typeface is a design; a font is the implementation.

Type in Historical Context (1)
Venetian (Centaur; Adobe Jenson). Old Style (Bembo; Caslon). Transitional (Baskerville). Modern (Bodoni; Century).

Type in Historical Context (2)
Grotesque Sans Serif (Franklin Gothic; Helvetica). The word “Gothic” was used to mean “ugly” in North America around 100 years ago! Geometric Sans Serif (Futura). Humanist Sans Serif (Gill Sans). Flare Serifs (Aachen). Block Serifs (Egyptienne) (Rockwell, Stymie).

Typographic Colour Colour: how dark or light a block of text appears
The colour of a printed page is determined by the ink, paper, printing process, typeface and how it’s used. Some fonts are darker than others. [M.E. 32] Some fonts are darker than others. [Times 32] Balance the colour of the page against the amount of text and against graphics.

Text & Contrast Contrast: Light/Dark, Round/Square, Roman/Italic/Bold
The human eye recognises contrast quickly. White space contrasts with areas of type. Don’t overdo it: vary one element at a time. Typographers don’t underline, rarely use bold, and use bold italics only for light-on-dark printing.

Type Families Relatively modern invention
Companion Roman, italic and bold. Sometimes a Bold Italic too. May be an Expert Set or a small caps font, with extra characters such as small capitals, ligatures and fractions (¼). Some systems use the font names to associate fonts in a family; most use an auxiliary file. Most systems can’t kern or hyphenate across a font change.

Special Characters Ligatures are joined letters: ffi vs. ﬃ
Small Caps have different proportions: EXTRAORDINARY vs. Extraordinary. Some families have smaller caps: Extraordinary! Ranging (lining) versus non-lining (old style) numerals. Most figures are fixed width for stupid table formatters. En dash, em dash, Pilcrow ¶ and Section §.

Paragraphs: Justification (1)
Fully justified text has uneven word spacing so as to make the lines all the same length. Very corporate! Right-aligned text is useful in a table of contents and for a few other special-purpose things. Left-justified text has even word spacing, but the lines are not all the same length. This can look distracting, but it’s easier to read. Watch for backup problems.

Paragraphs: Justification (2)
Some Finer Points of Spacing. The last line of a fully justified paragraph can be short. Any line that’s almost the full length of the measure should be made the full length if possible (the alignment zone). Hang punctuation in the right margin for a more even effect.

Paragraphs: Line Spacing (1)
Most fonts need a thin strip of lead between each line to space the letters out better. A good rule of thumb is to add 10% of the font size, measured in points (72 points/inch). Fonts with a smaller x-height need more spacing. You can pack Times closer than Caslon: it was designed that way.
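
As a rough worked version of the 10% rule of thumb, here is a small Python sketch. The function names are made up for illustration, and the 10% default is only the starting point the slide suggests; a particular face may want more or less.

```python
def suggested_line_spacing(font_size_pt, extra=0.10):
    """Font size plus a fraction of it as lead (10% by default)."""
    return font_size_pt * (1 + extra)


def inches(points):
    """Convert points to inches at 72 points per inch."""
    return points / 72


if __name__ == "__main__":
    for size in (10, 12, 24):
        print(size, "pt type on about", round(suggested_line_spacing(size), 1), "pt")
```

For 24pt type this suggests roughly 26.4pt from baseline to baseline, which you would then adjust by eye for the face in use.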

Paragraphs: Line Spacing (2)
What greater joy could there be in life than to wade barefoot through banana custard whilst discoursing upon the nature of the lower crustacean? (This is 24/22pt.) What greater joy could there be in life than to wade barefoot through banana custard whilst discoursing upon the nature of the lower crustacean? (This is 24/29pt.) What greater joy could there be in life than to wade barefoot through banana custard whilst discoursing upon the nature of the lower crustacean? (Times 24/25pt.)

INTERVAL & SECOND Morning Break Questions, Ruminations and Answers
Installing a font family and using it. Exploring the character set. How would you use ligatures in your software? Questions, Ruminations and Answers.

Part 2: Automatic Formatting
Document formatting and XML/SGML DTDs. Presentation imparts meaning, and meaning is guided by markup. Limitations in software may necessitate special markup. Most software seems to need containers around groups of elements to be treated alike.

Changing Markup You can edit documents, use a Perl script just once, or maybe XSL or Omnimark on the fly. If you edit the instances, you’ll need to change the DTD; it helps to keep the old one around! You can use a conversion DTD in which both old and new forms are valid, but then throw it away!

Elements and Styles Most SGML and XML formatting software works by taking a list of styles and applying them to elements. Sometimes you can apply formatting to entities too. To do small caps, you may want a script that surrounds ordinary capitals with markup: perl -p -e 's{\b([A-Z][A-Z0-9][A-Z0-9]+)}{<sc>\L$1\E</sc>}g;' (use plain single quotes around the script, not typographic ones)
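
For readers who don’t speak Perl, the same idea can be sketched in Python. It mirrors the one-liner’s pattern (runs of three or more capitals or digits starting with two capitals-or-digits after a capital) and lowercases the match inside an sc element; the function name is invented here.

```python
import re

# A capital, then a capital or digit, then one or more capitals or digits.
CAPS_RUN = re.compile(r"\b([A-Z][A-Z0-9][A-Z0-9]+)")


def mark_small_caps(text):
    """Wrap runs of capitals in <sc>…</sc>, lowercased for a small caps font."""
    return CAPS_RUN.sub(lambda m: "<sc>" + m.group(1).lower() + "</sc>", text)


if __name__ == "__main__":
    print(mark_small_caps("The NASA and XML specs, by IBM; I am OK"))
```

Two-letter words such as OK are left alone, exactly as with the Perl version, because the pattern needs at least three characters.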

Siblings, Groups and Boxes
If you want every fifth PanelNumber element to start a new row in a table, you’ll probably need to put row elements in there. If a sequence of elements are grouped together, it’s probably because they share a common meaning or function, so give them a container (list, chapter, partlist, …).
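
Adding the containers is usually a small preprocessing job. Here is a hedged Python sketch using the standard xml.etree module; the Row element name, the function name and the group size of five are assumptions for illustration, since the slide does not say what the containers should be called.

```python
import xml.etree.ElementTree as ET


def wrap_in_rows(parent, child_tag="PanelNumber", row_tag="Row", per_row=5):
    """Return a new element like parent, with child_tag children grouped into rows."""
    new_parent = ET.Element(parent.tag, dict(parent.attrib))
    row = None
    for child in parent:
        if child.tag != child_tag:
            new_parent.append(child)
            row = None  # any other sibling ends the current row
            continue
        if row is None or len(row) == per_row:
            row = ET.SubElement(new_parent, row_tag)
        row.append(child)
    return new_parent


if __name__ == "__main__":
    panel = ET.Element("Panel")
    for n in range(1, 13):
        ET.SubElement(panel, "PanelNumber").text = str(n)
    print(ET.tostring(wrap_in_rows(panel), encoding="unicode"))
```

Twelve PanelNumber elements come out as rows of five, five and two, which most table-capable formatters can then style directly.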

Attributes vs. Content Formatters often won’t display attribute values in any useful way. If you can display it at all, you might not be able to put it in a different font. Attributes can contain entity references (&) but not elements. You can usually choose a different style based on an attribute value. Really posh software can use an attribute value in an expression, e.g. size = $att(height) * 0.8

%RunningText; and %Lumps;
Most formatters are happiest with block elements. Real life has inline elements too. <!ENTITY % RunningText "(#PCDATA | partNumber | shout | warn)*"> <!ENTITY % Lumps "(Paragraph | List | Table | Picture)*">

Consider: <Entry pos="noun"><title>boy</title> <p>a male child.</p> </Entry> Producing boy (n), a male child.

<List><Item> <p>Artichokes</p> <p>Maybe also pears.</p></Item> <Item> <p>Five pairs of pyjamas (silk)</p></Item> </List>

Solution: Make <p> style say inline start, break at end. If you have lots of block elements, you may have to do that to all of them. Not all formatters can do that. Sometimes you have to move content: <Entry><p><title>boy</title> a male child … and use built-in list numbering rubbish.

Style Attributes Style attributes are like <td align="left"> or
<display face="bold"> or have style content: <font size="7pt" face="Helvetica">. Not all formatters support both of these. The second type is much harder. Not all formatters can do arithmetic either, such as Indent = point size * 3.

Tables A table is a way of presenting information.
Aside about how cool Ed Tufte’s books are. Use markup that lets you transpose tables. Use minimal clutter: you probably don’t need all the lines and boxes, and they take attention away from the information. Don’t spread tables out more than you need to. Avoid little rows of dots in tables!

Table Markup (1) Four kinds of tables
CALS tables, in particular the SGML Open model; SoftQuad tables; HTML tables; Mrs Eaves’ Own Home-grown Organic Tables; Content tables. Most formatters can do at least two of these.

Table Markup (2) Running Heads and Feet:
Running table headers repeat on every page. You may need a “table continues” marker; e.g. set a variable to “continued…” at the start of a table and clear it at the end. The CALS and SQ models put table heads and feet in separate tables; automatic calculation breaks. Remember Proximity: put related things nearer together than unrelated things.

Prescriptive or Descriptive?
Prescriptive markup: controls what is allowed where, usually tightly; requires editorial authority. Descriptive markup: describes an existing document or text; can’t disallow things that actually occur; often much harder to format; same problems with buggy DTDs!

Before The Rubber Chicken
Examples: converting to troff with Perl; conversion to HTML with Perl; conversion to HTML with XSL; talking about expensive posh high-end stuff. Questions, answers, cogitation and agitation. Rubber Duck, or, Dinner.

While They Are Gone Typography for Automatic Markup
This room is in use for an all-day tutorial: Typography for Automatic Markup, Liam Quin, Barefoot Computing, 1999

Welcome Back Typography for Automatic Markup
We shall shortly be resuming: Typography for Automatic Markup Liam Quin Barefoot Computing, 1999

GOOD AFTERNOON Part Three: Print Technologies
PostScript, HPGL, TrueType and QuickDraw. Fonts, Encoding and Unicode. Fonts on the World Wide Web. Images. Part Four: Putting it all together. Quick introduction to DSSSL, XSL and CSS. Detailed look at Something. Managing the Files.

Part 3: Print Technologies
Printer Hardware: all except CAD plotters are raster devices; they print a giant array of very tiny dots; some of the dots are bigger than others; some of the dots are black and some are pink; the thinnest line possible is one dot thick; can make offset lithography plates directly; need to consider ink traps and blots.

Software inside the Printer
Typesetters usually receive a huge bitmap image from a Raster Image Processor (RIP). Most RIPs run on Unix, very high-end Macs or special hardware (e.g. an Alpha under OS9). Most laser printers receive bitmaps, HPGL or PostScript programs. You can download font programs to printers and RIPs. We will see how to do this later.

A Brief look at PostScript
The output of a PostScript program is usually a printed page. Comments start with % followed by a space. Significant Comments start with %! or %%. Level One PS is in ASCII. Level Two can contain binary compressed data. Embedded fonts can be binary in either case.

%!PS-Adobe-2.0
%%Title: Liam's Left Foot
%%Creator: Liam Quin
%%Pages: (atend)
/Inch { 72 mul } def % define a procedure we use later
%%EndProlog
%%Page: i 1
/Palatino-Italic findfont 22 scalefont setfont
1 Inch 9.5 Inch moveto
(Liam's Left Foot) show
showpage
%%Page: 104 2
1 Inch 1 Inch moveto
0 4 Inch rlineto
2 Inch 0 rlineto
0 -4 Inch rlineto
closepath fill
showpage
%%Trailer
%%Pages: 2

Weaknesses of PostScript
No access to font metrics. No automatic kerning (but programs that generate PostScript can do the kerning themselves). Verbose (but compresses well). Multiple Levels and the three versions of Encapsulated PostScript are confusing. A page is usually as frozen as an image. Awfully hard to debug.

Strengths of PostScript
Plain text: easy to generate from a program, and you can read it to see if it’s right. Device independent: always 72 dpi even on an 8 000 dpi Berthold Typesetter! Portable: a Level 1 PostScript file can print just about anywhere, with no special driver needed. Amenable to sed and perl. Most high-quality commercial fonts are for PostScript.

Other Formats The other print formats usually require special drivers, and the print files can’t be copied. The most common is HPGL, from HP. But it isn’t as general, so you keep running into things you can’t do. Xerox have their own language, Interpress; deficiencies in that led to PostScript.

Font Formats 1: Bitmap Bitmap Fonts (very rare today for printing)
HP cartridges and soft fonts. PostScript Type 3 fonts can include bitmaps. Xerox use them too; so can TeX. Can have a different design for every size (compare Big Caslon, Mrs Eaves and Adobe Caslon). You can’t scale them in any useful way.

Fonts: Outline Formats
PostScript Type 1: originally a secret format to protect royalties; but all the high-end printers use PS Type 1. TrueType: developed by Apple and Microsoft; first successful internationalised font format; fonts are often automatic conversions from Type 1; Microsoft Windows has strong support for TT.

Other Outline Formats QuickDraw GX (Macintosh Only) PostScript Type 3
Very few commercial GX fonts exist. Too complex, and only works on the Mac. PostScript Type 3: unhinted (Unhinged, says the spell checker!); you can make them yourself; mostly for logos, bitmaps, or unusual effects.

Fonts: Encodings Input character sequence must be turned to a sequence of shapes, called glyphs. Formatter accesses a glyph implementation in a font using a number. The number-to-glyph mapping is the font’s encoding vector. You can change a Type 1 font’s encoding vector in PostScript.
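
To make the number-to-glyph idea concrete, here is a toy Python model of an encoding vector as a dictionary from character codes to glyph names. The four-entry vector and the function names are invented for illustration; a real vector has 256 slots, and in practice you would re-encode the font in PostScript itself.

```python
def make_encoding(base, overrides):
    """Copy an encoding vector (code -> glyph name) with some slots replaced."""
    vector = dict(base)
    vector.update(overrides)
    return vector


def glyphs_for(codes, vector, missing=".notdef"):
    """Look up the glyph name for each character code."""
    return [vector.get(code, missing) for code in codes]


# A tiny, made-up slice of an encoding vector.
BASE = {65: "A", 66: "B", 97: "a", 98: "b"}
# Same codes, but lower-case slots now point at small-cap glyphs.
SMALL_CAPS = make_encoding(BASE, {97: "Asmall", 98: "Bsmall"})

if __name__ == "__main__":
    print(glyphs_for(b"Bab", BASE))
    print(glyphs_for(b"Bab", SMALL_CAPS))
```

The input bytes never change; only the vector does, which is why re-encoding a font is enough to get small caps or a filled-in Latin 1 out of the same text.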

Text Encoding Xml uses Unicode for the document character set.
Most input documents are encoded in ISO 8859-1 (Latin 1) or UTF-8; this is not a font encoding. Unicode doesn’t specify how characters are mapped to glyphs; that’s up to the formatter. Text encoding, document character set, font encoding are all joysomely different.

PostScript Font Encodings
Some standard encodings: Latin 1 (ISO 8859-1); AdobeStandardEncoding (see the Red Book): ASCII, accents, ligatures, and some publishing symbols; Adobe Expert Set (contains small caps, ligatures and some fractions); Small Caps (Barefoot → BAREFOOT, with small capitals replacing the lower case).

Custom Font Encodings Reverse small caps (Barefoot → bAREFOOT)
Font subsets, e.g. in PDF (abcde → Wherb). Logo fonts. Filling the gaps in Latin 1 with extra symbols from Adobe’s Standard Encoding.

Adobe Font Metrics The AFM is ASCII; Windows uses a binary .pfm file.
Bounding box of each glyph, plus its default encoding position. List of kern pairs. Tracking information for spacing characters out further at small sizes. Information about accents.

Using font metrics Font bounding box - no character draws outside this
Character bounding box - the glyph is entirely within the box. Character advance - x and y amounts to move by after drawing the character. Kerning - amount to adjust character advance by depending on the previous character. You can’t kern across fonts with this scheme; TeX can do that with Virtual Fonts.
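
Here is a small Python sketch of how a formatter might combine advances and kern pairs to measure a string. The numbers are hypothetical, not taken from any real AFM file, and the sketch assumes metrics in thousandths of the font size and horizontal advances only.

```python
# Hypothetical metrics in 1/1000 of the font size.
ADVANCE = {"A": 722, "V": 722, "o": 500}
KERN = {("A", "V"): -129, ("V", "A"): -129}


def string_width_units(text, advance=ADVANCE, kern=KERN):
    """Sum the advances, adjusting each by the kern for the preceding character."""
    total = 0
    previous = None
    for ch in text:
        if previous is not None:
            total += kern.get((previous, ch), 0)
        total += advance[ch]
        previous = ch
    return total


def string_width_points(text, font_size_pt):
    """Scale the width in metric units to points for a given font size."""
    return string_width_units(text) * font_size_pt / 1000


if __name__ == "__main__":
    print(string_width_units("AVA"), string_width_points("AVA", 10))
```

Because the kern lookup is keyed on a pair of characters in one font’s table, there is nothing to look up when the two characters come from different fonts, which is the limitation noted above.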

Read %%PageFonts and %%DocumentFonts comments to work out which fonts are used. Look for %%IncludeFont lines, or include all fonts just before %%EndProlog, or just before the first %%Page if that’s not found. Use t1tools or pfb2ps on Unix to convert binary PFB fonts to ASCII. Note that PFB fonts are not compressed!
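
The first step, finding out which fonts a file wants, can be sketched in Python as below. This is an illustration, not the addpsfonts script itself: it assumes the comments are spelled %%DocumentFonts:, %%PageFonts: and %%IncludeFont:, ignores continuation lines, and skips the (atend) placeholder.

```python
def fonts_named_in_dsc(postscript_text):
    """Collect font names from the structuring comments, in order, without repeats."""
    keys = ("%%DocumentFonts:", "%%PageFonts:", "%%IncludeFont:")
    found = []
    for line in postscript_text.splitlines():
        for key in keys:
            if line.startswith(key):
                for name in line[len(key):].split():
                    if name != "(atend)" and name not in found:
                        found.append(name)
    return found


if __name__ == "__main__":
    sample = (
        "%!PS-Adobe-2.0\n"
        "%%DocumentFonts: Palatino-Italic Times-Roman\n"
        "%%Page: 1 1\n"
        "%%PageFonts: Times-Roman\n"
        "%%IncludeFont: Helvetica\n"
    )
    print(fonts_named_in_dsc(sample))
```

A real tool would then locate each font file and splice it in before %%EndProlog or the first %%Page, as described above.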

Fonts on the Web You can embed fonts in web pages:
Microsoft WEFT (free) works only for IE4 & IE5. Bitstream TrueDoc works only for Netscape. It costs approx. $200 for the encoder, so no-one uses it; decoder is included with Netscape. Bitmap images have no royalty problems.

Designing Images Avoid clutter
Use a line thickness that will show up on your printer, but that is not too much heavier than the underline in your text font. Consider the overall typographic colour of the page, whether it is dark or light. Convert colour images to black and white before sending to a black & white printer.

Vector/Outline Image Formats
1 inch 5 inch rlineto: commands, not pictures. Can scale to any resolution. Encapsulated PostScript (EPS, EPSI), CGM, WMF most common. There can be problems with fonts used by images but not included in the image file. Best for engineering drawings, charts, or things you generated in a program.

Bitmap (Raster) Images
From 1 bit to 32 or more bits per pixel. CompuServe GIF: up to 8 bpp (256 colours); best for line drawings. JPEG: up to 24-bit but lossy compression; best for photographs. Portable Network Graphics (PNG): new; lossless compression. Many, many others!

Generating Images Fairly easy to generate PostScript from programs you write. Perl libraries make GIF (GD) and PostScript (PostScript) images. Tcl/Tk can generate PostScript too. Adobe’s FTP site had C code to read an AFM file. netpbm, ImageMagick, giftrans, gimp, xv are all useful on Unix.
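
To show how little is needed, here is a Python sketch that writes a one-page PostScript program filling a rectangle, modelled on the second page of the earlier example. The function name and parameters are made up, and it is deliberately minimal (no bounding box, no fonts).

```python
def rectangle_page(x_in, y_in, width_in, height_in):
    """Return a one-page PostScript program that fills a rectangle (sizes in inches)."""
    lines = [
        "%!PS-Adobe-2.0",
        "%%Pages: 1",
        "/Inch { 72 mul } def",
        "%%EndProlog",
        "%%Page: 1 1",
        f"{x_in} Inch {y_in} Inch moveto",
        f"0 {height_in} Inch rlineto",
        f"{width_in} Inch 0 rlineto",
        f"0 {-height_in} Inch rlineto",
        "closepath fill",
        "showpage",
        "%%Trailer",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(rectangle_page(1, 1, 2, 4), end="")
```

Because the output is plain text, you can read it to check it, or send it straight to a printer or previewer.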

New standard, PGML, may be supported soon. If you write scripts to generate images, make sure you get the MIME Content-Type right! Beware of copyright issues!

Brief walk for fresh air
For those who remain… Making a five-line drop cap Other Questions

Part 4: Putting it all Together
Formatting Languages: DSSSL, XSL, CSS. Other formatters. Option: Generated Text. Managing Files With Revision Control Tools. Relating DTD and instance versions. Further Reading and Learning More. Questions, Free-for-All. Recovery Time.

Formatting Languages Document Style and Semantics Specification Language (DSSSL): SGML and XML based; uses Lisp syntax, so very hard to read; no good books on it; moderately powerful; too difficult for most people to use; few implementations.

XML Style Language (XSL)
Comes from the World Wide Web Consortium. Uses XML syntax so you can edit it easily. Easy things are fairly easy with templates. Declarative, so hard things require Particular and Careful Thinking. Not good enough for fine paper typography. Two parts: transformation and style; implementations can choose to do one or both, reducing interoperability.

Other sgml/xml formatters
Commercial formatters: 3B2; Miles 33 Genera; Datalogics Pager; Xyvision Parlance Publisher; ArborText Adept/Epic (uses TeX); Exosoft. See the trade show for others!

Other Formatters (2) Free Solutions
TeX is free and works on Unix and Macintosh; maybe Windows too? There are free versions of troff for Unix, including James Clark’s groff. James Tauber is working on FOP, with the goal of doing reasonable quality layout; it’s in Java. David Megginson’s nsgmls.pm can be used to write an SGML to troff (say) converter.

Batch Formatting Example
1. Convert SGML input to ESIS with nsgmls. 2. Convert ESIS to troff with nsgmls.pm and a custom Perl script (about 500 lines). 3. Run troff with macro package (1500+ lines) to produce ASCII context file. 4. Take troff output and add ligatures, hung punctuation and spacing corrections in awk. 5. Generate 80MByte PostScript file with sqdps. 6. Add fonts with addpsfonts (shell script).
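
The ligature step in that pipeline was done in awk on troff output. As a rough illustration of the idea only, here is a Python sketch that works on plain strings and substitutes the Unicode ligature characters; a real pipeline would emit whatever codes its font encoding uses, and would avoid ligating across word-internal boundaries where that looks wrong.

```python
# Longest sequences first, so "ffi" is not consumed as "ff" + "i".
LIGATURES = [
    ("ffi", "\ufb03"),
    ("ffl", "\ufb04"),
    ("ff", "\ufb00"),
    ("fi", "\ufb01"),
    ("fl", "\ufb02"),
]


def add_ligatures(text):
    """Replace f-ligature letter sequences with single ligature characters."""
    for letters, glyph in LIGATURES:
        text = text.replace(letters, glyph)
    return text


if __name__ == "__main__":
    print(add_ligatures("difficult office file"))
```

The ordering of the table is the only subtle part: handling the three-letter sequences before the two-letter ones keeps the substitutions from stepping on each other.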

Handling Cross References (1)
Referring to generated text elsewhere: see figure 12. Referring to content elsewhere: see Figure 12, The Mating Habits of Mountains. Referring to page numbers: see Figure 12 on page 37. Never say “See page 12” if you’re on page 12! See Figure 5 (opposite page, bottom); See Figure 12 (next page); See Table 3 (above).

Handling Cross References (2)
Implementation You generally need two passes to get the page number information. Inserting the text may cause pagination to change; you have to detect this and format again You can get an infinite loop!
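
The format-until-stable loop, with a guard against the infinite case, can be sketched in Python as follows. The format_pass callable is a stand-in for a real formatter run: it is assumed to take the page numbers known so far and return the page numbers it actually produced. Both toy formatters below are invented for the example.

```python
def resolve_page_refs(format_pass, max_passes=10):
    """Re-run format_pass until the page numbers it reports stop changing."""
    pages = {}
    for pass_number in range(1, max_passes + 1):
        new_pages = format_pass(pages)
        if new_pages == pages:
            return pages, pass_number
        pages = new_pages
    raise RuntimeError("page references did not settle")


def toy_format(known):
    """Once the reference text is filled in, the figure is pushed from page 9 to 10."""
    return {"fig12": 10 if "fig12" in known else 9}


def flip_flop_format(known):
    """A pathological layout that moves the target every time the reference changes."""
    return {"fig12": 9 if known.get("fig12") == 10 else 10}


if __name__ == "__main__":
    print(resolve_page_refs(toy_format))
```

The toy case settles on the third pass; the flip-flop case never does, so the pass limit turns the infinite loop into an error you can report.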

Optional Section 1 Generated Text Tables of contents
Prefixes, Postfixes and formats Indexes Effectivity Revision Bars

Optional Section 2 Managing Automatic Formatting
Revision Control Explained. RCS and SCCS. A repository with CVS. Printing $Id$ in the margin: keeping track of everything used to print a document.

Optional Section 3 Relating Files
DTDs, instances, style sheet and script versions; tests and checks; managing manual changes; using fixed attributes.

Further Reading Robin Williams, The Non-Designer’s Design Book, Peachpit Press (bright yellow cover). Robert Bringhurst, The Elements of Typographic Style, second edition (tall thin black book). Various TeX books by Knuth. Stop Stealing Sheep & Find Out How Type Works, Adobe Press (small blue book).