Lecture 9: Chinese Character Output

1 Lecture 9 1 Chinese Character Output
Character 字符: an abstract object that humans recognize in communication; it is the representation at the conceptual level. Control characters in a computer's internal code are not considered characters.
Glyph 字形: a character in its concrete form, without regard to thickness, style, size, or the computer's internal representation (bitmap, outline, etc.).
Font (font set) 字體 / 字型庫: a specific form of the characters, with all the attributes of the computer's internal representation.

2 The three levels of representation
[Diagram. Labels: Character 字符 (Code); Glyph 字形 (GID, Glyph ID); Font 字型; Image 圖像; Internal Representation 內部表示; Document Description; External Representation 外部表示; Rendering; Association; Human perception]

5 Glyph Representation: Bitmaps
A matrix of 1s and 0s represents a character.
A typical monitor displays a character using a 16 x 16 bitmap.
Typical sizes and storage demands are shown (note: doubling the size quadruples the storage).
Data compression is possible because a bitmap contains a lot of empty space.
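
The storage claim can be checked with a little arithmetic. The sketch below assumes an uncompressed bitmap at 1 bit per pixel; the function name `bitmap_bytes` and the list of sizes are illustrative choices, not values taken from the slide.

```python
def bitmap_bytes(width, height):
    """Bytes needed for an uncompressed 1-bit-per-pixel bitmap (rounded up)."""
    return (width * height + 7) // 8


if __name__ == "__main__":
    # Illustrative sizes only; the slide's own table is not reproduced here.
    for size in (16, 24, 32, 48, 64):
        print(f"{size} x {size}: {bitmap_bytes(size, size)} bytes per character")
```

Doubling both width and height multiplies the pixel count, and therefore the storage, by four.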

6 Bitmaps are usually stored at a small size and scaled up, but the quality of slanted edges suffers.
Linear scaling maps Old(x_old, y_old) to New(x_new, y_new), where
$0 \le x_{old} \le \mathrm{Width}_{OLD} - 1$, $0 \le y_{old} \le \mathrm{Height}_{OLD} - 1$ and
$0 \le x_{new} \le \mathrm{Width}_{NEW} - 1$, $0 \le y_{new} \le \mathrm{Height}_{NEW} - 1$,
assuming the Height and Width values are integers.
$r_x = \mathrm{Width}_{NEW} / \mathrm{Width}_{OLD}$, $r_y = \mathrm{Height}_{NEW} / \mathrm{Height}_{OLD}$
If $r_x > 1$ and $r_y > 1$, the operation is called scaling up:
$\mathrm{New}(x_{new}, y_{new}) = \mathrm{New}(x \cdot r_x,\; y \cdot r_y) = \mathrm{Old}(\lfloor x \rfloor, \lfloor y \rfloor)$, where $x = x_{new} / r_x$ and $y = y_{new} / r_y$.
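
The linear scaling rule can be sketched as follows. This is a minimal illustration that assumes the bitmap is a list of rows of 0/1 values and that each new pixel takes the old pixel at the floored coordinates; the name `scale_bitmap` is hypothetical.

```python
def scale_bitmap(old, new_width, new_height):
    """Scale a 0/1 bitmap (list of rows) by copying the old pixel at
    floor(x_new / r_x), floor(y_new / r_y)."""
    old_height = len(old)
    old_width = len(old[0])
    new = []
    for y_new in range(new_height):
        # floor(y_new / r_y) with r_y = new_height / old_height, in integer arithmetic
        y_old = y_new * old_height // new_height
        row = []
        for x_new in range(new_width):
            x_old = x_new * old_width // new_width
            row.append(old[y_old][x_old])
        new.append(row)
    return new


if __name__ == "__main__":
    diagonal = [[1, 0],
                [0, 1]]
    for row in scale_bitmap(diagonal, 4, 4):
        print("".join("#" if bit else "." for bit in row))
```

Each old pixel becomes a solid block, which is why slanted edges turn into visible staircases after scaling up.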

7 Smoothing techniques for scaling
Ad hoc techniques (no underlying model, but cheap):
–Enlargement (matrix manipulation)
–Thresholding: convert into a bitmap (assign 1 if >= 0.4 for unidirectional)

8 Smoothing spline (齒形) and interpolation 嵌入法 (costly)
–Basis: character bitmaps are a coarse sample of the original character.
–Approach: recover the curves of the character as continuous functions (cubic splines), then interpolate or generate the bitmaps at another size.
–Optimization: minimize the lack of smoothness of the recovered curves.

9 Bezier Curves
$P(t) = (x(t), y(t))$: any point on the curve ($0 \le t \le 1$)
Cubic Bezier: 4 points
–the end points coincide with the curve
–the other points control the shape (they can specify the gradient at the end points)
$X(t) = X_0 (1-t)^3 + 3 X_1 (1-t)^2 t + 3 X_2 (1-t) t^2 + X_3 t^3$
$Y(t) = Y_0 (1-t)^3 + 3 Y_1 (1-t)^2 t + 3 Y_2 (1-t) t^2 + Y_3 t^3$
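
The two formulas translate directly into code. The sketch below evaluates a cubic Bezier at a given t; points are assumed to be (x, y) tuples, and the function name and sample control points are illustrative.

```python
def cubic_bezier(p0, p1, p2, p3, t):
    """Return the point (x, y) on the cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    # The four cubic blending weights; they always sum to 1.
    b0 = u ** 3
    b1 = 3 * u ** 2 * t
    b2 = 3 * u * t ** 2
    b3 = t ** 3
    x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
    y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
    return (x, y)


if __name__ == "__main__":
    # Hypothetical control points: an arch from (0, 0) to (1, 0).
    p0, p1, p2, p3 = (0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 0.0)
    for i in range(5):
        t = i / 4
        print(t, cubic_bezier(p0, p1, p2, p3, t))
```

At t = 0 only the first weight is non-zero and at t = 1 only the last, which is why the curve passes through the two end points but generally not through the two inner control points.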

10 Glyph Representation: Outline
Characters are described as shapes enclosed by lines or curves, and these are specified by parameters (i.e. the data is an ASCII file, and an interpreter generates the graphic image).
A line is specified by 2 points.
A curve (usually a cubic Bezier) is specified by 4 points:
–the end points coincide with the curve
–the other points control the shape

11 Advantages compared to bitmaps:
–Scaling does not affect quality (major)
–No need to store fonts at different sizes (a compression of extremely detailed/large fonts)
–Compression (as in standard text)
–Email transport without encoding and decoding
Example of PostScript for the Chinese character 一:

12 Unit of measurement: 1 point = 1/72 of an inch. The coordinate origin is at the bottom-left corner, so coordinate translation is needed.
A PostScript Type 1 font (base font) can handle only up to 256 characters in each set. It maps each of the 256 codes to the name of a glyph in the set.
PostScript Type 0 fonts: composite fonts
–Double-byte encoding:
–1st byte: index to the base font
–2nd byte: code in that particular base font
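
The double-byte lookup can be sketched as a simple split of a 16-bit code. This is a simplified illustration that assumes the first byte is used directly as the base-font index; the function name `split_double_byte` and the sample code value are hypothetical.

```python
def split_double_byte(code):
    """Split a two-byte character code into (base font index, code within that font)."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("expected a two-byte code in the range 0x0000..0xFFFF")
    base_font_index = code >> 8    # 1st byte: selects one base font
    code_in_font = code & 0xFF     # 2nd byte: one of up to 256 codes in that font
    return base_font_index, code_in_font


if __name__ == "__main__":
    # An arbitrary two-byte code chosen for illustration.
    font_index, char_code = split_double_byte(0xA440)
    print(f"base font {font_index:#04x}, code {char_code:#04x}")
```

With 256 possible base fonts of up to 256 characters each, a composite font built this way can address up to 65,536 glyphs.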

13 CID-keyed fonts (pp 288)
A technique that makes character glyph definitions independent of the codeset.
–Each character glyph is given a CID, which uniquely identifies a glyph shape.
–A CMap is a file that contains the mapping from character encodings to glyphs (CIDs).
–A CIDFont file contains the pointers to the actual descriptions of the glyphs. A CIDFont file usually keeps character glyphs of the same style.
Other outline fonts include TrueType and OpenType fonts. They differ in their data structures and header forms.
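
The two-step lookup (code to CID, then CID to glyph description) can be modelled with plain dictionaries. Everything in this sketch is hypothetical: the CID value, the code values for "encoding A" and "encoding B", and the glyph description string are invented for illustration and do not come from any real CMap or CIDFont file.

```python
# Two hypothetical CMaps: different encodings that map their own codes to the same CID.
CMAP_ENCODING_A = {0xA440: 1}
CMAP_ENCODING_B = {0x4E00: 1}

# A hypothetical CIDFont: CID -> glyph description (a placeholder string here).
CID_FONT = {1: "outline description of the glyph 一"}


def glyph_for(cmap, cid_font, code):
    """Resolve a character code to a glyph description via its CID."""
    cid = cmap[code]        # step 1: the CMap turns an encoded character into a CID
    return cid_font[cid]    # step 2: the CIDFont turns the CID into a glyph description


if __name__ == "__main__":
    print(glyph_for(CMAP_ENCODING_A, CID_FONT, 0xA440))
    print(glyph_for(CMAP_ENCODING_B, CID_FONT, 0x4E00))
```

Because only the CMap depends on the encoding, supporting another codeset means adding a CMap, not another copy of the glyph data.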

14 Bitmap-to-Outline Conversion
Determine the outline for all the straight lines.
Generate the curve list: a curve must begin and end at two different corners (so corners must be found first, by computing the angle between two vectors along the outline).
Preprocess for curve fitting: knee removal and smooth filtering yield finer coordinates for the sample points.
Perform curve fitting: iterations try to improve the goodness of fit (measured as the least-squares error).
Align end points: nearby end points of two consecutive splines are merged by averaging their positions.
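
The corner-finding step can be sketched as below. This is one possible reading of "compute an angle between two vectors along the outline": it measures the turning angle between the incoming and outgoing vectors at each outline point. The `step` distance and the 45-degree threshold are assumptions, not values from the slides.

```python
import math


def turning_angle(prev_pt, pt, next_pt):
    """Unsigned angle in degrees between vector prev_pt->pt and vector pt->next_pt."""
    ax, ay = pt[0] - prev_pt[0], pt[1] - prev_pt[1]
    bx, by = next_pt[0] - pt[0], next_pt[1] - pt[1]
    cross = ax * by - ay * bx
    dot = ax * bx + ay * by
    return abs(math.degrees(math.atan2(cross, dot)))


def find_corners(outline, step=1, threshold_deg=45.0):
    """Return indices of points on a closed outline where the direction turns sharply."""
    n = len(outline)
    corners = []
    for i in range(n):
        angle = turning_angle(outline[(i - step) % n], outline[i], outline[(i + step) % n])
        if angle >= threshold_deg:
            corners.append(i)
    return corners


if __name__ == "__main__":
    # A closed square outline sampled at its corners and edge midpoints.
    square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
    print(find_corners(square))
```

On noisy bitmap outlines a larger `step` is usually needed so that single-pixel staircase steps are not reported as corners.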

16 Getting outline pixels through erosion
Finding the outline of a bitmap means finding every pixel that lies inside an object but has at least one neighbour outside the object.
Basic idea:
–Find the bitmap with its edge pixels removed: erosion (a smaller cross)
–Take the original bitmap with the eroded bitmap removed.

17 More mathematical terms and binary image operations are needed.
Translation: a displacement in the x direction, the y direction, or both at once. It repositions the coordinate system. Suppose B is a binary image; $B_{xy}$ means B moved by the coordinates (x, y).
[Figure: the origin (0,0) and the translated position (x,y)]

18 Erosion of B (a bitmap): the set of coordinates (x,y) such that S, translated by (x,y), is contained in B.
$E = B \ominus S = \{(x,y) \mid S_{xy} \subseteq B\}$
S (4 black pixels): [figure] and their rotations
Erosion returns all the points in B whose neighbours all lie inside B, i.e. the points that are not border (edge) pixels.

19 Outline pixels: $B - (B \ominus S)$
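
Translation, erosion, and the outline formula can be sketched together with sets of pixel coordinates. The structuring element used here (the origin plus its four direct neighbours, a cross) is an assumption; the S drawn on the slides is not recoverable from the transcript. All function names are illustrative.

```python
# Assumed structuring element: the origin plus its 4 direct neighbours (a cross).
CROSS = {(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)}


def bitmap_to_set(rows):
    """Convert a list of 0/1 rows into the set of (x, y) coordinates of black pixels."""
    return {(x, y) for y, row in enumerate(rows) for x, bit in enumerate(row) if bit}


def translate(shape, x, y):
    """S_xy: the shape moved by (x, y)."""
    return {(sx + x, sy + y) for sx, sy in shape}


def erode(b, s):
    """All (x, y) such that s translated by (x, y) is contained in b."""
    # Any valid (x, y) must equal some pixel of b minus some offset of s.
    candidates = {(bx - sx, by - sy) for bx, by in b for sx, sy in s}
    return {(x, y) for x, y in candidates if translate(s, x, y) <= b}


def outline(b, s=CROSS):
    """Outline pixels: b with its eroded version removed."""
    return b - erode(b, s)


if __name__ == "__main__":
    block = bitmap_to_set([[1, 1, 1],
                           [1, 1, 1],
                           [1, 1, 1]])
    print(sorted(erode(block, CROSS)))
    print(sorted(outline(block)))
```

For a solid 3 x 3 block, only the centre pixel survives erosion with the cross, so the outline is the eight surrounding pixels.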

