Text and Image Representation

Text and Image Representation
Université Pierre-Mendès France Digital multimedia Text and Image Representation

Outline Representation of media Text ASCII Unicode Image Audio Video

ASCII Most text files are represented using the ASCII coding scheme
Each character is translated into 7 bits Can contain at most 128 symbols (i.e., 27): Latin alphabet and Arabic numerals Standard punctuation characters Small set of accents and other European special characters (Latin-I ASCII) Not enough for many languages

ASCII Table

ASCII Table Control Characters Printable Characters

ASCII Table Most significant bit Least significant bit ‘a’ =

ASCII Table (Hexadecimal)
MSD LSD 1 2 3 4 5 6 7 NUL DLE SP @ P p SOH DC1 ! A Q a W STX DC2 “ B R b r ETX DC3 # C S c s EOT DC4 $ D T d t ENQ NAK % E U e u ACJ SYN & F V f v BEL ETB ‘ G g w 8 BS CAN ( H X h x 9 HT EM ) I Y i y LF SUB * : J Z j z VT ESC + ; K [ k { FF FS , < L \ l | CR GS - = M ] m } SO RS . > N ^ n ~ SI US / ? O _ o DEL ‘a’ = 61HEX

ASCII Example Representing "Hello world" in ASCII ASCII ASCII HEX
URL Encoded ? H e l o w r d = 48 65 6C 6F 20 77 72 64

ASCII Extension Uses 8 bits to represent a character
can code 256 letters or symbols (28) better for many European and Asian languages OEM extended ASCII IBM-PC ANSI extended ASCII Windows Popular extensions

Unicode Unicode is a 16-bit code
Can represent 65, 536 characters (216) Enough for any language character set Useful in international business documents Subsumes ASCII Unicode helps with the location of software  Software modification for local-languages

Unicode Code Ranges

Unicode Range Exemple Emoticons (range 1F600-1F64F) Emoticons in HTML

UTF Encoding Unicode is a machine independent representation
UTF-X specifies how to serialize a Unicode character the number of bits to use (e.g., UTF-8, UTF-16, UTF-32, …) UTF-8 is the de facto standard Backward compatible with ASCII (Unix systems) 128 UTF-8 first characters correspond to ASCII characters Preferred encoding mechanism for multi-language web pages

UTF-8 Encoding Principle
UTF-8 serializes a Unicode characters into multiple bytes The high bits of the first serialized byte indicate how many bytes are used for the serialization of that character The bits represented by "n"s hold the Unicode character code value

Outline Representation of media Text Image Image data types
Popular formats Audio Video

Color Perception In the real-world images are a continuous spectrum of colors

Color Perception How humans perceive colors

Bitmap and Pixels Computer handles real images by "discretizing" the color spectrum into small “bit of lights" and producing bitmaps Real world Bitmap

Bitmap and Pixels The colored dots that make up a bitmap are properly called pixels (i.e. picture elements) In the simplest sort of bitmapped image, each pixel is represented by red, green and blue light

Bitmaps Images Images can be classified according to the number of bits used for representing them: Monochrome (black or white) 1 bit per pixel Gray scale (black, white and 254 shades of gray) 1 byte per pixel Color images: 16 colors = 256 colors 24-bit = (16.7 million colors)

1-bit Monochrome Image Simplest type of image
Contains no color Each pixel is stored as a single bit (0 or 1) Also referred as binary image Usage Printing tickets, news paper, fax, etc. Lena

1-bit Monochrome Image What is the size of this image ? (in bytes)
480 px 38,4 Kbytes 640 px

8-bit Gray Scale Image Represent gray images
Each pixel is represented by a single byte Each pixel has a gray-value between 0 and 255 (i.e., 28) e.g., a dark pixel might have a value of 10, and a bright one might be 230

8-bit Gray Scale Image What is the size of an image of this image?
640 px 480 px 300 Kbytes Gray scale image of Lena

Bit planes An 8-bit image can be thought of as a set of bit-planes
Each plane consists of a 1-bit representation of the image at higher levels of elevation A bit is turned on if the image pixel >= bit level

Bit planes Note that the first plane gives the most critical approximation to the image lower the number of the bit plane, less is its contribution to the final image adding a bit plane gives a better approximation

Dithering Strategy that trade intensity resolution by spatial resolution for giving the illusion of bit-depth transition

24-bit Color Image A pixel is represented by three bytes (usually representing RGB) Supports 256 x 256 x 256 possible colors (i.e., 16,777,216 ) Known also as True Color The human eye discriminates up to 10 million colors Such flexibility does result in a storage penalty: Ex. A 640 x bit color image would require ? 921.6 kB of storage (without any compression)

24-bit Color Image Sometimes these images are stored as 32-bit images for representing special effect information The extra byte of data for each pixel is called alpha value Can be used for representing transparency Ex. for composing several overlapping objects

24-bit Color Image Original
High-resolution color and separate R, G, B color channel images Green Red Blue

8-bit Color Image When space is a concern reasonable accurate color images can be obtained by using 8 bits of color (the so-called “256 colors”) Many systems can make use of 8 bits color images to be backward compatible Ex. Mobile phones previous to smartphones Such image files use the concept of a lookup table (LUTs) to store color information The image stores no color but a pointer into a table with 3-byte values that specify the color for a pixel

Color Lookup Table (LUTs)
Ex. if a pixel stores the value 25, the meaning is to go to row 25 in a LUT  a LUT is a kind of palette of colors (e.g., web safe colors)

Color Lookup Table (LUTs)
Choosing the best 256 colors based on an RGB histogram In this example there is almost no blue. Thus the 256 colors can be composed mostly of red and green

Macromedia Director Supported File Formats
Image File Formats The number of file formats continues to proliferate File Import File Export Native Image Palette Sound Video Anim. .BMP, .DIB, .GIF, .JPG, .PICT .PNG, .PNT, .PSD, .TGA, .TIFF, .WMF .PAL .ACT .AIFF .AU .MP3 .WAV .AVI .MOV .DIR .FLA .FLC .FLI .GIF .PPT .BMP .DIR .DXR .EXE Macromedia Director Supported File Formats

Image File Formats Popular file formats GIF
One of the most important formats because of its historical connection to the WWW and HTML markup language as the first image type recognized by web browsers JPEG Currently the most important common file format PNG Subsumes GIF

GIF (Graphics Interchange Format)
Developed by CompuServe in 1987 Most commonly used graphic file formats (especially on the Internet) Limited to 256 colors  8-bits used Can store multiple images and the controls to make them appear as real time animation (e.g. delay time and transparency index) Allows a special color to be used as "background" so image looks transparent Lossless compression Reduce the file size without degrading the visual quality Preferred for line drawings, clip art and pictures with large blocks of solid color

Note that is not possible to convert directly from a 24-bit file (ex. JPEG) to the GIF format You need to convert a 24-bit image to Indexed Color mode first Reduce the number of colors to a palette of 256 or less  Create a custom palette generated by the most commonly used color in the image

LUTs

Animated GIF

Transparency GIF One position of the color palette is designated as “transparent” All pixels of the image that have this particular color index will be painted as transparent when viewing Transparent GIF Not a transparent GIF

JPEG (Joint Photographers Expert Group)
Allows more than 16 million colors Suitable for highly detailed photographs and paintings Uses "lossy" compression to get more graphics into a smaller file May reduce image resolution Tends to distort sharp lines An image written to the JPEG format will be degraded by a quality factor Quality factor close to 100  almost no degradation but compression is not significant Quality factor close to 0  very small file but unrecognizable A quality factor of 75 is usually a good compromise

JPEG Quality Factor Example Q = 100 Q = 10 Q = 1 83,261 Kbytes

JPEG Degradation The image degradation caused by the JPEG format is cumulative i.e., if you write an image to a JPEG file, and then read it from the JPEG file and write it back to the JPEG format, it will have suffered two passes of image degradation It works very, very badly on text, line art or other types of mechanical graphics Degradation will be quite noticeably. These sorts of graphics should be stored in another format (e.g. GIF or PNG)

Progressive JPEG There are two types of JPEG files : Advantage
Sequential JPEG stores its image as a simple bitmap Progressive JPEG stores its image such that it can appear initially out of focus when it begins to download to a web page and resolve itself as more of the image is received by the web browser Advantage Provide indication of the whole image to the viewer before the entire image is loaded Disadvantage Require more computational power to display

GIF vs. JPEG GIF JPEG Best application
Line Art, Image with few color text Photographs, Image with many colors Display speed Fast Slower, more computation Benefits Transparency, Animation Greatest compress for photographs, more color Max. color 256 16.7 million

PNG (Portable Network Graphic)
Created as an alternative to GIF Lossless compression scheme is used Does not support animation Support three image type: true color, gray scale, palette-based JPEG supports the first 2. GIF supports the 3rd one.

Quiz Why we need to be able to have less that 24-bit color and why this makes for a problem ? What do we need to do to adaptively transform a 24-bit image into a 8-bit one?

Quiz Suppose we decide to quantize an 8-bit gray scale image down to just 2 bits of accuracy. What is the simplest way to do so? What ranges of byte values in the original image are mapped to quantized values?

Quiz Suppose we have available 24 bits per pixel for a color image. However we notice that humans are more sensitive to R and G than to B – (1.5 times more sensitive). How could we best make use of the bits available?

Quiz At your job, you have decided to impress the boss by using up more disk space for the company's gray scale images. Instead of using 8 bits per pixel, you'd like to use 48 bits per pixel in RGB. How could you store the original gray scale images so that in the new format they would appear the same as they used to, visually?

Text and Image Representation

Similar presentations

Presentation on theme: "Text and Image Representation"— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

Text and Image Representation

Similar presentations

Presentation on theme: "Text and Image Representation"— Presentation transcript:

Similar presentations

About project

Feedback