Presentation is loading. Please wait.

Presentation is loading. Please wait.

ITEC 1000 “Introduction to Information Technology”

Similar presentations


Presentation on theme: "ITEC 1000 “Introduction to Information Technology”"— Presentation transcript:

1 ITEC 1000 “Introduction to Information Technology”
Lecture 4 Data Formats

2 Lecture Template: Data Forms Data conversion and representation
Data Formats Alphanumeric Data Image Data Audio Data Data Input Data Compression Internal Computer Data Format

3 Data Forms Human communication Computers
Includes language, images and sounds Computers Process and store all forms of data in binary format Conversion to computer-usable representation using data formats Define the different ways human data may be represented, stored and processed by a computer

4 Data conversion and representation

5 Standards (evolve in two ways):
Data formats Proprietary formats Unique to a product or company E.g., Microsoft Word, Word Perfect Standards (evolve in two ways): Proprietary formats become de facto standards (e.g., Adobe PostScript) Invented by an international standard organization (e.g., Motion Pictures Experts Group, MPEG)

6 Common Data Representations
Type of Data Standard(s) Alphanumeric Unicode, ASCII, EDCDIC Image (bitmapped) GIF (graphical image format) TIF (tagged image file format) PNG (portable network graphics) Image (object) PostScript, JPEG, SWF (Macromedia Flash), SVG Outline graphics and fonts PostScript, TrueType Sound WAV, AVI, MP3, MIDI, WMA Page description PDF (Adobe Portable Document Format), HTML, XML Video Quicktime, MPEG-2, RealVideo, WMV

7 Alphanumeric Data Characters (r, T), number digits (0..9), punctuation (!, ;), special purpose characters ($, &) Four codes/standards to represent letters and numbers: BCD (Binary-Coded Decimal) Unicode ASCII (American Standard Code for Information Interchange) EBCDIC (Extended Binary Coded Decimal Interchange Code)

8 Standard Alphanumeric Formats
BCD ASCII EBCDIC Unicode Next 2 slides

9 Binary-Coded Decimal (BCD)
Four bits per digit Digit Bit pattern 0000 1 0001 2 0010 3 0011 4 0100 5 0101 6 0110 7 0111 8 1000 9 1001 Note: the following 6 bit patterns are not used:

10 BCD: Example = ? (in BCD)

11 Standard Alphanumeric Formats
BCD ASCII EBCDIC Unicode Next 13 slides

12 ASCII Features Developed by ANSI (American National Standards Institute) Defined in ANSI document X 7-bit code 8th bit is unused (or used for a parity bit or to indicate “extended” character set) 27 = 128 different codes Two general types of codes: 95 are “Printing” codes (displayable on a console) 33 are “Control” codes (control features of the console or communications channel) Represents Latin alphabet, Arabic numerals, standard punctuation characters Plus small set of accents and other European special characters (Latin-I ASCII)

13 ASCII Table

14 ASCII Table Most significant bit Least significant bit

15 ASCII Table e.g., ‘a’ =

16 ASCII Table 95 Printing codes

17 ASCII Table 33 Control codes

18 ASCII Table Alphabetic codes

19 ASCII Table Numeric codes

20 ASCII Table Punctuation, etc.

21 ASCII Table 7416 111 0100 | MSD LSD 1 2 3 4 5 6 7 NUL DLE SP @ P p SOH
1 2 3 4 5 6 7 NUL DLE SP @ P p SOH DC1 ! A Q a W STX DC2 B R b r ETX DC3 # C S c s EOT DC4 $ D T d t ENQ NAK % E U e u ACJ SYN & F V f v BEL ETB G g w 8 BS CAN ( H X h x 9 HT EM ) I Y i y LF SUB * : J Z j z VT ESC + ; K [ k { FF FS , < L \ l | CR GS - = M ] m } SO RS . > N ^ n ~ SI US / ? O _ o DEL 7416

22 Example: “Hello, world”
= Binary Hexadecimal 48 65 6C 6F 2C 20 77 67 72 64 Decimal 101 108 111 44 32 119 103 114 100 H e l o , w r d

23 Common Control Codes CR 0D carriage return LF 0A line feed
HT 09 horizontal tab DEL 7F delete NULL 00 null Hexadecimal code

24 ASCII Table: Common Control Codes

25 Standard Alphanumeric Formats
BCD ASCII EBCDIC Unicode Next 3 slides

26 EBCDIC 8-bit code Developed by IBM IBM and compatible mainframes only
Rarely used today (common in archival data) Character codes differ from ASCII Conversion software to/from ASCII available ASCII EBCDIC Space 2016 4016 A 4116 C116 b 6216 8216

27 EBCDIC Table (1 out of 2)

28 EBCDIC Table (2 out of 2)

29 Standard Alphanumeric Formats
BCD ASCII EBCDIC Unicode Next 2 slides

30 Unicode Most common 16-bit form represents 65,536 characters
ASCII Latin-I subset of Unicode Values 0 to 255 in Unicode table Multilingual: defines codes for Nearly every character-based alphabet Large set of ideographs for Chinese, Japanese and Korean Composite characters for vowels and syllabic clusters required by some languages Allows software modifications for local-languages

31 Two-byte Unicode Assignment Table

32 Collating Sequence Collating Sequence – the order of the codes in the representation table Determines sorting and selection of the alphanumeric data Collating Sequences are different in ASCII and EBCDIC: Small letters precede capitals in EBCDIC; reverse in ASCII Numbers collate first in ASCII; in EBCDIC, last

33 Two Classes of Codes Printing characters Control characters
Produced output on the screen or printer Control characters Control position of output on screen or printer Cause action to occur Communicate status between computer and I/O device

34 Control Code Definitions (ASCII Table)

35 Escape Sequences Extend the capability of the ASCII code set
For controlling terminals and formatting output Defined by ANSI in documents X and X The escape code is ESC = 1B16 An escape sequence begins with two codes: ESC [ 1B16 5B16

36 Escape Sequences: Examples
Erase display: ESC [ 2 J Erase line: ESC [ K

37 Alphanumeric Input: Keyboard
Scan code Two different binary scan codes generated when key is struck and when key is released Converted to Unicode, ASCII or EBCDIC by software in terminal or PC Received by the host as a stream of text and other characters, i.e. in the sequence typed Advantage Easily adapted to different languages or keyboard layout Separate scan codes for key press/release for multiple key combinations Examples: shift and control keys

38 inhibits bit 5 in the ASCII code
Shift Key inhibits bit 5 in the ASCII code Key(s) ASCII code Character a A a Shift a

39 inhibits bits 5 & 6 in the ASCII code
Control Key inhibits bits 5 & 6 in the ASCII code Key(s) ASCII code Character c ETX c Ctrl c Control code

40 Keyboard Input Three letters are typed: “D”, “I”, “R”, followed by the carriage return Four scan codes translated to ASCII binary codes: , , ,

41 OCR (optical character recognition)
Scans text and inputs it as character data Special OCR software required Used to read specially encoded characters Example: magnetically printed check numbers Attempts to recognize hand-written input (limited, only carefully printed)

42 Bar Code Readers Used in applications that require fast, accurate and repetitive input with minimal employee training Examples: supermarket checkout counters and inventory control Alphanumeric data in bar code (i.e., ) read optically using wand that converts them into electrical binary signals A bar code translation module converts the binary input into a sequence of number codes , one code per digit, then translated to Unicode or ASCII.

43 Other Alphanumeric Input
Magnetic stripe reader: alphanumeric data from credit cards Voice Digitized audio recording common but conversion to alphanumeric data difficult Requires knowledge of sound patterns in a language (phonemes) plus rules for pronunciation, grammar, and syntax

44 Image Data Photographs, figures, icons, drawings, charts and graphs
Two approaches: Bitmap or raster images of photos and paintings with continuous variation (e.g., GIF, JPEG) Object or vector images composed of graphical shapes like lines and curves defined geometrically Differences include: Quality of the image Storage space required Time to transmit Ease of modification

45 Image Input Image scanning (moves over the image converting dot by dot into a stream of binary numbers, pixels, representing black or white, or levels of gray, or of a colour) – bitmap image Digital/video cameras – bitmap image Pointing devices (mouse, pen)- object image

46 Bitmap Images Each individual pixel (pi(x)cture element) in a graphic stored as a binary number Pixel: A small area with associated coordinate location Example: each point below represented by a 4-bit code corresponding to 1 of 16 shades of gray

47 Bitmap Display Monochrome: black or white
1 bit per pixel Gray scale: black, white or 254 shades of gray 1 byte per pixel Color graphics: 16 colors, 256 colors, or 24-bit true color (16.7 million colors) 4, 8, and 24 bits respectively

48 Storing Bitmap Images Frequently large files File size affected by
Example: 600 rows of 800 pixels with 1 byte for each of 3 colors ~1.5MB file File size affected by Resolution (the number of pixels per inch) Amount of detail affecting clarity and sharpness of an image Levels: number of bits for displaying shades of gray or multiple colors Palette: color translation table that uses a code for each pixel rather than actual color value Data compression

49 GIF (Graphics Interchange Format)
First developed by CompuServe in 1987 GIF89a enabled animated images allows images to be displayed sequentially at fixed time sequences Color limitation: 256 Image compressed by LZW (Lempel-Zif-Welch) algorithm Preferred for line drawings, clip art and pictures with large blocks of solid color Lossless compression

50 GIF (Graphics Interchange Format)

51 JPEG (Joint Photographers Expert Group)
Allows more than 16 million colors Suitable for highly detailed photographs and paintings Employs special compression algorithm that Discards data to decreases file size and transmission speed May reduce image resolution, tends to distort sharp lines

52 Other Bitmap Formats TIFF (Tagged Image File Format): .tif (pronounced tif) Used in high-quality image processing, particularly in publishing BMP (BitMaPped): .bmp (pronounced dot bmp) Device-independent format for Microsoft Windows environment: pixel colors stored independent of output device PCX: .pcx (pronounced dot p c x) Windows Paintbrush software PNG: (Portable Network Graphics): .png (pronounced ping) Designed to replace GIF and JPEG for Internet applications Patent-free Improved lossless compression No animation support

53 Object Images Created by drawing packages or output from spreadsheet data graphs Composed of lines and shapes in various colors Computer translates geometric formulas to create the graphic Storage space depends on image complexity number of instructions to create lines, shapes, fill patterns Movies Shrek and Toy Story use object images

54 Object Images Based on mathematical formulas
Easy to move, scale and rotate without losing shape and identity as bitmap images may Require less storage space than bitmap images Cannot represent photos or paintings Cannot be displayed or printed directly Must be converted to bitmap since output devices except plotters are bitmap

55 Popular Object Graphics Software
Most object image formats are proprietary Files extensions include .wmf, .dxf, .mgx, and .cgm Macromedia Flash: low-bandwidth animation Micrographx Designer: technical drawings to illustrate products CorelDraw: vector illustration, layout, bitmap creation, image-editing, painting and animation software Autodesk AutoCAD: for architects, engineers, drafters, and design-related professionals W3C SVG (Scalable Vector Graphics) based on XML Web description language Not proprietary

56 PostScript Page description language: list of procedures and statements that describe each of the objects to be printed on a page Stored in ASCII or Unicode text file Interpreter program in computer or output device reads PostScript to generate image Scalable font support Font outline objects specified like other objects

57 PostScript Program

58 Representing Characters as Images
Characters stored in format like Unicode or ASCII Text processed and stored primarily for content Presentation requirements like font stored with the character Text appearance is primary factor Example: screen fonts in Windows Glyphs: Macintosh coding scheme that includes both identification and presentation requirement for characters

59 Bitmap vs. Object Images
Bitmap (Raster) Object (Vector) Pixel map Geometrically defined shapes Photographic quality Complex drawings Paint software Drawing software Larger storage requirements Higher computational requirements Enlarging images produces jagged edges Objects scale smoothly Resolution of output limited by resolution of image Resolution of output limited by output device

60 Video Images Require massive amount of data
Video camera producing full screen 640 x 480 pixel true color image at 30 frames/sec MB of data/sec 1-minute film clip 1.6 GB storage Options for reducing file size: decrease size of image, limit number of colors, reduce frame rate Method depends on how video delivered to users Streaming video: video displayed as it is downloaded from the Web server Example: video conferencing Local data (file on DVD or downloaded onto system) for higher quality MPEG-2: movie quality images with high compression require substantial processing capability

61 Audio Data Transmission and processing requirements less demanding than those for video Waveform audio: digital representation of sound MIDI (Musical Instrument Digital Interface): instructions to recreate or synthesize sounds Analog sound converted to digital values by A-to-D converter

62 Waveform Audio Sampling rate normally 50KHz

63 Sampling Rate Number of times per second that sound is measured during the recording process. 1000 samples per second = 1 KHz (kilohertz) Example: Audio CD sampling rate = 44.1KHz Height of each sample saved as: 8-bit number for radio-quality recordings 16-bit number for high-fidelity recordings 2 x 16-bits for stereo

64 MIDI Music notation system that allows computers to communicate with music synthesizers Instructions that MIDI instruments and MIDI sound cards use to recreate or synthesize sounds. Do not store or recreate speaking or singing voices More compact than waveform 3 minutes = 10 KB

65 Audio Formats MP3 Derivative of MPEG-2 (ISO Moving Picture Experts Group) Uses psychoacoustic compression techniques to reduce storage requirements Discards sounds outside human hearing range: lossy compression WAV Developed by Microsoft as part of its multimedia specification General-purpose format for storing and reproducing small snippets of sound

66 .WAV Sound Format

67 Data Compression Compression: recoding data so that it requires fewer bytes of storage space. Compression ratio: the amount file is shrunk Lossless: inverse algorithm restores data to exact original form Examples: GIF, PCX, TIFF Lossy: trades off data degradation for file size and download speed Much higher compression ratios, often 10 to 1 Example: JPEG Common in multimedia MPEG-2: uses both forms for ratios of 100:1

68 Compression Algorithms
Repetition Example: large blocks of the same color Pattern Substitution Scans data for patterns Substitutes new pattern, makes dictionary entry Example: 45 to 30 bytes plus dictionary Peter Piper picked a peck of pickled peppers.  t   p    a   of  l   pp  s. Pe pi ed er ck pe Pi

69 Internal Computer Data Format
All data stored as binary numbers Interpreted based on Operations computer can perform Data types supported by programming language used to create application

70 Five Simple Data Types Boolean: 2-valued variables or constants with values of true or false Char: Variable or constant that holds alphanumeric character Enumerated User-defined data types with possible values listed in definition Type DayOfWeek = Mon, Tues, Wed, Thurs, Fri, Sat, Sun Integer: positive or negative whole numbers Real Numbers with a decimal point Numbers whose magnitude, large or small, exceeds computer’s capability to store as an integer


Download ppt "ITEC 1000 “Introduction to Information Technology”"

Similar presentations


Ads by Google