TK 2123 COMPUTER ORGANISATION & ARCHITECTURE Lecture 4: Data in The Computer Dr. Masri Ayob.

Contents
This lecture will address: several different number systems; data formats (alphanumeric characters, image data, audio data); data compression; internal computer data formats; representing integer data; and floating point numbers.

Number Systems
Computers perform all of their operations in binary (base 2). Program code and data are stored and manipulated in binary. Each digit in a binary number is known as a bit (value 0 or 1). Bits are commonly stored and manipulated in groups of: 8 bits (byte), 16 bits (halfword), 32 bits (word), 64 bits (doubleword).

Number Systems
The number of bits used in calculations affects the accuracy and size limitations. In a programming language, the programmer can define a signed integer variable to be, for example: short (16 bits), int (32 bits), long (64 bits). These are the sizes used in Java; in C they depend on the platform.
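
To make the size limitation concrete, the following minimal sketch computes the range of values each width can hold. It assumes signed integers are stored in 2's complement, the representation covered later in this lecture; the function name signed_range is only illustrative.

```python
def signed_range(bits):
    """Return (smallest, largest) value of a 2's complement integer of the given width."""
    # Half of the 2**bits patterns are used for negative values,
    # the other half for zero and the positive values.
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for name, bits in [("short", 16), ("int", 32), ("long", 64)]:
    low, high = signed_range(bits)
    print(f"{name:5} ({bits} bits): {low} to {high}")
```

Each extra bit doubles the number of values that can be represented.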

Number Systems
Common number systems used when working with computers include: base 2 (binary), base 10 (decimal), base 8 (octal), base 16 (hexadecimal).

Number Systems: Counting in Different Bases
Base 10: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, … 99, 100, …
Base 8: 0, 1, 2, 3, 4, 5, 6, 7, 10, 11, 12, … 17, 20, … 77, 100, …
Base 16: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, F, 10, 11, … FF, 100, …
Base 2: 0, 1, 10, 11, 100, 101, 110, 111, …

Numeric Conversion between Numbers
Convert the number to base 10. E.g. 13754_8 = ???
(1 × 8^4) + (3 × 8^3) + (7 × 8^2) + (5 × 8^1) + (4 × 8^0) = 6124_10
Other method (multiply the running total by 8, then add the next digit):
1 × 8 = 8
(8 + 3) × 8 = 88
(88 + 7) × 8 = 760
(760 + 5) × 8 = 6120
6120 + 4 = 6124_10
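
The second method on this slide can be sketched in a few lines of Python. This is only an illustration: the function name to_decimal and the restriction to bases 2 to 16 are assumptions, not part of the lecture.

```python
def to_decimal(digits, base):
    """Convert a digit string in the given base (2 to 16) to an integer."""
    symbols = "0123456789ABCDEF"
    value = 0
    for ch in digits.upper():
        digit = symbols.index(ch)  # raises ValueError for an unknown symbol
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        # Multiply the running total by the base, then add the next digit.
        value = value * base + digit
    return value


print(to_decimal("13754", 8))
```

Python's built-in int("13754", 8) performs the same conversion and can be used to check the result.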

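Numeric Conversion between Numbers
Convert the number from base 10. E.g. 6124_10 = ???_5
Divide repeatedly by 5 and record the remainders:
6124 / 5 = 1224 remainder 4
1224 / 5 = 244 remainder 4
244 / 5 = 48 remainder 4
48 / 5 = 9 remainder 3
9 / 5 = 1 remainder 4
1 / 5 = 0 remainder 1
Reading the remainders from last to first gives 143444_5.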
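
The repeated-division procedure can be sketched as follows. The function name from_decimal is illustrative, and the sketch assumes a non-negative integer and a base between 2 and 16.

```python
def from_decimal(value, base):
    """Convert a non-negative integer to a digit string in the given base (2 to 16)."""
    symbols = "0123456789ABCDEF"
    if value == 0:
        return "0"
    digits = []
    while value > 0:
        # divmod gives the quotient for the next step and the remainder,
        # which is the next digit (least significant first).
        value, remainder = divmod(value, base)
        digits.append(symbols[remainder])
    # The remainders were produced last digit first, so reverse them.
    return "".join(reversed(digits))


print(from_decimal(6124, 5))
```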

Convert Binary Number to Hex
E.g. 0011 0101 1101 1000
Group by 4 digits: 0011 = 3, 0101 = 5, 1101 = D, 1000 = 8, giving 35D8_16.
Most computer manufacturers prefer to use hexadecimal, since a 16-bit or 32-bit number can be represented exactly by a four- or eight-digit hex number. Conversions between binary and hex are used frequently.
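
The grouping rule can be sketched in Python as below. The function name binary_to_hex is illustrative; the sketch assumes the input contains only the characters 0, 1 and spaces.

```python
def binary_to_hex(bits):
    """Convert a string of binary digits to hex by grouping four bits at a time."""
    bits = bits.replace(" ", "")
    # Pad on the left so that the length is a multiple of 4.
    bits = bits.zfill((len(bits) + 3) // 4 * 4)
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    # Each 4-bit group has a value from 0 to 15, which is exactly one hex digit.
    return "".join("0123456789ABCDEF"[int(group, 2)] for group in groups)


print(binary_to_hex("0011 0101 1101 1000"))
```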

Data Formats
Since all data and code in a computer are binary, it is almost always necessary to convert our words, numbers, images and sounds into a different form in order to store and process them in the computer. Original data (characters, images, etc.) must first be brought into the computer and converted into an appropriate computer representation so that it can be processed, stored and used within the computer system.

Data Formats

Different input devices are used for converting original data into computer format. Keyboard: generates a binary number code for each key. Microphone: converts analog sound into binary data using an ADC (analog-to-digital converter). Camera: converts an analog picture into binary data using an ADC. Etc.

Data Formats
There must be agreement between input and output devices so that the data is displayed correctly. If necessary, translation programs can be used to translate from one representation to another. Example: data from the keyboard enters the computer in the form of a character stream. For storage and transmission of data, a representation different from that used for internal processing is often necessary: in addition to the actual data representing points in an image, for example, the system must also store and pass along information that describes or interprets the meaning of the data. This information is known as metadata. E.g. for a graphic image: the type of graphical image, the colour format, etc.

Data Formats
Individual programs can store and process data in any format that they want. The formats used by individual programs are known as proprietary formats. However, standard data representations exist to be used as interfaces between different programs, between programs and I/O devices, between interconnected hardware, and between systems that share data.

Data Formats
Many different standards are in use for different types of data. Some common data representations are:
Type of data: Standard
Alphanumeric: Unicode, ASCII, EBCDIC
Image (bitmap): GIF (Graphics Interchange Format), TIFF (Tagged Image File Format), PNG (Portable Network Graphics), JPEG
Image (object): PostScript, SWF (Macromedia Flash), SVG
Outline graphics and fonts: PostScript, TrueType
Sound: WAV, AVI, MP3, MIDI, WMA
Page description: PDF (Adobe Portable Document Format), HTML, XML
Video: QuickTime, MPEG-2, WMV

Alphanumeric character data
Characters, number digits and punctuation are known as alphanumeric data. Since there is no processing capability in the keyboard itself, numeric data must be entered into the computer just like other characters, one digit at a time. Conversion to numeric form is done by software. Alphanumeric data must be stored and processed within the computer in binary form, which requires character translation. The choice of code used is arbitrary. Three common alphanumeric codes: Unicode; ASCII (American Standard Code for Information Interchange); EBCDIC (Extended Binary Coded Decimal Interchange Code), pronounced "ebb-see-dick". Many computers and terminals use Unicode or ASCII.

ASCII Code Table
The codes are in hex. This is a 7-bit code, so the table has 128 entries.

ASCII
Note that ASCII is designed so that the order of the letters allows a simple numerical sort on the codes to be used within the computer to perform alphabetization. The order of codes in the representation table is known as its collating sequence. There are two classes of codes. Printing characters: produce output on the screen or printer. Control characters: used to control the position of the output on the screen or paper, to cause some action to occur (e.g. ringing a bell, deleting a character), etc.
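
A short Python sketch illustrates the collating sequence. Python compares strings by character code, so a plain sort follows the code table; the word list is made up for illustration. Note that all uppercase letters have lower codes than all lowercase letters, so the sort is alphabetical only within one case.

```python
words = ["pear", "Apple", "banana", "apple", "Zebra"]

# A plain sort compares the character codes, i.e. it follows the collating sequence.
by_code = sorted(words)

# Folding everything to lowercase first gives the order a reader would expect.
ignoring_case = sorted(words, key=str.lower)

print(ord("A"), ord("Z"), ord("a"), ord("z"))
print(by_code)
print(ignoring_case)
```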

Control Code Definitions
Except for position control characters, the control characters are struck by holding down the Control key and striking a character. The code generated corresponds in table position to the position of the same alphabetic character, e.g. "Ctrl A" generates SOH.
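
The table-position rule can be sketched as a bit mask: keeping only the low five bits of a letter's ASCII code gives the matching control code. The function name control_code and the small NAMES table are illustrative and cover only a few of the control characters.

```python
def control_code(letter):
    """Return the ASCII control code generated by Ctrl plus the given letter."""
    # 'A' is 0x41; keeping the low 5 bits leaves 0x01, the code for SOH.
    return ord(letter.upper()) & 0x1F


# A few standard ASCII control characters, keyed by code.
NAMES = {0x01: "SOH", 0x07: "BEL", 0x08: "BS", 0x0A: "LF", 0x0D: "CR"}

for key in "AGHJM":
    code = control_code(key)
    print(f"Ctrl {key} -> 0x{code:02X} ({NAMES[code]})")
```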

ASCII vs Unicode
Because of the limitations of the 7-bit ASCII code, the American National Standards Institute (ANSI) extended it to an 8-bit code known as Latin-1. Latin-1 is also an ISO standard (ISO 8859-1). However, an 8-bit code is still not adequate for representing all the characters in use, which led to Unicode. The original 16-bit Unicode can represent 65,536 characters, of which approximately 49,000 have been defined. More recent versions, starting with Unicode 3.1, go beyond 16 bits and provide room for more than a million characters. Unicode is multilingual in the most global sense.
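
As a rough illustration of why each extension was needed, the sketch below reports the smallest of these code widths that can hold a given character. The function name bits_needed and the four sample characters are chosen for illustration; 21 bits is enough for the full Unicode range, which ends at U+10FFFF.

```python
def bits_needed(ch):
    """Return the smallest code width (7, 8, 16 or 21 bits) that can hold the character."""
    code_point = ord(ch)
    if code_point < 2 ** 7:
        return 7   # fits in 7-bit ASCII
    if code_point < 2 ** 8:
        return 8   # fits in 8-bit Latin-1
    if code_point < 2 ** 16:
        return 16  # fits in the original two-byte Unicode
    return 21      # needs the extended Unicode range


# Letter A, e with acute accent, euro sign, musical G clef symbol.
for ch in ["A", "\u00e9", "\u20ac", "\U0001d11e"]:
    print(f"U+{ord(ch):04X} needs a {bits_needed(ch)}-bit code")
```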

Two-byte Unicode Table

Keyboard Input
When a key is struck on the keyboard, the circuitry in the keyboard generates a binary code, called a scan code. When the key is released, a different code is generated. The scan codes are converted to Unicode, ASCII or EBCDIC codes by software within the terminal or PC to which the keyboard is connected. Advantage of software conversion: the keyboard can easily be changed to correspond to a different language and keyboard layout.
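
The advantage of doing the conversion in software can be sketched with a lookup table per layout. Everything in this example is an assumption made for illustration: the scan code values, the convention that a release code is the press code with the top bit set, and the names LAYOUTS and translate. Real keyboards and drivers differ.

```python
RELEASE_FLAG = 0x80  # assumed: release code = press code with the top bit set

# Assumed scan codes for two physical keys, mapped differently by each layout.
LAYOUTS = {
    "qwerty": {0x10: "q", 0x1E: "a"},
    "azerty": {0x10: "a", 0x1E: "q"},
}


def translate(scan_codes, layout):
    """Turn a stream of scan codes into characters using the chosen layout table."""
    table = LAYOUTS[layout]
    text = []
    for code in scan_codes:
        if code & RELEASE_FLAG:
            continue  # key released: nothing to output in this sketch
        text.append(table[code])
    return "".join(text)


# Press and release the first key, then press and release the second key.
stream = [0x10, 0x90, 0x1E, 0x9E]
print(translate(stream, "qwerty"))
print(translate(stream, "azerty"))
```

The same stream of scan codes yields different characters simply by selecting another table, with no change to the keyboard hardware.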

Keyboard Operation

Alternative Sources of Alphanumeric Input
Optical character recognition: scan text with an image scanner and convert the image into alphanumeric data using optical character recognition (OCR) software. Bar code readers: bar codes represent alphanumeric data. Bar codes are read optically using a device called a wand, which converts a visual scan of the code into electrical binary signals that a bar code translation module can read.

Alternative Sources of Alphanumeric Input
Magnetic stripe reader: reads alphanumeric data from credit cards and other similar devices. Voice input: it is currently possible and practical to digitise audio for use as input data. However, technology to interpret audio data as voice input and to translate the data into alphanumeric form is still primitive.

Image Data
Images used in computers fall into two categories: bitmap images and object images. Different computer representations and processing techniques are used for each category. Bitmap image (raster image): e.g. photographs and paintings. Produced by: scanner, digital camera, video camera frame grabber, or a software program such as a paint program. To maintain and reproduce the detail of these images, it is necessary to represent and store each individual point within the image. GIF and JPEG are common bitmap image formats used on the Web.

Image Data
Object image (vector image): made up of graphical shapes such as lines, circles, etc. that can be defined geometrically. Produced using a drawing or design package. Example: the movies Shrek and Toy Story were created from object images.

Image Input
Image scanner. Digital camera. Video capture devices. Graphical input using pointing devices.

Audio Data
A few different formats are used for storing audio waveforms, e.g. .MOD, .MIDI, .VOC, .WAV, MP3.

Data Compression
Due to the volume of multimedia data, particularly video but also sound and images, data compression is usually desirable. There are two categories of data compression. Lossless: allows complete recovery of the original uncompressed data. Lossy: does not allow exact recovery of the original data, but the result is designed to be perceived as sufficient by the user.
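
To illustrate what "complete recovery" means for lossless compression, here is a minimal run-length encoding sketch. Run-length encoding is not named in the lecture; it is used here only because it is one of the simplest lossless schemes. The function names are illustrative.

```python
def rle_encode(data):
    """Encode bytes as a list of (count, byte value) runs."""
    runs = []
    for byte in data:
        if runs and runs[-1][1] == byte:
            # Same value as the previous byte: extend the current run.
            runs[-1] = (runs[-1][0] + 1, byte)
        else:
            runs.append((1, byte))
    return runs


def rle_decode(runs):
    """Rebuild the original bytes from the list of runs."""
    return b"".join(bytes([byte]) * count for count, byte in runs)


original = b"A" * 6 + b"B" * 3 + b"C" * 8 + b"D"
encoded = rle_encode(original)
print(len(original), "bytes stored as", len(encoded), "runs")
# Lossless: decoding gives back exactly the original data.
print(rle_decode(encoded) == original)
```

A lossy scheme would instead discard detail that the user is unlikely to notice, so the decoded data would only approximate the original.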

Data Formats
Internally, all data, regardless of use, are stored as binary numbers. Instructions in the computer support interpretation of these numbers as characters, integers, pointers and floating point numbers. No special provision is made for the storage of an algebraic sign or decimal point that might be associated with a number.
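
The idea that the same bits can be interpreted in several ways can be sketched with Python's struct module. The four byte values are chosen for illustration, and the floating point interpretation assumes the IEEE 754 single-precision layout, which the lecture does not name.

```python
import struct

# Four bytes of storage; the bits themselves carry no type information.
raw = bytes([0x42, 0x28, 0x00, 0x00])

as_unsigned = struct.unpack(">I", raw)[0]  # one 32-bit unsigned integer
as_float = struct.unpack(">f", raw)[0]     # one 32-bit floating point number
as_text = raw[:2].decode("ascii")          # the first two bytes as ASCII characters

print(hex(as_unsigned))
print(as_float)
print(as_text)
```

Which interpretation applies is decided by the instruction (here, the format code) that reads the data, not by the data itself.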

Representing Integer Data
Unsigned integers can be stored using unsigned binary or binary-coded decimal (BCD). Unsigned binary: the range of integers that can be stored is determined by the number of bits available; 8-bit binary, for example, can store an unsigned integer with a value between 0 and 255. For storing larger numbers, multiple 8-bit storage locations are used. BCD: the number is stored as a digit-by-digit binary representation of the original decimal integer. Each decimal digit is individually converted to 4-bit binary.
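
The two representations can be compared with a short sketch. The function names to_unsigned_binary and to_bcd are illustrative.

```python
def to_unsigned_binary(value, bits=8):
    """Return the unsigned binary pattern of value using the given number of bits."""
    if not 0 <= value < 2 ** bits:
        raise ValueError(f"{value} does not fit in {bits} unsigned bits")
    return format(value, f"0{bits}b")


def to_bcd(value):
    """Return the BCD pattern of a non-negative integer, one 4-bit group per decimal digit."""
    return " ".join(format(int(digit), "04b") for digit in str(value))


print(to_unsigned_binary(255))
print(to_bcd(255))
```

The value 255 fits in 8 bits of unsigned binary but needs 12 bits in BCD, since each of its three decimal digits takes 4 bits.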

Storage of a 32-bit Data Word

Representation for Signed Integers
The most common method of representing signed numbers is 2’s complement representation. The 2’s complement of a number can be found in one of two ways: subtract the value from the modulus, or find the 1’s complement by inverting all 1’s and 0’s and then add 1 to the result (the common method used in computers).

2’s complement representation
Example: the number +2 as an 8-bit number is 0000 0010. To find -2 as an 8-bit number:
+2:              0000 0010
1’s complement:  1111 1101
add 1:          +        1
2’s complement:  1111 1110
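
Both ways of finding the 2's complement can be sketched in Python. The function names are illustrative, and the modulus for an n-bit number is taken to be 2^n.

```python
def twos_complement(value, bits=8):
    """Return the 2's complement bit pattern of a signed integer (modulus method)."""
    modulus = 2 ** bits
    if not -(modulus // 2) <= value < modulus // 2:
        raise ValueError(f"{value} does not fit in {bits} signed bits")
    # For a negative value, value % modulus equals modulus - |value|.
    return format(value % modulus, f"0{bits}b")


def negate_by_inversion(pattern):
    """Negate a bit pattern by inverting every bit and then adding 1."""
    bits = len(pattern)
    inverted = "".join("1" if bit == "0" else "0" for bit in pattern)
    # Any carry out of the top bit is discarded, hence the modulo.
    return format((int(inverted, 2) + 1) % 2 ** bits, f"0{bits}b")


print(twos_complement(2))
print(twos_complement(-2))
print(negate_by_inversion("00000010"))
```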

Floating Point Numbers
The usual floating point number format consists of: a sign bit, an exponent, and a mantissa.
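
The three fields can be extracted from a stored number with the sketch below. It assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits, 23 mantissa bits); the lecture does not name a specific format, so this is one common example rather than the only possibility. The function name float_fields is illustrative.

```python
import struct


def float_fields(x):
    """Split a number stored as a 32-bit float into (sign, exponent, mantissa) bit fields."""
    # Pack as a 32-bit float, then reread the same four bytes as an unsigned integer.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31               # 1 bit
    exponent = (bits >> 23) & 0xFF  # 8 bits, stored with an offset of 127
    mantissa = bits & 0x7FFFFF      # 23 bits, the fraction after the implied leading 1
    return sign, exponent, mantissa


sign, exponent, mantissa = float_fields(-6.5)
print(sign, exponent - 127, hex(mantissa))
```

For -6.5, which is -1.625 × 2^2, the sign bit is 1, the stored exponent is 127 + 2, and the mantissa holds the fraction 0.625.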

Thank you Q & A
