Data Representation.

Topics
Bit patterns, binary numbers, and data type formats: character representation, integer representation, and floating point number representation.

Data Representation
Data representation refers to the manner in which data is stored in the computer. There are several different formats for data storage. It is important for computer problem solvers to understand the basic formats.

Why is it Important?
As an example: because storage is finite, it is possible to overflow a storage location by trying to store too large a number. Most programming languages provide multiple data types, each providing a different length of storage for variables. It is up to the programmer to choose a data type with a length that won’t overflow. Knowing how numbers are represented in storage helps one to understand this.

Bit Pattern
As you recall from an earlier presentation, data may take various forms: characters, numbers, graphics, etc. All data is stored in the computer as a sequence of bits, that is, binary digits. This is a universal storage format for all data types, and it is called a bit pattern.

Bits and Bytes
A bit is the smallest unit of data stored in a computer, and it has a value of 0 or 1. It’s like a switch: on (1) or off (0). In computers, bits are stored electronically in RAM and auxiliary storage devices by two-state digital circuits. The storage device itself doesn’t know what the bit pattern represents, but software (application software, operating system, and I/O device firmware) stores and interprets the pattern. That is, data is coded and then stored, and when retrieved it is decoded. A byte is a string of 8 bits and is called a character when the data is text.
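The point that the stored pattern carries no meaning of its own can be illustrated with a minimal Python sketch (Python is not used in the slides, and the pattern 01000001 is chosen here only for illustration). The same eight bits are read three different ways, and only the reading code decides which one applies.

```python
# One 8-bit pattern, three readings. The pattern itself carries no type.
pattern = "01000001"

as_unsigned = int(pattern, 2)          # read as an unsigned integer
as_character = chr(as_unsigned)        # read as an ASCII character
as_switches = [b == "1" for b in pattern]  # read as eight on/off switches

print(as_unsigned, as_character, as_switches)
```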

Binary Numbers
Each bit pattern is a binary number, that is, a number represented by 0’s and 1’s rather than by 0, 1, 2, …, 9 as decimal numbers are. For example, a bit pattern like 1010 is also a binary number. Binary numbers are based on powers of 2 rather than powers of 10 as decimal numbers are.

Data Type Formats
As we have learned, fundamentally all data is stored as a bit pattern, but the different data types have different bit pattern formats. We want to learn the formats for characters (for example, left, Lane, a, ?, \), integers (for example, 1, 453, -10, 0), and floating point numbers (for example, 1.2, 0.009).

Character Representation
The American Standard Code for Information Interchange (ASCII) is the scheme used to assign a bit pattern to each of the characters. ASCII charts come in different flavors: some have 7-bit strings, some 8 or more. Some show the binary code for the various characters as well as the code represented in other number systems, e.g., decimal, hex, octal. For example, the letter A has the ASCII code 1000001 in binary for a 7-bit chart, 65 in decimal, 41 in hex, and 101 in octal. Note: all four of these numbers represent the same value but using different number systems.
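The four codes for the letter A can be checked with a short Python sketch (the variable names are illustrative). It relies only on the built-in `ord` and `format` functions.

```python
# The ASCII code of "A" written in four number systems.
code = ord("A")                      # decimal code

binary_7bit = format(code, "07b")    # 7-bit binary string
hex_code = format(code, "x")         # hexadecimal string
octal_code = format(code, "o")       # octal string

print(code, binary_7bit, hex_code, octal_code)
```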

Subset of ASCII Chart

ASCII Chart
Uppercase characters have different ASCII codes than lowercase characters. Uppercase characters come before lowercase. Digits come before letters. The special characters are spread around. The digits, the uppercase characters, and the lowercase characters each form an adjacent grouping, so that their codes increment by one.

ASCII Chart
The only difference between the binary codes for the upper and lowercase characters is the sixth bit from the right (the bit worth 32), that is, the decimal code for a lowercase character is 32 greater than the uppercase character’s decimal code. ASCII codes before decimal 32 are control characters (nonprintable) like bell, backspace, and carriage return. The final ASCII code in the 7-bit chart is the control character DEL with decimal code 127.
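A small Python sketch of the 32 offset follows; `toggle_case` is a hypothetical helper written for this illustration, not something from the slides. It flips the single bit worth 32 to switch an ASCII letter between upper and lower case.

```python
def toggle_case(ch):
    """Switch an ASCII letter's case by flipping the bit worth 32."""
    if not (len(ch) == 1 and ch.isascii() and ch.isalpha()):
        raise ValueError("expected a single ASCII letter")
    return chr(ord(ch) ^ 0b0100000)


case_offset = ord("a") - ord("A")     # distance between the two groupings
bell, backspace, carriage_return = ord("\a"), ord("\b"), ord("\r")

print(case_offset, toggle_case("a"), toggle_case("Q"))
print(bell, backspace, carriage_return)  # control codes, all below 32
```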

Extended ASCII & Unicode
The eight-bit ASCII chart is sometimes called Extended ASCII. The seven-bit ASCII codes are the same in the eight-bit chart except that they have a zero at the left. Some manufacturers use the extra bit to create additional special characters; these, however, are nonstandard, e.g., using decimal 171 for ½ or 246 for ÷. Unicode is another scheme, developed so that the many symbols in international languages may be represented. It also uses bit patterns. UTF-32, one of its encodings, uses 32 bits per character.

Numeric Representation
ASCII codes are an inefficient method for representing numbers. For example, the number 1,024 using 8-bit ASCII would require four bytes, or 32 bits, of storage. Arithmetic operations on numbers represented in ASCII are very complicated. Representing the precision of a number, that is, the number of digits stored, may require large amounts of space when stored in ASCII. There are more efficient schemes for numbers.
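The storage cost can be compared directly in a short Python sketch. It assumes the number is written without the thousands separator, and it picks a 2-byte binary location for comparison (that size is an assumption made here, not stated on the slide).

```python
# 1024 stored as four ASCII digit characters versus as a binary number.
as_text = "1024".encode("ascii")      # one byte per digit character
as_binary = (1024).to_bytes(2, "big")  # 2-byte unsigned binary value

print(len(as_text), list(as_text))    # four bytes holding the digit codes
print(len(as_binary), list(as_binary))
```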

Integer Representation
An integer is a whole number, that is, a number without a fractional portion. Integers may be positive, negative, or zero. A plus sign or minus sign in front of the number is used to represent positive and negative numbers. The plus sign is not required for positive numbers and zero. There are two categories of integer representation: unsigned and signed.

Unsigned Integer
An unsigned integer is an integer without a sign, that is, a non-negative integer. Unsigned integers range from zero to infinity, but no computer can store all the integers in that range. So, a maximum unsigned integer is defined. This maximum is based on the number of bits used to store an integer. Let’s use 8-bit and 16-bit (1-byte and 2-byte) storage locations in our examples. The length of storage is set by the data type the programmer specifies for a variable.

Unsigned Integer
An unsigned integer is stored as its value when represented as a binary number. Leading zeros are added to fill out the storage location. For example, the decimal number 9 is represented as 00001001 when stored in 1 byte because binary 1001 = decimal 9. When stored in a 2-byte location, 9 would be represented as 00000000 00001001.
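The padding with leading zeros can be sketched in Python as follows. The function name `unsigned_bits` is made up for this illustration.

```python
def unsigned_bits(value, n_bits):
    """Return value as an n_bits-wide binary string, padded with leading zeros."""
    if not 0 <= value < 2 ** n_bits:
        raise OverflowError("value does not fit in the storage location")
    return format(value, "0{}b".format(n_bits))


one_byte = unsigned_bits(9, 8)
two_bytes = unsigned_bits(9, 16)
print(one_byte, two_bytes)
```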

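Unsigned Integer
One may use the following table of powers of two to work with binary numbers:

    Power of 2:  128   64   32   16    8    4    2    1
    Bit:           0    0    0    0    1    0    0    1

For example, given 00001001, what decimal number does it represent? Add the powers of two that have a 1 beneath them, that is, 8 + 1 = 9.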
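The table lookup can be written as a few lines of Python. This is only a sketch of the idea, and `bits_to_decimal` is an illustrative name: each bit is paired with its power of two, and the powers under a 1 are added.

```python
def bits_to_decimal(bits):
    """Add the powers of two that sit above a 1 bit."""
    # Powers run from the leftmost cell down to 1, e.g. 128, 64, ..., 2, 1.
    powers = [2 ** i for i in range(len(bits) - 1, -1, -1)]
    return sum(power for power, bit in zip(powers, bits) if bit == "1")


print(bits_to_decimal("00001001"))
```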

Unsigned Integer
One may use the same table to go the other way, that is, given the decimal number 13, what is its binary representation? Find the largest power of 2 that doesn’t exceed the number and place a 1 in that cell; for 13 that power is 8. Subtract that power of 2 from the number and use the result as the new number: 13 - 8 = 5.

Unsigned Integer
Then continue in this way until the sum of the powers of two equals the number: now 5 - 4 = 1, which puts a 1 in the 4 cell, and finally 1 - 1 = 0, which puts a 1 in the 1 cell. Note that 8 + 4 + 1 = 13.

Unsigned Integer
Then fill in the remaining cells with zeros. So, the unsigned integer representation of decimal 13 is 00001101 when stored in 1 byte.
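The subtract-the-largest-power procedure described on the last few slides can be sketched in Python. The name `decimal_to_bits` and the default width of 8 bits are choices made for this illustration.

```python
def decimal_to_bits(number, n_bits=8):
    """Convert a non-negative integer to binary by subtracting powers of two."""
    if not 0 <= number < 2 ** n_bits:
        raise OverflowError("number does not fit in the storage location")
    cells = []
    remaining = number
    # Walk the table from the largest power down to 1.
    for power in (2 ** i for i in range(n_bits - 1, -1, -1)):
        if power <= remaining:
            cells.append("1")      # this power fits: place a 1 and subtract it
            remaining -= power
        else:
            cells.append("0")      # this power does not fit: the cell stays 0
    return "".join(cells)


print(decimal_to_bits(13))
```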

Unsigned Integer
If one tries to store a number in a memory location that is not large enough, we have what is called overflow. In this case, depending on the system, one may or may not receive an error message. So, one must not store a number that is larger than the maximum for a given length of storage. The maximum number storable in 1 byte is 255.

Unsigned Integer
For example, if one tries to store 256 in 1 byte there is overflow because the largest value storable in 8 bits is 255, the value of the bit pattern 11111111. Note that 128 + 64 + 32 + 16 + 8 + 4 + 2 + 1 = 255.
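Both outcomes mentioned above, silent overflow and a reported error, can be seen in a Python sketch. Python's own integers do not overflow, so this uses two standard-library tools as stand-ins for a 1-byte location: `ctypes.c_uint8`, which wraps around without complaint, and `int.to_bytes`, which raises `OverflowError`.

```python
import ctypes

# A 1-byte unsigned location that does not report overflow: the value wraps.
wrapped = ctypes.c_uint8(256).value

# A conversion that does report overflow.
try:
    (256).to_bytes(1, "big")
    overflow_reported = False
except OverflowError:
    overflow_reported = True

largest = (255).to_bytes(1, "big")   # 255 still fits in one byte
print(wrapped, overflow_reported, largest)
```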

Signed Integer
A format with a leading sign bit is used to allow for positive and negative numbers (and zero). The leading bit is designated as the sign bit: 0 for positive or zero, 1 for negative. The remaining bits represent the value. So, in 1 byte of storage the maximum number storable is not 255 as it was for the unsigned integer representation, but 127, stored as 01111111. Note that 64 + 32 + 16 + 8 + 4 + 2 + 1 = 127.

Signed Positive Integer
To determine the sign-and-magnitude representation of a positive decimal number, convert the decimal number to binary and, if needed, add leading zeros to fill the storage location. For example, decimal 12 is represented in 1 byte as 00001100 because 8 + 4 = 12.

Signed Positive Integer
Going the other way, given a sign-and-magnitude representation for a positive number, one can interpret it as follows: the leftmost bit will be 0, indicating positive; convert the remaining bits to a decimal number. For example, 00010001 is decimal 17 because 16 + 1 = 17.
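Both directions for positive numbers can be sketched in Python. The two function names are hypothetical, and the sketch handles only zero and positive values, which is all these two slides cover.

```python
def signed_positive_bits(value, n_bits=8):
    """Sign bit 0 followed by the magnitude in the remaining bits."""
    largest = 2 ** (n_bits - 1) - 1          # 127 for 1 byte
    if not 0 <= value <= largest:
        raise OverflowError("value does not fit as a positive signed integer")
    return "0" + format(value, "0{}b".format(n_bits - 1))


def read_signed_positive(bits):
    """Check that the sign bit is 0, then convert the remaining bits."""
    if bits[0] != "0":
        raise ValueError("sign bit is 1: not a positive number")
    return int(bits[1:], 2)


print(signed_positive_bits(12), read_signed_positive("00010001"))
```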

Signed Negative Integer
For negative numbers, two’s complement format is used. Two’s complement still uses the leading bit as the sign bit, but it is not a plain sign-and-magnitude format: some of the magnitude bits are flipped from 0 to 1 or 1 to 0.

To determine the two’s complement representation of a negative decimal number:
1. Ignore the sign and convert the decimal number to binary.
2. If needed, add leading zeros to fill the storage location.
3. Leave all the rightmost 0’s and the first 1 unchanged, but flip the remaining bits.
4. Make the sign bit 1.
For example, decimal -14 is represented in 1 byte as 11110010 (see next slide).

Convert 14 to binary (8 + 4 + 2 = 14) and make the leading bits zero: 00001110. Leave the rightmost 0’s and first 1 as is, but flip the remaining bits: 11110010. Make the sign bit 1: 11110010.
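The four steps can be followed literally in a Python sketch. `twos_complement_by_rule` is a name invented here, and the width of 8 bits is just the slide's 1-byte example. The rightmost 1 is located with `str.rindex`, everything to its left is flipped, and the sign bit is then forced to 1.

```python
def twos_complement_by_rule(value, n_bits=8):
    """Two's complement of a negative integer using the 'keep the tail, flip the rest' rule."""
    if not -(2 ** (n_bits - 1)) <= value < 0:
        raise ValueError("expected a negative value that fits in n_bits")
    # Steps 1 and 2: binary of the magnitude, padded with leading zeros.
    bits = format(-value, "0{}b".format(n_bits))
    # Step 3: keep the rightmost 1 and the 0's after it, flip everything before it.
    last_one = bits.rindex("1")
    flipped = "".join("1" if b == "0" else "0" for b in bits[:last_one])
    result = flipped + bits[last_one:]
    # Step 4: make the sign bit 1.
    return "1" + result[1:]


print(twos_complement_by_rule(-14))
```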

Going the other way, given a two’s complement representation for a negative number, one can interpret it as follows:
1. Leave the rightmost bits up to and including the first 1 unchanged, but flip the remaining bits.
2. Convert the binary number to decimal.
3. Put a minus sign in front.
For example, 11101010 is decimal -22 (see next slide).

Flip all but the rightmost 1 and any following 0’s: 11101010 becomes 00010110. Convert the binary number to decimal: we get 22 because 16 + 4 + 2 = 22. Put a minus sign in front, yielding -22.
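The reverse direction uses the same flip, as this Python sketch suggests. The function name is illustrative, and the sketch accepts only patterns whose sign bit is 1.

```python
def read_twos_complement_negative(bits):
    """Interpret a two's complement pattern whose sign bit is 1."""
    if bits[0] != "1":
        raise ValueError("sign bit is 0: not a negative number")
    # Keep the rightmost 1 and the 0's after it, flip everything before it.
    last_one = bits.rindex("1")
    flipped = "".join("1" if b == "0" else "0" for b in bits[:last_one])
    magnitude = int(flipped + bits[last_one:], 2)
    # Put a minus sign in front.
    return -magnitude


print(read_twos_complement_negative("11101010"))
```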

Two’s complement is the standard representation for negative integers in modern computers. This is because arithmetic operations are simple to implement when integers are stored this way (but this concept is beyond the scope of the course). Although on the surface it seems complicated, at a deeper level it allows for simplicity of operations.

An alternative but equivalent method for converting a negative number to its two’s complement representation is:
1. Ignore the sign and convert the decimal number to binary.
2. If needed, add leading zeros to fill the storage location.
3. Flip all the bits.
4. Add 1 to the result of the last step.
5. Make the sign bit 1.
Some people find this easier.

For example, for decimal -14: first, convert 14 to binary (8 + 4 + 2 = 14) and make the leading bits zero: 00001110. Flip all the bits: 11110001. Add 1: 11110010. Make the sign bit 1: 11110010.
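The flip-and-add-one method can be sketched in Python too. `twos_complement_flip_add` is a hypothetical name, and the addition is done on the integer value of the flipped pattern, wrapping at the storage width.

```python
def twos_complement_flip_add(value, n_bits=8):
    """Two's complement of a negative integer by flipping all bits and adding 1."""
    if not -(2 ** (n_bits - 1)) <= value < 0:
        raise ValueError("expected a negative value that fits in n_bits")
    # Steps 1 and 2: binary of the magnitude, padded with leading zeros.
    bits = format(-value, "0{}b".format(n_bits))
    # Step 3: flip all the bits.
    flipped = "".join("1" if b == "0" else "0" for b in bits)
    # Step 4: add 1, staying within n_bits.
    plus_one = (int(flipped, 2) + 1) % (2 ** n_bits)
    result = format(plus_one, "0{}b".format(n_bits))
    # Step 5: make the sign bit 1.
    return "1" + result[1:]


print(twos_complement_flip_add(-14))
```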

Floating Point Number Representation
Floating point numbers are those that have a decimal portion (mathematicians call these real numbers). The method that is used allows for very large or very small numbers to be stored using the same format.

Floating Point
The main idea in this format is that the decimal point is allowed to “float”. That is, there is an “actual” decimal location in the original number, and there is a “stored” decimal location that is usually different. The original number is normalized by moving the decimal point so there is only one digit to the left.

Floating Point
The basic idea can be seen from an example, although this description glosses over many details. A number with three digits to the left of its decimal point is normalized by moving the decimal point two places to the left; the normalized number is stored and is called the mantissa. Also, the fact that the decimal point was moved left by 2 is stored so that the original number may be reconstructed, and this is called the exponent. The sign of the number is also stored (0 for positive or zero, 1 for negative).

Floating Point
However, it is actually more complicated than that. The exponent and mantissa are actually stored in binary. And the value stored as the mantissa is only the fractional part of the binary number once the decimal point has been moved so that there is a single binary 1 at the left, that is, a normalized binary number of the form 1.xxxx is stored as xxxx and the leading 1 is assumed.

Floating Point
The representation of numbers in floating point involves a couple of procedures that are complicated and beyond the scope of the course. These are “repetitive multiplication of a decimal fraction by 2” and the “excess system” for storing positive and negative numbers. So, we won’t be converting the numbers manually ourselves.

Floating Point
However, the procedure used to store a number in floating point representation is:
1. Store a 0 (positive) or 1 (negative) in the sign field.
2. Convert the integer part to binary.
3. Convert the decimal part (fraction) to binary by using “repetitive multiplication by 2”.
4. Combine the two binary numbers with a decimal point between.
5. Move the decimal point so that there is a 1 bit at the left and store the remaining bits in the mantissa field.
6. Store the number of places moved using the “excess system” in the exponent field.
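For readers who want to see the six steps run, here is a rough Python sketch. Several things in it are assumptions that the slides do not state: the field widths (8 exponent bits and 23 mantissa bits, as in IEEE single precision), an excess value of 127, and the function name `float_fields`. It truncates rather than rounds and ignores zero and other special cases, so it matches the real format only for values that fit the mantissa exactly.

```python
import struct


def float_fields(number, exponent_bits=8, mantissa_bits=23):
    """Return (sign, exponent, mantissa) bit strings for a nonzero number."""
    if number == 0:
        raise ValueError("zero is a special case not covered by this sketch")
    # Step 1: sign field.
    sign = "1" if number < 0 else "0"
    magnitude = abs(number)
    # Step 2: integer part to binary.
    integer_part = int(magnitude)
    integer_bits = format(integer_part, "b") if integer_part else ""
    # Step 3: fraction to binary by repetitive multiplication by 2.
    fraction = magnitude - integer_part
    fraction_bits = ""
    while fraction:
        fraction *= 2
        bit = int(fraction)
        fraction_bits += str(bit)
        fraction -= bit
    # Steps 4 and 5: move the point so a single 1 is at the left.
    if integer_bits:
        places_moved = len(integer_bits) - 1           # moved left
        remaining = integer_bits[1:] + fraction_bits
    else:
        first_one = fraction_bits.index("1")
        places_moved = -(first_one + 1)                # moved right
        remaining = fraction_bits[first_one + 1:]
    mantissa = remaining[:mantissa_bits].ljust(mantissa_bits, "0")
    # Step 6: store the places moved in excess form (excess 127 for 8 bits).
    excess = 2 ** (exponent_bits - 1) - 1
    biased = places_moved + excess
    if not 0 < biased < 2 ** exponent_bits - 1:
        raise OverflowError("exponent out of range for this sketch")
    exponent = format(biased, "0{}b".format(exponent_bits))
    return sign, exponent, mantissa


def single_precision_bits(number):
    """The 32 bits Python's struct module produces for a single-precision float."""
    return format(struct.unpack(">I", struct.pack(">f", number))[0], "032b")


print(float_fields(5.75))
print(single_precision_bits(5.75))
```

For 5.75 (binary 101.11) the point moves two places left, so the sketch stores exponent 2 + 127 and the bits after the leading 1.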

Floating Point
Computers store data in binary and in finite space, i.e., they are discrete, finite systems. However, real numbers form a continuous, infinite system. Hence, computers can only approximate real numbers. The precision of a floating point number is how close the stored number is to the original number.

Floating Point
Small Basic example: mathematically c should be 0, but what does the program display for c?

    a = 2 / 3
    b = 2 * (1 / 3)
    c = a - b
    TextWindow.WriteLine(c)
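A comparable experiment in Python is sketched below. Python stores its floats in binary, and which expressions expose the approximation depends on how a language stores numbers, so this sketch uses different values (0.1, 0.2, and 0.3, chosen here and not taken from the slide). The standard-library function `math.isclose` is one common way to compare such values.

```python
import math

# Mathematically c should be 0.
a = 0.1 + 0.2
b = 0.3
c = a - b

print(c)                    # a tiny nonzero value rather than 0
print(a == b)               # exact comparison fails
print(math.isclose(a, b))   # comparison with a tolerance succeeds
```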

Floating Point
The more bits available for the mantissa field, the more digits of the original number may be stored. Programming languages normally allow the programmer to define the precision by the data type chosen.

Floating Point
Institute of Electrical and Electronics Engineers (IEEE) standards: single-precision (4 bytes) and double-precision (8 bytes).
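The two sizes, and the difference in precision, can be observed with Python's standard `struct` module, whose `f` and `d` format codes pack a number as a 4-byte and an 8-byte float. The test value 0.1 is an arbitrary choice for this sketch.

```python
import struct

single = struct.pack(">f", 0.1)   # single precision
double = struct.pack(">d", 0.1)   # double precision

# Read each stored value back to see how close it is to the original.
single_value = struct.unpack(">f", single)[0]
double_value = struct.unpack(">d", double)[0]
single_error = abs(single_value - 0.1)

print(len(single), len(double))
print(single_value, double_value, single_error)
```

Python's own float is double precision, so the 8-byte round trip returns exactly the value that went in, while the 4-byte copy has lost some digits.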

Floating Point
Trade-off: double-precision numbers require more space, and therefore programs using them may run slower. But operations using double-precision numbers will be more precise.

Summary
Data are stored as bit patterns. A bit pattern is a binary number. There are various data type formats. Characters are represented in ASCII.

Summary
Integers are represented as one of the following:
Unsigned: stored as the binary number equivalent to the original.
Signed positive: stored using the sign-and-magnitude format where the magnitude is the binary equivalent.
Signed negative: stored in two’s complement format, with a sign bit of 1.
Floating point numbers are represented using sign, exponent, and mantissa.

Terminology
Data representation, bit, byte, bit pattern, binary number, character, integer, floating point number, ASCII, control characters, Extended ASCII, Unicode, unsigned integer representation, overflow, sign-and-magnitude representation, sign, two’s complement representation, floating point representation, normalize, exponent, mantissa, precision, single-precision, double-precision.
