Unit -2 ARITHMETIC.


Syllabus: Number representations and their operations, Design of Fast Adders, Signed multiplication, Booth’s Algorithm, Bit-pair recoding, Integer Division, Floating point numbers and operations, Guard bits and rounding.

Number representations and their operations

Number Representation
• Numbers can be represented in 3 formats: signed magnitude, 1's complement, and 2's complement.
• In all three formats, MSB = 0 for positive numbers and MSB = 1 for negative numbers.
• In the signed magnitude system, a negative value is obtained by changing the MSB of the corresponding positive value from 0 to 1. For example, +5 is represented by 0101 and -5 is represented by 1101.
• In the 1's complement system, negative values are obtained by complementing each bit of the corresponding positive number. For example, -5 is obtained by complementing each bit in 0101 to yield 1010.
• In the 2's complement system, the 2's complement of a number is obtained by adding 1 to the 1's complement of that number. For example, -5 is obtained by complementing each bit in 0101 and then adding 1 to yield 1011.
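
The three encodings described above can be checked with a short Python sketch. The function name `encode` and the default 4-bit width are illustrative assumptions, not part of the slides.

```python
def encode(value, bits=4):
    """Return (signed magnitude, 1's complement, 2's complement) bit strings."""
    mask = (1 << bits) - 1
    mag = abs(value)
    if mag >= 1 << (bits - 1):
        raise ValueError("magnitude does not fit in the given width")
    if value >= 0:
        word = format(mag, f"0{bits}b")
        return word, word, word           # positive numbers look the same in all three
    sign_mag = (1 << (bits - 1)) | mag    # change the MSB from 0 to 1
    ones = ~mag & mask                    # complement every bit
    twos = (ones + 1) & mask              # add 1 to the 1's complement
    return tuple(format(w, f"0{bits}b") for w in (sign_mag, ones, twos))


print(encode(+5))  # ('0101', '0101', '0101')
print(encode(-5))  # ('1101', '1010', '1011')
```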

Addition and subtraction with Signed-Magnitude Data

Addition and Subtraction of Signed-Magnitude Data
We designate the magnitude of the two numbers by A and B. When the signed numbers are added or subtracted, we find that there are eight different conditions to consider, depending on the sign of the numbers and the operation performed. These conditions are listed in the Table. The algorithms for addition and subtraction are derived from the table.

Addition and Subtraction of Signed-Magnitude Data

Rules for addition and subtraction using signed magnitude
When the signs of A and B are the same, add the two magnitudes and attach the sign of A to the result. When the signs of A and B differ, compare the magnitudes and subtract the smaller from the larger, giving the result the sign of the larger magnitude. If the two magnitudes are equal, subtract B from A and make the sign of the result positive.

Hardware for signed magnitude addition and subtraction

Example
For addition: add A + B
  +3   0011
  +2   0010
  ---------
  +5   0101
For subtraction: add A + B’ + 1

Hardware Implementation
It consists of registers A and B and sign flip-flops As and Bs. Subtraction is done by adding A to the 2’s complement of B. The output carry is transferred to flip-flop E, where it can be checked to determine the relative magnitudes of the two numbers. The add-overflow flip-flop AVF holds the overflow bit when A and B are added. The addition of A plus B is done through the parallel adder. The S (sum) output of the adder is applied to the input of the A register. The complementer provides an output of B or the complement of B depending on the state of the mode control M. The M signal is also applied to the input carry of the adder. When M = 0, the output of B is transferred to the adder, the input carry is 0, and the output of the adder is equal to the sum A + B. When M = 1, the 1’s complement of B is applied to the adder, the input carry is 1, and the output is S = A + B’ + 1, which is A plus the 2’s complement of B.

Flowchart for add and subtract operations

The two signs As and Bs are compared by an XOR gate. If the output of the gate is 0, the signs are identical; if it is 1, the signs are different. For an add operation, identical signs dictate that the magnitudes be added. For a subtract operation, different signs dictate that the magnitudes be added. The magnitudes are added with a microoperation EA ← A + B, where EA is a register that combines E and A. The carry in E after the addition constitutes an overflow if it is equal to 1. The value of E is transferred into the add-overflow flip-flop AVF. The two magnitudes are subtracted if the signs are different for an add operation or identical for a subtract operation. The magnitudes are subtracted by adding A to the 2’s complement of B. No overflow can occur if the numbers are subtracted, so AVF is cleared to 0. A 1 in E indicates that A ≥ B and the number in A is the correct result. If this number is zero, the sign As must be made positive to avoid a negative zero. A 0 in E indicates that A < B. In that case the number in A is the 2’s complement of the result, so A is complemented again and the sign As is inverted.

Example
(+8) + (-3) = +5
(-8) + (-3) = -11
(+8) - (-3) = +11
(-8) - (-3) = -5
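
The flowchart can be traced in software with the sketch below. It is only an illustration: the function name `sm_add_sub`, the tuple return value and the 4-bit magnitude width are assumptions, and a subtract is modelled by inverting the sign of B, which selects the same branch as the sign comparison in the flowchart.

```python
def sm_add_sub(a_sign, a_mag, b_sign, b_mag, subtract=False, bits=4):
    """Signed-magnitude add/subtract. Returns (sign, magnitude, AVF)."""
    mask = (1 << bits) - 1
    if subtract:
        b_sign ^= 1                          # A - B is treated as A + (-B)
    if a_sign == b_sign:                     # XOR of the signs is 0: add magnitudes
        total = a_mag + b_mag                # EA <- A + B
        return a_sign, total & mask, total >> bits   # carry E goes to AVF
    total = a_mag + (~b_mag & mask) + 1      # EA <- A + B' + 1
    e, result = total >> bits, total & mask
    if e == 1:                               # A >= B: A holds the correct result
        sign = 0 if result == 0 else a_sign  # avoid a negative zero
        return sign, result, 0
    result = (~result + 1) & mask            # A < B: take the 2's complement of A
    return a_sign ^ 1, result, 0             # and invert the sign


# The four examples above (sign 0 = plus, sign 1 = minus)
print(sm_add_sub(0, 8, 1, 3))                  # (0, 5, 0)   -> +5
print(sm_add_sub(1, 8, 1, 3))                  # (1, 11, 0)  -> -11
print(sm_add_sub(0, 8, 1, 3, subtract=True))   # (0, 11, 0)  -> +11
print(sm_add_sub(1, 8, 1, 3, subtract=True))   # (1, 5, 0)   -> -5
```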

Addition and subtraction with Signed-2’s Complement Data

Hardware for signed-2’s Complement addition and subtraction

AC and BR are the registers to hold the numbers. The leftmost bit in AC and BR represent the sign bits of the numbers. The two sign bits are added or subtracted together with the other bits in the complementer and parallel adder. The overflow flip flop V is set to 1 if there is an overflow.

Algorithm

The sum is obtained by adding the contents of AC and BR. The overflow bit V is set to 1 if the exclusive OR of the last two carries is 1, and it is cleared to 0 otherwise. The subtraction operation is accomplished by adding the content of AC to the 2’s complement of BR. Taking the 2’s complement of BR has the effect of changing a positive number to negative, and vice versa.
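
As a rough illustration of this algorithm, the sketch below adds two 2’s complement words and derives V from the exclusive OR of the carry into and the carry out of the sign position. The names `add_twos` and `to_signed` and the 4-bit width are assumptions made for the example.

```python
def add_twos(ac, br, bits=4, subtract=False):
    """Add (or subtract) two 2's complement bit patterns. Returns (result, V)."""
    mask = (1 << bits) - 1
    low = mask >> 1                      # all bits except the sign bit
    carry_in = 0
    if subtract:
        br = ~br & mask                  # 1's complement of BR ...
        carry_in = 1                     # ... plus 1 gives the 2's complement
    carry_into_sign = ((ac & low) + (br & low) + carry_in) >> (bits - 1)
    total = ac + br + carry_in
    carry_out = total >> bits
    v = carry_into_sign ^ carry_out      # XOR of the last two carries
    return total & mask, v


def to_signed(word, bits=4):
    """Interpret a bit pattern as a signed 2's complement value."""
    return word - (1 << bits) if word >> (bits - 1) else word


print(add_twos(0b0011, 0b0010))                  # (5, 0): +3 + +2 = +5
print(add_twos(0b0011, 0b0010, subtract=True))   # (1, 0): +3 - +2 = +1
print(add_twos(0b1000, 0b1001))                  # (1, 1): -8 + -7 overflows
```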

Rules for addition using 2’s complement
When two negative numbers are added, a carry is generated from the sign bit and is discarded. The sum is negative and is held in 2’s complement form; taking the 2’s complement of the sum gives its magnitude.

Rules for subtraction using 2’s complement
First, the 2’s complement of the number to be subtracted is found. Then it is added to the other number. If the final carry out of the sum is 1, it is dropped and the result is positive. If there is no carry out, the result is negative and its magnitude is the 2’s complement of the sum.

How to calculate binary value of negative number
Example: (-3)
First find the binary value of (+3), then find its 2’s complement.
+3 = 0011
2’s complement of 0011 = 1101
-3 = 1101

Example
Addition: (+3) + (+2)
    +3   0 011
  + +2   0 010
  ------------

Subtraction: (+3) - (+2)
    +3   0011
  - +2   1110   (2’s complement)
  ------------

Overflow
When the actual result exceeds the representable range, overflow occurs. Overflow occurs when there are insufficient bits in a binary number representation to portray the result of an arithmetic operation. To detect and compensate for overflow, one needs n+1 bits if an n-bit number representation is employed. For example, in 32-bit arithmetic, 33 bits are required to detect or compensate for overflow. This can be implemented in addition (subtraction) by letting a carry (borrow) occur into (from) the sign bit. Example: the range of a 4-bit unsigned number is 0 to 15. Whenever the carry flag is set after an unsigned addition, overflow has occurred.

Examples of overflow

Underflow
When the actual result is below the representable range, underflow occurs.
Example: 4-bit 2’s complement (range is from -8 to 7)
    (-8)
  + (-7)
  ------
   (-15)   which is below -8

2’s complement table

Design of Fast Adders

N-bit ripple carry adder
A cascaded connection of n full adder blocks can be used to add two n-bit numbers. Since carries must propagate (or ripple) through the cascade, the configuration is called an n-bit ripple carry adder.
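
A small software model may make the cascade concrete. The function names `full_adder` and `ripple_carry_add` are assumptions chosen for this sketch; each loop iteration stands for one full adder stage receiving the carry of the previous stage.

```python
def full_adder(x, y, c):
    """One full adder stage: returns (sum bit, carry-out)."""
    s = x ^ y ^ c
    c_out = (x & y) | (x & c) | (y & c)
    return s, c_out


def ripple_carry_add(x, y, n, c0=0):
    """Add two n-bit numbers with n cascaded full adders. Returns (sum, carry-out)."""
    carry = c0
    total = 0
    for i in range(n):                   # stage 0 is the least significant bit
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        total |= s << i
    return total, carry


print(ripple_carry_add(0b0011, 0b0010, 4))  # (5, 0)
print(ripple_carry_add(0b1111, 0b0001, 4))  # (0, 1)
```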


Design of fast adders
If an n-bit ripple carry adder is used to implement addition/subtraction, all sum bits are available only after a long delay, because the carry must ripple through every stage. Two approaches can be used to reduce delay in adders:
• Use the fastest possible electronic technology in implementing the ripple carry design.
• Use an augmented logic gate network structure.

Fast Addition
The logic expressions for Si (sum) and Ci+1 (carry-out) of stage i, with input bits xi and yi and carry-in Ci, are
Si = xi ⊕ yi ⊕ Ci
Ci+1 = xi yi + xi Ci + yi Ci

Carry Lookahead Addition

4-bit carry-lookahead adder
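
The slide's diagram is not reproduced here, so the following is only a sketch of the usual carry-lookahead idea under stated assumptions: each stage forms a generate signal Gi = xi yi and a propagate signal Pi = xi + yi, and every carry is written directly in terms of G, P and c0 instead of waiting for the previous stage. The function name `carry_lookahead_add` is hypothetical.

```python
def carry_lookahead_add(x, y, c0=0):
    """4-bit carry-lookahead addition. Returns (4-bit sum, carry-out c4)."""
    xb = [(x >> i) & 1 for i in range(4)]
    yb = [(y >> i) & 1 for i in range(4)]
    g = [xb[i] & yb[i] for i in range(4)]    # generate:  Gi = xi yi
    p = [xb[i] | yb[i] for i in range(4)]    # propagate: Pi = xi + yi
    # Every carry depends only on G, P and c0 (no rippling between stages)
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    carries = [c0, c1, c2, c3]
    total = 0
    for i in range(4):
        total |= (xb[i] ^ yb[i] ^ carries[i]) << i   # Si = xi XOR yi XOR ci
    return total, c4


print(carry_lookahead_add(0b0111, 0b0110))  # (13, 0)
```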

Multiplication Algorithm: Hardware Implementation, Hardware Algorithm, Binary Multiplication, Booth Multiplication Algorithm.

Multiply Signed-Magnitude

Multiply Signed-Magnitude Hardware Implementation

Example: 23 x 19 = 437
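
The example can be followed with a shift-and-add sketch. It assumes the common register arrangement for this algorithm (multiplicand in B, multiplier in Q, partial product in A, carry in E); the function names and the 5-bit register width are choices made for illustration.

```python
def multiply_magnitudes(b, q, n):
    """Shift-and-add multiplication of two n-bit magnitudes; returns the 2n-bit product."""
    mask = (1 << n) - 1
    a = 0
    for _ in range(n):
        e = 0
        if q & 1:                            # low bit of the multiplier is 1
            total = a + b                    # EA <- A + B
            e, a = total >> n, total & mask
        # shift E, A, Q right by one position
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (e << (n - 1))
    return (a << n) | q


def multiply_signed_magnitude(x, y, n=5):
    """The sign of the product is the XOR of the operand signs."""
    negative = (x < 0) ^ (y < 0)
    product = multiply_magnitudes(abs(x), abs(y), n)
    return -product if negative else product


print(multiply_signed_magnitude(23, 19))   # 437
```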

Multiply Signed-2’s Complement(Booth Algorithm)

Multiply Signed-2’s Complement Hardware Implementation

Example: -9 x -13 = 117
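
A software trace of Booth's algorithm for this example might look as follows. This is a sketch, not the slide's hardware: the names `booth_multiply`, `ac`, `qr` and `q_extra` (standing for the extra bit to the right of the multiplier) and the 5-bit width are assumptions, and the most negative multiplicand (for example -16 with n = 5) is not handled.

```python
def booth_multiply(multiplicand, multiplier, n):
    """Booth multiplication of two n-bit 2's complement numbers."""
    mask = (1 << n) - 1
    br = multiplicand & mask
    qr = multiplier & mask
    ac, q_extra = 0, 0
    for _ in range(n):
        pair = (qr & 1, q_extra)
        if pair == (1, 0):                   # first 1 of a string: subtract
            ac = (ac - br) & mask
        elif pair == (0, 1):                 # first 0 after a string: add
            ac = (ac + br) & mask
        # arithmetic shift right of AC, QR and the extra bit
        q_extra = qr & 1
        qr = (qr >> 1) | ((ac & 1) << (n - 1))
        ac = (ac >> 1) | (ac & (1 << (n - 1)))   # keep the sign bit
    product = (ac << n) | qr
    if product >> (2 * n - 1):               # interpret the 2n-bit result as signed
        product -= 1 << (2 * n)
    return product


print(booth_multiply(-9, -13, 5))   # 117
```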

Bit-pair recoding

In this method the multiplier is recoded so that fewer steps may be needed in the multiplication process. Consider the multiplication (21) x (14). The Booth-recoded multiplier is obtained by scanning the original multiplier from right to left, placing a -1 in the position where the first 1 in a string of 1s is encountered and placing a +1 in the position where the next 0 is seen. Bit-pair recoding then groups the Booth-recoded digits in pairs, so that each pair selects 0, ±1 or ±2 times the multiplicand and the number of summands is halved. Multiplicand = (21), Multiplier = (14).

Steps to multiply using bit recoded method
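
The steps are not listed on the slide, so the sketch below shows one way to carry them out for (21) x (14): Booth-recode the multiplier, combine the recoded digits in pairs, then add one shifted multiple of the multiplicand per pair. The function names and the 6-bit multiplier width are assumptions.

```python
def booth_recode(multiplier, n):
    """Booth digits (+1, 0, -1) of an n-bit 2's complement multiplier, LSB first."""
    digits = []
    prev = 0                                  # implied 0 to the right of the LSB
    for i in range(n):
        bit = (multiplier >> i) & 1
        digits.append(prev - bit)             # bit,prev = 1,0 -> -1 ; 0,1 -> +1 ; else 0
        prev = bit
    return digits


def bit_pair_recode(multiplier, n):
    """Pair up Booth digits: each result is in {-2, -1, 0, +1, +2}. n must be even."""
    d = booth_recode(multiplier, n)
    return [d[i] + 2 * d[i + 1] for i in range(0, n, 2)]


def multiply_bit_pair(multiplicand, multiplier, n):
    """One summand per bit pair, each weighted by a power of 4."""
    return sum(digit * multiplicand * 4 ** k
               for k, digit in enumerate(bit_pair_recode(multiplier, n)))


print(booth_recode(14, 6))           # [0, -1, 0, 0, 1, 0]  (LSB first)
print(bit_pair_recode(14, 6))        # [-2, 0, 1]           (LSB pair first)
print(multiply_bit_pair(21, 14, 6))  # 294
```

With the pairs read from the most significant end, the multiplier 14 becomes +1, 0, -2, so only three summands are needed instead of six.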

Unsigned division

Division is a more tedious process than multiplication. For the unsigned case, there are two standard approaches:
• Restoring division: if the trial subtraction makes A negative, the previous value of A is restored by adding the divisor back before the next step.
• Non-restoring division: if A is negative after a step, it is not restored. Instead, A and Q are shifted left and the divisor is added in the next step rather than subtracted.

Divide Fixed-Point Signed-Magnitude

Divide overflow
The division operation may result in a quotient with an overflow. The quotient is to be stored in a standard register, so an overflow bit would require one more flip-flop to store the extra bit (the sixth bit when the registers hold five bits). The divide overflow condition must be avoided in normal computer operations because the entire quotient will be too long for transfer into a memory unit that has words of standard length. Provision to ensure that this condition is detected must be included in either the hardware or the software. When the dividend is twice as long as the divisor, the condition for overflow can be stated as follows: a divide overflow condition occurs if the high-order half bits of the dividend constitute a number greater than or equal to the divisor. Another problem associated with division is that a division by zero must be avoided. A divide overflow is usually signalled by setting a special flip-flop (DVF).

Restoring division

Example: 448 ⁄ 17 = 26 r 6
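
The example can be reproduced with a sketch of restoring division. It assumes the usual arrangement in which the double-length dividend sits in A and Q and the divisor has n bits; the function name `restoring_divide` and the use of Python exceptions for the overflow and divide-by-zero cases are illustrative choices.

```python
def restoring_divide(dividend, divisor, n):
    """Divide a 2n-bit unsigned dividend by an n-bit divisor. Returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    mask = (1 << n) - 1
    a, q = dividend >> n, dividend & mask     # high half in A, low half in Q
    if a >= divisor:
        raise OverflowError("divide overflow: quotient does not fit in n bits")
    for _ in range(n):
        a = (a << 1) | (q >> (n - 1))         # shift A,Q left
        q = (q << 1) & mask
        a -= divisor                          # trial subtraction
        if a < 0:
            a += divisor                      # restore A, quotient bit stays 0
        else:
            q |= 1                            # quotient bit is 1
    return q, a


print(restoring_divide(448, 17, 5))   # (26, 6)
```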

Non Restoring division
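
For comparison, a non-restoring variant is sketched below under the same assumptions (double-length dividend in A and Q, n-bit divisor). A is allowed to go negative; a negative A is corrected by adding the divisor in the next step, and only once at the very end if the final remainder is negative. The function name is hypothetical.

```python
def non_restoring_divide(dividend, divisor, n):
    """Non-restoring division of a 2n-bit dividend by an n-bit divisor."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    mask = (1 << n) - 1
    a, q = dividend >> n, dividend & mask
    if a >= divisor:
        raise OverflowError("divide overflow: quotient does not fit in n bits")
    for _ in range(n):
        was_negative = a < 0
        a = 2 * a + (q >> (n - 1))            # shift A,Q left
        q = (q << 1) & mask
        a = a + divisor if was_negative else a - divisor
        if a >= 0:
            q |= 1                            # quotient bit is 1 when A is not negative
    if a < 0:
        a += divisor                          # single final correction
    return q, a


print(non_restoring_divide(448, 17, 5))   # (26, 6)
```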

Divisions involving Negatives
Simplest solution: convert to positive and adjust the sign later.
Note that multiple solutions exist for the equation: Dividend = Quotient x Divisor + Remainder
+7 div +2   Quo = +3   Rem = +1
-7 div +2   Quo = -3   Rem = -1
+7 div -2   Quo = -3   Rem = +1
-7 div -2   Quo = +3   Rem = -1
Convention: the dividend and remainder have the same sign. If the signs of the dividend and divisor are alike, the sign of the quotient is plus. If they are unlike, the sign is minus. These rules fulfil the equation above.
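
The convention can be written as a few lines of Python. Note that Python's own `//` and `%` operators round toward negative infinity, so the sketch divides the magnitudes and fixes the signs afterwards; the function name `signed_divide` is an assumption.

```python
def signed_divide(dividend, divisor):
    """Divide magnitudes, then apply the sign rules. Returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("division by zero")
    q, r = divmod(abs(dividend), abs(divisor))
    if (dividend < 0) != (divisor < 0):
        q = -q                    # unlike signs: quotient is minus
    if dividend < 0:
        r = -r                    # remainder takes the sign of the dividend
    return q, r


for x, y in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    q, r = signed_divide(x, y)
    assert x == q * y + r         # Dividend = Quotient x Divisor + Remainder
    print(x, "div", y, "->", q, r)
```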

Floating-Point Arithmetic Operations

Basic Considerations
The floating point representation has two parts. The first part represents a signed, fixed point number called the mantissa. The second part designates the position of the decimal (or binary) point and is called the exponent. The fixed point mantissa may be a fraction or an integer. For example, the decimal number is represented in floating point with a fraction and an exponent as follows: Example: Fraction = Exponent = The value of the exponent indicates that the actual position of the decimal point is four positions to the right of the indicated decimal point in the fraction.

A floating point number in computer registers consists of two parts: a mantissa m and an exponent e. The two parts represent a number obtained from multiplying m times a radix r raised to the value of e; thus m × r^e. Only the mantissa m and the exponent e are physically represented in the register. The radix r and the radix point position of the mantissa are always assumed.

The decimal number is represented in a register with m= and e=3 and is interpreted to represent the floating point number × 10^3. The binary number is represented with an 8-bit fraction and a 6-bit exponent as follows. Fraction = Exponent = The fraction has a 0 in the leftmost position to denote positive. The binary point of the fraction follows the sign bit but it is not shown in the register. The exponent has the equivalent binary number 4. Computers with shorter word lengths use two or more words to represent a floating point number. An 8-bit microcomputer may use four words to represent one floating point number. One word of 8 bits is reserved for the exponent and the 24 bits of the other three words are used for the mantissa.

Normalization
A floating point number is normalized if the most significant digit of the mantissa is nonzero. In this way the mantissa contains the maximum possible number of significant digits. For example, the number 55.66 can be represented as 5.566×10^1, 0.5566×10^2, 0.05566×10^3, and so on. The fractional part can be normalized. In the normalized form, there is only a single non-zero digit before the radix point. For example, decimal number can be normalized as ×10^2; binary number B can be normalized as B×2^3.

Register Configuration
The register configuration for floating point operations is quite similar to the layout for fixed point operations. As a general rule, the same registers and adder used for fixed point arithmetic are used for processing the mantissa. The difference lies in the way the exponents are handled. The register organization for floating point operations consists of three registers BR, AC, and QR. Each register is subdivided into two parts. The mantissa part has the same uppercase letter symbols as in fixed point representation. The exponent part uses the corresponding lowercase letter symbol.

Registers for floating point arithmetic operations

Register Configuration
It is assumed that each floating point number has a mantissa in signed magnitude representation and a biased exponent. Thus the AC has a mantissa whose sign is in As and a magnitude that is in A. The exponent is in the part of the register denoted by the lowercase letter symbol a. The diagram shows explicitly the MSB of A, labeled A1. The bit in this position must be a 1 for the number to be normalized. A parallel adder adds the two mantissas and transfers the sum into A and the carry into E. A separate parallel adder is used for the exponents.

Floating point number representation
32-bit single-precision floating-point representation: The most significant bit is the sign bit (S), with 0 for positive numbers and 1 for negative numbers. The following 8 bits represent the exponent (E). The remaining 23 bits represent the fraction (F), written M in the formula. Value represented = ±1.M × 2^(E - 127). For example, with E = 40: Value represented = ±1.M × 2^(40 - 127) = ±1.M × 2^(-87).
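
The three fields can be inspected from Python with the standard `struct` module. The helper names `decode_single` and `value_of` are assumptions for this sketch, and `value_of` covers only normalized numbers (0 < E < 255).

```python
import struct


def decode_single(x):
    """Split a number, stored as a 32-bit float, into (S, E, F)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    s = bits >> 31                 # 1 sign bit
    e = (bits >> 23) & 0xFF        # 8 exponent bits (excess-127)
    f = bits & 0x7FFFFF            # 23 fraction bits
    return s, e, f


def value_of(s, e, f):
    """Value = (+/-)1.F x 2^(E - 127) for normalized numbers."""
    return (-1) ** s * (1 + f / 2 ** 23) * 2.0 ** (e - 127)


print(decode_single(-6.5))           # (1, 129, 5242880), i.e. -1.625 x 2^2
print(value_of(1, 129, 0x500000))    # -6.5
```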

Floating point number representation
64-bit double-precision floating-point representation: The most significant bit is the sign bit (S), with 0 for positive numbers and 1 for negative numbers. The following 11 bits represent the exponent (E). The remaining 52 bits represent the fraction (F), written M in the formula. Value represented = ±1.M × 2^(E - 1023). For example: Value represented (un-normalized) = × 2^9; Value represented (un-normalized) = × 2^6.

Floating-Point Add / Subtract
Arithmetic operations with floating point numbers are more complicated than with fixed point numbers; their execution takes longer and requires more complex hardware. Adding or subtracting two numbers first requires an alignment of the radix point, since the exponent parts must be made equal before adding or subtracting the mantissas. The alignment is done by shifting one mantissa while its exponent is adjusted until it is equal to the other exponent. Example: one number has the exponent 10^2 and the other has the exponent 10^-1. It is necessary that the two exponents are equal before the mantissas can be added. We can either shift the first number three positions to the left, or shift the second number three positions to the right.

When the mantissas are stored in registers, shifting to the left causes a loss of the most significant digits. Shifting to the right causes a loss of the least significant digits. The second method is preferable because it only reduces the accuracy, while the first method may cause an error. The usual alignment procedure is to shift the mantissa that has the smaller exponent to the right by a number of places equal to the difference between the exponents. After this is done, both numbers have the exponent 10^2 and the mantissas can be added.

During addition or subtraction, the two floating point operands are in AC and BR. The sum or difference is formed in the AC. The algorithm can be divided into four consecutive parts:
1. Check for zeros
2. Align the mantissas
3. Add or subtract the mantissas
4. Normalize the result

A floating point number that is zero cannot be normalized. If this number is used during the computation, the result may also be zero. Instead of checking for zeros during the normalization process we check for zeros at the beginning and terminate the process if necessary. The alignment of the mantissas must be carried out prior to their operation. After the mantissas are added or subtracted, the result may be unnormalized. The normalization procedure ensures that the result is normalized prior to the transfer to memory.

Flowchart for Floating-Point Add / Subtract
If BR is equal to zero, the operation is terminated, with the value in the AC being the result. If AC is equal to zero, we transfer the content of BR into AC and also complement its sign if the numbers are to be subtracted. If neither number is equal to zero, we proceed to align the mantissas. The magnitude comparator attached to exponents a and b provides three outputs that indicate their relative magnitude. If the two exponents are equal, we go on to perform the arithmetic operation. If the exponents are not equal, the mantissa having the smaller exponent is shifted to the right and its exponent is incremented. This process is repeated until the two exponents are equal. The addition and subtraction of the two mantissas is identical to the fixed point addition and subtraction algorithm. The magnitude part is added or subtracted depending on the operation and the signs of the two mantissas. If an overflow occurs when the magnitudes are added, it is transferred into flip-flop E. If E is equal to 1, the bit is transferred into A1 and all other bits of A are shifted right. The exponent must be incremented to maintain the correct number. No underflow may occur in this case because the original mantissa that was not shifted during the alignment was already in a normalized position.

If the magnitudes were subtracted, the result may be zero or may have an underflow. If the mantissa is zero, the entire floating point number in the AC is made zero. Otherwise, the mantissa must have at least one bit that is equal to 1. The mantissa has an underflow if the MSB in position A1 is 0. In that case, the mantissa is shifted left and the exponent decremented. The bit in A1 is checked again and the process is repeated until it is equal to 1. When A1 = 1, the mantissa is normalized and the operation is completed.
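
The four parts of the flowchart can be sketched in Python as below. Everything about the data layout is an assumption made for illustration: a number is a tuple (sign, mantissa, exponent), the mantissa is an 8-bit fraction magnitude (so a normalized mantissa has its top bit set), and the exponent is kept as a plain integer.

```python
N = 8  # assumed number of mantissa magnitude bits


def fp_add(x, y, subtract=False):
    """x, y = (sign, mantissa, exponent); value = (-1)^sign * mantissa/2^N * 2^exponent."""
    (xs, xm, xe), (ys, ym, ye) = x, y
    if subtract:
        ys ^= 1                               # complement the sign of BR
    # 1. Check for zeros
    if ym == 0:
        return (xs, xm, xe)
    if xm == 0:
        return (ys, ym, ye)
    # 2. Align: shift the mantissa with the smaller exponent right
    while xe < ye:
        xm >>= 1
        xe += 1
    while ye < xe:
        ym >>= 1
        ye += 1
    # 3. Add or subtract the magnitudes
    if xs == ys:
        m, s = xm + ym, xs
        if m >> N:                            # carry into E: shift right, exponent + 1
            m >>= 1
            xe += 1
        return (s, m, xe)
    if xm >= ym:
        m, s = xm - ym, xs
    else:
        m, s = ym - xm, ys
    if m == 0:
        return (0, 0, 0)                      # the whole number is made zero
    # 4. Normalize: shift left until the MSB (A1) is 1
    while not (m >> (N - 1)) & 1:
        m <<= 1
        xe -= 1
    return (s, m, xe)


def to_float(num):
    s, m, e = num
    return (-1) ** s * m / 2 ** N * 2.0 ** e


four = (0, 0b10000000, 3)           # 0.5  x 2^3 = 4.0
one_and_half = (0, 0b11000000, 1)   # 0.75 x 2^1 = 1.5
print(to_float(fp_add(four, one_and_half)))                 # 5.5
print(to_float(fp_add(four, one_and_half, subtract=True)))  # 2.5
```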

Floating-Point Multiplication
The multiplication of two floating point numbers requires that we multiply the mantissas and add the exponents. The multiplication of the mantissas is performed in the same way as in fixed point to provide a double precision product. The double precision answer is used in fixed point numbers to increase the accuracy of the product. The multiplication algorithm can be subdivided into four parts:
1. Check for zeros
2. Add the exponents
3. Multiply the mantissas
4. Normalize the product

The two operands are checked to determine if they contain a zero. If either operand is equal to zero, the product in the AC is set to zero and the operation is terminated. If neither of the operands is equal to zero, the process continues with the exponent addition. The exponent of the multiplier is in q and the adder is between exponents a and b. It is necessary to transfer the exponents from q to a, add the two exponents, and transfer the sum into a. Since both exponents are biased by the addition of a constant, the exponent sum will have double this bias. The correct biased exponent for the product is obtained by subtracting the bias number from the sum. The multiplication of the mantissas is done as in the fixed point case with the product residing in A and Q. Overflow cannot occur during multiplication, so there is no need to check for it.

The product may have an underflow, so the MSB in A is checked. If it is a 1, the product is already normalized. If it is a 0, the mantissa in AQ is shifted left and the exponent decremented. Note that only one normalization shift is necessary. The multiplier and multiplicand were originally normalized and contained fractions. The smallest normalized operand is 0.1 (binary), so the smallest possible product is 0.01; therefore, only one leading zero may occur. Although the low-order half of the mantissa is in Q, we do not use it for the floating point product. Only the value in the AC is taken as the product.
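
A compact sketch of the four parts follows. The tuple layout (sign, mantissa, biased exponent), the 8-bit mantissa width and the bias value 64 are all assumptions chosen for the example; the slides do not fix them.

```python
N = 8       # assumed mantissa magnitude bits
BIAS = 64   # assumed exponent bias


def fp_mul(x, y):
    """x, y = (sign, mantissa, biased exponent), both normalized or zero."""
    (xs, xm, xe), (ys, ym, ye) = x, y
    if xm == 0 or ym == 0:                 # 1. check for zeros
        return (0, 0, 0)
    exp = xe + ye - BIAS                   # 2. the sum carries the bias twice
    sign = xs ^ ys
    product = xm * ym                      # 3. double-length product (A and Q)
    if not (product >> (2 * N - 1)) & 1:   # 4. at most one normalization shift
        product <<= 1
        exp -= 1
    return (sign, product >> N, exp)       # keep only the high half (A)


def to_float(num):
    s, m, e = num
    return (-1) ** s * m / 2 ** N * 2.0 ** (e - BIAS)


one = (0, 0b10000000, BIAS + 1)     # 0.5  x 2^1 = 1.0
three = (0, 0b11000000, BIAS + 2)   # 0.75 x 2^2 = 3.0
print(to_float(fp_mul(one, three)))    # 3.0
print(to_float(fp_mul(three, three)))  # 9.0
```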

Floating-Point division
Floating point division requires that the exponents be subtracted and the mantissas divided. The mantissa division is done as in fixed point except that the dividend has a single precision mantissa that is placed in the AC. The division algorithm can be subdivided into five parts:
1. Check for zeros
2. Initialize registers and evaluate the sign
3. Align the dividend
4. Subtract the exponents
5. Divide the mantissas

The two operands are checked for zero. If the divisor is zero, it indicates an attempt to divide by zero, which is an illegal operation. The operation is terminated with an error message. If the dividend in AC is zero, the quotient in QR is made zero and the operation terminates. If the operands are not zero, we proceed to determine the sign of the quotient and store it in Qs. The sign of the dividend in As is left unchanged to be the sign of the remainder. The Q register is cleared and the sequence counter SC is set to a number equal to the number of bits in the quotient. The dividend alignment is similar to the divide overflow check in the fixed point operation. The proper alignment requires that the fraction dividend be smaller than the divisor. The two fractions are compared by a subtraction test. The carry in E determines their relative magnitude. The dividend fraction is restored to its original value by adding the divisor.

If A ≥ B, it is necessary to shift A once to the right and increment the dividend exponent. Since both operands are normalized, this alignment ensures that A < B. Next the divisor exponent is subtracted from the dividend exponent. Since both exponents were originally biased, the subtraction operation gives the difference without the bias. The bias is then added and the result transferred into q because the quotient is formed in QR. The magnitudes of the mantissas are divided as in the fixed point case. After the operation, the mantissa quotient resides in Q and the remainder in A. The floating point quotient is already normalized and resides in QR.
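
The five parts can be sketched as follows. As before, the tuple layout (sign, mantissa, biased exponent), the 8-bit mantissa and the bias of 64 are assumptions for illustration; the remainder is discarded and both operands are expected to be normalized.

```python
N = 8       # assumed mantissa magnitude bits
BIAS = 64   # assumed exponent bias


def fp_div(x, y):
    """x / y for normalized (sign, mantissa, biased exponent) operands."""
    (xs, xm, xe), (ys, ym, ye) = x, y
    if ym == 0:                              # 1. check for zeros
        raise ZeroDivisionError("divide by zero")
    if xm == 0:
        return (0, 0, 0)
    sign = xs ^ ys                           # 2. sign of the quotient
    dividend = xm << N                       #    dividend in A, Q cleared
    if xm >= ym:                             # 3. align so that the dividend < divisor
        dividend >>= 1
        xe += 1
    exp = xe - ye + BIAS                     # 4. the difference loses the bias, add it back
    quotient = dividend // ym                # 5. divide the mantissas
    return (sign, quotient, exp)


def to_float(num):
    s, m, e = num
    return (-1) ** s * m / 2 ** N * 2.0 ** (e - BIAS)


three = (0, 0b11000000, BIAS + 2)   # 0.75 x 2^2 = 3.0
one = (0, 0b10000000, BIAS + 1)     # 0.5  x 2^1 = 1.0
print(to_float(fp_div(three, one)))   # 3.0
```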

Guard bits and Rounding

Guard bits
Prior to a floating point operation, the exponent and mantissa of each operand are loaded into ALU registers. In the case of the mantissa, the length of the register is almost always greater than the length of the mantissa plus an implied bit. The register contains additional bits, called guard bits, which are used to pad out the right end of the mantissa with 0s. This yields maximum accuracy in the final results.

The use of guard bits

Consider the numbers in the IEEE format, which has a 24-bit mantissa. Two numbers that are very close in value are X = 1.00…00 × 2^1 and Y = 1.11…11 × 2^0. If the smaller number is to be subtracted from the larger, it must be shifted right 1 bit to align the exponents. In the process, Y loses 1 bit of mantissa; the result is 2^-22. The same operation is repeated in part b with the addition of guard bits. Now the least significant bit is not lost due to alignment, and the result is 2^-23, a difference of a factor of 2 from the previous answer.
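
The effect can be reproduced with exact integer arithmetic. In the sketch below the mantissas are held as integers and `guard_bits` zeros are appended on the right before the alignment shift; the function name `subtract_close` is an assumption.

```python
from fractions import Fraction


def subtract_close(guard_bits):
    """X - Y for X = 1.00...0 x 2^1 and Y = 1.11...1 x 2^0 (24-bit mantissas)."""
    x = (1 << 23) << guard_bits          # mantissa of X padded with guard zeros
    y = ((1 << 24) - 1) << guard_bits    # mantissa of Y padded with guard zeros
    y >>= 1                              # align Y to exponent 1 (may drop a bit)
    diff = x - y
    # the mantissa has 23 + guard_bits fraction bits and the exponent is 1
    return Fraction(diff, 1 << (23 + guard_bits)) * 2


print(subtract_close(0) == Fraction(1, 2 ** 22))   # True: without guard bits
print(subtract_close(4) == Fraction(1, 2 ** 23))   # True: with guard bits
```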

Rounding
Removing guard bits in generating a final result requires that the extended mantissa be truncated to create a 24-bit number that approximates the longer version. The result of any operation on the mantissa is generally stored in a longer register. When the result is put back into the floating point format, the extra bits must be disposed of. A number of techniques have been explored for performing rounding:
• Chopping: The simplest way is to remove the guard bits and make no changes in the retained bits.
• Von Neumann rounding: If the bits to be removed are all 0s, they are simply dropped, with no changes to the retained bits. However, if any of the bits to be removed is 1, the LSB of the retained bits is set to 1. Example: in a 6-bit to 3-bit truncation, all 6-bit fractions with b-4 b-5 b-6 not equal to 000 are truncated to 0.b-1 b-2 1.
• Rounding: It achieves the closest approximation to the number being truncated and it is an unbiased method. A 1 is added to the LSB position of the bits to be retained if there is a 1 in the MSB position of the bits being removed.
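
The three techniques can be compared on the 6-bit to 3-bit case with the sketch below. Fractions are held as integers (0.b-1 … b-6 becomes a 6-bit integer), and the function names are assumptions. Note that `round_nearest` can produce a carry out of the retained bits, which real hardware would handle by renormalizing.

```python
def chop(x, drop=3):
    """Chopping: discard the low bits, leave the retained bits unchanged."""
    return x >> drop


def von_neumann(x, drop=3):
    """Set the LSB of the retained bits to 1 if any removed bit is 1."""
    kept = x >> drop
    return kept | 1 if x & ((1 << drop) - 1) else kept


def round_nearest(x, drop=3):
    """Add 1 at the retained LSB if the MSB of the removed bits is 1."""
    return (x + (1 << (drop - 1))) >> drop


for value in (0b010101, 0b010011, 0b010000):
    print(format(value, "06b"),
          format(chop(value), "03b"),
          format(von_neumann(value), "03b"),
          format(round_nearest(value), "03b"))
# 010101 010 011 011
# 010011 010 011 010
# 010000 010 010 010
```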

Question
In a single precision floating point number, specify the values for representing ±0, ±infinity and NaN (not a number).
The end values 0 and 255 of the excess-127 format exponent E are used to represent special values.
• When E = 0 and the mantissa fraction M = 0, the value exact 0 is represented.
• When E = 255 and the mantissa fraction M = 0, the value ∞ is represented, where ∞ is the result of dividing a normal number by zero. The sign bit is still part of these representations, so there are ±0 and ±∞ representations.
• When E = 0 and the mantissa fraction M ≠ 0, denormal numbers are represented. The purpose of introducing denormal numbers is to allow for gradual underflow.
• When E = 255 and the mantissa fraction M ≠ 0, the value represented is called Not a Number (NaN). A NaN is the result of performing an invalid operation such as 0/0 or the square root of -1.
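
These rules translate directly into a small classifier over the 32-bit pattern. The function names `classify_single` and `float_bits` are assumptions; `struct` from the Python standard library is used only to obtain the bit pattern of a float.

```python
import struct


def float_bits(x):
    """32-bit single precision pattern of x as an integer."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def classify_single(bits):
    """Classify a 32-bit pattern by its exponent E and fraction M."""
    e = (bits >> 23) & 0xFF
    m = bits & 0x7FFFFF
    sign = "-" if bits >> 31 else "+"
    if e == 0:
        return sign + ("0" if m == 0 else "denormal")
    if e == 255:
        return sign + "infinity" if m == 0 else "NaN"
    return sign + "normal"


print(classify_single(0x80000000))                  # -0
print(classify_single(float_bits(float("inf"))))    # +infinity
print(classify_single(0x7FC00000))                  # NaN
print(classify_single(0x00000001))                  # +denormal
```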

Expressible Numbers
With normalization, the range of numbers that can be represented in a 32-bit word is given in the next slide. Using 2’s complement integer representation, all of the integers from -2^31 to 2^31 - 1 can be represented, for a total of 2^32 different numbers. Using the single and double precision floating point number formats, the following ranges of numbers are possible.

Expressible Numbers
