Samuel Hishmeh EE 585: Fault Tolerant Computing October 12th 2006


1 Pentium FPU Error
Samuel Hishmeh, EE 585: Fault Tolerant Computing, October 12th 2006

2 History: Intel 486 Upgraded, Pentium I Released, 486 vs. Pentium, Pentium FDIV Bug Discovered
Intel first released the Pentium on March 22. It was an upgrade from the old 486 architecture. Intel wanted to trademark the processor as a 586, but because numbers can't be trademarked they came up with Pentium. The "Pent" stands for 5, and "tium" comes from the suffix of titanium, to make it sound like a tough, durable processor.
The major additions to the Pentium were:
New superscalar architecture: there were two datapaths, one for simple instructions and one for all others. This meant that more than one instruction could be handled per clock cycle.
64-bit data path: more information could be pulled in on every memory fetch.
A faster FPU.
In fall 1994 Dr. Thomas Nicely, a mathematics professor at Lynchburg College, Lynchburg, VA, noticed a bug in software he was writing. He first contacted the company he had purchased the PC from, Micron Computers, but they had no explanation. He then contacted Intel through e-mail; Intel acknowledged the same problem but did not offer a solution. After a few days of not hearing back from Intel, he sent the same e-mail to some prominent people who worked for publications such as Byte Magazine, PC Week, InfoWorld and PC Magazine. At this point the flaw began to go public. The actual copy of the e-mail is posted on Dr. Nicely's web site. The most common name for the FPU error is the Pentium FDIV Bug.
References Pentium Release Date/Upgrade Info: History:

3 The Problem
If we have: x = 4195835, z = x - (x/y)*y. z = 0?? Pentium says: 256!! Calculations:
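The check on this slide can be sketched in a few lines of Python. This is a minimal sketch, not the presenter's code: the value of y did not survive in the transcript, so the divisor 3145727 used here is an assumption (it is the value most often quoted alongside x = 4195835 for this bug), and the names `fdiv_residual` and `division_looks_flawed` are hypothetical.

```python
def fdiv_residual(x: float, y: float) -> float:
    """Return z = x - (x / y) * y, which should be (nearly) zero."""
    return x - (x / y) * y


X = 4195835.0   # value given on the slide
Y = 3145727.0   # ASSUMPTION: divisor is missing from the transcript


def division_looks_flawed(x: float = X, y: float = Y,
                          tolerance: float = 1.0) -> bool:
    """Flag a residual far larger than ordinary rounding error.

    The tolerance of 1.0 is an illustrative choice: normal double
    rounding leaves a residual many orders of magnitude smaller,
    while the slide reports a residual of 256 on the flawed chip.
    """
    return abs(fdiv_residual(x, y)) > tolerance


if __name__ == "__main__":
    print("residual:", fdiv_residual(X, Y))
    print("flawed divider suspected:", division_looks_flawed())
```

On a correctly working divider the residual is zero or within rounding error of it, so the function returns False; a residual of 256, as reported on the slide, would trip the check.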

4 Examples / = Pentium: ( )*(1/ ) = 1 Pentium:
In rare cases, certain inputs would produce an incorrect result, with the error appearing anywhere from the 4th to the 19th decimal place. The 60, 66, 75, 90 and 100 MHz Pentium® processors were affected; the problem was corrected before the 120 MHz part was released. Processors Affected: Ex 1: Ex 2:

5 No Big Deal? 1/(1/x) = x, for 824,633,702,418 <= x <= 824,633,702,449 Pentium: 3,072! / = Pentium: Multiply by 3,422,378 Theoretical: 63,884, Pentium: 63,882, Theoretical - Pentium = 2, ! (Assuming my P4 did the calculation correctly) First Example:

6 What Happened? Better Performance using SRT Algorithm: Floating-Point Scalar Code 3x Improvement, Vector Code 5x Improvement, 2 Quotient Bits per Clock Cycle. Algorithm used LUTs. Programming Error: 5 Table Entries Missing!
Intel upgraded the FPU hardware in order to speed up floating-point calculations. The new FPU gave a 3x improvement on floating-point scalar code and a 5x improvement on vector code compared to the 486. The new algorithm produced 2 quotient bits per clock cycle, whereas the 486's shift-and-subtract algorithm produced only one.
The SRT algorithm looked at the most significant bits of the numerator (more precisely, the running partial remainder) and the denominator, and based on these made a guess at the next quotient digit. The guess was retrieved from a 1,066-entry lookup table. Due to a programming error, 5 entries were not downloaded to the PLA (Programmable Logic Array), so whenever the FPU tried to read the quotient digit from one of the missing entries, 0 was returned as the result.
Algorithm: Speedup Numbers/LUT numbers:
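The difference between one and two quotient bits per step, and the effect of a digit lookup that wrongly returns 0, can be sketched as follows. This is a deliberately simplified integer model, not Intel's SRT implementation: real SRT division uses a redundant digit set and a table indexed by a few leading bits, whereas the hypothetical `exact_select` below computes the digit exactly, and `faulty_select` simulates a missing table entry by returning 0 whenever the correct digit would be 2.

```python
def divide_radix2(n: int, d: int, bits: int = 32):
    """Shift-and-subtract division: one quotient bit per step."""
    q, r, steps = 0, 0, 0
    for i in range(bits - 1, -1, -1):
        r = (r << 1) | ((n >> i) & 1)   # bring down the next bit
        q <<= 1
        if r >= d:
            r -= d
            q |= 1
        steps += 1
    return q, r, steps


def exact_select(r: int, d: int) -> int:
    """Pick the radix-4 quotient digit (0..3) exactly."""
    return min(r // d, 3)


def faulty_select(r: int, d: int) -> int:
    """Simulated missing table entry: returns 0 instead of digit 2."""
    digit = exact_select(r, d)
    return 0 if digit == 2 else digit


def divide_radix4(n: int, d: int, bits: int = 32, select=exact_select):
    """Radix-4 division: two quotient bits per step via `select`."""
    assert bits % 2 == 0, "bits must be even for radix-4 steps"
    q, r, steps = 0, 0, 0
    for i in range(bits - 2, -2, -2):
        r = (r << 2) | ((n >> i) & 3)   # bring down the next two bits
        digit = select(r, d)
        r -= digit * d
        q = (q << 2) | digit
        steps += 1
    return q, r, steps


if __name__ == "__main__":
    print(divide_radix2(4195835, 3145727))            # 32 steps
    print(divide_radix4(4195835, 3145727))            # 16 steps
    print(divide_radix4(9, 4, select=faulty_select))  # wrong quotient
```

Both correct versions agree on quotient and remainder, but the radix-4 loop needs half as many steps. With the faulty selector, a single skipped subtraction leaves the remainder too large and the quotient wrong, which is the flavor of failure the slide describes.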

7 Instructions Affected
FDIV FDIVP FDIVR FDIVRP FIDIV FIDIVR FPREM

8 Honorable Mentions: Dan 0411 (Pentium II/Pro Bug), Ariane 5
Only a few years after the FDIV Bug, an FPU error was found in the Pentium II. It was named Dan 0411 after the man who first tested the error and the date on which he first learned about the problem, April 11. Dan 0411 affected the Pentium II and Pentium Pro chips, which sometimes had problems converting floats to ints. Occasionally a float-to-int conversion produces more bits than can fit into the integer representation of the number. In this case the chip should raise an error flag, but the Pentium II and Pentium Pro did not raise it.
I should also mention that the Ariane 5 rocket crash was caused by a similar float-to-int conversion, but in that case the error was in software, not hardware. The flag was raised in hardware, but the software did not handle the fault, leading to an erroneous value which caused the rocket to veer off course and ultimately explode.
Specific Info: Date Bug Discovered/Ariane 5:
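The float-to-int overflow described above can be illustrated with a short sketch. The 16-bit signed target width and the function names are assumptions chosen for illustration; the slides do not state which integer width was involved in either incident.

```python
INT16_MIN = -2**15
INT16_MAX = 2**15 - 1


def to_int16_checked(value: float) -> int:
    """Truncate toward zero and signal values that do not fit.

    int() itself raises ValueError for NaN and OverflowError for
    infinities, so those cases are reported as well.
    """
    truncated = int(value)
    if not INT16_MIN <= truncated <= INT16_MAX:
        raise OverflowError(f"{value!r} does not fit in 16 bits")
    return truncated


def to_int16_unchecked(value: float) -> int:
    """Silently wrap around, as if no overflow flag were raised."""
    return (int(value) + 2**15) % 2**16 - 2**15


if __name__ == "__main__":
    print(to_int16_checked(1234.9))
    print(to_int16_unchecked(40000.0))   # wrapped, plausible-looking value
    try:
        to_int16_checked(40000.0)
    except OverflowError as exc:
        print("overflow reported:", exc)
```

The unchecked version returns an ordinary-looking but wrong number, which is why a missing flag (or a flag nobody handles) is dangerous: nothing downstream can tell the value is garbage.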

9 The Solution: Better Test Benches. Used Old Code
Apparently Intel tested the FPU hardware with standard code that had been passed down through generations of processors. The code was not thorough and did not perform enough calculations to catch the error.
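One way a more thorough test bench could look is a randomized differential test that compares the divider under test against exact rational arithmetic. This is only a sketch under stated assumptions: `check_division`, its parameters, and the operand range are hypothetical, and nothing here reflects Intel's actual test procedure.

```python
import random
from fractions import Fraction


def check_division(divide, trials: int = 10_000, seed: int = 0,
                   rel_tol: float = 2.0**-50):
    """Return the (x, y) pairs for which divide(x, y) is inaccurate.

    Operands are integers up to 2**40, which doubles represent
    exactly, so Fraction(x, y) is the true quotient.
    """
    rng = random.Random(seed)
    failures = []
    for _ in range(trials):
        x = rng.randint(1, 2**40)
        y = rng.randint(1, 2**40)
        exact = Fraction(x, y)
        got = Fraction(divide(float(x), float(y)))  # exact value of result
        if abs(got - exact) > Fraction(rel_tol) * exact:
            failures.append((x, y))
    return failures


if __name__ == "__main__":
    print(len(check_division(lambda a, b: a / b)))  # native division
    # A deliberately broken divider, to show the check can fail:
    broken = lambda a, b: (a / b) * (1 + 1e-6)
    print(len(check_division(broken, trials=100)))
```

Because the slides note that the flaw only showed up for rare inputs, purely random sampling like this could still miss it; a stronger bench would also target the specific operand patterns that exercise every lookup-table entry.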

10 The Cost: Catastrophic Potential. Money: $475 Million. Integrity
Releasing the chip to the public with a flaw could have caused Intel catastrophic problems. For one, the company had to pay $475 million for recalled chips. Beyond that, it could have lost, and did lose, the trust of consumers. In addition, people start to doubt the integrity of technology: we rely on it so much, so how do we know we are always getting the correct results? Intel announced a pre-tax charge of $475 million against earnings to pay for recalled chips. Amount:

11 References http://www.websters-online-dictionary.org/

