Improving Cluster Interconnect Reliability Otto Liebat April 14, 2008 Intel ® Connects Cables Technology.


1 Improving Cluster Interconnect Reliability
Otto Liebat, April 14, 2008
Intel® Connects Cables Technology

2 Legal Disclaimers
Copyright © 2008 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Other names and brands may be claimed as the property of their respective owners.

3 The Cluster Interconnect Problem
24 AWG copper cables:
- Distance: 8-10 meters *
- Heavy: 1.2 kg for a 10 meter cable
- Bulky: blocks airflow, affects cooling
Copper cables limit HPC cluster design.
* Depends on BER requirement

4 Intel® Connects Cables: What Are They?
[Diagram labels: Electrical Socket; Optical Transceiver in Plug; Optical Cable, up to 100 m; Optical Transceiver in Plug; Electrical Socket]
A drop-in replacement for copper cables:
- Compatible: use existing copper sockets
- Reliable: no user-accessible optical interface
- Low cost: no separate optical transceivers
* Source: Intel internal testing

5 Intel® Connects Cables: High-Performance 20 Gbps Optical Cables
Longer, Faster, Lighter, Thinner, ... and More Reliable

6 Intel® Connects Cables Technology, Highly Reliable: Real-World Case Studies
Ohio State University:
"One can now put switches and racks where they are needed, rather than within the typical 8 or 10-meter reach of traditional cables. This will let us significantly expand the size of InfiniBand clusters, and that is one of our primary goals."
- Dr. DK Panda, Head of the OSU Network-Based Computing Research Group
SCinet 07:
"The Intel cables were a clear success in our deployment. Every cable worked as intended, and that helped both the OpenFabrics effort and SCinet deliver on our promises."
- Doug Fuller, Co-Team Lead, SCinet OpenFabrics Subcommittee, SC 07

7 Simple Cable Latency for Intel® Connects Cables
Optical/electrical (O/E) conversions = 0.275 nanoseconds at each end
Speed of light through the fiber = 4.99 nanoseconds per meter
Latency of a 10 meter Intel® Connects Cable (ns):
- First O/E conversion: 0.275
- Speed of light, 10 m * 4.99 ns/m: 49.9
- Second O/E conversion: 0.275
- Total: 50.45 ns
Latency of a 100 meter Intel® Connects Cable (ns):
- First O/E conversion: 0.275
- Speed of light, 100 m * 4.99 ns/m: 499
- Second O/E conversion: 0.275
- Total: 499.55 ns
Note: 1 ns = 1 x 10^-9 sec
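The arithmetic on this slide can be reproduced in a few lines. The sketch below assumes 0.275 ns per O/E conversion (the figure the slide gives for the second conversion, consistent with the 550 ps per connector pair quoted later in the deck) and the slide's 4.99 ns per meter of fiber. The function and constant names are illustrative, not from the presentation.

```python
# Figures taken from the slide; names are illustrative.
OE_CONVERSION_NS = 0.275      # one optical/electrical conversion, per cable end
FIBER_DELAY_NS_PER_M = 4.99   # propagation delay through the fiber


def simple_cable_latency_ns(length_m: float) -> float:
    """Two O/E conversions plus propagation delay over length_m of fiber."""
    return 2 * OE_CONVERSION_NS + length_m * FIBER_DELAY_NS_PER_M


if __name__ == "__main__":
    for length in (10, 100):
        print(f"{length:>3} m: {simple_cable_latency_ns(length):.2f} ns")
```

For the 10 m and 100 m cables this gives 50.45 ns and 499.55 ns. At either length the fiber propagation delay dominates the total, and the two conversions add only 0.55 ns.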

8 [Figure not reproduced in transcript] Source: DK Panda, OSU (Revision - 01)

9 Effective Latency
A function of:
- Simple cable latency
- Bit error rate
- The time required to find and fix those bit errors. Many things affect this:
  > Other physical delays (e.g., passing through switches)
  > Where the error is detected
  > Whether the bit error is random or caused by a bad link
  > Whether the data to be resent is in a buffer or has to be re-accessed from slower media
  > The system and application tolerance for bit errors
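The slide gives no formula for effective latency, so the following is only a first-order sketch of how the three ingredients might combine. It assumes independent (random) bit errors, a fixed packet size, and a single fixed recovery delay per corrupted packet. The packet size, recovery delay, and function names are hypothetical, chosen for illustration.

```python
import math


def packet_error_probability(ber: float, packet_bits: int) -> float:
    """P(at least one bit error in a packet), assuming independent bit errors."""
    # Equivalent to 1 - (1 - ber)**packet_bits, computed stably for tiny ber.
    return -math.expm1(packet_bits * math.log1p(-ber))


def effective_latency_ns(cable_ns: float, ber: float,
                         packet_bits: int, recovery_ns: float) -> float:
    """Expected latency if each corrupted packet costs one recovery delay.

    First-order model only: it ignores the chance that a retry is
    itself corrupted, and every other factor listed on the slide.
    """
    return cable_ns + packet_error_probability(ber, packet_bits) * recovery_ns


if __name__ == "__main__":
    # Hypothetical inputs: 4 KiB packets and a 1 ms recovery delay.
    for ber in (1e-12, 1e-15):
        total = effective_latency_ns(50.45, ber, 4096 * 8, 1e6)
        print(f"BER {ber:.0e}: {total:.6f} ns")
```

Under this model the error-driven term scales roughly linearly with both BER and recovery delay. That is why a slow recovery path, such as re-reading data from slower media, matters more as the error rate rises.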

10 Intel® Connects Cables: Better Quality at 10 m... at 100 m
[Figure panels: 1 meter Intel® Connects Cable; 10 meter Intel® Connects Cable; 100 meter Intel® Connects Cable; 5 meter 24 AWG copper cable]
Superior signal quality from 1 to 100 meters
Source: Tektronix Lab Evaluation

11 Intel® Connects Cables: Actual Bit Error Rates May Even Be Lower *
Extremely low BER for high HPC compute fabric stability
[Figure panels: 10 meter Intel® Connects Cable **; 100 meter Intel® Connects Cable **]
* Note: Specified BER for Intel® Connects Cables is [...]
** Source: Tektronix Lab Evaluation

12 Intel® Connects Cables More Reliable: 1000 times less BER than interconnects at [...] BER
Bit error rate at 20 Gbps per link
[Chart labels: errors per day for a single link; errors per day for 1000 links]
- Intel® Connects Cable, bit errors per day for 1000 links: 1,728
- [...] BER interconnects, bit errors per day for 1000 links: 1,728,000
* Source: Intel
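The BER values on this slide did not survive in the transcript, but the relationship between BER and errors per day is simple arithmetic: errors per day = BER x bit rate x seconds per day x number of links. The sketch below uses the slide's 20 Gbps line rate and its two surviving totals (1,728 and 1,728,000 errors per day for 1000 links) to work backwards to the BER each total implies. The implied values are an inference from that arithmetic, not figures quoted from the deck, and the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400
LINE_RATE_BPS = 20e9  # 20 Gbps per link, from the slide


def errors_per_day(ber: float, links: int = 1) -> float:
    """Expected bit errors per day across `links` links running at full rate."""
    return ber * LINE_RATE_BPS * SECONDS_PER_DAY * links


def implied_ber(errors_per_day_total: float, links: int) -> float:
    """BER that would produce the given daily error count across `links` links."""
    return errors_per_day_total / (LINE_RATE_BPS * SECONDS_PER_DAY * links)


if __name__ == "__main__":
    for total in (1_728, 1_728_000):
        print(f"{total:>9,} errors/day over 1000 links -> BER ~ {implied_ber(total, 1000):.0e}")
```

The two totals work out to BERs of about 1e-15 and 1e-12, a factor of 1000 apart, which matches the slide's "1000 times less BER" claim.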

13 Intel® Connects Cables Technology, Highly Reliable: Real-World Case Studies
Naval Research Laboratory:
"Almost all our reliability problems went away when we went with the Intel optical cables."
- Henry D. Dardy, Ph.D., Chief Scientist, Center for Computational Science, Naval Research Laboratory
Computational Research Laboratories:
"We had to overcome significant reliability issues, but virtually all our reliability problems went away when we went with the Intel optical cables."
- Ashrut Ambastha, EKA Team Member, Computational Research Laboratories

14 Intel® Connects Cables: High-Performance 20 Gbps Optical Cables
Enable large cluster scale-out:
- High data rate: 20 Gbps per cable *
- InfiniBand or 10 GbE: CX4 connector
- Long distance: up to 100 meters
- Low bit error rate: BER [...]
- Low conversion latency: 550 picoseconds **
Reduced installation, maintenance:
- Less weight: 84% lighter
- Less volume, better airflow: 83% smaller
- Smaller bend radius: 40% less
- Low electromagnetic interference
- No ground loops
* Source: All claims based on Intel internal testing
** Per pair of connectors

15 Intel® Connects Cables Technology: 40 Gbps Ready
Shown with Mellanox ConnectX* 40 Gbps InfiniBand* HCA
* Other names and brands may be claimed as the property of their respective owners.

16 Intel® Connects Cables
Longer, Faster, Lighter, Thinner, ... and More Reliable
For more information visit… *

