Improving Cluster Interconnect Reliability Otto Liebat April 14, 2008 Intel ® Connects Cables Technology.


1 Improving Cluster Interconnect Reliability Otto Liebat April 14, 2008 Intel ® Connects Cables Technology

2 Revision - 01 2 Legal Disclaimers Copyright © 2008 Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Other names and brands may be claimed as the property of their respective owners.

3 The Cluster Interconnect Problem
24 AWG Copper Cables
– Distance limited: 8-10 meters at DDR *
– Heavy: 1.2 kg for a 10 meter cable
– Bulky: blocks airflow, affects cooling
Copper cables limit HPC cluster design
* Depends on BER requirement

4 Intel® Connects Cables: What Are They?
Diagram, left to right: Electrical Socket, Optical Transceiver in Plug, Optical Cable (up to 100 m), Optical Transceiver in Plug, Electrical Socket
A Drop-In Replacement for Copper Cables
– Compatible: use existing copper sockets
– Reliable: no user-accessible optical interface
– Low cost: no separate optical transceivers
* Source: Intel internal testing

5 Intel® Connects Cables: High-Performance 20 Gbps Optical Cables
Longer, Faster, Lighter, Thinner, … and More Reliable

6 Intel® Connects Cables Technology
Highly Reliable: Real World Case Studies
Ohio State University
"One can now put switches and racks where they are needed, rather than within the typical 8 or 10-meter reach of traditional cables. This will let us significantly expand the size of InfiniBand clusters, and that is one of our primary goals."
-Dr. DK Panda, Head of the OSU Network-Based Computing Research Group
SCinet 07
"The Intel cables were a clear success in our deployment. Every cable worked as intended, and that helped both the OpenFabrics effort and SCinet deliver on our promises."
-Doug Fuller, Co-Team Lead, SCinet OpenFabrics Subcommittee, SC 07

7 Simple Cable Latency for Intel® Connects Cables
Optical/electrical (O/E) conversion = 0.275 nanoseconds at each end
Propagation delay of light through the fiber = 4.99 nanoseconds per meter
Latency of a 10 meter Intel Connects Cable (ns)
– First O/E conversion: 0.275
– Propagation through the fiber, 10 m * 4.99 ns/m: 49.9
– Second O/E conversion: 0.275
– Total: 50.45 ns
Latency of a 100 meter Intel Connects Cable (ns)
– First O/E conversion: 0.275
– Propagation through the fiber, 100 m * 4.99 ns/m: 499.0
– Second O/E conversion: 0.275
– Total: 499.55 ns
Note: 1 ns = 1 x 10^-9 sec
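The arithmetic on this slide can be reproduced with a short Python sketch. The two constants are the figures quoted on the slide; the function and constant names are illustrative and do not come from the presentation.

```python
# Figures quoted on the slide; the names are illustrative only.
OE_CONVERSION_NS = 0.275      # one optical/electrical conversion, per cable end
FIBER_DELAY_NS_PER_M = 4.99   # propagation delay through the fiber


def cable_latency_ns(length_m: float) -> float:
    """Simple cable latency: two O/E conversions plus fiber propagation."""
    return 2 * OE_CONVERSION_NS + length_m * FIBER_DELAY_NS_PER_M


if __name__ == "__main__":
    for length in (10, 100):
        print(f"{length:>3} m: {cable_latency_ns(length):.2f} ns")
```

For a 100 meter cable the sum is 0.275 + 499.0 + 0.275 = 499.55 ns, so the conversion overhead is about 0.1% of the total at that length.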

8 Source: DK Panda, OSU

9 Effective Latency
A function of:
– Simple cable latency
– Bit error rate
– The time required to find and fix those bit errors
  > Many things affect this:
    Other physical delays (e.g. passing through switches)
    Where the error is detected
    Whether the bit error is random or caused by a bad link
    Whether the data to be resent is in a buffer or has to be re-accessed from slower media
    The system and application tolerance for bit errors
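The slide lists the factors but gives no formula. The Python sketch below is one possible toy model, not the presenter's method: it assumes independent random bit errors, a fixed packet size, and a single fixed retransmission penalty that stands in for all the detection and resend delays listed above. The packet size and penalty values are hypothetical.

```python
import math


def packet_error_probability(ber: float, packet_bits: int) -> float:
    """Chance that at least one bit in a packet is corrupted.

    Assumes independent (random) bit errors, i.e. no bad link.
    """
    # 1 - (1 - ber)**packet_bits, computed stably for very small ber.
    return -math.expm1(packet_bits * math.log1p(-ber))


def effective_latency_ns(base_ns: float, ber: float,
                         packet_bits: int, retry_penalty_ns: float) -> float:
    """Base latency plus the expected cost of retransmissions (toy model)."""
    p = packet_error_probability(ber, packet_bits)
    expected_retries = p / (1.0 - p)  # mean of a geometric retry process
    return base_ns + expected_retries * retry_penalty_ns


if __name__ == "__main__":
    PACKET_BITS = 2048 * 8        # hypothetical 2 KB packet
    RETRY_PENALTY_NS = 10_000.0   # hypothetical cost to detect and resend
    for ber in (1e-12, 1e-15):
        latency = effective_latency_ns(50.45, ber, PACKET_BITS, RETRY_PENALTY_NS)
        print(f"BER {ber:.0e}: {latency:.6f} ns")
```

In this model the retransmission term scales almost linearly with BER, so how much it matters depends entirely on the assumed penalty, which is exactly the set of system-specific factors the slide enumerates.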

10 Intel® Connects Cables: Better Quality at 10 m and at 100 m
Superior signal quality from 1 to 100 meters
Figure panels: 1 Meter Intel® Connects Cable; 10 Meter Intel® Connects Cable; 100 Meter Intel® Connects Cable; 5 Meter 24 AWG Copper Cable
Source: Tektronix Lab Evaluation

11 Intel® Connects Cables: Actual Bit Error Rates May Even Be Lower *
Extremely low BER for high HPC compute fabric stability
Figure panels: 10 Meter Intel® Connects Cable **; 100 Meter Intel® Connects Cable **
Chart label: 10^-25
* Note: Specified BER for Intel® Connects Cables is 10^-15
** Source: Tektronix Lab Evaluation

12 Intel® Connects Cables: More Reliable: 10^-15 BER
Bit Error Rate at 20 Gbps per link

                                    10^-12 BER @ 20 Gbps    10^-15 BER @ 20 Gbps
Errors per day for a single link    1728                    1.7
Errors per day for 1000 links       1,728,000               1,728

Chart: BER/day for 1000 links, Intel® Connects Cable (1,728) vs. 10^-12 interconnects (1,728,000)
1000 times less BER than interconnects at 10^-12
*Source: Intel
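The errors-per-day figures follow from multiplying the bit error rate by the number of bits sent in a day. A minimal Python sketch, assuming each link runs continuously at the full 20 Gbps (the function name is illustrative):

```python
SECONDS_PER_DAY = 86_400


def errors_per_day(ber: float, rate_bps: float = 20e9, links: int = 1) -> float:
    """Expected bit errors per day, assuming links run at full rate all day."""
    return ber * rate_bps * SECONDS_PER_DAY * links


if __name__ == "__main__":
    for ber in (1e-12, 1e-15):
        single = errors_per_day(ber)
        fabric = errors_per_day(ber, links=1000)
        print(f"BER {ber:.0e}: {single:,.1f} per link, {fabric:,.0f} per 1000 links")
```

At 10^-12 this gives 1728 errors per link per day; at 10^-15 it gives 1.728, which the slide rounds to 1.7.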

13 Intel® Connects Cables Technology
Highly Reliable: Real World Case Studies
Naval Research Laboratory
"Almost all our reliability problems went away when we went with the Intel optical cables."
-Henry D. Dardy, Ph.D., Chief Scientist, Center for Computational Science, Naval Research Laboratory
Computational Research Labs
"We had to overcome significant reliability issues, but virtually all our reliability problems went away when we went with the Intel optical cables."
-Ashrut Ambastha, EKA Team Member, Computational Research Laboratories

14 Intel® Connects Cables
Enable Large Cluster Scale Out
– High data rate: 20 Gbps per cable *
– InfiniBand or 10 GbE: CX4 connector
– Long distance: up to 100 meters
– Low bit error rate: 10^-15 BER
– Low conversion latency: 550 picoseconds **
Reduced Installation, Maintenance
– Less weight: 84% lighter
– Less volume, better airflow: 83% smaller
– Smaller bend radius: 40% less
– Low electromagnetic interference
– No ground loops
High-Performance 20 Gbps Optical Cables
* Source: All claims based on Intel internal testing
** Per pair of connectors

15 Intel® Connects Cables Technology: 40 Gbps Ready
Shown with Mellanox ConnectX * 40 Gbps InfiniBand * HCA
* Other names and brands may be claimed as the property of their respective owners.

16 Intel® Connects Cables
Longer, Faster, Lighter, Thinner, … and More Reliable
For more information visit… www.intelconnects.com *

