DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 1 Improving Cluster Performance Performance Evaluation of Networks

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 2 Improving Cluster Performance Service Offloading
Larger clusters may need special-purpose node(s) to run services (e.g. NFS, DNS, login, compilation) to prevent slowdown due to contention.
In a cluster, demands on a single NFS server may be higher than usual due to the intensity and frequency of client access.
Some services can be split easily, e.g. NFS; others, which require a synchronized centralized repository, cannot be split.
NFS also has a scalability problem when a single server receives demands from many nodes; PVFS tries to rectify this problem.

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 3 Multiple Networks/Channel Bonding
Multiple networks: separate networks for NFS, message passing, cluster management, etc.
–Application message passing is the most sensitive to contention, so it is usually the first to be separated out.
–Adding a special high-speed LAN may double the cost.
Channel bonding: bind multiple channels together to create one virtual channel.
–Drawbacks: switches must support bonding, or separate switches must be bought; configuration is more complex.
–See the Linux Ethernet Bonding Driver mini-howto and the configuration sketch below.
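As a rough illustration of the extra configuration channel bonding requires, here is a minimal sketch of a two-NIC bonded interface in the spirit of the Linux Ethernet Bonding Driver mini-howto. The interface names, IP address, bonding mode, and monitoring interval are all assumptions for the example, and the details vary with kernel and distribution.

    # Load the bonding driver (balance-rr mode and 100 ms link monitoring are example values)
    modprobe bonding mode=balance-rr miimon=100

    # Create the virtual interface and enslave the two physical NICs (names are made up)
    ifconfig bond0 192.168.1.10 netmask 255.255.255.0 up
    ifenslave bond0 eth0 eth1

    # Check which slaves are active
    cat /proc/net/bonding/bond0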

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 4 Jumbo Frames
The standard Ethernet frame is 1518 bytes (MTU 1500).
With Gigabit Ethernet there is controversy over the MTU:
–We want to reduce the load on the computer, i.e. the number of interrupts.
–One way is to increase the frame size to 9000 bytes (Jumbo Frames).
–This is still small enough not to compromise error detection.
–Both the NIC and the switch need to support it; switches that do not will drop the packets as oversized frames.
Configuring eth0 for Jumbo Frames: ifconfig eth0 mtu 9000 up
If we want to set it at boot, put it in the startup scripts.
–Or on RH9 put MTU=9000 in /etc/sysconfig/network-scripts/ifcfg-eth0
More on performance later.
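One hedged way to check that jumbo frames actually work end to end is to send a maximum-size ping with fragmentation forbidden. The 8972-byte payload below is 9000 minus the 28 bytes of IP and ICMP headers, and the peer host name is made up.

    # Confirm the interface MTU
    ifconfig eth0 | grep -i mtu

    # Send unfragmentable 9000-byte frames to a hypothetical peer "node2";
    # if any NIC or switch on the path lacks jumbo support, these pings fail.
    ping -M do -s 8972 -c 3 node2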

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 5 Interrupt Coalescing
Another way to reduce the number of interrupts.
Receiver: delay the interrupt until
–a specific number of packets has been received, or
–a specific time has elapsed since the first packet after the last interrupt.
NICs that support coalescing often have tunable parameters (see the sketch below).
Take care not to make the values too large:
–Sender: send descriptors could be depleted, causing a stall.
–Receiver: depleted descriptors cause packet drops and, for TCP, retransmissions. Too many retransmissions cause TCP to apply congestion control, reducing effective bandwidth.
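On Linux these tunables are usually exposed through ethtool; the sketch below is only an illustration, since whether rx-usecs and rx-frames are supported, and what values make sense, depends entirely on the NIC and driver.

    # Show the current coalescing settings for eth0
    ethtool -c eth0

    # Example (assumed values): interrupt after 64 received frames or 100 us,
    # whichever comes first
    ethtool -C eth0 rx-frames 64 rx-usecs 100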

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 6 Interrupt Coalescing (ctd.)
Even when the values are not too large, increasing them has complicated effects:
–Interrupts, and thus CPU overhead, are reduced; if the CPU was interrupt-saturated, this may improve bandwidth.
–The delay increases latency, which is negative for latency-sensitive applications.

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 7 Socket Buffers
For TCP, the send socket buffer size determines the maximum window size (the amount of unacknowledged data “in the pipe”).
–Increasing it may improve performance but consumes shared resources, possibly depriving other connections, so it needs to be tuned carefully.
The bandwidth-delay product gives a lower limit.
–The delay is the Round Trip Time (RTT): the time for the sender to send a packet, the receiver to receive it and ACK, and the sender to receive the ACK.
–It is often estimated using ping (although ping does not use TCP and doesn't have its overhead!!).
–It is better to use a packet of MTU size; for Linux this means specifying a ping data size such that data plus ICMP and IP headers = 1500 bytes.
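A minimal worked example of the bandwidth-delay product, using assumed figures (a 1 Gb/s link with a 100 us LAN round trip and, for contrast, a 50 ms WAN round trip) rather than anything measured in these slides:

    # BDP = bandwidth (bytes/s) x RTT (s); 1 Gb/s is 125,000,000 bytes/s
    echo "125000000 * 0.0001" | bc   # 100 us LAN RTT -> 12500 bytes (about 12.5 KB)
    echo "125000000 * 0.050" | bc    # 50 ms WAN RTT  -> 6250000 bytes (about 6.25 MB)

The socket buffers need to be at least this large before the link, rather than the window size, becomes the bottleneck.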

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 8 Socket Buffers (ctd.)
The receive socket buffer determines the amount that can be buffered awaiting consumption by the application.
–If it is exhausted, the sender is notified to stop sending.
–It should be at least as big as the send socket buffer.
The bandwidth-delay product gives a lower bound.
–Other factors impact the size that gives the best performance: hardware, software layers, application characteristics.
–Some applications allow tuning in the application.
System-level tools allow testing of performance.
–ipipe, netpipe (more later)

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 9 Setting Default Socket Buffer Size
/proc file system
–/proc/sys/net/core/wmem_default send size
–/proc/sys/net/core/rmem_default receive size
The defaults can be seen by cat-ing these files.
They can be set by e.g. echo <size> > /proc/sys/net/core/wmem_default
The sysadmin can also determine the maximum buffer sizes that users can set in
–/proc/sys/net/core/wmem_max
–/proc/sys/net/core/rmem_max
–These should be at least as large as the defaults!!
They can be set at boot time by adding the commands to /etc/rc.d/rc.local (see the sketch below).
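A hedged sketch of what such an rc.local fragment might look like; the 512 KB figure is an arbitrary example value, not one recommended in these slides.

    # Example lines for /etc/rc.d/rc.local (assumed value: 512 KB buffers)
    echo 524288 > /proc/sys/net/core/wmem_max
    echo 524288 > /proc/sys/net/core/rmem_max
    echo 524288 > /proc/sys/net/core/wmem_default
    echo 524288 > /proc/sys/net/core/rmem_default

The same settings can also be expressed as sysctl keys (net.core.wmem_max, net.core.rmem_max, and so on) in /etc/sysctl.conf.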

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 10 NetPIPE - NETwork Protocol Independent Performance Evaluator
Performs simple ping-pong tests, bouncing messages of increasing size between two processes.
Message sizes are chosen at regular intervals, and with slight perturbations, to provide a complete test of the communication system.
Each data point involves many ping-pong tests to provide an accurate timing.
Latencies are calculated by dividing the round-trip time in half for small messages (< 64 bytes).
NetPIPE was originally developed at the SCL by Quinn Snell, Armin Mikler, John Gustafson, and Guy Helmer.
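For reference, a typical TCP run of NetPIPE looks roughly like the sketch below. The host name is made up, and apart from -h (the remote host) the behaviour described in the comments is an assumption about the NetPIPE version in use, so check the program's own usage message.

    # On the receiving node, start the TCP module of NetPIPE:
    NPtcp

    # On the sending node, point it at the receiver ("node1" is a made-up host name):
    NPtcp -h node1

    # Each run writes message size, throughput, and round-trip time to an output
    # file (typically np.out), which can then be plotted, e.g. with gnuplot.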

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 11 Netpipe Protocols & Platforms

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 12 Performance Comparison of LAM/MPI, MPICH, and MVICH on a Cluster Connected by a Gigabit Ethernet Network
Hong Ong and Paul A. Farrell, Dept. of Mathematics and Computer Science, Kent, Ohio
Atlanta Linux Showcase Extreme Linux 2000, 10/12/00 – 10/14/00

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 13 Testing Environment - Hardware
Two 450MHz Pentium III PCs.
–100MHz memory bus.
–256MB of PC100 SD-RAM.
–Back-to-back connection via Gigabit NICs.
–Installed in the 32bit/33MHz PCI slot.
Gigabit Ethernet NICs:
–Packet Engines GNIC-II (hamachi v0.07).
–Alteon ACEnic (acenic v0.45).
–SysKonnect SK-NET (sk98lin v3.01).

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 14 Testing Environment - Software
Operating system:
–Red Hat 6.1 Linux distribution.
–Kernel version
Communication interface:
–LAM/MPI v6.3.
–MPICH v
–M-VIA v0.01.
–MVICH v0.02.
Benchmarking tool:
–NetPIPE v2.3.

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 15 TCP/IP Performance - Throughput

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 16 TCP/IP Performance - Latency

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 17 Gigabit Over Copper Evaluation DRAFT Prepared by Anthony Betz April 2, 2002 University of Northern Iowa Department of Computer Science.

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 18 Testing Environment
Twin server-class Athlon systems with 266MHz FSB from QLILinux Computer Systems
–Tyan S2466N motherboard
–AMD 1500MP
–2x64-bit 66/33MHz jumper-able PCI slots
–4x32-bit PCI slots
–512MB DDR RAM
–Kernel
–RedHat 7.2
Twin desktop-class Dell Optiplex Pentium-class systems
–Pentium III 500 MHz
–128MB RAM
–5x32-bit PCI slots
–3x16-bit ISA slots

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 19 Cards Tested
D-Link DGE-500T (32-bit) $45
–SMC's dp83820 chipset, driver ns83820 in kernel
ARK Soho-GA2500T (32-bit) $44
ARK Soho-GA2000T $69
Asante Giganix $138
–Same as D-Link except dp83821 chipset
Syskonnect SK9821 $570
–driver used was sk98lin from the kernel source
3Com 3c996BT $138
–driver bcm5700, version , as supplied by 3Com
Intel Pro 1000 XT $169
–Designed for PCI-X; Intel's e1000 module, version
Syskonnect SK9D2 $228

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 20 D-Link DGE-500T

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 21 Ark Soho-GA2000T

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 22 Ark Soho-GA2500T

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 23 Asante Giganix

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 24 3Com 3c996BT

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 25 Intel E1000 XT

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 26 Syskonnect SK9821

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 27 32bit 33MHz

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 28 64bit/33MHz MTU 1500

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 29 64bit/66MHz MTU 1500

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 30 64bit/33MHz MTU 6000

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 31 64bit/66MHz MTU 6000

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 32 64bit/33MHz MTU 9000

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 33 64bit/66MHz MTU 9000

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 34 Cost per Mbps 32bit/33MHz

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 35 Cost per Mbps 64bit/33MHz

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 36 Cost per Mbps 64bit/66MHz

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 37 Integrating New Capabilities into NetPIPE Dave Turner, Adam Oline, Xuehua Chen, and Troy Benjegerdes Scalable Computing Laboratory of Ames Laboratory This work was funded by the MICS office of the US Department of Energy

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 38 Recent additions to NetPIPE
Can do an integrity test instead of measuring performance.
Streaming mode measures performance in one direction only.
–Must reset sockets to avoid effects from a collapsing window size.
A bi-directional ping-pong mode has been added (-2); a command sketch follows below.
One-sided Get and Put calls can be measured (MPI or SHMEM).
–Can choose whether to use an intervening MPI_Fence call to synchronize.
Messages can be bounced between the same buffers (default mode), or they can be started from a different area of memory each time.
–There are lots of cache effects in SMP message-passing.
–InfiniBand can show similar effects since memory must be registered with the card.
(Diagram: two processes exchanging messages.)
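A hedged sketch of how these modes might be invoked over TCP, reusing the two-node setup shown earlier. The -2 flag is named on the slide; treating -s as the streaming switch and -i as the integrity check is an assumption about this NetPIPE version, so verify against NPtcp's usage message.

    NPtcp -h node1 -2      # bi-directional ping-pong mode ("node1" is the made-up receiver)
    NPtcp -h node1 -s      # streaming, one-directional mode; -s is an assumed flag name
    NPtcp -h node1 -i      # integrity test instead of timing; -i is an assumed flag name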

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 39 Performance on Mellanox InfiniBand cards
A new NetPIPE module allows us to measure the raw performance across InfiniBand hardware (RDMA and Send/Recv).
Burst mode preposts all receives to duplicate the Mellanox test.
The no-cache performance is much lower when the memory has to be registered with the card.
An MP_Lite InfiniBand module will be incorporated into LAM/MPI.
(Graph legend: MVAPICH 0.9.1.)

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 40 10 Gigabit Ethernet
Intel 10 Gigabit Ethernet cards
–133 MHz PCI-X bus
–Single-mode fiber
–Intel ixgb driver
Can only achieve 2 Gbps now.
Latency is 75 us.
Streaming mode delivers up to 3 Gbps.
Much more development work is needed.

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 41 Channel-bonding Gigabit Ethernet for better communications between nodes
Channel-bonding uses 2 or more Gigabit Ethernet cards per PC to increase the communication rate between nodes in a cluster.
GigE cards cost ~$40 each and 24-port switches cost ~$1400, which works out to roughly $100 per computer.
This is much more cost-effective for PC clusters than using more expensive networking hardware, and may deliver similar performance.

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 42 Performance for channel-bonded Gigabit Ethernet
Channel-bonding multiple GigE cards using MP_Lite and Linux kernel bonding.
GigE can deliver 900 Mbps with latencies of us for PCs with 64-bit / 66 MHz PCI slots.
Channel-bonding 2 GigE cards per PC using MP_Lite doubles the performance for large messages; adding a 3rd card does not help much.
Channel-bonding 2 GigE cards per PC using Linux kernel-level bonding actually results in poorer performance.
The same tricks that make channel-bonding successful in MP_Lite should make Linux kernel bonding work even better; any message-passing system could then make use of channel-bonding on Linux systems.

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 43 Channel-bonding in MP_Lite
(Diagram: the application on node 0 passes messages through MP_Lite in user space into large socket buffers; each stream then goes through its own TCP/IP stack, dev_q_xmit, device driver, device queue, and DMA to its own GigE card.)
Flow control may stop a given stream at several places.
With MP_Lite channel-bonding, each stream is independent of the others.

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 44 Linux kernel channel-bonding
(Diagram: the application on node 0 sends through a single large socket buffer and one TCP/IP stack; bonding.c then splits the traffic across the two device queues, DMA engines, and GigE cards.)
A full device queue will stop the flow at bonding.c to both device queues.
Flow control on the destination node may stop the flow out of the socket buffer.
In both of these cases, problems with one stream can affect both streams.

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 45 Comparison of high-speed interconnects
InfiniBand can deliver Mbps at a 7.5 us latency.
Atoll delivers 1890 Mbps with a 4.7 us latency.
SCI delivers 1840 Mbps with only a 4.2 us latency.
Myrinet performance reaches 1820 Mbps with an 8 us latency.
Channel-bonded GigE offers 1800 Mbps for very large messages.
Gigabit Ethernet delivers 900 Mbps with a us latency.
10 GigE only delivers 2 Gbps with a 75 us latency.

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 46 Recent tests at Kent
Dell Optiplex GX260 - Fedora 10.0
–Dell motherboard
–Built-in GE: Intel 82540EM (rev 02)
–PCI revision 2.2, 32-bit, 33/66 MHz; Linux e1000 driver
–SysKonnect SK9821 NIC
RocketCalc Xeon 2.4 GHz
–SuperMicro X5DAL-G motherboard
–Intel 82546EB built-in GE on 133MHz PCI-X bus
–Linux e1000 ver
RocketCalc Opteron
–Tyan Thunder K8W motherboard
–Broadcom BCM5703C built-in GE on PCI-X Bridge A (64bit?)
–tg3 driver in kernel

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 47 Dell Optiplex GX260 – builtin

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 48 Dell Optiplex GX260 - SysKonnect

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 49 Xeon

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 50 Xeon

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 51 Opteron

DiSCoV Fall 2008 Paul A. Farrell Cluster Computing 52 Summary