We think you have liked this presentation. If you wish to download it, please recommend it to your friends in any social system. Share buttons are a little bit lower. Thank you!
Presentation is loading. Please wait.
Published byLinda Wooten
Modified over 2 years ago
RHEL6 tuning guide for mellanox ethernet card
2 © 2012 UniOne INC Co., Ltd. All Rights Reserved. 1.1 X3650 BIOS Setting. IBM X3650 M4 Performance Tuning General Power Profile/Operating Modes Max Performance Processor C-States Disabled Turbo mode Enabled /Performance Optimized Hyper-Threading Disabled CPU frequency select Max performance C-states limit : C. C 0 working. C, sleep time working. Disable,.
3 © 2012 UniOne INC Co., Ltd. All Rights Reserved. 1.2 X3650 BIOS Setting. IBM X3650 M4 Performance Tuning Memory Memory speed Max performance Memory channel mode Independent Socket Interleaving NUMA / Disabled Memory Node Interleaving OFF Patrol Scrubbing Disabled Demand Scrubbing Enabled Thermal Mode Performance
4 © 2012 UniOne INC Co., Ltd. All Rights Reserved. 1.3 RHEL6 OS tuning.(Networking) IBM X3650 M4 Performance Tuning Network Disable the TCP timestamps sysctl -w net.ipv4.tcp_timestamps=0 Disable the TCP selective acks sysctl -w net.ipv4.tcp_sack=0 processor input queues sysctl -w net.core.netdev_max_backlog= TCP buffer sizes using setsockopt(): sysctl -w net.core.rmem_max= sysctl -w net.core.wmem_max= sysctl -w net.core.rmem_default= sysctl -w net.core.wmem_default= sysctl -w net.core.optmem_max= Increase memory thresholds to prevent packet dropping sysctl -w net.ipv4.tcp_mem=" " auto-tuning of TCP buffer limits sysctl -w net.ipv4.tcp_rmem=" " sysctl -w net.ipv4.tcp_wmem=" " Low latency mode for TCP sysctl -w net.ipv4.tcp_timestamps=0 Rebooting /etc/sysctl.conf. ex) net.ipv4.tcp_timestamps = 0 #
5 © 2012 UniOne INC Co., Ltd. All Rights Reserved. 1.4 RHEL6 OS tuning.(Power management) IBM X3650 M4 Performance Tuning Check that the output CPU frequency for each core is equal to the maximum supported and that all core frequencies are consistent. Check the maximum supported CPU frequency using: #cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq Check that core frequencies are consistent using: #cat /proc/cpuinfo | grep "cpu MHz" Check that the output frequencies are the same as the maximum supported. If the CPU frequency is not at the maximum, check the BIOS settings according to table in Recommended BIOS Settings. Check the current CPU frequency to check whether echo performance is implemented using: #cat /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_cur_freq OS CPU frequency MAX. MAX, BIOS CPU frequency.
6 © 2012 UniOne INC Co., Ltd. All Rights Reserved. 1.5 RHEL6 OS tuning.(Kernel Idle Loop Tuning ) IBM X3650 M4 Performance Tuning The mlx4_en kernel module has an optional parameter that can tune the kernel idle loop for better latency. This will improve the CPU wakeup time but may result in higher power consumption. To tune the kernel idle loop, set the following options in /etc/modprobe.d/mlx4.conf: options mlx4_en enable_sys_tune=1 CPU wakeup time..
7 © 2012 UniOne INC Co., Ltd. All Rights Reserved. 1.6 RHEL6 OS tuning.(OS Controlled Power Management ) IBM X3650 M4 Performance Tuning Some operating systems can override BIOS power management configuration and enable c-states by default, which results in a higher latency. To resolve the high latency issue, please follow the instructions below: 1. Edit the /boot/grub/grub.conf file or any other bootloader configuration file. 2. Add the following kernel parameters to the bootloader command. intel_idle.max_cstate=0 processor.max_cstate=1 3. Reboot the system. Ex) title RH6.2x64 root (hd0,0) kernel /vmlinuz-RH6.2x el6.x86_64 root=UUID=817c207b-c0e8-4ed9-9c33-c589c0bb566f console=tty0 console=ttyS0,115200n8 rhgb intel_idle.max_cstate=0 processor.max_cstate=1 OS Power management setting BIOS setting., /boot/grub/grub.conf.
8 © 2012 UniOne INC Co., Ltd. All Rights Reserved. 1.6 RHEL6 OS tuning.(Interrupt Moderation ) IBM X3650 M4 Performance Tuning Interrupt moderation is used to decrease the frequency of network adapter interrupts to the CPU. Mellanox network adapters use an adaptive interrupt moderation algorithm by default. The algorithm checks the transmission (Tx) and receive (Rx) packet rates and modifies the Rx interrupt moderation settings accordingly. To manually set Tx and/or Rx interrupt moderation, use the ethtool utility. For example, the following commands first show the current (default) setting of interrupt moderation on the interface eth1, then turns off Rx interrupt moderation, and last shows the new setting. # ethtool -c eth1 Coalesce parameters for eth1: Adaptive RX: on TX: off... pkt-rate-low: pkt-rate-high: rx-usecs: 16 rx-frames: 88 rx-usecs-irq: 0 rx-frames-irq: 0...
9 © 2012 UniOne INC Co., Ltd. All Rights Reserved RHEL6 OS tuning.(Interrupt Moderation ) IBM X3650 M4 Performance Tuning ethtool -C eth1 adaptive-rx off rx-usecs 0 rx-frames 0 #ethtool -c eth1 Coalesce parameters for eth1: Adaptive RX: off TX: off... pkt-rate-low: pkt-rate-high: rx-usecs: 0 rx-frames: 0 rx-usecs-irq: 0 rx-frames-irq: 0... Interrupt Moderation CPU. TX / RX Interrupt Moderation ethtool.
10 © 2012 UniOne INC Co., Ltd. All Rights Reserved. 1.7 RHEL6 OS tuning.(Tuning for NUMA Architecture ) IBM X3650 M4 Performance Tuning Tuning for Intel® Microarchitecture Code name Sandy Bridge The Intel Sandy Bridge processor has an integrated PCI express controller. Thus every PCIe adapter OS is connected directly to a NUMA node. On a system with more than one NUMA node, performance will be better when using the local NUMA node to which the PCIe adapter is connected. In order to identify which NUMA node is the adapter's node the system BIOS should support the proper ACPI feature. to see if your system supports PCIe adapter's NUMA node detection run the following command: # cat /sys/devices/[PCI root]/[PCIe function]/numa_node Or # cat /sys/class/net/[interface]/device/numa_node Example for supported system: # cat /sys/devices/pci0000\:00/0000\:00\:05.0/numa_node 0 Example for unsupported system: # cat /sys/devices/pci0000\:00/0000\:00\:05.0/numa_node
11 © 2012 UniOne INC Co., Ltd. All Rights Reserved. 1.7 RHEL6 OS tuning.(IRQ Affinity ) IBM X3650 M4 Performance Tuning The affinity of an interrupt is defined as the set of processor cores that service that interrupt. To improve application scalability and latency, it is recommended to distribute interrupt requests (IRQs) between the available processor cores. To prevent the Linux IRQ balancer application from interfering with the interrupt affinity scheme, the IRQ balancer must be turned off. The following command turns off the IRQ balancer: > /etc/init.d/irqbalance stop The following command assigns the affinity of a single interrupt vector: > echo > /proc/irq/ /smp_affinity where bit i in indicates whether processor core i is in s affinity or not. Application scalability latency IRQ processor core. LINUX IRQ balancer Interrupt affinity scheme, IRQ balancer turn off.
12 © 2012 UniOne INC Co., Ltd. All Rights Reserved RHEL6 OS tuning.(IRQ Affinity Configuration ) IBM X3650 M4 Performance Tuning For Intel Sandy Bridge systems set the irq affinity to the adapter's NUMA node: For optimizing single-port traffic, run: # set_irq_affinity_bynode.sh For optimizing dual-port traffic, run: # set_irq_affinity_bynode.sh To show the current irq affinity settings, run: # show_irq_affinity.sh Mellanox (www.mellanox.com ).
Janusz Szuba European XFEL 10GE network tests with UDP.
© 2014 VMware Inc. All rights reserved My Slides from VMware vSphere: Optimize and Scale.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment Chapter 2: Managing Hardware Devices.
70-290: MCSE Guide to Managing a Microsoft Windows Server 2003 Environment, Enhanced Chapter 2: Managing Hardware Devices.
1 Setup and Compile Linux Kernel Speaker: Yi-Ji Jheng Date:
I/O Systems ◦ Operating Systems ◦ CS550. Note: Based on Operating Systems Concepts by Silberschatz, Galvin, and Gagne Strongly recommended to read.
Jim Chen ( 14/06/2011 Hytec Electronics Ltd.
Ethernet Over PCI Express Presented by Kallol Biswas NucleoDyne Systems, Inc Stevens Creek Blvd Cupertino, CA.
Digital Computer Fundamentals System Buses Mukesh N. Tekwani Mumbai, India
HPCA-10 Architectural Characterization of TCP/IP Processing on the Intel® Pentium® M Processor Srihari Makineni & Ravi Iyer Communications Technology Lab.
The Journey of a Packet Through the Linux Network Stack … plus hints on Lab 9.
Interrupt driven I/O. MIPS RISC Exception Mechanism The processor operates in The processor operates in user mode user mode kernel mode kernel mode Access.
Interrupt driven I/O Computer Organization and Assembly Language: Module 12.
Chapter 3 Presenter: Mojtaba Malekpour Professor: Dr. Seyed Vahid Azhari.
MCITP Guide to Microsoft Windows Server 2008 Server Administration (Exam #70-646) Chapter 3 Configuring the Windows Server 2008 Environment.
1 OS & Computer Architecture Modern OS Functionality (brief review) Architecture Basics Hardware Support for OS Features.
Module 3 Configuring Hardware on a Computer Running Windows XP Professional.
Floating Cloud Tiered Internet Architecture Current: Rochester Institute of Technology, Rensselaer Polytechnic Institute, University of Nevada, Reno Level.
Device Drivers. Linux Device Drivers Linux supports three types of hardware device: character, block and network –character devices: R/W without buffering.
Managing Hardware Devices Facilitator: Suleiman Mohammed(mcpn, mncs) Institute of Computing & ICT, Ahmadu Bello University, Zaria.
Anders Magnusson TCP Tuning and E2E Performance TREFpunkt - October 20, 2004.
Stephen Berard Program Manager Windows Platform Architecture Team.
Versatile Low Power Media Access for Wireless Sensor Networks Sarat Chandra Subramaniam.
1 Peripheral Component Interconnect (PCI). 2 PCI based System.
TCP Performance over IPv6 Yoshinori Kitatsuji KDDI R&D Laboratories, Inc.
Day 4 Understanding Hardware Partitions Linux Boot Sequence.
+ William Stallings Computer Organization and Architecture 9 th Edition.
Accessing I/O Devices I/O Device 1I/O Device 2 ProcessorMemory BUS.
Embedded Transport Acceleration Intel Xeon Processor as a Packet Processing Engine Abhishek Mitra Professor: Dr. Bhuyan.
Slide 1/38 Industrial Automation - Customer View - Services - Training PhW - CANopen_soft_setup_en 10/ 2003 CANopen CANopen Software setup with PL7 and.
Presenter : Cheng-Ta Wu Kenichiro Anjo, Member, IEEE, Atsushi Okamura, and Masato Motomura IEEE JOURNAL OF SOLID-STATE CIRCUITS, VOL. 39,NO. 5, MAY 2004.
Ladebug Kernel Debugging Tutorial Bob Lidral. Introduction Kinds of kernel debugging How to use Ladebug for kernel debugging Not how to debug a kernel.
NS Training Hardware. System Controller Module.
VSP1999 esxtop for Advanced Users Name, Title, Company.
Module 12 MXL DCB. Module Objectives: Describe DCB Describe each of the four DCB protocols Enable Tagging on the array using one of three methods Configure.
1 Last Class: Introduction Operating system = interface between user & architecture Importance of OS OS history: Change is only constant User-level Applications.
计算机系 信息处理实验室 Lecture 3 System Mechanisms (1)
Diagnostics. Module Objectives By the end of this module participants will be able to: Use diagnostic commands to troubleshoot and monitor performance.
Chapter 7 Input/Output Computer Organization and Architecture William Stallings 8th Edition.
Guide to Linux Installation and Administration, 2e1 Chapter 2 Planning Your System.
Quick guide to ASIMON configuration For version 3.0 or greater SAFETY AT WORK Date: 3/18/2009.
Intel Research & Development ETA: Experience with an IA processor as a Packet Processing Engine HP Labs Computer Systems Colloquium August 2003 Greg Regnier.
1 © 2003, Cisco Systems, Inc. All rights reserved. CCNA 3 v3.0 Module 8 Virtual LANs.
Copyright © 2006 by The McGraw-Hill Companies, Inc. All rights reserved. McGraw-Hill Technology Education Chapter 6A Operating System Basics PART I.
Hands-On Microsoft Windows Server 2008 Chapter 3 Configuring the Windows Server 2008 Environment.
Interrupts and Interrupt Handling David Ferry, Chris Gill CSE 522S - Advanced Operating Systems Washington University in St. Louis St. Louis, MO
January 10, Kits Workshop 1 Washington WASHINGTON UNIVERSITY IN ST LOUIS A Smart Port Card Tutorial --- Software John DeHart Washington University.
© 2017 SlidePlayer.com Inc. All rights reserved.