What’s needed to transmit? A look at the minimum steps required for programming our 82573L nic to send packets.

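Typical NIC hardware
[Diagram: the CPU and main memory (which holds the packet buffer) connect over the bus to the nic; the nic contains a TX FIFO, an RX FIFO and a transceiver attached to the LAN cable.]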

Quotation
"Many companies do an excellent job of providing information to help customers use their products... but in the end there's no substitute for real-life experiments: putting together the hardware, writing the program code, and watching what happens when the code executes. Then when the result isn't as expected -- as it often isn't -- it means trying something else or searching the documentation for clues."
-- Jan Axelson, author, Lakeview Research (1998)

Thanks, Intel! ☻
Intel Corporation has kindly posted details online for programming its family of gigabit Ethernet controllers, including our 82573L.

Our ‘nictx.c’ module
We’ve created an LKM which has minimal functionality: just enough to be sure we know how to ‘transmit’ a raw Ethernet packet.
We do this in a forward-looking way, so that our source-code can later be turned into a Linux character-mode device-driver (once we’ve also seen how to write code which allows our nic to ‘receive’ packets).

Access to PRO1000 registers
Device registers are hardware-mapped to a range of addresses in physical memory.
We obtain the location (and the length) of this memory-range from a BAR register in the nic device’s PCI Configuration Space.
Then we request the Linux kernel to set up an I/O ‘remapping’ of this memory-range to ‘virtual’ addresses within kernel-space.
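As a rough illustration of what "obtaining the location and length from a BAR" involves, here is a user-space sketch of the standard PCI decoding rules for a 32-bit memory BAR. The function names are hypothetical and are not taken from ‘nictx.c’; inside a real Linux driver one would normally let the kernel do this work (for example with pci_resource_start(), pci_resource_len() and ioremap()) rather than decode the BAR by hand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 0 of a BAR is 0 for a memory-space BAR and 1 for an I/O-space BAR. */
static inline bool bar_is_memory(uint32_t bar)
{
    return (bar & 0x1u) == 0;
}

/* In a memory BAR the low four bits are flag bits, not address bits. */
static inline uint32_t bar_mem_base(uint32_t bar)
{
    assert(bar_is_memory(bar));
    return bar & ~0xFu;
}

/*
 * Size probe: after software writes all-ones to a BAR, the value read
 * back ('probe') has zeros in every address bit the device does not
 * decode.  Inverting the address bits and adding one gives the length.
 */
static inline uint32_t bar_mem_size(uint32_t probe)
{
    return ~(probe & ~0xFu) + 1u;
}
```

For example, with a made-up BAR value of 0xFEBC0004, bar_mem_base() yields 0xFEBC0000, and a made-up probe read-back of 0xFFFE0000 corresponds to a 128KB register window.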

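Tx-Desc Ring-Buffer
Circular buffer (128 bytes minimum)
TDBA = base-address
TDLEN = length (in bytes)
TDH = head
TDT = tail
[Diagram: a ring of 16-byte descriptors at offsets 0x00, 0x10, 0x20, 0x30, 0x40, 0x50, 0x60 and 0x70 (ending at 0x80), shaded to distinguish the descriptors owned by hardware (nic) from those owned by software (cpu).]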
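The index arithmetic for such a ring can be sketched in a few lines of C. This is only an illustration with hypothetical helper names; it assumes Intel's usual convention that the descriptors from the head up to (but not including) the tail are the ones owned by the nic.

```c
#include <assert.h>
#include <stdint.h>

enum { DESC_SIZE = 16 };   /* each descriptor occupies 16 bytes */

/* Value to program into TDLEN: the ring's length in bytes. */
static inline uint32_t ring_bytes(uint32_t n_desc)
{
    return n_desc * DESC_SIZE;
}

/* Index that follows 'i', wrapping around at the end of the ring. */
static inline uint32_t ring_next(uint32_t i, uint32_t n_desc)
{
    return (i + 1) % n_desc;
}

/* Number of descriptors currently owned by hardware (head .. tail-1). */
static inline uint32_t ring_pending(uint32_t head, uint32_t tail,
                                    uint32_t n_desc)
{
    return (tail + n_desc - head) % n_desc;
}
```

With eight descriptors, ring_bytes(8) gives the 128-byte minimum shown on the slide, and head == tail means the hardware has nothing left to send.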

How ‘transmit’ works
[Diagram: in Random Access Memory, a List of Buffer-Descriptors (descriptor0, descriptor1, descriptor2, descriptor3) in which each descriptor points to its own buffer (Buffer0, Buffer1, Buffer2, Buffer3).]
We set up each data-packet that we want to be transmitted in a ‘Buffer’ area in ram.
We also create a list of buffer-descriptors and inform the NIC of its location and size.
Then, when ready, we tell the NIC to ‘Go!’ (i.e., start transmitting), but to let us know when these transmissions are ‘Done’.

Allocating kernel-memory
Our 82573L device-driver will need to use a segment of contiguous physical memory which is cache-aligned and non-pageable.
Such a memory-block can be allocated by using the kernel’s ‘kzalloc()’ function (and it can later be deallocated using ‘kfree()’).
You should use the ‘GFP_KERNEL’ flag (and we also used the ‘GFP_DMA’ flag).

NIC registers (for transmit)
    enum {
        E1000_CTRL   = 0x0000,  // Device Control
        E1000_STATUS = 0x0008,  // Device Status
        E1000_TCTL   = 0x0400,  // Transmit Control
        E1000_TDBAL  = 0x3800,  // Tx-Descriptor Base-Address Low
        E1000_TDBAH  = 0x3804,  // Tx-Descriptor Base-Address High
        E1000_TDLEN  = 0x3808,  // Tx-Descriptor queue Length
        E1000_TDH    = 0x3810,  // Tx-Descriptor Head
        E1000_TDT    = 0x3818,  // Tx-Descriptor Tail
        E1000_TXDCTL = 0x3828,  // Tx-Descriptor Control
        E1000_RA     = 0x5400,  // Receive-address Array
    };
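These offsets are byte-offsets from the start of the remapped register window. The sketch below shows one way they might be used; it substitutes an ordinary array (the hypothetical ‘sim_regs’) for the real device so that it can run in user space. In kernel code the accesses would instead go through the kernel's own MMIO helpers such as ioread32() and iowrite32().

```c
#include <assert.h>
#include <stdint.h>

enum {
    E1000_TDH = 0x3810,   /* Tx-Descriptor Head */
    E1000_TDT = 0x3818,   /* Tx-Descriptor Tail */
};

/* Hypothetical stand-in for the nic's register window (not real MMIO). */
enum { REG_SPACE_BYTES = 0x6000 };
static uint32_t sim_regs[REG_SPACE_BYTES / 4];

/* Read the 32-bit register located 'offset' bytes into the window. */
static inline uint32_t reg_read(volatile uint32_t *regs, uint32_t offset)
{
    assert(offset % 4 == 0 && offset < REG_SPACE_BYTES);
    return regs[offset / 4];
}

/* Write the 32-bit register located 'offset' bytes into the window. */
static inline void reg_write(volatile uint32_t *regs, uint32_t offset,
                             uint32_t value)
{
    assert(offset % 4 == 0 && offset < REG_SPACE_BYTES);
    regs[offset / 4] = value;
}
```

A call such as reg_write(sim_regs, E1000_TDT, 1) then models advancing the tail by one descriptor.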

Device Control (0x0000)
[Diagram: 32-bit register layout for the 82573L showing the fields listed below; the remaining bits are marked reserved (R).]
FD = Full-Duplex
GIOMD = GIO Master Disable
SLU = Set Link Up
SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
FRCSPD = Force Speed
FRCDPLX = Force Duplex
D/UD = Dock/Undock status
ADVD3WUC = Advertise Cold Wake Up Capability
RST = Device Reset
RFCE = Rx Flow-Control Enable
TFCE = Tx Flow-Control Enable
VME = VLAN Mode Enable
PHYRST = Phy Reset

Device Status (0x0008)
[Diagram: 32-bit register layout for the 82573L showing the fields listed below, together with ‘GIO Master EN’ and ‘Function ID’; bits marked ‘?’ carry the note "some undocumented functionality?"]
FD = Full-Duplex
LU = Link Up
TXOFF = Transmission Paused
SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
ASDV = Auto-negotiation Speed Detection Value
PHYRA = PHY Reset Asserted

Transmit Control (0x0400)
[Diagram: 32-bit register layout for the 82573L showing the fields listed below; COLD is drawn in two parts (upper 6 bits and lower 4 bits) and the remaining bits are marked reserved (R).]
EN = Transmit Enable
PSP = Pad Short Packets
CT = Collision Threshold (=0xF)
COLD = Collision Distance (=0x3F)
SWXOFF = Software XOFF Transmission
RTLC = Retransmit on Late Collision
UNORTX = Underrun No Re-Transmit
TXCSCMT = TxDescriptor Minimum Threshold
MULR = Multiple Request Support
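To show how a value for this register might be assembled, here is a small sketch using the slide's suggested CT (0xF) and COLD (0x3F) settings. The bit positions used (EN at bit 1, PSP at bit 3, CT in bits 11..4, COLD in bits 21..12) are our reading of Intel's documentation and should be checked against the manual; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed TCTL field positions; verify against Intel's manual. */
enum {
    TCTL_EN         = 1 << 1,   /* Transmit Enable     */
    TCTL_PSP        = 1 << 3,   /* Pad Short Packets   */
    TCTL_CT_SHIFT   = 4,        /* Collision Threshold */
    TCTL_COLD_SHIFT = 12,       /* Collision Distance  */
};

/* Build a TCTL value; 'enable' selects whether the EN bit is set. */
static inline uint32_t tctl_value(uint32_t ct, uint32_t cold, int enable)
{
    uint32_t v = TCTL_PSP;

    v |= (ct   & 0xFFu)  << TCTL_CT_SHIFT;    /* 8-bit field  */
    v |= (cold & 0x3FFu) << TCTL_COLD_SHIFT;  /* 10-bit field */
    if (enable)
        v |= TCTL_EN;
    return v;
}
```

Keeping EN out of the first write lets the driver configure the register early and enable the Transmit Engine only after the descriptor queue is ready.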

Tx-Descriptor Control (0x3828)
[Diagram: 32-bit register layout showing the fields GRAN, WTHRESH (Writeback Threshold), HTHRESH (Host Threshold) and PTHRESH (Prefetch Threshold); the remaining bits are shown as 0.]
Recommended for 82573: 0x (GRAN=1, WTHRESH=1)
“This register controls the fetching and write back of transmit descriptors. The three threshold values are used to determine when descriptors are read from, and written to, host memory. Their values can be in units of cache lines or of descriptors (each descriptor is 16 bytes), based on the value of the GRAN bit (0=cache lines, 1=descriptors). When GRAN = 1, all descriptors are written back (even if not requested).” --Intel manual

An observation
We notice that the 82573L device retains the values in many of its internal registers.
This fact reduces the programming steps that will be required to operate our nic on the anchor cluster machines, since Intel’s own Linux device driver (‘e1000e.ko’) has already initialized many nic registers.
But we MAY need to bring ‘eth1’ down!

Using ‘/sbin/ifconfig’
You can use the ‘/sbin/ifconfig’ command to find out whether the ‘eth1’ interface has been brought ‘down’:
    $ /sbin/ifconfig eth1
If it is still operating, you can turn it off with the (privileged) command:
    $ sudo /sbin/ifconfig eth1 down

Programming steps
1) Detect the presence of the 82573L network controller (VENDOR_ID, DEVICE_ID)
2) Obtain the physical address-range where the nic’s device-registers are mapped
3) Ask the kernel to map this address range into the kernel’s virtual address-space
4) Copy the network controller’s MAC-address into a 6-byte array for future access
5) Allocate a block of kernel memory large enough for our descriptors and buffers
6) Ensure that the network controller’s ‘Bus Master’ capability has been enabled
7) Select our desired configuration-options for the DEVICE CONTROL register
8) Perform a nic ‘reset’ operation (by toggling bit 26), then delay until reset completes
9) Select our desired configuration-options for the TRANSMIT CONTROL register
10) Initialize our array of Transmit Descriptors with the physical addresses of buffers
11) Initialize the Transmit Engine’s registers (for Tx-Descriptor Queue and Control)
12) Set up the buffer-contents for an Ethernet packet we want to be transmitted
13) Enable the Transmit Engine
14) Give ‘ownership’ of a Tx-Descriptor to the network controller
15) Install our ‘/proc/nictx’ pseudo-file (for user-diagnostic purposes)
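A few of these steps come down to simple tests and bit operations, which the following user-space sketch tries to illustrate for steps 1, 6 and 8. It makes several assumptions that should be verified: that the 82573L reports vendor-ID 0x8086 and device-ID 0x109A, that Bus Master Enable is bit 2 of the PCI Command register (as in the PCI specification), and that the helper names are ours rather than those in ‘nictx.c’.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum {
    VENDOR_ID = 0x8086,          /* Intel Corporation (assumed)         */
    DEVICE_ID = 0x109A,          /* 82573L controller (assumed)         */
    PCI_CMD_BUS_MASTER = 1 << 2, /* PCI Command register: Bus Master    */
    CTRL_RST = 1 << 26,          /* Device Control: Device Reset bit 26 */
};

/* Step 1: do the IDs read from PCI Configuration Space match our nic? */
static inline bool is_82573l(uint16_t vendor, uint16_t device)
{
    return vendor == VENDOR_ID && device == DEVICE_ID;
}

/* Step 6: return the PCI Command value with Bus Master turned on. */
static inline uint16_t enable_bus_master(uint16_t pci_command)
{
    return (uint16_t)(pci_command | PCI_CMD_BUS_MASTER);
}

/* Step 8: return the Device Control value with the reset bit set. */
static inline uint32_t ctrl_with_reset(uint32_t ctrl)
{
    return ctrl | (uint32_t)CTRL_RST;
}
```

In a driver, the value from ctrl_with_reset() would be written to the Device Control register, followed by a delay until the reset completes.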

Legacy Tx-Descriptor Layout
Offset   Contents (bit 31 down to bit 0)
0x0      Buffer-Address low (bits 31..0)
0x4      Buffer-Address high (bits 63..32)
0x8      CMD | CSO | Packet Length (in bytes)
0xC      special | CSS | reserved (=0) | status
Buffer-Address = the packet-buffer’s 64-bit address in physical memory
Packet-Length = number of bytes in the data-packet to be transmitted
CMD = Command-field
CSO/CSS = Checksum Offset/Start (in bytes)
STA = Status-field

Suggested C syntax
    typedef struct {
        unsigned long long  base_address;
        unsigned short      packet_length;
        unsigned char       cksum_offset;
        unsigned char       desc_command;
        unsigned char       desc_status;
        unsigned char       cksum_origin;
        unsigned short      special_info;
    } TX_DESCRIPTOR;
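Because the nic reads these fields at fixed byte-offsets, it can be worth letting the compiler confirm the structure's layout. The sketch below restates the structure with fixed-width types (our substitution, not the slide's) and adds a hypothetical ‘init_tx_ring()’ helper that fills in the buffer addresses, assuming equal-sized buffers laid out back to back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint64_t base_address;   /* packet-buffer's physical address */
    uint16_t packet_length;  /* bytes to transmit                */
    uint8_t  cksum_offset;   /* CSO                              */
    uint8_t  desc_command;   /* CMD                              */
    uint8_t  desc_status;    /* STA                              */
    uint8_t  cksum_origin;   /* CSS                              */
    uint16_t special_info;
} TX_DESCRIPTOR;

/* Compile-time checks that the layout matches a 16-byte descriptor. */
_Static_assert(sizeof(TX_DESCRIPTOR) == 16, "descriptor must be 16 bytes");
_Static_assert(offsetof(TX_DESCRIPTOR, packet_length) == 8, "length at 0x8");
_Static_assert(offsetof(TX_DESCRIPTOR, desc_status) == 12, "status at 0xC");

/* Zero the ring, then point descriptor i at the i-th buffer. */
static inline void init_tx_ring(TX_DESCRIPTOR *ring, unsigned n_desc,
                                uint64_t buffers_phys, uint32_t buf_size)
{
    memset(ring, 0, n_desc * sizeof ring[0]);
    for (unsigned i = 0; i < n_desc; i++)
        ring[i].base_address = buffers_phys + (uint64_t)i * buf_size;
}
```

The address and buffer-size passed to init_tx_ring() would come from the driver's own memory allocation; any values used to try the helper out are made up.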

TxDesc Command-field
Bit layout (bit 7 down to bit 0): IDE | VLE | DEXT | reserved (=0) | RS | IC | IFCS | EOP
EOP = End Of Packet (1=yes, 0=no)
IFCS = Insert Frame CheckSum (1=yes, 0=no), provided EOP is set
IC = Insert CheckSum (1=yes, 0=no) as indicated by CSO/CSS fields
RS = Report Status (1=yes, 0=no)
DEXT = Descriptor Extension (1=yes, 0=no); use ‘0’ for Legacy-Mode
VLE = VLAN-Packet Enable (1=yes, 0=no), provided EOP is set
IDE = Interrupt-Delay Enable (1=yes, 0=no)

TxDesc Status field
Bit layout (high bits down to bit 0): reserved (=0) | LC | EC | DD
DD = Descriptor Done: this bit is written back after the NIC processes the descriptor, provided the descriptor’s RS-bit was set (i.e., Report Status).
EC = Excess Collisions: indicates that the packet has experienced more than the maximum number of excessive collisions (as defined by the TCTL.CT field) and therefore was not transmitted. (This bit is meaningful only in HALF-DUPLEX mode.)
LC = Late Collision: indicates that a Late Collision has occurred while operating in HALF-DUPLEX mode. Note that the collision window size is dependent on the SPEED: 64 bytes for 10/100-Mbps, or 512 bytes for 1000-Mbps.

Bit-mask definitions
    enum {
        DD   = (1<<0),  // Descriptor Done
        EC   = (1<<1),  // Excess Collisions
        LC   = (1<<2),  // Late Collision
        EOP  = (1<<0),  // End Of Packet
        IFCS = (1<<1),  // Insert Frame CheckSum
        IC   = (1<<2),  // Insert CheckSum as per CSO/CSS
        RS   = (1<<3),  // Report Status
        DEXT = (1<<5),  // Descriptor Extension
        VLE  = (1<<6),  // VLAN packet
        IDE  = (1<<7)   // Interrupt-Delay Enable
    };
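One plausible way to use these masks for a single-descriptor packet is sketched here: set EOP, IFCS and RS in the command-field, clear the status-field, and later test the DD bit. The helper names are hypothetical, and the choice of command bits is our suggestion rather than a quotation from ‘nictx.c’.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { DD = 1 << 0, EC = 1 << 1, LC = 1 << 2 };                /* status  */
enum { EOP = 1 << 0, IFCS = 1 << 1, IC = 1 << 2, RS = 1 << 3 }; /* command */

typedef struct {
    uint64_t base_address;
    uint16_t packet_length;
    uint8_t  cksum_offset;
    uint8_t  desc_command;
    uint8_t  desc_status;
    uint8_t  cksum_origin;
    uint16_t special_info;
} TX_DESCRIPTOR;

/* Mark a descriptor as one complete packet of 'len' bytes. */
static inline void prepare_single_packet(TX_DESCRIPTOR *d, uint16_t len)
{
    d->packet_length = len;
    d->desc_status   = 0;                 /* so a stale DD is not seen  */
    d->desc_command  = EOP | IFCS | RS;   /* RS requests DD write-back  */
}

/* Has the nic reported that it finished with this descriptor? */
static inline bool tx_done(const volatile TX_DESCRIPTOR *d)
{
    return (d->desc_status & DD) != 0;
}
```

Without RS in the command-field the nic is not obliged to write DD back, so a loop that polls tx_done() might never terminate.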

Ethernet packet layout
destination MAC address (6 bytes)
source MAC address (6 bytes)
Type/length (2 bytes)
the packet’s data ‘payload’ goes here (usually varies from 46 to 1500 bytes)
Cyclic Redundancy Checksum (4 bytes)
Total size normally can vary from 64 bytes up to 1518 bytes (unless ‘jumbo’ packets and/or ‘undersized’ packets are enabled).
The NIC expects a 14-byte packet ‘header’ and it appends a 4-byte CRC check-sum.
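The following sketch shows how a transmit buffer might be filled in with this layout. It is an illustration only: the function name is hypothetical, and it pads short packets in software up to 60 bytes (the 64-byte minimum less the 4-byte CRC that the NIC appends), which is an alternative to relying on the PSP bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MAC_LEN = 6, HDR_LEN = 14, MIN_LEN_NO_CRC = 60, MAX_PAYLOAD = 1500 };

/*
 * Write header and payload into 'buf' (which must hold at least
 * HDR_LEN + MAX_PAYLOAD bytes) and return the number of bytes to hand
 * to the nic, or 0 if the payload is too large.  The CRC is omitted
 * because the nic appends it.
 */
static inline size_t build_frame(uint8_t *buf,
                                 const uint8_t dst[MAC_LEN],
                                 const uint8_t src[MAC_LEN],
                                 uint16_t type_or_length,
                                 const void *payload, size_t payload_len)
{
    size_t len;

    if (payload_len > MAX_PAYLOAD)
        return 0;
    memcpy(buf, dst, MAC_LEN);                  /* destination MAC   */
    memcpy(buf + MAC_LEN, src, MAC_LEN);        /* source MAC        */
    buf[12] = (uint8_t)(type_or_length >> 8);   /* network byte-order */
    buf[13] = (uint8_t)(type_or_length & 0xFF);
    memcpy(buf + HDR_LEN, payload, payload_len);

    len = HDR_LEN + payload_len;
    while (len < MIN_LEN_NO_CRC)                /* zero-pad short packets */
        buf[len++] = 0;
    return len;
}
```

The returned length is the value that would go into the descriptor's packet-length field.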

In-class exercises
Modify the code in our ‘nictx.c’ module so that it will transmit more than just one raw packet when you install it into the kernel.
Can you also modify the ‘module_exit()’ function so that it will transmit a packet before it disables the ‘Transmit Engine’?