1 Spring Camp NetFPGA Spring Camp Day 1 Presented by: Andrew W. Moore with Marek Michalski, Neelakandan Manihatty-Bojan, Gianni Antichi Georgina Kalogeridou, Jong Hun Han, Noa Zilberman aided by Yury Audzevich, Dimosthenis Pediaditakis Poznan University of Technology May 20 – 24, 2013

2 Spring Camp Tutorial Outline Background –Introduction –The NetFPGA Platform The Base Reference Router –Motivation: Basic IP review –Example: Reference Router running on the NetFPGA Infrastructure –Tree –Build System –Scripts The Life of a Packet Through the NetFPGA –Hardware Datapath –Interface to software: Exceptions and Host I/O Implementation –Module Template –Write Crypto NIC using a static key Simulation and Debug –Write and Run Simulations for Crypto NIC Concluding Remarks

3 Spring Camp Section I: Motivation

4 Spring Camp NetFPGA = Networked FPGA A line-rate, flexible, open networking platform for teaching and research

5 Spring Camp NetFPGA 1G Board NetFPGA consists of… Four elements: NetFPGA board Tools + reference designs Contributed projects Community NetFPGA 10G Board

6 Spring Camp NetFPGA-10G A major upgrade over the 1Gb/s predecessor State-of-the-art technology

7 Spring Camp Gigabit Ethernet 4 SFP+ Cages AEL2005 PHY 10G Support –Direct Attach Copper –10GBASE-R Optical Fiber 1G Support –1000BASE-T Copper –1000BASE-X Optical Fiber

8 Spring Camp Others QDRII-SRAM –27MB –Storing routing tables, counters and statistics RLDRAM-II –288MB –Packet Buffering PCI Express x8 –PC Interface Expansion Slot

9 Spring Camp Xilinx Virtex 5 TX240T Optimized for ultra high-bandwidth applications 48 GTX Transceivers 4 hard Tri-mode Ethernet MACs 1 hard PCI Express Endpoint

10 Spring Camp NetFPGA Board Comparison
NetFPGA 1G | NetFPGA 10G
4 x 1Gbps Ethernet Ports | 4 x 10Gbps Ethernet Ports
4.5 MB ZBT SRAM, 64 MB DDR2 SDRAM | 27 MB QDRII-SRAM, 288 MB RLDRAM-II
PCI | PCI Express x8
Virtex II-Pro 50 | Virtex 5 TX240T

11 Spring Camp Beyond Hardware NetFPGA-10G Board Xilinx EDK based IDE Reference designs with ARM AXI4 Software (embedded and PC) Public Repository Public Wiki Reference Designs AXI4 IPs Xilinx EDK MicroBlaze SW PC SW GitHub, User Community

12 Spring Camp FPGA Memory 10GE NetFPGA board PCIe CPU Memory PC with NetFPGA Networking Software running on a standard PC A hardware accelerator built with a Field Programmable Gate Array driving 10 Gigabit network links

13 Spring Camp Tools + Reference Designs Tools: Compile designs Verify designs Interact with hardware Reference designs: Router (HW) Switch (HW) Network Interface Card (HW) Router Kit (SW) SCONE (SW)

14 Spring Camp Contributed Projects
Project | Contributor
OpenFlow switch | Stanford University
IPv4 Router | Pisa & Cambridge Uni.
RLDRAM I/F | Cambridge University
ARP Reply | Stanford University
Encap & DCAP | Stanford University
Learning CAM Switch | Pisa & Cambridge Uni.
More projects: https://github.com/NetFPGA/NetFPGA-public/wiki/Projects

15 Spring Camp Community Wiki Documentation –User’s Guide “so you just got your first NetFPGA” –Developer’s Guide “so you want to build a …” Encourage users to contribute Forums Support by users for users Active community - 10s-100s of posts/week Incubating the 10G mailing-list into a forum

16 Spring Camp International Community Over 1,000 users, using 2,000 cards at 150 universities in 40 countries

17 Spring Camp NetFPGA’s Defining Characteristics Line-Rate –Processes back-to-back packets Without dropping packets At full rate of 10 Gigabit Ethernet Links –Operating on packet headers For switching, routing, and firewall rules –And packet payloads For content processing and intrusion prevention Open-source Hardware –Similar to open-source software Full source code available BSD-Style License for 1G and LGPL 2.1 for 10G –But harder, because Hardware modules must meet timing Verilog & VHDL Components have more complex interfaces Hardware designers need high confidence in specification of modules

18 Spring Camp Test-Driven Design Regression tests –Have repeatable results –Define the supported features –Provide clear expectation on functionality Example: Internet Router –Drops packets with bad IP checksum –Performs Longest Prefix Matching on destination address –Forwards IPv4 packets of length bytes –Generates ICMP message for packets with TTL <= 1 –Defines how packets with IP options or non IPv4 … and dozens more … Every feature is defined by a regression test

19 Spring Camp Who, How, Why Who uses the NetFPGA? –Teachers –Students –Researchers How do they use the NetFPGA? –To run the Router Kit –To build modular reference designs IPv4 router 4-port NIC Ethernet switch, … Why do they use the NetFPGA? –To measure performance of Internet systems –To prototype new networking systems

20 Spring Camp Spring Camp Objectives Overall picture of NetFPGA How reference designs work How you can work on a project –NetFPGA Design Flow –Directory Structure, library modules and projects –How to utilize contributed projects Interface/Registers –How to verify a design (Simulation and Regression Tests) –Things to do when you get stuck AND… You can build your own projects!

21 Spring Camp Section II: Network review

22 Spring Camp Internet Protocol (IP) Data IP Hdr Eth Hdr Data IP Hdr Data to be transmitted: IP packets: Ethernet Frames: Data IP Hdr Data IP Hdr Eth Hdr Data IP Hdr Eth Hdr Data IP Hdr … …

23 Spring Camp Internet Protocol (IP) Data IP Hdr … Options (if any) Destination Address Source Address Header ChecksumProtocolTTL Fragment Offset Flags Fragment ID Total Packet LengthT.Service HLen Ver 20 bytes
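The 20-byte fixed header on the slide above can be unpacked field by field. The following is a minimal Python sketch, not part of the original slides; the function name, the field names, and the sample bytes are assumptions chosen for illustration.

```python
import socket
import struct
from collections import namedtuple

# Field names are illustrative; they follow the slide's header layout.
IPv4Header = namedtuple(
    "IPv4Header",
    "version hlen tos total_length frag_id flags frag_offset "
    "ttl protocol checksum src dst",
)

def parse_ipv4_header(raw: bytes) -> IPv4Header:
    """Parse the fixed 20-byte part of an IPv4 header (options are ignored)."""
    if len(raw) < 20:
        raise ValueError("need at least 20 bytes")
    (ver_hlen, tos, total_length, frag_id, flags_frag,
     ttl, protocol, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return IPv4Header(
        version=ver_hlen >> 4,
        hlen=(ver_hlen & 0x0F) * 4,        # HLen is counted in 32-bit words
        tos=tos,
        total_length=total_length,
        frag_id=frag_id,
        flags=flags_frag >> 13,            # top 3 bits
        frag_offset=flags_frag & 0x1FFF,   # low 13 bits
        ttl=ttl,
        protocol=protocol,
        checksum=checksum,
        src=socket.inet_ntoa(src),
        dst=socket.inet_ntoa(dst),
    )

# Made-up sample header: TTL 64, protocol 6, checksum field left at zero.
sample = bytes.fromhex("4500002800004000400600000a000001c0a80001")
print(parse_ipv4_header(sample))
```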

24 Spring Camp Basic operation of an IP router R3 A B C R1 R2 R4D E F R5 F R3E D Next HopDestination D

25 Spring Camp Basic operation of an IP router A B C R1 R2 R3 R4D E F R5

26 Spring Camp Forwarding tables EntryDestinationPort 1 2 ⋮ ⋮ ⋮ 12 ~ 4 billion entries Naïve approach: One entry per address Improved approach: Group entries to reduce table size EntryDestinationPort 1 2 ⋮ – – ⋮ – ⋮ 12 IP address 32 bits wide → ~ 4 billion unique address

27 Spring Camp IP addresses as a line EntryDestinationPort Stanford Berkeley North America Asia Everywhere (default) All IP addresses North America Asia BerkeleyStanford Your computerMy computer

28 Spring Camp Longest Prefix Match (LPM) EntryDestinationPort Stanford Berkeley North America Asia Everywhere (default) Universities Continents Planet Data To: Stanford Matching entries: Stanford North America Everywhere Most specific

29 Spring Camp Longest Prefix Match (LPM) EntryDestinationPort Stanford Berkeley North America Asia Everywhere (default) Universities Continents Planet Data To: Canada Matching entries: North America Everywhere Most specific

30 Spring Camp Implementing Longest Prefix Match EntryDestinationPort Stanford Berkeley North America Asia Everywhere (default) Most specific Least specific Searching FOUND
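The "search from most specific to least specific" idea can be sketched in a few lines of Python. This is an illustrative software model only, not the reference router's hardware lookup; the prefixes and port numbers are invented, with the /16 entries standing in for the universities, the /8 and /12 for the continents, and 0.0.0.0/0 for "Everywhere (default)".

```python
import ipaddress

# Hypothetical forwarding table: (prefix, output port).
ENTRIES = [
    ("0.0.0.0/0", 0),      # everywhere (default)
    ("10.0.0.0/8", 3),     # a "continent"
    ("172.16.0.0/12", 4),  # another "continent"
    ("10.1.0.0/16", 1),    # a "university" inside the first continent
    ("10.2.0.0/16", 2),    # another "university"
]

def build_table(entries):
    """Sort entries so the longest (most specific) prefixes come first."""
    nets = [(ipaddress.ip_network(prefix), port) for prefix, port in entries]
    return sorted(nets, key=lambda e: e[0].prefixlen, reverse=True)

def lookup(table, address):
    """Return the port of the first (hence longest) matching prefix."""
    ip = ipaddress.ip_address(address)
    for net, port in table:
        if ip in net:
            return port
    return None  # only reachable if the table has no default route

table = build_table(ENTRIES)
print(lookup(table, "10.1.2.3"))   # matches /16, /8 and the default
print(lookup(table, "10.9.9.9"))   # matches /8 and the default
print(lookup(table, "8.8.8.8"))    # matches only the default
```

Because the table is ordered by prefix length, the first hit is the most specific one and the search can stop there.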

31 Spring Camp Basic components of an IP router Control Plane Data Plane per-packet processing Switching Forwarding Table Routing Table Routing Protocols Management & CLI Software Hardware Queuing

32 Spring Camp IP router components in NetFPGA SCONE Routing Table Routing Protocols Management & CLI Output Port Lookup Forwarding Table Input Arbiter Output Queues SwitchingQueuing Linux Routing Table Routing Protocols Management & CLI Router Kit OR Software Hardware

33 Spring Camp Section III: Example I

34 Spring Camp Operational IPv4 router Control Plane Data Plane per-packet processing Software Hardware Routing Table Routing Protocols Management & CLI SCONE Switching Forwarding Table Queuing Reference router Java GUI

35 Spring Camp Streaming video

36 Spring Camp Streaming video PC & NetFPGA (NetFPGA in PC) NetFPGA running reference router

37 Spring Camp Streaming video Video streaming over shortest path Video client Video server

38 Spring Camp Streaming video Video client Video server

39 Spring Camp Observing the routing tables Columns: Subnet address Subnet mask Next hop IP Output ports

40 Spring Camp Example 1

41 Spring Camp Review NetFPGA as IPv4 router: Reference hardware + SCONE software Routing protocol discovers topology Demo: Ring topology Traffic flows over shortest path Broken link: automatically route around failure

42 Spring Camp Section III: Example II

43 Spring Camp Buffers in Routers Rx Tx Internal Contention Pipelining Congestion

44 Spring Camp Buffer Sizing Story

45 Spring Camp Using NetFPGA to explore buffer size Need to reduce buffer size and measure occupancy Alas, not possible in commercial routers So, we will use the NetFPGA instead Objective: –Use the NetFPGA to understand how large a buffer we need for a single TCP flow.

46 Spring Camp Reference Router Pipeline Five stages –Input interfaces –Input arbitration –Routing decision and packet modification –Output queuing –Output interfaces Packet-based module interface Pluggable design MAC RxQ CPU RxQ MAC RxQ CPU RxQ MAC RxQ CPU RxQ MAC RxQ CPU RxQ Input Arbiter Output Port Lookup MAC TxQ CPU TxQ MAC TxQ CPU TxQ MAC TxQ CPU TxQ MAC TxQ CPU TxQ Output Queues

47 Spring Camp Extending the Reference Pipeline MAC RxQ MAC RxQ CPU RxQ CPU RxQ MAC RxQ MAC RxQ CPU RxQ CPU RxQ MAC RxQ MAC RxQ CPU RxQ CPU RxQ MAC RxQ MAC RxQ CPU RxQ CPU RxQ Input Arbiter Output Port Lookup MAC TxQ MAC TxQ CPU TxQ CPU TxQ MAC TxQ MAC TxQ CPU TxQ CPU TxQ MAC TxQ MAC TxQ CPU TxQ CPU TxQ MAC TxQ MAC TxQ CPU TxQ CPU TxQ Output Queues Rate Limiter Rate Limiter Event Capture

48 Spring Camp Enhanced Router Pipeline Two modules added 1. Event Capture to capture output queue events (writes, reads, drops) 2. Rate Limiter to create a bottleneck MAC RxQ CPU RxQ MAC RxQ CPU RxQ MAC RxQ CPU RxQ MAC RxQ CPU RxQ Input Arbiter Output Port Lookup MAC TxQ CPU TxQ MAC TxQ CPU TxQ MAC TxQ CPU TxQ MAC TxQ CPU TxQ Output Queues Rate Limiter Event Capture

49 Spring Camp Topology for Exercise 2 Iperf Client Iperf Server Recall: NetFPGA host PC is life-support: power & control So: The host PC may physically route its traffic through the local NetFPGA PC & NetFPGA (NetFPGA in PC) NetFPGA running extended reference router nf2c2 eth1 nf2c1 eth2

50 Spring Camp Example 2

51 Spring Camp Review NetFPGA as flexible platform: Reference hardware + SCONE software New modules: event capture and rate-limiting Example 2: Client Router Server topology –Observed router with new modules –Started TCP transfer, looked at queue occupancy –Observed queue change in response to TCP ARQ

52 Spring Camp Section IV: Infrastructure

53 Spring Camp Infrastructure Tree structure NetFPGA package contents –Reusable Verilog modules –Verification infrastructure –Build infrastructure –Utilities –Software libraries

54 Spring Camp NetFPGA package contents Projects: –HW: router, switch, NIC, buffer sizing router –SW: router kit, SCONE Reusable Verilog modules Verification infrastructure: –simulate full board with PCI-express + physical interfaces –run tests against hardware –test data generation libraries (eg. packets) Build infrastructure Utilities: –register I/O, packaging, … Software libraries

55 Spring Camp Tree Structure (1) NetFPGA-10G projects (including reference designs) contrib-projects (contributed user projects) lib (custom and reference IP Cores and software libraries) tools (scripts for running simulations etc.) docs (design documentations and user-guides) https://github.com/NetFPGA/NetFPGA-10G-live

56 Spring Camp Tree Structure (2) lib hw (hardware logic as IP cores) sw (core specific software drivers/libraries) std (reference cores) contrib (contributed cores) xilinx (Xilinx provided cores) std (reference libraries) contrib (contributed libraries) xilinx (Xilinx provided libraries)

57 Spring Camp Tree Structure (3) projects/reference_nic hw (EDK based project) data (contains user constraint files) bitfiles (FPGA executables) nf10 (contains files used to run various tools) pcores (contains local hardware peripherals) sw embedded (contains code for microblaze) host (contains code for host communication etc.)

58 Spring Camp Reusable logic (IP cores) * PBS – packet bus stream in NF1G ** RBS – register bus stream in NF1G

59 Spring Camp A BRIEF DETOUR What is a PCore?

60 Spring Camp What’s pcore? IP Core? P Core? Pcore? pcore? - all the same. “Processor Core” in EDK HDL (Verilog/VHDL) + metadata files ≈ module Examples: –10G MAC –SRAM Controller –NIC Output port lookup

61 Spring Camp HDL (Verilog) All NetFPGA-10G pcores are AXI-compliant AXI = Advanced eXtensible Interface –Used in ARM-based embedded systems –Standard interface to bridge pcores together –AXI4/AXI4-Lite: Control and status interface –AXI4-Stream: Data path interface Xilinx IPs and tool chains are migrating to AXI

62 Spring Camp Pcore metadata files Help EDK put pcores together PAO: Peripheral Analyze Order –Files that need synthesizing MPD: Microprocessor Peripheral Definition –Defines interface of the pcore (e.g. # of AXI’s) Consult Xilinx UG642 for details

63 Spring Camp Verification Infrastructure (1) Simulation and Debugging –built on industry standard Xilinx “iSim” simulator and “Scapy” –Python scripts for stimuli construction and verification

64 Spring Camp Verification Infrastructure (2) iSim –a Hardware Description Language (HDL) simulator –performs functional and timing simulations for embedded, VHDL, Verilog and mixed designs Scapy –a powerful interactive packet manipulation library for creating “test data” –provides primitives for many standard packet formats –allows addition of custom formats

65 Spring Camp Build Infrastructure (1) Based on the design flow of: –“Xilinx Embedded Development Kit” –“ARM’s AXI4 interface specification”

66 Spring Camp Build Infrastructure (2) Build/Synthesis –collection of shared hardware peripherals cores (pcores) stitched together with “AXI4”, “Lite” and “Stream” buses –bitfile generation and verification using Xilinx synthesis and implementation tools XST Translate, MAP, PAR BitGen

67 Spring Camp Build Infrastructure (3) Register system –collates and generates addresses for all the registers and memories in a project –uses Xilinx EDK – libgen utility to generate “include” files for software applications

68 Spring Camp Module i Module i+1 TDATA Inter-Module Communication TUSER TVALID TREADY –Using AXI-4 Stream (Packets are moved as Stream) TSTRB TLAST

69 Spring Camp
AXI4-Stream | Description
TDATA | Data Stream
TSTRB | Strobe signal (i.e. byte enable)
TVALID | Valid Indication
TREADY | Flow control indication
TLAST | End of packet/burst indication
TUSER | Out of band metadata

70 Spring Camp Packet Format
TLAST | TUSER | TSTRB | TDATA
0 | X | 0xFF…F | Eth Hdr
0 | X | 0xFF…F | IP Hdr
0 | X | 0xFF…F | …
1 | X | 0x0…1F | Last word

71 Spring Camp TUSER
Position | Content
[15:0] | length of the packet in bytes
[23:16] | source port: one-hot encoded
[31:24] | destination port: one-hot encoded
[127:32] | 6 user defined slots, 16bit each
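The TUSER layout can be modelled as bit-field packing on a 128-bit integer. The sketch below is a Python illustration, not code from the NetFPGA tree; the function names are hypothetical, and placing user slot 0 at bits [47:32] is an assumption, since the slide does not state the slot order.

```python
def pack_tuser(length, src_port, dst_ports, slots=(0, 0, 0, 0, 0, 0)):
    """Build a 128-bit TUSER value from the fields listed on the slide.

    src_port and dst_ports are port indices (0..7) that get one-hot encoded.
    """
    if not 0 <= length <= 0xFFFF:
        raise ValueError("length must fit in 16 bits")
    if not 0 <= src_port < 8:
        raise ValueError("source port index must be 0..7")
    if len(slots) != 6:
        raise ValueError("exactly six user-defined slots expected")
    value = length                       # bits [15:0]
    value |= (1 << src_port) << 16       # bits [23:16], one-hot
    for port in dst_ports:
        if not 0 <= port < 8:
            raise ValueError("destination port index must be 0..7")
        value |= (1 << port) << 24       # bits [31:24], one bit per port
    for i, slot in enumerate(slots):     # bits [127:32]; slot order is assumed
        value |= (slot & 0xFFFF) << (32 + 16 * i)
    return value

def unpack_tuser(value):
    """Split a TUSER value back into its fields."""
    return {
        "length": value & 0xFFFF,
        "src_port_onehot": (value >> 16) & 0xFF,
        "dst_port_onehot": (value >> 24) & 0xFF,
        "slots": [(value >> (32 + 16 * i)) & 0xFFFF for i in range(6)],
    }

tuser = pack_tuser(length=64, src_port=0, dst_ports=[2])
print(hex(tuser), unpack_tuser(tuser))
```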

72 Spring Camp TVALID/TREADY Signal timing –No waiting! –Assert TREADY/TVALID whenever appropriate –TVALID should not depend on TREADY TVALID TREADY
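A word moves across the interface only in a cycle where TVALID and TREADY are both high. The following cycle-by-cycle Python model is a sketch of that rule (not a replacement for HDL simulation); the function name and the ready pattern are made up. Note that the source decides TVALID purely from whether it has data, without looking at TREADY.

```python
def simulate_handshake(words, ready_pattern):
    """Return (cycle, word) for every cycle in which a transfer happens.

    words         -- data the source wants to send, in order
    ready_pattern -- the sink's TREADY value for each simulated cycle
    """
    pending = list(words)
    transfers = []
    for cycle, tready in enumerate(ready_pattern):
        tvalid = bool(pending)           # does not depend on TREADY
        if tvalid and tready:            # handshake: both high in this cycle
            transfers.append((cycle, pending.pop(0)))
        # otherwise the source simply holds the same word and keeps TVALID
    return transfers

print(simulate_handshake(["A", "B", "C"], [1, 0, 1, 1, 0]))
```

In this run the sink stalls in cycle 1, so word "B" is held by the source and transferred in cycle 2.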

73 Spring Camp Byte ordering In compliance with AXI, NF 10G has a specific byte ordering –1st byte of the packet is in TDATA[7:0] –2nd byte of the packet is in TDATA[15:8]
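Putting the byte ordering together with TSTRB and TLAST, a packet can be cut into stream words as sketched below in Python. The 32-byte (256-bit) TDATA width is an assumption made for this example, and the function name is hypothetical.

```python
def packet_to_beats(packet: bytes, width_bytes: int = 32):
    """Split a packet into (tdata, tstrb, tlast) tuples, one per stream word.

    Byte 0 of each chunk lands in TDATA[7:0], byte 1 in TDATA[15:8], and so
    on, which is what little-endian integer conversion gives.
    """
    beats = []
    for offset in range(0, len(packet), width_bytes):
        chunk = packet[offset:offset + width_bytes]
        tdata = int.from_bytes(chunk, "little")
        tstrb = (1 << len(chunk)) - 1    # one strobe bit per valid byte
        tlast = 1 if offset + width_bytes >= len(packet) else 0
        beats.append((tdata, tstrb, tlast))
    return beats

# A made-up 37-byte packet: one full word plus a 5-byte last word.
for tdata, tstrb, tlast in packet_to_beats(bytes(range(1, 38))):
    print(f"tlast={tlast} tstrb={tstrb:#x} tdata={tdata:#x}")
```

For the 37-byte example the last word carries five valid bytes, so its strobe is 0x1F, the same shape as the last row of the packet format table.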

74 Spring Camp Section V: Life of a Packet

75 Spring Camp Reference Router Pipeline Five stages –Input –Input arbitration –Routing decision and packet modification –Output queuing –Output Packet-based module interface Pluggable design MAC RxQ MAC RxQ MAC RxQ MAC RxQ MAC RxQ MAC RxQ MAC RxQ MAC RxQ CPU RxQ CPU RxQ Input Arbiter Output Port Lookup Output Queues MAC RxQ MAC RxQ MAC RxQ MAC RxQ MAC RxQ MAC RxQ MAC RxQ MAC RxQ CPU RxQ CPU RxQ

76 Spring Camp Full System Components Software PCIe Bus NetFPGA AXI Lite user data path nf0nf1nf2nf3ioctl MAC TxQ MAC TxQ MAC RxQ MAC RxQ Ethernet CPU RxQ CPU RxQ CPU TxQ MAC TxQ MAC TxQ MAC RxQ MAC RxQ MAC TxQ MAC TxQ MAC RxQ MAC RxQ MAC TxQ MAC TxQ MAC RxQ MAC RxQ

77 Spring Camp port0port y x Life of a Packet through the Hardware IP packet

78 Spring Camp MAC Rx Queue IP Hdr: IP Dst: , TTL: 64, Csum:0x3ab4 Eth Hdr: Dst MAC = port 0, Ethertype = IP Data

79 Spring Camp Rx Queue IP Hdr: IP Dst: , TTL: 64, Csum:0x3ab4 Eth Hdr: Dst MAC = port 0, Ethertype = IP Payload 0 Length, Src port, Dst port, User defined 0 TUSER TDATA

80 Spring Camp Input Arbiter Rx Q 0 Rx Q 1 … Rx Q 4 Pkt

81 Spring Camp Output Port Lookup IP Hdr: IP Dst: , TTL: 64, Csum:0x3ab4 Eth Hdr: Dst MAC = port 0, Ethertype = IP Payload 0 Length, Src port, Dst port, User defined 0

82 Spring Camp Output Port Lookup IP Hdr: IP Dst: , TTL: 63, Csum:0x3ac2 Eth Hdr: Dst MAC= nextHop, Src MAC = port 4, Ethertype = IP Payload 0 Length, Src port, Dst port, User defined 0 Output Port Lookup 1- Check input port matches Dst MAC 2- Check TTL, checksum 3- Lookup next hop IP & output port (LPM) 4- Lookup next hop MAC address (ARP) 5- Update output port in TUSER 6- Modify MAC Dst and Src addresses 7-Decrement TTL and update checksum TUSER TDATA
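Steps 2 and 7 (check TTL and checksum, then decrement TTL and update the checksum) can be modelled in software as follows. This is a Python sketch, not the reference router's Verilog; the helper names and header values are invented, and the incremental update uses the standard RFC 1624 formula rather than anything taken from the slides.

```python
import struct

def fold16(total: int) -> int:
    """Fold carries back in, as one's-complement addition requires."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total

def ip_checksum(header: bytes) -> int:
    """Header checksum; returns 0 when run over a header that is valid."""
    words = struct.unpack("!%dH" % (len(header) // 2), header)
    return ~fold16(sum(words)) & 0xFFFF

def make_header(ttl: int) -> bytes:
    """Hypothetical 20-byte header with a correct checksum, for testing."""
    hdr = bytearray(struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, ttl, 6, 0,
                                bytes([10, 0, 0, 1]), bytes([192, 168, 0, 1])))
    struct.pack_into("!H", hdr, 10, ip_checksum(bytes(hdr)))
    return bytes(hdr)

def decrement_ttl(header: bytes):
    """Return (new_header, verdict); new_header is None unless forwarded."""
    if ip_checksum(header) != 0:
        return None, "drop: bad checksum"
    if header[8] <= 1:
        return None, "exception: send to CPU"
    hdr = bytearray(header)
    (old_word,) = struct.unpack_from("!H", hdr, 8)   # TTL and protocol
    hdr[8] -= 1
    (new_word,) = struct.unpack_from("!H", hdr, 8)
    (old_csum,) = struct.unpack_from("!H", hdr, 10)
    # RFC 1624: HC' = ~(~HC + ~m + m')
    new_csum = ~fold16((~old_csum & 0xFFFF) + (~old_word & 0xFFFF) + new_word) & 0xFFFF
    struct.pack_into("!H", hdr, 10, new_csum)
    return bytes(hdr), "forward"

out, verdict = decrement_ttl(make_header(64))
print(verdict, out[8], hex(struct.unpack_from("!H", out, 10)[0]))
```

The incremental form touches only the changed 16-bit word, which is why it suits a per-packet hardware pipeline better than recomputing the sum over the whole header.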

83 Spring Camp Output Queues Pkt OQ0 OQ2 OQ4

84 Spring Camp MAC Tx Queue IP Hdr: IP Dst: , TTL: 63, Csum:0x3ac2 Eth Hdr: Dst MAC= nextHop, Src MAC = port 4, Ethertype = IP Payload 0 Length, Src port, Dst port, User defined 0

85 Spring Camp MAC Tx Queue IP Hdr: IP Dst: , TTL: 63, Csum:0x3ac2 Eth Hdr: Dst MAC= nextHop, Src MAC = port 4, Ethertype = IP Payload 0 Length, Src port, Dst port, User defined 0

86 Spring Camp Exception Packets Example: TTL = 0 or TTL = 1 Packet has to be sent to the CPU Host generates an ICMP packet response Difference starts at the Output Port Lookup

87 Spring Camp Software PCI Bus NetFPGA AXI Lite user data path nf0nf1nf2nf3ioctl MAC TxQ MAC TxQ MAC RxQ MAC RxQ Ethernet CPU RxQ CPU RxQ CPU TxQ MAC TxQ MAC TxQ MAC RxQ MAC RxQ MAC TxQ MAC TxQ MAC RxQ MAC RxQ MAC TxQ MAC TxQ MAC RxQ MAC RxQ Exception Packet Path

88 Spring Camp Output Port Lookup IP Hdr: IP Dst: , TTL: 1, Csum:0x3ab4 Eth Hdr: Dst MAC= 0, Src MAC = port x, Ethertype = IP Payload 0 Length, Src port, Dst port, User defined 0 Output Port Lookup 1- Check input port matches Dst MAC 2- Check TTL, checksum – EXCEPTION! 3- Add output port module

89 Spring Camp Output Queues Pkt OQ0 OQ1 OQ2 OQ4

90 Spring Camp CPU Tx Queue IP Hdr: IP Dst: , TTL: 1, Csum:0x3ab4 Eth Hdr: Dst MAC= 0, Src MAC = port x, Ethertype = IP Payload 0 Length, Src port, Dst port, User defined 0

91 Spring Camp CPU Tx Queue IP Hdr: IP Dst: , TTL: 1, Csum:0x3ab4 Eth Hdr: Dst MAC= 0, Src MAC = port x, Ethertype = IP Payload 0 Length, Src port, Dst port, User defined 0

92 Spring Camp ICMP Packet Packet arrives at the CPU Rx Queue from the PCI-express Bus Same path as a packet from the MAC until it reaches the Output Port Lookup (OPL) The OPL module sees the packet is from the CPU Rx Queue 1 and sets the output port directly to 0 The packet continues on the same path as the non-exception packet to the Output Queues and then MAC Tx queue 0

93 Spring Camp Software PCI Bus NetFPGA AXI Lite user data path nf0nf1nf2nf3ioctl MAC TxQ MAC TxQ MAC RxQ MAC RxQ Ethernet CPU RxQ CPU RxQ CPU TxQ MAC TxQ MAC TxQ MAC RxQ MAC RxQ MAC TxQ MAC TxQ MAC RxQ MAC RxQ MAC TxQ MAC TxQ MAC RxQ MAC RxQ ICMP Packet Path

94 Spring Camp NetFPGA-Host Interaction Linux driver interfaces with hardware –Packet interface via standard Linux network stack –Register reads/writes via ioctl system call with wrapper functions: rdaxi(int address, unsigned *rd_data); wraxi(int address, unsigned *wr_data); eg: rdaxi(0x7d , &val);

95 Spring Camp NetFPGA-Host Interaction NetFPGA to host packet transfer PCI Bus 2. Interrupt notifies driver of packet arrival 3. Driver sets up and initiates DMA transfer 1. Packet arrives – forwarding table sends to CPU queue

96 Spring Camp NetFPGA-Host Interaction NetFPGA to host packet transfer (cont.) PCI Bus 4. NetFPGA transfers packet via DMA 5. Interrupt signals completion of DMA 6. Driver passes packet to network stack

97 Spring Camp NetFPGA-Host Interaction Host to NetFPGA packet transfers PCI Bus 3. Interrupt signals completion of DMA 1. Software sends packet via network sockets Packet delivered to driver 2. Driver sets up and initiates DMA transfer

98 Spring Camp NetFPGA-Host Interaction Register access PCI Bus 1. Software makes ioctl call on network socket ioctl passed to driver 2. Driver performs PCIe memory read/write

99 Spring Camp Section V: Contributed Projects

100 Spring Camp FPGA Memory Running the Router Kit User-space development, 4x10GE line-rate forwarding PCI-Express CPU Memory OSPFBGP My Protocol user kernel Routing Table Usage #1 IPv4 Router Fwding Table Packet Buffer “Mirror” 10GbE

101 Spring Camp FPGA Memory Enhancing Modular Reference Designs PCI-Express CPU Memory Usage #2 NetFPGA Driver Java GUI Front Panel (Extensible) PW-OSPF In Q Mgmt IP Lookup L2 Parse L3 Parse Out Q Mgmt Verilog modules interconnected by FIFO interfaces My Block Verilog, System Verilog, VHDL, Bluespec…. 10GbE EDA Tools (Xilinx, Mentor, etc.) 1.Design 2.Simulate 3.Synthesize 4.Download 1.Design 2.Simulate 3.Synthesize 4.Download

102 Spring Camp FPGA Memory Creating new systems PCI-Express CPU Memory Usage #3 NetFPGA Driver My Design (10GE MAC is soft/replaceable) Verilog, System Verilog, VHDL, Bluespec…. 10GbE EDA Tools (Xilinx, Mentor, etc.) 1.Design 2.Simulate 3.Synthesize 4.Download 1.Design 2.Simulate 3.Synthesize 4.Download

103 Spring Camp NetThreads, NetThreads-RE, NetTM Martin Labrecque Gregory Steffan ECE Dept. Geoff Salmon Monia Ghobadi Yashar Ganjali CS Dept. U. of Toronto Efficient multithreaded design –Parallel threads deliver performance System Features –System is easy to program in C –Time to results is very short

104 Spring Camp FPGA Soft processors: processors in the FPGA fabric User uploads program to soft processor Easier to program software than hardware in the FPGA Could be customized at the instruction level Processor(s) DDR controller Ethernet MAC Soft Processors in FPGAs

105 Spring Camp Example 1G Contributed Projects
Project | Contributor
OpenFlow switch | Stanford University
Packet generator | Stanford University
NetFlow Probe | Brno University
NetThreads | University of Toronto
zFilter (Sp)router | Ericsson
Traffic Monitor | University of Catania
DFA | UMass Lowell

106 Spring Camp 10G Contributions Alongside reference projects we have…
Project | Contributor
Bluespec switch | MIT/SRI International
Traffic Monitor | University of Pisa
NF1G legacy on NF10G | Uni Pisa & Uni Cambridge
Simple/better DMA core | Stanford RAMcloud project
Your ideas | You?
….

107 Spring Camp Future for NetFPGA 10G Non-reference Projects we know about High precision capture card –4 10GbE ports, GPS input, 8ns resolution. Traffic generator –4 port 10GbE real-traffic generator OpenFlow native port (Verilog) to OVS

108 Spring Camp How might we use NetFPGA? Well I’m not sure about you but here is a list I created:
Build an accurate, fast, line-rate NetDummy/nistnet element
A flexible home-grown monitoring card
Evaluate new packet classifiers
–(and application classifiers, and other neat network apps….)
Prototype a full line-rate next-generation Ethernet-type
Trying any of Jon Crowcroft’s ideas (Sourceless IP routing for example)
Demonstrate the wonders of Metarouting in a different implementation (dedicated hardware)
Provable hardware (using a C# implementation and kiwi with NetFPGA as target h/w)
Hardware supporting Virtual Routers
Check that some brave new idea actually works e.g. Rate Control Protocol (RCP), Multipath TCP, toolkit for hardware hashing
MOOSE implementation
IP address anonymization
SSL decoding “bump in the wire”
Xen specialist NIC
computational co-processor
Distributed computational co-processor
IPv6 anything
IPv6 – IPv4 gateway (6in4, 4in6, 6over4, 4over6, ….)
Netflow v9 reference
PSAMP reference
IPFIX reference
Different driver/buffer interfaces (e.g. PFRING) or “escalators” (from gridprobe) for faster network monitors
Firewall reference
GPS packet-timestamp things
High-Speed Host Bus Adapter reference implementations
–Infiniband
–iSCSI
–Myrinet
–Fiber Channel
Smart Disk adapter (presuming a direct-disk interface)
Software Defined Radio (SDR) directly on the FPGA (probably UWB only)
Routing accelerator
–Hardware route-reflector
–Internet exchange route accelerator
Hardware channel bonding reference implementation
TCP sanitizer
Other protocol sanitizer (applications… UDP DCCP, etc.)
Full and complete Crypto NIC
IPSec endpoint/ VPN appliance
VLAN reference implementation
metarouting implementation
virtual intelligent proxy
application embargo-er
Layer-4 gateway
h/w gateway for VoIP/SIP/skype
h/w gateway for video conference spaces
security pattern/rules matching
Anti-spoof traceback implementations (e.g. BBN stuff)
IPtv multicast controller
Intelligent IP-enabled device controller (e.g. IP cameras or IP powermeters)
DES breaker
platform for flexible NIC API evaluations
snmp statistics reference implementation
sflow (hp) reference implementation
trajectory sampling (reference implementation)
implementation of zeroconf/netconf configuration language for routers
h/w openflow and (simple) NOX controller in one…
Network RAID (multicast TCP with redundancy)
inline compression
hardware accelerator for TOR
load-balancer
openflow with (netflow, ACL, ….)
reference NAT device
active measurement kit
network discovery tool
passive performance measurement
active sender control (e.g. performance feedback fed to endpoints for control)
Prototype platform for NON-Ethernet or near-Ethernet MACs
–Optical LAN (no buffers)

109 Spring Camp How might YOU use NetFPGA?

110 Spring Camp Section VI: Example Project

111 Spring Camp Project: Cryptographic NIC Implement a network interface card (NIC) that encrypts upon transmission and decrypts upon reception

112 Spring Camp Cryptography XOR function XOR written as: ^ ⊻ XOR is associative: (A ^ B) ^ C = A ^ (B ^ C)
A | B | A ^ B
0 | 0 | 0
0 | 1 | 1
1 | 0 | 1
1 | 1 | 0
XORing a value with itself always yields 0

113 Spring Camp Cryptography (cont.) Simple cryptography: –Generate a secret key –Encrypt the message by XORing the message and key –Decrypt the ciphertext by XORing with the key Explanation: (M ^ K) ^ K = M ^ (K ^ K) (associativity) = M ^ 0 (A ^ A = 0) = M

114 Spring Camp Cryptography (cont.) Example: Message: Key: Message ^ Key: Key: Message ^ Key ^ Key:

115 Spring Camp Cryptography (cont.) Idea: Implement simple cryptography using XOR –32-bit key –Encrypt every word in payload with key Note: XORing with a one-time pad of the same length of the message is secure/uncrackable. See: PayloadHeader Key
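Before writing the hardware module, the idea can be checked with a small software model. The Python sketch below is an illustration only; the 14-byte header length (an Ethernet header left in the clear), the big-endian key byte order, and the key value are all assumptions, not details from the slides.

```python
import struct

def xor_payload(frame: bytes, key: int, header_len: int = 14) -> bytes:
    """XOR every payload byte with the repeating 32-bit key.

    The first header_len bytes are passed through unchanged. Applying the
    function twice with the same key restores the original frame.
    """
    key_bytes = struct.pack("!I", key & 0xFFFFFFFF)  # byte order is assumed
    header, payload = frame[:header_len], frame[header_len:]
    scrambled = bytes(b ^ key_bytes[i % 4] for i, b in enumerate(payload))
    return header + scrambled

KEY = 0x0BADCAFE                       # hypothetical static key
frame = bytes(14) + b"hello, NetFPGA"  # dummy header + payload
encrypted = xor_payload(frame, KEY)
decrypted = xor_payload(encrypted, KEY)
print(encrypted.hex())
print(decrypted[14:])
```

Reusing a short fixed key for every word is far weaker than the one-time pad mentioned in the note; it is a teaching exercise, not real security.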

116 Spring Camp Section VII: Implementation

117 Spring Camp Embedded Development Kit (1) A development environment for building “embedded systems” NetFPGA-10G is largely built upon the “EDK”

118 Spring Camp Embedded Development Kit (2) Xilinx EDK suite is composed of: –Platform Studio (XPS), a top level integrated design tool for “hardware” synthesis, implementation and bitstream generation –Software Development Kit (SDK), a development environment for “software application” running on embedded processors like Microblaze

119 Spring Camp Xilinx Platform Studio (1) An XPS project consists of the following: –system.xmp: top level XPS project file –system.mhs: microprocessor hardware specification file; defines the embedded processor system, including bus architecture, peripherals, processor, and connectivity –system.ucf: user constraint file; defines constraints such as timing, area, and IO placement

120 Spring Camp Xilinx Platform Studio (2) To invoke the XPS design tool, run: #xps /hw/system.xmp This will open the project in the XPS graphical user interface. Open a new terminal and run:
    cd ~/NetFPGA-10G-live/projects/crypto_nic
    source /opt/Xilinx/13.4/ISE_DS/settings64.sh
    xps hw/system.xmp

121 Spring Camp XPS Design Tool (1) Bus Assembly System Assembly IP Catalog

122 Spring Camp XPS Design Tool (2) IP Catalog: contains categorized list of all available peripheral cores Bus Assembly: shows connectivity of various modules over AXI bus System Assembly: provides a complete view of instantiated cores –Cores can be added and deleted, in this view

123 Spring Camp XPS Design Tool (2) Port view Port view: provides listing of internal and external ports for each peripheral core

124 Spring Camp XPS Design Tool (3) Address view Address view: defines base and high address value for peripheral cores connected to AXI4 or AXI-LITE bus These values can be changed manually, here

125 Spring Camp Getting started with a new project (1) Projects: –Each design represented by a project –Location: NetFPGA-10G-live/projects/ ~/NetFPGA-10G-live/projects/crypto_nic –Consists of: Verilog source Simulation tests Hardware tests Libraries Optional software

126 Spring Camp Getting started with a new project (2) –Normally: copy an existing project as the starting point –Today: pre-created project –Missing from pre-created project: Verilog files (with crypto implementation) Simulation tests Hardware tests Custom software

127 Spring Camp Getting started with a new project (3) Typically implement functionality in one or more modules under the top wrapper [Figure: reference pipeline; block labels: MAC RxQ, CPU RxQ, Input Arbiter, Output Port Lookup, Output Queues, Crypto] Crypto module to encrypt and decrypt packets

128 Spring Camp Getting started with a new project (4) –Shared modules included from netfpga/lib/verilog Generic modules that are re-used in multiple projects Specify shared modules in project’s mhs file –Local src modules override shared modules –crypto_nic: Local: crypto.v; Shared: everything else

129 Spring Camp Getting started with a new project (5) Create crypto.v from module_template.v: 1. Rename project project_template to crypto 2. Rename the local module_template.v to crypto.v 3. Change the module name inside crypto.v (first non-comment line of the file) 4. Add the crypto module to the mhs file 5. Update mpd file 6. Update pao file

130 Spring Camp cd ~/NetFPGA-10G-live/projects/crypto_nic xps hw/system.xmp Running XPS

131 Spring Camp Adding The Crypto Module Double click the Crypto module in IP catalog –It’s under “Project Local Pcores” Select 256 bit data width and click OK

132 Spring Camp Connect Crypto module In “Bus Interfaces”, click Crypto’s S_AXIS and select Input Output_lookup’s M_AXIS Click Output_queue’s S_AXIS and select Crypto’s M_AXIS

133 Spring Camp Connect Clock and Reset Click “Ports” Tab Clock -> clock_generator_0::CLKOUT1 Reset -> reset_0::Peripheral_aresetn

134 Spring Camp Implementing the Crypto Module (1) What do we want to encrypt? –IP payload only Plaintext IP header allows routing Content is hidden –Encrypt bytes 35 onward Bytes 1-14 – Ethernet header Bytes 15-34 – IPv4 header (assume no options) Remember AXI byte ordering –For simplicity, assume all packets are IPv4 without options
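
As a software reference for the byte offsets on this slide, the sketch below XORs everything from byte 35 onward with a repeating 32-bit key. It is a model under stated assumptions, not the hardware implementation: the function name `crypt_frame`, the big-endian key byte order, and the alignment of the key to the absolute byte offset in the frame are all assumptions made for illustration, and AXI byte ordering is not modelled.

```python
ETH_HDR_LEN = 14                               # bytes 1-14: Ethernet header
IPV4_HDR_LEN = 20                              # IPv4 header, assuming no options
PAYLOAD_OFFSET = ETH_HDR_LEN + IPV4_HDR_LEN    # 34 zero-based, i.e. byte 35 counting from 1


def crypt_frame(frame: bytes, key: int) -> bytes:
    """XOR the IP payload of a frame with a repeating 32-bit key.

    Encryption and decryption are the same operation.
    """
    key_bytes = key.to_bytes(4, "big")         # assumed byte order of the key
    out = bytearray(frame)
    for i in range(PAYLOAD_OFFSET, len(out)):
        # Assumption: key bytes are aligned to the absolute offset in the frame.
        out[i] ^= key_bytes[i % 4]
    return bytes(out)


# Hypothetical frame: 34 header bytes followed by a short payload.
frame = bytes(range(34)) + b"hello NetFPGA"
encrypted = crypt_frame(frame, 0x12345678)     # hypothetical key
print(encrypted[:34] == frame[:34])            # headers are left in plaintext
print(crypt_frame(encrypted, 0x12345678) == frame)
```

Because the same call decrypts, a model like this could be used to predict the packets a simulation should produce.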

135 Spring Camp Implementing the Crypto Module (2) State machine (draw on next page): –Module headers on each packet –Datapath 256-bits wide 34 / 32 is not an integer! [Figure: inside the crypto module; labels: Registers, in_fifo_empty, in_fifo_rd_en, data/ctrl, valid, ready, S_AXIS, M_AXIS]

136 Spring Camp Example packet [Figure: packet laid out as TDATA (256 bits) words with TUSER (128 bits); labels: ETH HDR, IP Hdr, IP Payload 0 ... IP Payload N; regions marked No encryption, Partial encryption, Full encryption]
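
The three regions in the example packet follow from simple arithmetic: a 34-byte header does not fit into one 32-byte (256-bit) word. The sketch below labels each word of a frame accordingly. The function name `classify_words` and the label strings are hypothetical, and the frame is treated as a plain byte sequence without modelling AXI byte ordering.

```python
WORD_BYTES = 256 // 8      # 32 bytes per TDATA word
HEADER_BYTES = 14 + 20     # Ethernet header + IPv4 header without options


def classify_words(frame_len: int) -> list:
    """Label each 32-byte word of a frame as 'none', 'partial' or 'full' encryption."""
    labels = []
    for start in range(0, frame_len, WORD_BYTES):
        end = min(start + WORD_BYTES, frame_len)
        if end <= HEADER_BYTES:
            labels.append("none")      # word holds header bytes only
        elif start < HEADER_BYTES:
            labels.append("partial")   # word holds the header tail and payload
        else:
            labels.append("full")      # word holds payload only
    return labels


print(divmod(HEADER_BYTES, WORD_BYTES))   # (1, 2): one full header word plus 2 bytes
print(classify_words(100))                # ['none', 'partial', 'full', 'full']
```

The first word is untouched, the second is encrypted from its third byte onward, and every later word is encrypted in full.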

137 Spring Camp Crypto Module State Diagram Hint: We suggest 3 states Detect Packet’s Header

138 Spring Camp Implementing the Crypto Module (3) Implement your state machine inside crypto.v –Use a static key initially Suggested sequence of steps: 1. Create a static key value. Constants can be declared in the module with localparam: localparam MY_EXAMPLE = 32'h ; 2. Implement your state machine without modifying the packet 3. Update your state machine to modify the packet by XORing the key and the payload. Use eight copies of the key to create a 256-bit value to XOR with data words
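
Before writing the Verilog, it may help to prototype the suggested steps in software. The sketch below is one possible three-state design, modelled in Python; it is not the tutorial's reference solution. The state names, the key value `0xDEADBEEF`, the big-endian key byte order, and the helper names are all hypothetical. Each call processes one packet, which stands in for the hardware returning to its first state at the end of a packet.

```python
WORD_BYTES = 32        # 256-bit datapath
HEADER_BYTES = 34      # Ethernet (14) + IPv4 without options (20)
KEY = 0xDEADBEEF       # hypothetical static key

# Hypothetical state names for a three-state machine.
HEADER_WORD, PARTIAL_WORD, PAYLOAD_WORD = range(3)


def wide_key(key: int) -> bytes:
    """Eight copies of the 32-bit key, forming a 256-bit value."""
    return key.to_bytes(4, "big") * (WORD_BYTES // 4)


def xor_bytes(a: bytes, b: bytes) -> bytes:
    # zip stops at the shorter input, so a short final word is handled.
    return bytes(x ^ y for x, y in zip(a, b))


def crypto_fsm(words, key=KEY):
    """Process one packet, given as a list of 32-byte words (the last may be shorter)."""
    full_mask = wide_key(key)
    skip = HEADER_BYTES - WORD_BYTES                 # 2 header bytes spill into word 1
    partial_mask = bytes(skip) + full_mask[skip:]    # zeros leave the header tail unchanged
    state = HEADER_WORD
    out = []
    for word in words:
        if state == HEADER_WORD:
            out.append(word)                          # no encryption
            state = PARTIAL_WORD
        elif state == PARTIAL_WORD:
            out.append(xor_bytes(word, partial_mask)) # partial encryption
            state = PAYLOAD_WORD
        else:
            out.append(xor_bytes(word, full_mask))    # full encryption
    return out


frame = bytes(range(100))
words = [frame[i:i + WORD_BYTES] for i in range(0, len(frame), WORD_BYTES)]
encrypted = crypto_fsm(words)
print(crypto_fsm(words, key=0) == words)   # step 2: pass-through when nothing is modified
print(crypto_fsm(encrypted) == words)      # step 3: applying the key twice restores the packet
```

Running the model with a key of 0 corresponds to step 2 (a state machine that leaves the packet unmodified), which is a useful checkpoint before adding the XOR.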

139 Spring Camp module_template.v (1)
    module module_template
    #(
        parameter C_FAMILY = "virtex5",
        parameter C_M_AXIS_DATA_WIDTH=256,
        parameter C_S_AXIS_DATA_WIDTH=256,
        parameter C_M_AXIS_TUSER_WIDTH=128,
        parameter C_S_AXIS_TUSER_WIDTH=128,
        ...
    )
    (
        ...    // Module port declaration
    )
    // Signals
    // Local assignments
    // Registers

140 Spring Camp module_template.v (2)
    // Modules
    fallthrough_small_fifo
    #(
        .WIDTH(C_S_AXIS_DATA_WIDTH+C_S_AXIS_TUSER_WIDTH+(C_S_AXIS_DATA_WIDTH / 8)+1),
        .MAX_DEPTH_BITS(2)
    )
    input_fifo
    (
        .din (),          // Data in
        .wr_en (),        // Write enable
        .rd_en (),        // Read the next word
        .dout (),         // Data out
        .full (),         // FIFO full indication
        .nearly_full (),  // Almost full indication
        .empty (),        // FIFO empty indication
        .reset (),        // Reset
        .clk ()           // Clock
    );
Packet data dumped in a FIFO. Allows some “decoupling” between input and output.

141 Spring Camp module_template.v (3)
    // -- AXILITE IPIF
    axi_lite_ipif_1bar #(...) axi_lite_ipif_inst (... );
    // -- IPIF REGS
    ipif_regs #(...) ipif_regs_inst (... );
    // Registers logic
Registers section. Ignore for now – we’ll explore this later

142 Spring Camp module_template.v (4)
    // Logic
    always @(*) begin
        // Default values
        out_wr_int = 0;
        in_fifo_rd_en = 0;
        if (!in_fifo_empty && out_rdy) begin
            out_wr_int = 1;
            in_fifo_rd_en = 1;
        end
    end
Combinational logic to read data from the FIFO. (Data is output to output ports.) You’ll want to add your state in this section.

143 Spring Camp More Verilog: Assignments 1 Continuous assignments –appear outside processes (always blocks): assign foo = baz & bar; –targets must be declared as wires –always “happening” (i.e., are concurrent)

144 Spring Camp More Verilog: Assignments 2 Non-blocking assignments –appear inside processes (always blocks) –use only in sequential (clocked) processes: always @(posedge clk) begin a <= b; b <= a; end –occur in next delta (‘moment’ in simulation time) –targets must be declared as regs –never clock any process other than with a clock!

145 Spring Camp More Verilog: Assignments 3 Blocking assignments –appear inside processes (always blocks) –use only in combinatorial processes: (combinatorial processes are much like continuous assignments) always @(*) begin a = b; b = a; end –occur one after the other (as in sequential langs like C) –targets must be declared as regs – even though not a register –never use in sequential (clocked) processes!

146 Spring Camp More Verilog: Assignments 3 Blocking assignments –appear inside processes (always blocks) –use only in combinatorial processes: (combinatorial processes are much like continuous assignments) always @(*) begin tmp = a; a = b; b = tmp; end –occur one after the other (as in sequential langs like C) –targets must be declared as regs – even though not a register –never use in sequential (clocked) processes! unlike non-blocking, have to use a temporary signal
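
The difference between the two assignment styles can be mimicked in ordinary software. The Python functions below are only an analogy of the ordering semantics, with hypothetical names; they do not model simulation deltas or hardware timing.

```python
def nonblocking_swap(a, b):
    """Analogy for `a <= b; b <= a;`: both right-hand sides are read before any update."""
    next_a, next_b = b, a
    return next_a, next_b


def blocking_naive(a, b):
    """Analogy for `a = b; b = a;`: the second statement sees the already-updated a."""
    a = b
    b = a
    return a, b


def blocking_with_tmp(a, b):
    """Analogy for `tmp = a; a = b; b = tmp;`: the temporary preserves the old a."""
    tmp = a
    a = b
    b = tmp
    return a, b


print(nonblocking_swap(1, 2))    # (2, 1)
print(blocking_naive(1, 2))      # (2, 2)
print(blocking_with_tmp(1, 2))   # (2, 1)
```

Only the naive blocking version loses a value, which is why the blocking form needs the temporary signal.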

147 Spring Camp (hints) Never assign one signal from two processes: always @(posedge clk) begin foo <= bar; end always @(posedge clk) begin foo <= quux; end

148 Spring Camp (hints) In combinatorial processes: –take great care to assign in all possible cases; this example does not, so it infers a latch: always @(*) begin if (cond) begin foo = bar; end end –(latches ‹as opposed to flip-flops› are bad for timing closure)

149 Spring Camp (hints) In combinatorial processes: –take great care to assign in all possible cases always @(*) begin if (cond) begin foo = bar; end else begin foo = quux; end end

150 Spring Camp (hints) In combinatorial processes: –(or assign a default) always @(*) begin foo = quux; if (cond) begin foo = bar; end end
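
Why a missing assignment implies storage can be seen in a small behavioural analogy. In the Python sketch below (class and method names are hypothetical), the incomplete version has to remember the previous value of `foo` whenever `cond` is false, which is the memory a latch would provide. The version with a default depends only on its current inputs.

```python
class CombProcess:
    """Behavioural analogy of a combinatorial process driving foo."""

    def __init__(self):
        self.foo = None    # whatever foo held after the previous evaluation

    def eval_incomplete(self, cond, bar):
        # Analogy for: if (cond) foo = bar;   (no else, no default)
        if cond:
            self.foo = bar
        return self.foo    # when cond is false, the old value is held

    def eval_with_default(self, cond, bar, quux):
        # Analogy for: foo = quux; if (cond) foo = bar;
        self.foo = quux
        if cond:
            self.foo = bar
        return self.foo


p = CombProcess()
print(p.eval_incomplete(True, 5))           # 5
print(p.eval_incomplete(False, 9))          # 5 again: the stale value is held
print(p.eval_with_default(False, 9, 0))     # 0: output depends only on current inputs
```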

151 Spring Camp Section VIII: Simulation and Debug

152 Spring Camp Testing: Simulation Simulation allows testing without requiring lengthy synthesis process NetFPGA simulation environment allows: –Send/receive packets Physical ports and CPU –Read/write registers –Verify results Simulations run in iSim

153 Spring Camp Testing: Simulation We will simulate the “crypto_nic” design under the “10G simulation framework” We will show you how to –create simple packets using scapy –transmit and reconcile packets sent over 10G Ethernet and PCIe interfaces

154 Spring Camp Testing: Simulation(2) Run a simulation to verify changes: 1.cd ~/NetFPGA-10G-live/projects/crypto_nic/hw 2.make sim or 1.cd ~/NetFPGA-10G-live/projects/crypto_nic/hw 2.make simgui make simgui opens isim *more on this later* Now we can implement the crypto functionality

155 Spring Camp cd ~/NetFPGA-10G-live/projects/crypto_nic/hw vim make_pkts.py This will create eight simple IP packets for injecting into the system from each port including CPU

156 Spring Camp Using the created packets we create the stimulus (input to the simulator) and the expected output from the simulator NB. line 97 deciphers the packets from the device To test without a key – edit ~/NetFPGA-10G-live/projects/crypto_nic/hw/Makefile and set the argument of -k to be 0x0 The destination in this code snippet is the DMA

157 Spring Camp Using the created packets we create the stimulus (input to the simulator) and the expected output from the simulator NB. line 97 deciphers the packets from the device To test without a key – edit ~/NetFPGA-10G-live/projects/crypto_nic/hw/Makefile and set the argument of -k to be 0x0 The destinations in this code snippet are the ports

158 Spring Camp The results are in… As expected, two packets are received from PCIe on each 10G interface. A total of 32 packets, eight from each 10G interface, are received on the PCIe.

159 Spring Camp Running simulation in iSim (1) To invoke iSim for design simulation, run: #cd /hw/ #make simgui This will open simulation in the iSim graphical user interface

160 Spring Camp Running simulation in iSim (2) Objects panel Instance panel Waveform window Tcl console

161 Spring Camp Running simulation in iSim (3) Instance panel: displays process and instance hierarchy Objects panel: displays simulation objects associated with the instance selected in the instance panel Waveform window: displays wave configuration consisting of signals and buses Tcl console: displays simulator generated messages and can execute Tcl commands

162 Spring Camp Simulation gone wild When make sim fails:
1 source /opt/Xilinx/13.4/ISE_DS/settings64.sh
2 … XPS% make: *** No rule to make target `___xps/simgen.opt', needed ….. Go fix the address in ….hw/system.mhs
3 ERROR:EDK – for point-2-point interface connector ‘nf10_crypto_0_m_axis’…..
3a Add the crypto module to the MHS file (edit system.mhs OR use the GUI)
3b Check the connectivity of the module (to each of the OPL and Output Queue modules)
4 ….Cannot find MPD for weird core name ….
5 ERROR: Simulator:778 – Static elaboration of top level Verilog design unit(s) in library work failed. Edit the pao file (~/NetFPGA-10G-live/projects/crypto_nic/hw/pcores/nf10_crypto_v1_00_a/data/nf10_crypto_v2_1_0.pao). For simulation: uncomment the first four lines (39-42) and comment out the last line (43). For synthesis: uncomment lines 39, 40 and 43 and comment out lines 41 and 42.
6 If sim finishes but complains that each test passes 2 packets but all tests FAIL – this means your static key is different between your code and your Makefile (argument to make_pkts.py). check the log (verilog hhhhhhhk

163 Spring Camp Simple Verilog error → Lots of errors When the file is empty…

164 Spring Camp Section IX: Conclusion

165 Spring Camp Acknowledgments
NetFPGA Team at Stanford University (Past and Present): Nick McKeown, Glen Gibb, Jad Naous, David Erickson, G. Adam Covington, John W. Lockwood, Jianying Luo, Brandon Heller, Paul Hartke, Neda Beheshti, Sara Bolouki, James Zeng, Jonathan Ellithorpe, Sachidanandan Sambandan, Eric Lo
NetFPGA Team at University of Cambridge (Past and Present): Andrew Moore, David Miller, Muhammad Shahbaz, Martin Zadnik, Matthew Grosvenor, Gianni Antichi, Neelakandan Manihatty-Bojan, Georgina Kalogeridou, Jong Hun Han, Noa Zilberman
All Community members (including but not limited to): Paul Rodman, Kumar Sanghvi, Wojciech A. Koszek, Yashar Ganjali, Martin Labrecque, Jeff Shafer, Eric Keller, Tatsuya Yabe, Bilal Anwer, Kees Vissers, Michaela Blott, Shep Siegel

166 Spring Camp Acknowledgement Support for the NetFPGA-10G project has been provided by the following companies and institutions Disclaimer: Any opinions, findings, conclusions, or recommendations expressed in these materials do not necessarily reflect the views of any sponsors supporting this project.

167 Spring Camp Thanks to our Sponsors: Support for the NetFPGA project has been provided by the following companies and institutions Disclaimer: Any opinions, findings, conclusions, or recommendations expressed in these materials do not necessarily reflect the views of the National Science Foundation or of any other sponsors supporting this project. This effort is also sponsored by the Defense Advanced Research Projects Agency (DARPA) and the Air Force Research Laboratory (AFRL), under contract FA C This material is approved for public release, distribution unlimited. The views expressed are those of the authors and do not reflect the official policy or position of the Department of Defense or the U.S. Government.

