Washington WASHINGTON UNIVERSITY IN ST LOUIS Control Update 1: Phase 0 Fred Kuhns Applied Research laboratory Department.


1 Control Update 1: Phase 0
Fred Kuhns
fredk@arl.wustl.edu
Applied Research Laboratory
Department of Computer Science and Engineering
Washington University in St. Louis

2 What’s in the slides?
Fred Kuhns - 10/20/2015
Some guiding requirements impacting design
What will the overlay networks look like (to me)
  – simple picture summarizing the relationship between the diversified networking model and our current design
Mapping IP to Ethernet addresses
  – simple picture depicting how we may associate the MAC layer next hop with a network layer next hop
Basic slice creation (i.e. conventional PlanetLab slice)
Creating an NP-based slice (what we add)
Run-time (production) support
  – dynamic control and configuration requirements/needs
Boot/configure time support
  – initial configuration of data plane and any debug needs
Meta-Router control
  – local delivery and exception packets
Configuration tool (cmd shell)
Testing: packet generation using sp++

3 Goals/Charge
Create high performance PlanetLab node
Maintain compatibility with existing plab nodes/interfaces
  – external interfaces same as existing plab
  – where possible conform to existing plab abstractions, models, interfaces and development paradigms
Extend interfaces
  – internal interfaces add NP abstractions, distributed resource management
Special issues/concerns
  – Node audit service: Meta-Net traffic (flow) accounting conforming to existing netflow stats
  – Virtual machine model and node manager interface: extending rspec to account for NPs
  – Slice model: extending to include heterogeneous nodes (realizing slivers)

4 Visualizing Ports, Links and Nodes
1. Meta-router uses a single UDP port number (i.e. meta-port)
  – any host/router may send traffic to the advertised IP address/UDP port pair
  – only works if all meta-net traffic uses a single line card and physical port
2. Meta-router uses a UDP port per physical interface in use.
3. UDP tunnels act as meta-links
  – define a unique UDP tunnel between pairs of meta-routers
  – may have multiple UDP ports for each physical interface in use.
[Figure labels: nodes IP A, IP B, IP C, IP D; ports P0, P1, … Pm (Pn, Pl, Pp)]

5 Mapping IP to Ethernet Destination: Simple Case
A Meta-Router encapsulates its packets within a UDP datagram using the destination IP address and port number obtained from the lookup.
The packet is then sent to the line card encapsulated within an Ethernet 802.1p/q frame. The Ethernet destination address is obtained from the lookup.
The line card must replace the Ethernet header with one specifying the MAC layer next hop (eth addr). For the demo we will assume there is only one next hop Ethernet device.
Simplifying assumption: For a given physical output port, all packets use the same Ethernet header, in particular the same Ethernet destination address regardless of the IP destination address.
[Figure labels: Substrate Router, Meta-Router, Line Card, Ethernet, Eth 1, Eth 2, IP W, IP rtr, IP X, IP Y, IP Z, MR 1, MR 2, MR 3]

6 Mapping IP to Ethernet Destination: Not so Simple
Context: In general we cannot assume there will be only one next hop Ethernet device.
Problem: We cannot assume the destination IP address corresponds to the next hop Ethernet device (the current design’s built-in assumption).
Solutions
  – Create table mapping packet IP destination addresses to next hop Ethernet addresses.
  – Line card performs IP route lookup to obtain the next hop IP address, then uses ARP.
  – Meta-router supplies the next hop IP address, then ARP is used.
  – Meta-router supplies the next hop Ethernet address.
[Figure labels: Substrate Router, Meta-Router, Line Card, Ethernet Switch, Ethernet, Eth 1, Eth 2, Eth 3, IP W, IP rtr (twice), IP X, IP Y, IP Z, MR 1, MR 2, MR 3]
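The first option above, a table mapping IP destination addresses to next hop Ethernet addresses, might look roughly like the following sketch. The class name NextHopTable, the use of longest-prefix matching, and the fallback to a single default MAC (the simple case of one next hop device) are all assumptions made for illustration; none of them come from the slides.

```python
import ipaddress


class NextHopTable:
    """Hypothetical map from IP destination prefixes to next hop MAC addresses."""

    def __init__(self, default_mac=None):
        # default_mac models the simple case: one next hop device per output port.
        self.default_mac = default_mac
        self.entries = []  # list of (network, mac) pairs

    def add(self, prefix, mac):
        self.entries.append((ipaddress.ip_network(prefix), mac))

    def lookup(self, ip_dst):
        """Return the MAC of the most specific matching prefix, else the default."""
        addr = ipaddress.ip_address(ip_dst)
        best = None
        for net, mac in self.entries:
            if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
                best = (net, mac)
        return best[1] if best else self.default_mac
```

A linear scan is enough for a handful of next hops; a real line card would hold these entries in the TCAM instead.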

7 A meta-router may use multiple physical ports
[Figure labels: Meta-Router (NPE), Ethernet Switch (in chassis), Line Card, RTM, Ethernet switch (LAN or Router) (twice)]

8 Basic Slice Creation: No changes
Slice information is entered into PLC database.
  – Current: Node manager polls PLC for slice data.
  – Planned: PLC contacts Node manager proactively.
Node manager (pl_conf) periodically retrieves slice table.
  – updates slice information
  – creates/deletes slices
Node manager (nm) instantiates new virtual machine (vserver) for slice.
User logs into vserver using ssh
  – uses existing plab mechanism on GPE.
Default configuration: forward traffic to the (single) GPE, in this case the user’s ssh login session.
[Figure labels: GPE (sys-sw, vnet, root ctx, NM, slice X, new Slice (Y)), RM, Preallocated Ports (UDP), NPE (per Slice contexts, X), Ethernet Switch, Line card, Eth 1, Eth 2, Eth 3]
Line card lookup table (TCAM):
  filter    result
  default   Eth 1, VLAN 0
  TUN X     Eth 2, VLAN X
  …

9 Requesting NP
User requests shared-NP
  – Specify code option
  – Request UDP port number for overlay tunnel
  – Request local UDP port for exception traffic
Substrate Resource Manager
1) Configure SW: Assign local VLAN to new meta router. Enable VLAN on switch ports.
2) Configure NPE: allocate NP with requested code option (decision considers both current load and available options)
3) Configure LC(s): Allocate an externally visible UDP port number (from the preallocated pool of UDP ports for the external IP address). Add filter(s)
  – Ingress: packet’s destination port -to- local (chassis) VLAN and MAC destination address
  – Egress: IP destination address (??) -to- MAC destination address and RTM physical output port
4) Configure GPE: Open local UDP port for exception and local delivery traffic from NPE. Transfer local port (socket) and results to client slice
Meta-network traffic uses UDP tunnels. Only need to install filter in TCAM.
Exception and local delivery traffic. Only need to install filter in TCAM.
[Figure labels: GPE (sys-sw, vnet, root ctx, NM, slice X, Slice Y), RM, Preallocated Ports (UDP), NPE (per Slice contexts, X, Y), Ethernet Switch, Line card, Eth 1, Eth 2, Eth 3; callouts 1 to 4 mark the steps above]
Line card lookup table (TCAM):
  filter    result
  default   Eth 1, VLAN 0
  TUN X     Eth 2, VLAN X
  …
  TUN Y     Eth 2, VLAN Y

10 Software maintained Tables/Maps
Mappings/associations needed for creating filters

Meta-Port to Physical Interface Table
  Substrate Interface                | Meta-Port Identifier
  Line Card | Physical Interface     | External IP Address | UDP port
  Slot 0    | Port 3                 | 192.168.100.4       | 32405
  …         | …                      | …                   | …
  Slot 1    | Port 5                 | 192.168.100.4       | 32400

Line Card Next Hop Table
  Physical Port | Next Hop MAC
  1             | Eth 1
  …             | …
  N             | Eth N
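As a rough illustration of how these two tables could feed filter creation, the sketch below derives one ingress and one egress filter entry for a meta-port. The MetaPort record, the dictionary layout of a filter, and all addresses in the usage are hypothetical; the slides leave the real TCAM entry format open, and the egress key (IP destination address) is still marked with question marks there.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetaPort:
    slot: int        # line card slot
    phys_port: int   # physical interface on that line card
    ext_ip: str      # externally visible IP address
    udp_port: int    # externally visible UDP port (the meta-port)


def ingress_filter(mp, vlan, npe_mac):
    """Match the tunnel's destination address/port; result is chassis VLAN and MAC."""
    return {"match": (mp.ext_ip, mp.udp_port),
            "result": {"vlan": vlan, "mac_dst": npe_mac}}


def egress_filter(ip_dst, mp, next_hop_by_port):
    """Match the IP destination; result is next hop MAC and physical output port."""
    return {"match": ip_dst,
            "result": {"mac_dst": next_hop_by_port[mp.phys_port],
                       "phys_port": mp.phys_port}}
```

The egress result here depends only on the physical port, which mirrors the simplifying assumption of one next hop MAC per output port.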

11 Configure Ethernet Switch: Step 1
Allocate next unused VLAN id for meta-net.
  – In this scenario can a meta-net have multiple meta-routers instantiated on a node?
  – If so, do we allocate switch bandwidth and a VLAN id for the meta-net or for each meta-router?
Configure Ethernet switch
  – enable VLAN id on applicable ports
  – need to know line card to meta-port (i.e. IP tunnel) mappings
  – if using external GigE switch then use SNMP (python module pysnmp)
  – if using Radisys blade then use SNMP???
  – set default QoS parameters, which are???
  – other ??
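A minimal sketch of the "allocate next unused VLAN id" step is shown below. The VlanAllocator class is hypothetical, and the usable range is an assumption: VLAN 0 is left for the default entry shown in the lookup table figure and 4095 is reserved, so ids are drawn from 1 to 4094. Whether the owner is a meta-net or a single meta-router is exactly the open question above, so the owner is just an opaque label here.

```python
class VlanAllocator:
    """Hypothetical allocator handing out the lowest unused VLAN id."""

    def __init__(self, first=1, last=4094):
        # Assumption: 0 is used for default traffic and 4095 is reserved.
        self.first = first
        self.last = last
        self.owner_by_vid = {}

    def allocate(self, owner):
        for vid in range(self.first, self.last + 1):
            if vid not in self.owner_by_vid:
                self.owner_by_vid[vid] = owner
                return vid
        raise RuntimeError("no unused VLAN id")

    def release(self, vid):
        self.owner_by_vid.pop(vid, None)
```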

12 Configure NPE: Step 2
vlan table:
  – code option and instance number
memory for code options
  – instance: base address, size and index/instance
  – each instance is given an instance number to use for indexing into a common code option block of memory
  – each code option is assigned a block of memory
  – code option: base address and size. Also max number of instances that can be supported.
Select NPE to host client MR
  – Select eligible NPEs (those that have the requested code option)
  – Select best NPE based on current load and do what???
Configure NPE
  – Add entry to SRAM table mapping VLAN:PORT to MR instance
      What does this table look like? Where is it?
  – Allocate memory block in SRAM for MR.
      Where in SRAM are the eligible blocks located? How do I reference the block?
      1) allocate memory for code option at load time
      2) allocate memory dynamically
  – Allocate 3 counter blocks for MR
      Where are the blocks? How are they referenced (i.e. named)? Using VM/PM address on NP?
  – Configure MR instance attributes
      What attributes are needed by the different code options?
      Tunnel header fields; exception/local delivery IP header fields, QID, physical port#; Ether src of NPE???
      Set default QM QIDs, weights and number of queues?
      ??
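The instance-number indexing described above could be sketched as follows, assuming each code option owns one contiguous block split into equal-sized instance slots. The CodeOptionBlock name, the equal-slot layout, and the address arithmetic are assumptions for illustration; the slides do not yet say where the blocks live or how they are referenced.

```python
from dataclasses import dataclass, field


@dataclass
class CodeOptionBlock:
    """Hypothetical per-code-option memory block divided into instance slots."""
    base: int            # base address of the code option's block
    instance_size: int   # bytes reserved per instance
    max_instances: int   # maximum number of instances supported
    in_use: set = field(default_factory=set)

    def allocate(self):
        """Return (instance number, slot address) for the lowest free slot."""
        for idx in range(self.max_instances):
            if idx not in self.in_use:
                self.in_use.add(idx)
                return idx, self.base + idx * self.instance_size
        raise MemoryError("no free instance slot for this code option")

    def release(self, idx):
        self.in_use.discard(idx)
```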

13 Configure LC(s): Step 3
User may request specific UDP port number
Open UDP socket (on GPE)
  – open socket and bind to external IP address and UDP port number. This prevents other slices or the system from using the selected port
Configure line card to forward tunnel(s) to correct NPE and MR instance
  – Add ingress and egress entries to TCAM
      How do I know the IP-to-Ethernet destination address mapping for the egress filter?
  – For both ingress and egress allocate QID and configure QM with rate and threshold parameters for MR.
      Do I need to allocate a queue (whatever this means)?
      Need to keep track of QIDs (assign QID when creating an instance, etc.)
      For egress I need to know the output physical port number. I may also need to know this for ingress (if we are using external sw).
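Reserving a port by binding a socket to it, as described above, can be sketched with the standard socket module. The function name reserve_udp_port is hypothetical; passing port 0 asks the kernel to pick a free port, which is a convenience of this sketch rather than something the slides specify.

```python
import socket


def reserve_udp_port(ip, port=0):
    """Bind a UDP socket so no other process can bind the same address/port.

    Returns the open socket (keep it open to hold the reservation) and the
    port number actually bound.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.bind((ip, port))
    except OSError:
        sock.close()
        raise
    return sock, sock.getsockname()[1]
```

The reservation lasts only while the socket stays open, so the resource manager would need to hold it (or hand it to the client) for the lifetime of the meta-router.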

14 Configuring GPE: Step 4
Assign local UDP port to client for receiving exception and local delivery traffic.
  – user may request specific port number.
  – use either a preallocated socket or open a new one.
  – use UNIX domain socket to pass socket back to client along with other results.
  – all traffic will use this UDP tunnel; this means the client must perform IP protocol processing of the encapsulated packet in user space.
      for exception traffic this makes sense.
      for local delivery traffic the client can use a tun/tap interface to send the packet back into the Linux kernel so it can perform more complicated processing (such as TCP connection management). Need to experiment with this.
      should we assign a unique local IP address for each slice?
Result of shared-NPE allocation and socket sent back to client.
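Passing an open socket back to the client over a UNIX domain socket is normally done with SCM_RIGHTS ancillary data. The sketch below uses Python's standard sendmsg/recvmsg calls; the helper names send_fd and recv_fd and the idea of carrying the other results in the message payload are assumptions for illustration.

```python
import array
import socket


def send_fd(channel, fd, payload=b"ok"):
    """Send one file descriptor plus a small payload over a UNIX domain socket."""
    fds = array.array("i", [fd])
    channel.sendmsg([payload], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, fds)])


def recv_fd(channel, bufsize=1024):
    """Receive the payload and one file descriptor sent with send_fd."""
    fds = array.array("i")
    msg, ancdata, _flags, _addr = channel.recvmsg(
        bufsize, socket.CMSG_LEN(fds.itemsize))
    for level, ctype, data in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_RIGHTS:
            # Drop any trailing partial integer before decoding.
            usable = len(data) - (len(data) % fds.itemsize)
            fds.frombytes(data[:usable])
    fd = fds[0] if len(fds) else None
    return msg, fd
```

On the client side, socket.socket(fileno=fd) wraps the received descriptor so it can be used as an ordinary UDP socket.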

15 Run-Time Support for Clients
Managing entries in NPE TCAM (lookup)
  – add/remove entry
  – list entries
NPE Statistics:
  – Allocate 2 blocks of counters: pre-queue and post-queue.
  – clear block counter pair (Byte/Pkt) ???
  – get block counter pair (Byte/Pkt)
      specify block and index
      get once, get periodic
  – get counter group (Byte/Pkt)
      specify counter group as set of tuples: {(index, block), …}
SRAM read/write
  – read/write MR instance specific SRAM memory block
  – relative address and byte count; writes include new value as byte array.
Line card: meta-interface packet counters, byte counters, rates and queue thresholds
  – get/set meta-interface rate/threshold
Other
  – Register next hop nodes as the tuple (IPdst, ETHdst), where IPdst is the destination address in the IP packet and ETHdst is the corresponding Ethernet address.
  – Can we assume the destination Ethernet address is always the same?
  – Issue: how do we map this to LC and physical interface? We need this information to configure output TCAM entries on line cards.

16 Boot-time Support
Initialize GPE
Initialize NPE
Initialize LC
things to init
  – spi switch
  – memory
  – microengine code download
  – tables??
  – default line card tables
  – default code paths
  – TCAM

17 IP Meta Router: Control
All meta-net traffic arrives via a UDP tunnel using a local IP address.
  – raw IP packets must be handled in user space.
  – complete exception traffic processing in user space.
  – local delivery traffic: can we inject it into the Linux kernel so it performs transport layer protocol processing? This would also allow applications to use the standard socket interface.
  – should we use two different IP tunnels, one for exception traffic and one for local delivery?
Configuration responsibilities?
Stats monitoring for demo?
  – get counter values
  – support for traceroute and ping
  – ONL-like monitoring tool
Adding/removing routes:
  – static routing tables or do we run a routing protocol?

18 IP-Meta Router
Internal packet format has changed.
  – see Jing’s slides
Redirect: not in this version of the meta-router

19 XScale Control Software
Substrate Interface
  – Raw interface for reading/writing arbitrary memory locations.
  – substrate stats?
  – add new meta-router
Meta-router/Slice interface
  – all requests go through a local entity (managed)
  – not needed: authenticate client
  – validate request (verify memory location and operation)
Node Initialization
  – ??

20 Command/Configuration Tool
Simple command interpreter with syntax similar to lisp
Basic syntax:
    expr := cmd [arg]*
    arg := ['(' expr ')' | array | scalar | string]
Commands are either arithmetic expressions or some system defined operation (mem, vmem, set, etc.)
Command arguments are typed scalar and array values: integers, doubles and strings
Allows you to read/write any location in physical memory interactively or via a script.
  arg type | POSIX    | Traditional
  dw1      | uint8_t  | unsigned char
  dw2      | uint16_t | unsigned short
  dw4      | uint32_t | unsigned int
  dw8      | uint64_t | unsigned long long
  string   | NA       | char BUF[]
  double   |          |
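To make the argument types concrete, the sketch below shows how dw1 to dw8 values could be turned into the byte array that a memory write needs, using Python's struct module. The mapping follows the type table above; the function name pack_words and the default big-endian byte order are assumptions, since the slides do not state the byte order the tool uses.

```python
import struct

# Fixed-width unsigned types from the table above, as struct format codes.
DW_FORMATS = {"dw1": "B", "dw2": "H", "dw4": "I", "dw8": "Q"}


def pack_words(arg_type, values, byte_order=">"):
    """Pack a list of unsigned integers of one dw type into bytes.

    byte_order is a struct prefix: ">" big-endian (assumed default here),
    "<" little-endian.
    """
    fmt = byte_order + DW_FORMATS[arg_type] * len(values)
    return struct.pack(fmt, *values)
```

For example, pack_words("dw4", [0x01010101, 0x02020202]) yields eight bytes, four per word.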

21 Example Operations
    cmd> $a = (dw4 0x01010101 \
                   0x02020202 \
                   0x03030303)
    cmd> $b = $a + (dw4 0x01010101 0x02 4)
    cmd> $c = 3 + $b[2] * 2 - 4
    cmd> (dw4 $c)
    cmd> $t = "text one" +\
              " two"
    cmd> set
    Symbol Table:
    cmd> help
    Usage:
      : type is one of {int, dw8, dw4, dw2, dw1, dbl}
      …
      load "file_name"
      mem : commands to manage internal memory maps
        mem read maps
        mem show maps
        mem read paddr [type] [count]
        mem write paddr value
      vmem : read/write to kernel virtual memory
        vmem read vaddr [type] [count]
        vmem write vaddr value

22 Reading Memory Maps
    cmd> mem read maps
    Adding symbols:
    cmd> mem show maps
    DRAM Channel 0: kpa 0x00000000, kva 0xa7480000, Size 536870912 (cachable 0, bufferable 0)
    DRAM CSR Ch 0: kpa 0xd0009000, kva 0xa73d0000, Size 65536 (cachable 0, bufferable 0)
    DRAM CSR Ch 1: kpa 0xd000a000, kva 0xa73f0000, Size 65536 (cachable 0, bufferable 0)
    DRAM CSR Ch 2: kpa 0xd000b000, kva 0xa7410000, Size 65536 (cachable 0, bufferable 0)
    SRAM Channel 0: kpa 0x80000000, kva 0x00000000, Size 0 (cachable 0, bufferable 1)
    SRAM Channel 1: kpa 0x90000000, kva 0xc7490000, Size 8388608 (cachable 0, bufferable 1)
    SRAM Channel 2: kpa 0xa0000000, kva 0xc7ca0000, Size 8388608 (cachable 0, bufferable 1)
    SRAM Channel 3: kpa 0xb0000000, kva 0xc84b0000, Size 8388608 (cachable 0, bufferable 1)
    …
    SRAM Ring1 CSR: kpa 0xce400000, kva 0xd16a0000, Size 4096 (cachable 0, bufferable 1)
    SRAM Ring2 CSR: kpa 0xce800000, kva 0xd16b0000, Size 4096 (cachable 0, bufferable 1)
    SRAM Ring3 CSR: kpa 0xcec00000, kva 0xd16c0000, Size 4096 (cachable 0, bufferable 1)
    SRAM CSR Ch 0: kpa 0xcc010000, kva 0xa7440000, Size 4096 (cachable 0, bufferable 1)
    …
    cmd>
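If the map listing needs to be consumed by a script, each line can be parsed with a regular expression. This is only a sketch that assumes every entry follows the "name: kpa …, kva …, Size … (cachable …, bufferable …)" layout shown above; the function name parse_map_line and the returned field names are hypothetical.

```python
import re

MAP_LINE = re.compile(
    r"(?P<name>.+?):\s*kpa (?P<kpa>0x[0-9a-fA-F]+), kva (?P<kva>0x[0-9a-fA-F]+), "
    r"Size (?P<size>\d+) \(cachable (?P<cachable>[01]), bufferable (?P<bufferable>[01])\)"
)


def parse_map_line(line):
    """Parse one 'mem show maps' entry into a dict, or return None if it does not match."""
    m = MAP_LINE.search(line)
    if m is None:
        return None
    return {
        "name": m.group("name").strip(),
        "kpa": int(m.group("kpa"), 16),   # kernel physical address
        "kva": int(m.group("kva"), 16),   # kernel virtual address
        "size": int(m.group("size")),     # size in bytes
        "cachable": m.group("cachable") == "1",
        "bufferable": m.group("bufferable") == "1",
    }
```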

23 Possible configuration script
    set MYTABLE_START 0xXXXXXXX
    mem write $MYTABLE_START (dw4 0x00000000 \
                                  $DEFAULT_ADDR \
                                  $DEFAULT_VLAN)
    set ETHER_BASE 00:e4:4d:33:00:00
    $ETHER_BASE[5] = 2
    mem write $ETHER_TABLE[0] $ETHER_BASE
    $ETHER_BASE[5] = 3
    mem write $ETHER_TABLE[0] $ETHER_BASE
    …
    mem write ($MYTABLE_START + 20) (mem read $SOMEPLACE dw4 1)
    …

24 Testing: Generating Packets
sp++
---------Packet/data Sending Rate --------------
  [(-n|--pcnt) n] : Number of pkts to send. default = 100000.
  [(-x|--pps) rate] : Pkt/sec, default 100
  [--Kbps rate] : Kbps for IP datagrams. default 0 Kbps
  [--KBps KBps] : KBps for IP datagram
---------When pkts are sent, see below for description--------
  [(-m|--mode) m] : m is one of {cont|burst|swait}
  [(-p|--period) p] : send pkts every msecs. default = 0 msec
  [(-B|--batch) b] : Number of pkts to send in a batch, default = 0 pkts
  [--pdelay n] : nsec inter-packet gap
---------Packet Size, Specify only one --------------------------
  [--dlen b] : Size of payload in bytes, default = 4
--------Flags affecting pkt size or content -----------------
  [--dtype type] : Packet data type (zero, seq, UDP)
  [--ftype type] : Type of frame to send (raw, udp, tcp, data)
  [--file name] : Name of file containing the raw packet data (ftype == raw)
---------Network addressing information -------------------------
  [--sa host] : Use local address "host", default INADDR_ANY
  [(-s|--sp) port] : Source port to use. default = 0
  [--da host] : Send to remote "host", required option
  [(-d|--dp) port] : Destination port number to use. default = 5050
  [--pr (udp|tcp)] : Transport protocol UDP or TCP. default = udp
---------Various Endsystem/Socket Control parameters ------------
  [--sbuf sz] : Set socket buffer size, not set by default
---------Parameters affecting the core processing steps----------
  [(-D|--dot) (0/1)] : print a dot '.' each time we have to retransmit
  [--rxtout ms] : Timeout for reply pkts, units msec. default = 100 msec
---------Debug/message flags -----------------------------------
  [--rt p] : Put process in the real-time scheduling class with prio 'p'.
  …

25 Example Command
Example using a constant inter-packet gap
    sp++ -n 10 --pps 1 --mode cont --ftype raw \
         --file Rx_NPUA_Dev_0_Port_0.log --ifn eth2
  -n 10 : send a total of 10 packets
  --pps 1 : send at a rate of 1 packet per second
  --mode cont : use a constant inter-packet delay calculated from pps
  --ftype raw : selects the RAW packet interface protocol family
  --file Rx_NPUA_Dev_0_Port_0.log : read packet contents from file
  --ifn eth2 : send packets out interface eth2
Or for low packet rates use burst mode:
    sp++ -n 10 --pps 1 --mode burst --ftype raw \
         --file Rx_NPUA_Dev_0_Port_0.log --ifn eth2
  only difference is the option: --mode burst
Printing the help message
    sp++ --help
    ...
I have copied an example packet file into the bin directory (/opt/bin): Rx_NPUA_Dev_0_Port_0.log
You must be root or use sudo to run sp++ since it opens a raw device socket.
You can use tcpdump to watch the packets being sent:
    tcpdump -i eth2
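The constant inter-packet delay that --mode cont derives from --pps is presumably just the reciprocal of the rate. The short sketch below shows that arithmetic in nanoseconds (the unit used by --pdelay); it illustrates the calculation only and is not taken from the sp++ source.

```python
def inter_packet_gap_ns(pps):
    """Constant gap between packets, in nanoseconds, for a given packets/sec rate."""
    if pps <= 0:
        raise ValueError("pps must be positive")
    return round(1e9 / pps)
```

With --pps 1 this gives a gap of one second, and with the default of 100 packets per second it gives 10 milliseconds.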

26 Example File: Rx_NPUA_Dev_0_Port_0.log
    0102030405060708090a0b0c81000aaa080045000
    05000000000ff003a5ac0a80001c0a80002000100
    02003cff1b008500004500003000000000ff113a6
    9c0a80001c0a8000200010002001cd3b4dddddddd
    ddddddddddddddddddddddddddddddddcaa08273
    0102030405060708090a0b0c81000aaa080045000
    05000000000ff003a5ac0a80001c0a80002000100
    02003cfedc00c400004500003000000000ff113a6
    9c0a80001c0a8000200010002001cd3b4dddddddd
    dddddddddddddddddddddddddddddddd306942f9
    0102030405060708090a0b0c81000aaa080045000
    04c00000000ff003a5ec0a80001c0a80002000100
    020038ffa85500003000000000ff112a69c0a8000
    1c0a8000200010002001cd3b4dddddddddddddddd
    dddddddddddddddddddddddddb16526b

27 Testing Environment: Generating Traffic
What sort of packet generation features are useful? What do you need?
Generate packets identical to those used in simulation?
  – specify on command line?
Do you need to generate arbitrary Ethernet headers or can we preconfigure the host’s Ethernet interface to use VLANs?
Do you need to specify arbitrary UDP tunnel headers or can we use the standard socket mechanism to establish the tunnel?
The encapsulated IP and transport headers will be built up by the program (sp) and thus must be specified on the command line.
  – or is there a default encapsulated header that will do and can be preconfigured at compile time? This can be overridden at run time.

28 Expected Ethernet Frame Format (see 802.3ac)
Frame layout: Ethernet Hdr | IP Hdr | UDP Hdr | Payload | Frame Check Sequence (FCS)
Ethernet header: Destination Address (6 B), Source Address (6 B), EtherType (vlan 0x8100), prio, CFI, VID, Original EtherType
Tunnel headers
  – IP header: Version, HdrLen, TOS, Total length, Identification, Flags, Fragment offset, TTL, Protocol, IP Header checksum, IP Source Address, IP Destination Address
  – UDP header: sport, dport, length, cksum
Encapsulated IP datagram (payload): Version, HdrLen, TOS, Total length, Identification, Flags, Fragment offset, TTL, Protocol, IP Header checksum, IP Source Address, IP Destination Address, transport header
Tag control information (TCI): Priority (3 bits), Canonical format indicator (CFI) (1 bit), VLAN ID (VID) (12 bits); the TCI is followed by the Length/Type field (16 bits).
CFI should always be set to zero (CFI = 0).
VID = 0 identifies priority frames (what does this mean?).
VID = 4095 (0xfff) is reserved.
Minimum frame size is 64 B.
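The tag layout above can be checked against the example packet file with a few lines of Python. This sketch decodes the first 18 bytes of a VLAN-tagged frame (destination, source, 0x8100, TCI, original EtherType) and splits the TCI into priority, CFI and VID; the function name parse_vlan_header and the returned field names are hypothetical.

```python
import struct


def parse_vlan_header(frame):
    """Decode the Ethernet header and 802.1Q tag at the start of a frame."""
    dst, src, tpid, tci, ethertype = struct.unpack("!6s6sHHH", frame[:18])
    if tpid != 0x8100:
        raise ValueError("frame does not carry an 802.1Q tag")
    return {
        "dst": ":".join(f"{b:02x}" for b in dst),
        "src": ":".join(f"{b:02x}" for b in src),
        "prio": tci >> 13,          # top 3 bits
        "cfi": (tci >> 12) & 0x1,   # next bit
        "vid": tci & 0x0FFF,        # low 12 bits
        "ethertype": ethertype,     # original EtherType, e.g. 0x0800 for IPv4
    }
```

Applied to the leading bytes of the example file (0102030405060708090a0b0c 8100 0aaa 0800), this gives priority 0, CFI 0, VID 0xaaa and an IPv4 EtherType.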

