Cisco 12016 Product Highlights

Switching performance
- 16-slot system, 2.5 Gbps switching capacity per slot; can support 10 Gb LCs if the fabric is upgraded
- Increased number of linecards

Configuration
- 2 interface shelves, 16 slots
- 1 fabric shelf with 5 slots
- 2 alarm cards: 1 top shelf, 1 bottom shelf

More details of the electro-mechanical specs are available in the data sheet and the Cisco GSR documentation. Key points to note:
- The chassis has 2 interface shelves with 8 slots each
- The route processor has a dedicated slot in the shelves
- The pitch of the linecard slots is wider than on the 12012; to accommodate existing linecards, a filler is provided. This is critical to maintaining the airflow and EMI characteristics of the chassis
- The base configuration of the chassis includes:
  - 1 Gigabit Route Processor
  - Full fabric redundancy (3 SFC + 2 CSC)
  - Fully redundant power configuration (4 DC or 3 AC)
  - Redundant blower assembly and alarm cards
Cisco 12416 - Product Overview

Switching performance
- 16-slot system, 10 Gbps switching capacity per slot
- Supports 10G linecards
- Support for existing 12k linecards
- Slots are wider to accommodate 10 Gb LCs

Configuration
- 2 interface shelves, 16 slots
- 1 fabric shelf with 5 slots
- 2 alarm cards: 1 top shelf, 1 bottom shelf
Cisco 12410 Product Highlights
- 10 x OC-192-capable slots
- 8 slots are wider to accommodate 10 Gb LCs
- 2 legacy slots (narrower slots 8 and 9)
- 7-card fabric: 2 CSCs and 5 SFCs
Cisco 12406 Product Highlights
- 6-slot card cage
- 1 narrow slot dedicated for the RP
- 5 slots for redundant RP and line cards
- Components:
  - Switch Fabric Cards (SFC)
  - Clock and Scheduler Cards (CSC)
  - 1 or 2 Route Processors (RP)
  - Up to 5 Line Cards (LCs)
  - 1 or 2 Alarm Cards
- 1/3 rack height
Cisco 12404 GSR Product Highlights
- 4-slot card cage
- 1 narrow slot for the RP
- 3 10G-capable slots
- Components:
  - 1 Consolidated Fabric Card: CSC-4 (CSC, SFC, and alarm functions built in)
  - Route Processor (RP)
  - Up to 3 Line Cards (LCs)
- FABRIC IS NOT RESILIENT
Cisco 12816 - Product Overview

Switching performance
- 16-slot system, 40 Gbps switching capacity per slot
- Supports 20G and future 40G linecards
- Support for existing GSR linecards
- Slots are wider to accommodate 10/20 Gb LCs
- Requires PRP

Configuration
- 2 interface shelves, 16 slots
- 1 fabric shelf with 5 slots
- 2 alarm cards: 1 top shelf, 1 bottom shelf
Cisco 12810 Product Highlights
- 10 x 40 Gb-capable slots
- 8 slots are wider to accommodate 10/20 Gb LCs
- Requires PRP
- 2 legacy slots (narrower slots 8 and 9)
- 7-card fabric: 2 CSCs and 5 SFCs
System Components
- Route Processor
- Switching fabric
- Line cards
- Power/environmental subsystems
- Maintenance bus (MBUS)
12k Architecture - Components

[Diagram: line cards and route processor connected through the switch fabric; maintenance bus linking all cards, the fan/blower system, and the power supplies]

GRP: Gigabit Route Processor
- Runs IOS and the routing protocols; distributes the CEF table to the line cards
- Has a single 10/100 Ethernet port (for management only, not for transit switched traffic)
- Unlike the RSP on the 7500, the GRP is not involved in the packet switching path

Line cards
- IP/MPLS: IP/tag forwarding, ping response, fragmentation
- Queuing: FIFO, MDRR
- Congestion control: WRED
- Features: MPLS, CAR, tunnels, ACLs, BGP policy accounting
- Statistics: NetFlow, CEF accounting

Switch fabric

MBUS
- 1 Mbps redundant CAN bus
- Connects to LCs, RP, fabric, power supplies, and blowers
- Controls hardware discovery, environmental monitoring, firmware upgrades
Route Processor
- Boots and manages line cards
- Provides and coordinates routing services
- Builds, distributes, and maintains the FIB (adjacency table, FIB table, MPLS label table)
- Provides out-of-band console/aux ports
- Provides the intelligence behind system monitoring and access
RP - System Monitor/Controller

[Diagram: RP handling routing protocol updates, process-level traffic, system health monitoring, interface status messages, and statistics; temperature, voltage, and current monitoring over the maintenance bus to the line cards, fan/blower system, and power supplies]

GRP: Gigabit Route Processor
- Runs IOS and system interface software (SNMP, etc.)
- Builds the master CEF table and distributes it to the individual line cards
- Controls environmental and system functions
- Has a single 10/100 Ethernet port (for management only, not for transit switched traffic)
- Unlike the RSP on the 7500, the GRP is not involved in the packet switching path
- Statistics: NetFlow, CEF accounting
Line Cards
- Perform all packet switching
- Statistics collection and reporting
- Run IOS
- Six different forwarding architectures
MBUS - Maintenance Bus
- Out-of-band communications channel to the line cards
- 1 Mbps, 2-wire serial interface based on the Controller Area Network (CAN) 2.0 spec (ISO 11898)
- A daughter card on each linecard has its own CPU with integrated CAN controller, A/D converter and other peripherals, a dual CAN interface, SRAM, Flash, and serial EEPROM
- CSCs and the BusBoard can proxy and/or multiplex MBUS signals for the power supplies
- Control pins reach into the LEDs, serial ID EEPROM, DC/DC power converter, clock-select FPGA, temperature sensor, and voltage sensor
- Very large set of functions
MBUS Functions
- Power and boot LCs
- Device discovery
- RP arbitration
- OIR management
- Environmental monitoring
- Diagnostics download
- LC console access via the "attach" command (see the example below)
- Logging
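For reference, LC console access looks roughly like this from the RP exec prompt (the slot number and the exact LC prompt string are illustrative):

    Router# attach 3
    (opens a console session to the line card in slot 3;
     commands now run on the line card's own IOS)
    LC-Slot3# show version
    LC-Slot3# exit
    (back at the RP console)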
Alarm Cards
- LED display for fabric card status
- External alarm connection
- Power conversion/supply for the 5V MBUS power plane
- On the 12008, this functionality is on the CSC
Switch Fabric - Overview
- Provides the data path connecting the LCs and the RP
- The active CSC card provides the master clock for the system
- Everything traverses the fabric in Cisco cells; data is 8B/10B encoded
- Two components:
  - Clock & Scheduler Cards (CSC)
  - Switch Fabric Cards (SFC)
ciscoCell
- Packets are chopped into ciscoCells before they are sent across the switching fabric.
- A ciscoCell is 64 bytes: 48 bytes of IP payload, 8 bytes of header, and 8 bytes of CRC.
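A minimal sketch of the segmentation arithmetic in Python (field sizes are from the slide; the header and CRC contents are placeholders, not the real cell format):

    import math

    CELL_SIZE = 64      # total ciscoCell size in bytes
    PAYLOAD = 48        # IP payload bytes carried per cell
    HEADER = 8          # cell header bytes
    CRC = 8             # cell CRC bytes
    assert PAYLOAD + HEADER + CRC == CELL_SIZE

    def segment(packet: bytes) -> list[bytes]:
        """Chop a packet into fixed-size cells; pad the last one."""
        cells = []
        for off in range(0, len(packet), PAYLOAD):
            chunk = packet[off:off + PAYLOAD].ljust(PAYLOAD, b"\x00")
            header = b"\x00" * HEADER   # placeholder: real header fields not modeled
            crc = b"\x00" * CRC         # placeholder: real CRC not modeled
            cells.append(header + chunk + crc)
        return cells

    # a 1500-byte packet needs ceil(1500/48) = 32 cells
    print(len(segment(b"\x00" * 1500)), math.ceil(1500 / 48))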
Clock Scheduler Card (CSC)

Scheduler (SCA)
- Handles scheduling requests and issues grants for access to the crossbar switching fabric

Crossbar (XBAR)
- Sets the fabric lines for transmission following the scheduling decision

The Clock and Scheduler Card contains the following functionality:
- System clock: sent to all line cards, the GRP, and the switch fabric cards. The system clock synchronizes data transfers between line cards, or between line cards and the GRP, through the switch fabric. In systems with redundant clock and scheduler cards, the two system clocks are synchronized so that if one system clock fails, the other takes over.
- Scheduler: handles requests from the line cards for access to the switch fabric. When the scheduler receives a request from a line card, it determines when to allow that line card access to the switch fabric.
- Switch fabric: circuitry that carries user traffic between line cards, or between the GRP and a line card. The switch fabric on the clock and scheduler card is identical to the switch fabric on the switch fabric card.
- MTBF for the Clock Scheduler Card: 240,078 hours

The Switch Fabric Card contains the following functionality:
- Switch fabric: circuitry that carries user traffic between line cards, or between the GRP and a line card; identical to the switch fabric on the CSC.
- SFCs receive the scheduling information and clocking reference from the CSC cards and perform the switching functions.
Fabric Redundancy
- Each fabric card provides a slice of the Cisco cell data path
- Up to 5 data paths are available, for up to 4+1 redundancy
- The 5th data path carries an XOR of the other streams, used for recovery of an errored stream (see the sketch below)
- No 5th path = no recovery capability
- Grants travel exclusively between the LC and the active CSC over separate communication lines; they never traverse the SFC cards
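A minimal Python sketch of the XOR-parity idea: four data slices plus one parity slice, where any one lost slice can be rebuilt from the survivors (byte widths are illustrative, not the actual slice format):

    from functools import reduce

    def parity(slices):
        """5th path: XOR of the four data slices, byte by byte."""
        return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*slices))

    def recover(slices, parity_slice, lost):
        """Rebuild slice `lost` by XOR-ing the surviving slices with parity."""
        survivors = [s for i, s in enumerate(slices) if i != lost]
        return bytes(
            reduce(lambda a, b: a ^ b, col)
            for col in zip(parity_slice, *survivors)
        )

    data = [b"\x11\x22", b"\x33\x44", b"\x55\x66", b"\x77\x88"]
    p = parity(data)
    assert recover(data, p, lost=2) == data[2]   # errored stream recovered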
Scheduling Algorithm ("ESLIP")
- Request: each input LC requests to output its highest-priority queued cell (unicast or multicast)
- Grant: each destination LC grants the highest-priority request
- Accept: each input LC selects the highest grant
- Transmit: the XBAR is set and the cells are transmitted
ESLIP Illustrated

[Diagram: line cards, switch fabric, and the scheduler exchanging request/grant/accept messages]

- Each line card uses DRR to select a set of packets from the VoQs
- Request: a request is sent to the scheduler on the CSC to obtain a grant
- Grant: the scheduler (for each output) selects the highest-priority packet from the requests and determines whether the output can grant the request; it may send multiple grants (for multiple outputs) to a slot
- Accept: the slot selects the highest grant and accepts the connection
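A toy Python model of one request/grant/accept round (priorities are plain integers, lower = higher priority; the real ESLIP scheduler also keeps round-robin pointers and handles multicast, none of which is modeled here):

    # requests[i] = {output: priority} for input line card i
    requests = {
        0: {2: 1, 3: 5},
        1: {2: 3},
        2: {3: 2},
    }

    # Grant: each output picks the highest-priority request it received
    grants = {}   # output -> winning input
    for out in sorted({o for reqs in requests.values() for o in reqs}):
        contenders = [(prio, inp) for inp, reqs in requests.items()
                      if (prio := reqs.get(out)) is not None]
        grants[out] = min(contenders)[1]

    # Accept: each input picks the highest-priority grant it received
    accepts = {}  # input -> accepted output
    for inp in requests:
        offered = [(requests[inp][out], out)
                   for out, winner in grants.items() if winner == inp]
        if offered:
            accepts[inp] = min(offered)[1]

    print(grants)   # {2: 0, 3: 2}
    print(accepts)  # {0: 2, 2: 3} -> XBAR is set to these pairings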
Startup/Boot Process
- Initial power-on
- RP boot process
- Clock Scheduler boot process
- Line card boot process
- Switch fabric boot process
- Fabric initialization
- IOS download
Initial Power-On
- When the chassis is powered on, the MBUS module on each card is powered on.
- After the MBUS module powers on its processor, it boots from its EEPROM.
- Card power-up order varies depending on linecard type.
RP Boot Process
- The MBUS module powers on first
- Board logic starts, the image begins booting, and MBUS code is loaded to the MBUS module
- The CPU, memory controller ASIC, cell-handler ASICs, and FIA ASICs are then powered up
- The RP arbitration process is executed over the MBUS
- The master RP instructs the line cards and switch fabric cards to power on
- The RP waits for the line cards to power up and finish booting
Switch Fabric Card Startup/Boot
- The master RP instructs each SFC MBUS module to power on at the same time the line card MBUS modules are told
- The SFC obtains clock the same way each LC does
- The SLI ASICs and XBAR initialize and power up
- SFC MBUS code is downloaded from the RP
- All cards are now powered on but not yet usable
Line Card Startup/Boot
- Each LC MBUS module powers up after being told to do so by the RP
- Clock selection takes place
- The line card CPU is powered on and boots
- MBUS module code is loaded
- The line card CPU notifies the RP that it has booted
- Switch fabric access is not available yet
Line Card IOS Downloads
- The line card may already have enough code in its flash to become operational on the switch fabric, or it may require an MBUS download.
- Only enough code for the line card to become operational on the fabric is loaded over the MBUS.
- Once all cards are operational on the fabric, the fabric is initialized and the main IOS software is downloaded.
IPC Overview
- The 12k is a distributed multiprocessor system. The processors communicate via IPC, an essential architectural service.
- IPC has reliable (acknowledged) and unreliable modes of transport (with or without sequence numbers or notification). Each application uses the appropriate method.
IPC Clients
- Applications (clients) can build their own queue structures and feed the IPC queue/cache, and can choose whether to block until an ACK or embedded response is received.
  - e.g., CEF uses a multi-priority queue and its own cache in front of the IPC queue (controlled by "ip cef linecard ipc memory"); it has its own message-handling routines defined in the same registry as direct IPC interrupt- or process-level message handling.
- Many (most) applications use the CEF packaging (XDR) message types and queues as an interface to IPC.
  - e.g., route-map updates and ACL updates to line cards
- Applications are also responsible for being "well-behaved".
- Utility applications like slavelog and slavecore use IPC directly.
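A schematic of the two transport modes in Python (the class and method names are illustrative, not the actual IOS IPC API):

    import itertools

    class IpcSession:
        """Toy IPC port: reliable sends carry a sequence number and wait
        for an ACK; unreliable sends are fire-and-forget."""
        def __init__(self, transport):
            self.transport = transport          # callable: msg -> ack or None
            self.seq = itertools.count(1)

        def send_unreliable(self, payload):
            self.transport({"payload": payload})        # no seq, no retry

        def send_reliable(self, payload, retries=3):
            msg = {"seq": next(self.seq), "payload": payload}
            for _ in range(retries):
                if self.transport(msg) == ("ack", msg["seq"]):
                    return True                          # peer acknowledged
            return False                                 # caller must handle loss

    # a loopback transport that acknowledges sequenced messages
    sess = IpcSession(lambda m: ("ack", m["seq"]) if "seq" in m else None)
    assert sess.send_reliable("CEF prefix update")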
The Route Processor (RP)
- The RP's control path to the line cards uses IPC via the switch fabric or the MBUS
- The switch fabric connection is the main data path for route table distribution
- The MBUS connection enables the RP to download a bootstrap image, collect or load diagnostic information, and perform general maintenance operations
RP Responsibilities
- Running routing protocols
- Building and distributing the routing tables to line cards (i.e., routing table maintenance)
- Providing general maintenance functions (e.g., booting line card processors)

Notes:
- The RP derives its name from one of its primary functions: running the routing protocols enabled on the router.
- The master Forwarding Information Base (FIB) is built on the RP, which uses it to build the entries it then sends to each 12k line card across the switch fabric.
- The GRP plugs into any slot of the GSR chassis, serves as the console for the router, and handles environmental monitoring.
- The forwarding table and the routing table differ significantly.
RP Routing Table Maintenance
- Using the RIB, the RP maintains a complete forwarding table of its own (RP-FIB)
- Routing updates are forwarded from the RP RIB to each line card (LC-FIB)
- Each LC-FIB entry corresponds to an interface and contains a MAC encapsulation string, output interface, and MTU

Notes:
- The GRP is also responsible for maintaining a complete forwarding table on each line card. The Forwarding Information Base (FIB) differs from the Routing Information Base (RIB): the RIB contains information that is not useful on the line card (e.g., the time the route was installed, the metrics for the routes, etc.).
- The line card forwarding table contains supernet prefixes that can be used by the Layer 3 forwarding engines. These prefixes are overlapping and represent a complete copy of the routing table. Associated with each prefix is a MAC encapsulation string, an output interface, and an MTU.
- The forwarding table may have additional fields populated in a forwarding table entry (e.g., next hop, recursive route address). Multiple routes may also exist in the forwarding table for a particular prefix, which may change how load sharing is done in the system.
- The GRP is responsible for detecting, resolving, and maintaining recursive entries in the forwarding table. The GRP also maintains an adjacency table, which lists all neighboring IP systems; this table calculates and holds the MAC encapsulation strings.
- The forwarding table refers to the adjacency table to extract encapsulation strings. This information needs to be relayed down to the LCs: platform-specific software maps MAC encapsulation strings into board-specific MAC encapsulation pointers and lengths, and platform-independent software formats the forwarding table entry on the RP for distribution via IPC to the LC.
RP Routing Table Maintenance
- FIB distribution is done through reliable IPC updates
- When a routing protocol triggers an update, it is placed into the RP's FIB and then sent to the line cards
- Updates are unicast across the fabric to all line cards
GRP Components
- Power units
- DRAM
- ciscoCell Segmentation and Reassembly (CSAR) ASIC
- Tiger ASIC
- CPU
- Serial Line Interface ASIC (SLI)
- Fabric Interface ASIC (FIA)
GRP Component Groups
- Logic
- I/O sub-system
- Fabric
- MBUS
- Memory

Memory enhancements to the Cisco Internet Router Gigabit Route Processor (GRP):
- Increased route memory (DRAM) from 256MB to 512MB
- Increased PCMCIA flash storage from 20MB up to 128MB with the use of PCMCIA ATA flash disks

Key benefits of 512MB route memory:
- Supports up to 750,000 prefixes in some cases, depending on the type of line cards installed in the system
- Supports a large number of interfaces, including subinterfaces and channelized interfaces, depending on overall system memory usage
- Provides more memory space for running large IOS images, which have grown in size as more features are added to support current and emerging services and applications

Software requirements:
- New IOS and ROM Monitor software releases are required to use 512MB route memory on the GRP. IOS support is available in 12.0(19)S, 12.0(19)ST, and later releases. ROM Monitor software 11.2(181) is also required; it is bundled into the main IOS software packages.
- Flash disks are supported in 12.0(16)S, 12.0(14)ST, and later IOS releases, and need corresponding boot images to function properly.

512MB DRAM is available as of September 14, 2001; the Flash Disk is available as of September 21, 2001.
CSAR (ciscoCell Segmentation and Reassembly) ASIC
- Buffer manager ASIC for the GRP (equivalent to the Rx and Tx BMA on Engine 0 LCs)
- The CSAR contains two 64k buffers; messages are placed in a hold queue if these buffers are full, and an interrupt is sent to the CPU when the buffers are free
- The CSAR contains 32 reassembly areas each for unicast and multicast ciscoCells received from the fabric, providing 64 areas
- Connects to the fabric at OC-12
Performance Route Processor (PRP)
- The PRP is fully compatible with the GRP at the hardware level
- One of the major differences with the PRP is the use of the V'ger processor, a Motorola PowerPC (PPC) processor running at 655MHz
- The future Apollo processor running at 1GHz will replace V'ger
- Connects to the fabric at OC-48; requires at least 1 CSC and 3 SFCs to operate
Performance Route Processor (PRP)
- The PPC CPU has 32KB of on-chip L1 cache and 256KB of on-chip L2 cache, plus a controller for an external 2MB L3 cache
- The realized performance improvement is 4-5 times that of the current GRP
Performance Route Processor (PRP)
- Default 512MB DRAM, upgradeable to 2GB
- 2 10/100 Ethernet ports
- RJ-45 console port
- 64MB flash disk as standard
PLIM - Physical Interfaces
- Handles L2 protocol encapsulation/decapsulation:
  - SONET/SDH framing
  - ATM cell segmentation/reassembly
  - Channelization
- Receives the packet off the wire and passes it to the forwarding engine
FE - Forwarding Engine
- Runs IOS and maintains the CEF tables
- Provides CEF switching services and feature capabilities
- Provides queuing and QoS services (through the Rx and Tx queue managers)
- NOTE: QoS is covered in detail in the Applications section
FIM - Fabric Interface Module
- Provides fabric transmission services
- Two components:
  - FIA: interface between the forwarding engine and the fabric interface
  - SLI: does 8B/10B encoding and decoding of Cisco cells
Forwarding Architecture
- The various routing protocols maintain individual routing databases.
- The routing table is built from the best available paths across the routing protocols.
- From the IP routing table, recursive routes are pre-resolved to build the CEF table (a.k.a. the FIB table); see the lookup sketch below.
- The CEF table is pushed down from the GRP to each linecard via IPC.
- From the CEF table, hardware-based linecards build their own hardware forwarding tables.
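A minimal Python sketch of what a FIB lookup does, i.e. longest-prefix match (the real line cards use MTRIE structures and store adjacency/encapsulation data, none of which is modeled; the prefixes and next hops are made up):

    import ipaddress

    class Fib:
        def __init__(self):
            self.routes = {}          # ip_network -> next-hop description

        def add(self, prefix, nexthop):
            self.routes[ipaddress.ip_network(prefix)] = nexthop

        def lookup(self, addr):
            """Longest-prefix match: the most specific covering route wins."""
            ip = ipaddress.ip_address(addr)
            best = None
            for net, nh in self.routes.items():
                if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
                    best = (net, nh)
            return best[1] if best else None

    fib = Fib()
    fib.add("10.0.0.0/8", "GigE0 via 192.0.2.1")
    fib.add("10.1.0.0/16", "POS1 via 192.0.2.9")
    print(fib.lookup("10.1.2.3"))   # POS1 via 192.0.2.9 (more specific route)
    print(fib.lookup("10.9.9.9"))   # GigE0 via 192.0.2.1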
Summary
- Multiple levels of routing/forwarding information
- The RP provides control plane services:
  - IP routing protocols
  - MPLS label exchange protocols
- The RP maintains the RIB, FIB, and LFIB
- LCs have a copy of the FIB and LFIB
- Engines 2/3/4/4+/6 have a hardware forwarding FIB and LFIB as well
Engine 0 - Components
- R5000 CPU + L3FE ASIC
- BMA
  - QoS support, with a performance hit
- Main memory: up to 256MB of DRAM
- Packet memory: up to 256MB of SDRAM, split equally between Rx and Tx
Engine 0 Architecture

[Diagram: PLIM (optics, framer) -> L3 engine (Rx/Tx BMA with packet memory, L3FE, CPU, LC IOS memory) -> fabric interface (ToFab/FrFab FIA, SLI)]

- Pure vanilla IP switching performance is about 420 kpps.
- All features are done on the LC CPU or in BMA microcode.
- General feature-path performance is about 200 kpps.
- Use compiled ACLs for best performance.
- Almost all software features are supported; check the GSR software roadmap.
1-Port OC-12 Engine 0 Line Card

[Diagram: optics -> Rx BMA + Rx packet memory -> L3FE -> Tx BMA + Tx packet memory, with CPU, SLI, FIA, and MBUS agent module]
Engine 0 - OC-12 with Features
- CPU-based switching
- Provides OC-12 performance with features
- Extensible/flexible architecture: easy to add more features
- WRED/MDRR in hardware, with a performance hit
- Performance:
  - No features: ~420 kpps
  - With features: ~250 kpps
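WRED recurs throughout these cards; a compact Python sketch of the per-packet drop decision it applies (thresholds and max probability are illustrative; in WRED these parameters differ per IP precedence class, shown here for one class):

    import random

    def wred_drop(avg_qlen, min_th=20, max_th=40, max_p=0.1):
        """Weighted RED: no drops below min_th, forced drop above max_th,
        and a linearly increasing drop probability in between."""
        if avg_qlen < min_th:
            return False
        if avg_qlen >= max_th:
            return True
        p = max_p * (avg_qlen - min_th) / (max_th - min_th)
        return random.random() < p

    # deeper average queues drop proportionally more packets
    for q in (10, 25, 35, 45):
        drops = sum(wred_drop(q) for _ in range(10_000))
        print(f"avg queue {q}: dropped {drops} of 10000")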
Engine 1 - Components
- R5000 CPU + Salsa ASIC
- BMA48
- Main memory

- Salsa = enhanced Layer 3 Fetch Engine (L3FE): hardware IP lookup with software rewrite
- BMA48: performance-enhanced BMA; no QoS support
- Main memory: up to 256MB of DRAM
- Packet memory: up to 256MB SDRAM, split equally between Rx and Tx
1-Port GigE Engine 1 Line Card

[Diagram: Rx BMA48 -> Salsa -> Tx BMA48, with CPU]
Engine 1 - Salsa
- Hardware enhancements to IP packet validation and FIB lookup assist:
  - Verify the packet is IPv4 with no options
  - Identify that the packet is PPP/HDLC encapsulated
  - Check checksum, length, TTL
  - Update the IP header (TTL, checksum)
  - Perform the IP lookup and cache the FIB pointer for the CPU rewrite operation
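A software sketch in Python of the header checks and rewrite that Salsa accelerates (standard IPv4 field offsets and the RFC 1071 one's-complement checksum; the punt conditions mirror the slide, the example header is made up):

    import struct

    def ipv4_checksum(header: bytes) -> int:
        """One's-complement sum of 16-bit words (checksum field zeroed)."""
        total = sum(struct.unpack(f"!{len(header)//2}H", header))
        while total >> 16:
            total = (total & 0xFFFF) + (total >> 16)
        return ~total & 0xFFFF

    def validate_and_rewrite(header: bytearray):
        """Checks done in hardware, followed by the TTL/checksum rewrite."""
        version_ihl = header[0]
        assert version_ihl >> 4 == 4, "not IPv4"
        assert version_ihl & 0xF == 5, "IP options present: punt to slow path"
        assert header[8] > 1, "TTL expired: punt"
        # a valid header sums (including its checksum field) to zero
        assert ipv4_checksum(bytes(header)) == 0, "bad checksum"
        header[8] -= 1                        # decrement TTL
        header[10:12] = b"\x00\x00"           # zero the checksum field
        header[10:12] = struct.pack("!H", ipv4_checksum(bytes(header)))

    # build a minimal 20-byte header and check it round-trips
    hdr = bytearray(20)
    hdr[0] = 0x45; hdr[8] = 64; hdr[9] = 6        # v4/IHL=5, TTL=64, TCP
    hdr[12:16] = bytes([10, 0, 0, 1]); hdr[16:20] = bytes([10, 1, 2, 3])
    hdr[10:12] = struct.pack("!H", ipv4_checksum(bytes(hdr)))
    validate_and_rewrite(hdr)
    assert hdr[8] == 63 and ipv4_checksum(bytes(hdr)) == 0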
Packet Arrives on Line Card (tofab)

[ToFab queue display: Qnum/Head/Tail/#Qelem/LenThresh. Non-IPC free queues: 26626/26626 (buffers specified/carved), 50.90%, 80-byte data size; 16184/16184, 30.94%, 608-byte; 7831/7831, 14.97%, 1568-byte. IPC queue: 100/100, 0.19%, 4112-byte. Plus the raw queue and the per-slot ToFab and multicast queues.]

- When the packet arrives from the PLIM, we try to allocate a buffer from the free queue matching that packet size.
- When the buffer is allocated for the new packet, that free queue's #Qelem value is decremented by 1 (from its maximum, 26626 in this example).
- At this point the packet is stored in a buffer within packet memory, with a pointer to the packet maintained inside the BMA ASIC.

Move the Buffer onto the Raw Q (tofab)

[Same ToFab queue display as above]

- After buffering the packet, a copy of the packet header is sent to the CPU via the raw queue.
- When the packet is placed on the raw queue, the raw queue's #Qelem value is incremented.
- This triggers a CPU interrupt and begins the actual forwarding lookup.

FIB Result and ToFab Queuing (tofab)

[Same ToFab queue display as above]

- When the CPU returns the buffer to the BMA with the result of the forwarding decision, the packet is queued onto the ToFab queue for the destination slot.
- In this example, the destination interface is located in slot 6.
- The raw queue's #Qelem is decremented, and the #Qelem counter for the destination ToFab slot is incremented.

Return the Buffer to the Free Q (tofab)

[Same ToFab queue display as above]

- After the buffer is handed off to the FIA, the buffer is returned to the free queue, and the entire cycle begins again.
- Here the ToFab queue's count is decremented and the free queue from which the buffer was originally obtained is incremented.
Egress Card Receives the Packet (frfab)

[FrFab queue display: Qnum/Head/Tail/#Qelem/LenThresh. 4 non-IPC free queues: 26560/26560 (buffers specified/carved), 50.90%, 80-byte data size; 16144/16144, 30.94%, 608-byte; 7811/7811, 14.97%, 1568-byte; 1562/1562, 2.99%, 4544-byte. IPC queue: 100/100, 0.19%, 4112-byte. Plus the raw queue and the per-interface queues.]

- On the from-fabric (FrFab) side, the process is very similar to the ToFab side, with the steps essentially reversed.
- First, the packet is received from the FrFab FIA.
- Again, a buffer is allocated from a free queue (decrementing #Qelem for that buffer pool).

Queuing for Transmission (frfab)

[Same FrFab queue display as above]

- After storing the packet, the BMA uses information from the buffer header to determine which interface queue to place the packet in.
- The packet buffer is transferred to the appropriate transmit queue for the destination interface.

Return the Buffer to the Free Q (frfab)

[Same FrFab queue display as above]

- Once the packet has been handed to the PLIM for transmission, the buffer is returned to its free queue, mirroring the ToFab side: the interface queue's #Qelem is decremented and the originating free queue's count is incremented.
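A toy Python model of the free queue / raw queue / output queue accounting walked through above (queue names mirror the slides; buffer contents and the BMA hardware itself are not modeled):

    from collections import deque

    class Bma:
        """Counts-only model of BMA buffer movement between queues."""
        def __init__(self, free_count=26626):
            self.free = deque(range(free_count))   # free buffer handles
            self.raw = deque()                     # headers awaiting CPU lookup
            self.tofab = {slot: deque() for slot in range(16)}

        def packet_arrives(self):
            buf = self.free.popleft()              # free queue  #Qelem -= 1
            self.raw.append(buf)                   # raw queue   #Qelem += 1
            return buf

        def fib_result(self, buf, dest_slot):
            self.raw.remove(buf)                   # raw queue   #Qelem -= 1
            self.tofab[dest_slot].append(buf)      # tofab queue #Qelem += 1

        def handed_to_fia(self, dest_slot):
            buf = self.tofab[dest_slot].popleft()  # tofab queue #Qelem -= 1
            self.free.append(buf)                  # free queue  #Qelem += 1

    bma = Bma()
    b = bma.packet_arrives()
    bma.fib_result(b, dest_slot=6)                 # example: egress in slot 6
    bma.handed_to_fia(6)
    assert len(bma.free) == 26626                  # the counts balance end to end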
Engine 2 Overview
- First programmable, hardware-based forwarding engine
- Multi-million pps with some features
- Up to 4Mpps performance (no features)
Engine 2 Architecture

[Diagram: PLIM (optics, framer) -> L3 engine (RxSOP, PSA + PSA memory, RBM, TxSOP, TBM, CPU, Salsa, LC IOS memory, packet memory) -> fabric interface (ToFab/FrFab FIA48, SLI)]

- The PSA runs various microcode bundles to perform feature checking:
  - PIRC: a limited subset of iCAR features
  - PSA ACLs: lots of restrictions, but OK for short ACLs
  - Sampled NetFlow, BGP-PA, etc.
- Microcode features are mutually exclusive, with few exceptions (see the software roadmap)
Engine 2 - Components
- R5000 CPU -> slow path: CPU-computed CEF tables, ICMPs, IP options, etc.
- PSA (Packet Switching ASIC) -> fast path: microcoded IP/MPLS lookup and feature processing
- RBM/TBM (Receive/Transmit Buffer Manager): hardware WRED, MDRR
- Packet memory: 256MB SDRAM, upgradeable to 512MB SDRAM
- PSA memory: the PSA's copy of the FIB table
Engine 2 - Rx Packet Flow
- PLIM (SONET/SDH framer): extracts packets from the SONET/SDH payload; passes an indication of the input interface and the packet header to the PSA; the payload is passed to the RBM
- PSA: packet validation; IP/MPLS lookup; feature processing (ACLs, CAR, NetFlow, etc.); appends the buffer header; determines the loq, oq, and freeq for the packet
- RBM: ToFab queuing, WRED, MDRR; segments the packet into ciscoCells
- ToFab FIA48: adds the CRC to each ciscoCell; sends the transmission request to the SCA
- SLI: 8B/10B encoding; sends the cells to the fabric
Engine 2 - PSA Forwarding
- The Packet Switching ASIC is an IP and TAG forwarding engine
- The ASIC contains a 6-stage pipeline plus pointer (PLU) and table (TLU) lookup memory
- As packets move through the PSA pipeline, the forwarding decision and feature processing are completed
PSA Architecture
- Each stage has a 25-clock budget at 100MHz = 250ns, i.e. 4Mpps
- External SSRAM holds the FIB tree (256K) and the leaves/adjacencies/statistics (256K)
- Pipeline stages:
  - Fetch: MAC header checking, protocol ID checking, IP header checking, extraction of the IP/MPLS address fields
  - PreP: microcode engine that performs checks on the packet (protocol, length, TTL, IP checksum) and extracts the appropriate address(es) for the main lookup; some feature processing
  - PLU: IP/MPLS lookup machine
  - TLU: adjacency lookup, per-adjacency counters
  - PoP: microcode engine that applies the results of the PLU/TLU lookup to the packet; tasks include CoS handling, MTU check, special-case tests, setup of the gather stage, feature processing, etc.
  - Gather: modifications to the packet header (e.g., pushing MPLS labels); prepares the packet for transmission to the RBM
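As a quick check of the arithmetic behind the 4Mpps figure:

    clock_hz = 100e6                 # PSA core clock
    stage_budget = 25                # clock cycles each stage may spend per packet
    stage_time = stage_budget / clock_hz      # 25 / 100 MHz = 250 ns
    pps = 1 / stage_time             # a full pipeline finishes one packet per stage time
    print(stage_time, pps)           # 2.5e-07 s, 4,000,000 pps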
RBM: Rx Queue Manager
- The RBM manages the linecard's receive packet memory buffers and queues
- There are two major types of queues in the RBM:
  - LowQs (16 FreeQs, 1 RawQ, an IPC FreeQ, and spare queues)
  - 2048 unicast output queues and 8 multicast queues
    - 16 slots per chassis x 16 ports per slot x 8 queues per port = 2048 queues
- One hpr (high-priority) queue is allocated per destination slot/port
TBM: Tx Queue Manager
- The TBM manages the linecard's transmit packet memory buffers and queues
- Three types of queues:
  - Non-IPC FreeQs, 1 CPU RawQ, IPC FreeQ
  - 128 output queues (8 CoS queues per output port x 16 ports = 128 queues)
  - Multicast RawQ
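MDRR is the scheduler these queue managers run across the CoS queues. A compact Python sketch of the underlying deficit round robin (quantum values and packet sizes are illustrative; MDRR additionally services one low-latency queue with strict or alternating priority, which is not shown):

    from collections import deque

    def drr(queues, quanta):
        """Deficit round robin: each pass, a queue's deficit grows by its
        quantum; it may dequeue packets as long as the deficit covers them."""
        deficits = [0] * len(queues)
        order = []
        while any(queues):
            for i, q in enumerate(queues):
                if not q:
                    deficits[i] = 0          # idle queues accrue no credit
                    continue
                deficits[i] += quanta[i]
                while q and q[0] <= deficits[i]:
                    deficits[i] -= q[0]
                    order.append((i, q.popleft()))
        return order

    # queue 0 gets twice the quantum of queue 1 -> roughly 2:1 bandwidth
    q0 = deque([500] * 6)
    q1 = deque([500] * 6)
    print(drr([q0, q1], quanta=[1000, 500]))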
Engine 2 - Tx Packet Flow
- SLI: removes the 8B/10B encoding
- FrFab FIA48: verifies and removes the CRC from each ciscoCell; sends the cells to the TBM
- TBM: reassembles the packet from ciscoCells; FrFab queuing, WRED, MDRR; multicast duplication; appends the L2 header and sends the packet to the PLIM
- PLIM (SONET/SDH framer): puts the packets into the SONET/SDH payload
E2 Feature Support
- Designed to be the forwarding ASIC on a "backbone" card, i.e. it does not natively support any features
- Features like ACLs, SNF, and BGP PA were added later on, but take a performance hit
- Most new features require a separate microcode load and are mutually exclusive
- Performance varies with features (e.g. ACLs):
  - 128-line iACLs: 800 kpps
  - 128-line oACLs: 675 kpps
  - 448-line iACLs: 690 kpps
  - 448-line oACLs: 460 kpps
ISE Overview
- Programmable, hardware-based forwarding engine
- Up to 4Mpps performance (with features)
- Uses TCAMs for advanced feature processing
- Traffic shaping and advanced QoS support
- Flexible mapping of queues
ISE - Rx Packet Flow
- PLIM (SONET/SDH framer): handles channelization; extracts packets from the SONET/SDH payload; passes an indication of the input interface and the packet header to the Rx ALPHA; the payload is passed to RADAR
- Rx ALPHA (with TCAM and FIB table memory): packet validation; IP/MPLS lookup; feature processing (ACLs, CAR, NetFlow, etc.); appends the buffer header; determines the loq, oq, and freeq for the packet
- RADAR (with packet memory): ToFab queuing, input rate shaping, WRED, MDRR; segments the packet into ciscoCells
- FIA: adds the CRC to each ciscoCell; sends the transmission request to the SCA
- SLI: 8B/10B encoding; sends the cells to the fabric
ISE - ALPHA
- ALPHA: Advanced Layer 3 Packet Handling ASIC
- Performs forwarding, classification, policing, and accounting
- Two ALPHA chips, one in the receive path and one in the transmit path; this allows features to be implemented in both the ingress (Rx) and egress (Tx) paths
- 11 pipeline stages, including 3 microcode stages for future expandability
- Utilizes TCAMs to perform high-speed feature processing; each ALPHA has its own TCAM
11 Stages of the ALPHA Pipeline
- External SSRAM holds the FIB tree and the leaves/adjacencies/statistics; external CAM + SSRAM serve the CAM processor
- Pipeline stages:
  - Fetch: MAC header checking, protocol ID checking, IP header checking, extraction of the IP/MPLS address fields
  - PreP: microcoded stage capable of any general-purpose activity on the packet
  - PLU: MTRIE lookup machine
  - PCM: TCAM access for altering PLU results (PBR, MPLS)
  - TLU: adjacency lookup, per-adjacency counters
  - CAMP (3 stages): CAM processor; lookups for xACL (permit/deny), CAR token bucket maintenance, NetFlow counter updates
  - MIP: microcoded stage that performs feature actions and handles exception packets
  - PoP: microcoded stage capable of any general-purpose activity on the packet
  - Gather: processes the packet structure, including stripping the old input encapsulation, stripping old MPLS labels if necessary, pushing new labels, and computing the new IP checksum
RADAR: Rx Queue Manager
- RADAR manages the linecard's receive packet memory buffers and queues
- There are three major types of queues in RADAR:
  - LowQs (16 FreeQs, 3 RawQs, an IPC FreeQ, and spare queues)
  - 2048 input shape queues (rate shaping)
  - 2048 unicast output queues (16 unicast high-priority queues) and 8 multicast queues
- One local output queue is allocated per destination interface
- One hpr (high-priority) queue is allocated per destination slot
RADAR: Input Shape Queues
- 2048 queues are dedicated to ingress traffic shaping, each with an independent "leaky bucket" circuit
- Each flow can be shaped in increments of 64kbps, from 64kbps up to line rate (see the sketch below)
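A minimal software model of one such shaper, in Python (token-bucket formulation; the 64 kbps granularity comes from the slide, while the bucket depth and packet sizes are illustrative):

    class LeakyBucket:
        """Shape a flow to `rate_kbps`, constrained to 64 kbps increments."""
        def __init__(self, rate_kbps, depth_bytes=2000):
            assert rate_kbps % 64 == 0, "rate must be a multiple of 64 kbps"
            self.rate = rate_kbps * 1000 / 8    # bytes of credit per second
            self.depth = depth_bytes            # burst tolerance
            self.tokens = depth_bytes
            self.last = 0.0

        def conforms(self, pkt_len, now):
            """True if the packet may be sent now; otherwise it waits in
            the shape queue until enough credit accumulates."""
            self.tokens = min(self.depth,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if pkt_len <= self.tokens:
                self.tokens -= pkt_len
                return True
            return False

    shaper = LeakyBucket(rate_kbps=128)
    print(shaper.conforms(1500, now=0.00))   # True: bucket starts full
    print(shaper.conforms(1500, now=0.05))   # False: not enough credit yet
    print(shaper.conforms(1500, now=0.10))   # True: credit has refilled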
RADAR: Rx Queue Manager
- The Rx ALPHA decides which type of queue will be used for each packet.
ISE - Tx Packet Flow
- SLI: removes the 8B/10B encoding
- FIA: verifies and removes the CRC from each ciscoCell; sends the cells to the Tx ALPHA
- Tx ALPHA (with TCAM and FIB table memory): adjacency lookup; feature processing (ACLs, CAR, MQC, etc.); updates the output_info field of the buffer header with information from the adjacency
- CONGA (with packet memory): reassembles the packet from ciscoCells; FrFab queuing, output rate shaping, WRED, MDRR; multicast duplication; appends the L2 header and sends the packet to the PLIM
- PLIM (SONET/SDH framer): handles channelization; puts the packets into the SONET/SDH payload
CONGA: Tx Queue Manager
- CONGA manages the linecard's transmit packet memory buffers and queues
- Three types of queues:
  - Non-IPC FreeQs, 3 CPU RawQs, IPC FreeQ
  - 2048 output queues (2 leaky buckets per queue for rate shaping)
  - Multicast RawQ
- Output queues are divided equally among the output ports
- Support for 512 logical interfaces
- Max-bandwidth shaping per port; min- and max-bandwidth shaping per queue
CONGA
- Each "shaped output" queue has a built-in dual leaky bucket mechanism with a programmable maximum and minimum rate (i.e., a DRR bandwidth guarantee)
- A second level of shaping is available per port
Engine 6

[Diagram: PLIM (optics, integrated framer + PHAD, Picante) -> L3 engine (Hermes, Ares, Zeus, and Hera ASICs with TCAMs, CPU, LC IOS memory, MTRIE lookup memory, packet memory) -> fabric interface (TFIA + FFIA ASICs, EROS SERDES to the backplane)]

Rx path:
- Integrated framer + PHAD: Layer 1 processing, alarms, CRC check, APS
- PLIM: packet buffering; verify packet length; append the PLIM header
- Lookup: IP/MPLS lookup (MTRIE-based); TCAM-based feature processing (ACL/CAR/PBR/VRFs); packet modification (TTL adjust, ToS adjust, IP checksum adjust); append the buffer header and update loq, oq, etc.
- Queueing ASIC: manages packet buffers; performs WRED and MDRR
- TFIA: segments packets into cells; makes the packet transmission request; appends the ciscoCell CRC

Tx path:
- FFIA: cells reassembled into packets; CRC checked; packet header reconstructed; packets scheduled to Tx
- Output processing: multicast packet duplication; MAC rewrite for output; TCAM-based output features (ACL/CAR); output packet queuing; RED/WRED; MDRR scheduling; output traffic shaping
- PLIM: Layer 1 processing; PLIM header removed; packet segments queued to SONET channels; packets sent within SONET payloads (POS)
TCAM-Based Features
- The TCAM is used to implement key features
- 32000 ingress entries; egress entries shared between:
  - ACL
  - CAR (32 CAR rules per port)
  - PBR
  - VRF selection
- Security ACLs are not merged
TCAM Basics
- TCAM: Ternary Content Addressable Memory; matches on 0, 1, or X (don't care)
- ACL/CAR/PBR/VRF rules from the CLI are converted into Value-Mask-Result (VMR) format for insertion in the TCAM
  - Value cells: the key values (ACL/CAR/PBR/VRF values)
  - Mask cells: which value bits are significant and must be matched
  - Result: the action returned when the key matches the value on all masked bits
    - Security ACL: permit/deny
    - CAR: pointer to the CAR buckets
    - PBR: adjacency
    - VRF selection: VRF root
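A small Python model of a VMR match with first-match semantics, as in an ACL (the entry width and the example rules are illustrative):

    def tcam_lookup(key: int, entries):
        """Return the result of the first entry whose value matches the key
        on every significant (mask=1) bit; mask=0 bits are 'X' (don't care)."""
        for value, mask, result in entries:
            if key & mask == value & mask:
                return result
        return "implicit deny"

    # match on an 8-bit field: exact 0b00010101 -> deny, 0b0001xxxx -> permit
    entries = [
        (0b0001_0101, 0b1111_1111, "deny"),      # exact match, listed first
        (0b0001_0000, 0b1111_0000, "permit"),    # only the top 4 bits significant
    ]
    print(tcam_lookup(0b0001_0101, entries))     # deny
    print(tcam_lookup(0b0001_0011, entries))     # permit
    print(tcam_lookup(0b1000_0000, entries))     # implicit deny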