ETHERNET AUTOMATIC PROTECTION SWITCHING (EAPS) A small comparison with Ethernet Ring Protection Switching (ERPS)
Introduction EAPS is a protocol invented to increase the availability of Ethernet rings Developed by Extreme Networks (RFC3619 – 2003) Objective: Provide a resilience level comparable to SONET rings Current version (v ) has some enhancements over version 1 (RFC3619 – 2003)
Motivation Ethernet is widely used in Local Area Networks (LANs) and Metropolitan Area Networks (MANs) MANs typically present a ring topology MAN operators want to reduce recovery time Spanning Tree Protocol (STP) could take 30 – 60 seconds to recover Rapid Spanning Tree Protocol (RSTP) is faster... Convergence time depends on the number of nodes Both STP and RSTP limit the number of nodes EAPS recovers in less than 1 second (100 ms) Does not limit the number of nodes!
Basic Considerations (I) A ring is made up of two or more switches Each switch has two ports connected to the ring An EAPS domain exists on a single Ethernet ring A domain protects a group of VLANs A domain has a unique control VLAN Multiple EAPS domains could coexist on the same ring Multiple control VLANs
Basic Considerations (II) For each EAPS domain: One of the nodes is the Master (S1) One port is designated as the Primary port (P) The other is the Secondary Port (S) All other nodes (S2-S6) are known as Transit nodes
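As a rough illustration of the rules on the last two slides, the sketch below models EAPS domains sharing one ring and checks them. The class name `EapsDomain`, the function `check_ring` and the VLAN numbers are hypothetical and only serve the example; they do not come from any vendor configuration syntax.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class EapsDomain:
    """One EAPS domain on a ring (hypothetical model, not a vendor API)."""
    name: str
    control_vlan: int
    master: str
    transit_nodes: List[str]
    protected_vlans: Set[int] = field(default_factory=set)


def check_ring(domains: List[EapsDomain]) -> List[str]:
    """Return a list of rule violations for domains sharing one ring."""
    problems = []
    owner_of_control_vlan = {}
    for d in domains:
        # Each domain needs its own control VLAN.
        if d.control_vlan in owner_of_control_vlan:
            problems.append(
                f"{d.name}: control VLAN {d.control_vlan} already used by "
                f"{owner_of_control_vlan[d.control_vlan]}"
            )
        else:
            owner_of_control_vlan[d.control_vlan] = d.name
        # Exactly one Master per domain; every other node is a Transit node.
        if d.master in d.transit_nodes:
            problems.append(f"{d.name}: {d.master} is both Master and Transit")
        # A ring is made up of two or more switches.
        if len(d.transit_nodes) < 1:
            problems.append(f"{d.name}: ring needs at least two switches")
    return problems


# Two domains coexisting on the same six-switch ring, with different Masters.
d1 = EapsDomain("d1", 10, "S1", ["S2", "S3", "S4", "S5", "S6"], {100, 101})
d2 = EapsDomain("d2", 11, "S4", ["S1", "S2", "S3", "S5", "S6"], {200})
print(check_ring([d1, d2]))
```

Giving the two domains different Masters is only one possible arrangement; the slides do not prescribe where each domain's Master sits.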
Normal Operation The Master node blocks its secondary port -> avoid loops Non-control traffic is blocked (Control VLAN is NOT blocked) Master is in COMPLETE state Transit nodes are in LINKS-UP state The Master sends health-check frames (HEALTH-CHECK-PDU) periodically (Hello timer) From primary port to secondary port Control frames consumed by the Master -> NOT forwarded
Fault Operation When a fault is detected: The Master changes to FAILED state Unblocks secondary port Flushes its bridging table The Master orders the other nodes to flush their tables Sends a RING-DOWN-FLUSH-FDB-PDU frame Transit nodes learn the new topology
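To see why unblocking the secondary port restores connectivity without creating a loop, the sketch below treats the ring as a small graph. The six-node ring, the node numbering (Master = node 1, secondary port facing node 6) and the failed link 3-4 are assumptions made for the example.

```python
def ring_links(n):
    """Links of an n-node ring (n >= 3) as frozensets; nodes are numbered 1..n."""
    return [frozenset((i, i % n + 1)) for i in range(1, n + 1)]


def reachable(active_links, start=1):
    """Set of nodes reachable from `start` over the links that forward data."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for link in active_links:
            if node in link:
                (other,) = link - {node}
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
    return seen


links = ring_links(6)
secondary = frozenset((6, 1))   # link behind the Master's secondary port
failed = frozenset((3, 4))      # hypothetical failed link

normal = [l for l in links if l != secondary]
still_blocked = [l for l in links if l not in (secondary, failed)]
unblocked = [l for l in links if l != failed]

print(sorted(reachable(normal)))         # all six nodes, no loop
print(sorted(reachable(still_blocked)))  # ring split in two
print(sorted(reachable(unblocked)))      # all six nodes again
```

After the failure, five links connect six nodes, so the forwarding topology is again a loop-free path; only the position of the break has moved.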
Fault Detection (I) 2 ways of detecting a failure Link Down Alert Ring Polling Link Down Alert Transit nodes detect a link-down The Transit node detecting the failure changes to LINKS-DOWN state It sends a LINK-DOWN-PDU frame to the Master Master changes to FAILED state Master unblocks secondary port...
Fault Detection (II) Ring Polling (version 1 – RFC3619) Master sends HEALTH-CHECK-PDU frames periodically From primary to secondary port Master has a Fail-period timer If health check frame received before timer expires -> reset timer If health check frame NOT received before timer expires Master changes to FAILED state Master unblocks secondary port...
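The Fail-period logic can be sketched as a tiny timer-driven state machine. This is a minimal model under stated assumptions: time is an abstract integer, the class name `PollingMaster` and its methods are hypothetical, and only the COMPLETE -> FAILED transition from this slide is modelled.

```python
class PollingMaster:
    """Master-side ring polling (illustrative sketch, abstract time units)."""

    def __init__(self, fail_period):
        self.fail_period = fail_period
        self.state = "COMPLETE"
        self.secondary_blocked = True
        self.deadline = fail_period  # Fail-period timer started at t = 0

    def on_health_check_received(self, now):
        # Health-check frame came back on the secondary port: reset the timer.
        self.deadline = now + self.fail_period

    def on_tick(self, now):
        # Timer expired without a returning health-check frame.
        if self.state == "COMPLETE" and now >= self.deadline:
            self.state = "FAILED"
            self.secondary_blocked = False


m = PollingMaster(fail_period=3)
m.on_health_check_received(now=1)   # timer now expires at t = 4
m.on_tick(now=3)
print(m.state)                      # still COMPLETE
m.on_tick(now=4)
print(m.state, m.secondary_blocked)
```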
Fault Detection (III) Ring Polling (version 1.3) 2 options if the Fail-period timer expires (configurable) «Open Secondary Port» -> previous slide «Send-Alert» Master does NOT unblock its secondary port yet Master sends a QUERY-LINK-STATUS-PDU frame out of both ports Transit nodes with link failure reply with LINK-DOWN-PDU frame Master changes to FAILED state... Prevents False Failures Why? Health frames might not return to the Master -> even if the ring is complete Control VLAN misconfigurations Too much traffic Master node’s CPU busy
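The two configurable reactions to an expired Fail-period timer can be contrasted in a short sketch. The function name, the mode strings and the returned dictionary are hypothetical. The behaviour when no Transit node replies (stay blocked and in COMPLETE state) is an assumption inferred from the "prevents false failures" remark, not something the slide states explicitly.

```python
def on_fail_timer_expired(mode, link_down_replies=()):
    """Decide the Master's reaction when the Fail-period timer expires.

    mode: "open-secondary-port" or "send-alert" (hypothetical labels).
    link_down_replies: LINK-DOWN-PDU frames received after the query.
    """
    if mode == "open-secondary-port":
        # Version 1 behaviour: assume the ring is broken right away.
        return {"state": "FAILED", "secondary_blocked": False, "sent": []}
    if mode == "send-alert":
        sent = ["QUERY-LINK-STATUS-PDU (primary)",
                "QUERY-LINK-STATUS-PDU (secondary)"]
        if link_down_replies:
            # A Transit node confirmed a real link failure.
            return {"state": "FAILED", "secondary_blocked": False, "sent": sent}
        # Assumption: nobody reports a failure, so keep the port blocked.
        return {"state": "COMPLETE", "secondary_blocked": True, "sent": sent}
    raise ValueError(f"unknown mode: {mode}")


print(on_fail_timer_expired("send-alert"))
print(on_fail_timer_expired("send-alert", ["LINK-DOWN-PDU from S3"]))
```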
Fault Restoration (I) Master in FAILED state -> continues sending HEALTH-CHECK-PDU frames Ring restored -> Master’s secondary port receives health frame Master changes to COMPLETE state Blocks non-control frames on secondary port Flushes its bridge table Orders the other nodes to flush their tables Sends a RING-UP-FLUSH-FDB-PDU frame Transit nodes re-learn the topology
Fault Restoration (II) – PREFORWARDING State In the time between the Transit node detecting its link is restored and the Master detecting the ring is restored, the Master’s secondary port is still unblocked -> possible temporary loop! When a Transit node detects its link is restored Changes to PREFORWARDING state and starts Preforwarding timer Protected VLANs on that port are temporarily blocked Waits till a RING-UP-FLUSH-FDB-PDU is received Changes to LINKS-UP state Unblocks previously blocked VLANs Flushes its bridge table and stops Preforwarding timer Re-learns topology
Fault Restoration (III) – PREFORWARDING State Preforwarding timer deals with: Lost RING-UP-FLUSH-FDB-PDU from the Master Another break in the ring If the Transit node remained in PREFORWARDING state indefinitely -> disconnected network Preforwarding timer is derived from the Hello-timer for HEALTH-CHECK-PDU frames
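The Transit-node side of restoration can be sketched as a three-state machine. This is an illustrative model with abstract time units and hypothetical class and method names. It assumes that an expired Preforwarding timer has the same effect as receiving RING-UP-FLUSH-FDB-PDU (unblock and go to LINKS-UP), which is how the timer keeps the node from staying blocked indefinitely.

```python
class TransitNode:
    """Transit node states: LINKS-UP, LINKS-DOWN, PREFORWARDING (sketch)."""

    def __init__(self, preforwarding_timeout):
        self.timeout = preforwarding_timeout
        self.state = "LINKS-UP"
        self.port_blocked = False   # protected VLANs on the restored port
        self.deadline = None
        self.fdb_flushes = 0

    def on_link_down(self):
        self.state = "LINKS-DOWN"
        self.deadline = None
        return "LINK-DOWN-PDU"      # frame sent to the Master

    def on_link_up(self, now):
        # Block protected VLANs until the Master has re-blocked its secondary.
        self.state = "PREFORWARDING"
        self.port_blocked = True
        self.deadline = now + self.timeout

    def on_ring_up_flush(self):
        if self.state == "PREFORWARDING":
            self._start_forwarding()

    def on_tick(self, now):
        # Assumed fallback: the PDU was lost or the ring is still broken.
        if self.state == "PREFORWARDING" and now >= self.deadline:
            self._start_forwarding()

    def _start_forwarding(self):
        self.state = "LINKS-UP"
        self.port_blocked = False
        self.deadline = None
        self.fdb_flushes += 1


t = TransitNode(preforwarding_timeout=6)
t.on_link_down()
t.on_link_up(now=10)
print(t.state, t.port_blocked)
t.on_ring_up_flush()
print(t.state, t.port_blocked)
```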
Enhancements of version 1.3 «Send-alert» configuration for Ring Polling fault detection method INIT state Master comes up for first time and its ports are up Master does not know if the ring is up Master starts in INIT state -> blocks secondary port When the first health frame is received -> changes to COMPLETE state Helps spotting misconfigurations in control VLAN LINK-UP-PDU Transit node detects a link comes up -> sends LINK-UP-PDU to Master Timestamp used for troubleshooting If the Master never changes to COMPLETE state Allows use of EAPS Shared-Ports
VLANs in Multiple EAPS domains (Multiple Rings) (I) EAPS could handle a simple configuration Each ring has an EAPS domain, a Master node and a Control VLAN A VLAN spanning both rings is added as protected by both EAPS domains
VLANs in Multiple EAPS domains (Multiple Rings) (II) Topologies with a common link could be problematic If the common link fails Both Masters open secondary ports Protected VLANs spanning both rings will have a loop S1-S2-S3-S4-S5-S6-S7-S8-S9-S10-S1 EAPS Shared-Ports deals with it Out of scope here
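The common-link problem can be reproduced on a small graph. The sketch below uses a reduced, hypothetical topology (two four-node rings sharing link 3-4, rather than the ten-switch figure on the slide) and a union-find loop check; the node numbers and the position of each Master's secondary port are assumptions.

```python
def has_loop(links):
    """Return True if the undirected links contain a cycle (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            return True                     # a and b were already connected
        parent[root_a] = root_b
    return False


# Ring A: 1-2-3-4-1, ring B: 3-4-5-6-3, common link 3-4.
# Master of A blocks link 4-1, Master of B blocks link 5-6.
normal = [(1, 2), (2, 3), (3, 4), (4, 5), (6, 3)]

# Common link 3-4 fails: both Masters unblock their secondary ports.
after_failure = [(1, 2), (2, 3), (4, 1), (4, 5), (5, 6), (6, 3)]

print(has_loop(normal))         # False
print(has_loop(after_failure))  # True: 1-2-3-6-5-4-1
```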
States and Control Frames Version 1 – RFC3619 Version 1.3
Ethernet Ring Protection Switching (I) Ethernet Ring Protection Switching (ERPS) is defined by ITU-T G > achieve sub-50 ms recovery times in rings Basic considerations: One link is designated as the Ring Protection Link (RPL) -> blocked to prevent loops The node setting the block is the RPL Owner (Master in EAPS) Nodes monitor link failure using Ethernet Continuity Check (ETH-CC) messages Four defined local events: Local Signal Failure (local SF) -> detection of link failure Local clear Signal Failure (local clear SF) -> detection of link restoration Wait-To-Restore Expire (WTR-Expire) -> timer expiration Wait-To-Restore Running (WTR-Running) -> timer running
Ethernet Ring Protection Switching (II) Basic considerations (cont.): The protocol uses Ring Automatic Protection Switching (R-APS) messages: R-APS(SF): sent by the node detecting link failure (gets local SF) R-APS(NR): sent by the node detecting link restoration (gets local clear SF) R-APS(NR,RB): sent by RPL Owner indicating the RPL is blocked Two important timers Wait-To-Restore (WTR) Timer: used by the RPL Owner to verify that the ring has stabilized before blocking the RPL after failure Guard Timer: used by nodes detecting link restoration to avoid receiving outdated R-APS messages Three states for nodes Initialization: first defining the node Idle: normal state, RPL blocked, all nodes/ports working Protecting: protection switching is in effect
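The Guard Timer idea can be sketched in a few lines: after a node sees its link restored, it ignores incoming R-APS messages for a short period, since they may still describe the old failure. The class and method names and the time values are hypothetical, and the model ignores any exceptions the standard may define.

```python
class RestoredNode:
    """Node adjacent to a repaired link, filtering stale R-APS (sketch)."""

    def __init__(self, guard_period):
        self.guard_period = guard_period
        self.guard_until = None     # None means the Guard Timer is not running

    def on_local_clear_sf(self, now):
        # Link restored: start the Guard Timer and announce "no request".
        self.guard_until = now + self.guard_period
        return "R-APS(NR)"

    def accept_raps(self, now):
        # Incoming R-APS messages are dropped while the Guard Timer runs.
        return self.guard_until is None or now >= self.guard_until


node = RestoredNode(guard_period=5)
node.on_local_clear_sf(now=10)
print(node.accept_raps(now=12))  # False: possibly outdated message
print(node.accept_raps(now=15))  # True: Guard Timer has expired
```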
Ethernet Ring Protection Switching (III) Basic considerations (cont.): An R-APS channel is configured using a VLAN -> transmitting R-APS messages
ERPS Principle of Operation (I) In normal operation (nodes in state Idle): RPL is blocked Link failure (local SF): nodes detecting it block failed port, send R-APS(SF) and flush filtering database (FDB) Nodes receiving R-APS(SF) flush FDBs RPL Owner receives R-APS(SF): flushes FDB, unblocks RPL Link Restoration (local clear SF): detecting nodes send R-APS(NR) periodically and start Guard Timer RPL Owner receives R-APS(NR): starts WTR Timer WTR Timer expires: RPL Owner blocks RPL, sends R-APS(NR,RB) and flushes FDB Nodes receiving R-APS(NR,RB) flush FDBs Nodes detecting link restoration unblock recovered ports, stop sending R-APS(NR) and flush FDBs
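The RPL Owner's part of this sequence can be sketched as a small state machine. It is a simplified model: time is an abstract integer, the class and method names are hypothetical, and only the Idle and Protecting states plus the WTR Timer are covered. Cancelling a running WTR Timer on a new R-APS(SF) is an assumption added for robustness.

```python
class RplOwner:
    """RPL Owner reacting to R-APS messages (illustrative sketch)."""

    def __init__(self, wtr_period):
        self.wtr_period = wtr_period
        self.state = "IDLE"
        self.rpl_blocked = True
        self.wtr_deadline = None
        self.fdb_flushes = 0

    def on_raps_sf(self):
        # A node reported a link failure: open the RPL.
        self.state = "PROTECTING"
        self.rpl_blocked = False
        self.wtr_deadline = None    # assumption: a new failure cancels WTR
        self.fdb_flushes += 1

    def on_raps_nr(self, now):
        # R-APS(NR) is sent periodically, so start the WTR Timer only once.
        if self.state == "PROTECTING" and self.wtr_deadline is None:
            self.wtr_deadline = now + self.wtr_period

    def on_tick(self, now):
        # WTR expired: the ring is considered stable, block the RPL again.
        if self.wtr_deadline is not None and now >= self.wtr_deadline:
            self.wtr_deadline = None
            self.state = "IDLE"
            self.rpl_blocked = True
            self.fdb_flushes += 1
            return "R-APS(NR,RB)"
        return None


owner = RplOwner(wtr_period=300)
owner.on_raps_sf()
owner.on_raps_nr(now=100)
owner.on_raps_nr(now=105)        # periodic repeat, timer keeps its deadline
print(owner.on_tick(now=399))    # None, still waiting
print(owner.on_tick(now=400))    # R-APS(NR,RB)
```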
ERPS Principle of Operation (II)
EAPS vs. ERPS Same basic idea: break the loop in the ring by blocking one port In case of failure, unblock the blocked port and keep connectivity EAPS: Both the Master and Transit nodes can detect a failure Only the Master detects the failed link is restored ERPS: Only the nodes adjacent to a failed link detect failures and restoration
References S. Shah, M. Yip, «RFC3619: Extreme Networks’ Ethernet Automatic Protection Switching (EAPS), Version 1», Network Working Group, October; A. Lim, S. Blake, S. Shah, «Extreme Networks’ Ethernet Protection Switching (EAPS), Version 1.3», Internet-Draft, July; Extreme Networks Whitepaper «Ethernet Automatic Protection Switching (EAPS)», Extreme Networks, Inc.; J. D. Ryoo, H. Long, Y. Yang, M. Holness, Z. Ahmad, J. K. Rhee, «Ethernet Ring Protection for Carrier Ethernet Networks», IEEE Comm. Magazine, September 2008