High Availability Campus Networks
Tyler Creek, Consulting Systems Engineer, Southern California
© 2005 Cisco Systems, Inc. All rights reserved.


Slide 2: Agenda
- Campus High Availability Design Principles
- Foundation Services
- Multi-Layer Design
- Routed Access Design
- Summary
(Sidebar: Architectural Foundation, Hierarchical Campus Design, Security, Mobility, Convergence, Availability, Flexibility.)

Slide 3: What Is High Availability? (DPM = Defects per Million)

Availability   Downtime per Year (24x365)    DPM
99.000%        3 days 15 hours 36 minutes    10,000
99.500%        1 day 19 hours 48 minutes      5,000
99.900%        8 hours 46 minutes             1,000
99.950%        4 hours 23 minutes               500
99.990%        53 minutes                       100
99.999%        5 minutes                         10
99.9999%       30 seconds                         1

Industry Sector         Revenue/Hour   Revenue/Employee-Hour
Average                 $1,010,536     $205
Transportation          $668,586       $107
Retail                  $1,107,274     $244
Insurance               $1,202,444     $370
Financial Institution   $1,495,134     $1,079
Manufacturing           $1,610,654     $134
Telecommunications      $2,066,245     $186
Energy                  $2,817,846     $569

To achieve five-nines availability or better, seconds or even milliseconds count. More than just revenue is impacted: revenue loss, productivity loss, impaired financial performance, damaged reputation, and recovery expenses.
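Both numeric columns of the availability table follow directly from the availability figure A, measured over a full year:

```latex
\text{Downtime/year} = (1 - A) \times 365 \times 24~\text{hours},
\qquad
\text{DPM} = (1 - A) \times 10^{6}
```

For example, A = 99.999% gives 0.00001 x 8760 h, about 5.3 minutes per year, and 10 defects per million.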

Slide 4: Systematic, End-to-End Approach: Targeting Downtime
- Hardware resiliency: reliable, robust hardware designed for high availability
- Software resiliency: Cisco IOS Software functionality that mitigates the impact of faults
- Embedded management: automation and local action
- Network-level resiliency: Cisco IOS Software features for faster network convergence, protection, and restoration
- System-level resiliency spans all of the above
Investment protection is a key component.

Slide 5: System-Level Resiliency Overview
- Control/data plane resiliency: separation of the control and forwarding planes; fault isolation and containment; seamless restoration after Route Processor control- and data-plane failures
- Link resiliency: reduced impact of line card hardware and software failures
- Planned outages: seamless software and hardware upgrades
- Eliminate single points of failure for hardware and software components
(Diagram: active and standby supervisors; separate control, management, and forwarding/data planes; microkernel; line cards.)

Slide 6: Network-Level Resiliency Overview
- Hierarchical network design: scalability to expand or shrink without affecting network behavior; predictable performance under both normal and failure conditions
- Convergence and self-healing: reduce convergence times for the major network protocols (EIGRP, OSPF, IS-IS, BGP); leverage wherever redundant paths exist
- Intelligent protocol fabric: embed NSF intelligence network-wide in Service Provider and Enterprise networks

Slide 7: Cisco Campus Architecture: One Architecture with Multiple Design Options
- Design options for the enterprise campus: the Multi-Layer Design, the Routed Campus Design, and future campus design options, all built on intelligent switching (a hybrid of L2 + L3 features)
- Commonality: intelligent switching, simplified configuration, reduced network complexity, improved network availability, reduced management complexity

Slide 8: Hierarchical Campus Design: Without a Rock-Solid Foundation the Rest Doesn't Matter
Foundation services: Spanning Tree, routing, VLANs, GLBP, trunking, load balancing.
(Layers: Access, Distribution, Core, Distribution, Access; Data Center.)

Slide 9: Hierarchical Campus Design: Building Blocks
- Offers hierarchy: each layer has a specific role
- Modular topology: building blocks that are easy to grow, understand, and troubleshoot
- Creates small fault domains: clear demarcations and isolation
- Promotes load balancing, redundancy, and deterministic traffic patterns
- Incorporates a balance of Layer 2 and Layer 3 technology, leveraging the strengths of both
- Can be applied to all campus designs: Multi-Layer L2/L3 and Routed Access
(Blocks: Access, Distribution, Core; Data Center, WAN, Internet.)

Slide 10: Multi-Layer Reference Design: Layer 2/3 Distribution with Layer 2 Access
- Consider fully utilizing uplinks via GLBP
- Distribution-to-distribution link required for route summarization
- STP convergence not required for uplink failure/recovery
- Map the L2 VLAN number to the L3 subnet for ease of use and management (VLAN 20 data = 10.1.20.0, VLAN 120 voice = 10.1.120.0, VLAN 40 data = 10.1.40.0, VLAN 140 voice = 10.1.140.0)
- Can easily extend VLANs across access layer switches when required
(Reference model: HSRP or GLBP for VLANs 20, 120, 40, 140; Layer 3 at the distribution, Layer 2 at the access.)

Slide 11: Routed Campus Design: Layer 3 Distribution with Layer 3 Access
- Move the Layer 2/3 demarcation to the network edge
- Upstream convergence is triggered by hardware detection of link loss from the upstream neighbor
- Beneficial for the right environment
(Model: EIGRP/OSPF from access to distribution and core; same VLAN-to-subnet mapping as the multi-layer design.)

Slide 12: Optimal Redundancy
- Core and distribution engineered with redundant nodes and links to provide maximum redundancy and optimal convergence
- Network bandwidth and capacity engineered to withstand node or link failure
- Sub-second convergence around most failure events

Slide 13: Campus Network Resilience: Sub-Second Convergence
Worst-case convergence for any link or platform failure event in campus best-practice designs is sub-second. (Chart: convergence times in seconds; OSPF results require sub-second timers.)

Slide 14: ESE Campus Solution Test Bed: Verified Design Recommendations
- 68 access switches total: 2950, 2970, 3550, 3560, 3750, 4507 Sup II+, 4507 Sup IV, 6500 Sup2, 6500 Sup32, 6500 Sup720; plus 40 APs (1200 series)
- Core: 6500s with redundant Sup720s
- Distribution: three blocks of 6500s with redundant Sup720s and 4507s with redundant Sup V, plus three blocks of 6500s with redundant Sup720s
- WAN/Internet edge: 7206VXR NPE-G1; services: 4500 Sup II+ and 6500 Sup720 with FWSM, WLSM, IDSM-2, MWAM
- 8,400 simulated hosts; 10,000 routes; end-to-end flows: TCP, UDP, RTP, IP multicast

Slide 15: Agenda (section divider)
- Campus High Availability Design Principles
- Foundation Services
- Multi-Layer Design
- Routed Access Design
- Summary

Slide 16: Best Practices: Layer 3 Routing Protocols
- Used to quickly reroute around failed nodes or links while providing load balancing over redundant paths
- Build triangles, not squares, for deterministic convergence
- Only peer on links that you intend to use as transit
- Ensure redundant L3 paths to avoid black holes
- Summarize from distribution to core to limit the EIGRP query diameter or OSPF LSA propagation
- Tune the CEF L3/L4 load-balancing hash to achieve maximum utilization of equal-cost paths (avoid CEF polarization)
- Used in both Multi-Layer and Routed Access designs
(Layer 3 equal-cost links toward the Data Center, WAN, and Internet.)
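The summarization point above can be sketched in configuration. This is a minimal sketch only: the EIGRP AS number, the uplink interface name, and the 10.1.0.0/16 distribution-block summary are assumptions, not from the slides.

```
router eigrp 100
 network 10.1.0.0 0.0.255.255
 no auto-summary
!
interface TenGigabitEthernet4/1
 description Uplink to core: advertise one summary, not per-VLAN subnets
 ip summary-address eigrp 100 10.1.0.0 255.255.0.0
```

With the summary in place, an access-subnet failure inside the block is not queried or flooded beyond the distribution pair.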

Slide 17: Best Practice: Build Triangles, Not Squares (Deterministic vs. Non-Deterministic)
- Layer 3 redundant equal-cost links support fast convergence
- Hardware-based fast recovery to the remaining path
- Convergence is extremely fast: with dual equal-cost paths there is no need for OSPF or EIGRP to recalculate a new path
Model A, triangles: link or box failure does NOT require routing protocol convergence. Model B, squares: link or box failure requires routing protocol convergence.

Slide 18: Best Practice: Passive Interfaces for IGP
Limit OSPF and EIGRP peering through the access layer. Without passive interfaces, four VLANs per wiring closet yield 12 unnecessary adjacencies; memory and CPU requirements increase with no real benefit, and the extra IGP routing updates create overhead.
OSPF example:
 router ospf 1
  passive-interface Vlan99
or
 router ospf 1
  passive-interface default
  no passive-interface Vlan99
EIGRP example:
 router eigrp 1
  passive-interface Vlan99
or
 router eigrp 1
  passive-interface default
  no passive-interface Vlan99

Slide 19: CEF Load Balancing: Avoid Underutilizing Redundant Layer 3 Paths
- The default CEF hash input is L3 (source and destination IP)
- CEF polarization: in a multi-hop design, each tier can select the same left/left or right/right path
- Imbalance and overload can occur while redundant paths are ignored or underutilized
(Diagram: access, distribution, and core all using the default L3 hash; redundant paths ignored.)

Slide 20: CEF Load Balancing: Avoid Underutilizing Redundant Layer 3 Paths (continued)
- With defaults, CEF can select the same left/left or right/right paths and ignore some redundant paths
- Alternating an L3/L4 hash with the default L3 hash gives the best load-balancing results
- The default is the L3 hash: no modification required in the core or access
- Use "mls ip cef load-sharing full" on the distribution switches to achieve better redundant-path utilization
(Diagram: distribution uses the L3/L4 hash, core and access use the default L3 hash; all paths used.)

Slide 21: Single Points of Termination: SSO/NSF, Avoiding Total Network Outage
- The access layer and other single points of failure are candidates for supervisor redundancy
- L2 access layer: SSO; L3 access layer: SSO and NSF
- The alternative is a network outage until physical replacement or reload, versus one to three seconds
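A minimal sketch of enabling SSO with NSF on a redundant-supervisor L3 access switch; the EIGRP AS number is an assumption.

```
redundancy
 mode sso
!
router eigrp 100
 ! NSF: neighbors keep forwarding on existing routes across a switchover
 nsf
```

With SSO the standby supervisor takes over statefully; NSF lets routing neighbors continue forwarding while the new supervisor re-establishes the protocol.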

Slide 22: Campus Multicast: Which PIM Mode, Sparse or Dense?
"Sparse mode Good! Dense mode Bad!" (Source: "The Caveman's Guide to IP Multicast", © 2000, R. Davis)

Slide 23: PIM Design Rules for the Routed Campus
- Use PIM sparse mode; enable it on ALL access, distribution, and core layer switches, and on ALL interfaces
- Use Anycast RPs in the core for RP redundancy and fast convergence (RP-Left and RP-Right both use 10.122.100.1)
- Define the router ID to prevent Anycast IP address overlap
- IGMP snooping is enabled automatically when PIM is enabled on a VLAN interface (SVI)
- (Optional) Use a garbage-can RP to black-hole unassigned IP multicast traffic
(IP multicast sources: IP/TV server, CallManager with MoH; WAN and Internet edges.)

Slide 24: Multicast in the Campus (Anycast RP with MSDP)
Access and distribution switches (IGMP snooping is on by default):
 ip pim rp-address 10.0.0.1
 !
 interface Y
  description GigE to Access/Core
  ip address 10.122.0.Y 255.255.255.252
  ip pim sparse-mode
Core-A:
 interface loopback 0
  ip address 10.0.0.1 255.255.255.255
 interface loopback 1
  ip address 10.0.0.2 255.255.255.255
 !
 ip msdp peer 10.0.0.3 connect-source loopback 1
 ip msdp originator-id loopback 1
 !
 interface TenGigabitEthernet M/Y
  ip address 10.122.0.X 255.255.255.252
  ip pim sparse-mode
 !
 ip pim rp-address 10.0.0.1
Core-B: identical, except loopback 1 is 10.0.0.3 and the MSDP peer is 10.0.0.2.
Both cores share the Anycast RP address 10.0.0.1 on loopback 0 and exchange source state over MSDP between their unique loopback 1 addresses.

Slide 25: Best Practices: UDLD Configuration
- Typically deployed on any fiber-optic interconnection
- Use UDLD aggressive mode for the best protection
- Turn it on in global configuration to avoid operational errors and misses
Config example:
 Cisco IOS Software: udld aggressive
 CatOS: set udld enable
        set udld aggressive-mode enable

Slide 26: Unidirectional Link Detection: Protecting Against One-Way Communication
- Highly available networks require UDLD to protect against one-way communication or partially failed links and the effect they can have on protocols like STP and RSTP
- Primarily used on fiber-optic links, where patch-panel errors can leave a link up/up with mismatched transmit/receive pairs
- Each switch port configured for UDLD sends UDLD protocol packets (at L2) containing the port's own device/port ID and the neighbor's device/port IDs seen by UDLD on that port
- Neighboring ports should see their own device/port ID echoed in the packets received from the other side
- If a port does not see its own device/port ID in incoming UDLD packets for a specific duration, the link is considered unidirectional and is shut down
("Are you echoing my hellos?")

Slide 27: Best Practices: EtherChannel Configuration
- Typically deployed on distribution-to-core and core-to-core interconnections
- Provides link redundancy while reducing peering complexity
- Tune the L3/L4 load-balancing hash to achieve maximum utilization of channel members
- Match CatOS and Cisco IOS Software PAgP settings; use 802.3ad LACP for interoperability if you need it
- Disable unless needed; on host ports: CatOS: set port host; Cisco IOS Software: switchport host
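A sketch of a distribution-to-core channel following these practices; the interface numbers and channel-group number are assumptions.

```
! hash channel members on L3 + L4 information (see the load-balancing slide)
port-channel load-balance src-dst-port
!
interface range TenGigabitEthernet4/1 - 2
 description Members of the distribution-to-core EtherChannel
 channel-protocol pagp
 channel-group 1 mode desirable
!
interface Port-channel1
 description Distribution-to-core logical link
```

Using the same PAgP mode (desirable) on both sides avoids the negotiation mismatches covered on slide 29.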

Slide 28: EtherChannel Load Balancing: Avoid Underutilizing Redundant Layer 2 Paths
- The network did not load balance using the default L3 load-balancing hash
- Common IP addressing scheme: 72 access subnets addressed uniformly from 10.120.x.10 to 10.120.x.215
- Converting to the L4 load-balancing hash achieved better load sharing:
 cr2-6500-1(config)# port-channel load-balance src-dst-port
- L3 hash: link 0 at 68%, link 1 at 32%. L4 hash: link 0 at 52%, link 1 at 48%.

Slide 29: PAgP Tuning: PAgP Default Mismatches
Matching the EtherChannel configuration on both sides improves link restoration convergence times. A PAgP default mismatch (for example, set port channel on one side and off on the other) adds as much as seven seconds of delay and loss, which tuning removes.

Slide 30: Mitigating Plug-and-Players: Protecting Against Well-Intentioned Users
Problem: well-intentioned users place unauthorized switches on the network, injecting incorrect STP information and possibly causing instability.
Solution: Cisco Catalyst switches support rogue-BPDU filtering with BPDU Guard and Root Guard; authorized switches can be validated with Cisco Secure ACS.

Slide 31: BPDU Guard: Preventing Loops via the WLAN (Windows XP Bridging)
Problem: WLAN APs do not forward BPDUs, so multiple Windows XP machines with bridging enabled can create a loop in the wired VLAN via the WLAN; the BPDUs that would block the loop are discarded.
Solution: BPDU Guard configured on all end-station switch ports disables the port when a BPDU is received, preventing the loop from forming.

Slide 32: Cisco Catalyst Integrated Security Features
- Port security prevents MAC flooding attacks
- DHCP snooping prevents client attacks on the switch and server
- Dynamic ARP Inspection adds security to ARP using the DHCP snooping table
- IP Source Guard adds security to the IP source address using the DHCP snooping table
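These four features layer on one another; a minimal access-port sketch, where the VLAN numbers and interface name are assumptions:

```
ip dhcp snooping
ip dhcp snooping vlan 20,120
! Dynamic ARP Inspection validates ARP against the snooping bindings
ip arp inspection vlan 20,120
!
interface GigabitEthernet1/0/1
 switchport mode access
 switchport access vlan 20
 ! port security: limit learned MACs (e.g. phone plus PC)
 switchport port-security
 switchport port-security maximum 3
 ! IP Source Guard: validate source IPs against the snooping bindings
 ip verify source
```

DHCP snooping builds the binding table that both Dynamic ARP Inspection and IP Source Guard consult.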

Slide 33: Cisco Catalyst 6500 High Availability Leadership: Maximizing Uptime
- Physical redundancy: redundant supervisors, power supplies, switch fabrics, and clocks
- Nonstop Forwarding/Stateful Switchover (NSF/SSO): traffic continues flowing after a primary supervisor failure; sub-second recovery in L2 and L3 networks
- Generic Online Diagnostics (GOLD): proactively detect and address potential hardware and software faults before they adversely impact network traffic
- New: Cisco IOS Software Modularity: subsystem In-Service Software Upgrades (ISSU), stateful process restarts, fault containment, and memory protection

Slide 34: Catalyst 6500 with Cisco IOS Software Modularity: Infrastructure Enhancements
- Protected memory and fault containment
- Restartable processes: 20+ independent processes; the remaining feature subsystems live in the IOS base process
- Subsystem ISSU
- Embedded Event Manager (EEM): Tcl policy scripts program the Catalyst 6500 (when event X is detected, do action Y)
- Generic Online Diagnostics (GOLD) supports proactive diagnosis of faults before they become a problem
(Architecture: network-optimized microkernel; base, routing, TCP, UDP, EEM, FTP, CDP, INETD, and other processes over a high-availability infrastructure, separate from the data plane.)

Slide 35: Software Modularity: Minimize Unplanned Downtime
- If an error occurs in a modular process, the HA subsystem determines the best recovery action: restart the process, switch over to the standby supervisor, or remove the system from the network
- A process restarts with no impact on the data plane, using Cisco Nonstop Forwarding (NSF) where appropriate
- State checkpointing allows quick process recovery
- Traffic forwarding continues during unplanned process restarts

Slide 36: Software Modularity: Simplify Software Changes
- Traffic forwarding continues during planned software changes
- If the software needs to be upgraded (for example, to protect against a new security vulnerability), the change can be made available as an individual patch, which reduces code certification time
- Subsystem In-Service Software Upgrade (ISSU) allows the change to be applied with no service disruption (for all modularized processes)

Slide 37: Agenda (section divider)
- Campus High Availability Design Principles
- Foundation Services
- Multi-Layer Design
- Routed Access Design
- Summary

Slide 38: Why Multi-Layer Campus Design?
- The most widely deployed campus design
- Supports spanning VLANs and subnets across multiple access layer switches
- Leverages the strengths of both Layer 2 and Layer 3 capabilities
- Supported on all models of Cisco Catalyst switches
(Access models: looped and non-looped.)

Slide 39: Multi-Layer Design Best Practices: Span VLANs ONLY When You Have To
- More common in the data center
- Required when a VLAN spans access layer switches, and to protect against user-side loops
- Use Rapid PVST+ for the best convergence
- Take advantage of the Spanning Tree Toolkit

Slide 40: PVST+, Rapid PVST+, and MST (802.1D, 802.1s, 802.1w)
- 802.1D-1998: classic Spanning Tree Protocol (STP)
- 802.1D-2004: Rapid Spanning Tree Protocol (RSTP = 802.1w)
- 802.1s: Multiple Spanning Tree Protocol (MST)
- 802.1t: 802.1D maintenance; 802.1Q: VLAN tagging (trunking)
- PVST+: one instance of STP (802.1D-1998) per VLAN, plus PortFast, UplinkFast, BackboneFast, BPDU Guard, BPDU Filter, Root Guard, and Loop Guard
- Rapid PVST+: one instance of RSTP (802.1D-2004 = 802.1w) per VLAN, plus PortFast, BPDU Guard, BPDU Filter, Root Guard, and Loop Guard
- MST (802.1s): up to 16 instances of RSTP (802.1w), combining many VLANs with the same physical and logical topology into a common RSTP instance; PortFast, BPDU Guard, BPDU Filter, Root Guard, and Loop Guard are also supported

Slide 41: Spanning Tree Toolkit
- PortFast*: bypasses the listening/learning phases on access ports
- UplinkFast: three-to-five-second convergence after an uplink failure
- BackboneFast: cuts convergence time by Max_Age for indirect failures
- LoopGuard*: prevents an alternate or root port from becoming designated in the absence of BPDUs
- RootGuard*: prevents external switches from becoming root
- BPDUGuard*: disables a PortFast-enabled port if a BPDU is received
- BPDUFilter*: neither sends nor receives BPDUs on PortFast-enabled ports
* Also supported with MST and Rapid PVST+
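The edge-port pieces of the toolkit can be applied globally; a minimal sketch, where the interface name is an assumption and the global defaults affect only non-trunking ports.

```
! every access port gets PortFast; any BPDU seen there errdisables the port
spanning-tree portfast default
spanning-tree portfast bpduguard default
!
interface GigabitEthernet1/0/10
 description End-station port
 ! macro: access mode, PortFast, channeling disabled
 switchport host
```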

Slide 42: Layer 2 Hardening: Spanning Tree Should Behave the Way You Expect
- Place the root where you want it: use the root primary/secondary macro
- The root bridge should stay where you put it: RootGuard, LoopGuard, UplinkFast (classic STP only), and UDLD
- Only end-station traffic should be seen on an edge port: BPDU Guard or RootGuard, PortFast, and port security
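The root-placement rules above can be sketched on the distribution switch intended as root; the VLAN list and downlink interface name are assumptions.

```
! pin the root: the macro sets the bridge priority low enough to win election
spanning-tree vlan 20,120 root primary
! protect non-designated ports against silent BPDU loss
spanning-tree loopguard default
!
interface GigabitEthernet2/1
 description Downlink to access switch
 ! ignore superior BPDUs arriving from below the root
 spanning-tree guard root
```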

Slide 43: Optimizing Convergence: PVST+ or Rapid PVST+
- Rapid PVST+ greatly improves restoration times for any VLAN that requires a topology convergence due to link up
- Rapid PVST+ also greatly improves convergence over BackboneFast for any indirect link failure
- As much as 30 seconds of delay and loss tuned away

Slide 44: Multi-Layer Design Best Practices: Trunk Configuration
- Typically deployed on interconnections between the access and distribution layers
- Use VTP transparent mode to decrease the potential for operational error
- Hard-set trunk mode to on and encapsulation negotiation to off for optimal convergence
- Change the native VLAN to something unused to avoid VLAN hopping
- Manually prune all VLANs except those needed
- Disable on host ports: CatOS: set port host; Cisco IOS: switchport host
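Those trunk practices, sketched for one access uplink; the interface name, native VLAN 999, and allowed-VLAN list are assumptions.

```
vtp mode transparent
!
interface GigabitEthernet1/1
 description Uplink trunk to distribution
 switchport trunk encapsulation dot1q
 switchport mode trunk
 ! turn off DTP/encapsulation negotiation
 switchport nonegotiate
 ! unused native VLAN defeats VLAN hopping
 switchport trunk native vlan 999
 ! manual pruning: only this closet's data and voice VLANs
 switchport trunk allowed vlan 20,120
```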

Slide 45: Optimizing Convergence: Trunk Tuning
- Trunk auto/desirable negotiation takes some time; DTP negotiation tuning improves link-up convergence time
 CatOS> (enable) set trunk nonegotiate dot1q
 IOS(config-if)# switchport mode trunk
 IOS(config-if)# switchport nonegotiate
- Two seconds of delay and loss tuned away

Slide 46: Multi-Layer Design Best Practices: First-Hop Redundancy
- Provides a resilient default gateway/first-hop address to end stations
- HSRP, VRRP, and GLBP are the alternatives; HSRP and GLBP provide millisecond timers and excellent convergence performance
- Use VRRP if you need multi-vendor interoperability
- GLBP facilitates uplink load balancing
- Tune preempt timers to avoid black-holed traffic

Slide 47: Optimizing Convergence: HSRP Timers (Millisecond Convergence)
HSRP provides default gateway redundancy; it affects traffic leaving the access layer.
 interface Vlan5
  description Data VLAN for 6k-access
  ip address 10.1.5.3 255.255.255.0
  ip helper-address 10.5.10.20
  no ip redirects
  ip pim query-interval 250 msec
  ip pim sparse-mode
  logging event link-status
  standby 1 ip 10.1.5.1
  standby 1 timers msec 200 msec 750
  standby 1 priority 150
  standby 1 preempt
  standby 1 preempt delay minimum 180

Slide 48: Optimizing Convergence: HSRP Preempt Delay
- The preempt delay needs to be longer than box boot time: L1 (boards), L2 (STP), L3 (IGP convergence)
- Without an increased preempt delay, HSRP can go active before the box is completely ready to forward traffic
 standby 1 preempt delay minimum 180
- More than 30 seconds of delay and loss tuned away (the test tool timeout was 30 seconds)

Slide 49: First-Hop Redundancy with GLBP (Cisco-Designed, Load Sharing, Patent Pending)
- All the benefits of HSRP plus load balancing of the default gateway: utilizes all available bandwidth
- A group of routers functions as one virtual router by sharing one virtual IP address while using multiple virtual MAC addresses for traffic forwarding
- Allows traffic from a single common subnet to go through multiple redundant gateways using a single virtual IP address
(Example: Distribution-A is the AVG and an AVF, Distribution-B is an AVF; both forward traffic. All three hosts use gateway 10.0.0.10, but ARP resolves it to either vMAC 0007.b400.0101 or 0007.b400.0102.)
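A sketch of the GLBP side of the example, using the slide's virtual IP; the SVI number, real address, timer, and priority values are assumptions in the spirit of the HSRP tuning shown earlier.

```
interface Vlan10
 ip address 10.0.0.254 255.255.255.0
 glbp 1 ip 10.0.0.10
 glbp 1 timers msec 250 msec 750
 glbp 1 priority 150
 glbp 1 preempt delay minimum 180
 ! the AVG hands out the virtual MACs to hosts in turn
 glbp 1 load-balancing round-robin
```

The AVG answers ARP for 10.0.0.10 and alternates which forwarder's virtual MAC it returns, which is how the uplink load balancing on the slide happens.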

Slide 50: Optimizing Convergence: VRRP, HSRP, and GLBP (Mean, Max, and Min: Are There Differences?)
- VRRP does not have sub-second timers, and all flows go through a common VRRP peer: mean, maximum, and minimum loss are equal
- HSRP has sub-second timers, but all flows still go through the same HSRP peer, so there is no difference between mean, maximum, and minimum
- GLBP has sub-second timers and distributes the load among the GLBP peers, so 50% of the clients are not affected by an uplink failure
- For a distribution-to-access link failure, 50% of flows see ZERO loss with GLBP: GLBP is 50% better

51 51 © 2005 Cisco Systems, Inc. All rights reserved. Both distribution switches act as default gateway Blocked uplink caused traffic to take less than optimal path VLAN 2 F 2 B2B2 B 2 F: Forwarding B: Blocking Access-b Core Access-a Distribution-A GLBP Virtual MAC 1 If You Span VLANS Tuning Required By Default Half the Traffic Will Take a Two Hop L2 Path Distribution-B GLBP Virtual MAC 2 Access Layer 2 Access Layer 2 Distribution Layer 2/3 Core Layer 3

52 52 © 2005 Cisco Systems, Inc. All rights reserved. VLAN 3 VLAN 2 Access-b Core Access-a Distribution-A GLBP Virtual MAC 1 Distribution-B GLBP Virtual MAC 2 BxBx STP Port Cost Increased GLBP + STP turning Change the Blocking Interfaces + VLAN per Access 1.Force STP to block the interface between the distribution switches 2.Use the fewest possible VLANs per access switch Access Layer 2 Access Layer 2 Distribution Layer 2/3 Core Layer 3

53 53 © 2005 Cisco Systems, Inc. All rights reserved. VLAN 2 Asymmetric Routing (Unicast Flooding) Affects redundant topologies with shared L2 access One path upstream and two paths downstream CAM table entry ages out on standby HSRP Without a CAM entry packet is flooded to all ports in the VLAN Downstream Packet Flooded Upstream Packet Unicast to Active HSRP Asymmetric Equal Cost Return Path CAM Timer Has Aged out on Standby HSRP VLAN 2

54 54 © 2005 Cisco Systems, Inc. All rights reserved. VLAN 2 Best Practices Prevent Unicast Flooding Assign one unique voice and as few data VLAN’s as possible to each access switch Traffic is now only flooded down one trunk Access switch unicasts correctly; no flooding to all ports If you have to: Tune ARP and CAM aging timers; CAM timer exceeds ARP timer Bias routing metrics to remove equal cost routes Downstream Packet Flooded on Single Port Upstream Packet Unicast to Active HSRP Asymmetric Equal Cost Return Path VLAN 3 VLAN 4 VLAN 5

Slide 55: Keep Redundancy Simple
"If some redundancy is good, more redundancy is NOT better." Extra links and nodes raise hard questions: root placement? how many blocked links? convergence? They also make fault resolution complex.

56 56 © 2005 Cisco Systems, Inc. All rights reserved. VLAN 2 But Not Too Simple… What Happens if You Don’t Link the Distributions? STP’s slow convergence can cause considerable periods of traffic loss STP could cause non-deterministic traffic flows/link load engineering STP convergence will cause Layer 3 convergence STP and Layer 3 timers are independent Unexpected Layer 3 convergence and re-convergence could occur Even if you do link the distribution switches dependence on STP and link state/connectivity can cause HSRP irregularities and unexpected state transitions F 2 B2B2 STP Secondary Root and HSRP Standby F 2 Access-b Core Hellos Traffic Dropped Until HSRP Goes Active Access-a STP Root and HSRP Active Traffic Dropped Until MaxAge Expires Then Listening and Learning Traffic Dropped Until Transition to Forwarding; As much as 50 Seconds

Slide 57: What If You Don't? Black Holes and Multiple Transitions
- The blocking link on access-b takes 50 seconds to move to forwarding, so traffic is black-holed until HSRP goes active on the standby peer; aggressive HSRP timers limit this first black hole
- After MaxAge expires (or BackboneFast or Rapid PVST+ converges), HSRP preempt causes another transition; BackboneFast limits the time to this second event to 30 seconds, and even with Rapid PVST+ it is at least one second
- Access-b is then used as transit for access-a's traffic

Slide 58: What If You Don't? Return-Path Traffic Black-Holed
- The blocking link on access-b takes 50 seconds to move to forwarding, so return traffic is black-holed until then
- 802.1D: up to 50 seconds; PVST+ with BackboneFast: 30 seconds; Rapid PVST+: addressed by the protocol (about one second)

Slide 59: Layer 2 Distribution Interconnection: The Redundant Link from the Access Layer Is Blocked
- Use only if Layer 2 VLAN spanning flexibility is required
- STP convergence is required for uplink failure/recovery
- More complex: the STP root and the HSRP active peer should match
- Distribution-to-distribution link required for route summarization
(Model: one distribution switch is HSRP active and STP root for VLANs 20 and 140, the other for VLANs 40 and 120; Layer 2 trunk between them.)
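Keeping the STP root and the HSRP active peer on the same box can be sketched for Distribution-A and VLAN 20; the addresses and priority are assumptions, while the VLAN split follows the slide.

```
! Distribution-A: STP root and HSRP active for VLANs 20 and 140
spanning-tree vlan 20,140 root primary
!
interface Vlan20
 ip address 10.1.20.2 255.255.255.0
 standby 1 ip 10.1.20.1
 ! higher priority than Distribution-B, so HSRP active follows the root
 standby 1 priority 150
 standby 1 preempt delay minimum 180
```

Distribution-B mirrors this for VLANs 40 and 120.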

60 60 © 2005 Cisco Systems, Inc. All rights reserved. Layer 3 Distribution Interconnection No Spanning Tree—All Links Active Recommended ‘best practice’—tried and true No STP convergence required for uplink failure/recovery Distribution-to-distribution link required for route summarization Map L2 VLAN number to L3 subnet for ease of use/management Layer 3 VLAN 20 Data VLAN 120 Voice 10.1.20.0 10.1.120.0 VLAN 40 Data VLAN 140 Voice 10.1.40.0 10.1.140.0 HSRP Active VLAN 20,140 HSRP Active VLAN 40,120 HSRP Model Layer 2 Links Access Distribution

61 61 © 2005 Cisco Systems, Inc. All rights reserved. Layer 3 Distribution Interconnection GLBP Gateway Load Balancing Protocol Fully utilize uplinks via GLBP Distribution-to-distribution required for route summarization No STP convergence required for uplink failure/recovery Layer 3 VLAN 20 Data VLAN 120 Voice 10.1.20.0 10.1.120.0 VLAN 40 Data VLAN 140 Voice 10.1.40.0 10.1.140.0 GLBP Model Layer 2 Links GLBP Active VLAN 20,120,40,140 GLBP Active VLAN 20,120, 40, 140 Access Distribution
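A minimal GLBP sketch for one of the slide's VLANs; unlike HSRP, GLBP answers ARP requests for the virtual gateway with different virtual MAC addresses, so both distribution switches forward traffic. Addresses, the group number, and the load-balancing method are illustrative:

```
interface Vlan20
 ip address 10.1.20.2 255.255.255.0
 glbp 20 ip 10.1.20.1
 glbp 20 priority 150
 glbp 20 preempt
 ! hand out virtual MACs in turn so hosts split across both forwarders
 glbp 20 load-balancing round-robin
```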

62 62 © 2005 Cisco Systems, Inc. All rights reserved. VLAN 120 Voice 10.1.120.0/24 Layer 3 Distribution Interconnection Reference Design—No VLANs Span Access Layer Tune CEF load balancing Match CatOS/IOS Etherchannel settings and tune load balancing Summarize routes towards core Limit redundant IGP peering STP Root and HSRP primary tuning or GLBP to load balance on uplinks Set trunk mode on/nonegotiate Disable Etherchannel unless needed Set Port Host on access layer ports: Disable Trunking Disable Etherchannel Enable PortFast RootGuard or BPDU-Guard Use security features P-t-P Link Layer 3 VLAN 20 Data 10.1.20.0/24 VLAN 140 Voice 10.1.140.0/24 VLAN 40 Data 10.1.40.0/24 Access Distribution Core
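The access-port checklist on this slide roughly corresponds to the following Catalyst IOS sketch; interface numbering is illustrative, and `switchport host` is the macro that sets access mode, enables PortFast, and disables channeling in one step:

```
! edge ports facing end stations
interface range GigabitEthernet1/0/1 - 48
 switchport host
 spanning-tree bpduguard enable
!
! uplink to the distribution layer: trunk forced on, DTP negotiation off
interface GigabitEthernet1/0/49
 switchport trunk encapsulation dot1q
 switchport mode trunk
 switchport nonegotiate
```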

63 63 © 2005 Cisco Systems, Inc. All rights reserved. VLAN 250 WLAN 10.1.250.0/24 Layer 2 Distribution Interconnection Some VLANs Span Access Layer Tune CEF load balancing Match CatOS/IOS Etherchannel settings and tune load balancing Summarize routes towards core Limit redundant IGP peering STP Root and HSRP primary or GLBP and STP port cost tuning to load balance on uplinks Set trunk mode on/nonegotiate Disable Etherchannel unless needed RootGuard on downlinks LoopGuard on uplinks Set port host on access Layer ports: Disable trunking Disable Etherchannel Enable PortFast RootGuard or BPDU-Guard Use security features VLAN 120 Voice 10.1.120.0/24 Trunk VLAN 20 Data 10.1.20.0/24 VLAN 140 Voice 10.1.140.0/24 VLAN 40 Data 10.1.40.0/24 Layer 2 Access Distribution Core
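The RootGuard/LoopGuard placement this slide adds for spanned VLANs could be sketched as follows on a distribution switch; the interface numbers are illustrative:

```
! downlink toward the access layer: never accept a superior BPDU from below
interface GigabitEthernet3/1
 spanning-tree guard root
!
! uplink/interconnect carrying the spanned VLANs: protect against unidirectional links
interface GigabitEthernet3/2
 spanning-tree guard loop
```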

64 64 © 2005 Cisco Systems, Inc. All rights reserved. Agenda Campus High Availability Design Principles Foundation Services Multi-Layer Design Routed Access Design EIGRP Design Details OSPF Design Details Summary Architectural Foundation Hierarchical Campus Design Security Mobility Convergence Availability Flexibility

65 65 © 2005 Cisco Systems, Inc. All rights reserved. Why Routed Access Campus Design? Most Cisco Catalyst switches support L3 switching today EIGRP/OSPF routing preferred over spanning tree IGP enhancements: stub router/area, fast reroute, etc. Single control plane and well-known tool set Traceroute, show ip route, show ip eigrp neighbors, etc. It is another design option available to you Layer 2 Layer 3 Access Distribution

66 66 © 2005 Cisco Systems, Inc. All rights reserved. Ease of Implementation Less to get right: No STP feature placement core to distribution LoopGuard RootGuard STP Root No default gateway redundancy setup/tuning No matching of STP/HSRP priority No L2/L3 multicast topology inconsistencies

67 67 © 2005 Cisco Systems, Inc. All rights reserved. Ease of Troubleshooting Routing troubleshooting tools show ip route Traceroute Ping and extended pings Extensive protocol debugs Consistent troubleshooting across access, distribution, and core Bridging troubleshooting tools show arp show spanning-tree, show standby, etc. Multiple ‘show cam dynamic’ passes to find a host Failure differences Routed topologies fail closed (neighbor loss stops forwarding) Layer 2 topologies fail open (broadcasts and unknown unicast are flooded)

68 68 © 2005 Cisco Systems, Inc. All rights reserved. Routed Campus Design Resiliency Advantages? Yes, with a Good Design Sub-200 msec convergence for EIGRP and OSPF OSPF convergence times dependent on timer tuning RPVST+ convergence times dependent on GLBP/HSRP tuning (Chart: measured convergence in seconds at points A and B.)

69 69 © 2005 Cisco Systems, Inc. All rights reserved. Routed Access Considerations Do you have any Layer 2 VLAN adjacency requirements between access switches? IP addressing: do you have enough address space and the allocation plan to support a routed access design? Platform requirements: Cisco Catalyst 6500 requires an MSFC in the access to get all the necessary switchport and routing features Cisco Catalyst 4500 requires a SUP4/5 for EIGRP or OSPF support Cisco Catalyst 3500s and 3700s require an enhanced Cisco IOS Software image for EIGRP and OSPF

70 70 © 2005 Cisco Systems, Inc. All rights reserved. EIGRP vs. OSPF as Your Campus IGP DUAL vs. Dijkstra Convergence: Within the campus environment, both EIGRP and OSPF provide extremely fast convergence EIGRP requires summarization OSPF requires summarization and timer tuning for fast convergence Flexibility: EIGRP supports multiple levels of route summarization and route filtering which simplifies migration from the traditional Multi-Layer L2/L3 campus design OSPF area design restrictions need to be considered Scalability: Both protocols can scale to support very large enterprise network topologies

71 71 © 2005 Cisco Systems, Inc. All rights reserved. Routed Access Design High-Speed Campus Convergence Convergence is the time needed for traffic to be rerouted to the alternative path after the network event Network convergence requires all affected routers to process the event and update the appropriate data structures used for forwarding Network convergence is the time required to: Detect the event Propagate the event Process the event Update the routing table/FIB

72 72 © 2005 Cisco Systems, Inc. All rights reserved. Agenda Campus High Availability Design Principles Foundation Services Multi-Layer Design Routed Access Design EIGRP Design Details OSPF Design Details Summary Architectural Foundation Hierarchical Campus Design Security Mobility Convergence Availability Flexibility

73 73 © 2005 Cisco Systems, Inc. All rights reserved. Strengths of EIGRP Advanced distance vector Maps easily to the traditional Multi-Layer design 100% loop free Fast convergence Easy configuration Incremental updates Supports VLSM and discontiguous networks Classless routing Protocol independent (IPv6, IPX, and AppleTalk) Unequal-cost path load balancing Flexible topology design options

74 74 © 2005 Cisco Systems, Inc. All rights reserved. EIGRP Design Rules for HA Campus Similar to WAN Design, But… EIGRP design for the campus follows all the same best practices as in the WAN, with a few differences: No BW limitations Lower neighbor counts Direct fiber interconnects Lower cost redundancy HW switching WAN: stability and speed; Campus: stability, redundancy, load sharing, and high speed

75 75 © 2005 Cisco Systems, Inc. All rights reserved. EIGRP in the Campus Conversion to an EIGRP Routed Edge The greatest advantages of extending EIGRP to the access are gained when the network has a structured addressing plan that allows for use of summarization and stub routers EIGRP provides the ability to implement multiple tiers of summarization and route filtering Relatively painless to migrate to a L3 access with EIGRP if the network addressing scheme permits Able to maintain a deterministic convergence time in a very large L3 topology (Diagram: 10.10.0.0/17 and 10.10.128.0/17 summarized as 10.10.0.0/16.)

76 76 © 2005 Cisco Systems, Inc. All rights reserved. EIGRP Design Rules for HA Campus Limit Query Range to Maximize Performance EIGRP convergence is largely dependent on query response times Minimize the number of queries to speed up convergence Summarize distribution block routes upstream to the core Upstream queries are returned immediately with infinite cost Configure all access switches as EIGRP stub routers No downstream queries are ever sent

77 77 © 2005 Cisco Systems, Inc. All rights reserved. EIGRP Neighbors Event Detection EIGRP neighbor relationships are created when a link comes up and routing adjacency is established When a physical interface changes state, the routing process is notified Carrier-delay should be set explicitly because the default varies by platform Some events are detected by the routing protocol Neighbor is lost, but interface is UP/UP To improve failure detection: Use routed interfaces and not SVIs Decrease interface carrier-delay to 0 Decrease EIGRP hello and hold-down timers Hello = 1 Hold-down = 3 interface GigabitEthernet3/2 ip address 10.120.0.50 255.255.255.252 ip hello-interval eigrp 100 1 ip hold-time eigrp 100 3 carrier-delay msec 0 Hellos Routed Interface L2 Switch or VLAN Interface

78 78 © 2005 Cisco Systems, Inc. All rights reserved. EIGRP Query Process Queries Propagate the Event EIGRP is an advanced distance vector protocol; it relies on its neighbors to provide routing information If a route is lost and no feasible successor is available, EIGRP actively queries its neighbors for the lost route(s) The router must receive replies back from ALL queried neighbors before it calculates successor information If any neighbor fails to reply, the queried route is stuck in active and the router resets the neighbor that failed to reply The fewer routers and routes queried, the faster EIGRP converges; the solution is to limit query range Query Reply Traffic Dropped Until EIGRP Converges Access Distribution Core Distribution Access

79 79 © 2005 Cisco Systems, Inc. All rights reserved. No Queries to Rest of Network from Core EIGRP Query Process With Summarization Query Reply Reply ∞ interface gigabitethernet 3/1 ip address 10.120.10.1 255.255.255.252 ip summary-address eigrp 1 10.130.0.0 255.255.0.0 Summary Route Traffic Dropped Until EIGRP Converges When we summarize from distribution to core for the subnets in the access, we limit the upstream query/reply process In a large network this can be significant because queries now stop at the core; no additional distribution blocks are involved in the convergence event The access layer is still queried

80 80 © 2005 Cisco Systems, Inc. All rights reserved. EIGRP Stubs A stub router signals (through the hello protocol) that it is a stub and should not transit traffic Queries that would have been generated towards the stub routers are marked as if a “No path this direction” reply had been received D1 will know that stubs cannot be transit paths, so they will not have any path to 10.130.1.0/24 D1 simply will not query the stubs, reducing the total number of queries in this example to 1 These stubs will not pass D1’s advertisement of 10.130.1.0/24 to D2 D2 will only have one path to 10.130.1.0/24 D2D1 Distribution Access 10.130.1.0/24 “Hello, I’m a Stub…” “I’m Not Going to Send You Any Queries Since You Said That!”
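Declaring an access switch as an EIGRP stub is a one-line addition to the routing process; the AS number below matches the earlier interface example, and the keywords reflect the common choice of advertising connected and summary routes only:

```
router eigrp 100
 network 10.0.0.0
 ! advertise only connected and summary routes; never used as transit, never queried
 eigrp stub connected summary
```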

81 81 © 2005 Cisco Systems, Inc. All rights reserved. No Queries to Rest of Network from Core EIGRP Query Process With Summarization and Stub Routers When we summarize from distribution to core for the subnets in the access, we limit the upstream query/reply process In a large network this can be significant because queries now stop at the core; no additional distribution blocks are involved in the convergence event When the access switches are EIGRP stubs, we can further reduce the query diameter Non-stub routers do not query stub routers—so no queries will be sent to the access nodes No secondary queries—and only three nodes involved in the convergence event Query Reply Reply ∞ Stub Summary Route Traffic Dropped Until EIGRP Converges

82 82 © 2005 Cisco Systems, Inc. All rights reserved. EIGRP Route Filtering in the Campus Control Route Advertisements Bandwidth is not a constraining factor in the campus but it is still advisable to control the number of routing updates advertised Remove/filter routes from the core to the access and inject a default route with distribute-lists Smaller routing table in access is simpler to troubleshoot Deterministic topology router eigrp 100 network 10.0.0.0 distribute-list Default out ip access-list standard Default permit 0.0.0.0

83 83 © 2005 Cisco Systems, Inc. All rights reserved. EIGRP Routed Access Campus Design Summary Detect the event: Set hello-interval = 1 second and hold-time = 3 seconds to detect soft neighbor failures Set carrier-delay = 0 Propagate the event: Configure all access layer switches as stub routers to limit queries from the distribution layer Summarize the access routes from the distribution to the core to limit queries across the campus Process the event: Summarize and filter routes to minimize calculating new successors for the RIB and FIB Summary Route Stub
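Pulling the three steps of this summary together, the distribution and access roles might carry configurations like these; the AS number, timers, and summary prefix follow earlier slides where shown, while the interface is illustrative:

```
! Distribution switch: fast detection plus a summary toward the core
interface GigabitEthernet3/1
 ip summary-address eigrp 100 10.130.0.0 255.255.0.0
 ip hello-interval eigrp 100 1
 ip hold-time eigrp 100 3
 carrier-delay msec 0
!
! Access switch: stub router, so it is never queried from above
router eigrp 100
 network 10.0.0.0
 eigrp stub connected summary
```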

84 84 © 2005 Cisco Systems, Inc. All rights reserved. Agenda Campus High Availability Design Principles Foundation Services Multi-Layer Design Routed Access Design EIGRP Design Details OSPF Design Details Summary Architectural Foundation Hierarchical Campus Design Security Mobility Convergence Availability Flexibility

85 85 © 2005 Cisco Systems, Inc. All rights reserved. Open Shortest Path First (OSPF) Overview OSPFv2 established in 1991 with RFC 1247 Goal—a link-state protocol more efficient and scalable than RIP Dijkstra Shortest Path First (SPF) algorithm Metric—path cost Fast convergence Support for CIDR, VLSM, authentication, multipath, and IP unnumbered Low steady-state bandwidth requirement OSPFv3 for IPv6 support

86 86 © 2005 Cisco Systems, Inc. All rights reserved. Hierarchical Campus Design OSPF Areas with Router Types (Diagram: backbone Area 0 in the core; Areas 10, 20, and 30 in the access/distribution blocks; Areas 100, 200, and 300 toward the Data Center, WAN, and Internet (BGP) modules; ABRs at the area borders, ASBRs at the BGP edge, internal routers within areas.)

87 87 © 2005 Cisco Systems, Inc. All rights reserved. OSPF Design Rules for HA Campus Where Are the Areas? Area size/border is bounded by the same concerns in the campus as the WAN In campus, the lower number of nodes and stability of local links could allow you to build larger areas however… Area design also based on address summarization Area boundaries should define buffers between fault domains Keep area 0 for core infrastructure; do not extend to the access routers Data Center WAN Internet Area 100Area 110Area 120 Area 0

88 88 © 2005 Cisco Systems, Inc. All rights reserved. Backbone Area 0 Area 120 Area Border Router Regular Area ABRs Forward All LSAs from Backbone An ABR Forwards the Following into an Area Summary LSAs (Type 3) ASBR Summary (Type 4) Specific Externals (Type 5) Access Config: router ospf 100 network 10.120.0.0 0.0.255.255 area 120 Distribution Config router ospf 100 summary-address 10.120.0.0 255.255.0.0 network 10.120.0.0 0.0.255.255 area 120 network 10.122.0.0 0.0.255.255 area 0 External Routes/LSA Present in Area 120

89 89 © 2005 Cisco Systems, Inc. All rights reserved. Backbone Area 0 Area 120 Area Border Router Stub Area Consolidates Specific External Links—Default 0.0.0.0 Stub Area ABR Forwards Summary LSAs Summary 0.0.0.0 Default Access Config: router ospf 100 area 120 stub network 10.120.0.0 0.0.255.255 area 120 Distribution Config router ospf 100 area 120 stub summary-address 10.120.0.0 255.255.0.0 network 10.120.0.0 0.0.255.255 area 120 network 10.122.0.0 0.0.255.255 area 0 Eliminates External Routes/LSA Present in Area (Type 5)

90 90 © 2005 Cisco Systems, Inc. All rights reserved. Backbone Area 0 Area 120 Area Border Router Totally Stubby Area: the ABR Forwards Only a Summary Default Use This for Stable, Scalable Internetworks Access Config: router ospf 100 area 120 stub no-summary network 10.120.0.0 0.0.255.255 area 120 Distribution Config router ospf 100 area 120 stub no-summary summary-address 10.120.0.0 255.255.0.0 network 10.120.0.0 0.0.255.255 area 120 network 10.122.0.0 0.0.255.255 area 0 Minimizes the Number of LSAs and the Need for Any External Area SPF Calculations

91 91 © 2005 Cisco Systems, Inc. All rights reserved. Backbone Area 0 Area 120 Area Border Router ABRs Forward Summary 10.120.0.0/16 Summarization Distribution to Core Reduces SPF and LSA Load in Area 0 Access Config: router ospf 100 area 120 stub no-summary network 10.120.0.0 0.0.255.255 area 120 Distribution Config router ospf 100 area 120 stub no-summary summary-address 10.120.0.0 255.255.0.0 network 10.120.0.0 0.0.255.255 area 120 network 10.122.0.0 0.0.255.255 area 0 Minimizes the Number of LSAs and the Need for Any SPF Recalculations at the Core

92 92 © 2005 Cisco Systems, Inc. All rights reserved. OSPF Default Route to Totally Stubby Area Totally stubby areas are used to isolate the access layer switches from route calculations due to events in other areas This means that the ABR (the distribution switch) will send a default route to the access layer switch when the neighbor relationship is established The default route is sent regardless of the distribution switch’s ability to forward traffic on to the core (area 0) Traffic could be black holed until connectivity to the core is established A B Traffic Dropped Until Connectivity to Core Established Default Route Note: A solution to this anomaly is being investigated.

93 93 © 2005 Cisco Systems, Inc. All rights reserved. OSPF Timer Tuning High-Speed Campus Convergence OSPF by design has a number of throttling mechanisms to prevent the network from thrashing during periods of instability Campus environments are candidates to utilize OSPF timer enhancements Sub-second hellos Generic IP (interface) dampening mechanism Back-off algorithm for LSA generation Exponential SPF backoff Configurable packet pacing Reduce Hello Interval Reduce LSA and SPF Interval

94 94 © 2005 Cisco Systems, Inc. All rights reserved. Subsecond Hellos Neighbor Loss Detection—Physical Link Up OSPF hello/dead timers detect neighbor loss in the absence of physical link loss Useful in environments where an L2 device separates L3 devices (Layer 2 core designs) Aggressive timers are needed to quickly detect neighbor failure Interface dampening is recommended if sub-second hello timers are implemented Traffic Dropped Until Neighbor Loss Detection Occurs OSPF Processing Failure (Link Up) A B Access Config: interface GigabitEthernet1/1 dampening ip ospf dead-interval minimal hello-multiplier 4 router ospf 100 area 120 stub no-summary timers throttle spf 10 100 5000 timers throttle lsa all 10 100 5000 timers lsa arrival 80

95 95 © 2005 Cisco Systems, Inc. All rights reserved. Access Config: interface GigabitEthernet1/1 ip ospf dead-interval min hello-multi 4 router ospf 100 area 120 stub no-summary timers throttle spf 10 100 5000 timers throttle lsa all 10 100 5000 timers lsa arrival 80 OSPF LSA Throttling By default, there is a 500 ms delay before generating router and network LSAs; the wait is used to collect changes during a convergence event and minimize the number of LSAs sent Propagation of a new instance of an LSA is rate-limited at the originator: timers throttle lsa all Acceptance of new LSAs is rate-limited by the receiver: timers lsa arrival Traffic Dropped Until LSA Generated and Processed A B

96 96 © 2005 Cisco Systems, Inc. All rights reserved. OSPF SPF Throttling OSPF has an SPF throttling timer designed to dampen route recalculation (preserving CPU resources) when a link bounces 12.2S OSPF enhancements let us tune this timer in milliseconds; prior to 12.2S, one second was the minimum After a failure, the router waits for the SPF timer to expire before recalculating a new route Traffic Dropped Until SPF Timer Expires Access Config: interface GigabitEthernet1/1 ip ospf dead-interval min hello-multi 4 router ospf 100 area 120 stub no-summary timers throttle spf 10 100 5000 timers throttle lsa all 10 100 5000 timers lsa arrival 80 A B

97 97 © 2005 Cisco Systems, Inc. All rights reserved. OSPF Routed Access Campus Design Overview—Fast Convergence Detect the event: Decrease the hello-interval and dead-interval to detect soft neighbor failures Enable interface dampening Set carrier-delay = 0 Propagate the event: Summarize routes between areas to limit LSA propagation across the campus Tune LSA timers to minimize LSA propagation delay Process the event: Tune SPF throttles to decrease calculation delays Stub Area 120 Backbone Area 0

98 98 © 2005 Cisco Systems, Inc. All rights reserved. OSPF Routed Access Campus Design Overview—Area Design Use totally stubby areas to minimize routes in access switches Summarize area routes to backbone Area 0 These recommendations reduce the number of LSAs and SPF recalculations throughout the network and provide a more robust and scalable network infrastructure Area Routes Summarized router ospf 100 area 120 stub no-summary summary-address 10.120.0.0 255.255.0.0 network 10.120.0.0 0.0.255.255 area 120 network 10.122.0.0 0.0.255.255 area 0 router ospf 100 area 120 stub no-summary network 10.120.0.0 0.0.255.255 area 120 Configured as Totally Stubby Area

99 99 © 2005 Cisco Systems, Inc. All rights reserved. OSPF Routed Access Campus Design Overview—Timer Tuning In a hierarchical design, the key tuning parameters are SPF throttle and LSA throttle Need to understand other LSA tuning in the non-optimal design Hello and dead timers are secondary failure detection mechanism Reduce Hello Interval Reduce SPF and LSA Interval router ospf 100 area 120 stub no-summary area 120 range 10.120.0.0 255.255.0.0 timers throttle spf 10 100 5000 timers throttle lsa all 10 100 5000 timers lsa arrival 80 network 10.120.0.0 0.0.255.255 area 120 network 10.122.0.0 0.0.255.255 area 0 interface GigabitEthernet5/2 ip address 10.120.100.1 255.255.255.254 dampening ip ospf dead-interval minimal hello-multiplier 4

100 100 © 2005 Cisco Systems, Inc. All rights reserved. Agenda Campus High Availability Design Principles Foundation Services Multi-Layer Design Routed Access Design Summary Architectural Foundation Hierarchical Campus Design Security Mobility Convergence Availability Flexibility

101 101 © 2005 Cisco Systems, Inc. All rights reserved. Campus High Availability Non-Stop Application Delivery Hierarchical, systematic approach System level resiliency for switches and routers Network resiliency with redundant paths Supports integrated services and applications Embedded management

102 102 © 2005 Cisco Systems, Inc. All rights reserved. VLAN 120 Voice 10.1.120.0/24 Multi-Layer Campus Design Reference Design—No VLANs Span Access Layer Tune CEF load balancing Match CatOS/IOS Etherchannel settings and tune load balancing Summarize routes towards core Limit redundant IGP peering STP Root and HSRP primary tuning or GLBP to load balance on uplinks Set trunk mode on/nonegotiate Disable Etherchannel unless needed Set Port Host on access layer ports: Disable Trunking Disable Etherchannel Enable PortFast RootGuard or BPDU-Guard Use security features P-t-P Link Layer 3 VLAN 20 Data 10.1.20.0/24 VLAN 140 Voice 10.1.140.0/24 VLAN 40 Data 10.1.40.0/24 Access Distribution Core

103 103 © 2005 Cisco Systems, Inc. All rights reserved. VLAN 250 WLAN 10.1.250.0/24 Multi-Layer Campus Design Some VLANs Span Access Layer Tune CEF load balancing Match CatOS/IOS Etherchannel settings and tune load balancing Summarize routes towards core Limit redundant IGP peering STP Root and HSRP primary or GLBP and STP port cost tuning to load balance on uplinks Set trunk mode on/nonegotiate Disable Etherchannel unless needed RootGuard on downlinks LoopGuard on uplinks Set Port Host on access Layer ports: Disable trunking Disable Etherchannel Enable PortFast RootGuard or BPDU-Guard Use security features VLAN 120 Voice 10.1.120.0/24 Trunk VLAN 20 Data 10.1.20.0/24 VLAN 140 Voice 10.1.140.0/24 VLAN 40 Data 10.1.40.0/24 Layer 2 Access Distribution Core

104 104 © 2005 Cisco Systems, Inc. All rights reserved. Routed Access Campus Design No VLANs Span Access Layer Use EIGRP or OSPF Use stub routers or stub areas With OSPF, tune LSA and SPF timers Summarize routes towards core Filter routes towards the access Tune CEF load balancing Disable Etherchannel unless needed Set Port Host on access layer ports: Disable Trunking Disable Etherchannel Enable PortFast RootGuard or BPDU-Guard Use security features VLAN 120 Voice 10.1.120.0/24 P-t-P Link Layer 3 VLAN 20 Data 10.1.20.0/24 VLAN 140 Voice 10.1.140.0/24 VLAN 40 Data 10.1.40.0/24 Access Distribution Core

105 105 © 2005 Cisco Systems, Inc. All rights reserved. Cisco Campus Architecture Multiple Design Options Supporting Integrated Services Enterprise Campus The Right Design for Each Customer Design options: Multi-Layer Campus Design, Routed Campus Design, Future Campus Design Options, Intelligent Switching (hybrid of L2 + L3 features) Integrated services: High Availability, IP Communications, WLAN Integration, Integrated Security, IPv6, Virtualization, Future Services

106 106 © 2005 Cisco Systems, Inc. All rights reserved. RST-2031 11207_05_2005_c1

