January 25, 2004 Joint Techs, Honolulu Hawaii Routing WG How I spent last summer: Converting MAX to 2547bis VPNs Converting NGIX/DC atm to gige Dan Magorian.

1 January 25, 2004 Joint Techs, Honolulu Hawaii Routing WG How I spent last summer: Converting MAX to 2547bis VPNs Converting NGIX/DC atm to gige Dan Magorian Director, Engineering and Operations magorian@maxgigapop.net

2 2 Talks about 2 big projects Plus installing 2 new pops in Balt and DC/George Wash U: was busy summer. Technically the conversion of the NGIX/ DC NAP to gige was all L2, not L3 routing. But it did involve interesting use of Cisco RFC 1483 bridged routed atm encap conversion, to single-endedly convert each Fednet peer at the NAP to gige w/o a massive horrible flag day cutover. So I thought folks might be interested.

3 Talk 1: MAX’s 2547 vpn background Heard Ivan Gonzalez of Juniper’s presentation to routing wg July 2002 on Calren 2547 mockup. Examined his configs, looked useful & feasible. MAX in 2002 was offering standard mix of routes from Abilene, DREN, Esnet, vBNS, etc peerings. Everything working, nobody interested in 2547. But pressure from customers mounted to resell Qwest ISP service. Ran tests for 6 months with UMD as ISP and GU as customer, and found routing problems. Mocked up 2547 topology on MAX’s inexpensive lab routers, then deployed.

4 Mid-Atlantic Crossroads Abilene Network Virginia Qwest ISP National/International Peering Networks Regional Network Participants Institutional Participants ATDnet VBNS NGIX-DC DREN NISN NREN ESNet MAX Regional Infrastructure Supernet National Oceanic and Atmospheric Administration National Science Foundation Naval Research Lab & ATDnet Smithsonian Institution Southeastern Universities Research Association University Consortium for Advanced Internet Development University of Maryland University of Maryland, Baltimore County University System of Maryland U.S. Geological Survey The Catholic University of America George Washington University Georgetown University Howard Hughes Medical Institute Gallaudet University NASA/Emerging Technologies Center NASA/Goddard Space Flight Center National Consortium for Supercomputer Applications/ACCESS National Institutes of Health National Library of Medicine Univ Sys Maryland

5 MAX Core Optical Network OADM BALT Qwest OC48 lambda via MD state net OC48c POS lambda via Luxn WDM OC48c POS Abilene M160 Router GigE Switch M160 Router M40e Router ARLG DCGW DCNE CLPK NGIX 6 nets Abilene Northern Virginia Washington D.C. College Park Baltimore

6 So what are RFC 2547 L3 VPNs or, Why should you let MPLS onto your network? Probably everyone’s heard “fish problem” talks & knows about policy constrained routing issues. Don’t want to bore everyone with fish again, or characterizations of atm/frame vs L2 vs L3 vpns. Many people have done that better than I could. This is from operational perspective: what caused MAX to convert to them, what are pros and cons, why regional (not just national) service providers might consider using them.

7 Policy Constrained Routing, Explicit Routing Objectives Solve long standing “fish problem” by use of single router node to create multiple policies or “routing instances” Use more than destination as criteria for routing decision At minimum use Source (VPN membership or L3 Info) + Destination for route decision Technology evolution offers solution RFC 2547bis and MPLS Policy Constrained Routing Review Juniper Networks, Inc. Copyright © 2002 – Proprietary and Confidential

8 PCR/ER Overview REQUIREMENTS: -Send traffic from router “D” across router “F” -Send traffic from router “E” across router “G” -Routers “D” and “E” are connected to “A”

9 PCR/ER – The Challenge BGP Path Selection Process will run. ONE active path will be selected to get to the destination "yellow" network. Routes learned from EBGP Peers GRN and RED. Router “A” learns routes through IBGP. Both customers will use the single best path – Undesirable in this case.

10 PCR/ER – Solution – (Control Plane) 2547bis L3VPN Routes learned from EBGP Peers GRN and RED. Router “A” learns routes through IBGP. BGP Path Selection Process will run. An active path will be selected for each VRF to get to the destination "yellow" network. Each VRF still has access to all routes; these routes can be set with different preference values. Finer granularity can be achieved by using FBF + 2547bis/MPLS on customer facing ports.
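The difference between slides 9 and 10 can be sketched in a few lines of Python. This is only an illustration, not how a router implements BGP: real path selection has many more tie-breakers than local preference, and the VRF names and preference values below are made up for the example.

```python
def best_path(peers, local_pref):
    """Return the peer with the highest local preference (name breaks ties)."""
    return max(peers, key=lambda peer: (local_pref.get(peer, 0), peer))

# Both EBGP peers offer a route to the "yellow" network.
PEERS_FOR_YELLOW = ["GRN", "RED"]

# One shared table: one set of preferences, so one winner for every customer.
shared_pref = {"GRN": 100, "RED": 80}
shared_choice = best_path(PEERS_FOR_YELLOW, shared_pref)

# One table per VRF (hypothetical names): each sees both routes,
# but sets its own preference, so each can pick a different exit.
vrf_pref = {
    "VRF-D": {"GRN": 100, "RED": 80},
    "VRF-E": {"GRN": 80, "RED": 100},
}
vrf_choice = {vrf: best_path(PEERS_FOR_YELLOW, pref) for vrf, pref in vrf_pref.items()}

print(shared_choice)
print(vrf_choice)
```

With a single table both customers end up on the same exit; with per-VRF tables the customers behind "D" and "E" can each be steered to a different upstream.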

11 That’s fine, but why should you care? 2547 is widely used by many national service providers to create overlay networks to different customers. Many run ip on edge, mpls in core. Should mention Cisco routers might have similar VRF capability. Don’t know, can’t speak to that. Yet gigapops don’t usually have overlay networks, and people usually find workarounds to fish problems. But in this case we ran into a show-stopper that caused us to really need to deploy them.

12 What Juniper doesn’t tell you about 2547 They’re called Routing Instances (VRFs), but they AREN’T virtual routers. JunOS has virtual routers in 6.1. 2547 vpns have only one iBGP. What happens is that a Route Distinguisher and route Target bgp communities are added, putting routes into separate tables. IGP remains same. The catch is that interfaces need to be in one VRF or another. They can’t be in multiples. There are tricky workarounds with next-table policies, but they didn’t work well in our situation. Customers don’t know about 2547. So since we ebgp with everyone, we establish second peerings with ISP customers. A nuisance but it works.

13
4.1.0.0/16 (1 entry, 1 announced)
    *BGP    Preference: 170/-81
            Source: 138.18.47.37
            Next hop: 138.18.47.37 via ge-3/2/1.175, selected
            State:
            Local AS: 10886 Peer AS: 668
            Age: 3d 19:11:46
            Task: BGP_668.138.18.47.37+179
            Announcement bits (6): 0-KRT 3-LDP 5-BGP.0.0.0.0+179 6-Resolve inet.0 7-Resolve inet.2
            AS path: 668 1455 I
            Communities: 668:100 10886:5
            Localpref: 80
            Router ID: 138.18.9.55

3.0.0.0/8 (1 entry, 1 announced)
    *BGP    Preference: 170/-81
            Route Distinguisher: 206.196.177.247:1
            Source: 206.196.177.247
            Next hop: via so-2/0/0.0, selected
            Label operation: Push 100048
            Protocol next hop: 206.196.177.247
            Push 100048
            Indirect next hop: eedd1f8 782
            State:
            Local AS: 10886 Peer AS: 10886
            Age: 2d 3:04:00 Metric: 522 Metric2: 1
            Task: BGP_10886.206.196.177.247+179
            Announcement bits (2): 0-KRT 2-BGP.0.0.0.0+179
            AS path: 209 7018 80 I
            Communities: 209:888 209:889 10886:8 target:209:1
            VPN Label: 100048
            Localpref: 80
            Router ID: 206.196.177.247
            Primary Routing Table bgp.l3vpn.0

14 So what was the big problem? We mark routes from upstreams with bgp communities, and use them to subset which routes are advertised to downstream customers. Eg, everyone gets Abilene, Dren, Esnet, MAX, but only a few get vBNS (yes, we still have). Started out doing same w/ ISP: bgp community to control Qwest route advertisement. Then discovered that we were blackholing traffic from certain non-Qwest customers, eg NLM.

15 Exploring further We found that the main problem was gigapops who advertise unequal prefix length announcements to their ISP and Abilene. If everyone’s were equal would be fine. But we discovered that it’s unfortunately fairly common for GPs to aggregate towards Abilene and not aggregate or aggregate less towards their ISPs, very undesirable. So when NLM saw a route and sent traffic to MAX, a more specific Qwest route not advertised to them could take precedence, and yet they hadn’t subscribed to Qwest ISP. So that traffic would be dropped. After some attempts to get sources to fix it we gave up.
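The blackholing described on this slide can be reproduced with a toy longest-prefix-match model. The sketch below is an illustration under assumptions: the prefixes are documentation addresses rather than real announcements, and the "drop when the customer is not entitled to the winning upstream" rule simply restates the slide's outcome rather than any specific filter MAX ran.

```python
import ipaddress

def longest_match(table, dst):
    """Return the (prefix, upstream) entry with the longest prefix covering dst."""
    addr = ipaddress.ip_address(dst)
    matches = [entry for entry in table if addr in entry[0]]
    return max(matches, key=lambda entry: entry[0].prefixlen) if matches else None

# Hypothetical remote gigapop: aggregates toward Abilene (/24),
# announces a more specific prefix toward its ISP (/25).
MAX_TABLE = [
    (ipaddress.ip_network("198.51.100.0/24"), "ABILENE"),
    (ipaddress.ip_network("198.51.100.128/25"), "QWEST"),
]

def visible_routes(table, subscriptions):
    """Routes advertised to a customer, subsetted by upstream community."""
    return [entry for entry in table if entry[1] in subscriptions]

def forward_shared(table, subscriptions, dst):
    """Single shared table: customer sends on what it sees, MAX forwards on everything."""
    if longest_match(visible_routes(table, subscriptions), dst) is None:
        return "no route at customer"
    prefix, upstream = longest_match(table, dst)
    if upstream not in subscriptions:
        return f"dropped: {prefix} via {upstream} wins at MAX"
    return f"forwarded via {upstream}"

def forward_vrf(table, subscriptions, dst):
    """Per-VRF table: MAX forwards using only the routes this customer subscribed to."""
    hit = longest_match(visible_routes(table, subscriptions), dst)
    return "no route" if hit is None else f"forwarded via {hit[1]}"

nlm = {"ABILENE"}  # subscribed to Abilene only, not the Qwest ISP service
print(forward_shared(MAX_TABLE, nlm, "198.51.100.200"))
print(forward_vrf(MAX_TABLE, nlm, "198.51.100.200"))
```

In the shared-table case the customer sends traffic because it sees the Abilene aggregate, yet MAX's own lookup lands on the more specific Qwest route. Keeping the Qwest routes in a separate table removes them from that customer's lookup entirely.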

16 So we realized that Most gigapops have only one service offering, and give everyone a mix of I2 and ISP routes. The problem would only get worse once we had customers not subscribed to Abilene (now do). There wasn’t an easy way out that could “get back” routes not preferred by bgp when subsetted by bgp communities for advertisement. We had seen same issue in minor way with vBNS, but NGIX Abilene/vBNS peering solved it. Lots of folks run 2547, and Juniper supports well.

17 OK, so how hard was it to deploy? Went quite smoothly. We had Ivan’s configs that had worked with Calren mockup. Ben Eater of Juniper gave lots of help, but we mocked up ourselves in our lab. Main issue was what to do about the dual peering issue. Since we bgp with everyone except 2 downtown offices, decided to simplify to make everyone same. One went away and Ucaid/dc converting soon. Lots would be a problem. Basically, we turned on new VRF, ran for awhile, no issues, then turned on Qwest peering & new custs. Did cheat and keep existing inet.0: Juniper recommends doing everything as VRFs. Better but nasty cutover.

18

19 There’s got to be some downsides Two virtual circuit requirement is biggest. We couldn’t have done it if we had lots of statics to small customers. Ethernet easy, yet carriers like Yipes don’t support trunking and multiple vlans. Sonet & DSXs need frame relay encap for dlcis. Also have to get used to tracing & pinging within other routing tables, & oddities about source ints. And what MPLS is doing can be fuzzy with core and edge routers same, like nat net in 4 boxes. Plus your bgp becomes unusual, eg Arbor mons.

20 Sample JunOS showing VRF (partial)
routing-instances {
    Q {
        instance-type vrf;
        /* UMD-ISP */
        interface ge-2/2/0.1;
        /* GU-ISP */
        interface at-1/0/0.4;
        /* HHMI-ISP */
        interface ge-2/2/1.6;
        route-distinguisher 206.196.177.246:1;
        vrf-import QWEST-IMPORT;
        vrf-export QWEST-EXPORT;
        routing-options {
            rib Q.inet.0 {
                aggregate {
                    route 206.196.176.0/21 {
                        community 10886:1;
                        passive

21
protocols {
    bgp {
        group ISP {
            type external;
            export [ AGGREGATE-ROUTES NO-MAX-SPECIFICS ];
            neighbor 206.196.177.50 {
                description UMD-ISP;
                import from-UMD;
                export [ DEFAULT MAX QWEST REJECT ];
                peer-as 27;
            }
            neighbor 206.196.177.154 {
                description HHMI-ISP;
                import from-HHMI;
                export [ NO-DEFAULT QWEST REJECT ];
                peer-as 16692;
            }
community QWEST members 10886:8;

22
policy-statement QWEST-IMPORT {
    term 10 {
        from community QWEST-VRF;
        then accept;
    }
    term 20 {
        then reject;
    }
}
policy-statement QWEST-EXPORT {
    term 10 {
        then {
            community add QWEST-VRF;
            accept;
        }
    }
}
community ABILENE members 10886:3;
community DREN members 10886:5;
community ESNET members 10886:6;
community MAX members 10886:1;
community NISN members 10886:4;
community NREN members 10886:9;
community QWEST members 10886:8;
community QWEST-CUST members 209:209;
community QWEST-VRF members target:209:1;
community VBNS members 10886:2;
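What the QWEST-EXPORT and QWEST-IMPORT policies do to the route tables can be mimicked in Python. This is a rough model only: routes are plain dictionaries, the function names are invented for the sketch, and the two sample routes reuse the prefixes and communities shown on slide 13.

```python
QWEST_VRF_TARGET = "target:209:1"  # the QWEST-VRF community from the slide

def vrf_export(route, target):
    """Like QWEST-EXPORT: return a copy of the route with the target community added."""
    return {**route, "communities": route["communities"] | {target}}

def vrf_import(routes, target):
    """Like QWEST-IMPORT: accept routes carrying the target community, reject the rest."""
    return [route for route in routes if target in route["communities"]]

# A route learned in the main table (no target) and one learned inside the Q VRF.
dren_route = {"prefix": "4.1.0.0/16", "communities": {"668:100", "10886:5"}}
qwest_route = {"prefix": "3.0.0.0/8", "communities": {"209:888", "209:889", "10886:8"}}

# The PE where the Qwest peering lives tags the route on export over iBGP...
advertised = [dren_route, vrf_export(qwest_route, QWEST_VRF_TARGET)]

# ...and every other PE pulls only the tagged routes into its copy of the VRF.
q_table = vrf_import(advertised, QWEST_VRF_TARGET)
print([route["prefix"] for route in q_table])
```

Only the route tagged with the target community lands in the VRF table, which is how the Qwest routes stay out of the tables used by customers who did not buy that service.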

23 So do we recommend it? Sure. Wasn’t that hard, no real war stories, didn’t cost anything except time to scope it out. Solves problem nicely, caused by our being late into ISP resale, and reselling each separately. Can have as many overlay nets as we want for different service offerings (mostly different ISPs). Might also be handy if don’t want to mix low-cost ISP routes (eg Cogent) w/higher cost (eg Qwest). Might be trickier for GPs with lots of small custs.

24 Thanks! Questions?

25 Talk 2: Converting NGIX/DC to gige MAX has run the Fednet peer point for Abilene, vBNS, DREN, NISN, NREN, USGS for 4 years. Previously called NGIX/East coast, is meet-me point (NAP) for major Federal R&E nets. Topology was full L2 mesh of p-t-p atm pvcs. No transit peerings, no common route pool. Doesn’t scale to larger peer points, but kept life simple. Then atm began to go out of style, and customer nets began to push for frame-based (gige) peering.

26 Took us awhile to get it done We contemplated several architectures, including putting all nets on Junipers directly and using transitional cross-connects (tccs) that allow bolting ethernets to atms to sonets. Ended up deciding to use conventional switch, because of high cost of 10G router interfaces ($110k vs $30k for switched), also no tcc IPv6. Did side-by-side comparison of Cisco 6500 vs Extreme Black Diamond. Liked Extremes, but no oc12 atm, and winning feature was Cisco’s OSM-2OC12-atm 1483 bridged-routing blade

27 RFC 1483: Multiprotocol encap over AAL5 One of the many reasons in addition to lane why people run screaming from atm (actually, if you only do pvcs, atm is easy and reliable AND can do 9K and even 64k mtus). I’m no 1483 expert. There are many variations involving routed and bridged pdus. Basically, bridged is used by SPs who need to connect ethernets via an atm cloud. What Marconi’s gige int does: useless for us. Juniper does minimal routed subset w/1 ip addr good only for dslams. Others as well, complex.

28 Bridged-routed connect: the magic There are several variants of routed pdus across different platforms, mostly also useless to us. But on 6500 OSM blades Cisco has developed a 1483 bridged/routed encap that does just what we wanted: allow ethernets to connect to atms. Uses proxy arp and other ugly stuff internally. Best part was that we only needed it for the transition, so we borrowed the $60K blade for 6 months & then returned it when all atm gone.

29 So why are we discussing 1483 anyway? Converting a national NAP from atm to gige isn’t like changing over your campus net at midnight. Replacement lines have to be arranged, peer partners have to coordinate ints, downtime has to be minimal because they’re paying for uptime. The biggest issue was how to do the transition without massive flag-day changeover nightmare. With testing, realized that the Cisco OSM blade bridged-routing would allow us to single-endedly change over each peer at midnight without any of their full-mesh of peers needing to be there.

30 How did we do it? We hooked up both OSM OC12 atms to the Marconi atm switch (would have been a problem if more traffic than would fit, but there wasn’t). All peers were assigned new vlans matched to their old pvcs. The 6500 bridged routed encap allowed us to make ip connections across. In some sense was subset of TCCs, but cheaper. When each peer had their new line and gige int ready with new vlans, we hot-cut their line both ends to gige, moving the /30 w/o peers knowing!

31 Sample bre-connect 6500 IOS (partial)
interface ATM4/1
 description marconi 1D1
 mtu 9186
 no ip address
 atm bridge-enable
 switchport trunk allowed vlan 161,170,178
 switchport mode trunk
interface ATM4/1.161 point-to-point
 pvc USGS-Abilene 161/32
  bre-connect 161
interface GigabitEthernet3/5
 description USGS
 mtu 9216
 no ip address
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 158,161,162,178
 switchport mode trunk
interface Vlan161
 description ABIL-USGS ACTIVE gige
 no ip address
 mtu 9216

32 How did it go? Had to be some war story Of course there was. Overall had to be VERY careful at midnight cutting pvcs. Testing initially didn’t work, took conf with Cisco product manager to resolve issues. Then fine. Had incredible lab juryrig of atm switch & routers in multiple bldgs. NASA’s NISN was first, easy because had spare fiber to GSFC, didn’t have to hotcut. Then Abilene was easy, because MAX sells them lambda transport to NGIX, just had to change wdm. USGS, NLM, and MAX also easy hotcuts. But MCI’s DREN and vBNS were nightmare.

33 The war story Caveat: MCI did VERY good job, not their fault. MCI is contractor for both original vBNS and in 2002 brought up new DREN. MCI had Verizon OC12 from their pop to MAX for years, I suggested both share it. With summer gige conversion, were unable to get replacement gige line to their pop. Decided keep OC12 atm and I installed their colo M10 to run TCCs. Juniper and Andrea Reitzel tested in lab, worked. At midnight cut, vBNS TCCs failed, had to back out 5am. Eventually had to loan MCI MAX on-site test routers. Scott Robohn of Juniper found, needed old gige firmware. After another 4am-er got working, but tccs still touchy.

34 So now that’s done… All peers off atm, the OSM blade went back. Now the config is totally boring, pure L2 on one 16-port gige blade, runs beautifully. Will be upgrading Abilene’s peering to 10G next week, already tested Luxn 10G transponders. NREN joined NGIX, using GSFC wdm transport. Subsequently bought lab 6503 to test future IOSes and other changes. Upcoming release will have L2 support for netflow, now just int stats.

35 Thanks! Questions? Dan Magorian magorian@maxgigapop.net