BGP trends of an AS Looking under the hood and diagnosing the noise. Stephan Millet Network Engineer Telstra
Intro
  Informational presentation
  16-month data collection of the BGP activity of AS1221
    –Sept 2004 to Jan 2006
  Analysis of the BGP behavior within the AS
Background on AS1221
  Only runs IP (no other protocol mixes)
  Topology hasn’t changed in 3 years
  Geographically confined to Australia
    –Only one transit provider
  All Cisco (consistency in the BGP algorithm(?))
  Core acts as route reflector (RR) for all access routers
    –All eBGP is conducted on access
    –No dampening on transit paths
n-1 BGP topology
  [Diagram: Sydney, Melbourne, Brisbane, Adelaide, Canberra, Perth]
  Where all routers have an iBGP relationship with all other routers
Add a collector to the n-1 BGP topology
  [Diagram: Sydney, Melbourne, Brisbane, Adelaide, Canberra, Perth, plus the collector]
Turn on logging and wait 18 months
  Current configuration:
    !
    hostname bgp-logger
    !
    log file /data/bgp-mesh/mesh-log
    log trap informational
    log record-priority
    !
    debug bgp updates
    !
    router bgp 1221

    BGP: rcvd UPDATE w/ attr: nexthop , origin ?, localpref 100, community 1221:500, originator , clusterlist
    BGP: rcvd /29
    BGP: rcvd /32
    BGP: rcvd UPDATE w/ attr: nexthop , origin i, localpref 0
    BGP: rcvd UPDATE about /32 -- withdrawn
    BGP: rcvd UPDATE about /32 -- withdrawn
    BGP: rcvd UPDATE about /32 -- withdrawn
  Every 24hrs process the data and add an extra data point
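The daily processing step is not shown in the slides. A minimal sketch of how such a log could be reduced to one data point per day follows, assuming the line shapes in the log excerpt above. The function name `summarise`, the counter keys, and the optional peer address after "BGP:" are assumptions for illustration, and the sample addresses are documentation ranges, not data from AS1221.

```python
import re
from collections import Counter

# Patterns modelled on the log excerpt. The optional token after "BGP:" is an
# assumption: peer addresses are blank in the excerpt but may exist in real logs.
PEER = r"(?:\S+ )?"
UPDATE_RE = re.compile(rf"BGP: {PEER}rcvd UPDATE w/ attr:")
WITHDRAW_RE = re.compile(rf"BGP: {PEER}rcvd UPDATE about (\S*/\d+) --\s*withdrawn")
ANNOUNCE_RE = re.compile(rf"BGP: {PEER}rcvd (\S*/\d+)\s*$")


def summarise(lines):
    """Count attribute-bearing updates, announced prefixes and withdrawals."""
    totals = Counter()
    per_prefix = Counter()
    for line in lines:
        if UPDATE_RE.search(line):
            # One line per received set of path attributes.
            totals["update_messages"] += 1
        elif (m := WITHDRAW_RE.search(line)):
            totals["withdrawals"] += 1
            per_prefix[m.group(1)] += 1
        elif (m := ANNOUNCE_RE.search(line)):
            totals["announcements"] += 1
            per_prefix[m.group(1)] += 1
    # Assumed definition: a "prefix update" is any announcement or withdrawal.
    totals["prefix_updates"] = totals["announcements"] + totals["withdrawals"]
    return totals, per_prefix


# Hypothetical sample lines (documentation addresses, not real data).
sample = [
    "BGP: 192.0.2.1 rcvd UPDATE w/ attr: nexthop 192.0.2.1, origin ?, localpref 100",
    "BGP: 192.0.2.1 rcvd 198.51.100.0/29",
    "BGP: 192.0.2.1 rcvd 198.51.100.8/32",
    "BGP: 192.0.2.1 rcvd UPDATE about 198.51.100.8/32 -- withdrawn",
    "BGP: rcvd UPDATE about /32 --withdrawn",
]
totals, per_prefix = summarise(sample)
print(dict(totals))
```

Withdrawal-only UPDATE messages carry no attribute line, so this sketch does not count them under `update_messages`. Whether the original analysis did so is not stated in the slides.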
What the initial data shows
Analysis of initial data set, or what happened in 16 months
  30% increase in AS1221’s BGP table
    –176k to 228k prefixes
  A doubling or greater in all other attributes
    –Prefix updates from 600k to 1.2M per day
    –BGP updates from 200k to 550k per day
  A small number of prefixes create a high portion of the noise
    –See ‘noisy 100’ later in presentation
    –~10% - 15% of the prefixes in the BGP table will generate an update on a daily basis
  eBGP prefixes are noisier than iBGP prefixes
    –Though not for much longer
    –In 2004 eBGP to iBGP ratio was 4:1
    –Now eBGP to iBGP ratio is passing 2:1
  The really big spikes
    –Operational work on one or more cores, e.g. IOS upgrade
    –The rebooting of the mesh-logger
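The growth figures on this slide can be checked with a few lines of arithmetic. The sketch below uses only the start and end values quoted above; the helper name `growth` is an assumption for illustration.

```python
def growth(start, end):
    """Return (multiplicative factor, percentage increase) from start to end."""
    factor = end / start
    return factor, (factor - 1) * 100


# Start and end values as quoted on the slide (Sept 2004 to Jan 2006).
figures = {
    "table size (prefixes)": (176_000, 228_000),
    "prefix updates per day": (600_000, 1_200_000),
    "BGP updates per day": (200_000, 550_000),
}

for name, (start, end) in figures.items():
    factor, pct = growth(start, end)
    print(f"{name}: x{factor:.2f} ({pct:+.0f}%)")
```

The table grows by roughly 30%, while prefix updates double and BGP updates grow by a factor of 2.75, which is the sense in which the noise is outpacing the table.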
Looking at the trends.
  Raw daily data views
  Prefix updates are the new black pink
Updates and Prefixes
More going than coming ?
‘table size’ to ‘prefix updates’ ratio
‘table size’ to ‘prefix updates’ ratio continued..
  By Jan 2006 the ratio of daily prefix updates to table size increased from 3:1 to 5:1
    –Today for every prefix in the BGP table expect 5 prefix updates per day
    –Heading towards 10:1 by 2008
  With continued BGP table growth, expect 3.0M prefix updates per day.
    –How much of this is an artifact of the statistical technique (least squares best fit) and how much is a basic BGP artifact ?
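The slides name least squares best fit as the forecasting technique but do not show the fit itself. The following is a minimal sketch of an ordinary least-squares straight line and its extrapolation. The function names are assumptions. The two sample points are the endpoint ratios derived from the figures quoted earlier (600k/176k and 1.2M/228k), since the daily series is not reproduced in the transcript.

```python
def least_squares_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(xs, ys, x_future):
    """Extrapolate the fitted line to x_future."""
    slope, intercept = least_squares_line(xs, ys)
    return slope * x_future + intercept


# Months since Sept 2004 and prefix-updates-to-table-size ratio at each point.
months = [0, 16]
ratios = [600_000 / 176_000, 1_200_000 / 228_000]

# Month 40 corresponds to Jan 2008.
print(f"ratio at month 40: {forecast(months, ratios, 40):.1f}:1")
```

A line through only two endpoints is not expected to reproduce the forecast on the slide, which was presumably fitted to the full daily series. It does illustrate the slide's own caveat: the extrapolated value depends heavily on the fitting technique and the points fed to it.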
Forecast: number of prefix updates
Forecast prefix updates continued...
  Does it really matter ?
    –Hasn’t been a problem to date. Traffic is low, ~1.5kbps
  What about the CPU and Memory on the RP ?
What’s the CPU doing ?
  [Chart labels: Nov 2003, Nov 2004, Mar 2006, Mar 2008, April 2004; GRP-A/256Mb, GRP-B/512Mb, PRP-2/1Gb, CRS-1/4Gb ??]
What’s the memory doing ?
  [Chart annotation: GRP-B to PRP2 upgrade]
Are we OK ?
  Growth, growth and more growth
  2009 onwards may be the end of a PRP2.
    –Will probably run at 100% CPU (1-minute average)
    –What happens when the CPU receives updates faster than it can process them ?
    –ASes flapping due to CPU issues will exacerbate the problem.
Who are the culprits? Who’s been naughty and who’s been nice.
Noisiest 100 origin ASes**
  ** Includes AS1221 as origin
One AS to rule them all
Info on AS9121
  Turk Telekom
  Originates ~160 prefixes
    –Snapshot on Jan
  Varying number of prefixes have an ‘origin’ tag of EGP
    –Using really old software or munging routing policy ?
    –These prefixes seem to oscillate at will
  #show ip route
    Routing entry for /24
      Known via "bgp 1221", distance 200, metric 0
      Tag 4637, type internal
      * , from , 00:00:56 ago
  #show ip bgp
    BGP routing table entry for /24, version
    Paths: (0 available, no best path)
Noisiest 100 prefixes**
  ** Includes AS1221 as origin
One prefix to rule them all
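How the ‘noisiest 100’ lists were produced is not described. One plausible approach, sketched here, is to rank prefixes (or origin ASes) by update count and report what share of all updates the top entries account for. The function name `noisiest` and the sample counts are hypothetical.

```python
from collections import Counter


def noisiest(update_counts, n=100):
    """Return the top-n (key, count) pairs and their share of all updates."""
    counts = Counter(update_counts)
    total = sum(counts.values())
    top = counts.most_common(n)
    share = sum(count for _, count in top) / total if total else 0.0
    return top, share


# Hypothetical per-prefix update counts (documentation prefixes, not real data).
example = {
    "198.51.100.0/24": 6000,
    "203.0.113.0/24": 3000,
    "192.0.2.0/24": 1000,
}
top, share = noisiest(example, n=1)
print(top, f"{share:.0%} of all updates")
```

The same function works for origin ASes if the counts are keyed by AS number instead of prefix.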
What can we do ?
  Not run the DFZ
  Bigger processors, good for those that can afford them.
    –However, come 2009, those that can’t will need alternative steps, or the issue gets worse for everyone.
  Limit updates ?
    –Turn on flap dampening ?
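For readers unfamiliar with flap dampening, the sketch below models the penalty scheme described in RFC 2439: each flap adds a fixed penalty, the penalty decays exponentially with a half-life, and the route is suppressed while the penalty stays above a threshold. The class name is hypothetical, and the parameter values (1000 per flap, 15-minute half-life, suppress at 2000, reuse at 750) are assumed here as the commonly cited Cisco IOS defaults, not values taken from the slides. The maximum suppress time is left out for brevity.

```python
class DampenedRoute:
    """Toy model of per-route flap dampening (times are in seconds)."""

    def __init__(self, half_life=900.0, penalty_per_flap=1000.0,
                 suppress=2000.0, reuse=750.0):
        # Assumed defaults; real deployments tune these per policy.
        self.half_life = half_life
        self.penalty_per_flap = penalty_per_flap
        self.suppress = suppress
        self.reuse = reuse
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        # Exponential decay: the penalty halves every half_life seconds.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False

    def flap(self, now):
        """Record one flap at time `now`."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty > self.suppress:
            self.suppressed = True

    def is_suppressed(self, now):
        self._decay(now)
        return self.suppressed


route = DampenedRoute()
for t in (0, 60, 120):  # three flaps, one minute apart
    route.flap(t)
print(route.is_suppressed(120), route.is_suppressed(120 + 1800))
```

In this model a prefix that flaps three times in two minutes is suppressed and is released only once its penalty has decayed below the reuse threshold, which is how dampening would limit the update load from the noisiest prefixes.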