Connecting to the IP Multicast World

1 Connecting to the IP Multicast World
David Mitchell NCNE, NLANR, NCAR, and the Pittsburgh Supercomputing Center

2 Agenda
What Is Multicast
Multicast Protocols
Multicast Configurations
Troubleshooting
Futures
Useful Links

3 What Is Multicast
Unicast is one to one: traffic is sent from a single source to a single destination.
Multicast is one to many: traffic is sent from a single source to a “group” address.

4 Unicast vs. Multicast
[Figure: two panels, "Unicast" and "Multicast", each showing a server and a router]
Unicast transmission sends multiple copies of data, one copy for each receiver.
  Ex: host transmits 3 copies of data and network forwards each to 3 separate receivers
  Ex: host can only send to one receiver at a time
Multicast transmission sends a single copy of data to multiple receivers.
  Ex: host transmits 1 copy of data and network replicates at last possible hop for each receiver; each packet exists only one time on any given network
  Ex: host can send to multiple receivers simultaneously

5 Why Multicast
More efficient method of sending the same data to many hosts
  Distributing a video stream to multiple viewers
Allows unique distributed protocols
  Clients can announce services or perform queries in a distributed fashion, à la AppleTalk’s “Chooser”

6 Multicast Disadvantages
Multicast Is UDP Based!!!
Best Effort Delivery: Drops are to be expected. Multicast applications should not expect reliable delivery of data and should be designed accordingly. Reliable multicast is still an area of active research; expect to see more developments in this area.
No Congestion Avoidance: Lack of TCP windowing and “slow-start” mechanisms can result in network congestion. If possible, multicast applications should attempt to detect and avoid congestion conditions.
Duplicates: Some multicast protocol mechanisms (e.g. Asserts, Registers and SPT Transitions) result in the occasional generation of duplicate packets. Multicast applications should be designed to expect occasional duplicate packets.
Out of Order Delivery: Some protocol mechanisms may also result in out-of-order delivery of packets.
Multicast Disadvantages
Most multicast applications are UDP based. This results in some undesirable side effects when compared to similar unicast TCP applications.
Best-effort delivery results in occasional packet drops. Many multicast applications that operate in real time (e.g. video, audio) can be impacted by these losses. Also, requesting retransmission of the lost data at the application layer is not feasible in this sort of real-time application.
  Heavy drops in voice applications result in jerky, missed speech patterns that can make the content unintelligible when the drop rate gets high enough.
  Moderate to heavy drops in video are sometimes better tolerated by the human eye and appear as unusual “artifacts” in the picture. However, some compression algorithms can be severely impacted by even low drop rates, causing the picture to become jerky or freeze for several seconds while the decompression algorithm recovers.
The lack of congestion control can result in overall network degradation as the popularity of UDP-based multicast applications grows.
Duplicate packets can occasionally be generated as multicast network topologies change. Applications should expect occasional duplicate packets to arrive and should be designed accordingly.

7 Agenda
What Is Multicast
Multicast Protocols
Multicast Configurations
Troubleshooting
Futures
Useful Links

8 Multicast Protocols Multicast Addressing Basics IGMP PIM MSDP MBGP

9 Multicast Addressing
IP Multicast Group Addresses
  Class “D” address space
  High-order bits of 1st octet = “1110”
Reserved Link-local Addresses
  Transmitted with TTL = 1
  Examples:
    224.0.0.1   All systems on this subnet
    224.0.0.2   All routers on this subnet
    224.0.0.4   DVMRP routers
    224.0.0.5   OSPF routers
    224.0.0.13  PIMv2 routers
IANA Reserved Addresses
IANA is the responsible authority for the assignment of reserved Class D addresses. Other interesting reserved addresses are:
  224.0.0.2   PIMv1 (ALL-ROUTERS, due to transport in IGMPv1)
  224.0.0.5   OSPF ALL ROUTERS (RFC1583)
  224.0.0.6   OSPF DESIGNATED ROUTERS (RFC1583)
  224.0.0.9   RIP2 Routers
  224.0.0.13  PIMv2
  224.0.1.39  CISCO-RP-ANNOUNCE (Auto-RP)
  224.0.1.40  CISCO-RP-DISCOVERY (Auto-RP)
“ftp://ftp.isi.edu/in-notes/iana/assignments/multicast-addresses” is the authoritative source for reserved multicast addresses.
Additional Information
“Administratively Scoped IP Multicast”, June 1997, has a good discussion of scoped addresses. This document is available as “draft-ietf-mboned-admin-ip-space-03.txt”.
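The “1110” rule above is easy to check in code. The following is a minimal sketch using only Python's standard `ipaddress` module; the function names are illustrative, and the 224.0.0.0/24 block is the standard IANA link-local multicast range, stated here as an assumption because the slide text itself does not spell the range out.

```python
import ipaddress


def is_class_d(addr: str) -> bool:
    """Return True if the top four bits of the first octet are 1110."""
    first_octet = int(ipaddress.IPv4Address(addr)) >> 24
    return (first_octet >> 4) == 0b1110


def is_link_local_multicast(addr: str) -> bool:
    """Return True for 224.0.0.0/24 (assumed reserved link-local block)."""
    return ipaddress.IPv4Address(addr) in ipaddress.IPv4Network("224.0.0.0/24")


if __name__ == "__main__":
    for candidate in ("224.0.0.13", "239.255.255.255", "240.0.0.1"):
        print(candidate, is_class_d(candidate), is_link_local_multicast(candidate))
```

Addresses in the link-local block are sent with TTL = 1, so a router should never forward them off the local subnet.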

10 Multicast Addressing
Administratively Scoped Addresses
  Private address space
    Similar to RFC1918 unicast addresses
    Not used for global Internet traffic
    Used to limit “scope” of multicast traffic
    Same addresses may be in use at different locations for different multicast sessions
  Examples
    Site-local scope: /16
    Organization-local scope: /14

11 IP Multicast MAC Address Mapping
Multicast Addressing: IP Multicast MAC Address Mapping (FDDI and Ethernet)
[Figure: a 32-bit IP multicast address (the 1110 prefix plus 28 group bits) mapped into a 48-bit MAC address (25-bit fixed prefix plus 23 bits copied from the IP address); 5 bits are lost. Example MAC address: 01-00-5e-7f-00-01]
Ethernet & FDDI Multicast Addresses
The low-order bit (0x01) in the first octet indicates that this packet is a Layer 2 multicast packet. Furthermore, the “0x01005e” prefix has been reserved for use in mapping L3 IPmc addresses into L2 MAC addresses. When mapping L3 to L2 addresses, the low-order 23 bits of the L3 IPmc address are mapped into the low-order 23 bits of the IEEE MAC address.
L2/L3 Multicast Address Overlap
An IPmc address has 28 bits of unique address space (32 minus the first 4 bits containing the 1110 Class D prefix), but only 23 bits are copied into the IEEE MAC address, so 28 - 23 = 5 bits are lost. Since 2**5 = 32, there is a 32:1 overlap of L3 addresses to L2 addresses. Beware: several L3 addresses can map to the same L2 multicast address! For example, all of the following IPmc addresses map to the same L2 multicast address, 01-00-5e-0a-00-01: 224.10.0.1, 225.10.0.1, ..., 239.10.0.1 and 224.138.0.1, 225.138.0.1, ..., 239.138.0.1.
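The mapping and the 32:1 overlap can be worked through in a few lines. This is a sketch of the rule as described above (fixed 0x01005e prefix, low-order 23 bits copied from the IP address); the function names are illustrative and not taken from any library.

```python
import ipaddress


def multicast_mac(group: str) -> str:
    """Map an IPv4 multicast group to its Ethernet/FDDI MAC address."""
    ip = int(ipaddress.IPv4Address(group))
    if ip >> 28 != 0b1110:
        raise ValueError(f"{group} is not a Class D address")
    low23 = ip & 0x7FFFFF                 # only 23 bits survive the mapping
    mac = (0x01005E << 24) | low23        # 25-bit fixed prefix + 23 group bits
    return "-".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


def overlapping_groups(group: str) -> list[str]:
    """List all 32 groups that share a MAC address with the given group."""
    low23 = int(ipaddress.IPv4Address(group)) & 0x7FFFFF
    return [
        str(ipaddress.IPv4Address((0b1110 << 28) | (lost_bits << 23) | low23))
        for lost_bits in range(32)        # the 5 bits that are not copied
    ]


if __name__ == "__main__":
    print(multicast_mac("224.10.0.1"))
    print(overlapping_groups("224.10.0.1"))
```

Because a host's NIC filter sees only the MAC address, a host joined to one of these groups may receive frames for the other 31 and must discard them at the IP layer.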

12 Multicast Addressing
IP Multicast MAC Address Mapping (FDDI & Ethernet)
Be Aware of the 32:1 Address Overlap
  32 IP multicast addresses map to 1 multicast MAC address (FDDI and Ethernet, prefix 0x0100.5E)

13 Multicast Addressing
Dynamic Group Address Assignment
  Historically accomplished using the SDR application
    Sessions/groups announced over well-known multicast groups
    Address collisions detected and resolved at session creation time
    Has problems scaling
  Future dynamic techniques under consideration
    Multicast Address Set-Claim (MASC)
      Hierarchical, dynamic address allocation scheme
      Extremely complex garbage-collection problem
      Long way off
    MADCAP
      Similar to DHCP
      Needs application and host stack support

14 Multicast Addressing
Static Group Address Assignment (new)
  Temporary method to meet immediate needs
  Group range: 233.0.0.0 - 233.255.255.255
    Your AS number is inserted in the middle two octets
    Remaining low-order octet used for group assignment
  Defined in IETF draft “draft-ietf-mboned-glop-addressing-00.txt”
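As an illustration of the static assignment scheme, the sketch below derives the block of groups an AS would own. It assumes the leading octet is 233 (the range the GLOP addressing proposal uses) and a 16-bit AS number; the function name and the example AS number 5662 are illustrative only.

```python
def glop_prefix(asn: int) -> str:
    """Return the /24 multicast block for a 16-bit AS number.

    Assumption: first octet is 233; the AS number fills the middle two
    octets and the low-order octet is left for group assignment.
    """
    if not 0 <= asn <= 0xFFFF:
        raise ValueError("this scheme assumes a 16-bit AS number")
    high_octet = asn >> 8      # upper 8 bits of the AS number
    low_octet = asn & 0xFF     # lower 8 bits of the AS number
    return f"233.{high_octet}.{low_octet}.0/24"


if __name__ == "__main__":
    print(glop_prefix(5662))
```

Each AS therefore gets 256 group addresses without any coordination beyond its existing AS number assignment.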

15 Multicast Protocols Multicast Addressing Basics IGMP PIM MSDP MBGP

16 Host-Router Signaling: IGMP
How hosts tell routers about group membership
Routers solicit group membership from directly connected hosts
RFC 1112 specifies version 1 of IGMP
  Supported on Windows 95
RFC 2236 specifies version 2 of IGMP
  Supported on latest service pack for Windows and most UNIX systems
IGMP version 3 is specified in IETF draft draft-ietf-idmr-igmp-v3-03.txt
IGMP
The primary purpose of IGMP is to permit hosts to communicate their desire to receive multicast traffic to the IP multicast router(s) on the local network. This, in turn, permits the IP multicast router(s) to “Join” the specified multicast group and to begin forwarding the multicast traffic onto the network segment.
The initial specification for IGMP (v1) was documented in RFC 1112, “Host Extensions for IP Multicasting”. Since that time, many problems and limitations with IGMPv1 have been discovered. This has led to the development of the IGMPv2 specification, which was ratified in November 1997 as RFC 2236. Even before IGMPv2 had been ratified, work on the next generation of the IGMP protocol, IGMPv3, had already begun. However, the IGMPv3 specification is still in the working stage and has not been implemented by any vendors.

17 Host-Router Signaling: IGMP
Joining a Group
Host sends IGMP Report to join group
[Figure: hosts H1, H2 and H3 on one subnet; H3 sends a Report]
Asynchronous Joins
Members joining a group do not have to wait for a query to join; they send an unsolicited report indicating their interest. This reduces join latency for the end system joining if no other members are present.

18 Host-Router Signaling: IGMP
Maintaining a Group
  Router sends periodic Queries to 224.0.0.1
  One member per group per subnet reports
  Other members suppress reports
[Figure: the router sends a Query; H2 sends a Report; the reports from H1 and H3 are suppressed]
Query-Response Process
The router multicasts periodic IGMPv1 Membership Queries to the “All-Hosts” (224.0.0.1) group address. Only one member per group responds with a report to a query. This saves bandwidth on the subnet and processing by the hosts. This process is called “Response Suppression”. (See section below.)
Response Suppression Mechanism
The “Report Suppression” mechanism is accomplished as follows:
  When a host receives the Query, it starts a count-down timer for each multicast group of which it is a member. The count-down timers are each initialized to a random count within a given time range. (In IGMPv1 this was a fixed range of 10 seconds. Therefore the count-down timers were randomly set to some value between 0 and 10 seconds.)
  When a count-down timer reaches zero, the host sends a Membership Report for the group associated with the count-down timer to notify the router that the group is still active.
  However, if a host receives a Membership Report before its associated count-down timer reaches zero, it cancels the count-down timer associated with the multicast group, thereby suppressing its own report.
In the example shown in the slide, H2’s timer expired first, so it responded with its Membership Report. H1 and H3 cancelled their timers associated with the group, thereby suppressing their reports.
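The suppression mechanism can be sketched as a small simulation. This is a simplified model, not an IGMP implementation: it handles a single group, and it assumes every host hears the first Report instantly, so exactly one host reports per query. The function name, host labels and the `rng` parameter are illustrative.

```python
import random


def query_round(hosts, max_delay=10.0, rng=None):
    """Simulate one Query for a single group.

    Each member starts a random count-down timer in [0, max_delay] seconds.
    The host whose timer expires first sends the Report; all others hear it
    and cancel their timers (simplifying assumption: no propagation delay).
    """
    rng = rng or random.Random()
    timers = {host: rng.uniform(0, max_delay) for host in hosts}
    reporter = min(timers, key=timers.get)
    suppressed = [host for host in hosts if host != reporter]
    return reporter, suppressed, timers


if __name__ == "__main__":
    reporter, suppressed, timers = query_round(["H1", "H2", "H3"])
    print("reports:", reporter, "suppressed:", suppressed)
```

Which host reports varies from query to query, since the timers are redrawn each time.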

19 Host-Router Signaling: IGMP
Leaving a Group (IGMPv1)
  Host quietly leaves group
  Router sends 3 General Queries (60 secs apart)
  No IGMP Report for the group is received
  Group times out (worst-case delay ~= 3 minutes)
[Figure: H3 leaves the group (#1); the router sends a General Query (#2) to H1, H2 and H3]

20 Host-Router Signaling: IGMP
Leaving a Group (IGMPv2)
  Host sends Leave message to 224.0.0.2
  Router sends Group-Specific Query to the group address
  No IGMP Report is received within ~3 seconds
  Group times out
[Figure: H3 sends a Leave (#1); the router sends a Group-Specific Query (#2) to H1, H2 and H3]

21 IGMPv3
draft-ietf-idmr-igmp-v3-??.txt
  Early implementations now in beta
  Enables hosts to listen only to a specified subset of the hosts sending to the group
IGMPv3 (future)
As IGMPv2 nears ratification, the IDMR has already begun work on IGMPv3. While it is premature to speculate on the details of the enhancements to this protocol, it is known that one of the goals of the IDMR is to specify a mechanism in IGMPv3 that allows hosts to indicate that they wish to receive traffic only from particular sources within a multicast group.

22 IGMPv3
[Figure: two sources (Source = 1.1.1.1 and Source = 2.2.2.2) sending to Group = 224.1.1.1; designated router R3 serves host H1, a member of the group, and receives an IGMPv3 Join for one source and a Leave for the other]
H1 wants to receive from one of the two sources but not from the other.
With IGMPv3, specific sources can be pruned back; here, the source H1 does not want.
IGMPv3 Example (future)
In this example, host “H1” has joined group 224.1.1.1 but only wishes to receive traffic from one of the sources. Using an as yet unspecified IGMPv3 mechanism, the host can inform the designated router, “R3”, that it is only interested in multicast traffic from that source for the group. Router “R3” could then potentially “prune” the unwanted (S,G) traffic source.

23 Multicast Protocols Multicast Addressing Basics IGMP PIM MSDP MBGP

24 PIM Distribution Trees Neighbor Discovery PIM-DM PIM-SM

25 Multicast Distribution Trees
Shortest Path or Source Distribution Tree
Notation: (S, G), where S = Source and G = Group
[Figure: Source 1 and Source 2; routers A, B, C, D, E and F; Receiver 1 and Receiver 2]
Shortest path or source distribution tree is the network path with the fewest hops.
  Ex: shortest path between Source and Receiver 1 is via Routers A and C, and shortest path to Receiver 2 is one additional hop via Router E

27 Multicast Distribution Trees
Shared Distribution Tree
Notation: (*, G), where * = All Sources and G = Group
[Figure: routers A, B, C, D (RP), E and F; the Shared Tree runs from the PIM Rendezvous Point (RP) to Receiver 1 and Receiver 2]
Shared distribution tree is the path derived from a shared point through which sources and receivers must send/receive multicast data.
  Ex: regardless of location/number of receivers, senders register with the Shared Root (Router D) and always send a single copy of multicast data through the Shared Root to registered receivers
  Ex: regardless of location/number of sources, group members always receive forwarded multicast data from the Shared Root (Router D)

28 Multicast Distribution Trees
Shared Distribution Tree
Notation: (*, G), where * = All Sources and G = Group
[Figure: the shared-tree topology of the previous slide (routers A to F, with D as the RP, Receiver 1 and Receiver 2), with Source 1 and Source 2 added and both the Shared Tree and the Source Tree shown]

29 Multicast Distribution Trees
Characteristics of Distribution Trees
  Source or Shortest Path trees
    Use more memory, O(S x G), but you get optimal paths from source to all receivers; minimizes delay
  Shared trees
    Use less memory, O(G), but you may get sub-optimal paths from source to all receivers; may introduce extra delay
Source or Shortest Path Tree Characteristics
Provides optimal path (shortest distance and minimized delay) from source to all receivers, but requires more memory to maintain.
Shared Tree Characteristics
Provides suboptimal path (may not be shortest distance and may introduce extra delay) from source to all receivers, but requires less memory to maintain.
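To make the O(S x G) versus O(G) comparison concrete, the sketch below counts routing-table entries for a hypothetical workload. The group names and source counts are invented for illustration, and the model counts only one entry per (S, G) pair or per group, ignoring per-interface state.

```python
def source_tree_entries(sources_per_group: dict[str, int]) -> int:
    """One (S, G) entry per active source in each group: O(S x G)."""
    return sum(sources_per_group.values())


def shared_tree_entries(sources_per_group: dict[str, int]) -> int:
    """One (*, G) entry per group, however many sources it has: O(G)."""
    return len(sources_per_group)


if __name__ == "__main__":
    # Hypothetical workload: 3 groups with 1, 20 and 200 active sources.
    workload = {"video": 1, "conference": 20, "discovery": 200}
    print("source trees:", source_tree_entries(workload))
    print("shared trees:", shared_tree_entries(workload))
```

The gap widens with many-to-many applications, which is where shared trees save the most state.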

30 PIM Neighbor Discovery
[Figure: PIM Router 1 and PIM Router 2 exchange PIM Hellos; the router with the highest IP address is elected “DR” (Designated Router)]
  PIMv2 Hellos are periodically multicast to the “All-PIM-Routers” (224.0.0.13) group address (default = 30 seconds)
    Note: PIMv1 multicasts PIM Query messages to the “All-Routers” (224.0.0.2) group address
  If the “DR” times out, a new “DR” is elected
  The “DR” is responsible for sending all Joins and Register messages for any receivers or senders on the network
PIM Neighbor Discovery
PIM Hellos are sent periodically to discover the existence of other PIM routers on the network and to elect the Designated Router. For multi-access networks (e.g. Ethernet), the PIM Hello messages are multicast to the “All-PIM-Routers” (224.0.0.13) multicast group address.
Designated Router (DR)
For multi-access networks, a Designated Router (DR) is elected. In PIM Sparse mode networks, the DR is responsible for sending Joins to the RP for members on the multi-access network and for sending Registers to the RP for sources on the multi-access network. For Dense mode, the DR has no meaning. The exception to this is when IGMPv1 is in use. In this case, the DR also functions as the IGMP Querier for the multi-access network.
Designated Router (DR) Election
To elect the DR, each PIM node on a multi-access network examines the received PIM Hello messages from its neighbors and compares the IP address of its interface with the IP addresses of its PIM neighbors. The PIM neighbor with the highest IP address is elected the DR. If no PIM Hellos have been received from the elected DR after some (configurable) period, the DR election mechanism is run again to elect a new DR.
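The election rule (highest IP address wins) can be sketched as follows. The addresses are hypothetical, and the sketch deliberately compares addresses numerically rather than as strings, which is an easy mistake to make.

```python
import ipaddress


def elect_dr(own_address: str, neighbor_addresses: list[str]) -> str:
    """Return the address of the Designated Router on a multi-access network.

    The router compares its own interface address with the addresses seen
    in PIM Hellos from its neighbors; the numerically highest one wins.
    """
    candidates = [own_address, *neighbor_addresses]
    return max(candidates, key=ipaddress.IPv4Address)


if __name__ == "__main__":
    # Hypothetical LAN with three PIM routers.
    print(elect_dr("10.0.0.2", ["10.0.0.9", "10.0.0.10"]))
```

When the elected DR's Hellos stop arriving, the same function can simply be re-run over the remaining neighbors.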

31 PIM Neighbor Discovery
wan-gw8>show ip pim neighbor
PIM Neighbor Table
Neighbor Address  Interface  Uptime  Expires  Mode
FastEthernet w1d :01:24 Dense
FastEthernet w6d :01:01 Dense (DR)
FastEthernet w0d :01:14 Dense
FastEthernet w0d :01:13 Dense
FastEthernet w0d :01:02 Dense
Serial :47:11 00:01:16 Dense
Serial :47:22 00:01:08 Dense
Serial :47:07 00:01:21 Dense
Serial d04h :01:06 Dense
Serial w4d :01:25 Dense
Serial d04h :01:20 Dense
Serial :53:25 00:01:03 Dense
“show ip pim neighbor” command output
  Neighbor Address - the IP address of the PIM Neighbor.
  Interface - the interface where the PIM Query of this neighbor was received.
  Uptime - the period of time that this PIM Neighbor has been active.
  Expires - the period of time after which this PIM Neighbor will no longer be considered active. (Reset by the receipt of another PIM Query.)
  Mode - PIM mode (Sparse, Dense, Sparse/Dense) that the PIM Neighbor is using.
  “(DR)” - indicates that this PIM Neighbor is the Designated Router for the network.

32 PIM-DM: Evaluation
Most effective for small pilot networks
Advantages:
  Easy to configure: two commands
  Simple flood and prune mechanism
Potential issues:
  Inefficient flood and prune behavior
  Complex Assert mechanism
  Mixed control and data planes
    Results in (S, G) state in every router in the network
    Can result in non-deterministic topological behaviors
  No support for shared trees
Evaluation: PIM Dense-mode
Appropriate for a large number of densely distributed receivers located in close proximity to the source.
Advantages
  Minimal number of commands required for configuration (two)
  Simple mechanism for reaching all possible receivers and eliminating distribution to uninterested receivers
  Simple behavior is easier to understand and therefore easier to debug
  Interoperates with DVMRP
Potential issues
  Necessity to flood frequently because prunes expire after 3 minutes

33 PIM-SM Overview
Explicit join model
  Receivers join to the Rendezvous Point (RP)
  Senders register with the RP
  Data flows down the shared tree and goes only to places that need the data from the sources
  Last-hop routers can join the source tree if the data rate warrants by sending joins to the source
RPF check depends on tree type
  For shared trees, uses RP address
  For source trees, uses Source address
Explicit “Join” Model
Unlike PIM Dense mode, PIM Sparse mode uses the explicit join model whereby Receivers send PIM Join messages to a designated “Rendezvous Point” (RP). (The RP is the root of a shared distribution tree down which all multicast traffic flows.)
In order to get multicast traffic to the RP for distribution down the shared tree, Senders send Register messages to the RP. Register messages cause the RP to send a “Join” towards the source so that multicast traffic can flow to the RP and hence down the shared tree.
Last-hop routers may be configured with an “SPT-Threshold” which, once exceeded, will cause the last-hop router to join the “Shortest Path Tree” (SPT). This causes the multicast traffic from the source to flow down the SPT from the source to the last-hop router.
RPF Check depends on tree type
If traffic is flowing down the shared tree, the RPF check mechanism will use the IP address of the RP to perform the RPF check. If traffic is flowing down the SPT, the RPF check mechanism will use the IP address of the Source to perform the RPF check.

34 PIM-SM Overview
Only one RP is chosen for a particular group
RP statically configured or dynamically learned (Auto-RP, PIM v2 candidate RP advertisements)
Data forwarded based on the source state (S, G) if it exists, otherwise use the shared state (*, G)
RFC 2117: “PIM Sparse Mode Protocol Spec”
Only one RP for a group may be active at a time
While it is usually the case that a single RP serves all groups, it is possible to configure different RPs for different ranges of groups. This is accomplished via access-lists. This permits the RPs to be placed in different locations and can improve the traffic flow for a group if its RP is placed close to the source(s).
RP Configuration
RPs may be configured statically on each router in your network (although they must all agree or your network will be broken!). However, a better solution is to use the Auto-RP or PIMv2 mechanisms to configure RPs.
Data Forwarding
Multicast traffic forwarding in a PIM Sparse mode network is first attempted using any matching (S,G) entries in the Multicast Routing table. If no matching (S,G) state exists, then the traffic is forwarded using the matching (*,G) entry in the Multicast Routing table.
PIM Sparse Mode Spec
PIM Sparse mode is now defined in RFC 2117.

35 PIM-SM Shared Tree Join
(*, G) State created only along the Shared Tree.
[Figure: a (*, G) Join travels from the Receiver’s last-hop router up to the RP, building the Shared Tree]
PIM-SM Shared Tree Joins
In this example, an active receiver (attached to the leaf router at the bottom of the drawing) has joined multicast group “G”. The leaf router knows the IP address of the Rendezvous Point (RP) for group G and sends a (*,G) Join for this group towards the RP. This (*, G) Join travels hop-by-hop to the RP, building a branch of the Shared Tree that extends from the RP to the last-hop router directly connected to the receiver. At this point, group “G” traffic can flow down the Shared Tree to the receiver.

36 PIM-SM Sender Registration
(S, G) State created only along the Source Tree.
[Figure: the Source’s first-hop router sends an (S, G) Register (unicast) to the RP; the RP sends an (S, G) Join toward the Source; traffic flows down the Shared Tree to the Receiver]
PIM-SM Sender Registration
As soon as an active source for group G sends a packet, the leaf router attached to this source is responsible for “Registering” the source with the RP and requesting the RP to build a tree back to that router. The source router encapsulates the multicast data from the source in a special PIM-SM message called the Register message and unicasts that data to the RP.
When the RP receives the Register message it does two things:
  It de-encapsulates the multicast data packet inside the Register message and forwards it down the Shared Tree.
  It sends an (S,G) Join back towards the source network S to create a branch of an (S, G) Shortest-Path Tree. This results in (S, G) state being created in all the routers along the SPT, including the RP.

37 PIM-SM Sender Registration
(S, G) traffic begins arriving at the RP via the Source Tree.
RP sends a Register-Stop back to the first-hop router to stop the Register process.
[Figure: (S, G) Register (unicast) from the first-hop router to the RP; (S, G) Register-Stop (unicast) from the RP back to the first-hop router]
PIM-SM Sender Registration (cont.)
As soon as the SPT is built from the source router to the RP, multicast traffic begins to flow natively from source S to the RP. Once the RP begins receiving data natively (i.e. down the SPT) from source S, it sends a “Register-Stop” to the source’s first-hop router to inform it that it can stop sending the unicast Register messages.

38 PIM-SM Sender Registration
Source traffic flows natively along the SPT to the RP.
From the RP, traffic flows down the Shared Tree to Receivers.
[Figure: traffic flow from the Source along the Source Tree to the RP, then down the Shared Tree to the Receiver]
PIM-SM Sender Registration (cont.)
At this point, multicast traffic from the source is flowing down the SPT to the RP and from there down the Shared Tree to the receiver.

39 PIM-SM SPT Switchover
Last-hop router joins the Source Tree.
Additional (S, G) State is created along the new part of the Source Tree.
[Figure: the last-hop router sends an (S, G) Join toward the Source]
PIM-SM Shortest-Path Tree Switchover
PIM-SM has the capability for last-hop routers (i.e. routers with directly connected members) to switch to the Shortest-Path Tree and bypass the RP if the traffic rate is above a set threshold called the “SPT-Threshold”.
The default value of the “SPT-Threshold” in Cisco routers is zero. This means that the default behaviour for PIM-SM leaf routers attached to active receivers is to join the SPT to the source immediately, as soon as the first packet arrives via the (*,G) shared tree.
In the above example, the last-hop router (at the bottom of the drawing) sends an (S, G) Join message toward the source to join the SPT and bypass the RP. This (S, G) Join message travels hop-by-hop to the first-hop router (i.e. the router connected directly to the source), thereby creating another branch of the SPT. This also creates (S, G) state in all the routers along this branch of the SPT.
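The switchover decision described above reduces to a threshold comparison. The sketch below is a simplified model under stated assumptions: the rate is measured in kbps (an assumed unit), and the function and parameter names are illustrative rather than router configuration syntax.

```python
def should_join_spt(observed_rate_kbps: float, spt_threshold_kbps: float = 0.0) -> bool:
    """Decide whether a last-hop router should switch to the source tree.

    With the default threshold of zero, any traffic at all (that is, the
    first packet arriving down the shared tree) triggers the switch.
    """
    return observed_rate_kbps > spt_threshold_kbps


if __name__ == "__main__":
    print(should_join_spt(0.1))          # default threshold: switch at once
    print(should_join_spt(50.0, 100.0))  # below a 100 kbps threshold: stay
```

Raising the threshold keeps low-rate sources on the shared tree, which trades path optimality for less (S, G) state.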

40 PIM-SM SPT Switchover
Traffic begins flowing down the new branch of the Source Tree.
Additional (S, G) State is created along the Shared Tree to prune off (S, G) traffic.
[Figure: an (S, G)RP-bit Prune is sent up the Shared Tree toward the RP]
PIM-SM Shortest-Path Tree Switchover
Once the branch of the Shortest-Path Tree has been built, (S, G) traffic begins flowing to the receiver via this new branch. Next, special (S, G)RP-bit Prune messages are sent up the Shared Tree to prune off the redundant (S,G) traffic that is still flowing down the Shared Tree. If this were not done, (S, G) traffic would continue flowing down the Shared Tree, resulting in duplicate (S, G) packets arriving at the receiver.

41 PIM-SM SPT Switchover
(S, G) traffic flow is now pruned off of the Shared Tree and is flowing to the Receiver via the Source Tree.
[Figure: traffic flow from the Source along the Source Tree to the Receiver; the Shared Tree from the RP is pruned]
PIM-SM Shortest-Path Tree Switchover
As the (S, G)RP-bit Prune message travels up the Shared Tree, special (S, G)RP-bit Prune state is created along the Shared Tree that selectively prevents this traffic from flowing down the Shared Tree. At this point, (S, G) traffic is flowing directly from the first-hop router to the last-hop router and from there to the receiver.

42 PIM-SM SPT Switchover
(S, G) traffic flow is no longer needed by the RP, so it prunes the flow of (S, G) traffic.
[Figure: the RP sends an (S, G) Prune toward the Source]
PIM-SM Shortest-Path Tree Switchover
At this point, the RP no longer needs the flow of (S, G) traffic, since all branches of the Shared Tree (in this case there is only one) have pruned off the flow of (S, G) traffic. As a result, the RP will send (S, G) Prunes back toward the source to shut off the flow of the now unnecessary (S, G) traffic to the RP.
Note: This will occur if and only if the RP has received an (S, G)RP-bit Prune on all interfaces on the Shared Tree.

43 PIM-SM SPT Switchover
(S, G) traffic is now flowing to the Receiver only via a single branch of the Source Tree.
[Figure: traffic flow from the Source along the Source Tree to the Receiver; no traffic flows to the RP]
PIM-SM Shortest-Path Tree Switchover
As a result of the SPT-Switchover, (S, G) traffic is now flowing only from the first-hop router to the last-hop router and from there to the receiver. Notice that traffic is no longer flowing to the RP.
This SPT-Switchover mechanism shows that PIM-SM also supports the construction and use of SPT (S,G) trees, but in a much more economical fashion than PIM-DM in terms of forwarding state.

44 PIM-SM Protocol Mechanics
PIM SM State PIM SM Forwarding PIM SM Joining PIM SM Registering PIM SM SPT-Switchover PIM SM Pruning

45 PIM-SM State Example
sj-mbone> show ip mroute
IP Multicast Routing Table
Flags: D - Dense, S - Sparse, C - Connected, L - Local, P - Pruned
       R - RP-bit set, F - Register flag, T - SPT-bit set, J - Join SPT
       M - MSDP created entry, X - Proxy Join Timer Running
       A - Advertised via MSDP
Timers: Uptime/Expires
Interface state: Interface, Next-Hop or VCD, State/Mode
(*, ), 00:13:28/00:02:59, RP , flags: SCJ
  Incoming interface: Ethernet0, RPF nbr ,
  Outgoing interface list:
    Ethernet1, Forward/Sparse, 00:13:28/00:02:32
    Serial0, Forward/Sparse, 00:4:52/00:02:08
( /32, ), 00:01:43/00:02:59, flags: CJT
  Incoming interface: Serial0, RPF nbr
  Outgoing interface list:
    Ethernet1, Forward/Sparse, 00:01:43/00:02:11
    Ethernet0, forward/Sparse, 00:01:43/00:02:11

46 PIM-SM (*,G) State Rules
(*,G) creation
  Upon receipt of a (*,G) Join or
  Automatically if (S,G) must be created
(*,G) reflects default group forwarding
  IIF = RPF interface toward RP
  OIL = interfaces
    that received a (*,G) Join or
    with directly connected hosts or
    manually configured
(*,G) deletion
  When OIL = NULL and no child (S,G) state exists

47 PIM-SM (S,G) State Rules
(S,G) creation
  By receipt of (S,G) Join or Prune or
  By “Register” process
  Parent (*,G) created (if it doesn’t exist)
(S,G) reflects forwarding of “S” to “G”
  IIF = RPF interface, normally toward source
    RPF toward RP if “RP-bit” set
  OIL = Initially, copy of (*,G) OIL minus IIF
(S,G) deletion
  By normal (S,G) entry timeout

48 PIM-SM OIL Rules
Interfaces in OIL added
  By receipt of Join message
    Interfaces added to (*,G) are added to all (S,G)s
Interfaces in OIL removed
  By receipt of Prune message
    Interfaces removed from (*,G) are removed from all (S,G)s
  Interface Expire timer counts down to zero
    Timer reset (to 3 min.) by receipt of periodic Join or
    By IGMP membership report
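The add, remove and expire rules for an outgoing interface list can be sketched as a small class. This is a simplified model of a single entry's OIL: it does not propagate (*,G) changes into child (S,G) entries, time is passed in explicitly as seconds, and the class and method names are illustrative.

```python
EXPIRE_SECONDS = 180  # interface Expire timer is reset to 3 minutes


class OutgoingInterfaceList:
    """Track which interfaces are in an OIL and when each one expires."""

    def __init__(self):
        self.expires_at = {}  # interface name -> absolute expiry time (s)

    def join(self, interface, now):
        """A Join (or IGMP membership report) adds or refreshes an interface."""
        self.expires_at[interface] = now + EXPIRE_SECONDS

    def prune(self, interface):
        """A Prune removes the interface immediately."""
        self.expires_at.pop(interface, None)

    def expire(self, now):
        """Drop every interface whose Expire timer has counted down to zero."""
        for interface in [i for i, t in self.expires_at.items() if t <= now]:
            del self.expires_at[interface]

    def interfaces(self):
        return sorted(self.expires_at)
```

A short usage example: an interface joined at t = 0 and never refreshed disappears at t = 180, while one that receives periodic Joins stays in the list.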

49 PIM-SM State Flags
S = Sparse Mode
C = Directly Connected Host
L = Local (Router is member)
P = Pruned (All interfaces in OIL = Prune)
T = Forwarding via SPT
  Indicates at least one packet was forwarded

50 PIM-SM State Flags (Cont.)
J = Join SPT
  In (*, G) entry
    Indicates SPT-Threshold is being exceeded
    Next (S,G) received will trigger join of SPT
  In (S, G) entry
    Indicates SPT joined due to SPT-Threshold
    If rate < SPT-Threshold, switch back to Shared Tree
F = Register
  In (S,G) entry
    Indicates the router is a first-hop router and there is a directly connected source, or proxy registers are being sent
  In (*,G) entry
    Set when “F” set in at least one child (S,G)

51 PIM-SM State Flags (Cont.)
R = RP bit
  (S, G) entries only
  Set by (S,G)RP-bit Prune/Join
  Indicates info is applicable to Shared Tree
  Used to prune (S,G) traffic from Shared Tree
    Initiated by last-hop router after switch to SPT
  Modifies (S,G) forwarding behavior
    IIF = RPF toward RP (i.e. up the Shared Tree)
    OIL = Pruned accordingly
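When reading a `flags:` field such as `SCJ` or `FPT`, a small lookup helper can spell the letters out. The sketch below covers only the flags described on the three State Flags slides; the function name and the wording of each description are this example's own.

```python
FLAG_MEANINGS = {
    "S": "Sparse mode",
    "C": "Directly connected host",
    "L": "Local (router is a member)",
    "P": "Pruned",
    "T": "Forwarding via SPT",
    "J": "Join SPT",
    "F": "Register",
    "R": "RP-bit set",
}


def decode_flags(flags):
    """Expand a flags string from an mroute entry into readable descriptions."""
    return [FLAG_MEANINGS.get(ch, "unknown flag " + repr(ch))
            for ch in flags.strip()]


print(decode_flags("SCJ"))
```

Keep in mind that J and F mean different things in a (*,G) entry than in an (S,G) entry, so the decoded text still has to be read in the context of the entry type.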

52 PIM-SM Protocol Mechanics
PIM SM State PIM SM Forwarding PIM SM Joining PIM SM Registering PIM SM SPT-Switchover PIM SM Pruning

53 PIM-SM Forwarding Rules
Use longest match entry
  Use (S, G) entry if exists
  Otherwise, use (*, G) entry
RPF check first
  If packet didn’t arrive via IIF, drop it
Forward packet (if RPF succeeded)
  Send out all “unpruned” interfaces in OIL
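The three forwarding rules can be read as one short function. This is a sketch under stated assumptions: entries are plain dictionaries with an `iif` name and an `oil` mapping of interface to `"Forward"` or `"Prune"`, and the addresses are illustrative values, not taken from the slides.

```python
def select_entry(source, group, s_g, star_g):
    """Longest match: prefer the (S,G) entry, fall back to (*,G)."""
    entry = s_g.get((source, group))
    return entry if entry is not None else star_g.get(group)


def forward(source, group, arrival_intf, s_g, star_g):
    """Return the interfaces a packet is copied to (empty list = dropped)."""
    entry = select_entry(source, group, s_g, star_g)
    if entry is None:
        return []
    # RPF check: the packet must arrive on the entry's incoming interface.
    if arrival_intf != entry["iif"]:
        return []
    # Send out every interface in the OIL that is not pruned.
    return sorted(i for i, state in entry["oil"].items() if state == "Forward")


star_g = {"239.1.1.1": {"iif": "Ethernet0",
                        "oil": {"Ethernet1": "Forward", "Serial0": "Forward"}}}
s_g = {("192.0.2.10", "239.1.1.1"): {"iif": "Serial0",
                                     "oil": {"Ethernet1": "Forward",
                                             "Ethernet0": "Prune"}}}

print(forward("192.0.2.10", "239.1.1.1", "Serial0", s_g, star_g))
print(forward("192.0.2.10", "239.1.1.1", "Ethernet0", s_g, star_g))
```

The second call returns an empty list: once an (S,G) entry exists, a packet from that source arriving on the Shared Tree interface fails the RPF check against the (S,G) entry and is dropped.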

54 PIM SM Forwarding rtr-a
[Diagram: “rtr-a” (interfaces S0, S1, E0, E1) with a link toward the RP and “Rcvr A”; Source Tree (SPT) and Shared Tree multicast packets are shown.]

Packets are “forwarded” out all interfaces in the “oilist”. PIM Sparse mode interfaces are placed on the “oilist” for a multicast group if:
  A PIM neighbor joins the group on this interface
  A host on this interface has joined the group
  The interface has been manually configured to join the group

PIM SM Forwarding: In PIM Sparse mode, incoming multicast packets for group “G” are forwarded out all interfaces that are in the Multicast Route table entry’s “outgoing interface list” (oilist). Once again, an attempt is made to forward using a matching (S,G) entry if one exists. In that case, the “oilist” of the (S,G) entry is used to control forwarding of the multicast packet. If no (S, G) entry exists, then the “oilist” of the (*,G) entry is used to control the forwarding of the multicast packet.

Outgoing Interface Lists: Interfaces are placed in the (*, G) “oilist” if and only if:
  The router has received a PIM “Join” for this Group from another PIM Neighbor via this interface, or
  A host on this interface has joined the Group (via IGMP), or
  The router’s interface has been manually configured to join the group.

55 PIM-SM Protocol Mechanics
PIM SM State PIM SM Forwarding PIM SM Joining PIM SM Registering PIM SM SPT-Switchover PIM SM Pruning

56 PIM SM Joining
Leaf routers send a (*,G) Join toward the RP
  Joins sent hop-by-hop via unicast path toward RP
Each router along path creates (*,G) state
  IF no (*,G) state, create it & send a Join toward RP
  ELSE Join process complete. Reached the (*,G) tree.

Leaf (last-hop) routers join the Shared Tree (RPT): When a last-hop router wishes to begin receiving multicast traffic for group “G”, it sends a PIM (*,G) Join message to its up-stream PIM Neighbor in the direction of the RP. While the Join is multicast to the “All-Routers” ( ) multicast address, the up-stream PIM Neighbor’s IP address is indicated in the body of the PIM Join Message. This permits all PIM routers on a Multi-Access network to be aware of the Join, but only the indicated up-stream PIM Neighbor will act on the Join.

Routers up the Shared Tree (RPT) create (*,G) state: When a PIM router receives a (*,G) Join for group “G” from one of its down-stream PIM Neighbors, it checks whether any (*, G) state exists for group “G” in its Multicast Routing table. If (*, G) state for group “G” already exists, then the interface from which the Join was received is placed on the (*,G) oilist. If no (*, G) state for group “G” exists, a (*, G) entry is created, the interface from which the Join was received is placed in the (*, G) oilist, and a (*, G) Join is sent towards the RP.

The end result of the above mechanism is to create (*, G) state all the way from the last-hop router to the RP so that group “G” multicast traffic will flow down the Shared Tree (RPT) to the last-hop router.
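The IF/ELSE rule for Join propagation can be sketched as a walk up the path toward the RP that stops at the first router already on the tree. The function name, the data layout, the group address 239.1.1.1, and the interface names on the RP are all assumptions made for this illustration.

```python
def join_shared_tree(hops, group, state):
    """Propagate a (*,G) Join hop by hop toward the RP.

    hops:  ordered list of (router, interface_the_join_arrives_on),
           starting at the last-hop router and ending at the RP.
    state: router -> {group -> set of OIL interfaces}; updated in place.
    Returns the routers that had to create new (*,G) state.
    """
    created = []
    for router, intf in hops:
        groups = state.setdefault(router, {})
        if group in groups:
            # Already on the Shared Tree: add the interface and stop.
            groups[group].add(intf)
            break
        # No (*,G) state: create it, then the Join continues upstream.
        groups[group] = {intf}
        created.append(router)
    return created


state = {"rp": {"239.1.1.1": {"Serial0"}}}
hops = [("rtr-b", "Ethernet1"), ("rtr-a", "Ethernet0"), ("rp", "Serial1")]
print(join_shared_tree(hops, "239.1.1.1", state))
```

A second receiver joining behind “rtr-b” would stop at “rtr-b” itself, because that router already holds (*,G) state and no further Join needs to travel upstream.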

57 PIM SM Joining rtr-a rtr-b
To RP ( ) S0 rtr-a E0 Shared Tree E0 E1 rtr-b IGMP Join 1 Rcvr A PIM SM Joining Example 1) Receiver “A” wishes to receive group “G” multicast traffic and therefore sends an IGMP Host Membership message (sometimes loosely referred to as an IGMP Join) which is received by “rtr-b”. “rtr-b” has no existing (*, G) state for group “G” and therefore creates an entry. (See next slide.) “Rcvr A” wishes to receive group G traffic. Sends IGMP Join for G. 1

58 “rtr-b” creates (*, 224.1.1.1) state
PIM SM Joining S1 To RP ( ) S0 rtr-a E0 Shared Tree E0 E1 rtr-b (*, ), 00:00:05/00:02:54, RP , flags: SC Incoming interface: Ethernet0, RPF nbr Outgoing interface list: Ethernet1, Forward/Sparse, 00:00:05/00:02:54 “rtr-b” creates (*, ) state Rcvr A State in “rtr-b” after Joining (*, ) (*, ) indicates the (*, G) entry. 00:00:05/00:02:54 indicates that the entry has existed for 5 seconds and will expire in 2 minutes and 54 seconds. RP is the IP Address of the Rendezvous Point for Group flags: SC indicates that this is a Sparse mode group (S) and that there is a member of this group directly connected (C) to the router. Incoming interface: Ethernet0, RPF nbr indicates the Incoming interface (up the Shared Tree toward RP) and the RPF neighbor’s IP address (in the direction of the RP) is Outgoing interface list: lists the interfaces that are in the outgoing interface list for this entry. Ethernet1, Forward/Sparse, 00:00:05/00:02:54 indicates Ethernet 1 is in the oilist; it’s in the “Forward” state; Sparse mode and that it has been in the list for 5 seconds and will expire in 2 minutes and 54 seconds if no further (*, G) Join or IGMP Report is received on this interface. Ethernet1, Forward/Sparse, 00:00:05/00:02:54
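The `Uptime/Expires` pair explained above (for example `00:00:05/00:02:54`) converts to seconds with a few lines of code. This helper is only a sketch: it handles the `hh:mm:ss` form that appears in this transcript and makes no attempt to cover any other timer format a router might print.

```python
def parse_timers(field):
    """Split an 'uptime/expires' field into two second counts."""
    def to_seconds(hms):
        hours, minutes, seconds = (int(part) for part in hms.split(":"))
        return hours * 3600 + minutes * 60 + seconds

    uptime, expires = field.strip().split("/")
    return to_seconds(uptime), to_seconds(expires)


uptime, expires = parse_timers("00:00:05/00:02:54")
print(uptime, expires)  # 5 174
```

Here the entry has existed for 5 seconds and has 174 seconds (2 minutes 54 seconds) left before it expires.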

59 PIM SM Joining rtr-a rtr-b
To RP ( ) S0 rtr-a E0 PIM Join 2 Shared Tree E0 E1 rtr-b Rcvr A PIM SM Joining Example 2) Because the OIL of the (*, G) transitioned from Null to non-Null (when “rtr-b” added Ethernet 1 to the OIL of the newly created entry), a PIM (*, G) Join is sent to rtr-b’s up-stream PIM neighbor (rtr-a) in the direction of the RP. When “rtr-a” receives the (*, G) Join it creates (*, G) state. (See next slide.) “Rcvr A” wishes to receive group G traffic. Sends IGMP Join for G. 1 “rtr-b” sends (*,G) Join towards RP. 2

60 “rtr-a” creates (*, 224.1.1.1) state.
PIM SM Joining S1 To RP ( ) (*, ), 00:00:05/00:02:54, RP , flags: S Incoming interface: Serial0, RPF nbr Outgoing interface list: Ethernet0, Forward/Sparse, 00:00:05/00:02:54 “rtr-a” creates (*, ) state. S0 rtr-a E0 Shared Tree E0 E1 rtr-b Rcvr A State in “rtr-a” after Joining (*, ) (*, ) indicates the (*, G) entry. 00:00:05/00:02:54 indicates that the entry has existed for 5 seconds and will expire in 2 minutes and 54 seconds. RP is the IP Address of the Rendezvous Point for Group flags: S indicates that this is a Sparse mode group (S). Incoming interface: Serial0, RPF nbr indicates the Incoming interface (up the Shared Tree toward RP) and the RPF neighbor’s IP address (in the direction of the RP) is Outgoing interface list: lists the interfaces that are in the outgoing interface list for this entry. Ethernet0, Forward/Sparse, 00:00:05/00:02:54 indicates Ethernet 0 is in the oilist; it’s in the “Forward” state; Sparse mode and that it has been in the list for 5 seconds and will expire in 2 minutes and 54 seconds if no further (*, G) Join or IGMP Report is received on this interface. Ethernet0, Forward/Sparse, 00:00:05/00:02:54

61 PIM SM Joining rtr-a rtr-b
Shared Tree 4 S1 To RP ( ) PIM Join 3 S0 rtr-a E0 Shared Tree E0 E1 rtr-b Rcvr A PIM SM Joining Example 3) Because the OIL of the (*, G) transitioned from Null to non-Null (when “rtr-a” added Ethernet 0 to the OIL of the newly created entry), a PIM (*, G) Join is sent to rtr-a’s up-stream PIM neighbor in the direction of the RP. When the upstream router receives the (*, G) Join it too creates (*, G) state and creates a branch of the Shared Tree. 4) This process continues all the way back to the RP (or until a router is reached that is already on the Shared Tree and therefore already has a (*, G) entry.) “Rcvr A” wishes to receive group G traffic. Sends IGMP Join for G. 1 “rtr-b” sends (*,G) Join towards RP. 2 “rtr-a” sends (*,G) Join towards RP. 3 Shared tree is built all the way back to the RP. 4

62 PIM-SM Protocol Mechanics
PIM SM State PIM SM Forwarding PIM SM Joining PIM SM Registering PIM SM SPT-Switchover PIM SM Pruning

63 PIM SM Registering Senders begin sourcing Multicast Traffic
Senders don’t necessarily perform IGMP group joins.
1st-hop router unicasts “Registers” to RP
  A Mcast packet is encapsulated in each Register msg
  Register messages follow unicast path to RP
RP receives “Register” messages
  De-encapsulates the Mcast packet inside Register msg
  Forwards Mcast packet down Shared Tree
  Sends (S,G) Join toward Source / 1st-Hop router to build an (S,G) SPT between Source and RP

Senders are not necessarily Receivers and vice versa: It is not a requirement that all sources be receivers. In the case of a source-only host, it is permissible for the host to simply begin sending multicast traffic without ever joining the group via IGMP.

1st-hop router sends “Registers” to the RP: In PIM Sparse mode, when a 1st-hop router receives the first multicast packet from directly connected source “S” for group “G”, it creates (S, G) state and sets the “F” bit in the (S, G) entry to indicate that it is a directly connected “Source” and also sets the “Registering” flag to indicate that it’s in the process of “Registering”. Next, the 1st-hop router encapsulates the original multicast packet in a PIM Register message and unicasts it to the RP. (Any subsequent multicast packets received from directly connected source “S” for group “G” are also encapsulated in a Register message and unicast to the RP. This continues until an (S, G) “Register-Stop” message is received from the RP.)

RP receives “Register” messages: When the RP receives a “Register” message it de-encapsulates the message. If this packet is to a Group for which the RP has (*, G) state, the RP will:
  Forward the original packet out all interfaces in the (*, G) entry’s “oilist”.
  If it hasn’t already done so, create (S, G) state and send an (S, G) Join back towards the Source in order to join the Shortest-Path Tree (SPT) to Source “S”.

64 PIM SM Registering 1st-hop router receives (S,G) Join
  SPT between Source and RP now built.
  Begins forwarding traffic down (S,G) SPT to RP
  (S,G) Traffic temporarily flowing down 2 paths to RP
RP receives traffic down native (S,G) SPT
  Sends a “Register-Stop” msg to Source / 1st-Hop router.
1st-Hop router receives “Register-Stop” msg
  Stops encapsulating traffic in “Register” messages
  (S,G) Traffic now flowing down single SPT to RP

1st-hop router receives (S, G) Join: When the 1st-hop router receives the (S, G) Join (sent hop-by-hop from the RP), it processes it normally by adding the interface from which the Join was received to the “oilist” of the existing (S, G) entry. (This entry was originally created when the 1st-hop router received the first multicast packet from directly connected Source “S”.) This completes the building of the Shortest-Path Tree (SPT) from the Source to the RP. The 1st-hop router now begins forwarding Source “S” multicast traffic down the newly built SPT to the RP. Note: (S, G) traffic temporarily flows to the RP via two methods: via Register messages (until a Register-Stop message is received) and via the “native” SPT.

RP begins receiving traffic down the (S, G) SPT: As soon as the RP begins receiving (S, G) traffic “natively” (i.e. not encapsulated in Register messages) down the SPT, the RP sets the “T” bit in the (S, G) entry to denote that traffic is successfully flowing down the SPT. From then on, when an (S, G) Register message is received by the RP, it sees that the “T” bit is set in the (S, G) entry and responds by sending a PIM (S, G) Register-Stop message to the 1st-hop router. This notifies the 1st-hop router that traffic is now being received “natively” down the SPT.

1st-hop router receives “Register-Stop” message: When the (S, G) Register-Stop message is received by the 1st-hop router, it clears the “Registering” flag in the (S, G) entry and stops encapsulating (S,G) traffic in Register messages. Traffic is now flowing only down the SPT to the RP.
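The Register handshake described above involves two small pieces of state: the “Registering” flag on the 1st-hop router and the “T” bit on the RP. The sketch below models only those two, using hypothetical class names and string labels for actions in place of real PIM messages.

```python
class FirstHopSG:
    """(S,G) entry on the 1st-hop router for a directly connected source."""

    def __init__(self):
        self.registering = True   # set when the entry is created
        self.oil = set()

    def on_source_packet(self):
        actions = []
        if self.registering:
            actions.append("register-to-rp")    # encapsulate and unicast
        if self.oil:
            actions.append("forward-native")    # send down the SPT
        return actions

    def on_sg_join(self, intf):
        self.oil.add(intf)

    def on_register_stop(self):
        self.registering = False


class RpSG:
    """(S,G) entry on the RP."""

    def __init__(self):
        self.t_bit = False

    def on_native_packet(self):
        self.t_bit = True         # traffic is arriving down the SPT

    def on_register(self):
        # Once native traffic is flowing, Registers are answered with a stop.
        return "register-stop" if self.t_bit else "forward-and-join"


first_hop, rp = FirstHopSG(), RpSG()
print(first_hop.on_source_packet())   # ['register-to-rp']
first_hop.on_sg_join("Serial0")
print(first_hop.on_source_packet())   # ['register-to-rp', 'forward-native']
```

The second printout is the “two paths” window: until the Register-Stop arrives, each source packet reaches the RP both encapsulated and natively.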

65 PIM SM Register Examples
Receivers Join Group First
Source Registers First

PIM SM Register Examples: Depending on whether there are any existing Receivers for group “G” on the Shared Tree (RPT), the RP handles the Register process a little differently. The following examples consider the Register process for two cases: Receivers join group “G” first; the Source registers first.

66 PIM SM Registering Receiver Joins Group First
RP (*, ), 00:00:03/00:02:56, RP , flags:S Incoming interface: Null, RPF nbr , Outgoing interface list: Serial0, Forward/Sparse, 00:03:14/00:02:59 Serial1, Forward/Sparse, 00:03:14/00:02:59 State in “RP” before any source registers (with receivers on Shared Tree) E0 S0 S3 S0 S1 S0 S1 rtr-c rtr-a rtr-b Shared Tree State in RP before Registering (Rcvr’s on Shared Tree) Pay particular attention to the following in the (*, G) entry: The “Incoming interface:” is NULL and the “RPF nbr” is This indicates that this router is the RP. The “Outgoing interface list:” contains Serial0 and Serial1 which are assumed to be the only two active branches of the Shared Tree (RPT).

67 PIM SM Registering Receiver Joins Group First
RP rtr-b>sh ip mroute No such group State in “rtr-b” before any source registers (with receivers on Shared Tree) E0 S0 S3 S0 S1 S0 S1 rtr-c rtr-a rtr-b Shared Tree State in “rtr-b” before source registers Note that there is no group state information for this Group yet.

68 PIM SM Registering Receiver Joins Group First
RP rtr-a>sh ip mroute No such group. State in “rtr-a” before any source registers (with receivers on Shared Tree) E0 S0 S3 S0 S1 S0 S1 rtr-c rtr-a rtr-b Shared Tree State in 1st-hop router (rtr-a) before source registers Note that there is no group state information for this Group yet.

69 PIM SM Registering Receiver Joins Group First
( , ) Mcast Packets 1 RP Source E0 S0 S3 S0 S1 S0 S1 rtr-c rtr-a rtr-b Shared Tree Receivers Join Group First Example 1) Source “S” begins sending traffic to group “G”. 2) 1st-hop router (“rtr-a”) creates (*, G) and (S, G) state; encapsulates the multicast packets in PIM Register message(s) and unicasts it(them) to the RP. 3) The RP (“rtr-c”) de-encapsulates the packets and sees that the packet is for group “G” for which it already has (*, G) state. It then forwards the packets down the Shared Tree. “Source” begins sending group G traffic. 1

70 PIM SM Registering Receiver Joins Group First
“rtr-a” creates (S, G) state for source (after automatically creating a (*, G) entry)

[Diagram: Source on Ethernet0 of “rtr-a”; “rtr-a”, “rtr-b” and the RP (“rtr-c”) connected in a chain; multicast packets are sent to the RP in Register messages; the Shared Tree hangs off the RP.]

(*, ), 00:00:03/00:02:56, RP , flags: SP
  Incoming interface: Serial0, RPF nbr ,
  Outgoing interface list: Null
( /32, ), 00:00:03/00:02:56, flags: FPT
  Incoming interface: Ethernet0, RPF nbr , Registering
  Outgoing interface list: Null

Step 1: “Source” begins sending group G traffic.
Step 2: “rtr-a” encapsulates packets in Registers; unicasts to RP.

1st-hop router (rtr-a) creates (S, G) state: A (*, G) entry must be created before the (S, G) entry can be created. Note that:
  The RPF information for this entry points up the Shared Tree via Serial0 with the RPF neighbor of (Serial 0 of “rtr-b”.)
  Because in this example no members have joined the group (the sender is only sending), the OIL of the (*, G) entry is Null.
  The “P” flag (Pruned) is set since the OIL is Null.
The (S, G) entry is then created. Pay particular attention to the following:
  The RPF information for this entry points towards the source via Ethernet0. The RPF neighbor is because the source is directly connected.
  The (S, G) OIL receives a copy of the (*, G) OIL. (Which is Null.)
  The “F” flag is set in the (S, G) entry, which indicates that this is a directly connected Source.
  The “Registering” flag is set in the (S, G) entry, which indicates that we are still sending Register messages to the RP for this Source.
2) The 1st-hop router encapsulates the multicast packets in PIM Register message(s) and unicasts them to the RP.

71 PIM SM Registering Receiver Joins Group First
“RP” processes Register; creates (S, G) state

[Diagram: Source on Ethernet0 of “rtr-a”; “rtr-a”, “rtr-b” and the RP (“rtr-c”) connected in a chain; Register messages arrive at the RP; (*, G) multicast traffic flows down the Shared Tree.]

(*, ), 00:09:21/00:02:38, RP , flags: S
  Incoming interface: Null, RPF nbr ,
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:09:21/00:02:38
    Serial1, Forward/Sparse, 00:03:14/00:02:46
( , ), 00:01:15/00:02:46, flags:
  Incoming interface: Serial3, RPF nbr ,
  Outgoing interface list:
    Serial0, Forward/Sparse, 00:00:49/00:02:11
    Serial1, Forward/Sparse, 00:00:49/00:02:11

Step 3: “rtr-c” (RP) de-encapsulates packets; forwards down Shared tree.

The RP creates (S, G) state: As a result of the Register message that was received from “rtr-a”, the RP creates (S, G) state as follows:
  The RPF information is calculated using the source address contained in the multicast packet encapsulated inside of the Register message. This results in an IIF of Serial3 and an RPF neighbor of
  Next, the OIL of the parent (*, G) entry is copied into the OIL of the new (S,G) entry. (An additional check is made to ensure that the IIF does not appear in the OIL. If it does, it is removed to prevent a route loop.)
Now the router is ready to forward the (S, G) packet that was encapsulated in the Register message using the newly created (S, G) state. (Note that traffic is always forwarded using the matching (S, G) entry if one exists. Otherwise, the (*, G) entry is used.) This is accomplished as follows:
  Because this packet was received inside of a Register message, the RPF check is skipped.
  Next, the router forwards a copy of the packet out all interfaces in the (S, G) OIL. In this case a copy is sent out Serial0 and Serial1, which correspond to the two branches of the Shared Tree.
The “T” flag is not yet set in the (S, G) entry. However, when the first (S, G) packet is received natively (via the Incoming interface) and forwarded using this entry, the “T” flag will be set.

72 PIM SM Registering Receiver Joins Group First
[Diagram: Source on Ethernet0 of “rtr-a”; “rtr-a”, “rtr-b” and the RP (“rtr-c”) connected in a chain; Register messages flow to the RP, an (S, G) Join leaves the RP toward the Source, and (*, G) multicast traffic flows down the Shared Tree.]

Step 4: RP sends (S,G) Join toward Source to build SPT.

Receivers Join Group First Example (cont.)
4) Because the RP has existing (*, G) state (i.e. Receivers already waiting on the Shared Tree), it sends an (S, G) Join toward source “S” to build a Shortest-Path Tree (SPT) from source “S” to the RP.

73 PIM SM Registering Receiver Joins Group First
“rtr-b” processes Join, creates (S, G) state (after automatically creating the (*, G) entry)

[Diagram: Source on Ethernet0 of “rtr-a”; “rtr-a”, “rtr-b” and the RP (“rtr-c”) connected in a chain; Register messages flow to the RP, an (S, G) Join leaves “rtr-b” toward the Source, and (*, G) multicast traffic flows down the Shared Tree.]

(*, ), 00:04:28/00:01:32, RP , flags: SP
  Incoming interface: Serial1, RPF nbr ,
  Outgoing interface list: Null
( /32, ), 00:04:28/00:01:32, flags:
  Incoming interface: Serial0, RPF nbr
  Outgoing interface list:
    Serial1, Forward/Sparse, 00:04:28/00:01:32

Step 4: RP sends (S,G) Join toward Source to build SPT.
Step 5: “rtr-b” sends (S,G) Join toward Source to continue building SPT.

“rtr-b” processes the (S, G) Join and creates state: A (*, G) entry must be created before the (S, G) entry can be created. Note that:
  The RPF information for the (*, G) entry points up the Shared Tree via Serial1 with the RPF neighbor of (Serial 3 of the RP.)
  Because in this example no members have joined the group, the OIL of the (*, G) entry is Null.
  The “P” flag (Pruned) is set since the OIL is Null.
The (S, G) entry is then created. Pay particular attention to the following:
  The RPF information for this entry points towards the source via Serial0. The RPF neighbor is (Serial 0 of “rtr-a”.)
  The (S, G) OIL initially receives a copy of the (*, G) OIL. (Which is Null.)
  Interface Serial1 (which is the interface that received the (S, G) Join) is added to the (S, G) OIL.
  The “T” flag is not yet set in the (S, G) entry. However, when the first (S, G) packet is forwarded using this entry, the “T” flag will be set.
5) Because the OIL of the (S, G) transitioned from Null to non-Null (when “rtr-b” added Serial1 to the OIL of the newly created entry), a PIM (S, G) Join is sent to “rtr-a” to continue the process of joining the SPT.

74 PIM SM Registering Receiver Joins Group First
( , ) Mcast Packets Register Msgs RP (*, ), 00:04:28/00:01:32, RP , flags: SP Incoming interface: Serial0, RPF nbr , Outgoing interface list: Null ( /32, ), 00:04:28/00:01:32, flags: FT Incoming interface: Ethernet0, RPF nbr , Registering Outgoing interface list: Source E0 S0 S0 S1 S0 S1 rtr-c rtr-a rtr-b (*, ) Mcast Traffic Shared Tree “rtr-a” processes the (S, G) Join Because an (S, G) entry already existed, “rtr-a” simply added the interface on which it received the (S, G) join to the OIL. This results in the following: Serial0 is now listed in the “Outgoing interface list” (OIL) since the RP joined the SPT via this interface. The “P” flag (Pruned) is cleared since the OIL is no longer Null. Serial0, Forward/Sparse, 00:04:28/00:01:32 “rtr-a” processes the (S, G) Join; adds Serial0 to OIL

75 PIM SM Registering Receiver Joins Group First
( , ) Mcast Packets Register Msgs 6 RP Source E0 S0 S0 S1 Register-Stop 7 S0 S1 rtr-c rtr-a rtr-b (*, ) Mcast Traffic Shared Tree RP begins receiving (S,G) traffic down SPT. 6 A branch of the (S,G) SPT has been built to the RP. 6) Now that the SPT has been built from source “S” to the RP, traffic begins flowing down the Shortest-Path Tree (SPT). At this point, the RP is receiving the (S, G) traffic “natively” down the SPT. (This causes the “T” flags to be set in the (S, G) entries along this path including in the RP.) 7) The RP then sends an (S, G) “Register-Stop” to the 1st-hop router to inform it that the encapsulated group “G” Register messages from source “S” are no longer necessary. RP sends “Register-Stop” to “rtr-a”. 7

76 PIM SM Registering Receiver Joins Group First
( , ) Mcast Packets 8 RP (*, ), 00:04:28/00:01:32, RP , flags: SP Incoming interface: Serial0, RPF nbr , Outgoing interface list: Null ( /32, ), 00:04:28/00:01:32, flags: FT Incoming interface: Ethernet0, RPF nbr , Registering Outgoing interface list: Serial0, Forward/Sparse, 00:04:28/00:01:32 Source E0 S0 S3 S0 S1 S0 S1 rtr-c rtr-a rtr-b (*, ) Mcast Traffic Shared Tree “rtr-a” stops sending Register messages When the 1st-hop router (rtr-a) receives the (S, G) Register-Stop message it ceases sending encapsulated Register messages for (S, G) traffic. Notice that the “Registering” flag on the second line of the (S, G) entry is no longer being displayed indicating that “rtr-a” is not sending Registers. This is the final state in “rtr-a” after the Registration process. 8) (S, G) traffic is now only flowing down the Shortest-Path Tree (SPT). “rtr-a” stops sending Register messages (Final State in “rtr-a”) (S,G) Traffic now flowing down a single path (SPT) to RP. 8

77 PIM SM Registering Receiver Joins Group First
( , ) Mcast Packets RP Source (*, ), 00:04:28/00:01:32, RP , flags: SP Incoming interface: Serial1, RPF nbr , Outgoing interface list: Null ( /32, ), 00:04:28/00:01:32, flags: T Incoming interface: Serial0, RPF nbr Outgoing interface list: Serial1, Forward/Sparse, 00:04:28/00:01:32 E0 S0 S0 S1 S0 S1 rtr-c rtr-a rtr-b (*, ) Mcast Traffic Shared Tree Final state in “rtr-b” after the Registration process Pay particular attention to the following in the (S, G) entry: The “T” flag is now set indicating that (S, G) traffic is flowing along this path. The (*, G) entry still has a Null OIL and the “P” flag is still set. This is because there are no members that have joined the Shared Tree. Final state in “rtr-b”

78 PIM SM Registering Receiver Joins Group First
( , ) Mcast Packets RP Final state in the “RP” (with receivers on Shared Tree) (*, ), 00:09:21/00:02:38, RP , flags: S Incoming interface: Null, RPF nbr , Outgoing interface list: Serial0, Forward/Sparse, 00:09:21/00:02:38 Serial1, Forward/Sparse, 00:03:14/00:02:46 ( , , 00:01:15/00:02:46, flags: T Incoming interface: Serial3, RPF nbr , Serial0, Forward/Sparse, 00:00:49/00:02:11 Serial1, Forward/Sparse, 00:00:49/00:02:11 Source E0 S0 S3 S0 S1 S0 S1 rtr-c rtr-a rtr-b (*, ) Mcast Traffic Shared Tree Final state in the RP after the Registration process Pay particular attention to the following in the newly created (S, G) entry: The “T” flag is now set indicating that (S, G) traffic is flowing along this path.

79 PIM SM Register Examples
Receivers Join Group First
Source Registers First

PIM SM Register Examples: Depending on whether there are any existing Receivers for group “G” on the Shared Tree (RPT), the RP handles the Register process a little differently. The following examples consider the Register process for two cases: Receivers join group “G” first; the Source registers first.

80 PIM SM Registering Source Registers First
RP rtr-c>show ip mroute Group not found. State in “RP” before Registering (without receivers on Shared Tree) E0 S0 S0 S1 S3 S0 S1 rtr-c rtr-a rtr-b State in RP before Registering (w/o Rcvr’s on Shared Tree) Notice that no state for group “G” exists since there are no Receivers on the Shared Tree yet.

81 PIM SM Registering Source Registers First
State in “rtr-b” before any source registers (without receivers on Shared Tree)

rtr-b>show ip mroute
Group not found.

[Diagram: “rtr-a”, “rtr-b” and the RP (“rtr-c”) connected in a chain.]

State in “rtr-b” before source registers: Note that there is no group state information for this Group yet.

82 PIM SM Registering Source Registers First
State in “rtr-a” before any source registers (without receivers on Shared Tree)

rtr-a>show ip mroute
Group not found.

[Diagram: “rtr-a”, “rtr-b” and the RP (“rtr-c”) connected in a chain.]

State in 1st-hop router (rtr-a) before source registers: Note that there is no group state information for this Group yet.

83 PIM SM Registering Source Registers First
[Diagram: Source on Ethernet0 of “rtr-a”; “rtr-a”, “rtr-b” and the RP (“rtr-c”) connected in a chain; multicast packets arrive at “rtr-a” from the Source.]

Step 1: “Source” begins sending group G traffic.

Source Registers First Example
1) Source “S” begins sending traffic to group “G”.
2) 1st-hop router (“rtr-a”) creates (*, G) and (S, G) state, encapsulates the multicast packets in PIM Register message(s), and unicasts them to the RP.
3) The RP (“rtr-c”) de-encapsulates the (S, G) packet and creates (*, G) and (S, G) state. Since no one has joined the Shared Tree yet, the OILs of these entries will be NULL. Because the OIL of the (S, G) entry (just created) is NULL, the packet is discarded.

84 PIM SM Registering Source Registers First
“rtr-a” creates (S, G) state for source (after automatically creating a (*, G) entry)

[Diagram: Source on Ethernet0 of “rtr-a”; “rtr-a”, “rtr-b” and the RP (“rtr-c”) connected in a chain; multicast packets are sent to the RP in Register messages.]

(*, ), 00:00:03/00:02:56, RP , flags: SP
  Incoming interface: Serial0, RPF nbr ,
  Outgoing interface list: Null
( /32, ), 00:00:03/00:02:56, flags: FPT
  Incoming interface: Ethernet0, RPF nbr , Registering
  Outgoing interface list: Null

Step 1: “Source” begins sending group G traffic.
Step 2: “rtr-a” encapsulates packets in Registers; unicasts to RP.

1st-hop router (rtr-a) creates (S, G) state: A (*, G) entry must be created before the (S, G) entry can be created. Note that:
  The RPF information for this entry points up the Shared Tree via Serial0 with the RPF neighbor of (Serial 0 of “rtr-b”.)
  Because in this example no members have joined the group (the sender is only sending), the OIL of the (*, G) entry is Null.
  The “P” flag (Pruned) is set since the OIL is Null.
The (S, G) entry is then created. Pay particular attention to the following:
  The RPF information for this entry points towards the source via Ethernet0. The RPF neighbor is because the source is directly connected.
  The (S, G) OIL receives a copy of the (*, G) OIL. (Which is Null.)
  The “F” flag is set in the (S, G) entry, which indicates that this is a directly connected Source.
  The “Registering” flag is set in the (S, G) entry, which indicates that we are still sending Register messages to the RP for this Source.
2) The 1st-hop router encapsulates the multicast packets in PIM Register message(s) and unicasts them to the RP.

85 PIM SM Registering Source Registers First
( , ) Mcast Packets Register Msgs RP (*, ), 00:01:15/00:01:45, RP , flags: S Incoming interface: Null, RPF nbr , Outgoing interface list: Null ( , ), 00:01:15/00:01:45, flags: P Incoming interface: Serial3, RPF nbr , 3 Source E0 S0 S0 S1 S3 S0 S1 rtr-c rtr-a rtr-b The RP creates (S, G) state As a result of the Register message that was received from “rtr-a”, the RP creates (*, G) and (S, G) state. However, because no previous (*, G) state existed, it must be created before the (S,G) entry can be created. This (*, G) entry is created as shown above. Notice that the (*, G) OIL is NULL. This is because the RP has not yet received any (*, G) Joins for this group. (Remember, in this example, the source registers first.) Next, the (S, G) entry can be created and is accomplished as follows: The RPF information is calculated using the source address contained in the multicast packet encapsulated inside of the register message. This results in an IIF of Serial3 and an RPF neighbor of Next, the OIL of the parent (*, G) entry is copied into the OIL of the new (S,G) entry. Since the OIL of the (*, G) entry is NULL, this results in a NULL (S, G) OIL. Now the router is ready to forward the (S, G) packet that was encapsulated in the Register message using the newly created (S, G) state. This is accomplished as follows: Because this packet was received inside of a Register message, the RPF check is skipped. Next, the router forwards a copy of the packet out all interfaces in the matching (S, G) OIL. However, because the (S, G) OIL is NULL (i.e. there are no branches of the Shared Tree), the packet is simply discarded. “RP” processes Register; creates (S, G) state (After automatically creating the (*, G) entry) “rtr-c” (RP) has no receivers on Shared Tree; discards packet. 3

86 PIM SM Registering Source Registers First
[Diagram: Source on Ethernet0 of “rtr-a”; “rtr-a”, “rtr-b” and the RP (“rtr-c”) connected in a chain; Register messages flow to the RP and a Register-Stop is sent back to “rtr-a”.]

Step 3: “rtr-c” (RP) has no receivers on Shared Tree; discards packet.
Step 4: RP sends “Register-Stop” to “rtr-a”.

Source Registers First Example
4) Since the RP’s (*, G) OIL is Null (no receivers have joined the Shared Tree), it does not need the (S, G) traffic. Therefore the RP sends an (S, G) “Register-Stop” message to the 1st-hop router so it will stop sending Register messages.

87 PIM SM Registering Source Registers First
[Diagram: Source on Ethernet0 of “rtr-a”; “rtr-a”, “rtr-b” and the RP (“rtr-c”) connected in a chain; multicast packets from the Source stop at “rtr-a”.]

Step 3: “rtr-c” (RP) has no receivers on Shared Tree; discards packet.
Step 4: RP sends “Register-Stop” to “rtr-a”.
Step 5: “rtr-a” stops encapsulating traffic in Register Messages; drops packets from Source.

Source Registers First Example
5) The 1st-hop router receives the (S, G) Register-Stop message and stops sending Register messages for (S, G) traffic.
Note: Eventually, the original (S, G) entry will time out (approx. 3 min.) and be deleted. The Register process will start over again when the 1st-hop router receives the next multicast packet from directly connected source “S”. The RP will again respond with a Register-Stop, which will prevent the (S,G) traffic from flowing to the RP until it is needed.
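Steps 3 to 5 hinge on one decision at the RP: is the (S, G) OIL empty after it has been copied from the parent (*, G)? The sketch below models that decision only. The function name, the dictionary layout, the returned labels, and the example addresses are assumptions of this illustration rather than anything defined in the slides.

```python
def rp_handle_register(source, group, star_g_oils, s_g_oils, rpf_intf):
    """Decide what the RP does with a de-encapsulated Register packet.

    star_g_oils: group -> set of OIL interfaces
    s_g_oils:    (source, group) -> set of OIL interfaces
    rpf_intf:    RPF interface toward the source (excluded from the OIL)
    """
    # Create the parent (*,G) first if it does not exist yet.
    parent = star_g_oils.setdefault(group, set())
    # The (S,G) OIL starts as a copy of the (*,G) OIL minus the IIF.
    oil = s_g_oils.setdefault((source, group), set(parent) - {rpf_intf})
    if not oil:
        # No receivers on the Shared Tree: discard and stop the Registers.
        return {"forward_on": [], "reply": "register-stop"}
    # Receivers are waiting: forward the packet and join the SPT.
    return {"forward_on": sorted(oil), "reply": "sg-join-toward-source"}


print(rp_handle_register("192.0.2.10", "239.1.1.1", {}, {}, "Serial3"))
```

With empty tables, as in the “Source Registers First” case, the result is an empty forwarding list and a Register-Stop reply; with Serial0 and Serial1 already in the (*, G) OIL, the same call would forward on both and answer with an (S, G) Join.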

88 PIM SM Registering Source Registers First
( , ) Mcast Packets RP Source (*, ), 00:01:28/00:01:32, RP , flags: SP Incoming interface: Serial0, RPF nbr , Outgoing interface list: Null ( /32, ), 00:01:28/00:01:32, flags: FPT Incoming interface: Ethernet0, RPF nbr State in “rtr-a” after Registering (without receivers on Shared Tree) E0 S0 S0 S1 S3 S0 S1 rtr-c rtr-a rtr-b State in 1st-hop router after Registering (w/o Rcvr’s on Shared Tree) Pay particular attention to the following in the (S, G) entry: The “Registering” flag is now cleared. The “Outgoing interface list” is still Null since the RP did not join the SPT. The “P” flag (Pruned) is still set since the oilist is still Null. The “00:01:32” Expiration time value will count down to zero at which time the (S, G) entry will be deleted. (The Register process will begin all over again when the next multicast packet is received from source “S”.)

89 PIM SM Registering Source Registers First
( , ) Mcast Packets RP rtr-b>show ip mroute Group not found. State in “rtr-b” after “rtr-a” Registers (without receivers on Shared Tree) Source E0 S0 S0 S1 S3 S0 S1 rtr-c rtr-c rtr-a rtr-b State in “rtr-b” after Registering (w/o Rcvr’s on Shared Tree) Notice that no state exists in “rtr-b” at this point in time.

90 PIM SM Registering Source Registers First
( , ) Mcast Packets RP Source (*, ), 00:01:15/00:01:45, RP , flags: S Incoming interface: Null, RPF nbr , Outgoing interface list: Null ( , ), 00:01:15/00:01:45, flags: P Incoming interface: Serial3, RPF nbr , E0 S0 S0 S1 S3 S0 S1 rtr-c rtr-a rtr-b State in RP after Registering (w/o Rcvr’s on Shared Tree) Pay particular attention to the following in the newly created (S, G) entry: The “RPF nbr” is the IP Address of “rtr-b”. The “Incoming interface:” is Serial3 which is the RPF interface towards source “S” via “rtr-b”. The “Outgoing interface list:” is Null since the (*,G) OIL is also Null. (Indicates there are no Receivers on the Shared Tree yet.) The “P” flag (Pruned) is set since the OIL is Null. The (S,G) state will remain in the RP as long as the source is still actively sending. This works because the first-hop router continues sending periodic Register messages to the RP as long as it is receiving traffic from the source. State in “RP” after “rtr-a” Registers (without receivers on Shared Tree)

91 PIM SM Registering Source Registers First
( , ) Mcast Packets RP Source E0 S0 S0 S1 S3 S0 S1 rtr-c rtr-a rtr-b (*, G) Join 6 Receivers begin joining the Shared Tree Source Registers First Example 6)The RP now begins receiving (*, G) Joins from Last-hop routers with Receivers that wish to join the Shared Tree. RP (“rtr-c”) receives (*, G) Join from a receiver on Shared Tree. 6

92 PIM SM Registering Source Registers First
( , ) Mcast Packets Join 7 RP Source “RP” processes (*,G) Join (Adds Serial1 to Outgoing Interface Lists) (*, ), 00:09:21/00:02:38, RP , flags: S Incoming interface: Null, RPF nbr , Outgoing interface list: ( /32, , 00:01:15/00:02:46, flags: T Incoming interface: Serial3, RPF nbr , E0 S0 S0 S1 S3 S0 S1 rtr-c rtr-a rtr-b Serial1, Forward/Sparse, 00:00:14/00:02:46 The RP processes the (*, G) Join In the (*, G) entry: Serial1 has been added to the (*, G) entry since a (*,G) Join was received on this interface which is the only active branch of the Shared Tree (RPT). In the (S, G) entry: Serial1 has also been added to the (S, G) OIL because the OILs of all (S,G) entries are always kept in sync with their parent (*, G). Note: When the (S, G) OILs are synchronized with the OIL of their parent (*,G), a check is made to ensure that the IIF of the (S, G) does not appear in the OIL of the (S, G). Otherwise a route loop could result. 7) The transitioning of the (*, G) OIL from Null to non-Null triggers the RP to scan its list of (S, G) entries for group “G” and send (S, G) Joins towards all sources. (This will cause an SPT to be built from each active source back to the RP which will eventually start the flow of (S, G) traffic to the RP.) RP sends (S,G) Joins for all known Sources in Group. 7
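The way the RP handles the (*, G) Join can be sketched as follows. This is a simplified illustration under assumed data structures: the function name `process_star_g_join` and the dictionary layout are hypothetical, and interfaces are reduced to plain strings.

```python
def process_star_g_join(star_g_oil, sg_entries, interface):
    """Apply a (*, G) Join received on `interface`.

    star_g_oil: set of interface names in the (*, G) OIL (mutated in place).
    sg_entries: dict mapping source -> {"iif": str, "oil": set}.
    Returns the sources toward which an (S, G) Join is triggered.
    """
    was_null = not star_g_oil
    star_g_oil.add(interface)
    for entry in sg_entries.values():
        # Keep each (S, G) OIL in sync with its parent (*, G) OIL, but never
        # list the entry's own incoming interface (the route-loop check).
        if interface != entry["iif"]:
            entry["oil"].add(interface)
    # A Null -> non-Null transition triggers (S, G) Joins toward all sources.
    return sorted(sg_entries) if was_null else []
```

Calling it with an empty (*, G) OIL, one (S, G) entry whose incoming interface is `Serial3`, and a Join arriving on `Serial1` adds `Serial1` to both lists and returns that source, mirroring step 7.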

93 PIM SM Registering Source Registers First
( , ) Mcast Packets Join S0 8 Join 7 RP Source (*, ), 00:04:28/00:01:32, RP , flags: SP Incoming interface: Serial1, RPF nbr , Outgoing interface list: Null ( /32, ), 00:04:28/00:01:32, flags: Incoming interface: Serial0, RPF nbr Outgoing interface list: Serial1, Forward/Sparse, 00:04:28/00:01:32 E0 S0 S0 S1 S3 S0 S1 rtr-c rtr-a rtr-b “rtr-b” processes the (S, G) Join and creates state A (*, G) entry must be created before the (S, G) entry can be created. Note that: The RPF information for the (*, G) entry points up the Shared Tree via Serial1 with the RPF neighbor of (Serial 3 of the RP.) Because in this example no members have joined the group, the OIL of the (*, G) entry is Null. The “P” flag (Pruned) is set since the OIL is Null. The (S, G) entry is then created. Pay particular attention to the following: The RPF information for this entry points towards the source via Serial0. The RPF neighbor is (Serial 0 of “rtr-a”.) The (S, G) OIL initially receives a copy of the (*, G) OIL. (Which is Null.) Interface Serial1 (which is the interface that received the (S, G) Join) is added to the (S, G) OIL. The “T” flag is not yet set in the (S, G) entry. However, when the first (S, G) packet is forwarded using this entry, the “T” flag will be set. 8) Because the OIL of the (S, G) transitioned from Null to non-Null (when “rtr-b” added Serial1 to the OIL of the newly created entry), a PIM (S, G) Join is sent to “rtr-a” to continue the process of joining the SPT. “rtr-b” processes Join, creates (S, G) state (After automatically creating the (*, G) entry) RP sends (S,G) Joins for all known Sources in Group. 7 “rtr-b” sends (S,G) Join toward Source to continue building SPT. 8

94 PIM SM Registering Source Registers First
( , ) Mcast Packets 9 RP (*, ), 00:04:28/00:01:32, RP , flags: SP Incoming interface: Serial0, RPF nbr , Outgoing interface list: Null ( /32, ), 00:04:28/00:01:32, flags: FT Incoming interface: Ethernet0, RPF nbr , Registering Outgoing interface list: Source E0 S0 S0 S1 S3 S0 10 (*, ) Mcast Traffic S1 rtr-c rtr-a rtr-b 1st-hop router (rtr-a) processes the (S, G) Join The (S, G) Join is processed as follows: Serial0 is added to the “Outgoing interface list” (OIL). (This is the interface on which the (S, G) Join arrived.) The “P” flag (Pruned) is cleared since the OIL is no longer Null. 9) As a result of Serial0 being added to the (S, G) OIL, traffic begins to flow down the SPT from the source to the RP. 10) The RP then forwards all incoming (S, G) traffic to the Receivers down the Shared Tree. Serial0, Forward/Sparse, 00:04:28/00:01:32 “rtr-a” processes the (S, G) Join; adds Serial0 to OIL RP begins receiving (S,G) traffic down SPT. 9 10 RP forwards (S,G) traffic down Shared Tree to receivers.

95 PIM SM Registering Source Registers First
( , ) Mcast Packets RP (*, ), 00:04:28/00:01:32, RP , flags: SP Incoming interface: Serial1, RPF nbr , Outgoing interface list: Null ( /32, ), 00:04:28/00:01:32, flags: T Incoming interface: Serial0, RPF nbr Outgoing interface list: Serial1, Forward/Sparse, 00:04:28/00:01:32 Source E0 S0 S0 S1 S3 S0 S1 rtr-c rtr-a rtr-b (*, ) Mcast Traffic State in “rtr-b” after Receivers Join Pay particular attention to the following: Both (*, G) and (S, G) state was created as a result of the (S, G) Join received from the RP. The “P” flag is set in the (*, G) entry since there are no receivers on the Shared Tree at this point in the network. The “T” flag is set in the (S, G) entry indicating that traffic is flowing down the Shortest-Path Tree. The “RPF nbr” is the IP Address of “rtr-a”. Serial0 is the “Incoming interface” of the (S, G) entry since this is the RPF interface for source “S” via “rtr-a”. Serial1 is listed in the “Outgoing interface list” of the (S, G) entry since the RP joined the SPT via this interface. Final state in “rtr-b” after Receivers Join

96 PIM SM Registering Source Registers First
( , ) Mcast Packets RP Source Final state in “RP” after Receivers Join (*, ), 00:09:21/00:02:38, RP , flags: S Incoming interface: Null, RPF nbr , Outgoing interface list: Serial1, Forward/Sparse, 00:03:14/00:02:46 ( /32, , 00:01:15/00:02:46, flags: T Incoming interface: Serial3, RPF nbr , Serial1, Forward/Sparse, 00:00:49/00:02:11 E0 S0 S0 S1 S3 S0 S1 rtr-c rtr-a rtr-b (*, ) Mcast Traffic State in RP after Receivers Join In the (*, G) entry: Serial1 has been added to the (*, G) entry since a (*,G) Join was received on this interface which is the only active branch of the Shared Tree (RPT). In the (S, G) entry: Serial1 has also been added to the (S, G) OIL because the OILs of all (S,G) entries are always kept in sync with their parent (*, G). Note: When the (S, G) OILs are synchronized with the OIL of their parent (*, G), a check is made to ensure that the IIF of the (S, G) does not appear in the OIL of the (S, G). Otherwise a route loop could result.

97 PIM SM SPT-Switchover SPT Thresholds may be set for any Group
Access Lists may be used to specify which Groups Default Threshold = 0kbps (i.e. immediately join SPT) Threshold = “infinity” means “never join SPT”. Threshold triggers Join of Source Tree Sends an (S,G) Join up SPT for next “S” in “G” packet received. Pros Reduces Network Latency Cons More (S,G) state must be stored in the routers. SPT Thresholds In PIM Sparse mode, SPT Thresholds may be configured to control when to switch to the Shortest-Path Tree (SPT). SPT Thresholds are specified in Kbps and can be used with Access Lists to specify to which Group(s) the Threshold applies. The default SPT-Threshold is 0Kbps. This means that any and all sources are immediately switched to the Shortest-Path Tree. If an SPT-Threshold of “Infinity” is specified for a group, the sources will not be switched to the Shortest-Path Tree (SPT) and will remain on the Shared Tree. Exceeding the Threshold When the Group’s SPT-Threshold is exceeded in a last-hop router, the next received packet for the group will cause an (S, G) Join to be sent toward the source of the packet. This builds a Shortest-Path Tree from the source “S” to the last-hop router. PROS By switching to the Shortest-Path Tree (SPT), the (usually) optimal path is used to deliver the multicast traffic. Depending on the location of the source in relation to the RP, this switch to the SPT can reduce network latency substantially. CONS In networks with large numbers of senders (remember that most multicast applications, such as IP/TV Client, send RTCP multicast packets in the background and are therefore senders), an increased amount of state must be kept in the routers. In some cases, an Infinity threshold may be used to force certain groups to remain on the Shared Tree when latency is not an issue.

98 SPT-Switchover Mechanism
PIM SM SPT-Switchover SPT-Switchover Mechanism Once each second Compute new (*, G) traffic rate If threshold exceeded, set “J” flag in (*, G) For each (Si , G) packet received: If “J” flag set in (*, G) Join SPT for (Si , G) Mark (Si , G) entry with “J” flag Clear “J” flag in (*,G) SPT-Threshold Myth This is a frequently misunderstood mechanism. Many people think that the traffic rates of the sources in the group are monitored and compared against the SPT-Threshold. THIS IS NOT THE CASE. Instead, the total aggregate rate of Group traffic flowing down the Shared Tree (RPT) is calculated once per second. If this total aggregate rate exceeds the threshold, then the next Group packet received causes that source to be switched to the Shortest-Path Tree (SPT). SPT-Switchover Mechanism Once each second, the aggregate (*, G) traffic rate is computed and checked against the SPT-Threshold. If the aggregate rate of all group traffic flowing down the Shared Tree (RPT) exceeds the threshold, then the “J” flag is set in the (*, G) entry. As each multicast packet is received on the Shared Tree, the “J” bit is checked in the (*, G) entry. If the “J” flag is set, a new (S, G) entry is created for the source of the packet. An (S, G) Join is sent towards the source in order to join the SPT. The “J” flag is set in the (S, G) entry to denote that this entry was created as a result of the SPT-Threshold switchover. The “J” flag in the (*, G) is reset. (It will be set again in one second if the aggregate rate on the Shared Tree is still over the SPT-Threshold.) This mechanism can sometimes result in low rate sources being switched to the SPT erroneously. However, the RPT-switchback mechanism will correct this situation and eventually only the high rate sources will be received via SPTs while low rate sources will remain on the Shared Tree.
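The switchover logic above can be sketched in a few lines. This is an illustrative model only: `LastHopGroupState` and its method names are hypothetical, rates are passed in rather than measured, and the sketch assumes a non-zero threshold (with the default of 0 Kbps a router joins the SPT immediately).

```python
class LastHopGroupState:
    """Hypothetical per-group state in a last-hop router."""

    def __init__(self, threshold_kbps):
        self.threshold_kbps = threshold_kbps
        self.star_g_j_flag = False
        self.sg_entries = {}  # source -> {"J": bool}

    def one_second_tick(self, aggregate_kbps):
        # The aggregate Shared Tree rate is compared, not per-source rates.
        if aggregate_kbps > self.threshold_kbps:
            self.star_g_j_flag = True

    def packet_on_shared_tree(self, source):
        """Return True if this packet triggers an (S, G) Join toward source."""
        if not self.star_g_j_flag:
            return False
        # Create (S, G) state marked with "J", then clear "J" in (*, G)
        # so that only one source is switched per measurement interval.
        self.sg_entries[source] = {"J": True}
        self.star_g_j_flag = False
        return True
```

Because the flag is driven by the aggregate rate, whichever source happens to send the next packet is switched, even a low-rate one. That is the "erroneous" switch the note mentions, which the switchback mechanism later corrects.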

99 State in “rtr-c” before switch
PIM SM SPT-Switchover rtr-c To RP ( ) rtr-a To Source “Si” S0 State in “rtr-c” before switch (*, ), 00:01:43/00:02:13, RP , flags: S Incoming interface: Serial0, RPF nbr , Outgoing interface list: Serial1, Forward/Sparse, 00:01:43/00:02:11 Serial2, Forward/Sparse, 00:00:32/00:02:28 S1 S1 S2 S0 E0 S0 rtr-d E0 (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree E1 rtr-b E0 Rcvr A Rcvr B PIM-SM SPT-Switchover Example Receivers A & B have joined the multicast group, which has resulted in traffic flowing down the Shared Tree as shown by the solid arrows. The state in “rtr-c” prior to the switchover is as follows: The IIF of the (*, G) entry points toward the RP via Serial0. The OIL of the (*, G) entry contains Serial1 and Serial2 as a result of (*, G) Joins that were sent up the Shared Tree by “rtr-a” and “rtr-d”, respectively.

100 State in “rtr-d” before switch
PIM SM SPT-Switchover rtr-c To RP ( ) rtr-a To Source “Si” S0 S1 S1 S2 S0 E0 S0 rtr-d State in “rtr-d” before switch (*, ), 00:01:43/00:02:13, RP , flags: SC Incoming interface: Serial0, RPF nbr , Outgoing interface list: Ethernet0, Forward/Sparse, 00:01:43/00:02:11 E0 (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree E1 rtr-b E0 Rcvr A Rcvr B PIM-SM SPT-Switchover Example The state in “rtr-d” prior to the switchover is as follows: The IIF of the (*, G) entry points toward the RP via Serial0. The OIL of the (*, G) entry contains Ethernet0 as a result of the IGMP Reports for group that are sent by Receiver “B”.

101 State in “rtr-a” before switch
PIM SM SPT-Switchover rtr-c To RP ( ) rtr-a To Source “Si” S0 S1 S1 (*, ), 00:01:43/00:02:13, RP , flags: S Incoming interface: Serial0, RPF nbr , Outgoing interface list: Ethernet0, Forward/Sparse, 00:01:43/00:02:11 State in “rtr-a” before switch S2 S0 E0 S0 rtr-d E0 (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree E1 rtr-b E0 Rcvr A Rcvr B PIM-SM SPT-Switchover Example The state in “rtr-a” prior to the switchover is as follows: The IIF of the (*, G) entry points toward the RP via Serial0. The OIL of the (*, G) entry contains Ethernet0 as a result of (*, G) Joins that were sent up the Shared Tree by “rtr-b”.

102 State in “rtr-b” before switch
PIM SM SPT-Switchover rtr-c To RP ( ) rtr-a To Source “Si” S0 S1 S1 S2 S0 E0 S0 rtr-d E0 (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree E1 rtr-b E0 State in “rtr-b” before switch (*, ), 00:01:43/00:02:13, RP , flags: SC Incoming interface: Ethernet0, RPF nbr , Outgoing interface list: Ethernet1, Forward/Sparse, 00:01:43/00:02:11 Rcvr A Rcvr B PIM-SM SPT-Switchover Example The state in “rtr-b” prior to the switchover is as follows: The IIF of the (*, G) entry points toward the RP via Ethernet0. The OIL of the (*, G) entry contains Ethernet1 as a result of the IGMP Reports for group that are sent by Receiver “A”.

103 PIM SM SPT-Switchover rtr-c rtr-a rtr-d rtr-b
To RP ( ) rtr-a To Source “Si” S0 S1 S1 S2 S0 E0 S0 Group “G” rate > Threshold 1 rtr-d E0 (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree E1 rtr-b E0 (*, ), 00:01:43/00:02:13, RP , flags: SC Incoming interface: Ethernet0, RPF nbr , Outgoing interface list: Ethernet1, Forward/Sparse, 00:01:43/00:02:11 Rcvr A Rcvr B PIM-SM SPT-Switchover Example Step 1: The total amount of all traffic flowing down the Shared Tree begins to exceed the SPT-Threshold configured at “rtr-b”. Step 2: As a result, “rtr-b” sets the “J” flag in the (*, G) entry to denote that the rate is above the SPT-Threshold for this group. 2 J Group “G” rate exceeds SPT Threshold at “rtr-b”; 1 Set J Flag in (*, G) and wait for next (Si,G) packet. 2

104 PIM SM SPT-Switchover rtr-c rtr-a rtr-d rtr-b
To RP ( ) rtr-a To Source “Si” S0 S1 S1 S2 S0 E0 S0 rtr-d 3 E0 (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree E1 rtr-b E0 (*, ), 00:01:43/00:02:13, RP , flags: SC Incoming interface: Ethernet0, RPF nbr , Outgoing interface list: Ethernet1, Forward/Sparse, 00:01:43/00:02:11 Rcvr A Rcvr B PIM-SM SPT-Switchover Example Step 3: The very next packet to arrive via the Shared Tree happens to be from source (Si, G). Because there is a member directly connected to this router (denoted by the “C” flag) and the traffic rate is above the SPT-Threshold (denoted by the “J” flag), “rtr-b” initiates a switch to the SPT for (Si, G) traffic. Step 4: The “J” flag in the (*, G) entry is first cleared and a new traffic rate measurement interval (1 second) is started. Next, (Si, G) state is created for source “Si” sending to group “G”. 4 J (Si,G) packet arrives down Shared tree. 3 Clear J Flag in the (*,G) & create (Si,G) state. 4

105 J Flag indicates (S, G) created by exceeding the SPT-threshold
PIM SM SPT-Switchover rtr-c To RP ( ) rtr-a To Source “Si” S0 S1 S1 S2 S0 E0 S0 rtr-d E0 (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree E1 rtr-b New State in “rtr-b” (*, ), 00:01:43/00:02:13, RP , flags: SC Incoming interface: Ethernet0, RPF nbr , Outgoing interface list: Ethernet1, Forward/Sparse, 00:01:43/00:02:11 ( /32, ), 00:13:28/00:02:53, flags: CJT Incoming interface: Ethernet0, RPF nbr Ethernet1, Forward/Sparse, 00:13:28/00:02:53 E0 Rcvr A Rcvr B PIM-SM SPT-Switchover Example The ( /32, ) entry shown above is created as follows: To denote that this entry was created as a result of the SPT Switchover mechanism, the “J” flag is set on the (S, G) entry. (The “J” flag being set will cause “rtr-b” to monitor the rate of the (S, G) traffic, and if the rate of this traffic drops below the SPT Threshold for over one minute, “rtr-b” will attempt to switch this traffic flow back to the Shared Tree.) The RPF information is calculated in the direction of source “Si”. This results in an Incoming interface of Ethernet0, and an RPF neighbor address of “ ”. (Note that the RPF information for the (S, G) entry is the same as for the (*, G) entry. This indicates that the Shared Tree and the SPT are following the same path at this point.) The OIL for the (S, G) entry is constructed by copying the OIL from the (*, G) entry and then removing the IIF from this list to prevent a possible route loop. This results in an (S, G) OIL containing only Ethernet1. J Flag indicates (S, G) created by exceeding the SPT-threshold CJT

106 PIM SM SPT-Switchover rtr-c rtr-a rtr-d rtr-b
To RP ( ) rtr-a To Source “Si” S0 S1 S1 S2 S0 E0 S0 (Si,G) Join 5 rtr-d E0 (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree E1 rtr-b E0 Rcvr A Rcvr B PIM-SM SPT-Switchover Example Step 5: Once state has been created for (Si, G), an (S, G) Join is sent toward source “Si” to build a branch of the SPT to “rtr-b”. These (Si, G) Joins will continue to be sent periodically (once a minute) as long as the (Si, G) entry is not Pruned (i.e. does not have a Null OIL). 5 Send (Si,G) Join towards Si .

107 PIM SM SPT-Switchover New state in “rtr-a” rtr-c rtr-a rtr-d rtr-b
To RP ( ) rtr-a To Source “Si” S0 S1 S1 (*, ), 00:01:43/00:02:13, RP , flags: S Incoming interface: Serial0, RPF nbr , Outgoing interface list: Ethernet0, Forward/Sparse, 00:01:43/00:02:11 ( /32, ), 00:13:28/00:02:53, flags: T Incoming interface: Serial1, RPF nbr Ethernet0, Forward/Sparse, 00:13:25/00:02:30 New state in “rtr-a” S2 S0 E0 S0 rtr-d E0 (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree E1 rtr-b E0 Rcvr A Rcvr B PIM-SM SPT-Switchover Example When the (Si, G) Join is received by “rtr-a”, the ( /32, ) entry shown above is created as follows: The RPF information is calculated in the direction of source “Si”. This results in an Incoming interface of Serial1, and an RPF neighbor address of “ ”. (Note that the RPF information for the (S, G) entry is not the same as for the (*, G) entry. This indicates that the paths of the Shared Tree and the SPT diverge at this point.) The OIL for the (S, G) entry is constructed by copying the OIL from the (*, G) entry and then removing the IIF from this list to prevent a possible route loop. This results in an (S, G) OIL containing only Ethernet0.

108 PIM SM SPT-Switchover rtr-c rtr-a rtr-d rtr-b
To RP ( ) (Si,G)RP-bit Prune 7 rtr-a To Source “Si” S0 S1 S1 (Si,G) Join 6 S2 S0 E0 S0 rtr-d E0 (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree E1 rtr-b E0 Rcvr A Rcvr B PIM-SM SPT-Switchover Example Step 6: When the (Si, G) state is created at “rtr-a”, an (Si, G) Join is sent toward source “Si”. These (Si, G) Joins will continue to be sent periodically (once a minute) as long as the (Si, G) entry is not Pruned (i.e. does not have a Null OIL). Step 7: Because the paths of the Shared Tree and the SPT diverge at “rtr-a”, (note the difference in RPF information on the previous page), this causes “rtr-a” to begin sending (Si, G)RP-bit Prune messages up the Shared Tree to stop the flow of redundant (Si, G) traffic down the Shared Tree. “rtr-a” forwards (Si,G) Join toward Si. 6 SPT & RPT diverge, triggering (Si,G)RP-bit Prunes toward RP. 7
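How (Si, G) state is built from the parent (*, G) entry, and how a router decides that the two trees diverge, can be sketched as below. The function name `create_sg_state` and the dictionary layout are hypothetical. As a simplification, divergence is detected by comparing incoming interfaces only, whereas the notes compare the full RPF information (interface and neighbor).

```python
def create_sg_state(star_g, iif_toward_source):
    """Build an (S, G) entry from its parent (*, G) entry.

    star_g: {"iif": str, "oil": set of interface names}
    iif_toward_source: RPF interface toward the source.
    Returns (sg_entry, send_rp_bit_prune).
    """
    # Copy the (*, G) OIL, then remove the IIF to prevent a route loop.
    oil = set(star_g["oil"])
    oil.discard(iif_toward_source)
    sg_entry = {"iif": iif_toward_source, "oil": oil}
    # If the SPT and the Shared Tree use different incoming interfaces,
    # the paths diverge here and an (S, G)RP-bit Prune goes toward the RP.
    send_rp_bit_prune = sg_entry["iif"] != star_g["iif"]
    return sg_entry, send_rp_bit_prune
```

Using the interface names from this example, “rtr-b” (both trees via Ethernet0) yields no Prune, while “rtr-a” (Shared Tree via Serial0, SPT via Serial1) yields one, which corresponds to step 7.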

109 PIM SM SPT-Switchover rtr-c rtr-a rtr-d rtr-b
To RP ( ) rtr-a To Source “Si” S0 S1 S1 S2 (Si,G) Traffic 8 S0 E0 S0 rtr-d E0 (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree E1 rtr-b E0 Rcvr A Rcvr B PIM-SM SPT-Switchover Example Step 8: When the (Si, G) Joins reach the first-hop router directly connected to source “Si”, a complete branch of the SPT has been built (shown by the dashed arrows). This permits (Si, G) traffic to flow via the SPT to “rtr-b” and receiver “A”. (Si, G) traffic begins flowing down SPT tree. 8

110 State in “rtr-c” after receiving the (Si, G) RP-bit Prune
PIM SM SPT-Switchover rtr-c To RP ( ) rtr-a To Source “Si” (*, ), 00:01:43/00:02:13, RP , flags: S Incoming interface: Serial0, RPF nbr , Outgoing interface list: Serial1, Forward/Sparse, 00:01:43/00:02:11 Serial2, Forward/Sparse, 00:00:32/00:02:28 ( /32, ), 00:13:28/00:02:53, flags: R Incoming interface: Serial0, RPF nbr State in “rtr-c” after receiving the (Si, G) RP-bit Prune S0 S1 S1 S2 S0 E0 S0 rtr-d E0 (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree E1 rtr-b E0 Rcvr A Rcvr B PIM-SM SPT-Switchover Example When the (Si, G)RP-bit Prune reaches “rtr-c”, the ( /32, ) entry shown above is created as follows: Because this (S, G) entry was created as a result of the receipt of an (S,G)RP-bit Prune, the “R” bit is set to denote that this forwarding state is applicable to traffic flowing down the Shared Tree and not the Source Tree. Because the “R” bit is set, the RPF information is calculated in the direction of the RP instead of source “Si”. (Remember, this entry is applicable to (Si,G) traffic flowing down the Shared Tree and therefore the RPF information must point up the Shared Tree.) This results in an Incoming interface of Serial0, and an RPF neighbor address of “ ”. The OIL for the (S, G) entry is constructed by copying the OIL from the (*, G) entry minus the interface on which the (Si, G)RP-bit Prune was received. Next, the IIF is removed from the OIL to prevent a possible route loop. These steps result in an (S, G) OIL containing only Serial2. At this point, (Si, G) traffic flowing down the Shared Tree will be forwarded using the (Si, G) entry. (Si, G) traffic arriving at “rtr-c” will RPF correctly because the RPF information in the (Si, G) entry is pointing up the Shared Tree (as a result of the “R” bit) and will then be forwarded out all interfaces in the (Si, G) OIL. In this case, only Serial2 remains in the (Si, G) OIL and therefore (Si, G) traffic will be sent to “rtr-d” but not “rtr-a”. 
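The OIL construction for an entry created by an (S, G)RP-bit Prune comes down to two set removals. The helper below is a sketch with a hypothetical name (`rp_bit_prune_oil`); interfaces are plain strings taken from this example.

```python
def rp_bit_prune_oil(star_g_oil, prune_interface, iif):
    """OIL of an (S, G) entry created by an (S, G)RP-bit Prune.

    star_g_oil: interfaces in the parent (*, G) OIL.
    prune_interface: interface on which the RP-bit Prune arrived.
    iif: incoming interface of the new entry (points toward the RP).
    """
    oil = set(star_g_oil)
    oil.discard(prune_interface)  # stop forwarding toward the pruning router
    oil.discard(iif)              # loop check: the IIF never appears in the OIL
    return oil


# Values from "rtr-c": (*, G) OIL is Serial1 and Serial2, Prune arrives on Serial1.
print(rp_bit_prune_oil({"Serial1", "Serial2"}, "Serial1", "Serial0"))
```

The result contains only `Serial2`, so (Si, G) traffic arriving down the Shared Tree continues to “rtr-d” but no longer to “rtr-a”.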
This successfully prunes the redundant (Si, G) traffic from the branch of the Shared Tree between “rtr-c” and “rtr-a”.

111 PIM SM SPT-Switchover rtr-c rtr-a rtr-d rtr-b
To RP ( ) rtr-a To Source “Si” S0 S1 S1 S2 9 S0 E0 S0 rtr-d E0 (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree E1 rtr-b E0 Rcvr A Rcvr B PIM-SM SPT-Switchover Example Step 9: At this point, the redundant (Si, G) traffic is pruned from the Shared Tree branch from “rtr-c” to “rtr-a”. (Si, G) traffic is reaching receiver “A” via the SPT through “rtr-a” and “rtr-b”. Unnecessary (Si, G) traffic is pruned from the Shared tree. 9

112 PIM SM SPT-Switchover rtr-c rtr-a rtr-d rtr-b
To RP ( ) rtr-a To Source “Si” S0 S1 S1 S2 10 S0 E0 S0 rtr-d E0 (Si, G) Traffic Flow Shared (RPT) Tree SPT Tree E1 rtr-b E0 Rcvr A Rcvr B PIM-SM SPT-Switchover Example Step 10: (Si, G) traffic is still reaching receiver “B” via a branch of the Shared Tree through “rtr-c” and “rtr-d”. This is because the (Si, G) state in “rtr-c” still has Serial2 in its OIL. Unnecessary (Si, G) traffic is pruned from the Shared tree. 9 (Si, G) traffic still flows via other branches of the Shared tree. 10

113 Shared Tree Switchback Mechanism
PIM SM SPT-Switchover Shared Tree Switchback Mechanism Once each second when the (S,G) state is older than 1 minute If “J” flag set in (Si , G) entry Compute new (Si , G) traffic rate If rate < SPT-threshold Rejoin (*, G) Tree for (Si , G) traffic Send (Si , G) prune up SPT toward Si Delete (Si , G) entry Shared Tree Switchback The Shared Tree Switchback (for lack of a better term) mechanism is used to switch sources back to the Shared Tree when their traffic rate falls below the SPT-Threshold. Switchback Algorithm The Switchback mechanism runs once a minute. (This helps prevent Sources from cycling between Shared Tree and Shortest-Path Tree too rapidly.) For each (Si, G) entry in the Multicast Routing Table that has the “J” flag set, the mechanism computes the traffic rate for source Si. If the rate has fallen below the SPT-Threshold, a switchback to the Shared Tree is initiated by the last-hop router by: Sending a Join/Prune message that contains a (*, G) Join without an (Si, G)RP-bit Prune, up the Shared Tree (RPT). (This will cause the (Si, G) Prune state along the RPT to be deleted which will permit (Si, G) traffic to begin flowing down the RPT again.) Deleting its (Si, G) entry in the Multicast Routing Table. Sending an (Si, G) Prune up the Shortest-Path Tree (SPT) to stop traffic from flowing down the SPT. Note that this Switchback Algorithm is broken in older versions of IOS.
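The switchback check can be sketched as a single pass over the (Si, G) entries. This is an illustration under assumptions: the function name `switchback_actions`, the `age_s` field, and the action strings are hypothetical, and the one-minute age condition follows the slide text.

```python
def switchback_actions(sg_entries, rates_kbps, threshold_kbps):
    """Return the messages a last-hop router would send to switch back.

    sg_entries: dict source -> {"J": bool, "age_s": int}; matching entries
                are deleted in place.
    rates_kbps: dict source -> measured (Si, G) rate in kbps.
    """
    actions = []
    for source, entry in list(sg_entries.items()):
        # Only entries created by the SPT-Threshold ("J" flag) and older
        # than one minute are candidates for switchback.
        if not entry["J"] or entry["age_s"] <= 60:
            continue
        if rates_kbps.get(source, 0) < threshold_kbps:
            actions.append(("star-g-join-without-rp-bit-prune", source))
            actions.append(("sg-prune-toward-source", source))
            del sg_entries[source]
    return actions
```

The age check is what keeps a source from cycling rapidly between the Shared Tree and the Shortest-Path Tree: a freshly switched low-rate source stays on the SPT for at least a minute before it is moved back.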

114 PIM-SM Protocol Mechanics
PIM SM State PIM SM Forwarding PIM SM Joining PIM SM Registering PIM SM SPT-Switchover PIM SM Pruning

115 PIM SM Pruning IGMP group times out / last host sends Leave
Interface removed from all (*,G) and (S,G) entries IF all interfaces in “oilist” for (*,G) are pruned; THEN send Prune up shared tree toward RP Any (S, G) state allowed to time-out Each router along path “prunes” interface SM Pruning Locally connected host sends an IGMP Leave (or IGMP state times out in the router) for group “G”. The interface is removed from the (*, G) and all (S, G) entries in the Multicast Routing Table. If the (*, G) “Outgoing Interface list” is now Null, then send a (*, G) Prune up the Shared Tree (RPT) towards the RP. Any remaining (S, G) entries are allowed to timeout and be deleted from the Multicast Routing Table. When the routers up the Shared Tree receive the (*, G) Prune, they remove the interface on which the Prune was received from their (*, G) “Outgoing interface list”. If as a result of removing the interface the (*, G) “Outgoing Interface list” becomes Null, then forward a (*, G) Prune up the Shared Tree (RPT) towards the RP.

116 PIM SM Pruning Shared Tree Case
To RP ( ) S0 rtr-a (Si, G) Traffic Flow E0 Shared Tree SPT Tree E0 E1 rtr-b (*, ), 00:01:43/00:02:13, RP , flags: SC Incoming interface: Ethernet0, RPF nbr , Outgoing interface list: Ethernet1, Forward/Sparse, 00:01:43/00:02:11 State in “rtr-b” before Pruning Rcvr A State in “rtr-b” before Pruning Pay particular attention to the following: Traffic is flowing down the Shared Tree. (Denoted by the existence of only the (*, G) entry.) The “Incoming interface” is Ethernet0. The “Outgoing interface list” contains Ethernet1. The “C” flag is set in the (*, G) which denotes that there is a locally connected host for this group. (Rcvr A)

117 PIM SM Pruning Shared Tree Case
To RP ( ) (*, ), 00:01:43/00:02:13, RP , flags: S Incoming interface: Serial0, RPF nbr , Outgoing interface list: Ethernet0, Forward/Sparse, 00:01:43/00:02:11 State in “rtr-a” before Pruning S0 rtr-a (Si, G) Traffic Flow E0 Shared Tree SPT Tree E0 E1 rtr-b Rcvr A State in “rtr-a” before Pruning: RPT Case Pay particular attention to the following: Traffic is flowing down the Shared Tree. (Denoted by the existence of only the (*, G) entry.) The “Incoming interface” is Serial0. The “Outgoing interface list” contains Ethernet0.

118 PIM SM Pruning Shared Tree Case
To RP ( ) S0 rtr-a (Si, G) Traffic Flow E0 (*,G) Prune 3 Shared Tree SPT Tree E0 IGMP Leave 1 E1 X rtr-b Rcvr A 2 PIM SM Pruning Example: RPT Case 1) The last-hop or Leaf router (rtr-b) receives an IGMP Group Leave message from “Rcvr A” for group “G”. After performing the normal IGMP Leave processing and finding that “Rcvr A” was the last host to leave, the IGMP state for group “G” on interface “E1” is deleted. 2) This causes interface “E1” to be removed from the “Outgoing interface list” of the (*, G) entry and any (Si, G) entries (in this case there are none) in the Multicast Routing Table. Because “E1” was the only interface in the (*, G) entry, its outgoing interface list becomes Null. 3) Because the (*, G) “Outgoing interface list” is now Null, a (*, G) Prune is sent up the Shared Tree (RPT) via “E0” toward the RP. “rtr-b” is a Leaf router. Last host “Rcvr A”, leaves group G. 1 “rtr-b” removes E1 from (*,G) and any (Si,G) “oilists”. 2 “rtr-b” (*,G) “oilist” now empty; sends (*,G) Prune toward RP. 3

119 PIM SM Pruning Shared Tree Case
X 6 S1 To RP ( ) (*,G) Prune 5 S0 rtr-a (Si, G) Traffic Flow E0 X Shared Tree SPT Tree 4 E0 E1 rtr-b PIM SM Pruning Example — RPT Case (cont.) 4) The (*, G) Prune is received by “rtr-a” which causes interface “E0” to be removed from the “Outgoing interface list” of the (*, G) entry in the Multicast Routing Table. (Note: “rtr-a” delayed Pruning E0 from the (*, G) entry for 3 seconds since this is a Multi-Access network and it needed to wait for a possible overriding Join from another PIM neighbor. Since none was received, the interface was pruned.) 5) Because the (*, G) “Outgoing interface list” is now Null, a (*, G) Prune is forwarded on up the Shared Tree (RPT) via “S0” toward the RP. 6) This pruning continues back toward the RP or until a router is reached whose (*, G) “Outgoing interface list” doesn’t go to Null as a result of the Prune. “rtr-a” receives Prune; removes E0 from (*,G) “oilist”. (After the 3 second Multi-access Network Prune delay.) 4 “rtr-a” (*,G) “oilist” now empty; send (*,G) Prune toward RP. 5 Pruning continues back toward RP. 6
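The hop-by-hop propagation of the (*, G) Prune in steps 1 to 6 can be sketched as a loop that stops at the first router whose OIL stays non-Null. The function name `propagate_prune`, the tuple layout, and the router name "upstream" are hypothetical. The sketch ignores the 3-second multi-access Prune delay and any overriding Join.

```python
def propagate_prune(path):
    """Walk a (*, G) Prune from the leaf router toward the RP.

    path: list of (router_name, star_g_oil, downstream_interface), leaf
          first; each OIL is a set that is mutated in place.
    Returns the names of routers that sent a (*, G) Prune upstream.
    """
    senders = []
    for name, oil, interface in path:
        oil.discard(interface)
        if oil:
            # Other branches still need the traffic: pruning stops here.
            break
        senders.append(name)
    return senders


path = [
    ("rtr-b", {"Ethernet1"}, "Ethernet1"),
    ("rtr-a", {"Ethernet0"}, "Ethernet0"),
    ("upstream", {"Serial1", "Serial2"}, "Serial1"),
]
print(propagate_prune(path))
```

Here “rtr-b” and “rtr-a” both end up with a Null OIL and send Prunes, while the assumed upstream router keeps `Serial2` and therefore does not forward the Prune any further.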

120 PIM SM Pruning Source (SPT) Case
To Source “Si” S1 To RP ( ) S0 rtr-a (Si, G) Traffic Flow E0 Shared Tree SPT Tree E0 E1 rtr-b (*, ), 00:01:43/00:02:59, RP , flags: SC Incoming interface: Ethernet0, RPF nbr , Outgoing interface list: Ethernet1, Forward/Sparse, 00:01:43/00:02:11 ( /32, ), 00:01:05/00:01:55, flags: CJT Incoming interface: Ethernet0, RPF nbr Ethernet1, Forward/Sparse, 00:01:05/00:02:55 State in “rtr-b” before Pruning Rcvr A State in “rtr-b” before Pruning: SPT Case Pay particular attention to the following: Both (*, G) and (S, G) entries exist. The “J” flag is set in the (S, G) entry. This indicates that the (S, G) state was created as a result of the SPT-Threshold being exceeded. The “T” flag is set in the (S, G) entry. This indicates that (S, G) traffic is being successfully received via the Shortest-Path Tree (SPT). The “Incoming interface” is the same for the (*, G) and the (S, G) entry. This indicates that the Shared Tree and the Shortest-Path Tree are the same at this point.

121 PIM SM Pruning Source (SPT) Case
To Source “Si” S1 To RP ( ) (*, ), 00:01:43/00:02:59, RP , flags: S Incoming interface: Serial0, RPF nbr , Outgoing interface list: Ethernet0, Forward/Sparse, 00:01:43/00:02:11 ( /32, ), 00:01:05/00:01:55, flags: T Incoming interface: Serial1, RPF nbr Ethernet0, Forward/Sparse, 00:01:05/00:02:55 State in “rtr-a” before Pruning S0 rtr-a (Si, G) Traffic Flow E0 Shared Tree SPT Tree E0 E1 rtr-b Rcvr A State in “rtr-a” before Pruning: SPT Case Pay particular attention to the following: Both (*, G) and (S, G) entries exist. The “T” flag is set in the (S, G) entry. This indicates that (S, G) traffic is being successfully received via the Shortest-Path Tree (SPT). The “Incoming interface” is different for the (*, G) and the (S, G) entry. This indicates that the Shared Tree and the Shortest-Path Tree diverge at this point.

122 PIM SM Pruning Source (SPT) Case
PIM SM Pruning Example: SPT Case
[Figure labels: rtr-a (S1 to Source “Si”, S0 to RP, E0), rtr-b (E0, E1), Rcvr A, Shared Tree, SPT Tree, (Si, G) Traffic Flow; IGMP Leave at step 1, E1 removed at step 2, (*,G) Prune at step 3.]
Slide steps:
1) “rtr-b” is a Leaf router. Last host, “Rcvr A”, leaves group G.
2) “rtr-b” removes E1 from (*,G) and any (Si,G) “oilists”.
3) “rtr-b” (*,G) “oilist” now empty; sends (*,G) Prune toward RP.
Notes:
1) The last-hop or Leaf router (rtr-b) receives an IGMP Group Leave message from “Rcvr A” for group “G”. After performing the normal IGMP Leave processing and finding that “Rcvr A” was the last host to leave, the IGMP state for group “G” on interface “E1” is deleted.
2) This causes interface “E1” to be removed from the “Outgoing interface list” of the (*, G) entry and any (Si, G) entries in the Multicast Routing Table. Because “E1” was the only interface in the (*, G) and the (Si, G) entries, their outgoing interface lists become Null.
3) Because the (*, G) “Outgoing interface list” is now Null, a (*, G) Prune is sent up the Shared Tree (RPT) via “E0” toward the RP.

123 PIM SM Pruning Source (SPT) Case
PIM SM Pruning Example: SPT Case (cont.)
[Figure labels: rtr-a (S1 to Source “Si”, S0 to RP, E0), rtr-b (E0, E1), Rcvr A, Shared Tree, SPT Tree, (Si, G) Traffic Flow; Periodic (S, G) Join stopped at step 4.]
Slide steps:
1) “rtr-b” is a Leaf router. Last host, “Rcvr A”, leaves group G.
2) “rtr-b” removes E1 from (*,G) and any (Si,G) “oilists”.
3) “rtr-b” (*,G) “oilist” now empty; sends (*,G) Prune toward RP.
4) “rtr-b” stops sending periodic (S, G) joins.
Notes:
4) Because the (Si, G) “Outgoing interface list” is now Null, “rtr-b” stops sending Periodic (Si, G) Join messages up the Shortest-Path Tree (SPT).
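The leaf-router behaviour in steps 1 to 4 can be sketched in a few lines of Python. This is only an illustrative model, not router code: the `MrouteEntry` class, its field names, and `handle_last_member_leave` are hypothetical names invented for this sketch, and IGMP leave processing is assumed to have already decided that the last member is gone.

```python
from dataclasses import dataclass, field


@dataclass
class MrouteEntry:
    """One (*, G) or (S, G) entry; only the fields this sketch needs."""
    source: str                    # "*" for the shared-tree entry
    group: str
    iif: str                       # incoming (RPF) interface
    oilist: set = field(default_factory=set)
    periodic_joins: bool = True    # still refreshing upstream (S, G) state?


def handle_last_member_leave(entries, group, interface):
    """Apply steps 2-4 on a leaf router and return the PIM messages to send.

    Each message is a tuple (type, source, group, outgoing interface).
    """
    messages = []
    for entry in entries:
        if entry.group != group:
            continue
        # Step 2: remove the interface from every oilist for this group.
        entry.oilist.discard(interface)
        if entry.oilist:
            continue  # other receivers remain downstream; nothing to prune
        if entry.source == "*":
            # Step 3: empty (*, G) oilist, so prune up the shared tree.
            messages.append(("Prune", "*", group, entry.iif))
        else:
            # Step 4: empty (S, G) oilist, so stop the periodic joins and
            # let the upstream (S, G) state time out.
            entry.periodic_joins = False
    return messages
```

Calling it with the “rtr-b” state shown earlier (both entries with Ethernet0 incoming and only Ethernet1 outgoing) yields a single (*, G) Prune out Ethernet0 and marks the (Si, G) entry as no longer sending joins.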

124 PIM SM Pruning Source (SPT) Case
PIM SM Pruning Example: SPT Case (cont.)
[Figure labels: rtr-a (S1 to Source “Si”, S0 to RP, E0), rtr-b (E0, E1), Shared Tree, SPT Tree, (Si, G) Traffic Flow; E0 removed at step 5, (*,G) Prune toward RP at step 6.]
Slide steps:
5) “rtr-a” receives Prune; removes E0 from (*,G) “oilist” (after the 3 second multi-access network Prune delay).
6) “rtr-a” (*,G) “oilist” now empty; sends (*,G) Prune toward RP.
Notes:
5) The (*, G) Prune is received by “rtr-a”, which causes interface “E0” to be removed from the “Outgoing interface list” of the (*, G) entry in the Multicast Routing Table. (Note: “rtr-a” delayed pruning E0 from the (*, G) entry for 3 seconds since this is a multi-access network and it needed to wait for a possible overriding Join from another PIM neighbor. Since none was received, the interface was pruned.)
6) Because the (*, G) “Outgoing interface list” is now Null, a (*, G) Prune is forwarded on up the Shared Tree (RPT) via “S0” toward the RP.
7) Because “rtr-a” is no longer receiving (Si, G) Join messages from “rtr-b”, the (Si, G) state eventually times out. This causes a (Si, G) Prune to be sent up the Shortest-Path Tree (SPT) towards the source “Si”.
8) Traffic stops flowing down the Shortest-Path Tree (SPT).
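The 3 second delay that “rtr-a” applies before pruning a multi-access interface can be modelled as a small pending-prune table. The sketch below is a simplified illustration: the class name `PendingPrunes`, its methods, and the use of plain float timestamps instead of real timers are assumptions made for this example.

```python
PRUNE_DELAY = 3.0  # seconds to wait for an overriding Join on a multi-access link


class PendingPrunes:
    """Tracks interfaces that will be pruned unless a Join overrides the Prune."""

    def __init__(self):
        self._expiry = {}  # interface name -> time at which the prune takes effect

    def on_prune(self, interface, now):
        # Start the delay; a repeated Prune does not restart a running timer.
        self._expiry.setdefault(interface, now + PRUNE_DELAY)

    def on_join(self, interface):
        # An overriding Join from another PIM neighbor cancels the pending prune.
        self._expiry.pop(interface, None)

    def expire(self, oilist, now):
        """Remove every interface whose delay has elapsed; return those removed."""
        due = sorted(i for i, t in self._expiry.items() if t <= now)
        for interface in due:
            del self._expiry[interface]
            oilist.discard(interface)
        return due
```

If `expire` leaves the oilist empty, the router would go on to send its own Prune upstream, as in step 6.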

125 PIM SM Pruning Source (SPT) Case
[Figure labels: rtr-a (S1 to Source “Si”, S0 to RP, E0), rtr-b (E0, E1), Shared Tree, SPT Tree, (Si, G) Traffic Flow.]
State in “rtr-b” after Pruning
(*, ), 00:02:32/00:02:59, RP , flags: SP
  Incoming interface: Ethernet0, RPF nbr ,
  Outgoing interface list:
( /32, ), 00:01:56/00:00:53, flags: PT
  Incoming interface: Ethernet0, RPF nbr

126 PIM SM Pruning Source (SPT) Case
[Figure labels: rtr-a (S1 to Source “Si”, S0 to RP, E0), rtr-b (E0, E1), Shared Tree, SPT Tree, (Si, G) Traffic Flow.]
State in “rtr-a” after Pruning
(*, ), 00:02:32/00:02:59, RP , flags: SP
  Incoming interface: Serial0, RPF nbr ,
  Outgoing interface list:
( /32, ), 00:01:56/00:00:53, flags: PT
  Incoming interface: Serial1, RPF nbr

127 PIM SM Pruning Source (SPT) Case
PIM SM Pruning Example: SPT Case (cont.)
[Figure labels: rtr-a (S1 to Source “Si”, S0 to RP, E0), rtr-b (E0, E1), Shared Tree, SPT Tree, (Si, G) Traffic Flow; (Si,G) Data at step 7, (Si,G) Prune at step 8.]
Slide steps:
7) Another (Si,G) data packet arrives via Serial1.
8) “rtr-a” responds by sending an (Si,G) Prune toward source.

128 PIM SM Pruning Source (SPT) Case
PIM SM Pruning Example: SPT Case (cont.)
[Figure labels: rtr-a (S1 to Source “Si”, S0 to RP, E0), rtr-b (E0, E1), Shared Tree, SPT Tree, (Si, G) Traffic Flow; traffic stopped at step 9.]
Slide steps:
7) Another (Si,G) data packet arrives via Serial1.
8) “rtr-a” responds by sending an (Si,G) Prune toward source.
9) (Si,G) traffic ceases flowing down SPT.

129 Multicast Protocols Multicast Addressing Basics IGMP PIM MSDP MBGP

130 MSDP—Multicast Source Discovery Protocol
MSDP concepts MSDP design points MSDP example Cisco MSDP implementation MSDP configuration MSDP application—Anycast RP

131 MSDP Concept Simple but elegant Utilize inter-domain source trees
Reduces problem to locating active sources RP or receiver last-hop can join inter-domain source tree

132 MSDP Concepts Works with PIM-SM only
RPs know about all sources in a domain
  Sources cause a “PIM Register” to the RP
  An RP can tell RPs in other domains about its sources
    Via MSDP SA (Source Active) messages
RPs know about receivers in a domain
  Receivers cause a “(*, G) Join” to the RP
  An RP can join the source tree in the peer domain
    Via normal PIM (S, G) joins

133 MSDP Overview: MSDP Example
[Figure: Domains A, B, C, D, and E, each with an RP; the RPs are MSDP peers. Labels: source s, “Register” (Domain A), “SA Message” and “Source Active Messages” between peers, receiver r, “Join (*, )”. Group and source addresses are missing from the labels.]

134 MSDP Overview: MSDP Example
[Figure: same five-domain topology with MSDP peers. New label: “(S, ) Join”.]

135 MSDP Overview: MSDP Example
[Figure: same five-domain topology with MSDP peers. New label: “Multicast Traffic”.]

136 MSDP Overview: MSDP Example
[Figure: same five-domain topology with MSDP peers. Labels: “Multicast Traffic”, “Join (S, )”.]

137 MSDP Overview: MSDP Example
[Figure: same five-domain topology with MSDP peers. Label: “Multicast Traffic”.]

138 MSDP Design Points MSDP peers talk via TCP connections
Source Active (SA) messages Peer-RPF forwarded to prevent loops RPF check on AS-PATH back to the peer RP other rules from draft apply If successful, flood SA message to other peers Stub sites can be configured to accept all SA messages (similar to static default for routing) Since they have only one exit (e.g., default peer) MSDP speaker may cache SA messages Reduces join latency

139 MSDP Peers MSDP establishes a neighbor relationship between MSDP peers
Peers connect using TCP port 639 Peers send keepalives every 60 secs (fixed) Peer connection reset after 75 seconds if no MSDP packets or keepalives are received MSDP peers must run BGP! May be an MBGP peer, a BGP peer or both Exception: BGP is unnecessary when peering with only a single MSDP peer.

140 MSDP SA Messages MSDP SA Message Contents
One or more messages (in TLV format)
  Keepalives
  Source Active (SA) Messages
  Source Active Request (SA-Req) Messages
  Source Active Response (SA-Resp) Messages
SA Messages
  Used to advertise active Sources in a domain
  Can also carry initial multicast packet from source
    Hack for Bursty Sources (a la SDR)
SA Message Contents:
  IP Address of Originator (usually an RP)
  Number of (S, G) pairs being advertised
  List of active (S, G)s in the domain
  Encapsulated Multicast packet

141 MSDP SA Messages MSDP Message Contents (cont.)
SA Request (SA-Req) Messages
  Used to request a list of active sources for a group
  Sent to an MSDP SA Cache Server
  Reduces Join Latency to active sources
  SA Request Messages contain:
    Requested Group Address
SA Response (SA-Resp) Messages
  Sent in response to an SA Request message
  SA Response Messages contain:
    IP Address of Originator (usually an RP)
    Number of (S, G) pairs being advertised
    List of active (S, G)s in the domain
Keepalive messages
  Used to keep MSDP peer connection up

142 Receiving SA Messages Skip RPF Check and process SA if:
  Sending MSDP peer = only MSDP peer (i.e. the ‘default-peer’ or only ‘msdp-peer’ configured.)
  Sending MSDP peer = Mesh-Group peer
RPF Check the received SA message.
  Lookup best BGP path to RP in SA message
    Search MBGP RIB first, then BGP RIB
    If path to RP not found, RPF Check Fails; ignore SA message
  Sending MSDP Peer = BGP peer?
    Yes: Is best path to RP via this BGP peer?
      If yes, RPF Check Succeeds; process SA message
    No: Is next AS in best path to RP = AS of MSDP peer?

143 Receiving SA Messages Detailed RPF Check rules
Case 1: Sending MSDP Peer = iBGP peer
  Is best path to RP via this BGP peer?
  Translation: Is the address of the iBGP peer that advertised the best path to the RP = address of the sending MSDP peer?
Case 2: Sending MSDP Peer = eBGP peer
  Is best path to RP via this BGP peer?
  Translation: Is the AS of eBGP peer = next AS in the best path to the RP?
Case 3: Sending MSDP Peer != BGP peer
  Is the next AS in best path to RP = AS of the sending MSDP peer?
  Translation: None needed.
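The three cases, plus the two "skip the check" conditions from the previous slide, can be written as one small decision function. This is a minimal sketch of the rules as stated on these slides, not of any router's implementation: `BestPath`, `MsdpPeer`, `accept_sa`, and the `only_peer` flag are hypothetical names, and the AS path is assumed to be stored nearest AS first.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BestPath:
    """Best (M)BGP path toward the RP named in the SA message."""
    advertiser: str              # address of the BGP peer that advertised it
    as_path: Tuple[int, ...]     # nearest AS first; empty for an internal route


@dataclass(frozen=True)
class MsdpPeer:
    address: str
    asn: int
    bgp_session: Optional[str] = None   # "ibgp", "ebgp", or None if not a BGP peer
    mesh_group: bool = False


def accept_sa(peer, best_path_to_rp, only_peer=False):
    """Return True if an SA received from `peer` passes the peer-RPF rules."""
    # The RPF check is skipped for a sole/default peer or a mesh-group peer.
    if only_peer or peer.mesh_group:
        return True
    # No path to the RP in either RIB: the RPF check fails.
    if best_path_to_rp is None:
        return False
    # Case 1: iBGP peer. Did this peer advertise the best path to the RP?
    if peer.bgp_session == "ibgp":
        return best_path_to_rp.advertiser == peer.address
    # Cases 2 and 3: compare the peer's AS with the next AS toward the RP.
    next_as = best_path_to_rp.as_path[0] if best_path_to_rp.as_path else None
    return next_as == peer.asn
```

Cases 2 and 3 collapse into the same comparison here because both "translations" on the slide reduce to matching the peer's AS against the next AS in the best path.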

144 RPF Check Example: RPF rule when MSDP == internal (m)BGP peer
[Figure labels: AS1; routers A, B, C; RP; Source; MSDP SA; MSDP/iMBGP mesh-peering; “ip pim rp-address”; router B BGP table (Network, Next Hop, Path) and MSDP peer list (MSDP Peer, AS, State: UP, UP); addresses missing.]
Who is the iBGP peer advertising this route? Is the MSDP peer == BGP peer? RPF Success!

145 RPF Check Example: RPF rule when MSDP == internal (m)BGP peer
[Figure labels: same as the previous slide.]
Who is the iBGP peer advertising this route? Is the MSDP peer == BGP peer? RPF Failure!

146 RPF Check Example: RPF rule when MSDP == external (m)BGP peer
[Figure labels: AS1, AS 2, AS3; routers A, B, C; RP; Source; MSDP SA; MSDP/eMBGP mesh-peering; “ip pim rp-address”; router B MSDP peer list (State: UP, UP), BGP neighbours (Neighbor, AS, Up/Down), and BGP table (Network, Next Hop, Path); addresses missing.]
Who is the BGP peer advertising this route? Is the MSDP AS == Last AS in RP route? RPF Success!

147 RPF Check Example: RPF rule when MSDP == external (m)BGP peer
[Figure labels: same as the previous slide.]
Who is the BGP peer advertising this route? Is the MSDP AS == Last AS in RP route? RPF Failure!

148 RPF Check Example: RPF rule when MSDP != (m)BGP peer
[Figure labels: AS1, AS 2, AS3; routers A, B, C; RP; Source; MSDP SA; MSDP/eMBGP mesh-peering; “ip pim rp-address”; router B MSDP peer list (State: UP, UP) and BGP table (Network, Next Hop, Path); addresses missing.]
Is the MSDP AS == Last AS in RP route? RPF Success!

149 RPF Check Example: RPF rule when MSDP != (m)BGP peer
[Figure labels: same as the previous slide.]
Is the MSDP AS == Last AS in RP route? RPF Failure!

150 Cisco MSDP Implementation
Cisco implementation current with ID: draft-ietf-msdp-spec-02.txt Multiple peer support Peer with BGP, MBGP, or static peers SA caching (off by default) Sending and receiving SA-requests Sending and receiving SA-responses

151 Cisco MSDP Implementation
SA input and output filtering SA-request input filtering Default peer support So a tail site can MSDP with a backbone provider without requiring the two to BGP peer Triggered join support when creating an (S,G) learned by MSDP Mesh groups Reduces RPF-flooding of SA messages between fully meshed MSDP peers

152 MSDP Configuration
Configure peers
  ip msdp peer <ip-address> [connect-source <i/f>]
Configure default peer
  ip msdp default-peer <ip-address> [prefix-list acl]
SA caching
  ip msdp cache-sa-state [list <acl>]
Mesh groups
  ip msdp mesh-group <name> <ip-address>

153 MSDP Configuration
Filtering
  Can filter SA in/out, groups, with ACLs or route-maps
TTL Scoping
  ip msdp ttl-threshold <ip-address> <ttl>
For more configuration commands see: ftp://ftpeng.cisco.com/ipmulticast/msdp-commands

154 ISP Requirements to Deploy Now
Want an explicit join protocol for efficiency PIM-SM (forwarding) Use existing (unicast) operation model MBGP (routing) Need interdomain source discovery MSDP (source discovery)

155 MSDP Application—Anycast RP
draft-ietf-mboned-anycast-rp-05.txt Within a domain, deploy more than one RP for the same group range Give each RP the same IP address assignment Sources and receivers use closest RP May be used intra-domain (enterprise) to provide redundancy and RP load sharing

156 MSDP Application—Anycast RP
Sources from one RP are known to other RPs using MSDP When an RP goes down, sources and receivers are taken to new RP via unicast routing Fast convergence

157 Anycast RP—Convergence
[Figure labels: Src, Rec, Rec, RP1 (marked X), RP2, MSDP between RP1 and RP2.]

158 Anycast RP—Convergence
[Figure labels: Src, Src, Rec (four receivers), RP1 (marked X), RP2.]

159 Multicast Protocols Multicast Addressing Basics IGMP PIM MSDP MBGP

160 MBGP—Multiprotocol BGP
MBGP overview MBGP capability negotiation MBGP NLRI exchange

161 MBGP Overview MBGP: Multiprotocol BGP (aka multicast BGP in multicast networks) Defined in RFC 2283 (extensions to BGP) Can carry different types of routes Unicast Multicast Both routes carried in same BGP session Does not propagate multicast state info Same path selection and validation rules AS-Path, LocalPref, MED, …

162 MBGP Overview
New multiprotocol attributes
  MP_REACH_NLRI
  MP_UNREACH_NLRI
MP_REACH_NLRI and MP_UNREACH_NLRI
  Address Family Information (AFI) = 1 (IPv4)
    Sub-AFI = 1 (NLRI is used for unicast)
    Sub-AFI = 2 (NLRI is used for multicast RPF check)
    Sub-AFI = 3 (NLRI is used for both unicast and multicast RPF check)

163 MBGP Overview
Separate BGP tables maintained
  Unicast Routing Information Base (RIB)
  Multicast Routing Information Base (MRIB)
Unicast RIB (U-RIB)
  Contains unicast prefixes for unicast forwarding
  Populated with BGP unicast NLRI
    AFI = 1, Sub-AFI = 1 or 3
Multicast RIB (M-RIB)
  Contains unicast prefixes for RPF checking
  Populated with BGP multicast NLRI
    AFI = 1, Sub-AFI = 2 or 3
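The AFI/Sub-AFI rules on the last two slides amount to a small lookup from the Sub-AFI value to the table or tables a prefix is installed in. The sketch below restates that mapping in Python; the function name `ribs_for_nlri` and the string labels are assumptions chosen for this illustration.

```python
AFI_IPV4 = 1

# Sub-AFI value -> tables populated, as listed on the slides above.
_SAFI_TO_RIBS = {
    1: ("U-RIB",),           # unicast forwarding only
    2: ("M-RIB",),           # multicast RPF check only
    3: ("U-RIB", "M-RIB"),   # both unicast forwarding and multicast RPF check
}


def ribs_for_nlri(afi, safi):
    """Return the set of RIBs a received IPv4 NLRI would populate."""
    if afi != AFI_IPV4:
        return set()  # other address families are outside this sketch
    return set(_SAFI_TO_RIBS.get(safi, ()))
```

A prefix announced with Sub-AFI 2 therefore affects only RPF decisions, which is what lets the multicast topology differ from the unicast one.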

164 MBGP Overview
MBGP allows different unicast and multicast topologies and different policies
  The same IP address may carry different meanings
    Unicast routing information
    Multicast RPF information
  For the same IPv4 address, two different NLRI with different next-hops
Can use existing or new BGP peering topology for multicast

165 MBGP NLRI Exchange BGP/MBGP configuration allows you to:
Define which NLRI type are exchanged (unicast, multicast, both) Set NLRI type through route-maps (redistribution) Define policies through standard BGP attributes (for unicast and/or multicast NLRI) No redistribution allowed between MBGP and BGP tables NLRI type can be set with set nlri route-map command

166 MBGP NLRI Exchange MRIB is populated by:
Receiving AFI/SAFI 1/2 MP_REACH_NLRI from neighbors
Configured/stored locally by:
  network <foo> <foo-mask> [nlri multicast unicast]
  redistribute <unicast> route-map <map>
  aggregate-address <foo> <foo-mask> [nlri multicast unicast]
  neighbor <foo> default-originate [nlri multicast unicast]
Note: Syntax changes in 12.0(7)T/12.1 forward

167 MBGP — NLRI Information
Congruent Topologies
[Figure labels: AS 321 and AS 123 joined by one link (.1, .2) carrying a BGP session for unicast and multicast NLRI; Sender; Receiver; prefixes missing.]
router bgp 321
 neighbor remote-as 123 nlri unicast multicast
 network nlri unicast multicast
 no auto-summary

168 MBGP — NLRI Information
Congruent Topologies
[Figure labels: AS 321 and AS 123 joined by one link (.1, .2) carrying a BGP session for unicast and multicast NLRI; Sender; Receiver; Routing Update; prefixes missing.]
Unicast Information
  NLRI: /
  AS_PATH: 321
  MED:
  Next-Hop:
  ...
Multicast Information
  MP_REACH_NLRI: /24
  AFI: 1, Sub-AFI: 2 (multicast)
  AS_PATH: 321
  MED:
  Next-Hop:

169 MBGP — NLRI Information
Incongruent Topologies
[Figure labels: AS 321 and AS 123 joined by two links (.1, .2 each), one for Unicast Traffic and one for Multicast Traffic; Sender; prefixes missing.]
router bgp 321
 . . .
 network nlri unicast multicast
 neighbor remote-as 123 nlri unicast
 neighbor remote-as 123 nlri multicast

170 MBGP — NLRI Information
Incongruent Topologies
[Figure labels: AS 321 and AS 123 joined by two links (.1, .2 each), one for Unicast Traffic and one for Multicast Traffic; Sender; Routing Update on the unicast link; prefixes missing.]
Unicast Information
  NLRI: /
  AS_PATH: 321
  MED:
  Next-Hop:

171 MBGP — NLRI Information
Incongruent Topologies
[Figure labels: AS 321 and AS 123 joined by two links (.1, .2 each), one for Unicast Traffic and one for Multicast Traffic; Sender; Routing Update on the multicast link; prefixes missing.]
Multicast Information
  MP_REACH_NLRI: /24
  AFI: 1, Sub-AFI:
  AS_PATH: 321
  MED:
  Next-Hop:

172 MBGP — NLRI Information
Incongruent Topologies
[Figure labels: AS 321 and AS 123 joined by two links (.1, .2 each), one for Unicast Traffic and one for Multicast Traffic; Sender; a Unicast RIB and a Multicast RIB, each with columns Network, Next-Hop, Path; entries missing.]

173 Populating the MRIB Just to restate
network <foo> <foo-mask> [nlri multicast unicast] redistribute <unicast> route-map <map> aggregate-address <foo> <foo-mask> [nlri multicast unicast] neighbor <foo> default-originate [nlri multicast unicast]

174 MBGP—Summary Solves part of inter-domain problem
Can exchange multicast routing information Uses standard BGP configuration knobs Permits separate unicast and multicast topologies if desired Still must use PIM to: Build distribution trees Actually forward multicast traffic PIM-SM recommended But there’s still a problem using PIM-SM here… (more on that later)

175 Midterm Exam Which protocol...
...is used to propagate forwarding state? PIM-SM - Protocol Independent Multicast-Sparse Mode ...maintains routing information for interdomain RPF checking? MBGP - Multiprotocol Border Gateway Protocol …exchanges active source information between RPs? MSDP - Multicast Source Discovery Protocol

176 Agenda What Is Multicast Multicast Protocols Multicast Configurations
Troubleshooting Futures Useful Links

177 Putting it all together..
PIM-SM, MBGP, MSDP Public multicast peering (MIX) Transit topology examples Address allocation GLOP draft-ietf-mboned-static-allocation-00.txt

178 ISP Requirements at the MIX
Current solution: MBGP + PIM-SM + MSDP Environment ISPs run iMBGP and PIM-SM (internally) ISPs multicast peer at a public interconnect Deployment Border routers run eMBGP The interfaces on interconnect run PIM-SM RPs’ MSDP peering is fully meshed All peers set a common distance for eMBGP

179 ISP Requirements at the MIX
Peering Solution: MBGP + PIM-SM + MSDP
[Figure labels: Public Interconnect joining ISP A, ISP B, and ISP C; each ISP runs PIM-SM and iMBGP internally and has an RP; eMBGP across the interconnect; AS 10888.]
Redistribute old MBONE DVMRP routes into MBGP

180 Multicast Transit Design Objectives
PIM Border Constraints Confine registers within domain Confine local groups Confine RP announcements Control SA advertisements via MSDP Border RPF check RPF check against unicast routes to multicast sources MSDP RPF check RPF check toward RP in received SAs

181 Recommended MSDP SA Filter
ftp://ftp-eng.cisco.com/ipmulticast/msdp-sa-filter.txt
! domain-local applications
access-list 111 deny ip any host !
access-list 111 deny ip any host ! Rwhod
access-list 111 deny ip any host ! Microsoft-ds
access-list 111 deny ip any host ! SVRLOC
access-list 111 deny ip any host ! SGI-Dogfight
access-list 111 deny ip any host ! SVRLOC-DA
access-list 111 deny ip any host ! hp-device-disc
! auto-rp groups
access-list 111 deny ip any host
access-list 111 deny ip any host
! scoped groups
access-list 111 deny ip any
! loopback, private addresses (RFC 1918)
access-list 111 deny ip any
access-list 111 deny ip any
access-list 111 deny ip any
access-list 111 deny ip any
access-list 111 permit ip any any

182 Multicast Boundary Filter
Minimum Recommended
! deny auto-rp groups
access-list 1 deny
access-list 1 deny
! deny scoped groups
access-list 1 deny
! permit the rest of 224/4
access-list 1 permit

183 Single-Homed, ISP RP, Non-MBGP
PIM Border Constraints Transit AS109 Tail-site Customer RP pos0/ pos0/ int pos0/0 ip pim sparse-dense-mode ip pim send-rp-announce Loopback0 scope 255 ip pim send-rp-discovery Loopback0 scope 255 int pos0/0 ip pim sparse-dense-mode /24 Receiver

184 Single-Homed, ISP RP, Non-MBGP
Checking PIM Border (RP mapping) Transit AS109 Tail-site Customer RP pos0/ pos0/ tail-gw#show ip pim rp mapping PIM Group-to-RP Mappings Group(s) /4 RP (loopback.transit.net), v2v1 Info source: (tail.transit.net), via Auto-RP Uptime: 21:57:41, expires: 00:02:08 /24 Receiver

185 Single-Homed, ISP RP, Non-MBGP
Checking PIM Border (RP mapping) Transit AS109 Tail-site Customer RP pos0/ pos0/ Transit-tail#show ip pim rp mapping PIM Group-to-RP Mappings This system is an RP (Auto-RP) This system is an RP-mapping agent Group(s) /4 RP (loopback.transit.net), v2v1 Info source: (loopback.transit.net), via Auto-RP Uptime: 22:08:47, expires: 00:02:14 /24 Receiver

186 Single-Homed, ISP RP, Non-MBGP
Border RPF Check Transit AS109 Tail-site Customer RP pos0/ pos0/ ip route ip route router bgp 109 ... network nlri unicast multicast /24 Receiver

187 Single-Homed, ISP RP, Non-MBGP
MSDP RPF Check Transit AS109 Tail-site Customer RP pos0/ pos0/ - no RP / no MSDP - no downstream RP - no downstream MSDP peering /24 Receiver

188 Single-Homed, Customer RP, Non-MBGP
PIM Border Constraints
[Figure labels: Transit AS109, Tail-site Customer, RP, pos0/ at each end of the link, /24, Receiver.]
int pos0/0
 ip pim sparse-mode
 ip pim bsr-border
 ip multicast boundary 1
ip msdp sa-filter out
ip msdp sa-filter in
Note: Access-list 111 = Recommended SA Filter

190 Single-Homed, Customer RP, Non-MBGP
Border RPF Check Transit AS109 Tail-site Customer RP pos0/ pos0/ ip route ip route router bgp 109 ... network nlri unicast multicast /24 Receiver

191 Single-Homed, Customer RP, Non-MBGP
MSDP RPF Check Transit AS109 Tail-site Customer RP pos0/ pos0/ ip msdp peer connect-source pos0/0 ip msdp peer connect-source pos0/0 /24 Receiver

192 Single-Homed, Customer RP, MBGP
PIM Border Constraints
[Figure labels: Transit AS109, Tail-site Customer, RP, pos0/ at each end of the link, /24, Receiver.]
int pos0/0
 ip pim sparse-mode
 ip pim bsr-border
 ip multicast boundary 1
ip msdp sa-filter out
ip msdp sa-filter in
Note: Access-list 111 = Recommended SA Filter

194 Single-Homed, Customer RP, MBGP
Border RPF Check
[Figure labels: Transit AS109, Tail-site Customer, RP, pos0/ at each end of the link, /24, Receiver.]
router bgp 100
 network nlri unicast multicast
 neighbor remote-as 109 nlri unicast multicast
 neighbor update-source pos0/0
router bgp 109
 neighbor remote-as 100 nlri unicast multicast
 neighbor update-source pos0/0

195 Single-Homed, Customer RP, MBGP
MSDP RPF Check Transit AS109 Tail-site Customer RP pos0/ pos0/ ip msdp peer connect-source pos0/0 ip msdp peer connect-source pos0/0 /24 Receiver

196 Which Mode—Sparse or Dense
Sparse mode
Must configure a Rendezvous Point (RP)
  Statically (on every Router)
  Using Auto-RP (Routers learn RP automatically)
  Using BSR (Routers learn RP automatically)
Very efficient
  Uses Explicit Join model
  Traffic only flows to where it’s needed
  Separate control and data planes
  Router state only created along flow paths
Deterministic topological behavior
Scales better than dense mode
  Works for both sparsely and densely populated networks

197 Which Mode—Sparse or Dense
CONCLUSION Virtually all production networks should be configured to run in Sparse mode!

198 RP Placement Q: “Where do I put the RP?”
A: “Generally speaking, it’s not critical” SPT’s are normally used by default RP is a place for source and receivers to meet Traffic does not normally flow through the RP RP is therefore not a bottleneck Exception: SPT-Threshold = Infinity Traffic stays on the shared tree RP could become a bottleneck

199 Static RP’s
Hard-coded RP address
  When used, must be configured on every router
  All routers must have the same RP address
  RP fail-over not possible
    Exception: If Anycast RPs are used.
Command
  ip pim rp-address <address> [group-list <acl>] [override]
  Optional group list specifies group range
    Default: Range = /4 (Includes Auto-RP Groups!!!!)
  Override keyword “overrides” Auto-RP information
    Default: Auto-RP learned info takes precedence
Notes:
Hard-coded RP addresses: every router in the network must be manually configured with the IP address of a single RP. If this RP fails, there is no way for routers to fail over to a standby RP. The exception to this rule is if “Anycast-RP’s” are in use. This requires MSDP to be running between each RP in the network.
Command: ip pim rp-address <address> [group-list <acl>] [override]
The ‘group-list’ allows a group range to be specified. The default is ALL multicast groups or /4.
DANGER, WILL ROBINSON!!! The default range includes the Auto-RP groups ( and ), which will cause this router to attempt to operate these groups in Sparse mode. This is normally not desirable and can often lead to problems where some routers in the network are trying to run these groups in Dense mode (which is the normal method) while others are trying to use Sparse mode. This will result in some routers in the network being starved of Auto-RP information. This, in turn, can result in members of some groups not receiving multicast traffic.
The ‘override’ keyword permits the statically defined RP address to take precedence over Auto-RP learned Group-to-RP mapping information. The default is that Auto-RP learned information has precedence.

200 Auto-RP Overview
All routers automatically learn RP address
  No configuration necessary except on:
    Candidate RPs
    Mapping Agents
Makes use of Multicast to distribute info
  Two special IANA-assigned Groups used
    Cisco-Announce
    Cisco-Discovery
  Typically Dense mode is used to forward these groups
Permits backup RP’s to be configured
Can be used with Admin-Scoping
Notes:
Auto-RP allows all routers in the network to automatically “learn” Group-to-RP mappings. There are no special configuration steps that must be taken except on the router(s) that are to function as Candidate RP’s or Mapping Agents.
Multicast is used to distribute Group-to-RP mapping information via two special, IANA-assigned multicast groups: the Cisco-Announce Group and the Cisco-Discovery Group.
Because multicast is used to distribute this information, a “Chicken and Egg” situation can occur if the above groups operate in Sparse mode. (Routers would have to know a priori what the address of the RP is before they can learn the address of the RP(s) via Auto-RP messages.) Therefore, it is recommended that these groups always run in Dense mode so that this information is flooded throughout the network.
Multiple Candidate RP’s may be defined so that in the case of an RP failure, the other Candidate RP can assume the responsibility of RP.
Auto-RP can be configured to support Administratively Scoped zones. (BSR cannot!) This can be important when trying to prevent high-rate group traffic from leaving a campus and consuming too much bandwidth on WAN links.

201 Auto-RP Fundamentals Candidate RPs
Configured via global config command ip pim send-rp-announce <intfc> scope <ttl> [group-list acl] Multicast RP-Announcement messages Sent to Cisco-Announce ( ) group Sent every rp-announce-interval (default: 60 sec) RP-Announcements contain: Group Range (default = /4) Candidate’s RP address Holdtime = 3 x <rp-announce-interval>

202 Auto-RP Fundamentals Mapping agents
Configured via global config command ip pim send-rp-discovery scope <ttl> Receive RP-Announcements Select highest C-RP IP addr as RP for group range Stored in Group-to-RP Mapping Cache with holdtimes Multicast RP-Discovery messages Sent to Cisco-Discovery ( ) group Sent every 60 seconds or when changes detected RP-Discovery messages contain: Contents of MA’s Group-to-RP Mapping Cache
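The mapping agent's job, as described on this slide and the previous one, is to remember each C-RP announcement for its holdtime and to pick the highest C-RP address per group range. A minimal Python sketch follows; the `MappingAgent` class and its method names are hypothetical, timestamps are plain numbers rather than real timers, and the group range used in the usage note is only an example value.

```python
import ipaddress

RP_ANNOUNCE_INTERVAL = 60                 # seconds (default from the slide)
HOLDTIME = 3 * RP_ANNOUNCE_INTERVAL       # holdtime carried in RP-Announcements


class MappingAgent:
    """Collects RP-Announcements and builds the Group-to-RP mapping cache."""

    def __init__(self):
        self._expiry = {}  # (group_range, candidate_rp) -> expiry time

    def on_announce(self, group_range, candidate_rp, now):
        # Each announcement refreshes the candidate's holdtime.
        self._expiry[(group_range, candidate_rp)] = now + HOLDTIME

    def mapping_cache(self, now):
        """Return {group_range: rp}, choosing the highest live C-RP address."""
        best = {}
        for (group_range, rp), expiry in self._expiry.items():
            if expiry <= now:
                continue  # holdtime ran out; candidate is no longer eligible
            addr = ipaddress.ip_address(rp)  # compare numerically, not as text
            if group_range not in best or addr > best[group_range]:
                best[group_range] = addr
        return {rng: str(addr) for rng, addr in best.items()}
```

With candidates 1.1.1.1 and 2.2.2.2 both announcing the same range, the cache maps that range to 2.2.2.2; if 2.2.2.2 stops announcing, the mapping falls back to 1.1.1.1 once its holdtime expires. The returned dictionary corresponds to what the mapping agent would place in its RP-Discovery messages.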

203 Auto-RP Fundamentals All Cisco routers
Join Cisco-Discovery ( ) group Automatic No configuration necessary Receive RP-Discovery messages Stored in local Group-to-RP Mapping Cache Information used to determine RP for group range

204 Auto-RP: From 10,000 Feet
[Figure labels: routers A, B, C, D; Mapping Agents (MA); C-RP 1.1.1.1 and C-RP 2.2.2.2 sending Announce messages.]
RP-Announcements multicast to the Cisco Announce ( ) group

205 Auto-RP: From 10,000 Feet
[Figure labels: routers A, B, C, D; two Mapping Agents (MA) sending Discovery messages; C-RP 1.1.1.1 and C-RP 2.2.2.2.]
RP-Discoveries multicast to the Cisco Discovery ( ) group

206 BSR Overview A single Bootstrap Router (BSR) is elected
Multiple Candidate BSRs (C-BSRs) can be configured
  Provides backup in case the currently elected BSR fails
C-RPs send C-RP announcements to the BSR
  C-RP announcements are sent via unicast
  BSR stores ALL C-RP announcements in the “RP-set”
BSR periodically sends BSR messages to all routers
  BSR messages contain the entire RP-set and the IP address of the BSR
  Messages are flooded hop-by-hop throughout the network, away from the BSR
All routers select the RP from the RP-set
  All routers use the same selection algorithm and select the same RP
BSR cannot be used with Admin-Scoping
BSR Overview. Bootstrap Router (BSR): A single router is elected as the BSR from a collection of Candidate BSRs. If the current BSR fails, a new election is triggered. The election mechanism is pre-emptive, based on C-BSR priority. Candidate RPs (C-RPs): C-RPs send C-RP announcements directly to the BSR via unicast. (Note: C-RPs learn the IP address of the BSR via periodic BSR messages.) The BSR stores the complete collection of all received C-RP announcements in a database called the “RP-set”. The BSR periodically sends out BSR messages to all routers in the network to let them know the BSR is still alive. BSR messages are flooded hop-by-hop throughout the network; they are multicast to the “All-PIM Routers” group ( ) with a TTL of 1. BSR messages also contain the complete “RP-set”, consisting of all C-RP announcements, and the IP address of the BSR, so that C-RPs know where to send their announcements. All routers receive the BSR messages being flooded throughout the network. Each router selects the active RP for each group range using a common hash algorithm that is run against the RP-set, so all routers in the network select the same RP for a given group range. BSR cannot be used with Admin-Scoping! Admin scoping was not considered when BSR was designed. One problem is that C-RP announcements that are unicast to the BSR cross multicast boundaries. There are several other problems as well.

207 BSR Fundamentals Candidate RPs Configured via global config command
  ip pim rp-candidate <intfc> [group-list acl]
Unicast PIMv2 C-RP messages to BSR:
  Learns IP address of BSR from BSR messages
  Sent every rp-announce-interval (default: 60 sec)
C-RP messages contain:
  Group Range (default = /4)
  Candidate’s RP address
  Holdtime = 3 x <rp-announce-interval>

208 BSR Fundamentals Bootstrap router (BSR) Receive C-RP messages
  Accepts and stores ALL C-RP messages
  Stored in Group-to-RP Mapping Cache w/holdtimes
Originates BSR messages:
  Multicast to All-PIM-Routers ( ) group (sent with a TTL = 1)
  Sent out all interfaces; propagate hop-by-hop
  Sent every 60 seconds or when changes detected
BSR messages contain:
  Contents of BSR’s Group-to-RP Mapping Cache
  IP address of active BSR

209 BSR Fundamentals Candidate bootstrap router (C-BSR)
Configured via global config command:
  ip pim bsr-candidate <intfc> <hash-length> [priority <pri>]
  <intfc>: determines IP address
  <hash-length>: sets RP selection hash mask length
  <pri>: sets the C-BSR priority (default = 0)
C-BSR with highest priority elected BSR
C-BSR IP address used as tie-breaker (highest IP address wins)
The active BSR may be preempted:
  New router w/higher BSR priority forces new election
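The election rule above (highest priority wins, highest IP address breaks ties, and a better candidate preempts the current BSR) can be sketched as a single comparison. This is an illustrative model only; the function name `elect_bsr`, the tuple format, and the addresses are assumptions, and the real election is carried out through flooded BSR messages rather than a central list.

```python
import ipaddress


def elect_bsr(candidates):
    """Return the winning (ip, priority) pair.

    candidates: iterable of (ip_address_string, priority).
    Highest priority wins; the highest IP address breaks ties.
    """
    return max(candidates, key=lambda c: (c[1], ipaddress.ip_address(c[0])))


# Hypothetical C-BSRs, all at the default priority of 0.
candidates = [("10.0.0.1", 0), ("10.0.0.9", 0), ("10.0.0.10", 0)]
current = elect_bsr(candidates)

# A new router with a higher priority preempts the active BSR.
after_preempt = elect_bsr(candidates + [("10.0.0.2", 5)])
```

Re-running the same comparison whenever the candidate set changes models the preemptive behavior described on the slide.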

210 BSR Fundamentals All PIMv2 routers Receive BSR messages
  Stored in local Group-to-RP Mapping Cache
  Information used to determine active BSR address
Selects RP using hash algorithm:
  Selected from local Group-to-RP Mapping Cache
  All routers select same RP using same algorithm
  Permits RP load balancing across group range
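The slides do not spell out the hash itself. The sketch below uses the hash function defined in the PIM-SM specification (RFC 2362, carried forward in RFC 7761), which is an addition here rather than something taken from the presentation. It is simplified: it assumes every RP in the set already covers the group with equal priority, so only the hash step and the highest-address tie-breaker are shown. The names `hash_value` and `select_rp` are hypothetical.

```python
import ipaddress


def hash_value(group, rp, hash_mask_len):
    """PIM-SM hash: Value(G, M, C) from the PIM-SM specification."""
    g = int(ipaddress.IPv4Address(group))
    c = int(ipaddress.IPv4Address(rp))
    # Hash mask M: the leading hash_mask_len bits of the group address.
    mask = (0xFFFFFFFF << (32 - hash_mask_len)) & 0xFFFFFFFF
    return (1103515245 * ((1103515245 * (g & mask) + 12345) ^ c) + 12345) % 2**31


def select_rp(group, rp_set, hash_mask_len=30):
    """Every router runs the same function, so every router picks the same RP."""
    return max(
        rp_set,
        key=lambda rp: (hash_value(group, rp, hash_mask_len),
                        int(ipaddress.IPv4Address(rp))),
    )
```

Because only the masked bits of the group address enter the hash, groups that share those bits map to the same RP, while different blocks of groups can land on different RPs. That is the load balancing the slide refers to, and the hash mask length set by `<hash-length>` controls the block size.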

211 BSR—From 10,000 feet PIMv2 Sparse Mode BSR G F A D C-RP C-RP B C
[Diagram: a PIMv2 Sparse Mode network with routers A through G, a BSR, and two C-RPs. Each C-RP sends a C-RP Advertisement (unicast) to the BSR, and BSR messages are flooded hop-by-hop to all routers.]

212 Auto-RP vs BSR
Auto-RP:
  Easy to configure
  Supports Admin. Scoped Zones
  Works in either PIMv1 or PIMv2 Cisco networks
  Cisco proprietary
BSR:
  Does not support Admin. Scoped Zones
  Non-proprietary (PIMv2 networks only)

213 Anycast RP—Overview Uses single statically defined RP address
Two or more routers have same RP address
  RP address defined as a Loopback interface
  Loopback address advertised as a host route
Senders and receivers Join/Register with closest RP
  Closest RP determined from the unicast routing table
MSDP session(s) run between all RPs
  Informs RPs of sources in other parts of network
  RPs join SPT to active sources as necessary

214 Anycast RP—Convergence
[Diagram labels: Src, Rec, RP1, X, RP2, MSDP, Rec]

215 Anycast RP—Convergence
[Diagram labels: Src, Rec, RP1, X, RP2, Rec, Rec, Rec, Src]

216 PIM Configuration Steps
Enable Multicast Routing on every router
Configure every interface for PIM
  Use Sparse-Dense Mode (to support Auto-RP)
Configure the RP (if using Sparse Mode)
  Static RP addressing
    RP address must be configured on every router
  Using Auto-RP or BSR
    Configure certain routers as Candidate RP(s)
    All other routers automatically learn elected RP

217 Fool-Proof Multicast Configuration
[Diagram: routers A, B, C, and D in a PIM Sparse Mode network, with two routers labeled RP/Mapping Agent.]
Router A is configured as the RP mapping agent. Router A listens on the RP announce group for announcements from C-RPs. It then sends the group-RP mappings to the RP discovery group so that routers D, E, F, and G discover the group-RP mappings automatically. Router A could also be the administrative boundary to external routing domains. Routers A and B are both configured as C-RPs for all groups (group range /4) and send candidate announcements to the RP announce group. Auto-RP allows for multiple hot backup C-RPs for a given group range. (The C-RP with the highest IP address is selected.)
On every router:
  ip multicast-routing
  ip pim accept-rp auto-rp
On every interface:
  ip pim sparse-dense-mode
On routers B and C:
  ip pim send-rp-announce loopback0 scope <ttl>
  ip pim send-rp-discovery loopback0 scope <ttl>

218 Classic Partial Multicast Cloud Mistake
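[Diagram: src and rcvr are connected by two parallel links, a T1 and a 56K line. The T1 interfaces are configured with "no ip pim dense-mode" and the 56K interfaces with "ip pim dense-mode". Result: RPF Failure!] Network Engineer: “We’ll just use the spare 56K line for the IP Multicast traffic and not the T1.”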
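Why this fails can be shown with a small model of the Reverse Path Forwarding (RPF) check: a router accepts a multicast packet only if it arrives on the interface that the unicast routing table would use to reach the packet's source. The sketch below is a simplification under stated assumptions: the route table, interface names, and addresses are invented for illustration, and `rpf_interface` and `rpf_check` are hypothetical helper names.

```python
import ipaddress


def rpf_interface(source, routes):
    """Longest-prefix match for the source; returns the outgoing interface or None."""
    src = ipaddress.ip_address(source)
    matches = [(net, ifc) for net, ifc in routes if src in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def rpf_check(source, arrival_interface, routes):
    """Accept only packets that arrive on the interface leading back to the source."""
    return rpf_interface(source, routes) == arrival_interface


# Hypothetical unicast table on the receiver-side router: the T1 is the
# best path back to the source network.
routes = [
    (ipaddress.ip_network("0.0.0.0/0"), "Serial1-56K"),
    (ipaddress.ip_network("192.0.2.0/24"), "Serial0-T1"),
]

via_56k = rpf_check("192.0.2.10", "Serial1-56K", routes)  # multicast sent over the 56K line
via_t1 = rpf_check("192.0.2.10", "Serial0-T1", routes)
```

In this model the packet arriving over the 56K line is dropped because unicast routing still prefers the T1, which matches the RPF failure in the slide.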

219 Beginner’s Guide to IP Multicast
JUST DO IT! Enable multicast on ALL interfaces in ALL routers in a given network. Failure to do so will require complex (and typically unnecessary) Multicast Traffic Engineering.

220 Agenda What Is Multicast Multicast Protocols Multicast Configurations
Troubleshooting Futures Useful Links

221 Debugging and Troubleshooting
Monitoring Tools
  sdr_monitor
    Kamil Sarac and Kevin Almeroth - UCSB
    Several sites on Abilene and vBNS
  mlisten
  Abilene Core Node Router Proxy

222 Debugging and Troubleshooting
Multicast Looking Glass Written by NLANR Engineering Services Based on DIGEX Looking Glass Queries Pittsburgh GigaPoP multicast border router for multicast information

223 Debugging and Troubleshooting
Mtrace
  No longer gives coherent stats
  Won’t even traverse some routers
  Basically useless
Multicast Debugging Handbook
  draft-ietf-mboned-mdh-02.txt

224 Debugging and Troubleshooting (cont.)
Useful Cisco Commands
  show ip sdr [detail | address | <“session name”>]
  show ip mroute [summary | address | active]
  show ip rpf <ip address>
  show ip msdp [summary | peer | count | sa-cache]
  show ip mbgp [address | neighbor [ip [adv]]]

225 >show ip sdr SDR Cache - 67 entries 2/5 ASSIST J Kerr Repeat 3/4 LAN Systems - NT & SMS in Labs 4J KRVM Radio ABONE Discussion BGMP/MASC discussion CANARIE 5th Annual Advanced Networks Workshop(MPEG1) dslab-test ECG - Tunnel Test from UB GT test High Bandwidth Bird (HBB) iBAND3 Keynotes [MP3] at Stardust.com IMJ -- Channel 1 IMJ -- Channel 2 Korea Radio Broadcasting from SNU ...

226 >show ip sdr detail
SDR Cache - 68 entries
Session Name: BGMP/MASC discussion
 Description: BGMP/MASC discussion
 Group: , ttl: 0, Contiguous allocation: 1
 Lifetime: from 12:30:00 EDT May until 19:50:00 EDT Jun
 Uptime: 12:17:02, Last Heard: 11:27:03
 Announcement source: , destination:
 Created by: pavlin IN IP
 Phone number: Pavlin Radoslavov (USC/ISI) (ext. 8125)
 Pavlin Radoslavov (USC/ISI)
 URL:
 Media: audio RTP/AVP 0
  Media group: , ttl: 127
  Attribute: ptime:40
 Media: whiteboard udp wb
  Media group: , ttl: 127
 Media: text udp nt
  Media group: , ttl: 127

227 >show ip sdr "NASA TV - Broadcast from NASA HQ"
SDR Cache - 68 entries
Session Name: NASA TV - Broadcast from NASA HQ
 Description: NASA TV Multicasting from NASA HQ
 Group: , ttl: 0, Contiguous allocation: 1
 Lifetime: from 12:36:16 EDT Jun until 11:36:16 EST Dec
 Uptime: 1d17h, Last Heard: 18:21:10
 Announcement source: , destination:
 Created by: ellery IN IP
 Phone number: (Ellery D. Coleman)
 URL:
 Media: audio RTP/AVP 0
  Media group: , ttl: 127
 Media: video RTP/AVP 31
  Media group: , ttl: 127
  Attribute: framerate:9
  Attribute: quality:8
  Attribute: grayed:0

228 >show ip bgp neighbors 192.88.115.113
BGP neighbor is , remote AS 145, external link
 Index 2, Offset 0, Mask 0x4
 NEXT_HOP is always this router
 BGP version 4, remote router ID
 BGP state = Established, table version = 2, up for 05:54:44
 Last read 00:00:05, last send 00:00:07
 Hold time 90, keepalive interval 30 seconds
 Neighbor NLRI negotiation:
  Configured for unicast and multicast routes
  Peer negotiated unicast and multicast routes
  Exchanging unicast and multicast routes
 ...
 Number of unicast/multicast prefixes received 0/6773

229 >show ip mbgp neigh 192.88.115.113 adv
Network Next Hop Metric LocPrf Weight Path *> i *> / i

230 Agenda What Is Multicast Multicast Protocols Multicast Configurations
Troubleshooting Futures Useful Links

231 Futures PIM modes designed for specific scalings and applications
Bidir-PIM
Source-Specific Multicast (SSM)

232 Bidir PIM Designed for many to many multicast groups
Builds a single (*,G) shared tree which all sources use Relieves the router from having to keep state for each and every source
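The state saving can be illustrated by counting forwarding entries for a many-to-many group. This is a rough model, not router behavior: it treats one source tree as one (S,G) entry and the Bidir shared tree as one (*,G) entry per group. The function names and the 50-participant example are assumptions.

```python
def source_tree_entries(flows):
    """One (S,G) entry per distinct source and group pair."""
    return {(source, group) for source, group in flows}


def shared_tree_entries(flows):
    """Bidir-PIM style: one (*,G) entry per group, however many sources send."""
    return {("*", group) for _source, group in flows}


# Hypothetical conference: 50 participants all sending to one group.
flows = [(f"10.0.0.{i}", "239.1.1.1") for i in range(1, 51)]

per_source_state = len(source_tree_entries(flows))
bidir_state = len(shared_tree_entries(flows))
```

In this model the per-source count grows with every new sender, while the shared-tree count depends only on the number of groups.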

233 SSM Designed to allow receivers to specify the single source from which they wish to receive traffic, such as when viewing a broadcast from a single provider. Requires IGMPv3 so that receivers can tell the first-hop router which source they want. No RP is used. Requires an out-of-band method to discover the proper source (e.g. a web page or

234 Agenda What Is Multicast Multicast Protocols Multicast Configurations
Troubleshooting Futures Useful Links

235 Useful Links Multicast Documentation (FAQ and Tools): Cisco IOS Documentation Cisco IP Multicast: ftp://ftp-eng.cisco.com/ipmulticast/ Introduction to Multicast Routing: Send mail to

