Fundamentals of IP Multicast


1 Fundamentals of IP Multicast

2 Unicast vs Multicast
Unicast transmission sends multiple copies of data, one copy for each receiver.
Ex: the host transmits 3 copies of the data and the network forwards each to one of 3 separate receivers; the host can only send to one receiver at a time.
Multicast transmission sends a single copy of data to multiple receivers.
Ex: the host transmits 1 copy of the data and the network replicates it at the last possible hop for each receiver; each packet exists only one time on any given network, and the host can send to multiple receivers simultaneously.

3 Multicast Addressing
IP group addresses: 224.0.0.0–239.255.255.255
"Class D" addresses = high-order bits of "1110"
Special reserved group addresses:
224.0.0.1: All systems on this subnet
224.0.0.2: All routers on this subnet
224.0.0.4: DVMRP routers
IANA Reserved Addresses
IANA is the responsible authority for the assignment of reserved Class D addresses. Other interesting reserved addresses are:
224.0.0.2: PIMv1 (ALL-ROUTERS, due to transport in IGMPv1)
224.0.0.5: OSPF ALL ROUTERS (RFC 1583)
224.0.0.6: OSPF DESIGNATED ROUTERS (RFC 1583)
224.0.0.9: RIP2 routers
224.0.0.13: PIMv2
224.0.1.39: CISCO-RP-ANNOUNCE (Auto-RP)
224.0.1.40: CISCO-RP-DISCOVERY (Auto-RP)
"ftp://ftp.isi.edu/in-notes/iana/assignments/multicast-addresses" is the authoritative source for reserved multicast addresses.
Additional Information
"Administratively Scoped IP Multicast", June 1997, has a good discussion of scoped addresses. This document is available at: "draft-ietf-mboned-admin-ip-space-03.txt"
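As a small illustration of the "high-order bits of 1110" rule above, the Class D test can be sketched in a few lines (an illustrative sketch only; Python's ipaddress module is used just for parsing):

```python
import ipaddress

def is_class_d(addr: str) -> bool:
    """True if addr falls in 224.0.0.0-239.255.255.255 (Class D)."""
    first_octet = int(ipaddress.IPv4Address(addr)) >> 24
    # Class D means the four high-order bits of the address are 1110.
    return (first_octet >> 4) == 0b1110

print(is_class_d("224.0.0.1"))        # True  (all systems on this subnet)
print(is_class_d("239.255.255.255"))  # True  (top of the Class D range)
print(is_class_d("192.0.2.1"))        # False (unicast)
```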

4 IP Multicast Addressing
Global Scope Addresses
Addresses 224.0.1.0 through 238.255.255.255
Allocated dynamically
Administratively Scoped Addresses
239.0.0.0/8: reserved for use inside of private domains
Static Group Address Assignment
Group range: 233.0.0.0/8
Your AS number is inserted in the middle two octets
Remaining low-order octet used for group assignment
Until MASC has been fully specified and deployed, many content providers in the Internet require "something" to get going in terms of address allocation. This is being addressed with a temporary method of static multicast address allocation, defined in "draft-ietf-mboned-glop-addressing-xx.txt". The basic concept behind this methodology is as follows:
Use the 233/8 address space for static address allocation
The middle two octets of the group address contain your AS number
The final octet is available for group assignment
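The static assignment scheme described above (233/8 with the AS number in the middle two octets) can be sketched as follows; the AS number 5662 is a hypothetical example:

```python
def glop_prefix(asn: int) -> str:
    """Derive the statically assigned 233/8 group range for a 16-bit AS."""
    if not 0 <= asn <= 0xFFFF:
        raise ValueError("this scheme assumes a 16-bit AS number")
    # High byte of the AS goes in the second octet, low byte in the third.
    return f"233.{asn >> 8}.{asn & 0xFF}.0/24"

print(glop_prefix(5662))  # -> 233.22.30.0/24
```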

5 Multicast Protocol Basics
Multicast Distribution Trees
Multicast Forwarding
Types of Multicast Protocols: Dense/Sparse Mode Protocols, MBGP, MSDP
Multicast Distribution Trees: define the path down which traffic flows from source to receiver(s).
Multicast Forwarding: unlike unicast forwarding, which uses the destination address to make its forwarding decision, multicast forwarding uses the source address to make its forwarding decision.
Types of Multicast Protocols: Dense Mode protocols and Sparse Mode protocols.

6 Multicast Distribution Trees
Source or Shortest Path trees Uses more memory (S x G) but you get optimal paths from source to all receivers; minimizes delay Shared trees Uses less memory (G) but you may get sub-optimal paths from source to all receivers; may introduce extra delay Characteristics of Distribution Trees Source or Shortest Path Tree Characteristics Provides optimal path (shortest distance and minimized delay) from source to all receivers, but requires more memory to maintain Shared Tree Characteristics Provides sub-optimal path (may not be shortest distance and may introduce extra delay) from source to all receivers, but requires less memory to maintain

7 Multicast Distribution Trees
Shortest Path or Source Distribution Tree
Notation: (S, G), where S = Source and G = Group
Shortest Path Trees, aka Source Trees
A shortest path or source distribution tree is a minimal spanning tree with the lowest cost from the source to all leaves of the tree. We forward packets on the Shortest Path Tree according to both the source address S that the packets originated from and the group address G that the packets are addressed to. For this reason we refer to the forwarding state on the SPT by the notation (S, G) (pronounced "S comma G"), where "S" is the IP address of the source and "G" is the multicast group address.
Example 1: The shortest path between Source 1 and Receiver 1 is via Routers A and C, and the shortest path to Receiver 2 is one additional hop via Router E.

8 Multicast Distribution Trees
Shared Distribution Tree
Notation: (*, G), where * = all sources and G = Group
A shared distribution tree is a tree whose root is a shared point in the network down which multicast data flows to reach the receivers in the network. In PIM-SM, this shared point is called the Rendezvous Point (RP).
Multicast traffic is forwarded down the Shared Tree according to just the group address G that the packets are addressed to, regardless of source address. For this reason we refer to the forwarding state on the shared tree by the notation (*, G) (pronounced "star comma G"), where "*" means any source and "G" is the group address.
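The two kinds of forwarding state, (S, G) and (*, G), can be sketched as lookup keys in a toy multicast table; all addresses and interface names below are illustrative, not from any real implementation:

```python
# Toy multicast forwarding table: keys are (source, group), where "*"
# stands for any source; values hold the incoming interface (iif) and
# the outgoing interface list (oifs).
state = {
    ("192.0.2.10", "224.1.1.1"): {"iif": "E0", "oifs": ["E1", "E2"]},  # (S, G)
    ("*", "224.2.2.2"):          {"iif": "E3", "oifs": ["E1"]},        # (*, G)
}

def lookup(src, group):
    # Prefer the more specific (S, G) entry, then fall back to (*, G).
    return state.get((src, group)) or state.get(("*", group))

print(lookup("192.0.2.10", "224.1.1.1")["oifs"])   # matched (S, G) state
print(lookup("198.51.100.7", "224.2.2.2")["oifs"]) # matched (*, G) state
```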

9 Multicast Distribution Trees
Shared Distribution Tree (cont.)
Before traffic can be sent down the Shared Tree it must somehow be sent to the root of the tree. In classic PIM-SM, this is accomplished by the RP joining the Shortest Path Tree back to each source so that the traffic can flow to the RP and from there down the Shared Tree.
In order to trigger the RP to take this action, it must somehow be notified when a source goes active in the network. In PIM-SM, this is accomplished by first-hop routers (i.e. routers directly connected to an active source) sending a special Register message to the RP to inform it of the active source.
In the example above, the RP has been informed of Sources 1 and 2 being active and has subsequently joined the SPT to these sources.

10 Multicast Distribution Trees
How Are Distribution Trees Built?
PIM: uses the existing unicast routing table plus a Join/Prune/Graft mechanism to build the tree.
DVMRP: uses the DVMRP routing table plus a special Poison-Reverse mechanism to build the tree.
MOSPF: uses an extension to OSPF's link-state mechanism to build the tree.
CBT: uses the underlying unicast routing table plus Join/Prune/Graft mechanisms.
Distribution trees are built in a variety of ways, depending upon the multicast routing protocol employed.
PIM utilizes the underlying unicast routing table (any unicast routing protocol) plus:
Join: routers add their interfaces and/or send PIM-JOIN messages upstream to establish themselves as branches of the tree when they have interested receivers attached.
Prune: routers prune their interfaces and/or send PIM-PRUNE messages upstream to remove themselves from the distribution tree when they no longer have interested receivers attached.
Graft: routers send PIM-GRAFT messages upstream when they have a pruned interface and have already sent PIM-PRUNEs upstream, but receive an IGMP host report for the group that was pruned; routers must reestablish themselves as branches of the distribution tree because of newly interested receivers attached.
DVMRP utilizes a special RIP-like multicast routing table plus:
Poison-Reverse: a special metric of Infinity (32) plus the originally received metric, used to signal that the router should be placed on the distribution tree for the source network.
Prunes & Grafts: routers send Prunes and Grafts up the distribution tree, similar to PIM-DM.
MOSPF utilizes the underlying OSPF unicast routing protocol's link-state advertisements to build (S, G) trees; each router maintains an up-to-date image of the topology of the entire network.
CBT utilizes the underlying unicast routing table and the Join/Prune/Graft mechanisms (much like PIM-SM).

11 Multicast Forwarding
Multicast routing is backwards from unicast routing: unicast routing is concerned with where the packet is going, while multicast routing is concerned with where the packet came from. Multicast routing uses "Reverse Path Forwarding" (RPF).
Routers must know the packet's origin, rather than its destination (the opposite of unicast): the origination IP address denotes a known source, while the destination IP address denotes an unknown group of receivers.
Multicast routing utilizes Reverse Path Forwarding (RPF):
Broadcast: floods packets out all interfaces except the incoming one from the source, initially assuming every host on the network is part of the multicast group.
Prune: eliminates tree branches without multicast group members, cutting off transmission to LANs without interested receivers.
Selective Forwarding: requires its own integrated unicast routing protocol.

12 Multicast Forwarding Example: RPF Checking
Source: 151.10.3.21
The source floods the network with multicast data. Each router has a designated incoming interface (the RPF interface) on which multicast data can be received from a given source. Each router receives multicast data on one or more interfaces, but performs an RPF check to prevent duplicate forwarding.
Example: a router receives multicast data on two interfaces:
1) It performs an RPF check on multicast data received on interface E0; the RPF check succeeds because the data was received on the specified incoming interface from source 151.10.3.21, so the data is forwarded through all outgoing interfaces on the multicast distribution tree.
2) It performs an RPF check on multicast data received on interface E1; the RPF check fails because the data was not received on the specified incoming interface from source 151.10.3.21, so the data is silently dropped.
RPF checks fail when packets arrive on the wrong interface!
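The RPF check in this example can be sketched as a comparison between the arrival interface and the interface the unicast table would use to reach the source; only the source address 151.10.3.21 comes from the slide, and the routing table below is hypothetical:

```python
import ipaddress

# Hypothetical unicast routing table: prefix -> interface used to reach it.
# The route back toward source 151.10.3.21 points out interface E0.
unicast_routes = {"151.10.0.0/16": "E0"}

def rpf_check(source: str, arrival_iface: str) -> bool:
    """Accept a multicast packet only if it arrived on the interface
    that the unicast table uses to reach the packet's source."""
    for prefix, iface in unicast_routes.items():
        if ipaddress.ip_address(source) in ipaddress.ip_network(prefix):
            return iface == arrival_iface
    return False  # no route back to the source: fail the check

print(rpf_check("151.10.3.21", "E0"))  # True  -> forward down the tree
print(rpf_check("151.10.3.21", "E1"))  # False -> silently drop
```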

13 Types of Multicast Protocols
Dense mode uses a "push" model: traffic is flooded throughout the network and pruned back where it is unwanted. This flood-and-prune behavior repeats (typically every 3 minutes).
Sparse mode uses a "pull" model: traffic is sent only to where it is requested, via explicit Join behavior.
Dense-mode multicast protocols initially flood/broadcast multicast data to the entire network, then prune back paths that don't have interested receivers.
Sparse-mode multicast protocols assume no receivers are interested unless they explicitly ask for the traffic.

14 Dense-Mode Protocols
DVMRP: Distance Vector Multicast Routing Protocol
MOSPF: Multicast OSPF
PIM-DM: Protocol Independent Multicast (Dense Mode)

15 DVMRP — Evaluation
Widely used on the old MBONE (being phased out); the oldest multicast routing protocol.
Significant scaling problems:
Slow convergence (RIP-like behavior)
Significant amount of multicast routing state information stored in routers: (S, G) everywhere
No support for shared trees
Maximum number of hops < 32
Not appropriate for large-scale production networks, due to its flood-and-prune behavior and poor scalability.
Appropriate for a large number of densely distributed receivers located in close proximity to the source.
Not appropriate for few interested receivers (it assumes everyone wants the data initially) or for groups sparsely represented over a WAN (it floods frequently).

16 PIM-DM
Protocol independent: supports all underlying unicast routing protocols including static, RIP, IGRP, EIGRP, IS-IS, BGP, and OSPF.
Uses Reverse Path Forwarding (RPF): floods the network with multicast data, then prunes back paths based on multicast group membership.
An Assert mechanism is used to prune off redundant flows.
Interoperates with DVMRP.
Appropriate for testing, small implementations, and pilot networks.

17 PIM-DM Flood & Prune: Initial Flooding
PIM-DM is similar to DVMRP in that it initially floods multicast traffic to all parts of the network. However, unlike DVMRP, which pre-builds a "Truncated Broadcast Tree" (TBT) that is used for initial flooding, PIM-DM initially floods traffic out ALL non-RPF interfaces where there is either another PIM-DM neighbor or a directly connected member of the group.
The reason that PIM-DM does not use Truncated Broadcast Trees to pre-build a spanning tree for each source network is that this would require running a separate routing protocol, as DVMRP does. (At the very least, some sort of Poison-Reverse messages would have to be sent to build the TBT.) Instead, PIM-DM uses other mechanisms to prune back the traffic flows and build source trees.
Initial Flooding Example: multicast traffic being sent by the source is flooded throughout the entire network. As each router receives the multicast traffic via its RPF interface (the interface in the direction of the source), it forwards the multicast traffic to all of its PIM-DM neighbors. Note that this results in some traffic arriving via a non-RPF interface, as in the case of the two routers in the center of the drawing. (Packets arriving via the non-RPF interface are discarded.) These non-RPF flows are normal for the initial flooding of data and will be corrected by the normal PIM-DM pruning mechanism.
(S, G) state is created in every router in the network!

18 PIM-DM Flood & Prune: Pruning Unwanted Traffic
In the example above, PIM Prunes (denoted by the dashed arrows) are sent to stop the flow of unwanted traffic.
Prunes are sent on the RPF interface when the router has no downstream members that need the multicast traffic.
Prunes are also sent on non-RPF interfaces to shut off the flow of multicast traffic that is arriving via the wrong interface (i.e. traffic arriving via an interface that is not on the shortest path to the source). An example of this can be seen at the second router from the receiver near the center of the drawing: multicast traffic is arriving via a non-RPF interface from the router above (in the center of the network), which results in a Prune message.

19 PIM-DM Flood & Prune: Results After Pruning
In the final drawing in our example, multicast traffic has been pruned off of all links except where it is necessary. This results in a Shortest Path Tree (SPT) being built from the source to the receiver.
Even though the flow of multicast traffic is no longer reaching most of the routers in the network, (S, G) state still remains in ALL routers in the network. This (S, G) state will remain until the source stops transmitting.
In PIM-DM, Prunes expire after three minutes. This causes the multicast traffic to be re-flooded to all routers just as was done in the initial flooding. This periodic (every 3 minutes) flood-and-prune behavior is normal and must be taken into account when the network is designed to use PIM-DM.
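The 3-minute prune-expiry cycle described above can be sketched with a simulated clock; the 180-second timeout comes from the slides, while the class and method names are illustrative:

```python
PRUNE_TIMEOUT = 180  # seconds; PIM-DM prunes expire after three minutes

class PruneState:
    """Toy model of prune state on one interface of a PIM-DM router."""
    def __init__(self):
        self.pruned_at = None  # simulated time the last Prune was received

    def prune(self, now):
        self.pruned_at = now

    def is_pruned(self, now):
        # Traffic stays pruned only until the prune state times out;
        # after that, flooding resumes until a new Prune arrives.
        return self.pruned_at is not None and now - self.pruned_at < PRUNE_TIMEOUT

s = PruneState()
s.prune(now=0)
print(s.is_pruned(now=60))   # True: traffic is still pruned
print(s.is_pruned(now=200))  # False: prune expired, traffic is re-flooded
```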

20 PIM-SM (RFC 2362)
Supports both source and shared trees.
Assumes no hosts want multicast traffic unless they specifically ask for it.
Uses a Rendezvous Point (RP): senders and receivers "rendezvous" at this point to learn of each other's existence. Senders are "registered" with the RP by their first-hop router; receivers are "joined" to the Shared Tree (rooted at the RP) by their local Designated Router (DR).
Regardless of the location/number of receivers, senders register with the RP and send a single copy of multicast data through it to registered receivers. Regardless of the location/number of sources, group members join the group and initially receive data through the RP.
Appropriate for wide-scale deployment for both densely and sparsely populated groups in the enterprise; the optimal choice for all production networks regardless of size and membership density.

21 PIM-SM Shared Tree Join
(*, G) state is created only along the Shared Tree.
In this example, an active receiver (attached to the leaf router at the bottom of the drawing) has joined multicast group "G". The leaf router knows the IP address of the Rendezvous Point (RP) for group G, and it sends a (*, G) Join for this group towards the RP. This (*, G) Join travels hop-by-hop to the RP, building a branch of the Shared Tree that extends from the RP to the last-hop router directly connected to the receiver. At this point, group G traffic can flow down the Shared Tree to the receiver.

22 PIM-SM Sender Registration
(S, G) state is created only along the Source Tree.
As soon as an active source for group G sends a packet, the leaf router that is attached to this source is responsible for "registering" this source with the RP and requesting the RP to build a tree back to that router. The source's first-hop router encapsulates the multicast data from the source in a special PIM-SM message called the Register message and unicasts it to the RP.
When the RP receives the Register message, it does two things:
It de-encapsulates the multicast data packet inside the Register message and forwards it down the Shared Tree.
It also sends an (S, G) Join back towards the source network S to create a branch of an (S, G) Shortest Path Tree. This results in (S, G) state being created in all the routers along the SPT, including the RP.

23 PIM-SM Sender Registration (cont.)
(S, G) traffic begins arriving at the RP via the Source Tree.
As soon as the SPT is built from the source's first-hop router to the RP, multicast traffic begins to flow natively from source S to the RP. Once the RP begins receiving data natively (i.e. down the SPT) from source S, it sends a Register-Stop back to the source's first-hop router to inform it that it can stop sending the unicast Register messages.

24 PIM-SM Sender Registration (cont.)
Source traffic now flows natively along the SPT to the RP. From the RP, traffic flows down the Shared Tree to the receivers.
At this point, multicast traffic from the source is flowing down the SPT to the RP and from there down the Shared Tree to the receiver.
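The Register / Register-Stop exchange in slides 22-24 can be sketched as a small state machine on the first-hop router; all names and the RP address below are hypothetical:

```python
class FirstHopRouter:
    """Toy model of a PIM-SM first-hop router's Register logic."""
    def __init__(self, rp):
        self.rp = rp
        self.registering = True  # no Register-Stop received yet

    def on_multicast_packet(self, pkt):
        if self.registering:
            # Encapsulate the data in a Register message, unicast to the RP.
            return ("REGISTER", self.rp, pkt)
        # RP has joined the SPT; send natively down the tree instead.
        return ("NATIVE", pkt)

    def on_register_stop(self):
        # RP now receives the traffic natively via the SPT.
        self.registering = False

r = FirstHopRouter(rp="10.0.0.1")
print(r.on_multicast_packet("data1")[0])  # REGISTER (unicast to the RP)
r.on_register_stop()
print(r.on_multicast_packet("data2")[0])  # NATIVE (flows down the SPT)
```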

25 PIM-SM SPT Switchover
The last-hop router joins the Source Tree; additional (S, G) state is created along the new part of the Source Tree.
PIM-SM has the capability for last-hop routers (i.e. routers with directly connected members) to switch to the Shortest Path Tree and bypass the RP if the traffic rate is above a set threshold called the "SPT-Threshold". The default value of the SPT-Threshold in Cisco routers is zero. This means that the default behavior for PIM-SM leaf routers attached to active receivers is to immediately join the SPT to the source as soon as the first packet arrives via the (*, G) Shared Tree.
In the example above, the last-hop router (at the bottom of the drawing) sends an (S, G) Join message toward the source to join the SPT and bypass the RP. This (S, G) Join message travels hop-by-hop to the first-hop router (i.e. the router connected directly to the source), thereby creating another branch of the SPT. This also creates (S, G) state in all the routers along this branch of the SPT.

26 PIM-SM SPT Switchover
Traffic begins flowing down the new branch of the Source Tree. Additional (S, G) state is created along the Shared Tree to prune off (S, G) traffic.
Once the branch of the Shortest Path Tree has been built, (S, G) traffic begins flowing to the receiver via this new branch. Next, special (S, G)RP-bit Prune messages are sent up the Shared Tree to prune off the redundant (S, G) traffic that is still flowing down the Shared Tree. If this were not done, (S, G) traffic would continue flowing down the Shared Tree, resulting in duplicate (S, G) packets arriving at the receiver.

27 PIM-SM SPT Switchover
(S, G) traffic is now pruned off of the Shared Tree and flows to the receiver via the Source Tree.
As the (S, G)RP-bit Prune message travels up the Shared Tree, special (S, G)RP-bit Prune state is created along the Shared Tree that selectively prevents this traffic from flowing down the Shared Tree. At this point, (S, G) traffic is flowing directly from the first-hop router to the last-hop router and from there to the receiver.

28 PIM-SM SPT Switchover
The flow of (S, G) traffic is no longer needed by the RP, so it prunes the flow of (S, G) traffic.
At this point, the RP no longer needs the flow of (S, G) traffic, since all branches of the Shared Tree (in this case there is only one) have pruned off the flow of (S, G) traffic. As a result, the RP will send (S, G) Prunes back toward the source to shut off the flow of the now-unnecessary (S, G) traffic to the RP.
Note: this occurs if and only if the RP has received an (S, G)RP-bit Prune on all interfaces on the Shared Tree.

29 PIM-SM SPT Switchover
(S, G) traffic now flows to the receiver via a single branch of the Source Tree.
As a result of the SPT switchover, (S, G) traffic now flows only from the first-hop router to the last-hop router and from there to the receiver. Notice that traffic is no longer flowing to the RP. This SPT-switchover mechanism shows that PIM-SM also supports the construction and use of SPT (S, G) trees, but in a much more economical fashion than PIM-DM in terms of forwarding state.

30 PIM-SM — Evaluation
Advantages:
Traffic only sent down "joined" branches.
Can switch dynamically to optimal source trees for high-traffic sources.
Unicast routing protocol-independent.
Basis for inter-domain multicast routing when used with MBGP and MSDP.
Can be used for sparse or dense distribution of multicast receivers (no necessity to flood).
Interoperates with DVMRP.
Potential issues:
Requires an RP during the initial setup of the distribution tree (receivers can switch to the shortest path tree once traffic begins arriving via the RP, if the shared path is sub-optimal).

31 MBGP Overview
MBGP: Multiprotocol BGP
Defined in RFC 2283 (extensions to BGP)
Can carry different types of routes: IPv4 unicast, IPv4 multicast, IPv6 unicast; all may be carried in the same BGP session
Does not propagate multicast state info; you still need PIM to build distribution trees
Same path selection and validation rules: AS-Path, LocalPref, MED, …
Multiprotocol BGP (MBGP) is defined in RFC 2283. This RFC defines extensions to the existing BGP protocol to allow it to carry more than just IPv4 route prefixes. Examples of some of the new types of routing information include (but are not limited to): IPv4 prefixes for unicast routing, IPv4 prefixes for multicast RPF checking, and IPv6 prefixes for unicast routing.
A common misconception is that MBGP is a replacement for PIM. This is incorrect: MBGP does not propagate any multicast state information, nor does it build any sort of multicast distribution trees. MBGP can distribute unicast prefixes that can be used for the multicast RPF check.
Because MBGP is an extension to the existing BGP protocol, the same basic rules apply to path selection, path validation, etc.

32 MBGP Overview
Separate BGP tables are maintained:
Unicast Routing Information Base (U-RIB): contains unicast prefixes for unicast forwarding; populated with BGP unicast NLRI.
Multicast Routing Information Base (M-RIB): contains unicast prefixes for RPF checking; populated with BGP multicast NLRI.
A new BGP "nlri" keyword specifies which RIB, allowing different unicast/multicast topologies or policies.
Previously, BGP maintained only a single Routing Information Base (RIB), for IPv4 unicast prefixes. With MBGP, a separate RIB must be maintained for each type of routing information being exchanged: a separate Unicast RIB (U-RIB) and a separate Multicast RIB (M-RIB). A new "nlri" keyword was added to the Cisco IOS command structure to differentiate between the U-RIB and the M-RIB. (Note: this keyword will soon be deprecated in order to generalize MBGP for other protocols such as IPv6. Consult your latest IOS documentation for the correct syntax.)
The U-RIB contains the unicast prefixes that BGP previously used for IPv4 unicast traffic forwarding. The M-RIB contains the same type of unicast prefixes as the U-RIB, except that the prefixes stored in the M-RIB are used to RPF-check arriving multicast traffic.

33 MBGP — NLRI Information
The RIBs may be populated by network commands:
network <foo> <foo-mask> [nlri multicast unicast]
The new "nlri" keyword controls in which RIB the matching route(s) are stored:
M-RIB if the "multicast" keyword is specified
U-RIB if the "unicast" keyword is specified (or if the nlri clause is omitted)
Both RIBs if both keywords are specified
Like the "neighbor" command, the "network" command also accepts the new "nlri" keyword. This provides control over which RIB (U-RIB, M-RIB, or both) the network information is injected into. If the "nlri" clause is omitted, the Unicast RIB (U-RIB) is assumed.
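As a sketch of the syntax above, a hypothetical configuration injecting one prefix into both RIBs might look like this. The AS number and prefix are placeholders, and the exact syntax depends on the IOS release, since the "nlri" keyword was slated for deprecation as noted above:

```
router bgp 65000
 network 192.0.2.0 mask 255.255.255.0 nlri unicast multicast
```

With both keywords present, the prefix would be usable for unicast forwarding and for the multicast RPF check.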

34 MBGP—Summary
Solves part of the inter-domain problem:
Can exchange multicast RPF information
Uses standard BGP configuration knobs
Permits separate unicast and multicast topologies if desired
Still must use PIM to build multicast distribution trees and to actually forward multicast traffic; PIM-SM is recommended.
MBGP solves part of the inter-domain multicast problem by allowing ASes to exchange multicast RPF information in the form of MBGP multicast NLRI. Because this is accomplished using an extension that makes BGP support multiple protocols (i.e. "Multiprotocol BGP"), the same BGP configuration knobs are available for both unicast and multicast information. The separation of unicast and multicast prefixes into separate Unicast and Multicast RIBs permits unicast and multicast traffic to follow different paths if desired.
MBGP is only one piece of the overall inter-domain multicast solution, and PIM must still be used to build the multicast distribution trees (typically via PIM-SM) and to actually RPF-check and forward multicast traffic. PIM-SM is recommended as it permits the use of MSDP, which solves most of the remaining issues and is covered in another section.

35 MSDP Overview
Uses inter-domain source trees, reducing the problem to locating active sources.
MSDP establishes a neighbor relationship between MSDP peers. MSDP peers talk via TCP connections, with keepalives every 60 seconds; after 75 seconds without one, the TCP connection is reset.
By abandoning the notion of inter-domain Shared Trees and using only inter-domain Source Trees, the complexity of interconnecting PIM-SM domains is reduced considerably. The remaining problem becomes one of communicating the existence of active sources between the RPs in the PIM-SM domains.
The RP can join the inter-domain source tree for sources that are sending to groups for which the RP has receivers. This is possible because the RP is the root of the Shared Tree, which has branches to all points in the domain where there are active receivers. Note: if the RP either has no Shared Tree for a particular group, or has a Shared Tree whose outgoing interface list is Null, it does not send a Join to the source in another domain.
Once a last-hop router learns of a new source outside the PIM-SM domain (via the arrival of a multicast packet from the source down the Shared Tree), it too can send a Join toward the source and join the source tree.

36 MSDP Overview
Works with PIM-SM only.
RPs know about all sources in a domain (sources cause a PIM Register to the RP) and can tell RPs in other domains of their sources, via MSDP SA (Source Active) messages.
RPs know about receivers in a domain (receivers cause a (*, G) Join to the RP). An RP can join the source tree in the peer domain via normal PIM (S, G) Joins; this is only necessary if there are receivers for the group.
The entire concept of MSDP depends on the RPs in the interconnected domains being the be-all, know-all oracles that are aware of all sources and receivers in their local domains. As a result, MSDP can only work with PIM-SM.
Whenever a source goes active in a PIM-SM domain, the first-hop router immediately informs the RP via PIM Register messages. (S, G) state for the source is kept alive in the RP by normal PIM-SM mechanisms as long as the source is actively sending. As a result, the RP can inform RPs in other PIM-SM domains of the existence of active sources in its local domain. This is accomplished via MSDP Source Active (SA) messages.
Receivers cause last-hop routers to send (*, G) Joins to the RP to build branches of a Shared Tree for a group. If an RP has (*, G) state for a group and the outgoing interface list of the (*, G) entry is not Null, it knows it has active receivers for the group. Therefore, when it receives an SA message announcing an active source for group G in another domain, it can send (S, G) Joins toward the source in the other domain.

37 MSDP SA Messages
Used to advertise active sources in a domain. SA message contents:
IP address of the originator (RP address)
Number of (S, G) pairs being advertised
List of active (S, G) pairs in the domain
Encapsulated multicast packet
Source Active (SA) messages are peer-RPF forwarded to prevent loops: an RPF check is performed on the path back to the originating RP. This check can be skipped if using an MSDP mesh group, or if there is only one MSDP peer and BGP is not running, since such domains have only one exit (e.g., a default peer).
MSDP peers (typically RPs) are connected via TCP sessions. Note: the MSDP specification describes a UDP encapsulation option, but this is not currently available in the IOS implementation.
RPs periodically originate SA messages for sources that are active in their local domain. These SA messages are sent to all active MSDP peers. When an MSDP speaker receives an SA message from one of its peers, it is RPF-forwarded to all of its other peers. An RPF check is performed on the arriving SA message (using the originating RP address in the SA message) to ensure that it was received via the correct AS-PATH; only if this RPF check succeeds is the SA message flooded downstream to its peers. This prevents SA messages from looping through the Internet. Stub domains (i.e. domains with only a single MSDP connection) do not have to perform this RPF check, since there is only a single entrance/exit.
MSDP speakers may cache SA messages. Normally, these messages are not stored, to minimize memory usage. However, by storing SA messages, join latency can be reduced: RPs do not have to wait for the arrival of periodic SA messages when the first receiver joins the group. Instead, the RP can scan its SA cache to immediately determine which sources are active and send (S, G) Joins. Non-caching MSDP speakers can query caching MSDP speakers in the same domain for information on active sources for a group.
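The peer-RPF flooding rule for SA messages can be sketched as follows; the peer-selection function here is a stand-in for the real BGP/AS-path based RPF rules, and all peer names are hypothetical:

```python
def process_sa(sa_origin_rp, from_peer, rpf_peer_toward, peers):
    """Accept an SA only from the RPF peer toward the originating RP,
    then flood it to all other peers; otherwise drop it (loop prevention)."""
    if from_peer != rpf_peer_toward(sa_origin_rp):
        return []  # fails the peer-RPF check: drop, do not forward
    return [p for p in peers if p != from_peer]

# Hypothetical topology: peerA lies on the path back to the RP "rp1".
rpf = lambda rp: "peerA"

print(process_sa("rp1", "peerA", rpf, ["peerA", "peerB", "peerC"]))
# SA accepted and flooded to the other peers: ['peerB', 'peerC']
print(process_sa("rp1", "peerB", rpf, ["peerA", "peerB", "peerC"]))
# SA arrived from the wrong peer and is dropped: []
```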

38 MSDP Overview: MSDP Example
[Diagram: PIM-SM domains A through E, each with an RP, connected by MSDP peering sessions; a source s in Domain A, a receiver r in Domain E, and SA messages flooding from the RP in Domain A toward the RPs in domains B, C, D, and E.]

MSDP Example
In the example above, PIM-SM domains A through E each have an RP which is an MSDP speaker. The solid lines between these RPs represent the MSDP peer sessions via TCP, not actual physical connectivity between the domains. Note: the physical connectivity between the domains is not shown in the drawing.

Assume that a receiver in Domain E joins a multicast group, which in turn causes its DR to send a (*, G) Join for this group to the RP. This builds a branch of the Shared-Tree from the RP in Domain E to the DR as shown.

When a source goes active in Domain A, the first-hop router (S) sends a PIM Register message to the RP. This informs the RP in Domain A that a source is active in the local domain. The RP responds by originating an (S, G) SA message for this source and sending it to its MSDP peers in domains B and C. (The RP will continue to send these SA messages periodically as long as the source remains active.)

When the RPs in domains B and C receive the SA messages, they are RPF checked and forwarded downstream to their MSDP peers. These SA messages continue to travel downstream and eventually reach the MSDP peers (the RPs) in domains D and E. Note: the SA message traveling from Domain B to Domain C will fail the RPF check at the Domain C RP (MSDP speaker) and will be dropped. However, the SA message arriving at Domain C from Domain A will RPF correctly and will be processed and forwarded on to domains D and E.

39 MSDP Overview: MSDP Example
[Diagram: the RP in Domain E sends an (S, G) Join that travels toward the source in Domain A along the inter-domain routing path, which differs from the MSDP peering sessions.]

MSDP Example
Once the SA message arrives at the RP (MSDP speaker) in Domain E, it sees that it has an active branch of the Shared-Tree for the group. It responds to the SA message by sending an (S, G) Join toward the source.

IMPORTANT: The (S, G) Join will follow the normal inter-domain routing path from the RP to the source. This inter-domain routing path is not necessarily the same path as that used by the MSDP connections. In order to emphasize this point, the (S, G) Join is shown following a different path between domains.

40 MSDP Overview: MSDP Example
[Diagram: multicast traffic flows from the source s in Domain A to the receiver r in Domain E along the newly built Source Tree.]

MSDP Example
Once the (S, G) Join message reaches the first-hop router (S) in Domain A, (S, G) traffic begins to flow to the RP in Domain E via the Source Tree shown.

IMPORTANT: The (S, G) traffic will not flow over the TCP MSDP sessions. It will instead follow the path of the Source Tree that was built in the preceding step.

41 Originating SA Messages
SA messages are triggered when any new source in the local domain goes active
Initial multicast packet is encapsulated in an SA message
This is an attempt at solving the bursty-source problem

Originating SA Messages
SA messages are triggered by an RP (assuming MSDP is configured) when any new source goes active within the local PIM-SM domain. When a source in the local PIM-SM domain initially goes active, it causes the creation of (S, G) state in the RP. New sources are detected by the RP by:
The receipt of a Register message, or
The arrival of the first (S, G) packet from a directly connected source.
The initial multicast packet sent by the source (either encapsulated in the Register message or received from a directly connected source) is encapsulated in the initial SA message in an attempt to solve the problem of bursty sources.
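As a rough illustration of this slide's point, SA origination could be modeled as below. The field names and layout are assumptions made for the example, not the MSDP wire format; the key idea is that the first data packet rides inside the first SA so a bursty source's initial packets are not lost while remote trees are built.

```python
# Illustrative SA origination (assumed field layout, not the wire format).

def originate_sa(rp_address, active_sources, first_packet=None):
    sa = {
        "originator": rp_address,       # IP address of the originating RP
        "count": len(active_sources),   # number of (S, G) pairs advertised
        "pairs": list(active_sources),  # the active (S, G) list
    }
    if first_packet is not None:
        # Bursty-source fix: carry the source's initial multicast packet
        # (taken from the Register or from a directly connected source)
        sa["encapsulated_data"] = first_packet
    return sa
```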

42 MSDP Mesh-Groups Optimises SA flooding
Useful when 2 or more peers are in a group
Reduces amount of SA traffic in the net
SAs are not flooded to other mesh-group peers
No RPF checks on arriving SA messages
When received from a mesh-group peer
SAs always accepted from mesh-group peers

MSDP Mesh-Groups
An MSDP mesh-group can be configured on a group of MSDP peers that are fully meshed. (In other words, each of the MSDP peers in the group has an MSDP connection to every other MSDP peer in the group.)

When an MSDP mesh-group is configured between a group of MSDP peers, SA flooding is reduced. This is because when an MSDP peer in the group receives an SA message from another MSDP peer in the group, it can assume that this SA message was sent to all the other MSDP peers in the group. As a result, it is neither necessary nor desirable for the receiving MSDP peer to flood the SA message to the other MSDP peers in the group.

MSDP mesh-groups may also be used to eliminate the need to run (M)BGP to do RPF checks on arriving SA messages. This is because an SA message received from a mesh-group peer is never re-flooded to the other peers in the mesh-group, so loops cannot form within the mesh and it is not necessary to perform the RPF check on arriving SA messages.
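The mesh-group flooding rule above can be sketched as a small decision function. This is illustrative only; the function and names are invented, and the RPF check itself (required before flooding an SA from a non-mesh peer) is assumed to have already passed.

```python
# Illustrative flooding rule for one mesh-group member: an SA received
# from a mesh-group peer is never re-flooded to other mesh members
# (the sender already sent it to the full mesh); an SA from outside the
# mesh (after a successful RPF check) is flooded to all other peers.

def flood_targets(from_peer, all_peers, mesh_peers):
    others = [p for p in all_peers if p != from_peer]
    if from_peer in mesh_peers:
        # Forward only to peers outside the mesh group
        return [p for p in others if p not in mesh_peers]
    return others
```

This is also why the full mesh is mandatory: the "do not re-flood" shortcut is only safe if the originating member really does reach every other member directly.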

43 MSDP Mesh-Groups Configured with:
ip msdp mesh-group <name> <peer-address>
Peers in the mesh-group should be fully meshed
Multiple mesh-groups per router are supported

MSDP Mesh-Groups
An MSDP mesh-group may be configured by using the following IOS configuration command:

ip msdp mesh-group <name> <peer-address>

This command configures the router as a member of the mesh-group <name>, of which MSDP peer <peer-address> is also a member. All MSDP peers in the mesh-group must be fully meshed with all other MSDP peers in the group. This means that each router must be configured with ‘ip msdp peer’ and ‘ip msdp mesh-group’ commands for each member of the mesh-group. Routers may be members of multiple mesh-groups.

44 MSDP Mesh-Group Example
R1 configuration:
ip msdp peer R2
ip msdp peer R3
ip msdp mesh-group My-Group R2
ip msdp mesh-group My-Group R3

R2 configuration:
ip msdp peer R1
ip msdp peer R3
ip msdp peer R4
ip msdp mesh-group My-Group R1
ip msdp mesh-group My-Group R3

R3 configuration:
ip msdp peer R1
ip msdp peer R2
ip msdp peer R5
ip msdp mesh-group My-Group R1
ip msdp mesh-group My-Group R2

[Diagram: RPs R1, R2, and R3 form the mesh-group (MSDP mesh-group peering); R4 peers with R2 and R5 peers with R3. An SA arriving from R4 at R2 is flooded to mesh members R1 and R3, is not forwarded between R1 and R3 (marked X: "SA not forwarded to other members of the mesh-group"), and is flooded on from R3 to R5.]

MSDP Mesh-Group Example
In the example above, routers R1, R2 and R3 are all configured as members of the same MSDP mesh-group. In addition, router R2 is also MSDP peering with router R4, and router R3 is MSDP peering with router R5. Neither R4 nor R5 is a member of the MSDP mesh-group.

Assume router R4 originates an SA message for a source in its local PIM-SM domain. This message is sent to router R2 as shown in the drawing above. When router R2 receives this SA message, it must perform an RPF check on the message because it was received from an MSDP peer that is not a member of the mesh-group. In this case the RPF check is successful, and router R2 floods the SA message (received from a non-mesh-group member) to all other members of the mesh-group.

When routers R1 and R3 receive the SA message from mesh-group member R2, they do not have to perform an RPF check on the arriving message, nor do they flood the SA message to each other, since they are both members of the mesh-group. (They know that the other members of the mesh-group will have received a copy directly from R2, and therefore they do not have to forward the SA message to each other. This is why a full mesh between mesh-group members is required.)

Finally, router R3 floods the SA message to all of its MSDP peers that are not members of the mesh-group. In this case, the SA message is flooded to router R5 to continue the flow of the SA message downstream away from the RP.
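Under the stated assumptions (the mesh rule only; RPF checks are omitted for brevity), the flooding in this example can be simulated with a tiny model. The router names come from the slide; the logic is an illustrative sketch, not IOS behavior.

```python
# Toy simulation of the slide's topology: R1, R2, R3 form mesh-group
# "My-Group"; R4 peers with R2 and R5 peers with R3.

peers = {
    "R1": ["R2", "R3"],
    "R2": ["R1", "R3", "R4"],
    "R3": ["R1", "R2", "R5"],
    "R4": ["R2"],
    "R5": ["R3"],
}
mesh = {"R1", "R2", "R3"}

def propagate(origin, first_hop):
    """Return the routers that receive the SA, in arrival order."""
    received = []
    queue = [(origin, first_hop)]
    while queue:
        sender, router = queue.pop(0)
        received.append(router)
        for peer in peers[router]:
            if peer == sender:
                continue
            # Mesh rule: an SA received from a mesh member is never
            # re-flooded to another mesh member.
            if sender in mesh and router in mesh and peer in mesh:
                continue
            queue.append((router, peer))
    return received
```

Running `propagate("R4", "R2")` walks the slide's scenario: R2 floods to R1 and R3, neither R1 nor R3 re-floods to the other, and R3 passes the SA on to R5, so every router sees the SA exactly once.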

45 Quiz … is used to propagate forwarding state?
… permits separate unicast and multicast topologies? … exchanges active source information between RPs?

46 PS-541 2937_05_2001_c1 © 2001, Cisco Systems, Inc. All rights reserved. 46

