# Chapter 5: Tree Constructions

Chapter 5: Tree Constructions
- Breadth-First Search (BFS)
  - layer-based, using Dijkstra’s algorithm
  - update-based, using the Bellman-Ford algorithm
- Distributed Depth-First Search (DFS)
- Minimum spanning trees (MST)
- Matroid problems
  - solutions using synchronous upcasting yield faster alternatives to MST

Breadth First Search tree construction
- recall that the flooding algorithm can be used to construct a BFS tree in the synchronous model
- lower bounds for BFS algorithms
  - message complexity = Ω(|edges|)
  - time complexity = Ω(diameter)
- in the asynchronous model, the flooding algorithm may not construct a BFS tree
- large messages are disallowed so that we can focus on time/message tradeoffs
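
The gap between the two models can be seen on a triangle. The following is a minimal sketch, assuming a centralized simulation in which a hypothetical `pick` function stands in for the message scheduler; the names `flood_tree`, `oldest_first` and `delay_0_to_2` are illustrative and not part of the original slides.

```python
def flood_tree(adj, root, pick):
    """Flood from root; pick(msgs) returns the index of the message delivered next."""
    parent = {root: None}
    in_flight = [(root, w) for w in adj[root]]        # (sender, receiver) pairs
    while in_flight:
        sender, receiver = in_flight.pop(pick(in_flight))
        if receiver not in parent:                    # first message: adopt sender as parent
            parent[receiver] = sender
            in_flight.extend((receiver, x) for x in adj[receiver] if x != sender)
    return parent

def depth(parent, v):
    """Number of tree edges between v and the root."""
    d = 0
    while parent[v] is not None:
        v = parent[v]
        d += 1
    return d

def oldest_first(msgs):
    # Deliver messages in the order they were sent.
    return 0

def delay_0_to_2(msgs):
    # Adversarial schedule: every other message overtakes the message 0 -> 2.
    for i, m in enumerate(msgs):
        if m != (0, 2):
            return i
    return 0

triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
in_order = flood_tree(triangle, 0, oldest_first)
delayed = flood_tree(triangle, 0, delay_0_to_2)
print(depth(in_order, 2), depth(delayed, 2))
```

With in-order delivery vertex 2 hangs directly off the root, while under the delayed schedule it adopts vertex 1 as its parent, so its tree depth exceeds its true distance from the root.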

Layer-synchronized BFS construction (Dijkstra’s algorithm)
- build the tree layer by layer: at each stage, add all vertices that are adjacent to a vertex of the previously constructed layer
- r0 initiates the construction by issuing phases, with one new layer constructed in each phase
- at phase p + 1, assume the tree has already been constructed on p layers (denoted by Tp); then do the following:
  - r0 generates a pulse message and broadcasts it on Tp
  - each leaf v of Tp, upon receiving pulse, sends an exploration message Ex to all of its neighbors (except its parent)
  - each vertex w, upon receiving Ex for the first time, picks one neighbor v that sent it to be its parent (parent(w) = v) and sends Ack to this parent
  - if vertex w has already selected a parent, then upon receipt of an Ex message it replies with Ack together with parent(w)
  - each leaf v of Tp collects the Ack messages for its Ex messages; if an Ack arrives from a vertex w with parent(w) = v, then v adds w to its set child(v)

Dijkstra’s algorithm (cont’d.):
- algorithm description (cont’d.)
  - once a leaf v of Tp has received all of its Ack messages, it upcasts Ack to its parent in Tp; these Ack messages are then convergecast on Tp back to r0
  - once this convergecast terminates at r0, r0 may begin the next phase
- termination detection
  - each Ack message carries an additional field, new (initially set to 0), which indicates whether any new vertices were added to the tree in the current phase
  - a vertex v sets new(v) = 1 if any new vertices have responded to its Ex message by joining the tree as its children
  - the OR of these new bits is convergecast on the tree
  - if a phase ends with r0 receiving new = 0 in every Ack message from its children, then the layer explored by the leaves in that phase is empty and the tree is complete
- inefficiencies:
  - certain Ex messages can be avoided: if only the left subtree of a node is unexplored, Ex messages are still sent into the right subtree as well
  - some of the Ack messages can be omitted
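
The phase structure above can be sketched as a centralized simulation. This is only an illustration under stated assumptions: there is no real message passing, the pulse broadcast and Ack convergecast are implicit in the loop, and the names `layered_bfs`, `leaves` and `new_flag` are hypothetical.

```python
def layered_bfs(adj, root):
    """Grow a BFS tree one layer per phase; stop when a phase adds no vertex."""
    parent = {root: None}
    children = {root: []}
    leaves = [root]                  # vertices of the deepest layer built so far
    phases = 0
    while True:
        phases += 1
        new_flag = 0                 # OR of the new bits collected in this phase
        next_layer = []
        for v in leaves:             # each leaf sends Ex to all neighbors but its parent
            for w in adj[v]:
                if w == parent[v]:
                    continue
                if w not in parent:  # first Ex received by w: w joins as a child of v
                    parent[w] = v
                    children[w] = []
                    children[v].append(w)
                    next_layer.append(w)
                    new_flag = 1
                # otherwise w only Acks and reports its existing parent
        if new_flag == 0:            # r0 sees new = 0 everywhere: the tree is complete
            return parent, children, phases
        leaves = next_layer

square = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
parent, children, phases = layered_bfs(square, 0)
print(parent, phases)
```

On this 4-cycle the tree has two layers below the root, and one further phase is spent discovering that the next layer is empty.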

Dijkstra’s algorithm (cont’d.): complexities
- (lemma 5.2.1) after phase p is completed, the variables parent and child correspond to a legal BFS tree spanning $\Gamma_p(r_0)$, the p-neighborhood of r0
- complexities
  - time = $O(\mathrm{diam}^2(G))$
  - message = $O(n \cdot \mathrm{diam}(G) + |E|)$
- analysis: time
  - time(phase p) = 2p + 2: broadcast and convergecast take p time units each, and exploration takes two time units
  - summing over $1 \le p \le \mathrm{diam}(G)$: $\sum_p \mathrm{time}(\text{phase } p) = \sum_p (2p + 2) = O(\mathrm{diam}^2(G))$
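
As a worked check of the time bound, write $D = \mathrm{diam}(G)$ (a shorthand introduced here only for this computation) and sum the per-phase cost 2p + 2 stated above:

$$\sum_{p=1}^{D} (2p + 2) = 2 \cdot \frac{D(D+1)}{2} + 2D = D^2 + 3D = O(D^2).$$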

Dijkstra’s algorithm (cont’d.)
- analysis (cont’d.): message
  - assume $p \ge 0$; let $V_p$ be the set of vertices of T at layer p; let $E_p$ be the internal edges of $V_p$; and let $E_{p,p+1}$ be the edges connecting $V_p$ to $V_{p+1}$
  - at phase p, exploration messages are sent only over $E_p$ and $E_{p,p+1}$, and the edges of $T_p$ are traversed twice, giving message(phase p) = $O(n) + O(|E_p|) + O(|E_{p,p+1}|)$
  - summing over $1 \le p \le \mathrm{diam}(G)$: $\sum_p \mathrm{message}(\text{phase } p) = \sum_p \left( O(n) + O(|E_p|) + O(|E_{p,p+1}|) \right) = O(n \cdot \mathrm{diam}(G) + |E|)$
  - the last step holds because the sets $E_p$ and $E_{p,p+1}$ are pairwise disjoint subsets of E

Update-based BFS construction (distributed Bellman-Ford algorithm)
- a modified flooding algorithm which ensures that a BFS tree is constructed in the asynchronous model
- algorithm:
  - each vertex v keeps a variable L(v) (initially set to ∞), its current estimate of its distance to the root
  - as flooding progresses, each vertex v sends L(v) to its neighbor w along with the flooded message
  - if a vertex w receives L(v) from its neighbor v and L(v) + 1 < L(w), then w chooses v as its parent and sets L(w) = L(v) + 1
  - if this change occurs, then w also informs all of its other neighbors of its new (shorter) path to the root
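
The update rule can be exercised with a small simulation. This is a minimal sketch, assuming that asynchrony is modeled by delivering in-flight messages in a pseudo-random order and that the root starts with L = 0; the function name `bellman_ford_bfs` and the `seed` parameter are hypothetical.

```python
import math
import random

def bellman_ford_bfs(adj, root, seed=0):
    """Update-based BFS: adopt a sender as parent whenever it offers a shorter path."""
    rng = random.Random(seed)
    L = {v: math.inf for v in adj}          # distance estimates, initially infinity
    parent = {v: None for v in adj}
    L[root] = 0                             # assumption: the root starts at distance 0
    in_flight = [(root, w, 0) for w in adj[root]]   # (sender, receiver, L(sender))
    sent = len(in_flight)
    while in_flight:
        # Arbitrary delivery order stands in for the asynchronous scheduler.
        v, w, lv = in_flight.pop(rng.randrange(len(in_flight)))
        if lv + 1 < L[w]:                   # strictly shorter path found via v
            L[w] = lv + 1
            parent[w] = v
            updates = [(w, x, L[w]) for x in adj[w] if x != v]
            in_flight.extend(updates)       # inform all other neighbors
            sent += len(updates)
    return L, parent, sent

graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1, 3], 3: [1, 2, 4], 4: [3]}
L, parent, sent = bellman_ford_bfs(graph, 0)
print(L, sent)
```

Whatever the delivery order, the final values of L are the true distances from the root; only the number of messages sent depends on the schedule.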

Bellman-Ford distributed algorithm (cont’d.)
- complexities
  - time = O(diam(G))
  - message = O(n*|edges|)
- analysis
  - synchronous: complexities are the same as in the flooding algorithm; once a vertex changes L(v) from ∞, it won’t change it again
  - asynchronous, time: assume d ≥ 1
    - at d time units into the execution, each vertex v at distance d from the root has already received a message carrying the value d - 1 from some neighbor
    - v will then set L(v) = d and choose a parent w such that L(w) = d - 1
    - induction on d gives O(diam(G))

Bellman-Ford distributed algorithm (cont’d)
- asynchronous model analysis (cont’d): message
  - for a vertex v, the first value it assigns to L(v) is at most n-1 (the longest possible path in the network)
  - L(v) then changes at most n-2 times
  - each change to L(v) results in v sending messages on each of its outgoing edges
  - thus each v sends at most n*degree(v) messages
  - total messages = $\sum_v n \cdot \mathrm{degree}(v) = O(n \cdot |\mathrm{edges}|)$
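
The last equality uses the fact that every edge contributes to the degree of exactly two vertices. Writing $|E|$ for the number of edges:

$$\sum_{v} n \cdot \mathrm{degree}(v) = n \sum_{v} \mathrm{degree}(v) = 2n|E| = O(n \cdot |E|).$$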

Distributed Depth-First Search
- general overview
- algorithm
  - begin at some source vertex, r0
  - when reaching any vertex v:
    - if v has unvisited neighbors, then visit the next one of them
    - otherwise, return to parent(v)
  - when the search has to return from a vertex v with parent(v) = NULL, terminate, since v = r0
- DFS defines a tree, with r0 as the root, which reaches all vertices in the graph
- “back edges” = graph edges not in the tree
- sequential time complexity = O(|edges|)
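
The sequential procedure can be sketched as follows. This is an illustrative version, assuming an undirected graph given as adjacency lists; the names `dfs_tree` and `next_index` are hypothetical.

```python
def dfs_tree(adj, root):
    """Iterative DFS following the slide: advance to an unvisited neighbor or retreat."""
    parent = {root: None}
    next_index = {u: 0 for u in adj}        # next neighbor of u still to be examined
    v = root
    while True:
        # Skip neighbors that have already been visited.
        while next_index[v] < len(adj[v]) and adj[v][next_index[v]] in parent:
            next_index[v] += 1
        if next_index[v] < len(adj[v]):     # an unvisited neighbor exists: visit it
            w = adj[v][next_index[v]]
            parent[w] = v
            v = w
        elif parent[v] is None:             # nothing left at the root: terminate
            break
        else:                               # otherwise return to parent(v)
            v = parent[v]
    tree_edges = {frozenset((c, p)) for c, p in parent.items() if p is not None}
    all_edges = {frozenset((u, w)) for u in adj for w in adj[u]}
    return parent, all_edges - tree_edges   # second value: the back edges

graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
parent, back_edges = dfs_tree(graph, 0)
print(parent, back_edges)
```

Each adjacency list is scanned once, which matches the O(|edges|) sequential bound.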

Distributed DFS (cont’d.)
- distributed version = token-based
  - the token traverses the graph in a depth-first manner using the algorithm described above
- complexities
  - message = time = Θ(|edges|)
  - note that edges are not examined from both endpoints: when edge (v,w) is examined by v, w then knows that v has been visited
- analysis
  - message: lower bound of Ω(|edges|), since every edge must be explored

Distributed DFS (cont’d.)
- analysis (cont’d.)
  - time: ensure that a vertex visited for the first time knows which of its neighbors have and have not been visited; thus no unnecessary vertex explorations are made
  - algorithm, on the first visit to a vertex v: freeze the DFS process; inform all neighbors of v that v has been visited; collect Ack messages from those neighbors; restart the DFS process
  - additional time cost each time a vertex is first visited = O(1)
  - only edges of the DFS tree are traversed
  - therefore, time complexity = O(n)
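
The effect of the freeze-and-notify step can be checked by counting token hops. This is a minimal sketch, assuming a centralized simulation in which each notification and its Ack are counted as two messages; the names `token_dfs`, `told_visited`, `token_moves` and `notices` are hypothetical.

```python
def token_dfs(adj, root):
    """Token-based DFS in which a first-visited vertex notifies all of its neighbors."""
    told_visited = {v: set() for v in adj}  # neighbors each vertex knows to be visited
    parent = {root: None}
    token_moves = 0                         # token hops along edges
    notices = 0                             # "visited" notifications plus their Acks
    v, first_visit = root, True
    while True:
        if first_visit:
            # Freeze the search, inform every neighbor of v, and wait for the Acks.
            for w in adj[v]:
                told_visited[w].add(v)
                notices += 2
        unvisited = [w for w in adj[v] if w not in told_visited[v]]
        if unvisited:                       # forward the token over a new tree edge
            w = unvisited[0]
            parent[w] = v
            v, first_visit = w, True
            token_moves += 1
        elif parent[v] is None:             # back at the root with nothing left to visit
            return parent, token_moves, notices
        else:                               # return the token to the parent
            v, first_visit = parent[v], False
            token_moves += 1

graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
parent, token_moves, notices = token_dfs(graph, 0)
print(token_moves, notices)
```

The token crosses each tree edge once in each direction, 2(n-1) hops in total, while the notifications still touch every edge, so the saving is in time rather than in messages.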

Minimum spanning trees (MST)
- evaluate a spanning tree by its total weight
- weight of a subgraph: let G’ be a subgraph of the graph G with a set of edges E’ and weight function w( ); then $w(G') = \sum_{e \in E'} w(e)$
- then define the MST of a graph G as a spanning tree TM which minimizes w(TM)
- MST problem
  - given a weighted graph G = (V,E,w), compute an MST for G
  - edge weights are assumed to be distinct, thus yielding a unique MST for G
  - if the weights are not distinct, distinct weights can be created using vertex identifiers
  - however, in anonymous networks without distinct edge weights or distinct vertex identifiers, no distributed algorithm exists for computing an MST with a bounded number of messages

MST (cont’d.)
- in the worst case, distributed MST construction requires
  - Ω(|E|) messages for weighted n-vertex graphs
  - Ω(n log n) messages for arbitrary n-vertex graphs
- definitions
  - an MST fragment is a tree T in G for which ∃ an MST TM of G such that T is a subtree of TM
  - edge e = (v,w) is an outgoing edge of fragment T if either v or w (but not both) belongs to T
  - MWOE(T) = minimum weight outgoing edge of fragment T
- blue rule: given a fragment T and e = MWOE(T), create T’ = T ∪ {e}
- lemma: T’ is a fragment as well

MST (cont’d.) Prim’s algorithm (distributed version)
- works by repeatedly applying the blue rule, as above, to each resulting T’ and its e’ = MWOE(T’), until the MST for G is obtained
- works with both asynchronous and synchronous models
- algorithm
  - let vertex r0 be the source as well as the first fragment T
  - use pulse messages broadcast on the current fragment T to synchronously add MWOE(T): each vertex in T sends its own minimum-weight outgoing edge
  - convergecast the MWOE’s towards r0 (each vertex sends the minimum it has seen)
  - the MWOE is then selected by r0 and broadcast on the tree
- complexities
  - time = message = O(n²)
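
Repeated application of the blue rule can be sketched sequentially. This is an illustration only, assuming edges are given as (u, v, weight) triples with distinct weights; the broadcast and convergecast are replaced by a direct scan, and the names `mwoe` and `prim_blue_rule` are hypothetical.

```python
def mwoe(fragment, edges):
    """Return the minimum-weight outgoing edge of fragment as (w, u, v), or None."""
    outgoing = [(w, u, v) for (u, v, w) in edges
                if (u in fragment) != (v in fragment)]   # exactly one endpoint inside
    return min(outgoing, default=None)

def prim_blue_rule(vertices, edges, r0):
    """Grow one fragment from r0 by adding its MWOE until it spans the graph."""
    fragment = {r0}
    tree = []
    while len(fragment) < len(vertices):
        e = mwoe(fragment, edges)
        if e is None:
            raise ValueError("graph is not connected")
        w, u, v = e
        tree.append((u, v, w))       # blue rule: T' = T plus the edge e
        fragment |= {u, v}
    return tree

vertices = [0, 1, 2, 3]
edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4), (1, 3, 5)]
mst = prim_blue_rule(vertices, edges, 0)
print(mst, sum(w for _, _, w in mst))
```

The loop runs n-1 times and each iteration is one fragment-wide search, which reflects why the construction is described as fairly sequential.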

MST (cont’d.) synchronous GHS algorithm
- Prim’s algorithm is still fairly sequential
- GHS (a distributed version of Kruskal’s algorithm) is less sequential and thus more efficient
- Kruskal’s algorithm
  - each vertex v is initially a fragment
  - at each step, the lightest MWOE over all fragments is selected and added to the tree, thus merging the two fragments it touches
  - when a single fragment remains, it is the MST for G
  - still sequential: n-1 steps are needed
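
For comparison, the sequential Kruskal procedure can be sketched as follows. This is the standard union-find formulation rather than anything specific to the slides; edges are assumed to be (u, v, weight) triples, and the names `kruskal`, `leader` and `find` are hypothetical.

```python
def kruskal(vertices, edges):
    """Add edges in increasing weight order, skipping edges inside one fragment."""
    leader = {v: v for v in vertices}       # union-find forest over fragments

    def find(v):
        while leader[v] != v:
            leader[v] = leader[leader[v]]   # path halving keeps the forest shallow
            v = leader[v]
        return v

    tree = []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(u), find(v)
        if ru != rv:                        # the edge joins two different fragments
            leader[ru] = rv                 # merge the fragments it touches
            tree.append((u, v, w))
    return tree

vertices = [0, 1, 2, 3]
edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4), (1, 3, 5)]
mst = kruskal(vertices, edges)
print(mst)
```

Exactly n-1 edges are accepted one at a time, which is the sequential bottleneck the slide points to.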

MST (cont’d.) GHS algorithm overview
- works with the synchronous model
- vertices are partitioned into fragments, with each fragment Fi being a rooted tree
- each fragment has an identifier (possibly the identifier of its root)
- each vertex in a fragment knows its parent, its children and the identifier of the fragment
- works in phases, each taking as input the fragment structure from the previous phase and producing larger fragments as output
- description of each phase:
  - all vertices of a fragment F cooperate to find MWOE(F), carried out as in Prim; it is assumed that each vertex knows which of its edges are outgoing
  - a Request_to_merge message is sent over e = MWOE(F) to fragment F’, carrying the identifier of F
  - the two fragments then combine (possibly with several other fragments if MWOE(F’) ≠ MWOE(F)) into a larger fragment

MST (cont’d.) description of each phase (cont’d.)
- once connected, the two fragments (now one) proceed as follows
  - assume fragments F1 and F2, where e = MWOE(F1) = MWOE(F2)
  - assume e = (v1,v2) where v1 ∈ F1 and v2 ∈ F2
  - the root of the new fragment is whichever of the two vertices v1 and v2 has the higher identifier, say v1
  - the new root, v1, broadcasts a New_fragment message throughout the combined fragment F’, informing all vertices of its identifier (the new identifier of F’)
  - each vertex updates its identifier and root entries, and redirects its fragment edges so that its parent is the vertex which sent it the message, thus now “pointing” towards the new root of F’
  - each vertex then informs its neighbors of its new fragment identifier
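
The phase structure can be sketched with a centralized, Borůvka-style simulation. This is not the message-passing protocol itself: it assumes distinct edge weights, finds each MWOE by a direct scan, and simplifies the root rule by giving a merged fragment the larger of the two old fragment identifiers. The name `synchronous_phases` is hypothetical.

```python
import math

def synchronous_phases(vertices, edges):
    """In every phase each fragment selects its MWOE, then linked fragments merge."""
    frag = {v: v for v in vertices}          # fragment identifier of each vertex
    tree = set()
    phases = 0
    while len(set(frag.values())) > 1:
        phases += 1
        best = {}                            # fragment id -> its MWOE as (w, u, v)
        for u, v, w in edges:
            if frag[u] != frag[v]:           # the edge is outgoing for both fragments
                for f in (frag[u], frag[v]):
                    if f not in best or (w, u, v) < best[f]:
                        best[f] = (w, u, v)
        if not best:
            raise ValueError("graph is not connected")
        for w, u, v in sorted(set(best.values())):
            tree.add((u, v, w))
            old, new = frag[u], frag[v]
            if old != new:                   # merge and relabel (simplified root rule)
                merged_id = max(old, new)
                for x in vertices:
                    if frag[x] in (old, new):
                        frag[x] = merged_id
    return tree, phases

vertices = [0, 1, 2, 3]
edges = [(0, 1, 1), (1, 2, 2), (0, 2, 3), (2, 3, 4), (1, 3, 5)]
tree, phases = synchronous_phases(vertices, edges)
path_tree, path_phases = synchronous_phases(vertices, [(0, 1, 1), (1, 2, 3), (2, 3, 2)])
print(sorted(tree), phases, path_phases)
```

Every fragment merges with at least one other fragment in each phase, so the number of fragments at least halves and at most ⌈log2 n⌉ phases are needed.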

MST (cont’d.) complexities
- message = O(|E| log n)
- time = O(n log n)