1 CS 410 Mastery in Programming Chapter 6 Graphs, Focus: Strongly Connected Components Herbert G. Mayer, PSU CS status 7/25/2011.


2 Syllabus
Definition of Graph
Building a Graph
Graph Data Structure
Strongly Connected Component
Control Flow Graph
Reducing Graph to DAG
References

3 Formal Definition of Graph
Empty Graph: For simplicity and expediency we ignore the possibility of a graph G being empty.
Graph: A data structure G = { V, E } consisting of a set V of vertices, AKA nodes, and a set E of edges. Any node v_i ∈ V may be connected to any other node v_j ∈ V. Such a connection is called an edge. Edges may be directed, or even bi-directed. Different from a tree, a node in G may have any number of predecessors, or incident edges.
Connected Graph: If all n > 0 nodes in G are connected somehow, the graph G is called connected, regardless of the edges' directions.
Strongly Connected Component: A subset SG ⊆ G is strongly connected if every node v_i in SG can reach every node v_j in SG somehow, following the edges' directions.
Directed Acyclic Graph (DAG): A DAG is a graph with directed edges that form no cycle. A node may still have multiple predecessors.
When programming graphs, it is convenient to add fields to the node type for additional functions; e.g. it is possible to process all nodes in a linear fashion by adding a link field.
Sample 1: build a stack of all nodes in G.
Sample 2: a step may have to traverse all nodes in G, even though G is unconnected!

4 Building a Graph

5 Graph Data Structure
A graph G( v, e ) consists of v nodes (AKA vertices) and edges e that connect nodes.
Implemented via some node_type data structure.
G is identified, and thus accessible, via one select node, called an entry node, or simply entry, AKA head.
Head is of type pointer to node_type.
G is not necessarily connected.
If parts of G are unconnected, how can they be retrieved in case of complete graph traversal?
Several methods of forcing complete access:
  Either create a super-node, not specified by the user of G, in a way that each unconnected region is pointed at.
  Or have a linked list (LL) meandering through each node of G, without this link field being part of G proper.
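The second method, a linked list threading through every node, can be sketched as follows. This is only a minimal sketch under assumptions: the type `sketch_node`, the global `all_nodes`, and the helpers `add_node` and `count_nodes` are hypothetical names chosen for illustration, not the course's own node type.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical, stripped-down node: only the fields this sketch needs.
typedef struct sketch_node {
    int name;                    // node ID
    struct sketch_node *finger;  // links ALL nodes, independent of graph edges
} sketch_node;

static sketch_node *all_nodes = NULL;  // head of the finger list

// Allocate a node and thread it onto the front of the finger list.
sketch_node *add_node( int name )
{
    sketch_node *n = malloc( sizeof *n );
    assert( n != NULL );
    n->name = name;
    n->finger = all_nodes;
    all_nodes = n;
    return n;
}

// Visit every node, connected or not, by walking the finger list.
int count_nodes( void )
{
    int count = 0;
    for ( sketch_node *n = all_nodes; n != NULL; n = n->finger ) {
        count++;
    }
    return count;
}
```

Because the finger list ignores edges entirely, a traversal like `count_nodes` reaches nodes in unconnected regions without needing a super-node.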

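6 Sample Graph G0
[Figure: sample graph G0 with nodes named 1 through 6; each node carries a color attribute such as R, Y, O, or G. The edges are not recoverable from this transcript.]
How many Strongly Connected Components in G0?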

7 Graph Data Structure
Sample Graph G0 above has 6 nodes.
The ID, AKA name, of each node is shown next to the node, e.g. 1 2 3 4 5 …
The graph's node type data structure includes such name information as part of node_type.
In addition, each node in G0 has attributes, such as R, G, Y etc. in the sample above.
There may be many more attributes belonging to each node, depending on what the graph will be used for.
Any of these attributes must also be declared in the node_type data structure.
And the successors, if any, of each node must be encoded in the node somehow; no limit on number!
G0 has 3 SCCs; 2 of those are not interesting!

8 Graph Data Structure
Since in general there is no inherent upper bound on the number of successor nodes, a suitable way to define successors is via a linked list.
Hence the data type for successor is a pointer to a linked list of link nodes.
Link nodes then are also allocated off the heap, as needed, of type link_type.
And each link consists of just 2 fields:
  One field pointing to the next link; the type is pointer to link_type, in some languages expressed as *link_type.
  The other field pointing to the successor node; the type is pointer to node_type.
For convenience, the last link inserted is added at the start of the list, saving repeated searches for the list end.

9 Graph Data Structure, Link

    // node may have any number of successors
    // all need to be retrieved,
    // so each node has link,
    // pointing to LL of all successor nodes.
    // Last one connected is the first one inserted
    // note: node_ptr_tp is declared with the node type;
    // in a source file that typedef must precede this struct
    typedef struct link_tp * link_ptr_tp;  // forward ref!
    typedef struct link_tp
    {
        link_ptr_tp  next_link;  // point to next successor
        node_ptr_tp  next_node;  // is a successor
    } str_link_tp;
    #define LINK_SIZE sizeof( str_link_tp )
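Inserting at the head of the successor list might look like the sketch below. The name `make_link` and its two-argument call shape are taken from the graph-building loop of this chapter; its body is not shown on the slides, so the body here is an assumption. The node type is reduced to the two fields the sketch needs.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node_tp * node_ptr_tp;
typedef struct link_tp * link_ptr_tp;

typedef struct link_tp {
    link_ptr_tp next_link;  // next link in successor list
    node_ptr_tp next_node;  // the successor node itself
} str_link_tp;

// Reduced node type for this sketch; the full type has more fields.
typedef struct node_tp {
    link_ptr_tp link;       // head of successor list
    int         name;       // name given at creation
} str_node_tp;

// ASSUMED body: allocate a link that points at successor `node` and
// whose tail is the old list `next`. The caller stores the result as
// the new list head, so insertion never searches for the list end.
link_ptr_tp make_link( link_ptr_tp next, node_ptr_tp node )
{
    link_ptr_tp link = malloc( sizeof( str_link_tp ) );
    assert( link != NULL );
    link->next_link = next;
    link->next_node = node;
    return link;
}
```

A caller would write `first->link = make_link( first->link, second );`, so the most recently connected successor is the first one found when the list is walked.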

10 Graph Data Structure, Node

    // "name" is arbitrary number given during creation
    // "link" is head of LList of successor nodes, while
    // "finger" is linear link through all nodes
    // "visited" true if was visited; initially FALSE
    typedef struct node_tp * node_ptr_tp;
    typedef struct node_tp
    {
        link_ptr_tp  link;     // points to LL of successors
        node_ptr_tp  finger;   // finger through all nodes
        int          name;     // name given at creation
        bool         visited;  // to check connectivity
        // others ...          // many other fields
    } str_node_tp;
    #define NODE_SIZE sizeof( str_node_tp )

11 Building a Graph

    // similar to what you did for differentiation:
    // create node in graph. Identified by name;
    // connect to finger at start of LList
    node_ptr_tp make_node( int name )
    { // make_node
        node_ptr_tp node = (node_ptr_tp) malloc( NODE_SIZE );
        // checked non-NULL here, not later on user side!
        ASSERT( node, "space for node missing" );
        node->finger  = finger;   // old head of finger list
        node->lowlink = NIL;
        node->number  = NIL;
        node->link    = NULL;
        node->name    = name;
        node->visited = FALSE;
        finger        = node;     // new node becomes head of finger list
        return node;
    } //end make_node

12 Building a Graph

    // input is list of pairs, each element is a node name
    // craft edge from first to second name/number
    // first and second are node pointers, initially NULL
    // scanf returns the number of items read, or EOF at end of input,
    // so loop only while both ints were read
    while( scanf( "%d%d", &a, &b ) == 2 ) { // a, b are ints
        if ( ! ( first = exists( a ) ) ) {
            first = make_node( a );
        } //end if
        if ( ! ( second = exists( b ) ) ) {
            second = make_node( b );
        } //end if
        // now both exist. Either created, or pre-existed:
        // Connect them!
        if ( new_link( first, second ) ) {
            link = make_link( first->link, second );
            ASSERT( link, "no space for link node" );
            first->link = link;
        }else{
            // link was there already, no need to add again!
            printf( "<><> skip duplicate link %d->%d\n", a, b );
        } //end if
    } //end while

13 Building a Graph

    // check whether link between these 2 nodes already exists
    // if not, return true: New! Else return false, NOT new
    bool new_link( node_ptr_tp first, node_ptr_tp second )
    { // new_link
        int target = second->name;
        link_ptr_tp link = first->link;
        while ( link ) {
            if ( target == link->next_node->name ) {
                return FALSE;  // it is an existing link, NOT new
            } //end if
            // check different node
            link = link->next_link;
        } //end while
        // none of successors equal the second node's name
        return TRUE;  // is a new link
    } //end new_link
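The graph-building loop also relies on a lookup `exists( name )` whose body is not shown on the slides. One plausible way to write it, assuming the global `finger` list threads through every node created so far, is sketched below. The body of `exists` and the helper `thread_node` are assumptions for illustration, and the node type is reduced to the two fields the lookup needs.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node_tp * node_ptr_tp;

// Reduced node type for this sketch; the full type has more fields.
typedef struct node_tp {
    node_ptr_tp finger;  // finger through all nodes
    int         name;    // name given at creation
} str_node_tp;

static node_ptr_tp finger = NULL;  // head of the list of all nodes

// Hypothetical helper: thread an already-allocated node onto the list.
void thread_node( node_ptr_tp node )
{
    assert( node != NULL );
    node->finger = finger;
    finger = node;
}

// ASSUMED body: linear search of the finger list by name.
// Returns the node if found, else NULL.
node_ptr_tp exists( int name )
{
    for ( node_ptr_tp n = finger; n; n = n->finger ) {
        if ( n->name == name ) {
            return n;
        }
    }
    return NULL;
}
```

The search is linear in the number of nodes, which makes graph construction quadratic in the worst case; that is acceptable for small course examples, while a hash table keyed by name would be the usual choice for large inputs.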

14 Strongly Connected Components

15 Strongly Connected Component
We'll analyze graphs for the attribute of strong connectivity.
Using one of the best methods known to date, running in time linear in the number of nodes plus edges: by Robert E. Tarjan, in his awesome 1972 SIAM paper: The beauty in Computer Science.
Requires special fields in graph node, int number and int lowlink, which we just add to the regular node:

    typedef struct node_tp
    {
        link_ptr_tp  link;     // points to LL of successors
        node_ptr_tp  finger;   // finger through all nodes
        int          lowlink;  // Tarjan's lowlink
        int          number;   // Tarjan's number
        int          name;     // name given during creation
        bool         visited;  // to check connectivity
    } str_node_tp;
    #define NODE_SIZE sizeof( str_node_tp )

16 Strongly Connected Component
Every node v_i in a strongly connected component (SCC) of graph G can reach every node v_j of that SCC, not necessarily in a single step.
An SCC is a subgraph SG of graph G, SG ⊆ G.
By definition then, a singleton-node graph is strongly connected; not very interesting, but this shows up when we discuss Tarjan's method to uncover SCCs.
We'll enhance Tarjan's code to filter out singleton-node SCCs.
It is not required that an SCC have a single entry point, single exit point, and a single back-edge.
Graph needs a defined entry point: entry or head.
Tarjan's SCC analysis may start at any node of G.
Proof for correct working of the algorithm in [3], Tarjan's awesome 1972 paper.

17 Strongly Connected Component

    // Pseudo code for Tarjan's method
    // of detecting SCCs in directed graph
    int scc_number = 0
    procedure main()
    { // main
        scc_number := 0
        empty the stack
        mark all nodes in G as 'not visited'
        for each node w, w ∈ G, not visited, do
            scc( w )
        end for
    } // end main

18 Strongly Connected Component

    // Pseudo code for Tarjan's method of detecting SCCs in directed graph
    // Node in Tarjan's notation has added lowlink and number
    // Also, there is a stack of nodes, again encoded via a node field and scc_stack
    procedure scc( node_ptr_tp v )
    { // scc
        lowlink( v ) := number( v ) := ++scc_number
        push( v )
        for all successors w of v do
            if w is not visited then -- v->w is a tree arc
                scc( w )
                lowlink( v ) := min( lowlink( v ), lowlink( w ) )
            elsif number( w ) < number( v ) then -- v->w is a frond or cross link
                if in_stack( w ) then
                    lowlink( v ) := min( lowlink( v ), number( w ) )
                end if
            end if
        end for
        if lowlink( v ) = number( v ) then -- next scc found
            scc_count++
            while scc_stack not empty, w := top of scc_stack, number( w ) >= number( v ) do
                -- w is part of this scc
                pop( w )
            end while
        end if
    } // end scc

19 SCC Sample – Omit Trivial SCCs

20 3-Node SCC Sample
[Figure: three nodes named 1, 2, and 3, each annotated with empty Num, Low, and Stack fields to be filled in during the walkthrough.]

21 3-Node SCC Sample
Outside call scc( node 1 ):
  Increments scc_number to 1, sets fields .num and .lowlink = 1.
  Pushes node 1, i.e., stack = node 1; node 1's predecessor = null.
  Recursive call to scc( node 2 ).
  When done, finds that lowlink = num, hence SCC.
Recursive call scc( node 2 ):
  Sets node 2's .num and .lowlink = 2.
  Stack points to node 2; node 2's predecessor is node 1.
  Has 2 successors: node 1 and node 3.
  Node 1 is already visited and still on the stack, so node 2's lowlink drops to 1, while node 3 causes a new call scc( node 3 ).
Recursive call scc( node 3 ):
  Sets .num and .lowlink = 3.
  Stack points to node 3; node 3's predecessor is node 2.
  Has no successor.
  lowlink = num, hence an SCC, but a singleton-node SCC.
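The walkthrough can be replayed with a compact, array-based sketch. It is not the pointer-based course code: the adjacency matrix `edge`, the arrays `number`, `lowlink`, `on_stack`, `stack`, and `scc_id`, and the helper `min_int` are assumptions made to keep the example self-contained. The edges 1->2, 2->1, and 2->3 are those implied by the walkthrough. Unlike the filtered course version, this sketch also counts singleton-node SCCs.

```c
#include <assert.h>
#include <stdbool.h>

#define N 3                        // nodes are named 1..N; index 0 unused

// Edges of the 3-node sample: 1->2, 2->1, 2->3 (row = from, column = to).
static const bool edge[ N + 1 ][ N + 1 ] = {
    { 0, 0, 0, 0 },
    { 0, 0, 1, 0 },
    { 0, 1, 0, 1 },
    { 0, 0, 0, 0 },
};

static int  number[ N + 1 ];       // 0 means "not yet numbered"
static int  lowlink[ N + 1 ];
static bool on_stack[ N + 1 ];
static int  stack[ N + 1 ], top = 0;
static int  scc_number = 0;        // last number handed out
static int  scc_id[ N + 1 ];       // SCC each node ended up in
static int  scc_count = 0;         // counts ALL SCCs, singletons included

static int min_int( int a, int b ) { return a < b ? a : b; }

void scc( int v )
{
    number[ v ] = lowlink[ v ] = ++scc_number;
    stack[ top++ ] = v;
    on_stack[ v ] = true;
    for ( int w = 1; w <= N; w++ ) {
        if ( !edge[ v ][ w ] ) continue;
        if ( number[ w ] == 0 ) {                  // tree arc
            scc( w );
            lowlink[ v ] = min_int( lowlink[ v ], lowlink[ w ] );
        } else if ( number[ w ] < number[ v ] && on_stack[ w ] ) {
            lowlink[ v ] = min_int( lowlink[ v ], number[ w ] );
        }
    }
    if ( lowlink[ v ] == number[ v ] ) {           // v is root of an SCC
        int w;
        scc_count++;
        do {                                       // pop down to and including v
            w = stack[ --top ];
            on_stack[ w ] = false;
            scc_id[ w ] = scc_count;
        } while ( w != v );
    }
}
```

Calling `scc( 1 )` on this graph numbers the nodes 1, 2, 3 in visiting order; node 3 closes first as a singleton, and nodes 1 and 2 end up together in the second SCC because node 2's lowlink reaches back to node 1.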

22 Strongly Connected Component

    ////////////////////////////////////////////////////////
    ////////                                        ////////
    ////////   S C C   G r a p h   A n a l y s i s  ////////
    ////////                                        ////////
    ////////////////////////////////////////////////////////

    // globals for scc
    int          scc_number = NIL;   // Tarjan's SCC numbers
    node_ptr_tp  scc_stack  = NULL;  // stack exists via link in nodes
    int          scc_count  = 0;     // tracks # of SCCs

    // initial point of stack is global "scc_stack"
    // each node has scc_pred link, linking up in fashion of a stack
    void push( node_ptr_tp v )
    { // push
        ASSERT( v, "push() called with NIL vertex v" );
        ASSERT( !( v->visited ), "pushing vertex again?" );
        v->scc_pred = scc_stack;  // first time NULL, then stack ptr
        v->visited  = TRUE;       // will be handled now
        scc_stack   = v;          // global points to new stack head
    } //end push

    // starting with global scc_stack, we can traverse whole stack
    // all elements are connected by node field scc_pred
    void pop()
    { // pop
        ASSERT( scc_stack, "error, empty SCC stack" );
        scc_stack->visited = FALSE;
        scc_stack = scc_stack->scc_pred;
    } //end pop

23 SCC Coded, Part 1

    void scc( node_ptr_tp v )
    { // scc
        node_ptr_tp w;
        ASSERT( v, "calling scc with NULL pointer" );
        ASSERT( !v->number, "node already has non-null number!" );
        v->number = v->lowlink = ++scc_number;
        push( v );
        for( link_ptr_tp link = v->link; link; link = link->next_link ) {
            w = link->next_node;
            ASSERT( w, "node w linked as successor must be /= 0" );
            if ( ! w->number ) {
                scc( w );
                v->lowlink = min( v->lowlink, w->lowlink );
            }else if ( w->number < v->number ) {
                // frond, AKA cross link
                if( w->visited ) {
                    v->lowlink = min( v->lowlink, w->number );
                } //end if
            } //end if
        } //end for
        // continued on next slide: now we can pop

24 SCC Coded, Part 2

        // now pop up all SCCs
        if ( v->lowlink == v->number ) {
            // found next scc; but if singleton node scc: skip it
            if ( scc_stack == v ) {
                // yes, singleton node
                pop();
            }else{
                // multi-node scc; THAT we do consider
                scc_count++;
                while( scc_stack && ( scc_stack->number >= v->number ) ) {
                    printf( "%d scc %d\n", scc_stack->name, scc_count );
                    pop();
                } //end while
            } //end if
        } //end if
    } //end scc

25 References
[1] Control Flow Graph, in: Mayer, H., "Parallel Execution Enabled by Refined Source Analysis: Cost and Benefits in a Supercompiler", R. Oldenbourg Verlag München/Wien, March 1997
[2] Graphs, in: C. Berge, "Graphs and Hypergraphs", North-Holland, Amsterdam 1973
[3] SCCs: Robert Tarjan, "Depth-First Search and Linear Graph Algorithms", SIAM J. Computing, Vol. 1, No. 2, June 1972
