School of Information University of Michigan SI 614 Directed & weighted networks, minimum spanning trees, flow Lecture 12 Instructor: Lada Adamic.

Outline: directed networks, prestige, weighted networks, minimum spanning trees, flow

Review of centrality in undirected networks: comparing across these 3 centrality values
Generally, the 3 centrality types will be positively correlated. When they are not (or only weakly) correlated, it probably tells you something interesting about the network.
High degree, low closeness: ego is embedded in a cluster that is far from the rest of the network.
High degree, low betweenness: ego's connections are redundant; communication bypasses him/her.
High closeness, low degree: key player tied to important/active alters.
High closeness, low betweenness: probably multiple paths in the network; ego is near many people, but so are many others.
High betweenness, low degree: ego's few ties are crucial for network flow.
High betweenness, low closeness: very rare cell. Would mean that ego monopolizes the ties from a small number of people to many others.
Comparison slide: Jim Moody

Bonacich Power Centrality: An actor's centrality (prestige) is a function of the prestige of those they are connected to. Thus, actors who are tied to very central actors should have higher prestige/centrality than those who are not.
$C(\alpha, \beta) = \alpha (I - \beta R)^{-1} R \mathbf{1}$
α is a scaling constant, which is set to normalize the score.
β reflects the extent to which you weight the centrality of people ego is tied to.
R is the adjacency matrix (can be valued).
I is the identity matrix (1s down the diagonal).
1 is a column vector of all ones.
Centrality in Social Networks: Power / Eigenvalue. slide: Jim Moody

Bonacich Power Centrality: The magnitude of β reflects the radius of power. Small values of β weight local structure, larger values weight global structure. If β is positive, then ego has higher centrality when tied to people who are central. If β is negative, then ego has higher centrality when tied to people who are not central. As β approaches zero, you get degree centrality. Centrality in Social Networks: Power / Eigenvalue. slide: Jim Moody
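The effect of β can be checked numerically. The following is a minimal sketch, assuming NumPy is available and using the standard form C(α, β) = α(I − βR)⁻¹R1 with α left at 1 (no normalization). The function name `bonacich_power` and the three-vertex path are hypothetical illustrations, not part of the lecture. The measure is normally used with |β| smaller than the reciprocal of the largest eigenvalue of R.

```python
import numpy as np

def bonacich_power(R, beta, alpha=1.0):
    """Bonacich power centrality: alpha * (I - beta*R)^-1 * R * 1."""
    R = np.asarray(R, dtype=float)
    n = R.shape[0]
    # Solve (I - beta*R) x = R*1 instead of forming the inverse explicitly.
    x = np.linalg.solve(np.eye(n) - beta * R, R @ np.ones(n))
    return alpha * x

# Hypothetical example: an undirected path a - b - c (b is the middle vertex).
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]

degree_like = bonacich_power(path, beta=0.0)   # beta = 0 gives plain degree
positive = bonacich_power(path, beta=0.3)      # being tied to central b helps a and c
negative = bonacich_power(path, beta=-0.3)     # being tied to central b hurts a and c
```

With β = 0 the scores equal the degrees; a positive β narrows the gap between the middle vertex and the ends, and a negative β widens it.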

Bonacich Power Centrality: β = 0.23 [figure] Centrality in Social Networks: Power / Eigenvalue. slide: Jim Moody

Bonacich Power Centrality: β = .35 and β = -.35 [figures] Centrality in Social Networks: Power / Eigenvalue. slide: Jim Moody

Bonacich Power Centrality: β = .23 and β = -.23 [figures] Centrality in Social Networks: Power / Eigenvalue. slide: Jim Moody

Examples of directed networks? WWW, food webs, population dynamics, influence, hereditary, citation, transcription regulation networks, neural networks

Prestige in directed social networks
When 'prestige' may be the right word: admiration, influence, gift-giving, trust. Directionality is especially important in instances where ties may not be reciprocated (e.g. a dining partner choice network).
When 'prestige' may not be the right word: gives advice to (can reverse direction), gives orders to (can reverse direction), lends money to (can reverse direction), dislikes, distrusts.

Extensions of undirected degree centrality: prestige
Degree centrality becomes indegree centrality: a paper that is cited by many others has high prestige; a person nominated by many others for an award has high prestige.

Extensions of undirected closeness centrality
For prestige, closeness centrality usually implies that all paths should lead to you; less commonly it means that paths should lead from you to everywhere else. Usually consider only the vertices from which the node i in question can be reached.

Influence range The influence range of i is the set of vertices who are reachable from the node i

Extending betweenness centrality to directed networks
We now consider the fraction of all directed shortest paths (geodesics) between any two vertices that pass through a node:
$C_B(i) = \sum_{j \neq k} g_{jk}(i) / g_{jk}$
where $C_B(i)$ is the betweenness of vertex i, $g_{jk}(i)$ is the number of paths between j and k that pass through i, and $g_{jk}$ is the number of all paths between j and k.
Only modification: when normalizing, we divide by (N-1)*(N-2) instead of (N-1)*(N-2)/2, because we have twice as many ordered pairs as unordered pairs:
$C'_B(i) = C_B(i) / [(N-1)(N-2)]$

Directed geodesics: A node does not necessarily lie on a geodesic from j to k if it lies on a geodesic from k to j. [figure: directed example with vertices j and k]

Prestige in Pajek
Calculating the indegree prestige: Net>Partition>Degree>Input. To view, select File>Partition>Edit. If you need to reverse the direction of each tie first (e.g. lends money to -> borrows from): Net>Transform>Transpose.
Influence range (a.k.a. input domain): Net>k-Neighbours>Input. Enter the number of the vertex, and 0 to consider all vertices that eventually lead to your chosen vertex. To find out the size of the input domain, select Info>Partition.
Calculate the size of the input domains for all vertices: Net>Partitions>Domain>Input. Can also limit to only neighbors within some distance.

Proximity prestige in Pajek
Direct nominations (choices) should count more than indirect ones. Nominations from second-degree neighbors should count more than third-degree ones. So consider proximity prestige:
$C_p(n_i) = \frac{\text{fraction of all vertices that are in } i\text{'s input domain}}{\text{average distance from } i \text{ to a vertex in the input domain}}$
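The ratio can be computed with a breadth-first search along reversed ties. This is a minimal sketch in plain Python, not Pajek's implementation. The function names and the small nomination network are hypothetical, every vertex is assumed to appear as a key of the dictionary, and the "fraction of all vertices" is taken over the other n - 1 vertices.

```python
from collections import deque

def input_domain_distances(succ, i):
    """Distances to i from every vertex that can reach i (i itself excluded)."""
    # Build predecessor lists so we can walk ties backwards from i.
    pred = {}
    for u, outs in succ.items():
        pred.setdefault(u, [])
        for v in outs:
            pred.setdefault(v, []).append(u)
    dist = {i: 0}
    queue = deque([i])
    while queue:
        v = queue.popleft()
        for u in pred.get(v, []):
            if u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    del dist[i]
    return dist

def proximity_prestige(succ, i):
    """(fraction of other vertices in i's input domain) / (their mean distance to i)."""
    dist = input_domain_distances(succ, i)
    if not dist:
        return 0.0  # nobody nominates i, directly or indirectly
    fraction = len(dist) / (len(succ) - 1)
    average_distance = sum(dist.values()) / len(dist)
    return fraction / average_distance

# Hypothetical nominations: a -> b -> c, and d nominates nobody.
nominations = {"a": ["b"], "b": ["c"], "c": [], "d": []}
prestige_c = proximity_prestige(nominations, "c")
```

Here c is reached by b (distance 1) and a (distance 2), so its input domain covers 2 of the 3 other vertices at an average distance of 1.5.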

Weighted networks
Examples: email communication, sports matches, packet transfer, population movement, co-authorship, food webs
Weighted treatment of data/algorithms is usually left for 'future work'.

But what are weights good for? Defining thresholds; shortest paths that don't take long; flow/capacity of a network

Food webs
Usually considered as binary networks. Problems in defining threshold fluxes: do killer whales who eat bears count?
Weights:
  interaction frequency: acts of predation per hectare per day
  carbon flow (prey to predator): grams of carbon per square meter per year
  interaction strength (predator on prey): (carbon flow of prey to predator) / (biomass of predator)
[figure: Lake carbon flow]

Co-authorship networks
The weight assigned to each edge sums, over all papers that two people co-authored, a contribution that shrinks with the number of authors on the paper:
$w_{ij} = \sum_{k} \frac{1}{n_k - 1}$
where the sum runs over all papers k where i and j were coauthors and $n_k$ is the number of authors of paper k.
A large-scale high energy physics collaboration producing a paper with 100 authors is less evidence of direct collaboration than an article in 'Social Networks' with only two co-authors.
Should we normalize? Should all weights from i to other nodes sum to 1? (probably not)
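A minimal sketch of this weighting, assuming each paper contributes 1/(n_k - 1) to every pair of its authors (the per-paper weight the lecture uses for cosine similarity). The function name and the author lists are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def coauthor_weights(papers):
    """Map each unordered author pair to the sum of 1/(n_k - 1) over shared papers."""
    weights = defaultdict(float)
    for authors in papers:
        authors = sorted(set(authors))
        n_k = len(authors)
        if n_k < 2:
            continue  # single-author papers create no ties
        for i, j in combinations(authors, 2):
            weights[(i, j)] += 1.0 / (n_k - 1)
    return dict(weights)

# Hypothetical data: one two-author paper and one three-author paper.
papers = [["ann", "bob"], ["ann", "bob", "cat"]]
weights = coauthor_weights(papers)
```

The pair (ann, bob) gets 1 from the two-author paper plus 1/2 from the three-author paper, while the pairs involving cat get 1/2 each.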

Symmetry in normalization
If normalizing by the sum of values for each node, the weights become asymmetric: w_ij = 3/3 = 1 but w_ji = 3/15 = 1/5 (assume simple weighting = number of papers co-authored).
[figure: nodes i and j in a small network with edge weights 1, 2, 3, 3, 6, 3]
Cosine similarity gives symmetric values: assume the weight for each paper is w_k = 1/(n_k - 1); i and j each have vectors of 0's and w's depending on whether they authored paper k; normalize by the length of both vectors.
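The cosine variant can be sketched as follows, assuming each author's vector has one entry per paper, equal to w_k = 1/(n_k - 1) if they authored paper k and 0 otherwise. The function name and paper list are hypothetical.

```python
import math

def cosine_coauthor_similarity(papers, i, j):
    """Cosine of the angle between the paper-weight vectors of authors i and j."""
    vec_i, vec_j = [], []
    for authors in papers:
        n_k = len(set(authors))
        w_k = 1.0 / (n_k - 1) if n_k > 1 else 0.0
        vec_i.append(w_k if i in authors else 0.0)
        vec_j.append(w_k if j in authors else 0.0)
    dot = sum(a * b for a, b in zip(vec_i, vec_j))
    norm = math.sqrt(sum(a * a for a in vec_i)) * math.sqrt(sum(b * b for b in vec_j))
    return dot / norm if norm else 0.0

# Hypothetical data: bob has one more paper than ann.
papers = [["ann", "bob"], ["ann", "bob", "cat"], ["bob", "dan"]]
sim_ab = cosine_coauthor_similarity(papers, "ann", "bob")
sim_ba = cosine_coauthor_similarity(papers, "bob", "ann")
```

Unlike normalizing by each node's own total, the result is the same whichever author is listed first.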

Other similarity measures
Simple matching, Dice's coefficient, Jaccard's coefficient, cosine coefficient, overlap coefficient
Q: set of papers authored by a_1
D: set of papers authored by a_2
[figure: authors a_1, a_2, a_3 and papers p_1 to p_11]
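The formulas for these measures did not survive in the transcript. The sketch below uses the standard set-based definitions from information retrieval, which is an assumption about what the slide showed. The function name and the paper sets are hypothetical, and both sets are assumed to be non-empty.

```python
import math

def similarity_measures(Q, D):
    """Standard set-overlap similarity coefficients between two paper sets."""
    Q, D = set(Q), set(D)
    common = len(Q & D)
    return {
        "simple_matching": common,                          # |Q n D|
        "dice": 2 * common / (len(Q) + len(D)),             # 2|Q n D| / (|Q| + |D|)
        "jaccard": common / len(Q | D),                     # |Q n D| / |Q u D|
        "cosine": common / math.sqrt(len(Q) * len(D)),      # |Q n D| / sqrt(|Q| |D|)
        "overlap": common / min(len(Q), len(D)),            # |Q n D| / min(|Q|, |D|)
    }

# Hypothetical paper sets for two authors who share p3 and p4.
Q = {"p1", "p2", "p3", "p4"}
D = {"p3", "p4", "p5", "p6", "p7", "p8"}
scores = similarity_measures(Q, D)
```

All five agree on the shared count of 2 but normalize it differently, so the more prolific author pulls Jaccard and Dice down more than the overlap coefficient.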

Weighted shortest paths
Routes: shortest route from Chicago to Boston. Vertex: intersection. Edge weights: road distances. Alternative weights: expected time traveled, gas consumed, ... Usually sum the weights from each segment.
Example from start to finish:
  freeway, 65 mph: 40 miles / 65 mph ~ 37 minutes
  freeway, 70 mph: 30 miles / 70 mph ~ 26 minutes
  surface road, 25 mph: 50 miles / 25 mph = 2 hours

Reliable paths through social networks
The probability of transmitting a message or infectious agent could be related to the strength of the tie. Rather than summing the weights, we might multiply the probabilities of getting through.
Probability of getting an idea through to the head of labs: via the CEO, 0.001 * 1 = 0.001; via the direct manager, 0.5 * 0.5 = 0.25.
[figure: network with tie probabilities p = 0.5, p = 0.001, p = 1, p = 0.05]
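Multiplying probabilities can be reduced to an ordinary shortest-path search by giving each tie the cost -log(p), since minimizing the sum of the costs maximizes the product of the probabilities. The sketch below is one way to do this, not a method stated in the lecture. The function name, the vertex names, and the graph layout are assumptions that reproduce only the two routes quoted above.

```python
import heapq
import math

def most_reliable_path(graph, source, target):
    """Return (probability, path) for the path maximizing the product of tie probabilities."""
    best = {source: 0.0}          # smallest accumulated -log(p) found so far
    parent = {source: None}
    done = set()
    heap = [(0.0, source)]
    while heap:
        cost, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, p in graph.get(u, {}).items():
            if p <= 0:
                continue  # a tie that never transmits is unusable
            new_cost = cost - math.log(p)
            if new_cost < best.get(v, math.inf):
                best[v] = new_cost
                parent[v] = u
                heapq.heappush(heap, (new_cost, v))
    if target not in best:
        return 0.0, []
    path, node = [], target
    while node is not None:
        path.append(node)
        node = parent[node]
    path.reverse()
    return math.exp(-best[target]), path

# Hypothetical layout of the two routes quoted in the slide.
ties = {
    "you": {"CEO": 0.001, "manager": 0.5},
    "CEO": {"head of labs": 1.0},
    "manager": {"head of labs": 0.5},
}
probability, route = most_reliable_path(ties, "you", "head of labs")
```

Because every probability is at most 1, every cost is nonnegative, which is the condition Dijkstra-style searches need.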

Shortest Path Problem
Given a weighted graph and two vertices u and v, we want to find a path of minimum total weight between u and v. The length of a path is the sum of the weights of its edges.
Example: shortest path between Providence and Honolulu
Applications: Internet packet routing, flight reservations, driving directions
[figure: flight network over ORD, PVD, MIA, DFW, SFO, LAX, LGA, HNL with edge weights 849, 802, 1387, 1743, 1843, 1099, 1120, 1233, 337, 2555, 142, 1205]
slide by: Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/

Negative weights
Shortest paths are undefined in a graph with negative edge weights if negative cycles are present: each extra trip around such a cycle lowers the total weight. [figure: example with edge weights 2, -3, 4, 3]

Shortest Path Properties
Property 1: A subpath of a shortest path is itself a shortest path.
Property 2: There is a tree of shortest paths from a start vertex to all the other vertices.
Example: tree of shortest paths from Providence
[figure: flight network over ORD, PVD, MIA, DFW, SFO, LAX, LGA, HNL with edge weights 849, 802, 1387, 1743, 1843, 1099, 1120, 1233, 337, 2555, 142, 1205]
slide by: Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/

Dijkstra's Algorithm
The distance of a vertex v from a vertex s is the length of a shortest path between s and v. Dijkstra's algorithm computes the distances of all the vertices from a given start vertex s.
Assumptions: the graph is connected; the edges are undirected; the edge weights are nonnegative.
We grow a "cloud" of vertices, beginning with s and eventually covering all the vertices. We store with each vertex v a label d(v) representing the distance of v from s in the subgraph consisting of the cloud and its adjacent vertices.
At each step: we add to the cloud the vertex u outside the cloud with the smallest distance label, d(u); we update the labels of the vertices adjacent to u.
slide by: Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/

Edge Relaxation
Consider an edge e = (u,z) such that u is the vertex most recently added to the cloud and z is not in the cloud. The relaxation of edge e updates distance d(z) as follows:
$d(z) \leftarrow \min\{d(z),\ d(u) + \text{weight}(e)\}$
[figure: with d(u) = 50 and weight(e) = 10, relaxing e lowers d(z) from 75 to 60]
slide by: Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/
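The cloud-growing loop and the relaxation step can be sketched with a binary heap. This is a minimal illustration rather than the slide author's code. The helper names are hypothetical, and the edge list is an assumption: it was reconstructed from the labels that survive in the lecture's six-vertex example and should be treated as illustrative.

```python
import heapq

def undirected(edges):
    """Build an adjacency dict {u: {v: weight}} from (u, v, weight) triples."""
    graph = {}
    for u, v, w in edges:
        graph.setdefault(u, {})[v] = w
        graph.setdefault(v, {})[u] = w
    return graph

def dijkstra(graph, s):
    """Distances from s to every vertex, assuming nonnegative edge weights."""
    dist = {v: float("inf") for v in graph}
    dist[s] = 0
    cloud = set()
    heap = [(0, s)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if u in cloud:
            continue  # stale heap entry; u was already added with a smaller label
        cloud.add(u)
        for z, w in graph[u].items():
            # Edge relaxation: d(z) <- min{d(z), d(u) + weight(e)}
            if z not in cloud and d_u + w < dist[z]:
                dist[z] = d_u + w
                heapq.heappush(heap, (dist[z], z))
    return dist

# Assumed edge list (reconstructed, illustrative only).
edges = [("A", "B", 8), ("A", "C", 2), ("A", "D", 4), ("B", "C", 7), ("B", "E", 2),
         ("C", "D", 1), ("C", "E", 3), ("C", "F", 9), ("D", "F", 5)]
distances = dijkstra(undirected(edges), "A")
```

Pushing a new heap entry on every successful relaxation, and skipping stale entries on pop, stands in for the decrease-key operation that a textbook priority queue would use.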

Example [figure: four successive steps of Dijkstra's algorithm on a graph with vertices A to F and edge weights 4, 8, 7, 1, 2, 5, 2, 3, 9; the surviving distance labels go from 0, 4, 2, 8, ∞ to 0, 3, 2, 7, 5, 8] slide by: Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/

Example (cont.) [figure: final two steps on the same graph, distance labels 0, 3, 2, 7, 5, 8] slide by: Huajie Zhang, http://www.cs.unb.ca/courses/cs3913/

Minimum spanning trees
Connect all vertices with a single tree. Consider a communications company, such as AT&T or GTE, that needs to build a communication network that connects n different users. The cost of making a link joining i and j is c_ij. What is the minimum cost of connecting all of the users?
Common assumption: the only links possible are the ones directly joining two nodes.
[figure: example network with nodes 1 to 10]
web.mit.edu/~jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt

Electronic Circuitry
Consider a system with a number of electronic components. In order to make two pins i and j of different components electrically equivalent, one can connect i and j by a wire. How can we connect n different pins in this way to make them electrically equivalent to each other so as to minimize the total wire length?
[figure: pins 1 to 5]
web.mit.edu/~jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt

Minimum Cost Spanning Tree Problem
Undirected network G = (N, A). (i, j) is the same arc as (j, i). We associate with each arc (i, j) ∈ A a cost c_ij.
A spanning tree T of G is a connected acyclic subgraph that spans all the nodes. A connected subgraph with n nodes and n - 1 arcs is a spanning tree.
The minimum cost spanning tree problem is to find a spanning tree of minimum cost.
web.mit.edu/~jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt

A Minimum Cost Spanning Tree Problem [figure: network with nodes 1 to 7 and arc costs 35, 10, 30, 15, 25, 40, 20, 17, 8, 15, 11, 21] web.mit.edu/~jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt

A Minimum Cost Spanning Tree [figure: the same network with a minimum cost spanning tree highlighted] web.mit.edu/~jorlin/www/15.082/Lectures/16_Spanning_Trees.ppt

Prim-Jarnik Algorithm
Vertex-based algorithm. Grows one tree T, one vertex at a time. A cloud covers the portion of T already computed.
Label the vertices v outside the cloud with key[v], the minimum weight of an edge connecting v to a vertex in the cloud; key[v] = ∞ if no such edge exists.
www.cs.earlham.edu/~celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
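A minimal sketch of the cloud-growing idea, assuming a connected undirected graph stored as an adjacency dict. Instead of maintaining key[v] explicitly, this "lazy" variant keeps candidate edges in a heap and discards those whose far end is already in the cloud; the cheapest surviving edge to v plays the role of key[v]. The function name and the four-vertex graph are hypothetical.

```python
import heapq

def prim_mst(graph, start):
    """Return the edges (u, v, weight) of a minimum spanning tree grown from start."""
    cloud = {start}
    tree = []
    heap = [(w, start, v) for v, w in graph[start].items()]
    heapq.heapify(heap)
    while heap and len(cloud) < len(graph):
        w, u, v = heapq.heappop(heap)
        if v in cloud:
            continue  # both ends already in the cloud; the edge would close a cycle
        cloud.add(v)
        tree.append((u, v, w))
        for x, wx in graph[v].items():
            if x not in cloud:
                heapq.heappush(heap, (wx, v, x))
    return tree

# Hypothetical graph: a-b 1, b-c 2, a-c 3, c-d 4, b-d 5.
graph = {
    "a": {"b": 1, "c": 3},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 3, "b": 2, "d": 4},
    "d": {"b": 5, "c": 4},
}
tree = prim_mst(graph, "a")
total_cost = sum(w for _, _, w in tree)
```

The tree picks a-b, b-c and c-d and skips a-c and b-d, each of which would connect two vertices already in the cloud.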

Prim Example www.cs.earlham.edu/~celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt

Prim Example (2) www.cs.earlham.edu/~celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt

Prim Example (3) www.cs.earlham.edu/~celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt

Kruskal's Algorithm
The algorithm adds the cheapest edge that connects two trees of the forest.
MST-Kruskal(G, w)
  A ← ∅
  for each vertex v ∈ V[G] do
    Make-Set(v)
  sort the edges of E by non-decreasing weight w
  for each edge (u,v) ∈ E, in order by non-decreasing weight do
    if Find-Set(u) ≠ Find-Set(v) then
      A ← A ∪ {(u,v)}
      Union(u,v)
  return A
www.cs.earlham.edu/~celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt
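The pseudocode translates fairly directly into Python. The sketch below uses a simple union-find with path halving for Make-Set, Find-Set and Union; the function name, the (weight, u, v) edge format, and the example graph are assumptions made for illustration.

```python
def kruskal_mst(vertices, edges):
    """Return MST edges (u, v, weight); edges are given as (weight, u, v) triples."""
    parent = {v: v for v in vertices}      # Make-Set for every vertex

    def find_set(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving keeps the trees shallow
            v = parent[v]
        return v

    tree = []
    for w, u, v in sorted(edges):          # non-decreasing weight
        root_u, root_v = find_set(u), find_set(v)
        if root_u != root_v:               # u and v lie in different trees of the forest
            tree.append((u, v, w))
            parent[root_u] = root_v        # Union
    return tree

# Hypothetical graph: a-b 1, b-c 2, a-c 3, c-d 4, b-d 5.
edges = [(3, "a", "c"), (1, "a", "b"), (5, "b", "d"), (2, "b", "c"), (4, "c", "d")]
tree = kruskal_mst("abcd", edges)
```

Edge a-c (weight 3) is rejected because a and c are already in the same set by the time it is considered, and b-d (weight 5) is rejected for the same reason.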

Kruskal Example www.cs.earlham.edu/~celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt

Kruskal Example (2) www.cs.earlham.edu/~celikeb/fall_2005/cs310_aads/lecture_slides/ch23_minimum_spanning_trees.ppt

Network flow
Applications:
  traffic & transportation: maximum number of cars that can commute from Berkeley to San Francisco during rush hour
  fluid networks: pipes that carry liquids
  computer networks: packets traveling along fiber
Extended applications (from Kleinberg & Tardos, "Algorithm Design"): bipartite matching problem, number of disjoint paths between two vertices, survey design, airline scheduling, image segmentation, baseball elimination

Max flow problem: how much stuff can we get from source to sink per unit time? [figure: network with a source, a sink, and edge capacities, e.g. 7] www.comp.nus.edu.sg/~ooiwt/slides/2004-cs3233-graph2.ppt

Equivalent tasks Find a cut with minimum capacity Find maximum flow from source to sink www.comp.nus.edu.sg/~ooiwt/slides/2004-cs3233-graph2.ppt

A Flow [figure: a flow on the network and the corresponding residual graph; surviving edge labels 7, 3, 2, 5 and 2, 5] www.comp.nus.edu.sg/~ooiwt/slides/2004-cs3233-graph2.ppt

Augmenting Paths
An augmenting path is a path from source to sink in the residual graph of a given flow. If there is an augmenting path in the residual graph, we can push more flow.
www.comp.nus.edu.sg/~ooiwt/slides/2004-cs3233-graph2.ppt

Ford-Fulkerson Method
initialize total flow to 0
residual graph G' = G
while an augmenting path exists in G'
  pick an augmenting path P in G'
  m = bottleneck capacity of P
  add m to total flow
  push flow of m along P
  update G'
www.comp.nus.edu.sg/~ooiwt/slides/2004-cs3233-graph2.ppt
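The method leaves open how the augmenting path is picked. The sketch below picks a shortest one with breadth-first search, which is the Edmonds-Karp variant of Ford-Fulkerson, so it is one concrete choice rather than the only one. The function name, the nested-dict capacity format, and the small network are hypothetical.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Maximum flow value; capacity is a dict {u: {v: capacity of edge u->v}}."""
    # Residual graph G' starts as a copy of G plus zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, outs in capacity.items():
        for v, c in outs.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)
    total = 0
    while True:
        # Breadth-first search for an augmenting path P in G'.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path is left
        # m = bottleneck capacity of P
        m = float("inf")
        v = sink
        while parent[v] is not None:
            m = min(m, residual[parent[v]][v])
            v = parent[v]
        # Push m along P and update G'.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= m
            residual[v][u] += m
            v = u
        total += m

# Hypothetical network, not the one from the lecture's example.
capacity = {"s": {"a": 3, "b": 2}, "a": {"b": 1, "t": 2}, "b": {"t": 3}, "t": {}}
flow_value = max_flow(capacity, "s", "t")
```

The reverse edges in the residual graph let a later augmenting path undo part of an earlier one, which is what makes the greedy loop reach the maximum.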

Example [three figures of the example flow network with edge capacity and residual labels; the numeric labels cannot be matched to edges in this transcript] www.comp.nus.edu.sg/~ooiwt/slides/2004-cs3233-graph2.ppt

Answer: Max Flow = 4 [figure: final flow on the example network] www.comp.nus.edu.sg/~ooiwt/slides/2004-cs3233-graph2.ppt

Answer: Minimum Cut = 4 [figure: minimum cut on the example network] www.comp.nus.edu.sg/~ooiwt/slides/2004-cs3233-graph2.ppt

Project status report
Worth 5% of your grade, meant to keep you on track. 2-3 weeks later: in-class presentation. 1 month later: final project report due.
What it should do:
  include part of your project proposal as intro
  include result summaries (including figures & tables)
  be 4-6 pages
  include references to and briefly (paragraph or 2) discuss some related work
  include a plan of remaining work
It is graded on a 0-5 scale:
  5 - same as 4, but very complete and already shows interesting new insights
  4 - data, more than basic analysis (e.g. looked at robustness, community structure, centrality, etc. if applicable)
  3 - some data, preliminary analysis (imported data into Pajek or GUESS, counted things up, visualized, if possible)
  2 - some data, no results
  1 - attempts made to get project started, but nothing worked out (no data, no results)
  0 - no work done

GUESS installation (Windows)
Unzip the files into a folder.
Edit guess.bat (a batch executable file) so that
  @rem set GUESS_HOME=c:\program files\GUESS
becomes
  @set GUESS_HOME=C:\PROGRA~1\GUESS
if you installed into c:\Program Files\GUESS.
Otherwise, you can try installing into a directory with no spaces in the name and use (e.g.)
  @set GUESS_HOME=C:\apps\GUESS
