The Structure and Function of Complex Networks Part I Jim Vallandingham M. E. J. Newman.


1 The Structure and Function of Complex Networks Part I Jim Vallandingham M. E. J. Newman

2 Introduction Paper is a review of – Network types – Common network properties – Network models Examine large networks – Millions / Billions of nodes Statistical methods are an attempt to find something to play the part of the eye in current network analysis

3 Organization I. Definitions II. Types of Networks III. Properties of Networks IV. Random Graphs V. Extensions to Random Graphs VI. Markov Graphs

4 Definitions Network | Graph: – Composed of items : vertices / nodes – Connections between vertices : edges Directed edge: – One that runs in only one direction Degree: – Number of edges connected to a vertex – Directed graph has an in-degree and out-degree for each vertex

5 Definitions
Undirected Graph
Vertex  Degree
1       2
2       3
3       2
4       3
5       3
6       1
Directed Graph
Vertex  In-Degree  Out-Degree
1       0          2
2       2          0
3       2          2
4       1          1
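
The degree definitions can be illustrated with a minimal Python sketch. The edge lists below are hypothetical: they were chosen only so that the computed degrees match the values tabulated on this slide, since the graphs actually drawn on the slide are not recoverable from the transcript.

```python
from collections import Counter

# Hypothetical edge lists (assumptions): picked so the degrees match the slide's tables.
undirected_edges = [(1, 2), (1, 5), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
directed_edges = [(1, 2), (1, 3), (3, 2), (3, 4), (4, 3)]  # (source, target)


def degrees(edges):
    """Degree of each vertex in an undirected graph given as an edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return dict(deg)


def in_out_degrees(edges):
    """Return (in_degree, out_degree) dictionaries for a directed edge list."""
    in_deg, out_deg = Counter(), Counter()
    for source, target in edges:
        out_deg[source] += 1
        in_deg[target] += 1
    vertices = {v for edge in edges for v in edge}
    # Vertices with no incoming (or outgoing) edges are reported with degree 0.
    return ({v: in_deg[v] for v in vertices}, {v: out_deg[v] for v in vertices})


undirected_degree = degrees(undirected_edges)
in_degree, out_degree = in_out_degrees(directed_edges)
```

In the directed case the in-degrees and out-degrees each sum to the number of edges, which is a quick sanity check on any such table.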

6 Definitions Component: – Set of vertices connected together by edges Geodesic Path: – The shortest path through the network from one vertex to another. – Can be multiple geodesic paths between two vertices Diameter: – Length of the longest geodesic path – In terms of edges
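
As a rough illustration of these three definitions, the sketch below finds components and the diameter with breadth-first search. The seven-vertex graph is a made-up example, and the diameter is taken over connected pairs only, which is an assumption of this sketch rather than something the slide states.

```python
from collections import deque


def build_adjacency(vertices, edges):
    """Adjacency sets for an undirected graph."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def bfs_distances(adj, source):
    """Geodesic distance (in edges) from source to every vertex it can reach."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def components(adj):
    """List of vertex sets, one per connected component."""
    seen, result = set(), []
    for v in adj:
        if v not in seen:
            comp = set(bfs_distances(adj, v))
            seen |= comp
            result.append(comp)
    return result


def diameter(adj):
    """Length of the longest geodesic path among pairs that are connected."""
    return max(max(bfs_distances(adj, v).values()) for v in adj)


# Hypothetical example: a path 1-2-3-4, a separate pair 5-6, and an isolated vertex 7.
adj = build_adjacency(range(1, 8), [(1, 2), (2, 3), (3, 4), (5, 6)])
parts = components(adj)
longest = diameter(adj)
```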

7 Definitions Three components in a network

8 Types of Networks A. Social Networks B. Information Networks C. Technological Networks D. Biological Networks

9 Social Networks Definition: – Set of people or groups of people with some interaction pattern between them Early Work: – Southern Women Study Social circles of small southern town in 1936 – Social networks of factory workers in 1930s Current Work: – Business communities – Sexual partner studies

10 Social Networks Internet Relay Chat (IRC) communications between individuals

11 Social Networks Dating relationships between students in a high school

12 Social Networks Small-World experiments – Looked at the distribution of path lengths in a network – Participants were asked to pass a letter along in an attempt to reach a specific individual – Showed that there is usually a short path between any two vertices in a network – Later became the basis of the six degrees of separation concept.

13 Social Networks Problems with traditional social networks – Based on questionnaires Labor intensive process which limits the size of network Source of bias which skews results – Friend might mean different thing to different people Presents need for other methods for probing social networks

14 Social Networks Collaboration Networks – Affiliation networks in which vertices collaborate in groups of some sort – Edges are created between pairs of nodes that have a common group membership – Classic Example : IMDB – Internet Movie Database Vertices are actors Edges indicate two actors have been in the same film together
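
The edge rule for collaboration networks (connect every pair of vertices that share a group) can be sketched as follows. The film and actor names are placeholders invented for illustration, not data from IMDB.

```python
from itertools import combinations

# Hypothetical affiliation data (assumption): film -> list of cast members.
casts = {
    "Film A": ["Actor 1", "Actor 2", "Actor 3"],
    "Film B": ["Actor 2", "Actor 4"],
    "Film C": ["Actor 1", "Actor 2"],
}


def collaboration_edges(groups):
    """Undirected edges between every pair of members that share at least one group."""
    edges = set()
    for members in groups.values():
        # Sorting gives each pair a canonical order, so repeats collapse in the set.
        for a, b in combinations(sorted(set(members)), 2):
            edges.add((a, b))
    return edges


edges = collaboration_edges(casts)
```

Using a set discards the information that two actors appeared together more than once; a weighted version would count co-appearances instead.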

15 Social Networks

16 Other social network data sources – Phone Calls – Email – Instant Messaging Produce Millions of pieces of data a day – Demonstrate the need for new analytical methods

17 Information Networks Also known as knowledge networks Definition: – Representation of how information moves through a population or group Classic Example: – Network of citations between academic papers Directed edges Mostly acyclic – Papers can only cite other papers already written and not future papers. (not always true)

18 Information Networks Citation Network for Inferring network mechanisms: The Drosophila melanogaster protein interaction network

19 Information Networks The World Wide Web – Network of information containing pages Vertices are the pages themselves Edge is created when one page links to another – No constraints as seen in the citation network Cycles Multiple edges between vertices – Power-law in-degree and out-degree distributions

20 Information Networks Graph of Relationships between Facebook pages. Example of an Information Network with Social Network aspects.

21 Information Networks Preference Networks – Includes two kinds of vertices Individuals Objects of their preference – Example: books or films – Edges connect vertices of different types – Edges can be weighted – Example of Bipartite Information Network

22 Technological Networks Definition: – Man-made networks designed for the transportation of a resource or commodity Examples – Power grid – Airline routes – The Internet Physical network of machines

23 Technological Networks Bandwidth transfer in Europe between countries

24 Biological Networks Wide variety of biological systems can be represented as networks Metabolic Pathways – Vertices are metabolic substrates and products – A directed edge joins a substrate to a product when a known reaction exists that produces the product from the substrate Protein Interactions – Mechanistic physical interactions between proteins

25 Biological Networks


27 Portion of yeast protein interactions

28 Biological Networks Gene Regulatory Networks – Expression of protein coded by particular genes – Controlled by other proteins Act as inducers and inhibitors – Vertices represent proteins – Edges represent dependencies between proteins – One of the first networked dynamical systems for which large-scale modeling attempts were made

29 Biological Networks Food Webs – Vertices represent species – Directed edge indicates predatory relationship Could be the other way in terms of carbon movement Neural Networks – Actual biological neuron pathways

30 Biological Networks Reef fish food web

31 Biological Networks Rat hippocampal neurons

32 Properties of Networks Look at features that are common to many types of networks May or may not encode important or relevant information for any one graph Might be suggestive of the mechanisms in how real networks are formed Most involve how real networks are different than random graphs

33 Properties of Networks Small-World Effect Transitivity Degree Distribution Network Resilience Mixing Patterns Degree Correlation Community Structure Network Navigation

34 Properties of Networks Small-World Effect Transitivity Degree Distribution Network Resilience Mixing Patterns Degree Correlation Community Structure Network Navigation

35 Properties: Small-World Effect Most pairs of vertices are connected by a relatively short path through the network Distance between any two vertices in a graph is usually much smaller than the total number of vertices Deals with the geodesic distance property – Uses Mean Geodesic Distance: $\ell = \frac{1}{\frac{1}{2} n (n+1)} \sum_{i \ge j} d_{ij}$, where $d_{ij}$ is the geodesic distance from vertex $i$ to vertex $j$

36 Properties: Small-World Effect $\ell$ can be measured in O(mn) time, where m is the number of edges and n is the number of vertices – $\ell$ is usually much smaller than n Can be problematic if there are multiple components in the graph – Vertex pairs with no connecting path are assigned infinite geodesic distance, and thus the average geodesic distance also becomes infinite – Alternate way is to exclude any vertex pairs that lie in different components
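
The O(mn) cost comes from running one breadth-first search, costing O(m), from each of the n vertices. The sketch below does exactly that and averages only over pairs of distinct vertices joined by a path, which is one way to handle multiple components. Normalization conventions differ (some definitions also count each vertex's zero distance to itself), so the exact averaging used here is an assumption.

```python
from collections import deque


def bfs_distances(adj, source):
    """Geodesic distance (in edges) from source to every vertex it can reach."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def mean_geodesic_distance(adj):
    """Mean distance over ordered pairs of distinct vertices that are connected."""
    total, pairs = 0, 0
    for source in adj:  # n searches, each O(m)
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs if pairs else float("nan")


# Hypothetical two-component graph: a path 1-2-3 and a separate edge 4-5.
adj = {1: {2}, 2: {1, 3}, 3: {2}, 4: {5}, 5: {4}}
mean_distance = mean_geodesic_distance(adj)
```

Unreachable pairs never appear in a BFS result, so they are excluded automatically rather than contributing an infinite distance.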

37 Properties: Small-World Effect This property implies that spread of x through real networks occurs fast – Rumor – Information Mathematically obvious – If number of vertices within distance r grows exponentially – Value of $\ell$ will increase as log n – small-world can refer to networks in which value of $\ell$ scales logarithmically or slower with network size

38 Properties: Small-World Effect Biological example: protein-protein interactions in the yeast, S. cerevisiae Vertices: 1870 Edges: 2240 $\ell$: 6.80

39 Properties of Networks Small-World Effect Transitivity Degree Distribution Network Resilience Mixing Patterns Degree Correlation Community Structure Network Navigation

40 Properties: Transitivity Probability that if vertex A is connected to vertex B, and vertex B is connected to vertex C, then vertex A will also be connected to vertex C In social network terms: the friend of your friend is likely also to be your friend Also known as clustering – This is confusing as it has another meaning – Quantified using the Clustering Coefficient

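41 Properties: Transitivity C: clustering coefficient, $C = \frac{3 \times \text{number of triangles in the network}}{\text{number of connected triples of vertices}}$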

42 Properties: Transitivity [Figure: example network with numbered vertices] Fraction of Transitive Triples

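43 Properties: Transitivity Can also be defined locally for each vertex: $C_i = \frac{\text{number of triangles connected to vertex } i}{\text{number of triples centered on vertex } i}$ With this value the definition of C becomes: $C = \frac{1}{n} \sum_i C_i$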

44 Properties: Transitivity Alternative method for clustering coefficient (five-vertex example): $C_1 = 1/1 = 1$, $C_2 = 1$, $C_3 = 1/6$, $C_4 = 0$, $C_5 = 0$, so $C = \frac{1}{5}\left(1 + 1 + \frac{1}{6}\right) = \frac{13}{30}$

45 Properties: Transitivity Two definitions, labeled $C^{(1)}$ and $C^{(2)}$ in text Effectively reverses the order of the operations: – Taking the ratio of triangles to triples – Averaging over vertices $C^{(2)}$ calculates the mean of the ratio $C^{(1)}$ calculates the ratio of the means $C^{(2)}$ tends to weight contributions of low-degree vertices more heavily – The two can give significantly different results
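
The difference between the two definitions can be checked on a small graph. The edge list below is an assumption: a triangle on vertices 1, 2, 3 with two extra vertices hanging off vertex 3, chosen because it reproduces the local values 1, 1, 1/6, 0, 0 from the five-vertex example. Vertices of degree below 2 are given a local coefficient of 0, which is also an assumption of this sketch.

```python
from fractions import Fraction
from itertools import combinations

# Assumed edge list: triangle 1-2-3 plus pendant vertices 4 and 5 attached to 3.
edges = [(1, 2), (1, 3), (2, 3), (3, 4), (3, 5)]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)


def triples_and_closed(adj, v):
    """Count connected triples centred on v, and how many of them are closed."""
    triples = closed = 0
    for a, b in combinations(sorted(adj[v]), 2):
        triples += 1
        if b in adj[a]:
            closed += 1
    return triples, closed


def clustering_ratio_of_means(adj):
    """Closed triples over all triples; each triangle closes three triples."""
    counts = [triples_and_closed(adj, v) for v in adj]
    return Fraction(sum(c for _, c in counts), sum(t for t, _ in counts))


def clustering_mean_of_ratios(adj):
    """Mean of the local coefficients C_i, with C_i = 0 when no triple is centred on i."""
    local = []
    for v in adj:
        triples, closed = triples_and_closed(adj, v)
        local.append(Fraction(closed, triples) if triples else Fraction(0))
    return sum(local, Fraction(0)) / len(local)


c_ratio_of_means = clustering_ratio_of_means(adj)
c_mean_of_ratios = clustering_mean_of_ratios(adj)
```

On this graph the ratio-of-means value is 3/8 while the mean-of-ratios value is 13/30: the two low-degree triangle vertices, each with a local coefficient of 1, pull the vertex average up.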

46 Properties: Transitivity $C_i$ is also used often in the sociological literature – Called network density Both $C^{(1)}$ and $C^{(2)}$ usually are significantly higher in real networks than in random graphs

47 Properties of Networks Small-World Effect Transitivity Degree Distribution Network Resilience Mixing Patterns Degree Correlation Community Structure Network Navigation

48 Properties: Degree Distributions Degree of a vertex is the number of edges connected to that vertex $p_k$ is the probability that a vertex chosen at random has a degree k Examined by creating a histogram of $p_k$ – Called the degree distribution for that network

49 Properties: Degree Distributions

50 Degree distributions of real world networks are usually highly right-skewed – Long right tail of values above the mean Measuring the tail is difficult – Small sample size in that section – Usually noisy

51 Properties: Degree Distributions Histograms depicting the Noise and lack of measurements indicative of the tail section of the degree distribution

52 Properties: Degree Distributions Many real world graph degree distributions follow power laws in their tails – $p_k \sim k^{-\alpha}$ for some constant $\alpha$ Others have exponential tails – $p_k \sim e^{-k/\kappa}$ Knowing this makes power-law and exponential distributions easy to find experimentally – Plot on logarithmic scales: power laws – Semi-logarithmic scales: exponentials
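
The reason the two plots work is that a power law becomes a straight line with slope $-\alpha$ on log-log axes, while an exponential becomes a straight line with slope $-1/\kappa$ on semi-log axes. The sketch below checks this on exact, noise-free curves. The values of `alpha` and `kappa` are arbitrary illustrative choices, and a plain least-squares fit like this one should not be expected to behave as well on a noisy empirical tail.

```python
import math


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


ks = list(range(1, 51))
alpha, kappa = 2.5, 10.0  # illustrative values (assumptions), not taken from the slides
power_law = [k ** -alpha for k in ks]             # unnormalised p_k ~ k^(-alpha)
exponential = [math.exp(-k / kappa) for k in ks]  # unnormalised p_k ~ e^(-k/kappa)

# Log-log axes: log p_k against log k has slope -alpha for a power law.
slope_loglog = least_squares_slope(
    [math.log(k) for k in ks], [math.log(p) for p in power_law]
)
# Semi-log axes: log p_k against k has slope -1/kappa for an exponential.
slope_semilog = least_squares_slope(ks, [math.log(p) for p in exponential])
```

Normalising the distributions would shift both lines vertically without changing either slope, which is why the sketch skips it.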

53 Properties: Degree Distributions [Figure panels: power law, exponential]

54 Properties: Degree Distributions Networks with power-law degree distributions are sometimes called scale-free networks Include networks of: – World wide web – Metabolic pathways – Telephone calls

55 Properties of Networks Small-World Effect Transitivity Degree Distribution Network Resilience Mixing Patterns Degree Correlation Community Structure Network Navigation

56 Properties: Network Resilience How resilient is a network to the removal of its vertices – How the geodesic distance is affected by node deletion Two main removal processes discussed 1. Random removal of vertices 2. Targeted removal Usually remove the vertices with highest degrees
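
A toy comparison of the two removal processes is sketched below. It tracks the size of the largest connected component rather than geodesic distance, which is a simplification chosen for brevity, and it uses a made-up hub-and-spoke graph. To keep the result deterministic, the "random" case removes one fixed low-degree vertex, which is what a random pick would hit nine times out of ten on this graph.

```python
from collections import deque


def largest_component_size(adj, removed):
    """Size of the largest connected component after deleting the removed vertices."""
    seen, best = set(removed), 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best


def targeted_removal(adj, count):
    """The count highest-degree vertices, with degrees taken from the intact graph."""
    return sorted(adj, key=lambda v: len(adj[v]), reverse=True)[:count]


# Hypothetical hub-and-spoke network: vertex 0 is linked to vertices 1..9.
adj = {0: set(range(1, 10))}
for leaf in range(1, 10):
    adj[leaf] = {0}

intact = largest_component_size(adj, [])
after_targeted = largest_component_size(adj, targeted_removal(adj, 1))  # hub removed
after_leaf = largest_component_size(adj, [5])  # one low-degree vertex removed
```

Removing the hub shatters the graph into isolated vertices, while removing a leaf barely changes it. That contrast is the skewed-degree intuition behind targeted attacks.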

57 Properties: Network Resilience Two recent studies done on the resilience of the Internet and World Wide Web – One study found that these networks are resilient to random deletions but vulnerable to targeted attacks – Other study reached the opposite conclusion: the WWW is resilient to targeted attack as well, since all vertices with degree greater than 5 would have to be deleted to destroy its connectivity – Difference attributed to the high skew of the degree distribution, as only a very small fraction of nodes have degree greater than 5

58 Properties: Network Resilience Biological Example: – Metabolic network of yeast [Figure: diameter under targeted and random removal; diameter here is the total of all path lengths divided by the total number of paths]

59 Properties of Networks Small-World Effect Transitivity Degree Distribution Network Resilience Mixing Patterns Degree Correlation Community Structure Network Navigation

60 Properties: Mixing Patterns What types of vertices associate with other types of vertices Examples: – Food web: Many links between herbivores and carnivores Few links between carnivores and plants – Internet: Many links between end-users and ISPs Few between end-users and backbone

61 Properties: Mixing Patterns Quantified by assortativity coefficient Other ways to look at assortative mixing – By scalar characteristics Age, income – Vector characteristics Location : 2D vector

62 Properties of Networks Small-World Effect Transitivity Degree Distribution Network Resilience Mixing Patterns Degree Correlation Community Structure Network Navigation

63 Properties: Degree Correlations Special case of assortative mixing – Based on a particular scalar vertex property: degree Do high-degree vertices prefer other high-degree vertices? Do high-degree vertices associate more with low-degree vertices?

64 Properties: Degree Correlations Several different ways to quantify: – Two-dimensional histogram – One-parameter curve based on the degree – A single number Positive for assortatively mixed networks Negative for disassortative networks Social networks tend to be assortative All other networks discussed are disassortative
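
One common way to reduce degree correlations to a single number is the Pearson correlation between the degrees found at the two ends of each edge. The sketch below implements that idea directly. It is an illustration under that assumption, not necessarily the exact estimator used in the reviewed paper, and both test graphs are made up.

```python
def degree_assortativity(edges):
    """Pearson correlation of the degrees at the two ends of each undirected edge."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    # Count every edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    n = len(xs)
    mean = sum(xs) / n  # xs and ys hold the same values, so they share a mean
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    # The correlation is undefined when every edge end has the same degree.
    return cov / var if var > 0 else float("nan")


star = [(0, leaf) for leaf in range(1, 6)]  # one hub joined to five leaves
path = [(1, 2), (2, 3), (3, 4)]             # a four-vertex path

r_star = degree_assortativity(star)
r_path = degree_assortativity(path)
```

The star is the extreme disassortative case: every edge joins the highest-degree vertex to a lowest-degree one, so the coefficient is -1.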

65 Properties: Degree Correlations [Figure: yeast protein interactions; annotations: degree increasing, highest degree correlation]

66 Properties of Networks Small-World Effect Transitivity Degree Distribution Network Resilience Mixing Patterns Degree Correlation Community Structure Network Navigation

67 Properties: Community Structure Structure and formation of groups in the network Social Networks: – People tend to divide into sub-sections based on common interests, occupations, etc. Cluster Analysis – Extracting community structure from a network – Assigns connection strength to vertex pairs of interest – Finished process of cluster analysis can be represented by a tree or dendrogram

68 Properties: Community Structure Groups in protein interactions

69 Properties of Networks Small-World Effect Transitivity Degree Distribution Network Resilience Mixing Patterns Degree Correlation Community Structure Network Navigation

70 Properties: Network Navigation Finding paths in networks Use some domain knowledge about the network – Example: small-world experiments – people knew who to give the letter to so as to reach the destination quickly If it were possible to construct artificial networks that were easy to navigate in the same way social networks seem to be, then they could be used for databases or P2P networks

71 Other Properties Largest Component Size – The Giant component Betweenness Centrality: – Number of geodesic paths between other vertices that run through a particular vertex Recurrent Motifs: – Small sub-graphs that repeat in the network

72 Random Graphs Poisson Random Graphs Configuration Model Extensions to Random Graphs Markov Graphs

73 Poisson Random Graphs Developed by – Solomonoff and Rapoport (1951) – Erdős and Rényi (1959) Used as a straw man when discussing graph theory Most of the interesting work is in how real world graphs are not like random graphs

74 Poisson Random Graphs Building Random Graphs: Very simple process – Take some number n of vertices – Connect each pair with a probability p
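
The two-step construction translates almost line for line into code. This is a minimal sketch; the function name and the `seed` parameter are illustrative choices, and the quadratic loop over all pairs is only practical for small n.

```python
import random
from itertools import combinations


def poisson_random_graph(n, p, seed=None):
    """Edge list of a random graph: each of the n(n-1)/2 vertex pairs is joined with probability p."""
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


empty = poisson_random_graph(10, 0.0, seed=1)
complete = poisson_random_graph(10, 1.0, seed=1)
sample = poisson_random_graph(100, 0.05, seed=1)
```

With p = 0 no pair is joined and with p = 1 every pair is, so the two extremes give the empty and complete graphs. The expected number of edges in between is p n(n-1)/2.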

75 Poisson Random Graphs Many properties of the random graph are exactly solvable in the limit of large graph size. Probability of a vertex having degree k (Degree Distribution): $p_k \simeq \frac{z^k e^{-z}}{k!}$, where z is the mean degree – Hence the name Poisson – Exact in large graph limit
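
The Poisson form arises as the large-n limit of a binomial: each vertex has n - 1 possible edges, each present independently with probability p, and the mean degree z = p(n - 1) is held fixed. The sketch below compares the two distributions numerically. The values of `n` and `z` are arbitrary illustrative choices.

```python
import math


def binomial_degree_prob(n, p, k):
    """Probability of degree k when each of the n - 1 possible edges is present with probability p."""
    return math.comb(n - 1, k) * p ** k * (1 - p) ** (n - 1 - k)


def poisson_degree_prob(z, k):
    """Poisson probability of degree k for mean degree z."""
    return z ** k * math.exp(-z) / math.factorial(k)


n, z = 10_000, 3.0  # illustrative values (assumptions)
p = z / (n - 1)
max_gap = max(
    abs(binomial_degree_prob(n, p, k) - poisson_degree_prob(z, k)) for k in range(20)
)
poisson_total = sum(poisson_degree_prob(z, k) for k in range(60))
```

The gap between the two distributions shrinks as n grows with z fixed, which is the sense in which the Poisson form is exact in the large graph limit.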

76 Poisson Random Graphs Expected structure varies with p. Most important property: phase transition – From low-density, low-p state Containing few edges and all components are small – To high-density, high-p state Extensive fraction of all vertices are joined together in single giant component Giant component is main significant feature of random graphs discussed in this paper

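77 Poisson Random Graphs Two properties in random graphs: – Giant component size Calculating the expected size of the giant component, as the fraction S of vertices it contains: $S = 1 - e^{-zS}$ – Mean size of the non-giant components: $\langle s \rangle = \frac{1}{1 - z + zS}$, where z is the mean degree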
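
The giant component size has no closed form, but it can be found numerically. This sketch assumes the standard Poisson random graph relations $S = 1 - e^{-zS}$ for the giant component fraction and $\langle s \rangle = 1/(1 - z + zS)$ for the mean size of the remaining components, with z the mean degree. It solves the first by fixed-point iteration, and the iteration count is an arbitrary choice.

```python
import math


def giant_component_fraction(z, iterations=1000):
    """Solve S = 1 - exp(-z S) by fixed-point iteration, starting from S = 1."""
    s = 1.0
    for _ in range(iterations):
        s = 1.0 - math.exp(-z * s)
    return s


def mean_small_component_size(z):
    """Mean size of the non-giant components, 1 / (1 - z + z S)."""
    s = giant_component_fraction(z)
    return 1.0 / (1.0 - z + z * s)


below = giant_component_fraction(0.5)  # mean degree below 1: no giant component
above = giant_component_fraction(2.0)  # mean degree above 1: a giant component exists
```

Starting from S = 1 lets the iteration settle on the non-trivial solution when z > 1 and decay to zero when z < 1. Convergence becomes slow close to the transition at z = 1, where the mean component size diverges.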

78 Poisson Random Graphs


80 Models: – Small-world effect Typical distance through network: log n / log z Does not model: – Clustering coefficient: lower than real world – Degree distribution: Poisson instead of power-law / exponential – Random mixing pattern – No community structure – Navigation is impossible using local algorithms

81 Poisson Random Graphs [Figure panels: linear graph, logarithmic graph; scale-free versus random]

82 Poisson Random Graphs Still, it forms the basis of our basic intuition about how networks behave Giant component & phase transition are ideas that underlie much of graph theory Many future models started with this random graph as a springboard

83 Random Graphs Poisson Random Graphs Configuration Model Extensions to Random Graphs Markov Graphs

84 Configuration model Trying to make random graphs more realistic Configuration model incorporates idea of non-Poisson degree distribution Building configuration model: – $p_k$: degree distribution: the fraction of vertices having degree k – Degree sequence: a set of n values of the degrees $k_i$ of vertices i = 1 … n Visualized as giving each vertex $k_i$ spokes sticking out of it – Choose pairs of spokes at random and connect them
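
The spoke-matching construction can be sketched in a few lines: list each vertex once per spoke, shuffle, and pair up neighbours in the shuffled list. The function name, the `seed` parameter and the example degree sequence are illustrative assumptions. Nothing in this sketch prevents self-loops or repeated edges, so the result is in general a multigraph.

```python
import random
from collections import Counter


def configuration_model(degree_sequence, seed=None):
    """Pair up spokes uniformly at random; vertex i contributes degree_sequence[i] spokes."""
    if sum(degree_sequence) % 2:
        raise ValueError("the degrees must sum to an even number")
    rng = random.Random(seed)
    # One list entry per spoke, labelled by the vertex it sticks out of.
    spokes = [v for v, k in enumerate(degree_sequence) for _ in range(k)]
    rng.shuffle(spokes)
    # Consecutive entries of the shuffled list form the randomly matched pairs.
    return list(zip(spokes[0::2], spokes[1::2]))


degree_sequence = [3, 2, 2, 2, 1]  # hypothetical example
edges = configuration_model(degree_sequence, seed=42)

realised = Counter()
for u, v in edges:
    realised[u] += 1
    realised[v] += 1  # a self-loop (u == v) therefore adds 2 to that vertex
```

Whatever pairing the shuffle produces, every vertex ends up with exactly its prescribed degree, which is the point of the construction.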

85 Configuration model Two important points on the configuration model 1. $p_k$ is the distribution of degrees of vertices But not the degree of the vertex reached by following a randomly chosen edge Since k edges arrive at a vertex of degree k, we are k times as likely to arrive at that vertex as at a vertex of degree 1. Thus the degree distribution of a vertex reached by following a random edge is proportional to $k p_k$
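
A small numerical example makes the bias toward high-degree vertices concrete. Normalising $k p_k$ by the mean degree turns it into a proper distribution. The two-valued degree distribution below is hypothetical.

```python
from fractions import Fraction


def edge_end_degree_distribution(p):
    """Given p[k], the fraction of vertices with degree k, return q[k] = k p[k] / <k>."""
    mean_degree = sum(k * pk for k, pk in p.items())
    return {k: k * pk / mean_degree for k, pk in p.items()}


# Hypothetical network: 90% of vertices have degree 1 and 10% have degree 10.
p = {1: Fraction(9, 10), 10: Fraction(1, 10)}
q = edge_end_degree_distribution(p)
```

Although only one vertex in ten has degree 10, more than half of all edge ends (10/19) land on such a vertex.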

86 Configuration model 2. Chance of finding a loop in a small component of the graph goes as $n^{-1}$ – Probability that there is more than one path between any pair of vertices is $O(n^{-1})$ – Not true of most real world networks

87 Configuration model Example : power-law degree distribution

88 Configuration model Gets rid of Poisson degree distribution Still no clustering (transitivity) Explanation : – Configuration model graphs are suitable for modeling the global network – Clustering is a characteristic of the local network

89 Random Graphs Poisson Random Graphs Configuration Model Extensions to Random Graphs Markov Graphs

90 Extension to Random Graph: Directed Graphs Directed Graphs: Each vertex has – An in-degree : j – An out-degree: k Control both in creation of the random graph

91 Extension to Random Graph: Directed Graphs Use of extended random graph to model directed network: WWW

92 Extension to Random Graph: Bipartite Graphs Have two types of nodes Edges run only between two different types Work well for modeling some real world networks Fail to capture the complexity of others

93 Extension to Random Graph: Bipartite Graphs

94 Indication of shortcomings of modeled bipartite graphs The theoretical predictions for the last two data sets account for only half of the actual clustering present.

95 Random Graphs Poisson Random Graphs Configuration Model Extensions to Random Graphs Markov Graphs

96 Generalized random graph models have serious shortcoming: – Fail to show transitivity Look for completely different model – Add clustering to generated systems

97 Markov Graphs Looks at properties (edge configurations) of a graph Use properties to construct conditional tie variables ($X_{ij}$) – Signify a relationship between nodes i & j – $X_{ij} = 1$ if there is an observed relational tie – $X_{ij} = 0$ otherwise These tie variables are not independent – Need some way to reflect dependency – Markovian dependence structure: ties are conditionally dependent when they share a node.

98 Markov Graphs Social Network Example: – Work ties among lawyers Vertices: Lawyers in a law firm Edges: Collaboration (work ties) among them – How is work flow structured? Discernible form of local structuring? – Social ties are not independent of each other, but the dependence is expressed through the persons directly involved in the ties in question

99 Markov Graphs Network Ties Among Lawyers

100 Markov Graphs Significant Graph Features when considering Markovian Relational ties

101 Markov Graphs Results indicate improved local clustering (transitivity) representation.

102 Markov Graphs Problem : – Tend to condense Form regions of complete cliques – Subsets of vertices in which each vertex is connected to every other vertex in that subset – Networks in the real world do not share this clumpy transitivity

103 Markov Graphs Clumping effect indicative of Markov Graph representation

104 Summary Types of Real World Networks A. Social Networks B. Information Networks C. Technological Networks D. Biological Networks

105 Summary Properties of networks – Small-World Effect – Transitivity – Degree Distribution – Network Resilience – Mixing Patterns – Degree Correlation – Community Structure – Network Navigation

106 Summary Random Graphs and extensions – Model only some of the properties found in real networks – Motivates the exploration of other models that can represent these properties

