1
Network Statistics Gesine Reinert

2
Yeast protein interactions

3
Summary statistics
- Vertex degree distribution (the degree of a vertex is the number of vertices connected with it via an edge)
- Clustering coefficient: the average proportion of neighbours of a vertex that are themselves neighbours
- Shortest distance between two vertices; also average shortest distance, maximal distance, average of inverse distance (efficiency)
- Betweenness of a vertex: the number of shortest paths that go through a given vertex (similarly for an edge)
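The statistics listed above can be computed directly on a small example. The following is a minimal sketch, assuming an undirected graph stored as a dictionary that maps each vertex to the set of its neighbours; the function names are illustrative and not taken from the slides. Vertices with fewer than two neighbours are counted as contributing zero to the clustering coefficient, which is one common convention and an assumption here.

```python
from collections import deque


def degrees(adj):
    """Degree of each vertex: the number of vertices joined to it by an edge."""
    return {v: len(nbrs) for v, nbrs in adj.items()}


def clustering_coefficient(adj):
    """Average, over vertices, of the proportion of neighbour pairs that are linked."""
    total = 0.0
    for v, nbrs in adj.items():
        nbrs = list(nbrs)
        k = len(nbrs)
        if k < 2:
            continue  # assumed convention: such vertices contribute 0
        linked = sum(
            1
            for i in range(k)
            for j in range(i + 1, k)
            if nbrs[j] in adj[nbrs[i]]
        )
        total += linked / (k * (k - 1) / 2)
    return total / len(adj)


def bfs_distances(adj, source):
    """Shortest distances (in edges) from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def average_shortest_distance(adj):
    """Mean shortest distance over ordered pairs of distinct, mutually reachable vertices."""
    total, pairs = 0, 0
    for v in adj:
        for w, d in bfs_distances(adj, v).items():
            if w != v:
                total += d
                pairs += 1
    return total / pairs


# Toy graph: a triangle 0-1-2 with an extra vertex 3 attached to vertex 2.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(degrees(adj))
print(clustering_coefficient(adj))
print(average_shortest_distance(adj))
```

Betweenness is omitted from the sketch because it requires counting all shortest paths between each pair of vertices, not just their length.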

4
Some examples for real networks (in averages)

Network                   Size        Vertex degree   Shortest path   Shortest path (fitted random graph)   Clustering   Clustering (random graph)
Film actors               225,226     61              3.65            2.99                                  0.79         0.00027
MEDLINE coauthorship      1,520,251   18.1            4.6             4.91                                  0.43         1.8 x 10^-4
E. coli substrate graph   282         7.35            2.9             3.04                                  0.32         0.026
C. elegans                282         14              2.65            2.25                                  0.28         0.05

5
Underlying model assumptions
- Network consisting of vertices and edges
- Randomness in edges
- Here: assume edges undirected, no self-loops, no multiple edges

6
Main model 1: Random Graph
- Bernoulli random graph (Erdős and Rényi 1959, 1960): L vertices; any two vertices are connected by an edge with probability p, independently of all other pairs
- The graph need not be connected
- Phase transition: around edge probability p(L) = (log L)/L the random graph becomes connected
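The Bernoulli random graph is simple to simulate. The sketch below is only an illustration under stated assumptions: vertices are labelled 0 to L-1, the graph is stored as a dictionary of neighbour sets, and the helper names are hypothetical. The usage lines estimate how often the graph is connected for edge probabilities below and above (log L)/L; the factors 0.5 and 2 are arbitrary choices for the demonstration.

```python
import math
import random
from collections import deque


def bernoulli_random_graph(L, p, rng):
    """L vertices; each of the L(L-1)/2 pairs is joined independently with probability p."""
    adj = {v: set() for v in range(L)}
    for u in range(L):
        for v in range(u + 1, L):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def is_connected(adj):
    """True if every vertex can be reached from an arbitrary start vertex."""
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(adj)


rng = random.Random(0)
L = 200
for factor in (0.5, 2.0):
    p = factor * math.log(L) / L
    hits = sum(is_connected(bernoulli_random_graph(L, p, rng)) for _ in range(20))
    print(f"p = {factor} * log(L)/L: connected in {hits} of 20 simulated graphs")
```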

7
Main model 2: Watts-Strogatz Small World (1998)
- L vertices, each connected to its m nearest neighbours; in addition, random links (shortcuts), each present with probability p
- (Originally, rewiring edges instead of adding edges was proposed, but then the resulting network need not be connected)
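A small-world graph of this "added shortcuts" kind can be sketched as follows. Several details are assumptions rather than statements from the slides: the vertices sit on a ring, "m nearest neighbours" is read as m neighbours on each side (so every vertex has 2m hard-wired links), and each pair of vertices not already linked receives a shortcut independently with probability p. The function name is hypothetical.

```python
import random


def small_world(L, m, p, rng):
    """Ring lattice with m neighbours on each side, plus independent random shortcuts."""
    if L <= 2 * m:
        raise ValueError("need L > 2m so that lattice links are distinct and not self-loops")
    adj = {v: set() for v in range(L)}
    # Hard-wired links: each vertex is joined to the m vertices following it on the ring,
    # which by symmetry also joins it to the m vertices preceding it.
    for v in range(L):
        for step in range(1, m + 1):
            w = (v + step) % L
            adj[v].add(w)
            adj[w].add(v)
    # Shortcuts: every pair that is not already linked is joined with probability p.
    for u in range(L):
        for v in range(u + 1, L):
            if v not in adj[u] and rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


rng = random.Random(0)
graph = small_world(L=20, m=2, p=0.05, rng=rng)
print(sorted(len(nbrs) for nbrs in graph.values()))
```

Because shortcuts are only added and never replace lattice links, the ring stays intact and the graph is always connected, which is the reason given on the slide for preferring this variant over rewiring.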

8
Main model 3: Scale-free network
- Network growth models: start with one vertex; each new vertex attaches to existing vertices by preferential attachment
- The new vertex tends to choose an existing vertex according to its vertex degree (Barabási and Albert 1999, Price 1965)
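The growth rule can be illustrated with a short simulation. This is a sketch of one simple variant, not necessarily the exact model meant on the slide: each new vertex attaches to a single existing vertex chosen with probability proportional to its degree, and, since a lone starting vertex has degree zero, the second vertex is assumed to link to the first. The function name is hypothetical.

```python
import random


def preferential_attachment(n, rng):
    """Grow a graph to n vertices; each newcomer links to one existing vertex,
    chosen with probability proportional to that vertex's current degree."""
    if n < 2:
        raise ValueError("need at least two vertices")
    # Assumed start: vertex 1 links to the initial vertex 0.
    adj = {0: {1}, 1: {0}}
    # Each vertex appears in this list once per incident edge, so a uniform
    # draw from the list is a degree-proportional draw from the vertices.
    endpoints = [0, 1]
    for new in range(2, n):
        old = rng.choice(endpoints)
        adj[new] = {old}
        adj[old].add(new)
        endpoints.extend([new, old])
    return adj


rng = random.Random(0)
graph = preferential_attachment(1000, rng)
degree_list = sorted((len(nbrs) for nbrs in graph.values()), reverse=True)
print("largest degrees:", degree_list[:5])
print("vertices of degree 1:", sum(1 for d in degree_list if d == 1))
```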

9
Watts-Strogatz Small World
- Amenable to mathematical analysis
- More realistic than random graphs
- Shortest path length
- Motif counts
- Vertex degrees
- Predicting links
- Generalization: hard-wired links only present with a certain probability

10
Shortest path length
- Put ρ = 2(L - 2m - 1)p, where p is the probability of a shortcut
- Approximation: a continuous model gives that the expected shortest path length is approximately (1/ρ) {(1/2) log(Lρ) - 0.2886}
- The distribution is also available (Barbour and Reinert)
- In the discrete case, the distribution may be concentrated on one or two points
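The approximation is easy to evaluate numerically. The sketch below simply codes the two expressions from the slide; it assumes that "log" denotes the natural logarithm, and the function names and the example values of L, m and p are illustrative only.

```python
import math


def shortcut_intensity(L, m, p):
    """rho = 2 (L - 2m - 1) p, with p the probability of a shortcut."""
    return 2 * (L - 2 * m - 1) * p


def expected_shortest_path(L, rho):
    """Continuous-model approximation (1/rho) * (0.5 * log(L * rho) - 0.2886).

    Assumes the natural logarithm.
    """
    return (0.5 * math.log(L * rho) - 0.2886) / rho


# Hypothetical parameter values, chosen only to show the calls.
L, m, p = 1000, 1, 0.001
rho = shortcut_intensity(L, m, p)
print("rho =", rho)
print("approximate expected shortest path length =", expected_shortest_path(L, rho))
```

For fixed ρ the approximation grows like log(L)/(2ρ), so multiplying the number of vertices by a constant factor adds only a constant to the expected path length.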

11
Example: 6 degrees of separation?
- If the number of vertices is L = 200,000,000 and we observe l = 6, then we can estimate ρ as approximately 1.54
- For L = 60,000,000 this gives an expected shortest path length of approximately 5.81
- For L = 100,000 it gives approximately 3.73
- For L = 6,500,000,000 it gives approximately 7.33

12
Motif counts
- Triangles: relate to clustering coefficient
- Cycles: biologically relevant
- Distributions: approximately compound Poisson
- Can get joint distribution for cycle counts of different lengths (also using compound Poisson); dependence!
- Goal: assess statistical significance of counts

13
Vertex degrees
- Random graph superimposed on hard-wired networks
- Poisson approximation for the number of vertices with degree at least k, say
- Normal approximation for the joint distribution of some vertex degrees
- Goal: assess scale-free appearance

14
Predicting links
- Use Bayesian analysis and biochemical properties to predict which proteins might interact
- Use H. pylori interactions to construct a prior for E. coli interactions
- Assess whether the network has small-world structure; if so, use a parametric model

15
Statistical significance
- Clustering coefficient, vertex degrees, shortest path length are not independent
- Long-term goal: joint distribution of summary statistics to assess whether networks are similar or not

16
People
Research students:
- Kaisheng Lin (motif counts, metabolic networks; vertex degrees)
- Pao-Yang Chen (protein interaction networks)
- KimHuat Lim (epidemics on networks)
Collaborators:
- Andrew Barbour (shortest path length)
- Charlotte Deane (protein interaction networks)
- Susan Holmes (bottlenecks)
