11. Lecture WS 2006/07, Bioinformatics III, V11: Max-Flow Min-Cut

1 V11: Max-Flow Min-Cut
V11 continues chapter 12 in Gross & Yellen, "Graph Theory".
Theorem 12.2.3 [Characterization of Maximum Flow] Let f be a flow in a network N. Then f is a maximum flow in network N if and only if there does not exist an f-augmenting path in N.
Proof: Necessity ($\Rightarrow$) Suppose that f is a maximum flow in network N. Then by Proposition 12.2.1, there is no f-augmenting path: if an f-augmenting path existed, we could construct a flow f' with val(f') > val(f), contradicting the maximality of f.
Proposition 12.2.1 (Flow Augmentation) Let f be a flow in a network N, and let Q be an f-augmenting path with minimum slack $\Delta_Q$ on its arcs. Then the augmented flow f' given by
$$f'(e) = \begin{cases} f(e) + \Delta_Q & \text{if } e \text{ is a forward arc of } Q \\ f(e) - \Delta_Q & \text{if } e \text{ is a backward arc of } Q \\ f(e) & \text{otherwise} \end{cases}$$
is a feasible flow in network N and val(f') = val(f) + $\Delta_Q$.
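
Proposition 12.2.1 can be sketched in a few lines of Python. This is only an illustration under assumed conventions that are not part of the lecture: arcs are `(tail, head)` tuples, flows and capacities are dictionaries keyed by arc, and a quasi-path is a list of `(arc, direction)` pairs. The small network is made up for the example.

```python
def augment(flow, cap, quasi_path):
    """Return (augmented flow f', minimum slack) for an f-augmenting path.

    quasi_path is a list of (arc, direction) pairs with direction "fwd" or
    "bwd"; this representation is an assumption made for the sketch.
    """
    # Slack of a forward arc is its unused capacity; slack of a backward arc
    # is the flow it currently carries.
    slacks = [cap[a] - flow[a] if d == "fwd" else flow[a] for a, d in quasi_path]
    delta = min(slacks)
    new_flow = dict(flow)
    for a, d in quasi_path:
        new_flow[a] += delta if d == "fwd" else -delta
    return new_flow, delta


# Hypothetical example network.
cap = {("s", "v"): 3, ("v", "w"): 1, ("s", "w"): 2, ("w", "t"): 3, ("v", "t"): 2}
flow = {("s", "v"): 1, ("v", "w"): 1, ("s", "w"): 0, ("w", "t"): 1, ("v", "t"): 0}
# Quasi-path s -> w, then backward along v -> w, then v -> t.
Q = [(("s", "w"), "fwd"), (("v", "w"), "bwd"), (("v", "t"), "fwd")]
new_flow, delta = augment(flow, cap, Q)
```

In this example the flow value (net flow out of s) rises by exactly the minimum slack, and conservation of flow still holds at v and w.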

2 Max-Flow Min-Cut
Sufficiency ($\Leftarrow$) Suppose that there does not exist an f-augmenting path in network N. Consider the collection of all quasi-paths in network N that begin with source s and have the following property: each forward arc on the quasi-path has positive slack, and each backward arc on the quasi-path has positive flow. Let $V_s$ be the union of the vertex-sets of these quasi-paths. Since there is no f-augmenting path, it follows that sink $t \notin V_s$. Let $V_t = V_N - V_s$. Then $\langle V_s, V_t \rangle$ is an s-t cut of network N. Moreover, by definition of the sets $V_s$ and $V_t$,
$$f(e) = \begin{cases} cap(e) & \text{if } e \in \langle V_s, V_t \rangle \\ 0 & \text{if } e \in \langle V_t, V_s \rangle \end{cases}$$
Hence, f is a maximum flow, by Corollary 12.1.8. □

3 Max-Flow Min-Cut
Theorem 12.2.4 [Max-Flow Min-Cut] For a given network, the value of a maximum flow is equal to the capacity of a minimum cut.
Proof: The s-t cut constructed in the proof of Theorem 12.2.3 has capacity equal to the value of the maximum flow. □
The outline of an algorithm for maximizing the flow in a network emerges from Proposition 12.2.1 and Theorem 12.2.3.

4 Finding an f-Augmenting Path
The discussion of f-augmenting paths culminating in the flow-augmenting Proposition 12.2.1 provides the basis of a vertex-labeling strategy due to Ford and Fulkerson that finds an f-augmenting path, when one exists. Their labeling scheme is essentially basic tree-growing.
The idea is to grow a tree of quasi-paths, each starting at source s. If the flow on each arc of these quasi-paths can be increased or decreased, according to whether that arc is forward or backward, then an f-augmenting path is obtained as soon as the sink t is labeled.
A frontier arc is an arc e directed from a labeled endpoint v to an unlabeled endpoint w. For constructing an f-augmenting path, the frontier arc e is also allowed to be backward (directed from vertex w to vertex v), and it can be added to the tree as long as it has slack $\Delta_e > 0$.

5 Finding an f-Augmenting Path
Terminology: At any stage during tree-growing for constructing an f-augmenting path, let e be a frontier arc of tree T, with endpoints v and w. The arc e is said to be usable if, for the current flow f, either e is directed from vertex v to vertex w and f(e) < cap(e), or e is directed from vertex w to vertex v and f(e) > 0.
Frontier arcs $e_1$ and $e_2$ are usable if $f(e_1) < cap(e_1)$ and $f(e_2) > 0$.
Remark: From this vertex-labeling scheme, any of the existing f-augmenting paths could result. But the efficiency of Algorithm 12.2.1 is based on being able to find "good" augmenting paths. If the arc capacities are irrational numbers, then an algorithm using the Ford & Fulkerson labeling scheme might not terminate (strictly speaking, it would then not be an algorithm).
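
The usability test can be written as a small predicate. This is a sketch under assumed conventions (arcs as `(tail, head)` tuples, the labeled tree vertices as a set, flow and capacity as dictionaries); the function name and example values are hypothetical.

```python
def is_usable(arc, labeled, flow, cap):
    """Return True if arc is a usable frontier arc for the current flow."""
    tail, head = arc
    if tail in labeled and head not in labeled:
        # Forward frontier arc: usable if it still has unused capacity.
        return flow[arc] < cap[arc]
    if head in labeled and tail not in labeled:
        # Backward frontier arc: usable if it carries positive flow.
        return flow[arc] > 0
    # Both or neither endpoint labeled: not a frontier arc at all.
    return False


labeled = {"s", "v"}
cap = {("v", "w"): 2, ("x", "v"): 3, ("s", "v"): 1, ("v", "y"): 1}
flow = {("v", "w"): 1, ("x", "v"): 2, ("s", "v"): 1, ("v", "y"): 1}
usable = [a for a in cap if is_usable(a, labeled, flow, cap)]
```

Here `("v", "w")` is usable as a forward arc and `("x", "v")` as a backward arc, while the saturated arc `("v", "y")` is a frontier arc that is not usable.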

6 Finding an f-Augmenting Path
Even when flows and capacities are restricted to be integers, problems concerning efficiency still exist. E.g., if each flow augmentation were to increase the flow by only one unit, then the number of augmentations required for maximization would equal the capacity of a minimum cut. Such an algorithm would depend on the size of the arc capacities instead of on the size of the network.

7 Finding an f-Augmenting Path
Example: For the network shown below, the arc from vertex v to vertex w has flow capacity 1, while the other arcs have capacity M, which could be made arbitrarily large. If the choice of the augmenting flow path at each iteration were to alternate between the directed path $\langle s,v,w,t \rangle$ and the quasi-path $\langle s,w,v,t \rangle$, then the flow would increase by only one unit at each iteration. Thus, it could take as many as 2M iterations to obtain the maximum flow.
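
The 2M bound can be checked with a short simulation. The sketch assumes the network consists of the arcs s→v, s→w, v→t, w→t (capacity M) and v→w (capacity 1), as the text describes; the figure itself is not part of this transcript, and the function name is made up.

```python
def alternating_augmentations(M):
    """Alternate between path <s,v,w,t> and quasi-path <s,w,v,t>.

    Returns (number of augmentations, final flow value).
    """
    cap = {("s", "v"): M, ("s", "w"): M, ("v", "t"): M, ("w", "t"): M,
           ("v", "w"): 1}
    flow = dict.fromkeys(cap, 0)
    # +1 marks a forward arc, -1 a backward arc on the (quasi-)path.
    directed = [(("s", "v"), +1), (("v", "w"), +1), (("w", "t"), +1)]
    quasi = [(("s", "w"), +1), (("v", "w"), -1), (("v", "t"), +1)]
    iterations = 0
    while True:
        path = directed if iterations % 2 == 0 else quasi
        delta = min(cap[a] - flow[a] if sign > 0 else flow[a]
                    for a, sign in path)
        if delta == 0:          # the scheduled path is no longer augmenting
            break
        for a, sign in path:
            flow[a] += sign * delta
        iterations += 1
    value = flow[("s", "v")] + flow[("s", "w")]
    return iterations, value


iterations, value = alternating_augmentations(1000)
```

Every augmentation is limited by the unit-capacity arc between v and w, so the number of iterations grows with M rather than with the size of the network.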

8 Finding an f-Augmenting Path
Edmonds and Karp avoid these problems with this algorithm. It uses breadth-first search to find an f-augmenting path with the least number of arcs.

9 FFEK algorithm: Ford, Fulkerson, Edmonds, and Karp
Algorithm 12.2.3 combines Algorithms 12.2.1 and 12.2.2.
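
The algorithm listings themselves are not reproduced in this transcript. The following is a minimal sketch of the usual breadth-first augmenting-path scheme, not the textbook's Algorithm 12.2.3 verbatim. It assumes capacities are given as a dictionary keyed by `(tail, head)` and that s and t both occur in some arc; residual capacities stand in for "forward arc with slack" and "backward arc with flow".

```python
from collections import deque


def edmonds_karp(cap, s, t):
    """Return (maximum flow value, set V_s of vertices labeled in the last search)."""
    # res[u][v] is the residual capacity from u to v; the reverse entry grows
    # when flow is pushed, which models backward arcs with positive flow.
    res = {}
    for (u, v), c in cap.items():
        res.setdefault(u, {}).setdefault(v, 0)
        res.setdefault(v, {}).setdefault(u, 0)
        res[u][v] += c
    value = 0
    while True:
        # Breadth-first tree-growing from s finds a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in res[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            # No augmenting path: the labeled vertices form the source side
            # of a minimum cut (proof of Theorem 12.2.3).
            return value, set(parent)
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= delta
            res[v][u] += delta
        value += delta


M = 1000
cap = {("s", "v"): M, ("s", "w"): M, ("v", "t"): M, ("w", "t"): M, ("v", "w"): 1}
value, source_side = edmonds_karp(cap, "s", "t")
cut_capacity = sum(c for (u, v), c in cap.items()
                   if u in source_side and v not in source_side)
```

On the network with capacities M and 1 described earlier, the shortest augmenting paths are s,v,t and s,w,t, so two augmentations suffice regardless of M, and the capacity of the returned cut equals the flow value.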

10 FFEK algorithm: Ford, Fulkerson, Edmonds, and Karp
Example: the figures illustrate Algorithm 12.2.3. Shown is the s-t cut with capacity equal to the current flow, establishing optimality.

11 FFEK algorithm: Ford, Fulkerson, Edmonds, and Karp
At the end of the final iteration, the arc directed from source s to vertex w and the arc directed from vertex v to sink t are the only frontier arcs of the tree T, but neither is usable. These two arcs form the minimum cut $\langle \{s,x,y,z,v\}, \{w,a,b,c,t\} \rangle$. This illustrates the s-t cut that was constructed in the proof of Theorem 12.2.3.

12 Determining the connectivity of a graph
In this section, we use the theory of network flows to give constructive proofs of Menger's theorem. These proofs lead directly to algorithms for determining the edge-connectivity and vertex-connectivity of a graph. The strategy to prove Menger's theorems is based on properties of certain networks whose arcs all have unit capacity. These 0-1 networks are constructed from the original graph.

13 Determining the connectivity of a graph
Lemma 12.3.1. Let N be an s-t network such that outdegree(s) > indegree(s), indegree(t) > outdegree(t), and outdegree(v) = indegree(v) for all other vertices v. Then there exists a directed s-t path in network N.
Proof. Let W be a longest directed trail (trail = walk without repeated edges; path = trail without repeated vertices) in network N that starts at source s, and let z be its terminal vertex. If vertex z were not the sink t, then there would be an arc not in trail W that is directed from z (since indegree(z) = outdegree(z)). But this would contradict the maximality of trail W. Thus, W is a directed trail from source s to sink t. If W has a repeated vertex, then part of W determines a directed cycle, which can be deleted from W to obtain a shorter directed s-t trail. This deletion step can be repeated until no repeated vertices remain, at which point the resulting directed trail is an s-t path. □

14 Determining the connectivity of a graph
Proposition 12.3.2. Let N be an s-t network such that outdegree(s) – indegree(s) = m = indegree(t) – outdegree(t), and outdegree(v) = indegree(v) for all vertices $v \ne s,t$. Then there exist m arc-disjoint directed s-t paths in network N.
Proof. If m = 1, then there exists an open eulerian directed trail T from source s to sink t by Theorem 6.1.3. By Theorem 1.5.2, trail T is either an s-t directed path or can be reduced to an s-t path.
Review: An eulerian trail in a graph is a trail that contains every edge of that graph.
Review: Theorem 6.1.3. A connected digraph D has an open eulerian trail from vertex x to vertex y if and only if indegree(x) + 1 = outdegree(x), indegree(y) = outdegree(y) + 1, and all vertices except x and y have equal indegree and outdegree.
Review: Theorem 1.5.2. Every open x-y walk W is either an x-y path or can be reduced to an x-y path.

15 Determining the connectivity of a graph
By way of induction, assume that the assertion is true for m = k, for some $k \ge 1$, and consider a network N for which the condition holds for m = k + 1. There exists a directed s-t path P by Lemma 12.3.1. If the arcs of path P are deleted from network N, then the resulting network N - P satisfies the condition of the proposition for m = k. By the induction hypothesis, there exist k arc-disjoint directed s-t paths in network N - P. These k paths together with path P form a collection of k + 1 arc-disjoint directed s-t paths in network N. □

16 Basic properties of 0-1 networks
Definition: A 0-1 network is a capacitated network whose arc capacities are either 0 or 1.
Proposition 12.3.3. Let N be an s-t network such that cap(e) = 1 for every arc e. Then the value of a maximum flow in network N equals the maximum number of arc-disjoint directed s-t paths in N.
Proof: Let f* be a maximum flow in network N, and let r be the maximum number of arc-disjoint directed s-t paths in N. Consider the network N* obtained by deleting from N all arcs e for which f*(e) = 0. Then f*(e) = 1 for all arcs e in network N*. It follows from the definition that for every vertex v in network N*,
$$\sum_{e \text{ directed from } v} f^*(e) = \text{outdegree}(v) \quad \text{and} \quad \sum_{e \text{ directed to } v} f^*(e) = \text{indegree}(v).$$

17 Basic properties of 0-1 networks
Thus by the definition of val(f*) and by the conservation-of-flow property, outdegree(s) – indegree(s) = val(f*) = indegree(t) – outdegree(t) and outdegree(v) = indegree(v), for all vertices $v \ne s,t$. By Proposition 12.3.2, there are val(f*) arc-disjoint s-t paths in network N*, and hence also in N, which implies that $val(f^*) \le r$.
To obtain the reverse inequality, let $\{P_1, P_2, \ldots, P_r\}$ be the largest collection of arc-disjoint directed s-t paths in N, and consider the function $f: E_N \to \mathbb{R}^+$ defined by
$$f(e) = \begin{cases} 1 & \text{if some path } P_i \text{ uses arc } e \\ 0 & \text{otherwise} \end{cases}$$
Then f is a feasible flow in network N, with val(f) = r. It follows that $val(f^*) \ge r$. □
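
Proposition 12.3.3 suggests counting arc-disjoint paths by maximizing a 0-1 flow. The sketch below does that with breadth-first augmentation. It assumes a simple digraph given as distinct `(tail, head)` pairs with s ≠ t; the function name and example digraphs are made up.

```python
from collections import deque


def max_arc_disjoint_paths(arcs, s, t):
    """Count arc-disjoint directed s-t paths as the value of a maximum 0-1 flow."""
    arcs = set(arcs)
    used = set()          # arcs e with f(e) = 1

    def augmenting_steps():
        # parent[v] = (previous vertex, arc, True if the arc is traversed forward)
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for a, b in arcs:
                if a == u and (a, b) not in used and b not in parent:
                    parent[b] = (u, (a, b), True)      # forward arc with slack
                    queue.append(b)
                elif b == u and (a, b) in used and a not in parent:
                    parent[a] = (u, (a, b), False)     # backward arc with flow
                    queue.append(a)
        if t not in parent:
            return None
        steps, v = [], t
        while parent[v] is not None:
            u, arc, forward = parent[v]
            steps.append((arc, forward))
            v = u
        return steps

    count = 0
    while (steps := augmenting_steps()) is not None:
        for arc, forward in steps:
            if forward:
                used.add(arc)
            else:
                used.discard(arc)
        count += 1            # each augmentation adds one unit of flow
    return count


arcs = [("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")]
n_paths = max_arc_disjoint_paths(arcs, "s", "t")
```

In the example, the two paths s,a,t and s,b,t are arc-disjoint, and no third path can exist because only two arcs leave s.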

18 Separating Sets and Cuts
Review from §5.3: Let s and t be distinct vertices in a graph G. An s-t separating edge set in G is a set of edges whose removal destroys all s-t paths in G. Thus, an s-t separating edge set in G is an edge subset of $E_G$ that contains at least one edge of every s-t path in G.
Definition: Let s and t be distinct vertices in a digraph D. An s-t separating arc set in D is a set of arcs whose removal destroys all directed s-t paths in D. Thus, an s-t separating arc set in D is an arc subset of $E_D$ that contains at least one arc of every directed s-t path in digraph D.
Remark: For the degenerate case in which the original graph or digraph has no s-t paths, the empty set is regarded as an s-t separating set.

19 Separating Sets and Cuts
Proposition 12.3.4. Let N be an s-t network such that cap(e) = 1 for every arc e. Then the capacity of a minimum s-t cut in network N equals the minimum number of arcs in an s-t separating arc set in N.
Proof: Let $K^* = \langle V_s, V_t \rangle$ be a minimum s-t cut in network N, and let q be the minimum number of arcs in an s-t separating arc set in N. Since K* is an s-t cut, it is also an s-t separating arc set. Thus $cap(K^*) \ge q$.
To obtain the reverse inequality, let S be an s-t separating arc set in network N containing q arcs, and let R be the set of all vertices in N that are reachable from source s by a directed path that contains no arc from set S. Then, by the definitions of arc set S and vertex set R, $t \notin R$, which means that $\langle R, V_N - R \rangle$ is an s-t cut. Moreover, $\langle R, V_N - R \rangle \subseteq S$. Therefore

20 Separating Sets and Cuts
$$cap(K^*) \le cap\langle R, V_N - R \rangle = |\langle R, V_N - R \rangle| \le |S| = q,$$
which completes the proof. □

21 Arc and Edge Versions of Menger's Theorem Revisited
Theorem 12.3.5 [Arc form of Menger's theorem] Let s and t be distinct vertices in a digraph D. Then the maximum number of arc-disjoint directed s-t paths in D is equal to the minimum number of arcs in an s-t separating set of D.
Proof: Let N be the s-t network obtained by assigning a unit capacity to each arc of digraph D. Then the result follows from Propositions 12.3.3 and 12.3.4, together with the max-flow min-cut theorem. □
Review: Theorem 12.2.4 [Max-Flow Min-Cut] For a given network, the value of a maximum flow is equal to the capacity of a minimum cut.
Review: Proposition 12.3.3. Let N be an s-t network such that cap(e) = 1 for every arc e. Then the value of a maximum flow in network N equals the maximum number of arc-disjoint directed s-t paths in N.
Review: Proposition 12.3.4. Let N be an s-t network such that cap(e) = 1 for every arc e. Then the capacity of a minimum s-t cut in network N equals the minimum number of arcs in an s-t separating arc set in N.

22 Transcriptional – Gene regulation networks
The machine that transcribes a gene is composed of perhaps 50 proteins, including RNA polymerase, the enzyme that converts DNA code into RNA code. A crew of transcription factors grabs hold of the DNA just above the gene at a site called the core promoter, while associated activators bind to enhancer regions farther upstream of the gene to rev up transcription.
Working as a tightly knit machine, these proteins transcribe a single gene into messenger RNA. The messenger RNA winds its way out of the nucleus to the factories that produce proteins, where it serves as a blueprint for production of a specific protein.
http://www.berkeley.edu/news/features/1999/12/09_nogales.html

23 Transcription in E. coli and in Eukaryotes
- Prokaryotes: genes are grouped into operons. Eukaryotes: genes are not grouped in operons.
- Prokaryotes: mRNA may contain transcripts of several genes (poly-cistronic). Eukaryotes: each mRNA contains only the transcript of a single gene (mono-cistronic).
- Prokaryotes: transcription and translation are coupled; the transcript is translated already during transcription. Eukaryotes: transcription and translation are NOT coupled; transcription takes place in the nucleus, translation in the cytosol.
- Prokaryotes: gene regulation takes place by modification of the transcription rate. Eukaryotes: gene regulation via the transcription rate AND by RNA processing, RNA stability etc.

24 Promoter prediction in E. coli
To analyze E. coli promoters, one may align a set of promoter sequences by the position that marks the known transcription start site (TSS) and search for conserved regions in the sequences.
E. coli promoters are found to contain 3 conserved sequence features:
- a region approximately 6 bp long with consensus TATAAT at position -10
- a region approximately 6 bp long with consensus TTGACA at position -35
- a distance between these 2 regions of ca. 17 bp that is relatively constant
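
The three features listed above can be turned into a naive scanner. This is only an illustrative sketch, not a method from the lecture: the mismatch tolerance, the accepted spacer range of 15 to 19 bp, the function names and the test sequence are all assumptions.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_promoter_candidates(seq, box35="TTGACA", box10="TATAAT",
                             spacers=range(15, 20), max_mismatch=1):
    """Return (start of -35 box, start of -10 box, spacer length) triples."""
    hits = []
    n35, n10 = len(box35), len(box10)
    for i in range(len(seq) - n35 + 1):
        if hamming(seq[i:i + n35], box35) > max_mismatch:
            continue
        for gap in spacers:
            j = i + n35 + gap          # candidate start of the -10 box
            if j + n10 > len(seq):
                break
            if hamming(seq[j:j + n10], box10) <= max_mismatch:
                hits.append((i, j, gap))
    return hits


# Synthetic sequence: both consensus boxes separated by a 17 bp spacer.
seq = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GGGG"
hits = find_promoter_candidates(seq)
```

Because such short, degenerate patterns also occur by chance, a scan like this produces many false positives on real genomic sequence.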

25 Gene regulatory promoter network
In E. coli, 240 transcription factors have been verified that regulate 3000 genes. Binding site matrices are available for more than 55 E. coli TFs (Robison et al. 1998).
In S. cerevisiae, genome-wide binding analysis of 106 transcription factors indicates that more than one-third of the promoter regions that were bound by regulators were bound by 2 or more regulators.
This implies a highly connected network of transcriptional regulators.

26 Feasibility of computational motif search?
Computational identification of transcription factor binding sites is difficult because they consist of short, degenerate sequences that occur frequently by chance.
The problem is not easy to define (therefore: it is "complex") because
- the motif is of unknown size
- the motif might not be well conserved between promoters
- the sequences used to search for the motif do not necessarily represent the complete promoter
- genes with promoters to be analyzed are in many cases grouped together by a clustering algorithm which has its own limitations.

27 Strategy 1
Arrival of microarray gene-expression data. For a group of genes with a similar expression profile (e.g. those that are activated at the same time in the cell cycle), one may assume that this profile is, at least partly, caused by and reflected in a similar structure of the regions involved in transcription regulation.
Search for common motifs in upstream regions of < 1000 bases.
So far used: detection of single motifs (representing transcription-factor binding sites) common to the promoter sequences of putatively co-regulated genes.
Better: search for simultaneous occurrence of 2 or more sites at a given distance interval! The search becomes more sensitive.

28 Motif identification
A flowchart to illustrate the two different approaches for motif identification. We analyzed 800 bp upstream from the translation start sites of the five genes from the yeast gene family PHO by the publicly available systems MEME (alignment) and RSA (exhaustive search). MEME was run on both strands, one occurrence per sequence mode, and found the known motif ranked as second best. RSA Tools was run with oligo size 6 and noncoding regions as background, as set by the demo mode of the system. The well-conserved heptamer of the motifs used by MEME to build the weight matrix is printed in bold.
Ohler, Niemann, Trends Gen 17, 2 (2001)

29 Strategy 2: Exhaustive motif search in upstream regions
Exploit the finding that relevant motifs are often repeated many times, possibly with small variations, in the upstream region for the regulatory action to be effective.
Search the upstream region for overrepresented motifs:
(1) Group genes based on the overrepresented motifs.
(2) Analyze sets of genes that share motifs for coregulation in microarray experiments.
(3) Consider overrepresented motifs labelling sets of co-regulated genes as candidate binding sites.
Cora et al. BMC Bioinformatics 5, 57 (2004)

30 Exhaustive motif search in upstream regions
Cora et al. BMC Bioinformatics 5, 57 (2004)

31 Position-specific weight matrix
A popular approach when a list of genes that share a TF binding motif is available and a good multiple sequence alignment exists.
Alignment matrix: lists the number of occurrences of each letter at each position of an alignment.
Hertz, Stormo (1999) Bioinformatics 15, 563
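
An alignment matrix of this kind can be computed directly from aligned sites. The sketch below assumes gap-free DNA sites of equal length; the function names and the four example sites are invented for illustration and are not taken from Hertz and Stormo.

```python
def alignment_matrix(sites, alphabet="ACGT"):
    """Count the occurrences of each letter at each alignment position."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all aligned sites must have the same length")
    return {base: [sum(site[i] == base for site in sites) for i in range(length)]
            for base in alphabet}


def consensus(matrix):
    """Most frequent letter per position (ties go to the first letter listed)."""
    length = len(next(iter(matrix.values())))
    return "".join(max(matrix, key=lambda b: matrix[b][i]) for i in range(length))


sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]   # hypothetical aligned sites
matrix = alignment_matrix(sites)
motif = consensus(matrix)
```

Each column of the matrix sums to the number of sites, and a weight matrix is typically derived from these counts by comparing the column frequencies with background frequencies.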

32 Position-specific weight matrix
Examples of matrices used by YRSA: http://forkhead.cgb.ki.se/YRSA/matrixlist.html

33 Identification of TF binding site: DNase I Footprinting
A protein bound to a specific DNA sequence will interfere with the digestion of that region by DNase I. An end-labelled DNA probe is incubated with a protein extract or a purified DNA-binding factor. The unprotected DNA is then partially digested with DNase I such that on average every DNA molecule is cut once. Digestion products are then resolved by electrophoresis. Comparison of the DNase I digestion pattern in the presence and absence of protein will allow the identification of a footprint (protected region).
[Figure: footprint experiment resolved by denaturing PAGE]

34 Gel Shifts: Electrophoretic Mobility Shift Assay (EMSA), Band Shift
A purified protein, or a complex mixture of proteins (e.g. nuclear or cell extract), is incubated with a 32P end-labelled DNA fragment containing the putative protein binding site (from the promoter region). Reaction products are then analysed on a non-denaturing polyacrylamide gel. The specificity of the DNA-binding protein for the putative binding site is established by competition experiments using DNA fragments or oligonucleotides containing a binding site for the protein of interest, or other unrelated DNA sequences.
[Figure: gel retardation assay on non-denaturing PAGE, lanes without and with added protein, showing the free DNA probe and a band with retarded mobility due to protein binding]

35 3D structures of transcription factors
http://www.rcsb.org
Structures shown: 1A02.pdb, 1AM9.pdb, 1AU7.pdb, 1CIT.pdb, 1GD2.pdb, 1H88.pdb
TFs bind with very different binding modes. Some are sensitive to DNA conformation.
[Figure annotation: 2 TFs bound]

36 Database for eukaryotic transcription factors: TRANSFAC
BIOBase / TU Braunschweig / GBF
Relational database, 6 flat files:
- FACTOR: interaction of TFs
- SITE: their DNA binding site
- GENE: through which they regulate these target genes
- CELL: factor source
- MATRIX: TF nucleotide weight matrices
- CLASS: classification scheme of TFs
Wingender et al. (1998) J Mol Biol 284, 241

37 Database for eukaryotic transcription factors: TRANSFAC
BIOBase / TU Braunschweig / GBF
Matys et al. (2003) Nucl Acid Res 31, 374

38 TRANSFAC classification
1 Superclass: basic domains
1.1 Leucine zipper factors (bZIP)
1.2 Helix-loop-helix factors (bHLH)
1.3 bHLH-bZIP
1.4 NF-1
1.5 RF-X
1.6 bHSH
2 Superclass: Zinc-coordinating DNA-binding domains
2.1 Cys4 zinc finger of nuclear receptor type
2.2 diverse Cys4 zinc fingers
2.3 Cys2His2 zinc finger domains
2.4 Cys6 cysteine-zinc cluster
2.5 Zinc fingers of alternating composition
3 Superclass: Helix-turn-helix
4 Superclass: beta-Scaffold Factors with Minor Groove Contacts
5 Superclass: others
http://www.gene-regulation.com/pub/databases/transfac/cl.html

39 Database for eukaryotic transcription factors: TRANSFAC
BIOBase / TU Braunschweig / GBF
Matys et al. (2003) Nucl Acid Res 31, 374

40 Summary
Large databases are available (e.g. TRANSFAC, http://www.gene-regulation.com) with information about promoter sites. The information is verified experimentally.
Microarray data allow searching for common motifs of coregulated genes. Also possible: common GO annotation etc.
TF binding motifs are frequently overrepresented in the 1000 bp upstream region. The function of this overrepresentation is not clearly known. (Same as in proline-rich recognition sequences.)
Relatively few TFs regulate a large number of genes. This gives a complex regulatory network, which is the topic of the next lecture(s).

41 Additional slides

42 Arc and Edge Versions of Menger's Theorem Revisited
Remark: The edge form of Menger's theorem for undirected graphs follows directly from the next two assertions concerning the relationship between a graph G and the digraph $\vec{G}$ obtained by replacing each edge e of graph G with a pair of oppositely directed arcs having the same endpoints as edge e. Each of these assertions follows directly from the definitions.
Assertion 12.3.6. Let s and t be distinct vertices of a graph G, and let $\vec{G}$ be the digraph obtained by replacing each edge e of G with a pair of oppositely directed arcs having the same endpoints as edge e. Then there is a one-to-one correspondence between the s-t paths in graph G and the directed s-t paths in digraph $\vec{G}$. Moreover, two s-t paths in graph G are edge-disjoint if and only if their corresponding directed s-t paths in digraph $\vec{G}$ are arc-disjoint.
Assertion 12.3.7. Let s and t be distinct vertices of a graph G, and let $\vec{G}$ be defined as above. Then the minimum number of edges in an s-t separating set of graph G is equal to the minimum number of arcs in an s-t separating arc set of digraph $\vec{G}$.

43 Arc and Edge Versions of Menger's Theorem Revisited
Theorem 12.3.8 [Edge form of Menger's theorem]. Let s and t be distinct vertices in a graph G. Then the maximum number of edge-disjoint s-t paths in G equals the minimum number of edges in an s-t separating edge set of graph G.
Proof: This is an immediate consequence of Assertions 12.3.6 and 12.3.7, together with the arc form of Menger's theorem (Theorem 12.3.5).
Review from §5.1: The edge-connectivity $\kappa_e(G)$ is the size of a smallest edge-cut in graph G.
Definition: Let s and t be distinct vertices in a graph G. The local edge-connectivity between vertices s and t, denoted $\kappa_e(s,t)$, is the minimum number of edges in an s-t separating edge set in G.

44 Determining Edge-Connectivity Using Network Flows
Proposition 12.3.9. The edge-connectivity of a graph G is equal to the minimum of the local edge-connectivities, taken over all pairs of vertices s and t. That is:
$$\kappa_e(G) = \min_{s,t \in V_G} \kappa_e(s,t)$$
Proposition 12.3.9 and Theorem 12.3.8 suggest an algorithm for determining the edge-connectivity $\kappa_e(G)$ of an arbitrary graph G. The algorithm calculates the local edge-connectivity between each pair of vertices in G by solving an appropriate maximum flow problem in the network. In fact, as the next two results show, it is not necessary to calculate the local edge-connectivity between every pair of vertices.

45 Determining Edge-Connectivity Using Network Flows
Proposition 12.3.10. Let $\langle V_1, V_2 \rangle$ be a partition-cut of minimum cardinality in a graph G, and let $v_1$ and $v_2$ be any vertices in $V_1$ and $V_2$, respectively. Then the edge-connectivity $\kappa_e(G)$ equals the local edge-connectivity $\kappa_e(v_1, v_2)$.
Proof: Suppose that the minimum local edge-connectivity is achieved between vertices x and y. Then $\kappa_e(G) = \kappa_e(x,y)$ by Proposition 12.3.9. It suffices to show that $\kappa_e(v_1,v_2) \le \kappa_e(x,y)$. Let $\vec{G}$ be the digraph obtained by replacing each edge of graph G with two oppositely directed arcs. Then $\vec{G}$ can be regarded as a $v_1$-$v_2$ capacitated network and as an x-y capacitated network where each arc is assigned unit capacity. Let K* be a minimum $v_1$-$v_2$ cut in the $v_1$-$v_2$ network.

46 Determining Edge-Connectivity Using Network Flows
It follows that $cap(K^*) \le cap\langle V_1,V_2 \rangle$, since the partition-cut $\langle V_1,V_2 \rangle$ corresponds to a $v_1$-$v_2$ cut in that network. Next, let f* be a maximum flow and $\langle V_x,V_y \rangle$ a minimum x-y cut in the x-y network, so that $cap\langle V_x,V_y \rangle = val(f^*)$. Then
$$\kappa_e(v_1,v_2) = cap(K^*) \le cap\langle V_1,V_2 \rangle \le cap\langle V_x,V_y \rangle = val(f^*) = \kappa_e(x,y).$$ □

47 Arc and Edge Versions of Menger's Theorem Revisited
Corollary 12.3.11. Let s be any vertex in a graph G. Then
$$\kappa_e(G) = \min_{t \in V_G - \{s\}} \kappa_e(s,t)$$
Proof: Let $\langle V_1,V_2 \rangle$ be a partition-cut of minimum cardinality, and suppose, without loss of generality, that vertex $s \in V_1$. There must be some vertex $t \in V_2$ (otherwise, $E_G = \emptyset$, and the assertion would be trivially true). By Proposition 12.3.10 it follows that $\kappa_e(G) = \kappa_e(s,t)$. □
The variable $\kappa_e$ used in the next algorithm represents the edge-connectivity of graph G and is initialized with the sufficiently large positive integer $|E_G|$.

48 Arc and Edge Versions of Menger's Theorem Revisited
Algorithm 12.3.1 requires O(n) iterations, and since Algorithm 12.2.3 requires $O(n|E|^2)$ computations, the overall complexity of Algorithm 12.3.1 is $O(n^2|E|^2)$. More efficient algorithms exist.
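
The listing of Algorithm 12.3.1 is not reproduced in this transcript. The following is a hedged sketch of the idea stated in Corollary 12.3.11: fix one vertex s, compute the local edge-connectivity to every other vertex by a unit-capacity maximum flow, and keep the minimum. It assumes a simple undirected graph given as a vertex list and a list of edges; the function names are made up.

```python
from collections import deque


def local_edge_connectivity(vertices, edges, s, t):
    """Max flow from s to t when each edge becomes two opposite unit arcs."""
    res = {v: {} for v in vertices}
    for u, v in edges:
        res[u][v] = 1
        res[v][u] = 1
    value = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in res[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return value
        v = t
        while parent[v] is not None:       # push one unit along the path
            u = parent[v]
            res[u][v] -= 1
            res[v][u] += 1
            v = u
        value += 1


def edge_connectivity(vertices, edges):
    """Minimum over t of the local edge-connectivity between a fixed s and t."""
    s = vertices[0]
    kappa = len(edges)                     # initialized with |E_G|
    for t in vertices[1:]:
        kappa = min(kappa, local_edge_connectivity(vertices, edges, s, t))
    return kappa


vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a"), ("a", "c")]
kappa = edge_connectivity(vertices, edges)
```

The example is a 4-cycle with one chord. Vertices b and d have degree 2, so two edge removals disconnect the graph, while removing any single edge does not.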

49 Using Network Flows to Prove the Vertex Forms of Menger's Theorem
Construction of Digraph $N_D$ from Digraph D. Let s and t be any pair of non-adjacent vertices in a digraph D. The digraph $N_D$ is obtained from digraph D as follows:
- each vertex $x \in V_D - \{s,t\}$ corresponds to two vertices x' and x'' in digraph $N_D$ and an arc directed from x' to x''.
- each arc in digraph D that is directed from vertex s to vertex $x \in V_D - \{s,t\}$ corresponds to an arc in $N_D$ directed from s to x'.
- each arc in D that is directed from a vertex $x \in V_D - \{s,t\}$ to vertex t corresponds to an arc in $N_D$ directed from x'' to t.
- each arc in D that is directed from a vertex $x \in V_D - \{s,t\}$ to a vertex $y \in V_D - \{s,t\}$ corresponds to an arc in $N_D$ directed from x'' to y'.
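
The construction can be sketched as a short function. The representation is an assumption: x' is encoded as `(x, "in")` and x'' as `(x, "out")`, and arcs are `(tail, head)` tuples. Arcs entering s or leaving t are not covered by the four rules above; this sketch simply maps them with the same in/out convention.

```python
def split_vertices(arcs, s, t):
    """Build the arc set of N_D from the arcs of digraph D."""
    def out_copy(x):      # arcs leave a split vertex from x''
        return x if x in (s, t) else (x, "out")

    def in_copy(x):       # arcs enter a split vertex at x'
        return x if x in (s, t) else (x, "in")

    new_arcs = {(out_copy(u), in_copy(v)) for u, v in arcs}
    inner = {x for arc in arcs for x in arc} - {s, t}
    for x in inner:
        new_arcs.add(((x, "in"), (x, "out")))   # the arc from x' to x''
    return new_arcs


D = [("s", "a"), ("a", "b"), ("b", "t"), ("s", "b")]
ND = split_vertices(D, "s", "t")
```

Every directed s-t path in D that passes through x must use the single arc from x' to x'' in $N_D$, which is why internally disjoint paths in D correspond to arc-disjoint paths in $N_D$.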

50 Using Network Flows to Prove the Vertex Forms of Menger's Theorem
Review from §5.3: Let s and t be a pair of non-adjacent vertices in a graph G (or digraph D). An s-t separating vertex set in G (or in D) is a set of vertices whose removal destroys all s-t paths in G (or all directed s-t paths in D). Thus, an s-t separating vertex set is a set of vertices that contains at least one internal vertex of every (directed) s-t path.
Definition: Two (directed) s-t paths in a digraph D are internally disjoint if they have no internal vertices in common.

51 Relationship between digraphs D and $N_D$
Assertion 12.3.12. There is a one-to-one correspondence between directed s-t paths in digraph D and directed s-t paths in digraph $N_D$.
Assertion 12.3.13. Two directed s-t paths in D are internally disjoint if and only if their corresponding s-t directed paths in $N_D$ are arc-disjoint.
Assertion 12.3.14. The maximum number of internally disjoint directed s-t paths in D is equal to the maximum number of arc-disjoint directed s-t paths in $N_D$.
Assertion 12.3.15. The minimum number of vertices in an s-t separating vertex set in digraph D is equal to the minimum number of arcs in an s-t separating arc set in digraph $N_D$.

52 Relationship between digraphs D and $N_D$
Theorem 12.3.16 [Vertex Form of Menger for Digraphs]. Let s and t be a pair of non-adjacent vertices in a digraph D. Then the maximum number of internally disjoint directed s-t paths in D is equal to the minimum number of vertices in an s-t separating vertex set in D.
Proof: This follows from Assertions 12.3.12 through 12.3.15 together with the arc form of Menger's theorem (Theorem 12.3.5).
Theorem 12.3.17 [Vertex Form of Menger for Undirected Graphs]. Let s and t be a pair of non-adjacent vertices in a graph G. Then the maximum number of internally disjoint s-t paths in G is equal to the minimum number of vertices in an s-t separating vertex set in G.
Proof: This follows from Theorem 12.3.16 and Assertions 12.3.6 and 12.3.7.

53 Determining Vertex-Connectivity using Network Flow
Review from §5.3: Let s and t be non-adjacent vertices of a connected graph G. Then the local vertex-connectivity between s and t, denoted $\kappa_v(s,t)$, is the minimum number of vertices in an s-t separating vertex set.
Lemma 5.3.5. Let G be a connected graph containing at least one pair of non-adjacent vertices. Then the vertex-connectivity $\kappa_v(G)$ is the minimum of the local vertex-connectivity $\kappa_v(s,t)$, taken over all pairs of non-adjacent vertices s and t.
The following algorithm, with $O(n|E_G|^3)$ time-complexity, computes the vertex-connectivity of an n-vertex graph by calculating the local vertex-connectivity between various pairs of non-adjacent vertices. As in Algorithm 12.3.1, it is not necessary to calculate the local vertex-connectivity between each pair.

54 Determining Vertex-Connectivity using Network Flow

