Sharlee Climer Department of Computer Science and Engineering

Presentation on theme: "Sharlee Climer Department of Computer Science and Engineering"— Presentation transcript:

1 A Formalization of the Use of Bounds with Applications in Biology and Engineering
Sharlee Climer, Department of Computer Science and Engineering, Department of Biology, Washington University in St. Louis. This research was funded in part by NDSEG and Olin Fellowships, and by NSF grants IIS , ITR/EIA , and IIS

2 Overview
Introduction; limit crossing; cut-and-solve; TSP; haplotyping. 11/27/2018 Washington University in St. Louis

3 The use of bounds
Bounds are used in a number of search strategies, as well as in a large number of algorithms for particular problems. Many of these algorithms use bounds implicitly, and it is never stated that bounds have been used. [Figure labels: upper bound, optimal solution, lower bound.]

4 Use of bounds
Bounds have been extensively studied in both computer science and operations research. Pruning rules in branch-and-bound search. Previous efforts focused on relaxations. Vast number of ways that bounds can be produced.

5 Formulation and notation
The techniques presented can be applied to a variety of optimization problems. We'll use integer linear programs (IPs) as the basic problem structure. Without loss of generality, we consider only minimization problems.

6 Integer Linear Programs
IPs model a great number of research and engineering problems. CS applications: Traveling Salesman Problem; Constraint Satisfaction Problem; robotic motion problems; clustering; multiple sequence alignment; haplotype inferencing; VLSI circuit design; computer disk read head scheduling; derivation of physical structures of programs; Delay-Tolerant Network routing; cellular radio network base station locations; minimum-energy multicast problem in wireless ad hoc networks. In addition to STRIPS-style problems, IPs have been used to model a number of additional AI problems such as… Defend the use of IPs for the model. Remind that the general ideas presented should be applicable in other domains.

7 Integer Linear Programs
Minimize $Z = \sum_i c_i x_i$ (objective function), subject to a set of linear constraints, with $x_i$ integer. If the integrality constraints on $x_i$ are omitted, we have a linear program (LP). Minimize a linear expression. Define objective function, decision variables, and solution space. LPs are easy to solve; IPs usually are not.

8 Linear program example
Minimize Z = -11x + 4y, subject to: 3x + 8y <= 40; 11x - 8y <= 16; x, y >= 0. Integrality is not required. Easily solved using simplex.

9 Linear program example
Minimize Z = -11x + 4y, that is, y = (11/4)x + Z/4: a family of parallel lines with slope 11/4 and unknown y-intercept.

10 Linear program example
Optimal solution: x = 4, y = 7/2, Z = -30. The optimal solution is always on a vertex or edge of the feasible region.
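Because the optimum sits on a vertex, a two-variable LP like this one can be checked by enumerating the vertices directly. The following is a minimal sketch rather than the simplex method mentioned on the slide; it assumes a bounded feasible region, and the names `CONSTRAINTS`, `vertices`, and `solve_lp` are illustrative only.

```python
from fractions import Fraction as F
from itertools import combinations

# Constraints stored as (a, b, c), meaning a*x + b*y <= c.
# x >= 0 and y >= 0 are written as -x <= 0 and -y <= 0.
CONSTRAINTS = [
    (F(3), F(8), F(40)),
    (F(11), F(-8), F(16)),
    (F(-1), F(0), F(0)),
    (F(0), F(-1), F(0)),
]


def objective(x, y):
    return -11 * x + 4 * y


def vertices(constraints):
    """Yield every feasible intersection of two constraint boundaries."""
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:  # parallel boundaries never meet
            continue
        # Cramer's rule for the 2x2 system of boundary equations.
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if all(a * x + b * y <= c for a, b, c in constraints):
            yield x, y


def solve_lp(constraints):
    """Return (Z, x, y) for the vertex with the smallest objective value."""
    return min((objective(x, y), x, y) for x, y in vertices(constraints))


z, x, y = solve_lp(CONSTRAINTS)
print(f"x = {x}, y = {y}, Z = {z}")
```

Exact fractions are used so that the vertex (4, 7/2) is recovered without rounding error.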

11 Integer linear program
Minimize Z = -11x + 4y, subject to: 3x + 8y <= 40; 11x - 8y <= 16; x, y >= 0; x, y integer. Optimal solution: x = 3, y = 3, Z = -21. Relaxing integrality yields a lower bound: the LP optimum of -30 is below the IP optimum of -21. The LP is easy to solve; the IP may be NP-hard.
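For an instance this small, the integer optimum and the lower-bound relationship can be confirmed by brute force. This is only a sketch for the two-variable example; enumeration like this does not scale to realistic IPs.

```python
from fractions import Fraction as F


def feasible(x, y):
    return 3 * x + 8 * y <= 40 and 11 * x - 8 * y <= 16 and x >= 0 and y >= 0


def objective(x, y):
    return -11 * x + 4 * y


# 3x + 8y <= 40 with x, y >= 0 implies x <= 13 and y <= 5, so a small grid suffices.
candidates = [(x, y) for x in range(14) for y in range(6) if feasible(x, y)]
best_x, best_y = min(candidates, key=lambda p: objective(*p))
ip_value = objective(best_x, best_y)

# LP optimum from the relaxed example: x = 4, y = 7/2.
lp_value = objective(F(4), F(7, 2))

print(best_x, best_y, ip_value)
print("LP relaxation is a lower bound:", lp_value <= ip_value)
```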

12 The Traveling Salesman Problem
The Traveling Salesman Problem (TSP) is the problem of finding a minimum-cost complete tour of a set of cities. It is NP-hard.

13 Optimal solution for 49-city TSP

14 The Traveling Salesman Problem
Minimize $Z = \sum_{i}\sum_{j} c_{ij} x_{ij}$ s.t.: $\sum_{i} x_{ij} = 1$ for $j = 1,\dots,n$; $\sum_{j} x_{ij} = 1$ for $i = 1,\dots,n$; $\sum_{i \in W}\sum_{j \in W} x_{ij} \le |W| - 1$ for all proper non-empty subsets $W$ of $V$; $x_{ij} \in \{0,1\}$.

15 Branch-and-bound search
Branching rules; determine structure of search tree; relaxations; lower-bounding modification; pruning; heuristics to guide search.

16 TSP: Omit subtour elimination constraints
Minimize $Z = \sum_{i}\sum_{j} c_{ij} x_{ij}$ s.t.: $\sum_{i} x_{ij} = 1$ for $j = 1,\dots,n$; $\sum_{j} x_{ij} = 1$ for $i = 1,\dots,n$; $x_{ij} \in \{0,1\}$. The subtour elimination constraints, $\sum_{i \in W}\sum_{j \in W} x_{ij} \le |W| - 1$ for all proper non-empty subsets $W$ of $V$, are omitted. The result is the assignment problem, which can be solved in polynomial time. Insert picture of 49-city TSP with subtours.
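Dropping the subtour elimination constraints can only lower the optimal value, so the assignment problem gives a lower bound on the TSP. The sketch below illustrates this on a hypothetical 5-city asymmetric cost matrix (`COST` is invented for illustration, not taken from the talk). Both problems are solved by brute-force enumeration for clarity, not by a polynomial-time assignment algorithm.

```python
from itertools import permutations

# Hypothetical cost matrix; COST[i][j] is the cost of arc i -> j.
COST = [
    [0, 3, 9, 8, 7],
    [2, 0, 8, 9, 6],
    [9, 8, 0, 1, 7],
    [8, 9, 2, 0, 6],
    [7, 6, 7, 6, 0],
]


def assignment_bound(cost):
    """Cheapest choice of one successor per city, no self-loops, subtours allowed."""
    n = len(cost)
    return min(
        sum(cost[i][p[i]] for i in range(n))
        for p in permutations(range(n))
        if all(p[i] != i for i in range(n))
    )


def tsp_optimum(cost):
    """Cheapest single tour through all cities, starting and ending at city 0."""
    n = len(cost)
    best = None
    for rest in permutations(range(1, n)):
        tour = (0,) + rest + (0,)
        length = sum(cost[a][b] for a, b in zip(tour, tour[1:]))
        if best is None or length < best:
            best = length
    return best


ap = assignment_bound(COST)
tsp = tsp_optimum(COST)
print("assignment bound:", ap, "TSP optimum:", tsp)
```

On this matrix the assignment solution splits into subtours, which is why its value falls strictly below the cost of the best tour.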

17 TSP: Omit subtour elimination constraints

18 TSP: Relax integrality constraints
Minimize $Z = \sum_{i}\sum_{j} c_{ij} x_{ij}$ s.t.: $\sum_{i} x_{ij} = 1$ for $j = 1,\dots,n$; $\sum_{j} x_{ij} = 1$ for $i = 1,\dots,n$; $\sum_{i \in W}\sum_{j \in W} x_{ij} \le |W| - 1$ for all proper non-empty subsets $W$ of $V$. The integrality constraint $x_{ij} \in \{0,1\}$ is replaced by $0 \le x_{ij} \le 1$. This is the linear program (LP) relaxation, which can be solved in polynomial time. Insert picture of 49-city TSP with subtours.

19 TSP: Relax integrality constraints

20 Branch-and-bound search
Incumbent solution

21 Limit crossing
A 2-step procedure for exploring the use of bounds. It has been implicitly used in a number of algorithms and search strategies. To our knowledge, it hasn't been formalized. Broaden focus beyond traditional search.

22 Limit crossing
2 steps: (1) Find a simple upper or lower bound. (2) Combine upper-bounding and lower-bounding modifications and solve. If the solution of the doubly-modified problem exceeds the simple upper bound, the upper-bounding modification in step (2) is invalid. If the solution of the doubly-modified problem is less than the simple lower bound, the lower-bounding modification in step (2) is invalid.
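One way to picture the two steps is on a small TSP. Step (1) takes the cost of any feasible tour as a simple upper bound. Step (2) forces one arc into the solution (an upper-bounding modification) while dropping the subtour constraints (a lower-bounding modification). If the doubly-modified optimum exceeds the upper bound, forcing that arc is invalid, so the arc cannot belong to an optimal tour. The sketch below is an illustration under these assumptions, not the authors' implementation; the cost matrix and function names are hypothetical.

```python
from itertools import permutations

# Hypothetical cost matrix; COST[i][j] is the cost of arc i -> j.
COST = [
    [0, 3, 9, 8, 7],
    [2, 0, 8, 9, 6],
    [9, 8, 0, 1, 7],
    [8, 9, 2, 0, 6],
    [7, 6, 7, 6, 0],
]


def tour_cost(cost, tour):
    """Cost of visiting the cities in order and returning to the start."""
    return sum(cost[a][b] for a, b in zip(tour, tour[1:] + tour[:1]))


def forced_assignment_bound(cost, i, j):
    """Doubly-modified problem: arc i -> j is forced, subtour constraints dropped."""
    n = len(cost)
    return min(
        sum(cost[k][p[k]] for k in range(n))
        for p in permutations(range(n))
        if p[i] == j and all(p[k] != k for k in range(n))
    )


def excluded_arcs(cost, upper_bound):
    """Arcs whose doubly-modified optimum crosses the upper bound."""
    n = len(cost)
    return [
        (i, j)
        for i in range(n)
        for j in range(n)
        if i != j and forced_assignment_bound(cost, i, j) > upper_bound
    ]


# Step 1: a simple upper bound from an arbitrary feasible tour.
upper_bound = tour_cost(COST, [0, 1, 2, 3, 4])
# Step 2: solve one doubly-modified problem per arc.
excluded = excluded_arcs(COST, upper_bound)
print("upper bound:", upper_bound)
print("arcs that cannot be in an optimal tour:", excluded)
```

A tighter upper bound in step (1) lets step (2) exclude more arcs, which is why the simple bound should be as tight as is practical.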

23 Limit crossing
Find a simple upper or lower bound that is tight. Systematically apply modifications to produce doubly-modified problems. Either modification alone can be difficult to solve; only the combination of the two modifications needs to be relatively easy. We are not limiting ourselves to setting variable values for the upper-bounding modification of the doubly-modified problem.

24 Modifications to obtain bounds
Many possibilities for obtaining bounds have been previously overlooked. Examine every aspect of the problem description. Modifications of IPs to produce bounds: relaxing or tightening constraints; modifying the objective function; adding or deleting decision variables. Use simple example problem to demonstrate.

25 Limit crossing strategies
Cut-and-solve [Climer and Zhang, Artificial Intelligence, to appear]: an iterative search strategy; useful for general combinatorial optimization problems. Backbone and fat identifier [Climer and Zhang, AAAI-02]: used to identify characteristic variables.

26 Cut-and-solve
For each iteration: Step 1: a chunk of the solution space is cut away and solved. Step 2: a relaxed solution is found for the remaining solution space. Iterate until the relaxed solution is greater than or equal to the incumbent. The incumbent is then guaranteed to be optimal.

27 Example
x >= 0; y <= 3; y + 13/6 x <= 9; y - 5/13 x >= 1/14; y + 3/5 x >= 6/5; x, y integers.

28 Optimal solution
Minimize Z = y - 4/5 x. Optimal solution: x = 2, y = 1, Z = -0.6.

29 Iteration 1, first step
Cut away a chunk of the solution space, y - 17/3 x <= -14, and solve the sparse problem.

30 Iteration 1, first step
x = 3, y = 2, Z = -0.4. Incumbent solution is -0.4.

31 Iteration 1, second step
Add new constraint y - 17/3 x >= -14 to cut off the solved chunk of the solution space. Relax integrality and solve.

32 Iteration 1, second step
x = 2.6, y = 1.0, Z = -1.1. The relaxed value is less than the incumbent of -0.4, so another iteration is needed.

33 Iteration 2, first step
Cut away a chunk of the solution space and solve the sparse problem.

34 Iteration 2, first step
x = 2, y = 1, Z = -0.6. This solution is less than the incumbent, so the incumbent becomes -0.6.

35 Iteration 2, second step
Add a constraint to cut off the solved chunk. Relax integrality and solve.

36 Iteration 2, second step
x = 1.0, y = 0.6, Z = -0.2. Incumbent value: Z = -0.6. The relaxed solution is greater than the incumbent, so the incumbent must be optimal.
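The iterations above can be replayed in a few lines of code. The sketch below makes several simplifying assumptions that are not from the talk. The piercing cut is the vertical chunk x >= floor(x*), where x* is the relaxed optimum, instead of the oblique cut on the slides. The relaxation is solved by vertex enumeration, and each sparse problem is solved by enumerating integer points. All function names are illustrative.

```python
from fractions import Fraction as F
from itertools import combinations
from math import floor

# Example constraints, each stored as (a, b, c) meaning a*x + b*y <= c.
BASE = [
    (F(-1), F(0), F(0)),            # x >= 0
    (F(0), F(1), F(3)),             # y <= 3
    (F(13, 6), F(1), F(9)),         # y + 13/6 x <= 9
    (F(5, 13), F(-1), F(-1, 14)),   # y - 5/13 x >= 1/14
    (F(-3, 5), F(-1), F(-6, 5)),    # y + 3/5 x >= 6/5
]
X_MAX = 4  # y > 0 and y + 13/6 x <= 9 imply x <= 54/13, so integer x <= 4


def objective(x, y):
    return y - F(4, 5) * x


def feasible(x, y, constraints):
    return all(a * x + b * y <= c for a, b, c in constraints)


def solve_relaxation(constraints):
    """LP optimum (Z, x, y) by vertex enumeration; None if the region is empty."""
    best = None
    for (a1, b1, c1), (a2, b2, c2) in combinations(constraints, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:  # parallel boundaries
            continue
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        if feasible(x, y, constraints):
            candidate = (objective(x, y), x, y)
            if best is None or candidate < best:
                best = candidate
    return best


def solve_sparse(constraints, x_cut):
    """Exact optimum over the integer points of the chunk x >= x_cut."""
    points = [(objective(x, y), x, y)
              for x in range(x_cut, X_MAX + 1) for y in range(0, 4)
              if feasible(x, y, constraints)]
    return min(points) if points else None


def cut_and_solve():
    remaining = list(BASE)
    incumbent = None
    while True:
        relaxed = solve_relaxation(remaining)
        if relaxed is None:
            return incumbent  # nothing left to search
        if incumbent is not None and relaxed[0] >= incumbent[0]:
            return incumbent  # lower bound meets incumbent: optimal
        x_cut = floor(relaxed[1])  # chunk x >= x_cut contains the relaxed optimum
        sparse = solve_sparse(remaining, x_cut)
        if sparse is not None and (incumbent is None or sparse < incumbent):
            incumbent = sparse
        remaining.append((F(1), F(0), F(x_cut - 1)))  # keep only x <= x_cut - 1


result = cut_and_solve()
print(result)
```

Only the accumulated cut constraints and the incumbent are carried between iterations, so no search tree has to be stored.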

37 Cut-and-solve properties
Nominal memory requirements: keep only the new constraints and the incumbent solution from one iteration to the next. No subtrees in which to get lost. Can be used as a complete anytime solver. Can use parallel processing.

38 Cut-and-solve
Same as the two steps of limit crossing. A small chunk is solved to provide a simple upper bound. Doubly-modified problem: piercing cuts; relaxation; unusual upper-bounding modification.

39 Cut-and-solve
We used the generic algorithm for the TSP [Artificial Intelligence, to appear]. 7 real-world problem classes [Cirasella, Johnson, McGeoch, Zhang, Lecture Notes in Computer Science, 2000]. 500 instances solved for each class and size. Comparisons with: CDT [Carpaneto, Dell’Amico, and Toth, ACM Trans. on Math. Software, 1995]; Concorde [Applegate et al.; Cplex [ILOG. STSPs are hard if very large; our code is not designed for very large problems (arc lengths computed on the fly). A simple implementation, yet it outperforms state-of-the-art solvers on difficult instances.

40 Shortest common superstring

41 Tilted drilling machine (additive norm)

42 Tilted drilling machine (sup norm)

43 Stacker crane

44 Computer disk read head

45 Pay phone coin collection

46 No-wait flow shop

47 Largest problem size solved by each method

48 Moving beyond traditional tree search
Cut-and-solve; backbone & fat identifier.

49 Haplotype inferencing
What are haplotypes? Why should we care about them? How can we infer haplotypes?

50 Haplotype inferencing
…TGGCACTTCCGAACTTTG… …TGGTACTTCCGAACATTG… …TGGCACTGCCGAACATTG… …TGGCACTGCCGAACTTTG…

51 Haplotype inferencing
…TGGCACTTCCGAACTTTG… …TGGTACTTCCGAACATTG… …TGGCACTGCCGAACATTG… …TGGCACTGCCGAACTTTG…

52 Haplotype inferencing
…C T T… …T T A… …C G A… …C G T…

53 Haplotype inferencing
…C T T… -> …0 0 1…; …T T A… -> …1 0 0…; …C G A… -> …0 1 0…; …C G T… -> …0 1 1…

54 Haplotype inferencing
Haplotypes: …0 0 1… …1 0 0… …0 1 0… …0 1 1… Genotypes: …2 0 2… …0 1 2…

55 Haplotype inferencing
If a site on a genotype is the product of two different nucleotides, it is heterozygous; otherwise it is homozygous. There are $2^{k-1}$ feasible resolutions for k heterozygous sites.
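The 0/1/2 encoding and the count of resolutions can be made concrete with a short sketch. It assumes the convention visible in the preceding slides: haplotypes are strings of 0s and 1s, and a genotype site is written as 2 when the two haplotypes differ there. The function names are illustrative.

```python
from itertools import product


def genotype(h1, h2):
    """Combine two haplotypes: equal alleles keep their value, differing ones become '2'."""
    return "".join(a if a == b else "2" for a, b in zip(h1, h2))


def resolutions(g):
    """List the unordered haplotype pairs that could produce genotype g."""
    het = [i for i, site in enumerate(g) if site == "2"]
    if not het:
        return [(g, g)]  # fully homozygous: only one resolution
    pairs = []
    # Fixing the first heterozygous site of h1 to '0' lists each unordered pair once.
    for bits in product("01", repeat=len(het) - 1):
        h1, h2 = list(g), list(g)
        for site, bit in zip(het, ("0",) + bits):
            h1[site] = bit
            h2[site] = "1" if bit == "0" else "0"
        pairs.append(("".join(h1), "".join(h2)))
    return pairs


print(genotype("001", "100"))
for pair in resolutions("202"):
    print(pair)
```

Each heterozygous site after the first doubles the number of candidate pairs, which gives the $2^{k-1}$ count for k >= 1.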

56 Haplotype inferencing
Example: g1: g2: g3: g4: g5: g6: g7: g8:

57 Washington University in St. Louis
g1: g2: 01010 , , 11001 01011 , 11010 g3: g4: 01110 , , 10001 g5: g6: 01100 , , 01100 01101 , 10100 00100 , 11101 00101 , 11100 g7: g8: 11100 , , 00111 01111 , 00011

59 Why do we care?
Genetic association studies use haplotypes. Identify relationships between genes and diseases. International HapMap Consortium: identify genotypes, use PHASE [Stephens and Donnelly, Am. J. of Hum. Gen., 2003]; “haplotypes of extremely high quality” [The International HapMap Consortium, Nature, 2005].

60 How can we infer haplotypes?
Consider genotypes from a population. Different objectives have been proposed: pure parsimony; PHASE.

61 Pure parsimony
Find the minimum number of haplotypes that will resolve the set. Exponential time (worst case). Gusfield cast it as an IP [CPM 2003]. Solved some instances with 30 sites and 50 individuals. Doesn’t consider similarities of haplotypes.
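The pure parsimony objective can be stated in code by exhaustive search over one resolution per genotype. This sketch is not Gusfield's IP formulation; it is an exponential-time brute force meant only to make the objective concrete. The genotypes in the usage lines are hypothetical, using the 0/1/2 encoding in which 2 marks a heterozygous site.

```python
from itertools import product


def resolutions(g):
    """List the unordered haplotype pairs that could produce genotype g."""
    het = [i for i, site in enumerate(g) if site == "2"]
    if not het:
        return [(g, g)]
    pairs = []
    for bits in product("01", repeat=len(het) - 1):
        h1, h2 = list(g), list(g)
        for site, bit in zip(het, ("0",) + bits):
            h1[site] = bit
            h2[site] = "1" if bit == "0" else "0"
        pairs.append(("".join(h1), "".join(h2)))
    return pairs


def pure_parsimony(genotypes):
    """Return (haplotype set, chosen pairs) using the fewest distinct haplotypes."""
    best = None
    # Try every combination of one resolution per genotype.
    for choice in product(*(resolutions(g) for g in genotypes)):
        used = {h for pair in choice for h in pair}
        if best is None or len(used) < len(best[0]):
            best = (used, choice)
    return best


# Hypothetical genotypes for illustration.
haplotypes, pairs = pure_parsimony(["202", "012", "021", "220"])
print(sorted(haplotypes))
for g, pair in zip(["202", "012", "021", "220"], pairs):
    print(g, "->", pair)
```

The number of combinations is the product of the per-genotype resolution counts, which is why practical instances call for an IP or another exact method.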

62 12 parsimonious solutions: 11 haplotypes

63 PHASE
Weights are used to select haplotype pairs that have one haplotype already in the set. Weights for haplotypes that are “similar” to those in the set. Divide-and-conquer.

64 PHASE solution: 11 haplotypes

65 PHASE solution: $\sum d_{ij} = 7$

66 Haplotype inferencing
Recent study by Andres, Clark, Hixson, Boerwinkle, and Sing: computational methods, including PHASE; poor performance; degree of uncertainty; “highly error prone”.

67 Three challenges
Find a biologically meaningful model. Space complexity. Time complexity.

68 Haplotype inferencing
PHASE: favors reduced cardinality; favors increased similarities. Our method: favors reduced cardinality and increased similarities; combinatorial approach; uses a single parameter d.

69 11 haplotypes

70 $\sum d_{ij} = 6$

71 Summary
Limit crossing: 2-step procedure for using bounds; explore every facet of the model. Cut-and-solve: generic algorithm for IPs. TSP: outperformed other solvers for 5 out of 7 problem classes. Haplotyping.

72 Future work
Haplotyping: customized limit crossing approach; accommodate multi-allelic data; automatically reduce trio data; accept phased data. Genome-wide association testing. Combinatorial approaches to biological problems.

