
Algorithm Frameworks Using Adaptive Sampling Richard Peng Georgia Tech.


1 Algorithm Frameworks Using Adaptive Sampling Richard Peng Georgia Tech

2 OUTLINE Sampling and its algorithmic incorporation Adaptive sampling for approximating maxflow Adaptive sampling for solving linear systems

3 RANDOM SAMPLING Pick a small subset of a collection of many objects Goals: estimate quantities, reduce sizes, speed up algorithms

4 ALGORITHMIC USE OF SAMPLING Framework: compute on the sample, bring the answer back to the original Examples: quicksort / quickselect, geometric cuttings, Nystrom method
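A minimal sketch of this framework in Python (my own illustration, not from the talk): quickselect picks a random pivot as its "sample", computes the partition against it, and brings the answer back to the original problem by recursing on one side.

```python
import random

def quickselect(arr, k):
    """Return the k-th smallest element (0-indexed) of arr.

    Sample-then-recurse framework: a random pivot is the "sample",
    the partition is the computation on it, and recursing on the
    correct side brings the answer back to the original input.
    """
    pivot = random.choice(arr)            # the "sample"
    lo = [x for x in arr if x < pivot]    # compute relative to the sample
    eq = [x for x in arr if x == pivot]
    hi = [x for x in arr if x > pivot]
    if k < len(lo):
        return quickselect(lo, k)
    if k < len(lo) + len(eq):
        return pivot
    return quickselect(hi, k - len(lo) - len(eq))
```

The expected running time is linear because the sample halves the problem on average.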

5 THIS TALK Adaptive sampling based algorithms for finding combinatorial and scientific flows Current best O(m log^c n) time guarantees for: approximate maxflow / balanced cuts, computation of random walks / eigenvectors Measure runtime with big-O, address worst-case instances, including ill-conditioned ones

6 ADAPTIVE SAMPLING Iterative schemes: a chain of samples, each constructed from the previous Preserve 'just enough' for the answer on the sample to be useful Problem 1 → Problem 2 → … → Problem d

7 PRESERVING GRAPH STRUCTURES This talk: undirected graphs, n vertices, m < n^2 edges Is n^2 edges (dense) sometimes necessary? For connectivity: < n edges always ok

8 PRESERVING MORE [BK `96]: for ANY G, can get H with O(n log n) edges s.t. G ≈ H on all cuts How: keep edge e with probability p_e, rescale if kept to maintain expectation Andras Benczur David Karger
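The keep-and-rescale step can be sketched in a few lines, assuming a simple edge-list representation (an illustration of the idea, not the [BK `96] implementation):

```python
import random

def sample_graph(edges, probs):
    """edges: list of (u, v, w); probs[e]: keep probability for edge e.

    Keeps edge e with probability p_e and rescales its weight to
    w_e / p_e, so every edge (and hence every cut) keeps its
    expected weight.
    """
    sampled = []
    for (u, v, w), p in zip(edges, probs):
        if random.random() < p:
            sampled.append((u, v, w / p))
    return sampled
```

With good probabilities, concentration arguments show all 2^n cuts are preserved simultaneously; choosing those probabilities is the subject of the next slides.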

9 HOW TO PICK PROBABILITIES Widely used: uniform sampling Works well when data is uniform, e.g. complete graph Problem: long path, removing any edge changes connectivity (can also have both in one graph) Fix: pick probabilities adaptively based on the graph

10 THE 'RIGHT' PROBABILITIES Path + clique: path edges need p_e = 1, clique edges only p_e ≈ 1/n [RV `07], [Tropp `12], [SS `08]: suffices to have p_e ≥ O(log n) × w_e × (effective resistance of e) Gives spectral sparsifiers, which suffice for all the results in this talk, e.g. preserve cuts Dan Spielman Nikhil Srivastava
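These quantities (w_e times effective resistance, the statistical leverage scores) can be computed naively from the Laplacian pseudoinverse; a dense numpy sketch for intuition only, since fast computation is a topic of its own below:

```python
import numpy as np

def effective_resistances(n, edges):
    """edges: list of (u, v, w). Returns w_e * R_e per edge, where
    R_e is the effective resistance across e, via the Laplacian
    pseudoinverse L^+ (dense and slow; for illustration only)."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    Lp = np.linalg.pinv(L)
    scores = []
    for u, v, w in edges:
        chi = np.zeros(n); chi[u], chi[v] = 1.0, -1.0
        scores.append(w * chi @ Lp @ chi)   # w_e (e_u - e_v)^T L^+ (e_u - e_v)
    return scores
```

On a path every edge is a bridge and gets score 1 (must keep it); on a clique each score is 2/n (safe to drop most edges). For any connected graph the scores sum to n - 1, which is why O(n log n) samples suffice.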

11 SAMPLING MORE [Talagrand `90] "Embedding subspaces of L_1 into l_1^N": non-linear / matrix analog, O(d log d) objects for d dimensions Mathematical view of probabilities: 2-norm (║x║_2): leverage scores 1-norm (║x║_1): [CP `15]: Lewis weights Michel Talagrand

12 COMPUTING SAMPLING PROBABILITIES [ST `04][OV `11]: spectral partitioning [SS `08][KLP `12]: projections + solves [Koutis `14]: spanners / low diameter partitions [LMP `13][CLMMPS `15]: self-reduction / recursion [DMMW `13][CW `13]: sketches [BSS `09][LS `15]: potential functions For an edge e with Laplacian L_e: w_e × r_e = trace(L^+ L_e) (^+: pseudo-inverse)

13 OUTLINE Sampling and its algorithmic incorporation Adaptive sampling for approximating maxflow Adaptive sampling for solving linear systems

14 PROBLEM Undirected graphs, n vertices, m edges: maximum number of disjoint s-t paths Applications: routing, scheduling Dual: separate s and t by removing fewest edges Applications: partitioning, clustering Assume capacities = poly(n) Goal: 1±ε approx in O(m log^c n ε^-2) time

15 'PROTOTYPE' FLOW ALGORITHM (FORD-FULKERSON `56) While there exists an s-t path: route it, adjust the graph May also 'fix' earlier flow via the residual graph Faster: [EK `73]: fewer paths, O(m^2 n) [Dinic `73][HK `75][GN `80][ST `83]: shorter / simpler paths, O(nm log n) L. R. Ford Jr. D. R. Fulkerson
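The prototype can be sketched directly. The version below finds each augmenting path by BFS (i.e. the shortest-path variant) and adjusts the residual graph after each route; the capacity-dictionary format is my own assumption for illustration.

```python
from collections import deque

def maxflow(cap, s, t):
    """cap: dict {(u, v): capacity}. Ford-Fulkerson scheme:
    while an s-t path exists in the residual graph, route the
    bottleneck amount along it and adjust residual capacities.
    BFS makes each path a shortest one (Edmonds-Karp)."""
    res = dict(cap)
    for (u, v) in cap:                    # reverse arcs let us 'fix' earlier flow
        res.setdefault((v, u), 0)
    adj = {}
    for (u, v) in res:
        adj.setdefault(u, []).append(v)
    flow = 0
    while True:
        par = {s: None}                   # BFS for an augmenting path
        q = deque([s])
        while q and t not in par:
            u = q.popleft()
            for v in adj.get(u, []):
                if v not in par and res[(u, v)] > 0:
                    par[v] = u
                    q.append(v)
        if t not in par:
            return flow                   # no augmenting path: done
        path, v = [], t
        while par[v] is not None:
            path.append((par[v], v))
            v = par[v]
        aug = min(res[e] for e in path)   # bottleneck along the path
        for (u, v) in path:               # adjust the residual graph
            res[(u, v)] -= aug
            res[(v, u)] += aug
        flow += aug
```

Each augmentation costs O(m), and the number of augmentations is what the improvements on this slide and the next reduce.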

16 EXTREME INSTANCES Highly connected, need global steps Long paths / tree, need many steps Each is easy on its own (gradient steps, MCMC, power method, min-degree, dynamic trees, DFS), but must handle both simultaneously

17 LIMIT OF THIS APPROACH? [HK `75]: if we can route f units of flow in a unit-capacity graph, the shortest augmenting path has length at most m/f If we can find (approximate) shortest paths dynamically: Σ_{f=1}^{m} (m/f) = O(m log n) John Hopcroft Richard Karp
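The O(m log n) total path length is just the harmonic sum; a quick numerical check of the bound:

```python
import math

def total_path_length_bound(m):
    """Sum the [HK `75] per-augmentation path-length bound m/f over
    f = 1..m; this equals m * H_m, which is at most m * (ln m + 1)."""
    return sum(m / f for f in range(1, m + 1))
```

So even m augmentations cost only O(m log m) = O(m log n) total path length, if the paths can be found quickly.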

18 WORK RELATED TO FLOWS LED TO: 1956: augmenting paths, 1970s: blocking flows, 1980: dynamic trees, 1986: dual algorithms, 1999: scaling, 2010: linear systems Augmenting paths [FF `56] Notion of polynomial time [Edmonds `65] Blocking flows [EK `75, Dinic `73] Dynamic tree data structures [GN `80] Dual/scaling algorithms [Gabow `83, GT `88, GR `99] Faster solvers for graph Laplacians [Vaidya `89]

19 AVERAGE TOGETHER FLOWS NUMERICALLY Numerical notions of 'approximate' shortest paths interact well with hierarchical decompositions and divide-and-conquer on graphs

20 NUMERICAL MAXFLOW ALGORITHMS [Sherman `13][KLOS `14]: given an operator that α-approximates maxflow for ANY demand d, can compute a (1+ε)-approx maxflow in O(α^2 log n ε^-2) calls [Madry `10][KLOS `14]: build this operator with α = O(m^θ) in O(m^{1+θ}) time Combining gives O(m^{1+θ}); will show next: O(m log^c n) Jonah Sherman Jon Kelner Yin-Tat Lee Lorenzo Orecchia Aaron Sidford Alexander Madry

21 OPERATOR? Tree: unique s-t path, maxflow = demand / bottleneck edge Multiple (exact) demands: flows along each edge determined via a linear mapping, O(n) time

22 TREE FOR ANY GRAPH? [Racke `01]: ANY undirected graph has a tree that's an O(log^c n) approximator [RST `14]: can find such a tree by solving maxflows of total size O(m log^c n) Harald Racke Hanjo Taubig Chintan Shah

23 USING THESE DECOMPOSITIONS? Maxflow → Approximator: [RST `14]: operator with α = O(log^c n) via maxflows of total size O(m log^c n) Approximator → Maxflow: [Sherman `13][KLOS `14]: given an operator that α-approximates maxflow for ANY demand d, can compute a (1+ε)-approx maxflow in O(α^2 log n ε^-2) calls O(m log^c n ε^-2) time?

24 RESOLUTION: RECURSION Create a smaller approximation Build a cut/flow approximator on it recursively Use the (fixed size) approximator for the original Adaptive sampling acts as the driver that controls the progress of the recursion

25 SIZE REDUCTION Ultra-sparsifier: for any k, can find H ≈_k G that's a tree + O(m log^c n / k) edges e.g. [Koutis-Miller-P `10]: pick a good tree, sample off-tree edges by their 'stretch' Reducible to O(m log^c n / k) vertices/edges Yiannis Koutis Gary Miller

26 [P `16]: INCORPORATING REDUCTIONS Construct on the ultra-sparsifier with approximation factor k: new size = O(m log^c n / k), α = O(k log^{2c} n) for the original graph Use: maxflow from O(α^2 log n ε^-2) calls Construction: α = O(log^c n) via maxflows on graphs of total size O(m log^c n) T(m) = T(m log^{2c} n / k) + O(m k^2 log^{2c+1} n) Set k ← O(log^{2c} n): T(m) = T(m/2) + O(m log^{4c+1} n) = O(m log^{4c+1} n)

27 OUTLINE Sampling and its algorithmic incorporation Adaptive sampling for approximating maxflow Adaptive sampling for solving linear systems

28 GRAPH LAPLACIANS Matrices that correspond to undirected graphs Entries ↔ vertices: n by n matrices Non-zeros ↔ edges: O(m) nonzeros e.g. the path on 3 vertices gives [[1, -1, 0], [-1, 2, -1], [0, -1, 1]] Problem: given graph Laplacian L and vector b, find x s.t. Lx = b
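Building the Laplacian from an edge list is direct: weighted degrees on the diagonal, minus the weight off-diagonal. A sketch, with a dense pseudoinverse solve for illustration only (L is singular on the all-ones vector, and fast solvers are exactly this part of the talk):

```python
import numpy as np

def laplacian(n, edges):
    """n-by-n graph Laplacian of a weighted undirected graph:
    L[u][u] = weighted degree of u, L[u][v] = -w for each edge (u, v, w)."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    return L
```

Usage: for b orthogonal to the all-ones vector, `np.linalg.pinv(laplacian(n, edges)) @ b` solves Lx = b on a small example; the algorithms below do the same in near-linear time.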

29 THE LAPLACIAN PARADIGM Directly related: elliptic systems Few iterations: eigenvectors, heat kernels Many iterations / modify algorithm: graph problems, image processing Dan Spielman Shanghua Teng

30 USE MATLAB? TRILINOS? LAPACK? Sequence of (adaptively) generated linear systems https://github.com/sachdevasushant/Isotonic/blob/master/README.md : "…we suggest rerunning the program a few times and/or using a different solver. An alternate solver based on incomplete Cholesky factorization is provided…" Optimization Problem → Linear System → Solver Kevin Deweese

31 SIMPLIFICATION Adjust/rescale so diagonal = I Add to diagonal to make full rank L = I – A A: Random walk
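The rescaling step can be written as D^{-1/2} L D^{-1/2} = I − A, where A is the symmetrized random-walk matrix; a small sketch (assumes no isolated vertices, and omits the full-rank diagonal shift):

```python
import numpy as np

def normalize(L):
    """Rescale a Laplacian so its diagonal becomes the identity:
    D^{-1/2} L D^{-1/2} = I - A, where A = D^{-1/2} W D^{-1/2} is the
    (symmetrized) random-walk matrix of the graph."""
    d = np.sqrt(np.diag(L))               # D^{1/2} entries
    A = np.eye(L.shape[0]) - L / np.outer(d, d)
    return A
```

All eigenvalues of A lie in [-1, 1], which is what makes the series and squaring arguments on the next slides work.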

32 LOWER BOUND FOR ITERATIVE METHODS Division with multiplication: L^-1 = (I – A)^-1 = I + A + A^2 + A^3 + … Graph theoretic interpretation: each term ↔ one more step of the walk: b, Ab, A^2 b, … Need Ω(diameter) steps

33 REPEATED SQUARING (I – A)^-1 = I + A + A^2 + A^3 + … = (I + A)(I + A^2)(I + A^4)… A^16 = (((A^2)^2)^2)^2: 4 operations O(log n) terms ok Problem: dense matrix! Similar to multi-level methods
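The factorization is easy to check numerically on a small matrix with spectral radius below 1 (a dense illustration; the point of the next slides is keeping the squared matrices sparse):

```python
import numpy as np

def neumann_by_squaring(A, d):
    """Approximate (I - A)^-1 by the product
    (I + A)(I + A^2)(I + A^4)...(I + A^(2^(d-1))):
    d matrix multiplies capture 2^d terms of the Neumann series."""
    n = A.shape[0]
    result = np.eye(n)
    P = A.copy()
    for _ in range(d):
        result = result @ (np.eye(n) + P)
        P = P @ P                      # square: A -> A^2 -> A^4 -> ...
    return result
```

Since the factors are powers of the same matrix they commute, so the multiplication order does not matter; d = O(log(1/gap)) factors already give high accuracy.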

34 DENSE INTERMEDIATE OBJECTS Matrix powers Matrix inverse Transitive closures LU factorizations Cost-prohibitive to store / find But can access a sparse version

35 GRAPH THEORETIC VIEW A: one step of random walk A^2: 2-step random walk Still a graph! [PS `14]: can directly access a sparsifier of it in O(m log^c n) time

36 HIGHER POWERS A: random walk A^k: k-step random walk Sparse graph close to A^k [CCLPT `15]: can also compute this sparsifier in nearly-linear time Shanghua Teng Yan Liu Yu Cheng Dehua Cheng

37 SPARSIFIED SQUARING I – A_1 ≈_ε I – A_0^2, I – A_2 ≈_ε I – A_1^2, …, I – A_i ≈_ε I – A_{i-1}^2, I – A_d ≈ I Convergence: (approximately) same as repeated squaring: d = O(log(mixing time)) suffices ≈: spectral/condition number, implies cuts

38 [PS `14] CHAIN → SOLVER x = Solve(I, A_0, …, A_d, b) 1. For i = 1 to d: b_i ← (I + A_{i-1}) b_{i-1} 2. x_d ← b_d 3. For i = d-1 downto 0: x_i ← ½[b_i + (I + A_i) x_{i+1}] Runtime: O(m log^c n log^3(mixing time))
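A dense toy version of the two-pass solver, using an exact squaring chain (so no sparsification and no error besides truncating at level d); this is a sketch of the recurrence, not the paper's implementation. The forward pass multiplies by (I + A_{i-1}), which is what makes the scalar identity 1/(1-a) = ½(1 + (1+a)·1/(1-a^2)·(1+a)) telescope exactly.

```python
import numpy as np

def chain_solve(As, b):
    """Solve (I - As[0]) x = b given a chain As[0..d] with
    As[i] ~= As[i-1] @ As[i-1] and I - As[d] ~= I.

    Forward pass:  b_i = (I + A_{i-1}) b_{i-1}
    Backward pass: x_i = 0.5 * (b_i + (I + A_i) x_{i+1}), x_d = b_d.
    """
    d = len(As) - 1
    n = len(b)
    I = np.eye(n)
    bs = [b]
    for i in range(1, d + 1):              # forward pass
        bs.append((I + As[i - 1]) @ bs[i - 1])
    x = bs[d]                              # x_d = b_d since I - A_d ~= I
    for i in range(d - 1, -1, -1):         # backward pass
        x = 0.5 * (bs[i] + (I + As[i]) @ x)
    return x
```

With sparsified A_i of size O(m log^c n) each and d = O(log(mixing time)) levels, both passes run in nearly-linear time.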

39 SPARSE BLOCK CHOLESKY [KLPRS `16]: repeatedly eliminate some variables, sparsify intermediate matrices O(m log n) time, extends to connection Laplacians, which can be viewed as having complex weights Rasmus Kyng Sushant Sachdeva

40 EVEN MORE ADAPTIVE [KS `16]: per-entry pivoting, almost identical to incomplete LU / ichol Running time bound: O(m log^3 n), OPEN: improve this

41 OPEN QUESTIONS Nearly-linear time algorithms for: Wider classes of linear systems Directed maximum flow Intermediate questions: Squaring based flow algorithms / oblivious routing schemes. What can we preserve while sparsifying directed graphs?

42 THANK YOU!

