Facility Location in Dynamic Geometric Data Streams
Christiane Lammersen, Christian Sohler


Dynamic Geometric Data Streams
Streams of geometric data arise in
- Mobile networks
- Sensor networks
- …
Continuously changing data
- Mobile networks: position of nodes
- Sensor networks: measured data
Communication in form of update operations
- Update consists of ID of node, old value, new value
IITK Workshop on Algorithms for Processing Massive Data Sets, Christiane Lammersen

Hierarchical Communication Systems
- The upper layer offers the lower layer a certain service
- Each node can be a server
- Cost for a server ↔ access time


Dynamic Geometric Data Streams
- m insert and delete operations
- Points in low-dimensional, discrete space {1, …, Δ}^d
- polylog(Δ, m) memory space, one pass
[Indyk '04]

Dynamic Uniform FLP
- Point set P
- Facilities have uniform opening cost f
- Clients have uniform demand b
- Goal: maintain F ⊆ P so as to minimize the total cost [formula not preserved]
FLP is related to k-Median, but |F| can be linear in |P|
⇒ problem in streaming
⇒ approximation of the cost
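
The objective on this slide did not survive in the transcript. As a rough illustration only, the sketch below evaluates the standard uniform facility location objective: opening cost f per facility plus demand-weighted connection cost. The function name `flp_cost`, the Euclidean distance, and the sample points are assumptions made for the example, not taken from the talk.

```python
import math

def flp_cost(points, facilities, f, b=1.0):
    """Assumed objective: f * |F| + b * sum of distances to the nearest facility."""
    if not facilities:
        raise ValueError("at least one facility is required")
    opening = f * len(facilities)
    # Each client connects to its closest open facility (Euclidean distance assumed).
    connection = sum(b * min(math.dist(p, q) for q in facilities) for p in points)
    return opening + connection

# Hypothetical instance: two nearby clients and one far away, opening cost f = 8.
points = [(1, 1), (1, 2), (8, 8)]
cost_one = flp_cost(points, [(1, 1)], f=8)           # one facility
cost_two = flp_cost(points, [(1, 1), (8, 8)], f=8)   # two facilities
```

With these numbers, opening a second facility at the distant point pays off, because the saved connection cost exceeds f.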

Related Work
P. Indyk: Algorithms for Dynamic Geometric Problems over Data Streams, STOC '04
- O(log² Δ)-approximation for the cost of FLP
- Idea: nested square grids, open a facility in all heavy cells
G. Frahling and C. Sohler: Coresets in Dynamic Geometric Data Streams, STOC '05
- Space partition based on heavy cells

Construction of Our Streaming Method
- Deterministic method: E_det(P) = Θ(OPT(P))
- Randomized method: E_rand(P) = Θ(E_det(P))
- Streaming method: E_stream(P) = Θ(E_rand(P))

Deterministic Method
- Impose log(Δ)+1 nested square grids
- In each grid, identify the heavy cells
- Partition the input space based on the heavy cells
- For each cell size, count the number of points within cells of that size
⇒ estimator for cost [formula not preserved]
[Indyk '04, Frahling and Sohler '05]

Deterministic Method, idea: Open one facility in each heavy cell in the space partition.


Nested Grids
- Impose log(Δ)+1 nested square grids
- Figure: example with Δ = 16, shown at level 4
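
To make the nested grids concrete, here is a minimal sketch that maps a point to its cell in every grid. It assumes that Δ is a power of two and that a cell in level i has side length 2^i, so level 0 holds single points and the top level is one cell covering the whole space. The names `num_levels` and `cell_of` and the sample point are hypothetical.

```python
def num_levels(delta):
    """Number of nested grids, log2(delta) + 1, for delta a power of two."""
    if delta < 1 or delta & (delta - 1):
        raise ValueError("delta must be a power of two")
    return delta.bit_length()

def cell_of(point, level):
    """Index of the cell of side length 2**level that contains `point`.

    Coordinates are taken from {1, ..., delta}, hence the shift by one.
    """
    return tuple((x - 1) >> level for x in point)

delta = 16
levels = num_levels(delta)                              # levels 0..4
path = [cell_of((6, 11), i) for i in range(levels)]     # finest to coarsest cell
```

Because the grids are nested, the cell in level i+1 is obtained from the cell in level i by halving each index, which is why a single bit shift suffices.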


Space Partition
- In each grid, identify the heavy cells
- Partition the input space based on the heavy cells
- A cell in level i is heavy if it contains at least f / 2^i points
- Figure: example with f = 8 and Δ = 16, shown at level 4
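
The heaviness test can be sketched offline as follows. This is not the streaming algorithm: it stores all points and only illustrates the threshold f / 2^i, again assuming that cells in level i have side length 2^i. The function names and the sample point set are hypothetical, and the construction of the space partition from the heavy cells is not shown because its exact rule is not spelled out in the transcript.

```python
from collections import Counter

def cell_of(point, level):
    """Cell of side length 2**level containing `point` (coordinates start at 1)."""
    return tuple((x - 1) >> level for x in point)

def heavy_cells(points, delta, f):
    """Map each level i to the set of cells holding at least f / 2**i points."""
    top = delta.bit_length() - 1            # log2(delta), delta a power of two
    heavy = {}
    for level in range(top + 1):
        counts = Counter(cell_of(p, level) for p in points)
        # n >= f / 2**level, written without division to stay in integers
        heavy[level] = {c for c, n in counts.items() if n * 2 ** level >= f}
    return heavy

# Hypothetical input: a cluster of four points plus two isolated points.
points = [(1, 1), (1, 2), (2, 1), (2, 2), (9, 9), (16, 16)]
heavy = heavy_cells(points, delta=16, f=8)
```

In this example no level-0 cell is heavy, while the cell containing the four clustered points is heavy from level 1 upward.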


Cost Estimator
- For each cell size, count the number of points within cells of that size
⇒ estimator for cost [formula not preserved]
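
The estimator formula is missing from the transcript. A later slide of the talk gives the estimated cost of a cell C in level i as n(C) · 2^i, so one plausible reading is a sum of (points in cells of level i) · 2^i over all levels. The sketch below computes exactly that sum; the function name, the input format, and the example counts are assumptions, and any constant factors of the original formula are not reproduced.

```python
def cost_estimate(points_per_level):
    """Assumed estimator: sum over levels i of n_i * 2**i.

    `points_per_level` maps a level i to n_i, the number of points that lie in
    cells of that level in the space partition.
    """
    return sum(n * 2 ** level for level, n in points_per_level.items())

# Hypothetical counts: 4 points in level-0 cells and 2 points in level-2 cells.
estimate = cost_estimate({0: 4, 2: 2})
```

Only one counter per cell size is needed, which is what makes the estimator attractive for a small-space algorithm.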

Cost Estimator, figure example: 9 points

Cost Estimator, figure example: 7 points

Value of Cost Estimator is Ω(OPT(P))
- Contribution of a heavy cell C in level i is at most [bound not preserved]
- Contribution of a light cell C in level i is at most [bound not preserved]
- A heavy cell in level i contains Ω(f / 2^i) points.
- The space partition is balanced.
- The distance of a cell in level i to a heavy cell is O(2^i).

Value of Cost Estimator is O(OPT(P))
- Contribution of a distant cell C in level i is at least n(C) · 2^(i-1)
- OPT(P) ≥ f · |F_OPT|
- Estimated cost for a near cell C in level i is n(C) · 2^i = O(f)
- There is a constant number of near cells
- Estimated cost for near cells is O(f · |F_OPT|)
- Figure: level i, radius 2^(i-1)

Randomized Method
Idea:
- A heavy cell in level i contains at least f / 2^i points
- Sample a point in level i with probability 2^i / f
Problem: coin flips & delete operations
Solution:
- Hash function h_i : {1, …, Δ}^d → {1, …, f / 2^i} (upper end rounded to an integer)
- Sample set S_i = { p ∈ P | h_i(p) = 1 }
- Figure: h_i maps points to buckets 1, 2, 3, 4, …
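
The point of replacing coin flips by a hash function is that a deletion sees the same decision as the earlier insertion of that point. A minimal sketch of this idea follows. SHA-256 is used only as a deterministic stand-in for h_i (the talk does not say which hash family is used), rounding f / 2^i up is an assumption, and the class stores the sample explicitly, which a small-space algorithm would not do. All names are hypothetical.

```python
import hashlib
import math

def make_hash(level, buckets, seed=0):
    """Deterministic stand-in for h_i: maps a point to {1, ..., buckets}."""
    def h(point):
        data = repr((seed, level, tuple(point))).encode()
        digest = hashlib.sha256(data).digest()
        return int.from_bytes(digest[:8], "big") % buckets + 1
    return h

class LevelSample:
    """Maintains S_i = {p in P | h_i(p) = 1} under insertions and deletions."""

    def __init__(self, level, f):
        # Assumed rounding: ceil(f / 2**level), so the range is never empty.
        buckets = max(1, math.ceil(f / 2 ** level))
        self.h = make_hash(level, buckets)
        self.sample = set()

    def insert(self, point):
        if self.h(point) == 1:          # same answer every time for this point
            self.sample.add(point)

    def delete(self, point):
        self.sample.discard(point)      # no-op if the point was never sampled
```

For example, `LevelSample(level=0, f=8)` keeps roughly one point in eight, and inserting and then deleting the same set of points leaves the sample empty regardless of the order of operations.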

Randomized Method
for each level i do
    F(i) ← set of all marked cells C in level i such that
        a) no subcell of C is marked
        b) no smaller cell within a distance of less than 2^(i-1) is marked
return [expression not preserved]
E_rand(P) = Θ(E_det(P))

Streaming Method
Idea: Reduction to counting distinct elements
Implementation:
- For each level i, count distinct elements in
  DE_1(i) = {C | C is in level i and marked} ∪ {C | C is in level i and a) or b) fails}
  and
  DE_2(i) = {C | C is in level i and a) or b) fails}
- Output the difference as the cost for level i
Figure: DE_1(i), DE_2(i), DE_1(i+1), DE_2(i+1)
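
The reduction works because the number of distinct elements in DE_1(i) minus the number in DE_2(i) equals the number of marked cells for which neither condition fails. The sketch below checks this identity with exact Python sets; a streaming implementation would presumably replace the sets with approximate distinct-element counters, which is not shown here. The function name and the example cells are hypothetical.

```python
def level_cost_count(marked, failing):
    """Return |DE_1| - |DE_2| with DE_1 = marked ∪ failing and DE_2 = failing."""
    de1 = set(marked) | set(failing)
    de2 = set(failing)
    return len(de1) - len(de2)

# Hypothetical cells of one level: three are marked, two violate a) or b).
marked = {(0, 0), (0, 1), (2, 3)}
failing = {(0, 1), (5, 5)}
count = level_cost_count(marked, failing)
```

Here the result counts the two marked cells that satisfy both conditions, namely (0, 0) and (2, 3).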

Conclusion & Future Work
Streaming algorithm for dynamic FLP:
- Constant-factor approximation of the cost
- Update time: O(log(1/δ) · polylog(Δ))
- Space: O(log(1/δ) · polylog(Δ))
- Failure probability: δ
Future work:
- Approximation factor not exponential in d
- (1+ε)-approximation algorithm

Thank you for your attention! Department of Computer Science Technische Universität Dortmund Otto-Hahn-Str. 14 44221 Dortmund, Germany Phone: +49 231 755-4762 Fax.: +49 231 755-2047 Email: christiane.lammersen@tu-dortmund.de http://ls2-www.cs.uni-dortmund.de/~lammersen/
