# Friedhelm Meyer auf der Heide

Algorithmic Aspects of Dynamic Intelligent Systems. Part 2: Introduction to online algorithms. Friedhelm Meyer auf der Heide

Lectures on Monday take place at am in W0.209

Online Algorithms
An online algorithm is one that can process its input piece by piece, without having the entire input available from the start. In contrast, an offline algorithm is given the whole problem data from the beginning and is required to output an answer which solves the problem at hand.
Input: a sequence of requests.
Task: process the requests as efficiently as possible.
Online: the i-th request has to be processed before future requests are known.
Offline: all requests are known in advance.

How to measure quality of online algorithms?
- Assume some a priori knowledge about the request sequence, e.g., "requests are chosen randomly".
- Assume a worst-case measure: compare the online cost to the offline cost.
Online: standard competitive analysis, competitive ratio
Online randomized:

Toy Example: Ski rental problem
I go skiing one year after the other, until I am no longer interested in skiing. I do not know in advance when I will lose interest. Buying skis costs D € (and I can use them forever), renting them costs 1 € per year.
Question: When should I buy skis?
Answer: Immediately, if I am interested for at least D years; never otherwise. This can only be done offline!

Toy Example: Ski rental problem
What to do online? I rent until I go skiing for the D-th year, and then I buy.
Worst case (I stay interested for at least D years): cost offline: D, cost online: at most 2D.
This strategy is 2-competitive.
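The comparison above can be checked with a small sketch. The function names `offline_cost` and `online_cost` are hypothetical, and the sketch assumes that "buying in the D-th year" means renting for D − 1 years and then paying D, which gives 2D − 1 and therefore stays below the 2D bound on the slide.

```python
def offline_cost(years: int, buy_price: int) -> int:
    """Optimal cost when the number of skiing years is known in advance."""
    # Either rent every year (1 per year) or buy immediately.
    return min(years, buy_price)


def online_cost(years: int, buy_price: int) -> int:
    """Cost of the break-even strategy: rent until year D, then buy.

    Assumption: the purchase happens at the start of year D, so the
    skier has rented for D - 1 years before paying D.
    """
    if years < buy_price:
        return years                      # interest ended before buying
    return (buy_price - 1) + buy_price    # D - 1 rentals plus the purchase


def worst_ratio(buy_price: int, horizon: int) -> float:
    """Largest online/offline ratio over all season counts up to horizon."""
    return max(
        online_cost(y, buy_price) / offline_cost(y, buy_price)
        for y in range(1, horizon + 1)
    )
```

For any D, `worst_ratio(D, 10 * D)` stays below 2, which matches the claim that the strategy is 2-competitive.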

Paging

Paging: A basic problem for an operating system
(Figure: main memory of size k = 6, and disk.)
Paging: given a main memory that can hold k pages.
Input: a sequence of pages to be used by the processor.
Goal: incur as few page faults (requests for pages that have to be moved from disk to main memory) as possible.
In case of a page fault, the algorithm has to store the requested page in main memory and has to choose a page to be removed from main memory.

Paging Optimal offline:

Two online strategies:
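LRU (least recently used): evict the page whose last use lies furthest in the past.
FIFO (first in, first out): evict the page that has been in main memory longest.
Theorem: LRU and FIFO are k-competitive.
Theorem: This competitive ratio is best possible for deterministic online algorithms.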
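A minimal sketch of the two strategies, counting page faults only. The function names are hypothetical, and the sketch assumes the usual textbook definitions of LRU and FIFO, since the transcript does not spell them out.

```python
from collections import OrderedDict, deque


def lru_faults(requests, k):
    """Count page faults of LRU with a main memory of k pages."""
    cache = OrderedDict()          # keys ordered from least to most recently used
    faults = 0
    for page in requests:
        if page in cache:
            cache.move_to_end(page)        # refresh recency on a hit
        else:
            faults += 1
            if len(cache) == k:
                cache.popitem(last=False)  # evict the least recently used page
            cache[page] = True
    return faults


def fifo_faults(requests, k):
    """Count page faults of FIFO with a main memory of k pages."""
    cache = set()
    arrival = deque()              # pages in order of arrival in memory
    faults = 0
    for page in requests:
        if page not in cache:
            faults += 1
            if len(cache) == k:
                cache.discard(arrival.popleft())  # evict the oldest page
            cache.add(page)
            arrival.append(page)
    return faults
```

A cyclic request sequence over k + 1 pages makes both strategies fault on every request, which is the kind of sequence behind the lower bound of k.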

Paging
In practice, both algorithms are much better: the observed competitive ratio decreases with increasing memory size k. Furthermore, LRU turns out to be better than FIFO.
Reasons: In practice, request sequences exhibit locality, i.e., they tend to use the same pages more often, and there are dependencies among pages ("If page A is accessed, then it is likely that page B will be accessed shortly afterwards").
Way out: Model restrictions on the "adversary", i.e., the bad guy that generates the worst-case sequences. This is done using access graphs.

Page Migration

Page Migration Model (1)
Page migration is a classical online problem: processors are connected by a network.
There are costs of communication associated with each edge.
Cost of communication between a pair of nodes = cost of the cheapest path between these nodes.
Costs of communication fulfill the triangle inequality.
(Figure: network with nodes v1, ..., v7.)

Page Migration Model (2)
Alternative view: processors in a metric space.
An indivisible memory page of size $D$ resides in the local memory of one processor (initially at a given node).
(Figure: network with nodes v1, ..., v7.)

Page Migration Model (3)
Input: a sequence of processors $\sigma_1, \sigma_2, \ldots$, dictated by a request adversary; $\sigma_t$ is the processor that wants to access (read or write) one unit of data from the memory page in step $t$.
After serving a request, an algorithm may move the page to a new processor.
(Figure: network with nodes v1, ..., v7.)

Page Migration (cost model)
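The page is at node $v_p$.
Serving a request issued at $v_r$ costs $d(v_p, v_r)$, the distance between the two nodes.
Moving the page to node $v_q$ costs $D \cdot d(v_p, v_q)$, where $D$ is the page size.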

A randomized algorithm
Memoryless coin-flipping algorithm CF [Westbrook 92]:
In each step, after serving a request issued at $v_r$, move the page to $v_r$ with probability $\frac{1}{2D}$, where $D$ is the page size.
Theorem: CF is 3-competitive against an adaptive-online adversary.
Remark: This ratio is optimal against an adaptive-online adversary (which may see the outcomes of the coin flips).
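A minimal simulation sketch of the coin-flipping rule. It assumes the standard form of the algorithm and cost model (serving costs the distance, moving costs the page size times the distance, move probability 1/(2D)). The function name, the distance-matrix representation `dist`, and the injectable `rng` parameter are illustrative choices, not taken from the slides.

```python
import random


def coin_flip_cost(dist, start, requests, page_size, rng=None):
    """Total cost of CF on a request sequence.

    dist      -- symmetric matrix, dist[u][v] = communication cost between u and v
    start     -- node that initially holds the page
    requests  -- sequence of requesting nodes
    page_size -- the page size D
    rng       -- any object with a random() method (assumed; defaults to a seeded RNG)
    """
    if rng is None:
        rng = random.Random(0)
    page = start
    cost = 0.0
    for r in requests:
        cost += dist[page][r]                    # serve the request remotely
        if rng.random() < 1.0 / (2 * page_size):
            cost += page_size * dist[page][r]    # migrate the page to the requester
            page = r
    return cost
```

Passing a stub object as `rng` makes the coin flips deterministic, which is convenient for checking single steps by hand.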

Proof of competitiveness of CF
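We run CF and OPT "in parallel" on the input sequence.
We define the potential function $\Phi = 3D \cdot d(P_{CF}, P_{OPT})$, where $P_{CF}$ and $P_{OPT}$ are the current page positions of CF and OPT and $D$ is the page size.
There are two events to consider in each step:
- Part 1: a request occurs at a node $v_r$, CF and OPT serve the request, and CF optionally moves the page.
- Part 2: OPT optionally moves the page.
For each part separately, we prove that $E[C_{CF}] + E[\Delta\Phi] \le 3 \cdot C_{OPT}$.

Proof of competitiveness of CF
Note: Summed over all steps, the potential changes form a telescoping sum. Thus the intermediate potentials cancel out and we get the competitive ratio 3.

Competitiveness of CF, a step
The pages of CF and OPT are in $P_{CF}$ and $P_{OPT}$, respectively.
Part 1: a request occurs at $v_r$; CF and OPT serve the request; CF optionally moves the page to $v_r$.
Part 2: OPT optionally moves the page to $P'_{OPT}$.

Competitiveness of CF – part 1
Request occurs at $v_r$.
Cost of serving the request: in CF: $a = d(P_{CF}, v_r)$, in OPT: $b = d(P_{OPT}, v_r)$.
Expected cost of moving the page: $\frac{1}{2D} \cdot D \cdot a = \frac{a}{2}$.
Potential before: $3D \cdot d(P_{CF}, P_{OPT}) \ge 3D \cdot (a - b)$ by the triangle inequality.
Exp. potential after: $\frac{1}{2D} \cdot 3D \cdot b + \left(1 - \frac{1}{2D}\right) \cdot 3D \cdot d(P_{CF}, P_{OPT})$.
Exp. change of the potential: $\frac{1}{2D} \left(3D \cdot b - 3D \cdot d(P_{CF}, P_{OPT})\right) \le \frac{3}{2} b - \frac{3}{2}(a - b) = 3b - \frac{3}{2} a$.
Hence $E[C_{CF}] + E[\Delta\Phi] \le \frac{3}{2} a + 3b - \frac{3}{2} a = 3b = 3 \cdot C_{OPT}$.

Competitiveness of CF – part 2
OPT moves the page from $P_{OPT}$ to $P'_{OPT}$ at cost $C_{OPT} = D \cdot d(P_{OPT}, P'_{OPT})$.
CF incurs no cost, and by the triangle inequality the potential grows by at most $3D \cdot d(P_{OPT}, P'_{OPT}) = 3 \cdot C_{OPT}$.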

Deterministic algorithm
Algorithm Move-To-Min (MTM) [Awerbuch, Bartal, Fiat 93]:
After each $D$ steps, choose $v_{\min}$ to be the node which minimizes $\sum_{i} d(v_{\min}, \sigma_i)$, where the sum runs over the last $D$ requests $\sigma_i$, and move the page to $v_{\min}$. ($v_{\min}$ is the best place for the page in the last $D$ steps.)
Theorem: MTM is 7-competitive.
Remark: The currently best deterministic algorithm achieves a competitive ratio of 4.086.
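The Move-To-Min rule can be sketched as follows, assuming the standard cost model (serving costs the distance, moving costs the page size times the distance). The function name and the distance-matrix representation are hypothetical. A trailing block of fewer than D requests is served without a move.

```python
def move_to_min_cost(dist, start, requests, page_size):
    """Total cost of Move-To-Min on a request sequence.

    dist      -- symmetric matrix, dist[u][v] = communication cost between u and v
    start     -- node that initially holds the page
    requests  -- list of requesting nodes
    page_size -- the page size D, which is also the phase length
    """
    page = start
    cost = 0
    nodes = range(len(dist))
    for i in range(0, len(requests), page_size):
        block = requests[i:i + page_size]
        # Serve all requests of the phase from the current page position.
        cost += sum(dist[page][r] for r in block)
        if len(block) == page_size:
            # Best single position for the requests of the last D steps.
            target = min(nodes, key=lambda v: sum(dist[v][r] for r in block))
            cost += page_size * dist[page][target]
            page = target
    return cost
```

On a line of three nodes with D = 2, two requests at the far end cost 4 to serve and 4 to migrate the page, after which further requests there are free.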

Results on page migration
The best known bounds (algorithm / lower bound):
- Deterministic: algorithm [Bartal, Charikar, Indyk '96], lower bound [Chrobak, Larmore, Reingold, Westbrook '94]
- Randomized, oblivious adversary: [Westbrook '91]
- Randomized, adaptive-online adversary

Data management in networks

Scenario (Exploit Locality)
Networks have low bandwidth, global objects are small, access is fine-grained.
Typical for parallel processor networks, partially also for the internet.
Bottleneck: link congestion.
Task: distribute global objects (maybe dynamically) among processors such that
- an application (sequence of read/write accesses to global variables) can be executed using small link congestion,
- storage overhead is small.

Basic Strategy
- Design a strategy for trees.
- Produce a strategy for the target network by tree embedding.

Dynamic Model
Application: sequence of read/write requests from processors to objects.
Each processor decides solely based on its local knowledge. ⇒ distributed online strategy
Goal: develop a strategy that produces only a factor c more congestion than an optimal offline strategy. ⇒ c-competitive strategy
(and a factor m more storage per processor ⇒ (m, c)-competitive strategy)

Dynamic strategy for trees
v writes to x: v creates (or updates) a copy of x in v and invalidates all other copies (consistency!).
v reads x: v reads the closest copy of x and creates copies in every processor on the path back to v.
(Remark: data tracking in trees is easy!)
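The read and write rules can be sketched for a single object on a tree given as an adjacency dictionary. The class `TreeObject`, the helper `tree_path`, and the accounting (one unit of load per link on a read path, and one unit per link on the paths from the writer to the copies it invalidates) are assumptions made for illustration. The slides do not fix the exact message accounting.

```python
from collections import Counter, deque


def tree_path(tree, src, dst):
    """Unique path from src to dst in a tree (adjacency dict), found by BFS."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            break
        for nxt in tree[node]:
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    nodes = [dst]
    while parent[nodes[-1]] is not None:
        nodes.append(parent[nodes[-1]])
    return nodes[::-1]


class TreeObject:
    """One shared object x with its copies and the per-link load it caused."""

    def __init__(self, tree, home):
        self.tree = tree
        self.copies = {home}       # nodes currently holding a copy of x
        self.load = Counter()      # link (frozenset of endpoints) -> usage count

    def read(self, v):
        # Read the closest copy and leave copies on the whole path back to v.
        best = min((tree_path(self.tree, c, v) for c in self.copies), key=len)
        for a, b in zip(best, best[1:]):
            self.load[frozenset((a, b))] += 1
        self.copies.update(best)

    def write(self, v):
        # Invalidate all other copies: every link between v and a copy is used once.
        links = set()
        for c in self.copies:
            p = tree_path(self.tree, v, c)
            links.update(frozenset(e) for e in zip(p, p[1:]))
        for link in links:
            self.load[link] += 1
        self.copies = {v}
```

Running a phase write, read, ..., read, write on a small tree and inspecting `load` shows how often each link of the subtree spanned by the involved nodes is used under this accounting.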

Example and Analysis
Consider the phase write(v0), read(v1), read(v2), ..., read(vk-1), write(vk).
(Figure: tree with nodes v0, v1, v2, v3, ..., vk; the subtree spanned by these nodes is marked red.)
⇒ Each strategy has to use each link of the red subtree at least once. Our strategy uses each of these links at most three times.
⇒ The strategy is 3-competitive for trees.

Idea: Simulate a suitable tree in the target network M.
Tree embedding goals:
- small dilation (in order to reduce overall load)
- randomized embedding (in order to reduce congestion)
Goals contradict?!?

Tree embedding Randomized, locality preserving embedding!
Example: n×n mesh
(Figure: mesh decomposition with levels 1, 2, 3, a submesh M′ and a tree node v.)
Leaves: nodes of the mesh.
Link capacity: number of links leaving the submesh.

Result for meshes
The static and dynamic strategies are O(log n)-competitive in n×n meshes, w.h.p.
Finding an optimal static placement for several variables is NP-hard already on 3×3 meshes.

Some Results competitive ratio w.h.p. d-dimensional meshes O(d log n)
Fat Trees O(log n) Hypercubes SE Networks De Bruijn Networks Direct Butterflies Indirect Butterflies Arbitrary Networks O(1) polylog n (Räcke 2002)

Conclusions
- Provably efficient protocols for data management in networks
- Different models for different application scenarios
- Experimental evaluation (Presto, DIVA)
Some open problems:
- startup times
- combination with load balancing
- randomized, locality-preserving embedding and routing in dynamic networks

Robot navigation Theorem: This competitive ratio can be achieved.

The playground is a line, with no obstacles. The robot has to find the treasure, but does not know in advance where it is. It only finds the treasure when it touches it.
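The transcript states neither the strategy nor the ratio. As an illustration only, the sketch below assumes the classical doubling strategy for searching on a line: walk 1 to the right, return, walk 2 to the left, return, walk 4 to the right, and so on. It also assumes the treasure lies at distance at least 1 from the start; under that assumption the walked distance is at most 9 times the distance to the treasure. The function name is hypothetical.

```python
def doubling_search_cost(treasure: float) -> float:
    """Distance walked by the doubling strategy until the treasure is touched.

    The robot starts at position 0; `treasure` is the signed position of
    the treasure on the line (positive = right, negative = left).
    """
    if treasure == 0:
        return 0.0
    walked = 0.0
    reach = 1.0        # how far the current excursion goes
    direction = 1      # +1 = right, -1 = left
    while True:
        # The treasure is found if it lies on the current side within reach.
        if treasure * direction > 0 and abs(treasure) <= reach:
            return walked + abs(treasure)
        walked += 2 * reach    # walk out and come back to the start
        reach *= 2
        direction = -direction
```

Comparing `doubling_search_cost(x)` with `abs(x)` for treasures just beyond a turning point, such as 4.001, shows the ratio approaching 9 from below.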