Greedy Algorithms / Caching Problem Yin Tat Lee


CSE 421: Greedy Algorithms / Caching Problem (Yin Tat Lee)

Optimal Caching/Paging

Memory systems:
- Many levels of storage with different access times.
- Smaller storage has shorter access time.
- To access an item, it must be brought to the lowest level of the memory system.

Typical numbers (the speaker's home computer):

Type          Bandwidth     Latency      Capacity
Registers     3 TB/sec      0.25 ns      36 KB
L1 cache      1 TB/sec      1 ns         192 KB
L2 cache      0.7 TB/sec    3 ns         1.5 MB
L3 cache      0.3 TB/sec    14 ns        15 MB
DRAM (DDR4)   26 GB/sec     66 ns        32 GB
SSD           300 MB/sec    0.15 ms (?)  480 GB
Internet      8 MB/sec      7 ms         2 GB left in Dropbox

Speaker note on the register figure: there are maybe 150 integer and 150 floating-point "physical" registers per core, about 6 KB; 6 cores together give 36 KB. The exact number does not seem to be public information.

Optimal Caching/Paging

Memory systems:
- Many levels of storage with different access times.
- Smaller storage has shorter access time.
- To access an item, it must be brought to the lowest level of the memory system.

Consider the problem between two levels:
- Main memory with n data items.
- A cache that can hold k < n items.
- Assume no restrictions on where items can be placed.
- Suppose the cache is full initially: it holds k data items to start with.

Optimal Offline Caching

- Cache with capacity to store k items.
- Sequence of m item requests d_1, d_2, ..., d_m.
- Cache hit: the item is already in the cache when requested.
- Cache miss: the item is not in the cache when requested; the requested item must be brought into the cache, evicting some existing item if the cache is full.

Goal: an eviction schedule that minimizes the number of cache misses (equivalently, evictions).

Example: k = 2, initial cache = {a, b}, requests: a, b, c, b, c, a, a, b.
Optimal eviction schedule: 2 cache misses
(on the request for c, evict a; on the later request for a, evict c).

Why is 2 optimal?
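To make the bookkeeping concrete, here is a minimal Python sketch (my illustration, not code from the slides) that replays a given eviction schedule against a request sequence and counts misses; the function name and the particular schedule are only for this example.

```python
def count_misses(requests, initial, evict_choice):
    """Replay `requests` against a cache initialized to `initial`.
    `evict_choice[i]` is the item to evict on the i-th cache miss."""
    cache = set(initial)
    misses = 0
    for r in requests:
        if r in cache:
            continue                      # cache hit: nothing to do
        victim = evict_choice[misses]     # cache miss: evict the scheduled victim
        cache.discard(victim)
        cache.add(r)
        misses += 1
    return misses

# The schedule described above: evict a on the request for c, then c on the
# later request for a.  It incurs exactly 2 misses.
print(count_misses(list("abcbcaab"), initial="ab", evict_choice=["a", "c"]))  # -> 2
```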

Optimal Offline Caching: Farthest-in-Future

Which item should we evict?

Farthest-in-Future (FIF): evict the item in the cache that is not requested until farthest in the future.

Theorem [Belady, 1960s]: FIF is an optimal eviction schedule.

Exchange argument: we can swap choices to convert any other schedule into Farthest-in-Future without losing quality.

Example:
Current cache: a, b, c, d, e, f
Future queries: g a b c e d a b b a c d e a f a d e f g h ...
The request for g is a cache miss; FIF evicts f, since f's next request is farthest in the future.
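A minimal sketch of the Farthest-in-Future rule (my own Python illustration, not code from the slides): on each miss, scan the remaining requests and evict the cached item whose next use is latest, treating items that are never used again as farthest of all.

```python
def farthest_in_future(requests, k, initial):
    """Belady's offline rule: on a miss with a full cache, evict the cached
    item whose next request is farthest in the future (or never occurs)."""
    cache = set(initial)
    misses = 0
    for i, r in enumerate(requests):
        if r in cache:
            continue
        misses += 1
        if len(cache) >= k:
            def next_use(x):
                # Index of x's next request after position i, or infinity.
                for j in range(i + 1, len(requests)):
                    if requests[j] == x:
                        return j
                return float("inf")
            cache.remove(max(cache, key=next_use))
        cache.add(r)
    return misses

# On the example from the previous slide, FIF incurs 2 misses:
print(farthest_in_future(list("abcbcaab"), k=2, initial="ab"))  # -> 2
```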

Why Exchange Argument?

Greedy algorithms cannot handle problems with many local minima. The exchange argument essentially proves there is no bad local minimum: any other solution can be transformed, step by step, into the greedy one without ever becoming worse.

Warm-up (n = k+1)

Farthest-in-Future: evict the item in the cache that is not requested until farthest in the future.

Current cache: a, b, c, d, e, f
Future queries: g a b c e d a b b a c d e a f a d e f g h ...
(the request for g is a miss; FIF evicts f)

When n = k+1, the stretch of requests between this cache miss and the next request of the evicted (farthest-in-future) item, "g a b c e d a b b a c d e a f", contains all n items. Hence any algorithm must miss at least once in this stretch, while FIF misses exactly once, so charging interval by interval shows FIF is optimal.

Question: what goes wrong with this proof when n > k+1?
Answer: in the next interval FIF itself may miss more than once (consider the input a b c d e f g h i j k ...), so we cannot easily argue that each interval is charged correctly.

Reduced Eviction Schedules

Definition: a reduced schedule is a schedule that only brings an item into the cache at a step in which that item is requested.

Intuition: any unreduced schedule can be transformed into a reduced one with no more cache misses.

(Figure on the slide: an unreduced schedule and a reduced schedule shown side by side as the cache contents after each request.)

Reduced Eviction Schedules

Claim: given any unreduced schedule S, we can transform it into a reduced schedule S' with no more cache misses.

Proof (by induction on the number of unrequested insertions):
Suppose S brings d into the cache at time t without a request, and let c be the item S evicts when it brings d in.

- Case 1: d is evicted at some time t' before the next request for d.
  Then S' does nothing at time t (keeping c in the cache) and at time t' evicts c instead of d; afterwards the two caches agree, and S' has no more misses.
- Case 2: d is requested at some time t' before d is evicted.
  Then S' does nothing at time t and brings d in at time t', when it is actually requested, evicting c; again the caches agree afterwards and S' has no more misses. ∎

(Figure on the slide: timelines of S and S' in both cases.)

Farthest-in-Future: Analysis

Theorem: FIF is an optimal eviction algorithm.

Invariant: there exists an optimal reduced schedule S that makes the same eviction decisions as S_FIF through the first j+1 requests.

Proof (by induction on the number of requests j):
Let S be a reduced schedule that satisfies the invariant through j requests. We produce S' that satisfies the invariant after j+1 requests.

Consider the (j+1)-st request d = d_{j+1}. Since S and S_FIF have agreed up until now, they have the same cache contents before request j+1.

- Case 1: d is already in the cache. S' = S satisfies the invariant.
  (Here we use that S is reduced: it does not change the cache on a hit.)
- Case 2: d is not in the cache, and S and S_FIF evict the same element. S' = S satisfies the invariant.

Farthest-in-Future: Analysis

Proof (continued):

- Case 3: d is not in the cache; S_FIF evicts e, while S evicts f != e.
  - Begin constructing S' from S by evicting e instead of f.
  - Now S' agrees with S_FIF on the first j+1 requests; we show that having element f in the cache (as S' does) is no worse than having element e (as S does).
  - Continue building S' to be the same as S until they are forced to take different actions.

(Figure on the slide: after request j+1 the caches of S and S' agree except that S contains e where S' contains f.)

Farthest-in-Future: Analysis

Proof (continued):
Let j' be the first time after j+1 at which S and S' must take different actions, and let g be the item requested at time j'.

- Case 3a: g = e. This can't happen: e was evicted by Farthest-in-Future, so there must be a request for f before any request for e.
- Case 3b: g = f. Element f is not in the cache of S, so let e' be the element that S evicts.
  - If e' = e, then S' serves f from its cache; now S and S' have the same cache.
  - If e' != e, then S' evicts e' and brings e into the cache; now S and S' have the same cache.
    (Note: S' is no longer reduced, but it can be transformed into a reduced schedule that agrees with S_FIF through step j+1.)

Farthest-in-Future: Analysis

Proof (continued):

- Case 3c: g != e and g != f. The action of S at time j' must involve e or f (or both), since otherwise S' could take the same action; so S must evict e. Make S' evict f; now S and S' have the same cache. ∎

In each case we can now extend S' using the rest of S at no extra cost. So S' is optimal, reduced, and agrees with S_FIF for the first j+1 steps. The optimality of S_FIF follows by induction.

Online Caching

Online vs. offline algorithms:
- Offline: the full sequence of requests is known a priori.
- Online (reality): requests are not known in advance.

Caching is among the most fundamental online problems in CS. Two natural online policies:
- LIFO: evict the page brought in most recently.
- LRU: evict the page whose most recent access was earliest (FIF with the direction of time reversed!).

Theorem: FIF is the optimal offline eviction algorithm. It provides the basis for understanding and analyzing online algorithms:
- LRU is k-competitive. [Section 13.8]
- LIFO is arbitrarily bad.
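For concreteness, a short Python sketch of the online LRU policy (my illustration, not code from the slides), using an OrderedDict to track recency; the Section 13.8 referenced above shows this policy is k-competitive.

```python
from collections import OrderedDict

def lru_misses(requests, k):
    """Online LRU: on a miss with a full cache, evict the item whose most
    recent access is the oldest."""
    cache = OrderedDict()              # ordered from least- to most-recently used
    misses = 0
    for r in requests:
        if r in cache:
            cache.move_to_end(r)       # a hit refreshes recency
        else:
            misses += 1
            if len(cache) >= k:
                cache.popitem(last=False)   # evict the least-recently-used item
            cache[r] = True
    return misses
```

Unlike farthest_in_future above, this policy never looks at future requests, which is exactly why the offline optimum is used as its benchmark.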

k-Server Problem

- There are k fire trucks. When a fire happens, we need to move a truck there.
- Let ALG be the total movement of all fire trucks under our online algorithm.
- Let OPT be the total movement of the optimal plan that knows in advance where the fires will happen.
- The competitive ratio is ALG / OPT.

Greedy does not work

On the slide's example, the greedy strategy has ALG = +∞ while OPT = O(1), so its competitive ratio ALG/OPT = +∞.

For a long time the best known competitive ratio was O(k). It was conjectured that one can achieve log^{O(1)} k; James R. Lee (pictured on the slide) did it.
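To see the failure concretely, here is a small simulation (my illustration; the instance is a standard bad example and not necessarily the one drawn on the slide), where "greedy" is taken to mean: always move the truck closest to the fire.

```python
def greedy_total_movement(servers, requests):
    """Greedy k-server on the real line: serve each request with the nearest
    truck and return the total distance moved."""
    servers = list(servers)
    cost = 0.0
    for r in requests:
        i = min(range(len(servers)), key=lambda j: abs(servers[j] - r))
        cost += abs(servers[i] - r)
        servers[i] = r
    return cost

# Two trucks at positions 0 and 100; fires alternate between two nearby points.
requests = [1, -1] * 50
print(greedy_total_movement([0, 100], requests))   # 199.0, and it keeps growing with more requests

# A fixed offline plan pays O(1): move the truck at 0 to -1 and the truck at 100
# to 1 once (total cost 100); every later fire already has a truck on it.
```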