Edge computing (1) Content Distribution Networks


1 Edge computing (1) Content Distribution Networks
Chen Qian Department of Computer Science and Engineering

2 Algorithmic Nuggets in Content Delivery
Bruce M. Maggs Ramesh K. Sitaraman

3 Overview
Background
Representative research
Conclusion

4 Web caches (proxy server)
Goal: satisfy the client request without involving the origin server.
The user configures the browser to send Web accesses via the cache.
The browser sends all HTTP requests to the cache:
  if the object is in the cache, the cache returns the object;
  otherwise the cache requests the object from the origin server, then returns the object to the client.
[Figure: clients exchange HTTP requests and responses with the proxy server, which exchanges HTTP requests and responses with the origin servers.]
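The request flow on this slide can be sketched in a few lines. This is a minimal illustration only: `ProxyCache` and `fetch_from_origin` are hypothetical names, and the sketch ignores expiry, validation, and HTTP headers that a real proxy cache has to handle.

```python
from typing import Callable, Dict


class ProxyCache:
    """Toy web cache: answer from the local store on a hit, else ask the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]):
        # fetch_from_origin is a stand-in (assumption) for the HTTP request
        # that the cache sends to the origin server on a miss.
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            # Object in cache: return it without involving the origin server.
            self.hits += 1
            return self._store[url]
        # Object not in cache: act as a client to the origin server,
        # keep a copy, then return the object to the requesting client.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body
```

The hit and miss counters make the trade-off visible: if every request is for a different URL, `hits` stays at zero and each request pays for the extra hop through the cache.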

5 More about Web caching
The cache acts as both client and server:
  server for the original requesting client
  client to the origin server
Typically the cache is installed by an ISP (university, company, residential ISP).
Why Web caching?
  reduce response time for client requests
  reduce traffic on an institution's access link
When is a cache not helpful?
  When every client of the ISP requests different content, every request misses and the visit to the cache server only wastes time.

6 Background
Content delivery network (CDN)
A geographically distributed network of proxy servers and their data centers.
Distributes service spatially relative to end users, e.g. service for DNS queries.

7 Background
Top three objectives of a CDN
  High reliability
  Fast and consistent performance
  Low operating cost
Representative research on CDN
  Global load balancing
  Load balancing within a single cluster of servers
  Bloom filters for CDN
  Overlay routing
  Leader election and consensus

8 Overview
Background
Representative research
Conclusion

9 Representative research on CDN
Global load balancing
Load balancing within a single cluster of servers
Bloom filters for CDN
Leader election and consensus

10 Global Load Balancing
Purpose: map clients to the server clusters of the CDN.
Cluster assignments are made at the granularity of map units: <IP address prefix, traffic class>, e.g. < /24, video>.
Question: how to assign each map unit $M_i$, $1 \le i \le M$, to a server cluster $C_j$, $1 \le j \le N$?

11 Global Load Balancing
Preferences
  Each map unit has preferences for clusters; a higher preference indicates better predicted performance.
  Each server cluster has preferences regarding which map units it would like to serve.
Constraints
  Each map unit is associated with a demand $d$.
  Each cluster has a notion of capacity $c$.
Goal: satisfy preferences and meet capacity constraints.

12 Global Load Balancing
Stable allocations
Blocking pair: $M_i$ prefers $C_j$ over its current partner $C_{j'}$, and $C_j$ prefers $M_i$ over its current partner $M_{i'}$.
Stable: there are no blocking pairs in the allocation.
Example preference lists:
  $M_1$: ($C_2$, $C_1$, $C_3$)    $C_1$: ($M_2$, $M_1$, $M_3$)
  $M_2$: ($C_2$, $C_1$, $C_3$)    $C_2$: ($M_1$, $M_2$, $M_3$)
  $M_3$: ($C_2$, $C_3$, $C_1$)    $C_3$: ($M_1$, $M_3$, $M_2$)

13 Gale-Shapley algorithm
The Gale-Shapley (propose-and-reject) algorithm is a distributed algorithm for finding a stable allocation.
The stable allocation problem builds on a classical problem: the stable marriage problem.
The result is optimal for the side that proposes: man-optimal in stable marriage, map-unit-optimal here.
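As a rough illustration of propose-and-reject, the sketch below runs the textbook one-to-one version on the preference lists from the stable allocations slide. It assumes that each list is ordered from most to least preferred, that every map unit has unit demand, and that every cluster has unit capacity; the slides note that the production setting (partial lists, resource trees) is more involved. The function names are illustrative, not taken from the paper.

```python
def gale_shapley(proposer_prefs, acceptor_prefs):
    """Return {proposer: acceptor} for a stable one-to-one matching."""
    # rank[a][p] = position of proposer p in acceptor a's list (lower is better).
    rank = {a: {p: r for r, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next list index to propose to
    match = {}                                    # acceptor -> proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        if next_choice[p] >= len(proposer_prefs[p]):
            continue  # p has exhausted its list and stays unassigned
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = match.get(a)
        if current is None:
            match[a] = p                 # a was free: accept
        elif rank[a][p] < rank[a][current]:
            match[a] = p                 # a prefers p: reject current partner
            free.append(current)
        else:
            free.append(p)               # a rejects p
    return {p: a for a, p in match.items()}


def blocking_pairs(assignment, proposer_prefs, acceptor_prefs):
    """List (proposer, acceptor) pairs that prefer each other to their partners."""
    partner_of = {a: p for p, a in assignment.items()}
    pairs = []
    for p, prefs in proposer_prefs.items():
        for a in prefs:
            if a == assignment.get(p):
                break  # everything after this is less preferred by p
            current = partner_of.get(a)
            if current is None or \
                    acceptor_prefs[a].index(p) < acceptor_prefs[a].index(current):
                pairs.append((p, a))
    return pairs


map_units = {"M1": ["C2", "C1", "C3"],
             "M2": ["C2", "C1", "C3"],
             "M3": ["C2", "C3", "C1"]}
clusters = {"C1": ["M2", "M1", "M3"],
            "C2": ["M1", "M2", "M3"],
            "C3": ["M1", "M3", "M2"]}

result = gale_shapley(map_units, clusters)
print(result)  # M1 -> C2, M2 -> C1, M3 -> C3
```

Because the map units propose, each one ends up with the best cluster it can get in any stable assignment; `blocking_pairs(result, map_units, clusters)` returns an empty list for this output.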

14 Some Limitations
Unequal number of map units and clusters
  More map units than clusters
Partial preference lists
  Tens of millions of map units vs. thousands of clusters
  Rank for each map unit the top dozen clusters that are likely to provide the best performance
Modeling integral demands and capacities
  A server cluster cannot be accurately modeled as a single resource with a single-number capacity
Reference: A Survey of the Stable Marriage Problem and Its Variants

15 Resource Trees
Bps: models the rate at which data can be sent out of the cluster.
Fps: models the capacity of non-network server resources such as the processor, memory and disk.
1. 20 units of demand from a video map unit, each unit requiring 0.25 Fps and 1 Bps (5 Fps & 20 Bps in total).
2. 26 units of demand from an application map unit, each unit requiring 1 Fps and 0.25 Bps (26 Fps & 6.5 Bps in total).
[Figure: resource tree. Root A: 50 Bps. Node B (labels: 25 Fps, 30 Fps), marked "Violation". Leaves C, D, E for Video, Apps, Web, labeled 30 Fps, 40 Fps, 30 Fps. Demand annotations: 5 Fps & 20 Bps; 26 Fps & 6.5 Bps; 25 Fps; 4 Fps.]

16 Resource Trees
If the cluster has a higher preference for map units with application traffic than for map units with video traffic:
  Evict a lower-preference map unit,
  e.g. 4 units of video demand (1 Fps) are evicted.
[Figure: the same resource tree as on the previous slide, with the violation at node B. Labels: A 50 Bps; B 25 Fps, 30 Fps; C, D, E (Video, Apps, Web) 30 Fps, 40 Fps, 30 Fps; 5 Fps & 20 Bps; 26 Fps & 6.5 Bps; 25 Fps; 4 Fps.]
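The arithmetic behind the violation and the eviction can be checked with a short sketch. It assumes that node B is shared by video and application traffic and has a capacity of 30 Fps, which is consistent with the slide's statement that evicting 1 Fps of video resolves the violation; the function names are illustrative.

```python
import math

# Per-unit resource needs taken from the slide.
VIDEO = {"fps": 0.25, "bps": 1.0}
APPS = {"fps": 1.0, "bps": 0.25}

# Assumption: B is the shared node for video and apps with 30 Fps capacity.
B_FPS_CAPACITY = 30.0
# Root A capacity as stated on the slide.
A_BPS_CAPACITY = 50.0


def load(video_units, app_units):
    """Total (Fps, Bps) needed by the given units of video and app demand."""
    fps = video_units * VIDEO["fps"] + app_units * APPS["fps"]
    bps = video_units * VIDEO["bps"] + app_units * APPS["bps"]
    return fps, bps


def video_units_to_evict(video_units, app_units):
    """Units of (lower-preference) video demand to evict so B's Fps limit holds."""
    fps, _ = load(video_units, app_units)
    excess = fps - B_FPS_CAPACITY
    if excess <= 0:
        return 0
    return min(video_units, math.ceil(excess / VIDEO["fps"]))


fps, bps = load(20, 26)
print(fps, bps)                      # 31.0 26.5
print(video_units_to_evict(20, 26))  # 4
```

With 20 video units and 26 application units, the Fps demand is 5 + 26 = 31, one more than the assumed 30 Fps at B, while the 26.5 Bps demand stays within A's 50 Bps. Evicting 4 video units frees 4 x 0.25 = 1 Fps.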

17 Implementation Challenges
Complexity and scale
  Tens of millions of map units
  Thousands of clusters
  Over a dozen traffic classes
Time to solve
  Map unit assignment must be recomputed every 10 to 30 seconds
Demand and capacity estimation
Incremental and persistent allocation

18 Representative research on CDN
Global load balancing
Load balancing within a single cluster of servers
Bloom filters for CDN
Leader election and consensus

19 Problem Statement
In a traditional hash table, objects from a universe $U$ are mapped to a set of buckets $B$.
In a CDN, an object is a file such as a JPEG image or HTML page; a bucket is the cache of a distinct web server.
Naive method: use a hash function to map objects directly to buckets.
What happens if a server fails?

20 Problem Statement
Solution 1: simply remap the objects in the lost bucket to another bucket.
  Drawback: one bucket stores double the expected load.
Solution 2: renumber the existing buckets and rehash the elements using a new hash function.
  Drawback: many objects will have to be transferred between buckets.
Better approach: consistent hashing.
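To see why rehashing moves many objects, the sketch below counts how many objects change bucket when a naive `hash mod n` scheme goes from 5 buckets to 4. The modulo scheme, the SHA-256 hash, and the object names are assumptions chosen for illustration; the slide does not specify a particular hash function.

```python
import hashlib


def bucket(obj: str, num_buckets: int) -> int:
    """Naive placement: hash the object name and reduce modulo the bucket count."""
    h = int.from_bytes(hashlib.sha256(obj.encode()).digest()[:8], "big")
    return h % num_buckets


def fraction_moved(objects, old_n: int, new_n: int) -> float:
    """Fraction of objects whose bucket number changes when old_n becomes new_n."""
    moved = sum(1 for o in objects if bucket(o, old_n) != bucket(o, new_n))
    return moved / len(objects)


# Hypothetical object names.
objects = [f"/img/{i}.jpg" for i in range(10_000)]
print(fraction_moved(objects, 5, 4))
```

For a well-mixed hash value h, `h % 5` and `h % 4` agree only when `h % 20` is below 4, so about four fifths of the objects are expected to move even though only one fifth of them lived on the failed server.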

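21 Consistent Hashing
Objects and buckets (servers) are both hashed to points on the unit circle.
Each object is mapped to the next bucket that appears in clockwise order on the unit circle.
[Figure: unit circle showing server and object positions.]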

22 Consistent Hashing
Improvements
  Map each bucket to multiple locations (instances) on the unit circle to improve the balance.
When a server fails
  All of the corresponding bucket's instances are removed from the unit circle.
  The objects that were in the bucket are remapped to other buckets.
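A minimal sketch of consistent hashing with multiple instances per bucket follows. It replaces the unit circle with a ring of 64-bit integers, and the class name, the `bucket#i` instance labels, and the default of 100 instances are illustrative assumptions rather than details from the slides.

```python
import bisect
import hashlib


def point(key: str) -> int:
    """Hash a string to a position on the ring (64-bit stand-in for the unit circle)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class ConsistentHashRing:
    def __init__(self, instances_per_bucket: int = 100):
        self.instances = instances_per_bucket
        self._points = []  # sorted ring positions of all bucket instances
        self._owner = {}   # ring position -> bucket name

    def add(self, bucket: str) -> None:
        # Place several instances of the bucket on the ring to even out load.
        for i in range(self.instances):
            p = point(f"{bucket}#{i}")
            self._owner[p] = bucket
            bisect.insort(self._points, p)

    def remove(self, bucket: str) -> None:
        # A failed server loses all of its instances at once.
        self._points = [p for p in self._points if self._owner[p] != bucket]
        self._owner = {p: b for p, b in self._owner.items() if b != bucket}

    def lookup(self, obj: str) -> str:
        # Next instance clockwise from the object's position, wrapping around.
        i = bisect.bisect_right(self._points, point(obj)) % len(self._points)
        return self._owner[self._points[i]]


ring = ConsistentHashRing()
for name in ["s0", "s1", "s2", "s3", "s4"]:
    ring.add(name)

objs = [f"/img/{i}.jpg" for i in range(1000)]
before = {o: ring.lookup(o) for o in objs}
ring.remove("s2")
after = {o: ring.lookup(o) for o in objs}
unchanged = all(after[o] == before[o] for o in objs if before[o] != "s2")
print(unchanged)  # True
```

Only objects that were on the failed bucket move, and because that bucket had many instances spread around the ring, its objects are spread over the surviving buckets instead of all landing on one neighbor.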

23 Consistent Hashing
Popular objects
  It is not possible for a single server within a cluster to satisfy all of the requests for a popular object.
Naive method: map a popular object to the next k servers that appear in clockwise order.
  Problem: if two popular objects happen to hash to nearby positions, the buckets that they map to will highly overlap.
CDN approach
  Use a separate mapping of the buckets for each popular object.
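One way to obtain a separate bucket ordering per object is to score every (object, server) pair with a hash and take the k best-scoring servers, which is the idea behind rendezvous (highest-random-weight) hashing. The slides do not say how the CDN builds its per-object mappings, so the sketch below is an assumed stand-in, not the CDN's method; all names are hypothetical.

```python
import hashlib


def _score(obj: str, server: str) -> int:
    """Pseudo-random score for an (object, server) pair."""
    digest = hashlib.sha256(f"{obj}|{server}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def servers_for_popular_object(obj: str, servers, k: int):
    """Pick k servers using an ordering of the servers that is specific to obj."""
    return sorted(servers, key=lambda s: _score(obj, s))[:k]


servers = [f"s{i}" for i in range(10)]
print(servers_for_popular_object("/video/popular-a.mp4", servers, 3))
print(servers_for_popular_object("/video/popular-b.mp4", servers, 3))
```

Because each object induces its own ordering of the servers, two popular objects are no more likely to share servers than two random k-subsets would be, and removing a server that was not chosen for an object leaves that object's server set unchanged.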

24 Representative research on CDN
Global load balancing
Load balancing within a single cluster of servers
Bloom filters for CDN
Leader election and consensus

25 Bloom filters for CDN
Bloom filters are useful in two different contexts:
  Content summarization
  Content filtering
For content summarization:
  Use Bloom filters to succinctly store the set of objects stored in a CDN server's cache.
  Use counting Bloom filters to support element updates (deletions and insertions).
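A counting Bloom filter replaces each bit with a small counter so that elements can be deleted as well as inserted. The sketch below is a minimal illustration; the sizes, the SHA-256-based index derivation, and the class name are assumptions, not parameters from the slides.

```python
import hashlib


class CountingBloomFilter:
    """Set summary with insertions and deletions; may report false positives."""

    def __init__(self, num_counters: int = 1024, num_hashes: int = 4):
        self.m = num_counters
        self.k = num_hashes
        self.counters = [0] * num_counters

    def _indexes(self, item: str):
        # Derive k counter positions by salting the item with the hash number.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for i in self._indexes(item):
            self.counters[i] += 1

    def remove(self, item: str) -> None:
        # Only items that were added earlier may be removed; otherwise the
        # counters of other items would be corrupted.
        if item not in self:
            raise KeyError(item)
        for i in self._indexes(item):
            self.counters[i] -= 1

    def __contains__(self, item: str) -> bool:
        return all(self.counters[i] > 0 for i in self._indexes(item))


summary = CountingBloomFilter()
summary.add("/img/logo.jpg")      # object enters the server's cache
print("/img/logo.jpg" in summary)  # True
summary.remove("/img/logo.jpg")   # object is evicted from the cache
print("/img/logo.jpg" in summary)  # False
```

A plain Bloom filter cannot support `remove`, because clearing a bit might also erase evidence of another object that hashed to the same position.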

26 Content Filtering
Use Bloom filters to determine which objects to cache in the first place.
Motivation
  74% of the roughly 400 million objects in cache were accessed only once (one-hit-wonders).
  90% were accessed fewer than four times.
  There is no need to cache one-hit-wonders.

27 Content Filtering
Cache-on-second-hit rule
  Use Bloom filters to store the objects that have been accessed.
  The server checks the Bloom filters to see whether an object has been accessed before.
  The server caches only objects that have been accessed before.
False positives
  The probability of false positives increases as more objects are added to a Bloom filter.
  Use two Bloom filters to circumvent the problem.

28 Content Filtering
New objects are added to the primary Bloom filter.
Once the primary filter reaches a threshold for its maximum number of objects, new objects are added to the secondary Bloom filter instead.
Check both the primary and secondary Bloom filters to see whether the object has been accessed in the recent past.
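The cache-on-second-hit rule with two Bloom filters can be sketched as follows. The slides do not say what happens once the secondary filter is also full; this sketch assumes the secondary then becomes the primary and a fresh secondary is started. Filter sizes, thresholds, and all names are illustrative assumptions.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits: int = 8192, num_hashes: int = 4):
        self.m = num_bits
        self.k = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity
        self.count = 0                   # number of objects added so far

    def _indexes(self, item: str):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for i in self._indexes(item):
            self.bits[i] = 1
        self.count += 1

    def __contains__(self, item: str) -> bool:
        return all(self.bits[i] for i in self._indexes(item))


class SecondHitFilter:
    """Decide whether to cache an object: only if it was requested before."""

    def __init__(self, max_objects: int = 1000):
        self.max_objects = max_objects
        self.primary = BloomFilter()
        self.secondary = BloomFilter()

    def should_cache(self, url: str) -> bool:
        # Check both filters for a recent earlier access.
        seen = url in self.primary or url in self.secondary
        if not seen:
            self._record(url)
        return seen

    def _record(self, url: str) -> None:
        if self.primary.count < self.max_objects:
            self.primary.add(url)
            return
        if self.secondary.count >= self.max_objects:
            # Assumed rotation policy: drop the old primary, promote the
            # secondary, and start an empty secondary.
            self.primary, self.secondary = self.secondary, BloomFilter()
        self.secondary.add(url)


admission = SecondHitFilter(max_objects=2)
print(admission.should_cache("/img/a.jpg"))  # False: first request, not cached
print(admission.should_cache("/img/a.jpg"))  # True: second request, cache it
```

Capping the number of objects per filter bounds its false-positive rate, and checking both filters keeps recently recorded objects visible across the switch from primary to secondary.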

29 Content Filtering Benefits
Byte hit rates increased when cache filtering was turned on.

30 Content Filtering Benefits
Not having to store the one-hit-wonders in cache reduces disk writes by nearly one-half.

31 Representative research on CDN
Global load balancing
Load balancing within a single cluster of servers
Bloom filters for CDN
Overlay routing
Leader election and consensus

32 Overview
Background
Representative research
Conclusion

33 Conclusion
This paper explores some research problems in CDNs.
The purpose of the paper is to illustrate:
  How research influenced the design of CDNs
  How system-building challenges inspired more research in CDNs

34 Thank You
Chen Qian
cqian12@ucsc.edu
https://users.soe.ucsc.edu/~qian/

