Edge computing (1) Content Distribution Networks


Edge computing (1) Content Distribution Networks Chen Qian Department of Computer Science and Engineering qian@ucsc.edu https://users.soe.ucsc.edu/~qian/

Algorithmic Nuggets in Content Delivery Bruce M. Maggs Ramesh K. Sitaraman

Overview
- Background
- Representative research
- Conclusion

Web caches (proxy server)
- Goal: satisfy the client request without involving the origin server.
- The user configures the browser to make Web accesses via the cache.
- The browser sends all HTTP requests to the cache.
- If the object is in the cache, the cache returns it; otherwise the cache requests the object from the origin server, then returns it to the client.
[Figure: clients exchange HTTP requests and responses with the proxy server, which in turn exchanges them with the origin servers.]
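The hit-or-miss logic of a proxy cache can be sketched in a few lines. This is a minimal illustration, not how any real proxy is built: the class name `WebCache` and the `fetch_from_origin` callback are hypothetical, and expiry, validation and eviction are left out.

```python
from typing import Callable, Dict


class WebCache:
    """Toy proxy cache: serve from memory, otherwise fetch from the origin."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # fetch_from_origin is a hypothetical stand-in for an HTTP request
        # to the origin server.
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            # Object in cache: return it without involving the origin.
            self.hits += 1
            return self._store[url]
        # Object not in cache: request it from the origin, keep a copy,
        # then return it to the client.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body
```

Only the first request for a URL reaches the origin; later requests for the same URL are counted as hits.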

More about Web caching
- The cache acts as both client and server: server for the original requesting client, client to the origin server.
- Typically the cache is installed by an ISP (university, company, residential ISP).
- Why Web caching? It reduces response time for client requests and reduces traffic on an institution's access link.
- When is a cache not helpful? When every client of the ISP requests different content, the visit to the cache server only wastes time.

Background
- Content delivery network (CDN): a geographically distributed network of proxy servers and their data centers.
- Distributes service spatially relative to end users, e.g. service for DNS queries.

Background
- Top three objectives of a CDN: high reliability; fast and consistent performance; low operating cost.
- Representative research on CDN: global load balancing; load balancing within a single cluster of servers; Bloom filters for CDN; overlay routing; leader election and consensus.

Overview
- Background
- Representative research
- Conclusion

Representative research on CDN
- Global load balancing
- Load balancing within a single cluster of servers
- Bloom filters for CDN
- Leader election and consensus

Global Load Balancing
- Purpose: map clients to the server clusters of the CDN.
- Cluster assignments are made at the granularity of map units, each a pair <IP address prefix, traffic class>, e.g. <1.2.3.4/24, video>.
- Question: how to assign each map unit $M_i$, $1 \le i \le M$, to a server cluster $C_j$, $1 \le j \le N$?

Global Load Balancing
- Preferences: each map unit has preferences for clusters; a higher preference indicates better predicted performance. Each server cluster has preferences regarding which map units it would like to serve.
- Constraints: each map unit is associated with a demand $d$, and each cluster has a notion of capacity $c$.
- Goal: satisfy preferences and meet capacity constraints.

Global Load Balancing Stable allocations Blocking pairs: 𝑀 𝑖 prefers 𝐶 𝑗 over its current partner 𝐶 𝑗 ′ & 𝐶 𝑗 prefers 𝑀 𝑖 over its current partner 𝑀 𝑖 ′ Stable: There are no blocking pairs in the allocations 𝑀 1 :( 𝐶 2 , 𝐶 1 , 𝐶 3 ) 𝐶 1 :( 𝑀 2 , 𝑀 1 , 𝑀 3 ) 𝑀 2 :( 𝐶 2 , 𝐶 1 , 𝐶 3 ) 𝐶 2 :( 𝑀 1 , 𝑀 2 , 𝑀 3 ) 𝑀 3 :( 𝐶 2 , 𝐶 3 , 𝐶 1 ) 𝐶 3 :( 𝑀 1 , 𝑀 3 , 𝑀 2 ) 12

Gale-Shapley algorithm
- The Gale-Shapley (propose-and-reject) algorithm is a distributed algorithm that finds a stable allocation.
- The stable allocation problem is related to a classical problem: the stable marriage problem.
- The result is optimal for the side that proposes: man-optimal in stable marriage, map-unit-optimal here.
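A minimal sketch of propose-and-reject for the classical one-to-one case follows. It assumes equal numbers of map units and clusters and complete preference lists, which the production setting does not satisfy, and it ignores demands and capacities. The function name and the tiny preference lists are made up for illustration.

```python
from typing import Dict, List


def gale_shapley(proposer_prefs: Dict[str, List[str]],
                 acceptor_prefs: Dict[str, List[str]]) -> Dict[str, str]:
    """Return a stable one-to-one matching as {proposer: acceptor}."""
    # rank[a][p] is the position of proposer p in acceptor a's list
    # (lower means more preferred).
    rank = {a: {p: i for i, p in enumerate(prefs)}
            for a, prefs in acceptor_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next acceptor to try
    engaged: Dict[str, str] = {}                  # acceptor -> proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        a = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(a)
        if current is None:
            engaged[a] = p            # acceptor was free: accept
        elif rank[a][p] < rank[a][current]:
            engaged[a] = p            # acceptor trades up
            free.append(current)      # the previous partner is rejected
        else:
            free.append(p)            # proposal rejected, try the next one
    return {p: a for a, p in engaged.items()}


# Hypothetical example: map units propose to clusters.
map_unit_prefs = {"M1": ["C1", "C2"], "M2": ["C1", "C2"]}
cluster_prefs = {"C1": ["M2", "M1"], "C2": ["M1", "M2"]}
matching = gale_shapley(map_unit_prefs, cluster_prefs)
```

Because the map units do the proposing, the matching returned is the one that is best for the map units among all stable matchings.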

Some Limitations
- Unequal number of map units and clusters: there are more map units than clusters.
- Partial preference lists: with tens of millions of map units versus thousands of clusters, rank for each map unit only the top dozen clusters that are likely to provide the best performance.
- Modeling integral demands and capacities: a server cluster cannot be accurately modeled as a single resource with a single-number capacity.
Reference: A Survey of the Stable Marriage Problem and Its Variants

Resource Trees
- Bps: the rate at which data can be sent out of the cluster.
- Fps: the capacity of non-network server resources such as the processor, memory and disk.
- Example 1: 20 units of demand from a video map unit, each requiring 0.25 Fps and 1 Bps (5 Fps and 20 Bps in total).
- Example 2: 26 units of demand from an application map unit, each requiring 1 Fps and 0.25 Bps (26 Fps and 6.5 Bps in total).
[Figure: resource tree with nodes A to E, Bps and Fps capacities on the nodes, leaves for Video, Apps and Web traffic, and a capacity violation marked.]

Resource Trees
- Suppose the cluster has a higher preference for map units with application traffic than for those with video traffic.
- On a capacity violation, evict demand from a lower-preference map unit, e.g. 4 units of video demand (1 Fps) are evicted.
[Figure: resource tree with the capacity violation marked.]
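The arithmetic behind the eviction can be checked with a short sketch. The per-unit costs (20 video units at 0.25 Fps and 1 Bps each, 26 application units at 1 Fps and 0.25 Bps each) come from the slides; the 30 Fps limit on the node shared by video and application traffic is an assumption read off the figure, and the one-unit-at-a-time eviction loop is only an illustration, not the production algorithm.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Demand:
    name: str
    units: int
    fps_per_unit: float
    bps_per_unit: float


def usage(demands: List[Demand]) -> Tuple[float, float]:
    """Total (Fps, Bps) consumed by a list of demands."""
    fps = sum(d.units * d.fps_per_unit for d in demands)
    bps = sum(d.units * d.bps_per_unit for d in demands)
    return fps, bps


def evict_to_fit(low: Demand, high: Demand, fps_capacity: float) -> Demand:
    """Drop units of the lower-preference demand until the Fps limit holds."""
    units = low.units
    while units > 0 and usage([replace(low, units=units), high])[0] > fps_capacity:
        units -= 1
    return replace(low, units=units)


video = Demand("video", units=20, fps_per_unit=0.25, bps_per_unit=1.0)
apps = Demand("apps", units=26, fps_per_unit=1.0, bps_per_unit=0.25)
FPS_CAPACITY = 30.0  # assumed limit of the shared node (inferred from the figure)

before = usage([video, apps])   # 31 Fps and 26.5 Bps: the Fps limit is violated
trimmed = evict_to_fit(video, apps, FPS_CAPACITY)
evicted_units = video.units - trimmed.units
after = usage([trimmed, apps])
```

Under that assumed limit, the loop evicts 4 units of video demand, freeing 1 Fps, which matches the example on the slide.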

Implementation Challenges
- Complexity and scale: tens of millions of map units, thousands of clusters, over a dozen traffic classes.
- Time to solve: map unit assignments must be recomputed every 10 to 30 seconds.
- Demand and capacity estimation.
- Incremental and persistent allocation.

Representative research on CDN
- Global load balancing
- Load balancing within a single cluster of servers
- Bloom filters for CDN
- Leader election and consensus

Problem Statement
- In a traditional hash table, objects from a universe $u$ are mapped to a set of buckets $B$.
- In a CDN, an object is a file such as a JPEG image or HTML page; a bucket is the cache of a distinct web server.
- Naive method: use a hash function to map objects directly to buckets.
- What happens if a server fails?

Problem Statement
- Solution 1: simply remap the objects in the lost bucket to another bucket. Drawback: one bucket stores double the expected load.
- Solution 2: renumber the existing buckets and rehash the elements using a new hash function. Drawback: many objects have to be transferred between buckets.
- Better approach: consistent hashing.
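The cost of renumbering and rehashing can be made concrete with a small experiment. The sketch below assumes the naive scheme is "hash modulo number of buckets" (the slides do not name a specific function), uses SHA-256 only to get a stable hash, and invents the object names.

```python
import hashlib


def stable_hash(key: str) -> int:
    """64-bit hash that is stable across runs (unlike Python's built-in hash)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def bucket_mod(key: str, num_buckets: int) -> int:
    """Naive placement: hash the object, take it modulo the bucket count."""
    return stable_hash(key) % num_buckets


objects = [f"object-{i}" for i in range(10_000)]

# Ten servers, then one fails and the remaining nine are renumbered 0..8.
before = {o: bucket_mod(o, 10) for o in objects}
after = {o: bucket_mod(o, 9) for o in objects}

moved = sum(1 for o in objects if before[o] != after[o])
fraction_moved = moved / len(objects)
```

With a uniform hash, an object keeps its bucket number only when its hash gives the same remainder modulo 10 and modulo 9, so roughly nine out of ten objects change buckets even though only one server failed.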

Consistent Hashing
- Each object is mapped to the next bucket that appears in clockwise order on the unit circle.
[Figure: unit circle with server and object positions.]

Consistent Hashing
- Improvement: map each bucket to multiple locations (instances) on the unit circle to improve the balance.
- When a server fails, all of the corresponding bucket's instances are removed from the unit circle, and the objects that were in that bucket are remapped to other buckets.
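A consistent-hashing ring with multiple instances per bucket can be sketched as follows. The class name `HashRing`, the default of 100 instances per server and the `server#i` naming of instances are all assumptions made for illustration; a 64-bit integer range stands in for the unit circle.

```python
import bisect
import hashlib
from typing import Iterable, List, Tuple


def ring_hash(key: str) -> int:
    """Position on the ring: a 64-bit integer stands in for the unit circle."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, servers: Iterable[str], instances: int = 100) -> None:
        self._instances = instances
        self._points: List[Tuple[int, str]] = []  # sorted (position, server)
        for server in servers:
            self.add(server)

    def add(self, server: str) -> None:
        # Place several instances of the bucket on the ring to even out load.
        for i in range(self._instances):
            bisect.insort(self._points, (ring_hash(f"{server}#{i}"), server))

    def remove(self, server: str) -> None:
        # A failed server loses all of its instances at once.
        self._points = [(pos, s) for pos, s in self._points if s != server]

    def lookup(self, obj: str) -> str:
        if not self._points:
            raise LookupError("no servers on the ring")
        # First instance at or after the object's position, wrapping around.
        i = bisect.bisect_left(self._points, (ring_hash(obj), ""))
        return self._points[i % len(self._points)][1]
```

Removing a server only remaps the objects that were on that server; every other object keeps its bucket, which is the property the naive scheme lacks.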

Consistent Hashing: popular objects
- It is not possible for a single server within a cluster to satisfy all of the requests for a popular object.
- Naive method: map a popular object to the next k servers that appear in clockwise order.
- Problem: if two popular objects happen to hash to nearby positions, the sets of buckets they map to will overlap heavily.
- CDN approach: use a separate mapping of the buckets for each popular object.
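One way to realize a separate mapping per popular object is to place the servers on a ring whose positions depend on the object, then take the next k servers clockwise. The slides do not say how the separate mapping is constructed, so the salting scheme below (hashing the object name together with the server name) is purely an assumption, as are the function names.

```python
import bisect
import hashlib
from typing import List


def ring_hash(key: str) -> int:
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def servers_for_popular_object(obj: str, servers: List[str], k: int) -> List[str]:
    """Pick k distinct servers for obj using a ring built only for obj."""
    # Server positions are salted with the object name, so two popular
    # objects see two unrelated orderings of the same servers.
    ring = sorted((ring_hash(f"{obj}|{server}"), server) for server in servers)
    start = bisect.bisect_left(ring, (ring_hash(obj), ""))
    count = min(k, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]
```

Since each object gets its own ordering, two popular objects no longer share the same run of neighbouring servers just because their hash positions are close.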

Representative research on CDN
- Global load balancing
- Load balancing within a single cluster of servers
- Bloom filters for CDN
- Leader election and consensus

Bloom filters for CDN
- Bloom filters are useful in two different contexts: content summarization and content filtering.
- Content summarization: use Bloom filters to succinctly store the set of objects held in a CDN server's cache.
- Use counting Bloom filters to support element updates (deletions and insertions).
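A counting Bloom filter replaces each bit with a small counter so that elements can be deleted as well as inserted. The sketch below is a minimal version under assumed parameters (1024 counters, 4 hash functions derived from SHA-256); the sizes and the class name are illustrative only.

```python
import hashlib
from typing import Iterator, List


class CountingBloomFilter:
    def __init__(self, num_counters: int = 1024, num_hashes: int = 4) -> None:
        self.m = num_counters
        self.k = num_hashes
        self.counters: List[int] = [0] * num_counters

    def _indexes(self, item: str) -> Iterator[int]:
        # Derive k indexes by salting the item with the hash number.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for j in self._indexes(item):
            self.counters[j] += 1

    def remove(self, item: str) -> None:
        # Only meaningful for items that were added earlier; the membership
        # check guards against driving a counter below zero.
        if item in self:
            for j in self._indexes(item):
                self.counters[j] -= 1

    def __contains__(self, item: str) -> bool:
        # May report false positives, never false negatives.
        return all(self.counters[j] > 0 for j in self._indexes(item))
```

A cache server could add an object's URL when the object is stored and remove it on eviction, so the filter keeps summarizing the current cache contents.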

Content Filtering
- Use Bloom filters to determine which objects to cache in the first place.
- Motivation: 74% of the roughly 400 million objects in cache were accessed only once (one-hit-wonders), and 90% were accessed fewer than four times.
- There is no need to cache one-hit-wonders.

Content Filtering
- Cache-on-second-hit rule: use Bloom filters to store the objects that have been accessed. The server checks the Bloom filters to see whether an object has been accessed before, and caches only objects that have been.
- False positives: the probability of a false positive increases as more objects are added to a Bloom filter. Two Bloom filters are used to circumvent the problem.

Content Filtering
- New objects are added to the primary Bloom filter until it reaches a threshold for the maximum number of objects.
- After that, new objects are added to the secondary Bloom filter.
- Check both the primary and secondary Bloom filters to see whether an object has been accessed in the recent past.
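The cache-on-second-hit rule with a primary and a secondary filter can be sketched as below. The slides do not say what happens once the secondary filter also reaches the threshold; this sketch assumes the older filter is discarded and a fresh one takes over, which is only one plausible rotation policy. Class names, sizes and the threshold are hypothetical.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    def __init__(self, num_bits: int = 8192, num_hashes: int = 4) -> None:
        self.m = num_bits
        self.k = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity
        self.count = 0                   # number of objects added

    def _indexes(self, item: str) -> Iterator[int]:
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        for j in self._indexes(item):
            self.bits[j] = 1
        self.count += 1

    def __contains__(self, item: str) -> bool:
        return all(self.bits[j] for j in self._indexes(item))


class SecondHitFilter:
    """Decide whether to cache an object: only on its second recent access."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.primary = BloomFilter()
        self.secondary = BloomFilter()

    def should_cache(self, url: str) -> bool:
        seen = url in self.primary or url in self.secondary
        if not seen:
            self._record(url)
        return seen

    def _record(self, url: str) -> None:
        if self.primary.count < self.threshold:
            self.primary.add(url)
            return
        if self.secondary.count >= self.threshold:
            # Assumed rotation: forget the oldest accesses, start a new filter.
            self.primary, self.secondary = self.secondary, BloomFilter()
        self.secondary.add(url)
```

Capping the number of objects per filter keeps the false-positive probability bounded, at the cost of forgetting accesses that are no longer recent.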

Content Filtering Benefits
- Byte hit rates increased when cache filtering was turned on.

Content Filtering Benefits
- Not having to store the one-hit-wonders in cache reduces disk writes by nearly one-half.

Representative research on CDN
- Global load balancing
- Load balancing within a single cluster of servers
- Bloom filters for CDN
- Overlay routing
- Leader election and consensus

Overview
- Background
- Representative research
- Conclusion

Conclusion
- This paper explores some research problems in CDNs.
- The purpose of the paper is to illustrate how research influenced the design of CDNs, and how the system-building challenges inspired more research on CDNs.

Thank You
Chen Qian cqian12@ucsc.edu https://users.soe.ucsc.edu/~qian/