A Scalable, Adaptive, Network-aware Infrastructure for Efficient Content Delivery Yan Chen Ph.D. Status Talk EECS Department UC Berkeley.


Motivation
The Internet has evolved to become a commercial infrastructure for service delivery
– Web delivery, VoIP, streaming media …
Challenges for Internet-scale services
– Scalability: 600M users, 35M Web sites, 28Tb/s
– Efficiency: bandwidth, storage, management
– Agility: dynamic clients/network/servers
– Security, etc.
Focus on content delivery: Content Distribution Network (CDN)
– 4 billion Web pages in total, daily growth of 7M pages
– Annual growth of 200% for next 4 years

How CDN Works

New Challenges for CDN
Large multimedia files: efficient replication
Dynamic content: coherence support
Network congestion/failures: scalable network monitoring

Existing CDNs Fail to Address these Challenges
Non-cooperative replication is inefficient
No coherence for dynamic content
Unscalable network monitoring: O(M × N)
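To make the O(M × N) versus O(M + N) contrast concrete, here is a small arithmetic sketch. M is the number of client groups and N the number of servers, as in the Conclusions slide. The function names and the example sizes are illustrative assumptions, not figures from the talk.

```python
def pairwise_paths(m_client_groups: int, n_servers: int) -> int:
    """Paths to measure when every client group probes every server: M x N."""
    return m_client_groups * n_servers


def linear_paths(m_client_groups: int, n_servers: int) -> int:
    """Illustrative count for a scheme whose measurement cost grows as M + N."""
    return m_client_groups + n_servers


# Hypothetical sizes, chosen only to show the gap between the two growth rates.
m, n = 15_600, 1_000
print(pairwise_paths(m, n))  # M x N
print(linear_paths(m, n))    # M + N
```

With these assumed sizes the pair-wise scheme needs roughly three orders of magnitude more measured paths than the linear one.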

SCAN: Scalable Content Access Network
[Figure: SCAN design space. Axes: Provisioning (replica placement), Network Monitoring, Coherence Support, Granularity, Access/Deployment Mechanisms. Values shown: SCAN push vs. existing CDNs pull; cooperative vs. non-cooperative; per object, per Website, per cluster; IP multicast, app-level multicast, unicast; ad hoc pair-wise monitoring O(M×N) vs. tomography-based monitoring O(M+N)]

SCAN
Cooperative clustering-based replication
Coherence for dynamic content
[Figure labels: s1, s4, s5]

SCAN
Cooperative clustering-based replication
Coherence for dynamic content
Scalable network monitoring: O(M+N)
[Figure labels: s1, s4, s5]

Outline
Introduction
Research Methodology
SCAN Mechanisms and Status
– Cooperative clustering-based replication
– Coherence support
– Scalable network monitoring
Research Plan
Conclusions

Design and Evaluation of Internet-scale Systems
[Figure: iterative cycle of algorithm design, analytical evaluation, and realistic simulation (inputs: network topology, Web workload, network end-to-end latency measurement), leading to "Real evaluation?"]

Network Topology and Web Workload
Network Topology
– Pure-random, Waxman & transit-stub synthetic topology
– An AS-level topology from 7 widely-dispersed BGP peers
Web Workload
Web Site  | Period       | Duration | # Requests (avg–min–max) | # Clients (avg–min–max) | # Client groups (avg–min–max)
MSNBC     | Aug-Oct/1999 | 10–11am  | 1.5M–642K–1.7M           | 129K–69K–150K           | 15.6K–10K–17K
NASA      | Jul-Aug/1995 | All day  | 79K–61K–101K             | 5940–4781–7671          | 2378–1784–3011
World Cup | May-Jul/1998 | All day  | 29M–1M–73M               | 103K–13K–218K           | N/A
– Aggregate MSNBC Web clients with BGP prefix
  » BGP tables from a BBNPlanet router
– Aggregate NASA Web clients with domain names
– Map the client groups onto the topology

Network E2E Latency Measurement
NLANR Active Measurement Project data set
– 111 sites in America, Asia, Australia and Europe
– Round-trip time (RTT) between every pair of hosts every minute
– 17M measurements daily
– Raw data: Jun. – Dec. 2001, Nov. 2002
Keynote measurement data
– Measure TCP performance from about 100 worldwide agents
– Heterogeneous core network: various ISPs
– Heterogeneous access network:
  » Dial up 56K, DSL and high-bandwidth business connections
– Targets
  » 40 most popular Web servers + 27 Internet Data Centers
– Raw data: Nov. – Dec. 2001, Mar. – May 2002

Cooperative Clustering-based Replication
Cooperative push: only 4-5% of the replication/update cost of existing CDNs
Clustering reduces the management/computational overhead by two orders of magnitude
– Spatial clustering and popularity-based clustering recommended
Incremental clustering to adapt to emerging objects
– Hyperlink-based online incremental clustering for high availability and performance improvement
– Offline incremental clustering performs close to optimal
Publication
– ICNP 2002
– IEEE J-SAC 2003 (extended version)

Coherence Support
Leverage DOLR (Tapestry)
Dynamic replica placement
Self-organize replicas into an app-level multicast tree
– Small delay and bandwidth consumption for update multicast
– Each node only maintains state for its parent & direct children
Evaluated based on simulation of
– Synthetic traces with various sensitivity analysis
– Real traces from NASA and MSNBC
Publication
– IPTPS 2002
– Pervasive Computing 2002
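The "parent & direct children" property can be illustrated with a minimal sketch. It assumes a plain in-memory tree and is not SCAN's actual tree-construction protocol, which relies on Tapestry. The class name `ReplicaNode`, its methods, and the example topology are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class ReplicaNode:
    """A replica that knows only its parent and its direct children."""
    name: str
    parent: Optional[ReplicaNode] = field(default=None, repr=False)
    children: list[ReplicaNode] = field(default_factory=list, repr=False)
    version: int = 0

    def attach(self, child: ReplicaNode) -> None:
        """Add a direct child; no state about deeper descendants is kept."""
        child.parent = self
        self.children.append(child)

    def push_update(self, version: int) -> int:
        """Apply an update locally, forward it to direct children only,
        and return the number of messages sent in this subtree."""
        self.version = version
        sent = 0
        for child in self.children:
            sent += 1 + child.push_update(version)
        return sent


# Hypothetical topology: the source feeds s1 and s4, and s1 feeds s5.
root = ReplicaNode("source")
a, b, c = ReplicaNode("s1"), ReplicaNode("s4"), ReplicaNode("s5")
root.attach(a)
root.attach(b)
a.attach(c)
messages = root.push_update(version=1)
```

Each update crosses every tree edge exactly once, so the message count equals the number of non-root replicas, and no node needs a global view of the tree.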

Network Distance Estimation
Proposed Internet Iso-bar: a scalable overlay distance monitoring system
Procedures
1. Cluster hosts that perceive similar performance to a small set of sites (landmarks)
2. For each cluster, select a monitor for active and continuous probing
3. Estimate distance between any pair of hosts using inter- and intra-cluster distance
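The three procedures above can be sketched as follows. Several choices here are assumptions made for illustration, not details confirmed by the slides: greedy threshold clustering on Euclidean distance between landmark RTT vectors, choosing the member nearest the cluster mean as monitor, and estimating a host-to-host distance as the sum of the intra-cluster and inter-monitor probes. All function names and the sample data are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two landmark RTT vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cluster_hosts(landmark_rtts, radius):
    """Step 1 (assumed method): a host joins the first cluster whose seed
    host has a landmark vector within `radius`, else it starts a new cluster."""
    clusters = []
    for host, vec in landmark_rtts.items():
        for members in clusters:
            if euclidean(vec, landmark_rtts[members[0]]) <= radius:
                members.append(host)
                break
        else:
            clusters.append([host])
    return clusters


def pick_monitor(members, landmark_rtts):
    """Step 2 (assumed rule): the member closest to the cluster mean."""
    dims = len(landmark_rtts[members[0]])
    centroid = [sum(landmark_rtts[h][i] for h in members) / len(members)
                for i in range(dims)]
    return min(members, key=lambda h: euclidean(landmark_rtts[h], centroid))


def estimate_rtt(a, b, monitor_of, probe):
    """Step 3 (assumed rule): host-to-monitor + monitor-to-monitor +
    monitor-to-host, where probe(x, y) returns a measured RTT."""
    ma, mb = monitor_of[a], monitor_of[b]
    total = 0.0
    if a != ma:
        total += probe(ma, a)
    if ma != mb:
        total += probe(ma, mb)
    if b != mb:
        total += probe(mb, b)
    return total


# Hypothetical RTTs (ms) from four hosts to two landmarks.
landmark_rtts = {"h1": (10.0, 50.0), "h2": (12.0, 52.0),
                 "h3": (80.0, 20.0), "h4": (82.0, 21.0)}
clusters = cluster_hosts(landmark_rtts, radius=10.0)
monitor_of = {}
for members in clusters:
    monitor = pick_monitor(members, landmark_rtts)
    for host in members:
        monitor_of[host] = monitor
```

Only monitors probe continuously, so the measurement load depends on the number of clusters rather than on the number of host pairs.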

Diagram of Internet Iso-bar
[Figure labels: end host, landmark, Cluster A, Cluster B, Cluster C]

Diagram of Internet Iso-bar
[Figure labels: end host, monitor, landmark, Cluster A, Cluster B, Cluster C; distance probes from monitor to its hosts; distance probes among monitors]

Internet Iso-bar
Evaluated with NLANR AMP and Keynote data
– 90% of relative errors are less than 0.5
  » e.g., for a 60ms latency, 45ms < prediction < 90ms
– Good stability for distance estimation
Publications
– ACM SIGMETRICS Performance Evaluation Review (PER), September issue, 2002.
– Journal of Computer Resource Management, Computer Measurement Group, Spring Edition, 2002.

Research Plan
Focus on congestion/failure estimation (4 months)
– Apply topology information, e.g. lossy link detection with network tomography
– Cluster and choose monitors based on the lossy links
– Dynamic node join/leave for P2P systems
– More comprehensive evaluation
  » Simulate with large networks
  » Deploy on PlanetLab, and operate at a finer level
Write up thesis (4 months)

Tomography-based Network Monitoring
Observations
– The # of lossy links is small, but they dominate E2E loss
– Loss rates are stable (on the order of hours to days)
– Routing is stable (on the order of days)
Identify the lossy links and only monitor a few paths to examine lossy links
Make inference for other paths
[Figure legend: end hosts, routers, normal links, lossy links]
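One simple way to picture "identify the lossy links, then infer other paths" is Boolean tomography, sketched below. This is a stand-in technique chosen for illustration, not necessarily the inference method used in SCAN. It assumes known routes, treats every link on a loss-free measured path as normal, and assumes unobserved links are normal because lossy links are few. The link and path names are hypothetical.

```python
def suspect_lossy_links(measured):
    """measured maps a path name to (links, is_lossy).
    Links on any loss-free path are cleared; the remaining links on
    lossy paths are the suspects."""
    good = set()
    for links, is_lossy in measured.values():
        if not is_lossy:
            good.update(links)
    suspects = set()
    for links, is_lossy in measured.values():
        if is_lossy:
            suspects.update(link for link in links if link not in good)
    return suspects


def infer_path_is_lossy(links, suspects):
    """An unmeasured path is predicted lossy if it crosses a suspect link."""
    return any(link in suspects for link in links)


# Hypothetical measurements over links l1..l4.
measured = {
    "p1": (("l1", "l2"), False),  # loss-free, so l1 and l2 are cleared
    "p2": (("l1", "l3"), True),   # lossy, so l3 is the only remaining suspect
}
suspects = suspect_lossy_links(measured)
crosses_suspect = infer_path_is_lossy(("l2", "l3"), suspects)
avoids_suspect = infer_path_is_lossy(("l2", "l4"), suspects)
```

Because routing and loss rates are stable over hours to days, the suspect set could be reused for many predictions before it needs re-measuring.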

Conclusions
Cooperative, clustering-based replication
– Cooperative push: only 4-5% of the replication/update cost of existing CDNs
– Clustering reduces the management/computational overhead by two orders of magnitude
  » Spatial clustering and popularity-based clustering recommended
– Incremental clustering to adapt to emerging objects
  » Hyperlink-based online incremental clustering for high availability and performance improvement
Self-organize replicas into app-level multicast tree for update dissemination
Scalable overlay network monitoring
– O(M+N) instead of O(M × N), given M client groups and N servers

Backup Materials

SCAN
Cooperative clustering-based replication
Coherence for dynamic content
Scalable network monitoring: O(M+N)
[Figure labels: s1, s4, s5]

Problem Formulation
Subject to a certain total replication cost (e.g., # of URL replicas)
Find a scalable, adaptive replication strategy to reduce avg access cost

SCAN: Scalable Content Access Network
[Figure: architecture components. CDN applications (e.g. streaming media); Provision: cooperative clustering-based replication; Coherence: update multicast tree construction; network distance/congestion/failure estimation; user behavior/workload monitoring; network performance monitoring. Legend: red: my work, black: out of scope]

Evaluation of Internet-scale System
Analytical evaluation
Realistic simulation
– Network topology
– Web workload
– Network end-to-end latency measurement
Network topology
– Pure-random, Waxman & transit-stub synthetic topology
– A real AS-level topology from 7 widely-dispersed BGP peers

Web Workload
Web Site  | Period       | Duration | # Requests (avg–min–max) | # Clients (avg–min–max) | # Client groups (avg–min–max)
MSNBC     | Aug-Oct/1999 | 10–11am  | 1.5M–642K–1.7M           | 129K–69K–150K           | 15.6K–10K–17K
NASA      | Jul-Aug/1995 | All day  | 79K–61K–101K             | 5940–4781–7671          | 2378–1784–3011
World Cup | May-Jul/1998 | All day  | 29M–1M–73M               | 103K–13K–218K           | N/A
Aggregate MSNBC Web clients with BGP prefix
– BGP tables from a BBNPlanet router
Aggregate NASA Web clients with domain names
Map the client groups onto the topology

Simulation Methodology
Network Topology
– Pure-random, Waxman & transit-stub synthetic topology
– An AS-level topology from 7 widely-dispersed BGP peers
Web Workload
Web Site | Period       | Duration | # Requests (avg–min–max) | # Clients (avg–min–max) | # Client groups (avg–min–max)
MSNBC    | Aug-Oct/1999 | 10–11am  | 1.5M–642K–1.7M           | 129K–69K–150K           | 15.6K–10K–17K
NASA     | Jul-Aug/1995 | All day  | 79K–61K–101K             | 5940–4781–7671          | 2378–1784–3011
– Aggregate MSNBC Web clients with BGP prefix
  » BGP tables from a BBNPlanet router
– Aggregate NASA Web clients with domain names
– Map the client groups onto the topology

Online Incremental Clustering
Predict access patterns based on semantics
Simplify to popularity prediction
Groups of URLs with similar popularity? Use hyperlink structures!
– Groups of siblings
– Groups of the same hyperlink depth: smallest # of links from root
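The "same hyperlink depth" grouping can be sketched with a breadth-first search, since BFS from the root yields the smallest number of links to each page. The function names and the sample site graph are hypothetical, and the sketch covers only the depth-based grouping, not the sibling-based one.

```python
from collections import defaultdict, deque


def hyperlink_depths(links, root):
    """Return the smallest number of links from `root` to each reachable URL.
    `links` maps a URL to the URLs it links to."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        page = queue.popleft()
        for target in links.get(page, ()):
            if target not in depth:  # the first visit is the shortest in BFS
                depth[target] = depth[page] + 1
                queue.append(target)
    return depth


def group_by_depth(depth):
    """Group URLs that share the same hyperlink depth."""
    groups = defaultdict(list)
    for url, d in depth.items():
        groups[d].append(url)
    return {d: sorted(urls) for d, urls in groups.items()}


# Hypothetical site structure.
links = {
    "/": ["/news", "/sports"],
    "/news": ["/news/a", "/sports"],
    "/sports": ["/sports/b"],
}
groups = group_by_depth(hyperlink_depths(links, "/"))
```

A newly published page can then be assigned to the cluster matching its depth before any access statistics for it exist.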

Challenges for CDN
Over-provisioning for replication
– Provide good QoS to clients (e.g., latency bound, coherence)
– Small # of replicas with small delay and bandwidth consumption for update
Replica Management
– Scalability: billions of replicas if replicating per URL
  » O(10^4) URLs/server, O(10^5) CDN edge servers in O(10^3) networks
– Adaptation to dynamics of content providers and customers
Monitoring
– User workload monitoring
– End-to-end network distance/congestion/failures monitoring
  » Measurement scalability
  » Inference accuracy and stability

SCAN Architecture
Leverage Decentralized Object Location and Routing (DOLR), Tapestry, for
– Distributed, scalable location with guaranteed success
– Search with locality
Soft state maintenance of dissemination tree (for each object)
[Figure labels: data plane, network plane, data source, Web server, SCAN server, client, replica, cache, always update, adaptive coherence, Tapestry mesh; Request Location; Dynamic Replication/Update and Content Management]

Wide-area Network Measurement and Monitoring System (WNMMS)
Select a subset of SCAN servers to be monitors
E2E estimation for
– Distance
– Congestion
– Failures
[Figure (network plane): clients and SCAN edge servers grouped into Cluster A, Cluster B, Cluster C, with monitors; distance measured from a host to its monitor; distance measured among monitors]

Dynamic Provisioning
Dynamic replica placement
– Meeting clients’ latency and servers’ capacity constraints
– Close-to-minimal # of replicas
Self-organize replicas into an app-level multicast tree
– Small delay and bandwidth consumption for update multicast
– Each node only maintains state for its parent & direct children
Evaluated based on simulation of
– Synthetic traces with various sensitivity analysis
– Real traces from NASA and MSNBC
Publication
– IPTPS 2002
– Pervasive Computing 2002

Effects of the Non-Uniform Size of URLs
Replication cost constraint: bytes
Similar trends exist
– Per URL replication outperforms per Website dramatically
– Spatial clustering with Euclidean distance and popularity-based clustering are very cost-effective


Real Internet Measurement Data
NLANR Active Measurement Project data set
– 119 sites in the US (106 after filtering out most offline sites)
– Round-trip time (RTT) between every pair of hosts every minute
– Raw data: 6/24/00 – 12/3/01
Keynote measurement data
– Measure TCP performance from about 100 agents
– Heterogeneous core network: various ISPs
– Heterogeneous access network:
  » Dial up 56K, DSL and high-bandwidth business connections
– Targets
  » Web site perspective: 40 most popular Web servers
  » 27 Internet Data Centers (IDCs)

Related Work
Internet content delivery systems
– Web caching
  » Client-initiated
  » Server-initiated
– Pull-based Content Delivery Networks (CDNs)
– Push-based CDNs
Update dissemination
– IP multicast
– Application-level multicast
Network E2E Distance Monitoring Systems

Web Proxy Caching
[Figure: ISP 1 and ISP 2 each contain a client, a local DNS server, and a proxy cache server; a Web content server sits outside the ISPs]
1. GET request
2. GET request if cache miss
3. Response
4. Response

Pull-based CDN
[Figure: ISP 1 and ISP 2 each contain a client, a local DNS server, and a local CDN server; a CDN name server and a Web content server sit outside the ISPs]
1. GET request
2. Request for hostname resolution
3. Reply: local CDN server IP address
4. Local CDN server IP address
5. GET request
6. GET request if cache miss
7. Response
8. Response
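Steps 5 to 8 of the pull-based flow (serve from the local CDN server, fetching from the Web content server only on a cache miss) can be sketched as a pull-through cache. The class `EdgeServer`, its method names, and the stand-in origin function are hypothetical. The sketch omits DNS redirection, expiry, and coherence entirely.

```python
class EdgeServer:
    """Local CDN server that pulls content from the origin on a cache miss."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # callable: url -> content
        self._cache = {}
        self.origin_fetches = 0

    def get(self, url):
        """Step 5: the client GET arrives. Steps 6-7 run only on a miss.
        Step 8: the response is returned to the client."""
        if url not in self._cache:
            self.origin_fetches += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


def origin(url):
    """Stand-in for the Web content server (an assumption for this sketch)."""
    return f"content of {url}"


edge = EdgeServer(origin)
first = edge.get("/index.html")   # miss: pulled from the origin
second = edge.get("/index.html")  # hit: served from the edge cache
```

Nothing reaches the edge until a client asks for it, which is the contrast with the push-based design, where replicas are placed before requests arrive.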

Push-based CDN
[Figure: ISP 1 and ISP 2 each contain a client, a local DNS server, and a local CDN server; a CDN name server and a Web content server sit outside the ISPs]
0. Push replicas
1. GET request
2. Request for hostname resolution
3. Reply: nearby replica server or Web server IP address
4. Redirected server IP address
5. GET request
5. GET request if no replica yet
6. Response

Internet Content Delivery Systems
Systems compared (table columns): Web caching (client initiated); Web caching (server initiated); Pull-based CDNs (Akamai); Push-based CDNs; SCAN
Properties (table rows, cell values in left-to-right order; some cells span several columns)
– Scalability for request redirection: pre-configured in browser; use Bloom filter to exchange replica locations; centralized CDN name server; decentralized P2P location
– Efficiency (# of caches or replicas): no cache sharing among proxies; cache sharing; no replica sharing among edge servers; replica sharing
– Network-awareness: No; Yes, unscalable monitoring system; No; Yes, scalable monitoring system
– Coherence support: No; Yes; No; Yes

Previous Work: Update Dissemination
No inter-domain IP multicast
Application-level multicast (ALM) is unscalable
– Root maintains state for all children (Narada, Overcast, ALMI, RMX)
– Root handles all “join” requests (Bayeux)
– Root split is a common solution, but suffers from consistency overhead

Design Principles
Scalability
– No centralized point of control: P2P location services, Tapestry
– Reduce management state: minimize # of replicas, object clustering
– Distributed load balancing: capacity constraints
Adaptation to clients’ dynamics
– Dynamic distribution/deletion of replicas with regard to clients’ QoS constraints
– Incremental clustering
Network-awareness and fault-tolerance (WNMMS)
– Distance estimation: Internet Iso-bar
– Anomaly detection and diagnostics

Comparison of Content Delivery Systems (cont’d)
Systems compared (table columns): Web caching (client initiated); Web caching (server initiated); Pull-based CDNs (Akamai); Push-based CDNs; SCAN
Properties (table rows, cell values in left-to-right order; some cells span several columns)
– Distributed load balancing: No; Yes; No; Yes
– Dynamic replica placement: Yes; No; Yes
– Network-awareness: No; Yes, unscalable monitoring system; No; Yes, scalable monitoring system
– No global network topology assumption: Yes; No; Yes

Network-awareness (cont’d)
Loss/congestion prediction
– Maximize the true positives and minimize the false positives
Orthogonal loss/congestion paths discovery
– Without underlying topology
– How stable is such orthogonality?
  » Degradation of orthogonality over time
Reactive and proactive adaptation for SCAN
