Content Distribution, March 6, 2012. 2: Application Layer



Contents
- P2P architecture and benefits
- P2P content distribution
- Content distribution network (CDN)

Pure P2P architecture
- no always-on server
- arbitrary end systems directly communicate
- peers are intermittently connected and change IP addresses
- Three topics:
  - File distribution
  - Searching for information
  - Case Study: Skype

File Distribution: Server-Client vs P2P
Question: How much time to distribute a file from one server to N peers?
- File, size F
- $u_s$: server upload bandwidth
- $u_i$: peer i upload bandwidth
- $d_i$: peer i download bandwidth
[Figure: server with upload rate $u_s$ and peers 1..N with upload rates $u_i$ and download rates $d_i$, connected by a network with abundant bandwidth]

File distribution time: server-client
- server sequentially sends N copies: $NF/u_s$ time
- client i takes $F/d_i$ time to download
Time to distribute F to N clients using the client/server approach:
$d_{cs} = \max\{ NF/u_s,\ F/\min_i d_i \}$
This increases linearly w.r.t. N (for large N).

File distribution time: P2P
- server must send one copy: $F/u_s$ time
- client i takes $F/d_i$ time to download
- NF bits must be downloaded (aggregate)
  - fastest possible upload rate: $u_s + \sum_i u_i$
$d_{P2P} = \max\{ F/u_s,\ F/\min_i d_i,\ NF/(u_s + \sum_i u_i) \}$

Server-client vs. P2P: example
Client upload rate = u, F/u = 1 hour, $u_s = 10u$, $d_{min} \ge u_s$
Client-server ~ $NF/u_s$ vs. P2P ~ $NF/(u_s + \sum_i u_i)$
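The two distribution-time bounds can be evaluated numerically. The following is a minimal sketch, assuming rates are expressed in "files per hour" so that F = 1 and u = 1 gives F/u = 1 hour as in the example; the function names and the choice N = 30 are illustrative and not from the slides.

```python
def client_server_time(F, u_s, d):
    """d_cs = max(N*F/u_s, F/min(d_i)) for N = len(d) clients."""
    N = len(d)
    return max(N * F / u_s, F / min(d))


def p2p_time(F, u_s, u, d):
    """d_P2P = max(F/u_s, F/min(d_i), N*F/(u_s + sum(u_i)))."""
    N = len(d)
    return max(F / u_s, F / min(d), N * F / (u_s + sum(u)))


# Example parameters from the slide: F/u = 1 hour, u_s = 10u, d_min >= u_s.
F, u = 1.0, 1.0
u_s = 10 * u
N = 30  # assumed number of peers, chosen only for illustration
uploads = [u] * N
downloads = [u_s] * N

print(client_server_time(F, u_s, downloads))   # hours, grows linearly in N
print(p2p_time(F, u_s, uploads, downloads))    # hours, stays below 1 here
```

With these assumed values the client-server bound is dominated by $NF/u_s$, while the P2P bound is dominated by $NF/(u_s + \sum_i u_i)$, which approaches F/u as N grows.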


P2P content distribution issues
- Issues
  - Group management and data search
  - Reliable and efficient file exchange
  - Security/privacy/anonymity/trust
- Approaches for group management and data search (i.e., who has what?)
  - Centralized (e.g., BitTorrent tracker)
  - Unstructured (e.g., Gnutella)
  - Structured (Distributed Hash Tables [DHT])

Centralized model (Napster)
Original “Napster” design:
1) when a peer connects, it informs the central server of its:
  - IP address
  - content
2) Alice queries for “Hey Jude”; the server notifies her that Bob has the file
3) Alice requests the file from Bob
[Figure: centralized directory server and peers, including Alice and Bob. Q: “Hey Jude”. A: Bob has it.]
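The three steps of the centralized design can be sketched as a small in-memory index. This is only an illustration of the idea: the class name `DirectoryServer`, its methods, and the peer address strings are hypothetical and do not reflect Napster's actual protocol.

```python
from collections import defaultdict


class DirectoryServer:
    """Hypothetical central index: maps a title to the peers that announced it."""

    def __init__(self):
        self._index = defaultdict(set)

    def register(self, peer_addr, titles):
        # Step 1: a connecting peer reports its address and its content.
        for title in titles:
            self._index[title].add(peer_addr)

    def unregister(self, peer_addr):
        # Peers come and go, so the index must forget departed peers.
        for holders in self._index.values():
            holders.discard(peer_addr)

    def query(self, title):
        # Step 2: the server answers "who has it?"; it never sends the file.
        return sorted(self._index.get(title, set()))


server = DirectoryServer()
server.register("bob.example:6699", ["Hey Jude"])   # made-up address
holders = server.query("Hey Jude")
# Step 3 happens outside the server: Alice contacts one of `holders` directly.
print(holders)
```

The file transfer itself stays peer-to-peer; only the lookup goes through the server, which is why the server is both the strength and the bottleneck of this design.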

Centralized model
File transfer is decentralized, but locating content is highly centralized.
[Figure: peers Bob, Alice, Jane, Judy]

Centralized model
- Benefits:
  - Low per-node state
  - Limited bandwidth usage
  - Short search time
  - High success rate
  - Fault tolerant
- Drawbacks:
  - Single point of failure
  - Limited scale
  - Possibly unbalanced load
- Copyright infringement (?)

File distribution: BitTorrent
- P2P file distribution
- tracker: tracks peers participating in torrent
- torrent: group of peers exchanging chunks of a file
[Figure: a peer obtains a list of peers from the tracker and trades chunks with them]

BitTorrent (1)
- file divided into 256KB chunks
- peer joining torrent:
  - has no chunks, but will accumulate them over time
  - registers with tracker to get list of peers, connects to subset of peers (“neighbors”)
- while downloading, peer uploads chunks to other peers
- peers may come and go
- once peer has entire file, it may (selfishly) leave or (altruistically) remain

BitTorrent (2)
Pulling Chunks
- at any given time, different peers have different subsets of file chunks
- periodically, a peer (Alice) asks each neighbor for a list of chunks that it has
- Alice sends requests for her missing chunks
  - rarest first
Sending Chunks: tit-for-tat
- Alice sends chunks to the four neighbors currently sending her chunks at the highest rate
  - re-evaluate top 4 every 10 secs
- every 30 secs: randomly select another peer, start sending it chunks
  - newly chosen peer may join top 4
  - “optimistically unchoke”
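The two policies on this slide, rarest-first requests and tit-for-tat sending, can be sketched as follows. This is a simplified illustration, not the BitTorrent wire protocol: the function names, the dictionary-based inputs, and the tie-breaking rules are assumptions, and the 10-second and 30-second timers are left out.

```python
import random
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Order the chunks we miss by how few neighbors hold them (rarest first)."""
    counts = Counter()
    for chunks in neighbor_chunks.values():
        counts.update(chunks)
    missing = [c for c in counts if c not in my_chunks]
    # Ties are broken by chunk index, which is an arbitrary choice here.
    return sorted(missing, key=lambda c: (counts[c], c))


def choose_unchoked(rates, top_n=4, rng=random):
    """Return the top_n fastest senders plus one random other peer."""
    ranked = sorted(rates, key=lambda p: (-rates[p], p))
    top = ranked[:top_n]
    others = ranked[top_n:]
    optimistic = rng.choice(others) if others else None  # "optimistic unchoke"
    return top, optimistic


neighbors = {"bob": {0, 1, 2}, "carol": {1, 3}, "dave": {1, 2}}
print(rarest_first({0}, neighbors))

rates = {"a": 50, "b": 40, "c": 30, "d": 20, "e": 10, "f": 5}
print(choose_unchoked(rates))
```

A real client would call `choose_unchoked` on a timer and keep the optimistic peer for a full interval, so that the newcomer has a chance to reciprocate and enter the top four.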

BitTorrent: Tit-for-tat
(1) Alice “optimistically unchokes” Bob
(2) Alice becomes one of Bob’s top-four providers; Bob reciprocates
(3) Bob becomes one of Alice’s top-four providers
With a higher upload rate, a peer can find better trading partners and get the file faster!

P2P Case study: Skype
- inherently P2P: pairs of users communicate
- proprietary application-layer protocol (inferred via reverse engineering)
- hierarchical overlay with super nodes (SNs)
- index maps usernames to IP addresses; distributed over SNs
[Figure: Skype clients (SC), supernodes (SN), Skype login server]

Peers as relays
- Problem when both Alice and Bob are behind “NATs”
  - NAT prevents an outside peer from initiating a call to an inside peer
- Solution:
  - Using Alice’s and Bob’s SNs, a relay is chosen
  - Each peer initiates a session with the relay
  - Peers can now communicate through NATs via the relay


Why Content Networks?
- More hops between client and Web server -> more congestion!
- Same data flowing repeatedly over links between clients and Web server
[Figure: server S, clients C1 to C4, IP routers]

Why Content Networks?
- Origin server is the bottleneck as the number of users grows
- Flash Crowds (for instance, Sept. 11)
- The Content Distribution Problem: Arrange a rendezvous between a content source at the origin server and a content sink (us, as users)

Example: Web Server Farm
- Simple solution to the content distribution problem: deploy a large group of servers
- Arbitrate client requests to servers using an “intelligent” L4-L7 switch
- Pretty widely used today
[Figure: requests from grad.umd.edu and ren.cis.udel.edu pass through an L4-L7 switch to server copies 1, 2, and 3]

Example: Caching Proxy
- Motivated mainly by ISP business interests: reduces the bandwidth the ISP consumes from the Internet
- Reduced network traffic
- Reduced user-perceived latency
[Figure: clients ren.cis.udel.edu and merlot.cis.udel.edu inside an ISP; interceptors divert TCP port 80 traffic to the proxy, while other traffic goes directly to the Internet]

But on Sept. 11, 2001
[Figure labels: Web Server (New Content: WTC News!); User mslab.kaist.ac.kr; 1,000,000 other hosts; request; Caching Proxy (old content); ISP; Congestion / Bottleneck]

Problems with discussed approaches: Server farms and Caching proxies
- Server farms do nothing about problems due to network congestion
- Caching proxies serve only their clients, not all users on the Internet
- Content providers (say, Web servers) cannot rely on the existence and correct implementation of caching proxies
- Accounting issues with caching proxies
  - For instance, a content provider needs to know the number of hits to a webpage for the advertisements displayed on it

Again on Sept. 11, 2001 with CDN
[Figure labels: Web Server (New Content: WTC News!); User mslab.kaist.ac.kr; 1,000,000 other users; request; new content; Surrogates; Distribution Infrastructure; locations FL, IL, DE, NY, MA, MI, CA, WA]

Web replication - CDNs
- Overlay network to distribute content from origin servers to users
- Avoids large amounts of the same data repeatedly traversing potentially congested links on the Internet
- Reduces Web server load
- Reduces user-perceived latency
- Tries to route around congested networks

CDN vs. Caching Proxies
- Caches are used by ISPs to reduce bandwidth consumption; CDNs are used by content providers to improve quality of service to end users
- Caches are reactive; CDNs are proactive
- Caching proxies cater to their users (web clients) and not to content providers (web servers); CDNs cater to both the content providers (web servers) and the clients
- CDNs give content providers control over the content; caching proxies do not

CDN Architecture
[Figure: client, origin server, and a CDN made up of surrogates, request routing infrastructure, and distribution & accounting infrastructure]

CDN Components
- Distribution Infrastructure:
  - Moving or replicating content from content source (origin server, content provider) to surrogates
- Request Routing Infrastructure:
  - Steering or directing content request from a client to a suitable surrogate
- Content Delivery Infrastructure:
  - Delivering content to clients from surrogates
- Accounting Infrastructure:
  - Logging and reporting of distribution and delivery activities

Server Interaction with CDN
1. Distribution Infrastructure: origin server pushes new content to CDN, OR CDN pulls content from origin server
2. Accounting Infrastructure: origin server requests logs and other accounting info from CDN, OR CDN provides logs and other accounting info to origin server
[Figure: origin server and CDN]

Client Interaction with CDN
1. Client to request routing infrastructure: “Hi! I need …”
2. Request routing infrastructure to client: “Go to surrogate newyork.cnn.akamai.com”
3. Client to surrogate: “Hi! I need content /sept11”
Q: How did the CDN choose the New York surrogate over the California surrogate?
[Figure: client, request routing infrastructure, and CDN surrogates newyork.cnn.akamai.com (NY) and california.cnn.akamai.com (CA)]

Request Routing Techniques
- Request routing techniques use a set of metrics to direct users to the “best” surrogate
- Proprietary, but underlying techniques known:
  - DNS based request routing
  - Content modification (URL rewriting)
  - Anycast based (how common is anycast?)
  - URL based request routing
  - Transport layer request routing
  - Combination of multiple mechanisms

DNS based Request-Routing
- Common due to the ubiquity of DNS as a directory service
- Specialized DNS server inserted in the DNS resolution process
- DNS server is capable of returning a different set of A, NS or CNAME records based on policies/metrics

DNS based Request-Routing
Q: How does the Akamai DNS know which surrogate is closest?
[Figure: client test.nyu.edu, local DNS server (dns.nyu.edu), Akamai DNS, and Akamai CDN surrogates newyork.cnn.akamai.com and california.cnn.akamai.com; DNS queries and A-record responses pass between them, then the client opens a session with newyork.cnn.akamai.com]

DNS based Request-Routing
[Figure: client test.nyu.edu sends a DNS query via its local DNS server (dns.nyu.edu) to the Akamai DNS; surrogates in the Akamai CDN measure to the client DNS and report measurement results to the Akamai DNS]

DNS based Request-Routing
[Figure: client, client DNS, Akamai DNS, and two surrogates in the Akamai CDN. One surrogate reports for the requesting DNS: available bandwidth = 10 kbps, RTT = 10 ms. The other reports: available bandwidth = 5 kbps, RTT = 100 ms. Answer to the requesting DNS: Surrogate A, TTL = 10s]
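The selection step in the figure can be sketched as below. The real policy is proprietary, so the rule used here (lowest RTT, with higher bandwidth as tie-breaker) is only an assumption, as are the function names, the assignment of the two measurement sets to particular surrogates, and the addresses, which come from the reserved documentation range 192.0.2.0/24.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    surrogate: str         # surrogate host name
    address: str           # made-up address for illustration
    rtt_ms: float          # RTT from surrogate to the requesting DNS server
    bandwidth_kbps: float  # available bandwidth toward the requesting DNS server


def pick_surrogate(measurements):
    """Assumed policy: lowest RTT wins, higher bandwidth breaks ties."""
    return min(measurements, key=lambda m: (m.rtt_ms, -m.bandwidth_kbps))


def dns_answer(measurements, ttl_s=10):
    """Build the answer handed back to the requesting DNS server."""
    best = pick_surrogate(measurements)
    # A short TTL forces the resolver to ask again soon, so the choice can change.
    return {"name": best.surrogate, "address": best.address, "ttl": ttl_s}


results = [
    Measurement("newyork.cnn.akamai.com", "192.0.2.1", rtt_ms=10, bandwidth_kbps=10),
    Measurement("california.cnn.akamai.com", "192.0.2.2", rtt_ms=100, bandwidth_kbps=5),
]
print(dns_answer(results))
```

Note that the measurements are taken toward the requesting DNS server, not the end client, which is the root of the originator problem discussed on the following slide.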

DNS based Request Routing: Discussion
- Originator Problem: Client may be far removed from client DNS
- Client DNS Masking Problem: Virtually all DNS servers, except for root DNS servers, honor requests for recursion
  Q: Which DNS server resolves a request for test.nyu.edu?
  Q: Which DNS server performs the last recursion of the DNS request?
- Hidden Load Factor: A DNS resolution may result in drastically different load on the selected surrogate, which is an issue in load balancing requests and in predicting load on surrogates

Summary
- P2P architecture and its benefits
- P2P content distribution
  - BitTorrent, Skype
- Content distribution network (CDN)
  - DNS-based request routing

Distributed Hash Table (DHT)
- DHT = distributed P2P database
- Database has (key, value) pairs:
  - key: social security number; value: human name
  - key: content type; value: IP address
- Peers query DB with key
  - DB returns values that match the key
- Peers can also insert (key, value) pairs

DHT Identifiers
- Assign an integer identifier to each peer in the range $[0, 2^n - 1]$
  - Each identifier can be represented by n bits
- Require each key to be an integer in the same range
- To get integer keys, hash the original key
  - e.g., key = h(“Led Zeppelin IV”)
  - This is why they call it a distributed “hash” table

How to assign keys to peers?
- Central issue:
  - Assigning (key, value) pairs to peers
- Rule: assign key to the peer that has the closest ID
- Convention in lecture: closest is the immediate successor of the key
- Ex: n=4; peers: 1,3,4,5,8,10,12,14;
  - key = 13, then successor peer = 14
  - key = 15, then successor peer = 1
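The hashing and successor rules can be sketched together. The slides only name a hash function h, so using SHA-1 here is an assumption, as are the function names `key_id` and `successor_peer`.

```python
import hashlib
from bisect import bisect_left


def key_id(name, n_bits):
    """Hash an arbitrary string key to an integer in [0, 2**n_bits - 1]."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()  # SHA-1 is an assumed choice of h
    return int.from_bytes(digest, "big") % (2 ** n_bits)


def successor_peer(peers, key):
    """Closest peer = immediate successor of the key, wrapping around the ring."""
    ring = sorted(peers)
    i = bisect_left(ring, key)   # first peer with ID >= key
    return ring[i % len(ring)]   # past the largest ID, wrap to the smallest


peers = [1, 3, 4, 5, 8, 10, 12, 14]   # example from the slide, n = 4
print(successor_peer(peers, 13))       # slide: successor peer = 14
print(successor_peer(peers, 15))       # slide: successor peer = 1
print(successor_peer(peers, key_id("Led Zeppelin IV", 4)))
```

A key equal to a peer ID is assigned to that peer itself, which is one reasonable reading of "immediate successor" and matches the "closest successor" wording used for Chord.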

Chord (a circular DHT) (1)
- Each peer is only aware of its immediate successor and predecessor
- “Overlay network”

Chord (a circular DHT) (2)
- O(N) messages on average to resolve a query, when there are N peers
- Define closest as closest successor
[Figure: the query “Who’s responsible for key 1110?” is passed from peer to peer around the ring until the responsible peer answers “I am”]

Chord (a circular DHT) with Shortcuts
- Each peer keeps track of the IP addresses of its predecessor, successor, and shortcuts
- Reduced from 6 to 2 messages
- Possible to design shortcuts so that there are O(log N) neighbors and O(log N) messages per query
[Figure: “Who’s responsible for key 1110?” resolved over shortcut links]
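Both lookup styles from the Chord slides can be compared in a small simulation. This is a sketch under stated assumptions: the peer set is borrowed from the key-assignment example rather than read off the Chord figure, the shortcut rule (peer p links to the successors of p + 2^i) follows the usual Chord finger-table design that the slide only alludes to, and the class and method names are hypothetical.

```python
from bisect import bisect_left


class Ring:
    """Global view of a circular DHT, used only to simulate per-peer decisions."""

    def __init__(self, peers, n_bits):
        self.peers = sorted(peers)
        self.n_bits = n_bits
        self.size = 2 ** n_bits

    def successor(self, ident):
        """First peer at or after ident, wrapping around the ring."""
        i = bisect_left(self.peers, ident % self.size)
        return self.peers[i % len(self.peers)]

    def next_peer(self, peer):
        return self.successor(peer + 1)

    def prev_peer(self, peer):
        return self.peers[self.peers.index(peer) - 1]

    def owns(self, peer, key):
        """True if key lies in the circular interval (predecessor, peer]."""
        pred = self.prev_peer(peer)
        if pred == peer:  # single-peer ring owns everything
            return True
        return 0 < (key - pred) % self.size <= (peer - pred) % self.size

    def fingers(self, peer):
        """Shortcuts: successors of peer + 1, + 2, + 4, ... (O(log N) entries)."""
        return [self.successor(peer + 2 ** i) for i in range(self.n_bits)]

    def lookup(self, start, key, shortcuts=False):
        """Return (responsible peer, number of forwarded query messages)."""
        current, messages = start, 0
        while not self.owns(current, key):
            nxt = self.next_peer(current)  # default: hand to immediate successor
            if shortcuts:
                # Jump to the farthest shortcut that does not overshoot the key.
                for f in reversed(self.fingers(current)):
                    if 0 < (f - current) % self.size < (key - current) % self.size:
                        nxt = f
                        break
            current, messages = nxt, messages + 1
        return current, messages


ring = Ring([1, 3, 4, 5, 8, 10, 12, 14], n_bits=4)
print(ring.lookup(3, 0b1110))                  # successor-only forwarding
print(ring.lookup(3, 0b1110, shortcuts=True))  # with shortcuts
```

Each peer's decision uses only its own predecessor, successor, and shortcut list; the `Ring` object holds the global view purely so the simulation can compute those lists.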

Peer Churn
To handle peer churn, require each peer to know the IP address of its two successors. Each peer periodically pings its two successors to see if they are still alive.
- Peer 5 abruptly leaves
- Peer 4 detects; makes 8 its immediate successor; asks 8 who its immediate successor is; makes 8’s immediate successor its second successor
- What if peer 13 wants to join?
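The repair procedure for an abrupt departure can be sketched as below. It is a simplified model: pings are replaced by an `alive` flag, only one failure at a time is handled, and the peer set around 4, 5, and 8 is assumed from the earlier key-assignment example. All names are hypothetical.

```python
class Peer:
    def __init__(self, ident):
        self.ident = ident
        self.alive = True
        self.successors = []  # [immediate successor, second successor]


def link_ring(idents):
    """Build peers on a ring, each knowing its two successors."""
    ordered = sorted(idents)
    peers = {i: Peer(i) for i in ordered}
    n = len(ordered)
    for k, i in enumerate(ordered):
        peers[i].successors = [
            peers[ordered[(k + 1) % n]],
            peers[ordered[(k + 2) % n]],
        ]
    return peers


def check_successors(peer):
    """One periodic ping round for a single peer."""
    first, second = peer.successors
    if not first.alive:
        first = second  # promote the second successor
    # Ask the (possibly new) immediate successor who its immediate successor is.
    peer.successors = [first, first.successors[0]]


peers = link_ring([1, 3, 4, 5, 8, 10, 12, 14])
peers[5].alive = False            # peer 5 abruptly leaves
check_successors(peers[4])        # peer 4 detects and repairs
print([p.ident for p in peers[4].successors])
check_successors(peers[3])        # peer 3 refreshes its second successor
print([p.ident for p in peers[3].successors])
```

Keeping two successors is what makes the repair possible: with only one, peer 4 would have no live contact left on the ring after peer 5 disappears.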