2: Application Layer 1 CMPT 371 Data Communications and Networking Chapter 2 Application Layer - 2.

Presentation transcript:

2: Application Layer 1 CMPT 371 Data Communications and Networking Chapter 2 Application Layer - 2

2: Application Layer 2
Chapter 2 outline
- 2.1 Principles of app layer protocols
- 2.2 Web and HTTP
- 2.3 FTP
- 2.4 Electronic Mail
  - SMTP, POP3, IMAP
- 2.5 DNS
- 2.6 Content distribution
  - Network Web caching
  - Content distribution networks
  - P2P file sharing

2: Application Layer 3
Content Distribution
- Problem of a single server
  - Bottleneck, single point of failure, …
- Content distribution
  - Distribute (replicate) content to different places
  - Direct requests to the appropriate places

2: Application Layer 4
Client-side Caching

2: Application Layer 5
Limit of Client-side Caching
Not shared!

2: Application Layer 6
Web caches (proxy server)
Goal: satisfy client request without involving origin server
- user sets browser: Web accesses via proxy
- browser sends all HTTP requests to proxy
  - object in cache: cache returns object
  - else cache requests object from origin server, then returns object to client
- Proxy: both client and server; typically installed by ISP (university, company, residential ISP)
[Figure: clients exchange HTTP requests and responses with the proxy server, which exchanges HTTP requests and responses with the origin servers]

2: Application Layer 7
More about Web caching
Why Web caching?
- Reduce response time for client requests.
- Reduce traffic on an institution’s access link.
- An Internet dense with caches enables “poor” content providers to deliver content effectively.

2: Application Layer 8
More about Web caching
Why is caching effective for the Web, even if cache space is quite limited?
- Zipf distribution: a small number of popular objects accounts for a large share of the requests
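The Zipf argument can be made concrete with a small calculation. The sketch below assumes request popularity follows a Zipf law with exponent 1 (the exponent, the object count, and the function name `zipf_top_share` are illustrative assumptions, not values from the slides) and computes the share of requests that hit a cache holding only the most popular objects.

```python
def zipf_top_share(num_objects: int, cache_size: int, alpha: float = 1.0) -> float:
    """Fraction of requests that target the `cache_size` most popular objects.

    Assumes the object of popularity rank k is requested with probability
    proportional to 1 / k**alpha (a Zipf law; alpha = 1 is an assumption).
    """
    weights = [1.0 / (rank ** alpha) for rank in range(1, num_objects + 1)]
    return sum(weights[:cache_size]) / sum(weights)


if __name__ == "__main__":
    # Hypothetical workload: 10,000 distinct objects, cache holds only 100 of them.
    share = zipf_top_share(num_objects=10_000, cache_size=100)
    print(f"cache holding 1% of the objects serves {share:.0%} of the requests")
```

Under these assumptions a cache holding 1% of the objects can serve a bit over half of the requests, which is why a small cache can still be effective.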

2: Application Layer 9
Caching example (1)
Assumptions
- average object size = 100,000 bits
- avg. request rate from institution’s browsers to origin servers = 15/sec
- delay from institutional router to any origin server and back to router = 2 sec
Consequences
- utilization on LAN = 15%
- utilization on access link = 100%!
- total delay = Internet delay + access delay + LAN delay = 2 sec + some minutes + some milliseconds
[Figure labels: origin servers, public Internet, institutional network, 10 Mbps LAN, 1.5 Mbps access link, institutional cache]

2: Application Layer 10
Caching example (2)
Possible solution
- increase bandwidth of access link to, say, 10 Mbps
Consequences (if 10 Mbps)
- utilization on LAN = 15%
- utilization on access link = 15%
- total delay = Internet delay + access delay + LAN delay = 2 sec + some msecs + some msecs
- often a costly upgrade
[Figure labels: origin servers, public Internet, institutional network, 10 Mbps LAN, access link upgraded from 1.5 to 10 Mbps, institutional cache]

2: Application Layer 11
Caching example (3)
Install cache
- suppose hit rate is 0.4
Consequences
- 40% of requests will be satisfied almost immediately
- 60% of requests satisfied by origin server
- utilization of access link reduced to 60%, resulting in negligible delays (say 10 msec)
- total delay = Internet delay + access delay + LAN delay = 60% * 2 sec + 40% * 0.01 sec + some milliseconds < 1.3 sec
[Figure labels: origin servers, public Internet, institutional network, 10 Mbps LAN, 1.5 Mbps access link, institutional cache]
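The numbers in the three caching examples can be checked with a few lines of arithmetic. The sketch below uses the figures stated on the slides (100,000-bit objects, 15 requests/sec, 10 Mbps LAN, 1.5 Mbps access link, 2 sec Internet delay, hit rate 0.4, 10 msec for a hit); the function names are illustrative, and queueing delay on the links is ignored, as on the slide.

```python
OBJECT_BITS = 100_000        # average object size (from the slide)
REQUESTS_PER_SEC = 15        # average request rate (from the slide)
LAN_BPS = 10_000_000         # 10 Mbps LAN
ACCESS_BPS = 1_500_000       # 1.5 Mbps access link
INTERNET_DELAY_SEC = 2.0     # router -> origin server -> router
HIT_DELAY_SEC = 0.01         # "say 10 msec" for a request served by the cache


def utilization(link_bps: float, requests_per_sec: float) -> float:
    """Offered load divided by link capacity."""
    return requests_per_sec * OBJECT_BITS / link_bps


def avg_delay_with_cache(hit_rate: float) -> float:
    """Average delay when misses pay the Internet delay and hits are served locally."""
    return (1 - hit_rate) * INTERNET_DELAY_SEC + hit_rate * HIT_DELAY_SEC


if __name__ == "__main__":
    print("LAN utilization:         ", utilization(LAN_BPS, REQUESTS_PER_SEC))
    print("access link, no cache:   ", utilization(ACCESS_BPS, REQUESTS_PER_SEC))
    # With a cache, only the misses (60% of requests) cross the access link.
    print("access link, hit rate .4:", utilization(ACCESS_BPS, REQUESTS_PER_SEC * 0.6))
    print("avg delay, hit rate .4:  ", avg_delay_with_cache(0.4), "sec")
```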

2: Application Layer 12
More about Web caching
- Problems of Web caching
  - Extra space/machine (proxy)
  - Inconsistency (out-of-date objects, …)

2: Application Layer 13
Consistency of Cached Objects
- Solution 1: no caching

2: Application Layer 14
Consistency of Cached Objects
- Solution 2: manually update

2: Application Layer 15
Conditional GET
- Goal: don’t send object if client has up-to-date cached version
- client: specify date of cached copy in HTTP request
  - If-modified-since: <date>
- server: response contains no object if cached copy is up to date
  - 304 Not Modified
[Figure: client sends HTTP request msg with If-modified-since: <date>. If the object is not modified, the server responds with 304 Not Modified and no object; if the object is modified, the server responds with 200 OK and the object.]
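The server-side decision in a conditional GET can be sketched as a single function. This is a minimal illustration, not a real HTTP server: the function name `conditional_get` and its parameters are assumptions made for the example, and it only handles the If-modified-since header using the standard library's HTTP date helpers.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Optional, Tuple


def conditional_get(last_modified: datetime,
                    if_modified_since: Optional[str],
                    body: bytes) -> Tuple[int, bytes]:
    """Return (status code, payload) for a GET that may carry If-modified-since.

    `last_modified` is the time the object last changed on the server (UTC).
    `if_modified_since` is the raw header value sent by the client, or None.
    """
    if if_modified_since is not None:
        cached_at = parsedate_to_datetime(if_modified_since)
        if last_modified <= cached_at:
            return 304, b""      # cached copy is up to date: send no object
    return 200, body             # no cached copy, or it is stale: send object


if __name__ == "__main__":
    modified = datetime(2015, 9, 9, 9, 23, 24, tzinfo=timezone.utc)
    header = format_datetime(modified, usegmt=True)   # HTTP-style date string
    print(conditional_get(modified, header, b"<html>...</html>"))  # 304, no body
    print(conditional_get(modified, None, b"<html>...</html>"))    # 200 with body
```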

2: Application Layer 16
Hierarchical cache
- How to measure a cache?
  - Hit Ratio (HR)
- Calculation
  - HR of Proxy 1 = 90%
  - HR of Proxy 2 = 95%
  - Independent
  - Joint HR = ? 99.5%, since a request misses both proxies with probability (1 - 0.90)(1 - 0.95) = 0.005
[Figure: clients, Proxy Server 1, Proxy Server 2, origin server]

2: Application Layer 17
Hierarchical cache
- How to measure a cache?
  - Hit Ratio (HR)
- Calculation
  - HR of Proxy 1 = 90%
  - HR of Proxy 2 = 95%
  - Independent
  - Joint HR = ?
- How about average delay?
[Figure: clients, Proxy Server 1, Proxy Server 2, origin server]
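Both questions on the hierarchical cache slides reduce to short probability calculations. The sketch below computes the joint hit ratio for independent proxies and an average delay; the three delay values are purely hypothetical assumptions for illustration, since the slides give none.

```python
def joint_hit_ratio(*hit_ratios: float) -> float:
    """Probability that at least one cache in the chain hits (independent caches)."""
    miss_all = 1.0
    for hr in hit_ratios:
        miss_all *= (1.0 - hr)
    return 1.0 - miss_all


def average_delay(hr1: float, hr2: float,
                  delay_proxy1: float, delay_proxy2: float,
                  delay_origin: float) -> float:
    """Expected delay when proxy 1 is tried first, then proxy 2, then the origin.

    Each delay argument is the total delay seen by the client when the request
    is finally answered at that level (assumed values, not from the slides).
    """
    p1 = hr1                          # answered by proxy 1
    p2 = (1 - hr1) * hr2              # missed proxy 1, answered by proxy 2
    p0 = (1 - hr1) * (1 - hr2)        # missed both, answered by origin server
    return p1 * delay_proxy1 + p2 * delay_proxy2 + p0 * delay_origin


if __name__ == "__main__":
    print(joint_hit_ratio(0.90, 0.95))
    # Hypothetical delays: 10 ms at proxy 1, 100 ms at proxy 2, 2 s at the origin.
    print(average_delay(0.90, 0.95, 0.01, 0.1, 2.0), "sec")
```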

2: Application Layer 18
Content distribution networks (CDNs)
Ping
What did you see?
[Figure: origin server in North America, CDN distribution node, CDN servers in S. America, Europe, and Asia]

2: Application Layer 19
Content distribution networks (CDNs)
Content replication
- CDN company (e.g., Akamai) installs hundreds of CDN servers throughout the Internet
  - in lower-tier ISPs, close to users
- Content providers (e.g., Netflix) are the CDN company’s customers.
- CDN replicates its customers’ content in CDN servers. When a provider updates content, the CDN updates its servers.
[Figure: origin server in North America, CDN distribution node, CDN servers in S. America, Europe, and Asia]

2: Application Layer 20 CDN example origin server r r distributes HTML r Replaces: with h ttp:// HTTP request for DNS query for HTTP request for Origin server CDNs authoritative DNS server Nearby CDN server CDN company r cdn.com r distributes gif files r uses its authoritative DNS server to route redirect requests

2: Application Layer 21
More about CDNs
Routing requests
- CDN creates a “map” indicating distances between leaf ISPs and CDN nodes
- when query arrives at authoritative DNS server:
  - server determines ISP from which query originates
  - uses “map” to determine best CDN server
Caching vs. CDN
- Caching is pull: passive
- CDN is push: active

2: Application Layer 22
Client-server architecture
server:
- always-on host
- permanent IP address
- server farms for scaling
clients:
- communicate with server
- may be intermittently connected
- may have dynamic IP addresses
- do not communicate directly with each other
[Figure: client/server]

2: Application Layer 23
Pure P2P architecture
- no always-on server
- arbitrary end systems directly communicate
- peers are intermittently connected and change IP addresses
- self scalability: new peers bring new resources
- Three topics:
  - File distribution
  - Searching for information
  - Case study: BitTorrent, Skype
[Figure: peer-peer]

2: Application Layer 24
P2P file sharing
Example
- Alice runs P2P client application on her notebook computer
  - Intermittently connects to Internet
- Asks for “X.mp3”
- Application displays other peers that have a copy of X.mp3.
- Alice chooses one of the peers, Bob.
- File is copied from Bob’s PC to Alice’s notebook: HTTP
- While Alice downloads, other users can download from Alice.
  - Alice’s peer is both a Web client and a transient Web server
All peers are servers = highly scalable!

2: Application Layer 25
File Distribution: Server-Client vs P2P
Question: How much time does it take to distribute a file from one server to N peers?
- $F$: file size
- $u_s$: server upload bandwidth
- $u_i$: peer $i$ upload bandwidth
- $d_i$: peer $i$ download bandwidth
[Figure: server with upload rate $u_s$ and peers with rates $u_1, d_1, u_2, d_2, \ldots, u_N, d_N$, all attached to a network with abundant bandwidth]

2: Application Layer 26
File distribution time: server-client
- Server transmission: must sequentially send (upload) N copies
  - $NF/u_s$ time
- Client: each must download the file
  - client $i$ takes $F/d_i$ time to download
Time to distribute $F$ to $N$ clients using the client/server approach:
$D_{cs} = \max\{ NF/u_s,\ F/\min_i d_i \}$
This increases linearly in $N$ (for large $N$).
[Figure: server with upload rate $u_s$ and peers with rates $u_i, d_i$ attached to a network with abundant bandwidth]

2: Application Layer 27
File distribution time: P2P
- Server transmission: must upload at least one copy
  - $F/u_s$ time
- Client: each must download a copy
  - client $i$ takes $F/d_i$ time to download
  - but also shares (uploads)
- Clients (peers): as a whole must download $NF$ bits
  - fastest possible overall download rate (bounded by the total upload capacity): $u_s + \sum_i u_i$
$D_{P2P} = \max\{ F/u_s,\ F/\min_i d_i,\ NF/(u_s + \sum_i u_i) \}$
Increases linearly in $N$ (for large $N$)?
[Figure: server with upload rate $u_s$ and peers with rates $u_i, d_i$ attached to a network with abundant bandwidth]

2: Application Layer 28
Server-client vs. P2P: example
Client upload rate = $u$, $F/u$ = 1 hour, $u_s = 10u$, $d_{min} \ge u_s$
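The two distribution-time bounds can be evaluated directly for the example parameters. The sketch below implements $D_{cs}$ and $D_{P2P}$ as defined on the preceding slides; the function names are illustrative, rates are expressed in files per hour so that $F/u$ = 1 hour, and $d_{min}$ is set equal to $u_s$ (the slide only requires $d_{min} \ge u_s$).

```python
from typing import Sequence


def d_client_server(file_size: float, server_up: float,
                    downloads: Sequence[float]) -> float:
    """D_cs = max{ N*F/u_s, F/min(d_i) }."""
    n = len(downloads)
    return max(n * file_size / server_up, file_size / min(downloads))


def d_p2p(file_size: float, server_up: float,
          downloads: Sequence[float], uploads: Sequence[float]) -> float:
    """D_P2P = max{ F/u_s, F/min(d_i), N*F/(u_s + sum(u_i)) }."""
    n = len(downloads)
    return max(file_size / server_up,
               file_size / min(downloads),
               n * file_size / (server_up + sum(uploads)))


if __name__ == "__main__":
    F, u = 1.0, 1.0            # units chosen so that F/u = 1 hour
    u_s = 10 * u               # server upload rate from the slide
    for n in (5, 10, 20, 30):
        downloads = [u_s] * n  # assumption: d_min = u_s
        uploads = [u] * n
        print(n, d_client_server(F, u_s, downloads),
              d_p2p(F, u_s, downloads, uploads))
```

With these assumptions, 30 peers need 3 hours under client-server but only 0.75 hours under P2P, and the P2P time stays below 1 hour no matter how large N becomes.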

2: Application Layer 29
File distribution: BitTorrent
- P2P file distribution
- tracker: tracks peers participating in torrent
- torrent: group of peers exchanging chunks of a file
[Figure: Alice arrives, obtains list of peers from tracker, and begins exchanging file chunks with peers in torrent]

2: Application Layer 30
BitTorrent (1)
- file divided into 256KB chunks.
- peer joining torrent:
  - has no chunks, but will accumulate them over time
  - registers with tracker to get list of peers, connects to subset of peers (“neighbors”)
- while downloading, peer uploads chunks to other peers.
- peers may come and go: churn
- once peer has entire file, it may (selfishly) leave or (altruistically) remain

2: Application Layer 31
BitTorrent (2)
Requesting chunks
- at any given time, different peers have different subsets of file chunks
- periodically, a peer (Alice) asks each neighbor for the list of chunks that they have.
- Alice sends requests for her missing chunks
  - rarest first
Sending chunks: tit-for-tat
- Alice sends chunks to the four neighbors currently sending her chunks at the highest rate
  - re-evaluate top 4 every 10 secs
- every 30 secs: randomly select another peer and start sending it chunks
  - newly chosen peer may join top 4
  - “optimistically unchoke”
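The two selection rules on this slide, rarest first and top-four unchoking, can be sketched with plain dictionaries. This is an illustration of the policies only, not of the BitTorrent wire protocol; the function names, peer names, and rates are hypothetical, and ties are broken by chunk index or peer name simply to make the output deterministic.

```python
from collections import Counter
from typing import Dict, List, Set


def rarest_first(my_chunks: Set[int],
                 neighbor_chunks: Dict[str, Set[int]]) -> List[int]:
    """Order the chunks Alice is missing by how few neighbors hold them."""
    counts = Counter(c for chunks in neighbor_chunks.values() for c in chunks)
    missing = [c for c in counts if c not in my_chunks]
    return sorted(missing, key=lambda c: (counts[c], c))


def top_uploaders(rates: Dict[str, float], k: int = 4) -> List[str]:
    """Neighbors currently sending Alice chunks at the highest rates (tit-for-tat)."""
    return sorted(rates, key=lambda peer: (-rates[peer], peer))[:k]


if __name__ == "__main__":
    neighbors = {"bob": {0, 1, 2}, "carol": {1, 2}, "dave": {2, 3}}
    print(rarest_first({2}, neighbors))       # chunks held by one neighbor come first
    rates = {"bob": 5.0, "carol": 9.0, "dave": 1.0, "erin": 7.0, "frank": 3.0}
    print(top_uploaders(rates))               # the four peers Alice uploads to
```

In a fuller sketch, the top-four list would be recomputed every 10 seconds and one additional randomly chosen peer would be unchoked every 30 seconds, as the slide describes.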

2: Application Layer 32
BitTorrent: Tit-for-tat
(1) Alice “optimistically unchokes” Bob
(2) Alice becomes one of Bob’s top-four providers; Bob reciprocates
(3) Bob becomes one of Alice’s top-four providers
With a higher upload rate, a peer can find better trading partners and get the file faster!

2: Application Layer 33
P2P Case study: Skype
- inherently P2P: pairs of users communicate.
- proprietary application-layer protocol (inferred via reverse engineering)
- hierarchical overlay with supernodes (SNs)
- Index maps usernames to IP addresses; distributed over SNs
[Figure: Skype clients (SC), supernodes (SN), Skype login server]

2: Application Layer 34
Peers as relays
- Problem when both Alice and Bob are behind “NATs”.
  - NAT prevents an outside peer from initiating a call to an inside peer (see later)
- Solution:
  - Using Alice’s and Bob’s SNs, a relay is chosen
  - Each peer initiates a session with the relay.
  - Peers can now communicate through NATs via the relay

2: Application Layer 35
P2P: searching for information
Index in P2P system: maps information to peer location (location = IP address & port number)
- So many files
- But where are they?

2: Application Layer 36
P2P: centralized directory
Original “Napster” design
1) when peer connects, it informs central server:
  - IP address
  - content
2) Alice queries for “X.mp3”
3) Alice requests file from Bob
[Figure: centralized directory server and peers, including Alice and Bob]
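The three steps of the centralized design map onto a very small index data structure. The sketch below is an in-memory illustration only: the class name `DirectoryServer`, its methods, and the addresses are hypothetical, and it is not a description of Napster's actual protocol.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

PeerAddr = Tuple[str, int]   # (IP address, port number)


class DirectoryServer:
    """Central index mapping a file name to the peers that currently hold it."""

    def __init__(self) -> None:
        self._index: Dict[str, Set[PeerAddr]] = defaultdict(set)

    def register(self, peer: PeerAddr, files: Iterable[str]) -> None:
        """Step 1: a connecting peer reports its address and content."""
        for name in files:
            self._index[name].add(peer)

    def unregister(self, peer: PeerAddr) -> None:
        """Remove a peer that has disconnected."""
        for holders in self._index.values():
            holders.discard(peer)

    def query(self, name: str) -> List[PeerAddr]:
        """Step 2: return the peers that hold `name`."""
        return sorted(self._index.get(name, ()))


if __name__ == "__main__":
    directory = DirectoryServer()
    bob = ("10.0.0.2", 6699)                  # hypothetical address
    directory.register(bob, ["X.mp3"])
    print(directory.query("X.mp3"))           # Alice learns Bob's location
    # Step 3 (the file transfer itself) happens directly between Alice and Bob.
```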

2: Application Layer 37
P2P: problems with centralized directory
- Single point of failure
- Performance bottleneck
- Copyright infringement
File transfer is decentralized, but locating content is highly centralized.

2: Application Layer 38
P2P: decentralized directory
- Each peer is either a group leader or assigned to a group leader.
- Group leader tracks the content in all its children.
- Peer queries group leader; group leader may query other group leaders.

2: Application Layer 39
More about decentralized directory
Advantages of approach
- no centralized directory server
  - location service distributed over peers
  - more difficult to shut down
Disadvantages of approach
- bootstrap node needed
- group leaders can get overloaded

2: Application Layer 40
P2P: Query flooding
- Gnutella
- no hierarchy
- use bootstrap node to learn about others
- join message
- Send query to neighbors
- Neighbors forward query
- If queried peer has object, it sends message back to querying peer

2: Application Layer 41
P2P: more on query flooding
Pros
- peers have similar responsibilities: no group leaders
- highly decentralized
- no peer maintains directory info
Cons
- excessive query traffic
- query radius: a query may not find content even when it is present
- bootstrap node
- maintenance of overlay network
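The query-radius drawback can be seen in a small simulation of flooding with a hop limit. The sketch below models the overlay as an adjacency dictionary and floods a query breadth-first up to `ttl` hops; the function name, the overlay, and the peer names are hypothetical, and real Gnutella message handling is not modelled.

```python
from collections import deque
from typing import Dict, List, Set


def flood_query(overlay: Dict[str, List[str]], holders: Set[str],
                start: str, ttl: int) -> Set[str]:
    """Return the peers holding the object that a query from `start` reaches.

    `overlay` maps each peer to its neighbors; the query is forwarded at most
    `ttl` hops (the query radius).
    """
    seen = {start}
    frontier = deque([(start, 0)])
    hits: Set[str] = set()
    while frontier:
        peer, hops = frontier.popleft()
        if peer in holders and peer != start:
            hits.add(peer)               # this peer sends a message back
        if hops == ttl:
            continue                     # radius exhausted: do not forward
        for neighbor in overlay[peer]:
            if neighbor not in seen:     # each peer handles a query once
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return hits


if __name__ == "__main__":
    # Hypothetical chain overlay: alice - bob - carol - dave; only dave has the file.
    overlay = {"alice": ["bob"], "bob": ["alice", "carol"],
               "carol": ["bob", "dave"], "dave": ["carol"]}
    print(flood_query(overlay, {"dave"}, "alice", ttl=2))   # radius too small
    print(flood_query(overlay, {"dave"}, "alice", ttl=3))   # dave is found
```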

2: Application Layer 42
DHT: A New Story …
- Motivation: frustrated by the popularity of all these “half-baked” P2P apps
  - We can do better!
  - Guaranteed lookup success for files in system
  - Provable bounds on search time
  - Provable scalability to millions of nodes

2: Application Layer 43
P2P: Content Addressing (Hash Routing)
Hash routing
- Given an object identifier I, calculate its hash value H = hash(I), and (hopefully) find it (or its location info) in peer H
- Not a new idea
  - Load balancing: hash IP address, redirect to different servers
[Figure: application issuing put(key, data) and get(key) → data to a hash table spread across nodes]

2: Application Layer 44
Hash Routing
- Two alternatives
  - Node can cache each (existing) object that hashes within its range
  - Pointer-based: level of indirection; node caches pointer to location(s) of object
- What’s new in P2P?
  - Dynamic overlay: peers join/leave, so the number of peers is not fixed
  - Traditional hash function doesn’t work
[Slide label: SHA]
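One way to see why a traditional hash-to-server mapping struggles when the number of peers changes is to count how many objects change owner. The sketch below assumes the traditional scheme is "hash the name, take it modulo the number of peers" (an assumption about what the slide means), uses SHA-1 from the standard library, and uses hypothetical file names.

```python
import hashlib
from typing import Sequence


def object_id(name: str) -> int:
    """Hash an object identifier to a large integer with SHA-1."""
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


def mod_n_owner(name: str, num_peers: int) -> int:
    """Traditional placement: peer index = hash(name) mod N."""
    return object_id(name) % num_peers


def fraction_moved(names: Sequence[str], old_n: int, new_n: int) -> float:
    """Share of objects whose owner changes when N goes from old_n to new_n."""
    moved = sum(1 for name in names
                if mod_n_owner(name, old_n) != mod_n_owner(name, new_n))
    return moved / len(names)


if __name__ == "__main__":
    names = [f"file{i}.mp3" for i in range(2000)]      # hypothetical objects
    print(f"{fraction_moved(names, 10, 11):.0%} of objects move when one peer joins")
```

With mod-N placement, most objects move when a single peer joins, which is the cost that consistent hashing is designed to avoid.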

2: Application Layer 45
Distributed Hash Table (DHT)
Challenges
- For each object, node(s) whose range(s) cover that object must be reachable via a “short” path
- Number of neighbors for each node should scale well (e.g., should not be O(N))
- Fully distributed (no centralized bottleneck/single point of failure)
- DHT mechanism should gracefully handle nodes joining/leaving
  - need to repartition the range space over existing nodes
  - need to reorganize neighbor set
  - need bootstrap mechanism to connect new nodes into the existing DHT infrastructure

2: Application Layer 46
Case Studies
- Structured overlay (P2P) systems: consistent hashing
  - Chord
  - CAN (Content Addressable Network)
- Key questions
  - Q1: How is the hash space divided “evenly” among existing nodes?
  - Q2: How is routing implemented that connects an arbitrary node to the node responsible for a given object?
  - Q3: How is the hash space repartitioned when nodes join/leave?
- Let N be the number of nodes in the overlay
- Let H be the size of the range of the hash function (when applicable)

2: Application Layer 47
Chord (from MIT in 2001)
- Associate to each node and file a unique id in a one-dimensional space (a ring)
  - E.g., pick from the range $[0, 2^m - 1]$
  - Usually the hash of the file or IP address
- Properties:
  - Routing table size is O(log N), where N is the total number of nodes
  - Guarantees that a file is found in O(log N) hops

2: Application Layer 48
Consistent Hashing
- A key is stored at its successor: the node with the next higher ID
- Key: hashed value of a file identifier
[Figure: circular ID space with nodes N32, N90, N105 and keys K5, K20, K80]
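The successor rule can be sketched with a sorted list and a binary search. The example below uses the node and key IDs from the figure (N32, N90, N105 and K5, K20, K80); the function name `successor` is illustrative, and a key equal to a node ID is assumed to be stored at that node.

```python
from bisect import bisect_left
from typing import Iterable


def successor(node_ids: Iterable[int], key: int) -> int:
    """Return the first node ID that is >= key, wrapping around the ring."""
    ring = sorted(node_ids)
    index = bisect_left(ring, key)
    return ring[index % len(ring)]       # past the last node: wrap to the first


if __name__ == "__main__":
    nodes = [32, 90, 105]
    for key in (5, 20, 80):
        print(f"K{key} is stored at N{successor(nodes, key)}")
    # A key beyond the highest node ID wraps around to the lowest node.
    print(f"K110 is stored at N{successor(nodes, 110)}")
```

When a node joins or leaves, only the keys between it and its neighbor on the ring change owner, which is the property that makes this scheme suitable for a dynamic overlay.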

2: Application Layer 49
Chord Basic Lookup
[Figure: ring with nodes N10, N32, N60, N90, N105, N120 and key K80; query “Where is key 80?”, answer “N90 has K80”]

2: Application Layer 50
Chord “Finger Table”
- Entry $i$ in the finger table of node $n$ is the first node that succeeds or equals $n + 2^i$
- In other words, with an ID space of size $2^m$, the $i$th finger points $1/2^{m-i}$ of the way around the ring
[Figure: fingers of N80 spanning 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128 of the ring]

2: Application Layer 51
Chord Join
- Assume a hash space [0..7]
- Node n1 joins
[Figure: successor table of n1 with columns i, id+2^i, succ; table entries not recoverable]

2: Application Layer 52
Chord Join
- Node n2 joins
[Figure: successor tables of n1 and n2 with columns i, id+2^i, succ; table entries not recoverable]

2: Application Layer 53
Chord Join
- Nodes n0, n6 join
[Figure: successor tables of the four nodes with columns i, id+2^i, succ; table entries not recoverable]

2: Application Layer 54
Chord Join
- Nodes: n1, n2, n0, n6
- Keys: f7, f
[Figure: successor tables of the four nodes with columns i, id+2^i, succ, and key labels 7 and 1; table entries not recoverable]
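The successor tables in the join example can be recomputed from the rule that entry i points to the first node that succeeds or equals id + 2^i. The sketch below does this for the final state (nodes n0, n1, n2, n6 in the hash space [0..7], so m = 3); it computes the tables from scratch rather than modelling the incremental join protocol, and the function name is illustrative.

```python
from typing import Iterable, List, Tuple


def finger_table(node: int, nodes: Iterable[int], m: int) -> List[Tuple[int, int, int]]:
    """Rows (i, id + 2^i mod 2^m, succ) of the successor table of `node`."""
    ring = sorted(nodes)
    size = 2 ** m
    table = []
    for i in range(m):
        start = (node + 2 ** i) % size
        # First node whose ID is >= start, wrapping around to the lowest ID.
        succ = next((n for n in ring if n >= start), ring[0])
        table.append((i, start, succ))
    return table


if __name__ == "__main__":
    nodes = [0, 1, 2, 6]
    for node in nodes:
        print(f"n{node}:")
        for i, start, succ in finger_table(node, nodes, m=3):
            print(f"  i={i}  id+2^i={start}  succ={succ}")
```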

2: Application Layer 55
Chord Routing
- Upon receiving a query for a file id, a node first calculates the key (hash of the id)
- Checks whether it stores the key locally
- If not, forwards the query to the largest node in its successor table that does not exceed the key
[Figure: query(7) routed using the nodes’ successor tables (columns i, id+2^i, succ)]
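The routing rule can be traced on the four-node example. The sketch below is a centralized simulation under stated assumptions: "does not exceed the key" is measured clockwise from the current node, a node stores the keys between its predecessor (exclusive) and itself (inclusive), and a node with no suitable table entry forwards to its immediate successor. The helper names are illustrative, and stabilization and failures are not modelled.

```python
from typing import Iterable, List


def successor_table(node: int, ring: List[int], size: int, m: int) -> List[int]:
    """succ column of the node's table: first node >= node + 2^i (wrapping)."""
    starts = [(node + 2 ** i) % size for i in range(m)]
    return [next((n for n in ring if n >= s), ring[0]) for s in starts]


def stores_key(node: int, ring: List[int], key: int, size: int) -> bool:
    """True if key lies in (predecessor, node] on the ring."""
    pred = ring[ring.index(node) - 1]
    if pred == node:                      # single-node ring owns every key
        return True
    return 0 < (key - pred) % size <= (node - pred) % size


def lookup(start: int, key: int, nodes: Iterable[int], m: int) -> List[int]:
    """Return the list of nodes visited while routing a query for `key`."""
    ring, size = sorted(nodes), 2 ** m
    path, node = [start], start
    while not stores_key(node, ring, key, size):
        table = successor_table(node, ring, size, m)
        to_key = (key - node) % size      # clockwise distance to the key
        # Table entries that do not overshoot the key, measured clockwise.
        candidates = [n for n in table if 0 < (n - node) % size <= to_key]
        if candidates:
            node = max(candidates, key=lambda n: (n - node) % size)
        else:
            node = table[0]               # immediate successor owns the key
        path.append(node)
    return path


if __name__ == "__main__":
    print(lookup(start=1, key=7, nodes=[0, 1, 2, 6], m=3))   # query(7) from n1
```

For query(7) issued at n1, the query goes to n6 (the largest table entry not past key 7) and then to n0, which is the successor of key 7 on this ring.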

2: Application Layer 56
Chord Summary
- Routing table size?
  - log N fingers
- Routing time?
  - Each hop is expected to halve the distance to the desired key, so expect O(log N) hops.
Note: this covers only the basic Chord; many practical issues remain (not covered in this course).

2: Application Layer 57
A few words about BitCoin (and other digital/virtual currency)
- Two key issues for a currency
  - Generation (where does it come from?)
  - Distribution (how to use it, i.e., buy/sell transactions?)
- BitCoin: open source P2P currency
  - Mining (hashing)
  - Verification

2: Application Layer 58
Chapter 2: Summary
Our study of network apps is now complete!
- application service requirements:
  - reliability, bandwidth, delay
- client-server paradigm
- Internet transport service model
  - connection-oriented, reliable: TCP
  - unreliable, datagrams: UDP
- specific protocols:
  - HTTP
  - FTP
  - SMTP, POP, IMAP
  - DNS
- content distribution
  - caches, CDNs
  - P2P

2: Application Layer 59
Chapter 2: Summary
More importantly: learned about protocols
- typical request/reply message exchange:
  - client requests info or service
  - server responds with data, status code
- message formats:
  - headers: fields giving info about data
  - data: info being communicated
- control vs. data msgs
  - in-band, out-of-band
- centralized vs. decentralized
- stateless vs. stateful
- reliable vs. unreliable msg transfer
- “complexity at network edge” – many protocols
- security: authentication