2: Application Layer 1 CMPT 371 Data Communications and Networking Chapter 2 Application Layer - 2.

1 CMPT 371 Data Communications and Networking, Chapter 2: Application Layer - 2

2 Chapter 2 outline
- 2.1 Principles of app layer protocols
- 2.2 Web and HTTP
- 2.3 FTP
- 2.4 Electronic Mail
  - SMTP, POP3, IMAP
- 2.5 DNS
- 2.6 Content distribution
  - Network Web caching
  - Content distribution networks
  - P2P file sharing

3 Content Distribution
- Problem of a single server
  - Bottleneck, single point of failure, …
- Content Distribution
  - Distribute (replicate) contents at different places
  - Direct requests to appropriate places

4 Client-side Caching

5 Limit of Client-side Caching
Not shared!

6 Web caches (proxy server)
Goal: satisfy client request without involving origin server
- user sets browser: Web accesses via proxy
- browser sends all HTTP requests to proxy
  - object in cache: cache returns object
  - else cache requests object from origin server, then returns object to client
- Proxy: both client and server; typically installed by ISP (university, company, residential ISP)
[Figure: clients exchange HTTP requests and responses with the proxy server, which in turn exchanges HTTP requests and responses with the origin servers]

7 More about Web caching
Why Web caching?
- Reduce response time for client requests.
- Reduce traffic on an institution’s access link.
- An Internet dense with caches enables “poor” content providers to deliver content effectively.

8 More about Web caching
Why is caching effective for the Web, even if cache space is quite limited?
- Zipf distribution: a small number of popular objects accounts for a large share of the requests

9 Caching example (1)
Assumptions
- average object size = 100,000 bits
- avg. request rate from institution’s browsers to origin servers = 15/sec
- delay from institutional router to any origin server and back to router = 2 sec
Consequences
- utilization on LAN = 15%
- utilization on access link = 100%!
- total delay = Internet delay + access delay + LAN delay = 2 sec + some minutes + some milliseconds
[Figure: institutional network (10 Mbps LAN, institutional cache) connected by a 1.5 Mbps access link to the public Internet and the origin servers]

10 Caching example (2)
Possible solution
- increase bandwidth of access link to, say, 10 Mbps
Consequences (if 10 Mbps)
- utilization on LAN = 15%
- utilization on access link = 15%
- total delay = Internet delay + access delay + LAN delay = 2 sec + some msecs + some msecs
- often a costly upgrade
[Figure: institutional network (10 Mbps LAN, institutional cache) connected to the public Internet and the origin servers by an access link upgraded from 1.5 to 10 Mbps]

11 Caching example (3)
Install cache
- suppose hit rate is 0.4
Consequence
- 40% of requests will be satisfied almost immediately
- 60% of requests satisfied by origin server
- utilization of access link reduced to 60%, resulting in negligible delays (say 10 msec)
- total delay = Internet delay + access delay + LAN delay = 60% * 2 sec + 40% * 0.01 secs + some milliseconds < 1.3 secs
[Figure: institutional network (10 Mbps LAN, institutional cache) connected by a 1.5 Mbps access link to the public Internet and the origin servers]
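
The arithmetic behind the three caching-example slides can be checked with a short sketch. The constants are the figures quoted on the slides; the function and variable names are illustrative only, and the delay formula is the simple hit/miss weighted mean the slide uses, not a queueing model.

```python
# Figures from the caching example; helper names are illustrative.
OBJECT_BITS = 100_000        # average object size
REQUESTS_PER_SEC = 15        # request rate from the institution's browsers
LAN_BPS = 10_000_000         # 10 Mbps LAN
ACCESS_BPS = 1_500_000       # 1.5 Mbps access link
INTERNET_DELAY_S = 2.0       # router -> origin server -> router


def utilization(request_rate, object_bits, link_bps):
    """Traffic intensity: offered bits per second divided by link capacity."""
    return request_rate * object_bits / link_bps


def average_delay_with_cache(hit_rate, miss_delay_s, hit_delay_s):
    """Average delay as the hit/miss weighted mean used on the slide."""
    return (1 - hit_rate) * miss_delay_s + hit_rate * hit_delay_s


lan_util = utilization(REQUESTS_PER_SEC, OBJECT_BITS, LAN_BPS)
access_util = utilization(REQUESTS_PER_SEC, OBJECT_BITS, ACCESS_BPS)
# With a 0.4 hit rate only 60% of the requests cross the access link.
access_util_cached = utilization(0.6 * REQUESTS_PER_SEC, OBJECT_BITS, ACCESS_BPS)
avg_delay = average_delay_with_cache(0.4, INTERNET_DELAY_S, 0.01)

print(f"LAN {lan_util:.0%}, access {access_util:.0%}, "
      f"access with cache {access_util_cached:.0%}, delay {avg_delay:.3f} s")
```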

12 More about Web caching
- Problem of Web caching
  - Extra space/machine (proxy)
  - Inconsistency (out-of-date objects…)

13 Consistency of Cached Objects
- Solution 1: no caching

14 Consistency of Cached Objects
- Solution 2: Manually update

15 Conditional GET
- Goal: don’t send object if client has up-to-date cached version
- client: specify date of cached copy in HTTP request
    If-modified-since: <date>
- server: response contains no object if cached copy is up-to-date:
    HTTP/1.0 304 Not Modified
[Figure: two client/server exchanges. Object not modified: HTTP request msg with If-modified-since: <date>, HTTP response HTTP/1.0 304 Not Modified. Object modified: HTTP request msg with If-modified-since: <date>, HTTP response HTTP/1.0 200 OK]
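
The server-side decision on this slide can be sketched as follows. This is a minimal illustration, not a real HTTP server: `conditional_get_status` is a hypothetical helper, and the dates are made up. Only the header name and the two status lines come from the slide.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get_status(last_modified, if_modified_since=None):
    """Return the status line a server would send for a (conditional) GET.

    last_modified: timezone-aware datetime of the object's last change.
    if_modified_since: raw header value from the request, or None.
    """
    if if_modified_since is not None:
        cached_at = parsedate_to_datetime(if_modified_since)
        # HTTP dates have one-second resolution, so drop microseconds.
        if last_modified.replace(microsecond=0) <= cached_at:
            return "HTTP/1.0 304 Not Modified"   # no object in the response
    return "HTTP/1.0 200 OK"                     # object follows in the body


# Hypothetical timestamps: the object changed on 10 Jan 2024.
modified = datetime(2024, 1, 10, 12, 0, 0, tzinfo=timezone.utc)
fresh_header = format_datetime(datetime(2024, 1, 11, tzinfo=timezone.utc), usegmt=True)
stale_header = format_datetime(datetime(2024, 1, 9, tzinfo=timezone.utc), usegmt=True)

print(conditional_get_status(modified, fresh_header))
print(conditional_get_status(modified, stale_header))
```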

16 Hierarchical cache
- How to measure cache?
  - Hit Ratio (HR)
- Calculation
  - HR of Proxy 1 = 90%
  - HR of Proxy 2 = 95%
  - Independent
  - Joint HR = ? $1 - (1 - 0.90)(1 - 0.95) = 99.5\%$
[Figure: clients, Proxy Server 1, Proxy Server 2, origin server]

17 Hierarchical cache
- How to measure cache?
  - Hit Ratio (HR)
- Calculation
  - HR of Proxy 1 = 90%
  - HR of Proxy 2 = 95%
  - Independent
  - Joint HR = ?
How about average delay?
[Figure: clients, Proxy Server 1, Proxy Server 2, origin server]
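
One way to approach both questions is sketched below. It assumes the two proxies sit in a chain (a miss at Proxy 1 falls through to Proxy 2, then to the origin server) and that hits are independent, as the slide states. The per-level delays in the usage lines are invented for illustration and do not come from the slides.

```python
def joint_hit_ratio(hit_ratios):
    """Probability that at least one cache in the chain holds the object,
    assuming the caches hit independently."""
    miss_all = 1.0
    for hr in hit_ratios:
        miss_all *= 1.0 - hr
    return 1.0 - miss_all


def average_delay(hit_ratios, cache_delays, origin_delay):
    """Expected delay when a miss at one level falls through to the next.

    cache_delays[k] is the (assumed) cost of consulting cache k;
    origin_delay is the (assumed) cost of going to the origin server.
    """
    expected, reach_prob = 0.0, 1.0
    for hr, delay in zip(hit_ratios, cache_delays):
        expected += reach_prob * delay      # pay for every cache we reach
        reach_prob *= 1.0 - hr              # continue only on a miss
    return expected + reach_prob * origin_delay


joint = joint_hit_ratio([0.90, 0.95])
# Hypothetical delays: 10 ms at Proxy 1, 50 ms at Proxy 2, 2 s to the origin.
delay = average_delay([0.90, 0.95], [0.01, 0.05], 2.0)
print(f"joint HR = {joint:.3f}, average delay = {delay:.3f} s")
```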

18 Content distribution networks (CDNs)
Ping www.Microsoft.com, www.Netflix.com, www.ibm.com, www.apple.com. What did you see?
[Figure: origin server in North America, CDN distribution node, CDN servers in S. America, Europe and Asia]

19 Content distribution networks (CDNs)
Content replication
- CDN company (e.g., Akamai) installs hundreds of CDN servers throughout Internet
  - in lower-tier ISPs, close to users
- Content providers (e.g., Netflix) are the CDN company’s customers.
- CDN replicates its customers’ content in CDN servers. When provider updates content, CDN updates servers
[Figure: origin server in North America, CDN distribution node, CDN servers in S. America, Europe and Asia]

20 CDN example
origin server
- www.foo.com
- distributes HTML
- Replaces: http://www.foo.com/sports/ruth.gif with http://www.cdn.com/www.foo.com/sports/ruth.gif
CDN company
- cdn.com
- distributes gif files
- uses its authoritative DNS server to route redirect requests
Steps
1. HTTP request for www.foo.com/sports/sports.html (to origin server)
2. DNS query for www.cdn.com (to CDN's authoritative DNS server)
3. HTTP request for www.cdn.com/www.foo.com/sports/ruth.gif (to nearby CDN server)

21 More about CDNs
routing requests
- CDN creates a “map”, indicating distances between leaf ISPs and CDN nodes
- when query arrives at authoritative DNS server:
  - server determines ISP from which query originates
  - uses “map” to determine best CDN server
Caching vs. CDN
- Caching is pull: passive
- CDN is push: active

22 Client-server architecture
server:
- always-on host
- permanent IP address
- server farms for scaling
clients:
- communicate with server
- may be intermittently connected
- may have dynamic IP addresses
- do not communicate directly with each other

23 Pure P2P architecture
- no always-on server
- arbitrary end systems directly communicate
- peers are intermittently connected and change IP addresses
- self scalability: new peers bring new resources
- Three topics:
  - File distribution
  - Searching for information
  - Case Study: BitTorrent, Skype

24 P2P file sharing
Example
- Alice runs P2P client application on her notebook computer
  - Intermittently connects to Internet
- Asks for “X.mp3”
- Application displays other peers that have a copy of X.mp3.
- Alice chooses one of the peers, Bob.
- File is copied from Bob’s PC to Alice’s notebook: HTTP
- While Alice downloads, other users are uploading from Alice.
  - Alice’s peer is both a Web client and a transient Web server
All peers are servers = highly scalable!

25 File Distribution: Server-Client vs P2P
Question: How much time to distribute file from one server to N peers?
- $F$: file size
- $u_s$: server upload bandwidth
- $u_i$: peer i upload bandwidth
- $d_i$: peer i download bandwidth
[Figure: server (upload $u_s$) and peers 1..N (upload $u_i$, download $d_i$) attached to a network with abundant bandwidth]

26 File distribution time: server-client
- Server transmission: must sequentially send (upload) N copies:
  - $NF/u_s$ time
- Client: each must download the file
  - client i takes $F/d_i$ time to download
Time to distribute F to N clients using client/server approach:
  $D_{cs} = \max\{\, NF/u_s,\ F/\min_i d_i \,\}$
The first term increases linearly in N (for large N).
[Figure: server (upload $u_s$) and peers 1..N (upload $u_i$, download $d_i$) attached to a network with abundant bandwidth]

27 File distribution time: P2P
- Server transmission:
  - must upload at least one copy: $F/u_s$ time
- Client: each must download a copy
  - client i takes $F/d_i$ time to download
  - but also shares (uploads)
- Clients (peers):
  - as a whole must download $NF$ bits
  - fastest possible overall download rate: $u_s + \sum_i u_i$
  $D_{P2P} = \max\{\, F/u_s,\ F/\min_i d_i,\ NF/(u_s + \sum_i u_i) \,\}$
Increases linearly in N (for large N)?
[Figure: server (upload $u_s$) and peers 1..N (upload $u_i$, download $d_i$) attached to a network with abundant bandwidth]

28 Server-client vs. P2P: example
Client upload rate = $u$, $F/u$ = 1 hour, $u_s = 10u$, $d_{min} \ge u_s$
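
The two distribution-time bounds can be evaluated for this example with a short sketch. The function names are illustrative. Rates are expressed in units of u and time in hours, so F = 1. The choice of N = 30 peers and of d_i = u_s for every peer are assumptions made only to get concrete numbers.

```python
def client_server_time(file_size, server_up, peer_down):
    """D_cs = max(N*F/u_s, F/min(d_i))."""
    n = len(peer_down)
    return max(n * file_size / server_up, file_size / min(peer_down))


def p2p_time(file_size, server_up, peer_up, peer_down):
    """Lower bound D_P2P = max(F/u_s, F/min(d_i), N*F/(u_s + sum(u_i)))."""
    n = len(peer_down)
    return max(
        file_size / server_up,
        file_size / min(peer_down),
        n * file_size / (server_up + sum(peer_up)),
    )


# Slide parameters: F/u = 1 hour, u_s = 10u, d_min >= u_s.
u, file_size = 1.0, 1.0
n_peers = 30                                   # assumed value of N
d_cs = client_server_time(file_size, 10 * u, [10 * u] * n_peers)
d_p2p = p2p_time(file_size, 10 * u, [u] * n_peers, [10 * u] * n_peers)
print(f"N={n_peers}: client-server {d_cs:.2f} h, P2P {d_p2p:.2f} h")
```

With these parameters the client-server time keeps growing with N, while the P2P bound approaches F/u = 1 hour, because every new peer also adds upload capacity.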

29 File distribution: BitTorrent
- P2P file distribution
- tracker: tracks peers participating in torrent
- torrent: group of peers exchanging chunks of a file
[Figure: Alice arrives, obtains list of peers from tracker, and begins exchanging file chunks with peers in torrent]

30 BitTorrent (1)
- file divided into 256 KB chunks.
- peer joining torrent:
  - has no chunks, but will accumulate them over time
  - registers with tracker to get list of peers, connects to subset of peers (“neighbors”)
- while downloading, peer uploads chunks to other peers.
- peers may come and go: churn
- once peer has entire file, it may (selfishly) leave or (altruistically) remain

31 BitTorrent (2)
Requesting Chunks
- at any given time, different peers have different subsets of file chunks
- periodically, a peer (Alice) asks each neighbor for list of chunks that they have.
- Alice sends requests for her missing chunks
  - rarest first
Sending Chunks: tit-for-tat
- Alice sends chunks to four neighbors currently sending her chunks at the highest rate
  - re-evaluate top 4 every 10 secs
- every 30 secs: randomly select another peer, start sending chunks
  - newly chosen peer may join top 4
  - “optimistically unchoke”
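
The two policies on this slide can be sketched in a few lines. This is a simplified illustration, not the BitTorrent wire protocol: the function names and data shapes are assumptions, and the 10-second and 30-second timers are left out, so each call represents one re-evaluation.

```python
from collections import Counter
import random


def rarest_first(my_chunks, neighbor_chunks):
    """Order the chunks Alice is missing from rarest to most common.

    neighbor_chunks maps a neighbor name to the set of chunk ids it holds.
    """
    counts = Counter(c for chunks in neighbor_chunks.values() for c in chunks)
    missing = [c for c in counts if c not in my_chunks]
    return sorted(missing, key=lambda c: (counts[c], c))


def choose_unchoked(download_rates, top_n=4, rng=random):
    """Pick the top_n neighbors by observed rate plus one random 'optimistic' peer."""
    ranked = sorted(download_rates, key=download_rates.get, reverse=True)
    top = ranked[:top_n]
    others = ranked[top_n:]
    optimistic = rng.choice(others) if others else None
    return top, optimistic


# Hypothetical state: Alice holds chunk 1; three neighbors advertise their chunks.
order = rarest_first({1}, {"bob": {1, 2, 3}, "carol": {2, 3}, "dave": {3, 4}})
# Hypothetical rates (KB/s) at which neighbors are currently sending to Alice.
top4, lucky = choose_unchoked({"a": 5, "b": 9, "c": 1, "d": 7, "e": 3, "f": 8})
print(order, top4, lucky)
```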

32 BitTorrent: Tit-for-tat
(1) Alice “optimistically unchokes” Bob
(2) Alice becomes one of Bob’s top-four providers; Bob reciprocates
(3) Bob becomes one of Alice’s top-four providers
With higher upload rate, can find better trading partners & get file faster!

33 P2P Case study: Skype
- inherently P2P: pairs of users communicate.
- proprietary application-layer protocol (inferred via reverse engineering)
- hierarchical overlay with supernodes (SNs)
- Index maps usernames to IP addresses; distributed over SNs
[Figure: Skype clients (SC), supernodes (SN), Skype login server]

34 Peers as relays
- Problem when both Alice and Bob are behind “NATs”.
  - NAT prevents an outside peer from initiating a call to insider peer (see later)
- Solution:
  - Using Alice’s and Bob’s SNs, a relay is chosen
  - Each peer initiates session with relay.
  - Peers can now communicate through NATs via relay

35 P2P: searching for information
- So many files
- But, where are they?
Index in P2P system: maps information to peer location (location = IP address & port number)

36 P2P: centralized directory
original “Napster” design
1) when peer connects, it informs central server:
  - IP address
  - content
2) Alice queries for “X.mp3”
3) Alice requests file from Bob
[Figure: peers, including Alice and Bob, around a centralized directory server; arrows labeled 1 from the peers to the server, 2 for Alice's query, 3 for the transfer between Alice and Bob]
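
The directory server's role in the three steps can be sketched as an in-memory index. This is a toy model under stated assumptions: the class name, method names and the "host:port" address strings are hypothetical, and it says nothing about Napster's actual protocol.

```python
from collections import defaultdict


class CentralDirectory:
    """Toy directory server: maps a file name to the peers that hold it."""

    def __init__(self):
        self._index = defaultdict(set)

    def register(self, peer_addr, files):
        """Step 1: a connecting peer reports its address and content."""
        for name in files:
            self._index[name].add(peer_addr)

    def unregister(self, peer_addr):
        """Drop a peer that has disconnected."""
        for peers in self._index.values():
            peers.discard(peer_addr)

    def query(self, name):
        """Step 2: return the peers currently advertising the file."""
        return sorted(self._index.get(name, ()))


directory = CentralDirectory()
directory.register("bob.example:6699", ["X.mp3", "Y.mp3"])
directory.register("carol.example:6699", ["X.mp3"])
holders = directory.query("X.mp3")
# Step 3 happens outside the directory: Alice contacts one holder directly.
print(holders)
```

The file transfer never touches the server, which is why the next slide calls the design decentralized for transfer but centralized for locating content.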

37 P2P: problems with centralized directory
- Single point of failure
- Performance bottleneck
- Copyright infringement
file transfer is decentralized, but locating content is highly centralized

38 P2P: decentralized directory
- Each peer is either a group leader or assigned to a group leader.
- Group leader tracks the content in all its children.
- Peer queries group leader; group leader may query other group leaders.

39 More about decentralized directory
advantages of approach
- no centralized directory server
  - location service distributed over peers
  - more difficult to shut down
disadvantages of approach
- bootstrap node needed
- group leaders can get overloaded

40 P2P: Query flooding
- Gnutella
- no hierarchy
- use bootstrap node to learn about others
- join message
- Send query to neighbors
- Neighbors forward query
- If queried peer has object, it sends message back to querying peer

41 P2P: more on query flooding
Pros
- peers have similar responsibilities: no group leaders
- highly decentralized
- no peer maintains directory info
Cons
- excessive query traffic
- query radius: may not find content even when it is present in the network
- bootstrap node
- maintenance of overlay network
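
The query-radius drawback can be seen in a small simulation. The sketch below floods a query breadth-first and stops forwarding after a hop limit. The function name, the dictionary-based overlay and the hop limit parameter `ttl` are assumptions for illustration, not Gnutella's message format.

```python
from collections import deque


def flood_query(overlay, content, start, wanted, ttl):
    """Breadth-first flood limited to ttl hops; returns the peers that hit.

    overlay: dict peer -> list of neighbor peers
    content: dict peer -> set of object names held by that peer
    """
    hits, seen = [], {start}
    frontier = deque([(start, 0)])
    while frontier:
        peer, hops = frontier.popleft()
        if peer != start and wanted in content.get(peer, set()):
            hits.append(peer)            # would send a message back to start
        if hops == ttl:
            continue                     # query radius reached: stop forwarding
        for neighbor in overlay.get(peer, []):
            if neighbor not in seen:     # do not forward duplicates
                seen.add(neighbor)
                frontier.append((neighbor, hops + 1))
    return hits


# Hypothetical overlay: a chain a - b - c - d, with the object three hops away.
overlay = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
content = {"d": {"X.mp3"}}
print(flood_query(overlay, content, "a", "X.mp3", ttl=2))
print(flood_query(overlay, content, "a", "X.mp3", ttl=3))
```

With a radius of 2 the query never reaches the peer holding the object, even though the object exists in the overlay.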

42 DHT: A New Story …
- Motivation:
  - Frustrated by popularity of all these “half-baked” P2P apps
  - We can do better!
  - Guaranteed lookup success for files in system
  - Provable bounds on search time
  - Provable scalability to millions of nodes

43 P2P: Content Addressing (Hash Routing)
Hash routing
- Given an object identifier I, calculate its hash value H = hash(I), and (hopefully) find it (or its location info) in peer H
- Not a new idea
  - Load balancing: hash IP address, re-direct to different servers
[Figure: application calling put(key, data) and get(key) on a hash table spread over nodes; get returns the data]

44 Hash Routing
- Two alternatives
  - Node can cache each (existing) object that hashes within its range
  - Pointer-based: level of indirection, node caches pointer to location(s) of object
- What’s new in P2P?
  - Dynamic overlay: peers join/leave, so the number of peers is not fixed
  - Traditional hash function doesn’t work
[Figure: SHA-1 hash space split into node ranges 0-999, 1000-1999, 1500-4999, 4500-6999, 7000-8500, 8000-8999, 9000-9500, 9500-9999]
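
A minimal sketch of why a fixed-size hash mapping breaks down when peers come and go. It assumes the simplest traditional scheme, hash(I) mod N, with SHA-1 as the hash. The function names and the file names are illustrative.

```python
import hashlib


def peer_for(object_id, num_peers):
    """Map an object identifier to a peer index: H = hash(I) mod N."""
    digest = hashlib.sha1(object_id.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_peers


def moved_keys(object_ids, old_peers, new_peers):
    """Count objects whose responsible peer changes when the peer count changes."""
    return sum(
        peer_for(obj, old_peers) != peer_for(obj, new_peers)
        for obj in object_ids
    )


files = [f"file{i}.mp3" for i in range(1000)]
moved = moved_keys(files, 10, 11)   # one peer joins a 10-peer system
print(f"{moved} of {len(files)} objects change peers")
```

A single join remaps most objects under this scheme, which is the motivation for the consistent hashing used by the DHTs on the following slides.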

45 Distributed Hash Table (DHT)
Challenges
- For each object, node(s) whose range(s) cover that object must be reachable via a “short” path
- # neighbors for each node should scale well (e.g., should not be O(N))
- Fully distributed (no centralized bottleneck/single point of failure)
- DHT mechanism should gracefully handle nodes joining/leaving
  - need to repartition the range space over existing nodes
  - need to reorganize neighbor set
  - need bootstrap mechanism to connect new nodes into the existing DHT infrastructure

46 Case Studies
- Structured overlay (P2P) systems: Consistent Hashing
  - Chord
  - CAN (Content Addressable Network)
- Key Questions
  - Q1: How is hash space divided “evenly” among existing nodes?
  - Q2: How is routing implemented that connects an arbitrary node to the node responsible for a given object?
  - Q3: How is the hash space repartitioned when nodes join/leave?
- Let N be the number of nodes in the overlay
- Let H be the size of the range of the hash function (when applicable)

47 Chord (from MIT, 2001)
- Associate to each node and file a unique id in a one-dimensional space (a ring)
  - E.g., pick from the range $[0 \ldots 2^m - 1]$
  - Usually the hash of the file or IP address
- Properties:
  - Routing table size is O(log N), where N is the total number of nodes
  - Guarantees that a file is found in O(log N) hops

48 Consistent Hashing
- A key is stored at its successor: node with next higher ID
- (Key: hashed value of a file identifier)
[Figure: circular ID space with nodes N32, N90, N105 and keys K5, K20, K80; K5 is labeled "Key 5" and N105 is labeled "Node 105"]
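
The successor rule can be sketched with a sorted list of node IDs. The function name is illustrative. The node and key IDs are the ones shown in the figure, and the lookup treats "next higher ID" as "first node ID greater than or equal to the key", wrapping around the ring.

```python
import bisect


def successor(node_ids, key):
    """Return the first node id >= key, wrapping around the ring."""
    ring = sorted(node_ids)
    idx = bisect.bisect_left(ring, key)
    return ring[idx % len(ring)]


nodes = [32, 90, 105]
placement = {key: successor(nodes, key) for key in (5, 20, 80)}
print(placement)
```

When a node joins or leaves, only the keys between it and its predecessor change owner, which is what makes the scheme suitable for a dynamic overlay.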

49 Chord Basic Lookup
[Figure: ring with nodes N10, N32, N60, N90, N105, N120 and key K80; query “Where is key 80?”, answer “N90 has K80”]

50 Chord “Finger Table”
- Entry i in the finger table of node n is the first node that succeeds or equals $n + 2^i$ (mod $2^m$)
- In other words, the ith finger points $1/2^{m-i}$ of the way around the ring, where $2^m$ is the size of the ID space
[Figure: fingers of N80 spanning 1/2, 1/4, 1/8, 1/16, 1/32, 1/64 and 1/128 of the ring]

51 Chord Join
- Assume a hash space [0..7]
- Node n1 joins
Succ. Table
node  i  id+2^i  succ
n1    0  2       1
n1    1  3       1
n1    2  5       1
[Figure: ring with positions 0 to 7]

52 Chord Join
- Node n2 joins
Succ. Tables
node  i  id+2^i  succ
n1    0  2       2
n1    1  3       1
n1    2  5       1
n2    0  3       1
n2    1  4       1
n2    2  6       1
[Figure: ring with positions 0 to 7]

53 Chord Join
- Nodes n0, n6 join
Succ. Tables
node  i  id+2^i  succ
n1    0  2       2
n1    1  3       6
n1    2  5       6
n2    0  3       6
n2    1  4       6
n2    2  6       6
n0    0  1       1
n0    1  2       2
n0    2  4       6
n6    0  7       0
n6    1  0       0
n6    2  2       2
[Figure: ring with positions 0 to 7]
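
The successor tables in the join example can be reproduced with a few lines. This sketch recomputes every table from the full node set, which is enough to check the numbers. It does not model the messages a real Chord node exchanges while joining. The function name is illustrative.

```python
def finger_table(node_id, node_ids, m):
    """Rows (i, (node_id + 2**i) mod 2**m, successor of that id)."""
    ring = sorted(node_ids)
    size = 2 ** m
    table = []
    for i in range(m):
        start = (node_id + 2 ** i) % size
        # First node at or after start; wrap to the smallest id if none.
        succ = next((n for n in ring if n >= start), ring[0])
        table.append((i, start, succ))
    return table


# Hash space [0..7] means m = 3.
for members in ([1], [1, 2], [0, 1, 2, 6]):
    for node in members:
        print(members, f"n{node}", finger_table(node, members, 3))
```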

54 Chord Join
- Nodes: n1, n2, n0, n6
- Keys: f7, f1
- By the successor rule, key 7 is stored at n0 and key 1 at n1
Succ. Tables
node  i  id+2^i  succ
n1    0  2       2
n1    1  3       6
n1    2  5       6
n2    0  3       6
n2    1  4       6
n2    2  6       6
n0    0  1       1
n0    1  2       2
n0    2  4       6
n6    0  7       0
n6    1  0       0
n6    2  2       2
[Figure: ring with positions 0 to 7]

55 Chord Routing
- Upon receiving a query for file id, a node first calculates the key, hash(id)
- Checks whether it stores the key locally
- If not, forwards the query to the largest node in its successor table that does not exceed the key
[Figure: ring with positions 0 to 7, the successor tables of n1, n2, n0 and n6, keys 7 and 1, and a query(7) in flight]
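
The routing rule can be traced on the four-node example with the sketch below. It follows the usual Chord formulation, which is an assumption about what the slide abbreviates: if the key lies between a node and its immediate successor, the successor owns it. Otherwise the query goes to the table entry that gets closest to the key without passing it, with all comparisons made clockwise on the ring. Function names are illustrative.

```python
def finger_table(node_id, ring, m):
    """Successor of node_id + 2**i (mod 2**m) for i = 0..m-1; ring is sorted."""
    size = 2 ** m
    starts = [(node_id + 2 ** i) % size for i in range(m)]
    return [next((n for n in ring if n >= s), ring[0]) for s in starts]


def lookup(start, key, node_ids, m):
    """Route a query for key from node start; return (owner, nodes visited)."""
    size = 2 ** m
    ring = sorted(node_ids)
    fingers = {n: finger_table(n, ring, m) for n in ring}

    def dist(a, b):
        return (b - a) % size          # clockwise distance from a to b

    node, path = start, [start]
    while node != key:                 # a node whose id equals the key owns it
        succ = fingers[node][0]        # entry 0 is the immediate successor
        succ_dist = dist(node, succ) or size   # single-node ring: whole circle
        if dist(node, key) <= succ_dist:
            # key lies in (node, succ], so the successor is responsible
            if succ != node:
                path.append(succ)
            return succ, path
        # otherwise jump to the entry that gets closest without passing the key
        preceding = [f for f in fingers[node] if 0 < dist(node, f) < dist(node, key)]
        node = max(preceding, key=lambda f: dist(node, f))
        path.append(node)
    return node, path


# query(7) issued at n1 (an assumed starting point) in the ring {0, 1, 2, 6}.
print(lookup(1, 7, [0, 1, 2, 6], 3))
```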

56 Chord Summary
- Routing table size?
  - log N fingers
- Routing time?
  - Each hop is expected to halve the distance to the desired key => expect O(log N) hops.
Note: so far only the basic Chord; many practical issues remain (not covered in this course though …)

57 A few words about Bitcoin (and other digital/virtual currency)
- Two key issues for a currency
  - Generation (where does it come from?)
  - Distribution (how to use it, i.e., buy/sell transactions?)
- Bitcoin: open source P2P currency
  - Mining (hashing)
  - Verification

58 Chapter 2: Summary
Our study of network apps now complete!
- application service requirements:
  - reliability, bandwidth, delay
- client-server paradigm
- Internet transport service model
  - connection-oriented, reliable: TCP
  - unreliable, datagrams: UDP
- specific protocols:
  - HTTP
  - FTP
  - SMTP, POP, IMAP
  - DNS
- content distribution
  - caches, CDNs
  - P2P

59 Chapter 2: Summary
More importantly: learned about protocols
- typical request/reply message exchange:
  - client requests info or service
  - server responds with data, status code
- message formats:
  - headers: fields giving info about data
  - data: info being communicated
- control vs. data msgs
  - in-band, out-of-band
- centralized vs. decentralized
- stateless vs. stateful
- reliable vs. unreliable msg transfer
- “complexity at network edge”: many protocols
- security: authentication

