CS 4720 Peer to Peer Networking CS 4720 – Web & Mobile Systems.

1 CS 4720 Peer to Peer Networking CS 4720 – Web & Mobile Systems

2 CS 4720 History of File Sharing P2P is the solution to a problem: how to get very large files to a lot of people in a timely fashion How did the question arise? "Freedom!" "Internet Media!" "Take it to the man!" … or sharing copies of copyrighted files… 2

3 CS 4720 Bulletin Board Systems Ah, the good old days… Let's take a look at one now! –bbsmates.com / We can consider it the earliest form of a web service Usenet is a form of a bbs… kinda… –No central server, fully distributed, evolving mesh –"servers" copy info between themselves 3

4 CS 4720 Napster 4 The tech story of my undergrad years Debuted in Summer of 1999 –That fall I was starting my Sophomore year –I was taking: CSC 112 (and lab) Fundamentals of Comp Science – B MTH 112 Calculus II – B MTH 117 Discrete Mathematics – A THE 112 Introduction to the Theatre – A- HMN 396 Individual Study (Medieval Themes in Modern Video Games) – A

5 CS 4720 Napster Protocol Napster ran central servers that handled: –User authentication –Logging –Chat functionality –Making connections between clients A user would log in to Napster and the program would then populate their profile with all the songs/files they had available

6 CS 4720 Napster Protocol <user> "<filename>" <hash> <size> <bitrate> <frequency> <time> –<user> is the user contributing the file –<filename> is the mp3 file contributed –<hash> is the hash of the mp3 file –<size> is the file size in bytes –<bitrate> is the mp3 bitrate in kbps –<frequency> is the sampling frequency in Hz –<time> is the play time in seconds –Example: foouser "generic band - generic song.mp3" b92870e0d41bc8e698cf2f0a1ddfeac
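A record in this shape is easy to pull apart, because only the file name is quoted. The sketch below is a minimal illustration, not Napster's actual client code: the `SharedFile` class and `parse_record` function are hypothetical names, and the hash and numeric values in the sample line are invented, since the example on the slide is cut off.

```python
import shlex
from dataclasses import dataclass


@dataclass
class SharedFile:
    user: str
    filename: str
    file_hash: str
    size_bytes: int
    bitrate_kbps: int
    frequency_hz: int
    duration_s: int


def parse_record(line: str) -> SharedFile:
    """Parse one space-separated record whose file name is double-quoted."""
    fields = shlex.split(line)  # keeps the quoted file name as one field
    if len(fields) != 7:
        raise ValueError(f"expected 7 fields, got {len(fields)}")
    user, filename, file_hash, size, bitrate, frequency, duration = fields
    return SharedFile(user, filename, file_hash,
                      int(size), int(bitrate), int(frequency), int(duration))


# Hypothetical record: the hash and the numbers are made up for illustration.
sample = ('foouser "generic band - generic song.mp3" '
          '0123456789abcdef0123456789abcdef 443332 128 44100 60')
record = parse_record(sample)
```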

7 CS 4720 Napster Protocol When a user did a search, it was just a DB lookup on Napster's servers Then Napster would establish a client-client connection to make the transfer happen Usually, this would be a simple TCP connection to that user's Napster data port (effectively like an FTP server) –If a firewall was involved, the sender would initiate the connection with the requester
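The "central index, direct transfer" idea can be sketched in a few lines. This is only a toy model under stated assumptions: `CentralIndex`, its methods, the addresses, and the port number are all hypothetical, and an in-memory dictionary stands in for whatever database Napster really used. The point is that the server answers searches with where to connect and never touches the file itself.

```python
class CentralIndex:
    """Toy stand-in for a central server: it stores metadata, never files."""

    def __init__(self):
        self._files = {}      # user -> list of shared file names
        self._addresses = {}  # user -> (host, data_port)

    def login(self, user, host, data_port, filenames):
        # On login the client reports its address and everything it shares.
        self._addresses[user] = (host, data_port)
        self._files[user] = list(filenames)

    def logout(self, user):
        self._addresses.pop(user, None)
        self._files.pop(user, None)

    def search(self, term):
        """Return (user, filename, (host, port)) for every matching file."""
        term = term.lower()
        return [(user, name, self._addresses[user])
                for user, names in self._files.items()
                for name in names
                if term in name.lower()]


index = CentralIndex()
index.login("foouser", "192.0.2.10", 6699, ["generic band - generic song.mp3"])
index.login("baruser", "192.0.2.20", 6699, ["other band - other song.mp3"])
hits = index.search("generic")
```

The requester would then open its own TCP connection to the returned host and port; the index is out of the loop from that point on.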

8 CS 4720 Why did this work? Universities and Colleges were pushing HARD about how "connected" their campus was Many students got their first address when they went to school starting in 1996 or so The speed was an INCREDIBLE jump over 14.4, 28.8, 56k The direct connection made it really easy to send the files 8

9 CS 4720 Why did this work? Napster (theoretically at the time) was in the clear –They didn't host any of the files… they just made them available –Feb 2001: 26.4 million users! –Eventually, the connection part of the whole deal was deemed "enabling technology" and that was the end of that in the Summer of 2001 One of the big problems was that there was an identifiable target: Napster and Shawn Fanning

10 CS 4720 The Solution Decentralize the network Truly create "the cloud" where the data would live Even though Napster was going strong until Summer of 2001, the foundation was already being laid for the next generation of sharing technology

11 CS 4720 Kazaa, Morpheus, eDonkey, et al. Continuing the trend of odd names for programs comes the first set of decentralized sharing services 11

12 CS 4720 The Link Between Them All The network was actually called "FastTrack" Was the most popular file sharing network in 2003 – estimates say that it even eclipsed Napster at its height FastTrack was an intentionally designed, corporate funded de-centralized distribution network 12

13 CS 4720 Nodes and Supernodes When you connected to the FastTrack network (through whatever program you used), you started as a node Nodes provide file information and download requests to supernodes Supernodes are responsible for indexing users' shares, performing queries, and keeping statistics When a connection is made, HTTP is used 13

14 CS 4720 But you just said it was decentralized… And it is! Supernodes are regular nodes that are "promoted" to supernode status by other supernodes on the network As supernodes see that their ranks are diminishing, or if the bandwidth is hurting, they find an unsuspecting node and assimilate… I mean, promote it to supernode Supernodes could also self-announce 14

15 CS 4720 Guess which nodes got promoted! Supernodes liked nodes with: –Lots of files –Lots of bandwidth –Lots of uptime –Low latency –Lots of computing power So… where do you find one of these machines? College students who leave their machines on overnight! 15
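One way to picture the promotion decision is as a score over the traits listed above. The sketch below is purely illustrative: FastTrack's real selection logic is not described in these slides, and the `NodeStats` fields, the weights, and the sample numbers are all assumptions picked so the example is easy to follow.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    name: str
    shared_files: int
    bandwidth_kbps: float
    uptime_hours: float
    latency_ms: float
    cpu_score: float  # hypothetical relative measure of computing power


def promotion_score(node: NodeStats) -> float:
    """Higher is better. The weights are arbitrary, for illustration only."""
    return (node.shared_files / 1000
            + node.bandwidth_kbps / 1000
            + node.uptime_hours / 24
            + node.cpu_score
            - node.latency_ms / 100)  # latency is the one trait that should be low


def pick_supernode(candidates):
    """Return the candidate with the best score."""
    return max(candidates, key=promotion_score)


dorm_pc = NodeStats("dorm-pc", shared_files=5000, bandwidth_kbps=10000,
                    uptime_hours=72, latency_ms=20, cpu_score=2.0)
dialup_pc = NodeStats("dialup-pc", shared_files=200, bandwidth_kbps=56,
                      uptime_hours=2, latency_ms=250, cpu_score=1.0)
chosen = pick_supernode([dialup_pc, dorm_pc])
```

With numbers like these, the always-on machine on a campus network wins by a wide margin, which is the slide's point.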

16 CS 4720 University Response After all of that with Napster traffic, now the university machines themselves are the supernodes! IT staff did their best to throttle traffic, block ports, etc. In the end, the RIAA came knocking… 16

17 CS 4720 Hashing and RIAA Response There were some weaknesses in the FastTrack protocol that left it susceptible to attacks from the RIAA The hashing algorithm used to verify that a file was indeed a particular file was written to be fast and efficient… but not terribly accurate The RIAA seeded a ton of dummy files to drop the value of the network
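To see why a fast-but-inaccurate hash is a problem, consider a hash that only reads part of the file. The sketch below is NOT the actual FastTrack algorithm; `sampled_hash` and the 1 KB sample size are assumptions chosen to show the general weakness: two files that agree on the sampled bytes look identical, even if the rest of one of them is junk.

```python
import hashlib

SAMPLE_BYTES = 1024  # assumed sample size, for illustration only


def sampled_hash(data: bytes) -> str:
    """Fast but weak: only the first SAMPLE_BYTES bytes are hashed."""
    return hashlib.sha1(data[:SAMPLE_BYTES]).hexdigest()


def full_hash(data: bytes) -> str:
    """Slower but accurate: every byte of the file is hashed."""
    return hashlib.sha1(data).hexdigest()


header = b"H" * SAMPLE_BYTES
real_file = header + b"real audio" * 1000
dummy_file = header + b"junk noise" * 1000

same_under_weak_hash = sampled_hash(real_file) == sampled_hash(dummy_file)
same_under_full_hash = full_hash(real_file) == full_hash(dummy_file)
```

A network that trusts the weak hash cannot tell the dummy from the real file, so a flood of dummies makes every download a gamble.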

18 CS 4720 Other Problems Remember how I said this was corporately funded? How would they make their money back? Malware and spyware! 18

19 CS 4720 Kazaa Malware –Cydoor (spyware): Collects information on the PC's surfing habits and passes it on to the company which created Cydoor. –B3D (adware): An add-on which causes advertising popups if the PC accesses a website which triggers the B3D code. –Altnet (adware): A distribution network for paid "gold" files. –The Best Offers (adware): Tracks your browsing habits and internet usage to display advertisements similar to your interests. –InstaFinder (hijacker): Redirects your URL typing errors to InstaFinder's web page instead of the standard search page. –TopSearch (adware): Displays paid songs and media related to your search in Kazaa. –RX Toolbar (spyware): The toolbar monitors all the sites you visit with Microsoft Internet Explorer and provides links to competitors' websites. –New.net (hijacker): A browser plugin that lets you access several of its own unofficial Top Level Domain names, e.g., .chat and .shop. Its main purpose is to sell domain names such as www.record.shop

20 CS 4720 Today The FastTrack network is still out there, but comparatively few people use it anymore The inventors of FastTrack? They're doing just fine. They created Skype.

21 CS 4720 So what replaced FastTrack? BitTorrent Introduced by Bram Cohen in Summer of 2001 Just a few months after FastTrack goes online, actually At the time, though, there weren't any groups to host the trackers that were needed for the protocol to work That wouldn't change until early

22 CS 4720 What exactly is BitTorrent? From bittorrent.org: –BitTorrent is a free speech tool. –BitTorrent gives you the same freedom to publish previously enjoyed by only a select few with special equipment and lots of money. –You have something terrific to publish -- a large music or video file, software, a game or anything else that many people would like to have. –But the more popular your file becomes, the more you are punished by soaring bandwidth costs. –If your file becomes phenomenally successful and a flash crowd of hundreds or thousands try to get it at once, your server simply crashes and no one gets it. –There is a solution to this vicious cycle: BitTorrent –With BitTorrent free speech no longer has a high price.

23 CS 4720 But does it cure the common cold? So… that was nice and all… but what is BitTorrent? Simply put, BT is a P2P file sharing protocol for sharing large amounts of data in such a way that all nodes share not only the demand but also the supply There is no one node a file is downloaded from, and thus the load is theoretically evenly balanced

24 CS 4720 The Basics It starts with one node who publishes a file A .torrent file is created, which simply has the connection information, file size, etc. Files are split into (usually) 256KB pieces They are hashed (of course) to verify the contents after transmission The .torrent file is hosted on a tracker site The tracker HOSTS NO FILES other than the .torrent files But it does monitor traffic to connect nodes
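The split-and-hash step can be sketched as follows. BitTorrent's published specification uses SHA-1 for piece hashes; everything else here is an assumption for illustration: the function names are hypothetical, and the data is an in-memory byte string rather than a real file on disk.

```python
import hashlib

PIECE_LENGTH = 256 * 1024  # the "usually 256KB" piece size from the slide


def piece_hashes(data: bytes, piece_length: int = PIECE_LENGTH) -> list:
    """Split data into fixed-size pieces and return the SHA-1 digest of each.

    The final piece may be shorter than piece_length.
    """
    return [hashlib.sha1(data[start:start + piece_length]).digest()
            for start in range(0, len(data), piece_length)]


def verify_piece(index: int, piece: bytes, hashes: list) -> bool:
    """Check a downloaded piece against the hash published for that index."""
    return hashlib.sha1(piece).digest() == hashes[index]


# A 600 KB stand-in "file" splits into pieces of 256 KB, 256 KB, and 88 KB.
data = bytes(600 * 1024)
hashes = piece_hashes(data)
last_piece_ok = verify_piece(2, data[2 * PIECE_LENGTH:], hashes)
```

Because each piece is checked on its own, a peer can accept pieces from strangers in any order and throw away only the ones that fail.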

25 CS 4720 The Basics The file originator is called the seed The seed pushes the .torrent file to the tracker The tracker provides the .torrent file for others to download When a peer downloads the .torrent to start downloading the file, it announces itself to everyone else that is downloading that file

26 CS 4720 The Basics In general, BT works on a rarest piece first algorithm A peer will ask for the rarest piece for a given file from the seeder and will receive it The peer will then start hosting that "rarest piece," which theoretically is now NOT the rarest, and again asks for the rarest 26
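Rarest-first selection boils down to counting how many known peers hold each piece and asking for the least common one you still lack. The sketch below is a simplified take: `pick_rarest` and the peer names are hypothetical, ties are broken at random, and a real client would also track in-flight requests and peers that come and go.

```python
import random
from collections import Counter


def pick_rarest(have, peer_pieces, rng=random):
    """Return the index of the rarest piece we still need, or None.

    have        -- set of piece indices we already hold
    peer_pieces -- dict mapping peer name to the set of pieces that peer holds
    """
    counts = Counter()
    for pieces in peer_pieces.values():
        counts.update(pieces)
    candidates = [piece for piece in counts if piece not in have]
    if not candidates:
        return None
    fewest = min(counts[piece] for piece in candidates)
    # Several pieces can be equally rare, so pick one of them at random.
    return rng.choice([piece for piece in candidates if counts[piece] == fewest])


peers = {
    "seed": {0, 1, 2, 3},
    "peer_a": {0, 1},
    "peer_b": {0, 2},
}
first_choice = pick_rarest(set(), peers)  # only the seed holds piece 3
```

Once the new peer holds piece 3 and starts serving it, that piece is no longer the rarest, and the next request goes after piece 1 or 2.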

27 CS 4720 The Basics This "rarest first" approach makes downloading different from any other downloading you've done before A standard HTTP request is a straight flow of data… sort of The underlying TCP/IP packets can arrive out of order and are put back together… so why not whole files? It works out fairly well to get data distributed

28 CS 4720 The Basics Also, BT doesn't (necessarily) use a single port Multiple TCP connections can be opened randomly to keep the network strong The problem with this: –Speed of download is a bell curve, not constant –Partial seeding is possible –Streaming is pretty hard (although Bram says "it's coming") 28

29 CS 4720 The Protocol All non "keep alive" messages start with one of the following: –0 - choke –1 - unchoke –2 - interested –3 - not interested –4 - have –5 - bitfield –6 - request –7 - piece –8 - cancel 29

30 CS 4720 The Protocol bitfield – set of indices indicating what the peer has have – successful download and check of a piece request – send index and offset of data wanted piece – index of and actual data of a piece cancel – stop transmission 30
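On the wire, the published BitTorrent peer protocol frames each of these messages as a 4-byte big-endian length, then the one-byte message ID, then the payload; a keep-alive is just a length of zero. The sketch below encodes and decodes that framing. The helper names are hypothetical, and note that per the spec a request carries a length in addition to the index and offset mentioned above.

```python
import struct
from enum import IntEnum


class MessageId(IntEnum):
    CHOKE = 0
    UNCHOKE = 1
    INTERESTED = 2
    NOT_INTERESTED = 3
    HAVE = 4
    BITFIELD = 5
    REQUEST = 6
    PIECE = 7
    CANCEL = 8


def encode(msg_id: MessageId, payload: bytes = b"") -> bytes:
    """Frame a message: length prefix (ID byte + payload), ID, payload."""
    return struct.pack(">IB", 1 + len(payload), msg_id) + payload


def encode_have(index: int) -> bytes:
    return encode(MessageId.HAVE, struct.pack(">I", index))


def encode_request(index: int, begin: int, length: int) -> bytes:
    return encode(MessageId.REQUEST, struct.pack(">III", index, begin, length))


def decode(frame: bytes):
    """Return (MessageId, payload), or (None, b"") for a keep-alive."""
    (length,) = struct.unpack(">I", frame[:4])
    if length == 0:
        return None, b""
    return MessageId(frame[4]), frame[5:4 + length]


have_frame = encode_have(7)
msg_id, payload = decode(encode_request(1, 16384, 16384))
```

Choke, unchoke, interested, and not interested have no payload, so each is a fixed five bytes on the wire.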

31 CS 4720 The Protocol Interested / not interested – indicates whether a peer wants to start communicating with another peer Choke / unchoke – response from peer to interested party as to whether the connection will continue Used to manage the number of connections at any one time 31
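A rough way to see how choke and unchoke cap the number of active connections: keep a fixed number of upload slots and hand them to interested peers. This is a deliberately naive sketch. `PeerState`, `rechoke`, and the slot count are assumptions, and slots go out in list order here, whereas real clients rank peers (for example by transfer rate) before deciding.

```python
from dataclasses import dataclass

MAX_UNCHOKED = 4  # assumed number of upload slots


@dataclass
class PeerState:
    name: str
    interested: bool = False
    choked: bool = True  # in this sketch every connection starts out choked


def rechoke(peers, max_unchoked=MAX_UNCHOKED):
    """Give slots to interested peers, choke the rest.

    Returns the (peer name, message) pairs that would need to be sent.
    """
    messages = []
    slots = max_unchoked
    for peer in peers:
        should_unchoke = peer.interested and slots > 0
        if should_unchoke:
            slots -= 1
        if should_unchoke and peer.choked:
            peer.choked = False
            messages.append((peer.name, "unchoke"))
        elif not should_unchoke and not peer.choked:
            peer.choked = True
            messages.append((peer.name, "choke"))
    return messages


peers = [PeerState("a", interested=True), PeerState("b"),
         PeerState("c", interested=True), PeerState("d", interested=True)]
first_round = rechoke(peers, max_unchoked=2)

peers[0].interested = False  # "a" sends not interested, freeing a slot
second_round = rechoke(peers, max_unchoked=2)
```

Only state changes generate messages, so a peer that keeps its slot hears nothing between rounds.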

32 CS 4720 Snark Build your own BT client Or build it into your own app How might you use BT in an app? How might you use it in: –An enterprise architecture? –A service-oriented architecture? 32

