Challenges, Design and Analysis of a Large-scale P2P-VoD System Dr. Yingwu Zhu.

1 Challenges, Design and Analysis of a Large-scale P2P-VoD System Dr. Yingwu Zhu

2 P2P Overview
Advantages
– Reduces server load by having peers contribute their resources
– Scalability, availability, robustness, …
P2P services
– P2P file downloading: BitTorrent and eMule
– P2P live streaming: CoolStreaming and PPLive
– P2P video-on-demand (P2P-VoD): PPLive
Unlike P2P live streaming systems, P2P-VoD systems lack synchrony: peers can watch different parts of a video at the same time.
Each user contributes a small amount of storage (usually 1 GB), instead of only the in-memory playback buffer used in P2P streaming systems.

3 Challenges in P2P-VoD
Lack of synchrony in user behavior
Content replication to provide movie data to other peers
– Data availability by caching at peers
– ATD: availability-to-demand ratio
– Cache replacement
Transmission strategy to deliver data in real time
– Satisfaction level
– Playback rate

4 PPLive-VoD
Architecture
Design considerations
Measurement study

5 Architecture
Major components
– Peers
– Content servers: the source of content
– Trackers: help peers connect to other peers sharing the same content
– A bootstrap server: helps peers find a suitable tracker and performs other bootstrapping functions
– Other servers:
  log servers log significant events for data measurement;
  transit servers help peers behind NAT boxes

6 Design Decisions
Segment sizes
Content replication strategy
Content discovery
Piece selection
Transmission strategy
Other issues

7 Segment sizes
How to divide a video into multiple pieces
– A small segment size gives more flexibility in scheduling which piece should be uploaded from which neighboring peer.
– The larger the segment size, the smaller the overhead:
  header overhead (checksum), bitmap overhead (advertisement), protocol overhead (TCP/IP headers)
– The video player expects a certain minimum piece size to be viewable (playback rate, frame rate).
Segmentation of a movie in PPLive's VoD system
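The size trade-off on this slide can be made concrete with a back-of-the-envelope calculation. The per-piece costs below (checksum header, one advertised bitmap bit, transport headers) are illustrative assumptions, not PPLive's actual numbers:

```python
def overhead_fraction(segment_bytes, header_bytes=20, bitmap_bits=1, proto_bytes=40):
    """Fixed per-segment cost divided by the payload it carries: halving
    the segment size roughly doubles the relative overhead."""
    per_segment_cost = header_bytes + bitmap_bits / 8 + proto_bytes
    return per_segment_cost / segment_bytes

small = overhead_fraction(16 * 1024)        # fine-grained 16 KB pieces
large = overhead_fraction(2 * 1024 * 1024)  # coarse 2 MB chunks
# Small pieces give more scheduling flexibility but pay 128x the relative overhead.
```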

8 Replication Strategy
Goal
– Make chunks as available to the user population as possible, meeting users' viewing demand without incurring excessive additional overhead
Considerations
– Whether to allow multiple movies to be cached: multiple-movie cache (MVC) vs. single-movie cache (SVC)
– Whether to pre-fetch or not (no, due to short view durations)
– Cache replacement: which chunk/movie to remove when the disk cache is full (movie-based replacement)
  Least recently used (LRU) / least frequently used (LFU)
  Weight-based replacement (how completely the movie is cached locally; ATD = c/n, obtained from the tracker)
  Reduced server load from 19% to 7%–11%
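A minimal sketch of the movie-level, weight-based eviction idea, assuming the tracker-supplied ATD = c/n (c copies, n concurrent demands) is the only weight; the real policy also factors in how complete the local copy is:

```python
def pick_victim(movies):
    """movies maps movie id -> (copies c, demand n). Evict the movie with
    the highest availability-to-demand ratio c/n: the one whose replica
    the swarm can most afford to lose."""
    return max(movies, key=lambda m: movies[m][0] / movies[m][1])

# Movie "A" is over-replicated (10 copies, 2 viewers); "B" is scarce.
pick_victim({"A": (10, 2), "B": (3, 6)})  # -> "A"
```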

9 Content Discovery
Content advertising and look-up methods
– Trackers keep track of which peers replicate a given movie
  As soon as a user starts watching a movie, the peer informs its tracker that it is replicating that movie; it also informs the tracker when content is removed.
  When a peer wants to start watching a movie, it asks the tracker which other peers have that movie.
– Gossip method
  Chunk locations are discovered by gossiping chunk bitmaps. This reduces reliance on the tracker and makes the system more robust.
– DHT
  Used to automatically assign movies to trackers, achieving some level of load balancing.
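The gossip method above amounts to merging neighbors' advertised chunk bitmaps into a local chunk-to-owners view; the names and data shapes here are illustrative, not PPLive's wire format:

```python
def merge_bitmaps(gossip):
    """gossip maps peer id -> list of 0/1 flags, one per chunk.
    Returns chunk index -> set of peers advertising that chunk."""
    owners = {}
    for peer, bits in gossip.items():
        for idx, have in enumerate(bits):
            if have:
                owners.setdefault(idx, set()).add(peer)
    return owners

owners = merge_bitmaps({"p1": [1, 0, 1], "p2": [0, 1, 1]})
# owners[0] == {"p1"}; owners[2] == {"p1", "p2"}
```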

10 Piece Selection
Which piece to download first
– Sequential: select the piece closest to what is needed for video playback (first priority)
– Rarest first: selecting the rarest piece helps speed up the spread of pieces, which indirectly helps streaming quality (second priority)
– Anchor-based: when a user jumps to a particular location in the movie and the piece for that location is missing, the closest anchor point is used instead
  Currently not used:
  users do not jump much (1.8 times per movie), and
  with an optimized transmission algorithm, the buffering time after a jump is already satisfactory
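The two-priority rule (sequential near the playhead, rarest-first elsewhere) can be sketched as follows; the window size and data shapes are assumptions for illustration:

```python
def select_piece(playhead, have, counts, window=4):
    """have: set of owned piece indices; counts[p]: copies of piece p seen
    among neighbors. Sequential pieces inside the playback window come
    first; otherwise pick the rarest missing piece."""
    missing = [p for p in range(len(counts)) if p not in have]
    urgent = [p for p in missing if playhead <= p < playhead + window]
    if urgent:
        return min(urgent)                        # closest to the playhead
    return min(missing, key=lambda p: counts[p])  # rarest elsewhere

counts = [5, 5, 5, 1, 9, 2]
select_piece(0, {0}, counts)           # -> 1 (sequential priority)
select_piece(0, {0, 1, 2, 3}, counts)  # -> 5 (rarest: only 2 copies)
```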

11 Transmission Strategy
Goals
– Maximize the downloading rate
– Minimize the overhead of duplicate transmissions and requests
Strategies (by level of aggressiveness)
– A peer can request the same content from multiple neighbors simultaneously
– A peer can request different content from multiple neighbors simultaneously (PPLive's choice)
  For a playback rate of 500 Kbps, 8–20 neighbors works best; more neighbors can still improve the achieved rate, but at the cost of a high duplication rate.
– A peer can work with one neighbor at a time
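PPLive's chosen strategy (different content from multiple neighbors) can be sketched as a request plan that never asks two neighbors for the same piece; neighbor selection and rate pacing are omitted, and round-robin is an assumed scheduling policy:

```python
def assign_requests(pieces, neighbors):
    """Spread requests for *distinct* pieces round-robin over neighbors,
    so no piece is fetched twice (no duplicate transmissions)."""
    plan = {n: [] for n in neighbors}
    for i, piece in enumerate(pieces):
        plan[neighbors[i % len(neighbors)]].append(piece)
    return plan

assign_requests([0, 1, 2, 3], ["a", "b"])  # -> {"a": [0, 2], "b": [1, 3]}
```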

12 Other Design Issues
NAT and firewalls
– Discovering different types of NAT boxes (similar to the STUN protocol)
– Pacing the upload rate and request rate (to avoid being labeled as an attack and blocked)
Content authentication
– Chunk-level authentication
  Some pieces within a chunk may be polluted and cause a poor viewing experience locally at a peer; if a peer detects that a chunk is bad, it discards it.
– Piece-level authentication
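Chunk-level authentication boils down to comparing a chunk's hash against a published digest and discarding mismatches. SHA-1 here is an assumption for illustration; the slide does not name the hash function PPLive uses:

```python
import hashlib

def verify_chunk(data: bytes, expected_digest: str) -> bool:
    """Keep the chunk only if its hash matches the digest published for
    it; a polluted (tampered) chunk fails and is discarded."""
    return hashlib.sha1(data).hexdigest() == expected_digest

good = hashlib.sha1(b"chunk-0 payload").hexdigest()
verify_chunk(b"chunk-0 payload", good)   # True: chunk is authentic
verify_chunk(b"tampered payload", good)  # False: discard as polluted
```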

13 What to measure
User behavior
– includes user arrival patterns and how long users stay watching a movie
– used to improve the design of the replication strategy
External performance metrics
– include user satisfaction and server load
– used to measure the system performance perceived externally
Health of replication
– measures how well a P2P-VoD system is replicating content
– used to infer how well this important component of the system is doing

14 User Behavior MVR (movie viewing record)

15 User Satisfaction
Simple fluency
– measures the fraction of time a user spends watching a movie out of the total time spent waiting for and watching that movie
R(m, i): the set of all MVRs for a given movie m and user i
n(m, i): the number of MVRs in R(m, i)
r: one of the MVRs in R(m, i)
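With R(m, i) as defined above, fluency can be sketched numerically: treat each MVR as a (buffering time, viewing duration) pair and divide total viewing time by total elapsed time. The tuple representation is an assumption for illustration:

```python
def fluency(mvrs):
    """mvrs: list of (buffering_secs, viewing_secs) pairs, one per MVR in
    R(m, i). Fluency = time spent watching / (waiting + watching)."""
    watching = sum(v for _, v in mvrs)
    total = sum(b + v for b, v in mvrs)
    return watching / total if total else 0.0

# Two sessions: 10 s buffering then 90 s of video, then 100 s uninterrupted.
fluency([(10, 90), (0, 100)])  # -> 0.95
```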

16 User Satisfaction (cont’) User satisfaction index –considers the quality of the delivery of the content r(Q) : a grade for the average viewing quality for an MVR r, using infer/estimate

17 Health of Replication
Three levels
– Movie level
  The number of active peers advertising stored chunks of a movie; this is the information the tracker collects about movies.
– Weighted movie level
  Also considers the fraction of the movie's chunks a peer holds when computing the index.
– Chunk bitmap level (chunk vector)
  The number of copies of each chunk of a movie stored by peers. Various other statistics can be computed: the average number of copies of a chunk in a movie, the minimum number of copies of a chunk, and the variance of the number of copies.
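The chunk-bitmap-level statistics named above follow directly from the per-chunk copy counts; a minimal sketch:

```python
def chunk_health(copies):
    """copies[k] = number of peers storing chunk k of a movie.
    Returns (average, minimum, variance) of the copy counts."""
    n = len(copies)
    avg = sum(copies) / n
    var = sum((c - avg) ** 2 for c in copies) / n
    return avg, min(copies), var

chunk_health([2, 4, 6])  # average 4.0, minimum 2 (the scarcest chunk)
```

The minimum is the most telling number for VoD: a single chunk with few copies can stall every viewer who reaches it.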

18 Statistics on video objects Overall statistics of the three typical movies

19 Statistics on user behavior (1) Interarrival time distribution of viewers

20 Statistics on user behavior (2)
View duration distribution
– Many MVRs last less than 10 minutes, which argues against prefetching

21 Statistics on user behavior (3)
How long does a user stay in the system? (important to data replication)
– At least 70% of users stay longer than 15 minutes and are thus able to provide upload service

22 Statistics on user behavior (4) Start position distribution

23 Observations from last slide
A large fraction of users start watching movies from the beginning
Users jump to different positions roughly uniformly, so anchor points for jump operations could be uniformly spaced; this guides the chunk selection strategy

24 Health index of Movies (1) Number of peers that cache the movie

25 Health index of Movies (2) Average owning ratios for different chunks (avg. over 24 hours)

26 What do the observations imply?
Many users watch movies from the beginning
Many users do not finish the whole movie. Why? The movie epilog!

27 Health index of Movies (3) Chunk availability and chunk demand

28 Health index of Movies (4) The available-to-demand ratios (a ratio >= 1 is good)

29 User Satisfaction Index (1)
Generating the fluency index
– The computation of F(m, i) is carried out by the client software.
– The client software reports all MVRs and the fluency F(m, i) to the log server whenever a "stop-watching" event occurs:
  the STOP button is pressed,
  another movie/program is selected, or
  the user turns off the P2P-VoD software.

30 User Satisfaction Index (2) The number of fluency records –A good indicator of the number of viewers of the movie

31 User Satisfaction Index (3)
The distribution of the fluency index (a fluency >= 0.8 is considered good)
– Good overall, but buffering time needs to improve!

32 Future Work
Further research in P2P-VoD systems
– How to design a highly scalable P2P-VoD system to support millions of simultaneous users
– How to perform dynamic movie replication, replacement, and scheduling so as to reduce the workload at the content servers
– How to quantify various replication strategies so as to guarantee a high health index
– How to select proper chunk and piece transmission strategies so as to improve the viewing quality
– How to accurately measure and quantify the user satisfaction level

33 Thinking
What distinguishes PPLive-VoD from BitTorrent?
– From a design perspective
– Similarities
– Differences

