
1 Challenges, Design and Analysis of a Large-scale P2P-VoD System
Yan Huang, Tom Z. J. Fu, Dah-Ming Chiu, John C. S. Lui and Cheng Huang
The Chinese University of Hong Kong / Shanghai Synacast Media Tech (PPLive)
Presented by Xiaofei Wang, 2009.08.12

2 Outline
- Introduction
- How to design and build P2P-VoD
- Performance metrics and methods
- Measurement
- Conclusion

4 1. Introduction
- P2P applications: file sharing (BT, eMule, ...), live streaming (PPLive, PPStream, ...), and more - users help each other
- PPLive (by Synacast Media Tech): the most popular P2P streaming software - 120M total users / 150K online

5 P2P Video-on-Demand in PPLive
- Every peer contributes a small amount of storage (1 GB)
- Videos are stored not only on dedicated servers but also across peers in a distributed way
- Peers download, upload, and share
(Diagram: live streaming vs. centralized VoD vs. distributed VoD)

6 Therefore, P2P-VoD
- has less synchrony in users' sharing of video content
- makes it much more difficult to balance the load on servers and peers while maintaining streaming performance
- requires new mechanisms for coordinating content replication, discovery, transmission, peer scheduling, ...
Furthermore, this work contributes:
- metrics/methods to evaluate a P2P-VoD system
- measurement results

7 Outline
- Introduction
- How to design and build P2P-VoD
- Performance metrics and methods
- Measurement
- Conclusion

8 2. Design and Building Blocks
- A P2P-VoD system consists of:
  - a set of content servers, as the initial source of content
  - a set of trackers, which keep track of where each video is distributed
  - a bootstrap server, to help peers find a suitable tracker
  - a log server, for collecting event logs
  - and the peers themselves
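A minimal sketch of how these components could fit together; nothing here is PPLive's actual protocol, and all class and method names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Tracker:
    # movie_id -> set of peer addresses replicating (part of) that movie
    replicas: dict = field(default_factory=dict)

    def register(self, movie_id, peer_addr):
        self.replicas.setdefault(movie_id, set()).add(peer_addr)

    def lookup(self, movie_id):
        return self.replicas.get(movie_id, set())

@dataclass
class BootstrapServer:
    trackers: list  # list of Tracker instances

    def find_tracker(self, movie_id):
        # e.g. hash the movie id onto a tracker (see the DHT slide below)
        return self.trackers[hash(movie_id) % len(self.trackers)]
```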

9 Segmentation in PPLive (1)
- P2P-VoD requires careful segment sizing:
  - Scheduling view: divide the content into as many pieces as possible - more flexibility in scheduling which piece comes from which peer (favors small segments)
  - Overhead view: headers, video bitmaps, control messages, and so on - the larger the segment, the lower the relative overhead
  - Real-time streaming view: the player requires a minimum amount of video data (16 KB) before it can display

10 Segmentation in PPLive (2)
- Segment implementation: total overhead 6.2% ~ 10%
- Other related settings: .wmv format; source rate 381~450 Kbps, 700 Kbps for HD
- The chunk is the basic unit stored (dispersedly) on a peer's disk; a peer reports which parts of a video it holds to its tracker via a 32-bit chunk bitmap
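As a concrete illustration of the 32-bit chunk bitmap: bit i is set when the peer holds chunk i of the video. The helper names below are hypothetical; only the bit-per-chunk idea comes from the slide:

```python
def make_bitmap(held_chunks, num_chunks=32):
    """Pack the set of held chunk indices into a 32-bit integer."""
    bitmap = 0
    for i in held_chunks:
        if 0 <= i < num_chunks:
            bitmap |= 1 << i
    return bitmap

def has_chunk(bitmap, i):
    return bool(bitmap >> i & 1)

# A peer holding chunks 0-3 would advertise 0b1111 (= 15) to the tracker.
assert make_bitmap({0, 1, 2, 3}) == 0b1111
assert has_chunk(0b1111, 2) and not has_chunk(0b1111, 5)
```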

11 Replication Strategy
- Goal: efficiently make chunks more available
- Multiple Video Cache (MVC) vs. Single Video Cache (SVC)
- Pre-fetching is turned off to save bandwidth; only viewed content is shared
- Chunk-removal strategy when the cache is full: originally Least Frequently Used, currently improved to a weight-based scheme (sketched below) combining:
  - A. how completely the video is cached locally (a fraction)
  - B. how needed another copy of the movie is (viewers per copy)
- Overall, 6~8 peers suffice to offload the source; server load dropped from 19% to 7~11%
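The slide does not give the exact weighting formula, so the sketch below is one plausible reading: combine (A) the locally cached fraction with (B) the viewers-per-copy demand, and evict the lowest-weight movie when the 1 GB cache is full:

```python
def cache_weight(local_fraction, viewers, copies):
    """Higher weight = more worth keeping in the local cache (illustrative)."""
    demand_per_copy = viewers / max(copies, 1)   # (B) how needed a copy is
    return local_fraction * demand_per_copy      # (A) x (B)

def evict_one(cached_movies):
    """Remove the lowest-weight movie; each entry is a dict of its stats."""
    victim = min(cached_movies,
                 key=lambda m: cache_weight(m["fraction"], m["viewers"], m["copies"]))
    cached_movies.remove(victim)
    return victim
```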

12 Content Discovery & Overlay
- A) Tracker servers: keep track of which peers replicate a given movie (or part of it)
- B) DHT (Distributed Hash Table): assigns movies to trackers for load balancing; peers use the DHT as a non-deterministic path to the trackers (a sketch follows)
- C) Gossiping: chunk bitmaps (which chunks a peer has) and keep-alive messages
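One common way to realize "DHT assigns movies to trackers" is a consistent-hash ring. The slide does not name PPLive's actual scheme, so the following sketch is an assumption:

```python
import hashlib
from bisect import bisect

def h(key: str) -> int:
    return int(hashlib.sha1(key.encode()).hexdigest(), 16)

class TrackerRing:
    def __init__(self, tracker_ids):
        # Place trackers on a hash ring, sorted by hash value.
        self.ring = sorted((h(t), t) for t in tracker_ids)

    def tracker_for(self, movie_id: str) -> str:
        # A movie maps to the first tracker clockwise from its hash.
        keys = [k for k, _ in self.ring]
        i = bisect(keys, h(movie_id)) % len(self.ring)
        return self.ring[i][1]

ring = TrackerRing(["tracker-1", "tracker-2", "tracker-3"])
print(ring.tracker_for("movie-42"))  # deterministic mapping, balanced load
```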

13 Piece Selection
- Chunks are downloaded from peers by "pull"; candidate policies (sketched below):
  - Sequential - closest to what is needed for video playback
  - Rarest First - fetch the locally rarest pieces first, improving overall piece availability
  - Anchor-based - "anchor" points: particular positions in the video that users can jump to; not implemented currently, since users do not jump that much (1.8 times per movie)
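A sketch of how sequential and rarest-first could be mixed; the urgent-window idea and its size are assumptions, since the slide only lists the candidate policies:

```python
def next_piece(missing, playback_pos, availability, urgent_window=4):
    """missing: set of piece indices; availability: piece -> #neighbors holding it."""
    if not missing:
        return None
    # Sequential: anything inside the urgent playback window comes first.
    urgent = [p for p in missing if playback_pos <= p < playback_pos + urgent_window]
    if urgent:
        return min(urgent)  # closest to the playback point
    # Otherwise rarest-first improves overall availability.
    return min(missing, key=lambda p: availability.get(p, 0))
```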

14 Transmission Strategy
- In PPLive, a peer can request different content from multiple neighbors simultaneously (see the sketch below)
- Trade-off: maximize downloading rate vs. minimize overhead
  - 500 Kbps stream - 8~20 neighbors
  - 1 Mbps stream - 16~32 neighbors
- When there are not enough peers, the content server supplements the demand
- No "tit-for-tat"-like incentive; no limits imposed on uplink/downlink usage
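A minimal sketch of requesting distinct pieces from multiple neighbors in parallel. Real scheduling (timeouts, re-requests, falling back to the content server) would be more involved:

```python
from itertools import cycle

def assign_requests(pieces, neighbors):
    """Spread distinct piece requests across neighbors round-robin."""
    plan = {n: [] for n in neighbors}
    for piece, neighbor in zip(pieces, cycle(neighbors)):
        plan[neighbor].append(piece)
    return plan

print(assign_requests([0, 1, 2, 3, 4], ["peerA", "peerB", "peerC"]))
# {'peerA': [0, 3], 'peerB': [1, 4], 'peerC': [2]}
```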

15 Outline
- Introduction
- How to design and build P2P-VoD
- Performance metrics and methods
- Measurement
- Conclusion

16 3. Metrics and Methodology
- What to measure?
  - User behavior: user arrival pattern; duration of watching
  - External performance: user satisfaction index; server load
  - Health of replication: movie health index

17 Movie Viewing Record (MVR)
- An MVR records the continuous viewing of one stretch of a movie; the fields referenced later include the start/end viewing times (ST/ET) and the start/end positions within the movie (SP/EP)
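A sketch of an MVR as a record type, using the ST/ET/SP/EP fields referenced by the later slides; the exact layout (and the buffering-time field BT) is an assumption:

```python
from dataclasses import dataclass

@dataclass
class MVR:
    user_id: str
    movie_id: str
    ST: float  # wall-clock time the viewing stretch started
    ET: float  # wall-clock time it ended
    SP: int    # starting position within the movie (SP == 0: from the beginning)
    EP: int    # ending position within the movie
    BT: float  # total buffering time during this stretch (assumed field)
```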

18 User Satisfaction
- Total viewing time, and from it the fluency: the fraction of time a user spends watching a movie out of the total time he/she spends waiting for (buffering) and watching that movie
- fluency = viewing time / (buffering time + viewing time); the higher the better (1 is best: no buffering time)
- E.g., supposing 10 units of buffering time against 90 units of viewing: 90 / (90 + 10) = 0.9
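The definition above as code; aggregation across sessions is not specified on the slide:

```python
def fluency(viewing_time: float, buffering_time: float) -> float:
    """Fraction of the session spent actually watching (1.0 = no buffering)."""
    total = viewing_time + buffering_time
    return viewing_time / total if total > 0 else 0.0

# 90 time units watched with 10 units spent buffering:
print(fluency(90, 10))  # 0.9 -- close to 1, i.e. a smooth session
```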

19 Health of Replication
- Three levels (computed in the sketch below):
  - Movie level: the number of active peers that have advertised storing chunks of that movie
  - Weighted movie level: each peer weighted by the fraction of the movie it stores (0.5 for saving 50%)
  - Chunk bitmap level: the number of copies of each chunk of a movie stored across peers
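The three levels as code, computed from advertised chunk bitmaps (peer -> set of held chunk indices); the function names are mine, not the paper's:

```python
def movie_health(bitmaps):
    """Movie level: number of peers advertising any chunk of the movie."""
    return sum(1 for chunks in bitmaps.values() if chunks)

def weighted_movie_health(bitmaps, num_chunks):
    """Weighted level: each peer counts by the fraction of the movie it stores."""
    return sum(len(chunks) / num_chunks for chunks in bitmaps.values())

def chunk_health(bitmaps, num_chunks):
    """Chunk level: number of copies of each individual chunk."""
    return [sum(1 for c in bitmaps.values() if i in c) for i in range(num_chunks)]

bitmaps = {"p1": {0, 1, 2, 3}, "p2": {0, 1}, "p3": set(range(8))}
print(movie_health(bitmaps))               # 3
print(weighted_movie_health(bitmaps, 8))   # 0.5 + 0.25 + 1.0 = 1.75
print(chunk_health(bitmaps, 8))            # [3, 3, 2, 2, 1, 1, 1, 1]
```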

20 Outline
- Introduction
- How to design and build P2P-VoD
- Performance metrics and methods
- Measurement
- Conclusion

21 4. Results and Analysis
- Measurements are taken mainly at the log servers and tracker servers
- Period: Dec 23, 2007 to Dec 27, 2007
- Three typical movies; a chunk is about 40 seconds; views started from the beginning
- Movie 2 is the most popular, but also has the smallest view duration and the highest jump rate

22 Inter-arrival Time PDF
- Built from the set of MVRs with SP = 0 (views starting from the beginning)
- Distribution of the differences (intervals) between their ST fields
- Follows an exponential distribution (estimation sketched below)
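The measurement step as a sketch, reusing the MVR record sketched earlier: filter to SP == 0, sort by ST, take successive differences, and estimate the exponential rate:

```python
def interarrival_times(mvrs):
    """Gaps between consecutive start times of views that begin at SP == 0."""
    starts = sorted(m.ST for m in mvrs if m.SP == 0)
    return [b - a for a, b in zip(starts, starts[1:])]

def exp_rate_estimate(gaps):
    """MLE of the rate lambda for exponentially distributed gaps."""
    return len(gaps) / sum(gaps) if gaps else 0.0
```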

23 View Duration CDF
- A very high percentage of MVRs are of short duration - less than 10 minutes

24 Residence Distribution
- How long does a peer stay in PPLive? (measured over 1 week)
- Over 70% of peers stay for more than 15 minutes, and a fairly high ratio of peers stay for hours - they serve a lot!
- At the same time, a fairly high ratio of peers stay less than 5 minutes
- Inference: peers probably first quickly scan a few videos until they find an interesting one and continue watching, or they just leave the system to wait/seek/comment ...

25 Start Position Distribution
- Important for designing efficient anchor points
- Start positions are roughly uniformly distributed, so anchor points can be set uniformly
- Movie 2 has a higher jump ratio, so its SP curve is low here: very low probability of starting right from the beginning

26 Number of View Activities (MVRs)
- "Daily periodicity": peaks around 2pm and 11pm; drops around dawn
- (Two curves: concurrent MVRs at each hourly sampling point, and cumulative MVRs within each hour)

27 Health Index of Movies (1)
- Movie owner: a peer that has at least one chunk of the movie
- Movie 2 is by far the most popular; the number of owners follows the daily pattern on the previous slide

28 Health Index of Movies (2)
- Owning ratio of chunk i at time t: (number of peers storing chunk i at time t) / (number of peers owning the movie at time t); the higher the better
- More peers have the early chunks, since views start from the beginning
- On average 30~40%

29 Health Index of Movies (3)
- Replication vs. demand: the number of available replicas of chunk i is always much higher than the number of demands for chunk i
- People tend to skip the last chunks (the epilogue)
- Movie 2 is popular; its large fluctuation is due to the high interactivity of its users

30 Health Index of Movies (4)
- Available-to-demand ratio (ATD) for chunk i at time t: ATD(i, t) = a(i, t) / d(i, t), where a(i, t) is the number of available replicas of chunk i and d(i, t) is the number of demands for it
- The ATD of a movie is the average of ATD(i, t) over all of its chunks
- For good scalability and quality, ATD should be greater than 1 (see the sketch below)
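The ATD definition as code; the per-movie average follows the reconstruction above:

```python
def chunk_atd(available: int, demand: int) -> float:
    """ATD(i, t) = a(i, t) / d(i, t) for one chunk."""
    return available / demand if demand > 0 else float("inf")

def movie_atd(avail_by_chunk, demand_by_chunk):
    """Movie-level ATD: average of the per-chunk ratios."""
    ratios = [chunk_atd(a, d) for a, d in zip(avail_by_chunk, demand_by_chunk)]
    return sum(ratios) / len(ratios)

print(movie_atd([30, 28, 25], [9, 8, 5]))  # ~3.9, well above 1 -> healthy
```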

31 Health Index of Movies (5)
- The ATD of all three movies is always > 3
- Overall replication health is very good

32 Health Index of Movies (6)
- Temporal means and standard deviations of the number of available replicated chunks per movie
- A high temporal mean and low standard deviation mean the replication is stable and reliable
- The temporal mean, standard deviation, and ATD of movie 2 are very high from 12:00 to 19:00, because many users are watching it

33 User Satisfaction Index (1)
- Fluency is calculated by clients in a distributed manner: whenever a "stop-watching" event occurs, the client computes it from all of its MVRs and reports it

34 User Satisfaction Index (2)
- Number of fluency records reported to the server in one day
- Rises and drops nearly in step with the population: more users mean more "stop-watching" events and hence more fluency records

35 User Satisfaction Index (3)
- Distribution of the fluency index of movies over one day
- Fluency > 0.8: good! Fluency < 0.2: bad!
- Most indexes are well above 0.7, but about 20% are below 0.2
- Many fluency indexes sit around 0.1~0.2, probably because many users start watching with a long buffering time, then find the video uninteresting or lose the patience to wait for buffering, and stop/leave

36 User Satisfaction Index (4)
- How does an increasing population affect fluency?
- Good fluency: 0.8~1.0; bad fluency: 0.0~0.2
- When the population grows gradually, the shares of good and bad fluency stay roughly constant
- When the population surges suddenly, bad fluency rises and good fluency falls

37 Server Load
- 48-hour measurement; about 100 movies per server; Dell PowerEdge 1430 servers
- Upload rate and CPU utilization are correlated with the number of users

38 NAT
- Peers cluster into local (NATed) networks
- About 80% of peers are behind NAT (full cone > symmetric > port-restricted, in order of frequency)

39 Average Upload/Download Rate
- Download rate from servers: 32 Kbps; from peers: 352 Kbps
- Upload rate of a peer: 368 Kbps
- Average server loading is 8.3% (32 Kbps out of the 384 Kbps total download comes from servers)

40 Outline
- Introduction
- How to design and build P2P-VoD
- Performance metrics and methods
- Measurement
- Conclusion

41 5. Conclusion
- A P2P-VoD streaming service by PPLive
- All content reflects lessons from long-term practical operation
- Design issues: segmentation, replication, transmission, ...
- Metrics & measurement: user satisfaction, server load, health index, and so on
- Analysis of the results
