Designing QoE experiments to evaluate Peer-to-Peer streaming applications Tom Z.J. Fu, CUHK Dah Ming Chiu, CUHK Zhibin Lei, ASTRI VCIP 2010, Huang Shan, China

Outline  Introduction & motivation  Chunk-level impairment model  Experiment setting  Result analysis and insights  Future work & conclusion

Introduction and motivation
- Internet streaming services have become popular. Delivery mechanisms include:
  - S/C (server/client) mechanism;
  - P2P mechanism (the one most commonly implemented);
  - CDN;
  - single/multiple tree-based application-layer multicast;
  - peer-to-peer streaming (live streaming / VoD).
- A proper methodology is needed to evaluate the different mechanisms, e.g. the different strategies used in a P2P system.
- Two types of evaluation method:
  - Objective: measuring objective metrics (packet loss rate, transmission delay);
  - Subjective: inviting subjects to give scores.

1. Existing methods are not suitable
- They consider only a packet-level impairment model for single-link network transmission (packet loss rate, packet end-to-end delay, etc.).
- In most large-scale P2P streaming systems, the chunk (much larger than one packet) is the basic unit of almost all building blocks and design issues.
2. Various objective metrics are defined in different systems and analytical models
- Buffer count (UUSee measurement);
- Playback continuity (several different definitions: Coolstreaming, PPLive measurement, etc.);
- Validation by subjective testing is necessary.

Chunk-level impairment model
The traditional HRC includes:
- source video (SRC),
- video encoder,
- network transmission,
- video decoder,
- processed video (PVS).
[Figure labels: packet-level impairment, for a single link (e.g. packet loss rate, end-to-end delay); chunk-level impairment, for dynamic topology and various strategies.]

Chunk-level impairment model
- Video encoder: different media codecs and transmission rates can be chosen at the video encoder component.
- Network transmission: the chunk-level impairment module, consisting of:
  - Chunk maker: organizes video stream packets into chunks.
  - Chunk-level distortion generator: can be implemented in three different ways.
  - Chunk buffer manager and playback controller: keeps the received chunks in a local chunk-level buffer and makes the playback decision for each chunk.
- Video decoder: after decoding, the processed video sequences (PVS) are displayed to the users on monitors.

Chunk-level impairment model
Notations:
- $T_i^e$: the expected playback time of chunk $i$;
- $T_i^s$: the time at which the download of chunk $i$ starts;
- $T_i^c$: the time at which the download of chunk $i$ completes.
1. Chunk-level delay. Chunk $i$ is delayed if $D_i = \{T_i^c - T_i^e\}^+ > 0$, where $\{x\}^+ = x$ when $x > 0$ and $0$ otherwise.
2. Chunk delay distribution (CDD). The CDD is the aggregate statistics of all delayed chunks. In the simplest case, it can be represented by a discrete random variable.
3. Chunk receiving pattern (CRP). The CRP describes how chunk $i$ is filled over the whole download process. Let $f_i(t)$, $t \in [T_i^s, T_i^c]$, denote the fraction of chunk $i$ that has been downloaded at time $t$. The CRP can then be represented by any non-decreasing curve $f_i(t)$ over $t \in [T_i^s, T_i^c]$ subject to $f_i(T_i^s) = 0$ and $f_i(T_i^c) = 1$.
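The delay definition and the discrete CDD above can be sketched in a few lines of Python. This is only an illustration: the function names and the example timestamps are hypothetical and do not come from the slides.

```python
from collections import Counter


def chunk_delay(t_complete: float, t_expected: float) -> float:
    """D_i = {T_i^c - T_i^e}^+, i.e. the positive part of the lateness."""
    return max(t_complete - t_expected, 0.0)


def chunk_delay_distribution(delays):
    """Empirical distribution over delayed chunks only (D_i > 0).

    Returns a dict mapping each observed delay value to its relative
    frequency, which is one simple way to realise the 'discrete random
    variable' representation of the CDD.
    """
    delayed = [d for d in delays if d > 0]
    if not delayed:
        return {}
    counts = Counter(delayed)
    return {d: n / len(delayed) for d, n in sorted(counts.items())}


# Hypothetical timestamps in seconds (not measured data).
expected = [0.0, 1.0, 2.0, 3.0]    # T_i^e
completed = [-0.5, 2.0, 2.0, 6.0]  # T_i^c
delays = [chunk_delay(c, e) for c, e in zip(completed, expected)]
cdd = chunk_delay_distribution(delays)
```

Chunks that complete on or before their expected playback time get a delay of zero and are left out of the CDD, matching the definition of the CDD as a statistic over delayed chunks.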

Illustration of different CRPs
- Curves A, B, C and D share the same download start time $T_i^s$ (1 second before $T_i^e$) and the same completion time $T_i^c$ (4 seconds after $T_i^e$).
- At any instant during the download, a chunk following curve A has received more content than one following B, C or D.
- At $t = T_i^e$, the expected playback time, curve A yields a chunk that is 80% complete, B only 20%, C close to 0% and D 0%.
Note: this work applies only the simplest pattern (curve D, in which all content arrives at the same time, $T_i^c$); the more complicated curves will be studied in future work.
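Two of these patterns are easy to write down explicitly. The sketch below assumes that curve B is linear, which is consistent with its 20% completeness at the expected playback time but is not stated on the slide, and models curve D as a step at the completion time. The function names and the absolute timestamps are hypothetical.

```python
def crp_linear(t: float, t_start: float, t_complete: float) -> float:
    """Linear CRP: content arrives at a constant rate (assumed shape for curve B)."""
    if t <= t_start:
        return 0.0
    if t >= t_complete:
        return 1.0
    return (t - t_start) / (t_complete - t_start)


def crp_step(t: float, t_start: float, t_complete: float) -> float:
    """Step CRP (curve D): all content arrives at once at t_complete."""
    return 1.0 if t >= t_complete else 0.0


# Timing from the illustration: download starts 1 s before the expected
# playback time and completes 4 s after it. The value 10.0 is arbitrary.
t_expected = 10.0
t_start = t_expected - 1.0
t_complete = t_expected + 4.0

completeness_b = crp_linear(t_expected, t_start, t_complete)  # about 0.2
completeness_d = crp_step(t_expected, t_start, t_complete)    # 0.0
```

Note that the step pattern reaches 1 only at the completion time, so it does not take the value 0 and 1 through a continuous rise; it is the limiting case used in this work.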

Chunk-level distortion generator
- Live experiments: the most detailed CRP for each chunk can be collected and recorded during a real-life experiment.
- Simulation results: a large network with a large number of users can be simulated, and the simulation is repeatable. The same kind of detailed CRP traces can be collected.
- Artificial generation: different chunk delays (following a chosen distribution) or chunk receiving patterns (implementing $f_i(t)$ with different increasing curves and parameters) are created manually for subjective testing purposes.

Simple playback controller
- In a P2P streaming system, the playback controller (PC) plays an essential role.
- Chunks fall into two cases:
  - non-delayed chunk: the download completes on or before $T_i^e$;
  - delayed chunk ($D_i > 0$): the download is not complete at $T_i^e$.
- The PC handles the two cases as follows:
  - non-delayed chunk: move it out of the local buffer and send it to the decoder to be played back;
  - delayed chunk: three possible actions (others are conceivable):
    a) wait until the chunk is complete and then send it to the decoder;
    b) send the incomplete chunk to the decoder immediately, with no waiting;
    c) wait for at most the longest waiting time (LWT): as soon as the timer expires or the chunk is complete, the PC stops waiting and sends the chunk to the decoder.

Simple playback controller
- LWT = ∞: case (a)
- LWT = 0: case (b)
- LWT in between: case (c)
Note: the implementation of the PC can be more complicated; this will be studied in future work.
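Because cases (a) and (b) are the two extremes of case (c), a single LWT parameter is enough to describe the simple playback controller. The following Python sketch shows one way to express that decision; the function name and return shape are hypothetical. Treating a delay exactly equal to LWT as a timeout is an assumption, chosen so that it matches the $D_i \geq$ LWT case of the viewing-effects slide.

```python
import math


def playback_decision(t_expected: float, t_complete: float, lwt: float):
    """Return (send_time, is_complete) for one chunk.

    lwt = math.inf reproduces case (a): always wait for completion.
    lwt = 0        reproduces case (b): never wait.
    0 < lwt < inf  reproduces case (c): wait at most lwt seconds.
    """
    delay = max(t_complete - t_expected, 0.0)
    if delay == 0.0:
        # Non-delayed chunk: played at its expected playback time.
        return (t_expected, True)
    if delay < lwt:
        # The chunk completes before the timer expires.
        return (t_complete, True)
    # Timer expires first (ties count as a timeout, by assumption).
    return (t_expected + lwt, False)


on_time = playback_decision(10.0, 9.0, 3.0)            # (10.0, True)
wait_forever = playback_decision(10.0, 12.0, math.inf)  # (12.0, True)
no_wait = playback_decision(10.0, 12.0, 0.0)            # (10.0, False)
bounded_ok = playback_decision(10.0, 12.0, 3.0)         # (12.0, True)
bounded_timeout = playback_decision(10.0, 15.0, 3.0)    # (13.0, False)
```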

Experiment setting
- Experiment goals:
  - Validate the effectiveness of some well-studied performance metrics, e.g. the average playback (dis)continuity, by checking whether they correlate with subjective scores.
  - If the correlation exists, find a simple mapping function between the objective and subjective metrics.
  - Explore the relationship between the chunk delay distribution (CDD) and subjective QoE.
  - Gain insights that help in designing streaming peer software.
- Experiment settings:
  - 50 source video clips with an average length of 30 seconds;
  - 30 subjects (16 males and 14 females), aged 18 to 28;
  - Assessment scheme: Absolute Category Rating (ACR) with hidden reference.

Experiment setting
- Source videos (SRC)
- Simple decoder implementation:
  - If the video chunk sent by the PC is incomplete, discard it; otherwise decode it and play it back (this suffices for curve D only).
  - If no chunk has been received from the PC at the expected playback time, the decoder freezes on the last playable image until new content arrives.

Resulting viewing effects
Given this decoder implementation, chunk-level distortions cause three possible viewing effects:
- $D_i = 0$: no distortion. If chunk $i$ is complete by its expected playback time, it is decoded and played back normally.
- $0 < D_i <$ LWT: freeze-and-play. If chunk $i$ is delayed but completes before the LWT expires, the PVS first freezes on an image for a duration of $D_i$ and then plays chunk $i$ normally.
- $D_i \geq$ LWT: freeze-and-discard. If chunk $i$ is delayed and still incomplete when the LWT expires, the PVS freezes on an image for LWT and then jumps directly to chunk $i + 1$.
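The three viewing effects can be tallied for a whole sequence of chunk delays, which gives both the number of freezing events and the total freezing time. The Python sketch below is illustrative only: the function names, the result keys and the example delays are hypothetical, and the default LWT of 3 seconds is simply a convenient value.

```python
def viewing_effect(delay: float, lwt: float = 3.0):
    """Map one chunk delay D_i to (effect name, freeze duration in seconds)."""
    if delay <= 0:
        return ("none", 0.0)
    if delay < lwt:
        return ("freeze-and-play", delay)
    return ("freeze-and-discard", lwt)


def summarize_effects(delays, lwt: float = 3.0):
    """Count freezing events, total freeze time and discarded chunks."""
    effects = [viewing_effect(d, lwt) for d in delays]
    freezes = [duration for _, duration in effects if duration > 0]
    discarded = sum(1 for name, _ in effects if name == "freeze-and-discard")
    return {
        "freeze_events": len(freezes),
        "total_freeze_seconds": sum(freezes),
        "discarded_chunks": discarded,
    }


# Hypothetical delay sequence in seconds (not measured data).
summary = summarize_effects([0.0, 1.5, 3.0, 0.0, 4.0], lwt=3.0)
```

Separating the event count from the total duration is useful when two delay patterns with the same overall freezing time need to be compared.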

Testing set
A reduced set of 20 combinations of chunk-level distortions, composed of two factors:
- Average discontinuity ($d = 1 - c$, with $c$ the average playback continuity), at ten levels: 0, 5%, 10%, 15%, 20%, 25%, 30%, 40%, 50%, 60%.
- Two types of chunk delay distribution (CDD):
  1. Short delay distribution: delays uniformly distributed in [0, 2] seconds; or
  2. Long delay distribution: all delays equal to 3 seconds (= LWT; LWT is set to 3 seconds by default).
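The test grid is the cross product of the ten discontinuity levels and the two delay distributions. A minimal Python sketch of the grid and of a delay sampler follows; the names and the fixed random seed are hypothetical. How a target discontinuity is turned into a number of delayed chunks is not specified here, so the sketch leaves that step out.

```python
import random

DISCONTINUITY_LEVELS = [0.0, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.40, 0.50, 0.60]
CDD_TYPES = ("short", "long")
LWT_SECONDS = 3.0


def build_test_conditions():
    """All (discontinuity, CDD type) pairs: 10 levels x 2 types = 20."""
    return [(d, cdd) for cdd in CDD_TYPES for d in DISCONTINUITY_LEVELS]


def sample_delay(cdd: str, rng: random.Random) -> float:
    """Draw one chunk delay in seconds for the given CDD type."""
    if cdd == "short":
        return rng.uniform(0.0, 2.0)  # uniform in [0, 2] seconds
    if cdd == "long":
        return LWT_SECONDS            # every delay equals the LWT
    raise ValueError(f"unknown CDD type: {cdd!r}")


conditions = build_test_conditions()
rng = random.Random(0)  # fixed seed so the generated delays are repeatable
short_delays = [sample_delay("short", rng) for _ in range(100)]
long_delays = [sample_delay("long", rng) for _ in range(100)]
```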

Result analysis and insights
[Figure: subjective assessment results for each processed video sequence; MOS value (left), DMOS value (right).]
The meaning of Mean Opinion Score (MOS) and DMOS:

Result analysis and insights
Insights from the subjective assessment results:
1. The DMOS analysis (right) is consistent with the MOS analysis (left), which suggests that the experiment results are reasonable, where:
  - DMOS is derived by subtracting the MOS of the PVS from the MOS of the reference video (same category, no distortion);
  - the DMOS metric removes the bias in the subjective scoring process caused by an individual's preference for particular video content.
2. A correlation between the objective metric and subjective QoE exists.
3. The line obtained by linear regression of MOS on the discontinuity ($d$) can be used later to predict QoE from a measured discontinuity without conducting subjective tests, which saves cost.
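The DMOS definition and the regression step can be sketched as follows in Python. The function names are hypothetical, and the numbers are made-up placeholders chosen to lie exactly on a line; they are not the scores obtained in the experiment.

```python
def dmos(mos_reference: float, mos_pvs: float) -> float:
    """DMOS = MOS of the undistorted reference minus MOS of the PVS."""
    return mos_reference - mos_pvs


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_mos(d: float, slope: float, intercept: float) -> float:
    """Predict MOS from a measured average discontinuity d."""
    return slope * d + intercept


# Placeholder data for illustration only (NOT experimental results).
discontinuity = [0.0, 0.1, 0.2]
mos_scores = [4.5, 4.0, 3.5]

slope, intercept = fit_line(discontinuity, mos_scores)
predicted = predict_mos(0.15, slope, intercept)
example_dmos = dmos(4.5, 3.0)
```

Once the slope and intercept have been fitted on real subjective scores, only the objective discontinuity needs to be measured to obtain a QoE estimate.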

[Figure: comparison between short and long chunk delay distributions; MOS value (left), DMOS value (right).]
Insights from the comparison:
1. PVSs with the long delay distribution obtain higher MOS than those with the short delay distribution at the same average $d$.
2. For the same average $d$, longer delays mean fewer freezing events, so subjects appear to care more about the number of screen-freezing events than about the duration of each one.

Future work & conclusion
- Future work
  - Conduct more experiments with different parameter settings.
  - Change the decoder implementation to support incomplete chunks and concealment algorithms.
  - Based on this framework, study more complicated playback controller designs (how long to wait for a delayed chunk).
  - Study different chunk receiving patterns.
- Conclusion
  - A chunk-level impairment model is proposed for the P2P mechanism.
  - Applying this new model, we carry out subjective experiments.
  - The results are preliminary but already yield some interesting insights.

The end Thanks! Q & A
