CS 598kn – Multimedia Systems Design: Introduction to Video 360
Klara Nahrstedt, CS 598kn - Fall 2017
Outline
Definition of 360-degree video
Creation of Video 360
Playback of Video 360
Streaming of Video 360
Conclusion
Definition
360-degree video (see YouTube Video 360), also known as immersive video or spherical video: video recordings where the view in every direction is recorded at the same time, shot using an omnidirectional camera or a collection of cameras.
(Figure: the six viewing directions: Up, Down, Left, Right, Forward, Backward)
Creation of Video 360
Use a special rig of multiple cameras, or a dedicated camera that contains multiple camera lenses embedded into the device.
Creation of Video 360
Video from multiple cameras is filmed simultaneously with overlapping angles. By stitching the videos together into one spherical video piece, one gets Video 360. The color and contrast of each shot are calibrated to be consistent with the others.
Creation of Video 360
Stitching and color/contrast calibration can be done at:
The camera itself
Specialized video editing software that can analyze visuals and audio to synchronize the different camera feeds
Video 360 is formatted in an equirectangular projection and is monoscopic (one image directed to both eyes) or stereoscopic.
Equirectangular Projection (ERP)
A geographic projection. Definition: the projection maps meridians to vertical straight lines of constant spacing, and circles of latitude to horizontal straight lines of constant spacing. The projection is neither equal-area nor conformal.
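Since the mapping is linear in both angles, a sphere point lands at a pixel position proportional to its longitude and latitude. A minimal sketch of this mapping (the function name and the image dimensions are illustrative):

```python
def sphere_to_erp(lat_deg, lon_deg, width, height):
    """Map a sphere point (in degrees) to ERP pixel coordinates.

    Meridians map to vertical lines of constant spacing and circles of
    latitude to horizontal lines of constant spacing, so the mapping is
    linear in both angles.
    """
    u = (lon_deg + 180.0) / 360.0 * width    # longitude -> column
    v = (90.0 - lat_deg) / 180.0 * height    # latitude  -> row
    return u, v
```

Note the 2:1 aspect ratio (360 degrees of longitude against 180 degrees of latitude) that ERP video frames typically have.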
Stereoscopic projection
A technique for creating the illusion of depth in an image by means of stereopsis for binocular vision. Most stereoscopic methods present two offset images separately to the left and right eye of the viewer. These two-dimensional images are then combined in the brain to give the perception of 3D depth.
Creation of Video 360
Due to this projection and stitching, equirectangular video exhibits lower quality in the middle of the image than at the top and bottom.
Omnidirectional cameras: Vuze camera, Nokia OZO; dual-lens cameras: Samsung Gear 360.
Playback of Video 360
Viewing on PCs, smartphones, and head-mounted displays
Users can pan around the video by clicking and dragging
On mobile phones, one can use the gyroscope to pan the video based on the orientation of the device.
Playback of Video 360 on smartphone
Playback of Video 360
Viewing devices: Google Cardboard, Samsung Gear VR, Oculus Rift
These videos require good resolution: 6K or higher
Industry Support for Video 360
March 2015: YouTube launched support for publishing and viewing Video 360 in its Android mobile apps
September 2015: Facebook added support for Video 360; Facebook Surround 360 product
March 2017: Facebook had over 1M Video 360 items
2017: Google and YouTube started to support an alternative stereoscopic video format, VR180, which is limited to 180 degrees and is more accessible
Video 360 Streaming
L. Xie et al., "360ProbDASH: Improving QoE of 360 Video Streaming Using Tile-based HTTP Adaptive Streaming", ACM Multimedia 2017, Mountain View, CA, October 2017.
Video 360 Streaming Framework with DASH
Streaming Constraints
Challenge: streaming 6K Video 360 over DASH
Need to consider viewport-adaptive streaming
Important concept: Field of View (FOV), i.e., the viewport
Preserve the FOV at full quality, while the other parts are sent at lower quality
Viewport-adaptive streaming
Two categories: asymmetric panorama and tile-based
The asymmetric panorama method transforms the 360 video into a viewport-dependent multi-resolution panorama, which decreases quality away from the viewport. This method is prevalent since it still provides the full 360 image.
Disadvantage: it wastes bandwidth, because the user's viewport is limited to the FOV.
Asymmetric panorama-oriented streaming
Representation formats: Truncated Square Pyramid projection (TSP), Facebook offset cubemap
These decrease the overall bitrate without decreasing viewport quality
Viewport-adaptive streaming
The tile-based method crops 360 video frames into multiple tiles (or blocks) in space, then partitions and encodes the tiles into multi-bitrate segments. The client pre-fetches the tiles within the predicted viewport.
In tile-based HTTP adaptive streaming there are two adaptations: rate adaptation and viewport adaptation
Disadvantage: it is non-trivial to provide good QoE
Challenges of tile-based DASH
Tile pre-fetching errors
The motion-to-photon latency requirement for VR is less than 20 ms (a smaller delay than the Internet request-reply delay)
We need viewport prediction, but accurate prediction over the required horizon is hard. If a viewed tile is not streamed, the user sees a white block, hence a drop in QoE.
Challenges of tile-based DASH
Rebuffering/stalls under a small playback buffer
Due to the short viewport-prediction horizon, we need small buffers; this may cause rebuffering and stalls.
Border effects of mixed-bitrate tiles
Due to the spatial split and the different bitrates per tile, borders between tiles streamed at different rates can end up with visibly different qualities.
Solution: 360ProbDASH
360ProbDASH uses a probabilistic model of viewport prediction and an expected-quality optimization framework to maximize the quality of viewport-adaptive streaming.
Solution: 360ProbDASH
A probabilistic tile-based adaptive streaming system
Protocol
Raw panoramic video in ERP format is temporally divided into several chunks of the same duration
For each chunk:
Crop the chunk spatially into N tiles
Index the tiles in raster-scan order
Encode each tile into segments at M bitrate levels
Store the M x N optional segments at the server for streaming
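The per-chunk preparation above can be sketched as follows; the function name, the dict layout, and the assumption that the frame divides evenly into the tile grid are illustrative choices, not details from the paper:

```python
def make_segment_store(width, height, rows, cols, bitrates):
    """Index the tiles of an ERP chunk in raster-scan order and
    enumerate the M x N optional segments (tile i at bitrate level j)
    the server would store. Tile geometry only; actual encoding of
    each tile into segments happens offline.
    """
    dh, dw = height // rows, width // cols   # tile height and width
    segments = {}
    for i in range(rows * cols):             # raster-scan tile index
        r, c = divmod(i, cols)
        rect = (c * dw, r * dh, dw, dh)      # (x, y, w, h) crop window
        for j, rate in enumerate(bitrates):
            segments[(i, j)] = {"rect": rect, "bitrate": rate}
    return segments
```

For example, a 6x12 grid with 5 bitrate levels yields 360 optional segments per chunk, matching the M x N count in the slides.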
Notations
h, w – height and width of the Video 360 frame
Δh, Δw – height and width of a tile
i in {1,..,N} – tile sequence number
j in {1,..,M} – bitrate level
rij – bitrate of segment (i,j)
dij – distortion of segment (i,j)
pi – normalized viewing probability of the i-th tile
X = {xij} – set of streaming segments
Φ(X) – expected distortion
Ψ(X) – spatial quality variance
R – transmission bitrate budget
Problem Formulation
There are M x N optional segments, with N tiles and M bitrate levels
rij is the bitrate of segment (i,j); dij is the distortion of segment (i,j)
pi is the normalized viewing probability of the i-th tile, with Σi pi = 1
Problem: find the set of streaming segments X = {xij}, where xij = 1 means the segment of the i-th tile at the j-th bitrate level is selected for streaming, and xij = 0 otherwise
Goal: maximize the quality of viewport-adaptive streaming
Approach: define two QoE functions:
Expected distortion Φ(X) – quality distortion of the viewport, weighted by the viewing probability of the tiles
Spatial quality variance Ψ(X) – quality smoothness in the viewport
Objective: minimize the weighted sum of these two QoE functions, where 'n' is the weight of the spatial quality variance
Problem Formulation
minX Φ(X) + n·Ψ(X)
Constraints: Σi Σj xij·rij ≤ R
Σj xij ≤ 1, with xij in {0,1}, for all i
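To make the constraint structure concrete, here is a greedy sketch that respects the bitrate budget and the one-level-per-tile constraint: start every tile at its lowest bitrate, then repeatedly upgrade whichever tile buys the largest probability-weighted distortion reduction per extra bit. This is not the paper's solver and it ignores the Ψ(X) smoothness term; it only illustrates the shape of the problem:

```python
def select_segments(r, d, p, budget):
    """Greedy segment selection under a total-bitrate budget.

    r[i][j], d[i][j]: bitrate and distortion of segment (i, j),
    with bitrate increasing (and distortion decreasing) in j.
    p[i]: normalized viewing probability of tile i, sum(p) == 1.
    Returns the chosen bitrate level per tile and the total bitrate.
    """
    N, M = len(r), len(r[0])
    level = [0] * N                          # one level per tile (sum_j xij <= 1)
    total = sum(r[i][0] for i in range(N))
    while True:
        best, gain = None, 0.0
        for i in range(N):
            j = level[i]
            if j + 1 >= M:
                continue                     # already at highest level
            extra = r[i][j + 1] - r[i][j]
            if extra <= 0 or total + extra > budget:
                continue                     # upgrade would bust the budget
            g = p[i] * (d[i][j] - d[i][j + 1]) / extra  # benefit per bit
            if g > gain:
                best, gain = i, g
        if best is None:
            return level, total              # no affordable upgrade left
        level[best] += 1
        total += r[best][level[best]] - r[best][level[best] - 1]
```

With two tiles where one has viewing probability 0.9, the budget goes to that tile first, which is exactly the viewport-weighting the expected-distortion objective encodes.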
Expected Viewport Distortion and Variance
Spherical distortion of a segment:
In Video 360, use S-PSNR to evaluate quality, calculated via the Mean Squared Error (MSE) of points on the sphere
dij indicates the MSE corresponding to segment (i,j)
The overall spherical distortion of a segment is the sum of the distortion over all pixels the segment covers
It is important to calculate each tile's corresponding spherical area, because even if tiles have the same area in the plane, their corresponding areas on the sphere are not the same.
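The area discrepancy is easy to quantify: on the unit sphere, a latitude band between latitudes a and b has area proportional to sin(b) - sin(a), so equal-height ERP tile rows carry very different spherical shares. A sketch (the normalization to relative per-tile shares is an illustrative choice):

```python
import math

def tile_sphere_weights(rows, cols):
    """Relative spherical area of a tile in each ERP tile row.

    Tiles of equal planar size near the poles cover far less of the
    sphere than tiles at the equator: a latitude band [a, b] has area
    proportional to sin(b) - sin(a), split evenly among its columns.
    """
    weights = []
    for r in range(rows):
        top = math.pi / 2 - r * math.pi / rows   # band's upper latitude
        bot = top - math.pi / rows               # band's lower latitude
        band = math.sin(top) - math.sin(bot)     # band share (all columns)
        weights.append(band / cols)              # per-tile share in this row
    return weights
```

For the 6-row grid used later in the evaluation, an equator-row tile covers nearly four times the spherical area of a pole-row tile, which is why planar MSE must be re-weighted when computing S-PSNR-style distortion.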
Spherical Mapping of Tiles
Probabilistic Model of Viewport
Problem: we need to pre-fetch segments by predicting the viewport. However, the prediction may be inaccurate, which leads to viewport deviation, so viewport adaptation is needed.
Approach: a probabilistic model of the viewport
Important steps:
(a) linear regression prediction of orientation
(b) distribution of the prediction error: the viewport is hard to predict accurately, especially over long horizons, hence model the prediction error probabilistically (using 5 head-movement traces) as a Gaussian distribution
(c) viewing probability of points on the sphere
(d) viewing probability of tiles
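Steps (a), (b), and (d) can be sketched in one dimension (yaw only). The regression horizon, the Gaussian error parameter, and all names below are illustrative; the paper works on the sphere and fits the error distribution to head-movement traces, and this sketch also ignores longitude wrap-around:

```python
import math

def tile_probabilities(history, horizon, sigma, cols):
    """Per-tile viewing probabilities for one tile row, yaw only.

    (a) Fit a line to recent yaw samples and extrapolate `horizon`
        steps ahead (linear regression prediction).
    (b) Model the prediction error as zero-mean Gaussian with std
        `sigma` degrees.
    (d) Integrate that Gaussian over each tile column's longitude span
        and normalize so the probabilities sum to 1.
    """
    n = len(history)
    xs = range(n)
    mx, my = (n - 1) / 2, sum(history) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, history))
    var = sum((x - mx) ** 2 for x in xs)
    slope = cov / var
    pred = my + slope * (n - 1 + horizon - mx)   # predicted yaw (degrees)

    def cdf(z):                                  # standard normal CDF
        return 0.5 * (1 + math.erf(z / math.sqrt(2)))

    span = 360.0 / cols
    probs = []
    for c in range(cols):
        lo = c * span - 180.0 - pred             # tile edge relative to prediction
        probs.append(cdf((lo + span) / sigma) - cdf(lo / sigma))
    total = sum(probs)
    return [q / total for q in probs]            # normalized: sums to 1
```

The tile column containing the predicted yaw receives the most probability mass, and mass decays smoothly into the neighboring columns, which is what makes graceful quality degradation around the predicted viewport possible.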
Target-Buffer-based Rate Control
Problem: long-term head-movement prediction results in high prediction errors, hence it is not possible to employ a large playback buffer to smooth out bandwidth variation
Goal: provide continuous playback under small buffers
Approach: target-buffer-based rate control
Dynamics of the small playback buffer at the client
Buffer occupancy is tracked in seconds of video
A set of segments forms a chunk; chunks are stored in the buffer
At adaptation step k, we define bk as the buffer occupancy (in seconds) when the k-th set of segments has been downloaded completely
Algorithm
Assume Rk is the total bitrate, Ck the average bandwidth, and T the chunk duration
Buffer occupancy when finishing downloading the segments: bk = bk-1 – (Rk·T)/Ck + T
To prevent rebuffering, the buffer must be controlled to avoid running out of chunks
Due to the small-buffer constraint, we set a target buffer level Btarget, i.e., bk = Btarget
Total bitrate: Rk = (Ck/T)·(bk-1 – Btarget + T)
Ck is the network bandwidth, estimated from historic segment downloads
A lower bound Rmin is imposed on Rk: Rk = max{(Ck/T)·(bk-1 – Btarget + T), Rmin}
Rk is used as the total bitrate budget constraint in the optimization problem
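The two formulas above translate directly into code (function and parameter names are illustrative):

```python
def rate_budget(b_prev, c_est, t, b_target, r_min):
    """Target-buffer-based rate control: pick the total bitrate budget
    Rk = max{(Ck/T)*(b_{k-1} - Btarget + T), Rmin} that drives the
    buffer occupancy toward Btarget after the next chunk download.
    """
    return max(c_est / t * (b_prev - b_target + t), r_min)

def buffer_after_download(b_prev, r_k, c_est, t):
    """Buffer dynamics: downloading a chunk takes Rk*T/Ck seconds
    (draining the buffer) and then adds T seconds of video.
    """
    return b_prev - (r_k * t) / c_est + t
```

When the buffer sits exactly at Btarget, the chosen Rk equals the estimated bandwidth, so the T seconds added by the chunk exactly balance the Rk·T/Ck seconds spent downloading it and the small buffer neither drains nor overfills.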
360ProbDASH
Components
Client:
QoE-driven optimizer – determines the optimal segments to download, which are requested via HTTP GET
Target-buffer-based rate controller
Viewport probabilistic model
QR map – quality-rate maps for all segments, according to attributes in the MPD
Bandwidth estimation
Server:
Video cropper – crops frames into tiles
Encoder
MPD generator
Apache server
Evaluation setup
One video sequence and 5 user head-movement traces on this video (AT&T)
The sequence is 3 minutes, 2990x1440, in ERP format
The video is divided into 1-second chunks
Each chunk is divided into 6x12 tiles
Bitrate levels are 20 kbps, 50 kbps, 100 kbps, 200 kbps, and 300 kbps
Video codec is x264
Experiment setup
Client buffer size is Bmax = 3 s; target buffer level is Btarget = 2.5 s; Rmin = 200 kbps
Baseline Tile-LR: a tile-based method that uses linear regression to predict the future viewport and requests the corresponding tiles; the bitrate of each tile is allocated equally
Metrics
Stall ratio: playback continuity, calculated as the percentage of stall duration over the total video streaming time
Viewport PSNR: quality of the content in the user's viewport
Spatial quality variance
Viewport deviation: percentage of white blocks over the viewport area
Bitrate under different network conditions
Bandwidth utilization and stall
Viewport deviation
Conclusion
Tile-based adaptive streaming is promising
Rate adaptation: a target-buffer-based control algorithm ensures continuous playback with a smaller buffer
Viewport adaptation: a probabilistic model copes with the viewport prediction error
QoE-driven optimization problem: minimize the expected quality distortion of tiles and the spatial variability of quality, under the constraint of the total transmission bitrate