
1 CS 598kn – Multimedia Systems Design Introduction to Video 360
Klara Nahrstedt, Fall 2017

2 Outline
Definition of 360-degree video
Creation of Video 360
Playback of Video 360
Streaming of Video 360
Conclusion

3 Definition
360-degree video (see YouTube Video 360): also called immersive video or spherical video. Video recordings in which the view in every direction is recorded at the same time, shot using an omnidirectional camera or a collection of cameras.

4 [Diagram: the six viewing directions – up, down, left, right, forward, backward]

5 Creation of Video 360
Use a special rig of multiple cameras, or a dedicated camera that contains multiple camera lenses embedded into the device.

6 Creation of Video 360
Video from multiple cameras is filmed simultaneously with overlapping angles. By stitching the videos together into one spherical video, one gets Video 360. The color and contrast of each shot are calibrated to be consistent with the others.

7 Creation of Video 360
Stitching and color/contrast calibration can be done at the camera itself, or with specialized video editing software that analyzes visuals and audio to synchronize the different camera feeds.
Video 360 is formatted in an equirectangular projection and is monoscopic (one image directed to both eyes) or stereoscopic.

8 Equirectangular Projection (ERP)
A geographic projection. Definition: the projection maps meridians to vertical straight lines of constant spacing, and circles of latitude to horizontal straight lines of constant spacing. The projection is neither equal-area nor conformal.
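To make the definition concrete, here is a minimal Python sketch (not part of the original slides) that maps a point on the sphere, given as latitude and longitude, to pixel coordinates of an ERP image; the image size is an arbitrary example.

import math

def sphere_to_erp(lat, lon, width, height):
    # Map a spherical direction (latitude in [-pi/2, pi/2], longitude in
    # [-pi, pi]) to pixel coordinates in an equirectangular image.
    # Meridians map to equally spaced columns, circles of latitude to
    # equally spaced rows, as in the definition above.
    u = (lon + math.pi) / (2 * math.pi)   # horizontal fraction in [0, 1]
    v = (math.pi / 2 - lat) / math.pi     # vertical fraction, 0 = north pole
    x = min(int(u * width), width - 1)
    y = min(int(v * height), height - 1)
    return x, y

# Example: the forward direction at the equator (lat = 0, lon = 0) lands in
# the middle of the ERP image.
print(sphere_to_erp(0.0, 0.0, 2880, 1440))   # (1440, 720)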

9 Stereoscopic projection
A technique for creating the illusion of depth in an image by means of stereopsis for binocular vision. Most stereoscopic methods present two offset images separately to the left and right eye of the viewer. These two-dimensional images are then combined in the brain to give the perception of 3D depth.

10 Creation of Video 360
Due to this projection and stitching, equirectangular video exhibits lower quality in the middle of the image than at the top and bottom.
Omnidirectional cameras: Vuze camera, Nokia OZO
Dual-lens cameras: Samsung Gear 360

11 Playback of Video 360
Viewing on PCs, smartphones, and head-mounted displays.
Users can pan around the video by clicking and dragging.
On mobile phones, the gyroscope can be used to pan the video based on the orientation of the device.

12 Playback of Video 360 on smartphone

13 Playback of Video 360
Viewing devices: Google Cardboard, Samsung Gear VR, Oculus Rift.
These videos require high resolution, 6K or higher.

14 Industry Support for Video 360
March 2015 – YouTube launched support for publishing and viewing Video 360 in its Android mobile apps
2017 – Google and YouTube started to support an alternative stereoscopic video format, VR180, which is limited to 180 degrees and more accessible
September 2015 – Facebook added support for Video 360; Facebook Surround 360 product
March 2017 – Facebook had over 1M Video 360 items.

15 Video 360 Streaming
L. Xie et al., "360ProbDASH: Improving QoE of 360 Video Streaming Using Tile-based HTTP Adaptive Streaming", ACM Multimedia 2017, Mountain View, CA, October 2017.

16 Video 360 Streaming Framework with DASH

17 Streaming Constraints
Challenge: streaming 6K Video 360 over DASH.
Need to consider viewport-adaptive streaming.
Important concept: Field of View (FOV), i.e., the viewport.
Preserve the FOV at full quality, while other parts are sent at lower quality.

18 Viewport adaptive streaming
Two categories: asymmetric panorama and tile-based.
The asymmetric panorama method transforms the 360 video into a viewport-dependent multi-resolution panorama, which decreases quality outside the viewport. This method is prevalent since it still provides the full 360 image.
Disadvantage: it wastes bandwidth, because the user's viewport is limited to the FOV.

19 Asymmetric panorama-oriented streaming
Representation formats: Truncated Square Pyramid projection (TSP), Facebook offset cubemap.
These decrease the overall bitrate without a decrease in quality.

20 Viewport adaptive streaming
The tile-based method crops 360 video frames into multiple tiles (or blocks) in space, then partitions and encodes the tiles into multi-bitrate segments. The client pre-fetches the tiles within the predicted viewport.
In tile-based HTTP adaptive streaming there are two adaptations: rate adaptation and viewport adaptation.
Disadvantage: it is non-trivial to provide good QoE.

21 Challenges of tile-based DASH
Tile pre-fetching errors
The motion-to-photon latency requirement for VR is less than 20 ms (a smaller delay than the Internet request-reply delay). We therefore need viewport prediction and pre-fetching, but predicting far ahead is error-prone. If the actual viewport is not streamed, we see white blocks, hence a drop in QoE.

22 Challenges of tile-based DASH
Rebuffering/stalls under a small playback buffer
Due to the short viewport-prediction horizon, we need small buffers; this may cause rebuffering and stalls.
Border effects of mixed-bitrate tiles
Due to the spatial split and different bitrates for tiles, borders between tiles streamed at different rates can end up with visibly different qualities.

23 Solution: 360ProbDASH
360ProbDASH uses a probabilistic model of viewport prediction and an expected-quality optimization framework to maximize the quality of viewport-adaptive streaming.

24 Solution: 360ProbDASH – probabilistic tile-based adaptive streaming system

25 Protocol
Raw panoramic video in ERP format is temporally divided into several chunks of the same duration.
For each chunk:
Crop the chunk spatially into N tiles
Index the tiles in raster-scan order
Encode each tile into segments at M bitrate levels
Store the M x N optional segments at the server for streaming
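A minimal sketch of the resulting server-side segment layout (the tile grid and bitrate set are taken from the evaluation slides later; the naming scheme is hypothetical):

import itertools

N_TILES = 6 * 12                           # 6x12 raster-scan tile grid (see evaluation setup)
BITRATES_KBPS = [20, 50, 100, 200, 300]    # M = 5 bitrate levels

def segment_name(chunk_idx, tile_idx, bitrate_kbps):
    # Hypothetical naming scheme for one stored segment of one tile of one chunk.
    return f"chunk{chunk_idx:04d}_tile{tile_idx:03d}_{bitrate_kbps}kbps.m4s"

# All M x N optional segments of chunk 0, tiles indexed in raster-scan order.
segments = [segment_name(0, i, r)
            for i, r in itertools.product(range(N_TILES), BITRATES_KBPS)]
print(len(segments))   # 360 optional segments stored for this chunk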

26 Notations
h, w – height and width of the Video 360 frame
Δh, Δw – height and width of a tile
i in {1,..,N} – tile sequence number
j in {1,..,M} – bitrate level
rij – bitrate of segment (i,j)
dij – distortion of segment (i,j)
pi – normalized viewing probability of the i-th tile
X = {xij} – set of streaming segments
Φ(X) – expected distortion
Ψ(X) – spatial quality variance
R – transmission bitrate budget

27 Problem Formulation
There are M x N optional segments, with N denoting the number of tiles and M the number of bitrate levels.
rij is the bitrate of segment (i,j); dij is the distortion of segment (i,j).
pi is the normalized viewing probability of the i-th tile, with Σi pi = 1.
Problem: find the set of streaming segments X = {xij}, where xij = 1 means the segment of the i-th tile at the j-th bitrate level is selected for streaming, and xij = 0 otherwise.
Goal: maximize the quality of viewport-adaptive streaming.
Approach: define two QoE functions:
Expected distortion Φ(X) – quality distortion of the viewport, weighted by the viewing probability of the tiles
Spatial quality variance Ψ(X) – quality smoothness in the viewport
Objective: minimize the weighted sum of these two QoE functions, where η is the weight of the spatial quality variance.

28 Problem Formulation
minX Φ(X) + η·Ψ(X)
Constraints:
Σi=1..N Σj=1..M xij·rij ≤ R
Σj=1..M xij ≤ 1, with xij in {0,1}, for all i
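The paper solves this optimization with its own method; purely as an illustration, here is a minimal greedy Python sketch of the same objective. The distortion matrix d, bitrate ladder r, tile probabilities p, and the greedy heuristic itself are assumptions, not the authors' algorithm.

import numpy as np

def expected_distortion(d, p, choice):
    # Phi(X): viewing-probability-weighted distortion of the chosen segments.
    return float(np.sum(p * d[np.arange(len(p)), choice]))

def spatial_variance(d, p, choice):
    # Psi(X): probability-weighted variance of the chosen segments' distortion.
    q = d[np.arange(len(p)), choice]
    return float(np.sum(p * (q - np.sum(p * q)) ** 2))

def greedy_select(d, r, p, budget, eta=1.0):
    # Start every tile at the lowest bitrate level, then repeatedly upgrade the
    # tile that most reduces Phi(X) + eta * Psi(X) within the bitrate budget R.
    n = d.shape[0]
    choice = np.zeros(n, dtype=int)
    spent = n * r[0]
    while True:
        cur = expected_distortion(d, p, choice) + eta * spatial_variance(d, p, choice)
        best = None
        for i in range(n):
            j = choice[i] + 1
            if j < len(r) and spent - r[choice[i]] + r[j] <= budget:
                trial = choice.copy()
                trial[i] = j
                gain = cur - (expected_distortion(d, p, trial)
                              + eta * spatial_variance(d, p, trial))
                if gain > 0 and (best is None or gain > best[0]):
                    best = (gain, i, j)
        if best is None:
            return choice
        _, i, j = best
        spent += r[j] - r[choice[i]]
        choice[i] = j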

29 Expected Viewport Distortion and Variance
Spherical distortion of a segment:
In Video 360, S-PSNR is used to evaluate quality; it is calculated via the Mean Squared Error (MSE) of points on the sphere.
dij denotes the MSE corresponding to segment (i,j).
The overall spherical distortion of a segment is the sum of the distortion over all the pixels the segment covers.
It is important to calculate a tile's corresponding spherical area: even if tiles have the same area in the ERP plane, their corresponding areas on the sphere are not the same.
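As a concrete illustration of the last point, a small Python sketch (assuming an ERP frame split into an equal grid of tiles) that computes the area a tile covers on the unit sphere:

import math

def tile_spherical_area(row, col, n_rows, n_cols):
    # Area on the unit sphere covered by ERP tile (row, col) in an
    # n_rows x n_cols grid; row 0 is the top (north pole) of the image.
    # The column does not affect the area, only the latitude band does.
    lat_top = math.pi / 2 - row * math.pi / n_rows
    lat_bot = math.pi / 2 - (row + 1) * math.pi / n_rows
    d_lon = 2 * math.pi / n_cols
    return d_lon * (math.sin(lat_top) - math.sin(lat_bot))

# With a 6x12 grid, a tile next to the equator covers much more of the sphere
# than a tile at the pole, even though both have the same area in the ERP plane.
print(round(tile_spherical_area(2, 0, 6, 12), 4))   # near the equator
print(round(tile_spherical_area(0, 0, 6, 12), 4))   # at the pole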

30 Spherical Mapping of Tiles


32 Probabilistic Model of Viewport
Problem: we need to pre-fetch segments by predicting the viewport. However, the prediction may be inaccurate, which leads to viewport deviation, so viewport adaptation is needed.
Approach: a probabilistic model to predict the viewport.
Important steps:
(a) linear-regression prediction of orientation
(b) distribution of the prediction error – the viewport is hard to predict accurately, especially for long-term prediction, hence a probabilistic model of the errors is used (based on 5 head-movement traces) – Gaussian distribution
(c) viewing probability of points on the sphere
(d) viewing probability of tiles
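A minimal Python sketch of steps (a), (b), and (d) for the yaw angle only; the regression window, error standard deviation, and tile layout are illustrative assumptions, and pitch would be handled analogously.

import numpy as np

def predict_yaw(times, yaws, t_future):
    # (a) Fit yaw(t) = a*t + b over recent head-movement samples by linear
    # regression and extrapolate to the playback time of the segment.
    a, b = np.polyfit(times, yaws, 1)
    return a * t_future + b

def tile_probabilities(pred_yaw, tile_center_yaws, sigma):
    # (b)+(d) Treat the prediction error as zero-mean Gaussian with standard
    # deviation sigma (fit from head-movement traces), evaluate the density at
    # each tile's yaw center, and normalize so the probabilities sum to 1.
    diff = np.angle(np.exp(1j * (tile_center_yaws - pred_yaw)))  # wrap to [-pi, pi]
    density = np.exp(-0.5 * (diff / sigma) ** 2)
    return density / density.sum()

times = np.array([0.0, 0.1, 0.2, 0.3])        # seconds
yaws = np.array([0.00, 0.05, 0.11, 0.16])     # radians, example head-movement trace
pred = predict_yaw(times, yaws, t_future=1.0)
centers = np.linspace(-np.pi, np.pi, 12, endpoint=False) + np.pi / 12
print(tile_probabilities(pred, centers, sigma=0.5))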

33 Target-Buffer-based Rate Control
Problem: long-term head-movement prediction results in high prediction errors; hence it is not possible to employ a large playback buffer to smooth bandwidth variation.
Goal: provide continuous playback under small buffers.
Approach: target-buffer-based rate control.

34 Dynamics of small playback buffer in client
Buffer occupancy is tracked in seconds of video.
A set of segments forms a chunk; chunks are stored in the buffer.
At adaptation step k we define bk as the buffer occupancy (in seconds) when the k-th set of segments is downloaded completely.

35 Algorithm Assume Rk - total bitrate and Ck – average bandwidth, T – chunk time Buffer occupancy when finishing downloading segments is calculated: bk = bk-1 – [(Rk*T)/Ck] +T If we want to prevent rebuffering, buffer must be controlled to avoid running out of chunks Due to small buffer constraint, we set target buffer level Btarget, i.e., bk = Btarget. Total bitrate Rk = [Ck/T]*(bk-1 – Btarget +T) Ck – network bandwidth, estimated from historic segments downloading Lower bound Rmin is set to R. Rk = max{[Ck/T]*(bk-1 – Btarget +T), Rmin} Rk is used as total bitrate budget constraint in the optimization problem CS 598kn - Fall 2017

36 360ProbDASH

37 Components
Client:
QoE-driven optimizer – determines the optimal segments to download, which are requested via HTTP GET
Target-buffer-based rate controller
Viewport probabilistic model
QR map – quality-rate maps for all segments, according to attributes in the MPD
Bandwidth estimation
Server:
Video cropper – crops frames into tiles
Encoder
MPD generator
Apache server

38 Evaluation setup Video sequence and 5 user head movement traces on this video (AT&T) Sequence is 3 minutes, 2990x1440 in ERP format Chunks divided into 1 second chunks Each chunk divided into 6x12 tiles Bitrate levels are 20kbps, 50kbps, 100 kbps, 200 kbps, 300 kbps Video codec is x264 CS 598kn - Fall 2017

39 Experiment setup
Client buffer size is Bmax = 3 s; target buffer level is Btarget = 2.5 s; Rmin = 200 kbps
Tile-LR: a tile-based method that uses linear regression to predict the future viewport and requests the corresponding tiles; the bitrate of each tile is allocated equally.

40 Metrics
Stall ratio: playback continuity, calculated as the percentage of stall duration over the total video streaming time
Viewport PSNR: quality of the content in the user's viewport
Spatial quality variance
Viewport deviation: percentage of white blocks over the viewport area
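Two of these metrics reduce to simple ratios; a small sketch with made-up numbers:

def stall_ratio(stall_durations, total_streaming_time):
    # Percentage of the session spent stalled.
    return 100.0 * sum(stall_durations) / total_streaming_time

def viewport_deviation(white_block_area, viewport_area):
    # Percentage of the viewport covered by white (missing) blocks.
    return 100.0 * white_block_area / viewport_area

print(stall_ratio([0.5, 1.0], total_streaming_time=180.0))   # ~0.83 % for a 3-minute sequence
print(viewport_deviation(white_block_area=12000, viewport_area=960 * 540))   # ~2.3 %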

41 Bitrate under different network conditions

42 Bandwidth utilization and stall

43 Viewport deviation

44 Conclusion
Tile-based adaptive streaming is promising.
Rate adaptation – a target-buffer-based control algorithm to ensure continuous playback with a smaller buffer.
Viewport adaptation – a probabilistic model constructed to cope with viewport prediction error.
QoE-driven optimization problem – minimize the expected quality distortion of tiles and the spatial variability of quality under a total transmission bitrate constraint.

