Our Response to Proposals to remove FMO and Arbitrary Slices (ASO) from Baseline
Stephan Wenger (Teles AG), Michael Horowitz (Polycom Inc.)
Stephan Wenger / KBS / TU Berlin
Relevant Documents
All documents enjoy the support of Broadcom, Motorola, CableLabs, Scientific-Atlanta, and LSI Logic.
- JVT-D115r1.doc (by Iole Moccagatta)
  - Good description, though no concrete results
- JVT-D121.doc (by Yasser Syed)
  - A collection of reflector e-mails
  - We do not feel that this contribution adds any value beyond the reflector comments, which were addressed in a timely manner.
- JVT-D133.doc (by Yasser Syed, late submission)
  - One-page proposal to move the Error Resilience Tools (including FMO) to a new Profile; no arguments given
- The technical arguments are all included in JVT-D115r1.doc, hence we comment on that one.
Why Error Resilience in Baseline
- Interoperability aspects
  - Gateway design
  - Trend to multi-functional devices
  - Trend to homogeneous network structures (IP based)
- Painful to add later
  - A lesson learned with H.263 Annex K
- Implementing FMO/ASO now is not that difficult
  - Cost burden is relatively low, as will be shown momentarily
- Royalty-free Baseline attribute
Arguments made in JVT-D115r1.doc
- We concentrate here on the proposal to move FMO/ASO to an "Error Resilience" Profile
  - ASO across frame boundaries can be discussed later
- Key arguments
  - A) There is no need for them (few/no errors)
  - B) FMO/ASO are too expensive for Broadcast
- To A) we answer as follows:
  - The error-free property of Broadcast networks is not undisputed, and in practice it is not completely achieved.
  - FMO is very flexible and could be used for purposes other than error resilience. Examples were shown, e.g. by Miska.
Similarity of FMO and ASO
- From a computational complexity point of view the two tools are roughly comparable
- Both allow MBs completely out of scan order
  - ASO by using one-MB-sized slices
- When scan-order reconstruction is chosen, bit buffer handling is required in both cases
  - 8 buffers for FMO
  - 1 buffer for ASO
- FMO may require a per-macroblock change of CABAC/CA-VLC contexts
- FMO requires some signaling in the Parameter Sets
- The rest of the slides use FMO as the example
How expensive is FMO really?
- Detailed implementation description in JVT-D063
  - A sneak preview was available to many of the companies opposing FMO in Baseline
  - The technical concepts presented there seem to be undisputed
- Two alternative implementations
  - Out-of-order reconstruction (low delay)
    - No CABAC/CA-VLC context switches necessary
    - Loop filtering after reconstruction
    - Incurs more cache misses (depending on architecture)
    - (Potentially) requires twice the bus bandwidth for pixel transfers
    - This is seen as too big a burden
  - Scan-order reconstruction (broadcast)
    - Needs CABAC/CA-VLC context switching on a per-MB basis
    - Needs bit buffer management
Scan order reconstruction: Recap
- Collect all slices of a picture in buffers
  - Needs buffer space for one coded picture
  - 8 bit buffer bins with pointers
- Reconstruct MBs in scan order
- Two Slice Group example (SG 0 red / SG 1 blue)
[Figure: MBAmap and buffer contents for the two Slice Group example]
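The recap above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation described in JVT-D063: the names `mba_map`, `bins` and `reconstruct_in_scan_order` are hypothetical, and each coded macroblock is stood in for by a placeholder string instead of entropy-coded bits.

```python
from collections import deque


def reconstruct_in_scan_order(mba_map, bins):
    """Return the macroblocks of one picture in scan order.

    mba_map[i] is the slice group of the macroblock at scan position i
    (a hypothetical stand-in for the MBAmap).
    bins[g] holds the collected macroblocks of slice group g, in the
    order in which they occur inside that group's slices.
    """
    # One read pointer per bin, modelled here as a queue.
    readers = {group: deque(mbs) for group, mbs in bins.items()}
    picture = []
    for group in mba_map:
        # A real decoder would switch the entropy-coding context
        # to `group` at this point before decoding the macroblock.
        picture.append(readers[group].popleft())
    return picture


# Two slice group example with an alternating (assumed) map.
mba_map = [0, 1, 0, 1, 0, 1]
bins = {0: ["a0", "a1", "a2"], 1: ["b0", "b1", "b2"]}
picture = reconstruct_in_scan_order(mba_map, bins)
```

The whole picture has to be collected before the loop starts, which is where the buffer space for one coded picture comes from.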
Cost of CABAC/CA-VLC context switch
- 89 contexts, one 6-bit int plus 1 flag per context -> 89 bytes
- Need to store 8 context sets: 8 x 89 = 712 bytes
- Memory amount for CABAC >> CA-VLC
- When implementing CABAC in software: no real cost
  - Pointer switch
- When implementing CABAC in a register-based solution
  - Is this possible/advisable?
  - If yes, need either
    - to store/retrieve 89 bytes per MB, or
    - 8 register banks with 89 ints (6 bits) plus flag each
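The memory figure and the software "pointer switch" can be checked with a short Python sketch. The constants come from the slide; the names `context_sets` and `switch_context`, and the choice of one byte per context, are assumptions made for illustration.

```python
NUM_CONTEXTS = 89          # contexts per set, from the slide
BITS_PER_CONTEXT = 6 + 1   # 6-bit int plus 1 flag
NUM_CONTEXT_SETS = 8       # number of sets to store, from the slide

# Round each context up to whole bytes: 7 bits fit in 1 byte.
bytes_per_context = (BITS_PER_CONTEXT + 7) // 8
bytes_per_set = NUM_CONTEXTS * bytes_per_context
total_bytes = NUM_CONTEXT_SETS * bytes_per_set

# Software model: one byte array per context set (assumed layout).
context_sets = [bytearray(NUM_CONTEXTS) for _ in range(NUM_CONTEXT_SETS)]


def switch_context(set_index):
    """Select a context set by reference; no context data is copied."""
    return context_sets[set_index]


active = switch_context(3)
active[0] = 42  # an update made while decoding with set 3 stays in set 3
```

Because `switch_context` only hands back a reference, the per-MB switch costs one lookup in software, whereas a register-based design would have to move or bank the 89 bytes.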
Bit Buffer Management 1/3
- Need to handle up to 8 bit bins
  - One for ASO
- Each bit bin is a chained list of NALUs
- NALUs are "inserted" into the bit bin at a position determined by the MB address
  - Getting the Slice Group from the MB address is very simple
  - Inserting into the list (for out-of-order slices) is also very simple
- Note: NALUs are byte aligned
  - No need for bit-oriented processing or copying of data
- Required memory: one coded picture
Bit Buffer Management 2/3
[Figure: bit bins as chained lists of slices (labels: Slice 00, 14, 01, 17, 55, 34)]
Bit Buffer Management 3/3
[Figure: bit bins as chained lists of slices (labels: Slice 00, 14, 01, 17, 55), with a read pointer for SliceGroup 0 and a read pointer for SliceGroup 1]
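A minimal Python sketch of the bit bins may help make the bookkeeping concrete. It is an assumption-laden illustration, not the authors' design: the class `BitBin`, the function `slice_group_of` and the even/odd slice group rule are all hypothetical, and a Python list stands in for the chained list.

```python
class BitBin:
    """One bit bin: NALUs ordered by the address of their first macroblock."""

    def __init__(self):
        self.nalus = []    # entries are (first_mb_addr, payload)
        self.read_pos = 0  # read pointer for this slice group

    def insert(self, first_mb_addr, payload):
        # Walk back from the tail until the ordering is restored, so
        # slices that arrive out of order land in the right position.
        pos = len(self.nalus)
        while pos > 0 and self.nalus[pos - 1][0] > first_mb_addr:
            pos -= 1
        # Only a reference to the byte-aligned payload is stored.
        self.nalus.insert(pos, (first_mb_addr, payload))

    def next_nalu(self):
        nalu = self.nalus[self.read_pos]
        self.read_pos += 1
        return nalu


def slice_group_of(mb_addr):
    # Hypothetical mapping for this sketch only: even addresses belong
    # to slice group 0, odd addresses to slice group 1.
    return mb_addr % 2


bins = [BitBin(), BitBin()]
for first_mb in (0, 1, 17, 14, 55, 34):  # slice 34 arrives last
    bins[slice_group_of(first_mb)].insert(first_mb, b"payload")

order_group0 = [addr for addr, _ in bins[0].nalus]
order_group1 = [addr for addr, _ in bins[1].nalus]
```

Since NALUs are byte aligned, each list entry can simply point at the received buffer, and the bins together need no more memory than one coded picture.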
Disposable B Pictures and Memory Requirements
- This seems to be the key argument of JVT-D115
  - Similar discussions took place on the reflector
- We admit that one needs an additional coded frame memory compared to MPEG-2 architectures.
- However:
  - In JVT, B pictures are not always disposable, hence the RAM argument made in JVT-D115 does not hold
  - Considering the number of frame buffers typically used in JVT, this is a moderate cost (roughly 20% more memory)