Theora: Thusnelda project update 20080605

Overview

This report isn't going to be as detailed or extensive as the last two. The past month of Thusnelda development time has been dedicated to implementing proper rate/distortion optimization techniques, which seek to turn the global encoder optimization problem into a set of simpler local optimizations. Quite a lot of optimization work has been done; a much smaller amount has ended up working properly. Hopefully a few of the brick walls currently holding up progress will fall soon.

Rate-Distortion optimization

The basic idea behind rate-distortion optimization in a modern video codec is fairly simple. Decide for each and every token coded exactly how much global distortion will be reduced for the specific number of bits spent on the token. Do not code any token for which the lambda-weighted cost exceeds the reduction in distortion. Simply put, all coding decisions in the encoder are governed by the need to minimize (distortion + lambda * cost), where lambda is a free variable that correlates fairly closely with desired image quality.
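As a minimal sketch of that rule (not Thusnelda's actual code), the snippet below picks, from a set of candidate ways to code something, the one with the lowest (distortion + lambda * cost). The function names, candidate labels, and numbers are all made up for illustration.

```python
def rd_cost(distortion, bits, lam):
    """Lagrangian cost: distortion + lambda * bits."""
    return distortion + lam * bits


def best_choice(candidates, lam):
    """Return the (name, distortion, bits) tuple with the lowest RD cost."""
    return min(candidates, key=lambda c: rd_cost(c[1], c[2], lam))


# Hypothetical candidates: (label, distortion, bits spent).
candidates = [
    ("fine", 10.0, 40),    # low distortion, expensive
    ("coarse", 50.0, 12),  # middle ground
    ("skip", 200.0, 1),    # high distortion, nearly free
]

for lam in (0.5, 2, 64):
    print(lam, best_choice(candidates, lam)[0])
```

With these made-up numbers, a small lambda selects the expensive low-distortion candidate, and raising lambda pushes the choice toward cheaper candidates, since each bit is penalized more heavily.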

SKIP block implementation

The one unqualified success in Theora RD implementation so far is SKIP block detection. 'SKIP blocks' are actually an MPEG-ism; Theora doesn't have a 'SKIP' block type, but it can mark blocks as uncoded, which is functionally the exact same concept. In short, any complete block which it turns out isn't worth coding is simply marked 'uncoded' and otherwise ignored in the rest of the encoding process for that frame.
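The decision can be sketched as a comparison of two RD costs: leave the block uncoded (the decoder keeps the co-located pixels from the previous reconstructed frame) or code it. This is an illustrative sketch only; the helper names, the use of sum of squared errors as the distortion measure, and the bit counts are assumptions, not details taken from the Thusnelda source.

```python
def sse(a, b):
    """Sum of squared errors between two equally sized pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def should_skip(source, prev_recon, coded_recon, coded_bits, lam, skip_bits=0):
    """True if leaving the block uncoded has the lower RD cost.

    source      -- original pixels of the block
    prev_recon  -- what the decoder shows if the block is left uncoded
    coded_recon -- what the decoder shows if the block is coded
    coded_bits  -- (hypothetical) bits needed to code the block
    """
    cost_skip = sse(source, prev_recon) + lam * skip_bits
    cost_code = sse(source, coded_recon) + lam * coded_bits
    return cost_skip <= cost_code


source = [100, 102, 98, 101]
coded = [100, 102, 98, 101]   # assume coding reproduces the block exactly
noisy_prev = [101, 101, 99, 100]   # differs from source only by noise
changed_prev = [60, 60, 60, 60]    # a legitimate change in content

print(should_skip(source, noisy_prev, coded, coded_bits=30, lam=1))
print(should_skip(source, changed_prev, coded, coded_bits=30, lam=1))
```

In the first call the difference is noise-level, so spending 30 bits is not worth it and the block is skipped; in the second the distortion from skipping dwarfs the bit cost and the block is coded.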

Optimal selection of uncoded blocks seeks to avoid 'false positives' (spurious updates) without accidentally neglecting legitimate changes. The Matrix test clip is a good test of noise rejection, as the film transfer is remarkably noisy. We can see below that the new Thusnelda SKIP code is substantially more immune to spurious block updates, without sacrificing image quality.

Left: frame 546 of the Matrix test clip, encoded with mainline Theora using a constant quantizer setting of 35 (-v 5.5). The full clip is here.

Left: frame 546 of the Matrix test clip, encoded with Thusnelda using a lambda of 64, resulting in the same quantizer selection and a slightly lower bitrate than the Theora encoded clip. The full clip is here.

Left: frame 546 of the Theora clip with macroblock selection marked. Sections of the image with no colored box are uncoded (SKIP) blocks.

Left: frame 546 of the Thusnelda clip with macroblock selection marked. Sections of the image with no colored box are uncoded (SKIP) blocks.

Optimized SKIP determination is a relatively minor optimization in terms of bitrate gain; the most substantial gains will come from optimization of individual DCT tokens. Although Thusnelda does not yet implement 'ZeroBin' or actively optimize individual tokens, the new SKIP block detection is sufficient to finally surpass the performance of the original encoder.

Rate-aware motion compensation

This change modifies the motion compensation and motion vector selection code to take into account the bit cost of a specific motion vector. Although this cost was already factored into mode selection after the motion vector analysis, it previously played no part in the selection of specific motion vector candidates.
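A rough sketch of the idea: instead of choosing the candidate vector with the lowest matching error alone, choose the one with the lowest (error + lambda * bits). The `mv_bits` cost model below is a hypothetical stand-in, not Theora's actual motion vector coding, and the SAD values are invented.

```python
def mv_bits(mv):
    """Hypothetical bit cost: larger vector components cost more bits."""
    dx, dy = mv
    return 2 + 2 * (abs(dx).bit_length() + abs(dy).bit_length())


def pick_mv(candidates, lam):
    """candidates: list of (mv, sad). Return the mv with the lowest RD cost."""
    return min(candidates, key=lambda c: c[1] + lam * mv_bits(c[0]))[0]


# Invented search results: (motion vector, sum of absolute differences).
candidates = [((0, 0), 520), ((1, 0), 515), ((7, -5), 500)]

for lam in (0, 2, 4):
    print(lam, pick_mv(candidates, lam))
```

With lambda at 0 this reduces to the rate-unaware search and picks the best match regardless of cost; as lambda grows, a slightly worse match that is cheaper to code wins.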

Although this code is complete and functioning, it is currently wholly ineffective (less than 0.02% rate improvement). Analysis remains to determine whether the technique is simply not useful for Theora, whether it must be modified to be effective with Theora, or whether the implementation is suffering from bugs.

Rho-domain analysis and quantizer selection

The 'rho domain' is simply a fancy name for mapping average rate and distortion from being functions of the quantizer index to being functions of the total percentage of DCT coefficients that quantize to zero. The goal of rho-domain analysis is to provide a cheap and precise means of estimating and regulating bitrate and distortion for a given choice of quantizer. This eliminates the need for imprecise and expensive approximations and hand tuning of quantizer choices.
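To make the mapping concrete, the sketch below computes rho (the fraction of coefficients that quantize to zero) for several quantizer step sizes and then estimates rate from rho. The linear model bits = theta * (1 - rho) * N is an assumption borrowed from the published rho-domain literature, not something stated here about Thusnelda; `theta`, the step sizes, and the coefficient values are invented, and a plain round-to-nearest quantizer is assumed.

```python
def rho(coeffs, step):
    """Fraction of coefficients that a round-to-nearest quantizer zeroes."""
    zeros = sum(1 for c in coeffs if int(abs(c) / step + 0.5) == 0)
    return zeros / len(coeffs)


def estimate_bits(coeffs, step, theta):
    """Assumed linear rho-domain rate model: theta bits per nonzero coeff."""
    return theta * (1.0 - rho(coeffs, step)) * len(coeffs)


def pick_step(coeffs, steps, theta, target_bits):
    """Finest step whose estimated rate fits the budget (else the coarsest)."""
    for step in sorted(steps):
        if estimate_bits(coeffs, step, theta) <= target_bits:
            return step
    return max(steps)


# Invented DCT coefficients for one block region.
coeffs = [120, -45, 30, -12, 8, 5, -3, 2, 1, 0, 0, 0, 0, 0, 0, 0]

for step in (4, 16, 64):
    print(step, rho(coeffs, step), estimate_bits(coeffs, step, theta=6.0))
print(pick_step(coeffs, [4, 16, 64], theta=6.0, target_bits=32))
```

The appeal is that rho is cheap to compute for any candidate quantizer (it only requires counting coefficients below a threshold), so rate can be estimated without actually tokenizing and entropy coding the block.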

This is another example of completed code that is not yet functioning properly. As it's certain that this technique does apply to Theora, final victory here should simply be a matter of debugging.

DCT token optimization

Optimization of individual DCT tokens, which comprise the lion's share of the bitstream bulk, is the most important aspect of RD optimization. In a sense, it is simply a finer-grained version of the SKIP block determination, where the code/demote/skip determination is made for every DCT token according to its cost and distortion liability. This technique replaces the classic MPEG quantization 'dead zone' and other bitrate management hacks and approximations with a single, simple optimization mechanism.
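A simplified sketch of the code/demote/skip choice for a single coefficient follows. For each coefficient it compares three candidates (the normally quantized level, that level reduced by one, and zero) and keeps the one with the lowest (distortion + lambda * bits). The `token_bits` model is hypothetical and treats each coefficient independently; a real encoder would also have to account for how zero runs and neighbouring tokens change the bit cost, which this sketch ignores.

```python
def token_bits(level):
    """Hypothetical cost: zero is free, larger magnitudes cost more bits."""
    return 0 if level == 0 else 2 + 2 * abs(level).bit_length()


def optimize_level(coeff, step, lam):
    """Pick code / demote / skip for one coefficient by RD cost."""
    level = int(abs(coeff) / step + 0.5)   # plain round-to-nearest level
    sign = -1 if coeff < 0 else 1
    candidates = {level, max(level - 1, 0), 0}

    def cost(lvl):
        err = abs(coeff) - lvl * step      # reconstruction error
        return err * err + lam * token_bits(lvl)

    # sorted() makes ties resolve toward the smaller (cheaper) level.
    return sign * min(sorted(candidates), key=cost)


for lam in (1, 20, 300):
    print(lam, optimize_level(36, 10, lam))
```

With a step of 10 and a coefficient of 36, a small lambda keeps the rounded level 4, a moderate lambda demotes it to 3, and a large lambda drops it to zero. No fixed dead zone is involved; the zeroing threshold falls out of the cost comparison.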