Blog-like Post 20150325

Daala is Xiph.Org's in-development next-next-generation video codec. Never heard of Daala before? You can find "Introducing Daala [Part 1]" here!

Bug or Feature? Unintentionally Intentional Behaviors

Codec development is often an exercise in tracking down examples of "that's funny... why is it doing that?" The usual hope is that unexpected behaviors spring from a simple bug, and finding bugs is like finding free performance. Fix the bug, and things usually work better.

Often, though, hunting down the 'bug' is a frustrating exercise in finding that the code is not misbehaving at all; it's functioning exactly as designed. Then the question becomes a thornier issue of determining if the design is broken, and if so, how to fix it. If it's fixable. And the fix is worth it.

Huh, that's funny...

At the moment, Daala has an annoying lapping-related bug that can turn rounding and quantization noise into static patterns that accumulate, producing halos around some blocks. The behavior doesn't appear that often, but when it does it's obvious and undesirable. Over the past week, I've been busy cataloging various solutions.

Above: Illustration of the 'halo' bug caused by correlated noise buildup from interaction between reference frame quantization and the lapping process. The above image is a crop from a -v 100 [extremely low bitrate] encode of the Sintel 1080p trailer and has been brightened and scaled up (fatbits) by 2x.

While researching that bug, though, I noticed something funny. Even if I disabled the lapping filters entirely, just for testing, I still got stripe-like artifacts at the top edges of some blocks on the same test clips. These artifacts only disappeared when I also disabled motion compensation. Great! Two entirely different bugs causing a similar problem! I was positively giddy at the potential improvement fixing both bugs might get us.

Above: Illustration of the stripes that appeared when coding the Sintel trailer test at an extremely low rate without lapping, but with motion compensation. The initial portion of the trailer consists of a low-contrast, mostly monochrome zoom and fade sequence that codes mostly as 32x32 DC-only updates with motion vectors. The image has been cropped, brightened and scaled up (fatbits) by 2x.

Along with lapped transforms, Daala also avoids blocking artifacts by using Overlapped Block Motion Compensation [PDF] (which will be the subject of its own demo page) to predict inter-frames. Our OBMC code is somewhat immature and incomplete, so the possibility of finding some easy bugs didn't seem too far-fetched.

Debugging images dumped from an encoding run showed that the stripe artifacts only appeared in blocks that were motion-compensated and also appeared in the motion compensation output, suggesting that a simple bug in motion compensation was at fault.

It's not a bug, it's a feature

Except that last point is misleading; any artifact introduced in coding or reconstruction becomes part of the reconstructed reference. That reconstructed reference is used to motion-compensate the next frame. Thus, any introduced artifact will appear in the motion compensation output of the subsequent frame, whether or not motion compensation caused it. Once the artifact is in the reference, it will remain there unless rate distortion optimization considers it significant enough to remove.
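The persistence described above can be illustrated with a toy closed-loop coder. This is a minimal sketch and not Daala code: `quantize`, `code_frame`, the zero-motion prediction, and the plain scalar quantizer standing in for rate distortion optimization are all assumptions made for illustration.

```python
# Toy closed-loop coder: each reconstructed frame becomes the next reference.
# Hypothetical helper names; a scalar quantizer stands in for the encoder's
# real decision about whether an error is worth spending bits on.

def quantize(residual, step):
    """Round each residual sample to the nearest multiple of step."""
    return [step * round(r / step) for r in residual]

def code_frame(source, reference, step):
    """Predict from the reference (zero motion), then add the quantized residual."""
    residual = [s - r for s, r in zip(source, reference)]
    return [r + q for r, q in zip(reference, quantize(residual, step))]

source = [50, 50, 50, 50]     # the true picture is flat and static
damaged = [50, 50, 54, 50]    # reference carrying a small artifact

# Coarse quantizer: the 4-level error rounds to zero, so it is never corrected.
coarse = damaged
for _ in range(3):
    coarse = code_frame(source, coarse, step=16)

# Fine quantizer: the error is large enough to code, so one frame removes it.
fine = code_frame(source, damaged, step=4)
```

In this sketch `coarse` still carries the artifact after three frames, while `fine` returns to the flat source. Nothing in the loop cares which stage introduced the error in the first place.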

Looking more closely at the debugging images, the motion compensation output of the frame in which the stripe artifacts first appear shows no hint of the problem.

Above: Output from motion compensation and prediction of frame 22 when coding the Sintel trailer test at an extremely low rate without lapping, but with motion compensation. The image has been cropped, brightened and scaled up (fatbits) by 2x.

After motion compensation, Daala codes the left-over residue. The only thing coded for the blocks in question is a DC update, increasing the luma values of the blocks. Our stripes appear.

Above: Final frame 22 when coding the Sintel trailer test at an extremely low rate without lapping, but with motion compensation. The image has been cropped, brightened and scaled up (fatbits) by 2x.

What's happening is pretty clear:

Motion compensation first shifts the area slightly above the block downward, resulting in a block that's mostly dark but has a small stripe of lighter luma along the top. At this point just after the motion prediction, there's no artifact because that lighter strip at the top of the block still matches the luma of the area above it. The light/dark boundary just shifted down a bit. Now it's inside a block instead of at the edge between blocks...

Above: Output from motion compensation and prediction of frame 22 as above, but with block coding boundaries marked in black and motion compensation flow marked in green. The image has been cropped, brightened and scaled up (fatbits) by 2x.

PVQ then codes a DC update that brightens the whole block, including the small strip of brighter luma that motion compensation shifted downward into the block. The bright stripe that appears is where these two 'improvements' to the coded block essentially stack up, brightening the top of the block twice. Each change makes sense in isolation; together they're a mistake.

Above: Final frame 22 output as above, but with block coding boundaries marked in black and blocks that received a DC luma update marked in green. The image has been cropped, brightened and scaled up (fatbits) by 2x.
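The stacking of the two changes can be reproduced on a single column of luma samples. The following is a rough sketch under stated assumptions: the block size, the luma values, and the helpers `shift_down` and `dc_update` are made up for illustration and do not come from the Daala source.

```python
# Toy 1-D column of luma samples spanning two vertically adjacent blocks.
# Block size and luma values are illustrative assumptions, not Daala data.

BLOCK = 8
LIGHT, DARK, TARGET = 60, 40, 50   # lower block should fade from 40 up to 50

def shift_down(column, dy):
    """Motion-compensate by fetching each sample from dy rows above (clamped)."""
    return [column[max(i - dy, 0)] for i in range(len(column))]

def dc_update(column, start, size, delta):
    """Add one constant offset to every sample of a single block."""
    return [v + delta if start <= i < start + size else v
            for i, v in enumerate(column)]

reference = [LIGHT] * BLOCK + [DARK] * BLOCK   # edge sits on the block boundary
predicted = shift_down(reference, dy=2)        # edge now 2 rows inside lower block
final = dc_update(predicted, BLOCK, BLOCK, TARGET - DARK)

upper = final[:BLOCK]   # all 60: untouched
lower = final[BLOCK:]   # [70, 70, 50, 50, 50, 50, 50, 50]
```

The two rows at the top of the lower block end up at 70, brighter than both the block above (60) and the rest of their own block (50). Neither step creates the stripe alone: `predicted` is still a clean light-to-dark edge, and a DC update applied to an unshifted block would leave it flat.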

It's not clear we can fault the motion compensation. Naive MC systems commonly chase up or down gradients during a fade, and that's effectively what's happening here. Smarter analysis would be less prone to the effect, but at the moment our OBMC system is a powerful framework that's just learning to walk. It's still at the proof-of-concept stage. So this isn't really a bug, either in the code or the initial design.
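Gradient chasing of this kind is easy to provoke in a toy search. The sketch below assumes a one-dimensional luma gradient, a uniform brightening between frames, and a pure sum-of-absolute-differences (SAD) search with a hypothetical `best_vertical_vector` helper. It ignores the cost of coding the vector, which a real encoder would weigh, and says nothing about how Daala's own search is implemented.

```python
# Toy 1-D block matching on a vertical luma gradient during a fade.
# Pure SAD search; real encoders also weigh the cost of coding the vector.

def sad(a, b):
    """Sum of absolute differences between two equal-length sample runs."""
    return sum(abs(x - y) for x, y in zip(a, b))

def best_vertical_vector(ref, cur, start, size, search):
    """Return (cost, dy) minimising SAD; dy is the row offset into the reference."""
    best = None
    target = cur[start:start + size]
    for dy in range(-search, search + 1):
        lo = start + dy
        if lo < 0 or lo + size > len(ref):
            continue  # candidate block would fall outside the reference
        cost = sad(target, ref[lo:lo + size])
        if best is None or cost < best[0]:
            best = (cost, dy)
    return best

ref = [100 - 2 * i for i in range(24)]   # lighter at the top, darker below
cur = [v + 4 for v in ref]               # same scene, uniformly 4 levels brighter

zero_cost = sad(cur[8:16], ref[8:16])    # cost of leaving the block in place
cost, dy = best_vertical_vector(ref, cur, start=8, size=8, search=4)
```

Nothing in the scene moved, yet the search prefers `dy = -2` (fetch from two rows above) at zero cost over the zero vector at a cost of 32, because sliding along the gradient happens to imitate the brightening.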

...or is it?

Although I just said that the motion compensation is behaving approximately as expected (given that it's not yet very smart) there's one more "huh, that's funny" aspect. Virtually all of the motion vectors chasing gradients are purely vertical. Does some accident of the context model adaptation make vertical vectors overly cheap, or is it a real bias? Or even a bug?

Interestingly enough, rotating the video 90° and encoding does not show the same tendency to generate purely horizontal motion vectors. They appear eventually, but in much smaller numbers, and analysis codes no motion vectors whatsoever in the frame we investigated above.

Hmmm.

I see what you did there

On the third hand, even if this is a flaw in our analysis, is it really a problem? After all, it's not a problem conceptually unique to Daala, and as far as I'm aware, other codecs do not explicitly look to avoid this interaction.

More importantly, lapped transforms mostly save our bacon here, even in this mostly pathological case. Because of the lapping, we don't have sharp block edges that can cause stripes. Because our motion compensation is overlapped and blended, it also can't artificially introduce edges, not even at low rates. The double stacking can still happen, but it's relatively unnoticeable.
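The effect of blending can be sketched by comparing a hard switch between two block predictions with a cross-fade between them. This is only an illustration: the linear window, the two motion vectors, and the helper names are assumptions, and Daala's actual OBMC windows and grid are more involved.

```python
# Toy 1-D comparison of a hard switch between two block predictions and a
# linear cross-fade between them. The window and vectors are assumptions.

N = 16
ref = list(range(N))   # gentle ramp, 1 level per sample

def fetch(dx):
    """Predict the whole span using one motion vector (clamped at the edge)."""
    return [ref[max(i - dx, 0)] for i in range(N)]

pred_a = fetch(0)      # left block's vector
pred_b = fetch(6)      # right block's vector

# Non-overlapped: each half uses only its own block's prediction.
blocked = pred_a[:N // 2] + pred_b[N // 2:]

# Overlapped: the weight shifts smoothly from pred_a to pred_b across the span.
weights = [(i + 0.5) / N for i in range(N)]
blended = [(1 - w) * a + w * b for w, a, b in zip(weights, pred_a, pred_b)]

def max_step(samples):
    """Largest jump between neighbouring samples."""
    return max(abs(y - x) for x, y in zip(samples, samples[1:]))
```

With these numbers `max_step(blocked)` is 5, a visible discontinuity at the block boundary, while `max_step(blended)` stays below 1, no steeper than the ramp in the reference. The disagreement between the two vectors is still present in the blended prediction, but it is spread across the overlap instead of concentrated into an edge.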

Noticeable or not, the conflict may still cost some coding efficiency. Or is the problem not actually a problem at all?

Hmmm.

—Monty (monty@xiph.org) March 25, 2015

Monty's Daala documentation work is sponsored by Mozilla Research.
(C) Copyright 2015 Mozilla and Xiph.Org