Ogg Theora

a Free Video Codec and Multimedia Platform

Ralph Giles

Xiph.org Foundation

Open Multimedia

  • Ogg Theora rocks
  • It's free
  • You should use it
[we will want to elaborate here, obviously]

Compression

How does compression work?

Compression

  • exploit redundancy to save space
  • the more you know about the data the better
  • two general classes: lossless and lossy

Compression

How can we remove redundant information?

Because computers can process data, we can rearrange the bits.

A program + a bitstream = another bitstream.

Compression

Define the complexity of a bitstream as the size of the smallest program that can reproduce it.

C(bitstream) = min( sizeof(program) + sizeof(compressed stream) ), taken over all programs that reproduce the bitstream

For the structured data we care about in practice, this is much smaller than the original.

In practice we also reduce complexity (and add security!) by using the same program together with different input bitstreams to handle a whole class of data. General-purpose compression works on anything; class-specific compression techniques build on this by adding a model of the data, which is used to further rearrange the bits so that the general-purpose compression works better. It's all about exposing entropy: concentrating the inherent complexity of the bitstream.

Compression

In practice, available CPU time and decoder complexity limit the compression factor.

A program designed to compress (and decompress) a class of data is called a codec.

Compression

'Embarrassing' compression

  • Compress n bitstreams in n bits!
  • Not a real reduction in complexity.

The term 'embarrassing' compression is used in analogy to 'embarrassingly parallel', which refers to running a code on a multiple-CPU system not by doing the difficult work of rewriting it to support multiple concurrent threads of execution, but simply by running several copies of the same code, for example with different parameters or initial conditions.
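To make the joke concrete, here is a minimal Python sketch (the corpus and function names are hypothetical): the decoder has all n bitstreams built into it, so each stream 'compresses' to a small index. Nothing is really saved; the complexity just moved into the program.

```python
import math

# Toy 'embarrassing' codec: the n bitstreams are baked into the decoder,
# so each one 'compresses' to an index of ceil(log2(n)) bits.
CORPUS = [b"first stream", b"second stream", b"third stream", b"fourth stream"]

def encode(bitstream):
    # 'Compress' a known stream to its index in the corpus.
    return CORPUS.index(bitstream)

def decode(index):
    # 'Decompress' by looking the stream up inside the program itself.
    return CORPUS[index]

assert decode(encode(b"third stream")) == b"third stream"
print(math.ceil(math.log2(len(CORPUS))), "bits per stream")  # 2 bits for 4 streams
```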

Compression

A program that reproduces the original bitstream performs lossless compression.

What went in comes back out.

This is in contrast to so-called 'lossy' codecs like Vorbis and Theora.

Lossless Compression

LZ77 algorithm

  • simple redundancy: remember repeated strings
  • LZ77 was the first of a series of dictionary-like compression algorithms
  • followed by LZ78, LZW, etc.
  • unlike (the infamous) LZW, LZ77 is not encumbered by patents

Lossless Compression

LZ77 algorithm

  • store new strings as literals
  • mark repeated strings with back-references
  • back-references are (offset, length) pairs that copy from a window of recent data (the last 32 kB, in DEFLATE)
  • aabbccaaaabcc... -> aabbcc(6,2)(2,2)(7,1)(7,2)... (decoded in the sketch below)
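A minimal sketch of a decoder for this token notation (an illustration, not actual LZ77 reference code); copying one character at a time lets a back-reference overlap the data it is producing:

```python
def lz77_decode(tokens):
    # Decode a mix of literal strings and (offset, length) back-references.
    # A reference copies `length` characters starting `offset` characters
    # back from the current end of the output.
    out = []
    for tok in tokens:
        if isinstance(tok, str):
            out.extend(tok)           # literal
        else:
            offset, length = tok      # back-reference
            for _ in range(length):
                out.append(out[-offset])
    return "".join(out)

# The example from the slide:
print(lz77_decode(["aabbcc", (6, 2), (2, 2), (7, 1), (7, 2)]))  # aabbccaaaabcc
```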

Lossless Compression

  • don't use more symbols than you need
  • storing 'aabbccaaaabcc' with 8-bit characters: 104 bits (13 bytes)
  • if there are only three symbols, we only need 2-bit characters: 26 bits (rounds up to 4 bytes: 25% of the original)

Lossless Compression

  • What if some symbols are more common than others?
  • we can save more bits by using a variable-length code: about 1.5 bits per character
  • the trick is knowing how many bits to read!

Lossless Compression

  • Our example: aabbccaaaabcc
  • 'a' is most common: let that be '0'
  • 'b' and 'c' can be 2 bits: '10' and '11'
  • this is a Huffman Code

Lossless Compression

Huffman Coding

  • we build a binary tree: the prefix determines the code length
  • Our example: 'a'->'0', 'b'->'10', 'c'->'11'
    aabbccaaaabcc -> 00101011110000101111 (20 bits, 19%)
  • this is most efficient if the symbol probabilities are powers of two (2^-n)
  • for other distributions, there's arithmetic coding
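A minimal sketch of the tree construction, using Python's heapq as the priority queue. The exact codewords may differ from the slide by swapped 0/1 branches, but the lengths come out the same:

```python
import heapq
from collections import Counter

def huffman_code(text):
    # Repeatedly merge the two least-frequent subtrees; symbols in the
    # merged subtrees get one more prefix bit. Heap entries are
    # (frequency, tiebreaker, {symbol: codeword}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(Counter(text).items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, count, merged))
        count += 1
    return heap[0][2]

code = huffman_code("aabbccaaaabcc")
bits = "".join(code[s] for s in "aabbccaaaabcc")
print(code)        # {'a': '0', 'b': '10', 'c': '11'} (up to 0/1 swaps)
print(len(bits))   # 20 bits, versus 104 for 8-bit characters
```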

Lossless Compression

DEFLATE algorithm.

  • a combination of LZ77 and Huffman coding.
  • forms the basis of the popular gzip program (and pkzip). It is also used in the PNG image format, PDF, PostScript, etc.

Documented in RFC 1951.

Lossless Compression

DEFLATE algorithm

  • this format has become very popular for generic lossless compression.
  • standardized and patent-free.
  • easy to use via the open source zlib library (see the sketch below).
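Since Python ships bindings for zlib, the whole gzip-style pipeline is one call:

```python
import zlib

data = b"aabbccaaaabcc" * 1000            # repetitive input compresses well
packed = zlib.compress(data, 9)           # 9 = best compression
print(len(data), "->", len(packed), "bytes")
assert zlib.decompress(packed) == data    # lossless: what went in comes out
```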

Lossless Compression

Predictors

A model of the data can often support additional compression:

  • use the model to predict the data, and encode the error.
  • transform the data to concentrate complexity.

Lossless Compression

Predictors

FLAC and Speex use predictors.

The PNG image format uses one of five predictors (filters), which can vary from row to row.

Theora and Vorbis do a more complicated version of this.

Vorbis and Theora use the (quantized/interpolated) MDCT coefficients as a predictor, and of course Theora has motion vectors. These are much more complicated than the simple LPC filter used in FLAC, or the difference schemes PNG uses.
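Here is a toy version of the idea behind PNG's 'Sub' filter (simplified; not the actual PNG code, and the signal is made up): predict each byte from the previous one and keep only the error. For smooth data the residuals are small, so the general-purpose compressor behind them does much better.

```python
import math
import zlib

def sub_filter(samples):
    # PNG-style 'Sub' predictor: predict each byte as the previous one
    # and keep only the prediction error (modulo 256).
    prev, out = 0, bytearray()
    for s in samples:
        out.append((s - prev) % 256)
        prev = s
    return bytes(out)

# A smooth, slowly varying signal: a large alphabet with few exact repeats.
smooth = bytes(int(100 + 50 * math.sin(i / 40)) for i in range(8192))
raw = zlib.compress(smooth, 9)
residuals = zlib.compress(sub_filter(smooth), 9)
print(len(raw), "vs", len(residuals), "bytes")  # residuals compress much better
```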

Lossless Compression

Transforms

Rotate a chunk of data to a different coordinate system. Like diagonalizing a matrix.

Example: the RGB -> YCbCr colourspace transform (used in JPEG, Theora, and analog TV).
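For reference, a sketch of the JPEG/JFIF (full-range BT.601) version of the transform; Theora and analog TV use the same idea with slightly different scaling:

```python
def rgb_to_ycbcr(r, g, b):
    # BT.601 luma plus scaled colour differences, as used by JPEG/JFIF.
    # Y carries brightness; Cb and Cr carry colour, which the eye resolves
    # poorly and which can therefore be stored with less precision.
    y  =  0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr =  0.5 * r - 0.418688 * g - 0.081312 * b + 128
    return y, cb, cr

print(rgb_to_ycbcr(255, 0, 0))  # pure red: modest Y, Cr far above neutral 128
```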

Lossless Compression

Transforms

Applying a Fourier-type transform concentrates the complexity of smooth data.

JPEG and Theora use the DCT; Vorbis uses the MDCT.
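A naive, unnormalized DCT-II makes the concentration visible (real codecs use fast, factorized transforms; the signal here is made up):

```python
import math

def dct(x):
    # Naive DCT-II: express x as a sum of cosines. O(n^2), for illustration.
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for i in range(n))
            for k in range(n)]

signal = [i / 8 + math.sin(i / 5) for i in range(64)]   # a smooth ramp + wave
coeffs = dct(signal)
energy = sum(c * c for c in coeffs)
head = sum(c * c for c in coeffs[:8])
print(f"first 8 of 64 coefficients hold {100 * head / energy:.1f}% of the energy")
```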

Lossy Compression

These techniques only go so far. To do better we need a new ingredient.

  • A factor of 5, like in our example, is about the limit (best case)
  • For multimedia, we don't need bit-accurate reproduction. Just something that looks or sounds the same.
  • remove perceptual as well as mathematical redundancy!

Perceptual Compression

One common approach is to truncate transformed data.

Generally, one first transforms the data to concentrate complexity (mathematical or perceptual) and then truncates the output, discarding details outside the region of concentration.

One may also represent the results of the transform less accurately than would be necessary to compress losslessly.
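A sketch of that round trip, with an orthonormal DCT pair and a made-up smooth signal: quantize the coefficients coarsely, zero the high frequencies entirely, and check how little the reconstruction differs.

```python
import math

def dct(x):
    # Orthonormal DCT-II (naive O(n^2) version, for illustration).
    n = len(x)
    f = [math.sqrt(1 / n)] + [math.sqrt(2 / n)] * (n - 1)
    return [f[k] * sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                       for i in range(n)) for k in range(n)]

def idct(c):
    # Inverse of the orthonormal DCT-II above (a DCT-III).
    n = len(c)
    f = [math.sqrt(1 / n)] + [math.sqrt(2 / n)] * (n - 1)
    return [sum(f[k] * c[k] * math.cos(math.pi * k * (2 * i + 1) / (2 * n))
                for k in range(n)) for i in range(n)]

signal = [10 + 4 * math.sin(i / 3) for i in range(32)]
coeffs = dct(signal)

q = 0.5                                    # quantizer step: coarser = fewer bits
quantized = [round(c / q) for c in coeffs]
quantized[8:] = [0] * (len(coeffs) - 8)    # truncate: discard high frequencies

restored = idct([v * q for v in quantized])
err = max(abs(a - b) for a, b in zip(signal, restored))
print(f"worst-case error {err:.3f} on samples between 6 and 14")
```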

Perceptual Compression

Campbell-Robson Contrast Sensitivity Chart


Notice how the visual boundary between the bands and the smooth gray area falls toward the right end of the chart. This is why throwing away information at high spatial frequencies doesn't compromise the image's appearance much. Because there is a lot of information at high frequencies, discarding it adds a lot to compression efficiency.

Perceptual Compression

One common approach is to truncate transformed data.

Thanks!

Hope you enjoyed it.