The process of chirp estimation fits a set of modulated sinusoids to a given audio signal. We pass in initial chirp parameter estimates and the estimator then improves these approximations to a desired level of fit precision. The parameters describing a chirp are the zeroth-order parameters P (phase) and A (amplitude), the first-order parameters W (frequency) and dA (amplitude modulation), and the second-order parameters dW (chirp rate, or frequency modulation) and ddA (second-order amplitude modulation).
At present, Ghost research uses a specific chirp estimation algorithm we published in 2007, and we'll likely continue using and improving this algorithm until something clearly better comes along. For this reason, it's essential we know what convergence and error to expect from it.
As alluded to in demo 2, the original 2007 paper had little space to devote to detailed algorithm characterization. The few graphs covered specific disjoint cases that were primarily concerned with efficiency as compared to other techniques. This data does not lead to a comprehensive, intuitive understanding of how the chirp estimation performs given varied input data and operating parameters.
In this update, we concern ourselves with documenting the expected behavior of the chirp estimator, as well as what to expect from several optional improvements alluded to but left unexplored by the 2007 paper. Because chirp estimation is of general use in the field of signal processing, we do not focus only on expected use cases for Ghost. This documentation also serves as a benchmark archive for testing. Future improvements and optimizations can be compared against these known good results.
The reference chirp estimator algorithm as well as code used to generate the graphs can be found in Xiph.Org's SVN at http://svn.xiph.org/trunk/chirptest. The module builds a standalone C application (using libcairo for drawing) named 'chirp' that when run produces all the graphs seen here. The application requires several hours of run time on a 32-core Opteron server to generate the graphs as seen here, so you probably don't want to try it on a netbook.
Except where noted, graphs are generated using the algorithms as described in the paper linked above. The paper contains an algebraic sign error on Page 11 in Algorithm 2; the 12th line has a '+' which should be a minus. Specifically it should read:
The algorithms are implemented in single precision float arithmetic with the exception of the instantaneous phase calculation in the basis chirp generation. A naive calculation such as cosf(P+(W+dW*i)*i) runs out of bits in the mantissa of the single-precision argument to cosf() as W and i increase. For the sake of simplicity and clarity, the naive calculation is used but implemented and passed in double precision.
The termination criterion for convergence is a simple magnitude threshold. The algorithm is declared to have converged when the sum of the squares of the basis projections for each chirp is less than 1e-12. This value is chosen to be large enough to sit comfortably above numerical noise (between 1e-16 and 1e-14), but small enough to show a substantial amount of the algorithm's depth of precision. The convergence graphs plot the number of iterations required before the final uncounted iteration results in a change below the termination threshold. The error graphs give the worst-case performance for our specific termination criterion across all values of any swept input.
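The termination test described above can be sketched as follows (a hypothetical layout assuming the per-chirp basis projections are stored contiguously; not the actual chirptest code):

```c
#include <stddef.h>

/* Convergence test following the magnitude-threshold criterion
   described in the text: iteration stops once the sum of squared
   basis projections for every chirp falls below 1e-12. */
static int converged(const float *projections, size_t n_chirps,
                     size_t n_bases){
  size_t i, j;
  for(i = 0; i < n_chirps; i++){
    float sum = 0.f;
    for(j = 0; j < n_bases; j++){
      float p = projections[i*n_bases + j];
      sum += p*p;
    }
    if(sum >= 1e-12f) return 0; /* this chirp is still moving */
  }
  return 1; /* all chirps below threshold */
}
```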
Keep in mind that 'convergence' simply means that the estimation algorithm arrived at a stable solution and halted. It does not mean that the solution is correct or sufficiently accurate. Consult both the convergence and error graphs to determine the trustworthy domain of operation.
The paper offers two variants of the chirp estimation algorithm, linear and non-linear. The linear and non-linear algorithms differ primarily in that the non-linear algorithm 'recenters' the W parameter (frequency) each iteration, that is, it regenerates the basis functions with an updated W estimate each iteration as well as restarting the residual error calculation. Basis generation is a relatively expensive operation, and the nonlinear estimator repeats basis generation for each chirp in each iteration. Thus, although the conceptual difference between the linear and non-linear algorithms is small, the computational cost is potentially substantial.
The non-linear algorithm in the paper does not recenter the dW (chirp rate) parameter. It also assumes an initial dW estimate of 0, thus avoiding the dW term in basis generation completely:
This is an intentional optimization to reduce the basis generation and projection complexity; if unchirped, the basis functions show symmetry/antisymmetry about their centers.
We recenter the dW parameter by modifying the basis functions:
This basis modification also requires modifying the dW update step in Algorithm 2:
Using and recentering the dW parameter has benefits, improving both convergence range and dW accuracy. We will document the estimator performance both of 'partial' non-linear operation (recentering only W with an unchirped basis as described in the paper) as well as 'full' non-linear operation (recentering both W and dW with chirped basis functions).
The graphs below show behavior of the linear, partial-nonlinear and full-nonlinear estimators across sets of single-chirp input signals. Estimation was performed with a sine input window and other parameters as in the original 2007 paper.
Near DC, convergence and error behavior most strongly depend on chirp center frequency (W); the following graphs show behavior from DC extending to a few bins above DC. In addition, the phase of an input chirp affects behavior at the edge of convergence, so all graphs show the worst-case behavior run over a swept-phase set of inputs. Estimator behavior is exactly anti-symmetric about the center of the spectrum.
As expected, the linear algorithm shows reliable convergence behavior but, as it does not restart the basis functions or error estimates, error increases steadily above and below Y=0 where the input W estimate already matches the chirp W before iteration. Both the partial- and full-nonlinear algorithms show reduced convergence range but high precision across the domain of frequency convergence, making the total domain of useful output much larger than the linear estimator's. This comes at the cost of a more expensive algorithm.
The linear algorithm shows dW (chirp rate) convergence and error behavior similar to frequency (W) behavior.
The partial-nonlinear algorithm shows error behavior similar to the linear algorithm once there's a significant dW (chirp) component to the input signal. This is not surprising as the partial-nonlinear algorithm is not recentering the basis functions for dW changes and the calculated residual error is also an approximation computed from the basis functions.
The full-nonlinear algorithm shows vastly improved dW behavior similar to W behavior. It gives accurate results across the nearly-complete domain of convergence, which is greatly expanded over the partial-nonlinear algorithm. With the proper window, we can in fact expand the domain of convergence to an arbitrarily large portion of the solution space. More about this later under the section on windowing functions.
A chirp's actual W does not strongly influence the convergence or accuracy of the algorithm once it is more than 5-10 bins above DC (or below Nyquist). W can still affect numerical noise; depending on how the basis and reconstruction chirps are generated, an approach such as cos(W*t) shows a linear loss of precision as |t| or W increases. As such, single-precision basis/reconstruction chirp generation can show noticeably more noise near Nyquist than at DC.
FFT techniques tend not to be particularly sensitive to exact window choice; there's a set of mostly interchangeable 'good' windows that give similar performance for most uses. Unless you have a very specific need, the Hanning window covers 95% of what any DSP engineer would ever need for use with Fourier transforms. This is not the case with the chirp estimator. The window function used for input and basis windowing exerts considerable influence on algorithm accuracy and convergence behavior.
This all points to the possibility that a unimodal window without sidelobes could render the nonlinear estimator's solution space substantially convex for single chirps, and indeed this turns out to be the case. The W and dW terms can be made to converge from initial estimates arbitrarily far away from the correct value. This comes at the cost of typically slower convergence (unimodal windows tend to be broader and roll off more slowly) and some reduction in convergence domain near DC.
Window functions affect linear estimation less profoundly than non-linear estimation. W and dW are not being recentered and as such can't become stuck in local maxima caused by sidelobe leakage. Nonetheless, we expect a window function to influence overall error behavior.
In the above graphs we do indeed see improvement in both absolute error performance and convergence speed depending on window selection. The useful range of linear estimation can be enhanced by at least a factor of two with an appropriate choice of window.
Non-linear estimation is quite sensitive to the window function in use as the W and dW parameters track along the frequency domain energy contours created by the window's mainlobe and sidelobe. Where the linear estimator is ~guaranteed to converge (once we're away from DC/Nyquist), we've already noticed that the nonlinear estimator does not have an inherently convex solution space. A unimodal window, however, can render the solution space similarly convex.
Also of interest are optimized windows that attempt to deliver wider convergence without loss of convergence speed. Machine optimization has shown a few promising window research directions.
NOTE: the scales on the nonlinear graphs cover a considerably wider range than on the linear graphs.
The paper states that second-order parameter fits (dW and ddA) are optional to both the linear and nonlinear algorithms. The nonlinear algorithms are also capable of fitting only one of the second order parameters (while clamping the other to zero), such as we've been doing in the nonlinear graphs above, where we fit dW but not ddA.
(Strictly speaking, the same is true of the first-order parameters; we could choose to fit W and not dA for example. However, because of the alternating even/odd orthogonality of the basis functions, fitting only W and not dA has almost no effect on algorithm behavior. This is visible in the first-order nonlinear graphs below. The same is not true of fitting dW but not ddA.)
The chirp estimation algorithm does not use complex math; the four (or six) basis functions used by chirp estimation are all real sequences. However, because the basis functions are quadrature (sine/cosine) pairs, they behave in some ways similarly to the real/imaginary halves of a complex basis.
The linear estimator calculates basis energies once at the beginning of fitting. The nonlinear estimators must recalculate basis and basis energy in each iteration.
If we treat the sine/cosine basis pairs as if they were a single complex basis, the energy of the three complex basis functions is described by:
Rather than normalizing each of the six real basis projections by the basis energy, we instead normalize by half the 'complex' basis energy. This approximation results in basis energy being determined solely by the window and blocksize, and we can both simplify and move the calculation out of the nonlinear loop.
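A minimal sketch of the idea, assuming the zeroth-order sine/cosine basis pair (the higher-order bases add extra polynomial-in-time factors, which this illustration omits; not the actual chirptest code):

```c
#include <stddef.h>

/* Symmetric normalization sketch: treating a windowed sine/cosine
   basis pair as one complex basis, its energy is
   sum w[i]^2 * (cos^2 + sin^2) = sum w[i]^2, which depends only on
   the window (and blocksize), not on W.  Half of that quantity then
   normalizes either real projection, and it can be computed once,
   outside the nonlinear loop. */
static double half_complex_energy(const float *window, size_t n){
  double e = 0.;
  size_t i;
  for(i = 0; i < n; i++)
    e += (double)window[i]*window[i];
  return e * 0.5;
}
```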
Symmetric normalization approximately halves the number of adds and multiplies needed during the basis projection step of the nonlinear algorithms, as normalization energy is not explicitly calculated. That said, the time spent generating new, recentered basis functions each iteration typically swamps the cost of the actual basis projection onto the error vector. As such the time saved by symmetric normalization may be insignificant, depending on the relative expense of basis generation.
The linear algorithm generates basis functions only once and so must compute normalization factors only once. Symmetric normalization may become useful in the event that the linear algorithm is run repeatedly for only a few iterations each time, thus increasing the relative amount of time the linear algorithm would spend calculating normalization factors during initialization.
Page 10 of the 2007 paper discusses the quantity 'alpha' (α) which is described as an update rate used to scale the update added to the W parameter each iteration:
The paper presents a graph that shows the optimal value for alpha is about 1.0. The exact relationship between W convergence speed and alpha is slightly more complex, but 1.0 does hold up as a good value for alpha upon closer inspection. The optimal value of alpha is affected somewhat by window choice.
In the fully nonlinear estimator, a similar update scaling coefficient can be applied to the dW update quantity:
Adjusting the dW alpha parameter affects speed of convergence for both W and dW.
The original W alpha value is affected by window choice, but the dW alpha parameter is much more so. The graphs make it clear that the optimal alpha for dW updates depends strongly on window and that we can do much better than a value of 1.0.
The following graphs replot the near-DC behavior for several windows as presented earlier, but select a dW alpha appropriate for each window to show the effect on convergence speed and error behavior.
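The alpha-scaled update step under discussion can be sketched as (an illustrative helper, not the actual chirptest code):

```c
/* Each iteration's raw correction delta is multiplied by an update
   rate alpha before being applied to the parameter.  The same form
   serves both the W update (alpha near 1.0 works well) and the dW
   update (whose best alpha depends strongly on the window). */
static double scaled_update(double param, double delta, double alpha){
  return param + alpha*delta;
}
```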
The effect of noise on estimation algorithm behavior is mostly uninteresting. As with the FFT, larger block sizes decrease the relative contribution of noise. Noise energy contribution to the fit parameters is governed primarily by the equivalent noise bandwidth of the window in use.
The graphs below illustrate that noise has little effect on convergence speed and none on stability, and contributes energy to fit parameters as predicted by window characteristics. As with all the graphs here, the plots below use a blocksize of only 256 samples.
Two behaviors mostly determine the ability of the chirp estimation algorithm to discriminate and individually resolve closely spaced peaks. The specifics of both are determined by the choice of window function.
Neither the input data chirps nor the basis functions used by the chirp estimator are orthogonal to one another; they are usually only 'mostly orthogonal'. The farther apart chirps (and chirp estimates) are, the less they interact.
Fit accuracy of closely spaced sinusoids and chirps is limited by the effective orthogonality. As with the FFT, the wider the mainlobe of the window in use, the worse the 'in-close' frequency discrimination.
In order to demonstrate orthogonality limits in isolation, the graphs above show an approximately best-case situation using an accurate initial estimate (exact W estimate, all other parameters randomized within +/-.1 of the correct value and amplitude within +/-.1dB). Additional estimate error does not affect orthogonality limits or behavior, but it can bring main/sidelobe capture into play.
The chirp estimator is essentially a gradient-climbing [non-]linear solver; it walks up the hill of an objective function until it reaches the top. Just like in the FFT, the energy leakage from a strong peak (both mainlobe and sidelobe) can swamp weaker peaks in the immediate vicinity, either moving the apparent maximum or disguising it completely.
The chirp estimator both fits peaks in order of estimated energy and subtracts already-estimated energy from the fitting process. This reduces the apparent energy leakage and unmasks weaker genuine peaks in the same immediate vicinity. The chirp estimator is thus potentially better able to discriminate closely spaced peaks than the FFT when using the same window function.
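The energy-ordered fitting described above might be sketched as follows (the struct and helpers are illustrative, not the actual chirptest API):

```c
#include <stdlib.h>

/* Estimates are sorted strongest-first so that each stronger chirp
   is fit, and its estimated energy subtracted from the residual,
   before weaker neighbors are fit. */
typedef struct {
  float A;  /* estimated amplitude */
  float W;  /* estimated center frequency */
} chirp_est;

static int by_energy_desc(const void *a, const void *b){
  float ea = ((const chirp_est *)a)->A * ((const chirp_est *)a)->A;
  float eb = ((const chirp_est *)b)->A * ((const chirp_est *)b)->A;
  return (ea < eb) - (ea > eb);   /* descending energy */
}

static void sort_strongest_first(chirp_est *c, size_t n){
  qsort(c, n, sizeof(*c), by_energy_desc);
}
```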
However, this extra discrimination gain is conditional on a good enough initial estimate. In the nonlinear algorithms, a poor initial estimate allows strong peaks to 'capture' weaker peaks in the first few iterations by deflecting the W and dW estimates of weaker peaks too far toward the stronger peak. Recentering causes them to collapse into the stronger peak. The linear algorithm does not recenter the W (or dW) parameter, and so it does not exhibit main/sidelobe capture.
Though the quality of the initial estimate does not, for the most part, affect the absolute error of the final nonlinear fit, we can see in the graphs below that there is an 'event horizon' beyond which the energy leakage of a strong peak captures the initial estimate of the weaker peak, causing it to converge to the wrong chirp. This event horizon is at the point where leakage energy from a stronger peak is approximately as strong as the weaker peak itself.
The graphs above illustrate sidelobe capture resulting from a poor initial estimate. Each plots the fit of an unchirped sinusoid in the immediate vicinity of a stronger reference sinusoid. Both sinusoids are fit using exact initial estimates but with amplitudes of zero. The weaker sinusoid's estimate is placed first in the list so that without amplitude estimates it is fit first in the initial iteration. As such, the plots show the limits of main/sidelobe capture when the weaker sinusoid is captured by the energy leakage of the stronger sinusoid.
We now graph fit behavior of realistic initial estimates, approximately intended to match the level of accuracy we could reasonably achieve by estimating parameters from an initial FFT. The full-nonlinear algorithm is substantially unaffected by additional estimate uncertainty so long as we avoid main/sidelobe capture. The partial-nonlinear estimator likewise behaves well so long as the dW estimate is accurate. The linear algorithm requires both an accurate W and dW parameter.
The first graphs plot two unchirped sinusoids, so we see similar excellent behavior from the full- and partial-nonlinear algorithms. The error in the W estimate degrades the linear algorithm's performance.
The graphs above use randomized estimates with A (amplitude) within 6dB, W (frequency) within .5 FFT bins and P (phase) within .1 radians. For purposes of visual presentation, W estimate error is randomized away from the neighboring chirp; this keeps apparent mainlobe width the same. A estimate error is also randomized away from the neighboring chirp's value in order to guarantee that the amplitude sort order remains correct in all trials.
We note that the 'realistic' estimate shows some main/sidelobe capture due to remaining estimate error from the stronger sinusoid. In the region outside of capture, we see that the fit error and convergence behavior are largely unaffected by the additional initial estimate error as expected.
Using the same estimate bounds parameters, the graphs below illustrate the interaction of frequency (W) and chirp rate (dW) when discriminating a test chirp from a central reference sinusoid. Note that no initial estimate is given for the chirp rate (it is zero) as an FFT cannot be used to estimate chirp rate directly.
As expected, we see continued excellent error performance from the full-nonlinear algorithm. The partial-nonlinear algorithm cannot recenter dW and thus shows degraded performance due to the initial dW estimate of zero. The linear algorithm is doubly hurt by the error in both W and dW.
First and foremost, we finally have a fairly extensive characterization of the behavior we can expect from our chirp estimator. This characterization comes with benchmarking code useful for both optimization and regression testing.
Most of the conclusions from the graphs are nuanced and numeric; they are better presented in visual form than in a text summary. However, a few particularly interesting concepts are worth repeating.
As always, happy hacking!