Non-Linear techniques

Linear techniques are widely used for pyramidal coding because they are easy to analyze in the frequency domain and efficient to compute, often as separable convolutions. Nevertheless, restricting ourselves to linear kernels is a significant limitation of our compression scheme, and improvements in compression ratio are most likely possible with non-linear REDUCE and EXPAND functions. We therefore reviewed existing non-linear approaches and tried to design our own NL scheme.


Existing NL methods

Most NL methods we encountered start from the observation that linear filters perform poorly around edges because of blurring. They assume that a significant gain can be obtained by preserving edges and details that are not part of high-frequency periodic structures. Periodic structures, on the other hand, should be eliminated from the images since they give rise to aliasing errors in the decimated images. Because linear filtering makes it difficult to attenuate aliasing components strongly, NL filters seem better suited to removing unnecessary detail from decimated images. In the methods we reviewed, the filter design usually relies on the intuition that improvements can be obtained on specific visual patterns such as edges. We felt that the choice of the EXPAND and REDUCE functions often seemed arbitrary and lacked a general compression-efficiency analysis. In particular, many other factors affect the entropy of the difference images and are not even considered in these papers. For instance, there is no proof that the EXPAND function is optimized for the REDUCE function and vice versa.


Optimal NL interpolation

If we choose a particular decimation filter, the problem can be reformulated as follows:
  1. Given an image I, compute its decimated image D = Reduce(I).
  2. Divide I into four sub-images (I1, I2, I3, I4) containing the even-even, odd-even, even-odd and odd-odd pixels respectively.
  3. Find the four optimal predictors Pi that predict Ii given D.
The resulting Expand function is represented in the following diagram.

figure: Expand function using optimal non-linear predictors
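
To make the pixel bookkeeping concrete, here is a minimal numpy sketch of this decomposition and of an Expand built from four per-phase predictors. The function names and the predictor callables are placeholders of our own, standing in for whatever Pi is eventually chosen, and the sketch assumes even image dimensions.

  import numpy as np

  def split_phases(image):
      """Split an image into its four polyphase sub-images: even-even,
      odd-even, even-odd and odd-odd pixels (row index first)."""
      return (image[0::2, 0::2], image[1::2, 0::2],
              image[0::2, 1::2], image[1::2, 1::2])

  def expand_with_predictors(reduced, predictors, out_shape):
      """Rebuild a full-resolution image from the reduced image D.
      `predictors` is a sequence of four callables, one per phase, each
      mapping D to the corresponding predicted sub-image."""
      out = np.empty(out_shape, dtype=float)
      out[0::2, 0::2] = predictors[0](reduced)
      out[1::2, 0::2] = predictors[1](reduced)
      out[0::2, 1::2] = predictors[2](reduced)
      out[1::2, 1::2] = predictors[3](reduced)
      return out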

An optimal NL predictor scheme exists theoretically. Indeed, if we have sufficient information on the signal statistics, the optimal predictor is shown to be the conditional expected value of the pixel given its neighborhood.
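
In symbols, writing N_D(x, y) for the neighborhood, in the reduced image D, of the position being interpolated (a notation we introduce only for this formula), the optimal predictor is

  P_i(x, y) = \mathbb{E}\left[\, I_i(x, y) \mid N_D(x, y) \,\right].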

This is optimal in the sense that it minimizes the average MSE between images (with the given statistics) and their reconstructed counterparts for a given neighborhood size. It is interesting to note that with a neighborhood equal to the size of the image, this scheme becomes equivalent to storing a table of all possible reduced and expanded images and simply doing a look-up at interpolation time.

So why not use this for our interpolation filter? The practical algorithm would be:

  1. Tabulate the expected value for each neighborhood in a preprocessing step.
  2. Use this as a look-up table to determine the interpolated pixel values from their neighborhood in the reduced image.
Unfortunately this method is impossible to implement as is, because of its exponential memory requirements: the number of possible neighborhoods is 256 raised to the number of pixels in the neighborhood, so even a 3x3 neighborhood already gives 256^9, about 4.7 x 10^21, entries. Furthermore, we would need very accurate signal statistics, since we must compute the expected intensity for each of these possible neighborhoods. If we wanted to learn these statistics from a training set, we would need a huge number of images; otherwise the look-up table would consist mainly of holes and of unrepresentative expected values.
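
Purely for illustration, here is a minimal sketch of what the tabulation step would look like with a dictionary keyed by raw neighborhood values. The neighborhood radius, the helper names and the dictionary representation are our own choices; the point is only to make visible why the table stays almost entirely empty on any realistic training set.

  import numpy as np
  from collections import defaultdict

  def build_lut(training_pairs, radius=1):
      """Accumulate, for each reduced-image neighborhood, the mean of the
      corresponding sub-image pixel over the training set.
      training_pairs: iterable of (reduced, target) image pairs, where
      target is one of the sub-images Ii (same size as the reduced image)."""
      sums = defaultdict(float)
      counts = defaultdict(int)
      for reduced, target in training_pairs:
          h, w = target.shape
          for y in range(radius, h - radius):
              for x in range(radius, w - radius):
                  key = tuple(reduced[y - radius:y + radius + 1,
                                      x - radius:x + radius + 1].ravel())
                  sums[key] += target[y, x]
                  counts[key] += 1
      # The tabulated expected value is the conditional mean over the
      # training set; most of the 256^(window size) keys never occur.
      return {k: sums[k] / counts[k] for k in sums}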


Approximately optimal NL interpolation, 1st attempt

To alleviate these problems, we need to limit the number of distinct neighborhoods used to build the look-up table. This can be done in several ways:
  1. Neighborhood features
  2. Feature quantization
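
As a hedged illustration of how these two ideas combine, the sketch below maps each neighborhood to a small feature vector and quantizes the features into a modest number of cells; the look-up table then only needs one expected value per cell. The particular features (local mean and gradients of a 3x3 patch) and the bin counts are assumptions of ours, not the choices made in this work.

  import numpy as np

  def neighborhood_features(patch):
      """Reduce a 3x3 neighborhood to a few scalars: local mean plus
      horizontal and vertical gradients (assumed feature choice)."""
      mean = patch.mean()
      gx = patch[:, 2].mean() - patch[:, 0].mean()
      gy = patch[2, :].mean() - patch[0, :].mean()
      return np.array([mean, gx, gy])

  def quantize_features(features, n_bins=(8, 4, 4),
                        ranges=((0, 255), (-255, 255), (-255, 255))):
      """Quantize each feature into a few bins and pack the bin indices
      into a single cell index (bin counts are illustrative only)."""
      cell = 0
      for f, b, (lo, hi) in zip(features, n_bins, ranges):
          idx = int(np.clip((f - lo) / (hi - lo) * b, 0, b - 1))
          cell = cell * b + idx
      return cell  # here at most 8 * 4 * 4 = 128 cells instead of 256^9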


Approximately optimal NL interpolation, 2nd attempt

The problem with the previous interpolator is clearly a lack of adaptiveness within each cell, since all pixels in a cell are mapped to a single interpolation value. Consequently, we propose to use an optimal linear predictor within each cell instead of the expected value. This amounts to determining local best-fitting planes: the resulting interpolator is piece-wise linear, with no continuity constraints at the cell boundaries. We used the pixels of the window as inputs to the predictor and the value in the sub-image Ii as the desired output. We did not use the features as inputs because they do not necessarily contain enough information to make a good predictor. The interpolated image pixels are computed as follows:

  \hat{I}_i(x, y) = C^T S(x, y)

where S(x,y) is a vector containing the values of the pixels in a window around (x,y). The coordinates of C are the coefficients of the linear interpolator; they are computed for each cell and each sub-image from the auto-correlation matrix of S and the correlation of S with Ii as follows:

  C = E\!\left[S S^T\right]^{-1} E\!\left[S\, I_i\right],

where both expectations are estimated over the training pixels falling in the cell.
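
For concreteness, here is a minimal numpy sketch of this per-cell fit. The ridge regularization term, the handling of empty cells, and all names and parameters are our own additions to keep the sketch self-contained and the auto-correlation matrix invertible; they are not part of the scheme described above.

  import numpy as np

  def fit_cell_predictors(windows, targets, cells, n_cells, ridge=1e-3):
      """Solve the normal equations separately within each cell.
      windows: (N, W) array, one window vector S(x, y) per training pixel
      targets: (N,) array of the corresponding sub-image values Ii(x, y)
      cells:   (N,) array of cell indices for the training pixels
      Returns an (n_cells, W) array of coefficient vectors C."""
      w = windows.shape[1]
      coeffs = np.zeros((n_cells, w))
      for c in range(n_cells):
          mask = cells == c
          if not np.any(mask):
              continue  # empty cell: keep a zero predictor as a fallback
          S = windows[mask]
          t = targets[mask]
          R = S.T @ S / mask.sum() + ridge * np.eye(w)  # auto-correlation of S
          r = S.T @ t / mask.sum()                      # correlation of S with Ii
          coeffs[c] = np.linalg.solve(R, r)
      return coeffs

  def interpolate_pixel(window, cell, coeffs):
      """One interpolated pixel: dot product of its window vector with
      the coefficient vector of its cell."""
      return coeffs[cell] @ window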

Choice of a Reduce function


Partitioning the images


Experiment 1

We got an error image with variance 17.95, which is pretty good compared to our best linear scheme (the Optimal Cubic method with a 15x15 kernel), which yielded 18.79.

Experiment 2

With everything else unchanged, this modification decreased the first-level error image variance to 17.48. We tried increasing the number of cells by using finer quantization, but we could not obtain better results than with this setup.

Experiment 3

Sadly, it does not bring any improvement. It does better on the training set (which makes sense, since we added degrees of freedom), but it probably lacks generality, and the variance increases to 24.7.
