This work compares the performance of two pyramid coding schemes for still images: a critically sampled subband pyramid and an over-complete pyramid. Once the subband coefficients or the error signals in the different layers are obtained, the way they are quantized and encoded significantly affects compression efficiency. Since this work focuses on rate-distortion performance and bit allocation, we implement simple scalar quantizers and memoryless encoding, with optimized bit allocation for the coefficients.
In the following sections, we briefly discuss the two pyramid coding schemes, bit allocation, and quantization, and describe in detail how they are implemented in this work. The simulation results are then presented and analyzed for a few test images.
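As an illustration of the simple scalar quantization and memoryless encoding mentioned above, the following Python sketch applies a uniform quantizer to synthetic coefficients and estimates the rate of a memoryless code by the empirical first-order entropy. The function names, the midtread quantizer, the Laplacian test data, and the step sizes are all assumptions made for illustration; they are not taken from the implementation used in this work.

```python
import math
import random
from collections import Counter

def uniform_quantize(values, step):
    """Midtread uniform scalar quantizer: map each value to an integer index."""
    return [int(round(v / step)) for v in values]

def dequantize(indices, step):
    """Reconstruct each value at the center of its quantization cell."""
    return [i * step for i in indices]

def first_order_entropy(indices):
    """Empirical entropy in bits per sample, ignoring dependence between samples."""
    counts = Counter(indices)
    total = len(indices)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Synthetic Laplacian-like coefficients (assumed test data, not image data).
random.seed(0)
coeffs = [random.expovariate(0.25) * random.choice((-1.0, 1.0)) for _ in range(10000)]

for step in (8.0, 4.0, 2.0):
    idx = uniform_quantize(coeffs, step)
    rec = dequantize(idx, step)
    mse = sum((c - r) ** 2 for c, r in zip(coeffs, rec)) / len(coeffs)
    print(f"step={step:4.1f}  rate={first_order_entropy(idx):.3f} bit/sample  mse={mse:.3f}")
```

Halving the step size raises the estimated rate and lowers the mean squared error, which is the trade-off that the bit allocation has to balance across subbands or layers.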
In critically sampled pyramid coding, the image spectrum is decomposed into
subbands by analysis filters. Each filtered signal is then subsampled by a
factor of 2, so that the cascaded decomposition yields the same number of
samples as the original image. Because the typical image spectrum decays
from low frequency to high frequency, the decomposition
produces subbands with flatter spectra. Encoding these subbands, which have greatly
reduced statistical correlation, improves the coding efficiency according to
rate-distortion theory [4].
In this work we choose the octave-band decomposition over the uniform decomposition.
Since the typical spectrum decays more rapidly at lower frequencies,
the former yields subbands with even flatter spectra by dividing the lower bands
further into narrower bands. The Discrete Wavelet Transform (DWT) is a common
method for implementing octave-band decomposition, and it is employed
in this work for critically sampled pyramid coding. Fig. 1 shows the original
image and the signal decomposed by analysis filters cascaded
in three levels, which results in ten subbands. The lowest band, where the energy
is mostly concentrated, is shown at the top left corner of the bottom image in
Fig. 1.
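The three-level octave-band decomposition can be sketched in Python as follows. The sketch assumes orthonormal Haar analysis filters purely for brevity (the wavelet filters actually used in this work are not stated here), and all function names are hypothetical. It checks that three cascaded levels give ten subbands, that the total number of samples equals the number of image pixels, and that the synthesis stage recovers the input.

```python
import math
import random

SQRT2 = math.sqrt(2.0)

def haar_rows(x):
    """Filter each row with the Haar pair and subsample by 2."""
    low = [[(r[2 * j] + r[2 * j + 1]) / SQRT2 for j in range(len(r) // 2)] for r in x]
    high = [[(r[2 * j] - r[2 * j + 1]) / SQRT2 for j in range(len(r) // 2)] for r in x]
    return low, high

def inverse_haar_rows(low, high):
    """Upsample and merge the lowpass and highpass halves of each row."""
    out = []
    for lo_row, hi_row in zip(low, high):
        row = []
        for a, d in zip(lo_row, hi_row):
            row.extend([(a + d) / SQRT2, (a - d) / SQRT2])
        out.append(row)
    return out

def transpose(x):
    return [list(col) for col in zip(*x)]

def haar_analysis_2d(x):
    """One separable level: returns the lowpass band and three detail bands."""
    low, high = haar_rows(x)              # along rows
    ll, lh = haar_rows(transpose(low))    # along columns (on transposed data)
    hl, hh = haar_rows(transpose(high))
    return transpose(ll), tuple(transpose(b) for b in (lh, hl, hh))

def haar_synthesis_2d(ll, details):
    lh, hl, hh = details
    low = transpose(inverse_haar_rows(transpose(ll), transpose(lh)))
    high = transpose(inverse_haar_rows(transpose(hl), transpose(hh)))
    return inverse_haar_rows(low, high)

def octave_decompose(image, levels=3):
    """Split only the lowest band again at every level (octave-band tree)."""
    bands, ll = [], image
    for _ in range(levels):
        ll, details = haar_analysis_2d(ll)
        bands.append(details)             # bands[0] holds the finest details
    return ll, bands

def octave_reconstruct(ll, bands):
    for details in reversed(bands):
        ll = haar_synthesis_2d(ll, details)
    return ll

random.seed(0)
image = [[random.random() for _ in range(16)] for _ in range(16)]  # stand-in image
lowest, bands = octave_decompose(image, levels=3)
num_subbands = 1 + sum(len(d) for d in bands)
num_samples = len(lowest) * len(lowest[0]) + sum(
    len(b) * len(b[0]) for d in bands for b in d)
restored = octave_reconstruct(lowest, bands)
max_err = max(abs(a - b) for ra, rb in zip(image, restored) for a, b in zip(ra, rb))
print(num_subbands, num_samples, max_err)
```

The image dimensions must be divisible by 2 at every level; a practical implementation would also need boundary handling for longer filters.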
Besides the critically sampled pyramid, the over-complete pyramid decomposition
is frequently used. It was introduced as a simple yet powerful image
representation scheme by Burt and Adelson [7]. A low-resolution version is
derived from the original image, the original is then predicted from this coarse version, and
the difference between the original and the prediction is calculated.
At the decoder, the prediction is simply added back to the difference. The
process can be iterated on the coarser version. Fig. 2 shows a 3-layer
over-complete pyramid. The upper image is the coarse version and the other two are
prediction error signals.
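A minimal Python sketch of this construction is given below. It assumes 2x2 block averaging for decimation and pixel replication for interpolation, chosen only to keep the example short; these are not the filters evaluated in this work, and the function names are hypothetical. With three layers, the pyramid stores 1 + 1/4 + 1/16 times as many samples as the image has pixels, which is the oversampling referred to in the text.

```python
import random

def decimate(x):
    """Average each 2x2 block, halving the resolution in both directions."""
    return [[(x[2 * i][2 * j] + x[2 * i][2 * j + 1]
              + x[2 * i + 1][2 * j] + x[2 * i + 1][2 * j + 1]) / 4.0
             for j in range(len(x[0]) // 2)]
            for i in range(len(x) // 2)]

def interpolate(x):
    """Predict the finer layer by pixel replication (assumed interpolator)."""
    out = []
    for row in x:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def subtract(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def build_pyramid(image, layers=3):
    """Return the coarsest image and the prediction errors, finest first."""
    current = [[float(v) for v in row] for row in image]
    errors = []
    for _ in range(layers - 1):
        coarse = decimate(current)
        errors.append(subtract(current, interpolate(coarse)))
        current = coarse
    return current, errors

def reconstruct(coarse, errors):
    """Decoder: add each prediction back to its error signal, coarse to fine."""
    current = coarse
    for err in reversed(errors):
        current = add(interpolate(current), err)
    return current

random.seed(0)
image = [[random.random() for _ in range(16)] for _ in range(16)]  # stand-in image
coarse, errors = build_pyramid(image, layers=3)
total_samples = len(coarse) * len(coarse[0]) + sum(len(e) * len(e[0]) for e in errors)
restored = reconstruct(coarse, errors)
max_err = max(abs(a - b) for ra, rb in zip(image, restored) for a, b in zip(ra, rb))
print(total_samples / (16 * 16), max_err)
```

No quantization is applied here, so the reconstruction is exact up to floating-point rounding regardless of the filters chosen.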
The benefit of the oversampled pyramid comes from the fact that arbitrary filters (including nonlinear ones) can be used, and that visually pleasing coarse versions are easy to obtain by choosing unconstrained filters for decimation and interpolation. We tried three different filters: the Gaussian filter used in the original Burt and Adelson scheme [7], the simple averaging and bilinear interpolation filter suggested in [3], and the Chebyshev type I lowpass filter generated by the Matlab built-in functions "decimate" and "interp". The Gaussian filter is used in the experiments that follow, since it gives the best rate-distortion performance among the three, as illustrated in Fig. 3.
In both the critically sampled pyramid and the oversampled pyramid, quantization
is performed on the subband coefficients or on the error signals of the
oversampled layers to compress the signal. To encode the
image with maximum efficiency, a different bit rate should be assigned to each
subband or layer, depending on its statistical properties. Here we first
review the bit allocation scheme analytically and then discuss
a "greedy" algorithm used in practice.
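A minimal sketch of one common greedy strategy, marginal analysis, is given below; it is not necessarily the algorithm used in this work. It assumes the high-rate model in which the distortion of band l falls as 2^(-2 r_l), an orthonormal decomposition so that each band contributes to the image distortion in proportion to its share of the samples, and hypothetical variances. At each step, one more bit per sample goes to the band that buys the largest distortion reduction per bit of total rate.

```python
def greedy_allocation(variances, fractions, budget, step=1.0, weights=None):
    """Greedy bit allocation by marginal analysis.

    variances: variance of each subband or layer
    fractions: samples in each band divided by the number of image pixels
    budget:    total rate in bits per image pixel
    weights:   contribution of each band to the image distortion
               (assumed equal to fractions when not given)
    """
    weights = list(fractions) if weights is None else list(weights)
    bits = [0.0] * len(variances)
    spent = 0.0

    def distortion(l, b):
        # Assumed high-rate model for the distortion contributed by band l.
        return weights[l] * variances[l] * 2.0 ** (-2.0 * b)

    while True:
        best, best_gain = None, 0.0
        for l in range(len(bits)):
            cost = fractions[l] * step            # rate added per image pixel
            if spent + cost > budget + 1e-12:
                continue                          # this increment does not fit
            gain = (distortion(l, bits[l]) - distortion(l, bits[l] + step)) / cost
            if gain > best_gain:
                best, best_gain = l, gain
        if best is None:
            break                                 # budget exhausted
        bits[best] += step
        spent += fractions[best] * step
    return bits, spent

# Hypothetical example: four equally sized bands, 2 bits per pixel in total.
bits, spent = greedy_allocation([100.0, 10.0, 1.0, 1.0], [0.25] * 4, budget=2.0)
print(bits, spent)
```

In a practical coder, the model-based distortion would typically be replaced by distortions and rates measured with the actual quantizers.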
3.1 Analytical solutions
Let r_l denote the bit rate per pixel allocated to layer l or subband l. The bit rate for the image is then
[equation missing]
where
[equation missing] (2)
To minimize the distortion under the bit rate constraint r, one obtains
[equation missing] and [equation missing].
According to the detailed analysis in [3], the optimal bit allocation for subband l in the critically sampled case, or layer l in the open-loop oversampled case, is
[equation missing]
where the symbols (missing here) denote the power transfer factor, the spectral flatness, and the variance of the interpolation error signal.
For the closed-loop oversampled case,
[equation missing] (5)
Equations (4) and (5) are the analytical forms we follow for bit allocation in our simulations.
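For orientation, the generic Lagrangian solution under the standard high-rate model can be sketched in Python as follows. This is not a reproduction of (4) or (5): it assumes that band l contributes w_l * var_l * 2^(-2 r_l) to the distortion and eta_l * r_l to the total rate, where eta_l is the band's share of samples per image pixel. It omits the power transfer factor and spectral flatness terms of [3], and the names `analytic_allocation`, `fractions`, and `weights` are hypothetical.

```python
import math

def analytic_allocation(variances, fractions, rate, weights=None):
    """Closed-form rates minimizing sum_l w_l * var_l * 2**(-2 r_l)
    subject to sum_l eta_l * r_l == rate (assumed high-rate model)."""
    weights = list(fractions) if weights is None else list(weights)
    total = sum(fractions)  # 1 if critically sampled, more than 1 if oversampled
    # Setting the derivative of the Lagrangian to zero gives
    # r_l = 0.5 * log2(w_l * var_l / eta_l) + constant.
    half_log = [0.5 * math.log2(w * v / eta)
                for w, v, eta in zip(weights, variances, fractions)]
    # The constant follows from the rate constraint.
    offset = (rate - sum(eta * h for eta, h in zip(fractions, half_log))) / total
    return [h + offset for h in half_log]

# Hypothetical critically sampled example: four equally sized bands.
rates = analytic_allocation([100.0, 10.0, 1.0, 1.0], [0.25] * 4, rate=2.0)
print([round(r, 3) for r in rates])

# Hypothetical three-layer oversampled example (1 + 1/4 + 1/16 samples per pixel).
layer_fractions = [1.0, 0.25, 0.0625]
layer_rates = analytic_allocation([4.0, 9.0, 400.0], layer_fractions, rate=1.5)
print(sum(eta * r for eta, r in zip(layer_fractions, layer_rates)))
```

The closed form can return negative rates for bands with very small variance. In that case the band is usually given zero bits and the allocation is solved again for the remaining bands.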
[Table: one row for each full resolution layer PSNR (28, 30, and 33 dB) and one column for each of the two pyramids; cell contents missing.]
In this project, by comparing the two pyramid coding schemes, we find that the critically sampled pyramid outperforms the over-complete pyramid in rate-distortion performance, owing to the oversampling in the latter scheme.
Several papers [2][4] mention that closed-loop pyramids can achieve better coarse versions in terms of subjective quality, because of the greater freedom in choosing the decimation and interpolation filters. In our simulations, with a fixed PSNR at full resolution, lower-resolution images reconstructed from the open-loop pyramids showed better subjective quality than those obtained from critically sampled subband pyramids. The over-complete pyramids are more suitable for low bit-rate coding, where they provide good subjective quality of coarse pictures. Another advantage of over-complete pyramid coding is its lower complexity, since it uses fewer quantizers to achieve the same number of multiresolution layers. Other features of the over-complete pyramid, including control of quantization noise and robustness, counterbalance the oversampling problem [2].