Full Text:   <1380>

CLC number: TN919.81

On-line Access:

Revision Accepted: 2004-04-02

Crosschecked: 0000-00-00

Cited: 0

Clicked: 3229

Citations:  Bibtex RefMan EndNote GB/T7714

 Journal of Zhejiang University SCIENCE A 2005 Vol.6 No.1 P.43~48 http://doi.org/10.1631/jzus.2005.A0043

A high fidelity VQ coding algorithm with region adaptive subbanding

 Author(s):  Zhong-qi Yang, Zhi-can Bai, Dong-mei Li, Chang-sheng Yang Affiliation(s):  . Department of Computer, Zhejiang University, Hangzhou 310027, China Corresponding email(s):   csyang@hzcnc.com Key Words:  Image compression, Subband, Vector quantization (VQ) Share this article to： More <<< Previous Article|Next Article >>>

YANG Zhong-qi, BAI Zhi-can, LI Dong-mei, YANG Chang-sheng. A high fidelity VQ coding algorithm with region adaptive subbanding[J]. Journal of Zhejiang University Science A, 2005, 6(1): 43~48.

@article{title="A high fidelity VQ coding algorithm with region adaptive subbanding",
author="YANG Zhong-qi, BAI Zhi-can, LI Dong-mei, YANG Chang-sheng",
journal="Journal of Zhejiang University Science A",
volume="6",
number="1",
pages="43~48",
year="2005",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.2005.A0043"
}

%0 Journal Article
%T A high fidelity VQ coding algorithm with region adaptive subbanding
%A YANG Zhong-qi
%A BAI Zhi-can
%A LI Dong-mei
%A YANG Chang-sheng
%J Journal of Zhejiang University SCIENCE A
%V 6
%N 1
%P 43~48
%@ 1673-565X
%D 2005
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.2005.A0043

TY - JOUR
T1 - A high fidelity VQ coding algorithm with region adaptive subbanding
A1 - YANG Zhong-qi
A1 - BAI Zhi-can
A1 - LI Dong-mei
A1 - YANG Chang-sheng
J0 - Journal of Zhejiang University Science A
VL - 6
IS - 1
SP - 43
EP - 48
%@ 1673-565X
Y1 - 2005
PB - Zhejiang University Press & Springer
ER -
DOI - 10.1631/jzus.2005.A0043

Abstract:
Image subbands can be obtained by using filterbank. Traditional compression method uses direct entropy coding for each subband. After studying the energy distribution in image subbands, we proposed a vector quantization (VQ) coding algorithm to image subband. In the algorithm, vector quantizers were adaptively designed for high-frequency bands in an image. In particular, the edges of the image were examined and fewer bits were assigned to high-energy regions. The experimental result showed that the algorithm had higher SNR and higher compression ratio than possible by traditional subband coding, JPEG and JPEG 2000.

## .  INTRODUCTION

Digital image compression technique plays a vital role in communication, media storage, and data issue fields, and is one of the most active fields of information techniques.

This paper proposes a near-lossless compression algorithm which applies a high fidelity vector quantization method to image subbands. The original image was first separated by filterbank into seven subbands. The regions around the edges in each subband were then extracted, and divided by their orientations into several sub-regions. In the third step, different codebook (vector quantizer) was designed for different region, and fewer bits were assigned to regions of higher-energy in the subband. Better compression ratio and fidelity were achieved.

## .  ENERGY DISTRIBUTION IN IMAGE SUBBANDS

The properties of the lowest frequency subband (LFS) are similar to those of the original image, but all the highest frequency subband (HFS) images have small correlation coefficients between pixels within a subband and negligible cross-correlation coefficients between subbands. The former is described in Tanabe and Farvardin (). The energy distribution of the HFS is described briefly in this section. The reader can be referred to Kwon and Chellappa () for details.

A traditional image compression scheme based on a filterbank splits the signal into M subbands as shown in Fig.1. The input signal defined by $$x(n) = \left\{ {\begin{array}{*{20}{c}} {1,}&{n = 0,1,2, \ldots } \\ {0,}&{{\text{otherwise}}} \end{array}} \right.$$ , has the following antisymmetric property.

Fig.1
General structure of subband coding system

Property 1 Let the input signal, x(n), be a high-pass decimator or a band-pass decimator, with the decimated signal, xi(n), being antisymmetric for n≠0, that is, xi(n)=− xi(−n), n>0.

Corollary Given a (2N+1)-length one-dimensional discrete-time signal, x(n), −NnN. If the input signal, x(n), is a high-pass decimator or a band-pass decimator, the mean of xi(n) is equal to $$\frac{{{x_i}(0)}}{{2N + 1}}$$ , where −NnN.

Let ki=M and$${H_i}(\omega ) = \left\{ {\begin{array}{*{20}{c}} {1,}&{\frac{{i - 1}}{M}\pi \leqslant \left| \omega \right| \leqslant \frac{i}{M}\pi } \\ {0,}&{{\text{otherwise}}} \end{array}} \right.$$ for i=1, 2, …, M in Fig.1 with subband tree structure forming an M-band subband tree decomposition. For the M-band subband tree decomposition, we have the following energy packing property.

Property 2 In the M-band subband tree decomposition, the energy of the input signal in the ith subband, ${E_{{x_i}}}(M),$ is a decreasing function of i for any M.

The energy ratio of the ith subband is defined by $${\eta _{{x_i}}}(M) \equiv \frac{{{E_{{x_i}}}(M)}}{{\sum\nolimits_{i = 2}^M {{E_{{x_i}}}(M)} }}$$ , which represents the ratio of the energy of the ith subband to the total HFS energy.

Corollary ${E_{{x_i}}}(M),$ or equivalently $${\eta _{{x_i}}}(M),$$ decreases monotonically as i increases for any M. It also can be proved that $${\eta _{{x_i}}}(M)$$ is a monotonically decreasing function of M for fixed i and that more than 90% of the total HFS energy is concentrated in the lower half of the HFS when M≥16.

Let the compact energy of the ith subband be defined by $${\xi _{{x_i}}} \equiv {\left| {{x_i}(0)} \right|^2} + {\left| {{x_i}(1)} \right|^2} + {\left| {{x_i}( - 1)} \right|^2}$$ and the compact energy ratio of the ith subband be defined by $${\gamma _{{x_i}}} \equiv \frac{{{\xi _{{x_i}}}(M)}}{{\sum\nolimits_{i = 2}^M {{E_{{x_i}}}(M)} }}$$ , which represents the ratio of the compact energy of the ith subband to the total HFS energy. It was found that the properties of $${\gamma _{{x_i}}}$$ are very similar to those of $${\eta _{{x_i}}}.$$ Property 3 The compact energy $${\xi _{{x_i}}}(M),$$ or equivalently $${\gamma _{{x_i}}}(M),$$ decreases monotonically as i increases for any M. Also $${\gamma _{{x_i}}}(M)$$ is a monotonically decreasing function of M if i is fixed, with more than 80% of the total HFS energy being in the form of compact energy for all M.

In the M-band subband tree decomposition, the energy of the ith subband and the energy of the (i+1)th subband have the following relationship.

Property 4 In the M-band subband tree decomposition of the input signal, the relationship between the energy of the ith subband, ${E_{{x_i}}}(M),$ and the energy of the (i+1)th subband, ${E_{{x_{i + 1}}}}(M),$ can be represented as $${E_{{x_i}}}(M) = \cos \left( {\frac{\pi }{{{2^{M + 1 - i}}}}} \right){E_{{x_{i + 1}}}}(M)$$ for i=2, 3, …, (M−1).

The proofs of the above properties and Corollaries are given in Kwon and Chellappa ().

Since the property of the compact energy ratio is similar to the property of the energy ratio, we can conclude that the compact energy of each subband is almost equally distributed up to i=M−1. It was found that more than 77% of the total HFS energy is in the form of compact energy.

The energy distribution of the image subbands has the following characteristics:

(1) Energy packing property toward the lower frequency subbands: The energy decreases monotonically as i increases, with more than 90% of the total HFS energy being concentrated in the lower half of the HFS.

(2) Energy packing property toward the edges: Most of the total HFS energy is concentrated on the edges. The Lena image has been found to contain 84.9% of the total HFS energy in its edge-regions.

(3) Directionality of the energy distribution: Analysis of the compact energy of the subedge-regions showed that the horizontal, vertical, and off-diagonal edge-region has relatively more energy in the horizontal, vertical, and diagonal subbands, respectively, and that the diagonal edge-region has relatively less energy in the diagonal subbands.

## .  VQ CODING ALGORITHM WITH REGION ADAPTIVE SUBBANDING

The above description of the energy distribution leads to the following conclusions.

(1) Based on the energy packing property toward the edges, for efficient transmission of image, the edge-regions should be emphasized;

(2) Based on the directionality of the energy distribution, different bits should be allocated for different sub-regions;

(3) Based on the energy packing property toward the lower frequency subbands, more bits should be assigned to the lower frequency subbands.

Based on the aforementioned conclusions, we present a high fidelity vector quantization (VQ) coding algorithm with region adaptive subbanding. Our encoder is effective at:

(1) Splitting the original image into seven subbands using two-level Walsh filtering;

(2) Detecting the edges for each subband using Canny detector (Canny, );

(3) Partitioning each subband into vectors;

(4) Extracting edge-regions for each subband;

(5) Encoding LFS using VDPCM and encoding HFS using VQ;

(6) Designing entropy coder.

In what follows, we describe in detail each step of our encoder shown in Fig.2.

Fig.2
General framework of encoder and decoder

### .  Subband decomposition

The Walsh-coefficient, Wx(n), can be seen as the frequency domain representation of an N-length one-dimensional discrete-time signal, x(n), which can be decomposed into lower frequency WxL(n) and higher frequency WxH(n) as follows. $$WxL(n) \equiv \left\{ {\begin{array}{*{20}{c}} {0,}&{0 \leqslant n \leqslant \frac{N}{2} - 1} \\ {Wx(n),}&{\frac{N}{2} \leqslant n \leqslant N - 1} \end{array}} \right.$$ $$WxH(n) \equiv \left\{ {\begin{array}{*{20}{c}} {Wx(n),}&{0 \leqslant n \leqslant \frac{N}{2} - 1} \\ {0,}&{\frac{N}{2} \leqslant n \leqslant N - 1} \end{array}} \right.$$ We can obtain the low-pass signal, xL(n), and the high-pass signal, xH(n), through inverse Walsh transform, which has the following property:

Property 5 Two adjacent points of the low-band signal, xL(n), of a bank of N points are equal with positive sign and two adjacent points of the high-band signal, xH(n), are equal with negative sign; that is, xL(2i)=xL(2i+1), xH(2i)=(xH(2i+1) for i=0, 1, …, N/2(1.

Based on the above property, the subband can be downsampled losslessly by two. We have the following conclusion:

Property 6 The relation between the low-pass signal, xL(n), and the high-pass signal, xH(n), of the signal, x(n), can be represented as $${\bar x_L}(n) + {\bar x_H}(n) =$$ $$\bar x(n).$$ It was proved that the Walsh analysis/synthesis filterbank has ideal reconstruction performance (Manz, ; Ching and Goodyear, ), so we applied it to realize two-level filtering in our experiment, in which the original image was decomposed into seven subbands using two-level Walsh filtering shown in Fig.3.

Fig.3
Two-level subband decomposition

### .  Edge detection and vector-region extraction

The horizontal, vertical, diagonal, and off-diagonal edges for each subband were detected using the Canny detector (Canny, ); then the recursive thresholding method (Kwon and Chellappa, ) using multiple one-dimensional histograms of several texture features was used to extract textured-regions.

The edge-region extraction algorithm of Kwon and Chellappa () was extended to the following vector-edge-region extraction algorithm.

Let the vector-edge-region of the (i,j)-th subband be Λij; detect the edges {x(m, n)} for the (i,j)-th subband using the Canny edge detector (Canny, ). The algorithm is described as follows in detail.

### .  Vector quantization (VQ)

Vector quantization (VQ) is an efficient technique for exploiting the cross-correlation. According to information theory, the bigger the vector dimension, the more correlation removed by the vector quantizer. Traditional vector quantization consists of building a set of signal adapted codevectors called the codebook, with each codevector being assigned a fixed length codeword (signal independent).

A set of well-designed codebooks is crucial for good performance of the reconstruction image. In the frequency domain, the frequency is divided into seven subbands after Walsh transform. Thus, the vectors are classified into seven classes by their subbands in VQ.

A codebook was designed for each sub-region of each subband, with Linde-Buzo-Gray (LBG) algorithm being used to train and obtain the corresponding codebook. For detail of the LBG algorithm, please refer to Linde et al.(). Each sub-region of each subband was straightforwardly partitioned into vectors of 4×4 pixels and trained using the LBG algorithm to obtain the corresponding codebook. Full search algorithm was used in the decoder to select the code of the current vector in the corresponding codebook and transmit the address of this code to the channel. The codebook in the decoder was located by the address from the channel and yielded the vector of the code in the corresponding codebook.

In our approach, dynamic rate allocation was not adequate. Our goal was to use vector quantization coding techniques in the subbands, and to rely on the optimal rate allocation to discriminate between all the possible techniques on each sub-region of each subband. Different number of bits was assigned to each sub-region of each subband by assuming its distributions. Our bit allocation scheme could allocate more bits for the vectors of the LFS and fewer bits for the vectors of the HFS.

### .  Encoder for LFS and HFS

Different encoders were designed for the LFS and the HFS. It is known that the properties of the LFS are similar to those of the original image and that lower relationship exists in the HFS, so the Vector DPCM (VDPCM) was adopted for encoding the LFS, and Vector Quantization (VQ) coding for encoding the HFS, which will be described in detail.

### .  (1) VDPCM for LFS

Because of the strong relativity of coefficients of the LFS in low and column direction, we selected the VDPCM method to encode the LFS. Not only can their relativity be eliminated, but the efficiency can also be improved because it is the error signal of prediction, which corresponds to the higher frequency of the image.

Two-dimension VDPCM was selected in this algorithm; Fig.4 shows the reconstruction. The predicted value of the current vector can be calculated as the following predictor: x(i, j)=0.5×x(i(1, j)+0.5(x(i, j(1), where x(i, j) is the current vector of 4×4 pixels.

Fig.4
Position of reference and current vectors

In the case of the border, one of the coefficients may be absent, then the formula above can be transformed into x(i, j)=x(i(1, j) (if x(i, j(1) is absent) or x(i, j)=x(i, j(1) (if x(i(1, j) is absent). It is noteworthy that, for the first vector of the upper left corner, the coefficients are all absent, so it cannot be predicted but can only be saved in the output code flows of the head information.

In our experiment, the sub-regions of the LFS were directly partitioned into vectors of 4(4 pixels, then the error vectors were trained using the LBG algorithm to obtain the corresponding codebook of the LFS sub-regions.

### .  (2) VQ for HFS

The HFS sub-regions were directly partitioned into vectors of 4(4 pixels, and then directly encoded using VQ. Regarding the details of the VQ algorithm please refer to Sullivan and Baker () and Li et al.().

### .  Entropy coding

Conventional Huffman coding was adopted in this completely developed algorithm. Please see Tenenbaum and Augenstein () for the details of the algorithm. Certainly other entropy coding techniques such as run-length coding and arithmetic coding can also be employed here.

## .  EXPERIMENTAL RESULT AND CONCLUSION

In our experiment, we designed a set of local codebooks from a training set of representative images. The training set of images decomposed by two-level filtering was comprised of five 512(512 images (including Lena, Girl, Mand, Aug25r8 and Pentagon shown in Fig.5 and Fig.6) of seven subbands after two-level Walsh filtering. Applying the method offered in this paper on Lena and Averser8, we measured by the root mean square error (RMSE), peak signal-to-noise ratio (PSNR), and compression ratio (CR) to determine how good was the quality of the reconstructed image. The experimental result showed that the algorithm reconstructs low bit rate coded images with not only high SNR but also good perceptual quality, as shown in Table 1. Fig.7 shows the coding result for the Lena and Averser8 image, respectively. $$MSE = \frac{1}{N}\sum\nolimits_{i = 1}^N {{{({x_i} - {{\hat x}_i})}^2}}$$ $$RMSE = \sqrt {\frac{1}{N}\sum\nolimits_{i = 1}^N {{{({x_i} - {{\hat x}_i})}^2}} }$$ $$PSNR = 10{\log _{10}}\frac{{{{255}^2}}}{{MSE}}$$ In this paper, we present a new algorithm that can yield high fidelity, nearly lossless compression at relatively high compression ratio. Compared with other coding methods, our algorithm has the following several advantages:

Fig.5
Training images in the experiment

Fig.6
Training images in the experiment

Fig.7
Reconstructed images in the experiment

#### Table 1

Results for different images compared with those of JPEG
 Algorithm Image RMSE PSNR CR Proposed algorithm Lena 2.493 40.196 9.841 Averser8 2.716 39.451 8.854 JPEG (Penne, 1988) Lena 3.3 37.7 8.4 Averser8 4.5 35.0 5.2 JPEG2000 (ISO/IEC JTC1/SC29/WG1 N1577, 2000) Lena 2.6 39.7 8.68 Averser8 3.5 38.4 8.75

1. Our algorithm can yield better reconstruction image quality and higher compression ratio than other traditional algorithms such as JPEG.

2. Compared with typical VQ coding, this algorithm has higher PSNR and no block trouble. In particular, there is less mosquito noise in reconstructed images.

3. Compared with traditional subband coding, this algorithm has higher compression ratio and no ringing effect.

The compression performance of our algorithm shows that it can achieve high fidelity image compression with great improvement in compression ratio. So this algorithm has promising prospect in field of high fidelity image compression, especially the compression field of satellite remote sensing images, medical images, and so on.

## References

[1] Canny, J., 1986. A Computational Approach to Edge Detection. , IEEE Trans. Pattern Anal. Machine Intelligence, PAMI-6, 679-698. :679-698.

[2] Ching, P.C., Goodyear, C.C., 1984. Walsh-transform coding of the speech residual in RELP coders. , Proc Inst Elect Eng G, 29-34. (1):29-34.

[3] ISO/IEC JTC1/SC29/WG1 N1577, 2000. JPEG2000 Part II Working Draft Version 1. , 0 Pre-Release A, :

[4] Kwon, O.J., Chellappa, R., 1998. Region adaptive subband image coding. IEEE Trans Image Processing, 7:632-648.

[5] Li, W., Wus, J., Zhang, Y.Q., 1994. New Vector Coding Schemes for Image and Video Compression. , SPIE Visual Communications and Image Processing, Chicago, IL, 401-412. :401-412.

[6] Linde, Y., Buzo, A., Gray, R.M., 1980. An algorithm for vector quantiacer design. IEEE Trans Communication, 28:84-95.

[7] Manz, J.W., 1972. A sequence-ordered fast Walsh transform. IEEE Trans Audio Elect, Au-20:204-205.

[8] Penne, B.W., 1988. JPEG Technical Specification, Revision 8. , Working Document, No. JTC1/SC2/WG10/JPEG-150, :

[9] Sullivan, G.J., Baker, R.L., 1994. Efficient quad-tree coding of images and video. IEEE Trans Image Processing, 3:327-331.

[10] Tenenbaum, A.M., Augenstein, M.J., 1981. Data Structures Using Pascal, Prentice-hall, New Jersey,:275-284.

[11] Tanabe, N., Farvardin, N., 1992. Subband image coding using entropy-coded quantization over noisy channels. IEEE J Select Areas Commun, 10:926-943.