KDS transformation to improve compression rationKDS - transformation stands for Keep Dominant Symbols - transformation. The advantage of this method is that those statistical properties of the input stream that can be efficiently used by any compression algorithm are not dramatically changed. Consider the following procedure to build a Huffman code for an alphabet V, where pa is the probability of appearance of symbol a in an input source:
The same is the essence of this transformation, to replace symbols with low probabilities with a new symbol. Further we present formally this transformation. Consider A is an alphabet and w is a word over the alphabet A. Consider A = R D with |R| = |D|, where R is a subset of the alphabet A containing the "rare" symbols and D a subset containing the "dominant" symbols in the word w. Is is easy to see that there is an unique decomposition of w: where:
The first step of our transformation is rewriting w in w', where: Consider now a bijective mapping: f:R D. In the second step of the transformation the string w'' is obtained in the following way: In the final step the new symbol # and an explicit representation of the mapping f are added at the beginning of string w''. This way the string w''' is obtained: The reverse transformation works in the following way: given a string w'''(obtained as above) consider the first symbol of w''', as the separator #. Consider now the following factorization of w''': A Case StudyWe present here the effect of this transformation combined with some traditional compression methods.The length (in bits) of the Lempel-Ziv code of a word w V* is: Consider now the case of the alphabet A where |A|=2p, p > 1. Let u and v be two words, u A* and v (A { # } )* , v is obtained form u using the above transformation. If compression is improved by transformation presented above. An implementation of this algorithms improve compression from 5\% up to 10\% in combination with compression tools like gzip. In the following table some experimental results are presented:
Download:KDS transformation implementation in Visual C++. |