An Approach to Efficient Dictionary Utilization and Improved Data Compression Technique for LZW Algorithm

S. Revathi1, D. Thiripurasundari2
1S. Revathi, School of Electronics Engineering, VIT Chennai, India.
2D. Thiripurasundari*, School of Electronics Engineering, VIT Chennai, India.

Manuscript received on December 02, 2020. | Revised Manuscript received on December 05, 2020. | Manuscript published on December 30, 2020. | PP: 224-229 | Volume-10 Issue-2, December 2020. | Retrieval Number: 100.1/ijeat.B20971210220 | DOI: 10.35940/ijeat.B2097.1210220
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: This paper proposes an improved data compression technique compared with the existing Lempel-Ziv-Welch (LZW) algorithm. LZW is a dictionary-based compression technique that updates its dictionary as it reads the data: it stores strings from the data as codes and reuses those codes when the strings recur. When the dictionary becomes full, every element in it is removed to make room for new entries. The conventional method therefore does not take frequently used strings into account and removes all entries. This makes it less effective when the data to be compressed are large and contain many frequently occurring strings. This paper presents two new methods that improve on the existing LZW compression algorithm. In both methods, when the dictionary becomes full, only the elements that have not been used earlier are removed, rather than every element of the dictionary as in the existing LZW algorithm. This is achieved by adding a flag to every element of the dictionary. Whenever an element is used, its flag is set high. Thus, when the dictionary becomes full, the entries whose flag is set high are kept and the others are discarded. In the first method, the unused entries are discarded all at once, whereas in the second method the unused elements are removed one at a time. The second method therefore gives newly added dictionary entries more time to be used before they are discarded. All three techniques give similar results when the data set is small, because they differ only in how they handle the dictionary once it is full. The improvements therefore give better results only when a relatively large data set is used. When the three techniques were compared on a data set that yields the best-case scenario, the compression ratio of conventional LZW was smaller than that of improved LZW method-1, which in turn was smaller than that of improved LZW method-2.
Keywords: Data compression, LZW, dictionary encoding, lossless encoding.
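
The three dictionary-handling strategies described in the abstract can be illustrated with a minimal encoder sketch. This is not the authors' implementation: the function name `lzw_encode`, the policy names `"flush"`, `"keep_used"` and `"evict_one"`, and the free-code list are hypothetical. The sketch also makes several assumptions the abstract does not state: an entry counts as "used" when it is matched during encoding, method-1 clears the flags of the surviving entries after pruning, method-2 removes the oldest unused entry, and both methods fall back to a full reset when every entry is flagged. A matching decoder would have to mirror the same pruning steps and is omitted here.

```python
def lzw_encode(data: bytes, max_size: int = 4096, policy: str = "flush") -> list[int]:
    """LZW encoder sketch; `policy` selects what happens when the dictionary is full.

    "flush"     : conventional LZW, drop every multi-byte entry.
    "keep_used" : method-1 (assumed reading), drop all unflagged entries at once.
    "evict_one" : method-2 (assumed reading), drop one unflagged entry at a time.
    """
    if max_size <= 256:
        raise ValueError("max_size must exceed the 256 single-byte codes")
    if policy not in ("flush", "keep_used", "evict_one"):
        raise ValueError(f"unknown policy: {policy}")

    # string -> [code, used_flag]; single-byte entries are permanent.
    table = {bytes([i]): [i, True] for i in range(256)}
    # Unassigned codes, ordered so that pop() returns the smallest one.
    free = list(range(max_size - 1, 255, -1))
    out: list[int] = []
    w = b""

    def make_room() -> None:
        unused = [s for s, (_, used) in table.items() if len(s) > 1 and not used]
        if policy == "flush" or not unused:
            # Conventional reset (also the assumed fallback when all are flagged).
            victims = [s for s in table if len(s) > 1]
        elif policy == "keep_used":
            victims = unused
        else:  # "evict_one": dicts keep insertion order, so this is the oldest
            victims = unused[:1]
        for s in victims:
            free.append(table.pop(s)[0])
        free.sort(reverse=True)
        if policy == "keep_used":
            # Assumption: survivors must be matched again to survive the next prune.
            for s, entry in table.items():
                if len(s) > 1:
                    entry[1] = False

    for byte in data:
        c = bytes([byte])
        wc = w + c
        entry = table.get(wc)
        if entry is not None:
            entry[1] = True  # flag set high: the entry was matched
            w = wc
            continue
        out.append(table[w][0])
        if not free:
            make_room()
        table[wc] = [free.pop(), False]
        w = c
    if w:
        out.append(table[w][0])
    return out


if __name__ == "__main__":
    sample = b"abcabcabd" * 500
    for name in ("flush", "keep_used", "evict_one"):
        print(name, len(lzw_encode(sample, max_size=300, policy=name)))
```

While the dictionary never fills, all three policies emit the same code sequence, which matches the abstract's remark that the techniques behave alike on small inputs. The script only prints the number of emitted codes per policy for one synthetic input; it makes no claim about the compression ratios reported in the paper.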