
LZW Encoding and Decoding
110114049
110114017
110114011
LZW is a general-purpose lossless data compression algorithm created by
Abraham Lempel, Jacob Ziv, and Terry Welch.
Lossless Compression Technique
As the name implies, lossless compression involves no loss of information, so
the original data can be recovered exactly when the file is uncompressed.
It is an error-free compression approach that also addresses spatial redundancies in an
image.
Applications:
Unix compress, GIF, TIFF, monochrome images, and text files
that contain repetitive text or patterns.
LZW compression is one of the Adaptive Dictionary techniques.
LZW uses fixed-length codewords to represent
variable-length strings of symbols/characters that
commonly occur together, e.g., words in English text.
The LZW encoder and decoder build up the same
dictionary dynamically while receiving the data.
LZW places longer and longer repeated entries into a
dictionary, and then emits the code for an element,
rather than the string itself, if the element has already
been placed in the dictionary.
LZW Coding
Initially, a dictionary containing the source symbols to be coded is constructed.

ALGORITHM:
Buffer input characters in a sequence W until W + next character is not in the dictionary. Emit the code for W, and
add W + next character to the dictionary. Start buffering again with the next character.
STEPS:
1. Initialize the dictionary to contain all strings of length one.
2. Find the longest string W in the dictionary that matches the current input.
3. Emit the dictionary index for W to output and remove W from the input.
4. Add W followed by the next symbol in the input to the dictionary.
5. Go to Step 2.
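The steps above can be sketched in Python as follows. This is a minimal illustration, not the presentation's own code: the function name `lzw_encode` and the `alphabet` parameter are assumptions made here, and the sketch assumes every input symbol appears in `alphabet`.

```python
def lzw_encode(data, alphabet):
    """Return (codes, dictionary) for `data`; `alphabet` lists the single symbols."""
    # Step 1: the dictionary starts with every string of length one.
    dictionary = {symbol: index for index, symbol in enumerate(alphabet)}
    next_code = len(dictionary)
    codes = []
    w = ""  # longest match found so far
    for ch in data:
        if w + ch in dictionary:
            # Step 2: keep extending W while W + next character is known.
            w += ch
        else:
            # Step 3: emit the index of W.
            codes.append(dictionary[w])
            # Step 4: add W followed by the next symbol to the dictionary.
            dictionary[w + ch] = next_code
            next_code += 1
            # Step 5: start again from the symbol that broke the match.
            w = ch
    if w:
        codes.append(dictionary[w])  # flush the final match
    return codes, dictionary


codes, dictionary = lzw_encode("ababababa", "ab")
print(codes)       # [0, 1, 2, 4, 3]
print(dictionary)  # {'a': 0, 'b': 1, 'ab': 2, 'ba': 3, 'aba': 4, 'abab': 5}
```

The input `ababababa` with the initial dictionary 0 - a, 1 - b is the same example the slides work through by hand.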
Example: encoding a b a b a b a b a

Initial dictionary: 0 - a, 1 - b

Step   Longest match W   Code emitted   New dictionary entry
1      a                 0              2 - ab
2      b                 1              3 - ba
3      ab                2              4 - aba
4      aba               4              5 - abab
5      ba                3              (end of input)

Input parsed as:   a   b   ab   aba   ba
Output codes:      0   1   2    4     3

Final dictionary:
0 - a
1 - b
2 - ab
3 - ba
4 - aba
5 - abab
It assigns fixed-length code words to variable length sequences of
source symbols.
LZW Decoding
STEPS:
1. Read a value from the encoded input and output the corresponding string from the
dictionary.
2. Add a new dictionary entry formed by concatenating the current string and the first character of
the string obtained by decoding the next input value.

To rebuild the dictionary in the same way as it was built during encoding, the decoder also
reads the next value from the input before it adds the new entry to the dictionary.
The new entry is the concatenation of the current string and the first character of the string obtained by decoding the next
input value, or the first character of the string just output if the next value cannot be decoded yet.
The decoder repeats this process until there is no more input, at which point the final input value is decoded without
any more additions to the dictionary.
To decode an LZW-compressed archive, one needs to know in advance the
initial dictionary used, but additional entries can be reconstructed because they are
always simply concatenations of previous entries.
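The decoding procedure described above can be sketched in Python as follows. The function name `lzw_decode` and the `alphabet` parameter are assumptions introduced for this illustration. The sketch adds each dictionary entry one step later than the description (when the next code is read), which yields the same dictionary.

```python
def lzw_decode(codes, alphabet):
    """Rebuild the original string from LZW codes and the initial alphabet."""
    # The decoder starts from the same initial dictionary as the encoder.
    dictionary = {index: symbol for index, symbol in enumerate(alphabet)}
    next_code = len(dictionary)
    if not codes:
        return ""
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # The code refers to the entry about to be added, so it cannot be
            # looked up yet: it must be the previous string plus its own
            # first character.
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        output.append(entry)
        # New entry: previous string + first character of the current string.
        dictionary[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(output)


print(lzw_decode([0, 1, 2, 4, 3], "ab"))  # ababababa
```

In this run, code 4 arrives before entry 4 exists in the decoder's dictionary, so it exercises the "cannot be decoded yet" case.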
Example

The string to be encoded is "TOBEORNOTTOBEORTOBEORNOT".

Let the initial dictionary values be:

A 1 I 9 Q 17 Y 25
B 2 J 10 R 18 Z 26
C 3 K 11 S 19
D 4 L 12 T 20
E 5 M 13 U 21
F 6 N 14 V 22
G 7 O 15 W 23
H 8 P 16 X 24
ENCODING:
Current    Next   Output   Output    Extended
Sequence   Char   Code     Bits      Dictionary   Comments
NULL       T
T          O      20       10100     27: TO       27 = first available code after 0 through 26
O          B      15       01111     28: OB
B          E      2        00010     29: BE
E          O      5        00101     30: EO
O          R      15       01111     31: OR
R          N      18       10010     32: RN       32 requires 6 bits, so for next output use 6 bits
N          O      14       001110    33: NO
O          T      15       001111    34: OT
T          T      20       010100    35: TT
TO         B      27       011011    36: TOB
BE         O      29       011101    37: BEO


WHY LZW CODING?
Unencoded length = 25 symbols × 5 bits/symbol = 125 bits
Encoded length = (6 codes × 5 bits/code) + (11 codes × 6
bits/code) = 96 bits.
Both counts include one end-of-message marker (code 0) in addition to the 24 letters.
Using LZW has saved 29 bits out of 125, reducing the message
by about 23%. If the message were longer, then the dictionary
words would begin to represent longer and longer sections of
text, allowing repeated words to be sent very compactly.
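The bit counts above can be checked with a short Python sketch. It rests on two assumptions that are not spelled out in the slides: code 0 is an end-of-message marker (written `#` here), and the code width grows by one bit as soon as a dictionary entry is added whose number no longer fits in the current width, which is how the encoding table switches to 6 bits after entry 32. The name `lzw_encode_with_widths` is hypothetical.

```python
import string


def lzw_encode_with_widths(message, alphabet, start_width):
    """Return a list of (code, width_in_bits) pairs for `message`."""
    dictionary = {symbol: index for index, symbol in enumerate(alphabet)}
    next_code = len(dictionary)
    width = start_width
    pairs = []
    w = ""
    for ch in message:
        if w + ch in dictionary:
            w += ch
            continue
        pairs.append((dictionary[w], width))
        dictionary[w + ch] = next_code
        # Assumed rule: widen once the newly added code needs more bits.
        if next_code >= (1 << width):
            width += 1
        next_code += 1
        w = ch
    if w:
        pairs.append((dictionary[w], width))
    return pairs


# Assumption: '#' is the end-of-message marker with code 0; A-Z get 1-26.
alphabet = "#" + string.ascii_uppercase
message = "TOBEORNOTTOBEORTOBEORNOT#"

pairs = lzw_encode_with_widths(message, alphabet, start_width=5)
encoded_bits = sum(width for _, width in pairs)
unencoded_bits = len(message) * 5

print([code for code, _ in pairs])
# [20, 15, 2, 5, 15, 18, 14, 15, 20, 27, 29, 31, 36, 30, 32, 34, 0]
print(unencoded_bits, encoded_bits)  # 125 96
print(f"saving: {100 * (unencoded_bits - encoded_bits) / unencoded_bits:.1f}%")  # saving: 23.2%
```

Printing each code with `format(code, f"0{width}b")` gives the bit patterns in the encoding table, for example `10100` for code 20 at 5 bits and `001110` for code 14 at 6 bits.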
Decoder : 20 15 2 5 15 18 14 15 20 27 29 31 36 30 32 34
Input Code Output Sequence New Dictionary Entry
20 T
15 O 27: TO
2 B 28: OB
5 E 29: BE
15 O 30: EO
18 R 31: OR
14 N 32: RN
15 O 33: NO
20 T 34: OT
27 TO 35: TT
29 BE 36: TOB
31 OR 37: BEO
36 TOB 38: ORT
30 EO 39: TOBE
32 RN 40: EOR
34 OT 41: RNO
ADVANTAGES
*Conceptually very simple.
*Error-free compression approach.
*Very fast technique.
*It is a lossless compression technique.
*There is no need to analyze the incoming text in advance.
*LZW excels on data streams that contain repeated strings.
*A compression ratio of 50% or more can be expected on data with many repeated strings, such as large English text files.

DISADVANTAGES
*Files that do not contain any repetitive data cannot be compressed
much.
*The method works well on text files but less well on other types of files.
*The amount of storage needed for the dictionary is indeterminate, as it depends on the total
length of all the strings.
MATLAB CODE and OUTPUT: [listing and output not reproduced in this text]
