
Q1(a) How is the extended Huffman code different from its corresponding Huffman
code? Explain with the help of an example.

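Answer: Extended Huffman coding encodes groups (blocks) of symbols rather than
single symbols. Because each codeword then covers several symbols, the loss caused
by rounding codeword lengths to whole bits is spread over the block, so the average
length per source symbol is never worse than with ordinary Huffman coding and is
often better. The price is an alphabet that grows exponentially with the block size.
The closer the average length comes to the entropy, the more efficient the code.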
Example
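As a hedged illustration (the source and its probabilities are assumptions, not taken from the question paper), consider a two-symbol source with P(A) = 0.9 and P(B) = 0.1. The sketch below builds Huffman codeword lengths for single symbols and for pairs of symbols, then compares the average number of bits per source symbol. The helper names `huffman_lengths`, `extend` and `average_length` are hypothetical.

```python
import heapq
from itertools import product

def huffman_lengths(probs):
    """Return {symbol: codeword length} for a dict of symbol probabilities."""
    # Heap entries: (probability, tie_breaker, {symbol: depth so far}).
    heap = [(p, i, {s: 0}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, d1 = heapq.heappop(heap)
        p2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

def extend(probs, n):
    """Probabilities of all blocks of n independent symbols."""
    ext = {}
    for combo in product(probs.items(), repeat=n):
        block = "".join(s for s, _ in combo)
        p = 1.0
        for _, q in combo:
            p *= q
        ext[block] = p
    return ext

def average_length(probs):
    """Average codeword length in bits per (block) symbol."""
    lengths = huffman_lengths(probs)
    return sum(p * lengths[s] for s, p in probs.items())

source = {"A": 0.9, "B": 0.1}          # assumed example source
single = average_length(source)         # bits per symbol, ordinary Huffman
pairs = average_length(extend(source, 2)) / 2  # bits per symbol, blocks of 2
print(single, pairs)
```

For this assumed source the ordinary Huffman code needs 1 bit per symbol, while the code over pairs needs 1.29 bits per pair, that is 0.645 bits per symbol, which is closer to the entropy of roughly 0.469 bits per symbol.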

(b) How is a tag generated in arithmetic coding? Write the steps.

Answer:

(c) Define the entropy and efficiency of an information source. Consider a source S
that has three output symbols S1, S2 and S3 with probabilities 0.5, 0.25 and 0.25
respectively. Find the entropy of the source.

Answer: Entropy (designated by the letter H) is equivalent to the potential information gain once
the experimenter learns the outcome of the experiment, and is given by the following formula:

$$H = -\sum_{i} p_i \log_2 p_i$$

where p_i is the probability of the i-th outcome. This formula implies that the more entropy a
system has, the more information we can potentially gain once we know the outcome of the
experiment.
The efficiency of a coding system is the ratio of the average information per symbol (the entropy)
to the average code length. The maximum efficiency possible is 1, and can theoretically be obtained
using a prefix code, for example when every probability is a negative power of two.

The entropy of the given source is

$$H = -[0.5 \log_2 0.5 + 0.25 \log_2 0.25 + 0.25 \log_2 0.25]$$
$$= -[0.5 \cdot (-1) + 0.25 \cdot (-2) + 0.25 \cdot (-2)]$$
$$= -[-0.5 - 0.5 - 0.5]$$
$$= 1.5 \text{ bits/symbol}$$
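The calculation above can be checked with a few lines of Python. This is only a minimal sketch; the function name `entropy` is an illustrative choice.

```python
import math

def entropy(probs):
    """Shannon entropy in bits per symbol; zero probabilities contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Source S with symbols S1, S2, S3 from the question.
h = entropy([0.5, 0.25, 0.25])
print(h)  # 1.5
```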
(d) What do you mean by lossless compression and lossy compression?
Compare lossless compression with lossy compression.
Answer:
With lossless compression, every single bit of data that was originally in the file
remains after the file is uncompressed. All of the information is completely restored.
This is generally the technique of choice for text or spreadsheet files, where losing
words or financial data could pose a problem. The Graphics Interchange Format (GIF) is
an image format used on the Web that provides lossless compression.
Lossy compression reduces a file by permanently eliminating certain information,
especially redundant information. When the file is uncompressed, only a part of the
original information is still there (although the user may not notice it). Lossy
compression is generally used for video and sound, where a certain amount of
information loss will not be detected by most users. The JPEG image file, commonly
used for photographs and other complex still images on the Web, is an image format
that uses lossy compression.

Q2(a) What are dictionary-based compression techniques? Encode the following
sequence using the LZ78 algorithm.
ABBCBCABABCAABCAAB
Answer:
Dictionary-based algorithms do not encode single symbols as variable-length bit
strings; they encode variable-length strings of symbols as single tokens. The tokens
form an index into a phrase dictionary.
If the tokens are smaller than the phrases they replace, compression occurs.
Dictionary-based compression is easy to understand because it uses a strategy
that programmers are familiar with: using indexes into databases to retrieve
information from large amounts of storage, for example telephone numbers and postal
codes.

The compressed message is: (0,A)(0,B)(2,C)(3,A)(2,A)(4,A)(6,B)
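The encoding above can be reproduced with a small LZ78 encoder. This is a minimal sketch under the usual textbook conventions (dictionary indices start at 1, index 0 means "no prefix"); the function name `lz78_encode` is illustrative.

```python
def lz78_encode(text):
    """Return a list of (dictionary index, next character) pairs."""
    dictionary = {}   # phrase -> index, first entry gets index 1
    output = []
    phrase = ""
    for ch in text:
        if phrase + ch in dictionary:
            # Keep extending the current phrase while it is still known.
            phrase += ch
        else:
            # Emit the index of the longest known prefix plus the new character.
            output.append((dictionary.get(phrase, 0), ch))
            dictionary[phrase + ch] = len(dictionary) + 1
            phrase = ""
    if phrase:
        # Input ended inside a known phrase: emit its index with no new character.
        output.append((dictionary[phrase], ""))
    return output

tokens = lz78_encode("ABBCBCABABCAABCAAB")
print(tokens)
```

While encoding, the dictionary grows as 1 = A, 2 = B, 3 = BC, 4 = BCA, 5 = BA, 6 = BCAA, 7 = BCAAB.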

2(b) What are the measures of performance of data compression algorithms? Explain with
suitable example.

Entropy (in our context): the smallest number of bits needed, on average, to represent a symbol
(a lower bound on the average codeword length taken over all symbols).

The efficiency of the coding is defined as

$$\eta = \frac{H}{R}$$

where H is the entropy and R is the average codeword length.

Average codeword length: Assume a code C over an alphabet of N symbols A_1, ..., A_N with
probabilities p(A_i). Let l_i be the length of the codeword C(A_i). Then the average length of code C is

$$R = \sum_{i=1}^{N} p(A_i)\, l_i$$

The average code length is bounded below by the shortest codeword length and bounded above
by the longest codeword length.
2(c) What do you understand by the length of a Huffman code, and how is it
defined? Explain with an example.

Answer: Assuming one symbol per probability and that the probabilities are exact, you can get
two different codes depending on the choices made when executing the Huffman algorithm. One
has a maximum length of 3, the other a maximum length of 4. Both codes are equally optimal in
coding the symbols. The two codes have code lengths, in the same frequency order, (4,4,3,2,2,2)
and (3,3,3,3,2,2).
You may mean the sum of the bits over the six possible symbols, which is in fact 17 for one of
the codes, but 16 for the other. However that is a meaningless measure, since you have used each
symbol once, in contradiction to their stated probabilities. A useful measure would be
multiplying each symbol length in bits by the probability to get an average symbol length in bits.
That is 2.5 bits for both of those codes. That is how you verify that both codes are equally
optimal.
In general you need to apply the Huffman algorithm in order to determine the maximum code
length. There is no other shortcut. You can traverse the tree to find the maximum length. You do
not need to explicitly generate the code per se, but the code is implied by the tree.

You can compute the entropy to get a lower bound on the average symbol length in bits. That is
the sum of each probability times its negative base-2 logarithm. In this case, the entropy is
about 2.446 bits.
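The figures quoted in this answer can be checked numerically. The answer does not state the six probabilities, so the sketch below assumes (0.1, 0.1, 0.1, 0.2, 0.2, 0.3), listed in increasing order of frequency; this is one distribution that is consistent with the quoted average of 2.5 bits and entropy of 2.446 bits, not a value taken from the text.

```python
import math

# Assumed probabilities (not given in the answer), in increasing order.
probs = [0.1, 0.1, 0.1, 0.2, 0.2, 0.3]
code_a = [4, 4, 3, 2, 2, 2]   # codeword lengths of the first Huffman code
code_b = [3, 3, 3, 3, 2, 2]   # codeword lengths of the second Huffman code

def average_length(probs, lengths):
    """Probability-weighted average codeword length in bits."""
    return sum(p * l for p, l in zip(probs, lengths))

def entropy(probs):
    """Shannon entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

avg_a = average_length(probs, code_a)
avg_b = average_length(probs, code_b)
h = entropy(probs)
print(sum(code_a), sum(code_b))   # unweighted bit totals: 17 and 16
print(avg_a, avg_b, round(h, 3))
```

Under this assumption both codes average 2.5 bits per symbol even though their unweighted totals (17 and 16) and maximum lengths (4 and 3) differ, and the efficiency H/R is about 2.446 / 2.5, or roughly 0.98.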
(d) Define and explain finite context modeling with its uses in data compression.
Answer: A finite-context model of order k predicts each symbol from the k symbols that immediately
precede it. Finite-context models are invariably used adaptively because they contain detail that tends to be
specific to the particular text being compressed. The probability estimates are simply frequency counts
based on the text seen so far. It is tempting to think that very high-order models should be used to obtain
the best compression. We need to be able to estimate probabilities for any context, however, and the
number of possible contexts increases exponentially with the order. Thus, large samples of text are needed to
make the estimates, and large amounts of memory are needed to store them. In an adaptive setting, the
size of the sample increases gradually, so larger contexts become more meaningful as compression
proceeds. The way to get the best of both worlds (large contexts for good compression, and small contexts
when the sample is inadequate) is to use a blending strategy, where the predictions of several contexts of
different lengths are combined into a single overall probability. There are a number of ways of performing
blending.
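The frequency-count estimates described above can be sketched as follows. This is a simplified illustration of a single fixed-order model only; it does not implement blending, and the names `build_context_counts` and `estimate` are hypothetical.

```python
from collections import Counter, defaultdict

def build_context_counts(text, order):
    """Count which symbol follows each context of `order` preceding symbols."""
    counts = defaultdict(Counter)
    for i in range(order, len(text)):
        context = text[i - order:i]
        counts[context][text[i]] += 1
    return counts

def estimate(counts, context, symbol):
    """Relative-frequency estimate of P(symbol | context); 0.0 if context is unseen."""
    seen = counts.get(context, Counter())
    total = sum(seen.values())
    return seen[symbol] / total if total else 0.0

counts = build_context_counts("abracadabra", 1)
p_b_after_a = estimate(counts, "a", "b")
print(dict(counts["a"]), p_b_after_a)
```

In the text "abracadabra" the context "a" is followed by b twice, c once and d once, so the order-1 estimate of "b" after "a" is 0.5. An unseen context returns 0.0 here, which is exactly the situation that blending with shorter contexts is meant to handle.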

3(a) Explain the update procedure for the adaptive Huffman coding
algorithm/flowchart with suitable example.

The Algorithm:
The Huffman tree is initialized with a single node, known as the Not-Yet-Transmitted (NYT) or
escape code. This code is sent every time a new character that is not yet in the tree is
encountered, followed by the ASCII encoding of the character. This allows the decompressor
to distinguish between a code and a new character. The procedure also creates a new node for
the character and a new NYT node from the old NYT node.
Whenever a character that is already in the tree is encountered, its code is sent and its weight is
increased.
For this algorithm to work, we need to add some additional information to the
Huffman tree. In addition to having a weight, each node is now also assigned a unique
node number. All nodes that have the same weight are said to be in the same block.
These node numbers will be assigned in such a way that:
1. A node with a higher weight will have a higher node number.
2. A parent node will always have a higher node number than its children.
This is known as the sibling property, and the update algorithm simply swaps nodes to make sure
that this property is upheld. Obviously, the root node will have the highest node number because
it has the highest weight.

Example: string is abb

3(b) What do you understand by arithmetic coding? What makes it different from
Huffman and Shannon-Fano coding? The following is the probability distribution of
the symbols:

Symbol       O    D    T    N    I    R    A    L    Y
Probability  0.1  0.1  0.1  0.2  0.1  0.1  0.1  0.1  0.1

Code the following string using arithmetic coding:

INDIA and LONDON

Solution:
Arithmetic coding (AC) is a method that provides codewords with an ideal length.
As with every other entropy coder, the probability of occurrence of each individual
symbol must be known. AC assigns an interval to each symbol, whose size reflects the
probability of that symbol. The codeword of a symbol is an arbitrary rational
number belonging to the corresponding interval. For a sequence of symbols, the current interval
is subdivided again for each successive symbol, and any number inside the final interval
identifies the whole sequence.
Arithmetic coding achieves an optimum that corresponds almost exactly to the
theoretical limit given by information theory. A slight degradation results from
inaccuracies caused by correction mechanisms for the interval division.
Huffman coding generally incurs rounding losses, because each codeword length is
restricted to a whole number of bits. In general, the efficiency of an arithmetic code
is therefore better than, or at least equal to, that of a Huffman code.
In general, Shannon-Fano and Huffman codes will be similar in size. However, Huffman coding
always at least equals the efficiency of the Shannon-Fano method, and has thus become the preferred
coding method of its type.
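The interval narrowing for the two words in the question can be sketched as follows. Two assumptions are made here that the question does not state: the symbols occupy the unit interval in the order listed in the table (O, D, T, N, I, R, A, L, Y), and the tag is taken as the midpoint of the final interval. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Probabilities from the question, in the (assumed) table order.
PROBS = {s: Fraction(1, 10) for s in "ODTNIRALY"}
PROBS["N"] = Fraction(2, 10)

def cumulative_ranges(probs):
    """Assign each symbol a sub-interval [low, high) of [0, 1)."""
    ranges, low = {}, Fraction(0)
    for symbol, p in probs.items():
        ranges[symbol] = (low, low + p)
        low += p
    return ranges

def encode_interval(message, probs):
    """Narrow [0, 1) once per symbol and return the final interval."""
    ranges = cumulative_ranges(probs)
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low
        s_low, s_high = ranges[symbol]
        low, high = low + width * s_low, low + width * s_high
    return low, high

def tag(message, probs):
    """Midpoint of the final interval, used here as the tag."""
    low, high = encode_interval(message, probs)
    return (low + high) / 2

for word in ("INDIA", "LONDON"):
    low, high = encode_interval(word, PROBS)
    print(word, float(low), float(high), float(tag(word, PROBS)))
```

Under these assumptions INDIA narrows to [0.53314, 0.53316) with tag 0.53315, and LONDON narrows to [0.803206, 0.803210) with tag 0.803208. A different symbol ordering would give different intervals of the same widths.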

4(a) Advantages and Disadvantages of LZW Compression vs LZ78

The size of files usually increases to a great extent when they include lots of repetitive
data or monochrome images. LZW compression is well suited to reducing
the size of files containing highly repetitive data. LZW compression is fast and simple
to apply. Since it is a lossless compression technique, none of the contents of the
file are lost during or after compression. The decompression algorithm mirrors
the compression algorithm. The LZW algorithm is efficient because it does not
need to pass the string table to the decompression code. The table can be recreated
as it was during compression, using the input stream as data. This avoids inserting
a large string translation table into the compressed data.
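The point that the decoder rebuilds the string table on its own can be illustrated with a minimal LZW sketch. It assumes an initial table of the 256 single-character strings and unbounded code values, which real implementations usually limit; the function names are illustrative.

```python
def lzw_compress(text):
    """Return a list of integer codes; assumes every character has a code below 256."""
    table = {chr(i): i for i in range(256)}
    w, codes = "", []
    for ch in text:
        if w + ch in table:
            w += ch
        else:
            codes.append(table[w])
            table[w + ch] = len(table)   # next free code
            w = ch
    if w:
        codes.append(table[w])
    return codes

def lzw_decompress(codes):
    """Rebuild the text and the table from the codes alone."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    w = table[codes[0]]
    out = [w]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # Code refers to the entry being defined right now.
            entry = w + w[0]
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[len(table)] = w + entry[0]
        w = entry
    return "".join(out)

codes = lzw_compress("ABABABA")
print(codes, lzw_decompress(codes))
```

No table is transmitted: the decoder adds the same entries in the same order as the encoder, one step behind, which is why the special case for a code that is not yet in the table is needed.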

The LZ77 compression algorithm is one of the most widely used compression algorithms; programs
such as PKZip are based on it, along with a few other algorithms.
This algorithm works on a dictionary basis, by searching the window for the longest match with
the beginning of the look-ahead buffer, and outputs a pointer to that match. Since it is possible
that not even a one-character match is found, the output cannot consist of pointers alone. This is
solved by also outputting the first character of the look-ahead buffer that follows the match. If no
match is found, a NULL pointer and the character at the coding position are output.
LZ78 is part of the family of LZ dictionary algorithms, which work by cashing in on the repetition
of small lexical units and larger phrases in data files. LZ77 and LZ78 are both dictionary coders,
unlike minimum redundancy coders or run length coders. LZ77 is the sliding window
compression ...
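The LZ77 search described above can be sketched as follows. This is a simplified illustration, not the PKZip implementation: tokens are (offset, length, next character) triples, a NULL pointer is represented as offset 0 and length 0, and the window and look-ahead sizes are arbitrary assumed values.

```python
def lz77_encode(data, window=16, lookahead=8):
    """Encode a string as (offset, length, next_char) triples."""
    i, out = 0, []
    while i < len(data):
        best_off, best_len = 0, 0
        start = max(0, i - window)
        # Always keep one character back to emit as the literal.
        max_len = min(lookahead, len(data) - i - 1)
        for j in range(start, i):
            length = 0
            while length < max_len and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_off, best_len = i - j, length
        out.append((best_off, best_len, data[i + best_len]))
        i += best_len + 1
    return out

def lz77_decode(tokens):
    """Rebuild the string by copying from already decoded output."""
    out = []
    for off, length, ch in tokens:
        for _ in range(length):
            out.append(out[-off])   # copy one symbol from `off` positions back
        out.append(ch)
    return "".join(out)

tokens = lz77_encode("AAAA")
print(tokens, lz77_decode(tokens))
```

For the input "AAAA" the encoder first emits a NULL pointer with the literal A, then a match of length 2 at offset 1 followed by the literal A. The match is allowed to overlap the coding position, which is why the decoder copies one symbol at a time.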

4(b) Briefly discuss the future of multimedia. How might multimedia be
used to improve the lives of its users? How might it influence users in
negative ways? What might be its shortcomings?

Answer: Future of Multimedia

History has proven that advances in communication, like the migration from radio to TV, mark a
whole new era of communication by using multimedia, which combines text, sound and images.
The implementation of multimedia today and in the future reflects the innate desire of man to create
outlets for creative expression, and to use technology and imagination to gain empowerment and
freedom for ideas that will propel all of us into a better world.

The future of multimedia will depend mostly on the development of current technology, which
will include high-bandwidth access to a wide range of multimedia resources. Multimedia is
important to many fields, especially education, business, entertainment, communications, public
places and medicine. In this technologically advanced environment, many multimedia tools are
involved in organising events or activities such as class lectures, opening ceremonies,
seminars and presentations, which may combine all forms of media content. For instance,
multimedia is used to produce computer-based training courses and reference works such as
encyclopedias and almanacs in education; it is used to develop special effects in
movies and animations in the entertainment industry; and in future it will be used for
worldwide voice and video communication (e.g. internet telephone and video phone systems). As
we can see, the impact of multimedia technology on our daily lives is often greater than we
realize or expect. Online video is rather primitive at present, but in
future users will be able to interact visually with people from all over the world. Multimedia will
not replace human contact; rather, it will supplement it by allowing users to interact in ways
and places that would otherwise be impossible, and the same goes for the other elements of multimedia.

How might multimedia be used to improve the lives of its users?

In my opinion, multimedia can improve the lives of users through its applications.

Firstly, multimedia is used as a common source of reference. Encyclopedias, directories,
dictionaries and electronic books are among the common multimedia references. Multimedia is also
used in education and training, where educational programmes make learning
more interesting and effective. Besides that, as mentioned
above, the entertainment industry produces computer games and develops animations and special
effects for cartoons and movies.

Multimedia applications are also widely used in scientific research to improve the condition of
sick people. In medicine, doctors can practise or be trained in performing high-risk surgery
by using virtual surgery.

Multimedia applications have greatly benefited those with low vision, as they help them
perform everyday tasks normally. Users who have vision
problems can therefore use multimedia in their educational and rehabilitation programmes to
improve their standard of living.
How might it influence users in negative ways?
In spite of the numerous ways in which it can improve the lives of its users, multimedia might also
influence users in negative ways.

Nowadays, many people use multimedia and advanced technology
in the wrong ways. For instance, with the multimedia tools available, people can now edit
pictures of prominent people, politicians and others and publish them just to tarnish their reputation.

Other people can use their smartphone or camera to record an incident that
happened around them and publish the video on Facebook or YouTube, which can
violate the rights of the people involved. Besides violating the victim's privacy, this kind of unethical action
may also affect their personal dignity.

In addition, advanced multimedia software (e.g. Photoshop, Dreamweaver)
helps users to edit, cut or merge real pictures in order to create a fake story that can
damage a person's image. A typical example is when someone edited a picture of the president of
Ghana in a very damaging manner purely out of mischief.

What might be its shortcomings?

Multimedia may not reach certain areas because of a lack of
technology, a lack of internet connection or low bandwidth. Thus, some people cannot
access multimedia.

Multimedia might be misused by unethical organizations when
competing for something. False news can be created from an edited
picture and spread through Multimedia Messaging Service (MMS) or other web resources.

Multimedia might also be misused by society and organizations. For example,
the negative effects of addiction are portrayed through advertisements. Newspapers, television and
even the Internet are used to convey social messages to people, but the
messages sometimes reach the masses in the wrong way. Another shortcoming, therefore, is that
some people are positively influenced by the media while others take the wrong message from it.
