
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

A R I T H M E T I C   C O D E   I M P L E M E N T A T I O N

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

< ANSI C > version 4.03 - 05/30/95

Amir Said: amir@densis.fee.unicamp.br - said@ipl.rpi.edu
DENSIS - Faculty of Electrical Engineering
UNICAMP - Campinas, SP, Brazil

William A. Pearlman - pearlman@ecse.rpi.edu
Dept. of Electrical, Computer, and Systems Engineering
Rensselaer Polytechnic Institute - Troy, NY 12180, USA

DESCRIPTION
===========

The files "arithm.h" and "arithm.c" contain an implementation of the arithmetic code based on the article by Witten, Neal & Cleary (Comm. ACM, vol. 30, pp. 520-540, June 1987). The implementation was adapted to allow simultaneous coding to more than one file and the use of several adaptive models with different alphabets.

Please send corrections, suggestions and bug reports to amir@densis.fee.unicamp.br.

FILES
=====

-> arithm.h (header)
-> arithm.c (implementation)
-> arithtst.c (example and test)

REMARKS
=======

* Each encoder/decoder is uniquely associated with a code file, defined during initialization.

* The interface functions (with prototypes in "arithm.h") perform ALL operations required in the normal usage of the method. The content in the data structures should NOT be directly addressed by the user program, but only via the interface functions.

* Different adaptive models can be used in any order during coding, but the order of models must be repeated exactly by the decoder.

* There is no predefined "end-of-file" symbol. This choice was made to avoid the extra symbol overhead when several models are used, when the models are frequently reset, or when they have a small number of symbols. The user must provide a method to allow the decoder to know when to stop. The encoder and decoder have byte counters that allow coding to a given rate and then finding the end of the file. Of course, the user can pre-define a special symbol to represent the end of the message.

* The maximum number of symbols allowed for each adaptive model is defined by "MaxSymbols" in the header file "arithm.h". The algorithm should work with a larger number, but it takes more time to gather statistics when the number of symbols is large. The program uses dynamic memory allocation to store the statistics of each model according to its number of symbols.

* A model with M symbols will accept symbols from 0 to M-1.

* For faster compression/decompression, data can also be coded in binary format. In this case a fixed (instead of adaptive) model with uniform distribution is used.

INCLUSION
=========

The header file "arithm.h" includes the ANSI standard C libraries <stdlib.h> and <stdio.h>.
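As a minimal sketch of the end-of-message option mentioned in the remarks above, the fragment below reserves one extra symbol beyond the 256 byte values. The names END_OF_MESSAGE, ALPHABET_SIZE, to_symbols and message_length are hypothetical and are not part of "arithm.h"; the sketch also assumes that "MaxSymbols" is at least 257, which should be checked in the header. In real use each symbol would be passed to Write_Symbol on one side and recovered with Read_Symbol on the other, using a model created with ALPHABET_SIZE symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical convention (not from "arithm.h"):
   symbols 0..255 are data bytes, symbol 256 ends the message. */
enum { END_OF_MESSAGE = 256, ALPHABET_SIZE = 257 };

/* Converts n bytes to symbols and appends END_OF_MESSAGE.
   "out" must have room for n + 1 entries. Returns the symbol count. */
size_t to_symbols(const unsigned char *data, size_t n, int *out)
{
    size_t i;
    assert(data != NULL && out != NULL);
    for (i = 0; i < n; i++)
        out[i] = data[i];            /* always in [0, 255] */
    out[n] = END_OF_MESSAGE;
    return n + 1;
}

/* Decoder side: counts the data symbols seen before END_OF_MESSAGE.
   Scans at most "limit" symbols; returns limit if no terminator is found. */
size_t message_length(const int *symbols, size_t limit)
{
    size_t i;
    assert(symbols != NULL);
    for (i = 0; i < limit; i++) {
        assert(symbols[i] >= 0 && symbols[i] < ALPHABET_SIZE);
        if (symbols[i] == END_OF_MESSAGE)
            break;
    }
    return i;
}
```

The cost of this choice is the one the remarks point out: every model that must be able to signal the end carries one extra symbol in its statistics.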

DATA TYPES
==========

-> struct Adaptive_Model
   * Contains the alphabet size and the frequency of the symbols.

-> struct Encoder
   * Contains the information about the code file and all the data required by the arithmetic coding algorithm.

-> struct Decoder
   * Contains the information about the code file and all the data required by the arithmetic decoding algorithm.

INTERFACE FUNCTIONS
===================

<< Adaptive_Model >>
--------------------

-> void Create_Model(Adaptive_Model * M, int number_of_symbols)
   * Used to define the number of symbols in the alphabet of the adaptive model, reset the statistics, and assign memory. This function MUST be called before using the model for proper memory allocation.

-> void Reset_Model(Adaptive_Model * M)
   * The adaptive model can be reset any time to allow changes in the distribution of symbols. Each call to this function by the encoder MUST be repeated by the decoder in the same order.

-> void Set_New_Model(Adaptive_Model * M, int number_of_symbols)
   * Redefines the alphabet size.

-> void Dispose_Model(Adaptive_Model * M)
   * Frees the memory allocated by the model.
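To make the create / reset / dispose life cycle above more concrete, here is a minimal sketch of what an adaptive frequency model can look like. It is NOT the code in "arithm.c": the type Sketch_Model and all sketch_* functions are hypothetical names, the real layout of struct Adaptive_Model is not described in this file, and the sketch leaves out the count rescaling a practical coder needs once the totals grow large.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an adaptive model (assumption, not the
   library's struct Adaptive_Model). */
typedef struct {
    int   size;    /* number of symbols in the alphabet */
    long *freq;    /* freq[s] = current count of symbol s */
    long  total;   /* sum of all counts */
} Sketch_Model;

/* Sets every count to 1 so that no symbol has zero probability. */
void sketch_reset(Sketch_Model *m)
{
    int s;
    for (s = 0; s < m->size; s++)
        m->freq[s] = 1;
    m->total = m->size;
}

/* Allocates the counts and resets them. Returns 0 on success, -1 on failure. */
int sketch_create(Sketch_Model *m, int number_of_symbols)
{
    assert(m != NULL && number_of_symbols > 0);
    m->freq = malloc((size_t)number_of_symbols * sizeof *m->freq);
    if (m->freq == NULL)
        return -1;
    m->size = number_of_symbols;
    sketch_reset(m);
    return 0;
}

/* Cumulative range [*lo, *hi) of "symb", out of m->total. */
void sketch_range(const Sketch_Model *m, int symb, long *lo, long *hi)
{
    int s;
    long sum = 0;
    assert(symb >= 0 && symb < m->size);
    for (s = 0; s < symb; s++)
        sum += m->freq[s];
    *lo = sum;
    *hi = sum + m->freq[symb];
}

/* Adapts the statistics after "symb" has been coded. */
void sketch_update(Sketch_Model *m, int symb)
{
    assert(symb >= 0 && symb < m->size);
    m->freq[symb]++;
    m->total++;
}

/* Frees the counts and leaves the model empty. */
void sketch_dispose(Sketch_Model *m)
{
    free(m->freq);
    m->freq = NULL;
    m->size = 0;
    m->total = 0;
}
```

Because the decoder rebuilds the same counts from the symbols it decodes, any reset on the encoder side has to happen at the same point on the decoder side, which is why Reset_Model calls must be mirrored exactly.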

<< Encoder >>
-------------

-> void Start_Encoder(Encoder * E, char * file_name)
   * Used for initialization. The encoder will overwrite or create the file with name "file_name". This function MUST be called before using the encoder, and CANNOT be used again unless the function "Stop_Encoder" is called first.

-> void Stop_Encoder(Encoder * E)
   * Stops the encoding process and closes the file assigned to the encoder.

-> void Write_Symbol(Encoder * E, Adaptive_Model * M, int symb)
   * Adds the symbol with number "symb" to the coded message, using the statistics in the corresponding adaptive model. The value of "symb" must be in the interval [0, m-1], where m is the number of symbols in the alphabet of the given adaptive model.

-> void Write_Bits(Encoder * E, int b, int word)
   * Writes the "b" least significant bits of "word" to the coded message, assuming an implicit model with "2**b" symbols and uniform distribution.

-> long Bytes_Used(Encoder * E)
   * Returns the number of bytes already used by the message. A few extra bits are used to finish the message and to pad it to an integer number of bytes. The result after calling "Stop_Encoder" includes those extra bits.
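The difference between Write_Symbol and Write_Bits comes down to which slice of the current coding interval is selected. The sketch below shows that interval-narrowing step in the style of the Witten, Neal & Cleary article, first for arbitrary cumulative counts and then for a uniform model with 2**b symbols. It is an illustration only: Sketch_Interval, sketch_narrow and sketch_narrow_bits are hypothetical names, a 16-bit interval and b <= 8 are assumed so the products fit in an unsigned long, and the bit output and renormalization that a complete encoder performs after each step are omitted.

```c
#include <assert.h>

/* Hypothetical coder state: the current interval [low, high], both
   inclusive. A 16-bit interval (0..65535) is assumed for illustration. */
typedef struct {
    unsigned long low;
    unsigned long high;
} Sketch_Interval;

/* Narrows the interval to the slice [cum_lo, cum_hi) out of "total",
   where the cumulative counts come from the model in use. */
void sketch_narrow(Sketch_Interval *iv, unsigned long cum_lo,
                   unsigned long cum_hi, unsigned long total)
{
    unsigned long range = iv->high - iv->low + 1;
    assert(cum_lo < cum_hi && cum_hi <= total);
    /* high must be computed before low is overwritten */
    iv->high = iv->low + range * cum_hi / total - 1;
    iv->low  = iv->low + range * cum_lo / total;
}

/* Uniform model with 2**b symbols: the b least significant bits of
   "word" select one of 2**b equal slices. */
void sketch_narrow_bits(Sketch_Interval *iv, int b, int word)
{
    unsigned long total, symb;
    assert(b > 0 && b <= 8);
    total = 1UL << b;
    symb = (unsigned long)word & (total - 1);
    sketch_narrow(iv, symb, symb + 1, total);
}
```

With a uniform model no statistics have to be looked up or updated, which is consistent with the remark that binary coding is the faster path.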

<< Decoder >>
-------------

-> void Start_Decoder(Decoder * D, char * file_name)
   * Used for initialization (similar to "Start_Encoder").

-> void Stop_Decoder(Decoder * D)
   * Closes the code file.

-> int Read_Symbol(Decoder * D, Adaptive_Model * M)
   * Reads the next symbol number from the coded message.

-> int Read_Bits(Decoder * D, int b)
   * Reads a word with "b" bits.

-> long Bytes_Read(Decoder * E)
   * Returns the number of bytes read by the decoder. The decoder has a buffer containing bits from the next symbols. So, at the same point in the message, this number may be 2 or 3 bytes larger than the count reported by the encoder.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
