1 Introduction
During the last decades, considerable research effort has been put into the
field of image and video coding. Pixel-based coding techniques, whose first
steps were made in the seventies [46,26,27], matured during the eighties and
the beginning of the nineties into widespread standards such as ITU-T H.261
[24], ISO/IEC MPEG1 [44], ISO/IEC MPEG2/ITU-T H.262 [45], and more recently
ITU-T H.263 [25]. Meanwhile, research on new techniques began in the early
eighties [38]; these new techniques have been identified [61] as using
mid-level vision concepts (such as regions and textures) instead of the
pixel-based low-level vision concepts used before.
The comparison of techniques proposed in the literature has often been
hampered by the lack of systematisation of the subject. This paper attempts
to fill this gap by proposing a taxonomy of partition types and
representations. The partition coding techniques addressing each of the
considered partition types and using the identified partition representations
are then overviewed.
2 Definitions
[Figure 1: panels (a) and (b), each with axes x and y, basis vectors u0 and
u1, and marked lattice sites.]
Fig. 1. Examples of (a) rectangular and (b) hexagonal lattices (u0 and u1 are the
lattice basis vectors).
A partition is a digital image where the value at each pixel is a label identifying
the class to which that pixel belongs. If the number of labels is restricted to
two, the partition is binary; if more than two labels are possible, the partition
is said to be mosaic. Partitions are usually obtained by segmenting a digital
image.
A class is the set of all pixels in a partition having that class’s label. A region
consists of the pixels in a connected component of a class (seen as a maximal
subgraph).
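To illustrate the distinction between classes and regions, the following is a minimal Python sketch that extracts the regions of a small partition by flood fill. It assumes a rectangular lattice with the 4-neighbourhood; the function name `regions` and the example array are hypothetical and do not come from the paper.

```python
from collections import deque

def regions(partition):
    """Return a list of (label, set of (row, col)) pairs, one per region.

    Assumes a rectangular lattice with the 4-neighbourhood (N4).
    """
    rows, cols = len(partition), len(partition[0])
    seen = set()
    result = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen:
                continue
            label = partition[r][c]
            component = set()
            queue = deque([(r, c)])
            seen.add((r, c))
            while queue:
                y, x = queue.popleft()
                component.add((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and (ny, nx) not in seen
                            and partition[ny][nx] == label):
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            result.append((label, component))
    return result

# Class 0 is disconnected here: it splits into a top and a bottom region.
example = [
    [0, 0, 1],
    [2, 2, 1],
    [0, 0, 1],
]
found = regions(example)
classes = {label for label, _ in found}
```

In this example the partition has three classes but four regions, because class 0 has two connected components.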
The class adjacency graph (CAG) is a graph with one vertex for each class
in the partition (plus an extra one representing the outside of the image),
and an edge between any two classes adjacent in the image graph. The region
adjacency graph (RAG) is defined similarly for regions.
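The CAG definition can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a rectangular lattice with the 4-neighbourhood, and the convention (assumed here, not taken from the paper) that every class with a pixel on the image frame is adjacent to the outside vertex. The names `OUTSIDE` and `class_adjacency_graph` are hypothetical.

```python
OUTSIDE = "outside"  # hypothetical name for the extra vertex

def class_adjacency_graph(partition):
    """Return the CAG as a set of frozenset({a, b}) edges (N4 assumed)."""
    rows, cols = len(partition), len(partition[0])
    edges = set()
    for r in range(rows):
        for c in range(cols):
            a = partition[r][c]
            # Assumption: pixels on the image frame touch the outside.
            if r in (0, rows - 1) or c in (0, cols - 1):
                edges.add(frozenset((a, OUTSIDE)))
            # Right and bottom neighbours cover every N4 pair once.
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < rows and nc < cols:
                    b = partition[nr][nc]
                    if a != b:
                        edges.add(frozenset((a, b)))
    return edges

example = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 2],
]
cag = class_adjacency_graph(example)
```

Classes 1 and 2 touch only diagonally in this example, so with N4 they are not adjacent; with an 8-neighbourhood they would be.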
Two partitions are said to be class or region equivalent if they divide the image
into equally shaped classes or regions. Two partitions are said to be equal if,
apart from being class equivalent, the labels of each class are equal in both
partitions. The partitions are said to be class topologically equivalent if they
have the same labels and the CAGs are equal.
The line graph of a partition is obtained by duality of its planar image graph,
see Figure 3. The contours of a partition are usually defined as the subgraph of
the line graph containing all vertices and edges standing between pixels with
different labels (i.e., belonging to different classes).
[Figure 3: panels (a) and (b); the legend distinguishes pixels, line graph
edges, and line graph vertices.]
Contours can also be defined directly on the pixels, by selecting only the pixels
at the borders between regions.
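The two contour definitions can be compared with a short Python sketch. It assumes a rectangular lattice with the 4-neighbourhood and ignores the image frame; each line graph edge is identified here simply by the pair of pixels it separates. The function names are hypothetical.

```python
def contour_edges(partition):
    """Line graph edges standing between N4-adjacent pixels with
    different labels, each given as the pair of pixels it separates."""
    rows, cols = len(partition), len(partition[0])
    edges = []
    for r in range(rows):
        for c in range(cols):
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if (nr < rows and nc < cols
                        and partition[r][c] != partition[nr][nc]):
                    edges.append(((r, c), (nr, nc)))
    return edges

def contour_pixels(partition):
    """Pixels at the borders between regions (both sides of each border)."""
    pixels = set()
    for p, q in contour_edges(partition):
        pixels.update((p, q))
    return pixels

example = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
edges = contour_edges(example)
pixels = contour_pixels(example)
```

For this straight border, two edges are enough on the line graph, whereas the pixel-based definition selects four pixels, two on each side of the border.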
Fig. 4. Variety of vertices on contour graphs: (a) and (b) rectangular and hexagonal
line graphs, and (c) rectangular image graph.
Circuits and paths are simple if they do not contain repeated edges. An Euler
circuit is a simple circuit containing all the edges of a given graph.
During the description of the taxonomy tree levels, square brackets will be
used to specify the codes representing each branch of the tree.
The partition types can be organised in a tree with the following levels:
Figure 5 shows the partition type levels of the taxonomy tree. The leaves of
the taxonomy tree correspond to different types of partition. Each type of
partition can be specified by answering the five questions listed. For instance,
the answers: i. two-dimensional [2D], ii. hexagonal [H], iii. 6-neighbourhood
[N6], iv. mosaic [M], and v. connected [C] (or, with codes, 2DHN6MC) define
a type of partition that lies in a two-dimensional space, corresponds to
digital images sampled according to a hexagonal lattice, is structured
according to the hexagonal graph, can have more than two classes, and has all
its classes connected (the concepts of class and region are equivalent in
this case).
[Figure 5: tree with one level per question. Space: 2D or 3D. Lattice:
hexagonal (H) or rectangular (R). Graph (Nn): N6, N4, or N8. Classes: binary
(B) or mosaic (M). Connectivity: connected (C) or disconnected (D). Example
leaf: 2DHN6MC.]
Fig. 5. The partition type taxonomy tree (in bold, the example given in the text).
c stands for either C (connected classes) or D (disconnected classes).
Notice that the branches under “3D” in the figure are not drawn, since this
paper addresses mainly two-dimensional partitions. At the partition represen-
tation level, however, three-dimensional partitions will be considered in more
detail (see next section).
This section introduces more levels of detail into the taxonomy tree, related
to the representation chosen for the partitions. Two- and three-dimensional
partitions will be dealt with separately.
It has been seen in Section 2 that a partition can be represented in two dif-
ferent ways: either by the labels of each pixel, or by contour information plus
region-class information. When class equivalence is the aim, the latter provides
information about the clustering of regions into a certain number of classes.
(ii) How – How should the partition be represented? With pixel labels [L] or
with contours [C]?
For the case of partitions represented with contours, other choices have to
be made: How to represent the contours? What sort of neighbourhood sys-
tem has the line graph? These questions lead to two other levels of partition
representation in the taxonomy tree:
Figure 6 shows the partition representation levels of the taxonomy tree for the
two-dimensional case.
[Figure 6: tree with one level per question. Handling: binary (B) or mosaic
(M). How: labels (L) or contours (C). Where: pixels (P) or edges (E). Graph
(Nn): N6, N3, N4, or N8. Example leaf: 2DHN6Mc-BCEN3.]
Fig. 6. The partition representation taxonomy tree for the two-dimensional case (in
bold, the example given in the text). c stands for either C (connected classes) or D
(disconnected classes).
The 2DHN6Mc partition type with a representation separated into binary class
partitions, using contours defined on edges, which have an N3 neighbourhood
system, is coded as 2DHN6Mc-BCEN3 or:
contours, edges, N3 graph.
[Figure 7: tree level Approach: 3D or 2D.]
Fig. 7. The partition representation taxonomy tree for the three-dimensional case.
The choice of representation for the partitions (of a given type) depends on
the properties of each representation and on how adequate they are for the
task at hand. Pros and cons related to some of the levels of the partition
representation taxonomy tree are listed below:
– Handling: (a) Mosaic – A single connected contour graph can separate
several regions, which leads to coding efficiency when a contour
representation is used; however, access to a single class shape is not easy,
since the regions (and classes) are not represented individually. (b) Binary –
The classes are represented independently, and thus easy access to each class
is provided, though at the expense of a lower coding efficiency.
– How: (a) Labels – In this case, the identification of the class to which each
pixel in the partition belongs is very simple, though the shapes of the classes
are not directly represented. (b) Contours – The shapes of the classes are
directly represented, albeit at the expense of requiring somewhat involved
algorithms to ascertain the class of a given pixel [50,58,2].
– Where: (a) Pixels – Representing contours on pixels poses a number of
problems, especially in the case of mosaic partitions, since using all border
pixels leads to unnecessary repetition at both sides of a border; when the
problem is avoided by using only one side of each border, other problems
arise: e.g., how should one-pixel-wide regions or parts of regions be
distinguished from borders of thick regions? Although the problems associated
with these representations have solutions, often somewhat involved, coding
contours on pixels does not seem to achieve higher compression than coding
contours on edges [13] (see also Section 4.6). (b) Edges – This is usually a
more elegant way of representing contours, which in addition typically
provides more compression than pixel-based contours [13].
Once the type of partitions to code has been ascertained and a partition
representation selected, according to the taxonomy defined in the previous
section, there are usually a number of available coding techniques. This
section overviews some of these techniques. Special attention will be paid to
two-dimensional partitions.
Picard [53] identified the three performance criteria that classical video source
coders attempt to minimise: rate, distortion, and cost. The first relates to the
desirable compression of the data to transmit, so as to reduce redundancy
and also irrelevancy, if information losses are admissible during coding. The
second pertains to the need to maintain the quality of the signal as high as
possible, according to a possibly subjective criterion, and is applicable only to
lossy coding techniques. The third has to do with implementation costs.
possible in the bit stream. This “fourth criterion” is being addressed also in
MPEG-4, and is related to one of the most important MPEG-4 functionalities:
object scalability.
When easy access to the contents of the video sequence is required, the shapes
of the various objects (e.g., a class or a set of classes in a partition) will
have to be coded independently. This requirement can be imposed even if the
segmentation process resulted in a mosaic partition, reducing the problem to
the coding of a series of binary partitions (see the “handling” level in Figure 6).
The independent coding of binary partitions also arises naturally when a lay-
ered scene representation, as proposed by Wang and Adelson [61], is used.
Layered representations of the scenes are also used in the MPEG-4 Video Ver-
ification Model 3.0 (VM3): each layer corresponds to a two-dimensional ob-
ject of arbitrary shape, whose time snapshots are called Video Object Planes
(VOPs) [52,48]. The shape of the objects represented by VOPs can be
associated with binary partitions (see footnote 4). However, if the content of
the VOPs is coded through region-based techniques, then mosaic partitions will
also be necessary within each VOP.
Thus, the coding of both binary and mosaic partitions may be an important
issue when easy access to the contents of the video sequences is required.
(i) the regions tend to contain a significant number of pixels, i.e., small
regions are improbable;
(ii) the classes tend to contain a small number of regions;
(iii) the contours (borders between regions) tend to be simple (not ragged);
(iv) the region interiors tend not to contain too many small holes.
4 Actually the shapes of the VOPs can be specified in MPEG-4 using “binary
shape”, i.e., a binary partition, or “grey scale shape”, which is an alpha plane
specifying the transparency of each pixel.
5 Such a dependency makes it difficult to evaluate the performance of a partition
coder by itself.
6 See for instance Chapter 10 of [28].
same class. This issue will not be discussed at length here. However, note that
the coding methods used should take into account that:
(i) the explicit class labels are not required, since class equivalence is enough;
(ii) adjacent regions cannot belong to the same class, for otherwise they would
be a single region (this can help reduce the amount of data to transmit).
If partition equality is required, then the class labels should be coded
explicitly for each region in the partition. When the classes are connected,
each label appears only once; this can be used to reduce the amount of data to
transmit, since the number of labels still available decreases with each label
transmitted, and the degrees of freedom reach zero once the next-to-last
label has been transmitted.
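The saving can be quantified with a small calculation. The Python sketch below assumes that the decoder knows the set of N labels in advance, so that the transmitted labels form a permutation of that set; it compares ideal (fractional) bit counts, not an actual code. The function name `label_bits` is hypothetical.

```python
import math

def label_bits(num_classes):
    """Return (naive_bits, reduced_bits) for coding one label per region.

    naive: every label is coded independently among num_classes options.
    reduced: each label is used exactly once, so the k-th transmitted
    label is chosen among the num_classes - k + 1 labels still unused.
    """
    naive = num_classes * math.log2(num_classes)
    reduced = sum(math.log2(remaining)
                  for remaining in range(num_classes, 0, -1))
    return naive, reduced

naive, reduced = label_bits(8)
```

The last term of the sum is log2(1) = 0, which reflects that the last label costs nothing once the next-to-last one is known; the total equals log2(N!).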
Binary partitions can be seen as binary (or two-tone or bi-level) images. There-
fore, the techniques available for coding binary images are good candidates
for coding binary partitions. While lossless techniques can be applied without
any problems, lossy techniques often do pose some problems, since the type
of losses they allow does not generally take into account the requirements
identified in Section 4.1 for lossy partition coding.
Reviews on binary image coding can be found in [36,30] and, specifically for
fax, [31]. The lossless coding standards ITU-T T.4 and T.6 (Group 3 and
Group 4 facsimile) [16,17] and ITU-T T.82 (JBIG, for progressive coding of
binary images) [32] use techniques with increasing compression efficiency:
probabilities for the Q-Coder are estimated using a local context (a tem-
plate) for the current pixel. Since JBIG uses resolution layers for progressive
coding, two types of templates exist: the first is used in the lowest resolution
layer and includes only pixels already transmitted in that layer, while the
second is used for all the other layers and includes not only pixels from the
current layer but also from the layer immediately below in resolution.
Among all the other techniques that have been proposed for binary partition
coding, the morphological skeleton [41] (and more recently [33]) is especially
relevant, mainly because this technique has lately evolved to cover mosaic
partitions efficiently as well [8] (see Section 4.5.2). This technique
represents the shape of a region by a set of skeleton points and a so-called
quench function: the region is the union of structuring elements (of a certain
shape) centred on the skeleton points and scaled according to the value of the
quench function at that point.
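The reconstruction step described above can be sketched as follows. This is a minimal illustration only: it assumes a square structuring element of side 2s + 1 for quench value s, which is a choice made here for simplicity and not necessarily the element used in [41] or [33]. The function name `reconstruct` is hypothetical.

```python
def reconstruct(skeleton, quench):
    """Union of square structuring elements centred on skeleton points.

    skeleton: iterable of (row, col) points.
    quench: dict mapping each skeleton point to a non-negative integer
            scale s; the element is the (2s + 1) x (2s + 1) square.
    """
    region = set()
    for (r, c) in skeleton:
        s = quench[(r, c)]
        for dr in range(-s, s + 1):
            for dc in range(-s, s + 1):
                region.add((r + dr, c + dc))
    return region

skeleton = [(2, 2), (2, 5)]
quench = {(2, 2): 2, (2, 5): 1}
region = reconstruct(skeleton, quench)
```

Only two points and two scale values are needed here to describe a region of 31 pixels, which is where the compression potential of the representation comes from.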
Since binary partitions are a special case of mosaic partitions, techniques de-
veloped for the latter may also be applied to the former, either directly or
with simplifying changes, despite the fact that they do not take into account
the special characteristics of binary partitions.
The case of mosaic partitions is more complex. The coding of mosaic
partitions has received less attention than the coding of binary partitions
(however, see [8,7,60]). It is possible, nevertheless, to use binary partition
coding techniques by first converting the mosaic partitions into bit planes
(see footnote 7).
A technique using the concept of geodesic skeleton, where the regions are
described by a set of skeleton points and a quench function [8], was recently
proposed. This technique was developed for mosaic partitions, being thus also
applicable in the binary case, and is, in a sense, an extension of the technique
proposed in [41] for binary partitions (see Section 4.5.1). The authors claim
that “the geodesic skeleton is preferable to chain code whenever there are
many isolated and short contour arcs to be coded”, which seems to be the
case when 3D···-2DInterM (motion predicted 2D partitions corresponding to
time slices of a 3D partition) partition representations are used.
7 For instance, using the Four-colour theorem [54], the regions in a partition can be
perfectly identified by painting them with only four colours. Hence, each region can
be identified by a two-bit label, and thus two bit-planes are sufficient for representing
the partition. Each of the two bit-planes can be coded independently using (lossless)
binary partition coding techniques. Notice that some borders are present in both
bit-planes, so this method cannot yield optimal results.
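The bit-plane conversion mentioned in footnote 7 can be sketched in Python as below. The sketch assumes that the partition has already been relabelled with the four colours 0 to 3; finding such a colouring is not shown. The function names are hypothetical.

```python
def to_bit_planes(partition):
    """Split a partition with labels 0..3 into two binary partitions."""
    low = [[label & 1 for label in row] for row in partition]
    high = [[(label >> 1) & 1 for label in row] for row in partition]
    return low, high

def from_bit_planes(low, high):
    """Recombine the two bit-planes into the four-label partition."""
    return [[(hi << 1) | lo for lo, hi in zip(low_row, high_row)]
            for low_row, high_row in zip(low, high)]

four_colour = [
    [0, 0, 1],
    [2, 3, 1],
    [2, 3, 3],
]
low, high = to_bit_planes(four_colour)
```

A border between labels 0 and 3 (binary 00 and 11) appears in both planes, which illustrates why coding the planes independently repeats some contour information.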
A method which is also related to geodesic skeletons has been proposed in [60].
It represents regions as a union of structuring elements with appropriate trans-
lations and scalings. Both techniques ([8,60]) allow the structuring elements
to overlap already coded regions, thus avoiding duplicate coding of borders
and reducing the required bit rate. Both techniques are lossy and, again, can
be used for mosaic and binary partitions.
(i) Chain codes – The contour graph is coded by a string of symbols rep-
resenting the direction of the “chain” connecting a vertex to the next
vertex on the contour. Each of these strings is called a chain code. Sym-
bols may also represent direction changes, which makes the chain codes
differential.
(ii) Parametric curves – The contours are approximated by parametric curves,
whose coefficients are then coded; the most common examples are approx-
imations by straight lines and by splines (in general, by polynomials).
(iii) Transform codes – The contours are represented as parametric curves
which are coded using transform methods, followed by coefficient quan-
tisation, in a one-dimensional equivalent of the transform image coding.
All these techniques involve two steps: first the representation is changed by
transforming the contours into strings of symbols (e.g., changes in chain
direction, spline parameters, control points or transform coefficients,
possibly quantised) and then these symbols are entropy coded.
8 [7] also contains a good review of partition coding techniques.
The contour graph is a subgraph of either the line graph (for contours defined
on edges) or the image graph (for contours defined on pixels), and usually
consists of a collection of paths on the original graph. Contours can thus be
represented by a string of symbols representing which of the neighbours of the
current graph vertex belongs to the contour or, which is the same, the direc-
tion of the (chain) “link” connecting it to the next vertex on the contour: these
strings are called chain codes [18,19,64]. When the symbols represent direc-
tion changes, the chain codes are said to be differential [14,22]. The simplest
partitions are those for which the contour graph is constituted of disconnected
loops, that is, circuits where each vertex has exactly two neighbours in the
contour graph.
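As an illustration of chain codes and their differential form, the following Python sketch codes a closed contour on a rectangular lattice with four directions. The symbol assignment (0 = right, 1 = up, 2 = left, 3 = down, with y pointing up) is an assumed convention, and the function names are hypothetical.

```python
# Direction symbols for a rectangular lattice with 4 neighbours
# (an assumed convention): 0 = right, 1 = up, 2 = left, 3 = down.
DIRECTIONS = {(1, 0): 0, (0, 1): 1, (-1, 0): 2, (0, -1): 3}

def chain_code(vertices):
    """Direction symbols linking consecutive (x, y) vertices of a path."""
    return [DIRECTIONS[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(vertices, vertices[1:])]

def differential(code):
    """Direction changes modulo 4: 0 = straight, 1 = left turn,
    3 = right turn, 2 = turning back. The first symbol stays absolute."""
    return code[:1] + [(b - a) % 4 for a, b in zip(code, code[1:])]

# A 2 x 1 rectangle followed counter-clockwise from the origin.
loop = [(0, 0), (1, 0), (2, 0), (2, 1), (1, 1), (0, 1), (0, 0)]
code = chain_code(loop)
diff = differential(code)
```

The differential code of this loop uses only the symbols 0 and 1, while the plain chain code uses all four, which hints at why differential codes are often easier to entropy code.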
Binary partitions are generally simpler to code than mosaic partitions. The
main difference stems from the fact that, for binary partitions, all vertices in
the contour graph (at least for contours defined on the line graph) have an
even number of neighbours: two vertices for images sampled with hexagonal
lattices, and two or four vertices for images sampled with rectangular lattices.
That is, the connected components of such graphs have Euler circuits, i.e., they
can be “drawn without lifting the pencil”, according to a known theorem in
graph theory [54] (see footnote 9).
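The even-degree property can be checked on a small example. The Python sketch below builds the contour graph on the line graph of a rectangular lattice (vertices are pixel corners) and counts the degree of each vertex. It does not model the image frame, so the example keeps class 1 away from the frame; the function name is hypothetical.

```python
from collections import Counter

def contour_degrees(partition):
    """Degree of every line graph vertex (pixel corner) in the contour graph.

    Rectangular lattice; contour edges stand between N4-adjacent pixels
    with different labels. The image frame is not included.
    """
    rows, cols = len(partition), len(partition[0])
    degree = Counter()
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols and partition[r][c] != partition[r][c + 1]:
                # Vertical edge between pixels (r, c) and (r, c + 1).
                degree[(r, c + 1)] += 1
                degree[(r + 1, c + 1)] += 1
            if r + 1 < rows and partition[r][c] != partition[r + 1][c]:
                # Horizontal edge between pixels (r, c) and (r + 1, c).
                degree[(r + 1, c)] += 1
                degree[(r + 1, c + 1)] += 1
    return degree

binary = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
degrees = contour_degrees(binary)
all_even = all(d % 2 == 0 for d in degrees.values())
```

The two diagonally touching pixels of class 1 share one corner, which becomes a vertex of degree four (a crossing); all other contour vertices have degree two.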
(i) Ignore junctions and crossings – Select one of the exits and leave the
others for coding as separate contours; since initial contour points are
costly to code, this solution is not optimal.
(ii) Code junctions and crossings explicitly [42] – Select one of the exits but
also code information about the junction or crossing so that later one can
“return” and continue following the remaining exits (one in the case of a
junction, two in the case of a crossing).
9 “A connected multigraph [and hence also a simple graph] has an Euler circuit
if and only if each of its vertices has even degree” [54]; this theorem solves
the so-called Königsberg bridges problem.
When junctions and crossings are explicitly coded, the compression obtained
when coding a connected component of a contour depends strongly on the way
the connected component is followed: where to start, which exit to follow first
at each junction or crossing, etc. The problem of coding can then be seen as
a problem of minimising the bit rate given a certain syntax of representation.
This problem is similar to “the Königsberg bridges problem generalised”, that
is, to the problem of making a line drawing without lifting the pencil and
minimising the length of the redrawn lines [3].
When contours are defined on pixels, the concepts of junction and crossing
require a more involved definition and treatment [40,13]. In the case of
binary partitions, the problem may be solved by again ignoring the presence
of vertices of degree larger than two in the contour graph. Another problem
of contours defined on pixels is posed by one-pixel-wide regions or parts of
regions, which make it difficult to use a stopping condition as simple as “stop
when the initial vertex of the contour is attained”, which is often used when
coding contours defined on edges. Such regions may also require the existence
of a rarely used “turning back” (180°) direction in the chain codes, which
may cause some VLCs to be inefficient (for instance Huffman; see footnote 10).
Several techniques have been proposed in the literature for entropy coding
the initial vertices and the chain codes: 1. zero order Huffman and arithmetic
coding (adaptive or not) [43,42], which tend to be inefficient, since region
borders are usually very different from a Brownian random walk through the
image or line graph; 2. nth order Huffman and arithmetic coding (adaptive or
not) [13,43,14]; 3. Ziv-Lempel coding [65,63], which is a form of
“dictionary-based coding” [36]; and 4. run-length coding, which groups chain
codes into runs of related symbols [34,43], usually corresponding to straight
line segments [37,4,43] (and hence constituted either of a single symbol or of
two symbols, with adjacent directions, which verify the conditions defined by
Rosenfeld in [55]).
10 Consider an alphabet consisting of two symbols A and B with equal probabilities
0.5: the corresponding Huffman code will have one bit per symbol. If a third,
improbable but possible, symbol C is added, and the probabilities are p(A) = 0.495,
p(B) = 0.495, p(C) = 0.01, the number of bits per code word will be 1, 2, and 2,
respectively. The average number of bits per symbol will be 1.505, 40% worse than
the minimum of 1.071 (the source entropy).
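The figures in footnote 10 can be verified numerically. The following Python sketch builds a binary Huffman code for the three-symbol alphabet and compares its average length with the source entropy. Which of A or B receives the one-bit code word depends on tie-breaking, so only the multiset of lengths is meaningful; the function name is hypothetical.

```python
import heapq
import math

def huffman_lengths(probabilities):
    """Code word length of each symbol in a binary Huffman code."""
    # Heap items: (probability, tie-breaker, symbols in the subtree).
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(probabilities, 0)
    counter = len(heap)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1  # one level deeper in the code tree
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

p = {"A": 0.495, "B": 0.495, "C": 0.01}
lengths = huffman_lengths(p)
average = sum(p[s] * lengths[s] for s in p)
entropy = -sum(q * math.log2(q) for q in p.values())
overhead = average / entropy - 1.0
```

The overhead comes out at roughly 40%, in agreement with the footnote; an arithmetic coder would not suffer from this integer code word length constraint.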
When control points are used, their differences along the contour graph are
usually entropy coded. These methods deal with junctions and crossings in a
very similar way to chain coding techniques (see Section 4.6.1).
As part of the MPEG-4 core experiments on binary shape coding [59], para-
metric curve techniques have also been evaluated [21,47,35,12] (some of these
techniques stem from the earlier [29]). These techniques approximate the con-
tours with polygons or splines using a set of control points chosen again with a
split algorithm. The selection of which approximation method to use is either
done for each contour segment (between control points) or for each object.
The proposed techniques also take advantage of time redundancy between
control points along the successive partitions. One-dimensional transform
coding methods, some of which are multi-resolution, are proposed to compensate
for the residual error between the parametric curve approximation and the
actual contours (see the next section).
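The text does not detail the split algorithm used to choose the control points. As a rough illustration only, the Python sketch below uses a recursive split in the style of the Ramer-Douglas-Peucker method for an open contour segment: the segment is split at the point farthest from the chord until every point lies within a tolerance. This is an assumed stand-in, not necessarily the algorithm of [21,47,35,12]; all names are hypothetical.

```python
import math

def point_line_distance(p, a, b):
    """Orthogonal distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    length = math.hypot(bx - ax, by - ay)
    if length == 0.0:
        return math.hypot(px - ax, py - ay)
    return abs((bx - ax) * (ay - py) - (ax - px) * (by - ay)) / length

def split(points, tolerance):
    """Control points of a polygonal approximation of an open contour."""
    first, last = points[0], points[-1]
    interior = range(1, len(points) - 1)
    if not interior:
        return [first, last]
    worst = max(interior,
                key=lambda i: point_line_distance(points[i], first, last))
    if point_line_distance(points[worst], first, last) <= tolerance:
        return [first, last]
    # Split at the worst point and approximate both halves recursively.
    left = split(points[:worst + 1], tolerance)
    right = split(points[worst:], tolerance)
    return left[:-1] + right

contour = [(0, 0), (1, 0), (2, 0), (3, 0), (3, 1), (3, 2), (3, 3)]
control_points = split(contour, tolerance=0.5)
```

With a tolerance of 0.5 the L-shaped contour is reduced to its three corner points; a larger tolerance trades accuracy for fewer control points, which is the lossy aspect of the approach.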
The contours are represented first as parametric curves taking values in ℝ, if
the contour (or contour segment) being coded can be represented by a polar
function centred somewhere in the image, or in ℝ² for other kinds of contour
(or contour segments). These parametric curves (still a lossless representation)
are then coded using transform methods [10], in a one-dimensional equivalent
of the transform coding used in image coding (e.g., DCT in JPEG, H.261,
H.262, and H.263), i.e., the parametric curves are transformed and the
resulting coefficients are quantised and entropy coded.
Transform codes have also been under scrutiny in the MPEG-4 core exper-
iments on binary shape coding [59], both for contour coding proper and for
coding the residual error after using parametric curve methods.
The first of the techniques considered in the core experiments uses a polar
representation of the contour [11]. The contour is represented by a function
of the polar angle, whose value is the distance between the centroid and the
contour in the direction defined by the angle (see footnote 11). The
one-dimensional DCT of the distance function is calculated and then its
coefficients are quantised and VLC coded. Some contours cannot be properly
represented by a parametric function of the polar angle (since more than one
contour point may occur for a single angle). Hence, parts of the contour may
have to be left out. These parts are handled separately using chain codes (see
Section 4.6.1). This technique can also take advantage of the temporal
redundancy between successive partitions.
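The polar transform coding chain can be sketched as follows. This Python illustration samples the centroid-to-contour distance of an ellipse at 16 angles, applies an unnormalised DCT-II, quantises the coefficients uniformly, and reconstructs the distance function. The ellipse, the number of samples, and the quantiser step are arbitrary assumptions; the VLC stage and the handling of contours that are not polar functions are omitted.

```python
import math

def dct(signal):
    """Unnormalised one-dimensional DCT-II."""
    n = len(signal)
    return [sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(signal))
            for k in range(n)]

def idct(coefficients):
    """Inverse of dct() above (a scaled DCT-III)."""
    n = len(coefficients)
    return [(coefficients[0] / 2
             + sum(coefficients[k] * math.cos(math.pi * (i + 0.5) * k / n)
                   for k in range(1, n))) * 2 / n
            for i in range(n)]

def quantise(coefficients, step):
    """Uniform quantiser: index of the nearest multiple of step."""
    return [round(c / step) for c in coefficients]

def dequantise(indices, step):
    return [q * step for q in indices]

# Distance from the centroid to an ellipse contour, sampled at 16 angles.
a, b, n = 10.0, 6.0, 16
angles = [2 * math.pi * i / n for i in range(n)]
radius = [a * b / math.hypot(b * math.cos(t), a * math.sin(t))
          for t in angles]

step = 0.5
indices = quantise(dct(radius), step)
decoded = idct(dequantise(indices, step))
max_error = max(abs(r - d) for r, d in zip(radius, decoded))
```

For a circular contour the distance function is constant and only the first coefficient is non-zero, which is the extreme case of the energy compaction this technique relies on.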
The other transform coding techniques tested in the MPEG-4 core experiments
use either the one-dimensional DST or DCT to code not the contour
itself, but the residual error (distance) between a parametric curve
approximation and the actual contour [47,35,12]. In [12] the distance between
the approximated and actual contours is calculated either horizontally or
vertically, depending on the slope of the line between the control points of
the contour segment being encoded. This substantially reduces the calculations
relative to the usual orthogonal distance method. In [47] a multi-resolution
version of the DST is used, so as to provide contour (object) scalability.
11 The centroid is the point whose coordinates are the average of the coordinates of
all the pixels in the region enclosed by the contour.
5 Conclusion
The partition type level classifies the possible partition types that may have to
be coded. The partition types are classified according to: i. space (2D or 3D),
ii. sampling lattice, iii. superimposed graph structure, iv. number of classes in
the partition, and v. class connectivity.
The proposed systematisation is believed to simplify the comparison between
partition coding techniques, by establishing clearly which type of partitions a
given partition coding technique addresses, and which partition representation
that technique is based on.
An issue of interest, which will also be left for further study, is the extension of
the partition tree to include a branch for line drawings or contours that may
be open (which are not the dual of some partition). This is of interest since
contour-based coding, or image reconstruction from edges [22,9,15], with its
long history, still seems to have a large potential in image coding.
Acknowledgement
The authors would like to acknowledge the valuable comments of Prof. Fer-
nando Pereira and of the anonymous reviewers.
References
[1] Technical description for MPEG-4 first round of test. Technical Description
ISO/IEC JTC1/SC29/WG11 MPEG95/0354, Toshiba, November 1995.
[2] S. M. Ali and R. E. Burge. A new algorithm for extracting the interior of
bounded regions based on chain coding. Computer Vision, Graphics, and Image
Processing, 43:256–264, 1988.
[3] Richard Bellman and K. L. Cooke. The Königsberg bridges problem
generalized. Journal of Mathematical Analysis and Applications, 25:1–7, 1969.
[4] Michael James Biggar and A. G. Constantinides. Thin line coding techniques.
In Proceedings of the International Conference on Digital Signal Processing,
Florence, Italy, September 1987.
[5] Frank Bossen and Touradj Ebrahimi. A simple and efficient binary shape coding
technique based on bitmap representation. Technical Description ISO/IEC
JTC1/SC29/WG11 MPEG96/0964, EPFL, July 1996.
[6] Noel Brady. Adaptive arithmetic encoding for shape coding. Technical
Description ISO/IEC JTC1/SC29/WG11 MPEG96/0975, Teltec Ireland
(Dublin City University), ACTS/MoMuSys, July 1996.
[7] Patrick Brigger, Antoni Gasull, Chuang Gu, Ferran Marqués, Fernand
Meyer, and Christophe Oddou. Contour coding. CEC Deliverable
R2053/UPC/GPS/DS/R/006/b1, EPFL, UPC, CMM, LEP, December 1993.
[8] Patrick Brigger and Murat Kunt. Morphological shape representation for very
low bit-rate video coding. Signal Processing: Image Communication, 7(4–
6):297–311, November 1995.
[9] Stefan Carlsson. Sketch based coding of grey level images. Signal Processing,
15(1):57–83, July 1988.
[11] Yu-Shin Cho, Shi-Hwa Lee, Jae-Seob Shin, and Yang-Seock Seo. Results of core
experiments on comparison of shape coding tools (S4). Technical Description
ISO/IEC JTC1/SC29/WG11 MPEG96/0717, Samsung AIT, March 1996.
[12] Yu-Shin Cho, Shi-Hwa Lee, Jae-Seob Shin, and Yang-Seock Seo. Shape
coding tool: Using polygonal approximation and reliable error residue sampling
method. Technical Description ISO/IEC JTC1/SC29/WG11 MPEG96/0565,
Samsung AIT, January 1996.
[14] Murray Eden and Michel Kocher. On the performance of a contour coding
algorithm in the context of image coding part I: Contour segment coding. Signal
Processing, 8(4):381–386, July 1985.
[17] Facsimile coding schemes and coding control functions for Group 4 facsimile
apparatus. Recommendation T.6, CCITT, 1984.
[20] Antoni Gasull, Ferran Marqués, and Juan A. García. Lossy image contour
coding with multiple grid chain code. In Proceedings of the Workshop on
Image Analysis and Synthesis in Image Coding (WIASIC’94), page B4, Berlin,
Germany, October 1994. Heinrich-Hertz-Institute.
[21] Peter Gerken, Michael Wollborn, and Stefan Schultz. Polygon/spline
approximation of arbitrary image region shapes as proposal for MPEG-
4 tool evaluation – technical description. Technical Description ISO/IEC
JTC1/SC29/WG11 MPEG95/0360, RACE/MAVT, University of Hannover,
Robert Bosch GmbH, and Deutsche Telekom AG, November 1995.
[23] Chuang Gu and Murat Kunt. Contour simplification and motion compensated
coding. Signal Processing: Image Communication, 7(4–6):279–296, November
1995.
[24] Draft revision of recommendation H.261: Video codec for audiovisual services at
px64 kbits/s, CCITT study group XV, TD 35, 1989. Signal Processing: Image
Communication, 2(2):221–239, August 1990.
[25] Video coding for low bitrate communication. Draft Recommendation H.263,
ITU-T, December 1995.
[27] Ali Habibi. Survey of adaptive image coding techniques. IEEE Transactions
on Communications, COM-25(11):1275–1284, November 1977.
[28] Robert M. Haralick and Linda G. Shapiro. Computer and Robot Vision,
volume I. Addison-Wesley Publishing Company, Inc., Reading, Massachusetts,
1992.
[31] Roy Hunter and A. Harry Robinson. International digital facsimile coding
standards. Proceedings of the IEEE, 68(7):854–867, July 1980.
[33] Rémi Jeannot, Demin Wang, and Véronique Haese-Coat. Binary image
representation and coding by a double-recursive morphological algorithm.
Signal Processing: Image Communication, 8(3):241–266, April 1996.
[34] Toru Kaneko and Masashi Okudaira. Encoding of arbitrary curves based on the
chain code representation. IEEE Transactions on Communications, 33(7):697–
707, July 1985.
[35] Jong-Lak Kim, Jong-Il Kim, Jong-Tae Lim, Jin-Hun Kim, Han-Soo Kim, Kyu-
Hwan Chang, and Seong-Dae Kim. Daewoo proposal for object scalability.
Technical Description ISO/IEC JTC1/SC29/WG11 MPEG96/0554, Daewoo
Electronics CO.LTD. and KAIST, January 1996.
[36] Weidong Kou. Digital Image Compression: Algorithms and Standards. Kluwer
Academic Publishers, Boston, 1995.
[39] Michael S. Landy and Yoav Cohen. Vectorgraph coding: Efficient coding of
line drawings. Computer Vision, Graphics, and Image Processing, 30:331–344,
1985.
[40] Yuh-Tay Liow. A contour tracing algorithm that preserves common boundaries
between regions. CVGIP: Image Understanding, 53(3):313–321, May 1991.
[42] Ferran Marqués, Josep Sauleda, and Antoni Gasull. Shape and location coding
for contour images. In Proceedings of the Picture Coding Symposium (PCS’93),
page 18.6, Lausanne, Switzerland, March 1993.
[44] Coding of moving pictures and associated audio for digital storage media up to
about 1,5 Mbit/s. International Standard 11172, ISO/IEC, 1993.
[45] Generic coding of moving pictures and associated audio information. Draft
Recommendation H.262, Draft International Standard 13818, ITU-T, ISO/IEC,
January 1995.
[46] Arun N. Netravali and John O. Limb. Picture coding: A review. Proceedings
of the IEEE, 68(3):366–406, March 1980.
[47] Kevin O’Connell and Damon Tull. Motorola MPEG-4 contour-coding tool
technical description. Technical Description ISO/IEC JTC1/SC29/WG11
MPEG95/0447, Motorola, November 1995.
[49] David W. Paglieroni and Anil K. Jain. A control point theory for boundary
representation and matching. In Proceedings of the International Conference
on Acoustics, Speech and Signal Processing (ICASSP’85), pages 1851–1854,
Tampa, Florida, 1985. IEEE, Signal Processing Society.
[50] Theo Pavlidis. Contour filling in raster graphics. Computer Graphics, 15(3):29–
36, July 1981.
[51] William B. Pennebaker, Joan L. Mitchell, Glen G. Langdon, Jr., and Ronald B.
Arps. An overview of the basic principles of the Q-Coder adaptive binary
arithmetic coder. IBM Journal of Research and Development, 32(6):717–726,
November 1988.
[52] Fernando Pereira. MPEG4: a new challenge for the representation of audio-
visual information. In Proceedings of the Picture Coding Symposium (PCS’96),
pages 7–16, Melbourne, Australia, March 1996.
[53] Rosalind W. Picard. Content access for image/video coding: “the fourth
criterion”. Technical Report 295, MIT Media Lab: Perceptual Computing
Section, 1994.
[56] Philippe Saint-Marc, Hillel Rom, and Gérard Medioni. B-spline contour
representation and symmetry detection. IEEE Transactions on Pattern
Analysis and Machine Intelligence, 15(11):1191–1197, November 1993.
[57] Jean Serra. Image Analysis and Mathematical Morphology, volume I. Academic
Press, Inc., San Diego, California, 1993.
[58] Uri Shani. Filling regions in binary raster images: a graph-theoretical approach.
Computer Graphics (SIGGRAPH’80 Proceedings), 14(3):321–327, July 1980.
[59] MPEG Video Subgroup. Core experiments on MPEG-4 video shape coding.
Document ISO/IEC JTC1/SC29/WG11 N1326, ISO, July 1996.
[61] John Y. A. Wang and Edward H. Adelson. Representing moving images with
layers. IEEE Transactions on Image Processing, 3(5):625–638, September 1994.
[62] Shuichi Watanabe, Hisashi Saiga, Hiroyuki Katata, and Hiroshi Kusao. Binary
shape coding based on hierarchical chain codes. Technical Description ISO/IEC
JTC1/SC29/WG11 MPEG96/1045, Sharp Corporation, July 1996.
[64] C. A. Wüthrich and Peter Stucki. An algorithmic comparison between square-
and hexagonal-based grids. CVGIP: Graphical Models and Image Processing,
53(4):324–339, July 1991.
[65] Jacob Ziv and Abraham Lempel. A universal algorithm for sequential data
compression. IEEE Transactions on Information Theory, IT-23(3):337–343,
May 1977.