
GENERAL REGION MERGING APPROACHES BASED ON INFORMATION THEORY STATISTICAL MEASURES

Felipe Calderero, Ferran Marques

Department of Signal Theory and Communications
Technical University of Catalonia (UPC), Barcelona, Spain
{felipe.calderero, ferran.marques}@upc.edu

ABSTRACT

This work presents a new statistical approach to region merging where regions are modeled as arbitrary discrete distributions, directly estimated from the pixel values. Under this framework, two region merging criteria are obtained from two different perspectives, leading to information theory statistical measures: the Kullback-Leibler divergence and the Bhattacharyya coefficient. The developed methods are size-dependent, which assures the size consistency of the partitions but reduces their size resolution. Thus, a size-independent extension of the previous methods, combined with a modified merging order, is also proposed. Additionally, an automatic criterion to select the most statistically significant partitions from the whole merging sequence is presented. Finally, all methods are evaluated and compared with other state-of-the-art region merging techniques.

Index Terms: Image segmentation, region merging, information theory, Kullback-Leibler divergence, Bhattacharyya coefficient

1. INTRODUCTION

Image segmentation is a key step in image analysis. However, in a large number of cases, a unique solution for the image segmentation problem does not exist, i.e., instead of a single optimal partition, it is possible to find different region-based explanations of an image, at different levels of resolution. To overcome this situation, a hierarchical segmentation approach can be used where, instead of a single partition, a hierarchy of partitions is provided.

Region merging techniques are an important type of hierarchical bottom-up segmentation approach. They are based on local decisions, made directly on the region features. Starting from an initial partition or from the collection of pixels, regions are iteratively merged until a stopping criterion is reached.
Thus, region merging algorithms are specified by: a merging criterion, defining the cost of merging two regions; a merging order, determining the sequence in which regions are merged based on the merging criterion; and a region model that determines how to represent the union of regions. Most attention has been focused on the improvement of the merging criterion, e.g., using criteria combination [1, 2], while applying a simpler region model and merging order. Additionally, most approaches assume either color homogeneous [1, 2] or textured regions [3], leading to separate approaches to image segmentation.

The motivation of the current work is to investigate an unsupervised region merging technique using a more general region model, and providing a general strategy with no assumptions about the nature of the regions. For that purpose, a statistical framework is proposed, where we can use well-known theoretical results in probability theory and information theory to formally develop merging criteria leading to the most statistically meaningful partitions.

Other statistical region merging approaches have been proposed earlier. However, they lack this generality. For instance, some of them consider different parameter-based families of probability distributions as region model [4]. In turn, [5] considers a nonparametric region model. Nevertheless, regions are assumed to hold a homogeneity property: inside any statistical region, pixels have the same expectation value for each color channel.

The rest of the paper is structured as follows. Section 2 proposes the empirical distribution of the region as general statistical model. Under this premise, two novel region merging criteria are developed in Section 3 from two different perspectives, leading to size-weighted statistical similarity measures based on the Kullback-Leibler divergence and on the Bhattacharyya coefficient. An alternative approach, combining a size-independent extension of the previous methods and a new scale-based merging order, is presented in Section 4. An automatic partition selection criterion is detailed in Section 5. Section 6 presents an objective evaluation and comparison with other techniques. Finally, conclusions are outlined in Section 7.

This work has been partly supported by the projects CENIT-2006 i3media and TEC2007-66858/TCM PROVEC of the Spanish Government.

2. GENERAL STATISTICAL REGION MODEL

From a statistical point of view, a single channel image can be considered as a realization of a 2D stochastic process. Therefore, each pixel is a sample of one of the discrete random variables composing the image process. For simplicity, all results in this work are obtained for single channel images; their extension to the multichannel case is straightforward under a channel independence assumption.

To formally tackle the image segmentation problem, we consider a region as a set of independent and identically distributed (i.i.d.) pixels, which is completely characterized by the probability distribution common to all pixels. We propose a region model based on the estimation of this probability mass function from the empirical distribution of the region.

The empirical distribution $P_{\mathbf{x}}$, or type, of a sequence $\mathbf{x}$ of $n$ samples from an alphabet $\mathcal{X} = \{a_1, a_2, \ldots, a_{|\mathcal{X}|}\}$ is defined as the relative proportion of occurrences of each value of $\mathcal{X}$, i.e., $P_{\mathbf{x}}(a) = N(a|\mathbf{x})/n$ for all $a \in \mathcal{X}$, where $N(a|\mathbf{x})$ is the number of times the symbol $a$ occurs in the sequence $\mathbf{x} \in \mathcal{X}^n$. Using a main result of the theory of types [6], the probability of a sequence $\mathbf{x}$ of i.i.d. observations with probability distribution $Q$ depends only on its type and is given by:

$$Q^n(\mathbf{x}) = 2^{-n\left(H(P_{\mathbf{x}}) + D(P_{\mathbf{x}}\|Q)\right)} \tag{1}$$

where $H(P_{\mathbf{x}}) = -\sum_{a \in \mathcal{X}} P_{\mathbf{x}}(a) \log P_{\mathbf{x}}(a)$ is the Shannon entropy of the type and $D(P_{\mathbf{x}}\|Q) = \sum_{a \in \mathcal{X}} P_{\mathbf{x}}(a) \log \frac{P_{\mathbf{x}}(a)}{Q(a)}$ is the Kullback-Leibler divergence between the statistical distributions. It can be seen [6] that the empirical distribution $P_{\mathbf{x}}$ converges to $Q$; concretely, $D(P_{\mathbf{x}}\|Q) \to 0$ with probability 1. Hence, for $n$ sufficiently large, the probability of a particular sequence can be approximated by:

$$Q^n(\mathbf{x}) \approx 2^{-nH(P_{\mathbf{x}})} \tag{2}$$

and the unknown distribution of the data, $Q$, can be directly approximated by the empirical distribution of the samples.

Using the empirical distribution provides a unified and general framework for image segmentation, as arbitrary discrete distributions are directly estimated from data. Apart from pixel independence, no further assumptions are made. Moreover, this model can be easily computed and, after the union of a pair of regions, updated:

$$P_{12}(a) = \frac{n_1}{n_1 + n_2} P_1(a) + \frac{n_2}{n_1 + n_2} P_2(a) \tag{3}$$

where $n_1$, $n_2$ are the number of pixels in $R_1$, $R_2$, respectively.

The quantization of the alphabet $\mathcal{X}$ can be set to optimize the performance of the algorithm. In this work, we only consider a uniform quantization and directly refer to the number of bins considered in the empirical distribution.

978-1-4244-1764-3/08/$25.00 ©2008 IEEE, ICIP 2008

3. AREA-WEIGHTED STATISTICAL MERGING CRITERIA FOR I.I.D. PIXEL REGION MODEL

3.1. Kullback-Leibler Merging Criterion

The first criterion is based on merging at each step the pair of adjacent regions maximizing the probability of being generated by the same statistical distribution. We tackle this problem as a pairwise hypothesis test. Assume $R_1$ and $R_2$ are two adjacent regions with empirical distributions $P_1$, $P_2$, respectively, whose union would generate a new region with empirical distribution $P_{12}$. Then, the two hypotheses considered are:

$H_0$: Pixels in the first region, $x_1 \in R_1$, and pixels in the second, $x_2 \in R_2$, are both distributed by $P_{12}$;
$H_1$: Pixels $x_1 \in R_1$ are distributed by $P_1$, and pixels $x_2 \in R_2$ are distributed by $P_2$.

In general, we wish to minimize both probabilities of error. The Neyman-Pearson lemma [6] proves that the optimal test for two hypotheses, in that sense, is the so-called likelihood ratio test:

$$\frac{P_{H_0}(x_1, x_2, \ldots, x_n)}{P_{H_1}(x_1, x_2, \ldots, x_n)} > T \tag{4}$$

Using the result in (2) for the probability of each sequence of pixels, we can write the log-likelihood ratio as:

$$\log \frac{P_{H_0}(x_1, x_2)}{P_{H_1}(x_1, x_2)} = -nH(P_{12}) + n_1 H(P_1) + n_2 H(P_2) \tag{5}$$

with $n = n_1 + n_2$, which can be interpreted as the size-weighted decrement on the entropy when the regions are merged. Considering (3) and the Kullback-Leibler divergence between statistical distributions, (5) can be rewritten as:

$$\log \frac{P_{H_0}(x_1, x_2)}{P_{H_1}(x_1, x_2)} = -n_1 D(P_1\|P_{12}) - n_2 D(P_2\|P_{12}) \tag{6}$$

Consequently, at each merging stage, the two adjacent regions (written as $R_i \sim R_j$) with maximum log-likelihood ratio should be merged. We will refer to this statistical criterion as the Kullback-Leibler merging criterion (KL), formally stated as:

$$\{R_1, R_2\} = \arg\max_{R_i \sim R_j} \; -n_i D(P_i\|P_{ij}) - n_j D(P_j\|P_{ij}) \tag{7}$$

This criterion is based on measuring the similarity between the empirical distributions of the regions and the empirical distribution of their merging, weighted by the size of the regions.

3.2. Bhattacharyya Merging Criterion

In this section we present a new criterion based on a direct statistical comparison between the types of the regions. In this case, the Kullback-Leibler divergence becomes impractical, as its convergence cannot be guaranteed anymore. For instance, $D(P_1\|P_2) = \infty$ if, for some $a \in \mathcal{X}$, $P_1(a) \neq 0$ and $P_2(a) = 0$.

We tackle the problem from a different perspective. Let us consider the probability simplex in $\mathbb{R}^n$, i.e., the $(n-1)$-dimensional manifold defined by all possible empirical distributions for a sequence of $n$ samples. Each region can be seen as a class in this space, centered at the point generated by its empirical distribution on the probability simplex. The exponent of the probability of error of such a classifier is bounded by the minimum Chernoff information between the statistical distributions of any pair of classes [6], defined as:

$$C(P_i, P_j) = -\min_{0 \le \lambda \le 1} \log \sum_{x} P_i^{\lambda}(x) P_j^{1-\lambda}(x) \tag{8}$$

In other words, the performance of a classifier is determined by the pair of closest classes in the probability simplex, in terms of the Kullback-Leibler divergence. In our case, we propose to merge the pair of regions with minimum Chernoff information, redefining the probability of error of a classifier as the probability of fusion in a clustering method. Hence, the bound on the error probability becomes a bound on the probability of merging. This way, the bound on the probability of merging for two adjacent regions, with types $P_i$, $P_j$ and number of pixels $n_i$, $n_j$, respectively, can be written as:

$$P_{\text{merging}} \le e^{-\min(n_i, n_j)\, C(P_i, P_j)} \tag{9}$$

Nevertheless, computing the Chernoff information implies an optimization over $\lambda$. To reduce this computational load, in practice, we propose to approximate the Chernoff information by the Bhattacharyya coefficient, which corresponds to the case $\lambda = 1/2$:

$$B(P_i, P_j) = -\log \sum_{x} P_i^{1/2}(x) P_j^{1/2}(x) \tag{10}$$

In conclusion, a statistical clustering approach leads to the merging of the adjacent pair of regions with maximum (bound of the) probability of fusion, or equivalently, maximizing its exponent:

$$\{R_1, R_2\} = \arg\max_{R_i \sim R_j} \; -\min(n_i, n_j)\, B(P_i, P_j) \tag{11}$$

This method is based on a size-weighted direct statistical measure of the empirical region distributions, and we will refer to it as the Bhattacharyya merging criterion (BHAT).
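The region model of Section 2 and the KL criterion of Section 3.1 can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `max_value` parameter, the 8-bit toy pixel values and the use of base-2 logarithms are assumptions made only for this example.

```python
import math


def empirical_distribution(pixels, num_bins, max_value=256):
    """Type of a pixel sequence: relative frequency of each uniformly quantized value."""
    counts = [0] * num_bins
    for value in pixels:
        counts[value * num_bins // max_value] += 1
    return [c / len(pixels) for c in counts]


def merge_models(p1, n1, p2, n2):
    """Eq. (3): the type of the union is the size-weighted mixture of both types."""
    return [(n1 * a + n2 * b) / (n1 + n2) for a, b in zip(p1, p2)]


def kl_divergence(p, q):
    """D(p||q) in bits; symbols with p(a) = 0 contribute nothing."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def kl_merging_score(p1, n1, p2, n2):
    """Objective of Eq. (7): -n1 D(P1||P12) - n2 D(P2||P12). Higher means more similar."""
    p12 = merge_models(p1, n1, p2, n2)
    return -n1 * kl_divergence(p1, p12) - n2 * kl_divergence(p2, p12)


# Toy 8-bit regions (made-up values): two dark regions and one bright region.
dark_a = [10, 20, 30, 40, 50, 60]
dark_b = [15, 25, 35, 45]
bright = [200, 210, 220, 230, 240, 250]
types = {name: empirical_distribution(px, num_bins=5)
         for name, px in [("dark_a", dark_a), ("dark_b", dark_b), ("bright", bright)]}
print(kl_merging_score(types["dark_a"], len(dark_a), types["dark_b"], len(dark_b)))
print(kl_merging_score(types["dark_a"], len(dark_a), types["bright"], len(bright)))
```

Because the merged type is a mixture of both region types, every symbol with nonzero probability in a region also has nonzero probability in the union, so the divergences in the score stay finite.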

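The Bhattacharyya criterion of Section 3.2 can be sketched in the same spirit. The grid search used for the Chernoff information is an assumption of this example, included only to show that the choice $\lambda = 1/2$ is one of the candidates of the minimization in (8); the function names and the sample distributions are hypothetical.

```python
import math


def bhattacharyya(p, q):
    """Eq. (10): B(p, q) = -log sum_x sqrt(p(x) q(x)); infinite for disjoint supports."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.inf if coefficient == 0 else -math.log(coefficient)


def chernoff(p, q, steps=100):
    """Eq. (8) approximated by a grid search over lambda in [0, 1] (illustrative only)."""
    best = min(
        sum(a ** (k / steps) * b ** (1 - k / steps) for a, b in zip(p, q) if a > 0 and b > 0)
        for k in range(steps + 1)
    )
    return math.inf if best == 0 else -math.log(best)


def bhat_merging_score(p1, n1, p2, n2):
    """Objective of Eq. (11): -min(n1, n2) * B(P1, P2). Higher means merge earlier."""
    return -min(n1, n2) * bhattacharyya(p1, p2)


# Made-up 3-bin types of two adjacent regions with 120 and 80 pixels.
p, q = [0.7, 0.2, 0.1], [0.1, 0.3, 0.6]
print(bhattacharyya(p, q), chernoff(p, q))
print(bhat_merging_score(p, 120, q, 80))
```

Unlike the Kullback-Leibler divergence between the two region types, the Bhattacharyya measure stays finite whenever the types share at least one symbol with nonzero probability.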

4. AREA-UNWEIGHTED STATISTICAL MERGING CRITERIA FOR I.I.D. PIXEL REGION MODEL

The obtained merging costs depend on the size of the involved regions, establishing, in some sense, the confidence of the estimated empirical models. This approach assures that the resulting partitions are size consistent, meaning that the area of the regions tends to increase as the number of regions in the partition decreases. The size term prioritizes the fusion of the smaller regions, slowing the merging of the larger regions, even when they are similarly distributed. On one hand, as it may be possible to make a mistake during the merging process, merging small regions causes less significant errors in terms of pixels, minimizing undersegmentation. On the other hand, as the fusion of larger regions is slowed even when they are similarly distributed, size-biased methods generally suffer from oversegmentation (see Section 6).

Therefore, the goal of this section is to propose an extension of the previous methods providing a trade-off between under- and oversegmentation, while increasing the size resolution of the partition. This is achieved by removing the size dependency from the merging criteria and incorporating it into the merging order to assure size consistency. Hence, under the assumption that regions are large enough to have a high confidence on the estimated distribution, the area dependency can be removed from the previous merging criteria:

Area-unweighted Kullback-Leibler merging criterion:

$$\{R_1, R_2\} = \arg\max_{R_i \sim R_j} \; -D(P_i\|P_{ij}) - D(P_j\|P_{ij}) \tag{12}$$

Area-unweighted Bhattacharyya merging criterion:

$$\{R_1, R_2\} = \arg\max_{R_i \sim R_j} \; -B(P_i, P_j) \tag{13}$$

In practice, we cannot always assure that the distribution of all regions is perfectly estimated, especially in early stages of the merging process. For this reason, and to assure the size consistency of the partitions, an agglomerative force is needed in the merging process. Our proposal is to combine the criteria in (12) and (13) with a scale-based merging order, incorporating the size consistency constraints. The idea is to define a scale threshold for each level of resolution. Regions below this threshold are considered out-of-scale and are merged with the highest priority, fusing them with their most similar region in the partition. When no out-of-scale regions remain, the algorithm continues merging in-scale regions normally. At each merging step the scale threshold is updated, and normal merging continues until new out-of-scale regions appear. The scale threshold is defined as:

$$T_{\text{scale}} = \alpha \, \frac{\text{Image Area}}{\text{Number of Regions}} \tag{14}$$

i.e., regions that are smaller than a given percentage of the mean region area at the current scale are considered out-of-scale. The parameter $\alpha$ controls the minimum resolution at each scale. Heuristically, we have found that values around $\alpha = 0.15$ provide a good compromise between under- and oversegmentation.

The benefit of this approach is that the fusion of large regions is not penalized once out-of-scale regions have been removed. All regions are equally likely to merge regardless of their size, because the merging cost only measures the statistical similarity of the empirical distributions, without being size biased.

5. PARTITION SELECTION CRITERION

Here we propose a partition selection criterion, i.e., an automatic technique to extract from the hierarchy of partitions the most statistically significant partitions at different resolutions. We expect statistically meaningful partitions to contain the most human-representative regions, for different levels of analysis. We have observed that the merging similarity sequences for area-weighted and area-unweighted methods present a similar behavior. Thus, the presented selection strategy is applied to all merging techniques proposed in this work.

The proposed strategy relies on selecting the partitions associated with a significant decrease in the sequence of merging similarities. Therefore, we consider a non-decreasing version of the sequence of merging similarities $s(n)$, where $n$ is the number of remaining regions, defined as $s_{\min}(n) = \min_{k \ge n}\{s(k)\}$. Determining the most important decrements on $s_{\min}(n)$ provides the set of statistically significant partitions. These partitions may be ordered using some significance index. Here, we propose an importance weight based on the relative increase with respect to the current similarity value. Given $\Delta(n) = \frac{d\, s_{\min}(n)}{dn}$ and a non-increasing version of $\Delta(n)$, $\Delta_{\max}(n) = \max_{k \ge n}\{\Delta(k)\}$, the importance weight is defined as:

$$\omega_k = \frac{\Delta_{\max}(k-1) - \Delta_{\max}(k)}{\Delta_{\max}(k)} \tag{15}$$

Examples of automatically selected partitions using this significance order are shown in Figure 1 (additional examples at http://gps-tsc.upc.es/imatge/ Felipe/icip08/ ). Note that, in general, the first selection corresponds to a coarse partition, whose regions are good approximations of the objects. Usually, the second proposal gives a finer partition with the most representative regions in the scene.

Fig. 1: Analysis of the partition selection criterion. Columns 1-2-3 and 4-5-6, from left to right: original image, first and second selected partitions using the significance index in (15). The methods used in each row, in descending order, are: KL area-weighted, BHAT area-weighted and BHAT area-unweighted (types quantized to 5 bins).

6. EXPERIMENTAL RESULTS

In all experiments in this section, our statistical region merging techniques were applied on an initial partition of the original image in HSV color space, computed using the watershed algorithm, to ensure that all initial regions are large enough to have a high confidence on the statistical model estimation.

A first set of experiments was performed over a subset of 100 images (10 images of 10 different complexity classes) from the Corel© image database [1]. Ground truth partitions were manually segmented in the context of the SCHEMA project (http://www.iti.gr/SCHEMA/). To evaluate the quality of the partitions created by the proposed methods, we use the distances defined in [7]. First, a symmetric distance $d_{\text{sym}}(P, Q)$ is proposed. This distance is defined in terms of the minimum number of pixels whose labels should be changed between regions in $P$ to achieve a perfect matching with $Q$ ($P$ and $Q$ become identical), normalized by the total number of pixels in the


image. This measure evaluates the global quality of a partition, and its compromise between under- and oversegmentation. The definition is extended to an asymmetric distance, $d_{\text{asym}}(P, Q)$. In this case, it measures the minimum number of pixels whose labels should be changed so that partition $Q$ becomes finer than partition $P$, normalized by the image size. Note that, in general, $d_{\text{asym}}(P, Q) \neq d_{\text{asym}}(Q, P)$. When $P$ is the partition to evaluate and $Q$ its ground truth partition, the first ordering measures the degree of oversegmentation, and the second, the undersegmentation in $P$ w.r.t. the ground truth partition.

Table 1 shows the mean symmetric distance between ground truth partitions and partitions generated by the proposed methods, both with the same number of regions (see Fig. 2 and http://gps-tsc.upc.es/imatge/ Felipe/icip08/ for an extension). These results are compared with the region merging technique proposed in [1], using the same watershed-based initial partitions. Its merging criterion combines color similarity and contour complexity of the regions, normalized by the component dynamic range, and was shown to outperform most color-based merging techniques. Note that all statistical criteria outperform [1] and, as expected, area-unweighted methods present the best trade-off between under- and oversegmentation.

In Figure 3, the results for the mean asymmetric distance for different numbers of regions are presented. Figure 3-left shows $d_{\text{asym}}(P, Q)$, measuring the degree of oversegmentation. In this case, area-weighted methods outperform area-unweighted methods, although the Bhattacharyya version obtains similar results. On the contrary, in Fig. 3-right, for $d_{\text{asym}}(Q, P)$, size-unbiased methods clearly present less undersegmentation. Note that, in general and for both results, the performance of the Bhattacharyya criterion is slightly superior to that of the Kullback-Leibler criterion.
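The asymmetric distance used in this evaluation can be sketched as follows. This is an assumed reading of the definition given above, not code from [7]: each region of $Q$ keeps its largest overlap with a single region of $P$, and all remaining pixels are counted as relabeled. The function name and the flattened label lists are hypothetical.

```python
from collections import Counter


def asymmetric_distance(labels_p, labels_q):
    """Fraction of pixels to relabel so that Q becomes finer than P.

    Assumed reading of d_asym(P, Q): each region of Q keeps only its largest
    overlap with a single region of P; all remaining pixels must change label.
    """
    overlap = Counter(zip(labels_q, labels_p))  # pixel count of every (q, p) intersection
    kept = {}
    for (q, _), count in overlap.items():
        kept[q] = max(kept.get(q, 0), count)
    return 1 - sum(kept.values()) / len(labels_p)


# Flattened label maps of a made-up 8-pixel image.
coarse = [0, 0, 0, 0, 1, 1, 1, 1]
fine = [0, 0, 1, 1, 2, 2, 3, 3]
print(asymmetric_distance(coarse, fine))  # fine is already a refinement of coarse
print(asymmetric_distance(fine, coarse))  # coarse would have to be split to refine fine
```

With the fine partition under evaluation and the coarse one as ground truth, only the first ordering of the arguments, (fine, coarse), gives a nonzero value, which matches the oversegmentation interpretation given in the text.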
Figure 4 presents an example of the performance variation for different numbers of bins in the normalized histogram of each region. Concretely, the asymmetric distance for the area-weighted version of the Bhattacharyya method is shown. We can conclude that good performance can be obtained with a reduced number of bins: 5 or 10 bins is a good compromise between the partition quality and the computational load of the algorithm. The remaining methods present a similar behavior and their plots are not shown due to space limitations.

Fig. 4: Asymmetric distance for the BHAT area-weighted method for different numbers of bins in the empirical distributions. Left: from computed to ground truth partition; right: vice versa.

Another evaluation in the context of texture segmentation has been performed on the benchmark system in [8]. The results are available online at http://mosaic.utia.cas.cz/.

7. CONCLUSIONS

We have proposed a family of general region merging techniques, developed under a statistical framework. The first set, the area-weighted methods, exhibits an excellent performance in terms of minimizing the merging error or undersegmentation. Nevertheless, an area-unweighted extension of the previous ones has been proposed to obtain a better trade-off between under- and oversegmentation. In addition, an automatic criterion to select the most statistically significant partitions has been presented. It allows implementing an unsupervised version of the previous techniques, while obtaining a set of meaningful partitions at different levels of detail.

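As a complement to the scale-based merging order of Section 4, the out-of-scale test can be sketched in a few lines. The function names, the parameter name `alpha` and the region sizes are assumptions of this example; only the threshold rule (a fraction of the mean region area, with values around 0.15) comes from the text.

```python
def scale_threshold(image_area, num_regions, alpha=0.15):
    """Scale threshold: a fraction alpha of the mean region area at the current scale."""
    return alpha * image_area / num_regions


def out_of_scale_regions(region_sizes, alpha=0.15):
    """Labels of the regions whose area lies below the current scale threshold."""
    threshold = scale_threshold(sum(region_sizes.values()), len(region_sizes), alpha)
    return [label for label, size in region_sizes.items() if threshold > size]


# Made-up partition of a 10000-pixel image into four regions.
sizes = {"sky": 5000, "grass": 4500, "speck": 40, "bird": 460}
print(out_of_scale_regions(sizes))
```

In a full merging loop, the regions returned here would be fused first with their most similar neighbor, and the threshold would be recomputed after every merge because the number of regions changes.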
8. REFERENCES

[1] V. Vilaplana and F. Marques, "On building a hierarchical region-based representation for generic image analysis," Proc. ICIP'07, vol. 4, pp. 325-328, 16-19 Sept. 2007.
[2] T. Adamek and N.E. O'Connor, "Using Dempster-Shafer theory to fuse multiple information sources in region-based segmentation," Proc. ICIP'07, vol. 2, pp. 269-272, 16-19 Sept. 2007.
[3] G. Scarpa, M. Haindl, and J. Zerubia, "A hierarchical finite-state model for texture segmentation," Proc. ICASSP'07, vol. 1, pp. 1209-1212, 15-20 April 2007.
[4] V. Gies and T.M. Bernard, "Statistical solution to watershed oversegmentation," Proc. ICIP'04, vol. 3, pp. 1863-1866, 24-27 Oct. 2004.
[5] R. Nock and F. Nielsen, "Statistical region merging," IEEE Trans. Pattern Anal. Mach. Intell., vol. 26, no. 11, pp. 1452-1458, Nov. 2004.
[6] T. Cover and J. Thomas, Elements of Information Theory, New York: John Wiley & Sons, Inc., second edition, 2006.
[7] J.S. Cardoso and L. Corte-Real, "Toward a generic evaluation of image segmentation," IEEE Trans. Image Process., vol. 14, no. 11, pp. 1773-1782, Nov. 2005.
[8] S. Mikeš and M. Haindl, "Prague texture segmentation data generator and benchmark," ERCIM News, no. 64, pp. 67-68, 2006.


Table 1: Symmetric distance for the subset of the Corel© database.

    Method                        Symmetric Distance
    1. Method in [1]              0.3809
    2. KL Area-weighted           0.3191
    3. KL Area-unweighted         0.2443
    4. BHAT Area-weighted         0.3081
    5. BHAT Area-unweighted       0.2330

Fig. 2: Merging criteria comparison. Columns, from left to right: original image, human partition, KL area-weighted, BHAT area-weighted, KL area-unweighted, BHAT area-unweighted (5 bins).

Fig. 3: Asymmetric distance for the subset of the Corel© database. Left: from computed to ground truth partition (degree of oversegmentation); right: vice versa (degree of undersegmentation). Statistical methods were computed using types quantized to 5 bins.
