IEEE TRANSACTIONS ON MEDICAL IMAGING, VOL. 33, NO. 5, MAY 2014 1137
Model-Based Classification Methods of Global Patterns in Dermoscopic Images

Aurora Sáez*, Student Member, IEEE, Carmen Serrano, Member, IEEE, and Begoña Acha, Member, IEEE

Abstract: This paper proposes several model-based methods for classifying global patterns in dermoscopic images. Global pattern identification is part of the pattern analysis framework, the melanoma diagnosis method most widely used among dermatologists. The modeling is performed at two levels. First, a dermoscopic image is modeled by a finite symmetric conditional Markov model applied to the $L^*a^*b^*$ color space, and the estimated parameters of this model are treated as features. In turn, the distribution of these features along a lesion is assumed to follow one of three models: a Gaussian model, a Gaussian mixture model, or a bag-of-features histogram model. In each case, the classification is carried out by an image retrieval approach with different distance metrics. The main objective is to classify a whole pigmented lesion into three possible patterns: globular, homogeneous, and reticular. The performance of each method has been extensively evaluated on an image database extracted from a public Atlas of Dermoscopy. The best classification success rate is achieved by the Gaussian mixture model-based method, with an average success rate of 78.44%. In a further evaluation, the multicomponent pattern is analyzed, obtaining a 72.91% success rate.

Index Terms: Bag of features, classification, distance metrics between models, Gaussian mixture model, global pattern, Markov random field (MRF).

I. INTRODUCTION

Dermoscopy is a noninvasive technique that assists dermatologists in the diagnosis of melanoma. It is a form of epiluminescence light microscopy that magnifies lesions and enables examination down to the dermoepidermal junction. There are four main methods for diagnosis from dermoscopic images: the ABCD rule, pattern analysis, the Menzies method, and the seven-point checklist.
These methods were evaluated during the 2000 Consensus Net Meeting on Dermoscopy (CNMD) [1] by experts from all over the world. Pattern analysis, considered the classic approach for diagnosis in dermoscopic images, was deemed superior to the other algorithms [1]. It is a methodology first described by Pehamberger et al. [2], based on the analysis of more than 3000 pigmented skin lesions, and later revised by Argenziano et al. [3]. This methodology defines the significant dermatoscopic patterns of pigmented skin lesions. Currently, it is the method most commonly used for providing diagnostic accuracy for cutaneous melanoma [4].

Manuscript received January 03, 2014; accepted February 04, 2014. Date of publication February 11, 2014; date of current version April 22, 2014. This work was supported in part by the project TEC2010-21619-C04-02, CICYT, Spain, and in part by the project P11-TIC-7727, Consejería de Innovación, Ciencia y Empresa, Junta de Andalucía, Spain. The work of A. S. was supported by the Consejería de Innovación, Ciencia y Empresa of Junta de Andalucía, Spain. Asterisk indicates corresponding author.
*A. Sáez is with the Signal Theory and Communications Department, University of Seville, 41092 Seville, Spain (e-mail: aurorasaez@us.es).
C. Serrano and B. Acha are with the Signal Theory and Communications Department, University of Seville, 41092 Seville, Spain (e-mail: cserrano@us.es; bacha@us.es).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TMI.2014.2305769

Pattern analysis seeks to identify specific patterns, which may be local or global. Melanocytic lesions are identified by their general dermoscopic features, which define their global pattern, or by specific dermoscopic criteria that determine their local patterns [4]. Thus, a lesion is categorized by a global pattern, although it can present more than one local pattern.
Global features permit a broad classification of pigmented skin lesions, while a description of the local features provides more detailed information about a given lesion [5]. The local features represent individual or grouped characteristics that appear in the lesion. The global features are presented as arrangements of textured patterns covering most of the lesion. The main global patterns are: Reticular pattern, Globular pattern, Cobblestone pattern, Homogeneous pattern, Parallel pattern, Starburst pattern, and Multicomponent pattern. They are associated with the predominant local pattern: Reticular pattern with pigment network, Globular pattern with globules, Cobblestone pattern with globules, Homogeneous pattern with pigmentation, Parallel pattern with furrows and ridges, Starburst pattern with streaks, and Multicomponent pattern with a combination of three or more of the above patterns.

The main aim of this paper is the classification of an entire pigmented lesion into Reticular pattern, Globular pattern, or Homogeneous pattern by texture analysis. In a further evaluation, the Multicomponent pattern is also analyzed. There are several reasons for this choice rather than addressing the classification of all seven patterns mentioned above. Globules are also predominant in the Cobblestone pattern; however, they are larger and more closely aggregated than in the Globular pattern, so Cobblestone can be considered a special case of the Globular pattern. Consequently, in our database, images belonging to the Cobblestone pattern have been included in the Globular class. Regarding the Parallel pattern, its automatic detection is of little interest to the clinical community because lesions with this pattern are located only on the palms or soles. The Starburst pattern is characterized by the presence of pigmented streaks at the edge of a given lesion. As our objective is the texture analysis of an entire lesion, this type of lesion falls outside the scope of our study.
Pattern analysis allows the dermatologist not only to distinguish between benign and malignant growth features but also to determine the type of a lesion. Each diagnostic category within the realm of pigmented skin lesions is characterized by a few global patterns and a rather distinctive combination of specific local features. Thus, 1) the Reticular pattern represents the dermoscopic hallmark of benign acquired melanocytic nevi in general and of thin melanomas in particular; 2) the Globular pattern and the Cobblestone pattern are commonly seen in congenital nevus, superficial type; 3) the Homogeneous pattern represents the morphologic hallmark of blue nevus, although it may also be present in Clark nevi, dermal nevi, and metastatic melanomas; 4) the Multicomponent pattern is highly suggestive of melanoma.

0278-0062 © 2014 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

Numerous works have focused on the extraction of local patterns [6]; however, only a few methods addressing the detection and/or classification of global patterns have been published in the literature. Tanaka et al. [7] extracted 110 texture features to classify a pattern into three categories: homogeneous, globular, and reticular. Gola et al. [8] presented a method based on edge detection, mathematical morphology, and color analysis to detect three global patterns (reticular, globular, and homogeneous), but based on the identification of the predominant local pattern: globules, pigment network, and blue pigmentation. Abbas et al. [9] extracted color features from the CIECAM02 representation and texture features from the steerable pyramid transform (SPT) of the dermoscopic image in order to classify it into the seven global patterns.
In a previous work [10], we addressed the classification of global patterns following a model-based technique. We proposed a method to automatically classify five types of global patterns (reticular, globular, cobblestone, homogeneous, and parallel), in which a Markov random field (MRF)-based texture modeling was performed. More recently, Sadeghi et al. [11] modeled texture with the joint probability distribution of filter responses to detect five patterns. However, these works classify patches extracted from a lesion rather than a whole lesion. To the best of our knowledge, only the work in [12] classifies an entire lesion, based on the model-based approach proposed in [10]. In Section V, the method described in [12] is tested on the database used in this paper in order to establish a comparison with the proposed methods.

In this work, we propose to identify the global pattern that a lesion presents by modeling in different ways. First, an image is modeled as an MRF in color space to obtain texture features. In turn, these texture features are supposed to follow different models: a Gaussian model, a Gaussian mixture model, and a bag-of-visual-words histogram model. Different distance metrics between Gaussian mixture distributions and between histograms are analyzed. A k-nearest neighbor algorithm based on these distance metrics is then applied, assigning to the test image the global pattern of the closest training image.

An image database extracted from the Interactive Atlas of Dermoscopy [3] is used for evaluation. The results of the proposed methods are compared with those of the method proposed in [12] on the same database.

Fig. 1. Sliding window.

The rest of the paper is organized as follows. In Section II, a segmentation algorithm to isolate the lesion from its surrounding skin is presented. A review of how a textured image is modeled as an MRF is found in Section III. In Section IV, the proposed classification methods are detailed. In Section V,
a review of the method proposed in [10] is presented, and the results are presented in Section VII. Finally, a discussion is presented.

II. SEGMENTATION

As the aim of the paper is the classification of a whole lesion into different types of global patterns, the first step is to isolate the lesion from the surrounding skin. An automatic segmentation process is crucial if the objective is the development of a computer-aided diagnosis (CAD) system. Therefore, an automatic segmentation algorithm is proposed.

The edge-based level set technique used in [13] is adopted as the segmentation method. In this kind of method, the basic idea is to represent contours as the zero level set of an implicit function defined in a higher dimension, usually referred to as the level set function. The challenge of a level set algorithm is to make this function evolve so that its zero level converges to the real boundaries in the image. The general level set equation is

$$\frac{\partial \phi}{\partial t} + F\,|\nabla \phi| = 0 \tag{1}$$

where $F$ represents the speed function and $\phi$ the level set function. The level set formulation used here was proposed by Li et al. [14], who show that (1) can be expressed as

$$\frac{\partial \phi}{\partial t} = \mu\,\mathrm{div}\!\left(d_p(|\nabla \phi|)\,\nabla \phi\right) + \lambda\,\delta_\varepsilon(\phi)\,\mathrm{div}\!\left(g\,\frac{\nabla \phi}{|\nabla \phi|}\right) + \alpha\,g\,\delta_\varepsilon(\phi) \tag{2}$$

where the first right-hand term is associated with a distance regularization and the second and third with an external energy with edge-based information. $g$ is an edge indicator function, which is responsible for driving the zero level curve towards the object boundaries. As Sáez et al. [13], [15] proposed, the function proposed in [14] is modified in the following way: (3), in which a Gaussian kernel smooths the image and the gradient term is the color vector gradient magnitude proposed in [13], as described below.

A pixel is defined as $(x, y, L^*, a^*, b^*)$, where $(x, y)$ refers to the spatial dimensions in the 2-D plane and $L^*$, $a^*$, and $b^*$ are the coordinates in the color space. In [16], the Sobel mask was generalized to the multidimensional case. To this aim, four vectors are constructed according to the notation used in Fig. 1.
Fig. 2. Steps followed in the segmentation process. (a) Original image with artifacts: hair and grid marker. (b) Smoothed image using an average filter. (c) First principal component image resulting from a PCA applied to the smoothed image in (b). (d) Otsu's thresholding applied to image (c). (e) Binary image after applying shape conditions in order to avoid artifacts. (f) Contour of the dilated image (e), which is treated as the initial contour. (g) Image resulting from the enhancement of the edge indicator function. (h) Segmented image by level-set technique. Final segmentation is indicated in red.

The gradients along the $x$ and $y$ directions are defined as in [13] [(4) and (5)], as is the gradient magnitude (6). In these expressions, the difference between two vectors is the CIE94 color difference [17], computed in the CIE $L^*a^*b^*$ color space. CIE94 was defined by the CIE to address perceptual nonuniformities of the $L^*a^*b^*$ color space.

A. Initial Contour

Level set methods require an initial contour to begin the process. Edge-based models fail to detect the boundaries when the initial contour is far from the desired object boundary [18]. To overcome this limitation, the proposed method first finds a relatively accurate initial contour. The following steps are proposed to find the initial contour automatically.

1) First, the original image [Fig. 2(a)] is smoothed with a 20 × 20 spatial averaging filter for multidimensional images [Fig. 2(b)]. Then, considering each pixel as a vector, principal component analysis (PCA) is applied to the image and the first component is retained [Fig. 2(c)].

2) Otsu's thresholding method [19] is applied to the first principal component image. In the resulting binary image, apart from the lesion, some artifacts, such as hair or grid markers, can be seen [Fig. 2(d)].

3) The artifacts are removed by imposing shape conditions.
As a lesion is supposed to approach a circle, the region of interest corresponds to the one with the largest area and the lowest eccentricity. The eccentricity is defined as the ratio of the distance between the foci of the ellipse that has the same second moments as the region to its major axis length. An ellipse whose eccentricity is 0 is actually a circle. The result of applying these two conditions is shown in Fig. 2(e).

4) The resulting region is dilated with a disk of five-pixel radius to ensure that the initial contour surrounds the lesion, since the contour of this dilated image is taken as the initial contour [Fig. 2(f)].

The edge indicator function (3), after histogram equalization and a subsequent linear expansion of its dynamic range to [0, 1], is shown in Fig. 2(g). The final segmentation is shown in Fig. 2(h).

In most level set schemes, the curve evolution stops when a fixed number of iterations is reached. In this work, however, we propose a different stopping condition: when the curve does not evolve in two consecutive iterations, the process is stopped, implying that the curve has reached an object boundary. It is important to note that, in spite of the presence of artifacts such as hair and grid markers, a good segmentation is achieved in all cases.

III. MARKOV RANDOM FIELD MODEL

Models based on MRFs have wide acceptance for solving texture analysis problems [20], [21]. They are able to capture the local (spatial) contextual information in an image. These models assume that the intensity at each pixel in the image depends on the intensities of the neighboring pixels [21]. As suggested by Xia et al. [22], in this paper a finite symmetric conditional Markov (FSCM) [23] model characterizes the observed image to obtain texture features.
These features are proposed as the basis of the different classification methods of global patterns in dermoscopic images, which are explained in the following section.

The MRF model is detailed as follows: an image is considered as a random field $X$, defined on a rectangular lattice $S$, which is indexed by the coordinate $(i, j)$. The gray-scale values are represented by $X = \{X_s = x_s : s \in S\}$, where $s = (i, j)$ denotes a specific site. However, in this work, as we proposed in [10], the random variable $X_s$ represents a color pixel in the $L^*a^*b^*$ color space instead of a gray-scale value in the range [0, 255]. Let an observed patch be an instance of $X$, defined in a square window centered on each site $s$. It can be described by an FSCM [23] as follows:

$$x_s = \mu_s + \sum_{t \in \eta_g} \beta_{s,t}\left[(x_{s+t} - \mu_{s+t}) + (x_{s-t} - \mu_{s-t})\right] + e_s \tag{7}$$

where $\eta_g = \{(0,1), (1,0), (1,1), (-1,1)\}$ is the set of shift vectors corresponding to the second-order neighborhood system, $\mu_s$ is the mean of the color pixels in the patch centered on site $s$, $\{\beta_{s,t} : t \in \eta_g\}$ is the set of correlation coefficients associated with the set of translations from every site $s$, and $\{e_s\}$ is a stationary Gaussian noise sequence with variance $\sigma_s^2$.

Based on this FSCM model, a texture feature vector is defined as

$$\mathbf{f} = \left(\mu, \sigma^2, \beta_t : t \in \eta_g\right) \tag{8}$$

where $\mu$ is the mean of the color pixels of the patch under study, $\sigma^2$ is the estimate of the noise variance, and the other four components, $\beta_t$, are the estimates of the correlation coefficients.

As mentioned above, in this paper these features are computed from the $L^*a^*b^*$ color space. In [10], we analyzed the use of this color space in two cases. In the first case, the six parameters obtained from the $L^*$ component were supposed to be independent of the 12 parameters calculated from the $a^*$ and $b^*$ components, yielding a feature vector that can be decomposed into two parameter vectors, one for $L^*$ and one for $a^*b^*$. In the second case, the feature vector was formed by 18 components, assuming dependence between $L^*$, $a^*$, and $b^*$. The results showed that the second assumption outperformed the first one.
For this reason, in this work we use the following feature vector of 18 components:

$$\mathbf{f} = \left(\mu_L, \mu_a, \mu_b, \sigma_L^2, \sigma_a^2, \sigma_b^2, \beta_{L,t}, \beta_{a,t}, \beta_{b,t} : t \in \eta_g\right) \tag{9}$$

where the subscripts $L$, $a$, and $b$ denote the color component and $t$ ranges over the four shift vectors $\eta_g$ of the second-order neighborhood system.

The parameters of the FSCM model are estimated by the least-squares estimation method proposed by Manjunath and Chellappa [24]. Consider a region (patch) containing a single texture. Let $N$ be the set of all the sites belonging to the patch under consideration and $N_I$ be the interior of the region $N$, i.e., $N_I = N - N_B$, with $N_B = \{s : s \in N$ and $s \pm t \notin N$ for at least some $t \in \eta_g\}$. The estimates are given by (10), (11), and (12), in which the neighboring color pixels at each site enter as a matrix with three columns, because each pixel is a 3-D color pixel. Because we are processing color images, the estimated parameters are likewise vector- or matrix-valued; two of them have dimensions (1 × 3). The estimation of all the parameters of the FSCM model requires products.

IV. PROPOSED MODEL-BASED CLASSIFICATION METHODS

In this section, the proposed model-based classification methods are detailed. The aim is the classification of a whole lesion, not only of a sample or patch of it.

It is important to note that, in this paper, two different training sets of images are used, depending on the method implemented. Complete lesions compose the first dataset, whereas the second set consists of individual patches, each patch extracted from a different lesion of the first dataset. These patches were extracted at random. The test set consists of complete lesions. None of the lesions included in the test dataset is included in the training dataset.

In order to analyze a whole lesion, the lesion is divided into overlapping patches. Taking into account that our images have a spatial resolution of 768 × 512 pixels, different patch sizes were tested: 40 × 40 (as proposed in [10]), 50 × 50 (as proposed in [12]), 81 × 81 (as proposed in [11]), and 100 × 100. Finally, the patch size was fixed to 81 × 81 pixels, achieving a trade-off between computational cost and a size large enough to distinguish and detect different textures. A displacement of nine rows and/or nine columns on the lesion is applied to obtain the next patch.
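As a rough illustration of the overlapping-patch division described above, the following sketch enumerates the top-left corners of 81 × 81 patches taken every nine rows and columns. The function name `patch_origins` is hypothetical, reading the 768 × 512 resolution as 512 rows by 768 columns is an assumption, and the sketch scans the full image rather than only the segmented lesion.

```python
def patch_origins(height, width, patch=81, step=9):
    """Return the (row, col) top-left corner of every patch x patch window.

    Windows are taken every `step` rows and columns and must fit entirely
    inside an image of size height x width (assumed layout: rows x columns).
    """
    rows = range(0, height - patch + 1, step)
    cols = range(0, width - patch + 1, step)
    return [(r, c) for r in rows for c in cols]


# Assumed orientation: 512 rows, 768 columns.
origins = patch_origins(512, 768)
```

With a stride of nine pixels, neighboring 81 × 81 patches share most of their area, which is why a single lesion yields a large number of feature vectors.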
A displacement of 27 rows and/or 27 columns, instead of nine, is shown in Fig. 3(e) and (f) so that it can be appreciated visually. Fig. 3 shows individual patches of the three global patterns under study as well as an example of a lesion divided into overlapping patches. Only patches without background, or with a background area of up to 10% of the patch area, are taken into account.

Fig. 3. Examples of the two image sets used. (a)-(c) First set: individual patches. 81 × 81 dermoscopic individual patches belonging to (a) globular pattern, (b) homogeneous pattern, and (c) reticular pattern. (d)-(f) Second image set: complete lesions. (d) 81 × 81 sample extracted from the whole lesion. (e) Displacement equal to 27 rows and/or 27 columns is applied to obtain the following sample. (f) Overlapping samples to analyze the whole lesion.

A. Gaussian Model-Based Method

This approach is based on the assumption that the MRF features of the patches or samples constituting a test lesion follow a multivariate Gaussian distribution model with mean $\mu$ and covariance matrix $\Sigma$

$$p(\mathbf{f}) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(\mathbf{f} - \mu)^T \Sigma^{-1} (\mathbf{f} - \mu)\right) \tag{13}$$

where $n$ is the dimension of the feature vector $\mathbf{f}$. Apart from this assumption, two different scenarios regarding the training set have been considered.

1) GM1: the training set consists of individual patches. The MRF features of each class $c$ in this training set are supposed to follow a multivariate Gaussian distribution with mean $\mu_c$ and covariance matrix $\Sigma_c$

$$\mathbf{f}_c \sim \mathcal{N}(\mu_c, \Sigma_c) \tag{14}$$

2) GM2: full lesions constitute the training set. The MRF features of the patches within each training lesion $j$ are supposed to follow a multivariate Gaussian distribution

$$\mathbf{f}_j \sim \mathcal{N}(\mu_j, \Sigma_j) \tag{15}$$

Different distance metrics are used in order to compare the multivariate Gaussian distribution of the test lesion with those from the training sets.
The symmetric Kullback-Leibler distance [25], the Bhattacharyya distance [26], and the Fréchet distance [27], which is the closed-form solution of the earth mover's distance (EMD) in the case of two Gaussian distributions, are analyzed. The closed-form expression for the symmetric KL divergence between two multivariate Gaussian distributions $\mathcal{N}(\mu_1, \Sigma_1)$ and $\mathcal{N}(\mu_2, \Sigma_2)$ of dimension $n$ can be written as [28]

$$D_{SKL} = \frac{1}{2}\,\mathbf{u}^T\left(\Sigma_1^{-1} + \Sigma_2^{-1}\right)\mathbf{u} + \frac{1}{2}\,\mathrm{tr}\!\left(\Sigma_1^{-1}\Sigma_2 + \Sigma_2^{-1}\Sigma_1 - 2I\right) \tag{16}$$

where $\mathbf{u} = \mu_1 - \mu_2$.

The Bhattacharyya distance between two Gaussian kernels is defined as

$$B = \frac{1}{8}\,\mathbf{u}^T\left(\frac{\Sigma_1 + \Sigma_2}{2}\right)^{-1}\mathbf{u} + \frac{1}{2}\ln\frac{\left|\frac{\Sigma_1 + \Sigma_2}{2}\right|}{\sqrt{|\Sigma_1|\,|\Sigma_2|}} \tag{17}$$

The Fréchet distance is computed as

$$d_F^2 = |\mu_1 - \mu_2|^2 + \mathrm{tr}\!\left(\Sigma_1 + \Sigma_2 - 2\,(\Sigma_1\Sigma_2)^{1/2}\right) \tag{18}$$

The k-nearest neighbor (KNN) algorithm with the aforementioned distances has been applied for the final classification. In the first scenario, a test image is identified with the pattern closest to it. In the second scenario, a KNN approach is applied so that the test image is assigned to the class of the training image closest to it.

B. Gaussian Mixture Model-Based Methods

According to Sfikas et al. [29], in the context of image retrieval it is advantageous to model the feature data using parametric probability density function models, such as Gaussian mixture models (GMM). In this approach, the MRF features extracted from the patches constituting a test lesion are supposed to follow a Gaussian mixture model. This model represents a probability density function (PDF) as

$$p(\mathbf{f}) = \sum_{k=1}^{K} \pi_k\,\mathcal{N}(\mathbf{f}; \mu_k, \Sigma_k) \tag{19}$$

where $K$ stands for the number of Gaussian kernels mixed, $\mu_k$ and $\Sigma_k$ are the mean vector and the covariance matrix of Gaussian kernel $k$, and $\pi_k$ are the mixing weights. These parameters and weights are estimated iteratively from the input MRF features using the expectation-maximization (EM) algorithm [30]. In three different tests, data were modeled with 3, 4, and 5 Gaussian kernels and the classification method was applied accordingly. The best classification results were obtained with a three-component Gaussian mixture model.

Based on this assumption, two further scenarios regarding the training set are considered, similarly to Section IV-A.

1) GMM1: individual patches constitute the training set.
The MRF features of the individual training patches belonging to each class follow a Gaussian mixture distribution

$$p(\mathbf{f} \mid c) = \sum_{k=1}^{K_c} \pi_{c,k}\,\mathcal{N}(\mathbf{f}; \mu_{c,k}, \Sigma_{c,k}) \tag{20}$$

where $c$ represents each global pattern, $K_c$ stands for the number of Gaussian kernels mixed for each pattern, $\mu_{c,k}$ and $\Sigma_{c,k}$ are the mean vectors and the covariance matrices of Gaussian kernel $k$, and $\pi_{c,k}$ are the mixing weights.

2) GMM2: the training set consists of full lesions, each of which is supposed to follow a Gaussian mixture distribution

$$p(\mathbf{f} \mid j) = \sum_{k=1}^{K_j} \pi_{j,k}\,\mathcal{N}(\mathbf{f}; \mu_{j,k}, \Sigma_{j,k}) \tag{21}$$

where $j$ indexes the training lesions.

As in the previous approach (see Section IV-A), the idea is to compare the Gaussian mixture model of a test lesion with the mixture distributions corresponding to the training sets. For this purpose, different distance metrics between Gaussian mixture models are used: the symmetric Kullback-Leibler divergence [29], the Bhattacharyya-based distance metric [29], EMD [31], and a distance metric proposed by Sfikas et al. [29].

The symmetric Kullback-Leibler divergence for GMMs can be computed as (22), an estimate that depends on the number of data samples generated from the two mixtures being compared; the mixtures themselves are estimated iteratively with expectation maximization (EM) [29].

The Bhattacharyya-based distance metric is computed as

$$BhGMM(p, p') = \sum_{i=1}^{N}\sum_{j=1}^{M} \pi_i\,\pi'_j\,B(p_i, p'_j) \tag{23}$$

where $p$, $p'$ are Gaussian mixture models consisting of $N$ and $M$ kernels, respectively; $p_i$, $p'_j$ denote the kernel parameters and $\pi_i$, $\pi'_j$ are the mixing weights. $B$ denotes the Bhattacharyya distance between two Gaussian kernels [(17)].

The EMD for GMMs is computed as (24), (25), where the ground distance between a kernel of one mixture and a kernel of the other is the Fréchet distance [(18)].

The distance metric proposed by Sfikas et al. [29] to compare Gaussian mixtures is computed as (26), (27), (28). It is expressed in terms of the mixing weights, indexes running over the Gaussian kernels, and the means and covariance matrices of the kernels of the two Gaussian mixtures being compared.

As in the previous section, a k-nearest neighbor algorithm is applied.
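The closed-form Gaussian distances of Section IV-A and the weighted Bhattacharyya combination for mixtures lend themselves to a short numerical sketch. The version below assumes the standard textbook forms of these distances (symmetric KL taken as the sum of the two directed divergences, squared Fréchet distance), which may differ from the paper's expressions by a constant factor. All function names are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def sym_kl(m1, S1, m2, S2):
    """Symmetric KL divergence KL(p1||p2) + KL(p2||p1) between two Gaussians."""
    u = m1 - m2
    P1, P2 = np.linalg.inv(S1), np.linalg.inv(S2)
    n = len(m1)
    return 0.5 * (u @ (P1 + P2) @ u + np.trace(P1 @ S2 + P2 @ S1) - 2 * n)


def bhattacharyya(m1, S1, m2, S2):
    """Bhattacharyya distance between two Gaussians."""
    u = m1 - m2
    S = 0.5 * (S1 + S2)
    # slogdet returns (sign, log|det|); log-determinants avoid overflow.
    _, ld = np.linalg.slogdet(S)
    _, ld1 = np.linalg.slogdet(S1)
    _, ld2 = np.linalg.slogdet(S2)
    return 0.125 * u @ np.linalg.solve(S, u) + 0.5 * (ld - 0.5 * (ld1 + ld2))


def frechet_sq(m1, S1, m2, S2):
    """Squared Frechet distance between two Gaussians."""
    u = m1 - m2
    # tr((S1 S2)^(1/2)) is the sum of square roots of the eigenvalues of S1 S2.
    ev = np.linalg.eigvals(S1 @ S2).real
    tr_sqrt = np.sqrt(np.clip(ev, 0.0, None)).sum()
    return u @ u + np.trace(S1) + np.trace(S2) - 2.0 * tr_sqrt


def bh_gmm(w1, comps1, w2, comps2):
    """Weighted sum of pairwise Bhattacharyya distances between two mixtures.

    w1, w2: mixing weights; comps1, comps2: lists of (mean, covariance) pairs.
    """
    return sum(a * b * bhattacharyya(*p, *q)
               for a, p in zip(w1, comps1)
               for b, q in zip(w2, comps2))
```

Because the nearest-neighbor rule only ranks distances, a constant scale factor in any of these expressions would not change which training distribution is selected.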
In the first case, a test image is identified with the closest pattern according to the four distances proposed above, and in the second case the test image is assigned to the pattern of the closest training image.

C. Bag of Features

The last approach is based on the representation of an image as a bag of features (BoF). This approach has its origin, on the one hand, in texture recognition by textons [32], [33] (basic elements of texture) and, on the other hand, in the bag-of-words scheme used for text categorization and text retrieval [34]. The idea is to model an image as a frequency histogram of visual words (bag of features). These visual words are built from the quantization of descriptors (in our case the descriptors are MRF features) of local patches sampled from the training set. This quantization is usually carried out by a clustering algorithm such as k-means. The centroid of each cluster represents a visual word. The set of visual words forms a codebook.

The bag-of-features representation is widely used in computer vision for the classification of natural scenes [35] or biomedical images such as histology images [36], different tissues [37], or mammography [38]. However, to the best of our knowledge, only a few works in the literature apply bag of features to pigmented lesions. Situ et al. [39], [40], and more recently Wadhawan et al. [41], applied this approach to classify lesions as malignant or benign. Recently, Barata et al. [42] also proposed a binary classification between malignant and benign lesions using bag of features, focusing on different strategies for extracting the interest regions from which features are computed. Hu et al. [43] proposed a nonuniform sampling strategy based on image patch saliency and pixel intensity that outperformed other sampling strategies in melanoma detection based on bag-of-features methods. Sadeghi et al.
[11] used an approach based on textons, applied to filter bank responses for each pixel, to identify different global patterns; however, only individual patches are classified, not whole lesions.

Unlike all the works cited above, in this paper bag of features is applied to MRF features. For each global pattern, the MRF features of the patches of the training lesions form an n-dimensional space. Specifically, we have (number of patches per image) × (number of training images per class) n-dimensional vectors per class located in this space. These n-dimensional vectors belonging to the same class are clustered by the K-means algorithm, obtaining K centroids or visual words for each pattern. These 3K (three patterns under study) visual words form the codebook. Then, all the patches from all training images are assigned to the closest visual word of the codebook. A histogram with the frequency of occurrence of each centroid or visual word is formed for each training image.

It is important to note that the initialization of the K-means algorithm is random, i.e., K observations are selected from the data at random. However, 10 initializations are performed, and the one that attains the lowest value of the K-means objective function is selected.

Fig. 4. Overview of the BoF approach to image classification.

Fig. 4 shows an overview of the proposed BoF approach to image classification. In the classification step, overlapping patches are extracted from a new test image and an n-dimensional vector of MRF features is estimated from each patch. Each n-dimensional vector is assigned to the nearest centroid in the codebook, so that for each lesion a histogram of cluster frequencies (bag of features) is formed. Finally, a classifier is applied to identify the training image whose histogram is closest to that of the image to be classified (see Fig. 4).
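The codebook construction and histogram step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes NumPy, a plain Lloyd-style K-means with 10 random restarts, and feature matrices with one row per patch. The names `kmeans`, `nearest`, `build_codebook`, and `bof_histogram` are hypothetical.

```python
import numpy as np


def nearest(X, C):
    """Index of the closest centroid in C for every row of X."""
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)


def kmeans(X, k, n_init=10, n_iter=50, seed=0):
    """Lloyd's K-means; keeps the restart with the lowest objective value."""
    rng = np.random.default_rng(seed)
    best_cost, best_centroids = np.inf, None
    for _ in range(n_init):
        # Random initialization: k observations drawn from the data.
        C = X[rng.choice(len(X), size=k, replace=False)].astype(float)
        for _ in range(n_iter):
            labels = nearest(X, C)
            for j in range(k):
                if np.any(labels == j):  # leave empty clusters unchanged
                    C[j] = X[labels == j].mean(axis=0)
        cost = ((X - C[nearest(X, C)]) ** 2).sum()
        if cost < best_cost:
            best_cost, best_centroids = cost, C.copy()
    return best_centroids


def build_codebook(features_by_class, k):
    """Cluster each class separately and stack the k centroids per class."""
    return np.vstack([kmeans(F, k) for F in features_by_class])


def bof_histogram(F, codebook):
    """Normalized frequency of each visual word over the patches of one image."""
    counts = np.bincount(nearest(F, codebook), minlength=len(codebook))
    return counts / counts.sum()
```

Clustering each class separately, rather than pooling all features, guarantees that every pattern contributes the same number of visual words to the codebook.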
A KNN algorithm with different histogram dissimilarity measures is proposed as the classifier. Five common histogram dissimilarity measures are used [44]: EMD, the $\chi^2$ statistic, histogram intersection, Kullback-Leibler divergence, and Kolmogorov-Smirnov distance.

According to [35], in natural image classification the larger the codebook, the better. However, Tomassi et al. [45] found that the size of the codebook is not a significant aspect in a medical image classification task. In this work, different codebook sizes are evaluated (see Fig. 8).

V. CLASSIFICATION METHOD BY COMPUTATION OF POSTERIOR PROBABILITY

The proposed methods are compared with the method proposed in [12], which is based on [10], applied to our database. In this section, the technique proposed in [10] is reviewed. First, it is important to note that in this method both the training set and the test set are composed of individual patches. However, in [12] a lesion was divided into overlapping patches, and the final classification of the whole lesion was made by a polling process.

The method is based on the MAP-MRF framework [46], which suggests that the optimal pattern under a feature set can be obtained by maximizing the posterior probability. The following assumption is introduced to calculate this probability: the features of the training patches of each class follow a multivariate Gaussian distribution (14) with mean vector $\mu_c$ and covariance matrix $\Sigma_c$, corresponding to the pattern $c$ they belong to. $c$ is an instance of a random variable $C$, taking values from a finite set.

Therefore, in order to classify each sample of the test lesion with feature vector $\mathbf{f}$ into the pattern it belongs to, the maximum a posteriori (MAP) criterion is applied together with the assumption that the three possible global patterns (globular, homogeneous, and reticular) are equally probable, which results in the maximum likelihood (ML) criterion. Then

$$\hat{c} = \arg\max_{c}\; p(\mathbf{f} \mid c) \tag{29}$$

where $\mathbf{f}$ is the vector of MRF features for the patch to be classified.
This ML problem can then be solved by minimizing the following energy:

$$E(c) = (\mathbf{f} - \mu_c)^T \Sigma_c^{-1} (\mathbf{f} - \mu_c) + \ln\!\left((2\pi)^n\,|\Sigma_c|\right) \tag{30}$$

where $\mu_c$ and $\Sigma_c$ are the mean vector and covariance matrix of pattern $c$ and $n$ is the dimension of the feature vector. In [12], each patch of the lesion is classified in this way and the whole lesion is assigned to the global pattern that receives the most votes.

VI. IMAGE DATABASE

The image database used in this work is formed by 30 images of each type of pattern, a total of 90 images. The 30 images of each global pattern were randomly chosen. However, it should be emphasized that some low-quality images (blurry or low-contrast images) had to be replaced. This is because the images were acquired in different hospitals without following an acquisition protocol. As already mentioned, globules are predominant in both the Globular and Cobblestone patterns; in the latter, however, they are larger and more closely aggregated than in the Globular pattern. Thus, the Cobblestone pattern can be considered a special case of the Globular pattern. Eight of the 30 images categorized as globular pattern belong to the Cobblestone pattern.

Fig. 5. Examples of images from the database. (a) Image classified as globular pattern. (b) Image classified as reticular pattern. (c) Image classified as homogeneous pattern.

All images were extracted from the Interactive Atlas of Dermoscopy, published by Edra Medical Publishing New Media [3], which is a multimedia project for medical education with images of pigmented skin lesions from different centers and hospitals. The selected database includes both images with a clear diagnosis and images that are difficult to classify, depending on the type of lesion.

Each image presents a unique global pattern. This unique label does not mean that the lesion has only one local pattern, i.e., a lesion can show different local features although it is assigned to only one global pattern. Usually, a global pattern is determined by the predominant local pattern in a lesion.
Therefore, considering that the global pattern is determined by the dermoscopic feature predominant in the lesion, its automated classification becomes hard owing to the possible presence of different local patterns in the same lesion. An example can be seen in Fig. 5(a), where the lesion was classified as globular although its inferior part is reticular. Fig. 5(b) shows a lesion that was classified as reticular but also presents globules and homogeneous areas.

Besides this intrinsic difficulty, the images from this Atlas of Dermoscopy present two further difficulties for automatic classification: intra-class variability (lesions belonging to the same global pattern with very different appearance) and inter-class similarity (lesions belonging to different global patterns with a similar appearance).

Moreover, the virtual consensus net meeting on dermoscopy (CNMD) [1] was organized to investigate the interobserver and intraobserver reproducibility and validity of the various features and diagnostic algorithms. Reproducibility was assessed according to the method of Fleiss et al. [47] to calculate the kappa ($\kappa$) statistic, which is interpreted as follows: a value of 1.0 indicates perfect agreement, and values below 0.4 indicate poor agreement. A total of 128 dermoscopic images of pigmented skin lesions were selected for inclusion in that study. Interobserver reproducibility was computed among 40 observers. To test intraobserver agreement, 20 lesions were randomly selected and included for re-examination. The $\kappa$ values reported in [1] for the diagnosis of global patterns, for both interobserver and intraobserver agreement, show how difficult this diagnosis is. In this work, the ground truth is the diagnosis given in the Interactive Atlas of Dermoscopy [3].

VII. EVALUATION AND RESULTS

To evaluate the performance of the proposed methods, the classification success rate is computed using k-fold cross-validation.
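The evaluation protocol of this section, k-fold cross-validation repeated with different random splits at the image level, can be sketched as follows. This is a minimal illustration rather than the authors' code: `repeated_kfold`, `mean_success_rate`, and the `evaluate` callback (assumed to train on the `train` images, classify the `test` images, and return a success rate) are hypothetical names. The defaults mirror the 20-times three-fold setting used in this paper, and splitting by image index keeps every image out of the training set whenever it is being tested.

```python
import random


def repeated_kfold(n_images, k=3, repeats=20, seed=0):
    """Yield (train, test) lists of image indices for `repeats` random k-fold partitions."""
    rng = random.Random(seed)
    indices = list(range(n_images))
    for _ in range(repeats):
        rng.shuffle(indices)  # a new random partition for every repetition
        folds = [indices[i::k] for i in range(k)]
        for i, test in enumerate(folds):
            train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
            yield train, sorted(test)


def mean_success_rate(n_images, evaluate, **kwargs):
    """Average the success rates returned by the hypothetical `evaluate(train, test)`."""
    rates = [evaluate(train, test) for train, test in repeated_kfold(n_images, **kwargs)]
    return sum(rates) / len(rates)
```

With 90 images, each repetition produces three folds of 30 test images and 60 training images, so 60 success rates are averaged in total.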
There are two goals in cross-validation [48]: to estimate the performance of the model learned from the available data using one algorithm, in other words, to gauge the generalization of an algorithm; and to compare the performance of different algorithms and find the best one for the available data.

It is important to note that in no case, whether individual patches or entire images constitute the training set, is an image from the training dataset used in the test dataset. In addition, as remarked above, each patch in the training dataset is extracted from a different image.

In k-fold cross-validation, the image database is randomly partitioned into k equal-size groups or splits. A single group is used as the validation data for testing the model, and the remaining k-1 groups are used as training data. The cross-validation process is then repeated k times (folds), with each of the k groups used exactly once as the validation data. The k results from the folds are then averaged to produce a single estimate. The advantage of this method is that all images are used for both training and validation, and each image is used for validation exactly once. However, owing to the variability of our database, as mentioned in Section VI, the choice of the k splits has a strong influence on the results. A full cross-validation, i.e., performing all possible ways of partitioning, would give an accurate estimate, but it is computationally too expensive. Repeating k-fold cross-validation r times, using different random splits for each run, instead provides a good Monte Carlo estimate of the full cross-validation [49]. In this paper, we used a 20-times three-fold cross-validation.

Fig. 6. Performance of the methods whose training set consists of individual patches. GM1 assumes a multivariate Gaussian model (Section IV-A) and GMM1 a Gaussian mixture model (Section IV-B). Different distance metrics between probability density functions are evaluated: Bhattacharyya distance (Batt.), EMD, Kullback–Leibler divergence (Kul.), and the distance proposed in [29] (C2). The y-axis shows the classification success rate.

Fig. 7. Performance of the methods whose training set consists of complete lesions. GM2 assumes a multivariate Gaussian model (Section IV-A) and GMM2 a Gaussian mixture model (Section IV-B). Different distance metrics between probability density functions are evaluated: Bhattacharyya distance (Batt.), EMD, Kullback–Leibler divergence (Kul.), and the distance proposed in [29] (C2). The y-axis shows the classification success rate.

The success rates of the methods proposed in Sections IV-A and IV-B are shown in Figs. 6 and 7, where the proposed distance metrics are evaluated. It can be seen that a different distance performs best for each method. Among the methods whose training set consists of individual patches (Fig. 6), GMM1 with the distance C2 outperforms the rest. Among the methods whose training set consists of complete lesions (Fig. 7), GMM2 is slightly better in terms of correct classification; in this case, however, EMD is the distance that provides the best results.

Regarding the Bag of Features approach (Section IV-C), Fig. 8 shows the performance for the different histogram dissimilarity measures: EMD, the $\chi^2$ statistic, histogram intersection (Hist.), Kolmogorov–Smirnov distance (Kol.), and Kullback–Leibler divergence (Kul.). They have been evaluated with different numbers of centroids, or visual words. In view of the results in Fig. 8, the number of visual words does not seem to influence the success rate significantly. However, the $\chi^2$ distance using 20 centroids per class (60 visual words in total) achieved the best result.

Fig. 8. Performance of the Bag of Features approach when different codebook sizes are used. KNN with different histogram dissimilarity measures: EMD, $\chi^2$ statistic, histogram intersection (Hist.), Kolmogorov–Smirnov distance (Kol.), and Kullback–Leibler divergence (Kul.). The y-axis shows the classification success rate.

TABLE I. CLASSIFICATION RESULTS FOR THE PROPOSED METHODS COMPARED WITH THE METHOD PROPOSED IN [12]. BOLD TEXT INDICATES THE BEST SUCCESS RATES OBTAINED

Table I shows the classification success rate for all the proposed methods. For each method, the distance that provides the highest classification rate in Figs. 6–8 is presented. In addition, the classification success rate obtained in the identification of each global pattern is shown. In general, the homogeneous pattern is identified with a success rate of over 90% in all cases, whereas this rate decreases for the globular and reticular patterns. It can be concluded that the Gaussian mixture model-based methods outperform the rest on average.

The method proposed in [12] has been included in the evaluation (see Table I). The results show that the proposed methods perform significantly better.

Once a successful global pattern classification had been obtained, a further evaluation was performed in which the multicomponent pattern was included. This newly added pattern is characterized by the presence of three or more patterns within a lesion. Thirty images of melanomas with multicomponent pattern were chosen randomly from the Interactive Atlas of Dermoscopy [3]. Examples are shown in Fig. 9. The classification into four categories or patterns (globular, homogeneous, reticular, and multicomponent) was performed with the best model in the previous experiments (GMM). Specifically, GMM2 was applied, to take into account that a multicomponent lesion contains various patterns. Table II presents the results. The inclusion of this fourth pattern in the classification procedure reduces the success rate by only 5.53 percentage points (from 78.44% to 72.91%). These promising results show the potential of this system for early melanoma diagnosis.

Fig. 9. Examples of melanoma images with multicomponent pattern.

TABLE II. CLASSIFICATION RESULTS FOR THE GMM2 METHOD WHEN LESIONS WITH MULTICOMPONENT PATTERN ARE INCLUDED IN THE STUDY

VIII. SUMMARY AND CONCLUSION

In this paper, different classification methods for global dermoscopic patterns have been proposed. The aim is to classify each lesion as a particular global pattern. This unique-label classification is motivated by the fact that a lesion is characterized by one global pattern and by one or more local patterns. The majority of the classification approaches in the literature are based on a feature extraction step followed by a classifier whose inputs are the extracted features. In contrast, this paper proposes techniques based on modeling in two senses. First, an image is modeled by an MRF on the color space, and the estimated parameters of this model are treated as features. Then, these features within a lesion are assumed to follow one of three different models. In the first, the features of a lesion are assumed to follow a multivariate Gaussian distribution; the idea is to measure distances between Gaussian models (GM) and then apply a KNN algorithm. The same idea underlies the second approach, in which a GMM replaces the GM. As in the previous case, different distance metrics between GMMs are analyzed. The third model-based classification technique is a Bag of Features approach, in which an image is modeled by a frequency histogram of visual words. In this case, different distances between histograms have been studied.

A database from the Interactive Atlas of Dermoscopy [3] has been chosen.
A comparative study of the classification success rates obtained by all these methods is presented. In view of the results, it can be concluded that the methods based on the assumption that the MRF features, which characterize a lesion and consequently a global pattern, follow a Gaussian mixture model outperform the rest, obtaining a 78.44% success rate on average. This method was used to perform a further evaluation in which the multicomponent pattern, which is highly suggestive of melanoma, was included. A 72.91% success rate was obtained, which shows the potential of this system for early melanoma diagnosis.

Moreover, the proposed model-based methods have been compared with the method proposed in the previous work [10], in which individual patches were classified into five global patterns, and later applied to whole lesions in [12]. The results show the superiority of the methods presented here. This work constitutes the first stage of a computer-aided system for the classification of malignant pigmented lesions.

IX. DISCUSSION

The first novelty presented in this paper is that the MRF features within a lesion are modeled for classification purposes. In other words, a multidimensional histogram is formed with the features within a lesion, and this histogram is modeled with a particular density model. Classification is then performed by comparing histograms or density models with specific dissimilarity functions. Other authors [50], [51] modeled pixel distributions as multivariate Gaussian distributions for segmentation tasks. In contrast, in this paper features rather than pixel values are modeled, and the models are applied to texture classification rather than segmentation.

In the previous work [10], we used MRF features to model global patterns in patches, but we modeled the MRF features within a class as a Gaussian distribution. In this work, by contrast, the distribution of MRF features within a lesion is analyzed for the first time.
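The idea of modeling the features within each lesion and classifying by comparing the resulting density models can be illustrated with a minimal Python sketch. It is deliberately simplified and should not be read as the method evaluated in this paper: each lesion is summarized by a Gaussian with diagonal covariance (an assumption made here for brevity, whereas the paper uses full multivariate Gaussians and Gaussian mixtures), the standard closed-form Bhattacharyya distance is used, and all function names are hypothetical. Variances are assumed to be strictly positive.

```python
import math
from collections import Counter
from statistics import fmean, pvariance


def fit_diag_gaussian(features):
    """Per-dimension mean and variance of one lesion's feature vectors (diagonal model)."""
    dims = list(zip(*features))
    return [fmean(d) for d in dims], [pvariance(d) for d in dims]


def bhattacharyya(model_a, model_b):
    """Closed-form Bhattacharyya distance between two diagonal-covariance Gaussians."""
    (mean_a, var_a), (mean_b, var_b) = model_a, model_b
    dist = 0.0
    for ma, va, mb, vb in zip(mean_a, var_a, mean_b, var_b):
        v = 0.5 * (va + vb)  # variance of the averaged covariance
        dist += 0.125 * (ma - mb) ** 2 / v + 0.5 * math.log(v / math.sqrt(va * vb))
    return dist


def knn_classify(query, training, k=3):
    """training: list of (model, label); returns the majority label of the k nearest models."""
    nearest = sorted(training, key=lambda item: bhattacharyya(query, item[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]
```

A test lesion would be handled by calling `fit_diag_gaussian` on the feature vectors of its patches and passing the result to `knn_classify` together with the models fitted on the training lesions; swapping `bhattacharyya` for another dissimilarity changes only the `key` function.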
The authors also apply the concept of bag of visual words to the MRF features of pigmented lesions. It is a very convenient approach for analyzing a very sparse histogram in an n-dimensional space. Previously, Sadeghi et al. [11] applied the concept of textons to the problem of global pattern classification. Unlike our approach, however, they applied the texton concept to pixels within a patch and used the outputs of filter banks as features. In our proposal, the bag-of-features approach is applied to the MRF features of the different patches within a lesion.

Finally, it should be pointed out that no previous attempts at model-based global pattern classification of full lesions can be found in the literature.

REFERENCES

[1] G. Argenziano, H. Soyer, S. Chimenti, et al., "Dermoscopy of pigmented skin lesions: Results of a consensus meeting via the internet," J. Am. Acad. Dermatol., vol. 48, no. 5, pp. 679–693, 2003.
[2] H. Pehamberger, A. Steiner, and K. Wolff, "In vivo epiluminescence microscopy of pigmented skin lesions. I. Pattern analysis of pigmented skin lesions," J. Am. Acad. Dermatol., vol. 17, no. 4, pp. 571–583, 1987.
[3] G. Argenziano, H. Soyer, et al., Interactive Atlas of Dermoscopy. Milan, Italy: EDRA-Medical Publishing New Media, 2000.
[4] G. Rezze, B. De Sá, and R. Neves, "Dermoscopy: The pattern analysis," Anais Brasileiros Dermatologia, vol. 81, no. 3, pp. 261–268, 2006.
[5] H. Soyer, G. Argenziano, V. Ruocco, and S. Chimenti, "Dermoscopy of pigmented skin lesions (Part II)," Eur. J. Dermatol., vol. 11, no. 5, pp. 483–498, 2001.
[6] A. Sáez, B. Acha, and C. Serrano, "Pattern analysis in dermoscopic images," in Computer Vision Techniques for the Diagnosis of Skin Cancer, ser. Bioengineering, J. Scharcanski and M. E. Celebi, Eds. New York: Springer, 2013, ch. 2.
[7] T. Tanaka, S. Torii, I. Kabuta, K. Shimizu, and M. Tanaka, "Pattern classification of nevus with texture analysis," IEEJ Trans. Electr. Electron. Eng., vol. 3, no. 1, pp. 143–150, 2008.
[8] A. Gola Isasi, B. García Zapirain, and A. Méndez Zorrilla, "Melanomas non-invasive diagnosis application based on the ABCD rule and pattern recognition image processing algorithms," Comput. Biol. Med., vol. 41, no. 9, pp. 742–755, 2011.
[9] Q. Abbas, M. Celebi, C. Serrano, I. Fondón García, and G. Ma, "Pattern classification of dermoscopy images: A perceptually uniform model," Pattern Recognit., vol. 46, no. 1, pp. 86–97, 2013.
[10] C. Serrano and B. Acha, "Pattern analysis of dermoscopic images based on Markov random fields," Pattern Recognit., vol. 42, no. 6, pp. 1052–1057, 2009.
[11] M. Sadeghi, T. Lee, D. McLean, H. Lui, and M. Atkins, "Global pattern analysis and classification of dermoscopic images using textons," in Proc. SPIE Progr. Biomed. Opt. Imag., 2012, vol. 8314.
[12] C. Mendoza, C. Serrano, and B. Acha, "Pattern analysis of dermoscopic images based on FSCM color Markov random fields," in Advanced Concepts for Intelligent Vision Systems. New York: Springer, 2009, vol. 5807, Lecture Notes in Computer Science, pp. 676–685.
[13] A. Sáez, C. S. Mendoza, B. Acha, and C. Serrano, "Development and evaluation of perceptually adapted colour gradients," IET Image Process., vol. 7, no. 4, pp. 355–363, 2013.
[14] C. Li, C. Xu, C. Gui, and M. Fox, "Level set evolution without re-initialization: A new variational formulation," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., 2005, vol. 1, pp. 430–436.
[15] A. Sáez, I. Fondón, B. Acha, S. Jiménez, P. Alemany, Q. Abbas, and C. Serrano, "Optic disc segmentation based on level-set and colour gradients," in Proc. 6th Eur. Conf. Colour Graphics, Imag., Vis., 2012, pp. 121–125.
[16] K. Plataniotis and A. Venetsanopoulos, Color Image Processing and Applications, ser. Digital Signal Process. New York: Springer, 2000.
[17] Industrial Colour-Difference Evaluation, Vienna, Austria, CIE Pub. 116, 1995.
[18] K. Zhang, L. Zhang, H. Song, and W. Zhou, "Active contours with selective local or global segmentation: A new formulation and level set method," Image Vis. Comput., vol. 28, no. 4, pp. 668–676, 2010.
[19] N. Otsu, "A threshold selection method from gray-level histograms," IEEE Trans. Syst., Man Cybern., vol. 9, no. 1, pp. 62–66, Jan. 1979.
[20] G. R. Cross and A. K. Jain, "Markov random field texture models," IEEE Trans. Pattern Anal. Mach. Intell., vol. 5, no. 1, pp. 25–39, Jan. 1983.
[21] M. Tuceryan and A. Jain, "Texture analysis," in Handbook of Pattern Recognition and Vision, 2nd ed., C. H. Chen, L. F. Pau, and P. S. P. Wang, Eds. River Edge, NJ: World Scientific, 1998.
[22] Y. Xia, D. Feng, and R. Zhao, "Adaptive segmentation of textured images by using the coupled Markov random field model," IEEE Trans. Image Process., vol. 15, no. 11, pp. 3559–3566, Nov. 2006.
[23] R. L. Kashyap and R. Chellappa, "Estimation and choice of neighbors in spatial-interaction models of images," IEEE Trans. Inf. Theory, vol. 29, no. 1, pp. 60–72, Jan. 1983.
[24] B. Manjunath and R. Chellappa, "Unsupervised texture segmentation using Markov random field models," IEEE Trans. Pattern Anal. Mach. Intell., vol. 13, no. 5, pp. 478–482, May 1991.
[25] S. Kullback, Information Theory and Statistics. Mineola, NY: Dover, 1997.
[26] A. Bhattacharyya, "On a measure of divergence between two statistical populations defined by their probability distributions," Bull. Calcutta Math. Soc., vol. 35, pp. 99–109, 1943.
[27] D. Dowson and B. Landau, "The Fréchet distance between multivariate normal distributions," J. Multivariate Anal., vol. 12, no. 3, pp. 450–455, 1982.
[28] K. Abou-Moustafa, F. De La Torre, and F. Ferrie, "Designing a metric for the difference between Gaussian densities," Adv. Intell. Soft Comput., vol. 83, pp. 57–70, 2010.
[29] G. Sfikas, C. Constantinopoulos, A. Likas, and N. Galatsanos, An Analytic Distance Metric for Gaussian Mixture Models With Application in Image Retrieval. New York: Springer, 2005, vol. 3697, LNCS, pp. 835–840.
[30] G. McLachlan and D. Peel, Finite Mixture Models, ser. Wiley Ser. Probabil. Stat., 1st ed. New York: Wiley-Interscience, 2000.
[31] H. Greenspan, G. Dvir, and Y. Rubner, "Context-dependent segmentation and matching in image databases," Comput. Vis. Image Understand., vol. 93, no. 1, pp. 86–109, 2004.
[32] B. Julesz, "Textons, the elements of texture perception, and their interactions," Nature, vol. 290, no. 5802, pp. 91–97, 1981.
[33] M. Varma and A. Zisserman, "Texture classification: Are filter banks necessary?," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., 2003, vol. 2, pp. II/691–II/698.
[34] D. D. Lewis, "Naive (Bayes) at forty: The independence assumption in information retrieval," in Proc. 10th Eur. Conf. Mach. Learn., 1998, pp. 4–15.
[35] G. Csurka, C. R. Dance, L. Fan, J. Willamowski, and C. Bray, "Visual categorization with bags of keypoints," in Workshop Stat. Learn. Comput. Vis., ECCV, 2004, pp. 1–22.
[36] A. Cruz-Roa, J. Caicedo, and F. González, "Visual pattern mining in histology image collections using bag of features," Artif. Intell. Med., vol. 52, no. 2, pp. 91–106, 2011.
[37] J.-Y. Wang, H. Bensmail, and X. Gao, "Joint learning and weighting of visual vocabulary for bag-of-feature based tissue classification," Pattern Recognit., vol. 46, no. 12, pp. 3249–3255, 2013.
[38] A. Bosch, X. Muñoz, A. Oliver, and J. Martí, "Modeling and classifying breast tissue density in mammograms," in Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., 2006, vol. 2, pp. 1552–1558.
[39] N. Situ, X. Yuan, J. Chen, G. Zouridakis, and X. Yuan, "Malignant melanoma detection by bag-of-features classification," in Proc. 30th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., 2008, pp. 3110–3113.
[40] N. Situ, T. Wadhawan, X. Yuan, and G. Zouridakis, "Modeling spatial relation in skin lesion images by the graph walk kernel," in Proc. Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., 2010, vol. 2010, pp. 6130–6133.
[41] T. Wadhawan, N. Situ, K. Lancaster, X. Yuan, and G. Zouridakis, "Skinscan: A portable library for melanoma detection on handheld devices," in Proc. 2011 IEEE Int. Symp. Biomed. Imag.: From Nano to Macro, 2011, pp. 133–136.
[42] C. Barata, J. S. Marques, and T. Mendonça, "Bag-of-features classification model for the diagnose of melanoma in dermoscopy images using color and texture descriptors," in Proc. 10th Int. Conf. Image Anal. Recognit., 2013, pp. 547–555.
[43] R. Hu, N. Situ, T. Wadhawan, and G. Zouridakis, "Nonuniform sampling for bag-of-features classification in melanoma detection," in Proc. 13th Int. Conf. Signal Image Process., 2011, pp. 308–312.
[44] Y. Rubner, C. Tomasi, and L. Guibas, "Earth mover's distance as a metric for image retrieval," Int. J. Comput. Vis., vol. 40, no. 2, pp. 99–121, 2000.
[45] T. Tommasi, F. Orabona, and B. Caputo, "Discriminative cue integration for medical image annotation," Pattern Recognit. Lett., vol. 29, no. 15, pp. 1996–2002, 2008.
[46] R. C. Dubes and A. K. Jain, "Random field models in image analysis," J. Appl. Stat., vol. 16, no. 2, pp. 131–164, 1989.
[47] J. Fleiss, "Measuring nominal scale agreement among many raters," Psychol. Bull., vol. 76, no. 5, pp. 378–382, 1971.
[48] P. Refaeilzadeh, L. Tang, and H. Liu, "Cross-validation," in Encyclopedia of Database Systems, L. Liu and M. T. Ozsu, Eds. New York: Springer, 2009, pp. 532–538.
[49] R. Kohavi, "A study of cross-validation and bootstrap for accuracy estimation and model selection," in Proc. 14th Int. Joint Conf. Artif. Intell., 1995, pp. 1137–1143.
[50] P. Wighton, T. Lee, H. Lui, D. McLean, and M. Atkins, "Generalizing common tasks in automated skin lesion diagnosis," IEEE Trans. Inf. Technol. Biomed., vol. 15, no. 4, pp. 622–629, Jul. 2011.
[51] P. Wighton, T. Lee, G. Mori, H. Lui, D. McLean, and M. Atkins, "Conditional random fields and supervised learning in automated skin lesion diagnosis," Int. J. Biomed. Imag., 2011.