MTAP Final Published Jan 23 2017

Multimed Tools Appl
DOI 10.1007/s11042-016-4260-y
Segmentation of hepatocellular carcinoma and dysplastic liver tumors in histopathology images using area based adaptive expectation maximization
Lekshmi Kalinathan 1 & Ruba Soundar Kathavarayan 2 & Dinakaran Nagendram 3 & Mukul Vij 4 & Mohamed Rela 5
Received: 12 March 2016 / Revised: 12 October 2016 / Accepted: 12 December 2016

© Springer Science+Business Media New York 2017
Abstract The differentiation of a cluster of nuclei from multi-nucleation is a critical issue in automated diagnosis systems. Because such clusters resemble malignant nuclei, misclassification of these regions can affect the final decision of an automated system. In this paper, a method for differentiating clusters from multinucleated cells in histopathological images is proposed. Hepatocellular Carcinoma (HCC) and Dysplasia are characterized by cellular and nuclear enlargement, nuclear pleomorphism and multinucleation, which pose a prominent threat. Data were obtained from Global Hospital and Research Center from patients diagnosed with Hepatocellular Carcinoma and Dysplasia. This paper introduces a hybrid diagnosis method that uses texture, layout and context features of nuclei and cytoplastic cells in order to improve on the poor diagnosis of liver tumors in Infra Red (IR) images. We propose an Area based Adaptive Expectation Maximization (EM) that grows the clusters, which avoids the need for initial cluster
* Lekshmi Kalinathan
lekshmik@ssn.edu.in
Rubasoundar K
rubasoundar@yahoo.com
Dinakaran Nagendram
nagudeena@gmail.com
Mukul Vij
mukul.vij.path@gmail.com
1 SSN College of Engineering, Anna University, Chennai, India
2 P.S.R. Engineering College, Anna University, Sivakasi, India
3 Gastroenterologist, Melmaruvathur Adhiparasakthi Institute of Medical Sciences and Research, Melmaruvathur, India
4 Global Hospital and Research Centre, Chennai, India
5 King's College Hospital, London, UK
selection in order to obtain texton maps of nuclei and cytoplasm. A linear regression model of nuclear and cytoplastic changes was built by incorporating the aforementioned features. The proposed method provides better classification and segmentation accuracy for nuclei and extra-nuclear content in HCC and dysplasia, in constant time, than state-of-the-art methods such as convolutional networks and classical methods such as Adaptive K-means and EM. In conclusion, this system detects malignant cells and highly eligible precancerous cells in a cost-effective and reproducible manner.
Keywords Segmentation . Histopathology images . Hepatocellular carcinoma . Dysplasia . Hepatic lesion . Adaptive EM
1 Introduction
This paper investigates the problem of automatic detection, recognition, and segmentation of HCC and Dysplasia with respect to nucleus size and extra-nuclear cellular arrangement in histopathology images. The proposed system should automatically partition the given histopathology image into meaningful regions, where the required regions can be labeled with a specific color code for both object classes (Fig. 1a and b).
Treating a liver tumor at an early stage can cure it in certain cases, yet the long-term prognosis essentially depends on the location and severity of the liver damage and its extent [23]. Liver tumors are of two types: primary liver cancer, in which the cancerous cells originate in the liver, and secondary (metastatic) liver cancer, which begins in another part of the body, for example the pancreas, bowel, stomach, lung or breast, and has spread to the liver. Hepatocellular Carcinoma is the most common primary malignant neoplasm of the liver, representing the fifth most common malignancy worldwide and the third most frequent cause of cancer-related mortality [7, 28]. This tumor is more common in men than women, with a ratio of 2.7:1 [28].
A standardized terminology of hepatic nodular lesions, introduced by the International Working Party (IWP) in 1995, was updated by the International Consensus Group for Hepatocellular Neoplasia (ICGHN) in 2009 [16, 35]. Hepatic nodular lesions, composed of either hepatocytes or neoplastic cells with hepatocytic features, may be detected on imaging studies of cirrhotic and non-cirrhotic livers and may undergo guided needle biopsy. Pathologists make the final characterization of such lesions by assessing sets of subtle and overlapping histological features.
Fig. 1 (a, b) Example results of the new simultaneous tumor nucleus recognition and segmentation algorithm
Dysplastic nodules are nodular lesions with cytologic and structural atypia indicative of precancerous change [35]. Dysplastic foci and dysplastic nodules consist of expanding cell populations with genomic and epigenetic changes that provide a survival advantage over the surrounding hepatocytes; additional molecular changes are then acquired, ultimately leading to malignancy [14, 15].
The cytological similarity between small cell change and early HCC is significant. Molecular studies have provided evidence for the precancerous nature of small cell change [22, 26]. However, small-sized hepatocytes are often seen in cirrhotic livers as a result of regeneration [25]. Therefore, small cell size alone is not sufficient evidence of precancerous change in the absence of cytologic atypia. These cytological changes were originally described as "liver cell dysplasia" [1]. These dysplastic lesions have been shown to evolve into HCC over time [31, 34]. Therefore, focal HCC may occasionally be found on microscopic examination of dysplastic nodules [2].
An unmet clinical need remains in surgery despite existing imaging modalities such as x-rays (plain film, fluoroscopy, and computed tomography [CT]), single-photon emission computed tomography (SPECT), positron emission tomography (PET), magnetic resonance imaging (MRI) and ultrasound (US), because visible light penetrates no more than a few hundred microns into blood and tissue, owing to high photon attenuation from absorbance and scatter. Thus, a surgeon looking at the surgical field can see only surface features. For example, studies [3, 32] showed that 20 to 25% of breast cancers are still incompletely resected with modern surgical techniques and that local recurrence remains unacceptably high at 12–28%. The unmet clinical need persists in resecting structures such as malignant cells while avoiding structures such as blood vessels and nerves.
K-means cluster analysis has been used to differentiate tumor from non-tumor tissue, based on various IR peaks determined for tumors in fixation-free liver tissues [11]. That work concludes that a method has been developed for real-time differentiation of diseased and normal liver tissue to assist surgeons in the resection of liver tumors. In histopathology images, a cluster of nuclei closely resembles malignant nuclei, so there is always a chance that an automated system misclassifies these regions.
A hybrid diagnosis method is proposed, in which textons are used to detect the nuclei and the cellular nuclei changes of hepatic lesions automatically in histopathology images of cryostat sections. Object recognition and segmentation algorithms can use the textons, which identify the location of the object in the image. As texton deduction reduces the influence of the cluttered background, it enhances the performance of the system. An area function adaptation scheme that uses the EM model grows the clusters of textons without the need for initial selection of clusters. The
textons of nuclei and cytoplastic cells are trained in a classifier, whose output is
combined with color, edge and location information by a discriminative model to
improve the classification accuracy of nuclei and cytoplastic cells in histopathology
images. The main contributions in this paper are threefold. The texture features record
patterns of textons, and exploit the textural appearance of nuclei and cytoplastic cells,
their layout (arrangement), and textural context. A discriminative model is built by
combining these texture features with low-level image features, to provide a near pixel
perfect segmentation of the image. Finally, this model has been trained efficiently by
exploiting both boosting and piecewise training methods.
This paper is organized as follows. Related work on clustering and segmentation methods is discussed in Section 2, together with the proposed method, which introduces the high-level discriminative model, a conditional random field (CRF). Section 2 also covers the system design, which consists of texture-layout filters and their combination in a boosting classifier. Finally, the evaluation, the comparison with related work and the performance of the AdaBoost algorithm are discussed and concluded in Section 3.
2 Materials and methods
2.1 Recording IR spectroscopic imaging data
The process of recording IR spectroscopic imaging data [11] involved pretreatment of the liver tissue by snap freezing in liquid nitrogen, which reduced the sample temperature to below −70 °C. Snap-frozen liver tissue was used for further analysis to preserve specimen integrity. IR spectral images were recorded using an FTIR microscope (Perkin Elmer Spotlight 300) with a liquid-nitrogen-cooled linear array of 16 mercury cadmium telluride (MCT) detectors and a computer-controlled microscope.
Using K-means cluster analysis, 64 IR spectral points are separated into 5 groups based on six IR metrics (Fig. 2a, b and c; Table 1) [9]. A centroid is computed for each group, and the similarity (or "distance") between each image pixel and the average metric scores of each group is calculated. The membership of image pixels in each group changes so as to minimize the sum of "distances" within each group. Finally, image pixels with similar metrics are organised into the same group, as explained in the algorithm (Fig. 3).
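The grouping procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in [11]: the function name `kmeans`, the random seeding and the toy two-metric points are all hypothetical, and Euclidean distance is assumed as the "distance".

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)      # pick k starting points (assumed random)
    labels = [None] * len(points)
    for _ in range(max_iter):
        # assign every point to its closest centroid (Euclidean distance)
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:           # stop when no point changes group
            break
        labels = new_labels
        # recompute each centroid as the mean of its current members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:                    # an empty group keeps its old centroid
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids

# Toy pixels described by two hypothetical metric values each
pixels = [(2.06, 1.68), (2.00, 1.70), (0.88, 0.48), (0.90, 0.50)]
labels, centroids = kmeans(pixels, k=2)
```

The group indices themselves are arbitrary; only the partition of the pixels is meaningful.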
2.2 Various approaches of segmentation
The regions detected by bottom-up segmentation were labelled with the textual class labels of the images, learned by a classifier [10]. Yet, semantic objects are not correlated
Fig. 2 (a–c) K-means cluster analysis with 5 groups using the unscaled, top-six metrics (top). The tumor is represented in red groups
Table 1 IR metric values for 5 groups from K-means analysis
IR metric        Non-tumor   Tumor
L1 (1744/1548)   2.06        0.882
L2 (1744/1244)   1.68        0.483
L3 (1742/1256)   1.34        0.312
L4 (1080/1548)   1.38        0.17
L5 (1080/1244)   1.21        0.344
L6 (1012/1256)   1.42        0.198
with such segmentations; hence, in the proposed method, segmentation and recognition are performed in one unified framework rather than in two separate steps. Such a unified approach was presented in [37], but at a high computational cost. However, images labeled using a unary classifier alone do not yield spatially coherent segmentations [17]. With K-means clustering, it may not be easy to identify the initial K seeds for the clusters [29]. The Adaptive K-means algorithm partitions the given dataset without the initial identification of elements to represent clusters [5, 6]. It offers a quick way of clustering an image even when the number of clusters is not known. It starts with the automatic selection of K elements, chosen at random, which form the seeds of the clusters. Since the properties of each cluster are determined by its randomly selected seed element, this method does not segment the objects accurately. The Expectation Maximization (EM) algorithm assigns data points partially to a given set of partitions using a probabilistic distribution, where each data point is associated with every partition through a system of weights based upon how strongly the data value should be associated with a particular partition. In [24, 36], regions were clustered with words, and EM was used to learn the mapping between region types and keywords. The EM algorithm tends to get stuck in local optima less often than the K-means algorithm [36]. Recognition algorithms based on convolutional networks achieve good performance compared with the aforementioned classical methods [20, 21]. The computational cost of convolutional networks
Algorithm for K-means cluster analysis

Step 1. Choose the number of clusters k and k initial points, where each point represents a cluster to be made.
Step 2. Calculate the distance between the points and all objects (spectra).
Step 3. Assign each object to the cluster of its closest point.
Step 4. Calculate the centroids of the clusters and recalculate the distance between the centroids and each of the objects.
Step 5. for each object
            if the closest centroid is associated with the cluster to which the object currently belongs, then
                make no change
            else
                switch the object's cluster membership to the cluster with the closest centroid
            end if
        end for
Step 6. Repeat steps 2 to 5 until none of the objects are reassigned
Fig. 3 Algorithm for K-means cluster analysis with 5 groups using the unscaled, top-six metrics (top)
algorithms is quite high, despite the improvements achieved, because each pixel is represented in higher dimensions. The proposed method, in contrast, is designed for medical histopathology images of the liver. It extracts texton features and provides better classification and segmentation accuracy for nuclei and extra-nuclear content in HCC and dysplasia in constant time. Although the nuclei to be segmented are very small, the proposed method identifies such micro-nuclei better than any of the algorithms discussed above. In the proposed method, Area based Adaptive EM grows clusters from the textons of the images without an initial selection of clusters and stops generating clusters automatically based on an area function. The textons are then trained in a classifier, which generates a linear regression model to improve the accuracy with which nuclei and cytoplastic cells are separated from other components.
2.3 A conditional random field model of classes
A conditional random field (CRF) model [18, 19, 33] is used to learn the conditional distribution over the class labeling for a given image. Texture-layout, color, location, and edge cues are incorporated into this single unified model. The conditional probability of the class labels c given an image x is defined as
(1)
where E is the set of edges in a 4-connected grid structure, Z(θ, x) is the partition function
which normalizes the distribution, and i and j correspond to sites in the graph.
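As a sketch of how the four cues could combine in such a model, the following form sums three unary potentials over sites and one pairwise potential over the edges E, then subtracts the log partition function. It assumes the standard construction followed in [33]; the symbols \pi (color potential) and \phi (edge potential), and the exact arguments of each term, are assumptions made for illustration rather than notation taken from this paper.

```latex
\log P(c \mid x, \theta) =
  \sum_i \Big[ \psi(c_i, x; \theta_\psi) + \pi(c_i, x_i; \theta_\pi) + \lambda(c_i, i; \theta_\lambda) \Big]
  + \sum_{(i,j) \in E} \phi\big(c_i, c_j, g_{ij}(x); \theta_\phi\big)
  - \log Z(\theta, x)
```

Here \psi, \lambda and g_{ij} play the roles of the texture-layout potential, the location potential and the edge feature described in Section 2.3.1.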
2.3.1 Learning the CRF model
The model comprises four potential functions: three unary potentials (texture-layout, color and location) and one pairwise potential (edge). Using this model, the nucleus can be well segregated from cytoplasm and other components such as vascular tissue, vacuoles, etc., in order to segment the nucleus in HCC and dysplasia respectively.
Texture-layout potentials As the nucleus is more highly textured than the other components when H&E stain is applied, texture components are extracted using Eq. (2)

$$\psi(c_i, X; \theta_\psi) = \log P(c_i \mid X, i) \qquad (2)$$

where P(ci| X, i) is the normalized distribution given by a boosted classifier. This classifier combines the features of texture-layout filters to model the texture, layout, and textural context of the object classes. The texture-layout potentials prove to be the most powerful term in the CRF, as shown in the evaluation.
Color potentials The color models are Gaussian mixture models (GMMs) whose mixture coefficients depend on the class label. The unary color potentials draw on the color distribution of the objects (nucleus and cytoplasm) in an image. The conditional probability of the color x of a pixel is given by Eq. (3)

$$P(x \mid c) = \sum_k P(x \mid k)\, P(k \mid c) \qquad (3)$$

with color clusters (mixture components)

$$P(x \mid k) = \mathcal{N}(x \mid \mu_k, \Sigma_k) \qquad (4)$$

where μk and Σk are the mean and variance respectively of color cluster k.
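Eqs. (3) and (4) can be illustrated with a small Python sketch. To keep it dependency-free it assumes a diagonal covariance for each color cluster, which is a simplification made here and not something the paper states; the function names, the two toy clusters and the class-conditional weights are hypothetical.

```python
import math

def gaussian_diag(x, mean, var):
    """Gaussian density with diagonal covariance (a simplifying assumption)."""
    density = 1.0
    for xi, mi, vi in zip(x, mean, var):
        density *= math.exp(-((xi - mi) ** 2) / (2.0 * vi)) / math.sqrt(2.0 * math.pi * vi)
    return density

def color_likelihood(x, components, weights_for_class):
    """P(x | c) = sum over k of P(x | k) * P(k | c), as in Eqs. (3) and (4)."""
    return sum(
        p_k_given_c * gaussian_diag(x, mean, var)
        for (mean, var), p_k_given_c in zip(components, weights_for_class)
    )

# Two hypothetical colour clusters, each given as (mean, per-channel variance)
components = [((0.6, 0.2, 0.5), (0.01, 0.01, 0.01)),
              ((0.9, 0.7, 0.8), (0.01, 0.01, 0.01))]
p_k_given_nucleus = (0.9, 0.1)     # hypothetical mixture coefficients P(k | c)
p_k_given_cytoplasm = (0.1, 0.9)

pixel = (0.62, 0.22, 0.50)
score_nucleus = color_likelihood(pixel, components, p_k_given_nucleus)
score_cytoplasm = color_likelihood(pixel, components, p_k_given_cytoplasm)
```

Because the mixture components are shared and only the coefficients P(k | c) change with the class, a pixel close to the first cluster scores higher under the class that puts more weight on that cluster.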
Location potentials The location of each texton is compared with the location of each object class in the database to identify the location of nucleus and cytoplasm using Eq. (5)

$$\lambda_i(c_i, i; \theta_\lambda) = \log \theta_\lambda(c_i, \hat{i}) \qquad (5)$$

where $\hat{i}$ is the normalized version of the pixel index i; the normalization allows for images of different sizes. These location potentials capture the relatively weak dependence of the class labels on the absolute location of the pixel in the image.
Edge potentials The edge potentials in the CRF explicitly penalize neighboring nodes that have different class labels, except where there is a corresponding edge in the image. The edge feature gij measures the difference in color between the neighboring pixels as

$$g_{ij} = \exp\left(-\beta \lVert x_i - x_j \rVert^2\right) \qquad (6)$$

where xi and xj are three-dimensional vectors representing the colors of pixels i and j respectively.
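A short Python sketch of Eq. (6) shows how the edge feature behaves. The function name `edge_feature` and the value of beta are hypothetical; the text does not say how beta is chosen, so it is left as a free parameter here.

```python
import math

def edge_feature(color_i, color_j, beta):
    """g_ij = exp(-beta * ||x_i - x_j||^2) for two colour vectors, as in Eq. (6)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(color_i, color_j))
    return math.exp(-beta * squared_distance)

beta = 0.5                                                       # hypothetical value
same_region = edge_feature((0.2, 0.1, 0.3), (0.2, 0.1, 0.3), beta)   # identical colours
across_edge = edge_feature((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), beta)   # strong contrast
```

Identical neighbors give a feature of 1, while a strong color difference drives the feature toward 0, which is what lets a label change be tolerated across an image edge.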
2.4 Proposed system
The system design comprises three phases: textonization, building the classifier model and performance evaluation. The overall system architecture is shown in Fig. 4. Training images are convolved with a bank of 11 filters to produce 17D feature responses as textons. The novelty of the
Fig. 4 Proposed system architecture. Training phase: image database, preprocessing, 17D filter bank, feature responses, Adaptive EM clustering, texton maps, integral image calculation, texton bins, AdaBoost classifier training (1000 rounds), generated models stored in the database. Testing phase: query image, preprocessing, 17D filter bank, feature responses, Adaptive EM clustering, texton maps, integral image calculation, texton bins, AdaBoost classifier with stored models, classified image, segment nuclei alone
work is the clustering of the textons using Area based Adaptive EM, which generates the clusters automatically without an initial value for the number of clusters and produces the textonized images of highly textured nuclei and cytoplastic cells in a fixed amount of time. Integral image estimation is carried out on these textonized images to calculate the model feature responses. These feature responses are trained in an AdaBoost classifier, which gives the probability of each pixel belonging to each class. A linear regression stump model combines the boosting output with color, location and edge information to give the image its final labels. During the testing phase, the aforementioned process, explained in the algorithm (Fig. 5), is repeated for the test image, and its feature responses are classified using each class model from the database into nucleus (red) and cytoplasm (blue). The red component nucleus is extracted from
Algorithm for Proposed Method

Input : Training images
Output: Model
For each image
Step 1: Textonization
  - Textons are calculated for each pixel.
  - Area based Adaptive EM clusters the textons, assigning each texton to the cluster with the highest probability.
  - An integral histogram is computed for each texton with one bin to obtain the filter responses.
Step 2: Building the classifier model
  - The boosting classifier is trained on the filter responses and gives the probability of the pixel belonging to each class.
  - The discriminative model combines the boosting output with color, location and edge information, for the image to receive the final label.
Fig. 5 The proposed method extracts texton features of nuclei and cytoplasm from the image database, clusters them with Adaptive EM, and trains an AdaBoost classifier
this classified image to identify HCC and dysplasia with respect to the size of nucleus
and its cellular arrangement.
2.4.1 Textonization
Training images are convolved with a 17-dimensional convolution filter bank to obtain 17D responses, which are clustered using Area based Adaptive EM. Finally, each pixel in each image is assigned to the appropriate cluster center, producing the final texton map.
Preprocessing Pre-processing can significantly increase the reliability of this process. The filter operations can intensify or reduce certain image details and enable an easier or faster evaluation of the size of the nucleus. The 17D convolution filter bank is generated by applying Gaussian filters to all three HSV (Hue, Saturation and Value) channels, while the other filters are applied only to the luminance. This is because the nucleus is highly textured in histopathology images compared with other components such as cytoplastic cells and vascular tissues, and microscopic evaluation always serves to highlight the texture features. Applying three Gaussian filters to the three HSV channels gives 9D responses; four Laplacian of Gaussian (LoG) filters and four first order derivatives of Gaussian, applied to the luminance, give 4D responses each, for a total of 17D responses.
Gaussian filter The Gaussian filter [33] modifies the input image by convolution with a 2D Gaussian function, whose impulse response is given by

$$g(x, y) = \frac{1}{2\pi\sigma^2} \cdot e^{-\frac{x^2 + y^2}{2\sigma^2}} \qquad (7)$$

For a Gaussian filter of size (m, n), x ranges from -m/2 to (m-1)/2 and y ranges from -n/2 to (n-1)/2. When three Gaussian filters with sigma values of 1, 2 and 4 are applied to the HSV channels, nine filter responses are obtained, three from each channel.
LoG filter The LoG filter [4] first smooths the image using a Gaussian filter and then applies a Laplacian filter. The Laplacian and Gaussian functions are combined to obtain a single equation as

$$h(x, y) = \nabla^2 g(x, y) * f(x, y) \qquad (8)$$

where the term

$$\nabla^2 g(x, y) = \frac{x^2 + y^2 - 2\sigma^2}{\sigma^4} \cdot e^{-\frac{x^2 + y^2}{2\sigma^2}} \qquad (9)$$

and x ranges from -m/2 to (m-1)/2 and y ranges from -n/2 to (n-1)/2, with m and n representing the size of the filter. When four LoG filters with sigma values of 1, 2, 4 and 8 are applied only to the V channel, they produce 4 filter responses, one from each LoG filter.
First order derivative Gaussian filter The Gaussian filter [4] modifies the input image by convolution with a 1D Gaussian function, whose impulse response is given by

$$g(x, y) = \frac{1}{\sqrt{2\pi} \cdot \sigma} \cdot e^{-\frac{x^2}{2\sigma^2}} \qquad (10)$$

where y ranges from -n/2 to (n-1)/2, with n representing the column size of the filter. Four first order derivative of Gaussian filters are used, two in the x-direction and two in the y-direction, with sigma = 2 and 4 in each direction. All the filters are applied only to the V channel. As a result, four filter responses are produced.
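The composition of the 17D filter bank can be made concrete with a standard-library Python sketch that samples the kernels of Eqs. (7) and (9) and lists which filter is applied to which channel. The function names and the kernel sizes are hypothetical, and the derivative kernel is taken to be the first derivative of the 2D Gaussian of Eq. (7), which is an assumption about how the directional filters are built.

```python
import math

def _grid(size):
    # Offsets from -size/2 to (size-1)/2, following the ranges given for Eq. (7)
    return range(-(size // 2), (size - 1) // 2 + 1)

def gaussian_kernel(size, sigma):
    """Samples of the 2D Gaussian of Eq. (7) on a size x size grid."""
    norm = 1.0 / (2.0 * math.pi * sigma ** 2)
    return [[norm * math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) for x in _grid(size)]
            for y in _grid(size)]

def log_kernel(size, sigma):
    """Samples of the Laplacian of Gaussian term of Eq. (9)."""
    return [[(x * x + y * y - 2.0 * sigma ** 2) / sigma ** 4
             * math.exp(-(x * x + y * y) / (2.0 * sigma ** 2)) for x in _grid(size)]
            for y in _grid(size)]

def gaussian_derivative_kernel(size, sigma, axis):
    """First derivative of the 2D Gaussian along 'x' or 'y' (assumed form)."""
    g = gaussian_kernel(size, sigma)
    offsets = list(_grid(size))
    return [[-(offsets[c] if axis == "x" else offsets[r]) / sigma ** 2 * g[r][c]
             for c in range(size)] for r in range(size)]

def filter_bank_plan():
    """One (filter name, sigma, channel) entry per response dimension."""
    plan = [("gaussian", s, ch) for s in (1, 2, 4) for ch in ("H", "S", "V")]
    plan += [("log", s, "V") for s in (1, 2, 4, 8)]
    plan += [("dgauss_" + axis, s, "V") for axis in ("x", "y") for s in (2, 4)]
    return plan

plan = filter_bank_plan()   # 9 Gaussian + 4 LoG + 4 derivative entries
```

Convolving each listed channel with its kernel would give one response image per entry, so every pixel ends up with a 17-dimensional feature vector.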
2.4.2 Texton map
The 17D filter bank described in Section 2.4.1 is convolved with the training images, and the resulting responses are clustered automatically using the Adaptive EM algorithm to give the textonized image (Fig. 6).
Fig. 6 Textonized image

The textonized image can be denoted as Ti, where pixel i has a value in {1, 2, ..., K}. The K textons in the textonized image are split into K channels, where the channels represent the nucleus, cytoplast and vascular tissue cells respectively. In order to calculate texture-layout filter responses in constant time, an integral image is built for each channel. The Area based Adaptive EM method runs on these filter responses to generate the clusters automatically, thus providing the texton map.
Adaptive EM clustering The novelty of the proposed work is that Area based Adaptive EM grows clusters from the textons of the images without an initial selection of clusters and stops generating clusters automatically, based on an area function, in a fixed amount of time. Clusters are generated automatically from the feature responses obtained. As no component in any cluster is bigger than the texton of the nucleus, the algorithm stops generating clusters after the texton of the nucleus, whose cluster number is k, has been generated. Thus the n observations (filter responses) are partitioned into k clusters, where each observation belongs to the cluster with the nearest mean, which serves as the prototype of the cluster. Adaptive EM clustering partitions the n observations into k sets S = {S1, S2, ..., Sk}, where the sets represent the nucleus, cytoplast and vascular tissue cells, so as to minimize the Within-Cluster Sum of Squares (WCSS).
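The area function that drives the stopping test can be illustrated with a small Python sketch that measures the largest connected component of a binary cluster mask. It is only an illustration: the function name is hypothetical, 4-connectivity is assumed, and area is counted in pixels.

```python
from collections import deque

def largest_component_area(mask):
    """Pixel count of the largest 4-connected region of truthy cells in a 2D mask."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    biggest = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # breadth-first flood fill from an unvisited foreground pixel
            area, queue = 0, deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            biggest = max(biggest, area)
    return biggest

# Toy mask of the pixels assigned to one cluster
cluster_mask = [[1, 1, 0, 0],
                [1, 0, 0, 1],
                [0, 0, 1, 1],
                [0, 0, 1, 1]]
area = largest_component_area(cluster_mask)   # components of 3 and 5 pixels, so 5
```

Comparing this value between successive numbers of clusters is one way the growth of clusters could be monitored.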
Initialize the parameters ϕ, such as the mean μk, variance ∑k and weight p(ck), based on the generation of clusters.

$$\mu_k = [1:k] \cdot \frac{m}{k+1} \qquad (11)$$

$$V_k = \mathrm{ones}[1:k] \cdot m \qquad (12)$$

$$p_k = \frac{\mathrm{ones}[1:k]}{k} \qquad (13)$$
As shown in [18, 19, 30], the parameters are updated iteratively until convergence. With these updated parameters, clusters of textons are generated, and cluster generation stops automatically when the area of the biggest component, the nucleus, is found, as shown in the algorithm (Fig. 7).
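$$p(C_k \mid x_i) = w_k(x) = \frac{w_k \cdot f_k(x \mid \varphi_k)}{\sum_i w_i \cdot f_i(x \mid \varphi_k)} \qquad (14)$$

$$\mu_{k+1} = \frac{\sum_i w_k(x_i) \cdot x_i}{\sum_i w_k(x_i)} \qquad (15)$$

$$V_{k+1} = \frac{\sum_i w_k(x_i) \cdot (x_i - \mu_k) \cdot (x_i - \mu_k)^T}{\sum_i w_k(x_i)} \qquad (16)$$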
Algorithm for construction of Texton Map

Input : Feature responses of training images
Output: Texton Map
Step 1: Initialization of parameters $\varphi$, as k = 1, $\mu_k$, $V_k$, $p(c_k)$ and biggest_comp_0 = -1
Step 2: Computation of parameters $\varphi$, using EM based on the k clusters generated so far
    Repeat until convergence {
        (E-step) for each i, k, set
            $p(C_k \mid x_i) = w_k(x) = \frac{w_k f_k(x \mid \varphi_k)}{\sum_i w_i f_i(x \mid \varphi_k)}$
        (M-step) Update the parameters $\mu_k$, $V_k$ and $p(C_k)$
            $\mu_{k+1} = \frac{\sum_i w_k(x_i)\, x_i}{\sum_i w_k(x_i)}$
            $V_{k+1} = \frac{\sum_i w_k(x_i)\, (x_i - \mu_k)(x_i - \mu_k)^T}{\sum_i w_k(x_i)}$
            $p(C_{k+1}) = \frac{1}{N} \sum_i p(C_k \mid x_i)$
    }
    Generate the clusters using area based Adaptive EM
        biggest = -1
        for each FeatureResponse $x_i$ and k
            c = compute $f_k(x_i \mid \varphi_k)$
            n = count the number of connected components on c
            for each conn_comp_n
                area_comp = computearea(conn_comp_n)
                if area_comp > biggest
                    biggest = area_comp
            end for
            biggest_comp_k = biggest
        end for
Step 3: Repeat step 2 until biggest_comp_k > biggest_comp_{k-1} by incrementing k further
Fig. 7 Algorithm for construction of Texton map using the Adaptive EM, in which area is defined as number of
pixels in each component. The biggest component has the maximum area in each iterative cluster ck. As no
component in any cluster is bigger than the texton of nucleus, algorithm stops generating the clusters after the
generation of texton of nucleus
$$p(C_{k+1}) = \frac{\sum_i p(C_k \mid x_i)}{N} \qquad (17)$$
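The update equations (14) to (17) can be illustrated with a minimal EM iteration for a one-dimensional Gaussian mixture in Python. This is a sketch under stated assumptions rather than the Area based Adaptive EM itself: the data are scalar instead of 17D, the number of components is fixed, the responsibilities are normalized over all components (the usual EM convention), the variance uses the freshly updated mean, and the function names and toy data are hypothetical.

```python
import math

def normal_pdf(x, mean, var):
    """Density of a 1D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def em_step(data, means, variances, weights):
    """One E-step followed by one M-step for a 1D Gaussian mixture."""
    k = len(means)
    # E-step: responsibility of every component for every observation
    resp = []
    for x in data:
        scores = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
        total = sum(scores)
        resp.append([s / total for s in scores])
    # M-step: responsibility-weighted mean, variance and mixing weight
    new_means, new_vars, new_weights = [], [], []
    for j in range(k):
        r_sum = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / r_sum
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / r_sum
        new_means.append(mean)
        new_vars.append(max(var, 1e-6))    # small floor keeps the density finite
        new_weights.append(r_sum / len(data))
    return new_means, new_vars, new_weights

# Toy scalar responses drawn around two values
data = [0.9, 1.0, 1.1, 4.9, 5.0, 5.1]
means, variances, weights = [0.0, 6.0], [1.0, 1.0], [0.5, 0.5]
for _ in range(20):
    means, variances, weights = em_step(data, means, variances, weights)
```

On this toy data the two means settle near the two groups of values and the mixing weights stay close to one half each.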
2.4.3 Integral histogram estimation
To compute the texture-layout filter responses in constant time, integral histogram [27] is
computed for each texton with one bin.
$$v_{[r,t]}(i) = \hat{T}^{(t)}_{r_{br}} - \hat{T}^{(t)}_{r_{bl}} - \hat{T}^{(t)}_{r_{tr}} + \hat{T}^{(t)}_{r_{tl}} \qquad (18)$$
where rbr, rbl, rtr and rtl denote the bottom right, bottom left, top right and top left corners of
rectangle r.
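The four-corner lookup of Eq. (18) can be sketched in Python as follows. The helper names and the toy texton map are hypothetical, and the table is padded with an extra zero row and column so that rectangles touching the image border need no special case.

```python
def integral_image(channel):
    """Summed-area table of a 2D list, padded with a leading zero row and column."""
    rows, cols = len(channel), len(channel[0])
    table = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            table[r + 1][c + 1] = (channel[r][c] + table[r][c + 1]
                                   + table[r + 1][c] - table[r][c])
    return table

def rectangle_sum(table, top, left, bottom, right):
    """Sum over rows top..bottom-1 and columns left..right-1 using four lookups."""
    return (table[bottom][right] - table[bottom][left]
            - table[top][right] + table[top][left])

# Toy texton map and the indicator channel of texton t = 2
texton_map = [[1, 2, 2],
              [1, 1, 2],
              [3, 3, 2]]
channel_2 = [[1 if value == 2 else 0 for value in row] for row in texton_map]
table_2 = integral_image(channel_2)
count = rectangle_sum(table_2, 0, 1, 2, 3)   # texton-2 pixels in rows 0-1, columns 1-2
```

Once the table is built, every rectangle query costs four lookups regardless of the rectangle size, which is what makes the filter response constant-time.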
2.4.4 AdaBoost classifier algorithm
Automatic feature selection and learning of the texture-layout potentials [12] are carried out by a boosting process (Fig. 8). Using shared boosting, unary classification and feature selection are achieved to build an efficient classifier that can be applied to a large number of classes. A Joint Boosting algorithm is therefore used to learn a multi-class classifier. A strong classifier H(ci) can be built by summing up 'weak classifiers' iteratively [13]. Using a threshold feature response
Fig. 8 Training phase: filtering, Adaptive EM clustering, textonized images, channelize, integral image, texton calculation, classifiers (1000 iterations), training database
as a decision stump, a weak classifier can be found. To enable a single feature to classify several classes at once, the weak classifier is shared between a set of classes.
The basic boosting algorithm uses +1 and −1 as the outputs of weak classifiers. This algorithm reduces the problem of maximizing a function with multiple labels to a sequence of binary maximization problems. These sub-problems are called alpha-expansions. We have a current configuration (set of labels) c and a fixed label from {1, 2, ..., C}, where C is the number of classes. In the alpha-expansion operation, each pixel i makes a binary decision: it can either keep its old label or switch to the new label. Therefore, a binary vector with values in {0, 1} is introduced.
For those classes that share this feature (c belongs to the sharing set N), the weak learner gives hi(c), which takes one of the two values a + b or b, depending on the comparison of the feature response with a threshold θ. Round m chooses a new weak learner by minimizing an error function Jwse [33] incorporating the weights:
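$$H(c_i) = \sum_{m=1}^{M} h_m(c_i) \qquad (19)$$

A decision stump of each weak learner is defined as

$$h(c_i) = \begin{cases} a\,\delta(v(i, r, t) > \theta) + b & \text{if } c_i \in N \\ k_{c_i} & \text{otherwise} \end{cases} \qquad (20)$$

where

$$b = \frac{\sum_{c \in N} \sum_i w_i^c z_i^c [v(i, r, t) \le \theta]}{\sum_{c \in N} \sum_i w_i^c [v(i, r, t) \le \theta]} \qquad (21)$$

$$a + b = \frac{\sum_{c \in N} \sum_i w_i^c z_i^c [v(i, r, t) > \theta]}{\sum_{c \in N} \sum_i w_i^c [v(i, r, t) > \theta]} \qquad (22)$$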
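The closed-form fit of a and b in Eqs. (21) and (22) can be illustrated for the simplest case, where the sharing set N contains a single class. The Python sketch below makes that simplification explicit; the function names, the toy feature responses and the uniform weights are hypothetical.

```python
def fit_stump(values, targets, weights, theta):
    """Weighted fit of a and b in h = a * [v > theta] + b for one class."""
    low_num = low_den = high_num = high_den = 0.0
    for v, z, w in zip(values, targets, weights):
        if v > theta:
            high_num += w * z
            high_den += w
        else:
            low_num += w * z
            low_den += w
    b = low_num / low_den if low_den else 0.0             # Eq. (21)
    a_plus_b = high_num / high_den if high_den else 0.0   # Eq. (22)
    return a_plus_b - b, b

def stump_response(v, a, b, theta):
    """Weak learner output for a feature response v, first branch of Eq. (20)."""
    return a * (1.0 if v > theta else 0.0) + b

# Toy feature responses with targets z in {-1, +1} and uniform weights
values = [0.1, 0.2, 0.8, 0.9]
targets = [-1, -1, 1, 1]
weights = [0.25, 0.25, 0.25, 0.25]
a, b = fit_stump(values, targets, weights, theta=0.5)   # a = 2.0, b = -1.0
```

With these perfectly separable toy values the stump outputs −1 below the threshold and +1 above it; a strong classifier as in Eq. (19) would add the outputs of many such stumps.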
2.4.5 Testing image
Here, the test image to be processed is given as input. If the automatic recognition correctly classifies all the object classes in the image, the final output is obtained. First the image is textonized to obtain its texton map. From the textonized image, the features and non-features in the model file are extracted. The features extracted from the test set are given as input to the classifier model generated using the AdaBoost algorithm. The result is a semantically segmented image in which each object is assigned the corresponding color given in the object database (Fig. 9).
3 Experimental results and analysis
As this work is a retrospective study, the images used are the records of previously diagnosed patients. We obtained 4 normal images, 5 hepatocellular carcinoma images and 4 dysplasia images from Global Hospital and Health City, Chennai. This work mainly focuses on detecting the nucleus and the extra-cellular nuclei changes of the respective tumor. Each image has more than 700 nuclei of varying size. The training
Fig. 9 Testing phase: filtering, Adaptive EM clustering, textonized image, channelize, integral image, texton calculation, AdaBoost classifiers with database, classified image, segmented nucleus (accuracy of nucleus = 71.8%)
dataset consists of two normal, three HCC and two dysplasia images, which together include approximately 4900 nuclei and cytoplastic cells. Similarly, the testing set consists of the remaining two normal, two HCC and two dysplasia images, which include approximately 4200 nuclei and cytoplastic cells. The implementation is carried out in MATLAB. The texton feature responses are trained in an AdaBoost classifier for 1000 iterations to build the linear regression model and gain more accuracy. Adaptive K-means does not require the initial identification of the cluster representatives, but its segmentation accuracy is very low due to poor clustering of nuclei and cytoplastic cells. The EM technique is highly dependent on an initial selection of elements that represent the clusters well, and it segments the nuclei better than Adaptive K-means. However, this method needs the number of clusters to be predefined. Taking the advantages of Adaptive K-means and EM, the proposed method clusters the textons with no initial selection of clusters, using the probability distribution of EM and the area of the respective components such as nuclei, cytoplastic cells and vascular components. The proposed method combines the texton features with low-level features such as color, edge and location information, and gives better classification and segmentation accuracy, even without initial selection of clusters, than state-of-the-art methods such as convolutional networks and classical methods such as Adaptive K-means and EM. Representation of a pixel in higher dimensions leads to high computational cost in convolutional networks. When the proposed algorithm runs on the dataset of Microsoft Image Understanding Research, it gives an accuracy of 70%, and for the micronuclear histopathology images it gives, in constant time, a better accuracy of 74.26%. Therefore, the proposed algorithm is capable of identifying micro-nuclei more efficiently. Since the images obtained from the Global hospital are at 10× magnification, they have more nuclei based textures
rather than cytoplastic cells and other components, so the algorithm has learned more features
of the nuclei than of the cytoplasm. The efficiency of the proposed algorithm may be improved
further by modifying the linear regression stump model or by using another classification
algorithm.
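One way to read "the area of the respective components" is as the pixel count of connected regions in a cluster label map. The following is a minimal sketch under that assumption; the function name `largest_component_area`, the list-of-lists label map and the choice of 4-connectivity are illustrative and are not taken from the MATLAB implementation used in this work.

```python
from collections import deque


def largest_component_area(label_map, label):
    """Return the pixel count of the largest 4-connected region of `label`.

    `label_map` is a 2-D list of integer cluster labels (hypothetical input
    format). Returns 0 when the label does not occur.
    """
    rows, cols = len(label_map), len(label_map[0])
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if label_map[r][c] != label or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected region.
            seen[r][c] = True
            queue = deque([(r, c)])
            area = 0
            while queue:
                y, x = queue.popleft()
                area += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and label_map[ny][nx] == label):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, area)
    return best


# Toy label map: 1 could stand for nuclei, 0 for cytoplasm, 2 for vessels.
toy_map = [
    [1, 1, 0],
    [0, 1, 0],
    [2, 0, 1],
]
```

A cluster-growing loop could compare this area from one cluster count to the next and stop once the largest region no longer grows; that stopping rule is likewise only an assumption here.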
3.1 Boosting accuracy
The boosting classifier gradually selects new texture-layout filters to improve classification
accuracy. Initially, after 30 iterations of boosting (i.e. 30 texture-layout filters), a very poor
classification is given, with low confidence. As more texture-layout filters are added, the
classification accuracy improves greatly (Fig. 9), and after 1000 iterations a very accurate
classification is obtained. This illustrates only the texture-layout potentials, and not the full
CRF model. As expected, the training error J_wse [33] decreases non-linearly as the number of
weak learners increases. Furthermore, the classification accuracy on the validation set
flattens out at approximately 71.8% and 75.09% after 1000 iterations
for the nuclei and cytoplastic cells of HCC and dysplasia, respectively (Table 2). The
accuracy against the validation set is measured as the pixelwise segmentation accuracy, in
other words the percentage of pixels that are assigned the correct class labels (Fig. 10).
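The effect of adding weak learners can be illustrated on toy data. The sketch below uses plain discrete AdaBoost with threshold stumps on a one-dimensional feature, as a simplified stand-in for the regression-stump boosting of [33]; all function names and the toy data are assumptions made for illustration only.

```python
import math


def stump_output(x, threshold, polarity):
    """Decision stump: returns `polarity` if x > threshold, else -polarity."""
    return polarity if x > threshold else -polarity


def fit_stump(xs, ys, weights):
    """Return (weighted_error, threshold, polarity) of the best stump."""
    values = sorted(set(xs))
    thresholds = [values[0] - 1.0]
    thresholds += [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    best = None
    for t in thresholds:
        for pol in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_output(x, t, pol) != y)
            if best is None or err < best[0]:
                best = (err, t, pol)
    return best


def adaboost(xs, ys, rounds):
    """Train `rounds` stumps; labels in `ys` must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        err, t, pol = fit_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1.0 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1.0 - err) / err)
        model.append((alpha, t, pol))
        # Increase the weight of misclassified samples, then renormalize.
        weights = [w * math.exp(-alpha * y * stump_output(x, t, pol))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return model


def predict(model, x):
    score = sum(alpha * stump_output(x, t, pol) for alpha, t, pol in model)
    return 1 if score >= 0 else -1


def training_error(model, xs, ys):
    return sum(predict(model, x) != y for x, y in zip(xs, ys)) / len(xs)


# Toy data that no single stump can separate.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
error_after_1 = training_error(adaboost(xs, ys, 1), xs, ys)
error_after_3 = training_error(adaboost(xs, ys, 3), xs, ys)
```

On this toy set one stump misclassifies two of the six samples, while three stumps classify all of them correctly, which mirrors the trend described above at a much smaller scale.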
3.2 Performance evaluation
Three metrics, namely Mean Squared Error (MSE), Peak Signal-to-Noise Ratio (PSNR) and
Dice Similarity Coefficient (DSC), are used for measuring the accuracy of the segmentation.
MSE is close to zero relative to the magnitude of at least one of the estimated treatment effects.
It represents the mean squared error rate between 0 and 1. The lower the value of MSE, the better
the segmentation and the smaller the error. Using the proposed technique, the error rate is found to
be 0.01. The higher the PSNR, the better the quality of the image. Of the existing methods, Adaptive
K-means has a PSNR value of 55.79 dB and EM has a PSNR value of 66.22 dB. The proposed
method has a PSNR value of 72.16 dB, which is considerably higher than that of the existing
methods. When the DSC metric [8] is applied between the automatic and the manual
segmentation, the values obtained are always between 0 and 1, where higher values indicate
a better match and hence improved accuracy.
$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \tilde{y}_i \right)^2 \tag{23}$$
Table 2 Experimental results of proposed method
Classes   Structure/Tissue   True Positive   False Positive
Red       HCC*               0.72            0.28
Red       Dysplasia          0.75            0.25
* = Hepatocellular Carcinoma
Fig. 10 Performance analysis of proposed vs existing methods. Columns: input image, reference image, segmented image, method and its accuracy. Adaptive K-means: 57.18%; EM: 65.61%; proposed method: 74.26%
$$\mathrm{PSNR} = 10 \log_{10} \left( \frac{255 \times 255}{\mathrm{MSE}} \right) \tag{24}$$

$$\mathrm{DSC} = \frac{2\,|M_1 \cap A_1|}{|M_1| + |A_1|} \tag{25}$$
where A1 and M1 represent the automatic and manual segmentations, respectively.
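The three metrics, together with the pixelwise accuracy reported alongside them, can be sketched in a few lines. The code below is a minimal illustration that assumes images are given as flat sequences of pixel values (binary masks for DSC) and uses the standard Dice definition; the function names are hypothetical and do not come from the original implementation.

```python
import math


def mse(reference, segmented):
    """Mean squared error between two equally sized pixel sequences (Eq. 23)."""
    if len(reference) != len(segmented) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((y - y_hat) ** 2 for y, y_hat in zip(reference, segmented))
    return total / len(reference)


def psnr(mse_value, peak=255.0):
    """Peak signal-to-noise ratio in dB for a given MSE (Eq. 24)."""
    if mse_value == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak * peak / mse_value)


def dice(manual, automatic):
    """Dice similarity coefficient of two binary masks (Eq. 25)."""
    intersection = sum(1 for m, a in zip(manual, automatic) if m and a)
    size_sum = sum(1 for m in manual if m) + sum(1 for a in automatic if a)
    return 1.0 if size_sum == 0 else 2.0 * intersection / size_sum


def pixel_accuracy(true_labels, predicted_labels):
    """Percentage of pixels that are assigned the correct class label."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)
```

For example, `dice([1, 1, 0, 0], [1, 0, 1, 0])` evaluates to 0.5, since one of the two foreground pixels in each mask overlaps.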
The performance metrics of all three automated segmentation algorithms considered in this
work for detecting the nuclei of various hepatic tumors in histopathology images are
compared in Table 3. It is clear that, for segmenting the three types of hepatic tumors, the proposed
method performs considerably better than the classical and convolutional algorithms
discussed in this work.
4 Conclusion
Segmentation of multi-nucleated hepatic tumors in histopathological images is
crucial for the patient. In this work, the segmentation of nuclei in hepatic tumors is
Table 3 Performance evaluation of proposed vs existing methods
Segmentation technique   MSE¹    PSNR² (dB)   DSC³    Accuracy (%)
Proposed                 0.01    72.16        0.5     74.26
Adaptive K-means         0.24    55.79        0.33    57.18
EM⁴                      0.14    66.22        0.07    65.61
¹ Mean Squared Error
² Peak Signal-to-Noise Ratio
³ Dice Similarity Coefficient
⁴ Expectation Maximization
implemented and analysed with histopathological images. A comparative study of
automatic segmentation techniques for detecting nuclei of various hepatic tumors from
histopathological images using Adaptive K-means, Expectation Maximization and the
proposed method is carried out. The segmentation accuracy has been assessed through
the following evaluation framework: MSE, PSNR and DSC. From the analysis of the
performance metrics calculated for the various automatic segmentation techniques, it
is observed that the proposed algorithm performs better than the other
algorithms dealt with in this work. Finally, the performance of the algorithm is evaluated
with respect to the nucleus and the extracellular nuclei changes for HCC and
dysplasia tumors. In future, there is a need to collect more histopathology images
containing cytoplastic cells along with nuclei, in order to analyze and detect the cytoplastic
changes along with the cellular nuclei changes with respect to size and their number in each cell.
The detected cytoplastic changes and nuclei will be used to reach an accurate
prognosis of dysplasia leading to malignant metaplasia or neoplasia, whether or not
undergoing cirrhosis.
Acknowledgements We would like to thank and extend a deep sense of gratitude to Dr. Mohamed Rela, MS
FRCS, Consultant HPB Surgeon and Dr. Balajee, MD, HOD of Laboratory Medicine & Senior Consultant of
Global Hospital, Chennai for providing the medical image data and interpretation for the analysis. They have
helped us a lot in getting a better insight and assessing the number of liver metastases in histopathology images to
a greater extent.
Compliance with ethical standards
Funding No. The study was approved by Institutional Ethics Committee-Global Hospital and Health City (IEC-
GHHC).
Conflict of interest The authors declare that they have no conflict of interest.
Ethical approval All procedures performed in studies involving human participants were in accordance with
the ethical standards of the Institutional Ethics Committee-Global Hospital and Health City(IEC-GHHC).
References
1. Anthony PP, Vogel CL, Barker LE (1973) Liver cell dysplasia: a premalignant condition. J clin Path 26(3):
217–223
2. Arakawa M, Kage M, Sugihara S et al (1986) Emergence of malignant lesions within an adenomatous
hyperplastic nodule in a cirrhotic liver. Observations in five cases. Gastroenterology 91(1):198–208
3. Bani MR, Lux MP, Heusinger K et al (2009) Factors correlating with reexcision after breast-conserving
therapy. Eur J Surg Oncol 35(1):32–37. doi:10.1016/j.ejso.2008.04.008
4. Basu M (2002) Gaussian-based edge-detection methods-a survey. IEEE Transactions on Systems, Man and
Cybernetics-Part C: Applications And Reviews 32(3):252–260
5. Bhagwati CP, Sinha GR (2010) An adaptive K-means clustering algorithm for breast image segmentation.
Int J of Computer Applications 10(4):35–38
6. Bhatia SK (2004) Adaptive K-means clustering. FLAIRS Conference p:695–699
7. Bruix J, Branco FS, Ayuso C (2006) Hepatocellular Carcinoma. In: Schiff ER, Sorrel MF, Maddrey WC
(eds) Schiff’s diseases of liver, 10th edn. Lippincott Williams and Wilkins, Philadelphia
8. Casciaro S, Franchini R, Massoptier L et al (2012) Fully automatic segmentations of liver and hepatic
tumors from 3-D computed tomography abdominal images: comparative evaluation of two automatic
methods. IEEE Sensors J 12(3):464–473
9. Chen Z, Butke R, Miller B et al (2013) Infrared metrics for fixation-free liver tumor detection. J Phys Chem
B 117(41). doi:10.1021/jp4073087
10. Duygulu P, Barnard K, de Freitas JFG et al (2002) Object recognition as machine translation: learning a
lexicon for a fixed image vocabulary. Proc European Conf on Computer Vision Springer-Verlag Berlin
Heidelberg 2353:97–112
11. Fernandez DC, Bhargava R, Hewitt SM et al (2005) Infrared spectroscopic imaging for histopathologic
recognition. Nat Biotech 23:469–474. doi:10.1038/nbt1080
12. Freund Y, Schapire RE (1999) A short introduction to boosting. J Japanese Society for Artificial Intelligence
14(5):771–780
13. Friedman J, Hastie T, Tibshirani R (2000) Additive logistic regression: a statistical view of boosting. Ann
Statistics 28(2):337–407
14. Hytiroglou P (2004) Morphological changes of early human hepatocarcinogenesis. Semin Liver Dis 24(1):
65–75
15. Hytiroglou P, Park YN, Krinsky G, Theise ND (2007) Hepatic precancerous lesions and small hepatocel-
lular carcinoma. Gastroenterol Clin N Am 36(4):867–887
16. International Consensus Group for Hepatocellular Neoplasia (2009) Pathologic diagnosis of early
hepatocellular carcinoma: a report of the international consensus group for hepatocellular neopla-
sia. Hepatology 49(2):658–664
17. Konishi S, Yuille AL (2000) Statistical cues for domain specific image segmentation with
performance analysis. Proc IEEE Conf on Computer Vision and Pattern Recognition 1:125–132
18. Kuang Z, Schnieders D, Zhou H et al (2012) Learning image-specific parameters for interactive segmen-
tation. IEEE Conference on Computer Vision and Pattern Recognition:590–597
19. Lafferty J, McCallum A, Pereira F (2001) Conditional random fields: probabilistic models for
segmenting and labeling sequence data. Proc Int Conf on Machine Learning:282–289
20. Long J, Shelhamer E, Darrell T (2015) Fully convolutional networks for semantic
segmentation. Proc IEEE CVPR15
21. Hariharan B, Arbeláez P, Girshick R, Malik J (2015) Hypercolumns for object segmentation and
fine-grained localization. Proc IEEE CVPR15
22. Marchio A, Terris B, Meddeb M et al (2001) Chromosomal abnormalities in liver cell dysplasia
detected by comparative genome hybridisation. Mol Pathol 54(4):270–274
23. Marengo A, Jouness RIK, Bugianesi E (2015) Progression and natural history of nonalcoholic fatty liver
disease in adults. Clin Liver Dis. doi:10.1016/j.cld.2015.10.010
24. Moon N, Bullitt E, Leemput KV et al (2002) Model-based brain and tumor segmentation. IEEE
1:528–531
25. Nakanuma Y, Hirata K (1993) Unusual hepatocellular lesions in primary biliary cirrhosis resem-
bling but unrelated to hepatocellular carcinoma. Virchows Arch A Path Anat 422(1):17–23
26. Plentz RR, Park YN, Lechel A et al (2007) Telomere shortening and p21 checkpoint inactivation
characterize multistep hepatocarcinogenesis in humans. Hepatology 45(4):968–976
27. Porikli F (2005) Integral histogram: a fast way to extract histograms in Cartesian spaces. IEEE Computer
Society Conference on Computer Vision and Pattern Recognition 1:829–836
28. Roberts LR, Gores GJ (2009) Hepatocellular Carcinoma. In: Yamada T (ed) Textbook of gastroenterology,
vol 2, 5th edn. Blackwell, UK, Oxford, pp. 2386–2411
29. Rohit S, Gaikwad MS (2013) Segmentation of brain tumour and its area calculation in brain MR images
using K-mean clustering and fuzzy C-mean algorithm. Int J of Computer Science and Engineering
Technology 4(5):524
30. Russell BC, Torralba A, Murphy KP et al (2008) LabelMe: a database and web-based tool for image
annotation. Int J of Computer Vision 77(1–3):157–173
31. Sakamoto M, Hirohashi S (1998) Natural history and prognosis of adenomatous hyperplasia to hepatocel-
lular carcinoma: multi-institutional analysis of 53 nodules followed up for more than 6 months and 141
patients with single early hepatocellular carcinoma treated by surgical resection or percutaneous ethanol
injection. Jpn J Clin Oncol 28(10):604–608
32. Schiller DE, Le LW, Cho BCJ et al (2008) Factors associated with negative margins of
lumpectomy specimen: potential use in selecting patients for intraoperative radiotherapy. Ann
Surg Oncol 15(3):833–842. doi:10.1245/s10434-007-9711-2
33. Shotton J, Winn J, Rother C, et al. (2006) TextonBoost: Joint Appearance, Shape and Context Modeling for
Multi-Class Object Recognition and Segmentation. Proc. European Conf. on Computer Vision Springer-
Verlag Berlin Heidelberg 3951: 1–15.
34. Takayama T, Makuuchi M, Hirohashi S et al (1990) Malignant transformation of adenomatous
hyperplasia to hepatocellular carcinoma. Lancet 336(8724):1150–1153
35. Terminology of nodular hepatocellular lesions (1995) International working party. Hepatology 22(3):983–
993
36. Tsai A, Zhang J, Willsky AS (2001) Expectation-maximization algorithms for image processing
using multiscale models and mean-field theory, with applications to laser radar profiling and
segmentation. Opt Eng 40(7):1287–1301
37. Tu Z, Chen X, Yuille AL et al (2005) Image parsing: unifying segmentation, detection, and recognition. Int J
of Computer Vision 63(2):113–140
Lekshmi Kalinathan received her B. Tech in Information Technology from Madurai Kamaraj University,
Madurai, in 2004 and was admitted to the post graduate program of Computer Science and Engineering of
Anna University. She joined the faculty of Computer Science and Engineering and currently is an Assistant
Professor. Her research interests include image processing, soft computing and signal processing.
Rubasoundar K received his Bachelor degree in Computer Science and Engineering from Institution of Engineers,
Kolkata, India in 2000 and was admitted to the post graduate program of Computer Science and Engineering of Anna
University. He received his PhD in Faculty of Information and Communication Engineering from Anna University in
2010. He joined the faculty of Computer Science and Engineering and currently is a Professor. His research interests
include image processing, soft computing, cloud computing and computer networks.
Dinakaran Nagendram received his B.Sc., MD (General Medicine) and DM (Medical Gastroenterology)
from Madras University, Chennai. He worked in the Tamilnadu Medical Health Department from
1979 until his superannuation in October 2004, retiring as Head of the Department
and Professor of Medical Gastroenterology, Government General Hospital and Madras Medical
College, Chennai. He is currently working as Head of the Department of Medical Gastroenterology
at Melmaruvathur Adhiparasakthi Institute of Medical Sciences and Research. He has published about
326 national and 29 international publications in his area of research. He has attended
various national and international conferences, including those on Medical and Surgical Oncology,
Nutrition, Food and Dairy Developments, and Animal Husbandry, and two international
Bio-Engineering conferences.
Dr. Mukul Vij received his MBBS degree from King George’s Medical University (KGMC), Chowk,
Lucknow, in 2004. He was admitted to the MD in Pathology in 2005 and received a PDCC from the
Sanjay Gandhi Post Graduate Institute of Medical Sciences (SGPGIMS), Lucknow, in 2009. He joined
Global Hospital, Chennai, as a Consultant in the speciality of Laboratory Medicine.
Prof. Mohamed Rela received his MBBS degree in 1980 and MS degree from Stanley Medical College,
Chennai. He went to the United Kingdom in 1986 and became a Fellow of the Royal College of Surgeons in
1988. In 1991, he joined King's College Hospital, London, UK (where the first liver transplant was done in
1989) and became actively involved in liver transplant surgeries. He is now affiliated with Global Hospitals &
Health City Group in Chennai as the Head of the Department for the Institute of Liver, Pancreas Diseases and
Transplantation. He has written a number of scientific articles and papers on his areas of interest, which include
liver transplantation, complex hepatobiliary and pancreatic surgery, and more.
