
Automatic Fish Species Classification Based on Robust Feature

Extraction Techniques and Artificial Immune Systems


Marco T. A. Rodrigues, Flavio L. C. Padua, Rogerio M. Gomes, Gabriela E. Soares
Intelligent Systems Laboratory
Federal Center of Technological Education of Minas Gerais
Av. Amazonas, 7675, Belo Horizonte, MG, Brasil
{tulio,cardeal,rogerio,gabriela}@lsi.cefetmg.br

Abstract—This paper addresses the problem of automatic classification of fish species by using image analysis techniques and artificial immune systems. Unlike most common
methodologies, which are based on manual estimations that
lead to significant time and financial constraints, we present
an automatic framework based on (i) two well-known robust
feature extraction techniques: Scale-Invariant Feature Transform and Principal Component Analysis for parameterizing
shape, appearance and motion, (ii) two immunological algorithms: Artificial Immune Network and Adaptive Radius
Immune Algorithm for clustering individuals of the same
species, and (iii) a simple nearest neighbor classification
strategy. The framework was successfully validated with
images of fish species that have significant economic impact,
achieving overall accuracy as high as 92%.
Keywords-Fish Species Classification; Artificial Immune Systems; Scale-Invariant Feature Transform (SIFT); Principal Component Analysis (PCA)

I. INTRODUCTION
Many applications benefit from the availability of information about fish abundance and distribution. Examples
include ecological and environmental studies of fish communities [1], design and placement of fish ladders in dams
for hydroelectric power generation [2], feeding strategies
used by fish farmers [3] and stock assessment for fishery
management [4]. Accurate stock assessment is particularly
important in a scenario where human population and
the demand for fish are increasing. Attempting to match
natural stock fluctuations with fishing effort may help to
avoid further long-term damage to exploited species.
The consequences of over-fishing can be catastrophic,
since fish provide vital contributions to food supplies and
influence employment in coastal areas (see Figure 1).
Various methods have been applied to estimate stock sizes. Stock assessment techniques are highly dependent on the available data and on whether long- or short-term predictions are the aim. For correct predictions, many techniques require large inputs of unbiased data and, therefore, the strength of any stock biomass prediction will be influenced by the weakness of the available inputs used to validate the final model estimates of a fishery [4], [5].
A critical task in all of the afore-mentioned applications
is fish species classification, that is, the task of grouping
and categorizing fish species according to shared physical
characteristics. Unfortunately, even though fish species
classification may be performed by human observers,

Figure 1. According to the United Nations Environment Program, a combination of climate change, over-fishing and pollution could cause the collapse of commercial fish stocks worldwide within decades [4].

this approach is prone to human error and demands significant time and financial costs. Therefore, as an
effort to overcome those problems, we propose a new
automatic framework based on image analysis techniques
and artificial immune systems to assist fish biologists and
other professionals in their works.
According to Nery et al. [6], a reliable vision-based system for automatically classifying fish species should be able to handle cases such as:
- Arbitrary scale and orientation: fish appear in a variety of scales, orientations and body poses;
- Environmental variations: the scene illumination and water quality may vary;
- Bad image quality: image acquisition is frequently affected by noise and distortions in the optical system;
- Segmentation failures: an individual fish may not be segmented reliably.
Our framework considers that images are acquired by a single stationary uncalibrated camera. Moreover, it assumes that scene illumination is controlled and, without loss of generality, that the input of the system is the image of a single fish. This last assumption means that a segmentation operation is performed beforehand.
Specifically, we present a framework based on (i)
two robust feature extraction techniques: Scale-Invariant
Feature Transform (SIFT) [7] and Principal Component
Analysis (PCA) [8] for parameterizing shape, appearance
and motion, (ii) two immunological algorithms: Artificial

Figure 2. Block diagram of the proposed framework for fish species classification.

Immune Network (aiNet) [9] and Adaptive Radius Immune Algorithm (ARIA) [10] for clustering individuals
of the same species, and (iii) a simple nearest neighbor
classification strategy. A block diagram with the main
modules of the proposed framework is depicted in Figure
2. As will be later demonstrated by our experimental results, even though a very simple classifier has been used, we have obtained very good performance levels, which confirms the robustness of the feature extraction techniques used.
Although vision-based systems for the automatic classification of fish species have several applications, so far only a handful of works have proposed effective solutions for this problem [6], [11]-[15]. In
[11], the authors use artificial neural networks for the
classification of digital echo recordings of fish schools.
Energetic, morphometric and bathymetric school descriptors were extracted. Several fish species were considered,
including anchovy, rough scad and blue whiting. Correct
classification rates of up to 96% were obtained. That method is best suited to coarser detection tasks, such as finding fish schools, and cannot be applied in multispecies environments characterized by continuous changes in the species composition of the schools.
The methodology used in [12] is the most closely
related to our approach. The authors present a system
for classifying four fish species, namely: Pacu, Carpa,
Surubim and Cascudo. That system is based on the use
of PCA and aiNet algorithms, achieving overall accuracy
higher than 80%. Unlike that work, we present a flexible
framework that allows users to apply not only PCA but
also the SIFT algorithm, as well as one of two possible
artificial immune systems: ARIA and aiNet. Differently
from PCA features, the SIFT features are based on the
appearance of the object at interest points and are invariant
to image scale and rotation [7]. Additionally, the ARIA
algorithm has a better performance than aiNet in several
data clustering applications [10], since it is able to preserve
the density information of the data. Finally, our approach was tested with a larger dataset, composed of nine fish species of significant economic impact.
In [13], the authors present a deformable template
object recognition method for classifying two fish species
in underwater video. Specifically, a deformable template

matching is used, which employs shape contexts and large-scale spatial structure preservation. An SVM texture classification is performed, achieving a classification accuracy of 90%. Unlike our method, that approach was validated with a very limited group of fish species (only two) and its computational cost may prevent its application in a real-time scenario involving more fish species.
In [6], a feature selection methodology is proposed for
fish species classification. Given a set of available descriptors, the method builds the feature vector by analyzing
each individual characteristic contribution to the overall
classification performance. A couple of statistical measures, namely discrimination and independence, are used
to aid in the feature selection process. The authors report
a classification accuracy of about 85%. Similarly to our approach, instead of studying techniques for improving the classifier's structure itself, the authors consider it as a black box and focus on the study of robust feature determination methods.
In [14], a shape analysis algorithm was developed for
removing edge noise and redundant data points. A curvature function analysis was used to locate critical landmark
points. The fish contour segments of interest were then
extracted based on these landmark points for species
classification. Similarly to our approach, that method was
tested with a large group of fish species (nine species).
However, the authors performed experiments with a much smaller dataset, composed of only 22 sample images.
Finally, in [15], an infrared silhouette sensor is used to
acquire contours of fish in constrained flow. Classification
is based on the combined results of three different classifiers which use invariant moments and Fourier boundary
descriptors for fish silhouette recognition. Those features,
however, do not perform well for noisy images. The
authors report a classification accuracy of about 78%.
The remainder of this paper is organized as follows: Section 2 covers our automatic framework for fish species classification. Experimental results are presented in Section 3, followed by conclusions and discussion in Section 4.
II. CLASSIFICATION FRAMEWORK
The operation of the framework for automatic classification of fish species is divided into two main steps (see Figure
2). The first step is executed offline and is responsible for
estimating a knowledge base by using feature extraction
and clustering techniques. These techniques are applied
to a training set containing image samples of the nine
fish species considered in this work (see Figure 3). The
knowledge base is further used by the framework in a
second step, which is executed online and is responsible
for performing the classification of new detected fish.
A. Feature Extraction
Stable local feature extraction and representation is a
fundamental component of systems for automatic classification of fish species [13]. To perform a robust feature extraction, the proposed framework provides two

R = A Σ Bᵀ.                                      (6)

As R is symmetric, we have B = A, that is, R = A Σ Aᵀ. After some additional mathematical manipulations, we get:

Σ = Aᵀ R A.                                      (7)

Figure 3. Image samples of the nine fish species considered. (a)-(f) Fish conserved in formaldehyde solution and (g)-(i) fish in vivo.

well-known techniques: Scale-Invariant Feature Transform


(SIFT) [7] and Principal Component Analysis (PCA) [8].
PCA is a standard technique for dimensionality reduction and has been applied to a broad class of problems. Although PCA presents some shortcomings, such as
its implicit assumption of Gaussian distributions and its
restriction to orthogonal linear combinations, it remains
popular due to its simplicity [16]. The idea of applying
PCA to object classification is not novel. Our contribution
lies in rigorously demonstrating that PCA is well-suited
to parameterizing shape, appearance and motion of fish
species.
To apply PCA, we first convert the RGB image samples to the YUV color space. The YUV color space models human perception of color more closely than standard RGB, encoding brightness information (the Y component) separately from color information (the U and V components).
Given p image samples, we obtain for each sample i (i = 1, 2, . . . , p) three n-dimensional column vectors yi, ui and vi by concatenating, respectively, the n pixel values of each component Y, U and V. Those n-dimensional column vectors are combined to form three different matrices:

Y = [y1 y2 . . . yp],                            (1)

U = [u1 u2 . . . up],                            (2)

V = [v1 v2 . . . vp],                            (3)

which encode the brightness (Y) and color (U and V) information of all image samples. In the following, we describe the application of PCA to matrix U; a similar procedure is also applied to matrices Y and V.
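For illustration, the construction of these matrices can be sketched in a few lines of NumPy. This is a minimal sketch under our own naming; it assumes the common BT.601 RGB-to-YUV weights, which the paper does not specify:

```python
import numpy as np

def yuv_matrices(images):
    """Stack p RGB images (each h x w x 3, floats in [0, 1]) into the three
    n x p column-sample matrices of Equations (1)-(3)."""
    # Assumed BT.601 RGB -> YUV conversion matrix (one row per output channel).
    M = np.array([[0.299, 0.587, 0.114],
                  [-0.147, -0.289, 0.436],
                  [0.615, -0.515, -0.100]])
    cols = [img.reshape(-1, 3) @ M.T for img in images]   # per-pixel YUV, n x 3
    Y = np.stack([c[:, 0] for c in cols], axis=1)          # Eq. (1)
    U = np.stack([c[:, 1] for c in cols], axis=1)          # Eq. (2)
    V = np.stack([c[:, 2] for c in cols], axis=1)          # Eq. (3)
    return Y, U, V
```

Each column is one image sample, so all images are assumed to share the same resolution.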
Consider a new coordinate system T = [t1 t2 . . . tp]. Supposing that T is orthonormal, the representation Û of U in this new system is given by:

Û = Tᵀ U.                                        (4)

Assuming that each ui has expected value zero, that is, E[ui] = 0, ∀i, we compute the covariance matrix of Û as follows:

Σ̂² = Tᵀ R T,                                    (5)

where R = (1/p) U Uᵀ is the covariance matrix of U.
The coordinate system T that results in the highest possible value for the covariance is computed by finding a singular value decomposition of R, as given in Equation (6), where the main diagonal of Σ contains the singular values of R. Assuming that T = A and comparing Equations (5) and (7), we note that Σ̂² = Σ.
Given that matrix R represents the correlation between the coordinates of each vector ui of U, ∀i, the transformation applied to R in Equation (7) performs its diagonalization, representing R in a new orthogonal system. In this new coordinate system, given by T, each coordinate j of a vector ui presents maximum variance with respect to axis tj and null variance with respect to the other axes. It is exactly this property that allows the dimensionality reduction of the data.
Therefore, by using only the first k vectors of T, that is, Tk = [t1 t2 . . . tk], the representation of U in this new coordinate system is given by:

Ûk = Tkᵀ U.                                      (8)
In Figures 4(a) and 4(b), we illustrate a PCA feature space obtained by using the two components with highest values (k = 2), which were computed by applying PCA to matrix U. In this case, U encodes the color information of our image samples of fish conserved in formaldehyde solution.
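In practice, the derivation above reduces to a truncated decomposition of the sample covariance matrix. A minimal NumPy sketch (function names are ours; mean-centering is added to enforce the E[ui] = 0 assumption):

```python
import numpy as np

def pca_basis(U, k):
    """First k principal directions of the n x p column-sample matrix U."""
    Uc = U - U.mean(axis=1, keepdims=True)   # enforce the zero-mean assumption E[ui] = 0
    R = (Uc @ Uc.T) / Uc.shape[1]            # covariance matrix R, as used in Eq. (5)
    A, _, _ = np.linalg.svd(R)               # R = A Sigma A^T, since R is symmetric
    return A[:, :k]                          # Tk = [t1 t2 ... tk]

def project(U, Tk):
    """Representation of the samples in the reduced coordinate system, Eq. (8)."""
    return Tk.T @ (U - U.mean(axis=1, keepdims=True))
```

The columns of the returned basis are orthonormal, which is exactly the orthonormality assumed for T in Equation (4).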
Alternatively, our framework may use the SIFT algorithm to detect local features in fish images. It consists of four steps [7]: (i) scale-space extrema detection;
(ii) keypoint localization; (iii) orientation assignment and
(iv) keypoint descriptor. In the first step, some interest
points (keypoints) are detected. For this, the image is convolved with Gaussian filters at different scales, and the differences of successive Gaussian-blurred images are then computed. Keypoints are selected as the maxima/minima of the Difference of Gaussians across multiple scales.
In the second step, candidate keypoints are localized and
eliminated if found to be unstable. In the third step,
each keypoint is assigned one or more orientations based
on local image gradient directions. The assigned orientation(s), scale and location for each keypoint enables
SIFT to construct a canonical view for the keypoint that
is invariant to similarity transforms [16]. The final step
computes descriptor vectors for these keypoints.
Specifically, a keypoint descriptor used by SIFT is
created by sampling the magnitudes and orientations of
the image gradient in the patch around the keypoint and
building orientation histograms to capture the relevant
aspects of the patch. Histograms contain 8 bins each, and each descriptor contains a 4 × 4 array of 16 histograms around the keypoint. This leads to a SIFT feature vector with 4 × 4 × 8 = 128 elements. This 128-element vector is then normalized to unit length to enhance invariance to changes in illumination.
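The 4 × 4 × 8 histogram layout can be sketched as follows. This toy version is our own simplification: it omits the Gaussian weighting, trilinear interpolation and gradient clipping of the full SIFT descriptor [7], but reproduces the 128-element, unit-length structure:

```python
import numpy as np

def toy_sift_descriptor(patch):
    """Simplified 4x4x8 orientation-histogram descriptor for a 16x16 patch.
    Illustrative only; not the full SIFT descriptor."""
    gy, gx = np.gradient(patch.astype(float))       # image gradients
    mag = np.hypot(gx, gy)                          # gradient magnitudes
    ori = np.arctan2(gy, gx) % (2 * np.pi)          # orientations in [0, 2*pi)
    desc = []
    for i in range(4):                              # 4 x 4 grid of subregions
        for j in range(4):
            m = mag[4*i:4*i+4, 4*j:4*j+4].ravel()
            o = ori[4*i:4*i+4, 4*j:4*j+4].ravel()
            hist, _ = np.histogram(o, bins=8, range=(0, 2*np.pi), weights=m)
            desc.append(hist)                       # 8 orientation bins per subregion
    v = np.concatenate(desc)                        # 4 * 4 * 8 = 128 elements
    n = np.linalg.norm(v)
    return v / n if n > 0 else v                    # unit length for illumination invariance
```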
The SIFT keypoint descriptor representation is designed
to avoid problems due to boundary effects [7]. Therefore,
smooth changes in location, orientation and scale do not
cause radical changes in the feature vector. Moreover, it

Figure 4. PCA feature space for the two components with highest values, computed by applying PCA to matrix U. In this case, U encodes color information of our image samples of fish conserved in formaldehyde solution. (a) Antibodies obtained by aiNet and (b) antibodies obtained by ARIA.

is a compact representation, expressing the patch of pixels using a 128-element vector.
To apply SIFT, we first convert RGB image samples to
grayscale images. Note that SIFT uses a one-dimensional
(1D) vector of scalar values as a local feature descriptor
and cannot be extended to operate on color images which
generally consist of three-dimensional (3D) vector values.
The main difficulty of applying SIFT to color images is
that no color space is able to use 1D scalar values to
represent colors [17]. In Figure 5, we show examples of keypoints extracted by the SIFT algorithm in some of our image samples.
B. Clustering
Clustering is a very useful task in pattern classification.
In this work, the goal of cluster analysis is to find natural
groupings in a set of features extracted from images of
individuals of different fish species, such that features in
the same cluster refer to individuals of the same species.
As a step toward this goal, our framework uses two immunological algorithms, specifically, an Artificial Immune
Network (aiNet) [9] and an Adaptive Radius Immune
Algorithm (ARIA) [10]. An Artificial Immune Network
is a bio-inspired computational model that uses concepts
from the immune network theory, mainly the interactions
among B-cells (stimulation and suppression), and the
cloning and mutation process. Several models have been
proposed for problem solving in areas such as pattern
classification.
Our framework uses the aiNet model of Artificial Immune Network, proposed in [9]. This model generates
a network of antibodies linked according to their affinity (Euclidean distance). A subset of the antibodies with the highest affinity with respect to a given antigen is selected

Figure 5. Examples of keypoints extracted by the SIFT algorithm in some image samples.

and cloned proportionally to the affinity. All generated clones are mutated in inverse proportion to their affinity. A fixed percentage of the clones is selected to become memory antibodies, by eliminating those whose affinity with the current antigen is less than a death threshold. If a pair of memory antibodies has an affinity greater than a suppression threshold, one of them is removed from the network [9]. For our problem of fish species classification, the set of antigens presented as input to aiNet corresponds to the estimated feature vectors.
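For concreteness, one pass of the clonal selection and suppression cycle just described might look like the following NumPy sketch. It is illustrative only: the parameter names are ours, and the clone count and mutation scale are fixed rather than affinity-proportional as in the full aiNet:

```python
import numpy as np

rng = np.random.default_rng(42)

def ainet_step(antigens, antibodies, n=4, sigma_d=1.0, sigma_s=0.1):
    """One simplified aiNet-style pass: for each antigen, clone and mutate the
    n closest antibodies, keep mutants within the death threshold, then
    suppress near-duplicate memory cells."""
    memory = []
    for ag in antigens:
        dist = np.linalg.norm(antibodies - ag, axis=1)
        best = antibodies[np.argsort(dist)[:n]]            # n highest-affinity antibodies
        clones = np.repeat(best, 3, axis=0)                # clonal expansion (fixed count here)
        mutants = clones + rng.normal(scale=0.05, size=clones.shape)  # affinity maturation
        keep = mutants[np.linalg.norm(mutants - ag, axis=1) < sigma_d]  # natural death
        memory.extend(keep)
    # network suppression: of any pair closer than sigma_s, keep only one
    kept = []
    for m in memory:
        if all(np.linalg.norm(m - k) >= sigma_s for k in kept):
            kept.append(m)
    return np.array(kept)
```

After the pass, every surviving pair of memory antibodies is at least sigma_s apart, mirroring the suppression rule of Algorithm 1.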
Unlike aiNet, the ARIA algorithm implements an adaptive antibody radius that captures the relative density information during a clustering task, thus making it possible
to preserve relative distances when generating a compact
representation of data. ARIA is an iterative procedure
that can be summarized into three main phases [10]:
(i) affinity maturation: the antigens are presented to the
antibodies, which suffer hypermutation in order to better
fit the antigens; (ii) clonal expansion: those antibodies that
are more stimulated are selected to be cloned and (iii)
network suppression: the interaction between the antibodies is quantified and if one antibody recognizes another,
one of them is removed from the pool of cells. The main
steps of aiNet and ARIA are described in Algorithms 1
and 2, respectively. The meanings and values used for
the parameters of Algorithms 1 and 2 are summarized in
Tables I and II, respectively.
C. Classification
Our framework determines the species of an individual
by using a simple nearest neighbor search. Therefore, Euclidean distances must first be computed between the feature vector(s) of the input image (antigens) and the feature vectors in the knowledge base (antibodies) that are associated with all species considered.
Specifically, when PCA is used, the species of an individual is determined as the one associated with the closest antibody (smallest Euclidean distance). On the other hand, when SIFT is applied to the input image,
an incremental approach is employed. In this case, given
an input image Iv , our method estimates the matches
between its keypoints (antigens) and the keypoints (antibodies) of the image samples of reference individuals.
We use the match measure suggested in [7], which is
based on the comparison of the distances of the closest

Algorithm 1

Artificial Immune Network (aiNet)

1. For iteration 1 to G do:


1.1 For each antigen xi , i = 1, . . . , p:
1.1.1 Determine its affinity with respect to each antibody of
a random set A.
1.1.2 Select a subset An of antibodies in A that is composed
of the n highest affinity antibodies.
1.1.3 Clone the n selected antibodies proportionally to their
antigenic affinity, generating a set C of clones.
1.1.4 Perform a directed affinity maturation of C, generating a mutated set C'.
1.1.5 Determine the antigenic affinities of the mutated antibodies in C'.
1.1.6 Re-select ζ% of the antibodies in C' with the highest affinities and put them into a matrix Am of clonal memory.
1.1.7 Eliminate from Am the memory clones whose affinities are higher than the natural death threshold σd.
1.1.8 Determine the affinity between pairs of memory clones.
1.1.9 Eliminate memory clones whose affinities are lower than the suppression threshold σs.
1.1.10 Concatenate the total antibody memory matrix with
the resultant clonal memory.
end;
1.2 Determine the affinity between pairs of memory antibodies.
1.3 Eliminate all the antibodies whose affinities are lower than σs.
1.4 Build the total antibody matrix.
end;

and second-closest neighbors. This match measure performs well because correct matches need to have the closest neighbor significantly closer than the closest incorrect match in order to achieve reliable matching.
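The closest/second-closest comparison can be implemented in a few lines. The sketch below uses our own naming, and the 0.8 acceptance threshold is an assumption (the paper does not state the value used):

```python
import numpy as np

def ratio_match(antigens, antibodies, ratio=0.8):
    """Match each query descriptor to its nearest reference descriptor,
    accepting the match only if the closest neighbor is significantly
    closer than the second-closest one."""
    matches = []
    for qi, q in enumerate(antigens):
        d = np.linalg.norm(antibodies - q, axis=1)   # Euclidean distances
        i1, i2 = np.argsort(d)[:2]                   # closest and second-closest
        if d[i1] < ratio * d[i2]:                    # distance-ratio test
            matches.append((qi, i1))
    return matches
```

Ambiguous queries, whose two nearest references are about equally far, are simply discarded rather than matched.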
In Figure 6, we illustrate how those estimated matches
are used to determine the species i of an individual, for
i = {1, 2, 3}. As a first step, we compute the number of
matches mij for each image pair (Iv , Iji ), where i denotes
the species considered and j denotes the image sample of
the reference individual of species i. Thus, for each species
i, we compute:
Mi = Σ_{j=1}^{N} mij,                            (9)

which represents the total number of matches between the antigens of Iv and the antibodies of the N image samples of the reference individual of species i. Finally, the individual in Iv is associated with the species i with the highest value Mi. The values of Mi can be displayed in a bar graph, as the one depicted in Figure 6. From this graph, it is possible to visualize the most probable species of the individual in Iv.
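Equation (9) and the final decision rule amount to summing the matches per species and taking the argmax. A minimal sketch, with the input layout being our own assumption:

```python
from collections import defaultdict

def classify(match_counts):
    """match_counts maps (species i, reference sample j) -> m_ij.
    Returns the species i maximizing M_i = sum_j m_ij, as in Eq. (9)."""
    totals = defaultdict(int)
    for (species, _j), m in match_counts.items():
        totals[species] += m                 # accumulate M_i over the N samples
    return max(totals, key=totals.get)       # most probable species
```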

Algorithm 2

Adaptive Radius Immune Algorithm (ARIA)

1. Initialize variables
2. For iteration 1 to G do:
2.1 For each antigen xi , i = 1, . . . , p, do:
2.1.1 Select the best matching antibody Ab ;
2.1.2 Mutate Ab with the mutation rate;
end;
2.2 Kill those antibodies that are not stimulated;
2.3 Clone those antibodies that recognize antigens located at
a distance larger than their radius R;
2.4 Calculate the local density for each Ab ;
2.5 Calculate the suppression threshold of each Ab, making RAb = r (Demax / De)^(1/Di);
2.6 Suppress antibodies giving survival priority for those with
smaller R;
2.7 Make E = mean(R);
2.8 If the current generation is greater than G/2:
2.8.1 Reduce the mutation rate by the decay constant;
end;
end;
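Step 2.5 is the heart of ARIA's density preservation: antibodies in dense regions receive smaller suppression radii, so more of them survive there. A one-function sketch, under our reading of the radius rule RAb = r (Demax / De)^(1/Di):

```python
import numpy as np

def aria_radii(local_density, r=0.01, dim=2):
    """Adaptive suppression radius per antibody (step 2.5 of Algorithm 2):
    R_Ab = r * (De_max / De) ** (1 / Di). Dense regions (large De) get
    smaller radii, preserving the relative density of the data."""
    De = np.asarray(local_density, dtype=float)
    return r * (De.max() / De) ** (1.0 / dim)
```

With r = 0.01 and Di = 2, an antibody whose local density is a quarter of the maximum receives twice the minimum radius.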

dataset used. In the second group of experiments, we tested


our framework on a dataset containing images of live fish
species, acquired at a prototype of a fish ladder. The main
goal was to demonstrate the applicability of our approach.
In Figure 3(g)-(i), we show some image samples of this
dataset.
Importantly, for each group of experiments, we tested the effectiveness of our framework by exploring the four possible combinations of feature extraction and clustering techniques, namely: PCA and aiNet, PCA and ARIA, SIFT and aiNet, and SIFT and ARIA. The classifier's overall accuracy was estimated as the ratio between the number of individuals correctly identified and the total number of image samples used for validation.
The values for the parameters of the immunological algorithms used in our experiments are summarized in Tables I and II. Note that, in the case of the ARIA algorithm, the initial radius R of the antibodies is set at random in the interval [0.01, 0.09]; coherent values for R are then automatically produced during the iterative adaptation procedure. The mutation rate is initially set to 1.
A. Group 1: Fish in Formaldehyde Solution
As a first test, we evaluated our framework by using
a dataset containing image samples of 6 fish species
(see Figure 3(a)-(f)), whose individuals were perfectly
conserved in formaldehyde solutions. For each species, we

III. EXPERIMENTAL RESULTS


In order to demonstrate the advantages and limitations
of our framework, we performed two main groups of
experiments with challenging datasets. The first group was
carried out with images of individuals perfectly conserved
in formaldehyde solutions, as performed in [6]. In this
case, two objectives were pursued: (1) to evaluate the
accuracy of our framework against significant variations
of the 3D fish orientation and (2) to analyze the behavior
of our framework in more detail, without the influence of water.
In Figure 3(a)-(f), we show some image samples of the

Figure 6. Classification based on the estimation of matches between SIFT keypoints.

Table I
PARAMETER VALUES OF AINET.

Parameter   Meaning                                              Value
G           Number of generations                                10
σd          Natural death threshold                              1
σs          Suppression threshold                                0.01
n           No. of cloned antibodies                             4
ζ           Percentage of the mature antibodies to be selected   0.2

considered 3 different individuals and, by using a device similar to the one proposed in [6], rotated those individuals from -40° to 40° in 10° steps, simulating their swimming characteristics, as described by fish biologists [6]. We acquired one image sample for each rotation angle of an individual, totaling 9 images per individual.
Therefore, in this group of experiments, we used a dataset with 162 image samples, that is, 6 species × 3 individuals per species × 9 images per individual = 162 images. When PCA was used, all 162 images were randomly divided into two subsets, S1 and S2. The subset S1 contained 108 images and was used to estimate the knowledge base, whereas experiments were conducted with the 54 images of S2. On the other hand, when SIFT was applied, the knowledge base was created by using the 9 image samples of each reference individual of all species considered. Therefore, to create the knowledge base we used 54 images, that is, 6 species × 1 reference individual per species × 9 images per reference individual = 54 images. The remaining 108 images were used to validate our framework.
By using PCA as our primary feature extraction technique, we extracted the first 10 principal components of highest variance from the images and evaluated their impact on the overall accuracy of our framework. In Figures 4(a) and 4(b), we illustrate the PCA feature space obtained by using the two principal components with highest values, which were computed by applying PCA to the U band of our image samples. Note that Figure 4(a) shows the antibodies computed by aiNet, while Figure 4(b) shows the antibodies computed by ARIA.
Comparing those figures, we observe that ARIA, differently from aiNet, is capable of capturing the relative density information in the feature space, leading to a more refined clustering result. This property of ARIA improved the overall accuracy of our framework by about 7%, as illustrated in Figure 7(a). This figure depicts the framework's change
Table II
PARAMETER VALUES OF ARIA.

Parameter   Meaning                                      Value
G           Number of generations                        10
R           Radius of each antibody                      Initially drawn in [0.01, 0.09]
r           Constant to determine the size of the        0.01
            smallest radius
–           Mutation rate                                Initially set to 1
–           Constant for decreasing the mutation rate    1
Di          Dimension of the input data                  10 or 128

in accuracy as the number of principal components k increases, for both the aiNet and ARIA algorithms. Note that, in both cases, the maximum classification accuracies (85% for aiNet and 92% for ARIA) were achieved when only the 3 principal components of highest variance were used. Moreover, the inclusion of further components decreased the framework's accuracy (to 69% for aiNet and 74% for ARIA when k = 10). These results are in agreement with the literature [18], which shows that the average error rate is strongly related to the number of features and image samples.
The framework's accuracy was also carefully evaluated when SIFT was used together with aiNet and ARIA. Figure 7(b) summarizes the accuracy results for each species considered. As in the former scenario, in which PCA was applied, the use of ARIA instead of aiNet brought better accuracy results. Specifically, by using SIFT and aiNet, the framework had an average accuracy of 83%, whereas the use of SIFT and ARIA led to an average result of 87%. In Figure 8(a), we illustrate a bar graph containing the numbers of matches between each image sample of a validation individual of the species Canivete and the images of the reference individuals of all species. Note that, in this case, the SIFT algorithm and the classification strategy described in Section 2.3 performed well for almost all image samples, since the numbers of matches with images of the reference individual of the species Canivete were higher in most cases.
The comparison of the results obtained by all the combinations of techniques reveals that the best combination was the one based on PCA and ARIA (overall accuracy of 92%). Unfortunately, the SIFT algorithm was strongly affected by the 3D rotation of individuals in our experiments, as illustrated in Figure 8(b). This figure shows the numbers of matches between the image of a validation individual of the species Canivete, rotated at 40°, and all the images of the reference individual of the same species. Note that the number of matches is higher when the rotation angles of the reference and validation individuals are the same. When the difference between those angles increases, the number of matches decreases dramatically.
B. Group 2: Fish in Vivo
To demonstrate the effectiveness of our framework, we
tested it on a real scenario. In this case, we have acquired
images of live fish swimming in a prototype of a fish
ladder. Four fish species were considered: Carpa, Surubim,
Pacu and Cascudo, and only one individual per species
was available. Figure 3(g)-(i) shows some image samples
used.
In this group of experiments, we used a dataset containing 48 image samples, that is, 4 species × 1 individual per species × 12 images per individual = 48 images. When PCA was used, the 48 images were randomly divided into two subsets, S1 and S2. The subset S1 contained 24 images and was used to estimate the knowledge base, whereas experiments were conducted

Figure 7. (a) Overall accuracy of the framework as a function of the number of principal components. (b) Framework's accuracy per fish species, when SIFT is applied to the images.

with the 24 images of S2 . On the other hand, when SIFT


was applied, the knowledge base was created by using 6
image samples of each individual of all species considered.
Therefore, to create the knowledge base we have used
24 images, while the remaining 24 images were used to
validate our framework.
The same methodology used in the first group of experiments was applied in the second one. By using PCA together with aiNet and ARIA, our framework behaved similarly to its use with the former dataset. That is, we observed maximum classification accuracies (92% for both aiNet and ARIA) when only the 3 principal components of highest variance were used, and the inclusion of further components decreased the framework's accuracy. However, differently from the first group, ARIA and aiNet presented very similar performances.
Importantly, even though the image samples were affected by water characteristics, such as turbidity, and the individuals presented arbitrary rotation and scale in the images, the framework's performance did not degrade as we expected. This result may be explained by the fact that we used a small dataset, in which each species was represented by only one individual. We believe that a more representative dataset should be used to demonstrate the actual classification accuracy of our framework in such a scenario.
The use of SIFT together with aiNet and ARIA was also evaluated. Using SIFT and aiNet, the framework achieved an average accuracy of 75%, whereas SIFT and ARIA led to an average of 79%. Apparently, the SIFT algorithm was strongly affected by the real-scenario characteristics already described. Note that the classification accuracy decreased by about 8% with respect to the results obtained in the first group. Again, a more representative dataset should be created and used, so that the effectiveness of our framework can be better evaluated.
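A SIFT-based classifier of this kind typically votes by counting descriptor matches against each reference individual. The sketch below illustrates the standard nearest/second-nearest ratio test on synthetic 128-D descriptors; the data and names are hypothetical, and a real system would extract actual SIFT keypoints (e.g. with an image library) rather than random vectors.

```python
import numpy as np

def ratio_match(desc_q, desc_r, ratio=0.8):
    """Count SIFT-style matches using the nearest/second-nearest ratio test:
    a query descriptor matches only if its best match is clearly better
    than its second-best one."""
    n = 0
    for d in desc_q:
        dists = np.linalg.norm(desc_r - d, axis=1)
        i1, i2 = np.argsort(dists)[:2]
        if dists[i1] < ratio * dists[i2]:
            n += 1
    return n

# Toy 128-D descriptor sets: reference species A and B,
# with the query drawn from species A plus small noise.
rng = np.random.default_rng(1)
ref_a = rng.normal(0, 1, (30, 128))
ref_b = rng.normal(0, 1, (30, 128))
query = ref_a[:15] + rng.normal(0, 0.05, (15, 128))

votes = {"A": ratio_match(query, ref_a), "B": ratio_match(query, ref_b)}
print(max(votes, key=votes.get))  # the species with most matches wins
```

Because unrelated high-dimensional descriptors tend to be roughly equidistant, the ratio test suppresses spurious matches against the wrong reference set, which is why the vote concentrates on the correct species.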
IV. C ONCLUSIONS
This work describes a novel combination of existing techniques applied to the problem of automatic fish species classification. The classification accuracies obtained in our work are as significant as those reported in [6], [11], [13], suggesting that our framework can be used successfully.
Currently, we are developing a preprocessing and tracking component that will output segmented fish images to be classified using the methods described in this paper.
goal is to develop a complete system that automatically
detects, tracks, counts and classifies fish in underwater
video. Future work includes testing the framework on
more representative datasets of underwater scenes and
performing a careful statistical analysis of the behavior
of our clustering techniques.
ACKNOWLEDGMENT
The authors thank the support of FAPEMIG-Brazil under Proc. EDT-162/07 and CEFET-MG under Proc. PROPESQ-023-076/09.
R EFERENCES
[1] M. Bowen, S. Marques, L. Silva, V. Vono, and H. Godinho, "Comparing on Site Human and Video Counts at Igarapava Fish Ladder, Southeastern Brazil," Neotropical Ichthyology, vol. 4, pp. 291–294, 2006.
[2] D. R. Fernandez, A. A. Agostinho, and L. M. Bini, "Selection of an Experimental Fish Ladder Located at the Dam of the Itaipu Binacional, Paraná River, Brazil," Brazilian Archives of Biology and Technology, vol. 47, no. 4, pp. 579–586, 2004.
[3] D. Chan, S. Hockaday, R. Tillett, and L. Ross, "A Trainable N-Tuple Pattern Classifier and its Application for Monitoring Fish Underwater," in International Conference on Image Processing and its Applications, 1999, pp. 255–259.
[4] D. Hoggarth, S. Abeyasekera, R. Arthur, and J. Beddington, "Stock Assessment for Fishery Management: A Framework Guide to the Stock Assessment Tools of the Fisheries Management Science Programme," FAO Fisheries Technical Paper, Rome, Tech. Rep. 487, 2006.
[5] J. Pereiro, "Assessment and Management of Fish Populations: A Critical View," Scientia Marina, vol. 59, no. 3, pp. 653–660, 1995.
[6] M. Nery, A. Machado, M. Campos, F. Padua, R. Carceroni, and J. Queiroz-Neto, "Determining the Appropriate Feature Set for Effective Fish Classification Tasks," in SIBGRAPI, 2005, pp. 173–180.
[7] D. G. Lowe, "Distinctive Image Features from Scale-Invariant Keypoints," IJCV, vol. 60, no. 2, pp. 91–110, 2004.
Figure 8. (a) Bar graph containing the numbers of matches between each image sample of a validation individual of species Canivete and the images of reference individuals of all species. (b) Numbers of matches between the image of a validation individual of species Canivete, rotated by 40°, and all the images of the reference individual of the same species.
[8] K. Pearson, "On Lines and Planes of Closest Fit to Systems of Points in Space," Philosophical Magazine, vol. 2, pp. 559–572, 1901.
[9] L. de Castro and F. Von Zuben, "aiNet: An Artificial Immune Network for Data Analysis," Data Mining: A Heuristic Approach, pp. 231–259, 2001.
[10] G. B. Bezerra, T. V. Barra, L. N. Castro, and F. J. V. Zuben, "Adaptive Radius Immune Algorithm for Data Clustering," Artificial Immune Systems, LNCS, vol. 3627, pp. 290–303, 2005.
[11] A. G. Cabreira, M. Tripode, and A. Madirolas, "Artificial Neural Networks for Fish-Species Identification," ICES Journal of Marine Science, vol. 4, pp. 291–294, 2009.
[12] M. T. A. Rodrigues, F. L. C. Pádua, and R. M. Gomes, "Classificação de Espécies de Peixes Baseada em Sistemas Imunológicos Artificiais e Análise de Componentes Principais," in CBA, 2008, pp. 61–66.
[13] A. Rova, G. Mori, and L. M. Dill, "One Fish, Two Fish, Butterfish, Trumpeter: Recognizing Fish in Underwater Video," in IAPR Conf. on Mach. Vis. App., 2007, pp. 404–407.
[14] D. J. Lee, S. Redd, R. Schoenberger, X. Xiaoqian, and Z. Pengcheng, "An Automated Fish Species Classification and Migration Monitoring System," in Conf. of the IEEE Industrial Electronics Society, 2003, pp. 1080–1085.
[15] S. Cadieux, F. Lalonde, and F. Michaud, "Intelligent System for Automated Fish Sorting and Counting," IEEE IROS, pp. 1279–1284, 2000.
[16] Y. Ke and R. Sukthankar, "PCA-SIFT: A More Distinctive Representation for Local Image Descriptors," in IEEE CVPR, 2004, pp. 506–513.
[17] Y. Chang, D. J. Lee, Y. Hong, and J. Archibald, "Unsupervised Video Shot Detection Using Clustering Ensemble with a Color Global Scale-Invariant Feature Transform Descriptor," Journal on Image and Video Processing, vol. 2, no. 24, 2008.
[18] R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis. New York, NY: John Wiley & Sons, 1973.
