ISSN (Online): 2320-9801
ISSN (Print): 2320-9798
Second National Conference on Emerging Trends and Intelligence Technologies [ETIT 2015]
On 3rd October 2015, Organized by
Dept. of CSE, Anand Institute Of Higher Technology, Kazhipathur, Chennai-603103, India
ABSTRACT: Automatic clothing pattern recognition and matching costumes with appropriate colors is a challenging task for visually impaired people. We developed a camera-based prototype system for recognizing clothing patterns such as plaid, striped, pattern-less, and irregular; it also identifies 11 clothing colors. A camera mounted upon a pair of sunglasses is used to capture the clothing images. We propose a novel Radon Signature descriptor to capture the global directionality of clothing patterns. SIFT represents the local structural features, and the STA descriptor extracts global statistical features from wavelet sub-bands; these are combined with the local features to recognize complex clothing patterns. To evaluate efficiency, we use the CCNY clothing pattern dataset, which contains 627 images. To assist blind persons in reading the text written on clothes, we have conceived a camera-based assistive text reading framework to track the object of interest within a rectangular bounding box and extract printed text information from the clothes. Our Optical Character Recognition (OCR) system can effectively handle complex backgrounds and multiple patterns and extract text information from the costumes. This prototype system supports more independence in the daily lives of visually challenged people.
KEYWORDS: Radon Signature, SIFT (Scale-Invariant Feature Transform), STA (Statistical Descriptor), CCNY (City College of New York) Dataset, Assistive Text Reading Framework, OCR (Optical Character Recognition), Local and Global Image Features.
I. INTRODUCTION
Large intra-class variations in the patterns and designs of clothes make it difficult to recognize clothing patterns, so we employ an automatic camera-based clothing pattern recognition system. Existing texture analysis methods mainly focus on textures with large changes in viewpoint, orientation, and scale, but with less intra-class pattern and intensity variation. The system contains three major components: 1) sensors, including a camera to capture the clothing images, a microphone for speech command input, and speakers (or a Bluetooth earphone) for audio output; 2) data capture and analysis, which performs command control, clothing pattern recognition, and color identification on a computer that can be a desktop or a wearable computer (e.g., a mini-computer or a smartphone); and 3) audio output, which provides the recognition results of clothing patterns and colors as well as the system status.
Copyright @ IJIRCCE
www.ijircce.com
II. RELATED WORK
Some clothing patterns are visual patterns characterized by the repetition of a few basic primitives (e.g., plaids or stripes). Local texture features are effective for extracting the structural information of such repetitive primitives. Global features, including the directionality and statistical properties of the clothing patterns, are more stable within the same category. The Radon Signature, the statistical descriptor (STA), and the scale-invariant feature transform (SIFT) provide the extraction of these global and local texture features.
Fig 2.1 (a) An intensity image of a clothing pattern. (b) Radon transform performed on the maximum disk area within the gradient map. (c) Result of the Radon transform. (d) Feature vector of the Radon Signature.
B. STATISTICS OF WAVELET SUB BANDS
The discrete wavelet transform (DWT) provides a generalized multiresolution spectral analysis, decomposing an image into a low-frequency approximation channel and high-frequency detail channels. We therefore extract statistical features from the wavelet sub-bands to capture the global statistical information of images at different scales. Each decomposition level yields four wavelet sub-bands: the approximation component and the horizontal, vertical, and diagonal detail components. Four statistical values calculated in each wavelet sub-band are concatenated to form the final descriptor.
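As a concrete illustration, this sub-band statistics step can be sketched with a one-level 2-D Haar decomposition in plain NumPy. The particular four statistics used here (mean, standard deviation, energy, and maximum absolute value) are an assumption, since the text does not name them:

```python
import numpy as np

def haar_subband_stats(image):
    """One-level 2-D Haar DWT followed by four statistics per sub-band,
    a sketch of the STA descriptor (the choice of statistics is assumed)."""
    img = image.astype(float)
    # pairwise averages/differences along rows, then along columns
    lo_r = (img[:, 0::2] + img[:, 1::2]) / 2.0
    hi_r = (img[:, 0::2] - img[:, 1::2]) / 2.0
    LL = (lo_r[0::2] + lo_r[1::2]) / 2.0   # approximation
    LH = (lo_r[0::2] - lo_r[1::2]) / 2.0   # horizontal detail
    HL = (hi_r[0::2] + hi_r[1::2]) / 2.0   # vertical detail
    HH = (hi_r[0::2] - hi_r[1::2]) / 2.0   # diagonal detail
    stats = []
    for band in (LL, LH, HL, HH):
        stats += [band.mean(), band.std(), (band ** 2).mean(), np.abs(band).max()]
    return np.asarray(stats)   # 4 sub-bands x 4 statistics = 16 values

desc = haar_subband_stats(np.arange(64.0).reshape(8, 8))
```

Applying the same decomposition to the LL band recursively would give the multiple decomposition levels described above.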
C. SCALE INVARIANT FEATURE TRANSFORM
Detectors are used to detect interest points by searching for local extrema in a scale space. Descriptors are employed to compute representations of the interest points based on their associated support regions.
In this paper, evenly sampled uniform grid points are used as interest points; they are then represented by SIFT descriptors, which perform well in the context of image matching. The bag-of-words (BOW) method is further applied to aggregate the extracted SIFT descriptors by labeling each descriptor as a visual word and counting the frequency of each visual word. The local feature representation of an image is thus the histogram of its quantized SIFT descriptors. We apply L2-norm and inverse document frequency (IDF) normalization to the BOW histograms.
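The aggregation and normalization steps above can be sketched in NumPy; the vocabulary size and the IDF smoothing constants are illustrative assumptions:

```python
import numpy as np

def bow_histograms(word_ids_per_image, vocab_size):
    """Aggregate quantized SIFT labels into BOW histograms, then apply
    IDF weighting and per-image L2 normalization (illustrative sketch)."""
    hists = np.zeros((len(word_ids_per_image), vocab_size))
    for i, words in enumerate(word_ids_per_image):
        hists[i] = np.bincount(words, minlength=vocab_size)
    # IDF: down-weight visual words that occur in many images
    df = (hists > 0).sum(axis=0)
    idf = np.log((1 + len(word_ids_per_image)) / (1 + df))
    hists *= idf
    # L2 normalization per image
    norms = np.linalg.norm(hists, axis=1, keepdims=True)
    return hists / np.maximum(norms, 1e-12)

hists = bow_histograms([np.array([0, 0, 2]), np.array([1, 2, 2])], vocab_size=4)
```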
D. TEXT RECOGNITION AND AUDIO OUTPUT
Stroke orientation and edge distribution are used to extract text features from the complex background. The cascade AdaBoost classifier confirms the existence of text information in an image patch, but it cannot do so for whole images. Text information in an image usually appears as horizontal text strings containing no fewer than three characters.
Text recognition is performed by off-the-shelf OCR prior to outputting informative words from the localized text regions. A text region is first bounded by the minimum rectangular area that accommodates the characters inside it, so that the border of the text region touches the edge boundary of the text characters. The OCR engine then assigns proper margin areas to the text regions and binarizes them to segment the text characters from the background. The recognized text codes are recorded in script files. We then employ the Microsoft Speech Software Development Kit to load the script files and present an audio output of the text information to the user. Blind users can adjust speech rate, volume, and tone according to their preferences.
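The margin and binarization step described above can be sketched as follows; the margin width is an assumption, and Otsu's threshold is used as one plausible binarization choice:

```python
import numpy as np

def prepare_for_ocr(patch, margin=10):
    """Pad a localized text region with margins and Otsu-binarize it,
    the preprocessing described before handing the patch to an OCR
    engine (sketch; the margin width is an assumption)."""
    padded = np.pad(patch, margin, mode='edge')
    # Otsu's method: choose the threshold maximizing between-class variance
    hist, _ = np.histogram(padded, bins=256, range=(0, 256))
    total = padded.size
    cum = np.cumsum(hist)
    cum_mean = np.cumsum(hist * np.arange(256))
    best_t, best_var = 0, -1.0
    for t in range(1, 255):
        w0, w1 = cum[t], total - cum[t]
        if w0 == 0 or w1 == 0:
            continue
        m0 = cum_mean[t] / w0
        m1 = (cum_mean[255] - cum_mean[t]) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return (padded > best_t).astype(np.uint8) * 255

patch = np.array([[30, 40, 200], [35, 210, 220]], dtype=np.uint8)
binary = prepare_for_ocr(patch, margin=2)
```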
III. ARCHITECTURE DIAGRAM
A camera mounted upon a pair of sunglasses captures the clothing image, which is maintained in the CCNY pattern dataset.
The captured image is converted to a grayscale matrix, and the matrix data are plotted for histogram analysis.
The image is filtered under various illumination conditions such as scaling, rotation, and viewpoint orientation.
For feature representation, 3-D transformations such as non-rigid surface deformation are applied to extract the local image features.
A texton dictionary is generated by clustering the extracted local features.
A camera-based assistive text reading framework tracks the object of interest within a rectangular bounding box and extracts printed text information from the clothes.
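The texton dictionary step can be sketched with plain k-means over the local feature vectors; the dictionary size, iteration count, and the deterministic magnitude-spread initialization are all assumptions for illustration:

```python
import numpy as np

def build_texton_dictionary(features, k=8, iters=20):
    """Cluster extracted local feature vectors into a texton dictionary
    with plain k-means (sketch; k and the initialization are assumed)."""
    # deterministic init: spread initial centers across feature magnitudes
    order = np.argsort(np.linalg.norm(features, axis=1))
    step = max(1, len(features) // k)
    centers = features[order[::step]][:k].astype(float)
    for _ in range(iters):
        # assign every feature to its nearest texton center
        dists = np.linalg.norm(features[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(len(centers)):
            members = features[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers, labels

centers, labels = build_texton_dictionary(
    np.vstack([np.zeros((10, 2)), np.full((10, 2), 10.0)]), k=2)
```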
The Optical Character Recognition (OCR) system uses a portable label reader technique to handle complex backgrounds and multiple patterns and to extract text information from clothes. Using the Flite library, audio output is produced from the text read from the image by the portable camera-based label reader.
Clothing patterns and colors provide mutually complementary information. If the dominant colors present in a pair of clothing images are the same, the two clothing images are determined to be color matched. The proposed color identification method achieves 99% matching accuracy in the experimental evaluation.
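A minimal sketch of dominant-color matching follows. The uniform per-channel quantization here is an assumption, a stand-in for the system's 11-color naming scheme:

```python
import numpy as np

def dominant_color_bin(image, levels=4):
    """Return the dominant quantized RGB bin of an image (sketch;
    the `levels`-per-channel quantization is an assumption, not the
    system's actual 11-color scheme)."""
    q = (image.reshape(-1, 3) // (256 // levels)).astype(int)
    bins = q[:, 0] * levels * levels + q[:, 1] * levels + q[:, 2]
    return int(np.bincount(bins, minlength=levels ** 3).argmax())

def colors_match(img_a, img_b):
    # two clothing images are declared color-matched when their
    # dominant colors fall into the same quantized bin
    return dominant_color_bin(img_a) == dominant_color_bin(img_b)

red_a = np.zeros((8, 8, 3), dtype=np.uint8); red_a[..., 0] = 200
red_b = np.zeros((8, 8, 3), dtype=np.uint8); red_b[..., 0] = 210
blue = np.zeros((8, 8, 3), dtype=np.uint8); blue[..., 2] = 220
```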
IV. FEATURE EXTRACTION
4.1 Radon Signature
The Radon transform of an image f(x, y) is

R(r, \theta) = \iint f(x, y)\, \delta(x \cos\theta + y \sin\theta - r)\, dx\, dy,    (1)

where r is the perpendicular distance of a projection line to the origin and \theta is the angle of the projection line, as shown in Fig. 2.1(b). To reduce the effect of intensity variations on the projections, we use the Sobel operator to compute the gradient map and apply the Radon transform to it in place of f(x, y). The directionality of an image can be represented by Var(r, \theta_i), the variance of the projections under a certain projection direction \theta_i:

Var(r, \theta_i) = \frac{1}{N} \sum_{j=0}^{N-1} \left( R(r_j, \theta_i) - \mu(r, \theta_i) \right)^2,    (2)

\mu(r, \theta_i) = \frac{1}{N} \sum_{j=0}^{N-1} R(r_j, \theta_i),    (3)

where R(r_j, \theta_i) is the projection value at perpendicular distance r_j and projection direction \theta_i, \mu(r, \theta_i) is the expected value of R(r, \theta_i), and N is the number of sampling bins in each projection line. The Radon Signature is formed by the variances under all sampling projection directions:

RadonSig = [Var(r, \theta_0), Var(r, \theta_1), ..., Var(r, \theta_{T-1})],

where T is the number of sampling projection directions.
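The Radon Signature computation can be sketched as follows. The rotate-and-sum projection is a simple stand-in for a true Radon transform, and the number of sampling directions is an assumption:

```python
import numpy as np
from scipy import ndimage

def radon_signature(image, num_angles=36):
    """Variance of the projections of the gradient map over sampled
    directions, as in Eqs. (1)-(3) (sketch; projections are taken by
    rotating the image and summing columns)."""
    # Sobel gradient map reduces sensitivity to intensity variations
    gx = ndimage.sobel(image, axis=1)
    gy = ndimage.sobel(image, axis=0)
    grad = np.hypot(gx, gy)
    sig = []
    for theta in np.linspace(0.0, 180.0, num_angles, endpoint=False):
        rotated = ndimage.rotate(grad, theta, reshape=False, order=1)
        projection = rotated.sum(axis=0)   # R(r, theta): sum along lines
        sig.append(projection.var())       # Var(r, theta_i)
    return np.asarray(sig)

# a striped image should show a dominant peak in the signature
stripes = np.tile(np.array([0.0, 0.0, 1.0, 1.0] * 16), (64, 1))
sig = radon_signature(stripes)
```

For these vertical stripes, the variance peaks when the projection direction aligns with the stripe edges and nearly vanishes when perpendicular to them, matching the dominant-peak behavior described below.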
Plaid patterns have two principal orientations (two dominant peaks in the Radon Signature), and striped ones have one principal orientation. Pattern-less and irregular images have no dominant direction, but the directionality of an irregular image presents much larger variation than that of a pattern-less image.
4.2 Discrete Wavelet Transform
The discrete wavelet transform (DWT) decomposes the clothing image and provides a multiresolution spectral analysis of images at different scales. The DWT is a linear transformation that operates on a data vector whose length is an integer power of two, transforming it into a numerically different vector of the same length. It acts as a tool to separate the data into different frequency components so that each component can be studied with a resolution matched to its scale. The main feature of the DWT is the multiscale representation of a function: using wavelets, a function can be analyzed at various levels of resolution.
4.3 Scale Invariant Feature Transform
The scale-invariant feature transform (SIFT), used for image matching, is robust to variation in illumination. SIFT collects features extracted from images that enable reliable matching of the same object from different perspectives. The extracted features are invariant to scale and orientation and are highly distinctive. The first step computes the locations of potential interest points in the image by detecting the local maxima and minima of a set of Difference-of-Gaussian (DoG) filters applied at different scales. Then these locations are refined by discarding points of low contrast.
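This first SIFT stage can be sketched as follows; the sigma values and the contrast threshold are assumptions, and the full SIFT pipeline also includes edge-response rejection and sub-pixel refinement not shown here:

```python
import numpy as np
from scipy import ndimage

def dog_keypoints(image, sigmas=(1.0, 1.6, 2.6, 4.2), threshold=0.02):
    """Locate candidate interest points as local extrema in a small
    Difference-of-Gaussian stack, then discard low-contrast responses
    (simplified sketch of the first SIFT stage)."""
    blurred = [ndimage.gaussian_filter(image.astype(float), s) for s in sigmas]
    dog = np.stack([b - a for a, b in zip(blurred, blurred[1:])])
    # a point is an extremum if it equals the max or min of its 3x3x3 block
    maxima = dog == ndimage.maximum_filter(dog, size=3)
    minima = dog == ndimage.minimum_filter(dog, size=3)
    extrema = (maxima | minima) & (np.abs(dog) > threshold)
    return np.argwhere(extrema)  # rows of (scale index, row, col)

img = np.zeros((32, 32))
img[14:18, 14:18] = 1.0  # a single bright blob
kps = dog_keypoints(img)
```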
V. SYSTEM INTERFACE DESIGN
The camera-based clothing pattern recognition system to aid blind people integrates a camera, a microphone, a computer, and a Bluetooth earpiece for audio description of the clothing pattern and color information. A camera mounted upon a pair of sunglasses is used to capture the clothing images. The clothing patterns and colors are described to blind users by a verbal display that causes minimal distraction to hearing. The system can be controlled by speech input from a microphone: to interact with blind users, speech commands enable function selection and system control.
The interface design, which includes the basic functions and high-priority commands, is shown in Fig 5.1.
Fig 5.1 System interface design for the proposed camera-based clothing pattern recognition system using speech commands.
Basic functions: A blind user can perform basic functions to recognize the clothing pattern using a speech command or by clicking a button. The recognition results are announced to the blind user as audio output, such as recognized, not recognized, or start a new function. The recognized function includes next-level functions: the recognized clothing pattern and the dominant colors present in the cloth are spoken to the user, the repeat-result function repeats the recognized clothing pattern result, and the save-result function stores the clothing image with its associated pattern and color information on the computer.
High-priority commands: A blind user can set the system configuration by speaking several high-priority commands such as restart system, turn off system, stop function (i.e., abort the current task), adjust speaker volume, speech-adjustment controls (e.g., louder, quieter, slower, and faster), and help. The high-priority commands can be used at any time. If the user enables the help option through a speech command, the clothing pattern recognition system responds with the options associated with the current function. Bone-conduction earphones or small wireless Bluetooth speakers can be employed to protect the privacy of the recognition results and to minimize background sounds. The battery level is checked automatically, and an audio warning is provided if it is low.
Audio output: The operating system speech facility of modern portable computers and smartphones is utilized for the audio display. We currently use the Microsoft Speech Software Development Kit, which supports a variety of scripts. The system configuration options vary according to user preference, such as speech rate, volume, and voice gender.
VI. CONCLUSION
To evaluate the performance of the system, we maintain two datasets:
1) the CCNY clothing pattern dataset, which presents large intra-class variations, and
2) the UIUC texture dataset, used to validate the generalization of the multiresolution spectral analysis.
The CCNY clothing pattern dataset includes 627 images in four categories, plaid, striped, pattern-less, and irregular, with 156, 157, 156, and 158 images respectively. The UIUC texture dataset contains 1000 uncalibrated and unregistered images; we maintain 25 texture classes with 40 images per class. The texture images present rotation, scaling, viewpoint change, and non-rigid surface deformation under various lighting conditions.
In our implementation, the training set is selected as a fixed-size random subset of each class, and all remaining images are used as the testing set. To eliminate the dependence of the results on the particular training images used, the system reports the average of the classification rates obtained over the randomly selected training sets. Recognition performance is measured by the average classification accuracy. A combination of multiple features may obtain better results than any individual feature channel. This system provides new functions, such as the high-priority commands, and performs basic functions to improve the quality of life of blind and visually impaired people.
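The evaluation protocol above can be sketched as follows; the nearest-centroid classifier here is only a placeholder for the system's actual classifier, and the subset sizes are assumptions:

```python
import numpy as np

def average_accuracy(features, labels, train_per_class=20, trials=10, seed=0):
    """Average classification accuracy over randomly chosen fixed-size
    training subsets per class, mirroring the evaluation protocol
    (sketch; nearest-centroid classification is a placeholder)."""
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    accs = []
    for _ in range(trials):
        # fixed-size random training subset of each class
        train_idx = np.concatenate([
            rng.choice(np.flatnonzero(labels == c), train_per_class, replace=False)
            for c in classes])
        test_mask = np.ones(len(labels), dtype=bool)
        test_mask[train_idx] = False
        # class centroids estimated from the random training subset
        centroids = np.stack([
            features[train_idx][labels[train_idx] == c].mean(axis=0)
            for c in classes])
        dists = np.linalg.norm(
            features[test_mask][:, None] - centroids[None], axis=2)
        preds = classes[dists.argmin(axis=1)]
        accs.append((preds == labels[test_mask]).mean())
    return float(np.mean(accs))

feats = np.vstack([np.zeros((30, 2)), np.full((30, 2), 10.0)])
labs = np.array([0] * 30 + [1] * 30)
acc = average_accuracy(feats, labs, train_per_class=5, trials=3)
```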