1,2 Dept. of Electronics and Communication Engineering, IMS Engineering College, INDIA
---------------------------------------------------------------------***--------------------------------------------------------------------
Abstract - To give blind people an audible representation of
their environment, this project focuses on assistive devices
for visually impaired people. It converts visual data, by means
of image and video processing, into an alternate rendering
modality that is appropriate for a blind user. The alternate
modality can be auditory, haptic, or a combination of both.
The project therefore uses artificial intelligence for modality
conversion, from the visual modality to another.
© 2018, IRJET | Impact Factor value: 6.171 | ISO 9001:2008 Certified Journal | Page 601
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 05 Issue: 03 | Mar-2018 www.irjet.net p-ISSN: 2395-0072
video modes, as well as still capture. It attaches via a 15 cm ribbon cable to the CSI port on the Raspberry Pi.

the biological neural networks. An ANN is based on a collection of connected units or nodes called artificial neurons. Each connection (analogous to a synapse) between artificial neurons can transmit a signal from one to another. The artificial neuron that receives the signal can process it and then signal the artificial neurons connected to it.
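The signal flow between artificial neurons described above can be illustrated with a minimal sketch. It assumes each neuron computes a weighted sum of its inputs followed by a sigmoid activation; the activation choice, the weights, the bias values, and the function names are illustrative assumptions and are not taken from the paper.

```python
import math

def sigmoid(x):
    """Squash a real value into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def neuron_output(inputs, weights, bias):
    """Process incoming signals: weighted sum, then sigmoid activation."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(total)

# Hypothetical values: two input signals feed one neuron,
# and its output is passed on as the signal to a second neuron.
hidden = neuron_output([1.0, 0.5], [0.4, -0.6], bias=0.1)
final = neuron_output([hidden], [1.5], bias=-0.5)
print(hidden, final)
```

Each weight plays the role of a connection (the synapse analogue): it scales the signal that one neuron transmits to the next.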
We use CIFAR-10 classification to classify 32x32-pixel RGB images. CIFAR-10 was selected because it is complex enough to exercise much of TensorFlow's ability to scale to large models. At the same time, the model is small enough to train fast, which is ideal for trying out new ideas and experimenting with new techniques.

Model Architecture

The CIFAR-10 model is a multi-layer architecture consisting of alternating convolutions and nonlinearities. These layers are followed by fully connected layers leading into a softmax classifier. This model achieves a peak performance of about 86% accuracy within a few hours of training time on a GPU. It consists of 1,068,298 learnable parameters and requires about 19.5M multiply-add operations to compute inference on a single image. Using the CIFAR-10 model, the images are processed as follows:

Pooling layers, which downsample the image data extracted by the convolutional layers to reduce the dimensionality of the feature map and so decrease processing time. We used the max pooling algorithm, which extracts sub-regions of the feature map (e.g., 2x2-pixel tiles), keeps their maximum value, and discards all other values.

The most common form of pooling is max pooling, where we take a filter of size F*F and apply the maximum operation over each F*F-sized part of the image.
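The max pooling step described above can be sketched in plain Python. This is a minimal illustration and not the paper's TensorFlow implementation: the function name `max_pool_2d`, the list-of-rows representation, and the sample values are assumptions made for the example. For an input of width W, filter size F, and stride S, the sketch uses the standard output-size relation (W - F)/S + 1.

```python
def max_pool_2d(feature_map, size=2, stride=2):
    """Max-pool a 2-D feature map given as a list of equal-length rows."""
    height, width = len(feature_map), len(feature_map[0])
    # Standard output-size relation: (W - F) / S + 1 along each dimension.
    out_h = (height - size) // stride + 1
    out_w = (width - size) // stride + 1
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            top, left = i * stride, j * stride
            # Collect the size x size tile and keep only its maximum value.
            window = [feature_map[r][c]
                      for r in range(top, top + size)
                      for c in range(left, left + size)]
            row.append(max(window))
        pooled.append(row)
    return pooled

# Hypothetical 4x4 feature map, pooled with 2x2 tiles and stride 2.
feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
pooled = max_pool_2d(feature_map)
print(pooled)  # [[6, 4], [8, 9]]
```

With a 2x2 filter and a stride of 2, the 4x4 input becomes a 2x2 output, so each spatial dimension is halved.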
Pooling is most commonly done with a filter of size 2*2 and a stride of 2. As can be calculated from the formula above, this halves the width and height of the input.

Calculate loss: for both training and evaluation, we need to define a loss function that measures how closely the model's predictions match the target classes. To calculate cross entropy, we use the one-hot encoding technique. A one-hot code is a group of bits in which the only legal combinations of values are those with a single high (1) bit and all the others low (0). The cross-entropy loss is the cost that is minimized to reach the optimum values of the weights.
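The loss calculation described above can be sketched as follows. This is a minimal example under stated assumptions: it uses three classes instead of CIFAR-10's ten for brevity, the logits are made-up values, and the helper names `one_hot`, `softmax`, and `cross_entropy` are illustrative rather than the paper's code.

```python
import math

def one_hot(label, num_classes):
    """Return a vector with a single 1.0 at the label index, 0.0 elsewhere."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, probs, eps=1e-12):
    """Cross entropy between a one-hot target and predicted probabilities."""
    return -sum(t * math.log(p + eps) for t, p in zip(target, probs))

# Hypothetical classifier scores for one image with three classes.
logits = [2.0, 1.0, 0.1]
probs = softmax(logits)
target = one_hot(0, 3)           # true class is index 0
loss = cross_entropy(target, probs)
print(probs, loss)
```

Because the target is one-hot, only the probability assigned to the true class contributes to the loss, so the loss shrinks as that probability approaches 1.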
Dense (fully connected) layers, which perform classification on the features extracted by the convolutional layers and downsampled by the pooling layers. In a dense layer, every node in the layer is connected to every node in the preceding layer.

Dense Layer 1: 1,024 neurons, with a dropout regularization rate of 0.4 (probability of 0.4 that any given element will be dropped during training)

3. CONCLUSIONS

The goal of this research is to provide better image processing using artificial intelligence. Using a CNN image classifier, we predict the correct answer with an accuracy of more than 90%. By doing so we achieved a state-of-the-art result on the CIFAR-10 dataset. We also used the trained model on real-time images and obtained the correct labels.
3) Raspberry Pi User Guide; Eben Upton and Gareth Halfacree; 312 pages; 2014; ISBN 9781118921661.

4) Xilinx. "HDL Synthesis for FPGAs Design Guide". Section 3.13: "Encoding State Machines". Appendix A: "Accelerate FPGA Macros with One-Hot Approach". 1995.

5) Hinton, G.; Deng, L.; Yu, D.; Dahl, G.; Mohamed, A.; Jaitly, N.; Senior, A.; Vanhoucke, V.; Nguyen, P.; Sainath, T.; Kingsbury, B. (2012). "Deep Neural Networks for Acoustic Modeling in Speech Recognition: The shared views of four research groups". IEEE Signal Processing Magazine. 29 (6): 82-97. doi:10.1109/msp.2012.2205597

6) R. J. Williams and D. Zipser. Gradient-based learning algorithms for recurrent networks and their computational complexity. In Back-propagation: Theory, Architectures, and Applications. Hillsdale, NJ: Erlbaum, 1994

7) Hilbe, Joseph M. (2009), Logistic Regression Models, CRC Press, p. 3, ISBN 9781420075779.

8) Zoph, Barret; Le, Quoc V. (2016-11-04). "Neural Architecture Search with Reinforcement Learning". arXiv:1611.01578

9) Hope, Tom; Resheff, Yehezkel S.; Lieder, Itay (2017-08-09). Learning TensorFlow: A Guide to Building Deep Learning Systems. O'Reilly Media, Inc. pp. 64–. ISBN 9781491978481.

10) Sutton, R. S., and Barto A. G. Reinforcement Learning: An Introduction. The MIT Press, Cambridge, MA, 1998

11) Zeiler, Matthew D.; Fergus, Rob (2013-01-15). "Stochastic Pooling for Regularization of Deep Convolutional Neural Networks". arXiv:1301.3557

12) Lawrence, Steve; C. Lee Giles; Ah Chung Tsoi; Andrew D. Back (1997). "Face Recognition: A Convolutional Neural Network Approach". Neural Networks, IEEE Transactions on. 8 (1): 98–113. CiteSeerX 10.1.1.92.5813

15) Graupe, Daniel (2013). Principles of Artificial Neural Networks. World Scientific. pp. 1–. ISBN 978-981-4522-74-8.

16) Dominik Scherer, Andreas C. Müller, and Sven Behnke: "Evaluation of Pooling Operations in Convolutional Architectures for Object Recognition," In 20th International Conference Artificial Neural Networks (ICANN), pp. 92-101, 2010. https://doi.org/10.1007/978-3-642-15825-4_10

17) Graupe, Daniel (7 July 2016). Deep Learning Neural Networks: Design and Case Studies. World Scientific Publishing Co Inc. pp. 57–110. ISBN 978-981-314-647-1

18) Clevert, Djork-Arné; Unterthiner, Thomas; Hochreiter, Sepp (2015). "Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)". arXiv:1511.07289

19) Yoshua Bengio (2009). Learning Deep Architectures for AI. Now Publishers Inc. pp. 1–3. ISBN 978-1-60198-294-0.

20) Christopher Bishop (1995). Neural Networks for Pattern Recognition, Oxford University Press. ISBN 0-19-853864-2

BIOGRAPHIES

Ashwani Kumar, born in 1996, India. He is currently pursuing B.Tech in ECE from IMS Engineering College. His interest is in Data Science and Machine Learning.

Ankush Chourasia, born in 1996, India. He is currently pursuing B.Tech in ECE from IMS Engineering College. His interest is in developing Android applications.