
Proceedings of the Third International Conference on Electronics Communication and Aerospace Technology [ICECA 2019]

IEEE Conference Record # 45616; IEEE Xplore ISBN: 978-1-7281-0167-5

CNN based Leaf Disease Identification and Remedy Recommendation System
Suma V
R Amog Shetty, Rishab F Tated, Sunku Rohan, Triveni S Pujar
Department of Information Science and Engineering, Dayananda Sagar College of
Engineering, Kumaraswamy Layout, Bangalore-560078, India

Abstract: Agriculture is a field that has a high impact on the life and economic status of human beings. Improper management leads to loss of agricultural produce. Farmers often lack knowledge of plant diseases, and hence their yield is lower. Kisan call centers are available, but they do not offer service 24x7 and communication sometimes fails. Farmers are unable to explain a disease properly over a call; the image of the affected area needs to be analysed. Although images and videos of crops provide a better view, and agro scientists can use them to provide a better solution to issues related to crop health, this solution often does not reach the farmers.
It should be noted that if the crop is not healthy, there is a high risk that it will not provide good and healthy nutrition. Owing to improvements in technology, devices are now smart enough to recognize and detect plant diseases. Recognizing a disease early can prompt faster treatment and lessen the negative impact on the harvest. This paper therefore focuses on plant disease detection using an image processing approach. This work utilizes an open dataset of 5000 pictures of unhealthy and healthy plants, where a convolutional network and semi-supervised techniques are used to characterize crop species and detect the disease status across 4 distinct classes.
Key words: CNN, leaf disease, classification, deep learning, remedies.

I. INTRODUCTION
The health of human beings depends on the type of food they consume. If the food is unhealthy, it leads to poor nutrition and the emergence of several types of health issues. Good crop productivity thus depends on healthy plants [2]. Any type of disease in plants yields unhealthy crops. Hence, detection of plant disease forms the basic and most important step in yielding good crops.

However, manual detection is not accurate and is time consuming. It is now possible to conduct such detection using advanced technological support. Deep learning technology can accurately detect the presence of pests and disease in farms. Building on this, a machine learning algorithm can even predict the chance of future disease and pest attacks [1]. Normal human monitoring cannot accurately predict the amount and intensity of pest and disease attacks in a farm, which is needed for spraying the correct and sufficient amount of fertilizers/pesticides to eliminate the host. Therefore, an artificial perceptron indicates an accurate value and provides a corrective measure: the appropriate amount of pesticides/fertilizers to be sprayed at specified target areas.

The aim of this paper is to help farmers protect their farms from any kind of pest and disease attack and to eliminate such attacks without disturbing the soil and the unaffected parts of other plants [4]. In India, farmers mostly use manual monitoring and some apps that have large database limitations and are restricted to detection only. Since prevention is better than cure, this paper aims at detecting future attacks of pests/diseases, thereby enabling the farmer to prevent them.

Technology has influenced the development of farms and agro-based industries. Today, it is possible to grow crops in deserts by using technology. Technology has penetrated deep into the agriculture sector. Automation is presently the most demanded tool in agriculture. Many companies have come up with the latest solutions in Machine Learning and Artificial Intelligence, transforming agriculture into Digital Agriculture. Many tests have shown that deploying technology in farms increases crop yield and thereby the farmer's revenue. This paper discusses and tests the implementation of deep learning technology in agriculture.

Diagnosis is always a concern for farmers in India. At the same time, for fear of pest/disease attacks, farmers spray pesticides/fertilizers uniformly over the whole farm, which may damage the soil as well as the plants. The aim of this paper is to enable the farmer to spray a limited but sufficient amount of pesticide/fertilizer at a specified target area where a pest/disease is either present or may attack in the future. This helps farmers to prevent such attacks on their farms, and to eliminate any that are present, by spraying a limited amount without polluting the soil and other parts of the plants. A major advantage of this is to increase

978-1-7281-0167-5/19/$31.00 ©2019 IEEE 395



farmer's annual monetary revenue and to minimize crop loss caused by pest/disease attacks. The work carried out in this part of the research puts forth a solution for plant disease detection that consists of 3 major steps: data acquisition, pre-processing and classification. It further helps in resolving those issues and producing healthy crop growth.

II. LITERATURE SURVEY
The authors of [11] give a brief introduction to the Support Vector Machine. They focus on proposing an automated system for diagnosis of common paddy diseases by making use of k-means clustering and feature extraction. The authors of [12] focus on plant diseases caused by climatic factors in Thailand. Accordingly, they state that a system should have an application that can perform specific disease diagnosis using a rule-based data mining model.

The authors of [13] propose a model that provides an automatic method to find leaf diseases by inspecting whether an image submitted to the system has been affected by any disease or not. The results are generated using image segmentation with different cluster sizes, thereby obtaining optimized results. The work carried out by the authors of [14] aims to determine nitrogen deficiency and to anticipate the right quantity of fertilizer required for the type of area, using feature extraction and text extraction.

However, the authors of [14] consider digital image recognition of plant diseases to be one of the thrust areas, and hence came out with a model comprising back propagation networks and probabilistic neural networks. It depends on color features, shape features and text features extracted from the disease image.

The work of the authors of [15] focuses on the occurrence of risk factors in apple. A beta regression model is used to predict the subsystem and the severity of the apple disease, so that it helps farmers decide on pesticide spraying and reduce the disease.

Thus, a huge amount of research is ongoing in the domain of agriculture in order to yield better and more satisfactory results.

III. SYSTEM DESIGN AND ARCHITECTURE

MATERIALS AND METHODS
In order to develop a model for plant disease recognition, the approach used is a deep CNN.

1. DATASET
Image-based identification, from the training phase to the evaluation phase where the performance of the classification algorithms is evaluated, requires large data sets. Hence, the data is collected from the PlantVillage website. The images thus collated are labeled with four different categories: bacterial spot, yellow leaf curl virus, late blight and healthy (in order to differentiate healthy leaves from affected ones). Subsequently, the dataset is enhanced by adding augmented images. The network is then trained to learn features that differentiate one class from another. Correspondingly, a database comprising more than 5000 images is used for training and around 1000 images are used for validation.

2. PROCESS AND LABEL OF IMAGES
Several image samples are collected from PlantVillage; they come in several formats with varying resolutions and hence varying quality. Thus, to obtain reasonable feature extraction, the final images used as input data for the classifier are pre-processed to achieve consistency.

It is further ensured at the time of data collection that images with low resolution, having a dimension less than 500px, are not considered valid images for the dataset. As such, images with higher resolution form the potential candidates for this investigation. Consequently, the images are ascertained to contain all the information required for feature learning. Accordingly, the images used for the dataset were resized to 50 x 50. This reduces the time required for training; the resizing is computed automatically by a script written in Python using the OpenCV framework. Pre-processing the images involves removing background noise, intensity normalization of individual image particles, removing reflections and masking portions of the images.

This work is broadly classified into 2 algorithms:
Artificial Neural Networks algorithm:
ANN is used to detect plant swelling (moisture content), disease and pests, along with soil analysis. The dataset of plant leaf, disease, pest and soil images is trained in Python and classified into clusters corresponding to the various labels. A Convolution Neural Network is designed for accurate analysis. Unsupervised learning classification is used since the input image is unknown and new to the algorithm. Most real-time applications need unsupervised learning since the input is always unknown to the algorithm.

Machine Learning algorithm:
Based on the feature extraction parameters, the algorithm predicts whether the crop is going to suffer any pest or disease attack in the future. The machine learning algorithm uses CART (classification and regression tree) to predict the future condition of the plant based on the trained data. Starting from the basic image reading process, a dataset of nearly 900-1000 images each of healthy plants, different pest infections and disease infections is fed in. Each category therefore has 900-1000 images. The feature extraction uses GLCM texture extraction and edge detection for plant disease identification. These features describe the actual condition of the plant based on the pixel values. Figure 1 indicates the flow of activities for the aforementioned steps.

Figure 1: Flow chart depicting an overview of the process

The parameters of completely disease- and pest-infected plants, partially infected plants and healthy plants are taken as the image dataset. These values are trained and clustered based on their respective categories. Plants ranging from healthy to partially to completely infected are classified into separate labels. The values are trained in accordance with the unsupervised learning algorithm.

After the features of the input image, called the test image, are extracted, each feature is compared with the features of the trained data. A centroid is formed and the nearest corresponding feature cluster is selected. The label of the nearest cluster is displayed, for example “Disease X and Pest Y”. This phase is the disease/pest detecting part. The neural network here is a Convolution Neural Network (CNN) used for non-linear regression models. An artificial neural network with several hidden layers is created.

IV. IMPLEMENTATION

1. NEURAL NETWORK TRAINING
This paper puts forth a model in which a CNN is trained to identify leaf disease. TensorFlow, an open source library, is used to carry out numerical computations in neural networks using data flow graphs. Nodes represent mathematical operations while graph edges represent multidimensional data arrays. A convolutional neural network is a type of feed-forward artificial neural network in which neurons are connected in a pattern inspired by the organization of the animal visual cortex. Individual cortical neurons respond to stimuli in a restricted region of space known as the receptive field. The receptive fields of different neurons partially overlap so as to tile the visual field. Each neuron reacts to the stimuli within its field. The convolution operation approximates this scenario mathematically. These networks are inspired by biological processes and are variations of the multilayer perceptron that are intended to use minimal pre-processing. Consequently, they have many applications in image and video recognition, natural language processing and recommender systems. Convolution networks may include global or local pooling layers, which combine the outputs of neuron clusters.

The first convolution layer filters the input image with 32 kernels of size 3x3. After max pooling is applied, the output is given as input to the second convolution layer with 64 kernels of size 4x4. The last convolution layer has 128 kernels of size 1x1, followed by a fully connected layer of 512 neurons. The output of this layer is given to a softmax function, which produces a probability distribution over the four output classes. Figure 2 depicts sample images from a database of types of leaves.

Figure 2: Sample images from the database: a) healthy leaf image taken under a constant background; b) healthy leaf image taken under an uncontrolled environment; c-e) leaf images from a plant affected by: c) septorial leaf blight, d) frogeye leaf spot, e) downy mildew

V. RESULTS

The dataset is divided into 70% for training, 10% for validation and 20% for testing. Different models with different architectures and learning rates are tested. The parameters of the network, such as the kernel size, filter size and learning parameter, were selected by trial and error. Table 1 depicts the classification results from different models using different architectures.
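The 70/10/20 division described above can be sketched as follows. This is only an illustrative sketch: the helper name `split_dataset`, the fixed random seed and the file names are assumptions, and the paper does not say whether the split was shuffled or stratified by class.

```python
import random


def split_dataset(items, train_pct=70, val_pct=10, seed=0):
    """Shuffle items and split them into train/validation/test lists.

    Percentages are integers so that the split sizes are exact;
    whatever remains after train and validation becomes the test set.
    """
    if train_pct < 0 or val_pct < 0 or train_pct + val_pct > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_train = len(shuffled) * train_pct // 100
    n_val = len(shuffled) * val_pct // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Hypothetical file names, used only to exercise the function.
paths = [f"leaf_{i:04d}.jpg" for i in range(1000)]
train, val, test = split_dataset(paths)
print(len(train), len(val), len(test))
```

Splitting before any augmentation keeps augmented copies of a training image out of the validation and test sets.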

Figure 3: Sample segmented images resized to 64x64 pixels: a) healthy leaf image taken under a constant background; b) healthy leaf image taken in an uncontrolled environment; c-e) leaf images from a plant affected by: c) septorial leaf blight, d) frogeye leaf spot, e) downy mildew
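The size filter and resizing used in pre-processing can be illustrated with a small pure-Python sketch. The paper states that the actual script uses OpenCV, where `cv2.resize` would normally do this work; the functions `is_valid_size` and `resize_nearest` below are hypothetical stand-ins, and nearest-neighbour interpolation is an assumption, since the interpolation method is not given. The target size is left as a parameter because the text mentions both 50 x 50 and 64x64.

```python
def is_valid_size(width, height, min_dim=500):
    """Accept an image only if neither dimension is below min_dim pixels."""
    return min(width, height) >= min_dim


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize of an image stored as a list of rows."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# A tiny 4x4 "image" whose pixel values equal their flat index.
grid = [[row * 4 + col for col in range(4)] for row in range(4)]
small = resize_nearest(grid, 2, 2)
print(small)
print(is_valid_size(640, 480), is_valid_size(512, 600))
```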

Figure 6 shows the graph of training accuracy versus validation accuracy.

Figure 6: Training versus validation accuracy of the base model

Figure 4: Working of the CNN layers

From the results, the classification accuracy on the color images is better than on the grayscale and segmented images. This shows that color is an important feature for classification. The model that provides good classification accuracy contains three convolutional layers, each followed by a max pooling layer. The graphs of the training accuracy versus validation accuracy of the model are shown. It can be seen from the graphs that the model is overfitting. Overfitting happens when the model fits the training set too well. It then becomes difficult for the model to generalize to new examples that were not in the training set.
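To see why such a network can overfit a few thousand images, it helps to trace the feature-map sizes and count the trainable parameters of the three-layer stack (32 kernels of 3x3, 64 of 4x4, 128 of 1x1, a 512-neuron dense layer and a 4-way softmax). The sketch below does only this arithmetic. Several settings are assumptions not stated in the paper: a 64x64 RGB input, stride 1, no padding, and 2x2 max pooling after every convolution.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution (assumed stride 1, no padding)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2):
    """Spatial output size of non-overlapping max pooling (assumed 2x2)."""
    return size // window


def trace_network(input_size=64, channels=3, num_classes=4):
    """Return (final map size, flattened length, trainable parameter count)."""
    conv_layers = [(32, 3), (64, 4), (128, 1)]  # (kernels, kernel size) from the text
    size, depth, params = input_size, channels, 0
    for filters, k in conv_layers:
        params += k * k * depth * filters + filters  # weights + biases
        size = pool_out(conv_out(size, k))
        depth = filters
    flat = size * size * depth
    params += flat * 512 + 512                   # fully connected layer
    params += 512 * num_classes + num_classes    # softmax output layer
    return size, flat, params


print(trace_network())
```

Under these assumptions the network has roughly 3.26 million trainable parameters, almost all of them in the fully connected layer, which is far more than the number of training images.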

Figure 5: Visualization of feature maps in the first activation layer: a) sample image; b) feature maps of the first activation layer

Experiments were conducted to see the effect of each technique on the performance of the model. Since the dataset is too small compared to the total number of trainable parameters of the model, the first experiment is to increase the training data by rotating, flipping and rescaling the images. The data augmentation is performed only on the training data.
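A minimal sketch of the rotation and flipping step is given below, using plain nested lists so that it runs without any image library. The function names are hypothetical, rescaling is omitted for brevity, and the choice of a single 90 degree rotation per image is an assumption; the paper does not list the exact transformations or angles.

```python
def rotate90(image):
    """Rotate an image (list of rows) 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror an image left to right."""
    return [list(row[::-1]) for row in image]


def flip_vertical(image):
    """Mirror an image top to bottom."""
    return [list(row) for row in image[::-1]]


def augment(train_images):
    """Return the training images plus rotated and flipped copies.

    Only the training set should be passed in; validation and test
    images are left untouched.
    """
    augmented = []
    for img in train_images:
        augmented.extend(
            [img, rotate90(img), flip_horizontal(img), flip_vertical(img)]
        )
    return augmented


sample = [[1, 2], [3, 4]]
result = augment([sample])
print(len(result))
```

Each input image yields four training samples here, so the training set grows fourfold while the held-out sets stay unchanged.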


Figure 7 indicates the activation process of the feature maps.

Figure 7: Training vs validation accuracy of the models

Table 2 illustrates the results of testing and validation accuracy.

Table 2: Effect of dropout and regularization

From Table 2, it is clear that the model provides the required efficiency.

VII. CONCLUSION
A convolution neural network is used to detect and classify plant diseases. The network is trained using images taken in the natural environment and achieved 99.32% classification accuracy. This shows the ability of the CNN to extract the important features in the natural environment that are required for plant disease classification.

Image classification, image categorization, feature extraction and training are carried out. The whole development of the algorithm is done in Python. Using several toolboxes such as the Statistics and Machine Learning Toolbox, Neural Network Toolbox and Image Processing Toolbox, the outputs as of now are the training data in the form of image categories, image classification using K-Means clustering, and moisture content along with prediction of withstanding. The algorithm is implemented with training data and classification of the given image dataset. The test input image is compared with the trained data for detection and prediction analysis. From the results, it is clear that the model provides reliable results.

VIII. REFERENCES

1. Machine Learning: What it is and why it matters, 09 2016, [online].
2. R.E. Schapire, "The boosting approach to machine learning: An overview" in Nonlinear estimation and classification, New York: Springer, pp. 149-171, 2003.
3. A. Ben-Hur, D. Horn, H. T. Siegelmann, V. Vapnik, "Support vector clustering", Journal of machine learning research, vol. 2, pp. 125-137, Dec 2001.
4. J. R. Otukei, T. Blaschke, "Land cover change assessment using decision trees support vector machines and maximum likelihood classification algorithms", International Journal of Applied Earth Observation and Geoinformation, vol. 12, pp. S27-S31, 2010.
5. Mukhopadhyay S.C. (2012) Smart Sensing Technology for Agriculture and Environmental Monitoring. Vol. 146, Springer Berlin Heidelberg.
6. German, L., Ramisch, J.J. & Verma R. (2010) Beyond the Biophysical, Knowledge, Culture, and Power in Agriculture and Natural Resource Management, Springer Publ.
7. Jun Wu, Anastasiya Olesnikova, Chi-Hwa Song, Won Don Lee (2009). The Development and Application of Decision Tree for Agriculture Data. IITSI: 16-20.
8. Leemans, V., Destain, M.F., 2004. A real-time grading method of apples based on features extracted from defects. J. Food Eng. 61, 83-89.
9. Quinlan, J.R. (1985b). Decision trees and multi-valued attributes. In J.E. Hayes & D. Michie (Eds.), Machine intelligence 11. Oxford University Press (in press).
10. Zelu Zia (2009). An Expert System Based on Spatial Data Mining used Decision Tree for Agriculture Land Grading. Second International Conference on Intelligent Computation Technology and Automation. Oct10-11, China.
