
DETERMINATION OF ABNORMALITIES IN ULTRASOUND KIDNEY IMAGES USING ASSOCIATION RULE BASED NEURAL NETWORK

ABIRAMI. P
Reg. No: 91009534002

P.S.N.A COLLEGE OF ENGINEERING AND TECHNOLOGY
DINDIGUL-624 622

A PROJECT REPORT
Submitted to the
FACULTY OF COMPUTER SCIENCE AND ENGINEERING
in partial fulfillment of the requirement for the award of the degree
of
MASTER OF ENGINEERING IN COMPUTER SCIENCE AND ENGINEERING

ANNA UNIVERSITY: TIRUCHIRAPPALLI-620 024
JUNE 2011

ANNA UNIVERSITY: TIRUCHIRAPPALLI-620 024

BONAFIDE CERTIFICATE

Certified that this project titled "DETERMINATION OF ABNORMALITIES IN ULTRASOUND KIDNEY IMAGES USING ASSOCIATION RULE BASED NEURAL NETWORK" is the bonafide work of P. ABIRAMI (91009534002), who carried out the research under my supervision. Certified further that, to the best of my knowledge, the work reported herein does not form part of any other project report or dissertation on the basis of which a degree or award was conferred on an earlier occasion on this or any other candidate.

SIGNATURE Dr. R. SARAVANAN M.E., Ph.D HEAD OF THE DEPARTMENT


Dept of Computer Science & Engg, PSNA College of Engg & Tech., Dindigul.

SIGNATURE
Mr. S. SatheesBabu, M.E.,
ASSOCIATE PROFESSOR


Dept of Computer Science & Engg, PSNA College of Engg & Tech., Dindigul.

Submitted for viva-voce examination held on ..2011

INTERNAL EXAMINER

EXTERNAL EXAMINER

ABSTRACT

The objective of this work is to develop an automatic diagnosis system for detecting kidney diseases based on association rules (AR) and a neural network (NN). The proposed method distinguishes two categories, namely normal and abnormal (medical renal disease or cortical cyst). For each segmented ultrasound kidney image, 20 features are extracted. AR is used for reducing the number of features, and the NN is used for intelligent classification of the ultrasound kidney images. The Apriori algorithm is used for association mining, which reduces the 20 features to 12. The neural network classifies the kidney images as normal or abnormal. The combined AR and NN model yields a fast automatic diagnostic system.

ACKNOWLEDGEMENT

At the outset I wholeheartedly thank the Almighty, who has been my strength in times of weakness and my hope in times of despair, the sole creator of all the creations in this world and hence of this project. I thank my parents, who have encouraged me in good spirit by their incessant prayers to complete this project. I would like to express my sincere thanks to our management for providing the various facilities needed for the successful completion of my project work. I express my sincere thanks to our beloved Principal Dr. S. Sakthivel B.E., M.Sc (Engg.), MBA., Ph.D., for permitting me to do the project work. I would like to cordially thank our Head of the Department Dr. R. Saravanan M.E., Ph.D., for his kind co-operation and advice. I am indebted to our internal guide Mr. S. SatheesBabu M.E., for the keen interest shown by him in my project and the comforting words of encouragement offered by him from time to time. I extend my thanks to Mrs. K. DhanaLakshmi M.E., for her wholehearted support towards the successful completion of this project. Finally, I thank all faculty members, non-teaching staff members, friends and all my well-wishers who directly and indirectly supported this work.

TABLE OF CONTENTS

CHAPTER   TITLE

          ABSTRACT
          LIST OF TABLES
          LIST OF FIGURES
          LIST OF ABBREVIATIONS

1         INTRODUCTION
          1.1 Data mining
              1.1.1 Data mining architecture
              1.1.2 Steps in data mining
              1.1.3 Association rules
          1.2 Neural networks
              1.2.1 Advantages
              1.2.2 Types of neural network
                    1.2.2.1 Single layer perceptron
                    1.2.2.2 Multilayer perceptron
          1.3 Image processing

2         LITERATURE SURVEY

3         SYSTEM ANALYSIS
          3.1 Objective
          3.2 Existing system
          3.3 Drawbacks of existing system
          3.4 Proposed system
          3.5 Base paper comparative study
          3.6 Tool analysis

4         SYSTEM DESIGN
          4.1 Module design
              4.1.1 Feature extraction
                    4.1.1.1 First order gray level feature
                    4.1.1.2 Second order gray level feature
                    4.1.1.3 Power spectral feature
                    4.1.1.4 Gabor feature
              4.1.2 Apriori algorithm
              4.1.3 Classification using MLP-BP

5         SYSTEM IMPLEMENTATION
          5.1 Software requirements
          5.2 Hardware requirements
          5.3 Implementation of feature extraction
          5.4 Implementation of Apriori algorithm
          5.5 Classification using MLP-BP

6         SYSTEM TESTING
          6.1 Testing objectives and purpose
          6.2 System testing
              6.2.1 Unit testing
              6.2.2 Integration testing
              6.2.3 Validation testing

          CONCLUSION
          REFERENCE
          APPENDIX-I

LIST OF TABLES

Table No.   Description
4.1         Matrix format of test image
4.2         General format of GLCM
4.3         GLCM for d = 1 and θ = 0°
4.4         GLCM for d = 1 and θ = 90°

LIST OF FIGURES

Figure No.   Description
1.1          Data mining architecture
1.2          Multilayer perceptron
4.1          System design
4.2          Threshold logic unit

LIST OF ABBREVIATIONS

ANN   Artificial Neural Network
NN    Neural Network
AR    Association Rules
MLP   Multi Layer Perceptron
BP    Back Propagation
SLP   Single Layer Perceptron
TLU   Threshold Logic Unit

CHAPTER 1 INTRODUCTION
1.1 DATA MINING

Data mining is the process of extracting patterns from data. Data mining is seen as an increasingly important tool by modern business to transform data into business intelligence, giving an informational advantage. It is currently used in a wide range of profiling practices, such as marketing, surveillance, fraud detection, and scientific discovery. Data mining techniques can be implemented rapidly on existing software and hardware platforms to enhance the value of existing information resources, and can be integrated with new products and systems as they are brought on-line. Data mining is ready for application in the business community because it is supported by three technologies that are now sufficiently mature:

- Massive data collection
- Powerful multiprocessor computers
- Data mining algorithms

1.1.1 DATA MINING ARCHITECTURE

Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining software is one of a number of analytical tools for analyzing data. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases.

[Figure: user interface, pattern evaluation, data mining engine, knowledge base, and database or data warehouse]

Fig 1.1 Data mining architecture

1.1.2 STEPS IN DATA MINING

Data Selection: We may not need all the data we have collected in the first step. So in this step we select only those data which we think are useful for data mining.

Data Cleaning: The data we have collected are not clean and may contain errors, missing values, or noisy or inconsistent data. So we need to apply different techniques to get rid of such anomalies.

Data Transformation: The data, even after cleaning, are not ready for mining, as we need to transform them into forms appropriate for mining. The techniques used to accomplish this are smoothing, aggregation, normalization, etc.

Data Mining: Now we are ready to apply data mining techniques on the data to discover interesting patterns. Techniques like clustering and association analysis are among the many different techniques used for data mining.

Pattern Evaluation and Knowledge Presentation: This step involves visualization, transformation, and the removal of redundant patterns from the patterns we generated.

Decisions / Use of Discovered Knowledge: This step helps the user make use of the acquired knowledge to take better decisions.

1.1.3 ASSOCIATION RULES

Association rule mining finds interesting associations and/or correlation relationships among large sets of data items. Association rules show attribute value conditions that occur frequently together in a given dataset. A typical and widely used example of association rule mining is market basket analysis. The various algorithms are as follows:

- Apriori algorithm
- Eclat algorithm
- FP-growth algorithm
- One-attribute rule
- Zero-attribute rule
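All of the algorithms above mine rules scored by support and confidence. These two measures can be sketched in a few lines of Python on an invented transaction list (the data below is purely illustrative, not from the report):

```python
# Toy market-basket data, invented for illustration.
transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread", "butter"},
    {"milk", "butter"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item of the itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Conditional frequency of the consequent given the antecedent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)

print(support({"milk", "bread"}, transactions))            # 0.5
print(round(confidence({"bread"}, {"butter"}, transactions), 3))  # 0.667
```

A rule such as {bread} -> {butter} is reported when both its support and confidence exceed user-chosen thresholds.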

1.2 NEURAL NETWORKS

An Artificial Neural Network (ANN) is an information processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is the novel structure of the information processing system. It is composed of a large number of highly interconnected processing elements working in unison to solve specific problems. ANNs, like people, learn by example. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process.

1.2.1 ADVANTAGES

- A neural network can perform tasks that a linear program cannot.
- When an element of the neural network fails, the network can continue without any problem because of its parallel nature.
- A neural network learns and does not need to be reprogrammed.
- It can be implemented in any application without any problem.

1.2.2 TYPES OF NEURAL NETWORK 1.2.2.1 SINGLE LAYER PERCEPTRON The earliest kind of neural network is a single-layer perceptron network, which consists of a single layer of output nodes; the inputs are fed directly to the outputs via a series of weights. In this way it can be considered the simplest kind of feed-forward network. The sum of the products of the weights and the inputs is calculated in each node, and if the value is above some threshold (typically 0) the neuron fires and takes the activated value (typically 1); otherwise it takes the deactivated value (typically -1). 1.2.2.1.1 ADVANTAGES

- Easy to set up and train.
- Outputs are a weighted sum of inputs: an interpretable representation.

1.2.2.1.2 LIMITATIONS

- Can only represent a limited set of functions.
- Decision boundaries must be hyperplanes.
- Can only perfectly separate linearly separable data.
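The single-layer decision rule described above can be sketched in a few lines of Python. The weights and threshold here are hand-picked for an AND example (which is linearly separable), not taken from the report:

```python
def slp_output(inputs, weights, threshold=0.0):
    """Fire (+1) when the weighted input sum exceeds the threshold, else -1."""
    s = sum(x * w for x, w in zip(inputs, weights))
    return 1 if s > threshold else -1

# Weights chosen by hand so the unit realizes the AND function.
and_weights = [1.0, 1.0]
for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((a, b), slp_output([a, b], and_weights, threshold=1.5))  # only (1, 1) fires
```

No choice of weights and threshold would make this unit compute XOR, which is the classic illustration of the linear-separability limitation listed above.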

1.2.2.2 MULTILAYER PERCEPTRON

A multilayer perceptron (MLP) is a feedforward artificial neural network model that maps sets of input data onto a set of appropriate outputs. An MLP consists of multiple layers of nodes in a directed graph, with each layer fully connected to the next one. Except for the input nodes, each node is a neuron with a nonlinear activation function. MLP utilizes a supervised learning technique called backpropagation for training the network.

Fig 1.2 Multi layer perceptron

1.2.2.3 ADVANTAGES

- Adaptive learning: an ability to learn how to do tasks based on the data given for training or initial experience.
- MLP/neural networks do not make any assumption regarding the underlying probability density functions or other probabilistic information about the pattern classes under consideration, in comparison to other probability-based models. They yield the required decision function directly via training.
- A two-layer backpropagation network with sufficient hidden nodes has been proven to be a universal approximator.

1.3 IMAGE PROCESSING

Image processing is any form of signal processing for which the input is an image, such as a photograph or video frame; the output of image processing may be either an image or a set of characteristics or parameters related to the image. Most image-processing techniques involve treating the image as a two-dimensional signal and applying standard signal-processing techniques to it. Image processing is a physical process used to convert an image signal into a physical image. The image signal can be either digital or analog. The actual output itself can be an actual physical image or the characteristics of an image. The most common type of image processing is photography. In this process, an image is captured using a camera to create a digital or analog image.

CHAPTER 2
LITERATURE SURVEY

[1] An expert system for detection of breast cancer based on association rules and neural network. Murat Karabatak, Fırat University (2009).
This work presents an automatic diagnosis system for detecting breast cancer based on association rules (AR) and a neural network (NN). Feature extraction is the key for pattern recognition and classification: the best classifier will perform poorly if the features are not chosen well. A feature extractor should reduce the feature vector to a lower dimension which contains most of the useful information from the original vector. So AR is used for reducing the dimension of the breast cancer database and NN is used for intelligent classification. The proposed AR + NN system performance is compared with the NN model. The dimension of the input feature space is reduced from nine to four by using AR. In the test stage, the 3-fold cross validation method was applied to the Wisconsin breast cancer database to evaluate the proposed system's performance.

[2] A hybrid fuzzy-neural system for computer-aided diagnosis of ultrasound kidney images using prominent features. K. Bommanna Raja, M. Madheswaran and K. Thyagarajah.
The objective of this work is to develop and implement a computer-aided decision support system for the automated diagnosis and classification of ultrasound kidney images. The proposed method distinguishes three kidney categories, namely normal, medical renal disease and cortical cyst. For each pre-processed ultrasound kidney image, 36 features are extracted. Two types of decision support systems, an optimized multi-layer back propagation network and a hybrid fuzzy-neural system, have been developed with these features for classifying the kidney categories. The performance of the hybrid fuzzy-neural system is compared with the optimized multi-layer back propagation network in terms of classification efficiency and training and testing time. The results obtained show that the fuzzy-neural system provides higher classification efficiency with minimum training and testing time. It has also been found that, instead of using all 36 features, ranking the features enhances classification efficiency. The outputs of the decision support systems are validated with a medical expert to measure the actual efficiency. The overall discriminating capability of the systems is assessed with the performance evaluation measure F-score. It has been observed that the performance of the fuzzy-neural system is superior compared to the optimized multi-layer back propagation network. Such a hybrid fuzzy-neural system with feature extraction algorithms and a pre-processing scheme helps in developing a computer-aided diagnosis system for ultrasound kidney images and can be used as a secondary observer in clinical decision making.

[3] P. Rajendran, M. Madheswaran. Hybrid medical image classification using AR mining with decision tree algorithm. Journal of Computing, January 2010.
The main focus of image mining in the proposed method is the classification of brain tumors in CT scan brain images. The major steps involved in the system are pre-processing, feature extraction, association rule mining and a hybrid classifier. The pre-processing step has been done using the median filtering process, and edge features have been extracted using the Canny edge detection technique. Two image mining approaches combined in a hybrid manner have been proposed in this paper. The frequent patterns from the CT scan images are generated by the frequent pattern tree (FP-tree) algorithm that mines the association rules. The decision tree method has been used to classify the medical images for diagnosis. This system enhances the classification

process to be more accurate. The hybrid method improves the efficiency of the proposed method over traditional image mining methods. The experimental result on a pre-diagnosed database of brain images showed 97% sensitivity and 95% accuracy. The physicians can make use of this accurate decision tree classification phase for classifying the brain images into normal, benign and malignant for effective medical diagnosis.

[4] Haiwei Pan, Jianzhong Li, and Zhang Wei. Mining Interesting Association Rules in Medical Images (2007).
Image mining is not just an extension of data mining to the image domain but an interdisciplinary endeavor. Very few people have systematically investigated this field. Mining association rules in medical images is an important part of domain-specific image mining because several technical aspects make this problem challenging. In this paper, we extend the concept of the association rule based on objects and images in medical images, and propose two algorithms to discover frequent itemsets and mine interesting association rules from medical images. We describe how to incorporate domain knowledge into the algorithms to enhance the interestingness. Some interesting results are obtained by our program and we

believe many of the problems we come across are likely to appear in other domains.

CHAPTER 3
SYSTEM ANALYSIS

3.1 OBJECTIVE
To produce an accurate classification of ultrasound kidney images using a neural network. The Apriori algorithm is used for association mining, which selects the most relevant features of the given image.

3.2 EXISTING SYSTEM
Techniques such as association rule based neural networks have been used for the classification of malignant and benign patterns in digitized mammograms.

A back-propagation neural network has been applied for the classification of suspicious lesions extracted using a fuzzy rule-based detection system, and it obtained higher accuracy.

A comparative study of radial basis function (RBF) and multilayer perceptron (MLP) based neural networks for the classification of breast abnormalities using texture features concluded that the MLP obtained 4% higher accuracy than the RBF.

3.3 DRAWBACKS OF EXISTING SYSTEM
A segmentation and feature extraction technique for reliable classification of microcalcifications achieves a low classification rate (78%) on the DDSM database. The accuracy of the system may be high on the training data set and may drop on the test data.

3.4 PROPOSED SYSTEM
In the proposed system the features are extracted from the kidney image.

There are four feature extraction techniques: first order gray level statistical features, second order gray level statistical features, power spectral features and Gabor features.
Different features are extracted to study the gray level intensity distribution of the kidney region. In total, 20 features are extracted from the segmented kidney images.
The Apriori algorithm reduces the number of features to 12. As the features are reduced from 20 to 12 before being passed to the MLP, the classification accuracy is improved.

MLP classifies the images into three categories:
- Normal
- Medical renal disease
- Cortical cyst

3.5 BASE PAPER COMPARATIVE STUDY

The base paper presents an automatic diagnosis system for detecting breast cancer based on association rules (AR) and a neural network (NN). AR is used for reducing the dimension of the breast cancer database and NN is used for intelligent classification, and the AR + NN system performance is compared with the NN model. Feature extraction is the key for pattern recognition and classification, since the best classifier will perform poorly if the features are not chosen well. A feature extractor should reduce the feature vector to a lower dimension which contains most of the useful information from the original vector. The dimension of the input feature space is reduced from nine to four by using AR. In the test stage, the 3-fold cross validation method was applied to the Wisconsin breast cancer database to evaluate the proposed system's performance. The existing work is modified by using ultrasound kidney images as the input: feature extraction techniques extract the features, the Apriori algorithm selects the relevant features, and MLP-BP classifies the given image as normal or abnormal.

3.6 TOOL ANALYSIS

MathWorks MATLAB 7.9 is a high-level technical computing language that provides an interactive environment for the development of algorithms and a modern tool for data analysis. Compared with traditional programming languages (C/C++, Java, Pascal, FORTRAN), MATLAB can reduce the solution time for typical tasks by an order of magnitude and greatly simplifies the development of new algorithms. MATLAB (matrix laboratory) is a numerical computing environment and fourth-generation programming language. Developed by MathWorks, MATLAB allows matrix manipulations, plotting of functions and data, implementation of algorithms, creation of user interfaces, and interfacing with programs written in other languages, including C, C++, and FORTRAN. Although MATLAB is intended primarily for numerical computing, an optional toolbox uses the MuPAD symbolic engine, allowing access to symbolic computing capabilities. An additional package, Simulink, adds graphical multidomain simulation and Model-Based Design for dynamic and embedded systems. In 2004, MATLAB had around one million users across industry and academia. MATLAB users come from various backgrounds in engineering, science, and economics, and MATLAB is widely used in academic and research institutions as well as industrial enterprises.

CHAPTER 4
SYSTEM DESIGN

4.1 MODULE DESIGN
4.1.1 FEATURE EXTRACTION

The feature extraction techniques are applied on the segmented kidney images, and each technique is explained as follows.

4.1.1.1 FIRST ORDER GRAY LEVEL STATISTICAL FEATURE

The first order gray level statistical features are estimated for the pre-processed ultrasound kidney images. These features (mean, dispersion, variance, energy, skewness and kurtosis) are computed from the central pixel and its neighborhood. Variance, for instance, measures the spread of the intensity values about the mean.

4.1.1.2 SECOND ORDER GRAY LEVEL STATISTICAL FEATURE

The spatial gray level dependency matrix is one of the most widely used techniques for statistical texture description. All known visually distinct texture pairs can be discriminated using this method.

GLCM ALGORITHM

Texture is one of the important characteristics used in identifying objects or regions of interest in an image. Texture contains important information about the structural arrangement of surfaces. The textural features based on gray-tone spatial dependencies have a general applicability in image classification. The three fundamental pattern elements used in human interpretation of images are spectral, textural and contextual features. Spectral features describe the average tonal variations in various bands of the visible and/or infrared portion of an electromagnetic spectrum. Textural features contain information about the spatial distribution of tonal variations within a band. The fourteen textural features proposed by Haralick contain information about image texture characteristics such as homogeneity, gray-tone linear dependencies, contrast, number and nature of boundaries present, and the complexity of the image. Contextual features contain information derived from blocks of pictorial data surrounding the area being analyzed. In the GLCM, the (i, j)th entry represents the probability of going from a pixel with gray level i to another with gray level j under predefined angles.

Usually, for statistical texture analysis, these angles are defined at 0°, 45°, 90° and 135°. Consider the following 4x4 test image:

0 0 1 1
0 0 1 1
0 2 2 2
2 2 3 3

Table 4.1 Matrix format of test image

General form of GLCM:

Gray tone   0        1        2        3
0           #(0,0)   #(0,1)   #(0,2)   #(0,3)
1           #(1,0)   #(1,1)   #(1,2)   #(1,3)
2           #(2,0)   #(2,1)   #(2,2)   #(2,3)
3           #(3,0)   #(3,1)   #(3,2)   #(3,3)

Table 4.2 General form of GLCM

GLCM for d = 1 and θ = 0°:

4 2 1 0
2 4 0 0
1 0 6 1
0 0 1 2

Table 4.3 GLCM for d = 1 and θ = 0°

GLCM for d = 1 and θ = 90°:

6 0 2 0
0 4 2 0
2 2 2 2
0 0 2 0

Table 4.4 GLCM for d = 1 and θ = 90°
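The construction of Tables 4.3 and 4.4 can be sketched as follows (a minimal Python illustration; the report's own implementation is in MATLAB). The entropy and homogeneity measures of Eqs. (4.4) and (4.5) are also computed on the normalized matrix:

```python
import numpy as np

def glcm(img, dy, dx, levels=4):
    """Symmetric gray level co-occurrence matrix for one offset (dy, dx)."""
    img = np.asarray(img)
    m = np.zeros((levels, levels), dtype=int)
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                m[img[y, x], img[y2, x2]] += 1  # pair (i, j)
                m[img[y2, x2], img[y, x]] += 1  # and its mirror (j, i)
    return m

test_image = [[0, 0, 1, 1],
              [0, 0, 1, 1],
              [0, 2, 2, 2],
              [2, 2, 3, 3]]

g0 = glcm(test_image, 0, 1)    # d = 1, theta = 0 (horizontal neighbours)
g90 = glcm(test_image, 1, 0)   # d = 1, theta = 90 (vertical neighbours)
print(g0)    # reproduces Table 4.3
print(g90)   # reproduces Table 4.4

# Texture measures on the normalized matrix (Eqs. 4.4 and 4.5).
p = g0 / g0.sum()
entropy = -np.sum(p[p > 0] * np.log(p[p > 0]))
i, j = np.indices(p.shape)
homogeneity = np.sum(p / (1.0 + np.abs(i - j)))
```

Because each co-occurrence is recorded in both directions, the matrix is symmetric, which is the convention the tables above follow.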

Energy: One approach to generating texture features is to use local kernels to detect various types of texture. After convolution with the specified kernel, the texture energy measure (TEM) is computed by summing the absolute values in a local neighborhood:

$L_e = \sum_{i=1}^{m} \sum_{j=1}^{n} |C(i, j)|$   (4.1)

If n kernels are applied, the result is an n-dimensional feature vector at each pixel of the image being analyzed.

Correlation: Correlation is a measure of image linearity.

$C_c = \dfrac{\sum_i \sum_j i\, j\, P_d[i, j] - \mu_i \mu_j}{\sigma_i \sigma_j}$   (4.2)

$\mu_i = \sum_i \sum_j i\, P_d[i, j], \qquad \sigma_i^2 = \sum_i \sum_j i^2\, P_d[i, j] - \mu_i^2$   (4.3)

Correlation will be high if an image contains a considerable amount of linear structure.

Entropy: Entropy is a measure of information content. It measures the randomness of the intensity distribution.

$C_e = -\sum_i \sum_j P_d[i, j] \ln P_d[i, j]$   (4.4)

Such a matrix corresponds to an image in which there are no preferred gray level pairs for the distance vector d. Entropy is highest when all entries in P[i, j] are of similar magnitude, and small when the entries in P[i, j] are unequal.

Homogeneity: A homogeneous image will result in a co-occurrence matrix with a combination of high and low P[i, j] values.

$C_h = \sum_i \sum_j \dfrac{P_d[i, j]}{1 + |i - j|}$   (4.5)

Where the range of gray levels is small, the P[i, j] values will tend to be clustered around the main diagonal. A heterogeneous image will result in an even spread of P[i, j] values.

4.1.1.3 POWER SPECTRAL FEATURE

The spectral features are estimated by using the fast Fourier transform (FFT). They are used for various analyses, diagnoses and evaluations of biological systems. An important application of the power spectral feature is to detect and characterize binary images.

The periodogram computes the power spectrum for the entire input signal:

$\text{Periodogram}(\text{signal}) = \dfrac{|F(\text{signal})|^2}{N}$   (4.6)

where F(signal) is the Fourier transform of the signal, and N is the normalization factor, which Igor's DSP Periodogram operation defaults to the number of samples in the signal. The calculation of the periodogram is improved by spectral windowing, and Igor's DSP Periodogram operation supports the same windows as the FFT operation does. The result of the periodogram is often normalized by a multiplication factor to make the result satisfy Parseval's theorem,

$\sum_{t=0}^{N-1} |\text{signal}(t)|^2 = \dfrac{1}{N} \sum_{f=0}^{N-1} |F(f)|^2$   (4.7)

which presumes the two-sided frequency-domain FFT result is computed from the time-domain signal data, and where N is again the number of time-domain values in the signal.
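The periodogram of Eq. (4.6) and the Parseval check of Eq. (4.7) can be sketched with numpy (the signal below is invented for the example, and Igor's windowing and /NOR conventions are not reproduced):

```python
import numpy as np

def periodogram(signal):
    """Two-sided periodogram: squared FFT magnitude over the sample count."""
    n = len(signal)
    return np.abs(np.fft.fft(signal)) ** 2 / n

# Toy signal: a 10 Hz sine sampled at 100 Hz for one second.
fs = 100
t = np.arange(0, 1, 1 / fs)
x = np.sin(2 * np.pi * 10 * t)

p = periodogram(x)

# Parseval: total time-domain energy equals total spectral energy (Eq. 4.7).
print(np.allclose(np.sum(x ** 2), np.sum(p)))  # True
# The spectrum peaks at the 10 Hz bin (with a mirror at fs - 10 Hz).
print(np.argmax(p[: fs // 2]))  # 10
```

With this choice of N the Parseval identity holds exactly; applying a window would reduce the measured power and require the compensating factor discussed next.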

Normalization of the periodogram result to meet this criterion follows several different conventions in the literature (and depends on the average power of any spectral windowing function and also on whether the periodogram is one- or two-sided), so the DSP Periodogram operation allows the user to specify the desired normalization using the /NOR parameter:

DSPPeriodogram/NOR=(numpnts(signal)/2) signal

When using a window function, the amount of power in the signal is reduced. A compensating multiplier of 1/average(window[i]^2) should be applied to the result to compensate for this. For a Hanning window this value is theoretically 0.375. Because the normalization factor is a denominator, you would divide N by 0.375 to compensate for the Hanning window:

DSPPeriodogram/NOR=(numpnts(signal)/(2*0.375)) signal

4.1.1.4 GABOR FEATURE

A Gabor filter can be seen as a sinusoidal plane of a particular frequency and orientation, modulated by a Gaussian envelope. A 2D Gabor function g(x, y) and its Fourier transform G(u, v) are defined as

$g(x, y) = \dfrac{1}{2\pi\sigma_x\sigma_y} \exp\!\left[-\dfrac{1}{2}\left(\dfrac{x^2}{\sigma_x^2} + \dfrac{y^2}{\sigma_y^2}\right) + 2\pi j W x\right]$   (4.8)

where $j = \sqrt{-1}$ and W is the frequency of the modulated sinusoid.

$G(u, v) = \exp\!\left[-\dfrac{1}{2}\left(\dfrac{(u - W)^2}{\sigma_u^2} + \dfrac{v^2}{\sigma_v^2}\right)\right]$   (4.9)

where $\sigma_u = 1/(2\pi\sigma_x)$ and $\sigma_v = 1/(2\pi\sigma_y)$.

A self-similar filter dictionary can be obtained by associating an appropriate scale factor and a rotation parameter with the mother wavelet g(x, y). M and N represent the scales and orientations of the Gabor wavelets.

$g_{mn}(x, y) = a^{-m} g(x', y'), \quad x' = a^{-m}(x\cos\theta + y\sin\theta), \quad y' = a^{-m}(-x\sin\theta + y\cos\theta)$   (4.10)

where $\theta = n\pi/K$ and K is the total number of orientations.

4.1.2 ASSOCIATION RULES USING APRIORI ALGORITHM

Association rule mining finds interesting associations or relationships among large sets of data items. Association rules show attribute value conditions that occur frequently together in a given dataset. They allow capturing all possible rules that explain the presence of some attributes according to the presence of other attributes. For example, the rule {onions, potatoes} -> {burger} found in the sales data of a supermarket would indicate that if a customer buys onions and potatoes together, he or she is likely to also buy a burger. Such information can be used as the basis for decisions about marketing activities such as promotional pricing or product placements.

4.1.2.1 APRIORI ALGORITHM

The Apriori algorithm is an influential algorithm for mining frequent itemsets for Boolean association rules.

Apriori()
  L1 = {large 1-itemsets}
  k = 2
  while L(k-1) is not empty do begin
    Ck = apriori_gen(L(k-1))
    for all transactions t in D do begin
      Ct = subset(Ck, t)
      for all candidates c in Ct do
        c.count = c.count + 1
    end
    Lk = {c in Ck | c.count >= minsup}
    k = k + 1
  end

Apriori first scans the transaction database D in order to count the support of each item i in I, and determines the set of large 1-itemsets. Then an iteration is performed to compute the set of 2-itemsets, 3-itemsets, and so on. The kth iteration consists of two steps. The first step is to generate the candidate set Ck from the large itemset L(k-1). The second step is to scan the database in order to compute the support count of each candidate set. The candidate generation algorithm is given as follows.

Apriori_gen(L(k-1))
  Ck = empty set
  for all itemsets X in L(k-1) and Y in L(k-1) do
    if X1 = Y1 and ... and X(k-2) = Y(k-2) and X(k-1) < Y(k-1) then begin
      C = X1 X2 ... X(k-1) Y(k-1)
      add C to Ck
    end

Different features are extracted to study the gray level intensity distribution of the kidney region. In total, 20 features are extracted from the segmented kidney images. The Apriori algorithm reduces the number of features to 12, and these are given as the input to the MLP-BP.

4.1.3 CLASSIFICATION USING MLP-BP

A neural network consists of an interconnected group of artificial neurons. A multilayer perceptron neural network with back propagation is used for classification. The intelligent classification is realized in this layer by using the features obtained from AR. The initial weights are random. The number of neurons on the layers:
Input: 12
Hidden: 2

Output: 1
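The 12-2-1 forward pass described above can be sketched with random initial weights, as in the report. The sigmoid activation is an assumption made for this illustration (the report does not specify the activation function), chosen so the output lies in (0, 1):

```python
import numpy as np

rng = np.random.default_rng(0)

# 12 -> 2 -> 1 network; initial weights are random, per the report.
W1 = rng.normal(size=(2, 12))   # hidden layer weights
b1 = rng.normal(size=2)
W2 = rng.normal(size=(1, 2))    # output layer weights
b2 = rng.normal(size=1)

def sigmoid(z):
    # Assumed activation: squashes any real value into (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def forward(features):
    """Forward pass: 12 AR-selected features in, one output value out."""
    hidden = sigmoid(W1 @ features + b1)
    return float(sigmoid(W2 @ hidden + b2))

out = forward(rng.normal(size=12))  # hypothetical feature vector
print(0.0 < out < 1.0)  # True: the output always falls in (0, 1)
```

Backpropagation would then adjust W1, b1, W2 and b2 from labeled training images; only the untrained forward pass is sketched here.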

THRESHOLD

Scientists trying to understand the working of the human brain think our brain has networks of neurons. Each neuron can conduct (the signal) or not, depending on its input, its weights and a threshold value. In scientific terms, neurons fire or not depending on the summed strength of their inputs across synapses of various strengths. Initially, the neural network has random weights and thus does not do the given task well. As we practice, the brain keeps adjusting the weights and the threshold values of the neurons in the network. After a while, when the weights are adjusted, we call the neural network trained, and it can do the task well. As can be seen from the diagram, a TLU has various inputs X1, X2, ..., Xn. These inputs are multiplied with the corresponding weights and added together. If this sum is greater than the threshold value, the output is high (1). Otherwise, the result is low (0).
To start with, the weights in the TLU and the threshold value are randomly decided. Then, the TLU is presented the expected output for a particular input. For the given input, the output of the TLU is also noted. Usually, because the weights are random, the TLU responds in error. This error is used to adjust weights so that the TLU produces the required output for the given input. Similarly, all the expected values in the training set are used to adjust the weights.

Fig 4.1 Threshold logic unit

Once the TLU is trained, it will respond correctly for all inputs in the training set. Also, now that the TLU is trained, we can use it to calculate the output for inputs not in the training set. The threshold value is incremented by 0.05 from 0.1 to 1.0; the classification efficiency is evaluated for each setting and the best possible threshold value is assigned. The input image is determined to be of the normal category if the achieved output value of the MLP-BP is less than or equal to 0.35. If the value is greater than 0.35 and less than 0.75, then the input image is of the medical renal category. If the value is greater than 0.75, then it is of the cortical cyst category.
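The output-value thresholds above map to categories as follows (the function name is invented for illustration):

```python
def categorize(mlp_output):
    """Map the MLP-BP output value to a kidney image category."""
    if mlp_output <= 0.35:
        return "normal"
    elif mlp_output < 0.75:
        return "medical renal disease"
    else:
        # The source leaves the exact value 0.75 unassigned; it is
        # grouped with the cortical cyst branch here.
        return "cortical cyst"

print(categorize(0.2))   # normal
print(categorize(0.5))   # medical renal disease
print(categorize(0.9))   # cortical cyst
```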

4.2 SYSTEM DESIGN

The data flow of the proposed system (Fig 4.2) is as follows:

Input image
    -> Feature extraction:
         First order gray level statistical features
         Second order gray level statistical features
         Power spectral features
         Gabor features
    -> Association rule generation using the Apriori algorithm
    -> Classification: Normal / Abnormal (Medical renal disease or Cortical cyst)

Fig 4.2 Data flow diagram
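As a rough illustration of the first stage of the flow above, the first order gray level statistical features can be computed directly from the pixel intensities of the segmented region. This is a hypothetical Python sketch, not the project's MATLAB code, and the pixel values are made up.

```python
import math

# Sketch of first order gray level statistical features computed
# from a (hypothetical) segmented region given as a flat list of
# pixel intensities. Illustrative only.

def first_order_features(pixels):
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(var)
    # Skewness and kurtosis of the gray-level distribution.
    skew = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3) if std else 0.0
    kurt = sum((p - mean) ** 4 for p in pixels) / (n * var ** 2) if var else 0.0
    return {"mean": mean, "variance": var, "skewness": skew, "kurtosis": kurt}

feats = first_order_features([10, 12, 12, 14, 60])
print(feats["mean"])  # 21.6
```

Second order (co-occurrence) features, power spectral features and Gabor features are computed analogously from the gray level co-occurrence matrix, the Fourier spectrum and Gabor-filtered images respectively.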

CHAPTER 5

SYSTEM IMPLEMENTATION

5.1 SOFTWARE REQUIREMENTS
Operating System : Windows XP (or any platform that supports MATLAB)
Language : MATLAB
Version : MATLAB 7.9

5.2 HARDWARE REQUIREMENTS
Processor : Pentium IV 2.7 GHz or above
RAM : 1 GB DDR
Hard disk : 250 GB

5.3 IMPLEMENTATION OF FEATURE EXTRACTION

The segmented ultrasound kidney images are taken as input and four feature extraction techniques are applied to them: first order gray level statistical features, second order gray level statistical features, power spectral features and Gabor features.

5.4 IMPLEMENTATION OF THE APRIORI ALGORITHM
In data mining, association rule learning is a popular and well-researched method for discovering interesting relations between variables in large databases. Apriori is the best-known algorithm for mining association rules. It uses a breadth-first search strategy to count the support of itemsets, together with a candidate generation function that exploits the downward closure property of support.
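The breadth-first candidate generation and downward-closure pruning just described can be sketched as follows. This is an illustrative Python sketch of frequent itemset mining, not the project's MATLAB implementation; the transactions and minimum support are made-up example values.

```python
from itertools import combinations

# Minimal Apriori sketch: frequent itemsets are found level by
# level (breadth-first), pruning candidates via the downward
# closure property (every subset of a frequent itemset must
# itself be frequent).

def apriori(transactions, min_support):
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        return sum(itemset <= t for t in transactions)

    # Frequent 1-itemsets.
    items = {frozenset([i]) for t in transactions for i in t}
    current = {i for i in items if support(i) >= min_support}
    frequent = []
    while current:
        frequent.extend(sorted(tuple(sorted(i)) for i in current))
        # Join step: build (k+1)-candidates from frequent k-itemsets.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == len(a) + 1}
        # Prune step (downward closure), then count support.
        current = {c for c in candidates
                   if all(frozenset(s) in current
                          for s in combinations(c, len(c) - 1))
                   and support(c) >= min_support}
    return frequent

txns = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
print(apriori(txns, min_support=2))
```

In the project, each of the 20 extracted features plays the role of an item, and the mined rules select the 12 features passed on to the classifier.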

Apriori is applied only to the training data sets; test data are passed directly to the MLP-BP classifier without going through Apriori.

5.5 CLASSIFICATION USING MLP-BP
A multilayer perceptron neural network trained with back propagation (MLP-BP) is used for classification. The intelligent classification is realized in this layer using the features obtained from AR. The classification efficiency is measured for each threshold setting and the best possible threshold value is assigned. The input image is assigned to the normal category if the output value of the MLP-BP is less than or equal to 0.35, to the medical renal disease category if the value is greater than 0.35 and less than 0.75, and to the cortical cyst category if the value is greater than 0.75.
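A minimal MLP with one hidden layer trained by back propagation can be sketched as below. This is an illustrative Python sketch, not the project's MATLAB implementation; the layer sizes, learning rate and toy input are assumptions (the project feeds the network the 12 AR-selected features).

```python
import math
import random

# Minimal multilayer perceptron (one hidden layer, sigmoid units)
# trained by back propagation, i.e. gradient descent on squared
# error. Sizes, learning rate and data are illustrative only.

random.seed(0)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class MLP:
    def __init__(self, n_in, n_hidden):
        self.w1 = [[random.uniform(-1, 1) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.w2 = [random.uniform(-1, 1) for _ in range(n_hidden)]

    def forward(self, x):
        self.h = [sigmoid(sum(w * xi for w, xi in zip(row, x)))
                  for row in self.w1]
        self.out = sigmoid(sum(w * hi for w, hi in zip(self.w2, self.h)))
        return self.out

    def train_step(self, x, target, lr=0.5):
        out = self.forward(x)
        # Output-layer delta, then propagate it back to the hidden layer.
        d_out = (out - target) * out * (1 - out)
        d_hidden = [d_out * w * h * (1 - h) for w, h in zip(self.w2, self.h)]
        self.w2 = [w - lr * d_out * h for w, h in zip(self.w2, self.h)]
        self.w1 = [[w - lr * dh * xi for w, xi in zip(row, x)]
                   for row, dh in zip(self.w1, d_hidden)]
        return 0.5 * (out - target) ** 2

net = MLP(n_in=3, n_hidden=4)
x, target = [0.2, 0.8, 0.5], 1.0
losses = [net.train_step(x, target) for _ in range(50)]
print(losses[0] > losses[-1])  # the error shrinks as weights adjust
```

The single output value produced by `forward` is then thresholded at 0.35 and 0.75 as described above to assign one of the three categories.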

CHAPTER 6

SYSTEM TESTING

6.1 TESTING OBJECTIVES AND PURPOSE
Testing is the major quality measure employed during software development. After the coding phase, the computer programs are executed for testing purposes. Testing must uncover not only errors introduced during coding but also errors committed during earlier phases; the aim of testing is therefore to uncover requirements, design and coding errors in the program. No system design is ever perfect: communication problems, programmer negligence and time constraints create errors that must be eliminated before the system is ready for user acceptance testing. A system is tested for on-line response, transaction volume stress, recovery from failure, and usability. System testing is followed by acceptance testing, that is, running the system with live data by the actual user. Testing is the one step in the software engineering process that can be viewed as destructive rather than constructive: it requires the developer to discard preconceived notions about the correctness of the software just developed and to overcome the conflict of interest that arises when errors are uncovered.

6.2 SYSTEM TESTING

This is the phase in which bugs in the programs were found and corrected. One of the goals of dynamic testing is to produce a test suite in which the computed results are compared with the desired outputs. The suite is re-run after modifications to ensure that a change to the program has no side effects; this is called regression testing. Testing removes residual bugs and improves the reliability of the program. The basic types of testing are:

Unit testing
Integration testing
Validation testing

6.2.1 Unit Testing
This is the first level of testing. Here the different modules are tested against the specifications produced during the design of the modules. Unit testing verifies the code of a single program module in an isolated environment, focusing on the modules independently of one another to locate errors. After coding, each task was tested and run individually; all unnecessary code was removed and it was ensured that every module worked as the programmer expected. Logical errors found were corrected. By running all the modules independently and verifying the output of each, it was concluded that the program was functioning as expected.

6.2.2 Integration Testing

Data can be lost across an interface; one module can have an adverse effect on another; and sub-functions, when combined, may not produce the desired major function. Integration testing is a systematic technique for constructing the program structure while at the same time conducting tests to uncover errors associated with the interfaces. The objective is to take the unit-tested modules and test them as a whole. Correction at this stage is difficult because the sheer size of the entire program complicates the isolation of causes. In the integration testing step, all the errors uncovered are corrected before moving on to the next testing step.

Problem: The user should not have to enter the number of input, hidden and output layers; these should be generated automatically, from the Apriori output to the MLP input for the training data set, and from the feature extraction for the test data.

Solution:

6.2.3 Validation Testing
This provides the final assurance that the software meets all functional, behavioral and performance requirements. The software is completely assembled as a package. Validation succeeds when the software functions in the manner the user expects. Validation refers to the process of using the software in a live environment in order to find errors. During the course of validating the system, failures may occur and sometimes the code has to be changed according to the requirements. The feedback from the validation phase therefore generally produces changes in the software.

Once the application was made free of all logical and interface errors, inputting dummy data ensured that the software developed satisfied all the requirements of the user.

CHAPTER 7 CONCLUSION

The system determines abnormalities in ultrasound kidney images based on association rules and a neural network. The MLP-BP classifies US kidney images as normal, medical renal disease or cortical cyst. The intelligent classification is realized in this layer using the features obtained from AR. The combination of association rules and neural network produces higher accuracy in classifying the three kidney categories than the existing system. The system may be enhanced with the following tasks in the future:
o Different types of neural network can be implemented and their performance compared.
o The number of hidden layers can be increased to improve performance.

REFERENCES

1. Murat Karabatak, "An expert system for detection of breast cancer based on association rules and neural network," Firat University, 2009.
2. K. Bommanna Raja, M. Madheswaran and K. Thyagarajah, "A hybrid fuzzy-neural system for computer-aided diagnosis of ultrasound kidney images using prominent features."
3. P. Rajendran and M. Madheswaran, "Hybrid medical image classification using association rule mining with decision tree algorithm," Journal of Computing, January 2010.
4. Haiwei Pan, Jianzhong Li and Zhang Wei, "Mining interesting association rules in medical images," 2007.
5. Maryellen L. Giger, N. Karssemeijer and S. G. Armato, "Computer-aided diagnosis in medical imaging," IEEE Trans. Med. Imag. 20(12):1205-1208, 2001.
6. B. J. Erickson and B. Bartholmai, "Computer-aided detection and diagnosis at the start of the third millennium," J. Digit. Imaging 15:59-68, 2002.

APPENDIX-I

CONFERENCE DETAILS
