
GRD Journals- Global Research and Development Journal for Engineering | Volume 5 | Issue 6 | May 2020

ISSN- 2455-5703

Supervised Machine Learning Algorithms:


Classification and Comparison
Shweta Chaudhary
Department of Computer Science and Engineering
Sharda University, Greater Noida

Abstract
Supervised Machine Learning (SML) is the search for algorithms that reason over externally supplied instances to produce general hypotheses, which then make predictions about future instances. Supervised classification is one of the tasks most frequently carried out by intelligent systems. This paper describes various supervised machine learning (ML) classification techniques, compares several learning algorithms and determines the best algorithm based on the data set, the number of instances and the variables (features). Seven algorithms were considered: Decision Table, Random Forest (RF), Naive Bayes (NB), Support Vector Machine (SVM), Neural Network (Perceptron), JRip and Decision Tree (J48), using the Waikato Environment for Knowledge Analysis (WEKA) machine learning tool. To apply the algorithms, a diabetes data set of 768 instances with eight attributes as independent variables and one dependent variable was classified. The results indicate that SVM is the algorithm with the greatest accuracy and precision, with the Naive Bayes and Random Forest classification algorithms found to be the next most accurate after SVM. The study shows that the time taken to build a model and precision (accuracy) are factors on one hand, while the kappa statistic and Mean Absolute Error (MAE) are factors on the other. Therefore, ML algorithms require high precision, accuracy and low error to qualify for predictive supervised machine learning.
Keywords- Machine Learning, Classifiers, Mining Techniques, Data Analysis, Learning Algorithms, Supervised Machine Learning

I. INTRODUCTION
Machine learning is one of the fastest-growing areas of computer science. It refers to the automated detection of meaningful patterns in data, and machine learning tools are concerned with endowing programs with the ability to learn and adapt.
Machine learning has become one of the mainstays of Information Technology and, with that, a central, albeit usually hidden, part of our lives. With the ever-increasing amounts of data becoming available, there is good reason to believe that smart data analysis will become an ever more necessary ingredient for technological progress.
There are many applications of Machine Learning (ML), the most significant of which is data mining. People are often prone to making mistakes during analyses or, possibly, when trying to establish relationships between multiple features.
Data mining and machine learning are like Siamese twins: several useful insights can be derived from data with appropriate learning algorithms. There has been tremendous progress in data mining and machine learning as a result of the evolution of smart and nano technologies, which has raised interest in discovering the hidden patterns in data that can be turned into value. The fusion of statistics, machine learning, information theory and computing has created a solid science with a firm mathematical base and very powerful tools.
Supervised learning builds a function that maps inputs to desired outputs.
The unprecedented pace of data generation has made machine learning techniques indispensable in many settings; at times this calls for algorithms that can learn from unlabelled data (unsupervised learning). Supervised learning, however, is most common in classification problems, because the goal there is usually to get the computer to learn a classification scheme that we have created.
ML is squarely aimed at gaining access to the insight hidden within Big Data. ML helps to ensure value extraction from large and varied data sources with far less reliance on human direction, since it is data-driven and runs at machine scale. Machine learning is well suited to the complexity of dealing with disparate data sources and the huge variety of variables and volumes of data involved, and it thrives as data grows. The more data fed into an ML system, the more it can learn, and the higher the level of insight produced. Freed from the limits of human scale and human-level consideration, ML is able to discover and display the patterns buried in the data.
Another common task in supervised learning is the classification problem: the learner is required to learn (to approximate the behaviour of) a function that maps an input vector into one of several classes by looking at several input-output examples of the function. Inductive machine learning is the process of learning a set of rules from instances (examples in a training set) or, more generally, of creating a classifier that can be used to generalize to new instances. The procedure for applying supervised ML to a real-world problem is described in Figure 1.


Fig. 1: Supervised Machine Learning Techniques

This work focuses on the classification of ML algorithms and on determining the most efficient algorithm, i.e. the one with the highest accuracy and precision. It also examines how the different algorithms perform on large and small data sets, with a view to distinguishing them clearly and providing insight into how to build supervised machine learning models.
The remainder of this work is organized as follows: Section 2 presents a review of the literature on the categories of supervised learning algorithms; Section 3 presents the methodology used; Section 4 discusses the results of the work; and Section 5 presents the conclusions and recommendations for further work.

II. LITERATURE REVIEW

A. Classification of Supervised Learning Algorithms


Accordingly, the machine learning algorithms for classification include the following: Linear Classifiers (Logistic Regression, Naïve Bayes Classifier, Perceptron, Support Vector Machine), Quadratic Classifiers, K-Means Clustering, Boosting, Decision Tree, Random Forest (RF), Neural Networks, Bayesian Networks and more.

1) Linear Classifiers
Linear models for classification separate input vectors into classes using linear (hyperplane) decision boundaries. The goal of linear classification in machine learning is to group items that have similar feature values into classes. A linear classifier achieves this by making a classification decision based on the value of a linear combination of the features. Linear classifiers are often used in situations where the speed of classification is an issue, since they are typically the fastest classifiers. Also, linear classifiers tend to work best when the number of dimensions is large, as in document classification, where each element is typically the count of a word in a document. However, how well the two data sets are separated depends on the margin. In short, the margin determines how cleanly the data are partitioned, and a larger margin makes the given classification problem much easier to solve.
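To make the linear decision rule concrete, here is a minimal sketch in Python (not part of the original study; the weight values are arbitrary illustrative assumptions): the class is read off the sign of a linear combination of the features, and the margin of a point is its signed distance from the separating hyperplane.

```python
import numpy as np

# Illustrative weights and bias defining the hyperplane w.x + b = 0
w = np.array([2.0, -1.0])   # one weight per feature (arbitrary values)
b = -0.5

def linear_classify(x):
    """Assign class +1 or -1 from the sign of the linear combination."""
    return 1 if np.dot(w, x) + b >= 0 else -1

def margin(x):
    """Signed distance of x from the decision hyperplane."""
    return (np.dot(w, x) + b) / np.linalg.norm(w)

point = np.array([1.0, 0.5])
print(linear_classify(point), margin(point))
```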

2) Logistic Regression
This is a classification function that builds, with a single estimator, a multinomial logistic regression model for class membership. Logistic regression usually states where the boundary between the classes lies, and it also states that the class probabilities depend, in a specific way, on the distance from the boundary: as the data set grows, the probabilities move towards the extremes (0 and 1). These statements about probabilities make logistic regression more than just a classifier: it makes stronger, more detailed predictions and can be fitted in a different way; but those strong predictions could be wrong. Logistic regression is an estimation approach like Ordinary Least Squares (OLS) regression; however, with logistic regression the estimation results in a binary outcome.
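As a hedged illustration of these probability statements (the paper's experiments use WEKA, so scikit-learn on synthetic data is our substitution), the sketch below fits a logistic regression and shows the predicted probability approaching 0 or 1 as points move away from the decision boundary.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic two-class data standing in for a real data set
X, y = make_classification(n_samples=200, n_features=2, n_redundant=0,
                           n_informative=2, random_state=0)
clf = LogisticRegression().fit(X, y)

# Probabilities approach 0 or 1 with distance from the decision boundary
distances = clf.decision_function(X)   # signed distance (up to scale)
probs = clf.predict_proba(X)[:, 1]     # P(class = 1)
for d, p in sorted(zip(distances, probs))[:3]:
    print(f"distance {d:+.2f} -> P(1) = {p:.3f}")
```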


3) Naive Bayes (NB) Networks

These are very simple Bayesian networks composed of directed acyclic graphs with a single parent (representing the unobserved node) and several children (corresponding to the observed nodes), under a strong assumption of independence among the child nodes in the context of their parent. The independence model (Naive Bayes) is thus based on estimating probabilities under this assumption. Bayes classifiers are generally less accurate than more sophisticated learning algorithms (such as ANNs). However, a large-scale comparison of the naive Bayes classifier with state-of-the-art algorithms for decision tree induction, instance-based learning and rule induction on standard benchmark datasets found it to be sometimes superior to the other learning schemes, even on datasets with substantial feature dependencies. The attribute-independence problem of the Bayes classifier is addressed by averaged one-dependence estimators.
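A minimal Gaussian Naive Bayes sketch follows (illustrative only; the iris data set is a stand-in, not the paper's diabetes data): the classifier treats each feature as conditionally independent given the class, and 10-fold cross-validation mirrors the paper's test mode.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)          # stand-in data set
nb = GaussianNB()                          # assumes feature independence per class
scores = cross_val_score(nb, X, y, cv=10)  # 10-fold CV, as in the paper
print(f"mean accuracy: {scores.mean():.3f}")
```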

4) Multi-layer Perceptron
This is a classifier in which the weights of the network are found by solving a quadratic programming problem with linear constraints, rather than by solving the non-convex, unconstrained minimization problem of standard neural network training. The perceptron algorithm learns from a batch of training instances by running through the training set repeatedly until it finds a prediction vector that is correct on the whole training set; this prediction rule is then used to predict the labels on the test set.
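The update rule just described can be sketched in a few lines of Python (an illustrative implementation, not WEKA's): the algorithm runs repeatedly through the training set, nudging the weight vector whenever an example is misclassified, until every training example is classified correctly.

```python
import numpy as np

def train_perceptron(X, y, epochs=100):
    """y must contain labels -1/+1; returns weights w and bias b."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):                    # repeated passes over the data
        errors = 0
        for xi, yi in zip(X, y):
            if yi * (np.dot(w, xi) + b) <= 0:  # misclassified example
                w += yi * xi                   # nudge weights towards it
                b += yi
                errors += 1
        if errors == 0:                        # correct on the whole training set
            break
    return w, b

# Tiny linearly separable example (AND-like data)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([-1, -1, -1, 1])
w, b = train_perceptron(X, y)
print(w, b, [np.sign(np.dot(w, x) + b) for x in X])
```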

5) Support Vector Machines (SVMs)


This is one of the most recent supervised machine learning techniques. Support Vector Machine (SVM) models are closely related to multilayer perceptron neural networks. SVM revolves around the notion of a margin on either side of a hyperplane that separates two data classes. Maximizing the margin, and thereby creating the largest possible distance between the separating hyperplane and the instances on either side of it, has been shown to reduce an upper bound on the expected generalization error.
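The following sketch (our illustration, with synthetic blobs standing in for real data) fits a linear-kernel SVM and reports the width of the maximized margin, 2/||w||, together with the number of support vectors that define it.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated blobs stand in for two data classes
X, y = make_blobs(n_samples=100, centers=2, random_state=6)
svm = SVC(kernel="linear", C=1.0).fit(X, y)

w = svm.coef_[0]
margin_width = 2.0 / np.linalg.norm(w)  # distance between the two margins
print(f"margin width: {margin_width:.3f}")
print(f"support vectors: {len(svm.support_vectors_)}")
```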

6) K-means
K-means is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters), and it is used when labelled data is not available. Boosting, by contrast, is a general method for improving the accuracy of any given learning algorithm by combining rough and moderately inaccurate rules of thumb. Given a "weak" learning algorithm that can consistently find classifiers (rules of thumb) at least slightly better than random, say with accuracy >= 55%, then with sufficient data a boosting algorithm can provably produce a single classifier with very high accuracy, say 99%.
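The brief sketch below illustrates both ideas on toy data (illustrative assumptions throughout): k-means grouping unlabelled points into k clusters, and boosting combining weak decision stumps into a noticeably more accurate committee.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs, make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# K-means: partition unlabelled points into k clusters
X_unlabelled, _ = make_blobs(n_samples=150, centers=3, random_state=0)
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_unlabelled)
print("cluster sizes:", sorted(int((km.labels_ == c).sum()) for c in range(3)))

# Boosting: a single weak stump versus a boosted committee of stumps
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
stump = DecisionTreeClassifier(max_depth=1)     # one "rule of thumb"
print("stump alone:", cross_val_score(stump, X, y, cv=10).mean().round(3))
boosted = AdaBoostClassifier(n_estimators=200, random_state=0)  # stumps by default
print("boosted:    ", cross_val_score(boosted, X, y, cv=10).mean().round(3))
```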

7) Decision Trees
Decision Trees (DT) are trees that classify instances by sorting them according to their feature values. Each node in a decision tree represents a feature of an instance to be classified, and each branch represents a value that the node can take. Instances are classified starting at the root node and are sorted according to their feature values. Decision tree learning, used in data mining and machine learning, uses a decision tree as a predictive model that maps observations about an item to conclusions about the item's target value. Such tree models are called classification trees or regression trees. Decision tree classifiers usually employ post-pruning techniques that evaluate the performance of the tree as it is pruned using a validation set. Any node can be removed and assigned the most common class of the training instances that are sorted to it.

8) Neural Networks
Artificial Neural Networks (NNs) can perform several regression and/or classification tasks at once, although commonly each network performs only one. Therefore, in most cases the network has only one output variable, although in the case of many-state classification problems this may correspond to several output units (the post-processing stage then takes care of the mapping from output units to output variables). The behaviour of an Artificial Neural Network (ANN) depends on three basic elements: the input and activation functions of its units, the network architecture, and the weight of each input connection. Given that the first two aspects are fixed, the behaviour of the ANN is defined by the current values of the weights. The weights of the net are initially set to random values, and instances of the training set are then repeatedly exposed to the net. The values for the inputs of an instance are placed on the input units, and the output of the net is compared with the desired output for that instance. All the weights in the net are then adjusted slightly in the direction that would bring the output values of the net closer to the desired output. There are several algorithms with which a network can be trained.
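A compact sketch of this training process using scikit-learn's MLPClassifier (our substitution for WEKA's multilayer perceptron; the synthetic data is illustrative): the weights start at random values and are adjusted iteratively to move the network's outputs towards the desired targets.

```python
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=400, n_features=8, random_state=1)

# Random initial weights; each iteration nudges them to reduce the output error
net = MLPClassifier(hidden_layer_sizes=(10,), max_iter=500, random_state=1)
net.fit(X, y)
print("training accuracy:", net.score(X, y))
print("final loss:", round(net.loss_, 4))
```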

9) Bayesian Network
A Bayesian Network (BN) is a graphical model of probabilistic relationships among a set of variables. Bayesian networks are the best-known representatives of statistical learning algorithms. A problem with BN classifiers is that they are not suitable for datasets with many attributes. Prior expertise or domain knowledge about the structure of a Bayesian network can take the following forms:
– Declaring that a node is a root node, i.e. it has no parents.
– Declaring that a node is a leaf node, i.e. it has no children.
– Declaring that a node is a direct cause or direct effect of another node.
– Declaring that a node is not directly connected to another node.
– Declaring that two nodes are independent, given a condition set.
– Providing a partial ordering of the nodes, i.e. that one node appears earlier than another in the ordering.
– Providing a complete ordering of the nodes.
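For a concrete feel, here is a minimal hand-built two-node network in plain Python (the probabilities are hypothetical, chosen only for illustration): the directed edge Disease -> Test is encoded as a conditional probability table, and Bayes' rule inverts it to answer a diagnostic query.

```python
# Hypothetical two-node network: Disease -> Test
p_disease = 0.01                   # prior P(D = 1)
p_test_given = {1: 0.95, 0: 0.05}  # CPT: P(T = 1 | D)

# P(T = 1) by marginalizing over the parent node
p_test = p_test_given[1] * p_disease + p_test_given[0] * (1 - p_disease)

# Bayes' rule: P(D = 1 | T = 1)
posterior = p_test_given[1] * p_disease / p_test
print(f"P(disease | positive test) = {posterior:.3f}")  # ~0.161
```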


B. Properties of Machine Learning Algorithms


Supervised machine learning techniques apply across many domains; a number of ML application-oriented papers can be found in [18], [25].
SVM and neural networks perform much better when dealing with multiple dimensions and continuous features. On the other hand, logic-based systems perform better when dealing with discrete/categorical features. Neural network models and SVMs require large sample sizes to reach their maximum prediction accuracy, whereas NB may need only a relatively small data set.
There is general agreement that k-NN is very sensitive to irrelevant features; this characteristic can be explained by the way the algorithm works. In addition, most decision tree algorithms cannot perform well on problems that require diagonal partitioning, since the division of the instance space is orthogonal to the axis of one variable and parallel to all the other axes, so the regions produced by the partitioning are all hyperrectangles. ANNs and SVMs, by contrast, perform well when multicollinearity is present and a nonlinear relationship exists between the input and output features.
Naive Bayes (NB) requires little storage space during both the training and classification stages: the strict minimum is the memory needed to store the prior and conditional probabilities. The basic kNN algorithm uses a large amount of storage space for the training phase, and its execution space is at least as large as its training space. By contrast, for all non-lazy learners the execution space is usually much smaller than the training space, since the resulting classifier is usually a highly condensed summary of the data. Moreover, Naive Bayes and kNN can easily be used as incremental learners, whereas rule-based algorithms cannot. Naive Bayes is naturally robust to missing values, since these are simply ignored when computing the probabilities and therefore have no impact on the final decision. On the contrary, kNN and neural networks require complete records to do their work. Decision Trees and rule classifiers have a similar operational profile, as do SVM and ANN. No single algorithm outperforms all the others on all datasets.
Different data sets, with different types of variables and numbers of instances, determine which type of algorithm performs best; according to the no-free-lunch theorem, there is no single learning algorithm that outperforms all others on all data sets. Table 1 presents a comparative analysis of various learning algorithms.
Table 1: Comparison of learning algorithms (**** indicates the best and * the worst performance)

III. RESEARCH METHODOLOGY


Data for the research was obtained from the National Institute of Diabetes and Digestive and Kidney Diseases, via the University of California machine learning repository, available at: https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/ (2017). This data set was selected because of its accuracy and because it is anonymized (de-identified), so confidentiality is assured. The number of attributes is 8, plus the class attribute, making 9 in total. All attributes are numeric-valued, as follows:
– Number of times pregnant
– Plasma glucose concentration at 2 hours in an oral glucose tolerance test
– Diastolic blood pressure (mm Hg)
– Triceps skin fold thickness (mm)
– 2-hour serum insulin (mu U/ml)
– Body mass index (weight in kg/(height in m)^2)


– Diabetes pedigree function
– Age (years)
– Class variable (0 or 1)
Table 2: Class distribution (class value 1 means "tested positive for diabetes"; class value 0 means "tested negative for diabetes")
Class | Number of instances | Changed value (attribute)
0 | 500 | NO
1 | 268 | YES
Table 2 shows that, of all the instances used for this research, 500 tested negative for diabetes and 268 tested positive. The comparative analysis of the different supervised machine learning algorithms was performed using WEKA 3.7.13 (WEKA: the Waikato Environment for Knowledge Analysis). The data set was trained with the nominal attribute column as the dependent variable. The values of the class distribution (class variable) were changed from 1 to YES, meaning tested positive for diabetes, and from 0 to NO, meaning tested negative for diabetes. This is necessary because most of the algorithms require at least one nominal variable column. Seven classification algorithms were used in this research: Decision Table, Random Forest, Naive Bayes, SVM, Neural Network (Perceptron), JRip and Decision Tree (J48). The following characteristics were considered for the comparative analysis: time, correctly classified, incorrectly classified, test mode, number of instances, kappa statistic, MAE, accuracy of YES, accuracy of NO, and classification category.
This research work was carried out by tuning the parameters on two different sets of instances, in order to evaluate accuracy and to ensure precision across the different machine learning algorithms. The first set consists of 768 instances and 9 attributes (number of times pregnant, plasma glucose concentration at 2 hours in an oral glucose tolerance test, diastolic blood pressure (mm Hg), triceps skin fold thickness (mm), 2-hour serum insulin (mu U/ml), body mass index (weight in kg/(height in m)^2), diabetes pedigree function, age (years) and the class variable (0 or 1)), with one dependent variable and eight independent variables. The second set consists of 384 instances and 6 attributes (number of times pregnant, plasma glucose concentration at 2 hours in an oral glucose tolerance test, 2-hour serum insulin (mu U/ml), diabetes pedigree function, age (years) and the class variable (0 or 1)), with one dependent variable and five independent variables.
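For readers who want to approximate this comparison outside WEKA, the following is a hedged Python sketch with scikit-learn. The local file name pima-indians-diabetes.csv and its column order are our assumptions, and scikit-learn's classifiers only loosely correspond to WEKA's (e.g. DecisionTreeClassifier stands in for J48); WEKA also computes MAE from class probabilities, so the figures will not match Tables 3 and 4 exactly.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import Perceptron
from sklearn.metrics import cohen_kappa_score, mean_absolute_error
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumed local copy of the UCI Pima data: 8 features then the 0/1 class column
cols = ["preg", "glucose", "bp", "skin", "insulin", "bmi", "pedigree", "age", "class"]
df = pd.read_csv("pima-indians-diabetes.csv", header=None, names=cols)
X, y = df[cols[:-1]].to_numpy(), df["class"].to_numpy()

models = {
    "Random Forest": RandomForestClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(),
    "Perceptron": Perceptron(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),  # J48 analogue
}
for name, model in models.items():
    pred = cross_val_predict(model, X, y, cv=10)  # 10-fold CV, as in the paper
    print(f"{name:15s} acc={np.mean(pred == y):.4f} "
          f"kappa={cohen_kappa_score(y, pred):.4f} "
          f"MAE={mean_absolute_error(y, pred):.4f}")
```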

IV. RESULTS AND DISCUSSION

A. Results
WEKA was used to classify and compare the different machine learning algorithms. Table 3 shows the results for the data set with 9 attributes, together with the parameters considered.
Table 3: Comparison of the different classification algorithms on the larger data set with more features
Algorithm | Time (sec) | Correctly classified (%) | Incorrectly classified (%) | Test mode | Attributes | No. of instances | Kappa statistic | MAE | Accuracy of Yes | Accuracy of No | Classification
Decision Table | 0.23 | 72.3948 | 27.6042 | 10-fold cross-validation | 9 | 768 | 0.3752 | 0.341 | 0.619 | 0.771 | Rules
Random Forest | 0.55 | 74.7396 | 25.2604 | 10-fold cross-validation | 9 | 768 | 0.4313 | 0.3105 | 0.653 | 0.791 | Trees
Naive Bayes | 0.03 | 76.3021 | 23.6979 | 10-fold cross-validation | 9 | 768 | 0.4664 | 0.2841 | 0.678 | 0.802 | Bayes
SVM | 0.09 | 77.3438 | 22.6563 | 10-fold cross-validation | 9 | 768 | 0.4682 | 0.2266 | 0.740 | 0.785 | Functions
Neural Network | 0.81 | 75.1302 | 24.8698 | 10-fold cross-validation | 9 | 768 | 0.4445 | 0.2938 | 0.653 | 0.799 | Functions
JRip | 0.19 | 74.4792 | 25.5208 | 10-fold cross-validation | 9 | 768 | 0.4171 | 0.3461 | 0.659 | 0.780 | Rules
Decision Tree (J48) | 0.14 | 73.8281 | 26.1719 | 10-fold cross-validation | 9 | 768 | 0.4164 | 0.3158 | 0.632 | 0.790 | Trees
Time is the time taken to build the model; MAE (Mean Absolute Error) is a measure of how close the predictions are to the eventual outcomes.
Table 4: Comparison of the different classification algorithms on the smaller data set with 6 features
Algorithm | Time (sec) | Correctly classified (%) | Incorrectly classified (%) | Test mode | Attributes | No. of instances | Kappa statistic | MAE | Accuracy of Yes | Accuracy of No | Classification
Decision Table | 0.09 | 67.9688 | 32.0313 | 10-fold cross-validation | 6 | 384 | 0.3748 | 0.3101 | 0.581 | 0.734 | Rules
Random Forest | 0.42 | 71.875 | 28.125 | 10-fold cross-validation | 6 | 384 | 0.3917 | 0.348 | 0.639 | 0.763 | Trees
Naive Bayes | 0.01 | 70.5729 | 29.4271 | 10-fold cross-validation | 6 | 384 | 0.352 | 0.3297 | 0.633 | 0.739 | Bayes
SVM | 0.04 | 72.9167 | 27.0833 | 10-fold cross-validation | 6 | 384 | 0.3837 | 0.2708 | 0.711 | 0.735 | Functions
Neural Network | 0.17 | 59 | 41 | 10-fold cross-validation | 6 | 384 | 0.1156 | 0.4035 | 0.444 | 0.672 | Functions
JRip | 0.01 | 64 | 36 | 10-fold cross-validation | 6 | 384 | 0.2278 | 0.4179 | 0.514 | 0.714 | Rules
Decision Tree (J48) | 0.03 | 64 | 36 | 10-fold cross-validation | 6 | 384 | 0.1822 | 0.4165 | 0.56 | 0.685 | Trees
Time is the time taken to build the model; MAE (Mean Absolute Error) is a measure of how close the predictions are to the eventual outcomes. YES means tested positive for diabetes; NO means tested negative for diabetes.
Table 4 shows the results of comparing the various machine learning algorithms and metrics on the data set with 6 features. The kappa statistic is a metric that compares the observed accuracy with the expected accuracy (random chance).
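Concretely, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed accuracy and p_e the accuracy expected by chance from the marginal class frequencies. A small illustrative helper follows (the example numbers are invented, not taken from the tables):

```python
def kappa(p_observed, true_freqs, pred_freqs):
    """Cohen's kappa from observed accuracy and marginal class frequencies."""
    p_expected = sum(t * p for t, p in zip(true_freqs, pred_freqs))
    return (p_observed - p_expected) / (1 - p_expected)

# 77% accuracy on a 65%/35% class split, predictions in roughly the same ratio
print(round(kappa(0.77, [0.65, 0.35], [0.65, 0.35]), 3))  # ~0.495
```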
Table 5: Ranking of the accuracy of positive and negative diabetes predictions for the different algorithms on the small data set
Small Dataset (384 instances)
Algorithm | YES (positive diabetes) | NO (negative diabetes)
SVM | 0.711 | 0.735
Random Forest | 0.639 | 0.761
Naive Bayes | 0.633 | 0.739
Decision Table | 0.581 | 0.734
Decision Tree (J48) | 0.519 | 0.685
JRip | 0.514 | 0.714
Neural Network (Perceptron) | 0.444 | 0.672
Table 6: Ranking of the accuracy of positive and negative diabetes predictions for the different algorithms on the large data set
Large Dataset (768 instances)
Algorithm | YES (positive diabetes) | NO (negative diabetes)
SVM | 0.74 | 0.785
Naive Bayes | 0.678 | 0.802
JRip | 0.659 | 0.78
Random Forest | 0.653 | 0.791
Neural Network (Perceptron) | 0.653 | 0.799
Decision Tree (J48) | 0.632 | 0.79
Decision Table | 0.619 | 0.771
Table 7: Time taken to build the model, with correctly and incorrectly classified percentages, for each algorithm on the small data set
Small Dataset (384 instances)
Algorithm | Time | Correctly Classified | Incorrectly Classified
SVM | 0.04 sec | 72.92% | 27.08%
Random Forest | 0.42 sec | 71.88% | 28.13%
Naive Bayes | 0.01 sec | 70.57% | 29.43%
Decision Table | 0.09 sec | 67.97% | 32.03%
JRip | 0.01 sec | 64% | 36%
Decision Tree (J48) | 0.03 sec | 64% | 36%
Neural Network (Perceptron) | 0.17 sec | 59% | 41%
Table 8: Time taken to build the model, with correctly and incorrectly classified percentages, for each algorithm on the large data set
Large Dataset (768 instances)
Algorithm | Time | Correctly Classified | Incorrectly Classified
SVM | 0.09 sec | 77.34% | 22.66%
Naive Bayes | 0.03 sec | 76.30% | 23.70%
Neural Network (Perceptron) | 0.81 sec | 75.13% | 24.87%
Random Forest | 0.55 sec | 74.74% | 25.26%
JRip | 0.19 sec | 74.48% | 25.52%
Decision Tree (J48) | 0.14 sec | 73.83% | 26.17%
Decision Table | 0.23 sec | 72.40% | 27.60%
Table 9: Detailed analysis of various dataset attributes
Attribute number Mean Standard Deviation
1 3.8 3.4
2 120.9 32.0
3 69.1 19.4
4 20.5 16.0
5 79.8 115.2
6 32.0 7.9
7 0.5 0.3
8 33.2 11.8

B. Discussion
Table 3 shows a comparison of the results for the 768 instances and 9 features. It is observed that all the algorithms have a kappa statistic higher than their MAE (Mean Absolute Error), and that correctly classified instances far outnumber incorrectly classified ones. This indicates that the analysis is more reliable on the larger data set. SVM and NB require large sample sizes to reach their maximum prediction accuracy, as shown in Table 3, while Decision Tree and Decision Table show the lowest accuracy.


Table 4 shows a comparison of the results for the 384 instances and 6 features. The kappa statistics for the Neural Network, JRip and J48 are lower than their MAE and therefore do not indicate accuracy and precision. However, SVM and RF still show accuracy and precision on the reduced data set. The Decision Table took more time to build its model than JRip and the Decision Tree; a shorter build time therefore does not guarantee accuracy. If the kappa statistic is less than the Mean Absolute Error (MAE), the algorithm does not exhibit accuracy and precision; it follows that an algorithm whose kappa falls below its MAE on a given data set cannot classify that data set accurately and precisely. Table 6 shows SVM as the most accurate algorithm on the large data set, and Table 5 likewise shows SVM as the most accurate algorithm on the small data set.
Tables 7 and 8 show a comparison of the correctly and incorrectly classified percentages for the small and large data sets against the time taken to build the model. From Table 7, Naive Bayes and JRip emerge as the fastest algorithms to build, although JRip has a lower correctly classified percentage, which shows that a short build time does not by itself yield an accurate model. In the same vein, SVM reaches the highest accuracy with a build time of 0.04 seconds. Table 8 shows a similar picture, with the Neural Network (Perceptron) as the third most correctly classified algorithm; this means that the neural network works better with larger data sets than with smaller ones. Furthermore, the results indicate that the Decision Table does not work well with large data sets. The SVM algorithm achieves the highest classification accuracy overall, and the larger the data set, the greater its accuracy.
Table 9 shows the mean and standard deviation of all the attributes used in this research. Plasma glucose concentration (attribute 2) has the highest mean, and the diabetes pedigree function (attribute 7) the lowest, indicating strong effects in small data sets. However, its low standard deviation (SD) is not desirable, meaning that the diabetes pedigree function (attribute 7) may not be of importance when analyzing large data sets.

V. CONCLUSION AND RECOMMENDATION FOR FURTHER WORKS


ML classification requires fine tuning of the parameters and, at the same time, of the number of instances in the data set. What matters when building a model is not only the time an algorithm takes, but accurate and correct classification. Consequently, the best-performing algorithm on one particular data set cannot be guaranteed to deliver the same accuracy and precision on another data set whose features differ logically from the first. The important question when dealing with ML classification is therefore not whether one learning algorithm is superior to others, but under what conditions a particular method significantly outperforms the others on a given application problem. To this end, meta-learning uses a set of attributes, called meta-attributes, to represent the characteristics of learning tasks, and searches for correlations between these attributes and the performance of learning algorithms. Characteristics of learning tasks include: the number of instances, the proportion of categorical attributes, the proportion of missing values, the entropy of classes, etc.
A comprehensive list of information-theoretic and statistical measures for a data set has been provided in the literature. After a thorough understanding of the strengths and limitations of each method, the possibility of combining two or more algorithms to solve a problem should be investigated; the idea is to use the strengths of one method to complement the weaknesses of another. If we are only interested in the best possible classification accuracy, it may be difficult or impossible to find a single classifier that performs as well as a good ensemble of classifiers. The SVM, NB and RF machine learning algorithms can provide high accuracy and precision, regardless of the number of features and instances. This research shows that the time taken to build a model is one factor on one side, while the kappa statistic and MAE, together with precision and accuracy, are factors on the other. Therefore, an ML algorithm requires high precision, high accuracy and minimal error to qualify as a good predictive supervised machine learning algorithm.
This work recommends that, for large data sets, a distributed processing environment be considered, as this allows a higher degree of correlation between the variables to be exploited, which ultimately makes the model more efficient.

REFERENCES
[1] Smola, A. and Vishwanathan, S.V.N. (2008). Introduction to Machine Learning. Cambridge University Press, 2008. ISBN: 0-521-82583-0.
Available on the KTH website: https://www.kth.se/social/upload/53a14887f276540ebc81aec3/online.pdf
Retrieved from: http://alex.smola.org/drafts/thebook/pdf
[2] Bishop, C. M. (1995). Neural Networks for Pattern Recognition. Clarendon Press, Oxford, England, 1995.
[3] Brazdil, P., Soares, C. & Da Costa, J. (2003). Ranking Learning Algorithms: Using IBL and Meta-Learning on Accuracy and Time Results. Machine Learning, Volume 50, Issue 3, 2003, pp. 251-277. Copyright © Kluwer Academic Publishers, the Netherlands. doi: 10.1023/A:1021713901879.
Available on the Springer website: https://link.springer.com/content/pdf/10.1023%2FA%3A1021713901879.pdf
[4] Cheng, J., Greiner, R., Kelly, J., Bell, D. & Liu, W. (2002). Learning Bayesian Networks from Data: An Information-Theory Based Approach. Artificial Intelligence, Volume 137, 2002, pp. 43-90. Copyright © 2002, published by Elsevier Science.
Available on ScienceDirect: http://www.sciencedirect.com/science/article/pii/S00043704200191111
[5] Domingos, P. and Pazzani, M. (1997). On the Optimality of the Simple Bayesian Classifier under Zero-One Loss. Machine Learning, Volume 29, pp. 103-130. Copyright © 1997 Kluwer Academic Publishers, the Netherlands.
Available on the University of Trento website: http://disi.unitn.it/~p2p/RelatedWork/Matching/domingos97optimality.pdf