
Decision tree

 A decision tree is one of the predictive modelling approaches used in statistics, data mining and machine learning.
 Decision Trees are a non-parametric supervised learning method used for
both classification and regression tasks.
 Tree models where the target variable can take a discrete set of values are
called classification trees.
 Decision trees where the target variable can take continuous values (typically real numbers)
are called regression trees.
 Classification and Regression Tree (CART) is a general term covering both (a brief scikit-learn sketch follows).
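
As a quick sketch in scikit-learn (toy data, purely illustrative): a classification tree predicts a discrete label, while a regression tree predicts a real number.

```python
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

X = [[1], [2], [3], [4]]

# Classification tree: the target takes a discrete set of values.
clf = DecisionTreeClassifier().fit(X, ["small", "small", "big", "big"])
print(clf.predict([[1.5]]))  # ['small']

# Regression tree: the target is continuous.
reg = DecisionTreeRegressor().fit(X, [1.0, 2.0, 3.0, 4.0])
print(reg.predict([[1.5]]))  # e.g. [1.], depending on the learned split points
```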

Information Gain
 Information gain is used to decide which feature to split on at each step in building the tree.
 A commonly used measure of purity is called information.

“For each node of the tree, the information value measures how much information a feature gives us about the class. The split with the highest information gain will be taken as the first split, and the process will continue until all child nodes are pure or until the information gain is 0.”

Pure
 Pure means that, in a selected sample of the dataset, all data belongs to the same class.

Impure
 Impure means that the data is a mixture of different classes.

Entropy
 In machine learning, entropy is a measure of the randomness in the information being
processed. The higher the entropy, the harder it is to draw any conclusions from that
information.
 If the sample is completely homogeneous, the entropy is zero; if the sample is equally divided between two classes, it has an entropy of one.
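
A minimal sketch of this in Python (the function name and example labels are illustrative, not from the source):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a collection of class labels."""
    total = len(labels)
    return sum(
        -(count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )

print(entropy(["yes", "yes", "yes", "yes"]))  # 0.0 -> completely homogeneous
print(entropy(["yes", "yes", "no", "no"]))    # 1.0 -> equally divided, two classes
```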

Information Gain
 Information gain can be defined as the amount of information gained about a random variable or signal from observing another random variable. It can be computed as the difference between the entropy of the parent node and the weighted average entropy of the child nodes.
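
A sketch of that definition in Python, reusing the entropy function from the sketch above (the split shown is hypothetical):

```python
def information_gain(parent_labels, child_label_groups):
    """Entropy of the parent minus the weighted average entropy of the children."""
    total = len(parent_labels)
    weighted_child_entropy = sum(
        (len(group) / total) * entropy(group) for group in child_label_groups
    )
    return entropy(parent_labels) - weighted_child_entropy

parent = ["yes", "yes", "yes", "no", "no", "no"]
# A candidate split that separates the classes perfectly gains a full bit:
print(information_gain(parent, [["yes", "yes", "yes"], ["no", "no", "no"]]))  # 1.0
```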

Gini Impurity
 Gini impurity is a measure of how often a randomly chosen element from the set would be incorrectly labelled if it were randomly labelled according to the distribution of labels in the subset.

 Gini impurity is lower bounded by 0, with 0 occurring if the data set contains only one class.
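
A sketch of the same idea in Python (names and example labels are illustrative):

```python
from collections import Counter

def gini_impurity(labels):
    """Chance that a randomly chosen element is mislabelled when labels are
    assigned at random according to the label distribution of the set."""
    total = len(labels)
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

print(gini_impurity(["a", "a", "a", "a"]))  # 0.0 -> only one class (the lower bound)
print(gini_impurity(["a", "a", "b", "b"]))  # 0.5 -> the maximum for two classes
```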

There are many algorithms for building a decision tree, for example (see the scikit-learn sketch after this list):

 CART (Classification and Regression Trees): uses Gini impurity as its metric.
 ID3 (Iterative Dichotomiser 3): uses entropy and information gain as its metric.
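
In scikit-learn, for instance, the split criterion is chosen when the tree is constructed; its DecisionTreeClassifier is an optimised CART implementation, and criterion="entropy" swaps in ID3-style information gain as the impurity measure. A brief sketch with made-up toy data:

```python
from sklearn.tree import DecisionTreeClassifier

# Toy data: the class simply follows the second feature.
X = [[0, 0], [1, 0], [0, 1], [1, 1]]
y = [0, 0, 1, 1]

# CART-style splitting with Gini impurity (scikit-learn's default criterion).
gini_tree = DecisionTreeClassifier(criterion="gini").fit(X, y)

# Entropy / information gain, as used by ID3.
entropy_tree = DecisionTreeClassifier(criterion="entropy").fit(X, y)

print(gini_tree.predict([[0, 1]]), entropy_tree.predict([[0, 1]]))  # [1] [1]
```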
Advantages and disadvantages of decision trees

Advantages:
 Decision trees are super interpretable
 Require little data pre-processing
 Suitable for low-latency applications, since a prediction only follows a single root-to-leaf path

Disadvantages:
 More likely to overfit noisy data. The probability of overfitting on noise increases as the tree gets deeper. A common remedy is pruning (see the sketch below).
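
As a sketch of pruning in scikit-learn (the dataset and parameter values are illustrative): max_depth stops growth early (pre-pruning), while ccp_alpha grows the tree fully and then cuts back weak branches via cost-complexity pruning (post-pruning).

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Pre-pruning: cap the depth so the tree cannot chase noise too deeply.
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# Post-pruning: grow fully, then prune back using cost-complexity pruning.
pruned = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(X, y)

print(shallow.get_depth(), pruned.get_depth())
```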

Important Terminology related to Decision Trees


 Root Node: It represents the entire population or sample, and it is further divided into two or more homogeneous sets.
 Splitting: The process of dividing a node into two or more sub-nodes.
 Decision Node: A sub-node that splits into further sub-nodes is called a decision node.
 Leaf / Terminal Node: Nodes that do not split are called leaf or terminal nodes.
 Pruning: Removing sub-nodes of a decision node; it can be seen as the opposite of splitting.
 Branch / Sub-Tree: A subsection of the entire tree is called a branch or sub-tree.
 Parent and Child Node: A node that is divided into sub-nodes is called the parent node of those sub-nodes, and the sub-nodes are its children.

