School of Informatics
Master of Data Science
Haftamu Hailu Tefera
al17hafte@student.his.e
Evaluation of Machine Learning Algorithms and Bias-Variance Trade-off Analysis
The main goal of this assignment is to evaluate machine learning algorithms on a particular dataset, the abalone dataset, and to analyze the bias-variance trade-off as model complexity increases, using polynomial regressions of different degrees to fit the data.
The selected dataset consists of different physical attributes of the abalone, a marine snail. My experiment predicts the age of the animal from physical measurements such as length, height, and diameter. In this experiment, I used the diameter attribute to predict the age.
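For illustration, the abalone data can be loaded with pandas. The two rows below are hard-coded stand-ins in the UCI abalone format, and the column names are my own assumption, not taken from the report:

```python
import io
import pandas as pd

# Two sample rows in the abalone format; in practice these would come
# from the UCI "abalone.data" file.
csv = io.StringIO(
    "M,0.455,0.365,0.095,0.514,0.2245,0.101,0.15,15\n"
    "F,0.53,0.42,0.135,0.677,0.2565,0.1415,0.21,9\n"
)
cols = ["sex", "length", "diameter", "height", "whole_weight",
        "shucked_weight", "viscera_weight", "shell_weight", "rings"]
df = pd.read_csv(csv, header=None, names=cols)

X = df[["diameter"]].values  # the single feature used in the experiment
y = df["rings"].values       # ring count, a proxy for the age
```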
For this assignment, I used Python as the programming language, together with polynomial regression and the collection of functions and classes defined in the scikit-learn (sklearn) machine learning library.
First, I modeled the relationship between the independent and response variables using a simple linear model, but later switched to polynomial regression because the relationship between the independent variable (diameter) and the response variable is not linear.
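The report does not show the exact scikit-learn calls; a polynomial regression of the kind described can be sketched with PolynomialFeatures chained into LinearRegression (the numbers below are toy values, not the real abalone data):

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Toy stand-ins for the diameter (feature) and rings (response) columns.
X = np.array([[0.35], [0.45], [0.53], [0.60], [0.62]])
y = np.array([7, 9, 11, 12, 13])

# Degree-2 polynomial regression: expand the feature into [1, x, x^2],
# then fit an ordinary least-squares linear model on the expanded features.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
pred = model.predict(X)
```

Raising the `degree` argument increases model complexity, which is exactly the knob varied in the bias-variance analysis below.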
The data is first divided into train and test sets; I trained the model on the training set, then applied it to the unseen data (the test set), and finally plotted the graphs as follows.
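The train/test split described above can be sketched with scikit-learn's train_test_split; the synthetic data and the 25% test fraction here are my own assumptions for illustration:

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(0.1, 0.7, size=(100, 1))     # stand-in for the diameter column
y = 20 * X.ravel() + rng.normal(0, 1, 100)   # stand-in for the ring counts

# Hold out 25% of the rows as the unseen test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
```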
Even though I used polynomial regressions of various degrees, there is still error between the actual data points and the values predicted by the model. Simple models do not capture the actual relationship between the response variable (rings) and the feature variable (diameter), while complex models are sensitive to small changes in the data. Therefore, to decide which model is best for my dataset, I performed a bias-variance analysis over varying complexities and obtained the following complexity vs. bias/variance plot.
As we can see from the graph, the variance changes slightly from one complexity to the next, while the bias remains almost the same.
According to Occam's razor, for this experiment the best choice is complexity 1, where both bias and variance are low and the two sources of error are balanced.
1. Code for computing and plotting bias and variance over model complexities
n_models = 10
max_degree = 15
var_values = []
bias_values = []
for degree in range(1, max_degree):
    total_bias = 0.0
    total_var = 0.0
    for m in range(n_models):
        # training the model
        model = fit_poly(X_train, y_train, degree)
        # testing the model on the test data
        Pred = apply_poly(model, X_test)
        total_bias += bias(Pred, y_test)
        total_var += variance(Pred, Pred)
    # average the bias and variance over the n_models repeated fits
    bias_values.append(total_bias / n_models)
    var_values.append(total_var / n_models)
plt.plot(range(1, max_degree), bias_values, label="bias")
plt.plot(range(1, max_degree), var_values, label="variance")
plt.xlabel("Complexity")
plt.ylabel("Error")
plt.grid()
plt.legend()
plt.show()
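The helper functions fit_poly, apply_poly, bias, and variance are not defined in the report; the following is a minimal sketch of what they might look like, using numpy.polyfit and simplified definitions of the bias and variance terms. The exact helpers used in the assignment may differ, and this sketch assumes 1-D feature and target arrays:

```python
import numpy as np

def fit_poly(X_train, y_train, degree):
    # Least-squares polynomial fit; returns the coefficient vector.
    return np.polyfit(X_train, y_train, degree)

def apply_poly(model, X_test):
    # Evaluate the fitted polynomial coefficients on new inputs.
    return np.polyval(model, X_test)

def bias(pred, y_true):
    # Mean squared deviation of the predictions from the targets
    # (a per-model stand-in for the squared-bias term).
    return float(np.mean((pred - y_true) ** 2))

def variance(pred, ref):
    # Spread of the predictions around the mean of the reference predictions.
    return float(np.mean((pred - np.mean(ref)) ** 2))
```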
References
1. scikit-learn documentation: http://scikit-learn.org/stable/
2. Class lecture notes on machine learning