
LINEAR REGRESSION
October 2, 2014 | Saimadhu Polamuri | DATAMINING

Introduction to Linear Regression:


Linear Regression means predicting scores of one variable from the
scores of the second variable. The variable we are predicting is called the
criterion variable and is referred to as Y. The variable we are basing our
predictions is called the predictor variable and is referred to as X. When
there is only one predictor variable, the prediction method is called
simple regression.The aim of linear regression is to nd the best- tting
straight line through the points. The best- tting line is called a regression
line.

http://dataaspirant.com/2014/10/02/linear-regression/


The hypothesis equation is:

h(x) = θ0 + θ1 * x

where:

h(x) is nothing but the value Y (which we are going to predict) for a particular x (meaning Y is a linear function of x)

θ0 is a constant (the y-intercept)

θ1 is the regression coefficient (the slope)

x is the value of the independent variable
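As a quick illustration (my own sketch, not from the original post), the hypothesis can be written as a tiny Python function; the θ values below are made-up numbers just to show the shape of the model:

```python
def hypothesis(theta0, theta1, x):
    """Simple linear regression hypothesis: h(x) = theta0 + theta1 * x."""
    return theta0 + theta1 * x

# Made-up example values: intercept 1.0, slope 2.0
print(hypothesis(1.0, 2.0, 3.0))  # h(3) = 1 + 2*3 = 7.0
```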

Properties of the Linear Regression Line

The linear regression line has the following properties:

1. The line minimizes the sum of squared differences between observed values (the y values) and predicted values (the h(x) values computed from the regression equation).
2. The regression line passes through the mean of the x values (x̄) and through the mean of the y values (ȳ).
3. The regression constant (θ0) is equal to the y-intercept of the regression line.
4. The regression coefficient (θ1) is the average change in the dependent variable (Y) for a 1-unit change in the independent variable (X). It is the slope of the regression line.

The least squares regression line is the only straight line that has all of these properties.
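These properties can be checked numerically. The sketch below (my own illustration, using the closed-form least squares fit on made-up data) verifies property 2: the fitted line passes through the point of means:

```python
def least_squares_fit(xs, ys):
    """Closed-form simple linear regression: returns (theta0, theta1)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    theta1 = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
              / sum((x - x_mean) ** 2 for x in xs))
    theta0 = y_mean - theta1 * x_mean
    return theta0, theta1

xs = [1, 2, 3, 4]
ys = [2, 3, 5, 8]  # made-up data
t0, t1 = least_squares_fit(xs, ys)

# Property 2: the fitted line passes through (mean of x, mean of y)
x_mean = sum(xs) / len(xs)
y_mean = sum(ys) / len(ys)
print(abs((t0 + t1 * x_mean) - y_mean) < 1e-9)  # True
```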

Goal of the Hypothesis Function

The goal of the hypothesis is to choose θ0 and θ1 so that h(x) is close to Y for our training data. While choosing θ0 and θ1, we have to consider the cost function J(θ), and we want values of θ0 and θ1 for which the cost function J(θ) is low.

The function below is called the cost function; the cost function J(θ) is nothing but the squared error function:

J(θ0, θ1) = (1 / 2m) * Σ (h(x_i) − y_i)^2, summed over all m training examples.
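As an illustration (a sketch of my own, not from the post), the squared error cost function can be coded directly; m is the number of training examples and the data below is made up:

```python
def cost(theta0, theta1, xs, ys):
    """Squared error cost: J = (1 / 2m) * sum((h(x_i) - y_i)^2)."""
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)

xs = [1, 2, 3]
ys = [2, 4, 6]                 # made-up data lying exactly on y = 2x
print(cost(0.0, 2.0, xs, ys))  # perfect fit -> 0.0
print(cost(0.0, 1.0, xs, ys))  # errors 1, 2, 3 -> 14 / 6 ≈ 2.33
```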

Let's Understand Linear Regression with an Example

Before explaining linear regression with the example, let me summarize the things we have learned so far.


Suppose we have data that looks something like this:

No.  Year  Population
1    2000  1,014,004,000
2    2001  1,029,991,000
3    2002  1,045,845,000
4    2003  1,049,700,000
5    2004  1,065,071,000
6    2005  1,080,264,000
7    2006  1,095,352,000
8    2007  1,129,866,000
9    2008  1,147,996,000
10   2009  1,166,079,000
11   2010  1,173,108,000
12   2011  1,189,173,000
13   2012  1,205,074,000

Now our task is to answer the below question:

No.  Year  Population
     2014  2,205,074,000

Let me draw a graph for our data


Python Code for graph

# Required Packages
import plotly.plotly as py
from plotly.graph_objs import *
from datetime import datetime

py.sign_in("username", "API_authentication_code")

x = [
    datetime(year=2000, month=1, day=1),
    datetime(year=2001, month=1, day=1),
    datetime(year=2002, month=1, day=1),
    datetime(year=2003, month=1, day=1),
    datetime(year=2004, month=1, day=1),
    datetime(year=2005, month=1, day=1),
    datetime(year=2006, month=1, day=1),
    datetime(year=2007, month=1, day=1),
    datetime(year=2008, month=1, day=1),
    datetime(year=2009, month=1, day=1),
    datetime(year=2010, month=1, day=1),
    datetime(year=2011, month=1, day=1),
    datetime(year=2012, month=1, day=1),
]

data = Data([
    Scatter(
        x=x,
        y=[1014004000, 1029991000, 1045845000, 1049700000, 1065071000,
           1080264000, 1095352000, 1129866000, 1147996000, 1166079000,
           1173108000, 1189173000, 1205074000],
    )
])

plot_url = py.plot(data, filename='DataAspirant')

Now what we will do is find the most suitable values for θ0 and θ1 using the hypothesis equation.

Here x is nothing but the year, and h(x) is the predicted value for our hypothesis.

Once we are done finding θ0 and θ1, we can predict any value.

Keep in mind that we first find θ0 and θ1 for our training data; later we use these θ0 and θ1 values to do the prediction for test data.

Don't think too much about how to find the θ0 and θ1 values. In Linear Regression Implementation in Python, I have explained how we can find the θ0 and θ1 values with a nice example and the coding part too.
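For readers who want a quick peek anyway, here is a minimal sketch (my own, not the post's implementation) that finds θ0 and θ1 for the population data above using the closed-form least squares formulas, then predicts the population for 2014:

```python
years = list(range(2000, 2013))
pops = [1014004000, 1029991000, 1045845000, 1049700000, 1065071000,
        1080264000, 1095352000, 1129866000, 1147996000, 1166079000,
        1173108000, 1189173000, 1205074000]

m = len(years)
x_mean = sum(years) / m
y_mean = sum(pops) / m

# Closed-form least squares for simple linear regression
theta1 = (sum((x - x_mean) * (y - y_mean) for x, y in zip(years, pops))
          / sum((x - x_mean) ** 2 for x in years))
theta0 = y_mean - theta1 * x_mean

prediction_2014 = theta0 + theta1 * 2014
print(round(prediction_2014))  # roughly 1.24 billion
```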


I hope you liked today's post. If you have any questions, feel free to comment below. If you want me to write on one specific topic, do tell me in the comments below.