
Joshua Cook

October 20, 2014

Math 382
Assignment 4

Math 382 Assignment 4


import numpy as np
import numpy.linalg as la
import math as m
import matplotlib.pyplot as plt
%matplotlib inline

Problem 1
Write a program to read a two column white space delimited data file and calculate the mean and standard deviation
of each column and correlation of the two columns
\[
r = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2}\,\sqrt{\sum (y_i - \bar{y})^2}}
\]
Run your program on the datafile HW4DATA.txt.
Read in Data
data = np.genfromtxt('HW4DATA.txt', delimiter=',')
n = data[:,0].size
x = data[:,0]
y = data[:,1]
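The problem statement describes a whitespace-delimited file, while the call above passes delimiter=','. As a minimal sketch (the in-memory sample rows below are hypothetical and only stand in for HW4DATA.txt, whose contents are not shown here), np.genfromtxt splits on runs of whitespace when no delimiter is given, so the same two columns can be recovered from either layout:

```python
import io
import numpy as np

# Hypothetical sample rows; these are NOT the contents of HW4DATA.txt.
whitespace_text = "1.0 2.0\n3.0   4.5\n5.0\t6.0\n"
comma_text = "1.0,2.0\n3.0,4.5\n5.0,6.0\n"

# With delimiter left at its default (None), fields are split on whitespace.
ws_data = np.genfromtxt(io.StringIO(whitespace_text))
# With delimiter=',', fields are split on commas instead.
cs_data = np.genfromtxt(io.StringIO(comma_text), delimiter=',')

# Column extraction works the same way for both arrays.
x_ws, y_ws = ws_data[:, 0], ws_data[:, 1]
print(ws_data.shape, np.array_equal(ws_data, cs_data))
```

Which form applies depends on how the data file is actually laid out.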
Create Statistical Functions
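def myMean(v):
    # arithmetic mean of a 1-D array
    n = v.size
    return sum(v)/n

def myStd(v):
    # population standard deviation (divides by n)
    n = v.size
    mu = myMean(v)
    total = 0    # named total so the built-in sum() is not shadowed
    for i in range(n):
        total = total + (v[i]-mu)**2
    return m.sqrt(total/n)

def myCorr(u,v):
    # correlation coefficient of two equal-length 1-D arrays
    n = u.size
    mu_u = myMean(u)
    mu_v = myMean(v)
    sum_1 = 0    # sum of products of deviations
    sum_2 = 0    # sum of squared deviations of u
    sum_3 = 0    # sum of squared deviations of v
    for i in range(n):
        sum_1 = sum_1 + ((u[i]-mu_u)*(v[i]-mu_v))
    for i in range(n):
        sum_2 = sum_2 + (u[i]-mu_u)**2
    for i in range(n):
        sum_3 = sum_3 + (v[i]-mu_v)**2
    return sum_1/(m.sqrt(sum_2*sum_3))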
Statistical Analysis
mu_x = myMean(x)
mu_y = myMean(y)
sigma_x = myStd(x)
sigma_y = myStd(y)
print("Mean of Column 1:\t\t", mu_x)
print("Mean of Column 2:\t\t", mu_y)
print("Standard Deviation of Column 1:\t", sigma_x)
print("Standard Deviation of Column 2:\t", sigma_y)
print("Correlation of Two Columns:\t", myCorr(x,y))

Mean of Column 1:                5.944261
Mean of Column 2:                9.69482647387
Standard Deviation of Column 1:  2.86297310008
Standard Deviation of Column 2:  3.92102139465
Correlation of Two Columns:      0.842105478054
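As a cross-check, formulas like these can be compared against NumPy's built-ins. The sketch below is not part of the original submission: it uses synthetic data (the seed, sample size, and helper names mean_vec, std_vec, and corr_vec are arbitrary assumptions) and vectorized restatements of the same formulas. Note that np.std divides by n by default (ddof=0), which matches the definition used in myStd.

```python
import numpy as np

# Synthetic, correlated sample data (assumption for illustration only).
rng = np.random.default_rng(0)
u = rng.normal(size=50)
v = 2.0 * u + rng.normal(size=50)

def mean_vec(a):
    # arithmetic mean without an explicit loop
    return a.sum() / a.size

def std_vec(a):
    # population standard deviation (divides by n)
    return np.sqrt(((a - mean_vec(a)) ** 2).sum() / a.size)

def corr_vec(a, b):
    # sum of products of deviations over the root of the
    # product of the sums of squared deviations
    da, db = a - mean_vec(a), b - mean_vec(b)
    return (da * db).sum() / np.sqrt((da ** 2).sum() * (db ** 2).sum())

print(np.isclose(mean_vec(u), np.mean(u)))
print(np.isclose(std_vec(u), np.std(u)))
print(np.isclose(corr_vec(u, v), np.corrcoef(u, v)[0, 1]))
```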

Problem 2
Write a program to plot the file HW4DATA.txt as a scatter plot. Do not connect the dots.
plt.scatter(x,y)
plt.axhline(linewidth=1.0, color="black")
plt.axvline(linewidth=1.0, color="black")

<matplotlib.lines.Line2D at 0x107b25990>

Figure 1: Scatter Plot of Data

Problem 3
Write a program to read in a file and calculate the linear regression (least squares linear fit), plot the points as a
scatter plot, and plot the regression line. Run your program on HW4DATA.txt.
From Introduction to Linear Algebra, by Gilbert Strang

Here, Strang is creating a matrix

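\[
A = \begin{bmatrix} 1 & x_0 \\ 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{bmatrix}
\]

and two vectors

\[
b = \begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_n \end{bmatrix}
\quad \text{and} \quad
x = \begin{bmatrix} a_0 \\ a_1 \end{bmatrix}
\]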
Note that $Ax = b$ cannot be solved conventionally because $A$ is not square. Strang's technique is to multiply both
sides on the left by $A^T$, which creates the square (symmetric) matrix $A^T A$ and maintains the equality:
\[
A^T A x = A^T b
\]
which, when $A^T A$ is invertible, can be solved as
\[
x = (A^T A)^{-1} A^T b
\]
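A minimal sketch of this idea on synthetic data (the data, seed, and variable names here are assumptions for illustration, not the assignment's file): the coefficients obtained from the normal equations can be compared with NumPy's own least-squares routine, np.linalg.lstsq, and the residual of the fit should be orthogonal to the columns of the matrix.

```python
import numpy as np

# Synthetic data (assumption): noisy points near y = 2 + 3x.
rng = np.random.default_rng(1)
xs = np.linspace(0.0, 10.0, 25)
ys = 2.0 + 3.0 * xs + rng.normal(scale=0.5, size=xs.size)

# Design matrix: a column of ones and a column of x values.
A = np.column_stack([np.ones_like(xs), xs])

# Normal equations: solve (A^T A) coef = A^T y.
coef_normal = np.linalg.solve(A.T @ A, A.T @ ys)

# Library least-squares solver, used here only for comparison.
coef_lstsq, *_ = np.linalg.lstsq(A, ys, rcond=None)

# The residual of the least-squares fit is orthogonal to the columns of A.
residual = ys - A @ coef_normal
print(coef_normal, coef_lstsq)
print(A.T @ residual)
```

Solving the linear system with a solver, rather than forming the inverse explicitly, is the usual choice in numerical code.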
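A = np.ones((n,2))
for i in range(n):
    A[i,1] = x[i]
# solve the normal equations; the names intercept and slope
# avoid overwriting m, the alias of the math module
[intercept,slope] = la.solve(A.T.dot(A),A.T.dot(y))
indep_vec = np.linspace(0,12,1000)
dep_vec = slope*indep_vec + intercept
plt.plot(indep_vec, dep_vec)
plt.scatter(x,y)
plt.axhline(linewidth=1.0, color="black")
plt.axvline(linewidth=1.0, color="black")
intercept,slope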

(2.8392132606226101, 1.1533163185888702)
<matplotlib.lines.Line2D at 0x107e67290>

Figure 2: Linear Fit, f(x) = 1.153x + 2.839, vs. Scatter Plot
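For a straight-line least-squares fit, the slope equals the correlation times the ratio of the standard deviations, and the line passes through the point of means. As a quick arithmetic check, the sketch below uses only the values printed in Problem 1 and the coefficients reported above; the variable names are chosen here for illustration.

```python
# Values copied from the printed output of Problems 1 and 3.
mu_x, mu_y = 5.944261, 9.69482647387
sigma_x, sigma_y = 2.86297310008, 3.92102139465
r = 0.842105478054
reported_intercept = 2.8392132606226101
reported_slope = 1.1533163185888702

# Slope and intercept implied by the summary statistics alone.
slope_from_stats = r * sigma_y / sigma_x
intercept_from_stats = mu_y - slope_from_stats * mu_x

# Differences should be tiny, limited by the rounding of the printed values.
print(slope_from_stats - reported_slope)
print(intercept_from_stats - reported_intercept)
```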

Problem 4
Write a program to read in a file and calculate the best quadratic fit, plot the points as a scatter plot, and plot the
regression curve. Run your program on HW4DATA.txt.
A = np.ones((n,3))
for i in range(n):
    A[i,1] = x[i]
    A[i,2] = x[i]**2
[a_0,a_1,a_2] = la.solve(A.T.dot(A),A.T.dot(y))
indep_vec = np.linspace(0,12,1000)
dep_vec = a_0+a_1*indep_vec + a_2*indep_vec**2
plt.plot(indep_vec, dep_vec)
plt.scatter(x,y)
plt.axhline(linewidth=1.0, color="black")
plt.axvline(linewidth=1.0, color="black")
a_0,a_1,a_2
(-3.914270435592262, 4.1248403760037702, -0.2506275414900847)
<matplotlib.lines.Line2D at 0x107c265d0>

Figure 3: Quadratic Fit, f(x) = -0.251x^2 + 4.12x - 3.91, vs. Scatter Plot

Problem 5
Write a program to read in a file and calculate the best cubic fit, plot the points as a scatter plot, and plot the
regression curve. Run your program on HW4DATA.txt.
A = np.ones((n,4))
for i in range(n):
    A[i,1] = x[i]
    A[i,2] = x[i]**2
    A[i,3] = x[i]**3
[a_0,a_1,a_2,a_3] = la.solve(A.T.dot(A),A.T.dot(y))
indep_vec = np.linspace(0,12,1000)
dep_vec = a_0+a_1*indep_vec + a_2*indep_vec**2+a_3*indep_vec**3
plt.plot(indep_vec, dep_vec)
plt.scatter(x,y)
plt.axhline(linewidth=1.0, color="black")
plt.axvline(linewidth=1.0, color="black")
a_0,a_1,a_2,a_3

(0.020319901276356165,
1.1723176357858898,
0.32810398181422878,
-0.032524255234103877)
<matplotlib.lines.Line2D at 0x107e15c10>

Figure 4: Cubic Fit, f(x) = -0.0325x^3 + 0.328x^2 + 1.172x + 0.0203, vs. Scatter Plot
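The linear, quadratic, and cubic fits differ only in how many powers of x are placed in the columns of A, so they can be folded into one routine. The sketch below is one possible generalization, not the original submission: the function name poly_fit and the synthetic test data are assumptions, and np.vander with increasing=True is used to build the columns 1, x, x^2, and so on.

```python
import numpy as np

def poly_fit(xs, ys, degree):
    """Return coefficients a_0..a_degree of the least-squares polynomial."""
    # Columns are xs**0, xs**1, ..., xs**degree.
    A = np.vander(xs, degree + 1, increasing=True)
    # Normal equations: (A^T A) coef = A^T y.
    return np.linalg.solve(A.T @ A, A.T @ ys)

# Synthetic data (assumption): an exact cubic, so the fit should recover it.
xs = np.linspace(0.0, 5.0, 30)
ys = 1.0 - 2.0 * xs + 0.5 * xs**2 + 0.1 * xs**3

coef = poly_fit(xs, ys, 3)
# np.polyfit returns the highest power first, so reverse it to compare.
coef_ref = np.polyfit(xs, ys, 3)[::-1]
print(coef)
print(coef_ref)
```

With x and y read from the data file, poly_fit(x, y, 1), poly_fit(x, y, 2), and poly_fit(x, y, 3) would correspond to Problems 3, 4, and 5 respectively.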
