Prepared By:
Name
I.D. NO
Imran Hossain
Table of Contents
Introduction
Background
Variables
Statistical Approaches
Data sheet
Descriptive statistics
Regression Analysis
Correlations
Hypothesis testing
Findings
Conclusion
Reference
ACKNOWLEDGEMENT
First of all, we would like to express our sincere gratitude to almighty Allah that we have
successfully completed our report.
We would like to thank our honorable teacher of the Business Statistics course, Dr. Kais Zaman,
for giving us this opportunity and the help needed to prepare this report.
Finally, we would like to thank our classmates for their cooperative attitude, which helped us
overcome the problems we faced in preparing our report.
Dear Sir,
It is our great honor to submit our project report on the effect on the selling price (dependent
variable) of changes in the independent variables for different car models. This
report seeks to identify and analyze the relationships among the variables. The report contains
statistical analysis and some findings and recommendations. It would be our enormous pleasure
if you find this report useful and informative in gaining a clear perspective on the issue.
Thanking you
1. Imran Hossain
ID - 1321071660
2. Rajiv Shamim
ID -1120542460
ID - 1330802660
1. Introduction
1.1 Origin of the Report:
BUS 511 is a statistics course offered in the MBA program of NSU in order to equip students
with the statistical tools. The project was initiated so that the students would get a practical
exposure of statistical analysis in a project work. Different types of statistical tools were used
in this project to find out the results.
To find out the level of impact and relationship between Wheel drive and car
selling price
Regression analysis of 4 independent variables with the dependent variable
Testing the usefulness of the model
Testing the partial regression coefficients
Testing the correlation coefficients
To get practical exposure to statistical analysis
1.4 Methodology:
The data used in this report were collected from different car showrooms in the city. These
include the sole agents of the manufacturers in the city, such as Pacific Motors BD for Nissan and
Hyundai, Navana 3s for Toyota, Honda and some local car dealers. In total, 33 car models are
used as the sample. After collecting the data, we analyzed them with the help of
statistical software (Minitab 17). The collected data were first summarized and presented
graphically. Then we tested some hypotheses about the population mean for each of the
variables. After that, we used Minitab to calculate the correlations among the
different variables to see the strength of their relationships. Then we tested hypotheses about the
correlation coefficients and extended the relationships to a multiple regression model.
After that, we tested hypotheses about the partial regression coefficients, and finally we tested
the usefulness of the regression model.
2. Background
2.1 History of the Automobile industry
The history of the automobile begins as early as 1769, with the creation of steam engine
automobiles capable of human transport. In 1806, the first cars powered by an internal
combustion engine running on fuel gas appeared, which led to the introduction in 1885 of the
ubiquitous modern gasoline- or petrol-fueled internal combustion engine. Cars powered by
electric power briefly appeared at the turn of the 20th century, but largely disappeared from
use until the turn of the 21st century. The early history of the automobile can be divided into
a number of eras, based on the prevalent means of propulsion. Later periods were defined by
trends in exterior styling, and size and utility preferences.
3. Variables
3.1 Explanation of test parameters
There are 5 variables in total in this project: 1 dependent variable and 4
independent variables. The selling price of a car is always of interest to anyone who wants
to buy it, so car selling price is our dependent variable in this report. The 4 variables assumed to
affect the car selling price are the independent variables. These independent
variables are given below:
Engine Displacement- Cubic Centimeters (CC)
Horse power (HP)
Fuel Miles per gallon (MPG)
Wheel /Drive
engine            cc      hp       rpm       cc / hp
-                 3.5     3.45     42,600    1.0145
-                 8194    8000+    8,200     1.0243
-                 1494    1400+    14,000    1.0671
-                 1494    1000     13,000    1.4940
-                 125     47       -         2.6596
-                 3.5     1.3      26,000    2.6923
-                 2.47    0.82     27,200    3.0122
-                 2998    920      19,200    3.2587
Motocross bike    125     33       -         3.7879
-                 125     33       -         3.7879
-                 3000    750      -         4.0000
This table shows that two engines with the same cc can produce different horsepower, so cc
alone does not determine horsepower.
Data source: http://www.simetric.co.uk/si_cc2hp.htm
For four-wheeled vehicles, the term two-wheel drive (2WD) describes vehicles that are able to transmit
torque to at most two road wheels, referred to as either front- or rear-wheel drive. The term
4x2 is also used, to indicate four total road wheels with two being driven.
Four-wheel drive, 4WD, 4x4 ("four-by-four"), all-wheel drive, and AWD are terms used to
describe a four-wheeled vehicle with a drive train that allows all four road wheels to receive
torque from the internal combustion engine simultaneously. Although some people associate the
term with off-road vehicles, powering all four wheels also provides better control, and therefore
safety, on slick ice, and is an important part of rally racing on mostly paved roads.
Front-wheel drive
Front-wheel drive (or FWD for short) is the most common form of internal combustion
engine / transmission layout used in modern passenger cars, where the engine drives the front
wheels. Most front-wheel drive vehicles today feature transverse engine mounting, whereas
in past decades engines were mostly positioned longitudinally instead. Rear-wheel drive was
the traditional standard, and is still widely used in luxury cars and most sports cars. Four-wheel
drive is also sometimes used.
Rear-wheel drive
Rear-wheel drive (or RWD for short) was a common internal combustion engine /
transmission layout used in automobiles throughout the 20th century.
4. Statistical Approaches
4.1 Theoretical Model:
Dependent variable: Cars selling price (Y)
Independent variable: X1, X2, X3, X4
Car selling price, Y= f (X1, X2, X3, X4)
The analysis would be based on different variables of cars and the internal relationship of
their characteristics with the cars selling price.
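The regression equation reported later in this report is linear, so the general function f above can be read as the following linear form. This is a sketch under that assumption; the coefficient symbols and the error term are notation introduced here, with X1 to X4 taken in the order the variables are listed in Section 3 (CC, HP, Fuel MPG, Wheel/Drive).

```latex
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \varepsilon
```

Here Y is the car selling price in BDT, each \beta_j is the partial regression coefficient of X_j, and \varepsilon is the random error.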
4.3 Hypothesis:
H1:
H2:
H3: Fuel Miles per gallon (MPG) has an impact on car selling price
H4:
Car Model    Selling Price in BDT    CC    HP    Fuel (MPG)    Wheel /Drive
1
2013
NISSAN 16500000
5700
381
15
NISSAN 9500000
4000
270
19
2850000
1500
135
28
2800000
1500
135
28
PATROL
2
2012
MURANO
2012
Toyota
Premio G
4
2012
NISSAN 1650000
1300
132
25
SUNNY
6
2013Toyota Yaris
1750000
1299
132
30
1800
165
65
8200000
2500
231
66
NISSAN 2300000
2000
132
46
NISSAN 2650000
1800
98
50
Hybrid
8
2012
SYLPHY
10
2012
BLUEBIRD
11
5200000
2400
115
39
12
1800
98
42
2500
179
24
1450000
1300
132
30
NISSAN 4500000
3500
266
21
2362
159
26
4500
310
15
2982
182
21
1500
132
22
2694
270
17
TRAIL
13
2012
NISSAN 4550000
CEFIRO
14
Toyota Avanza
15
2012
PATHFINDER
Hybrid
16
4200000
17
18
19
2012
13200000
NISSAN 1750000
SUNNY 1.5
20
2012
Fortuner
Toyota 9000000
21
2011
NISSAN 5700000
3500
268
20
NISSAN 2250000
2500
169
22
Hyundai 4500000
2400
179
28
1500000
1200
105
30
NISSAN 5200000
3500
270
17
DUALIS
22
2011
TEANA
23
2013
Sonata
24
Hyundai i10
25
2011
SKYLINE
26
Hyundai Eon
1150000
814
95
35
27
6300000
2400
175
27
28
1700000
1800
132
22
29
1600
127
28
2012
30
8400000
2500
179
22
31
Honda City
1950000
1300
120
30
32
2800000
2400
185
24
33
Mitsubishi
Pajero 6900000
2700
175
25
Sport 2013
4.6 Graphs
4.6.1 Histogram: A histogram is a graphical representation of the distribution of data. It is an
estimate of the probability distribution of a continuous variable. A histogram represents
tabulated frequencies as adjacent rectangles erected over discrete intervals, each with an
area proportional to the frequency of the observations in the interval. When each area is scaled
to equal the frequency, the total area of the histogram equals the number of observations.
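The tabulation behind a histogram can be sketched in a few lines of Python. This is a minimal illustration, not the procedure Minitab uses: the function name, the bin range, and the number of bins are assumptions, and the ten prices are a subset of the selling prices in the data sheet rather than the full sample.

```python
def histogram_counts(values, low, high, n_bins):
    """Count how many values fall into each of n_bins equal-width intervals."""
    width = (high - low) / n_bins
    counts = [0] * n_bins
    for v in values:
        if not (low <= v <= high):
            raise ValueError(f"{v} lies outside the range [{low}, {high}]")
        # The top edge of the range is placed in the last bin.
        idx = min(int((v - low) / width), n_bins - 1)
        counts[idx] += 1
    return counts

# Ten selling prices (BDT) taken from the data sheet; an illustrative subset only.
prices = [16500000, 9500000, 2850000, 2800000, 1650000,
          1750000, 8200000, 2300000, 2650000, 5200000]

# Assumed bins: four intervals of BDT 5,000,000 between 0 and 20,000,000.
counts = histogram_counts(prices, 0, 20_000_000, 4)
print(counts)
```

The counts always add up to the number of observations, which is the property described in the paragraph above.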
[Figure: Histogram of Selling Price in BDT with normal curve. Mean 5813636, StDev 7094719, N 33]
[Figure: Histogram of CC with normal curve. Mean 2350, StDev 1052, N 33]
[Figure: Histogram of HP with normal curve. Mean 176.8, StDev 69.52, N 33]
[Figure: Histogram of Fuel (MPG). Mean 29.06, StDev 12.48, N 33]
[Figure: Histogram of Wheel /Drive]
4.6.2 Scatter diagram: The scatter plot is widely used to present measurements of two or more
related variables. It is particularly useful when the values of the variable on the y-axis are thought to be
dependent upon the values of the variable on the x-axis (usually an independent variable). In a
scatter plot, the data points are plotted but not joined; the resulting pattern indicates the type and
strength of the relationship between two or more variables.
[Figures: Scatter plots of Selling Price in BDT versus CC, HP, Fuel (MPG), and Wheel /Drive]
4.6.3 Probability Plot: The normal probability plot is a graphical technique for normality testing:
assessing whether or not a data set is approximately normally distributed. The data are plotted
against a theoretical distribution in such a way that the points should form approximately a
straight line. Departures from this straight line indicate departures from the specified distribution.
5. Descriptive statistics
5.1 Descriptive Statistics: Selling Price, CC, HP, Fuel (MPG), wheel drive
Descriptive Statistics: Selling Price in BDT, CC, HP, Fuel (MPG), Wheel /Drive
Variable              N   Mean     SE Mean  StDev    Minimum  Q1       Median   Q3       Maximum
Selling Price in BDT  33  5813636  1235032  7094719  1150000  1850000  4200000  6650000  40000000
CC                    33  2350     183      1052     814      1500     2400     2697     5700
HP                    33  176.8    12.1     69.5     95.0     132.0    165.0    208.0    381.0
Fuel (MPG)            33  29.06    2.17     12.48    15.00    21.50    26.00    30.00    66.00
Wheel /Drive          33  2.788    0.173    0.992    2.000    2.000    2.000    4.000    4.000
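The SE Mean column follows from the StDev and N columns through SE = StDev / sqrt(N). The short check below is a sketch that recomputes it from the reported figures; the function name is illustrative.

```python
import math

def se_mean(stdev, n):
    """Standard error of the mean: sample standard deviation over sqrt(n)."""
    return stdev / math.sqrt(n)

# (StDev, N) pairs as reported in the descriptive statistics.
reported = {
    "Selling Price in BDT": (7094719, 33),
    "CC": (1052, 33),
    "HP": (69.5, 33),
    "Fuel (MPG)": (12.48, 33),
    "Wheel /Drive": (0.992, 33),
}

se_values = {name: se_mean(sd, n) for name, (sd, n) in reported.items()}
for name, se in se_values.items():
    print(f"{name}: SE Mean = {se:.3f}")
```

Rounded to the precision shown in the table, the results agree with the reported SE Mean values.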
5.2 Summary
Summary Reports for Selling Price in BDT, CC, HP, Fuel (MPG), Wheel /Drive
Statistic                   Selling Price in BDT  CC         HP       Fuel (MPG)  Wheel /Drive
Anderson-Darling Normality Test
A-Squared                   3.85                  0.97       1.57     2.03        6.12
P-Value                     <0.005                0.013      <0.005   <0.005      <0.005
Mean                        5813636               2350.0     176.76   29.061      2.7879
StDev                       7094719               1052.3     69.52    12.485      0.9924
Variance                    5.03350E+13           1107243.8  4833.44  155.871     0.9848
Skewness                    3.8000                1.26155    1.17036  1.71811     0.45507
Kurtosis                    17.3185               2.06613    0.94583  2.90649     -1.91285
N                           33                    33         33       33          33
Minimum                     1150000               814.0      95.00    15.000      2.0000
1st Quartile                1850000               1500.0     132.00   21.500      2.0000
Median                      4200000               2400.0     165.00   26.000      2.0000
3rd Quartile                6650000               2697.0     208.00   30.000      4.0000
Maximum                     40000000              5700.0     381.00   66.000      4.0000
95% CI for Mean, lower      3297958               1976.9     152.11   24.634      2.4360
95% CI for Mean, upper      8329314               2723.1     201.41   33.488      3.1398
95% CI for Median, upper    5451281               2500.0     179.00   29.005      4.0000
95% CI for StDev, upper     9384136               1391.8     91.96    16.514      1.3126
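The 95% confidence interval for a mean in the summary output can be reproduced as mean ± t × StDev / sqrt(N). The sketch below assumes a two-sided critical value of about 2.0369 for 32 degrees of freedom, taken from a t table; that constant and the function name are assumptions, not values printed in the report.

```python
import math

# Assumed two-sided 95% t critical value for df = 33 - 1 = 32 (from a t table).
T_CRIT_DF32 = 2.0369

def mean_ci(mean, stdev, n, t_crit):
    """Confidence interval for a population mean based on the t distribution."""
    half_width = t_crit * stdev / math.sqrt(n)
    return mean - half_width, mean + half_width

price_low, price_high = mean_ci(5813636, 7094719, 33, T_CRIT_DF32)
cc_low, cc_high = mean_ci(2350.0, 1052.3, 33, T_CRIT_DF32)

print(f"Selling price: ({price_low:.0f}, {price_high:.0f})")
print(f"CC: ({cc_low:.1f}, {cc_high:.1f})")
```

Small differences from the reported limits come only from rounding the critical value.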
Regression Analysis: Selling Price in BDT versus CC, HP, Fuel (MPG), Wheel /Drive
Regression Equation:
Selling Price in BDT = -6269746 + 4113 CC + 480 HP - 4486 Fuel (MPG)
+ 883610 Wheel /Drive
Explanation:
b0 = -6269746 is the intercept: the predicted selling price when all four independent variables equal zero.
For a one-unit increase in CC, with the other variables held constant, the car selling price changes by 4113 units, so
the variables share a positive relationship.
For a one-unit increase in HP, with the other variables held constant, the car selling price changes by 480 units, so the
variables share a positive relationship.
For a one-unit increase in Fuel (MPG), with the other variables held constant, the car selling price changes by -4486 units,
so the variables share a negative relationship.
For a one-unit increase in Wheel/Drive, with the other variables held constant, the car selling price changes by
883610 units, so the variables share a positive relationship.
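The fitted equation can be applied directly to a car's specifications. The sketch below codes the reported coefficients as a function; the function name is illustrative, and the example car (1500 CC, 135 HP, 28 MPG) uses values from the data sheet with an assumed Wheel/Drive value of 2, since that column did not survive in the sheet.

```python
def predict_price(cc, hp, mpg, wheel_drive):
    """Predicted selling price in BDT from the reported regression equation."""
    return (-6269746
            + 4113 * cc
            + 480 * hp
            - 4486 * mpg
            + 883610 * wheel_drive)

# Example car: 1500 CC, 135 HP, 28 MPG; Wheel/Drive = 2 is an assumption.
predicted = predict_price(1500, 135, 28, 2)
print(predicted)

# Raising CC by one unit with everything else fixed changes the prediction
# by exactly the CC coefficient.
cc_effect = predict_price(1501, 135, 28, 2) - predicted
print(cc_effect)
```

The second calculation illustrates what "holding the other variables constant" means for a partial regression coefficient.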
Predictor    Coef      SE Coef  T-value  P-value
Constant     -6269746  4584915  -1.37    0.182
CC           4113      2641     1.56     0.131
HP           480       36687    0.01     0.990
Fuel (MPG)   -4486     86150    -0.05    0.959
Wheel/Drive  883610    1276072  0.69     0.494
Regression Table
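Each T-value in the regression table is the coefficient divided by its standard error. The check below is a small sketch that recomputes the column from the reported Coef and SE Coef figures; the dictionary layout is illustrative.

```python
# (Coef, SE Coef) pairs as reported in the regression table.
coefficients = {
    "Constant": (-6269746, 4584915),
    "CC": (4113, 2641),
    "HP": (480, 36687),
    "Fuel (MPG)": (-4486, 86150),
    "Wheel/Drive": (883610, 1276072),
}

# t statistic for testing whether a coefficient equals zero.
t_values = {term: coef / se for term, (coef, se) in coefficients.items()}

for term, t in t_values.items():
    print(f"{term}: t = {t:.2f}")
```

A t-value close to zero, as for HP and Fuel (MPG), corresponds to the large p-values shown in the table.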
Minitab Output:
Regression Equation
Selling Price in BDT = -6269746 + 4113 CC + 480 HP - 4486 Fuel (MPG) + 883610 Wheel /Drive
Analysis of Variance
Source          DF  Adj SS       Adj MS       F-Value   P-Value
Regression      4   7.94558E+14  1.98640E+14  6.81      0.001
  CC            1   7.06797E+13  7.06797E+13  2.42      0.131
  HP            1   4993542516   4993542516   0.00      0.990
  Fuel (MPG)    1   79043102586  79043102586  0.00      0.959
  Wheel /Drive  1   1.39762E+13  1.39762E+13  0.48      0.494
Error           28  8.16163E+14  2.91487E+13
  Lack-of-Fit   27  8.16162E+14  3.02282E+13  24182.58  0.005
  Pure Error    1   1250000000   1250000000
Total           32  1.61072E+15
Model Summary
S        R-sq    R-sq(adj)  R-sq(pred)
5398952  49.33%  42.09%     27.36%
Coefficients:
Term          Coef      SE Coef  T-Value  P-Value  VIF
Constant      -6269746  4584915  -1.37    0.182
CC            4113      2641     1.56     0.131    8.48
HP            480       36687    0.01     0.990    7.14
Fuel (MPG)    -4486     86150    -0.05    0.959    1.27
Wheel /Drive  883610    1276072  0.69     0.494    1.76
Selling Price in BDT  Fit       Resid     Std Resid
8200000               7361820   838180    0.23       X
40000000              15854387  24145613  4.89       R
R  Large residual
X  Unusual X
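The Model Summary figures follow from the sums of squares in the Analysis of Variance table. The sketch below recomputes S, R-sq, and R-sq(adj) from the reported values; the variable names are illustrative.

```python
import math

# Sums of squares as reported in the Analysis of Variance table.
ss_regression = 7.94558e14
ss_error = 8.16163e14
ss_total = 1.61072e15
n, k = 33, 4  # observations and number of predictors

r_sq = ss_regression / ss_total
# Adjusted R-sq compares error and total variance per degree of freedom.
r_sq_adj = 1 - (ss_error / (n - k - 1)) / (ss_total / (n - 1))
# S is the square root of the mean square error.
s = math.sqrt(ss_error / (n - k - 1))

print(f"R-sq = {r_sq:.2%}, R-sq(adj) = {r_sq_adj:.2%}, S = {s:.0f}")
```

R-sq(adj) is lower than R-sq because it penalizes the model for using four predictors with only 33 observations.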
[Figure: Fitted line plot of Selling Price in BDT versus CC. S = 5175848, R-sq = 48.4%, R-sq(adj) = 46.8%]
The graph shows that for a 1-unit increase in CC the selling price increases by 4693 units.
[Figure: Fitted line plot of Selling Price in BDT versus HP. S = 5533401, R-sq = 41.1%, R-sq(adj) = 39.2%]
The graph shows that for a 1-unit increase in HP the selling price increases by 65400 units.
[Figure: Fitted line plot of Selling Price in BDT versus Fuel (MPG). S = 6917845, R-sq = 7.9%, R-sq(adj) = 4.9%]
The graph shows that for a 1-unit increase in Fuel (MPG) the selling price changes by -159673
units.
[Figure: Fitted line plot of Selling Price in BDT versus Wheel /Drive. S = 6184330, R-sq = 26.4%, R-sq(adj) = 24.0%]
The graph shows that for a 1-unit increase in Wheel /Drive the selling price increases by
3672692 units.
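For a regression with a single predictor, R-sq is the square of the Pearson correlation, and the correlation takes the sign of the slope. The sketch below uses that identity to link the fitted line plots to the correlations quoted in the Findings (-0.281 for Fuel MPG and 0.514 for Wheel/Drive); the function name is illustrative.

```python
import math

def correlation_from_simple_fit(r_sq, slope):
    """Pearson r implied by a one-predictor fit: sqrt(R-sq) with the slope's sign."""
    return math.copysign(math.sqrt(r_sq), slope)

# (R-sq, slope) pairs as reported for the four fitted line plots.
fits = {
    "CC": (0.484, 4693),
    "HP": (0.411, 65400),
    "Fuel (MPG)": (0.079, -159673),
    "Wheel /Drive": (0.264, 3672692),
}

implied_r = {name: correlation_from_simple_fit(r_sq, slope)
             for name, (r_sq, slope) in fits.items()}

for name, r in implied_r.items():
    print(f"{name}: r = {r:.3f}")
```

The same calculation gives implied correlations of roughly 0.70 for CC and 0.64 for HP.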
8. One-way ANOVA: One-way analysis of variance (one-way ANOVA) is a technique used to
compare the means of two or more samples (using the F distribution). This technique can be used
only for numerical data.
The ANOVA tests the null hypothesis that samples in two or more groups are drawn from
populations with the same mean values. To do this, two estimates are made of the population
variance. These estimates rely on various assumptions. The ANOVA produces an F-statistic, the
ratio of the variance calculated among the means to the variance within the samples. If the group
means are drawn from populations with the same mean values, the variance between the group
means should be lower than the variance of the samples, following the central limit theorem. A
higher ratio therefore implies that the samples were drawn from populations with different mean
values.
One-way ANOVA: CC, HP, Fuel (MPG), Wheel /Drive
Method
Null hypothesis
Alternative hypothesis
Significance level  α = 0.05
Levels  4
Values  CC, HP, Fuel (MPG), Wheel /Drive
Analysis of Variance
Source  DF   Adj SS     Adj MS    F-Value  P-Value
Factor  3    129296737  43098912  155.00   0.000
Error   128  35591490   278059
Total   131  164888228
Model Summary
S        R-sq    R-sq(adj)  R-sq(pred)
527.313  78.41%  77.91%     77.04%
Means
Factor        N   Mean   StDev  95% CI
CC            33  2350   1052   (2168, 2532)
HP            33  176.8  69.5   (-4.9, 358.4)
Fuel (MPG)    33  29.06  12.48  (-152.57, 210.69)
Wheel /Drive  33  2.788  0.992  (-178.841, 184.417)
[Figure: Interval plot of CC, HP, Fuel (MPG), Wheel /Drive]
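The F-value in the one-way ANOVA table is the ratio of the factor mean square to the error mean square, each being a sum of squares divided by its degrees of freedom. The sketch below recomputes F, R-sq, and the pooled standard deviation S from the reported sums of squares; the variable names are illustrative.

```python
import math

# Sums of squares and degrees of freedom as reported in the ANOVA table.
ss_factor, df_factor = 129296737, 3
ss_error, df_error = 35591490, 128
ss_total = 164888228

ms_factor = ss_factor / df_factor   # variance among the group means
ms_error = ss_error / df_error      # variance within the groups
f_value = ms_factor / ms_error
anova_r_sq = ss_factor / ss_total
pooled_s = math.sqrt(ms_error)

print(f"F = {f_value:.2f}, R-sq = {anova_r_sq:.2%}, S = {pooled_s:.3f}")
```

The pooled S is also what drives the width of the confidence intervals in the Means table, which is why the intervals for the small-valued variables are so wide.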
9. Hypothesis testing: Hypothesis testing or significance testing is a method for testing a claim
or hypothesis about a parameter in a population, using data measured in a sample. In this
method, we test some hypothesis by determining the likelihood that a sample statistic could have
been selected, if the hypothesis regarding the population parameter were true.
9.1 Hypothesis test for Mean
1. Car selling price
Sample mean ($\bar{x}$) = 5813636, Standard Deviation (S) = 7094719, n = 33
Ho: $\mu = 5800000$
HA: $\mu \neq 5800000$
Test Statistic:
$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
With $\alpha = .05$
And p value 0.991, which is greater than .05
Hence the Null Hypothesis Ho is not rejected.
The data are consistent with a population mean car selling price of BDT 5800000.
2. CC
Sample mean ($\bar{x}$) = 2350, Standard Deviation (S) = 1052, n = 33
Ho: $\mu = 2300$
HA: $\mu \neq 2300$
Test Statistic:
$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
With $\alpha = .05$
And p value 0.785, which is greater than .05
Hence the Null Hypothesis Ho is not rejected.
Therefore the data are consistent with a population mean CC of 2300.
3. HP
Ho: $\mu = 176$
HA: $\mu \neq 176$
Test Statistic:
$z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
With $\alpha = .05$
And p value 0.950, which is greater than .05
Hence the Null Hypothesis Ho is not rejected.
Therefore the data are consistent with a population mean HP of 176.
4. Fuel (MPG)
5. Wheel drive
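The z tests for the mean can be reproduced with Python's standard library. The sketch below assumes the sample means and standard deviations from the descriptive statistics (5813636 and 7094719 for selling price, 2350 and 1052 for CC, 176.76 and 69.52 for HP) and a two-sided alternative; the function name is illustrative.

```python
import math
from statistics import NormalDist

def z_test_mean(sample_mean, mu0, stdev, n):
    """Two-sided z test of H0: mu = mu0; returns the z statistic and p-value."""
    z = (sample_mean - mu0) / (stdev / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z_price, p_price = z_test_mean(5813636, 5800000, 7094719, 33)
z_cc, p_cc = z_test_mean(2350, 2300, 1052, 33)
z_hp, p_hp = z_test_mean(176.76, 176, 69.52, 33)

for name, p in [("Selling price", p_price), ("CC", p_cc), ("HP", p_hp)]:
    decision = "reject Ho" if p < 0.05 else "do not reject Ho"
    print(f"{name}: p = {p:.3f}, {decision}")
```

All three p-values are far above .05 because each hypothesized mean was chosen close to the corresponding sample mean.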
9.2
Ha = Fuel (MPG) is a valuable predictor in the presence of the other variables while
predicting cars selling price.
Test Statistic: here, p value = 0.494, n = 33, $\alpha = 0.05$
P value = .494 is larger than .05
Hence do not reject the Null Hypothesis Ho
So, we conclude that Wheel drive is not a valuable predictor in the presence of the other
variables while predicting car selling price.
9.4
We use the F test to determine whether the regression model is useful.
Regression Analysis: Selling Price in BDT versus CC, HP, Fuel (MPG), Wheel /Drive
Ho: regression model is not useful in predicting the car selling price
HA: regression model is useful in predicting the car selling price
Ho: $\beta_1 = \beta_2 = \beta_3 = \beta_4 = 0$
HA: at least one $\beta_j \neq 0$
Test statistic: $F = MSR / MSE$
Analysis of Variance
Source          DF  Adj SS       Adj MS       F-Value   P-Value
Regression      4   7.94558E+14  1.98640E+14  6.81      0.001
  CC            1   7.06797E+13  7.06797E+13  2.42      0.131
  HP            1   4993542516   4993542516   0.00      0.990
  Fuel (MPG)    1   79043102586  79043102586  0.00      0.959
  Wheel /Drive  1   1.39762E+13  1.39762E+13  0.48      0.494
Error           28  8.16163E+14  2.91487E+13
  Lack-of-Fit   27  8.16162E+14  3.02282E+13  24182.58  0.005
  Pure Error    1   1250000000   1250000000
Total           32  1.61072E+15
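The overall F statistic can be recomputed from the regression and error sums of squares. In the sketch below, the critical value of about 2.71 for F with 4 and 28 degrees of freedom at the 5% level is an assumption taken from an F table, not a figure from the Minitab output.

```python
# Sums of squares as reported in the Analysis of Variance table.
ss_regression = 7.94558e14
ss_error = 8.16163e14
n, k = 33, 4  # observations and number of predictors

msr = ss_regression / k            # mean square due to regression
mse = ss_error / (n - k - 1)       # mean square error
f_statistic = msr / mse

# Assumed 5% critical value of F(4, 28), taken from an F table.
F_CRIT = 2.71
model_is_useful = f_statistic > F_CRIT

print(f"F = {f_statistic:.2f}, reject Ho: {model_is_useful}")
```

Rejecting Ho here says only that at least one coefficient differs from zero; it does not say which one.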
10. Findings
In this report we tried to find out the relationship between the car selling price and 4
independent variables, and their impact on it. We had 4 hypotheses for this report, which are given below:
H1:
relationship
H2:
relationship
H3: Fuel Miles per gallon (MPG) has an impact on car selling price.
Pearson correlation of Selling Price in BDT and Fuel (MPG) = -0.281, so it is a weak
negative relationship.
H4:
Pearson correlation of Selling Price in BDT and Wheel /Drive = 0.514, so it is a moderate
positive relationship.
The coefficient of determination (R2) and its adjusted value were found to be 49.33% and
42.09% respectively. That means 49.33% of the variation in Selling Price is explained by CC, HP,
Fuel (MPG) and Wheel/Drive.
From the Hypothesis Test for correlation coefficient we can conclude that among the 4
independent variables, fuel miles (MPG) has an inverse relation with the selling price and
the other 3, CC, HP and Wheel drive, have a positive relationship with the car selling price.
From the Hypothesis Test for partial regression coefficient we can conclude that none of the
independent variables is individually a valuable predictor in the presence of the other variables
while predicting car selling price. That means the selling price of a car cannot be explained
by just one independent variable, as the other variables play a
great role as well.
After testing the usefulness of the regression model, we can say that this regression
model is useful in predicting the car selling price.
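The hypothesis test for a correlation coefficient uses the statistic t = r sqrt(n - 2) / sqrt(1 - r^2) with n - 2 degrees of freedom. The sketch below applies it to the two correlations quoted above with n = 33; the function name is illustrative, and the resulting t values would still have to be compared with a critical value from a t table for 31 degrees of freedom.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing H0: population correlation = 0."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Pearson correlations with Selling Price in BDT quoted in the Findings.
t_fuel = correlation_t_statistic(-0.281, 33)
t_wheel = correlation_t_statistic(0.514, 33)

print(f"Fuel (MPG): t = {t_fuel:.2f}")
print(f"Wheel /Drive: t = {t_wheel:.2f}")
```

The sign of t matches the sign of the correlation, and a larger absolute value means stronger evidence of a linear relationship.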
11. Conclusion:
There are other variables, such as the brand image, the type of tires used in the car, the type of
interior decoration, the type of engine used, etc. All these and other factors play a major role
in determining the selling price. Due to time and data constraints we had to work
with the available factors, and this is reflected in the value of R2 in the report. The report could
have been more realistic if the other variables had been included.
13. References:
Navana 3s centre
Car retailers:
Car selection
KK automobiles
http://www.toyota.com
http://www.nissan-global.com/EN/index.html
http://worldwide.hyundai.com/WW/Main/index.html
http://www.simetric.co.uk/si_cc2hp.htm