Part 0 -- Introduction

Statistics and Data Analysis
Professor William Greene
Stern School of Business
IOMS Department
Department of Economics


Professor William Greene; Economics and IOMS Departments

Office: KMEC, 7-90 (Economics Department)

Office phone: 212-998-0876

Email: wgreene@stern.nyu.edu

URL: http://www.stern.nyu.edu/~wgreene

http://www.stern.nyu.edu/~wgreene/Statistics/Outline.htm

[Figure (background graphics): Scatterplot of Listing vs IncomePC; Marginal Plot of Listing vs IncomePC; Histogram of Listing (Frequency); Boxplot of Listing; Empirical CDF of Listing (Percent); Probability Plot of Listing (Normal - 95% CI; Mean 369687, StDev 156865, N 51, AD 0.994, P-Value 0.012); Pie Chart of Percent vs Type (categories: Pepperoni, Plain, Mushroom, Sausage, Pepper and Onion, Mushroom and Onion, Garlic, Meatball).]

Course Objectives


Understand random outcomes and random information
Understand statistical information as the measured outcomes of random processes
Learn how to analyze statistical information
  Statistical analysis
  Model building
Learn how to present statistical information


What Does it Mean?


Slightly more than one-third of Americans have a favorable opinion of
the Democratic-led Congress, a poll said Wednesday.
The Pew Research Center for the People & the Press said the 37%
expressing a positive opinion represents a decline of 13 points since
April.

The favorable percentage is one of the lowest in more than two decades
of Pew surveys if not the lowest, the poll said. The previous low was
40% in January, but the result is not statistically significant because of
the margin of error.
(USA Today, 9/3/09, page 4)
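
The closing remark about the margin of error can be made concrete with a short sketch. It assumes simple random samples and a hypothetical sample size of 1,500 per survey; the excerpt does not report the actual sample sizes, so the numbers are illustrative only. The function names are likewise made up for this example.

```python
import math

Z_95 = 1.96  # standard normal critical value for 95% confidence


def margin_of_error(p, n, z=Z_95):
    """Approximate 95% margin of error for a sample proportion p from n respondents."""
    return z * math.sqrt(p * (1 - p) / n)


def difference_margin(p1, n1, p2, n2, z=Z_95):
    """Approximate 95% margin of error for the difference of two independent proportions."""
    return z * math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


# Hypothetical sample size; the excerpt does not state it.
N_ASSUMED = 1500

moe_single = margin_of_error(0.37, N_ASSUMED)
moe_diff = difference_margin(0.37, N_ASSUMED, 0.40, N_ASSUMED)
gap = 0.40 - 0.37

print(f"Margin of error for 37%: +/- {moe_single:.3f}")
print(f"Margin of error for the 3-point gap: +/- {moe_diff:.3f}")
print("Gap exceeds its margin of error:", gap > moe_diff)
```

Under these assumed sample sizes the 3-point gap is smaller than its own margin of error, which is the sense in which the excerpt calls the difference not statistically significant.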


Really?
To Get Rid of Hiccups, Have Someone Startle You.
The truth is: Most home remedies, like holding your breath or
drinking from a glass of water backward, haven't been medically
proven to be effective, says Pollack. However, you can try this trick
dating back to 1971, when it was published in The New England
Journal of Medicine: Swallow one teaspoon of white granulated
sugar. According to the study, this tactic resulted in the cessation of
hiccups in 19 out of 20 afflicted patients.
Posted August 31, 2010, cnn.com
http://www.cnn.com/2010/HEALTH/08/31/rs.12.health.myths/index.html?iref=allsearch


Heard on the Street?


Dear Professor Greene,
The WSN is trying to poll people on the Park51 Mosque
debate. I saw that you were an statistics/data analysis
professor and I was wondering if you could explain how we
should go about conducting this poll. For example,
approximatley [sic] how many people would we need to poll
for the data to be completley [sic] unbaised?

Email received September 5, 2010
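
The question in the email mixes two ideas: sample size governs precision (the margin of error), whereas bias depends on how respondents are selected. As a rough sketch of the precision side only, assuming a simple random sample and the usual normal approximation, the sample size needed for a target margin of error might be computed as follows. The function name and the target margins are illustrative assumptions, not part of the original slide.

```python
import math


def required_sample_size(margin, p=0.5, z=1.96):
    """Smallest n giving the requested margin of error for a proportion.

    Uses n = z^2 * p * (1 - p) / margin^2; p = 0.5 is the most conservative choice.
    """
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


for target in (0.05, 0.03):
    print(f"+/- {target:.0%}: about {required_sample_size(target)} respondents")
```

No sample size, however large, removes bias from a poorly chosen sample.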


Technical Help Wanted


Our firm is looking for a [Ph.D.-level] statistician to assist us in analyzing
a simple database of compensation levels. Our database includes 93 unique records for
different institutions. We expect to analyze two dependent variables against 13
independent variables.
We need to perform multivariate regression analysis to determine which of the variables
are statistically significant. We also need to calculate the t-statistics for each of the
independent variables and adjusted r-squared values for the multivariate regression
model developed. We expect that some of the variables may need to be transformed
prior to creating the regression analysis. Additional statistical approaches and
techniques may be required as appropriate.
Subsequent to the analysis of each of the variables, we will require a brief write-up
detailing any relationships (or lack thereof) uncovered through the analysis. We
anticipate that this write-up will be approximately 2-3 pages in length, excluding any
supporting appendices. This write up should describe, in plain English, all relevant
details regarding the analysis.
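
For a sense of scale, the request involves 93 observations and 13 independent variables, which leaves 79 residual degrees of freedom once an intercept is included. The sketch below applies the standard adjusted R-squared formula; the R-squared value of 0.50 is a made-up input for illustration, and the function name is hypothetical.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1), for a model with an intercept."""
    dof_residual = n_obs - n_predictors - 1
    if dof_residual <= 0:
        raise ValueError("need more observations than estimated coefficients")
    return 1 - (1 - r_squared) * (n_obs - 1) / dof_residual


N_OBS = 93         # records, from the posting
N_PREDICTORS = 13  # independent variables, from the posting
R2_ASSUMED = 0.50  # hypothetical value, not from the posting

adjusted = adjusted_r_squared(R2_ASSUMED, N_OBS, N_PREDICTORS)
print("Residual degrees of freedom:", N_OBS - N_PREDICTORS - 1)
print(f"Adjusted R-squared: {adjusted:.3f}")
```

With this many predictors relative to the number of records, the adjustment is noticeable: the adjusted value falls well below the assumed raw R-squared.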


Course Prerequisites


Basic algebra. (Especially summation)
Geometry (straight lines)
Logs and exponents
  NOTE: I (you) will use only base e (natural) logs, not base 10 (common) logs in this course.
A smattering of simple calculus. (I may use two or three derivatives during the entire semester.)
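
As a quick refresher on two of these prerequisites, the following sketch shows a summation and the relationship between natural and common logs. Python is used purely for illustration (the course software is Minitab), and the three values are arbitrary.

```python
import math

values = [2.0, 4.0, 8.0]  # arbitrary illustrative numbers

# Summation: the sum of x_i over all observations, and the mean built from it.
total = sum(values)
mean = total / len(values)

# Natural (base e) log and its inverse, the exponential function.
x = 100.0
ln_x = math.log(x)  # math.log with one argument is the natural log
assert math.isclose(math.exp(ln_x), x)

# A common (base 10) log is a rescaled natural log: log10(x) = ln(x) / ln(10).
log10_x = ln_x / math.log(10)

print(f"sum = {total}, mean = {mean:.4f}")
print(f"ln({x}) = {ln_x:.4f}, log10({x}) = {log10_x:.4f}")
```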


Course Materials
Notes: Distributed in first class
Text: Hildebrand, Ott and Gray. Basic Statistical Ideas for Managers, 2nd ed. (Recommended, not required)
On the course website:
  Miscellaneous notes and materials
  Class slide presentations
  Problem sets

http://www.stern.nyu.edu/~wgreene/Statistics/Outline.htm


Course Software: Minitab


The Current Version: Minitab 16

Buy: Professional Bookstore
Rent: www.e-academy.com e-Store


Course Outline and Overview


1. Presenting Data
  Data
    Types
    Information content
  Data Description
    Graphical devices: Plots, histograms
    Statistical: Summary statistics


Data: House Price Listings and Income
  How to describe/summarize them.
  How to determine if there is any connection between the two variables.
  How to explain the variation across states.


Course Outline and Overview


2. Explaining How Random Data Arise


Probability: Understanding unpredictable outcomes
  Precise mathematical principles of random outcomes that occur naturally, e.g., gambling and games of chance
  Models = descriptions of random outcomes that occur in nature but don't have fixed mathematical laws
The Normal distribution
  THE fundamental model for outcomes involving behavior
  Model building for random outcomes using the normal distribution

Course Outline and Overview


3. Modeling Relationships Between Outcomes
  What is correlation?
  Simple linear regression: Connecting one variable with another
  Multiple regression
    Model building
    Understanding covariation of more than one variable.

[Figure: Scatterplot of Listing vs IncomePC (axes: Listing_1 against IncomePC_1), with one point annotated "Hawaii. Outlier?"]

Correlation = 0.428. Is this large?
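
One conventional way to judge whether a correlation is "large" is to ask whether it is distinguishable from zero given the sample size. The sketch below computes the usual t statistic, t = r * sqrt(n - 2) / sqrt(1 - r^2). It assumes n = 51 observations, taken from the N reported in the summary statistics for Listing, and uses an approximate two-sided 5% critical value of 2.01 for 49 degrees of freedom; the function name is hypothetical.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing whether a population correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


R_OBSERVED = 0.428   # correlation reported on the slide
N_ASSUMED = 51       # assumption: N from the Listing summary statistics
T_CRITICAL = 2.01    # approximate two-sided 5% critical value, 49 degrees of freedom

t_stat = correlation_t_statistic(R_OBSERVED, N_ASSUMED)
print(f"t = {t_stat:.2f}")
print("Distinguishable from zero at the 5% level:", abs(t_stat) > T_CRITICAL)
```

Whether a correlation that is statistically nonzero is also practically meaningful, and how much a single observation such as Hawaii drives it, are separate questions.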



Course Outline and Overview - 4


Statistical inference
  Hypothesis testing: (Is the correlation large? Could it actually be zero?)
  Hypothesis tests for specific applications
    Mean of a population: Is it a specific value?
    Pair of means: Are they equal?
    Applications in regression: Are the variables in the model really related?
    An application in marketing: Did the sales promotion work? How would you find out?
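
As a rough illustration of the marketing question, the sketch below compares mean weekly sales before and after a promotion using Welch's two-sample t statistic. The sales figures are invented purely for illustration, the function name is hypothetical, and a real analysis would also need to consider how the data were collected.

```python
import math
import statistics


def welch_t_statistic(sample_a, sample_b):
    """Welch's t statistic for the difference in means of two independent samples."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    standard_error = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_b - mean_a) / standard_error


# Hypothetical weekly sales (units), not data from the course.
before_promotion = [100, 104, 98, 102, 96]
after_promotion = [110, 108, 112, 106, 114]

t_stat = welch_t_statistic(before_promotion, after_promotion)
print(f"t = {t_stat:.2f}")
```

A large positive t suggests higher sales after the promotion in this made-up example; it does not by itself establish that the promotion caused the change.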