STATISTICS

Part I (Chapters 1-11)
MBA 611 STATISTICS AND QUANTITATIVE METHODS

Part I. Review of Basic Statistics (Chapters 1-11)

A. Introduction (Chapter 1)

Uncertainty: Decisions are often based on incomplete information from uncertain events. We use statistical methods and statistical analysis to make decisions in an uncertain environment.

Population: A population is the complete set of all items in which an investigator is interested.

Sample: A sample is a subset of population values.

Example:
Population
- High school students
- Households in the U.S.
Sample
- A sample of 30 students
- A Gallup poll of 1,000 consumers
- Nielsen survey of TV ratings

Random Sample: A random sample of n data values is one selected from the population in such a way that every different sample of size n has an equal chance of selection.

Example: Random Selection
- Lotto numbers
- Random numbers

Random Variable: A variable that takes different possible values for a given subject of study.

Numerical Variable: A numerical variable takes numerical values, either countably many (finite) values or infinitely many values.

Categorical Variable: A categorical variable takes values that belong to groups or categories.

Data: Data are measured values of the variable. There are two types of data: quantitative data and qualitative data.

Quantitative Data: Quantitative data are data measured on a numerical scale.

Qualitative Data: Qualitative data are non-numerical data that can only be classified into one of a group of categories.

Example:
1. Temperature
2. Height
3. Age in years
4. Income
5. Prices
6. Occupations
7. Race
8. Sales and Advertising
9. Consumption and Income

Statistics: Statistics is the science of data. This involves collecting, classifying, summarizing, and analyzing data, and then making inferences and decisions based on the data collected.

Population Parameters: The numerical measures of a population are called parameters.
Example: Population average.

Sample Statistics: The numerical measures of a sample are called sample statistics.
Example: Sample average.

Descriptive Statistics: Descriptive statistics involves collecting, classifying, and summarizing data.

Inferential Statistics: Inferential statistics makes statistical inferences about the population parameters based on sample information.

Business Decisions: From time to time, we use quantitative analysis to make business decisions.
Example:
Economics: Price of a Good, Interest Rate, Mortgage Rate
Finance: Returns, Stock Prices
Marketing: Advertising, Sales
Management: Quality Control
B. Descriptive Statistics (Chapters 2 and 3)

B.1 Describing Data Sets Graphically (Chapter 2)

The simplest way to describe data is to use graphs. The following shows two types of graphs: frequency histogram and line graph.

B.1.1 Relative Frequency Histogram

The relative frequency histogram shows the proportions of the total set of data values that fall in various numerical intervals.

Example: Sale Prices
The following data represent sale prices (in thousands of dollars) for a random sample of 25 residential properties sold.

 66    59   129    95    77
 89    36   106    74    72
 71    68   148    50    82
109    57   101    94    63
 42    84    76    65   112

Sort the data.

 36    63    72    84   106
 42    65    74    89   109
 50    66    76    94   112
 57    68    77    95   129
 59    71    82   101   148

Organize the data and construct the following relative frequency distribution table.

Class i   Class Limits   Freq. (f_i)   Relative Frequency
1         (30, 49)        2            2/25 = 0.08
2         (50, 69)        7            7/25 = 0.28
3         (70, 89)        8            8/25 = 0.32
4         (90, 109)       5            5/25 = 0.20
5         (110, 129)      2            2/25 = 0.08
6         (130, 149)      1            1/25 = 0.04
Sum                      25            1
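The tabulation above can also be checked with a few lines of code. The following is a minimal Python sketch, not part of the original notes; the function name `relative_frequency_table` is only an illustrative choice, and the class limits are copied from the table.

```python
# Sale prices (in thousands of dollars) from the example above.
sale_prices = [
    66, 89, 71, 109, 42, 59, 129, 95, 77, 36, 106, 74, 72,
    68, 148, 50, 82, 57, 101, 94, 63, 84, 76, 65, 112,
]

# Class limits copied from the relative frequency distribution table.
class_limits = [(30, 49), (50, 69), (70, 89), (90, 109), (110, 129), (130, 149)]


def relative_frequency_table(values, limits):
    """Return one (lower, upper, frequency, relative frequency) row per class."""
    n = len(values)
    rows = []
    for lower, upper in limits:
        # Count the values that fall inside the class, limits included.
        freq = sum(1 for v in values if lower <= v <= upper)
        rows.append((lower, upper, freq, freq / n))
    return rows


table = relative_frequency_table(sale_prices, class_limits)
for lower, upper, freq, rel in table:
    print(f"({lower}, {upper})  f = {freq:2d}  relative frequency = {rel:.2f}")
```

The frequencies add up to the sample size, so the relative frequencies add up to one, as in the Sum row of the table.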
The relative frequency histogram is

[Figure: relative frequency histogram; vertical axis: Relative Frequency, horizontal axis: Sale Price]

In this graph,
1. The data are classified into 6 classes.
2. Each class has the same width. The width is equal to 20.
3. The graph shows the midpoints of these classes on the horizontal axis.
4. The vertical bar shows the relative frequency of sale prices falling in each class interval.
How to decide the class width:

$$\text{Width} = \frac{\text{the largest number} - \text{the smallest number}}{\text{the number of classes}}$$

Example: Sale Prices

$$\text{Width} = \frac{148 - 36}{6} = 18.67 \approx 20$$
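As a quick check of the width rule, here is a small Python sketch (the helper name `raw_class_width` is hypothetical, not from the notes). It only computes the raw quotient; rounding up to a convenient value such as 20 remains a judgment call.

```python
def raw_class_width(smallest, largest, n_classes):
    """Range of the data divided by the number of classes."""
    return (largest - smallest) / n_classes


# Smallest and largest sale prices from the example, with 6 classes.
width = raw_class_width(36, 148, 6)
print(round(width, 2))  # 18.67, which the notes round up to 20
```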
Exercise: The following data are year-to-date (YTD) returns for a sample of 30 mutual funds.

 0.2    1.4    1     -4.2    3.8    2.5    0.6    0.9   -1.1   -0.1
 0.6    5.1    0.9    0.9   -1      0.8    0.5   -4.3    5.5    2.7
 3.4    1.1   -0.7   -1.1    1      0.6    0.5    3     -0.5    9.6

Then

$$\text{Width} = \frac{9.6 - (-4.3)}{6} = 2.32 \approx 2.5$$

Sort the data as the following:

-4.3    0.5    1.1
-4.2    0.6    1.4
-1.1    0.6    2.5
-1.1    0.6    2.7
-1      0.8    3
-0.7    0.9    3.4
-0.5    0.9    3.8
-0.1    0.9    5.1
 0.2    1      5.5
 0.5    1      9.6

Organize the data and construct the following relative frequency distribution table.

Class i   Class Limits       Freq. (f_i)   Relative Frequency
1         (-5.00, -2.51)
2         (-2.50, -0.01)
3
4
5
6
Sum

Draw a relative frequency histogram.
Textbook Exercises: 2.5, 2.6, 2.8, 2.9, pages 22-23.

Excel: Create a Histogram
1. Click on Tools.
2. Click on Data Analysis. (If Data Analysis is not on the list, click "Tools" and "Add-Ins". Check "Analysis ToolPak" to install the add-in from the Microsoft Office CD.)
3. Select Histogram; click OK.
4. Complete the dialog box: Input range contains the data; Bin range contains the upper boundary of each interval; click OK.
5. Delete (using Edit) the last row, called More.

B.1.2 Line Graph (Time Plot)

A line graph is a graphic representation of a time series. Time series are data collected at different time periods.

Example: The following data are daily high temperatures from Monday through Friday:

70   74   72   78   75

The line graph for the temperatures is

[Figure: line graph of Temperature (vertical axis) against Date (horizontal axis), Monday through Friday]

Example: The line graph of IBM stock price is

[Figure: line graph of IBM Stock Price; vertical axis: Price, horizontal axis: Date (ticks 80 through 02)]
Textbook Exercises: 2.22-2.29, pages 22-23; 2.35-2.39, pages 30-31.


Excel: Create a Line Graph
1. Click on Insert.
2. Click on Chart.
3. In Chart Wizard Step 1: Select Line and the top-left line chart; click Next.
4. In Chart Wizard Step 2: Click the Series tab. Values contains the data; Category (X) axis labels contains the values of date or time. Click Next and complete the remaining steps.

B.2 Measures of Central Tendency (Section 3.1)

To describe data sets numerically, we use mean, median, range, and standard deviation.

B.2.1 Mean (Average)

The mean of a collection of n data values is the sum of the data values divided by n.

Example: Calculate the mean of the following daily high temperatures:

70   74   72   78   75

The mean is

$$\frac{70 + 74 + 72 + 78 + 75}{5} = 73.8$$
Notation: Sum and Mean
Suppose there is a collection of n data values. These values are represented by $x_1, x_2, \ldots, x_n$. The sum of these values is denoted as

$$\sum_{i=1}^{n} x_i$$

The mean is equal to

$$\frac{\sum_{i=1}^{n} x_i}{n}$$

Sample Mean, $\bar{X}$
The mean of a sample of n data values $x_1, x_2, \ldots, x_n$ is denoted as $\bar{X}$, and

$$\bar{X} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Exercise: Prices of product A
Suppose the prices of product A in the past five months are

6   4   2   3   5

Calculate the mean.
Answer:

Population Mean, $\mu$
The mean of a population is denoted as $\mu$. If the data values of the population are represented by $x_1, x_2, \ldots, x_N$, then the population mean is defined as

$$\mu = \frac{\sum_{i=1}^{N} x_i}{N}$$
B.2.2 Median

The median of a collection of data values is the data value in the middle position for sorted data.

Example: Calculate the median of the following daily high temperatures:

70   74   72   78   75

The sorted data are 70 72 74 75 78.

The median is 74.
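The mean and median of the temperature example can be reproduced with Python's standard `statistics` module. This is only an illustrative sketch; the variable names are assumptions, not part of the notes.

```python
import statistics

# Daily high temperatures, Monday through Friday.
temperatures = [70, 74, 72, 78, 75]

# Mean: sum of the data values divided by n.
mean_temp = sum(temperatures) / len(temperatures)

# Median: middle value of the sorted data (n is odd here).
sorted_temps = sorted(temperatures)
median_temp = sorted_temps[len(sorted_temps) // 2]

print(mean_temp, median_temp)
# The library functions give the same answers.
print(statistics.mean(temperatures), statistics.median(temperatures))
```

For an even number of values there is no single middle position; `statistics.median` then returns the average of the two middle values.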
Textbook Exercises: 3.1-3.11, pages 50-51.

B.3 Measures of Variability (Section 3.2)
B.3.1 Range

The range of a collection of data values is the difference between the largest and the smallest values.

Example: Calculate the range of the following daily high temperatures:

70   74   72   78   75

The range is 78 - 70 = 8.

Example: Sale Prices for Residential Properties
Calculate the range.
The range is 148 - 36 = 112.

Exercise: YTD Returns
Calculate the range.
The range is
B.3.2 Variance and Standard Deviation

The variance is used to measure the variation of the data values from their mean. The variance of a collection of data values is defined to be the average of the squares of the deviations of the data values about their mean.

Sample Variance, $s^2$
The variance of a sample of n data values $x_1, x_2, \ldots, x_n$ is defined as

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n - 1}$$

Example: Prices of product A
Suppose the prices of product A in the past five months are

6   4   2   3   5

Calculate the mean and the variance.

i     $x_i$   deviation $x_i - \bar{X}$   (deviation)^2 $(x_i - \bar{X})^2$
1     6       6 - 4 = 2                   4
2     4       4 - 4 = 0                   0
3     2       2 - 4 = -2                  4
4     3       3 - 4 = -1                  1
5     5       5 - 4 = 1                   1
Sum   20      0                           10

The sample mean is $\bar{X} = \frac{20}{5} = 4$.
The sample variance is $s^2 = \frac{10}{5 - 1} = 2.5$.
Alternative Formula: A shortcut formula to compute $s^2$ is

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{X}^2}{n - 1}$$

Example: Prices of product A
Suppose the prices of product A in the past five months are

6   4   2   3   5

Use the shortcut formula to compute the sample variance.

i     $x_i$   $x_i^2$
1     6       36
2     4       16
3     2       4
4     3       9
5     5       25
Sum   20      90

The sample mean is $\bar{X} = \frac{20}{5} = 4$.
The sample variance is

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{X}^2}{n - 1} = \frac{90 - 5 \times 4^2}{4} = 2.5$$
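The definition and the shortcut formula should always agree. A short Python sketch for the product A prices (the variable names are illustrative assumptions) confirms this, and compares both with `statistics.variance`, which also divides by n - 1:

```python
import statistics

# Prices of product A in the past five months.
prices_a = [6, 4, 2, 3, 5]
n = len(prices_a)
x_bar = sum(prices_a) / n

# Definition: sum of squared deviations about the mean, divided by n - 1.
var_definition = sum((x - x_bar) ** 2 for x in prices_a) / (n - 1)

# Shortcut: (sum of squares - n * mean^2) divided by n - 1.
var_shortcut = (sum(x ** 2 for x in prices_a) - n * x_bar ** 2) / (n - 1)

# Library check: statistics.variance is the sample variance.
var_library = statistics.variance(prices_a)

print(x_bar, var_definition, var_shortcut, var_library)
```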
Exercise: Prices of product B
Suppose the prices for product B in the past five months are

3   7   5   4   1

Calculate the sample mean and the sample variance.

i     $x_i$   $x_i^2$
1     3
2     7
3     5
4     4
5     1
Sum   20

The sample mean is
The sample variance is

Standard Deviation
The standard deviation of a collection of data values is equal to the square root of their variance.

Sample Standard Deviation, $s$

$$s = \sqrt{s^2}$$

Example: Prices of Product A
The sample standard deviation is $s = \sqrt{2.5} = 1.58$.

Exercise: Prices of Product B
Calculate the sample standard deviation.
The sample standard deviation is

Population Variance $\sigma^2$ and Population Standard Deviation $\sigma$
The population variance is denoted as $\sigma^2$. For a population with the data values $x_1, x_2, \ldots, x_N$ and the mean $\mu$, the population variance is defined as

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

The population standard deviation is

$$\sigma = \sqrt{\sigma^2}$$

Note: Sample mean $\bar{X}$, variance $s^2$, and standard deviation $s$ are sample statistics. Population mean $\mu$, variance $\sigma^2$, and standard deviation $\sigma$ are population parameters.
Textbook Exercises: 3.12-3.14, 3.20-3.25, pages 59-60.

B.4 Skewness and Kurtosis

We use skewness and kurtosis to show the shape of a distribution.

B.4.1 Skewness

The skewness measures the amount of asymmetry in a distribution or in a relative frequency histogram. If a distribution is symmetric, skewness equals zero; the larger the absolute size of the skewness statistic, the more asymmetric the distribution is. The measure of sample skewness is defined as

$$\text{skewness} = \frac{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{X})^3}{s^3}$$

A large positive skewness indicates a long right tail. A large negative skewness indicates a long left tail.

Example: Sale Prices
The data set has a positive skewness. Hence, the distribution has a long right tail.

Column1
Mean                  81
Standard Error        5.293707
Median                76
Mode                  #N/A
Standard Deviation    26.46853
Sample Variance       700.5833
Kurtosis              0.488226
Skewness              0.658795
Range                 112
Minimum               36
Maximum               148
Sum                   2025
Count                 25

B.4.2 Kurtosis

The kurtosis is a measure of the thickness of the tails of a distribution (or relative frequency histogram) relative to those of a normal distribution. A normal distribution has a kurtosis of three. A kurtosis above three indicates "fat tails." The measure of sample kurtosis is defined as

$$\text{kurtosis} = \frac{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{X})^4}{s^4}$$
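The two shape measures can be computed directly from their definitions. The sketch below is a minimal Python illustration for the sale prices; the function names are assumptions made for this example. Note that Excel's Descriptive Statistics output uses small-sample adjusted formulas (and reports kurtosis relative to the normal value of three), so its Skewness and Kurtosis entries need not match the values from these definitions.

```python
import math

# Sale prices (in thousands of dollars) from the earlier example.
sale_prices = [
    66, 89, 71, 109, 42, 59, 129, 95, 77, 36, 106, 74, 72,
    68, 148, 50, 82, 57, 101, 94, 63, 84, 76, 65, 112,
]


def sample_std(values):
    """Sample standard deviation s, dividing by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))


def skewness(values):
    """Average cubed deviation about the mean, divided by s^3."""
    n = len(values)
    mean = sum(values) / n
    return (sum((x - mean) ** 3 for x in values) / n) / sample_std(values) ** 3


def kurtosis(values):
    """Average fourth-power deviation about the mean, divided by s^4."""
    n = len(values)
    mean = sum(values) / n
    return (sum((x - mean) ** 4 for x in values) / n) / sample_std(values) ** 4


skew = skewness(sale_prices)
kurt = kurtosis(sale_prices)
print(f"skewness = {skew:.3f}, kurtosis = {kurt:.3f}")
```

Under these definitions the skewness comes out positive (about 0.58), consistent with the long right tail noted in the example, and the kurtosis is slightly below three (about 2.92).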

Exercise: YTD Returns

Column1
Mean                  1.12
Standard Error        0.490704
Median                0.85
Mode                  0.6
Standard Deviation    2.687699
Sample Variance       7.223724
Kurtosis              2.87748
Skewness              0.849036
Range                 13.9
Minimum               -4.3
Maximum               9.6
Sum                   33.6
Count                 30
Textbook Exercises: 3.4, 3.6, 3.9-3.11, pages 50, 51.
Excel: Descriptive Statistics (See Appendix)
1. Click on Tools.
2. Click on Data Analysis. (If Data Analysis is not on the list, click "Tools" and "Add-Ins". Check "Analysis ToolPak" to install the add-in from the Microsoft Office CD.)
3. Select Descriptive Statistics; click OK.
4. Complete the dialog box: Input range contains the data; Output range contains the starting cell for the descriptive statistics; select Summary statistics. Click OK.

C. Random Variables and Normal Distribution (Sections 5.1-5.3, 6.1, 6.3)

C.1 Random Variables (Section 5.1)

Random Experiment: A random experiment is a process leading to two or more possible outcomes with uncertainty as to which outcome will occur.

Random Variable: A random variable is a variable that takes on numerical values determined by the outcome of a random experiment.

Usually, there are two usages of random variables.

Random Variables for a Population: We can use a random variable to represent different possible data values for a population. This random variable has a probability distribution.
Example: The sale price can be represented by a variable $X$. Then different data values of sale price can also be represented by $(x_1, x_2, \ldots, x_n)$. The population mean is denoted as $\mu_X$ and the population standard deviation is $\sigma_X$.

Random Variables for Statistical Analysis: Some random variables have interesting probability distributions. These probability distributions are useful in statistical inference.
Example: The random variable $Z$ has a standard normal distribution.

There are two types of random variables. One is the discrete random variable and the other is the continuous random variable.

Discrete Random Variable: A discrete random variable takes some countable number of values.

Continuous Random Variable: A continuous random variable is a random variable taking values on a line interval.

Example:
Age in years - Discrete
Income - Discrete
Prices - Discrete
Temperature - Continuous
Height - Continuous
Growth rates - Continuous

C.2 Discrete Random Variable (Sections 5.2, 5.3)

The probability distribution of a discrete random variable $X$ is denoted as $P(x)$. The properties of $P(x)$ are
a. $\sum_x P(x) = 1$.
b. $P(x) \ge 0$.

Example: New Products
Suppose the number of new products introduced each year is a random variable $X$. The values and the probabilities of $X$ are

x     P(x)
3     0.1
4     0.4
5     0.3
6     0.2

Mean and Standard Deviation
The mean of a discrete random variable $X$ is

$$\mu_X = \sum_x x\,P(x)$$

The mean of $X$ is also called the expected value of $X$,

$$E(X) = \sum_x x\,P(x)$$

The variance of a discrete random variable $X$ is

$$\sigma_X^2 = \sum_x (x - \mu_X)^2 P(x)$$

Example: New Products
Calculate the mean and standard deviation.

x     P(x)   x P(x)   $x - \mu_X$   $(x - \mu_X)^2$   $(x - \mu_X)^2 P(x)$
3     0.1    0.3      -1.6          2.56              0.256
4     0.4    1.6      -0.6          0.36              0.144
5     0.3    1.5       0.4          0.16              0.048
6     0.2    1.2       1.4          1.96              0.392
Sum   1      4.6                                      0.84

The mean is $\mu_X = 4.6$.

The variance is $\sigma_X^2 = 0.84$.

The standard deviation is $\sigma_X = \sqrt{0.84} = 0.9165$.
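The column-by-column calculation above can be written as two small functions. This is a minimal Python sketch under the definitions in these notes; the names `discrete_mean` and `discrete_variance` are illustrative, not from the source.

```python
import math

# Values and probabilities from the New Products example.
values = [3, 4, 5, 6]
probs = [0.1, 0.4, 0.3, 0.2]


def discrete_mean(xs, ps):
    """Expected value: sum of x * P(x)."""
    return sum(x * p for x, p in zip(xs, ps))


def discrete_variance(xs, ps):
    """Variance: sum of (x - mean)^2 * P(x)."""
    mu = discrete_mean(xs, ps)
    return sum((x - mu) ** 2 * p for x, p in zip(xs, ps))


mu_x = discrete_mean(values, probs)
var_x = discrete_variance(values, probs)
sd_x = math.sqrt(var_x)
print(f"mean = {mu_x:.1f}, variance = {var_x:.2f}, sd = {sd_x:.4f}")
```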

Exercise: Returned Checks
Suppose the number of returned checks in a day for a department store is a random variable $X$. The values and the probabilities of $X$ are

x     P(x)
0     0.3
1     0.4
2     0.2
3     0.1

Calculate the mean, variance, and standard deviation.

x     P(x)   x P(x)   $x - \mu_X$   $(x - \mu_X)^2$   $(x - \mu_X)^2 P(x)$
0     0.3
1     0.4
2     0.2
3     0.1
Sum

The mean is
The variance is
The standard deviation is

Alternative Formula for Calculating Variance: A shortcut formula to compute $\sigma_X^2$ is

$$\sigma_X^2 = \sum_x x^2 P(x) - \mu_X^2$$

Example: New Products

x     P(x)   x P(x)   $x^2$   $x^2 P(x)$
3     0.1    0.3      9       0.9
4     0.4    1.6      16      6.4
5     0.3    1.5      25      7.5
6     0.2    1.2      36      7.2
Sum   1      4.6              22.0

The variance is

$$\sigma_X^2 = \sum_x x^2 P(x) - \mu_X^2 = 22 - 4.6^2 = 0.84$$
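A brief Python sketch of the shortcut calculation for the New Products example follows; the variable names are assumptions chosen for illustration.

```python
# Values and probabilities from the New Products example.
values = [3, 4, 5, 6]
probs = [0.1, 0.4, 0.3, 0.2]

# Mean: sum of x * P(x).
mu_x = sum(x * p for x, p in zip(values, probs))

# Sum of x^2 * P(x), the last column of the table.
second_moment = sum(x ** 2 * p for x, p in zip(values, probs))

# Shortcut formula: sum of x^2 P(x) minus the squared mean.
var_shortcut = second_moment - mu_x ** 2
print(f"sum x^2 P(x) = {second_moment:.1f}, variance = {var_shortcut:.2f}")
```

The shortcut avoids computing each deviation from the mean, which is why it needs only one extra column in the hand calculation.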

Exercise: Returned Checks

x     P(x)   x P(x)   $x^2$   $x^2 P(x)$
0     0.3
1     0.4
2     0.2
3     0.1
Sum

The variance is $\sigma_X^2 =$
Textbook Exercises: 5.1-5.8, pages 136, 137; 5.15-5.21, 5.25-5.29, pages 148-150.
C.3 Continuous Random Variable (Sections 6.1, 6.3, 8.3)

The probability distribution of a continuous random variable $X$ can be denoted as $f(x)$. The probability distribution of $X$ has the following properties:
a. $f(x) \ge 0$.
b. The total area under $f(x)$ is one.
c. The probability of $x$ falling within an interval $(a, b)$ is denoted as $P(a < x < b)$. It is the area under the curve $f(x)$ between $a$ and $b$.

One of the most commonly used continuous random variables is the normal random variable.

C.3.1 Normal Distribution (Section 6.3)

Normal Random Variable and Normal Probability Distribution
A normal random variable with a normal probability distribution has the following properties:
a. The probability distribution is bell-shaped.
b. The distribution is symmetric about its mean $\mu$.
c. The spread of the distribution is determined by the standard deviation $\sigma$.
d. Any normal random variable $X$ with mean $\mu$ and standard deviation $\sigma$ can be standardized as a standard normal random variable:

$$Z = \frac{X - \mu}{\sigma}$$

Standard Normal Random Variable
A standard normal random variable is a normal random variable with mean zero and standard deviation one. The probability table for the standard normal random variable shows the probability

$P(0 < Z < a)$.

Using Standard Normal Probability Distribution Table

Case 1. Find $P(0 < Z < a)$.
Example:
$P(0 < Z < 1.2) = 0.3849$.
$P(0 < Z < 1.76) = 0.4608$.
Exercise:
$P(0 < Z < 1.64) =$
$P(0 < Z < 1.96) =$

Case 2. Find $P(a < Z < 0)$.
Example:
$P(-1.2 < Z < 0) = 0.3849$.
$P(-1.76 < Z < 0) = 0.4608$.
$P(Z < 0) = 0.5$.
$P(Z > 0) = 0.5$.
Exercise:
$P(-1.28 < Z < 0) =$
$P(-2.33 < Z < 0) =$

Case 3. Find $P(Z < a)$.
Example:
$P(Z < -1.2) = 0.5 - 0.3849 = 0.1151$.
$P(Z < -1.76) = 0.5 - 0.4608 = 0.0392$.
Note: We denote the cumulative probability as $F(a)$, such that $F(a) = P(Z < a)$.
Exercise:
$P(Z < 1.64) =$
$P(Z < 1.96) =$

Case 4. Find $P(Z > a)$.
Example:
$P(Z > 1.2) = 0.5 - 0.3849 = 0.1151$.
$P(Z > 1.76) = 0.5 - 0.4608 = 0.0392$.
Exercise:
$P(Z > 1.64) =$
$P(Z > 1.96) =$
$P(Z > 1.28) =$
$P(Z > 2.33) =$

Case 5. The probability $P(0 < Z < a)$ is given. Find the value of $a$.
Example: $P(0 < Z < a) = 30\%$. What is $a$? From the table, $a = 0.84$.
Exercise: $P(0 < Z < a) = 40\%$. What is $a$?

Case 6. The probability $P(Z > a)$ is given. Find the value of $a$.
Example: $P(Z > a) = 5\%$, find $a$. The point $a$ lies to the right of the origin, and $P(0 < Z < a) = 0.5 - 0.05 = 0.45$. With the given probability 0.45, we find $a = 1.64$ from the table.
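The table lookups in the worked examples can be cross-checked in software. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of the printed table; this is a substitute technique for illustration, not the method used in the notes, and the variable names are assumptions.

```python
from statistics import NormalDist

# Standard normal: mean 0, standard deviation 1.
z = NormalDist(mu=0.0, sigma=1.0)

# Case 1: P(0 < Z < a) = F(a) - 0.5.
p_case1 = z.cdf(1.2) - 0.5

# Case 2: P(a < Z < 0) for negative a = 0.5 - F(a).
p_case2 = 0.5 - z.cdf(-1.2)

# Case 3: P(Z < a) is the cumulative probability F(a).
p_case3 = z.cdf(-1.2)

# Case 4: P(Z > a) = 1 - F(a).
p_case4 = 1.0 - z.cdf(1.76)

# Case 5: P(0 < Z < a) = 0.30 means F(a) = 0.80.
a_case5 = z.inv_cdf(0.5 + 0.30)

# Case 6: P(Z > a) = 0.05 means F(a) = 0.95.
a_case6 = z.inv_cdf(1.0 - 0.05)

print(f"{p_case1:.4f} {p_case2:.4f} {p_case3:.4f} {p_case4:.4f}")
print(f"{a_case5:.2f} {a_case6:.3f}")
```

The inverse lookups return values between table entries (for example, about 1.645 in Case 6), whereas the printed table gives the nearest tabulated value.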
Exercise: $P(Z > a) = 0.10$, find $a$.
Answer:

Textbook Exercises: 6.17, 6.18, page 207.
Probabilities for Normal Random Variables

Let $X$ be a normal random variable with mean $\mu$ and variance $\sigma^2$. Then the random variable

$$Z = \frac{X - \mu}{\sigma}$$

is a standard normal random variable. Also,

$$P(a < X < b) = P\left(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}\right)$$

Example: A company produces light bulbs whose life follows a normal distribution with mean 1,200 hours and standard deviation 250 hours. If we choose a light bulb at random, what is the probability that its lifetime will be between 900 and 1,300 hours?

Answer:

$$P(900 < X < 1300) = P\left(\frac{900 - 1200}{250} < \frac{X - 1200}{250} < \frac{1300 - 1200}{250}\right) = P(-1.2 < Z < 0.4) = 0.3849 + 0.1554 = 0.5403$$
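The light bulb calculation can be verified numerically. The following is a hedged Python sketch using `statistics.NormalDist` from the standard library instead of the printed table; the variable names are illustrative.

```python
from statistics import NormalDist

# Light bulb life: normal with mean 1,200 hours and standard deviation 250 hours.
mu, sigma = 1200.0, 250.0
life = NormalDist(mu=mu, sigma=sigma)

# Standardize the two endpoints.
z_low = (900 - mu) / sigma    # -1.2
z_high = (1300 - mu) / sigma  # 0.4

# Probability from the standardized values ...
p_standardized = NormalDist().cdf(z_high) - NormalDist().cdf(z_low)

# ... and directly from the original distribution.
p_direct = life.cdf(1300) - life.cdf(900)

print(f"{z_low} {z_high} {p_standardized:.4f} {p_direct:.4f}")
```

Both routes give the same probability, which is the point of standardizing: any normal probability reduces to a standard normal one.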
Exercise: Anticipated consumer demand for a product next month can be represented by a normal random variable with mean 1,200 units and standard deviation 100 units.

a. What is the probability that sales will be between 1,000 and 1,300 units?
b. What is the probability that sales will exceed 1,100 units?

Answers:
Textbook Exercises: 6.19 abc, 6.20 abc, 6.21 abc, 6.22 abd, 6.23 ab, 6.24 abc, 6.25, 6.26, 6.27 a, 6.31 ab, 6.35 ab, 6.36a, 6.37 ab, pages 208-210.

C.3.2 Student's t Distribution (Section 8.3)

Student's t Distribution (t-distribution)
Let $t$ be a random variable with a t-distribution.

Properties of the t-distribution:
1. Bell-shaped.
2. Symmetrical about $t = 0$.
3. The probability distribution has tails that are more spread out than those of the standard normal distribution.
4. The shape of the probability distribution depends on a constant, the degrees of freedom ($v$).
5. When $v$ is large, the t-distribution is close to the standard normal distribution.

t Statistical Table
The table shows the value of $t_\alpha$ such that $P(t > t_\alpha) = \alpha$.

For $\alpha = 0.01$, $\alpha = 0.025$, and $\alpha = 0.05$, the values of $t_\alpha$ for different $v$ are

v          $t_{.05}$   $t_{.025}$   $t_{.01}$
5          2.015       2.571        3.365
10         1.812       2.228        2.764
15         1.753       2.131        2.602
20         1.725       2.086        2.528
$\infty$   1.645       1.96         2.326

Example: Find the value $a$ such that
a. $P(t > a) = 0.05$ when $v = 5$.
b. $P(t < a) = 0.025$ when $v = 10$.
c. $P(t > a) = 0.01$ when $v = 20$.
Answer: a: $a = 2.015$; b: $a = -2.228$; c: $a = 2.528$.

Exercise: Find the value $a$ such that
a. $P(t > a) = 0.01$ when $v = 5$.
b. $P(t < a) = 0.05$ when $v = 10$.
c. $P(t > a) = 0.025$ when $v = 15$.
Answer:
