Contents
Probability
Distributions
Descriptive Statistics
Estimation theory
Hypothesis testing
Linear Model
Design of experiments

Probability
Example 1:
Probability of success in test
Example 2:
Probability of success in test 2 given that test 1 < 5.5?
Distributions
[Figure: density curve of the test scores, mu=5.72, sigma=1.55; x-axis: Score (0 to 10), y-axis: Density]
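The density figure can be tied back to the "probability of success" question with a short calculation. The sketch below assumes the curve is a normal density with the quoted mu and sigma, and it assumes a pass mark of 5.5, which the slides do not state. The normal CDF is written with erfc so that only base MATLAB is needed.

```matlab
% Sketch: P(score >= passMark) under an assumed normal model of the scores.
mu = 5.72;        % from the figure
sigma = 1.55;     % from the figure
passMark = 5.5;   % assumption: the pass mark is not given in the slides

% Standard normal CDF without a toolbox: Phi(z) = 0.5*erfc(-z/sqrt(2))
z = (passMark - mu) / sigma;
pFail = 0.5 * erfc(-z / sqrt(2));   % P(score < passMark)
pPass = 1 - pFail;                  % P(score >= passMark)

fprintf('P(pass) = %.4f\n', pPass);
```

With the Statistics Toolbox, normcdf(passMark, mu, sigma) should give the same value as pFail.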
Descriptive Statistics
[Figure: histogram of the test scores; x-axis: Score (0 to 10), y-axis: Frequency (0 to 25)]

Test 1   Test 2
5.6      6.1
5.1      7.5
6.8      6.6
3.4      3.1
6.8      8.4
4.6      6.4
5.6      4.9
6.3      10.0
5.0      4.0
7.6
5.6
8.2
5.8
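The two probability questions from the overview can be estimated empirically from the score table. The sketch below uses only the nine rows where both scores are readable, and it assumes a pass mark of 5.5 on test 2, which the slides do not state.

```matlab
% Sketch: empirical probabilities from the nine complete rows of the table.
test1 = [5.6 5.1 6.8 3.4 6.8 4.6 5.6 6.3 5.0];
test2 = [6.1 7.5 6.6 3.1 8.4 6.4 4.9 10.0 4.0];
passMark = 5.5;                       % assumption: not given in the slides

passed2 = test2 >= passMark;          % logical vector: success in test 2
lowOnTest1 = test1 < 5.5;             % conditioning event from Example 2

pPass2 = mean(passed2);               % estimate of P(success in test 2)

% Conditional probability: P(A|B) = P(A and B) / P(B)
pPass2GivenLow1 = sum(passed2 & lowOnTest1) / sum(lowOnTest1);

fprintf('P(pass test 2)              = %.3f\n', pPass2);
fprintf('P(pass test 2 | test1<5.5)  = %.3f\n', pPass2GivenLow1);
```

With so few rows these are rough estimates only.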
Estimation theory
Example:
What are mu and sigma?
Bias
Robustness
Confidence Interval
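Bias and robustness can be illustrated with base MATLAB functions. The sketch below reuses the nine readable Test 1 scores from the table earlier in the deck as illustrative data. The outlier value of 100 is a made-up gross error.

```matlab
% Sketch: point estimates, bias of the variance estimator, robustness.
scores = [5.6 5.1 6.8 3.4 6.8 4.6 5.6 6.3 5.0];
n = numel(scores);

muHat = mean(scores);            % estimate of mu
sigmaHat = std(scores);          % estimate of sigma (divides by n-1)

% Bias: var(x,1) divides by n and is biased low on average;
% var(x) divides by n-1 and is unbiased.
varBiased = var(scores, 1);
varUnbiased = var(scores);

% Robustness: add one gross error and see how far each estimate moves.
scoresOutlier = [scores 100];    % hypothetical recording error
meanShift = mean(scoresOutlier) - mean(scores);
medianShift = median(scoresOutlier) - median(scores);

fprintf('mean moves by %.2f, median moves by %.2f\n', meanShift, medianShift);
```

The mean is pulled strongly by the single outlier while the median barely reacts, which is what robustness refers to here.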
Hypothesis testing
Example 1:
When you have less than 4.5 on test 1, you will not pass
Example 2:
Average Test 1 = Average Test 2
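Example 2 can be approached as a paired comparison, since each student has a score on both tests. The sketch below computes the paired t statistic by hand from the nine readable rows of the score table. Treating the data as paired is an assumption about how the scores were collected.

```matlab
% Sketch: paired t statistic for H0: mean(Test 1) = mean(Test 2).
test1 = [5.6 5.1 6.8 3.4 6.8 4.6 5.6 6.3 5.0];
test2 = [6.1 7.5 6.6 3.1 8.4 6.4 4.9 10.0 4.0];

d = test2 - test1;                  % per-student differences
n = numel(d);
tStat = mean(d) / (std(d) / sqrt(n));
dof = n - 1;                        % degrees of freedom

fprintf('t = %.3f with %d degrees of freedom\n', tStat, dof);

% Compare |tStat| with the critical value of the t distribution for dof.
% With the Statistics Toolbox, [h, p] = ttest(test1, test2) runs the
% paired test directly.
```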
Linear Model
[Figure: scatter plot of Score Test 2 (y-axis, 0 to 10) against Score Test 1 (x-axis, 0 to 10)]
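A straight-line model of Test 2 against Test 1 can be fitted with polyfit. The sketch below uses the nine readable rows of the score table. Fitting a first-degree polynomial by least squares is an assumption about the model the slide has in mind.

```matlab
% Sketch: least-squares line through the (Test 1, Test 2) scores.
test1 = [5.6 5.1 6.8 3.4 6.8 4.6 5.6 6.3 5.0];
test2 = [6.1 7.5 6.6 3.1 8.4 6.4 4.9 10.0 4.0];

coeffs = polyfit(test1, test2, 1);    % coeffs(1) = slope, coeffs(2) = intercept
predicted = polyval(coeffs, test1);   % fitted Test 2 scores
residuals = test2 - predicted;

plot(test1, test2, 'o', test1, predicted, '-');
xlabel('Score Test 1');
ylabel('Score Test 2');
```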
Design of experiments
To improve estimate
... To improve prediction of model
What is Data?
Data: Information coming from observations, counts,
measurements, or responses.
People who eat three daily servings of whole grains have been shown to
reduce their risk of stroke by 37%.
70% of the 1500 U.S. spinal cord injuries to minors result from vehicle
accidents, and 68% were not wearing a seatbelt.
What is Statistics?
Data -> Statistics -> Information
Data: List of last term's marks.
95, 89, 70, 65, 78, 57, ...
Information: New information about the statistics class.
E.g. Class average,
Proportion of class receiving As,
Most frequent mark,
Marks distribution, etc.
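The step from data to information can be sketched with the six marks that are listed. The cutoff of 80 for an A and the bin edges are assumptions made for illustration only.

```matlab
% Sketch: turning a list of marks into summary information.
marks = [95 89 70 65 78 57];      % the marks shown on the slide
gradeA = 80;                      % assumption: cutoff for an A

classAverage = mean(marks);
proportionA = mean(marks >= gradeA);

% Marks distribution: counts per 10-point bin (edges are an assumption)
edges = 50:10:100;
counts = histcounts(marks, edges);

fprintf('Class average: %.2f\n', classAverage);
fprintf('Proportion of As: %.2f\n', proportionA);
disp(counts);
```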
Data Sets
Population
The collection of all outcomes,
responses, measurements, or
counts that are of interest.
Sample
A subset of the population.
Branches of Statistics
Descriptive Statistics
Collect data
e.g., Survey
Present data
e.g., Tables and graphs
Characterize data
e.g., Sample mean $\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$
Inferential Statistics
Estimation
e.g., Estimate the population mean
weight using the sample mean weight
Hypothesis testing
e.g., Test the claim that the population
mean weight is 120 pounds
DATA
Data are the different values associated with a variable.
POPULATION
A population consists of all the items or individuals about
which you want to draw a conclusion.
SAMPLE
A sample is the portion of a population selected for analysis.
PARAMETER
A parameter is a numerical measure that describes a
characteristic of a population.
STATISTIC
A statistic is a numerical measure that describes a
characteristic of a sample.
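The difference between a parameter and a statistic can be seen in a small simulation. Everything in the sketch below is hypothetical: the population is simulated weights with mean 120 pounds, and the sample size of 50 is arbitrary.

```matlab
% Sketch: parameter (population) versus statistic (sample).
rng(1);                                          % repeatable random numbers
population = 120 + 15*randn(1, 10000);           % hypothetical weights in pounds
parameterMean = mean(population);                % parameter: describes the population

sampleSize = 50;
idx = randperm(numel(population), sampleSize);   % draw without replacement
sample = population(idx);                        % sample: subset of the population
statisticMean = mean(sample);                    % statistic: describes the sample

fprintf('Population mean (parameter): %.2f\n', parameterMean);
fprintf('Sample mean (statistic):     %.2f\n', statisticMean);
```

The statistic changes from sample to sample, while the parameter is fixed for a given population.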
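NOTE #1:
Reversing the x,y order to (y,x) swaps the two axes: the plot is
mirrored about the line y = x, not purely rotated by 90 degrees.
NOTE #2:
line(x,y) is similar to plot(x,y) but does not accept plot's
additional options, such as a line-style string like 'r--'.
[Figure: example plot with manually inserted text; x-axis labelled "X axis description"]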
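The two notes can be checked side by side. The curve in the sketch below is a made-up example, not the one from the slide.

```matlab
% Sketch: plot(x,y), plot(y,x) and line(x,y) on a hypothetical curve.
x = linspace(0, 35, 200);
y = -0.5 + 0.5*cos(x/5);

figure;
subplot(1,3,1);
plot(x, y, 'r--');                 % plot accepts a line-style string
xlabel('X axis description');
title('plot(x,y)');
text(5, -0.5, 'Manually inserted text...');

subplot(1,3,2);
plot(y, x, 'r--');                 % swapped arguments: axes are exchanged
title('plot(y,x)');

subplot(1,3,3);
h = line(x, y, 'LineStyle', '--', 'Color', 'r');   % line needs name-value pairs
title('line(x,y)');
```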
Kinds of plots:
bar(x) creates a bar graph of the vector x. (Note also the command stairs(x).)
bar(x,y) creates a bar graph of the elements of the vector y, locating the bars
at the positions given by the elements of the vector x.
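The three commands can be compared on one small data set. The values in the sketch below are arbitrary.

```matlab
% Sketch: bar(y), bar(x,y) and stairs(y) on the same hypothetical data.
y = [3 7 5 9 4];      % bar heights
x = [1 2 4 7 8];      % bar positions (must not contain repeated values)

figure;
subplot(3,1,1); bar(y);      title('bar(y): bars at 1, 2, ..., numel(y)');
subplot(3,1,2); bar(x, y);   title('bar(x,y): bars located at the values in x');
subplot(3,1,3); stairs(y);   title('stairs(y): stairstep graph');
```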
m-function Structure
Function definition (arguments: radius, length; returned variable: volume):
function volume=cylinder(radius, length)
Help comments:
% CYLINDER computes volume of circular cylinder
% given radius and length
% Use:
% vol=cylinder(radius, length)
%
Statements (no end required):
volume=pi.*radius.^2.*length;
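MATLAB already has built-in functions named cylinder and length, so saving the slide's function as cylinder.m would shadow the built-in. The sketch below is a variant with the hypothetical names cylinderVolume and len, to be saved as cylinderVolume.m. It uses the element-wise power .^ so that vector inputs work.

```matlab
function volume = cylinderVolume(radius, len)
% CYLINDERVOLUME computes the volume of a circular cylinder
% given radius and length (scalars or arrays of matching size).
% Use:
%   vol = cylinderVolume(radius, len)

% Element-wise operators so that array inputs are handled entry by entry
volume = pi .* radius.^2 .* len;
end
```

For example, vol = cylinderVolume([1 2 3], 10) returns pi*[10 40 90], and help cylinderVolume prints the comment block.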
$$f(x) = P(X = x) = \binom{n}{x} p^{x} (1-p)^{n-x}, \quad x = 0, 1, 2, \ldots, n$$
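The binomial probability function can be evaluated directly with nchoosek from base MATLAB. The values n = 10 and p = 0.3 in the sketch below are arbitrary illustration values.

```matlab
% Sketch: binomial probabilities f(x) = nchoosek(n,x) * p^x * (1-p)^(n-x).
n = 10;       % number of trials (hypothetical)
p = 0.3;      % success probability (hypothetical)
x = 0:n;      % all possible numbers of successes

pmf = arrayfun(@(k) nchoosek(n, k) * p^k * (1 - p)^(n - k), x);

totalProb = sum(pmf);          % should be 1 up to rounding
meanValue = sum(x .* pmf);     % should equal n*p

bar(x, pmf);
xlabel('x');
ylabel('P(X = x)');
```

With the Statistics Toolbox, binopdf(x, n, p) should return the same vector.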
Descriptive Statistics
corrcoef - Linear correlation coefficient with confidence intervals.
cov - Covariance.
mean - Sample average (in MATLAB toolbox).
median - 50th percentile of a sample.
range - Range.
std - Standard deviation (in MATLAB toolbox).
var - Variance (in MATLAB toolbox).
Example:
>> X = [ 1 2 3 5 6 7 23 45 33 46 22]
X=
1 2 3 5 6 7 23 45 33 46 22
>> mean(X)
ans =
17.5455
>> std(X)
ans =
17.2648
A = [0 2 5 7 20];
B = [1 2 3
     3 3 6
     4 6 8
     4 7 7];
Mean:
mean(A) = 6.8
mean(B) = 3.0 4.5 6.0 (column-wise mean)
mean(B,2) = 2.0 4.0 6.0 6.0 (row-wise mean)
Median:
median(A) = 5
median(B) = 3.5 4.5 6.5 (column-wise median)
median(B,2) = 2.0 3.0 6.0 7.0 (row-wise median)
Descriptive Statistics
Example: The function displaytable.m is posted on the course website
>> X = rand(9,9); %generates 9x9 random matrix
>> displaytable(cov(X)); % plots the covariance matrix of X
>> displaytable(corrcoef(X)); % plots the correlation matrix of X
Data Correlations
% Compute sample correlation
r = corrcoef([var1,var2])
r =
    1.0000 0.7051
    0.7051 1.0000
[Figure: scatter plot of Variable 1 (y-axis) against Variable 2 (x-axis)]
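The off-diagonal entry of the corrcoef output is the covariance divided by the product of the two standard deviations. The slide's var1 and var2 are not given, so the sketch below uses the nine readable rows of the test-score table as stand-in data. It will not reproduce the value 0.7051.

```matlab
% Sketch: sample correlation from corrcoef and from its definition.
var1 = [5.6 5.1 6.8 3.4 6.8 4.6 5.6 6.3 5.0]';     % stand-in data (Test 1)
var2 = [6.1 7.5 6.6 3.1 8.4 6.4 4.9 10.0 4.0]';    % stand-in data (Test 2)

r = corrcoef([var1, var2]);      % 2x2 matrix with ones on the diagonal

c = cov(var1, var2);             % 2x2 covariance matrix
rManual = c(1,2) / (std(var1) * std(var2));

fprintf('corrcoef: %.4f, by definition: %.4f\n', r(1,2), rManual);
```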
Statistical Plotting
andrewsplot - Andrews plot for multivariate data.
biplot - Biplot of variable/factor coefficients and scores.
boxplot - Boxplots of a data matrix (one per column).
cdfplot - Plot of empirical cumulative distribution function (cdf).
fsurfht - Interactive contour plot of a function.
glyphplot - Plot stars or Chernoff faces for multivariate data.
gplotmatrix - Matrix of scatter plots grouped by a common variable.
gscatter - Scatter plot of two variables grouped by a third.
hist - Histogram (in MATLAB toolbox).
hist3 - Three-dimensional histogram of bivariate data.
normplot - Normal probability plot.
parallelcoords - Parallel coordinates plot for multivariate data.
probplot - Probability plot.
surfht - Interactive contour plot of a data grid.
wblplot - Weibull probability plot.
3D histogram
>> hist3([X(:,1),X(:,2)]);
Statistical Plotting
normplot: Normal probability plot for graphical normality test.
[Figure: normal probability plot; x-axis: Data (-1.5 to 1.5), y-axis: Probability (0.01 to 0.75)]
The plot is linear, indicating that you can model the sample by a
normal distribution.
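The linearity check is easier to read next to a sample that is not normal. The sketch below uses simulated data, not the data behind the slide's figure, and it requires the Statistics Toolbox for normplot.

```matlab
% Sketch: normal probability plots for a normal and a skewed sample.
rng(0);                              % repeatable random numbers
xNormal = randn(100, 1);             % simulated normal sample
xSkewed = -log(rand(100, 1));        % simulated exponential sample (skewed)

figure;
subplot(1,2,1);
normplot(xNormal);                   % points should follow the reference line
title('Normal sample');

subplot(1,2,2);
normplot(xSkewed);                   % points should curve away from the line
title('Skewed sample');
```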
[Figure: density curve; x-axis: Critical Value (6 to 20), y-axis: Density (0 to 0.25)]
Control Charts
A control chart displays measurements of process samples over time. The measurements
are plotted together with user-defined specification limits and process-defined control
limits. The process can then be compared with its specifications to see if it is in control or
out of control.
The chart is just a monitoring tool. Control activity might occur if the chart indicates an
undesirable, systematic change in the process. The control chart is used to discover the
variation, so that the process can be adjusted to reduce it.
Xbar or mean
Standard deviation
Range
Exponentially weighted moving average
Individual observation
Moving range of individual observations
Moving average of individual observations
Proportion defective
Number of defectives
Defects per unit
Count of defects
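The idea behind an Xbar chart can be sketched in base MATLAB. The process data below are simulated. Estimating sigma from the pooled within-subgroup variance is a simplification and may differ from the estimator used by the Statistics Toolbox function controlchart.

```matlab
% Sketch: center line and 3-sigma control limits for subgroup means.
rng(2);                                   % repeatable random numbers
nSub = 25;                                % number of subgroups (hypothetical)
n = 4;                                    % measurements per subgroup (hypothetical)
X = 10 + 0.2*randn(nSub, n);              % simulated in-control process

xbar = mean(X, 2);                        % one mean per subgroup (row)
center = mean(xbar);                      % center line
sigmaHat = sqrt(mean(var(X, 0, 2)));      % pooled within-subgroup std (simplification)

ucl = center + 3*sigmaHat/sqrt(n);        % upper control limit
lcl = center - 3*sigmaHat/sqrt(n);        % lower control limit
outOfControl = find(xbar > ucl | xbar < lcl);   % subgroups outside the limits

plot(1:nSub, xbar, 'o-');
hold on;
plot([1 nSub], [center center], 'k-');
plot([1 nSub], [ucl ucl], 'r--');
plot([1 nSub], [lcl lcl], 'r--');
hold off;
xlabel('Subgroup');
ylabel('Subgroup mean');
```

Specification limits are set by the user and would be drawn as separate lines. They are not computed from the data.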