1-1

Statistics and Econometrics


By:
Prof. Tri Widodo, Ph.D

1-2

Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting numerical data to assist in making more effective decisions.

First PhD Class: Let's cross the street and do your economics!

What is Meant by Statistics?

1-3

Statistical techniques are used extensively by marketing, accounting, quality control, consumers, professional sports people, hospital administrators, educators, politicians, physicians, and many others.
Who Uses Statistics?

1-4

Descriptive Statistics: Methods of organizing, summarizing, and presenting data in an informative way.
Example: election results

Types of Statistics

1-5

Inferential Statistics: A decision, estimate, prediction, or generalization about a population, based on a sample.

A Population is a collection of all possible individuals, objects, or measurements of interest.

A Sample is a portion, or part, of the population of interest.

Types of Statistics

1-6

Example: A sensational research finding on virginity: 93% of female university students in Yogya are no longer virgins

Sampling

Population
Sample

Parameter

Statistics
Hypothesis testing
H0:; H1:

Inferential Statistics

1-7

DATA

Qualitative or attribute
(type of car owned)

Quantitative or numerical

discrete
(number of children)

continuous
(time taken for an exam)

Summary of Types of Variables

1-8

Describing Data: Frequency Distributions and Graphic Presentation

Organize data into a frequency distribution.
Portray a frequency distribution in a histogram, frequency polygon, and cumulative frequency polygon.
Present data using such graphic techniques as line charts, bar charts, and pie charts.

1-9

The three commonly used graphic forms are Histograms, Frequency Polygons, and a Cumulative Frequency distribution.
A Histogram is a graph in which the class midpoints or limits are marked on the horizontal axis and the class frequencies on the vertical axis.
The class frequencies are represented by the heights of the bars and the bars are drawn adjacent to each other.
Graphic Presentation of a
Frequency Distribution
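
A minimal sketch of how a frequency distribution and a text histogram could be built. The scores, the class width of 10, and the function name `frequency_distribution` are illustrative assumptions, not taken from the slides.

```python
def frequency_distribution(values, lower, width, n_classes):
    """Count the values in each class [lower + i*width, lower + (i+1)*width)."""
    counts = [0] * n_classes
    for v in values:
        i = int((v - lower) // width)  # index of the class that contains v
        if 0 <= i < n_classes:
            counts[i] += 1
    return counts

# Hypothetical exam scores, for illustration only
scores = [52, 55, 61, 64, 67, 68, 71, 73, 74, 78, 82, 85, 91]
counts = frequency_distribution(scores, lower=50, width=10, n_classes=5)

# Text histogram: one bar per class, bar length equals the class frequency
for i, c in enumerate(counts):
    lo = 50 + 10 * i
    print(f"{lo} up to {lo + 10}: {'*' * c}")
```

Each class includes its lower limit and excludes its upper limit, so every value lands in exactly one class.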

1-10

Describing Data: Numerical Measures

Compute and interpret the range, the mean deviation, the variance, and the standard deviation of ungrouped data.
Explain the characteristics, uses, advantages, and disadvantages of each measure of dispersion.

1-11

The Arithmetic Mean is the most widely used measure of location and shows the central value of the data.

It is calculated by summing the values and dividing by the number of values.

The major characteristics of the mean are:

It requires the interval scale.
All values are used.
It is unique.
The sum of the deviations from the mean is 0.
Characteristics of the Mean
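
As a small illustration, the sketch below computes the arithmetic mean and checks that the deviations from it sum to zero. The data values and the function name `arithmetic_mean` are hypothetical.

```python
def arithmetic_mean(values):
    """Sum the values and divide by the number of values."""
    return sum(values) / len(values)

data = [2, 4, 6, 8, 10]  # hypothetical values, for illustration only
mean = arithmetic_mean(data)

# Deviation of each value from the mean; these should add up to zero
deviations = [x - mean for x in data]
print(mean)
print(sum(deviations))
```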

1-12

The Median is the midpoint of the values after they have been ordered from the smallest to the largest.

There are as many values above the median as below it in the data array.

The median is found at the (n+1)/2 ranked observation. For an even set of values, this position falls between the two middle numbers, and the median is their arithmetic average.

The Median
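
A minimal sketch of the median calculation, covering both an odd and an even number of values. The function name and sample values are illustrative assumptions.

```python
def median(values):
    """Return the middle value of the ordered data.

    For an even count, return the average of the two middle values.
    """
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([3, 1, 2]))     # odd count: the single middle value
print(median([4, 1, 3, 2]))  # even count: average of the two middle values
```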

1-13

The Mode is another measure of location and represents the value of the observation that appears most frequently.
Data can have more than one mode. If it has two modes, it is referred to as bimodal, three modes, trimodal, and the like.
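
One way to find every mode is Python's standard `statistics.multimode`, sketched here on hypothetical data. The `labels` lookup is only an illustrative helper.

```python
from statistics import multimode

def modes(values):
    """Return every value that occurs with the highest frequency."""
    return multimode(values)

# Illustrative names for the number of modes found
labels = {1: "unimodal", 2: "bimodal", 3: "trimodal"}

data = [1, 2, 2, 3, 3, 4]  # hypothetical values; 2 and 3 each appear twice
found = modes(data)
print(found, labels.get(len(found), "multimodal"))
```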

1-14

Dispersion refers to the spread or variability in the data.

Measures of dispersion include the following: range, mean deviation, variance, and standard deviation.

Range = Largest value - Smallest value

Measures of Dispersion
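
A short sketch of two of these measures. The range follows the definition above. The mean deviation is computed here as the average absolute deviation from the mean, which is the usual textbook definition but is not written out on the slide. Data and function names are hypothetical.

```python
def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

def mean_deviation(values):
    """Average of the absolute deviations from the arithmetic mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)

data = [2, 4, 6, 8, 10]  # hypothetical values, for illustration only
print(value_range(data))
print(mean_deviation(data))
```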

1-15

Sample variance ($s^2$)

$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$

Sample standard deviation ($s$)

$$s = \sqrt{s^2}$$

Sample variance and standard deviation
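
A minimal sketch of the sample variance (divisor n - 1) and the sample standard deviation as its square root. The data and function names are hypothetical.

```python
import math

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def sample_std(values):
    """Square root of the sample variance."""
    return math.sqrt(sample_variance(values))

data = [2, 4, 6, 8, 10]  # hypothetical values, for illustration only
print(sample_variance(data))
print(sample_std(data))
```

The standard library functions `statistics.variance` and `statistics.stdev` use the same n - 1 divisor.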

Chapter Four

1-16

Describing Data: Displaying and Exploring Data
Develop and interpret a dot plot.
Develop and interpret a stem-and-leaf display.
Compute and interpret quartiles, deciles, and percentiles.
Construct and interpret box plots.
Compute and understand the coefficient of variation and the
coefficient of skewness.
Draw and interpret a scatter diagram.
Set up and interpret a contingency table.

1-17

Dot plots:
Report the details of each observation
Are useful for comparing two or more data sets
Dot Plot

1-18

Stem-and-leaf display: A statistical technique for displaying a set of data. Each numerical value is divided into two parts: the leading digits become the stem and the trailing digits the leaf.

Note: an advantage of the stem-and-leaf display over a frequency distribution is we do not lose the identity of each observation.

Stem-and-leaf Displays
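
A minimal sketch of building such a display, assuming non-negative integers whose last digit serves as the leaf. The data and the function name `stem_and_leaf` are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (leading digits) to its sorted leaves (final digit).

    Assumes non-negative integers.
    """
    table = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 117 -> stem 11, leaf 7
        table[stem].append(leaf)
    return dict(table)

data = [96, 93, 88, 117, 127, 95, 113, 96, 108, 94]  # hypothetical values
for stem, leaves in stem_and_leaf(data).items():
    print(f"{stem:>3} | {' '.join(str(leaf) for leaf in leaves)}")
```

Every original observation can be read back from the display, which is the advantage noted above.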

1-19

A box plot is a graphical display, based on quartiles, that helps to picture a set of data.
Five pieces of data are needed to construct a box plot: the Minimum Value, the First Quartile, the Median, the Third Quartile, and the Maximum Value.
Box Plots
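
The five values can be computed with Python's standard library, as sketched below. Quartile conventions differ between textbooks and software. This sketch assumes the "exclusive" method of `statistics.quantiles`, which locates quartiles using n + 1 positions. The data are hypothetical.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Assumption: quartiles use the 'exclusive' convention; other
    conventions can give slightly different Q1 and Q3.
    """
    q1, med, q3 = quantiles(values, n=4, method="exclusive")
    return min(values), q1, med, q3, max(values)

data = [1, 2, 3, 4, 5, 6, 7]  # hypothetical values, for illustration only
print(five_number_summary(data))
```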

1-20

[Figure: box plot marking the Min, Q1, Median, Q3, and Max on a horizontal axis running from 12 to 32]

1-21

Skewness is the measurement of the lack of symmetry of the distribution.
The coefficient of skewness can range from -3.00 up to 3.00 when using the following formula:

$$sk = \frac{3(\bar{X} - \text{Median})}{s}$$

A value of 0 indicates a symmetric distribution.

Some software packages use a different formula which results in a wider range for the coefficient.
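
A minimal sketch of this coefficient, three times the difference between the mean and the median, divided by the sample standard deviation. The function name and data are hypothetical.

```python
from statistics import mean, median, stdev

def pearson_skewness(values):
    """Coefficient of skewness: 3 * (mean - median) / s."""
    return 3 * (mean(values) - median(values)) / stdev(values)

symmetric = [1, 2, 3, 4, 5]      # hypothetical: mean equals median
right_skewed = [1, 2, 3, 4, 20]  # hypothetical: one large value pulls the mean up
print(pearson_skewness(symmetric))
print(pearson_skewness(right_skewed))
```

A positive result indicates that the mean lies above the median, and a negative result that it lies below.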

1-22

Scatter diagram: A technique used to show the relationship between variables.

Variables must be at least interval scaled.

Relationship can be positive (direct) or negative (inverse).

Example
The twelve days of stock prices and the overall market
index on each day are given as follows:
Scatter diagram

1-23

Estimation and Confidence Intervals

Construct a confidence interval for the population proportion.

1-24

A point estimate is a single value (statistic) used to estimate a population value (parameter).

An Interval Estimate states the range within which a population parameter probably lies.

A confidence interval is a range of values within which the population parameter is expected to occur.
The two confidence intervals that are used extensively are the 95% and the 99%.
Point and Interval Estimates

1-25

If the population standard deviation is unknown, the underlying population is approximately normal, and the sample size is less than 30, we use the t distribution.

$$\bar{X} \pm t \frac{s}{\sqrt{n}}$$

The value of t for a given confidence level depends upon its degrees of freedom.
Point and Interval Estimates
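
A minimal sketch of a small-sample interval of the form sample mean plus or minus t times s over the square root of n. The critical t value is passed in from a t table because the Python standard library has no t distribution. The sample and the function name `t_interval` are hypothetical.

```python
import math
from statistics import mean, stdev

def t_interval(sample, t_critical):
    """Return (lower, upper) for sample mean +/- t * s / sqrt(n)."""
    n = len(sample)
    x_bar = mean(sample)
    margin = t_critical * stdev(sample) / math.sqrt(n)
    return x_bar - margin, x_bar + margin

sample = [48, 52, 50, 49, 51, 53, 47, 50, 52, 48]  # hypothetical, n = 10
# 2.262 is the tabulated t value for 95% confidence with n - 1 = 9 degrees of freedom
low, high = t_interval(sample, t_critical=2.262)
print(low, high)
```

With a different sample size the degrees of freedom change, so the t value has to be looked up again.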

1-26

Confidence interval for the mean

$$\bar{X} \pm z \frac{s}{\sqrt{n}}$$

95% CI for the population mean

$$\bar{X} \pm 1.96 \frac{s}{\sqrt{n}}$$

99% CI for the population mean

$$\bar{X} \pm 2.58 \frac{s}{\sqrt{n}}$$

Constructing General Confidence Intervals for μ
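
A minimal sketch of these large-sample intervals, with the z value taken from the standard normal distribution instead of a table. The summary figures (mean 50, s = 5, n = 100) and the function names are hypothetical, and a sample large enough for the normal approximation is assumed.

```python
import math
from statistics import NormalDist

def z_value(confidence):
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def z_interval(x_bar, s, n, confidence=0.95):
    """Return (lower, upper) for x_bar +/- z * s / sqrt(n)."""
    margin = z_value(confidence) * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

print(round(z_value(0.95), 2), round(z_value(0.99), 2))
# Hypothetical summary statistics, for illustration only
low, high = z_interval(x_bar=50, s=5, n=100, confidence=0.95)
print(low, high)
```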

1-27

Econometrics
Statistics

Economics

Mathematics

1-28

Linear Regression and Correlation
Draw a scatter diagram.
Understand and interpret the terms dependent variable and independent
variable.
Calculate and interpret the coefficient of correlation, the coefficient of
determination, and the standard error of estimate.
Conduct a test of hypothesis to determine if the population coefficient of
correlation is different from zero.
Calculate the least squares regression line and interpret the slope and intercept
values.
Construct and interpret a confidence interval and prediction interval for the
dependent variable.
Set up and interpret an ANOVA table.

1-29

Correlation Analysis is a group of statistical techniques to measure the association between two variables.

A Scatter Diagram is a chart that portrays the relationship between two variables.

The Dependent Variable is the variable being predicted or estimated.

The Independent Variable provides the basis for estimation. It is the predictor variable.

[Figure: scatter diagram "Advertising Minutes and $ Sales", with Advertising Minutes (70 to 190) on the horizontal axis and Sales ($thousands, 0 to 30) on the vertical axis]
Correlation Analysis

1-30

The Coefficient of Correlation (r) is a measure of the strength of the relationship between two variables.
Also called Pearson's r and Pearson's product moment correlation coefficient.
It requires interval or ratio-scaled data.
It can range from -1.00 to 1.00.
Values of -1.00 or 1.00 indicate perfect and strong correlation.
Values close to 0.0 indicate weak correlation.
Negative values indicate an inverse relationship and positive values indicate a direct relationship.

[Figure: scale for Pearson's r running from -1 through 0 to 1]

The Coefficient of Correlation, r
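
A minimal sketch of Pearson's r computed from deviations about the means. The function name and the small data sets are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson's product moment correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: y rises with x in the first case and falls in the second
print(pearson_r([1, 2, 3], [2, 4, 6]))
print(pearson_r([1, 2, 3], [6, 4, 2]))
```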

1-31

[Figure: scatter plot, vertical axis from 0 to 10]

Perfect Negative Correlation

1-32

[Figure: scatter plot, vertical axis from 0 to 10]

Perfect Positive Correlation

1-33

[Figure: scatter plot, vertical axis from 0 to 10]

Zero Correlation

1-34

[Figure: scatter plot, vertical axis from 0 to 10]

Strong Positive Correlation

1-35

The coefficient of determination ($r^2$) is the proportion of the total variation in the dependent variable (Y) that is explained or accounted for by the variation in the independent variable (X).
It is the square of the coefficient of correlation.
It ranges from 0 to 1.
It does not give any information on the direction
of the relationship between the variables.

Coefficient of Determination
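
A short sketch showing that squaring the correlation removes its sign: reversing the direction of the relationship leaves the coefficient of determination unchanged. The data and the function name `r_squared` are hypothetical.

```python
def r_squared(xs, ys):
    """Coefficient of determination, the square of Pearson's r."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)

x = [1, 2, 3, 4, 5]  # hypothetical values, for illustration only
y = [2, 4, 5, 4, 5]
print(r_squared(x, y))
# Negating y turns a direct relationship into an inverse one; r squared is the same
print(r_squared(x, [-v for v in y]))
```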

1-36

In Regression Analysis we use the independent variable (X) to estimate the dependent variable (Y).
The relationship between the variables is linear.

Both variables must be at least interval scale.

The least squares criterion is used to determine the equation. That is, the term $\sum (Y - \hat{Y})^2$ is minimized.
Regression Analysis

1-37

The regression equation is $\hat{Y} = a + bX$, where
$\hat{Y}$ is the average predicted value of Y for any X.
a is the Y-intercept. It is the estimated Y value when X = 0.
b is the slope of the line, or the average change in $\hat{Y}$ for each change of one unit in X.
The least squares principle is used to obtain a and b.
Regression Analysis
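
A minimal sketch of obtaining a and b by least squares, using the standard closed-form solution: the slope is the sum of cross-deviations divided by the sum of squared deviations of X, and the line passes through the point of means. The formulas are not written out on the slide. The data and the function name `least_squares` are hypothetical.

```python
def least_squares(xs, ys):
    """Return (a, b) for the fitted line: predicted Y = a + b * X."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx    # slope: average change in predicted Y per unit of X
    a = my - b * mx  # intercept: predicted Y when X = 0
    return a, b

x = [1, 2, 3, 4, 5]  # hypothetical values, for illustration only
y = [2, 4, 5, 4, 5]
a, b = least_squares(x, y)
print(a, b)
print(a + b * 6)  # predicted Y for X = 6
```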

1-38

Thank you