
ProbStats (Reviewer)

CHAPTER 1

Statistics is the science of conducting studies to collect, organize,
summarize, analyze, and draw conclusions from data.
Data are the values (measurements or observations) that the variables can
assume.
Variables whose values are determined by chance are called random
variables.
A collection of data values forms a data set.
Each value in the data set is called a data value or a datum.
Descriptive statistics consists of the collection, organization,
summarization, and presentation of data.
A population consists of all subjects (human or otherwise) that are being
studied.
A sample is a subgroup of the population.
Inferential statistics consists of generalizing from samples to populations,
performing hypothesis testing, determining relationships among variables,
and making predictions.
Qualitative variables are variables that can be placed into distinct
categories, according to some characteristic or attribute. For example,
gender (male or female).
Quantitative variables are numerical in nature and can be ordered or
ranked. Example: age is numerical and the values can be ranked.
Discrete variables assume values that can be counted.
Continuous variables can assume all values between any two specific
values. They are obtained by measuring.
The nominal level of measurement classifies data into mutually exclusive
(non-overlapping), exhaustive categories in which no order or ranking can be
imposed on the data.
The ordinal level of measurement classifies data into categories that can
be ranked; precise differences between the ranks do not exist.
The interval level of measurement ranks data; precise differences
between units of measure do exist; there is no meaningful zero.
The ratio level of measurement possesses all the characteristics of
interval measurement, and there exists a true zero. In addition, true ratios
exist for the same variable.
Data can be collected in a variety of ways.
One of the most common methods is through the use of surveys.
Surveys can be done by using a variety of methods. Examples are telephone
surveys, mailed questionnaires, and personal interviews. Data can also be
collected by surveying records or by direct observation.
To obtain samples that are unbiased, statisticians use four methods of
sampling.

Random samples are selected by using chance methods or random numbers.
Systematic samples are obtained by numbering each value in the
population and then selecting every kth value.
Stratified samples are selected by dividing the population into groups
(strata) according to some characteristic and then taking samples from each
group.
Cluster samples are selected by dividing the population into groups
(clusters) and then selecting some of those groups; the members of the
selected groups form the sample.
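The four sampling methods above can be sketched in Python as follows. This is only an illustrative sketch: the function names, the population of numbers 1 to 100, the even/odd strata, and the blocks of 20 used as clusters are all assumptions made for the example, not part of the course material.

```python
import random

def simple_random_sample(population, n, rng):
    # Random sample: every subject has the same chance of selection.
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    # Systematic sample: pick a random start, then take every kth value.
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(population, stratum_of, n_per_stratum, rng):
    # Stratified sample: split into strata, then sample from EACH stratum.
    strata = {}
    for item in population:
        strata.setdefault(stratum_of(item), []).append(item)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    # Cluster sample: select whole groups and keep all of their members.
    chosen = rng.sample(clusters, n_clusters)
    return [item for cluster in chosen for item in cluster]

rng = random.Random(0)                 # fixed seed so the run is repeatable
population = list(range(1, 101))       # hypothetical population: subjects 1..100

srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strat = stratified_sample(population, lambda x: x % 2, 3, rng)  # strata: odd / even
clusters = [population[i:i + 20] for i in range(0, 100, 20)]    # 5 hypothetical clusters
clus = cluster_sample(clusters, 2, rng)

print(len(srs), len(sys_sample), len(strat), len(clus))
```

Note the difference between the last two: a stratified sample takes some subjects from every group, while a cluster sample takes every subject from some groups.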
Computers and calculators make numerical computation easier.
Many statistical packages are available. One example is MINITAB. The TI-83
calculator can also be used to do statistical calculations.
Data must still be understood and interpreted.

CHAPTER 2

When data are collected in original form, they are called raw data.
When the raw data are organized into a frequency distribution, the
frequency is the number of values in a specific class of the distribution.
A frequency distribution is the organizing of raw data in table form, using
classes and frequencies.
Categorical frequency distributions - can be used for data that can be
placed in specific categories, such as nominal- or ordinal-level data.
Examples - political affiliation, religious affiliation, blood type etc.

Ungrouped frequency distributions - can be used for data that can be
enumerated and when the range of values in the data set is not large.
Examples - number of miles your instructors have to travel from home to
campus, number of girls in a 4-child family etc.
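As a rough illustration, categorical and ungrouped frequency distributions can both be built by counting how often each distinct value occurs. The sketch below uses Python's `collections.Counter`; the helper name `frequency_table` and both data sets are made up for the example.

```python
from collections import Counter

def frequency_table(values):
    # One row per distinct value: (value, frequency, percent of total).
    counts = Counter(values)
    total = len(values)
    return [(v, counts[v], 100 * counts[v] / total) for v in sorted(counts)]

# Hypothetical raw data (not from the course material).
blood_types = ["A", "B", "B", "AB", "O", "O", "O", "B", "AB", "B", "A", "O"]
girls_per_family = [2, 1, 3, 2, 0, 4, 2, 1, 2, 3]

blood_table = frequency_table(blood_types)        # categorical distribution
girls_table = frequency_table(girls_per_family)   # ungrouped distribution

for value, freq, pct in blood_table:
    print(f"{value:>2}  {freq:>2}  {pct:5.1f}%")
```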
Grouped frequency distributions - can be used when the range of values
in the data set is very large. The data must be grouped into classes that are
more than one unit in width.
Examples - the life of boat batteries in hours.
Class limits represent the smallest and largest data values that can be
included in a class.
In the lifetimes of boat batteries example, the values 24 and 30 of the first
class are the class limits.
The lower class limit is 24 and the upper class limit is 30.
The class boundaries are used to separate the classes so that there are no
gaps in the frequency distribution.
The class width for a class in a frequency distribution is found by
subtracting the lower (or upper) class limit of one class from the lower (or
upper) class limit of the next class.
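A small sketch of these two definitions, using the first class 24-30 from the boat battery example. The second class 31-37 and the function names are assumptions made for illustration, and the boundaries are taken half a unit beyond the limits, which is the usual convention for whole-number data.

```python
def class_boundaries(lower_limit, upper_limit, unit=1):
    # Boundaries sit half a unit outside the limits, closing the gaps
    # between adjacent classes (assumes data recorded to the nearest `unit`).
    half = unit / 2
    return lower_limit - half, upper_limit + half

def class_width(this_class, next_class):
    # Width = lower limit of the next class minus lower limit of this class.
    return next_class[0] - this_class[0]

first = (24, 30)    # class limits from the boat battery example
second = (31, 37)   # assumed next class, for illustration only

print(class_boundaries(*first))    # (23.5, 30.5)
print(class_width(first, second))  # 7
```

The width is 7, not 30 - 24 = 6: subtracting the limits of the same class undercounts by one unit, which is why the definition uses the limits of two consecutive classes.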
There should be between 5 and 20 classes.
The class width should be an odd number.
The classes must be mutually exclusive.
The classes must be continuous.
The classes must be exhaustive.
The classes must be equal in width.

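Find the highest and lowest value.
Find the range.
Select the number of classes desired
(2^k > n; k = log n / log 2, rounded up to the next whole number, where k is
the number of classes and n is the number of data values).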

Find the width by dividing the range by the number of classes and rounding
up.
Select a starting point (usually the lowest value); add the width to get the
lower limits.
Find the upper class limits.
Find the boundaries.
Tally the data, find the frequencies, and find the cumulative frequency.
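The steps above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the data are whole numbers, the sample values are hypothetical, the function name is made up, and the width is computed as `range // k + 1` so that the highest value always falls inside the last class (this matches "divide and round up" whenever the division leaves a remainder). It does not enforce the odd-width guideline.

```python
def grouped_frequency_distribution(data, num_classes=None):
    # Steps 1-2: highest value, lowest value, range.
    low, high = min(data), max(data)
    data_range = high - low

    # Step 3: number of classes; by default the smallest k with 2**k > n.
    if num_classes is None:
        num_classes = 1
        while 2 ** num_classes <= len(data):
            num_classes += 1

    # Step 4: class width (assumption: always round up to the next integer).
    width = data_range // num_classes + 1

    rows, cumulative = [], 0
    for i in range(num_classes):
        lower = low + i * width            # Step 5: lower limits
        upper = lower + width - 1          # Step 6: upper limits
        boundaries = (lower - 0.5, upper + 0.5)  # Step 7: boundaries
        # Step 8: tally, frequency, cumulative frequency.
        freq = sum(1 for x in data if lower <= x <= upper)
        cumulative += freq
        rows.append({"limits": (lower, upper), "boundaries": boundaries,
                     "freq": freq, "cum_freq": cumulative})
    return rows

# Hypothetical raw data (for example, battery lifetimes in hours).
lifetimes = [24, 27, 31, 35, 38, 42, 45, 47, 52, 58]
table = grouped_frequency_distribution(lifetimes)

for row in table:
    print(row["limits"], row["boundaries"], row["freq"], row["cum_freq"])
```

With these 10 values the rule gives k = 4 classes (2^4 = 16 > 10), a range of 34, and a width of 9, so the classes are 24-32, 33-41, 42-50, and 51-59.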
The histogram is a graph that displays the data by using vertical bars of
various heights to represent the frequencies.
A frequency polygon is a graph that displays the data by using lines that
connect points plotted for frequencies at the midpoint of classes. The
frequencies represent the heights of the midpoints.
A cumulative frequency graph or ogive is a graph that represents the
cumulative frequencies for the classes in a frequency distribution.
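To make the three graphs concrete, the sketch below computes the points each one would plot, without drawing anything. The classes and frequencies are hypothetical, and plotting the ogive at the upper class boundaries is a common convention assumed here rather than something stated in these notes.

```python
from itertools import accumulate

# Hypothetical grouped distribution: class limits and frequencies.
classes = [(24, 32), (33, 41), (42, 50), (51, 59)]
freqs = [3, 2, 3, 2]

# Histogram: one bar per class, bar height = frequency,
# bar edges at the class boundaries (limits -/+ 0.5 for whole-number data).
histogram_bars = [((lo - 0.5, hi + 0.5), f) for (lo, hi), f in zip(classes, freqs)]

# Frequency polygon: points at (class midpoint, frequency), joined by lines.
midpoints = [(lo + hi) / 2 for lo, hi in classes]
polygon_points = list(zip(midpoints, freqs))

# Ogive: points at (upper class boundary, cumulative frequency).
cum_freqs = list(accumulate(freqs))
ogive_points = [(hi + 0.5, cf) for (lo, hi), cf in zip(classes, cum_freqs)]

print(polygon_points)
print(ogive_points)
```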

Pareto charts - A Pareto chart is used to represent a frequency distribution
for a categorical variable.
Time series graph - A time series graph represents data that occur over a
specific period of time.
Pie graph - A pie graph is a circle that is divided into sections or wedges
according to the percentage of frequencies in each category of the
distribution.
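As a rough sketch of the arithmetic behind two of these graphs: a pie graph needs each category's share of the total (as a percentage, or as degrees out of 360), and a Pareto chart is conventionally drawn with the bars arranged from highest to lowest frequency. The category counts and function names below are hypothetical.

```python
def pie_sections(freqs):
    # For each category: (percent of total, degrees of the circle).
    total = sum(freqs.values())
    return {cat: (100 * f / total, 360 * f / total) for cat, f in freqs.items()}

def pareto_order(freqs):
    # Categories sorted from highest to lowest frequency.
    return sorted(freqs.items(), key=lambda item: item[1], reverse=True)

# Hypothetical categorical frequency distribution.
counts = {"A": 5, "B": 7, "AB": 4, "O": 9}

sections = pie_sections(counts)
ordered = pareto_order(counts)

for cat, (pct, deg) in sections.items():
    print(f"{cat:>2}: {pct:.0f}%  {deg:.1f} degrees")
print([cat for cat, _ in ordered])
```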

CHAPTER 3

A statistic is a characteristic or measure obtained by using the data values
from a sample.
A parameter is a characteristic or measure obtained by using the data
values from a specific population.
The mean is defined to be the sum of the data values divided by the total
number of values.
We will compute two means: one for the sample and one for a finite
population of values.
The mean, in most cases, is not an actual data value.
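A short sketch of the computation: the sample mean and the population mean use the same arithmetic (sum of the values divided by how many there are); they differ only in whether the values come from a sample or from the whole population. The data below are hypothetical.

```python
def mean(values):
    # Sum of the data values divided by the total number of values.
    return sum(values) / len(values)

sample = [2, 3, 5, 8]   # hypothetical sample data
sample_mean = mean(sample)

print(sample_mean)             # 4.5
print(sample_mean in sample)   # False
```

Here the mean is 4.5, which is not one of the four data values.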
