
Defining a Common Language

In the previous examples we mostly considered problems associated with
questions that measure opinion. We need to discern what we want to measure
and how we want to measure it in a wider variety of circumstances. We must
also think carefully about the properties of the measurements we gather.

Data are a collection of pieces of information. Each specific piece of
information is called an observation. The observations are measurements of
certain characteristics, which we call "variables." The word "variable" is used
because the observations vary from one person to the next.

Figure 2.5. Types of Data

Example 2.11. Variables

Consider the following variables:

Table 2.2. Classification of Variables

Number  Variable                                    Type of Variable

1       Which are you? Near-sighted, far-sighted,   Categorical
        neither
2       What is your height?                        Measurement and Continuous
3       How many phone calls did you make           Measurement and Discrete
        yesterday on a cell phone?
4       What is your cholesterol level?             Measurement and Continuous

Hopefully, you find the classification of the first three variables easy to
understand.

Variable #1 is a categorical variable because the possible choices are "words"
or "categories."

Variable #2 is a measurement variable because the possible choices are
"numbers." This variable is also called a continuous variable because it can
assume a range of values on a continuum. You need an instrument, such as a
tape measure or a ruler, to determine height. With measurement variables that
are continuous, it is often necessary to use an instrument to determine the value
of the variable. Measurement variables that are continuous can be subdivided
into fractional parts (subdivided into smaller and smaller units of measurement).
Typically, a continuous measurement variable is expressed as "an amount of"
something.

Variable #3 is a measurement variable because the possible choices are
numbers. It is also a discrete variable because one can simply count the number
of phone calls made on a cell phone in any given day. The possible numbers are
only integers such as 0, 1, 2, ..., 50, etc. (Some of you probably make a lot of
cell phone calls.) Discrete measurement variables cannot be subdivided into
smaller and smaller fractional parts (smaller and smaller units of measurement).
Often, a discrete measurement variable is expressed as "a number of"
something.

Variable #4 is somewhat ambiguous. What the variable measures (cholesterol
level) can be expressed on a continuum of possible values, but subjects are
likely to round off or only know their levels as a discrete value.
Cholesterol levels must be determined by a blood test where an instrument is
used to determine the final value. The reported value represents the
concentration of cholesterol in the blood. The appropriate units are milligrams
per deciliter (mg/dL). What typically happens is that the value of the cholesterol
level is rounded to the nearest whole number. Consequently, the cholesterol
level might look like a discrete variable, but the raw values are continuous and,
since the amount of "discreteness" is not great, a variable like this would be
treated the same way as a continuous variable in any analyses.
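
To illustrate the point about cholesterol, the following minimal Python sketch
rounds a few continuous readings to whole numbers. The readings are made-up
illustrative values, not data from the text, and the names `raw_levels`,
`reported`, and `max_error` are hypothetical.

```python
# Hypothetical raw cholesterol concentrations in mg/dL (illustrative values only).
raw_levels = [182.4, 199.7, 210.2, 175.9]

# Reported values are rounded to the nearest whole number, so they look discrete.
reported = [round(level) for level in raw_levels]

# Rounding to the nearest whole number changes each value by at most half a unit.
max_error = max(abs(r - x) for r, x in zip(reported, raw_levels))

print(reported)
print(max_error <= 0.5)
```

Because the rounding changes each value by no more than half a unit on a scale
of a few hundred, treating the reported values as continuous loses very little.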

Example 2.12. Best Way to Determine Heart Rate

Consider an experiment where heart rate (heart beats/minute) is measured by
three different methods.

Method 1: Count heart beats for 6 seconds & multiply by 10 to get heart
beats/minute

Method 2: Count heart beats for 30 seconds & multiply by 2 to get heart
beats/minute

Method 3: Count heart beats for 60 seconds

We collected six measurements on an individual for each of the three methods.
These results are found in Table 2.3.

Table 2.3. Results from the Heart Rate Experiment

Method  Six Results             Heart Rate              Minimum and Maximum  Average
                                (Heart Beats/Minute)    Heart Rate           Heart Rate

1       7, 7, 7, 7, 7, 7        70, 70, 70, 70, 70, 70  70, 70               70
2       36, 35, 37, 38, 37, 37  72, 70, 74, 76, 74, 74  70, 76               73
3       73, 76, 74, 75, 74, 75  73, 76, 74, 75, 74, 75  73, 76               74.5
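
The derived columns of Table 2.3 can be reproduced from the raw counts. The
following is a minimal Python sketch; the names `counts`, `summarize`, and
`summary` are hypothetical and chosen only for illustration. Note that the exact
mean for method 2 is about 73.3, which the table appears to report rounded to 73.

```python
from statistics import mean

# Raw counts from Table 2.3, paired with the multiplier each method uses
# to convert a count into beats per minute.
counts = {
    1: ([7, 7, 7, 7, 7, 7], 10),       # 6-second count x 10
    2: ([36, 35, 37, 38, 37, 37], 2),  # 30-second count x 2
    3: ([73, 76, 74, 75, 74, 75], 1),  # 60-second count, no scaling
}


def summarize(raw, multiplier):
    """Convert raw counts to beats/minute and summarize them."""
    rates = [count * multiplier for count in raw]
    return {
        "rates": rates,
        "min": min(rates),
        "max": max(rates),
        "mean": mean(rates),
    }


summary = {method: summarize(raw, k) for method, (raw, k) in counts.items()}

for method, s in summary.items():
    print(method, s["rates"], s["min"], s["max"], round(s["mean"], 1))
```

The multiplier also sets the resolution of each method: method 1 can only
produce multiples of 10, method 2 only even numbers, and method 3 any whole
number.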

In this example, we will not explore whether or not heart rate is a valid measure
of overall health and fitness. Obviously, it does provide some information about
whether or not a person may have some health problems. But by itself, it usually
does not provide a complete picture. The questions that we pose now are the
following:

Question 1: Which method is the most reliable?

Question 2: Which method is the most biased?


What may surprise you is that the answer to both questions is method 1.
Method 1 is the most reliable because every time we took the measurement we
observed 7 beats in 6 seconds. The results are consistent. Results from method
1 are also the most biased because the method consistently underestimates the
individual's true heart rate. If you look at the results from method 3, which is
really the best method to determine heart rate, you find that the individual's
average heart rate is 74.5 beats/minute. The results from method 1 always fell
below this value. This means that even though method 1 is reliable, it can still
have other problems, which in this case is bias.
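
One way to put numbers on these two ideas is sketched below in Python. It makes
two assumptions that are not stated in the text: reliability is summarized by
the standard deviation of the six results (smaller means more consistent), and
bias is measured against the method 3 average of 74.5 beats/minute, used as a
stand-in for the true heart rate. The names `rates`, `reference`, `spread`, and
`bias` are hypothetical.

```python
from statistics import mean, pstdev

# Heart rates in beats/minute from Table 2.3.
rates = {
    1: [70, 70, 70, 70, 70, 70],
    2: [72, 70, 74, 76, 74, 74],
    3: [73, 76, 74, 75, 74, 75],
}

# Assumption: the method 3 average stands in for the true heart rate.
reference = mean(rates[3])

# Reliability: spread of repeated measurements (0 means perfectly consistent).
spread = {method: pstdev(values) for method, values in rates.items()}

# Bias: how far the method's average sits from the reference value.
bias = {method: mean(values) - reference for method, values in rates.items()}

most_reliable = min(spread, key=spread.get)
most_biased = max(bias, key=lambda method: abs(bias[method]))

print(most_reliable, most_biased)
```

Under these assumptions method 1 has zero spread and the largest bias (4.5
beats/minute below the reference), which matches the answers given above.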

Example 2.13. Bias versus Reliability

Suppose you are interested in knowing whether the average price of homes in a
certain county has gone up or down this year in comparison with last year.
Would you be more interested in a measure with low bias or in a reliable
measure?

Ideally you would like the measure to be both unbiased and reliable. However, a
reliable measure that is biased can still often provide meaningful
information. Since the goal is to compare the average price of homes over two
years, the measure must be reliable. So, even if the measure is biased, the
amount of change from one year to the next may be sufficient information to
make a comparison, because a bias that is roughly the same in both years largely
cancels when one year's value is subtracted from the other's.
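
The argument can be checked with a small Python sketch. All prices and the size
of the bias are made-up illustrative numbers, not data from the text, and the
sketch assumes the bias is exactly the same in both years.

```python
# Hypothetical true average home prices (illustrative values only).
true_last_year = 200_000
true_this_year = 210_000

# Assumption: the measure is off by the same fixed amount every year.
bias = -15_000

measured_last_year = true_last_year + bias
measured_this_year = true_this_year + bias

true_change = true_this_year - true_last_year
measured_change = measured_this_year - measured_last_year

# Each measured level is wrong, but the year-to-year change is recovered exactly.
print(measured_last_year, measured_this_year)
print(true_change, measured_change)
```

If the bias differed from year to year, or if the measure were unreliable, the
measured change would no longer match the true change.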
