Stat Lecture 1

BIOSTATISTICS (L-1)
Mohey Elmazar
Professor of Pharmacology & Toxicology
Dean of Pharmacy (BUE)
THE BRITISH UNIVERSITY IN EGYPT (BUE)

FACULTY OF PHARMACY
Biostatistics - Lecture 1
OVERVIEW
Statistics is the science of collecting, summarizing, presenting and
analyzing data. This analysis may lead to conclusions and subsequent
decisions.
Biostatistics is the use of statistical techniques pertaining to
biological sciences, including medical and pharmaceutical fields.
Classification
Statistics can be broadly classified into:
Descriptive Statistics: organizing data in tabular, diagrammatic or
numerical forms.
Inferential Statistics: concerned with drawing conclusions from data that
will influence subsequent decisions.
VARIABLES & DATA
Variables: a variable is something whose value can vary. Example: age, sex,
blood type.
Data: data are the values you get when you measure a variable.
Example: 32 years (for the age variable) & female (for the sex variable).
Variable       Mrs Brown    Mr Patel
Age            32           20
Sex            Female       Male
Blood group    O            A
In this table the row labels (age, sex, blood group) are the variables and
the entries under each patient are the data.
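The distinction between variables and data can be sketched in a few lines of Python. This is only an illustration using the two patients from the table above; the dictionary layout and the key names (age, sex, blood_group) are assumptions made for the example.

```python
# Each key inside a record is a variable; each value is the datum
# recorded for that patient (values taken from the table above).
patients = {
    "Mrs Brown": {"age": 32, "sex": "Female", "blood_group": "O"},
    "Mr Patel": {"age": 20, "sex": "Male", "blood_group": "A"},
}

# The variables are the same for every patient; only the data differ.
variables = sorted({name for record in patients.values() for name in record})

for variable in variables:
    values = [record[variable] for record in patients.values()]
    print(variable, "->", values)
```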
Types of Variables
Categorical Variables (Non-parametric)
  Nominal: values in arbitrary categories (no units)
  Ordinal: values in ordered categories (no units)
Metric Variables (Parametric)
  Discrete: integer values on a proper numeric line or scale (counted units)
  Continuous: continuous values on a proper numeric line or scale
  (measured units)
I- CATEGORICAL VARIABLES (NON-PARAMETRIC)

a- Nominal Categorical Variables
Consider the variable blood type. Let's assume for simplicity that there are only
four different blood types: O, A, B, and A/B. Suppose we have a group of 100
patients. We can first determine the blood type of each and then allocate the result
to one of the four blood type categories.
Blood Type    No. of Patients (or Frequency)
O             65
A             15
B             12
A/B           8
Table 1.1: Categorical nominal variable
Table 1.1 is called a frequency table. (The term contingency table is
usually reserved for tables that cross-classify two or more variables.) It
shows how the number, or frequency, of the different blood types is
distributed across the four categories.
So 65 patients have blood type O, 15 have blood type A, and so on.
The variable blood type is a nominal categorical variable.
Notice two things about this variable, which are typical of all nominal
variables:
1- The data DO NOT have any units of measurement.
2- The ordering of the categories is completely arbitrary. In other words,
the categories cannot be ordered in any meaningful way.
We could just as easily write the blood type categories as
A/B, B, O, A or B, O, A, A/B, or B, A, A/B, O, or in any other order.
We CANNOT say that being in any particular category is better, or
shorter, or quicker, or longer, than being in any other category.
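As a minimal sketch of how a frequency table such as Table 1.1 arises, the Python snippet below counts how many patients fall into each blood type category. The list of individual observations is hypothetical: it is simply rebuilt from the counts in Table 1.1, since the handout does not list the patients one by one.

```python
from collections import Counter

# Hypothetical raw data: one blood type per patient (100 patients),
# reconstructed from the counts in Table 1.1.
observations = ["O"] * 65 + ["A"] * 15 + ["B"] * 12 + ["A/B"] * 8

# Allocate each patient to a category and count the category sizes.
frequency = Counter(observations)

# For a nominal variable the category order is arbitrary, so listing the
# categories in a different order gives an equally valid table.
for blood_type in ["A/B", "B", "O", "A"]:
    print(f"{blood_type:<4} {frequency[blood_type]}")

total = sum(frequency.values())
print("Total", total)
```

Only counting is used here; no arithmetic is performed on the category labels themselves.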
b- Ordinal Categorical Variables

Let's consider another variable, the Glasgow Coma Scale, or GCS for
short.
This scale measures the degree of brain injury following head trauma.
A patient's GCS score is judged by their responsiveness, as observed
by a clinician, in three areas: eye opening response, verbal response
and motor response.
The GCS score can vary from 3 (death or severe injury) to 15 (mild or no
injury). In other words, there are 13 possible values or categories of brain
injury.
Imagine that we determine the Glasgow Coma Scale (GCS) scores of the
last 90 patients admitted to an Emergency Department with head trauma,
and we allocate the score of each patient to one of the 13 categories. The
results might look like the frequency table shown in Table 1.2.
GCS Score    No. of Patients
3 8
4 1
5 6
6 5
7 5
8 7
9 6
10 8
11 8
12 10
13 12
14 9
15 5
Table 1.2: Categorical ordinal variable
The Glasgow Coma Scale is an ordinal categorical variable.
An ordinal categorical variable is characterized by the following:
1- The data DO NOT have any units of measurement (same as
nominal variables).
2- The ordering of the categories is NOT arbitrary as it was with
nominal variables. This means that it's possible to order the
categories in a meaningful way. Thus, we can say that a patient in
category 15 has less brain injury than a patient in category 14.
Similarly, a patient in category 14 has less brain injury than a
patient in category 13, and so on.
3- They DO NOT have the interval property. This means that the
difference between any pair of adjacent scores is not necessarily the
same as the difference between any other pair of adjacent scores.
For example, the difference in the degree of brain injury between
Glasgow Coma Scale scores of 5 and 6, and scores of 6 and 7, is
not necessarily the same.
4- They DO NOT have the ratio property. This means that we can't say that
a patient with a score of, say, 6 has exactly twice the degree of brain
injury as a patient with a score of 12.
5- Ordinal data therefore are NOT real numbers. Thus, they cannot be
placed on a number line.
6- They are NOT properly measured but assessed in some way by
the clinician working with the patient.
7- It is NOT appropriate to apply any of the rules of basic arithmetic
operations to this sort of data. You should not add, subtract, multiply
or divide ordinal values.
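The properties above can be illustrated with the counts from Table 1.2. The sketch below is only an illustration: it uses the ordering of the GCS categories to count patients at or below a chosen score, and it adds up frequencies (numbers of patients), never the scores themselves. The function name count_at_or_below is hypothetical.

```python
# Frequencies from Table 1.2: GCS score -> number of patients.
gcs_frequency = {3: 8, 4: 1, 5: 6, 6: 5, 7: 5, 8: 7, 9: 6,
                 10: 8, 11: 8, 12: 10, 13: 12, 14: 9, 15: 5}

# Adding frequencies is legitimate: they are counts of patients.
total_patients = sum(gcs_frequency.values())


def count_at_or_below(score):
    """Number of patients whose GCS category is `score` or lower.

    Only the ORDER of the categories is used (category <= score);
    the scores are never added, subtracted, multiplied or divided.
    """
    return sum(n for category, n in gcs_frequency.items() if category <= score)


print(total_patients)        # 90
print(count_at_or_below(8))  # 32
```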
II- METRIC VARIABLES (PARAMETRIC)

a- Continuous Metric Variables
Look at Table 1.3, which shows the weight in kg (rounded to two
decimal places) of six individuals.
Patient        Weight (kg)
Ms V Wood      68.25
Mr P Green     80.63
Mr S Laken     75.00
Mrs B Noble    72.21
Ms G Taylor    73.44
Ms J Taylor    76.98
Table 1.3: Metric continuous variable
The variable weight is a metric continuous variable.
With metric variables, proper measurement is possible. For example, if
we want to know someone's weight, we can use a weighing machine; we
don't have to look at the patient and make a guess (which would be
approximate), or ask them how heavy they are (very unreliable).
Similarly, if we want to know their diastolic blood pressure we can use a
sphygmomanometer. Guessing, or asking, is not necessary.
Because they can be properly measured, these variables produce data
that are real numbers, and so can be placed on the number line.
Some common examples of metric continuous variables include: birth
weight (g), blood pressure (mmHg), blood cholesterol (g/ml), body mass
index (kg/m2), etc.
Notice that all of these variables have units of measurement. This is a
characteristic of all metric continuous variables.
In contrast to ordinal values, the difference between any pair of adjacent
values is exactly the same (interval property). The difference between
birth weights of 4000 g and 4001 g is the same as the difference between
4001 g and 4002 g.
Moreover, a blood cholesterol level of, for example, 8 g/ml is exactly
twice a blood cholesterol level of 4 g/ml (ratio property).
Thus, metric continuous variables are characterized by the following:
1- They can be properly measured and have units of measurement.
2- They produce data that are real numbers (located on the number
line).
3- Because metric data values are real numbers, you can apply all of
the usual mathematical operations to them. This opens up a much
wider range of analytical possibilities than is possible with either
nominal or ordinal data.
4- They have interval & ratio properties.
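Because metric continuous data are real numbers, ordinary arithmetic is meaningful. As a small illustration (not part of the original lecture), the sketch below applies a difference, a ratio and an average to the weights in Table 1.3. The variable names are chosen for the example only.

```python
# Weights in kg from Table 1.3.
weights_kg = {
    "Ms V Wood": 68.25,
    "Mr P Green": 80.63,
    "Mr S Laken": 75.00,
    "Mrs B Noble": 72.21,
    "Ms G Taylor": 73.44,
    "Ms J Taylor": 76.98,
}

# Interval property: differences between values are meaningful (in kg).
difference = weights_kg["Mr P Green"] - weights_kg["Ms V Wood"]

# Ratio property: one weight can be expressed as a multiple of another.
ratio = weights_kg["Mr P Green"] / weights_kg["Ms V Wood"]

# Adding and dividing are allowed, so an average weight can be computed.
mean_weight = sum(weights_kg.values()) / len(weights_kg)

print(f"Difference: {difference:.2f} kg")
print(f"Ratio: {ratio:.3f}")
print(f"Mean weight: {mean_weight:.2f} kg")
```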
b- Discrete Metric Variables
Discrete metric data usually come from counting.
For example: number of deaths, number of pressure sores, number of
angina attacks, etc. These are all discrete metric variables.
The data produced are real numbers, and are invariably integers (i.e.
whole numbers). They can be placed on the number line, and have the
same interval and ratio properties as continuous metric data.
Thus, metric discrete variables are characterized by the following:
1- They can be properly counted and have units of measurement
("numbers of things").
2- They produce data which are real numbers located on the number
line.
3- Mathematical operations can be applied.
4- They have interval & ratio properties.
Example: The number of times that a group of children with asthma used
their inhalers in the past 24 hours was determined and the data are
displayed in Table 1.4.
Patient    No. of times inhaler used in the past 24 hrs
Tim        1
Jane       2
Susie      6
Barbara    6
Peter      7
Gill       8
Table 1.4: Metric discrete variable
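The counts in Table 1.4 can be used to illustrate the properties of metric discrete data. The following sketch is an illustration only: it checks that the values are whole numbers and then applies ordinary arithmetic to them. The variable names are assumptions made for the example.

```python
# Inhaler uses in the past 24 hours, from Table 1.4.
inhaler_uses = {"Tim": 1, "Jane": 2, "Susie": 6,
                "Barbara": 6, "Peter": 7, "Gill": 8}

# Discrete metric data come from counting, so every value is an integer.
all_integers = all(isinstance(n, int) for n in inhaler_uses.values())

# Mathematical operations can be applied to counts.
total_uses = sum(inhaler_uses.values())
mean_uses = total_uses / len(inhaler_uses)

# Ratio property: Gill used her inhaler four times as often as Jane.
gill_vs_jane = inhaler_uses["Gill"] / inhaler_uses["Jane"]

print(all_integers, total_uses, mean_uses, gill_vs_jane)
```

Note that the average of integer counts need not itself be an integer in general, even though each individual value is.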
Categorical Variables (Non-parametric)       Metric Variables (Parametric)
No interval property                         Interval property
No ratio property                            Ratio property
Data are NOT REAL numbers                    Data are REAL numbers
CANNOT be placed on a number line            CAN be placed on a number line
Mathematical operations CANNOT be applied    Mathematical operations CAN be applied
Have NO units of measurement                 Have units of measurement
Table 1.5: Comparison between categorical & metric variables
Identifying the Type of Variable

The following algorithm is used in order to identify the type of variable
we are dealing with.
N.B. The easiest way to tell whether data are metric is to check whether they have units
attached to them, such as: g, mm, °C, g/cm3, number of pressure sores, number of
deaths, and so on. If not, they may be ordinal or nominal. They are ordinal if the values
can be put in a meaningful order, and nominal otherwise.
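The decision steps described in the N.B. can be sketched as a small Python function. This is a hedged reconstruction from the definitions in this lecture, not a copy of the original chart: the function name classify_variable, its parameters, and the order in which the questions are asked are all assumptions made for illustration.

```python
def classify_variable(has_units, counted=False, ordered=False):
    """Return the type of a variable from three yes/no questions.

    has_units: do the data have units (including "numbers of things")?
    counted:   for data with units, do they come from counting
               rather than measuring?
    ordered:   for data without units, can the categories be put in
               a meaningful order?
    """
    if has_units:
        # Metric data: counted -> discrete, measured -> continuous.
        return "metric discrete" if counted else "metric continuous"
    # Categorical data: ordered -> ordinal, otherwise nominal.
    return "ordinal" if ordered else "nominal"


# Examples taken from this lecture.
print(classify_variable(has_units=False, ordered=False))  # blood type
print(classify_variable(has_units=False, ordered=True))   # Glasgow Coma Scale
print(classify_variable(has_units=True, counted=False))   # weight (kg)
print(classify_variable(has_units=True, counted=True))    # inhaler uses
```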
EXERCISES
Exercise 1:
Four migraine patients are asked to assess the severity of their migraine pain
one hour after the first symptoms of an attack, by marking a point on a
horizontal line, 100 mm long. The line is marked "no pain" at the left-hand end
& "worst possible pain" at the right-hand end. The distance of each patient's
mark from the left-hand end is subsequently measured with a mm ruler and
their scores are 25 mm, 44 mm, 68 mm & 85 mm.
1- What sort of data is this?
2- Can you calculate the average pain of these 4 patients?
N.B. This form of measurement (using a line & getting patients to mark it) is
known as a Visual Analog Scale (VAS).
Exercise 2:
The table below contains the characteristics of cases & controls from a case
control study into stressful life events and breast cancer in women.
Identify the type of each variable in the table. Values are mean (SD) unless
stated otherwise.
Variable                                      Breast cancer group   Control group
                                              (n=106)               (n=226)
1- Age                                        61.6 (10.9)           51.0 (8.5)
2- Social class (%)
     I                                        10 (10)               20 (9)
     II                                       38 (36)               82 (36)
     III non-manual                           28 (26)               72 (32)
     III manual                               13 (12)               24 (11)
     IV                                       11 (10)               21 (9)
     V                                        3 (3)                 2 (1)
     VI                                       3 (3)                 4 (2)
3- No. of children (%)
     0                                        15 (14)               31 (14)
     1                                        16 (15)               31 (13.7)
     2                                        42 (40)               84 (37)
     3                                        32 (31)               80 (35)
4- Age at birth of first child                21.3 (5.6)            20.5 (4.3)
5- Age at menarche                            12.8 (1.4)            13.0 (1.6)
6- Menopausal state (%)
     Premenopausal                            14 (13)               66 (29)
     Perimenopausal                           9 (9)                 43 (19)
     Postmenopausal                           83 (78)               117 (52)
7- Age at menopause                           47.7 (4.5)            45.6 (5.2)
8- Lifetime use of oral contraceptives (%)    38                    61
9- No. of years taking oral contraceptives    3.0 (5.4)             4.2 (5.0)
10- No. of months breastfeeding               n=90                  n=195
                                              7.4 (9.9)             7.4 (12.1)
11- Lifetime use of hormonal therapy (%)      29 (27)               78 (35)
12- Mean years use of hormonal therapy        1.6 (3.7)             1.9 (4.0)
13- Family history of ovarian cancer (%)      8 (8)                 10 (4)
14- History of benign breast disease (%)      15 (15)               105 (47)
15- Family history of breast cancer (%)       16 (15)               35 (16)
16- Units of alcohol/week (%)
     0                                        38 (36)               59 (26)
     0-4                                      26 (25)               71 (31)
     5-9                                      20 (19)               52 (23)
     10                                       22 (21)               44 (20)
17- No. of cigarettes/day
     0                                        83 (78.3)             170 (75.2)
     1-9                                      8 (7.6)               14 (6.2)
     10                                       15 (14.2)             42 (18.6)
18- Body mass index (kg/m2)                   26.8 (5.5)            24.8 (4.2)
ANSWERS
Exercise 1:
1- VAS data are ordinal because they are subjective judgments, which are
not measured but assessed, and will probably vary from patient to patient
and moment to moment.
2- It's not possible to calculate the average, if by this is meant adding up
the 4 values & dividing by 4, BECAUSE ordinal data ARE NOT real numbers.
Mathematical operations cannot be applied to them.
Exercise 2:
1- Age MC
2- Social class O
3- No. of children (%) MD
4- Age at birth of first child MC
5- Age at menarche MC
6- Menopausal state O
7- Age at menopause MC
8- Lifetime use of oral contraceptives (%) N
9- No. of years taking oral contraceptives MC
10- No. of months breastfeeding MC
11- Lifetime use of hormonal therapy (%) N
12- Mean years use of hormonal therapy MC
13- Family history of ovarian cancer (%) N
14- History of benign breast disease (%) N
15- Family history of breast cancer (%) N
16- Units of alcohol per week (%) MD
17- No. of cigarettes per day MD
18- Body mass index (Kg/m2) MC
N.B.
(N=nominal, O=ordinal, MC=metric continuous, MD=metric discrete)
Lecture checklist:
Overview
What are variables & data?
Types of variables (with examples)
Categorical nominal
Categorical ordinal
Metric continuous
Metric discrete
Identifying the type of variable (algorithm)
Exercises & answers