(P.P. Leung)

Lecture notes are based on the following textbook:

N.A. Weiss (2012), Introductory Statistics, 9th edition, Pearson.

Chapter 1 The Nature of Statistics

1.1 Two kinds of Statistics

1.4 Other Sampling Designs

What is Statistics?

From Wikipedia, the free encyclopaedia:

Statistics is a mathematical science pertaining to the collection, analysis, interpretation

or explanation, and presentation of data. It is applicable to a wide variety of academic

disciplines, from the natural and social sciences to the humanities. Statistics is also used for

making informed decisions in government and business.

Statistical methods can be used to summarize or describe a collection of data; this is

called descriptive statistics. In addition, patterns in the data may be modeled in a way that

accounts for randomness and uncertainty in the observations, and then used to draw

inferences about the process or population being studied; this is called inferential statistics.

Both descriptive and inferential statistics comprise applied statistics. There is also a discipline

called mathematical statistics, which is concerned with the theoretical basis of the subject.

From our textbook:

) so as to present significant information about a given subject

Collecting and analyzing data for the purpose of making generalizations and decisions.

From :

1. Let data talk.

2. Quantify the uncertainties.

3. Make decisions without enough information.

Descriptive Statistics consists of methods for organizing and

summarizing information, e.g. the NBA/CBA season every year.

Inferential Statistics consists of methods for drawing and measuring the

reliability of conclusions about a population based on information obtained

from a sample of the population, e.g. the 1948 presidential election.

Technical Terms:

Population: the collection of all individuals or items under consideration in a

statistical study.

Sample: a subset (part) of the population from which information is collected.

Statistics in this course means either descriptive statistics or inferential statistics (both are

applied statistics).

Census: acquire information on the entire population of interest.

Sampling: acquire information on only part of the population of interest.

Experimentation: acquire information by conducting an experiment.

Why is sampling needed?

A survey of the whole population is usually laborious, time-consuming, expensive, frequently

impractical and sometimes impossible.

Simple Random Sampling: a sampling procedure for which each possible

sample of a given size is equally likely to be the one obtained.

Simple Random Sample: a sample obtained by simple random sampling.

Why is a random sample so important?

The sample being considered must be a representative sample, i.e. it should

reflect as closely as possible the relevant characteristics of the population.

Simple random sampling with replacement

Simple random sampling without replacement

(In this course, unless we specify otherwise, assume that simple random sampling is done

with replacement).
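
The difference between the two kinds of simple random sampling can be sketched in Python. This is only an illustration under stated assumptions: the population of ten numbered members, the sample size of 4 and the seed are hypothetical choices, not taken from the textbook. `random.sample` draws without replacement and `random.choices` draws with replacement.

```python
import random

# Hypothetical population: the labels 1..10 stand in for ten members.
population = list(range(1, 11))

# A fixed seed is used only so that the sketch is repeatable.
rng = random.Random(2012)

# Without replacement: a member can be selected at most once.
without_replacement = rng.sample(population, k=4)

# With replacement: a member can be selected more than once.
with_replacement = rng.choices(population, k=4)

print("without replacement:", without_replacement)
print("with replacement:   ", with_replacement)
```

A sample drawn without replacement never contains a repeated member; a sample drawn with replacement may or may not contain one.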

Example Simple Random Sampling P.14 Ex1.7

Sampling Oklahoma State Officials As reported by The World Almanac, the top five state

officials of Oklahoma are as shown in Table 1.2. Consider these five officials a population of

interest.

(a) List the possible samples (without replacement) of two officials from this population of

five officials.

Governor (G)

Lieutenant Governor (L)

Secretary of State (S)

Attorney General (A)

Treasurer (T)

(b) Describe a method for obtaining a simple random sample of two officials from this

population of five officials.

(c) For the sampling method described in part (b), what are the chances that any particular

sample of two officials will be the one selected?

(d) Repeat parts (a)-(c) for samples of size 4.

Solution For convenience, we represent the officials by using the letters in parentheses.

(a) Table 1.3 lists the 10 possible samples of two officials from this population of five

officials:

{(G, L), (G, S), (G, A), (G, T), (L, S), (L, A), (L, T), (S, A), (S, T), (A, T)}

(b) To obtain a simple random sample of size 2, we could write the letters that correspond to

the five officials (G, L, S, A, and T) on separate pieces of paper. After placing these five

slips of paper in a box and shaking it, we could, while blindfolded, pick two slips of

paper.

(c) The procedure described in part (b) will provide a simple random sample. Consequently,

each of the possible samples of two officials is equally likely to be the one selected. There

are 10 possible samples, so the chances are 1/10 (1 in 10) that any particular sample of

two officials will be the one selected.

(d) There are five possible samples of four officials from this population of five officials:

{(G, L, S, A), (G, L, S, T), (G, L, A, T), (G, S, A, T), (L, S, A, T)}

A simple random sampling procedure, such as picking four slips of paper out of a box, gives

each of these samples a 1 in 5 chance of being the one selected.
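
The counts in this example can be checked with a short Python sketch. It is not part of the textbook solution: `itertools.combinations` lists the possible samples without replacement, and `random.sample` plays the role of the blindfolded draw from the box. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import combinations
import random

# The five officials, using the letters from the example.
officials = ["G", "L", "S", "A", "T"]

# Parts (a) and (d): every possible sample of size 2 and of size 4.
samples_of_2 = list(combinations(officials, 2))
samples_of_4 = list(combinations(officials, 4))

# Part (c): each sample is equally likely, so the chance is 1 / (number of samples).
chance_2 = Fraction(1, len(samples_of_2))
chance_4 = Fraction(1, len(samples_of_4))

print(len(samples_of_2), "samples of size 2, each with chance", chance_2)
print(len(samples_of_4), "samples of size 4, each with chance", chance_4)

# Part (b): an electronic version of drawing two slips of paper from a box.
drawn = random.sample(officials, 2)
print("sample drawn:", drawn)
```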

A random-number table can be used to obtain random numbers; see P.15 and A5.

1.4 Other Sampling Designs

Why do we need other sampling designs?

Simple random sampling is relatively laborious, time-consuming, costly and sometimes

impractical. With limited resources, it is necessary to look for other sampling designs.

The rule of thumb is that the sample obtained should be as close as possible to a simple random

sample.

Systematic Random Sampling

S1. Divide the population size by the sample size and round the result down to the nearest whole number, m.

S2. Use a random-number table or a similar device to obtain a number, k, between 1 and m.

S3. Select for the sample those members of the population that are numbered k, k+m, k+2m, ...
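
Steps S1 to S3 can be sketched in Python as follows. The function name `systematic_sample` and the population of 100 numbered members are hypothetical. The sketch also assumes that selection stops once the requested sample size is reached, which matters when the population size is not an exact multiple of the sample size.

```python
import random

def systematic_sample(population, n, rng=random):
    """Systematic random sample of size n from a list of population members."""
    if not 1 <= n <= len(population):
        raise ValueError("sample size must be between 1 and the population size")
    # S1: population size divided by sample size, rounded down.
    m = len(population) // n
    # S2: a random starting number k between 1 and m (inclusive).
    k = rng.randint(1, m)
    # S3: members numbered k, k+m, k+2m, ... (numbering starts at 1).
    # Assumption: stop after n members have been selected.
    return [population[i - 1] for i in range(k, k + n * m, m)]

# Hypothetical population of 100 members numbered 1 to 100.
members = list(range(1, 101))
print(systematic_sample(members, 10))
```

Once k is known, the whole sample is determined; only the starting point is random.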

Cluster Sampling

S1. Divide the population into groups (clusters).

S2. Obtain a simple random sample of the clusters.

S3. Use all the members of the clusters obtained in step 2 as the sample.
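
A minimal Python sketch of the three steps above, assuming the division into clusters (S1) has already been done and is stored in a dictionary. The function name `cluster_sample` and the dormitory-floor data are made up for illustration.

```python
import random

def cluster_sample(clusters, num_clusters, rng=random):
    """Cluster sample: clusters maps a cluster label to its list of members."""
    # S2: simple random sample (without replacement) of the cluster labels.
    chosen = rng.sample(list(clusters), num_clusters)
    # S3: every member of each chosen cluster goes into the sample.
    return [member for label in chosen for member in clusters[label]]

# S1 (hypothetical data): students grouped by dormitory floor.
dorm_floors = {
    "floor 1": ["Ann", "Bob", "Cat"],
    "floor 2": ["Dan", "Eve"],
    "floor 3": ["Fay", "Gus", "Hal", "Ida"],
}
print(cluster_sample(dorm_floors, 2))
```

The sample size is not fixed in advance here, because it depends on the sizes of the clusters that happen to be chosen.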

Stratified Sampling

S1. Divide the population into subpopulations (strata).

S2. From each stratum, obtain a simple random sample of size proportional to the size of the

stratum; that is, the sample size for a stratum equals the total sample size times the stratum

size divided by the population size.

S3. Use all the members obtained in S2 as the sample.
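
The proportional allocation in S2 can be sketched in Python. The function name `stratified_sample` and the class-year strata are hypothetical. Rounding each stratum's sample size to the nearest whole number is an assumption of this sketch, so in general the total may differ slightly from the requested sample size.

```python
import random

def stratified_sample(strata, total_n, rng=random):
    """Stratified sample with sample sizes proportional to stratum sizes."""
    population_size = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        # S2: total sample size times stratum size divided by population size.
        # Assumption: round to the nearest whole number.
        n_stratum = round(total_n * len(members) / population_size)
        # Simple random sample (without replacement) within the stratum.
        sample.extend(rng.sample(members, n_stratum))
    # S3: all members obtained in S2 form the sample.
    return sample

# S1 (hypothetical data): 100 students split into strata of sizes 50, 30 and 20.
strata = {
    "first year": list(range(0, 50)),
    "second year": list(range(50, 80)),
    "third year": list(range(80, 100)),
}
print(stratified_sample(strata, 10))
```

With these stratum sizes and a total sample size of 10, the strata contribute 5, 3 and 2 members respectively.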

Multistage Sampling

Most large-scale surveys combine one or more of simple random sampling, systematic

random sampling, cluster sampling and stratified sampling.

Review Problems

Understanding the Concepts and Skills

a. a descriptive study.

b. an inferential study.

2. Almost any inferential study involves aspects of descriptive statistics. Explain why.

3. Baseball Scores On September 3, 2005, the following baseball scores were printed in The

Daily Courier. Is this study descriptive or inferential? Explain your answer.

Major League Baseball

Giants 6, D'backs 3

Cubs 7, Pirates 3

Marlins 4, Mets 2

Phillies 7, Nationals 1

Braves 7, Reds 4

Brewers 12, Padres 2

Astros 6, Cardinals 5

Rockies 11, Dodgers 3

Orioles 7, Red Sox 3

White Sox 9, Tigers 1

Indians 6, Twins 1

Rangers 8, Royals 7

A's 12, Yankees 0

Angels 4, Mariners 1

4. Serious Energy Situation. In a USA TODAY/CNN Gallup Poll, 94% of those surveyed

said that the United States faced a serious energy situation, but, by 47% to 35%, they

preferred an emphasis on conservation rather than on more production. Is this study

descriptive or inferential? Explain your answer.

5. British Backpacker Tourists. Research by Gustav Visser and Charles Barker in "A Geography

of British Backpacker Tourists in South Africa" (Geography, Vol. 89, No. 3, pp. 226) reflects

on the impact of British backpacker tourists visiting South Africa. A sample of

British backpackers was interviewed. The information obtained from the sample was used

to construct the following table for the age distribution of all British backpackers. Classify

this study as descriptive or inferential, and explain your answer.

Age (yrs)       Percentage
Less than 21    9%
21-25           46%
26-30           27%
31-35           10%
36-40           4%
Over 40         4%

6. Teen Drug Abuse. In an article dated April 24, 2005, USA TODAY reported on the 17th

annual study on teen drug abuse, conducted by the Partnership for a Drug-Free America.

According to the survey of 7300 teens, the most popular prescription drug abused by teens

was Vicodin, with 18%, or about 4.3 million youths, reporting that they had used it to

get high. OxyContin and drugs for attention deficit disorder, such as Ritalin/Adderall,

followed with one in 10 teens reporting that they had tried them. Answer the following

questions and explain your answers.

a. Is the statement about 18% of youths abusing Vicodin inferential or descriptive?

b. Is the statement about 4.3 million youths abusing Vicodin inferential or descriptive?

7. Regarding observational studies and designed experiments:

a. Describe each type of statistical study.

b. With respect to possible conclusions, what important difference exists between these

two types of statistical studies?

8. Persistent Poverty and IQ. An article appearing in an issue of The Arizona Republic

reported on a study conducted by Greg Duncan of the University of Michigan. According

to the report, "Persistent poverty during the first 5 years of life leaves children with IQs 9.1

points lower at age 5 than children who suffer no poverty during that period...." Is this

statistical study an observational study or is it a designed experiment? Explain your

answer.

9. Wasp Hierarchical Status. In the February 2005 issue of Discover (Vol. 26, No. 2, pp. 10-11),

Jesse Netting describes the research of Elizabeth Tibbetts of the University of Arizona in

the article, "The Kind of Face Only a Wasp Could Trust." Tibbetts found that wasps signal

their strength and status with the number of black splotches on their yellow faces, with more

splotches denoting higher status. Tibbetts decided to see if she could cheat the system. She

painted some of the insects' faces to make their status appear higher or lower than it really

was. She then placed the painted wasps with a group of female wasps to see if painting the

faces altered their hierarchical status. Was this investigation an observational study or a

designed experiment? Justify your answer.

10. Before planning and conducting a study to obtain information, what should be done?

* * * * * End of Chapter 1 * * * * *

