
A focus on Sampling and Sampling Methods

Menu
- Sampling Methods
- Measures of Centre
- Measures of Spread
- On Your Calculator
- Definitions
- Assessment Tips
- Practice Tasks

For clarification, click on any step you do not understand to see that element broken down.

The example used throughout this presentation is finding the mean height of WBHS pupils.

Sampling Methods
In this presentation you will see a number of sampling methods, their benefits and drawbacks.
- Simple Random Sample
- Cluster Sampling
- Systematic Sampling
- Stratified Sampling

Note: For more detailed instructions on any of the examples, click on the step you do not understand.

Measures of Central Tendency


In this presentation you will learn how to calculate a number of measures of average or centre, as well as their benefits and drawbacks
- Mean
- Median
- Mode

Note: For more detailed instructions on any of the examples, click on the step you do not understand.

Measures of Spread
In this presentation you will learn how to find a number of measures of spread as well as their drawbacks and advantages. You will also need to decide which measure of spread and which measure of centre go together.
- Standard Deviation
- Interquartile Range
- Range

Note: For more detailed instructions on any of the examples, click on the step you do not understand.

Simple Random Sample


The simplest unbiased sample.
1. Number the entire population.
2. Generate random numbers.
3. Proceed until you have as many as you need, ignoring any repeats.

Example (Heights of WBHS students)
1. Get a copy of the School Roll.
2. Number every person.
3. Generate random numbers from 1 to the maximum you need.
4. Proceed until you have the desired sample size, ignoring repeats.
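The steps for a simple random sample can be sketched in Python. This is only an illustration: the function name `simple_random_sample` and the made-up roll of 529 numbered students are assumptions, not part of the presentation.

```python
import random

def simple_random_sample(roll, sample_size, rng=random):
    """Pick sample_size distinct people from roll by drawing random numbers."""
    if sample_size > len(roll):
        raise ValueError("sample size cannot exceed the population size")
    chosen_numbers = []
    while len(chosen_numbers) < sample_size:
        number = rng.randint(1, len(roll))   # the roll is numbered from 1
        if number not in chosen_numbers:     # ignore any repeats
            chosen_numbers.append(number)
    return [roll[number - 1] for number in chosen_numbers]

# Hypothetical roll of 529 students, labelled by roll number.
school_roll = [f"Student {i}" for i in range(1, 530)]
sample = simple_random_sample(school_roll, 30, random.Random(1))
```

Passing a seeded `random.Random` makes the draw repeatable for checking; leaving it out gives a fresh sample each time.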

Simple Random Sample


Advantages
- Cheap
- Easy to carry out
- Unbiased

Disadvantages
- May not represent strata
- Needs an entire population list

Cluster Sampling
The easiest unbiased sample.
1. Sort your data into clusters based on location.
2. Randomly choose the cluster.
3. Perform a simple random sample on the chosen cluster.

Example (Heights of WBHS students)
1. Get a copy of the School Roll.
2. Sort into clusters, e.g. year levels.
3. Randomly select the cluster.
4. Randomly generate a sample from the chosen cluster.

Take care with clusters, as Juniors are much shorter than Seniors.
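A minimal Python sketch of cluster sampling follows. The year-level names and sizes are hypothetical (chosen only so that they add up to 529), and `cluster_sample` is an illustrative name.

```python
import random

def cluster_sample(clusters, sample_size, rng=random):
    """Randomly choose one cluster, then take a simple random sample from it."""
    cluster_name = rng.choice(sorted(clusters))
    members = clusters[cluster_name]
    return cluster_name, rng.sample(members, sample_size)

# Hypothetical year-level sizes that add up to 529 students.
year_sizes = {"Year 9": 120, "Year 10": 115, "Year 11": 110,
              "Year 12": 100, "Year 13": 84}
clusters = {year: [f"{year} student {i}" for i in range(1, size + 1)]
            for year, size in year_sizes.items()}
chosen_year, sample = cluster_sample(clusters, 30, random.Random(2))
```

Every student in the sample comes from the single chosen year level, which is why the choice of clusters matters when the clusters differ in height.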

Cluster Sampling
Advantages
- Very cheap
- Very easy to carry out
- Unbiased

Disadvantages
- Needs an entire population list
- Can be biased if clusters strongly affect the statistics

Systematic Sampling
A relatively quick way to pick an unbiased sample.
1. List the entire population.
2. Decide on your step size (Total ÷ Sample size = n).
3. Randomly generate a starting point.
4. Step every nth data point until you have your sample.

Example (Heights of WBHS students)
1. Get an alphabetical copy of the School Roll.
2. Step size = Total ÷ Sample size.
3. Randomly generate a starting point.
4. Starting from that point, use the step size to pick the rest of the sample.
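As a rough sketch, systematic sampling could look like this in Python. The roll is a made-up list of 529 labels, `systematic_sample` is an illustrative name, and rounding the step size down is an assumption about how a fractional step is handled.

```python
import random

def systematic_sample(roll, sample_size, rng=random):
    """Take every step-th person from a random start, wrapping round the list."""
    step = len(roll) // sample_size      # Total ÷ Sample size, rounded down
    start = rng.randrange(len(roll))     # random starting index (0-based)
    return [roll[(start + i * step) % len(roll)] for i in range(sample_size)]

# Hypothetical roll of 529 students, labelled by roll number.
school_roll = [f"Student {i}" for i in range(1, 530)]
sample = systematic_sample(school_roll, 30, random.Random(3))
```

With 529 students and a sample of 30 the step is 17, so the 30 picks span 493 places and never land on the same student twice.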

Systematic Sampling
Advantages
- Cheap
- Easy to choose sample
- Unbiased

Disadvantages
- Needs an entire population list
- If the population list is ordered then the sample can become biased

Stratified Sampling
The most reliable sampling method.
1. Sort the data into strata based on information you already know.
2. Calculate the proportions for each stratum.
3. Perform a Simple Random Sample on each of the strata.

Example (Heights of WBHS students)
1. Get a copy of the School Roll separated into year levels.
2. Calculate the sample size for each year group (stratum).
3. Perform a simple random sample on each year group to its specific sample size.
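The last step, sampling each stratum to its own sample size, might be sketched as below. The two strata, their members and the per-stratum sample sizes are hypothetical values used only to make the example run.

```python
import random

def stratified_sample(strata, sizes, rng=random):
    """Take a simple random sample of sizes[name] people from each stratum."""
    return {name: rng.sample(members, sizes[name])
            for name, members in strata.items()}

# Hypothetical strata and per-stratum sample sizes (not taken from a real roll).
strata = {
    "Year 9": [f"Y9 student {i}" for i in range(1, 121)],
    "Year 10": [f"Y10 student {i}" for i in range(1, 116)],
}
sizes = {"Year 9": 7, "Year 10": 7}
sample = stratified_sample(strata, sizes, random.Random(4))
```

The result keeps the strata separate, so it is easy to check that each year group contributes exactly its allotted number of students.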

Stratified Sampling
Advantages
- Unbiased
- Completely representative of each of the strata
- Most reliable estimates

Disadvantages
- Needs an entire population list
- Information about the entire population needs to be known beforehand
- Time consuming

Generate a Random Number


1. Decide on the starting number (in this case 1).
2. Decide how many you need (in the case of the school, 529 students).
3. Choose your calculator:
   - Casio FX-82
   - Casio Graphic
   - Texas

Random Number on a Casio Graphics Calculator


1. Decide on the starting number (in this case 1).
2. Decide how many you need (in the case of the school, 529 students).
3. In Run Mode, enter:

   Intg(529 Ran# + 1)

Key presses
- Intg: OPTN, F6, F4, F5
- Ran#: OPTN, F6, F3, F4

On screen: Intg(529 Ran# + 1)
- 529 is the population size or stratum size.
- 1 is the starting value.
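For comparison, the same calculation can be sketched in Python. This assumes Ran# behaves like a random decimal from 0 up to (but not including) 1; the function name `random_roll_number` is illustrative.

```python
import math
import random

def random_roll_number(population_size, start=1, rng=random):
    """Mimic Intg(population_size x Ran# + start) from the calculator."""
    ran = rng.random()                            # assumed like Ran#: 0 <= ran < 1
    return math.floor(population_size * ran + start)

rng = random.Random(5)
numbers = [random_roll_number(529, 1, rng) for _ in range(1000)]
```

Multiplying by 529 and adding 1 before dropping the decimal part is what makes every roll number from 1 to 529 possible.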

Random Number on a Casio FX - 82


1. Decide on the starting number (in this case 1).
2. Decide how many you need (in the case of the school, 529 students).
3. Ran# is a 2nd function, so press Shift first.
4. Enter: Ran# × 529 + 1 =

On screen: RAN#529+1
- 529 is the population size or stratum size.
- 1 is the starting value.

Note: Ignore any decimal in the answer.

Random Number on a Texas


1. Decide on the starting number (in this case 1).
2. Decide how many you need (in the case of the school, 529 students).
3. Enter RANDI(1,529):
   - RANDI is found under PRB.
   - The comma is the 2nd function of the ) key.

On screen: RANDI(1 , 529)
- 1 is the starting value.
- 529 is the end value.

Simple Random Sample


The simplest unbiased sample.
1. Number the entire population.
2. Generate random numbers.
3. Proceed until you have as many as you need, ignoring any repeats.

Example (Heights of WBHS students)
1. Get a copy of the School Roll.
2. Number every person from 1 (to 529).
3. Generate random numbers from 1 to the maximum you need (529).
4. Proceed until you have the desired sample size, ignoring repeats.
Strata Proportions
1. Number of people in the stratum divided by the total in the population.
2. Multiplied by the number of people wanted in the total sample.

Example (Heights of WBHS students)
1. 529 people on the School Roll.
2. 115 Year 10s.
3. Sample size of 30.
4. So the Year 10 sample size is 115 ÷ 529 × 30 = 6.52.
So take 7 Year 10 students.
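The proportion calculation can be checked with a few lines of Python. Only the Year 10 figure (115 out of 529) comes from the example; the other year-group sizes are hypothetical, picked so that the total is 529.

```python
def stratum_sample_size(stratum_size, population_size, total_sample_size):
    """Number in the stratum ÷ total in population × total sample size."""
    return stratum_size / population_size * total_sample_size

year_10_exact = stratum_sample_size(115, 529, 30)
year_10_sample = round(year_10_exact)

# Hypothetical sizes for the other year groups (only Year 10 is from the example).
year_sizes = {"Year 9": 120, "Year 10": 115, "Year 11": 110,
              "Year 12": 100, "Year 13": 84}
allocation = {year: round(stratum_sample_size(size, 529, 30))
              for year, size in year_sizes.items()}
total_allocated = sum(allocation.values())
```

With these made-up sizes the rounded allocations add up to 31 rather than 30, which shows that rounding each stratum separately can shift the total sample size slightly.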

Systematic Step Sizes


1. Number of people in the population divided by the sample size.

Example (Heights of WBHS students)
1. 529 people on the School Roll.
2. Sample size of 30.
3. So the step size is 529 ÷ 30 = 17.63.
So take every 17th student from the starting position.

Systematic Stepping
1. Starting at the random start point, step out until you reach the desired sample size.

Example (Heights of WBHS students)
1. Random starting point 803, step size 29.
2. The 803rd student on the alphabetical list is where we start.
3. Then the 832nd student, then the 861st student. We have now reached the end of the roll, so continue from the beginning: 890 wraps round to the 15th student, then the 44th student (15 + 29).
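The wrap-around at the end of the roll is remainder arithmetic, sketched below. The roll size of 875 is not stated in the example; it is inferred from position 890 becoming the 15th student, so treat it as an assumption. The function name is illustrative.

```python
def stepped_positions(start, step, roll_size, count):
    """List 1-based roll positions, wrapping back to the start of the roll."""
    return [(start - 1 + i * step) % roll_size + 1 for i in range(count)]

# roll_size=875 is inferred from "890 = 15th student", not given directly.
positions = stepped_positions(start=803, step=29, roll_size=875, count=5)
```

Subtracting 1 before taking the remainder and adding it back afterwards keeps the positions numbered from 1, matching the way the roll is numbered.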

Mean
1. Add up all of the values in the sample.
2. Divide by the sample size.

Calculator Method

Advantages
- Easy to calculate for large samples
- Accurate and well understood

Disadvantages
- Affected by outliers

Median
1. List all the values in order.
2. Find the central value.

Advantages
- Accurate
- Not affected much by outliers

Disadvantages
- Not so widely known as an average
- Time consuming to list a large sample in order

Mode
1. List all the values.
2. Find the most common item.

Advantages
- Can calculate the mode for data that is not numeric or ordered
- Not affected much by outliers
- Very easy to calculate

Disadvantages
- Can be inaccurate for numeric data or data that can be ordered
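The three measures of centre can be compared with Python's standard `statistics` module. The five heights are the ones used in the standard deviation table of this presentation; the extra value of 250 and the eye-colour list are hypothetical, added only to show the effect of an outlier and a non-numeric mode.

```python
import statistics

heights = [180, 150, 165, 170, 160]            # sample heights in cm
mean_height = statistics.mean(heights)          # add up, divide by sample size
median_height = statistics.median(heights)      # central value once ordered

# Add one hypothetical extreme value to see which measure moves most.
with_outlier = heights + [250]
mean_with_outlier = statistics.mean(with_outlier)
median_with_outlier = statistics.median(with_outlier)

# The mode also works for data that is not numeric (hypothetical data).
eye_colours = ["brown", "blue", "brown", "green", "brown", "blue"]
most_common = statistics.mode(eye_colours)
```

The outlier moves the mean from 165 to about 179.2, while the median only moves from 165 to 167.5.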

Statistics on a Calculator
Choose your calculator
Casio FX-82

Casio Graphic

Texas

Statistics on a Casio Graphics Calculator


1. In Stat Mode.
2. In List 1 enter all data values.
3. In List 2 enter their frequencies.
4. F2 (CALC)
5. F6 (SET)
6. Should read:
     1Var XList  :List1
     1Var Freq   :List2
     2Var XList  :List3
     2Var YList  :List4
     2Var Freq   :List5
7. EXIT, then F1 (1VAR)

(All statistics are listed: x̄ is the mean, xσn is the std. dev.)

S.D. using table

Entering Data on Casio Graphics Calculator


Enter each data value in List 1, followed by EXE.
Enter the frequency of each data value in List 2, followed by EXE.

Note: If all of the frequencies are 1 then you don't need to enter the frequencies. In the Set Menu, change the 1Var Freq to 1 instead of List 2.

Statistics on a Casio FX 82 Calculator


1. Put your calculator into statistics mode: Mode, 2.
2. Clear the statistics memory: Shift, Mode (clr all), then 1 (Scl).
3. Enter the data carefully: type each value (e.g. 180 for 180cm), then press M+.
4. Calculate desired statistics: press Shift, 2, then choose
   1. x̄ (mean)
   2. xσn (standard deviation)

S.D. using table

Entering Data on Casio FX 82 Calculator


Enter each data value followed by M+.
n= is the number of data values that you have entered.

Note: Be very careful entering the data values, as you cannot review them later to make sure that they are correct.

Statistics on a Texas Calculator


1. Put your calculator into statistics mode: 2nd Function, DATA, then choose 1-VAR.
2. Enter the data carefully: press DATA and enter the values.
3. Calculate desired statistics: press STATVAR and shift between the statistics (n, x̄, Sx, σx) with the arrow keys.
   - n: number of data values
   - x̄: mean
   - σx: standard deviation

S.D. using table

Entering Data on a Texas Calculator


Press the DATA key to begin entering data.
1. X1 is the data value (e.g. X1 = 180), followed by the down arrow.
2. Freq1 is that data value's frequency, followed by the down arrow.
3. X2 is next, then Freq2.
To check data, use the up arrow.

Definitions

Population: The entire list of those people or things that you wish to sample.
Census: A survey of an entire population.
Sample: A small group of a population.
Parameters: Facts about an entire population gained from a census. (Notation: mean μ or standard deviation σ)
Statistics: Estimates of population parameters calculated from a sample. (Notation: mean x̄ or standard deviation s)
Representative: A sample that appears to represent all elements of the population in the correct proportions.
Bias: A sampling method that does not give every element of the population an equal chance of selection.

Standard Deviation

This is a calculation of the typical difference between the data values and the mean (the square root of the average squared difference). This measure of spread applies to the mean.

Use table to calculate
Use Calculator to Calculate

Advantages
- Easy to calculate for large samples on a calculator
- Accurate
- Very useful for certain types of data

Disadvantages
- Affected by outliers
- Possibly not so well understood

Interquartile Range
1. Calculate the upper and lower quartiles.
2. Upper quartile minus lower quartile.
3. This measure of spread applies to the median.

Advantages
- Well understood
- Unaffected by outliers

Disadvantages
- Time consuming for large samples, as the values must first be listed in order

Range
1. Find the highest and lowest value.
2. Highest value minus the lowest value.
3. This measure of spread applies to all measures of centre.

Advantages
- Well understood
- Easy to calculate for large samples

Disadvantages
- Strongly affected by outliers, as it uses only the highest and lowest values

Standard Deviation by Table


Data Values    Mean    Data values minus the Mean    ( )²
180            165      15                           225
150            165     -15                           225
165            165       0                             0
170            165       5                            25
160            165      -5                            25
Total  825               0                           500
Mean   165                                           100

Column notes
- Data Values: from your sample or census.
- Mean: calculated as usual (Use Calculator to Calculate); it doesn't change.
- ( )²: the square of each of the values to the left.

The final Standard Deviation is the square root of this value (100), so s = 10.
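The table method can be followed column by column in Python. This sketch uses the same five data values; the variable names are illustrative.

```python
import math

heights = [180, 150, 165, 170, 160]              # data values
mean = sum(heights) / len(heights)                # 825 / 5
differences = [x - mean for x in heights]         # data values minus the mean
squares = [d ** 2 for d in differences]           # square of each difference
mean_of_squares = sum(squares) / len(squares)     # 500 / 5
s = math.sqrt(mean_of_squares)                    # final standard deviation
```

This divides by the number of data values, as the table does. Many calculators also offer a version that divides by one less than the number of values, which would give a slightly larger answer (about 11.18 here).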

Calculating Quartiles
1. List all the values in order.
2. Find the central value (the median).
3. Discard that central value if it is one of the data values.
4. Find the central value of each of the remaining two halves.
5. These 2 numbers are the upper and lower quartiles.

Example (Heights of WBHS students)
1. Data values: 165, 170, 173, 180, 182, 183, 191, 192
2. The central value is the middle of 180 and 182, so the median is 181.
3. 181 is not one of the data values, so nothing is discarded; calculate the middle of each half.
4. 165, 170, 173, 180 // 182, 183, 191, 192
   Lower quartile: 171.5 (middle of 170 and 173)
   Upper quartile: 187 (middle of 183 and 191)
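The halving method for quartiles can be sketched in Python as below. The function name `quartiles` is illustrative, and other quartile conventions (including those built into some software) can give slightly different answers.

```python
import statistics

def quartiles(values):
    """Lower and upper quartiles as the medians of the two halves."""
    if len(values) < 2:
        raise ValueError("need at least two data values")
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    upper_half = ordered[-half:]    # leaves out the central value if the count is odd
    return statistics.median(lower_half), statistics.median(upper_half)

heights = [165, 170, 173, 180, 182, 183, 191, 192]
lower_q, upper_q = quartiles(heights)
iqr = upper_q - lower_q                      # interquartile range
data_range = max(heights) - min(heights)     # range
```

For these eight heights the halves are the first four and last four values, giving quartiles of 171.5 and 187, an interquartile range of 15.5 and a range of 27.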

Things to Consider
Is my sample representative of the population?

Need to consider whether any strata present in the data are represented in approximately the correct proportions. Need to consider the presence of any apparent outliers in the sample chosen, and the effect they will have on estimates of population parameters.

Things to Consider
Is my sample representative of the population?

Estimates are more reliable when taken from a large sample as the effects of outliers are lessened. Consider the size of the s.d.
A larger value of s suggests considerable variation in the data values. Thus taking another sample could produce quite different statistics.

Ask yourself: "If I were to repeat this sampling process, would I get the same results?"
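One way to explore that question is to repeat the sampling many times and see how much the sample mean moves. The sketch below uses an entirely hypothetical population of 529 heights (generated at random, not real WBHS data) and compares small samples with larger ones.

```python
import random
import statistics

rng = random.Random(6)
# Hypothetical population of 529 heights in cm (not real WBHS data).
population = [rng.gauss(165, 10) for _ in range(529)]

def spread_of_sample_means(sample_size, repeats=200):
    """Standard deviation of the means of repeated simple random samples."""
    means = [statistics.mean(rng.sample(population, sample_size))
             for _ in range(repeats)]
    return statistics.pstdev(means)

spread_small = spread_of_sample_means(5)     # samples of 5 students
spread_large = spread_of_sample_means(50)    # samples of 50 students
```

The means of the larger samples vary less from one repeat to the next, which is the sense in which estimates from a large sample are more reliable.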

Things to Consider
How could I improve my sampling method?

Need to choose a sampling method which eliminates bias, and which gives the best chance of choosing a representative sample. (Bias exists when some of the population members have greater or lesser chance of being included in the sample.) Need to discuss which statistics would give the best estimates of population parameters, including the effect of outliers.

Things to Consider
Would I get the same or similar results if I repeated the same process?

Are there outliers or extreme values that may affect the sample statistics? If so, then I probably wouldn't get similar results. Is the standard deviation (or other measure of spread) large when compared to the mean? If it is, then getting the same results again is unlikely.

Things to Consider
When answering questions or stating conclusions:

- Answers need to be precise and refer to actual data values present in the sample and/or population.
- Strata must be clearly defined.
- Answers cannot be vague or rote-learnt without referring specifically to the context of the assessment.
- Students must be very clear that the sample statistics are ESTIMATES of the population parameters. They must NOT state what the population mean is unless they have taken a census of the whole population!

Practice Tasks
Real Estate Stats

On Your Calculator
In this part of the presentation you can check on exactly how to use your calculator effectively to help with Statistics
- Generating Random Numbers
- Entering Data
- Calculating Statistics

Note: For more detailed instructions on any of the examples, click on the step you do not understand.

Entering Data on a Calculator


Choose your calculator
Casio FX-82

Casio Graphic

Texas
