Menu
For clarification, click on any step you do not understand to see that element broken down.
The example used throughout this presentation is finding the mean height of WBHS pupils.
Sampling Methods
Measures of Centre
Measures of Spread
On Your Calculator
Definitions
Assessment Tips
Practice Tasks
Sampling Methods
In this presentation you will see a number of sampling methods, their benefits and drawbacks.
Simple Random Sample, Cluster Sampling, Systematic Sampling, Stratified Sampling
Note:
For more detailed instructions in any of the examples click on the step you misunderstand
Measures of Spread
In this presentation you will learn how to find a number of measures of spread as well as their drawbacks and advantages. You will also need to decide which measure of spread and which measure of centre go together.
Standard Deviation, Interquartile Range, Range
Note:
For more detailed instructions in any of the examples click on the step you misunderstand
Cluster Sampling
The easiest unbiased sample.
1. Sort your data into clusters based on location.
2. Randomly choose the cluster.
3. Perform a simple random sample on the chosen cluster.
Example (Heights of WBHS students)
1. Get a copy of the School Roll.
2. Sort into clusters, e.g. year levels.
3. Randomly select the cluster.
4. Randomly generate a sample from the chosen cluster.
Take care with clusters, as Juniors are much shorter than Seniors.
Cluster Sampling
Advantages
Very cheap. Very easy to carry out. Unbiased.
Disadvantages
Needs an entire population list. Can be biased if clusters strongly affect the statistics.
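The cluster sampling steps can be sketched in Python. This is only an illustration under stated assumptions: the roll is a made-up list of (student id, year level) pairs, and the function name `cluster_sample` is hypothetical, not something from the presentation.

```python
import random

def cluster_sample(roll, cluster_of, sample_size, rng):
    """Pick one cluster at random, then take a simple random sample inside it."""
    clusters = {}
    for student in roll:
        # Group students by their cluster label (here: year level).
        clusters.setdefault(cluster_of(student), []).append(student)
    chosen = rng.choice(sorted(clusters))          # randomly choose the cluster
    members = clusters[chosen]
    # Simple random sample (without replacement) from the chosen cluster only.
    return chosen, rng.sample(members, min(sample_size, len(members)))

# Hypothetical roll: 100 students, 20 in each of years 9 to 13.
roll = [(i, 9 + i % 5) for i in range(100)]
rng = random.Random(1)
year, sample = cluster_sample(roll, lambda s: s[1], 10, rng)
print(year, sample)
```

Every sampled student comes from the same year level, which is why the estimate can be biased when the clusters differ strongly (Juniors versus Seniors).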
Systematic Sampling
A relatively quick way to pick an unbiased sample.
1. List the entire population.
2. Decide on your step size (Total ÷ Sample size = n).
3. Randomly generate a starting point.
4. Step every nth data point till you have your sample.
Example (Heights of WBHS students)
1. Get an alphabetical copy of the School Roll.
2. Step size = Total ÷ Sample size.
3. Randomly generate a starting point.
4. Starting from that point, use the step size to pick the rest of the sample.
Systematic Sampling
Advantages
Cheap. Easy to choose sample. Unbiased.
Disadvantages
Needs an entire population list. If the population list is ordered then the sample can become biased.
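A minimal Python sketch of systematic sampling, assuming the roll is simply the numbers 1 to 529 and that stepping wraps around to the start of the list when it runs off the end. The function name `systematic_sample` is hypothetical.

```python
import random

def systematic_sample(roll, sample_size, rng):
    """Random start, then every n-th entry, wrapping around at the end of the roll."""
    step = len(roll) // sample_size            # step size, rounded down (529 // 30 = 17)
    start = rng.randrange(len(roll))           # random starting index into the roll
    return [roll[(start + k * step) % len(roll)] for k in range(sample_size)]

roll = list(range(1, 530))                     # stand-in for an alphabetical roll of 529
sample = systematic_sample(roll, 30, random.Random(0))
print(sample)
```

Because the step is rounded down, 30 steps of 17 cover only 510 places, so no student is picked twice.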
Stratified Sampling
The most reliable sampling method.
1. Sort the data into strata based on information you already know.
2. Calculate the proportions for each stratum.
3. Perform a Simple Random Sample on each of the strata.
Example (Heights of WBHS students)
1. Get a copy of the School Roll separated into year levels.
2. Calculate the sample size for each year group (stratum).
3. Perform a simple random sample on each year group to its specific sample size.
Stratified Sampling
Advantages
Unbiased. Completely representative of each of the strata. Most reliable estimates.
Disadvantages
Needs entire population list. Information about entire population needs to be known beforehand. Time consuming.
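The stratified procedure might look like this in Python. It is a sketch under assumptions: only the total of 529 and the 115 Year 10 students come from the presentation, the other year-group sizes are invented so that they add to 529, and `stratified_sample` is a hypothetical name.

```python
import random

def stratified_sample(strata, total_sample, rng):
    """Simple random sample from each stratum, sized in proportion to the stratum."""
    population = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Stratum share of the population, times the total sample size, rounded.
        size = round(len(members) / population * total_sample)
        sample[name] = rng.sample(members, size)
    return sample

# Hypothetical year-group sizes (only 115 for Year 10 and the total of 529 are given).
sizes = {9: 120, 10: 115, 11: 110, 12: 100, 13: 84}
strata = {year: [f"Y{year}-{i}" for i in range(n)] for year, n in sizes.items()}
sample = stratified_sample(strata, 30, random.Random(3))
print({year: len(chosen) for year, chosen in sample.items()})
```

With these made-up sizes the rounded stratum samples add up to 31 rather than 30. Rounding each stratum separately can shift the total by one or two.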
Generating Random Numbers
1. Decide on the starting number (in this case 1).
2. Decide how many you need (in the case of the school, 529 students).
3. Choose your calculator: Casio FX-82, Casio Graphic, Texas
Casio Graphic
1. Decide on the starting number (in this case 1).
2. Decide how many you need (in the case of the school, 529 students).
3. In Run Mode, enter Intg(529 Ran# + 1).
Keys used: F3, F4, F6, OPTN
Casio FX-82
1. Decide on the starting number (in this case 1).
2. Decide how many you need (in the case of the school, 529 students).
3. Ran# is a 2nd function, so press shift first.
4. Enter Ran# × 529 + 1 =, where 529 is the population size or strata size and 1 is the starting value.
On screen: RAN#529+1
Texas
1. Decide on the starting number (in this case 1).
2. Decide how many you need (in the case of the school, 529 students).
3. Enter RANDI(1,529), where 1 is the starting value.
Keys used: 2nd, PRB, RANDI and the comma key
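For comparison, the same random roll numbers can be produced in Python. This sketch mirrors the Casio Graphic expression Intg(529 Ran# + 1). The function name `random_roll_number` is an assumption made for illustration.

```python
import math
import random

def random_roll_number(population_size, start=1, rng=random):
    """Integer part of (population_size * r + start), with r uniform on [0, 1)."""
    return math.floor(population_size * rng.random() + start)

rng = random.Random(7)
draws = [random_roll_number(529, 1, rng) for _ in range(1000)]
print(min(draws), max(draws))

# The standard library also has a direct equivalent of RANDI(1,529):
one_more = rng.randint(1, 529)
```

Taking the integer part matters: without it the result is a decimal, not a position on the roll.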
Strata Proportions
1. Number of people in the stratum divided by total in population.
2. Multiplied by number of people wanted in total sample.
Example (Heights of WBHS students)
1. 529 people on School Roll.
2. 115 year 10s.
3. Sample size of 30.
4. So year 10 sample size = 115 ÷ 529 × 30 = 6.52, so take 7 year 10 students.
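The proportion rule can also be written as a formula. The symbols are assumptions introduced here for illustration: $N$ is the population size, $N_s$ the number of people in the stratum, $n$ the total sample size and $n_s$ the sample size for that stratum.

```latex
n_s = \frac{N_s}{N} \times n
\qquad\text{e.g.}\qquad
n_{\text{year 10}} = \frac{115}{529} \times 30 \approx 6.52 \;\Rightarrow\; 7 \text{ students}
```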
Example (Heights of WBHS students)
1. 529 people on School Roll.
2. Sample size of 30.
3. So step size = 529 ÷ 30 = 17.63333, so take every 17th student from the starting position.
Systematic Stepping
1. Starting at the random start point, step out till you get the desired sample size.
Example (Heights of WBHS students)
1. Random starting point 803, step size 29.
2. The 803rd student on the alphabetical list is where we start.
3. Then the 832nd student, then the 861st student. We have now reached the end of the roll, so continue from the beginning: position 890 wraps around to the 15th student, then the 44th student.
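The wrap-around stepping can be checked with a few lines of Python. One assumption is needed: for position 890 to land on the 15th student, the roll in this example must hold 875 students, a figure inferred from the numbers rather than stated in the presentation. The function name `step_positions` is hypothetical.

```python
def step_positions(start, step, roll_size, count):
    """1-based roll positions visited from `start`, wrapping past the end of the roll."""
    return [(start - 1 + k * step) % roll_size + 1 for k in range(count)]

# Assumed roll size of 875, inferred from "890 = 15th student".
positions = step_positions(803, 29, 875, 5)
print(positions)  # [803, 832, 861, 15, 44]
```

Under that assumption, the position after the 15th student is 15 + 29 = 44.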
Mean
1. Add up all of the values in the sample.
2. Divide by the sample size.
Calculator Method
Advantages
Easy to calculate for large samples. Accurate and well understood
Disadvantages
Affected by outliers
Median
Advantages
Accurate. Not affected much by outliers.
Disadvantages
Not so widely known as an average. Time consuming to list a large sample in order.
Mode
Advantages
Can calculate mode for data that is not numeric or ordered. Not affected much by outliers. Very easy to calculate.
Disadvantages
Can be inaccurate for numeric data or data that can be ordered.
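The three measures of centre can be compared with Python's standard `statistics` module. The heights are the eight values from the presentation's quartile example. The 250 cm outlier and the eye-colour list are invented here purely to illustrate the advantages and disadvantages listed.

```python
from statistics import mean, median, multimode

heights = [165, 170, 173, 180, 182, 183, 191, 192]   # cm
print(mean(heights))      # 179.5
print(median(heights))    # 181.0

# Hypothetical outlier: the mean moves a lot, the median only slightly.
with_outlier = heights + [250]
print(round(mean(with_outlier), 1))   # 187.3
print(median(with_outlier))           # 182

# The mode also works for data that is not numeric (invented example data).
eye_colours = ["brown", "blue", "brown", "green"]
print(multimode(eye_colours))         # ['brown']
```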
Statistics on a Calculator
Choose your calculator
Casio FX-82
Casio Graphic
Texas
Casio Graphic
1. In Stat Mode.
2. In List 1 enter all data values.
3. In List 2 enter their frequencies.
4. F2 (CALC)
5. F6 (SET). Check the settings listed on screen: 1Var XList, 1Var Freq, 2Var XList, 2Var YList, 2Var Freq.
6. EXIT
7. F1 (1VAR) (x̄ is the mean, xσn is the std. dev.)
Enter the frequency of each data value in List 2 followed by EXE.
Note: If all of the frequencies are 1 then you don't need to enter the frequencies.
Casio FX-82
1. Mode 2.
2. shift, mode (CLR) to clear: the options shown are Scl, Mode and All.
3. Enter each data value followed by M+ (e.g. 180 cm, then M+). n is the number of data values that you have entered.
4. shift, 2 (x̄ is the mean, xσn is the standard deviation).
Note
Be very careful entering the data values as you cannot review them later to make sure that they are correct.
Texas
Keys used: 2nd, DATA, STATVAR
The STATVAR screen shows n, x̄, Sx and σx.
Definitions
Population: The entire list of those people or things that you wish to sample.
Census: A survey of an entire population.
Sample: A small group of a population.
Parameter: Facts about an entire population gained from a census. (Notation: mean μ or standard deviation σ)
Representative sample: A sample that appears to represent all elements of the population in the correct proportions.
Biased sampling: A sampling method that does not give every element of the population an equal chance of selection.
Standard Deviation
This is a calculation of the average difference between the data values and the mean. This measure of spread applies to the mean.
Use table to calculate
Advantages
Easy to calculate for large samples on a calculator. Accurate. Very useful for certain types of data.
Disadvantages
Affected by outliers. Possibly not so well understood.
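A short Python sketch of the calculation, using the eight heights from the presentation's quartile example. More precisely, the "average difference" is the square root of the average squared difference from the mean. The function names are hypothetical. Two versions are shown because calculators usually report both: one divides by n, the other by n − 1.

```python
import math

def population_sd(values):
    """Square root of the mean squared difference from the mean (divide by n)."""
    m = sum(values) / len(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / len(values))

def sample_sd(values):
    """Same idea, dividing by n - 1, which is the usual choice when estimating from a sample."""
    m = sum(values) / len(values)
    return math.sqrt(sum((x - m) ** 2 for x in values) / (len(values) - 1))

heights = [165, 170, 173, 180, 182, 183, 191, 192]   # cm
print(round(population_sd(heights), 2))   # 9.01
print(round(sample_sd(heights), 2))       # 9.64
```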
Interquartile Range
1. Calculate the upper and lower quartiles.
2. Upper quartile minus lower quartile.
3. This measure of spread applies to the median.
Disadvantages
Easy to calculate for large samples.
Advantages
Well understood Unaffected by outliers
Range
1. Find the highest and lowest value.
2. Highest value minus the lowest value.
3. This measure of spread applies to all measures of centre.
Calculating Quartiles
1. List all the values in order.
2. Find the central value.
3. Discard that central value.
4. Find the central value of the remaining two halves.
5. These 2 numbers are the upper and lower quartiles.
Example (Heights of WBHS students)
1. Data values: 165, 170, 173, 180, 182, 183, 191, 192.
2. The central value is the middle of 180 and 182, so the median is 181.
3. 181 is not one of the data values, so there is nothing to discard. Calculate the middle of each half.
4. 165, 170, 173, 180 // 182, 183, 191, 192
Lower quartile = middle of 170 and 173 = 171.5. Upper quartile = middle of 183 and 191 = 187.
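The quartile method can be sketched in Python as follows. The function name `quartiles` is hypothetical. Note that other software may use a different quartile convention and give slightly different values. For the eight heights, the middle of 170 and 173 is 171.5.

```python
from statistics import median

def quartiles(values):
    """Median-of-halves quartiles: drop the central value only when the count is odd."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    # With an odd count the central value is discarded; with an even count nothing is.
    upper = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(lower), median(upper)

heights = [165, 170, 173, 180, 182, 183, 191, 192]   # cm
lq, uq = quartiles(heights)
print(lq, uq, uq - lq)   # 171.5 187.0 15.5
```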
Things to Consider
Is my sample representative of the population?
Need to consider whether any strata present in the data are represented in approximately the correct proportions. Need to consider the presence of any apparent outliers in the sample chosen, and the effect they will have on estimates of population parameters.
Things to Consider
Is my sample representative of the population?
Estimates are more reliable when taken from a large sample as the effects of outliers are lessened. Consider the size of the s.d.
A larger value of s suggests considerable variation in the data values. Thus taking another sample could produce quite different statistics.
Ask yourself, "If I were to repeat this sampling process, would I get the same results?"
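One way to explore that question is to repeat the sampling many times on a computer and see how much the sample mean moves. The sketch below does this for a completely invented population of 529 heights. The numbers are illustrative assumptions, not WBHS data, and `spread_of_sample_means` is a hypothetical name.

```python
import random
from statistics import mean, pstdev

def spread_of_sample_means(population, sample_size, repeats, rng):
    """Standard deviation of the sample means over repeated simple random samples."""
    means = [mean(rng.sample(population, sample_size)) for _ in range(repeats)]
    return pstdev(means)

rng = random.Random(42)
# Hypothetical population: 529 heights in cm, invented for illustration only.
population = [rng.gauss(170, 10) for _ in range(529)]

small = spread_of_sample_means(population, 10, 200, rng)
large = spread_of_sample_means(population, 100, 200, rng)
print(f"n=10: {small:.2f} cm, n=100: {large:.2f} cm")
```

The sample means from the larger samples vary less from one repeat to the next, which is the sense in which estimates from a large sample are more reliable.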
Things to Consider
How could I improve my sampling method?
Need to choose a sampling method which eliminates bias, and which gives the best chance of choosing a representative sample. (Bias exists when some of the population members have greater or lesser chance of being included in the sample.) Need to discuss which statistics would give the best estimates of population parameters, including the effect of outliers.
Things to Consider
Would I get the same or similar results if I repeated the same process?
Are there outliers or extreme values that may affect the sample statistics? If so, then I probably wouldn't get similar results. Is the standard deviation (or other measure of spread) large when compared to the mean? If it is, then getting the same results on a repeat is unlikely.
Things to Consider
When answering questions or stating conclusions:
Answers need to be precise and refer to actual data values present in the sample and/or population. Strata must be clearly defined. Answers cannot be vague or rote-learnt without referring specifically to the context of the assessment. Students must be very clear that the sample statistics are ESTIMATES of the population parameters. They must NOT state that the population mean is a particular value unless they have taken a census of the whole population!
Practice Tasks
Real Estate Stats
On Your Calculator
In this part of the presentation you can check on exactly how to use your calculator effectively to help with Statistics
Generating Random Numbers, Entering Data, Calculating Statistics
Note:
For more detailed instructions in any of the examples click on the step you misunderstand