SAMPLING DESIGNS
INTRODUCTION

Researchers usually cannot make direct observations of every individual in the population they are studying. Instead, they collect data from a subset of individuals (a sample) and use those observations to make inferences about the entire population. Ideally, the sample corresponds to the larger population on the characteristic(s) of interest. In that case, the researcher's conclusions from the sample are probably applicable to the entire population.

This type of correspondence between the sample and the larger population is most important when a researcher wants to know what proportion of the population has a certain characteristic like a particular opinion or a demographic feature. Public opinion polls that try to describe the percentage of the population that plans to vote for a particular candidate, for example, require a sample that is highly representative of the population.

What is sampling?

In research terms, a sample is a group of people, objects, or items taken from a larger population for measurement. The sample should be representative of the population so that the researcher can generalize the findings from the research sample to the population as a whole. Sampling basically means selecting people or objects from a "population" in order to test the population for something. For example, we might want to find out how people are going to vote at the next election. Obviously we can't ask everyone in the country, so we ask a sample.

When considering a particular population it is usually advisable to choose a sample in such a way that everyone is represented. This is not easy and requires careful thought about sample size and composition. Often questionnaires are devised to identify the required information. These need to be foolproof, so questions need to cover all alternatives and leave little scope for varying interpretations. The main objective of drawing a sample is to make inferences about the larger population from the smaller sample.

## RATIONALE FOR SAMPLING

To draw conclusions about populations from samples, we must use inferential statistics, which enable us to determine a population's characteristics by directly observing only a portion (or sample) of the population. We obtain a sample rather than a complete enumeration (a census) of the population for many reasons. Though a census has the advantage of completeness, it may not be practical and is not always economical. Researchers may choose to use sampling for any of the following reasons.

Resource constraints - most business researchers are forced to deal with resource constraints, including the all-important factors of cost and time. Well-selected samples can be less costly.

Budget and time constraints - it is less expensive and takes less time to study a sample than a population.

Complete population inaccessible - some populations are so difficult to get access to that only a sample can be used, e.g. prisoners, people with severe mental illness, disaster survivors etc. The inaccessibility may be associated with cost or time or just access.

Accurate and reliable results - a sample may be more accurate than a study of the total population. A badly identified population can provide less reliable information than a carefully obtained sample. When sampling is done by skilled and qualified researchers, the results are expected to be accurate.

Destruction of test units - sometimes the very act of observing the desired characteristic of a unit of the population destroys it for the intended use. Good examples of this occur in quality control. E.g. to determine the quality of a fuse and whether it is defective, it must be destroyed. Therefore if you tested all the fuses, all would be destroyed. The ability to sample and make inferences about the population in such cases is very important.

SAMPLING TERMINOLOGIES
Sampling design, like any other field of study, has its own jargon. A few of the terms are discussed below.

Element/Unit of Analysis - the unit about which information is collected and which provides the basis for analysis. It can be a person, group, family, organization, corporation, community, and so forth. Each member of the population is a unit.

Population - the complete set of units of analysis that are under investigation. The target population is the entire group a researcher is interested in; the group about which the researcher wishes to draw conclusions. Let's imagine that you wish to generalize to urban homeless males between the ages of 30 and 50 in Jamaica. If that is the population of interest, you are likely to have a very hard time developing a reasonable sampling plan. You are probably not going to find an accurate listing of this population, and even if you did, you would almost certainly not be able to mount a national sample across hundreds of urban areas. So we probably should make a distinction between the population you would like to generalize to, and the population that will be accessible to you. We'll call the former the theoretical population and the latter the accessible population. In this example, the accessible population might be homeless males between the ages of 30 and 50 in six selected urban areas across Jamaica.

Sampling Units - the element or set of elements considered for selection at some stage of sampling. Sampling units may include individuals, households, city blocks, census tracts, departments, companies or any other logical unit that is related to the study.

Sampling Frame - a physical representation of the objects, individuals, groups etc. that is important to the development of the study sample. It is the actual list of sampling units at any stage in the selection procedure. If you were doing a phone survey and selecting names from the telephone book, the book would be your sampling frame.

Sampling Frame Problems
1. Missing elements: some members of the population are not included in the frame.
2. Foreign elements: non-members of the population are included in the frame.
3. Duplicate entries: a member of the population appears in the frame more than once and may therefore be surveyed more than once.
4. Groups or clusters: the frame lists clusters instead of individuals.
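The first three frame problems can be illustrated with a small Python sketch. It assumes, purely for illustration, that the full target population is known as a set of ID numbers, which is rarely true in practice; the function name `audit_frame` and the data are hypothetical.

```python
from collections import Counter

def audit_frame(frame, population):
    """Compare a sampling frame (list of IDs) with the target population (set of IDs)."""
    frame_ids = set(frame)
    counts = Counter(frame)
    return {
        "missing": sorted(population - frame_ids),   # in the population, not in the frame
        "foreign": sorted(frame_ids - population),   # in the frame, not in the population
        "duplicates": sorted(i for i, c in counts.items() if c > 1),  # listed more than once
    }

# Hypothetical data: the target population is residents 1..6
population = {1, 2, 3, 4, 5, 6}
frame = [1, 2, 2, 3, 5, 9]   # 4 and 6 are missing, 9 is foreign, 2 is listed twice

report = audit_frame(frame, population)
print(report)
```

Cleaning the frame before selection (dropping foreign elements and duplicates, and noting who is missing) reduces the chance that the frame itself biases the sample.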

Sample - a finite part of a statistical population whose properties are studied to gain information about the whole. When dealing with people, it can be defined as a set of respondents (people) selected from a larger population for the purpose of a survey. A good sample will permit generalization of the findings to the population with a certain level of statistical confidence. An efficient sample combines the generalization qualities of a good sample with the added benefits of cost minimization in design and execution.

Parameter/Statistic
Parameter: a value computed by examining a whole population, e.g. the average purchase price of all homes sold in Ocho Rios in 2010. A parameter is a "true" value.
Statistic: a value computed from a sample, e.g. take a sample of just 100 home purchases and compute the average price for that sample. A statistic is a guess or estimate of the true value. A statistic is used to estimate the value of the parameter.
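The difference between a parameter and a statistic can be seen in a short Python sketch. The prices below are invented random numbers, not real home sales, and the variable names are illustrative only.

```python
import random
from statistics import mean

rng = random.Random(2010)  # fixed seed so the run is repeatable

# Hypothetical population: 5,000 made-up home sale prices
population_prices = [rng.uniform(50_000, 500_000) for _ in range(5000)]

parameter = mean(population_prices)                 # computed from the whole population
sample_prices = rng.sample(population_prices, 100)  # 100 purchases drawn at random
statistic = mean(sample_prices)                     # computed from the sample only

print(f"parameter (population mean): {parameter:,.0f}")
print(f"statistic (sample mean):     {statistic:,.0f}")
print(f"difference:                  {statistic - parameter:,.0f}")
```

The statistic will usually be close to the parameter without matching it exactly; drawing a different sample gives a different statistic, while the parameter stays fixed.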

SAMPLING ERRORS - Sampling error comprises the differences between the sample and the population that are due solely to the particular units that happen to have been selected. It is a measure of the difference between a statistic and the parameter it is estimating. Related sources of error include:
1. Procedural error: a bias in the sampling procedure itself.
2. Imprecision associated with using statistics to estimate parameters.

For example, suppose that a sample of 100 Jamaican women is measured and all are found to be shorter than four feet. It is very clear, even without any statistical proof, that this would be a highly unrepresentative sample leading to invalid conclusions. This is a very unlikely occurrence because such rare cases are naturally widely distributed among the population, but it can occur. Luckily, this is a very obvious error and can be detected very easily. The more dangerous error is the less obvious sampling error against which nature offers very little protection. An example would be a sample in which the average height is overstated by only an inch or two rather than by one foot, which is more obvious. It is the unobvious error that is of much concern.
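A small simulation can make the less obvious kind of error concrete. The Python sketch below uses invented heights (a normal distribution with mean 64 inches and standard deviation 3 inches is an assumption, not data) and compares random sampling error with a hypothetical flawed procedure that can only reach the tallest half of the population.

```python
import random
from statistics import mean

rng = random.Random(7)

# Hypothetical population of 10,000 heights in inches (invented values)
heights = [rng.gauss(64, 3) for _ in range(10_000)]
true_mean = mean(heights)

# Random sampling error: repeat a fair draw of 100 people 200 times
random_errors = [mean(rng.sample(heights, 100)) - true_mean for _ in range(200)]

# Procedural error: a flawed procedure that only ever reaches the tallest half
tallest_half = sorted(heights)[5000:]
biased_errors = [mean(rng.sample(tallest_half, 100)) - true_mean for _ in range(200)]

print(f"average error, random procedure: {mean(random_errors):+.2f} in")
print(f"average error, biased procedure: {mean(biased_errors):+.2f} in")
```

With a fair procedure the individual errors scatter on both sides of zero and roughly cancel out over many samples. With the biased procedure every sample overstates the mean by a couple of inches, which is exactly the kind of error that is hard to spot from a single sample.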

## STEPS IN THE SAMPLING PROCESS

There are seven steps in the sampling process:

1. Select the population - the explicit designation of all elements concerned. A proper definition includes elements, sampling units, extent and time. (Extent is simply the range of conditions to which the population under study is restricted.)
2. Select sampling units - select what sampling units are appropriate in the population. A sampling unit may be one element or multiple elements.
3. Select a Sampling Frame - this means physically representing the population. This is a critical step because if the sampling frame chosen does not adequately represent the population selected in step 1, then the results of the study will be questionable.
4. Select a Sample Design - the method by which the sample is chosen. There are probability and non-probability type designs.
5. Select the sample size - the number of people or objects to study from the population. The sample size depends on a number of factors:
   i) Homogeneity of sampling units: the more alike the sampling units are, the smaller the sample needed to estimate the parameters.
   ii) Confidence: the degree to which researchers want to be sure that they are estimating true population parameters.
   iii) Precision: how close the estimate should be to the true population parameter.
   iv) Statistical power: making the right decision to reject a hypothesis that is false, i.e. recognizing a relationship when it exists.
   The researcher must always try to be accurate because a mistake may be costly.
6. Select a Sampling Plan - this specifies the procedures and methods to obtain the desired sample. If selected correctly it guides the researcher in the selection of the study sample so that errors may be minimized.
7. Select the Sample - the specific units of analysis are enumerated and designated for the next step in the research process.
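To show how homogeneity, confidence and precision interact in step 5, here is a Python sketch for the special case of estimating a proportion. It assumes the common textbook formula n0 = z^2 * p(1 - p) / e^2 with an optional finite population correction; the notes above do not prescribe this formula, and the function name `sample_size_for_proportion` is hypothetical.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(confidence, margin, p=0.5, population_size=None):
    """Sample size needed to estimate a proportion within +/- margin.

    confidence: e.g. 0.95; margin: desired precision, e.g. 0.05;
    p: expected proportion (0.5 is the least homogeneous, most cautious case).
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    if population_size is not None:
        n0 = n0 / (1 + (n0 - 1) / population_size)       # finite population correction
    return math.ceil(n0)

print(sample_size_for_proportion(0.95, 0.05))                        # 385
print(sample_size_for_proportion(0.95, 0.05, p=0.1))                 # 139
print(sample_size_for_proportion(0.95, 0.05, population_size=2000))  # 323
```

Under these assumptions, a more homogeneous population (p far from 0.5) needs a smaller sample, while higher confidence or a tighter margin needs a larger one. Statistical power is not covered by this sketch.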

PROBABILITY SAMPLING

Probability sampling is used when the selection of the sample is based purely on chance, i.e. each element has a known, non-zero chance of being selected (an equal chance in the simplest designs). Probability samples are usually more costly and take more time, because they require a complete enumeration of the population, after which the selected units must be located for analysis. They are, however, more accurate. The generalizability of results is also good because probability sampling usually gives conclusions that are replicable.

A probability sampling method is any method of sampling that utilizes some form of random selection. In order to have a random selection method, you must set up some process or procedure that assures that the different units in your population have known (in the simplest case, equal) probabilities of being chosen. Humans have long practiced various forms of random selection, such as picking a name out of a hat, or choosing the short straw. These days, we tend to use computers as the mechanism for generating random numbers as the basis for random selection.

Probability sampling (or representative sampling) is most commonly associated with survey-based research strategies where you need to make inferences from your sample about a population to answer your research question(s) or to meet your objectives. The process of probability sampling can be divided into four stages:
1. Identify a suitable sampling frame based on your research question(s) or objectives.
2. Decide on a suitable sample size.
3. Select the most appropriate sampling technique and select the sample.
4. Check that the sample is representative of the population.

## TYPES OF PROBABILITY SAMPLING

SIMPLE RANDOM SAMPLING - This is the easiest form of probability sampling. All the researcher needs to do is ensure that all the members of the population are included in the list and then randomly select the desired number of subjects. There are many ways to do this. It can be as mechanical as picking strips of paper with names written on them from a hat while the researcher is blindfolded, or it can be as easy as using computer software to do the random selection. The main advantage of this technique is that it is easy to understand and easy to apply. The disadvantage is that it is hard to use with a very large population because of the difficulty of listing the names of all the persons involved.

Steps in selecting a sample using a table of random numbers:
1. Define the population.
2. Determine the desired sample size.
3. List all the members of the population.
4. Assign each of the individuals on the list a consecutive number with the same number of digits, e.g. 01-89 or 001-249.
5. Select an arbitrary number in the table of random numbers (close your eyes and point).
6. For the selected number, look at only the appropriate number of digits.
7. If the selected number corresponds to the number assigned to any individual in the population, then that individual is in the sample; otherwise, skip it.
8. Repeat steps 5 to 7 until the desired sample size is reached.
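When a computer is used instead of a table of random numbers, the same idea takes only a few lines. The following Python sketch uses the standard library's `random.sample`; the frame of 89 numbered members and the function name `simple_random_sample` are made up for illustration.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n members from the frame without replacement, each equally likely."""
    rng = random.Random(seed)   # a seed makes the draw repeatable
    return rng.sample(frame, n)

# Hypothetical frame: 89 members numbered 01-89
frame = [f"{i:02d}" for i in range(1, 90)]

sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```

Recording the seed lets another researcher reproduce exactly the same selection from the same frame.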

STRATIFIED RANDOM SAMPLING - Also known as proportional random sampling, this is a probability sampling technique wherein the subjects are initially grouped into different classifications such as age, socioeconomic status or gender. Then, the researcher randomly selects the final list of subjects from the different strata. It is important to note that the strata must not overlap. Researchers usually use stratified random sampling if they want to study a particular subgroup within the population. It is also preferred over simple random sampling because it warrants more precise statistical outcomes. In short, it is the process of randomly selecting samples from the different strata of the population used in the study. The advantage is that it contributes much to the representativeness of the sample.

Steps involved in stratified sampling:
1. Define the population.
2. Determine the desired sample size.
3. Identify the variable and subgroups (strata) for which you want to guarantee appropriate representation (either proportional or equal).
4. Classify all members of the population as members of one of the identified subgroups.
5. Randomly select (using a table of random numbers) an appropriate number of individuals from each subgroup.
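The stratified procedure can be sketched in Python as follows. The sketch assumes proportional allocation (each stratum's share of the sample matches its share of the population); the age groups, the stratum sizes and the function name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, n, seed=None):
    """Proportional stratified sample: returns {stratum name: chosen units}."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)   # each unit lands in exactly one stratum
    total = len(units)
    sample = {}
    for name, members in strata.items():
        # Proportional share; rounding can shift the overall total by a unit or two
        k = round(n * len(members) / total)
        sample[name] = rng.sample(members, k)   # simple random sample within the stratum
    return sample

# Hypothetical population of 1,000 people as (id, age_group) pairs
units = ([(i, "18-29") for i in range(500)]
         + [(i, "30-49") for i in range(500, 800)]
         + [(i, "50+") for i in range(800, 1000)])

sample = stratified_sample(units, stratum_of=lambda u: u[1], n=100, seed=1)
print({name: len(chosen) for name, chosen in sample.items()})
```

For equal rather than proportional representation, `k` would instead be set to the same number for every stratum.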

SYSTEMATIC SAMPLING - Similar to simple random sampling, but instead of selecting random numbers from tables, you move through the list (sampling frame) picking every nth name. You must first work out the sampling fraction by dividing the required sample size by the population size. E.g. for a population of 500 and a sample of 100, the sampling fraction is 100/500 = 1/5, i.e. you will select one person out of every five in the population. A random number needs to be used only to decide on the starting point. With a sampling fraction of 1/5, the starting point must be within the first 5 people in your list.

Disadvantage: effect of periodicity (bias caused by particular characteristics arising in the sampling frame at regular intervals). An example of this would occur if you used a sampling frame of adult residents in an area composed predominantly of couples or young families. If this list was arranged Husband / Wife / Husband / Wife etc. and every tenth person was to be interviewed, everyone selected would be of the same sex (all husbands or all wives, depending on the starting point).

Steps in systematic sampling:
1. Define the population.
2. Determine the desired sample size.
3. Obtain a list (preferably randomized) of the population.
4. Determine what K is equal to by dividing the size of the population by the desired sample size.
5. Select a random starting place near the top of the population list (within the first K names).
6. Starting at that point, take every Kth name on the list until the desired sample size is reached.
7. If the end of the list is reached before the desired sample size is reached, go back to the top.

The main advantage is that it is more convenient, faster, and more economical. The disadvantage is that the sample becomes biased if the persons in the list belong to a class by themselves whereas the investigation requires that all sectors of the population be involved.
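A minimal Python sketch of systematic selection is given below, assuming the population size divides evenly by the sample size. The lists and the function name `systematic_sample` are hypothetical. The second call reproduces the periodicity problem with an alternating Husband / Wife list.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Take every Kth entry after a random start within the first K entries."""
    rng = random.Random(seed)
    k = len(frame) // n          # sampling interval K = population size / sample size
    start = rng.randrange(k)     # random starting point among the first K entries
    return [frame[start + i * k] for i in range(n)]

# Hypothetical list of 500 numbered people, sample of 100, so K = 5
frame = list(range(1, 501))
sample = systematic_sample(frame, 100, seed=3)
print(sample[:5])

# Periodicity: an alternating list combined with an even interval (K = 10)
couples = ["Husband", "Wife"] * 250
print(set(systematic_sample(couples, 50, seed=3)))   # only one of the two values appears
```

Because the start falls within the first K entries, this version never runs past the end of the list, so the "go back to the top" step is not needed here.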

MULTI-STAGE CLUSTER SAMPLING - As the name implies, this involves drawing several different samples. It does so in such a way that the cost of final interviewing is minimized.

Basic procedure: First draw a sample of areas. Initially large areas are selected, then progressively smaller areas within the larger areas are sampled. Eventually you end up with a sample of households and use a method of selecting individuals from these selected households.

Advantage: a sampling list, identification, and numbering are required only for members of the sampling units selected for the sample. If the sampling units are geographically defined, this cuts down the cost.

Disadvantage: errors are likely to be larger than in simple random sampling and systematic random sampling. For a given sample size, errors tend to increase as the number of sampling units (clusters) selected decreases.

Steps in cluster sampling:
1. Define the population.
2. Determine the desired sample size.
3. Identify and define a logical cluster.
4. Obtain, or make, a list of all clusters in the population.
5. Estimate the average number of population members per cluster.
6. Determine the number of clusters needed by dividing the sample size by the estimated size of a cluster.
7. Randomly select the needed number of clusters (using a table of random numbers).
8. Include in the sample all population members in the selected clusters.
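The cluster sampling steps can be sketched in Python as a single-stage version, in which whole clusters are drawn at random and every member of a chosen cluster is included. The city blocks, household labels and function name `cluster_sample` are invented for illustration.

```python
import math
import random

def cluster_sample(clusters, n, seed=None):
    """clusters: dict mapping a cluster name to the list of its members."""
    rng = random.Random(seed)
    avg_size = sum(len(m) for m in clusters.values()) / len(clusters)
    needed = math.ceil(n / avg_size)                # number of clusters to draw
    chosen = rng.sample(sorted(clusters), needed)   # random selection of whole clusters
    members = [person for name in chosen for person in clusters[name]]
    return chosen, members

# Hypothetical population: 20 city blocks of 25 households each
clusters = {
    f"block-{b:02d}": [f"b{b:02d}-h{h:02d}" for h in range(25)]
    for b in range(20)
}

chosen, members = cluster_sample(clusters, n=100, seed=5)
print(chosen, len(members))
```

A multi-stage design would repeat the random draw inside each chosen cluster (for example, sampling households within blocks) instead of taking every member.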

## NON PROBABILITY SAMPLING

It isn't always possible to undertake a probability method of sampling, such as random sampling. For example, there is not a complete sampling frame available for certain groups of the population, e.g. the elderly; people who are attending a football match; people who shop in a particular part of town. Another factor to bear in mind is that many of the probability sampling methods described above may mean that researchers would have to undertake a postal or telephone survey or might be expected to go from house to house.

Advantages of non-probability methods:
1. Cheaper.
2. Used when a sampling frame is not available.
3. Useful when the population is so widely dispersed that cluster sampling would not be efficient.
4. Often used in exploratory studies, e.g. for hypothesis generation.
5. Suited to research that is not interested in working out what proportion of the population gives a particular response, but rather in obtaining an idea of the range of responses or ideas that people have.

## Non probability sampling may be used under the following conditions

1. When demonstrating that a particular trait exists in the population.
2. When the researcher aims to do a qualitative, pilot or exploratory study.
3. When randomization is impossible, like when the population is almost limitless.
4. When the research does not aim to generate results that will be used to create generalizations pertaining to the entire population.
5. When the researcher has limited budget, time and workforce.

## TYPES OF NON PROBABILITY SAMPLING

CONVENIENCE SAMPLING is probably the most common of all sampling techniques. With convenience sampling, the samples are selected because they are accessible to the researcher. Subjects are chosen simply because they are easy to recruit. This technique is considered easiest, cheapest and least time consuming. The advantage of this sampling is that it is quick and inexpensive; however as a disadvantage it may contain several systematic and variable errors.

QUOTA SAMPLING is a non-probability sampling technique wherein the researcher ensures equal or proportionate representation of subjects depending on which trait is considered as the basis of the quota. For example, if the basis of the quota is college year level and the researcher needs equal representation, with a sample size of 100, he must select 25 1st year students, another 25 2nd year students, 25 3rd year and 25 4th year students. The bases of the quota are usually age, gender, education, race, religion and socioeconomic status. Its advantage over accidental/convenience sampling is that many sectors of the population are represented. But its representativeness is doubtful because selection within each quota is not random and there are no guidelines for the selection of the respondents.

Disadvantage of quota sampling - interviewers choose whom they like (within the above criteria) and may therefore select those who are easiest to interview, so bias can result. Also, it is impossible to estimate accuracy because it is not a genuine random sample.
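The college year level example can be sketched in Python as follows. The stream of students is invented, and the function name `quota_sample` is hypothetical. The sketch accepts people in the order the interviewer happens to meet them until each quota is full, which is where the bias described above comes from.

```python
import random
from collections import Counter

def quota_sample(arrivals, quotas, key):
    """Accept people in arrival order until every quota cell is filled."""
    remaining = dict(quotas)
    sample = []
    for person in arrivals:                # order in which the interviewer meets people
        group = key(person)
        if remaining.get(group, 0) > 0:    # this group's quota is still open
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):    # all quotas filled
            break
    return sample

# Hypothetical stream of 1,000 students as (id, year_level) pairs
rng = random.Random(11)
arrivals = [(i, rng.choice(["1st", "2nd", "3rd", "4th"])) for i in range(1000)]
quotas = {"1st": 25, "2nd": 25, "3rd": 25, "4th": 25}

sample = quota_sample(arrivals, quotas, key=lambda s: s[1])
print(Counter(year for _, year in sample))
```

The quotas guarantee the right mix of year levels, but within each year the first 25 students encountered are taken, so no selection probability can be attached to any individual.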

JUDGMENTAL SAMPLING is more commonly known as purposive sampling. In this type of sampling, subjects are chosen to be part of the sample with a specific purpose in mind. With judgmental sampling, the researcher believes that some subjects are more fit for the research than other individuals. This is the reason why they are purposively chosen as subjects.

SNOWBALL SAMPLING is usually done when there is a very small population size. In this type of sampling, the researcher asks the initial subject to identify another potential subject who also meets the criteria of the research. The downside of using a snowball sample is that it is hardly representative of the population.
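The referral chain behind a snowball sample can be sketched in Python. The names and the referral network below are invented, and the function name `snowball_sample` is hypothetical.

```python
from collections import deque

def snowball_sample(referrals, seeds, n):
    """Follow referrals outward from the initial subjects until n people are recruited.

    referrals: dict mapping a person to the people they can refer.
    """
    sample, queue, seen = [], deque(seeds), set(seeds)
    while queue and len(sample) < n:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:        # do not recruit the same person twice
                seen.add(contact)
                queue.append(contact)
    return sample

# Hypothetical referral network; nobody refers Gus
referrals = {
    "Ann": ["Ben", "Cal"],
    "Ben": ["Dee"],
    "Cal": ["Dee", "Eli"],
    "Eli": ["Fay"],
    "Gus": ["Ann"],
}

print(snowball_sample(referrals, seeds=["Ann"], n=5))
# ['Ann', 'Ben', 'Cal', 'Dee', 'Eli']
```

In this made-up network Gus can never enter the sample because no one refers him, which illustrates why a snowball sample is hardly representative of the population.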

TYPES OF SAMPLING

