Ph.D. B.A. Program Class RES600: Introductory to Data Analysis
Professor: Dr. Truel Student: Anh Tran E-mail: Anh.NTran@my.trident.edu Phone: 714-904-6209
Subject: SLP #3 for Module 3: Describing data statistically: Randomness, probability, and distributions
Date: 18-Nov-2013
From: A. Tran
Module 3 - SLP

1. Comment on the following sampling designs. Are they appropriate? If not, what are the potential problems?

A. A citizens group interested in generating public and financial support for a new university basketball arena printed a questionnaire in area newspapers. Readers return the questionnaires by mail.

In this survey, the sampling frame is the set of people who read the particular area newspapers in which the questionnaire is printed. This group is only a subset of the general population, so the survey may introduce a population specification error, in which researchers select an inappropriate population from which to collect data. In addition, only readers who choose to mail the questionnaire back are counted, so the respondents are self-selected.

B. A department store that wishes to examine whether the store is losing or gaining customers draws a sample from its list of credit card holders by selecting every 10th name.

First, this survey method is subject to sampling error, which occurs when the resulting sample is not representative of the population of concern. Here, customers who pay with the store's credit card are only a subset of all customers, who may also pay by cash or other means. Second, selecting every 10th name can serve as probability sampling (systematic sampling), but only if it begins from a random start and then proceeds by selecting every 10th element from that point onward. Without a random start, the selection is subject to systematic sampling error.

C. A motorcycle manufacturer decided to research consumer characteristics by sending 100 questionnaires to each of its dealers. The dealers would then use their sales records to trace down buyers of their brand of motorcycle and distribute the questionnaires.

This is a non-probability sampling procedure in which the targeted group is people who have expertise with motorcycles and use a certain brand of motorcycle. It is also called an expert sampling procedure.

D. A research company obtains a sample for a focus group through organized groups such as church groups, clubs, schools, etc. The organizations are paid for securing a respondent and no individual is directly compensated.

This is also a non-probability sampling procedure. Respondents are reached only through the organizations that agree to take part, and because the organization rather than the individual is paid, it has an incentive to supply whichever members are easiest to recruit. The resulting focus group reflects the membership of those organizations rather than the population of interest.
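The random start that item B calls for can be illustrated with a minimal Python sketch. The function name systematic_sample, the interval parameter k, and the list of 100 numbered card holders are hypothetical and are not part of the assignment.

```python
import random


def systematic_sample(items, k, rng=None):
    """Return every k-th item, beginning at a random offset in [0, k)."""
    rng = rng or random.Random()
    start = rng.randrange(k)  # random start: each offset 0..k-1 is equally likely
    return items[start::k]


# Hypothetical list of 100 credit card holders, identified by position.
card_holders = list(range(100))
sample = systematic_sample(card_holders, k=10, rng=random.Random(2013))
print(sample)
```

Because the offset is drawn at random, every name on the list has the same 1-in-10 chance of selection. Always starting from the first name would remove that property.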
E. A banner ad on a business-oriented web site read "Are you a large company Sr. Executive? Qualified execs receive $50 for under 10 minutes of time. Take the survey now!" Is this an appropriate way to select a sample of business executives?

This is a form of non-probability sampling called convenience sampling, which does not allow findings from the sample to be generalized to the population. Therefore, it is not an appropriate way to select a sample meant to represent the whole population of business executives.

2. Jury duty is supposed to be a totally random process. Comment on the following computer selection procedures and determine if they are indeed random processes.

A. A program instructs the computer to scan the list of names and pluck names that are next to those from the last scan.

This sampling process is not random because it does not include a random start: the list of names to scan is not itself chosen at random before the names next to those from the last scan are picked, so each selection is determined by the previous scan.

B. Three-digit numbers were randomly generated to select jurors from a list of licensed drivers. If the weight information listed on the license matched the random number, the person was selected.

Using a computer program to generate random numbers and matching those numbers against the weight listed on each driver's license is an appropriate method for selecting persons for jury service at random.

C. The juror source list was obtained by merging a list of registered voters with a list of licensed drivers.

This is not a random sampling process, because merging the two lists is a static (repeatable) operation with no random element.

3. Why is the standard deviation typically utilized rather than the average deviation (sum all the deviations and divide by the sample size)?
When repeated large samples are drawn from a normally distributed population, the mean deviations computed from those samples are about 14% more variable (in relative variance) than the standard deviations computed from the same samples (Stigler 1973). Thus, the SD of such a sample is a more stable estimate of the SD of the population, and it is considered better than its plausible alternatives as a way of estimating the population standard deviation from sample measurements (Hinton 1995, p. 50). That is the main reason why the SD has been preferred, and why much of subsequent statistical theory is based on it.
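The comparison above can be checked with a small simulation. The sketch below is only illustrative: the function name, the sample size of 100, the 2000 repetitions, and the seed are all assumptions chosen for convenience. Under normal theory, the ratio of the two relative variances approaches pi - 2, or roughly 1.14, for large samples.

```python
import random
import statistics


def relative_variance_ratio(n_samples=2000, n=100, seed=0):
    """Compare how much the mean deviation and the SD vary across samples.

    Returns (CV of mean deviation / CV of SD) squared, where CV is the
    coefficient of variation of each statistic over the repeated samples.
    """
    rng = random.Random(seed)
    sds, mean_devs = [], []
    for _ in range(n_samples):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]  # normal population
        m = statistics.fmean(xs)
        sds.append(statistics.stdev(xs))
        mean_devs.append(statistics.fmean([abs(x - m) for x in xs]))
    cv_sd = statistics.stdev(sds) / statistics.fmean(sds)
    cv_md = statistics.stdev(mean_devs) / statistics.fmean(mean_devs)
    return (cv_md / cv_sd) ** 2


print(relative_variance_ratio())
```

A ratio above 1 means the mean deviation fluctuates more from sample to sample than the SD does. The advantage depends on the population being normal, so this sketch says nothing about heavy-tailed data.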
4. What is the sampling distribution? How does it differ from the sample distribution? Please explain with one or two examples.
A sampling distribution is the distribution of a statistic computed across a group of samples. The sample distribution is the distribution of the data within a single sample. For example, Table 1 contains the sample statistics of 8 samples, shown in columns: sample mean, sample standard deviation, sample skewness, and sample kurtosis. If we compute descriptive statistics of the sample means listed in the first row, we obtain the sampling distribution of the mean, with a mean of 19.961 and a standard deviation of 0.406, as shown in Table 2.

Table 1
Table 2. Descriptive Statistics: Sampling Distribution of the Sample Means

Variable   N    Mean     Median   TrMean   StDev   SE Mean
C1         8    19.961   19.864   19.961   0.406   0.143
Variable   Minimum   Maximum   Q1       Q3
C1         19.392    20.730    19.768   20.242

5. As long as the sample size is large enough, we will get a normal distribution. Is this statement true?

The Central Limit Theorem states that if you draw a large enough sample, the way the sample mean varies around the population mean can be described by a normal distribution. The data themselves, however, do not become normally distributed. With a large sample size, the sampling distribution of the mean approaches a normal distribution, so the normal distribution's characteristics give a good approximation, as shown in Figure 4-19. Therefore, I think the statement is not absolutely correct.
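The answers to questions 4 and 5 can be illustrated with a short simulation. This is a sketch under assumed settings, not the data behind Tables 1 and 2: the population is taken to be Exponential(1), which is strongly right-skewed with mean 1 and standard deviation 1, and the sample sizes, repetition count, and function names are hypothetical.

```python
import random
import statistics


def skewness(xs):
    """Moment-based skewness: third central moment over variance ** 1.5."""
    m = statistics.fmean(xs)
    m2 = statistics.fmean([(x - m) ** 2 for x in xs])
    m3 = statistics.fmean([(x - m) ** 3 for x in xs])
    return m3 / m2 ** 1.5


def sample_means(n, n_samples, rng):
    """Means of n_samples samples, each of size n, from Exponential(1)."""
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
        for _ in range(n_samples)
    ]


rng = random.Random(3)
for n in (2, 50):
    means = sample_means(n, 5000, rng)
    # Mean, standard deviation, and skewness of the sampling distribution.
    print(n, statistics.fmean(means), statistics.stdev(means), skewness(means))
```

Each individual sample remains skewed however large it is. What moves toward the normal shape is the distribution of the sample means: its skewness shrinks as n grows, and its standard deviation approaches the population standard deviation divided by the square root of n.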
Reference: Montgomery, D.C., and Runger, G.C., Applied Statistics and Probability for Engineers, 5th ed.