
Stat 200 Sections 10-12 Group Project 2 Spring 2008

Submit a typed group report with a cover page that indicates each member's name and a signature for each member (nothing else), followed by pages giving the Task Number and answers to the task (you don't need to put each task on a new page). The signature is testimony that the person actually participated.

Task 1. These are exercises in the text, from Chapters 3 and 4. You should work together on these tasks (at least everyone should check the solutions to be submitted).
1. Exercise 3.10
2. Exercise 3.15
3. Exercise 3.41
4. Exercise 3.42
5. Exercise 4.3
6. Exercise 4.4
7. Exercise 4.9
8. Exercise 4.26

Task 2. Read the article by Stephen Jay Gould (in the folder Projects) carefully and then answer the following questions.

1. What disease did the author have and what is it associated with?
The author had abdominal mesothelioma, which is associated with exposure to asbestos.

2. Describe the shape of the distribution of lifetimes associated with the disease.
The disease has a median mortality of eight months after discovery, so a patient could be led to believe they have about eight months to live. However, because this figure is the median, half of all patients live longer than eight months. The distribution of lifetimes is right-skewed: it cannot extend below zero, but it has a long tail of patients who survive for years.

3. Based on the shape of the distribution of lifetimes after getting the disease, compare the values of the mean and median (which is larger?).
The mean is larger than the median for the lifetimes of people with abdominal mesothelioma. The many people who live years beyond the 8-month median pull the mean above 8 months.

4. If you had the disease which would you prefer to live: the value of the mean or the median?
If I had contracted abdominal mesothelioma, I would much prefer to live the value of the mean rather than the value of the median. The median is the middle value of the data set and is not affected by how far the long survival times extend beyond 8 months. The mean does take those long survival times into account, which raises it above the median.

5. Now justify the choice of title of the article by Gould.
Stephen Jay Gould chose the title "The Median Isn't the Message" to denounce the belief that a patient with abdominal mesothelioma has only 8 months to live after diagnosis. The median (8 months) is the middle value of the data and says nothing about how much longer than 8 months the upper half of patients live. As stated in the article, many people live for years after learning they have the disease. That fact alone would raise the life expectancy if we used the mean rather than the median. I believe this is why Gould chose the title.

Task 3. Stephen J. Gould, a biologist, also studied how one could compare performances of various kinds across past years. Here is one example. Three landmarks of baseball achievement are Ty Cobb's batting average of .420 in 1911, Ted Williams's .406 in 1941, and George Brett's .390 in 1980. These batting averages can't be compared directly because the distribution of major league batting averages has changed over the decades. The distributions are quite symmetric and (except for outliers such as Cobb, Williams, and Brett) reasonably normal. Here are some facts about batting averages:

Decade    Mean Batting Average    Standard Deviation
1910s     .266                    .0371
1940s     .267                    .0326
1970s     .261                    .0317
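The decade-to-decade movement in these figures can be expressed as percentage changes. The following is a minimal sketch using only the numbers given for the three decades; the helper name `pct_changes` is hypothetical and chosen for illustration.

```python
# Figures for the three decades, as given in Task 3.
decades = ["1910s", "1940s", "1970s"]
means = [0.266, 0.267, 0.261]
sds = [0.0371, 0.0326, 0.0317]


def pct_changes(values):
    """Percentage change between each consecutive pair of values."""
    return [100 * (b - a) / a for a, b in zip(values, values[1:])]


mean_changes = pct_changes(means)
sd_changes = pct_changes(sds)

# Print one line per decade-to-decade step.
steps = zip(decades, decades[1:])
for (start, end), m, s in zip(steps, mean_changes, sd_changes):
    print(f"{start} -> {end}: mean {m:+.1f}%, SD {s:+.1f}%")
```

Comparing percentage changes rather than raw differences puts the mean and the standard deviation on the same footing, even though the standard deviations are much smaller numbers.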

a. Comment on the Mean Batting Average (are they about the same, decreasing, increasing etc.; look at the percentage changes for the three decades and not the magnitude of the differences).
The mean batting averages for these three decades remain about the same. The percentage changes are small: about +0.4% from the 1910s to the 1940s and about -2.2% from the 1940s to the 1970s.

b. Comment on the Standard Deviations (are they about the same, decreasing, increasing etc.; look at the percentage changes for the three decades).
The standard deviations decrease from decade to decade: by about 12.1% from the 1910s to the 1940s and by about 2.8% from the 1940s to the 1970s.

c. Interpret what the behavior of the standard deviations says about the spread (variation) in batting averages over the three decades.
The decreasing standard deviations indicate less variation among players' batting averages over the years. More batters were close to the mean batting average, with the exception of the outliers (Cobb, Williams, Brett).

d. One way to compare the performance of the three baseball achievements is to standardize (using z-scores) their values (batting averages). Do this and then comment on which player seems to have had the greatest performance. Draw a bell-curve (free hand) to illustrate and justify what you say.
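The standardization described in part d can be sketched as follows. This is an illustrative calculation, not a submitted answer: it applies z = (value - mean) / standard deviation to each player, using that player's decade figures from the task. The function name `z_score` is hypothetical.

```python
# (batting average, decade mean, decade standard deviation) from Task 3.
players = {
    "Cobb (1911)": (0.420, 0.266, 0.0371),
    "Williams (1941)": (0.406, 0.267, 0.0326),
    "Brett (1980)": (0.390, 0.261, 0.0317),
}


def z_score(value, mean, sd):
    """Number of standard deviations that value lies above the mean."""
    return (value - mean) / sd


z_scores = {name: z_score(*stats) for name, stats in players.items()}
best = max(z_scores, key=z_scores.get)

for name, z in z_scores.items():
    print(f"{name}: z = {z:.2f}")
print("Largest z-score:", best)
```

With these figures the z-scores come out to roughly 4.15 for Cobb, 4.26 for Williams, and 4.07 for Brett. All three lie more than four standard deviations above their decade's mean, with Williams slightly ahead of the other two.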

Task 4. A researcher wants to see if the regular use of Vitamin C reduces the risk of getting a cold.

a. What are the response and explanatory variables?
Whether or not a person uses Vitamin C regularly is the explanatory variable, and whether or not that person gets a cold is the response variable.

b. Briefly describe how the researcher could do an observational study to examine this relationship.
A researcher could question a random sample of people about their Vitamin C intake habits and their history of having colds. The researcher can then compare how often colds occur among those who take Vitamin C regularly and those who do not.

c. Suppose that 100 people are available for an experiment. Describe a randomized experiment for this problem.

d. Why would it be better to do an experiment rather than an observational study to examine this relationship?

e. Briefly explain the term double-blind as it applies to randomized experiments.
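For part c, one possible random assignment of the 100 available people is sketched below. Several details are assumptions made for illustration and are not stated in the task: the subjects are labeled 1 to 100, they are split into two equal groups, the comparison group receives a placebo, and the seed value is arbitrary. The function name `randomize` is hypothetical.

```python
import random


def randomize(subject_ids, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)  # a fixed seed makes the assignment reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        "vitamin_c": sorted(shuffled[:half]),  # assumed treatment group
        "placebo": sorted(shuffled[half:]),    # assumed comparison group
    }


groups = randomize(range(1, 101), seed=2008)
print(len(groups["vitamin_c"]), "subjects receive Vitamin C")
print(len(groups["placebo"]), "subjects receive the placebo")
```

Because chance alone decides who lands in each group, other factors that affect colds should be roughly balanced between the two groups.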

Task 5. The faculty senate at a large university wanted to know what proportion of the students think a foreign language should be required of everyone. The statistics department offered to cooperate in conducting a survey, and a simple random sample of 500 students was selected from all students enrolled in statistics classes. A survey form was sent by email to these 500 students. For a similar problem on some parts, see problem 3.42 in the text.

a. What is the population of interest to the faculty senate?

b. What is the parameter of interest?

c. Is the sample representative of the population of interest? Explain.

d. Describe an alternative sampling method for obtaining a sample to estimate the parameter of interest.

e. The faculty senate believes that the proportion of students who think a foreign language should be required of everyone varies by the college they are enrolled in (for example, business, liberal arts, engineering, etc.). Describe a sampling scheme which will enable them to estimate the proportion for each college.

Project 2 Due Date: Friday Feb. 29
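For Task 5(e), one scheme that yields an estimate for each college is a stratified sample: a separate simple random sample drawn within every college. The sketch below is illustrative only. The roster, the 3000 student IDs, the sample size of 100 per college, and the function name `stratified_sample` are all hypothetical and not part of the assignment.

```python
import random


def stratified_sample(roster, per_college, seed=None):
    """Draw a simple random sample of students within each college."""
    rng = random.Random(seed)
    by_college = {}
    for student_id, college in roster:
        by_college.setdefault(college, []).append(student_id)
    return {
        college: rng.sample(ids, min(per_college, len(ids)))
        for college, ids in by_college.items()
    }


# Hypothetical roster: 3000 students spread evenly over three colleges.
colleges = ["business", "liberal arts", "engineering"]
roster = [(i, colleges[i % 3]) for i in range(3000)]

sample = stratified_sample(roster, per_college=100, seed=29)
for college, ids in sample.items():
    print(college, len(ids))
```

Sampling within each college guarantees that every college is represented, which a single simple random sample of the whole university does not.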
