OUTLINE
1. The logic of hypothesis testing
2. Uncertainty and errors in hypothesis testing
3. Trying out a hypothesis test
4. Directional (One-tailed) hypothesis tests
5. Measuring effect size
A hypothesis test is a statistical method that uses sample data to evaluate a hypothesis about a population. Ex: Does a certain brand of memory enhancer really improve one's memory?
1. State a hypothesis about a population. 2. Use the hypothesis to predict the characteristics of the sample. 3. Obtain a random sample from the population. 4. Compare the obtained sample data with the prediction that was made from the hypothesis.
If the sample mean (M) is consistent with the prediction, the hypothesis is reasonable. If there's a big discrepancy between the data and the prediction, the hypothesis is wrong.
The basic experimental situation for hypothesis testing. It is assumed that the parameter μ is known for the population before treatment. The purpose of the experiment is to determine whether or not the treatment has an effect on the population mean.
From the point of view of the hypothesis test, the entire population receives the treatment and then a sample is selected from the treated population. In the actual research study, a sample is selected from the original population and the treatment is administered to the sample. From either perspective, the result is a treated sample that represents the treated population.
7/22/2013
Researchers have noted a decline in cognitive functioning as people age. However, the results from other research suggest that the antioxidants in foods such as blueberries can reduce and even reverse these age-related declines, at least in laboratory rats. One might thus theorize that the same antioxidants might also benefit elderly humans. Suppose we are interested in testing this theory.
We decide to use a neuropsychological test to measure cognitive function. It is known that the distribution of scores on this test is approximately normal and, for adults older than 65, the average score is μ = 80 with an SD of σ = 20. We then obtain a sample of n = 25 adults who are older than 65, and give each participant a daily dose of blueberry supplement that is very high in antioxidants. After taking the supplement for 6 months, the participants will be given the neuropsychological test to measure their level of cognitive function.
STEP 1: STATE THE HYPOTHESIS
Null hypothesis (H0) states that in the general population there is no change, no difference, or no relationship. In an experiment, H0 predicts that the IV (treatment) has no effect on the DV for the population.
If the mean score for the sample is noticeably different from the mean for the general population of elderly adults, the researcher can conclude that the supplement does appear to have an effect on cognitive function. But if the sample mean is around 80 (the same as the general population mean), the researcher must conclude that the supplement does not appear to have any effect.
Alternative hypothesis (H1) states that there is a change, a difference, or a relationship for the general population. In an experiment, H1 predicts that the IV (treatment) does have an effect on the DV.
Use H0 to predict the kind of sample mean that ought to be obtained. Determine exactly what sample means are consistent with H0 and what sample means are at odds with H0.
The set of potential samples is divided into those that are likely to be obtained and those that are very unlikely to be obtained if the null hypothesis is true.
The critical region is composed of extreme sample values that are very unlikely to be obtained if H0 is true. The boundaries for the critical region are determined by the alpha (α) level. If the sample data fall in the critical region, H0 is rejected.
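The link between the alpha level and the boundaries of the critical region can be sketched in a few lines of Python. This is only an illustration: the helper name critical_z is made up here, the test is assumed to be two-tailed, and α = .05 is assumed for the blueberry example, which states μ = 80, σ = 20, and n = 25 but no alpha level at this point.

```python
from math import sqrt
from statistics import NormalDist

def critical_z(alpha: float) -> float:
    """Two-tailed critical z boundary: alpha/2 of the area lies in each tail."""
    return NormalDist().inv_cdf(1 - alpha / 2)

for alpha in (0.05, 0.01, 0.001):
    print(f"alpha = {alpha}: critical region is z beyond +/-{critical_z(alpha):.2f}")

# Blueberry example: mu = 80, sigma = 20, n = 25 (alpha = .05 is an assumption).
mu, sigma, n = 80, 20, 25
standard_error = sigma / sqrt(n)  # 20 / 5 = 4
lower = mu - critical_z(0.05) * standard_error
upper = mu + critical_z(0.05) * standard_error
print(f"Sample means below {lower:.2f} or above {upper:.2f} fall in the critical region")
```

A smaller alpha pushes the boundaries farther out, so the sample mean has to be more extreme before H0 is rejected.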
The z-score describes exactly where the sample mean is located relative to the hypothesized population mean from H0.
Use the z-score value obtained in Step 3 to make a decision about H0 according to the criteria established in Step 2. Either:
Reject H0 when the sample data fall in the critical region. Fail to reject (i.e., retain) H0 when the sample data do NOT fall in the critical region.
Why pay so much attention to the null hypothesis? Because it is much easier to demonstrate that a universal (population) hypothesis is false than to demonstrate that it is true. Ex: "All Psych majors are hot" is a universal statement. A single counterexample is enough to show that it is false, but no number of confirming cases can prove that it is true. In the end, we find support for the alternative hypothesis by disproving (rejecting) the null hypothesis.
HYPOTHESIS TESTING
TYPE I ERROR
Occurs when a researcher rejects an H0 that is actually true. In research, this means that the researcher concludes that a treatment has an effect when in fact it has no effect. The consequences of Type I errors can be very serious. The alpha level for a hypothesis test is the probability that the test will lead to a Type I error. Thus, the alpha level determines the probability of obtaining sample data in the critical region even though H0 is true.
TYPE II ERROR
Occurs when a researcher fails to reject a null hypothesis that is really false. In research, this means that the hypothesis test has failed to detect a real treatment effect. The consequences of a Type II error are usually not as serious as those of a Type I error: the researcher can simply repeat the experiment with some improvements to try to demonstrate that the treatment really does work. The probability of a Type II error is represented by beta (β).
TYPE I VS. TYPE II ERROR

                           Actual situation
Experimenter's decision    No effect, H0 True      Effect exists, H0 False
Reject H0                  Type I error            Correct decision
Retain H0                  Correct decision        Type II error

Type I error: "You only thought there was something, but there's nothing, nothing, nothing!"
Type II error: "You only thought there was nothing, but there is, there is, there is!"

The primary concern in selecting an alpha level is to minimize the risk of a Type I error. The smaller the alpha level, the less risk of committing a Type I error. But a smaller alpha level also means that the hypothesis test demands more evidence from the research results. We try to maintain a balance between the risk of a Type I error and the demands of the hypothesis test. α levels of .05, .01, and .001 are reasonably good values; they provide a low risk of error without placing excessive demands on the research results.
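The statement that the alpha level is the probability of a Type I error can be checked with a small simulation. The sketch below is an illustration only: it repeatedly draws samples from a population where H0 is true (no treatment effect) and counts how often a two-tailed z-test rejects anyway. The population values are borrowed from the blueberry example (μ = 80, σ = 20, n = 25); the function name, the number of trials, and the random seed are arbitrary assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type_one_error_rate(mu=80, sigma=20, n=25, alpha=0.05, trials=10_000, seed=1):
    """Fraction of simulated no-effect experiments that still reject H0."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed boundary
    standard_error = sigma / sqrt(n)
    rejections = 0
    for _ in range(trials):
        # H0 is true here: the sample comes from the untreated population.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (fmean(sample) - mu) / standard_error
        if abs(z) > z_crit:
            rejections += 1  # a Type I error
    return rejections / trials

rate = type_one_error_rate()
print(f"Proportion of false rejections: {rate:.3f}")
```

The proportion should land close to the chosen alpha level; rerunning with alpha=0.01 should give a proportion close to .01.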
THE PROBLEM
A researcher begins with a known population: scores on a standardized test that are normally distributed with μ = 65 and σ = 15. Q: Will a special training in reading skills produce a change in the scores for the individuals in the population? A sample of n = 25 individuals is selected, and the treatment is given to this sample. Following treatment, the average score for the sample is M = 70. Q: Is there evidence that the training has an effect on test scores?
H0: μ = 65 (After special training, the mean is still 65.)
H1: μ ≠ 65 (After training, the mean is different from 65.)
α = .05 (5% risk of committing a Type I error if we reject H0.)
Draw your graph. With α = .05, the critical region consists of sample means that correspond to z-scores beyond the critical boundaries of z = +/-1.96.
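σM = σ/√n = 15/√25 = 3
z = (M − μ)/σM = (70 − 65)/3 = 1.67
The obtained z = 1.67 lies inside the boundaries of +/-1.96, so the sample mean is not in the critical region.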
Thus, we fail to reject (i.e., we retain) the H0. The data do not provide sufficient evidence that the special training changes test scores.
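The same computation can be sketched in Python as a check on the arithmetic. The variable names are illustrative; the numbers are the ones given in the problem (μ = 65, σ = 15, n = 25, M = 70, α = .05, two-tailed).

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 65, 15        # known population before treatment
n, sample_mean = 25, 70   # treated sample
alpha = 0.05

standard_error = sigma / sqrt(n)              # 15 / 5 = 3
z = (sample_mean - mu) / standard_error       # 5 / 3, about 1.67
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for a two-tailed test

decision = "reject H0" if abs(z) > z_crit else "fail to reject H0"
print(f"z = {z:.2f}, critical z = +/-{z_crit:.2f}: {decision}")
```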
The larger the difference between the sample mean and the population mean is, the larger the z-score will be, and the greater the likelihood of finding a significant treatment effect.
The larger the variability, the lower the likelihood of finding a significant treatment effect.
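These relationships follow directly from the z-score formula, z = (M − μ)/(σ/√n). A minimal sketch, starting from the reading-skills example (μ = 65, σ = 15, n = 25, M = 70) and changing one quantity at a time; the changed values (M = 75, σ = 30, n = 100) are hypothetical and chosen only for illustration.

```python
from math import sqrt

def z_score(sample_mean, mu, sigma, n):
    """z for a sample mean: mean difference divided by the standard error."""
    return (sample_mean - mu) / (sigma / sqrt(n))

baseline = z_score(70, 65, 15, 25)           # the reading-skills example
bigger_difference = z_score(75, 65, 15, 25)  # larger mean difference
more_variability = z_score(70, 65, 30, 25)   # larger standard deviation
larger_sample = z_score(70, 65, 15, 100)     # larger sample size

print(f"baseline:          z = {baseline:.2f}")
print(f"bigger difference: z = {bigger_difference:.2f}")
print(f"more variability:  z = {more_variability:.2f}")
print(f"larger sample:     z = {larger_sample:.2f}")
```

A larger mean difference or a larger sample moves z farther from zero, while more variability pulls z back toward zero.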
The larger the sample size, the greater the likelihood of finding a significant treatment effect.
4. DIRECTIONAL (ONE-TAILED) HYPOTHESIS TESTS
In a directional (one-tailed) hypothesis test, the statistical hypotheses (H0 and H1) specify either an increase or a decrease in the population mean score (i.e., they make a statement about the direction of the effect).
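To see what the directional prediction changes, the sketch below compares the critical boundary of a one-tailed test with that of a two-tailed test at the same alpha level. It reuses the reading-skills numbers (μ = 65, σ = 15, n = 25, M = 70) and assumes, purely for illustration, that the researcher had predicted an increase in scores (H1: μ > 65).

```python
from math import sqrt
from statistics import NormalDist

alpha = 0.05
two_tailed_crit = NormalDist().inv_cdf(1 - alpha / 2)  # alpha split across both tails
one_tailed_crit = NormalDist().inv_cdf(1 - alpha)      # all of alpha in one tail

z = (70 - 65) / (15 / sqrt(25))  # about 1.67

print(f"two-tailed boundary: +/-{two_tailed_crit:.3f}, reject H0: {abs(z) > two_tailed_crit}")
print(f"one-tailed boundary: +{one_tailed_crit:.3f}, reject H0: {z > one_tailed_crit}")
```

Because the whole critical region sits in one tail, its boundary is closer to the center, so the same sample mean can be significant in a one-tailed test and not in a two-tailed test.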
Effect size is intended to provide a measurement of the absolute magnitude of a treatment effect, independent of the size of the sample(s) being used. Cohen's d = mean difference / standard deviation (of population)
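A short sketch of Cohen's d, again using the reading-skills numbers (μ = 65, σ = 15, M = 70) as an illustration. The function names are made up for this example, and the second sample size (n = 100) is hypothetical; it is there only to show that d stays the same while z changes with n.

```python
from math import sqrt

def cohens_d(sample_mean, mu, sigma):
    """Mean difference expressed in population standard deviation units."""
    return (sample_mean - mu) / sigma

def z_score(sample_mean, mu, sigma, n):
    """z for a sample mean; unlike d, it depends on the sample size."""
    return (sample_mean - mu) / (sigma / sqrt(n))

d = cohens_d(70, 65, 15)
print(f"Cohen's d = {d:.2f}")

for n in (25, 100):
    print(f"n = {n}: z = {z_score(70, 65, 15, n):.2f}, d = {cohens_d(70, 65, 15):.2f}")
```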
SEATWORK
A psychologist is investigating the hypothesis that children who grow up as the only child in the household develop different personality characteristics than those who grow up in larger families. A sample of n = 30 only children is obtained and each child is given a standardized personality test. For the general population, scores on the test form a normal distribution with a mean of μ = 50 and a standard deviation of σ = 15. If the mean for the sample is M = 58, can the researcher conclude that there is a significant difference in personality between only children and the rest of the population? Use a two-tailed test with α = .05. What about if you use α = .01?