Venue
Conference Room, Health Economics Unit, Ministry of Health and Family Welfare, 14/2 Topkhana Road (4th level, Room # 311), Dhaka 1000
Day # 2
Session Outline
1. Nonparametric Hypothesis Testing
   a. Binomial Test
   b. Two-Independent-Samples Tests
   c. Tests for Several Independent Samples
   d. Two-Related-Samples Tests
   e. Tests for Several Related Samples
Binomial Test
Introduction
The Binomial Test procedure compares the observed frequencies of the two categories of a dichotomous variable to the frequencies that are expected under a binomial distribution with a specified probability parameter. By default, the probability parameter for both groups is 0.5. To change the probabilities, you can enter a test proportion for the first group. The probability for the second group will be 1 minus the specified probability for the first group.
Example
When you toss a dime, the probability of a head equals 1/2. Based on this hypothesis, a dime is tossed 40 times, and the outcomes are recorded (heads or tails). From the binomial test, you might find that 3/4 of the tosses were heads and that the observed significance level is small (0.0027). These results indicate that it is not likely that the probability of a head equals 1/2; the coin is probably biased.
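As a quick check outside SPSS, the dime example can be reproduced with scipy.stats (this assumes SciPy is available; binomtest returns the exact two-sided p-value, which is in the same neighborhood as the asymptotic 0.0027 quoted above):

```python
from scipy import stats

# 30 heads in 40 tosses (3/4 of the tosses), testing H0: p(heads) = 0.5
res = stats.binomtest(30, n=40, p=0.5)

# A p-value this small suggests the coin is probably biased.
print(res.pvalue)
```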
Statistics
Mean, standard deviation, minimum, maximum, number of nonmissing cases, and quartiles.
Data
The variables that are tested should be numeric and dichotomous. To convert string variables to numeric variables, use the Automatic Recode procedure, which is available on the Transform menu. A dichotomous variable is a variable that can take only two possible values: yes or no, true or false, 0 or 1, and so on. The first value encountered in the dataset defines the first group, and the other value defines the second group. If the variables are not dichotomous, you must specify a cut point. The cut point assigns cases with values that are less than or equal to the cut point to the first group and assigns the rest of the cases to the second group.
Assumptions
Nonparametric tests do not require assumptions about the shape of the underlying distribution. The data are assumed to be a random sample.
Missing Values. Controls the treatment of missing values. Exclude cases test-by-test. When several tests are specified, each test is evaluated separately for missing values. Exclude cases listwise. Cases with missing values for any variable are excluded from all analyses.
Analyze > Nonparametric Tests > Binomial...
Select Churn within last month as the test variable. Enter 0.27 as the test proportion. Click Options. Select Descriptive. Click Continue. Click OK in the Binomial Test dialog box.
Because Churn within last month is a dichotomous variable, the mean tells us the proportion of churn within each customer type. Multiplying these proportions by 100 expresses the same data as percentages. For example, the percentage of churn for customers subscribing only to basic service was 31%. Similarly, customers who prefer more high-end electronic services churned at a rate of about 27% within the last month. There are about 280 customers who subscribe to a set of convenience services (three-way calling, call forwarding, call waiting, etc.). Of these, only 16% recently churned. Customers who take advantage of all of the services offered by the firm churned the most: 37%, about 10 percentage points higher than the churn rate across all customers within the last month.
Each panel of the binomial test table displays one binomial test. For example, the first panel displays the test of the null hypothesis that the proportion of churn for Basic service users is the same as the proportion of churn in the total sample. Of the 266 Basic service customers, 83 churned within the last month. The Observed Prop. column here shows that these 83 customers account for 31% of the total Basic service group in this sample. The test proportion of 0.27 suggests that we should expect 0.27 * 266, or about 72 customers, to have churned. The asymptotic significance value is 0.07, which is above the conventional cutoff for statistical significance (0.05). By that standard, you cannot reject the null hypothesis that the churn rate for basic service customers is equal to the churn rate in the sample at large. The same cannot be said for Plus service customers, however. In this case, the proportion, 0.16, is significantly lower than the test proportion. Many fewer Plus service customers found another service provider last month. At the other extreme, significantly more Total service customers were lost last month than the test proportion predicts.
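The Basic service panel can be checked with scipy.stats (a sketch, not the SPSS output; SPSS reports a one-tailed asymptotic significance here, so the exact test uses alternative='greater'):

```python
from scipy import stats

# 83 of 266 Basic service customers churned; the test proportion is 0.27.
res = stats.binomtest(83, n=266, p=0.27, alternative='greater')

# Observed proportion: 83 / 266, about 0.31, as in the Observed Prop. column.
observed_prop = 83 / 266
```

The exact one-tailed p-value lands near the asymptotic 0.07 reported in the table, above the conventional 0.05 cutoff, so the null hypothesis is not rejected for this group.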
Using the Binomial Test procedure, you have determined that the rate of churn differs across customer types. Now that Total Service customers have been identified as high-risk, you can focus further efforts on finding out why these customers are dissatisfied.
Select Quartiles. Click Continue. Click OK in the Binomial Test dialog box.
The descriptives table displays the quartiles for each churn group. Generally, customers who churned last month have lower household incomes.
The binomial test table is split by the values of Churn within last month. The first test selects only those customers in the sample who did not churn last month. Within this first split file group, the cut point has created two groups. The first group consists of those customers who did not churn and whose household income is less than or equal to the median for the total sample. In these data, just about half of those who did not churn fall at or below median income. As we would expect, the difference in proportions is not significant.
However, of those 274 customers who churned last month, the proportion with household incomes at or below the median is significantly different from the null hypothesis value. Those who churn tend to be the less affluent customers. Using a cut point to define the groups, you have found that a majority of the customers who churned within the last month fall below the median household income. Now that these customers have been identified as high-risk, you can focus further efforts on determining why these customers are dissatisfied.
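The cut-point logic can be sketched in a few lines (a sketch with scipy.stats, not SPSS output; the count of 165 below-median churners is hypothetical, since the text reports only that a majority of the 274 churned customers fall at or below the median):

```python
from scipy import stats

# Hypothetical count: suppose 165 of the 274 churned customers have
# household incomes at or below the overall sample median (the cut point).
n_low, n = 165, 274

# Under the null hypothesis, about half of the cases should fall at or
# below the sample median, so the test proportion is 0.5.
res = stats.binomtest(n_low, n=n, p=0.5)
```

A count this lopsided yields a small p-value, matching the document's conclusion that churners are disproportionately the less affluent customers.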
Two-Independent-Samples Tests
Introduction
The Two-Independent-Samples Tests procedure compares two groups of cases on one variable.
Example
New dental braces have been developed that are intended to be more comfortable, to look better, and to provide more rapid progress in realigning teeth. To find out whether the new braces have to be worn as long as the old braces, 10 children are randomly chosen to wear the old braces, and another 10 children are chosen to wear the new braces. From the Mann-Whitney U test, you might find that, on average, children with the new braces did not have to wear the braces as long as children with the old braces.
Statistics
Mean, standard deviation, minimum, maximum, number of nonmissing cases, and quartiles. Tests: Mann-Whitney U, Moses extreme reactions, Kolmogorov-Smirnov Z, Wald-Wolfowitz runs.
Data
Use numeric variables that can be ordered.
Assumptions
Use independent, random samples. The Mann-Whitney U test tests equality of two distributions. In order to use it to test for differences in location between two distributions, one must assume that the distributions have the same shape.
Select one or more numeric variables. Select a grouping variable and click Define Groups to split the file into two groups or samples.
Moses extreme reactions. This test compares extreme responses in an experimental group with the control group. The control group is defined by the group 1 value in the Two-Independent-Samples Define Groups dialog box. Observations from both groups are combined and ranked. The span of the control group is computed as the difference between the ranks of the largest and smallest values in the control group plus 1. Because chance outliers can easily distort the range of the span, 5% of the control cases are trimmed automatically from each end.
Missing Values. Controls the treatment of missing values. Exclude cases test-by-test. When several tests are specified, each test is evaluated separately for missing values. Exclude cases listwise. Cases with missing values for any variable are excluded from all analyses.
The results are collected in the file adl.sav. Use the Mann-Whitney test to determine whether the two groups' abilities differ. To begin the analysis, from the menus choose:
Analyze > Nonparametric Tests > 2 Independent Samples...
Select Travel ADL, Cooking ADL, and Housekeeping ADL as the test variables. Select Treatment group as the grouping variable. Click Define Groups. Type 0 as the group 1 value and 1 as the group 2 value. Click Continue. Click OK in the Two-Independent-Samples Tests dialog box.
Because the test variables are assumed to be ordinal, the Mann-Whitney and Wilcoxon tests are based on ranks of the original values and not on the values themselves. The rank table is divided into three panels, one panel for each test variable. The first test variable, Travel ADL, measures the ability to regularly get around the community. It ranges from 0 to 4, where 0 = Same as before illness and 4 = Bedridden. All 46 women in the control group and all 54 women in the treatment group provided valid data for this variable.
First, each case is ranked without regard to group membership. Cases tied on a particular value receive the average rank for that value. After ranking the cases, the ranks are summed within groups. Average ranks adjust for differences in the number of patients in both groups. If the groups are only randomly different, the average ranks should be about equal. For Travel ADL, the average ranks are over 9 points apart. The test variables Cooking ADL and Housekeeping ADL contain missing data. For these variables, the value 4 = Never did any; thus, these scales do not apply to all patients. However, for those to whom they do apply, there are differences of about 12 to 13 points between the average ranks of the treatment and control groups.
The U statistic is simple (but tedious) to calculate. For each case in group 1, the number of cases in group 2 with higher ranks is counted. Tied ranks count as 1/2. This process is repeated for group 2. The Mann-Whitney U statistic displayed in the table is the smaller of these two values. The Wilcoxon W statistic is simply the smaller of the two rank sums displayed for each group in the rank table. The values displayed here are the rank sums for the treatment group. A nice feature of the Mann-Whitney and Wilcoxon tests is that the Z statistic and normal distribution provide an excellent approximation as the sample size grows beyond 10 in either group. The negative Z statistics indicate that the rank sums are lower than their expected values. Each two-tailed significance value estimates the probability of obtaining a Z statistic at least as extreme (in absolute value) as the one displayed, if there truly is no effect of the treatment.
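The counting procedure just described can be verified directly (a sketch with made-up ordinal scores, using scipy.stats rather than SPSS; depending on the SciPy version, mannwhitneyu may report U for the first sample rather than the smaller U, so the comparison uses the minimum of the two):

```python
from scipy import stats

# Hypothetical ordinal scores for two small groups
g1 = [12, 15, 11, 18, 20]
g2 = [14, 22, 25, 17, 30]

# For each case in one group, count the cases in the other group that it
# exceeds; ties count as 1/2.
def u_stat(a, b):
    return sum((x > y) + 0.5 * (x == y) for x in a for y in b)

u1, u2 = u_stat(g1, g2), u_stat(g2, g1)
u = min(u1, u2)          # the U that SPSS displays

res = stats.mannwhitneyu(g1, g2, alternative='two-sided')
```

Note that u1 + u2 always equals the product of the two sample sizes, which is a handy arithmetic check on the hand count.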
The significantly lower rank sums of the treatment group indicate to the physicians that the additional emotional therapy had some beneficial effect on such activities of daily life as cooking and cleaning.
Example
Do three brands of 100-watt lightbulbs differ in the average time that the bulbs will burn? From the Kruskal-Wallis one-way analysis of variance, you might learn that the three brands do differ in average lifetime.
Statistics
Mean, standard deviation, minimum, maximum, number of nonmissing cases, and quartiles. Tests: Kruskal-Wallis H, median.
Data
Use numeric variables that can be ordered.
Assumptions
Use independent, random samples. The Kruskal-Wallis H test requires that the tested samples be similar in shape.
Select one or more numeric variables. Select a grouping variable and click Define Range to specify minimum and maximum integer values for the grouping variable.
Missing Values. Controls the treatment of missing values. Exclude cases test-by-test. When several tests are specified, each test is evaluated separately for missing values. Exclude cases listwise. Cases with missing values for any variable are excluded from all analyses.
Click Define Range. Type 1 as the minimum and 3 as the maximum value. Click Continue. Click Options in the Tests for Several Independent Samples dialog box. Select Quartiles in the Statistics group. Click Continue. Click OK in the Tests for Several Independent Samples dialog box.
Across all 60 subjects, the median performance on the exam is a score just below 75. The null hypothesis for the median test is that this particular value is a good approximation of center for each of the three training groups.
To test this hypothesis, each group is divided into two subgroups: those whose scores fall at or below the median, and those whose scores are above it. The result is a two-way frequency table with two rows and g columns, where g is the number of categories in your grouping variable. In this table, for example, the first cell is a count of the number of employees who received standard training and scored above the median. While the null hypothesis would predict that about 10 subjects scored above the median, only four subjects in this group did so.
In addition to standard training, group 2 also received some technical training. Unlike the other groups, the median for all trainees does what the null hypothesis says it should do: it nearly divides this group into two equal subgroups.
In the final training group, those with exam scores greater than the median outnumber those at or below it by a margin of three to one. As with group 1, the null hypothesis does not provide a good approximation of center for these trainees.
From this two-way frequency table, a chi-square statistic can be calculated to test the null hypothesis of row and column independence. In fact, the median test is a chi-square test of independence between group membership and the proportion of cases above and below the median.
The chi-square value is obtained in the usual fashion for two-way tables. For each cell, the distance between the observed and expected counts is squared, then divided by the expected value. Finally, these quantities are summed across all cells. For this table, the value is 12.4. Degrees of freedom for the frequency table are equal to (rows - 1) * (columns - 1). In this case, that is 1 * 2 = 2. The asymptotic significance tells us how often we can expect a chi-square value at least as large as 12.4 in similar repeated samples, if there really is no relationship between the median and group membership. The probability is very low: about two times per thousand.
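The arithmetic can be sketched with hypothetical counts patterned on the description above (4 of 20 above the median in group 1, an even split in group 2, a 3-to-1 margin in group 3); scipy.stats is assumed, and these invented counts will not reproduce the table's 12.4 exactly:

```python
import numpy as np
from scipy import stats

# Hypothetical 2 x 3 counts: rows = above / at-or-below the grand median,
# columns = the three training groups (20 trainees each)
obs = np.array([[ 4, 10, 15],    # above the median
                [16, 10,  5]])   # at or below the median

# Standard chi-square test of row-column independence:
# sum of (observed - expected)^2 / expected over all six cells
chi2, p, df, expected = stats.chi2_contingency(obs, correction=False)
```

Degrees of freedom come out to (2 - 1) * (3 - 1) = 2, matching the calculation in the text.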
From this analysis, the manager learns that type of training resulted in different median scores between the groups. Trainees who received the hands-on tutorial have a higher median value than either their counterparts who received standard training or those who received additional technical training.
The Kruskal-Wallis test uses ranks of the original values and not the values themselves. That's appropriate in this case, because the scale used by the taste-testers is ordinal.
First, each case is ranked without regard to group membership. Cases tied on a particular value receive the average rank for that value. After ranking the cases, the ranks are summed within groups.
The Kruskal-Wallis statistic measures how much the group ranks differ from the average rank of all groups. The chi-square value is obtained by squaring each group's distance from the average of all ranks, weighting by its sample size, summing across groups, and multiplying by a constant. The degrees of freedom for the chi-square statistic are equal to the number of groups minus one. The asymptotic significance estimates the probability of obtaining a chi-square statistic greater than or equal to the one displayed, if there truly are no differences between the group ranks. A chi-square of 9.751 with 2 degrees of freedom should occur only about 8 times per 1,000.

The table tells us the ratings of the strawberries differed by type of mulch used for cultivation. Like the F test in standard ANOVA, Kruskal-Wallis does not tell us how the groups differed, only that they are different in some way. The Mann-Whitney test could be used for pairwise comparisons.
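A minimal Kruskal-Wallis sketch with hypothetical ratings (scipy.stats assumed; with no ties and fully separated groups, the H statistic can be computed exactly by hand and checked against the library):

```python
from scipy import stats

# Hypothetical ordinal taste ratings for strawberries grown under
# three mulch types; deliberately well separated
mulch_a = [1, 2, 3, 4, 5]
mulch_b = [6, 7, 8, 9, 10]
mulch_c = [11, 12, 13, 14, 15]

# H measures how far each group's average rank sits from the overall
# average rank, weighted by group size
h, p = stats.kruskal(mulch_a, mulch_b, mulch_c)
```

For these 15 fully ordered values, H works out to 12.5 by the hand formula (12 / (N(N+1)) times the weighted sum of squared rank-sum deviations, minus 3(N+1)), with 2 degrees of freedom for the three groups.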
Example
In general, do families receive the asking price when they sell their homes? By applying the Wilcoxon signed-rank test to data for 10 homes, you might learn that seven families receive less than the asking price, one family receives more than the asking price, and two families receive the asking price.
Statistics
Mean, standard deviation, minimum, maximum, number of nonmissing cases, and quartiles. Tests: Wilcoxon signed-rank, sign, McNemar. If the Exact Tests option is installed (available only on Windows operating systems), the marginal homogeneity test is also available.
Data
Use numeric variables that can be ordered.
Assumptions
Although no particular distributions are assumed for the two variables, the population distribution of the paired differences is assumed to be symmetric.
Missing Values. Controls the treatment of missing values. Exclude cases test-by-test. When several tests are specified, each test is evaluated separately for missing values. Exclude cases listwise. Cases with missing values for any variable are excluded from all analyses.
In the Wilcoxon test, ranks are based on the absolute value of the difference between the two test variables. The sign of the difference is used to classify cases into one of three groups: differences below 0 (negative ranks), above 0 (positive ranks), or equal to 0 (ties). Tied cases are ignored. In these data, 5 cases have negative differences whose absolute values are ranked 3, 4, 6, 7, and 11 among all differences. The sum of these ranks equals 31. The other cases have positive differences, whose ranks sum to 60.
Z is a standardized measure of the distance between the rank sum of the negative group and its expected value. The expected rank sum is 45.5, half the sum of all ranks. The standard deviation is 14.31. The negative group rank sum equals 31, so the Z statistic is (31 - 45.5) / 14.31, or -1.013. The two-tailed asymptotic significance estimates the probability of obtaining a Z statistic at least as extreme (in absolute value) as the one displayed, if there truly is no difference between the group ranks. In this case, the probabilities for both tests are above any reasonable cutoff. From this analysis, the investment analyst can breathe a sigh of relief. His technology stock holdings did not underperform on a daily basis in the years 2000 and 2001, compared to the median daily performance of all other stocks on the S&P 500 over the same period.
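The Z computation above follows directly from the standard formulas for a Wilcoxon rank sum over n nonzero differences, and can be reproduced with plain arithmetic:

```python
import math

n = 13            # number of nonzero paired differences (5 negative + 8 positive)
w_neg = 31        # rank sum of the negative differences, from the text

# Under H0 the expected rank sum is half the total of all ranks: n(n+1)/4
expected = n * (n + 1) / 4

# Standard deviation of the rank sum: sqrt(n(n+1)(2n+1)/24)
sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)

z = (w_neg - expected) / sd
```

The results reproduce the values quoted in the text: an expected rank sum of 45.5, a standard deviation of about 14.31, and Z of about -1.013.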
The McNemar test focuses on change from one condition or one sample response to another. In this example, the null hypothesis is that the promotion would have no effect; customers would be equally likely to change preferences from one brand to another. In this table, you can see that 26 customers preferred the store brand before seeing the promotion but not afterwards. This is a change in buying behavior that the manager would like not to attribute to the promotion. On the other hand, 48 customers said that they did not prefer the store detergent prior to the promotion but did prefer it afterwards. This is a change in the direction that would certainly please the store manager.
The McNemar chi-square is computed using only the two cells of the previous table where customers changed their preferences from before to after the promotion. A continuity correction is applied because the continuous chi-square distribution is used to approximate a discrete distribution. The asymptotic significance is the approximate probability of obtaining a chi-square statistic as extreme as 5.959 in repeated samples, if the frequencies of the two change conditions are only randomly different. Because a chi-square this large is unlikely to have arisen by chance, the manager rejects the null hypothesis of no difference in favor of her hypothesis that the promotion had a favorable effect.
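The continuity-corrected statistic follows directly from the two discordant cells (26 and 48); a short check, using scipy.stats only for the chi-square tail probability:

```python
from scipy import stats

# Discordant cells: 26 switched away from the store brand,
# 48 switched toward it after the promotion
b, c = 26, 48

# Continuity-corrected McNemar statistic: (|b - c| - 1)^2 / (b + c)
chi2 = (abs(b - c) - 1) ** 2 / (b + c)

# Asymptotic significance from the chi-square distribution with 1 df
p = stats.chi2.sf(chi2, df=1)
```

This reproduces the 5.959 quoted above, with a p-value comfortably below 0.05.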
Example
Does the public associate different amounts of prestige with a doctor, a lawyer, a police officer, and a teacher? Ten people are asked to rank these four occupations in order of prestige. Friedman's test indicates that the public does associate different amounts of prestige with these four professions.
Statistics
Mean, standard deviation, minimum, maximum, number of nonmissing cases, and quartiles. Tests: Friedman, Kendall's W, and Cochran's Q.
Data
Use numeric variables that can be ordered.
Assumptions
Nonparametric tests do not require assumptions about the shape of the underlying distribution. Use dependent, random samples.
To begin the analysis, from the menus choose:
Analyze > Nonparametric Tests > K Related Samples...
Select all of the variables, from Registered warranty data to Edited database information, as the test variables. Deselect Friedman, and select Cochran's Q. Click Statistics. Select Descriptive. Click Continue. Click OK in the Tests for Several Related Samples dialog box.
The only possible outcomes for each task were 0 (Failure) or 1 (Success). Therefore, the means measure the proportion of users who succeeded at each task. For example, all five users were able to register their warranty data, but none could successfully add a question to the support list.
The frequency table summarizes the number of observations of success or failure at each task. Because the null hypothesis would predict that each task had the same number of successes, you can sense that perhaps that hypothesis is not supported by this pattern of frequencies.
Cochran's Q statistic is a chi-square variate formed by a ratio of the variation in success across tasks to the variation in success within subjects. Based on the statistics and frequency tables, you would expect a large statistic because you observed quite a bit of variation in success by task. Degrees of freedom for this chi-square are equal to the number of test variables minus 1. There were six tasks, so there are five degrees of freedom. The asymptotic significance is the approximate probability of obtaining a chi-square statistic as extreme as 12.949 in repeated samples if the frequencies of task success are only randomly different. Because a chi-square this large is unlikely to have arisen by chance, the design team rejects the null hypothesis that all tasks have an equal number of successes. Clearly, users had difficulty interacting with the support list, as well as the fax and newsletter request pages.
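Cochran's Q can be computed from the usual column-total and row-total formula; the success/failure matrix below is hypothetical (the text reports only that all five users succeeded at the first task and none at the support-list task), so the resulting Q differs from the 12.949 in the table:

```python
import numpy as np
from scipy import stats

# rows = 5 users, cols = 6 tasks; 1 = success, 0 = failure (hypothetical)
x = np.array([
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0],
    [1, 1, 0, 0, 1, 0],
])
k = x.shape[1]                 # number of tasks
col = x.sum(axis=0)            # successes per task (column totals)
row = x.sum(axis=1)            # successes per user (row totals)

# Q = (k-1) [k * sum(C_j^2) - N^2] / [k * N - sum(R_i^2)], N = total successes
q = (k - 1) * (k * (col**2).sum() - col.sum()**2) / (k * row.sum() - (row**2).sum())
df = k - 1
p = stats.chi2.sf(q, df)
```

With six tasks there are five degrees of freedom, exactly as in the text.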
Analyze > Nonparametric Tests > K Related Samples...
Select PPO plan 1 through HMO plan 2 as the test variables. Click OK.
The Friedman test ranks the scores in each row of the data file independently of every other row. In this example, each employer has already performed this ranking. For each plan, these ranks are summed and then divided by the number of employers to yield an average rank for each plan. In this table, you can see that the 12 employers tended to rank PPO plan 2 more highly than the other three plans.
The Friedman chi-square tests the null hypothesis that the ranks of the variables do not differ from their expected value. For a constant sample size, the higher the value of this chi-square statistic, the larger the difference between each variable's rank sum and its expected value. For these rankings, the chi-square value is 10.3. Degrees of freedom are equal to the number of variables minus 1. Because four health plans were being ranked, there are three degrees of freedom. The asymptotic significance is the approximate probability of obtaining a chi-square statistic as extreme as 10.3 with three degrees of freedom in repeated samples if the rankings of each health plan are not truly different.
Because a chi-square of 10.3 with three degrees of freedom is unlikely to have arisen by chance, the insurer concludes that the 12 employers do not have equal preference for all four health care plans.
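The Friedman statistic can be sketched with hypothetical rankings (five raters rather than the 12 employers above, so the statistic will not match the 10.3 in the table; scipy.stats.friedmanchisquare expects one sequence per plan):

```python
from scipy import stats

# Hypothetical rankings: each row is one rater's ranking of four plans
ranks = [
    [2, 1, 3, 4],
    [2, 1, 3, 4],
    [1, 2, 3, 4],
    [2, 1, 4, 3],
    [2, 1, 3, 4],
]

# friedmanchisquare takes one sample per treatment, so transpose to columns
cols = list(zip(*ranks))
stat, p = stats.friedmanchisquare(*cols)
```

Because the inputs are already ranks 1 through 4 with no ties, the library's within-row ranking leaves them unchanged, and the statistic equals the hand formula 12/(nk(k+1)) * sum of squared column rank sums minus 3n(k+1).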