N is the total number of observations.
sr is the smallest of the four marginal totals, so sr =< sc.
sc is the smaller of the two column totals, so sc =< Int[N/2].
X is the number of observations in the left uppermost cell; it runs from 0 [1] sr.

These four numbers are sufficient to generate all the unique tables for a given N (which form an isomarginal family). For each N the first block of tables is for sr = 1 with sc = 1 [1] Int[N/2], then for sr = 2 with sc = sr [1] Int[N/2].

p(table) is the Hypergeometric Probability associated with the table.
LHS(p) is the Cumulative Probability starting from the left of the distribution's curve.
RHS(p) is the Cumulative Probability starting from the right of the distribution's curve.
Same Side: this column gives the Cumulative Probability for the side that the table is in, i.e. the sum of the probabilities up to and including that table. It switches sides after the Maximum Value(s). If the table corresponds to the Maximum Value it is entered as exactly 1.
Other Side: this column gives a Cumulative Probability for the side that the table is not in, i.e. the sum of the probabilities on that side that are less than or equal to the value for that table. It also switches sides. If the table corresponds to the Maximum Value it is entered as exactly 0.
The p-value for the table is the sum of these two side probabilities. The Same Side value is the one-sided probability as defined by Fisher; he suggested doubling it if a two-sided probability was required.

Included are two other forms for the tables, with row totals A >= B and first column cell members a >= b, and similarly A' >= B', a' >= b'. When sr = sc there is only one form [c.f. Finney, Latscha, Bennett & Hsu, Tables for Testing Significance in a 2X2 Contingency Table, Cambridge U.P.].

The rows were generated in Excel using a series of blocks of tables which, apart from the Values and Formulas used in the first row of the first block, are a copy of the previous block plus a copy of the last row.
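The column definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Excel formulas used for the tables: the function name `table_probabilities` and the dictionary keys are hypothetical, and the reading of "Other Side" (the sum of the probabilities on the opposite side of the Maximum Value that do not exceed p(table)) is an interpretation of the wording.

```python
from math import comb


def table_probabilities(N, sr, sc, X):
    """Columns of the look-up table for one (N, sr, sc, X) entry (a sketch)."""
    lo, hi = max(0, sr + sc - N), min(sr, sc)  # feasible values of the top-left cell
    # Integer weights; dividing by comb(N, sr) turns them into probabilities.
    w = {x: comb(sc, x) * comb(N - sc, sr - x) for x in range(lo, hi + 1)}
    total = comb(N, sr)
    mode = max(w, key=w.get)  # first x carrying the Maximum Value
    left = sum(v for x, v in w.items() if x <= X)
    right = sum(v for x, v in w.items() if x >= X)
    if w[X] == w[mode]:
        # A table at the Maximum Value is entered as exactly 1 and 0.
        same, other = total, 0
    elif X < mode:
        same = left
        other = sum(v for x, v in w.items() if x > mode and v <= w[X])
    else:
        same = right
        other = sum(v for x, v in w.items() if x < mode and v <= w[X])
    return {
        "p_table": w[X] / total,
        "lhs": left / total,
        "rhs": right / total,
        "same_side": same / total,
        "other_side": other / total,
        "p_value": (same + other) / total,
    }


result = table_probabilities(39, 15, 17, 10)
print(result)
```

The weights are kept as exact integers so that the "less than or equal to" comparison for the Other Side is not affected by floating-point rounding. For the entry (39, 15, 17, 10) the results agree with the tabulated p(table), LHS(p), RHS(p) and p-value to the precision quoted in the first worked example.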
The format for the tables was motivated by Goyette & Mickeys, Health Sciences Computing Facility UCLA, as reproduced in Dixon & Mason (1969) Introduction to Statistical Analysis, McGraw Hill. The Tables appear to be unique in that they give the various cumulative probabilities, the p-value and up to three versions of all the tables in the iso-marginal family in a straightforward look-up table, albeit the user has to arrange their table of observations to match one they can look up.
N, sr, sc, X
2X2 Contingency tables arise when observing counts of dichotomous attributes. The observations are written down quite naturally as a 2X2 table, i.e. two rows of two columns, typically giving

    n1    n2
    n3    n4
The required values (N,sr,sc,X) for entering the tables are found by following these steps :-
[ If there are ties for the smallest/smaller marginal total they each have the same initial status & the way to deal with this is to initially take the first eligible one as the smallest ]
    n1       n2      | n1+n2
    n3       n4      | n3+n4
   n1+n3    n2+n4
a c
b d
The two NEW column totals are examined: the smaller one is LABELLED sc and the other is the second column total. The cells are filled in by using the values that give these NEW marginal totals.
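The rearrangement can be sketched as a small Python function. It is a minimal illustration reconstructed from the worked examples, not the author's own procedure: the name `table_entry` is hypothetical, ties are resolved by leaving the table unchanged (one reading of "take the first eligible one"), and the diagonal check mentioned in the second example is not implemented.

```python
def table_entry(n1, n2, n3, n4):
    """Return (N, sr, sc, X) for the observed table n1 n2 / n3 n4 (a sketch)."""
    cells = [[n1, n2], [n3, n4]]
    r = [n1 + n2, n3 + n4]  # row totals
    c = [n1 + n3, n2 + n4]  # column totals
    # Assumption: if the smallest marginal total is a column total,
    # interchange rows and columns so that it becomes a row total.
    if min(c) < min(r):
        cells = [[cells[0][0], cells[1][0]], [cells[0][1], cells[1][1]]]
        r, c = c, r
    # Row swap: the smallest marginal total becomes the first row total sr.
    if r[1] < r[0]:
        cells = [cells[1], cells[0]]
        r = [r[1], r[0]]
    # Column swap: the smaller column total becomes the first column total sc.
    if c[1] < c[0]:
        cells = [[cells[0][1], cells[0][0]], [cells[1][1], cells[1][0]]]
        c = [c[1], c[0]]
    return sum(r), r[0], c[0], cells[0][0]


print(table_entry(17, 7, 5, 10))
```

Applied to the observations 17, 7 and 5, 10 this performs one row swap and one column swap and returns the entry values 39, 15, 17, 10.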
STEP 1
The Observations are 17, 7 and 5, 10

    17      7     | r1=24
     5     10     | r2=15
  c1=22  c2=17
As row two gives the smallest marginal total a row swop is needed
     5     10     | sr=15
    17      7     | r2=24
  c1=22  c2=17
As column two now gives the smaller total a column swop is needed
   X=10     5     | sr=15
     7     17     | r2=24
  sc=17  c2=22      N=39
The values for N, sr, sc, X can be read off as 39, 15, 17, 10 to give
p(table) 0.02037
LHS(p) 0.995998
RHS(p) 0.024373
p-value 0.04478
Kruger et al give a value of 0.0244 and 0.0448 for their p1 and p2.
Lieberman & Owen give for (39, 17, 15, 10) - P(10) = 0.995997 and p(10) = 0.020371 which correspond to the LHS(p) and p(table). The values for the cells and marginal totals of the two other tables are given by Finney et al.
Form 1: A = 24, B = 15, a = 17, b = 5
Form 2: A' = 22, B' = 17, a' = 17, b' = 7

General layout:

     a     A-a        | A
     b     B-b        | B
    a+b    A+B-a-b

Form 1:

    17      7     | 24
     5     10     | 15
    22     17

Form 2:

    17      5     | 22
     7     10     | 17
    24     15
These are the forms given by Finney et al. The first one corresponds to that of Kruger et al; Finney et al give a probability of 0.0244 for it, and the same value for the other form of the table.
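That both forms carry the same probability can be checked with a short Python sketch. The function names `finney_table` and `upper_tail` are hypothetical, and the tail computed here (the probability of a first cell at least as large as a, with all marginal totals fixed) is an assumption about which quantity is being compared, not a description of how Finney et al built their tables.

```python
from math import comb


def finney_table(A, B, a, b):
    """Fill in the layout a, A-a / b, B-b with its row and column totals."""
    cells = [[a, A - a], [b, B - b]]
    return cells, (A, B), (a + b, A + B - a - b)


def upper_tail(A, B, a, b):
    """P(first cell >= a) under the hypergeometric distribution (a sketch)."""
    m = a + b  # first column total, held fixed
    hi = min(A, m)
    num = sum(comb(A, k) * comb(B, m - k) for k in range(a, hi + 1))
    return num / comb(A + B, m)


print(finney_table(24, 15, 17, 5))
print(round(upper_tail(24, 15, 17, 5), 4), round(upper_tail(22, 17, 17, 7), 4))
```

The two forms are the same set of four cells with rows and columns interchanged, so the tail probability does not depend on which form is used.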
STEP 1
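     8      2     | r1=10
     4      6     | r2=10
  c1=12   c2=8

The smallest marginal total is c2=8; rearranging the table so that this total becomes the first row total sr gives

   X=2      6     | sr=8
     8      4     | r2=12
  sc=10  c2=10      N=20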
Since two of the marginal totals were the same, the diagonal check is performed. As no further changes need to be made, the table entry values N, sr, sc, X can be read off as 20, 8, 10, 2 to give
p(table) 0.07502
LHS(p) 0.084901 RHS(p) 0.9901167
Same Side 0.0849012 Other Side 0.0849012
p-value 0.1698
Kruger et al give 0.0849 and 0.1698 for their p1 and p2. Lieberman & Owen give for (20, 10, 8, 2) P(2) = 0.084901 and p(2) = 0.075081, which correspond to [+] LHS(p) and p(table) = 0.0750178613955703. The tables of Kruger et al are nominally restricted to N = 80 but also to values of the one-sided probability >= 0.1 and values of the two sided probability >= 0.0001.

[+] The N, sr, sc, X tables are a pdf version of the original Excel tables, which are restricted to a smallest value of 2.229E-308. Stirling's formula for factorials was used for hypergeometric values where N > 1000 in a fuller set of the tables used for Sample Sizes.
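A sketch of how Stirling's formula can be used for the hypergeometric probability is given below in Python. The note above only says that Stirling's formula was used for N > 1000; the particular form (with the 1/(12n) correction term), the decision to work with logarithms, and the names `ln_factorial`, `ln_choose` and `ln_p_table` are all assumptions made for illustration.

```python
from math import exp, log, pi


def ln_factorial(n):
    """ln(n!) from Stirling's formula with the 1/(12n) correction (a sketch)."""
    if n < 2:
        return 0.0  # 0! = 1! = 1
    return n * log(n) - n + 0.5 * log(2 * pi * n) + 1 / (12 * n)


def ln_choose(n, k):
    """ln of the binomial coefficient C(n, k)."""
    return ln_factorial(n) - ln_factorial(k) - ln_factorial(n - k)


def ln_p_table(N, sr, sc, X):
    """ln of the hypergeometric probability of the table (N, sr, sc, X)."""
    return ln_choose(sc, X) + ln_choose(N - sc, sr - X) - ln_choose(N, sr)


print(exp(ln_p_table(2000, 1000, 1000, 500)))
print(ln_p_table(4000, 2000, 2000, 0))
```

Keeping the result as a logarithm means that a probability smaller than the 2.229E-308 limit can still be represented; the second line printed is such a case, where taking `exp` would return 0.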
Form 1: A = 12, B = 8, a = 8, b = 2
Form 2: A' = 10, B' = 10, a' = 6, b' = 2

Form 1:

     8      4     | 12
     2      6     | 8
    10     10

Form 2:

     6      4     | 10
     2      8     | 10
     8     12
These are the forms given by Finney et al; the second requires a column swop to match that of Kruger et al. Finney et al report the table as not significant.
STEP 1
     2     12     | r1=14
    15     26     | r2=41
  c1=17  c2=38      N=55
No changes need to be made and the table entry values N, sr, sc, X are read off as 55, 14, 17, 2 to give
p(table)
LHS(p) RHS(p)
Same Side Other Side 0.0749498
p-value 0.18289
Kruger et al do not include this table, for the reasons they give, nor is it included in the other tables.
STEP 1
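The Observations are 3, 14 and 15, 1

     3     14     | r1=17
    15      1     | r2=16
  c1=18  c2=15      N=33

The smallest marginal total is c2=15; rearranging the table so that this total becomes the first row total sr gives

    14      1     | sr=15
     3     15     | r2=18
  c1=17  c2=16      N=33

As column two now gives the smaller total a column swop is needed

   X=1     14     | sr=15
    15      3     | r2=18
  sc=16  c2=17      N=33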
p(table) 1.04902E-05
LHS(p) 1.06213E-05
RHS(p) 0.999999869
p-value 1.26037E-05
Lieberman & Owen give for (33, 16, 15, 1) P(1) = 0.000011 and p(1) = 0.000010, which correspond to LHS(p) and p(table).
Form 1: A = 18, B = 15, a = 15, b = 1
Form 2: A' = 17, B' = 16, a' = 14, b' = 1

Form 1:

    15      3     | 18
     1     14     | 15
    16     17

Form 2:

    14      3     | 17
     1     15     | 16
    15     18
Finney et al do not give these tables as they are not significant at any of the levels they use
REFERENCES :

Kruger, Lemacher & Wal: The Fourfold Table up to N=80. Verlag 1981, pp. 440 (Tables 410 pp.). The text is in German & English; n.b. it does not include all the tables.

Lieberman & Owen: Tables for the Hypergeometric Probability Distribution. Stanford U.P. 1961, pp. 726 (Tables 693 pp.). The Tables give two probabilities: a one-sided cumulative probability and the hypergeometric probability for the table. The distribution is characterised in terms of four variables: N, the number of items in a lot (the total number of observations); n, the number of items taken from the lot (corresponds to sc); k, the number of defective items in the lot (corresponds to sr); and x, the number of defective items observed in the sample (corresponds to X). The tables are for N = 2 [1] 50 [10] 100 and for N = 1000 and n = 500. [The tables up to N = 20 and part of the table for N = 21 are reproduced in Owen: Handbook of Statistical Tables, Addison Wesley 1962, Tables 28 pp.]

Finney, Latscha, Bennett & Hsu: Tables for Testing Significance in a 2X2 Contingency Table. Cambridge U.P. 1963, pp. 108 (Tables 93 pp.). For A = B = 3 to A = B = 20 (i.e. N = 6 [2] 40) Finney et al give, for significance levels of 0.05, 0.025, 0.01 and 0.005, the tables that are statistically significant as well as a Right Hand Side Cumulative probability. The tables continue up to 60 with statistical significance at the 0.05 and 0.01 levels. [Beyer: Handbook of Tables for Probability & Statistics, CRC 2nd Ed, pp. 14, reproduces the tables up to N = 20.]

Diem: Scientific Tables. Ciba Geigy 1962, 14 pp. For N = 4 [1] 50 [2] 60 and significance levels of 2α = 0.20, 0.10, 0.05, 0.02, 0.01 and 0.002 the tables that are statistically significant are indicated.

McDonald, Davies & Miliken 1977 Technometrics 19(2): 145-157 give 6 pages of tables for critical regions of nominal levels 0.05 and 0.01 for N = 5 [1] 30.

Armsen 1955 Biometrika 42: 494-511 gives 6 pages which can be used up to N = 50 but not for all such tables.

Neave 1982 British Institute of Statistics 9(2): 165-178 gives 2 pages for tables up to N = 25 and offers tables up to N = 100 for those interested.

Ballatori 1982 Metron XL (3-4): 157-171 gives some tables of significance for 0.05.

Luna del Castillo & Martin Andres 1987 Trabajos De Estadistica 2(1): 15-43 gives 10 pages of significant tables up to N = 20 with the Hypergeometric AND Fisher's one-sided probability.

The Universidad de Granada gives various computer programs for 2X2, 2XK and rXc contingency tables.

Keisan provides an online calculator which gives, in separate tables, the hypergeometric probability and the lower and upper cumulative probabilities. Each gives up to 401 values, and another function gives all three up to 50 significant figures.

easycalculation.com gives the hypergeometric probability and the lower cumulative probability to three decimal places.

stat trek gives 16 d.p. for the hypergeometric, upper and lower cumulative probabilities.

adscienceengineering gives a table of the hypergeometric probability to 16 significant decimal places.