Fisher's Exact Test, N = 2 to 50: Background

Fisher's Exact Test h(X,sr,sc,N) & Significance Testing Tables
N sr sc X p(table) LHS(p) RHS(p) Same Side Other Side p-value
N is the total number of observations.
sr is the smallest of the four marginal totals, sr <= sc.
sc is the smaller of the two column totals, sc <= Int[N/2].
X is the number of observations in the upper left cell and runs from 0 [1] sr (i.e. from 0 to sr in steps of 1).
These four numbers are sufficient to generate all the unique tables for a given N (which form an isomarginal family). For each N the first block of tables is for sr = 1 with sc = 1 [1] Int[N/2], then for sr = 2 with sc = sr [1] Int[N/2], and so on.
p(table) is the Hypergeometric Probability associated with the table.
LHS(p) is the Cumulative Probability starting from the left of the distribution's curve.
RHS(p) is the Cumulative Probability starting from the right of the distribution's curve.
Same Side: this column gives the Cumulative Probability for the side that the table is in, i.e. the sum of the probabilities up to and including that table. It switches sides after the Maximum Value(s). If the table corresponds to the Maximum Value it is entered as exactly 1.
Other Side: this column gives a Cumulative Probability for the side that the table is not in, i.e. the sum of the probabilities on that side that are less than or equal to the value of that table. It also switches sides. If the table corresponds to the Maximum Value it is entered as exactly 0.
The p-value for the table is the sum of these two side probabilities. The Same Side value is the one-sided probability as defined by Fisher; he suggested doubling it if a two-sided probability was required.
Included are two other forms for the tables, with row totals A >= B and first column cell members a >= b, and similarly A' >= B', a' >= b'. When sr = sc there is only one form [c.f. Finney, Latscha, Bennett & Hsu, Tables for Testing Significance in a 2X2 Contingency Table, Cambridge U.P.].
The rows were generated in Excel using a series of blocks of tables which, apart from the Values and Formulas used in the first row of the first block, are a copy of the previous block plus a copy of the last row.
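The column definitions above can be sketched in a few lines of Python. This is only an illustration of the definitions as worded here, not the Excel formulas used to build the tables; the function name table_entry and the dictionary keys are assumptions made for this sketch.

```python
from math import comb

def table_entry(N, sr, sc, X):
    """Return the six tabulated columns for the table keyed by (N, sr, sc, X)."""
    lo, hi = max(0, sr + sc - N), min(sr, sc)
    # Integer numerators of h(x, sr, sc, N); the common denominator is C(N, sr).
    w = {x: comb(sc, x) * comb(N - sc, sr - x) for x in range(lo, hi + 1)}
    total = comb(N, sr)
    lhs = sum(v for x, v in w.items() if x <= X) / total
    rhs = sum(v for x, v in w.items() if x >= X) / total
    w_max = max(w.values())
    modes = [x for x, v in w.items() if v == w_max]  # position(s) of the Maximum Value
    if w[X] == w_max:
        same, other = 1.0, 0.0  # Maximum Value rows are entered as exactly 1 and 0
    elif X < modes[0]:
        same = lhs  # the table lies on the left-hand side
        other = sum(v for x, v in w.items() if x > modes[-1] and v <= w[X]) / total
    else:
        same = rhs  # the table lies on the right-hand side
        other = sum(v for x, v in w.items() if x < modes[0] and v <= w[X]) / total
    return {"p(table)": w[X] / total, "LHS(p)": lhs, "RHS(p)": rhs,
            "Same Side": same, "Other Side": other, "p-value": same + other}

entry = table_entry(39, 15, 17, 10)  # the entry values used in Example Number 1
```

Working with the integer numerators and dividing only at the end keeps the comparison "less than or equal to the value of that table" exact, which matters when two tables have equal probabilities.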
The format for the tables was motivated by Goyette & Mickeys, Health Sciences Computing Facility UCLA, as reproduced in Dixon & Massey (1969) Introduction to Statistical Analysis, McGraw Hill. The Tables appear to be unique in that they give the various cumulative probabilities, the p-value and up to three versions of all the tables in the isomarginal family in a straightforward look-up table, albeit the user has to arrange their table of observations to match one they can look up.
Complete tables for N = 2 [1] 50
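The block order described above (sr = 1 with sc = 1 [1] Int[N/2], then sr = 2 with sc = sr [1] Int[N/2], and so on, with X = 0 [1] sr inside each block) can be sketched as a small generator. The name table_keys is hypothetical and the sketch only lists the look-up keys, not the probabilities.

```python
def table_keys(N):
    """Yield the keys (N, sr, sc, X) of the isomarginal family in block order."""
    half = N // 2  # Int[N/2]
    for sr in range(1, half + 1):
        for sc in range(sr, half + 1):
            for X in range(0, sr + 1):
                yield (N, sr, sc, X)

keys_for_4 = list(table_keys(4))
```

Because sr <= sc <= Int[N/2], sr + sc never exceeds N, so every X from 0 to sr gives a valid table.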

X    .   | sr
.    .   |
sc       |
N, sr, sc, X
2X2 Contingency tables arise when observing counts of dichotomous attributes. The observations are written down quite naturally as a 2X2 table, i.e. two rows of two columns, typically giving
n1 n2
n3 n4
The required values (N, sr, sc, X) for entering the tables are found by following these steps:
[If there are ties for the smallest/smaller marginal total they each have the same initial status, and the way to deal with this is to initially take the first eligible one as the smallest.]
STEP 1 - USE FOR EACH TABLE

The marginal totals are found, i.e. the members of row 1 are added together, similarly row 2 and columns 1 & 2. The total number of observations is LABELLED N. The smallest marginal total, regardless of whether it was originally a row or column total, is LABELLED sr and the other corresponding total is the second row total. This may rearrange the rows or the columns. After this the new columns may have to be interchanged to correctly assign sc.
STEP 2 if all row totals different

The NEW rows are written down. This will involve either (1) NO CHANGE or (2) A ROW SWAP or (3) SWAPPING COLUMNS FOR ROWS, ensuring that sr is the uppermost row total.
(1)      (2)      (3a)     (3b)
n1 n2    n3 n4    n1 n3    n2 n4
n3 n4    n1 n2    n2 n4    n1 n3
A more general form often used is
a c
b d
The two NEW column totals are examined: the smaller one is LABELLED sc and the other corresponding total is the second column total. The cells are filled in by using the values that give these NEW marginal totals.
STEP 2 if all totals the same

Simply use the values as first written down.
STEP 2 if there are three smallest totals

When there are three smallest totals one must be a row total, and this is labelled sr, rearranging the rows if necessary. The columns may have to be swopped over to position sc correctly.
STEP 2 if there are two smallest totals

If there are two smallest totals, one a row and one a column, keep the rows as rows, rearranging if necessary to correctly position sr as the first row total, and then swop columns, if necessary, to ensure the first column total is sc.
If the two are row totals, use the values as first written down.
If they are column totals, rewrite the columns as rows, rearranging their order if necessary to ensure the top total is sr.
In both cases swop columns, if necessary, to ensure the first column total is sc.
Writing each table as first row / second row, this gives either n1 n2 / n3 n4, n3 n4 / n1 n2, n1 n3 / n2 n4 or n2 n4 / n1 n3 (and) n2 n4 / n3 n1.
STEP 3 - USE IF THERE ARE TIES FOR SMALLEST MARGINAL TOTAL

If any of the marginal totals have been the same, use the diagonal condition bc > ad. If the condition is not met, rearrange the rows (columns), ensuring that the first row (column) total is still the smaller marginal.
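The three steps can be sketched as a search over the eight possible arrangements of a 2X2 table (rows swopped, columns swopped, columns written as rows). This is one reading of the steps above rather than the author's procedure: the function name entry_values is hypothetical, and the tie handling simply prefers an arrangement that meets the diagonal condition bc > ad.

```python
from itertools import product

def entry_values(table):
    """Rearrange [[n1, n2], [n3, n4]] and return the look-up key (N, sr, sc, X)."""
    (n1, n2), (n3, n4) = table
    N = n1 + n2 + n3 + n4
    smallest = min(n1 + n2, n3 + n4, n1 + n3, n2 + n4)
    candidates = []
    for transpose, swap_rows, swap_cols in product((False, True), repeat=3):
        # Cells read as a b / c d; transposing writes the columns as rows.
        a, b, c, d = (n1, n3, n2, n4) if transpose else (n1, n2, n3, n4)
        if swap_rows:
            a, b, c, d = c, d, a, b
        if swap_cols:
            a, b, c, d = b, a, d, c
        # Keep arrangements with sr as the first row total and the smaller
        # column total in the first column.
        if a + b == smallest and a + c <= b + d:
            candidates.append((a, b, c, d))
    # Assumed tie-break: prefer an arrangement meeting the diagonal condition bc > ad.
    a, b, c, d = max(candidates, key=lambda t: t[1] * t[2] > t[0] * t[3])
    return N, a + b, a + c, a

key = entry_values([[17, 7], [5, 10]])  # observations 17, 7 and 5, 10
```

When no marginal totals tie there is only one candidate arrangement, so the diagonal condition plays no part in the result.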
Example Number 1: p. 19, Kruger, Lemacher, Wall
STEP 1
The observations are 17, 7 and 5, 10
17    7    r1=24
 5   10    r2=15
c1=22  c2=17
As row two gives the smallest marginal total a row swop is needed
 5   10    sr=15
17    7    r2=24
c1=22  c2=17
As column two now gives the smaller total a column swop is needed
X=10    5    sr=15
   7   17    r2=24
sc=17  c2=22  N=39
The values for N, sr, sc, X can be read off as 39, 15, 17, 10 to give
p(table)    0.02037
LHS(p)      0.995998
RHS(p)      0.024373
Same Side   0.0243734
Other Side  0.0204041
p-value     0.04478
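As a check on the p(table) entry for 39, 15, 17, 10, the hypergeometric probability can be written out directly. The binomial-coefficient form below is the standard expression for h(X,sr,sc,N); the notation is an assumption of this note rather than taken from the tables.

```latex
\[
p(\text{table}) = h(X, sr, sc, N)
  = \frac{\binom{sc}{X}\binom{N-sc}{sr-X}}{\binom{N}{sr}}
  = \frac{\binom{17}{10}\binom{22}{5}}{\binom{39}{15}}
  = \frac{19448 \times 26334}{25140840660}
  \approx 0.020371
\]
```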
Kruger et al. give values of 0.0244 and 0.0448 for their p1 and p2.
Lieberman & Owen give for (39, 17, 15, 10) P(10) = 0.995997 and p(10) = 0.020371, which correspond to LHS(p) and p(table). The values for the cells and marginal totals of the two other tables are given by Finney et al.
The layout used by Finney et al. is
a      A-a        | A
b      B-b        | B
a+b    A+B-a-b
A = 24, B = 15, a = 17, b = 5
17    7  | 24
 5   10  | 15
22   17
A' = 22, B' = 17, a' = 17, b' = 7
17    5  | 22
 7   10  | 17
24   15
These are the forms given by Finney et al. The first one corresponds to that of Kruger et al., for which Finney et al. give a probability of 0.0244; the same probability is given for the other form of the table.
Example Number 2: p. 18, Kruger, Lemacher, Wall
STEP 1
The observations are 8, 2 and 4, 6
 8    2    r1=10
 4    6    r2=10
c1=12  c2=8
Only a column / row interchange is required
X=2    6    sr=8
  8    4    r2=12
sc=10  c2=10  N=20
Since two of the marginal totals were the same, the diagonal check is performed. As no further changes need to be made, the table entry values can be read off as 20, 8, 10, 2 to give
p(table)    0.07502
LHS(p)      0.084901
RHS(p)      0.9901167
Same Side   0.0849012
Other Side  0.0849012
p-value     0.1698
Kruger et al. give 0.0849 and 0.1698 for their p1 and p2.
Lieberman & Owen give for (20, 10, 8, 2) P(2) = 0.084901 and p(2) = 0.075081, which correspond to LHS(p) and p(table) = 0.0750178613955703 (+).
The tables of Kruger et al. are nominally restricted to N = 80 but also to values of the one-sided probability >= 0.1 and values of the two-sided probability >= 0.0001.
(+) The N, sr, sc, X tables are a pdf version of the original Excel tables, which are restricted to a smallest value of 2.229E-308. Stirling's formula for factorials was used for hypergeometric values where N > 1000 in a fuller set of the tables used for Sample Sizes.
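The underflow limit mentioned in the note can be avoided by working with logarithms of the factorials. The sketch below uses Python's log-gamma function in place of Stirling's formula, so it is a substitute technique and not the method used for the tables; the function names are hypothetical.

```python
from math import exp, lgamma

def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k), using ln(n!) = lgamma(n + 1)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def log_hypergeom(N, sr, sc, x):
    """Natural log of h(x, sr, sc, N); stays finite where the probability itself underflows."""
    return log_comb(sc, x) + log_comb(N - sc, sr - x) - log_comb(N, sr)

p_small_table = exp(log_hypergeom(20, 8, 10, 2))    # the table of Example Number 2
log_p_large = log_hypergeom(2000, 1000, 1000, 0)    # far below ln(2.229E-308)
```

For tables of the size covered here the exponentiated value agrees with the exact probability to many decimal places; for very large N only the logarithm is representable.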
A = 12, B = 8, a = 8, b = 2
 8    4  | 12
 2    6  |  8
10   10
A' = 10, B' = 10, a' = 6, b' = 2
 6    4  | 10
 2    8  | 10
 8   12
These are the forms given by Finney et al. The second requires a column swop to match that of Kruger et al., which Finney et al. report as not significant.
Example Number 3: p. 24, Kruger, Lemacher, Wall
STEP 1
The observations are 2, 12 and 15, 26
 2   12    r1=14
15   26    r2=41
c1=17  c2=38  N=55
No changes need to be made and the table entry values are read off as 55, 14, 17, 2 to give
p(table)    0.08458
LHS(p)      0.10794
RHS(p)      0.976634
Same Side   0.10794
Other Side  0.0749498
p-value     0.18289
Kruger et al. do not include this table, for the reasons they give, nor do the other published tables.
Example Number 4: p. 24, Kruger, Lemacher, Wall
STEP 1
 3   14    r1=17
15    1    r2=16
c1=18  c2=15  N=33
As c2 is the smallest marginal total a columns / rows interchange is required
14    1    sr=15
 3   15    r2=18
c1=17  c2=16  N=33
As c2 < c1 the columns need to be swopped
X=1   14    sr=15
 15    3    r2=18
sc=16  c2=17  N=33
The table entry values are read off as 33, 15, 16, 1 to give
p(table)    1.04902E-05
LHS(p)      1.06213E-05
RHS(p)      0.999999869
Same Side   1.06213E-05
Other Side  1.98234E-06
p-value     1.26037E-05
Lieberman & Owen give for (33, 16, 15, 1) P(1) = 0.000011 and p(1) = 0.000010, which correspond to LHS(p) and p(table).
A = 18, B = 15, a = 15, b = 1
15    3  | 18
 1   14  | 15
16   17
A' = 17, B' = 16, a' = 14, b' = 1
14    3  | 17
 1   15  | 16
15   18
Finney et al. do not give these tables as they are not significant at any of the levels they use.
REFERENCES:
The Fourfold Table up to N=80. Kruger, Lemacher & Wall: Verlag 1981, pp. 440, Tables 410 pp. The text is in German & English; n.b. it does not include all the tables.
Tables for the Hypergeometric Probability Distribution. Lieberman & Owen: Stanford U.P. 1961, pp. 726, Tables 693 pp. The Tables give two probabilities: a one-sided cumulative probability and the hypergeometric probability for the table. The distribution is characterised in terms of the four variables N, the number of items in a lot (the total number of observations); n, the number of items taken from the lot (corresponds to sc); k, the number of defective items in the lot (corresponds to sr); and x, the number of defective items observed in the sample (corresponds to X). The tables are for N = 2 [1] 50 [10] 100 and for N = 1000 and n = 500. [The tables up to N = 20 and part of the table for N = 21 are reproduced in Handbook of Statistical Tables, Owen: Addison Wesley 1962, Tables 28 pp.]
Tables for Testing Significance in a 2X2 Table. Finney, Latscha, Bennett, Hsu: Cambridge U.P. 1963, pp. 108, Tables 93 pp. For A = B = 3 to A = B = 20 (i.e. N = 6 [2] 40) Finney et al. give, for significance levels of 0.05, 0.025, 0.01 and 0.005, the tables that are statistically significant as well as a Right Hand Side Cumulative probability. The tables continue up to 60 with statistical significance at the 0.05 and 0.01 levels. [Handbook of Tables for Probability & Statistics, Beyer: CRC 2nd Ed., pp. 14, reproduces the tables up to N = 20.]
Scientific Tables. Diem: Ciba Geigy 1962, 14 pp. For N = 4 [1] 50 [2] 60 and significance levels of 2α = 0.20, 0.10, 0.05, 0.02, 0.01 and 0.002 the tables that are statistically significant are indicated.
McDonald, Davies & Miliken 1977 Technometrics 19(2): 145-157 give 6 pages of tables for critical regions of nominal levels 0.05 and 0.01 for N = 5 [1] 30.
Armsen 1955 Biometrika 42: 494-511 gives 6 pages which can be used up to N = 50 but not for all such tables.
Neave 1982 British Institute of Statistics 9(2): 165-178 gives 2 pages for tables up to N = 25 and offers tables up to N = 100 for those interested.
Ballatori 1982 Metron XL (3-4): 157-171 gives some tables of significance for 0.05.
Luna del Castillo & Martin Andres 1987 Trabajos De Estadistica 2(1): 15-43 give 10 pages of significant tables up to N = 20 with the Hypergeometric AND Fisher's one-sided probability.
The Universidad de Granada gives various computer programs for 2X2, 2XK and rXc contingency tables.
Keisan provides an online calculator which gives, in separate tables, the hypergeometric probability and the lower and upper cumulative probabilities. Each gives up to 401 values and another function gives all three up to 50 significant figures.
easycalculation.com gives the hypergeometric probability and the lower cumulative probability to three decimal places.
stat trek gives 16 d.p. for the hypergeometric, upper and lower cumulative probabilities.
adscienceengineering gives a table of the hypergeometric probability to 16 significant decimal places.
