
BINOMIAL AND RELATED DISTRIBUTIONS

SUBMITTED BY

OSAMA BIN AJAZ (std_18154@iobm.edu.pk)

CONTENTS

Abstract
Bernoulli distribution
Binomial distribution
Multinomial distribution
Beta binomial distribution
Correlated binomial distribution
Neyman C(α) test
Testing goodness of fit of binomial distribution
The C(α) test for correlated binomial alternatives
C(α) test for beta binomial alternatives
The C(α) test for Altham's multiplicative alternatives
References

ABSTRACT

R. E. Tarone of the National Cancer Institute, Bethesda, Maryland, derives tests for the goodness of fit of the binomial distribution using the C(α) procedure of Neyman (1959); these tests are asymptotically optimal against the generalized binomial alternatives proposed by Altham (1978) and Kupper & Haseman (1978). Before coming to the article, I explain the binomial and related distributions. I have reproduced key parts of the article; anyone interested in the details of the article is advised to see the references at the end of the report.

Bernoulli trial

A Bernoulli trial (named after James Bernoulli, one of the founding fathers of probability theory) is an experiment with two, and only two, possible outcomes [2]: for example, female or male, life or death, head or tail, success or failure. A sequence of Bernoulli trials occurs when a Bernoulli experiment is performed several independent times, so that the probability of success, say p, remains the same from trial to trial.

Bernoulli distribution

A random variable X is defined to have a Bernoulli distribution if the discrete density function of X is given by

f(x) = p^x (1 − p)^(1−x) for x = 0, 1; f(x) = 0 otherwise,

where the parameter p satisfies 0 ≤ p ≤ 1. If X has a Bernoulli distribution, then

E[X] = p,  Var[X] = pq,  M_X(t) = pe^t + q.

Proof

E[X] = Σ_{x=0}^{1} x p^x (1 − p)^(1−x) = 0·q + 1·p = p

M_X(t) = E[e^(tX)] = Σ_{x=0}^{1} e^(tx) p^x (1 − p)^(1−x) = q + pe^t

Example 1: Out of millions of instant lottery tickets, suppose that 20% are winners. If five such tickets are purchased, (0, 0, 0, 1, 0) is a possible observed sequence in which the fourth ticket is a winner and the other four are losers. Assuming independence among winning and losing tickets, the probability of this outcome is (0.8)(0.8)(0.8)(0.2)(0.8) = (0.2)(0.8)^4 [5].

In a sequence of Bernoulli trials, we are often interested in the total number of successes and not in the order of their occurrence. If we let the random variable X equal the number of observed successes in n Bernoulli trials, the possible values of X are 0, 1, 2, ..., n. If x successes occur, where x = 0, 1, 2, ..., n, then n − x failures occur. The number of ways of selecting the x positions for the x successes in the n trials is

C(n, x) = n! / (x!(n − x)!).

Since the trials are independent and the probabilities of success and failure on each trial are, respectively, p and q = 1 − p, the probability of each of these ways is p^x (1 − p)^(n−x). Thus f(x), the p.m.f. of X, is the sum of the probabilities of these mutually exclusive events; that is,

f(x) = C(n, x) p^x (1 − p)^(n−x) for x = 0, 1, 2, ..., n,

and the random variable X is said to have a binomial distribution.

A binomial experiment satisfies the following properties:

1. A Bernoulli experiment is performed n times.

2. The trials are independent.

3. The probability of success on each trial is a constant p; the probability

of failure is q=1-p.

4. The random variable X equals the number of successes in the n trials.

A binomial distribution is denoted by the symbol b(n, p), and we say that the distribution of X is b(n, p). The constants n and p are called the parameters of the binomial distribution. Thus if we say that the distribution of X is b(10, 1/5), we mean that X has a binomial distribution with n = 10 and p = 1/5.

The binomial distribution derives its name from the fact that the (n + 1) terms in the binomial expansion of (q + p)^n correspond to the various values of b(x; n, p) for x = 0, 1, 2, ..., n. That is,

(q + p)^n = C(n, 0) q^n + C(n, 1) p q^(n−1) + C(n, 2) p^2 q^(n−2) + ... + C(n, n) p^n,

so that

Σ_{x=0}^{n} b(x; n, p) = 1.

Example 2: If we want to find the probability of obtaining exactly three 2s when an ordinary die is tossed 4 times, then the probability is

b(3; 4, 1/6) = C(4, 3) (1/6)^3 (5/6)^1 = 20/1296 ≈ 0.0154.
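Example 2 can be checked with a few lines of Python; the helper below is my own (not from the report) and implements f(x) = C(n, x) p^x (1 − p)^(n−x) directly.

```python
from math import comb

def binomial_pmf(x, n, p):
    # f(x) = C(n, x) * p^x * (1 - p)^(n - x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

# probability of exactly three 2s in 4 tosses of a fair die
prob = binomial_pmf(3, 4, 1/6)
print(round(prob, 4))  # 0.0154, i.e. 20/1296
```

The same helper also confirms that the pmf sums to one over x = 0, 1, ..., n.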

The mean, variance and moment generating function of b(x; n, p) are

μ = np,  σ^2 = npq,  M_X(t) = (q + pe^t)^n,

respectively.

Proof

M_X(t) = E[e^(tX)] = Σ_{x=0}^{n} C(n, x) (pe^t)^x q^(n−x) = (pe^t + q)^n.

The first derivative is

M′_X(t) = n pe^t (pe^t + q)^(n−1),

so E[X] = M′_X(0) = np. The second derivative is

M″_X(t) = n pe^t (pe^t + q)^(n−1) + n(n − 1)(pe^t)^2 (pe^t + q)^(n−2),

so M″_X(0) = np + n(n − 1)p^2. Hence

Var[X] = E[X^2] − {E[X]}^2 = M″_X(0) − (np)^2 = n(n − 1)p^2 + np − (np)^2 = np(1 − p).
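These moment results can be checked numerically for a particular case; the sketch below (plain Python, parameter values my own) computes the mean, variance and mgf of b(x; n, p) by direct summation and compares them with np, npq and (q + pe^t)^n.

```python
from math import comb, exp

def binomial_pmf(x, n, p):
    # f(x) = C(n, x) * p^x * (1 - p)^(n - x)
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p, t = 10, 0.3, 0.5
mean = sum(x * binomial_pmf(x, n, p) for x in range(n + 1))
var = sum(x**2 * binomial_pmf(x, n, p) for x in range(n + 1)) - mean**2
mgf_direct = sum(exp(t * x) * binomial_pmf(x, n, p) for x in range(n + 1))
mgf_closed = (1 - p + p * exp(t))**n

print(round(mean, 6), round(var, 6))  # np = 3.0, npq = 2.1
```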

Example 3: If the moment generating function of X is

M_X(t) = (2/3 + (1/3)e^t)^5,

then X has a binomial distribution with n = 5 and p = 1/3; that is, the p.m.f. of X is

f(x) = C(5, x) (1/3)^x (2/3)^(5−x) for x = 0, 1, ..., 5.

Note: The binomial distribution reduces to the Bernoulli distribution when n = 1. Sometimes the Bernoulli distribution is called the point binomial.

Example 4: Let the random variable Y be equal to the number of successes throughout n independent repetitions of a random experiment with probability p of success; that is, Y is b(n, p). The ratio Y/n is called the relative frequency of success. Now recall Chebyshev's inequality, i.e.

P(|X − μ| ≥ ε) ≤ σ^2/ε^2 for all ε > 0.

Applying it to Y/n, which has mean p and variance p(1 − p)/n, gives

P(|Y/n − p| ≥ ε) ≤ Var(Y/n)/ε^2 = p(1 − p)/(nε^2).

Now, for every fixed ε > 0, the right-hand member of the preceding inequality is close to zero for sufficiently large n. Since this is true for every fixed ε > 0, we see, in a certain sense, that the relative frequency of success is, for large values of n, close to the probability p of success [3].
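A small simulation (plain Python; the seed and sample sizes are my own choices) illustrates Example 4: the relative frequency Y/n settles near p as n grows, while the Chebyshev bound p(1 − p)/(nε^2) on P(|Y/n − p| ≥ ε) shrinks toward zero.

```python
import random

random.seed(1)
p, eps = 0.3, 0.01
results = {}
for n in (100, 10_000, 1_000_000):
    y = sum(random.random() < p for _ in range(n))  # Y ~ b(n, p)
    bound = p * (1 - p) / (n * eps**2)              # Chebyshev bound for this eps
    results[n] = (y / n, bound)
    print(n, y / n, round(bound, 4))
```

For n = 1,000,000 the bound is p(1 − p)/(nε^2) = 0.21/100 = 0.0021, already far below one.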

Example 5: Let the independent random variables X1, X2, X3 have the same cdf F(x). Let Y be the middle value of X1, X2, X3. To determine the cdf of Y, say F_Y(y) = P(Y ≤ y), we note that Y ≤ y if and only if at least two of the random variables X1, X2, X3 are less than or equal to y. Let us say that the ith trial is a success if X_i ≤ y, i = 1, 2, 3; here each trial has probability of success F(y). In this terminology, F_Y(y) = P(Y ≤ y) is then the probability of at least two successes in three independent trials. Thus

F_Y(y) = C(3, 2) [F(y)]^2 [1 − F(y)] + [F(y)]^3.

If F(x) is a continuous cdf, so that the pdf of X is F′(x) = f(x), then the pdf of Y is

f_Y(y) = F′_Y(y) = 6 [F(y)] [1 − F(y)] f(y). [4]
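Example 5 can be checked by simulation for the concrete case of X1, X2, X3 uniform on (0, 1), where F(y) = y, so F_Y(y) = 3y^2(1 − y) + y^3. The sketch below (my own setup, not from the report) compares the empirical and theoretical cdf at one point.

```python
import random

random.seed(2)
y0 = 0.4
trials = 200_000
hits = 0
for _ in range(trials):
    sample = sorted(random.random() for _ in range(3))
    if sample[1] <= y0:          # the middle value of the three draws
        hits += 1
empirical = hits / trials
theoretical = 3 * y0**2 * (1 - y0) + y0**3
print(round(empirical, 3), round(theoretical, 3))  # both close to 0.352
```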

MULTINOMIAL DISTRIBUTION

Recall that in order for an experiment to be binomial, two outcomes are required for each trial. But if each trial in an experiment has more than two outcomes, a distribution called the multinomial distribution must be used. For example, a survey might require the responses "approve", "disapprove", or "no opinion". In another situation, a person may have a choice of one of five activities for Friday night, such as a movie, dinner, a baseball game, a play, or a party. Since these situations have more than two possible outcomes for each trial, the binomial distribution cannot be used to compute probabilities.

If X consists of events E1, E2, E3, ..., Ek, which have corresponding probabilities p1, p2, p3, ..., pk of occurring, and X1 is the number of times E1 will occur, X2 is the number of times E2 will occur, X3 is the number of times E3 will occur, etc., then the probability that X will occur is

P(X) = n! / (X1! X2! X3! ··· Xk!) · p1^X1 p2^X2 ··· pk^Xk.

For illustration, let a box contain four white balls, three red balls, and three blue balls. A ball is selected at random, its color is written down, and it is replaced each time. Suppose we want to find the probability that, if five balls are selected, two are white, two are red, and one is blue. Here n = 5, X1 = 2, X2 = 2, X3 = 1 and p1 = 0.4, p2 = 0.3, p3 = 0.3, so the probability is 5!/(2! 2! 1!) · (0.4)^2 (0.3)^2 (0.3) = 0.1296.
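The box illustration can be evaluated with a small helper (my own, not from the report) implementing the multinomial formula above.

```python
from math import factorial

def multinomial_pmf(counts, probs):
    # P = n!/(X1! X2! ... Xk!) * p1^X1 * p2^X2 * ... * pk^Xk
    n = sum(counts)
    coef = factorial(n)
    for x in counts:
        coef //= factorial(x)
    prob = 1.0
    for x, p in zip(counts, probs):
        prob *= p**x
    return coef * prob

# box example: 2 white, 2 red, 1 blue in 5 draws with replacement
print(round(multinomial_pmf([2, 2, 1], [0.4, 0.3, 0.3]), 4))  # 0.1296
```

With k = 2 the helper reduces to the ordinary binomial pmf, as a sanity check.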

BETA BINOMIAL DISTRIBUTION

The distribution with discrete density function

f(x) = f(x; n, α, β) = C(n, x) · B(x + α, n − x + β) / B(α, β) · I{0, 1, ..., n}(x)

is called the beta binomial distribution. The beta binomial distribution has

mean = nα/(α + β) and variance = nαβ(n + α + β) / [(α + β)^2 (α + β + 1)].

If α = β = 1, then the beta binomial distribution reduces to a discrete uniform distribution over the integers 0, 1, ..., n. [2]
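A quick sketch (plain Python; helper names are mine) implements the beta binomial pmf via the Beta function B(a, b) = Γ(a)Γ(b)/Γ(a + b), and verifies the α = β = 1 uniform case and the mean formula.

```python
from math import comb, gamma

def beta_fn(a, b):
    # B(a, b) = Gamma(a) * Gamma(b) / Gamma(a + b)
    return gamma(a) * gamma(b) / gamma(a + b)

def beta_binomial_pmf(x, n, a, b):
    # f(x) = C(n, x) * B(x + a, n - x + b) / B(a, b)
    return comb(n, x) * beta_fn(x + a, n - x + b) / beta_fn(a, b)

n = 6
uniform_case = [beta_binomial_pmf(x, n, 1, 1) for x in range(n + 1)]  # each 1/7
mean = sum(x * beta_binomial_pmf(x, n, 2, 3) for x in range(n + 1))   # 6*2/(2+3)
print(round(uniform_case[0], 6), round(mean, 6))  # 0.142857 2.4
```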

CORRELATED BINOMIAL DISTRIBUTION

The correlated binomial (CB) distribution of Kupper & Haseman (1978) was proposed for teratology experiments in which the binary responses of the fetuses in a litter are not mutually independent. This idea is due to Bahadur (1961). Retaining only the first-order correlation between the responses, and denoting by θ the covariance between the binary responses of any two fetuses, the random variable X is such that

P(X = x) = C(n, x) p^x (1 − p)^(n−x) [1 + θ{(x − np)^2 + x(2p − 1) − np^2} / (2p^2 q^2)],

where p is the probability that a fetus is abnormal. Note that for the above equation to be a valid probability distribution, a data-dependent bound for the parameters has to be imposed; see Kupper and Haseman (1978). It can be shown that the expectation and variance of the correlated binomial distribution are np and np(1 − p) + n(n − 1)θ, respectively. Thus the correlated binomial distribution is a generalization of the binomial distribution; the CB distribution becomes the binomial distribution when θ = 0. Altham (1978) derived a further two-parameter generalized binomial distribution, namely the multiplicative generalized binomial (MB) distribution.
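These claims can be checked numerically. The sketch below (plain Python; function name and parameter values are my own) assumes the Kupper-Haseman pmf takes the Bahadur form P(X = x) = C(n, x) p^x q^(n−x) [1 + θ{(x − np)^2 + x(2p − 1) − np^2}/(2p^2 q^2)], which is consistent with the correlated binomial log likelihood used later in the report; it confirms that the pmf sums to one, has mean np and variance np(1 − p) + n(n − 1)θ, and collapses to the binomial when θ = 0.

```python
from math import comb

def cb_pmf(x, n, p, theta):
    # Bahadur-form correlated binomial pmf (assumed form; see lead-in)
    q = 1 - p
    base = comb(n, x) * p**x * q**(n - x)
    corr = 1 + theta * ((x - n*p)**2 + x*(2*p - 1) - n*p**2) / (2 * p**2 * q**2)
    return base * corr

n, p, theta = 8, 0.3, 0.01   # theta small enough to keep every term non-negative
pmf = [cb_pmf(x, n, p, theta) for x in range(n + 1)]
mean = sum(x * f for x, f in zip(range(n + 1), pmf))
var = sum(x * x * f for x, f in zip(range(n + 1), pmf)) - mean**2
print(round(sum(pmf), 6), round(mean, 6), round(var, 6))  # 1.0 2.4 2.24
```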

The probability mass function of the Altham multiplicative binomial distribution is

P(X = x) = C(n, x) p^x (1 − p)^(n−x) a^(x(n−x)) / F(n), x = 0, 1, 2, ..., n, a ≥ 0, 0 ≤ p ≤ 1,

where F(n) is the normalizing constant. When a = 1 the distribution reduces to the ordinary binomial.
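Assuming the normalizing constant F(n) is simply the sum of the unnormalized terms over x = 0, ..., n (the names below are my own), a short sketch can verify that this is a proper distribution and that a = 1 recovers the ordinary binomial.

```python
from math import comb

def mb_pmf(n, p, a):
    # unnormalized terms C(n,x) p^x (1-p)^(n-x) a^(x(n-x))
    w = [comb(n, x) * p**x * (1 - p)**(n - x) * a**(x * (n - x))
         for x in range(n + 1)]
    f = sum(w)                    # normalizing constant F(n), assumed form
    return [t / f for t in w]

pmf = mb_pmf(6, 0.3, 0.9)
print(round(sum(pmf), 6))  # 1.0
```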

Neyman C(α) test

Hypothesis testing problems in applied research often involve several nuisance parameters. In these composite testing problems, most powerful tests do not exist, motivating the search for an optimal test procedure that yields the highest power among the class of tests attaining the same size. Neyman's local asymptotic optimality result for the C(α) test employs regularity conditions inherited from the conditions used by Cramer (1946) for showing consistency of the MLE, together with some further restrictions on the test function that allow the unknown nuisance parameters to be replaced by root-n consistent estimators. It is the confluence of these Cramer conditions and the maintained significance level α that gives the C(α) test its name.

TESTING GOODNESS OF FIT OF BINOMIAL DISTRIBUTION*

R. E. Tarone of the National Cancer Institute, Bethesda, Maryland, derives tests for the goodness of fit of the binomial distribution using the C(α) procedure of Neyman (1959); these tests are asymptotically optimal against the generalized binomial alternatives proposed by Altham (1978) and Kupper & Haseman (1978) [5].

Consider an experiment in which the responses take the form of proportions, and let the ith response be given by p_i = x_i/n_i for i = 1, ..., M. Under the correlated binomial model the log likelihood function is

L = Σ_{i=1}^{M} log[C(n_i, x_i) p^(x_i) q^(n_i − x_i)] + Σ_{i=1}^{M} log[1 + θ{(x_i − n_i p)^2 + x_i(2p − 1) − n_i p^2} / (2p^2 q^2)].

A test of the goodness of fit of the binomial distribution is obtained by testing the null hypothesis H0: θ = 0 in the presence of the nuisance parameter p. Moran (1970) demonstrated that for such problems the C(α) tests proposed by Neyman (1959) are asymptotically equivalent to tests based on maximum likelihood estimators.

The C(α) statistic is constructed from partial derivatives of L evaluated at θ = 0; these derivatives, denoted S1, S2 and S3, are given as equations (1)-(3) of Tarone (1979). Under the null hypothesis, the x_i are independent binomial random variables, and hence it follows from (2) that E{S2(p)} = 0. Neyman (1959) has shown that when E{S2(p)} = 0 the null hypothesis H0: θ = 0 can be tested using the statistic S1(p̂), where p̂ is a root-n consistent estimator of p (Moran, 1970). Substituting the consistent estimator p̂ = Σ x_i / Σ n_i, we find that the C(α) test statistic is given by

S = S1(p̂) = Σ_{i=1}^{M} {(x_i − n_i p̂)^2 + x_i(2p̂ − 1) − n_i p̂^2} / (2p̂^2 q̂^2).

Since E{S2(p)} = 0, the variance of S(p̂) is given by E{S3(p)}, where the expectation is taken under H0: θ = 0. From (3) it follows that E{S3(p)} = Σ n_i(n_i − 1) / (2p^2 q^2). Substituting p̂ for p and standardizing S by its estimated standard deviation yields X²_c, the C(α) test statistic for homogeneity of proportions, which is asymptotically optimal against correlated binomial alternatives.

The binomial variance test for homogeneity is based on the statistic

X²_v = Σ_{i=1}^{M} (x_i − n_i p̂)^2 / (n_i p̂ q̂),

which has an asymptotic chi-squared distribution with M − 1 degrees of freedom when θ = 0. It is clear from the above expressions that, for the case in which n_i = n for all i, the C(α) test statistic S is equivalent to the variance test statistic X²_v.

The beta-binomial distribution is a mixture of binomial distributions which has often been utilized as an alternative to the binomial distribution. Under the beta-binomial model the log likelihood function takes an analogous form (see Tarone, 1979), and a test of the goodness of fit of the binomial distribution is obtained by testing the null hypothesis H0: θ = 0. The derivation of the C(α) test statistic using the beta-binomial model is similar to the derivation for the correlated binomial model, and the optimal statistic is again found to be the statistic S derived in the last section. Note, however, that in the beta-binomial model the parameter θ cannot take negative values. The alternative hypothesis is necessarily one sided, and hence the C(α) test is the one-sided test based on the statistic

Z = S / [Σ n_i(n_i − 1) / (2p̂^2 q̂^2)]^(1/2).

Under the null hypothesis H0: θ = 0, the statistic Z will have an asymptotic standard normal distribution.

The multiplicative generalization of the binomial distribution provides an alternative for which the correlated binomial C(α) test is not asymptotically optimal. The log likelihood function for the multiplicative generalization of the binomial model is given in Tarone (1979). The C(α) test for H0: a = 1 is based on a statistic R involving Σ_i x_i(n_i − x_i). Note that, unlike the correlated binomial C(α) statistic, R is not equivalent to the variance test statistic in the case n_i = n for all i. The standardized statistic X²_m will have an asymptotic chi-squared distribution with one degree of freedom. The test based on X²_m is asymptotically optimal against alternatives given by the multiplicative generalization of the binomial model.

In order to compare the different tests of the goodness of fit of the binomial distribution, we consider the treatment group data of Kupper & Haseman (1978, p. 75). The observed proportions were 0/5, 2/5, 1/7, 0/8, 2/8, 3/8, 0/9, 4/9, 1/10 and 6/10. The variance test gives X²_v = 19.03 and P = 0.025; the correlated binomial C(α) test gives X²_c = 6.63 and P = 0.01. Thus, for this example, the correlated binomial C(α) test is more sensitive to the departure of the observed proportions from a binomial distribution than the other tests considered.
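These numbers can be reproduced with a short sketch (plain Python; variable names are mine). It computes the variance test statistic X²_v = Σ(x_i − n_i p̂)^2/(n_i p̂ q̂) and the correlated binomial C(α) statistic S standardized by its estimated null variance, whose square is reported as X²_c.

```python
from math import sqrt

x = [0, 2, 1, 0, 2, 3, 0, 4, 1, 6]    # observed successes
n = [5, 5, 7, 8, 8, 8, 9, 9, 10, 10]  # group sizes

p = sum(x) / sum(n)   # pooled estimate p-hat = 19/79
q = 1 - p

# variance (dispersion) test, asymptotically chi-squared with M - 1 df
X2v = sum((xi - ni * p)**2 / (ni * p * q) for xi, ni in zip(x, n))

# correlated binomial C(alpha) statistic S and its estimated null variance
S = sum((xi - ni * p)**2 + xi * (2 * p - 1) - ni * p**2
        for xi, ni in zip(x, n)) / (2 * p**2 * q**2)
varS = sum(ni * (ni - 1) for ni in n) / (2 * p**2 * q**2)
X2c = (S / sqrt(varS))**2

print(round(X2v, 2), round(X2c, 2))  # 19.03 6.63, matching the text
```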

To examine the behaviour of the test statistics under the null hypothesis, a Monte Carlo experiment was performed. Ten binomial proportions were randomly generated using the unequal sample sizes from the above example. For each pseudorandom sample of 10 proportions the C(α) statistics X²_c and X²_m and the variance test statistic X²_v were calculated and compared to the 10%, 5% and 1% points of their asymptotic null distributions. The empirical significance levels based on 1500 replications are shown in Table 1 for underlying binomial probabilities of 0.10, 0.25 and 0.50. For the cases considered, the empirical significance levels for the correlated binomial C(α) statistic are significantly lower than the nominal level for the 5% and 10% critical values. The empirical significance levels for the 1% critical value show no consistent pattern.

Table 1: Empirical significance levels of the C(α) tests asymptotically optimal against correlated binomial and multiplicative alternatives, and of the variance test, based on 1500 replications for underlying binomial probabilities of 0.10, 0.25 and 0.50.

                     P = 0.10               P = 0.25               P = 0.50
Nominal level   X²_c   X²_m   X²_v     X²_c   X²_m   X²_v     X²_c   X²_m   X²_v
0.01            0.007  0.010  0.003    0.013  0.012  0.012    0.009  0.009  0.007
0.05            0.019  0.043  0.042    0.035  0.037  0.042    0.034  0.031  0.049
0.10            0.048  0.100  0.082    0.073  0.085  0.097    0.077  0.075  0.108

Table 2: Comparison of the variance test and the generalized binomial C(α) tests for correlated binomial and multiplicative alternatives.

Test statistic   Correlated binomial   Multiplicative generalized binomial
X²_v             0.95                  0.71
X²_c             1.00                  0.82
X²_m             0.79                  1.00

The table shows that the correlated binomial C(α) test is more efficient than the variance test for multiplicative alternatives as well as for correlated binomial alternatives.

REFERENCES

1. Alexander M. Mood, Franklin A. Graybill and Duane C. Boes, Introduction to the Theory of Statistics, third edition, McGraw-Hill Series in Probability and Statistics.
2. Second edition, page 89, Duxbury Advanced Series.
3. Hogg, McKean and Craig, Introduction to Mathematical Statistics (2013), seventh edition, Pearson Education, Inc.
4. Paul, S. R., A three parameter generalization of binomial distribution, Windsor Mathematics Report, February 1984.
5. Robert V. Hogg, Elliot A. Tanis, Jagan Mohan Rao, Probability and Statistical Inference, seventh edition, Pearson Education.
6. Tarone, R. E. (1979), Testing the goodness of fit of binomial distribution, Biometrika 66, 585-590.
