
# Homework 8

# Problem 1

In 1986, the St. Louis Post Dispatch was interested in measuring public
support for the construction of a new indoor stadium. The newspaper
conducted a survey in which they interviewed 301 registered voters. Let
p denote the proportion of all registered voters in the St. Louis voting
district opposed to the stadium. A city councilman wishes to test the
hypotheses H : p ≥ .5, K : p < .5.


(a)
The number y opposed to the stadium construction is assumed to be
binomial(301, p). Suppose the survey result is y = 135. Using the
R function pbinom, compute the p-value P(y ≤ 135 | p = .5). If this
probability is small, say under 5%, then one concludes that there is
significant evidence in support of the hypothesis K : p < .5.

prob <- pbinom(135, 301, 0.5, lower.tail = TRUE)


round(prob, 3)
[1] 0.042
# This probability is relatively small - it's under 5%. So, there is
# significant evidence in support of the hypothesis K : p < .5.
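
The lower-tail p-value can be cross-checked in two other ways: by summing the binomial probabilities directly, and with the base R function binom.test using alternative = "less". This is a minimal sketch; the variable names are illustrative.

```r
y <- 135   # number opposed in the survey
n <- 301   # number of voters interviewed

# P(Y <= 135 | p = 0.5) from the binomial CDF
p_cdf <- pbinom(y, n, 0.5)

# The same probability as an explicit sum of point probabilities
p_sum <- sum(dbinom(0:y, n, 0.5))

# Exact one-sided binomial test against K: p < 0.5
p_test <- binom.test(y, n, p = 0.5, alternative = "less")$p.value

round(c(cdf = p_cdf, sum = p_sum, test = p_test), 3)
```

All three values should agree, since each is the same lower-tail binomial probability.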
# (b)
# Suppose one places a uniform prior on p. Compute the prior odds of
# the hypothesis K.
probK=punif(.5,min=0,max=1)
probA=1-probK
prior.odds=probK/probA
prior.odds
[1] 1

(c)
After observing y = 135, the posterior distribution of p is beta(136,
167). Using the R function pbeta, compute the posterior odds of the
hypothesis K.

prob.K=pbeta(.5, 136,167)
prob.A=1-prob.K
post.odds=prob.K/prob.A
post.odds
[1] 25.93
# (d)
# Compute the Bayes factor in support of the hypothesis K
BF = post.odds/prior.odds
BF
[1] 25.93
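
Parts (b) through (d) can be wrapped in one small helper, which makes it clear why the Bayes factor equals the posterior odds here: the uniform prior gives prior odds of exactly 1. This is a sketch assuming a beta(a0, b0) prior; odds_below is an illustrative name, not a LearnBayes function.

```r
# Odds of K: p < cut when p has a beta(a, b) distribution
odds_below <- function(cut, a, b) {
  pr <- pbeta(cut, a, b)
  pr / (1 - pr)
}

y <- 135; n <- 301   # survey data from the problem
a0 <- 1; b0 <- 1     # uniform prior, i.e. beta(1, 1)

prior_odds   <- odds_below(0.5, a0, b0)
post_odds    <- odds_below(0.5, a0 + y, b0 + n - y)   # beta(136, 167) posterior
bayes_factor <- post_odds / prior_odds

c(prior = prior_odds, posterior = post_odds, BF = bayes_factor)
```

Changing a0 and b0 shows how a non-uniform prior would separate the Bayes factor from the posterior odds.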

# Problem 2
For last year, a sample of 50 cell phone users had a mean local monthly
bill of $41.40. Do these data provide sufficient evidence to conclude that
last year's mean local monthly bill for cell phone users has changed from
the 1996 mean of $47.70? (Assume that the population standard deviation
is σ = $25.)

# (a)
# Use this statistic and the R function pnorm to compute
# a p-value for testing the hypothesis H : μ = 47.7.
# z = sqrt(n)*(ybar-mu)/sigma
z = sqrt(50)*(41.40-47.70)/25
round(z,3)
[1] -1.782
# two-sided p-value: 2 * P(Z >= |z|)
p.value <- 2 * pnorm(-abs(z))
round(p.value, 3)
[1] 0.075
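
A two-sided p-value is the probability of a z statistic at least as extreme as the observed one in either direction, 2 * P(Z ≥ |z|), so it always lies between 0 and 1. The helper below is a minimal sketch of that calculation; z_test_two_sided is an illustrative name.

```r
# Two-sided z test of H: mu = mu0 with known population sd sigma
z_test_two_sided <- function(ybar, mu0, sigma, n) {
  z <- sqrt(n) * (ybar - mu0) / sigma
  p <- 2 * pnorm(-abs(z))   # both tails, valid for either sign of z
  list(z = z, p.value = p)
}

res <- z_test_two_sided(ybar = 41.40, mu0 = 47.70, sigma = 25, n = 50)
res
```

Using -abs(z) makes the result independent of whether the sample mean falls below or above the hypothesized value.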

(b)
Suppose one assigns a prior probability of .5 to the null hypothesis. Use
the R function mnormt.twosided to compute the posterior probability
of H. The arguments to mnormt.twosided are the value to be tested
(47.70), the prior probability of H (.5), the standard deviation of
the prior under the alternative hypothesis (assume τ = 4), and the
data vector (values of sample mean, sample size, and known sampling
standard deviation).

mu <- 47.70
prior.prob <- 0.5
tau <- 4
data <- c(41.40, 50, 25)
mnormt.twosided(mu, prior.prob, tau, data)
$bf
[1] 0.6192808
$post
[1] 0.3824419
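
The two numbers returned above are consistent with the usual normal-model Bayes factor, assuming that under H the sample mean is N(μ0, σ²/n) and that under the alternative μ has a N(μ0, τ²) prior. The notation π0 for the prior probability of H is introduced here for illustration.

```latex
BF = \frac{\phi\!\left(\bar{y} \mid \mu_0,\ \sigma^2/n\right)}
          {\phi\!\left(\bar{y} \mid \mu_0,\ \sigma^2/n + \tau^2\right)},
\qquad
P(H \mid \bar{y}) = \frac{\pi_0\, BF}{\pi_0\, BF + 1 - \pi_0}
```

With π0 = 0.5 and BF = 0.6193, the second formula gives 0.6193 / 1.6193 ≈ 0.3824, matching $post.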

(c)
Compute the posterior probability of H for the alternative values
τ = 1, 6, 8, and 10. Compare the values of the posterior probability
with the value of the p-value computed in part (a).

tau1 <- 1
tau2 <- 6
tau3 <- 8
mnormt.twosided(mu, prior.prob, tau1, data)
$bf
[1] 0.9239295
$post
[1] 0.4802304
mnormt.twosided(mu, prior.prob, tau2, data)
$bf
[1] 0.6062231
$post
[1] 0.3774215
mnormt.twosided(mu, prior.prob, tau3, data)
$bf
[1] 0.6554671
$post
[1] 0.3959409
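
The calls above cover τ = 1, 6, and 8. The sketch below computes the same quantities in base R for every value of τ in the problem, including τ = 10. It assumes the normal model with a N(μ0, τ²) prior under the alternative; post_prob_twosided is an illustrative stand-in, not the LearnBayes function itself.

```r
# Bayes factor and posterior probability for H: mu = mu0, where the
# alternative puts a N(mu0, tau^2) prior on mu and ybar ~ N(mu, sigma^2 / n).
post_prob_twosided <- function(mu0, prior_prob, tau, data) {
  ybar <- data[1]; n <- data[2]; sigma <- data[3]
  v <- sigma^2 / n                      # sampling variance of ybar
  bf <- dnorm(ybar, mu0, sqrt(v)) /     # marginal density of ybar under H
        dnorm(ybar, mu0, sqrt(v + tau^2))   # marginal density under the alternative
  post <- prior_prob * bf / (prior_prob * bf + 1 - prior_prob)
  list(bf = bf, post = post)
}

taus <- c(1, 4, 6, 8, 10)
res <- sapply(taus, function(t)
  unlist(post_prob_twosided(47.70, 0.5, t, c(41.40, 50, 25))))
colnames(res) <- paste0("tau=", taus)
round(res, 4)
```

The columns for τ = 1, 4, 6, and 8 can be compared with the mnormt.twosided output printed above.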

Comparing these posterior probabilities with the p-value computed in
part (a), we can see that they are not close to it. The Bayesian
posterior probability of a null hypothesis is approximately equal to
the p-value only for a one-sided test with a vague prior distribution
placed on the parameter. This is a two-sided test of a point null, so
the two measures need not agree.


# Problem 3
# Comparing Bayesian models using a Bayes factor

Suppose that the number of births to women during a month at a particular
hospital has a Poisson distribution with parameter R. During a given
year at a particular hospital, 66 births were recorded in January and 48
births were recorded in April. The birthrates during January and April
are given by RJ and RA, respectively.

# M1: RJ ~ gamma(240, 4), RA ~ gamma(200, 4).
# M2: RJ = RA and the common value of the rate R ~ gamma(220, 4).
# (a)
# Write R functions to compute the logarithm of the posterior density of
# (RJ,RA) under model M1 and the logarithm of the posterior density
# of R under model M2.
logpoissgamma=function(theta,datapar)
{
y=datapar$data
npar=datapar$par
lambda=exp(theta)
loglike=log(dgamma(lambda,shape=sum(y)+1,rate=length(y)))
logprior=log(dgamma(lambda,shape=npar[1],rate=npar[2])*lambda)
return(loglike+logprior)
}
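# Under M1, we have
datapar1=list(data=cbind(48,66),par=cbind(c(220,4),c(200,4)))
# Problem statement: RJ ~ gamma(240, 4), RA ~ gamma(200, 4).
# The par matrix above uses 220, not 240, for RJ. Because logpoissgamma
# takes a scalar theta and reads only npar[1] and npar[2], this list is
# evaluated as a single common rate with a gamma(220, 4) prior.
# Under M2, we have
datapar2=list(data=c(48,66),par=c(220,4))
# RJ = RA = R, R ~ gamma(220, 4)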
# (b)
# Use the function laplace to compute the logarithm of the predictive
# density for both models M1 and M2
fit1=laplace(logpoissgamma,.5,datapar1)
fit1
$mode
[1] 4.019531
$var
[,1]
[1,] 0.002993563
$int
[1] -2.851552
$converge
[1] TRUE
fit2=laplace(logpoissgamma,.5,datapar2)
fit2
$mode
[1] 4.019531
$var
[,1]
[1,] 0.002993563
$int
[1] -2.851552

$converge
[1] TRUE
# (c)
# Compute the Bayes factor in support of the model M1.
log.bayes.factor <- fit1$int - fit2$int
log.bayes.factor
[1] 0
# To get the bayes factor, take the exponential of the value
bayes.factor <- exp(log.bayes.factor)
bayes.factor
[1] 1
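
Because logpoissgamma treats datapar1 and datapar2 identically (scalar theta, first two entries of par), fit1 and fit2 are the same calculation, and a Bayes factor of 1 does not by itself compare M1 with M2. As a hedged cross-check, the Poisson-gamma predictive density has a closed form, so no Laplace approximation is needed. The sketch below assumes gamma(shape, rate) priors, one count per month, and M1 exactly as written in the problem statement; log_marg_poisgamma is an illustrative name, not a LearnBayes function.

```r
# Log predictive density of Poisson counts y that share one rate,
# with a gamma(shape = a, rate = b) prior on that rate.
log_marg_poisgamma <- function(y, a, b) {
  s <- sum(y)
  n <- length(y)
  a * log(b) - lgamma(a) + lgamma(a + s) - (a + s) * log(b + n) -
    sum(lfactorial(y))
}

births <- c(january = 66, april = 48)

# M1: separate rates, RJ ~ gamma(240, 4) and RA ~ gamma(200, 4)
log_m1 <- log_marg_poisgamma(births[["january"]], 240, 4) +
          log_marg_poisgamma(births[["april"]], 200, 4)

# M2: one common rate, R ~ gamma(220, 4)
log_m2 <- log_marg_poisgamma(births, 220, 4)

log_bf_12 <- log_m1 - log_m2   # log Bayes factor in support of M1
bf_12 <- exp(log_bf_12)
c(log_m1 = log_m1, log_m2 = log_m2, log_BF = log_bf_12, BF = bf_12)
```

For a single count, the closed form is the negative binomial density, which gives an independent check via dnbinom.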
# Problem 4
# Is Kobe Bryant streaky?
bfexch
function (theta, datapar)
{
y = datapar$data[, 1]
n = datapar$data[, 2]
K = datapar$K
eta = exp(theta)/(1 + exp(theta))
logf = function(K, eta, y, n) lbeta(K * eta + y, K * (1 - eta) + n - y) - lbeta(K * eta, K * (1 - eta))
sum(logf(K, eta, y, n)) + log(eta * (1 - eta)) - lbeta(sum(y) +
1, sum(n - y) + 1)
}
<environment: namespace:LearnBayes>
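
Read off from the printed source, bfexch returns the following function of θ, where η = e^θ / (1 + e^θ) and B is the beta function. The interpretation of the terms is an assumption based on the code: the sum is the log ratio of beta-binomial terms for the streaky model, log(η(1 − η)) is the Jacobian of the logit transformation, and the last term is the log marginal for a single common shooting probability with a uniform prior.

```latex
h(\theta) = \sum_{i} \Big[ \log B\big(K\eta + y_i,\; K(1-\eta) + n_i - y_i\big)
            - \log B\big(K\eta,\; K(1-\eta)\big) \Big]
            + \log\big(\eta(1-\eta)\big)
            - \log B\Big(\sum_i y_i + 1,\; \sum_i (n_i - y_i) + 1\Big)
```

Integrating exp(h(θ)) over θ, which is what the $int component of laplace approximates on the log scale, gives the log Bayes factor in support of the streaky model.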
y <- c(8,4,5,12,5,7,10,5,12,9,8,7,19,11,7)
n <- c(15,10,7,19,11,17,19,14,23,18,24,23,26,23,16)
data <- cbind(y,n)
data
y n
[1,] 8 15
[2,] 4 10
[3,] 5 7
[4,] 12 19
[5,] 5 11
[6,] 7 17
[7,] 10 19
[8,] 5 14
[9,] 12 23
[10,] 9 18
[11,] 8 24
[12,] 7 23
[13,] 19 26
[14,] 11 23
[15,] 7 16

Use the function laplace together with the function bfexch to compute
the logarithm of the Bayes factor in support of the streaky hypothesis M_K.
Compute the log of the Bayes factors for values of K = 10, 20, 50, and
100.

# Bayes factor for K=10
laplace(bfexch, 0, list(data=data, K=10))
$mode
[1] -0.0480957
$var
[,1]
[1,] 0.03719374

$int
[1] -1.421339
$converge
[1] TRUE
bf10 = exp(-1.42)
bf10
[1] 0.2417
# Bayes factor for K=20
laplace(bfexch, 0, list(data=data, K=20))
$mode
[1] -0.05239258
$var
[,1]
[1,] 0.02708854
$int
[1] -0.09296268
$converge
[1] TRUE
bf20 = exp(-0.093)
bf20
[1] 0.9112
# Bayes factor for K=50
laplace(bfexch, 0, list(data=data, K=50))
$mode
[1] -0.05375977
$var
[,1]
[1,] 0.01766577
$int
[1] 0.3381446
$converge
[1] TRUE
bf50 = exp(0.393)
bf50
[1] 1.481
# Bayes factor for K=100
laplace(bfexch, 0, list(data=data, K=100))
$mode
[1] -0.05375977
$var
[,1]
[1,] 0.01766577
$int
[1] 0.3381446
$converge
[1] TRUE
bf100 = exp(0.338)
bf100
[1] 1.402

log.marg=function(logK)
laplace(bfexch,0,list(data=data,K=exp(logK)))$int
log.K=seq(2,6)
K=exp(log.K)
log.BF=sapply(log.K,log.marg)
BF=exp(log.BF)
round(data.frame(log.K,K,log.BF,BF),2)
  log.K      K log.BF   BF
1     2   7.39  -2.37 0.09
2     3  20.09  -0.09 0.92
3     4  54.60   0.40 1.49
4     5 148.41   0.27 1.31
5     6 403.43   0.12 1.13
# Based on your work, is there much evidence that Bryant displayed
# true streakiness in his shooting performance in these 15 games?
# The largest Bayes factor in the table is BF = 1.49, at log.K = 4.
# Even there, the streaky model is only about one and a half times
# as likely as the consistent model. A Bayes factor this close to 1
# is weak evidence, so these 15 games give little support for
# true streakiness.
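
One way to read these Bayes factors is to convert them to posterior probabilities of the streaky model. The sketch below assumes equal prior probabilities for the streaky and consistent models, which is an assumption made here for illustration and not part of the exercise; the BF values are copied from the table above.

```r
log.K <- 2:6
bf <- c(0.09, 0.92, 1.49, 1.31, 1.13)   # BF column from the table

# With prior odds of 1, posterior odds equal the Bayes factor,
# so P(streaky | data) = BF / (1 + BF).
post_prob_streaky <- bf / (1 + bf)

round(data.frame(log.K, BF = bf, post.prob = post_prob_streaky), 3)
```

Even at the most favorable value of K, the posterior probability of the streaky model stays close to one half under this assumption.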
