
ECE531 Lecture 3: Minimax Hypothesis Testing

D. Richard Brown III

Worcester Polytechnic Institute

05-February-2009


Simple Binary Bayesian Risks Under Different Priors

r(δ, π0) = π0 R0(δ) + (1 − π0)R1(δ)


[Figure: for a fixed rule δ, the Bayes risk r(δ, π0) is a straight line in π0, running from R1(δ) at π0 = 0 to R0(δ) at π0 = 1; the Bayes rule δBπ0′ designed for a particular prior π0′ has its own risk line, which touches the minimum Bayes risk curve r(δBπ0, π0) at the design prior π0′.]


Least Favorable Prior State Distribution


[Figure: the minimum Bayes risk curve V(π0) = r(δBπ0, π0) is concave in π0 and attains its maximum at the least favorable prior πlf; there the risk line r(δBπlf, π0) is horizontal, so R0(δBπlf) = R1(δBπlf).]


Minimax Hypothesis Testing

Definition:

ρmm := arg min_ρ max_j Rj(ρ)

Remarks:
◮ No single decision rule minimizes the weighted-average (Bayes) risk for
every possible prior state distribution.
◮ A conservative approach is to minimize the worst case risk over all
possible prior state distributions.
◮ Intuitively, there should be a least favorable prior. Does it always
exist? Is it unique?
◮ Intuitively, the minimax decision rule should be the Bayesian decision
rule with constant Bayesian risk over the priors. Is this always true?
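
To make the definition concrete, here is a minimal sketch (mine, not from the lecture) that computes ρmm by brute force when the candidate rules form a finite set with known conditional risks. The risk values are those of the four deterministic rules in the coin-flipping example later in this lecture. Note that a search over deterministic rules alone can miss the minimax rule when randomization is required (Case 2 below).

```python
import numpy as np

# Conditional risks R_j(rho): rows are candidate rules, columns are states.
# Values taken from the coin-flipping example (modified cost matrix) later
# in this lecture.
R = np.array([[100.0,  60.0],   # rule 1
              [  0.0, 100.0],   # rule 2
              [ 50.0,  60.0],   # rule 3
              [ 50.0, 100.0]])  # rule 4

worst_case = R.max(axis=1)        # max_j R_j(rho) for each rule
best = int(worst_case.argmin())   # arg min over rules of the worst case
print(f"minimax rule: rule {best + 1}, worst-case risk {worst_case[best]:.0f}")
# -> minimax rule: rule 3, worst-case risk 60
```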


Minimum Bayesian Risk as a Function of the Prior

Let V(π) := r(δBπ, π) be the minimum Bayesian risk for the prior π.

Theorem
The minimum Bayesian risk V(π) is concave and continuous over the space of
priors satisfying πj ≥ 0, j = 0, 1, . . . , N − 1, and Σj πj = 1.
Hence, there exists a unique least favorable prior

πlf = arg max_π V(π).


Concavity of the Minimum Bayesian Risk


A function f is concave if, for any x, y in the domain of f and any
α ∈ [0, 1], f(αx + (1 − α)y) ≥ αf(x) + (1 − α)f(y).

Denote a pair of priors as π and π′ and a third prior π′′ = απ + (1 − α)π′.
We can write

V(π′′) = π′′⊤ R(δBπ′′)
       = απ⊤ R(δBπ′′) + (1 − α)π′⊤ R(δBπ′′)
       ≥ αV(π) + (1 − α)V(π′)

where the inequality holds because δBπ′′ is in general suboptimal for the
priors π and π′, i.e. π⊤ R(δBπ′′) ≥ π⊤ R(δBπ) = V(π). Hence V(π) is concave.
[Figure: the chord αV(π0) + (1 − α)V(π0′) lies below the concave curve V at the intermediate prior π0′′.]
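
A quick numerical illustration (a sketch, not part of the slides): with finitely many deterministic rules, V(π0) is the pointwise minimum of the linear Bayes risks π0 R0(δ) + (1 − π0)R1(δ), and a pointwise minimum of linear functions is concave. The risk pairs below are those of the original coin-flipping example treated later in the lecture.

```python
import numpy as np

# (R0, R1) for the four deterministic rules of the original coin-flipping
# example (cost matrix [[0, 100], [100, 0]]).
risks = np.array([[100.0, 0.0], [0.0, 100.0], [50.0, 0.0], [50.0, 100.0]])

def V(pi0):
    """Minimum Bayes risk: the lower envelope of the linear Bayes risks."""
    pi0 = np.atleast_1d(np.asarray(pi0, dtype=float))
    bayes = np.outer(pi0, risks[:, 0]) + np.outer(1.0 - pi0, risks[:, 1])
    return bayes.min(axis=1)

# Chord test of concavity: V(a*x + (1-a)*y) >= a*V(x) + (1-a)*V(y).
rng = np.random.default_rng(0)
x, y, a = rng.random(500), rng.random(500), rng.random(500)
assert np.all(V(a*x + (1 - a)*y) >= a*V(x) + (1 - a)*V(y) - 1e-9)

pi = np.linspace(0.0, 1.0, 1001)
v = V(pi)
print(f"max of V near pi0 = {pi[v.argmax()]:.3f}, V = {v.max():.2f}")
# expect the maximum near pi0 = 2/3 with V close to 100/3
```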


Continuity of the Minimum Bayesian Risk

Theorem (“A First Course in Optimization Theory” by R.K. Sundaram)
Let f : D → R be a concave function. Then, if D is open, f is continuous
on D. If D is not open, f is continuous on the interior of D.

Note that continuity does not imply differentiability.


The Four Possibilities


[Figure: the four possibilities for the maximum of V(π0): (1) a differentiable interior maximum, where R0(δBπlf) = R1(δBπlf); (2) a non-differentiable interior maximum (a kink); (3) a boundary maximum at π0lf = 0; (4) a boundary maximum at π0lf = 1.]


Case 1: Differentiable Interior Maximum Risk


Theorem
If there exists a prior π′ such that the conditional risks satisfy
R0(δBπ′) = R1(δBπ′), then π′ is a least favorable prior and the minimax
decision rule is ρmm = δBπ′.

Proof.
Given a π′ satisfying R0(δBπ′) = R1(δBπ′). For any δ,

max{R0(δ), R1(δ)} ≥ max_{π0 ∈ [0,1]} π0 R0(δ) + (1 − π0)R1(δ)
                  ≥ π0′ R0(δ) + (1 − π0′)R1(δ)
                  ≥ π0′ R0(δBπ′) + (1 − π0′)R1(δBπ′)
                  = R0(δBπ′) = R1(δBπ′)

so no rule has a smaller maximum conditional risk than δBπ′. Moreover,
since R0(δBπ′) = R1(δBπ′), the Bayes risk of δBπ′ is the same under every
prior, so for any π0 ∈ [0, 1],

V(π′) = π0′ R0(δBπ′) + (1 − π0′)R1(δBπ′) = π0 R0(δBπ′) + (1 − π0)R1(δBπ′) ≥ V(π0)

and π′ is least favorable.


A Procedure for Finding the Minimax Decision Rule

1. Find a Bayesian decision rule δBπ as a function of the prior π.
2. See if Case 1 holds by solving for the unique least favorable prior πlf
   using the equalizer rule (a numerical sketch follows this list):

   R0(δBπlf) = R1(δBπlf)

3. If the solution exists, then set ρmm = δBπlf.
4. If there is no solution to the equalizer rule, then see if Case 3 or 4
   holds by computing the risk at the endpoints π0lf = 0 and π0lf = 1.
5. If neither endpoint is least favorable, then we must be in Case 2. In
   this case we must create a randomized minimax decision rule as a
   convex combination of two deterministic Bayes decision rules.
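
Here is a minimal sketch of step 2 (my illustration, not part of the slides): when the equalizer equation has no closed-form solution, it can be solved numerically. The conditional risks are those of the coherent-BPSK example on the next slide, with illustrative parameters a0 = 0, a1 = 1, σ = 1.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import norm

a0, a1, sigma = 0.0, 1.0, 1.0   # illustrative parameters (my choice)

def gamma(pi0):
    """Bayes threshold as a function of the prior (uniform costs)."""
    return (a0 + a1) / 2 + sigma**2 / (a1 - a0) * np.log(pi0 / (1 - pi0))

def risk_gap(pi0):
    """R0(delta_Bpi0) - R1(delta_Bpi0); the equalizer rule sets this to 0."""
    g = gamma(pi0)
    R0 = norm.sf((g - a0) / sigma)   # Q((gamma - a0)/sigma)
    R1 = norm.sf((a1 - g) / sigma)   # Q((a1 - gamma)/sigma)
    return R0 - R1

# Step 2: solve the equalizer equation for the least favorable prior.
pi_lf = brentq(risk_gap, 1e-6, 1.0 - 1e-6)
print(f"least favorable prior: pi0 = {pi_lf:.4f}")   # expect 0.5
```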

Example: Coherent Detection of BPSK


Our Bayes decision rule for coherent BPSK with prior π0, π1 = 1 − π0 is

         { 1    if y > γ
δBπ(y) = { 0/1  if y = γ
         { 0    if y < γ

where γ := (a0 + a1)/2 + (σ²/(a1 − a0)) ln(π0/π1).

The conditional risks are

R0(δBπ) = Q((γ − a0)/σ)
R1(δBπ) = Q((a1 − γ)/σ)

where Q(x) := ∫ₓ^∞ (1/√(2π)) e^(−t²/2) dt.

Let’s try the equalizer rule. What value of γ gives us R0 (δBπ ) = R1 (δBπ )?

Example: Coherent Detection of BPSK


Answer: R0 = R1 when γ = (a0 + a1)/2. Hence

         { 1    if y > (a0 + a1)/2
ρmm(y) = { 0/1  if y = (a0 + a1)/2
         { 0    if y < (a0 + a1)/2.

[Figure: the signal levels a0 and a1 on the observation axis, with decision regions Y0 and Y1 separated by the threshold γ = (a0 + a1)/2.]

What does this imply about the least favorable prior?
Answer: π0 = π1 = 1/2.
Given a0, a1, and σ, the minimax rule allows you to guarantee a
worst-case risk of Q((a1 − a0)/(2σ)) over all priors.
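
A quick numerical check of this guarantee (a sketch with illustrative values a0 = −1, a1 = 1, σ = 1, which are my assumptions): at γ = (a0 + a1)/2 the two conditional risks coincide at Q((a1 − a0)/(2σ)) regardless of the prior.

```python
from scipy.stats import norm

a0, a1, sigma = -1.0, 1.0, 1.0      # illustrative values (my choice)
gamma = (a0 + a1) / 2               # minimax threshold

R0 = norm.sf((gamma - a0) / sigma)  # Q((gamma - a0)/sigma)
R1 = norm.sf((a1 - gamma) / sigma)  # Q((a1 - gamma)/sigma)

assert abs(R0 - R1) < 1e-12         # equalizer rule satisfied
print(f"guaranteed worst-case risk: {R0:.4f}")
# Q((a1 - a0)/(2*sigma)) = Q(1), approximately 0.1587
```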

Example: Coherent Detection of BPSK


[Figure: conditional risks R0(δBπ0) and R1(δBπ0) and the minimum Bayes risk curve for the BPSK example; at the least favorable prior πlf the line r(δBπlf, π0) is horizontal, so the minimax rule's risk does not depend on the prior.]


Cases 3-4: Maximum Risk Occurs at Boundary


Let’s return to our coin flipping problem from Lecture 1 (H0 ↔ x0 = HT
and H1 ↔ x1 = HH) with a modified cost matrix
 
0 100
C=
100 60
[Figure: Bayes risks of rules 1-4 versus the prior π0, together with their minimum (the lower envelope); the minimum Bayes risk is maximized at the boundary π0 = 0.]

Cases 3-4: Maximum Risk Occurs at Boundary


Remarks:
◮ Rules 2 and 3, depending on the prior, minimize the Bayes risk.
◮ Note that the equalizer rule gives no solution to this problem:

  R0(D1) = 100 and R1(D1) = 60
  R0(D2) = 0 and R1(D2) = 100
  R0(D3) = 50 and R1(D3) = 60
  R0(D4) = 50 and R1(D4) = 100

  No decision rule gives R0 = R1.
◮ In this example, the least favorable prior (maximizing the minimum
risk) is π0 = 0, i.e. the coin is always HH. This should make sense.
◮ The minimax decision rule is Rule 3: observe T, decide the coin is
fair; observe H, decide the coin is unfair.
◮ You can guarantee a worst-case risk of $60 by using Rule 3 (see the
sketch below).
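
The endpoint check can be scripted. This is a sketch, not from the slides; the observation model, a fair HT coin that shows H with probability 1/2 versus an HH coin that always shows H, is my reconstruction of the Lecture 1 setup, chosen to reproduce the conditional risks listed above.

```python
import numpy as np

C = np.array([[  0.0, 100.0],   # C[i, j] = cost of deciding i when x_j true
              [100.0,  60.0]])  # modified cost matrix
P = np.array([[0.5, 0.0],       # P[y, j] = Pr(y | x_j); y = 0 for T, 1 for H
              [0.5, 1.0]])      # column 0: fair HT coin, column 1: HH coin

# Enumerate all deterministic rules delta: {T, H} -> {0, 1}.
for d in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    # Conditional risks R_j(delta) = sum_y C[delta(y), j] * P(y | x_j).
    R = [sum(C[d[y], j] * P[y, j] for y in range(2)) for j in range(2)]
    print(f"delta(T)={d[0]}, delta(H)={d[1]}: "
          f"R0={R[0]:5.1f}, R1={R[1]:5.1f}, worst case={max(R):5.1f}")
# Rule 3 (delta(T)=0, delta(H)=1) attains the smallest worst case, 60,
# consistent with the boundary least favorable prior pi0 = 0.
```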

Case 2: Non-Differentiable Interior Maximum


Back to our original coin flipping problem with cost matrix

C = [   0  100 ]
    [ 100    0 ]
[Figure: Bayes risks of rules 1-4 versus the prior π0, together with their minimum; the minimum Bayes risk is maximized at an interior kink where the rule 2 and rule 3 lines cross.]


Case 2: Non-Differentiable Interior Maximum


Remarks:
◮ Rules 2 and 3, depending on the prior, minimize the Bayes risk.
◮ Again, the equalizer rule gives no deterministic solution since

  R0(D1) = 100 and R1(D1) = 0
  R0(D2) = 0 and R1(D2) = 100
  R0(D3) = 50 and R1(D3) = 0
  R0(D4) = 50 and R1(D4) = 100

◮ In this example, the least favorable prior (maximizing the minimum
risk) is π0 = 2/3, i.e. the coin is HT with probability 2/3.
◮ The minimax decision rule is neither Rule 2 nor Rule 3.
◮ You can guarantee a worst-case risk of $100/3 by using a randomized
decision rule that is a combination of Rules 2 and 3.

Case 2: Non-Differentiable Interior Maximum


Problem: Find a randomized decision rule that satisfies the equalizer rule.

[Figure: the Bayes risk lines of the two deterministic rules δBπlf− and δBπlf+ meet at the kink V(πlf); a properly randomized mixture ρmm has constant Bayes risk r(ρmm) = V(πlf) across all priors, with R0(ρmm) = R1(ρmm).]

Case 2: Non-Differentiable Interior Maximum


Our randomized minimax decision rule is the convex combination

ρmm = α δBπlf− + (1 − α) δBπlf+

We can calculate the randomization α ∈ [0, 1] by applying the equalizer
rule:

α R0(δBπlf−) + (1 − α) R0(δBπlf+) = α R1(δBπlf−) + (1 − α) R1(δBπlf+)

which gives the solution

α = [R1(δBπlf+) − R0(δBπlf+)] / [(R0(δBπlf−) − R1(δBπlf−)) − (R0(δBπlf+) − R1(δBπlf+))]
  = V′(πlf+) / (V′(πlf+) − V′(πlf−))
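
Applied to the original coin-flipping example (a sketch; the risk values are taken from the remarks two slides back, with δBπlf− = Rule 3 and δBπlf+ = Rule 2):

```python
# Conditional risks of the two Bayes rules that meet at the kink pi_lf = 2/3.
R0m, R1m = 50.0,   0.0   # R0, R1 of delta_Bpi_lf- (Rule 3)
R0p, R1p =  0.0, 100.0   # R0, R1 of delta_Bpi_lf+ (Rule 2)

alpha = (R1p - R0p) / ((R0m - R1m) - (R0p - R1p))
# The equalized risk is the same whether computed under H0 or H1.
risk_H0 = alpha * R0m + (1 - alpha) * R0p
risk_H1 = alpha * R1m + (1 - alpha) * R1p

print(f"alpha = {alpha:.4f}")                              # expect 2/3
print(f"equalized risk = {risk_H0:.2f} = {risk_H1:.2f}")   # expect 100/3
```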


Case 2: Non-Differentiable Interior Maximum


What is the minimax decision rule in our example?

V′(πlf+) = −100
V′(πlf−) = 50

hence α = 2/3. If H0 is the hypothesis that the coin is HT, H1 is the
hypothesis that the coin is HH, and the observations are y0 = T and
y1 = H, then our deterministic decision rules 2 and 3 can be written as

D3 = δBπlf− = [ 1  0 ]    and    D2 = δBπlf+ = [ 1  1 ]
              [ 0  1 ]                         [ 0  0 ]

The minimax decision rule is then given by

ρmm(y = T) = (2/3)[1; 0] + (1/3)[1; 0] = [1; 0]       (T → always decide HT)
ρmm(y = H) = (2/3)[0; 1] + (1/3)[1; 0] = [1/3; 2/3]   (H → randomize:
                                                       Pr(decide HT) = 1/3,
                                                       Pr(decide HH) = 2/3)
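
As a sanity check (my sketch, not part of the lecture), simulating ρmm under each hypothesis with the Lecture 1 observation model assumed earlier should return conditional risks of about 100/3 each:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
cost = np.array([[0.0, 100.0], [100.0, 0.0]])   # original cost matrix

def rho_mm(y):
    """Randomized minimax rule: T -> decide HT; H -> decide HH w.p. 2/3."""
    if y == 0:                        # y = T
        return 0                      # always decide HT
    return int(rng.random() < 2/3)    # y = H: decide HH with probability 2/3

for j, p_heads in [(0, 0.5), (1, 1.0)]:   # H0: fair HT coin, H1: HH coin
    ys = (rng.random(n) < p_heads).astype(int)
    decisions = np.array([rho_mm(y) for y in ys])
    print(f"estimated R{j} = {cost[decisions, j].mean():.2f}")
# both estimates should be close to 100/3, about 33.33
```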


Final Remarks on Minimax Hypothesis Testing

1. The objective of minimax hypothesis testing is to minimize your
worst-case (maximum) risk over all possible prior state probabilities.
2. This is a conservative approach, but it is useful in scenarios where:
◮ the prior is unknown, and/or
◮ you need to provide a maximum risk guarantee.
3. Try the equalizer rule first!
4. Minimax risk at the endpoints only occurs in weird cases.
5. A finite observation space Y means the minimum Bayes risk curve
V is not going to be differentiable everywhere. Randomization is
often necessary to obtain the minimax decision rule in these cases.
6. Composite hypotheses:
◮ The equalizer rule is still valid.
◮ Checking “endpoints” is still valid (but there are more points to check).
◮ In the case of a non-differentiable interior maximum, finding the
randomization can be difficult.