ENEE 627 SPRING 2011 Information Theory Convexity

© 2011 by Armand M. Makowski
Convex sets
A subset $K$ of $\mathbb{R}^d$ is said to be convex if for any elements $x$ and $y$ of $K$, we have
\[ \lambda x + (1 - \lambda) y \in K, \quad \lambda \in [0, 1]. \]
It is a simple exercise to show the following by induction:

Lemma 0.1 A set $K$ in $\mathbb{R}^d$ is convex if and only if for any integer $p = 2, 3, \ldots$, and any collection $x_1, \ldots, x_p$ in $K$, we have
\[ \lambda_1 x_1 + \ldots + \lambda_p x_p \in K \]
for arbitrary $\lambda_1, \ldots, \lambda_p$ in $[0, 1]$ such that
\[ \lambda_1 + \ldots + \lambda_p = 1. \]
We refer to the linear combination $\lambda_1 x_1 + \ldots + \lambda_p x_p$ with $x_1, \ldots, x_p$ in $\mathbb{R}^d$ and $\lambda_1, \ldots, \lambda_p$ in $[0, 1]$ such that
\[ \lambda_1 + \ldots + \lambda_p = 1 \]
as a convex combination.
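As an illustration only, the following Python sketch forms convex combinations and spot-checks the property of Lemma 0.1 on one particular convex set, the closed Euclidean unit ball. The choice of set, the helper names, and the random sampling scheme are assumptions made for this example and are not part of the notes; a random check of this kind is of course not a proof.

```python
import random

def convex_combination(points, weights):
    """Return weights[0]*points[0] + ... + weights[p-1]*points[p-1] in R^d."""
    if len(points) != len(weights):
        raise ValueError("need one weight per point")
    if any(w < 0.0 or w > 1.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to 1")
    d = len(points[0])
    return [sum(w * pt[k] for w, pt in zip(weights, points)) for k in range(d)]

def in_unit_ball(x, tol=1e-12):
    """Membership test for the closed Euclidean unit ball (a convex set)."""
    return sum(t * t for t in x) <= 1.0 + tol

def random_point_in_ball(rng, d):
    """Rejection-sample a point of the unit ball from the cube [-1, 1]^d."""
    while True:
        x = [rng.uniform(-1.0, 1.0) for _ in range(d)]
        if in_unit_ball(x, tol=0.0):
            return x

def random_weights(rng, p):
    """Draw p positive weights and normalize them so that they sum to 1."""
    raw = [rng.random() + 1e-12 for _ in range(p)]
    total = sum(raw)
    return [r / total for r in raw]

def check_ball_is_closed_under_convex_combinations(trials=1000, d=3, seed=0):
    """Return True if every sampled convex combination stays in the ball."""
    rng = random.Random(seed)
    for _ in range(trials):
        p = rng.randint(2, 6)
        points = [random_point_in_ball(rng, d) for _ in range(p)]
        weights = random_weights(rng, p)
        if not in_unit_ball(convex_combination(points, weights)):
            return False
    return True
```

The small tolerance in `in_unit_ball` only absorbs floating-point rounding; mathematically the combination lies in the ball exactly.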
Convex functions
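Consider a convex set $K$ in $\mathbb{R}^d$. A function $\varphi : K \to \mathbb{R}$ is said to be convex if for any elements $x$ and $y$ of $K$, we have
\[ \varphi(\lambda x + (1 - \lambda) y) \le \lambda \varphi(x) + (1 - \lambda) \varphi(y), \quad \lambda \in [0, 1]. \]
A function $\varphi : K \to \mathbb{R}$ is said to be concave if $-\varphi$ is a convex function.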

It is also a simple exercise to show the following by induction:
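Lemma 0.2 Consider a convex set $K$ in $\mathbb{R}^d$. A function $\varphi : K \to \mathbb{R}$ is convex if and only if for any integer $p = 2, 3, \ldots$, and any collection $x_1, \ldots, x_p$ in $K$, we have
\[ \varphi(\lambda_1 x_1 + \ldots + \lambda_p x_p) \le \lambda_1 \varphi(x_1) + \ldots + \lambda_p \varphi(x_p) \]
for arbitrary $\lambda_1, \ldots, \lambda_p$ in $[0, 1]$ such that
\[ \lambda_1 + \ldots + \lambda_p = 1. \]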
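The inequality of Lemma 0.2 can be spot-checked numerically. The sketch below is only an illustration under stated assumptions: it takes $d = 1$, uses $t \mapsto t^2$ as a sample convex function, and the names `jensen_gap`, `square` and `min_gap_over_random_trials` are hypothetical helpers introduced here, not notation from the notes.

```python
import random

def jensen_gap(phi, xs, lams):
    """Return sum_i lam_i*phi(x_i) - phi(sum_i lam_i*x_i).

    For a convex phi and weights in [0, 1] summing to 1, this is >= 0.
    """
    mix = sum(l * x for l, x in zip(lams, xs))
    return sum(l * phi(x) for l, x in zip(lams, xs)) - phi(mix)

def square(t):
    """Sample convex function on the real line (an assumption of this sketch)."""
    return t * t

def min_gap_over_random_trials(phi, trials=1000, seed=0):
    """Smallest gap observed over random points and random weights."""
    rng = random.Random(seed)
    smallest = float("inf")
    for _ in range(trials):
        p = rng.randint(2, 6)
        xs = [rng.uniform(-5.0, 5.0) for _ in range(p)]
        raw = [rng.random() + 1e-12 for _ in range(p)]
        total = sum(raw)
        lams = [r / total for r in raw]  # weights in [0, 1] summing to 1
        smallest = min(smallest, jensen_gap(phi, xs, lams))
    return smallest
```

For instance, `jensen_gap(square, [0.0, 2.0], [0.5, 0.5])` evaluates $0.5 \cdot 0 + 0.5 \cdot 4 - 1^2 = 1$, while the gap vanishes when all the points coincide.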
Strictly convex functions
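A function $\varphi : K \to \mathbb{R}$ is said to be strictly convex if it is convex and whenever the equality
\[ \varphi(\lambda x + (1 - \lambda) y) = \lambda \varphi(x) + (1 - \lambda) \varphi(y), \quad x, y \in K, \ \lambda \in (0, 1) \]
holds, we necessarily have $x = y$. As expected, a function $\varphi : K \to \mathbb{R}$ is said to be strictly concave if $-\varphi$ is a strictly convex function.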
Of great usefulness in many arguments is the following observation: Consider a strictly convex $\varphi : K \to \mathbb{R}$. Suppose that for some $p = 2, 3, \ldots$, with $x_1, \ldots, x_p$ in $K$, we have the equality
\[ \varphi(\lambda_1 x_1 + \ldots + \lambda_p x_p) = \lambda_1 \varphi(x_1) + \ldots + \lambda_p \varphi(x_p) \tag{1} \]
with $\lambda_1, \ldots, \lambda_p$ in $(0, 1)$ such that
\[ \lambda_1 + \ldots + \lambda_p = 1. \]
Under such circumstances, what can we say about $x_1, \ldots, x_p$? We shall show that we must necessarily have
\[ x_1 = \ldots = x_p. \tag{2} \]
If $p = 2$, since $0 < \lambda_1, \lambda_2 < 1$, by definition of strict convexity we automatically have the conclusion $x_1 = x_2$. If $p > 2$, the matter is more involved. To proceed, with any subset $I$ of $\{1, \ldots, p\}$ such that $1 \le |I| < p$ we define
\[ \lambda_I = \sum_{i \in I} \lambda_i. \]
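Under the foregoing assumptions we have $0 < \lambda_I < 1$, so that the definition
\[ x_I = \sum_{i \in I} \frac{\lambda_i}{\lambda_I} x_i \]
is well posed and yields an element of $K$. We also note that
\[ \lambda_I x_I + \lambda_{I^c} x_{I^c} = \lambda_1 x_1 + \ldots + \lambda_p x_p \]
with
\[ \lambda_I + \lambda_{I^c} = 1. \]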
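Using the convexity of $\varphi$ twice we get
\[
\begin{aligned}
\varphi(\lambda_1 x_1 + \ldots + \lambda_p x_p)
&= \varphi(\lambda_I x_I + \lambda_{I^c} x_{I^c}) \\
&\le \lambda_I \varphi(x_I) + \lambda_{I^c} \varphi(x_{I^c}) \\
&\le \lambda_I \left( \sum_{i \in I} \frac{\lambda_i}{\lambda_I} \varphi(x_i) \right) + \lambda_{I^c} \left( \sum_{j \notin I} \frac{\lambda_j}{\lambda_{I^c}} \varphi(x_j) \right) \\
&= \lambda_1 \varphi(x_1) + \ldots + \lambda_p \varphi(x_p).
\end{aligned}
\tag{3}
\]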
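Moreover, convexity again gives
\[ \varphi(x_I) \le \sum_{i \in I} \frac{\lambda_i}{\lambda_I} \varphi(x_i) \tag{4} \]
and
\[ \varphi(x_{I^c}) \le \sum_{j \notin I} \frac{\lambda_j}{\lambda_{I^c}} \varphi(x_j). \tag{5} \]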
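However, because of (1) the inequalities leading to (3) must necessarily hold as equalities, and this implies
\[ \varphi(\lambda_I x_I + \lambda_{I^c} x_{I^c}) = \lambda_I \varphi(x_I) + \lambda_{I^c} \varphi(x_{I^c}), \tag{6} \]
\[ \varphi(x_I) = \sum_{i \in I} \frac{\lambda_i}{\lambda_I} \varphi(x_i) \tag{7} \]
and
\[ \varphi(x_{I^c}) = \sum_{j \notin I} \frac{\lambda_j}{\lambda_{I^c}} \varphi(x_j) \tag{8} \]
as we make use of the fact that
\[ 0 < \lambda_I, \lambda_{I^c} < 1. \]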
By strict convexity it follows from (6) that
\[ x_I = x_{I^c}. \]
With (7) and (8) as point of departure, in lieu of (1), we can repeat the arguments above with $I$ and $I^c$, respectively, instead of $\{1, \ldots, p\}$. Upon doing this as many times as needed we can eventually conclude that
\[ x_i = x_j, \quad i, j = 1, \ldots, p, \ i \ne j \]
and this completes the proof of (2).
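The grouping step of this argument can be traced numerically. The following sketch is a hedged illustration, not part of the proof: it assumes $d = 1$, uses $t \mapsto t^2$ as a sample strictly convex function in the usage note, indexes points from 0 (as Python does) rather than from 1, and the helper names `group` and `convexity_chain` are introduced here for the example only.

```python
def group(xs, lams, index_set):
    """Return (lam_I, x_I): the total weight of I and the renormalized mean over I."""
    lam_I = sum(lams[i] for i in index_set)
    x_I = sum(lams[i] / lam_I * xs[i] for i in index_set)
    return lam_I, x_I

def convexity_chain(phi, xs, lams, index_set):
    """Return the left, middle and right members of the chain leading to (3).

    left   = phi(lam_I*x_I + lam_Ic*x_Ic), which equals phi(sum_i lam_i*x_i)
    middle = lam_I*phi(x_I) + lam_Ic*phi(x_Ic)
    right  = sum_i lam_i*phi(x_i)
    For convex phi we expect left <= middle <= right.
    """
    p = len(xs)
    I = sorted(index_set)
    if not 1 <= len(I) < p:
        raise ValueError("I must be a nonempty proper subset of the indices")
    Ic = [j for j in range(p) if j not in index_set]
    lam_I, x_I = group(xs, lams, I)
    lam_Ic, x_Ic = group(xs, lams, Ic)
    left = phi(lam_I * x_I + lam_Ic * x_Ic)
    middle = lam_I * phi(x_I) + lam_Ic * phi(x_Ic)
    right = sum(l * phi(x) for l, x in zip(lams, xs))
    return left, middle, right
```

With `xs = [1.0, 2.0, 4.0]`, `lams = [0.25, 0.25, 0.5]`, `index_set = {0, 1}` and `phi = lambda t: t * t`, the three members are 7.5625, 9.125 and 9.25, so both inequalities are strict; they all coincide when the three points are equal, in line with (2).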
Kullback-Leibler distance
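Consider a set $\mathcal{X}$ of finite cardinality. With $\mu$ and $\nu$ pmfs on $\mathcal{X}$, define
\[ D(\mu \| \nu) = \sum_{x} \mu(x) \log \frac{\mu(x)}{\nu(x)} \]
with the conventions
\[ 0 \log \frac{0}{0} = 0, \qquad p \log \frac{p}{0} = \infty \ \text{ if } p > 0 \]
and
\[ 0 \log \frac{0}{q} = 0 \ \text{ if } q > 0. \]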
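The definition and its three conventions translate directly into code. The sketch below is one possible rendering under assumptions not fixed by the notes: pmfs are represented as Python dictionaries mapping outcomes to probabilities, missing keys are treated as probability zero, and the logarithm is natural (so the result is in nats). The function name `kl_divergence` is chosen here for illustration.

```python
import math

def kl_divergence(first, second):
    """Kullback-Leibler distance of pmf `first` from pmf `second`, in nats.

    Both arguments are dicts {outcome: probability} on the same finite set.
    """
    total = 0.0
    for x in set(first) | set(second):
        p = first.get(x, 0.0)
        q = second.get(x, 0.0)
        if p == 0.0:
            # Conventions 0 log(0/0) = 0 and 0 log(0/q) = 0 for q > 0.
            continue
        if q == 0.0:
            # Convention p log(p/0) = infinity for p > 0.
            return math.inf
        total += p * math.log(p / q)
    return total
```

For example, with `first = {"a": 1.0, "b": 0.0}` and `second = {"a": 0.5, "b": 0.5}` the only contributing term is $1 \cdot \log(1/0.5) = \log 2$, while swapping the arguments gives an infinite value because the second pmf then vanishes at `"b"`, where the first does not.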
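The proof of Theorem 2.6.3 revisited: Thus,
\[
\begin{aligned}
-D(\mu \| \nu)
&= -\sum_{x} \mu(x) \log \frac{\mu(x)}{\nu(x)} \\
&= -\sum_{x:\, \mu(x) > 0} \mu(x) \log \frac{\mu(x)}{\nu(x)} \\
&= \sum_{x:\, \mu(x) > 0} \mu(x) \log \frac{\nu(x)}{\mu(x)} \\
&\le \log \left( \sum_{x:\, \mu(x) > 0} \mu(x) \frac{\nu(x)}{\mu(x)} \right) && (9) \\
&= \log \left( \sum_{x:\, \mu(x) > 0} \nu(x) \right) \\
&\le \log 1 = 0 && (10)
\end{aligned}
\]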
whence $-D(\mu \| \nu) \le 0$, or equivalently, $D(\mu \| \nu) \ge 0$.

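The equality $D(\mu \| \nu) = 0$ occurs if and only if equality occurs at both (9) and (10). By the strict concavity of $t \mapsto \log t$, equality occurs at (9) if and only if there exists $c > 0$ such that
\[ \frac{\nu(x)}{\mu(x)} = c, \quad x \in \mathcal{X} \ \text{with} \ \mu(x) > 0. \]
As a result,
\[ \sum_{x:\, \mu(x) > 0} \nu(x) = c \sum_{x:\, \mu(x) > 0} \mu(x) = c \]
since
\[ \sum_{x:\, \mu(x) > 0} \mu(x) = \sum_{x} \mu(x) = 1. \]
On the other hand, equality occurs at (10) if and only if
\[ \sum_{x:\, \mu(x) > 0} \nu(x) = 1. \]
Consequently, $c = 1$ and
\[ \sum_{x:\, \mu(x) = 0} \nu(x) = 0, \]
whence $\nu(x) = 0$ if and only if $\mu(x) = 0$. In sum, $\mu(x) = \nu(x)$ for all $x$ in $\mathcal{X}$.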
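To see why equality is needed at both steps of the argument, it can help to compute the two intermediate quantities separately. The sketch below is an illustration under assumptions of its own: pmfs are given as equal-length lists indexed by outcome, logarithms are natural, and the function name `information_inequality_steps` is hypothetical.

```python
import math

def information_inequality_steps(first, second):
    """Return (neg_d, log_mass) for pmfs given as equal-length lists.

    neg_d    = sum over the support of `first` of first[x]*log(second[x]/first[x]),
               i.e. minus the Kullback-Leibler distance of `first` from `second`
    log_mass = log of the total `second`-probability carried by that support
    The argument in the text gives neg_d <= log_mass <= 0.
    """
    support = [x for x in range(len(first)) if first[x] > 0.0]
    mass = sum(second[x] for x in support)
    log_mass = math.log(mass) if mass > 0.0 else -math.inf
    if any(second[x] == 0.0 for x in support):
        # The distance is infinite here, so its negative is minus infinity.
        return -math.inf, log_mass
    neg_d = sum(first[x] * math.log(second[x] / first[x]) for x in support)
    return neg_d, log_mass
```

With `first = [0.5, 0.5, 0.0]` and `second = [0.25, 0.25, 0.5]` the ratio of the two pmfs is constant (equal to 0.5) on the support of the first, so the first inequality holds with equality, yet both returned values equal $\log 0.5 < 0$: the second inequality is strict and the two pmfs differ. Both values are 0 when the two lists coincide.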
