# ML Estimation by Homotopy Continuation

Max Buot
Department of Statistics
Carnegie Mellon University

7 August 2004

## Polynomials

Although locating roots (finding zeros) is a classical problem, it is generally avoided.

- $a_2 x^2 + a_1 x + a_0 \;\Rightarrow\; \text{roots} = \dfrac{-a_1 \pm \sqrt{a_1^2 - 4 a_2 a_0}}{2 a_2}$
- A general univariate polynomial can be written as
  $f(x) = a_n x^n + a_{n-1} x^{n-1} + \cdots + a_2 x^2 + a_1 x + a_0$
- Univariate results
  - Fundamental Theorem of Algebra
  - Galois Theory: roots of general polynomials with $\mathrm{Deg}(f) \geq 5$ cannot be expressed in terms of radicals
  - Descartes' Rule of Signs: the number of sign changes in the coefficient sequence $(a_n, a_{n-1}, \ldots, a_1, a_0)$, i.e. the number of negative products in $\{a_n a_{n-1}, a_{n-1} a_{n-2}, \ldots, a_1 a_0\}$ when no coefficient is zero, bounds the number of positive roots in $\mathbb{R}$
  - Budan-Fourier: bound on the number of real roots in an interval $[a, b]$
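
As a small illustration of Descartes' Rule of Signs, the sketch below counts sign changes in a coefficient sequence and compares the count with the number of positive real roots found numerically. The function names and the example cubic are hypothetical choices made for this illustration; `numpy.roots` is used only as a numerical check.

```python
import numpy as np

def sign_changes(coeffs):
    """Count sign changes in a coefficient sequence, skipping zero coefficients."""
    signs = [c > 0 for c in coeffs if c != 0]
    return sum(a != b for a, b in zip(signs, signs[1:]))

def positive_real_roots(coeffs, tol=1e-9):
    """Count numerically real, strictly positive roots (coefficients from highest degree down)."""
    roots = np.roots(coeffs)
    return int(sum(abs(r.imag) < tol and r.real > tol for r in roots))

# Hypothetical example: x^3 - 6x^2 + 11x - 6 = (x - 1)(x - 2)(x - 3)
coeffs = [1, -6, 11, -6]
print(sign_changes(coeffs), positive_real_roots(coeffs))
```

The rule gives only an upper bound: for $x^2 + x + 1$ both counts are zero, while for $x^2 - x + 1$ there are two sign changes but no real roots.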

## New Researcher’s Conference ML Estimation by Homotopy Continuation

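## Multivariate Polynomials

A bivariate example:

$$f_2(x_1, x_2) = -6x_1^3 + 2x_2 x_1 - 5x_2^3$$

- Bézout's Theorem: the number of isolated roots in $\mathbb{C}^2$ is bounded by $\mathrm{Deg}(f_1) \cdot \mathrm{Deg}(f_2)$
- Newton Polytope: convex hull of the exponents; for example,
  $NP(f_1) = \mathrm{Conv}(\{(3, 0), (2, 1), (0, 2), (0, 3)\})$ and
  $NP(f_2) = \mathrm{Conv}(\{(3, 0), (1, 1), (0, 3)\})$
- Bernstein's Theorem: for generic coefficients, the number of roots with all coordinates nonzero, i.e. in $(\mathbb{C}^*)^2$, is exactly equal to the mixed area of $NP(f_1)$ and $NP(f_2)$, that is, $\mathrm{Area}(NP(f_1) + NP(f_2)) - \mathrm{Area}(NP(f_1)) - \mathrm{Area}(NP(f_2))$, where $+$ denotes the Minkowski sum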
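
The Bernstein count for the two Newton polytopes listed above can be checked by hand or with a few lines of code. The sketch below uses the standard planar identity that the mixed area of two polygons equals the area of their Minkowski sum minus the two individual areas; all function names are hypothetical helpers written for this illustration.

```python
from itertools import product

def cross(o, a, b):
    """Cross product of vectors OA and OB (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counterclockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def area(points):
    """Area of the convex hull of the points (shoelace formula)."""
    hull = convex_hull(points)
    twice = sum(x1 * y2 - x2 * y1
                for (x1, y1), (x2, y2) in zip(hull, hull[1:] + hull[:1]))
    return abs(twice) / 2

def minkowski_sum(a_pts, b_pts):
    """All pairwise sums; their convex hull is the Minkowski sum of the hulls."""
    return [(a[0] + b[0], a[1] + b[1]) for a, b in product(a_pts, b_pts)]

def mixed_area(a_pts, b_pts):
    return area(minkowski_sum(a_pts, b_pts)) - area(a_pts) - area(b_pts)

np_f1 = [(3, 0), (2, 1), (0, 2), (0, 3)]  # exponents of f1
np_f2 = [(3, 0), (1, 1), (0, 3)]          # exponents of f2
print(mixed_area(np_f1, np_f2))           # 4.0
```

For these polytopes the mixed area is 4, compared with the Bézout bound of $3 \cdot 3 = 9$.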

## Examples of Polynomial Likelihoods

- Simple mixture models (Buot and Richards, 2004)
- Four-taxa phylogenetic trees (Chor, Khetan, Snir; 2003)
- Discrete data model (Catanese, Hosten, Khetan, Sturmfels; 2004)
- Log-linear models, exponential families
- Linear model estimation subject to order restrictions (Hoferkamp and Peddada, 2002)
- Mixture Transition Density models (Berchtold and Raftery; 2003)

## A Root Finding Algorithm: Homotopy Continuation

Because we solve polynomial systems, we exploit the algebraic structure to count the number of roots and to construct a start system. By continuation methods, the known solutions of the start system, $Q(x)$, are extended to the desired solutions of the target system, $P(x)$. This deformation is defined by the homotopy, that is, a family of systems connecting the start and target systems.

## Homotopy Continuation, cont.

- Usually, $Q(x)$ is chosen such that:
  - $Q$ has at least as many roots as $P$
  - $Q$ is easy to solve; e.g., $Q_i(x_1, \ldots, x_n) = a_i x_i^{d_i} - b_i$, where $d_i = \mathrm{Deg}(P_i)$, and $a_i$ and $b_i$ are random
- $t = 0$: $H(x, 0) = Q(x)$ $\;\to\;$ $t = 1$: $H(x, 1) = P(x)$
- PHCpack, by Jan Verschelde
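
To make the deformation from $Q$ to $P$ concrete, here is a minimal univariate sketch. It is not PHCpack and not necessarily the scheme used by the author. It assumes the convex-combination homotopy $H(x, t) = (1 - t)\,Q(x) + t\,P(x)$, a start system $Q(x) = \gamma\,(x^d - 1)$ with a fixed complex constant $\gamma$ standing in for the random $a_i$, $b_i$ (fixed here only for reproducibility), and a plain Euler predictor with Newton corrector at a fixed step size. The function name and the example polynomial are hypothetical.

```python
import numpy as np

def track_roots(p_coeffs, gamma=0.6 + 0.8j, steps=200, newton_iters=5):
    """Track the d roots of Q(x) = gamma * (x^d - 1) to roots of P along
    H(x, t) = (1 - t) * Q(x) + t * P(x), for t going from 0 to 1."""
    p = np.asarray(p_coeffs, dtype=complex)   # coefficients, highest degree first
    d = len(p) - 1
    q = np.zeros(d + 1, dtype=complex)
    q[0], q[-1] = gamma, -gamma               # assumed start system (a = b = gamma)
    dp, dq = np.polyder(p), np.polyder(q)

    def H(x, t):
        return (1 - t) * np.polyval(q, x) + t * np.polyval(p, x)

    def H_x(x, t):                            # derivative with respect to x
        return (1 - t) * np.polyval(dq, x) + t * np.polyval(dp, x)

    def H_t(x):                               # derivative with respect to t
        return np.polyval(p, x) - np.polyval(q, x)

    x = np.exp(2j * np.pi * np.arange(d) / d)  # known roots of Q: d-th roots of unity
    for k in range(steps):
        t0, t1 = k / steps, (k + 1) / steps
        x = x - (t1 - t0) * H_t(x) / H_x(x, t0)   # Euler predictor along dx/dt = -H_t / H_x
        for _ in range(newton_iters):             # Newton corrector at t1
            x = x - H(x, t1) / H_x(x, t1)
    return x

# Hypothetical target: P(x) = x^2 - x - 6 = (x - 3)(x + 2)
roots = track_roots([1, -1, -6])
print(np.sort_complex(roots))
```

A non-real $\gamma$ keeps the two solution paths apart for this example; production trackers use adaptive step sizes and handle paths that diverge or meet, which this sketch does not attempt.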

## Mixture Model example

Consider an iid sample $x_1, \ldots, x_n$ drawn from an $m$-component mixture:

$$f(x) = \sum_{j=1}^{m} \pi_j f_j(x)$$

- Assume that the component densities $f_j$ are known; the goal is to estimate $\pi_1, \ldots, \pi_m$
- Locating MLEs relies on numerical techniques, since closed-form solutions are not available
- Algebraic Geometry:
  - Each ML equation is a polynomial of degree $n$, almost surely∗
  - By Bernstein's Theorem, the number of nonzero complex roots is $n^{m-1}$, almost surely∗
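
For $m = 2$ the polynomial structure can be seen directly. The sketch below eliminates $\pi_2 = 1 - \pi_1$, which is an assumption about the parametrization and may differ from the formulation behind the degree and root counts above. Under that assumption the likelihood $L(\pi_1) = \prod_i [\pi_1 f_1(x_i) + (1 - \pi_1) f_2(x_i)]$ is a polynomial in $\pi_1$, and all solutions of $L'(\pi_1) = 0$ can be found at once. The unit-variance normal components, their means, the data values, and the function names are all hypothetical.

```python
import numpy as np

def normal_pdf(x, mean):
    """Unit-variance normal density, standing in for a known component density."""
    return np.exp(-0.5 * (x - mean) ** 2) / np.sqrt(2 * np.pi)

def likelihood_equation_roots(x, mean1, mean2):
    """All roots of dL/dpi = 0 with L(pi) = prod_i [pi*f1(x_i) + (1 - pi)*f2(x_i)]."""
    f1, f2 = normal_pdf(x, mean1), normal_pdf(x, mean2)
    L = np.array([1.0])
    for a, b in zip(f1 - f2, f2):
        L = np.polymul(L, [a, b])      # multiply by the factor (f1 - f2)*pi + f2
    return np.roots(np.polyder(L))

# Hypothetical sample: four points near 0 and two near 3
x = np.array([-0.5, 0.2, 0.1, 2.9, 3.3, -0.1])
roots = likelihood_equation_roots(x, 0.0, 3.0)

# Keep only real roots that are valid mixing proportions
admissible = [r.real for r in roots if abs(r.imag) < 1e-8 and 0 < r.real < 1]
print(len(roots), admissible)
```

Only the roots in $(0, 1)$ are candidate interior MLEs; the remaining roots are the additional solutions that a root-counting argument accounts for.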

## Some Open Questions

- What can the number of roots tell us about the maximum likelihood problem? Can a useful interpretation of the complex solutions be given? Is there a relation between this and convergence issues?
- Is homotopy continuation a plausible option in statistical optimization problems? What about other root finding algorithms?
- Can probability and statistics make a contribution in the choice of $Q(x)$?

Answers to these types of questions are only beginning to be understood, and most progress has come from collaborative efforts between statisticians and mathematicians.

## References

- Buot and Richards: Counting and locating the solutions of polynomial systems of maximum likelihood equations, I.
- Sturmfels et al.: The maximum likelihood degree
- Li: Numerical solution of multivariate polynomial systems by homotopy continuation methods
- Verschelde: PHCpack
- AIM workshop on Computational Algebraic Statistics: http://www.aimath.org/ARCC/workshops/compalgstat.html
- More on secure computation and data confidentiality issues: http://www.niss.org

## Preserving Privacy in Partitioned Data

General context:

- $k$ agencies wish to perform statistical analysis on the combined data without actually combining the data
- Agencies are semi-honest
- Third-party computation is not desired

## Horizontally and Vertically Partitioned Data

- Horizontal example: several state educational agencies seek to combine their students' data to improve the precision of analysis of the general student population
- Vertical example: agencies seek to combine educational, tax, and health data of the student population in a given state