
Gaussian Belief Propagation:

Theory and Application

Thesis for the degree of

DOCTOR of PHILOSOPHY

by

Danny Bickson

submitted to the senate of

The Hebrew University of Jerusalem

1st Version: October 2008. 2nd Revision: May 2009.

This work was carried out under the supervision of Prof. Danny Dolev and Prof. Dahlia Malkhi


Acknowledgements

I would first like to thank my advisors, Prof. Danny Dolev and Prof. Dahlia Malkhi. Danny Dolev encouraged me to follow this interesting research direction, always had time to meet, and our brainstorming sessions were always valuable and enjoyable. Dahlia Malkhi encouraged me to do a Ph.D., worked closely with me during the first part of my Ph.D., and was always energetic and inspiring. My time as an intern at Microsoft Research, Silicon Valley, a period in which my research skills improved immensely, will never be forgotten.

I would like to thank Prof. Yair Weiss for teaching highly interesting courses, for introducing me to the world of graphical models, and for his continuous support in answering millions of questions. Prof. Scott Kirkpatrick introduced me to the world of statistical physics, mainly through the Evergrow project. It was a pleasure working with him, specifically watching his superb management skills, which could defeat every obstacle. Through this project I also met Prof. Erik Aurell from SICS/KTH, and we had numerous interesting discussions about Gaussians, the bread-and-butter of statistical physics.

I am lucky to have had the opportunity to work with Dr. Ori Shental from UCSD. Ori introduced me to the world of information theory, and together we carried out fruitful joint research. Further thanks to Prof. Yuval Shavitt from Tel Aviv University for serving on my Ph.D. committee and for fruitful discussions.

The support vector regression work was done when I was an intern at the IBM Haifa Research Lab. Thanks to Dr. Elad Yom-Tov and Dr. Oded Margalit for their encouragement and for our enjoyable joint work. The linear programming and Kalman filter work was done with the great encouragement of Dr. Gidon Gershinsky from the IBM Haifa Research Lab.

I thank Prof. Andrea Montanari for sharing his multiuser detection code.

I would like to thank Dr. Misha Chertkov and Dr. Jason K. Johnson for inviting me to visit Los Alamos National Lab, a visit which resulted in the convergence fix results reported in this work. I received further encouragement from Prof. Stephen Boyd of Stanford University.

Finally, I would like to thank my wife Ravit for her continuous support.


Abstract

The canonical problem of solving a system of linear equations arises in numerous contexts in information theory, communication theory, and related fields. In this contribution, we develop a solution based upon Gaussian belief propagation (GaBP) that does not involve direct matrix inversion. The iterative nature of our approach allows for a distributed message-passing implementation of the solution algorithm. In the first part of this thesis, we address the properties of the GaBP solver. We characterize its rate of convergence, enhance its message-passing efficiency by introducing a broadcast version, and discuss its relation to classical solution methods, including numerical examples. We present a new method for forcing the GaBP algorithm to converge to the correct solution for arbitrary column-dependent matrices. In the second part, we give five applications that illustrate the applicability of the GaBP algorithm to very large computer networks: peer-to-peer rating, linear detection, distributed computation of support vector regression, efficient computation of the Kalman filter, and distributed linear programming. Using extensive simulations on up to 1,024 CPUs in parallel on an IBM Blue Gene supercomputer, we demonstrate the attractiveness and applicability of the GaBP algorithm on real network topologies with up to millions of nodes and hundreds of millions of communication links. We further relate to several other algorithms and explore their connection to the GaBP algorithm.
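To make the abstract concrete, the following is a minimal sketch of the kind of message-passing solver described above: each node i keeps a precision message P[i][j] and a precision-mean message m[i][j] per neighbor, and the converged marginal means recover the solution of Ax = b. The thesis itself provides a Matlab reference implementation (Appendix 12.2); this Python version, including the function name gabp_solve and the dense synchronous update schedule, is purely illustrative and assumes a symmetric, diagonally dominant A for which GaBP is known to converge.

```python
def gabp_solve(A, b, max_iter=100):
    """Illustrative Gaussian belief propagation sketch for A x = b.

    A: symmetric, diagonally dominant matrix (list of lists).
    Returns the vector of marginal means, which equals the solution x.
    """
    n = len(b)
    # P[i][j]: precision message from node i to node j.
    # m[i][j]: precision-mean message (precision times mean) from i to j.
    P = [[0.0] * n for _ in range(n)]
    m = [[0.0] * n for _ in range(n)]
    for _ in range(max_iter):
        P_new = [[0.0] * n for _ in range(n)]
        m_new = [[0.0] * n for _ in range(n)]
        for i in range(n):
            # Aggregate everything arriving at node i once per node.
            P_in = A[i][i] + sum(P[k][i] for k in range(n))
            m_in = b[i] + sum(m[k][i] for k in range(n))
            for j in range(n):
                if i == j or A[i][j] == 0:
                    continue
                # Exclude the message that came from the target node j.
                Pi = P_in - P[j][i]
                mi = m_in - m[j][i]
                P_new[i][j] = -A[i][j] ** 2 / Pi
                m_new[i][j] = -A[i][j] * mi / Pi
        P, m = P_new, m_new
    # Marginal means x_i from all incoming messages plus the local evidence.
    return [
        (b[i] + sum(m[k][i] for k in range(n)))
        / (A[i][i] + sum(P[k][i] for k in range(n)))
        for i in range(n)
    ]
```

Because every update at node i uses only b[i], row i of A, and messages from i's neighbors, the same computation can be distributed across a network, which is the property the applications in Part 2 exploit.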


Contents

1 Introduction  1
  1.1 Material Covered in this Thesis  3
  1.2 Preliminaries: Notations and Definitions  3
  1.3 Problem Formulation  5

Part 1: Theory  7

2 The GaBP Algorithm  8
  2.1 Linear Algebra to Inference  8
  2.2 Belief Propagation  10
  2.3 GaBP Algorithm  11
  2.4 Max-Product Rule  15
  2.5 Properties  16

3 GaBP Algorithm Properties  17
  3.1 Upper Bound on Rate  17
  3.2 Convergence Acceleration  19
  3.3 GaBP Broadcast  19
  3.4 The GaBP-Based Solver and Classical Solution Methods  21
    3.4.1 Gaussian Elimination  21
    3.4.2 Iterative Methods  23
    3.4.3 Jacobi Method  24

4 Numerical Examples  25
  4.1 Toy Linear System  25
  4.2 Non PSD Example  27
  4.3 2D Poisson's  27

5 Convergence Fix  34
  5.1 Problem Setting  34
  5.2 Diagonal Loading  35
  5.3 Iterative Correction Method  35
  5.4 Extension to General Linear Systems  37

Part 2: Applications  38

6 Peer-to-Peer Rating  39
  6.1 Framework  40
  6.2 Quadratic Cost Functions  41
    6.2.1 Peer-to-Peer Rating  44
    6.2.2 Spatial Ranking  44
    6.2.3 Personalized PageRank  46
    6.2.4 Information Centrality  46
  6.3 Experimental Results  46
    6.3.1 Rating Benchmark  48

7 Linear Detection  51
  7.1 GaBP Extension  56
    7.1.1 Distributed Iterative Computation of the MMSE Detector  56
    7.1.2 Relation to Factor Graph  57
    7.1.3 Convergence Analysis  58
  7.2 Applying GaBP Convergence Fix  59

8 SVR  62
  8.1 SVM Classification  62
  8.2 KRR Problem  64
  8.3 Previous Approaches  64
  8.4 Our Novel Construction  66
  8.5 Experimental Results  66
  8.6 Discussion  69

9 Kalman Filter  71
  9.1 Kalman Filter  72
  9.2 Our Construction  72
  9.3 Gaussian Information Bottleneck  74
  9.4 Relation to the Affine-Scaling Algorithm  77

10 Linear Programming  80
  10.1 Standard Linear Programming  81
  10.2 From LP to Inference  82
  10.3 Extended Construction  83
    10.3.1 Applications to Interior-Point Methods  85
  10.4 NUM  86
    10.4.1 NUM Problem Formulation  86
    10.4.2 Previous Work  87
    10.4.3 Experimental Results  89

11 Relation to Other Algorithms  92
  11.1 Montanari's MUD  92
  11.2 Frey's IPP Algorithm  94
  11.3 Consensus Propagation  96
  11.4 Quadratic Min-Sum  98

12 Appendices  100
  12.1 Weiss vs. Johnson  100
  12.2 GaBP code in Matlab  101
    12.2.1 The file gabp.m