Beruflich Dokumente
Kultur Dokumente
DAVID R. KARGER
PHILIP N. KLEIN
AND
ROBERT E. TARJAN
Journal of the Aswcl.tmn for Computmg Machinery Vol 42, No 2, March 1995, pp 321 –328
322 D. R. KARGER ET AL.
1. Introduction
1.1. PRELIMINARIES. Our algorithm actually solves the slightly more general
problem of finding a minimum spanning forest in a possibly disconnected
graph. Wc assume that the input graph has no iaolatcd vcrticcs (wmticcs
without incident edges).
If edge weights are not distinct, we can make them distinct by numbering the
edges and breaking weight-ties according to the numbers. We therefore assume
for simplicity that all edge weights are distinct. This assumption ensures that
the minimum spanning tree is unique. The following properties are also well
known and correspond respectively to the red rule and the blue rule in Tarjan
[1983].
Cycle Property. For any cycle C in a graph, the heaviest edge in C does not
appear in the minimum spanning forest.
Linear- Time Algorithm to Find Spanning Trees 323
Cut Prope@. For any proper nonempty subset X of the vertices, the lightest
edge with exactly one endpoint in X belongs to the minimum spanning
forest.
Unlike most algorithms for finding a minimum spanning forest, our algo-
rithm makes use of each property in a fundamental way.
2. A Sampling Lemma
Now imagine that we continue flipping nickels until n heads have occurred,
and let Y be the total number of nickels flipped. Then, Y is an upper bound on
the number of F-light edges. The distribution of Y is exactly the negative
binomial distribution with parameters n and p [Feller 1968]. The expectation
of a random variable that has a negative binomial distribution is n/p [Feller
1968]. It follows that the expected number of F-light edges is at most n\p. ❑
Remark 2.2. The above proof actually shows that the number of F-light
edges is stochastically dominated by a variable with a negative binomial
distribution.
Remark 2.3. Lemma 2.1 directly generalizes to matroids. See Karger [1993 b].
3. The Algorithm
Step (l). Apply two successive Borfivka steps to the graph, thereby reducing
the number of vertices by at least a factor of four.
Step (3). Apply the algorithm recursively to the remaining graph to com-
pute a spanning forest F‘. Return those edges contracted in Step (1) together
with the edges of F‘.
By the cycle property, the edges deleted in Step (2) do not belong to the
minimum spanning forest. By the inductive hypothesis, the minimum spanning
forest of the remaining graph is correctly determined in the recursive call of
Step (3).
are removed in Step (l), the total number of edges in the left and right
subproblems is at most the number of edges in the parent problem.
It follows that the total number of edges in all subproblems at any single
recursive depth d is at most m. Since the number of different depths is O(log
n), the total number of edges in all recursive subproblems is O(rn log n). ❑
THEOREM 4.2. The expected running time of the minimum spanning forest
algorithm is O(m).
PROOF. Our analysis relies on a partition of the recursion tree into left
paths. Each such path consists of either the root or a right child and all nodes
reachable from this node through a path of left children. Consider a parent
problem on a graph of X edges, and let Y be the number of edges in its left
child. Since each edge in the parent problem is either removed in Step (1) or
has a chance of 1/2 of being selected in Step (2), J!3[Y I X = )c] s k/2. It
follows by linearity of expectation that E[Y] s E[X]/2. That is, the expected
number of edges in a left subproblem is at most half the expected number of
edges in its parent. It follows that, if the expected number of edges in a
problem is k, then the sum of the expected numbers of edges in every
subproblem along the left path descending from the problem is at most
Z;.O k/2’ = 2k.
Thus, the expected total number of edges is bounded by twice the sum of m
and the expected total number of edges in all right subproblems. By Lemma
2.1, the expected number of edges in a right subproblem is at most twice the
number of vertices in the subproblem. Since the total number of vertices in all
right subproblems is at most X;. ~ 2~ -‘ n/4~ = n/2, the expected number of
edges in the original problem and all subproblems is at most 2m + n. ❑
THEOREM 4.3. The minimum spanning forest algorithm runs in O(nz) time
with probability 1 – exp( – Q(m)).
5. Remarks
In work with [Cole et al. 1994], Klein and Tarjan have adapted the randomized
algorithm to run in parallel. The parallel algorithm does linear expected work
and runs in O(log n 210~‘n) expected time on a CRCW PRAM [Karp and
Ramachandran 1990]. This is the first parallel algorithm for minimum spanning
trees that does linear work. In contrast, Karger [1992] gives an algorithm
running on an EREW PRAM that requires O(log n) time and m/log n + nl +‘
processors for any constant ~ >0. Also, Cole and Vishkin [1986] give an
algorithm running on a CRCW PRAM that requires O(log n) time on
O((n + m)log log n\log n) processors.
Among remaining open problems, we note especially the following three:
REFERENCES
ALON, N., AND SPENCER, J. H. 1992. The Probabil~stic Method. Wiley, New York, p. 223.
BORIk.A, O. 1926. 0 jist6m prob16mu minim~lnim. Prhca Morauski Pi%’odoL@deckk Spole?nosti
3, 37-58. (In Czech.)
CHERIYAN, J., HAGERUP, T., AND MEHLHORN, K. 1990. Can a maximum flow be computed in
o(nm) time? In Proceedings of the 17th Intematlonal Colloquwm on Automata, Languages, and
Programming. Lecture Notes in Computer Science, vol. 443. Springer-Verlag, New York, pp.
235-248.
CHERNOFF, H. 1952. A measure of the asymptotic efficiency for tests of a hypothesis based on
the sum of observations. Ann. Math. Stat. 23, 493–509.
COLE, R., KLEIN, P. N., AND TAR.WJJ, R. E. 1994. A linear-work parallel algorithm for finding
minimum spanning trees. In Proceedings of the 6th Symposium LIn Parallel Algorithms and
Architectures. (Cape May, N.J., June 27-29). ACM, New York, pp. 11-15.
COLE, R., AND VISHIUN, U. 1986. Approximate and exact parallel scheduling with applications to
list, tree, and graph problems. In Proceedings of the 27th Annual IEEE Symposium on Founda-
t~ons of Computer Science. IEEE, Computer Society Press, Los Afamitos, Calif., pp. 478–491.
328 D. R. KARGER ET AL.
DIXON, B., RAUCH, M., AND TARJAN, R. E. 1992. Verification and sensitivity analysis of
minimum spanning trees in linear time. SIAM J. Comput. 21, 1184– 1192.
FELLER, W. 1968. An Introduction to Probabd@ Theoiy and Its Appkcatlons. Vol. 1. Wdey, New
York.
FREDMAN, M., AND WILLARD, D. E. 1990. Trans-dichotomous algorithms for minimum spanning
trees and shortest paths. In Proceedings of the 31st Annual IEEE Symposuim on Foundations of
Computer Science. IEEE Computer Society Press, Los Afamitos, Cahf., pp. 719-725.
GADOW, H. N., GALIL, Z., AND SPENCER, T. H. 1984. Efficient implementation of graph
algorithms using contraction. In Proceedings of the 25th Anmial IEEE Symposutm on Founda-
tions of Computer Science. IEEE Computer Society Press, Los Alamltos, Calif., pp. 347-357.
GABOW, H. N., GALIL, Z., SPENCER, T., AND TARJAN, R. E. 1986. Efficient algorithms for finding
minimum spanning trees in undirected and directed graphs. Cornbirratotim 6, 109–122.
GRAHAM, R. L., AND HELL, P. 1985. On the history of the minimum spanning tree problem.
Ann. Hut. Comput. 7, 43-57.
KARGER, D. R. 1992. Approximating, verifying, and constructing minimum spanmng forests.
Manuscript.
KARGER, D. R. 1993a. Global rein-cuts in RNC and other ramifications of a simple mincut
algorithm. In Proceedings of the 4th Annual ACM–SIAM Sympostum on Discrete Algonthrns.
ACM, New York, pp. 21-30.
KARGER, D. R. 1993b. Random sampling in matroids, with applications to graph connectivity
and minimum spanning trees. In Proceedings of the 34th Annual IEEE Symposmm on Founda-
tions of Computer Science. IEEE Computer Society Press, Los Alamitos, Calif., pp. 84–93.
KARP, R. M., AND RAMACHANDRAN, V. 1990. A survey of parallel algorithms for shared-memory
machines. In Handbook of Theoretical Computer Sccence, Volume A: Algorithms and Complemw,
J. van Leeuwen, ed. MIT Press, Cambridge, Mass., pp. 869-941.
KJNG, V. 1995. A simpler minimum spanning tree verification algorithm. In Proceedings of the
Workshop on Algorithms and Data Structures, to appear.
KLEIN, P. N., AND TARJAN, R. E. 1994. A randomized linear-time algorlthm for finding mnumum
spanning trees. In Proceedings of the 26th Annual ACM Symposium on Theoiy of Computmg.
(Montreal, Que., Canada. May 23-25). ACM, New York, pp. 9-15.
KOML6S, J. 1985. Linear verification for spanmng trees. Combmatonca 557-65.
KRUSKAL, J. B. 1956. On the shortest spanning subtree of a graph and the traveling salesman
problem. Proc. Amer. Math. Sot. 7, 48-50.
RAGHAVAN, P. 1990. Lecture notes on randomized algorithms. Res. Rep. RC 15340 (#68237).
Computer Science/Mathematics. IBM Research Division, T. J. Watson Research Center,
Yorktown Heights, N. Y., p. 54.
TARJAN, R. E. 1979. Applications of path compression on balanced trees. J. ACM 26, 4 (Oct.),
690-715.
TARJAN, R. E. 1983. Data Structures and Network Algorithms. Society for Industrial and Applied
Mathematics, Philadelphia, Pa., Chap. 6.
Journal of the As<oclatlon for Computmg Machinery, Vol 42, No 2, March 1995