100 Years of Gravity and Accelerated Frames


The Deepest Insights of Einstein and Yang-Mills
ADVANCED SERIES ON THEORETICAL PHYSICAL SCIENCE
A Collaboration between World Scientific and Institute of Theoretical Physics

Series Editors: Dai Yuan-Ben, Hao Bai-Lin, Su Zhao-Bin


(Institute of Theoretical Physics Academia Sinica)

Vol. 1: Yang-Baxter Equation and Quantum Enveloping Algebras
(Zhong-Qi Ma)
Vol. 2: Geometric Methods in the Elastic Theory of Membrane in Liquid Crystal Phases
(Ouyang Zhong-Can, Xie Yu-Zhang & Liu Ji-Xing)
Vol. 4: Special Relativity and Its Experimental Foundation
(Yuan Zhong Zhang)
Vol. 6: Differential Geometry for Physicists
(Bo-Yu Hou & Bo-Yuan Hou)
Vol. 7: Einstein's Relativity and Beyond
(Jong-Ping Hsu)
Vol. 8: Lorentz and Poincaré Invariance: 100 Years of Relativity
(J.-P. Hsu & Y.-Z. Zhang)
Vol. 9: 100 Years of Gravity and Accelerated Frames: The Deepest Insights
of Einstein and Yang-Mills
(J.-P. Hsu & D. Fine)

100 Years of
Gravity and
Accelerated Frames
The Deepest Insights of
Einstein and Yang-Mills

Editors

Jong-Ping Hsu
Dana Fine
University of Massachusetts Dartmouth, USA

World Scientific
NEW JERSEY . LONDON . SINGAPORE . BEIJING . SHANGHAI . HONG KONG . TAIPEI . CHENNAI
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

Library of Congress Cataloging-in-Publication Data

100 years of gravity and accelerated frames : the deepest insights of Einstein and
Yang-Mills / editor, Jong-Ping Hsu; advisory editor, Dana Fine.
p. cm. -- (Advanced series on theoretical physical science; v. 9)
Includes bibliographical references.
ISBN 981-256-335-0 (alk. paper)
1. Gravitation. 2. Relativity (Physics). 3. Einstein field equations. 4. Yang-Mills theory.
I. Title: One hundred years of gravity and accelerated frames. II. Hsu, J. P. (Jong-Ping). III.
Fine, Dana. IV. Series.

QC178.A15 2005
530.11--dc22
2005050077

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

Copyright © 2005 by World Scientific Publishing Co. Pte. Ltd.

All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical,
including photocopying, recording or any information storage and retrieval system now known or to be invented, without
written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222
Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

Printed in Singapore by B & JO Enterprise



To

Arthur Fine

Hu/Shih and Tsien/Hsue-Sen

"In 1953-1954, I was visiting Brookhaven and Bob was my office mate.
We discussed many things in physics, from the experimental results pouring
out of the new Cosmotron, to theoretical topics like renormalization and the
Ward identity. It was in that year that we found the very elegant and unique
generalization of Maxwell's equation. We were pleased by the beauty of the
generalization, but neither of us had anticipated its great impact on physics
20 years later."
C. N. Yang,
in "Remembering Robert L. Mills" by Samuel L. Marateck,
Physics Today, p. 14, October 2003.

Robert Mills and Kitty, the oldest of his 5 kids.

"The Rubaiyat"
Omar Khayyam
(transl. by Edward Fitzgerald)

Preface

This book is a collection of papers and writings from the past 100 years on
ideas and problems related to gravity, gauge fields and accelerated frames. The grand
triumphs of Einstein's theory of gravity and Yang-Mills' theory in physics are well
known. It is believed that both theories are based on the principle of 'gauge
invariance,' although not on the same kind of action. Einstein's theory is linear in
spacetime curvature, while Yang-Mills' theory is quadratic in gauge curvature. Now,
at the dawn of the 21st century, invariance principles in physics have transcended the
kinematical and dynamical contexts from which they originated to become the
foundation of our understanding of the physical world. Using this framework of
invariance principles, this book surveys the development of gravitational and
Yang-Mills fields, as well as spacetime transformations of accelerated frames. It also
attempts to reveal the problems and limitations of various formulations of
gravitational and Yang-Mills fields. The intent is to enlarge and broaden the reader's
views on the subjects.
As TIME magazine's person of the 20th century (cf. TIME magazine),
Einstein's contributions to physics are arguably incomparable, aside from Newton's.
The gravitational force and accelerated frames were two ingredients in the young
Einstein's 'happiest thoughts' in 1907. The simple thought that 'If a person falls freely
he will not feel his own weight,' made a deep impression on him and impelled him
toward a successful theory of gravitation. Unfortunately, accelerated spacetime
transformations for non-inertial frames have still not been well developed. However,
they are important because one cannot claim to have a complete understanding of the
physical world, especially the basic gravitational and Yang-Mills fields, if one
understands physics only from the viewpoint of the special and limited class of
inertial frames. Strictly speaking, all real frames of reference in the physical world
are non-inertial because of the long range of the gravitational force. In particular,
when one talks about an inherent property of nature (e.g., values of fundamental
constants such as the fine structure constant and the speed of light), a reasonable
criterion is that the property must be present in both inertial and non-inertial frames.
In this sense, the book suggests that the present understanding of gravitational and
Yang-Mills fields is far from complete.
The formulations of the gravitational and Yang-Mills theories are both an
effect and a cause of scientific development in experiment and theory. Progress in
physics is made through the collective effort of many physicists. The community of
physicists is like the thousand-hand Guan-Yin: Each hand accomplishes only a
partial or small task, yet the overall accomplishment is enormous. As we shall see in
this volume, in the pursuit of physical laws, the right track has often been discovered
only after many failures by well-known and not-so-well-known pathfinders.
Nowadays, the spectacular success of Einstein's and Yang-Mills' profound thoughts is
often emphasized while the lessons of the struggle in their birth and development are
lost. Furthermore, there is little chance for making progress simply by going over
their success repeatedly.
The aim here is to present some of the leading ideas and problems discussed
by physicists and mathematicians, highlighting three aspects:

(1) the idea of gravity as a Yang-Mills field, first discussed by Utiyama;
(2) the problems of quantum gravity, discussed by Feynman, Dyson and others;
(3) spacetime properties and the physics of particles and fields in accelerated frames.

It is hoped that the present volume will bring some of the unfulfilled aspects
of the profound thoughts to the attention of physicists and mathematicians of the 21st
century. For various reasons, research in the areas of spacetime symmetry and
special relativity is sometimes discouraged by general editorial policy. Fortunately,
so far this is not true in the cases of general relativity and Yang-Mills theory. If the
deepest insights of Einstein and Yang-Mills can inspire its readers to pursue these
subjects further, the chief purpose of the book will have been achieved.
We are grateful that Chairman M. Ninomiya of the Progress of Theoretical
Physics, Acta Physica Polonica B, Elsevier Ltd, and Editorial Administrator R. W.
Brown of the Annals of the New York Academy of Sciences freely granted us
permission to reprint their papers. We would like to thank L. Hsu, H. L. Chen, N.
Cleffi and E. M. Winiarz for their help. This book was supported in part by the Prof.
George Leung Memorial Fund of the University of Massachusetts Dartmouth
Foundation, the Potz Science Fund, and the World Scientific Publishing Company.

Hsu/Jong-Ping

Contents

Preface ix
Acknowledgements xiv
Remarks on the Development of the Gravitational and Yang-Mills Fields, and Accelerated Frames xix

Chapter 1 The Dawn of Gravitation

A The Mathematical Principles of Natural Philosophy (Extract) 2
Rules of Reasoning in Philosophy; Phenomena, or Appearances
I. Newton (Transl. A. Motte)
B On the Dynamics of the Electron (Extract) 13
Introduction, Hypotheses Concerning Gravitation
H. Poincaré (Transl. G. Pontecorvo, Commentary by A. A. Logunov)
C On the Relativity Principle and the Conclusions Drawn from It (Extract) 33
A. Einstein (Transl. A. Beck)

Chapter 2 Einstein's Deepest Insight and Its Early Impacts

A Outline of a Generalized Theory of Relativity and of a Theory of
Gravitation (Extract) 48
A. Einstein and M. Grossmann (Transl. A. Beck)
B The Foundation of the General Theory of Relativity 65
A. Einstein (Transl. W. Perrett and G. B. Jeffery)
C The Foundation of Physics 120
D. Hilbert (Transl. D. Fine)
D On a Generalization of the Concept of Riemann Curvature and
Spaces with Torsion 132
E. Cartan (Transl. A. Fine)

Chapter 3 The Scalar-Tensor Theory of Gravity

A Formation of the Stars and Development of the Universe 136
P. Jordan
B On the Physical Interpretation of P. Jordan's Extended Theory of Gravitation (Extract) 140
M. Fierz (Transl. D. Fine)
C Mach's Principle and a Relativistic Theory of Gravitation (Extract) 142
C. Brans and R. H. Dicke

Chapter 4 Yang-Mills' Deepest Insight and Its Relation to Gravity

A Conservation of Isotopic Spin and Isotopic Gauge Invariance 150
C. N. Yang and R. L. Mills
B Conservation of Heavy Particles and Generalized Gauge Transformations 155
T. D. Lee and C. N. Yang
C Invariant Theoretical Interpretation of Interaction 157
R. Utiyama
D Lorentz Invariance and the Gravitational Field 168
T. W. B. Kibble

Chapter 5 Accelerated Frames: Generalizing the Lorentz Transformations

A On Homogeneous Gravitational Fields in the General Theory of Relativity
and the Clock Paradox 180
C. Møller
B Physical Consequences of a Co-ordinate Transformation to a
Uniformly Accelerating Frame 204
T. Fulton, F. Rohrlich and L. Witten
C The Clock Paradox in the Relativity Theory 223
T. Y. Wu and Y. C. Lee
D Four-dimensional Symmetry of Taiji Relativity and Coordinate Transformations
Based on a Weaker Postulate for the Speed of Light (Extract) 240
J. P. Hsu and L. Hsu
E Generalized Lorentz Transformations for Linearly Accelerated Frames with
Limiting Four-Dimensional Symmetry 247
J. P. Hsu and L. Hsu
F Generalizing Lorentz Transformations for Accelerated Frames and
Their Physical Implications 258
D. T. Schmitt and T. Kleinschmidt

Chapter 6 Quantum Gravity and 'Ghosts'

A Quantum Theory of Gravitation 272
R. P. Feynman
B Quantum Theory of Gravity. II. The Manifestly Covariant Theory (Extract) 298
B. S. DeWitt
C Quantum Theory of Gravity. III. Applications of the Covariant Theory 307
B. S. DeWitt
D Feynman Diagrams for the Yang-Mills Field 325
L. D. Faddeev and V. N. Popov
E Feynman Rules for Electromagnetic and Yang-Mills Fields from the
Gauge-Independent Field-Theoretic Formalism (Extract) 327
S. Mandelstam
F S Matrix for Yang-Mills and Gravitational Fields (Extract) 339
E. S. Fradkin and I. V. Tyutin
G Missed Opportunities (Extract, with a brief comment of the author) 347
Introduction, General Coordinate Invariance
F. J. Dyson

Chapter 7 Gauge Theories of Gravity

A Extended Translation Invariance and Associated Gauge Fields 354
K. Hayashi and T. Nakano
B Gravitational Field as a Generalized Gauge Field 371
R. Utiyama and T. Fukuyama
C Integral Formalism for Gauge Fields (with a brief comment) 387
C. N. Yang
D Yang's Gravitational Field Equations 391
W. T. Ni
E Einstein Lagrangian as the Translational Yang-Mills Lagrangian 393
Y. M. Cho
F De Sitter and Poincaré Gauge-Invariant Fermion Lagrangians and Gravity 398
J. P. Hsu

Chapter 8 Alternate Approaches to Gravity: Roads Less Traveled By

A Fixation of Coordinates in the Hamiltonian Theory of Gravitation 402
P. A. M. Dirac
B New General Relativity (with Addendum) 409
K. Hayashi and T. Shirafuji
C Relativistic Theory of Gravitation 442
A. A. Logunov and M. A. Mestvirishvili
D Yang-Mills Gravity: A Union of Einstein-Grossmann Metric with Yang-Mills
Tensor Fields in Flat Spacetime with Translation Symmetry 462
J. P. Hsu

Chapter 9 Experimental Tests of Gravitational Theories

A Empirical Foundations of the Relativistic Gravity 476
W. T. Ni
B Binary Pulsars and Relativistic Gravity 494
J. H. Taylor, Jr.

Chapter 10 Other Perspectives

A Concept of Nonintegrable Phase Factors and Global Formulation
of Gauge Fields (Extract) 504
T. T. Wu and C. N. Yang
B Gauge Fields 512
R. L. Mills
C Magnetic Monopoles, Fiber Bundles, and Gauge Fields 527
C. N. Yang
D Gauge Theory: Historical Origins and Some Modern Developments 539
L. O'Raifeartaigh and N. Straumann
E String Theory as a Generalization of Gauge Symmetry 562
P.-M. Ho
F The Cosmological Constant Problem 569
S. Weinberg
G The Cosmological Constant and Dark Energy (Extract) 592
P. J. E. Peebles and B. Ratra

Appendices
A Marcel Grossmann (1878-1936) 618
J. P. Hsu and D. Fine
B Remembering Robert L. Mills 622
S. L. Marateck

Acknowledgements

The Mathematical Principles of Natural Philosophy (1687) (Extract)
From The Principia by Isaac Newton, translated by Andrew Motte (1729)
(Amherst, NY; Prometheus Books). Published 1995.

Hypotheses Concerning Gravitation (Extract)
Extracted from The Dynamics of the Electron by H. Poincaré, Rend. Circ. Mat. Palermo 21,
129 (1906), with comments by A. A. Logunov, translated by G. Pontecorvo. Reprinted with
permission from A. A. Logunov, Dubna Joint Institute for Nuclear Research, published 2001.

On the Relativity Principle and the Conclusions Drawn from It (Extract)
EINSTEIN, ALBERT; THE COLLECTED PAPERS OF ALBERT EINSTEIN © 1987-2004
Hebrew University and Princeton University Press. Reprinted by permission of Princeton
University Press.

Outline of a Generalized Theory of Relativity and of a Theory of Gravitation
EINSTEIN, ALBERT; THE COLLECTED PAPERS OF ALBERT EINSTEIN © 1987-2004
Hebrew University and Princeton University Press. Reprinted by permission of Princeton
University Press.

The Foundation of the General Theory of Relativity
Reprinted from A. Einstein, in The Principles of Relativity
(Translated by W. Perrett and G. B. Jeffery, Methuen and Company, 1923)

The Foundation of Physics
Translated from D. Hilbert, Die Grundlagen der Physik (Erste Mitteilung), Goett. Nachr. 395
(1915), by Dana Fine.

On a Generalization of the Concept of Riemann Curvature and Spaces with Torsion
Translated from E. Cartan, C. R. Acad. Sci. (Paris) 174, 597 (1922) by Arthur Fine.

Formation of the Stars and Development of the Universe
Reprinted with permission from P. Jordan, Nature (London) 164, 637 (1949).
© 1949 Macmillan Magazines Ltd.

On the Physical Interpretation of P. Jordan's Extended Theory of Gravitation (Extract)
Translated from M. Fierz, Helv. Phys. Acta 29, 128 (1956) by Dana Fine.

Mach's Principle and a Relativistic Theory of Gravitation
Reprinted with permission from C. Brans and R. H. Dicke, Phys. Rev. 124, 925 (1961).
© 1961 The American Physical Society.

Conservation of Isotopic Spin and Isotopic Gauge Invariance
Reprinted with permission from C. N. Yang and R. L. Mills, Phys. Rev. 96, 191 (1954).
© 1954 The American Physical Society.

Conservation of Heavy Particles and Generalized Gauge Transformations
Reprinted with permission from T. D. Lee and C. N. Yang, Phys. Rev. 98, 1501 (1955).
© 1955 The American Physical Society.

Invariant Theoretical Interpretation of Interaction
Reprinted with permission from R. Utiyama, Phys. Rev. 101, 1597 (1956).
© 1956 The American Physical Society.

Lorentz Invariance and the Gravitational Field
Reprinted with permission from T. W. B. Kibble, J. Math. Phys. 2, 212 (1961).
© 1961, American Institute of Physics.

On Homogeneous Gravitational Fields in the General Theory of Relativity and the Clock Paradox
Reprinted with permission from C. Møller, Danske Vid. Sel. Mat-Fys. 20, No. 19 (1943).
© 1943 The Royal Danish Academy.

Physical Consequences of a Co-ordinate Transformation to a Uniformly Accelerating Frame
Reprinted with permission from T. Fulton, F. Rohrlich and L. Witten, Nuovo Cimento
XXVI, 652 (1962). © 1962 Società Italiana di Fisica.

The Clock Paradox in the Relativity Theory
Reprinted with permission from T. Y. Wu and Y. C. Lee, Int. J. Theor. Phys. 5, 307 (1972).
© 1997 Springer Science and Business Media.

Four-dimensional Symmetry of Taiji Relativity and Coordinate Transformations Based on a Weaker
Postulate for the Speed of Light (Extract)
Reprinted with permission from J. P. Hsu and L. Hsu, Nuovo Cimento, 112, 575
(1997). © 1997 Società Italiana di Fisica.

Generalized Lorentz Transformations for Linearly Accelerated Frames with Limiting
Four-Dimensional Symmetry
Reprinted with permission from J. P. Hsu and L. Hsu, Chinese Journal of Physics 35, 407
(1997). © 1997 The Physical Society of the Republic of China.

Generalizing Lorentz Transformations for Accelerated Frames and Their Physical Implications
Contributed by D. T. Schmitt and Tobias Kleinschmidt. Based on a paper submitted to the Int. J.
Modern Phys. (to be published).

Quantum Theory of Gravitation
Reprinted with permission from R. P. Feynman, Acta Physica Polonica 24, 697 (1963).
© 1963 Acta Physica Polonica B.

Quantum Theory of Gravity. II. The Manifestly Covariant Theory (Extract)
Reprinted with permission from B. S. DeWitt, Phys. Rev. 162, 1195 (1967).
© 1967 The American Physical Society.

Quantum Theory of Gravity. III. Applications of the Covariant Theory
Reprinted with permission from B. S. DeWitt, Phys. Rev. 162, 1239 (1967).
© 1967 The American Physical Society.

Feynman Diagrams for the Yang-Mills Field
Reprinted from L. D. Faddeev and V. N. Popov, Phys. Lett. 25B, 29 (1967).
© 1967, with permission from Elsevier.

Feynman Rules for Electromagnetic and Yang-Mills Fields from the Gauge-Independent
Field-Theoretic Formalism (Extract)
Reprinted with permission from S. Mandelstam, Phys. Rev. 175, 1580 (1968).
© 1968 The American Physical Society.

S Matrix for Yang-Mills and Gravitational Fields (Extract)
Reprinted with permission from E. S. Fradkin and I. V. Tyutin, Phys. Rev. D2, 2841 (1970).
© 1970 The American Physical Society.

Missed Opportunities (Extract, with a brief comment of the author)
Introduction, General Coordinate Invariance
Reprinted with permission from F. J. Dyson, Bull. Am. Math. Soc. 78, 635 (1972).
© 1972 American Mathematical Society.

Extended Translation Invariance and Associated Gauge Fields
Reprinted with permission from K. Hayashi and T. Nakano, Prog. Theor. Phys. 38, 491 (1967).
© 1967 Prog. Theor. Phys.

Gravitational Field as a Generalized Gauge Field
Reprinted with permission from R. Utiyama and T. Fukuyama, Prog. Theor. Phys. 45, 612
(1971). © 1971 Prog. Theor. Phys.

Integral Formalism for Gauge Fields
Reprinted with permission from C. N. Yang, Phys. Rev. Letters 33, 445 (1974).
© 1974 The American Physical Society.

Yang's Gravitational Field Equations
Reprinted with permission from Wei-Tou Ni, Phys. Rev. Letters 35, 319 (1975).
© 1975 The American Physical Society.

Einstein Lagrangian as the Translational Yang-Mills Lagrangian
Reprinted with permission from Y. M. Cho, Phys. Rev. 14, 2515 (1976).
© 1976 The American Physical Society.

De Sitter and Poincaré Gauge-Invariant Fermion Lagrangians and Gravity
Reprinted from J. P. Hsu, Phys. Letters, 119B, 328 (1982).
© 1982, with permission from Elsevier.

Fixation of Coordinates in the Hamiltonian Theory of Gravitation
Reprinted with permission from P. A. M. Dirac, Phys. Rev. 114, 924 (1959).
© 1959 The American Physical Society.

New General Relativity (with Addendum)
Reprinted with permission from K. Hayashi and T. Shirafuji, Phys. Rev. 19, 3524 (1979).
© 1979 The American Physical Society.

Relativistic Theory of Gravitation
Reprinted with permission from A. A. Logunov and M. A. Mestvirishvili, Prog. Theor. Phys. 74,
31 (1985). © 1985 Prog. Theor. Phys.

Yang-Mills Gravity: A Union of Einstein-Grossmann Metric and Yang-Mills Tensor Fields in
Flat Spacetime with Translation Symmetry
Contributed by J. P. Hsu. Based on a paper in Annual Report of the National Center for
Theoretical Sciences (2005).

Empirical Foundations of the Relativistic Gravity
Reprinted with permission from Wei-Tou Ni, Int. J. Mod. Phys. D (to be published)
© 2005 World Scientific.

Binary Pulsars and Relativistic Gravity
Reprinted with permission from J. H. Taylor, Jr., Rev. Mod. Phys. 66, 711 (1994).
© 1994 The American Physical Society.

Concept of Nonintegrable Phase Factors and Global Formulation of Gauge Fields (Extract)
Reprinted with permission from T. T. Wu and C. N. Yang, Phys. Rev. D12, 3845 (1975).
© 1975 The American Physical Society.

Gauge Fields
Reprinted with permission from Robert Mills, Am. J. Phys. 57, 493 (1989),
© 1989 American Institute of Physics.

Magnetic Monopoles, Fiber Bundles, and Gauge Fields
Reprinted with permission from C. N. Yang, Ann. of the New York Academy of Sciences, 294,
86 (1977). © 1977 New York Academy of Sciences.

Gauge Theory: Historical Origins and Some Modern Developments
Reprinted with permission from L. O'Raifeartaigh and N. Straumann, Rev. Mod. Phys. 72, 1
(2000). © 2000 The American Physical Society.

String Theory as a Generalization of Gauge Symmetry
Contributed by Pei-Ming Ho, National Taiwan University.

The Cosmological Constant Problem
Reprinted with permission from S. Weinberg, Rev. Mod. Phys. 61, 1 (1989).
© 1989 The American Physical Society.

The Cosmological Constant and Dark Energy (Extract)
Reprinted with permission from P. J. E. Peebles and B. Ratra, Rev. Mod. Phys. 75, 559 (2003).
© 2003 The American Physical Society.

Marcel Grossmann (1878-1936)
Contributed by Jong-Ping Hsu and Dana Fine, University of Massachusetts Dartmouth.

Remembering Robert L. Mills
Reprinted with permission from S. L. Marateck, Phys. Today, Oct. 2003.
© 2003 American Institute of Physics.

Marcel Grossmann (Photo courtesy of Anna and Carlo Kkvay-Grossmann)



Bryce DeWitt (Photo courtesy of Cecile DeWitt-Morette)

Robert Mills (Photo courtesy of Mrs. Mills)

Chen Ning Yang and Tai Tsun Wu in Leiden (1984) (Photo courtesy of Judy Wong)

T. D. Lee and C. N. Yang at the Institute for Advanced Study (Courtesy of the Archives of the Institute
for Advanced Study, Princeton, New Jersey, USA)

Remarks on the Development of the Gravitational and Yang-Mills Fields,
and Accelerated Frames

Jong-Ping Hsu and Dana Fine

1 Introduction
In the past 100 years, the ideas of general coordinate invariance and of gauge invariance
have played leading roles in the investigation of the fundamental interactions of nature
(gravitational, weak, electromagnetic and strong interactions). Physics Today calls 2005
the "World Year of Physics," to celebrate physics in its broadest context as part of the
human experience and to raise awareness of physics within the broad population. [1] It
also happens to be roughly the 50th birth-year of Yang-Mills theory, the 100th birth-year
of Einstein's 'happiest thought,' as well as the 100th anniversary of the publication of the
celebrated theory of special relativity. [2] It is thus a fitting time to review Einstein's and
Yang-Mills' ideas, their impacts on both physics and mathematics, and some open problems
in related areas.
In 1907, a simple thought flashed through young Einstein's mind:

I was sitting in a chair in the patent office at Bern when all of a sudden a thought
occurred to me: If a person falls freely he will not feel his own weight. I was startled.
This simple thought made a deep impression on me. It impelled me toward a theory of
gravitation.

He told this story in his Kyoto lecture. [3] This 'happiest thought,' as Einstein called it,
involves two ingredients:
i. gravitational force, and
ii. accelerated frames.
These two related physical subjects were first discussed by Einstein in his 1907 review paper
entitled "On the Relativity Principle and the Conclusions Drawn from It." [4] Apparently,
these problems had engrossed Einstein's thought shortly after he published his landmark
paper on special relativity 100 years ago. Einstein realized that his knowledge of the relevant
mathematics was inadequate, so he turned to his mathematician friend and university
colleague Grossmann. Grossmann's help proved significant for Einstein in realizing his dream
theory.
Around 1947, a graduate student at the University of Chicago, C. N. Yang, also had a
simple thought: [5]

Since Maxwell's equations and the conservation of electric charge are intimately related,
and the conservation of isotopic spin has been established by experiments, should it imply
another kind of gauge field?

Yang tried and tried, but he was just unable to overcome a key difficulty. [6] (See sec. 3)
Such efforts, and failures, are typical, everyday occurrences in graduate students' offices. It
seems this thought made a deep impression on him. Seven years later, it also impelled
Yang and Mills toward a non-Abelian gauge theory, when a spark from their discussions
grew to illuminate the key difficulty.
These two instances, and many others, raise awareness of a remarkable fact: new
thoughts and ideas often emerge from refreshingly young minds. As Norbert Wiener said
eloquently to young mathematicians: "You must devote this brief springtime of top creative
ability to the discovery of new fields and new problems, of such richness and compelling
character that you can scarcely exhaust them in your life." This goal appears to apply
equally well to all young researchers.

2 Gravity
Historically, Newton created the basic framework for understanding the static property of
gravitational force in his grand PRINCIPIA (1686): the inverse-square law of universal
gravity and the laws of motion. Newton achieved an extraordinary unification, perhaps
the first in physics: he demonstrated that the same force and laws of motion apply in the
celestial and the sub-lunar spheres. The style of Newton's PRINCIPIA follows closely that
of the ELEMENTS by the pioneering and profound mathematical thinker Euclid. For example, Book
III of PRINCIPIA started from rules of reasoning in philosophy, introduced phenomena,
and followed these with propositions. [7] Newton proposed the first scientific understanding
of the solar system based on his universal law for (static) gravitational force and his powerful
general methods of mathematics.
About 220 years later, Poincaré investigated the Lorentz-invariant and kinematic
properties of gravity in his comprehensive paper on relativity finished in 1905. After he discussed
and derived all essential results of special relativity for mechanics and electrodynamics, he
reached the insightful conclusion that the gravitational action propagates with the speed of
light, based on the invariants of the Lorentz group. [8] Poincaré is known for his broad and
universal mind and for his contributions to mathematics and physics, comparable to those
of Gauss.
With his persistence and ingenuity, Einstein's 'happiest thought' of 1907 eventually led
to arguably the greatest leap forward in the human endeavor to understand the universe,
"The Foundation of the General Theory of Relativity." This followed 8 years of hard work
and about 30 not-completely-correct papers on gravity and/or general relativity. [9] The
result is Einstein's splendid equation for gravitation, which has been extensively discussed
and tested by experiments.
Einstein conceived and worked on the idea of general coordinate invariance during 1911-12.
His knowledge of mathematics was inadequate to express his radically new ideas, so he
sought help, pleading to his mathematician friend: "Grossmann, you must help me, or else
I'll go crazy!"; thus started one of the most beautiful collaborations between two scientists
in different disciplines. [10] Einstein's insight was that the law of gravity must be invariant
under arbitrary spacetime coordinate transformations (one-to-one and twice-differentiable).
This idea implies that the physics of gravity should be formulated and understood in any
frame of reference, and it eventually led to a theory of gravity which revolutionized our
concepts of the physical universe, endowing the universe with pseudo-Riemannian geometry.
The impact on mathematics can be seen from the participation in the development of Einstein's
idea by such brilliant mathematicians as Hilbert, Cartan, Levi-Civita, Weyl and others.
Indeed the impact went well beyond the sciences to include literature and art, which is part
of how Einstein became an international celebrity and TIME magazine's person of the
century.
In 1915, after attending Einstein's talks and many discussions and extensive
correspondence, Hilbert was able to grasp the essence of Einstein's idea and express it through an
invariant action involving the linear scalar curvature. He finished the paper "The Foundation
of Physics" containing this invariant action about a week before Einstein completed
his landmark paper. [11] Hilbert was a newcomer in this field, and one can feel a fierce
competition. Indeed, their relation appears to have been strained for a short period of
time. Presumably this was due to their difference in philosophy (action principle versus
detailed dynamical analysis) rather than a dispute over priority. In fact, Hilbert clearly
credits Einstein with the idea for the theory. Apparently, Einstein liked Hilbert's method
of deriving the gravitational equation from one single principle of variation. He also
published a paper "Hamilton's Principle and the General Theory of Relativity" in 1916. The
recent public dissemination of a hand-edited galley proof of Hilbert's paper has shed new
light on the development of Hilbert's formulation and its relation to Einstein's, as discussed
in detail in [12]. Hilbert's treatment of gravity (and electromagnetism) was based on an
elegant invariant formalism which he argued mathematically was unique given, as axioms,
the requirement of general covariance and some reasonable additional assumptions. [11] He
even believed what we would now call the "theory of everything" could be constructed on
the basis of an axiomatically-determined geometrically-invariant action. This situation
resembles the way the mathematician Poincaré grasped the essence of the relativity principle in
1905 through his complete understanding of the symmetry group of the Lorentz
transformations. He derived, for the first time, the invariant law for the motion of a charged particle
by using a Lorentz invariant action. [8] The young Einstein was unable to do so; in his 1905
landmark paper he only obtained an approximate and non-invariant equation for the motion
of a charged particle with small accelerations. The principle of least action is now a powerful
and standard method for the formulation of physical theories.
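For orientation, an invariant action that is linear in the scalar curvature can be written in modern notation as follows. This is a minimal sketch rather than a transcription of Hilbert's paper: the units (c = 1), the metric signature (-,+,+,+), the sign conventions, and the symbols G, S_m and T_{\mu\nu} are assumptions introduced here for illustration. The block assumes the amsmath package.

```latex
% Assumed conventions (not from the source): c = 1, signature (-,+,+,+),
% G = Newton's constant, S_m = matter action, g = det(g_{\mu\nu}).
\begin{align}
  S &= \frac{1}{16\pi G}\int R\,\sqrt{-g}\;d^4x \;+\; S_m , \\
  % Variation with respect to the inverse metric, up to a boundary term:
  \delta\!\left(\sqrt{-g}\,R\right)
    &= \sqrt{-g}\,\Bigl(R_{\mu\nu}-\tfrac12\,g_{\mu\nu}R\Bigr)\,\delta g^{\mu\nu}
       \;+\;\text{(total derivative)} , \\
  % Assumed definition of the energy-momentum tensor of matter:
  T_{\mu\nu} &\equiv -\frac{2}{\sqrt{-g}}\,\frac{\delta S_m}{\delta g^{\mu\nu}} , \\
  % Requiring \delta S = 0 for arbitrary \delta g^{\mu\nu}:
  R_{\mu\nu}-\tfrac12\,g_{\mu\nu}R &= 8\pi G\,T_{\mu\nu} .
\end{align}
```

Under these assumptions, a single variational principle yields the gravitational field equation, which is the feature of the action approach that the paragraph above emphasizes.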
Furthermore, Cartan recognized in 1922 an unsatisfactory feature in Einstein's equation
of gravity, namely, that the energy-momentum tensor does not have geometrical meaning.
[13] He showed how, in an Einstein universe with a given $ds^2$, the energy tensor attached
to each volume element of that universe can be defined geometrically. Following this line
of research, he published a paper, "On a generalization of the concept of Riemann curvature
and spaces with torsion." [14] Cartan's work created the only satisfactory mathematical
framework for physicists to be able to introduce fermions or spinors into Einstein's theory.
This is the only known way for fermions to be coupled to the gravitational field according to
the requirement of general coordinate invariance. All these examples clearly show the need
for collaboration between physicists and mathematicians in developing a physical theory.
In some sense, Einstein's idea of general coordinate invariance suggests a drastic new
approach in physics: namely, the geometrization of all physical fields. If one follows this
approach to treat the electromagnetic force, which is velocity-dependent, one might choose
to employ Finsler geometry rather than Riemannian geometry. The reason is that the
fundamental metric tensors of Finsler geometry depend on both position and velocity
(i.e., the differentials of coordinates). [15] It appears that, so far, all attempts to geometrize
the classical electromagnetic field have been unsuccessful, let alone quantum electrodynamics
and Einstein's unified theory based on Riemannian geometry.
Einstein's tremendously successful theory of gravity also presents a huge problem in
physics. When Dyson gave the Gibbs Lecture on "Missed Opportunities" under the auspices
of the American Mathematical Society in 1972, he stressed that the most glaring
incompatibility of concepts in contemporary physics is that between the principle of general
coordinate invariance and all quantum-mechanical and quantum-field-theoretic descriptions
of nature. [16] Such an incompatibility is intimately related to the difficulty of quantization
in curved spacetime. As a perturbative theory in flat spacetime, Einstein's theory is very
complicated and is not renormalizable. [17] This is of course a challenge to mathematicians
and physicists. As Bohr was fond of saying: "How wonderful that we have met with a
paradox. Now we have some hope of making progress."

3 Yang-Mills Fields and Gauge Theory of Gravity


In the decades from the 1930s to the 1950s, most physicists were excited by new discoveries
in quantum mechanics for atomic structure, nuclear physics, elementary particles and
quantum fields. Gravitational theory was no longer in the mainstream of physics research.
Around 1940, experiments on the strong interaction indicated the existence of a new
conservation law, that of isospin (or isotopic spin). The concept of isospin was originally introduced
by Heisenberg in 1932 as a convenient mathematical representation to characterize the two
states of the nucleon (isospin up as proton and isospin down as neutron), similar to the
usual electron spin states. It nevertheless proved to be a useful and good physical quantum
number for strong interactions, though not for all interactions.
The story of the celebrated Yang-Mills collaboration is briefly as follows: Robert Mills
was a research associate from 1953 to 1955 at Brookhaven National Laboratory, while C. N.
Yang joined the Institute for Advanced Study in Princeton as a postdoc in 1949 and
became a professor in 1955. Mills shared an office with Yang, who visited the Laboratory
from 1953 to 1954. They were both young and dreamed about physics. Both were at
the right place at the right time. They earnestly discussed many things in physics,
including the experimental data flowing out of the new Cosmotron (then, at 3 GeV, the
world's largest proton accelerator) at the National Laboratory, and possible implications of
isospin conservation. During their discussions, they had the idea of adding a new term of
the form $\epsilon^{abc} b^b_\mu b^c_\nu$ to the field strength $f^a_{\mu\nu}$. The gauge-covariant derivative $D_\mu$ and field
strength $f^a_{\mu\nu}$ of the isovector field $b^a_\mu$
are given by

where $a, b, c = 1, 2, 3$, and the $F$'s are the constant matrix representations of the isospin SU(2)
group. This addition solved Yang's original difficulty, and the rest is history. This type
of new term in the field strength is essential for all non-Abelian gauge fields. Adding this
term to the gauge field strength is comparable in its significance to Maxwell's adding the
displacement current term to the original Ampère law, which made the electromagnetic theory
consistent and complete.
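For orientation, the covariant derivative and field strength referred to above can be sketched in one common modern convention. The coupling constant $g$, the sign conventions, the superscript on the generators $F^a$, and their normalization are assumptions made here for illustration and may differ from the authors' own display; only the presence of the quadratic term $\epsilon^{abc} b^b_\mu b^c_\nu$ is taken from the text.

```latex
% Assumed convention: coupling g, Hermitian SU(2) generators F^a
D_\mu = \partial_\mu - i g\, b^a_\mu F^a ,
\qquad [F^a, F^b] = i\, \epsilon^{abc} F^c ,

f^a_{\mu\nu} = \partial_\mu b^a_\nu - \partial_\nu b^a_\mu
             + g\, \epsilon^{abc}\, b^b_\mu b^c_\nu ,
\qquad [D_\mu, D_\nu] = - i g\, f^a_{\mu\nu} F^a .
```

In this convention, dropping the last term of $f^a_{\mu\nu}$ leaves three independent copies of the Maxwell field strength, which is one way to see why the quadratic term is the essentially non-Abelian ingredient.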
Nevertheless, this non-trivial and unique generalization of Maxwell's equations did not
attract much attention initially. When Dyson wrote the article "Innovation in Physics" in
1958, he failed to mention the discovery of non-Abelian gauge fields by Yang and Mills. [16]
Even when Yang himself gave a talk on "The Future of Physics" at MIT in 1961, he just
asked a related question: "What are the basis of the invariance under charge conjugation,
and the invariance under isotopic spin rotation, both of which, unlike space-time symmetries,
are known to be violated?" Yang did not mention the non-Abelian gauge fields.
There were in fact serious problems associated with the original Yang-Mills theory. One
of the problems is that the Yang-Mills field is massless, while all observed particles with strong
interactions have mass. This incompatibility between Yang-Mills field theory and experiment
became the central issue when Yang was invited by Oppenheimer to give a talk at the
Institute for Advanced Study, Princeton, on this work. Pauli persistently and repeatedly
asked Yang the question regarding the mass of the new field. Yang could not give him a
satisfactory answer. Pauli made a strong criticism, and the talk was almost stopped by
him. [19] It was well known that Pauli was super-critical of every new physical idea,
including Einstein's ideas. In fact, Pauli himself had had a similar idea and had investigated
the same problem before. He also obtained the expression for the field strength of the new gauge
field. According to Pauli's colleague Gulmanelli, [20] Pauli gave up the whole investigation
because the new field quantum was a massless vector particle and, hence, contradicted
experiments. To all practical physicists at that time, it was obvious that the Yang-Mills
theory with a zero-mass field did not exist in nature, because a zero-mass field would have
been easily detected in strong-interaction experiments.
Here, one sees clearly the difference between Yang and Pauli regarding their tastes in
physics research. Yang said later: "We did not know how to make the theory fit experiment.
It was our judgment, however, the beauty of the idea alone merited attention." [21] Even
so, when Mills passed away in 1999, Yang, recalling their creation of the gauge theory,
acknowledged: "We were pleased by the beauty of the generalization, but neither of us had
anticipated its great impact on physics 20 years later." [22]
Although most physicists ignored Yang and Mills' work at that time, T. D. Lee and R.
Utiyama were immediately attracted to their idea. Lee and Yang proposed generalized
gauge transformations to understand the conservation of heavy particle (or baryon) number
in 1955. They argued that such a conservation law implies the existence of a new
long-range repulsive force between baryons (e.g., protons and neutrons). The corresponding
force would be attractive between baryons and anti-baryons. They used the Eötvös experiment
to estimate the strength of such a new force. They found it to be about times
smaller than that of the gravitational force. A similar consideration of the conservation law
for lepton number leads to a new and very weak long-range repulsive force between electrons.
It is quite possible that such a Lee-Yang force between baryons in galaxies played a role
similar to the so-called "dark energy" and has direct relevance to the observed accelerating
expansion of the universe. [23]
One year later, in 1956, Utiyama generalized Yang and Mills' discussion of the SU(2) gauge
group to a general group with n gauge functions, and grasped the essential idea expressed
in his paper "Invariant Theoretical Interpretation of Interaction." He stressed that the
form of the interactions between some well known fields can be uniquely determined by
postulating invariance under a certain group of transformations. [24] Furthermore, he applied
his generalized viewpoint of Yang-Mills fields to discuss gravity. Thus Utiyama opened a
new approach to gravity, which was later termed "Gauge Theory of Gravity." Over the
years since, many people, including Kibble, Hayashi, Nakano, Utiyama, Fukuyama, Yang,
Thompson, Ni, Hsu, Cho, Ashtekar and others, have continued to explore such gauge theories of
gravity. These discussions are based on the Lorentz group, the Poincaré group, the de Sitter
group, the spacetime translation group and other groups. [25] We shall suggest below that, from
the viewpoint of symmetry and its related conservation law, the spacetime translation group
appears to be the simplest and the most natural candidate for the gauge group of gravity.
The problem of gravity as a Yang-Mills field is still open.
A few years later, in 1960, J. J. Sakurai [26] took advantage of the Yang-Mills gauge-invariance
principle to propose strongly interacting vector mesons for discussing universality and
conserved current. But the presence of masses of vector mesons in strong interaction physics
violates gauge invariance. The mass problem in the Yang-Mills theory was later solved by
the Higgs mechanism of spontaneous symmetry breaking, which enabled Weinberg to make a
wonderful prediction of gauge boson masses in 1967. [27] Thus Yang-Mills fields took center
stage in physics only gradually.
The first breakthrough in the development of Yang-Mills theory did not address the
mass problem. It came from Feynman's 1962 lecture at the "Conference on Relativistic
Theories of Gravitation" in Poland. This is a rare opportunity to see the detailed process
of discovery in a genius's mind: Feynman's discovery of a serious problem through direct
calculations and his original idea of using a fictitious particle to fix it. Presumably, such
a crazy and not-even-half-baked idea would have had a hard time being published in the "damned
Physical Review" (Feynman's term) had the paper been submitted. A conference was no
doubt the ideal forum for him to present such an original and crazy idea and to challenge
directly the experts in the audience. Feynman considered Einstein's gravitational theory
with a scalar field as the source. This grand master of calculations with rules and diagrams
evaluated amplitudes at the one-loop level and encountered a serious problem: namely, the
resultant amplitudes do not satisfy unitarity (a necessary condition for their interpretation
as determining probabilities). He was worried and puzzled. At the suggestion of Gell-Mann,
he looked into the Yang-Mills theory which, like Einstein's theory of gravity, is non-linear,
but is otherwise much simpler. Feynman was happy to discover that Yang-Mills theory has
the same problem! He remarked that the unitarity difficulty should have been noticed by
meson physicists who had been fooling around the Yang-Mills theory. "They had not noticed
it because they are practical, and the Yang-Mills theory with zero mass obviously does not
exist, ..." Based on the simpler Yang-Mills theory, Feynman was able to find a suitable and
crazy way to cure the difficulty: "... You must subtract from the answer, the result that you
get by imagining that in the ring which involves only a graviton going around, instead you
calculate with a different particle going around, an artificial, dopey particle is coupled to it.
It is a vector particle, artificially coupled to the external field, so designed as to correct the
error in this one." Nevertheless, Feynman was unable to solve the problem beyond one loop
at that time. B. S. DeWitt was attracted to Feynman's crazy idea of a ghost particle and
asked Feynman about its structure and nature.
After the conference, DeWitt made an heroic effort to work out all the details of Feynman
rules beyond one loop to show that Feynman's idea of a ghost particle works for both the
Yang-Mills theory and the Einstein theory. He published three long and detailed papers in
1967: "Quantum Theory of Gravity I, II, and III." [28] The whole thing is very complicated,
and the physical reason for the presence of the ghost particle only in the intermediate states
of a physical process is not completely clear. However, the results of Feynman and DeWitt
inspired two Russian physicists, Faddeev and Popov, to write a most elegant paper, "Feynman
Diagrams for the Yang-Mills Field" (with only 2 pages!), in the same year. [29] This paper
completely clarified and solved the problem of unitarity and gauge invariance to all orders,
on the basis of the Feynman path integral.
Basically, Faddeev and Popov solved Feynman's problem of unitarity by considering the
gauge invariance of physical amplitudes. They showed that a gauge condition such as
$\partial^\mu B^a_\mu = 0$ in the Yang-Mills theory cannot be consistently imposed for all time, in contrast
to that in quantum electrodynamics. Faddeev and Popov proposed a new method to enforce
the gauge condition for all time and, hence, maintain the gauge invariance of physical
amplitudes. They showed that it leads to a closed loop with a (scalar) ghost particle
propagating along it with a specific interaction, which restores the unitarity of the physical
amplitudes. The same method based on Feynman's path integral can also be applied to
Einstein's theory of gravity. [30]
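The construction described above can be summarized schematically in path-integral form. The notation below (ghost fields $c^a$ and $\bar{c}^a$, coupling $g$, the operator $M$, and the covariant gauge condition $\partial^\mu B^a_\mu = 0$) follows a standard textbook convention and is assumed here for illustration; it is not taken from the original two-page paper, and overall normalizations and signs are convention-dependent.

```latex
% Gauge-fixed generating functional (schematic, assumed conventions)
Z = \int \mathcal{D}B \;\, \delta\!\left(\partial^\mu B^a_\mu\right)\,
    \det M[B] \;\, e^{\, i S_{YM}[B]} ,
\qquad
M^{ab}[B] = \partial^\mu D^{ab}_\mu ,
\quad
D^{ab}_\mu = \delta^{ab}\,\partial_\mu + g\,\epsilon^{acb} B^c_\mu ,

% The determinant rewritten as a loop of anticommuting scalar "ghost" fields
\det M[B] \;\propto\; \int \mathcal{D}\bar{c}\,\mathcal{D}c \;
    \exp\!\left( i \int d^4x \;\, \bar{c}^a\, \partial^\mu D^{ab}_\mu\, c^b \right) .
```

For an Abelian theory the $\epsilon^{acb}$ term is absent, so $M = \partial^\mu \partial_\mu$ does not depend on the gauge field and the determinant is a field-independent constant. This is one way to see the contrast with quantum electrodynamics: only in the non-Abelian case does the ghost loop couple to the gauge field.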
The path integral is another not-completely-baked but (probably) profound idea of
Feynman's. It is useful in physics but still lacks a mathematical foundation. The status of Dirac's
delta function around the 1930s was similar. Now we have a foundation for the delta function
in the mathematical theory of distributions, but we still lack a mathematical basis for
Feynman's path integral. This problem was also discussed by Dyson in his lecture "Missed
Opportunities." [31] In this sense, the results of Faddeev and Popov may not be considered
as rigorously established within the framework of quantum field theory. Mandelstam
obtained the same results on the basis of quantum field theory. [32]
Thus, the Yang-Mills theory (with or without masses) can be established as a gauge
theory which satisfies both unitarity and renormalizability. So far, this is the best theory
of strong, weak and electromagnetic interactions that physicists can construct within the
framework of local field theory. [33] As noted above, Einstein's theory of gravity can also be
regarded as a gauge theory. However, there are important differences: Einstein's theory is
based on curved spacetime and its action is linear in the spacetime curvature. In contrast,
the conventional Yang-Mills theory is based on flat spacetime and its action is quadratic
in the gauge curvature. As a result, Einstein's theory is gauge invariant and unitary, but it is
not renormalizable. Only when one can construct a unitary, renormalizable and consistent
theory of gravity can one claim to have solved the problem of quantum gravity.
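The contrast drawn above can be made concrete by writing the two actions side by side. The normalizations and the symbols $G$ (Newton's constant), $g$ (gauge coupling), and $R$ (scalar curvature) follow a common textbook convention in units $\hbar = c = 1$; they are assumptions made for illustration rather than the authors' notation, and the overall sign of the gravitational action depends on the metric signature convention.

```latex
% Gravity: linear in the spacetime curvature, dimensionful coupling
S_{E} = \frac{1}{16\pi G} \int d^4x \, \sqrt{-\det g_{\mu\nu}} \;\, R ,
\qquad [G] = (\text{mass})^{-2} ,

% Yang-Mills: quadratic in the gauge curvature, dimensionless coupling
S_{YM} = -\frac{1}{4} \int d^4x \;\, f^a_{\mu\nu}\, f^{a\,\mu\nu} ,
\qquad [g] = (\text{mass})^{0} .
```

In perturbation theory around flat spacetime, a coupling of negative mass dimension such as $G$ makes each successive loop order more divergent, which is the usual power-counting signal of non-renormalizability. The dimensionless gauge coupling of Yang-Mills theory does not have this feature.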

4 Transformations of Spacetime for Accelerated Frames


The first ingredient, gravitational force, in Einstein's "happiest thought" has been well
developed and understood through his theory of gravity. Indeed, his idea of general coordinate
invariance implies that the physics of gravity should be formulated and understood in
any frame of reference. However, the concept of specific accelerated frames, the second
ingredient in his happiest thought, turns out to be very difficult to define precisely and, hence,
has not been well developed theoretically or experimentally. So far, we do not even have a
well-established definition of constant linear accelerations as transformations of spacetime.


In one simple, smooth generalization of Lorentz transformations to reference frames with
linear accelerations, spacetime as seen from these accelerated frames is non-uniform, and the
group properties of the spacetime transformations, with the associated Lie algebra, appear
to be non-trivial. For instance, the Lie algebra has infinite rank.
In his 1907 paper [4], Einstein asked:
"Is it conceivable that the principle of relativity also holds for systems which accelerate
relative to each other?"
Apparently, he had in mind the physical properties of space and time and exact physical
laws in non-inertial frames of reference. He was able to obtain some physical results with
the help of the principle of equivalence. Nevertheless, he could only write down approximate
spacetime transformations involving constant linear accelerations, which do not reduce to
the Lorentz transformations in the limit of zero acceleration. It appears that Møller was the
first to attempt to derive the spacetime transformations for accelerated frames. [35] In 1943,
he used Einstein's field equation in vacuum to derive a spacetime transformation between
an inertial frame $F_I(w_I, x_I, y_I, z_I)$ and a frame $F(w^*, x, y, z)$ which moves with a constant
acceleration $\alpha^*$ along the x-direction:

$$w_I = \left(x + \frac{1}{\alpha^*}\right)\sinh(\alpha^* w^*), \quad x_I = \left(x + \frac{1}{\alpha^*}\right)\cosh(\alpha^* w^*) - \frac{1}{\alpha^*}, \quad y_I = y, \quad z_I = z, \qquad (3)$$

where $w_I = ct_I$ and $w^* = ct$. In 1972, T. Y. Wu and Y. C. Lee derived the same accelerated
transformations from kinematical considerations based on Lorentz transformations for
short spacetime intervals. [35] In the limit of zero acceleration, they reduce to the identity
transformation, and although there are reasons to term these "constant accelerations," the
velocity $\beta$ of the $F(w^*, x, y, z)$ frame is not a linear function of time $w^*$: $\beta = \tanh(\alpha^* w^*)$.
J. P. Hsu and L. Hsu reparametrized the time coordinate and generalized further to arrive
at transformations with two properties: [36]
i. minimal departure from the Lorentz transformation: the velocity $\beta$ is a linear function
of the (accelerated-frame) time $w$, $\beta = \beta_o + \alpha_o w$, and
ii. limiting 4-dimensional symmetry: the transformation to an allowed accelerated frame
reduces to a (generally non-trivial) Lorentz boost in the limit of zero acceleration.
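The resulting transformation between an inertial frame $F_I$ and an allowed accelerated
frame $F$, moving with a constant acceleration $\alpha_o$ in the x-direction, is

$$w_I = \gamma\beta\left(x + \frac{1}{\alpha_o\gamma_o^2}\right) - \frac{\beta_o}{\alpha_o\gamma_o}, \quad x_I = \gamma\left(x + \frac{1}{\alpha_o\gamma_o^2}\right) - \frac{1}{\alpha_o\gamma_o}, \quad y_I = y, \quad z_I = z, \qquad (4)$$

where $\beta = \alpha_o w + \beta_o$, $\gamma_o = 1/\sqrt{1-\beta_o^2}$, and $\gamma = 1/\sqrt{1-\beta^2}$. This is called the Wu
transformation. [36] It implies that a particle at rest in the $F$ frame will gain constant energy per unit
length, as measured in an inertial frame $F_I$. This property, which defines the constant linear
acceleration, means the frame $F$ may serve as the rest frame for a particle in a high-energy
linear accelerator.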
The accelerated transformations of spacetime (4) reveal the following property for physics
in accelerated frames of reference: the speed of light is described by the equations
$ds^2 = dw_I^2 - dr_I^2 = W^2 dw^2 - dr^2 = 0$, where $W = \gamma^2(\gamma_o^{-2} + \alpha_o x)$ can be obtained from (4). Thus,
in the inertial frame $F_I$, the speed of a light signal in vacuum is constant, $dr_I/dw_I = 1$. On
the contrary, the speed of this light signal as measured in the accelerated frame $F$ is given
by the function $W$, i.e., $dr/dw = W = \gamma^2(\gamma_o^{-2} + \alpha_o x)$. This implies there is no relativity
or equivalence between an inertial frame $F_I$ and an accelerated frame $F$ in the sense of the
invariance of physical laws familiar in special relativity theory.
It is interesting to note that these accelerated transformations of spacetime are nonlinear,
in contrast to their zero-acceleration limit, the linear Lorentz transformations. As a
result, the finite point transformations cannot be derived from infinitesimal transformations
through the usual process of exponentiation of the transformation matrices.
The group operations and the Lie algebra for the group generators for the MWL and
Wu transformations involving two parameters are not trivial. For example, the Lie algebra
is not finitely generated, in contrast to the Lorentz and Poincaré groups. [37]
The corresponding transformations of the differentials $dx_I^\mu$ and $dx^\mu$ (for fixed values of
the spacetime coordinates) are necessarily linear. They can be described by a pair of group
operations:

i. multiplying $dw$ by a synchronization factor $W$:

$$(dw, dx, dy, dz) \to (W\,dw, dx, dy, dz), \qquad (5)$$

and then
ii. making the usual 4-dimensional Lorentz boost for $(W\,dw, dx, dy, dz)$ involving the
(spacetime-dependent) velocity $\beta = \beta_o + \alpha_o w$:

$$dw_I = \gamma(W\,dw + \beta\,dx), \quad dx_I = \gamma(dx + \beta W\,dw), \quad dy_I = dy, \quad dz_I = dz. \qquad (6)$$

These equations can be obtained from (4) by differentiation. The synchronization factor
$W$ in the first operation guarantees that the differential equations in (6) are integrable and
that the time in the accelerated frame $F$ automatically becomes the synchronized time in
the Lorentz transformation when the acceleration $\alpha_o$ approaches zero. [36]
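The properties claimed in this section can be checked numerically. The sketch below is only an illustration: it assumes a closed form of the transformation between $(w, x)$ and $(w_I, x_I)$ that is consistent with the differential relations (6) and with the factor $W = \gamma^2(\gamma_o^{-2} + \alpha_o x)$ quoted above, in units with $c = 1$. The function names (`wu_transform`, `sync_factor`, `jacobian`, `induced_metric`) are hypothetical helpers introduced here, not part of any library or of the authors' work.

```python
import math


def wu_transform(w, x, alpha_o, beta_o):
    """Map accelerated-frame (w, x) to inertial-frame (w_I, x_I).

    Assumed closed form (c = 1, alpha_o != 0), chosen to agree with the
    differential relations (6) in the text.
    """
    gamma_o = 1.0 / math.sqrt(1.0 - beta_o ** 2)
    beta = beta_o + alpha_o * w          # velocity is linear in w
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    shifted = x + 1.0 / (alpha_o * gamma_o ** 2)
    w_i = gamma * beta * shifted - beta_o / (alpha_o * gamma_o)
    x_i = gamma * shifted - 1.0 / (alpha_o * gamma_o)
    return w_i, x_i


def sync_factor(w, x, alpha_o, beta_o):
    """Synchronization factor W = gamma^2 (gamma_o^-2 + alpha_o x)."""
    beta = beta_o + alpha_o * w
    return (1.0 - beta_o ** 2 + alpha_o * x) / (1.0 - beta ** 2)


def jacobian(w, x, alpha_o, beta_o, h=1e-6):
    """Central-difference partials (dwI/dw, dwI/dx, dxI/dw, dxI/dx)."""
    wp, xp = wu_transform(w + h, x, alpha_o, beta_o)
    wm, xm = wu_transform(w - h, x, alpha_o, beta_o)
    wq, xq = wu_transform(w, x + h, alpha_o, beta_o)
    wn, xn = wu_transform(w, x - h, alpha_o, beta_o)
    return ((wp - wm) / (2 * h), (wq - wn) / (2 * h),
            (xp - xm) / (2 * h), (xq - xn) / (2 * h))


def induced_metric(w, x, alpha_o, beta_o):
    """Coefficients (g_ww, g_wx, g_xx) of dw_I^2 - dx_I^2 in (w, x)."""
    wi_w, wi_x, xi_w, xi_x = jacobian(w, x, alpha_o, beta_o)
    return (wi_w ** 2 - xi_w ** 2,
            wi_w * wi_x - xi_w * xi_x,
            wi_x ** 2 - xi_x ** 2)
```

Under these assumptions, `induced_metric` should return approximately $(W^2, 0, -1)$, which is the statement $ds^2 = W^2 dw^2 - dx^2$, so a light ray along $x$ satisfies $dx/dw = W$. Evaluating `wu_transform` with a very small `alpha_o` should approach the Lorentz boost $w_I = \gamma_o(w + \beta_o x)$, $x_I = \gamma_o(x + \beta_o w)$.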

5 Yang-Mills and the Interplay between Mathematics and Physics


Einstein's general relativity involved some of the period's mathematics in its original
formulation. Yang and Mills' theory did not, yet it had a swift and lasting impact on
mathematics, particularly in the areas of geometry and topology. Mathematicians, who had
developed fiber bundles as tools for understanding the global geometry of manifolds, soon
understood the vector potential and the Yang-Mills action as a connection on a principal
fiber bundle and the square norm of this connection's curvature, respectively. Indeed, Wu
and Yang published a dictionary [38] allowing the physicist and the mathematician each to
translate the terms the other had developed for the same concepts. The recognition of this
overlap of interests quickly led to a renewal of interaction between what had become two
distinct communities. As outlined below, basic questions in the new physics suggested new
realms of mathematical exploration; conversely, existing mathematical constructs, notably
index theory, provided insights in the new physics.
The formulation of an action immediately raises the question of its critical points; that
is, the solutions of the corresponding Euler-Lagrange equations. For the Yang-Mills action
these solutions can be non-trivial field configurations known as instantons. Hitchin, Atiyah,
Singer, Drinfeld and Manin [39] began studying the spaces of instantons. In the case of
self-dual Yang-Mills instantons Donaldson [40] obtained results which firmly established
the Yang-Mills equations as a useful tool in the study of manifolds. He developed what
is now known as the Donaldson invariant, which proves sensitive not only to the topology
but to the differentiable structure of a manifold. This served as the key to resolve a long-
standing question of whether there could be inequivalent differentiable structures on a given
topological four-manifold. (The answer is yes.)
The chiral and non-Abelian anomaly, first discovered using the technique of current
algebras, provided another avenue from Yang-Mills physics to mathematics. This avenue
proved to be very much a two-way street. Bringing techniques from global analysis of
manifolds, including index theory, to bear on a puzzle arising from their study of the Yang-
Mills path integral, Singer and Atiyah [41] developed a new, global formulation of the
anomaly. In so doing, they resolved some outstanding issues in quantum physics, and
inspired a generation of physicists to study new techniques of differential geometry. On
the mathematical side, this inaugurated the study of the topology and geometry of certain
infinite-dimensional spaces. Fine and Fine [42] present the history of the anomaly and the
attendant interplay between mathematics and physics in detail.
The profound influence of Yang-Mills on mathematics is ongoing. Attempts to rigorously
define the quantum Yang-Mills field theory continue. Moreover, there is a direct connection
between the mathematical formulation of the anomaly in Yang-Mills theory and the contem-
porary formulation of string theory, which, in turn, is a continuing source of mathematical
studies. String theory is also a descendant of general relativity: it is supposed to reduce
to general relativity in a classical limit, and is conjectured to reduce to higher-dimensional
supergravity in an appropriate parameter regime.
Topological quantum field theory is an area of active research in which trying to follow
ideas as they move between mathematics and physics is like watching a tennis match from
center court. After Donaldson developed his invariants by mathematically analyzing the
classical Yang-Mills equations, Atiyah conjectured there should be a quantum field theory in
which they arise naturally. In response, Witten wrote down a Lagrangian for a topologi-
cal Yang-Mills theory, followed shortly by a topological gravity theory. Atiyah and Jeffrey
promptly re-interpreted the topological Yang-Mills theory in terms of equivariant cohomology.
Since then, topological quantum field theories have led to new mathematical conjectures,
most of which have then been rigorously proven, and provided both mathematicians and
physicists with examples of path integrals which can be treated non-perturbatively.
Arguably, the most important influence of Yang and Mills' work has been bringing significant
portions of the mathematics and physics communities back into their historically
close contact, once again sharing a language and working on aspects of the same problems,
after a long period of divergence.
There is an interesting story in which Yang described his joy of comprehension of the
relation of physics and mathematics: [43]
"In 1975, impressed with the fact that gauge fields are connections on fiber bundles, I
drove to the house of Shiing-Shen Chern in El Cerrito, near Berkeley. ... I told him that
I had finally learned from Jim Simons the beauty of fiber-bundle theory and the profound
Chern-Weil theorem. I said I found it amazing that gauge fields are exactly connections on
fiber bundles, which the mathematicians developed without reference to the physical world.
I added, 'this is both thrilling and puzzling, since you mathematicians dreamed up these
concepts out of nowhere.' He immediately protested, 'No, no. These concepts were not
dreamed up. They were natural and real.'"

6 Perspectives on General Relativity and Gauge Theory


The terms "general relativity" and "gauge fields" are widely used and have their conventional
meanings. But these meanings are not fixed in terms of underlying physical ideas.
Rather, they show the evolution of ideas in physics research and also show how difficult it is for
physicists to change accepted names. The modern understanding of gauge fields turns
out to have nothing to do with scale invariance, as originally proposed by H. Weyl, but
has everything to do with the phase of a wave function, as pointed out by V. Fock and
F. London. [43]

Fock's Comments on General Relativity

"However, when Einstein created his theory of gravitation, he put forward the term 'general
relativity' which confused everything. This term was adopted in the sense of 'general
covariance,' i.e. in the sense of the covariance of equations with respect to arbitrary
transformations of coordinates accompanied by transformations of the $g_{\mu\nu}$. But we have seen
that this kind of covariance has nothing to do with the uniformity of space, while in one way
or another relativity is connected with uniformity. This means that 'general relativity' has
nothing to do with 'relativity as such.' At the same time the latter received the name 'special
relativity,' which purports to indicate that it is a special case of 'general relativity.'" [44]

Fock's criticism is constructive because it helps to clarify the relation, or the lack of
relation, between special relativity and general relativity. In special relativity, as discussed
by Lorentz, Poincaré, Einstein and Minkowski, spacetime is flat. However, in Einstein's
general relativity, spacetime is curved, so that he could introduce the physical effects
of gravity into the Riemannian curvature tensor. Perhaps the idea of a generalization of special
relativity first came to mind when he was thinking about generalizing physical laws from
inertial frames to non-inertial frames with arbitrary velocities. [4] This does not justify
the term "general relativity" in the sense of frame-independence, because the accelerated
transformations in (3) and (4) show that there is no relativity between inertial and accelerated
frames. Note that the curvature of spacetime in these accelerated frames must still
be zero. Nevertheless, the terms reveal the continuity of Einstein's reasoning after 1905.

Synge's Comments on the Principle of Equivalence

J. L. Synge made a constructive criticism concerning the Principle of Equivalence: [45]

"... Perhaps they speak of the Principle of Equivalence. If so, it is my turn to have
a blank mind, I have never been able to understand this principle. Does it mean that the
signature of the space-time metric is +2 (or -2 if you prefer the other convention)? If so,
it is important, but hardly a Principle. Does it mean that the effects of a gravitational
field are indistinguishable from the effects of an observer's acceleration? If so, it is false.
In Einstein's theory, either there is a gravitational field or there is none, according as the
Riemann tensor does not or does vanish. This is an absolute property; it has nothing to
do with any observer's world-line. Space-time is either flat or curved, and in several places
in the book I have been at considerable pains to separate truly gravitational effects due to
curvature of space-time from those due to curvature of the observer's world-line (in most
ordinary cases the latter predominate). The Principle of Equivalence performed the essential
office of midwife at the birth of general relativity, but, as Einstein remarked, the infant would
never have got beyond its long-clothes had it not been for Minkowski's concept. I suggest
that the midwife be now buried with appropriate honours and the facts of absolute space-time
faced."

As noted above, the Riemann curvature tensor of the spacetime of accelerated frames
characterized by the transformations (4) vanishes, just as in inertial frames. Thus, the
physical effects related to these accelerated transformations have nothing to do with the
gravitational field.
The physics in non-inertial frames deserves more theoretical and experimental investigation.
Clearly, one cannot be content with the understanding of physics only in the
usual inertial frames, which is the basic framework for the standard models and particle
physics. The Lorentz and Poincaré transformations are linear and carry the whole arithmetic
spacetime into itself. The accelerated transformations, such as those of Møller and the
Wu transformations, are nonlinear, and they carry only a portion of spacetime in an accelerated
frame $F$ to the whole spacetime in an inertial frame $F_I$. The notion of a pseudo-group was
discussed by Veblen and Whitehead in order to deal with transformations which carry
the whole space into portions of space. A set of transformations is called a pseudo-group
if it satisfies the conditions: (i) If the resultant of two transformations in the set exists, it
is also in the set. (ii) The set contains the inverse of each transformation in the set. Thus,
it is possible that the concept of a pseudo-group could become relevant in both the differential
geometry and the physics of general accelerated transformations of spacetime. [46]

Gauge Theory of Gravity, Yang-Mills Gravity and Translation Symmetry

In early 1950, Einstein said that he was not sure whether differential geometry was to
be the framework for further progress, but if it was then he believed he was on the right
track. [47] Indeed, in the past 50 years, most researchers in this area have followed Einstein's
approach to investigate gauge theories of gravity, based on Riemannian geometry and a
gauge symmetry group in curved spacetime. [48] However, the difficulties related to the
ultraviolet divergence and the energy-momentum tensor in Einstein's theory have not been
overcome. These problems are probably related to the fact that Einstein's approach to the
gravitational field is characterized by a drastic departure from the grand tradition of all
classical and quantum fields, as stressed by Dyson. [31] It appears that the framework of
Riemannian geometry is too general for field theory, in the sense that once a gauge symmetry
is introduced in it, the gauge symmetry, however powerful it may be, cannot harness the
ultraviolet divergence. In contrast, we know that gauge symmetry in Yang-Mills theory can
exercise its full power to harness ultraviolet divergence within the usual framework of local
field theory based on flat spacetime. This appears to be the secret essence of the success
of unified electroweak theory and quantum chromodynamics.
Thus, a burning question is: Is it possible to realize a union of Einstein's theory and
Yang-Mills theory to overcome the divergence difficulties and to understand gravitational
experiments? Two key features for connecting Einstein's theory to experiments and
observations are the field equation (involving a spacetime curvature tensor) and the
Einstein-Grossmann metric $g_{\mu\nu}\,dx^\mu dx^\nu$, which is essential for the motion of
classical objects and light rays. Similarly, there are also two basic features of Yang-Mills
theory: an action involving quadratic gauge curvature with a symmetry group, and an
underlying flat spacetime.
In view of the divergence difficulty associated with curved spacetime, it is worthwhile to
consider a Yang-Mills gravity [49] which is characterized by: (a) an action with quadratic
gauge curvature with translational gauge symmetry, (b) flat spacetime, and (c) an effective
Einstein-Grossmann metric. This is interesting because the external spacetime translation
gauge symmetry naturally leads to an effective Einstein-Grossmann metric, provided the
external symmetry group is implemented in the action according to the Yang-Mills approach
for internal gauge groups. Furthermore, the Yang-Mills theory has the advantage of
connecting translation gauge symmetry to the conserved energy-momentum tensor, which is the
source of the gravitational field. Table 1 shows a comparison of key features of Yang-Mills
theory, Yang-Mills gravity, gauge theory of gravity and Einstein's theory. [50]
In the bundle language, Einstein's gravitational potential is a metric which determines
the Levi-Civita connection on the tangent (vector) bundle, while Yang-Mills potentials are
connections on principal fiber bundles. Thus, Einstein's gravitational field is not exactly
the same as the Yang-Mills field. In this context, it is worth noting that Ashtekar's approach
to loop quantum gravity actually does formulate Einstein's gravitational theory as an SU(2)
Yang-Mills theory defined on the principal fiber bundle associated to the frame bundle. The
dynamical variable is no longer the metric but the vierbein and a spin connection. [51]
In the past 100 years, the track record of research in the fundamental physics of
particles and fields could perhaps be briefly summarized as follows:

Local quantum fields with gauge symmetry appear to be the most effective key at hand to
unlock the secret of the basic interactions of quantum particles and, hence, the last mystery
of quantum gravity. The concepts of Regge poles, current algebra and superconvergence, once
very hot, turn out to be not viable. Geometrization of all physical fields based on
Riemann-Cartan or Finsler geometry is still a dream, while supersymmetric string theory and
loop quantum gravity are only visions.
Table 1. A comparison of key features of gauge fields with internal and various external spacetime symmetries

Theory                          Spacetime              Maximum coupling
Yang-Mills Theory               Minkowski spacetime    4-vertex
Yang-Mills Gravity              Flat spacetime         4-vertex
Gauge Theory of Gravity         Curved spacetime       ∞-vertex
Einstein's Theory of Gravity    Curved spacetime       ∞-vertex*

[Table rows for the field variables, symmetry groups and generators, gauge covariant
derivatives, gauge curvatures and Lagrangians are illegible in the source.]

Notes: Yang-Mills theory is formulated in inertial frames. Yang-Mills gravity holds in
inertial and non-inertial frames with the metric tensor P_{μν}, where D_μ is the covariant
derivative w.r.t. the Levi-Civita connection in flat spacetime.

References
[1] S. G. Benka, Physics Today, Jan. 2005, p. 10.
[2] It is more reasonable to view the creation of the celebrated theory of special relativity as
both a cause and an effect of scientific progress, through the collective effort of Lorentz
(1904), Poincaré (1905), Einstein (1905) and Minkowski (1908).
[3] J. Ishiwara, Einstein Koen-Roku (Tokyo-Tosho, Tokyo, 1977). See also A. Pais, Subtle
is the Lord ... (Oxford Univ. Press, 1982) p. 179.
[4] A. Einstein, paper C in Chapter 1 of this volume.
[5] C. N. Yang, Talking about Physics Research and Teaching (The 5th talk at the
Graduate School of the Chinese University of Science and Technology, Beijing, China),
(1986/5/27 - 1986/6/12). "This kind of repeated failure at some seemingly good idea
is, of course, a common experience for all research workers," he said later; see Chen
Ning Yang, Selected Papers 1945-1980 With Commentary (W. H. Freeman and Comp.,
1983) p. 19. Robert Mills, Am. J. Phys. 57, 493 (1989).
[6] It was related to the existence of the third term $\epsilon_{abc} b^b_\mu b^c_\nu$ in the field
strength $f^a_{\mu\nu}$.
[7] I. Newton, paper A in Chapter 1 of this volume. Euclid's ELEMENTS started with the
definitions of point, line, etc., followed by 5 postulates, while Newton's PRINCIPIA
started with the definitions of mass, kinetic energy, ..., followed by the 3 laws of
motion.
[8] H. Poincaré, Rend. Circ. Mat. Palermo 21, 129 (1906), On the Dynamics of the
Electron, Section 9, Hypothesis Concerning Gravitational Force. Its English translation
in paper B in Chapter 1 is an extract from A. A. Logunov's book On the Articles by
Henri Poincaré ON THE DYNAMICS OF THE ELECTRON.
[9] The Collected Papers of Albert Einstein (Ed. J. Stachel, Princeton University Press,
1993).
[10] A. Pais, Subtle is the Lord ..., The Science and the Life of Albert Einstein (Oxford
Univ. Press, 1982), p. 212. Pais gave a lucid and detailed discussion of the
Einstein-Grossmann collaboration in Chapter 12 of his book. As far as the final result is
concerned, the Einstein-Grossmann collaboration resembles, perhaps, the discovery of the
structure of DNA in 1953 by a collaboration of physicist Francis Crick and biologist James
Watson. Grossmann's contribution and help to Einstein's theory deserves to be more
widely known. There exists a first page of Einstein's landmark paper The Foundation
of the General Theory of Relativity, in which he said: "Finally, I want to acknowledge
gratefully my friend, the mathematician Grossmann, whose help not only saved me the
effort of studying the pertinent mathematical literature, but who also helped me in my
search for the field equations of gravitation." See ref. 4. Unfortunately, this page was not
published together with Einstein's landmark paper. It is fitting that since 1975 there
has been a regular International Grossmann Conference on Gravity to commemorate
his contribution.
[11] D. Hilbert, paper C in Chapter 2 of this volume. In this paper, Hilbert already
attempted to unify electromagnetic and gravitational forces in his theory. Also, we note
that the addition of a cosmological term in the Einstein-Hilbert gravitational equation
will upset the elegant and unique invariant action for Einstein's theory of gravity and,
hence, is untenable from the viewpoint of symmetry.
[12] L. Corry, J. Renn and J. Stachel, Science 278, 1270-73 (1997); T. Sauer,
Archive for History of Exact Sciences 53, no. 6, 529-575 (1999);
echo.mpiwg-berlin.mpg.de/content/relativityrevolution/hilbert
[13] So far, the energy-momentum tensor in Einstein's equation is still not well understood.
[14] E. Cartan, paper D in Chapter 2 of this volume. For a more detailed discussion of
torsion in gauge theory, see J. M. Nester, in An Introduction to Kaluza-Klein Theories
(Ed. H. C. Lee, World Scientific, 1984) pp. 83-115.
[15] J. P. Hsu, Nuovo Cimento 109B, 645 (1994).
[16] F. Dyson, paper G in Chapter 6 of this volume, and Bull. Am. Math. Soc. 78, 635
(1972).
[17] See, for example, B. DeWitt, papers B and C in Chapter 6 of this volume. For a review
of quantum gravity, see E. Alvarez, Rev. Mod. Phys. 61, 561 (1989).
[18] F. Dyson, Sci. Am. 199, 74 (1958).
[19] The "rascal," as Einstein once called Pauli, was so extraordinary that Weyl said: "Pauli
combines in an exemplary way physical insight and mathematical skill." But one also
knew that, in 1925, Pauli did not believe that Kronig's new idea of the electron spin
had any connection with reality, and discouraged Kronig from publishing it. In the
same year, Pauli refused to collaborate with Born to develop matrix mechanics.
[20] C. P. Enz, No Time to be Brief, A Scientific Biography of Wolfgang Pauli (Oxford
Univ. Press, 2002) pp. 481-482. Aside from Pauli, Ronald Shaw also discussed a possible
generalization of gauge invariance in his Ph.D. thesis (1954, unpublished); see R. Mills,
ref. 5.
[21] This belief resembles Voigt's belief when he published his paper on Doppler effects in
1887. Without having any experimental evidence, Voigt postulated (i) the invariance
of the laws for the propagation of the light wave (in aether) and (ii) the universal
constant of the speed of light to derive (conformal) 4-dimensional space and time
transformations and the Doppler effects. Such a Voigt transformation differs from the
Lorentz transformation by an overall constant. Based on Doppler effects, Voigt obtained
an approximate relativistic time and presented the first challenge to Newton's absolute
time about 20 years before the discovery of special relativity. See W. Voigt, Nachr.
Ges. Wiss. Goettingen 41 (1887). For an English translation, see A. Ernst and J. P.
Hsu, Chin. J. Phys. 39, 211 (2001).
[22] Article B of Appendices in this volume.
[23] Jong-Ping Hsu and Leonardo Hsu, UMassD preprint (2005).
[24] R. Utiyama, paper C in Chapter 4 of this volume.
[25] See, for example, papers in Chapter 7 of this volume. For the gauge theory with the
Poincaré group, see J. M. Nester, in An Introduction to Kaluza-Klein Theories (Ed. H.
C. Lee, World Scientific, 1984) pp. 83-115; F. W. Hehl, P. von der Heyde, G. D. Kerlick
and J. M. Nester, Rev. Mod. Phys. 48, 393 (1976).
[26] J. J. Sakurai, Ann. of Phys. 11, 1 (1960).
[27] P. W. Higgs, Phys. Rev. Lett. 12, 132 (1964). The Higgs mechanism for introducing
masses to gauge bosons is related to spontaneous symmetry breaking, which is only
an apparent breaking of symmetry because the essence of gauge symmetry is still
preserved. This mechanism is crucial for Weinberg to construct explicitly a celebrated
lepton model to unify the electromagnetic and weak interactions, and to show the power
of Yang-Mills fields through the astonishing predictions of gauge-boson masses and
others. All Weinberg's predictions were confirmed by experiments later. This is one of
the greatest achievements of 20th century science. Glashow and Salam also had
similar ideas.
[28] B. DeWitt, papers B and C in Chapter 6 of this volume.
[29] L. D. Faddeev and V. N. Popov, paper D in Chapter 6 of this volume.
[30] E. S. Fradkin and I. V. Tyutin, paper F in Chapter 6 of this volume.
[31] F. Dyson, Bull. Am. Math. Soc. 78, 635 (1972). See also paper G in Chapter 6.
[32] S. Mandelstam, paper I3 in Chapter 6 of this volume.
[33] The ultraviolet divergence in local field theory may be the price to pay for the
idealization of physical particles as point particles. And renormalization with counter terms
may be considered as a way to compensate for the deficiency of the mathematics of local
functions.
[34] V. Fock, The Theory of Space, Time, and Gravitation (Pergamon Press, 1964, 2nd
Revised Edition, transl. by N. Kemmer), p. 207.
[35] C. Møller, paper A in Chapter 5 of this volume. T. Y. Wu and Y. C. Lee, paper C
in Chapter 5 of this volume. The Møller transformations can be generalized to include
a constant velocity so that they reduce to the Lorentz transformation in the limit of
zero acceleration. See Jong-Ping Hsu, Einstein's Relativity and Beyond: New Symmetry
Approaches (World Scientific, 2000) Ch. 21.
[36] J. P. Hsu and L. Hsu, papers D and E in Chapter 5 of this volume. The Wu
transformation is named to honor Ta-You Wu's idea of a kinematic approach to finding an
accelerated transformation within the framework of spacetime with a vanishing Riemann
curvature tensor. The time w in the Wu transformation (4) can be synchronized
by a set of computerized clocks. See J. P. Hsu, ref. 35, pp. 289-290.
[37] D. Fine and J. P. Hsu, preprint (UMassD, 2005).
[38] T. T. Wu and C. N. Yang, Phys. Rev. D12, 3845 (1975); paper A in Chapter 10 of
this volume. In order to understand the geometrical meaning of gauge theory, Yang
asked his colleague J. Simons. Simons told Yang that gauge theory is related to
connections on fiber bundles. Yang tried to read books on fiber bundles and felt that
modern mathematical language is too dull and too abstract for physicists. In early
1975, Yang invited Simons to give lunch talks on differential forms and fiber bundles.
These talks helped Yang and others to understand the mathematical meaning of Dirac's
quantization rules, magnetic monopoles and the Chern-Weil theorem. Simons' talks
stimulated Wu and Yang to write the paper in ref. 38. See D. Z. Zhang, "C. N. Yang
and Contemporary Mathematics," Mathematical Intelligencer 15, no. 4 (1993).
[39] M. F. Atiyah, N. J. Hitchin and I. M. Singer, Proc. Roy. Soc. London Ser. A 362,
425-461 (1978); M. F. Atiyah, V. Drinfeld, N. J. Hitchin and Yu. I. Manin, Phys.
Lett. A 65, 185-187 (1978); M. F. Atiyah, N. Hitchin and I. M. Singer, Proc. Nat.
Acad. Sci. USA 74, 2662 (1977).
[40] S. K. Donaldson, Topology 29, 257-315 (1990); S. K. Donaldson, J. Differential Geom.
18, 279-315 (1983).
[41] M. F. Atiyah and I. M. Singer, Proc. Nat. Acad. Sci. U.S.A. 81, 2597-2600 (1984).
[42] A. Fine and D. Fine, Studies in History and Philosophy of Modern Physics 28B, no. 2,
307-323 (1997).
[43] V. Fock, Z. Phys. 39, 226 (1927); F. London, Z. Phys. 42, 375 (1927). For a review of the
history of the idea of gauge symmetry, see Chen-Ning Yang, Phys. Today, June 1980,
pp. 42-49; or in JingShin Theoretical Physics Symposium in Honor of Prof. Ta-You Wu
(Ed. J. P. Hsu and L. Hsu, World Scientific, 1998) pp. 61-71.
[44] V. Fock, The Theory of Space, Time, and Gravitation (Pergamon Press, 1964, 2nd
Revised Edition, transl. by N. Kemmer), p. xvii.
[45] J. L. Synge, Relativity: The General Theory (North-Holland, 1966), Preface.
[46] O. Veblen and J. H. C. Whitehead, The Foundations of Differential Geometry
(Cambridge Univ. Press, 1953) pp. 37-38.
[47] A. Pais, Subtle is the Lord ..., The Science and the Life of Albert Einstein (Oxford Univ.
Press, 1982), p. 467.
[48] See papers C and D in Chapter 4 and those in Chapter 7 of this volume. Y. M. Cho,
Phys. Rev. D14, 3341 (1976).
[49] See paper D in Chapter 8 of this volume.
[50] See papers A in Chapter 4, E in Chapter 7, D in Chapter 8 and B in Chapter 2 of this
volume.
[51] A. Ashtekar, Phys. Rev. D 36, 1587 (1987); A. Ashtekar, Phys. Rev. Lett. 57, 2244
(1986); C. Rovelli and L. Smolin, Nucl. Phys. B 331, 80 (1990).
Chapter 1

The Dawn of Gravitation*

*I. Newton, H. Poincaré, A. Einstein


THE MATHEMATICAL PRINCIPLES

OF

NATURAL PHILOSOPHY.

ISAAC NEWTON
Translated by Andrew Motte

BOOK III.
IN the preceding Books I have laid down the principles of philosophy,
principles not philosophical, but mathematical; such, to wit, as we may
build our reasonings upon in philosophical inquiries. These principles are
the laws and conditions of certain motions, and powers or forces, which
chiefly have respect to philosophy; but, lest they should have appeared of
themselves dry and barren, I have illustrated them here and there with
some philosophical scholiums, giving an account of such things as are of
more general nature, and which philosophy seems chiefly to be founded on;
such as the density and the resistance of bodies, spaces void of all bodies,
and the motion of light and sounds. It remains that, from the same
principles, I now demonstrate the frame of the System of the World. Upon
this subject I had, indeed, composed the third Book in a popular method,
that it might be read by many; but afterward, considering that such as
had not sufficiently entered into the principles could not easily discern the
strength of the consequences, nor lay aside the prejudices to which they had
been many years accustomed, therefore, to prevent the disputes which might
be raised upon such accounts, I chose to reduce the substance of this Book
into the form of Propositions (in the mathematical way), which should be
read by those only who had first made themselves masters of the principles
established in the preceding Books: not that I would advise any one to the
previous study of every Proposition of those Books; for they abound with
such as might cost too much time, even to readers of good mathematical
learning. It is enough if one carefully reads the Definitions, the Laws of
Motion, and the first three Sections of the first Book. He may then pass
on to this Book, and consult such of the remaining Propositions of the
first two Books, as the references in this, and his occasions, shall require.

RULES OF REASONING IN PHILOSOPHY.

RULE I.
We are to admit no more causes of natural things than such as are both
true and sufficient to explain their appearances.
To this purpose the philosophers say that Nature does nothing in vain,
and more is in vain when less will serve; for Nature is pleased with
simplicity, and affects not the pomp of superfluous causes.
RULE II.
Therefore to the same natural effects we must, as far as possible, assign
the same causes.
As to respiration in a man and in a beast; the descent of stones in Europe
and in America; the light of our culinary fire and of the sun; the
reflection of light in the earth, and in the planets.
RULE III.
The qualities of bodies, which admit neither intension nor remission of
degrees, and which are found to belong to all bodies within the reach
of our experiments, are to be esteemed the universal qualities of all
bodies whatsoever.
For since the qualities of bodies are only known to us by experiments, we
are to hold for universal all such as universally agree with experiments;
and such as are not liable to diminution can never be quite taken away.
We are certainly not to relinquish the evidence of experiments for the sake
of dreams and vain fictions of our own devising; nor are we to recede from
the analogy of Nature, which uses to be simple, and always consonant to
itself. We no other way know the extension of bodies than by our senses,
nor do these reach it in all bodies; but because we perceive extension in
all that are sensible, therefore we ascribe it universally to all others also.
That abundance of bodies are hard, we learn by experience; and because
the hardness of the whole arises from the hardness of the parts, we therefore
justly infer the hardness of the undivided particles not only of the bodies
we feel but of all others. That all bodies are impenetrable, we gather not
from reason, but from sensation. The bodies which we handle we find
impenetrable, and thence conclude impenetrability to be an universal property
of all bodies whatsoever. That all bodies are moveable, and endowed with
certain powers (which we call the vires inertiæ) of persevering in their
motion, or in their rest, we only infer from the like properties observed in the
bodies which we have seen. The extension, hardness, impenetrability,
mobility, and vis inertiæ of the whole, result from the extension, hardness,
impenetrability, mobility, and vires inertiæ of the parts; and thence we
conclude the least particles of all bodies to be also all extended, and hard
and impenetrable, and moveable, and endowed with their proper vires inertiæ.
And this is the foundation of all philosophy. Moreover, that the divided
but contiguous particles of bodies may be separated from one another, is
matter of observation; and, in the particles that remain undivided, our
minds are able to distinguish yet lesser parts, as is mathematically
demonstrated. But whether the parts so distinguished, and not yet divided, may,
by the powers of Nature, be actually divided and separated from one
another, we cannot certainly determine. Yet, had we the proof of but one
experiment that any undivided particle, in breaking a hard and solid body,
suffered a division, we might by virtue of this rule conclude that the
undivided as well as the divided particles may be divided and actually
separated to infinity.
Lastly, if it universally appears, by experiments and astronomical
observations, that all bodies about the earth gravitate towards the earth, and
that in proportion to the quantity of matter which they severally contain;
that the moon likewise, according to the quantity of its matter, gravitates
towards the earth; that, on the other hand, our sea gravitates towards the
moon; and all the planets mutually one towards another; and the comets
in like manner towards the sun; we must, in consequence of this rule,
universally allow that all bodies whatsoever are endowed with a principle of
mutual gravitation. For the argument from the appearances concludes with
more force for the universal gravitation of all bodies than for their
impenetrability; of which, among those in the celestial regions, we have no
experiments, nor any manner of observation. Not that I affirm gravity to be
essential to bodies: by their vis insita I mean nothing but their vis inertiæ.
This is immutable. Their gravity is diminished as they recede from the
earth.
RULE IV.
In experimental philosophy we are to look upon propositions collected by
general induction from phænomena as accurately or very nearly true,
notwithstanding any contrary hypotheses that may be imagined, till
such time as other phænomena occur, by which they may either be made
more accurate, or liable to exceptions.
This rule we must follow, that the argument of induction may not be
evaded by hypotheses.

PHÆNOMENA, OR APPEARANCES.

PHÆNOMENON I.
That the circumjovial planets, by radii drawn to Jupiter's centre,
describe areas proportional to the times of description; and that their
periodic times, the fixed stars being at rest, are in the sesquiplicate
proportion of their distances from its centre.
This we know from astronomical observations. For the orbits of these
planets differ but insensibly from circles concentric to Jupiter; and their
motions in those circles are found to be uniform. And all astronomers
agree that their periodic times are in the sesquiplicate proportion of the
semi-diameters of their orbits; and so it manifestly appears from the
following table.
The periodic times of the satellites of Jupiter.
1d. 18h. 27' 34".   3d. 13h. 13' 42".   7d. 3h. 42' 36".   16d. 16h. 32' 9".
The distances of the satellites from Jupiter's centre.
From the observations of                     1        2        3        4
Townly by the Microm. . . . . . . . . .    5,52     8,78     ...      ...
Cassini by the Telescope  . . . . . . .    ...      ...      ...      ...
Cassini by the eclip. of the satel. . .    ...      ...      ...      ...
From the periodic times . . . . . . . .    5,667    9,017    ...      ...
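
As an illustrative check of the sesquiplicate proportion (periodic time varying as the 3/2 power of the distance), the short Python sketch below recomputes the distances from the periodic times. It is a modern aid and not part of Newton's text; the function names are hypothetical, and the choice of the first satellite's distance, 5,667, as the scale is an assumption made here.

```python
# Sketch: distances of Jupiter's satellites recovered from their periodic
# times, assuming T varies as a^(3/2), i.e. a varies as T^(2/3).

def days(d, h, m, s):
    """Convert days, hours, minutes, seconds to decimal days."""
    return d + h / 24 + m / 1440 + s / 86400

# Periodic times of the four satellites, as listed in the table.
periods = [
    days(1, 18, 27, 34),
    days(3, 13, 13, 42),
    days(7, 3, 42, 36),
    days(16, 16, 32, 9),
]

# Assumption: the first satellite's distance (5,667 in the table) sets the scale.
A1 = 5.667

def distances_from_periods(periods, a1):
    """Scale each distance by (T / T1)^(2/3) relative to the first satellite."""
    t1 = periods[0]
    return [a1 * (t / t1) ** (2 / 3) for t in periods]

distances = distances_from_periods(periods, A1)
for number, a in enumerate(distances, start=1):
    print(f"satellite {number}: {a:.3f}")
```

The value obtained for the second satellite can be compared with the entry 9,017 in the row derived from the periodic times.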
Mr. Pound has determined, by the help of excellent micrometers, the
diameters of Jupiter and the elongation of its satellites after the following
manner. The greatest heliocentric elongation of the fourth satellite from
Jupiter's centre was taken with a micrometer in a 15 feet telescope, and at
the mean distance of Jupiter from the earth was found about 8' 16". The
elongation of the third satellite was taken with a micrometer in a telescope
of 123 feet, and at the same distance of Jupiter from the earth was found
4' 42". The greatest elongations of the other satellites, at the same
distance of Jupiter from the earth, are found from the periodic times to be
2' 56" 47"', and 1' 51" 6"'.
The diameter of Jupiter taken with the micrometer in a 123 feet
telescope several times, and reduced to Jupiter's mean distance from the earth,
proved always less than 40", never less than 38", generally 39". This
diameter in shorter telescopes is 40", or 41"; for Jupiter's light is a little
dilated by the unequal refrangibility of the rays, and this dilatation bears
a less ratio to the diameter of Jupiter in the longer and more perfect
telescopes than in those which are shorter and less perfect. The times in
which two satellites, the first and the third, passed over Jupiter's body, were
observed, from the beginning of the ingress to the beginning of the egress,
and from the complete ingress to the complete egress, with the long
telescope. And from the transit of the first satellite, the diameter of Jupiter
at its mean distance from the earth came forth 37 1/8"; and from the transit
of the third 37 3/8". There was observed also the time in which the shadow
of the first satellite passed over Jupiter's body, and thence the diameter of
Jupiter at its mean distance from the earth came out about 37". Let us
suppose its diameter to be 37 1/4" very nearly, and then the greatest
elongations of the first, second, third, and fourth satellite will be respectively
equal to 5,965, 9,494, 15,141, and 26,63 semi-diameters of Jupiter.
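
The conversion from angular elongations to semi-diameters of Jupiter is simple arithmetic, sketched below in Python as a modern illustration. The helper names are hypothetical. The readings are assumptions of this sketch: the diameter is taken as 37 1/4 seconds of arc and the elongations as 1' 51" 6"', 2' 56" 47"', 4' 42" and 8' 16", where a third is one sixtieth of a second.

```python
# Sketch: express each greatest elongation in semi-diameters of Jupiter.

def arcsec(minutes=0, seconds=0, thirds=0):
    """Convert minutes, seconds and thirds of arc to seconds of arc."""
    return minutes * 60 + seconds + thirds / 60

# Assumed diameter of Jupiter at its mean distance, in seconds of arc.
JUPITER_DIAMETER = 37.25

# Assumed greatest elongations of the four satellites.
elongations = {
    "first": arcsec(1, 51, 6),
    "second": arcsec(2, 56, 47),
    "third": arcsec(4, 42),
    "fourth": arcsec(8, 16),
}

def in_semidiameters(elongation, diameter=JUPITER_DIAMETER):
    """Divide an elongation by half the diameter of the planet."""
    return elongation / (diameter / 2)

ratios = {name: in_semidiameters(e) for name, e in elongations.items()}
for name, value in ratios.items():
    print(f"{name} satellite: {value:.3f} semi-diameters")
```

The results agree closely with the figures 5,965, 9,494, 15,141 and 26,63 quoted in the text; the second differs slightly in the third decimal place.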

PHÆNOMENON II.
That the circumsaturnal planets, by radii drawn to Saturn's centre,
describe areas proportional to the times of description; and that their
periodic times, the fixed stars being at rest, are in the sesquiplicate
proportion of their distances from its centre.
For, as Cassini from his own observations has determined, their
distances from Saturn's centre and their periodic times are as follow.
The periodic times of the satellites of Saturn.
1d. 21h. 18' 27".  2d. 17h. 41' 22".  4d. 12h. 25' 12".  15d. 22h. 41' 14".
79d. 7h. 48' 00".
The distances of the satellites from Saturn's centre, in semi-diameters of
its ring.
From observations . . . . .   1 19/20.  2 1/2.  3 1/2.  8.  24.
From the periodic times . .   1,93.     2,47.   3,45.   8.  23,35.
The greatest elongation of the fourth satellite from Saturn's centre is
commonly determined from the observations to be eight of those
semi-diameters very nearly. But the greatest elongation of this satellite from
Saturn's centre, when taken with an excellent micrometer in Mr. Huygens'
telescope of 123 feet, appeared to be eight semi-diameters and 7/10 of a
semi-diameter. And from this observation and the periodic times the distances
of the satellites from Saturn's centre in semi-diameters of the ring are 2,1.
2,69. 3,75. 8,7. and 25,35. The diameter of Saturn observed in the same
telescope was found to be to the diameter of the ring as 3 to 7; and the
diameter of the ring, May 25-29, 1719, was found to be 43"; and thence
the diameter of the ring when Saturn is at its mean distance from the
earth is 42", and the diameter of Saturn 18". These things appear so in
very long and excellent telescopes, because in such telescopes the apparent
magnitudes of the heavenly bodies bear a greater proportion to the
dilatation of light in the extremities of those bodies than in shorter telescopes.
If we, then, reject all the spurious light, the diameter of Saturn will not
amount to more than 16".

PHÆNOMENON III.
That the five primary planets, Mercury, Venus, Mars, Jupiter, and
Saturn, with their several orbits, encompass the sun.
That Mercury and Venus revolve about the sun, is evident from their
moon-like appearances. When they shine out with a full face, they are, in
respect of us, beyond or above the sun; when they appear half full, they
are about the same height on one side or other of the sun; when horned,
they are below or between us and the sun; and they are sometimes, when
directly under, seen like spots traversing the sun's disk. That Mars
surrounds the sun, is as plain from its full face when near its conjunction with
the sun, and from the gibbous figure which it shews in its quadratures.
And the same thing is demonstrable of Jupiter and Saturn, from their
appearing full in all situations; for the shadows of their satellites that appear
sometimes upon their disks make it plain that the light they shine with is
not their own, but borrowed from the sun.

PHÆNOMENON IV.
That the fixed stars being at rest, the periodic times of the five primary
planets, and (whether of the sun about the earth, or) of the earth about
the sun, are in the sesquiplicate proportion of their mean distances
from the sun.
This proportion, first observed by Kepler, is now received by all
astronomers; for the periodic times are the same, and the dimensions of the orbits
are the same, whether the sun revolves about the earth, or the earth about
the sun. And as to the measures of the periodic times, all astronomers are
agreed about them. But for the dimensions of the orbits, Kepler and
Bullialdus, above all others, have determined them from observations with the
greatest accuracy; and the mean distances corresponding to the periodic
times differ but insensibly from those which they have assigned, and for
the most part fall in between them; as we may see from the following table.
The periodic times with respect to the fixed stars, of the planets and earth
revolving about the sun, in days and decimal parts of a day.
    ♄            ♃           ♂           ♁           ♀           ☿
10759,275.   4332,514.   686,9785.   365,2565.   224,6176.   87,9692.
The mean distances of the planets and of the earth from the sun.
                                    ♄        ♃        ♂        ♁       ♀       ☿
According to Kepler . . . . . .   951000.  519650.  152360.  100000.  72400.  38806.
    "     to Bullialdus . . . .   954195.  522520.  152350.  100000.  72395.  38585.
    "     to the periodic times   954006.  520096.  152369.  100000.  72333.  38710.
As to Mercury and Venus, there can be no doubt about their distances
from the sun; for they are determined by the elongations of those planets
from the sun; and for the distances of the superior planets, all dispute is
cut off by the eclipses of the satellites of Jupiter. For by those eclipses
the position of the shadow which Jupiter projects is determined; whence
we have the heliocentric longitude of Jupiter. And from its
heliocentric and geocentric longitudes compared together, we determine its
distance.
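
One modern way to test the sesquiplicate proportion against observed distances, rather than distances derived from the periods, is to fit the exponent directly. The Python sketch below (not part of Newton's text) fits a straight line to the logarithm of the periodic time against the logarithm of Kepler's mean distance; an exponent near 3/2 is what the Phænomenon asserts. The function name is hypothetical, and the period of Mars is read here as 686,9785 days, which is an assumption where the printed figure is unclear.

```python
import math

# (planet, periodic time in days, mean distance according to Kepler)
# Assumption: the Mars period is read as 686.9785 days.
DATA = [
    ("Saturn", 10759.275, 951000),
    ("Jupiter", 4332.514, 519650),
    ("Mars", 686.9785, 152360),
    ("Earth", 365.2565, 100000),
    ("Venus", 224.6176, 72400),
    ("Mercury", 87.9692, 38806),
]

def loglog_slope(distances, periods):
    """Least-squares slope of log(period) against log(distance)."""
    xs = [math.log(a) for a in distances]
    ys = [math.log(t) for t in periods]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

slope = loglog_slope(
    [a for _, _, a in DATA],
    [t for _, t, _ in DATA],
)
print(f"fitted exponent: {slope:.3f} (sesquiplicate proportion: 1.5)")
```

A log-log fit is used because a power law becomes a straight line in logarithmic coordinates, so the exponent can be read off as the slope without fixing any unit of distance.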
PHÆNOMENON V.
That the primary planets, by radii drawn to the earth, describe areas no
wise proportional to the times; but that the areas which they describe
by radii drawn to the sun are proportional to the times of
description.
For to the earth they appear sometimes direct, sometimes stationary,
nay, and sometimes retrograde. But from the sun they are always seen
direct, and to proceed with a motion nearly uniform, that is to say, a little
swifter in the perihelion and a little slower in the aphelion distances, so as
to maintain an equality in the description of the areas. This is a noted
proposition among astronomers, and particularly demonstrable in Jupiter,
from the eclipses of his satellites; by the help of which eclipses, as we have
said, the heliocentric longitudes of that planet, and its distances from the
sun, are determined.
PHÆNOMENON VI.
That the moon, by a radius drawn to the earth's centre, describes an area
proportional to the time of description.
This we gather from the apparent motion of the moon, compared with
its apparent diameter. It is true that the motion of the moon is a little
disturbed by the action of the sun: but in laying down these Phænomena
I neglect those small and inconsiderable errors.

PROPOSITIONS.
PROPOSITION I. THEOREM I.
That the forces by which the circumjovial planets are continually drawn
off from rectilinear motions, and retained in their proper orbits, tend
to Jupiter's centre; and are reciprocally as the squares of the distances
of the places of those planets from that centre.
The former part of this Proposition appears from Phæn. I, and Prop.
II or III, Book I; the latter from Phæn. I, and Cor. 6, Prop. IV, of the same
Book.
The same thing we are to understand of the planets which encompass
Saturn, by Phæn. II.

PROPOSITION II. THEOREM II.

That the forces by which the primary planets are continually drawn off
from rectilinear motions, and retained in their proper orbits, tend to
the sun; and are reciprocally as the squares of the distances of the
places of those planets from the sun's centre.
The former part of the Proposition is manifest from Phæn. V, and
Prop. II, Book I; the latter from Phæn. IV, and Cor. 6, Prop. IV, of the
same Book. But this part of the Proposition is, with great accuracy,
demonstrable from the quiescence of the aphelion points; for a very small
aberration from the reciprocal duplicate proportion would (by Cor. 1, Prop.
XLV, Book I) produce a motion of the apsides sensible enough in every
single revolution, and in many of them enormously great.

PROPOSITION III. THEOREM III.

That the force by which the moon is retained in its orbit tends to the
earth; and is reciprocally as the square of the distance of its place
from the earth's centre.
The former part of the Proposition is evident from Phæn. VI, and Prop.
II or III, Book I; the latter from the very slow motion of the moon's apogee;
which in every single revolution amounting but to 3° 3′ in consequentia,
may be neglected. For (by Cor. 1, Prop. XLV, Book I) it appears,
that, if the distance of the moon from the earth's centre is to the
semi-diameter of the earth as D to 1, the force, from which such a motion
will result, is reciprocally as $D^{2\frac{4}{243}}$, i. e., reciprocally as the power of D,
whose exponent is 2 4/243; that is to say, in the proportion of the distance
something greater than reciprocally duplicate, but which comes 59¾ times
nearer to the duplicate than to the triplicate proportion. But in regard
that this motion is owing to the action of the sun (as we shall afterwards
shew), it is here to be neglected. The action of the sun, attracting the
moon from the earth, is nearly as the moon's distance from the earth; and
therefore (by what we have shewed in Cor. 2, Prop. XLV, Book I) is to the
centripetal force of the moon as 2 to 357.45, or nearly so; that is, as 1 to
178 29/40. And if we neglect so inconsiderable a force of the sun, the
remaining force, by which the moon is retained in its orb, will be reciprocally
as $D^2$. This will yet more fully appear from comparing this force
with the force of gravity, as is done in the next Proposition.
COR. If we augment the mean centripetal force by which the moon is
retained in its orb, first in the proportion of 177 29/40 to 178 29/40, and then in
the duplicate proportion of the semi-diameter of the earth to the mean distance
of the centres of the moon and earth, we shall have the centripetal
force of the moon at the surface of the earth; supposing this force, in
descending to the earth's surface, continually to increase in the reciprocal
duplicate proportion of the height.
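The numerical statements of this Proposition can be checked with exact fractions. The sketch below assumes the exponent 2 4/243 and the ratio 2 to 357.45 as the intended readings; the names `closeness_ratio`, `SOLAR_TO_CENTRIPETAL` and `COROLLARY_FACTOR` are hypothetical and serve only the illustration.

```python
from fractions import Fraction

# Exponent of D inferred from the motion of the apogee (assumed reading: 2 4/243).
EXPONENT = 2 + Fraction(4, 243)


def closeness_ratio(p):
    """How many times nearer p lies to 2 (duplicate) than to 3 (triplicate)."""
    return (3 - p) / (p - 2)


# Action of the sun compared with the centripetal force of the moon: 2 to 357.45.
SOLAR_TO_CENTRIPETAL = Fraction(2) / Fraction("357.45")

# Proportion used in the Corollary: 177 29/40 to 178 29/40.
COROLLARY_FACTOR = (177 + Fraction(29, 40)) / (178 + Fraction(29, 40))

if __name__ == "__main__":
    print("times nearer to the duplicate:", closeness_ratio(EXPONENT))
    print("sun : moon's centripetal force = 1 :", 1 / SOLAR_TO_CENTRIPETAL)
    print("Corollary proportion:", COROLLARY_FACTOR)
```

With these readings the exponent is 239/4 = 59¾ times nearer to 2 than to 3, and 2 to 357.45 reduces to 1 to 178 29/40, in agreement with the figures quoted in the text.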

PROPOSITION IV. THEOREM IV.

That the moon gravitates towards the earth, and by the force of gravity
is continually drawn off from a rectilinear motion, and retained in
its orbit.
The mean distance of the moon from the earth in the syzygies in
semi-diameters of the earth, is, according to Ptolemy and most astronomers,
59; according to Vendelin and Huygens, 60; to Copernicus, 60⅓; to
Street, 60⅖; and to Tycho, 56½. But Tycho, and all that follow his tables
of refraction, making the refractions of the sun and moon (altogether
against the nature of light) to exceed the refractions of the fixed stars, and
that by four or five minutes near the horizon, did thereby increase the
moon's horizontal parallax by a like number of minutes, that is, by a
twelfth or fifteenth part of the whole parallax. Correct this error, and
the distance will become about 60½ semi-diameters of the earth, near to
what others have assigned. Let us assume the mean distance of 60
semi-diameters in the syzygies; and suppose one revolution of the moon, in respect
of the fixed stars, to be completed in 27d. 7h. 43′, as astronomers have
determined; and the circumference of the earth to amount to 123249600
Paris feet, as the French have found by mensuration. And now if we
imagine the moon, deprived of all motion, to be let go, so as to descend
towards the earth with the impulse of all that force by which (by Cor.
Prop. III) it is retained in its orb, it will in the space of one minute of time,
describe in its fall 15 1/12 Paris feet. This we gather by a calculus, founded
either upon Prop. XXXVI, Book I, or (which comes to the same thing)
upon Cor. 9, Prop. IV, of the same Book. For the versed sine of that arc,
which the moon, in the space of one minute of time, would by its mean
motion describe at the distance of 60 semi-diameters of the earth, is nearly
15 1/12 Paris feet, or more accurately 15 feet, 1 inch, and 1 line 4/9. Wherefore,
since that force, in approaching to the earth, increases in the reciprocal
duplicate proportion of the distance, and, upon that account, at the
surface of the earth, is 60 × 60 times greater than at the moon, a body
in our regions, falling with that force, ought in the space of one minute of
time, to describe 60 × 60 × 15 1/12 Paris feet; and, in the space of one second
of time, to describe 15 1/12 of those feet; or more accurately 15 feet, 1
inch, and 1 line 4/9. And with this very force we actually find that bodies
here upon earth do really descend; for a pendulum oscillating seconds in
the latitude of Paris will be 3 Paris feet, and 8 lines ½ in length, as Mr.
Huygens has observed. And the space which a heavy body describes
by falling in one second of time is to half the length of this pendulum in
the duplicate ratio of the circumference of a circle to its diameter (as Mr.
Huygens has also shewn), and is therefore 15 Paris feet, 1 inch, 1 line 7/9.
And therefore the force by which the moon is retained in its orbit becomes,
at the very surface of the earth, equal to the force of gravity which we
observe in heavy bodies there. And therefore (by Rule I and II) the force by
which the moon is retained in its orbit is that very same force which we
commonly call gravity; for, were gravity another force different from that,
then bodies descending to the earth with the joint impulse of both forces
would fall with a double velocity, and in the space of one second of time
would describe 30⅙ Paris feet; altogether against experience.
This calculus is founded on the hypothesis of the earth's standing still;
for if both earth and moon move about the sun, and at the same time about
their common centre of gravity, the distance of the centres of the moon and
earth from one another will be 60½ semi-diameters of the earth; as may
be found by a computation from Prop. LX, Book I.
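The arithmetic of this Proposition can be retraced numerically. The following is a minimal sketch, assuming the readings 27d. 7h. 43′ for the sidereal month, 123249600 Paris feet for the earth's circumference, a pendulum of 3 Paris feet 8½ lines, and 144 lines to the Paris foot; it also assumes that the Corollary's proportion enters as the factor (178 29/40)/(177 29/40). All function and constant names are hypothetical.

```python
import math

EARTH_CIRCUMFERENCE_FT = 123249600.0               # Paris feet
MOON_DISTANCE_RADII = 60.0                          # semi-diameters of the earth
SIDEREAL_MONTH_MIN = 27 * 24 * 60 + 7 * 60 + 43     # 27 d 7 h 43 min, in minutes
SOLAR_CORRECTION = (178 + 29 / 40) / (177 + 29 / 40)  # assumed use of Cor. Prop. III
LINES_PER_FOOT = 144.0                              # 12 inches of 12 lines each


def moon_fall_one_minute():
    """Versed sine of the arc described in one minute, in Paris feet."""
    orbit_radius = MOON_DISTANCE_RADII * EARTH_CIRCUMFERENCE_FT / (2 * math.pi)
    angle = 2 * math.pi / SIDEREAL_MONTH_MIN
    versed_sine = 2 * orbit_radius * math.sin(angle / 2) ** 2
    return versed_sine * SOLAR_CORRECTION


def surface_fall_one_second_from_moon():
    """Force 60*60 times greater, time 60 times shorter: the same number of feet."""
    return moon_fall_one_minute() * MOON_DISTANCE_RADII ** 2 / 60.0 ** 2


def surface_fall_one_second_from_pendulum():
    """Half the seconds-pendulum length times (circumference/diameter)^2."""
    pendulum_length = 3 + 8.5 / LINES_PER_FOOT
    return math.pi ** 2 * pendulum_length / 2


if __name__ == "__main__":
    print("from the moon:    ", surface_fall_one_second_from_moon(), "Paris feet")
    print("from the pendulum:", surface_fall_one_second_from_pendulum(), "Paris feet")
```

Under these assumptions both routes give a fall of a little over 15 Paris feet in the first second, differing from one another by only a few thousandths of a foot, which is the agreement on which the Proposition rests.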

A. A. Logunov

ON THE ARTICLES
BY HENRI POINCARÉ

«ON THE DYNAMICS
OF THE ELECTRON»

Translated by G. Pontecorvo

H. Poincaré
THE DYNAMICS
OF THE ELECTRON¹
(23 July 1905)

INTRODUCTION
It would seem at first sight that the aberration of light and the optical and
electrical effects related thereto should afford a means of determining the absolute
motion of the Earth, or rather its motion relative to the ether instead of relative to
the other celestial bodies. An attempt at this was made, indeed, by Fresnel, but
he soon perceived that the Earth's motion does not affect the laws of refraction
and reflection. Similar experiments, such as that using a water-filled telescope, or
any in which only the first-order terms relative to the aberration were considered,
likewise yielded only negative results. The explanation of this was soon found;
but Michelson, who devised an experiment wherein the terms involving the square
of the aberration should be detectable, was equally unsuccessful.
This impossibility of experimentally demonstrating the absolute motion of
the Earth appears to be a general law of Nature; it is reasonable to assume the
existence of this law, which we shall call the relativity postulate, and to assume
that it is universally valid. Whether this postulate, which so far is in agreement
with experiment, be later confirmed or disproved by more accurate tests, it is, in
any case, of interest to see what consequences follow from it.

* We note that the relativity postulate, the postulate of total
impossibility of revealing absolute motion, as formulated in his first
short note «On the dynamics of the electron», was first mentioned
by Poincaré in his report to the Congress of Art and Science held in
Saint Louis in 1904².
In this report Poincaré lists the main principles of theoretical
physics and formulates the relativity principle, in accordance with

¹Poincaré H. On the dynamics of the electron // The relativity principle: Collection of works
of the classics of relativity. Leningrad, 1935. P.51-139. Rend. del Circ. Mat. di Palermo 21 (1906)
P.129-176.
²Poincaré H. The present and the future of mathematical physics // The relativity principle: Coll.
of works on special relativity theory. - Moscow, 1973. - P.27-44; Poincaré H. L'état actuel et
l'avenir de la Physique mathématique. - Janvier 1904. - V.28. Ser. 2. - P.302-324 // The
Monist. - 1905. - V.XV. N 1.


which the laws of physical phenomena must be the same for a motionless
observer and for an observer experiencing uniform motion along
a straight line, so there is no way and cannot be any way of
determining whether one experiences such motion or not.

One explanation, suggested by Lorentz and Fitzgerald, involves the hypothesis
that all bodies undergo a contraction in the direction of the Earth's motion, of
an amount proportional to the square of the aberration; such a contraction, which
we shall call the Lorentz contraction, would explain the result of Michelson's
experiment and of all others conducted heretofore. The hypothesis would never-
theless be inadequate if the relativity postulate were valid in its most general form.
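The statement that the contraction is proportional to the square of the aberration can be illustrated numerically. The sketch below assumes the later standard form of the contraction factor, sqrt(1 - beta^2), with beta the ratio of the Earth's velocity to that of light, and an illustrative rounded value beta = 1e-4 (roughly 30 km/s against 300000 km/s); neither the function name nor these figures come from the text.

```python
import math


def contraction_fraction(beta):
    """Fractional shortening 1 - sqrt(1 - beta^2) along the direction of motion."""
    return 1.0 - math.sqrt(1.0 - beta * beta)


# Assumed, rounded ratio of the Earth's orbital velocity to the velocity of light.
EARTH_BETA = 1.0e-4

if __name__ == "__main__":
    exact = contraction_fraction(EARTH_BETA)
    second_order = EARTH_BETA ** 2 / 2   # leading term, the square of the aberration
    print("fractional contraction:", exact)
    print("beta^2 / 2:            ", second_order)
```

For so small a velocity the exact expression and the second-order term beta^2/2 are indistinguishable, so the effect is of the same order as the one sought in Michelson's experiment.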
Lorentz has sought to extend and modify the hypothesis so as to make it
fully compatible with the relativity postulate. This he has succeeded in doing,
in his paper Electromagnetic phenomena in a system moving with any velocity
smaller than that of light (Proceedings of the Section of Sciences, Koninklijke
Akademie van Wetenschappen te Amsterdam 6, 809-831, 1904)³.
In view of the importance of this problem, I resolved to examine it further.
The results which I have obtained agree with those of Lorentz in all the principal
points, and I have needed only to modify and augment them in certain details.
These differences, which are of but minor importance, will be shown in later
sections.
* These lines can be properly understood, if one remembers that
already in his first work Poincaré fully attributes his own understanding
of the problem to Lorentz and speaks of it as of the idea of
Lorentz. Below he again formulates the idea of Lorentz, although
Lorentz nowhere and never wrote any such thing before Poincaré.
When Poincaré writes about an agreement with Lorentz, he attributes
his own understanding to Lorentz. This circumstance was
not understood hitherto by many people, although it was formulated
quite definitely by Poincaré, if one reads his works carefully.

Lorentz's concept may be summarized thus: if a common translatory motion
may be imparted to the entire system without any alteration of the observable
phenomena, then the equations of an electromagnetic medium are unaltered by
certain transformations, which we shall call Lorentz transformations. In this
way two systems, of which one is fixed and the other is in translatory motion,
become exact images of each other.
³Lorentz H.A. Electromagnetic phenomena in a system moving at any velocity smaller than the
velocity of light//The relativity principle: Collection of works of the classics of relativity. Leningrad,
1935. P. 16-48.


* The last words show how profoundly Poincaré understood and
clearly formulated the equality, contained in the transformations of
the Lorentz group, of all reference systems moving along a straight
line uniformly with respect to each other, which leads to the total
impossibility of determining absolute motion.
We may note, here, the essential difference, from the point of
view of Lorentz, concerning the relation of two reference systems
related by relativistic transformations. Lorentz, in his own words,
considered there to be «an essential difference between the systems
$x, y, z, t$ and $x', y', z', t'$. In one reference system used are - such
was my reasoning - coordinate axes having a definite location in
ether and what can be called «true» time; in the other system, contrariwise,
we are simply dealing with auxiliary quantities introduced
only with the aid of a mathematical trick. Thus, for instance, the
variable $t'$ should not be called «time» in the same sense as the
variable $t$.
Given such reasoning, I had no intention of describing phenomena
in system $x', y', z', t'$ in precisely the same way as in system
$x, y, z, t$...»⁴.
Lorentz further, in the same article, writes: «I tried to choose
the transformation formulae so as to obtain the simplest possible
equations in the new system. Later I saw from Poincaré's article
that, had my approach been more methodical, I could have achieved
greater simplicity. Having not noticed this circumstance, I didn't
achieve complete invariance in the equations».

Langevin⁵ sought to derive a modification of Lorentz's concept. Both authors
consider that an electron in motion assumes the form of an oblate spheroid;
but Lorentz considers that two of the axes of this spheroid remain constant,
whereas Langevin supposes that its volume remains constant. These two authors
have shown that the two hypotheses are in agreement with the experiments
of Kaufmann, as is Abraham's original hypothesis of a rigid spherical
electron.
The advantage of Langevin's theory is that it involves only the electromagnetic
forces and the constraints; but it is not compatible with the relativity postulate.
This was shown by Lorentz, and I have likewise proved it by a different
method, based upon the use of group theory.
⁴Lorentz H.A. Two articles by Henri Poincaré about mathematical physics // The relativity principle:
Coll. of works on special relativity theory. - Moscow, 1973. - P.189-196.
⁵Bucherer expressed the same idea before Langevin, in Bonn. See: Bucherer A.H. Mathematische
Einführung in die Elektronentheorie. - Leipzig: Druck und Verlag von B.G.Teubner, 1904. -
P.148.


We must return therefore to Lorentz's theory, but, in order to maintain this
free from unacceptable contradictions, a special force must be invoked to account
both for the contraction and for the constancy of two of the axes. I have
attempted to determine this force, and have found that it can be regarded as a
constant external pressure acting upon an electron capable of deformation and
compression, the work done being proportional to the change in the volume
of the electron.
Then, if the inertia of matter is exclusively of electromagnetic origin, as has
been customarily supposed since Kaufmann's experiment, and if all forces (other
than the constant pressure to which I have just alluded) are of electromagnetic
origin, the relativity postulate can be accepted as strictly valid. I show this by
means of a very simple calculation based upon the principle of least action.
But this is not all. Lorentz, in his paper already mentioned, has deemed it
necessary to extend his hypothesis in such a manner that the postulate remains
valid when there exist forces other than the electromagnetic forces. In Lorentz's
view, all forces, no matter how originating, are affected by the Lorentz
transformation (and therefore by a translatory motion) in the same manner as the
electromagnetic forces.
* And in spite of the assertion made by Poincaré being extremely
clear and unambiguous, P.A.M. Dirac, even in 1980, wrote⁶: «In one
aspect Einstein went much farther than Lorentz, Poincaré and others,
namely in assuming that the Lorentz transformations should be applied
in all physics and not only in the case of phenomena related to
electrodynamics. Any physical forces, that may be introduced in the
future, must be consistent with the Lorentz transformations». Similar
things are written by physicists of a somewhat lower rank, not to
speak of those authors who have quite an unclear idea of physics,
but also write that Poincaré didn't make the decisive step towards
the creation of relativity theory.
All the above is actually not true, and the relevant answer is found
in the two articles of Poincaré, and if someone has not understood
or has not made any attempt to do so, it is certainly not the fault of
Poincaré.
By comparing what was written by Poincaré and the assertion
made by Dirac one can readily verify that everything considered by
Dirac to be the merit of Einstein is actually already all found in the
first paper of Poincaré. So Dirac's claim, quoted above, that «in one
aspect Einstein went much farther...» simply has no grounds. It was
Poincaré who extended the Lorentz transformations to all the forces
of Nature.
⁶Collection dedicated to Einstein. 1982-1983, Moscow: Nauka, 1986. P.218.


Surprisingly, many people exert great efforts in attributing their
own incomprehension to Poincaré. In formulating the relativity postulate,
or the postulate of total impossibility of determining absolute
motion as he terms it in his first paper, Poincaré stresses its universality
in applications to all natural forces. It must be noted that
although here, also, Poincaré, as usual, modestly attributes precedence
to Lorentz, it is precisely to Poincaré himself, who first established
the group nature of the Lorentz transformations as well as the
correct transformation laws of forces and other physical quantities,
that we owe the discovery of the overall general nature of relativistic
laws, independently of the nature of the forces.
It was necessary to consider this hypothesis more closely, and in particular to
ascertain the changes which it would compel us to apply to the laws of gravitation.
First of all, we find that gravitational action would be propagated with the
velocity of light, and not instantaneously. This might in itself appear to be
sufficient reason to reject the hypothesis, for Laplace has shown that such prop-
agation cannot occur. But, in fact, the effects of this are largely counterbalanced
by another phenomenon, and there is, therefore, no contradiction between the
proposed law and astronomical observations.
The question arises whether it is possible to discover a law which satisfies
Lorentz's condition and which yet reduces to Newton's whenever the velocities
of the bodies are so small that the squares of these velocities (and the products
of the accelerations and the distances) may be neglected in comparison with the
square of the velocity of light.
It will be seen later that the answer must be affirmative.
Is the law, thus modified, compatible with astronomical observations?
At first sight it appears to be so, but a more detailed discussion is necessary
to settle the question.
Even assuming, however, that the new hypothesis survives this test, what
conclusion is to be drawn? If the gravitational attraction is propagated with the
velocity of light, this cannot occur by mere chance, but must be dependent on
the ether; we should then have to investigate the nature of this dependence, and
attempt to relate it to other such dependences.
* It is interesting to note that although the relativity postulate put
forward by Poincare implies total impossibility of determining the
motion of matter relative to ether, the concept itself of ether is not
discarded, since it is difficult to imagine any greater absurdity than
empty space. Many people consider the removal of ether to be a
most important revolutionary achievement of relativity theory. This
is not true. It is quite evident that relativity theory did not simply
destroy ether, but only assigned it a certain state of motion.

This was subsequently noted by Einstein, who wrote: «... a closer
examination reveals that special relativity theory does not require
unconditional negation of ether. One can accept the existence of
ether, but one needs not care about assigning it any definite state of
motion; in other words, it must, in abstract terms, be deprived of its
last mechanical sign still left it by Lorentz. We shall further see that
general relativity theory justifies such an approach...»⁷.

In modern theoretical physics the notion of ether gave up its
place to the concept of physical vacuum, in which there are inevitably
present quantum fluctuations - zero oscillations of quantum fields.

We cannot be satisfied with formulae that are merely placed side by side and
agree only by a lucky chance; these formulae must, as it were, interlock. The
mind will consent only when it sees the reason for the agreement, and when this
agreement even seems to have been predictable.
But the matter may be viewed in a different light, as an analogy will show. Let
us imagine some astronomer before Copernicus, pondering upon the Ptolemaic
system. He would notice that, for every planet, either the epicycle or the deferent
is traversed in the same time. This cannot be due to chance, and there must be
some mysterious bond between all the planets of the system.
Then Copernicus, by a simple change of the coordinate axes which were
supposed fixed, did away with this seeming relationship: every planet described
one circular orbit only, and the periods of revolution became independent of one
another - until Kepler once more established the relationship that had apparently
been destroyed.
Now, there may be an analogy with our problem. If we assume the relativity
postulate, we find a quantity common to the law of gravitation and the laws
of electromagnetism, and this quantity is the velocity of light; and this same
quantity appears in every other force, of whatever origin. There can be only two
explanations.
Either, everything in the universe is of electromagnetic origin; or, this con-
stituent which appears common to all the phenomena of physics has no real
existence, but arises from our methods of measurement. What are these meth-
ods? One might first reply, the bringing into juxtaposition of objects regarded
as invariable solid things; but this is no longer so in our present theory, if the
Lorentz contraction is assumed. In this theory, two lengths are by definition
equal if they are traversed by light in the same time.
Perhaps the abandonment of this definition would suffice to overthrow Lorentz's
theory as decisively as the system of Ptolemy was by the work of Copernicus.
Should this ever happen, it would by no means argue the futility of
⁷Einstein A. Collection of scientific papers: in 4 volumes. Ed. I.Ye.Tamm, Ya.A.Smorodinsky,
B.G.Kuznetsov. - Moscow: Nauka, 1965, V.1. P.685-686.


Lorentz's analysis: whatever the faults of the Ptolemaic theory, it was the
necessary foundation for Copernicus to build upon.
I have therefore not hesitated to publish these incomplete results, even though
at the present time the entire theory may seem to be threatened by the discovery
of cathode rays.

SECTION 9.
HYPOTHESES CONCERNING GRAVITATION
* «Mass has two aspects: these are both inertia and the gravitational
mass, which is involved in Newtonian gravity as a multiplication
factor. If the coefficient of inertia is constant, can the
gravitational mass also be constant? That is the question».
H. Poincaré
The present and future of mathematical physics (1904)¹⁸

Thus, Lorentz's theory would entirely account for the impossibility of demonstrating
absolute motion, provided that all forces were of electromagnetic origin.
But there exist forces, such as gravitation, which cannot be regarded as being
of electromagnetic origin. It may happen that two systems of bodies create
equivalent electromagnetic fields, in the sense of exerting the same action upon
electrified bodies and currents, while at the same time these two systems do not
exert the same gravitational action upon Newtonian masses.
The gravitational field is therefore not identical with the electromagnetic field.
Lorentz was thus compelled to augment his hypothesis by assuming that forces,
of whatever origin, and in particular gravitation, are affected by translation
(or, if one prefers, by the Lorentz transformation) in the same way as the
electromagnetic forces.
We must now examine this hypothesis in detail. If the Newtonian force is
to behave in such a way under the Lorentz transformation, we can no longer
¹⁸Poincaré H. L'avenir de la Physique mathématique // Bulletin des Sciences Mathématiques.
Janvier 1904. - V.28. Ser.2. - P.302-324.


suppose that this force depends only on the relative position of the attracting and
the attracted body at the instant concerned; it must depend also on the velocities
of the two bodies. Moreover, we may reasonably assume that the force acting
upon the attracted body, at an instant t , depends on the position and velocity of
the body at that instant; but it will also depend on the position and velocity of the
attracting body, not at the instant but at some previous instant, as if gravitation
required a certain time for its propagation.
Let us consider therefore the position of the attracted body at the instant $t_0$,
and let its coordinates at that instant be $\vec r_0 = (x_0, y_0, z_0)$ and the components of its
velocity be $\vec v$, and let us consider the attracting body at the corresponding instant
$t_0 + t$, its coordinates at that instant being $\vec r_0 + \vec r$ and its velocity components $\vec v_1$.
First of all, we must have a relationship

$$\varphi(t, \vec r, \vec v, \vec v_1) = 0 \qquad (1)$$

to determine the time $t$. This relationship expresses the law of propagation of
gravitational action; I shall by no means impose the condition that propagation
occurs with the same velocity in every direction.
Next, let $\vec F = (F_x, F_y, F_z)$ be the three components of the action exerted
upon the attracted body at the instant $t_0$. We have to express $\vec F$
as functions of

$$t,\ \vec r,\ \vec v,\ \vec v_1. \qquad (2)$$

The conditions to be satisfied are as follows:

1. The relationship (1) must not be affected by the transformations of the
Lorentz group.
2. The components $\vec F = (F_x, F_y, F_z)$ must behave, under the Lorentz
transformations, in the same manner as the electromagnetic forces denoted by
the same letters, that is, as shown by equations (11') of section 1.
3. When both bodies are at rest, the usual law of attraction must apply.
In the latter case, however, it should be noted that the relationship (1) plays
no part, since the time $t$ is of no significance if both bodies are at rest.
The problem thus stated is clearly indeterminate. We shall therefore seek to
satisfy as many further conditions as possible.
4. Astronomical observations do not appear to reveal any perceptible deviation
from Newton's law, and we shall therefore choose the solution which differs
least from this law when the velocities of the two bodies are small.
5. We shall attempt to ensure that $t$ is always negative; for, whereas it
is reasonable that the effect of gravitation should require a certain time for its
propagation, we should find it more difficult to understand how this effect could
depend on a position of the attracting body which the latter has not yet reached.


There is one case where the problem is no longer indeterminate, namely if
the two bodies are at relative rest, i.e. if

$$\vec v = \vec v_1;$$

we shall therefore first investigate this case, assuming that these velocities are
constant, and therefore that the two bodies are executing a common uniform
motion of translation in a straight line.
We may assume that the $x$-axis has been taken to be parallel to this motion
of translation, so that $\vec v_\perp = 0$, and we shall take the velocity of the
transformation to be $v_\parallel = v$.
If, under these conditions, we apply the Lorentz transformation, the two
bodies will be at rest after the transformation, with

$$\vec v\,' = 0.$$

The components of the force must then be in accordance with Newton's law and we

have, apart from a constant factor,

But, from section I ,

Moreover,

and
$$F_\parallel = -\gamma\,\frac{x - v_\parallel t}{r'^3}, \qquad \vec F_\perp = -\frac{\vec r_\perp}{\gamma\, r'^3}, \qquad (4)$$

It seems at first sight that the indeterminacy remains, since no hypotheses
have been made concerning the value of $t$, that is, concerning the velocity of
propagation. Moreover, $x$ is a function of $t$. But it is easily seen that the
quantities $x - v_\parallel t$, $y$ and $z$ which appear in the formulae do not depend on $t$.
Thus, if two bodies have a common translatory motion, the force acting upon
the attracted body is normal to an ellipsoid having the attracting body at its
centre.
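The statement about the ellipsoid can be checked numerically. The sketch below assumes units in which the velocity of light is 1, the factor gamma = 1/sqrt(1 - v^2), and the force components F_par = -gamma (x - v t)/r'^3, F_perp = -r_perp/(gamma r'^3) with r'^2 = gamma^2 (x - v t)^2 + y^2 + z^2; these forms and all function names are assumptions made for the illustration. The force is then compared with the gradient of r'^2, which is normal to the ellipsoids r' = const centred on the attracting body.

```python
import math


def gamma(v):
    """Lorentz factor for a velocity v, in units where the velocity of light is 1."""
    return 1.0 / math.sqrt(1.0 - v * v)


def force(x_rel, y, z, v):
    """Assumed force components; x_rel stands for the combination x - v*t."""
    g = gamma(v)
    r_prime = math.sqrt((g * x_rel) ** 2 + y * y + z * z)
    return (
        -g * x_rel / r_prime ** 3,
        -y / (g * r_prime ** 3),
        -z / (g * r_prime ** 3),
    )


def ellipsoid_normal(x_rel, y, z, v):
    """Gradient of gamma^2 * x_rel^2 + y^2 + z^2, normal to the ellipsoid r' = const."""
    g = gamma(v)
    return (2.0 * g * g * x_rel, 2.0 * y, 2.0 * z)


def cross(a, b):
    """Vector product; it vanishes when a and b are parallel."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


if __name__ == "__main__":
    point, v = (1.0, 2.0, -0.5), 0.6
    print(cross(force(*point, v), ellipsoid_normal(*point, v)))
```

The vector product vanishes to rounding error, so under the stated assumptions the force lies along the normal to the ellipsoid and points inwards, towards the attracting body.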
In order to proceed further, it is necessary to ascertain the invariants of the
Lorentz group.
* The further exposition represents a general formulation of the
mathematical principles of relativity theory, including tensor calculus
and investigation of invariants of the Lorentz group corresponding to
relativistically invariant physical quantities and relations.
Here Poincaré for the first time considers the transformations
of the Lorentz group as transformations of the four-dimensional
manifold whose points have as coordinates the space coordinates
$x, y, z$ and the «imaginary» time $t\sqrt{-1}$, conserving the
quadratic form $x^2 + y^2 + z^2 - t^2$.
Following Poincaré, Minkowski subsequently developed these arguments
concerning the unity of space and time, the geometry of
which is pseudo-Euclidean. In his famous talk at a meeting of German
scientists and doctors in Köln, which turned out to be decisive
for relativity theory acquiring wide-spread recognition, H. Minkowski
expressed these ideas as follows:
«Dear gentlemen! The notions of space and time, which I intend
to develop here, are based on physical experiment. That is why they
are powerful. Their tendency is radical. From now on, space by itself
and time by itself will have to become fictions, while only a certain
form joining the two together will still retain independence»¹⁹.

It is known that the substitutions forming this group (if $l = 1$) are linear and
such that the quadratic form

$$x^2 + y^2 + z^2 - t^2$$

is invariant. Putting

$$\vec v = \frac{\delta \vec r}{\delta t}, \quad \delta \vec r = (\delta x, \delta y, \delta z), \qquad \vec v_1 = \frac{\delta_1 \vec r}{\delta_1 t}, \quad \delta_1 \vec r = (\delta_1 x, \delta_1 y, \delta_1 z),$$

we see that the Lorentz transformation causes $\delta x, \delta y, \delta z, \delta t$ and $\delta_1 x, \delta_1 y, \delta_1 z,
\delta_1 t$ to undergo the same linear substitutions as $x, y, z, t$.
¹⁹Minkowski H. Raum und Zeit: Vorträge von der 80. Naturforscherversammlung zu Köln // Physikalische
Zeitschrift. - 1909. - B.10. N.3. - P.104-111.


If

$$x, \; y, \; z, \; t\sqrt{-1};$$
$$\delta x, \; \delta y, \; \delta z, \; \delta t\sqrt{-1};$$
$$\delta_1 x, \; \delta_1 y, \; \delta_1 z, \; \delta_1 t\sqrt{-1}$$

are regarded as the coordinates of three points P, P′, P″ in four-dimensional space, we see that the Lorentz transformation is simply a rotation of this space about a fixed origin.
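
The invariance stated here can be checked numerically. The following is a minimal sketch, assuming units in which the velocity of light is 1 and a boost along the x-axis only; the helper names `boost_x` and `form` are illustrative and do not come from the text.

```python
import math

def boost_x(event, beta):
    """Apply a Lorentz boost along x (speed beta, units c = 1) to (x, y, z, t)."""
    x, y, z, t = event
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return (gamma * (x - beta * t), y, z, gamma * (t - beta * x))

def form(p, q):
    """Bilinear form x1*x2 + y1*y2 + z1*z2 - t1*t2 of two sets of coordinates."""
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2] - p[3] * q[3]

# Arbitrary sample values (assumptions chosen only for illustration).
P = (1.0, 2.0, -0.5, 3.0)      # x, y, z, t
dP = (0.3, -0.1, 0.2, 1.0)     # delta x, delta y, delta z, delta t
beta = 0.6

P_new, dP_new = boost_x(P, beta), boost_x(dP, beta)

# Both the quadratic form and the mixed expression keep their values.
print(form(P, P), form(P_new, P_new))
print(form(P, dP), form(P_new, dP_new))
```

Because the increments are transformed by the same linear substitution as the coordinates, the mixed expression $x\,\delta x + y\,\delta y + z\,\delta z - t\,\delta t$ is preserved along with the quadratic form.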
The only distinct invariants are therefore the six distances of the points P, P′, P″ from one another and from the origin, or alternatively the two expressions

$$x^2 + y^2 + z^2 - t^2, \quad x\,\delta x + y\,\delta y + z\,\delta z - t\,\delta t,$$

and the four expressions of the same form obtained by permuting the three points P, P′, P″ in any manner.
What we are seeking, however, is invariant functions of the ten variables (2); we must therefore find, among combinations of the six invariants, those which depend only on these ten variables, i.e. those which are homogeneous and of degree zero with respect to $\delta x, \delta y, \delta z, \delta t$ and with respect to $\delta_1 x, \delta_1 y, \delta_1 z, \delta_1 t$. This leaves four distinct invariants, namely

Let us now consider how the components of the force are transformed. We return to equations (11) of section 1, which refer not to the force $\vec F$ discussed here but to the force $\vec f$ per unit volume. Putting

we see that these equations (11) may be written (with l = 1)

$$f'_y = f_y, \quad f'_z = f_z. \qquad (6)$$

Thus $f_x, f_y, f_z, f_t$ are transformed in the same manner as x, y, z, t. The invariants of the group will therefore be

The quantities in which we are interested are not $\vec f$ but $\vec F$, with


Evidently

$$\vec F/\vec f = T/f_t = 1/\rho.$$

Thus the Lorentz transformation will act upon $\vec F$, T in the same way as upon $\vec f$, $f_t$, except that these expressions will in addition be multiplied by

$$\frac{\rho}{\rho'} = \frac{1}{\gamma(1 - \beta v_x)} = \frac{\delta t}{\delta t'}.$$

Likewise, the transformation will act upon $\vec v$, 1 in the same way as upon $\delta x, \delta y, \delta z, \delta t$, except that these expressions will in addition be multiplied by the same factor,

$$\frac{\delta t}{\delta t'} = \frac{1}{\gamma(1 - \beta v_x)}.$$

Let us now regard $f_x, f_y, f_z, f_t\sqrt{-1}$ as being the coordinates of a fourth point Q. The invariants will then be functions of the distances between the five points O, P, P′, P″, Q, and these functions must be homogeneous of degree zero, firstly with respect to $f_x, f_y, f_z, f_t, \delta x, \delta y, \delta z, \delta t$ (which variables can subsequently be replaced by $F_x, F_y, F_z, T, \vec v, 1$), and secondly with respect to $\delta_1 x, \delta_1 y, \delta_1 z, \delta_1 t$ (which variables can subsequently be replaced by $\vec v_1, 1$).
In this way we find, in addition to the four invariants (5), four further and distinct invariants, namely

The last of these is always zero, according to the definition of T.

* Having discovered the Lorentz group and the fundamental invariant $x^2 + y^2 + z^2 - t^2$, Poincaré established that a series of physical quantities must be individual components of single four-dimensional quantities varying under the Lorentz transformations like time and coordinates. Below we present some such quantities constructed by Poincaré:
time and space coordinates - $[t, \vec x]$;
work per unit time and force reduced to unit volume - $[\vec f\cdot\vec v, \vec f]$;
work per unit time and force reduced to unit charge - $[\gamma(\vec F\cdot\vec v), \gamma\vec F]$;
four-component velocity, or four-momentum - $[\gamma, \gamma\vec v]$;
charge and current - $[\rho, \rho\vec v]$;
scalar and vector potential - $[\varphi, \vec A]$
with $\gamma = 1/\sqrt{1 - v^2}$.

Which are the conditions that must now be satisfied?

1. The left-hand side of equation (1), which defines the velocity of propagation, must be a function of the four invariants (5).
It is obvious that a large number of hypotheses could be constructed. We shall consider only two of these.
(A) It may be that

$$x^2 + y^2 + z^2 - t^2 = r^2 - t^2 = 0,$$

whence $t = \pm r$, and, since t must be negative, $t = -r$.
This means that the velocity of propagation is equal to that of light.
At first, it seems that this hypothesis should be rejected immediately; for Laplace has shown that the propagation is either instantaneous or much more rapid than that of light. But Laplace was discussing the hypothesis of a finite velocity of propagation alone, whereas here it is compounded with many others, and there may happen to be some more or less complete mutual compensation between them, a situation of which many examples have already appeared in the applications of the Lorentz transformation.
(B) It may be that

The velocity of propagation is then much more rapid than that of light, but in certain cases might be negative, which, as we have said, seems hardly acceptable.
We shall therefore abide by hypothesis (A).
2. The four invariants (7) must be functions of the invariants (5).
3. When both bodies are at absolute rest, $\vec F$ must have the values given by Newton's law; when the bodies are at relative rest, the values must be those given by equations (4).
In the case of absolute rest, the first two invariants (7) must reduce to

or, by Newton's law, to

$$\frac{1}{r^4}, \quad -\frac{1}{r}.$$

According to hypothesis (A), the second and third of the invariants (5) become

that is, for absolute rest,


We may therefore assume, for example, that the first two invariants (7) reduce to

$$\frac{(1 - v_1^2)^2}{(r + \vec r\cdot\vec v_1)^4}, \quad -\frac{\sqrt{1 - v_1^2}}{r + \vec r\cdot\vec v_1},$$

but other combinations are possible.
It is necessary to choose some combination, and a third equation is also needed to determine $\vec F$. In making the choice, we shall attempt to remain as close as possible to Newton's law. Let us then examine the result when the squares of the velocities $\vec v$, $\vec v_1$, etc., are neglected (and $t = -r$).
The four invariants (5) then become

and the four invariants (7) become

In order to compare this with Newton's law, however, a further transformation is necessary. In these equations, $x_0 + x$, $y_0 + y$, $z_0 + z$ represent the coordinates of the attracting body at the instant $t_0 + t$, and $r = |\vec r|$; in Newton's law, we have to consider the coordinates $x_0 + x_1$, $y_0 + y_1$, $z_0 + z_1$ of the attracting body at the instant $t_0$, and the distance $r_1 = |\vec r_1|$.
We may neglect the square of the time t occupied by the propagation, and therefore regard the motion as uniform; then

or, since $t = -r$,

$$\vec r = \vec r_1 - \vec v_1 r, \quad r = r_1 - \vec r\cdot\vec v_1,$$

and the four invariants (5) become

and the four invariants (7) become

In the second of these expressions I have written $r_1$ in place of r, since r is multiplied by $\vec v - \vec v_1$ and the square of v is neglected.
Newton's law gives, for these four invariants (7),


If, therefore, we denote the second and third invariants (5) by A and B, and the first three invariants (7) by M, N and P, Newton's law will be obeyed, to within terms of the order of the squares of the velocities, by putting

$$M = \frac{1}{B^4}, \quad N = +\frac{A}{B^2}, \quad P = \frac{A - B}{B^3}. \qquad (8)$$

This solution is not unique: if the fourth invariant (5) is denoted by C, then $C - 1$ is of the order of $v^2$, as is $(A - B)^2$.
We may therefore add to the right-hand side of each of the equations (8) a term consisting of $C - 1$ multiplied by any function of A, B and C, and a term consisting of $(A - B)^2$ also multiplied by any function of A, B and C.
The solution (8) appears the simplest at first sight, but it cannot be accepted. Since M, N and P are functions of $\vec F$ and $T = \vec F\cdot\vec v$, these equations yield values of $\vec F$; but the resulting values may in some cases be imaginary.
In order to avoid this difficulty, we proceed differently, putting

$$\gamma_0 = \frac{1}{\sqrt{1 - v^2}}, \quad \gamma_1 = \frac{1}{\sqrt{1 - v_1^2}},$$

by analogy with

$$\gamma = \frac{1}{\sqrt{1 - \beta^2}}$$

as in the Lorentz substitution.
Then, with the condition $-r = t$, the invariants (5) become

$$0, \quad A = -\gamma_0(r + \vec r\cdot\vec v), \quad B = -\gamma_1(r + \vec r\cdot\vec v_1), \quad C = \gamma_0\gamma_1(1 - \vec v\cdot\vec v_1).$$

Moreover, the following systems of quantities:

$$x, \quad y, \quad z, \quad t = -r$$
$$\gamma_0 F_x, \quad \gamma_0 F_y, \quad \gamma_0 F_z, \quad \gamma_0 T$$
$$\gamma_0 v_x, \quad \gamma_0 v_y, \quad \gamma_0 v_z, \quad \gamma_0$$
$$\gamma_1 v_{1x}, \quad \gamma_1 v_{1y}, \quad \gamma_1 v_{1z}, \quad \gamma_1$$

are seen to undergo the same linear substitutions when the transformations of the Lorentz group are applied to them.
* Here Poincaré introduces, for the first time, the sets of quantities $F^\mu = (\gamma_0 T, \gamma_0\vec F)$ and $u^\mu = (\gamma_0, \gamma_0\vec v)$, which transform according to the same linear irreducible (tensor) law as the set of space-time coordinates $x^\mu = (t, \vec x)$ and are presently termed the four-vectors of force and velocity. In this case the relations $u_\mu^2 = 1$ and $u_\mu F^\mu = 0$ are identities.
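
The two identities can be illustrated with a short numerical check. This is a sketch only, assuming units with c = 1, the signature (+, -, -, -) with the time component listed first, and $T = \vec F\cdot\vec v$ as the work per unit time; the function names are hypothetical.

```python
import math

def minkowski(a, b):
    """Scalar product with signature (+, -, -, -); components are (time, x, y, z)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def lorentz_factor(v):
    """gamma = 1 / sqrt(1 - v^2) for a velocity 3-vector v (units c = 1)."""
    return 1.0 / math.sqrt(1.0 - sum(c * c for c in v))

def four_velocity(v):
    """u = (gamma, gamma * v)."""
    g = lorentz_factor(v)
    return (g,) + tuple(g * c for c in v)

def four_force(F, v):
    """(gamma * T, gamma * F) with T = F . v, the work per unit time."""
    g = lorentz_factor(v)
    T = sum(f * c for f, c in zip(F, v))
    return (g * T,) + tuple(g * f for f in F)

# Arbitrary sample values, chosen only for illustration.
v = (0.3, -0.4, 0.2)
F = (1.5, 0.7, -2.0)

u = four_velocity(v)
K = four_force(F, v)

print(minkowski(u, u))   # square of the four-velocity
print(minkowski(u, K))   # product of four-velocity and four-force
```

The first product equals $\gamma_0^2(1 - v^2)$ and the second $\gamma_0^2(T - \vec F\cdot\vec v)$, which is why both hold identically rather than as extra conditions.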

We therefore put

$$\vec F = a\vec r/\gamma_0 + b\vec v + c\vec v_1\gamma_1/\gamma_0, \quad T = -ar/\gamma_0 + b + c\gamma_1/\gamma_0. \qquad (9)$$

It is evident that, if a, b, c are invariants, $\vec F$, T will satisfy the fundamental condition, i.e. will undergo an appropriate linear substitution when the Lorentz transformations are applied to them.
If the equations (9) are compatible, we must have

$$\vec F\cdot\vec v - T = 0.$$

When $\vec F$, T are replaced by their values (9), the result is, after multiplication by $\gamma_0^2$,

$$-Aa - b - Cc = 0. \qquad (10)$$

The desired conclusion is that the values of $\vec F$ should remain in accordance with Newton's law when the squares of the velocities $\vec v$, $\vec v_1$, etc., and the products of the accelerations and the distances are neglected in comparison with the square of the velocity of light.
We can take

$$b = 0, \quad c = -aA/C.$$
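
That this choice satisfies condition (10), and with it the compatibility requirement $T = \vec F\cdot\vec v$, can be verified numerically. The sketch below rests on assumptions that should be read as such: it takes the ansatz (9) in the form $\vec F = a\vec r/\gamma_0 + b\vec v + c\vec v_1\gamma_1/\gamma_0$, $T = -ar/\gamma_0 + b + c\gamma_1/\gamma_0$ (the time component formed with $t = -r$), units with c = 1, and arbitrary sample values; the function name `force_from_ansatz` is hypothetical.

```python
import math

def dot(a, b):
    """Ordinary scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def force_from_ansatz(r_vec, v, v1, a):
    """Evaluate the assumed form of (9) with b = 0 and c = -a*A/C."""
    r = math.sqrt(dot(r_vec, r_vec))
    g0 = 1.0 / math.sqrt(1.0 - dot(v, v))      # gamma_0, attracted body
    g1 = 1.0 / math.sqrt(1.0 - dot(v1, v1))    # gamma_1, attracting body
    A = -g0 * (r + dot(r_vec, v))
    C = g0 * g1 * (1.0 - dot(v, v1))
    b = 0.0
    c = -a * A / C
    F = tuple(a * x / g0 + b * vi + c * wi * g1 / g0
              for x, vi, wi in zip(r_vec, v, v1))
    T = -a * r / g0 + b + c * g1 / g0
    return F, T, (A, C, b, c)

# Arbitrary sample values, for illustration only.
r_vec = (1.0, 2.0, 2.0)
v = (0.1, -0.2, 0.05)
v1 = (0.3, 0.1, -0.2)
a = -1.0 / 27.0

F, T, (A, C, b, c) = force_from_ansatz(r_vec, v, v1, a)

print(-A * a - b - C * c)   # left-hand side of condition (10)
print(T - dot(F, v))        # compatibility residual
```

Any other invariants a, b, c obeying (10) would pass the same check; setting b = 0 is simply the choice made in the text.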
To the approximation used,

Then the first equation (9) becomes

But, if $v^2$ is neglected, $A\vec v_1$ may be replaced by $-r_1\vec v_1$, or by $-r\vec v_1$, whence

Newton's law would give

We must therefore take as the invariant a one which reduces to $-1/r_1^3$ within the approximation adopted, that is, $1/B^3$. The equations (9) then become


It is seen, first of all, that the corrected attraction consists of two components, one parallel to the vector joining the positions of the two bodies, and the other parallel to the velocity of the attracting body.
When we speak of the position or the velocity of the attracting body, we mean its position or velocity at the instant when the gravitational wave leaves it; but the position or the velocity of the attracted body means its position or velocity at the instant when the gravitational wave reaches it, this wave being assumed to be propagated with the velocity of light.
I believe that it would be premature to attempt to continue the discussion of these formulae, and I shall therefore confine myself to making a few comments.
1. The solutions (11) are not unique; for the common factor $1/B^3$ may be replaced by

$$\frac{1}{B^3} + (C - 1) f_1(A, B, C) + (A - B)^2 f_2(A, B, C),$$

where $f_1$ and $f_2$ are any functions of A, B and C. Moreover, b need not be taken as zero; any additional terms may be added to a, b and c which satisfy the condition (10) and are of the second order in v for a, and of the first order in v for b and c.
2. The first equation (11) may be written

and the quantity in the brackets may in turn be written

$$(\vec r + \vec v_1 r) + [\vec v \times [\vec v_1 \times \vec r]], \qquad (12)$$

so that the total force is divisible into three components corresponding to the three parentheses in equation (12). The first component is somewhat similar to the mechanical force due to the electric field, the other two to the mechanical force due to the magnetic field. By virtue of comment 1, I may replace $1/B^3$ in equations (11) by $C/B^3$, so that the components of $\vec F$ are linear functions of the velocity $\vec v$ of the attracted body, C having been eliminated from the denominator of (11'). This completes the analogy.
Putting then

$$\vec e = \gamma_1(\vec r + r\vec v_1), \quad \vec h = \gamma_1[\vec v_1 \times \vec r], \qquad (13)$$

with C eliminated from the denominator of (11') we obtain


Thus $\vec e$ or $\vec e/B^3$ is a kind of electric field, while $\vec h$ or $\vec h/B^3$ is a kind of magnetic field.
3. The relativity postulate would compel us to use either the solution (11) or the solution (14) or any one of the solutions obtained therefrom by using comment 1. But the prime question is whether these are compatible with astronomical observations. The deviation from Newton's law is of the order of $v^2$, that is 10000 times less than if it had been of the order of v, as it would have been with the velocity of propagation equal to that of light and the other conditions unchanged. We may therefore hope that the deviation will not be very great; but only a more extended investigation will furnish the answer to this question.
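
The factor of 10000 can be made concrete with a small piece of arithmetic. The sketch below assumes, as an illustration not stated in the text, that the relevant velocity is of the size of the Earth's mean orbital speed (about 29.78 km/s), expressed as a fraction of the velocity of light.

```python
C_KM_S = 299_792.458      # velocity of light in km/s
V_EARTH_KM_S = 29.78      # Earth's mean orbital speed in km/s (assumed example)

v = V_EARTH_KM_S / C_KM_S  # velocity in units of the velocity of light

first_order = v            # size of a deviation of order v
second_order = v * v       # size of a deviation of order v^2
ratio = first_order / second_order

print(f"v = {v:.3e}")
print(f"order v: {first_order:.3e}, order v^2: {second_order:.3e}")
print(f"ratio: {ratio:.0f}")
```

For planetary velocities of this size the ratio of a first-order to a second-order effect is simply 1/v, which comes out close to ten thousand.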

* Let us now briefly list some of the main results obtained by Poincaré in this work.
The relativity principle, formulated by Poincaré in 1904 (see footnote 2 on page 7), is stated for all physical phenomena.
It has been shown, for the first time, that the Lorentz transformations form, together with space rotations, a group termed by Poincaré the Lorentz group. Infinitesimal operators of the Lorentz group have been constructed. Invariance of the quadratic form $t^2 - x^2 - y^2 - z^2$ under transformations of the Lorentz group has been revealed.
Poincaré's discovery of the Lorentz group and of the fundamental invariant $t^2 - x^2 - y^2 - z^2$ permitted him to construct a series of four-dimensional quantities, which under Lorentz transformations vary like the time and the space coordinates.
Here follow some of these quantities:
work per unit time and force reduced to unit volume - $[\vec f\cdot\vec v, \vec f]$;
work per unit time and force reduced to unit charge - $[\gamma(\vec F\cdot\vec v), \gamma\vec F]$;
four-component velocity, or four-momentum - $[\gamma, \gamma\vec v]$;
charge and current - $[\rho, \rho\vec v]$;
scalar and vector potential - $[\varphi, \vec A]$.
With the aid of these transformation laws Poincaré established, for the first time, that the Maxwell-Lorentz equations and also, which is extremely important, the Lorentz force acting on an elementary charge in unit volume do not alter their form under transformations of the Lorentz group. Thus, it was shown that no phenomena can be used for establishing whether one is in the state of rest or of linear and uniform motion.
Thus, Poincaré demonstrated that the relativity principle for electromagnetic phenomena follows from the Maxwell-Lorentz equations as a rigorous mathematical truth.


The relativistic law for adding velocities was first established by Poincaré.
Poincaré was the first to demonstrate the invariance of the action integral for an electromagnetic field under transformations of the Lorentz group and to discover the fundamental invariants of the electromagnetic field,

Poincaré was the first to discover the equations of relativistic mechanics (in units m = c = 1)

and to write the corresponding expression for the Lagrangian function of a moving material point.
Poincaré introduced four-dimensional space with the coordinates $(x, y, z, t\sqrt{-1})$ and showed that transformations of the Lorentz group correspond to various rotations in this space about the origin. He constructed various invariants of the Lorentz group.
He put forward the hypothesis that all natural forces, including gravitational forces, must transform in the same manner under Lorentz transformations.
Poincaré introduced the concept of gravitational waves travelling with the speed of light. He demonstrated that the hypothesis asserting propagation of the forces of gravity with the speed of light does not contradict observational data.
Even this short list of results reveals that Poincaré discovered, in an extremely precise and general form, almost all the essential constituents of relativity theory.


252 THE RELATIVITY PRINCIPLE

Doc. 47
ON THE RELATIVITY PRINCIPLE AND THE CONCLUSIONS DRAWN FROM IT
by A. Einstein
[Jahrbuch der Radioaktivität und Elektronik 4 (1907): 411-462]

Newton's equations of motion retain their form when one transforms to a new system of coordinates that is in uniform translational motion relative to the system used originally according to the equations

$$x' = x - vt$$

$$y' = y$$

$$z' = z$$

As long as one believed that all of physics can be founded on Newton's equations of motion, one therefore could not doubt that the laws of nature are the same without regard to which of the coordinate systems moving uniformly (without acceleration) relative to each other they are referred. However, this independence from the state of motion of the system of coordinates used, which we will call "the principle of relativity," seemed to have been suddenly called into question by the brilliant confirmations of H. A. Lorentz's electrodynamics of moving bodies.¹ That theory is built on the presupposition of a resting, immovable, luminiferous ether; its basic equations are not such that they transform to equations of the same form when the above transformation equations are applied.
[1] ¹H. A. Lorentz, Versuch einer Theorie der elektrischen und optischen Erscheinungen in bewegten Körpern [Attempt at a theory of electric and optical phenomena in moving bodies]. Leiden, 1895. Reprinted Leipzig, 1906.

After the acceptance of that theory, one had to expect that one would succeed in demonstrating an effect of the terrestrial motion relative to the luminiferous ether on optical phenomena. It is true that in the study cited Lorentz proved that in optical experiments, as a consequence of his basic assumptions, an effect of that relative motion on the ray path is not to be expected as long as the calculation is limited to terms in which the ratio v/c of the relative velocity to the velocity of light in vacuum appears in the first power. But the negative result of Michelson and Morley's experiment¹ showed that in a particular case an effect of the second order (proportional to $v^2/c^2$) was not present either, even though it should have shown up in the experiment according to the fundamentals of the Lorentz theory.
It is well known that this contradiction between theory and experiment was formally removed by the postulate of H. A. Lorentz and FitzGerald, [4] according to which moving bodies experience a certain contraction in the direction of their motion. However, this ad hoc postulate seemed to be only an artificial means of saving the theory: Michelson and Morley's experiment had actually shown that phenomena agree with the principle of relativity even where this was not to be expected from the Lorentz theory. It seemed therefore as if Lorentz's theory should be abandoned and replaced by a theory whose foundations correspond to the principle of relativity, because such a theory would readily predict the negative result of the Michelson and Morley experiment. [5]
Surprisingly, however, it turned out that a sufficiently sharpened conception of time was all that was needed to overcome the difficulty discussed. One had only to realize that an auxiliary quantity introduced by H. A. Lorentz and named by him "local time" could be defined as "time" in general. [6] If one adheres to this definition of time, the basic equations of Lorentz's theory correspond to the principle of relativity, provided that the above transformation equations are replaced by ones that correspond to the new conception of time. H. A. Lorentz's and FitzGerald's hypothesis appears then as a compelling consequence of the theory. Only the conception of a luminiferous ether as the carrier of the electric and magnetic forces does not fit into the theory described here; for electromagnetic forces appear here not as states of some substance, but rather as independently existing things that are similar to ponderable matter and share with it the feature of inertia. [7]
The following is an attempt to summarize the studies that have resulted to date from the merger of the H. A. Lorentz theory and the principle of relativity.

¹A. A. Michelson and E. W. Morley, Amer. J. of Science 34 (1887): 333.

The first two parts of the paper deal with the kinematic foundations as well as with their application to the fundamental equations of the Maxwell-Lorentz theory, and are based on the studies¹ by H. A. Lorentz (Versl. Kon. Akad. v. Wet., Amsterdam (1904)) and A. Einstein (Ann. d. Phys. 16 (1905)).
In the first section, in which only the kinematic foundations of the theory are applied, I also discuss some optical problems (Doppler's principle, aberration, dragging of light by moving bodies); I was made aware of the possibility of such a mode of treatment by an oral communication and a paper by Mr. M. Laue (Ann. d. Phys. 23 (1907): 989), as well as a paper (though in need of correction) by Mr. J. Laub (Ann. d. Phys. 32 (1907)).
In the third part I develop the dynamics of the material point (electron). In the derivation of the equations of motion I used the same method as in my paper cited earlier. Force is defined as in Planck's study. The reformulations of the equations of motion of material points, which so clearly demonstrate the analogy between these equations of motion and those of classical mechanics, are also taken from that study.
The fourth part deals with the general inferences regarding the energy and momentum of physical systems to which one is led by the theory of relativity. These have been developed in the original studies,
A. Einstein, Ann. d. Phys. 18 (1905): 639 and Ann. d. Phys. 23 (1907): 371, as well as M. Planck, Sitzungsber. d. Kgl. Preuss. Akad. d. Wissensch. XXIX (1907),
but are here derived in a new way, which, it seems to me, shows especially clearly the relationship between the above application and the foundations of the theory. I also discuss here the dependence of entropy and temperature on the state of motion; as far as entropy is concerned, I kept completely to the Planck study cited, and the temperature of moving bodies I defined as did Mr. Mosengeil in his study on moving black-body radiation.²
The most important result of the fourth part is that concerning the inertial mass of the energy. This result suggests the question whether energy also possesses heavy (gravitational) mass. A further question suggesting itself is whether the principle of relativity is limited to nonaccelerated moving systems. In order not to leave this question totally undiscussed, I added to the present paper a fifth part that contains a novel consideration, based on the principle of relativity, on acceleration and gravitation.

¹E. Cohn's studies on the subject are also pertinent, but I did not make use of them here.
²Kurd von Mosengeil, Ann. d. Phys. 22 (1907): 867.
V. PRINCIPLE OF RELATIVITY AND GRAVITATION

§17. Accelerated reference system and gravitational field

So far we have applied the principle of relativity, i.e., the assumption that the physical laws are independent of the state of motion of the reference system, only to nonaccelerated reference systems. Is it conceivable that the principle of relativity also applies to systems that are accelerated relative to each other?
While this is not the place for a detailed discussion of this question, it will occur to anybody who has been following the applications of the principle of relativity. Therefore I will not refrain from taking a stand on this question here.
We consider two systems $\Sigma_1$ and $\Sigma_2$ in motion. Let $\Sigma_1$ be accelerated [93] in the direction of its X-axis, and let $\gamma$ be the (temporally constant) magnitude of that acceleration. $\Sigma_2$ shall be at rest, but it shall be located in a homogeneous gravitational field that imparts to all objects an acceleration $-\gamma$ in the direction of the X-axis.
[94] As far as we know, the physical laws with respect to $\Sigma_1$ do not differ from those with respect to $\Sigma_2$; this is based on the fact that all bodies are equally accelerated in the gravitational field. At our present state of experience we have thus no reason to assume that the systems $\Sigma_1$ and $\Sigma_2$ differ from each other in any respect, and in the discussion that follows, we shall therefore assume the complete physical equivalence of a gravitational field and a corresponding acceleration of the reference system.
This assumption extends the principle of relativity to the uniformly accelerated translational motion of the reference system. The heuristic value of this assumption rests on the fact that it permits the replacement of a homogeneous gravitational field by a uniformly accelerated reference system, the latter case being to some extent accessible to theoretical treatment.

§18. Space and time in a uniformly accelerated reference system

We first consider a body whose individual material points, at a given time t of the nonaccelerated reference system S, possess no velocity relative to S, but a certain acceleration. What is the influence of this acceleration $\gamma$ on the shape of the body with respect to S?
If such an influence is present, it will consist of a constant-ratio dilatation in the direction of acceleration and possibly in the two directions perpendicular to it, since an effect of another kind is impossible for reasons of symmetry. The acceleration-caused dilatations (if such exist at all) must be even functions of $\gamma$; hence they can be neglected if one restricts oneself to the case in which $\gamma$ is so small that terms of the second or higher power in $\gamma$ may be neglected. Since we are going to restrict ourselves to that case, we do not have to assume that the acceleration has any influence on the shape of the body.
We now consider a reference system $\Sigma$ that is uniformly accelerated relative to the nonaccelerated system S in the direction of the latter's X-axis. The clocks and measuring rods of $\Sigma$, examined at rest, shall be identical with the clocks and measuring rods of S. The coordinate origin of $\Sigma$ shall move along the X-axis of S, and the axes of $\Sigma$ shall be perpetually parallel to those of S. At any moment there exists a nonaccelerated reference system S′ whose coordinate axes coincide with the coordinate axes of $\Sigma$ at the moment in question (at a given time t′ of S′). If the coordinates of a point event occurring at this time t′ are $\xi$, $\eta$, $\zeta$ with respect to $\Sigma$, we will have

$$x' = \xi, \quad y' = \eta, \quad z' = \zeta \qquad (29)$$

because in accordance with what we said above, we are not to assume that acceleration affects the shape of the measuring instruments used for measuring $\xi$, $\eta$, $\zeta$. We shall also imagine that the clocks of $\Sigma$ are set at time t′ of S′ such that their readings at that moment equal t′. What about the rate of the clocks in the next time element $\tau$?
First of all, we have to bear in mind that a specific effect of acceleration on the rate of the clocks of $\Sigma$ need not be taken into account, since it would have to be of the order $\gamma^2$. Furthermore, since the effect of the velocity attained during $\tau$ on the rate of the clocks is negligible, and the distances traveled by the clocks during the time $\tau$ relative to those traveled by S′ are also of the order $\tau^2$, i.e., negligible, the readings of the clocks of $\Sigma$ may be fully replaced by readings of the clocks of S′ for the time element $\tau$. [95]
From the foregoing it follows that, relative to $\Sigma$, light in vacuum is propagated during the time element $\tau$ with the universal velocity c if we define simultaneity in the system S′ which is momentarily at rest relative to $\Sigma$, and if the clocks and measuring rods we use for measuring the time and length are identical with those used for the measurement of time and space in nonaccelerated systems. Thus the principle of constancy of the velocity of light can be used here too to define simultaneity if one restricts oneself to very short light paths.
We now imagine that the clocks of $\Sigma$ are adjusted, in the way described, at that time t = 0 of S at which $\Sigma$ is instantaneously at rest relative to S. The totality of readings of the clocks of $\Sigma$ adjusted in [96] this way is called the "local time" $\sigma$ of the system $\Sigma$. It is immediately evident that the physical meaning of the local time $\sigma$ is as follows. If one uses the local time $\sigma$ for the temporal evaluation of processes occurring in the individual space elements of $\Sigma$, then the laws obeyed by these processes cannot depend on the position of these space elements, i.e., on their coordinates, if not only the clocks, but also the other measuring tools used in the [97] various space elements are identical.
However, we must not simply refer to the local time $\sigma$ as the "time" of $\Sigma$, because according to the definition given above, two point events occurring at different points of $\Sigma$ are not simultaneous when their local times $\sigma$ are equal. For if at time t = 0 two clocks of $\Sigma$ are synchronous with respect to S and are subjected to the same motions, then they remain forever synchronous with respect to S. However, for this reason, in accordance with §4, they do not run synchronously with respect to a reference system S′ instantaneously at rest relative to $\Sigma$ but in motion relative to S, and hence according to our definition they do not run synchronously with respect to $\Sigma$ either.
We now define the "time" $\tau$ of the system $\Sigma$ as the totality of those readings of the clock situated at the coordinate origin of $\Sigma$ which are, according to the above definition, simultaneous with the events which are to be temporally evaluated.¹
We s h a l l now determines t h e r e l a t i o n between t h e time T and t h e l o c a l
time u of a point event. It follows from t h e f i r s t of equations (1) t h a t

¹ Thus the symbol "$\tau$" is used here in a different sense than above.



DOC. 47 305

two events are simultaneous with respect to $S'$, and thus also with respect to
$\Sigma$, if

$$t_1 - \frac{v}{c^2}\,x_1 = t_2 - \frac{v}{c^2}\,x_2,$$

where the subscripts refer to the one or to the other point event, respectively.
We shall first confine ourselves to the consideration of times that
are so short¹ that all terms containing the second or higher power of $\tau$ or
$v$ can be omitted; taking (1) and (29) into account, we then have to put

$$x_2 - x_1 = x_2' - x_1' = \xi_2 - \xi_1,$$
$$t_1 = \sigma_1, \qquad t_2 = \sigma_2,$$
$$v = \gamma t = \gamma\tau,$$

so that we obtain from the above equation

$$\sigma_2 - \sigma_1 = \frac{\gamma\tau}{c^2}\,(\xi_2 - \xi_1).$$

If we move the first point event to the coordinate origin, so that $\sigma_1 = \tau$
and $\xi_1 = 0$, we obtain, omitting the subscript for the second point event,

$$\sigma = \tau\left[1 + \frac{\gamma\xi}{c^2}\right]. \tag{30}$$

This equation holds first of all if $\tau$ and $\xi$ lie below certain
limits. It is obvious that it holds for arbitrarily large $\tau$ if the acceleration
$\gamma$ is constant with respect to $\Sigma$, because the relation between $\sigma$ and
$\tau$ must then be linear. Equation (30) does not hold for arbitrarily large $\xi$.
From the fact that the choice of the coordinate origin must not affect the
relation, one must conclude that, strictly speaking, equation (30) should be
replaced by the equation

$$\sigma = \tau\,e^{\gamma\xi/c^2}.$$

Nevertheless, we shall maintain formula (30).
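As a rough numerical illustration of why the linear formula can be kept, the following sketch compares $\sigma = \tau[1 + \gamma\xi/c^2]$ with the exponential form $\sigma = \tau e^{\gamma\xi/c^2}$. The function names and the sample values (an acceleration of 9.81 m/s² and heights up to about one light-year) are assumptions chosen for illustration and do not come from the text.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def local_time_linear(tau, gamma, xi):
    """Linear relation: sigma = tau * (1 + gamma*xi/c^2)."""
    return tau * (1.0 + gamma * xi / C**2)


def local_time_exponential(tau, gamma, xi):
    """Origin-independent relation: sigma = tau * exp(gamma*xi/c^2)."""
    return tau * math.exp(gamma * xi / C**2)


def relative_gap(gamma, xi):
    """Relative difference between the two forms (it does not depend on tau)."""
    linear = local_time_linear(1.0, gamma, xi)
    exponential = local_time_exponential(1.0, gamma, xi)
    return (exponential - linear) / exponential


if __name__ == "__main__":
    gamma = 9.81  # assumed acceleration in m/s^2
    for xi in (1.0e3, 1.0e9, 9.46e15):  # assumed heights in metres
        print(f"xi = {xi:.3e} m, relative gap = {relative_gap(gamma, xi):.3e}")
```

For laboratory-scale heights the two forms agree to within floating-point rounding; they separate only when $\gamma\xi/c^2$ approaches unity. The exponential form also composes multiplicatively under a shift of the origin, which is the property the text appeals to.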
¹ In accordance with (1), we thereby also assume a certain restriction with
respect to the values of $\xi = x'$.

306 THE RELATIVITY PRINCIPLE

According to §17, equation (30) is also applicable to a coordinate
system in which a homogeneous gravitational field is acting. In that case we
have to put $\Phi = \gamma\xi$, where $\Phi$ is the gravitational potential, so that we
obtain

$$\sigma = \tau\left[1 + \frac{\Phi}{c^2}\right]. \tag{30a}$$
We have defined two kinds of times for $\Sigma$. Which of the two definitions
do we have to use in the various cases? Let us assume that at two locations
of different gravitational potentials ($\gamma\xi$) there exists one physical system
each, and we want to compare their physical quantities. To do this, the most
natural procedure might be as follows: First we take our measuring tools to
the first physical system and carry out our measurements there; then we take
our measuring tools to the second system to carry out the same measurement
here. If the two sets of measurements give the same results, we shall denote
the two physical systems as "equal." The measuring tools include a clock with
which we measure local times $\sigma$. From this it follows that to define the
physical quantities at some position of the gravitational field, it is natural
to use the time $\sigma$.
However, if we deal with a phenomenon in which objects situated at positions
with different gravitational potentials must be considered simultaneously,
we have to use the time $\tau$ in those terms in which time occurs
explicitly (i.e., not only in the definition of physical quantities), because
otherwise the simultaneity of the events would not be expressed by the equality
of the time values of the two events. Since in the definition of the time
$\tau$ a clock situated in an arbitrarily chosen position is used, but not an
arbitrarily chosen instant, when using time $\tau$ the laws of nature can vary
with position but not with time.

§19. The effect of the gravitational field on clocks

If a clock showing local time is located in a point $P$ of gravitational
potential $\Phi$, then, according to (30a), its reading will be $\left(1 + \frac{\Phi}{c^2}\right)$ times
greater than the time $\tau$, i.e., it runs $\left(1 + \frac{\Phi}{c^2}\right)$ times faster than an

identical clock located at the coordinate origin. Suppose an observer located
somewhere in space perceives the indications of the two clocks in a certain
way, e.g., optically. As the time $\Delta\tau$ that elapses between the instants at
which a clock indication occurs and at which this indication is perceived by
the observer is independent of $\tau$, for an observer situated somewhere in space
the clock in point $P$ runs $\left(1 + \frac{\Phi}{c^2}\right)$ times faster than the clock at the
coordinate origin. In this sense we may say that the process occurring in the
clock, and, more generally, any physical process, proceeds faster the greater
the gravitational potential at the position of the process taking place.
There exist "clocks" that are present at locations of different gravitational
potentials and whose rates can be controlled with great precision;
these are the producers of spectral lines. It can be concluded from the
aforesaid¹ that the wavelength of light coming from the sun's surface, which
originates from such a producer, is larger by about two parts in a million
than that of light produced by the same substance on earth.
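The size of the solar wavelength shift can be checked with a short calculation. The sketch below evaluates $\Phi/c^2$ for the sun's surface, taking $\Phi$ as the Newtonian potential difference $GM/R$ between the surface and a distant point. The constants are modern reference values and the function names are assumptions for illustration; the small contributions of the earth's own field and of the sun's potential at the earth's orbit are neglected.

```python
# Assumed modern reference values, not taken from the text.
GM_SUN = 1.32712440018e20  # gravitational parameter of the sun, m^3/s^2
R_SUN = 6.957e8            # solar radius, m
C = 299_792_458.0          # speed of light, m/s


def fractional_shift(potential_difference):
    """Fractional rate (or wavelength) change Phi/c^2 from equation (30a)."""
    return potential_difference / C**2


def solar_surface_shift():
    """Phi/c^2 between the sun's surface and a distant observer.

    Uses the Newtonian potential difference G*M/R and ignores the
    earth's own potential and the sun's potential at the earth's orbit.
    """
    return fractional_shift(GM_SUN / R_SUN)


if __name__ == "__main__":
    print(f"Phi/c^2 at the sun's surface: {solar_surface_shift():.3e}")
```

The result is close to two parts in a million.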

§20. The effect of gravitation on electromagnetic phenomena

If we refer an electromagnetic process at some point of time to a
nonaccelerated reference system $S'$ that is instantaneously at rest relative to
the reference system $\Sigma$ accelerated as above, then the following equations
will hold according to (5) and (6):

and

In accordance with the above, we may readily equate the $S'$-referred
quantities $\rho'$, $u'$, $X'$, $L'$, $x'$, etc., with the corresponding $\Sigma$-referred

¹ While assuming that equation (30a) holds for an inhomogeneous gravitational
field as well.

quantities $\rho$, $u$, $X$, $L$, $\xi$, etc., if we limit ourselves to an infinitesimally
short period¹ that is infinitesimally close to the time of relative rest of
$S'$ and $\Sigma$. Further, we have to replace $t'$ by the local time $\sigma$. However,
we must not simply put

because a point which is at rest relative to $\Sigma$, and to which equations
transformed to $\Sigma$ should refer, changes its velocity relative to $S'$ during
the time element $dt' = d\sigma$, to which change, according to equations (7a) and
(7b), there corresponds a temporal change of the $\Sigma$-related field component.
Hence we have to put

Hence the $\Sigma$-referred electromagnetic equations are

¹ This restriction does not affect the range of validity of our results because
inherently the laws to be derived cannot depend on the time.

We multiply these equations by $\left[1 + \frac{\gamma\xi}{c^2}\right]$ and put for the sake of brevity

Neglecting terms of the second power in $\gamma$, we obtain the equations

These equations show first of all how the gravitational field affects the
static and stationary phenomena. The same laws hold as in the gravitation-free
field, except that the field components $X$, etc. are replaced by
$X\left[1 + \frac{\gamma\xi}{c^2}\right]$, etc., and $\rho$ is replaced by $\rho\left[1 + \frac{\gamma\xi}{c^2}\right]$.
Furthermore, to follow the development of nonstationary states, we make
use of the time $\tau$ in the terms differentiated with respect to time as well
as in the definition of the velocity of electricity, i.e., we put according to
(30)

and

We thus obtain

and


These equations too have the same form as the corresponding equations of
the nonaccelerated or gravitation-free space; however, $c$ is here replaced by
the value

$$c\left[1 + \frac{\gamma\xi}{c^2}\right] = c\left[1 + \frac{\Phi}{c^2}\right].$$

From this it follows that those light rays that do not propagate along the
$\xi$-axis are bent by the gravitational field; it can easily be seen that the
change of direction amounts to $\frac{\gamma}{c^2}\sin\varphi$ per cm light path, where $\varphi$
denotes the angle between the direction of gravity and that of the light ray.
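To give a sense of scale for the bending, the sketch below evaluates the change of direction per centimetre of light path for the terrestrial field. It assumes the coefficient is $\gamma/c^2$, as follows from a light speed $c[1 + \gamma\xi/c^2]$ that varies linearly along the field direction; the function name and the sample value $\gamma = 981$ cm/s² are illustrative assumptions.

```python
import math

C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def bending_per_cm(gamma_cgs, phi):
    """Change of direction in radians per cm of light path.

    gamma_cgs: acceleration (or field strength) in cm/s^2.
    phi: angle in radians between the direction of gravity and the ray.
    Assumes the rate (gamma / c^2) * sin(phi).
    """
    return gamma_cgs / C_CM_PER_S**2 * math.sin(phi)


if __name__ == "__main__":
    # Horizontal ray in the earth's field (assumed gamma = 981 cm/s^2).
    rate = bending_per_cm(981.0, math.pi / 2)
    print(f"deflection rate: {rate:.3e} rad per cm")
```

A ray travelling along the field direction ($\varphi = 0$) is not deflected, and the terrestrial rate for a horizontal ray is of order $10^{-18}$ rad per cm, consistent with the remark that the terrestrial effect is too small to compare with experience.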
With the help of these equations and the equations relating the field
strength and the electric current of one point, which are known from the
optics of bodies at rest, we can calculate the effect of the gravitational
field on optical phenomena in bodies at rest. One has to bear in mind,
however, that the above-mentioned equations from the optics of bodies at rest
hold for the local time $\sigma$. Unfortunately, the effect of the terrestrial
gravitational field is so small according to our theory (because of the
smallness of $\frac{\gamma\xi}{c^2}$) that there is no prospect of a comparison of the results of
the theory with experience.

If we successively multiply equations (31a) and (32a) by $\frac{X^*}{4\pi}, \ldots, \frac{N^*}{4\pi}$
and integrate over infinite space, we obtain, using our earlier notation,

$\frac{\rho}{4\pi}(u_\xi X + u_\eta Y + u_\zeta Z)$ is the energy $\eta_\sigma$ supplied to the matter per unit
volume and unit local time $\sigma$ if this energy is measured by measuring tools
situated at the corresponding location. Hence, according to (30),

$\eta_\tau = \eta_\sigma\left[1 + \frac{\gamma\xi}{c^2}\right]$ is the (similarly measured) energy supplied to the matter per
unit volume and unit time $\tau$; $\frac{1}{8\pi}(X^2 + Y^2 + \cdots + N^2)$ is the electromagnetic
energy $\varepsilon$ per unit volume, measured the same way. If we take into account
that according to (30) we have to set $\frac{\partial}{\partial\sigma} = \left[1 - \frac{\gamma\xi}{c^2}\right]\frac{\partial}{\partial\tau}$, we obtain

This equation expresses the principle of conservation of energy and
contains a very remarkable result. An energy, or energy input, that, measured
locally, has the value $E = \varepsilon\,d\omega$ or $E = \eta\,d\omega\,d\tau$, respectively, contributes to
the energy integral, in addition to the value $E$ that corresponds to its
magnitude, also a value $\frac{E}{c^2}\gamma\xi = \frac{E}{c^2}\Phi$ that corresponds to its position.
Thus, to each energy $E$ in the gravitational field there corresponds an
energy of position that equals the potential energy of a "ponderable" mass of
magnitude $\frac{E}{c^2}$.
Thus the proposition derived in §11, that to an amount of energy $E$
there corresponds a mass of magnitude $\frac{E}{c^2}$, holds not only for the inertial but
also for the gravitational mass, if the assumption introduced in §17 is
correct.

(Received on 4 December 1907)


Chapter 2

Einstein's Deepest Insight and Its Early Impacts*

*A. Einstein, M. Grossmann, D. Hilbert, E. Cartan



Doc. 13
Outline of a Generalized Theory of Relativity and of a
Theory of Gravitation

I. Physical Part
by Albert Einstein

II. Mathematical Part


by Marcel Grossmann
[Teubner, Leipzig, 1913]

I
Physical Part

The theory expounded in what follows derives from the conviction that the
proportionality between the inertial and the gravitational mass of bodies is an exactly
valid law of nature that must already find expression in the very foundation of
theoretical physics. I already sought to give expression to this conviction in several
earlier papers by seeking to reduce the gravitational mass to the inertial mass;¹ this
endeavor led me to the hypothesis that, from a physical point of view, an (infinitesimally
extended, homogeneous) gravitational field can be completely replaced by a
state of acceleration of the reference system. This hypothesis can be expressed
pictorially in the following way: An observer enclosed in a box can in no way decide
whether the box is at rest in a static gravitational field, or whether it is in accelerated
motion, maintained by forces acting on the box, in a space that is free of gravitational
fields (equivalence hypothesis).
We know the fact that the law of proportionality of inertial and gravitational
mass is satisfied to an extraordinary degree of accuracy from the fundamentally
important investigation by Eötvös,² which is based on the following argument. A
body at rest on the surface of the Earth is acted upon by gravity as well as by the
centrifugal force resulting from Earth's rotation. The first of these forces is

¹ A. Einstein, Ann. d. Phys. 35 (1911): 898; 38 (1912): 355; 38 (1912): 443.

² B. Eötvös, Mathematische und naturwissenschaftliche Berichte aus Ungarn 8 (1890);
Wiedemann's Beiblätter 15 (1891): 688.

proportional to the gravitational mass, and the second to the inertial mass. Thus, the
direction of the resultant of these two forces, i.e., the direction of the apparent
gravitational force (direction of the plumb) would have to depend on the physical
nature of the body under consideration if the proportionality of the inertial and
gravitational mass were not satisfied. In that case the apparent gravitational forces
acting on parts of a heterogeneous rigid system would, in general, not merge into a
resultant; instead, in general, there would still be a torque associated with the
apparent gravitational forces that would have to make itself noticeable if the system
were suspended from a torsion-free thread. By having established the absence of such
torques with great care, Eötvös proved that, for the bodies that he investigated, the
relationship of the two masses was independent of the nature of the body to such a
degree of exactness that the relative difference in this relationship that might still
exist from one substance to another must be smaller than one twenty-millionth.
The decomposition of radioactive substances occurs with a release of such
significant quantities of energy that the change in the inertial mass of the system that
corresponds to that energy decrease according to the theory of relativity is not very
small relative to the total mass.³ In the case of the decay of radium, for example,
this decrease amounts to one ten-thousandth of the total mass. If these changes of the
inertial mass did not correspond to changes in the gravitational mass, then there
would have to be deviations of the inertial mass from the gravitational mass much
greater than those allowed by Eötvös's experiments. Hence it must be considered very
probable that the identity of the inertial and gravitational mass is exactly satisfied.
For these reasons it seems to me that the equivalence hypothesis, which asserts the
essential physical identity of the gravitational with the inertial mass, possesses a high
degree of probability.⁴

§3. The Significance of the Fundamental Tensor
of the $g_{\mu\nu}$ for the Measurement of Space and Time

From the foregoing, one can already infer that there cannot exist relationships
between the space-time coordinates $x_1, x_2, x_3, x_4$ and the results of measurements
obtainable by means of measuring rods and clocks that would be as simple as those
in the old relativity theory. With regard to time, this has already been found to be true in
the case of the static gravitational field.⁸ The question therefore arises, what is the

Cf. Part II, §1.

Cf. Part II, §1.
⁸ Cf., e.g., A. Einstein, Ann. d. Phys. 35 (1911): 903 ff.

physical meaning (measurability in principle) of the coordinates $x_1, x_2, x_3, x_4$.
We note in this connection that $ds$ is to be conceived as the invariant measure
of the distance between two infinitely close space-time points. For that reason, $ds$
must also possess a physical meaning that is independent of the chosen reference
system. We will assume that $ds$ is the naturally measured distance between the two
space-time points, and by this we will understand the following.
The immediate vicinity of the point $(x_1, x_2, x_3, x_4)$ with respect to the coordinate
system is determined by the infinitesimal variables $dx_1, dx_2, dx_3, dx_4$. We assume
that, in their place, new variables $d\xi_1, d\xi_2, d\xi_3, d\xi_4$ are introduced by means of a
linear transformation in such a way that
$$ds^2 = d\xi_1^2 + d\xi_2^2 + d\xi_3^2 - d\xi_4^2.$$
In this transformation the $g_{\mu\nu}$ are to be viewed as constants; the real cone $ds^2 = 0$
appears referred to its principal axes. Then the ordinary theory of relativity holds in
this elementary $d\xi$ system, and the physical meaning of lengths and times shall be
the same in this system as in the ordinary theory of relativity, i.e., $ds^2$ is the square of
the four-dimensional distance between two infinitely close space-time points,
measured by means of a rigid body that is not accelerated in the $d\xi$-system, and by
means of unit measuring rods and clocks at rest relative to it.
From this one sees that, for given $dx_1, dx_2, dx_3, dx_4$, the natural distance that
corresponds to these differentials can be determined only if one knows the quantities
$g_{\mu\nu}$ that determine the gravitational field. This can also be expressed in the
following way: the gravitational field influences the measuring bodies and clocks in
a determinate manner.
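The dependence of the natural distance on the $g_{\mu\nu}$ can be made concrete with a small sketch. It evaluates the quadratic form $ds^2 = \sum_{\mu\nu} g_{\mu\nu}\,dx_\mu\,dx_\nu$ for given coordinate differentials, using as an assumed example a static diagonal field with $g_{11} = g_{22} = g_{33} = -1$ and $g_{44} = c^2$ for a position-dependent $c$. The function names and numerical values are illustrative only.

```python
def line_element_sq(g, dx):
    """Return ds^2 = sum over mu, nu of g[mu][nu] * dx[mu] * dx[nu]."""
    return sum(g[m][n] * dx[m] * dx[n] for m in range(4) for n in range(4))


def static_diagonal_metric(c_local):
    """Assumed example field: g11 = g22 = g33 = -1, g44 = c_local**2."""
    g = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        g[i][i] = -1.0
    g[3][3] = c_local**2
    return g


if __name__ == "__main__":
    dx_clock = [0.0, 0.0, 0.0, 2.0]  # clock at rest: only dx4 differs from zero
    for c_local in (3.0, 3.3):       # two assumed values of c at two positions
        ds2 = line_element_sq(static_diagonal_metric(c_local), dx_clock)
        print(f"c = {c_local}: ds^2 = {ds2}")
```

The same coordinate differentials yield different natural intervals at the two positions, which is the sense in which the field "influences the measuring bodies and clocks."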
From the fundamental equation
$$ds^2 = \sum_{\mu\nu} g_{\mu\nu}\,dx_\mu\,dx_\nu$$
one sees that, in order to fix the physical dimensions of the quantities $g_{\mu\nu}$ and $x_\nu$, yet
another stipulation is required. The quantity $ds$ has the dimension of a length.
Likewise, we wish to view the $x_\nu$ ($x_4$ too) as lengths, and thus we do not ascribe any
physical dimension to the quantities $g_{\mu\nu}$.

§5. The Differential Equations of the Gravitational Field

Having established the momentum-energy equation for material processes (mechanical,
electrical, and other processes) in relation to the gravitational field, there remains
for us only the following task. Let the tensor $\Theta_{\mu\nu}$ for the material process be given.

What differential equations permit us to determine the quantities $g_{ik}$, i.e., the
gravitational field? In other words, we seek the generalization of Poisson's equation
$$\Delta\varphi = 4\pi k\rho.$$
We have not found a method for the solution of this problem as thoroughly
compelling as that for the solution of the problem discussed previously. It would be
necessary to introduce several assumptions whose correctness seems plausible but not
evident.
The generalization that we seek would likely have the form
$$\kappa\,\Theta_{\mu\nu} = \Gamma_{\mu\nu}, \tag{11}$$
where $\kappa$ is a constant and $\Gamma_{\mu\nu}$ a second-rank contravariant tensor derived from the
fundamental tensor $g_{\mu\nu}$ by differential operations. In line with the Newton-Poisson
law one would be inclined to require that these equations (11) be second order. But
it must be stressed that, given this assumption, it proves impossible to find a
differential expression $\Gamma_{\mu\nu}$ that is a generalization of $\Delta\varphi$ and that proves to be a
tensor with respect to arbitrary transformations.¹⁰ To be sure, it cannot be negated
a priori that the final, exact equations of gravitation could be of higher than second
order. Therefore there still exists the possibility that the perfectly exact differential
equations of gravitation could be covariant with respect to arbitrary substitutions.
But given the present state of our knowledge of the physical properties of the
gravitational field, the attempt to discuss such possibilities would be premature. For
that reason we have to confine ourselves to the second order, and we must therefore
forgo setting up gravitational equations that are covariant with respect to arbitrary
transformations. Besides, it should be emphasized that we have no basis whatsoever
for assuming a general covariance of the gravitational equations.¹¹
The Laplacian scalar $\Delta\varphi$ is obtained from the scalar $\varphi$ if one forms the
expansion (the gradient) of the latter and then the inner operator (the divergence) of
this. Both operations can be generalized in such a way that one can carry them out
on every tensor of arbitrarily high rank, namely while permitting arbitrary substitutions
of the basic variables.¹² But these operations degenerate if they are carried out
on the fundamental tensor $g_{\mu\nu}$.¹³ From this it seems to follow that the equations
sought will be covariant only with respect to a particular group of transformations,
which group, however, is as yet unknown to us.

¹⁰ Cf. Part II, §4, No. 2.
¹¹ Cf. also the arguments given at the beginning of §6.
¹² Part II, §2.
¹³ Cf. the remark on p. 28 in Part II, §2.

Given this state of affairs, and in view of the old theory of relativity, it seems
natural to assume that the transformation group we are seeking also includes the
linear transformations. Hence we require that $\Gamma_{\mu\nu}$ be a tensor with respect to
arbitrary linear transformations.
Now it is easy to prove (by carrying out the transformation) the following
theorems:
1. If $\Theta_{\alpha\beta\ldots\lambda}$ is a contravariant tensor of rank $n$ with respect to linear
transformations, then

is a contravariant tensor of rank $n + 1$ with respect to linear transformations
(expansion).¹⁴
2. If $\Theta_{\alpha\beta\ldots\lambda}$ is a contravariant tensor of rank $n$ with respect to linear
transformations, then

is a contravariant tensor of rank $n - 1$ with respect to linear transformations
(divergence).
If one carries out these two operations on a tensor in succession, one obtains a
tensor of the same rank as the original one (operation $\Delta$, carried out on a tensor).
For the fundamental tensor $\gamma_{\mu\nu}$ one obtains

One can also see from the following argument that this operator is related to the
Laplacian operator. In the theory of relativity (absence of gravitational field) one
would have to set
$$g_{11} = g_{22} = g_{33} = -1, \quad g_{44} = c^2, \quad g_{\mu\nu} = 0 \ \text{for}\ \mu \neq \nu;$$
hence
$$\gamma_{11} = \gamma_{22} = \gamma_{33} = -1, \quad \gamma_{44} = \frac{1}{c^2}, \quad \gamma_{\mu\nu} = 0 \ \text{for}\ \mu \neq \nu.$$
If a gravitational field is present that is sufficiently weak, i.e., if the $g_{\mu\nu}$ and $\gamma_{\mu\nu}$
differ only infinitesimally from the values just given, then one obtains instead of the
expression (a), neglecting the second-order terms,

¹⁴ $\gamma_{\mu\nu}$ is the contravariant tensor reciprocal to $g_{\mu\nu}$ (Part II, §1).

If the field is static and only $g_{44}$ is variable, we thus arrive at the case of the
Newtonian theory of gravitation if we take the expression obtained for the quantity
$\Gamma_{44}$ up to a constant.
Hence one might think that, up to a constant factor, the expression (a) must
already be the generalization of $\Delta\varphi$ that we are seeking. But this would be a
mistake; for alongside this expression, in a generalization of this kind there could also
appear terms that are themselves tensors and that vanish when we neglect the kinds
of terms just indicated. This always occurs when two first derivatives of the $g_{\mu\nu}$ or
$\gamma_{\mu\nu}$ are multiplied by each other. Thus, for example,

is a covariant tensor of the second rank (with respect to linear transformations); it
becomes infinitesimally small to the second order if the quantities $g_{\alpha\beta}$ and $\gamma_{\alpha\beta}$
deviate from constant values only infinitesimally to the first order. We must therefore
allow still other terms in $\Gamma_{\mu\nu}$, in addition to (a), which terms, for now, must satisfy
only the condition that, taken together, they must possess the character of a tensor
with respect to linear transformations.
We make use of the momentum-energy law to find these terms. To make myself
clear about the method used, I will first apply it to a generally known example.
In electrostatics $-\frac{\partial\varphi}{\partial x_\nu}\rho$ is the $\nu$th component of the momentum transferred to
the matter per unit volume, if $\varphi$ denotes the electrostatic potential and $\rho$ the electric
density. We seek a differential equation for $\varphi$ of such kind that the law of the
conservation of momentum is always satisfied. It is well known that the equation

solves the problem. The fact that the momentum law is satisfied follows from the
identity

Thus, if the momentum law is satisfied, then an identity of the following
construction must exist for every $\nu$: On the right side, $-\frac{\partial\varphi}{\partial x_\nu}$ is multiplied by the left
side of the differential equation; on the left side of the identity there is a sum of the
differential quotients.
If the differential equation for $\varphi$ were not yet known, the problem of finding it
would be reduced to that of finding this identity. What is essential for us to realize
is that this identity can be derived if one of the terms occurring in it is known. All
one has to do is to apply repeatedly the product differentiation rule in the forms
$$\frac{\partial}{\partial x_\nu}(uv) = \frac{\partial u}{\partial x_\nu}\,v + \frac{\partial v}{\partial x_\nu}\,u$$
and
$$u\,\frac{\partial v}{\partial x_\nu} = \frac{\partial}{\partial x_\nu}(uv) - \frac{\partial u}{\partial x_\nu}\,v,$$
and then finally to put the terms that are differential quotients on the left side and the
rest of the terms on the right side. For example, if one starts with the first term of
the above identity, one obtains, one after another,

from which we obtain the above identity upon rearrangement.
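The electrostatic identity described in words above can be checked numerically. The sketch below assumes the identity has the form $\sum_\mu \partial_\mu(\partial_\nu\varphi\,\partial_\mu\varphi) - \partial_\nu\bigl(\tfrac12\sum_\mu(\partial_\mu\varphi)^2\bigr) = \partial_\nu\varphi \sum_\mu \partial_\mu^2\varphi$, which is what repeated use of the product rule yields; the overall sign convention, the test potential, and all function names are assumptions made for illustration. Derivatives are approximated by central differences.

```python
import math

H = 1e-3  # step for central differences (assumed)


def phi(p):
    """Arbitrary smooth test potential in three dimensions (assumed)."""
    x, y, z = p
    return math.sin(x) * math.cos(2.0 * y) * math.exp(0.5 * z)


def partial(f, mu, p, h=H):
    """Central-difference approximation of df/dx_mu at point p."""
    lo, hi = list(p), list(p)
    lo[mu] -= h
    hi[mu] += h
    return (f(hi) - f(lo)) / (2.0 * h)


def dphi(mu):
    """Return the function p -> d(phi)/dx_mu."""
    return lambda p: partial(phi, mu, p)


def identity_sides(nu, point):
    """Return (left, right) of the assumed identity for component nu."""
    def flux(mu):
        return lambda q: dphi(nu)(q) * dphi(mu)(q)

    def half_grad_sq(q):
        return 0.5 * sum(dphi(mu)(q) ** 2 for mu in range(3))

    # Left side: a sum of differential quotients.
    left = sum(partial(flux(mu), mu, point) for mu in range(3))
    left -= partial(half_grad_sq, nu, point)
    # Right side: d(phi)/dx_nu times the Laplacian of phi.
    laplacian = sum(partial(dphi(mu), mu, point) for mu in range(3))
    right = dphi(nu)(point) * laplacian
    return left, right


if __name__ == "__main__":
    for nu in range(3):
        left, right = identity_sides(nu, (0.3, -0.7, 0.2))
        print(f"nu = {nu}: left = {left:.6f}, right = {right:.6f}")
```

The two sides agree to within the discretization error for every $\nu$, whatever smooth $\varphi$ is chosen, which is what makes the relation an identity rather than an equation for $\varphi$.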


Now we turn again to our problem. It follows from equation (10) that

($\sigma = 1, 2, 3, 4$)

is the momentum (or energy) imparted by the gravitational field to the matter per unit
volume. For the energy-momentum law to be satisfied, the differential expressions
$\Gamma_{\mu\nu}$ of the fundamental quantities $\gamma_{\mu\nu}$ that enter the gravitational equations
$$\kappa\,\Theta_{\mu\nu} = \Gamma_{\mu\nu}$$
must be chosen such that

can be rewritten in such a way that it appears as the sum of differential quotients.
On the other hand, we know that the term (a) appears in the expression sought for
$\Gamma_{\mu\nu}$. Hence the identity that is being sought has the following form:
Sum of differential quotients

+ the other terms, which vanish with the first approximation.



The identity that is being sought is thereby uniquely determined; if one
constructs it according to the procedure indicated,¹⁵ one obtains

Thus, the expression for $\Gamma_{\mu\nu}$ that is enclosed between the curly brackets on the
right-hand side is the tensor that is being sought that enters into the gravitational
equations
$$\kappa\,\Theta_{\mu\nu} = \Gamma_{\mu\nu}.$$
To make these equations more comprehensible, we introduce the following
abbreviations:

We will designate $\vartheta_{\mu\nu}$ as the contravariant stress-energy tensor of the
gravitational field. The covariant tensor reciprocal to it will be denoted by $t_{\mu\nu}$; then
we have

Likewise, for the sake of brevity, we introduce the following notations for
differential operations carried out on the fundamental tensors $\gamma$ and $g$:

and

¹⁵ Cf. Part II, §4, No. 8.



Each of these operators yields again a tensor of the same kind (with respect to linear
transformations).
With the application of these abbreviations the identity (12) assumes the form

or also

If we write the conservation law (10) for matter and the conservation law (12a)
for the gravitational field in the form

then one recognizes that the stress-energy tensor $\vartheta_{\mu\nu}$ of the gravitational field enters
the conservation law for the gravitational field in exactly the same way as the tensor
$\Theta_{\mu\nu}$ of the material process enters the conservation law for this process; this is a
noteworthy circumstance considering the difference in the derivation of the two laws.
From equation (12a) follows the expression for the differential tensor entering
into the gravitational equations
$$\Gamma_{\mu\nu} = \Delta_{\mu\nu}(\gamma) - \kappa\,\vartheta_{\mu\nu}. \tag{17}$$
Thus, the gravitational equations (11) are of the form
$$\Delta_{\mu\nu}(\gamma) = \kappa\,(\Theta_{\mu\nu} + \vartheta_{\mu\nu}). \tag{18}$$

These equations satisfy a requirement that, in our opinion, must be imposed on
a relativity theory of gravitation; that is to say, they show that the tensor $\vartheta_{\mu\nu}$ of the
gravitational field acts as a field generator in the same way as the tensor $\Theta_{\mu\nu}$ of the
material processes. An exceptional position of gravitational energy in comparison
with all other kinds of energies would lead to untenable consequences.
Adding equations (10) and (12a) while taking into account equation (18), one
finds

($\sigma = 1, 2, 3, 4$)

This shows that the conservation laws hold for the matter and the gravitational
field taken together.
In the foregoing we have given preference to the contravariant tensors, because
the contravariant stress-energy tensor of the flow of incoherent masses can be
expressed in an especially simple manner. However, we can express the fundamental
relations that we have obtained just as simply by using covariant tensors. Instead of
$\Theta_{\mu\nu}$, we must then take $T_{\mu\nu} = \sum_{\alpha\beta} g_{\mu\alpha}\,g_{\nu\beta}\,\Theta_{\alpha\beta}$ as the stress-energy tensor of the
material process. Instead of equation (10), we obtain through term-by-term
reformulation

It follows from this equation and equation (16) that the equations of the gravitational
field can also be written in the form
$$-D_{\mu\nu}(g) = \kappa\,(t_{\mu\nu} + T_{\mu\nu}); \tag{21}$$
these equations can also be derived directly from (18). The equation that corresponds
to (19) reads

II
Mathematical Part
by Marcel Grossmann

The mathematical tools for developing the vector analysis of a gravitational field,
which is characterized by the invariance of the line element
$$ds^2 = \sum_{\mu\nu} g_{\mu\nu}\,dx_\mu\,dx_\nu,$$
derive from Christoffel's fundamental paper¹ on the transformation of quadratic
differential forms. Taking Christoffel's results as their starting point, Ricci and
Levi-Civita² developed their methods of the absolute differential calculus (i.e., a
differential calculus that is independent of the coordinate system), which permit our
giving an invariant form to the differential equations of mathematical physics. But
since the vector analysis of a Euclidean space referred to arbitrary curvilinear
coordinates is formally identical with the vector analysis of an arbitrary manifold
specified by its line element, the extension of the vector-analytical conceptions that
Minkowski, Sommerfeld, Laue, et al. worked out for the theory of relativity in recent
years to the general theory of Einstein's expounded above does not present any
difficulty.
With some practice, the general vector analysis obtained in this way is as simple
to handle as the special vector analysis of three- or four-dimensional Euclidean space;
in fact, the greater generality of its conceptions lends it a clarity that is lacking often
enough in the special case.
The theory of special tensors (§3) has been treated to the full in a paper by
Kottler,³ published while this work was in progress; the treatment is based on the
theory of integral forms, something that is not possible in the general case.
Since more detailed mathematical investigations will have to be done in
connection with Einstein's theory of gravitation, and especially in connection with the
problem of the differential equations of the gravitational field, a systematic
presentation of the general vector analysis might be in order. I have purposely not
employed geometrical aids because, in my opinion, they contribute very little to an
intuitive understanding of the conceptions of vector analysis.

¹ Christoffel, Über die Transformation der homogenen Differentialausdrücke zweiten
Grades, J. f. Math. 70 (1869): 46.
² Ricci et Levi-Civita, Méthodes de calcul différentiel absolu et leurs applications,
Math. Ann. 54 (1901): 125.
³ Kottler, Über die Raumzeitlinien der Minkowskischen Welt, Wien. Ber. 121 (1912).

§4. Mathematical Supplements to the Physical Part

1. Proof of the Covariance of the Momentum-Energy Equations


It has to be proved that the equations (10) of Part I, page 10, which, neglecting the
factor m,read

are covariant with respect to arbitrary transformations.


According to formula (3.3, the divergence of the contravariant tensor $\Theta_{\mu\nu}$ is

The covariant vector TO reciprocal to this contravariant vector 0, is thus

The factor % serves to simplify the result but is inconsequential from the point of
view of the theory of invariants.

But the last term of this sum is equal to

Hence, we end up with

i.e., the left side of the investigated equation, up to the factor $\frac{1}{\sqrt{g}}$. Thus, if that
equation is divided by $\sqrt{g}$, then its left side represents the $\sigma$-component of a
covariant vector, and is, therefore, in fact, covariant. For that reason, the content of
those four equations can also be expressed thus:
The divergence of the (contravariant) stress-energy tensor of the material flow
or of the physical process vanishes.

2. Differential Tensors of a Manifold Given by Its Line Element

The problem of constructing the differential equations of a gravitational field (Part
I, §5) draws one's attention to the differential invariants and differential covariants
of the quadratic differential form

$ds^2 = \sum_{\mu\nu} g_{\mu\nu}\,dx_\mu\,dx_\nu .$

In the sense of our general vector analysis, the theory of these differential
covariants leads to the differential tensors that are given with a gravitational field.
The complete system of these differential tensors (with respect to arbitrary
transformations) goes back to a covariant differential tensor of fourth rank found by
Riemann¹² and, independently of him, by Christoffel,¹³ which we shall call the
Riemann differential tensor, and which reads as follows:

(43) $R_{iklm} = (ik, lm) =$

¹²Riemann, Ges. Werke, p. 270.
¹³Christoffel, l.c., p. 54.

By means of covariant algebraic and differential operations we obtain the
complete system of differential tensors (thus also the differential invariants) of the
manifold from the Riemann differential tensor and the discriminant tensor (§3,
formula 38).
The (ik, lm) are also called the Christoffel four-index symbols of the first kind.
In addition to these, of importance are also the four-index symbols of the second kind

which are related to the former in the following way:

(45) $\{i\rho, lm\} = \sum_k \gamma_{\rho k}\,(ik, lm)$, or, when solved, $(ik, lm) = \sum_\rho g_{k\rho}\,\{i\rho, lm\}$.

In general vector analysis, the four-index symbols of the second kind take on the
meaning of the components of a mixed tensor that is covariant of third rank and
contravariant of first rank.¹⁴
The extraordinary importance of these conceptions for the differential geometry¹⁵
of a manifold that is given by its line element makes it a priori probable that these
general differential tensors may also be of importance for the problem of the
differential equations of a gravitational field. To begin with, it is, in fact, possible
to specify a covariant differential tensor of second rank and second order $G_{im}$ that
could enter into those equations, namely,
(46) $G_{im} = \sum_{kl} \gamma_{kl}\,(ik, lm) = \sum_k \{ik, km\}$.
It turns out, however, that in the special case of the infinitely weak, static
gravitational field this tensor does not reduce to the expression Δφ. We must
therefore leave open the question to what extent the general theory of the differential
tensors associated with a gravitational field is connected with the problem of the
gravitational equations. Such a connection would have to exist insofar as the
gravitational equations are to permit arbitrary substitutions; but in that case, it seems
that it would be impossible to find second-order differential equations. On the other
hand, if it were established that the gravitational equations permit only a particular
group of transformations, then it would be understandable if one could not manage
with the differential tensors yielded by the general theory. As has been explained in
the physical part, we are not able to take a stand on these questions.

¹⁴This follows from the first of equations (45).
¹⁵The identical vanishing of the tensor $R_{iklm}$ constitutes a necessary and sufficient
condition for the differential form's being transformable to the form $\sum_i dx_i^2$.

3. On the Derivation of the Gravitational Equations

The derivation of the gravitational equations described by Einstein (Part I, §5) is
carried out step by step in the following way:
We start out from the term that is definitely to be expected in the energy balance,

(47)

and reformulate it by integrating by parts.¹⁶ In this way we obtain

The first sum on the right-hand side has the desired form of a sum of differential
quotients and shall be denoted by A, so that we have

We once again integrate by parts in the second sum on the right-hand side. The
identity will then take the form

The first of the sums obtained on the right-hand side can be written as a sum of
differentials and shall be denoted by

(49)

We differentiate in the second sum. Then we get

¹⁶The derivation of the identity we are seeking becomes simpler, without affecting the
result, if we put the factor $\sqrt{g}$ inside the differentiation sign.

or if we apply formula (29) of §2 in the second summand and integrate by parts in

The first two sums have the form of terms such as we place on the left side of
our identity. We denote them by

The third of the sums appearing on the right has the form of a sum of differential

quotients; if we eliminate ?!k.from it with the help of the above formula (29), this
2%
sum proves to be the quantity A that has already been introduced. Finally, we replace

?!& in the last sum in accord with the same formula. In this way we find
ax,

or

By virtue of (29), i.e., by virtue of

the first of these sums becomes



Since i is interchangeable with k, and μ with ν, we can write the second sum as

Hence, the identity sought reads


2U-V+W+2X=2A-B,
and is thus identical with the one given in Part I, §5.

Einstein and Marcel Grossmann in the garden of the Grossmann home in Thalwil, May 28, 1899.

Doc. 30
[p. 769] The Foundation of the General Theory of Relativity
by A. Einstein

[This first page was missing in the existing translation.]

The theory which is presented in the following pages conceivably constitutes the
farthest-reaching generalization of a theory which, today, is generally called the
theory of relativity; I will call the latter one, in order to distinguish it from the
first named, the special theory of relativity, which I assume to be known. The
generalization of the theory of relativity has been facilitated considerably by
Minkowski, a mathematician who was the first one to recognize the formal
equivalence of space coordinates and the time coordinate, and utilized this in the
construction of the theory. The mathematical tools that are necessary for general
relativity were readily available in the absolute differential calculus, which is based
upon the research on non-Euclidean manifolds by Gauss, Riemann, and Christoffel,
and which has been systematized by Ricci and Levi-Civita and has already been
applied to problems of theoretical physics. In section B of the present paper I
developed all the necessary mathematical tools, which cannot be assumed to be
known to every physicist, and I tried to do it in as simple and transparent a manner
as possible, so that a special study of the mathematical literature is not required for
the understanding of the present paper. Finally, I want to acknowledge gratefully my
friend, the mathematician Grossmann, whose help not only saved me the effort of
studying the pertinent mathematical literature, but who also helped me in my search
for the field equations of gravitation.

[The balance of this translation is reprinted from H. A. Lorentz et al., The
Principle of Relativity, trans. W. Perrett and G. B. Jeffery (Methuen, 1923; Dover
rpt., 1952).]
THE FOUNDATION OF THE GENERAL THEORY
OF RELATIVITY
BY A. EINSTEIN
A. FUNDAMENTAL CONSIDERATIONS ON THE POSTULATE OF
RELATIVITY
§ 1. Observations on the Special Theory of Relativity

THE special theory of relativity is based on the
following postulate, which is also satisfied by the
mechanics of Galileo and Newton.
If a system of co-ordinates K is chosen so that, in relation
to it, physical laws hold good in their simplest form,
the same laws also hold good in relation to any other system
of co-ordinates K' moving in uniform translation relatively
to K. This postulate we call the "special principle of
relativity." The word "special" is meant to intimate
that the principle is restricted to the case when K' has a
motion of uniform translation relatively to K, but that the
equivalence of K' and K does not extend to the case of non-uniform
motion of K' relatively to K.
Thus the special theory of relativity does not depart from
classical mechanics through the postulate of relativity, but
through the postulate of the constancy of the velocity of light
in vacuo, from which, in combination with the special principle
of relativity, there follow, in the well-known way, the
relativity of simultaneity, the Lorentzian transformation, and
the related laws for the behaviour of moving bodies and
clocks.
The modification to which the special theory of relativity
has subjected the theory of space and time is indeed far-
reaching, but one important point has remained unaffected.
For the laws of geometry, even according to the special theory


of relativity, are to be interpreted directly as laws relating to
the possible relative positions of solid bodies at rest; and, in
a more general way, the laws of kinematics are to be inter-
preted as laws which describe the relations of measuring
bodies and clocks. To two selected material points of a
stationary rigid body there always corresponds a distance of
quite definite length, which is independent of the locality and
orientation of the body, and is also independent of the time.
To two selected positions of the hands of a clock at rest
relatively to the privileged system of reference there always
corresponds an interval of time of a definite length, which is
independent of place and time. We shall soon see that the
general theory of relativity cannot adhere to this simple
physical interpretation of space and time.

§ 2. The Need for an Extension of the Postulate of Relativity
In classical mechanics, and no less in the special theory
of relativity, there is an inherent epistemological defect which
was, perhaps for the first time, clearly pointed out by Ernst
Mach. We will elucidate it by the following example :-Two
fluid bodies of the same size and nature hover freely in space
at so great a distance from each other and from all other
masses that only those gravitational forces need be taken into
account which arise from the interaction of different parts of
the same body. Let the distance between the two bodies be
invariable, and in neither of the bodies let there be any
relative movements of the parts with respect to one another.
But let either mass, as judged by an observer at rest
relatively to the other mass, rotate with constant angular
velocity about the line joining the masses. This is a verifi-
able relative motion of the two bodies. Now let us imagine
that each of the bodies has been surveyed by means of
measuring instruments at rest relatively to itself, and let the
surface of S1 prove to be a sphere, and that of S2 an ellipsoid
of revolution. Thereupon we put the question-What is the
reason for this difference in the two bodies ? No answer can
be admitted as epistemologically satisfactory,* unless the


reason given is an observable fact of experience. The law of
causality has not the significance of a statement as to the
world of experience, except when observable facts ultimately
appear as causes and effects.
Newtonian mechanics does not give a satisfactory answer
to this question. It pronounces as follows:-The laws of
mechanics apply to the space R1, in respect to which the body
S1 is at rest, but not to the space R2, in respect to which the
body S2 is at rest. But the privileged space R1 of Galileo,
thus introduced, is a merely factitious cause, and not a thing
that can be observed. It is therefore clear that Newton's
mechanics does not really satisfy the requirement of causality
in the case under consideration, but only apparently does so,
since it makes the factitious cause R1 responsible for the
observable difference in the bodies S1 and S2.
The only satisfactory answer must be that the physical
system consisting of S1 and S2 reveals within itself no imaginable
cause to which the differing behaviour of S1 and S2 can
be referred. The cause must therefore lie outside this system.
We have to take it that the general laws of motion, which in
particular determine the shapes of S1 and S2, must be such
that the mechanical behaviour of S1 and S2 is partly conditioned,
in quite essential respects, by distant masses which
we have not included in the system under consideration.
These distant masses and their motions relative to S1 and
S2 must then be regarded as the seat of the causes (which
must be susceptible to observation) of the different behaviour
of our two bodies S1 and S2. They take over the rôle of the
factitious cause R1. Of all imaginable spaces R1, R2, etc., in
any kind of motion relatively to one another, there is none
which we may look upon as privileged a priori without
reviving the above-mentioned epistemological objection. The
laws of physics must be of such a nature that they apply to
systems of reference in any kind of motion. Along this road
we arrive at an extension of the postulate of relativity.
* Of course an answer may be satisfactory from the point of view of epistemology,
and yet be unsound physically, if it is in conflict with other experiences.

In addition to this weighty argument from the theory of
knowledge, there is a well-known physical fact which favours
an extension of the theory of relativity. Let K be a Galilean
system of reference, i.e. a system relatively to which (at least
in the four-dimensional region under consideration) a mass,
sufficiently distant from other masses, is moving with uniform
motion in a straight line. Let K' be a second system of
reference which is moving relatively to K in uniformly
accelerated translation. Then, relatively to K', a mass
sufficiently distant from other masses would have an acceler-
ated motion such that its acceleration and direction of
acceleration are independent of the material composition and
physical state of the mass.
Does this permit an observer at rest relatively to K' to
infer that he is on a really accelerated system of reference?
The answer is in the negative; for the above-mentioned
relation of freely movable masses to K' may be interpreted
equally well in the following way. The system of reference
K' is unaccelerated, but the space-time territory in question
is under the sway of a gravitational field, which generates the
accelerated motion of the bodies relatively to K'.
This view is made possible for us by the teaching of
experience as to the existence of a field of force, namely, the
gravitational field, which possesses the remarkable property
of imparting the same acceleration to all bodies.* The
mechanical behaviour of bodies relatively to K' is the same
as presents itself to experience in the case of systems which
we are wont to regard as stationary or as privileged.
Therefore, from the physical standpoint, the assumption
readily suggests itself that the systems K and K' may both
with equal right be looked upon as stationary, that is to
say, they have an equal title as systems of reference for the
physical description of phenomena.
It will be seen from these reflexions that in pursuing the
general theory of relativity we shall be led to a theory of
gravitation, since we are able to produce a gravitational
field merely by changing the system of co-ordinates. It will
also be obvious that the principle of the constancy of the
velocity of light in vacuo must be modified, since we easily
recognize that the path of a ray of light with respect to K'
must in general be curvilinear, if with respect to K light is
propagated in a straight line with a definite constant velocity.
* Eötvös has proved experimentally that the gravitational field has this
property in great accuracy.

§ 3. The Space-Time Continuum. Requirement of General
Co-Variance for the Equations Expressing General
Laws of Nature
In classical mechanics, as well as in the special theory of
relativity, the co-ordinates of space and time have a direct
physical meaning. To say that a point-event has the X1 co-ordinate
x1 means that the projection of the point-event on the
axis of X1, determined by rigid rods and in accordance with the
rules of Euclidean geometry, is obtained by measuring off a
given rod (the unit of length) x1 times from the origin of co-ordinates
along the axis of X1. To say that a point-event
has the X4 co-ordinate x4 = t, means that a standard clock,
made to measure time in a definite unit period, and which is
stationary relatively to the system of co-ordinates and practically
coincident in space with the point-event,* will have
measured off x4 = t periods at the occurrence of the event.
This view of space and time has always been in the minds
of physicists, even if, as a rule, they have been unconscious
of it. This is clear from the part which these concepts play
in physical measurements ; it must also have underlain the
reader's reflexions on the preceding paragraph (§ 2) for
him to connect any meaning with what he there read. But
we shall now show that we must put it aside and replace it
by a more general view, in order to be able to carry through
the postulate of general relativity, if the special theory of
relativity applies to the special case of the absence of a gravitational
field.
* We assume the possibility of verifying simultaneity for events immediately
proximate in space, or, to speak more precisely, for immediate
proximity or coincidence in space-time, without giving a definition of this
fundamental concept.
In a space which is free of gravitational fields we introduce
a Galilean system of reference K (x, y, z, t), and also a system
of co-ordinates K' (x', y', z', t') in uniform rotation relatively
to K. Let the origins of both systems, as well as their axes
of Z, permanently coincide. We shall show that for a space-time
measurement in the system K' the above definition of
the physical meaning of lengths and times cannot be maintained.
For reasons of symmetry it is clear that a circle
around the origin in the X, Y plane of K may at the same
time be regarded as a circle in the X', Y' plane of K'. We
suppose that the circumference and diameter of this circle
have been measured with a unit measure infinitely small
compared with the radius, and that we have the quotient of
the two results. If this experiment were performed with a
measuring-rod at rest relatively to the Galilean system K, the
quotient would be π. With a measuring-rod at rest relatively
to K', the quotient would be greater than π. This is readily
understood if we envisage the whole process of measuring
from the "stationary" system K, and take into consideration
that the measuring-rod applied to the periphery undergoes
a Lorentzian contraction, while the one applied along the
radius does not. Hence Euclidean geometry does not apply
to K'. The notion of co-ordinates defined above, which presupposes
the validity of Euclidean geometry, therefore breaks
down in relation to the system K'. So, too, we are unable
to introduce a time corresponding to physical requirements
in K', indicated by clocks at rest relatively to K'. To
convince ourselves of this impossibility, let us imagine two
clocks of identical constitution placed, one at the origin of
co-ordinates, and the other at the circumference of the
circle, and both envisaged from the "stationary" system
K. By a familiar result of the special theory of relativity,
the clock at the circumference, judged from K, goes more
slowly than the other, because the former is in motion and
the latter at rest. An observer at the common origin of
co-ordinates, capable of observing the clock at the circum-
ference by means of light, would therefore see it lagging be-
hind the clock beside him. As he will not make up his mind
to let the velocity of light along the path in question depend
explicitly on the time, he will interpret his observations as
showing that the clock at the circumference really goes
more slowly than the clock at the origin. So he will be
obliged to define time in such a way that the rate of a clock
depends upon where the clock may be.
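The two thought experiments with the rotating system can be put into numbers. What follows is only a minimal sketch and not part of the original paper: it assumes the standard Lorentz factor of the special theory, takes the rim speed as judged from the Galilean system in units of the velocity of light, and uses function names that are hypothetical, chosen here for illustration.

```python
import math


def lorentz_factor(v: float) -> float:
    """Return 1 / sqrt(1 - v^2) for a speed v given in units of c."""
    if not 0.0 <= v < 1.0:
        raise ValueError("speed must satisfy 0 <= v < 1 (units of c)")
    return 1.0 / math.sqrt(1.0 - v * v)


def circumference_to_diameter(rim_speed: float) -> float:
    """Quotient found with measuring-rods at rest in the rotating system.

    Judged from the Galilean system, each rod laid along the periphery is
    contracted by 1/gamma, so gamma times as many rods fit along it; the
    rods laid along the radius are not contracted.
    """
    return math.pi * lorentz_factor(rim_speed)


def rim_clock_rate(rim_speed: float) -> float:
    """Rate of the clock at the circumference relative to the clock at the origin."""
    return 1.0 / lorentz_factor(rim_speed)


if __name__ == "__main__":
    for v in (0.0, 0.1, 0.6):
        print(v, circumference_to_diameter(v), rim_clock_rate(v))
```

At zero rim speed the quotient is π and both clocks run at the same rate; for any other rim speed the quotient exceeds π and the clock at the circumference runs slow, which is the qualitative content of the argument above.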
We therefore reach this result :-In the general theory of


relativity, space and time cannot be defined in such a way
that differences of the spatial co-ordinates can be directly
measured by the unit measuring-rod, or differences in the
time co-ordinate by a standard clock.
The method hitherto employed for laying co-ordinates
into the space-time continuum in a definite manner thus breaks
down, and there seems to be no other way which would allow
us to adapt systems of co-ordinates to the four-dimensional
universe so that we might expect from their application a
particularly simple formulation of the laws of nature. So
there is nothing for it but to regard all imaginable systems
of co-ordinates, on principle, as equally suitable for the
description of nature. This comes to requiring that :-
The general laws of nature are to be expressed by equations
which hold good for all systems of co-ordinates, that is, are
co-variant with respect to any substitutions whatever (generally
co-variant).
It is clear that a physical theory which satisfies this
postulate will also be suitable for the general postulate of
relativity. For the sum of all substitutions in any case in-
cludes those which correspond to all relative motions of three-
dimensional systems of co-ordinates. That this requirement
of general co-variance, which takes away from space and
time the last remnant of physical objectivity, is a natural
one, will be seen from the following reflexion. All our
space-time verifications invariably amount to a determination
of space-time coincidences. If, for example, events consisted
merely in the motion of material points, then ultimately
nothing would be observable but the meetings of two or more
of these points. Moreover, the results of our measurings are
nothing but verifications of such meetings of the material
points of our measuring instruments with other material
points, coincidences between the hands of a clock and points
on the clock dial, and observed point-events happening at the
same place at the same time.
The introduction of a system of reference serves no other
purpose than to facilitate the description of the totality of such
coincidences. We allot to the universe four space-time variables
x1, x2, x3, x4 in such a way that for every point-event
there is a corresponding system of values of the variables
x1 . . . x4. To two coincident point-events there corresponds
one system of values of the variables x1 . . . x4, i.e.
coincidence is characterized by the identity of the co-ordinates.
If, in place of the variables x1 . . . x4, we introduce functions
of them, x'1, x'2, x'3, x'4, as a new system of co-ordinates, so
that the systems of values are made to correspond to one
another without ambiguity, the equality of all four co-ordin-
ates in the new system will also serve as an expression for
the space-time coincidence of the two point-events. As all
our physical experience can be ultimately reduced to such
coincidences, there is no immediate reason for preferring
certain systems of co-ordinates to others, that is to say, we
arrive at the requirement of general co-variance.

§ 4. The Relation of the Four Co-ordinates to Measurement
in Space and Time
It is not my purpose in this discussion to represent the
general theory of relativity as a system that is as simple and
logical as possible, and with the minimum number of axioms ;
but my main object is to develop this theory in such a way
that the reader will feel that the path we have entered upon
is psychologically the natural one, and that the underlying
assumptions will seem to have the highest possible degree
of security. With this aim in view let it now be granted
that :-
For infinitely small four-dimensional regions the theory
of relativity in the restricted sense is appropriate, if the co-
ordinates are suitably chosen.
For this purpose we must choose the acceleration of the
infinitely small ("local") system of co-ordinates so that no
gravitational field occurs; this is possible for an infinitely
small region. Let X1, X2, X3 be the co-ordinates of space,
and X4 the appertaining co-ordinate of time measured in the
appropriate unit.* If a rigid rod is imagined to be given as
the unit measure, the co-ordinates, with a given orientation
of the system of co-ordinates, have a direct physical meaning
in the sense of the special theory of relativity. By the
special theory of relativity the expression
$ds^2 = -dX_1^2 - dX_2^2 - dX_3^2 + dX_4^2 \qquad (1)$
then has a value which is independent of the orientation of
the local system of co-ordinates, and is ascertainable by
measurements of space and time. The magnitude of the
linear element pertaining to points of the four-dimensional
continuum in infinite proximity, we call ds. If the ds belonging
to the element dX1 . . . dX4 is positive, we follow
Minkowski in calling it time-like; if it is negative, we call it
space-like.
* The unit of time is to be chosen so that the velocity of light in vacuo as
measured in the "local" system of co-ordinates is to be equal to unity.
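The sign rule for the local line element can be sketched in a few lines. This is an illustration only, under the sign convention of the text (three negative space terms, one positive time term, velocity of light equal to unity); the function names are hypothetical, and the label "null" for the borderline case of vanishing ds² is an addition made here, since the text names only the time-like and space-like cases.

```python
def interval_squared(dX1: float, dX2: float, dX3: float, dX4: float) -> float:
    """ds^2 = -dX1^2 - dX2^2 - dX3^2 + dX4^2 in the local system (c = 1)."""
    return -dX1 ** 2 - dX2 ** 2 - dX3 ** 2 + dX4 ** 2


def classify(dX1: float, dX2: float, dX3: float, dX4: float) -> str:
    """Name the element by the sign of ds^2 ("null" is not named in the text)."""
    ds2 = interval_squared(dX1, dX2, dX3, dX4)
    if ds2 > 0:
        return "time-like"
    if ds2 < 0:
        return "space-like"
    return "null"


if __name__ == "__main__":
    print(classify(0, 0, 0, 1))  # two events at the same place, different times
    print(classify(1, 0, 0, 0))  # two events at the same time, different places
    print(classify(1, 0, 0, 1))  # two events joined by a ray of light
```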
To the "linear element" in question, or to the two infinitely
proximate point-events, there will also correspond
definite differentials dx1 . . . dx4 of the four-dimensional
co-ordinates of any chosen system of reference. If this
system, as well as the "local" system, is given for the region
under consideration, the $dX_\nu$ will allow themselves to be
represented here by definite linear homogeneous expressions
of the $dx_\sigma$:
$dX_\nu = \sum_\sigma \alpha_{\nu\sigma}\,dx_\sigma \qquad (2)$

Inserting these expressions in (1), we obtain
$ds^2 = \sum_{\sigma\tau} g_{\sigma\tau}\,dx_\sigma\,dx_\tau \qquad (3)$
where the $g_{\sigma\tau}$ will be functions of the $x_\sigma$. These can no
longer be dependent on the orientation and the state of
motion of the "local" system of co-ordinates, for ds² is a
quantity ascertainable by rod-clock measurement of point-events
infinitely proximate in space-time, and defined independently
of any particular choice of co-ordinates. The $g_{\sigma\tau}$
are to be chosen here so that $g_{\sigma\tau} = g_{\tau\sigma}$; the summation is
to extend over all values of σ and τ, so that the sum consists
of 4 × 4 terms, of which twelve are equal in pairs.
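The passage from the local line element to the general one can be checked numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: it takes the local differentials as linear homogeneous expressions `dX[n] = sum_s alpha[n][s] * dx[s]`, forms `g[s][t] = sum_n ETA[n] * alpha[n][s] * alpha[n][t]` with the signs (-1, -1, -1, +1) of the local line element, and compares the two ways of evaluating ds². The coefficient array `ALPHA` and the differentials `DX` are hypothetical values chosen for the example.

```python
# Signs of the local line element: three space terms, one time term.
ETA = (-1.0, -1.0, -1.0, 1.0)


def metric_from_alpha(alpha):
    """g[s][t] = sum_n ETA[n] * alpha[n][s] * alpha[n][t]."""
    return [[sum(ETA[n] * alpha[n][s] * alpha[n][t] for n in range(4))
             for t in range(4)] for s in range(4)]


def ds2_local(alpha, dx):
    """Compute dX[n] = sum_s alpha[n][s] * dx[s], then the local ds^2."""
    dX = [sum(alpha[n][s] * dx[s] for s in range(4)) for n in range(4)]
    return sum(ETA[n] * dX[n] ** 2 for n in range(4))


def ds2_general(g, dx):
    """Double sum of g[s][t] * dx[s] * dx[t] over all 4 x 4 index pairs."""
    return sum(g[s][t] * dx[s] * dx[t] for s in range(4) for t in range(4))


# Hypothetical coefficients alpha[n][s]; any real 4 x 4 array would serve.
ALPHA = [[1.0, 0.5, 0.0, 0.2],
         [0.0, 2.0, 0.3, 0.0],
         [0.1, 0.0, 1.5, 0.0],
         [0.0, 0.4, 0.0, 3.0]]
DX = [0.01, -0.02, 0.03, 0.05]
G = metric_from_alpha(ALPHA)

if __name__ == "__main__":
    print(ds2_local(ALPHA, DX), ds2_general(G, DX))
```

The array `G` comes out symmetric, so of its sixteen entries the twelve off-diagonal ones are equal in pairs, leaving ten independent functions.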
The case of the ordinary theory of relativity arises out of
the case here considered, if it is possible, by reason of the
particular relations of the $g_{\sigma\tau}$ in a finite region, to choose the
system of reference in the finite region in such a way that
the $g_{\sigma\tau}$ assume the constant values
$$\begin{matrix} -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & +1 \end{matrix} \qquad (4)$$
We shall find hereafter that the choice of such co-ordinates
is, in general, not possible for a finite region.
From the considerations of § 2 and § 3 it follows that
the quantities $g_{\sigma\tau}$ are to be regarded from the physical standpoint
as the quantities which describe the gravitational
field in relation to the chosen system of reference. For, if
we now assume the special theory of relativity to apply to a
certain four-dimensional region with the co-ordinates properly
chosen, then the $g_{\sigma\tau}$ have the values given in (4). A free
material point then moves, relatively to this system, with
uniform motion in a straight line. Then if we introduce new
space-time co-ordinates x1, x2, x3, x4, by means of any substitution
we choose, the $g_{\sigma\tau}$ in this new system will no longer
be constants, but functions of space and time. At the same
time the motion of the free material point will present itself
in the new co-ordinates as a curvilinear non-uniform motion,
and the law of this motion will be independent of the nature
of the moving particle. We shall therefore interpret this
motion as a motion under the influence of a gravitational
field. We thus find the occurrence of a gravitational field
connected with a space-time variability of the $g_{\sigma\tau}$. So, too,
in the general case, when we are no longer able by a suitable
choice of co-ordinates to apply the special theory of relativity
to a finite region, we shall hold fast to the view that the $g_{\sigma\tau}$
describe the gravitational field.
Thus, according to the general theory of relativity, gravitation
occupies an exceptional position with regard to other
forces, particularly the electromagnetic forces, since the ten
functions representing the gravitational field at the same time
define the metrical properties of the space measured.

B. MATHEMATICAL AIDS TO THE FORMULATION OF
GENERALLY COVARIANT EQUATIONS
Having seen in the foregoing that the general postulate
of relativity leads to the requirement that the equations of
physics shall be covariant in the face of any substitution of
the co-ordinates x1 . . . x4, we have to consider how such
generally covariant equations can be found. We now turn
to this purely mathematical task, and we shall find that in its
solution a fundamental rôle is played by the invariant ds
given in equation (3), which, borrowing from Gauss's theory
of surfaces, we have called the "linear element."
The fundamental idea of this general theory of covariants
is the following:-Let certain things ("tensors") be defined
with respect to any system of co-ordinates by a number of
functions of the co-ordinates, called the components of
the tensor. There are then certain rules by which these
components can be calculated for a new system of co-ordin-
ates, if they are known for the original system of co-ordinates,
and if the transformation connecting the two systems is
known. The things hereafter called tensors are further
characterized by the fact that the equations of transformation
for their components are linear and homogeneous. Accord-
ingly, all the components in the new system vanish, if they
all vanish in the original system. If, therefore, a law of
nature is expressed by equating all the components of a tensor
to zero, it is generally covariant. By examining the laws
of the formation of tensors, we acquire the means of formu-
lating generally covariant laws.
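Why a linear and homogeneous law of transformation carries vanishing components into vanishing components can be seen in a small numerical sketch. It is offered only as an illustration: the helper `transform` and the coefficient array `JACOBIAN` are hypothetical, standing for the coefficients of some transformation at a single point.

```python
def transform(jacobian, components):
    """Linear homogeneous law: new[s] = sum_n jacobian[s][n] * old[n]."""
    return [sum(row[n] * components[n] for n in range(len(components)))
            for row in jacobian]


# Hypothetical coefficients of a transformation at one point.
JACOBIAN = [[2, 1, 0, 0],
            [1, 1, 0, 0],
            [0, 0, 1, 3],
            [0, 0, 0, 1]]

ZERO = [0, 0, 0, 0]
A = [1, 2, 3, 4]
B = [5, -1, 2, 7]

# Components that all vanish in the original system vanish in the new one.
zero_new = transform(JACOBIAN, ZERO)

# Linearity: transforming a sum gives the sum of the transformed components.
sum_new = transform(JACOBIAN, [a + b for a, b in zip(A, B)])
new_sum = [p + q for p, q in zip(transform(JACOBIAN, A), transform(JACOBIAN, B))]

if __name__ == "__main__":
    print(zero_new, sum_new, new_sum)
```

Because the law contains no term free of the components, no choice of coefficients can produce a non-zero result from components that all vanish; an equation stating that all components vanish therefore holds in every system if it holds in one.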

§ 5. Contravariant and Covariant Four-vectors

Contravariant Four-vectors.-The linear element is defined
by the four "components" $dx_\nu$, for which the law of
transformation is expressed by the equation
$dx'_\sigma = \sum_\nu \frac{\partial x'_\sigma}{\partial x_\nu}\,dx_\nu \qquad (5)$
The $dx'_\sigma$ are expressed as linear and homogeneous functions
of the $dx_\nu$. Hence we may look upon these co-ordinate differentials
as the components of a "tensor" of the particular
kind which we call a contravariant four-vector. Any thing
which is defined relatively to the system of co-ordinates by
four quantities $A^\nu$, and which is transformed by the same law
$A'^\sigma = \sum_\nu \frac{\partial x'_\sigma}{\partial x_\nu}\,A^\nu \qquad (5a)$
we also call a contravariant four-vector. From (5a) it
follows at once that the sums $A^\sigma \pm B^\sigma$ are also components
of a four-vector if $A^\sigma$ and $B^\sigma$ are such. Corresponding relations
hold for all "tensors" subsequently to be introduced.
(Rule for the addition and subtraction of tensors.)
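Covariant Four-vectors.-We call four quantities $A_\nu$ the
components of a covariant four-vector, if for any arbitrary
choice of the contravariant four-vector $B^\nu$
$\sum_\nu A_\nu B^\nu = \text{Invariant} \qquad (6)$
The law of transformation of a covariant four-vector follows
from this definition. For if we replace $B^\nu$ on the right-hand
side of the equation
$\sum_\sigma A'_\sigma B'^\sigma = \sum_\nu A_\nu B^\nu$
by the expression resulting from the inversion of (5a),
$\sum_\sigma \frac{\partial x_\nu}{\partial x'_\sigma}\,B'^\sigma ,$
we obtain
$\sum_\sigma B'^\sigma \sum_\nu \frac{\partial x_\nu}{\partial x'_\sigma}\,A_\nu = \sum_\sigma B'^\sigma A'_\sigma .$
Since this equation is true for arbitrary values of the $B'^\sigma$, it
follows that the law of transformation is
$A'_\sigma = \sum_\nu \frac{\partial x_\nu}{\partial x'_\sigma}\,A_\nu \qquad (7)$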
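The defining property of the covariant four-vector, that its sum of products with any contravariant four-vector is an invariant, can be verified with exact integer arithmetic. The sketch below is illustrative only. It assumes that the contravariant components transform with the coefficients `J[s][n]` (standing for the derivative of the new co-ordinate s with respect to the old co-ordinate n) and the covariant components with the inverse coefficients `J_INV[n][s]`; both arrays and all function names are hypothetical choices made for the example.

```python
# Hypothetical coefficients at one point: J[s][n] stands for dx'_s / dx_n,
# and J_INV[n][s] for dx_n / dx'_s, so that J_INV is the inverse of J.
J = [[2, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 3],
     [0, 0, 0, 1]]
J_INV = [[1, -1, 0, 0],
         [-1, 2, 0, 0],
         [0, 0, 1, -3],
         [0, 0, 0, 1]]


def to_new_contravariant(B):
    """Contravariant law: B'[s] = sum_n J[s][n] * B[n]."""
    return [sum(J[s][n] * B[n] for n in range(4)) for s in range(4)]


def to_new_covariant(A):
    """Covariant law: A'[s] = sum_n J_INV[n][s] * A[n]."""
    return [sum(J_INV[n][s] * A[n] for n in range(4)) for s in range(4)]


def contract(A, B):
    """Sum of products of a covariant with a contravariant four-vector."""
    return sum(a * b for a, b in zip(A, B))


A_OLD = [1, 2, 3, 4]    # covariant components in the original system
B_OLD = [5, -1, 2, 7]   # contravariant components in the original system

if __name__ == "__main__":
    print(contract(A_OLD, B_OLD))
    print(contract(to_new_covariant(A_OLD), to_new_contravariant(B_OLD)))
```

The two printed numbers agree because the coefficients of the covariant law undo those of the contravariant law inside the sum.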

Note on a Simplified Way of Writing the Expressions.-A glance
at the equations of this paragraph shows that there
is always a summation with respect to the indices which
occur twice under a sign of summation (e.g. the index ν in
(5)), and only with respect to indices which occur twice. It
is therefore possible, without loss of clearness, to omit the sign
of summation. In its place we introduce the convention:-
If an index occurs twice in one term of an expression, it is
always to be summed unless the contrary is expressly stated.
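As an illustration of the convention, the transformation law of a covariant four-vector may be written once with the sign of summation and once without it. The display below is a sketch in LaTeX notation added for clarity; the two lines are meant to state the same law.

```latex
% With the sign of summation written out:
A'_\sigma = \sum_{\nu} \frac{\partial x_\nu}{\partial x'_\sigma}\, A_\nu
% With the convention: the index \nu occurs twice in the term, so it is summed.
A'_\sigma = \frac{\partial x_\nu}{\partial x'_\sigma}\, A_\nu
```

The index σ occurs only once in each term and is therefore not summed; it labels the four separate equations.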
The difference between covariant and contravariant four-vectors
lies in the law of transformation ((7) or (5) respectively).
Both forms are tensors in the sense of the general remark
above. Therein lies their importance. Following Ricci and
Levi-Civita, we denote the contravariant character by placing
the index above, the covariant by placing it below.

§ 6. Tensors of the Second and Higher Ranks


Contravariant Tensors.-If we form all the sixteen products $A^{\mu\nu}$ of the components $A^\mu$ and $B^\nu$ of two contravariant four-vectors

$$A^{\mu\nu} = A^\mu B^\nu \qquad (8)$$

then by (8) and (5a) $A^{\mu\nu}$ satisfies the law of transformation

$$A'^{\sigma\tau} = \frac{\partial x'_\sigma}{\partial x_\mu}\frac{\partial x'_\tau}{\partial x_\nu} A^{\mu\nu} \qquad (9)$$

We call a thing which is described relatively to any system of reference by sixteen quantities, satisfying the law of transformation (9), a contravariant tensor of the second rank. Not every such tensor allows itself to be formed in accordance with (8) from two four-vectors, but it is easily shown that any given sixteen $A^{\mu\nu}$ can be represented as the sums of the $A^\mu B^\nu$ of four appropriately selected pairs of four-vectors. Hence we can prove nearly all the laws which apply to the tensor of the second rank defined by (9) in the simplest manner by demonstrating them for the special tensors of the type (8).
Contravariant Tensors of Any Rank.-It is clear that, on the lines of (8) and (9), contravariant tensors of the third and higher ranks may also be defined with $4^3$ components, and so on. In the same way it follows from (8) and (9) that the contravariant four-vector may be taken in this sense as a contravariant tensor of the first rank.
Covariant Tensors.-On the other hand, if we take the sixteen products $A_{\mu\nu}$ of two covariant four-vectors $A_\mu$ and $B_\nu$,

$$A_{\mu\nu} = A_\mu B_\nu \qquad (10)$$

the law of transformation for these is

$$A'_{\sigma\tau} = \frac{\partial x_\mu}{\partial x'_\sigma}\frac{\partial x_\nu}{\partial x'_\tau} A_{\mu\nu} \qquad (11)$$

This law of transformation defines the covariant tensor of the second rank. All our previous remarks on contravariant tensors apply equally to covariant tensors.
NOTE.-It is convenient to treat the scalar (or invariant) both as a contravariant and a covariant tensor of zero rank.
Mixed Tensors.-We may also define a tensor of the second rank of the type

$$A^{\nu}_{\mu} = A_\mu B^\nu \qquad (12)$$

which is covariant with respect to the index $\mu$, and contravariant with respect to the index $\nu$. Its law of transformation is

$$A'^{\tau}_{\sigma} = \frac{\partial x'_\tau}{\partial x_\nu}\frac{\partial x_\mu}{\partial x'_\sigma} A^{\nu}_{\mu} \qquad (13)$$

Naturally there are mixed tensors with any number of indices of covariant character, and any number of indices of contravariant character. Covariant and contravariant tensors may be looked upon as special cases of mixed tensors.
Symmetrical Tensors.-A contravariant, or a covariant tensor, of the second or higher rank is said to be symmetrical if two components, which are obtained the one from the other by the interchange of two indices, are equal. The tensor $A^{\mu\nu}$, or the tensor $A_{\mu\nu}$, is thus symmetrical if for any combination of the indices $\mu$, $\nu$,

$$A^{\mu\nu} = A^{\nu\mu} \qquad (14)$$

or respectively,

$$A_{\mu\nu} = A_{\nu\mu} \qquad (14a)$$

It has to be proved that the symmetry thus defined is a property which is independent of the system of reference. It follows in fact from (9), when (14) is taken into consideration, that

$$A'^{\sigma\tau} = \frac{\partial x'_\sigma}{\partial x_\mu}\frac{\partial x'_\tau}{\partial x_\nu} A^{\mu\nu} = \frac{\partial x'_\sigma}{\partial x_\mu}\frac{\partial x'_\tau}{\partial x_\nu} A^{\nu\mu} = \frac{\partial x'_\sigma}{\partial x_\nu}\frac{\partial x'_\tau}{\partial x_\mu} A^{\mu\nu} = A'^{\tau\sigma}.$$

The last equation but one depends upon the interchange of the summation indices $\mu$ and $\nu$, i.e. merely on a change of notation.
Antisymmetrical Tensors.-A contravariant or a covariant tensor of the second, third, or fourth rank is said to be antisymmetrical if two components, which are obtained the one from the other by the interchange of two indices, are equal and of opposite sign. The tensor $A^{\mu\nu}$, or the tensor $A_{\mu\nu}$, is therefore antisymmetrical, if always

$$A^{\mu\nu} = -A^{\nu\mu} \qquad (15)$$

or respectively,

$$A_{\mu\nu} = -A_{\nu\mu} \qquad (15a)$$

Of the sixteen components $A^{\mu\nu}$, the four components $A^{\mu\mu}$ vanish; the rest are equal and of opposite sign in pairs, so that there are only six components numerically different (a six-vector). Similarly we see that the antisymmetrical tensor of the third rank $A^{\mu\nu\sigma}$ has only four numerically different components, while the antisymmetrical tensor $A^{\mu\nu\sigma\tau}$ has only one. There are no antisymmetrical tensors of higher rank than the fourth in a continuum of four dimensions.
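The counts six, four, one, and none can be reproduced by a short enumeration, offered here as an illustration that is not part of the original text. The sketch below assumes only that an antisymmetrical tensor is fixed by its components with strictly increasing indices, since a repeated index gives zero and every permutation merely changes the sign; the function name is a hypothetical choice.

```python
from itertools import combinations

DIM = 4  # number of dimensions of the continuum

def independent_components(rank, dim=DIM):
    # Each strictly increasing index combination labels one numerically
    # different component of an antisymmetrical tensor of the given rank.
    return len(list(combinations(range(dim), rank)))

counts = {rank: independent_components(rank) for rank in range(2, 6)}
```

For rank five there is no way to choose five distinct indices out of four, which is why the enumeration returns zero there.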
§ 7. Multiplication of Tensors
Outer Multiplication of Tensors.-We obtain from the components of a tensor of rank $n$ and of a tensor of rank $m$ the components of a tensor of rank $n + m$ by multiplying each component of the one tensor by each component of the other. Thus, for example, the tensors $T$ arise out of the tensors $A$ and $B$ of different kinds,

$$T_{\mu\nu\sigma} = A_{\mu\nu}B_\sigma,$$
$$T^{\mu\nu\sigma\tau} = A^{\mu\nu}B^{\sigma\tau},$$
$$T^{\sigma\tau}_{\mu\nu} = A_{\mu\nu}B^{\sigma\tau}.$$

The proof of the tensor character of $T$ is given directly by the representations (8), (10), (12), or by the laws of transformation (9), (11), (13). The equations (8), (10), (12) are themselves examples of outer multiplication of tensors of the first rank.
"Contraction" of a Mixed Tensor.-From any mixed tensor we may form a tensor whose rank is less by two, by equating an index of covariant with one of contravariant character, and summing with respect to this index ("contraction"). Thus, for example, from the mixed tensor of the fourth rank $A^{\sigma\tau}_{\mu\nu}$ we obtain the mixed tensor of the second rank

$$A^{\tau}_{\nu} = A^{\mu\tau}_{\mu\nu} \quad \left(= \sum_\mu A^{\mu\tau}_{\mu\nu}\right),$$

and from this, by a second contraction, the tensor of zero rank,

$$A = A^{\nu}_{\nu} = A^{\mu\nu}_{\mu\nu}.$$

The proof that the result of contraction really possesses the tensor character is given either by the representation of a tensor according to the generalization of (12) in combination with (6), or by the generalization of (13).
Inner and Mixed Multiplication of Tensors.-These consist in a combination of outer multiplication with contraction.
Examples.-From the covariant tensor of the second rank $A_{\mu\nu}$ and the contravariant tensor of the first rank $B^\sigma$ we form by outer multiplication the mixed tensor

$$D^{\sigma}_{\mu\nu} = A_{\mu\nu}B^\sigma.$$

On contraction with respect to the indices $\nu$ and $\sigma$, we obtain the covariant four-vector

$$D_\mu = D^{\nu}_{\mu\nu} = A_{\mu\nu}B^\nu.$$

This we call the inner product of the tensors $A_{\mu\nu}$ and $B^\sigma$. Analogously we form from the tensors $A_{\mu\nu}$ and $B^{\sigma\tau}$, by outer multiplication and double contraction, the inner product $A_{\mu\nu}B^{\mu\nu}$. By outer multiplication and one contraction, we obtain from $A_{\mu\nu}$ and $B^{\sigma\tau}$ the mixed tensor of the second rank $D^{\tau}_{\mu} = A_{\mu\nu}B^{\nu\tau}$. This operation may be aptly characterized as a mixed one, being "outer" with respect to the indices $\mu$ and $\tau$, and "inner" with respect to the indices $\nu$ and $\sigma$.
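The statement that inner multiplication is outer multiplication followed by contraction can be made concrete with a few lines of Python. This is only an illustrative sketch, not part of the original text; the component values of `A` and `B` are arbitrary sample numbers, and nested lists stand in for the index positions.

```python
N = 4  # number of dimensions

# Arbitrary sample components A_{mu nu} and B^sigma (assumed values).
A = [[mu * N + nu + 1 for nu in range(N)] for mu in range(N)]
B = [2, -1, 0, 3]

# Outer multiplication: D[mu][nu][sigma] = A_{mu nu} B^sigma
D = [[[A[mu][nu] * B[s] for s in range(N)] for nu in range(N)] for mu in range(N)]

# Contraction with respect to nu and sigma: set both equal and sum.
D_contracted = [sum(D[mu][k][k] for k in range(N)) for mu in range(N)]

# Inner product written directly: D_mu = A_{mu nu} B^nu
D_direct = [sum(A[mu][nu] * B[nu] for nu in range(N)) for mu in range(N)]
```

Both routes give the same four numbers, so the rank-three object `D` never needs to be stored in practice; it is built here only to mirror the two-step description.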
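We now prove a proposition which is often useful as evidence of tensor character. From what has just been explained, $A_{\mu\nu}B^{\mu\nu}$ is a scalar if $A_{\mu\nu}$ and $B^{\sigma\tau}$ are tensors. But we may also make the following assertion: If $A_{\mu\nu}B^{\mu\nu}$ is a scalar for any choice of the tensor $B^{\mu\nu}$, then $A_{\mu\nu}$ has tensor character. For, by hypothesis, for any substitution,

$$A'_{\sigma\tau}B'^{\sigma\tau} = A_{\mu\nu}B^{\mu\nu}.$$

But by an inversion of (9)

$$B^{\mu\nu} = \frac{\partial x_\mu}{\partial x'_\sigma}\frac{\partial x_\nu}{\partial x'_\tau} B'^{\sigma\tau}.$$

This, inserted in the above equation, gives

$$\left(A'_{\sigma\tau} - \frac{\partial x_\mu}{\partial x'_\sigma}\frac{\partial x_\nu}{\partial x'_\tau} A_{\mu\nu}\right) B'^{\sigma\tau} = 0.$$

This can only be satisfied for arbitrary values of $B'^{\sigma\tau}$ if the bracket vanishes. The result then follows by equation (11). This rule applies correspondingly to tensors of any rank and character, and the proof is analogous in all cases.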
The rule may also be demonstrated in this form: If $B^\mu$ and $C^\nu$ are any vectors, and if, for all values of these, the inner product $A_{\mu\nu}B^\mu C^\nu$ is a scalar, then $A_{\mu\nu}$ is a covariant tensor. This latter proposition also holds good even if only the more special assertion is correct, that with any choice of the four-vector $B^\mu$ the inner product $A_{\mu\nu}B^\mu B^\nu$ is a scalar, if in addition it is known that $A_{\mu\nu}$ satisfies the condition of symmetry $A_{\mu\nu} = A_{\nu\mu}$. For by the method given above we prove the tensor character of $(A_{\mu\nu} + A_{\nu\mu})$, and from this the tensor character of $A_{\mu\nu}$ follows on account of symmetry. This also can be easily generalized to the case of covariant and contravariant tensors of any rank.
Finally, there follows from what has been proved, this law, which may also be generalized for any tensors: If for any choice of the four-vector $B^\nu$ the quantities $A_{\mu\nu}B^\nu$ form a tensor of the first rank, then $A_{\mu\nu}$ is a tensor of the second rank. For, if $C^\mu$ is any four-vector, then on account of the tensor character of $A_{\mu\nu}B^\nu$, the inner product $A_{\mu\nu}B^\nu C^\mu$ is a scalar for any choice of the two four-vectors $B^\nu$ and $C^\mu$. From which the proposition follows.

§ 8. Some Aspects of the Fundamental Tensor $g_{\mu\nu}$


The Covariant Fundamental Tensor.-In the invariant expression for the square of the linear element,

$$ds^2 = g_{\mu\nu}dx_\mu dx_\nu,$$

the part played by the $dx_\mu$ is that of a contravariant vector which may be chosen at will. Since further, $g_{\mu\nu} = g_{\nu\mu}$, it follows from the considerations of the preceding paragraph that $g_{\mu\nu}$ is a covariant tensor of the second rank. We call it the "fundamental tensor." In what follows we deduce some properties of this tensor which, it is true, apply to any tensor of the second rank. But as the fundamental tensor plays a special part in our theory, which has its physical basis in the peculiar effects of gravitation, it so happens that the relations to be developed are of importance to us only in the case of the fundamental tensor.
The Contravariant Fundamental Tensor.-If in the determinant formed by the elements $g_{\mu\nu}$ we take the co-factor of each of the $g_{\mu\nu}$ and divide it by the determinant $g = |g_{\mu\nu}|$, we obtain certain quantities $g^{\mu\nu}$ ($= g^{\nu\mu}$) which, as we shall demonstrate, form a contravariant tensor.
By a known property of determinants

$$g_{\mu\sigma}g^{\nu\sigma} = \delta^{\nu}_{\mu} \qquad (16)$$

where the symbol $\delta^{\nu}_{\mu}$ denotes 1 or 0, according as $\mu = \nu$ or $\mu \neq \nu$.
Instead of the above expression for $ds^2$ we may thus write

$$g_{\mu\sigma}\delta^{\sigma}_{\nu}dx_\mu dx_\nu$$

or, by (16)

$$g_{\mu\sigma}g_{\nu\tau}g^{\sigma\tau}dx_\mu dx_\nu.$$

But, by the multiplication rules of the preceding paragraphs, the quantities

$$d\xi_\sigma = g_{\mu\sigma}dx_\mu$$

form a covariant four-vector, and in fact an arbitrary vector, since the $dx_\mu$ are arbitrary. By introducing this into our expression we obtain

$$ds^2 = g^{\sigma\tau}d\xi_\sigma d\xi_\tau.$$

Since this, with the arbitrary choice of the vector $d\xi_\sigma$, is a scalar, and $g^{\sigma\tau}$ by its definition is symmetrical in the indices $\sigma$ and $\tau$, it follows from the results of the preceding paragraph that $g^{\sigma\tau}$ is a contravariant tensor.
It further follows from (16) that $\delta^{\nu}_{\mu}$ is also a tensor, which we may call the mixed fundamental tensor.
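The Determinant of the Fundamental Tensor.-By the rule for the multiplication of determinants

$$|g_{\mu\alpha}g^{\alpha\nu}| = |g_{\mu\alpha}| \times |g^{\alpha\nu}|.$$

On the other hand

$$|g_{\mu\alpha}g^{\alpha\nu}| = |\delta^{\nu}_{\mu}| = 1.$$

It therefore follows that

$$|g_{\mu\nu}| \times |g^{\mu\nu}| = 1 \qquad (17)$$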
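The co-factor construction of the contravariant fundamental tensor, the relation (16), and the product rule for the two determinants can be checked on a concrete example. The sketch below is an illustration only and not part of the original text; the sample components of `g_down` are assumed values chosen to be symmetrical with a negative determinant, and exact fractions avoid any rounding.

```python
from fractions import Fraction as F

def minor(m, i, j):
    # Matrix m with row i and column j removed.
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    # Determinant by Laplace expansion along the first row.
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def contravariant(g):
    # Co-factor of each g_{mu nu}, divided by the determinant g.
    d = det(g)
    n = len(g)
    return [[(-1) ** (mu + nu) * det(minor(g, mu, nu)) / d for nu in range(n)]
            for mu in range(n)]

# Hypothetical symmetrical g_{mu nu} (sample values only).
g_down = [[F(-1), F(0), F(0), F(1, 2)],
          [F(0), F(-1), F(0), F(0)],
          [F(0), F(0), F(-1), F(0)],
          [F(1, 2), F(0), F(0), F(1)]]
g_up = contravariant(g_down)

# Relation (16): g_{mu sigma} g^{nu sigma} = delta^nu_mu
delta = [[sum(g_down[mu][s] * g_up[nu][s] for s in range(4)) for nu in range(4)]
         for mu in range(4)]
product_of_determinants = det(g_down) * det(g_up)
```

The recipe "co-factor divided by determinant" yields the inverse directly here because the sample matrix is symmetrical; for an unsymmetrical matrix the co-factor array would have to be transposed first.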

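The Volume Scalar.-We seek first the law of transformation of the determinant $g = |g_{\mu\nu}|$. In accordance with (11)

$$g' = \left|\frac{\partial x_\mu}{\partial x'_\sigma}\frac{\partial x_\nu}{\partial x'_\tau} g_{\mu\nu}\right|.$$

Hence, by a double application of the rule for the multiplication of determinants, it follows that

$$g' = \left|\frac{\partial x_\mu}{\partial x'_\sigma}\right| \cdot \left|\frac{\partial x_\nu}{\partial x'_\tau}\right| \cdot |g_{\mu\nu}| = \left|\frac{\partial x_\mu}{\partial x'_\sigma}\right|^2 g,$$

or

$$\sqrt{g'} = \left|\frac{\partial x_\mu}{\partial x'_\sigma}\right| \sqrt{g}.$$

On the other hand, the law of transformation of the element of volume

$$d\tau = \int dx_1\, dx_2\, dx_3\, dx_4$$

is, in accordance with the theorem of Jacobi,

$$d\tau' = \left|\frac{\partial x'_\sigma}{\partial x_\mu}\right| d\tau.$$

By multiplication of the last two equations, we obtain

$$\sqrt{g'}\, d\tau' = \sqrt{g}\, d\tau \qquad (18)$$

Instead of $\sqrt{g}$ we introduce in what follows the quantity $\sqrt{-g}$, which is always real on account of the hyperbolic character of the space-time continuum. The invariant $\sqrt{-g}\, d\tau$ is equal to the magnitude of the four-dimensional element of volume in the "local" system of reference, as measured with rigid rods and clocks in the sense of the special theory of relativity.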
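The transformation law of the determinant can be tried out on a familiar case. The following sketch is an illustration that is not part of the original text: it assumes the Euclidean plane instead of the four-dimensional continuum, so the determinant is positive and $\sqrt{g}$ takes the place of $\sqrt{-g}$, and it passes from Cartesian co-ordinates to polar co-ordinates $(r, \theta)$ at one sample point.

```python
import math

def jacobian(r, theta):
    # J[mu][sigma] = d x_mu / d x'_sigma for x = (r cos(theta), r sin(theta)), x' = (r, theta)
    return [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def transformed_metric(g, J):
    # Law (11): g'_{sigma tau} = (d x_mu / d x'_sigma)(d x_nu / d x'_tau) g_{mu nu}
    return [[sum(J[mu][s] * J[nu][t] * g[mu][nu] for mu in range(2) for nu in range(2))
             for t in range(2)] for s in range(2)]

g_cartesian = [[1.0, 0.0], [0.0, 1.0]]   # assumed flat metric in Cartesian co-ordinates
J = jacobian(2.0, 0.3)                   # sample point r = 2, theta = 0.3
g_polar = transformed_metric(g_cartesian, J)
sqrt_g_polar = math.sqrt(det2(g_polar))
jacobian_det = det2(J)
```

At this point the square root of the transformed determinant equals the functional determinant times the original $\sqrt{g} = 1$, both being equal to $r$; this is the familiar area element $r\, dr\, d\theta$.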
Note on the Character of the Space-time Continuum.-Our assumption that the special theory of relativity can always be applied to an infinitely small region, implies that $ds^2$ can always be expressed in accordance with (1) by means of real quantities $dX_1 \ldots dX_4$. If we denote by $d\tau_0$ the "natural" element of volume $dX_1, dX_2, dX_3, dX_4$, then

$$d\tau_0 = \sqrt{-g}\, d\tau \qquad (18a)$$

If $\sqrt{-g}$ were to vanish at a point of the four-dimensional continuum, it would mean that at this point an infinitely small "natural" volume would correspond to a finite volume in the co-ordinates. Let us assume that this is never the case. Then $g$ cannot change sign. We will assume that, in the sense of the special theory of relativity, $g$ always has a finite negative value. This is a hypothesis as to the physical nature of the continuum under consideration, and at the same time a convention as to the choice of co-ordinates.
But if $-g$ is always finite and positive, it is natural to settle the choice of co-ordinates a posteriori in such a way that this quantity is always equal to unity. We shall see later that by such a restriction of the choice of co-ordinates it is possible to achieve an important simplification of the laws of nature.
In place of (18), we then have simply $d\tau' = d\tau$, from which, in view of Jacobi's theorem, it follows that

$$\left|\frac{\partial x'_\sigma}{\partial x_\mu}\right| = 1 \qquad (19)$$

Thus, with this choice of co-ordinates, only substitutions for which the determinant is unity are permissible.
But it would be erroneous to believe that this step indicates a partial abandonment of the general postulate of relativity. We do not ask "What are the laws of nature which are covariant in face of all substitutions for which the determinant is unity?" but our question is "What are the generally covariant laws of nature?" It is not until we have formulated these that we simplify their expression by a particular choice of the system of reference.
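The Formation of New Tensors by Means of the Fundamental Tensor.-Inner, outer, and mixed multiplication of a tensor by the fundamental tensor give tensors of different character and rank. For example,

$$A^\mu = g^{\mu\sigma}A_\sigma,$$
$$A = g_{\mu\nu}A^{\mu\nu}.$$

The following forms may be specially noted:

$$A^{\mu\nu} = g^{\mu\alpha}g^{\nu\beta}A_{\alpha\beta},$$
$$A_{\mu\nu} = g_{\mu\alpha}g_{\nu\beta}A^{\alpha\beta}$$

(the "complements" of covariant and contravariant tensors respectively), and

$$B_{\mu\nu} = g_{\mu\nu}g^{\alpha\beta}A_{\alpha\beta}.$$

We call $B_{\mu\nu}$ the reduced tensor associated with $A_{\mu\nu}$. Similarly,

$$B^{\mu\nu} = g^{\mu\nu}g_{\alpha\beta}A^{\alpha\beta}.$$

It may be noted that $g^{\mu\nu}$ is nothing more than the complement of $g_{\mu\nu}$, since

$$g^{\mu\alpha}g^{\nu\beta}g_{\alpha\beta} = g^{\mu\alpha}\delta^{\nu}_{\alpha} = g^{\mu\nu}.$$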

§ 9. The Equation of the Geodetic Line. The Motion of a Particle
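As the linear element $ds$ is defined independently of the system of co-ordinates, the line drawn between two points $P$ and $P'$ of the four-dimensional continuum in such a way that $\int ds$ is stationary (a geodetic line) has a meaning which also is independent of the choice of co-ordinates. Its equation is

$$\delta \int_{P}^{P'} ds = 0 \qquad (20)$$

Carrying out the variation in the usual way, we obtain from this equation four differential equations which define the geodetic line; this operation will be inserted here for the sake of completeness. Let $\lambda$ be a function of the co-ordinates $x_\nu$, and let this define a family of surfaces which intersect the required geodetic line as well as all the lines in immediate proximity to it which are drawn through the points $P$ and $P'$. Any such line may then be supposed to be given by expressing its co-ordinates $x_\nu$ as functions of $\lambda$. Let the symbol $\delta$ indicate the transition from a point of the required geodetic to the point corresponding to the same $\lambda$ on a neighbouring line. Then for (20) we may substitute

$$\int_{\lambda_1}^{\lambda_2} \delta w\, d\lambda = 0, \qquad w^2 = g_{\mu\nu}\frac{dx_\mu}{d\lambda}\frac{dx_\nu}{d\lambda} \qquad (20a)$$

But since

$$\delta w = \frac{1}{w}\left\{\frac{1}{2}\frac{\partial g_{\mu\nu}}{\partial x_\sigma}\frac{dx_\mu}{d\lambda}\frac{dx_\nu}{d\lambda}\,\delta x_\sigma + g_{\mu\nu}\frac{dx_\mu}{d\lambda}\,\delta\!\left(\frac{dx_\nu}{d\lambda}\right)\right\}$$

and

$$\delta\!\left(\frac{dx_\nu}{d\lambda}\right) = \frac{d}{d\lambda}(\delta x_\nu),$$

we obtain from (20a), after a partial integration,

$$\int_{\lambda_1}^{\lambda_2} \kappa_\sigma\, \delta x_\sigma\, d\lambda = 0,$$

where

$$\kappa_\sigma = \frac{d}{d\lambda}\left\{\frac{g_{\mu\sigma}}{w}\frac{dx_\mu}{d\lambda}\right\} - \frac{1}{2w}\frac{\partial g_{\mu\nu}}{\partial x_\sigma}\frac{dx_\mu}{d\lambda}\frac{dx_\nu}{d\lambda} \qquad (20b)$$

Since the values of $\delta x_\sigma$ are arbitrary, it follows from this that

$$\kappa_\sigma = 0 \qquad (20c)$$

are the equations of the geodetic line.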
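If $ds$ does not vanish along the geodetic line we may choose the "length of the arc" $s$, measured along the geodetic line, for the parameter $\lambda$. Then $w = 1$, and in place of (20c) we obtain

$$g_{\mu\sigma}\frac{d^2x_\mu}{ds^2} + \frac{\partial g_{\mu\sigma}}{\partial x_\nu}\frac{dx_\mu}{ds}\frac{dx_\nu}{ds} - \frac{1}{2}\frac{\partial g_{\mu\nu}}{\partial x_\sigma}\frac{dx_\mu}{ds}\frac{dx_\nu}{ds} = 0$$

or, by a mere change of notation,

$$g_{\alpha\sigma}\frac{d^2x_\alpha}{ds^2} + [\mu\nu, \sigma]\frac{dx_\mu}{ds}\frac{dx_\nu}{ds} = 0 \qquad (20d)$$

where, following Christoffel, we have written

$$[\mu\nu, \sigma] = \frac{1}{2}\left(\frac{\partial g_{\mu\sigma}}{\partial x_\nu} + \frac{\partial g_{\nu\sigma}}{\partial x_\mu} - \frac{\partial g_{\mu\nu}}{\partial x_\sigma}\right) \qquad (21)$$

Finally, if we multiply (20d) by $g^{\sigma\tau}$ (outer multiplication with respect to $\tau$, inner with respect to $\sigma$), we obtain the equations of the geodetic line in the form

$$\frac{d^2x_\tau}{ds^2} + \{\mu\nu, \tau\}\frac{dx_\mu}{ds}\frac{dx_\nu}{ds} = 0 \qquad (22)$$

where, following Christoffel, we have set

$$\{\mu\nu, \tau\} = g^{\tau\alpha}[\mu\nu, \alpha] \qquad (23)$$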
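Christoffel's two symbols can be evaluated numerically for a simple case. The sketch below is an illustration that is not part of the original text: it assumes the Euclidean plane in polar co-ordinates $(r, \theta)$ with $g_{11} = 1$, $g_{22} = r^2$, forms the bracket symbol from half the sum and difference of the three first derivatives of the $g_{\mu\nu}$, and raises the last index with the contravariant tensor. Derivatives are taken by central differences, so the results are approximate; all function names are hypothetical.

```python
def metric(x):
    # Assumed example: Euclidean plane in polar co-ordinates x = (r, theta).
    r, theta = x
    return [[1.0, 0.0], [0.0, r * r]]

def inverse_metric(x):
    r, theta = x
    return [[1.0, 0.0], [0.0, 1.0 / (r * r)]]

def d_metric(x, k, h=1e-5):
    # Central difference of every g_{mu nu} with respect to x_k.
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    return [[(gp[m][n] - gm[m][n]) / (2 * h) for n in range(2)] for m in range(2)]

def christoffel_first(x, mu, nu, sigma):
    # [mu nu, sigma] = 1/2 (d g_{mu sigma}/d x_nu + d g_{nu sigma}/d x_mu - d g_{mu nu}/d x_sigma)
    dg = [d_metric(x, k) for k in range(2)]  # dg[k][m][n] = d g_{mn} / d x_k
    return 0.5 * (dg[nu][mu][sigma] + dg[mu][nu][sigma] - dg[sigma][mu][nu])

def christoffel_second(x, mu, nu, tau):
    # {mu nu, tau} = g^{tau alpha} [mu nu, alpha]
    g_up = inverse_metric(x)
    return sum(g_up[tau][a] * christoffel_first(x, mu, nu, a) for a in range(2))

point = (2.0, 0.3)  # sample point r = 2, theta = 0.3
gamma_theta_theta_r = christoffel_second(point, 1, 1, 0)   # expected close to -r
gamma_r_theta_theta = christoffel_second(point, 0, 1, 1)   # expected close to 1/r
```

The non-vanishing symbols here arise purely from the curvilinear co-ordinates; the plane itself is flat, which is why a straight line still satisfies the geodetic equation in these co-ordinates.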


§ 10. The Formation of Tensors by Differentiation
With the help of the equation of the geodetic line we can now easily deduce the laws by which new tensors can be formed from old by differentiation. By this means we are able for the first time to formulate generally covariant differential equations. We reach this goal by repeated application of the following simple law:
If in our continuum a curve is given, the points of which are specified by the arcual distance $s$ measured from a fixed point on the curve, and if, further, $\phi$ is an invariant function of space, then $d\phi/ds$ is also an invariant. The proof lies in this, that $ds$ is an invariant as well as $d\phi$.
As

$$d\phi = \frac{\partial\phi}{\partial x_\mu}dx_\mu$$

therefore

$$\psi = \frac{\partial\phi}{\partial x_\mu}\frac{dx_\mu}{ds}$$

is also an invariant, and an invariant for all curves starting from a point of the continuum, that is, for any choice of the vector $dx_\mu$. Hence it immediately follows that

$$A_\mu = \frac{\partial\phi}{\partial x_\mu} \qquad (24)$$

is a covariant four-vector, the "gradient" of $\phi$.
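According to our rule, the differential quotient

$$\chi = \frac{d\psi}{ds}$$

taken on a curve, is similarly an invariant. Inserting the value of $\psi$, we obtain in the first place

$$\chi = \frac{\partial^2\phi}{\partial x_\mu \partial x_\nu}\frac{dx_\mu}{ds}\frac{dx_\nu}{ds} + \frac{\partial\phi}{\partial x_\mu}\frac{d^2x_\mu}{ds^2}.$$

The existence of a tensor cannot be deduced from this forthwith. But if we may take the curve along which we have differentiated to be a geodetic, we obtain on substitution for $d^2x_\nu/ds^2$ from (22),

$$\chi = \left(\frac{\partial^2\phi}{\partial x_\mu \partial x_\nu} - \{\mu\nu, \tau\}\frac{\partial\phi}{\partial x_\tau}\right)\frac{dx_\mu}{ds}\frac{dx_\nu}{ds}.$$

Since we may interchange the order of the differentiations, and since by (23) and (21) $\{\mu\nu, \tau\}$ is symmetrical in $\mu$ and $\nu$, it follows that the expression in brackets is symmetrical in $\mu$ and $\nu$. Since a geodetic line can be drawn in any direction from a point of the continuum, and therefore $dx_\mu/ds$ is a four-vector with the ratio of its components arbitrary, it follows from the results of § 7 that

$$A_{\mu\nu} = \frac{\partial^2\phi}{\partial x_\mu \partial x_\nu} - \{\mu\nu, \tau\}\frac{\partial\phi}{\partial x_\tau} \qquad (25)$$

is a covariant tensor of the second rank. We have therefore come to this result: from the covariant tensor of the first rank

$$A_\mu = \frac{\partial\phi}{\partial x_\mu}$$

we can, by differentiation, form a covariant tensor of the second rank

$$A_{\mu\nu} = \frac{\partial A_\mu}{\partial x_\nu} - \{\mu\nu, \tau\}A_\tau \qquad (26)$$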

We call the tensor $A_{\mu\nu}$ the "extension" (covariant derivative) of the tensor $A_\mu$. In the first place we can readily show that the operation leads to a tensor, even if the vector $A_\mu$ cannot be represented as a gradient. To see this, we first observe that

$$\psi\frac{\partial\phi}{\partial x_\mu}$$

is a covariant vector, if $\psi$ and $\phi$ are scalars. The sum of four such terms

$$S_\mu = \psi^{(1)}\frac{\partial\phi^{(1)}}{\partial x_\mu} + \cdots + \psi^{(4)}\frac{\partial\phi^{(4)}}{\partial x_\mu}$$

is also a covariant vector, if $\psi^{(1)}, \phi^{(1)}, \ldots, \psi^{(4)}, \phi^{(4)}$ are scalars. But it is clear that any covariant vector can be represented in the form $S_\mu$. For, if $A_\mu$ is a vector whose components are any given functions of the $x_\nu$, we have only to put (in terms of the selected system of co-ordinates)

$$\psi^{(1)} = A_1, \quad \phi^{(1)} = x_1,$$
$$\psi^{(2)} = A_2, \quad \phi^{(2)} = x_2,$$
$$\psi^{(3)} = A_3, \quad \phi^{(3)} = x_3,$$
$$\psi^{(4)} = A_4, \quad \phi^{(4)} = x_4,$$

in order to ensure that $S_\mu$ shall be equal to $A_\mu$.
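Therefore, in order to demonstrate that $A_{\mu\nu}$ is a tensor if any covariant vector is inserted on the right-hand side for $A_\mu$, we only need show that this is so for the vector $S_\mu$. But for this latter purpose it is sufficient, as a glance at the right-hand side of (26) teaches us, to furnish the proof for the case

$$A_\mu = \psi\frac{\partial\phi}{\partial x_\mu}.$$

Now the right-hand side of (25) multiplied by $\psi$,

$$\psi\frac{\partial^2\phi}{\partial x_\mu \partial x_\nu} - \{\mu\nu, \tau\}\psi\frac{\partial\phi}{\partial x_\tau},$$

is a tensor. Similarly

$$\frac{\partial\psi}{\partial x_\nu}\frac{\partial\phi}{\partial x_\mu},$$

being the outer product of two vectors, is a tensor. By addition, there follows the tensor character of

$$\frac{\partial}{\partial x_\nu}\left(\psi\frac{\partial\phi}{\partial x_\mu}\right) - \{\mu\nu, \tau\}\left(\psi\frac{\partial\phi}{\partial x_\tau}\right).$$

As a glance at (26) will show, this completes the demonstration for the vector

$$\psi\frac{\partial\phi}{\partial x_\mu}$$

and consequently, from what has already been proved, for any vector $A_\mu$.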
By means of the extension of the vector, we may easily define the "extension" of a covariant tensor of any rank. This operation is a generalization of the extension of a vector. We restrict ourselves to the case of a tensor of the second rank, since this suffices to give a clear idea of the law of formation.
As has already been observed, any covariant tensor of the second rank can be represented * as the sum of tensors of the type $A_\mu B_\nu$. It will therefore be sufficient to deduce the expression for the extension of a tensor of this special type. By (26) the expressions

$$\frac{\partial A_\mu}{\partial x_\sigma} - \{\sigma\mu, \tau\}A_\tau,$$
$$\frac{\partial B_\nu}{\partial x_\sigma} - \{\sigma\nu, \tau\}B_\tau,$$

are tensors. On outer multiplication of the first by $B_\nu$, and of the second by $A_\mu$, we obtain in each case a tensor of the third rank. By adding these, we have the tensor of the third rank

$$A_{\mu\nu\sigma} = \frac{\partial A_{\mu\nu}}{\partial x_\sigma} - \{\sigma\mu, \tau\}A_{\tau\nu} - \{\sigma\nu, \tau\}A_{\mu\tau} \qquad (27)$$

where we have put $A_{\mu\nu} = A_\mu B_\nu$. As the right-hand side of (27) is linear and homogeneous in the $A_{\mu\nu}$ and their first derivatives, this law of formation leads to a tensor, not only in the case of a tensor of the type $A_\mu B_\nu$, but also in the case of a sum of such tensors, i.e. in the case of any covariant tensor of the second rank. We call $A_{\mu\nu\sigma}$ the extension of the tensor $A_{\mu\nu}$.
It is clear that (26) and (24) concern only special cases of extension (the extension of the tensors of rank one and zero respectively).
In general, all special laws of formation of tensors are included in (27) in combination with the multiplication of tensors.

* By outer multiplication of the vector with arbitrary components $A_{11}, A_{12}, A_{13}, A_{14}$ by the vector with components $1, 0, 0, 0$, we produce a tensor with components

$$\begin{matrix} A_{11} & A_{12} & A_{13} & A_{14} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{matrix}$$

By the addition of four tensors of this type, we obtain the tensor $A_{\mu\nu}$ with any assigned components.

§ 11. Some Cases of Special Importance


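The Fundamental Tensor.-We will first prove some lemmas which will be useful hereafter. By the rule for the differentiation of determinants

$$dg = g^{\mu\nu}g\, dg_{\mu\nu} = -g_{\mu\nu}g\, dg^{\mu\nu} \qquad (28)$$

The last member is obtained from the last but one, if we bear in mind that $g_{\mu\nu}g^{\mu'\nu} = \delta^{\mu'}_{\mu}$, so that $g_{\mu\nu}g^{\mu\nu} = 4$, and consequently

$$g_{\mu\nu}dg^{\mu\nu} + g^{\mu\nu}dg_{\mu\nu} = 0.$$

From (28), it follows that

$$\frac{1}{\sqrt{-g}}\frac{\partial\sqrt{-g}}{\partial x_\sigma} = \frac{1}{2}\frac{\partial \log(-g)}{\partial x_\sigma} = \frac{1}{2}g^{\mu\nu}\frac{\partial g_{\mu\nu}}{\partial x_\sigma} = -\frac{1}{2}g_{\mu\nu}\frac{\partial g^{\mu\nu}}{\partial x_\sigma} \qquad (29)$$

Further, from $g_{\mu\sigma}g^{\nu\sigma} = \delta^{\nu}_{\mu}$, it follows on differentiation that

$$g_{\mu\sigma}dg^{\nu\sigma} = -g^{\nu\sigma}dg_{\mu\sigma}, \qquad g_{\mu\sigma}\frac{\partial g^{\nu\sigma}}{\partial x_\lambda} = -g^{\nu\sigma}\frac{\partial g_{\mu\sigma}}{\partial x_\lambda} \qquad (30)$$

From these, by mixed multiplication by $g^{\sigma\tau}$ and $g_{\nu\lambda}$ respectively, and a change of notation for the indices, we have

$$dg^{\mu\nu} = -g^{\mu\alpha}g^{\nu\beta}dg_{\alpha\beta}, \qquad \frac{\partial g^{\mu\nu}}{\partial x_\sigma} = -g^{\mu\alpha}g^{\nu\beta}\frac{\partial g_{\alpha\beta}}{\partial x_\sigma} \qquad (31)$$

and

$$dg_{\mu\nu} = -g_{\mu\alpha}g_{\nu\beta}dg^{\alpha\beta}, \qquad \frac{\partial g_{\mu\nu}}{\partial x_\sigma} = -g_{\mu\alpha}g_{\nu\beta}\frac{\partial g^{\alpha\beta}}{\partial x_\sigma} \qquad (32)$$

The relation (31) admits of a transformation, of which we also have frequently to make use. From (21)

$$\frac{\partial g_{\alpha\beta}}{\partial x_\sigma} = [\alpha\sigma, \beta] + [\beta\sigma, \alpha] \qquad (33)$$

Inserting this in the second formula of (31), we obtain, in view of (23)

$$\frac{\partial g^{\mu\nu}}{\partial x_\sigma} = -g^{\mu\tau}\{\tau\sigma, \nu\} - g^{\nu\tau}\{\tau\sigma, \mu\} \qquad (34)$$

Substituting the right-hand side of (34) in (29), we have

$$\frac{1}{\sqrt{-g}}\frac{\partial\sqrt{-g}}{\partial x_\sigma} = \{\mu\sigma, \mu\} \qquad (29a)$$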

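The "Divergence" of a Contravariant Vector.-If we take the inner product of (26) by the contravariant fundamental tensor $g^{\mu\nu}$, the right-hand side, after a transformation of the first term, assumes the form

$$\frac{\partial}{\partial x_\nu}(g^{\mu\nu}A_\mu) - A_\mu\frac{\partial g^{\mu\nu}}{\partial x_\nu} - \frac{1}{2}g^{\tau\alpha}\left(\frac{\partial g_{\mu\alpha}}{\partial x_\nu} + \frac{\partial g_{\nu\alpha}}{\partial x_\mu} - \frac{\partial g_{\mu\nu}}{\partial x_\alpha}\right)g^{\mu\nu}A_\tau.$$

In accordance with (31) and (29), the last term of this expression may be written

$$\frac{1}{2}\frac{\partial g^{\tau\nu}}{\partial x_\nu}A_\tau + \frac{1}{2}\frac{\partial g^{\tau\mu}}{\partial x_\mu}A_\tau + \frac{1}{\sqrt{-g}}\frac{\partial\sqrt{-g}}{\partial x_\alpha}g^{\tau\alpha}A_\tau.$$

As the symbols of the indices of summation are immaterial, the first two terms of this expression cancel the second of the one above. If we then write $g^{\mu\nu}A_\mu = A^\nu$, so that $A^\nu$ like $A_\mu$ is an arbitrary vector, we finally obtain

$$\Phi = \frac{1}{\sqrt{-g}}\frac{\partial}{\partial x_\nu}\left(\sqrt{-g}\,A^\nu\right) \qquad (35)$$

This scalar is the divergence of the contravariant vector $A^\nu$.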
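The divergence formula, in which the vector components are multiplied by the root of the determinant before differentiation and the result divided by it afterwards, may be tested on a case whose answer is known beforehand. The sketch below is an illustration that is not part of the original text: it assumes the Euclidean plane in polar co-ordinates, where the determinant is positive and $\sqrt{g} = r$ replaces $\sqrt{-g}$, and a sample field with contravariant components $A^r = r$, $A^\theta = 1$ (a radial expansion plus a rigid rotation), whose divergence in Cartesian terms is 2.

```python
def sqrt_g(x):
    # Assumed example: polar co-ordinates in the Euclidean plane, det g = r^2.
    r, theta = x
    return r

def field(x):
    # Sample contravariant components (A^r, A^theta); illustrative choice only.
    r, theta = x
    return [r, 1.0]

def divergence(x, h=1e-5):
    # (1 / sqrt(g)) * d/dx_nu ( sqrt(g) * A^nu ), summed over nu, by central differences.
    total = 0.0
    for nu in range(2):
        xp, xm = list(x), list(x)
        xp[nu] += h
        xm[nu] -= h
        total += (sqrt_g(xp) * field(xp)[nu] - sqrt_g(xm) * field(xm)[nu]) / (2 * h)
    return total / sqrt_g(x)

div_value = divergence((1.5, 0.7))  # sample point r = 1.5, theta = 0.7
```

The rotation part contributes nothing and the radial part contributes 2, independently of the point chosen, which agrees with the Cartesian computation for the field $(x, y)$ plus a rotation.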


The "Curl" of a Covariant Vector.-The second term in (26) is symmetrical in the indices $\mu$ and $\nu$. Therefore $A_{\mu\nu} - A_{\nu\mu}$ is a particularly simply constructed antisymmetrical tensor. We obtain

$$B_{\mu\nu} = \frac{\partial A_\mu}{\partial x_\nu} - \frac{\partial A_\nu}{\partial x_\mu} \qquad (36)$$
Antisymmetrical Extension of a Six-vector.-Applying (27) to an antisymmetrical tensor of the second rank $A_{\mu\nu}$, forming in addition the two equations which arise through cyclic permutations of the indices, and adding these three equations, we obtain the tensor of the third rank

$$B_{\mu\nu\sigma} = A_{\mu\nu\sigma} + A_{\nu\sigma\mu} + A_{\sigma\mu\nu} = \frac{\partial A_{\mu\nu}}{\partial x_\sigma} + \frac{\partial A_{\nu\sigma}}{\partial x_\mu} + \frac{\partial A_{\sigma\mu}}{\partial x_\nu} \qquad (37)$$

which it is easy to prove is antisymmetrical.
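The Divergence of a Six-vector.-Taking the mixed product of (27) by $g^{\mu\alpha}g^{\nu\beta}$, we also obtain a tensor. The first term on the right-hand side of (27) may be written in the form

$$\frac{\partial}{\partial x_\sigma}\left(g^{\mu\alpha}g^{\nu\beta}A_{\mu\nu}\right) - g^{\mu\alpha}\frac{\partial g^{\nu\beta}}{\partial x_\sigma}A_{\mu\nu} - g^{\nu\beta}\frac{\partial g^{\mu\alpha}}{\partial x_\sigma}A_{\mu\nu}.$$

If we write $A^{\alpha\beta}_{\sigma}$ for $g^{\mu\alpha}g^{\nu\beta}A_{\mu\nu\sigma}$ and $A^{\alpha\beta}$ for $g^{\mu\alpha}g^{\nu\beta}A_{\mu\nu}$, and in the transformed first term replace

$$\frac{\partial g^{\nu\beta}}{\partial x_\sigma} \quad \text{and} \quad \frac{\partial g^{\mu\alpha}}{\partial x_\sigma}$$

by their values as given by (34), there results from the right-hand side of (27) an expression consisting of seven terms, of which four cancel, and there remains

$$A^{\alpha\beta}_{\sigma} = \frac{\partial A^{\alpha\beta}}{\partial x_\sigma} + \{\sigma\gamma, \alpha\}A^{\gamma\beta} + \{\sigma\gamma, \beta\}A^{\alpha\gamma} \qquad (38)$$

This is the expression for the extension of a contravariant tensor of the second rank, and corresponding expressions for the extension of contravariant tensors of higher and lower rank may also be formed.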
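We note that in an analogous way we may also form the extension of a mixed tensor:

$$A^{\alpha}_{\mu\sigma} = \frac{\partial A^{\alpha}_{\mu}}{\partial x_\sigma} - \{\sigma\mu, \tau\}A^{\alpha}_{\tau} + \{\sigma\tau, \alpha\}A^{\tau}_{\mu} \qquad (39)$$

On contracting (38) with respect to the indices $\beta$ and $\sigma$ (inner multiplication by $\delta^{\sigma}_{\beta}$), we obtain the vector

$$A^\alpha = \frac{\partial A^{\alpha\beta}}{\partial x_\beta} + \{\beta\gamma, \beta\}A^{\alpha\gamma} + \{\beta\gamma, \alpha\}A^{\gamma\beta}.$$

On account of the symmetry of $\{\beta\gamma, \alpha\}$ with respect to the indices $\beta$ and $\gamma$, the third term on the right-hand side vanishes, if $A^{\alpha\beta}$ is, as we will assume, an antisymmetrical tensor. The second term allows itself to be transformed in accordance with (29a). Thus we obtain

$$A^\alpha = \frac{1}{\sqrt{-g}}\frac{\partial\left(\sqrt{-g}\,A^{\alpha\beta}\right)}{\partial x_\beta} \qquad (40)$$

This is the expression for the divergence of a contravariant six-vector.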
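The Divergence of a Mixed Tensor of the Second Rank.-Contracting (39) with respect to the indices $\alpha$ and $\sigma$, and taking (29a) into consideration, we obtain

$$\sqrt{-g}\,A_\mu = \frac{\partial\left(\sqrt{-g}\,A^{\sigma}_{\mu}\right)}{\partial x_\sigma} - \{\sigma\mu, \tau\}\sqrt{-g}\,A^{\sigma}_{\tau} \qquad (41)$$

If we introduce the contravariant tensor $A^{\rho\sigma} = g^{\rho\tau}A^{\sigma}_{\tau}$ in the last term, it assumes the form

$$-[\sigma\mu, \rho]\sqrt{-g}\,A^{\rho\sigma}.$$

If, further, the tensor $A^{\rho\sigma}$ is symmetrical, this reduces to

$$-\frac{1}{2}\sqrt{-g}\,\frac{\partial g_{\rho\sigma}}{\partial x_\mu}A^{\rho\sigma}.$$

Had we introduced, instead of $A^{\rho\sigma}$, the covariant tensor $A_{\rho\sigma} = g_{\rho\alpha}g_{\sigma\beta}A^{\alpha\beta}$, which is also symmetrical, the last term, by virtue of (31), would assume the form

$$\frac{1}{2}\sqrt{-g}\,\frac{\partial g^{\rho\sigma}}{\partial x_\mu}A_{\rho\sigma}.$$

In the case of symmetry in question, (41) may therefore be replaced by the two forms

$$\sqrt{-g}\,A_\mu = \frac{\partial\left(\sqrt{-g}\,A^{\sigma}_{\mu}\right)}{\partial x_\sigma} - \frac{1}{2}\frac{\partial g_{\rho\sigma}}{\partial x_\mu}\sqrt{-g}\,A^{\rho\sigma} \qquad (41a)$$

$$\sqrt{-g}\,A_\mu = \frac{\partial\left(\sqrt{-g}\,A^{\sigma}_{\mu}\right)}{\partial x_\sigma} + \frac{1}{2}\frac{\partial g^{\rho\sigma}}{\partial x_\mu}\sqrt{-g}\,A_{\rho\sigma} \qquad (41b)$$

which we have to employ later on.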

§ 12. The Riemann-Christoffel Tensor


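We now seek the tensor which can be obtained from the fundamental tensor alone, by differentiation. At first sight the solution seems obvious. We place the fundamental tensor of the $g_{\mu\nu}$ in (27) instead of any given tensor $A_{\mu\nu}$, and thus have a new tensor, namely, the extension of the fundamental tensor. But we easily convince ourselves that this extension vanishes identically. We reach our goal, however, in the following way. In (27) place

$$A_{\mu\nu} = \frac{\partial A_\mu}{\partial x_\nu} - \{\mu\nu, \rho\}A_\rho,$$

i.e. the extension of the four-vector $A_\mu$. Then (with a somewhat different naming of the indices) we get the tensor of the third rank

$$A_{\mu\sigma\tau} = \frac{\partial^2 A_\mu}{\partial x_\sigma \partial x_\tau} - \{\mu\sigma, \rho\}\frac{\partial A_\rho}{\partial x_\tau} - \{\mu\tau, \rho\}\frac{\partial A_\rho}{\partial x_\sigma} - \{\sigma\tau, \rho\}\frac{\partial A_\mu}{\partial x_\rho} + \left[-\frac{\partial}{\partial x_\tau}\{\mu\sigma, \rho\} + \{\mu\tau, \alpha\}\{\alpha\sigma, \rho\} + \{\sigma\tau, \alpha\}\{\alpha\mu, \rho\}\right]A_\rho.$$

This expression suggests forming the tensor $A_{\mu\sigma\tau} - A_{\mu\tau\sigma}$. For, if we do so, the following terms of the expression for $A_{\mu\sigma\tau}$ cancel those of $A_{\mu\tau\sigma}$: the first, the fourth, and the member corresponding to the last term in square brackets; because all these are symmetrical in $\sigma$ and $\tau$. The same holds good for the sum of the second and third terms. Thus we obtain

$$A_{\mu\sigma\tau} - A_{\mu\tau\sigma} = B^{\rho}_{\mu\sigma\tau}A_\rho \qquad (42)$$

where

$$B^{\rho}_{\mu\sigma\tau} = -\frac{\partial}{\partial x_\tau}\{\mu\sigma, \rho\} + \frac{\partial}{\partial x_\sigma}\{\mu\tau, \rho\} - \{\mu\sigma, \alpha\}\{\alpha\tau, \rho\} + \{\mu\tau, \alpha\}\{\alpha\sigma, \rho\} \qquad (43)$$

The essential feature of the result is that on the right side of (42) the $A_\rho$ occur alone, without their derivatives. From the tensor character of $A_{\mu\sigma\tau} - A_{\mu\tau\sigma}$ in conjunction with the fact that $A_\rho$ is an arbitrary vector, it follows, by reason of § 7, that $B^{\rho}_{\mu\sigma\tau}$ is a tensor (the Riemann-Christoffel tensor).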
The mathematical importance of this tensor is as follows: If the continuum is of such a nature that there is a co-ordinate system with reference to which the $g_{\mu\nu}$ are constants, then all the $B^{\rho}_{\mu\sigma\tau}$ vanish. If we choose any new system of co-ordinates in place of the original ones, the $g_{\mu\nu}$ referred thereto will not be constants, but in consequence of its tensor nature, the transformed components of $B^{\rho}_{\mu\sigma\tau}$ will still vanish in the new system. Thus the vanishing of the Riemann tensor is a necessary condition that, by an appropriate choice of the system of reference, the $g_{\mu\nu}$ may be constants. In our problem this corresponds to the case in which,* with a suitable choice of the system of reference, the special theory of relativity holds good for a finite region of the continuum.
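The contrast between a continuum in which the $g_{\mu\nu}$ can be made constant and one in which they cannot may be illustrated numerically. The following sketch is not part of the original text. It assumes two two-dimensional examples, the Euclidean plane in polar co-ordinates and the surface of a unit sphere, and evaluates one component of the Riemann-Christoffel tensor from the usual expression in the Christoffel symbols of the second kind (first derivatives of the symbols plus their quadratic products). All derivatives are central differences, so the values are approximate, and the function names are hypothetical.

```python
import math

def christoffel(metric, inverse, x, h=1e-5):
    # Returns G with G[mu][nu][tau] = {mu nu, tau}, built from the bracket symbols.
    n = len(x)
    dg = []  # dg[k][a][b] = d g_{ab} / d x_k
    for k in range(n):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[a][b] - gm[a][b]) / (2 * h) for b in range(n)] for a in range(n)])
    g_up = inverse(x)

    def first_kind(mu, nu, s):
        return 0.5 * (dg[nu][mu][s] + dg[mu][nu][s] - dg[s][mu][nu])

    return [[[sum(g_up[t][a] * first_kind(mu, nu, a) for a in range(n)) for t in range(n)]
             for nu in range(n)] for mu in range(n)]

def riemann(metric, inverse, x, rho, mu, sigma, tau, h=1e-3):
    # B^rho_{mu sigma tau} = -d{mu sigma, rho}/dx_tau + d{mu tau, rho}/dx_sigma
    #                        - {mu sigma, a}{a tau, rho} + {mu tau, a}{a sigma, rho}
    n = len(x)

    def d_gamma(k, a, b, c):
        xp, xm = list(x), list(x)
        xp[k] += h
        xm[k] -= h
        return (christoffel(metric, inverse, xp)[a][b][c]
                - christoffel(metric, inverse, xm)[a][b][c]) / (2 * h)

    G = christoffel(metric, inverse, x)
    quadratic = sum(-G[mu][sigma][a] * G[a][tau][rho] + G[mu][tau][a] * G[a][sigma][rho]
                    for a in range(n))
    return -d_gamma(tau, mu, sigma, rho) + d_gamma(sigma, mu, tau, rho) + quadratic

# Assumed example metrics: plane in polar co-ordinates (r, theta), unit sphere (theta, phi).
def plane(x):
    return [[1.0, 0.0], [0.0, x[0] ** 2]]

def plane_inv(x):
    return [[1.0, 0.0], [0.0, 1.0 / x[0] ** 2]]

def sphere(x):
    return [[1.0, 0.0], [0.0, math.sin(x[0]) ** 2]]

def sphere_inv(x):
    return [[1.0, 0.0], [0.0, 1.0 / math.sin(x[0]) ** 2]]

flat_component = riemann(plane, plane_inv, (2.0, 0.3), 0, 1, 0, 1)
curved_component = riemann(sphere, sphere_inv, (1.0, 0.5), 0, 1, 0, 1)
```

Under these assumptions the plane gives a value indistinguishable from zero although its Christoffel symbols do not vanish, while the sphere gives approximately $\sin^2\theta$ at the chosen point; no choice of co-ordinates can remove the latter.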
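Contracting (43) with respect to the indices $\tau$ and $\rho$ we obtain the covariant tensor of second rank

$$G_{\mu\nu} = B^{\rho}_{\mu\nu\rho} = R_{\mu\nu} + S_{\mu\nu}$$

where

$$R_{\mu\nu} = -\frac{\partial}{\partial x_\alpha}\{\mu\nu, \alpha\} + \{\mu\alpha, \beta\}\{\nu\beta, \alpha\}, \qquad S_{\mu\nu} = \frac{\partial^2 \log\sqrt{-g}}{\partial x_\mu \partial x_\nu} - \{\mu\nu, \alpha\}\frac{\partial \log\sqrt{-g}}{\partial x_\alpha} \qquad (44)$$

* The mathematicians have proved that this is also a sufficient condition.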


Note on the Choice of Co-ordinates.-It has already been observed in § 8, in connexion with equation (18a), that the choice of co-ordinates may with advantage be made so that $\sqrt{-g} = 1$. A glance at the equations obtained in the last two sections shows that by such a choice the laws of formation of tensors undergo an important simplification. This applies particularly to $G_{\mu\nu}$, the tensor just developed, which plays a fundamental part in the theory to be set forth. For this specialization of the choice of co-ordinates brings about the vanishing of $S_{\mu\nu}$, so that the tensor $G_{\mu\nu}$ reduces to $R_{\mu\nu}$.
On this account I shall hereafter give all relations in the simplified form which this specialization of the choice of co-ordinates brings with it. It will then be an easy matter to revert to the generally covariant equations, if this seems desirable in a special case.

C. THEORY OF THE GRAVITATIONAL FIELD

§ 13. Equations of Motion of a Material Point in the Gravitational Field. Expression for the Field-components of Gravitation
A freely movable body not subjected to external forces moves, according to the special theory of relativity, in a straight line and uniformly. This is also the case, according to the general theory of relativity, for a part of four-dimensional space in which the system of co-ordinates $K_0$ may be, and is, so chosen that the $g_{\mu\nu}$ have the special constant values given in (4).
If we consider precisely this movement from any chosen system of co-ordinates $K_1$, the body, observed from $K_1$, moves, according to the considerations in § 2, in a gravitational field. The law of motion with respect to $K_1$ results without difficulty from the following consideration. With respect to $K_0$ the law of motion corresponds to a four-dimensional straight line, i.e. to a geodetic line. Now since the geodetic line is defined independently of the system of reference, its equations will also be the equation of motion of the material point with respect to $K_1$. If we set

$$\Gamma^{\tau}_{\mu\nu} = -\{\mu\nu, \tau\} \qquad (45)$$

the equation of the motion of the point with respect to $K_1$ becomes

$$\frac{d^2x_\tau}{ds^2} = \Gamma^{\tau}_{\mu\nu}\frac{dx_\mu}{ds}\frac{dx_\nu}{ds} \qquad (46)$$

We now make the assumption, which readily suggests itself, that this covariant system of equations also defines the motion of the point in the gravitational field in the case when there is no system of reference $K_0$, with respect to which the special theory of relativity holds good in a finite region. We have all the more justification for this assumption as (46) contains only first derivatives of the $g_{\mu\nu}$, between which even in the special case of the existence of $K_0$, no relations subsist.*
If the $\Gamma^{\tau}_{\mu\nu}$ vanish, then the point moves uniformly in a straight line. These quantities therefore condition the deviation of the motion from uniformity. They are the components of the gravitational field.

* It is only between the second (and first) derivatives that, by § 12, the relations $B^{\rho}_{\mu\sigma\tau} = 0$ subsist.

§ 14. The Field Equations of Gravitation in the Absence of Matter

We make a distinction hereafter between "gravitational field" and "matter" in this way, that we denote everything but the gravitational field as "matter." Our use of the word therefore includes not only matter in the ordinary sense, but the electromagnetic field as well.
Our next task is to find the field equations of gravitation in the absence of matter. Here we again apply the method employed in the preceding paragraph in formulating the equations of motion of the material point. A special case in which the required equations must in any case be satisfied is that of the special theory of relativity, in which the $g_{\mu\nu}$ have certain constant values. Let this be the case in a certain finite space in relation to a definite system of co-ordinates $K_0$. Relatively to this system all the components of the Riemann tensor $B^{\rho}_{\mu\sigma\tau}$, defined in (43), vanish. For the space under consideration they then vanish, also in any other system of co-ordinates.
Thus the required equations of the matter-free gravitational field must in any case be satisfied if all $B^{\rho}_{\mu\sigma\tau}$ vanish. But this condition goes too far. For it is clear that, e.g., the gravitational field generated by a material point in its environment certainly cannot be "transformed away" by any choice of the system of co-ordinates, i.e. it cannot be transformed to the case of constant $g_{\mu\nu}$.
This prompts us to require for the matter-free gravitational field that the symmetrical tensor $G_{\mu\nu}$, derived from the tensor $B^{\rho}_{\mu\sigma\tau}$, shall vanish. Thus we obtain ten equations for the ten quantities $g_{\mu\nu}$, which are satisfied in the special case of the vanishing of all $B^{\rho}_{\mu\sigma\tau}$. With the choice which we have made of a system of co-ordinates, and taking (44) into consideration, the equations for the matter-free field are

$$\frac{\partial \Gamma^{\alpha}_{\mu\nu}}{\partial x_\alpha} + \Gamma^{\alpha}_{\mu\beta}\Gamma^{\beta}_{\nu\alpha} = 0, \qquad \sqrt{-g} = 1 \qquad (47)$$
It must be pointed out that there is only a minimum of arbitrariness in the choice of these equations. For besides $G_{\mu\nu}$ there is no tensor of second rank which is formed from the $g_{\mu\nu}$ and its derivatives, contains no derivatives higher than second, and is linear in these derivatives.*
These equations, which proceed, by the method of pure mathematics, from the requirement of the general theory of relativity, give us, in combination with the equations of motion (46), to a first approximation Newton's law of attraction, and to a second approximation the explanation of the motion of the perihelion of the planet Mercury discovered by Leverrier (as it remains after corrections for perturbation have been made). These facts must, in my opinion, be taken as a convincing proof of the correctness of the theory.

* Properly speaking, this can be affirmed only of the tensor

$$G_{\mu\nu} + \lambda g_{\mu\nu}g^{\alpha\beta}G_{\alpha\beta},$$

where $\lambda$ is a constant. If, however, we set this tensor = 0, we come back again to the equations $G_{\mu\nu} = 0$.

§ 15. The Hamiltonian Function for the Gravitational Field. Laws of Momentum and Energy
To show that the field equations correspond to the laws of
momentum and energy, it is most convenient to write them
in the following Hamiltonian form :-

$$\delta \int H \, d\tau = 0, \qquad H = g^{\mu\nu}\Gamma^{\alpha}_{\mu\beta}\Gamma^{\beta}_{\nu\alpha}, \qquad \sqrt{-g} = 1, \tag{47a}$$

where, on the boundary of the finite four-dimensional region
of integration which we have in view, the variations vanish.
We first have to show that the form (47a) is equivalent
to the equations (47). For this purpose we regard $H$ as a
function of the $g^{\mu\nu}$ and the $g^{\mu\nu}_{\sigma}$ $(= \partial g^{\mu\nu}/\partial x_\sigma)$.
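Then in the first place

$$\delta H = \Gamma^{\alpha}_{\mu\beta}\Gamma^{\beta}_{\nu\alpha}\,\delta g^{\mu\nu} + 2 g^{\mu\nu}\Gamma^{\alpha}_{\mu\beta}\,\delta\Gamma^{\beta}_{\nu\alpha} = -\Gamma^{\alpha}_{\mu\beta}\Gamma^{\beta}_{\nu\alpha}\,\delta g^{\mu\nu} + 2\Gamma^{\alpha}_{\mu\beta}\,\delta\left(g^{\mu\nu}\Gamma^{\beta}_{\nu\alpha}\right).$$

But

$$\delta\left(g^{\mu\nu}\Gamma^{\beta}_{\nu\alpha}\right) = -\tfrac12\,\delta\left[g^{\mu\nu}g^{\beta\lambda}\left(\frac{\partial g_{\nu\lambda}}{\partial x_\alpha} + \frac{\partial g_{\alpha\lambda}}{\partial x_\nu} - \frac{\partial g_{\alpha\nu}}{\partial x_\lambda}\right)\right].$$

The terms arising from the last two terms in round brackets
are of different sign, and result from each other (since the
denomination of the summation indices is immaterial) through
interchange of the indices $\mu$ and $\beta$. They cancel each other
in the expression for $\delta H$, because they are multiplied by the
quantity $\Gamma^{\alpha}_{\mu\beta}$, which is symmetrical with respect to the
indices $\mu$ and $\beta$. Thus there remains only the first term in
round brackets to be considered, so that, taking (31) into account,
we obtain
$$\delta H = -\Gamma^{\alpha}_{\mu\beta}\Gamma^{\beta}_{\nu\alpha}\,\delta g^{\mu\nu} + \Gamma^{\alpha}_{\mu\beta}\,\delta g^{\mu\beta}_{\alpha}.$$
Thus

$$\frac{\partial H}{\partial g^{\mu\nu}} = -\Gamma^{\alpha}_{\mu\beta}\Gamma^{\beta}_{\nu\alpha}, \qquad \frac{\partial H}{\partial g^{\mu\nu}_{\sigma}} = \Gamma^{\sigma}_{\mu\nu}. \tag{48}$$

Carrying out the variation in (47a), we get in the first place

$$\frac{\partial}{\partial x_\alpha}\left(\frac{\partial H}{\partial g^{\mu\nu}_{\alpha}}\right) - \frac{\partial H}{\partial g^{\mu\nu}} = 0, \tag{47b}$$

which, on account of (48), agrees with (47), as was to be proved.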
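If we multiply (47b) by $g^{\mu\nu}_{\sigma}$, then because

$$\frac{\partial g^{\mu\nu}_{\sigma}}{\partial x_\alpha} = \frac{\partial g^{\mu\nu}_{\alpha}}{\partial x_\sigma}$$
and, consequently,

$$g^{\mu\nu}_{\sigma}\frac{\partial}{\partial x_\alpha}\left(\frac{\partial H}{\partial g^{\mu\nu}_{\alpha}}\right) = \frac{\partial}{\partial x_\alpha}\left(g^{\mu\nu}_{\sigma}\frac{\partial H}{\partial g^{\mu\nu}_{\alpha}}\right) - \frac{\partial H}{\partial g^{\mu\nu}_{\alpha}}\frac{\partial g^{\mu\nu}_{\alpha}}{\partial x_\sigma},$$

we obtain the equation

$$\frac{\partial}{\partial x_\alpha}\left(g^{\mu\nu}_{\sigma}\frac{\partial H}{\partial g^{\mu\nu}_{\alpha}}\right) - \frac{\partial H}{\partial x_\sigma} = 0$$

or *

$$\frac{\partial t^{\alpha}_{\sigma}}{\partial x_\alpha} = 0, \qquad -2\kappa\, t^{\alpha}_{\sigma} = g^{\mu\nu}_{\sigma}\frac{\partial H}{\partial g^{\mu\nu}_{\alpha}} - \delta^{\alpha}_{\sigma} H, \tag{49}$$

where, on account of (48), the second equation of (47), and (34),

$$\kappa\, t^{\alpha}_{\sigma} = \tfrac12 \delta^{\alpha}_{\sigma}\, g^{\mu\nu}\Gamma^{\lambda}_{\mu\beta}\Gamma^{\beta}_{\nu\lambda} - g^{\mu\nu}\Gamma^{\alpha}_{\mu\beta}\Gamma^{\beta}_{\nu\sigma}. \tag{50}$$
* The reason for the introduction of the factor $-2\kappa$ will be apparent later.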
It is to be noticed that $t^{\alpha}_{\sigma}$ is not a tensor; on the other
hand (49) applies to all systems of co-ordinates for which
$\sqrt{-g} = 1$. This equation expresses the law of conservation
of momentum and of energy for the gravitational field.
Actually the integration of this equation over a three-dimensional
volume $V$ yields the four equations

$$\frac{d}{dx_4}\int t^{4}_{\sigma}\, dV = \int \left(l\, t^{1}_{\sigma} + m\, t^{2}_{\sigma} + n\, t^{3}_{\sigma}\right) dS, \tag{49a}$$

where $l, m, n$ denote the direction-cosines of direction of the
inward drawn normal at the element $dS$ of the bounding surface
(in the sense of Euclidean geometry). We recognize in
this the expression of the laws of conservation in their usual
form. The quantities $t^{\alpha}_{\sigma}$ we call the " energy components "
of the gravitational field.


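I will now give equations (47) in a third form, which is
particularly useful for a vivid grasp of our subject. By
multiplication of the field equations (47) by $g^{\nu\sigma}$ these are
obtained in the " mixed " form. Note that

$$g^{\nu\sigma}\frac{\partial \Gamma^{\alpha}_{\mu\nu}}{\partial x_\alpha} = \frac{\partial}{\partial x_\alpha}\left(g^{\nu\sigma}\Gamma^{\alpha}_{\mu\nu}\right) - \frac{\partial g^{\nu\sigma}}{\partial x_\alpha}\Gamma^{\alpha}_{\mu\nu},$$

which quantity, by reason of (34), is equal to

$$\frac{\partial}{\partial x_\alpha}\left(g^{\nu\sigma}\Gamma^{\alpha}_{\mu\nu}\right) - g^{\nu\beta}\Gamma^{\sigma}_{\alpha\beta}\Gamma^{\alpha}_{\mu\nu} - g^{\sigma\beta}\Gamma^{\nu}_{\beta\alpha}\Gamma^{\alpha}_{\mu\nu},$$

or (with different symbols for the summation indices)

$$\frac{\partial}{\partial x_\alpha}\left(g^{\sigma\beta}\Gamma^{\alpha}_{\mu\beta}\right) - g^{\gamma\delta}\Gamma^{\sigma}_{\gamma\beta}\Gamma^{\beta}_{\delta\mu} - g^{\nu\sigma}\Gamma^{\alpha}_{\mu\beta}\Gamma^{\beta}_{\nu\alpha}.$$

The third term of this expression cancels with the one arising
from the second term of the field equations (47); using
relation (50), the second term may be written
$$\kappa\left(t^{\sigma}_{\mu} - \tfrac12 \delta^{\sigma}_{\mu}\, t\right),$$
where $t = t^{\alpha}_{\alpha}$. Thus instead of equations (47) we obtain

$$\frac{\partial}{\partial x_\alpha}\left(g^{\sigma\beta}\Gamma^{\alpha}_{\mu\beta}\right) = -\kappa\left(t^{\sigma}_{\mu} - \tfrac12 \delta^{\sigma}_{\mu}\, t\right), \qquad \sqrt{-g} = 1. \tag{51}$$
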
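
The rewriting of the second term rests on the trace of (50), a step the text leaves implicit. A sketch of that step follows; it assumes the standard forms $H = g^{\mu\nu}\Gamma^{\alpha}_{\mu\beta}\Gamma^{\beta}_{\nu\alpha}$ for the function in (47a), the expression (50) for $\kappa t^{\alpha}_{\sigma}$ as printed in Einstein's paper, and $\delta^{\alpha}_{\alpha} = 4$.

```latex
\kappa\, t = \kappa\, t^{\alpha}_{\alpha}
  = \tfrac12 \cdot 4 \cdot H - H = H ,
\qquad\text{so that}\qquad
-\,g^{\gamma\delta}\Gamma^{\sigma}_{\gamma\beta}\Gamma^{\beta}_{\delta\mu}
  = \kappa\, t^{\sigma}_{\mu} - \tfrac12\, \delta^{\sigma}_{\mu} H
  = \kappa\left(t^{\sigma}_{\mu} - \tfrac12\, \delta^{\sigma}_{\mu}\, t\right).
```
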
§ 16. The General Form of the Field Equations of Gravitation
The field equations for matter-free space formulated in
§ 15 are to be compared with the field equation
$$\nabla^2 \phi = 0$$
of Newton's theory. We require the equation corresponding
to Poisson's equation
$$\nabla^2 \phi = 4\pi\kappa\rho,$$
where $\rho$ denotes the density of matter.
The special theory of relativity has led to the conclusion
that inert mass is nothing more or less than energy, which
finds its complete mathematical expression in a symmetrical
tensor of second rank, the energy-tensor. Thus in the
general theory of relativity we must introduce a corresponding
energy-tensor of matter $T^{\alpha}_{\sigma}$, which, like the energy-components
$t^{\alpha}_{\sigma}$ [equations (49) and (50)] of the gravitational field,
will have mixed character, but will pertain to a symmetrical
covariant tensor.
The system of equation (51) shows how this energy-tensor
(corresponding to the density $\rho$ in Poisson's equation) is to
be introduced into the field equations of gravitation. For if
we consider a complete system (e.g. the solar system), the
total mass of the system, and therefore its total gravitating
action as well, will depend on the total energy of the system,
and therefore on the ponderable energy together with the
gravitational energy. This will allow itself to be expressed
by introducing into (51), in place of the energy-components
of the gravitational field alone, the sums $t^{\sigma}_{\mu} + T^{\sigma}_{\mu}$ of the
energy-components of matter and of gravitational field. Thus instead
of (51) we obtain the tensor equation

$$\frac{\partial}{\partial x_\alpha}\left(g^{\sigma\beta}\Gamma^{\alpha}_{\mu\beta}\right) = -\kappa\left[\left(t^{\sigma}_{\mu} + T^{\sigma}_{\mu}\right) - \tfrac12 \delta^{\sigma}_{\mu}\,(t + T)\right], \qquad \sqrt{-g} = 1, \tag{52}$$

where we have set $T = T^{\mu}_{\mu}$ (Laue's scalar). These are the
required general field equations of gravitation in mixed form.
Working back from these, we have in place of (47)

$$\frac{\partial \Gamma^{\alpha}_{\mu\nu}}{\partial x_\alpha} + \Gamma^{\alpha}_{\mu\beta}\Gamma^{\beta}_{\nu\alpha} = -\kappa\left(T_{\mu\nu} - \tfrac12 g_{\mu\nu} T\right), \qquad \sqrt{-g} = 1. \tag{53}$$

It must be admitted that this introduction of the energy-tensor
of matter is not justified by the relativity postulate
alone. For this reason we have here deduced it from the
requirement that the energy of the gravitational field shall
act gravitatively in the same way as any other kind of energy.
But the strongest reason for the choice of these equations
lies in their consequence, that the equations of conservation
of momentum and energy, corresponding exactly to equations
(49) and (49a), hold good for the components of the total
energy. This will be shown in § 17.

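§ 17. The Laws of Conservation in the General Case

Equation (52) may readily be transformed so that the
second term on the right-hand side vanishes. Contract (52)
with respect to the indices $\mu$ and $\sigma$, and after multiplying the
resulting equation by $\tfrac12 \delta^{\sigma}_{\mu}$, subtract it from equation (52).
This gives

$$\frac{\partial}{\partial x_\alpha}\left(g^{\sigma\beta}\Gamma^{\alpha}_{\mu\beta} - \tfrac12 \delta^{\sigma}_{\mu}\, g^{\lambda\beta}\Gamma^{\alpha}_{\lambda\beta}\right) = -\kappa\left(t^{\sigma}_{\mu} + T^{\sigma}_{\mu}\right). \tag{52a}$$

On this equation we perform the operation $\partial/\partial x_\sigma$. We have

$$\frac{\partial^2}{\partial x_\alpha \partial x_\sigma}\left(g^{\sigma\beta}\Gamma^{\alpha}_{\mu\beta}\right) = -\tfrac12 \frac{\partial^2}{\partial x_\alpha \partial x_\sigma}\left[g^{\sigma\beta}g^{\alpha\lambda}\left(\frac{\partial g_{\mu\lambda}}{\partial x_\beta} + \frac{\partial g_{\beta\lambda}}{\partial x_\mu} - \frac{\partial g_{\mu\beta}}{\partial x_\lambda}\right)\right].$$

The first and third terms of the round brackets yield contributions
which cancel one another, as may be seen by
interchanging, in the contribution of the third term, the
summation indices $\alpha$ and $\sigma$ on the one hand, and $\beta$ and $\lambda$
on the other. The second term may be re-modelled by (31),
so that we have

$$\frac{\partial^2}{\partial x_\alpha \partial x_\sigma}\left(g^{\sigma\beta}\Gamma^{\alpha}_{\mu\beta}\right) = \tfrac12 \frac{\partial^3 g^{\alpha\beta}}{\partial x_\alpha \partial x_\beta \partial x_\mu}. \tag{54}$$

The second term on the left-hand side of (52a) yields in the
first place

$$-\tfrac12 \frac{\partial^2}{\partial x_\alpha \partial x_\mu}\left(g^{\lambda\beta}\Gamma^{\alpha}_{\lambda\beta}\right)$$

or

$$\tfrac14 \frac{\partial^2}{\partial x_\alpha \partial x_\mu}\left[g^{\lambda\beta}g^{\alpha\delta}\left(\frac{\partial g_{\delta\lambda}}{\partial x_\beta} + \frac{\partial g_{\delta\beta}}{\partial x_\lambda} - \frac{\partial g_{\lambda\beta}}{\partial x_\delta}\right)\right].$$

With the choice of co-ordinates which we have made, the
term deriving from the last term in round brackets disappears
by reason of (29). The other two may be combined, and
together, by (31), they give
$$-\tfrac12 \frac{\partial^3 g^{\alpha\beta}}{\partial x_\alpha \partial x_\beta \partial x_\mu},$$
so that in consideration of (54), we have the identity

$$\frac{\partial^2}{\partial x_\alpha \partial x_\sigma}\left(g^{\sigma\beta}\Gamma^{\alpha}_{\mu\beta} - \tfrac12 \delta^{\sigma}_{\mu}\, g^{\lambda\beta}\Gamma^{\alpha}_{\lambda\beta}\right) \equiv 0. \tag{55}$$

From (55) and (52a), it follows that

$$\frac{\partial\left(t^{\sigma}_{\mu} + T^{\sigma}_{\mu}\right)}{\partial x_\sigma} = 0. \tag{56}$$

Thus it results from our field equations of gravitation
that the laws of conservation of momentum and energy are
satisfied. This may be seen most easily from the consideration
which leads to equation (49a) ; except that here, instead
of the energy components $t^{\sigma}_{\mu}$ of the gravitational field, we have
to introduce the totality of the energy components of matter
and gravitational field.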

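§ 18. The Laws of Momentum and Energy for Matter, as a Consequence of the Field Equations
Multiplying (53) by $\partial g^{\mu\nu}/\partial x_\sigma$, we obtain, by the method
adopted in § 15, in view of the vanishing of
$$g_{\mu\nu}\frac{\partial g^{\mu\nu}}{\partial x_\sigma},$$
the equation
$$\frac{\partial t^{\alpha}_{\sigma}}{\partial x_\alpha} - \tfrac12 \frac{\partial g^{\mu\nu}}{\partial x_\sigma}\, T_{\mu\nu} = 0,$$
or, in view of (56),

$$\frac{\partial T^{\alpha}_{\sigma}}{\partial x_\alpha} + \tfrac12 \frac{\partial g^{\mu\nu}}{\partial x_\sigma}\, T_{\mu\nu} = 0. \tag{57}$$
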
Comparison with (41b) shows that with the choice of
system of co-ordinates which we have made, this equation
predicates nothing more or less than the vanishing of divergence
of the material energy-tensor. Physically, the
occurrence of the second term on the left-hand side shows
that laws of conservation of momentum and energy do not
apply in the strict sense for matter alone, or else that they
apply only when the $g^{\mu\nu}$ are constant, i.e. when the field
intensities of gravitation vanish. This second term is an
expression for momentum, and for energy, as transferred per
unit of volume and time from the gravitational field to matter.
This is brought out still more clearly by re-writing (57) in the
sense of (41) as

$$\frac{\partial T^{\alpha}_{\sigma}}{\partial x_\alpha} = -\Gamma^{\beta}_{\alpha\sigma}\, T^{\alpha}_{\beta}. \tag{57a}$$

The right side expresses the energetic effect of the gravitational
field on matter.
Thus the field equations of gravitation contain four conditions
which govern the course of material phenomena.
They give the equations of material phenomena completely,
if the latter is capable of being characterized by four differential
equations independent of one another.*

* On this question cf. H. Hilbert, Nachr. d. K. Gesellsch. d. Wiss. zu
Göttingen, Math.-phys. Klasse, 1915, p. 3.

D. MATERIAL PHENOMENA
The mathematical aids developed in part B enable us
forthwith to generalize the physical laws of matter (hydrodynamics,
Maxwell's electrodynamics), as they are formulated
in the special theory of relativity, so that they will fit in with
the general theory of relativity. When this is done, the
general principle of relativity does not indeed afford us a
further limitation of possibilities ; but it makes us acquainted
with the influence of the gravitational field on all processes,
without our having to introduce any new hypothesis whatever.
Hence it comes about that it is not necessary to introduce
definite assumptions as to the physical nature of matter (in
the narrower sense). In particular it may remain an open
question whether the theory of the electromagnetic field in
conjunction with that of the gravitational field furnishes a
sufficient basis for the theory of matter or not. The general
postulate of relativity is unable on principle to tell us anything
about this. It must remain to be seen, during the working
out of the theory, whether electromagnetics and the doctrine
of gravitation are able in collaboration to perform what the
former by itself is unable to do.

§ 19. Euler's Equations for a Frictionless Adiabatic Fluid

Let $p$ and $\rho$ be two scalars, the former of which we call
the pressure, the latter the density of a fluid ; and let
an equation subsist between them. Let the contravariant
symmetrical tensor

$$T^{\alpha\beta} = -g^{\alpha\beta} p + \rho\, \frac{dx_\alpha}{ds}\frac{dx_\beta}{ds} \tag{58}$$

be the contravariant energy-tensor of the fluid. To it belongs
the covariant tensor

$$T_{\mu\nu} = -g_{\mu\nu}\, p + g_{\mu\alpha}\, g_{\nu\beta}\, \frac{dx_\alpha}{ds}\frac{dx_\beta}{ds}\, \rho, \tag{58a}$$

as well as the mixed tensor *

$$T^{\alpha}_{\sigma} = -\delta^{\alpha}_{\sigma}\, p + g_{\sigma\beta}\, \frac{dx_\beta}{ds}\frac{dx_\alpha}{ds}\, \rho. \tag{58b}$$

Inserting the right-hand side of (58b) in (57a), we obtain the
Eulerian hydrodynamical equations of the general theory of
relativity. They give, in theory, a complete solution of the
problem of motion, since the four equations (57a), together
with the given equation between $p$ and $\rho$, and the equation

$$g_{\alpha\beta}\, \frac{dx_\alpha}{ds}\frac{dx_\beta}{ds} = 1,$$

are sufficient, $g_{\alpha\beta}$ being given, to define the six unknowns

$$p, \quad \rho, \quad \frac{dx_1}{ds}, \quad \frac{dx_2}{ds}, \quad \frac{dx_3}{ds}, \quad \frac{dx_4}{ds}.$$

If the $g_{\mu\nu}$ are also unknown, the equations (53) are
brought in. These are eleven equations for defining the ten
functions $g_{\mu\nu}$, so that these functions appear over-defined.
We must remember, however, that the equations (57a) are
already contained in the equations (53), so that the latter
represent only seven independent equations. There is good
reason for this lack of definition, in that the wide freedom of
the choice of co-ordinates causes the problem to remain
mathematically undefined to such a degree that three of the
functions of space may be chosen at will.*
* For an observer using a system of reference in the sense of the special
theory of relativity for an infinitely small region, and moving with it, the
density of energy $T^{4}_{4}$ equals $\rho - p$. This gives the definition of $\rho$. Thus $\rho$ is
not constant for an incompressible fluid.
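
The footnote's statement that a co-moving observer finds the energy density $T^{4}_{4} = \rho - p$ can be checked numerically. The sketch below is only an illustration: it assumes the mixed tensor has the form $T^{\alpha}_{\sigma} = -\delta^{\alpha}_{\sigma} p + g_{\sigma\beta} u^{\beta} u^{\alpha} \rho$ with $u^{\alpha} = dx_\alpha/ds$, uses the special-relativistic values $g = \mathrm{diag}(-1,-1,-1,+1)$, and the function name `fluid_mixed_tensor` is hypothetical.

```python
def fluid_mixed_tensor(p, rho, u, g):
    """Return T[a][s] = T^a_s = -delta^a_s * p + g_{s b} u^b u^a * rho (assumed form of (58b))."""
    n = len(u)
    # Lower the index of the four-velocity: u_s = g_{s b} u^b.
    u_lower = [sum(g[s][b] * u[b] for b in range(n)) for s in range(n)]
    return [[(-p if a == s else 0.0) + u_lower[s] * u[a] * rho
             for s in range(n)] for a in range(n)]

# Special-relativistic metric; list index 3 stands for the time co-ordinate x_4.
MINKOWSKI = [
    [-1.0, 0.0, 0.0, 0.0],
    [0.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Fluid at rest relative to the observer: dx_4/ds = 1, spatial components zero.
T = fluid_mixed_tensor(p=2.0, rho=5.0, u=[0.0, 0.0, 0.0, 1.0], g=MINKOWSKI)
print("energy density T^4_4 =", T[3][3])   # rho - p
print("T^1_1 =", T[0][0])                   # -p
```

The illustrative values $p = 2$, $\rho = 5$ carry no physical meaning; they only make the two components easy to tell apart.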

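* On the abandonment of the choice of co-ordinates with $g = -1$, there
remain four functions of space with liberty of choice, corresponding to the four
arbitrary functions at our disposal in the choice of co-ordinates.

§ 20. Maxwell's Electromagnetic Field Equations for Free Space
Let $\phi_\nu$ be the components of a covariant vector, the
electromagnetic potential vector. From them we form, in
accordance with (36), the components $F_{\rho\sigma}$ of the covariant
six-vector of the electromagnetic field, in accordance with
the system of equations

$$F_{\rho\sigma} = \frac{\partial \phi_\rho}{\partial x_\sigma} - \frac{\partial \phi_\sigma}{\partial x_\rho}. \tag{59}$$

It follows from (59) that the system of equations

$$\frac{\partial F_{\rho\sigma}}{\partial x_\tau} + \frac{\partial F_{\sigma\tau}}{\partial x_\rho} + \frac{\partial F_{\tau\rho}}{\partial x_\sigma} = 0 \tag{60}$$

is satisfied, its left side being, by (37), an antisymmetrical
tensor of the third rank. System (60) thus contains essentially
four equations which are written out as follows :-

$$\begin{aligned}
\frac{\partial F_{23}}{\partial x_4} + \frac{\partial F_{34}}{\partial x_2} + \frac{\partial F_{42}}{\partial x_3} &= 0 \\
\frac{\partial F_{34}}{\partial x_1} + \frac{\partial F_{41}}{\partial x_3} + \frac{\partial F_{13}}{\partial x_4} &= 0 \\
\frac{\partial F_{41}}{\partial x_2} + \frac{\partial F_{12}}{\partial x_4} + \frac{\partial F_{24}}{\partial x_1} &= 0 \\
\frac{\partial F_{12}}{\partial x_3} + \frac{\partial F_{23}}{\partial x_1} + \frac{\partial F_{31}}{\partial x_2} &= 0
\end{aligned} \tag{60a}$$

This system corresponds to the second of Maxwell's
systems of equations. We recognize this at once by setting

$$\begin{aligned}
F_{23} &= H_x, & F_{14} &= E_x \\
F_{31} &= H_y, & F_{24} &= E_y \\
F_{12} &= H_z, & F_{34} &= E_z
\end{aligned} \tag{61}$$

Then in place of (60a) we may set, in the usual notation of
three-dimensional vector analysis,

$$-\frac{\partial H}{\partial t} = \operatorname{curl} E, \qquad \operatorname{div} H = 0. \tag{60b}$$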
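
That the cyclic system (60) follows identically from the definition of the six-vector can be made explicit. The lines below are a sketch only; they assume the definition (59) in the form $F_{\rho\sigma} = \partial\phi_\rho/\partial x_\sigma - \partial\phi_\sigma/\partial x_\rho$ and use only the symmetry of second partial derivatives.

```latex
\frac{\partial F_{\rho\sigma}}{\partial x_\tau}
+ \frac{\partial F_{\sigma\tau}}{\partial x_\rho}
+ \frac{\partial F_{\tau\rho}}{\partial x_\sigma}
= \left(\frac{\partial^2 \phi_\rho}{\partial x_\tau \partial x_\sigma}
      - \frac{\partial^2 \phi_\rho}{\partial x_\sigma \partial x_\tau}\right)
+ \left(\frac{\partial^2 \phi_\sigma}{\partial x_\rho \partial x_\tau}
      - \frac{\partial^2 \phi_\sigma}{\partial x_\tau \partial x_\rho}\right)
+ \left(\frac{\partial^2 \phi_\tau}{\partial x_\sigma \partial x_\rho}
      - \frac{\partial^2 \phi_\tau}{\partial x_\rho \partial x_\sigma}\right)
= 0 .
```

Each bracket collects the two terms that contain the same potential component, and each vanishes separately.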
We obtain Maxwell's first system by generalizing the
form given by Minkowski. We introduce the contravariant
six-vector associated with $F_{\alpha\beta}$
$$F^{\mu\nu} = g^{\mu\alpha} g^{\nu\beta} F_{\alpha\beta} \tag{62}$$
and also the contravariant vector $J^{\mu}$ of the density of the
electric current. Then, taking (40) into consideration, the
following equations will be invariant for any substitution
whose invariant is unity (in agreement with the chosen
co-ordinates) :-

$$\frac{\partial F^{\mu\nu}}{\partial x_\nu} = J^{\mu}. \tag{63}$$

Let

$$\begin{aligned}
F^{23} &= H'_x, & F^{14} &= -E'_x \\
F^{31} &= H'_y, & F^{24} &= -E'_y \\
F^{12} &= H'_z, & F^{34} &= -E'_z
\end{aligned} \tag{64}$$

which quantities are equal to the quantities $H_x \ldots E_z$ in
the special case of the restricted theory of relativity ; and in
addition
$$J^1 = j_x, \quad J^2 = j_y, \quad J^3 = j_z, \quad J^4 = \rho,$$
we obtain in place of (63)
$$\frac{\partial E'}{\partial t} + j = \operatorname{curl} H', \qquad \operatorname{div} E' = \rho. \tag{63a}$$
The equations (60), (62), and (63) thus form the generalization
of Maxwell's field equations for free space, with the
convention which we have established with respect to the
choice of co-ordinates.
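The Energy-components of the Electromagnetic Field.-
We form the inner product
$$\kappa_\sigma = F_{\sigma\mu} J^{\mu}. \tag{65}$$
By (61) its components, written in the three-dimensional
manner, are
$$\begin{aligned}
\kappa_1 &= \rho E_x + [j \cdot H]^x \\
&\;\;\vdots \\
\kappa_4 &= -(jE)
\end{aligned} \tag{65a}$$

$\kappa_\sigma$ is a covariant vector the components of which are
equal to the negative momentum, or, respectively, the energy,
which is transferred from the electric masses to the electromagnetic
field per unit of time and volume. If the electric
masses are free, that is, under the sole influence of the
electromagnetic field, the covariant vector $\kappa_\sigma$ will vanish.
To obtain the energy-components $T^{\nu}_{\sigma}$ of the electromagnetic
field, we need only give to equation $\kappa_\sigma = 0$ the form of
equation (57). From (63) and (65) we have in the first place

$$\kappa_\sigma = F_{\sigma\mu}\frac{\partial F^{\mu\nu}}{\partial x_\nu} = \frac{\partial}{\partial x_\nu}\left(F_{\sigma\mu}F^{\mu\nu}\right) - F^{\mu\nu}\frac{\partial F_{\sigma\mu}}{\partial x_\nu}.$$

The second term of the right-hand side, by reason of (60),
permits the transformation

$$F^{\mu\nu}\frac{\partial F_{\sigma\mu}}{\partial x_\nu} = -\tfrac12 F^{\mu\nu}\frac{\partial F_{\mu\nu}}{\partial x_\sigma} = -\tfrac12\, g^{\mu\alpha} g^{\nu\beta} F_{\alpha\beta}\frac{\partial F_{\mu\nu}}{\partial x_\sigma},$$

which latter expression may, for reasons of symmetry, also
be written in the form

$$-\tfrac14\left[g^{\mu\alpha} g^{\nu\beta} F_{\alpha\beta}\frac{\partial F_{\mu\nu}}{\partial x_\sigma} + g^{\mu\alpha} g^{\nu\beta}\frac{\partial F_{\alpha\beta}}{\partial x_\sigma} F_{\mu\nu}\right].$$

But for this we may set

$$-\tfrac14 \frac{\partial}{\partial x_\sigma}\left(g^{\mu\alpha} g^{\nu\beta} F_{\alpha\beta} F_{\mu\nu}\right) + \tfrac14 F_{\alpha\beta} F_{\mu\nu}\frac{\partial}{\partial x_\sigma}\left(g^{\mu\alpha} g^{\nu\beta}\right).$$

The first of these terms is written more briefly

$$-\tfrac14 \frac{\partial}{\partial x_\sigma}\left(F^{\mu\nu} F_{\mu\nu}\right);$$

the second, after the differentiation is carried out, and after
some reduction, results in

$$-\tfrac12 F^{\mu\tau} F_{\mu\nu}\, g^{\nu\rho}\frac{\partial g_{\rho\tau}}{\partial x_\sigma}.$$

Taking all three terms together we obtain the relation

$$\kappa_\sigma = \frac{\partial T^{\nu}_{\sigma}}{\partial x_\nu} - \tfrac12\, g^{\tau\mu}\frac{\partial g_{\mu\nu}}{\partial x_\sigma}\, T^{\nu}_{\tau}, \tag{66}$$

where

$$T^{\nu}_{\sigma} = -F_{\sigma\alpha} F^{\nu\alpha} + \tfrac14 \delta^{\nu}_{\sigma} F_{\alpha\beta} F^{\alpha\beta}. \tag{66a}$$

Equation (66), if $\kappa_\sigma$ vanishes, is, on account of (30),
equivalent to (57) or (57a) respectively. Therefore the $T^{\nu}_{\sigma}$
are the energy-components of the electromagnetic field.
With the help of (61) and (64), it is easy to show that these
energy-components of the electromagnetic field in the case
of the special theory of relativity give the well-known Maxwell-
Poynting expressions.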
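
The reduction to the Maxwell-Poynting expressions can be illustrated numerically for the energy density. The sketch below is not part of the original argument: it assumes the form $T^{\nu}_{\sigma} = -F_{\sigma\alpha}F^{\nu\alpha} + \tfrac14\delta^{\nu}_{\sigma}F_{\alpha\beta}F^{\alpha\beta}$ for the energy-components, the assignments (61) for $F_{\rho\sigma}$, and the special-relativistic metric $\mathrm{diag}(-1,-1,-1,+1)$; the helper names are hypothetical.

```python
ETA = [-1.0, -1.0, -1.0, 1.0]  # diagonal of the special-relativistic metric; index 3 is x_4

def field_tensor(E, H):
    """Covariant six-vector F_{rho sigma} built from E and H according to (61)."""
    F = [[0.0] * 4 for _ in range(4)]
    entries = {
        (1, 2): H[0], (2, 0): H[1], (0, 1): H[2],   # F_23, F_31, F_12
        (0, 3): E[0], (1, 3): E[1], (2, 3): E[2],   # F_14, F_24, F_34
    }
    for (m, n), value in entries.items():
        F[m][n] = value
        F[n][m] = -value          # antisymmetry
    return F

def energy_tensor(E, H):
    """Return T[nu][sigma] = T^nu_sigma for the assumed form of (66a)."""
    F = field_tensor(E, H)
    # Raise both indices with the diagonal metric: F^{mu nu} = eta^mu eta^nu F_{mu nu}.
    F_up = [[ETA[m] * ETA[n] * F[m][n] for n in range(4)] for m in range(4)]
    invariant = sum(F[a][b] * F_up[a][b] for a in range(4) for b in range(4))
    return [[-sum(F[s][a] * F_up[n][a] for a in range(4))
             + (0.25 * invariant if n == s else 0.0)
             for s in range(4)] for n in range(4)]

E, H = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)     # arbitrary illustrative field values
T = energy_tensor(E, H)
expected = 0.5 * (sum(x * x for x in E) + sum(x * x for x in H))
print("T^4_4 =", T[3][3], " (E^2 + H^2)/2 =", expected)
```

With these assumptions $T^{4}_{4}$ agrees with the familiar energy density $\tfrac12(E^2 + H^2)$, and the trace $T^{\sigma}_{\sigma}$ vanishes.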
We have now deduced the general laws which are satisfied
by the gravitational field and matter, by consistently using a
system of co-ordinates for which $\sqrt{-g} = 1$. We have
thereby achieved a considerable simplification of formulae
and calculations, without failing to comply with the requirement
of general covariance ; for we have drawn our equations
from generally covariant equations by specializing the system
of co-ordinates.
Still the question is not without a formal interest, whether
with a correspondingly generalized definition of the energy-components
of gravitational field and matter, even without
specializing the system of co-ordinates, it is possible to formulate
laws of conservation in the form of equation (56), and
field equations of gravitation of the same nature as (52) or
(52a), in such a manner that on the left we have a divergence
(in the ordinary sense), and on the right the sum of the
energy-components of matter and gravitation. I have found
that in both cases this is actually so. But I do not think
that the communication of my somewhat extensive reflexions
on this subject would be worth while, because after all they
do not give us anything that is materially new.

E
§ 21. Newton's Theory as a First Approximation
As has already been mentioned more than once, the
special theory of relativity as a special case of the general
theory is characterized by the $g_{\mu\nu}$ having the constant values
(4). From what has already been said, this means complete
neglect of the effects of gravitation. We arrive at a closer
approximation to reality by considering the case where the
$g_{\mu\nu}$ differ from the values of (4) by quantities which are small
compared with 1, and neglecting small quantities of second
and higher order. (First point of view of approximation.)
It is further to be assumed that in the space-time territory
under consideration the $g_{\mu\nu}$ at spatial infinity, with a suitable
choice of co-ordinates, tend toward the values (4) ; i.e. we are
considering gravitational fields which may be regarded as
generated exclusively by matter in the finite region.
It might be thought that these approximations must lead
us to Newton's theory. But to that end we still need to
approximate the fundamental equations from a second point of
view. We give our attention to the motion of a material
point in accordance with the equations (16). In the case of
the special theory of relativity the components

$$\frac{dx_1}{ds}, \quad \frac{dx_2}{ds}, \quad \frac{dx_3}{ds}$$
may take on any values. This signifies that any velocity

$$v = \sqrt{\left(\frac{dx_1}{dx_4}\right)^2 + \left(\frac{dx_2}{dx_4}\right)^2 + \left(\frac{dx_3}{dx_4}\right)^2}$$
may occur, which is less than the velocity of light in vacuo.
If we restrict ourselves to the case which almost exclusively
offers itself to our experience, of $v$ being small as compared
with the velocity of light, this denotes that the components
$$\frac{dx_1}{ds}, \quad \frac{dx_2}{ds}, \quad \frac{dx_3}{ds}$$
are to be treated as small quantities, while $dx_4/ds$, to the
second order of small quantities, is equal to one. (Second
point of view of approximation.)
Now we remark that from the first point of view of approximation
the magnitudes $\Gamma^{\tau}_{\mu\nu}$ are all small magnitudes of
at least the first order. A glance at (46) thus shows that in
this equation, from the second point of view of approximation,
we have to consider only terms for which $\mu = \nu = 4$.
Restricting ourselves to terms of lowest order we first obtain in
place of (46) the equations

$$\frac{d^2 x_\tau}{dt^2} = \Gamma^{\tau}_{44},$$

where we have set $ds = dx_4 = dt$ ; or with restriction to terms
which from the first point of view of approximation are of
first order :-
$$\frac{d^2 x_\tau}{dt^2} = [44, \tau] \quad (\tau = 1, 2, 3), \qquad \frac{d^2 x_4}{dt^2} = -[44, 4].$$

If in addition we suppose the gravitational field to be a
quasi-static field, by confining ourselves to the case where the
motion of the matter generating the gravitational field is but
slow (in comparison with the velocity of the propagation of
light), we may neglect on the right-hand side differentiations
with respect to the time in comparison with those with respect
to the space co-ordinates, so that we have
$$\frac{d^2 x_\tau}{dt^2} = -\tfrac12 \frac{\partial g_{44}}{\partial x_\tau} \quad (\tau = 1, 2, 3). \tag{67}$$
This is the equation of motion of the material point according
to Newton's theory, in which $\tfrac12 g_{44}$ plays the part of the
gravitational potential. What is remarkable in this result
is that the component $g_{44}$ of the fundamental tensor alone
defines, to a first approximation, the motion of the material
point.
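We now turn to the field equations (53). Here we
have to take into consideration that the energy-tensor of
" matter " is almost exclusively defined by the density of
matter in the narrower sense, i.e. by the second term of the
right-hand side of (58) [or, respectively, (58a) or (58b)].
If we form the approximation in question, all the components
vanish with the one exception of $T_{44} = \rho = T$. On the
left-hand side of (53) the second term is a small quantity of
second order; the first yields, to the approximation in
question,

$$\frac{\partial}{\partial x_1}[\mu\nu, 1] + \frac{\partial}{\partial x_2}[\mu\nu, 2] + \frac{\partial}{\partial x_3}[\mu\nu, 3] - \frac{\partial}{\partial x_4}[\mu\nu, 4].$$

For $\mu = \nu = 4$, this gives, with the omission of terms
differentiated with respect to time,

$$-\tfrac12\left(\frac{\partial^2 g_{44}}{\partial x_1^2} + \frac{\partial^2 g_{44}}{\partial x_2^2} + \frac{\partial^2 g_{44}}{\partial x_3^2}\right) = -\tfrac12 \nabla^2 g_{44}.$$

The last of equations (53) thus yields

$$\nabla^2 g_{44} = \kappa\rho. \tag{68}$$
The equations (67) and (68) together are equivalent to
Newton's law of gravitation.
By (67) and (68) the expression for the gravitational
potential becomes

$$-\frac{\kappa}{8\pi}\int \frac{\rho\, d\tau}{r}, \tag{68a}$$

while Newton's theory, with the unit of time which we have
chosen, gives

$$-\frac{K}{c^2}\int \frac{\rho\, d\tau}{r},$$

in which $K$ denotes the constant $6.7 \times 10^{-8}$, usually called
the constant of gravitation. By comparison we obtain
$$\kappa = \frac{8\pi K}{c^2} = 1.87 \times 10^{-27}. \tag{69}$$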
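
The numerical value of $\kappa$ can be reproduced directly. The sketch below assumes C.G.S. units, the value $K = 6.7 \times 10^{-8}$ quoted in the text, and a rounded velocity of light $c = 3 \times 10^{10}$ cm/s, which is an assumption of this example; the function name is hypothetical.

```python
import math

def einstein_kappa(K: float, c: float) -> float:
    """Return kappa = 8*pi*K / c**2 (in cm/g when K and c are in C.G.S. units)."""
    return 8.0 * math.pi * K / c**2

K_CGS = 6.7e-8    # constant of gravitation as quoted in the text, cm^3 g^-1 s^-2
C_CGS = 3.0e10    # assumed rounded velocity of light, cm/s

kappa = einstein_kappa(K_CGS, C_CGS)
print(f"kappa = {kappa:.2e}")
```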

§ 22. Behaviour of Rods and Clocks in the Static Gravitational Field. Bending of Light-rays. Motion of the Perihelion of a Planetary Orbit
To arrive at Newton's theory as a first approximation we
had to calculate only one component, $g_{44}$, of the ten $g_{\mu\nu}$ of the
gravitational field, since this component alone enters into the
first approximation, (67), of the equation for the motion of the
material point in the gravitational field. From this, however,
it is already apparent that other components of the $g_{\mu\nu}$ must
differ from the values given in (4) by small quantities of the
first order. This is required by the condition $g = -1$.
For a field-producing point mass at the origin of co-ordinates,
we obtain, to the first approximation, the radially
symmetrical solution
$$\begin{aligned}
g_{\rho\sigma} &= -\delta_{\rho\sigma} - \alpha\,\frac{x_\rho x_\sigma}{r^3} \quad (\rho, \sigma = 1, 2, 3) \\
g_{\rho 4} &= g_{4\rho} = 0 \quad (\rho = 1, 2, 3) \\
g_{44} &= 1 - \frac{\alpha}{r}
\end{aligned} \tag{70}$$
where $\delta_{\rho\sigma}$ is 1 or 0, respectively, accordingly as $\rho = \sigma$ or $\rho \neq \sigma$,
and $r$ is the quantity $+\sqrt{x_1^2 + x_2^2 + x_3^2}$. On account of (68a)

$$\alpha = \frac{\kappa M}{4\pi}, \tag{70a}$$

if $M$ denotes the field-producing mass. It is easy to verify
that the field equations (outside the mass) are satisfied to the
first order of small quantities.
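
The condition $g = -1$ invoked above can be checked for the first-order solution. The sketch below assumes the components (70) in the form $g_{\rho\sigma} = -\delta_{\rho\sigma} - \alpha x_\rho x_\sigma/r^3$, $g_{\rho 4} = 0$, $g_{44} = 1 - \alpha/r$; the function names and the sample point are illustrative only. The determinant comes out as $-(1 - \alpha^2/r^2)$, which equals $-1$ up to terms of second order in $\alpha/r$.

```python
import math

def metric_70(x, alpha):
    """4x4 components g_{mu nu} of the assumed first-order solution (70) at the point x."""
    r = math.sqrt(sum(c * c for c in x))
    g = [[0.0] * 4 for _ in range(4)]
    for i in range(3):
        for j in range(3):
            g[i][j] = -(1.0 if i == j else 0.0) - alpha * x[i] * x[j] / r**3
    g[3][3] = 1.0 - alpha / r
    return g

def det(m):
    """Determinant by Laplace expansion along the first row (adequate for a 4x4 matrix)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total

x, alpha = (3.0, 4.0, 12.0), 1.0e-3    # illustrative point with r = 13 and small alpha
r = math.sqrt(sum(c * c for c in x))
g_det = det(metric_70(x, alpha))
print("g =", g_det, " -(1 - alpha^2/r^2) =", -(1.0 - alpha**2 / r**2))
```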
We now examine the influence exerted by the field of the
mass $M$ upon the metrical properties of space. The relation
$$ds^2 = g_{\mu\nu}\, dx_\mu\, dx_\nu$$
always holds between the " locally " (§ 4) measured lengths
and times $ds$ on the one hand, and the differences of co-ordinates
$dx_\nu$ on the other hand.
For a unit-measure of length laid " parallel " to the axis
of $x$, for example, we should have to set $ds^2 = -1$ ; $dx_2 = dx_3
= dx_4 = 0$. Therefore $-1 = g_{11}\, dx_1^2$. If, in addition, the
unit-measure lies on the axis of $x$, the first of equations (70)
gives
$$g_{11} = -\left(1 + \frac{\alpha}{r}\right).$$
From these two relations it follows that, correct to a first
order of small quantities,
$$dx = 1 - \frac{\alpha}{2r}. \tag{71}$$

The unit measuring-rod thus appears a little shortened in
relation to the system of co-ordinates by the presence of the
gravitational field, if the rod is laid along a radius.
In an analogous manner we obtain the length of co-ordinates
in tangential direction if, for example, we set
$$ds^2 = -1 ; \quad dx_1 = dx_3 = dx_4 = 0 ; \quad x_1 = r, \; x_2 = x_3 = 0.$$
The result is
$$-1 = g_{22}\, dx_2^2 = -dx_2^2. \tag{71a}$$
With the tangential position, therefore, the gravitational
field of the point of mass has no influence on the length of a
rod.
Thus Euclidean geometry does not hold even to a first approximation
in the gravitational field, if we wish to take one
and the same rod, independently of its place and orientation,
as a realization of the same interval ; although, to be sure, a
glance at (70a) and (69) shows that the deviations to be expected
are much too slight to be noticeable in measurements
of the earth's surface.
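
How slight the deviation is at the earth's surface can be estimated from (71) together with (70a) and (69), which combine to $\alpha = \kappa M/4\pi = 2KM/c^2$. The sketch below uses modern C.G.S. values for the earth's mass and radius; these numbers are assumptions of the example and do not appear in the text.

```python
def alpha_length(K, M, c):
    """alpha = kappa*M/(4*pi) with kappa = 8*pi*K/c**2, i.e. alpha = 2*K*M/c**2 (a length)."""
    return 2.0 * K * M / c**2

def radial_rod_coordinate_length(alpha, r):
    """Co-ordinate length dx of a radially laid unit rod, to first order: dx = 1 - alpha/(2r)."""
    return 1.0 - alpha / (2.0 * r)

K, C = 6.674e-8, 2.998e10                  # assumed modern values, C.G.S.
M_EARTH, R_EARTH = 5.972e27, 6.371e8       # assumed values: grams, centimetres

alpha_earth = alpha_length(K, M_EARTH, C)
shortening = 1.0 - radial_rod_coordinate_length(alpha_earth, R_EARTH)
print(f"alpha = {alpha_earth:.3f} cm, relative shortening = {shortening:.2e}")
```

Under these assumptions $\alpha$ is a little under one centimetre and the relative shortening is of order $10^{-9}$ or smaller.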
Further, let us examine the rate of a unit clock, which is
arranged to be at rest in a static gravitational field. Here we
have for a clock period $ds = 1$ ; $dx_1 = dx_2 = dx_3 = 0$.
Therefore
$$1 = g_{44}\, dx_4^2 ;$$
$$dx_4 = \frac{1}{\sqrt{g_{44}}} = \frac{1}{\sqrt{1 + (g_{44} - 1)}} = 1 - \tfrac12 (g_{44} - 1)$$
or

$$dx_4 = 1 + \frac{\kappa}{8\pi}\int \frac{\rho\, d\tau}{r}. \tag{72}$$

Thus the clock goes more slowly if set up in the neighbourhood
of ponderable masses. From this it follows that the
spectral lines of light reaching us from the surface of large
stars must appear displaced towards the red end of the
spectrum.*
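
For a single spherical mass the integral in (72) reduces outside the body to $M/r$, so that the fractional slowing of the clock is $\kappa M/(8\pi r) = KM/(c^2 r)$. The sketch below evaluates this at the surface of the sun; the solar mass and radius are modern C.G.S. values assumed for the example, and the function name is hypothetical.

```python
def fractional_clock_shift(K, M, c, r):
    """First-order shift kappa*M/(8*pi*r) = K*M/(c**2 * r) for a point-like mass M at distance r."""
    return K * M / (c**2 * r)

K, C = 6.674e-8, 2.998e10              # assumed modern values, C.G.S.
M_SUN, R_SUN = 1.989e33, 6.957e10      # assumed values: grams, centimetres

z_sun = fractional_clock_shift(K, M_SUN, C, R_SUN)
print(f"fractional shift at the solar surface = {z_sun:.3e}")
```

The result is about two parts in a million, which indicates the size of the red-shift in question.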
* According to E. Freundlich, spectroscopical observations on fixed stars of
certain types indicate the existence of an effect of this kind, but a crucial
test of this consequence has not yet been made.
We now examine the course of light-rays in the static
gravitational field. By the special theory of relativity the
velocity of light is given by the equation
$$-dx_1^2 - dx_2^2 - dx_3^2 + dx_4^2 = 0$$
and therefore by the general theory of relativity by the
equation
$$ds^2 = g_{\mu\nu}\, dx_\mu\, dx_\nu = 0. \tag{73}$$
If the direction, i.e. the ratio $dx_1 : dx_2 : dx_3$ is given, equation
(73) gives the quantities

$$\frac{dx_1}{dx_4}, \quad \frac{dx_2}{dx_4}, \quad \frac{dx_3}{dx_4}$$

and accordingly the velocity

$$\sqrt{\left(\frac{dx_1}{dx_4}\right)^2 + \left(\frac{dx_2}{dx_4}\right)^2 + \left(\frac{dx_3}{dx_4}\right)^2} = \gamma$$

defined in the sense of Euclidean geometry. We easily
recognize that the course of the light-rays must be bent with
regard to the system of co-ordinates, if the $g_{\mu\nu}$ are not constant.
If $n$ is a direction perpendicular to the propagation of
light, the Huyghens principle shows that the light-ray, envisaged
in the plane $(\gamma, n)$, has the curvature $-\partial\gamma/\partial n$.
We examine the curvature undergone by a ray of light
passing by a mass $M$ at the distance $\Delta$. If we choose the
system of co-ordinates in agreement with the accompanying
diagram, the total bending of the ray (calculated positively if
concave towards the origin) is given in sufficient approximation
by

$$B = \int_{-\infty}^{+\infty} \frac{\partial \gamma}{\partial x_1}\, dx_2,$$

while (73) and (70) give

$$\gamma = \sqrt{-\frac{g_{44}}{g_{22}}} = 1 - \frac{\alpha}{2r}\left(1 + \frac{x_2^2}{r^2}\right).$$
Carrying out the calculation, this gives

$$B = \frac{2\alpha}{\Delta} = \frac{\kappa M}{2\pi \Delta}. \tag{74}$$

According to this, a ray of light going past the sun undergoes
a deflexion of 1.7" ; and a ray going past the planet
Jupiter a deflexion of about .02".
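
The two quoted deflexions can be reproduced from (74) with $\alpha = \kappa M/4\pi = 2KM/c^2$, taking the distance $\Delta$ equal to the radius of the body for a grazing ray. The masses and radii below are modern C.G.S. values assumed for the example; none of them appears in the text.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def deflection_arcsec(K, M, c, Delta):
    """Total bending B = 2*alpha/Delta, with alpha = 2*K*M/c**2, converted to seconds of arc."""
    alpha = 2.0 * K * M / c**2
    return (2.0 * alpha / Delta) * ARCSEC_PER_RAD

K, C = 6.674e-8, 2.998e10                                    # assumed modern values, C.G.S.
sun = deflection_arcsec(K, 1.989e33, C, 6.957e10)            # assumed solar mass and radius
jupiter = deflection_arcsec(K, 1.898e30, C, 7.149e9)         # assumed mass and equatorial radius
print(f"sun: {sun:.2f} arcsec, Jupiter: {jupiter:.3f} arcsec")
```

Both results are consistent with the rounded figures in the text.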
If we calculate the gravitational field to a higher degree
of approximation, and likewise with corresponding accuracy
the orbital motion of a material point of relatively infinitely
small mass, we find a deviation of the following kind from
the Kepler-Newton laws of planetary motion. The orbital
ellipse of a planet undergoes a slow rotation, in the direction
of motion, of amount

$$\epsilon = 24\pi^3\, \frac{a^2}{T^2 c^2 (1 - e^2)} \tag{75}$$

per revolution. In this formula $a$ denotes the major semi-axis,
$c$ the velocity of light in the usual measurement, $e$ the
eccentricity, $T$ the time of revolution in seconds.*
Calculation gives for the planet Mercury a rotation of the
orbit of 43" per century, corresponding exactly to astronomical
observation (Leverrier) ; for the astronomers have discovered
in the motion of the perihelion of this planet, after allowing
for disturbances by other planets, an inexplicable remainder
of this magnitude.
* For the calculation I refer to the original papers: A. Einstein,
Sitzungsber. d. Preuss. Akad. d. Wiss., 1915, p. 831 ; K. Schwarzschild,
ibid., 1916, p. 189.
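
The figure of 43" per century for Mercury can be recovered from formula (75), here taken in the form $\epsilon = 24\pi^3 a^2 / (T^2 c^2 (1 - e^2))$. The orbital elements below are modern values assumed for the example (C.G.S. units, period in seconds); they are not taken from the text.

```python
import math

ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def perihelion_advance_per_revolution(a, T, e, c):
    """Rotation of the orbital ellipse per revolution, in radians, from the assumed form of (75)."""
    return 24.0 * math.pi**3 * a**2 / (T**2 * c**2 * (1.0 - e**2))

A_MERCURY = 5.791e12           # assumed major semi-axis, cm
T_DAYS = 87.969                # assumed time of revolution, days
E_MERCURY = 0.2056             # assumed eccentricity
C = 2.998e10                   # assumed velocity of light, cm/s

eps = perihelion_advance_per_revolution(A_MERCURY, T_DAYS * 86400.0, E_MERCURY, C)
revolutions_per_century = 36525.0 / T_DAYS
per_century = eps * ARCSEC_PER_RAD * revolutions_per_century
print(f"{per_century:.1f} arcsec per century")
```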

The Foundations of Physics.
(First communication.)
by
David Hilbert.
Presented in the session of November 20th, 1915
(Translated by D. Fine.)

The powerful posing of problems by Einstein¹, as well as his penetrating methods
devised for their solution, and the deep thoughts and original conceptualizations
by means of which Mie² builds his electrodynamics, have opened new paths for the
investigation of the foundations of physics.
I would like in the following - in the sense of the axiomatic method - to establish,
essentially from two simple axioms, a new system of basic equations of physics, that
are of ideal beauty, and in which, so I believe, is included simultaneously the
solutions of the problems of Einstein and Mie. I reserve for later communications
the more precise exposition as well as, above all, the specific application of my basic
equations to the fundamental questions of the science of electricity.
Let $w_s$ $(s = 1, 2, 3, 4)$ be some coordinates naming the world-points essentially
unambiguously (most general spacetime coordinates). The quantities characterizing
what happens in $w_s$ are

1. the ten gravitational potentials $g_{\mu\nu}$ $(\mu, \nu = 1, 2, 3, 4)$, first introduced by
Einstein, with symmetric tensor character under an arbitrary transformation of
the world-parameters $w_s$;

2. the four electrodynamical potentials $q_s$ with vector character in the same
sense.

What happens physically is not haphazard; on the contrary, the following two
axioms hold true:
Axiom I (Mie's Axiom of the world-function³): The law of what happens physically
is determined by a world-function $H$ that contains the following as arguments:

$$g_{\mu\nu}, \qquad g_{\mu\nu l} = \frac{\partial g_{\mu\nu}}{\partial w_l}, \qquad g_{\mu\nu lk} = \frac{\partial^2 g_{\mu\nu}}{\partial w_l\, \partial w_k}, \tag{1}$$
$$q_s, \qquad q_{sl} = \frac{\partial q_s}{\partial w_l} \qquad (l, k = 1, 2, 3, 4) \tag{2}$$

and indeed the variation of the integral

$$\int H \sqrt{g}\, d\omega \qquad (g = |g_{\mu\nu}|, \quad d\omega = dw_1\, dw_2\, dw_3\, dw_4)$$

must vanish for each of the 14 potentials $g_{\mu\nu}$, $q_s$.

¹Sitzungsber. d. Berliner Akad. 1914 S. 1030, 1915 S. 778, 799, 831, 844.
²Ann. d. Phys. 1912, Bd. 37 S. 511, Bd. 39 S. 1, 1913, Bd. 40 S. 1.
³Mie's world-functions do not contain exactly these arguments; in particular, the use of the
arguments (2) goes back to Born; nevertheless it is precisely the introduction and use of such a
world-function in the Hamiltonian principle which is characteristic of Mie's electrodynamics.

In place of the arguments (1) the arguments

$$g^{\mu\nu}, \qquad g^{\mu\nu}_{l} = \frac{\partial g^{\mu\nu}}{\partial w_l}, \qquad g^{\mu\nu}_{lk} = \frac{\partial^2 g^{\mu\nu}}{\partial w_l\, \partial w_k} \tag{3}$$

obviously can appear, where $g^{\mu\nu}$ means the minor determinant of the determinant
$g$ corresponding to $g_{\mu\nu}$, divided by $g$.
Axiom II (Axiom of general invariance⁴): The world-function $H$ is an invariant
with respect to an arbitrary transformation of the world-parameters $w_s$.
Axiom II is the simplest mathematical expression for the requirement that the
coupling of the potentials $g_{\mu\nu}$, $q_s$ is in and of itself completely independent of the
way one chooses to name world-points using world-parameters.
The following mathematical theorem, whose proof I will lay out elsewhere, provides
the Leitmotiv for the construction of my theory.
Theorem I. If $J$ is an invariant with respect to arbitrary transformations of
the four world-parameters which contains $n$ quantities and their derivatives, and if
one forms from

$$\delta \int J \sqrt{g}\, d\omega = 0$$

in reference to those $n$ quantities the $n$ Lagrangian variational equations, then,
in this system of $n$ differential equations for the $n$ quantities, four are always a
consequence of the $n - 4$ others - in the sense that among the $n$ differential equations
and their total derivatives four mutually-independent linear combinations are
identically fulfilled.
In regard to the derivatives with respect to $g^{\mu\nu}$, $g^{\mu\nu}_{k}$, $g^{\mu\nu}_{kl}$ as they appear in (4)
and following formulas, let it be noted once and for all, that due to the symmetry
in $\mu, \nu$ on the one hand, and $k, l$ on the other hand the derivatives with respect to
$g^{\mu\nu}$, $g^{\mu\nu}_{k}$ are to be taken as multiplied by 1, respectively $\tfrac12$,
according to whether
$\mu = \nu$, respectively $\mu \neq \nu$ occurs; further, the derivatives with respect to $g^{\mu\nu}_{kl}$ are
to be taken as multiplied by 1, respectively $\tfrac12$, respectively $\tfrac14$, according to whether
$\mu = \nu$ and $k = l$, respectively $\mu = \nu$ and $k \neq l$ or $\mu \neq \nu$ and $k = l$, respectively
$\mu \neq \nu$ and $k \neq l$ occurs.
From Axiom I follow first in regard to the ten gravitational potentials $g^{\mu\nu}$ the
ten Lagrangian differential equations

and then in regard to the four electrodynamic potentials $q_s$ the four Lagrangian
differential equations

(5)

⁴Mie has already imposed the requirement of orthogonal invariance. In Axiom II above, the
Einsteinian fundamental basic idea of general invariance is given the simplest expression, albeit
in Einstein the Hamiltonian principle plays only a secondary role and his functions $H$ are certainly
not general invariants nor do they contain the electric potentials.


For brevity we denote the left-hand sides of the equations (4), (5) respectively as

$$[\sqrt{g}\, H]_{\mu\nu}, \qquad [\sqrt{g}\, H]_{h}.$$

The equations (4) might be called the basic equations of gravitation, the equations
(5) the basic electrodynamics equations or the generalized Maxwell's equations.
Due to the theorem given above, the equations (5) may be viewed as a
consequence of the equations (4); that is, based on that mathematical proposition,
we can immediately state the claim that in the indicated sense the electrodynamic
phenomena are effects of gravitation. In this realization I discern the simple and
quite surprising solution of the problem of Riemann who was the first to seek the
theoretical relationship between gravitation and light.
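
The counting behind this claim can be set out in one line. It is a sketch only: it applies Theorem I to the 14 potentials of Axiom I, and the symbol $n_{\text{indep}}$ is introduced here purely for illustration.

```latex
n = \underbrace{10}_{g^{\mu\nu}} + \underbrace{4}_{q_s} = 14 ,
\qquad
n_{\text{indep}} = n - 4 = 10 .
```

Four identities among the fourteen Lagrangian equations thus leave ten independent ones, which matches the number of gravitational equations (4).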
In the following, we use the easily proven fact that if $p^j$ $(j = 1, 2, 3, 4)$ denotes
an arbitrary contravariant vector, the expression

represents a symmetric contravariant tensor and the expression

a covariant vector.
Furthermore, we present two mathematical theorems, that read as follows:
Theorem II. If $J$ is an invariant depending on $g^{\mu\nu}$, $g^{\mu\nu}_{l}$, $g^{\mu\nu}_{lk}$, $q_s$, $q_{sk}$ then identically
in all arguments and for every arbitrary contravariant vector $p^s$

wherein

This Theorem I1 may also be stated in the following form:


† The last partial derivative above is x
a9
,not 22-
aqsk
in the original. (Translator)


If J is an invariant and p s an arbitrary vector as before, then the identity

is valid; here we set

and abbreviate

The proof of (6) falls out easily, because this identity is obviously correct when $p^s$
is a constant vector, from which it follows in general due to its invariance.
Theorem III. If J is an invariant depending only on the $g^{\mu\nu}$ and their derivatives, and, as above, the variational derivative of $\sqrt{g}J$ with respect to $g^{\mu\nu}$ is denoted
$[\sqrt{g}J]_{\mu\nu}$, then the expression, with $h^{\mu\nu}$ understood to be some contravariant tensor,

represents an invariant; if in this sum we replace $h^{\mu\nu}$ by the particular tensor $p^{\mu\nu}$


and write

where the expressions

depend only on the $g^{\mu\nu}$ and their derivatives, then

(7)

in the manner that this equation is fulfilled for all arguments, namely the $g^{\mu\nu}$ and
their derivatives.
For the proof we consider the integral

$\int J\sqrt{g}\,d\omega$, $\quad d\omega = dw_1\,dw_2\,dw_3\,dw_4$


extending over a finite piece of the four-dimensional world. Further $p^s$ should be a
vector that along with its derivatives vanishes on the three-dimensional boundary
of this piece of the world. As $P = P_g$, it follows from the formula (12a) appearing
below that

which gives

and, due to the manner of construction of the Lagrangian derivative, is also ac-
cordingly

The introduction of $i_s$, $i_s^l$ in this identity shows finally that

$\int \Big( \sum_l \frac{\partial i_s^l}{\partial w_l} - i_s \Big)\, p^s \, d\omega = 0$

and therefore also that the claim of our theorem is correct.


The most important goal at this point is the establishment of the concept of
energy and the derivation of the energy proposition on the basis of the axioms I
and I1 alone.
Towards that we construct first:

Now, is a mixed tensor of fourth order and therefore if one sets

the expression

is a contragradient vector.
If we therefore form the expression

this no longer contains the second derivatives and is therefore of the form

wherein

is again a mixed tensor.


At this point we form the vector

(9)

and then obtain

(10)

On the other hand, we form

then is a tensor, and the expression

therefore represents a contragradient vector. Correspondingly, as above,

Recalling now the basic equations (4) and (5), it follows from addition of (10)
and (12) that

Now,


and therefore by dint of the identity (6)

With this we finally obtain the equation

$\sum_l \frac{\partial}{\partial w_l} \sqrt{g}\,\big(Hp^l - a^l - b^l - c^l\big) = 0.$

Now we recall that


$\frac{\partial H}{\partial q_{lk}} - \frac{\partial H}{\partial q_{kl}}$
is a skew-symmetric tensor; as a consequence

will be a contravariant vector and indeed the latter obviously satisfies the identity

If at this point we define

(14) $e^l = Hp^l - a^l - b^l - c^l - d^l$

as the energy-vector, then the energy-vector is a contravariant vector that still
depends linearly on the arbitrary vector $p^s$ and for every choice of this vector $p^s$
satisfies identically the invariant energy-equation

As concerns the world-function H, further axioms are necessary to make its
choice unambiguous. If the gravitational equations are to contain only second
derivatives of the potentials $g^{\mu\nu}$, then H must have the form

H = K + L,

where K denotes the invariant arising from the Riemann tensor (curvature of the
four-dimensional manifold)


and L depends only on $g^{\mu\nu}$, $g^{\mu\nu}_l$, $q_s$, $q_{sk}$. Finally we make in the following the
simplifying assumption that L does not contain the $g^{\mu\nu}_l$.
We then apply Theorem II to the invariant L and obtain

Equating to zero the coefficients of ps on the left-hand side yields the equation

or

that is, the derivatives of the electrodynamic potentials $q_s$ appear only in the combination

$M_{ks} = q_{sk} - q_{ks}$.

We thereby recognize that with our assumptions, the invariant L beyond the potentials $g^{\mu\nu}$, $q_s$ depends solely on the components of the skew-symmetric invariant tensor

$M = (M_{ks}) = \operatorname{Rot}(q_s)$;

that is, the so-called electromagnetic six-vector. This result, by which the character
of Maxwell's equations is really conditioned, ensues here essentially as a consequence of general invariance, thus on the basis of Axiom II.
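The statement that only the skew combination of the derivatives of the potentials enters can be checked numerically. The following is a minimal sketch, not part of Hilbert's text: the sample potential `base_potential`, the scalar whose gradient is `grad_chi`, and the function `six_vector` are all hypothetical names chosen for illustration, and the derivatives $q_{sk} = \partial q_s / \partial w_k$ are approximated by central differences. It shows that $M_{ks} = q_{sk} - q_{ks}$ is skew-symmetric and that adding a pure gradient to $q_s$ (whose derivative matrix is symmetric) leaves $M_{ks}$ unchanged.

```python
import math

def base_potential(w):
    """Hypothetical sample potential q_s(w), s = 0..3 (illustration only)."""
    w0, w1, w2, w3 = w
    return [w1 * w2, math.sin(w0), w3 ** 2, w0 * w1]

def grad_chi(w):
    """Gradient of the hypothetical scalar chi = w0*w1 + sin(w2)*w3."""
    w0, w1, w2, w3 = w
    return [w1, w0, math.cos(w2) * w3, math.sin(w2)]

def shifted_potential(w):
    """Sample potential plus a pure gradient term."""
    return [a + b for a, b in zip(base_potential(w), grad_chi(w))]

def six_vector(q, w, h=1e-5):
    """Return M[k][s] = q_sk - q_ks, with q_sk = dq_s/dw_k by central differences."""
    n = len(w)
    dq = [[0.0] * n for _ in range(n)]  # dq[s][k] approximates dq_s/dw_k
    for k in range(n):
        up = list(w)
        down = list(w)
        up[k] += h
        down[k] -= h
        q_up, q_down = q(up), q(down)
        for s in range(n):
            dq[s][k] = (q_up[s] - q_down[s]) / (2.0 * h)
    return [[dq[s][k] - dq[k][s] for s in range(n)] for k in range(n)]

point = [0.3, -1.2, 0.7, 2.0]
M = six_vector(base_potential, point)
M_shifted = six_vector(shifted_potential, point)

# Largest violation of skew symmetry, and largest change caused by the gradient term.
skew_error = max(abs(M[k][s] + M[s][k]) for k in range(4) for s in range(4))
shift_error = max(abs(M[k][s] - M_shifted[k][s]) for k in range(4) for s in range(4))
print(skew_error, shift_error)
```

Both printed numbers should be zero up to finite-difference and rounding error, which is the numerical counterpart of the symmetric part of $q_{sk}$ dropping out of L.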
If we set the coefficient of pk on the left-hand side of the identity (15) to zero,
then we obtain, using (16),

This equation permits an important reformulation of the electromagnetic energy,
that is, the part of the energy vector coming from L. This part is given in fact from
(11), (13), (14) as follows:

Due to (16) and recalling (5) this expression becomes

($\delta_s^l = 0$, $l \neq s$; $\delta_s^s = 1$);


that is, due to (17)

Due to the formula (21) developed in the following, we see from this in particular
that the electromagnetic energy and with it also the total energy-vector $e^l$ may be
expressed in terms of K alone, so that only the $g^{\mu\nu}$ and their derivatives, but not
the $q_s$ and their derivatives appear therein. If in the expression (18) one goes to
the boundary case

$g_{\mu\nu} = 0$ ($\mu \neq \nu$), $g_{\mu\mu} = 1$,

then the same agrees exactly with that which Mie set out in his electrodynamics:
the Mie electromagnetic energy-tensor is thus nothing but the general
invariant tensor arising through differentiation of the invariant L with
respect to the gravitational potential $g^{\mu\nu}$, a circumstance which first directed
me to the necessarily close relationship between the Einsteinian general theory of
relativity and Mie's electrodynamics and gave me the conviction of the correctness of the theory developed here.
It remains yet to show directly from the assumption

(20) H = K + L

how the generalized Maxwell equations (5) presented above are a result of the
gravitational equations (4) in the sense given above. Using the notation introduced
above for the variational derivatives with respect to the $g^{\mu\nu}$, the gravitational equations due to (20) take the form

The first term on the left becomes

as follows easily without calculation from the fact that $K_{\mu\nu}$ is, other than $g_{\mu\nu}$, the only
tensor of second rank and K is the only invariant that can be constructed with
only the $g^{\mu\nu}$ and their first and second derivatives $g^{\mu\nu}_k$, $g^{\mu\nu}_{kl}$.
The differential equations of gravity coming about thusly appear to me to be in
agreement with the ambitious theory of general relativity presented by Einstein in
his later treatments.
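If we continue in general as above to denote the variational derivatives of $\sqrt{g}J$
with respect to the electrodynamic potential $q_h$ by

$[\sqrt{g}J]_h = \frac{\partial \sqrt{g}J}{\partial q_h} - \sum_k \frac{\partial}{\partial w_k} \frac{\partial \sqrt{g}J}{\partial q_{hk}}$

5 loc. cit. Berliner Sitzungsber. 1915.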

then due to (20) the electrodynamic basic equation takes the form

(22) $[\sqrt{g}L]_h = 0$

Now as K is an invariant depending solely on the $g^{\mu\nu}$ and their derivatives, according
to Theorem III the equality (7) obtains identically, wherein

(23) $i_s =$

Due to (21) and (24), (19) equals $-\frac{1}{\sqrt{g}}\, i_\nu^m$. Through differentiation by $w_m$ and
summation over m we obtain due to (7)

as

and

We now recall that due to (16)

and obtain then through appropriate rearrangement:

(25)


On the other hand,

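The first term on the right-hand side is, due to (21) and (23), nothing other than
$i_\nu$. The last term on the right proves to be cancelled by the last term on the right in
(25); in fact,

as the expression

$\frac{\partial M_{s\nu}}{\partial w_m} - \frac{\partial q_{ms}}{\partial w_\nu} = \frac{\partial^2 q_\nu}{\partial w_s \partial w_m} - \frac{\partial^2 q_s}{\partial w_\nu \partial w_m} - \frac{\partial^2 q_m}{\partial w_\nu \partial w_s}$

is symmetric in s, m and the first factor in the summation in (26) comes out skew-symmetric in s, m.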
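The cancellation used here rests on a general fact: a sum over two indices of a skew-symmetric factor times a symmetric factor vanishes, because the (s, m) and (m, s) terms cancel in pairs. A small numerical sketch of that fact follows; the random 4 × 4 array and the helper names `symmetric_part`, `skew_part` and `contract` are illustrative assumptions and do not come from the paper.

```python
import random

def symmetric_part(t):
    """Return (t + t^T) / 2 for a square array given as nested lists."""
    n = len(t)
    return [[0.5 * (t[s][m] + t[m][s]) for m in range(n)] for s in range(n)]

def skew_part(t):
    """Return (t - t^T) / 2 for a square array given as nested lists."""
    n = len(t)
    return [[0.5 * (t[s][m] - t[m][s]) for m in range(n)] for s in range(n)]

def contract(a, b):
    """Full contraction: sum over s and m of a[s][m] * b[s][m]."""
    n = len(a)
    return sum(a[s][m] * b[s][m] for s in range(n) for m in range(n))

rng = random.Random(0)  # fixed seed so the run is repeatable
first = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(4)]
second = [[rng.uniform(-1.0, 1.0) for _ in range(4)] for _ in range(4)]

sym = symmetric_part(first)
skew = skew_part(second)

# Skew-symmetric times symmetric, summed over both indices: vanishes up to rounding.
mixed = contract(skew, sym)
# For contrast, a symmetric array contracted with itself is a sum of squares.
same = contract(sym, sym)
print(mixed, same)
```

The first printed value is zero up to rounding, while the second is strictly positive, which shows that the vanishing comes from the symmetry mismatch and not from the particular numbers.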
The equation

(27)

follows immediately from (25); that is, from the gravitational equations (4) follow
indeed the four mutually linearly independent combinations (27) of the electrodynamic basic equations (5) and their first derivatives. This is the precise mathematical expression of the above claim expressed generally about the character of
electrodynamics as a consequence of gravity.
As L in consequence of our assumptions should not depend on the derivatives of
the $g^{\mu\nu}$, L must be a function of four certain general invariants which correspond
to the special orthogonal invariants given by Mie and of which the two most simple
are these:

and

The most simple, and looking at the construction of K, most obvious Ansatz for
L is likewise the one which corresponds to Mie's electrodynamics; namely,

$L = \alpha Q + f(q)$

or, more specifically following Mie:

$L = \alpha Q + \beta q^3$,

where f(q) denotes any function of q and $\alpha$, $\beta$ denote constants.*
* The last term on the right-hand side is q 5 not q in the original. (Translator)

As one sees, the few simple assumptions expressed in the Axioms I and II suffice with appropriate interpretation for the construction of the theory: thereby not
only are our conceptions of space, time and motion transformed from the ground
up in the sense laid forth by Einstein, but I am also of the conviction that through
the basic equations presented here the most intimate, hitherto hidden, processes
within the atom will receive clarification and most particularly it must be generally
possible to trace all physical constants back to mathematical constants, as then
thereby the possibility first edges nearer that in principle a science of the style of
geometry will evolve from physics: certain is the most excellent reputation of the
axiomatic method which, as we see here, takes into its service the powerful instruments of analysis; namely, variational calculus and invariant theory.

Translator's notes:
1. Hilbert writes in the erudite manner of a well-educated mathematician of the early
20th century. I have endeavored to translate his German into an English preserving
the feel of his style. Insofar as I have succeeded, credit is due to my wife, Heidemarie Floerke, whose help in sorting through the nuances of Hilbert's German was
invaluable. The extent to which the reader may mistake Hilbert here for an elderly
German academic writing in broken English is my fault entirely.
2. This paper does not appear in Hilbert's collected works.(i) As Pais notes,(ii) Hilbert's
claim that electrodynamics can be viewed as a consequence of gravity rests on a
misunderstanding. The identities among the gravitational potentials $g^{\mu\nu}$ and the
electrodynamical potentials $q_s$ on which Hilbert bases his claim express the Bianchi
identities, which are automatically true, and of which Hilbert was apparently unaware. In a subsequent version,(iii) Hilbert corrects this and other errors. This
version, which presents verbatim much of the present paper, is included in his collected works. Presumably, Hilbert was not anxious to have the erroneous early
version more widely read. From a historical perspective, however, it provides a
fascinating glimpse of Hilbert poised, in 1915, to give an axiomatic, geometric foundation to all of theoretical physics.

(i) D. Hilbert, Gesammelte Abhandlungen, Chelsea, New York, 1981.
(ii) A. Pais, Subtle is the Lord: The Science and the Life of Albert Einstein, p. 358, Oxford U. Press, Oxford, 1982.
(iii) D. Hilbert, Die Grundlagen der Physik, Math. Ann. 92, p. 1 (1924).


Meeting of February 27,1922

On a generalization of the concept of Riemann curvature and spaces with torsion.
Note by Mr. E. Cartan, presented by Mr. E. Borel.

(Translated by Arthur Fine)

In a recent note(1) I showed how, in an Einstein universe with a given $ds^2$, the energy
tensor attached to each volume element of that universe can be defined geometrically;
this is the tensor which, set equal to zero, gives the laws of gravitation in any region
devoid of matter. The definition that I gave makes the curvature of the universe depend
on a certain rotation associated with every closed, infinitesimal contour, and this rotation
was introduced on the basis of the concept of parallel transport of Levi-Civita. This last
concept itself, although it was originally presented using geometrical considerations, is
rather difficult to define in a precise way without calculation. But it is possible, it seems
to me, to show the major significance of it by generalizing the very concept of space; at
the same time that will lead us to geometrical images of material universes physically
richer than our universe, at least as it is usually considered; that will also show us the true
rationale of the fundamental laws governing the energy tensor (law of symmetry, law of
conservation).

Let us restrict ourselves to the case of three dimensions, the generalization to four
dimensions being easy. Imagine a space which, in the immediate neighborhood of each
point, has all the characteristics of Euclidean space. The inhabitants of this space will
know, for example, how to locate points infinitely close to a point A by means of an
orthogonal triple having this point A as origin; but we will suppose further that they have
a law enabling them to orient, in relation to the triple at origin A, every coordinate triple
having its origin A' close to A; in particular that will give a sense for them to say that two
directions, one coming from A and another from A', are parallel. Ultimately, such a
space will be defined by the law of mutual orientation (of a Euclidean nature) of two
triples with origins infinitely close.

A space of the preceding kind is not completely defined by its $ds^2$. The $ds^2$, indeed,
determines only one part of the operation that allows the passage from a triple with origin
A to an infinitely close triple with origin A', namely a translation A→A'; in addition, as
one knows, $ds^2$ being fixed, a rotation can still be defined according to an arbitrary law.

That granted, when one describes a closed, infinitesimal contour starting from point A
and returning there, the divergence between the space considered and Euclidean space

(1) Comptes rendus, vol. 174, 1922, p. 437.



will show itself in the following way. Let us attach a coordinate triple to each point M of
the contour; to pass from the triple attached to M to the triple attached to the infinitely
close point M', one needs to make an infinitesimal translation and rotation whose
components one knows with respect to the moving triple with origin M.
Imagine that this collection of infinitesimal displacements is carried out in a Euclidean
space starting from an initial triple chosen arbitrarily. When the point M of non-
Euclidean space that starts from A returns there after having described the closed contour,
in Euclidean space one will not recover the initial triple, but for that to obtain it will be
necessary to carry out a complementary displacement whose components will be well
defined with respect to the initial triple. This complementary displacement is otherwise
independent of the law whereby one attached a triple to each point M of the contour.

In sum, associated with any infinitesimal closed contour of the given space are an
infinitesimal translation and rotation (on the order of magnitude of the surface area
bounded by the contour) and which express the divergence between this space and
Euclidean space. The rotation can be represented by a vector with origin A and the
translation by a couple. One can then prove the following conservation law: If one
considers an infinitesimal volume, the vectors and the couples associated with different
elements of the surface bounding the volume are in equilibrium.

Thus one has a geometrical image of a continuous material medium in equilibrium
under the sole action of its elastic forces, but in a situation where these forces would be
expressed on each surface element, not only by one single force (tension or pressure), but
by a couple (torsion).

Return now to the case where we are given simply $ds^2$. An easy calculation shows that,
among all the laws of mutual orientation of two triples of infinitely close origin
compatible with the given $ds^2$, there is only one for which the translation associated with
an arbitrary, infinitesimal closed contour is null. It is this law which leads to the concept
of parallel displacement of Levi-Civita. The couple in question above disappears, and this
is why the elastic tensor satisfies the law of symmetry.

In the general case where there is a translation associated with any infinitesimal closed
contour, one can say that the given space is different from Euclidean space in two
respects: 1) by a curvature in the sense of Riemann, which results in rotation; 2) by a
torsion, which results in the translation.
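The rotation that curvature attaches to a closed contour can be made concrete in the torsion-free case. The sketch below is an illustration under stated assumptions and is not taken from Cartan's note: it works on the ordinary unit sphere, approximates Levi-Civita parallel transport by repeatedly projecting the vector onto the next tangent plane, and uses hypothetical helper names such as `holonomy_angle`. Carrying a tangent vector once around a circle of colatitude θ should rotate it by the enclosed area, 2π(1 − cos θ), modulo 2π; the sketch does not address the translation produced by torsion.

```python
import math

def point_on_latitude(colatitude, phi):
    """Point of the unit sphere with the given colatitude and longitude."""
    s = math.sin(colatitude)
    return (s * math.cos(phi), s * math.sin(phi), math.cos(colatitude))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def project_to_tangent(v, p):
    """Drop the component of v along the unit normal p, then restore unit length."""
    c = dot(v, p)
    t = tuple(x - c * y for x, y in zip(v, p))
    norm = math.sqrt(dot(t, t))
    return tuple(x / norm for x in t)

def holonomy_angle(colatitude, steps=20000):
    """Angle in [0, 2*pi) by which a tangent vector has turned after one circuit."""
    p0 = point_on_latitude(colatitude, 0.0)
    v0 = (math.cos(colatitude), 0.0, -math.sin(colatitude))  # unit tangent pointing south
    v = v0
    for i in range(1, steps + 1):
        p = point_on_latitude(colatitude, 2.0 * math.pi * i / steps)
        v = project_to_tangent(v, p)  # discrete approximation of parallel transport
    # Signed angle from v0 to v, measured about the outward normal at the start point.
    angle = math.atan2(dot(cross(v0, v), p0), dot(v0, v))
    return angle % (2.0 * math.pi)

theta = math.pi / 3
measured = holonomy_angle(theta)
enclosed_area = 2.0 * math.pi * (1.0 - math.cos(theta))
print(measured, enclosed_area)
```

For θ = π/3 the enclosed area is π, so the transported vector comes back reversed; on the equator (θ = π/2) the enclosed area is 2π and the vector returns unchanged. The projection scheme is only first-order accurate, so agreement improves as `steps` grows.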

In a space with curvature and torsion, the method of moving triples, as in Euclidean
space, allows one to build a theory of the curvature of curves (and even of surfaces). A
straight line will be characterized by the property of having null (relative) curvature at all
of its points; i.e., of preserving the same direction locally. A straight line is no longer
necessarily the shortest path from one point to another; it is in spaces devoid of torsion;
exceptionally, it can also be in certain special torsion spaces.

A very simple example of this last case is the following. Imagine a space $\mathcal{E}$ that
corresponds point by point with a Euclidean space E, the correspondence preserving
distances. The difference between the two spaces will be as follows: two orthogonal
triples originating from two infinitely close points A and A' of $\mathcal{E}$ will be parallel when
the corresponding triples of E can result one from the other by a helicoidal displacement
at a given rate in a given sense (right-handed, for example), with the line that connects
their origins as axis. Lines of $\mathcal{E}$ then correspond to lines of E: they are still geodesics.
Space $\mathcal{E}$ thus defined admits a 6-parameter group of transformations; it would be our
ordinary space seen by observers all of whose perceptions would be twisted.
Mechanically it would correspond to a medium with constant pressure and constant
torsion.

I will add that the preceding considerations which, from the point of view of
mechanics, are connected with the beautiful work of Messrs. E. and F. Cosserat on the
Euclidean action, are also connected with the theory of generalized spaces of H. Weyl
and can themselves be extended.
Chapter 3

The Scalar-Tensor Theory of Gravity*

*P. Jordan, von M. Fierz, C. Brans, R. H. Dicke



No. 4172 October 15, 1949 NATURE 637


FORMATION OF THE STARS AND DEVELOPMENT OF THE UNIVERSE
By PROF.PASCUAL JORDAN
Hamburg

Introduction

In a fascinating article which appeared in Nature of February 6, under the title 'Stellar Evolution and the Expanding Universe', Mr. F. Hoyle has brought forward convincing arguments for the permanent creation of matter in space. Many physicists will find it difficult to accept this hypothesis. For if there is any law which has withstood all changes and revolutions in physics, it is the law of conservation of energy, which according to Einstein's formula $E = mc^2$ is equivalent to the conservation of mass. The same strange conclusion has, during recent years, been formulated by Prof. Pascual Jordan, but with an important modification, whereby the conservation law is not violated. This is achieved by taking account of the loss of gravitational energy connected with the creation of particles. As Jordan's papers do not seem to be known to many English-speaking physicists, I have asked him to write a short report of his work, and the following article is a translation of his article made by my collaborator, Dr. H. S. Green. MAX BORN

Since the formulation by Einstein of the general theory of relativity, many theoretical considerations have been advanced in which models of the universe are derived deductively from certain physical principles (expressed in the form of field equations), which are adopted as hypotheses. The treatment here explained is an attempt to approach the problem more cautiously. It is necessary to reckon with the possibility that processes and natural laws are involved which lie outside our past experience, and which must be learnt by the examination of the universe: an essentially inductive approach to the cosmological problem must be sought. The dimensional analysis of the empirical facts presents itself as the method appropriate to this task. This method proceeds by the combination of the experimental data to form dimensionless quantities, and the comparison of these dimensionless numbers with regard to their order of magnitude.

Dimensionless constants, the significance of which remains to be explained, are, as is well known, already apparent in atomic physics. Many attempts have been made to elucidate the significance of Sommerfeld's fine-structure constant and the ratio of the masses of the proton and electron. Though Eddington's well-known considerations have not led to the solution of these problems, he has very usefully made apparent their great urgency and importance. The discovery of different kinds of mesons has intensified these problems: a theory which could explain the mass ratios of the different kinds of mesons could probably account also for the two constants already mentioned. The reciprocity theory of Born and Green promises some interesting progress in this direction.

There is, however, an even more significant dimensionless constant in microphysics. If the ratio of the electrostatic and gravitational attractions of an electron and a proton is formed, the number $2 \times 10^{39}$ results. Can it reasonably be hoped that this value can ever be established by theoretical considerations? Eddington cherished the optimistic view that it could; but his speculations really only served to make one more acutely conscious of how improbable it is, on the basis of such natural laws as are known, that an explanation could be given of why a dimensionless constant should have just this enormous value, and not some other. It is not so much a question of the particular natural laws known to us to-day, as of the mathematical type common to all the known laws of physics. The mathematical type which characterizes physical laws allows the explanation of such natural constants as $\pi$, or $4\pi$, or $\sqrt{2}$; but a number of the order of magnitude $10^{40}$ can never arise from the mathematically simple, fundamental laws of Nature. But should it now be believed that there are, in the laws of Nature, numerical constants which by a meaningless chance have their actual values, which could equally well assume other arbitrary values, without producing a fundamental discord in the harmony of the natural laws? It is an outstanding merit of Eddington that he emphasized that we should never believe in meaningless coincidences.

There are six distinct quantities which comprise the sum of our knowledge of the structure of the universe on a large scale. After the velocity of light, c, and the gravitational constant, $k = 8\pi f/c^2$, where f is the Newtonian gravitational constant, comes thirdly the maximum age of the oldest celestial bodies known to us. From a variety of data and considerations, one arrives at the conclusion that none of all the known stars and systems of stars can have existed for longer than a certain maximum time, which is approximately $4 \times 10^9$ years. Thus one has a quantity of the order of magnitude $A = 10^{17}$ sec.

Three constants additional to those already mentioned are furnished by the statistics of the spiral nebulae. From the enumeration of the nebulae and the determination of the masses of a cross-section of typical nebulae, one obtains the value of the mean mass-density $\rho$ throughout the universe; this comes to about […] gm. cm.$^{-3}$. The second of the three constants of the spiral nebulae is the constant $\alpha$ of the Hubble effect. The spectral lines of the very distant nebulae show a displacement to the red, which may be described simply by the empirical fact that if a spectral line has the wave-length $\lambda$, then the displacement $\Delta\lambda$ is proportional to $\lambda$, and also to the distance r of the nebula from us, so that one obtains always the same value for $\alpha = c\,\Delta\lambda/\lambda r$. The third constant of the spiral nebulae is empirically the worst determined. It seems, however, that the number of those spiral nebulae the separation of which from us is in the interval r, r + $\Delta r$, is rather less than proportional to $4\pi r^2 \Delta r$, that is, approximately proportional to $4\pi r^2 (1 - \text{const.}\, r^2)\,\Delta r$. Here the constant defines the radius of the universe R, to the square of which it is inversely proportional.

The assembly of cosmological data is now complete; and it may be inquired what dimensionless constants can be constructed from c, k, A, $\rho$, $\alpha$ and R. Such are $\alpha A$; $R/cA$; $k\rho c^2 A^2$; and it is an

extraordinary thing that all three are of the order 1. This is the justification for an attempt to interpret these constants by means of the following simple and illuminating picture. The 'radius of curvature' R is interpreted (and is for that reason given such a name) in the sense of Riemannian geometry: instead of the infinite Euclidean space, one has thus a closed, finite, Riemannian space, the volume of which is of the order of magnitude $R^3$. In the same intuitive way the Hubble effect is interpreted as a Doppler effect. Different interpretations of the Hubble effect have, indeed, often been attempted, but here the intuitive concept of an expanding space is retained without modification. The empirical relation $R = cA$ then means that the radius of the universe increases at a rate which is just of the magnitude of the velocity of light, an attractive and revealing result. Also $\alpha A = 1$ means that this space, which has been expanding with the velocity of light ever since that time which we recognize to be most remote in the history of the universe, must once have been very small.

But what is meant by the fact that the last of the three dimensionless constants mentioned is of the order of magnitude unity? This relation appears already in the well-known model of the universe which was Einstein's bold attempt, in framing the general theory of relativity, to realize the idea of a closed, Riemannian, physical space. Indeed, if M is understood to be the total mass of the universe (which is then of the order of magnitude $\rho R^3$) and cA is replaced as previously by R, the equation in question can be written in the form $kM \approx R$; and Einstein had previously obtained the relation $kM = 4\pi^2 R$, for a closed space with a time-independent radius R, on the basis of the relativistic theory of gravitation.

From a consideration of the Hubble effect, however, the concept of a growing universe has been reached: and then the empirical relation $kM \approx R$ gives a disturbing conclusion. If R is not fixed, but is always growing, then neither can kM be regarded any longer as invariable; and, of the two factors, the gravitational constant k, and the mass M of the universe, at least one must vary likewise with time. An important contribution to the elucidation of this situation is contained in an observation by A. Haas: the relation $kM \approx R$ can also be written in the form $fM^2/R \approx Mc^2$, which means that the negative potential energy of gravitation for the whole universe is equal to the sum of the rest-energies of the masses of the stars. This provides a surprising solution of the problem of the universe: it is possible for the total energy of the universe to have the value zero exactly, through the cancelling out of the positive and negative contributions to the energy. The relation $kM \approx R$ would then appear as a direct consequence of the conservation of energy, which requires that the evolving universe should continue in a sequence of states the total energy of which always has the value zero.

To proceed further, another dimensionless constant is required; those which can be produced from the six cosmological quantities c, k, A, $\rho$, $\alpha$ and R are exhausted by the three dimensionless constants already mentioned. More are obtained if the cosmological quantities are compared with those derived from microphysics. Hitherto the radius of the universe R and the mass of the universe $M \approx \rho R^3$ have been expressed in centimetres and grams; now they will be expressed in terms of the fundamental units of microphysics. The choice of these units is, of course, made uncertain by the appearance of the unexplained dimensionless constants of microphysics: it is questionable whether one should take, as the natural unit of mass, the mass of the proton, or that of the meson or electron, and whether the natural unit of length should be Bohr's radius of the hydrogen atom, or the electronic radius $e^2/m_e c^2$, the ratio of which is essentially the square of the fine-structure constant. However, the dimensional ratios with which one has to do are so enormously large that these distinctions are of little consequence; moreover, there is sufficient reason to give preference to the mass $m_\pi$ of the meson, and to the electronic radius or 'elementary length' $l = 2 \times 10^{-13}$ cm. If the ratio $M : m_\pi$, which is then of the order of magnitude of the number of elementary particles in the whole Universe, is formed, then a number of the colossal magnitude $10^{80}$ is obtained. Eddington had the boldness to urge that this number should not be simply accepted as something incapable or without need of explanation; he sought, by means of some curious considerations, to give theoretical grounds for supposing that the number of protons in the universe must have the value 2[…]. The true solution would appear to lie in another direction: an idea which Dirac has put forward in quite a different connexion will now be introduced.

The ratio $R/l$ is of the same order of magnitude, $10^{40}$, already encountered in the comparison of the gravitational and electromagnetic attractions between two elementary particles; if these two dimensionless numbers are divided, therefore, a new dimensionless quantity of the gratifying order of magnitude 1 is obtained. The ratio of the two attractive forces is thereby compared, however, with a number which, as is already known, is not constant: the ratio $R/l$ increases proportionally to the age of the universe; it is, in fact, equal to the age of the universe, expressed in terms of the 'elementary time' $\tau = l/c \approx 10^{-23}$ sec. Also, from the significant discovery of a quotient of the order of magnitude unity, it must be concluded, with Dirac, that the constant of gravitation is in reality not a constant, but inversely proportional to the age A of the universe.

What is now the obstacle to applying this idea also to the ratio $M/m_\pi$, and asserting that the number of elementary particles in the universe obviously increases as the square of the age of the universe? Dirac, who was led away from this application by a special train of thought (concerning the formulation of a cosmology with an infinite Euclidean space and an infinite total mass of the universe), was probably influenced by a fear of contradicting the principle of conservation of energy. However, the foregoing consideration has cleared the way in this respect: with the perception that $kM \approx R$ expresses the conservation of energy, the complete harmony of all the statements concerning the proportionality of k to $A^{-1}$, M to $A^2$, and R to A is attained.

It is, therefore, accepted that there is a continual creation of matter in the space of the universe, and the question arises, how and where this generation of matter, connected with the growth of the universe, occurs. To answer this, the individual stars, instead of the universe as a whole, are now taken as the subject for careful consideration in the light of dimensional analysis. The mass of the sun amounts to $2 \times 10^{33}$ gm.; there are certainly many stars with still smaller masses, and an established lower


limit scarcely exists. There are, however, on the other hand, few stars with very large masses: $10^{35}$ gm. = 50 suns may be taken as an approximate value for an upper limit to stellar masses which is very seldom exceeded. It may be asked how many protons such a star would contain: the answer is of the order of magnitude $10^{60}$.

Here is a fresh opportunity to apply Dirac's principle, from which it may be concluded that this upper limit, too, must be a function of $A$: indeed, it must then be interpreted as $A^{3/2}$. According to this hypothesis, the mass of a star (on the average) depends on the age attained by the universe at its formation; this formation being settled, the mass of the star (the number of its nucleons) will remain unchanged (apart from secondary processes, as occasional gathering of dust). Therefore, the normal mass of a star which was generated at an age $A$ of the world must be proportional to $A^{3/2}$. (In consequence of this, to-day not only the average mass of recently created stars, but also the average mass of now existing stars has an order of magnitude $10^{60}$, proportional to $A^{3/2}$.) Naturally, this dependence of the average of stellar masses on the age of the universe necessarily implies a corresponding dependence on the gravitational constant, which is then itself a function of the age of the universe: and, in fact, the proportionality $M_{st} \sim k^{-3/2}$, inferred in the first instance from a comparison of the empirical numbers $10^{40}$ (present age of the world) and $10^{60}$ (present mass of the normal star), is in agreement with and corroborated by the fact that the present theories of the internal constitution of the stars continually propose stellar quantities which are proportional to $k^{-3/2}$.

It has been made clear above that the requirement of conservation of energy on a large scale is satisfied, if a creation of matter is assumed which leads to an increase of $M$ proportional to $A^{2}$. Nothing is said there, however, of the way in which this creation of matter can occur. Though the possibility of a localization of energy, like that encountered in Maxwell's theory, can no longer be maintained in Einstein's theory of gravitation, it is certain that in some way a rigorous formulation of the principle of conservation of energy must hold such that, not only for the universe as a whole, but also for a finite region, where the creation takes place, a balance of energy is maintained. Consider now the rough model of a homogeneous sphere with mass $M_0$ and radius $R_0$ in an approximately flat, Euclidean space. If it satisfies the relation $kM_0 = R_0$ (apart from a certain numerical factor of the order of magnitude unity) then this space has zero total energy: if it were required to effect a transformation to a cloud of gas of virtually infinite radius, then it would be necessary, in order to overcome the gravitational force between the atoms, to expend just as much energy as would be present in the final state as the sum of the rest-energies. The conjecture suggests itself that the cosmic creation of matter does not take place as a diffuse creation of protons, but by the sudden appearance of whole drops of matter, for which $M_0$ and $R_0$ are chosen so that each drop, even taken by itself, remains true to the principle of conservation of energy from the instant of its formation, in such a way that the total energy has the value zero.

The critical question at once arises, What density must be ascribed to the drop of newly created matter? It can certainly take only the density of the atomic nucleus, which is of the order of the proton mass, contained in a space of the dimensions of the elementary length. Hence the size of the drop is determined as a function of the gravitational constant at any given time: drops are obtained which contain a number of elementary particles to-day of the order of magnitude $10^{60}$, and in general proportional to $k^{-3/2}$ or $A^{3/2}$. Each drop which appears in the space of our universe is therefore a star in its own right.

This picture can be worked out in rather more detail. The universe, regarded as a four-dimensional space-time manifold, is, according to the above, a cone: its apex, $A = 0$, marks the starting-point in time, at which also $R$ and $M$ are zero. It may be imagined, however, that this cone stage does not necessarily possess only one apex, but may have a large number of subsidiary apices. A 'cut' $t$ = const. may then in certain circumstances cut the space-time manifold into several unconnected three-dimensional parts: another isolated, smaller space may exist apart from the large universe, which, by the gradual unfolding of time, and also by its own expansion, may begin to coalesce with the large universe. Supposing that in this small 'world' the mass-density remains constant, and the gravitational constant decreases as the inverse square of its age, then the conservation of energy holds there also. If it is supposed further that when, by such a process, the little world reaches the same value of $k$ as attained in the large universe, it can coalesce with the universe, then it will have a mass of exactly the same value, at the instant of coalescence, as has been recognized to be characteristic of a newly formed star. Such a world may therefore be regarded as the embryonic state of a star.

A careful discussion of the astrophysical facts seems to indicate that the ideas suggested above are well adapted to give a better understanding of many known empirical facts. Instances will not be developed here; the last two of my papers cited below contain a full discussion. I mention only that I suspect that the supernovae I (which appear to belong to the star population II, and which, one may venture to say, probably account also for the planetary nebulae) are stars created in this way.

Finally, it may be noted that the theory, sketched here in a purely inductive manner, has received also quantitative mathematical treatment. For this, of course, a certain generalization of Einstein's theory of gravitation was required, as this theory treats the constant of gravitation essentially as a genuine constant; and the substitution of a variable $k$ (in Dirac's sense), which must then be treated as a scalar field quantity, requires a fundamental generalization of the theory. For this purpose I found, to all appearances, a very natural starting-point in the five-dimensional (or projective) theory of relativity. The generalization in question has also been studied by Einstein and Bergmann, and by Lichnerowicz and Thiry. In a paper by Ludwig and Mueller, it is shown that, with this new and more general form of the theory of relativity, a deductive foundation and quantitative precision can be given to the model sketched above. This applies to the model of the universe on a large scale as well as to the model of a star in its embryonic state. In this way also the correctness and practicability of the present interpretation of the problem of cosmological energy are confirmed.

After having written this article, I had the opportunity of studying the paper by Mr. F. Hoyle mentioned above by Prof. Max Born, and which will be sure to give a new great impetus to cosmological discussion. Several decisive ideas of Hoyle's are in full harmony with my own theory; especially his very convincing arguments against oscillatory models, and concerning the irreversible conversion of hydrogen. But by far the greatest encouragement which I gain from his very interesting discussion is given by his hypothesis of creation of matter. When I put forward in 1939, following Dirac, the idea of an increasing mass $M$ of the universe, I myself believed this to be a rather astonishing hypothesis. Surely this hypothesis will now be discussed earnestly. But there are also considerable differences between Hoyle's theory and my own. The principal point has already been emphasized by Prof. Born. Will Hoyle's thesis (F), namely, the existence of internebular matter, really be accepted by empirical astronomers? I learn from Dr. Baade that he does not believe it.
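The zero-energy "drop" described earlier in this article can be checked with a rough Newtonian estimate. The sketch below is not Jordan's own calculation: it assumes a uniform sphere whose rest energy $Mc^2$ equals its Newtonian binding energy $(3/5)kM^2/R$, which fixes the order-unity factor at $kM/(Rc^2) = 5/3$, and it assumes a nuclear density of about $2.3 \times 10^{14}$ g/cm³. The function name and the constants are illustrative choices.

```python
import math

# Physical constants in CGS units (standard modern values, not from the article).
G_CGS = 6.674e-8            # gravitational constant k, cm^3 g^-1 s^-2
C_CGS = 2.998e10            # speed of light, cm/s
PROTON_MASS_G = 1.6726e-24  # proton rest mass, g
SOLAR_MASS_G = 2.0e33       # solar mass as quoted in the article, g

# Assumptions made for this sketch only:
NUCLEAR_DENSITY = 2.3e14    # g/cm^3, rough density of nuclear matter
BETA = 5.0 / 3.0            # k*M/(R*c^2) for a uniform sphere of zero Newtonian total energy


def zero_energy_drop(k=G_CGS, rho=NUCLEAR_DENSITY, beta=BETA):
    """Return (radius_cm, mass_g) of a uniform sphere obeying k*M = beta*R*c^2."""
    # Combine k*M = beta*R*c^2 with M = (4*pi/3)*rho*R^3 and solve for R.
    radius = math.sqrt(3.0 * beta * C_CGS**2 / (4.0 * math.pi * k * rho))
    mass = beta * C_CGS**2 * radius / k
    return radius, mass


radius, mass = zero_energy_drop()
print(f"radius   ~ {radius / 1e5:.0f} km")
print(f"mass     ~ {mass:.2e} g = {mass / SOLAR_MASS_G:.0f} solar masses")
print(f"nucleons ~ {mass / PROTON_MASS_G:.1e}")

# Scaling check: the drop mass should vary as k**(-3/2) at fixed density.
_, mass_4k = zero_energy_drop(k=4.0 * G_CGS)
print(f"mass ratio when k is quadrupled: {mass / mass_4k:.2f}")
```

Under these assumptions the drop mass comes out at some tens of solar masses, the same order as the upper limit to stellar masses discussed in the text, and quadrupling $k$ reduces the mass by the factor $4^{3/2} = 8$ expected from the $k^{-3/2}$ law.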
BIBLIOGRAPHY
Jordan, P., Ann. d. Physik, 36, 64 (1939); Phys. Z., 45, 183 (1944); 1, 219 (1947); Astro. Nachr., 276, 193 (1948); Acta Physica Austriaca (in the press).
Ludwig, G., and Mueller, C., Ann. d. Physik, 2, 78 (1948).
Bergmann, P. G., Ann. Math., 49, 255 (1948).
Lichnerowicz, A., C.R. Acad. Sci., Paris, 222, 432 (1946).
Thiry, J., C.R. Acad. Sci., Paris, 226, 216 (1948).
Born, M., and Green, H. S., N&?6, 10, 201 and u)9 (1949).

On the physical interpretation of P. Jordan's extended theory of gravitation
by M. Fierz.
(11/24/56)
(Translated by D. Fine)

Abstract
The metric of the Jordan theory can be defined through the postulate
that point-masses move along geodesics. This postulate is equivalent to
another: the Compton wavelength of the elementary particles provides a
natural length-scale. The gravitational constant is defined as the ratio
of gravitational to inertial mass. Further, there generally follows from
the theory a vacuum dielectric constant $\varepsilon_0 = 1/\mu_0$ whose dependence
on $\kappa$ depends on the choice of the exponent $\eta$ introduced by Jordan:
$\varepsilon_0 = \kappa^{1+1/\eta}$ $(\eta \neq 0)$.

Introduction
It is well known that the equations of the gravitational field and of the
electromagnetic field can be formally combined by interpreting them as describing
a five-dimensional projective space¹. So that the correct number of field
equations is still obtained, one must normalize the metric components through the
side condition
$J = g_{\mu\nu} X^{\mu} X^{\nu} = 1$, (1)
where $X^{\nu}$ denotes the 5 homogeneous coordinates.
P. JORDAN² has suggested extending the theory by dropping the side condition
(1) and introducing $J$ as a variable scalar field in the theory. JORDAN
assumes that the field equations follow from the 5-dimensional variational
principle
S I (
J a R - X ( J I ~ J I ~ / J 'G) ) d ' X . (2)

(P.J. §26 (22)). Here $J$ is the invariant (1), $R$ is the contracted 5-dimensional
curvature tensor. $\alpha$ and $\lambda$ are arbitrary constants. This corresponds in 4
dimensions to the variational principle:

(cf. P.J. §27 (22)). Here $\eta = \alpha + 1/2$. JORDAN chooses for $\eta$ the value 1. As
we do not wish to do that it is necessary, in the case $\eta \neq 0$, to set

¹ See for example W. PAULI, Annalen d. Ph. 18, 305 (1933).
² P. JORDAN, Schwerkraft und Weltall, 2. erw. Auflage (Braunschweig 1955), pp. 128 ff. We
cite in the following P.J. followed by § and the equation number.


Then (3) takes the following form:

with

JORDAN takes $\kappa$ to be the gravitational constant, which in this theory
becomes variable. Further, he interprets the $g_{ik}$ appearing in $G$ as the metric
components of the four-space.
W. PAULI (P.J. §28) has pointed out that through the principle (3), or
through the differential equations which follow from it, the metric is far from
unambiguously determined. Indeed, one can, through a conformal transformation
$g'_{kl} = \omega(x)\, g_{kl}$
introduce a new metric which could serve in place of $g_{ik}$ as the correct metric.
In this, $\omega(x)$ is largely arbitrary. This has to do with the fact that in the
variational principle (3) only the electromagnetic field $F_{ik}$ represents matter.
The light rays are however in this theory, too, always null geodesics. This
property is preserved by conformal transformations. Therefore light does not
define a unit of length.
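The statement that light rays cannot fix a unit of length can be made explicit in a few lines. The following is a brief sketch in standard notation; the primed metric and the symbol $\omega$ follow the text, while the positivity of $\omega$ is an added assumption.

```latex
% Conformal rescaling of the metric, with \omega(x) > 0 assumed.
g'_{kl} = \omega(x)\, g_{kl}
\quad\Longrightarrow\quad
ds'^2 = g'_{kl}\, dx^k dx^l = \omega(x)\, ds^2 .
% Null directions therefore coincide for the two metrics:
ds^2 = 0 \iff ds'^2 = 0 ,
% whereas every non-null interval is rescaled by a position-dependent factor:
ds' = \sqrt{\omega(x)}\; ds \qquad (ds^2 \neq 0).
```

Since the light cones of $g_{kl}$ and $g'_{kl}$ coincide, and null geodesics of one metric are null geodesics of the other up to a change of parameter, optical observations alone cannot distinguish between the two metrics. Only a non-null standard, such as the world line of a massive particle, can do so.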
JORDAN's interpretation of $g_{ik}$ as the metric and of $\kappa$ as gravitational
constant is thereby unfounded. He admits this himself (P.J. p. 170). Only the
interpretation of $F_{ik}$ as the electromagnetic field ($E$, $B$) cannot be doubted, as
these quantities are the curl (rotation) of the potentials.
A decision on what the metric and gravitational constant are to be first
becomes possible when real matter, for example point-masses, is introduced
into the theory.
The physical interpretation of the quantities appearing in the theory depends
upon the form of the additional terms which are to describe matter in the
variational principle. When one has decided on the form of these terms, the
interpretation is, under very general assumptions, unambiguous. The goal of
our work is to demonstrate this.
Whether the Jordan theory describes reality better than the usual relativity
theory of Einstein is another question entirely, into which we will not go in more
detail here. Admittedly it appears to us that no convincing physical reasons
exist to prefer this sort of extension.


PHYSICAL REVIEW VOLUME 124, NUMBER 3 NOVEMBER 1, 1961

Mach's Principle and a Relativistic Theory of Gravitation*

C. BRANS† AND R. H. DICKE
Palmer Physical Laboratory, Princeton University, Princeton, New Jersey
(Received June 23, 1961)

The role of Mach's principle in physics is discussed in relation to the equivalence principle. The difficulties
encountered in attempting to incorporate Mach's principle into general relativity are discussed. A modified
relativistic theory of gravitation, apparently compatible with Mach's principle, is developed.

INTRODUCTION

IT is interesting that only two ideas concerning the nature of space have dominated our thinking since the time of Descartes. According to one of these pictures, space is an absolute physical structure with properties of its own. This picture can be traced from Descartes' vortices¹ through the absolute space of Newton,² to the ether theories of the 19th century. The contrary view that the geometrical and inertial properties of space are meaningless for an empty space, that the physical properties of space have their origin in the matter contained therein, and that the only meaningful motion of a particle is motion relative to other matter in the universe has never found its complete expression in a physical theory. This picture is also old and can be traced from the writings of Bishop Berkeley³ to those of Ernst Mach.⁴ These ideas have found a limited expression in general relativity, but it must be admitted that, although in general relativity spatial geometries are affected by mass distributions, the geometry is not uniquely specified by the distribution. It has not yet been possible to specify boundary conditions on the field equations of general relativity which would bring the theory into accord with Mach's principle. Such boundary conditions would, among other things, eliminate all solutions without mass present.

It is necessary to remark that, according to the ideas of Mach, the inertial forces observed locally in an accelerated laboratory may be interpreted as gravitational effects having their origin in distant matter accelerated relative to the laboratory. The imperfect expression of this idea in general relativity can be seen by considering the case of a space empty except for a lone experimenter in his laboratory. Using the traditional, asymptotically Minkowskian coordinate system fixed relative to the laboratory, and assuming a normal laboratory of small mass, its effect on the metric is minor and can be considered in the weak-field approximation. The observer would, according to general relativity, observe normal behavior of his apparatus in accordance with the usual laws of physics. However, also according to general relativity, the experimenter could set his laboratory rotating by leaning out a window and firing his 22-caliber rifle tangentially. Thereafter the delicate gyroscope in the laboratory would continue to point in a direction nearly fixed relative to the direction of motion of the rapidly receding bullet. The gyroscope would rotate relative to the walls of the laboratory. Thus, from the point of view of Mach, the tiny, almost massless, very distant bullet seems to be more important than the massive, nearby walls of the laboratory in determining inertial coordinate frames and the orientation of the gyroscope.⁵ It is clear that what is being described here is more nearly an absolute space in the sense of Newton rather than a physical space in the sense of Berkeley and Mach.

The above example poses a problem for us. Apparently, we may assume one of at least three things:
1. that physical space has intrinsic geometrical and inertial properties beyond those derived from the matter contained therein;
2. that the above example may be excluded as nonphysical by some presently unknown boundary condition on the equations of general relativity;
3. that the above physical situation is not correctly described by the equations of general relativity.

These various alternatives have been discussed previously. Objections to the first possibility are mainly philosophical and, as stated previously, go back to the time of Bishop Berkeley. A common inheritance of all present-day physicists from Einstein is an appreciation for the concept of relativity of motion.

As the universe is observed to be nonuniform, it would appear to be difficult to specify boundary conditions which would have the effect of prohibiting unsuitable mass distributions relative to the laboratory arbitrarily placed; for could not a laboratory be built near a massive star? Should not the presence of this massive star contribute to the inertial reaction?

* Supported in part by research contracts with the U. S. Atomic Energy Commission and the Office of Naval Research.
† National Science Foundation Fellow; now at Loyola University, New Orleans, Louisiana.
¹ E. T. Whittaker, History of the Theories of Aether and Electricity (Thomas Nelson and Sons, New York, 1951).
² I. Newton, Principia Mathematica Philosophiae Naturalis (1686) (reprinted by University of California Press, Berkeley, California, 1934).
³ G. Berkeley, The Principles of Human Knowledge, paragraphs 111-117, 1710; De Motu (1726).
⁴ E. Mach, Conservation of Energy, note No. 1, 1872 (reprinted by Open Court Publishing Company, LaSalle, Illinois, 1911), and The Science of Mechanics, 1883 (reprinted by Open Court Publishing Company, LaSalle, Illinois, 1902), Chap. II, Sec. VI.
⁵ Because of the Thirring-Lense effect [H. Thirring and J. Lense, Phys. Zeits. 19, 156 (1918)], the rotating laboratory would have a weak effect on the axis of the gyroscope.

The difficulty is brought into sharper focus by con-

sidering the laws of physics, including their quantitative aspects, inside a static massive spherical shell. It is well known that the interior Schwarzschild solution is flat and can be expressed in a coordinate system Minkowskian in the interior. Also, according to general relativity all Minkowskian coordinate systems are equivalent and the mass and radius of the spherical shell have no discernible effects upon the laws of physics as they are observed in the interior. Apparently the spherical shell does not contribute in any discernible way to inertial effects in the interior. What would happen if the mass of the shell were decreased, or its radius increased without limit? It might be remarked also that Komar⁶ has attempted, without success, to find suitable boundary- and initial-value conditions for general relativity which would bring into evidence Mach's principle.

The third alternative is the subject of this paper. Actually the objectives of this paper are more limited than the formulation of a theory in complete accord with Mach's principle. Such a program would consist of two parts, the formulation of a suitable field theory and the formulation of suitable boundary- and initial-value conditions for the theory which would make the space geometry depend uniquely upon the matter distribution. This latter part of the problem is treated only partially.

At the end of the last section we shall briefly return again to the problem of the rotating laboratory.

A principle as sweeping as that of Mach, having its origins in matters of philosophy, can be described in the absence of a theory in a qualitative way only. A model of a theory incorporating elements of Mach's principle has been given by Sciama.⁷ From simple dimensional arguments⁸,⁹ as well as the discussion of Sciama, it has appeared that, with the assumption of validity of Mach's principle, the gravitational constant $G$ is related to the mass distribution in a uniform expanding universe in the following way:

$GM/Rc^{2} \sim 1$. (1)

Here $M$ stands for the finite mass of the visible (i.e., causally related) universe, and $R$ stands for the radius of the boundary of the visible universe.

The physical ideas behind Eq. (1) have been given in references 7-9 and can be summarized easily. As stated before, according to Mach's principle the only meaningful motion is that relative to the rest of the matter in the universe, and the inertial reaction experienced in a laboratory accelerated relative to the distant matter of the universe may be interpreted equivalently as a gravitational force acting on a fixed laboratory due to the presence of distant accelerated matter.⁷ This interpretation of the inertial reaction carries with it an interesting implication. Consider a test body falling toward the sun. In a coordinate system so chosen that the object is not accelerating, the gravitational pull of the sun may be considered as balanced by another gravitational pull, the inertial reaction.⁸ Note that the balance is not disturbed by a doubling of all gravitational forces. Thus the acceleration is determined by the mass distribution in the universe, but is independent of the strength of gravitational interactions. Designating the mass of the sun by $m_s$ and its distance by $r$ enables the acceleration to be expressed according to Newton as $a = Gm_s/r^{2}$ or, from dimensional arguments, in terms of the mass distribution as $a \sim m_s Rc^{2}/Mr^{2}$. Combining the two expressions gives Eq. (1).

This relation has significance in a rough order-of-magnitude manner only, but it suggests that either the ratio of $M$ to $R$ should be fixed by the theory, or alternatively that the gravitational constant observed locally should be variable and determined by the mass distribution about the point in question. The first of these two alternatives is of course, in part, simply the limitation of mass distribution which it might be hoped would result from some boundary condition on the field equations of general relativity. The second alternative is not compatible with the strong principle of equivalence¹⁰ and general relativity. The reasons for this will be discussed below.

If the inertial reaction may be interpreted as a gravitational force due to distant accelerated matter, it might be expected that the locally observed values of the inertial masses of particles would depend upon the distribution of matter about the point in question. It should be noted, however, that there is a fundamental ambiguity in a statement of this type, for there is no direct way in which the mass of a particle such as an electron can be compared with that of another at a different space-time point. Mass ratios can be compared at different points, but not masses. On the other hand, gravitation provides another characteristic mass

$(\hbar c/G)^{1/2} = 2.16 \times 10^{-5}$ g, (2)

and the mass ratio, the dimensionless number

$m(G/\hbar c)^{1/2} \cong 5 \times 10^{-23}$, (3)

provides an unambiguous measure of the mass of an electron which can be compared at different space-time points.

⁶ A. Komar, Ph.D. thesis, Princeton University, 1956 (unpublished).
⁷ D. W. Sciama, Monthly Notices Roy. Astron. Soc. 113, 34 (1953); The Unity of the Universe (Doubleday & Company, Inc., New York, 1959), Chaps. 7-9.
⁸ R. H. Dicke, Am. Scientist 47, 25 (1959).
⁹ R. H. Dicke, Science 129, 621 (1959).
¹⁰ R. H. Dicke, Am. J. Phys. 29, 344 (1960).

It should also be remarked that statements such as "$\hbar$ and $c$ are the same at all space-time points" are in the same way meaningless within the same context until a method of measurement is prescribed. In fact, it should be noted that $\hbar$ and $c$ may be defined to be constant. A set of physical constants may be defined as constant if they cannot be combined to form one or

more dimensionless numbers. The necessity for this limitation is obvious, for a dimensionless number is invariant under a transformation of units and the question of the constancy of such dimensionless numbers is to be settled, not by definition, but by measurements. A set of such independent physical constants which are constant by definition is complete if it is impossible to include another without generating dimensionless numbers.

It should be noted that if the number, Eq. (3), should vary with position and $\hbar$ and $c$ are defined as constant, then either $m$ or $G$, or both, could vary with position. There is no fundamental difference between the alternatives of constant mass or constant $G$. However, one or the other may be more convenient, for the formal structure of the theory would, in a superficial way, be quite different for the two cases.

To return to Eq. (3), the odd size of this dimensionless number has often been noticed as well as its apparent relation to the large dimensionless numbers of astrophysics. The apparent relation of the square of the reciprocal of this number [Eq. (3)] to the age of the universe expressed as a dimensionless number in atomic time units and the square root of the mass of the visible portion of the universe expressed in proton mass units suggested to Dirac¹¹ a causal connection that would lead to the value of Eq. (3) changing with time. The significance of Dirac's hypothesis from the standpoint of Mach's principle has been discussed.⁸

Dirac postulated a detailed cosmological model based on these numerical coincidences. This has been criticized on the grounds that it goes well beyond the empirical data upon which it is based.⁸ Also in another publication by one of us (R. H. D.), it will be shown that it gives results not in accord with astrophysical observations examined in the light of modern stellar evolutionary theory.

On the other hand, it should be noted that a large dimensionless physical constant such as the reciprocal of Eq. (3) must be regarded as either determined by nature in a completely capricious fashion or else as related to some other large number derived from nature. In any case, it seems unreasonable to attempt to derive a number like $10^{23}$ from theory as a purely mathematical number involving factors such as $4\pi/3$.

It is concluded therefore, that although the detailed structure of Dirac's cosmology cannot be justified by the weak empirical evidence on which it is based, the more general conclusion that the number [Eq. (3)] varies with time has a more solid basis.

If, in line with the interpretation of Mach's principle being developed, the dimensionless mass ratio given by Eq. (3) should depend upon the matter distribution in the universe, with $\hbar$ and $c$ constant by definition, either the mass $m$ or the gravitational constant, or both, must vary. Although these are alternative descriptions of the same physical situation, the formal structure of the theory would be very different for the two cases. Thus, for example, it can be easily shown that uncharged spinless particles whose masses are position dependent no longer move on geodesics of the metric. (See Appendix I.) Thus, the definition of the metric tensor is different for the two cases. The two metric tensors are connected by a conformal transformation.

The arbitrariness in the metric tensor which results from the indefiniteness in the choice of units of measure raises questions about the physical significance of Riemannian geometry in relativity.¹² In particular the 14 invariants which characterize the space are generally not invariant under a conformal transformation interpreted as a redefinition of the metric tensor in the same space.¹³ Matters are even worse, for a more general redefinition of the units of measure can be used to reduce all 14 invariants to zero. It should be said that these remarks should not be interpreted as casting doubt on the correctness or usefulness of Riemannian geometry in relativity, but rather that each such geometry is but a particular representation of the theory. It would be expected that the physical content of the theory should be contained in the invariants of the group of position-dependent transformations of units and coordinate transformations. The usual invariants of Riemannian geometry are not invariants under this wider group.

In general relativity the representation is one in which units are chosen so that atoms are described as having physical properties independent of location. It is assumed that this choice is possible!

In accordance with the above, a particular choice of units is made with the realization that the choice is arbitrary and without an invariant significance. The theoretical structure appears to be simpler if one defines the inertial masses of elementary particles to be constant and permits the gravitational constant to vary. It should be noted that this is possible only if the mass ratios of elementary particles are constant. There may be reasonable doubt about this.⁹,¹⁰ On the other hand, it would be expected that such quantities as particle mass ratios or the fine-structure constant, if they depend upon mass distributions in the universe, would be much less sensitive in their dependence⁹ than the number given by Eq. (3) and their variation could be neglected in a first crude theory. Also it should be remarked that the requirements of the approximate constancy of the ratio of inertial to passive gravitational mass,¹⁴ and the extremely stringent requirement of spatial isotropy,¹⁵ impose conditions so severe that it has been found to be difficult, if not impossible, to

¹¹ P. A. M. Dirac, Proc. Roy. Soc. (London) A165, 199 (1938).
¹² E. P. Wigner has questioned the physical significance of Riemannian geometry on other grounds [Relativity Seminar, Stevens Institute, May 9, 1961 (unpublished)].
¹³ B. Hoffman, Phys. Rev. 89, 49 (1953).
¹⁴ R. Eötvös, Ann. Physik 68, 11 (1922).
¹⁵ V. W. Hughes, H. G. Robinson, and V. Beltran-Lopez, Phys. Rev. Letters 4, 342 (1960).

construct a satisfactory theory with a variable fine- concrete spherical shell could be constructed with the
structure constant. laboratory in its interior.
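As a numerical aside, the characteristic mass of Eq. (2) and the dimensionless electron mass ratio of Eq. (3) are easy to re-evaluate. The sketch below uses present-day CGS values of the constants, which are assumptions of this example rather than the 1961 values used in the paper, so agreement with the quoted figures is expected only at the order-of-magnitude level.

```python
import math

# Present-day CGS values (assumptions of this sketch, not the paper's 1961 inputs).
HBAR = 1.0546e-27          # reduced Planck constant, erg s
C = 2.998e10               # speed of light, cm/s
G = 6.674e-8               # gravitational constant, cm^3 g^-1 s^-2
ELECTRON_MASS = 9.109e-28  # electron rest mass, g

# Eq. (2): the characteristic mass built from hbar, c and G.
planck_mass_g = math.sqrt(HBAR * C / G)

# Eq. (3): the electron mass expressed in units of that characteristic mass.
electron_ratio = ELECTRON_MASS / planck_mass_g

print(f"(hbar c / G)^(1/2)   = {planck_mass_g:.3e} g")
print(f"m (G / hbar c)^(1/2) = {electron_ratio:.2e}")
print(f"square of reciprocal = {electron_ratio ** -2:.1e}")
```

The last line is the large number whose apparent relation to the age of the universe, in atomic time units, motivated Dirac's hypothesis as discussed in the text.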
It should be emphasized that the above argument in- 2. The contrary view is that locally observed inertial
volving the large dimensionless numbers, Eq. (3), does reactions depend upon the mass distribution of the uni-
not concern Mach's principle directly, but that Mach's verse about the point of observation and consequently
principle and the assumption of a gravitational con- the quantitative aspects of locally observed physical
stant dependent upon mass distributions gives a laws (as expressed in the physical constants) are
reasonable explanation for varying constants. position dependent.
It would be expected that both nearby and distant 3. It is possible to reduce the variation of physical
matter should contribute to the inertial reaction experi- constants required by this interpretation of Mach's
enced locally. If the theory were linear, which one does principle to that of a single parameter, the gravitational
not expect, Eq. (1) would suggest that it is the reciprocal constant.
of the gravitational constant which is determined locally 4. The separate but related problem posed by the
as a linear superposition of contributions from the mat- existence of very large dimensionless numbers repre-
ter in the universe which is causally connected to the senting quantitative aspects of physical laws is clarified
point in question. This can be expressed in a somewhat by noting that these large numbers involve G and that
symbolic equation: they are of the same order of magnitude as the large
numbers characterizing the size and mass distribution
$G^{-1} \sim \sum_i (m_i / r_i c^{2})$, (4) of the universe.
5. The strong principle of equivalence upon which
where the sum is over all the matter which can con- general relativity rests is incompatible with these ideas.
tribute to the inertial reaction. This equation can be However, it is only the weak principle which is
given an exact meaning only after a theory has been directly supported by the very precise experiments of
constructed. Equation (4) is also a relation from Eötvös.
Sciama's theory.
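The symbolic Eq. (4) can be related to Eq. (1) with a toy calculation. The sketch below is an illustration only, not part of the paper: it assumes a static, uniform-density sphere of total mass $M$ and radius $R$ centred on the observer, ignores expansion and retardation, and sums $m_i / (r_i c^2)$ over thin concentric shells. The function name and the numerical values of $M$ and $R$ are hypothetical.

```python
import math

C = 2.998e10  # speed of light, cm/s


def inverse_g_uniform_sphere(total_mass_g, radius_cm, n_shells=100_000):
    """Sum m_i / (r_i c^2) over thin shells of a uniform sphere (toy model of Eq. (4))."""
    rho = total_mass_g / (4.0 / 3.0 * math.pi * radius_cm**3)
    dr = radius_cm / n_shells
    total = 0.0
    for i in range(n_shells):
        r_mid = (i + 0.5) * dr                          # midpoint radius of shell i
        shell_mass = 4.0 * math.pi * r_mid**2 * dr * rho
        total += shell_mass / (r_mid * C**2)
    return total


# Hypothetical illustrative values; only the dimensionless ratio below matters.
M = 1.0e56  # g
R = 1.0e28  # cm

g_estimate = 1.0 / inverse_g_uniform_sphere(M, R)
ratio = g_estimate * M / (R * C**2)
print(f"G M / (R c^2) in the toy model = {ratio:.4f}")
```

For this geometry the sum can also be done analytically, $\sum_i m_i/(r_i c^2) \to 3M/(2Rc^2)$, so the toy model gives $GM/Rc^2 = 2/3$, a number of order unity as Eq. (1) requires. The specific value depends entirely on the assumed mass distribution.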
It is necessary to say a few words about the equiva- A THEORY OF GRAVITATION BASED ON A SCALAR
FIELD IN A RIEMANNIAN GEOMETRY
lence principle as it is used in general relativity and as
it relates to Mach's principle. As it enters general rela- The theory to be developed represents a generaliza-
tivity, the equivalence principle is more than the as- tion of general relativity. It is not a completely geometri-
sumption of the local equivalence of a gravitational cal theory of gravitation, as gravitational effects are
force and an acceleration. Actually, in general relativity described by a scalar field in a Riemannian manifold.
it is assumed that the laws of physics, including numeri- Thus, the gravitational effects are in part geometrical
cal content (i.e., dimensionless physical constants), as and in part due to a scalar interaction. There is a formal
observed locally in a freely falling laboratory, are inde- connection between this theory and that of Jordan,¹⁶
pendent of the location in time or space of the labora- but there are differences and the physical interpretation
tory. This is a statement of the strong equivalence is quite different. For example, the aspect of mass crea-
principle.⁹,¹⁰ The interpretation of Mach's principle tion in Jordan's theory is absent from this theory.
being developed here is obviously incompatible with In developing this theory we start with the weak
strong equivalence. The local equality of all gravitational principle of equivalence. The great accuracy of the
accelerations (to the accuracy of present experiments) Eötvös experiment suggests that the motion of un-
is the weak equivalence principle. It should be noted charged test particles in this theory should be, as in
that it is the weak equivalence principle that re- general relativity, a geodesic in the four-dimensional
ceives strong experimental support from the Eötvös manifold.
experiment. With the assumption that only the gravitational
Before attempting to formulate a theory of gravita- constant (or active gravitational masses) vary with
tion which is more satisfactory from the standpoint of position, the laws of physics (exclusive of gravitation)
Mach's principle than general relativity, the physical observed in a freely falling laboratory should be unaf-
ideas outlined above, and the assumptions being made, fected by the rest of the universe as long as self-gravi-
will be summarized: tational fields are negligible. The theory should be con-
structed in such a way as to exhibit this effect.
1. An approach to Mach's principle which attempts, If the gravitational constant is to vary, it should be
with boundary conditions, to allow only those mass
distributions which produce the correct inertial ¹⁶ P. Jordan, Schwerkraft und Weltall (Friedrich Vieweg and
Sohn, Braunschweig, 1955); Z. Physik 157, 112 (1959). In this sec-
reaction seems foredoomed, for there do exist large ond reference, Jordan has taken cognizance of the objections of
localized masses in the universe (e.g., white dwarf Fierz (see reference 19) and has written his variational principle
stars) and a laboratory could, in principle, be con- in a form which differs in only two respects from that expressed
in Eq. (16). See also reference 20.
structed near such a mass. Also it appears to be possible 1 For a discussion of this, see H. Bondi, Cosmology, 2nd edition,
to modify the mass distribution. For example, a massive 1960.

a function of some scalar field variable. The contracted metric tensor is a constant and devoid of interest. The scalar curvature and the other scalars formed from the curvature tensor are also devoid of interest as they contain gradients of the metric tensor components, and fall off more rapidly than $r^{-1}$ from a mass source. Thus such scalars are determined primarily by nearby mass distributions rather than by distant matter.

As the scalars of general relativity are not suitable, a new scalar field is introduced. The primary function of this field is the determination of the local value of the gravitational constant.

In order to generalize general relativity, we start with the usual variational principle of general relativity from which the equations of motion of matter and nongravitational fields are obtained as well as the Einstein field equation, namely,^{18}

$$0 = \delta \int [R + (16\pi G/c^4) L] (-g)^{1/2}\, d^4x. \tag{5}$$

Here, R is the scalar curvature and L is the Lagrangian density of matter including all nongravitational fields.

In order to generalize Eq. (5) it is first divided by G, and a Lagrangian density of a scalar field $\phi$ is added inside the bracket. G is assumed to be a function of $\phi$. Remembering the discussion in connection with Eq. (4), it would be reasonable to assume that $G^{-1}$ varies as $\phi$, for then a simple wave equation for $\phi$ with a scalar matter density as source would give an equation roughly the same as (4).

The required generalization of Eq. (5) is clearly

$$0 = \delta \int [\phi R + (16\pi/c^4) L - \omega (\phi_{,i}\phi^{,i}/\phi)] (-g)^{1/2}\, d^4x. \tag{6}$$

Here $\phi$ plays a role analogous to $G^{-1}$ and will have the dimensions $M L^{-3} T^{2}$. The third term is the usual Lagrangian density of a scalar field, and the scalar in the denominator has been introduced to permit the constant $\omega$ to be dimensionless. In any sensible theory $\omega$ must be of the general order of magnitude of unity.

It should be noted that the term involving the Lagrangian density of matter in Eq. (6) is identical with that in Eq. (5). Thus the equations of motion of matter in a given externally determined metric field are the same as in general relativity. The difference between the two theories lies in the gravitational field equations which determine $g_{ij}$, rather than in the equations of motion in a given metric field.

It is evident, therefore, that, as in general relativity, the energy-momentum tensor of matter must have a vanishing covariant divergence,

$$T^{ij}{}_{;j} = 0, \tag{7}$$

where

$$T^{ij} = [2/(-g)^{1/2}] (\partial/\partial g_{ij}) [(-g)^{1/2} L]. \tag{8}$$

It is assumed that L does not depend explicitly upon derivatives of $g_{ij}$.

Jordan's theory has been criticized by Fierz^{19} on the grounds that the introduction of matter into the theory required further assumptions concerning the standards of length and time. Further, the mass creation aspects of this theory and the nonconservation of the energy-momentum tensor raise serious questions about the significance of the energy-momentum tensor. To make it clear that this objection cannot be raised against this version of the theory, we hasten to point out that L is assumed to be the normal Lagrangian density of matter, a function of matter variables and of $g_{ij}$ only, not a function of $\phi$. It is a well-known result that for any reasonable metric field distribution $g_{ij}$ (a distribution which need not be a solution of the field equations of $g_{ij}$), the matter equations of motion, obtained by varying matter variables in Eq. (6), are such that Eq. (7) is satisfied with $T^{ij}$ defined by Eq. (8). Thus Eq. (7) is satisfied and this theory does not contain a mass creation principle.

The wave equation for $\phi$ is obtained in the usual way by varying $\phi$ and $\phi_{,i}$ in Eq. (6). This gives

$$2\omega\phi^{-1}\Box\phi - (\omega/\phi^2)\phi^{,i}\phi_{,i} + R = 0. \tag{9}$$

Here the generally covariant d'Alembertian $\Box$ is defined to be the covariant divergence of $\phi^{,i}$:

$$\Box\phi = \phi^{,i}{}_{;i} = (-g)^{-1/2} [(-g)^{1/2} \phi^{,i}]_{,i}. \tag{10}$$

From the form of Eq. (9), it is evident that $\phi R$ and the Lagrangian density of $\phi$ serve as the source term for the generation of $\phi$ waves. Remarkably enough, as will be shown below, this equation can be transformed so as to make the source term appear as the contracted energy-momentum tensor of matter alone. Thus, in accordance with the requirements of Mach's principle, $\phi$ has as its sources the matter distribution in space.

By varying the components of the metric tensor and their first derivatives in Eq. (6), the field equations for the metric field are obtained. This is the analog of the Einstein field equation and is

$$R_{ij} - \tfrac{1}{2} g_{ij} R = (8\pi\phi^{-1}/c^4) T_{ij} + (\omega/\phi^2)(\phi_{,i}\phi_{,j} - \tfrac{1}{2} g_{ij}\phi_{,k}\phi^{,k}) + \phi^{-1}(\phi_{,i;j} - g_{ij}\Box\phi). \tag{11}$$

The left side of Eq. (11) is completely familiar and needs no comment. Note that the first term on the right is the usual source term of general relativity, but with the variable gravitational coupling parameter $\phi^{-1}$. Note also that the second term is the energy-momentum tensor of the scalar field, also coupled with the gravitational coupling $\phi^{-1}$. The third term is foreign and results from the presence of second derivatives of the metric

18 L. Landau and E. Lifshitz, Classical Theory of Fields (Addison-Wesley Publishing Company, Reading, Massachusetts, 1951).
19 M. Fierz, Helv. Phys. Acta 29, 128 (1956).

tensor in R in Eq. (6). These second derivatives are eliminated by integration by parts to give a divergence and the extra terms. It should be noted that when the first term dominates the right side of Eq. (11), the equation differs from Einstein's field equation by the presence of a variable gravitational constant only.

While the "extra" terms in Eq. (12) may at first seem strange, their role is essential. They are needed if Eq. (7) is to be consistent with Eqs. (9) and (11). This can be seen by multiplying Eq. (11) by $\phi$ and then taking the covariant divergence of the resulting equation. The divergence of these two terms cancels the term $\phi_{,i} R^{i}{}_{j} = \phi^{,i} R_{ij}$. To show this, use is made of the well-known property of the full curvature tensor that it serves as a commutator for two successive gradient operations applied to an arbitrary vector.

If Eq. (11) is contracted there results

$$-R = (8\pi\phi^{-1}/c^4) T - (\omega/\phi^2)\phi_{,k}\phi^{,k} - 3\phi^{-1}\Box\phi. \tag{12}$$

Equation (12) can be combined with Eq. (9) to give a new wave equation for $\phi$:^{20}

$$\Box\phi = [8\pi/(3+2\omega)c^4]\, T. \tag{13}$$

With the sign convention

$$ds^2 = g_{ij}\, dx^i dx^j \quad \text{and} \quad g_{00} < 0,$$

for a fluid

$$T_{ij} = -(p+\epsilon) u_i u_j + p\, g_{ij}, \tag{14}$$

so that

$$T = -\epsilon + 3p, \tag{15}$$

where $\epsilon$ is the energy density of the matter in comoving coordinates and p is the pressure in the fluid. With this sign convention and $\omega$ positive, the contribution to $\phi$ from a local mass is positive. Note, however, that there is no direct electromagnetic contribution to T, as the contracted energy-momentum tensor of an electromagnetic field is identically zero. However, bound electromagnetic energy does contribute indirectly through the stress terms in other fields, the stresses being necessary to confine the electromagnetic field.^{21} In conclusion, $\omega$ must be positive if the contribution to the inertial reaction from nearby matter is to be positive.

20 There are but two formal differences between the field equations of this theory and those of the particular form of Jordan's theory given in Z. Physik 157, 112 (1959). First, Jordan has defined his scalar field variable reciprocal to $\phi$. Thus, the simple wave character of the scalar field equation [Eq. (13)] is not so clear and the physical arguments based on Mach's principle and leading to Eq. &) have not been satisfied. Second, as a result of its outgrowth from his five-dimensional theory, Jordan has limited his matter variables to those of the electromagnetic field.
21 C. Misner and P. Putnam, Phys. Rev. 116, 1045 (1959).
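
The step from Eq. (11) to Eqs. (12) and (13) can be sketched as a short worked calculation. It assumes only what the text states: a four-dimensional manifold (so that $g^{ij}g_{ij}=4$) and the definition of $\Box\phi$ as the covariant divergence of $\phi^{,i}$ (so that $g^{ij}\phi_{,i;j}=\Box\phi$). It is offered as a consistency check, not as the authors' own derivation.

```latex
% Contract each group of terms in Eq. (11) with g^{ij}:
\begin{align*}
g^{ij}\bigl(R_{ij}-\tfrac12 g_{ij}R\bigr) &= R-2R=-R,\\
g^{ij}\bigl(\phi_{,i}\phi_{,j}-\tfrac12 g_{ij}\phi_{,k}\phi^{,k}\bigr) &= -\phi_{,k}\phi^{,k},\\
g^{ij}\bigl(\phi_{,i;j}-g_{ij}\Box\phi\bigr) &= \Box\phi-4\,\Box\phi=-3\,\Box\phi .
\end{align*}
% Hence Eq. (12):
\[
-R=\frac{8\pi\phi^{-1}}{c^{4}}\,T-\frac{\omega}{\phi^{2}}\,\phi_{,k}\phi^{,k}-3\phi^{-1}\Box\phi .
\]
% Eq. (9) gives a second expression for -R:
\[
-R=2\omega\phi^{-1}\Box\phi-\frac{\omega}{\phi^{2}}\,\phi^{,i}\phi_{,i}.
\]
% Equating the two, the gradient terms cancel:
\[
(3+2\omega)\,\phi^{-1}\Box\phi=\frac{8\pi\phi^{-1}}{c^{4}}\,T
\quad\Longrightarrow\quad
\Box\phi=\frac{8\pi}{(3+2\omega)\,c^{4}}\,T .
\]
```

The cancellation of the $(\omega/\phi^2)\phi_{,k}\phi^{,k}$ terms is what leaves the contracted energy-momentum tensor of matter as the only source of $\phi$.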
Chapter 4

Yang-Mills' Deepest Insight and

Its Relation to Gravity*

C. N. Yang, R. L. Mills, T. D. Lee, R. Utiyama, T. W. B. Kibble



PHYSICAL REVIEW VOLUME 96, NUMBER 1 OCTOBER 1, 1954

Conservation of Isotopic Spin and Isotopic Gauge Invariance*


C. N. YANG† AND R. L. MILLS
Brookhaven National Laboratory, Upton, New York
(Received June 28, 1954)

It is pointed out that the usual principle of invariance under isotopic spin rotation is not consistent with
the concept of localized fields. The possibility is explored of having invariance under local isotopic spin
rotations. This leads to formulating a principle of isotopic gauge invariance and the existence of a b field
which has the same relation to the isotopic spin that the electromagnetic field has to the electric charge. The
b field satisfies nonlinear differential equations. The quanta of the b field are particles with spin unity,
isotopic spin unity, and electric charge ±e or zero.

INTRODUCTION

THE conservation of isotopic spin is a much discussed concept in recent years. Historically an isotopic spin parameter was first introduced by Heisenberg^{1} in 1932 to describe the two charge states (namely neutron and proton) of a nucleon. The idea that the neutron and proton correspond to two states of the same particle was suggested at that time by the fact that their masses are nearly equal, and that the light stable even nuclei contain equal numbers of them. Then in 1937 Breit, Condon, and Present pointed out the approximate equality of p-p and n-p interactions in the $^1S$ state.^{2} It seemed natural to assume that this equality holds also in the other states available to both the n-p and p-p systems. Under such an assumption one arrives at the concept of a total isotopic spin^{3} which is conserved in nucleon-nucleon interactions. Experiments in recent years^{4} on the energy levels of light nuclei strongly suggest that this assumption is indeed correct. An implication of this is that all strong interactions such as the pion-nucleon interaction, must also satisfy the same conservation law. This and the knowledge that there are three charge states of the pion, and that pions can be coupled to the nucleon field singly, lead to the conclusion that pions have isotopic spin unity. A direct verification of this conclusion was found in the experiment of Hildebrand^{5} which compares the differential cross section of the process $n+p\to\pi^0+d$ with that of the previously measured process $p+p\to\pi^++d$.

The conservation of isotopic spin is identical with the requirement of invariance of all interactions under isotopic spin rotation. This means that when electromagnetic interactions can be neglected, as we shall hereafter assume to be the case, the orientation of the isotopic spin is of no physical significance. The differentiation between a neutron and a proton is then a purely arbitrary process. As usually conceived, however, this arbitrariness is subject to the following limitation: once one chooses what to call a proton, what a neutron, at one space-time point, one is then not free to make any choices at other space-time points.

It seems that this is not consistent with the localized field concept that underlies the usual physical theories. In the present paper we wish to explore the possibility of requiring all interactions to be invariant under independent rotations of the isotopic spin at all space-time points, so that the relative orientation of the isotopic spin at two space-time points becomes a physically meaningless quantity (the electromagnetic field being neglected).

We wish to point out that an entirely similar situation arises with respect to the ordinary gauge invariance of a charged field which is described by a complex wave function $\psi$. A change of gauge^{6} means a change of phase factor $\psi\to\psi'$, $\psi'=(\exp i\alpha)\psi$, a change that is devoid of any physical consequences. Since $\psi$ may depend on x, y, z, and t, the relative phase factor of $\psi$ at two different space-time points is therefore completely arbitrary. In other words, the arbitrariness in choosing the phase factor is local in character.

We define isotopic gauge as an arbitrary way of choosing the orientation of the isotopic spin axes at all space-time points, in analogy with the electromagnetic gauge which represents an arbitrary way of choosing the complex phase factor of a charged field at all space-time points. We then propose that all physical processes (not involving the electromagnetic field) be invariant under an isotopic gauge transformation, $\psi\to\psi'$, $\psi'=S^{-1}\psi$, where S represents a space-time dependent isotopic spin rotation.

To preserve invariance one notices that in electrodynamics it is necessary to counteract the variation of $\alpha$ with x, y, z, and t by introducing the electromagnetic field $A_\mu$ which changes under a gauge transformation as

$$A'_\mu = A_\mu + \frac{1}{e}\frac{\partial\alpha}{\partial x_\mu}.$$

In an entirely similar manner we introduce a B field in the case of the isotopic gauge transformation to counteract the dependence of S on x, y, z, and t. It will be seen that this natural generalization allows for very little arbitrariness. The field equations satisfied by the twelve independent components of the B field, which we shall call the b field, and their interaction with any field having an isotopic spin are essentially fixed, in much the same way that the free electromagnetic field and its interaction with charged fields are essentially determined by the requirement of gauge invariance.

In the following two sections we put down the mathematical formulation of the idea of isotopic gauge invariance discussed above. We then proceed to the quantization of the field equations for the b field. In the last section the properties of the quanta of the b field are discussed.

ISOTOPIC GAUGE TRANSFORMATION

Let $\psi$ be a two-component wave function describing a field with isotopic spin 1/2. Under an isotopic gauge transformation it transforms by

$$\psi = S\psi', \tag{1}$$

where S is a 2×2 unitary matrix with determinant unity. In accordance with the discussion in the previous section, we require, in analogy with the electromagnetic case, that all derivatives of $\psi$ appear in the following combination:

$$(\partial_\mu - i\epsilon B_\mu)\psi.$$

$B_\mu$ are 2×2 matrices such that^{7} for $\mu$ = 1, 2, and 3, $B_\mu$ is Hermitian and $B_4$ is anti-Hermitian. Invariance requires that

$$S(\partial_\mu - i\epsilon B'_\mu)\psi' = (\partial_\mu - i\epsilon B_\mu)\psi. \tag{2}$$

Combining (1) and (2), we obtain the isotopic gauge transformation on $B_\mu$:

$$B'_\mu = S^{-1}B_\mu S + \frac{i}{\epsilon}S^{-1}\frac{\partial S}{\partial x_\mu}. \tag{3}$$

The last term is similar to the gradient term in the gauge transformation of electromagnetic potentials.

In analogy to the procedure of obtaining gauge invariant field strengths in the electromagnetic case, we

* Work performed under the auspices of the U. S. Atomic Energy Commission.
† On leave of absence from the Institute for Advanced Study, Princeton, New Jersey.
1 W. Heisenberg, Z. Physik 77, 1 (1932).
2 Breit, Condon, and Present, Phys. Rev. 50, 825 (1936). J. Schwinger pointed out that the small difference may be attributed to magnetic interactions [Phys. Rev. 78, 135 (1950)].
3 The total isotopic spin T was first introduced by E. Wigner, Phys. Rev. 51, 106 (1937); B. Cassen and E. U. Condon, Phys. Rev. 50, 846 (1936).
4 T. Lauritsen, Ann. Rev. Nuclear Sci. 1, 67 (1952); D. R. Inglis, Revs. Modern Phys. 25, 390 (1953).
5 R. H. Hildebrand, Phys. Rev. 89, 1090 (1953).
6 W. Pauli, Revs. Modern Phys. 13, 203 (1941).
7 We use the conventions $\hbar=c=1$, and $x_4=it$. Bold-face type refers to vectors in isotopic space, not in space-time.


define now

$$F_{\mu\nu} = \frac{\partial B_\mu}{\partial x_\nu} - \frac{\partial B_\nu}{\partial x_\mu} + i\epsilon(B_\mu B_\nu - B_\nu B_\mu). \tag{4}$$

One easily shows from (3) that

$$F'_{\mu\nu} = S^{-1}F_{\mu\nu}S \tag{5}$$

under an isotopic gauge transformation.‡ Other simple functions of B than (4) do not lead to such a simple transformation property.

The above lines of thought can be applied to any field $\psi$ with arbitrary isotopic spin. One need only use other representations S of rotations in three-dimensional space. It is reasonable to assume that different fields with the same total isotopic spin, hence belonging to the same representation S, interact with the same matrix field $B_\mu$. (This is analogous to the fact that the electromagnetic field interacts in the same way with any charged particle, regardless of the nature of the particle. If different fields interact with different and independent B fields, there would be more conservation laws than simply the conservation of total isotopic spin.) To find a more explicit form for the B fields and to relate the $B_\mu$'s corresponding to different representations S, we proceed as follows.

Equation (3) is valid for any S and its corresponding $B_\mu$. Now the matrix $S^{-1}\partial S/\partial x_\mu$ appearing in (3) is a linear combination of the isotopic spin angular momentum matrices $T^i$ (i = 1, 2, 3) corresponding to the isotopic spin of the $\psi$ field we are considering. So $B_\mu$ itself must also contain a linear combination of the matrices $T^i$. But any part of $B_\mu$ in addition to this, $\bar{B}_\mu$, say, is a scalar or tensor combination of the T's, and must transform by the homogeneous part of (3), $\bar{B}'_\mu = S^{-1}\bar{B}_\mu S$. Such a field is extraneous; it was allowed by the very general form we assumed for the B field, but is irrelevant to the question of isotopic gauge. Thus the relevant part of the B field is of the form

$$B_\mu = 2\mathbf{b}_\mu\cdot\mathbf{T}. \tag{6}$$

(Bold-face letters denote three-component vectors in isotopic space.) To relate the $\mathbf{b}_\mu$'s corresponding to different representations S we now consider the product representation $S=S^{(a)}S^{(b)}$. The B field for the combination transforms, according to (3), by

$$B'_\mu = [S^{(b)}]^{-1}[S^{(a)}]^{-1}B_\mu S^{(a)}S^{(b)} + \frac{i}{\epsilon}[S^{(a)}]^{-1}\frac{\partial S^{(a)}}{\partial x_\mu} + \frac{i}{\epsilon}[S^{(b)}]^{-1}\frac{\partial S^{(b)}}{\partial x_\mu}.$$

But the sum of $B_\mu^{(a)}$ and $B_\mu^{(b)}$, the B fields corresponding to $S^{(a)}$ and $S^{(b)}$, transforms in exactly the same way, so that

$$B_\mu = B_\mu^{(a)} + B_\mu^{(b)}$$

(plus possible terms which transform homogeneously, and hence are irrelevant and will not be included). Decomposing $S^{(a)}S^{(b)}$ into irreducible representations, we see that the twelve-component field $\mathbf{b}_\mu$ in Eq. (6) is the same for all representations.

To obtain the interaction between any field $\psi$ of arbitrary isotopic spin with the b field one therefore simply replaces the gradient of $\psi$ by

$$(\partial_\mu - 2i\epsilon\,\mathbf{b}_\mu\cdot\mathbf{T})\psi, \tag{7}$$

where $T^i$ (i = 1, 2, 3), as defined above, are the isotopic spin angular momentum matrices for the field $\psi$.

We remark that the nine components of $\mathbf{b}_\mu$, $\mu$ = 1, 2, 3 are real and the three of $\mathbf{b}_4$ are pure imaginary. The isotopic-gauge covariant field quantities $F_{\mu\nu}$ are expressible in terms of $\mathbf{b}_\mu$:

$$F_{\mu\nu} = 2\mathbf{f}_{\mu\nu}\cdot\mathbf{T}, \tag{8}$$

where

$$\mathbf{f}_{\mu\nu} = \frac{\partial\mathbf{b}_\mu}{\partial x_\nu} - \frac{\partial\mathbf{b}_\nu}{\partial x_\mu} - 2\epsilon\,\mathbf{b}_\mu\times\mathbf{b}_\nu. \tag{9}$$

$\mathbf{f}_{\mu\nu}$ transforms like a vector under an isotopic gauge transformation. Obviously the same $\mathbf{f}_{\mu\nu}$ interact with all fields $\psi$ irrespective of the representation S that $\psi$ belongs to.

The corresponding transformation of $\mathbf{b}_\mu$ is cumbersome. One need, however, study only the infinitesimal isotopic gauge transformations,

$$S = 1 - 2i\,\mathbf{T}\cdot\delta\boldsymbol{\omega}.$$

Then

$$\mathbf{b}'_\mu = \mathbf{b}_\mu + 2\mathbf{b}_\mu\times\delta\boldsymbol{\omega} + \frac{1}{\epsilon}\frac{\partial}{\partial x_\mu}\delta\boldsymbol{\omega}. \tag{10}$$

FIELD EQUATIONS

To write down the field equations for the b field we clearly only want to use isotopic gauge invariant quantities. In analogy with the electromagnetic case we therefore write down the following Lagrangian density:^{8}

$$-\tfrac{1}{4}\mathbf{f}_{\mu\nu}\cdot\mathbf{f}_{\mu\nu}.$$

Since the inclusion of a field with isotopic spin 1/2 is illustrative, and does not complicate matters very much, we shall use the following total Lagrangian density:

$$\mathcal{L} = -\tfrac{1}{4}\mathbf{f}_{\mu\nu}\cdot\mathbf{f}_{\mu\nu} - \bar\psi\gamma_\mu(\partial_\mu - i\epsilon\,\boldsymbol{\tau}\cdot\mathbf{b}_\mu)\psi - m\bar\psi\psi. \tag{11}$$

One obtains from this the following equations of motion:

$$\frac{\partial\mathbf{f}_{\mu\nu}}{\partial x_\nu} + 2\epsilon(\mathbf{b}_\nu\times\mathbf{f}_{\mu\nu}) + \mathbf{J}_\mu = 0, \qquad \gamma_\mu(\partial_\mu - i\epsilon\,\boldsymbol{\tau}\cdot\mathbf{b}_\mu)\psi + m\psi = 0, \tag{12}$$

‡ Note added in proof.—It may appear that $B_\mu$ could be introduced as an auxiliary quantity to accomplish invariance, but need not be regarded as a field variable by itself. It is to be emphasized that such a procedure violates the principle of invariance. Every quantity that is not a pure numeral (like 2, or M, or any definite representation of the $\gamma$ matrices) should be regarded as a dynamical variable, and should be varied in the Lagrangian to yield an equation of motion. Thus the quantities $B_\mu$ must be regarded as independent fields.
8 Repeated indices are summed over, except where explicitly stated otherwise. Latin indices are summed from 1 to 3, Greek ones from 1 to 4.

where

$$\mathbf{J}_\mu = i\epsilon\,\bar\psi\gamma_\mu\boldsymbol{\tau}\psi. \tag{13}$$

The divergence of $\mathbf{J}_\mu$ does not vanish. Instead it can easily be shown from (13) that

$$\partial\mathbf{J}_\mu/\partial x_\mu = -2\epsilon\,\mathbf{b}_\mu\times\mathbf{J}_\mu. \tag{14}$$

If we define, however,

$$\boldsymbol{\mathfrak{J}}_\mu = \mathbf{J}_\mu + 2\epsilon\,\mathbf{b}_\nu\times\mathbf{f}_{\mu\nu}, \tag{15}$$

then (12) leads to the equation of continuity,

$$\partial\boldsymbol{\mathfrak{J}}_\mu/\partial x_\mu = 0. \tag{16}$$

$\boldsymbol{\mathfrak{J}}_{1,2,3}$ and $\boldsymbol{\mathfrak{J}}_4$ are respectively the isotopic spin current density and isotopic spin density of the system. The equation of continuity guarantees that the total isotopic spin

$$\mathbf{T} = \int\boldsymbol{\mathfrak{J}}_4\, d^3x$$

is independent of time and independent of a Lorentz transformation. It is important to notice that $\boldsymbol{\mathfrak{J}}_\mu$, like $\mathbf{b}_\mu$, does not transform exactly like vectors under isotopic space rotations. But the total isotopic spin,

$$\mathbf{T} = -\int\frac{\partial\mathbf{f}_{4i}}{\partial x_i}\, d^3x,$$

is the integral of the divergence of $\mathbf{f}_{4i}$, which transforms like a true vector under isotopic spin space rotations. Hence, under a general isotopic gauge transformation, if $S\to S_0$ on an infinitely large sphere, T would transform like an isotopic spin vector.

Equation (15) shows that the isotopic spin arises both from the spin-1/2 field ($\mathbf{J}_\mu$) and from the $\mathbf{b}_\mu$ field itself. Inasmuch as the isotopic spin is the source of the b field, this fact makes the field equations for the b field nonlinear, even in the absence of the spin-1/2 field. This is different from the case of the electromagnetic field, which is itself chargeless, and consequently satisfies linear equations in the absence of a charged field.

The Hamiltonian derived from (11) is easily demonstrated to be positive definite in the absence of the field of isotopic spin 1/2. The demonstration is completely identical with the similar one in electrodynamics.

We must complete the set of equations of motion (12) and (13) by the supplementary condition,

$$\partial\mathbf{b}_\mu/\partial x_\mu = 0, \tag{17}$$

which serves to eliminate the scalar part of the field in $\mathbf{b}_\mu$. This clearly imposes a condition on the possible isotopic gauge transformations. That is, the infinitesimal isotopic gauge transformation $S = 1 - i\boldsymbol{\tau}\cdot\delta\boldsymbol{\omega}$ must satisfy the following condition:

$$2\mathbf{b}_\mu\times\frac{\partial}{\partial x_\mu}\delta\boldsymbol{\omega} + \frac{1}{\epsilon}\frac{\partial^2}{\partial x_\mu^2}\delta\boldsymbol{\omega} = 0. \tag{18}$$

This is the analog of the equation $\partial^2\alpha/\partial x_\mu^2 = 0$ that must be satisfied by the gauge transformation $A'_\mu = A_\mu + e^{-1}(\partial\alpha/\partial x_\mu)$ of the electromagnetic field.

QUANTIZATION

To quantize, it is not convenient to use the isotopic gauge invariant Lagrangian density (11). This is quite similar to the corresponding situation in electrodynamics and we adopt the customary procedure of using a Lagrangian density which is not obviously gauge invariant:

$$\mathcal{L} = -\tfrac{1}{2}\frac{\partial\mathbf{b}_\mu}{\partial x_\nu}\cdot\frac{\partial\mathbf{b}_\mu}{\partial x_\nu} + 2\epsilon(\mathbf{b}_\mu\times\mathbf{b}_\nu)\cdot\frac{\partial\mathbf{b}_\mu}{\partial x_\nu} - \epsilon^2(\mathbf{b}_\mu\times\mathbf{b}_\nu)^2 + \mathbf{J}_\mu\cdot\mathbf{b}_\mu - \bar\psi(\gamma_\mu\partial_\mu + m)\psi. \tag{19}$$

The equations of motion that result from this Lagrangian density can be easily shown to imply that

$$\frac{\partial^2}{\partial x_\nu^2}\mathbf{a} + 2\epsilon\,\mathbf{b}_\nu\times\frac{\partial}{\partial x_\nu}\mathbf{a} = 0,$$

where

$$\mathbf{a} = \partial\mathbf{b}_\mu/\partial x_\mu.$$

Thus if, consistent with (17), we put on one space-like surface $\mathbf{a}=0$ together with $\partial\mathbf{a}/\partial t=0$, it follows that $\mathbf{a}=0$ at all times. Using this supplementary condition one can easily prove that the field equations resulting from the Lagrangian densities (19) and (11) are identical.

One can follow the canonical method of quantization with the Lagrangian density (19). Defining

$$\boldsymbol{\Pi}_\mu = -\partial\mathbf{b}_\mu/\partial x_4 + 2\epsilon(\mathbf{b}_\mu\times\mathbf{b}_4),$$

one obtains the equal-time commutation rule

$$[b_\mu^i(x), \Pi_\nu^j(x')]_{t=t'} = -\delta_{ij}\delta_{\mu\nu}\delta^3(x-x'), \tag{20}$$

where $b_\mu^i$, i = 1, 2, 3, are the three components of $\mathbf{b}_\mu$. The relativistic invariance of these commutation rules follows from the general proof for canonical methods of quantization given by Heisenberg and Pauli.^{9}

The Hamiltonian derived from (19) is identical with the one from (11), in virtue of the supplementary condition. Its density is

$$H = H_0 + H_{\rm int},$$

$$H_{\rm int} = 2\epsilon(\mathbf{b}_i\times\mathbf{b}_4)\cdot\boldsymbol{\Pi}_i - 2\epsilon(\mathbf{b}_\mu\times\mathbf{b}_j)\cdot(\partial\mathbf{b}_\mu/\partial x_j) + \epsilon^2(\mathbf{b}_i\times\mathbf{b}_j)^2 - \mathbf{J}_\mu\cdot\mathbf{b}_\mu. \tag{21}$$

The quantized form of the supplementary condition is the same as in quantum electrodynamics.

9 W. Heisenberg and W. Pauli, Z. Physik 56, 1 (1929).
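
The nonlinearity discussed above enters through the cross-product term in the field strength. As a minimal numerical sketch, assuming isotopic spin 1/2 (so that the matrices T are half the Pauli matrices), the code below checks that the matrix commutator term $i\epsilon(B_\mu B_\nu - B_\nu B_\mu)$ of Eq. (4), with $B_\mu = 2\mathbf{b}_\mu\cdot\mathbf{T}$ as in Eq. (6), equals $2(-2\epsilon\,\mathbf{b}_\mu\times\mathbf{b}_\nu)\cdot\mathbf{T}$, which is the nonlinear term implied by Eqs. (8) and (9). All function names are hypothetical helpers introduced for this illustration; they do not come from the paper.

```python
import random

# Pauli matrices; for isotopic spin 1/2 the matrices T^i are sigma^i / 2.
SIGMA = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def dot_sigma(v):
    """Return the 2x2 matrix v . sigma, i.e. 2 v . T for T = sigma / 2."""
    return [[sum(v[a] * SIGMA[a][i][j] for a in range(3)) for j in range(2)]
            for i in range(2)]


def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def cross(u, v):
    """Ordinary three-component cross product u x v."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def commutator_term(b_mu, b_nu, eps):
    """Matrix form: i*eps*(B_mu B_nu - B_nu B_mu) with B = 2 b . T."""
    big_mu, big_nu = dot_sigma(b_mu), dot_sigma(b_nu)
    ab, ba = matmul(big_mu, big_nu), matmul(big_nu, big_mu)
    return [[1j * eps * (ab[i][j] - ba[i][j]) for j in range(2)]
            for i in range(2)]


def cross_term(b_mu, b_nu, eps):
    """Vector form: 2 * (-2*eps * b_mu x b_nu) . T, written as a matrix."""
    return dot_sigma([-2 * eps * c for c in cross(b_mu, b_nu)])


def max_abs_diff(a, b):
    """Largest absolute difference between corresponding matrix elements."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    b_mu = [rng.uniform(-1, 1) for _ in range(3)]
    b_nu = [rng.uniform(-1, 1) for _ in range(3)]
    diff = max_abs_diff(commutator_term(b_mu, b_nu, 0.7),
                        cross_term(b_mu, b_nu, 0.7))
    print("largest matrix-element mismatch:", diff)
```

The check uses plain nested lists rather than a linear-algebra library so that it runs without dependencies; the coupling value 0.7 is arbitrary. Setting both vectors equal makes the term vanish, which is the sense in which a single-component (electromagnetic-like) field stays linear.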


PROPERTIES OF THE b QUANTA

The quanta of the b field clearly have spin unity and isotopic spin unity. We know their electric charge too because all the interactions that we proposed must satisfy the law of conservation of electric charge, which is exact. The two states of the nucleon, namely proton and neutron, differ by charge unity. Since they can transform into each other through the emission or absorption of a b quantum, the latter must have three charge states with charges ±e and 0. Any measurement of electric charges of course involves the electromagnetic field, which necessarily introduces a preferential direction in isotopic space at all space-time points. Choosing the isotopic gauge such that this preferential direction is along the z axis in isotopic space, one sees that for the nucleons

$$Q = \text{electric charge} = e(\tfrac{1}{2} + \epsilon^{-1}T^z),$$

and for the b quanta

$$Q = (e/\epsilon)T^z.$$

The interaction (7) then fixes the electric charge up to an additive constant for all fields with any isotopic spin:

$$Q = e(\epsilon^{-1}T^z + R). \tag{22}$$

The constants R for two charge conjugate fields must be equal but have opposite signs.^{10}

[FIG. 1. Elementary vertices for b fields and nucleon fields. Dotted lines refer to b field, solid lines with arrow refer to nucleon field.]

We next come to the question of the mass of the b quantum, to which we do not have a satisfactory answer. One may argue that without a nucleon field the Lagrangian would contain no quantity of the dimension of a mass, and that therefore the mass of the b quantum in such a case is zero. This argument is however subject to the criticism that, like all field theories, the b field is beset with divergences, and dimensional arguments are not satisfactory.

One may of course try to apply to the b field the methods for handling infinities developed for quantum electrodynamics. Dyson's approach^{11} is best suited for the present case. One first transforms into the interaction representation in which the state vector $\Psi$ satisfies

$$i\,\partial\Psi/\partial t = H_{\rm int}\Psi,$$

where $H_{\rm int}$ was defined in Eq. (21). The matrix elements of the scattering matrix are then formulated in terms of contributions from Feynman diagrams. These diagrams have three elementary types of vertices illustrated in Fig. 1, instead of only one type as in quantum electrodynamics. The primitive divergences are still finite in number and are listed in Fig. 2. Of these, the one labeled a is the one that affects the propagation function of the b quantum, and whose singularity determines the mass of the b quantum. In electrodynamics, by the requirement of electric charge conservation,^{12} it is argued that the mass of the photon vanishes. Corresponding arguments in the b field case do not exist^{13} even though the conservation of isotopic spin still holds. We have therefore not been able to conclude anything about the mass of the b quantum.

[FIG. 2. Primitive divergences.]

A conclusion about the mass of the b quantum is of course very important in deciding whether the proposal of the existence of the b field is consistent with experimental information. For example, it is inconsistent with present experiments to have their mass less than that of the pions, because among other reasons they would then be created abundantly at high energies and the charged ones should live long enough to be seen. If they have a mass greater than that of the pions, on the other hand, they would have a short lifetime (say, less than sec) for decay into pions and photons and would so far have escaped detection.

10 See M. Gell-Mann, Phys. Rev. 92, 833 (1953).
11 F. J. Dyson, Phys. Rev. 75, 486, 1736 (1949).
12 J. Schwinger, Phys. Rev. 76, 790 (1949).
13 In electrodynamics one can formally prove that $G_{\mu\nu}k_\nu=0$, where $G_{\mu\nu}$ is defined by Schwinger's Eq. (A12). ($G_{\mu\nu}A_\nu$ is the current generated through virtual processes by the arbitrary external field $A_\nu$.) No corresponding proof has been found for the present case. This is due to the fact that in electrodynamics the conservation of charge is a consequence of the equation of motion of the electron field alone, quite independently of the electromagnetic field itself. In the present case the b field carries an isotopic spin and destroys such general conservation laws.
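
As a worked check on the transformation laws quoted in the paper, the short calculation below sketches how Eq. (3) follows from Eqs. (1) and (2), and how Eq. (10) follows from Eq. (3) for an infinitesimal transformation. It assumes that $\epsilon$ denotes the coupling constant, that $B_\mu = 2\mathbf{b}_\mu\cdot\mathbf{T}$ as in Eq. (6), and that the matrices obey $[T^a,T^b]=i\epsilon_{abc}T^c$ (with $\epsilon_{abc}$ the antisymmetric symbol, not the coupling); terms of second order in $\delta\boldsymbol{\omega}$ are dropped. It is a sketch under these assumptions, not a reproduction of the authors' working.

```latex
% Eq. (3) from Eqs. (1) and (2): insert \psi = S\psi' on the right of (2).
\begin{align*}
(\partial_\mu - i\epsilon B_\mu)S\psi'
  &= (\partial_\mu S)\psi' + S\,\partial_\mu\psi' - i\epsilon B_\mu S\psi' ,\\
S(\partial_\mu - i\epsilon B'_\mu)\psi'
  &= S\,\partial_\mu\psi' - i\epsilon S B'_\mu\psi' .
\end{align*}
% Equating the two for arbitrary \psi':
\[
-i\epsilon\,S B'_\mu = \partial_\mu S - i\epsilon B_\mu S
\quad\Longrightarrow\quad
B'_\mu = S^{-1}B_\mu S + \frac{i}{\epsilon}\,S^{-1}\frac{\partial S}{\partial x_\mu}.
\]
% Eq. (10): put S = 1 - 2i\,\mathbf{T}\cdot\delta\boldsymbol{\omega} into (3).
\begin{align*}
S^{-1}B_\mu S &\simeq 2\mathbf{b}_\mu\cdot\mathbf{T}
   + 4i\,[\mathbf{T}\cdot\delta\boldsymbol{\omega},\ \mathbf{b}_\mu\cdot\mathbf{T}]
   = 2\mathbf{b}_\mu\cdot\mathbf{T}
   + 4(\mathbf{b}_\mu\times\delta\boldsymbol{\omega})\cdot\mathbf{T},\\
\frac{i}{\epsilon}\,S^{-1}\frac{\partial S}{\partial x_\mu}
  &\simeq \frac{2}{\epsilon}\,\mathbf{T}\cdot\frac{\partial\,\delta\boldsymbol{\omega}}{\partial x_\mu},
\end{align*}
\[
\mathbf{b}'_\mu = \mathbf{b}_\mu + 2\,\mathbf{b}_\mu\times\delta\boldsymbol{\omega}
  + \frac{1}{\epsilon}\,\frac{\partial\,\delta\boldsymbol{\omega}}{\partial x_\mu}.
\]
```

The inhomogeneous gradient term is the direct analog of the $e^{-1}\partial\alpha/\partial x_\mu$ term in the electromagnetic gauge transformation, while the cross-product term has no electromagnetic counterpart.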
PHYSICAL REVIEW VOLUME 98, NUMBER 5 JUNE 1, 1955

Conservation of Heavy Particles and Generalized Gauge Transformations


T. D. LEE, Columbia University, New York, New York
AND
C. N. YANG, Institute for Advanced Study, Princeton, New Jersey


(Received March 2, 1955)

The possibility of a heavy-particle gauge transformation is discussed.

THE conservation laws of nature fall into two distinct categories: those that are related to invariance under space-time displacements and rotations, and those that are not. In the former category there are the conservation laws of momentum, energy, and angular momentum. In the latter category we find the conservation laws of electric charge, of heavy particles, and the approximate conservation laws of isotopic spin, and perhaps others.¹ We notice that the best known within this second category, the conservation of electric charge, is related to invariance under gauge transformations,² which expresses the nonmeasurability of the phase of the complex wave function of a charged particle.

We want to ask here whether similar gauge invariances should be related to all conservation laws of the second category. This question has been discussed in connection with the conservation of isotopic spin by Yang and Mills.³ We wish here to discuss the problem in connection with the conservation of heavy particles.

If we take the conservation of heavy particles to mean invariance under the transformation

$$\psi_N \to e^{i\alpha}\psi_N, \qquad \psi_P \to e^{i\alpha}\psi_P, \tag{1}$$

for the wave function of the heavy particles (neutrons and protons), a general gauge transformation (heavy-particle gauge transformation) is a transformation like (1) with the phase $\alpha$ an arbitrary function of space-time. Invariance under such a transformation means that the relative phase of the wave function of a heavy particle at two different space-time points is not measurable.

Such a gauge transformation is formally completely identical with the electromagnetic gauge transformation. Invariance under such a transformation therefore necessitates the existence of a neutral vector massless field coupled to all heavy particles. A nucleon would have a heavy-particle charge of $+\eta$ in such a field and an antinucleon would have a heavy-particle charge of $-\eta$. The force between two massive bodies therefore would contain a contribution from the Coulomb-like repulsion between such heavy-particle charges. The total force including the gravitational attraction is:

$$\text{Force} = -G\,(M_1 M_2/R^2) + \eta^2\,(A_1 A_2/R^2). \tag{2}$$

Here $M_1$, $M_2$, $A_1$, and $A_2$ are the inertia masses and mass numbers of the two bodies. There should also be a magnetic-dipole-like interaction between individual nuclei because the nucleons are in constant motion in a nucleus. But in a macroscopic object the nuclear spins average out so that (2) is correct unless the two bodies are spinning at high speeds.

Now the packing fraction of various atoms differ so that $M/A$ varies fractional-wise from substance to substance by $\sim 10^{-3}$. This means that the observed gravitation mass [which contains a contribution from the $\eta^2$ term in (2)] divided by the inertia mass would vary fractional-wise from substance to substance by $10^{-3}\,\eta^2/G(M_P)^2$, where $M_P$ is the mass of the proton. Very careful measurements by Eötvös and co-workers⁴ have shown this variation to be $< 10^{-8}$. Therefore

$$\eta^2/G(M_P)^2 < 10^{-5}.$$

It may be remarked that since the packing fraction differs most between hydrogen and, say, carbon, the Eötvös experiment could yield a more sensitive detection of $\eta^2$ by a factor of 10 if repeated with a comparison of hydrogen and carbon.

The assumption that leads to the above line of reasoning and the force expression (2) is that the phase factor $\alpha$ in (1) should be space-time-dependent. It should be noticed that in addition the assumption has also been made that the transformation that generates the conservation of heavy particles is of the specific form (1).

We wish to thank Dr. J. Robert Oppenheimer for an interesting discussion.

¹ See M. Gell-Mann and A. Pais, Proceedings of the Glasgow Conference, July, 1954 (to be published).
² W. Pauli, Revs. Modern Phys. 13, 203 (1941).
³ C. N. Yang and R. L. Mills, Phys. Rev. 96, 191 (1954).
⁴ Eötvös, Pekár, and Fekete, Ann. Physik 68, 11 (1922).
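The order-of-magnitude bound on the heavy-particle charge can be retraced with a few lines of arithmetic. The sketch below is only an illustration and not part of the original note: the constant and function names are hypothetical, and the two input numbers are the fractional spread of $M/A$ (about $10^{-3}$) and the Eötvös limit ($10^{-8}$) quoted in the text.

```python
# Illustrative arithmetic only; names are hypothetical, numbers are from the text.
PACKING_FRACTION_SPREAD = 1e-3  # fractional variation of M/A between substances
EOTVOS_LIMIT = 1e-8             # upper limit on the variation of the mass ratio


def heavy_charge_bound(spread: float, limit: float) -> float:
    """Upper bound on eta^2 / (G * M_p^2).

    The eta^2 term in the force law changes the ratio of gravitational to
    inertial mass by roughly spread * eta^2 / (G * M_p^2); requiring that
    this stay below the experimental limit gives limit / spread.
    """
    return limit / spread


bound = heavy_charge_bound(PACKING_FRACTION_SPREAD, EOTVOS_LIMIT)
print(f"eta^2 / (G M_p^2) < {bound:.0e}")
```

With a tenfold more sensitive comparison, such as the hydrogen and carbon pair suggested in the text, the same function would be called with a limit ten times smaller.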


T. D. Lee and C. N. Yang at the Institute for Advanced Study


(Courtesy of the Archives of the Institute for Advanced Study,
Princeton, New Jersey, U.S.A.)

PHYSICAL REVIEW VOLUME 101, NUMBER 5 MARCH 1, 1956

Invariant Theoretical Interpretation of Interaction


RYOYU UTIYAMA*
Institute for Advanced Study, Princeton, New Jersey
(Received July 7, 1955)

Some systems of fields have been considered which are invariant under a certain group of transformations
depending on n parameters. A general rule is obtained for introducing a new field in a definite way with a
definite type of interaction with the original fields by postulating the invariance of these systems under a
wider group derived by replacing the parameters of the original group with a set of arbitrary functions.
The transformation character of this new field under the wider group is determined from the invariance
postulate. The possible types of the equations of the new fields can be also derived, giving rise to a certain
conservation law owing to the invariance. As examples, the electromagnetic, the gravitational and the
Yang-Mills fields are reconsidered following this line of approach.

INTRODUCTION

THE form of the interactions between some well known fields can be determined by postulating invariance under a certain group of transformations. For example, let us consider the electromagnetic interaction of a charged field $Q(x)$, $Q^*(x)$. The electromagnetic interaction appears in the Lagrangian through the expressions

$$\frac{\partial Q}{\partial x^\mu} - ieA_\mu Q \quad\text{or}\quad \frac{\partial Q^*}{\partial x^\mu} + ieA_\mu Q^*. \tag{1}$$

The gauge invariance of this system is easily verified in virtue of the combinations of $Q$, $Q^*$, and $A_\mu$ in (1), if this system is invariant under the phase transformation

$$Q \to e^{i\alpha}Q, \qquad Q^* \to Q^* e^{-i\alpha}, \qquad \alpha = \text{const}. \tag{2}$$

Reversing the argument, the combination (1) can be uniquely introduced by the following line of reasoning. In the first place, let us suppose that the Lagrangian $L(Q, Q_{,\mu})$ is invariant under the constant phase transformation (2). Let us replace this phase transformation with the wider one (gauge transformation) having the phase factor $\alpha(x)$ instead of the constant $\alpha$. In order to make the Lagrangian still invariant under this wider transformation it is necessary to introduce the electromagnetic field through the combination (1). This combination and the transformation character of $A_\mu$ under the gauge transformation can be uniquely determined from the gauge invariance postulate of the Lagrangian $L(Q, Q_{,\mu}, A_\mu)$.

This approach was taken by Yang and Mills¹ to introduce their new field $B_\mu$ which interacts with fields having nonvanishing isotopic spins. The gravitational interaction also can be introduced in this fashion.

It may be worthwhile to investigate this approach for a more general case, for if there is a system of fields $Q^A(x)$ which is invariant under some transformation group depending on parameters $\epsilon^1, \epsilon^2, \ldots, \epsilon^n$, then according to the aforementioned viewpoint we may have the possibility of introducing a new field, say $A(x)$, in a definite way. In addition, the transformation character of this new field and the interaction form with the $Q$'s can be determined uniquely.

Let us tentatively call a family of the interactions derived in this way the interactions of the first class, while other types of interactions are denoted as the interactions of the second class. The electromagnetic, gravitational and $B_\mu$-field interactions belong to the first class and the meson-nucleon interaction to the second class, at least at the present stage.

The main purpose of the present paper is to investigate the following problem. Let us consider a system of fields $Q^A(x)$, which is invariant under some transformation group $G$ depending on parameters $\epsilon^1, \epsilon^2, \ldots, \epsilon^n$. Suppose that the aforementioned parameter-group $G$ is replaced by a wider group $G'$, derived by replacing the parameters $\epsilon$'s by a set of arbitrary functions $\epsilon(x)$'s, and that the system considered is invariant under this wider group $G'$. Then, can we answer the following questions by using only the postulate of invariance stated above? (1) What kind of field, $A(x)$, is introduced on account of the invariance? (2) How is this new field $A$ transformed under $G'$? (3) What form does the interaction between the field $A$ and the original field $Q$ take? (4) How can we determine the new Lagrangian $L'(Q,A)$ from the original one $L(Q)$? (5) What type of field equations for $A$ are allowable?

The solution of these problems will be stated in Sec. 1. In Secs. 2, 3, and 4 the well-known examples of the interactions of the first class will be reconsidered following the line of reasoning of Sec. 1. We shall find an analogy between the transformation characters of the electromagnetic field $A_\mu$, the Yang-Mills field $B_\mu$, and Christoffel's affinity $\Gamma^\lambda_{\mu\nu}$ in the theory of the general relativity. Furthermore we shall understand the reason why in the Yang-Mills field strength the quadratic term, $B_\mu \times B_\nu$, appears which is quite similar to that occurring in the Riemann-Christoffel tensor $R^\lambda{}_{\rho\mu\nu}$, namely, to the term $\Gamma\cdot\Gamma$ in $R$.

In the usual textbooks of general relativity the covariant derivative of any tensor is introduced by using the concept of parallel displacement. On the other hand, we shall see in Sec. 4 that the covariant derivative of any tensor or spinor can be derived from the postulate of invariance under the "generalized Lorentz transformations" derived by replacing the six parameters of the usual Lorentz group with a set of six arbitrary functions of $x$. In deriving such covariant derivatives it is unnecessary to use explicitly the notion of parallel displacement.

Now the above stated classification of the interactions has only a tentative meaning. Some of the interactions of the second class might be translated to the first class if we could find a transformation group by means of which we can derive that interaction following the general scheme in Sec. 1. For example, if the interaction between mesons and nucleons could be reinterpreted in a fashion analogous to those of the first class, then one might presumably be able to get a wider viewpoint for interpreting the interactions between the new unstable particles and the nucleons.

* On leave of absence from the University of Osaka, Osaka, Japan.
¹ C. N. Yang and R. L. Mills, Phys. Rev. 96, 191 (1954).

1. GENERAL THEORY

Let us consider a set of fields $Q^A(x)$, $(A = 1, 2, \ldots, N)$, with the Lagrangian density

$$L(Q^A, Q^A{}_{,\mu}), \qquad Q^A{}_{,\mu} = \partial Q^A/\partial x^\mu.$$

Now let us postulate that the action integral referred to some arbitrary four-dimensional domain $\Omega$,

$$I = \int_\Omega L\, d^4x,$$

is invariant under the following infinitesimal transformation:

$$Q^A \to Q^A + \delta Q^A, \qquad \delta Q^A = T_{(a)}{}^{A}{}_{B}\,\epsilon^a Q^B, \tag{1.1}$$

$\epsilon^a$ = infinitesimal parameter $(a = 1, 2, \ldots, n)$,
$T_{(a)}{}^{A}{}_{B}$ = constant coefficient.

In addition, the transformation (1.1) is assumed to be a Lie group $G$ depending on the $n$ parameters $\epsilon^a$. Thus there must be a set of constants $f_a{}^c{}_b$ called the "structure constants," which are defined by

$$[T_{(a)}, T_{(b)}]^A{}_B = T_{(a)}{}^A{}_C\, T_{(b)}{}^C{}_B - T_{(b)}{}^A{}_C\, T_{(a)}{}^C{}_B = f_a{}^c{}_b\, T_{(c)}{}^A{}_B. \tag{1.2}$$

These constants, $f_a{}^c{}_b$, have the following important properties:

$$f_a{}^m{}_b\, f_m{}^l{}_c + f_b{}^m{}_c\, f_m{}^l{}_a + f_c{}^m{}_a\, f_m{}^l{}_b = 0, \qquad f_a{}^c{}_b = -f_b{}^c{}_a. \tag{1.3}$$

The relations (1.3) can be easily obtained from Jacobi's identity and the definition (1.2).

Now from the invariant character of $I$ under the transformation (1.1) and from the fact that this invariance is always preserved for an arbitrary domain $\Omega$, we have the invariance of the Lagrangian density itself. Namely we have

$$\delta L \equiv \frac{\partial L}{\partial Q^A}\,\delta Q^A + \frac{\partial L}{\partial Q^A{}_{,\mu}}\,\delta Q^A{}_{,\mu} \equiv 0. \tag{1.4}$$

The symbol $\equiv$ means that $\delta L$ must vanish at any world point and further that this relation does not depend on the behavior of $Q^A$ and $Q^A{}_{,\mu}$. Substituting (1.1) into (1.4) we get

$$\frac{\partial L}{\partial Q^A}\, T_{(a)}{}^A{}_B\, Q^B + \frac{\partial L}{\partial Q^A{}_{,\mu}}\, T_{(a)}{}^A{}_B\, Q^B{}_{,\mu} \equiv 0 \qquad (a = 1, 2, \ldots, n), \tag{1.5}$$

since the $\epsilon$'s are independent of each other. These $n$ identities are the necessary and sufficient conditions for the invariance of $I$ under $G$.

If we take into account the field equation for $Q^A$, we obtain from (1.5) the following $n$ conservation laws:

$$\frac{\partial J^\mu{}_a}{\partial x^\mu} = 0, \qquad J^\mu{}_a = \frac{\partial L}{\partial Q^A{}_{,\mu}}\, T_{(a)}{}^A{}_B\, Q^B. \tag{1.6}$$

This is so because (1.5) can be rewritten as follows:

$$\left\{\frac{\partial L}{\partial Q^A} - \frac{\partial}{\partial x^\mu}\left(\frac{\partial L}{\partial Q^A{}_{,\mu}}\right)\right\}\delta Q^A + \frac{\partial}{\partial x^\mu}\left[\frac{\partial L}{\partial Q^A{}_{,\mu}}\,\delta Q^A\right] \equiv 0.$$

The first term, $\{\ \}\,\delta Q^A$, vanishes on account of the field equation.

Now let us consider the following transformation:

$$\delta Q^A = T_{(a)}{}^A{}_B\,\epsilon^a(x)\, Q^B, \tag{1.1$'$}$$

$\epsilon^a(x)$ = infinitesimal arbitrary function,

instead of (1.1). In this case $\delta L$ does not vanish but becomes

$$\delta L \equiv \left\{\frac{\partial L}{\partial Q^A}\, T_{(a)}{}^A{}_B\, Q^B + \frac{\partial L}{\partial Q^A{}_{,\mu}}\, T_{(a)}{}^A{}_B\, Q^B{}_{,\mu}\right\}\epsilon^a(x) + \frac{\partial L}{\partial Q^A{}_{,\mu}}\, T_{(a)}{}^A{}_B\, Q^B\,\frac{\partial \epsilon^a}{\partial x^\mu},$$

or

$$\delta L \equiv \frac{\partial L}{\partial Q^A{}_{,\mu}}\, T_{(a)}{}^A{}_B\, Q^B\,\frac{\partial \epsilon^a}{\partial x^\mu}, \tag{1.5$'$}$$

by virtue of the identity (1.5).

In order to preserve the invariance of the Lagrangian under (1.1)$'$, it is necessary to introduce a new field

$$A'^J(x), \qquad J = 1, 2, \ldots, M,$$

in such a way that the right-hand side of (1.5)$'$ can be cancelled with the contribution from this new field $A'^J$.
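The two properties (1.3) of the structure constants are easy to check by brute force for a concrete group. The sketch below is an illustration only and not part of the original paper: it assumes the three-parameter rotation group with $f_a{}^c{}_b = \epsilon_{abc}$ (one possible sign convention), and all function names are hypothetical. It also checks that the matrices whose elements are the structure constants themselves close under commutation with the same constants, as in (1.2).

```python
from itertools import product


def levi_civita(i, j, k):
    """Totally antisymmetric symbol on the indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


R = range(3)
# f[a][c][b] stands for f_a^c_b. Assumption: rotation group, f_a^c_b = eps_{abc}.
f = [[[levi_civita(a, b, c) for b in R] for c in R] for a in R]


def is_antisymmetric(f):
    """Second relation of (1.3): f_a^c_b = -f_b^c_a."""
    return all(f[a][c][b] == -f[b][c][a] for a, b, c in product(R, repeat=3))


def jacobi_residual(f):
    """Largest absolute value of the left-hand side of the first relation of (1.3)."""
    worst = 0
    for a, b, c, l in product(R, repeat=4):
        total = sum(
            f[a][m][b] * f[m][l][c]
            + f[b][m][c] * f[m][l][a]
            + f[c][m][a] * f[m][l][b]
            for m in R
        )
        worst = max(worst, abs(total))
    return worst


def generator(f, a):
    """Matrix with (c, b) element f_a^c_b."""
    return [[f[a][c][b] for b in R] for c in R]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in R) for j in R] for i in R]


def generators_close(f):
    """Check [M_a, M_b] = f_a^m_b M_m for the matrices built from f."""
    for a, b in product(R, repeat=2):
        ab = matmul(generator(f, a), generator(f, b))
        ba = matmul(generator(f, b), generator(f, a))
        for c, d in product(R, repeat=2):
            rhs = sum(f[a][m][b] * generator(f, m)[c][d] for m in R)
            if ab[c][d] - ba[c][d] != rhs:
                return False
    return True


print(is_antisymmetric(f), jacobi_residual(f), generators_close(f))
```

Integer arithmetic is used throughout, so the checks are exact rather than approximate.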

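Now let us denote the new Lagrangian by

$$L'(Q^A, Q^A{}_{,\mu}, A'^J)$$

and consider the following transformation:

$$\delta Q^A = T_{(a)}{}^A{}_B\, Q^B\,\epsilon^a(x), \qquad \delta A'^J = U_{(a)}{}^J{}_K\, A'^K\,\epsilon^a(x) + C^{J,\mu}{}_a\,\frac{\partial \epsilon^a}{\partial x^\mu}, \tag{1.7}$$

where the coefficients $U$ and $C$ are unknown constants which will be determined later. In addition, let us propose that the new action integral $I'$ is invariant under the transformation (1.7).

Our problem is to answer the five questions listed in the Introduction.

From the invariance postulate we get the following identity:

$$\delta L'(x) \equiv \frac{\partial L'}{\partial Q^A}\,\delta Q^A + \frac{\partial L'}{\partial Q^A{}_{,\mu}}\,\delta Q^A{}_{,\mu} + \frac{\partial L'}{\partial A'^J}\,\delta A'^J \equiv 0.$$

Inserting (1.7) into the above and taking account of the arbitrariness of choosing $\epsilon^a$ and $\partial\epsilon^a/\partial x^\mu$, we see that each coefficient of $\epsilon$ and $\partial\epsilon/\partial x$ must vanish independently. Namely, we have the identities

$$\frac{\partial L'}{\partial Q^A}\, T_{(a)}{}^A{}_B\, Q^B + \frac{\partial L'}{\partial Q^A{}_{,\mu}}\, T_{(a)}{}^A{}_B\, Q^B{}_{,\mu} + \frac{\partial L'}{\partial A'^J}\, U_{(a)}{}^J{}_K\, A'^K \equiv 0, \tag{1.8}$$

$$\frac{\partial L'}{\partial Q^A{}_{,\mu}}\, T_{(a)}{}^A{}_B\, Q^B + \frac{\partial L'}{\partial A'^J}\, C^{J,\mu}{}_a \equiv 0. \tag{1.9}$$

Now, in order to be able to determine uniquely the $A'^J$-dependence of $L'$, the number of the components of $A'^J$ should be equal to the number of Eqs. (1.9), namely,

$$M = 4n$$

should hold. In addition, the matrix $C^{J,\mu}{}_a$ must be nonsingular. Thus there is the inverse of $C$ defined by

$$C^{J,\mu}{}_a\,(C^{-1})^a{}_{\mu,K} = \delta^J{}_K, \qquad (C^{-1})^a{}_{\mu,J}\, C^{J,\nu}{}_b = \delta^a{}_b\,\delta^\nu{}_\mu.$$

Then (1.9) can be rewritten as

$$\frac{\partial L'}{\partial Q^A{}_{,\mu}}\, T_{(a)}{}^A{}_B\, Q^B + \frac{\partial L'}{\partial A^a{}_\mu} \equiv 0,$$

where we have put

$$A^a{}_\mu = (C^{-1})^a{}_{\mu,J}\, A'^J.$$

Thus $A'^J$ should be contained in $L'$ only through the combination

$$\nabla_\mu Q^A = \frac{\partial Q^A}{\partial x^\mu} - T_{(a)}{}^A{}_B\, Q^B\,(C^{-1})^a{}_{\mu,J}\, A'^J = \frac{\partial Q^A}{\partial x^\mu} - T_{(a)}{}^A{}_B\, Q^B\, A^a{}_\mu. \tag{1.10}$$

By using $A^a{}_\mu$ in place of $A'^J$, the transformation character of $A$ turns into

$$\delta A^a{}_\mu = S_{(c)}{}^a{}_{\mu,}{}^\nu{}_b\, A^b{}_\nu\,\epsilon^c(x) + \frac{\partial \epsilon^a}{\partial x^\mu}, \qquad S_{(c)}{}^a{}_{\mu,}{}^\nu{}_b = (C^{-1})^a{}_{\mu,J}\, U_{(c)}{}^J{}_K\, C^{K,\nu}{}_b. \tag{1.11}$$

Now the new Lagrangian must have the form

$$L'(Q^A, Q^A{}_{,\mu}, A^a{}_\mu) = L''(Q^A, \nabla_\mu Q^A).$$

Therefore we have the relations

$$\frac{\partial L'}{\partial Q^A} = \left.\frac{\partial L''}{\partial Q^A}\right|_{\nabla Q = \text{const}} - \left.\frac{\partial L''}{\partial \nabla_\mu Q^B}\right|_{Q = \text{const}} T_{(a)}{}^B{}_A\, A^a{}_\mu,$$

$$\frac{\partial L'}{\partial Q^A{}_{,\mu}} = \left.\frac{\partial L''}{\partial \nabla_\mu Q^A}\right|_{Q = \text{const}}, \qquad \frac{\partial L'}{\partial A^a{}_\mu} = -\left.\frac{\partial L''}{\partial \nabla_\mu Q^A}\right|_{Q = \text{const}} T_{(a)}{}^A{}_B\, Q^B.$$

By using these relations, (1.8) becomes

$$\left.\frac{\partial L''}{\partial Q^A}\right|_{\nabla Q = \text{const}} T_{(a)}{}^A{}_B\, Q^B + \left.\frac{\partial L''}{\partial \nabla_\mu Q^A}\right|_{Q = \text{const}} T_{(a)}{}^A{}_B\,\nabla_\mu Q^B + \left.\frac{\partial L''}{\partial \nabla_\mu Q^A}\right|_{Q = \text{const}} Q^B A^b{}_\nu \left\{[T_{(a)}, T_{(b)}]^A{}_B\,\delta^\nu{}_\mu - S_{(a)}{}^d{}_{\mu,}{}^\nu{}_b\, T_{(d)}{}^A{}_B\right\} \equiv 0. \tag{1.12}$$

If we put²

$$L''(Q^A, \nabla_\mu Q^A) = L(Q^A, \nabla_\mu Q^A),$$

namely, put $L''$ to be what is obtained by replacing $\partial Q^A/\partial x^\mu$ in the original Lagrangian $L$ with the "covariant derivative," $\nabla_\mu Q^A$, then on account of the identity (1.5), the first and the second terms in (1.12) cancel each other. The remaining terms of (1.12) can be rewritten in the following way owing to the group character (1.2):

$$\left.\frac{\partial L}{\partial \nabla_\mu Q^A}\right|_{Q = \text{const}} Q^B A^b{}_\nu \left(f_a{}^d{}_b\,\delta^\nu{}_\mu - S_{(a)}{}^d{}_{\mu,}{}^\nu{}_b\right) T_{(d)}{}^A{}_B \equiv 0.$$

Therefore we can determine the unknown coefficient $S$ as follows:

$$S_{(a)}{}^c{}_{\mu,}{}^\nu{}_b = \delta^\nu{}_\mu\, f_a{}^c{}_b. \tag{1.13}$$

Using this expression for $S$, we can easily show the covariant character of the derivative $\nabla_\mu Q^A$, i.e.,

$$\delta\nabla_\mu Q^A = T_{(a)}{}^A{}_B\,\epsilon^a(x)\,\nabla_\mu Q^B. \tag{1.14}$$

² This particular choice of $L''$ is due to the requirement that when the field $A$ is assumed to vanish, we must have the original Lagrangian $L$.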
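The covariance statement (1.14) can be checked numerically at a single world point, because to first order in $\epsilon$ it is a purely algebraic identity. The following sketch is an illustration under stated assumptions, not the author's computation: it takes the rotation-group structure constants $f_a{}^c{}_b = \epsilon_{abc}$, uses the matrices $T_{(a)}{}^A{}_B = f_a{}^A{}_B$ as the representation (so the field $Q$ has three components), fixes one value of the index $\mu$, and draws random values for $Q$, $\partial Q$, $A$, $\epsilon$ and $\partial\epsilon$. All function names are hypothetical.

```python
import random
from itertools import product


def levi_civita(i, j, k):
    """Totally antisymmetric symbol on the indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


R = range(3)
# Assumption: f[a][c][b] = f_a^c_b = eps_{abc} (rotation group).
f = [[[levi_civita(a, b, c) for b in R] for c in R] for a in R]
# Assumption: the fields carry the representation T_(a)^A_B = f_a^A_B.
T = f


def apply_T(a, vec):
    """Components of T_(a) acting on a 3-component field."""
    return [sum(T[a][A][B] * vec[B] for B in R) for A in R]


def nabla(dQ, Q, A):
    """Covariant derivative (1.10) for one fixed mu: dQ^A - T_(a)^A_B Q^B A^a."""
    return [dQ[i] - sum(apply_T(a, Q)[i] * A[a] for a in R) for i in R]


def covariance_residual(seed=0):
    """Largest component of delta(nabla Q) - eps^a T_(a) nabla Q, to first order."""
    rng = random.Random(seed)
    Q, dQ, A, eps, deps = ([rng.uniform(-1, 1) for _ in R] for _ in range(5))
    # First-order variations of Q and of its ordinary derivative, from (1.7).
    var_Q = [sum(eps[a] * apply_T(a, Q)[i] for a in R) for i in R]
    var_dQ = [
        sum(eps[a] * apply_T(a, dQ)[i] + deps[a] * apply_T(a, Q)[i] for a in R)
        for i in R
    ]
    # Variation of A from (1.11) with S fixed by (1.13).
    var_A = [
        sum(f[a][c][b] * eps[a] * A[b] for a, b in product(R, repeat=2)) + deps[c]
        for c in R
    ]
    # Variation of nabla Q, keeping first-order terms only.
    lhs = [
        var_dQ[i]
        - sum(apply_T(a, var_Q)[i] * A[a] for a in R)
        - sum(apply_T(a, Q)[i] * var_A[a] for a in R)
        for i in R
    ]
    nQ = nabla(dQ, Q, A)
    rhs = [sum(eps[a] * apply_T(a, nQ)[i] for a in R) for i in R]
    return max(abs(lhs[i] - rhs[i]) for i in R)


print(covariance_residual())
```

The residual vanishes up to rounding only because the inhomogeneous term $\partial\epsilon^a/\partial x^\mu$ in the variation of $A$ cancels the term that the ordinary derivative of $Q$ picks up; dropping `deps[c]` from `var_A` leaves a residual of order one.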

Now let us investigate the possible type of the From these relations and (1.16), we have
Lagrangian for the free A-field. Let it be denoted by
LO(Aap,Aap,"), A a , aA'Jdx'. =O.

The invariance postulate for LO under the transfor- Namely LOmust be a function of F alone and must
mation (1.11) leads to satisfy the identity (1.19).
As may easily be seen, the transformation character
of F is given by
6Fafiv= eb(%) fb"c F'py. (1.20)
Equation (1.20) can be verified by using the relation
(1.3).
Now let us define a set of matrices, M ( l ) , Mcz),
. . .M(,,),
in the following way:
(a$)-element of M(=)=M(o)'b= fc'b,
From (1.17) we see that the derivative of A should be (a, b, c = l , 2, . . .).
contained in LOthrough the combination
Then these matrices are a representation of degree n
a a for the generators of the Lie group G, since the relation
(1.3) can be written as

Thus (1.16) can be written as [ ~ ( a ) , ~ ( b ) I 2 c = f o mMb( m ) , ' c .

Therefore (1.20) shows that 12 quantities, Fl,,, P,.,


. . .Fnp,, are transformed cogradiently to the transfor-
mation of Q.
So far we have not used the field equations of A and Q.
(1.16)' means that the derivative of A appears in Lo, The variation of the total Lagrangian density
only through the particular combination
LT= LO( F ) + L (Q,vQ)
can be rewritten as

Finally, substituting (1.16) into the first term of (l.lS),


we get
~LT
-6QA+-
SQ"
a
~LT
6Aa,
f b a e cb A',--
a
a x p rLT>
__
6 ~ a ,

+-i- I aL
ax. aVpQA
aLn 6LT I
6 ~ ~ + 4 ~ ~ , + - =o,
aFarv 6Aap
-
- 1 (1.21)

where the following abbreviations have been used :


or by virtue of (1.3) we have

(1.19)
Now let us choose the arbitrary function ra(x) in such
a way that the values of all the r's and &//ax's vanish
(See Appendix I.) Since Lo must have the form
on the boundary surface of the integration domain Q .
Lo(A,aA/ax) Lo' (Anp,Fapy), Then the integration of (1.21) over the domain D
becomes
we have the relations
lKd'x--O, (1.22)

with the abbreviation

fab. A

because the integration of the divergence term in (1.21) 2. PHASE TRANSFORMATION GROUP AND THE
vanishes on account of our special choice of the ds. ELECTROMAGNETIC FIELD
Since the BS can be chosen arbitrarily within Q , K Let us consider a charged field Q and Q*. The La-
must vanish a t every point in Q , as is easily seen from grangian of this system is assumed to be invariant
(1.22). under the phase transformation
Consequently the identity (1.21) are separated into
the following two relations : 6QA=iCUQA, 6QA*= -&QA*, a = a real constant.

X=O, (1.23) Since this one-parameter group is commutative, the


and structure constant, of course, vanishes. By replacing
the constant a with a scalar function X(x), a vector
field A,,(x) is introduced. The transformation character
of A,(x) is given by
aFa,, 6Aa,
6 ~ , , = axlax.,
From (1.24), we have
following the general formula (1.11). The new La-
grangian L has the form
L=L(Q,Q*,vd,v,Q*) 9
where V,Q and V,Q* are given by
aQA aQA*
V,QA = --iA,,QA, V,QA*=-+iApQA*,
and axp ax.
aLo aLo
---+---=O. because in the present case
aFa,, aFQ,,
TAB=isAB for QA, T A ~ = - i s A g for QA*.
Put
J P ~ aLT/aAa,,.
= (1.27) The Lagrangian LOfor the free A,,-field is
Then (1.26) leads to Lo= Lo (F,,),
where
aA. aAP
F PY =_--
ax# ax.

and (1.25) becomes The current J p can be obtained from the two different
expressions
(1.29)

If we use the field equation


3. ROTATION GROUP IN ISOTOPIC SPIN SPACE
6LT/6Aa,= 0 , AND THE YANGMILLS FIELD

then we have the conservation of the current, i.e., As an example let us consider a system of proton and
neutron fields :
aJ*JaxP-O (a=1,2, . . .) (1.30) +a= (;:) (
= proton
neutron
).
Thus we have obtained a general rule for introducing
a new field A in a definite way when there exists some The Lagrangian in the charge-independent theory is
conservation law such as (1.6) or there is a Lie group invariant under the rotation in the three-dimensional
depending upon some parameters under which the isotopic spin space :
system is invariant.
In the following sections we shall consider the a a
following groups as examples of the original Lie group : S+.=i C 6 7 ( c ) a b +fl, a$,= -i C E $0 ~ ( ~ ) f l - , (3.1)
0-1 c-1
(1) the phase transformation of a charged field, (2) the
rotation group in the isotopic spin space, and (3) the where T ( ~ ) T, ( * ) , and T ( Q are the usual isotopic spin
Lorentz group. matrices.

In this case the general notation T in Sec. 2 corre- The square of thc invariant length of the infinitesimal
sponds to r as follows line element is given by

dsa=g*ik dx' d3Ck=g,, dw. duy,


where
By replacing the parameters, e', with a set of func- g*u=g*,z= g*aa= - g*rr= 1, g*ik=O for if k,
tions, P ( x ) , the Yang-Mills field
and
B"&) (6'1, 2,3)
is introduced, and this appears in the Lagrangian
through the combination [see (1.10)]
Let us introduce two sets of functions defined by
v,p= a p / a x u - i T ( c ) , eS +0 BC,. (3.2)
The variation of Bc, is given by [see (1.10) and (1.13)] and

Then we have the following relations:

where facb is defined by g*kZ h k p h'v=gpv(w), g p , hkp h l ' = g * k l , h k P h1p=6'k,

gE'* h k p h~"=glrr(ZC),g"' hk, hZl=gL1*, h k " hkv=6Py,


[ i 7 ( a ) $ T ( b ) ] = fucb ' i T ( e ) . (3.4)
det(g,,)=g= -1P= - [det(hk,)]Z.
The derivative of Ba, can appear only through the
Raising or lowering of both kinds of suffices can be
combination [see (1 .IS)]
done by means of gpv, gP' or g * k J and gkJ*. The geo-
metrical meaning of the sets of hA,and hk,, is obvious.
The introduction of the four-world vector3 hla, h#, hap,
and ha' assigns respectively a Iocsl Lorentz frame to
every world point. Of course, the local frames at: every
The variations of V,+ and Fa,. are as follows: world point are transformed in the same way under any
Lorentz transformation, i.e.,
and xk-+xk+ek,xl,

hkr+hkr+8hkr, t k l =-Elk
As was stated in Sec. 1, Fa,,is transformed under the 6hk'= -d k h f .
rotation group as a vector, namely, the isotopic spin of
this B-field is unity. The expression for the "current" On account of this geometrical meaning of the h's,
has the form [see (1.25) and (1.24)]: we can transform the world tensor into the corre-
sponding local tensor defined with respect to the local
aLT . ar. aLo frame, or vice versa, using h~' or hk,. For example,
Juo=-= -*-----T(.), O L $J'--fG%
~ Bbpa
avp F",. (u>=hk, (u) Q' (HI8
Qk =hnp(u)
Qlr(a) Qk (a),
4. LORENTZ GROUP AND THE GRAVITATIONAL where the abbreviation
FJELD
Let us consider a system of fields Q" (x)being defined has been used.
e
= Q"$ ( 1>
.
with respect to same Lorentz frame. In addition, let us In this way we can rewrite the action integral as
assume that the action integral follows :

I = s 2 (Q"(4$2". (a>,hkg>.( Id4%,

where 9 is defined by
is invariant under any Lorentz transformation.
Now besides the z-system, let us introduce an arbi- P = L ( Q " ( ~ ) , J w Q"..(u))
(~) h, (4.2)
trary system of the curvilinear coordinates U M (p= 1, 2,
3 , 4 ) . In what follows, the Latin and Greek indices and Q", stands for
represent quantities defined with respect to the x- aQ " (u)/ W.
system (or the local Lorentz frame) and to the u- a The world vector means a vector which is defined with respect
system respectively. t o the u-system.

The reason or the fact that the Q A in (4.2) is not are assumed to be transformed as
transformed into the corresponding world quantity
is that if Q is a spinor this rewriting is not possible, QA=&(~) T ( ~ z ) , Q,
B
because the spinor can be well defined only with 6hk,,=eki(N) P,,. (4.6)
respect to a Lorentz frame.
Now 1 is invariant under the following two kinds of Then, in order to retain the invariance of I under the
t r a n s f o ~ m a t i o n:s ~ ~ ~ transformatian (4.6), it is necessary to introduce a new
(1) The Lorentz transformation field
A E l p (24)= --A ikp(Zd),
6hhP= e k l hIP,
which has the following transformation character
~ Q = + T ( W . ~ Btk,
Q~ (4.3)
according to (1.11):
u =unchanged,
aekz

where T ( E R , is the (A$)-element of the N X R matrix 6Ak1,=tf0a,hrC*(U)Aha,+-


aUL
T(k1)which is the representation of the generator of the
Lorentz group. The matrix T(k0 satisfies the relation
[T(RIJ,T(m, n)] = ; f k l , 9bmn T(di), T ( k l )= - T(lk)
I

(2) The gencral point transformation Furthermore the new Lagrangian is given by
M-w+X+) =dr,

X+(w)=an arbitrary function of u,

(4.4)
[see (1.10)].
SQ (u) Q(u) -QA(.) =0, - The factors 3 in (4.9) and f in (4.7) are necessary
because in summing up the terms in these expressions
SQf ,= --_ax QA. v.
with respect to the dummy suffices the same contribu-
all* tions are counted twice or four times.
Because of the general Lorentz transformation,
Now our Lagrangian (4.2) has the suitable form for under which each local frame a t each world point i s
the application of the general method stated in I, if the transformed differently, the reIation (4.5) was aban-
given functions, hkP,are regarded as a set of field doned. Since this relation is satisfied only when the
quantities satisfying the condition : basic world is flat, we are forced to take as our basic
ahk,,/auU= ahk,/auP, (4.5) space-time some Riemannian space with the metric
and having the transformation character (4.3) under
the Lorentz group. Though we will omit the condition
(4.5), the invariance of I under the transformations and the affine connection
(4.3) and (4.4) still holds. The only role of (4.5) is to
guarantee the possibility of finding the simplest and
most convenient system of coordinates (d,. . .,d).In
fact if we replace the parameters, eik, with a set of
arbitrary functions, eik(u),after the Lorentz transfor- Accordingly we wouId expect that there exists some
mation depending on such e(u)s, the relatiun (4.5) is relationship between A: and hk,,.
destroyed. I n order to obtain this relationship let us consider,
The condition (4.5) is inconsistent with the applica- as an example, the local tensor
tion of the general procedure of Sec. 1 to the present
problem. Accordingly we shall consider the hs as a set
of 16 independent given functions.
Now following the prescription of Sec. 1, let us con- Then from (4.9) we have
sider the generalized Lorentz transformation depend-
aQkl
ing upon a set of arbitrary functions eik(u)instead of the VfiQkl =-- A Em,,Qm-A Z m
z Qkm
parameters eik. Under this transformation, Q A and hk, aw
R. Utiyama, Progr. Theoret. Phys. (japan) 2, 38 (1947). F. J. Belinfante, Physica 7, 305 (1940); K. Husimi, Proc.
L. Rosenfeld, Ann. Physik 5, 113 (1930). National Research Council of Japan 4, 81 (1943).

By using h, this can be rewritten as then we can solve (4.13)for I. The solution is

or
where Qmvand I are defined by
(4.14)
Qkv=h,,, Qkm,
and where
ah, Apv,,,=hkP hipAk,,,.
ri.,,P=hf--hkp Ak,. (4.11)
a u p
(4.14)is just the relation desired.
I n general, the following relation is easily derived from Now from (4.14)we see that AP,,,, is a world tensor
(4.9) : of the third rank under the general point transformation
a (4.4)because the inhomogeneous term
.
V,Qkl.. ,pu. ..., aB... =-QkZ.. .,pa. 9 .
a)..., a8...
aur a2xp

--A . ..., -A1iflQki*+ab...,a@...-


kilr QiI.. . , p u . . ,,b

+A,, Qkt.*Pr.;b...,ao...+A$,arQk1.*Pr*.; ...,.g.. f . . .


arising from 8 is cancelled out in virtue of the term
+rlxpp Q k i . ...xu... ...,as...f r x g Qk.8pXab s(hah/du). Consequently the A k z , is a covariant world
...,
vector under the u-transformation (4.4). Then it is
+ .. .-p ar A Q ~ ~ . . . . P..., u . . . ~ ~ easily seen that our covariant derivative V , is in
fact covariant under both kinds of transformations,
-rlSrXQ k Z . . . , p a . . .ab ...,ah...- . . .. (4.12)
namely, under any general Lorentz transformation
and any u-transformation (4.4).
This relation is nothing but the usual covariant deriva- Thus we have obtained the general expression for
tive with the exception that for the Latin indices the the covariant derivative without using the concept of
A,, must be inserted in place of the usual affinity I, parallel displacement. For example, if the Q A were the
and for the Greek indices our I must be used instead spinor field +, we have
of the I?. Therefore for the world tensor Q. our
covariant derivative agrees with the usual covariant
derivative with the affinity I. Namely, if we use the
symbol 6, for the usual covariant derivative, we get
where the 71%are the usual Dirac y matrices.
Now let us consider the Lagragian 20 of the free
A-field :
The relationship between A and h can be derived from go(hk,,Akz,,aAki,/au*),
the following consideration: where the hk, is necessary to raise or lower both kinds
VpgkZ*=-Akl -Alk
P-
-0 of tensor suffices.
From the invariance postulate for $0 under the
=h k , h. V , g p = h k o hl, 6 , g p . general Lorentz transformation, we see that $0 must
have the form
From this expression we have
$0 ( h k P , F k l , ) ,

can be represented in the following way in terms of h , ah/au and B :

If we assume that The right-hand side of the above expression is antisymmetric in


rip.p=rt.,,P, pand Y because of the fact that

7 I n general (4.13)gives the following expression for I, if the


symmetry r,>=PVpA is not assumed:
(hi,ah
G-rv,,, ,)+(v and p interchanged) =%-rvk ,-
,?J = 0.

rtvp= rpVp- w,,,+B~,~)+B,.. 2,


,,
Therefore the antisymmetry of A,, in p and Y does not give any
restriction to the symmetry character of B. Now if B is assumed
where r is the Christoffel allinity, while B is an arbitrary tensor to vanish, we obtain the relation (4.14). On theyother hand, if
with the symmetry character B [See E. Schrodinger, the basic space-time is flat, then A takes the form A , v , , = B , , p
Space-time Slruclwe (Cambri&:&%& Press, Cambridge, + B P ,v.-B,p, on account of the relation ahl,/a&= ahl,,/au,
1950), p. 66.1 Therefore our new field Atl, (or A p u , p = h ~
hi,p AX:) or, eqmvalently, hl, .3h~,/adLrV,,.,.

Now corresponding to (2.23) and (1.24) we have the


following identities:

and
(a/au+m=o, (4.19)
where ! l R u is
(4.15) can be rewritten formally as follows:
Fkl,=V,Akly-V,Akr-AkpA1bv+AlrblA$$, (4.15)
where V , A k l , does not behave Like a local tensor of
the second rank, but is a covariant world tensor as the
s f i c e s p and Y show. Using the expression (4.15) we
can prove the following relation (see Appendix 11)
Accordingly, the coefficient of a2E/d3 in (4.19) identi-
Fkluv=hi hk. RahrvJ (4.16) cally vanishes :
where R is the Riemann-Christoffel curvature tensor: di!T a?T
~ &-- hpO.
ahi, ,? ahi;,p y
Thus becomes
Though lak, is contained in our Lagrangian as well as IIJ1.Z ;EkB*is,
A and a A / d u , we can still prove that A k i , appears in
loonly through the combination P.
So far we have assumed that h, is a given function.
The behavior of hk, in general relativity is defined by
the field equation derived by the variational principle. a?T a&
The total Lagrangian density is now given by ---h,,+-htp,
+lIiWp,, ahi,,,..

The field equations for Q and h are8


- { k and i interchanged}
1
.

Inserting this expression into (4.19), we have the


6l!/dQ* =0 trivial result
and d!lPik/aw=OJ (4.21)
as
%Pik=O.

Since the Eulerian derivative 6L/6A appeared in (1.24),


with the abbreviations the nonvanishing current could be derived in the
general theory in Sec. 1. I n the present case, however, we
F,..=dhJdu, higp A = ahp/duYauh. have no Eulerian derivatives in W;nz, Thus the field
cquations do not play any role in deriving the current.
81f the B,,v,pis taken into account, the Riemann-Christoffcl
tensor P x U vis the sainc iunctiun as (4.17) but r musl bc iriscrtcd Now the usual equation for the gravitational field is
in place of r in (4.17). Hcre P i s theafinity given in refcrence 7. derived by taking a particular Lagrangian
In this casc, in addition to the equatiuns for the Q and h fields,
wc havc thc following erpaliun : Qo= hR,
where X is defined as follows:
R=g R,,= h i p h k Fkirv, R,,r=RXp.x.

Of course, in all these equations for the Q, h, and B fields, the Taking the variation with regard to h, we get
affinity Imust be used instead of r. (4.18) and (4.19) also hold
in this case, since B is invariant under the general Lorentz blo bi!
transformations. Consequently this B field is of no use in -+--=(I.
avoiding the trivial result : $Q%i,-D. ah, ahi,

The following relation is now easily verified: For example,


KO. fa'% fimb F'PV
is transformed contragradiently to Fapp,for 6K,, is
6 K a ,p u = f a ' m fImb 6Fbpv
where8
620/6gp,= -h(Rp'-- :g PURR)=-@P'. =fo'm fImb f j b k Fkpv t i .

Thus we have By taking into account the relation (1.3), the above
h 1C
. @Pn= -%Pi, expression becomes
with 6 K ,p v = f a l m ( f k b l f b m j f f I b j f b m k ) F k p v ti.
%Pi= -62/6hip,
or we have Using (1.3) again, we can rewrite the hrst term of this
@Pa= -5PW. expression as follows:

Here X is the symmetric energy-momentum-tensor 6Ka,pv=fkbl(fjma fm'bffamb fmlj) F k p v e i


density of the original field Q. The symmetry character -fkmb f L m f i b j F k p v Ci.
of % can be proved in the following way.
Since the Lagrangian ? for the Q-field is also invariant Since the second term is cancelled with the last term,
under the "general Lorentz transformation," we have we have
an identity similar to (4.18) :
6Ka,pv=-tei fjma f m ' b f i b k F k p v = - E i fjma K q p v .

sl 62 Now let us call Fap,a contravariant vector, and K,, p v


--6QA+--6h',~0.
6QA 6hip a covariant vector with regard to the transformation
(1.20). In addition, let us propose that fb'c is contra-
Inserting the field equation for the Q-field into this variant with respect to the suffix a and covariant with
identity, we get respect to b and c. Then we see that f b a c is a constant
and invariant tensor owing t o the following fact that
6fbac= ei(fjak fbko-fjkb fk'c-fj'c fbak),

From this relation, which vanishes in virtue of the relation (1.3). Hence
this proposal concerning the transformation character
of f b a c is compatible with the covariant character of
can be easily derived. Ka,pv.
Using the quantity
ACKNOWLEDGMENTS
gab=fa'm flmb=gba
The author is most grateful to the Institute for
Advanced Study for a grant-in-aid and to Professor and its inverse gab, we can easily construct a tensor
Robert Oppenheimer for the kind hospitality extended algebra similar to that used in the theory of relativity.
him there. He is also indebted to members of the For example, we have invariants
Institute, especially to Dr. R. Arnowitt, for helpful
conversations. Ilpv,ps=gab Fa,, F b p r = H p o , p v .
In the case of the rotation group in three-dimensional
APPENDIX I. CONDITION (1.19) isotopic spin space (see Sec. 3 ) , f b a c has the following
Here we shall show how to construct an invariant values :
in terms of Fapr.
Consider a quantity G,, the transformation character
of which is contragradient to that of Fa,, under the
transformation (120).
Since
Therefore, we have
I
fi32= j 3 ' 1 =
f?k=
fZ13= - 1,
-fk'i,
otherwise f = 0.

gab=26ab,
GP,. and
Hpv,pa= 26.6 Fapv Fbps.
is invariant by definition, 6G, is given by
Another familiar example is the case of the Lorentz
6Ga= - tcf cba G b . group. Here we have
9 W. Pauli, Encyklopaedie der Mathematischen Wissenschaften
(B. G. Teubner, Leipzig, 1904-1922), Vol. 5, Chap. 19, p. 621. If.
4 j k . o bc d fab.Cdlm'g*il g*km-g*jm g*kl,

INVARIANT INTERPRETATION OF I N T E R A C T I O N 1607

and where 6, is the usual covariant derivative with the


Christoffel affinity.
A ", is by virtue of (4.14) rewritten as

If La is a function of the invariant t r r v . ~alone,


o we
can easily prove the identity
aLo If we suppose as a simple covariant world vector
-j c a b Fhp,,=O. (1-19) and ignore the local su& I, then the factor in the
dF",, parenthesis in the above expression is just the usual
Namely, the left-hand side can be written as covariant derivative of h ' ~ Therefore
. we get
APO
. v --
g sX
hrP6,h'A.
On the other hand from (4.14) we have the relation

The factor in the bracket vanishes on account of the By using (A.3) this becomes
relation (1.3). Consequently there exists, in fact, a
family of invariant Lagrangians, Lo, which are functions 6pA", p=$" hf 6p6vkCA-A"f, p A'k, y hmP It".
of Fap. alone and satisfy the condition (1.19). Inserting this expression into (A.l) with (A.2), we have
APPENDIX 11. PROOF OF THE RELATION Fk',,,= h1A(6,6,-6,6p)hk~.
F",. =h'xhk a R a ~ p r
Fkl,. is given by As is well known, the Riemann-Christoffel tensor is
defined by
Fk',,=VpAk'y-V,Ak',-Akbp A 1 b u f A k b y A'b,. (A.1) (S,,6.-6dp) VX=RPX~"
V,

Now according to the general rule (4.12) for an arbitrary covariant vector V,.
Thus we get
V,Ak',= hk, h', 8J PO") (-4.2) Fk' # U = h" hk, RahpY.

JOURNAL OF MATHEMATICAL PHYSICS VOLUME 2. NUMBER 2 MARCH-APRIL. 1961

Lorentz Invariance and the Gravitational Field


T. W. B. KIBBLE*
Department of Mathematics, Imperial College, London, England
(Received August 19, 1960)
An argument leading from the Lorentz invariance of the Lagrangian to the introduction of the gravitational field is presented. Utiyama's discussion is extended by considering the 10-parameter group of inhomogeneous Lorentz transformations, involving variation of the coordinates as well as the field variables. It is then unnecessary to introduce a priori curvilinear coordinates or a Riemannian metric, and the new field variables introduced as a consequence of the argument include the vierbein components $h_k{}^\mu$ as well as the "local affine connection" $A^i{}_{j\mu}$. The extended transformations for which the 10 parameters become arbitrary functions of position may be interpreted as general coordinate transformations and rotations of the vierbein system. The free Lagrangian for the new fields is shown to be a function of two covariant quantities analogous to $F_{\mu\nu}$ for the electromagnetic field, and the simplest possible form is just the usual curvature scalar density expressed in terms of $h_k{}^\mu$ and $A^i{}_{j\mu}$. This Lagrangian is of first order in the derivatives, and is the analog for the vierbein formalism of Palatini's Lagrangian. In the absence of matter, it yields the familiar equations $R_{\mu\nu} = 0$ for empty space, but when matter is present there is a difference from the usual theory (first pointed out by Weyl) which arises from the fact that $A^i{}_{j\mu}$ appears in the matter field Lagrangian, so that the equation of motion relating $A^i{}_{j\mu}$ to $h_k{}^\mu$ is changed. In particular, this means that, although the covariant derivative of the metric vanishes, the affine connection $\Gamma^\lambda{}_{\mu\nu}$ is nonsymmetric. The theory may be re-expressed in terms of the Christoffel connection, and in that case additional terms quadratic in the "spin density" $\mathfrak{S}^k{}_{ij}$ appear in the Lagrangian. These terms are almost certainly too small to make any experimentally detectable difference to the predictions of the usual metric theory.

1. INTRODUCTION

It has long been realized that the existence of certain fields, notably the electromagnetic field, can be related to invariance properties of the Lagrangian.¹ Thus, if the Lagrangian is invariant under phase transformations $\psi \to e^{ie\lambda}\psi$, and if we wish to make it invariant under the general gauge transformations for which $\lambda$ is a function of $x$, then it is necessary to introduce a new field $A_\mu$ which transforms according to $A_\mu \to A_\mu - \partial_\mu\lambda$, and to replace $\partial_\mu\psi$ in the Lagrangian by a covariant derivative $(\partial_\mu + ieA_\mu)\psi$. A similar argument has been applied by Yang and Mills² to isotopic spin rotations, and in that case yields a triplet of vector fields. It is thus an attractive idea to relate the existence of the gravitational field to the Lorentz invariance of the Lagrangian. Utiyama³ has proposed a method which leads to the introduction of 24 new field variables $A^{ij}{}_\mu$ by considering the homogeneous Lorentz transformations specified by six parameters $\epsilon^{ij}$. However, in order to do this it was necessary to introduce a priori curvilinear coordinates and a set of 16 parameters $h_k{}^\mu$. Initially, the $h_k{}^\mu$ were treated as given functions of $x$, but at a later stage they were regarded as field variables and interpreted as the components of a vierbein system in a Riemannian space. This is a rather unsatisfactory procedure since it is the purpose of the discussion to supply an argument for introducing the gravitational field variables, which include the metric as well as the affine connection. The new field variables $A^{ij}{}_\mu$ were subsequently related to the Christoffel connection $\Gamma^\lambda{}_{\mu\nu}$ in the Riemannian space, but this could only be done uniquely by making the ad hoc assumption that the quantity $\Gamma^\lambda{}_{\mu\nu}$ calculated from $A^{ij}{}_\mu$ was symmetric.

* NATO Research Fellow.
¹ See, for example, H. Weyl, Gruppentheorie und Quantenmechanik (S. Hirzel, Leipzig, 1931), 2nd ed., Chap. 2, p. 89; and earlier references cited there.
² C. N. Yang and R. L. Mills, Phys. Rev. 96, 191 (1954).
³ Ryoyu Utiyama, Phys. Rev. 101, 1597 (1956).

It is the purpose of this paper to show that the vierbein components $h_k{}^\mu$, as well as the "local affine connection" $A^i{}_{j\mu}$, can be introduced as new field variables analogous to $A_\mu$ if one considers the full 10-parameter group of inhomogeneous Lorentz transformations in place of the restricted six-parameter group. This implies that one must consider transformations of the coordinates as well as the field variables, which will necessitate some changes in the argument, but it also means that only one system of coordinates is required, and that a Riemannian metric need not be introduced a priori. The interpretation of the theory in terms of a Riemannian space may be made later if desired. The starting point of the discussion is the ordinary formulation of Lorentz invariance (including translational invariance) in terms of rectangular coordinates in flat space. We shall follow the analogy with gauge transformations as far as possible, and for purposes of comparison we give in Sec. 2 a brief discussion of linear transformations of the field variables. This is essentially a summary of Utiyama's argument, though the emphasis is rather different, particularly with regard to the covariant and noncovariant conservation laws. In Sec. 3 we discuss the invariance under Lorentz transformations, and in Sec. 4 we extend the discussion to the corresponding group in which the ten parameters become arbitrary functions of position. We show that to maintain invariance of the Lagrangian, it is necessary to introduce 40 new variables so that a suitable covariant derivative may be constructed. To make the action integral invariant, one actually requires the Lagrangian to be an invariant density rather than an invariant, and one must, therefore, multiply the invariant by a suitable (and uniquely determined) function of the new fields. In Sec. 5 we consider the possible forms of the free Lagrangian for the new fields. As in the case of the

electromagnetic field, we choose the Lagrangian of lowest degree which satisfies the invariance requirements.

The geometrical interpretation in terms of a Riemannian space is discussed in Sec. 6, where we show that the free Lagrangian we have obtained is just the usual curvature scalar density, though expressed in terms of an affine connection $\Gamma^\lambda{}_{\mu\nu}$ which is not necessarily symmetric. In fact, when no matter is present it is symmetric as a consequence of the equations of motion, but otherwise it has an antisymmetric part expressible in terms of the "spin density" $\mathfrak{S}^\mu{}_{ij}$. Thus there is a difference between this theory and the usual metric theory of gravitation. This difference was first pointed out by Weyl,⁴ and has more recently been discussed by Sciama.⁵ It arises from the fact that our free Lagrangian is of first order in the derivatives, with the $h_k{}^\mu$ and $A^i{}_{j\mu}$ as independent variables. It is possible to re-express the theory in terms of the Christoffel connection ${}^0\Gamma^\lambda{}_{\mu\nu}$ or its local analog ${}^0A^i{}_{j\mu}$, and this is done in Sec. 7. In that case, additional terms quadratic in $\mathfrak{S}^\mu{}_{ij}$, and multiplied by the gravitational constant, appear in the Lagrangian.

2. LINEAR TRANSFORMATIONS

We consider a set of field variables $\chi_A(x)$, which we regard as the elements of a column matrix $\chi(x)$, with the Lagrangian

$$L(x) \equiv L\{\chi(x), \chi_{,\mu}(x)\},$$

where $\chi_{,\mu} = \partial_\mu\chi$. We also consider linear transformations of the form

$$\delta\chi = \epsilon^a T_a \chi, \tag{2.1}$$

where the $\epsilon^a$ are $n$ constant infinitesimal parameters, and the $T_a$ are $n$ given matrices satisfying commutation rules appropriate to the generators of a Lie group,

$$[T_a, T_b] = f_a{}^c{}_b\, T_c.$$

The Lagrangian is invariant under these transformations if the $n$ identities

$$(\partial L/\partial\chi)\,T_a\chi + (\partial L/\partial\chi_{,\mu})\,T_a\chi_{,\mu} \equiv 0 \tag{2.2}$$

are satisfied, and we shall assume that this is so. Note that $\partial L/\partial\chi$ must be regarded as a row matrix. The equations of motion imply $n$ conservation laws

$$J^\mu{}_{a,\mu} = 0,$$

where the "currents" are defined by⁶

$$J^\mu{}_a \equiv -(\partial L/\partial\chi_{,\mu})\,T_a\chi. \tag{2.3}$$

Now, under the more general transformations of the form (2.1), but in which the parameters $\epsilon^a$ become arbitrary functions of position, the Lagrangian is no longer invariant, because the derivatives transform according to

$$\delta\chi_{,\mu} = \epsilon^a T_a\chi_{,\mu} + \epsilon^a{}_{,\mu}T_a\chi, \tag{2.4}$$

and the terms in $\epsilon^a{}_{,\mu}$ do not cancel. In fact, one finds

$$\delta L \equiv -\epsilon^a{}_{,\mu}J^\mu{}_a.$$

However, one can obtain a modified Lagrangian which is invariant by replacing $\chi_{,\mu}$ in $L$ by a quantity $\chi_{;\mu}$ which transforms according to

$$\delta\chi_{;\mu} = \epsilon^a T_a\chi_{;\mu}. \tag{2.5}$$

To do this⁷ it is necessary to introduce $4n$ new field variables $A^a{}_\mu$ whose transformation properties involve $\epsilon^a{}_{,\mu}$. In fact, if one takes

$$\chi_{;\mu} \equiv \chi_{,\mu} + A^a{}_\mu T_a\chi, \tag{2.6}$$

then the condition (2.5) determines the transformation properties of the new fields uniquely. They are

$$\delta A^a{}_\mu = \epsilon^b f_b{}^a{}_c A^c{}_\mu - \epsilon^a{}_{,\mu}. \tag{2.7}$$

In this way one obtains the invariant Lagrangian

$$L'\{\chi, \chi_{,\mu}, A^a{}_\mu\} \equiv L\{\chi, \chi_{;\mu}\}.$$

The expression $\chi_{;\mu}$ may be called the covariant derivative of $\chi$ with respect to the transformations (2.1). One may define covariant currents by

$$J'^\mu{}_a \equiv -(\partial L'/\partial A^a{}_\mu) \equiv -(\partial L/\partial\chi_{;\mu})\,T_a\chi, \tag{2.8}$$

where $L$ is regarded as a function of $\chi$ and $\chi_{;\mu}$. They transform linearly according to

$$\delta J'^\mu{}_a = -\epsilon^b f_b{}^c{}_a J'^\mu{}_c,$$

and their covariant divergences vanish in virtue of the equations of motion and the identities (2.2):

$$J'^\mu{}_{a;\mu} \equiv J'^\mu{}_{a,\mu} - A^b{}_\mu f_b{}^c{}_a J'^\mu{}_c = 0.$$

Two covariant differentiations do not in general commute. From (2.6) one finds

$$\chi_{;\mu\nu} - \chi_{;\nu\mu} = F^a{}_{\mu\nu}T_a\chi,$$

where

$$F^a{}_{\mu\nu} \equiv A^a{}_{\mu,\nu} - A^a{}_{\nu,\mu} - f_b{}^a{}_c A^b{}_\mu A^c{}_\nu. \tag{2.9}$$

Unlike $A^a{}_\mu$, the expression $F^a{}_{\mu\nu}$ is a covariant quantity transforming according to

$$\delta F^a{}_{\mu\nu} = \epsilon^b f_b{}^a{}_c F^c{}_{\mu\nu},$$

and one may, therefore, define its covariant derivative in an obvious manner. It satisfies the cyclic identity

$$F^a{}_{\mu\nu;\rho} + F^a{}_{\nu\rho;\mu} + F^a{}_{\rho\mu;\nu} \equiv 0.$$

⁴ H. Weyl, Phys. Rev. 77, 699 (1950).
⁵ D. W. Sciama, Festschrift for Infeld (Pergamon Press, New York), to be published.
⁶ We have defined $J^\mu{}_a$ with the opposite sign to that used by Utiyama.³ This is because with this choice of sign the analogous quantity for translations is $T^\mu{}_\rho$ rather than $-T^\mu{}_\rho$. The change may be considered as a change of sign of $\epsilon^a$ and $T_a$, and there is a corresponding change of sign in (2.6). This convention has the additional advantage that the "local affine connection" $A^i{}_{j\mu}$ defined in Sec. 4 specifies covariant derivatives according to the same rule as $\Gamma^\lambda{}_{\mu\nu}$.
⁷ For a full discussion, see footnote 3.
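The algebra of Sec. 2 can be checked numerically for a concrete group. The following is a minimal sketch, assuming the rotation group with structure constants $f_a{}^c{}_b = \epsilon_{abc}$ and generators $(T_a)_{ij} = -\epsilon_{aij}$; this choice, the array layout, and the function names are illustrative assumptions, not taken from the paper. It tests the commutation rule $[T_a, T_b] = f_a{}^c{}_b T_c$ and, for constant parameters, that the quadratic part of (2.9) transforms covariantly under the homogeneous part of (2.7).

```python
import numpy as np


def levi_civita():
    """Totally antisymmetric symbol eps[a, b, c] in three dimensions."""
    eps = np.zeros((3, 3, 3))
    for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[a, b, c] = 1.0
        eps[a, c, b] = -1.0
    return eps


eps = levi_civita()
# Assumed layout: f[a, c, b] stores f_a^c_b, defined by [T_a, T_b] = f_a^c_b T_c.
f = np.einsum("abc->acb", eps)
# Assumed representation: (T_a)_{ij} = -eps_{aij} (vector representation of so(3)).
T = -eps


def commutator_residual(T, f):
    """Largest deviation of [T_a, T_b] from f_a^c_b T_c."""
    lhs = np.einsum("aij,bjk->abik", T, T) - np.einsum("bij,ajk->abik", T, T)
    rhs = np.einsum("acb,cik->abik", f, T)
    return np.abs(lhs - rhs).max()


def curl_covariance_residual(f, A, epsilon):
    """Check delta F^a = eps^b f_b^a_c F^c for the quadratic part of (2.9).

    A[a, mu] holds constant field values A^a_mu, so the derivative terms of
    (2.9) and the inhomogeneous term of (2.7) are dropped in this sketch.
    """
    def quadratic_F(A):
        return -np.einsum("bac,bm,cn->amn", f, A, A)

    dA = np.einsum("b,bac,cm->am", epsilon, f, A)  # homogeneous part of (2.7)
    dF_direct = (-np.einsum("bac,bm,cn->amn", f, dA, A)
                 - np.einsum("bac,bm,cn->amn", f, A, dA))
    dF_expected = np.einsum("b,bac,cmn->amn", epsilon, f, quadratic_F(A))
    return np.abs(dF_direct - dF_expected).max()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    print(commutator_residual(T, f))
    print(curl_covariance_residual(f, rng.standard_normal((3, 4)),
                                   rng.standard_normal(3)))
```

Both residuals should vanish up to rounding; the second is a consequence of the Jacobi identity and so would hold for any set of structure constants, not only the ones assumed here.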

It remains to find a free Lagrangian $L_0$ for the new fields. Clearly $L_0$ must be separately invariant, and it is easy to see that this implies that it must contain $A^a{}_\mu$ only through the covariant combination $F^a{}_{\mu\nu}$. The simplest such Lagrangian is⁸

$$L_0 \equiv -\tfrac{1}{4} F^a{}_{\mu\nu} F_a{}^{\mu\nu}, \tag{2.10}$$

where the tensor indices are raised with the flat-space metric $\eta^{\mu\nu}$ with diagonal elements $(1, -1, -1, -1)$, and the index $a$ is lowered with the metric

$$g_{ab} \equiv f_a{}^c{}_d\, f_c{}^d{}_b$$

associated with the Lie group (except of course for a one-parameter group).⁸ᵃ It is clear that this Lagrangian is not unique. All that is required is that it should be a scalar both in coordinate space and in the Lie-group space, and one could add to it terms of higher degree in $F^a{}_{\mu\nu}$. However, it seems reasonable to choose the Lagrangian of lowest degree which satisfies the invariance requirements.

With the choice (2.10) of $L_0$, the equations of motion for the new fields are

$$F_a{}^{\mu\nu}{}_{;\nu} = J'^\mu{}_a.$$

Because of the antisymmetry of $F_a{}^{\mu\nu}$ one can define another current which is conserved in the strict sense:

$$(J'^\mu{}_a + j^\mu{}_a)_{,\mu} = 0, \tag{2.11}$$

where

$$j^\mu{}_a \equiv A^b{}_\nu f_b{}^c{}_a F_c{}^{\mu\nu}.$$

This extra current $j^\mu{}_a$ may be regarded as the current of the new field $A^a{}_\mu$ itself, since it is expressible in the form

$$j^\mu{}_a \equiv -(\partial L_0/\partial A^a{}_\mu) \equiv -(\partial L_0/\partial A^b{}_{\nu,\mu})\, f_a{}^b{}_c A^c{}_\nu, \tag{2.12}$$

which should be compared with (2.8). Note, however, that it is not a covariant quantity. To obtain a strict conservation law one must sacrifice the covariance of the current.

3. LORENTZ TRANSFORMATIONS

We now wish to consider infinitesimal variations of both the coordinates and the field variables,

$$x^\mu \to x'^\mu = x^\mu + \delta x^\mu, \qquad \chi(x) \to \chi'(x') = \chi(x) + \delta\chi(x). \tag{3.1}$$

It will be convenient to allow for the possibility that the Lagrangian may depend on $x$ explicitly. Then, under a variation (3.1), the change in $L$ is

$$\delta L \equiv (\partial L/\partial\chi)\,\delta\chi + (\partial L/\partial\chi_{,\mu})\,\delta\chi_{,\mu} + (\partial L/\partial x^\mu)\,\delta x^\mu,$$

where $\partial L/\partial x^\mu$ denotes the partial derivative with fixed $\chi$. It is sometimes useful to consider also the variation at a fixed value of $x$,

$$\delta_0\chi \equiv \chi'(x) - \chi(x) = \delta\chi - \delta x^\mu \chi_{,\mu}. \tag{3.2}$$

In particular, it is obvious that $\delta_0$ commutes with $\partial_\mu$, whence

$$\delta\chi_{,\mu} = (\delta\chi)_{,\mu} - (\delta x^\nu)_{,\mu}\chi_{,\nu}. \tag{3.3}$$

The action integral

$$I(\Omega) \equiv \int_\Omega L(x)\, d^4x$$

over a space-time region $\Omega$ is transformed under (3.1) into

$$I'(\Omega) \equiv \int_\Omega L'(x')\, \lVert \partial x'^\mu/\partial x^\nu \rVert\, d^4x.$$

Thus the action integral over an arbitrary region is invariant if⁹

$$\delta L + L(\delta x^\mu)_{,\mu} \equiv \delta_0 L + (L\,\delta x^\mu)_{,\mu} \equiv 0. \tag{3.4}$$

This is of course the typical transformation law of an invariant density.

We now consider the specific case of Lorentz transformations,

$$\delta x^\mu = \epsilon^\mu{}_\nu x^\nu + \epsilon^\mu, \qquad \delta\chi = \tfrac{1}{2}\epsilon^{\mu\nu}S_{\mu\nu}\chi, \tag{3.5}$$

where $\epsilon^\mu$ and $\epsilon^{\mu\nu} = -\epsilon^{\nu\mu}$ are 10 real infinitesimal parameters, and the $S_{\mu\nu}$ are matrices satisfying

$$S_{\mu\nu} + S_{\nu\mu} = 0,$$

$$[S_{\mu\nu}, S_{\rho\sigma}] = \eta_{\nu\rho}S_{\mu\sigma} + \eta_{\mu\sigma}S_{\nu\rho} - \eta_{\nu\sigma}S_{\mu\rho} - \eta_{\mu\rho}S_{\nu\sigma} \equiv \tfrac{1}{2} f_{\mu\nu}{}^{\kappa\lambda}{}_{\rho\sigma}S_{\kappa\lambda}.$$

From (3.3) one has

$$\delta\chi_{,\mu} = \tfrac{1}{2}\epsilon^{\rho\sigma}S_{\rho\sigma}\chi_{,\mu} - \epsilon^\rho{}_\mu\chi_{,\rho}. \tag{3.6}$$

Moreover, since $(\delta x^\mu)_{,\mu} = \epsilon^\mu{}_\mu = 0$, the condition (3.4) for invariance of the action integral again reduces to $\delta L \equiv 0$, and yields the 10 identities¹⁰

$$\partial L/\partial x^\rho \equiv L_{,\rho} - (\partial L/\partial\chi)\chi_{,\rho} - (\partial L/\partial\chi_{,\mu})\chi_{,\mu\rho} \equiv 0, \tag{3.7}$$

$$(\partial L/\partial\chi)S_{\rho\sigma}\chi + (\partial L/\partial\chi_{,\mu})(S_{\rho\sigma}\chi_{,\mu} + \eta_{\mu\rho}\chi_{,\sigma} - \eta_{\mu\sigma}\chi_{,\rho}) \equiv 0. \tag{3.8}$$

⁸ There could of course be a constant factor multiplying (2.10), but this can be absorbed by a trivial change of definition of $A^a{}_\mu$ and $T_a$.
⁸ᵃ The discussion here applies only to semisimple groups since otherwise $g_{ab}$ is singular. (I am indebted to the referee for this remark.)
⁹ See L. Rosenfeld, Ann. Physik 5, 113 (1930).
¹⁰ Compare L. Rosenfeld, Ann. inst. Henri Poincaré 2, 25 (1931).

These are evidently the analogs of the identities (2.2), and we shall assume that they are satisfied. Note that (3.7), which express the conditions for translational invariance, are equivalent to the requirement that $L$ be explicitly independent of $x$, as might be expected. As before, the equations of motion may be used to obtain 10 conservation laws which follow from these identities, namely,

$$T^\mu{}_{\rho,\mu} = 0, \qquad (S^\mu{}_{\rho\sigma} - x_\rho T^\mu{}_\sigma + x_\sigma T^\mu{}_\rho)_{,\mu} = 0,$$

where

$$T^\mu{}_\rho \equiv (\partial L/\partial\chi_{,\mu})\chi_{,\rho} - \delta^\mu{}_\rho L, \qquad S^\mu{}_{\rho\sigma} \equiv -(\partial L/\partial\chi_{,\mu})S_{\rho\sigma}\chi.$$

These are the conservation laws of energy, momentum, and angular momentum.

It is instructive to examine these transformations in terms of the variation $\delta_0\chi$ also, which in this case is

$$\delta_0\chi = -\epsilon^\rho\partial_\rho\chi + \tfrac{1}{2}\epsilon^{\rho\sigma}(S_{\rho\sigma} + x_\rho\partial_\sigma - x_\sigma\partial_\rho)\chi.$$

On comparing this with (2.1), one sees that the role of the matrices $T_a$ is played by the differential operators $-\partial_\rho$ and $S_{\rho\sigma} + x_\rho\partial_\sigma - x_\sigma\partial_\rho$. Thus, by analogy with the definition (2.3) of the currents $J^\mu{}_a$, one might expect the currents in this case to be

$$J^\mu{}_\rho \equiv (\partial L/\partial\chi_{,\mu})\chi_{,\rho}, \qquad J^\mu{}_{\rho\sigma} \equiv S^\mu{}_{\rho\sigma} - x_\rho J^\mu{}_\sigma + x_\sigma J^\mu{}_\rho,$$

corresponding to the parameters $\epsilon^\rho$, $\epsilon^{\rho\sigma}$, respectively. However, in terms of $\delta_0$, the condition for invariance (3.4) is not simply $\delta_0 L \equiv 0$, and the additional term $\delta x^\rho L_{,\rho}$ is responsible for the appearance of the term $L_{,\rho}$ in the identities (3.7), and hence for the term $\delta^\mu{}_\rho L$ in $T^\mu{}_\rho$.

4. GENERALIZED LORENTZ TRANSFORMATIONS

We now turn to a consideration of the generalized transformations (3.5) in which the parameters $\epsilon^\mu$ and $\epsilon^{\mu\nu}$ become arbitrary functions of position. It is more convenient, and clearly equivalent, to regard as independent functions $\epsilon^{\mu\nu}$ and

$$\xi^\mu \equiv \epsilon^\mu{}_\nu x^\nu + \epsilon^\mu,$$

since this avoids the explicit appearance of $x$. Moreover, one could consider generalized transformations with $\xi^\mu = 0$ but nonzero $\epsilon^{\mu\nu}$, so that the coordinate and field transformations can be completely separated. In view of this fact, it is convenient to use Latin indices for $\epsilon^{ij}$ (and for the matrices $S_{ij}$), retaining the Greek ones for $\xi^\mu$ and $x^\mu$. Thus the transformations under consideration are

$$\delta x^\mu = \xi^\mu, \qquad \delta\chi = \tfrac{1}{2}\epsilon^{ij}S_{ij}\chi, \tag{4.1}$$

or

$$\delta_0\chi = -\xi^\mu\chi_{,\mu} + \tfrac{1}{2}\epsilon^{ij}S_{ij}\chi. \tag{4.2}$$

This notation emphasizes the similarity of the $\epsilon^{ij}$ transformations to the linear transformations discussed in Sec. 2. These transformations alone were considered by Utiyama.³ Evidently, the four functions $\xi^\mu$ specify a general coordinate transformation. The geometrical significance of the $\epsilon^{ij}$ will be discussed in Sec. 6.

According to our convention, the differential operator $\partial_\mu$ must have a Greek index. However, in the Lagrangian function $L$ it would be inconvenient to have two kinds of indices, and we shall, therefore, regard $L$ as a given function of $\chi$ and $\chi_k$ (no comma),¹¹ satisfying the identities (3.7) and (3.8). The original Lagrangian is then obtained by setting

$$\chi_k = \delta_k{}^\mu\chi_{,\mu}.$$

It is of course not invariant under the generalized transformations (4.1), but we shall later obtain an invariant expression by replacing $\chi_k$ by a suitable quantity $\chi_{;k}$.

The transformation of $\chi_{,\mu}$ is given by

$$\delta\chi_{,\mu} = \tfrac{1}{2}\epsilon^{ij}S_{ij}\chi_{,\mu} + \tfrac{1}{2}\epsilon^{ij}{}_{,\mu}S_{ij}\chi - \xi^\nu{}_{,\mu}\chi_{,\nu}, \tag{4.3}$$

and so the original Lagrangian transforms according to

$$\delta L \equiv -\xi^\rho{}_{,\mu}J^\mu{}_\rho - \tfrac{1}{2}\epsilon^{ij}{}_{,\mu}S^\mu{}_{ij}.$$

Note that it is $J^\mu{}_\rho$ rather than $T^\mu{}_\rho$ which appears here. The reason for this is that we have not included the extra term $L(\delta x^\mu)_{,\mu}$ in (3.4). The left-hand side of (3.4) actually has the value

$$\delta L + L(\delta x^\mu)_{,\mu} \equiv -\xi^\rho{}_{,\mu}T^\mu{}_\rho - \tfrac{1}{2}\epsilon^{ij}{}_{,\mu}S^\mu{}_{ij}.$$

We now look for a modified Lagrangian which makes the action integral invariant. The additional term just mentioned is of a different kind to those previously encountered, in that it involves $L$ and not $\partial L/\partial\chi_k$. In particular, it includes contributions from terms in $L$ which do not contain derivatives. Thus it is clear that we cannot remove it by replacing the derivative by a suitable covariant derivative. For this reason, we shall consider the problem in two stages. We first eliminate the noninvariance arising from the fact that $\chi_{,\mu}$ is not a covariant quantity, and thus obtain an expression $L'$ satisfying

$$\delta L' \equiv 0. \tag{4.4}$$

Then, because the condition (3.4) for invariance of the action integral requires the Lagrangian to be an invariant density rather than an invariant, we make a further modification, replacing $L'$ by $\mathfrak{L}'$, which satisfies

$$\delta\mathfrak{L}' + \xi^\mu{}_{,\mu}\mathfrak{L}' \equiv 0. \tag{4.5}$$

¹¹ Note that since we are using Latin indices for $S_{ij}$ the various tensor components of $\chi$ must also have Latin indices, and for spinor components the Dirac matrices must be $\gamma^k$.
¹² Our $A^{ij}{}_\mu$ differs in sign from that of Utiyama.³ Compare footnote 6.

The first part of this program can be accomplished by replacing $\chi_k$ in $L$ by a "covariant derivative" $\chi_{;k}$ which transforms according to

$$\delta\chi_{;k} = \tfrac{1}{2}\epsilon^{ij}S_{ij}\chi_{;k} - \epsilon^i{}_k\chi_{;i}. \tag{4.6}$$

The condition (4.4) then follows from the identities (3.8). To do this it is necessary to introduce forty new field variables. We consider first the $\epsilon^{ij}$ transformations, and eliminate the $\epsilon^{ij}{}_{,\mu}$ term in (4.3) by setting¹²

$$\chi_{|\mu} \equiv \chi_{,\mu} + \tfrac{1}{2}A^{ij}{}_\mu S_{ij}\chi, \tag{4.7}$$

where the $A^{ij}{}_\mu = -A^{ji}{}_\mu$ are 24 new field variables. We can then impose the condition

$$\delta\chi_{|\mu} = \tfrac{1}{2}\epsilon^{ij}S_{ij}\chi_{|\mu} - \xi^\nu{}_{,\mu}\chi_{|\nu}, \tag{4.8}$$

which determines the transformation properties of $A^{ij}{}_\mu$

uniquely. They are

$$\delta A^{ij}{}_\mu = \epsilon^i{}_k A^{kj}{}_\mu + \epsilon^j{}_k A^{ik}{}_\mu - \xi^\nu{}_{,\mu}A^{ij}{}_\nu - \epsilon^{ij}{}_{,\mu}. \tag{4.9}$$

The position with regard to the last term in (4.3) is rather different. The term involving $\epsilon^{ij}{}_{,\mu}$ is inhomogeneous in the sense that it contains $\chi$ rather than $\chi_{,\mu}$, just like the second term of (2.4), but this is not true of the last term.¹³ Correspondingly, the transformation law (4.8) of $\chi_{|\mu}$ is already homogeneous. This means that to obtain an expression $\chi_{;k}$ transforming according to (4.6) we should add to $\chi_{|\mu}$ not a term in $\chi$ but rather a term in $\chi_{|\mu}$ itself. In other words, we can merely multiply by a new field:

$$\chi_{;k} \equiv h_k{}^\mu\chi_{|\mu}. \tag{4.10}$$

Here the $h_k{}^\mu$ are 16 new field variables with transformation properties determined by (4.6) to be

$$\delta h_k{}^\mu = \xi^\mu{}_{,\nu}h_k{}^\nu - \epsilon^i{}_k h_i{}^\mu. \tag{4.11}$$

It should be noted that the fields $h_k{}^\mu$ and $A^{ij}{}_\mu$ are quite independent and unrelated at this stage, though of course they will be related by equations of motion.

We have now found an invariant $L'$. We can easily obtain an invariant density $\mathfrak{L}'$ by multiplying by a suitable function of the fields already introduced:

$$\mathfrak{L}' \equiv \mathfrak{H}L'.$$

Then (4.5) is satisfied provided that $\mathfrak{H}$ is itself an invariant density,

$$\delta\mathfrak{H} + \xi^\mu{}_{,\mu}\mathfrak{H} = 0.$$

It is easy to see that the only function of the new fields which obeys this transformation law, and does not involve derivatives, is

$$\mathfrak{H} \equiv [\det(h_k{}^\mu)]^{-1},$$

where the arbitrary constant factor has been chosen so that $\mathfrak{H}$ reduces to 1 when $h_k{}^\mu$ is set equal to $\delta_k{}^\mu$.¹⁴

The final form of our modified Lagrangian is

$$\mathfrak{L}'\{\chi, \chi_{,\mu}, h_k{}^\mu, A^{ij}{}_\mu\} \equiv \mathfrak{H}L\{\chi, \chi_{;k}\}.$$

(We can drop the prime without risk of confusion.) It may be asked whether this Lagrangian is unique in the same sense as the modified Lagrangian $L'$ of Sec. 2, and in fact it is easy to see that it is not. The reason for this is that if one starts with two Lagrangians $L_1$ and $L_2$ which differ by an explicit divergence, and are therefore equivalent, then the modified Lagrangians $\mathfrak{L}_1$ and $\mathfrak{L}_2$ are not necessarily equivalent. Consider for example the Lagrangian for a real scalar field written in its first-order form

$$L_1 = \pi^k\varphi_{,k} - \tfrac{1}{2}\pi^k\pi_k - \tfrac{1}{2}m^2\varphi^2. \tag{4.12}$$

This is equivalent to

$$L_2 = -\pi^k{}_{,k}\varphi - \tfrac{1}{2}\pi^k\pi_k - \tfrac{1}{2}m^2\varphi^2, \tag{4.13}$$

but the corresponding modified Lagrangians differ by

$$\mathfrak{L}_1 - \mathfrak{L}_2 = \mathfrak{H}(\pi^k\varphi)_{;k} = \mathfrak{H}h_k{}^\mu[(\pi^k\varphi)_{,\mu} + A^k{}_{i\mu}\pi^i\varphi], \tag{4.14}$$

which is not an explicit divergence. Thus in order to define the modified Lagrangian $\mathfrak{L}$ completely it would be necessary to specify which of the possible equivalent forms of the original Lagrangian is to be chosen. The reasons for this situation and the problem of choosing the correct form are discussed in the Appendix.

As in Sec. 2, one may define modified currents in terms of $\mathfrak{L} \equiv \mathfrak{L}\{\chi, \chi_{;k}\}$ by

$$\mathfrak{T}^k{}_\mu \equiv \partial\mathfrak{L}/\partial h_k{}^\mu \equiv \mathfrak{H}b^i{}_\mu\{(\partial L/\partial\chi_{;k})\chi_{;i} - \delta^k{}_i L\}, \tag{4.15}$$

$$\mathfrak{S}^\mu{}_{ij} \equiv -2(\partial\mathfrak{L}/\partial A^{ij}{}_\mu) \equiv -\mathfrak{H}h_k{}^\mu(\partial L/\partial\chi_{;k})S_{ij}\chi, \tag{4.16}$$

where $b^i{}_\mu$ is the inverse of $h_i{}^\mu$, satisfying

$$b^i{}_\mu h_i{}^\nu = \delta_\mu{}^\nu, \qquad b^i{}_\mu h_j{}^\mu = \delta^i{}_j.$$

To express the conservation laws which these currents satisfy in a simple form, it is convenient to extend the definition of the covariant derivative $\chi_{|\mu}$ (not $\chi_{;k}$). Originally, it is defined for $\chi$ and, therefore, by a trivial extension for any other quantity which is invariant under $\xi^\mu$ transformations, and transforms linearly under $\epsilon^{ij}$ transformations. We wish to extend it to any quantity which transforms linearly under $\epsilon^{ij}$ transformations, by simply ignoring the $\xi^\mu$ transformation properties altogether. Thus, for example, we would have

$$h_i{}^\mu{}_{|\nu} \equiv h_i{}^\mu{}_{,\nu} - A^k{}_{i\nu}h_k{}^\mu, \tag{4.17}$$

according to the $\epsilon^{ij}$ transformation law of $h_i{}^\mu$. We shall call this the $\epsilon$ covariant derivative. Later we shall define another covariant derivative which takes account of $\xi^\mu$ transformations also.

One can easily calculate the commutator of two $\epsilon$ covariant differentiations.¹⁵ This gives

$$\chi_{|\mu\nu} - \chi_{|\nu\mu} = \tfrac{1}{2}R^{ij}{}_{\mu\nu}S_{ij}\chi, \tag{4.18}$$

where

$$R^i{}_{j\mu\nu} \equiv A^i{}_{j\mu,\nu} - A^i{}_{j\nu,\mu} - A^i{}_{k\mu}A^k{}_{j\nu} + A^i{}_{k\nu}A^k{}_{j\mu}.$$

This quantity is covariant under $\epsilon^{ij}$ transformations, and satisfies the cyclic identity

$$R^i{}_{j\mu\nu|\rho} + R^i{}_{j\nu\rho|\mu} + R^i{}_{j\rho\mu|\nu} \equiv 0.$$

¹³ The reason for this may be seen in terms of the variation $\delta_0\chi$ given by (4.2). The analogs of the matrices $T_a$ are clearly $-\partial_\mu$ and $S_{ij}$, so that the presence of the derivative $\chi_{,\nu}$ in the last term of (4.3) is to be expected. By analogy with (2.6) we should expect the covariant derivative to have the form
$$\chi_{;k} = \delta_k{}^\mu\chi_{,\mu} + \tfrac{1}{2}A^{ij}{}_k S_{ij}\chi - A^\mu{}_k\chi_{,\mu}.$$
Because of the appearance of derivatives, the first and last terms can be combined in the form $h_k{}^\mu\chi_{,\mu}$, where $h_k{}^\mu = \delta_k{}^\mu - A^\mu{}_k$. If we then set $A^{ij}{}_k = h_k{}^\mu A^{ij}{}_\mu$, we arrive at the same form for $\chi_{;k}$ as that obtained in the text.
¹⁴ Multiplication of the entire Lagrangian by a constant factor is of course unimportant.
¹⁵ Note that this could not be done without extending the definition, since one must know how to treat the index on $\chi_{|\mu}$. Here, as in Sec. 2, we simply ignore it.
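Two of the algebraic statements used in Secs. 3 and 4 lend themselves to a quick numerical check. The sketch below is illustrative only and rests on stated assumptions: it takes the vector representation $(S_{ij})^a{}_b = \delta^a{}_i\eta_{jb} - \delta^a{}_j\eta_{ib}$ as a concrete choice of the matrices $S_{ij}$, stores $h_k{}^\mu$ as `h[k, mu]`, and uses hypothetical function names. It tests the commutation rule for $S_{ij}$ quoted in Sec. 3, and the density law $\delta\mathfrak{H} = -\xi^\mu{}_{,\mu}\mathfrak{H}$ for $\mathfrak{H} = [\det(h_k{}^\mu)]^{-1}$ under the variation (4.11).

```python
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])  # flat-space metric, signature (1, -1, -1, -1)


def vector_generators():
    """S[i, j] is the 4x4 matrix (S_ij)^a_b = delta^a_i eta_{jb} - delta^a_j eta_{ib}."""
    identity = np.eye(4)
    S = np.zeros((4, 4, 4, 4))
    for i in range(4):
        for j in range(4):
            S[i, j] = np.outer(identity[i], ETA[j]) - np.outer(identity[j], ETA[i])
    return S


def lorentz_algebra_residual(S):
    """Largest deviation from
    [S_mn, S_rs] = eta_nr S_ms + eta_ms S_nr - eta_ns S_mr - eta_mr S_ns."""
    worst = 0.0
    for m in range(4):
        for n in range(4):
            for r in range(4):
                for s in range(4):
                    lhs = S[m, n] @ S[r, s] - S[r, s] @ S[m, n]
                    rhs = (ETA[n, r] * S[m, s] + ETA[m, s] * S[n, r]
                           - ETA[n, s] * S[m, r] - ETA[m, r] * S[n, s])
                    worst = max(worst, np.abs(lhs - rhs).max())
    return worst


def density(h):
    """Scalar density [det(h_k^mu)]^(-1), with h[k, mu] = h_k^mu."""
    return 1.0 / np.linalg.det(h)


def density_variation_residual(seed=0, step=1e-5):
    """Compare a central-difference variation of the density under (4.11)
    with the expected value -(xi^mu_{,mu}) * density."""
    rng = np.random.default_rng(seed)
    h = np.eye(4) + 0.05 * rng.standard_normal((4, 4))
    xi_grad = rng.standard_normal((4, 4))   # xi_grad[mu, nu] = d xi^mu / d x^nu
    eps_up = rng.standard_normal((4, 4))
    eps_up = eps_up - eps_up.T              # eps^{ij}, antisymmetric
    eps_mixed = eps_up @ ETA                # eps^i_k = eps^{ij} eta_{jk}
    dh = h @ xi_grad.T - eps_mixed.T @ h    # right-hand side of (4.11)
    numeric = (density(h + step * dh) - density(h - step * dh)) / (2.0 * step)
    expected = -np.trace(xi_grad) * density(h)
    return abs(numeric - expected)


if __name__ == "__main__":
    print(lorentz_algebra_residual(vector_generators()))
    print(density_variation_residual())
```

The $\epsilon^{ij}$ part of (4.11) drops out of the density variation because $\epsilon^i{}_i = 0$, which is why only the divergence $\xi^\mu{}_{,\mu}$ survives.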


It is thus closely analogous to $F^a{}_{\mu\nu}$. Note that $R^{ij}{}_{\mu\nu}$ is antisymmetric in both pairs of indices.

In terms of the $\epsilon$ covariant derivative, the conservation laws can be expressed in the form¹⁶

$$(\mathfrak{T}^k{}_\nu h_k{}^\mu)_{|\mu} + \mathfrak{T}^k{}_\mu h_k{}^\mu{}_{|\nu} = \tfrac{1}{2}\mathfrak{S}^\mu{}_{ij}R^{ij}{}_{\mu\nu}, \tag{4.19}$$

$$\mathfrak{S}^\mu{}_{ij|\mu} = \mathfrak{T}_{ij} - \mathfrak{T}_{ji}. \tag{4.20}$$

5. FREE GRAVITATIONAL LAGRANGIAN

We now wish to examine the quantity $\chi_{;k}$, rather than $\chi_{|\mu}$. As before, the covariant derivative of any quantity which transforms in a similar way to $\chi$ may be defined analogously. Now in particular $\chi_{;k}$ itself (unlike $\chi_{|\mu}$) is such a quantity, and therefore without extending the definition of covariant derivative one can evaluate the commutator $\chi_{;kl} - \chi_{;lk}$. However, this quantity is not simply obtained by multiplying $\chi_{|\mu\nu} - \chi_{|\nu\mu}$ by $h_k{}^\mu h_l{}^\nu$, as one might expect. The reason for this is that in evaluating $\chi_{;kl}$ one differentiates the $h_k{}^\mu$ in $\chi_{;k}$, and moreover adds an extra $A^i{}_{k\nu}$ term on account of the index $k$. Thus one finds

$$\chi_{;kl} - \chi_{;lk} = \tfrac{1}{2}R^{ij}{}_{kl}S_{ij}\chi - C^i{}_{kl}\chi_{;i}, \tag{5.1}$$

where

$$R^{ij}{}_{kl} \equiv h_k{}^\mu h_l{}^\nu R^{ij}{}_{\mu\nu}, \tag{5.2}$$

$$C^i{}_{kl} \equiv (h_k{}^\mu h_l{}^\nu - h_l{}^\mu h_k{}^\nu)\,b^i{}_{\mu|\nu}. \tag{5.3}$$

Note that (5.1) is not simply proportional to $\chi$, but involves $\chi_{;i}$ also.

We now look for a free Lagrangian $\mathfrak{L}_0$ for the new fields. Clearly $\mathfrak{L}_0$ must be an invariant density, and if we set

$$\mathfrak{L}_0 \equiv \mathfrak{H}L_0,$$

then it is easy to see, as in the case of linear transformations, that the invariant $L_0$ must be a function only of the covariant quantities $R^{ij}{}_{kl}$ and $C^i{}_{kl}$. As before, there are many possible forms for $\mathfrak{L}_0$, but there is a difference between this case and the previous one in that all the indices on these expressions are of the same type (unlike $F^a{}_{\mu\nu}$), and one can, therefore, contract the upper indices with the lower. In fact, the condition that $L_0$ be a scalar in two separate spaces is now reduced to the condition that it be a scalar in one space. In particular, this means that there exists a linear invariant which has no analog in the previous case, namely,

$$R \equiv R^{ij}{}_{ij}.$$

There are in addition several quadratic invariants. However, if we again choose for $L_0$ the form of lowest possible degree, then we are led to the free Lagrangian¹⁷

$$\mathfrak{L}_0 \equiv \tfrac{1}{2}\mathfrak{H}R, \tag{5.4}$$

which differs from (2.10) in being only linear in the derivatives.

With this choice of Lagrangian, the equations of motion for the new fields are

$$\mathfrak{H}(R^{ik}{}_{jk} - \tfrac{1}{2}\delta^i{}_j R) = -\mathfrak{T}^i{}_j, \tag{5.5}$$

$$-[\mathfrak{H}(h_i{}^\mu h_j{}^\nu - h_j{}^\mu h_i{}^\nu)]_{|\nu} = \mathfrak{H}(h_k{}^\mu C^k{}_{ij} - h_j{}^\mu C^k{}_{ik} - h_i{}^\mu C^k{}_{kj}) = \mathfrak{S}^\mu{}_{ij}. \tag{5.6}$$

From Eq. (5.6) one can immediately obtain a strict conservation law

$$(\mathfrak{S}^\mu{}_{ij} + \mathfrak{s}^\mu{}_{ij})_{,\mu} = 0, \tag{5.7}$$

where

$$\mathfrak{s}^\mu{}_{ij} \equiv \mathfrak{H}A^k{}_{i\nu}(h_j{}^\mu h_k{}^\nu - h_k{}^\mu h_j{}^\nu) - \mathfrak{H}A^k{}_{j\nu}(h_i{}^\mu h_k{}^\nu - h_k{}^\mu h_i{}^\nu).$$

This quantity is expressible in the form

$$\mathfrak{s}^\mu{}_{ij} \equiv -2(\partial\mathfrak{L}_0/\partial A^{ij}{}_\mu) \equiv -\tfrac{1}{2}(\partial\mathfrak{L}_0/\partial A^{mn}{}_{\nu,\mu})\, f_{ij}{}^{mn}{}_{kl}A^{kl}{}_\nu,$$

which is closely analogous to (2.12), and should be compared with (4.16). Equation (5.7) is a rather surprising result, since $\mathfrak{S}^\mu{}_{ij}$ may very reasonably be interpreted as the spin density of the matter field,¹⁸ so that it appears to be a law of conservation of spin with no reference to the orbital angular momentum. In fact, however, the orbital angular momentum appears in the corresponding covariant conservation law (4.20), and therefore part of the "spin" of the gravitational field, $\mathfrak{s}^\mu{}_{ij}$, may be regarded as arising from this source. Nevertheless, Eq. (5.7) differs from other statements of angular momentum conservation in that the coordinates do not appear explicitly.

It would also be possible to deduce from Eq. (5.5) a strict conservation law

$$[h_k{}^\mu(\mathfrak{T}^k{}_\nu + \mathfrak{t}^k{}_\nu)]_{,\mu} = 0, \tag{5.8}$$

but there is a considerable amount of freedom in choosing $\mathfrak{t}^k{}_\nu$. The most natural definition, by analogy with (4.15), would be

$$\mathfrak{t}^k{}_\mu \equiv \partial\mathfrak{L}_0/\partial h_k{}^\mu,$$

and this quantity does indeed satisfy (5.8). However, in this case the expression within the parentheses itself vanishes, so that (5.8) is rather trivial. We shall not discuss the question of the correct choice of $\mathfrak{t}^k{}_\mu$ further, as this lies beyond the scope of the present paper.¹⁹

It should be noted that Eq. (5.6) can be solved, at least in principle, for $A^{ij}{}_\mu$. In the simple case when $\mathfrak{S}^\mu{}_{ij}$ vanishes, one finds²⁰

$$A_{ij\mu} = {}^0A_{ij\mu} \equiv \tfrac{1}{2}b^k{}_\mu(c_{kij} - c_{ijk} - c_{jki}), \qquad c^k{}_{ij} \equiv (h_i{}^\mu h_j{}^\nu - h_j{}^\mu h_i{}^\nu)\,b^k{}_{\mu,\nu}. \tag{5.9}$$

¹⁶ This is another example of the fact that for $\xi^\mu$ transformations derivatives play the role of the matrices $T_a$. Compare footnote 13.
¹⁷ We choose units in which $\kappa = 1$ (as well as $c = \hbar = 1$).
¹⁸ See F. J. Belinfante, Physica 6, 887 (1939), and footnote 5.
¹⁹ It is well known in the case of the ordinary metric theory of gravitation that many definitions of the energy pseudotensor are possible. See, for example, P. G. Bergmann, Phys. Rev. 112, 287 (1958).
²⁰ The ${}^0A_{ij\mu}$ are Ricci's coefficients of rotation. See for instance V. Fock, Z. Physik 57, 261 (1929).
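The solution (5.9) can be sanity-checked numerically. The sketch below is illustrative and rests on assumptions labeled here: all indices are treated as lowered local (Latin) indices, the third index of $A_{ijk}$ denotes $A_{ij\mu}h_k{}^\mu$, and the quantity of (5.3) is written as $C_{kij} = c_{kij} + A_{kij} - A_{kji}$, which follows from (5.3) if $b^i{}_{\mu|\nu} = b^i{}_{\mu,\nu} + A^i{}_{k\nu}b^k{}_\mu$. The function names are hypothetical. For a random $c_{kij}$ antisymmetric in its last two indices, the code confirms that the $A_{ijk}$ of (5.9) is antisymmetric in $i, j$ and makes $C_{kij}$ vanish.

```python
import numpy as np


def ricci_rotation(c):
    """A[i, j, k] = (c[k, i, j] - c[i, j, k] - c[j, k, i]) / 2, as in (5.9).

    c[k, i, j] stores c_{kij}, assumed antisymmetric in (i, j).
    """
    return 0.5 * (np.einsum("kij->ijk", c) - c - np.einsum("jki->ijk", c))


def torsion_like(c, A):
    """C[k, i, j] = c[k, i, j] + A[k, i, j] - A[k, j, i] (assumed form of (5.3))."""
    return c + A - np.einsum("kji->kij", A)


def random_c(seed=0):
    """Random c_{kij} with the required antisymmetry in the last two indices."""
    rng = np.random.default_rng(seed)
    raw = rng.standard_normal((4, 4, 4))
    return raw - raw.transpose(0, 2, 1)


if __name__ == "__main__":
    c = random_c()
    A = ricci_rotation(c)
    print(np.abs(A + A.transpose(1, 0, 2)).max())  # antisymmetry of A in (i, j)
    print(np.abs(torsion_like(c, A)).max())        # vanishing of C_{kij}
```

Because the check is purely algebraic, it does not depend on the signature of the flat metric used to lower the indices.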

In general, if we write

$$\mathfrak{S}^{\mu}{}_{ij} = \mathfrak{H}\, h_k{}^{\mu} S^{k}{}_{ij},$$

then

$$A_{ij\mu} = {}^{0}A_{ij\mu} - \tfrac{1}{2}\, b^{k}{}_{\mu}\,(S_{kij} - S_{ijk} - S_{jki} - \eta_{ki} S^{l}{}_{lj} - \eta_{kj} S^{l}{}_{il}). \tag{5.10}$$

If the original Lagrangian $L$ is of first order in the derivatives, then $S^{k}{}_{ij}$ is independent of $A^{ij}{}_{\mu}$, so that (5.10) is an explicit solution. Otherwise, however, $A_{ij\mu}$ also appears on the right-hand side of this equation.

We conclude this section with a discussion of the Lagrangian for the fields $A^{a}{}_{\mu}$ introduced in Sec. 2 when the "gravitational" fields $h_k{}^{\mu}$ and $A^{ij}{}_{\mu}$ are also introduced. The fields $A^{a}{}_{\mu}$ should not be regarded merely as components of $\chi$ when dealing with Lorentz transformations, since one must preserve the invariance under the linear transformations. To find the correct form of the Lagrangian, one should consider simultaneously Lorentz transformations and these linear transformations. This can be done provided that the matrices $T^{a}$ commute with the $S_{ij}$, a condition which is always fulfilled in practice. Then one finds that $\chi_{,k}$ in $L$ should be replaced by a derivative which is covariant under both (2.1) and (4.1), namely,

$$\chi_{;k} = h_k{}^{\mu}\,(\chi_{,\mu} + \tfrac{1}{2} A^{ij}{}_{\mu} S_{ij}\chi + A^{a}{}_{\mu} T_{a}\chi).$$

The commutator $\chi_{;kl} - \chi_{;lk}$ then contains the extra term

$$F^{a}{}_{kl}\, T_{a}\chi,$$

where

$$F^{a}{}_{kl} \equiv h_k{}^{\mu} h_l{}^{\nu} F^{a}{}_{\mu\nu},$$

with $F^{a}{}_{\mu\nu}$ given by (2.9). It is important to notice that the derivatives of $A^{a}{}_{\mu}$ in $F^{a}{}_{\mu\nu}$ are ordinary derivatives, not covariant ones. (We shall see in the next section that the ordinary and covariant curls are not equal, because the affine connection is in general nonsymmetric.) As before, one can see that any invariant function of $A^{a}{}_{\mu}$ must be a function of $F^{a}{}_{kl}$ only, and the simplest free Lagrangian for $A^{a}{}_{\mu}$ is, therefore,

(5.11)

6. GEOMETRICAL INTERPRETATION

Up to this point, we have not given any geometrical significance to the transformations (4.1), or to the new fields $h_k{}^{\mu}$ and $A^{ij}{}_{\mu}$, but it is useful to do so in order to be able to compare the theory with the more familiar metric theory of gravitation.

Now the $\xi^{\mu}$ transformations are general coordinate transformations, and according to (4.11) $h_k{}^{\mu}$ transforms like a contravariant vector under these transformations, while $b^{k}{}_{\mu}$ and $A^{ij}{}_{\mu}$ transform like covariant vectors. Thus the quantity

$$g_{\mu\nu} \equiv b^{k}{}_{\mu}\, b_{k\nu} \tag{6.1}$$

is a symmetric covariant tensor, and may therefore be interpreted as the metric tensor of a Riemannian space. It is moreover invariant under the $\epsilon^{ij}$ transformations. Evidently, the Greek indices may be regarded as world tensor indices, and we must of course abandon for them the convention that all indices are to be raised or lowered with the flat-space metric $\eta_{\mu\nu}$, and use $g_{\mu\nu}$ instead. It is easy to see that the scalar density $\mathfrak{H}$ is equal to $(-g)^{1/2}$, where $g = \det(g_{\mu\nu})$.

Now, in view of the relation (6.1), $h_k{}^{\mu}$ and $b^{k}{}_{\mu}$ are the contravariant and covariant components, respectively, of a vierbein system in the Riemannian space.$^{18}$ Thus the $\epsilon^{ij}$ transformations should be interpreted as vierbein rotations, and the Latin indices as local tensor indices with respect to this vierbein system. The original field $\chi$ may be decomposed into local tensors and spinors,$^{19}$ and from the tensors one can form corresponding world tensors by multiplying by $h_k{}^{\mu}$ or $b^{k}{}_{\mu}$. For example, from a local vector $v^{i}$ one can form

$$v^{\mu} = h_i{}^{\mu} v^{i}, \qquad v_{\mu} = b^{i}{}_{\mu} v_{i}. \tag{6.2}$$

No confusion can be caused by using the same symbol $v$ for the local and world vectors, since they are distinguished by the type of index, and indeed we have already used this convention in (5.2). Note that $v_{\mu} = g_{\mu\nu} v^{\nu}$, so that (6.2) is consistent with the choice of metric (6.1). We shall frequently use this convention of associating world tensors with given local tensors without explicit mention on each occasion.

The field $A^{ij}{}_{\mu}$ may reasonably be called a "local affine connection" with respect to the vierbein system, since it specifies the covariant derivatives of local tensors or spinors.$^{20}$ For a local vector, this takes the form

$$v^{i}{}_{|\nu} = v^{i}{}_{,\nu} + A^{i}{}_{j\nu} v^{j}, \qquad v_{j|\nu} = v_{j,\nu} - A^{i}{}_{j\nu} v_{i}. \tag{6.3}$$

It may be noticed that the relation (4.10) between $\chi_{|\mu}$ and $\chi_{;k}$ is of the same type as (6.2) and could be written simply as

$$\chi_{;\mu} = \chi_{|\mu} \tag{6.4}$$

according to our convention. However, we shall retain the use of two separate symbols because we wish to extend the definition of covariant derivative in a different way to that of Sec. 4. It seems natural to define the covariant derivative of a world tensor in terms of the covariant derivative of the associated local tensor. Thus, for instance, to define the covariant derivatives of the world vectors (6.2) one would form the world tensors corresponding to (6.3). This gives

$$v^{\lambda}{}_{;\nu} \equiv h_i{}^{\lambda} v^{i}{}_{|\nu} = v^{\lambda}{}_{,\nu} + \Gamma^{\lambda}{}_{\mu\nu} v^{\mu},$$

$$v_{\mu;\nu} \equiv b^{i}{}_{\mu} v_{i|\nu} = v_{\mu,\nu} - \Gamma^{\lambda}{}_{\mu\nu} v_{\lambda},$$

where

$$\Gamma^{\lambda}{}_{\mu\nu} = h_i{}^{\lambda}\, b^{i}{}_{\mu|\nu} \equiv -b^{i}{}_{\mu}\, h_i{}^{\lambda}{}_{|\nu}. \tag{6.5}$$

18 See for instance H. Weyl, Z. Physik 56, 330 (1929).
19 H. J. Belinfante, Physica 7, 305 (1940).
20 Compare J. A. Schouten, J. Math. and Phys. 10, 239 (1931).

Note that this definition of $\Gamma^{\lambda}{}_{\mu\nu}$ is equivalent to the

LORENTZ INVARIANCE A N D THE GRAVITATIONAL F I E L D 219

requirement that the covariant derivatives of the therefore equal t o the Christoffel connection OP,,.
vierbein components should vanish, (This is the analog for world tensors of OA 'j,.) Then
is symmetric, and Eq. (6.12) yields Einstein's familiar
hi'; +O, bi,:+0. (6.6) equations for empty space,
For a generic quantity a transforming according to
Zu=0.
6a= #Ciisij(y+ oJApU, (6.7)
However, when matter is present, P,,is"no longer
the covariant derivative is defined by2l symmetric, and its antisymmetric part is given by
(6.13). Then the tensor 4, is also nonsymmetric, and
,+
u:,,=a, $A 'jSip+r A , A @ a , (6.8) correspondingly the energy tensor density Zkuis in
whereas the e covariant derivative defined in Sec. 4 is general nonsymmetric, because hrfi does not appear in
obtained by simply omitting the last term of (6.8). I only through the symmetric combination gr'. Thus
Note that the two derivatives are equal for purely local the theory M e r s slightly from the usual one, in a way
tensors or spinors, but not otherwise. One easily finds first nofed by W e ~ lIn . ~the following section, we shall
that the commutator of two covariant differentiations investigate this difference in more detail.6
is given by Finally, we can rewrite the covariant conservation
laws in terms of world tensors. It is convenient to define
a:,v-a; ~Rii,Sijcu+RPu,4puU-CA,~;A, the contraction
where Rpu,,, and CAB.are defined in the usual way in CfiGC'pA,
terms of Rijrv and C$t. They are both world tensors,
and can easily be expressed in terms of Pry, in the formL4 since the covariant divergence of a vector density f'
is then
~~~,.=r~~,.~-r~.,.,-r~~ , r ~ . , + r ~ , r ~ ~ ~P;,=
(6.9) , f'.,+C,ffi. (6.14)
c',;,= rA,,-rAuP (6.10)
The conservation laws become
Thus one sees that RP.,,, is just the Riemann tensor
formed from the affine connection I'A,,. %',;,-C,%u,,+Ca,~u~= )Rpu,u@w,
From (6.6) it follows that SFw; , -c,ew=2w-?.p.
P =0, (6.11) I t may be noticed that these are slightly more com-
plicated than the expressions in terms of the a covariant
so that it is consistent to interpret FAfiu
as an &ne con- derivative.
nection in the Riemannian space. However, the de-
finition (6.5) evidently does not guarantee that it is 7. COMPARISON WITH METRIC THEORY
symmetric, so that in general it is not the Christoffel
connection. The curvature scalar R has the usual form For simplicity, we shall assume in this section that L
is only of first order in the derivatives, so that (5.10)
R- R*,, RNu=Ra&up is an explicit solution for A'S. The difference between
so that the free gravitational Lagrangian is just the the theory presented here and the usual one arises
usual one except for the nonsymmetry of It should because we are using a Lagrangian l o of first order, in
be remarked that it would be incorrect to treat the 64 which h p and A", are independent variables. The situ-
components of rA,,as independent variables, since ation is entirely analogous to that which obtains for
there are only 24 components of Aij,,. In fact the Yap, any theory with "derivative" interaction. In first-order
are restricted by the 40 identities (6.11). Thus there is form, the "momenta" Aij, are not just equal to deri-
no contradiction with the well-known fact that the vatives of the "coordinates" h p , or in other words to
first-order Palatini Lagrangian with nonsymmetric Far, OAij,. Thus an interaction which appears simple in
does not yield (6.11) as equations of motion."6 first-order form will be more complicated if a second-
The equations of motion (5.5) and (5.6) can be order Lagrangian is used, and vice versa.
rewritten in the form The second-order form of the Lagrangian may be
obtained by substituting for Aij, the expression (5.10).
@(&I"- 3g,P) = -%I", (6.12) This gives
@Ca,,= 6',tu-#S',SppU- i&A.SP,p. (6.13) I'=?$+Bo+'I,
From Eqs. (6.10) and (6.13) one sees that in the absence where O$ and O v a are obtained from and by replacing
of matter the a!ine connection P,, is symmetric, and A'S by OAij,, (or equivalently rA,,by and l? is
UThis is a generalization to nonsymmetric affinities of the an additional term quadratic in Skij, namely,
result proved in the appendix to footnote 3. See also footnotes 4
and 5. ' ~ = Q ~ ( 2 S i j k S j L i - S i j ~ j r + 2 S i i ~ (7.1)
jj~).
f l See for instance E. Schradinger, Space-time Structwe (Cam-
bridge University Press,New York, 1950). In this Lagrangian, only h k p and x are treated as inde-

pendent variables. The equations of motion are equi- another, but there are plausible arguments for a par-
valent to those previously obtained if the variables Aj,, ticular choice.
are eliminated from the latter by using (5.10). The most obvious criterion would be to require that
The usual metric theory, on the other hand, is given the Lagrangian should be written in the symmetrized
by the Lagrangian fkst-order form suggested by SchwingerFo which in the
case of the scalar field discussed in Sec. 4 is
V =OI!+OI!o,
without the extra terms (7.1). If this Lagrangian were L=B(L1+L2).
written in a first-order form by introducing additional This corresponds to treating (p and r kon a symmetrical
independent variables Aj,,, then one would arrive at a footing. However, this may not in fact be the correct
form identical to the one given here except for the choice, because for some purposes (o and r kshould not
appearance of extra terms equal to (7.1) with a negative be treated in this way. In fact, the two L a g r a n g h
sign. differ in one important respect: 21 is independent of
Thus we see that the only merence betweenthe two A$, whereas v2 is not. Correspondingly, for L1 the
theories is the presence or absence of these direct- quantity S k i j vanishes, whereas for L2 one finds
interaction terms. Now if we had not set K= 1,then &
would have a factor K - ~ , whereas the terms (7.1) would s i j = (6% j--bk>ri) $0.

appear with the factor K. They are, therefore, extremely The conservation laws in the two cases are of course the
small in comparison to other interaction terms. In par- same, because the quantities Tki also differ. Now the
ticular, for a Dirac field, they would be proportional to tensor S k i j has often been interpreted as the spin
(see Appendix) density,8 so that the two cases m e r with regard to the
&k?K$h%K#.
separation of the total angular momentum into orbital
and spin terms. The scalar field is normally regarded as
Thus they are similar in form to the Fermi interaction a field of spinless particles, so that one would naturally
terms, but much smaller in magnitude, so that it seems expect Skijto vanish. This, therefore,furnishesapossible
impossible that they would lead to any observable criterion, which would select L1 rather than Lt. With
difference between the predictions of the two theories. this choice, a preferred position is assigned to the
Hence we must conclude that for all practical purposes wave function (o rather than the momenta r k J and
the theory presented here is equivalent to the usual one. the derivatives are written on (o only. In this way one
achieves a vanishing spin tensor, because the matrices
ACKNOWLEDGMENTS Sij are zero for the scalar field (o, but not for the vector
The author is indebted to Drs. J. L. Anderson, P. W. uk.It may be noticed that LI is automatically selected
Higgs, and D. W. Sciama for helpful discussions and if one writes the Lagrangian in its second-order form
comments. in terms of (o only:
APPENDIX L1= +q*k(ok-fm=$,
In this appendix we shall discuss the remaining which yields the modified Lagrangian
ambiguity in the modified Lagrangian. I t was pointed
out in Sec. 4 that the generally covariant Lagrangians 21= 44 (g(o.r(o.u -mzf) J

obtained from two equivalent Lagrangians Ll and Lz equivalent to !&.2 This should be contrasted with the
are in general inequivalent. One can now see that in fact second-order form of 1 2 , which is
they differ by a covariant divergence. Thus (4.14) can
be written in the form
and clearly differs from & by a covariant divergence.
This seems to be a resonable criterion, but the argu-
but in view of (6.14) this is not equal to the ordinary ments for it cannot be regarded as conclusive. For,
divergence. It is clear that quite generally changing L although it is true that the spin tensor obtained from
by a divergence must change 9 by the covariant di- Lz is nonzero, it is still true that the three space-space
vergence of a quantity which is a vector density under components of the total spin
coordinate transformations, and invariant under all
other transformations. This is the reason for the dif-
ference between this case and that of the linear trans-
formations of Sec. 2.
8ij=
S
dGSOij

We now wish to investigate the possibility of choosing are zero. Thus L1 and LZdiffer only in the values of the
a criterion which will select a particular form of L, and
36 J. Schwinger, Phys.Rev. 91, 713 (1953).
thus specify 2 completely. There does not seem to be Here SIis a linearization of $I in the sense of T. W.B.
any really compelling reason for one choice rather than Kibble and J. C. Polkinghome, Nuovo amento 8, 74 (1958).

spin part of the (O i) components of angular momentum. are of course tensors). In fact, (A.l) with m=O would
Indeed, one easily sees that it is true in general that not be gauge invariant. The reason for the difference
adding a divergence to L will change only the (a) is that ai is here treated simply as a component of x ,
components of Sij. Since it is not at all clear what sig- whereas A,, is introduced along with the gravitational
nificance should be attached to the separation of these variables to ensure gauge invariance.2B
components into orbital and spin terms, it might For a spinor field +, symmetry between and +
be questioned whether one should expect the spin appears to demand that one should choose the sym-
terms to vanish even for a spinless particle. Even so, metrized Lagrangian
the choice of LI seems in this case to be the most reason-
able. L= 3 (& %,k- $,kiY$l) -m&,
For a field of spin 1, the correspondingchoice would be which yields the spin density
L1= -3fii(ar,j-aj,i)+ff~fij+3m2a,~i,
which is again equivalent to the choice of the second- Since the Lagrangian must be Hermitian, one could
order Lagrangian in terms of a; only. I t yields not write the derivative on 9 alone. There remains,
however, another possible choice : We could introduce
a distinction between the left- and right-handed com-
which is a reasonable definition of the spin density?* ponents, ~ ) ~ = ) ( l f i - y s ) + , treating one of them line (p
The modided Lagrangian may be expressed in terms of and the other like zk.This gives the Lagrangian
the world vector a,, as
e= - f ~ g ~ ~ g ~ u , , : . - u a , , , ~ ( a p : . - - a , , , ~
This form of Lagrangian may seem rather u n n a t d ,
+f@m2gpw,. (A.1)
but it should be mentioned because there are other
It should be noticed that the electromagnetic Lagrangan grounds for treating ++ and +-
on a nonsymmetrical
is not obtained simply by putting m=O in (A.1). The footing.
dierence is that the derivatives in (A.l) are covariant
*This has the rather strange consequence that for the electro-
derivatives, and since r:, is nonsymmetric the covari- magnetic field the spin tensor SkCf vanishes, since the Lagran-
ant curl is not equal to the ordinary curl (though both gian is inde ndent of A,,.
See R.
(1958).
k Feynman and M. Gell-Man, Phys. Rev. 109,193
Compare footnote 18.
Chapter 5

Accelerated Frames:

Generalizing the Lorentz Transformations*

*C. Møller, T. Fulton, R. Rohrlich, L. Witten, T. Y. Wu, Y. C. Lee, J. P. Hsu, L. Hsu, D. T. Schmitt, T. Kleinschmidt

DET KGL. DANSKE VIDENSKABERNES SELSKAB


MATEMATISK-FYSISKE MEDDELELSER, BIND XX, Nr. 19

ON HOMOGENEOUS
GRAVITATIONAL FIELDS IN THE
GENERAL THEORY OF RELATIVITY
AND THE CLOCK PARADOX
BY

C. MØLLER

KØBENHAVN
I KOMMISSION HOS EJNAR MUNKSGAARD
1943

1. Introduction and statement of the problem.


When the behaviour of clocks is treated according to the principles of the special theory of relativity, without making due allowance for the principles of the general theory, a well-known paradox can arise, which was already mentioned in EINSTEIN's original paper 1) and later was discussed in detail by LANGEVIN 2), LAUE 3), and LORENTZ 4). With a slight simplification of the usual representation*, the problem may be stated as follows. Consider two identically constructed clocks, $C_1$ and $C_2$, one of which, say $C_1$, is permanently situated at rest at a point $A$ on the positive $X$-axis of a definite Lorentz frame of reference $K$, while $C_2$ is moving with constant velocity $-v$ in the direction of the $X$-axis (see Fig. 1). At the moment of coincidence between $C_1$ and $C_2$, the readings of the two clocks are compared. After having travelled with constant velocity for a long time, $C_2$ for a short time is attacked by a constant force $F$ which brings it to rest at the origin $O$ of $K$ and starts it back to $A$ with reversed velocity $v$. At the moment of the second encounter, the clocks $C_1$ and $C_2$ are compared again. Let $\Delta t_1$ and $\Delta t_2$ denote the measurements on the two clocks of the time elapsed between the two encounters. Now, assuming that the force $F$ is so large that the time during which $C_2$ is accelerated is negligible compared with the time of travel at the constant velocity $v$, we have, according to the special theory of relativity, the formula**

$$\Delta t_2 = \Delta t_1 \sqrt{1 - v^2} \tag{1}$$

which shows that $C_2$ will register a smaller number of divisions than $C_1$ at the end of the indicated experiment.

* Usually, the two clocks are assumed to be initially and finally at rest, which necessitates the further introduction of a force at the beginning and at the end of the experiment.
** Throughout this paper, we shall use a time unit which makes the velocity of light equal to unity. The transition to ordinary units is then performed by replacing in our formulae all time variables $t$, velocities $v$, accelerations $g$, and gravitational potentials $\Phi$ by $ct$, $v/c$, $g/c^2$, $\Phi/c^2$, respectively, where $c$ is the velocity of light in ordinary units.

The paradox in question now arises, if we introduce a frame of reference $k$ moving together with $C_2$ in such a way that $C_2$ is permanently situated at the origin of $k$. Since the motion of $C_1$ with respect to $k$ then is similar to the motion of $C_2$ with respect to $K$, it seems that an observer in $k$ should arrive at the conclusion that $\Delta t_1$ must be smaller than $\Delta t_2$ and must be given by the formula

$$\Delta t_1 = \Delta t_2 \sqrt{1 - v^2} \tag{2}$$
in contradiction to (1). In the papers quoted above, it was pointed out, however, that the equation

$$d\tau = dt \sqrt{1 - v^2} \tag{3}$$

connecting the proper time $d\tau$ of a clock moving with the velocity $v$ in a given system of reference with the time $dt$ of this system is valid only if the frame of reference is a system of inertia like $K$. The application of (3) in $K$ thus leads to the correct formula (1), while the application of (3) in $k$ which leads to formula (2) is not justified, since $k$ is accelerated in the middle of the experiment and, therefore, does not constitute a simple system of inertia during this interval.

In the space-time continuum introduced by MINKOWSKI, the two events marked by the first and second encounters of the clocks are represented by two points connected by the world lines of $C_1$ and $C_2$, of which the first mentioned is a straight line. Since the lengths of these world lines, on account of (3), are proportional to the proper times $\Delta t_1$ and $\Delta t_2$ of the two clocks, the statement expressed by (1) may be considered a special case of the general statement that a straight line connecting two points in Minkowski space is of greater length than any other curve (of everywhere time-like character) connecting the two points.
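As a small numerical illustration of formulas (1) and (3), the following sketch compares the readings of the two clocks for a chosen speed. The function names and the sample values ($v = 0.6$, $\Delta t_1 = 10$) are illustrative assumptions and do not come from the paper; units with $c = 1$ are used, and the short acceleration phase is neglected as in (1).

```python
import math

def proper_time(coordinate_time, v):
    """Proper time of a clock moving at constant speed v (c = 1), from Eq. (3)."""
    if abs(v) >= 1.0:
        raise ValueError("speed must satisfy |v| < 1 in units with c = 1")
    return coordinate_time * math.sqrt(1.0 - v * v)

def clock_readings(delta_t1, v):
    """Return (dt1, dt2) for the round trip, neglecting the acceleration phase.

    dt1 is the time shown by the resting clock C1; dt2 is the time shown by
    C2, which travels at speed v on both legs (outbound and return).
    """
    half = 0.5 * delta_t1
    delta_t2 = proper_time(half, -v) + proper_time(half, v)
    return delta_t1, delta_t2

# Hypothetical sample values chosen only for illustration.
dt1, dt2 = clock_readings(10.0, 0.6)
print(dt1, dt2)  # the travelling clock C2 shows the smaller elapsed time
```

The broken world line of $C_2$ is treated here as two straight segments, which is the same idealisation the text uses when it neglects the duration of the force $F$.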
Thus, it was clear that the discussion of the indicated experiment could not lead to any difficulties for the special theory of relativity, since this theory does not make any statement at all regarding the behaviour of clocks in accelerated systems like $k$. The paradox arose again, however, in the general theory of relativity, according to which a treatment of the behaviour of $C_1$, from the point of view of an observer in $k$, must be possible. Neglecting the short interval during which $k$ is no system of inertia, we then find again the formula (2) for the time increase of $C_1$ measured with the time scale of $k$ and, at first sight, it is difficult to understand how it is possible to account for the difference between (2) and (1) by consideration of the short interval in which $k$ is accelerated. The whole question was clarified by EINSTEIN 5) who pointed out that, during this interval, the distant masses of the universe are accelerated relative to $k$, and thus temporarily create a gravitational field which influences the time rates of the clocks in such a way that the total time increase of $C_1$ measured in the time scale of $k$ is again given by (1).
In his paper just quoted, EINSTEIN did not give any explicit calculations, but it is clear beforehand that the result of a calculation must be as stated above. In fact, since $\Delta t_1$ and $\Delta t_2$ are proportional to the lengths of the world lines of $C_1$ and $C_2$ and these lengths, according to the basic assumptions of the general theory of relativity, are independent of the space-time coordinates used in their evaluation, it is obvious that we shall get the same value for $\Delta t_1 / \Delta t_2$ whether the calculation is performed in $K$ or in $k$. Nevertheless, it is instructive to calculate directly the time increase of $C_1$ during the existence of the gravitational field in $k$. For small values of $v$, this has been done by TOLMAN 6) who assumed that terms in $v$ higher than the second can be neglected. In order to account for the lack of symmetry between the treatment given to the clock $C_1$, which was at no time subjected to any force, and that given to the clock $C_2$, which was subjected to the force $F$ in the middle of the experiment, TOLMAN introduces a temporary homogeneous gravitational field in the description where $C_1$ is taken as the moving clock and $C_2$ as the one which remains at rest. This gravitational field is allowed to act on $C_1$ and $C_2$ in such a way as to produce the desired change in velocity of $C_1$, while $C_2$ remains at rest on account of the force $F$. By means of the well-known formula for the relative rates of two clocks situated at points of different potential in a weak static gravitational field, TOLMAN then finds for the total increase in time of $C_1$ and $C_2$ during the considered experiment the relation

(4)

which, for small $v$, is in accordance with (1).


Apart from the restriction to the case of small $v$, this treatment does not seem to us to be complete, since it remains to be shown that the transformation from $K$ to the accelerated system $k$ leads to a system of space-time coordinates in which the components of the metrical tensor are constant in time and are of the form corresponding to the gravitational field explicitly introduced by TOLMAN. In the present paper, we shall investigate this point more closely and without making the assumption of a small velocity $v$. It is shown that the accelerated frame of reference $k$ may be defined in such a way that the gravitational field in $k$ is static in the sense of the general theory of relativity. The equations by which the space-time coordinates of $k$ are expressed as functions of the coordinates of the system $K$ during the whole experiment are explicitly written down. By means of these equations, the behaviour of the clocks $C_1$ and $C_2$ may easily be treated from the alternative standpoints of the observers in the two systems $K$ and $k$, thus leading to a complete solution of the clock paradox.

2. Uniformly accelerated frames of reference and homogeneous gravitational fields.

In a general discussion of the clock paradox, we need a formula connecting the space-time coordinates $X$, $Y$, $Z$, and $T$ of a Lorentz frame $K$ with the coordinates $x$, $y$, $z$, and $t$ of a uniformly accelerated frame of reference $k$. If the direction of acceleration is chosen as $x$-axis, the desired transformation must have the form

$$x = f(X, T), \quad y = Y, \quad z = Z, \quad t = h(X, T), \tag{5}$$

$f$ and $h$ being functions of $X$ and $T$, only.
Taking for $f$ and $h$ the expressions

$$f = X - \tfrac{1}{2} g T^2, \quad h = T, \tag{6}$$

where $g$ is a constant, (5) represents the ordinary transformation to accelerated axes which, at least for small velocities, might be regarded as a reasonable change of coordinates. A free particle in $k$ has then a constant acceleration $-g$, just like a particle in a constant Newtonian field of gravitation. The gravitational field in $k$, however, is not static in the sense of the general theory of relativity, since the components of the metrical tensor are varying with $t$.

In fact, introducing (5) and (6) into the expression

$$ds^2 = dX^2 + dY^2 + dZ^2 - dT^2 \tag{7}$$

for the line element in Minkowski space, we get

$$ds^2 = dx^2 + dy^2 + dz^2 + 2gt\,dx\,dt - dt^2\,(1 - g^2 t^2), \tag{8}$$

i. e. the non-vanishing components of the metrical tensor defined by the general expression*

$$ds^2 = g_{ik}\,dx^i\,dx^k, \quad (x^i) = (x, y, z, t) \tag{9}$$

are

$$g_{11} = g_{22} = g_{33} = 1, \quad g_{14} = g_{41} = gt, \quad g_{44} = -(1 - g^2 t^2). \tag{10}$$

Even the geometry in physical space defined by the three-dimensional line element

(11)

is seen to vary with $t$. From (10) and (11), we get

$$d\sigma^2 = \frac{dx^2}{1 - g^2 t^2} + dy^2 + dz^2 \tag{12}$$

in accordance with the fact that the measuring rods in $k$ are subjected to a Lorentz contraction.

* Here, the usual convention is made regarding the summation over dummy indices from 1 to 4.
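The step from (6) and (7) to the metric components (10) can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names and sample values are hypothetical, the $y$ and $z$ directions are omitted because they are unchanged, and the spatial component is formed with the standard expression $\gamma_{11} = g_{11} - g_{14}^2 / g_{44}$, which is assumed here rather than quoted from the paper.

```python
def inertial_from_accelerated(x, t, g):
    """Invert Eq. (6): X = x + g*t**2/2, T = t (y and z are unchanged)."""
    return x + 0.5 * g * t * t, t

def metric_in_k(x, t, g, h=1e-4):
    """Return (g11, g14, g44) of ds^2 = dX^2 - dT^2 rewritten in (x, t).

    The partial derivatives of (X, T) are taken by central differences,
    which are exact up to rounding for this quadratic map.
    """
    Xp, Tp = inertial_from_accelerated(x + h, t, g)
    Xm, Tm = inertial_from_accelerated(x - h, t, g)
    X_x, T_x = (Xp - Xm) / (2 * h), (Tp - Tm) / (2 * h)
    Xp, Tp = inertial_from_accelerated(x, t + h, g)
    Xm, Tm = inertial_from_accelerated(x, t - h, g)
    X_t, T_t = (Xp - Xm) / (2 * h), (Tp - Tm) / (2 * h)
    g11 = X_x * X_x - T_x * T_x
    g14 = X_x * X_t - T_x * T_t
    g44 = X_t * X_t - T_t * T_t
    return g11, g14, g44

def spatial_g11(g11, g14, g44):
    """Spatial metric component g11 - g14**2 / g44 (assumed standard form)."""
    return g11 - g14 * g14 / g44

# Hypothetical sample point and acceleration, with g*t < 1.
g, x, t = 0.3, 1.2, 0.5
g11, g14, g44 = metric_in_k(x, t, g)
print(g11, g14, g44)               # compare with Eq. (10)
print(spatial_g11(g11, g14, g44))  # compare with 1 / (1 - g^2 t^2) in Eq. (12)
```

Because $g_{14}$ and $g_{44}$ depend on $t$, the same check at a different $t$ gives different components, which is the time dependence the text objects to.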
The gravitational field in the frame of reference defined by (6) has, therefore, not much resemblance with the gravitational fields assumed in the previous discussions of the clock paradox. Our first task will be, if possible, to choose the functions $f$ and $h$ in (5) in such a way that the gravitational field in $k$ is static. The expression for the element of interval in the new coordinates will then be of the form

$$ds^2 = A\,dx^2 + dy^2 + dz^2 - D\,dt^2, \tag{13}$$

where $A$ and $D$ are functions of $x$, only. This expression may be further simplified by taking as coordinate $\int \sqrt{A}\,dx$ instead of $x$ so that the line element takes the form

$$ds^2 = dx^2 + dy^2 + dz^2 - D\,dt^2. \tag{13'}$$
If the desired transformation is at all possible, the functions $g_{ik}$ defined by (9) and (13') must satisfy EINSTEIN's field equations for an empty space

$$G_i^k \equiv R_i^k - \tfrac{1}{2}\,\delta_i^k R = 0, \tag{14}$$

where $R_i^k$ is the contracted Riemann-Christoffel tensor, and $R \equiv R_i^i$ is obtained from $R_i^k$ by further contraction. The components of $G_i^k$ have been calculated by DINGLE 7) for a general line element of the form

$$ds^2 = A\,(dx^1)^2 + B\,(dx^2)^2 + C\,(dx^3)^2 - D\,(dx^4)^2 \tag{15}$$

with $A$, $B$, $C$, and $D$ being any functions of the coordinates. Using DINGLE's formula in the special case of (13'), we get

where the accents indicate differentiation with respect to $x$, and all other components $G_i^k$ vanish identically.

The equations (14) thus reduce to the single equation

$$(D^{1/2})'' = 0$$

with the general solution

$$D = a\,(1 + gx)^2 \tag{16}$$

containing two arbitrary constants, $a$ and $g$.
By adequate choice of the time variable, the constant $a$ may be made equal to one, giving for the line element (13') the expression

$$ds^2 = dx^2 + dy^2 + dz^2 - (1 + gx)^2\,dt^2. \tag{17}$$

The functions $g_{ik}$ defined by (9) and (17), which were found as solutions of the equations (14), may now by a simple calculation be shown also to satisfy the more strict conditions

$$R^i_{klm} = 0, \tag{18}$$

where $R^i_{klm}$ is the uncontracted Riemann-Christoffel tensor. This means that the geometry in the space-time continuum corresponding to (17) is pseudo-Euclidean and that the line element (17) may be brought into the simple form (7) by a suitable transformation of the type (5). Apart from an arbitrary Lorentz transformation, which does not change the form (7), this transformation is uniquely determined.
Before we write down explicitly this transformation, which inversely gives the transition from an inertial system $K$ to the desired frame of reference $k$, we note that the gravitational field in $k$, according to (17), is uniform in that part of the space for which $gx$ is a small quantity. In fact, neglecting all terms of higher order in $gx$ than the first, we have

$$g_{44} = -1 - 2gx, \tag{19}$$

and the Newtonian gravitational potential $\Phi_w$, which, in the case of "weak" fields, is defined by the equation ")

$$g_{44} = -1 - 2\Phi_w, \tag{20}$$

has therefore the simple form

$$\Phi_w = gx. \tag{21}$$

The line element (17) has, however, well defined physical consequences for large values of $gx$ also, so that the gravitational field defined by (17) is a generalization of the "weak" uniform field postulated in previous discussions of the clock paradox. The only necessary restriction regarding the values of $x$ is the condition $x > -\frac{1}{g}$.

The geometry of physical space in $k$ is Euclidean, $x$, $y$, and $z$ being Cartesian coordinates. The time variable $t$ is the time measured by a standard clock situated at rest at the origin $x = 0$. The increase of time $d\tau$ of a standard clock situated at any other place is given by the formula

$$d\tau = \sqrt{-g_{44}}\,dt = (1 + gx)\,dt, \tag{22}$$

$d\tau$ thus being zero in the singular plane $x = -\frac{1}{g}$.
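Turning now to the explicit derivation of the transformation connecting the space-time variables of the two systems $K$ and $k$, we start with the system $k$ and try to find a transformation by which the gravitational field of $k$ is transformed away. This may be effected by introduction of a frame of reference consisting of material points which are allowed to fall freely in the gravitational field of $k$. The world line of a free particle is a geodesic given by the equations

$$\frac{d^2 x^i}{d\tau^2} + \Gamma^i_{kl}\,\frac{dx^k}{d\tau}\,\frac{dx^l}{d\tau} = 0, \tag{23}$$

where $d\tau = \frac{1}{i}\,ds$ is the proper time of the particle and $\Gamma^i_{kl}$ denote the ordinary Christoffel three index symbols. The values of $\Gamma^i_{kl}$ in the case of (17) may also be taken from DINGLE's paper 7), and we get

$$\Gamma^1_{44} = g\,(1 + gx), \quad \Gamma^4_{14} = \Gamma^4_{41} = \frac{g}{1 + gx},$$

all other components being zero.

The equations (23) with $i = 1, 2, 3$ are then simply

$$\frac{d^2 x}{d\tau^2} + g\,(1 + gx)\left(\frac{dt}{d\tau}\right)^2 = 0, \quad \frac{d^2 y}{d\tau^2} = \frac{d^2 z}{d\tau^2} = 0, \tag{24}$$

and from (17) we get, as a first integral of (23),

$$(1 + gx)^2\left(\frac{dt}{d\tau}\right)^2 - \left(\frac{dx}{d\tau}\right)^2 - \left(\frac{dy}{d\tau}\right)^2 - \left(\frac{dz}{d\tau}\right)^2 = 1. \tag{25}$$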

For a particle at rest (or small velocity), (24) reduces to

$$\frac{d^2 x}{dt^2} = -g\,(1 + gx), \quad \frac{d^2 y}{dt^2} = \frac{d^2 z}{dt^2} = 0.$$

If the gravitational potential $\Phi$ is defined by the equation

$$\frac{d^2 \vec{x}}{dt^2} = -\operatorname{grad} \Phi, \quad \vec{x} = (x, y, z),$$

we thus get

$$\Phi = gx + \tfrac{1}{2}\,g^2 x^2, \tag{26}$$

an expression which may be regarded as the generalization of (21) to the case of strong fields. The equation (20) is seen to hold also in this case, since we get from (17) and (26)

$$g_{44} = -(1 + 2\Phi). \tag{27}$$

Finally, (22) may be written

$$d\tau = \sqrt{1 + 2\Phi}\,dt \tag{28}$$

which, for small $\Phi$, reduces to the well-known formula 6) for weak fields

$$d\tau = (1 + \Phi_w)\,dt. \tag{29}$$
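The relations between the potential (26), the metric component (27) and the clock rate (22), (28) can be confirmed with a few lines of arithmetic. This is a sketch only: the function names and the sample values of $g$ and $x$ are assumptions made for illustration.

```python
import math

def potential(x, g):
    """Gravitational potential of Eq. (26): Phi = g*x + (g*x)**2 / 2."""
    return g * x + 0.5 * (g * x) ** 2

def g44(x, g):
    """Time-time component of the line element (17)."""
    return -((1.0 + g * x) ** 2)

def clock_rate(x, g):
    """d(tau)/dt for a clock at rest at x, Eq. (22); meaningful for x > -1/g."""
    return 1.0 + g * x

def acceleration_at_rest(x, g, h=1e-4):
    """-dPhi/dx by central difference, to compare with -g*(1 + g*x)."""
    return -(potential(x + h, g) - potential(x - h, g)) / (2 * h)

# Hypothetical sample values.
g, x = 0.4, 2.5
print(g44(x, g), -(1.0 + 2.0 * potential(x, g)))                  # Eq. (27)
print(clock_rate(x, g), math.sqrt(1.0 + 2.0 * potential(x, g)))   # Eq. (28)
print(acceleration_at_rest(x, g), -g * (1.0 + g * x))
```

The agreement is exact rather than approximate, because $1 + 2\Phi = (1 + gx)^2$ identically.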

Returning now to the general equations (24) and (25), we see that the motion of the particle in the directions of the $y$- and $z$-axes is uniform if the proper time $\tau$ is used as time scale. We are here only interested in the case where the velocities are zero at $t = \tau = 0$, so that we have the solutions

$$y = y_0, \quad z = z_0, \tag{30}$$

$y_0$ and $z_0$ being the initial values of $y$ and $z$.

From (25) and (24) we then get

$$\frac{dt}{d\tau} = \frac{\sqrt{1 + \left(\frac{dx}{d\tau}\right)^2}}{1 + gx} \tag{31}$$

and

$$\frac{d^2 x}{d\tau^2} = -\,\frac{g\left[1 + \left(\frac{dx}{d\tau}\right)^2\right]}{1 + gx}$$

which may also be written

$$\frac{d^2}{d\tau^2}\,(1 + gx)^2 = -2g^2. \tag{32}$$

When the initial velocity is zero and $x_0$ denotes the initial value of $x$, we get by integration of (32)

$$x = \frac{1}{g}\left\{\sqrt{(1 + gx_0)^2 - g^2\tau^2} - 1\right\}. \tag{33}$$

Introduction of (33) into (31) gives

$$\frac{dt}{d\tau} = \frac{1 + gx_0}{(1 + gx_0)^2 - g^2\tau^2}$$

which by integration yields

$$t = \frac{1}{2g}\,\ln\frac{1 + gx_0 + g\tau}{1 + gx_0 - g\tau}. \tag{34}$$

From (33) and (34) it follows that a free particle initially at rest at some point in $k$ will move with increasing velocity in the direction of the negative $x$-axis, later the velocity will decrease and, finally, the particle will come to rest again in the singular plane $x = -\frac{1}{g}$ at the time $t = \infty$ or $\tau = x_0 + \frac{1}{g}$.
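The free-fall solution (33), (34) can be tested against the first integral (25) by differentiating it numerically. The sketch below uses hypothetical function names and sample values; motion is restricted to the $x$-direction, as in the text.

```python
import math

def x_of_tau(tau, x0, g):
    """Eq. (33): position of a particle released at rest from x0 at tau = 0."""
    return (math.sqrt((1.0 + g * x0) ** 2 - (g * tau) ** 2) - 1.0) / g

def t_of_tau(tau, x0, g):
    """Eq. (34): coordinate time t in k as a function of the proper time tau."""
    a = 1.0 + g * x0
    return math.log((a + g * tau) / (a - g * tau)) / (2.0 * g)

def first_integral(tau, x0, g, h=1e-5):
    """(1 + g x)^2 (dt/dtau)^2 - (dx/dtau)^2, which should equal 1 by Eq. (25)."""
    dx = (x_of_tau(tau + h, x0, g) - x_of_tau(tau - h, x0, g)) / (2 * h)
    dt = (t_of_tau(tau + h, x0, g) - t_of_tau(tau - h, x0, g)) / (2 * h)
    return (1.0 + g * x_of_tau(tau, x0, g)) ** 2 * dt * dt - dx * dx

# Hypothetical sample values.
g, x0 = 0.5, 1.0
tau_end = x0 + 1.0 / g  # proper time at which the plane x = -1/g is reached
for tau in (0.5, 1.0, 2.0):
    print(tau, x_of_tau(tau, x0, g), t_of_tau(tau, x0, g), first_integral(tau, x0, g))
print(x_of_tau(tau_end, x0, g))  # should be -1/g
```

As $\tau$ approaches `tau_end` the coordinate time from (34) grows without bound, which matches the statement that the singular plane is reached only at $t = \infty$.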
We now get the desired transformation, if we put $x_0 = X$, $y_0 = Y$, $z_0 = Z$, and $\tau = T$ in the equations (30), (33), and (34), $X$, $Y$, $Z$, $T$ then being the space-time coordinates of a freely falling frame of reference $K$ which, at the time $t = T = 0$, coincides with the system $k$. In this way, we get

$$x = \frac{1}{g}\left\{\sqrt{(1 + gX)^2 - g^2 T^2} - 1\right\}, \quad y = Y, \quad z = Z, \quad t = \frac{1}{2g}\,\ln\frac{1 + gX + gT}{1 + gX - gT}. \tag{35}$$

By a simple calculation, it may be verified that the line element (17) is really brought into the form (7) by the transformation (35), showing that the system $K$ is actually a system of inertia.
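The "simple calculation" mentioned above can be mirrored numerically: express the line element (17) through $(X, T)$ by means of (35) and check that the result has the Minkowski form (7). The code is a sketch under stated assumptions; the function names, step size and sample event are hypothetical, and only the $(x, t)$ part is treated since $y = Y$, $z = Z$.

```python
import math

def accelerated_from_inertial(X, T, g):
    """Eq. (35): coordinates (x, t) in k for an event (X, T) in K."""
    u = 1.0 + g * X
    if not abs(g * T) < u:
        raise ValueError("event lies outside the region allowed by Eq. (38)")
    x = (math.sqrt(u * u - (g * T) ** 2) - 1.0) / g
    t = math.log((u + g * T) / (u - g * T)) / (2.0 * g)
    return x, t

def pulled_back_metric(X, T, g, h=1e-5):
    """Components (G_XX, G_XT, G_TT) of the line element (17) in terms of (X, T)."""
    x, _ = accelerated_from_inertial(X, T, g)
    xp, tp = accelerated_from_inertial(X + h, T, g)
    xm, tm = accelerated_from_inertial(X - h, T, g)
    x_X, t_X = (xp - xm) / (2 * h), (tp - tm) / (2 * h)
    xp, tp = accelerated_from_inertial(X, T + h, g)
    xm, tm = accelerated_from_inertial(X, T - h, g)
    x_T, t_T = (xp - xm) / (2 * h), (tp - tm) / (2 * h)
    D = (1.0 + g * x) ** 2  # coefficient of -dt^2 in Eq. (17)
    return (x_X ** 2 - D * t_X ** 2,
            x_X * x_T - D * t_X * t_T,
            x_T ** 2 - D * t_T ** 2)

# Hypothetical sample event satisfying Eq. (38).
print(pulled_back_metric(0.7, 0.4, 1.3))  # expected to be close to (1, 0, -1)
```

The guard in `accelerated_from_inertial` encodes the reality condition on $x$ and $t$, so events outside the wedge covered by $k$ are rejected instead of producing a complex or undefined result.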
Any fixed point in k with constant coordinates x , y. a n d x
is moving relative to K in accordance with the equations
1
x = -{V(1
8
+gz)a+Qs TB-1)

Y=y,Z=z

obtained by solving (35) with respect to X, Y, Z.


This motion is, according to the laws of the special theory
of relathity, identical with the motion of a particle of rest mass
m subjected to a constant force -
mg in the direction of the
1fgx
X-axis in a system of inertia, i. e. (36) represents the "hyperbolic
motion"") of a %niforrnly accelerated" particle with acceleration

9
r = -1 +gx' (37)

On account of the dependence of $\gamma$ on x, the distance,
measured by an observer in K, between two fixed points in the
frame k will not, in general, be constant in time. Since, however,
the same distance is constant when measured by a comoving
meter stick, the system k deserves the name of a uniformly
accelerated rigid frame of reference, and the transformation (35)
plays a similar part as does the Lorentz transformation in the
case of a rigid frame moving with constant velocity.
Since the variables x and t, defined by (35), must be real,
we shall have to confine ourselves to the consideration of events
satisfying the condition

$$-(1+gX) < gT < 1+gX. \tag{38}$$


For later use, we also write down the Lorentz transformation
connecting the space-time coordinates of two systems of
inertia with the relative velocity u

$$x = \frac{X - X_0 - u(T - T_0)}{\sqrt{1-u^2}}, \qquad y = Y, \qquad z = Z,$$
$$t - t_0 = \frac{T - T_0 - u(X - X_0)}{\sqrt{1-u^2}}. \tag{39}$$

In (39), the space and time variables have been chosen in such
a way that the origin x = 0 of the system k at the time $t = t_0$
corresponds to the coordinate $X = X_0$ and the time $T = T_0$ in K.

3. The clock paradox.


a. In the first part of this section, we shall treat the problem
from the point of view of an observer in K. While the
clock $C_1$ is permanently situated at rest at the point A on the
positive X-axis, $C_2$ at the beginning is travelling with constant
velocity -u in the direction of the X-axis. At the point B, the
clock $C_2$ is subjected to a constant force F, which brings it to
rest at the origin O and starts it back to B with reversed velocity.
At the time of arrival in B, $C_2$ will have regained the
velocity u which it retains during the travel from B to A. Let
us assume for simplicity that the coincidence of $C_2$ with O takes
place at the time T = 0 and that the proper time $\tau$ of $C_2$ is
also zero at this moment. Since the problem is then completely
symmetrical with respect to this event, we only need explicitly
to consider the behaviour of $C_2$ during its travel from O to B
and onwards to A.
Let T' and T'' be the times, measured in the time scale of
the system K, during which $C_2$ travels from O to B and from
B to A, respectively, and let $\tau'$ and $\tau''$ be the corresponding
proper times measured by the clock $C_2$ itself. The motion of $C_2$
from B to O and back to B will be a hyperbolic motion given
by the equation

$$X = \frac{1}{g}\left\{\sqrt{1+g^2T^2} - 1\right\}, \tag{40}$$

where the constant g is connected with the force F and the rest
mass m of $C_2$ by the relation

$$F = mg. \tag{41}$$

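According to (40), the velocity $U = \frac{dX}{dT}$ is given by

$$U = \frac{gT}{\sqrt{1+g^2T^2}} \tag{42}$$

and, since U = u for T = T', we have

$$u = \frac{gT'}{\sqrt{1+g^2T'^2}}. \tag{43}$$

Introducing (42) into (3), we get by integration

$$gT' = \sinh g\tau'. \tag{44}$$

Fig. 1.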


The corresponding relation between T'' and $\tau''$ is, according
to the well-known formula from the special theory of relativity,

$$\tau'' = T''\sqrt{1-u^2}. \tag{45}$$

From (43) and (44) we further obtain

$$u = \tanh g\tau', \qquad \frac{1}{\sqrt{1-u^2}} = \cosh g\tau'. \tag{46}$$

Now, let $\Delta'_K\tau_1$ denote the number of divisions registered by
$C_1$ during the travel of $C_2$ from O to B, as judged by an
observer in K, and let $\Delta''_K\tau_1$ be the corresponding number during
the period of uniform motion of $C_2$ from B to A. We then have

$$\Delta'_K\tau_1 = T' \quad\text{and}\quad \Delta''_K\tau_1 = T'' \tag{47}$$

and, for the total time elapsed between the two encounters of
$C_1$ and $C_2$, measured by $C_1$ and $C_2$, respectively, we get

$$\Delta\tau_1 = 2(\Delta'_K\tau_1 + \Delta''_K\tau_1) = 2(T'+T''), \qquad \Delta\tau_2 = 2(\tau'+\tau''). \tag{48}$$

When the applied force F is chosen so large that T' and $\tau'$,
given by (43) and (44), become negligible, the connection
between $\Delta\tau_1$ and $\Delta\tau_2$, according to (48) and (45), is again given
by the simple formula (1).
If L' and L'' denote the distances OB and BA, measured with
the measuring rods of the system K, we get from (40) and (43)

$$L' = \frac{1}{g}\left(\sqrt{1+g^2T'^2} - 1\right) = \frac{1}{g}\left(\frac{1}{\sqrt{1-u^2}} - 1\right), \tag{49}$$

while, obviously,

$$L'' = uT''. \tag{50}$$
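
As a rough numerical illustration of the bookkeeping in K, the relations (43)-(50) can be evaluated for chosen values of g, u and T''. This is a sketch under the conventions of the text (c = 1); the function name and the dictionary keys are hypothetical labels introduced only for this example.

```python
import math

def clock_paradox_in_K(g, u, T2):
    """Evaluate (43)-(50) for acceleration g = F/m, cruise speed u, and K-time T2 = T''."""
    root = math.sqrt(1.0 - u * u)
    T1 = u / (g * root)            # T' from (43)
    tau1 = math.atanh(u) / g       # tau' from (46), u = tanh(g tau')
    tau2 = T2 * root               # tau'' from (45)
    L1 = (1.0 / root - 1.0) / g    # L' from (49)
    L2 = u * T2                    # L'' from (50)
    return {
        "T1": T1, "tau1": tau1, "tau2": tau2, "L1": L1, "L2": L2,
        "total_C1": 2.0 * (T1 + T2),       # Delta tau_1 from (48)
        "total_C2": 2.0 * (tau1 + tau2),   # Delta tau_2 from (48)
    }
```

For any 0 < u < 1 the returned `total_C2` is smaller than `total_C1`, and the pair `T1`, `tau1` satisfies (44).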

We shall now introduce a frame of reference k moving
together with $C_2$ and we may take $C_2$ as the origin of k. While
the motion of the origin is, thus, completely determined, the
motion of any other fixed point of k may, beforehand, be chosen
arbitrarily. In the previous discussions of the clock paradox, it
has, however, tacitly been assumed that k should be a rigid
frame of reference. According to the considerations in Section 2,
it is then clear that the transformation connecting the space-time
variables of K and k must be given by (35) during the
accelerated motion of $C_2$ from B to O and back. The motion of
the origin x = 0 relative to K is then, on account of (36),
identical with the motion of $C_2$ given by (40), and the time
variable t is simply the proper time of the clock $C_2$.
For all events satisfying the conditions $-\tau' < t < \tau'$, the
connection between the coordinates of K and k is, thus, given by
(35). For $t > \tau'$, the system k is a simple system of inertia, and
the corresponding space-time transformation is obtained from
(39) by putting

$$t_0 = \tau', \qquad T_0 = T', \qquad X_0 = L'. \tag{51}$$

Similarly, we have for $t < -\tau'$ the transformations (39) with
reversed signs of u, $\tau'$ and T'.
In the following, we shall use the equations (35) and (39)
in a somewhat different form. Solving the last equation (35)
with respect to gT and introducing into the first equation, we
get, if we omit the trivial transformations of the y and z variables,

$$gT = (1+gX)\tanh gt, \qquad 1+gX = (1+gx)\cosh gt \tag{52}$$

for $-\tau' \le t \le \tau'$.

By a similar procedure, we get from (39) and (51) the
transformation

$$T - T' = (t-\tau')\sqrt{1-u^2} + u(X-L'), \qquad x = (X-L')\sqrt{1-u^2} - u(t-\tau') \tag{53}$$

for $t \ge \tau'$. For $t \le -\tau'$, the corresponding transformation (53') is obtained
from (53) by reversing the signs of u, T', and $\tau'$.
In spite of the great difference in form between the equations
(52) and (53), they are easily seen to be identical for $t = \tau'$.
For this particular value of t, the equations (52) reduce to

$$gT = (1+gX)\tanh g\tau', \qquad 1+gX = (1+gx)\cosh g\tau',$$

which, by means of (43), (46), and (49) may be written

$$T = u(X-L') + T', \qquad x = (X-L')\sqrt{1-u^2},$$

in accordance with (53) for $t = \tau'$.

D. Kgl. Danske Vidensk. Selskab, Mat.-fys. Medd. XX, 19.


On account of the symmetry inherent in our problem, a
similar result would be obtained for $t = -\tau'$, so that the
correlation of the coordinates x, y, z, t and the physical events is
performed in a continuous way by the equations (52), (53), and
(53'). Also the velocity of any fixed point in k relative to K
varies continuously at $t = \tau'$ (and $-\tau'$). From (36) and (52)
we get for constant values of x, y, z

$$\frac{dX}{dT} = \frac{gT}{1+gX} = \tanh gt. \tag{54}$$

On the other hand, $\frac{dX}{dT}$ is equal to u and -u for $t > \tau'$ and
$t < -\tau'$, respectively, which, on account of (46), is seen to be
in accordance with (54) for t equal to $\tau'$ and $-\tau'$.
While, thus, the velocities of the different points of k vary
continuously, it is clear that the accelerations must be
discontinuous for $t = \tau'$ and $-\tau'$, since the force F is assumed to set
in abruptly. This is also the reason for the sudden change in
the gravitational potential from the value zero to the value
given by (26) at these moments.
The system k defined by (52), (53), and (53') thus seems
to be the most natural frame of reference to be used in the
discussion of the clock paradox. The applicability of this system
of coordinates is only restricted by the condition that (38) must
be satisfied for $-\tau' < t < \tau'$, i.e. for

$$-u(1+gX) < gT < u(1+gX), \tag{55}$$

on account of (52) and (46). Since u is smaller than one, a
comparison of (38) and (55) shows that this condition is
satisfied for all events which take place at points $X > -\frac{1}{g}$.
b. We shall now treat the problem from the point of view
of an observer in k, according to which $C_2$ is permanently
situated at rest at the origin o of k, while $C_1$ at the beginning is
travelling with constant velocity u. The first encounter between
$C_1$ and $C_2$ takes place at the time $t = -\tau' - \tau''$. At $t = -\tau'$, $C_1$
has arrived at a point b on the positive x-axis with the
coordinate x = l''. During the time $-\tau' < t < \tau'$, $C_1$ is subjected to the
gravitational field which brings it to rest at the time t = 0 at
a point a in the distance l' from b, and starts it back to o with
reversed motion. In spite of this gravitational field everywhere
present during this period, $C_2$ remains at rest on account of the
force F which just counterbalances the gravitational force.

Fig. 2.

The behaviour of the clock $C_1$ is now simply obtained from
(52) and (53) if we remember that the X-coordinate of $C_1$ has
the constant value X = L' + L''.
From the second equation (53) we then get

$$l'' = L''\sqrt{1-u^2}, \tag{56}$$

since l'' is the value of x for $t = \tau'$.
Further, since the x-value of $C_1$ at t = 0 is l' + l'', we get
from the second equation (52)

$$l' + l'' = L' + L'' \tag{57}$$

and, therefore,

$$l' = L' + L''\left(1-\sqrt{1-u^2}\right). \tag{58}$$

On account of the Lorentz contraction factor in (56), the
distance travelled by $C_1$ with constant velocity u relative to k
will, thus, be shorter than the distance which $C_2$ travels with
constant velocity in K. Nevertheless, the total distances travelled
by the two clocks along the x-axes will be equal. In the
extreme case of $u \to c$, we have simply $l'' \to 0$ and $l' \to L' + L''$.
If $\Delta'_k\tau_1$ denotes the number of divisions registered by $C_1$
during the travel from a to b (or from b to a), we get from
the first equation (52), by putting X = L' + L'', $t = \tau'$, and
$T = \Delta'_k\tau_1$,

$$\Delta'_k\tau_1 = \frac{1}{g}\left[1+g(L'+L'')\right]\tanh g\tau' = T' + u^2T'' \tag{59}$$

by means of (44), (46), (49), and (50).


For the corresponding number of divisions $\Delta''_k\tau_1$ registered
by $C_1$ during the period of travel with constant velocity, we
have, according to the special theory of relativity,

$$\Delta''_k\tau_1 = \tau''\sqrt{1-u^2}. \tag{60}$$

This formula is also easily obtained from the first equation (53)
if we remember that $\Delta''_k\tau_1$ is the increase in T for X = L' + L''
during the interval $\tau' < t < \tau' + \tau''$. On account of (45), we may
also write

$$\Delta''_k\tau_1 = T''(1-u^2). \tag{61}$$

Although, thus, $\Delta''_k\tau_1$ is smaller than $\Delta''_K\tau_1$ in (47), it follows
from (59) and (61) that the total time elapsed between the
two encounters of $C_1$ and $C_2$, measured by $C_1$ and $C_2$,
respectively, is again given by

$$\Delta\tau_1 = 2(\Delta'_k\tau_1 + \Delta''_k\tau_1) = 2(T'+T''), \qquad \Delta\tau_2 = 2(\tau'+\tau''),$$

in accordance with the expressions (48) derived from the
standpoint of an observer in K.
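
The agreement between the two standpoints can be checked numerically. The sketch below evaluates (56)-(61) from the K-frame quantities and compares the total with (48); it is only an illustration with c = 1, and the function name and dictionary keys are hypothetical.

```python
import math

def clock_paradox_in_k(g, u, T2):
    """Evaluate the k-frame quantities (56)-(61) for g = F/m, speed u, and T2 = T''."""
    root = math.sqrt(1.0 - u * u)
    T1 = u / (g * root)                  # T' from (43)
    tau1 = math.atanh(u) / g             # tau' from (46)
    tau2 = T2 * root                     # tau'' from (45)
    L1 = (1.0 / root - 1.0) / g          # L' from (49)
    L2 = u * T2                          # L'' from (50)
    l2 = L2 * root                       # l'' from (56)
    l1 = L1 + L2 * (1.0 - root)          # l' from (58)
    d1 = (1.0 + g * (L1 + L2)) * math.tanh(g * tau1) / g   # (59)
    d2 = tau2 * root                     # (60)
    return {
        "l1": l1, "l2": l2, "d1": d1, "d2": d2,
        "d1_closed_form": T1 + u * u * T2,   # right-hand side of (59)
        "total_C1_k": 2.0 * (d1 + d2),       # total judged from k
        "total_C1_K": 2.0 * (T1 + T2),       # total judged from K, (48)
    }
```

The term `u * u * T2` in `d1_closed_form` does not depend on g, which is the contribution the text singles out as surviving in the limit of very large forces.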
It is interesting to note that $\Delta'_k\tau_1$ remains finite in the
limiting case of very large forces F, where T', $\tau'$, and $\Delta'_K\tau_1$ vanish,
since $\Delta'_k\tau_1$ in (59) contains a term which only depends upon u
and T''. It is just this term which is essential for the solution
of the clock paradox.
Since $\Delta\tau_2$ in any case is smaller than $\Delta\tau_1$, and $\Delta''_k\tau_1$,
according to (60), is smaller than $\tau''$, $\Delta'_k\tau_1$ must be greater than $\tau'$,
i.e. the clock $C_1$ goes faster than $C_2$ during this period. From
the point of view of an observer in k, the reason for this
difference in rate is to be sought mainly in the difference in
gravitational potential $\Phi$ at the places of the two clocks. The
behaviour of $C_1$, however, will in general not be like that of a
clock at rest at the point x = l' + l'' = L' + L'', even if T' and
$\tau'$ are made small by use of a large force F. In fact, the number
of divisions registered by a clock at rest during the time $\Delta t = \tau'$
is, according to (26), (28), or (21), given by

$$(\Delta'_k\tau_1)_0 = \tau'\left[1+g(L'+L'')\right],$$

a number which is greater than $\Delta'_k\tau_1$ in (59), since we have

$$\tanh g\tau' < g\tau'.$$


From (17) and (26), we get the expression

$$d\tau = dt\,\sqrt{1+2\Phi-u^2}$$

for the proper time of a particle moving with velocity u in the
gravitational potential $\Phi$. This general formula, which comprises
the special formulae (3) and (28), clearly shows that $\Delta'_k\tau_1$ in
general must be smaller than $(\Delta'_k\tau_1)_0$, since $C_1$ during the time
in question falls freely with increasing velocity from the place
x = L' + L'' towards smaller values of x, i.e. smaller values
of the potential $\Phi$.
Only in the case u << 1 considered by TOLMAN, where $\tanh g\tau'$
is equal to $g\tau'$, apart from terms of the third order in u (cf.
(46)), it is allowed to treat $C_1$ as a clock at rest during the
period of acceleration, since the difference between $\Delta'_k\tau_1$ and
$(\Delta'_k\tau_1)_0$ is then of higher order in u. Even in this case, where
gt may be treated as a small quantity, the equations (52),
however, do not reduce to the transformations given by (6). If
we neglect terms of higher order in gt, we obtain instead

$$X = x + \tfrac{1}{2}\,g\,t^2\,(1+gx).$$

To get the transformation (6), we should, thus, have to replace
the factor 1 + gx by 1 and this would mean neglect of just
those terms which, in the preceding discussion, have been seen
to be essential for the treatment of the clock paradox.

4. Rigid frames of reference in arbitrary motion.

In Section 2, it was shown that the transformation (36) is
essentially determined by the condition that the gravitational
field of the accelerated system k should be static, and the line
element and the gravitational potential in the transformed system
are given by (17) and (26), respectively. Since the motion of
the origin of k in this case is a hyperbolic motion, the
applicability of the transformation (35) in the preceding discussion
is confined to the case where the clock $C_2$ is subjected to a
constant force during the period of acceleration. For any other
motion the gravitational field in the comoving system will not
be static. Anyway, it is always possible to choose the time
variable t in the transformations (5) in such a way that the
line element takes the form (13), where A and D in general
are functions of both variables x and t. If we want the system
k to be a rigid frame of reference, A must, however, be
independent of t, so that the line element may be brought into the
simple form (13') by a suitable choice of the variable x. Then,
the spatial geometry is again Euclidean, x, y, and z being
Cartesian coordinates.
Using DINGLE'S general formulae 7), one finds that EINSTEIN'S
field equations (14) in this case reduce to the single equation
which is obtained from (16) by replacing the ordinary
differentiations with respect to x by partial differentiations. The general
solution is again of the form (16'), a and g here being arbitrary
functions of t. Finally, the time variable t may be chosen such
that the line element takes the same simple form (17) as in the
special case treated in Section 2, g in the general case being
an arbitrary function of t. The equations (18)-(29) and (63)
are seen, therefore, to hold also in the general case.
In order to find the transformation (5) by which the
expression (7) for interval is transformed into (17) and by which,
conversely, the gravitational field in k is transformed away, we
may proceed exactly as in Section 2. First, we solve the
equations (24) and (25) for the motion of a free particle
initially at rest. After that, the proper time $\tau$ of the particle and
the initial values $x_0$, $y_0$, $z_0$ of the space coordinates in k are
identified with the time and space coordinates T, X, Y, Z in K.
The solution of the equations (24) and (25) is only somewhat
more complicated than in the case of constant g considered in
Section 2. Since g may be regarded as a known function of t,
it is convenient to use t as parameter in (24) instead of $\tau$. The
elimination of $\tau$ is easily performed by means of (25) and,
finally, applying elementary methods, a complete solution of the
problem is possible.
We shall here give the results, only. For the transformations
connecting the space-time variables of the systems K and k, we get

$$X = x\cosh\theta + \int_0^t \sinh\theta\,dt, \qquad Y = y, \qquad Z = z,$$
$$T = x\sinh\theta + \int_0^t \cosh\theta\,dt, \tag{64}$$

with

$$\theta(t) = \int_0^t g(t)\,dt.$$

It is easily verified by means of direct calculation that the line
element (7) is really brought into the form (17) by the
transformation (64). Further, we see that the equations (64) in the
case of constant g reduce to the equations (52) which are
equivalent to (35). On the other hand, if g is assumed to be finite
and constant for $-\tau' < t < \tau'$ and zero for all other times, (64)
leads to the transformations (52) and (53) used in the
discussions of Section 3.
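
The general transformation (64) lends itself to a simple numerical evaluation. The sketch below integrates $\theta(t)$ and the two integrals in (64) with the trapezoidal rule for a user-supplied g(t), so that the reduction to (52) for constant g can be checked. The function name `transform_64`, its parameters, and the choice of the trapezoidal rule are assumptions made for this illustration only.

```python
import math

def transform_64(x, t, g_func, steps=2000):
    """Approximate (X, T) of (64) for a point (x, t) in k, given g as a function of t."""
    h = t / steps
    theta = 0.0        # running value of theta(s), the integral of g from 0 to s
    int_sinh = 0.0     # running integral of sinh(theta) from 0 to s
    int_cosh = 0.0     # running integral of cosh(theta) from 0 to s
    g_prev = g_func(0.0)
    for i in range(1, steps + 1):
        g_cur = g_func(i * h)
        theta_new = theta + 0.5 * h * (g_prev + g_cur)
        int_sinh += 0.5 * h * (math.sinh(theta) + math.sinh(theta_new))
        int_cosh += 0.5 * h * (math.cosh(theta) + math.cosh(theta_new))
        theta, g_prev = theta_new, g_cur
    X = x * math.cosh(theta) + int_sinh
    T = x * math.sinh(theta) + int_cosh
    return X, T
```

With a constant `g_func` the result should satisfy $1+gX = (1+gx)\cosh gt$ and $gT = (1+gx)\sinh gt$ up to the integration error, and with `g_func` identically zero it should return (x, t) unchanged.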
When g is given as a function of t, the transformation (64)
and, consequently, also the motion of the origin of k with
respect to K is completely determined. Conversely, the function
g and the transformation (64) are uniquely determined by the
motion of the origin of k. Differentiating (64) by constant x,
we get

$$dX = \sinh\theta\,(1+gx)\,dt, \qquad dT = \cosh\theta\,(1+gx)\,dt. \tag{65}$$

The velocity $U = \frac{dX}{dT}$ of a fixed point in k with respect to K
is, thus,

$$U = \tanh\theta, \tag{66}$$

an equation which may be regarded as a generalization of (54).
Moreover, we get from (66), (65), and from the definition of $\theta$
in (64)

$$\frac{d}{dT}\left(\frac{U}{\sqrt{1-U^2}}\right) = \frac{g}{1+gx}, \tag{67}$$

which shows that the motion of a point x = constant is the
same as that of a particle of mass m attacked by a force
$\frac{mg}{1+gx}$, just like in the case of constant g (cf. p. 13).
When the motion of the origin x = 0 is given by the equation

$$X = \Psi(T), \tag{68}$$

the corresponding g is obtained as a function of t by
elimination of the variable T from the equations

$$g = \frac{d}{dT}\left(\frac{\Psi'}{\sqrt{1-\Psi'^2}}\right), \qquad t = \int_0^T \sqrt{1-\Psi'^2}\,dT, \tag{69}$$

which are easily derived from (67), (68), (66), and (65).
By means of the general equations (64), it is now easy to
treat the clock paradox for an arbitrary motion of the clock $C_2$
during the interval of acceleration. Since, however, the treatment
of this general case does not exhibit any essentially new features
as compared with the treatment of the special case discussed in
Section 3, we shall confine ourselves to the general remarks
already made in this section.

IL NUOVO CIMENTO    VOL. XXVI, N. 4    16 November 1962

Physical Consequences of a Co-ordinate Transformation
to a Uniformly Accelerating Frame (*).

T. FULTON (**)
The Johns Hopkins University - Baltimore, Md.

F. ROHRLICH (**)
University of Iowa - Iowa City, Ia.

L. WITTEN
RIAS - Baltimore, Md.

(received 16 February 1962)

Summary. - The classical equations of particle motion and of classical
electrodynamics are known to be covariant with respect to conformal
co-ordinate transformations. This paper deals with a simple conformal
co-ordinate transformation from an inertial frame to one which
accelerates uniformly; it examines the physical consequences predicted by
the covariant theory and the transformation. Expressions are derived for
various effects of the conformal invariance on mass, length, and time
measurements; the predictions of the theory are shown to be the same
as the corresponding predictions derived from the general theory of
relativity for the transformation being studied. The predictions of the
conformal theory depend on the particular co-ordinate system chosen
(with an explicit dependence on the origin). The corresponding situation
prevails in general relativity with the components of the metric tensor
being a function of space and time. An example of the « twin paradox »
is studied. Finally, after considering the behavior of four-momentum
conservation under conformal transformations, the frequency shift of
radiation emitted by an excited atom falling freely in a uniform field
is calculated to first order in acceleration. Identical results are obtained
from the conformal and general relativistic points of view. The frequency
shifts are what one would expect to obtain by properly combining the
Doppler shifts and the shift arising as a consequence of the equivalence
of inertial mass to gravitational mass.

(*) Supported in part by the National Science Foundation and in part by the
Aeronautical Research Laboratory.

1. - Introduction.

The study of conformal transformations and conformal invariance has a
rather extensive history. The transformations are well understood
mathematically, but their consequences for physics have been somewhat less extensively
examined, particularly regarding specific experimental predictions. In fact,
most physicists probably believe that conformal invariance should not play
any important role in the description of nature while some want to consider
it as a logical generalization of Lorentz invariance.
WEYL (1), EINSTEIN (2), and PAULI (3), among others, have considered in a
general way the consequences of conformal relativity as they may affect
physical measurements. These questions are of interest in view of the conformal
invariance of the basic equations of classical physics (4-6). To the best of our
knowledge, no one has discussed the specific physical effects of conformal
relativity for a simple case, though very likely a number of physicists have thought
about and have been aware of the matters to be discussed in this paper. These
matters are very restricted in scope. We shall not consider all conformal
co-ordinate transformations but only those which take us from flat space to flat
space. In fact, we shall discuss in detail only one specific example of such a
transformation, the one responsible for transforming from an inertial co-ordinate
system to one which is accelerated uniformly with respect to it. Though this
transformation is a simple one, it does have physical significance. In the
one-dimensional case and for weak fields, its inverse represents a transformation
from a frame in which a particle falls freely in a static and uniform force field
to the rest frame of the particle (7).
In the next section, we discuss in detail the properties of our specific
transformation, its singularities and other peculiar features. In Section 3 we treat
various thought experiments relating to the variation of rest mass, length,
and time under a transformation to a uniformly accelerating co-ordinate frame.
We also discuss the so-called « twin paradox ». Finally, in Section 4, we deal
with a specific experiment: we calculate the frequency shift of the spectral
line of a freely falling emitter.

(1) H. WEYL: Sitzungsber. Preuss. Akad. Wiss. (1918), p. 465; Math. Zeits., 2,
384 (1918); Ann. Phys., 59, 101 (1919); Space, Time, and Matter (New York, 1950).
(2) A. EINSTEIN: Sitzungsber. Preuss. Akad. Wiss. (1921), p. 261.
(3) W. PAULI: Theory of Relativity (New York, 1958).
(4) J. A. SCHOUTEN and J. HAANTJES: Physica, 1, 869 (1934).
(5) J. A. SCHOUTEN and J. HAANTJES: Proc. Ned. Akad. Wet., 39, 1063 (1936).
(6) F. ROHRLICH, T. FULTON and L. WITTEN: Bull. Am. Phys. Soc., 6, 346 (1961).
(7) F. ROHRLICH: The Static Homogeneous Gravitational Field (in preparation).

2. - The acceleration transformation.

We consider the co-ordinate transformation from a reference frame S
to a reference frame S'. The co-ordinates of a point in S are given by $x^\mu$; the
co-ordinates, $x'^\mu$, of the same point in S' are given by

(2.1) $x'^\mu = f^\mu(x)$.

The metric associated with S is $g_{\mu\nu}(x)$. If the frame S' is a Riemann space
(i.e. we deal with a general relativistic co-ordinate transformation) the metric
in S' is given by

(2.2)

The metric transforms as a second rank tensor, and as a consequence, in
general relativity, we have

(2.3)

where
(2.4)
and
(2.5)

A conformal co-ordinate transformation differs from a general relativistic
co-ordinate transformation in that the metric no longer transforms as a tensor.
Instead we have (9)

(2.6)

The function $\sigma(x)$ represents a scale transformation, and, as a consequence,
the proper time is no longer an invariant under conformal transformation.

(8) This type of transformation is sometimes referred to as a passive co-ordinate
transformation. For a discussion of the relationship between conformal point
transformations and conformal co-ordinate transformations see the note by T. FULTON,
F. ROHRLICH and L. WITTEN: in preparation.
(9) J. HAANTJES: Proc. Ned. Akad. Wet., 43, 1288 (1940).

By definition, the proper time is

(2.7)

It follows that

(2.8) dtc = ~ dt2


( 2 ) .
The light cones are still invariant under conformal transformations:

Since, in dealing with equations of motion of charged particles (Lorentz
equation, or Lorentz-Dirac equation), we have to deal with derivatives of
vectors, the definition of conformal co-ordinate transformations as given by
(2.1), (2.2), and (2.6) is not complete. It must be augmented by discussing
covariant differentiation; in particular the behavior of the affine connection
under the transformation must be given (see Appendix). A flat space is one
in which the curvature tensor, $R$, computed from the affine connection, $\Gamma$,
vanishes. A conformal co-ordinate transformation from flat space to flat space
is one in which both $R$ and $R_c$ vanish. These co-ordinate transformations are
given by the 15-parameter Lie group often referred to as the « conformal
group ». In case derivatives enter the problem being considered, it is
important to realize that although both S and S' are flat under the conformal
transformations, not both $\Gamma$ and $\Gamma_c$ vanish.
In this paper we restrict ourselves to conformal co-ordinate transformations
from flat space to flat space and emphasize the details of one such
transformation. We deal only with frames S which are Lorentz frames. That is, we set

(2.10)

where $\eta_{\mu\nu}$ is the Minkowski metric with signature +2. We shall consider
transformations to uniformly accelerating frames of reference. The $f^\mu$'s of (2.1)
will represent an inversion in the origin, followed by translation by a constant
vector and then by a second inversion about the new origin (10). In other
words, we let (11)

(2.11) SB f
=-
P
(10) Such a transformation is discussed among other authors by T. FULTON and
F. ROHRLICH: Ann. Phys., 9, 499 (1960); S. A. BLUDMAN: Phys. Rev., 107, 1163 (1957).
(11) For any vector $A^\mu$, we define $A^2 = A^\mu A_\mu$.

(2.12)

and $a^\mu$ is a constant vector (12).


We will also demand that the conformal transformation take us to a flat
space S', i.e.
(2.13)

We can imagine, for definiteness, that the affine connection in S vanishes,
but in S' it does not. However, the affine connections will not explicitly enter
into the sequel.
Since a general co-ordinate transformation for (2.11) and (2.12) gives, upon
explicit computation,

(2.14)
where
(2.15)

we identify the scale function $\sigma(x')$, using (2.13), as

(2.16) $\sigma(x') = \lambda^{-2}(x')$.

The transformation inverse to (2.11) and (2.12) is simple to exhibit. It too
must be an inversion in the origin, followed by a translation, this time in the
opposite sense, by the same constant vector $a^\mu$, and a second inversion:

(2.17)

where

and
1
(27.19) A(x)=I+ 2afix;x2a2 = - .
4s)

(12) We put c = 1 throughout this paper, unless explicitly stated otherwise.

In what follows, we shall limit ourselves further to a special choice of $a^\mu$,
namely

(2.90) ap=
i0 ; o,o,--
3
We observe that the transformation (2.11), (2.12), with $a^\mu$ defined as in
(2.20) does describe a uniformly accelerating frame of reference. This may
be made more obvious, if we do not set c = 1 and instead take the limit
$c \to \infty$. The conformal transformation in this limit becomes

(2.21)

I lim
c+m
A(%) =1

In the limit of vanishing g, the transformation reduces to the identity
transformation, as it should.
Next, we examine how a particle, at rest at the origin for all time in the
S frame, will move as seen by an observer at rest in the S' frame. The observer
will determine the co-ordinates of the accelerating particle to be

(2.22)
I z=-$
gt2

1
with

(2.23) I. = 1 - $gat

The orbit of the particle, described by the observer at rest in the S' frame,
lies in the x'-t' plane and is hyperbolic, i.e. represents uniform acceleration:

(2.24)

In the nonrelativistic limit, or alternatively the small g limit, (2.24) reduces to


Note that we have used only the transformation eqs. (2.11), (2.12) to
obtain (2.24), so it holds both for general relativistic co-ordinate transformations
and for conformal co-ordinate transformations. Not until we specify our metric
or, alternatively, the transformation of proper time, do we restrict ourselves
to one, or the other. Hyperbolic motion thus appears as a solution of either
a conformally (13,14) or a general relativistically covariant equation of motion
which reduces to the equation of motion for a free particle in the frame S.
The hyperbolic motion as seen by the observer at rest in S' would equally
well result if we considered S' to be an inertial frame in which the Lorentz
invariant equation of motion with constant rest mass and constant force held.
Thus we have three alternative and equivalent descriptions of the motion as
seen by the observer in S':
A) S' is a frame in which only a constant force is acting.
B1) S' is a frame in a Weyl space, uniformly accelerating with respect
to an inertial frame. This considers the transformation to be a conformal
co-ordinate transformation.
B2) S' is a frame in a Riemannian space uniformly accelerating with
respect to an inertial frame. This considers the transformation to be a general
relativistic co-ordinate transformation.
The principle of equivalence as usually considered regards the descriptions
A and B2 to be equivalent. We shall shortly discuss the equivalence of the
descriptions B1 and B2 for the simple class of transformations (2.11) and (2.12)
that we are considering. Hence, the principle of equivalence can be regarded
as an equivalence between A and B1 or between A and B2. For the case of
uniformly accelerating motion, the description used is a matter of
computational convenience.
For the sake of completeness, since we will use them later, we give here
the expression for the instantaneous velocity

(2.26)

where

(2.27)

(13) F. GÜRSEY: Nuovo Cimento, 3, 988 (1956); J. A. MC LENNAN: Nuovo Cimento,
5, 640 (1957); H. A. BUCHDAHL: Nuovo Cimento, 11, 496 (1959). References to earlier
work can be found in these papers.
(14) L. INFELD and A. SCHILD: Phys. Rev., 70, 410 (1946).

Next we study the mapping of the x-t plane into the x'-t' plane for
the transformation (2.11), (2.12). (This mapping is not sufficient to
characterize the transformation completely.) The salient features of the mapping
can be most easily represented graphically.
The transformation is singular in the x-t plane for the lines

(2.28) $t = \pm\left(x + \frac{2}{g}\right)$.

The fact that the transformation is singular is in no way disturbing. One
must merely obey the injunction to stay away from the singularities in
discussing any physical process. Using this transformation, we discuss physical
processes only for regions in space-time for which the transformation is
non-singular.
In order to determine the mapping, we consider the most general straight
line in the x-t plane, parallel to the t-axis,

(2.29) $x = \frac{2}{g}(\alpha - 1)$ for all t, $(\alpha \gtrless 0,\ \alpha \neq 0)$.

The transformed equations in the S' frame for these lines are

(2.30)

Thus, straight lines parallel to the t-axis map into a family of hyperbolas with
parallel asymptotes of slopes ±1. The vertices of the two arms of the
hyperbolae are located at the points

(2.31)

Fig. 1. - The mapping of particles at rest in the S frame (x-t plane) at two typical
positions into their corresponding hyperbolic motions in the S' plane (x'-t' plane)
by means of the acceleration transformation.

Fig. 2. - The mapping of the x-t plane into the x'-t' plane for the transformation
of uniform acceleration. The singular lines in the x-t plane map into infinity in
the x'-t' plane.

Figure 1 illustrates the way typical lines in the x-t plane map into the
x'-t' plane. Figure 2 indicates the mapping of corresponding domains in the
two planes. The mapping is one to one for all regions of the plane except
along the singular lines.

3. - Kinematical consequence:; transformation of mass.

In this section, we shall be concerned mostly with certain kinematical properties of the transformation as they relate to effects on the measurement of length and time (15). We shall also discuss an example of the « twin paradox » and the transformation of mass under a conformal co-ordinate transformation.

3'1. Transformation of rods. - Consider first the effect of the transformation on length. Suppose we have a very small measuring rod, at rest in frame S and lying along the z-axis. The length, dl, measured by S is given by dl = dz.

(15) Some of these questions were recently discussed in a different context by S. N. GUPTA: Science, 134, 1360 (1961).

An observer in S' will have to make measurements at a given instant $t'_0$ to measure the length of the rod. In order to compare the results of his measurement with the results of the observer S he will have to realize that he has achieved a Minkowski metric by the introduction of a scale factor. To make comparisons he must account for the scale factor by saying that $dl' = (dz'/\lambda')_{t'_0}$. In order to predict the result of his measurement, we must calculate
(A' by using the inverse transformation. If S and S' were inertial
,BI-yl-o

frames connected by a Lorentz transformation ($\lambda = 1$), the resulting partial derivative would correctly give the Lorentz contraction. For our case, we have, from eqs. (2.17)-(2.20)

(3.1)

In order to obtain the length effects, we now have to substitute the motion satisfied in S' by a particle at rest in S. We used infinitesimal lengths in (3.1) in order to enable us to substitute such particle motion. No such need arises for Lorentz transformations so that finite lengths may be used when comparing measurements in two inertial frames. The distinction is one between a transformation for which the new variables are non-linear functions of the old and one for which they are linear.
Substitution of the motion (2.24) in the expression (2.19) for $\lambda'$ yields

(3.2) 2' I "rbit = 1- 1? P O' ,

and insertion in (3.1) results in

(3.3)

where $z'_0$ is the z' co-ordinate at $t'_0$. This agrees with the predictions of the special theory of relativity. Observer S' was only able to make this prediction if he knew explicitly the value $1/\lambda'$ by which he expanded space.
A general relativistic observer would make exactly the same calculation but would introduce $\lambda'$ not as a scale factor but with the appropriate components of the metric tensor. The distinction for this transformation is purely formal.

3'2. Transformation of clocks. - We can proceed in a similar way to study effects on time. Assume a clock at rest in S (i.e., at a fixed co-ordinate in S). In order to compare the rate of this clock with clocks in S', we require the

expressions (1l-l at/&),,, . I n particular, take ro= 0. Then we have

(3.4)

Evaluation of (3.4) along the orbit gives, with the use of (3.2)

(3.5)

Thus we get a time dilation which is the same as the one which would be obtained from the use of instantaneous Lorentz frames. Again the general relativist would make the same calculation using $\lambda$ as the appropriate components of the metric tensor.

3'3. Twin paradox. - It is convenient to use the transformations (2.11) and (2.17) to make explicit calculations regarding the twin paradox. One « twin » is at rest in a constant force field and sees the other one start from the same point, with a given velocity opposite to the force, decelerate, stop, and eventually return to the starting position with his initial speed.
Consider an observer A, at rest in S at the origin, and his twin, B, at rest in S' at the position $z'_B$, x = y = 0. Assume that both twins are at the same space-time point $z_A = z_B = 0$, at the time $-t_0$ and again at the time $t_0$. The proper time that has elapsed between coincidences of A and B is, of course, a function of the specific path taken (proper time is not an exact differential). The proper time that has elapsed on the path of A between coincidences is greater than on the path of B.
The situation of each of the two observers, A and B, seems on the face of it to be similar. Observer A is stationary in his frame of reference and sees observer B traverse a hyperbolic orbit and return. Similarly observer B is stationary in his frame of reference and sees A travel a hyperbolic orbit. It might be reasoned that if observer A concludes that his proper elapsed time is greater than that of B, observer B will conclude similarly that his own proper elapsed time is greater than that of A. These two conclusions disagree with each other and hence the paradox.
The resolution of the paradox is of course well known. Observer A is in an inertial frame; observer B is not. His frame appears inertial because of his having scaled space everywhere by the factor $\lambda$. Observer B realizes that he cannot talk about proper time at all in a physically meaningful way (i.e., a way that corresponds with measurements on a clock) unless he scales space again in such a way that at each space-time point all proper intervals or proper times are invariant. Having accomplished this, he will realize that his estimate of the reading of a proper time interval between two world points on a given world line will necessarily agree with the estimate of observer A for the same world line going between the same two points.
Should observer B fail to keep track of his scaling factor, he will be powerless to make any remarks about the ratio of times or distances at different parts of his space. The interval $d\tau_c$ has no physical significance because there is no clock which directly measures it. However, he can compare ratios of infinitesimal lengths or distances at the same world-point ($\lambda'$ drops out of the ratio); equivalently he can compare angles between world lines or any other physically meaningful quantity that can be expressed independently of $\lambda'$ (a conformal invariant).
Realizing now that there is no paradox and that calculations made in the frame, S, in which A is stationary must agree with those made in the frame, S', in which B is stationary, we proceed to calculate a special case as an exercise.
In S, the observer A is stationary at $z_A = 0$. His path in S' is

(3.6)

For definiteness let observer B be located in S' at $z'_B = -1/g$. His path in S is

(3.7)

The two paths will intersect at the times $t = \pm t_0 = \pm 2/(\sqrt{3}\,g)$, $t' = \pm t'_0 = \pm\sqrt{3}/g$. The elapsed time for A as calculated by the observer S is

(3.8)

where the integral is calculated over the path $z_A = 0$. Observer B would find for this elapsed time

(3.9)

where the integral is calculated over the path given by (3.6). Of course, the transformation has been constructed so that $(dt'^2 - dz'^2)/\lambda'^2 = dt^2 - dz^2$ and observer B will find that (3.9) gives the same result as (3.8). This can be shown explicitly. By use of (3.6) and (3.2), the integral (3.9) reduces after some manipulation to

(3.10)

This reduces to (3.8) by use of the transformation

(3.11)

The elapsed time for B calculated by the observer A is

(3.12)

calculated along the path (3.7). This can be rewritten as

(3.13)

The fact that $\tau_B$ is less than $\tau_A$ is in accordance with expectations. Observer B would calculate this time by doing the following integration along the path $z' = -1/g$:

(3.14)

He knows without calculation that (3.13) must produce the same answer as (3.14) and can also readily verify that the integrals give the same results by using the transformation

(3.15)    $t_B = \dfrac{4\,t'_B}{9 - g^2 t'^2_B}$

to transform the integral (3.13) into (3.14).

Again the general relativist would make exactly the same calculation as the conformal relativist. He would use the metric involving $\lambda'$.
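
The agreement between the two observers can be illustrated numerically. The sketch below is not the authors' calculation: it assumes the standard special conformal form for the inverse transformation, with scale factor $\lambda' = 1 - gz' + (g^2/4)(z'^2 - t'^2)$ as in (3.18) and units with c = 1. The function names `to_S`, `simpson`, `dtau_B_in_Sprime` and `dtau_B_in_S` are illustrative. B's elapsed proper time is computed once as the integral of $dt'/\lambda'$ along $z' = -1/g$, and once as the integral of $\sqrt{dt^2 - dz^2}$ along the image of that path in S.

```python
import math


def to_S(t_p, z_p, g):
    """Map (t', z') in S' to (t, z) in S; also return the scale factor lambda'."""
    lam = 1.0 - g * z_p + 0.25 * g * g * (z_p * z_p - t_p * t_p)
    t = t_p / lam
    z = (z_p + 0.5 * g * (t_p * t_p - z_p * z_p)) / lam
    return t, z, lam


def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


g = 1.0
z_B = -1.0 / g               # B is at rest in S' at z' = -1/g
t0_p = math.sqrt(3.0) / g    # coincidences with A at t' = -t0' and t' = +t0'
t0, z_meet, _ = to_S(t0_p, z_B, g)

tau_A = 2.0 * t0             # A is at rest at z = 0 in S, so tau_A = 2 t0


def dtau_B_in_Sprime(t_p):
    """Integrand dt'/lambda' used by the observer in S'."""
    return 1.0 / to_S(t_p, z_B, g)[2]


def dtau_B_in_S(t_p, h=1e-5):
    """Integrand sqrt(dt^2 - dz^2)/dt' along B's path in S, by central differences."""
    t_hi, z_hi, _ = to_S(t_p + h, z_B, g)
    t_lo, z_lo, _ = to_S(t_p - h, z_B, g)
    dt = (t_hi - t_lo) / (2.0 * h)
    dz = (z_hi - z_lo) / (2.0 * h)
    return math.sqrt(dt * dt - dz * dz)


tau_B_Sprime = simpson(dtau_B_in_Sprime, -t0_p, t0_p)
tau_B_S = simpson(dtau_B_in_S, -t0_p, t0_p)
```

With these assumptions the two values of $\tau_B$ coincide (about 1.76/g, equal to $(4/3g)\ln(2+\sqrt{3})$), and both are smaller than $\tau_A = 4/(\sqrt{3}\,g) \approx 2.31/g$.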

3'4. Transformation of mass. - SCHOUTEN and HAANTJES have shown (5) that the Lorentz equation of motion of charged particles is covariant under conformal transformations, providing the rest mass is not invariant but transforms appropriately. The charge of the particle and the velocity of light are invariant. The proper mass must transform in the following way

(3.16)

The quantity $m\,d\tau$ is invariant under conformal transformations:

(3.17)    $m_c\,d\tau_c = m\,d\tau$.

For the transformation (2.11), (2.12) by virtue of (2.19), (2.20), in the plane x' = y' = 0,

(3.18)    $m_c = \dfrac{m}{1 - gz' + (g^2/4)(z'^2 - t'^2)}$.

In the region of space-time for which $gz' \gg (g^2/4)(z'^2 - t'^2)$,

(3.19)    $m_c \approx m(1 + gz')$.
Equation (3.17) suggests how the point of view of general relativity differs from that of conformal transformations. For general relativity, the Lorentz equation would be kept invariant, as would (3.17), by keeping each factor m and $d\tau$ individually invariant; conformal relativity varies both. The value of the transformed mass (3.18) is origin-dependent; this corresponds to the analogous situation in general relativity which has an origin-dependent metric tensor, $g_{\mu\nu}(x') = \lambda'(x')^{-2}\,\eta_{\mu\nu}$. The origin-dependence is not particularly surprising, and occurs quite often in metrics commonly considered in general relativity.
Equation (3.19), which is an appropriate approximation for certain regions of space-time, suggests an interpretation in classical terms: $m_c$ is the total energy of the particle, which contains contributions due to the rest mass, m, and to the gravitational potential, mgz'. The full eq. (3.18) apparently adds a correction due to velocity and its effects to the total energy. The origin-dependence of (3.19) can now be thought of as corresponding to the arbitrary addition of a constant to the gravitational potential; the mass difference between two points will according to (3.19) be origin-independent. However, the exact expression (3.18) yields an origin-dependence even to the mass difference between two points. This is probably related to the actual physical model that must be used to produce the acceleration field with which we are dealing, much as the gravitational potential from a point mass will yield an origin-dependence for the potential difference of two equally spaced points along the same radial line.

The discussion of mass in general relativity is neither obvious nor straightforward, and a simple comparison of results is difficult. General considerations would lead one to believe that not mass but a notion like total energy content of a particle (whatever that may mean in general relativity) is important and that the results of a calculation in general relativity would agree with one in conformal relativity. As an example of this we make a calculation in the next section explicitly using the mass transformation.

4. - Frequency shift of a freely falling emitter.

Let S again be the frame which is freely falling in a uniform gravitational field of intensity g. Let S' be the frame which is at rest in this field. Thus, S is an inertial frame, while S' is a frame in which only a constant force is acting.
Let an atom of rest mass $m_*$ (the mass measured by S while the atom is at rest in S) emit a photon. Let the final atom have rest mass m.
We want to investigate the frequency of the emitted photon seen by an observer in S', using the methods of special relativity, conformal relativity, and general relativity, respectively. These three ways of looking at the situation correspond exactly to the three points of view, A, $B_1$, and $B_2$, listed in Section 2.

A) Special relativity. - Although there is a gravitational field present in S', this frame can nevertheless be regarded as an inertial frame in a certain sense. If the atom decays at the space-time point, $z'_1$, at which the observer is located, the presence of a uniform gravitational field can be ignored. One applies the conservation laws as in special relativity

(4.1)    $p_*^\mu = p^\mu + k^\mu$,

where $p_*^\mu$ and $p^\mu$ are the momentum four-vectors of the atom immediately before and after the emission, and $k^\mu$ is the photon four-momentum. We have ignored the irrelevant dimensions x' and y', for motion along the z'-axis (we also put $\hbar = 1$).

(4.2)    $p_*^\mu = m_*(1 + gz'_1)\,\dfrac{dx'^\mu}{d\tau_*} = m_*(1 + gz'_1)\,\gamma_*\,(1;\ 0,\ 0,\ v_*)$;

Here $v_*$ and v are the velocities of $m_*$ and m as seen by the observer in S'. The symbols $\tau_*$ and $\tau$ denote the fact that each particle has its proper time. The factor $1 + gz'$ takes into account the presence of the gravitational field to first order in g. The principle of equivalence demands that it be applied to $k^\mu$ as well.
Elimination of v from these equations yields for the frequency seen at $z'_1$ and produced at $z'_1$ from an atom of velocity $v_*$,

(4.3)    $\omega(z'_1;\ z'_1, v_*) = (m_* - m)\,\dfrac{m_* + m}{2m_*}\,\gamma_*(1 - v_*)$.

The factor $m_* - m$ would yield $\omega$ if recoil were neglected and $v_*$ were zero; the term $(m_* + m)/2m_*$ is a recoil correction; $\gamma_*(1 - v_*)$ is the Doppler effect.
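
The structure of (4.3) can be cross-checked against plain four-momentum conservation. The sketch below makes two assumptions that are not stated in this passage: the photon is taken to travel along the negative z' direction while $v_*$ is measured along the positive z' direction, and the common factor $1 + gz'_1$ is dropped because it multiplies every term when observer and atom are at the same point. Units with c = ħ = 1 are used, and the function names are illustrative.

```python
import math


def frequency_from_conservation(m_star, m, v_star):
    """Solve p_* = p + k for omega, with k = omega (1; 0, 0, -1) (assumed direction)."""
    gamma = 1.0 / math.sqrt(1.0 - v_star ** 2)
    # On-shell condition p.p = m^2 gives p_*.k = (m_*^2 - m^2)/2,
    # and p_*.k = m_* gamma omega (1 + v_*) for this photon direction.
    return (m_star ** 2 - m ** 2) / (2.0 * m_star * gamma * (1.0 + v_star))


def frequency_from_4_3(m_star, m, v_star):
    """Right-hand side of (4.3): bare difference x recoil correction x Doppler factor."""
    gamma = 1.0 / math.sqrt(1.0 - v_star ** 2)
    return (m_star - m) * (m_star + m) / (2.0 * m_star) * gamma * (1.0 - v_star)


def final_mass_squared(m_star, v_star, omega):
    """Invariant mass squared of the recoiling atom, p = p_* - k."""
    gamma = 1.0 / math.sqrt(1.0 - v_star ** 2)
    energy = m_star * gamma - omega
    p_z = m_star * gamma * v_star + omega   # the photon carries momentum -omega
    return energy ** 2 - p_z ** 2


m_star, m, v_star = 1.0, 0.9, 0.3
omega_a = frequency_from_conservation(m_star, m, v_star)
omega_b = frequency_from_4_3(m_star, m, v_star)
m_sq = final_mass_squared(m_star, v_star, omega_a)
```

Both routes give the same frequency, and the recoiling atom comes out with invariant mass m, as it should.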
If the observer is located at a point $z'_2 \neq z'_1$, energy conservation will give

(4.4)

with $\omega(z';\ z', 0) = \omega(z')$. Thus, the frequency seen at $z'_2$ of a photon produced at $z'_1$ by an atom of velocity $v_*$ is

(4.5)

$B_1$) Conformal relativity. - The observer in S' realizes that he is not in an inertial frame. The momentum four-vector is defined by

(4.6)

where $m_c$ is related to m by eq. (3.16). The conservation law (assured by the invariance of the theory under the translation group) is

(4.7)

This involves, when both observer and atom are at $z'_1$, according to (3.16) and (3.18)

1 k =
1
x(x/) = 7
1
1 -P o
( w ; 0, 0,k ) .

It is evident that $k_c^\mu$ must be related to $k^\mu$ in the same way that $m_c$ is related to m, (3.16), in order to preserve the covariance of (4.7).
Equations (4.7) and (4.8) lead to the same results as in case A. In particular, the observer S' will again find the result (4.3) for $\omega(z'_1;\ z'_1, v_*)$.
If the conformal observer is located at $z'_2$, he will apply a different scale factor at $z'_2$ than at $z'_1$. Since the conformal observer considers himself to be in field free space, $k_c^\mu$ is independent of position. Hence the ratio $\omega(z')/\lambda'(z')$ is a constant, and he finds

(4.9)

which is identical with (4.5).

It may be thought that, since ionized atoms can obtain large accelerations in readily available electric fields, a large frequency shift of a spectral line emitted by an ion in such a field should be predicted by the conformal calculation. However, the photon is not accelerated by the electric field in moving from $z'_1$ to $z'_2$ and hence its behavior in an electric field is quite different from that in a gravitational field. It acquires no additional potential energy and hence there is no frequency shift.

$B_2$) General relativity. - The general relativist might attack the problem in a way which partially resembles both of the above points of view. When both he and the emitter are at $z'_1$, he would say that he can apply the results of special relativity and write (4.3) directly.
Now suppose the general relativist is at $z'_2$ and the emitter is at $z'_1$. He will apply an invariance argument which will tell him that (16)

(4.10)

It should be pointed out that the equivalence of the conformal relativistic result, eq. (4.9), and its corresponding general relativistic counterpart, eq. (4.10), holds for arbitrary values of gz. Equation (4.5) is necessarily valid only for small gz, because gravitational fields can be taken into account in the framework of special relativity only in the Newtonian sense.

5. - Discussion.

We have considered how predictions of the results of some specific measurements can be made from the point of view of conformal invariance. In special relativity referring to field-free space, the theory makes contact with experience because the proper time is an invariant and is measured by a physical clock. The comparison of clocks at two different space-time points offers no difficulty. In general relativity, proper time is also an invariant measured by a physical clock. A co-ordinate system can always be chosen so that the description of a physical situation at a point is identical with that of special relativity. (Note that all the effects we have discussed are independent of curvature.) To compare measurements at two different space-time points, the general relativist must know the metric tensor at these points. He can determine it in either of two ways. He can calculate it from the theory if he knows the physical situation in detail, or he can measure it.

(16) See, for example, R. C. TOLMAN: Relativity, Thermodynamics and Cosmology (Oxford, 1934), p. 288.
The case of the general relativist is entirely analogous to that of the conformal relativist. The latter knows that before making contact with experience he must determine a scale factor. By an appropriate choice of co-ordinates, he can choose this factor to be equal to unity at his space-time point of observation; hence again he can reduce the situation at a single point to that of special relativity. To make comparisons at different space-time points, he must determine the scale factor at the different points. This he can do either by making calculations from the physical situation that prevails or by making measurements.
The situation of the conformal relativist and of the general relativist are thus completely parallel. Neither can make measurements without determining $\lambda'$; one will consider it to be a scale factor, the other a factor that multiplies $\eta_{\mu\nu}$ to give the metric tensor. Obviously, general relativity deals with a much broader class of transformations than does conformal relativity. Where conformal relativity applies, general relativity agrees with it in detail. Since general relativity is the broader theory, there seems to be no advantage at all in considering conformal relativity as presenting a physically significant point of view.
In this paper, we have treated only conformal transformations from flat space to flat space. If more general conformal transformations are considered, the above conclusions may no longer hold.

APPENDIX

The affine connection.


In Riemannian geometry, the affine connection is given by the Christoffel symbol

(A.1)

which transforms under a general relativistic co-ordinate transformation from the frame S to the frame S' as

(A.2)

If the transformation from S to S' were a conformal co-ordinate transformation


so that in place of lwe would deal with

where

It is more convenient in dealing with conformal transformations t o define


the affine connection, T$,in such a way that under a conformal co-ordinate
transformation

where .r$is related to T$ in the same way as are the Cristoffel symbols, (A.2)..
This requires the definition

provided x,, transforms according to

With the affine connection defined in this way, and considering conformal co-ordinate transformations, we are dealing with an example of a Weyl space; in showing the invariance of the equation of motion of charged particles, this affine connection is particularly helpful.

The Clock Paradox in the Relativity Theory


TA-YOU WU and Y. C . LEE
Department of Physics, State University of New York, Buffalo, New York 14214

Received: 11 May 1971

Abstract
A system S' (rocket) starts from rest in an inertial system S, and after a series of accelerated, uniform and decelerated motions, comes back to rest at its initial position in S. An exact calculation is carried out, from the standpoint of S, of the time intervals for the arrivals at S of light signals sent back by S'. From the standpoint of S', S has made a round trip after undergoing a series of free falls in gravitational fields and coasting motions. An exact calculation is carried out for the 'proper time' intervals in S from the standpoint of S'. It is shown that there is exact agreement between S and S' in their reckonings of the total time intervals for the two frames, namely, both S and S' agree quantitatively that the time interval is longer for S than for S'.
The accelerated motion of S' relative to S explicitly used in the treatment of the problem in the present work is that under time-independent field and subject to the condition of local Lorentz contraction and dilation; the resulting motion turns out to be that obtained earlier by Møller on entirely different considerations. The result of the present treatment is, however, more general than this particular motion seems to imply, since by an arbitrary coordinate transformation, it can be made to include an infinite number of accelerated frames including time-dependent fields, all within the framework of flat space-time. General remarks are given for the clock problem in the general theory of relativity in the sense of Einstein's curved space.

1. The Clock Paradox


The question concerned is the following: Imagine a pair of clocks, one of which remains at rest in an inertial frame, and the other sets out on a trip (on a rocket, say), and after a time returns to rest in the inertial frame. Will the travelling clock be slower than the one at home? Will they both agree exactly by how much one is slower than the other?
This problem is sixty years old. In a paper in 1911, Einstein (1911) gave a simple theory in which (1) he employed the Doppler effect formula of the special theory of relativity and obtained the effect of uniform acceleration of a reference frame on the Doppler shift, and (2) he introduced the equivalence principle for the acceleration of a frame and a gravitational field. Einstein concluded that a clock that has travelled, say in a circular path, will 'lose time', because the rate of the clock is slower in the accelerated motion.
Copyright © 1972 Plenum Publishing Company Limited. No part of this publication may be reproduced, stored in a retrieval system or transmitted, in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission of Plenum Publishing Company Limited.

That a returned clock should have lost time compared with the one at home is so strange a conclusion that Einstein specifically wrote an article (Einstein, 1918) in 1918, in the form of a dialogue between a critic and himself, to show (1) how the trip will be viewed from the standpoints of both frames, (2) how the reciprocal symmetry (in the sense of the special theory of relativity) will be destroyed in this case by the accelerated motion of the rocket, and that both frames will agree that the returned one will be slow, and (3) that this is due to the loss of time, or the slowing down of the clock, of the rocket during the accelerated portion of the rocket's trip when it turns back.
For definiteness, let us pose the following situation. From the standpoint of the inertial system S, the rocket (or the travelling twin) S' goes through the following sequence of events:

A-B: S' starts off, with acceleration a (in the positive x direction), reaching the velocity v at B.
B-C: S' shuts off its engine and moves with a uniform velocity v relative to S.
C-D: S' starts its engine and decelerates, reducing its velocity relative to S to zero.
D-E: S' keeps its engine on and starts accelerating toward S, reaching the velocity -v at E.
E-F: S' shuts off its engine and moves with constant velocity (-v) toward S.
F-A: S' starts its braking engine, and moves with acceleration -a, coming to rest at A.
(1.1)
From the standpoint of the rocket S' who regards itself as at rest, S will go through the following events:

A'-B': S starts falling in a (universal) gravitational field -g (in the negative x direction), attaining a velocity -v (relative to S') at B'.
B'-C': The gravitational field is removed, and S keeps on moving with the (constant) velocity -v.
C'-D': A (universal) gravitational field g in the positive x-direction is turned on. S comes to a stop (relative to S') at D'.
D'-E': The same field g continues to act, and S falls from D' to E', attaining the velocity v (relative to S').
E'-F': The field g is removed and S moves with the constant velocity v.
F'-A': A gravitational field -g is turned on, and S is brought to rest at A'.
(1.2)

During A'-B', C'-D', D'-E' and F'-A', S' itself is in the same universal gravitational field as S, but S' is held fixed by some external agency.
During the uniform relative motion parts B-C, E-F, S will say that the clock of S' is slow according to the time dilatation relation. If $\Delta\tau'$ is the proper time interval recorded by S' for each of these parts, and $\Delta T$ is the time interval recorded by synchronised clocks attached to the frame S, then

$2\,\Delta T = \dfrac{2\,\Delta\tau'}{\sqrt{1 - \beta^2}}$    (1.3)

But with equal right, S' will say that the clock of S is slow compared with that of S', and if $\Delta\tau$ is the (proper) time interval recorded by S (for each of the uniform relative motion parts) and $\Delta t'$ the time interval recorded by synchronised clocks attached to the frame S', then

(1.4)

Einstein pointed out, however, that during the turning around parts C'-D', D'-E' in (1.2), S is at a higher gravitational potential than S', and the clock of S is faster than that of S'. The clock in S will during C'-D' and D'-E' gain time and more than compensate the loss as given by (1.4). If the time intervals during C'-D', D'-E' are very short compared with those for the uniform motion parts, the gain during C'-D', D'-E' by the clock of S will be such as to bring the total time $\Delta T$ recorded by S (for the trip B'-C', C'-D', D'-E', E'-F') to be longer than that $\Delta\tau'$ recorded by S', in accordance with (1.3), i.e.,

$\Delta T = \dfrac{\Delta\tau'}{\sqrt{1 - \beta^2}}$    (1.5)
The above statement of Einstein has been expressed in explicit form by Tolman (1934). Let us view the trip from the standpoint of S' as in (1.2). Let $\tau_{AB}$, $\tau_{BC}$, $\tau_{CDE}$, $\tau_{EF}$ (= $\tau_{BC}$), $\tau_{FA}$ (= $\tau_{AB}$) be the proper time intervals as recorded by S, and let $t_{BC}$, $t_{CDE}$, $t_{EF}$ (= $t_{BC}$), $t_{FA}$ (= $t_{AB}$) be the time intervals as recorded by synchronised clocks at various points in (1.2) attached to S'.† Then, by (1.4),

† We shall, without causing confusion, drop the prime for B', C', etc. in the subscripts for the τ's and t's.

Let the average distance between S and S' during the turning around portion C'-D'-E' be approximately taken to be $x = v\,t_{BC}$. Since $v = g\,t_{CD}$, the Doppler effect separation gives

$\cong \dfrac{1}{\sqrt{1 - \beta^2}} \times$ time for the trip recorded in S'    (1.9)

This is in approximate agreement with (1.3). It is important to note that the $\cong$ sign in (1.8) arises not so much because of the neglect of $\tau_{AB}$ and $t_{CDE}$ in (1.8) as because of the approximations made in obtaining (1.7).
In 1956, Dingle (1956) in a series of articles renewed the question of whether the returned twin from a rocket trip is younger than his brother who has stayed home. He believed that there should be no difference in their aging, that all earlier conclusions, including Einstein's, are erroneous. His questioning of these earlier works by many physicists has led to a great flux of discussions. Most authors (Arzeliès, 1966) maintain the conclusion of Einstein. In most cases, the arguments amount to the simple statement that since the rocket S' has undergone accelerated and decelerated motions, it is not on equal footing with S which is an inertial system, and hence the reciprocal symmetry in the sense of the special theory of relativity has been removed. This part of the argument is of course correct. But then, because of attempts to simplify the problem for the non-specialist, the following argument is usually put forward: One can make the time intervals for the accelerated and decelerated parts very short compared with the time intervals for the uniform relative motion parts [see (1.1) or (1.2)] and in the limit negligible. Then, since only S is a preferred (in the sense that it is an inertial) frame, one must only employ the relation (1.3). This part of the argument is unfortunately misleading. We have seen in the preceding section from the approximate treatment by Tolman that it is precisely the acceleration (or, an equivalent gravitation field) during the turning around of the rocket that slows down its clock (relative to the inertial frame S), and that one obtains the result (1.9) in an approximation only, which is not exactly the relation (1.3). The point that seems to have been forgotten in many elementary discussions of the clock paradox is that while the 'compensations' in (1.8) and (1.9) must come from the accelerated motions, the correct result [(3.20) and (3.22) in the following] should really be independent of the strength of the accelerating and decelerating field which determine the length of time for these accelerated and decelerated parts. The undue prominence given the uniform relative motion parts (coasting of the rocket) and the consequent appearance of the Lorentz relation (1.3) are unfortunate, for they tend to divert the attention from the accelerated motion, which is essential in the clock paradox problem, to the expression (1.3) for uniform relative motion. In actual fact, the uniform relative motion parts [B-C, E-F, B'-C', E'-F' in (1.1) and (1.2)] are non-essential, and one would have essentially the same 'paradox' if one does away with these (coasting) parts entirely. In the following section, an analysis of the clock problem with, and also without, the uniform relative motion parts in (1.1) and (1.2) will be carried out to illustrate this point.
In the literature, attempts have been made to convince one of the immediate applicability of the expression (1.3) for the time intervals for the whole trip as recorded by S and S' (assuming negligible times for the accelerated parts of the motion) by the following argument. Let there be a set of triplets instead of a pair of twins. Let C stay home (in an inertial frame); let A be moving away in a rocket with velocity v (relative to C). At a certain point in space, A meets B who is travelling toward C with velocity -v (relative to C). A and B do not stop; B just sets his clock according to that of A. B finally passes by C. It is then claimed that the time recorded by C (for the interval between the passing by of A and that of B) is longer than that recorded by B, in accordance with (1.3). This argument is a special case of a general theorem in the special theory of relativity, namely, that in a Minkowski diagram, the time interval measured on a straight line AB (which is the time axis) is longer than the sum of time intervals measured along a series of straight lines AC, CB, ... DB (each being a time axis in another Lorentz frame) which together with AB form a polygon. The implication of this argument is that we make use of the awareness of the accelerations to remove the reciprocal symmetry of the Lorentz frames, but have ignored the effects of the accelerations on the time measures of the systems. On this argument, one might as well contend with two Lorentz frames since the use of a third frame does not add to the resolution of the problem.
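
The Minkowski-diagram theorem invoked in the triplet argument is easy to illustrate numerically. The sketch below uses units with c = 1; the helper `proper_time` and the sample values of β and T are illustrative choices, not taken from the paper. It compares the time along the straight world line of C with the summed times along the two straight legs flown by A and then B.

```python
import math


def proper_time(events):
    """Sum of sqrt(dt^2 - dx^2) over straight segments joining (t, x) events, c = 1."""
    total = 0.0
    for (t1, x1), (t2, x2) in zip(events, events[1:]):
        dt, dx = t2 - t1, x2 - x1
        total += math.sqrt(dt * dt - dx * dx)
    return total


beta = 0.6   # speed of A (outbound) and of B (inbound) relative to C
T = 10.0     # time recorded by C between A's departure and B's arrival

home = [(0.0, 0.0), (T, 0.0)]                                # world line of C
relay = [(0.0, 0.0), (T / 2.0, beta * T / 2.0), (T, 0.0)]    # A out, then B back

tau_home = proper_time(home)
tau_relay = proper_time(relay)
```

The relayed clock reading equals $T\sqrt{1 - \beta^2}$ and is shorter than T, which is the content of (1.3); as the text stresses, this bookkeeping says nothing about what the accelerations do to the clocks.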
2. Arbitrary Motion Relative to an Inertial Frame
We shall study an accelerated motion that can be treated exactly in the clock problem.
Let (X, T) be the space and time coordinates in an inertial frame S and let x, t be those in a frame S' which may be accelerated under the time-independent field. Let v(x, T) be the velocity of a fixed point x in S' relative to S at time T and let X be the space coordinate of the point p so that X = X(x, T) and

$V(X, T) = v(x(X, T), T)$    (2.1)

312 TA-YOU WU AND Y. C . LEE

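The velocity of the point p in S is
$$ u = \left(\frac{\partial X}{\partial T}\right)_x \tag{2.2} $$
If we assume that in S' the unit of length is the same as that in S, then the
condition of local Lorentz contraction is expressed by
$$ \left(\frac{\partial X}{\partial x}\right)_T = \sqrt{1 - \frac{u^2}{c^2}} \tag{2.3} $$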
From this, one obtains the equation

This is equivalent to

in which u is regarded as a function of X,T through the transformation (2.1).


To obtain the relation between the time t and the coordinates X and T,
we shall introduce the conditions of local Lorentz time dilatation and
relativity of motion of S and S', namely,

and

where τ is the proper time in S' in the sense that if the metric in S' is
$$ ds^2 = -(dx^2 + dy^2 + dz^2) + g_{44}\,dt^2 \tag{2.8} $$
then (with dy = dz = 0)
$$ ds^2 = g_{44}\,d\tau^2 \tag{2.9} $$
so that
$$ d\tau = \sqrt{1 - u^2}\;dt \tag{2.10} $$
where
(2.7)

Note that the τ called the proper time above and defined by (2.9) is not the
normal proper time τ₀ defined by dτ₀ = ds, which is related to τ here by
dτ₀ = √(g₄₄) dτ. In the present work, we make use of τ in the calculation
of τ₀.

THE CLOCK PARADOX IN THE RELATIVITY THEORY 313


The conditions (2.3) and (2.6) are valid at a point (X, T) or (x, t). Our
object is to find a class of accelerated frames S' (with respect to S) with the
transformation
$$ x = x(X, T), \qquad t = t(X, T) \tag{2.11} $$
and satisfying (2.3) and (2.6). The hypothesis that a transformation (2.11)
exists between the (X, T) in an inertial frame and the (x, t) coordinates
implies that the space is Euclidean. In this case, we can integrate the
differential equations (2.5) and (2.6) since no curvature of space is involved.
Equation (2.5) can be solved by the method of separation of variables,
namely, by setting
$$ u = \frac{\phi(T)}{\psi(X)} $$
which leads to
$$ u = \frac{AT + C_1}{(AX/c^2) + C_2}, $$
A, C₁, C₂ being constants.

If the initial condition is
$$ v \to aT \quad \text{at } X = 0 \text{ as } T \to 0 $$
then
$$ v = \frac{aT}{1 + (aX/c^2)} \tag{2.12} $$
Putting this into (2.2), one obtains the equation of motion of the point p
(x = 0 in S'),
$$ \frac{dX}{dT} = \frac{aT}{1 + (aX/c^2)} \tag{2.13} $$
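As a hedged consistency check on the velocity field of (2.12): if one takes the velocity to be (∂X/∂T) at fixed x and the contraction condition to have the form (∂X/∂x) at fixed T = √(1 − u²/c²), then equality of the mixed second derivatives of X(x, T) implies ∂u/∂X + (u/c²) ∂u/∂T = 0. That integrability condition is derived here for illustration and is not copied from the source. The sketch below evaluates its residual for (2.12) by central differences; the sample points and the function names are arbitrary choices.

```python
def u(X, T, a=1.0, c=1.0):
    """Velocity field of (2.12)."""
    return a * T / (1.0 + a * X / c**2)

def residual(X, T, a=1.0, c=1.0, h=1e-6):
    """Central-difference value of du/dX + (u / c^2) du/dT, which should vanish."""
    du_dX = (u(X + h, T, a, c) - u(X - h, T, a, c)) / (2 * h)
    du_dT = (u(X, T + h, a, c) - u(X, T - h, a, c)) / (2 * h)
    return du_dX + u(X, T, a, c) / c**2 * du_dT

# Arbitrary sample points (X, T) with 1 + aX/c^2 > 0.
samples = [(0.0, 0.1), (0.5, 0.3), (2.0, 1.0), (1.0, -0.4)]
residuals = [residual(X, T) for X, T in samples]
print(residuals)
```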
To obtain the relationship between X and x, we may first integrate (2.13)
to obtain
$$ \left(1 + \frac{aX}{c^2}\right)^2 - \left(\frac{aT}{c}\right)^2 = f(x) \tag{2.14} $$
where f(x) is an arbitrary function of x. On the other hand we can also
integrate the Lorentz contraction equation (2.3) in which v is given by
(2.12). The integration yields
$$ \left(1 + \frac{aX}{c^2}\right)^2 - \left(\frac{aT}{c}\right)^2 = \left[\frac{ax}{c^2} + b(T)\right]^2 \tag{2.14'} $$
where b(T) is an arbitrary function of T. However, by comparing the above
two equations (2.14) and (2.14') we see that b(T) must actually be a pure
constant b, independent of T, and f(x) is just [(ax/c²) + b]². Our initial
conditions then require b = 1, so that we have
$$ \left(1 + \frac{aX}{c^2}\right)^2 = \left(1 + \frac{ax}{c^2}\right)^2 + \left(\frac{aT}{c}\right)^2 \tag{2.15} $$
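The result (2.15) can be checked against the two conditions it was integrated from. The sketch below is illustrative only: it solves (2.15) for X on the branch 1 + aX/c² > 0, then uses central differences to compare dX/dT at fixed x with the velocity aT/(1 + aX/c²) of (2.13), and ∂X/∂x at fixed T with a contraction factor assumed to have the form √(1 − u²/c²). The function names and sample values are hypothetical.

```python
import math

def X_of(x, T, a=1.0, c=1.0):
    """Solve (2.15) for X: (1 + aX/c^2)^2 = (1 + ax/c^2)^2 + (aT/c)^2."""
    root = math.sqrt((1.0 + a * x / c**2) ** 2 + (a * T / c) ** 2)
    return (root - 1.0) * c**2 / a

def velocity(X, T, a=1.0, c=1.0):
    """Velocity of the point relative to S, as in (2.13)."""
    return a * T / (1.0 + a * X / c**2)

def check(x, T, a=1.0, c=1.0, h=1e-6):
    """Return the two residuals: equation of motion and assumed contraction condition."""
    X = X_of(x, T, a, c)
    u = velocity(X, T, a, c)
    dX_dT = (X_of(x, T + h, a, c) - X_of(x, T - h, a, c)) / (2 * h)
    dX_dx = (X_of(x + h, T, a, c) - X_of(x - h, T, a, c)) / (2 * h)
    return dX_dT - u, dX_dx - math.sqrt(1.0 - (u / c) ** 2)

for point in [(0.0, 0.5), (0.3, 1.2), (-0.4, 0.7)]:
    print(point, check(*point))
```

Both residuals vanish to within the finite-difference error, provided x > −c²/a so that the positive square root is the relevant branch.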

From (2.7), (2.13) and (2.15) we obtain
$$ \frac{aT/c}{1 + (aX/c^2)} = \frac{1}{\sqrt{g_{44}}}\,\frac{aT/c}{[1 + (ax/c^2)]}\,\frac{dT}{dt} $$
From Equations (2.6), (2.7), (2.8) and the assumption that g₄₄ does not
depend on t, one can show that
$$ g_{44} = \left(1 + \frac{ax}{c^2}\right)^2 \tag{2.16} $$

The above equation in dT/dt can be integrated, and with the initial condition
t = 0 when T = 0 we obtain
$$ \frac{2at}{c} = \ln\left(1 + \frac{aX}{c^2} + \frac{aT}{c}\right) - \ln\left(1 + \frac{aX}{c^2} - \frac{aT}{c}\right) \tag{2.17} $$

Using (2.15) in (2.17), we obtain
$$ \frac{aT/c}{\sqrt{[1 + (ax/c^2)]^2 + (aT/c)^2}} = \tanh\frac{at}{c} \tag{2.18} $$
and (2.7) becomes
$$ \frac{1}{c}\left(\frac{\partial X}{\partial T}\right)_x = \frac{aT/c}{1 + (aX/c^2)} = \tanh\frac{at}{c} \tag{2.18a} $$
In (2.18a), since x is held fixed, t is also the proper time τ in S'.


Equations (2.15) and (2.17) now transform the metric
$$ ds^2 = -dX^2 + dT^2 \tag{2.19} $$
into
$$ ds^2 = -dx^2 + \left(1 + \frac{ax}{c^2}\right)^2 dt^2 \tag{2.20} $$
In (2.20), for a given constant a, x is restricted to the region x > −(c²/a).
It is seen from (2.13) and (2.15) that the limiting value x = −(c²/a)
corresponds to u = c, beyond which u should not pass.
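For convenience, we write three consequences of equations (2.15) and
(2.17), namely,
$$ \frac{aT}{c} = \left(1 + \frac{aX}{c^2}\right)\tanh\frac{at}{c} \tag{2.21} $$
$$ \frac{aT}{c} = \left(1 + \frac{ax}{c^2}\right)\sinh\frac{at}{c} \tag{2.22} $$
$$ 1 + \frac{aX}{c^2} = \left(1 + \frac{ax}{c^2}\right)\cosh\frac{at}{c} \tag{2.23} $$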
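The relations (2.15), (2.17) and (2.21)-(2.23) can be exercised as an explicit coordinate map. The sketch below is an illustration under stated assumptions, not the authors' procedure: `to_inertial` uses the sinh and cosh forms, `to_accelerated` inverts them through the square-root and logarithm forms, and `interval_mismatch` compares −dX² + dT² with −dx² + (1 + ax)² dt² for a small step, taking c = 1 in the metric check. All function names are hypothetical.

```python
import math

def to_inertial(x, t, a=1.0, c=1.0):
    """Map accelerated-frame (x, t) to inertial (X, T) via the sinh/cosh relations."""
    k = 1.0 + a * x / c**2
    T = (c / a) * k * math.sinh(a * t / c)
    X = (c**2 / a) * (k * math.cosh(a * t / c) - 1.0)
    return X, T

def to_accelerated(X, T, a=1.0, c=1.0):
    """Inverse map: x from the hyperbola relation, t from the logarithmic relation."""
    K = 1.0 + a * X / c**2
    x = (c**2 / a) * (math.sqrt(K**2 - (a * T / c) ** 2) - 1.0)
    t = (c / (2.0 * a)) * math.log((K + a * T / c) / (K - a * T / c))
    return x, t

def interval_mismatch(x, t, dx, dt, a=1.0):
    """Difference between the flat and accelerated-frame intervals for a small step (c = 1)."""
    X0, T0 = to_inertial(x, t, a)
    X1, T1 = to_inertial(x + dx, t + dt, a)
    flat = -(X1 - X0) ** 2 + (T1 - T0) ** 2
    accel = -dx**2 + (1.0 + a * x) ** 2 * dt**2
    return flat - accel

X, T = to_inertial(0.3, 0.8)
print(X, T, to_accelerated(X, T))
```

The mismatch returned by `interval_mismatch` is of third order in the step, so it shrinks relative to the interval itself as the step is reduced.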


Equation (2.15) describes the motion of a fixed point p in S' from the
standpoint of S. It is the so-called hyperbolic motion.
Equation (2.23) describes the motion of a fixed point p in S from the
standpoint of S'.
If S' is accelerated along the −X direction, we have only to replace a in
all the equations (2.12)-(2.23) by −a, and obtain

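$$ \left[1 - \frac{a(x - x_0)}{c^2}\right]^2 = \left[1 - \frac{a(X - X_0)}{c^2}\right]^2 - \left[\frac{a(T - T_0)}{c}\right]^2 \tag{2.24} $$
$$ \frac{a(T - T_0)}{c} = \left[1 - \frac{a(X - X_0)}{c^2}\right]\tanh\frac{a(t - t_0)}{c} \tag{2.25} $$
$$ \frac{a(T - T_0)}{c} = \left[1 - \frac{a(x - x_0)}{c^2}\right]\sinh\frac{a(t - t_0)}{c} \tag{2.26} $$
where x₀, X₀, T₀, t₀ are constants to be determined by the appropriate
initial conditions. In this case, equations (2.7) and (2.13) become
$$ \frac{dX}{dT} = -\frac{a(T - T_0)}{1 - [a(X - X_0)/c^2]} \tag{2.28} $$
$$ u = -\frac{1}{1 - [a(x - x_0)/c^2]}\left(\frac{\partial x}{\partial t}\right)_X = \left(\frac{\partial X}{\partial T}\right)_x = -\tanh a(t - t_0) \tag{2.28a} $$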

At this point, it is of considerable interest to note that the transformation
(2.15) and (2.17) are precisely that derived by Møller (1943) on completely
different considerations. Møller starts from a static metric assumed to be
$$ ds^2 = -(dx^2 + dy^2 + dz^2) + g_{44}\,dt^2 \tag{2.8} $$
where g₄₄ = g₄₄(x). g₄₄ is determined by the Einstein equations $R_{\mu\nu} = 0$,
which lead to g₄₄ = [1 + (gx/c²)]², g being a constant. The curvature tensor
for this metric vanishes, showing the space to be Euclidean. With the
help of the equations of the geodesic, the transformation (2.15) and (2.17)
is found. We have arrived at (2.15) and (2.17) on the basis of the local
Lorentz transformation properties (2.3), (2.6), (2.7) with the assumption
that in (2.7), g₄₄ is a function of x but not of t.

3. Resolution of the Clock Paradox


We shall now study the clock problem as stated in (1.1) and (1.2) of
Section 1, by treating the accelerated parts of the trip A-B, C-D-E, F-A,
and A'-B', C'-D'-E', F'-A' by means of the accelerated motion described
by equations (2.9)-(2.24) and (2.25)-(2.28) of the preceding section.

(A) From the Standpoint of S


Referring to the figure in (1.1), let the origins of the coordinate systems
S(X) and S'(x) be coincident at T = t = 0.

The rocket S' (x = 0) moves according to (2.9), (2.7), (2.18)-(2.18a),
(2.21)-(2.23). Part A-B: For the motion of the point x = 0 (fixed in S'), the
t in (2.18a), (2.22), (2.23) becomes the proper time τ₁ in S' (i.e., the time
registered by one and the same clock at x = 0 fixed in S'). When x = 0
reaches the velocity v₀ (relative to the S frame) we have†
$$ \tanh a\tau_1 = v_0 \tag{3.2} $$

where T₁ is the time measured by the synchronised clocks attached to S, and
X₁ is the distance traversed by S' (x = 0) when it has reached the velocity v₀
(i.e., the part A-B). In the following, the subscripts 1, 2, 3, ... refer to the
parts A-B, B-C, C-D, etc. respectively of the trip. The same subscripts
1, 2, 3, ... are also used for the A'-B', B'-C', C'-D', etc. in the following
section from the standpoint of S'.
To obtain the time interval ΔT₁ recorded by one clock fixed at X = 0 in S,
let S' send light signals back to S. Let τ be the proper time in S'. Then‡

(3.5)

where by (3.2) aΔτ₁ = tanh⁻¹ v₀. Thus
† In the following, we simplify writing by choosing the unit of time such that c = 1. All
time, velocity, acceleration T, t, u, v₀, a are to be replaced by cT, ct, u/c, v₀/c, a/c², to
convert to c.g.s. units.
‡ ΔT₁ only represents the time interval for the clock at X = 0 to intercept all the light
signals sent to it by the clock attached to x = 0 within Δτ₁. It does not really represent the
time of travel of the rocket (x = 0) from A to B as recorded by the clock at X = 0. The sum
in (3.10) is, however, the total time interval for the whole trip of S', as recorded by one
and the same clock at X = 0 in S.


From the symmetry of the situation, it is clear that for the part C-D,

and for the parts D-E, F-A,

(3.7)

For the parts B-C, E-F, if Δτ₂ = Δτ₅ is the proper time interval in S', the
sum of the intervals for the arrivals of the signals sent back by S' during
these intervals is
$$ \Delta T_2 + \Delta T_5 = \Delta\tau_2\sqrt{\frac{1 + v_0}{1 - v_0}} + \Delta\tau_5\sqrt{\frac{1 - v_0}{1 + v_0}} $$
Thus the total interval recorded by one single clock in S (at rest at X = 0)
from all signals sent back by S' during its round trip is

(3.10)

The proper time (recorded by one clock) in S' for the whole round trip is
$$ \Delta\tau = 4\Delta\tau_1 + 2\Delta\tau_2 = \frac{4}{a}\tanh^{-1} v_0 + 2\Delta\tau_2 \tag{3.11} $$
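The signal-counting argument of this subsection can be sketched numerically. The code below is an illustration under stated assumptions, with c = 1: the rocket emits signals at a uniform rate in its own proper time, the home clock receives them stretched or compressed by the Doppler factor e^η, where η is the rocket's rapidity counted positive when receding, and the six legs are summed. The closed form in `closed_form_home` is derived here from hyperbolic motion (sinh and cosh of the final rapidity) and is an assumption about the total, not a transcription of the source's displayed result. All names and sample values are hypothetical.

```python
import math

def doppler_sum(v0, a, dtau2, steps=20000):
    """Sum the reception intervals at X = 0 for signals emitted by the rocket (c = 1)."""
    theta = math.atanh(v0)      # rapidity reached at the end of each acceleration leg
    dtau1 = theta / a           # rocket proper time spent on each acceleration leg
    h = dtau1 / steps
    total = 0.0
    # Signed rapidity on the four accelerated legs (positive = receding from home).
    legs = [
        lambda s: a * s,              # A-B: speeding up, receding
        lambda s: theta - a * s,      # C-D: slowing down, still receding
        lambda s: -a * s,             # D-E: speeding up toward home
        lambda s: -(theta - a * s),   # F-A: slowing down, approaching
    ]
    for rapidity in legs:
        for i in range(steps):
            s = (i + 0.5) * h         # midpoint rule in rocket proper time
            total += math.exp(rapidity(s)) * h
    # Coasting legs B-C (receding) and E-F (approaching), each of proper duration dtau2.
    total += dtau2 * (math.exp(theta) + math.exp(-theta))
    return total

def closed_form_home(v0, a, dtau2):
    """Home-clock total derived from hyperbolic motion: (4/a) sinh(theta) + 2 dtau2 cosh(theta)."""
    g = 1.0 / math.sqrt(1.0 - v0 * v0)
    return 4.0 * v0 * g / a + 2.0 * dtau2 * g

def rocket_proper_time(v0, a, dtau2):
    """Rocket-clock total, as in (3.11)."""
    return 4.0 * math.atanh(v0) / a + 2.0 * dtau2

v0, a, dtau2 = 0.8, 1.0, 3.0   # hypothetical sample values
print(doppler_sum(v0, a, dtau2), closed_form_home(v0, a, dtau2), rocket_proper_time(v0, a, dtau2))
```

With these sample values the home clock records the longer interval, and the accelerated legs contribute a finite share of the difference rather than a negligible one.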

(B) From the Standpoint of S'

The fixed point X = 0 in S moves in the negative x direction.

[Figure (3.12)]

Part A'-B'. Let ΔT₁ be the proper time interval (registered by a clock at
X = 0 in S) for X = 0 to reach the velocity −v₀ (relative to S'). From (2.18a),
$$ \tanh a t_1 = \tanh a\tau_1 = v_0 \tag{3.13} $$

and the distance traversed by X = 0 during this interval is given by (2.23)
(with X = 0) and (2.18) with tanh at₁ = v₀, i.e.,
$$ 1 + a x_1 = \frac{1}{\cosh a t_1} = \sqrt{1 - v_0^2} \tag{3.14} $$
Parts B'-C' and E'-F'. Since the time interval for B-C or E-F in S' is
Δτ₂ in (3.8), S' would have deduced from the special theory of relativity
that the combined intervals would have added to the proper time intervals
of S the value 2ΔT₂, which can be calculated as follows. Let dT be an element
of proper time in S. For the combined B'-C' and E'-F', light signals sent
by S back to S' will reach S' in interval

This is now the time interval recorded by the clock in S', and is hence the
proper time interval 2Δτ₂, i.e.,
$$ 2\Delta T_2 = \sqrt{1 - v_0^2}\;2\Delta\tau_2 \tag{3.15} $$
Part C'-D'. During C'-D' and D'-E', S' would describe S as being acted
on by a gravitational field a in the positive x-direction, the motion being
described by equations (2.24)-(2.27). Equation (2.24) is
$$ [1 - a(x - x_0)]^2 = [1 - a(X - X_0)]^2 - a^2 (T - T_0)^2 \tag{2.24} $$
The constants x₀, X₀, T₀ are determined as follows. From the standpoint of
S [see (3.1), (3.2) and (3.8)], at C,
$$ aT = \frac{v_0}{\sqrt{1 - v_0^2}} + \frac{a\,\Delta\tau_2}{\sqrt{1 - v_0^2}}, \qquad u = v_0 \tag{3.16} $$
and (2.28) leads to

At D,

u = 0 and

[see (3.4)]. These lead to

(3.17)


From (2.28a),
$$ \tanh a(t - t_0) = -u $$
the condition u = v₀ when t = Δτ₁ + Δτ₂ leads to
$$ t_0 = 2\Delta\tau_1 + \Delta\tau_2 \tag{3.18} $$
Thus, equation (2.25) corresponding to the case (2.24) is now
$$ \cdots \times \tanh a(t - 2\Delta\tau_1 - \Delta\tau_2) \tag{3.19} $$
At D, u = tanh a(t − 2Δτ₁ − Δτ₂) = 0 and the point X = 0 of S has the time
$$ T = \Delta T_{A'\text{-}B'\text{-}C'\text{-}D'} \tag{3.20} $$
From the symmetry of the situation, it is clear that the time in S for the trip
A'-B'-C'-D'-E'-F'-A' as seen (or, calculated) from the standpoint of S' is
twice the ΔT in (3.20), i.e.,
Total time ΔT in S =
(3.21)

which can be written
$$ = 2\left(\Delta T_{A'\text{-}B'} + \Delta T_{B'\text{-}C'} + \Delta T_{C'\text{-}D'}\right) $$
or,

(3.22)

the proper time in
$$ S' = \frac{4}{a}\tanh^{-1} v_0 + 2\Delta\tau_2 \tag{3.23} $$
Equation (3.21) shows complete agreement with the value given in (3.10)
obtained from the standpoint of S. Equation (3.22) shows that the loss of
time during B'-C' and E'-F' by S in the view of S' [i.e., 2√(1 − v₀²) Δτ₂
in (3.15) compared with 2Δτ₂/√(1 − v₀²) in (3.9)] is more than made up by
the gain in time by the S clock during C'-D', D'-E' when S is at a higher
equivalent gravitational potential than S' [see (3.12)], and the clock of S
is faster on account of the factor g₄₄ = (1 − gx)² in
$$ ds^2 = -dx^2 + (1 - gx)^2\,dt^2 $$
(with x = 0 at A and ax = 2[√(1 − v₀²) − 1] − v₀ aΔτ₂ at D). The smaller
loss of time of the clock of S during A'-B' (when S is at a lower gravitational
potential, ds² = −dx² + (1 + gx)² dt²) is also more than made up by

the gain during C'-D'. The total result is to bring the two reckonings of
the proper time intervals, by S and S', of the round trip into exact agreement
with each other.†
It is seen from the foregoing results that all the calculations are exact, and
no approximations involving the assumption of making the accelerated
parts A-B, C-D-E, F-A (or A'-B', C'-D'-E', F'-A') very short compared
with the uniform relative motion parts B-C, E-F (or B'-C', E'-F') have
been made. In fact, as emphasised by Einstein as early as in the 1918 paper
and brought out approximately by Tolman (1934) and exactly in (3.22)
above, it is precisely the accelerated parts that resolve the paradox. Had
one literally neglected the accelerated parts, (3.10) and (3.22) would have
become
$$ \text{Total time in } S \text{ (as reckoned by } S) = \frac{2\,\Delta\tau_2}{\sqrt{1 - v_0^2}} $$
$$ \text{Total time in } S \text{ (as reckoned by } S') = 2\sqrt{1 - v_0^2}\;\Delta\tau_2 $$
On the other hand, had one done away entirely with the uniform relative
motion (coasting of rocket) parts B-C, E-F (B'-C', E'-F'), the results
(3.10) and (3.22) would have become:

Standpoint of S Standpoint of S

Total proper time in S -4 uo


- UO]
a d ( 1 - EO2) 4

† The results (3.10), (3.11), (3.22), (3.23) above are a little more complete than those of
Møller (1943) in that here S' starts out from rest and comes back at rest to S. There are
differences in details between this and Møller's work. For example, we calculate the time
intervals ΔT₁, ΔT₃, ΔT₄, ΔT₆ in (3.5)-(3.8) as recorded by one clock in S, and not Møller's
times T, T' which are not the proper times of one clock. Also, as remarked in Section 2
above, the starting points in the two works are different. In an application of Møller's
work, Fock (1959) has obtained an erroneous conclusion.
Fock (1959) states that the time intervals recorded by the clocks A, B in S, S' are
given by
V=
74 - T B = - (*T- 3f)
CZ

where t = 2v/g is the time for the turning around part (C-D-E in (1.8) in the present
article), and T = uniformly moving part (B-C) + (E-F) + t. Thus τ_A − τ_B can be ≥ 0,
in disagreement with the results of everyone else. This strange result arises from the error
of the ± sign in (62.09), which should have read U = U₀ − g(x₁ − x). When this correction
is made, one would have
v2T
71 - 7 B = --
c14
which is in agreement with the approximate result of (1.8) of Tolman and others.


There is, of course, exact agreement between the reckonings of the total
proper time intervals from the standpoints of both frames.
In the calculations above, we had employed the accelerated motion
represented by equations (2.15) and (2.17) [or, (2.24) and (2.25)] which
correspond to motion under a time-independent field. The result, however,
is in fact quite general since from S'(x, t), one can carry out any arbitrary
coordinate transformation to a frame S'',
$$ x'' = x''(x, t), \qquad t'' = t''(x, t) $$
which will lead in general to

where the $g_{ij}$ are functions of x'' and t'' and hence no longer static. The
space is, however, Euclidean. The motions of S'' relative to S can be quite
arbitrary and very complicated, but the description can be reduced to that
of S' by the transformation above so that in a sense the treatment of the
clock problem by means of S' has covered a whole (infinite number) class
of accelerated motions relative to S. This class of accelerated motions has
not brought in any curved space properties in the sense of Einstein's general
theory of relativity.
The present work has thus treated and resolved the clock problem
without having really made recourse to Einstein's theory of gravitation
involving curved space. This is worth noting in view of the usual statement
in the literature that an exact treatment of the clock problem (i.e., to all
orders of v₀/c) calls for the general theory of relativity.

4. General Remarks on the Clock Paradox Problem


We are now in a position to summarise what we believe is relevant in the
clock problem in the relativity theory.
(1) Invariance of proper time under coordinate transformations. In the
theory of relativity (special and general),
$$ ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu = \text{invariant} \tag{4.1} $$
in each group of coordinate transformations (the group in flat space-time
which includes the Lorentz group, and the general group in curved
space-time). Thus for a given world line C between two world points P₁ and
P₂, the proper time interval
$$ \Delta\tau = \int_{P_1}^{P_2} ds \ \text{ along } C = \text{invariant} \tag{4.2} $$
i.e., has the same value in all frames satisfying (4.1).

(2) Proper time intervals between two world-points along different
world lines.

Consider a given field $g_{\mu\nu} = g_{\mu\nu}(x^1, x^2, x^3, x^4)$. The motion of a particle
from one world point P₁ to another P₂ is uniquely given by the geodesic C
$$ \delta \int_1^2 ds = 0 $$
Other paths C₁, C₂ joining P₁ and P₂ will not correspond to the free motion
in the field $g_{\mu\nu}$, but will correspond to motions under agencies other than
the field represented by $g_{\mu\nu}$, and
$$ \int_{1\,(C_1)}^{2} ds \neq \int_{1\,(C)}^{2} ds \tag{4.3} $$
which follows from the definition of the geodesic.


(3) For a given field $g_{\mu\nu}$, between two arbitrarily given points P₁ and P₂,
there is one and only one geodesic.
From a point P₁, there are ∞³ geodesics issuing in all directions. Of these,
one, say C, goes through P₂. Suppose another, say C₁, makes an angle θ with
C at P₁. Since a geodesic is a line generated by an infinitesimal vector in a
continuous series of infinitesimal parallel displacements, and since the
angle between two vectors is invariant under parallel displacements, it
follows that in general two geodesics from a given point P₁ cannot intersect
at another arbitrarily chosen point P₂. (In the case of a spherical surface,
geodesics from a point P intersect at the antipode of P only.)
(4) The clock paradox. Let S and S' be two (material) frames whose
coordinates transform according to (4.1). Let us follow Einstein's argument
in Section 1, namely, from the standpoint of S' (the rocket), S undergoes a
series of free falls in certain universal gravitation fields during A'B', C'D',
D'E', F'A' and coasting B'C', E'F' in (1.2). Suppose these fields are
represented by a field $g_{\mu\nu}$. From the standpoint of S', the frame S passes from
the initial point P₁ to P₂ (in 4-space) along the geodesic of $g_{\mu\nu}$, but S' itself
passes from P₁ to P₂ along a pure-time trajectory since S' has been held
fixed (at rest) by means of some external agency. Thus the world line of S'
is not a geodesic of $g_{\mu\nu}$. In general, the proper time intervals along the two
world lines between P₁ and P₂ are different, according to (4.3).
The above result is general, holding for curved space as well as for flat
space. It is possible to make an explicit and exact calculation of the proper
time intervals in the flat space case, using the spirit of the general theory of
relativity (acceleration represented by a $g_{\mu\nu}$ field). This has been done by
Møller (1943), and in the present work (Sections 2 and 3).
(5) Let S(X, Y, Z, T) be a strictly inertial frame, i.e., a frame in flat
space-time, and S'(x, y, z, t) be a frame in a curved space-time (i.e., in a
gravitational field in Einstein's theory). Then there is no coordinate
transformation which transforms ds² = −(dX² + dY² + dZ²) + dT² into
$ds^2 = \sum g_{\mu\nu}\,dx_\mu dx_\nu$ with a curvature tensor that does not vanish. In this
case there is no invariant ds² and there is no exact (only approximate)
connection between the space-time description in S and that in S'. One can no
longer compare dτ₀ in S and the dτ in S', and the clock paradox does not have
any clear and exact meaning.

References
Arzelies, H. (1966). Relativistic Kinematics. Pergamon Press. Contains an extensive
bibliography on the clock problem.
Dingle, H. (1956). Nature, London, 177, 782.
Dingle, H. (1957a). Nature, London, 179, 1242.
Dingle, H. (1957b). Nature, London, 180, 499, 1275.
Darwin, C. G. (1957). Nature, London, 180, 976.
Einstein, A. (1911). Annalen der Physik, 35, 898. Translated and contained in Einstein et
al., The Principle of Relativity. Dover Publ. Inc., New York.
Einstein, A. (1918). Naturwissenschaften, 6, 697. An exposition of the relativity theory,
and of the clock paradox, in the form of a dialogue.
Fock, V. (1959). The Theory of Space, Time and Gravitation, Section 62, p. 214, eq. (62.16).
Pergamon Press.
McCrea, W. H. (1956). Nature, London, 177, 783.
McMillan, E. M. (1957). Science, New York, 126, 381.
Møller, C. (1943). Danske Vid. Sel. Mat-Fys. Medd. XX, No. 19.
Tolman, R. C. (1934). Relativity, Thermodynamics and Cosmology. Oxford University
Press.

IL NUOVO CIMENTO VOL. 112B, N. 4 April 1997

Four-dimensional symmetry of taiji relativity and
coordinate transformations based on a weaker postulate
for the speed of light. - II

Jong-Ping Hsu (1) and Leonardo Hsu (2)
(1) Physics Department, University of Massachusetts Dartmouth
North Dartmouth, MA 02747, USA
(2) Physics Department, University of California at Berkeley - Berkeley, CA 94720, USA

(received 15 May 1996; approved 9 July 1996)

Summary. - Extended relativity is a theory of four-dimensional symmetry with
Reichenbach's time, in which only the 2-way speed of light is a universal constant. It
includes special relativity as a special case. The theory is shown to be consistent with
experiments such as Fizeau's experiment, aberration of light and precision Doppler
shifts. The formulations of classical and quantum electrodynamics are discussed.
They are shown to be dependent on the four-dimensional symmetry rather than on
the usual constant one-way speed of light. The four-dimensional symmetry also
dictates a new coordinate transformation, called the Wu transformation, for
constant-linear-acceleration frames.
PACS 03.30 - Special relativity.
PACS 11.30.Cp - Lorentz and Poincaré invariance.

We continue to demonstrate that the four-dimensional symmetry is necessary and
essential [1] for discussing physical laws from Reichenbach's viewpoint of time [2] or
Edwards' weaker postulate for the speed of light [3]. Edwards attempted in 1963 to
formulate a relativity theory based on a weaker postulate that the 2-way speed of light
in a vacuum is a universal constant. He derived space and time transformations which
involve Reichenbach's time but which do not form a four-dimensional Lorentz group in
general. As a result, it leads to an incorrect expression for the relativistic
energy-momentum of a particle in the Lagrangian formalism of mechanics and
electrodynamics, as shown in paper I [1]. Furthermore, it appears to be impossible to
obtain invariant forms of Maxwell equations and the Dirac equation if Reichenbach's
time is used as an evolution variable, as we shall see later. The reason is that
Reichenbach's time does not transform covariantly as the zeroth component of the
coordinate 4-vector in general. Therefore, the lack of four-dimensional symmetry
makes Edwards' original transformations untenable.
Recently, we have formulated and discussed taiji relativity [4] based solely on the
first postulate of relativity, i.e. the invariance of physical laws. A four-dimensional
transformation between any two inertial frames, F(w, x, y, z) and F'(w', x', y', z'),
is derived. Since taiji relativity does not make any assumption regarding the speed of
light, the zeroth components w and w' cannot be factored into a well-defined speed of
light and time. In fact, the speed of light and usual time (measured in seconds) are
unspecified and undefined in the theory. However, the theory of taiji relativity
possesses the four-dimensional symmetry which is shown to be the only essential
ingredient for the theory to be consistent with previous experiments. This sheds light
on the difficulty encountered by Edwards' transformations.
In paper I, we show that, guided by the four-dimensional symmetry of taiji
relativity, Reichenbach's general convention of time (or, equivalently, the universal
2-way speed of light) can be used as the second postulate for the construction of a
new four-dimensional formalism of coordinate transformation which is termed
extended relativity. The second postulate is necessary to factorize, say, w into a
well-defined velocity function b (called "ligh") and Reichenbach's time t in the F
frame, i.e. w = bt, which is called "lightime". (See eqs. (2.1)-(2.3) in sect. 2.) It turns
out that the lightime w, rather than Reichenbach's time, plays the role of evolution
variable in physical laws and makes extended relativity consistent with established
energy-momentum of a particle, the Lorentz group, etc. Furthermore, the covariant
lightime embedded in the four-dimensional symmetry is also crucial for the formulation
of a covariant quantum electrodynamics (QED) based on extended relativity, as we
shall see in sect. 6.

7. - Limiting four-dimensional symmetry and accelerated Wu transformation

One may wonder whether the power of four-dimensional symmetry is strong
enough to say something about constant-linear-acceleration (CLA) frames. The
attempt to generalize the coordinate transformation for inertial frames to that for CLA
frames through a symmetry consideration is very natural because the transformation
for a CLA frame must reduce to that for an inertial frame in the limit of zero
acceleration. So far, no satisfactory transformation for such non-inertial frames has
been obtained in the literature, even though one has general relativity and the
correspondence principle [6]. All those accelerated transformations discussed
previously are not based on a symmetry principle and do not naturally reduce to the
four-dimensional transformation for an inertial frame when the acceleration
approaches zero.
By a stroke of luck, we have found a transformation for CLA frames which does
reduce to the correct four-dimensional transformation in the limit of zero acceleration
and reduces to the Galilean accelerated transformation when the velocity is small. For
simplicity of notation and calculations in an accelerated frame, let us denote a CLA
frame by F(w, x, y, z) and an inertial frame by $F_I(w_I, x_I, y_I, z_I)$. Suppose a CLA
frame F(w, x, y, z) is moving with a constant acceleration α, so that its velocity is
β = αw + β₀, along the +x axis. We find that the accelerated transformation between
$F_I$ and F is given by

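$$ w_I = \gamma\beta\left(x + \frac{1}{\alpha\gamma_0^2}\right) - \frac{\beta_0}{\alpha\gamma_0}, \quad x_I = \gamma\left(x + \frac{1}{\alpha\gamma_0^2}\right) - \frac{1}{\alpha\gamma_0}, \quad y_I = y, \quad z_I = z; \tag{7.1} $$
$$ \beta = \alpha w + \beta_0, \quad \gamma = \frac{1}{(1 - \beta^2)^{1/2}}, \quad \gamma_0 = \frac{1}{(1 - \beta_0^2)^{1/2}}, $$
which will be called the Wu transformation. If one wishes, one may define $w_I = c\,t_I$,
where $t_I$ is the usual Einstein time, in (7.1) for easy comparison with special relativity.
(But this definition is not necessary for deriving experimental results.) The inverse Wu
transformation of (7.1) is
$$ w = \frac{1}{\alpha}\,\frac{w_I + \beta_0/(\alpha\gamma_0)}{x_I + 1/(\alpha\gamma_0)} - \frac{\beta_0}{\alpha}, \quad x = \left[\left(x_I + \frac{1}{\alpha\gamma_0}\right)^2 - \left(w_I + \frac{\beta_0}{\alpha\gamma_0}\right)^2\right]^{1/2} - \frac{1}{\alpha\gamma_0^2}, \quad y = y_I, \quad z = z_I. \tag{7.2} $$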

One can verify that (7.1) and (7.2) reduce to four-dimensional transformations of the
form (2.2) in the limit of zero acceleration a. (See appendix.) We may remark that the
coordinate transformation between two CLA frames can be derived on the basis of (7.1)
or (7.2).
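The two properties claimed for the Wu transformation, that (7.2) inverts (7.1) and that the zero-acceleration limit is a four-dimensional transformation between inertial frames, can be probed numerically. The sketch below is illustrative and rests on assumptions: it uses the forms w_I = γβ(x + 1/αγ₀²) − β₀/(αγ₀) and x_I = γ(x + 1/αγ₀²) − 1/(αγ₀), restricts itself to the (w, x) plane, and compares the small-α result with the ordinary Lorentz form w_I = γ₀(w + β₀x), x_I = γ₀(x + β₀w), which is taken here as the special-relativity case of (2.2). The function names and sample values are hypothetical.

```python
import math

def gamma(beta):
    """Usual factor 1 / sqrt(1 - beta^2) for a dimensionless velocity beta."""
    return 1.0 / math.sqrt(1.0 - beta * beta)

def wu_forward(w, x, alpha, beta0=0.0):
    """Accelerated-frame (w, x) to inertial-frame (w_I, x_I), assumed form of (7.1)."""
    g0 = gamma(beta0)
    beta = alpha * w + beta0
    g = gamma(beta)
    shift = x + 1.0 / (alpha * g0**2)
    w_I = g * beta * shift - beta0 / (alpha * g0)
    x_I = g * shift - 1.0 / (alpha * g0)
    return w_I, x_I

def wu_inverse(w_I, x_I, alpha, beta0=0.0):
    """Inertial-frame (w_I, x_I) back to accelerated-frame (w, x), assumed form of (7.2)."""
    g0 = gamma(beta0)
    A = w_I + beta0 / (alpha * g0)
    B = x_I + 1.0 / (alpha * g0)
    w = A / (alpha * B) - beta0 / alpha
    x = math.sqrt(B * B - A * A) - 1.0 / (alpha * g0**2)
    return w, x

def lorentz(w, x, beta0):
    """Lorentz form used as the zero-acceleration reference (an assumption of this sketch)."""
    g0 = gamma(beta0)
    return g0 * (w + beta0 * x), g0 * (x + beta0 * w)

print(wu_inverse(*wu_forward(1.5, 2.0, 0.1, 0.3), 0.1, 0.3))
print(wu_forward(0.7, 1.3, 1e-6, 0.4), lorentz(0.7, 1.3, 0.4))
```

The inputs must keep |αw + β₀| < 1 and x > −1/(αγ₀²); outside that range the square roots are not real. For very small α the forward map subtracts two large, nearly equal terms, so the comparison with the Lorentz form is only meaningful to a modest tolerance.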
From the viewpoint of limiting four-dimensional symmetry, the CLA transformation
must be expressed in terms of the Cartesian coordinates rather than other
coordinates, just like the Lorentz transformation. Furthermore, the coordinates of
CLA frames should play the same role and have a similar physical meaning as those of
inertial frames. This appears to be different from the usual viewpoint that coordinates
for accelerated frames have no physical meaning. The Wu transformation (7.1), based
on the four-dimensional symmetry, differs from that obtained by Møller [6] based on
the approximate principle of equivalence in general relativity because they give
different spatial measurements by meter sticks or the Bohr radius of hydrogen atoms.
We believe that such a difference should be tested by, say, measuring a Doppler shift of
wavelength emitted from a source with a constant linear acceleration. We may remark
that the constant acceleration α in (7.1) can be shown to be related to constant change
of energy (or "moving mass") per unit length measured in an inertial frame. This
differs from the usual definition of acceleration in (2.4). It is interesting to note that
such a constant acceleration α dictated by the limiting four-dimensional symmetry is
precisely what has been actually realized in linear accelerators in laboratories. Physical
implications of the Wu transformation and their experimental tests will be discussed in
a separate paper.

8. - Remarks and discussions

In the formulation of QED in sect. 6, the electron is, as usual, assumed to be a point
particle. However, if the physical electron is really a fuzzy point (in the sense of fuzzy
set theory with a bell-shape membership function having a width L₀) rather than a
geometric point, then there will be a departure from the four-dimensional symmetry at
short distances or large momentum [7]. A fuzzy-point model of a particle has been
interpreted as follows: a particle by itself is a structureless-point particle, but it can
simultaneously exist at different places with different probabilities. As a result, the
position uncertainty of such a quantum particle has a minimum width Δx ~ L₀. The
Coulomb potential will be modified when r < L₀, and the photon propagator in (6.19)
will be modified when momentum becomes larger than h/L₀. For a detailed discussion
of the fuzzy-point model of particles, we refer to ref. [7].

Let us compare and summarize basic differences in various relativity theories:

a) Taiji relativity: it is based solely on the first postulate of relativity, namely, the
invariance of physical laws. A four-dimensional transformation between two inertial
frames, F(w, x, y, z) and F'(w', x', y', z'), can be derived. The usual concept of time
and speed of light are undefined and completely unknown; nevertheless, the theory
agrees with all experiments. Taiji-times w and w' play the roles of evolution variables,
and the dimensionless taiji-velocities dx/dw, dx'/dw', etc. are well-defined. It is
interesting that there are only two universal and fundamental constants in QED based
on taiji relativity.
b) Extended relativity: it is based on two postulates. In addition to the invariance
of physical laws, its second postulate is the universality of the 2-way speed of light.
Reichenbach's time and one-way speeds of light (non-isotropic in general) are
well-defined. However, lightime w plays the role of the evolution variable in
four-dimensional physical laws. There are three universal and fundamental constants in QED
based on extended relativity. Special relativity is a special case (q = q' = 0 in (2.2)).
c) Common relativity: it is based on two postulates. The additional second
postulate is a common time t = t' for all observers [8]. The speed of light is, roughly
speaking, relative. Lightimes w and w' are evolution variables in four-dimensional
laws. There are only two universal and fundamental constants in QED based on
common relativity, precisely the same as those in taiji relativity. Common relativity has
the unique advantage for dealing with many-particle systems where canonical
evolution of the system is essential and for obtaining covariant thermodynamics and
invariant Planck's law of black-body radiations [9].

One may ask: how can one realize the evolution variable w in the extended
coordinate transformation (2.2) by physical means? Since the invariant phase of an
electromagnetic wave in the F frame is given by k₀w − k·r, where k₀ = |k|, we can
define the lightime w in terms of 1/k₀, just as the length can be defined by the
wavelength λ or 1/|k|. We note that the clocks, which show lightime in this theory,
are the same as those in taiji relativity [4] because they have exactly the same
four-dimensional transformation property. However, the taiji-time w in F cannot be
factored into two well-defined b and t because of the absence of a second postulate,
while the lightime w in extended relativity and common relativity can be factored into
two well-defined functions b and t, as shown in (2.1), (2.2) and ref. [8].
Our discussions show that it is extremely important to be aware of what quantities
are actually measured in the experiments and what effects the assumption of a
universal speed of light may have had on the interpretation of the results. For example,
we have seen in paper I that the lifetime dilatation of unstable particle decay in flight
has little to do with the property of Reichenbach's time with a general parameter q or
q', because the lifetime τ is basically defined as the decay length divided by the
universal 2-way speed of light c. The basic reason is that the four-dimensional
symmetry dictates that the decay rates in, say, QED based on extended relativity can
only be defined in terms of the covariant lightime w or w', which has the dimension of
length.
The constant 2-way speed of light in extended relativity is in general not the
maximum speed of physical objects in the universe. Rather, it is the one-way speed of
light in a given direction that is the maximum speed of any object in that direction, as
shown in (2.3). This holds for any inertial frame. It is worthwhile to note that this
property of light, being the maximum speed of all physical objects in any given
direction, is a logical consequence of the first postulate of relativity, as shown in taiji
relativity [4].

We have examined a number of experimental tests of special relativity and the
formulations of classical electrodynamics and QED. All of them are consistent with
extended relativity. These discussions can be generalized to other field theories such as
unified electroweak theory and quantum chromodynamics. As we have seen, only the
four-dimensional symmetry of physical laws is absolutely essential for explanations of
experimental results and for the formulation of classical electrodynamics and QED; the
universality of the one-way speed of light is irrelevant. In this connection, we stress
that according to the theory of taiji relativity [4], the universality of the one-way or the
two-way speed of light is a convention rather than an inherent part of the physical
world [10].

Note added in proofs

Suppose one writes $dw_I = \gamma(T\,dw + U\beta\,dx)$, $dx_I = \gamma(V\,dx + W\beta\,dw)$, $dy_I = dy$, $dz_I = dz$;
$\gamma = 1/(1 - \beta^2)^{1/2}$,
where $T$, $U$, $V$ and $W$ are four unknown functions of $x$ and $w$. The new Wu
transformation (7.1) for a constant-linear-acceleration (CLA) frame can be derived from the
postulate of the limiting four-dimensional symmetry of taiji relativity and the initial condition that
a CLA transformation reduces to the spatial identity $\mathbf{r}_I = \mathbf{r}$ when the taiji-time $w = 0$ and the
initial velocity $\beta_0 = 0$. This initial condition holds also for the Lorentz transformation. Thus,
once the principle (or the first postulate) of relativity is rigorously stated to include the limiting
cases, the concept of acceleration is determined in the physical theory based on the extended
four-dimensional framework.
Within the present conceptual framework, the taiji-time $w$ in the Wu transformation (7.1) or
(A.2) is a primary concept and has the dimension of length. The motion of physical objects,
including light signals, is a derived concept and is described by dimensionless taiji-velocities
$d\mathbf{r}/dw$. The taiji-time $w$ can be realized by computerized Leonardo clocks [4]: we could program
any Leonardo clock in a CLA frame $F$ to obtain a reading $w_I$ from the nearest clock in an inertial
frame $F_I$ and, based on its $F_I$-frame position $x_I$ and the given parameters $\alpha$ and $\beta_0$, compute the
taiji-time $w$ it should display, $w = [w_I + \beta_0/(\alpha\gamma_0)]/[\alpha(x_I + 1/(\alpha\gamma_0))] - \beta_0/\alpha$. (See (7.2).) In the limit
of zero acceleration, the $w$ shown on a Leonardo clock will automatically reduce to the taiji-time in the
four-dimensional transformation, $w = \gamma_0(w_I - \beta_0 x_I)$. It will not reduce to relativistic time unless
the second postulate of a universal constant for the speed of light ($w = ct$, $w_I = ct_I$) is made in this
limit [4].
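
The clock-programming rule described in this note can be sketched numerically. The snippet below is only an illustration under stated assumptions: the function names `taiji_time` and `taiji_time_inertial` and the sample numbers are hypothetical choices made here, and the formula coded is the expression for $w$ quoted in the note, with $\gamma_0 = (1 - \beta_0^2)^{-1/2}$.

```python
import math

def taiji_time(w_i, x_i, alpha, beta0):
    """Reading w for a clock in the CLA frame F, computed from the
    inertial-frame data (w_i, x_i), the acceleration alpha and the
    initial velocity beta0 (illustrative helper, not from the paper)."""
    gamma0 = 1.0 / math.sqrt(1.0 - beta0 ** 2)
    numerator = w_i + beta0 / (alpha * gamma0)
    denominator = alpha * (x_i + 1.0 / (alpha * gamma0))
    return numerator / denominator - beta0 / alpha

def taiji_time_inertial(w_i, x_i, beta0):
    """Zero-acceleration limit of the same reading: gamma0*(w_I - beta0*x_I)."""
    gamma0 = 1.0 / math.sqrt(1.0 - beta0 ** 2)
    return gamma0 * (w_i - beta0 * x_i)

# A very small acceleration should reproduce the inertial-frame reading.
w_accel = taiji_time(w_i=2.0, x_i=1.0, alpha=1e-6, beta0=0.3)
w_limit = taiji_time_inertial(w_i=2.0, x_i=1.0, beta0=0.3)
print(w_accel, w_limit)
```

For moderate accelerations the two readings differ, which is the sense in which the clock must be "programmed" rather than left to tick like an Einstein clock.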
* * *

This paper is dedicated to Prof. TA-YOU WU for his wonderful and tireless teaching
of physics and his ninetieth birthday. The work was supported in part by The Jing Shin
Research Fund of the UMass Dartmouth and by a grant from the Potz Science Fund.

APPENDIX

Limiting four-dimensional symmetry and constant-linear-acceleration frames

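For simplicity, let us denote a CLA frame by $F(w, x, y, z)$ and an inertial frame by
$F_I(w_I, x_I, y_I, z_I)$. Suppose a CLA frame $F(w, x, y, z)$ is moving with a constant
acceleration $\alpha$, so that its velocity is

(A.1) $\beta = \alpha w + \beta_0$,

along the $+x$ axis. Guided by the limiting four-dimensional symmetry, we find that the
linearly accelerated transformation between $F_I$ and $F$ should be

(A.2) $w_I = \gamma\beta\left(x + \frac{1}{\alpha\gamma_0^2}\right) - \frac{\beta_0}{\alpha\gamma_0}$, $x_I = \gamma\left(x + \frac{1}{\alpha\gamma_0^2}\right) - \frac{1}{\alpha\gamma_0}$, $y_I = y$, $z_I = z$;
$\gamma = (1 - \beta^2)^{-1/2}$, $\gamma_0 = (1 - \beta_0^2)^{-1/2}$.
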
This is a generalization of the accelerated transformation obtained by Wu and Lee [7]
based on a kinematic approach to satisfy limiting four-dimensional symmetry. It will be
called the Wu transformation. When $\beta_0$ approaches zero, the accelerated transformation
(A.2) can reduce to the well-known transformation obtained in ref. [7], provided
one uses a new time $w^*$: $\beta = \alpha w + \beta_0 = \tanh(\alpha w^*)$. Furthermore, one can verify that the
Wu transformation (A.2) indeed reduces to four-dimensional transformations in the
limit of zero acceleration $\alpha \to 0$: $w_I = \gamma_0(w + \beta_0 x)$, $x_I = \gamma_0(x + \beta_0 w)$, $y_I = y$, $z_I = z$,
where $\gamma_0 = 1/(1 - \beta_0^2)^{1/2}$.
With the definitions $w_I = ct_I$ and $w = ct$, the extended relativistic time $t$ in the CLA
frame $F$ is completely determined by (A.2). In other words, if the time $t_I$ and the
position $x_I$ of an event as observed in the inertial frame $F_I$ are known, then the
corresponding time $t$ in the CLA frame can be calculated, provided the constants $c$, $\alpha$
and $\beta_0$ are given. Such a time $t$ in a CLA frame can be physically realized by
computerized Leonardo clocks [4]. Evidently, the time $t$ in the CLA frame is not the
relativistic time in general and, hence, the constant $c$ by itself is not physically
meaningful. Only when the acceleration $\alpha$ vanishes does the time $t$ in (A.2) for the frame $F$
reduce to the relativistic time and $c$ become physically meaningful. We may remark
that the definitions $w_I = ct_I$ and $w = ct$ are not necessary for deriving observable results
because we may directly use $w$ as the evolution variable.
When $\beta_0 \to 0$, the inverse of the Wu transformation (A.2) leads to

(A.3) $w = ct_I/(1 + \alpha x_I) \approx ct_I(1 - \alpha x_I)$, $ct_I = w_I$,
$x = (1/\alpha)[1 + 2\alpha x_I + \alpha^2(x_I^2 - c^2t_I^2)]^{1/2} - 1/\alpha \approx x_I - c^2\alpha t_I^2/2$.

Thus, $c^2\alpha$ is related to a constant acceleration $g$ in Newtonian mechanics by the relation

(A.4) $\alpha = g/c^2$

when velocities are small. In this sense, the Wu transformation (A.2) is a
four-dimensional generalization of the Galilean transformation for accelerated frames
in classical mechanics.
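
The Newtonian limit in (A.3) and (A.4) can be illustrated with everyday numbers. The sketch below is not part of the original derivation: the function names and the sample values ($g = 9.8$ m/s², $t_I = 2$ s, $x_I = 10$ m) are assumptions chosen here. The square root is rewritten as $\sqrt{1+\epsilon} - 1 = \epsilon/(\sqrt{1+\epsilon} + 1)$ so that the tiny quantity $\epsilon$ is not lost to floating-point rounding.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s

def accelerated_x(x_i, t_i, alpha, c=C_LIGHT):
    """x in the CLA frame from (A.3) with beta0 = 0:
    x = (1/alpha) * (sqrt(1 + eps) - 1),
    eps = 2*alpha*x_I + alpha**2 * (x_I**2 - c**2 * t_I**2).
    The algebraically equivalent form eps / (alpha * (sqrt(1+eps) + 1))
    is used to avoid catastrophic cancellation."""
    eps = 2.0 * alpha * x_i + alpha ** 2 * (x_i ** 2 - (c * t_i) ** 2)
    return eps / (alpha * (math.sqrt(1.0 + eps) + 1.0))

def galilean_x(x_i, t_i, g):
    """Accelerated Galilean transformation x = x_I - g*t_I**2/2."""
    return x_i - 0.5 * g * t_i ** 2

g = 9.8                      # m/s^2, illustrative value
alpha = g / C_LIGHT ** 2     # relation (A.4)
x_wu = accelerated_x(10.0, 2.0, alpha)
x_newton = galilean_x(10.0, 2.0, g)
print(x_wu, x_newton)
```

At laboratory scales the two positions agree far below any measurable precision, which is the content of the statement that (A.2) generalizes the accelerated Galilean transformation.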
From (A.1) we obtain

(A.5) $ds^2 = c^2dt_I^2 - d\mathbf{r}_I^2 = g_{00}\,dw^2 - d\mathbf{r}^2$, $g_{00} = \gamma^4(\gamma_0^{-2} + \alpha x)^2$,

where $x$ in a CLA frame $F$ is restricted to the region $x > x_s = -1/(\alpha\gamma_0^2)$, which may be
pictured as a wall singularity at $x_s$. We may remark that the finite Wu transformation
(A.2) implies that the space-time of the CLA frame $F(w, x, y, z)$ is flat, i.e. the
Riemann curvature tensor vanishes, $R^{i}{}_{jkm} = 0$, which can also be directly calculated by
using the metric tensors in (A.5).
The velocity of a fixed point xI in FI as measured by F-observers using evolution
variable [4] w is dx/dw with xI fixed. From (A.2), we find

We see that only in the approximation $g_{00} = 1$ do we have $(dx/dw)_{x_I} = \beta$ and

(A.7) $(d^2x/dw^2)_{x_I} = \alpha = \text{constant}$.
We note that the Wu transformation (A.2) holds for general $w_I$ and $w$. In the limit of
zero acceleration, it reduces to the four-dimensional taiji transformation [4]. If one
wishes, one may define

(A.8) $w_I = ct_I$ and $w = bt$, $b = (ct_I - \beta x_I)/[t_I(1 - \beta q') - (\beta - q')x_I/c]$,

where $t_I$ and $t$ are, respectively, Einstein's time and extended Reichenbach's time
(and $b$ is the corresponding ligh function); then the limit of zero acceleration of (A.2)
is the extended transformation (2.2) (where the inertial frame F corresponds to the
CLA frame F of (A.2) in the limit of zero acceleration). One can formulate, say, classical
electrodynamics in a CLA frame. According to taiji relativity, physical results in the
CLA frame $F$ should be independent of the definition in (A.8).

REFERENCES

[1] HSU L., HSU J. P. and SCHNEBLE D., Nuovo Cimento B, 111 (1996) 1299. This is referred to as
paper I in the text.
[2] REICHENBACH H., The Philosophy of Space and Time (Dover, New York) 1958.
[3] EDWARDS W. F., Am. J. Phys., 31 (1963) 482.
[4] HSU J. P. and HSU L., Phys. Lett. A, 196 (1994) 1; 217 (1996) 359; HSU L. and HSU J. P.,
Nuovo Cimento B, 111 (1996) 1283.
[5] See, for example, BJORKEN J. D. and DRELL S. D., Relativistic Quantum Mechanics
(McGraw-Hill, New York) 1964, pp. 261-268 and pp. 285-286; SAKURAI J. J., Advanced
Quantum Mechanics (Addison-Wesley, Reading, Mass.) 1967, pp. 171-172 and pp. 181-188;
WEINBERG S., The Quantum Theory of Fields (Cambridge University Press, New York)
1995, pp. 134-147.
[6] MØLLER C., Danske Vid. Sel. Mat.-Fys., XX, No. 19 (1943); FOCK V., The Theory of Space
Time and Gravitation (Pergamon, New York) 1958, pp. 206-211; WU T. Y. and LEE Y. C., Int.
J. Theor. Phys., 5 (1972) 307; TA-YOU WU, Theoretical Physics, Vol. 4, Theory of Relativity
(Lian Jing Publishing Co., Taipei) 1978, pp. 172-175.
[7] HSU J. P., Nuovo Cimento B, 80 (1984) 183; 88 (1985) 140; HSU J. P. and PEI S. Y., Phys. Rev.
A, 37 (1988) 1406.
[8] For a detailed discussion of common time in the four-dimensional framework and its implications,
see HSU J. P., Nuovo Cimento B, 74 (1983) 67; 88 (1985) 140; 89 (1985) 30; Phys.
Lett. A, 97 (1983) 137; HSU J. P. and WHAN C., Phys. Rev. A, 38 (1988) 2248, appendix.
[9] HSU J. P., Nuovo Cimento B, 93 (1986) 178.
[10] In other words, all physical results in taiji relativity or extended relativity can be derived by
simply using the quantities $(w, x, y, z)$ and $(w', x', y', z')$ without ever mentioning time $t$ or $t'$
(measured in seconds) and speeds of light or other physical objects.

CHINESE JOURNAL OF PHYSICS VOL. 35, NO. 4 AUGUST 1997

Generalized Lorentz Transformations for Linearly Accelerated Frames
with Limiting Four-Dimensional Symmetry

Jong-Ping Hsu¹ and Leonardo Hsu²

¹Department of Physics, University of Massachusetts Dartmouth,
North Dartmouth, MA 02747-2300, U.S.A.
²Department of Physics, University of California at Berkeley,
Berkeley, CA 94720-7300, U.S.A.
(Received March 24, 1997)

Based on the principle of limiting four-dimensional symmetry, we discuss coordinate
transformations for constant-linear-acceleration (CLA) frames. We derive a new Wu
transformation which reduces to the Lorentz transformation in the limit of zero acceleration.
The time for an accelerated frame can be realized by computerized clocks.
A CLA coordinate $(w, x, y, z)$ is preferred for the accelerated transformation and has
as much physical meaning for an accelerated frame $F$ as $(w_I, x_I, y_I, z_I)$ has for an inertial
frame $F_I$. Furthermore, a constant linear acceleration $\alpha$ must be a constant increase of a
particle's energy per unit length, consistent with what has been realized in high-energy
linear accelerators. Some experimental implications are discussed.

PACS. 04.20.-q - Classical general relativity.
PACS. 11.30.Cp - Lorentz and Poincaré invariance.
PACS. 03.30.+p - Special relativity.

I. Introduction

Classical physics using Galilean transformations has been satisfactorily improved in
modern physics by the use of Lorentz transformations. Yet the corresponding classical
physics in constant-linear-acceleration (CLA) frames has not been equally improved in
modern physics. This is due to the lack of a generalized Lorentz transformation for CLA
frames. In 1943, Moeller obtained a transformation for an inertial frame $F_I$ and a uniformly
accelerated frame $F$ moving along the x-axis. This was based on (a) Einstein's vacuum
equation $R_{ik} = 0$ and (b) time-independent metric tensors, $ds^2 = g_{00}(x)c^2dt^2 - dx^2 - dy^2 - dz^2$.
He obtained $g_{00}(x) = (1 + gx)^2$ and a transformation for uniformly accelerated frames
[1]. The reason for using $R_{ik} = 0$ is suggested by a heuristic view that the inertial force
of accelerated frames and the gravitational force may be considered as being unified
by Einstein's equation. Nevertheless, these two forces satisfy quite different boundary
conditions: in contrast to the gravitational force, the inertial force does not
vanish at spatial infinity, and the transformations for accelerated frames should reduce to
the Lorentz transformation when inertial forces vanish. In 1972, Wu and Lee derived the
same transformation based on a kinematic approach without using Einstein's equation for
gravity [1]. But Moeller's transformations cannot be smoothly connected to the Lorentz
transformations in the limit of zero acceleration. This is due to the stringent assumption
that the metric tensor $g_{00}$ is time-independent.

© 1997 The Physical Society of the Republic of China

In this paper, we follow the kinematic approach and obtain a satisfactory CLA transformation
by postulating a new and natural principle of limiting four-dimensional symmetry
[2]: any accelerated transformation of coordinates must reduce to the form with
4-dimensional symmetry in the limit of zero acceleration. We show that the set of transformations
for the CLA frames forms a new group, which is termed the Wu group. The
Wu transformation is a natural and simple generalization of the Lorentz transformation
and the Galilean transformation with constant acceleration. The limiting four-dimensional
symmetry principle contains more definite and satisfactory physical results than the equation
$R_{ik} = 0$, as far as CLA transformations are concerned. In the gravitational approach,
Einstein's covariant equation holds for any coordinate. However, the Lorentz transformation
prefers the Cartesian coordinate. Therefore, the natural assumption of the smooth
connection between a linearly accelerated frame and an inertial frame dictates that a CLA
coordinate is preferred for CLA transformations. This is an important difference between
kinematic and gravitational approaches.
In our discussions, a CLA frame $F(w, x, y, z)$ with the usual definition $w = ct$ is
introduced. But we know that the constant speed of light $c$ has no operational definition
in any CLA frame. Fortunately, the physical results in the paper are actually independent
of the definition $w = ct$. In previous papers, we have shown that the logically simplest
theory of relativity, called taiji relativity, can be formulated solely on the basis of the first
postulate of relativity, without making any second postulate concerning the speed of light
[3]. The first postulate of relativity states that the laws of physics have the same form
in all inertial frames. We are able to formulate a 4-dimensional physical theory with the
coordinate $x_I^\mu = (w_I, x_I, y_I, z_I)$ for an inertial frame $F_I$, where $w_I$ is the taiji-time with
the dimension of length. The absence of the second postulate forbids one to express the
taiji-time $w_I$ in terms of the usual time $t_I$ (measured in seconds) and velocity because they
cannot be defined for all inertial frames in taiji relativity. Nevertheless, the taiji-time $w_I$
can be directly used as the evolution variable. Furthermore, the taiji-time with the unit of,
say, centimeters, can be physically realized by computerized clocks. Also, the invariant law
for the propagation of light, $ds^2 = dw_I^2 - d\mathbf{r}_I^2 = 0$, implies that the taiji-speed of a light signal
is dimensionless and has the universal value $\beta_L = |d\mathbf{r}_I/dw_I| = 1$ for all inertial frames.
A careful examination shows that taiji relativity is consistent with all previous experiments
[3]. Indeed, one can simply consider $w$ in a CLA frame as the evolution variable for a
physical system. One can have a grid of computerized clocks in a CLA frame. These
clocks can be synchronized without relying on the constant speed of light signals and will
automatically read taiji-time $w_I$ in the limit of zero acceleration.

II. Coordinate transformations for linearly accelerated frames

Suppose we have an inertial frame $F_I(w_I, x_I, y_I, z_I)$, $w_I = ct_I$, and a CLA frame
$F(w, x, y, z)$ moving with a constant acceleration $\alpha$ along the x-axis. Based on the preceding
discussions, it is natural to assume that $ds^2$ takes the form

$ds^2 = c^2dt_I^2 - d\mathbf{r}_I^2 = g_{00}\,dw^2 + g_{11}\,dx^2 + g_{22}\,dy^2 + g_{33}\,dz^2$, (2.1)

where $t_I$ is shown by the Einstein clocks in the inertial frame $F_I$. As usual, one may define
$w = ct$, where the realization of $t$ through a grid of computerized clocks will be discussed
later in sec. V. Although $R_{ik} = 0$ holds for arbitrary coordinates, we postulate the metric
(2.1) so that $ds^2$ and the resultant transformations are compatible with both Einstein's
vacuum equation $R_{ik} = 0$ and the new boundary conditions of limiting four-dimensional
symmetry. Since the CLA frame $F$ moves along the x-axis, we look for axially symmetric
solutions with

$g_{22} = g_{33} = -Y^2(x)$, (2.2)

and all metric tensors are functions of $x$, except that $g_{00}$ may be a function of $x$ and $w$.
This property of $g_{00}(x, w)$ is crucial for the new CLA transformation.

II-1. Gravitational approach


Let us first consider the conventional gravitational approach based on Einstein's
equation to obtain uniformly accelerated transformations. Based on (2.1) and (2.2), we
can calculate the Christoffel symbols $\Gamma^i_{jk} = \frac{1}{2}g^{im}(\partial_k g_{mj} + \partial_j g_{mk} - \partial_m g_{jk})$ and the Ricci tensor
$R_{ik} = \partial_m\Gamma^m_{ik} - \partial_k\Gamma^m_{im} + \Gamma^m_{ik}\Gamma^n_{mn} - \Gamma^m_{in}\Gamma^n_{km}$. The equations $R_{aa} = 0$, $a = 0, 1, 2$, lead to

$\partial_x^2 W - \partial_x W(\partial_x X/X - 2\partial_x Y/Y) = 0$, (2.3)
$\partial_x^2 W/W + 2\partial_x^2 Y/Y - (\partial_x X/X)[2\partial_x Y/Y + \partial_x W/W] = 0$, (2.4)
$\partial_x^2 Y/Y - \partial_x Y\,\partial_x X/(YX) + (\partial_x Y/Y)^2 + \partial_x Y\,\partial_x W/(YW) = 0$, (2.5)

respectively, where $Y$ is given by (2.2) and

$W^2 = g_{00}(x, w)$, $X^2 = -g_{11}(x)$; $W = W_1(x)W_2(w)$. (2.6)

We note that $R_{33} = 0$ gives the same equation as (2.5) and the other components of $R_{ik}$ vanish
identically.
If $\partial_x Y \neq 0$, equations (2.3)-(2.5) lead to an exact solution

$W_1 = f_3/(f_1x + f_0)^{1/2}$, $X = f_2(f_1x + f_0)^{1/2}$, $Y = f_1x + f_0$, (2.7)


where the $f$'s are constants. We stress that $W_2(w)$ in (2.6) is arbitrary because it cannot
be determined by Einstein's equation. Physically, one expects that the metric tensor $g_{22}$
should satisfy $-g_{22} = Y^2 = 1$ rather than $Y^2 = (f_1x + f_0)^2$, since there is no motion along
the y-axis. Furthermore, the accelerated transformation based on this solution cannot be
smoothly connected to the Lorentz transformation (i.e., it does not satisfy the limiting
four-dimensional symmetry or the integrability conditions in (2.15) below). Therefore,
the solution (2.7) is not physically meaningful.
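
That (2.7) solves (2.3)-(2.5) can be checked directly. The sketch below is an illustrative verification, not part of the original paper: it assumes the power-law ansatz $Y = u$, $X = f_2u^n$, $W_1 = f_3u^{n-1}$ with $u = f_1x + f_0$, takes the $x$-derivatives analytically, and evaluates the left-hand sides of the three equations. The constants and the function name `residuals` are arbitrary choices made here; the factor $W_2(w)$ drops out because each equation is homogeneous in $W$.

```python
def residuals(x, n, f0=1.0, f1=0.5, f2=2.0, f3=3.0):
    """Left-hand sides of (2.3), (2.4), (2.5) for the ansatz
    Y = u, X = f2*u**n, W1 = f3*u**(n-1), where u = f1*x + f0.
    The exponent n = 1/2 reproduces the solution (2.7)."""
    u = f1 * x + f0
    # Functions and their analytic x-derivatives.
    Y, dY, d2Y = u, f1, 0.0
    X = f2 * u ** n
    dX = f2 * n * f1 * u ** (n - 1)
    W = f3 * u ** (n - 1)
    dW = f3 * (n - 1) * f1 * u ** (n - 2)
    d2W = f3 * (n - 1) * (n - 2) * f1 ** 2 * u ** (n - 3)
    r23 = d2W - dW * (dX / X - 2.0 * dY / Y)
    r24 = d2W / W + 2.0 * d2Y / Y - (dX / X) * (2.0 * dY / Y + dW / W)
    r25 = d2Y / Y - dY * dX / (Y * X) + (dY / Y) ** 2 + dY * dW / (Y * W)
    return r23, r24, r25

solution = residuals(x=2.0, n=0.5)   # exponent of (2.7): all three vanish
other = residuals(x=2.0, n=1.0)      # a different exponent violates (2.4)
print(solution, other)
```

Within this ansatz, (2.3) and (2.5) hold for any exponent, while (2.4) singles out $n = 1/2$, in agreement with the square roots appearing in (2.7).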
Let us concentrate on the case $\partial_x Y = 0$ and $\partial_x W \neq 0$. We have the solution $Y = 1$,
which satisfies the boundary conditions $g_{22}(0) = g_{33}(0) = -1$ at the origin. From Eqs.
(2.3) and (2.6), we deduce a general relation between $W_1(x)$ and $X(x)$:

$dW_1(x)/dx = fX(x)$, (2.8)


where $f$ is a constant of integration. Furthermore, the time-dependent part of $g_{00}$, i.e.,
$W_2(w)$, still cannot be determined by Einstein's equation, just like the previous case $\partial_x Y \neq 0$.
Thus we have seen that Einstein's covariant equation by itself does not lead to a specific
form for $X(x)$, $W_1(x)$ and $W_2(w)$. (If one further postulates the limiting 4-dimensional
symmetry, then $W_2(w)$ can be determined and one can obtain a CLA transformation based
on $W_2(w)$, Eq. (2.8) and a suitable initial condition. But if one postulates the limiting
4-dimensional symmetry, then one can obtain the same CLA transformation without using
Einstein's equation at all, as we shall see below in sec. II-2.)
Moeller made two additional postulates, $X(x) = W_2(w) = 1$, and obtained [1]

$ds^2 = g_{00}(x)c^2dt^2 - dx^2 - dy^2 - dz^2$, $g_{00}(x) = (1 + gx)^2$. (2.9)


This leads to Moeller's transformation involving only one parameter $g$. When the acceleration
$g$ approaches zero, it does not reduce to a Lorentz transformation with a constant
velocity. This is due to the lack of a velocity parameter in (2.9), which is intimately related
to his stringent assumption that the metric tensor $g_{00}$ is time-independent, i.e., $W_2(w) = 1$,
as we shall see below.

II-2. Kinematic approach


Next, let us consider a new kinematic approach based on the principle of limiting
4-dimensional symmetry. Since $F$ accelerates along the x-axis, the perpendicular coordinates
$y$ and $z$ should not appear in the metric tensors $g_{ik}$, and $g_{22} = g_{33} = -1$ in (2.1). The length
of a measuring rod along the x-axis, or the component $g_{11}$, should not depend on time or $w$
because the acceleration is characterized by a constant.
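Based on the limiting four-dimensional symmetry, the invariant interval (2.1) with
$g_{22} = g_{33} = -1$ leads to the following differential form of the transformation between $F_I$ and
$F$,

$dw_I = \gamma(W\,dw + \beta X\,dx)$, $dx_I = \gamma(X\,dx + \beta W\,dw)$, $dy_I = dy$, $dz_I = dz$, (2.10)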

where

$\beta = \alpha w + \beta_0$, $\gamma(w) = (1 - \beta^2)^{-1/2} \equiv \gamma$; (2.11)

$[g_{00}(w, x)]^{1/2} = G(w)Z(x) \equiv W(w, x) > 0$; (2.12)

$(-g_{11})^{1/2} = X(x) > 0$. (2.13)

The CLA transformation (2.10) is characterized by two parameters: the acceleration $\alpha$ and the
initial velocity $\beta_0$. In order to satisfy the limiting four-dimensional symmetry, we must
have

$g_{00} \to 1$ and $g_{11} \to -1$ as $\alpha \to 0$. (2.14)

Also, the coefficients of dx and dw in (2.10) must satisfy the integrability conditions

so that we have a finite coordinate transformation. It follows from (2.12) and (2.15) that

$G(w) = \alpha\gamma^2(w)/f$, (2.16)

$dZ(x)/dx = fX(x)$, (2.17)

where the constant $f$ comes from the separation of the variables $w$ and $x$. Using (2.16) and
(2.17), we can integrate (2.10) to obtain the finite transformation between $F_I$ and $F$,

Note that the constants of integration in (2.18) and the relations in (2.19) are all determined
by the limiting four-dimensional symmetry and a boundary condition at the origin,
$Z(0) < \infty$. In order to determine the precise form of the function $Z(x)$, we observe that
the Lorentz transformation reduces to the identity transformation, $\mathbf{r}_I = \mathbf{r}$, when time and
velocity vanish, $t = 0$ and $V = 0$. Thus, it is natural to impose the same initial condition
on the accelerated transformation (2.18): namely, when time $t_I = 0$ and velocity $\beta_0 = 0$,
transformation (2.18) reduces to the identity transformation,

$\mathbf{r} = \mathbf{r}_I$, (2.20)

for all values of the acceleration $\alpha$. It follows from (2.18)-(2.20) that $Z(x)$ must be

$Z(x) = (\gamma_0^{-2} + \alpha x)$, or $X = 1 = (-g_{11})^{1/2}$. (2.21)


Thus, the metric tensor in (2.1) is now completely determined by the limiting four-dimensional
symmetry and the initial condition (2.20).
From Eqs. (2.18) and (2.21), we obtain a definite coordinate transformation between
the inertial frame FI and the accelerated frame F :

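$w_I = \gamma\beta\left(x + \frac{1}{\alpha\gamma_0^2}\right) - \frac{\beta_0}{\alpha\gamma_0}$, $x_I = \gamma\left(x + \frac{1}{\alpha\gamma_0^2}\right) - \frac{1}{\alpha\gamma_0}$, $y_I = y$, $z_I = z$, (2.22)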

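where the time $t_I = w_I/c$ is shown by the conventional Einstein clocks in the inertial
frame $F_I$. We shall call the result (2.22) the Wu transformation [4]. One can verify
that the transformation (2.22) with $w = ct$ includes the Lorentz transformation,
$w_I = \gamma_0(w + \beta_0x)$, $x_I = \gamma_0(x + \beta_0w)$, as a special case, $\alpha \to 0$. In this sense, it satisfies the
limiting four-dimensional symmetry, $s^2 = c^2t_I^2 - r_I^2 = w^2 - r^2$ as $\alpha \to 0$. The inverse Wu
transformation of (2.22) can be deduced:

$w = \frac{w_I + \beta_0/(\alpha\gamma_0)}{\alpha[x_I + 1/(\alpha\gamma_0)]} - \frac{\beta_0}{\alpha}$, $x = \left\{[x_I + 1/(\alpha\gamma_0)]^2 - [w_I + \beta_0/(\alpha\gamma_0)]^2\right\}^{1/2} - \frac{1}{\alpha\gamma_0^2}$, $y = y_I$, $z = z_I$. (2.23)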

We may remark that when $\beta_0 \to 0$, one can verify that (2.23) leads to the accelerated
Galilean transformation, $x \approx x_I - c^2\alpha t_I^2/2$. Thus, the acceleration $\alpha$ can be approximately
related to a constant acceleration $g$ in Newtonian mechanics, $\alpha \approx g/c^2$. The transformation
for the covariant differential operators $(\partial/\partial w, \partial/\partial\mathbf{r})$ can be deduced from (2.23):

$\partial/\partial w_I = \gamma(W^{-1}\partial/\partial w - \beta\,\partial/\partial x)$,
$\partial/\partial x_I = \gamma(\partial/\partial x - \beta W^{-1}\partial/\partial w)$, $\partial/\partial y_I = \partial/\partial y$, $\partial/\partial z_I = \partial/\partial z$. (2.24)

These relations will be useful for wave equations in quantum mechanics.
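
The statements made about the Wu transformation in this section (it is invertible, and it goes over to the Lorentz transformation as $\alpha \to 0$) can be checked numerically. The sketch below is an illustration under assumptions: the explicit expressions coded for (2.22) and its inverse are a reading chosen to agree with the clock formula (5.2) and the differential form (3.7), and the function names and sample values are hypothetical.

```python
import math

def wu_forward(w, x, alpha, beta0):
    """(w, x) in the CLA frame F -> (w_I, x_I) in the inertial frame F_I."""
    gamma0 = 1.0 / math.sqrt(1.0 - beta0 ** 2)
    beta = alpha * w + beta0
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    shifted_x = x + 1.0 / (alpha * gamma0 ** 2)
    w_i = gamma * beta * shifted_x - beta0 / (alpha * gamma0)
    x_i = gamma * shifted_x - 1.0 / (alpha * gamma0)
    return w_i, x_i

def wu_inverse(w_i, x_i, alpha, beta0):
    """(w_I, x_I) in F_I -> (w, x) in F, valid for x > -1/(alpha*gamma0**2)."""
    gamma0 = 1.0 / math.sqrt(1.0 - beta0 ** 2)
    a = w_i + beta0 / (alpha * gamma0)
    b = x_i + 1.0 / (alpha * gamma0)
    w = a / (alpha * b) - beta0 / alpha
    x = math.sqrt(b * b - a * a) - 1.0 / (alpha * gamma0 ** 2)
    return w, x

def lorentz(w, x, beta0):
    """Zero-acceleration limit: w_I = gamma0*(w + beta0*x), x_I = gamma0*(x + beta0*w)."""
    gamma0 = 1.0 / math.sqrt(1.0 - beta0 ** 2)
    return gamma0 * (w + beta0 * x), gamma0 * (x + beta0 * w)

# Round trip F -> F_I -> F for a sizeable acceleration.
round_trip = wu_inverse(*wu_forward(1.5, 0.7, 0.1, 0.2), 0.1, 0.2)
# Nearly vanishing acceleration compared with the Lorentz transformation.
near_inertial = wu_forward(1.5, 0.7, 1e-6, 0.2)
lorentz_image = lorentz(1.5, 0.7, 0.2)
print(round_trip, near_inertial, lorentz_image)
```

The subtraction of two large terms of order $1/\alpha$ limits the numerical accuracy for very small $\alpha$, so the comparison with the Lorentz transformation is meaningful only to a few digits.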

III. The Wu group of constant-linear-acceleration transformations

To show the group properties of the Wu transformation (2.22) for CLA frames, we
must first find the coordinate transformations between two CLA frames $F$ and $F'$. Let
us consider another frame $F'$ accelerating with a velocity $\beta' = \alpha'w'$ with respect to $F_I$.
For simplicity, we shall ignore the trivial y and z axes and set all initial velocities to zero in
the following discussions. However, these group properties can be shown to be true also for
non-zero initial velocities. Similar to (2.22), we can write down the Wu transformation
between the inertial frame $F_I$ and the accelerated frame $F'$,

$w_I = \gamma'Z'w'$, $x_I = \gamma'Z'/\alpha' - 1/\alpha'$;
$Z' = 1 + \alpha'x'$, $\beta' = \alpha'w'$, $\gamma' = (1 - \beta'^2)^{-1/2}$. (3.1)

Using (2.22) with $\beta_0 = y = z = 0$ and (3.1), we deduce the Wu transformation between $F$
and $F'$:

$w = \gamma'Z'w'/\{\alpha[Z'\gamma'/\alpha' + 1/\alpha - 1/\alpha']\}$,
$x = \{[\gamma'Z'/\alpha' + 1/\alpha - 1/\alpha']^2 - \gamma'^2Z'^2w'^2\}^{1/2} - 1/\alpha$. (3.2)

In order to see the group property of the Wu transformation (3.2) for two accelerated
frames, we need to consider a third accelerated frame $F''$ with a velocity $\beta'' = \alpha''w''$. In
analogy to (3.2), we can write down the transformation between $F$ and $F''$,

$w = \gamma''Z''w''/\{\alpha[Z''\gamma''/\alpha'' + 1/\alpha - 1/\alpha'']\}$,
$x = \{[\gamma''Z''/\alpha'' + 1/\alpha - 1/\alpha'']^2 - \gamma''^2Z''^2w''^2\}^{1/2} - 1/\alpha$. (3.3)

Based on (3.3) and the inverse of (3.2), the transformation between $F'$ and $F''$ is

$w' = \gamma''Z''w''/\{\alpha'[Z''\gamma''/\alpha'' + 1/\alpha' - 1/\alpha'']\}$,
$x' = \{[\gamma''Z''/\alpha'' + 1/\alpha' - 1/\alpha'']^2 - \gamma''^2Z''^2w''^2\}^{1/2} - 1/\alpha'$, (3.4)

which has the same form as that of (3.2). Using (3.1) and (3.4), $s^2 = (ct_I)^2 - x_I^2$ can be
expressed in terms of the coordinate variables in $F'$ and $F''$ as follows:

$s^2 = (ct_I)^2 - x_I^2 = [2Z'\gamma' - Z'^2 - 1]/\alpha'^2 = [2Z''\gamma'' - Z''^2 - 1]/\alpha''^2$. (3.5)
The existence of the identity transformation and the associative law can also be verified. Thus
we conclude that the set of Wu transformations for accelerated frames forms a group, which
is termed the Wu group. Since a Wu transformation (with $w = ct$ or $w' = ct'$, etc.) reduces
to a Lorentz transformation in the limit of zero acceleration, the Wu group includes the
Lorentz group as a special case when all accelerations vanish.
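
The closure and identity properties claimed for the Wu group can be illustrated numerically for zero initial velocities. The sketch below is an assumption-laden check rather than a proof: it codes the closed form of type (3.2) as read here, uses arbitrary sample accelerations, and introduces the hypothetical helper names `to_inertial`, `between_cla_frames` and `invariant`.

```python
import math

def to_inertial(w, x, alpha):
    """Wu transformation of type (3.1), zero initial velocity: (w, x) -> (w_I, x_I)."""
    gamma = 1.0 / math.sqrt(1.0 - (alpha * w) ** 2)
    z = 1.0 + alpha * x
    return gamma * z * w, gamma * z / alpha - 1.0 / alpha

def between_cla_frames(w_b, x_b, alpha_a, alpha_b):
    """Closed form of type (3.2): coordinates in frame b -> coordinates in frame a."""
    gamma_b = 1.0 / math.sqrt(1.0 - (alpha_b * w_b) ** 2)
    z_b = 1.0 + alpha_b * x_b
    bracket = gamma_b * z_b / alpha_b + 1.0 / alpha_a - 1.0 / alpha_b
    w_a = gamma_b * z_b * w_b / (alpha_a * bracket)
    x_a = math.sqrt(bracket ** 2 - (gamma_b * z_b * w_b) ** 2) - 1.0 / alpha_a
    return w_a, x_a

def invariant(w, x, alpha):
    """s^2 written as in (3.5): [2*Z*gamma - Z**2 - 1] / alpha**2."""
    gamma = 1.0 / math.sqrt(1.0 - (alpha * w) ** 2)
    z = 1.0 + alpha * x
    return (2.0 * z * gamma - z * z - 1.0) / alpha ** 2

a1, a2, a3 = 0.1, 0.2, 0.3          # accelerations of F, F', F'' (sample values)
event3 = (0.5, 1.0)                 # an event (w'', x'') in F''
direct = between_cla_frames(*event3, a1, a3)                              # F'' -> F
via2 = between_cla_frames(*between_cla_frames(*event3, a2, a3), a1, a2)   # F'' -> F' -> F
identity = between_cla_frames(*event3, a3, a3)                            # F'' -> F''
w_i, x_i = to_inertial(*event3, a3)
print(direct, via2, identity)
```

Composing two transformations gives the same result as the single direct one, equal accelerations give the identity, and the quantity in (3.5) takes the same value in every frame.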
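One can generalize the Wu transformation in such a way that the velocity $\mathbf{u} \equiv \boldsymbol{\beta}$ is
in an arbitrary but fixed direction. This implies that both the acceleration $\boldsymbol{\alpha}$ and the initial
velocity $\boldsymbol{\beta}_0$ must be in the same direction:

$\mathbf{u} = \boldsymbol{\alpha}w + \boldsymbol{\beta}_0$. (3.6)

By differentiating (2.22), we obtain

$d(ct_I) = \gamma(W\,dw + \beta\,dx)$, $dx_I = \gamma(dx + \beta W\,dw)$, $dy_I = dy$, $dz_I = dz$; (3.7)

where

$W = \gamma^2(\gamma_0^{-2} + \alpha x)$; $\gamma_0 = (1 - \beta_0^2)^{-1/2}$.

In analogy with the Lorentz transformation in an arbitrary direction, (3.7) can be generalized
to the following form:

$d\mathbf{r}_I = d\mathbf{r} + (\gamma - 1)(\mathbf{u}\cdot d\mathbf{r})\mathbf{u}/u^2 + [\gamma^3\gamma_0^{-2}\mathbf{u} + \alpha\gamma^3(\mathbf{u}\cdot\mathbf{r})\mathbf{u}/u]\,dw$
$= d\mathbf{r} + (\gamma - 1)(\mathbf{u}/u)\,d(\mathbf{u}\cdot\mathbf{r}/u) + (1/\alpha\gamma_0^2)(\mathbf{u}/u)\,d\gamma + (\mathbf{u}\cdot\mathbf{r})(\mathbf{u}/u^2)\,d(\gamma - 1)$,
(3.8)
$dw_I = \gamma^3[\gamma_0^{-2} + \alpha\,\mathbf{u}\cdot\mathbf{r}/u]\,dw + \gamma\,\mathbf{u}\cdot d\mathbf{r}$
$= \gamma^3\gamma_0^{-2}\,dw + \gamma(\mathbf{u}\cdot\mathbf{r}/u)\,du\,(\gamma^2 - 1) - \gamma^3(\mathbf{u}\cdot\mathbf{r})u\,du + d(\gamma\,\mathbf{u}\cdot\mathbf{r})$.

It is straightforward to carry out the integrations; we have


where $u = |\mathbf{u}|$.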

It can be verified that the general transformation (3.9) reduces to (2.22) if $\mathbf{u}$ is in the
x-direction, $\mathbf{u} = (\beta, 0, 0)$. In the zero acceleration limit, (3.9) reduces to

(3.10)

which is the well-known form of the general Lorentz transformation.

IV. Physics in linearly accelerated frames

Within the four-dimensional symmetry framework of taiji relativity based solely on
the first postulate of relativity [3], it was shown that there are only two universal and fundamental
constants, $J = 3.5177293\times10^{-38}$ g·cm and $\bar{e} = -1.6021891\times10^{-20}(4\pi)^{1/2}$ (g·cm)$^{1/2}$,
instead of the usual three, $\hbar$, $e$ (in esu) [5] and $c$. These results can be applied to the present
formalism of physics in CLA frames.
Since the speed of light in a CLA frame $F$ is not a universal constant, we shall write
$w \equiv bt$ and $u = dw/dt$, where $b$ is a variable in this section, so that one can see that the
physics in $F$ does not depend on $w = ct$. The invariant action for a charged particle and
the electromagnetic potential $a_\mu(x)$ in $F$ is assumed to be

$S = \int(-m\,ds - \bar{e}a_\mu dx^\mu) - (1/4)\int f_{\mu\nu}f^{\mu\nu}W\,d^4x = \int L\,dt - (1/4)\int f_{\mu\nu}f^{\mu\nu}W\,d^4x$; (4.1)

$ds^2 = W^2dw^2 - dx^2 - dy^2 - dz^2$, $(-\det g_{\mu\nu})^{1/2} = W$; $w = bt$; (4.2)

$L = -m[W^2u^2 - v_x^2 - v_y^2 - v_z^2]^{1/2} - \bar{e}(a_0u + a_iv^i)$, $u = dw/dt$, $\mathbf{v} = d\mathbf{r}/dt$; (4.3)

$f_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu$, $g_{\mu\nu} = (W^2, -1, -1, -1)$. (4.4)

In the limit $\alpha \to 0$, $F$ becomes an inertial frame and, hence, $\bar{e}$ and $a_\mu$ correspond to the
usual charge $e$ (in esu) and the electromagnetic potential $A_\mu$ by $\bar{e} = e/c$ and $a_\mu = A_\mu/c$
respectively. The canonical momentum $P_i$ of a particle in $F$ is given by
$P_i = -\partial L/\partial v^i = p_i + \bar{e}a_i$; $i = 1, 2, 3$;

$p_i = (-m\gamma v_x/C, -m\gamma v_y/C, -m\gamma v_z/C) = g_{ik}p^k$, $p^i \equiv m\,dx^i/ds$;
$\gamma \equiv (1 - v^2/C^2)^{-1/2}$; $C \equiv uW$, $v^2 = v_x^2 + v_y^2 + v_z^2 = -v_iv^i$.
The Hamiltonian $H = P_0$, with the same dimension as that of $P_i$, is defined by
$P_0 = [(\partial L/\partial v^i)v^i - L]/u = p_0 + \bar{e}a_0 = W[(P_i - \bar{e}a_i)^2 + m^2]^{1/2} + \bar{e}a_0 \equiv H$;

$p_0 = m\gamma W = g_{00}p^0$, $p^0 = m\,dx^0/ds = m\,dw/ds$.

The transformation of the covariant momenta $p_i$ and $p_0$ is given by

$p_{I0} = \gamma(p_0/W - \beta p_1)$, $p_{I1} = \gamma(p_1 - \beta p_0/W)$, $p_{I2} = p_2$, $p_{I3} = p_3$; (4.10)

where

$p_\mu = (p_0, -\mathbf{p})$, $\mathbf{p} = (p_x, p_y, p_z) = (p^1, p^2, p^3) = (-p_1, -p_2, -p_3)$.
Note that (4.10) is consistent with (2.24) because $p_\mu$ and $\partial/\partial x^\mu$ should have the same
transformation property. Also, the CLA transformation of the covariant vector $dx_\mu$ is the
same as (4.10) because $dx_\mu = g_{\mu\nu}dx^\nu$ and $dx_{I\mu} = \eta_{\mu\nu}dx_I^\nu$, where $\eta_{\mu\nu} = (1, -1, -1, -1)$.
The invariant relation $g^{\mu\nu}p_\mu p_\nu = m^2$ implies

$g^{\mu\nu}(P_\mu - \bar{e}a_\mu)(P_\nu - \bar{e}a_\nu)$
$= W^{-2}(P_0 - \bar{e}a_0)^2 - (P_1 - \bar{e}a_1)^2 - (P_2 - \bar{e}a_2)^2 - (P_3 - \bar{e}a_3)^2 = m^2$, (4.11)

where we have used (4.5), (4.6), (4.8), (4.9), $g^{\mu\nu} = (W^{-2}, -1, -1, -1)$ and $P^\mu = g^{\mu\nu}P_\nu$.
This equation suggests that the generalized Dirac equation for the accelerated frame $F$
should have the form

$[\Gamma^\mu(x)(P_\mu - \bar{e}a_\mu) - m]\psi = 0$, $P_\mu = iJ\,\partial/\partial x^\mu$. (4.12)

If one wishes, one can relate $\Gamma^\mu(x)$ in (4.12) to the constant Dirac matrices $\gamma^a$ by the relation
$\Gamma^\mu(x) = e^\mu_a(x)\gamma^a$, where $e^\mu_a(x)$ is a tetrad.
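
The momentum transformation (4.10) and the mass-shell relation underlying (4.11) can be checked for a free particle moving along the x-axis. The following sketch is illustrative only: it takes the expressions $p_0 = m\gamma W$ and $p_1 = -m\gamma v_x/C$ with $C = uW$ as read above, sets the potential $a_\mu$ to zero, and uses arbitrary sample values; the function names are hypothetical.

```python
import math

def cla_momenta(m, v_x, u, W):
    """Covariant momenta (p_0, p_1) in the CLA frame for motion along x.
    Here gamma_v is the particle's factor (1 - v**2/C**2)**(-1/2), C = u*W."""
    C = u * W
    gamma_v = 1.0 / math.sqrt(1.0 - (v_x / C) ** 2)
    p0 = m * gamma_v * W
    p1 = -m * gamma_v * v_x / C
    return p0, p1

def to_inertial_momenta(p0, p1, beta, W):
    """Transformation (4.10) restricted to the (0, 1) components.
    Here gamma is the frame factor (1 - beta**2)**(-1/2)."""
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return gamma * (p0 / W - beta * p1), gamma * (p1 - beta * p0 / W)

m, W, beta = 1.0, 1.5, 0.3                     # sample mass, metric factor, frame velocity
p0, p1 = cla_momenta(m, v_x=0.4, u=1.2, W=W)
pI0, pI1 = to_inertial_momenta(p0, p1, beta, W)
mass_shell_cla = p0 ** 2 / W ** 2 - p1 ** 2    # g^{mu nu} p_mu p_nu in F
mass_shell_inertial = pI0 ** 2 - pI1 ** 2      # eta^{mu nu} p_mu p_nu in F_I
print(mass_shell_cla, mass_shell_inertial)
```

Both combinations equal $m^2$, which is the invariance that (4.11) extends to the case with an electromagnetic potential.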
V. Experimental Implications and Discussions

The result (4.10) and the new transformation (2.22) can be experimentally tested
by measuring the Doppler shift of the wavelength of light emitted from a CLA source. From Eq.
(4.10) one obtains the transformation of the covariant wave 4-vector $k_\mu = p_\mu/J$ between
an inertial frame $F_I$ and a CLA frame $F$. Note that $Jk_{I0}$ and $Jk_0$ are the moving masses of
the same photon measured from $F_I$ and $F$ respectively. Suppose the radiation source is at
rest at the origin of the $F$ frame, $\mathbf{r} = 0$, and $k_\mu = (k_0, -k_1, 0, 0)$, where $k_0 = k_0$(rest) and
$k_1 = k_1$(rest). Experimentally, it is difficult to measure $k_0$(rest) and $k_1$(rest) in the CLA
frame. Thus we have to express them in terms of quantities measured in the inertial frame
(or laboratory) $F_I$. Using (4.10), the relation [6] $k_0$(rest) $= k_{I0}$(rest) and $Z(0) = \gamma_0^{-2}$, we
obtain the shifts of $k_{I0}$ (related to the photon's moving mass or atomic mass level [3]) for
waves emitted from a CLA source,

where (rest) denotes the source being at rest in $F_I$. A similar relation can be obtained for
the wavelength. Such new effects predicted by the Wu transformation for waves emitted
from a CLA source may be termed the Wu-Doppler effect. Note that Moeller's transformation
will lead to a result different from (5.1) because $t$ in (2.9) and (2.22) with $w = ct$ [or (5.2)
below] must have the same physical interpretation. Such a difference can be tested by
measuring the Wu-Doppler effect (5.1) in the laboratory frame $F_I$ by using the method of
Ives-Stilwell [7].
We stress that the Wu transformation (2.22) does not depend O R a specific relation
between w and t . Suppose one assumes w = ct in (2.22). The time t ,

t = [tl + Po/(c(Yro>l/[.(.I + 1/Q%)I- P O / ( C Q ) , (5.2)


in the accelerated frame F can be physically realized by the computerized clocks [3]: By
basing computerized clocks on a computer chip, one could program any clock in F to
obtain a time reading $t_I$ from the nearest Einstein clock in $F_I$ and, based on c, $\beta_0$, $\alpha$ and
its $F_I$ frame position $x_I$, compute the time t it should display according to (5.2). This is
a general method for synchronization of computerized clocks in a reference frame, without
relying on the constant speed of light. If one compares the rate of ticking of the Einstein
and the computerized clocks at a fixed position $x_I$, one has $(dt/dt_I)_{x_I} = 1/[\alpha x_I + 1/\gamma_0]$.
This can be physically realized because the reading and the rate of ticking of a clock are
adjustable. The computerized clocks in F will automatically become Einstein clocks as
the acceleration approaches zero, $\alpha \to 0$. In this sense, these sophisticated computerized
clocks are generalized Einstein clocks for both inertial and non-inertial frames. We note
that the choice of w = ct in (2.22) to synchronize computerized clocks in F does not imply
that the speed of light is a constant c in the accelerated frame. For the general case w = bt,
where t is defined by an arbitrarily preassigned function $t(x_I, t_I)$, the time t can also be
physically realized by the computerized clocks. It appears that all these different times
$t(x_I, t_I)$ are equally physical, in principle, for describing physical phenomena, as discussed
in [3]. However, from fundamental laws of physics such as (4.11) and (4.12), we can see
that the real evolution variable is w rather than t. Of course, we can also make these
computerized clocks read w directly.
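As an illustration of the clock-programming rule described above, the following minimal sketch evaluates the time (5.2) that a computerized clock would display, together with its ticking rate relative to an Einstein clock at fixed $x_I$. The function names, the SI value of c, and the reading of $\alpha$ as a quantity with units of inverse length (so that $\alpha x_I$ is dimensionless) are assumptions made here for illustration; they are not taken from the paper.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s (assumed SI units)


def gamma_factor(beta0):
    """Lorentz factor for the initial velocity beta0 (in units of c)."""
    return 1.0 / math.sqrt(1.0 - beta0 * beta0)


def computerized_clock_time(t_I, x_I, alpha, beta0, c=C_LIGHT):
    """Time t displayed by a computerized clock in F, following (5.2).

    t_I, x_I : reading and position of the nearest Einstein clock in F_I
    alpha    : acceleration parameter, assumed here to carry units of 1/length
    """
    g0 = gamma_factor(beta0)
    return (t_I + beta0 / (c * alpha * g0)) / (alpha * x_I + 1.0 / g0) - beta0 / (c * alpha)


def clock_rate(x_I, alpha, beta0):
    """Rate (dt/dt_I) at fixed x_I, i.e. 1/[alpha*x_I + 1/gamma0]."""
    return 1.0 / (alpha * x_I + 1.0 / gamma_factor(beta0))


def lorentz_time(t_I, x_I, beta0, c=C_LIGHT):
    """Zero-acceleration limit of (5.2): the usual Lorentz time in F."""
    return gamma_factor(beta0) * (t_I - beta0 * x_I / c)
```

For a very small `alpha`, `computerized_clock_time` should approach `lorentz_time`, which is the sense in which the computerized clocks become Einstein clocks as the acceleration vanishes.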
The boundary condition $Z(0) < \infty$, which leads to $f = \alpha$ in (2.19), is imposed
for simplicity. It is not necessary: If $Z(0) < \infty$ is not imposed, then one has $f \neq \alpha$
in general. However, G(w) and Z(x) involve the factors 1/f and f respectively, so that
W(w, x) = G(w)Z(x) and the resultant physics do not depend on f.
The equivalence of the effects of a gravitational field and those of an observer's
acceleration played an essential role at the birth of general relativity. Nevertheless, some
authors suggested that it be buried with appropriate honors because it is false [9]. It is
the equivalence of gravitational and inertial mass which is precise and necessary for general
relativity.
What is the operational meaning of the constant acceleration $\alpha$? We show that the
constant acceleration $\alpha$ of a particle is directly and uniquely related to the change of its energy
per unit length as measured in an inertial frame $F_I$ [10]:

where we have used the differential transformation (2.10) and the momentum transformation
(4.10) with p p = Jk,. We stress that, within the framework of the four-dimensional
symmetry, the concept of uniform acceleration of a particle can only be defined in the sense
of (5.3), i.e., constant change of a particle's energy $p_{I0}$ per unit length, as measured in
an inertial frame $F_I$. It is gratifying to see that this is precisely what has been used in
high-energy laboratories. Other definitions of acceleration, such as the change of velocity per unit
time, are only approximations for small velocities and are, strictly speaking, incompatible
with the 4-dimensional symmetry.
Our results suggest that the kinematic approach first discussed by Wu and Lee
[1] is more fruitful than the conventional gravitational approach, provided the limiting
4-dimensional symmetry is postulated.
The work is supported in part by the Potz Science Fund. This paper is written
as an affectionate jubilee greeting to Taida's Physics Department. Appropriately, it deals
with 4-dimensional symmetry and time which engrossed JP's thoughts for many years as a
student at Taida.

References
[1] C. Møller, Danske Vid. Sel. Mat.-Fys. XX, No. 19 (1943) and The Theory of Relativity (Clarendon,
Oxford, 1969) pp. 255-258. Ta-You Wu and Y. C. Lee, Intern. J. Theor. Phys. 5, 307
(1972). Wu and Lee assumed (i) local FitzGerald-Lorentz contraction of length, (ii) local
time-dilatation and (iii) a time-independent $g_{00}$ for CLA coordinates, and derived the same
transformation.
[2] Jong-Ping Hsu and Leonardo Hsu, Nuovo Cim. B112 (to be published in April, 1997).
[3] J. P. Hsu and L. Hsu, Phys. Letters A196, 1-6 (1994); (Erratum) ibid. 217, 359 (1996);
Leonardo Hsu and Jong-Ping Hsu, Nuovo Cimento B111, 1283 (1996).
[4] Ta-You Wu, Theoretical Physics, vol. 4: Theory of Relativity (in Chinese, Lian Jing Publishing
Co., 1978) pp. 172-175; see also ref. 1. Roughly speaking, Wu explored the local relation
between accelerated transformation and Lorentz transformation, while we consider their
global relations based on the limiting four-dimensional symmetry principle.
[5] Note that the universal constant 2 is the charge measured in the electromagnetic unit (emu)
rather than in the electrostatic unit (esu).
[6] This equality is only approximate because there is no relativity or equivalence between F
and $F_I$. Nevertheless, it turns out to be an extremely good approximation because atomic
structure is very stable against constant-linear-acceleration. This is basically related to the
metric tensor $g_{00}$ and the smallness of atomic sizes, ~ cm.
[7] When we define w = ct, the speed of light measured in the CLA frame F is $C \equiv dx/dt =
cW(t, x)$ because the propagation of light is described by equation (2.1), ds = 0. Note that
C is anisotropic and depends on space and time in general. We stress that one can also
set $w = bt_I$ in (2.22) and (2.23) without upsetting its Wu group property. Thus, we have a
common time, $t = t_I$, for all frames and, hence, the speed of light measured in F by using
such a common time will be $C = c\gamma(1 - \beta)$, if $dx_I/dt_I = +c$. For discussions of common
time within the 4-dimensional framework, see J. P. Hsu, Phys. Lett. A97, 137 (1983); Nuovo
Cimento B74, 67 (1983); J. P. Hsu and C. Whan, Phys. Rev. A38, 2248 (1988), Appendix.
[8] H. E. Ives and G. R. Stilwell, J. Opt. Soc. Am. 28, 215 (1938); 31, 369 (1941).
[9] J. L. Synge, Relativity: The General Theory (North-Holland, Amsterdam, 1966) pp. ix-x. See
also V. A. Fock, The Theory of Space, Time and Gravitation (Pergamon, London, 1959).
[10] For an object at rest in $F_I$, we have ( d p o / d i ) , I = rna(y- - l ) / ( y Z 2 ) # ( d p I o / d z I ) z . It also
shows the lack of symmetry between a CLA frame F and an inertial frame $F_I$.

Generalizing Lorentz Transformations for Accelerated Frames and Their Physical Implications

Daniel T. Schmitt* and Tobias Kleinschmidt†


April 6, 2005

We discuss reference frames with constant-linear-accelerations and their generalized
spacetime transformations with minimal departure from the Lorentz transformations. The
requirement of limiting 4-dimensional symmetry of the Lorentz and Poincaré groups assures
that the generalized transformations reduce to the Lorentz transformations in the
limit of zero acceleration. The changes in the geometry of spacetime in accelerated frames
are shown graphically, including singularities and horizons. Physical implications and a
feasible experimental test are discussed.

*daniel.schmitt@physik.uni-ulm.de
Department of Theoretical Physics, University of Ulm, Germany
†tobias.kleinschmidt@gmx.de
Department of Physics, University of Massachusetts Dartmouth
North Dartmouth, MA 02747-2300, USA

1 Introduction
In the physics of the future, it is desirable that particle physics and quantum field
theory, including gravity, can be understood in both inertial and non-inertial frames. The
reason is that all physical frames of reference in the universe are, strictly speaking, non-inertial
because of the existence of the long range gravitational force. The inertial frame
is only an approximation or idealization of physically realizable reference frames. To
understand physics in inertial frames, we have the Lorentz transformations or Lorentz and
Poincaré invariance, so that physical theories can be formulated covariantly and tested
experimentally. In contrast, to understand physics in non-inertial frames, there is a group
of all point transformations (one-to-one and twice differentiable) of spacetime in which the
differential form $ds^2 = g_{\mu\nu}dx^\mu dx^\nu$ is invariant. Its invariant theory is the tensor calculus of
the general theory of relativity. However, this group for non-inertial frames is too general
for quantum field theory and yields few specific predictions. For example, there is no
quantum electrodynamics (QED) for accelerated reference frames which would allow us to
calculate particle lifetimes in accelerated frames.
Some physical phenomena involving accelerations of one particle can be treated, by
introducing a co-moving frame,¹ on the basis of the invariant theory of the Lorentz group.
But such an invariant theory is inadequate for treating many other physical phenomena.
In particular, quantizations of fields in non-inertial frames cannot be treated on the basis
of the invariant theory of the Lorentz group.
This motivates us to investigate a simple subset of reference frames which have
constant-linear-accelerations. [1] Our discussions are based on the principle of limiting 4-dimensional
symmetry, which requires that all accelerated transformations reduce to the Lorentz
and Poincaré transformations in the limit of zero acceleration. [2] We discuss specific
spacetime transformations and their geometry and physical properties. Their geometry
and light propagation are illustrated in several graphs. We also discuss a specific experiment
of decay-length dilation to test the generalized Lorentz transformations.

2 Simple constant-acceleration transformations with limiting 4-dimensional symmetry
The concept of linearly accelerated frames was not defined with complete satisfaction
because most accelerated transformations of spacetime are not smoothly connected to the
Lorentz transformations in the limit of zero acceleration. [3] We shall adopt the following
definition for a constant-linear-acceleration (CLA) frame: (i) A particle at rest at any point
in a CLA frame must have a constant change of energy per unit length (or momentum per
unit time) as measured in a high-energy laboratory or in an inertial frame, whereas the values
of accelerations, measured from an inertial frame, may be different for particles located at
different points in the CLA frame. (ii) A finite spacetime transformation between inertial
and accelerated frames exists and it satisfies the limiting 4-dimensional symmetry in the
limit of zero acceleration.
A generalized Lorentz transformation must be consistent with the fact that any accelerated
frame reduces to an inertial frame in the limit of zero acceleration. This is a natural,
basic and necessary requirement from the physical viewpoint. But such a generalization
turns out to be not unique and, hence, theoretically there are infinitely many generalizations.
So far, no established principle can lead to a simple and unique generalization. Of
course, experiment has the final say regarding the correct accelerated spacetime transformation.
Indeed, the Wu and the Møller transformations for constant accelerations are
just two of the simple generalizations, and only the Wu transformation includes the Lorentz
transformation as a limiting case of zero acceleration. [1] There are other simple generalizations
of the Lorentz transformation. In all these minimal generalizations, the spacetime
of CLA frames is characterized by a metric tensor of the form $(W^2, -1, -1, -1) \equiv P_{\mu\nu}$,
which may be called the Poincaré metric tensor. Furthermore, there exist finite transformations
between inertial and CLA frames and, therefore, the spacetime of these CLA
frames is flat, i.e., it has a vanishing Riemann curvature tensor. This property is useful
for formulations of field theory.

¹The co-moving frame is usually treated as an inertial frame, which, of course, is an approximation.
Let us consider the transformations between an inertial frame $F_I(w_I, x_I, y_I, z_I)$ and
a new constant-linear-acceleration (CLA) frame F(w, x, y, z) which moves with a time-dependent
velocity $\beta(w)$ along the x-axis. Suppose the metric tensors in the inertial and
the CLA frames are:
$\eta_{\mu\nu} = \mathrm{diag}(1, -1, -1, -1)$ (1)
$P_{\mu\nu} = \mathrm{diag}(W^2, -1, -1, -1)$ (2)
where W = W(w, x) is any real-valued function of spacetime in F(w, x, y, z) with the
property $W \to 1$ for vanishing acceleration. Thus, the Poincaré metric tensor $P_{\mu\nu}$ reduces
to the Minkowski metric tensor $\eta_{\mu\nu}$ in the limit of zero acceleration.
To derive the spacetime transformation for CLA frames, one may start with a linear
relation for the differentials $(dw_I, dx_I)$ and (dw, dx) with some unknown coefficients which
are functions of spacetime. Other components, dy and dz, are unchanged. Based on
physical considerations, a finite transformation of spacetime must exist between $F_I$ and F.
Thus, these coefficients must satisfy an integrability condition, and the finite transformation
must reduce to the Lorentz transformation in the limit of zero acceleration, $\beta \to \beta_0$. One
obtains two types of simple generalizations of the Lorentz transformation from inertial
frames to constant-linear-acceleration frames, which are characterized by the metric tensor
of the form (2). [1,4]
Let us consider a class of transformations for an inertial frame $F_I(w_I, x_I, y_I, z_I)$ and a
constant-linear-acceleration frame F(w, x, y, z), accelerated in the x-direction, [5] including
constant spacetime translations:

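$$w_I = \gamma\beta\left(x + \frac{1}{\alpha_0\gamma_0^2}\right) - \frac{\beta_0}{\alpha_0\gamma_0} + w_0, \qquad x_I = \gamma\left(x + \frac{1}{\alpha_0\gamma_0^2}\right) - \frac{1}{\alpha_0\gamma_0} + x_0, \qquad (3)$$
$$y_I = y + y_0, \quad z_I = z + z_0; \qquad \gamma = \frac{1}{\sqrt{1-\beta^2}}, \quad \gamma_0 = \frac{1}{\sqrt{1-\beta_0^2}}, \quad \beta = \beta(w) = \text{arbitrary function},$$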

where $x_0^\mu = (w_0, x_0, y_0, z_0)$ are constants and the velocity function $\beta(w)$ will be specified
and discussed below. This general transformation of spacetime is interesting because it
includes the Wu transformation and the Møller transformation for accelerated frames as
special cases. Even though $\beta(w)$ may be an arbitrary function of time w in F, it is actually
a specific and unique expression when we express it in terms of spacetime coordinates of
the inertial frame $F_I$:

Since the right-hand-side of (4) does not involve any arbitrary function, this implies that,
from the viewpoint of observers in the inertial frame $F_I$, the frame F always moves the
same way no matter what function is given to $\beta(w)$. This may be interpreted as a
flexibility of time w of the accelerated frame F, as we shall see below.
When the acceleration in (3) approaches zero, i.e., $\beta \to \beta_0$, the linear-acceleration
transformations (3) must reduce to the Poincaré transformations (provided w = ct and
$w_I = ct_I$),

$w_I = \gamma_0(w + \beta_0 x) + w_0$, $x_I = \gamma_0(x + \beta_0 w) + x_0$, $y_I = y + y_0$, $z_I = z + z_0$. (5)

$\beta(w) \to \beta_0 + \alpha_0 w$, (for small $\alpha_0$). (6)

We stress that as long as the arbitrary function $\beta(w)$ satisfies this limiting property as
$\alpha_0 \to 0$, equation (3) will reduce to the Poincaré transformation (for w = ct and $w_I = ct_I$)
or the Lorentz transformation (when $x_0^\mu = 0$) in the limit of zero acceleration. If a
spacetime transformation for non-inertial frames does not have this limiting 4-dimensional
symmetry, it is incomplete and probably not viable.
In the following discussions, we shall set $x_0^\mu = 0$ for simplicity, and concentrate on the
physical effects on spacetime of reference frames due to accelerations.

3 Wu transformations, singular wall and horizons


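Making the minimal departure from the Lorentz transformation and the easiest non-constant
choice for $\beta(w)$ in (3), we assume $\beta(w) = \beta_0 + \alpha_0 w$. The spacetime transformations (3)
take the following form:

$$x_I = \gamma\left(x + \frac{1}{\alpha_0\gamma_0^2}\right) - \frac{1}{\alpha_0\gamma_0}, \qquad (7)$$
$$w_I = \gamma(\beta_0 + \alpha_0 w)\left(x + \frac{1}{\alpha_0\gamma_0^2}\right) - \frac{\beta_0}{\alpha_0\gamma_0},$$
$$y_I = y, \quad z_I = z; \qquad \beta = \beta_0 + \alpha_0 w.$$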
These are called the Wu transformations, which were the first generalization of the Lorentz
transformation obtained on the basis of limiting 4-dimensional symmetry. [2] We also
have the following transformation for the differential 4-vector $dx^\mu = (dw, dx, dy, dz)$ and the
explicit expression for the invariant interval $ds^2$:

$dw_I = \gamma(W dw + \beta dx)$, $dx_I = \gamma(dx + \beta W dw)$, $dy_I = dy$, $dz_I = dz$, (8)

$ds^2 = dw_I^2 - dx_I^2 - dy_I^2 - dz_I^2 = W^2 dw^2 - dx^2 - dy^2 - dz^2$,
$W = \gamma^2(\gamma_0^{-2} + \alpha_0 x) > 0$.
Note that there is a singular wall at $x = -1/(\alpha_0\gamma_0^2)$, where W, and hence the metric tensor component $P_{00}$,
vanishes, which is unphysical. On the other side of the singular wall, $dw_I$ and $W dw$ turn
out to have a different sign in the transformation (8).
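The inverse transformation of (7) is given by

$$w = \frac{1}{\alpha_0}\left(\frac{w_I + \beta_0/(\alpha_0\gamma_0)}{x_I + 1/(\alpha_0\gamma_0)} - \beta_0\right), \qquad (9)$$
$$x = \sqrt{\left(x_I + \frac{1}{\alpha_0\gamma_0}\right)^2 - \left(w_I + \frac{\beta_0}{\alpha_0\gamma_0}\right)^2} - \frac{1}{\alpha_0\gamma_0^2},$$
$$y = y_I, \quad z = z_I.$$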

Contributions of singular terms in the transformations (3) and (9) cancel in the limit of zero
acceleration. Thus, the inertial limit $\alpha_0 \to 0$ is well-defined for the Wu transformations.
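The statements above can be checked numerically. The sketch below implements the Wu transformation (7) and its inverse (9) in the (w, x) plane, together with the function W from (8); it is a minimal illustration, not the authors' code. The function names are hypothetical, and the inverse takes the positive square root, which assumes the point lies in the physical sector $x + 1/(\alpha_0\gamma_0^2) > 0$ with $\alpha_0 > 0$.

```python
import math


def _gamma(beta):
    """Lorentz-type factor 1/sqrt(1 - beta^2)."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def wu_forward(w, x, alpha0, beta0):
    """Wu transformation (7): CLA coordinates (w, x) -> inertial (w_I, x_I)."""
    g0 = _gamma(beta0)
    beta = beta0 + alpha0 * w          # velocity function beta(w) = beta0 + alpha0*w
    g = _gamma(beta)
    X = x + 1.0 / (alpha0 * g0 ** 2)   # distance from the singular wall
    w_I = g * beta * X - beta0 / (alpha0 * g0)
    x_I = g * X - 1.0 / (alpha0 * g0)
    return w_I, x_I


def wu_inverse(w_I, x_I, alpha0, beta0):
    """Inverse Wu transformation (9), positive-root (physical sector) branch."""
    g0 = _gamma(beta0)
    Wd = w_I + beta0 / (alpha0 * g0)
    Xd = x_I + 1.0 / (alpha0 * g0)
    w = (Wd / Xd - beta0) / alpha0
    x = math.sqrt(Xd ** 2 - Wd ** 2) - 1.0 / (alpha0 * g0 ** 2)
    return w, x


def metric_W(w, x, alpha0, beta0):
    """Metric function W = gamma^2 (gamma0^-2 + alpha0*x) appearing in (8)."""
    g0 = _gamma(beta0)
    g = _gamma(beta0 + alpha0 * w)
    return g ** 2 * (g0 ** -2 + alpha0 * x)
```

A round trip `wu_inverse(*wu_forward(w, x, a, b), a, b)` should return the starting point, and for a very small `alpha0` the forward map should approach the Lorentz transformation $w_I = \gamma_0(w + \beta_0 x)$, $x_I = \gamma_0(x + \beta_0 w)$.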
We are used to picturing the physical world from the viewpoint of observers in an inertial
frame. Let us use the inverse Wu transformations (9) to show the change and
the physical properties of space and time axes (w, x) due to velocity and acceleration.
We observe that for constant w and $x_I \neq -1/(\alpha_0\gamma_0)$, the time transformation in (9)
leads to straight lines. All lines corresponding to constant w pass through the same point
$(w_I, x_I) = (-\beta_0/(\alpha_0\gamma_0), -1/(\alpha_0\gamma_0))$. In contrast, for constant x, the inverse Wu transformation
(9) leads to two sets of hyperbolic lines, which satisfy the condition
$|x_I + 1/(\alpha_0\gamma_0)| > |w_I + \beta_0/(\alpha_0\gamma_0)|$.

Figure 1: Spacetime graphs with velocity $\beta_0 = 0$. (a) Acceleration $\alpha_0 = -0.001$; (b) acceleration $\alpha_0 = +0.001$.

Figure 1 shows lines of constant x and constant w in the $(w_I, x_I)$ plane for the inverse
Wu transformations (9) with zero initial velocity, $\beta_0 = 0$, and different values for the constant
acceleration, $\alpha_0 = -0.001, +0.001$.

Figure 2: Spacetime graphs with velocity $\beta_0 = 0.5$. (a) Acceleration $\alpha_0 = -0.001$; (b) acceleration $\alpha_0 = +0.001$.


Figure 2 shows the same spacetime picture with the initial velocity, $\beta_0 = 0.5$, and two
values of acceleration, $\alpha_0 = -0.001, +0.001$.
Clearly, figure 3 has zero acceleration, so that all lines are straight, corresponding to an
inertial frame with the velocity $\beta_0 = 0.5$, as we expect from the Lorentz transformation.

Figure 3: Spacetime graph with velocity $\beta_0 = 0.5$ and acceleration $\alpha_0 \to 0$.

Figure 4: Lightcone graphs. (a) Velocity $\beta_0 = 0$, acceleration $\alpha_0 = 0.001$; (b) velocity $\beta_0 = 0$, acceleration $\alpha_0 = 0.001$.

Figure 4 shows the changes of the lightcone given by the relation $s^2 = w_I^2 - x_I^2 = 0$, where
$w_I$ and $x_I$ can be expressed in terms of w and x by using the Wu transformation (7).
The straight lines represent the lightcone in an inertial reference frame for comparison.
It is interesting to observe that the region of spacetime consisting of all points
$[x + 1/(\alpha_0\gamma_0^2)] > 0$ and all w from $(-1 - \beta_0)/\alpha_0$ to $(1 - \beta_0)/\alpha_0$ corresponds to the sector
of the $(w_I, x_I)$ plane between $(w_I + \beta_0/(\alpha_0\gamma_0)) = +(x_I + 1/(\alpha_0\gamma_0))$ and
$(w_I + \beta_0/(\alpha_0\gamma_0)) = -(x_I + 1/(\alpha_0\gamma_0))$. As $\alpha_0 \to 0$, this region becomes larger and larger, and eventually covers the
whole plane $(w_I, x_I)$, as one can see in Figures 1, 2 and 3. Thus, this region of spacetime is
identified with the physical world of the CLA frame F. Similarly, the region of spacetime
consisting of all $[x + 1/(\alpha_0\gamma_0^2)] < 0$ is represented by the opposite sector with negative
$(x_I + 1/(\alpha_0\gamma_0))$. This opposite region of spacetime may be called the mirror spacetime of
the CLA frame. The rest of the $(w_I, x_I)$ plane cannot be reached for any real w and x for
$\alpha_0 \neq 0$. The time w is physical only within the range $(-1 - \beta_0)/\alpha_0 < w < (1 - \beta_0)/\alpha_0$.
Outside this range, the CLA frame ceases to exist because the magnitude of the velocity $\beta(w)$ is greater
than 1, which is unphysical. The limits of time, $w \to (\pm 1 - \beta_0)/\alpha_0$, in the CLA frame
imply $w_I \to \pm\infty$ in the inertial frame (everywhere in physical space except at the singular
wall). This property is interlocked with the assumption $\beta(w) = \beta_0 + \alpha_0 w$. If one makes a
different assumption for the arbitrary function $\beta(w)$ in (3), one has a different time for a
CLA frame. [1]
The Wu transformations suggest that accelerations distort spacetime and form a horizon,
which corresponds to a wall singularity $x = -1/(\alpha_0\gamma_0^2)$ (with arbitrary y and z
coordinates) of the coordinates for the CLA frame F, or $(x_I + 1/(\alpha_0\gamma_0))^2 = (w_I + \beta_0/(\alpha_0\gamma_0))^2$.
This singular wall separates physical spacetime from the mirror spacetime. The location
of the wall singularity depends on the sign of the acceleration $\alpha_0$ and the magnitude
$|\alpha_0\gamma_0^2|$. The mirror spacetime emerges from the quadratic equation associated with the
second equation in (9):
$$\left(x + \frac{1}{\alpha_0\gamma_0^2}\right)^2 = \left(x_I + \frac{1}{\alpha_0\gamma_0}\right)^2 - \left(w_I + \frac{\beta_0}{\alpha_0\gamma_0}\right)^2.$$
Note that we also have a quadratic equation for the relativistic energy-momentum relation,
which leads to the negative energy solution. In some sense, the mirror spacetime resembles
the negative energy solution of a particle, so one may ask whether the mirror spacetime
exists in some physical sense. Such a question probably cannot be answered because even
if it exists there is no possibility of communication between these two sectors of spacetime
separated by the singular wall.
The rate of ticking of a clock and the observable speed of light in the CLA frame F
near this singular wall have very peculiar properties. Namely, as $x \to -1/(\alpha_0\gamma_0^2)$ in
(9) (i.e., $(x_I + 1/(\alpha_0\gamma_0)) \to (w_I + \beta_0/(\alpha_0\gamma_0))$), the Wu transformation leads to the following
results:
(a) The clock stands still:

$$\frac{dw}{dw_I} = \frac{1}{\alpha_0}\,\frac{(x_I + 1/(\alpha_0\gamma_0))^2 - (w_I + \beta_0/(\alpha_0\gamma_0))^2}{(x_I + 1/(\alpha_0\gamma_0))^3} \to 0. \qquad (10)$$

(b) The speed of light in the CLA frame F increases indefinitely
265

The result (a) shows that the rate of ticking of a clock at rest in F at the position
(x, 0, 0), i.e., dx = 0, slows down in comparison with a clock at $(x_I, 0, 0)$ in the inertial
frame. The result (b) is due to the fact that the law for the propagation of light is given by
$ds^2 = 0$. The two properties in (10) and (11) are intimately related. We may remark that
it is natural for the ratio (10) to be positive. This is consistent with $(x_I + 1/(\alpha_0\gamma_0)) > 0$
and $(x_I + 1/(\alpha_0\gamma_0))^2 > (w_I + \beta_0/(\alpha_0\gamma_0))^2$, where the last relation is consistent with $\beta^2 < 1$.
Note that the general transformations (3) with an arbitrary velocity function $\beta(w)$
correspond to constant-linear-acceleration in the following sense: Suppose a particle is at
rest in the CLA frame F at the position (x, 0, 0) = constant, we can derive the following
relation,

This result is consistent with the constant acceleration of a charged particle in a high-energy
linear accelerator, which has a constant potential drop per unit length. Thus, a
charged particle gains a constant kinetic energy per unit length. Note that the usual
definition of constant acceleration $d^2x_I/dw_I^2$ = constant in classical mechanics is only an
approximation for small velocities and is inconsistent with high-energy experiment.
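The constant energy gain per unit length can be illustrated with a short numerical sketch. It assumes the second equation of the Wu transformation (7), which gives $\gamma = (x_I + 1/(\alpha_0\gamma_0))/(x + 1/(\alpha_0\gamma_0^2))$ for a particle at rest at x in F, and compares the resulting energy $m_0\gamma$ with the expression $m_0\alpha_0/(\gamma_0^{-2} + \alpha_0 x)$ quoted in ref. [9]. Function names and the choice of units (mass in MeV with c = 1, $\alpha_0$ in inverse meters) are illustrative assumptions.

```python
import math


def gamma_at(x_I, x, alpha0, beta0):
    """gamma of a particle at rest at x in F when it passes x_I in F_I.

    Rearranged from x_I + 1/(alpha0*gamma0) = gamma*(x + 1/(alpha0*gamma0^2))
    so that the expression stays well behaved for small alpha0.
    """
    g0 = 1.0 / math.sqrt(1.0 - beta0 * beta0)
    return g0 * (1.0 + alpha0 * g0 * x_I) / (1.0 + alpha0 * g0 ** 2 * x)


def energy(x_I, x, alpha0, beta0, m0):
    """Energy p_I0 = m0*gamma (units of m0, with c = 1)."""
    return m0 * gamma_at(x_I, x, alpha0, beta0)


def energy_gain_per_length(x, alpha0, beta0, m0):
    """Constant gain (dp_I0/dx_I) at fixed x: m0*alpha0/(gamma0^-2 + alpha0*x)."""
    g0 = 1.0 / math.sqrt(1.0 - beta0 * beta0)
    return m0 * alpha0 / (g0 ** -2 + alpha0 * x)
```

Because the energy is linear in $x_I$ at fixed x, the difference `energy(x_I + 1, ...) - energy(x_I, ...)` should equal `energy_gain_per_length(...)` wherever it is evaluated, which is the sense of "constant acceleration" used in the text.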

4 Lifetime Dilation with Accelerations


Let us consider lifetime dilation experiments. [6] Suppose the particles such as pions
are at rest at the origin x = y = z = 0 in an inertial frame F(w, x, y, z) and their
lifetime is measured in another inertial frame $F_I(x_I^\mu)$. The number of particles at time w,
N(w), is given by the exponential decay law in the inertial frame F,

$N(w) = N_0 e^{-w/w_\tau}$, (13)
where $N_0$ is the number of particles at time w = 0 and $w_\tau$ is the constant lifetime of the
particles at rest. The numerical value of $w_\tau$ is the same as the decay-length $D = \tau_0 c$, where
$\tau_0$ is the usual lifetime at rest. Since the experiment is performed in the inertial frame $F_I$,
we express the law (13) in terms of observable variables in $F_I(x_I^\mu)$ by using the Lorentz
transformations between F and $F_I$, i.e., the coordinates $x^\mu = (w, x, y, z)$ and $x_0^\mu$ in (5)
are replaced by $x^\mu = (w, x, y, z)$ and zero respectively. Using x = y = z = 0, we have
$N = N_0 e^{-w_I/(\gamma_0 w_\tau)} = N_0 e^{-x_I/(\gamma_0\beta_0 w_\tau)}$. (14)

It shows that the particle lifetime is dilated by a factor $\gamma_0$. This result of lifetime dilation
has been confirmed with a very large value of $\gamma_0 \sim 700$ in high energy laboratories. [7]
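As a worked instance of (14), the following sketch computes the surviving fraction $N/N_0$ as a function of laboratory distance $x_I$ for constant velocity. The numbers used in the usage note (rest decay length 7.8 m and $\beta_0 = 0.8$) are the pion values quoted later in this section; the function name is an illustrative choice.

```python
import math


def surviving_fraction_inertial(x_I, beta0, decay_length):
    """N/N0 = exp(-x_I / (gamma0 * beta0 * w_tau)), following (14).

    x_I          : distance travelled in the laboratory frame F_I
    beta0        : constant velocity in units of c
    decay_length : rest decay length w_tau (same length unit as x_I)
    """
    gamma0 = 1.0 / math.sqrt(1.0 - beta0 * beta0)
    return math.exp(-x_I / (gamma0 * beta0 * decay_length))
```

With `beta0 = 0.8` and `decay_length = 7.8`, the product $\gamma_0\beta_0 w_\tau$ is 10.4 m, so the population falls to 1/e of its initial value after 10.4 m of flight rather than 7.8 m.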
We stress that lifetime dilation of a particle due to constant-acceleration is a pure
kinematic property and is independent of detailed interactions of particles. To discuss the
lifetime dilation with constant-linear-accelerations for particles at rest in the CLA frame
F(w, x, y, z), we assume that the form of the exponential decay law (13) still holds to a
good approximation: [8]
$N = N_0 e^{-w/w_\tau}$, (15)
in a linearly accelerated frame. Now instead of using the Lorentz transformations, we use
their generalization, i.e., the Wu transformation (3) with $x_0^\mu = 0$, and express w in terms
of the spacetime variables in the inertial laboratory frame $F_I(x_I^\mu)$:

where (x, 0, 0) denotes the particle's fixed position in the CLA frame $F(x^\mu)$.
In the experiment of particle decay in flight, one measures the number of particle
decays as a function of position $x_I$ rather than time $w_I$ in an inertial laboratory. From
(15) and (16), we obtain the generalized decay law for unstable particles moving with
constant-linear-accelerations:

In the derivation of (16) and (17), we have used the inverse Wu transformation (9).

Figure 5: Particle Decay in Flight with Different Accelerations. This figure shows the
different decays of accelerated pions (mass $m_0$ = 140 MeV, 'decay length' D = 7.8 m = $w_\tau$, and
x = 0) with $\beta_0 = 0.8$. The symbol $x_I$[m] denotes the distance $x_I$ measured in meters, n denotes
N. From the upper to the lower graph, the constant 'acceleration' $F/m_0$ is respectively given by
1/14 m, 1/35 m, 1/70 m, 1/140 m, 1/280 m, and 1/560 m.

For small $\alpha_0$, the second equation in (3) with $x_0 = 0$ and the approximation
$\gamma = \gamma_0(1 + \alpha_0\beta_0\gamma_0^2 w)$ lead to the relation for w:

$$w = \frac{1}{\beta_0}\left(\frac{x_I}{\gamma_0} - x - \alpha_0\gamma_0 x_I x + \alpha_0\gamma_0^2 x^2\right).$$

This relation and (9) lead to the following approximate decay formula for small acceleration
$\alpha_0$,
(18)

Clearly, when $\alpha_0 \to 0$ it reduces to the lifetime dilation (15) of special relativity, if the
particle is at rest at the origin of the moving frame, i.e., x = 0. However, when the location
of particles at rest in F is at $x = a \neq 0$, one obtains a new effect of lifetime dilation due
to acceleration $\alpha_0$. Based on (19), one can verify that $N(a)/N(0) \neq N(0)/N(-a)$. This
result in a CLA frame suggests that we no longer have the usual translational invariance
of inertial frames. This new non-translational invariance in non-inertial frames, suggested
by the lifetime dilation for particle decay in flight with acceleration, could be tested by
experiments in the future. Of course, the measurements of N(a), N(0) and N(-a) are
more difficult than the usual lifetime dilation. We note that the generalized law for the
lifetime dilation as shown in the decay law (17) or (19) holds for both particles moving
with constant velocity and constant-linear-acceleration.
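Since the explicit forms of the generalized decay law did not survive in this copy, the following is only a sketch of how such a law can be evaluated from the ingredients stated in the text: the decay law (15) in the CLA frame and the Wu transformation (7) with $\beta(w) = \beta_0 + \alpha_0 w$. It is an assumption of this sketch, not a quotation of equations (16)-(19), that w is obtained from $\gamma = (x_I + 1/(\alpha_0\gamma_0))/(x + 1/(\alpha_0\gamma_0^2))$, $\beta = \sqrt{1 - 1/\gamma^2}$ and $w = (\beta - \beta_0)/\alpha_0$, with forward motion ($\beta \geq 0$) and $\alpha_0 > 0$. All function names are hypothetical.

```python
import math


def cla_time_at(x_I, x, alpha0, beta0):
    """CLA-frame time w at which a particle at rest at x in F passes x_I in F_I.

    Assumes forward motion (beta >= 0) and alpha0 > 0.
    """
    g0 = 1.0 / math.sqrt(1.0 - beta0 * beta0)
    # gamma from the second equation of (7), rearranged to avoid large cancellations
    gamma = g0 * (1.0 + alpha0 * g0 * x_I) / (1.0 + alpha0 * g0 ** 2 * x)
    beta = math.sqrt(1.0 - 1.0 / gamma ** 2)
    return (beta - beta0) / alpha0


def surviving_fraction_cla(x_I, x, alpha0, beta0, w_tau):
    """N/N0 = exp(-w/w_tau) from (15), with w expressed through x_I."""
    return math.exp(-cla_time_at(x_I, x, alpha0, beta0) / w_tau)
```

For x = 0 and a very small `alpha0` this should reproduce the constant-velocity result $\exp[-x_I/(\gamma_0\beta_0 w_\tau)]$, while a larger positive `alpha0` gives a larger surviving fraction at the same $x_I$, because less CLA-frame time w has elapsed. Comparing values at x = a, 0 and -a gives a rough numerical look at the position dependence discussed above.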
To see the behavior of the particle decay at different accelerations, let us plot the curves
with different values of $\alpha_0$. In linear accelerators, the constant acceleration $\alpha_0$ is expressed
in terms of a constant potential drop per unit length,

we have [9]
1

Let us use $m_0 = 140$ MeV/$c^2$ and the rest decay length D = 7.8 meters for pions at rest in
F moving with a constant velocity $\beta_0 = 0.8$ for a demonstration. The exponential law for
particle decay in flight with different potential drops per meter is shown in figures 5 and
6.
It is important to determine experimentally the correct physical time in accelerated
frames before one formulates physical theories in accelerated frames. Therefore, this type
of experiment to test physical time in accelerated frames is crucial to free our understanding
of physics from the bondage of inertial frames. The experimental results will also motivate
physicists to formulate theories in both inertial and non-inertial frames and to extend our
view and understanding of the physical world.

The work was supported in part by the Jing Shin Research Fund of the UMassD
Foundation and the Potz Science Fund.


Figure 6: Particle Decay in Flight with Different Initial Velocities. This figure shows
the different decays of accelerated pions (mass $m_0$ = 140 MeV, decay length D = 7.8 m = $w_\tau$,
and x = 0) with the force F = lrMeV/m acting on them. The initial velocity $\beta_0$ is, from the
upper to the lower graph, given by 0.99, 0.5, 0.4, 0.3, 0.2, and 0.1 respectively.

References
[1] Daniel T. Schmitt and Jong-Ping Hsu, Intern. J. Modern Phys. A (2005, to be published).
[2] Jong-Ping Hsu and Leonardo Hsu, Nuovo Cimento B 112, 575 (1997) and Chin. J.
Phys. 35, 407 (1997).
[3] C. Møller, Danske Vid. Sel. Mat.-Fys. 20, No. 19 (1943); see also The Theory of
Relativity (Oxford University Press, 1952), Chapter VII; Ta-You Wu and Y. C. Lee,
Intern. J. Theoretical Phys. 5, 307 (1972); Ta-You Wu, Theoretical Physics, vol. 4,
Theory of Relativity (Lian Jing Publishing Co., Taipei, 1978) pp. 172-175; T. Fulton
and F. Rohrlich, Ann. Phys. 9, 499 (1960); T. Fulton, F. Rohrlich and L. Witten,
Nuovo Cimento XXVI, 652 (1962); E. A. Desloge and R. J. Philpott, Am. J. Phys.
55, 252 (1987).
[4] Jong-Ping Hsu, Einstein's Relativity and Beyond - New Symmetry Approaches
(World Scientific, Singapore, 2000), Chapters 21-23. See also ref. 1.
[5] Jong-Ping Hsu, in Symposium on the Frontiers of Physics at Millennium (Edited by
Yue-Liang Wu and Jong-Ping Hsu, World Scientific, Singapore, 2001) pp. 321-328.
[6] Tobias Kleinschmidt and Jong-Ping Hsu, UMassD preprint (2003).
[7] N. Grossman, K. Heller, et al., Phys. Rev. Lett. 59 (1987) 18.
[8] Explicit calculation of particle decay at rest based on quantum field theory formulated
in non-inertial frames appears to be non-trivial. So far, there is no satisfactory
formulation of quantum field theory in non-inertial frames. The assumption of a constant
lifetime of a particle at rest in CLA frames can be justified for small accelerations.
Furthermore, this constant value of lifetime is not crucial for experimental tests
discussed here. The position-dependence of N(x) in equation (17) or (18) is crucial
for experimental tests.
[9] Their relation can be obtained by calculating $(dp_{I0}/dx_I)$ with x fixed, where $p_{I0} =
m_0\gamma$. One has $(dp_{I0}/dx_I)_x = |(dp_{I1}/dw_I)_x| = m_0\alpha_0/(\gamma_0^{-2} + \alpha_0 x) = F$. See also
equation (12).
Chapter 6

Quantum Gravity and 'Ghosts'*

*R. P. Feynman, B. S. DeWitt, L. D. Faddeev, V. N. Popov, S. Mandelstam,
E. S. Fradkin, I. V. Tyutin, F. J. Dyson

Vol. XXIV (1963) ACTA PHYSICA POLONICA Fasc. 6 (12)

QUANTUM THEORY OF GRAVITATION


BY R. P. FEYNMAN

(Received July 3, 1963)

My subject is the quantum theory of gravitation. My interest in it is primarily in the
relation of one part of nature to another. There's a certain irrationality to any work in gravitation,
so it's hard to explain why you do any of it; for example, as far as quantum effects
are concerned let us consider the effect of the gravitational attraction between an electron
and a proton in a hydrogen atom; it changes the energy a little bit. Changing the energy
of a quantum system means that the phase of the wave function is slowly shifted relative
to what it would have been were no perturbation present. The effect of gravitation on the
hydrogen atom is to shift the phase by 43 seconds of phase in every hundred times the
lifetime of the universe! An atom made purely by gravitation, let us say two neutrons held
together by gravitation, has a Bohr orbit of $10^8$ light years. The energy of this system is
$10^{-70}$ rydbergs. I wish to discuss here the possibility of calculating the Lamb correction to
this thing, an energy, of the order [...]. This irrationality is shown also in the strange
gadgets of Prof. Weber, in the absurd creations of Prof. Wheeler and other such things,
because the dimensions are so peculiar. It is therefore clear that the problem we are working
on is not the correct problem; the correct problem is what determines the size of gravitation?
But since I am among equally irrational men I won't be criticized I hope for the fact
that there is no possible, practical reason for making these calculations.
I am limiting myself to not discussing the questions of quantum geometry nor what
happens when the fields are of very short wave length. I am not trying to discuss any problems
which we don't already have in present quantum field theory of other fields, not that
I believe that gravitation is incapable of solving the problems that we have in the present
theory, but because I wish to limit my subject. I suppose that no wave lengths are shorter
than one-millionth of the Compton wave length of a proton, and therefore it is legitimate to
analyze everything in perturbation approximation; and I will carry out the perturbation
approximation as far as I can in every direction, so that we can have as many terms as we
want, which means that we can go to ten to the minus two-hundred and something rydbergs.
I am investigating this subject despite the real difficulty that there are no experiments.
Therefore there is no real challenge to compute true, physical situations. And so I made

* Based on a tape-recording of Professor Feynman's lecture at the Conference on Relativistic Theories
of Gravitation, Jablonna, July, 1962. - Ed.

believe that there uere experiments; I imagined that there were a lot of ezperinients ;mcl
that the gravitational constant was more like the electrical constant and that they were coming
up with data on the various gravitating atoms, and so forth; and that it was a challenge to
calculate whether the theory agreed with the data. So that in each case I gave myself a specific
physical problem; not a question, what happens in a quantized geometry, how do you define
an energy tensor etc., unless that question was necessary to the solution of the physical
problem, so please appreciate that the plan of the attack is a succession of increasingly
complex physical problems; if I could do one, then I was finished, and I went to a harder
one imagining the experimenters were getting into more and more complicated situations.
Also I decided not to investigate what I would call familiar difficulties. The quantum
electrodynamics diverges; if this theory diverges, it's not something to be investigated unless it
produces any specific difficulties associated with gravitation. In short, I was looking entirely
for unfamiliar (that is, unfamiliar to meson physics) difficulties. For example, it's
immediately remarked that the theory is non-linear. This is not at all an unfamiliar difficulty;
the theory, for example, of the spin 1/2 particles interacting with the electromagnetic field
has a coupling term $\bar{\psi} A \psi$ which involves three fields and is therefore non-linear; that's
not a new thing at all. Now, I thought that this would be very easy and I'd just go ahead
and do it, and here's what I planned. I started with the Lagrangian of Einstein for the
interacting field of gravity and I had to make some definition for the matter since I'm dealing
with real bodies and make up my mind what the matter was made of; and then later I would
check whether the results that I have depend on the specific choice or they are more powerful.
I can only do one example at a time; I took spin zero matter; then, since I'm going to make
a perturbation theory, just as we do in quantum electrodynamics, where it is allowed (it is
especially more allowed in gravity where the coupling constant is smaller), $g_{\mu\nu}$ is written
as flat space as if there were no gravity plus $\kappa$ times $h_{\mu\nu}$, where $\kappa$ is the square root of the
gravitational constant. Then, if this is substituted in the Lagrangian, one gets a big mess,
which is outlined here.

$$g_{\mu\nu} = \delta_{\mu\nu} + \kappa h_{\mu\nu}.$$
Substituting and expanding, and simplifying the results by a notation (a bar over a tensor
means

$$\bar{x}_{\mu\nu} = \tfrac{1}{2}\left(x_{\mu\nu} + x_{\nu\mu} - \delta_{\mu\nu}\, x_{\sigma\sigma}\right);$$

notice that if $x_{\mu\nu}$ is symmetric, $\bar{\bar{x}}_{\mu\nu} = x_{\mu\nu}$) we get

First, there are terms which are quadratic in $h$; then there are terms which are quadratic
in $\varphi$, the spin zero meson field variable; then there are terms which are more complicated
than quadratic; for example, here is a term with two $\varphi$'s and one $h$, which I will write $h\varphi\varphi$
(I have written that one out, in particular); there are terms with three $h$'s; then there are
terms which involve two $h$'s and two $\varphi$'s; and so on and so on with more and more
complicated terms. The first two terms are considered as the free Lagrangian of the gravitational
field and of the matter.
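
The bar notation introduced above has two properties that the later argument leans on: in four dimensions, barring a symmetric tensor twice gives it back, and the bar can be moved from one factor of a contraction to the other. A minimal numerical sketch follows, assuming the Euclidean-style $\delta_{\mu\nu}$ summation convention of the text; the function names are hypothetical.

```python
DIM = 4  # number of space-time dimensions assumed in the text


def bar(x):
    """Return the barred tensor: (x + x^T)/2 - (1/2) * delta * trace(x)."""
    trace = sum(x[s][s] for s in range(DIM))
    return [[0.5 * (x[m][n] + x[n][m]) - (0.5 * trace if m == n else 0.0)
             for n in range(DIM)] for m in range(DIM)]


def contract(a, b):
    """Full contraction a_{mu nu} b_{mu nu} with all indices summed."""
    return sum(a[m][n] * b[m][n] for m in range(DIM) for n in range(DIM))


def max_abs_difference(a, b):
    """Largest element-wise difference between two tensors."""
    return max(abs(a[m][n] - b[m][n]) for m in range(DIM) for n in range(DIM))


# Two arbitrary symmetric test tensors (illustrative values only).
S = [[float(m * n + m + n + 1) for n in range(DIM)] for m in range(DIM)]
S_PRIME = [[float((m - n) ** 2 + 2) for n in range(DIM)] for m in range(DIM)]

if __name__ == "__main__":
    print("double bar deviates from original by", max_abs_difference(bar(bar(S)), S))
    print("bar(S).S' =", contract(bar(S), S_PRIME))
    print("S.bar(S') =", contract(S, bar(S_PRIME)))
```

The double-bar property depends on the trace of $\delta_{\mu\nu}$ being 4, so it would fail if `DIM` were changed to another value.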
Now we look first at what we would want to solve the problem classically; we take the
variation of this with respect to $h$, from the first term we produce a certain combination of
second derivatives, and on the other side a mess involving higher orders than first. And the
same with the $\varphi$, of course.

We will speak in the following way: (3) is a wave equation, of which $S_{\mu\nu}$ is the source, just
like (4) is the wave equation of which $\chi$ is the source. The problem is to solve those
equations in succession, and to use the usual methods of calculation of the quantum theory.
Inasmuch as I wanted to get into the minimum of difficulties, I just took a guess that I use
the same plan as I do in electricity; and the plan in electricity leads to the following
suggestion here: that if you have a source, you divide by the operator on the left side of (3)
in momentum space to get the propagator field. So I have to solve this equation (3). But
as you all know it is singular; the entire Lagrangian in the beginning was invariant under
a complicated transformation of $g$, which in the form of $h$ is the following; if you add to $h$
a gradient plus more, the entire system is invariant:

where $\xi_\mu$ is arbitrary, and $\mu$ and $\nu$ should be made symmetric in all these equations. As
a consequence of this same invariance in the complete Lagrangian one can show that the
source $S_{\mu\nu}$ must have zero divergence, $S_{\mu\nu,\nu} = 0$. In fact equations (3) would not be consistent
without this condition as can be seen by barring both sides and taking the divergence - the
left side vanishes identically. Now, because of the invariance of the equations, in the same
way that the Maxwell equations cannot be solved to get a unique vector potential - so
these can't be solved and we can't get a unique propagator. But because of the invariance
under the transformation some arbitrary choice of a condition on $h_{\mu\nu}$ can be made, analogous
to the Lorentz condition $A_{\mu,\mu} = 0$ in quantum electrodynamics. Making the simplest choice
which I know, I make the choice $\bar{h}_{\mu\sigma,\sigma} = 0$. This is four conditions and I have free the four
variables $\xi_\mu$ that I can adjust to make the condition satisfied by $h_{\mu\nu}$. Then this equation (3)
is very simple, because two terms in (3) fall away and all we have is that the d'Alembertian
of $h$ is equal to $\bar{S}$. Therefore the generating field from a source $S_{\mu\nu}$ will equal the $\bar{S}_{\mu\nu}$ times
$1/k^2$ in Fourier series, where $k^2$ is the square of the frequency, wave vector; the time part
might be called the frequency $\omega$, the space part $\mathbf{k}$. This is the analogue of the equation in
electricity that says that the field is $1/k^2$ times the current. In the method of quantum field
theory, you have a source which generates something, and that may interact later with
something else; the interaction, of course, is $S_{\mu\nu} h_{\mu\nu}$; so that, I say, one source may create a potential
which acts on another source. So, to take the very simplest example of two interacting
systems, let's say $S$ and $S'$, the result would be the following: $h$ would be generated by $S_{\mu\nu}$,
and then it would interact with $S'_{\mu\nu}$; so we would get for the interaction of two systems, of
two particles, the fundamental interaction that we investigate

$$\kappa^2\, S'_{\mu\nu}\, \frac{1}{k^2}\, \bar{S}_{\mu\nu}. \qquad (6)$$

This represents the law of gravitational interaction expressed by means of an interchange
of a virtual graviton. To understand the theory better and to see how far we already arrived,
we expand it out in components. Let index 4 represent the time, and 3 the direction of $\mathbf{k}$,
so that 1 and 2 are transverse. The condition $k_\mu S_{\mu\nu} = 0$ becomes $\omega S_{4\nu} = k S_{3\nu}$, where $k$ is
the magnitude of $\mathbf{k}$. Using this, many of the terms involving number 3 component of $S$ can
be replaced by terms in number 4 components. After some rearranging there results

There is a singular point in the last term when $\omega = k$, and to be precise we put in the $+i\varepsilon$
as is well-known from electrodynamics. You note that in the first two terms instead of one
over a four-dimensional $\omega^2 - k^2$ we have here just $1/k^2$, the momentum itself. $S_{44}$ is the
energy density, so this first term represents the two energy densities interacting with no $\omega$
dependence which means, in the Fourier transform, an interaction instantaneous in time;
and $1/k^2$ means $1/r$ in space, so there's an instantaneous $1/r$ interaction between masses,
Newton's law. In the next term there's another instantaneous term which says that Newton's
mass law should be corrected by some other components analogous to a kind of magnetic
interaction (not quite analogous because the magnetic interaction in electricity already
involves a $k^2 - \omega^2 + i\varepsilon$ propagator rather than just $k^2$. But the $k^2 - \omega^2 + i\varepsilon$ in gravitation
comes even later and is a much smaller term which involves velocities to the fourth). So
if we really wanted to do problems with atoms that were held together gravitationally it
would be very easy; we would take the first term, and possibly even the second as the
interaction. Being instantaneous, it can be put directly into a Schrödinger equation, analogous
to the $e^2/r$ term for electrical interaction. And that takes care of gravitation to a very high
accuracy, without a quantized field theory at all. However, for still higher accuracy we
have to do the radiative corrections, which come from the last term.
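
The step from "$1/k^2$ with no $\omega$ dependence" to "an instantaneous $1/r$ interaction" is the standard Fourier-transform identity, written out here as a reminder. The $2\pi$ normalization is a convention chosen for illustration and is not taken from the lecture.

```latex
\int \frac{d\omega}{2\pi}\, e^{-i\omega t}
\int \frac{d^3k}{(2\pi)^3}\, \frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{k^2}
  \;=\; \delta(t)\,\frac{1}{4\pi r}
```

The $\delta(t)$ comes from the absence of any $\omega$ in the integrand, and the $1/(4\pi r)$ is the Coulomb (here Newtonian) potential of a point source.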
Radiation of free gravitons corresponds to the situation that there is a pole in the
propagator. There is a pole in the last term when $\omega = k$, of course, which means that the wave
number and the frequency are related as for a mass zero particle. The residue of the pole,
we see, is the product of two terms; which means that there are two kinds of waves, one
generated by $S_{11} - S_{22}$ and the other generated by $S_{12}$, and so we have two kinds of
transverse polarized waves, that is there are two polarization states for the graviton. The linear
combinations $S_{11} - S_{22} \pm 2iS_{12}$ vary with angle $\theta$ of rotation in the 1-2 plane as $e^{\pm 2i\theta}$ so the
graviton has spin 2, component $\pm 2$ along direction of polarization. Everything is clear
directly from the expression (7); I just wanted to illustrate that the propagator (6) of quantum
mechanics and all that we know about the classical situation are in evident coincidence.
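
The statement that the combinations $S_{11} - S_{22} \pm 2iS_{12}$ pick up a phase $e^{\pm 2i\theta}$ under a rotation in the 1-2 plane can be checked numerically. The sketch below assumes an active rotation $S \to R S R^{T}$ of a real symmetric tensor; with the opposite (passive) convention the sign of the phase flips. All names are hypothetical.

```python
import cmath
import math


def rotate_12(s, theta):
    """Rotate the 1-2 block of a symmetric tensor: s' = R s R^T."""
    c, sn = math.cos(theta), math.sin(theta)
    r = [[c, -sn], [sn, c]]
    return [[sum(r[i][a] * r[j][b] * s[a][b] for a in range(2) for b in range(2))
             for j in range(2)] for i in range(2)]


def helicity_combination(s, sign):
    """Return S11 - S22 + sign * 2i * S12 for sign = +1 or -1."""
    return (s[0][0] - s[1][1]) + sign * 2j * s[0][1]


def phase_mismatch(s, theta, sign):
    """|combo(rotated s) - exp(sign * 2i * theta) * combo(s)|; zero for spin 2."""
    rotated = helicity_combination(rotate_12(s, theta), sign)
    expected = cmath.exp(sign * 2j * theta) * helicity_combination(s, sign)
    return abs(rotated - expected)


# Arbitrary symmetric transverse source components (illustrative values only).
S_TRANSVERSE = [[1.3, -0.4], [-0.4, 0.2]]

if __name__ == "__main__":
    for sign in (+1, -1):
        print(sign, phase_mismatch(S_TRANSVERSE, 0.37, sign))
```

The factor 2 in the exponent is the point: a vector component $S_1 \pm iS_2$ would rotate with $e^{\pm i\theta}$, while the two-index combination rotates twice as fast.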
In order to proceed to make specific calculations by means of diagrams, beside the
propagator we need to know just what the junctions are, in other words just what the $S$'s
are for a particular problem; and I shall just illustrate how that's done in one example.
It is done by looking at the non-quadratic terms in the Lagrangian; I've written one out
completely. This one has an $h$ and two $\varphi$'s in the Lagrangian (2). The rules of the quantum
mechanics for writing this thing are to look at the $h$ and two $\varphi$'s: one $\varphi$ each refers to the
in and out particle, and the one $h$ corresponds to the graviton; so we immediately see in
that term a two particle interaction through a graviton (see Fig. 1). And we can immediately

Fig. 1

read off the answer for the interaction this way: if the $p^1$ and $p^2$ are the momenta of the
particles and $q$ the momentum of the graviton; and $e_{\alpha\beta}$ is the polarization tensor of the
plane wave representing the graviton, that is $h_{\alpha\beta} = e_{\alpha\beta}\, e^{iq\cdot x}$, the Fourier expansion of this
term gives the amplitude for the coupling of two particles to a graviton

So this is a coupling of matter to gravity; it is first order, and then there are higher terms;
but the point I'm trying to make is that there is no mystery about what to write down -
everything is perfectly clear, from the Lagrangian. We have the propagator, we have the
couplings, we can write everything. A term like $hhh$ implies a definite formula for the
interaction of three gravitons; it is very complicated, and I won't write it down, but you
can read it right off directly by substituting momenta for the gradients. That such a term
exists is, of course, natural, because gravity interacts with any kind of energy, including
its own, so if it interacts with an object-particles it will interact with gravitons; so this is
the scattering of a graviton in a gravitational field, which must exist. So that everything
is directly readable and all we have to do is proceed to find out if we get a sensible physics.
I've already indicated that the physics of direct interactions is sensible; and I go ahead
now to compute a number of other things.
To take just one example, we compute the Compton effect, or the analogue rather,
of the Compton effect, in which a graviton comes in and out on a particle. The amplitude
for this is a sum of terms corresponding to the diagrams of Fig. 2. The amplitude for
the first diagram of Fig. 2 is the coupling (8) times the propagator for the intermediate
meson which reads $(p^2 - m^2)^{-1}$, which is the Fourier transform of the equation (4) which
is the propagation of the spin zero particle. Then there is another coupling of the same
form as (8). We multiply these together, to get the amplitude for that diagram

where we should substitute $p = p^2 + q^b = p^1 + q^a$. Then you must add similar contributions
from the other diagrams.

Fig. 2 (diagrams A, B, C, D)

The third one comes in because there are terms with two $h$'s and two $\varphi$'s in the Lagrangian.
One adds the four diagrams together and gets an answer for the Compton effect. It is rather
simple, and quite interesting; that it is simple is what is interesting, because the labour is
fantastic in all these things.
But the thing I would like to emphasize is this; in this problem we used a certain wave
$e^a_{\alpha\beta}$ for the incoming graviton number $a$ say; the question is could we use a different one?
According to the theory, it should really be invariant under coordinate transformations
and so on, but what it corresponds to here is the analogue of gauge invariance, that you can
add to the potential a gradient (see (5)). And therefore it should be that if I changed $e^a_{\alpha\beta}$ of
a particular graviton to $e^a_{\alpha\beta} + q^a_\alpha \xi_\beta$ where $\xi$ is arbitrary, and $q^a$ is the momentum of the
graviton, there should be no change in the physics. In short, the amplitude should be unchanged;
and it is. The amplitude for this particular process is what I call gauge-invariant, or
coordinate-transforming invariant. At first sight this is somewhat puzzling, because you would
have expected that the invariance law of the whole thing is more complicated, including
the last two terms in (5), which I seem to have omitted. But those terms have been included;
you see asymptotically all you have to do is worry about the second term, the last two in
$h$'s times $\xi$'s are in fact generated by the last diagram, Fig. 2D; when I put a gradient in here
for this one, what this means is if I put for the incoming wave a pure gradient, I should get
zero. If I put the gradient $q^a_\alpha \xi_\beta$ in for $e^a_{\alpha\beta}$ on this term D, I get a coupling between $\xi$
and the other field $e^b_{\alpha\beta}$ because of the three graviton coupling. The result, as far as the
matter line is concerned, is that it is acted on in first order by a resultant field
$e^b_{\alpha\nu}\xi_\nu q^a_\beta + \tfrac{1}{2}\, q^b_\nu\, e^b_{\alpha\beta}\,\xi_\nu$, which is just the last two terms in (5). The rule is that the field which acts on the
matter itself must be invariant the way described by (5); but here in Fig. 2 I've already
calculated all the corrections, the generator and all the necessary non-linear modifications if I
take all the diagrams into account. In short, asymptotically far away if I include all kinds
of diagrams such as D, the invariance need be checked only for a pure gradient added to
an incoming wave. It takes care of the non-linearities by calculating them through the
interaction.
I would like, now, to emphasize one more point that is very important for our later
discussion. If I add a gradient, I said, the result was zero. Let's call $a$ the one graviton coming
in and $b$ the other one in every diagram. The result is zero if I use a gradient for $a$, only
if $b$ is a free graviton with no source; that is if it is either really an honest graviton
with $(q^a)^2 = 0$, or a pure potential, which is a solution of the free wave equation. That is
unlike electrodynamics, where the field $b$ could have been any potential at all and adding
a gradient to $a$ would have made no difference. But in gravity, it must be that $b$ is a pure
wave; the reason is very simple. There is no way to avoid this by changing any propagators;
this is not a disease - there is a physical reason. The reason can be seen as follows: If this
$b$ had a source, let me modify my diagrams to show the source of $b$, suppose some other
matter particle made the $b$, so we add onto each $b$ line a matter line at the end, like Fig. 3a.
(E.g. Fig. 2a becomes Fig. 3b etc.)

Fig. 3 (diagrams a, b, c)

Now, if $b$ isn't a free wave, but it had a source, the situation is this. If this $a$ field is taken
as a gradient field which operates everywhere on everything in the diagram
it should give zero. But we forgot something; there's another type of diagram, if the $a$
is supposed to act on everything, one of which looks like Fig. 3c, in which the $a$ itself
acts on the source of $b$ and then $b$ comes over to interact with the original matter. In other
words, among all the diagrams where there is a source, there's also these of type 3c. The
sum of all diagrams is zero; but the sum of those like Fig. 2 without those of type 3c is not
zero, and therefore if I were to just calculate the diagrams of Fig. 2 and forget about the
source of $b$ and then put a gradient in for $a$ the result cannot be zero, but must be
getting ready to cancel the terms from the likes of 3c when I do it right. That will turn out
to be an important point to emphasize. I have done a lot of problems like this, without
closed loops but I won't bore you with all the problems and answers; there's nothing new,
I mean nothing interesting, in the sense that no apparent difficulties arise.
However, the next step is to take situations in which we have what we call closed loops,
or rings, or circuits, in which not all momenta of the problem are defined. Let me just
mention something. I've analyzed this method both by doing a number of problems, and by
a mathematical high-class elegant technique - I can do high class mathematics too, but
I don't believe in it, that's the difference. I have to check it in a problem. I can prove that
no matter how complicated the problem is, if you take it in the order in which there are
no rings, in which every momentum is determined, the invariance is satisfied, the system
is independent of what choice I made of gauge and of the propagator I made in the
beginning; and everything is all right, there are no difficulties. I emphasize that this contains all
the classical cases, and so I'm really saying there are no difficulties in the classical
gravitation theory. This is not meant as a grand discovery, because after all, you've been worrying
about all these difficulties that I say don't exist, but only for you to get an idea of the
calibration - what I mean by difficulties! If we take the next case, let's say the interaction of
two particles in a higher order, then you get diagrams of which I'll only begin to write
a few of them. One that looks like this in which two gravitons are exchanged,

Fig. 4 (diagrams a, b, c)

or, for instance, a graviton gets split into two gravitons and then come back - these are
only the beginning of a whole series of frightening-looking pictures, which correspond to
the problem of calculating the Lamb shift, or the radiative corrections to the hydrogen
atom. When I tried to do this, I did it in a straightforward way, following all the rules, putting
in the propagator $1/k^2$, and so on. I had some difficulties, the thing didn't look gauge
invariant but that had to do with the way I was making the cutoffs, because the stuff is infinite.
Shortage of time doesn't permit me to explain the way I got around all those things, because
in spite of getting around all those things the result is nevertheless definitely incorrect.
It's gauge-invariant, it's perfectly O.K. looking, but it is definitely incorrect. The reason
I knew it was incorrect is the following. In order to get it gauge-invariant, I had to do a lot
of pushing and pulling, and I got the feeling that the thing might not be unique. I figured
that maybe somebody else could do it another way or something, and I was rather suspicious,
so I tried to get more tests for it; and a student of mine, by the name of Yura, tested to see
if it was unitary; and what that means is the following: Let me take instead of this scattering
problem, a problem of Fig. 4 in which time runs vertically, a problem which gives the same
diagrams but in which time is running horizontally, which is the annihilation of a pair, to
produce another pair, and we are calculating second order corrections to that problem.
Let's suppose for simplicity that in the final state the pair is in the same state as before.
Then, adding all these diagrams gives the amplitude that if you have a pair, particle and
antiparticle, they annihilate and recreate themselves; in other words it's the amplitude
that the pair is still in the same state as a function of time. The amplitude to remain in the
same state for a time $T$ in general is of the form

$$e^{-i(E - i\gamma)T};$$

you see that the imaginary part of the phase goes as $e^{-\gamma T}$; which means that the probability
of being in a state must decrease with time. Why does the probability decrease in time?
Because there's another possibility, namely, these two objects could come together, annihilate,
and produce a real pair of gravitons. Therefore, it is necessary that this decay rate of the
closed loop diagrams in Fig. 4 that I obtain by directly finding the imaginary part of the sum
agrees with another thing I can calculate independently, without looking at the closed loop
diagrams. Namely, what is the rate at which a particle and antiparticle annihilate into two
gravitons? And this is very easy to calculate (same set of diagrams as Fig. 2, only turned on
its side). I calculated this rate from Fig. 2, checked whether this rate agrees with the rate
at which the probability of the two particles staying the same decreases (imaginary part of
Fig. 4), and it does not check. Something's the matter.
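
The link between an imaginary part in the phase and a decaying probability, which is what the unitarity test compares against the annihilation rate, can be made concrete in a few lines. This is a sketch under an assumed parametrization: the amplitude is taken to be $e^{-i(E - i\gamma)T}$ with real $E$ and $\gamma > 0$; the symbols and function names are illustrative choices, not the lecture's.

```python
import cmath
import math


def survival_amplitude(energy, gamma, elapsed):
    """Amplitude to remain in the same state: exp(-i * (E - i*gamma) * T)."""
    return cmath.exp(-1j * (energy - 1j * gamma) * elapsed)


def survival_probability(energy, gamma, elapsed):
    """Squared modulus of the amplitude; the real energy E drops out."""
    return abs(survival_amplitude(energy, gamma, elapsed)) ** 2


if __name__ == "__main__":
    energy, gamma = 5.0, 0.1  # arbitrary illustrative values
    for elapsed in (0.0, 1.0, 2.0, 4.0):
        numeric = survival_probability(energy, gamma, elapsed)
        closed_form = math.exp(-2.0 * gamma * elapsed)
        print(elapsed, numeric, closed_form)
```

In this parametrization the probability falls as $e^{-2\gamma T}$, so the decay rate read off from the closed-loop diagrams is $2\gamma$; unitarity demands that this equal the total rate into real final states computed from the tree diagrams.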
This made me investigate the entire subject in great detail to find out what the trouble
is. I discovered in the process two things. First, I discovered a number of theorems, which as
far as I know are new, which relate closed loop diagrams and diagrams without closed
loops (I shall call the latter diagrams "trees"). The unitarity relation which I have just
been describing, is one connection between a closed loop diagram and a tree; but I found
a whole lot of other ones, and this gives me more tests on my machinery. So let me just tell
you a little bit about this theorem, which gives other rules. It is rather interesting. As a matter
of fact, I proved that if you have a diagram with rings in it there are enough theorems
altogether, so that you can express any diagram with circuits completely in terms of diagrams
with trees and with all momenta for tree diagrams in physically attainable regions and on
the mass shell. The demonstration is remarkably easy. There are several ways of
demonstrating it; I'll only choose one. Things propagate from one place to another, as I said, with
amplitude $1/k^2$. When translated into space, that's a certain propagation function which
you might call $K_+(1, 2)$, a function of two positions, 1, 2, in space-time. It represents, in the
past, incoming waves and in the future, it represents outgoing waves; so you have
waves come in and out; and that's the conventional propagator, with the $i\varepsilon$ and so on,
as usually represented. However, this is only a solution of the propagator's equation,
the wave equation I mean; it is a special solution, as you all know. There are other solutions;
for instance there is a solution which is purely retarded, which I'll call $K_{ret}$, and which exists
only inside the future light-cone. Now, if you have two Green's functions for the same
equation they must differ by some solution of the homogeneous equation, say $K_x$. That
means $K_x$ is a solution of the free wave equation and $K_+ = K_{ret} + K_x$. In a ring like Fig. 4a
we have a whole product of these $K_+$'s. For example, for four points 1, 2, 3, 4 in a ring
we have a product like this: $K_+(1,2)K_+(2,3)K_+(3,4)K_+(4,1)$ (all $K$'s are not the same,
some of them belong to the gravitons and some are propagators for the particles and so on).
But now let us see what happens if we were to replace one (or more) of these $K_+$ by $K_x$,
say $K_+(1, 2)$ is $K_x(1, 2)$? Then between 1, 2 we have just free particles, you've broken the
ring; you've got an open diagram, because $K_x$ is a free wave solution, and this means it's
an integral over all real momenta of free particles, on the mass shell and perfectly honest.
Therefore if we replace one of $K_+$ by $K_x$ then that particular line is opened; and the process
is changed to one in which there is a forward scattering of an extra particle; there's a fake
particle that belongs to this propagator that has to be integrated over, but it's a free diagram -
it is now a tree, and therefore perfectly definite and unique to calculate. But I said that I
could open every diagram; the reason is this. First I note that if I put $K_{ret}$ for every $K$ in
a ring, I get zero

$$K_{ret}(1,2)\,K_{ret}(2,3)\,K_{ret}(3,4)\,K_{ret}(4,1) = 0, \qquad (9)$$

for to be non zero $t_1$ must be greater than $t_2$, $t_2 > t_3$, $t_3 > t_4$ and $t_4 > t_1$, which is impossible.
Now make the substitution $K_{ret} = K_+ - K_x$ in (9). You get either all $K_+$ in each factor,
which is the closed loop we want; or at least one $K_x$, which are represented by tree diagrams.
Since the sum is zero, closed loops can be represented as integrals over tree diagrams. I was
surprised I had never noticed this thing before.
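
The two ingredients of this argument, that a ring of purely retarded factors vanishes for any time ordering and that expanding the product leaves the closed loop plus opened (tree) terms, can be illustrated with a toy model. The sketch below keeps only the time-ordering step function of each retarded propagator and ignores all spatial and momentum dependence; the names, and the convention that the factor for the pair (i, j) needs $t_i > t_j$, are assumptions made for illustration.

```python
from itertools import permutations, product


def step(t_later, t_earlier):
    """Time-ordering factor of a retarded propagator: 1 if t_later > t_earlier."""
    return 1 if t_later > t_earlier else 0


def retarded_ring(times):
    """Product of step factors around a closed ring of vertices."""
    n = len(times)
    result = 1
    for i in range(n):
        result *= step(times[i], times[(i + 1) % n])
    return result


def count_opened_terms(n_lines):
    """Terms of prod(K_plus - K_x) over n_lines lines containing at least one K_x."""
    labels = ("K_plus", "K_x")
    return sum(1 for choice in product(labels, repeat=n_lines) if "K_x" in choice)


if __name__ == "__main__":
    vertex_times = [0.1, 0.5, 1.2, 3.0]  # arbitrary distinct times
    ring_values = {retarded_ring(p) for p in permutations(vertex_times)}
    print("ring values over all orderings:", ring_values)
    print("opened terms for a four-line ring:", count_opened_terms(4))
```

For a ring of four lines the expansion has sixteen terms: one is the all-$K_+$ closed loop and the remaining fifteen each contain at least one opened line, which is the sense in which the loop is expressed through trees.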
Well, then I checked whether these diagrams of Fig. 4 when opened into trees agreed
with the theorem. I mean I hoped that the theorem proved for other meson theories would
agree in principle for the gravity case, such that on opening a virtual graviton line the
tree would correspond to forward scattering of free graviton waves. And it does not work
in the gravity case. But, you say, how could it fail, after you just demonstrated that it ought
to work? The reason it fails is the following: This argument has to do with the position of
the poles in the propagators; a typical propagator is a factor $1/(k^2 - m^2 + i\varepsilon)$, the $+i\varepsilon$ due
to the poles, and all I'm doing here is changing the rule about the poles and picking up an
extra delta function $\delta(k^2 - m^2)$ as a consequence, which is the free wave coming in and
out. What I want these free waves to represent in the gravity case are physical gravitons
and not something wrong. They do represent waves of $q^2 = 0$ of course, but, as it turns
out, not with the correct polarization to be free gravitons. I'd like to show it. It has to do
with the numerator, not the denominator. You see the propagator that I wrote before, which
was $\bar{S}_{\mu\nu}$ times $1/(k^2 + i\varepsilon)$ times $S'_{\mu\nu}$, is being replaced by $\bar{S}_{\mu\nu}\,\delta(q^2)\,S'_{\mu\nu}$. Now when I make
$q^2 = 0$ I have a free wave instead of arbitrary momentum. This should be a real graviton
or else there's going to be physical trouble. It isn't; although it is of zero momentum, it
is not transverse. It does not make any difference in understanding the point so forget one
index in $S_{\mu\nu}$ - it's a lot of extra work to carry the other index so just imagine there's one
index: $S_\mu S'_\mu\,\delta(q^2)$. This combination $S_\mu S'_\mu$ is $S_4S'_4 - S_3S'_3 - S_1S'_1 - S_2S'_2$, where 4 is the
time and 3 is the direction, say, of momentum of the four-vector $q$. Then 1 and 2 are
transverse, and those are the only two we want. (Please appreciate I removed one index - I can
make it more elaborate, but it is the same idea.) That is we want only $-S_1S'_1 - S_2S'_2$ instead
of the sum over four. Now what about this extra term $S_4S'_4 - S_3S'_3$? Well, it is $S_4 - S_3$ times
$S'_4 + S'_3$ plus $S_4 + S_3$ times $S'_4 - S'_3$. But $S_4 - S_3$ is proportional to $q_\mu S_\mu$ (suppressing one index)
because $q_4$ in this notation is the frequency and equals $q_3$, if we assume the 3-direction is
the direction of the momentum. So $S_4 - S_3$ is the response of the system to a gradient
potential, which we proved was zero in our invariance discussion. Therefore, we have shown
$(S_4 - S_3)(S'_4 + S'_3) = 0$ and this should be accounted for by purely transverse wave
contributions. But it isn't, and it isn't because the proof that the response to a gradient potential
is zero required that the other particle that was interacting was an honest free graviton.
And four plus three in $S'_4 + S'_3$ is not honest - it's not transverse, it is not a correct kind
of graviton. You see, the only way you can get a polarization $4 + 3$ going in the $4 - 3$ direction is
to have what I call longitudinal response; it's not a transverse wave. Such a wave could only
be generated by an artificial source here of some silly kind; it is not a free wave. When
there's an artificial source for one graviton, even if the other is a pure gradient, the sum
of all the diagrams does not give zero. If the beam is not exactly that of a free wave, perfectly
transverse and everything, the argument that the gradient has to be zero must fail, for the
reason outlined previously.
Although this gradient for $S_4 - S_3$ is what I want and I hoped it was going to be zero,
I forgot that the other end of it - $S'_4 + S'_3$ - is a funny wave which is not a gradient, and
which is not a free wave - and therefore you do not get zero and should not get zero, and
something is fundamentally wrong.
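
The splitting of the unwanted time-and-longitudinal term used in this argument is a one-line algebraic identity. Writing it out makes explicit a factor of 1/2 that the spoken version leaves out (primes denote the second source, and one index is suppressed as in the text):

```latex
S_4 S'_4 - S_3 S'_3
  = \tfrac{1}{2}\left[\,(S_4 - S_3)(S'_4 + S'_3) + (S_4 + S_3)(S'_4 - S'_3)\,\right]
```

Each product on the right pairs a gradient-type factor ($S_4 - S_3$ or $S'_4 - S'_3$) with a non-transverse factor ($S'_4 + S'_3$ or $S_4 + S_3$), which is why the vanishing of the gradient response alone does not remove the term.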
Incidentally I investigated further and discovered another very interesting point.
There is another theory, more well-known to meson physicists, called the Yang-Mills theory,
and I take the one with zero mass; it is a special theory that has never been investigated
in great detail. It is very analogous to gravitation; instead of the coordinate transformation
group being the source of everything, it's the isotopic spin rotation group that's the source
of everything. It is a non-linear theory, that's like the gravitation theory, and so forth. At
the suggestion of Gell-Mann I looked at the theory of Yang-Mills with zero mass, which has
a kind of gauge group and everything the same; and found exactly the same difficulty. And
therefore in meson theory it was not strictly an unknown difficulty, because it should have
been noticed by meson physicists who had been fooling around the Yang-Mills theory. They
had not noticed it because they're practical, and the Yang-Mills theory with zero mass
obviously does not exist, because a zero mass field would be obvious; it would come out
of nuclei right away. So they didn't take the case of zero mass and investigate it carefully.
But this disease which I discovered here is a disease which exists in other theories. So at
least there is one good thing: gravity isn't alone in this difficulty. This observation that
Yang-Mills was also in trouble was of very great advantage to me; it made everything much
easier in trying to straighten out the troubles of the preceding paragraph, for several reasons.
The main reason is if you have two examples of the same disease, then there are many things
you don't worry about. You see, if there is something different in the two theories it is not
caused by that. For example, for gravity, in front of the second derivatives of $g_{\mu\nu}$ in the
Lagrangian there are other $g$'s, the field itself. I kept worrying something was going to happen
from that. In the Yang-Mills theory this is not so, that's not the cause of the trouble, and so
on. That's one advantage - it limits the number of possibilities. And the second great
advantage was that the Yang-Mills theory is enormously easier to compute with than the
gravity theory, and therefore I continued most of my investigations on the Yang-Mills
theory, with the idea, if I ever cure that one, I'll turn around and cure the other. Because
I can demonstrate one thing; line for line it's a translation like music transcribed to a different
score; everything has its analogue precisely, so it is a very good example to work with.
Incidentally, to give you some idea of the difference in order to calculate this diagram Fig. 4b
the Yang-Mills case took me about a day; to calculate the diagram in the case of gravitation
I tried again and again and was never able to do it; and it was finally put on a computing
machine - I don't mean the arithmetic, I mean the algebra of all the terms coming in, just the
algebra; I did the integrals myself later, but the algebra of the thing was done on a machine
by John Matthews, so I couldn't have done it by hand. In fact, I think it's historically
interesting that it's the first problem in algebra that I know of that was done on a machine
that has not been done by hand.
Well, what then, now you have the difficulty; how do you cure it? Well I tried the
following idea: I assumed the tree theorem to be true, and used it in reverse. If every closed
ring diagram can be expressed as trees, and if trees produce no trouble and can be computed,
then all you have to do is to say that the closed loop diagram is the sum of the corresponding
tree diagrams, that it should be. Finally in each tree diagram for which a graviton line has
been opened, take only real transverse graviton to represent that term. This then serves
as the definition of how to calculate closed-loop diagrams; the old rules, involving a propagator
$1/(k^2 + i\varepsilon)$ etc. being superseded. The advantage of this is, first, that it will be gauge invariant,
second, it will be unitary, because unitarity is a relation between a closed diagram and an
open one, and is one of the class of relations I was talking about, so there's no difficulty.
And third, it's completely unique as to what the answer is; there's no arbitrary fiddling
around with different gauges and so forth, in the inside ring as there was before. So that's
the plan.
Now, the plan requires, however, one more point. It's true that we proved here that
every ring diagram can be broken up into a whole lot of trees; but a given tree is not
gauge invariant. For instance the tree diagram of Fig. 2A is not. Each one of the four
diagrams of Fig. 2 is not gauge-invariant, nor is any combination of them except the sum of
all four. So the thing is the following. Suppose I take all the processes, all of them that
belong together in a given order; for example, all the diagrams of fourth order, of which
Fig. 4 illustrates three; I break the whole mess into trees, lots of trees. Then I must gather

Fig. 5

the trees into baskets again, so that each basket contains the total of all of the diagrams of
some specific process (for example the four diagrams of Fig. 2), you see, not just some
particular tree diagram but the complete set for some process. The business of gathering the
tree diagrams together in bunches representing all diagrams for complete processes is
important, for only such a complete set is gauge invariant. The question is: Will any odd tree
diagrams be left out or can they all be gathered into processes? The question is: Can we express
the closed ring diagrams for some process as a sum over various other processes of tree
diagrams for these processes?
Well, in the case with one ring only, I am sure it can be done, I proved it can be done
and I have done it and it's all fine. And therefore the problem with one ring is fundamentally
solved; because we say, you express it in terms of open parts, you find the processes that
they correspond to, compute each process and add them together.
You might be interested in what the rule is for one ring; it's the sum of several pieces:
first it is the sum of all the processes which you get in the lower order, in which you scatter
one extra particle from the system. For instance, in Fig. 4 we have the rings for two particles
scattering. There is no external graviton but there are two internal ones; now we compute
in the same order a new problem in which there are two particles scattering, but while
that's happening another particle, for example a graviton, scatters forward. Some of the
diagrams for this are illustrated in Fig. 5. State f is the same state as g; so another graviton
comes in and is scattered forward. In other words we do the forward scattering of an extra
graviton. In addition, from breaking matter lines we have terms for the forward scattering
of an extra positron, plus the forward scattering of an extra electron, and so on; one adds
the forward scattering of every possible extra particle together. That is the first contribution.
But when you break up the trees, you also sometimes break two lines, and then you get
diagrams like Fig. 6 with two extra particles scattering (here a graviton and electron) so it
turns out you must now subtract all the diagrams with two extra particles of all kinds
scattering. Then add all diagrams with 3 extra particles scattering and so on. It's a nice rule,
it's quite beautiful; it took me quite a while to find; I have other proofs for other cases
that are easy to understand.
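The one-ring rule just described can be written schematically as an alternating sum. This is only a sketch of the bookkeeping: the symbols $R$ and $T_n$ are notation introduced here for illustration, not the speaker's, and all phase-space weights and statistical factors are suppressed.

```latex
% R   : one-ring amplitude for a given process (assumed notation)
% T_n : sum over complete tree-level processes in which n extra real
%       particles (gravitons, electrons, positrons, ...) scatter forward
R \;=\; T_1 \;-\; T_2 \;+\; T_3 \;-\; \cdots
  \;=\; \sum_{n \ge 1} (-1)^{\,n+1}\, T_n
```

Each $T_n$ is meant to be a complete set of tree diagrams for a physical process, which is what makes every term separately gauge invariant.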
Now, the next thing that anybody would ask which is a natural, interesting thing to
ask, is this. Is it possible to go back and to find the rule by which you could have integrated
the closed rings directly? In other words, change the rule for integrating the closed rings,
so that when you integrate them in a more natural fashion, with the new method, it will

Fig. 6

give the same answer as this unique, absolute, definite thing of the trees. It's not necessary
to do this, because, of course, I've defined everything; but it's of great interest to do this,
because maybe I'll understand what I did wrong before. So I investigated that in detail.
It turns out there are two changes that have to be made - it's a little hard to explain in
terms of the gravitation, of which I'll only tell about one. Well, I'll try to explain the other,
but it might cause some confusion. Because I have to explain in general what I'm doing
when I do a ring. Mostly what it corresponds to is this: first you subtract from the Lagrangian
this
I6Hp;q;o dt.
In that way the equation of motion that results is not singular any more. Let me write
what it really is so that there's no trouble. You say to me what is this, there's a g in it and
an H in it? Yes. In doing a ring, there's a field variation over which you're integrating,
which I call H; and there's a g - which is the representative of all the outside disturbances
which can be summarized as being an effective external field g. And so you add to the
complicated Lagrangian that you get in the ordinary way an extra term, which makes
it no longer singular. That's the first thing; I found it out by trial and error before,
when I made it gauge invariant. But then secondly, you must subtract from the answer
the result that you get by imagining that in the ring which involves only a graviton
going around, instead you calculate with a different particle going around, an artificial,
dopey particle coupled to it. It's a vector particle, artificially coupled to the external
field, so designed as to correct the error in this one. The forms are evidently invariant,
as far as your g-space is concerned; these are like tensors in the g world; and therefore
it's clear that my answers are gauge invariant or coordinate transformable, and all that's
necessary. But they are also quantum-mechanically satisfactory in the sense that they are unitary.
Now, the next question is, what happens when there are two or more loops? Since
I only got this completely straightened out a week before I came here, I haven't had time
to investigate the case of 2 or more loops to my own satisfaction. The preliminary
investigations that I have made do not indicate that it's going to be possible so easily to
gather the things into the right barrels. It's surprising, I can't understand it; when you
gather the trees into processes, there seem to be some loose trees, extra trees. I don't
understand them at the moment, and I therefore do not claim that this method of
quantization can be obviously and evidently carried on to the next order. In short,
therefore, we are still not sure of the radiative corrections to the radiative corrections to
the Lamb shift, the uncertainty lies in energies of the order of magnitude of
rydbergs. I can therefore relax from the problem, and say: for all practical purposes
everything is all right. In the meantime, unfortunately, although I could retire from
the field and leave you experts who are used to working in gravitation to worry about
this matter, I can't retire on the claim that the number is so small and that the thing is
now really irrational, if it was not irrational before. Because, unfortunately, I also
discovered in the process that the trouble is present in the Yang-Mills theory; and secondly
I have incidentally discovered a tree-ring connection which is of very great interest and
importance in the meson theories and so on. And so I'm stuck to have to continue this
investigation, and of course you all appreciate that this is the secret reason for doing any
work, no matter how absurd and irrational and academic it looks; we all realize that no
matter how small a thing is, if it has physical interest and is thought about carefully enough,
you're bound to think of something that's good for something else.

DISCUSSION
Møller: May I, as a non-expert, ask you a very simple and perhaps foolish question.
Is this theory really Einstein's theory of gravitation in the sense that if you would have
here many gravitons the equations would go over into the usual field equations of Einstein?
Feynman: Absolutely.
Møller: You are quite sure about it?
Feynman: Yes, in fact when I work out the fields and I don't say in what order I'm
working, I have to do it in an abstract manner which includes any number of gravitons;
and then the formulas are definitely related to the general theory's formulas; and the
invariance is the same; things like this that you see labelled as loops are very typical
quantum-mechanical things; but even here you see a tendency to write things with the right
derivatives, gauge invariant and everything. No, there's no question that the thing is the
Einsteinian theory. The classical limit of this theory that I'm working on now is a non-linear
theory exactly the same as the Einsteinian equations. One thing is to prove it by equations;
the other is to check it by calculations. I have mathematically proven to myself so many
things that aren't true. I'm lousy at proving things - I always make a mistake. I don't
notice when I'm doing a path integral over an infinite number of variables that the
Lagrangian does not depend upon one of them, the integral is infinite and I've got a ratio of two
infinities and I could get a different answer. And I don't notice in the morass of things that
something, a little limit or sign, goes wrong. So I always have to check with calculations;
and I'm very poor at calculations - I always get the wrong answer. So it's a lot of work
in these things. But I've done two things. I checked it by the mathematics, that the forms
of the mathematical equations are the same; and then I checked it by doing a
considerable number of problems in quantum mechanics, such as the rate of radiation
from a double star held together by quantum-mechanical force, in several orders and
so on, and it gives the same answer in the limit as the corresponding classical problem.
Or the gravitational radiation when two stars - excuse me, two particles - go by each
other, to any order you want (not for stars, then they have to be particles of specified
properties; because obviously the rate of radiation of the gravity depends on the give of the
stars - tides are produced). If you do a real problem with real physical things in it then I'm
sure we have the right method that belongs to the gravity theory. There's no question
about that. It can't take care of the cosmological problem, in which you have matter out
to infinity, or that the space is curved at infinity. It could be done I'm sure, but I haven't
investigated it. I used as a background a flat one way out at infinity.
Møller: But you say you are not sure it is renormalizable.
Feynman: I'm not sure, no.
Møller: In the limit of large number of gravitons this would not matter?
Feynman: Well, no; you see, there is still a classical electrodynamics; and it's not
got to do with the renormalizability of quantum electrodynamics. The infinities come in
different places. It's not a related problem.
Rosen: I'm not sure of this, not being one of the experts; but I have the impression
that because of the non-linearity of the Einstein equations there exists a difficulty of the
following kind. If the linear equations have a solution in the form of an infinite plane
monochromatic wave, there does not seem to correspond to that a more exact solution; because
you get piling up of energies in space and the solution then diverges at infinity. Could that
have any bearing on the accuracy of this kind of calculation?
Feynman: No, I take that into account by a series of corrections. A single graviton
is not the same thing as an infinite gravitational wave, because there's a limited energy
in it. There's only one $\hbar\omega$.
Rosen: But you're using a momentum expansion which involves infinite waves.
Feynman: Yes, there are corrections. You see what happens if one calculates the
corrections. If you have here a graviton coming in this way, then there are corrections for such
a ring as this and so on. And these produce first, a divergence as usual; but second, a
term in the logarithm of $q^2$; which means that if this thing is absolutely a free plane
wave, there's no meaning to the correction. So it must be understood in this way, that
the thing was emitted some time far in the past, and is going to be absorbed some time
in the future; and has not absolutely been going on forever. Then there's a very small
coefficient in front of the logarithm and then for any reasonable $q^2$, like the diameter of
the universe or something, I can still get a sensible answer; this is the shadow of the
phenomenon you're talking about, that the corrections to the propagation of a graviton are
dependent on the logarithm of the momentum squared carried by the graviton and
would be infinite if it were really a zero momentum graviton exactly. And so a free
graviton just like that does not quite exist. And this is the correction for that. Strictly we
would have to work with wave packets, but they can be of very large extent compared to
the wave length of the gravitons.
Anderson: I'd like to ask if you get the same difficulty in the electromagnetic case
that you did in the Yang-Mills and gravitational cases?
Feynman: No, sir, you do not. Gauge invariance of diagrams such as Fig. 2 (there
is no 2D) is satisfied whether b is a free wave or not. That is because photons are not the
source of photons; they are uncharged.
Anderson: The other thing I would like to suggest is that in putting things into
baskets, you might be able to get by easily by always only starting out with vacuum
diagrams and opening those successively.
Feynman: I tried that and it didn't go successfully.
Ivanenko: If I understood you correctly, you had used in the initial presentation the
transmutation of two particles into gravitons. Yes?
Feynman: It was one of the examples.
Ivanenko: Yes. This process was considered, perhaps in a preliminary manner, by
ourselves and by Prof. Weber and Brill. I ask you two questions. Do you possess the effective
cross-section? Can you indicate the effects for which high-energy processes play an
important role?
Feynman: I never went to energies more than one billion-billion BeV. And then the
cross-sections of any of these processes are infinitesimal.
Ivanenko: They increase very, very sharply with energy. Yes, because the radiation
is quadrupole, so it increases sharply in contrast to the electromagnetic transmutation of
an electron-positron pair.
Feynman: It increases very sharply indeed. On the other hand, it starts out so low that
one has to go pretty far to get anywhere. And the distance that you have to go is involved
in this thing - the thing that's the analogue of $e^2/\hbar c$ in electricity, which is 1/137, is
non-existent in gravitation; it depends on the problem; this is so because of the dimensions of
$G$. So if $E$ is the energy of some process, then if you take $GE^2/\hbar c$ you get an equivalent to
this $e^2/\hbar c$. It may be less than that, but at least it can't be any bigger than this. So in order
to make this thing to be of the order of 1%, in which case the rate is similar to the rate of
photon annihilation, at ordinary energies, we need the $GE^2$ to be of the order of $\hbar c$, and
as has been pointed out many times, that's an energy of the order grams, which is
10 BeV. You can figure out the answer right away; just take the energy that you are
interested in, square, multiply by $G$ and divide by $\hbar c$; if that becomes something, then you're
getting somewhere. You still might not get somewhere, because the cross-section might
not go up that fast, but at least it can't get up any worse than that. So I think that in order
to get an appreciable effect, you've got to go to ridiculous energies. So you either have
a ridiculously small effect or a ridiculous energy.
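The recipe "square the energy, multiply by G and divide by hc" can be tried numerically. The sketch below is an illustration only: it assumes SI units, in which the dimensionless combination is $GE^2/(\hbar c^5)$ (the energy is converted to a mass with $E/c^2$), it takes 1 BeV = 1 GeV, and the function names are hypothetical.

```python
import math

# Rounded SI constants (assumed values for this illustration).
G = 6.67430e-11                  # gravitational constant, m^3 kg^-1 s^-2
HBAR = 1.054571817e-34           # reduced Planck constant, J s
C = 2.99792458e8                 # speed of light, m / s
JOULE_PER_GEV = 1.602176634e-10  # 1 GeV (= 1 BeV) in joules


def gravitational_coupling(energy_gev):
    """Dimensionless G E^2 / (hbar c^5), the analogue of e^2 / (hbar c)."""
    energy_j = energy_gev * JOULE_PER_GEV
    return G * energy_j**2 / (HBAR * C**5)


def energy_for_coupling(alpha):
    """Energy in GeV at which G E^2 / (hbar c^5) equals alpha."""
    return math.sqrt(alpha * HBAR * C**5 / G) / JOULE_PER_GEV


if __name__ == "__main__":
    for energy in (1.0, 1.0e3, 1.0e18):
        print(f"E = {energy:.1e} GeV -> coupling = {gravitational_coupling(energy):.3e}")
    print(f"coupling = 1     at E = {energy_for_coupling(1.0):.3e} GeV")
    print(f"coupling = 1/137 at E = {energy_for_coupling(1.0 / 137.0):.3e} GeV")
```

At accelerator energies the combination is many tens of orders of magnitude below 1/137, which is the point of the remark about ridiculous energies.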
Weber: I have a cross-section which may be a partial answer to Ivanenko's question.
Could I write it on the board? We have carried out a canonical quantization, which is not
as fancy as the one you have just heard about; but considering the interaction of photons
and gravitons; and it turns out that even in the linear approximation one has the
possibility of graviton production by scattering of photons in a Coulomb field. And the
scattering cross-section for this case turns out to be $8\pi^2$ times the constant of gravitation
times the energy of the scatterer times the thickness of the scatterer in the direction of
propagation of the photon through it divided by $c^4$. This assumes that all of the dimensions
of the scatterer are large in comparison with the wave length of the photon. We obtained
this result by quantization, and noticed that it didn't have Planck's constant in it, so we
turned around and calculated it classically. Now, if one puts numbers in this, one finds
that the scattering cross-section of a galaxy due to a uniform magnetic field through it is
loa cm2, a much larger number than the object that you talked about. This represents
a conversion of photons into gravitons of about 1 part in This is of course too small
to measure. Also, we considered the possibility of using this cross-section for a laboratory
experiment in which one had a scatterer consisting, say, of a million gauss magnetic field
over something like a cubic meter. This turned out to be entirely impossible, a result in
total contradiction to what has appeared in the Russian literature. In fact, the theory of
fluctuations shows that for a laboratory experiment involving the production of gravitons
by scattering of photons in a Coulomb field, the scattered power has to be greater than twice
the square root of $kT$ times the photon power divided by the averaging time of the
experiment. I believe that the incorrect results that have appeared in the literature have been
due to the statement that $\Delta P$ has to be greater than $kT$ over $t$; dimensionally these things
are the same, but order of magnitude-wise this kind of experiment for the scatterer of which
I spoke requires something like lo5 watts. Maybe I can say something about this in the
afternoon; I don't want to take any more time.
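The two formulas stated in words above can be put into a short numerical sketch. Several things here are assumptions rather than part of the remarks: SI units throughout, the "energy of the scatterer" read as the magnetic field energy $B^2/2\mu_0$ times the volume, a million gauss taken as 100 T in a one-metre cube, and all function names, which are hypothetical.

```python
import math

# Rounded SI constants (assumed values for this illustration).
G = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2
C = 2.99792458e8         # speed of light, m / s
MU0 = 4.0e-7 * math.pi   # vacuum permeability, T m / A
K_B = 1.380649e-23       # Boltzmann constant, J / K


def magnetic_field_energy(b_tesla, volume_m3):
    """Energy stored in a uniform magnetic field: B^2 / (2 mu0) * V."""
    return b_tesla**2 / (2.0 * MU0) * volume_m3


def photon_to_graviton_cross_section(scatterer_energy_j, thickness_m):
    """Cross-section as stated in words: 8 pi^2 G U l / c^4, in m^2."""
    return 8.0 * math.pi**2 * G * scatterer_energy_j * thickness_m / C**4


def minimum_scattered_power(temperature_k, photon_power_w, averaging_time_s):
    """Fluctuation bound as stated in words: 2 * sqrt(kT * P / t), in watts."""
    return 2.0 * math.sqrt(K_B * temperature_k * photon_power_w / averaging_time_s)


if __name__ == "__main__":
    energy = magnetic_field_energy(b_tesla=100.0, volume_m3=1.0)
    sigma = photon_to_graviton_cross_section(energy, thickness_m=1.0)
    print(f"field energy  = {energy:.3e} J")
    print(f"cross-section = {sigma:.3e} m^2")
    print(f"noise bound   = {minimum_scattered_power(300.0, 1.0, 1.0):.3e} W")
```

The absence of Planck's constant in the cross-section is visible directly in the function: only $G$, $c$, the stored energy and the thickness enter.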
DeWitt: I should like to ask Prof. Feynman the following questions. First, to give us
a careful statement of the tree theorem; and then outline, if he can to a brief extent, the
nature of the proof of the theorem for the one-loop case, which I understand does work.
And then, to also show in a little bit more detail the structure and nature of the fictitious
particle needed if you want to renormalize everything directly with the loops. And if you
like, do it for the Yang-Mills, if things are prettier that way.
Feynman: I usually don't find that to go into the mathematical details of proofs
in a large company is a very effective way to do anything; so, although that's the question
that you asked me - I'd be glad to do it - I could instead of that give a more physical
explanation of why there is such a theorem; how I thought of the theorem in the first place,
and things of this nature; although I do have a proof - I'm not trying to cover up.
DeWitt: May we have a statement of the theorem first?
Feynman: That I do not have. I only have it for one loop, and for one loop the careful
statement of the theorem is ... - look, let me do it my way. First - let me tell you how I
thought of this crazy thing. I was invited to Brussels to give a talk on electrodynamics -
the 50th anniversary of the 1911 Solvay Conference on radiation. And I said I'd make
believe I'm coming back, and I'm telling an imaginary audience of Einstein, Lorentz and
so on what the answer was. In other words, there are going to be intelligent guys, and I'll
tell them the answer. So I tried to explain quantum electrodynamics in a very elementary
way, and started out to explain the self-energy, like the hydrogen Lamb shift. How can
you explain the hydrogen Lamb shift easily? It turns out you can't at all - they didn't
even know there was an atomic nucleus. But, never mind. I thought of the following. I would
explain to Lorentz that his idea that he mentioned in the conference, that classically the
electromagnetic field could be represented by a lot of oscillators, was correct. And that
Planck's idea that the oscillators are quantized was correct, and that Lorentz's suggestion,
which is also in that thing, that Planck should quantize the oscillators that the field is
equivalent to, was right. And it was really amusing to discover that all that was in 1911. And
that the paper in which Planck concludes that the energy of each oscillator was not $n\hbar\omega$
but $(n+1/2)\hbar\omega$, which was also in that, was also right; and that this produced a difficulty,
because each of the harmonic oscillators of Lorentz in each of the modes had an energy
of $\hbar\omega/2$, which is an infinite amount of energy, because there are an infinite number of modes.
And that that's a serious problem in quantum electrodynamics and the first one we have
to remove. And the method we use to remove it is to simply redefine the energy so that
we start from a different zero, because, of course, absolute energy doesn't mean anything.
(In this gravitational context, absolute energy does mean something, but it's one of the
technical points I can't discuss, which did require a certain skill to get rid of, in making
a gravity theory; but never mind.) Now look - I make a little hole in the box and I let in
a little bit of hydrogen gas from a reservoir; such a small amount of hydrogen gas, that
the density is low enough that the index of refraction in space differs from one by an amount
proportional to $N$, the number of atoms. With the index being somewhat changed, the
frequency of all the normal modes is altered. Each normal mode has the same wavelength
as before, because it must fit into the box; but the frequencies are all altered. And
therefore the $\hbar\omega$'s should all be shifted a trifle, because of the shift of index, and therefore there's
a slight shift of the energy. Although we subtract $\hbar\omega/2$ for the vacuum, there's a correction
when we put the gas in; and this correction is proportional to the number of atoms, and
can be associated with an energy for each atom. If you say, yes, but you had that energy
already when you had the gas in back in the reservoir, I say, but let us only compare the
difference in energy between the 2S and 2P state. When we change the excitation of the
hydrogen gas from 2S to 2P then it changes its index without removing anything; and the
energy difference that is needed to change the energy from 2S to the 2P for all these atoms
is not only the energy that you calculate with disregard of the zero point energy; but the
fact is that the zero point energy is changed very slightly. And this very slight difference
should be the Lamb effect. So I thought, it's a nice argument; the only question is, is it
true. In the first place it's interesting, because as you well know the index differs from one
by an amount which is proportional to the forward scattering for $\gamma$ rays of momentum $k$
and therefore that shift in energy is essentially the sum over all momentum states of the
forward scattering for $\gamma$ rays of momentum $k$. So I looked at the forward scattering and
compared it with the right formula for the Lamb shift, and it was not true, of course; it's
too simple an argument. But then I said, wait, I forgot something. Dirac explained to us
that there are negative energy states for the electron but that the whole sea of negative
energy states is filled. And, of course, if I put the hydrogen atoms in here all those electrons
in negative energy states are also scattering off the hydrogen atoms; and therefore their
states are all shifted; and therefore the energy levels of all those are shifted a tiny bit. And
therefore there's a shift in the energy due to those. And so there must be an additional term
which is the forward scattering of positrons, which is the same as scattering of negative
energy electrons. Actually, for the symmetry of things it is better to take half the case where you
make the positrons the holes and the other half where you make the electrons the holes;
so it should be 1/2 forward scattering by electrons, 1/2 scattering by positrons and scattering
by $\gamma$ rays - the sum of all those forward scattering amplitudes ought to equal the
self-energy of the hydrogen atom. And that's right. And it's simple, and it's very peculiar.
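The chain of reasoning in this argument can be set down in symbols. What follows is a hedged sketch rather than the speaker's own formulas: $\mathcal{N}$ (number density of atoms) and the forward amplitudes $f_\gamma$, $f_{e^-}$, $f_{e^+}$ are notation assumed here, the first relation is the standard dilute-medium link between refractive index and forward scattering, and overall constants in the last line are suppressed.

```latex
% Index of refraction of a dilute gas in terms of forward scattering:
n(k) - 1 \;\simeq\; \frac{2\pi \mathcal{N}}{k^{2}}\, f_\gamma(k)

% Fixed wavelength in the box, so \omega_k = c k / n(k) and
\delta\omega_k \;\simeq\; -\,c\,k\,\bigl[\,n(k) - 1\,\bigr],
\qquad
\delta E_0 \;=\; \sum_k \tfrac{1}{2}\,\hbar\,\delta\omega_k

% Schematic form of the completed statement, per atom:
\Delta E_{\text{self}} \;\propto\; \sum_k \Bigl[\, f_\gamma(k)
   \;+\; \tfrac{1}{2}\, f_{e^-}(k) \;+\; \tfrac{1}{2}\, f_{e^+}(k) \Bigr]
```

The photon term alone is the "too simple" version of the argument; the two half-weighted matter terms are the correction that comes from the filled negative-energy sea.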
The reason it's peculiar is that these forward scatterings are real processes. At last I had
discovered a formula I had always wanted, which is a formula for energy differences (which
are defined in terms of virtual fields) in terms of actual measurable quantities, no matter
how difficult the experiment may be - I mean I have to be able to scatter these things. Many
times in studying the energy difference due to electricity (I suppose) between the proton
and the neutron, I had hoped for a theorem which would go something like this - this energy
difference between proton and neutron must be equal to the following sum of a bunch of
cross-sections for a number of processes, but all real physical processes, I don't care how
hard they are to measure. So this is the beginning of such a formula. It's rather surprising.
It's not the same as the usual formula - it's equal to it but it's not the same. I have no
formulation of the laws of quantum gravidynamics; I have a proposal on how to make the
calculations. When I make the proposal on how to do the closed loops, the obvious proposal
does not work; it gives non-unitarity and stuff like that. So the obvious proposal is no good;
it works O.K. for trees; so how am I going to define the answer for what would correspond to
a ring? The one I happen to have chosen is the following: I take the ring in general; for any
meson theory, one closed ring can be written as equivalent to a whole lot of processes each
one of which is trees. I then define, as my belief as to what the ring ought to be in the grand
theory, that it's going to be also equal to the corresponding physical set of trees. When I said
this is equal to this, I didn't worry about gauge or anything else; what I meant was, if these
weren't gravitons but photons or any other neutral object - it doesn't make any difference
what they are - this theorem is right. So I suppose it's right also for real gravitons, and
I suppose also that what's being scattered is only transverse and is only a real free graviton
with $q^2 = 0$. Therefore, I say let this ring equal this set of trees. Every one of these terms
can be completely computed - it's a tree. And it's gauge invariant; that is, if I added an
extra potential on the whole thing, another outside disturbance of a type which is nothing
but a coordinate transformation - in short a pure gradient wave - to the whole diagram
then it comes on to all of these processes; but it makes no effect on any of them, and therefore
makes no effect on the sum; and therefore I know my definition of this ring is gauge-invariant.
Second, unitarity is a property of the breaking of this diagram; the imaginary part of this
equals something; if you take the imaginary part of this side, it's already broken up, in fact,
and you can prove immediately that it's the correct unitarity rule. Therefore it's going to
be unitary and so on and so on. And so I therefore define gravity with one ring in this
way. Now what prevents me from doing it with two rings? The lack of a complete statement
of what two rings is equal to in terms of processes; that is I can open the ring all right; but
I can't put the pieces - the broken diagrams - back together again into complete sets
such that each one is a complete physical process. In other words some of them correspond to
the scattering of a graviton, but leaving out some diagrams. But the scattering of a graviton
leaving out diagrams is no longer gauge invariant, I mean, not evidently gauge invariant,
and so the power of the whole thing collapses. I don't know what to do with it. So that's
the situation; that's why it is crucial to the particular plan. There's always, of course, another
way out. And that's the following (and that's what I tried to describe at the end of the talk -
maybe I talked too fast): After all now I've defined what this result is equal to - by definition,
not that you should do a loop some way and get this, but that a loop is equal to this by
definition, and I'm not going to do a loop any other way. But, of course, from a practical
point of view or from the point of view purely of interest, the question is, can you come
back now and calculate the ring directly by some particular mathematical shenanigans,
and get the same answer as you get by adding the trees. And I found the way to do that.
I have another way, in other words, to do the ring integral directly. I have to subtract
something from a vector particle going around the ring instead of a graviton to get the answer
right. So I know the rule, and I know why the rule is, and I have a proof of the rule for
one loop. I have two ways of extending. I can either break this two loop diagram open and
get it back into the processes, like I did with the one ring - where so far I'm stuck. Or,
I can take the rule which I found here and try to guess the generalization for any number
of rings. Also stuck. But I've only had a week, gentlemen; I've only been able to straighten
out the difficulty of a single ring a week ago when I got everything cleaned up. It's more
than a week - I had to take a lot of time checking and checking; but I was only finished
checking to make sure of everything for this conference. And of course you're always
asking me about the thing I haven't had time to make sure about yet, and I'm sorry; I worked
hard to be sure of something, and now you ask me about those things I haven't had time for.
I hoped that I would be able to get it. I still have a few irons to try; I'm not completely stuck -
maybe.
DeWitt: Because of the interest of the tricky extra particle that you mentioned at the
end, and its possible connection, perhaps, with some work of Dr Bialynicki-Birula,
have you got far enough on that so that you could repeat it with just a little more detail?
The structure of it and what sort of an equation it satisfies, and what is its propagator?
These are technical points, but they have an interest.
Feynman: Give me ten minutes. And let me show how the analysis of these tree
diagrams, loop diagrams and all this other stuff is done in a mathematical way. Now I will show
you that I too can write equations that nobody can understand. Before I do that I should
like to say that there are a few properties that this result has that are interesting. First of
all in the Yang-Mills case there also exists a theory which violates the original idea of symmetry
of the isotopic spin (from which it was originally invented) by the simple assumption that
the particle has a mass. That means to add to the Lagrangian a term $-\mu^2 a_\mu a_\mu$ where $a_\mu$
is an isotopic vector. You add this to the Lagrangian. This destroys the gauge invariance
of the theory - it's just like electrodynamics with a mass, it's no longer gauge-invariant,
it's just a dirty theory. Knowing that there is no such field with zero mass people say: "let's
put the mass term on". Now when you put a mass term on it is no longer gauge invariant.
But then it is also no longer singular. The Lagrangian is no longer singular for the same
reason that it is not invariant. And therefore everything can be solved precisely. The
propagator instead of being $\delta_{\mu\nu}$ between two currents is

$$\frac{\delta_{\mu\nu} - q_\mu q_\nu/\mu^2}{q^2 - \mu^2} \qquad (10)$$

where $q_\mu$ is the momentum of the propagating particle. The factor $1/(q^2 - \mu^2)$ is typical for mass $\mu$
but the part $-q_\mu q_\nu/\mu^2$ is an important term which can be taken to be zero in electrodynamics
but it is not obvious whether it can be taken to be zero in the case of Yang-Mills theory.
In fact it has been proved it cannot be taken to be zero; this propagator is used between two
currents. I am using the Yang-Mills example instead of the gravity example. I really want
only the case $\mu^2 = 0$, and am asking whether I can get there by first calculating finite $\mu^2$,
then taking the limit $\mu^2 = 0$.
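The role of the $q_\mu q_\nu/\mu^2$ term can be illustrated numerically. The following is a minimal sketch, not part of the original discussion: it assumes a Minkowski metric of signature (+,-,-,-), and the helper names (`dot`, `make_conserved`, `between_currents`) are hypothetical. It shows that between two conserved currents ($q \cdot j = 0$) the extra term drops out, so the contraction stays finite as $\mu^2$ goes to zero, whereas for a current that is not conserved it grows like $1/\mu^2$.

```python
import numpy as np

# Assumed metric signature (+, -, -, -); this choice is for illustration only.
ETA = np.diag([1.0, -1.0, -1.0, -1.0])

def dot(a, b):
    """Minkowski inner product a.b."""
    return float(a @ ETA @ b)

def make_conserved(j, q):
    """Subtract the part of j along q so that q.j = 0 (needs q.q != 0)."""
    return j - q * dot(q, j) / dot(q, q)

def between_currents(j1, j2, q, mu2):
    """j1 . (delta - q q / mu^2) . j2 / (q^2 - mu^2): the massive propagator between two currents."""
    numerator = dot(j1, j2) - dot(j1, q) * dot(q, j2) / mu2
    return numerator / (dot(q, q) - mu2)

q = np.array([2.0, 0.3, -0.5, 1.0])
j1 = np.array([1.0, 0.2, 0.0, -0.4])
j2 = np.array([0.5, -1.0, 0.7, 0.1])
c1, c2 = make_conserved(j1, q), make_conserved(j2, q)

for mu2 in (1.0, 1e-3, 1e-6):
    with_term = between_currents(c1, c2, q, mu2)        # conserved currents, full form
    without_term = dot(c1, c2) / (dot(q, q) - mu2)      # same, q q / mu^2 term dropped
    not_conserved = between_currents(j1, j2, q, mu2)    # grows as mu2 shrinks
    print(mu2, with_term, without_term, not_conserved)
```

This only mirrors the electrodynamics-like situation in which the term may be dropped; it says nothing about closed rings in the Yang-Mills case, where the speaker states that it may not.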
Now, with $\mu^2 \neq 0$ this is a definite propagator and there are no ambiguities at the
closed rings, the closed loops. I have no freedom, I must compute this propagator. I mean
there is no reason for trouble, and there is no trouble. There is no gauge invariance either.
And of course I checked. I broke the rings and I computed by the broken ring theorem
method a closed loop problem of fair complexity (which in fact was the interaction of two
electrons). I computed it by the open ring method and by the closed ring method, and
of course it agreed, there is no reason that it shouldn't. It turned out that for tree diagrams
you don't have to worry about this $q_\mu q_\nu/\mu^2$ term, you can drop it - but not for the closed
ring - only for tree. Therefore the tree diagrams have the definite limit as $\mu^2$ goes to zero.
And yet I have the closed ring diagram which is equal to the tree diagram when the mass
is anything but zero, and therefore it ought to be true that the limit as $\mu^2$ goes to zero of the
ring is equal to the case when $\mu = 0$. It sounds like a great idea; why don't you define the
desired $\mu^2 = 0$ theory that way? Answer: You can't put $\mu^2$ equal zero in the form (10).
You can't do it because of the $q_\mu q_\nu/\mu^2$. So it was necessary next to see if there is a way to
re-express the ring diagrams, for the case with $\mu^2 \neq 0$, in a new form with a propagator
different from (10), that didn't have a $\mu^2$ in it, in such a form that you can take the limit
as $\mu^2$ goes to zero. Then that would be a new way to do the $\mu$ equal zero case; and that's
the way I found the formula. I'll try to explain how to find that theory.
We start with a definite theory, the Yang-Mills theory with a mass (the reason I do that
is that there's no ambiguity about what I am trying to do) and later on I take the mass to zero.
Then the theory works something like this. You have the Lagrangian $\mathcal{L}(A, \varphi)$ which involves
the vector potential $A$ of this field and the fields $\varphi$ representing the matter with which this
object is interacting for zero mass, to which, for finite mass, we add the term $\mu^2 A_\mu A_\mu$. This
is the Lagrangian that has to be integrated and the idea is that you integrate this over all
fields $A$ and $\varphi$; and that is the answer for the amplitude of the problem
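The display that followed here is missing from this copy. As a hedged sketch only, assuming the standard functional-integral form and writing $X$ for the amplitude as the speaker does later, the expression cited in the text as (11) would presumably read:

```latex
X = \int \exp\Big\{ i \int \big[ \mathcal{L}(A,\varphi) + \mu^2 A_\mu A_\mu \big]\, d\tau \Big\}\,
    \mathcal{D}A\, \mathcal{D}\varphi
\tag{11}
```

The factor of $i$, the integration variable $d\tau$ and the normalization are assumptions; only the structure (the exponential of the action built from $\mathcal{L} + \mu^2 A_\mu A_\mu$, integrated over all $A$ and $\varphi$) is taken from the surrounding text.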

But wait, what about the initial and final conditions? You have certain particles coming
in and going out. To simplify things (this is not essential) I'll just study the case that
corresponds only to gravitons in and out. I'll call them gravitons and mesons even though they
are vector particles. The question is first, what is the right answer if you have gravitons
represented by plane waves, $A_1, A_2, A_3, \ldots$ going in (positive frequency in $A_i$) or out
(negative frequency). You make the following field up. Let $A_{\rm asym}$ be defined as $\alpha$ times the wave
function $A_1$ that represents the first graviton coming in as a plane wave, plus $\beta$ times $A_2$ plus $\gamma$
times $A_3$ and so on.

$$A_{\rm asym} = \alpha A_1 + \beta A_2 + \gamma A_3 + \ldots, \qquad A \to A_{\rm asym}.$$

Then you calculate this integral (11) subject to the condition that $A$ approaches $A_{\rm asym}$
at infinity. The result of this is of course a function of $\alpha, \beta, \gamma, \ldots$ and so on. Then what you
want for $X$ is just the term first order in $\alpha, \beta, \gamma, \ldots$ That means just one of each of these
gravitons coming in and out. That's the right formula for a regular theory, for meson theory.
You calculate the integral subject to the asymptotic condition, when you imagine all these
waves, but you take the first order perturbation with respect to each one of the incoming
waves. You never let the same photon operate twice; a photon operating twice is not a photon,
it is a classical wave. So you take the derivative of this with respect to $\alpha, \beta, \gamma$ and so on,
then setting them all equal to zero. That's the problem. (In general there's $\varphi$ asymptotic
too.)
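The prescription just described can be written compactly. The following is a sketch under stated assumptions: $F$ is a hypothetical name for the integral (11) evaluated with the boundary condition that $A$ approaches the asymptotic field, and $S$ stands for the action in its exponent.

```latex
F(\alpha,\beta,\gamma,\ldots)
  = \int_{A \to A_{\rm asym}} e^{\,iS[A,\varphi]}\, \mathcal{D}A\, \mathcal{D}\varphi ,
\qquad
X = \left. \frac{\partial^{\,n} F}{\partial\alpha\,\partial\beta\,\partial\gamma\cdots}
    \right|_{\alpha=\beta=\gamma=\cdots=0}
```

Taking one derivative per external wave and then setting the coefficients to zero keeps exactly the terms in which each incoming or outgoing wave acts once.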
Now the way I happened to do this is the following: Let us call $A_0$ the $A$ which satisfies
the classical equations of motion, which in this particular case will be

I solve this subject to the condition that $A_0$ equals $A_{\rm asym}$. In other words, I find what is the
maximum or minimum - whatever it is - of the action in (11), subject to the asymptotic
condition. That's the beginning of analysing this.
The next thing is to make the simple substitution $A = A_0 + B$ and put it back in
equation (11). Then if you take $\mathcal{L}$ of $A_0 + B$ (if $B$ is negligible you get $\mathcal{L}$ of $A_0$ and so forth)
so you get something like this

The integral is over all $B$, and $B$ must go to zero asymptotically. This business can be expanded
in powers of $B$.

$$\mathcal{L}(A_0 + B) - \mathcal{L}(A_0) + \mu^2 BB + 2\mu^2 A_0 B = \mathrm{Quad}(B) + \mathrm{Cubic}(B) + \ldots + \mu^2 BB. \qquad (15)$$


The zeroth power of $B$ is evidently zero. The first power of $B$ is also zero because $A_0$ minimized
the original thing. So this starts out quadratic in $B$ plus cubic in $B$ plus etc., that's what
this is here. These quadratic forms Quad ($B$) and so on of course depend on $A_0$, the cubic
form involves $A_0$ in some complex, maybe very complicated, locked-up mess, but as far
as $B$ is concerned it is second power and higher powers.
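The statement that the expansion about a stationary point has no first-order term can be checked on a toy example. The sketch below is an illustration only, assuming a one-variable polynomial "action" $S(x) = x^4/4 - x^2/2$ in place of the field Lagrangian; the function name `expand_about` is hypothetical.

```python
from math import comb

def expand_about(coeffs, x0):
    """Given p(x) = sum(coeffs[k] * x**k), return the coefficients of p(x0 + b) in powers of b."""
    shifted = [0.0] * len(coeffs)
    for k, c in enumerate(coeffs):
        for j in range(k + 1):
            # binomial expansion of c * (x0 + b)**k, collecting the b**j term
            shifted[j] += c * comb(k, j) * x0 ** (k - j)
    return shifted

# Toy action S(x) = x**4 / 4 - x**2 / 2, stationary at x0 = 1 because S'(x) = x**3 - x.
toy_action = [0.0, 0.0, -0.5, 0.0, 0.25]
expansion = expand_about(toy_action, 1.0)
print(expansion)  # [-0.25, 0.0, 1.0, 1.0, 0.25]
```

The linear coefficient vanishes because the expansion point is stationary, and what remains beyond the constant starts with a quadratic term followed by cubic and higher terms, the same pattern as in (15).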
Now I would like to point something out. First - it turns out if you analyze it, that the
contribution of the first factor here alone (if you had forgotten the integral and called it one)
is exactly the contribution of all trees to the problem. So that's like the classical theories
related to trees. Next, if you drop the term cubic in $B$ in the exponent completely and just
integrate the result over $\mathcal{D}B$, that corresponds to the contribution from one ring, or from
two isolated rings, or three isolated rings, but not interlocked rings. If you start to include
the cubic term it has to come in a second power to do anything, because of the evenness
and oddness of function. And as soon as it comes in second power, the cubic term, having
three of these things come together twice, makes a terrible thing like ○○ which is a double
ring. So you don't get to a double ring until you bring a cubic term down to the second
order. So if I disregard that and just work with this second order term Quad ($B$) $+ \mu^2 BB$,
I'm studying the contribution from one ring. If I study this I am working from the trees.
And now you see I have in my hands an expression for the contribution of a ring correct
in all orders no matter how many lines come in. I also have expressions for the contributions
from trees and so on. I can compare them in different mathematical circumstances, and
it's on this basis that I have been able to prove everything I have been able to prove relating
one ring to trees.
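The parity argument, that a cubic term contributes only from its second power onward, can be seen in a one-variable analogue. This is a minimal sketch under stated assumptions: a single real variable $b$ with a unit-variance Gaussian weight stands in for the field $B$, the coupling $g$ and the function names are hypothetical, and the series is purely formal.

```python
from fractions import Fraction
from math import factorial

def gaussian_moment(n):
    """<b**n> for a unit-variance Gaussian weight: 0 for odd n, (n-1)!! for even n."""
    if n % 2 == 1:
        return 0
    result = 1
    for k in range(n - 1, 0, -2):
        result *= k
    return result

def cubic_coefficient(order):
    """Coefficient of g**order in the formal expansion of <exp(g * b**3)>."""
    return Fraction(gaussian_moment(3 * order), factorial(order))

for order in range(5):
    print(order, cubic_coefficient(order))
```

The odd orders vanish identically, so the first nontrivial effect of the cubic term appears at second order, which is the counting behind the remark that a double ring requires the cubic term twice.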
Now, let me explain how the theorem was obtained that takes the case for the mass
and for a ring. Now we have to discuss a ring, which is a formula like this

The quadratic form involves $A_0$, so the answer depends on $A_0$ - it's some complicated
functional of $A_0$. Anyway I won't say that all the time, I'll just remember that. We have
to integrate over all $B$. And the difficulty is - not difficulty, but the point is - that this
quadratic form in $B$ is singular, because it came from the piece of the action that has an
invariance and this invariance keeps chasing us along. And there are certain transformations
of $B$ which leave this Quad ($B$) part unchanged in first order. That transformation in the
Yang-Mills theory is

where the vectors are in isotopic spin space and $\alpha$ is considered as first order. This
transformation leaves the quadratic form invariant so the Quad ($B$) thing by itself is singular.
But it doesn't make any difference, because of the addition of the $\mu^2 BB$. If $\mu^2 \neq 0$, there
is no problem, but if $\mu^2 \to 0$, I'd be in trouble.
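A finite-dimensional toy makes the point about singularity concrete. The sketch below is an analogue only, not the Yang-Mills quadratic form: it assumes a 3x3 matrix $Q = |k|^2 I - k k^T$, which annihilates the direction $k$ the way an invariant quadratic form annihilates the direction of the invariance, and then adds a mass-like term $\mu^2 I$. All names are hypothetical.

```python
import numpy as np

def toy_quadratic_form(k):
    """Q = |k|^2 * I - k k^T: a symmetric matrix whose null vector is k itself."""
    k = np.asarray(k, dtype=float)
    return np.dot(k, k) * np.eye(len(k)) - np.outer(k, k)

k = np.array([1.0, 2.0, 2.0])   # |k|^2 = 9
Q = toy_quadratic_form(k)

# Q vanishes along k, so Q alone cannot be inverted.
print(np.allclose(Q @ k, 0.0))

for mu2 in (1.0, 1e-2, 1e-4):
    # Eigenvalues of Q + mu2 * I are mu2 (once) and |k|^2 + mu2 (twice).
    regulated = np.linalg.det(Q + mu2 * np.eye(3))
    print(mu2, regulated, mu2 * (9.0 + mu2) ** 2)
```

For any nonzero $\mu^2$ the determinant is nonzero and the form can be inverted, but it goes to zero linearly with $\mu^2$, which is the toy version of the trouble described for the massless limit.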
I discovered that if I make this change (16) in the actual Lagrangian and carry everything
up to second order it is exact, in fact because it's only second order. If I do it with the
exact change, the thing isn't invariant, it is only invariant to first order in $\alpha$. But if I make
the substitution exactly, then I get a certain addition to the Lagrangian, in other words the
Lagrangian of $B'$ (this includes the $\mu^2$, the Lagrangian plus the $\mu^2$ term in $B'$) is the Lagrangian
plus the $\mu^2$ term in $B$ plus something like this

I have to explain that the semicolon is analogous to the semicolon in gravity. The semicolon
derivative $X_{;\mu}$ means the ordinary derivative of $X$ minus $A$ cross $X$ and that's the analogue
of the Christoffel symbols. Anyway, I find out what happens to $\mathcal{L}$ when I make this
transformation. Now comes the idea, the trick, the nonsense: you start with the following thing;
you say, suppose instead of writing the original terms down, instead of writing the original
Lagrangian I were to write the following:

Now I say that the integral over $\alpha$ is some constant or other. So all I have done is to multiply
my original integral by $\mathcal{L}$ of $B$ (by $\mathcal{L}$ of $B$ I mean the whole thing, I mean this whole thing
is going to be $\mathcal{L}$ of $B$). If I can claim that when I integrate $\alpha$ I get something which is
independent of $B$, which is not self-evident. If I integrate over all $\alpha$ it does not look as if it is
independent of $B$ - but after a moment's consideration you see that it is. Because if I can
solve a certain equation, which is $\alpha_{;\mu\mu} - \mu^2 \alpha = B_{\mu;\mu}$, I can shift the value of $\alpha$ by that amount,
and then this term would disappear. In other words if I can solve this, and call this solution $\alpha_0$,
and change $\alpha$ to $\alpha'$, then the $B$ would cancel and it would only be $\alpha'$ here. I did it a little
abstractly which is a little easier to explain, therefore, this term that I've added can be
thought of as an integral of the following nature: Integral of some $B$, plus an operator acting
on $\alpha$ (this complicated operator is the second derivative and so on) squared $\mathcal{D}\alpha$. And then
by that substitution I've just mentioned, this becomes equal to 1/2 the operator $A$
times $\alpha$ squared $\mathcal{D}\alpha$, which is equal to the integral of e to the one half of $\alpha$ times $A$, the
operator $A$, times the operator $A$ times $\alpha$, integrated over primed $\alpha$. Now when you
integrate a quadratic form, which is a quadratic with an operator like this, you get one over
the square root of the determinant of the operator. So this thing is one over the square root
of the determinant of the operator $AA$. The determinant of the operator $A$ times $A$ is the
square of the determinant of $A$. So this is one over the determinant of the operator $A$, or
better it is one over the square root of the determinant of the operator $A$, squared; you'll see
in a minute why I like to write it in this way. In other words, when I've written this thing
down I've written the answer that I want. Let's call $X$ the unknown answer that I want.
Then this is equal to $X$ divided by this determinant's square root squared. Now comes the
trick - I now make the change from $B$ to $B'$. We notice that $B$ changed to $B'$ is simply ...
oh!, this is wrong, that's what's wrong, it should be just this. Now I've got it. The change
from $B$ to $B'$ is to add something to $B$. Therefore to the differential of $B$ it adds nothing,
it's just shifting the $B$ to a new value. So I make the transformation from $B$ to $B'$ everywhere.
So then I have $d\alpha$ and $dB$, and now I have a new thing up here where I make use of the
formula for $\mathcal{L}$ of $B'$:

You see there is a certain cross term generated here and another cross term coming from
expanding this out and the net result, with a little algebra here, is that it becomes $\mathcal{L}$ of $B$, but
the quadratic term doesn't cancel out and is left; there's one half of $B_{\mu;\mu}$ squared; that's
from this term; the cross term here cancels the cross term in there; and then we have only
the quadratic - I mean the $\alpha$ terms

And the problem is now to do this integral on $\alpha$; well, another miraculous thing happens.
I have the operator $A$, but this thing down here is $\alpha A \alpha$, and therefore its result is just the
determinant once; or the square of this integral is equal to this determinant, or something like
that. Therefore, when you get all the factors right, $X$, the unknown, is equal to
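The final display is missing from this copy and is not reconstructed here. The two determinant facts used in the argument can nevertheless be checked numerically. The sketch below is an illustration under stated assumptions: a real, symmetric, positive-definite 4x4 matrix stands in for the operator $A$, the exponent is taken with a convergent sign ($-\tfrac12 x \cdot M x$) rather than the oscillatory form of the talk, and the function name is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
A = M @ M.T + 4.0 * np.eye(4)   # symmetric positive-definite stand-in for the operator

def gaussian_integral(matrix):
    """Closed form of the integral of exp(-x.matrix.x / 2) over R^n, matrix positive definite."""
    n = matrix.shape[0]
    return (2.0 * np.pi) ** (n / 2) / np.sqrt(np.linalg.det(matrix))

det_A = np.linalg.det(A)

# det(A A) is the square of det(A).
print(np.isclose(np.linalg.det(A @ A), det_A ** 2))
# So the Gaussian integral with A A in the exponent gives one over det(A), up to factors of 2*pi.
print(np.isclose(gaussian_integral(A @ A), (2.0 * np.pi) ** 2 / det_A))
# With a single A in the exponent the integral gives one over the square root of det(A).
print(np.isclose(gaussian_integral(A) ** 2, (2.0 * np.pi) ** 4 / det_A))
```

The comparison of these two cases, one power of the determinant against a square root of it, is what the bookkeeping of factors in the passage above turns on.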

Sachs: I want to ask a question about long-range hopes. Perhaps for irrational reasons
people are particularly interested in those parts of the theory where there is a possibility of real
qualitative differences: what do the coordinates or topology mean in a quantized theory,
and this kind of junk. Now I wonder if you think that this perturbation theory can eventually
be jazzed up to cover also this kind of questions?
Feynman: The present theory is not a theory as it is incomplete. I do not give a rule
on how to do all problems. I expect of course that if I spend more time on figuring out how
to untangle the pretzels I shall be able to make it into such a theory. So let's suppose I did.
Now you can ask the question would the completed job, assuming it exists, be of any interest
to esoteric questions about the quantization of gravity. Of course it would be, because it
would be the expression of the quantum theory; there is today no expression of the quantum
theory which is consistent. You say: but it's perturbation theory. But it isn't. I worked
on the thing analyzing it in the series of increasing accuracy, but that's only, obviously, when
I am doing problems and checking, or doing things like I just did. But even there I haven't
said how many times the vector potential $A_0$ is attaching the diagram, there is no limit to
what order of external lines are involved in the calculation of $A_0$, for example. And so if
I get my general theorem for all orders, I'll have some kind of a formulation. The fact is,
that in such things as electrodynamics and other theories, it has not been possible to figure
out the consequences of the quantum field theory in the case of strong interactions, because
of technical difficulties which are not technical difficulties just of the gravitation theory,
but exist all over the quantum field theory. I do not expect that the gravitational problems
will be any easier in that region than they are in any other field theory, so I can say very
little there. But at least one should certainly formulate the theory that you're trying to
calculate first, and then find out what the consequences are, before trying to do it the other
way round. So I think that you'll be frustrated by the difficulties that do appear whenever
any theory diverges. On the other hand, if you ask about the physical significance of the
quantization of geometry, in other words about the philosophy behind it; what happens to the
metric, and all such questions, those I believe will be answerable, yes. I think you would
be able to figure out the physics of it afterwards, but I won't think about that until I have
it completely formulated, I don't want to start to work out the answer to something unless
I know what the equation is I am trying to analyze. But I don't have the doubt that you
will be able to do something, because after all you are describing the phenomena that you
would expect, and if you describe the phenomena then you expect you can then find some
kind of framework in which to talk to help to understand the phenomena.

PHYSICAL REVIEW VOLUME 162, NUMBER 5 25 OCTOBER 1967

Quantum Theory of Gravity. II. The Manifestly Covariant Theory*


BRYCE S. DEWITT
Institute for Advanced Study, Princeton, New Jersey
and
Department of Physics, University of North Carolina, Chapel Hill, North Carolina†
(Received 25 July 1966; revised manuscript received 9 January 1967)

Contrary to the situation which holds for the canonical theory described in the first paper of this series,
there exists at present no tractable pure operator language on which to base a manifestly covariant quantum
theory of gravity. One must construct the theory by analogy with conventional S-matrix theory, using
the c-number language of Feynman amplitudes when nothing else is available. The present paper undertakes
this construction. It begins at an elementary level with a treatment of the propagation of small disturbances
on a classical background. The classical background plays a fundamental role throughout, both as a technical
instrument for probing the vacuum (i.e., analyzing virtual processes) and as an arbitrary fiducial point for
the quantum fluctuations. The problem of the quantized light cone is discussed in a preliminary way, and
the formal structure of the invariance group is displayed. A condensed notation is adopted which permits
the Yang-Mills field to be studied simultaneously with the gravitational field. Generally covariant Green's
functions are introduced through the imposition of covariant supplementary conditions on small dis-
turbances. The transition from the classical to the quantum theory is made via the Poisson bracket of
Peierls. Commutation relations for the asymptotic fields are obtained and used to define the incoming
and outgoing states. Because of the non-Abelian character of the coordinate transformation group, the
separation of propagated disturbances into physical and nonphysical components requires much greater
care than in electrodynamics. With the aid of a canonical form for the commutator function, two distinct
Feynman propagators relative to an arbitrary background are defined. One of these is manifestly co-
variant, but propagates nonphysical as well as physical quanta; the other propagates physical quanta only,
but lacks manifest covariance. The latter is used to define external-line wave functions and non-radiatively-
corrected amplitudes for scattering, pair production, and pair annihilation by the background field. The
group invariance of these amplitudes is proved. A fully covariant generalization of the complete S matrix
is next proposed, and Feynman's tree theorem on the group invariance of non-radiatively-corrected n-particle
amplitudes is derived. The big problem of radiative corrections is then confronted. The resolution of this
problem is carried out in steps. The single-loop contribution to the vacuum-to-vacuum amplitude is first
computed with the aid of the formal theory of continuous determinants. This contribution is then func-
tionally differentiated to obtain the lowest-order radiative corrections to the n-quantum amplitudes.
These amplitudes split automatically into Feynman baskets, i.e., sums over tree amplitudes (bare scattering
amplitudes) in which all external lines are on the mass shell. This guarantees their group invariance. The
invariance can be made partially manifest by converting from the noncovariant Feynman propagator to
the covariant one, and this leads to the formal appearance of fictitious quanta which compensate the
nonphysical modes carried by the covariant propagator. Although avoidable in principle, these quanta
necessarily appear whenever manifestly covariant expressions are employed, e.g., in renormalization theory.
The fictitious quanta, however, appear only in closed loops and are coupled to real quanta through vertices
which vanish when the invariance group is Abelian. The vertices are nonsymmetric and always occur with
a uniform orientation around any fictitious quantum loop. The problem of splitting radiative corrections
into Feynman baskets becomes more difficult in higher orders, when overlapping loops occur. This problem
is approached with the aid of the Feynman functional integral. It is shown that the measure or volume
element for the functional integration plays a fundamental role in the decomposition into Feynman
baskets and in guaranteeing the invariance of radiative corrections under arbitrary changes in the choice
of basic field variables. The measure has two effects. Firstly, it removes from all closed loops the noncausal
chains of cyclically connected advanced (or retarded) Green's functions, thereby breaking them
open and ensuring that at least one segment of every loop is on the mass shell. Secondly it adds certain nonlocal
corrections to the operator field equations, which vanish in the classical limit $\hbar \to 0$. The question
arises why these removals and corrections are always neglected in conventional field theory without apparent
harm. It is argued that the usual procedures of renormalization theory automatically take care of them.
In practice the criteria of locality and unitarity are replaced by analyticity statements and Cutkosky rules.
It is virtually certain that the measure may be similarly ignored (set equal to unity) in gravity theory,
and that attention may therefore be confined to primary diagrams, i.e., diagrams which contain Feynman
propagators only, with no noncausal chains removed. A general algorithm is given for obtaining the
primary diagrams of arbitrarily high order, including all fictitious quantum loops, and the group invariance
of the amplitudes thereby defined is proved. Essential to all these derivations is the use of a background
field satisfying the classical free field equations. It is never necessary to employ external sources, and
hence the well-known difficulties arising with sources in a non-Abelian context are avoided.

* This research was supported in part by the Air Force Office of Scientific Research under
Grant AFOSR-153-64, and in part by the National Science Foundation under Grant GP7437.
† Permanent address.

1. INTRODUCTION

In the first paper of this series¹ an attempt was made to show what happens when canonical
Hamiltonian quantization methods are applied to the gravitational field. Attention was focused
on some of the bizarre features of the resulting formalism which arise in the case of finite
worlds, and which are of possible cosmological and even metaphysical significance. Such
prosaic questions as the scattering, production, absorption, and decay of individual quanta
were left untouched. The main reason for this was that the canonical theory does not lend
itself easily to the study of these questions when physical conditions are such that the
effects of vacuum processes must be taken into account. A manifestly covariant formalism is
needed instead. It is the task of the present paper to provide such a formalism.

¹ B. S. DeWitt, Phys. Rev. 160, 1113 (1967). This paper will be referred to as I.

We must begin by making clear precisely what is meant by "manifest covariance." In conventional
S-matrix theory (whether based on a conventional field theory or not) "manifest covariance"
means "manifest Lorentz covariance." In the context of a theory of gravity the question arises
whether it should mean more than this, since the classical theory from which one starts has
"manifest general covariance." Here one must be careful. There is an important difference
between general covariance and ordinary Lorentz covariance, and neither one implies the other.
Lorentz covariance is the expression of a geometrical symmetry possessed by a system. In
gravity theory it has relevance at most to the asymptotic state of the field. As has been
emphasized by Fock,² the word "relativity" in the name "general relativity" has connotations of
symmetry which are misleading. Far from being more relativistic than special relativity,
general relativity is in fact less relativistic. For as soon as space-time acquires bumps
(i.e., curvature) it becomes absolute in the sense that one may be able to specify position or
velocity with respect to these bumps, provided they are sufficiently pronounced and
distinguishable from one another. Only when the bumps coalesce into regions of uniform
curvature does space-time regain its relativistic properties. It never becomes more
relativistic than flat space-time, which is characterized by the 10-parameter Poincaré group.

² V. Fock, The Theory of Space-Time and Gravitation (Pergamon Press, New York, 1959).

The technical method of distinguishing between the Poincaré group and the general coordinate
transformation group is to confine the operations of the latter group to a finite (but
arbitrary) region of space-time. The asymptotic coordinates are then left undisturbed by
general coordinate transformations, and only the operations of the Poincaré group (if that is
indeed the asymptotic symmetry group of the problem) are allowed to change them. The general
coordinate transformation group thus becomes a gauge group which, although historically an
offspring of the Poincaré group and the equivalence principle, plays technically the rather
obscure role of providing the analytic means by which the Einstein equations can be obtained
from a variational principle and their essential locality displayed.³

³ The content of the Einstein equations can be expressed in an intrinsic coordinate-independent
form only at the cost of introducing nonlocal structures. (See, for example, Ref. 32.) It can be
argued [see S. Weinberg, Phys. Rev. 138, B988 (1965)] that the general coordinate
transformation group is simply a consequence of the zero rest mass of the gravitational field
and its long-range character.

This, however, is not the whole story, for the general coordinate transformation group still
has, even as a gauge group, profound physical implications. Some of these we have already
encountered in I, and some we shall encounter in the present paper. Others will appear in the
final paper of this series, which is to be devoted to applications of the covariant theory. If
it were not for these implications there would be little interest in pushing our
investigations further, for there is no likelihood that such prosaic processes as
graviton-graviton scattering or curvature induced vacuum polarization will ever be
experimentally observed.⁴ The real reason for studying the quantum theory of gravity is that by
uniting quantum theory and general relativity one may discover, at no cost in the way of new
axioms of physics, some previously unknown consequences of general coordinate invariance, which
suggest new interesting things that can be done with quantum field theory as a whole.

⁴ Although one might hope for some very indirect cosmological evidence for such processes.

Our problem will be to develop a formalism which makes manifest the extent to which general
covariance permeates the theory. This will be accomplished by introducing, instead of a flat
background, an adjustable c-number background metric. Use of such a metric has the following
fundamental technical advantages: (1) It facilitates the introduction of particle propagators
which are generally covariant rather than merely Lorentz-covariant. (2) It reduces the study of
radiative corrections to the study of the vacuum. (3) It makes possible the generally covariant
isolation of divergences, which is essential to any renormalization program. (4) It renders
theorems analogous to the Ward identity almost trivial. (5) It makes possible, in principle,
the extension of the theory of radiative corrections to worlds for which space-time is not
asymptotically flat and which may even be closed and finite. These advantages are typical of
what we shall mean by the phrase "manifest covariance." Use of the phrase, however, is not to
be understood as implying that the simple trick of introducing a variable background metric
makes everything obvious. The generally covariant propagators will not be unique but will be
choosable in various ways, analogous to the gauge choices in quantum electrodynamics, and we
shall have to undertake a separate investigation, just as in quantum electrodynamics, to verify
that the choice is irrelevant. This investigation turns out to be much more complicated than in
the case of quantum electrodynamics.

Of the five advantages listed above as stemming from the use of a variable background metric
only the first two will appear in the present paper. The third
and fourth will be demonstrated in the following paper The language of graphs and the S matrix is much more
of this series, while the fifth remains a program for direct.
the future. I t is not out of place here, however, to The latter language, embracing as it does many dif-
speculate briefly on this ultimate program. As long ferent particle theories at once, is also much less
as the conventional S matrix is our chief concern it dependent on the detailed Lagrangian structure of the
appropriate to choose a background metric which is field theory on which it is based. It assumes that virtual
asymptotically flat. We shall see that Lorentz invari- processes may be described by an infinite set of basic
ance of the S matrix then follows almost trivially from diagrams, the combinatorial properties of which are the
the formalism, in the limit in which the background same for all field theories. In working out the details
metric becomes everywhere Minkowskian. Now i t is of how this language is t o be extended to the non-
obvious that scattering processes are also possible in Abelian case, we have attempted to develop it within
an infinite world which is not asymptotically flat. h as broad a framework as possible. Every thcorem in
such a world it should be possible to construct a this paper will therefore apply not only to the gravita-
generalized S matrix in which the convenlianal phne- tional field but also to the Yang-Mills field6which,
wave momentum eigenfunctions are replaced by wave like the gravitational field, possesses a non-Abelian
functions appropriatc to the altered asymptotic invariance group6
Section 2 begins with the introduction of a notation
geometry. The asymptotic gcometry itself would be which is sufficiently general to embrace all boson field
fixed by choosing the background metric appropriately. theories and at the 5ame time condensed enough to
In a dosed world no rigorous S matrix exists. The reduce the highIy complex analysis of subsequent sec-
continuum of scattering states is replaced by a regime tions to manageable proporlions, A table is included
of discrete quantization, and, as we have seen in I, to facilitate comparison of the condensed notation with
the wave function of the universe may even be unique. the detailed forms which the various symbols take in
It may be conjectured that the formalism most ap- the case of the Yang-Mills and gravitational fields.
propriate to this case is obtained by choosing the back- The notation is particularly useful in dealing with the
ground metric to be lzot a c number but rather an second functional derivative of the action, which plays
operator depending on a small number (e.g., one) of the role of the differential operator governing the prop-
quantum variables similar to the operator R represent- agation of infinitesimal disturbances on an arbitrary
ing the radius of the Friedmann universe studied in I. background field. I t is also useful in dealing with the
These variables would be quantized by the canonical higher functional derivatives, which are the bare vertex
method, while the full q-number metric would continue functions of the theory. The problem of the quantized
tb be treated by manifestly covariant methods. (Con- light cone is discussed in a preliminary way in Sec. 3,
dilions of constraint would, of course, have to be im- and its relationship t o the nonrenorrnalizabilily of
posed on the latter metric to take into account the fact the theory is noted. Attention is called to the various
that some of its degrees of freedom have been trans- roles of the background metric, one of which is to define
ferred to the background metric.) The resulting the concepts of past and future. Green's theorem
simultaneous use of both the canonical and covariant for an arbitrary differential operator is then derived.
theories might help to reveal the relationship between
Section 4 introduces a notation for the basic struc-
them. tures governing the action of the invariance group on
As has been remarked in I, no rigorous mathematical the field variables. The relationship between manifest
link has thus far been established between the canonical covariance and linearity of the group transformation
and covariant theories. In the case of infinite worlds laws is emphasized. In Sec. 5 it is pointed out that the
it is believed that the two theories are merely two infinitesimal disturbances themselves are determined
versions of the same theory, expressed in different only modulo an Abelian transformation group. This
languages, but no one knows for sure. The analysis of group, which is the tangent group of the full group,
radiative corrections has turned out to be of such affects only the field variables but not physical ob-
intricacy that the covariant theory has had to be servables. The latter are necessarily group-invariant.
developed completely within its own framework and Infinitesimal disturbances satisfying retarded or ad-
independently of the canonical theory. Although the
structure of the covariant theory is suggested by the 6 C. N. Yang and R. L. Mills, Phys. Rev. 96, 191 (1954).
formalism of field operators, and hence maintains a few The term "invariance group," as used in this paper, will
always refer to the infinite dimensional gauge group of the
points of contact with conventional field theory, the theory, and not to the finite dimensional (≤10) asymptotic
language of operators is dropped at a certain key stage isometry group, which is undetermined a priori. It is not hard to
and c-number criteria are thenceforth exclusively em- show that the Yang-Mills field and its gauge group can be
given a metrical interpretation which suggests a physical kinship
ployed to maintain internal consistency. It turns out between the Yang-Mills and gravitational fields which is closer
that the language of operators is a peculiarly unwieldy than the formal mathematical similarities between them alone
indicate. [See B. S. DeWitt, Dynamical Theory of Groups and
one in which to discuss questions of consistency when Fields (Gordon and Breach Science Publishers, Inc., New York,
the invariance group of the theory is non-Abelian, 1965), problem 77, p. 139.]

1198 BRYCE S. DEWITT 162


vanced boundary conditions can be computed with the aid of corresponding Green's functions provided
supplementary conditions are imposed. For convenience these supplementary conditions are chosen in a
manifestly covariant way, but their essential arbitrariness is emphasized.

Use of the covariant Green's functions in connection with Cauchy data for infinitesimal disturbances is
discussed in Sec. 6, and the fundamental reciprocity relations of propagator theory are established.
Transition from the classical to the quantum theory is made via the Poisson bracket of Peierls (see Ref. 20),
which is determined solely by the behavior of infinitesimal disturbances. The reciprocity relations are used
to show that Peierls' Poisson bracket satisfies all the usual identities. Section 7 introduces the important
concept of the asymptotic fields, which obey the field equations of the linearized theory. From the asymptotic
fields one can construct asymptotic invariants, which may be used to characterize completely the physical
state of the field. The asymptotic invariants are conditional invariants, i.e., invariants modulo the field
equations. It is emphasized that their commutators (i.e., Poisson brackets) are nonetheless well defined.
A direct proof is given that the asymptotic invariants satisfy the commutation relations of the linearized
theory, a result which is nontrivial when a group is present. This result is used in Sec. 8 to construct the
creation and annihilation operators for real (i.e., physical) quanta in the remote past and future. The
detailed structures of the asymptotic Yang-Mills and gravitational fields must be investigated separately,
but a condensed notation (for the asymptotic wave functions) is again introduced, which embraces both
fields at once and emphasizes their similarities. A table is included to facilitate the comparison. The
quanta of both fields are transverse and differ only in spin. States are labeled by helicity, which is
readily shown to be Lorentz-invariant.

Continuing the uniform treatment of the two fields, Sec. 9 shows that the asymptotic commutator functions
of both can be expressed in a standard canonical form. A special notation is introduced for the projection
of the canonical form into the physical subspace. With the aid of this projection two distinct Feynman
propagators are defined relative to an arbitrary background field. Both serve to describe the propagation
of field quanta in nonasymptotic regions as well as at infinity. One is manifestly covariant but propagates
nonphysical as well as physical quanta; the other propagates physical quanta only but lacks manifest
covariance. The latter is used in Sec. 10 to define the external line wave functions which enter into the
ultimate definition of the S matrix. These functions serve to generalize the asymptotic wave functions to
the case in which an arbitrary background field is present. They satisfy a number of important relations
following from a fundamental lemma which is proved in this section. The lemma is used again in Sec. 11 to
prove that the non-radiatively-corrected amplitudes for scattering, pair production and pair annihilation
by the background field are group-invariant. Group invariance here implies invariance under group
transformations of the background field, under gauge changes of the propagators, and under radiation gauge
changes in the asymptotic wave functions. The amplitudes are also shown to satisfy a set of relations which
are the relativistic generalizations of the well known optical theorem for nonrelativistic scattering.

Construction of the full S matrix of the theory is begun in Sec. 12. The field operators are separated into
two parts, a classical background satisfying the classical field equations, and a quantum remainder. Vacuum
states associated with the remote past and future are defined relative to the background field. Vacuum
matrix elements of chronological products are constructed by varying the vacuum-to-vacuum amplitude with
respect to the background field. It turns out that all physical amplitudes can be obtained in this way
despite the fact that the variations in the background field are subject to the constraint that the
classical field equations never be violated. The well-known difficulties arising with the use of external
sources in a non-Abelian context are thus avoided. When no invariance group is present the vacuum matrix
elements of chronological products are expressible in terms of functions having the combinatorial structure
of tree diagrams. Use of these functions constitutes an essential part of the program for constructing the
S matrix as given in this paper. Since these functions are initially defined only in the absence of an
invariance group, however, we are at this point forced to abandon the strict operator formalism. Section 13
displays the structure of the S matrix and its unitarity conditions when no invariance group is present.
Section 14 then begins the long and intricate task of generalizing this structure to the case in which a
group is present. Aside from an invariance lemma which is used to suggest the desired generalization, the
important proof of this section is the tree theorem. The tree theorem says that the lowest-order (i.e.,
non-radiatively corrected) contributions to any scattering process can always be calculated by elementary
methods, using any choice of gauge for the propagators of the internal lines and any choice of gauge for
the external-line wave functions. The result will be independent of the gauge choices provided all the tree
diagrams contributing to the given process are summed together.

There remains only the question of the vacuum-to-vacuum amplitude itself. Since all radiative corrections
can be obtained by functionally differentiating this amplitude with respect to the background field, a
proof of its group invariance would complete the proof of the invariance of the entire S matrix. The real
problem, however, is to construct the amplitude, and the

162 QUANTUM THEORY OF GRAVITY. II 1199

invariance criterion must therefore be used as a guide rather than as an a posteriori consistency check.

Section 15 pauses briefly to review the question of Lorentz invariance, to point out that the theory should
also be invariant under changes in the specific variables with which one works, and to comment upon the
utility of using c-number language exclusively. Section 16 then plunges into the main problem. The
single-loop contribution to the vacuum-to-vacuum amplitude is computed with the aid of the formal theory of
continuous determinants, and various alternative forms for it are given. There is no ambiguity about this
contribution, and its group invariance is readily demonstrated.

This contribution is functionally differentiated in Sec. 17 to yield the lowest-order contribution to
single quantum production by the background field. The latter splits into two parts, one involving the
covariant propagator for normal quanta and the other involving the covariant propagator for a set of
fictitious quanta which compensate the nonphysical quanta that the first propagator also carries. The
fictitious quanta are coupled to real quanta through asymmetric vertices which vanish when the invariance
group is Abelian. With the aid of the fundamental lemma of Sec. 10 and a collection of new identities it is
shown that the fictitious quanta can be formally avoided by replacing the covariant propagator by the
noncovariant one which carries physical quanta only. The covariant propagators, however, are needed for the
practical implementation of any renormalization program.

The lowest-order radiative corrections to the n-quantum amplitudes are analyzed in Sec. 18. These
amplitudes split automatically into Feynman baskets, i.e., sums over tree amplitudes (lowest-order
scattering amplitudes) in which all external lines are on the mass shell. The tree theorem then guarantees
their group invariance. This invariance can be made partially manifest by converting from the noncovariant
propagator to the covariant one, and the fictitious quanta again make their appearance.

The problem of splitting the radiative corrections into Feynman baskets becomes more difficult in higher
orders, when overlapping loops occur. This problem is approached in Sec. 19 with the aid of the Feynman
functional integral. When no invariance group is present it is shown that the measure or volume element for
the functional integration plays a fundamental role in the decomposition into Feynman baskets and in
guaranteeing the invariance of the vacuum-to-vacuum amplitude under arbitrary changes in the choice of
basic field variables. The measure has two effects. Firstly, it removes from all closed loops the noncausal
chains of cyclically connected advanced (or retarded) Green's functions, thereby breaking them open and
insuring that at least one segment of every loop is on the mass shell. Secondly, it adds certain nonlocal
corrections to the operator field equations, which vanish in the classical limit $\hbar \to 0$. The question
arises why these removals and corrections are always neglected in conventional field theory without
apparent harm. It is argued that the usual procedures of renormalization theory automatically take care of
them and that in practice the criteria of locality and unitarity are replaced by analyticity statements and
Cutkosky rules (see Ref. 52). A detailed investigation of these corrections when a group is present is
undertaken in Sec. 20. The two-loop Feynman-basket decomposition of the preceding section is appropriately
generalized and the result is reexpressed in terms of covariant propagators, including the fictitious
quanta. It turns out that the total two-loop amplitude is obtainable from a set of covariant primary
diagrams (containing Feynman propagators only, and hence off-mass-shell contributions in all lines) by a
process of removing noncausal chains and adding nonlocal corrections, which is completely analogous to that
of the no-group case. Moreover, the primary diagrams, taken together, are group-invariant as they stand,
independently of the tree theorem. This suggests that even when a group is present the noncausal chains and
nonlocal corrections may be neglected as in conventional field theory. The problem therefore becomes one of
finding a general algorithm for obtaining the primary diagrams of arbitrarily high order, including all
fictitious quantum loops. The remainder of Sec. 20 is devoted to the construction of such an algorithm. The
generator for the algorithm is a Feynman functional integral for the vacuum-to-vacuum amplitude, which
includes fields representing the fictitious quanta. The group invariance of this integral is explicitly
demonstrated, and the fictitious quanta are shown formally to obey Fermi statistics despite their integral
spin. No physical criteria are violated, however, since the fictitious quanta never occur outside of closed
loops. Finally, the rules for inserting external lines into the primary vacuum diagrams are given, and the
asymmetric vertices contained in the fictitious quantum loops are shown to have a uniform orientation
around each loop.

2. NOTATION. INFINITESIMAL DISTURBANCES. BARE VERTEX FUNCTIONS

A quantum field theory begins with the selection of an action functional S. If the theory is local this
functional is expressible in the form

$$S = \int \mathcal{L}\,dx, \qquad dx \equiv dx^0\,dx^1\,dx^2\,dx^3, \tag{2.1}$$

where $\mathcal{L}$, the Lagrangian (density), is a function of the dynamical variables and a finite number
of their space-time derivatives at a single point. Various criteria such as covariance, self-consistency of
the field equations, the existence of the vacuum as a state of lowest energy, and positive definiteness of
the quantum-

mechanical Hilbert space in practice drastically limit the possible choices for $\mathcal{L}$. However,
many different choices exist for the Lagrangian of a given field. Thus it is always possible to add a
trivial divergence to the Lagrangian without changing the field equations at all. Moreover, the field
variables may be replaced by arbitrary functions of themselves; this replaces the field equations by linear
combinations of themselves. Finally, even the number of field variables is not unique; for example,
alternative Lagrangians may be found leading to field equations which express some of the variables in
terms of derivatives of others. What is important is that the choice of Lagrangian is basically irrelevant
to the development of the theory of a given field and should be determined only by convenience. The quantum
theory of a given field must be constructed in such a way that it is invariant under changes in the mode of
description of the field.

It will prove convenient in what follows to adopt a highly condensed notation. The field variables (assumed
here to be real) will be denoted by $\varphi^i$, and commas followed by indices from the middle of the Greek
alphabet will be used to denote differentiation with respect to the space-time coordinates. The first part
of the Greek alphabet will be reserved for group indices, to be introduced presently. Primes will be used to
distinguish different points of space-time; they will also appear on associated indices, or on field
symbols themselves, when it is desired to avoid cumbersome explicit appearances of the $x$'s. In most cases,
however, the primes will be simply omitted. This corresponds to making the indices $i$, $j$, etc. do double
duty as discrete labels for field components and as continuous labels over the points of space-time. That
is, an index such as $i$ will really stand for the quintuple $(i, x^0, x^1, x^2, x^3)$ and the summation
convention for repeated indices will be extended to include integrations over the $x$'s. The significance of
the indices thus becomes almost purely combinatorial. When this notation is employed it is necessary to
remember that expressions such as $M_{ij}$ are really elements of continuous matrices and that the symbol
$\delta^i{}_j$ involves a 4-dimensional $\delta$ function.

For most purposes the form of the field equations is more important than the value of the action
functional. Therefore, the domain of integration in (2.1) is unimportant; when otherwise unspecified it is
to be understood as being large enough to embrace all points at which it may be desired to perform
functional differentiations. Functional differentiation with respect to the field variables will be denoted
by a comma followed by one or more Latin indices. Thus the field equations will be expressed in the symbolic
form

$$S_{,i} = 0. \tag{2.2}$$

In this paper no restriction is imposed on the range of Latin indices. Other conventions, to the extent
they overlap, are the same as in I.

Suppose the form of the action functional suffers the following change:

$$S \to S + \epsilon A, \tag{2.3}$$

where $\epsilon$ is an infinitesimal constant. Such a change may be thought of as being brought about by weak
coupling to some external agent. The coupling produces an infinitesimal disturbance $\delta\varphi^i$ in the
field, which satisfies the linear inhomogeneous equation

$$S_{,ij}\,\delta\varphi^j = -\epsilon A_{,i}. \tag{2.4}$$

That is, $\varphi^i + \delta\varphi^i$ satisfies the field equations of the system $S + \epsilon A$ if
$\varphi^i$ satisfies those of the system $S$. The undisturbed field $\varphi^i$ may be regarded as a
background field upon which the disturbance $\delta\varphi^i$ propagates. The concept of the background field
proves to be a useful one in the covariant theory, and will occur repeatedly in what follows.

For local theories the quantity $S_{,ij}$ has the form of a linear combination of $\delta$ functions and
derivatives of $\delta$ functions, with functions of the field variables and their derivatives as
coefficients. In Eq. (2.4) $S_{,ij}$ therefore plays the role of a linear differential operator with
variable coefficients. The reader will find it useful to consult Table I, which lists the explicit forms
which this and various other abstract symbols of the general formalism take in the cases of the Yang-Mills
field and the gravitational field, respectively.

In the case of linear theories $S_{,ij}$ corresponds to a linear differential operator with constant
coefficients, and the higher functional derivatives $S_{,ijk}$, etc., vanish. In nonlinear theories the
higher functional derivatives are known as bare vertex functions. They describe the basic interactions
between finite disturbances, the propagation of which, as will be seen later, provides a direct classical
model for the quantum S matrix.

It is frequently convenient to introduce a further condensation of notation, namely to make the replacement

$$S_{,i_1 \cdots i_n} \to S_n \tag{2.5}$$

and to drop the indices altogether. Equations (2.2) and (2.4) are then replaced by

$$S_1 = 0 \tag{2.6}$$

and

$$S_2\,\delta\varphi = -\epsilon A_1, \tag{2.7}$$

respectively. If the basic field variables are properly chosen the number of nonvanishing bare vertex
functions is finite in the case of both the Yang-Mills and gravitational fields. Thus, for the Yang-Mills
field we have $S_n = 0$ for $n > 4$ when the field variables are chosen as in Table I, while for the
gravitational field we have $S_n = 0$ for $n > 9$ if the quantities
$\varphi^{\mu\nu} \equiv g^{5/18} g^{\mu\nu} - \eta^{\mu\nu}$ are

TABLE I. Expressions for the Yang-Mills and gravitational fields corresponding to quantities appearing in the abstract formalism.

Abstract symbol or equation    Corresponding expression for the Yang-Mills field    Corresponding expression for the gravitational field

cp'

I;pu85fg,'~igcr.s+l"r.N-gNY.r). . .
The indices p, Y are raised and lowered by means of the
Minkowski metric ~ , v = d i a g ( - l , l l l , l ) and its inverse The indices p, v, p , u, r are raised and lowered by means
q p y . The indices CY, 8, y are raised and lowered by means of the metric tensor g," and its inverse gr".
of the Cartan metric, I n the remaining entries of this table the symbol (OR is
y a p -c-~c~,8c~gy replaced simply by R.
and its inverse y@. The c's are the structure constants
of a compact n-dimensional semi-simple Lie group, and
the constant c2 is chosen so that det(ya~)= 1.
o=s,i O = 6S/6AaNs -Fa#';".

St" The infinitesimal group parameters are functions 6 t a ( x ) The infinitesimal group parameters are the functions
which assign to each point x a corresponding &p(x) appearing in the infinitesimal coordinate
infinitesimal transformation of the generating Lie transformation ~ r = x f i + 6 p . Under inner automorphisms
group. Under inner automorphisms they transform they transform as contravariant vectors. Note that
according to the adjoint representation of the full group. group and coordinate indices coincide in the case of the
general coordinate transformation group.
R', R a r ~=i -6a51 ;,,, 6'p" Sa~6(x,x') Rwu~ -6put;v-6u.*;p, 6gr's gfiA(x,X')
6 cpi =Ria6ta 6 A a = - 6Ea:,= -6Ea,p-~ar,qAYP6EB 6 'ppv = -6tK" -St" ;I!= - g w a. P g q u w , r - g r d 4I
Semicolons denote invariant differentiation. A field Semicolons denote covariant differentiation. A field
quantity Q which has the group transformation law quantity cp which has the group transformation law
6 Q =G a QW, 8cp= - a.r6P+G",~8P,.= - c p : , 6 E f i + G r , ~ ~ : ,
where the G, are the generators of a matrix tepresenta- where the GI,, are the generators of a matrix representa-
tion of the generating Lie group, is defined to have the tion of the linear group, is defined to have the covariant
invariant derivative derivative
P:," cp,r+GaAaeQ. P:,= cp,C+G.or,v8P.
Invariant differentiation leaves transformation properties Covariant differentiation adds one covariant index. It
intact. It has the commutation law
~ : # v - ~ : v p =- - G a F a p v ~ .

S,iRia=0 FU~*:,,.=0
This identity is a consequence of the antisymmetry of
Fa,, and of the structure constants 0,8. Fa,. transforms
according to the adjoint representation of the group and
also satisfies the cyclic identity
Fapv; .+ ,+
Fa".: Farlr;"=0.



chosen as the basic field variables.8 With the conventional choice of Table I the number of nonvanishing
gravitational vertex functions is infinite.
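
As an illustration of the condensed notation (this example is not taken from the paper), one may consider a single real scalar field with a quartic self-coupling. The mass $m$ and coupling $\lambda$ are assumptions introduced only for this sketch, and the signature $\mathrm{diag}(-1,1,1,1)$ of Table I is used. The functional derivatives then terminate at $n = 4$, in the same way as for the Yang-Mills field:

```latex
\begin{align*}
S &= \int \Big( -\tfrac12\,\eta^{\mu\nu}\varphi_{,\mu}\varphi_{,\nu}
      - \tfrac12\,m^2\varphi^2 - \tfrac{\lambda}{4!}\,\varphi^4 \Big)\,dx , \\
S_1 &:\quad \frac{\delta S}{\delta\varphi(x)}
      = (\Box - m^2)\,\varphi - \tfrac{\lambda}{3!}\,\varphi^3 = 0 ,
      \qquad \Box \equiv \eta^{\mu\nu}\partial_\mu\partial_\nu , \\
S_2 &:\quad \frac{\delta^2 S}{\delta\varphi(x)\,\delta\varphi(x')}
      = \big(\Box - m^2 - \tfrac{\lambda}{2}\,\varphi^2\big)\,\delta(x,x') , \\
S_3 &:\quad -\lambda\,\varphi(x)\,\delta(x,x')\,\delta(x,x'') , \\
S_4 &:\quad -\lambda\,\delta(x,x')\,\delta(x,x'')\,\delta(x,x''') , \\
S_n &= 0 \quad \text{for } n > 4 .
\end{align*}
```

In this toy case the disturbance equation $S_2\,\delta\varphi = -\epsilon A_1$ reads $(\Box - m^2 - \tfrac{\lambda}{2}\varphi^2)\,\delta\varphi = -\epsilon\,\delta A/\delta\varphi$, a linear equation whose coefficients depend on the background field $\varphi$.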
For a local theory a typical term in $S_n$ involves the
product of $n-1$ $\delta$ functions or derivatives of $\delta$ functions.
In momentum space with a constant (e.g., flat) background
field these reduce to a single $\delta$ function, which
expresses the conservation of momentum of the $n$ field
quanta taking part in the elementary process described
by the vertex in question. The calculation of specific
processes is usually most conveniently performed in momentum
space; the development of the general theory,
however, and in particular the demonstration of the
covariance of renormalization procedures, is best done
in coordinate space.
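
The reduction to a single $\delta$ function can be sketched for the simplest case. The cubic vertex below, with a constant strength $g$, is a hypothetical example chosen for illustration and is not one of the vertices of the paper; all momenta are taken as incoming:

```latex
\begin{align*}
S_3(x,x',x'') &= g\,\delta(x-x')\,\delta(x-x'') , \\
\tilde S_3(p,p',p'') &= \int dx\,dx'\,dx''\;
      e^{\,i(p\cdot x + p'\cdot x' + p''\cdot x'')}\,S_3(x,x',x'') \\
   &= g \int dx\; e^{\,i(p+p'+p'')\cdot x}
    = (2\pi)^4\, g\,\delta(p+p'+p'') .
\end{align*}
```

The two coordinate-space $\delta$ functions are used up by the $x'$ and $x''$ integrations, and the remaining integral over $x$ produces the single momentum-conserving $\delta$ function. Derivatives acting on the $\delta$ functions would only contribute powers of the momenta multiplying the same factor.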
Because of the commutativity of functional differentiation
the bare vertex functions $S_{,ijk\ldots}$ are completely
symmetric in their indices, and $S_{,ij}$ corresponds to a
self-adjoint linear operator. When employing the notation
(2.5) we may regard the symbol $S_2$ as actually
representing this operator. Note: The abstract notation
must be used with a measure of caution because the
associative law of matrix multiplication does not always
hold. If $\Phi$ and $\Psi$ are two functions which do not vanish
rapidly at infinity, the value of the expression $\Phi^i S_{,ij} \Psi^j$
may depend on which implicit integration is performed
first. This ambiguity may be removed by using arrows
to distinguish the two possibilities: $\Phi\,\overrightarrow{S_2}\,\Psi$ and $\Phi\,\overleftarrow{S_2}\,\Psi$.9
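
A one-dimensional toy example, introduced here only as an illustration and not drawn from the paper, shows how the order of integration can matter. Take the self-adjoint operator $-d^2/dx^2$ in place of $S_2$, and the assumed test functions $\Phi = 1$ and $\Psi = \sqrt{1+x^2}$, neither of which vanishes at infinity:

```latex
\begin{align*}
\int_{-\infty}^{\infty} \Phi\,\big(-\Psi''\big)\,dx
   &= -\int_{-\infty}^{\infty} \frac{dx}{(1+x^2)^{3/2}}
    = -\Big[\frac{x}{\sqrt{1+x^2}}\Big]_{-\infty}^{\infty} = -2 , \\
\int_{-\infty}^{\infty} \big(-\Phi''\big)\,\Psi\,dx &= 0 , \\
\text{difference} &= -\big[\Phi\,\Psi' - \Phi'\,\Psi\big]_{-\infty}^{\infty} = -2 .
\end{align*}
```

The two results differ by a surface term that survives because $\Psi'$ tends to $\pm 1$ at $x \to \pm\infty$. When both functions fall off rapidly the surface term vanishes and the two orderings agree.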
The present discussion will be limited to boson fields.
For the extension of the formalism to the case of
fermion fields, which involves anticommutative dif-
ferentiation and antisymmetric vertex functions, the
reader may consult the reference given in Ref. 6.
This reference contains detailed proofs of some of the
important theorems to be stated in what follows. We
shall therefore restrict ourselves here to sketch-proofs
or simple statements of these theorems but will take the
occasion to improve their presentation.

PHYSICAL REVIEW VOLUME 162, NUMBER 5 25 OCTOBER 1967

Quantum Theory of Gravity. III. Applications of the Covariant Theory*


BRYCE S. DEWITT
Institute for Advanced Study, Princeton, New Jersey
and
Department of Physics, University of North Carolina, Chapel Hill, North Carolina†
(Received 25 July 1966; revised manuscript received 9 January 1967)

The basic momentum-space propagators and vertices (including those for the fictitious quanta) are
given for both the Yang-Mills and gravitational fields. These propagators are used to obtain the cross
sections for gravitational scattering of two scalar particles, scattering of gravitons by scalar particles,
graviton-graviton scattering, two-graviton annihilation of scalar-particle pairs, and graviton bremsstrahlung.
Special features of these cross sections are noted. Problems arising in renormalization theory and the
role of the Planck length are discussed. The gravitational Ward identity is derived, and the structure of
the radiatively corrected 1-graviton vertex for a scalar particle is displayed. The Ward identity is only one
of an infinity of identities relating the many-graviton vertex functions of the theory. The need for such
identities may be eliminated in principle by computing radiative corrections directly in coordinate space,
using the theory of manifestly covariant Green's functions. As an example of such a calculation, the
contribution of conformal metric fluctuations to the vacuum-to-vacuum amplitude is summed to all orders.
The physical significance of the renormalization terms is discussed. Finally, Weinberg's treatment of the
infrared problem is examined. It is not difficult to show that the fictitious quanta contribute negligibly to
infrared amplitudes, and hence that Weinberg's use of the DeDonder gauge is justified. His proof that the
infrared problem in gravidynamics can be handled just as in electrodynamics is thereby made rigorous.

1. INTRODUCTION

In the first two papers of this series1 two distinct approaches to the quantum theory of gravity were
developed, one based on the so-called canonical or Hamiltonian theory and the other on the manifestly
covariant theory of propagators and diagrams. So far no rigorous mathematical link between the two has been
established. In part this is due to the kinds of questions each asks. The canonical theory leads almost
unavoidably to speculations about the meaning of amplitudes for different 3-geometries or the wave function
of the universe. The covariant theory, on the other hand, concerns itself with micro-processes such as
vacuum polarization, etc. Some of the questions raised by the canonical theory were explored in I. In this
third and final paper of the series we examine some of the consequences of the covariant theory.

* This research was supported in part by the Air Force Office of Scientific Research under Grant
AFOSR-153-64 and in part by the National Science Foundation under Grant GP7437.
† Permanent address.
1 B. S. DeWitt, Phys. Rev. 160, 1113 (1967); preceding paper, ibid. 162, 1195 (1967). These papers will be
referred to as I and II, respectively. The notation of the present paper is the same as that of II, which
should be consulted for the definition of unfamiliar symbols, e.g., $S_n$ for the n-pronged bare vertex and
$V_{(\alpha i)\beta}$ for the asymmetric vertex coupling real and fictitious quanta.

Armed with the formalism constructed in II one can in principle carry out the calculation of any
micro-process to any order of perturbation theory in a manner which is completely invariant and unambiguous
except for the arbitrary high-energy cutoff which must be introduced to render divergent integrals finite.
A few of these calculations have actually been performed, and the only thing which prevents more of them
from being done is the extreme tediousness of the algebra involved and the lack of any experimental
motivation for them. It is a pity that Nature displays such indifference to so intriguing and beautiful a
subject, for the calculations themselves are of considerable intrinsic interest. The present paper contains
several examples. They are by no means exhaustive but have been selected as useful landmarks in a still
largely unexplored territory. Not all of these were originally carried out by the author, but it is hoped
that their unified presentation here will make their results more accessible than hitherto.

Section 2 begins with the rules of calculation in momentum space. The basic structural elements of the
theory, namely the propagators for real and fictitious quanta, the vertices $S_3$, $S_4$,
$V_{(\alpha i)\beta}$, and the coupling with matter fields, are given for both the Yang-Mills and
gravitational fields. The standard Feynman rules are summarized. The results of a few lowest-order
scattering calculations based on these rules are given and discussed in Sec. 3. Included are the cross
section for gravitational scattering of two scalar particles, the cross section for scattering of gravitons
by scalar particles, the corresponding annihilation cross section, and the graviton-graviton cross section.
Section 4 is devoted to the problem of gravitational bremsstrahlung. The role of the energy quadrupole
moment tensor and the absence of the forward peak at high energies, characteristic of photon
bremsstrahlung, are noted.

Section 5 discusses some of the problems which arise in renormalization theory. Although the Yang-Mills
theory looks as if it may be renormalizable (provided its infrared difficulties can be disposed of),
quantum gravidynamics is definitely not renormalizable in the usual sense. Tentative proposals for dealing
with this situation are briefly described, as is also the evidence that gravity contains its own cutoff, at
the Planck length. Illustration of the actual details of the renormalization program, by explicit
calculation of a radiative correction, is postponed to Sec. 7.

The gravitational Ward identity and its implications for gravitational form factors are derived in Sec. 6.
The general structure of the radiatively corrected 1-graviton vertex is displayed in the case of a scalar
particle. It is emphasized that the gravitational Ward identity is only one of an infinity of identities
relating the many-graviton vertex functions of the theory. The need for Ward identities can be eliminated
by computing radiative corrections directly in coordinate space rather than in momentum space. An example
of such a calculation is given in Sec. 7, where the contribution of conformal metric fluctuations to the
vacuum-to-vacuum amplitude is summed to all orders. The calculation, which is manifestly covariant
throughout, makes use of an integral representation for the amplitude. A résumé is given of that part of
the mathematical theory of covariant Green's functions which is needed.

Section 8 concludes the paper with a review of Weinberg's treatment of the infrared problem (see Ref. 37).
If Yang-Mills quanta are assumed to be massless then, since they can act as their own sources, they give
rise to the special infrared divergences which plague massless electrodynamics. Weinberg showed that gravity
miraculously escapes these difficulties; its infrared divergences can be handled by the standard methods
familiar in ordinary quantum electrodynamics. His proofs, however, were incomplete, since he did not have
available a fully elaborated quantum theory. In particular he used the DeDonder gauge without taking into
account the fictitious quanta. It is not difficult to show that the fictitious quanta contribute negligibly
in the infrared limit. Weinberg's results are therefore rigorous.

2. RULES OF CALCULATION IN MOMENTUM SPACE

We begin with the vertex functions for the Yang-Mills field interacting with itself. We have seen in II
that when the standard field variables are used only $S_3$ and $S_4$ are nonvanishing for this case. In
momentum space these become (apart from a $\delta$ function expressing conservation of momentum)

[Equations (2.1) and (2.2), the momentum-space forms of $S_3$ and $S_4$, are missing from the source.]

The correspondence of momenta with indices is $p \leftrightarrow \alpha\mu$, $p' \leftrightarrow \beta\nu$,
$p'' \leftrightarrow \gamma\sigma$, $p''' \leftrightarrow \delta\tau$. All momenta are incoming (to the
vertex), and momentum conservation implies $p + p' + p'' = 0$ for $S_3$ and $p + p' + p'' + p''' = 0$ for
$S_4$. Indices on the structure constants are raised and lowered by means of the Cartan metric
$\gamma_{\alpha\beta}$. When all indices are in the lower position the structure constants are completely
antisymmetric.

In addition to the above vertices, the fictitious vertex $V_{(\alpha i)\beta}$ is needed for the
calculation of radiative

162 QUANTUM THEORY OF GRAVITY. III 1241

corrections. For the Yang-Mills field it takes the form

V~au,,o~pt -+ - i ~ , p ~ $ = - i ~ , , p ( p ~ + p ) . (2.3)

The propagators for the normal and fictitious quanta are, respectively,

$G \to \gamma^{\alpha\beta}\eta_{\mu\nu}/p^2,$ (2.4)

$\hat{G} \to \gamma^{\alpha\beta}/p^2,$ (2.5)

with $p^2$ being understood to have the usual small negative imaginary part.

The corresponding quantities for the gravitational field are much more complicated. In this case we shall employ the momentum-index combinations $p\,\mu\nu$, $p'\,\sigma'\tau'$, $p''\,\rho''\lambda''$, $p'''\,\iota'''\kappa'''$. The vertices must not only be symmetric in each index pair but must also remain unchanged under arbitrary permutations of the momentum-index triplets. At least 171 separate terms are required in the complete expression for $S_3$ in order to exhibit this full symmetry, and for $S_4$ the number is 2850. However, these numbers can be greatly reduced by counting only the combinatorially distinct terms² and leaving it understood that the appropriate symmetrizations are to be carried out. In this way $S_3$ is reduced to 11 terms and $S_4$ to 28 terms, as follows:

6SS --f 6QP&8 d P p * *X i

The Sym standing in front of these expressions indicates that a symmetrization is to be performed on each index pair $\mu\nu$, $\sigma'\tau'$, etc. The symbol P indicates that a summation is to be carried out over all distinct permutations of the momentum-index triplets, and the subscript gives the number of permutations required in each case.

Expressions (2.6) and (2.7) can be obtained in a straightforward manner by repeated functional differentiation of the Einstein action. This procedure, however, is exceedingly laborious. A more efficient (but still lengthy) method is to make use of the hierarchy of identities (II, 17.31). It is a remarkable fact that once $S_2{}^0$ is known all the higher vertex functions, and hence the complete action functional itself, are determined by the general coordinate invariance of the theory. It is convenient, in the actual computation of the vertices via (II, 17.31), to invent diagrammatic schemes for displaying the combinatorics of indices. Since each reader will devise the scheme which suits him best we shall not shackle him by describing one here. We also make no attempt to display $S_5$ or any higher vertices.

The vertex $V_{(\alpha i)\beta}$ has the following form for the gravitational field:

v ( $ o f l r ) ,3 )sp[2Q,Q06,r- p,,PfrqU7 +(P~p-Q,Qu)~~+Q.P~~o~,rl, (2.8)

where the momentum-index combinations are $p\,\mu$, $p'\,\nu$, $p''\,\sigma\tau$ and the symmetrization is to be performed on the index pair $\sigma\tau$. The propagators for the normal and fictitious quanta are given by

$G \to (\eta_{\mu\sigma}\eta_{\nu\tau}+\eta_{\mu\tau}\eta_{\nu\sigma}-\eta_{\mu\nu}\eta_{\sigma\tau})/p^2,$ (2.9)

$\hat{G} \to \eta_{\mu\nu}/p^2.$ (2.10)

² The choice of terms is not completely unique since momentum conservation may be used to replace a given term by other terms. We give here what we believe (but have not proved) to be the expressions containing the smallest number of terms.
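The index structure of the graviton propagator (2.9) lends itself to a quick mechanical check. The sketch below is an illustration added here and is not part of the original paper. It assumes that the numerator of (2.9) is $\eta_{\mu\sigma}\eta_{\nu\tau}+\eta_{\mu\tau}\eta_{\nu\sigma}-\eta_{\mu\nu}\eta_{\sigma\tau}$ and that the Minkowski metric has signature $(-,+,+,+)$; the function names are hypothetical. It verifies the symmetry in each index pair and under interchange of the pairs, and that in four dimensions the numerator contracted with itself is four times the identity $\tfrac12(\eta_{\mu\sigma}\eta_{\nu\tau}+\eta_{\mu\tau}\eta_{\nu\sigma})$ on symmetric tensors.

```python
from itertools import product

# Minkowski metric; the signature (-,+,+,+) is an assumption of this sketch.
ETA = [[-1, 0, 0, 0],
       [0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1]]
IDX = range(4)


def numerator(m, n, s, t):
    """Assumed numerator of (2.9): eta_ms eta_nt + eta_mt eta_ns - eta_mn eta_st."""
    return ETA[m][s] * ETA[n][t] + ETA[m][t] * ETA[n][s] - ETA[m][n] * ETA[s][t]


def has_pair_symmetries():
    """Symmetric in (m, n), in (s, t), and under exchange of the two pairs."""
    return all(
        numerator(m, n, s, t)
        == numerator(n, m, s, t)
        == numerator(m, n, t, s)
        == numerator(s, t, m, n)
        for m, n, s, t in product(IDX, repeat=4)
    )


def squared(m, n, s, t):
    """Contract the numerator with itself over one index pair.

    ETA is diagonal with entries +1 or -1, so raising index a
    simply multiplies by ETA[a][a].
    """
    return sum(
        numerator(m, n, a, b) * ETA[a][a] * ETA[b][b] * numerator(a, b, s, t)
        for a, b in product(IDX, repeat=2)
    )


def squares_to_four_times_identity():
    """In 4 dimensions the contraction equals 2 (eta_ms eta_nt + eta_mt eta_ns)."""
    return all(
        squared(m, n, s, t)
        == 2 * (ETA[m][s] * ETA[n][t] + ETA[m][t] * ETA[n][s])
        for m, n, s, t in product(IDX, repeat=4)
    )


if __name__ == "__main__":
    print("pair symmetries hold:", has_pair_symmetries())
    print("squares to 4 x identity:", squares_to_four_times_identity())
```

The second property is the statement that the numerator acts on a symmetric tensor as twice the trace reversal, which is its own inverse only when the trace of the metric is 4.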

1242 BRYCE S. DEWITT 162
If one wishes to calculate processes involving the interaction of the Yang-Mills and/or gravitational field with matter, additional vertices describing this interaction must be included. As prototypes of such vertices, we shall display those which arise from interactions with scalar (or pseudoscalar) particles. The latter particles contribute to the total action functional an expression of the form

where the covariant derivative is defined in Table I of II and where

~ - 7 ,rG,T= -Ga-, @= (2.12)

7 being the matrix which connects the two forms of a self-contragredient representation (of the Yang-Mills Lie group) generated by the matrices $G_\alpha$ and $-G_\alpha^{\sim}$, respectively. We find

+ (PpPr+PrPp)q+ (Pp? - (tP+fp)q- @~+~~)q. (2.16)

The corresponding vertices which describe the interaction of the gravitational and/or Yang-Mills fields with particles having spin are obtained by straightforward computation from the pertinent action functional. The latter is obtained in each case via the principle of minimal coupling (which, in the case of gravity, is nothing but the strong equivalence principle) from the corresponding action functional in the absence of gravitational and Yang-Mills fields, by replacing ordinary derivatives by covariant derivatives, the Minkowski metric $\eta_{\mu\nu}$ by $g_{\mu\nu}$ and the volume element $dx$ by $g^{1/2}dx$. We do not give here the results of such calculations for particles with spin but merely point out (what is more useful for the reader) that the three-pronged vertices, when sandwiched between normalized wave functions, always reduce, in the limit of zero momentum transfer, with particle momenta on the mass shell, to precisely the forms (2.13) and (2.15) regardless of the magnitude of the particle spin. This may be proved in each instance as a straightforward consequence of the gauge invariance of the theory and, when extended to the radiatively corrected vertices, constitutes a boundary condition on the Yang-Mills and gravitational form factors.³ [See also Sec. 6.]

It is to be emphasized that the inclusion of additional fields in no way affects the formal theoretical structure developed in II. The topology and invariance properties of diagrams remain completely unchanged. One simply permits the field indices i, j, etc., to extend over a greater range of values in order to accommodate the components of the new fields which have been added. The only differences are differences of detail such as, for example, the sign modifications due to statistics which appear when some of the added components are those of fermion fields, or changes in the structure of the invariance group which arise from having both the Yang-Mills and gravitational fields simultaneously present and interacting with each other.

The rules for combining vertices and propagators into transition amplitudes are completely standard. With the notational conventions of the present paper they may be summarized as follows: (1) An expression such as (2.1), (2.2), (2.3), (2.6), etc., for each vertex; (2) an expression such as (2.4), (2.5), (2.9), (2.10), etc., for each propagator; (3) a factor $(-i)/(2\pi)^4$ for each independent closed loop; (4) an additional factor $(-1)$ for each closed fermion or fictitious-quantum loop, or when necessary to assure antisymmetry of fermion amplitudes; (5) an over-all factor $i(2\pi)^4$ times a $\delta$ function assuring total energy-momentum conservation; (6) a wave function $u^i{}_A$ (see Table II of II) or its complex conjugate evaluated at $x=0$ for each external line; (7) integration over all the independent momenta.

Gauge invariance may be invoked as a useful consistency check in all calculations. However, it must be applied to the entire amplitude for a given process and not merely to a single diagram. It is therefore algebraically more laborious than corresponding checks in electrodynamics. It is no longer possible to exploit charge conservation by following individual lines

³ These are analogs of the electromagnetic form factors. The gravitational form factors are also sometimes referred to as stress-energy, mass, or mechanical form factors.

⁴ These are, in fact, the most important differences. It is worth mentioning that when fermion fields are included it is usually convenient to replace the metric field $g_{\mu\nu}$ by a vierbein field. Otherwise the group transformation laws are no longer linear. [See B. S. DeWitt and C. M. DeWitt, Phys. Rev. 87, 116 (1952).] We also mention that the combined vierbein-general-coordinate-transformation group has the structure of a semi-direct product based on the automorphisms of the vierbein group under general coordinate transformations. In the combined group only the vierbein group is an invariant subgroup. The coordinate transformation group is its factor group. Similar statements apply to the combined Yang-Mills-general-coordinate-transformation group. The analysis of these cases is therefore correspondingly complicated.

through diagrams, for now the conserved quantity (Yang-Mills charge, energy-momentum) leaks all over every diagram. Moreover, when Yang-Mills quanta or gravitons interact with themselves, the closed loops form traffic jams of spurious charge which can be unsnarled only by calling the fictitious quanta to the rescue.

3. SCATTERING CROSS SECTIONS

We now display some of the lowest-order amplitudes and scattering cross sections which the covariant theory yields. One of the simplest is the amplitude for the scattering of two identical scalar particles by exchange of a single Yang-Mills quantum. This has the form

+ exchange and virtual annihilation terms, (3.1)

where

$q = p_1' - p_1 = p_2 - p_2',$ (3.2)

j.,=MEfEj-WyGjt^+pJ, (3.3)

the $\chi$'s being the internal (Yang-Mills group) states of the particles and the remaining notation being conventional. The same form (3.1) also holds for particles with spin, but the expression for the current $j_{\alpha\mu}$ is then more complicated.

Since the initial and final momenta are on the mass shell we have the conservation laws

$j_\alpha(1)\cdot q = j_\alpha(2)\cdot q = 0,$ (3.4)

which permit the scattering amplitude to be reexpressed in the form

+ exchange and virtual annihilation terms, (3.5)

where a factor $i(2\pi)^{-2}\delta(p_1'+p_2'-p_1-p_2)$ has been removed, and the 3-axis has been chosen in the direction of the spatial part $\mathbf{q}$ of the space-like 4-vector $q$. The first term of (3.5) represents the instantaneous "Coulomb" interaction of the particles; the second represents a "delayed" interaction propagated by transverse quanta, the factors $j_{\alpha 1}$ and $j_{\alpha 2}$ being separately coupled to the two states of linear polarization of these quanta.

The corresponding amplitude arising from exchange of a graviton is

X 0)" V+'T<r-i7''VT)2rl,T(2)/s!

+ exchange and virtual annihilation terms, (3.6)

where

T,t= i (&)-1*[pl.pf,+p,p'lt- !, (p p'+m'fl. (3.7)

Again we have conservation laws

(3.8)

which permit the amplitude to be recast in the form

i{r,(i)rw,(2)-4r0i(i)7'o,(2)-4rM(i)r(2)

+ exchange and virtual annihilation terms. (3.9)

The first term yields an instantaneous "Newtonian" interaction, while the second gives rise to a "delayed" interaction propagated by transverse gravitons. In this case the factors which couple separately to the two states of linear polarization are $T_{11}-T_{22}$ and $2T_{12}$, respectively.

From (3.6) it is straightforward to compute the differential cross section for gravitational scattering of identical scalar particles in the center-of-mass frame. One finds⁵

d<r G2E2r(l+3z>2)(l-zi2)+4zi2(l+i>2) cos2(0/2)
dQ, 16 ji2 sin2(0/2)
sin2(0/2)
if cos2 (0/2)
+ (3-ii2)(l+2)+2ii4sin201 , (3.10)

where $v = |\mathbf{p}|/E$ and the gravitation constant has been reintroduced through the units convention $16\pi G = 1$. The nonrelativistic and extreme relativistic limits of this cross section are, respectively,

f
dtr\ G2m?r 1 1
) =H + +3 (3.11)
i/NH 16 LiPsirtye n cos2^0
2

(3.12)
fi/E

In a similar manner one may compute the cross section for scattering of gravitons by scalar particles. The relevant diagrams are shown in Fig. 1, the heavy lines denoting particles and the light lines gravitons. Diagrams (a) and (b) vanish in the rest frame of the target particle, and one finds for unpolarized gravitons⁵

da- G'm2
[l+2e sin2|0]2 sin4J0
sin20
], (3.13)
/rf<r\
(3.14)
\<to/ NK
(3.15)

⁵ C. F. Cooke, Ph.D. thesis, University of North Carolina, 1964 (unpublished).

where e is the energy of the graviton measured in units of m. It will be noted that these cross sections have no resemblance to those for Compton scattering, but on the contrary, continue to display the sharp forward Rutherford peak characteristic of long-range interactions. This feature is due to diagram (d) of Fig. 1 whose presence, as may be readily checked, is essential for the gauge invariance of the scattering amplitude. Owing to the equivalence principle gravitons, like photons, are deflected by a gravitational field (in particular by the long-range static field of any material particle), and the above cross sections are dominated by this effect.

FIG. 1. Lowest-order diagrams, (a) to (d), for scattering of a graviton by a material particle. The heavy line denotes the particle and the light lines denote gravitons.

By the well-known substitution rule the diagrams of Fig. 1 yield also the amplitude for annihilation of a pair of scalar particles into gravitons. We record here only the low- and high-energy limits of the total annihilation cross section in the center-of-mass frame:

UNR= 2T@m2/v (3.16)

UER= ( 3 8 ~ / 3 ) f f E . (3.17)

The cross section for the inverse process, namely, the production of a scalar pair by colliding gravitons (again in the center-of-mass frame) is identical with (3.17) at high energies. Near threshold, on the other hand, it is given by

( ~ ~ - - E> 1.
u= 2 ~ f f m ~ l)lz/c, (3.18)

The only elastic process which remains to be considered is the scattering of one graviton by another. This process has some unusual features. It turns out that the helicity of the colliding gravitons is individually conserved. That is, there is no spin flip, in spite of the presence of derivative coupling. If both gravitons are right (left) handed before collision then both are right (left) handed after the collision. If one is right handed and the other left handed then they maintain this relationship also, through the collision.

The helicity of extremely relativistic particles, and of massless quanta in particular, is notoriously rigid. In the classical theory, for example, the spin of such a particle suffers no precession under geodetic motion in an external gravitational field but remains always pointing parallel or antiparallel to the trajectory. However, no general principle has yet been discovered which implies that helicity conservation must hold to all orders of perturbation theory. It has so far been demonstrated only in lowest order, by carrying out a brute-force computation of the relevant amplitudes. The tediousness of the algebra involved in obtaining the graviton-graviton cross section may be inferred from the complexity of the vertex functions (2.6) and (2.7) which are involved in the diagrams which represent the amplitude (Fig. 1 with the heavy lines replaced by graviton lines). Fortunately, the presence of the polarization tensors in the external-line wave functions, and the momentum condition $p^2 = 0$ for free quanta, eliminate many of the terms from these expressions. Nevertheless, a large amount of cancellation between terms still has to be dug out of the algebra, and this, combined with the fact that the final results are ridiculously simple, leads one to believe that there must be an easier way. The cross sections which one finds are

f4--3 sin20T, (3.19)

(3.20)

showing again the forward Rutherford peaking.

We shall not record here the corresponding cross sections involving Yang-Mills quanta, since these depend, in their finer details, on which Lie group is chosen as generator of the Yang-Mills group and on which representations are chosen for the material particles. There is also a serious difficulty with the Yang-Mills field in regard to the infrared catastrophe, which will be discussed in Sec. 8. Since our primary interest in this article is the gravitational field, we refer the reader with a special interest in Yang-Mills cross sections to the dissertations of Remler⁶ and Dotson.⁷ It is, however, perhaps worth remarking that in the case of the scattering of one Yang-Mills quantum by another the phenomenon of helicity conservation is again found to hold, with or without the inclusion of graviton exchange forces in the total amplitude, and regardless of the choice of the Lie group. Moreover, an extension of the helicity conservation rule to processes involving real gravitons in interaction with Yang-Mills quanta apparently exists. Thus individual helicities remain unchanged when a graviton and a Yang-Mills quantum collide elastically. If the diagrams contributing to this process are turned on their sides so as to yield the amplitudes for annihilation of two Yang-Mills quanta into a pair of gravitons (or the reverse process) further selection rules emerge. One

⁶ E. A. Remler, Ph.D. thesis, University of North Carolina, 1964 (unpublished).

⁷ A. C. Dotson, Ph.D. thesis, University of North Carolina, 1964 (unpublished).

finds that it is impossible to produce two gravitons having opposite helicities by annihilation of Yang-Mills quanta, or conversely to produce a Yang-Mills pair having opposite helicities by the reverse process. The quanta in both the initial and final states must have identical helicities if the amplitude is to be nonvanishing. Helicity selection rules exist even for the process in which two Yang-Mills quanta coalesce to produce a single Yang-Mills quantum and a graviton. If both initial quanta have the same helicity the final quanta must have this helicity too; if the initial helicities are opposite the final helicities must be opposite. The same obviously holds for the reverse process.

4. GRAVITATIONAL BREMSSTRAHLUNG

Since the problem of gravitational radiation from accelerating masses has bedeviled classical relativists for years it is a pleasant surprise to discover that its treatment within the quantum framework is quite simple.⁸ Consider a scattering diagram in which one of the lines represents a scalar particle (real or virtual) of momentum $p$. Let the diagram be modified by the emission of a graviton of momentum $q$ from this line. If the momenta of all lines subsequent to the inserted graviton vertex are held fixed while those prior to the vertex are adjusted in such a way as to conserve momentum and keep external lines on the mass shell, then the only additional effect of the graviton emission is to introduce into the corresponding amplitude a factor

(2nY2 44
P P (PY+4Y)+PY(PP+PP) - PPYcmz+P. (P+ 411
X (4.1)
9
( p + qI2+ m2- i0

which follows from Eq. (2.15) and Table II of II. Alternatively, if the momenta prior to the vertex are held fixed we get a factor which differs from (4.1) by the replacement $q \to -q$.

If the graviton is emitted from an external line these factors reduce to

1 PPfJ+tdP,@+PYQ,-- l l r d Q)
~-
e*P*e**
1 (4.2)
(27r)32 dff @+ 2sP. 4- io

where $\eta = +1$ or $-1$ according as the external line is outgoing or incoming, and $p$ is held fixed on the mass shell. In the long-wavelength limit $q \to 0$ (4.2) itself reduces to

and simply multiplies the original amplitude. This limiting form actually holds for all external lines, regardless of the spin character of their associated particles. It even holds when the external line is a graviton line, provided the emission vertex is inserted not merely into a single diagram but into the sum of all diagrams contributing to the original amplitude. This may be verified in a straightforward manner by plugging in the 3-pronged graviton vertex (2.6) and eliminating the terms involving $q$. Of the remaining terms only those survive which yield a net contribution of the form (4.3); the rest disappear in virtue of the gauge invariance of the total original amplitude.⁹

The multiplicative factor (4.3) exhibits the well-known infrared divergence and can be obtained from a purely classical model. We note that the infrared divergence shows up only when the emission takes place from lines on the mass shell; it does not occur when the emission is from internal lines of a scattering diagram. The external lines therefore dominate the soft graviton emission. This means that the precise details of the scattering process have little relevance in the limit $q \to 0$, and that the long-wavelength end of the emission spectrum is determined primarily by the asymptotic trajectories of the incoming and outgoing particles, just as in the case of photon bremsstrahlung. For wavelengths large compared to the space-time region in which the collision takes place (the size of this region is determined by the magnitudes of typical energies exchanged in the collision) the effective graviton source is a stress tensor of the form

TP(z)= c qnmnVnPvn.J 0
6(x- Vn7)d7, (4.4)

which idealizes the particles to classical points colliding at the coordinate origin. Here $m_n$ and $V_n$ are, respectively, the mass and 4-velocity of the nth particle, and the sign factor $\eta_n$ tells whether the particle is incoming or outgoing. The summation is over all the external lines, and the velocities are subject to the energy-momentum conservation law:

$\sum_n \eta_n m_n V_n = 0.$ (4.5)

The classical emission spectrum is obtained by projecting (4.4) onto the graviton wave functions $u_{\mu\nu\pm}(x,q)$ (see Table II of II). The corresponding quantum amplitude is

wn, (e**. Vn)
=c n 2(27r)32z/p0 v,.q--iq,o
(4.6)

⁸ The comparison is actually unfair. The questions which classical relativists ask are usually quite detailed (e.g., the precise damped motion of radiating sources, or the nonlinear properties of coherent waves of large amplitude) and are inevitably much more difficult.

⁹ The gauge invariance holds in every order of perturbation theory.

which, in view of the relation $p_n = m_n V_n$, is just (4.3) summed over all the external lines.

When the collision is nonrelativistic (4.6) reduces to

(4.7)

where the graviton gauge is chosen so that the components $e_{\pm 0}$ of the polarization vectors $e_\pm$ vanish, and $\Delta Z$ is the change in the spatial integral of the total 3-stress dyadic as a result of the collision:

$\Delta Z = \Delta \int T\, d\mathbf{x} = \sum_n \eta_n \mathbf{p}_n \mathbf{v}_n,$ (4.8)

$T = \sum_n \theta(\eta_n x^0)\, \mathbf{p}_n \mathbf{v}_n\, \delta(\mathbf{x} - \mathbf{v}_n x^0),$ (4.9)

$\mathbf{v}_n = \mathbf{V}_n / V_n^0 = \mathbf{p}_n / E_n \approx \mathbf{p}_n / m_n.$ (4.10)

Now it is well known¹⁰ that energy-momentum conservation permits the integral of the 3-stress dyadic to be reexpressed as one half the second time derivative of the second moment $\int \mathbf{x}\mathbf{x}\, T^{00}\, d\mathbf{x}$ of the energy density. Moreover, since $e_\pm \cdot e_\pm = 0$, the trace of $\Delta Z$ may be removed from (4.7).¹¹ Therefore the emission amplitude may be written in the alternative form:

t(2~QO)-~~e~*.~(d~Q/dt2).e~*, (4.11)

where $\Delta(d^2Q/dt^2)$ denotes the change in the second time derivative of the energy quadrupole moment tensor

$Q = \int (\mathbf{x}\mathbf{x} - \tfrac{1}{3}\mathbf{1}x^2)\, T^{00}\, d\mathbf{x},$ (4.12)

showing that soft gravitons are emitted predominantly in the quadrupole mode.

It is of interest to examine the angular distribution of the emitted radiation. From (4.6) one sees that each external line makes a contribution to the emission amplitude, which has an angular distribution of the form

(4.13)

where $\theta$ is the angle between $\mathbf{v}$ and $\mathbf{q}$, and $\phi$ is a helicity phase angle. In the case of photon bremsstrahlung the sine appears linearly instead of quadratically in the numerator, with the consequence that for relativistic collisions ($v \approx 1$) the emission is concentrated sharply in the forward directions of all the particles (initial as well as final). This peaking may be attributed to the individual Lorentz-contracted Coulomb fields, which resemble bundles of plane waves having momenta confined to narrow cones. These bundles (particularly their outer regions) have difficulty readjusting to the altered particle trajectories arising from the collision and hence partly escape as radiation.

In the gravitational case the sharp forward emission is absent.¹² In fact for an extremely relativistic collision ($|\mathbf{p}_n| \approx E_n$) which is confined to a plane (e.g., 2-particle scattering) it is easy to verify that the total sum (4.6) yields an amplitude which vanishes for emission in the plane.¹³ This implies that, unlike photon emission, graviton emission is a cooperative phenomenon which cannot be traced to the individual particle fields. Indeed the real gravitational field of a particle, namely the Riemann tensor, falls off as the inverse cube rather than the inverse square of the distance, and hence its outer regions contribute negligibly to the emission. This has obvious implications for investigations of classical 2-body radiation as well as for attempts to introduce Weizsäcker-Williams approximation schemes into quantum calculations.

5. RENORMALIZATION AND THE PLANCK LENGTH

In lowest-order perturbation theory the formal rules of the manifestly covariant theory yield results which agree with the classical theory in the correspondence principle limit. In higher orders, divergences appear, just as they do for other field theories, and almost nothing is known about how to extract finite and physically meaningful radiative corrections from the results. In the case of quantum gravidynamics the severity of the divergences is such that the theory is not, by standard criteria, renormalizable. This is due to the quadratic momentum dependence of the vertices $S_n$ ($n \ge 3$), which in turn may be traced to the dependence of the light cone on the background field, i.e., to the field dependence of the coefficients of the second time derivatives appearing in $S_2$. Thus by counting momentum powers one finds for the superficial degree of divergence of any diagram

$D = -2L_i + 2\sum_n V_n + 4K,$ (5.1)

where $L_i$ denotes the number of internal lines, $V_n$ the number of n-pronged vertices, and $K$ the number of independent momentum integrations. Now it is not difficult to show that

$K = L_i - \sum_n V_n + 1.$ (5.2)

¹⁰ See, for example, L. D. Landau and E. M. Lifshitz, The Classical Theory of Fields, translated by M. Hamermesh (Addison-Wesley Publishing Company, Inc., Reading, Massachusetts, 1962), rev. 2nd ed.

¹¹ In view of the nonrelativistic energy conservation law $\sum_n \eta_n (m_n + \tfrac{1}{2}\mathbf{p}_n \cdot \mathbf{v}_n) = 0$, this trace is just twice the rest mass lost in the collision and already vanishes for elastic collisions.

¹² This was first pointed out by R. P. Feynman in a mimeographed letter to V. F. Weisskopf dated January 4 to February 11, 1961 (unpublished).

¹³ Introducing unit vectors $\hat{\mathbf{n}}$ and $\mathbf{a}_n$ in the directions of $\mathbf{q}$ and $\mathbf{p}_n$, respectively, one may write the amplitude in this case in the form

constx c qa.-=constX c q,~.(~+fi.a~,

which vanishes by energy-momentum conservation.
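The power counting in (5.1) and (5.2) is easy to mechanize. The following sketch is an illustration added here, not part of the original paper; the function names and the sample diagrams are hypothetical choices. It evaluates $K$ and $D$ for a few simple diagrams of the pure gravitational theory, each vertex contributing two powers of momentum and each propagator removing two, and confirms that in every case $D$ depends on the diagram only through $K$, as $D = 2(K+1)$.

```python
def loop_count(internal_lines, vertices):
    """Eq. (5.2): K = L_i - sum_n V_n + 1, with vertices given as {n: V_n}."""
    return internal_lines - sum(vertices.values()) + 1


def degree_of_divergence(internal_lines, vertices):
    """Eq. (5.1): D = -2 L_i + 2 sum_n V_n + 4 K for the gravitational field."""
    k = loop_count(internal_lines, vertices)
    return -2 * internal_lines + 2 * sum(vertices.values()) + 4 * k


# Illustrative diagrams, each with two external lines: (L_i, {n: V_n}).
SAMPLES = {
    "one loop, two 3-pronged vertices": (2, {3: 2}),
    "one loop, one 4-pronged vertex": (1, {4: 1}),
    "two loops, four 3-pronged vertices": (5, {3: 4}),
}

if __name__ == "__main__":
    for name, (lines, verts) in SAMPLES.items():
        k = loop_count(lines, verts)
        d = degree_of_divergence(lines, verts)
        # D should equal 2 (K + 1) whatever the mix of vertices.
        print(f"{name}: K = {k}, D = {d}, 2(K+1) = {2 * (k + 1)}")
```

Substituting (5.2) into (5.1) by hand gives the same result: $-2L_i + 2\sum_n V_n = -2(K-1)$, so $D = -2(K-1) + 4K = 2(K+1)$.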


Therefore

$D = 2(K+1), \quad K \ge 1,$ (5.3)

which increases without limit as the order of the diagram increases.

In the case of the Yang-Mills field the situation is better. Here we have

$D = -2L_i + V_3 + 4K,$ (5.4)

which, in combination with the readily verified combinatorial law

$L_e + 2L_i = \sum_n nV_n = 3V_3 + 4V_4,$ (5.5)

yields

$D = 4 - L_e,$ (5.6)

where $L_e$ is the number of external lines. In order to compensate the divergences one may introduce counter terms into the original Lagrangian in the conventional manner. The most divergent counter term is always the simplest. It is necessarily of the form $\mathrm{const} \times F^{\alpha}{}_{\mu\nu}F_{\alpha}{}^{\mu\nu}$, provided the divergence has been handled in a manifestly group-invariant manner. But such a counter term can be detected by as many as four external lines. Hence D is never actually greater than zero. This is quite analogous to the situation which occurs with vacuum polarization diagrams in quantum electrodynamics: Although $L_e = 2$, D is reduced from 2 to 0 by gauge invariance. One therefore expects that, with proper handling of the overlapping divergences, a careful analysis will show that none of the ultraviolet divergences of the Yang-Mills theory is worse than logarithmic, and that only a single counter term, of the above mentioned type, is needed for each group-invariant set of diagrams. If that is true it is then not difficult to show that renormalization merely rescales the structure constants $c^{\alpha}{}_{\beta\gamma}$. Mathematically, this is equivalent to a rescaling of coordinates in the group manifold; physically, it corresponds to a change in the strength of the coupling of the Yang-Mills field to itself and to other Yang-Mills-charge bearing fields.

It should be remarked that although the above results [Eqs. (5.3) and (5.6)] have been stated for the case in which each field interacts only with itself, they hold also when other fields bearing stress-energy or Yang-Mills charge are present, provided the spins of the added fields are not greater than ½ and their other mutual interactions are renormalizable.¹⁴ Unfortunately, in the case of the Yang-Mills field the ultraviolet divergences are not the whole story; infrared difficulties of a special type also make their appearance. These will be discussed in Sec. 8.

In the case of gravity there are no infrared problems beyond those which can be handled by conventional methods. Equation (5.3), however, casts a rather dismal light on the ultraviolet problem. Faced with its brutal consequences there are several paths one may try to follow to make life bearable. One of these is to soften the degree of divergence by abandoning S-matrix unitarity (i.e., positive definiteness of Hilbert space) through the introduction of field equations of the fourth differential order. This may be accomplished by adding terms of the form $\int g^{1/2}\,{}^{(4)}R^2\,dx$ and $\int g^{1/2} R_{\mu\nu}R^{\mu\nu}\,dx$ to the Einstein action, which changes Eqs. (5.1) and (5.3) to

$D = -4L_i + 4\sum_n V_n + 4K = 4$ (5.7)

for all diagrams of order greater than zero. Such a procedure in effect introduces a separate unit of mass (or length) into the theory, and if this mass is chosen sufficiently big, the S matrix will be nearly unitary, significant departures from unitarity occurring only under extreme conditions, when collision energies approach that of the unit of mass.

Nevertheless, it would be nice if any breakdown in conventional ideas which may be necessary were to emerge from the quantized Einstein theory in its unmutilated form. There is already a unit of mass in the theory: the absolute unit $(\hbar c/16\pi G)^{1/2} = 3.07 \times 10^{-6}$ g $\sim 10^{18}$ BeV, and one is loath to introduce another. One might try to use the extra mass merely as a regulator, in a spirit similar to that of the ξ-limiting proposal of Lee and Yang¹⁵ for the charged vector boson. Equation (5.7) suggests that the regulated theory may be renormalizable, requiring only three counter terms, respectively, quartically, quadratically, and logarithmically divergent. If this is so one might attempt to let the regulator mass become infinite after the renormalization has been performed. However, there is no guarantee that the renormalized amplitudes will themselves remain finite in the limit, nor that unitarity, which is violated by the regulator, will then be restored. If unitarity stays violated one is not sure whether this represents a fundamental feature of the quantized Einstein theory or is merely a consequence of the regulator approach; one would be inclined to suspect the latter. Halpern¹⁶ has criticized unorthodox uses of regulators in handling nonrenormalizable field theories, and has shown that they often lead to illegal modifications of analytic properties.

If regulators are to be excluded then perturbation theory cannot be used except in a formal way. One must necessarily sum infinite classes of diagrams and hope that the increasingly strong divergence of the successive terms of the series, as expressed by Eq. (5.3), will lead to high-energy damping and a finite result for the total amplitude. The author has shown¹⁷ that this hope is actually fulfilled for at least one

¹⁴ In the case of gravity additional fields of spin 1 are allowed if they are massless.

¹⁵ T. D. Lee and C. N. Yang, Phys. Rev. 128, 885 (1962); 128, 899 (1962).

¹⁶ M. B. Halpern, Phys. Rev. 140, B1570 (1965).

¹⁷ B. S. DeWitt, Phys. Rev. Letters 13, 114 (1964).

1248 BRYCE S. DEWITT 162
simple class of diagrams, namely, those which represent two scalar particles exchanging gravitons in the ladder approximation. It turns out that the "leading terms" (i.e., the most divergent) of the Bethe-Salpeter amplitude can be summed exactly, and, owing to certain remarkable cancellations, the sum of the ladder-type contributions to the gravitational self-energy can be expanded in a power series in the bare mass, with no approximations whatever. The method can also be extended to the case of charged scalar particles, with one or more of the graviton ladder rungs replaced by photons, and a simple expression can be obtained for the lowest-order electromagnetic self-energy. The self-energies and renormalization constants found in this way are all finite.

The finiteness of these quantities may be traced to the behavior of the particle-particle scattering amplitude. In the limit of very high momentum transfer the singularity of the gravitational interaction kernel is displaced off the light cone in coordinate space and onto a hyperboloid lying at a distance $\lambda = (4\hbar G/\pi c^{3})^{1/2} = 1.82\times10^{-33}$ cm in spacelike directions. This is roughly equivalent to endowing the scalar particles with the properties of hard spheres of diameter $\lambda$, and may be regarded as a manifestation of the smearing out of the light cone due to quantum fluctuations.

Similar results have been found for spin-1/2 electrons by Khriplovich,$^{18}$ and there seems to be no reason why, with enough labor, they may not also be extended to particles of higher spin, including the graviton in interaction with itself. Thus gravity may indeed prove to be the universal regulator which renders all field theories finite.

It should be remarked that the self-energy functions which are obtained by summing ladder graphs appear to correspond to "good" spectral functions, which do a minimum of violence to unitarity. This suggests that no illegal analytic operations have inadvertently crept into the summation procedure. An improved calculational method, which insures analytic legality in general, has been developed by Halpern.$^{16}$ He sums first the absorptive parts of any amplitude and then obtains the full amplitude by a dispersion integral. The technique is applicable to gravity theory as well as to other nonrenormalizable theories, and is amenable to N/D approximation schemes. It is probably the safest method currently available, but it is very complicated to apply.

Although the finite results which have been obtained thus far are very suggestive, one must remember that they derive from restricted classes of diagrams. They are therefore not $\gamma$-invariant but depend on the particular gauge chosen for the internal graviton lines. So far calculations have been restricted to those gauges which avoid "dangerous" singularities in the resulting integral equations, or otherwise simplify the computational labor. It is clear that the results can give at best only a qualitative insight into the true analytic structure of the theory.

6. THE GRAVITATIONAL WARD IDENTITY

Although the computational difficulties involved in extracting physical information from quantum gravidynamics are formidable, the theory has a redeeming feature in its general covariance, which serves as a cross check on the consistency of various calculations and imposes constraints on the permissible forms of various amplitudes. One of these constraints has recently been discussed by Brout and Englert.$^{19}$ These authors derive a generalized Ward identity relating the gravitational vertex function of a scalar particle to the self-energy function arising from all its interactions. Their derivation is easily generalized to the case of a particle of arbitrary spin.

Denote the field of the particle by $\varphi^{A}$. In addition to the functions $R^{i}{}_{\alpha}$ (or, in expanded notation, $R_{\mu\nu\sigma'}$) characterizing the coordinate transformation behavior of the gravitational field (see II) we now have corresponding functions $R^{A}{}_{\alpha}$ for $\varphi^{A}$. The explicit structure of these functions may be inferred from Table I of II:

$$R^{A}{}_{\mu'} = -\varphi^{A}{}_{,\mu}\,\delta(x,x') + G^{\nu}{}_{\mu}{}^{A}{}_{B}\,\varphi^{B}\,\delta_{,\nu}(x,x'). \tag{6.1}$$

We note that $R^{A}{}_{\mu'}$ vanishes in the limit $\varphi^{A}\to 0$, and that its functional derivative has the momentum-space form

$$R^{A}{}_{\mu',B''} \to -i\,\delta^{A}{}_{B}\,p''_{\mu} + i\,G^{\nu}{}_{\mu}{}^{A}{}_{B}\,p'_{\nu}, \tag{6.2}$$

in which the association of momenta with indices is $p \leftrightarrow A$, $p' \leftrightarrow \mu'$, $p'' \leftrightarrow B''$ $(p+p'+p''=0)$.

Let us denote the full (radiatively corrected) propagator for the particle by $S^{AB}$. It is the sum of the bare propagator $G^{AB}$ and a function obtained by applying the operator $G^{AB}\,\delta/\delta\varphi^{B}$ twice to the vacuum-to-vacuum amplitude. Since the vacuum-to-vacuum amplitude is an invariant the propagator $S^{AB}$, like $G^{AB}$, transforms in the manner indicated by the position of its indices.$^{20}$ Its inverse must transform contragrediently:

$$S^{-1}{}_{AB,i}\,R^{i}{}_{\alpha} + S^{-1}{}_{AB,C}\,R^{C}{}_{\alpha} = -S^{-1}{}_{CB}\,R^{C}{}_{\alpha,A} - S^{-1}{}_{AC}\,R^{C}{}_{\alpha,B}. \tag{6.3}$$

Equation (6.3) is the gravitational Ward identity. To get it into more familiar form one must reexpress it in momentum space, with all the background fields set equal to zero. In this limit $S^{-1}{}_{AB,i}$ becomes the negative of the gravitational vertex function, which is conven-

$^{18}$ I. B. Khriplovich, report, Siberian Section, Academy of Science, USSR, Novosibirsk, 1965 (unpublished).
$^{19}$ R. Brout and F. Englert, Phys. Rev. 141, 1231 (1966). See also K. Just and K. Rossberg, Nuovo Cimento 40, 1077 (1965).
$^{20}$ This will be true even if $\varphi^{A}$ possesses a gauge group of its own, provided the gauge conditions which determine $G^{AB}$ are covariant. Note that the "background field" now includes $\varphi^{A}$ in addition to the metric field.

162 QUANTUM THEORY OF GRAVITY. III 1249

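tionally denoted by $\Gamma_{\mu\nu}$, the particle indices being suppressed and the index $i$ being replaced by the more explicit $\mu\nu$. Making use of (6.2) and the momentum space form of $R^{i}{}_{\alpha}$, which is given in Table II of II, one readily finds

$$2\Gamma_{\mu\nu}(p,p')\,q^{\nu} = S^{-1}(p)\,p'_{\mu} - S^{-1}(p')\,p_{\mu} - \left[S^{-1}(p)\,G^{\nu}{}_{\mu} + G^{\nu}{}_{\mu}{}^{\dagger}\,S^{-1}(p')\right]q_{\nu}, \tag{6.4}$$

where $p'$ and $p$ are, respectively, the incoming and outgoing particle momenta and $q = p - p'$ is the incoming graviton momentum. This, with the spin terms involving $G^{\nu}{}_{\mu}$ omitted, is the equation given by Brout and Englert. It holds, as a simple consequence of general covariance, no matter how many other fields are coupled to the field $\varphi^{A}$ and involved in the structure of the vertex function.

Now introduce the vertex and wave-function renormalization constants $Z_1$ and $Z_2$. They are defined by

$$u^{\dagger}(p)\,\Gamma_{\mu\nu}(p,p)\,u(p) = Z_1^{-1}\,u^{\dagger}(p)\,\gamma_{\mu\nu}(p,p)\,u(p), \qquad p^{2} = -m^{2}, \tag{6.5}$$

$$S^{-1}(p) = Z_2^{-1}\left[G^{-1}(p) + \bar\Sigma(p)\right], \qquad \left[\partial\bar\Sigma(p)/\partial p\right]_{p^{2}=-m^{2}} = 0, \tag{6.6}$$

where $\gamma_{\mu\nu}$ and $G$ are the bare vertex and propagation functions, respectively, $m$ is the particle rest mass, and $u(p)$ is a particle wave function satisfying

$$S^{-1}(p)\,u(p) = G^{-1}(p)\,u(p) = 0 \tag{6.7}$$

on the mass shell. From (6.6) we may infer

$$\left[\partial S^{-1}(p)/\partial p\right]_{p^{2}=-m^{2}} = Z_2^{-1}\left[\partial G^{-1}(p)/\partial p\right]_{p^{2}=-m^{2}}. \tag{6.8}$$

On the other hand, (6.4) yields, in the limit $p' \to p$,

$$2\Gamma_{\mu\nu}(p,p) = p_{\mu}\,\partial S^{-1}(p)/\partial p^{\nu} - \eta_{\mu\nu}\,S^{-1}(p) - S^{-1}(p)\,G_{\nu\mu} - G_{\nu\mu}{}^{\dagger}\,S^{-1}(p), \tag{6.9}$$

whence, in virtue of (6.7),

$$2u^{\dagger}(p)\,\Gamma_{\mu\nu}(p,p)\,u(p) = p_{\mu}\,u^{\dagger}(p)\left[\partial S^{-1}(p)/\partial p^{\nu}\right]u(p), \qquad p^{2} = -m^{2}. \tag{6.10}$$

Now, since (6.4) is a consequence simply of general covariance, it holds also if $\Gamma_{\mu\nu}$ and $S^{-1}$ are replaced by $\gamma_{\mu\nu}$ and $G^{-1}$, respectively. Therefore we have

$$2u^{\dagger}(p)\,\gamma_{\mu\nu}(p,p)\,u(p) = p_{\mu}\,u^{\dagger}(p)\left[\partial G^{-1}(p)/\partial p^{\nu}\right]u(p), \qquad p^{2} = -m^{2}. \tag{6.11}$$

From (6.5), (6.8), (6.10), and (6.11) it follows that

$$Z_1 = Z_2. \tag{6.12}$$

When both vertex and wave-function radiative corrections are taken into account the two renormalizations cancel, and there remains only the graviton renormalization $Z_3$ arising from vacuum polarization,$^{21}$ which has the effect of modifying the gravitation constant.

The cancellation of divergences which is implied by (6.12) applies only to the leading term of the vertex function, in the limit $p' \to p$, and only on the mass shell. In order that no divergences occur in the remaining terms, or off the mass shell, the interactions which the field $\varphi^{A}$ experiences with other fields must be of the renormalizable type (or else summable to finite values). The example of the scalar particle provides an adequate illustration of the conditions which must be satisfied. In this case we have

$$G^{-1}(p) = p^{2} + m^{2}, \tag{6.13}$$

$$\gamma_{\mu\nu}(p,p') = \tfrac12\left[p_{\mu}p'_{\nu} + p_{\nu}p'_{\mu} - \eta_{\mu\nu}(p\cdot p' + m^{2})\right] \tag{6.14a}$$

$$\phantom{\gamma_{\mu\nu}(p,p')} = \gamma^{0}{}_{\mu\nu}(p,p') - \tfrac12\,\eta_{\mu\nu}\,\delta m^{2}, \tag{6.14b}$$

where the index 0 refers to the bare mass, and we may write

$$S^{-1}(p) = p^{2} + m_0^{2} + \Sigma(p^{2}), \qquad \delta m^{2} = m^{2} - m_0^{2} = \Sigma(-m^{2}), \tag{6.15}$$

$$\Gamma_{\mu\nu}(p,p') = \gamma^{0}{}_{\mu\nu}(p,p') + \Lambda_{\mu\nu}(p,p'). \tag{6.16}$$

The functions $\Sigma$ and $\Lambda$ are related by the Ward identity as follows:

$$2\Lambda_{\mu\nu}(p,p')\,q^{\nu} = \Sigma(p^{2})\,p'_{\mu} - \Sigma(p'^{2})\,p_{\mu}. \tag{6.17}$$

It is not hard to show that the general solution of (6.17) is

$$\Lambda_{\mu\nu}(p,p') = \tfrac14\,\frac{\Sigma(p^{2}) - \Sigma(p'^{2})}{p^{2} - p'^{2}}\,(p_{\mu}+p'_{\mu})(p_{\nu}+p'_{\nu}) - \tfrac14\left[\Sigma(p^{2}) + \Sigma(p'^{2})\right]\eta_{\mu\nu} + F(p^{2},p'^{2},q^{2})\,(q^{2}\eta_{\mu\nu} - q_{\mu}q_{\nu}), \tag{6.18}$$

where $F$ is an arbitrary function. Therefore the graviton vertex of a scalar particle is characterized, on the mass shell, by a single function of $q^{2}$. This is the gravitational form factor.

Now introduce the renormalized self-energy function $\bar\Sigma$, defined by

$$\Sigma(p^{2}) = \delta m^{2} + (Z_2^{-1} - 1)(p^{2} + m^{2}) + Z_2^{-1}\,\bar\Sigma(p^{2}), \qquad \bar\Sigma(-m^{2}) = 0, \qquad \left[d\bar\Sigma(p^{2})/dp^{2}\right]_{p^{2}=-m^{2}} = 0. \tag{6.19}$$

In terms of this function Eq. (6.18) takes the form

$$\Lambda_{\mu\nu}(p,p') = (Z_2^{-1} - 1)\,\gamma_{\mu\nu}(p,p') - \tfrac12\,\delta m^{2}\,\eta_{\mu\nu} + Z_2^{-1}\Big\{\tfrac14\,\frac{\bar\Sigma(p^{2}) - \bar\Sigma(p'^{2})}{p^{2} - p'^{2}}\,(p_{\mu}+p'_{\mu})(p_{\nu}+p'_{\nu}) - \tfrac14\left[\bar\Sigma(p^{2}) + \bar\Sigma(p'^{2})\right]\eta_{\mu\nu}\Big\} + \left[F(p^{2},p'^{2},q^{2}) - \tfrac14(Z_2^{-1} - 1)\right](q^{2}\eta_{\mu\nu} - q_{\mu}q_{\nu}), \tag{6.20}$$

$^{21}$ The polarization of the vacuum by a gravitational field is of the quadrupole type. Examples of renormalization terms to which it leads are given in Sec. 7.
$^{22}$ Equation (6.14a) is obtained from (2.15) by making the replacement $p' \to -p'$, since $p$ is here an outgoing and not an incoming momentum.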

which suggests that we also introduce a renormalized form factor $\bar F$, defined by

$$F(p^{2},p'^{2},q^{2}) = \tfrac14(Z_2^{-1} - 1) + Z_2^{-1}\,\bar F(p^{2},p'^{2},q^{2}). \tag{6.21}$$

Combining (6.14b), (6.16), and (6.20) we then get

$$\bar\Gamma_{\mu\nu}(p,p') = Z_2\,\Gamma_{\mu\nu}(p,p') \tag{6.22a}$$

$$\phantom{\bar\Gamma_{\mu\nu}(p,p')} = \gamma_{\mu\nu}(p,p') + \tfrac14\,\frac{\bar\Sigma(p^{2}) - \bar\Sigma(p'^{2})}{p^{2} - p'^{2}}\,(p_{\mu}+p'_{\mu})(p_{\nu}+p'_{\nu}) - \tfrac14\left[\bar\Sigma(p^{2}) + \bar\Sigma(p'^{2})\right]\eta_{\mu\nu} + \bar F(p^{2},p'^{2},q^{2})\,(q^{2}\eta_{\mu\nu} - q_{\mu}q_{\nu}), \tag{6.22b}$$

which reduces, on the mass shell, to

$$\bar\Gamma_{\mu\nu}(p,p') = \gamma_{\mu\nu}(p,p') + \bar F(-m^{2},-m^{2},q^{2})\,(q^{2}\eta_{\mu\nu} - q_{\mu}q_{\nu}), \qquad p^{2} = p'^{2} = -m^{2}. \tag{6.23}$$

The $Z_2$ factor in (6.22a) takes into account the wave-function renormalization arising from self-energy insertions in the external lines.

If the scalar particle is coupled to other fields through nonrenormalizable interactions then the functions $\bar\Sigma$ and $\bar F$ will diverge in perturbation theory. In particular, they will diverge if virtual gravitons are permitted to contribute to the vertex function. Thus unless an arbitrary cutoff is used, or someone discovers a way to sum gravitational interactions to all orders, the gravitational field must be allowed to act only through the external graviton line. Although the identity (6.12) continues to hold formally in the nonrenormalizable case, it is then of no utility. Because of the divergence which remains in $\bar F$, Eq. (6.23) will yield an infinite cross section for the scattering of the particle in an external gravitational field.

In the renormalizable case $\bar\Sigma$ and $\bar F$ are finite, and expression (6.23) has a well-defined limit as $q \to 0$, namely,

$$\bar\Gamma_{\mu\nu}(p,p) = p_{\mu}p_{\nu}, \qquad p^{2} = -m^{2}. \tag{6.24}$$

More generally, with particles of arbitrary spin one finds

$$u^{\dagger}(p)\,\bar\Gamma_{\mu\nu}(p,p)\,u(p) = (2\pi)^{-3}\,p_{\mu}p_{\nu}/2E, \qquad p^{2} = -m^{2}, \tag{6.25}$$

when the wave functions $u(p)$ are chosen to correspond to $\delta$-function normalization with respect to 3-momentum. As Brout and Englert point out,$^{19}$ the universality of (6.25) implies that the equivalence principle relating gravitational and inertial mass holds in the quantum theory as well as the classical theory. In particular the motion of a nonrelativistic particle in a slowly varying gravitational field is independent of its mass.

If a high-energy cutoff is permitted then the Ward identity may be applied to gravity itself, i.e., to the three-graviton vertex. In this case the wave function renormalization constants $Z_2$ and $Z_3$ coincide, and Eq. (6.12) tells us that $Z_1 = Z_2 = Z_3$. This leaves only the magnitude of the gravitation constant $G$, in terms of arbitrarily chosen (e.g., international mks) mass standards, to be determined by experiment.$^{23}$

It is clear that the gravitational Ward identity is only one of an infinity of identities, derivable from Eq. (17.31) of II, which relate vertex functions involving $n$ gravitons to those involving $n-1$ gravitons. Such identities become superfluous if calculations are performed in coordinate space rather than in momentum space, for then the general covariance of the theory can be kept constantly manifest. That such calculations are actually feasible will be demonstrated in the next section.

7. RENORMALIZATION IN COORDINATE SPACE. CONFORMAL VACUUM FLUCTUATIONS

The chief tool for studying quantum gravidynamics directly in space-time is the theory of Green's functions in hyperbolic Riemannian manifolds developed by Hadamard.$^{24}$ The basic structural element of this theory is the geodetic interval, denoted by $\sigma$,$^{25}$ which is defined as one half the square of the distance along the geodesic between any two space-time points $x$ and $x'$. The geodetic interval is a symmetric function of $x$ and $x'$ which transforms as a biscalar, i.e., as a scalar separately at $x$ and $x'$. It satisfies the differential equation$^{26}$

$$\sigma = \tfrac12\,\sigma_{;\mu}\,\sigma_{;}{}^{\mu} = \tfrac12\,\sigma_{;\mu'}\,\sigma_{;}{}^{\mu'}, \tag{7.1}$$

and the boundary condition

… (7.2)

In a general Riemannian manifold $\sigma$ is not single-valued, except when $x$ and $x'$ are sufficiently close to one another.$^{27}$ The geodesics emanating from a given point will often, beyond a certain distance, begin to cross over one another. The locus of points at which the onset of overlap occurs forms an envelope of the

$^{23}$ The necessity of measuring $G$ disappears if absolute units are adopted, with $\hbar = c = 16\pi G = 1$. However, the masses of the elementary particles must then be measured in absolute units, which is operationally the same thing as measuring both $G$ and the masses in mks units.
$^{24}$ J. Hadamard, Lectures on Cauchy's Problem in Linear Partial Differential Equations (Yale University Press, New Haven, Connecticut, 1923).
$^{25}$ B. S. DeWitt and R. W. Brehme, Ann. Phys. (N.Y.) 9, 220 (1960). See also J. L. Synge, Relativity: The General Theory (North-Holland Publishing Company, Amsterdam, 1960). Synge calls this function the world function and denotes it by the symbol $\Omega$.
$^{26}$ The semicolons denote covariant differentiation. For a scalar this is the same as ordinary differentiation. $\sigma_{;}{}^{\mu}$ is a vector of length equal to the distance along the geodesic between $x$ and $x'$, tangent to the geodesic at $x$, and oriented in the direction $x' \to x$. $\sigma_{;}{}^{\mu'}$ is a vector of equal length, tangent to the geodesic at $x'$, and oriented in the opposite direction.
$^{27}$ In some manifolds (e.g., some compact manifolds) every pair of points may be linked by more than one geodesic. It is always possible, however, to define a single-valued function $\sigma$ in the neighborhood of $x'$ by starting at $x'$ and following each geodesic emanating from $x'$ until it hits a caustic.

family of geodesics, known as a caustic surface. The equation for the caustic surface relative to a given point is $D^{-1} = 0$, where

$$D = -\det(-\sigma_{;\mu\nu'}). \tag{7.3}$$

$D$ is a bidensity of unit weight at both $x$ and $x'$, which satisfies the boundary condition

$$\lim_{x'\to x} D = g. \tag{7.4}$$

It is convenient to replace $D$ by the biscalar

$$\Delta \equiv g^{-1/2}\,D\,g'^{-1/2}, \qquad \lim_{x'\to x}\Delta = 1, \tag{7.5}$$

whose values at given points are independent of the choice of coordinate system. By covariantly differentiating Eq. (7.1), one can derive the differential equation$^{26}$

$$\Delta^{-1}(\Delta\,\sigma_{;}{}^{\mu})_{;\mu} = 4 \quad\text{or}\quad \sigma_{;\mu}{}^{\mu} = 4 - \sigma_{;}{}^{\mu}(\ln\Delta)_{;\mu}, \tag{7.6}$$

which shows that $\Delta$ increases or decreases along each geodesic from $x'$ according as the rate of divergence of the neighboring geodesics from $x'$, which is measured by $\sigma_{;\mu}{}^{\mu}$, is less than or greater than 4, the rate in flat space-time. If the divergence rate becomes negatively infinite a caustic surface develops and $\Delta$ blows up.

We shall illustrate the use of $\sigma$ and $\Delta$ in the theory of Green's functions by considering the Feynman propagator of the simplest of all fields: the massless scalar field. The defining equation is$^{28}$

$$g^{1/2}\,G_{;\mu}{}^{\mu}(x,x') = -\delta(x,x'), \tag{7.7}$$

together with appropriate boundary conditions. The introduction of boundary conditions is most easily accomplished in the abstract formalism which replaces (7.7) by

$$FG = -1, \tag{7.8}$$

where

$$F = -p_{\mu}\,g^{1/2}g^{\mu\nu}\,p_{\nu}, \tag{7.9}$$

$$G(x,x') = \langle x|G|x'\rangle, \tag{7.10}$$

the $|x\rangle$ being eigenvectors of a commuting set of Hermitian operators $x^{\mu}$ in a fictitious Hilbert space,$^{29}$ and the $p$'s being Hermitian momenta canonically conjugate to the $x$'s. The formal solution of (7.8) which incorporates the Feynman boundary conditions is

$$g^{1/4}\,G\,g^{1/4} = \frac{-1}{g^{-1/4}Fg^{-1/4} + i0} = i\int_0^{\infty}\exp(i\,g^{-1/4}Fg^{-1/4}\,s)\,ds, \tag{7.11}$$

the factors $g^{1/4}$ being inserted to insure the covariance of operator products.$^{30}$ Taking matrix elements of (7.11) one obtains

$$g^{1/4}\,G(x,x')\,g'^{1/4} = i\int_0^{\infty}\langle x,s|x',0\rangle\,ds, \tag{7.12}$$

where

$$\langle x,s|x',0\rangle = \langle x|\exp(i\,g^{-1/4}Fg^{-1/4}\,s)|x'\rangle, \tag{7.13}$$

satisfying

$$-i\,\frac{\partial}{\partial s}\langle x,s|x',0\rangle = \langle x,s|x',0\rangle_{;\mu}{}^{\mu}, \tag{7.14}$$

and the boundary condition

$$\langle x,0|x',0\rangle = \langle x|x'\rangle = \delta(x,x'). \tag{7.15}$$

The Schrödinger equation (7.14) is solved by the ansatz

$$\langle x,s|x',0\rangle = -i(4\pi)^{-2}\,g^{1/4}\Delta^{1/2}g'^{1/4}\,s^{-2}\,e^{i\sigma/2s}\sum_{n=0}^{\infty}a_n\,(is)^{n}, \tag{7.16}$$

$$a_0 = 1, \tag{7.17}$$

which is suggested by its known solution in flat space-time.$^{29}$ Inserting (7.16) into (7.14) and making use of (7.1) and (7.6), one finds

$$\sigma_{;}{}^{\mu}\,a_{n;\mu} + n\,a_n = \Delta^{-1/2}(\Delta^{1/2}a_{n-1})_{;\mu}{}^{\mu}, \qquad n = 1, 2, 3, \ldots. \tag{7.18}$$

These recursion relations may be solved by successive quadratures along each geodesic emanating from $x'$. Hadamard$^{24}$ and Riesz$^{31}$ have shown that the solutions as well as the series (7.16) converge up to the first caustic. Formally we may write

$$a_n = \prod_{m=1}^{n}\left[(i\,\sigma_{;}{}^{\mu}p_{\mu} + m)^{-1}\,\Delta^{-1/2}\,g^{-1/4}Fg^{-1/4}\,\Delta^{1/2}\right]1, \tag{7.19}$$

where $ip_{\mu}$ and $g^{-1/4}Fg^{-1/4}$ are now to be understood as the gradient and Laplace-Beltrami operators, respectively. Setting

$$(i\,\sigma_{;}{}^{\mu}p_{\mu} + m)^{-1} = \int_{-\infty}^{0}\exp\left[(i\,\sigma_{;}{}^{\mu}p_{\mu} + m)\,t_m\right]dt_m, \tag{7.20}$$

and making the variable transformation

$$\begin{aligned} t'_1 &= t_1, & t_1 &= t'_1,\\ t'_2 &= t_2 - t_1, & t_2 &= t'_1 + t'_2,\\ &\;\;\vdots & &\;\;\vdots\\ t'_n &= t_n - t_{n-1}, & t_n &= t'_1 + t'_2 + \cdots + t'_n, \end{aligned} \tag{7.21}$$

$^{28}$ In Eq. (7.7) $G(x,x')$ is to be understood as a biscalar and the $\delta$ function $\delta(x,x')$ as a bidensity of unit weight at $x$ and zero weight at $x'$.
$^{29}$ Cf. J. Schwinger, Phys. Rev. 82, 664 (1951).
$^{30}$ For example, $\langle x|(g^{1/4}Gg^{1/4})^{2}|x'\rangle = g^{1/4}g'^{1/4}\int G(x,x'')\,g''^{1/2}\,G(x'',x')\,dx''$.
$^{31}$ M. Riesz, Acta Math. 81, 1 (1949).


we may recast (7.19) into the form

$$a_n = \int_{-\infty}^{0}dt_1\int_{-\infty}^{t_1}dt_2\cdots\int_{-\infty}^{t_{n-1}}dt_n\;\mathfrak{O}(t_1)\cdots\mathfrak{O}(t_n)\,1, \tag{7.22}$$

where

$$\mathfrak{O}(t) \equiv e^{t}\exp(i\,\sigma_{;}{}^{\mu}p_{\mu}\,t)\,\Delta^{-1/2}\,g^{-1/4}Fg^{-1/4}\,\Delta^{1/2}\exp(-i\,\sigma_{;}{}^{\mu}p_{\mu}\,t). \tag{7.23}$$

Substitution into (7.16) then yields

$$\langle x,s|x',0\rangle = -i(4\pi)^{-2}\,g^{1/4}\Delta^{1/2}g'^{1/4}\,s^{-2}\,e^{i\sigma/2s}\;T\left[\exp(is\,\mathfrak{R})\right]1, \tag{7.24}$$

where

$$\mathfrak{R} = \int_{-\infty}^{0}\mathfrak{O}(t)\,dt, \tag{7.25}$$

and where the symbol $T$ indicates that the operators $\mathfrak{O}(t)$ appearing in the exponential, via (7.25), are to be chronologically ordered with respect to the parameters $t$.

Now the chronological ordering operation commutes with differentiation or integration with respect to the parameter $s$. Hence Eq. (7.24) may be inserted directly into (7.12). The result is a formal generalization of a well-known expression

… (7.26)

$H_1^{(2)}$ being the Hankel function of the second kind of order 1. This formula has the series expansion

… (7.27)

$$\gamma = 0.5772\ldots, \tag{7.28}$$

where the instructions $\sigma \to \sigma + i0$ and $-\mathfrak{R} \to -\mathfrak{R} - i0$ indicate what is evident from (7.24), namely, that $G(x,x')$ is the boundary value of a function of $\sigma$ and $\mathfrak{R}$ which is analytic in the upper-half $\sigma$ plane and the upper-half $\mathfrak{R}$ plane. The singularity structure in $\sigma$ reflects the usual behavior of the Feynman propagator on the light cone ($\sigma = 0$). The remaining singularity structure symbolized by the logarithm of $-\mathfrak{R} - i0$, on the other hand, is far from simple owing to the presence of the chronological ordering operation.

In the perturbative approach to quantum gravidynamics we must deal not with the scalar propagator (7.27) but with the vector and tensor propagators $G^{\alpha\beta'}$ and $G^{ij'}$. However, the latter have structures closely similar to (7.27); the only difference is that the operators $\mathfrak{O}(t)$ out of which $\mathfrak{R}$ is built are slightly more complicated, and the 1 standing on the right of Eqs. (7.19), (7.22), (7.24), (7.26), and (7.27) is replaced by the geodetic parallel displacement function.$^{25}$ Therefore we can gain a qualitative understanding of the renormalization program in coordinate space already by studying the scalar propagator. Moreover, there is an interesting nonperturbative treatment of the vacuum-to-vacuum amplitude in which the scalar propagator itself directly enters:

Consider the Feynman functional integral, Eq. (20.33) of II, which may be rewritten in the form

$$\langle 0,\infty|0,-\infty\rangle = \ldots\int\exp\left\{i\left(S[\varphi+\phi] - S[\varphi] - S_{,i}[\varphi]\,\phi^{i}\right)\right\}d\phi, \tag{7.29}$$

where $S$ is the Einstein action and $\varphi_{\mu\nu} = g_{\mu\nu} - \eta_{\mu\nu}$ (see Table I of II). Because of the coordinate invariance of the theory the functional integration is redundant and ambiguous, and since no one has yet discovered an analytically accessible nonredundant subspace for the integration, we are forced to accept Eq. (20.12) of II as the effective definition of the integral. However, there is an incomplete nonredundant subspace which is easily accessible, namely, the subspace of all conformally equivalent geometries. One may simply set

$$\phi_{\mu\nu} = \lambda\,g_{\mu\nu}, \qquad \varphi_{\mu\nu} + \phi_{\mu\nu} = \bar g_{\mu\nu} - \eta_{\mu\nu}, \qquad \bar g_{\mu\nu} = (1+\lambda)\,g_{\mu\nu}, \tag{7.30}$$

and integrate over $\lambda$, to obtain the partial contribution to $\langle 0,\infty|0,-\infty\rangle$ arising from conformal fluctuations in the vacuum geometry. The special interest of this integration is that it can be performed exactly, giving the conformal contribution to all orders of perturbation theory. The only fly in the ointment is that this is the one contribution for which high-energy damping cannot be expected to produce a finite cutoff. There is no smearing out of the light cone, because conformal metric fluctuations leave the light cone invariant.

It is easy to show that

$$\bar g^{1/2}\,{}^{(4)}\bar R = (1+\lambda)\,g^{1/2}\,{}^{(4)}R - \tfrac32\,g^{1/2}(1+\lambda)^{-1}\lambda_{;\mu}\lambda_{;}{}^{\mu} + 3\,g^{1/2}\lambda_{;\mu}{}^{\mu}, \tag{7.31}$$

and hence

$$S[\varphi+\phi] - S[\varphi] - S_{,i}[\varphi]\,\phi^{i} = \int\left[\bar g^{1/2}\,{}^{(4)}\bar R - g^{1/2}\,{}^{(4)}R + g^{1/2}\left(R^{\mu\nu} - \tfrac12 g^{\mu\nu}\,{}^{(4)}R\right)\phi_{\mu\nu}\right]dx = -\tfrac32\int g^{1/2}(1+\lambda)^{-1}\lambda_{;\mu}\lambda_{;}{}^{\mu}\,dx. \tag{7.32}$$
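The derivative terms in (7.31) can be traced to a single identity. Writing $1+\lambda = \Omega^{2}$, a conformal rescaling in four dimensions changes $g^{1/2}\,{}^{(4)}R$ by a term proportional to $g^{1/2}\,\Omega\,\Omega_{;\mu}{}^{\mu}$ (a standard result, with the overall sign depending on the curvature convention), and $\Omega\,\Omega_{;\mu}{}^{\mu} = \tfrac12\lambda_{;\mu}{}^{\mu} - \tfrac14(1+\lambda)^{-1}\lambda_{;\mu}\lambda_{;}{}^{\mu}$. The sketch below checks only the one-dimensional analogue of this last identity by finite differences. The profile `lam`, the step size `h`, and all function names are illustrative assumptions and do not come from the paper.

```python
import math


def lam(x):
    # Hypothetical smooth profile, chosen only so that 1 + lambda stays positive.
    return 0.3 * math.sin(x) + 0.1 * x * x


def omega(x):
    # Conformal factor defined by Omega**2 = 1 + lambda.
    return math.sqrt(1.0 + lam(x))


def d1(f, x, h=1e-4):
    # Central first derivative.
    return (f(x + h) - f(x - h)) / (2.0 * h)


def d2(f, x, h=1e-4):
    # Central second derivative.
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


def lhs(x):
    # Omega * Omega'' (one-dimensional stand-in for Omega times its d'Alembertian).
    return omega(x) * d2(omega, x)


def rhs(x):
    # (1/2) lambda'' - (1/4) lambda'^2 / (1 + lambda).
    return 0.5 * d2(lam, x) - 0.25 * d1(lam, x) ** 2 / (1.0 + lam(x))


def max_mismatch(points):
    # Largest absolute difference between the two sides over the sample points.
    return max(abs(lhs(x) - rhs(x)) for x in points)


if __name__ == "__main__":
    print(max_mismatch([-2.0, -1.0, 0.0, 0.5, 1.0, 2.0]))
```

The mismatch is limited by finite-difference and rounding error only; the two derivative terms enter with opposite signs, which is why the total-divergence term and the quadratic term in a conformally rescaled action cannot both carry the same sign.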

The following change of variables then suggests itself:

$$\lambda = \Phi + \tfrac14\Phi^{2}, \qquad 1 + \lambda = (1 + \tfrac12\Phi)^{2}. \tag{7.33}$$

This change not only simplifies expression (7.32) but at the same time guarantees the integrity of the signature of space-time. We may allow $\Phi$ to range from $-\infty$ to $\infty$ without danger of encountering unphysical geometries and at the trivial cost of counting each distinct geometry twice at every point instead of only once. Thus we write

$\exp(iW_{\rm conformal}) = $ … (7.34)

from which we immediately obtain

$W_{\rm conformal} = $ … (7.35)

The formal determinant may be evaluated by a variational technique with the aid of Eq. (7.11). Under a change in the background field (7.35) suffers the variation

$$\begin{aligned}\delta W_{\rm conformal} &= -\tfrac12 i\,{\rm tr}\left[g^{1/4}Gg^{1/4}\,\delta(g^{-1/4}Fg^{-1/4})\right]\\ &= \tfrac12\,{\rm tr}\int_0^{\infty}\exp(i\,g^{-1/4}Fg^{-1/4}\,s)\,\delta(g^{-1/4}Fg^{-1/4})\,ds\\ &= -\tfrac12 i\,\delta\,{\rm tr}\int_0^{\infty}s^{-1}\exp(i\,g^{-1/4}Fg^{-1/4}\,s)\,ds,\end{aligned} \tag{7.36}$$

which may be immediately integrated to yield

$$W_{\rm conformal} = -\tfrac12 i\,{\rm tr}\int_0^{\infty}s^{-1}\exp(i\,g^{-1/4}Fg^{-1/4}\,s)\,ds + \text{constant}. \tag{7.37}$$

The trace symbol here means "integrate the diagonal matrix element over space-time." Hence, making use of (7.12), (7.13), (7.16) and (7.27), we find

$$W_{\rm conformal} = \int\mathcal{L}_{\rm conformal}\,dx + \text{constant}, \tag{7.38}$$

where

$\mathcal{L}_{\rm conformal} = $ … $+\,[\,2\gamma - \ln 2 - \ldots + \ln(-\mathfrak{R} - i0) + \ln(\sigma + i0)\,]\,\mathfrak{R}^{2}\ldots 1$ (7.39)

Several comments are now in order. First we remark that although the final result is divergent, the degree of divergence is bounded. The singularity at $x' = x$ is therefore not an essential one as one might have expected on the basis of Eq. (5.3). As a matter of fact (7.39) is identical in structure with the contributions which the propagators $G^{ij'}$ and $G^{\alpha\beta'}$ of the full theory make in lowest perturbation order (i.e., the single closed loops of $W^{(1)}$).$^{32}$

It may be conjectured that inclusion of the nonconformal vacuum fluctuations will eliminate the divergences altogether, and that a rough approximation to the exact vacuum-to-vacuum amplitude can be obtained simply by making the replacement $\sigma(x,x') \to \tfrac12\Lambda^{-2}$ in (7.39), where $\Lambda$ is a high-energy cutoff of the order of unity in absolute units. The "$i0$" attached to each $\sigma$ in (7.39) reflects the presence of unremoved noncausal chains. In passing to $W$ these imaginary infinitesimals should be discarded. We obtain, therefore, the estimate

… (7.40)

… $-\,0.545\ldots$ … (7.41)

Still cruder estimates of $W$ can be obtained by finding approximations to the complicated quantity $\mathfrak{R}$. By repeated covariant differentiation of Eqs. (7.1) and (7.6), and use of the commutation laws for the indices thereby induced, one can show that

$$\lim_{x'\to x}\Delta^{-1/2}\,g^{-1/4}Fg^{-1/4}\,\Delta^{1/2}\cdot 1 = \lim_{x'\to x}\Delta^{-1/2}\,\Delta^{1/2}{}_{;\mu}{}^{\mu} = \tfrac16\,{}^{(4)}R. \tag{7.42}$$

This quantity raised to the $n$th power can be extracted from expression (7.19) or (7.22) for $a_n$. Moreover, it is clear that the operator $\mathfrak{O}(t)$ has the dimensions of the curvature scalar and in the limit $x' \to x$ is a kind of nonlocal, or mean, curvature averaged over a certain neighborhood of $x$. If we represent the purely nonlocal part schematically by $\Delta\mathfrak{R}$, we may write

$$\mathfrak{O}(t)\;\underset{x'\to x}{\longrightarrow}\;e^{t}\left(\tfrac16\,{}^{(4)}R + \Delta\mathfrak{R}\right). \tag{7.43}$$

The crudest approximation to $\mathfrak{R}$ is then obtained

$^{32}$ The manifestly covariant occurrence of three distinct types of divergences: quartic, quadratic, and logarithmic, already in lowest order, implies that the conjecture of Brout and Englert (Ref. 19) that quantum gravidynamics is conventionally renormalizable is unfounded.


simply by omitting $\Delta\mathfrak{R}$ altogether:

… (7.44)

… (7.45)

Expression (7.45) is prototypical of the contributions which all fields make to the geometrical part of the vacuum-to-vacuum amplitude. (The only deviations from it occur with massive fields, for which $\tfrac16\,{}^{(4)}R$ gets replaced by $-m^{2} + \tfrac16\,{}^{(4)}R$, and with fermion fields, for which the sign of each term is reversed.) These contributions originate in the vacuum polarization which the background geometry induces, and give rise to nonobservable renormalizations as well as physically real radiative corrections.

The first term in (7.45) is a cosmological term representing the zero-point vacuum energy which every field, including the gravitational field itself, possesses. It is eliminated by redefining the zero point.

The second term in (7.45) renormalizes the gravitational interaction strength. The relation between the renormalized and bare gravitation constants ($G$ and $G_0$, respectively) is

$$G = Z\,G_0, \tag{7.46}$$

$$Z = (1 + \Lambda^{2}/48\pi^{2})^{-1}. \tag{7.47}$$

In the theory of the pure gravitational field $Z$ is the only renormalization constant which occurs (provided, of course, the exact theory is really finite). Because of the manifest covariance of (7.45) it is clear that the same renormalization applies to all vertex functions no matter how many graviton prongs they possess. No Ward identity is needed.

The third term in (7.45) is the only one having observable physical consequences.$^{33}$ In the classical limit of long wavelengths and large coherent amplitudes it may be regarded as a correction to the Einstein Lagrangian. Hill$^{34}$ has applied such a correction term to the problem of gravitational collapse of the Friedmann universe, with encouraging results. He finds that if the sign of the coefficient in front is negative, as would be the case for the contribution from a fermion field, this term succeeds in turning the collapse cycle around before infinite curvature is reached.$^{35}$ It may be objected that in applying the correction to the Friedmann model one violates the boundary conditions of asymptotic flatness which were assumed to get it in the first place. However, vacuum polarization is basically a local phenomenon, and global conditions should have little relevance here.

8. THE INFRARED PROBLEM

The most important contributors to the gravitational polarization of the vacuum, and to the modifications in Einstein's equations which this polarization produces, are the massless fields, including gravity itself. These are also the fields which most readily yield real quanta. The effect of real quantum production on the vacuum-to-vacuum amplitude is taken into account by the $-i0$ attached to $-\mathfrak{R}$ in the logarithm of (7.41). Owing to the complexity of $\mathfrak{R}$, however, the branch-point behavior of the logarithm is very involved, and it is not easy to investigate directly in coordinate space whether or not serious infrared difficulties lie hidden in this expression.

In a closed finite world such difficulties cannot arise since there is a natural low-energy cutoff; troubles occur only in infinite worlds. Let us for simplicity confine our attention to flat backgrounds.$^{36}$ It is then appropriate to revert to momentum space to study the problem. The analysis for this case is straightforward and has been carried out by Weinberg;$^{37}$ we shall summarize his results.

The amplitude for a single soft graviton to be produced in a given process has already been derived [Eq. (4.6)]. The corresponding amplitude for the emission of $N$ soft gravitons in all possible ways from a given diagram is just the product of $N$ single-graviton amplitudes. The form of these amplitudes is such that an infrared divergence arises in the computation of the rate at which any physical process takes place when arbitrary numbers of soft gravitons having total energy less than $E$ are simultaneously emitted. This divergence disappears if the contributions from virtual soft gravitons are also included. Weinberg shows that the correct total rate is given by an expression of the form

$$\Gamma(E) = (E/\Lambda)^{B}\,b(B)\,\Gamma_0, \tag{8.1}$$

$$b(B) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{\sin\sigma}{\sigma}\exp\left(B\int_0^{1}\frac{e^{i\omega\sigma} - 1}{\omega}\,d\omega\right)d\sigma \tag{8.2}$$

$$\phantom{b(B)} \simeq 1 - \tfrac{1}{12}\pi^{2}B^{2} + \cdots, \tag{8.3}$$

where $\Gamma_0$ is the rate without graviton emission and $\Lambda$ is a parameter marking the dividing line between soft and hard virtual gravitons. If $\Lambda$ is chosen to be of the order of the typical energies involved in the physical process, Eq. (8.1) gives a fair estimate of the rigorous value which would be obtained for $\Gamma(E)$ if the contributions from ultraviolet virtual gravitons were also included and appropriate renormalizations performed. The soft gravitons make appreciable contributions only if attached to the external lines of the $\Gamma_0$

$^{33}$ The nonlocal part of the $\mathfrak{R}^{2}$ term, which has been omitted from (7.45), also has observable consequences.
$^{34}$ T. W. Hill, Ph.D. thesis, University of North Carolina, 1965 (unpublished).
$^{35}$ Not, however, until a density of the order of unity in absolute units is reached. At this density all the matter in the visible universe has been compressed to a region the size of a nucleon.
$^{36}$ Conclusions reached for this case are presumably valid also for infinite worlds having other background geometries.
$^{37}$ S. Weinberg, Phys. Rev. 140, B516 (1965).


diagrams, and hence the only radiative corrections which should be included in $\Gamma_0$ are those which involve internal lines and vertices. The quantity $B$ is given by

$$B = \frac{G}{2\pi}\sum_{m,n}\eta_n\eta_m\,m_n m_m\,\frac{1 + v_{nm}^{2}}{v_{nm}(1 - v_{nm}^{2})^{1/2}}\,\ln\frac{1 + v_{nm}}{1 - v_{nm}}, \tag{8.4}$$

$$v_{nm} \equiv \left[1 - (m_n m_m/p_n\cdot p_m)^{2}\right]^{1/2}; \tag{8.5}$$

it depends only on the parameters of the external lines.
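As a rough numerical illustration of (8.4) and (8.5), the sketch below evaluates $B$ for a list of external lines. It assumes the metric signature $(-,+,+,+)$ used in the text ($p^{2} = -m^{2}$), takes $\eta_n = +1$ for outgoing and $-1$ for incoming lines (Weinberg's convention, which is an assumption here since the excerpt does not define $\eta_n$), and sets $G = 1$ by default. The function names and the small-velocity cutoff are hypothetical choices made for the example, and the code is meant for massive lines only.

```python
import math


def minkowski_dot(p, q):
    # Inner product with signature (-,+,+,+); p.p = -m**2 on the mass shell.
    return -p[0] * q[0] + p[1] * q[1] + p[2] * q[2] + p[3] * q[3]


def on_shell(mass, px, py, pz):
    # Four-momentum (E, px, py, pz) of a free particle of the given mass.
    energy = math.sqrt(mass * mass + px * px + py * py + pz * pz)
    return (energy, px, py, pz)


def relative_velocity(m_n, m_m, pn_dot_pm):
    # Eq. (8.5); the max() guards against tiny negative values from rounding.
    ratio = (m_n * m_m / pn_dot_pm) ** 2
    return math.sqrt(max(0.0, 1.0 - ratio))


def velocity_factor(v):
    # (1 + v^2) / (v sqrt(1 - v^2)) * ln((1 + v)/(1 - v)), which tends to 2 as v -> 0.
    if v < 1e-6:
        return 2.0
    return (1.0 + v * v) / (v * math.sqrt(1.0 - v * v)) * math.log((1.0 + v) / (1.0 - v))


def infrared_exponent(lines, G=1.0):
    # lines: iterable of (eta, mass, four_momentum); returns B of Eq. (8.4).
    total = 0.0
    for eta_n, m_n, p_n in lines:
        for eta_m, m_m, p_m in lines:
            v = relative_velocity(m_n, m_m, minkowski_dot(p_n, p_m))
            total += eta_n * eta_m * m_n * m_m * velocity_factor(v)
    return G / (2.0 * math.pi) * total


if __name__ == "__main__":
    p = on_shell(1.0, 0.3, 0.0, 0.4)
    # A line that enters and leaves with the same momentum radiates nothing.
    print(infrared_exponent([(-1, 1.0, p), (+1, 1.0, p)]))
```

For a particle at rest and a second particle moving with speed $\beta$, (8.5) reduces to $v_{nm} = \beta$, so $v_{nm}$ can be read as the relative velocity of the two lines; the diagonal terms $n = m$ have $v_{nn} = 0$ and contribute through the finite limit of the velocity factor.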
These results are completely standard. Except for the detailed form of the quantity B they are identical with the corresponding results in quantum electrodynamics. The question which now must be asked is: What happens when emission takes place from particles which are themselves massless? In quantum electrodynamics such emission is known to give rise to a new and more serious kind of infrared divergence which cannot be removed in any simple or completely natural way. This circumstance has been invoked as the reason why massless charged particles do not occur in Nature. In the case of the Yang-Mills and gravitational fields the difficulty presents itself in a peculiarly acute form since these fields are themselves both massless and charged. Moreover, although there is no experimental evidence for the existence of the Yang-Mills field, gravity is an established fact, as is also its interaction with photons.

Since the Yang-Mills field is a vector field its divergence difficulties are similar to those of massless electrodynamics and hence are difficult if not impossible to remove. Because of the noncommutativity of the emission vertices for Yang-Mills quanta it is not possible to sum the effect of arbitrary numbers of real and virtual quanta into a closed expression like (8.1). Moreover, the situation is further complicated by the fact that there is a non-negligible amplitude for the soft quanta themselves to emit soft quanta. However, there is no evidence whatever that the situation would improve if one could find a way to take all these extra complications rigorously into account.

In the case of the gravitational field, on the other hand, the difficulties miraculously disappear. This happy state of affairs is a consequence of the detailed structure of expression (8.4), which in turn derives from the special form of the graviton emission vertex: $\tau_{\mu\nu} \to p_\mu p_\nu$ as $q \to 0$. We shall now show how it comes about.

Let us use indices from the first part of the alphabet to distinguish the massless particles from the others. We shall continue to use the symbols $m_a$, $m_b$, etc. but with the understanding that these masses ultimately tend to zero. We may then write

\[ \frac{1+v_{ab}^2}{(1-v_{ab}^2)^{1/2}} \, \frac{\eta_a \eta_b m_a m_b}{v_{ab}} \, \ln\frac{1+v_{ab}}{1-v_{ab}} \;\ldots \]

With the aid of the energy-momentum conservation [...] the last two terms of (8.7) may be combined into

[...]

which vanishes by symmetry. The masses $m_a$, $m_b$ thus disappear from (8.7), and since $p_a \cdot p_b \ln(-2 p_a \cdot p_b)$ vanishes when either $a = b$ or $p_a$ is parallel to $p_b$, it is evident that B is completely free of divergences.

The only uncertainty which remains in Weinberg's analysis, and which he himself points out, concerns his use of the DeDonder gauge for the virtual gravitons. Except when the stress-energy tensor is conserved at each virtual graviton vertex it is not easy to see that the choice of gauge is immaterial. But stress-energy conservation of this simple type holds only when the particle lines on both sides of the vertex are on the mass shell. [See Eq. (3.8)]. Since only the external lines satisfy this condition Weinberg must appeal to the fact that the other lines are only slightly off the mass shell and hence violate the conservation conditions only [...]

Weinberg's act of faith on this question can be rigorously justified within the framework of the complete theory developed in II. We know from this theory that the choice of gauge for internal lines is irrelevant provided: (a) it is applied consistently and (b) all diagrams contributing to a given process are included. Now Weinberg omits the diagrams which involve infrared fictitious quanta. But it is not hard to show that the contributions of these diagrams all vanish as the infrared momenta go to zero, and hence may be neglected. This is a consequence of the fact that the fictitious quanta always occur in closed loops containing uniformly oriented vertices $V_{(\alpha i)\beta}$. Because of the special

$^{38}$ In electrodynamics it is not difficult to show that gauge invariance holds when every vertex along each charged particle line is taken into account. In gravidynamics, however, every line is charged, and the charge splits up or recombines at every vertex.

1256 BRYCE S. DEWITT 162

form (2.8) which these vertices possess, the uniform orientation guarantees that at least one of the vertices in each infrared loop is proportional to an infrared momentum.

We conclude this section by repeating Weinberg's calculation of B in the nonrelativistic limit and correcting a minor mistake in his result. The quantity $v_{nm}^2$ is first expanded in the form

\[ v_{nm}^2 = (\mathbf{v}_n - \mathbf{v}_m)^2 - v_n^2 v_m^2 + 2 (v_n^2 + v_m^2)\, \mathbf{v}_n \cdot \mathbf{v}_m - 3 (\mathbf{v}_n \cdot \mathbf{v}_m)^2 + \cdots, \tag{8.9} \]

where $\mathbf{v}_n = \mathbf{p}_n / E_n$. This expansion is then inserted into

\[ \frac{1+v_{nm}^2}{(1-v_{nm}^2)^{1/2}} \, \frac{\eta_n \eta_m m_n m_m}{v_{nm}} \, \ln\frac{1+v_{nm}}{1-v_{nm}} = 2 \eta_n \eta_m m_n m_m \left( 1 + \tfrac{11}{6} v_{nm}^2 + \tfrac{63}{40} v_{nm}^4 + \cdots \right) \tag{8.10} \]

to obtain a lengthy expression for B correct to the fourth order in the velocities. This expression can be greatly simplified with the aid of the energy-momentum conservation laws

\[ \sum_n \eta_n m_n \left( 1 + \tfrac{1}{2} v_n^2 + \tfrac{3}{8} v_n^4 + \cdots \right) = 0, \]

and one finally obtains the compact formula

\[ B = (4G/5\pi)\, \mathrm{tr} (\Delta\, d^2 Q / dt^2)^2, \tag{8.11} \]

where $\Delta\, d^2 Q / dt^2$ is the dyadic previously defined by Eqs. (4.11) and (4.12), having the explicit traceless form$^{39}$

\[ \Delta\, d^2 Q / dt^2 = \sum_n \eta_n m_n \left( \mathbf{v}_n \mathbf{v}_n - \tfrac{1}{3} \mathbf{1}\, v_n^2 \right). \tag{8.12} \]

$^{39}$ By inadvertently dropping a term Weinberg obtains a dyadic which is not traceless.
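The small-velocity expansion quoted as Eq. (8.10) can be checked numerically. The following is only an illustrative sketch, not part of the original calculation: the function names `exact_factor` and `series_factor` are hypothetical, and the common prefactor $\eta_n \eta_m m_n m_m$ is dropped. It compares the velocity-dependent factor of (8.4) with the truncated series $2(1 + \tfrac{11}{6} v^2 + \tfrac{63}{40} v^4)$ and prints the remainder divided by $v^6$, which should stay roughly constant if the series is correct through fourth order.

```python
import math

def exact_factor(v):
    """Velocity-dependent factor of Eq. (8.4), prefactor eta*eta*m*m omitted."""
    return (1 + v**2) / ((1 - v**2) ** 0.5 * v) * math.log((1 + v) / (1 - v))

def series_factor(v):
    """Expansion of the same factor, truncated after the v^4 term."""
    return 2 * (1 + 11 / 6 * v**2 + 63 / 40 * v**4)

# If the truncation is correct through v^4, the remainder scales like v^6.
for v in (0.05, 0.1, 0.2):
    diff = exact_factor(v) - series_factor(v)
    print(f"v={v}: remainder={diff:.3e}, remainder/v^6={diff / v**6:.3f}")
```

A remainder that shrinks like $v^6$ is what one expects when the $v^2$ and $v^4$ coefficients are both right; an error in either would show up as a remainder scaling like $v^2$ or $v^4$ instead.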

Bryce DeWitt (Photo courtesy of Cécile DeWitt-Morette)



FEYNMAN DIAGRAMS FOR THE YANG-MILLS FIELD

L. D. FADDEEV and V. N. POPOV

Mathematical Institute, Leningrad, USSR

Received 1 June 1967

Feynman and De Witt showed that the rules must be changed for the calculation of contributions from diagrams with closed loops in the theory of gauge invariant fields. They suggested also a specific recipe for the case of one loop. In this letter we propose a simple method for calculation of the contribution from arbitrary diagrams. The method of Feynman functional integration is used.

Volume 25B, number 1 PHYSICS LETTERS 24 July 1967

It is known that one can associate the field of the Yang-Mills type with an arbitrary simple group G [1-3]. It is appropriate to describe this field by means of the matrices $B_\mu(x)$ with values in the Lie algebra of this group. The gauge group consists of the transformations

[...]

where $\Omega(x)$ is an arbitrary function with values in the group G. The Lagrange function

[...]

is invariant with respect to these transformations. It is clear that

\[ \mathcal{L} = \mathcal{L}_0 + \mathcal{L}_1 \]

where $\mathcal{L}_0$ is a quadratic form, and $\mathcal{L}_1$ is the sum of trilinear and quartic forms in B. In the quantization of the Feynman type $\mathcal{L}_1$ generates vertices with three and four external lines and $\mathcal{L}_0$ is to define the propagator function. However the form $\mathcal{L}_0$ is singular and the longitudinal part of the propagator can not be found unambiguously. This ambiguity does not influence the physical results in quantum electrodynamics. It seems that Feynman [4] was the first to show that the matter is not so simple in the cases of Yang-Mills and gravitational fields. Namely the contribution of the closed loop diagrams depends essentially on the longitudinal part of the propagator and spoils the transversality and unitarity properties of scattering amplitudes. Feynman himself described the necessary change of rules for calculation of the contribution from diagrams with one closed loop. A more detailed derivation of the new rules was given by De Witt [5]. However it seems that nobody gave the generalization of these rules for arbitrary diagrams.

The formal considerations below are to give a simple explanation of the described difficulties and a quite workable recipe to circumvent them.

We know from Feynman [6] that every element of the S-matrix can be written down as the functional integral

\[ \langle \mathrm{in} | \mathrm{out} \rangle \sim \int \exp\{ i S[B] \} \prod_x dB(x) \]

up to an (infinite) normalizing factor. Here $S[B] = \int \mathcal{L}(x)\, dx$ is the action functional and one is to integrate over all fields $B(x)$ with the asymptote at $t = x_0 \to \pm\infty$ prescribed by in- and out-states. The diagrams appear naturally in the perturbative calculation of this integral.

In the case of gauge invariant theory it is necessary to transform this integral a little. In fact, we can say, using the natural geometrical language, that the integrand is constant on the "orbits" $B_\mu \to B_\mu^\Omega$ of the gauge group in the manifold of all fields $B_\mu(x)$. It follows that the integral itself is proportional to the volume of this orbit which can be expressed as the integral

\[ \int \prod_x d\Omega(x) \]

over all matrices $\Omega(x)$. This integral should be factorized before using the perturbation theory.

There exist several methods for this purpose. The idea of one of them is to integrate over the orbits and some transversal surface. It is appropriate to choose for the latter the "plane" $\partial_\mu B_\mu = 0$. Then the integral reduces to the following

[...]

integrate over transverse fields and $\Delta[B]$ is to be chosen such that the condition

\[ \Delta[B] \int \prod_x \delta(\partial_\mu B_\mu^\Omega)\, d\Omega = \mathrm{const} \]

holds. It is the nontriviality of $\Delta[B]$ which distinguishes the theories of Yang-Mills and gravitational fields from quantum electrodynamics.

We must know $\Delta[B]$ only for transverse fields and in this case all contribution to the last integral is given by the neighbourhood of the unit element of the group. After appropriate linearization we come to the condition

\[ \Delta[B] \int \prod_x \delta(\Box u - \varepsilon [B_\mu, \partial_\mu u])\, du(x) = \mathrm{const} \]

where $\Box$ is the D'Alembert operator and $u(x)$ are functions with values in the Lie algebra of the group G.

Formally $\Delta[B]$ is equal to the determinant of the operator

\[ A u = \Box u - \varepsilon [B_\mu, \partial_\mu u] \equiv A_0 u - \varepsilon V(B) u. \]

After extracting the trivial infinite factor $\det A_0$ we obtain the following expression for $\ln \Delta[B]$

\[ \ln \Delta[B] = \ln(\det A / \det A_0) = \mathrm{Sp} \ln(1 - \varepsilon A_0^{-1} V(B)). \]

Developing the right hand side in a power series in $\varepsilon$ we have the following expressions for the coefficients

[...]

where $G(x)$ is a Green function of the D'Alembert operator. This expression corresponds to the closed loop with the scalar particle propagating along it and interacting with the transverse vector particles according to the law

\[ \sim \varepsilon\, \mathrm{Sp}(\varphi [B_\mu, \partial_\mu \varphi]). \]

There results the diagram technique with the following features:
1. The pure transversal Green function is to be used as a propagator for the vector particles (Landau gauge).
2. It is necessary to take into account the new vertex with two scalar and one vector external line in addition to the ordinary vertices with three and four lines.

Concrete calculations with these changes in the rules give transverse and unitary expressions for the scattering amplitudes.

It must be stressed that the Landau gauge is essential for the new rules. It is connected with the chosen method of extracting the factor $\int \prod_x d\Omega(x)$. Another method leads to the expression

\[ \int \exp\Big\{ i S[B] - \frac{i}{2} \int \mathrm{Sp}(\partial_\mu B_\mu)^2\, dx \Big\}\, \varphi[B] \prod_x dB \]

where the factor $\varphi[B]$ must be found from the condition

[...]

This integral gives the perturbation series with Feynman propagator, but the calculation of $\varphi[B]$ is more cumbersome than that of $\Delta[B]$.

We conclude with the comment that one can proceed in an analogous way with gravitation theory. The analog for $\Delta[B]$ is the determinant of the Beltrami-Laplace operator in a harmonic coordinate system.

References
1. C. N. Yang and R. L. Mills, Phys. Rev. 96 (1954) 191.
2. R. Utiyama, Phys. Rev. 101 (1956) 1597.
3. S. L. Glashow and M. Gell-Mann, Ann. of Phys. 15 (1961) 437.
4. R. P. Feynman, Acta Physica Polonica 24 (1963) 697.
5. B. S. De Witt, Relativity, groups and topology (Blackie and Son Ltd 1964) pp 587-820.
6. R. P. Feynman, Phys. Rev. 80 (1950) 440.
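The power-series development of $\ln \Delta[B] = \mathrm{Sp} \ln(1 - \varepsilon A_0^{-1} V(B))$ used in the letter rests on the identity $\ln \det(1 - \varepsilon M) = -\sum_{n \ge 1} (\varepsilon^n / n)\, \mathrm{Sp}\, M^n$. As a finite-dimensional illustration only, the sketch below checks this identity for a 2x2 matrix standing in for $A_0^{-1} V(B)$. The matrix entries, the value of `eps`, and all function names are assumptions chosen for the example; nothing here reproduces the functional operators of the letter.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    """Sum of the diagonal entries (the 'Sp' of the text)."""
    return sum(a[i][i] for i in range(len(a)))

def log_det_exact(m, eps):
    """ln det(1 - eps*M) for a 2x2 matrix M, from the explicit determinant."""
    a = 1 - eps * m[0][0]
    d = 1 - eps * m[1][1]
    return math.log(a * d - (eps * m[0][1]) * (eps * m[1][0]))

def log_det_series(m, eps, terms):
    """Truncated series -sum_{n=1}^{terms} eps^n / n * Sp(M^n)."""
    total = 0.0
    power = [[1.0, 0.0], [0.0, 1.0]]  # M^0
    for n in range(1, terms + 1):
        power = mat_mul(power, m)      # now M^n
        total -= eps**n / n * trace(power)
    return total

M = [[0.3, 0.1], [0.2, 0.4]]  # hypothetical stand-in for A_0^{-1} V(B)
eps = 0.5
print(log_det_exact(M, eps), log_det_series(M, eps, 30))
```

Each term $\mathrm{Sp}\, M^n$ is a closed chain of n factors, which is the finite-dimensional counterpart of the closed scalar loop with n vertices described in the text.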

PHYSICAL REVIEW VOLUME 175, NUMBER 5 25 NOVEMBER 1968

Feynman Rules for Electromagnetic and Yang-Mills Fields from the Gauge-Independent Field-Theoretic Formalism*

STANLEY MANDELSTAM
Department of Physics, University of California, Berkeley, California 94720
(Received 17 June 1968)

The Feynman rules for the Yang-Mills field, originally derived by Feynman and DeWitt from S-matrix
theory and the tree theorem, are here derived as a consequence of field theory. Our starting point is the
gauge-independent, path-dependent formalism which we proposed earlier. The path-dependent Green's
functions in this theory are expressed in terms of auxiliary, path-independent Green's functions in such a
way that the path-dependence equation is automatically satisfied. The formula relating the path-dependent
to the auxiliary Green's functions is similar to the classical formula relating the path-dependent field vari-
ables to the potentials. By using a notation similar but not identical to Schwinger's functional notation, the
infinite set of equations satisfied by the Green's function can be replaced by a single equation. When the
equation for the auxiliary Green's functions of electromagnetism is solved in a perturbation series, the usual
Feynman rules result. For the Yang-Mills field, however, one obtains extra terms; such terms correspond
precisely to the closed loops of fictitious scalar particles introduced by Feynman, DeWitt, and Faddeev
and Popov.

1. INTRODUCTION

The discovery of the Feynman rules for the Yang-Mills and gravitational fields by Feynman$^1$ himself has solved a long-standing problem in relativistic quantum mechanics. Feynman only derived his procedure for diagrams with a single closed loop, but DeWitt$^2$ has recently extended the procedure to diagrams of arbitrary complexity. Another general proof of the prescription for the Yang-Mills field has been given by Faddeev and Popov,$^3$ who used a functional integration procedure which is probably equivalent to that of DeWitt.

Feynman and DeWitt obtained their prescription by a somewhat indirect method. From the Feynman rules for nongauge particles they obtained the "tree theorem," which relates the contribution to the S matrix from a closed-loop diagram to the contribution from a diagram where the loop is opened at one point. They then assumed that the tree theorem was valid in theories with gauge particles; they were thus able to derive the Feynman rules for the S matrix. The validity of the tree theorem guarantees that the S matrix is unitary, and their results can almost certainly be derived from an analyticity-unitarity calculation in perturbation theory.

The question arises whether one can obtain the Feynman rules within the framework of a field theory of the Yang-Mills field or of gravity, and it is the purpose of the present paper to attempt to do so. We shall take as our basis the path-dependent theory of gauge fields which we suggested earlier.$^4$ The theory was originally formulated for electromagnetism and for gravity, but it can be applied to any gauge field. The path-dependent theory of the Yang-Mills field has been treated by Bialynicki-Birula.$^5$

In the present paper we shall rederive the Feynman rules for the electromagnetic field from the path-dependent formalism, and we shall then derive the more complicated Feynman rules for the Yang-Mills field. We shall derive the Feynman rules for the gravitational field in the following paper.

The fundamental principle of the path-dependent formalism was to avoid the introduction of non-gauge-invariant quantities. Thus the electromagnetic potentials were not introduced, but were replaced by the electromagnetic field variables $F_{\mu\nu}$. Similarly, the charged field variables $\varphi(x)$ were replaced by the path-dependent but gauge-invariant variables $\Phi(x, P)$. For practical purposes one would like to introduce the potentials as auxiliary variables, as one does in classical field theory. By doing so one would be able to calculate in terms of path-independent variables; one would transfer to the path-dependent variables at the end of the calculation. It is well known, however, that one cannot introduce covariant potentials without enlarging the Hilbert space and employing an indefinite metric. For electromagnetism one can use noncovariant potentials such as those of the Coulomb gauge. One can then derive the Feynman rules after a certain amount of algebraic calculation. It is possible to formulate the Yang-Mills theory in terms of noncovariant gauges, such as Schwinger's modification of the Coulomb gauge, or the Arnowitt-Fickler gauge.$^6$ However, the method which was used in electromagnetism for deriving the covariant Feynman rules from such gauges is not applicable here, at any rate without essential modification. To our knowledge no such consistent formalism has been given for the gravitational field.

* Research supported in part by the Air Force Office of Scientific Research, Office of Aerospace Research, under Grant No. AF-AFOSR-68-1471.
1 R. P. Feynman, Acta Phys. Polon. 24, 697 (1963).
2 B. S. DeWitt, Phys. Rev. 162, 1195 (1967); 162, 1239 (1967).
3 L. D. Faddeev and V. N. Popov, Phys. Letters 25B, 29 (1967).
4 S. Mandelstam, Ann. Phys. (N.Y.) 19, 1, 25 (1962).
5 I. Bialynicki-Birula, Bull. Acad. Polon. Sci. 11, 135 (1963).
6 J. Schwinger, Phys. Rev. 127, 324 (1962); R. L. Arnowitt and S. I. Fickler, ibid. 127, 1821 (1962).
175 1580

175 ELECTROMAGNETIC AND YANG-MILLS FIELDS 1581

In the present treatment we shall avoid noncovariant quantities and we shall therefore not introduce potentials as quantum-mechanical operators. Instead, we shall introduce auxiliary Green's functions. In formalisms of quantum electrodynamics which employ potentials, whether in the Coulomb or Lorentz gauges, one can define Green's functions

\[ G_{\mu\cdots}(x_1, \cdots; y_1, \cdots; z_1, \cdots) = \langle 0 | T\{ \varphi(x_1) \cdots \varphi^*(y_1) \cdots A_\mu(z_1) \cdots \} | 0 \rangle. \]

One can also define path-dependent but gauge-invariant Green's functions

\[ G_{\mu\nu\cdots}(x_1, P_1, \cdots; y_1, P_1', \cdots; z_1, \cdots) = \langle 0 | T\{ \Phi(x_1, P_1) \cdots \Phi^*(y_1, P_1') \cdots F_{\mu\nu}(z_1) \cdots \} | 0 \rangle. \]

The latter Green's functions can be expressed in terms of the former. In our present approach, the path-independent Green's functions will be introduced, not as vacuum-expectation values of time-ordered products, but as auxiliary functions in their own right. The physical, path-dependent Green's functions of our theory will then be expressed in terms of the auxiliary Green's functions by using the same formulas as in theories with potentials. The connection between the path-dependent and path-independent Green's functions will guarantee that the path-dependence equations are satisfied, as we shall verify explicitly. We then have to find the equations which the auxiliary Green's functions must satisfy in order that the path-dependent Green's functions satisfy the correct equations.

For electrodynamics, such an approach has already been carried out by Sarker.$^7$ He found that the equations satisfied by the auxiliary Green's functions are similar, but not identical, to the equations satisfied by the Green's functions of the Lorentz-gauge theory. The difference is due to the fact that he started with the Maxwell equations $\partial F_{\mu\nu}(x) / \partial x_\mu + j_\nu = 0$, whereas the Lorentz-gauge theory starts with the equations $\Box^2 A_\nu(x) + j_\nu = 0$. Nevertheless, he showed that the Green's functions calculated by the usual Feynman rules do satisfy the correct equations. The Feynman rules were thus derived from a procedure which was covariant throughout and which did not make use of an enlarged Hilbert space.

When we carry out a similar treatment for the Yang-Mills field, we shall again find that the equations satisfied by our auxiliary Green's functions are slightly different from the corresponding equations in the (incorrect) Lorentz-gauge theory. As with electromagnetism, the difference is due to the dropping of a term $-\partial^2 A_\mu / \partial x_\mu \partial x_\nu$ in the Lorentz gauge. In this case, however, we shall find that the difference is important, and that the solution to our equations contains terms besides those given by the Lorentz-gauge Feynman rules.

Our results will be the same as those found by Feynman, DeWitt, and Faddeev and Popov. They showed that the correct prescription was to take all Feynman diagrams of the Lorentz-gauge theory, together with Feynman diagrams containing closed loops of fictitious scalar particles. In our treatment we shall find that integrals corresponding to closed loops of scalar particles appear directly in the solution of the Green's-function equations. We may associate such integrals with closed loops of scalar particles if we wish, but this is purely a mnemonic device. The fictitious particles never occur in external lines, nor do they appear in the intermediate states of the unitarity condition.

In our present formulation of the theory, the Feynman rules are thus rules for calculating auxiliary Green's functions. We can then proceed to calculate the gauge-invariant, path-dependent Green's functions, since we shall already have expressed them in terms of the auxiliary Green's functions. By using the reduction formulas we can then calculate the S matrix. The fundamental reduction formulas of the theory involve the path-dependent Green's functions. However, one can use these reduction formulas to derive further reduction formulas involving the auxiliary Green's functions. Thus, from the Feynman rules for the auxiliary Green's functions, one can derive Feynman rules for the S matrix by the usual reinterpretation of the external lines.

The equations for the Green's functions are coupled integral equations between an infinite number of such functions. Moreover, when expressing path-dependent Green's functions in terms of auxiliary Green's functions, one finds that a single path-dependent Green's function is equal to the sum of an infinite number of auxiliary Green's functions. It would be clumsy, if in principle possible, to carry out manipulations with such infinite systems of equations. We require a shorthand for expressing the infinite sets of equations as single equations. The Schwinger functional notation provides us with such a shorthand; Schwinger's functional differential equation is equivalent to the complete set of equations for the Green's functions. Unfortunately it does not appear to be an easy matter to express the equations for path-dependent Green's functions in Schwinger's notation. We shall therefore use another notation in which our fundamental quantity corresponds to Schwinger's $\delta/\delta\eta$ rather than to $\eta$. We shall indicate the connection between our notation and Schwinger's but we shall not assume knowledge of his notation.

In the following section we shall illustrate some of our methods by using the $\lambda\varphi^3$ theory. We shall find the differential equations for the Green's functions and shall use them to construct the perturbation expansion. We shall then develop our notation for simplifying the writing of the differential equations. Essentially what we shall do is to form a linear space of all Green's functions and to write the differential equations as equations for vectors in this space. In Sec. 3 we shall treat the

7 A. Q. Sarker, Ann. Phys. (N.Y.) 24, 19 (1963).

1582 STANLEY MANDELSTAM 175

electromagnetic field. We shall write down the equations for the path-dependent Green's functions and shall reexpress them in our shorthand notation. Working within this notation, we shall then express our path-dependent Green's functions in terms of new, path-independent, auxiliary Green's functions. We shall determine the equations which the auxiliary Green's functions should satisfy in order that the path-dependent Green's functions satisfy the required equations. On solving them, we shall find that they lead to the ordinary Feynman rules. In Sec. 4 we shall treat the Yang-Mills field in a similar way. Here, however, we shall find that the perturbation expansion contains terms besides those given by naive Feynman rules.

2. DIFFERENTIAL EQUATIONS FOR GREEN'S FUNCTIONS

In this section we shall summarize the method of determining Green's functions by solving differential equations, and shall also develop our shorthand notation. The method is certainly not new but, as far as we are aware, there is no easily available reference in which it is described, and we therefore felt it worthwhile to describe its application to non-gauge fields before passing on to the gauge fields in which we are interested.

We shall treat the simple case of a neutral scalar field with $\lambda\varphi^3$ coupling. The field equations will be

\[ (\Box^2 - \mu^2)\varphi - \tfrac{1}{2}\lambda \varphi^2 = 0, \tag{2.1} \]

and the $\varphi$'s will satisfy the commutation relations

\[ [\varphi(\mathbf{x},t), \varphi(\mathbf{y},t)] = [\dot\varphi(\mathbf{x},t), \dot\varphi(\mathbf{y},t)] = 0, \tag{2.2a} \]

[...] (2.2b)

[...] functions

[...] (2.3d)
[...] (2.3e)

One method of obtaining the perturbation series for the Green's functions is to use the differential equations satisfied by them. This is the method we shall use in the following sections when treating gauge fields. Thus, $G_2$ will satisfy the equation

\[ (\Box_1^2 - \mu^2) G_2(x_1, x_2) = \tfrac{1}{2}\lambda G_3(x_1, x_1, x_2) + i\delta(x_1 - x_2). \tag{2.4a} \]

Equation (2.4a) is obtained by applying the differential equation (2.1) to the factor $\varphi(x_1)$ of (2.3a). The first term on the right of (2.4a) arises from the interaction term in (2.1) while the second term is obtained by applying the differential operator $-\partial^2/\partial x_0^2$ to the time ordering itself. In deriving this term it is of course necessary to use the commutation relation (2.2b).

The higher Green's functions will satisfy similar equations. Thus $G_3$ will satisfy the equation

\[ (\Box_1^2 - \mu^2) G_3(x_1, x_2, x_3) = \tfrac{1}{2}\lambda G_4(x_1, x_1, x_2, x_3) + i\delta(x_1 - x_2) G_1(x_3) + i\delta(x_1 - x_3) G_1(x_2). \tag{2.4b} \]

Equations (2.4a) and (2.4b) can be integrated to yield the formulas

\[ G_2(x_1, x_2) = -\tfrac{1}{2} i\lambda \int dx_4\, \Delta_F(x_1 - x_4) G_3(x_4, x_4, x_2) + \Delta_F(x_1 - x_2), \tag{2.5a} \]

\[ G_3(x_1, x_2, x_3) = -\tfrac{1}{2} i\lambda \int dx_4\, \Delta_F(x_1 - x_4) G_4(x_4, x_4, x_2, x_3) + \Delta_F(x_1 - x_2) G_1(x_3) + \Delta_F(x_1 - x_3) G_1(x_2). \tag{2.5b} \]

Equations (2.5) are illustrated diagrammatically in Fig. 1.

FIG. 1. Diagrammatic representation of Eqs. (2.5).

If we are working in perturbation theory, the first Green's function on the right of (2.5a) or (2.5b) will be required to one order lower than that on the left, since it contains an explicit factor $\lambda$. The second term on the right of (2.5a) is known explicitly, while that on the right of (2.5b) only involves $G_1$. Hence, if we construct the perturbation series order by order and, within each order, construct the functions $G_1$, $G_2$, ... successively, the right-hand side of (2.5) will be known in terms of previously calculated functions. We can therefore construct the entire perturbation series in this manner, and it is not difficult to see that we obtain the usual prescription for Feynman diagrams.
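The order-by-order construction just described can be illustrated with a toy model. The sketch below is an assumption-laden analogue, not the equations of this paper: it replaces the field by a single real variable with weight $\exp(-\mu^2\varphi^2/2 - \lambda\varphi^3/6)$ treated as a formal power series in $\lambda$, so that the propagator becomes a number `delta` $= 1/\mu^2$ and the hierarchy reads $G_m = \Delta[(m-1)G_{m-2} - \tfrac{1}{2}\lambda G_{m+1}]$ with $G_0 = 1$. The factors of $i$ and the space-time integrals of Eqs. (2.5) are absent, and the function name `green_coefficients` is hypothetical.

```python
from fractions import Fraction

def green_coefficients(max_order, max_n, delta=Fraction(1)):
    """Return g[k][n], the coefficient of lambda**k in G_n, for the toy hierarchy
    G_m = delta * ((m - 1) * G_{m-2} - (lambda / 2) * G_{m+1}),  G_0 = 1.
    """
    g = []
    for k in range(max_order + 1):
        # The term of order k in G_m needs order k-1 of G_{m+1},
        # so each lower order is carried one function further.
        top = max_n + (max_order - k)
        row = [Fraction(0)] * (top + 1)
        row[0] = Fraction(1) if k == 0 else Fraction(0)  # G_0 = 1 exactly
        for m in range(1, top + 1):
            # "Known" term: involves a lower function of the same order.
            term = (m - 1) * row[m - 2] if m >= 2 else Fraction(0)
            # Interaction term: a higher function, but one order lower.
            if k >= 1:
                term -= Fraction(1, 2) * g[k - 1][m + 1]
            row[m] = delta * term
        g.append(row)
    return g

g = green_coefficients(max_order=2, max_n=4)
for k, row in enumerate(g):
    print(f"order lambda^{k}:", [str(c) for c in row[:5]])
```

Within each order the loop over `m` builds $G_1$, $G_2$, ... successively, and every quantity on the right-hand side has already been computed, which is the point made in the text.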
In a field theory with a simple Lagrangian, such as the
$\lambda\varphi^3$ theory, it is sufficient to write down the first few
equations (2.4) and (2.5); the form of the subsequent
equations is then fairly obvious. When writing down
equations for gauge fields and performing manipulations
with them, however, it would be somewhat cumbersome
to proceed in this manner. We require a notation in


which the whole series of equations (2.4) can be simply displayed. In the remainder of the section we shall develop such a notation. We emphasize that we are doing nothing more than constructing a shorthand for expressing the equations satisfied by the Green's functions.

We shall work with the linear space of the totality of functions $C_0$, $C_1(x_1)$, $C_2(x_1, x_2)$, .... A typical vector in the linear space may be written

\[ \begin{pmatrix} C_0 \\ C_1(x_1) \\ C_2(x_1, x_2) \\ \vdots \end{pmatrix}. \tag{2.6} \]

The linear space is thus the sum of a series of subspaces

\[ \begin{pmatrix} C_0 \\ 0 \\ 0 \\ \vdots \end{pmatrix}, \quad \begin{pmatrix} 0 \\ C_1(x_1) \\ 0 \\ \vdots \end{pmatrix}, \quad \begin{pmatrix} 0 \\ 0 \\ C_2(x_1, x_2) \\ \vdots \end{pmatrix}, \quad \cdots. \]

The first subspace consists of a single vector, but each of the other subspaces is itself an infinite-dimensional subspace. Thus, the second subspace is the space of all functions of a single variable $x_1$, the third subspace is the space of all symmetric functions of two variables $x_1$ and $x_2$, and so on. We shall denote the vector (2.6) by the symbol $|C\rangle$. We are interested in the particular vector

\[ \begin{pmatrix} G_0 \\ G_1(x_1) \\ G_2(x_1, x_2) \\ \vdots \end{pmatrix}, \tag{2.7} \]

where $G_0 = \langle 0 | 0 \rangle = 1$, and the remaining G's are the Green's functions of our theory. We shall denote this vector by the symbol $|G\rangle$.

We now define an operator $\phi(x)$ acting in this space as follows:

\[ \phi(x) \begin{pmatrix} C_0 \\ C_1(x_1) \\ C_2(x_1, x_2) \\ \vdots \end{pmatrix} = \begin{pmatrix} C_1(x) \\ C_2(x, x_1) \\ C_3(x, x_1, x_2) \\ \vdots \end{pmatrix}. \tag{2.8} \]

The variable x in the vector on the right is regarded as a fixed parameter, while $x_1$, $x_2$, ... are the variables corresponding to our linear space.

To avoid misunderstanding we must emphasize that the operator $\phi(x)$ corresponds to the quantum-mechanical operator $\varphi(x)$, but that it is a totally different type of operator acting on a totally different type of space. Once we have expressed the field theory in terms of Green's functions, we need no longer consider the quantum-mechanical Hilbert space; the whole theory has been expressed in terms of the c-number equations (2.4). In order to express these c-number equations in a simple way, we have defined a new linear space on which the operators $\phi(x)$ act. The $\phi$'s do not satisfy the quantum-mechanical commutation rules. In fact, all the $\phi$'s and their space and time derivatives commute with one another. The linear space of vectors $|C\rangle$ is thus a totally different space from the quantum-mechanical Hilbert space of vectors $|\ \rangle$.

It would be inconvenient if we had to display equations such as (2.8) whenever we used them, and, in fact, there is a standard notation for expressing (2.8) in a more compact form. This notation is expressed in terms of vectors in the dual space. A vector in the dual space is written in the form $\langle H|$, and is defined by means of its scalar products $\langle H | C \rangle$ with all vectors in our space $|C\rangle$. The scalar products must depend linearly on $|C\rangle$. We define the vectors $\langle H_0|$, $\langle H_1(x_1)|$, $\langle H_2(x_1, x_2)|$, ... in the dual space by the equation

\[ \langle H_0 | C \rangle = C_0, \quad \langle H_1(x_1) | C \rangle = C_1(x_1), \quad \langle H_2(x_1, x_2) | C \rangle = C_2(x_1, x_2), \quad \cdots, \tag{2.9} \]

C being the general vector (2.6). There is a single vector $\langle H_0|$, a vector $\langle H_1(x_1)|$ for each value of $x_1$, a vector $\langle H_2(x_1, x_2)|$ for each combination of variables $x_1$, $x_2$, and so on. A vector in the original space is uniquely defined by its scalar products with all vectors $\langle H_0|$, $\langle H_1(x_1)|$, $\langle H_2(x_1, x_2)|$, ... in the dual space.

We next construct the dual space of the space of vectors $\langle H|$. If $|C'\rangle$ is a vector in this space, it is defined by its scalar product with all vectors $\langle H|$:

\[ \langle H_n(x_1, \cdots, x_n) | C' \rangle = C_n'(x_1, \cdots, x_n). \]

However, the totality of functions $C_0'$, $C_1'(x)$, $C_2'(x_1, x_2)$, ... define a vector in the original space with which we started. There is thus a one-one correspondence between our original space $|C\rangle$ and our new space $|C'\rangle$, which is such that corresponding vectors have identical scalar products with all vectors $\langle H|$. The space of vectors $|C'\rangle$, which is the dual of the space of vectors $\langle H|$, may therefore be regarded as identical to our original space of vectors $|C\rangle$.

Given any operator O acting in the dual space, we can define an operator O acting in the original space as follows:

\[ \langle H | \{ O | C \rangle \} = \{ \langle H | O \} | C \rangle \tag{2.10} \]

for all vectors $\langle H|$ in the dual space. We can therefore use the notation $\langle H | O | C \rangle$ to express the scalar product. Equation (2.10) can also be used to define an operator in the dual space once the operator in the original space is known.

It is now easy to express $\phi$ as an operator in the dual space. By (2.9) and (2.8)

\[ \langle H_0 | \{ \phi(x) | C \rangle \} = C_1(x), \quad \langle H_1(x_1) | \{ \phi(x) | C \rangle \} = C_2(x, x_1), \quad \langle H_2(x_1, x_2) | \{ \phi(x) | C \rangle \} = C_3(x, x_1, x_2). \tag{2.11} \]


Now, by (2.9), We now use (2.14) and (2.15) to express the vectors
(Bl(4 I c) = Cl(d I (Hz(xl,~JI , ( H , ( x i , x ~ ,1,~ )and (Ho Ia(x1- X P ) in terms
(HZ(X4 1 0=Cz(z,xd, (2.12) of the single vector (HI(-) I :

( E 3 ( x , x I 2 X 2 ) IC) = c~(x,zl,x*). (Hn(%l,zz)1 =(FZ1(x2) /6(Xl) (2.2oa)


From (2.11) and (2.12), we condude that @3(xi,xq~z) 1 =(Ht(xz) I$(xi), (2.20b)
(noIB(.)=@ 1 ( 4 I , (a,16(2l-x2)=(Hl(XZ)jq(Xl). (2.204
(Hl(X1) I&4= Wz(x,x1) I , (2.13) Equation (2.19) therefore becomes
( Pe(xI,Z%) Id(~)=(Hs(x,zI,xz)1 ,
(El ?-pa)(HikJ I&dIG)=3Wi(~22)1@(x$ IG)
since IC) is an arbitrary vector. For the vector fi(Hl(%z)I dx1) la. (2.21)
(Iln(xl. .x,) 1, Eq. (2.13) states that
7

Since the operator (012-pz) in the first term of (2.21)


(Hn(X1. * ~ n ) l d ( % ) = ( a n + z-( .~x, ~
J l~, . (2.14) acts only on the variable XI, it may be taken inside the
scalar product. Thus
$\langle H_1(x_2)|(\Box_1^2-\mu^2)\phi(x_1)|G\rangle = \tfrac{1}{2}\lambda\langle H_1(x_2)|\phi^2(x_1)|G\rangle + i\langle H_1(x_2)|\eta(x_1)|G\rangle.$ (2.22a)
Similarly, Eq. (2.4b) may be written in the form
$\langle H_2(x_2,x_3)|(\Box_1^2-\mu^2)\phi(x_1)|G\rangle = \tfrac{1}{2}\lambda\langle H_2(x_2,x_3)|\phi^2(x_1)|G\rangle + i\langle H_2(x_2,x_3)|\eta(x_1)|G\rangle.$ (2.22b)
The general Eq. (2.4) can be obtained by replacing the vectors $\langle H_1|$ and $\langle H_2|$ in (2.22) by the general vector $\langle H_n|$:
$\langle H_n(x_2 \cdots x_{n+1})|(\Box_1^2-\mu^2)\phi(x_1)|G\rangle = \tfrac{1}{2}\lambda\langle H_n(x_2 \cdots x_{n+1})|\phi^2(x_1)|G\rangle + i\langle H_n(x_2 \cdots x_{n+1})|\eta(x_1)|G\rangle.$ (2.22c)
Since the vector $\langle H_n(x_2 \cdots x_{n+1})|$ in (2.22c) is arbitrary, we may rewrite the equation as an equation for the vector $|G\rangle$:
$(\Box_1^2-\mu^2)\phi(x_1)|G\rangle = \tfrac{1}{2}\lambda\phi^2(x_1)|G\rangle + i\eta(x_1)|G\rangle.$ (2.23)
We have thus replaced the infinite set of equations for the Green's functions by a single equation for the vector $|G\rangle$.
We can easily integrate (2.23) to give the result:

The notation of the dual space therefore allows us to express (2.8) in the compact form (2.14), and we shall use this notation in the remainder of the paper. We should emphasize that the use of the dual space does not involve any deep mathematics, but is purely a concession to the printer. It would be perfectly possible to rewrite every subsequent equation in this paper without using the dual space. Every equation of the form (2.14) would then be replaced by an equation of the form (2.8).
We now define a new operator $\eta$, which we shall require when writing the right-hand side of (2.4). It is defined as follows:
$\langle H_n(x_1,x_2,\ldots,x_n)|\eta(x) = \sum_{r=1}^{n} \langle H_{n-1}(x_1,x_2,\ldots,[x_r],\ldots,x_n)|\,\delta(x-x_r).$ (2.15)
The vector $\langle H_{n-1}(x_1,x_2,\ldots,[x_r],\ldots,x_n)|$ denotes the vector which is obtained from $\langle H_n(x_1,x_2,\ldots,x_r,\ldots,x_n)|$ by removing the variable $x_r$. From (2.14) and (2.15) it is easily seen that $\eta$ and $\phi$ obey the commutation relations:
$[\eta(x_1),\phi(x_2)] = -\delta^4(x_1-x_2).$ (2.16)
Furthermore, from (2.15),
(~oIt)(4=0. (2.17) (&%) +qa/dJta,(x- d)B(d)
Equations (2.16) and (2.17) are sufficient to determine
$\eta$, since $\langle H_{n+1}|\eta$ can be found from (2.14) and (2.16)
once (H,I q I is known. We may therefore regard (2.16) -~x:*,(x-~)t)(z))IG)=~. (2.24)
and (2.17) as the definitions of $\eta$; we can then easily obtain (2.15).
Having defined the operators $\phi$ and $\eta$, we can express the Green's functions equations in a compact form. Our Eq. (2.9), which defines the vectors $\langle H_n|$, shows that
$\langle H_n(x_1 \cdots x_n)|G\rangle = G(x_1 \cdots x_n).$ (2.18)
Hence Eq. (2.4a) may be written as follows:

Equation (2.24) is of course equivalent to the series of Eqs. (2.5) in our new notation.
It is worthwhile noticing that Eq. (2.23) has the same form as the field Eq. (2.1), except for the term $i\eta(x)$. This last term corresponds to the $\delta$-function terms in the equations for the Green's functions.
We can relate our notation to Schwinger's functional notation by making the correspondence
(0i 2 - / 4 ( ~ z ( x 1 , dI G ) = $ W d ~ i , x l , 4IG)
.
9-&(XI- ~ l( I )l o I G ) (2.19) $(%) s/st)(x> 7 * q(%)1


and regarding the operators as acting on the Schwinger functional. We have chosen to define $\phi$ as our fundamental quantity and to define $\eta$ in terms of it, whereas Schwinger proceeds in the reverse direction. The reason why we have set up a new notation is that quantities similar to $\phi$ may easily be defined in the path-dependent formalism, whereas the quantities analogous to $\eta$ are not quite so simple. Our notation is therefore more easily applicable to the theories treated in this paper.

We shall always use indices from the beginning of the Greek alphabet to denote components in isotopic space, indices from the middle of the Greek alphabet to denote components in ordinary space. Note that $F$, being a charged field, is path-dependent.
The path-dependence equation is a straightforward generalization of (3.5):
As usual, $\delta_z F_{\mu\nu}^{\alpha}$ is the change of the variable $F_{\mu\nu}^{\alpha}$ caused by a change in the path by an infinitesimal area $\sigma_{\rho\sigma}$ at the point $z$. The path $P'$ is that portion of $P$ leading to $z$. Equation (4.2) requires the following consistency condition, which is analogous to the homogeneous Maxwell equations:

The equal-time commutators between the $F$'s will contain terms analogous to (3.3) and (3.4). Thus
[F,j"(x,t,P),F;,,'(y,l,P31= O (4.3a)
CF~"(x,f,P),F,ks(Y,f,P')]
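The dual-space operators of Sec. 2 can be mimicked on a finite set of points, which makes the content of (2.14)-(2.17) easy to check by hand. The sketch below is only an illustration under stated assumptions: a basis vector $\langle H_n(x_1 \cdots x_n)|$ is represented by a sorted tuple of integer point labels, a general vector by a dictionary of coefficients, and the Kronecker delta on labels stands in for $\delta^4(x-x_r)$. The function names `act_phi`, `act_eta` and `commutator_on` are hypothetical helpers, not notation from the paper.

```python
from collections import defaultdict

def act_phi(bra, x):
    """Right action of phi(x): <H_n(x1..xn)| phi(x) = <H_{n+1}(x, x1..xn)|, cf. (2.14)."""
    out = defaultdict(int)
    for points, coeff in bra.items():
        # The vectors are symmetric in their arguments, so store points sorted.
        out[tuple(sorted(points + (x,)))] += coeff
    return dict(out)

def act_eta(bra, x):
    """Right action of eta(x): sum over r of delta(x, x_r) <H_{n-1}(.. [x_r] ..)|, cf. (2.15)."""
    out = defaultdict(int)
    for points, coeff in bra.items():
        for r, x_r in enumerate(points):
            if x_r == x:  # Kronecker delta standing in for delta^4(x - x_r)
                out[points[:r] + points[r + 1:]] += coeff
    return dict(out)

def subtract(a, b):
    """Difference of two vectors, dropping zero coefficients."""
    diff = {k: a.get(k, 0) - b.get(k, 0) for k in set(a) | set(b)}
    return {k: v for k, v in diff.items() if v != 0}

def commutator_on(bra, x, y):
    """<bra| (phi(y) eta(x) - eta(x) phi(y)); operators act on the bra in the order written."""
    return subtract(act_eta(act_phi(bra, y), x), act_phi(act_eta(bra, x), y))

# A sample vector: <H_3(1,2,2)| + 5 <H_1(3)|
bra = {(1, 2, 2): 1, (3,): 5}

same = commutator_on(bra, 2, 2)        # equal points: reproduces the vector itself
different = commutator_on(bra, 2, 3)   # distinct points: vanishes
vacuum = act_eta({(): 1}, 2)           # <H_0| eta(x) = 0, cf. (2.17)
```

On every vector, $\phi(y)\eta(x) - \eta(x)\phi(y)$ acts as multiplication by the delta function, which is the statement $[\eta(x),\phi(y)] = -\delta(x-y)$ of (2.16) in this discretized setting.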

4. YANG-MILLS FIELD
The massless Yang-Mills field appears to possess all the essential complications of the gravitational field while lacking some of the algebraic complications. It is therefore instructive to consider this field before going on to the gravitational field. We shall treat a self-interacting Yang-Mills field, since interaction with other fields does not introduce any new features.
The path-dependent formalism for the Yang-Mills field has been examined by Bialynicki-Birula. The procedure followed is analogous to that used for the electromagnetic field, with the difference that the Yang-Mills field plays the dual role of the gauge field and the charged field. The field equations are simpler in appearance than the Maxwell equations of electrodynamics, since there is no additional current term. They take the form
$\partial F_{\mu\nu}^{\alpha}(x,P)/\partial x_{\mu} = 0.$ (4.1a)

It is not difficult to check that Eqs. (4.1)-(4.3) are consistent with one another and with Lorentz transformations. In fact, the equations of motion and commutation relations may be derived from the Lagrangian

One may define path-dependent Green's functions in the usual way. As in the electromagnetic case, it is necessary to include $\delta$-function terms if the Green's functions are to be covariant. The definitions are therefore as follows:
10 B. Zumino, J. Math. Phys. 1, 1 (1960).

175 ELECTROMAGNETIC AND YANG-MILLS FIELDS 1595

Higher Green's functions may be defined in a similar way.
The field equations (4.1a) may be rewritten as equations for the Green's function, analogous to Eqs. (3.8) for the electromagnetic Green's functions. Thus the two-point Green's function satisfies the equation

and we need not explain it in detail again. We construct the linear space of the totality of all functions $G_{\mu\nu\cdots}^{\alpha\cdots}(x_1,P_1,\ldots)$. We then construct the dual space and define the vector $\langle H_{\mu\nu\cdots}^{\alpha\cdots}(x_1,P_1,\ldots)|$ in the usual way. We next define the operators $F_{\mu\nu}^{\alpha}(x,P)$ by the equations

We also define an operator $U(x,P)$ which corresponds to the right-hand side of (4.5):
The four-point Green's function satisfies the equation (R<X,...@.-(x1,PJ,*. I Ur"(X,P)
a
~~~~r..ps,rX,ru(x1P1,X92,~rs,x~p4)
ax,
a
4az2,,
-sm.--ap.
7 a%<
6ola84(~1-~)Gya=X,ro(ZpPa,29$

+two similar terms with 2 c-)3 , 2 t)4


X(a,,,...,,,,...8...[.rl...(
X6,,64(x-x,)-g c
Xl,Yl,. . .[xr,P,].

E.qX 1; d.5
. .) I

-ig%&
La dtJ4((x1- UGw'pr ,A . r u ( 2 9 1 , d ~ , d ' 4 )

+two similar terms with 2 ++ 3, 2 ++ 4. (4.5b)


x (.x.~...B.,.iq16...(xl,p1,...)
where 7 and are the indices corresponding to the co-
164(~,--),

ordinates x,, and the superscript [7]6 in (4.8) indicates


(4.8)

that $\gamma$ is to be replaced by $\delta$. The operator $U$ may be defined by its commutation relations

The right-hand side of (4.5b) is obtained by differentiating the time ordering and applying the commutation relations (4.3). Higher Green's functions will satisfy equations similar to (4.5b).
Equation (4.1b) implies that the Green's functions satisfy equations such as
a
C,,,-C~~,,.~(Z~,P~,X,,~~>
=0. (4.54
8x4
Note that the right-hand side of (4.9) has terms corresponding to the right-hand sides of both (3.19a) and (3.19b); this is because the variable $F$ in the Yang-Mills field plays the roles of the gauge field and the charged field. Equation (4.9) must be supplemented by the equation
$\langle H_0|U_{\mu}^{\alpha}(x_1,P_1) = 0,$ (4.10)

The path-dependence equations satisfied by the Green's function can be obtained from (4.2) by following reasoning identical to that used for the electromagnetic field. Again the term arising from the variation of the time-ordering cancels against the term obtained from the $\delta$ function in the Green's function [the second term on the right of (4.4b)], and the final result is

1
61,~a$.,P.(xi,PI,Xg,P~)=gbyr d ~ r ) i ( X ~ ) G ' B ~ p ~ . p ~ . x ~

X(Xl,P1,X2,P2,X*,PO. (4.6)
to complete the definition of U.
In our present notation, the field equations (4.5)
become

In this equation, $\delta_z$ is the variation of $G$ caused by a variation of the path $P_1$ at the point $z$, and $P_1'$ is the portion of $P_1$ leading to $z$. The higher Green's functions satisfy similar path-dependence equations.

The path-dependence equation (4.6) is

where $P'$ represents the portion of $P$ leading to the point $z$.

Condensed Notation
The condensed notation which we shall use is very similar to that used in the two previous sections,


Auxiliary Variables

Another symbol which it will be convenient to introduce is the following:
Following the procedure used in electrodynamics, we
shall attempt to express our path-dependent quantities
$F$ in terms of auxiliary path-independent quantities $A$.
The formulas relating the $F$'s to the $A$'s will be the
~ w ( ~ ' , x=, &y+g
~ ) W r
I:* d t J . f ( t ) +ga%%60.

same as the formulas relating the field variables $F$ to the potentials $A$. Thus, following the results of Ref. 5, we make the connection as follows:
xLd!L ~ b ' ~ A ' ) A p 6 ( * )..+, (4.17)

where x' is a point on the path P. In other words,


~ , , . u ( x , ~=
) Vq(ZJ').fpv7(x) , (4.W V,,(x',x,P) is defined in a similar way to V,,(s,P)
where except that all integrals are taken from d to z instead
aA,.(x) aA,y(x) of from - to x. The following relation may be verified
Q)

j#P(x)==---- directly:
ax, a% V.q(x,P)=Vu6(d,P')v8,7(%',5P) (4.18)
where as usual $P'$ is the portion of $P$ leading to $x'$.
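The composition rule (4.18) can be checked numerically once $V$ is realized as the adjoint (isospin rotation) matrix of a path-ordered product of SU(2) elements, in the spirit of the relation $W^{+}\tau_{\alpha}W = V_{\alpha\gamma}\tau_{\gamma}$ quoted as (4.15). The sketch below rests on assumptions that are not taken from the paper: the path is cut into a few straight segments, each segment contributes a factor $\exp(\tfrac{i}{2}a\cdot\tau)$ with a made-up 3-vector $a$ playing the role of $g A_{\mu}^{\alpha}\,d\xi_{\mu}$, and later segments are multiplied on the right. The overall sign and normalization conventions of $W$ are therefore hypothetical.

```python
import math

I2 = [[1, 0], [0, 1]]
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]  # Pauli matrices

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def segment(a):
    """exp(i a.tau / 2) for a real 3-vector a, using cos + i sin (n.tau)."""
    theta = math.sqrt(sum(c * c for c in a))
    if theta == 0.0:
        return I2
    n_tau = [[sum(a[k] / theta * TAU[k][i][j] for k in range(3)) for j in range(2)]
             for i in range(2)]
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c * I2[i][j] + 1j * s * n_tau[i][j] for j in range(2)] for i in range(2)]

def ordered_product(segments):
    """Path-ordered product: factors from the beginning of the path stand to the left."""
    w = I2
    for a in segments:
        w = matmul(w, segment(a))
    return w

def adjoint(w):
    """V[alpha][gamma] = (1/2) Tr(tau_gamma W^+ tau_alpha W)."""
    wd = dagger(w)
    v = []
    for alpha in range(3):
        m = matmul(matmul(wd, TAU[alpha]), w)
        row = []
        for gamma in range(3):
            t = matmul(TAU[gamma], m)
            row.append(((t[0][0] + t[1][1]) / 2).real)
        v.append(row)
    return v

# Hypothetical segment data for a path cut at an intermediate point x'.
path = [(0.3, -0.1, 0.2), (0.05, 0.4, -0.25), (-0.2, 0.1, 0.15), (0.1, 0.1, -0.3)]
v_full = adjoint(ordered_product(path))        # V(x, P)
v_head = adjoint(ordered_product(path[:2]))    # V(x', P')
v_tail = adjoint(ordered_product(path[2:]))    # V(x', x, P)

composed = matmul(v_head, v_tail)
max_composition_error = max(abs(v_full[i][j] - composed[i][j])
                            for i in range(3) for j in range(3))

gram = matmul(v_full, transpose(v_full))
max_orthogonality_error = max(abs(gram[i][j] - (1.0 if i == j else 0.0))
                              for i in range(3) for j in range(3))
```

Both errors vanish to rounding accuracy: the matrix for the whole path factorizes into the matrix for the portion up to the intermediate point times the matrix for the remainder, and each such matrix is orthogonal.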
If we differentiate (4.13c) with respect to the end-point of the path of integration, we obtain the equation
a
-Vn,(x,P)=g,ad,'(x)V.s(x,P). (4.19)
a5
As we shall see below, this equation will enable us to express derivatives of path-dependent functions in terms of derivatives of auxiliary functions. Equation (4.19) can be written in integral form,
* * * . (4.13~)
XA$'([')A,,r($)+
If we compare (4.13) with (3.24) and (3.34), we observe that we have to take the curl of $A$ and multiply it by the function $V$ in order to obtain the path-dependent
Vny(%P)= &fg%b
1: dfflA~'(f)Vm6(r!,P)
1 (4.20)

where we have used the boundary condition $V_{\alpha\gamma} = \delta_{\alpha\gamma}$ when $g = 0$ or when $x \to -\infty$. By expanding (4.20) in a power series in $g$, we recover the definition (4.13c). Equation (4.19) or (4.20) may therefore be taken as a definition of $V$ in place of (4.13c).
We can now show that the definitions (4.13) do lead to the correct path-dependence equations (4.12) for $F$, and we have carried out the algebra in Appendix A. The formulas (4.13) can therefore be used to define path-dependent quantities in terms of auxiliary quantities.

function $F$. Again, this is because the Yang-Mills field is both a gauge field and a charged field.
Before going further it will be useful to obtain two identities satisfied by the function $V$. For this purpose it is convenient to obtain an alternative definition of $V$, due to Bialynicki-Birula.⁶ We introduce the following unitary matrix in the Pauli spin space:
The operator q will be defined in exactly the same
where the matrices 7. are the Pa& matrices. The way as the analogous operators were defined in the two
symbol L indicates that the T'S are to be ordered from preceding sections:
the beginning to the end of the path when expanding the
[tl,,"(~J,A2(~2)]=- ~&,,~'(xI--sz), (4.21)
exponential. I t is then not difficult to see that
(A01 1,%4 = 0. (4.22)
W+(x,P)r,W(x,P) = V,,(X,P)T,. (4.15)
We require an expression for the corresponding path-
From (4.15) we can derive the identities
dependent quantity U."(x,:,F), defined by (4.9) and
V,,V,,,= v,l?aV7,P 67,. , (4.16a) (4.10), in terms of the auxiliary variables. In electro-
dynamics the quantity U(lc,P) was equal to V(x,P)
ys,VsaVeg=fup7Vh, (4.16b)
X q ( x ) , and this suggests that the Yang-MiUs operator
cis VnaVpgl= sp7 V T r . (4.16~) V,m(x,P) might be given by a similar equation:
For future reference we shall add the following trivial U p u ( ~ , PV)n~7 ( z $ ' ) ~ v ~ ( ~ ) . (4.23)
identity involving the 2s; We shall verify (4.23) in Appendix A. The equality be-
,p,ua.+ 64&ay+ eu~eEa76=0 (4.164 tween the two sides of (4.23) is to be interpreted in the

sense that they both have the same commutation rela- If we compare (4.24) with (3.36) we notice that terms
tions (4.9) with the operators F,,fl(x',P'). corresponding to the right-hand side of (3.36a) and
Once we have defined the operators ANa(& we (3.364 appear on the right of (4.24). This is once more
can construct our enlarged dual space of vectors due to the fact that the Yang-MiUs field plays the dual
(Jl,,..:*..(q,** * ) I . We can then construct the linear role of gauge field and charged field.
space of vectors IG) and can define auxiliary, path- Let us tirst investigat_ehow the operators defined in
independent Green'_sfunctions G. Ths path-dependent (4.13) transform when A undergoes the transformation
Green's functions G can be expressed in terms of the (4.24). From (4.13b) one can verify directly that
G's by formulas analogous to (3.29). We shall not give
the details, which are the exact analog of the cor- fwa(x) +~~:;"(Bfhg-8,X,(X)K,(X). (4.25)
responding details in electrodynamics. We can also
write equations similar to (3.32) for the operators 7. The function 'b transforms in a similar way:

Gauge Transformations V a v ( ~ , p--$


) VuT(x$')+&*sXe(x) Vaa(x,P). (4.26)

The gauge transformations are given by the equation

The easiest way of verifying (4.26) is to show that (4.19), which may be taken as the defining equation for $V$, remains invariant when $A$ undergoes the transformation (4.24) and $V$ the transformation (4.26). Under such a transformation, the two sides of (4.19) transform as follows:

from (4.16d). Comparing (4.27) and (4.28), we observe that the changes in the two sides of Eq. (4.19) are the same, so that (4.19) is invariant under the transformation (4.24), (4.26). Thus the change in $V$ is given by (4.26).
We can now find the change in P as defined by
(4.13): The integral hxd3'Y#(y)x.36'),when commuted with
&-(x), does give (4.24). We thus conclude that the
integral JdyU&y)x&) commutes with all our path-
P p v a ( x , ~ -*
) ~,"u(x,P)+Xge7s,XtG) Vas(z,P)fpiy(x) dependent variables P,,"(x,P). Furthermore, since V
+Age 7 6 VU., (X,P)X (x ) f P pd( x) [from (4.25) and undergoes the transformation (4.26), we conclude that
(4.2611
=P,,a(x,Y).

The variable PWa(z,P)is therefore invariant under the = -g6,8&6(%) vae(S,P). (4.30)
transformation (4.24), and we are justified in callkg Field Equations
it a gauge transformation.
To define a generator of the gauge transformation, Our aiin is now to express the field equations (4.11)
we construct the operator as equations lor the auxiliary variables. The first term
of (4.11) is easily transformed:

J
a
=--v,,(x,P)j#y(.) The factor V,, in the second term of (4.34) is still not
ax, in front of the other factors, but we can move it into
this position by using the cornmutation relation (4.30)
= V&, ~ ) -a. f ; , ~ ( x ) + y 6 ~ ~ u ~ ( x ) ~ ~ ~ ( x ) ~ ~ , ~The
( ~ )
equation then becomes

s
ax,
[from (4.19)] u=(%,n = Ya,(~,P)l)?-+9+ d Y ~ ~ Y ( ~ , ~ ) ~ B ~ ) x Y

We have seen that the second tcnn --iUpQ(@) of e,y(x) = l).y(x)+ d r Y~(r)xya(x,r)+g.,ax~(z,x)
(4.11) is equivalent to the operator $-iV_{\alpha\gamma}(x,P)\eta_{\mu}^{\gamma}(x)$ in the sense that they both have the same commutation relations with the operator $F_{\mu\nu}^{\beta}(x',P')$. We may therefore be tempted to rewrite the field equations (4.11) as follows:
+gy6&6(x,%) , (4.36)
from (4.29).
We can thus generalize the field equations (4.32)
to read

However, we shall show below that the consistency of


Eq. (4.32) requires that its last term satisfy a divcrgence )IG)=O,
+ y ~ ~ ; 6 ( x ) ~ ; ~ ~ ( ~ ) - ~ e , y ( ~ ) (4.37)

condition similar to the corresponding condition in the


Maxwell equation of electrodynamics, and we shall with 0 given by (4.36). If we multiply (4.37) by Vuy,,
have to generalize it if the condition is to be satisfied. sum over a,and apply (4.16a), we obtain the equation
We shall follow the procedure used in electrodynamics a
and shall make use of the fact that the commutation
relations (4.9), which define the operator U.(x,P)
uniquely in the original linear space, do not define it
L,,
-f,,IY(x)+ty6d116(2)j,,r(x)-ie..(x)
) JG)=O. (4.38)

The path dependence has been removed from (4.38), and we shall adopt it as our field equation. By taking the gauge-invariant derivative

uniquely in the enlarged linear space. We employ this freedom to find a definition of $U_{\mu}^{\alpha}(x,P)$ which gives consistent field equations. We begin by writing

s
Uv(Z,P)=Vay(Z,P)%~(x)+ dy YB(Y)XUpI(X,Y) (4.33)
of the factor within the parentheses we can easily show
where x is arbitrary. Since the second term commutes that the last term must satisfy the consistency condition
with eveiy gauge-invariant operator, the right-hand
side of (4.33) maintains the correct commutation rela-
(4.39)
tions (4.9). All the terms in the equation of motion
(4.32) have a factor Vu7(x,P) in front of them, and it
will be convenient for us if the last term in (4.33) also We have t o choose the [unction x in the deiinition
such a factor. We therefore define (4.36) of 6 so that (4.39) is satisfied.
In order to orient ourselves we shall first find a func-
tion eye,of the form qraf&a, which satisfies (4.39). The

term $\theta^{(1)}$ will not have precisely the form of the second term of (4.36), but we shall then be able to modify it so that all conditions are satisfied. The following function clearly satisfies (4.39):

FIG. 3. Diagrammatic representation of the equation for the two-point Green's function in the Yang-Mills theory.

where
where the matrices I and ED represent the symbols
6, and em@,, considered as matrices in a and y. The
superscript (1) on $\theta$ indicates that it is not our final definition of this operator. Equation (4.40) can be rewritten without using reciprocals of operators as follows:

+gEuBByE~a*ApB(s)Ar'(Z)A.((2). (4.46)
By integrating (4.45) in the usual way and wing (4.43)
for e, we obtain the result

The right-hand side of (4.41) resembles that of (4.36). The differences are, first, that the operator $\eta$ in the second term of (4.41) is ordered to the right of the other operators whereas it should be ordered to the left, and second, that the last term of (4.36) is missing.
Let us therefore change (4.42) to bring it into the correctwhere 0 is defined by (4.42). We have omitted the
form (4.36): middle term of (4.43), as it is a pure divergence. If we
wish we may generalize (4.47) by replacing the pro-
pagator 6,&Ns-4 by [s,~-c(a~/axaz,>O-a]3A\,
x(z-x'); we then obtain other gaugas such as the
Landau gauge.
(4.43)
with the function defined by (4.42) left unchanged. In Appendix B we shall calculate the value of the expression

and shall show that it is zero, so that (4.43) is a permissible, consistent choice for the function $\theta$.

If the last term of (4.47) had been absent, we would have obtained Feynman rules similar to those for electrodynamics. The equation for the Green's function could then be represented graphically as in Fig. 3, without the second-last diagram. For simplicity we have exhibited the equation for the two-point Green's function; equations for higher Green's functions can be similarly represented. The three- and four-point vertices have the following factors associated with them:

Rules for Feynman Diagrams ~ a ( p i , ~~s&;


, ~ ; P.X,Y,P)
=i ( 2 . . ) ' e d B , r ( P z - P 3 ) ~ 6 " ~ ~ ( P r P 1 ) . 6 , ,
We can rewrite Eq. (4.38) in the form
f(P1--PZ)P~,1 (4.48)
Vl(PIP,kG P&v; PalY,P; P4,6,,J)
= - (2A)4.B.B.,I(~ppbr~-b"rsrp)
+gj."(x)-&(x)
1 [G)=0 , (4.45) - (2X)4egy,f.B1(4ys,ipO-
- (2d*G78&,'*(6.A##-
S,&J.
6p~b.p)
(4.49)

One could then construct Feynman diagrams by ar- vertices: w(Pla,p&u,p8y) = - ( 2 ~ ) ~ g g a , ~ ,;p ~ ,(4.53b)
ranging the vertices (4.48) and (4.49) in all possible
ways. In fact, Eq. (4.47) without the last term is an over-all factor - 1. (4.53c)
identical to the equation we would have obtained by
starting from the Lagrangian In (4.53c), the quantities $1, a and pa, y refer to the
dashed lines, the quantities p2, @u to the solid lines
representing the Yang-Mills quanta. We notice that
the vertex factor is not symmetric in the two dashed
lines; it involvcs a factor p,, but no factor PI,,. I t is for
this reason that we have drawn arrows on the dashed
lines in Fig. 3. The factor p ~ in. (4.53b) is associated
with the line directed away from the vertex.
and writing down "naive" Feynman rules in the usual way.
The presence of the last term in (4.47) shows that the naive Feynman rules are not correct and that there are additional terms in the perturbation expansion. From (4.42), we can expand the function defined there as a perturbation series in $g$ as follows:

The prescription for constructing Feynman diagrams is therefore to draw three-particle and four-particle vertices with factors (4.48) and (4.49), and also polygons with any number of dashed lines and with factors (4.53) associated with them. The three- and four-point vertices, as well as the vertices of the polygons, are then to be joined by solid Yang-Mills lines in all possible ways.
The Feynman rules for our theory are the same as those for a theory with fictitious scalar particles as well as the Yang-Mills particles. The Feynman diagrams contain three- and four-point vertices involving the
a Pang-Mills lines alone. The factors (4.48) and (4.49)
x-~AF(Z1-R)ig4r*Apr(X2). .
' CB,~&'(X~)
are associated with these vertices. In addition, the
ax,,
a 1 diagrams contain vertices involving two scalar lines
X- -Ap(xn-y). (4.51) and one Yang-Mills line. Associated with such vertices
ax,. 2 are the factors (4.53b). There is a further factor -1
associated with each closed loop of scalar particles. The scalar lines only occur as internal lines and only in closed loops.

When (4.51) is substituted in the last term of (4.47), we obtain the result
Note added in manuscript. Faddeev and Popov (unpublished) have shown that their functional-integration prescription can be related to Schwinger's formulation of the Yang-Mills theory.⁶ This therefore provides an alternative derivation of the Feynman rules from a quantized field theory. Faddeev and Popov have restricted themselves to Landau gauge.
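A sum over closed polygons with any number of sides, as in the extra loop contributions described above, has the generic structure of a trace-logarithm expansion, $\ln\det(1-gM) = -\sum_{n\ge 1} g^{n}\,\mathrm{Tr}(M^{n})/n$. The sketch below checks this identity numerically. It is not the paper's expression (4.52): the matrix `M` is a made-up 3x3 stand-in for "propagator times vertex", `g` is a hypothetical small coupling, and the helper names are illustrative only.

```python
import math

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def polygon_sum(m, g, n_max):
    """Truncated series -sum_{n=1}^{n_max} g^n Tr(M^n) / n (one term per n-sided polygon)."""
    total = 0.0
    power = identity(len(m))
    for n in range(1, n_max + 1):
        power = matmul(power, m)          # M^n
        total -= g ** n * trace(power) / n
    return total

# Hypothetical data: small enough that the series converges rapidly.
M = [[0.2, -0.5, 0.1],
     [0.7, 0.3, -0.4],
     [-0.6, 0.8, 0.5]]
g = 0.1

one_minus_gM = [[(1.0 if i == j else 0.0) - g * M[i][j] for j in range(3)] for i in range(3)]
exact = math.log(det3(one_minus_gM))
series = polygon_sum(M, g, 60)
```

The factor $1/n$ is the usual symmetry factor of an $n$-sided closed loop. Ordinary complex scalar loops would exponentiate to the inverse determinant, so a factor $-1$ per closed loop, as stated for the fictitious scalar particles, is what turns the sum into the determinant itself.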

The expression (4.52) has the form of an integral which occurs in Feynman diagrams, and the contribution (4.52) to (4.47) has been represented by the second-last diagram of Fig. 3. The summation sign represents the sum over polygons with any number of dashed lines, and corresponds to the summation over $n$ in (4.52).
The dashed lines and vertices are associated respectively
with the factors + A F ( ~ , - x + l ) and ige(a/dz,) in (4.52).
Thus, in momentum space, the following factors are
associated with the dashed lines and the vertices at
which they end:
i e.6
dashed lines: -- ; (4.53a)
(21b)4-g+ie

PHYSICAL REVIEW D VOLUME 2, NUMBER 12 15 DECEMBER 1970

S Matrix for Yang-Mills and Gravitational Fields


E. S. FRADKIN AND I. V. TYUTIN
Physical Lebedev Institute, Academy of Sciences, Moscow, U.S.S.R.
(Received 19 January 1970)

A method is suggested (and applied to the Yang-Mills and gravitational fields) for the construction of
the generating functional (S matrix) for fields possessing an invariance group. The unitarity and gauge
independence of the S matrix on the mass shell are seen explicitly.

I. INTRODUCTION

There has lately been considerable intensification in the study of theories partially or completely invariant under non-Abelian groups of transformations. This is in connection with the discovery of vector mesons and their classification into multiplets, with the use of vector mesons to account for the form factors of particles, and with the current-algebra approach. Intermediate vector bosons are introduced in many schemes of weak interaction. An important example of a theory with a non-Abelian group of invariance is that of the gravitational field.
In the present paper a procedure for constructing the Feynman rules is proposed for theories possessing a gauge group, such as the theories of the massless Yang-Mills and gravitation fields. It is known that some additional (gauge) condition must be imposed on the dynamical variables in order that a consistent quantum field theory may be formulated on the basis of a Lagrangian density invariant under a local transformation group. In covariant gauges this can conveniently be done by the use of Lagrange multipliers. The basic idea of the method proposed is to choose the Lagrange multiplier in such a way that one is led to free equations of motion for the additional field. This fact guarantees the unitarity of the S matrix in physical space. The Feynman rules obtained coincide with those proposed in Refs. 1-5. The difference of the method under consideration from that of Refs. 2-5 is that we have succeeded in obtaining a set of consistent dynamical equations completely describing the theory. On the one hand, these equations make it possible to elucidate the reason for the additional diagrams to appear, and, on the other hand, guarantee the unitarity of the physical S matrix. Section II is devoted to the construction of the S matrix for the massless Yang-Mills field in arbitrary gauge and to the proof of the gauge invariance of the S matrix. In Sec. III the construction of the Feynman rules in the Coulomb and axial gauges is considered on the basis of the canonical quantization procedure. The S matrix obtained coincides with that found in Sec. II, which fact reaffirms its unitarity. It is shown, furthermore, that taking the additional conditions consistently into account makes it possible to obtain self-consistent equations for the massless Yang-Mills field in the presence of an external source.
In Sec. IV the Feynman rules for the gravitational field are constructed in covariant gauges. These rules coincide with those suggested in Refs. 2, 3, and 5. In the framework of our approach we also obtain the S matrix for a noncovariant (Dirac) gauge, for which the Feynman rules have been obtained by Popov and Faddeev⁶ using a method closely connected with the canonical formulation of the gravitational field. In addition, by our method, the equivalence of the S matrix in covariant and noncovariant gauges is proved.
We use the following notation. Greek $\mu, \nu, \lambda, \ldots$ and the Latin $i, j, k$ indices take the values 0, 1, 2, 3 and 1, 2, 3, respectively. In Secs. II and III, $g_{\mu\nu}$ means the Minkowski tensor $(+,-,-,-)$ and $\delta_{ik}$ means the unit tensor. By the summation over repeated indices is everywhere meant $a_{\mu}b_{\mu} = a_0b_0 - a_kb_k$; $\partial_{\mu} = \partial/\partial x^{\mu}$; $\Box = \partial_{\mu}\partial_{\mu}$; $\nabla^2 = \partial_k\partial_k$. In Sec. IV, $g_{\mu\nu}$ means the metric tensor, and the Minkowski tensor is designated as $\delta_{\mu\nu}$. The usual summation over repeated indices means $a^{\mu}b_{\mu} = \sum_{\mu=0}^{3} a^{\mu}b_{\mu}$. We use the system of units $\hbar = c = 1$ in Secs. II and III and $c^3/16\pi k = 1$ in Sec. IV (where $k$ is the gravitational constant).

IV. CONSTRUCTION OF FEYNMAN RULES FOR GRAVITATIONAL FIELD

In this section we shall obtain the general rules for construction of the S matrix for the gravitation field, prove the gauge invariance of the S matrix, and in more detail consider two covariant gauges (the harmonic condition and its linearized form) and the Dirac¹⁷ noncovariant gauge. The Feynman rules found by us coincide with those of Refs. 2, 3, and 5.

1 R. P. Feynman, Acta Phys. Polon. 24, 697 (1963).
2 B. S. DeWitt, Phys. Rev. 162, 1195 (1967); 162, 1239 (1967).
3 L. D. Faddeev and V. N. Popov, Phys. Letters 25B, 30 (1967); V. N. Popov and L. D. Faddeev, ITP report, Kiev, 1967 (unpublished).
4 S. Mandelstam, Phys. Rev. 175, 1580 (1968).
5 S. Mandelstam, Phys. Rev. 175, 1604 (1968).
6 V. N. Popov and L. D. Faddeev (to be published).
17 P. A. M. Dirac, Phys. Rev. 114, 924 (1959).

2 S MATRIX FOR YANG-MILLS AND GRAVITATIONAL FIELDS 2851

The classical gravitational field is described by the I n the case K,.=g()) the first term in (4.8) should be
actionis written as

wo= i dxLo(x), (4.1) J d x ~(x)R,,~~G~x(x).

The second term in (4.8) is absent in the second-order


(4.9)

Lo=4(d-g)glp,. (4.2)
fonnalisrn. We use the following designations :
Here g,, is the metric tensor, g= detg,,, gcgvA= PA,and
R,, is the curvature tensor of second rank (Ricci tensor) G(z)IG(~)~(x) 6iVo/6g(B,,(x) , (4.10)
R,,= avr*sr-aur~,v+
rP,,ur~yp-r~,,yr~pu.
(4.3) GWo/Gg(~)(~),(4.11)
GPv(~)=G(B),v(x)=

is the Christoffel symbol GflvA(x)=6Wo/GIP,x(t). (4.12)


rrPA---
zgPU (a&uA+aAgov-aa,gvA). (4.4) Equation (4.12) exists in the second-order formalism
only. We find the differential operators R with the help
As we shall show later, the most convenient choice of of (4.6) and (4.7):
the concrete form of the dynamical variables depends
on the gauge condition. R , , A ~ R ( B ) , , ~(2P-
~ E l)d,g@)p,
The variables we shall use belong to the following
class:
+ +
2Pg B v A a p 2 a$ PA+ 2g ()P h (4.13)
g8 P Y=
- g Bg, and g(B)=g@gp. (4.5) for the case K,,v=g(~),v;
R,d= R (8)P vh=- (2P--I)a,gta,A++2Pg(8,Ad,
I n the general case we shall designate the variables
g(B)fly and g(D), as K,. The Einstein equations for the -26,~3~g(.9) 7- 26,g(s) yAdy (4.14)
gravitational field can be obtained from (4.1) by- two for the case K , v = g ( p ) P Y ;
methods :
I n the first-order formalism the expression (4.1) R ~ ~- a ~p P- h- ~ ~ , a 7 r ~ v A - ~ ~ P r ~ , A a 7
should be varied with respect to K,, and P , , A J considered +2aArcllY+ 2 r ~ , , ~ - -~ , a , a ~ .(4.15)
as independent variables. From the equations obtained Since the (~(x)are arbitrary, we obtain an important
by varying (4.1) with respect to P V A , the expression identit!. from (4.8):
(4.4) for P V A can be deduced.
I n the second-order formalism the variation should be R rV~G YA (~)+R P,xnG0,IA
. (2)~ (4.16)
made only with respect to $K_{\mu\nu}$. It is assumed in this case that (4.1) is expressed only in terms of $K_{\mu\nu}$ with the help of (4.4).
Both methods lead, of course, to the same field equations.
As is well known, (4.1) and (4.2) are invariant under the gauge transformations of $K_{\mu\nu}$ and $\Gamma^{\mu}_{\nu\lambda}$, the infinitesimal form of which is

Note that (4.16) is satisfied by arbitrary $K_{\mu\nu}$ and $\Gamma^{\mu}_{\nu\lambda}$. According to (4.16), four identities exist among the Einstein equations. As noted in Sec. II, this means that four additional (gauge) conditions should be imposed on $K_{\mu\nu}$ and $\Gamma^{\mu}_{\nu\lambda}$ for consistent construction of the quantum theory.
As in Sec. II, we shall use the method of Lagrange multipliers. Consider the class of gauges determined by
g (8) -3 g ( 8 ) g(B)- ?arg(B) + g ( B ) 1 y a 7 ~ the function
+g(B)~arE-2Pg(B)~aYEY, (4.6) #P=#P(~; K j r ) * (4.17)
g(B)Py+g(8)S P=
-g ( 8 )p v - [ ~ a7 g ( @ ) , - g ( f i ) , y a P Three concrete forms of the gauge functions will be
-g(8)yapEY-2Pg(B),vay5Y, (4.6) considered later. The Lagrangian is

r*,x--+ wVA=
r~,A--~avr~.A+r~,Aa.E~ L= Lo++,B*J3++culj*.p6,,B*., (4.18)
-rrP~aX~-rrPhrav~-a,axS.
(4.7) where 6, is the Minkowski tensor. The case afO will
be considered only for these gauge functions (4.17)
,$P(x) are arbitrary infinitesimal functions of x, I n the
which depend linearly on K~~ and are independent
first-order formalism, gauge transformations of both K,, of revA.
and I p v h should be made; in the second-order formalism,
only gauge transformations of K, should be made.
The gauge variation of (4.1) has the form
B+P(%)=
I dy D+,P(x,y;K,r)B(Y). (4.19)

6Wo=
J dx ~ ( ~ ) [ R p V ~ G Y X ( ~ ) + R p V ~ u(4.8)

l*L. D. Landau and E. M. Lifshitz, The Classical Theory of


G . Y Aobtain
Varying (4.18) with respect to
( ~ ) ] . the field equations
a*. -
#P=-aG,uB+Y, GP+--B=O,
K,,~, P v A , and Br, we

(4.20)
Fields (Pergamon, London, 1962), 2nd ed. 8KP

2852 E. S. FRADKIN AND I. V. TYUTIN 2

in the case of the second-order forinalism. I n the first- in the second-order formalism, and to
order formalism the following equation should be added :
TI n d d X ) rI drx(x) (4.29)
t ,< P;V<X
(4.21)
in the first-order formalism.
Note that the generating functionals (4.26), written
With the help of the identity (4.16), we obtain the in terms of different variables g(@I),,. or g(@a)pv, do not
following consequences of the field equations: coincide in general. I n this connection there arises the
question of a choice of the true variables (the true
( x ) =o ,
(Q+,BV) (4.22) measure) in terms of which the generating functional
can be written as the functional integral of exp(iL) over
Q+,= Q(x,Y; K , r ) the true variables. The proposed method does not
permit the value of $\beta$ to be determined. It could be found in principle with the help of the correctly formulated canonical formalism for the gravitational field. However, we shall not investigate this problem in the
We impose the restriction on +,, that in the limit present work.
We now give some arguments which show that the
K B V = ~ ~ Y , rlvA=o, S matrix corresponding to the generating functional
(4.26) and (4.27) is independent of the choice of the
the operator Q+,, should be a nonsingular differential
variables of the functional integration belonging to
operator Q(O)+,. Choose the D+ function in the form
class (4.5). Suppose we integrate over K ~ , , in (4.26). The
D+p=[ Q + - ] ~ + A y ( ) . (4.24) corresponding Jacobian has in general the form

Then the B field satisfies the free equation d ( K , r ) / d ( K , r ) = D e t 4 y ~ ( x ) 6 ( x - - y ), (4.30)

Q+()ppBY=
0. (4.25) where y is some number and

Therefore, the S matrix is unitary, and the Einstein ~(~)=det~,,,(x). (4.31)


equations are valid in the physical subspace. Then we have
The generating functional is equal to

s
Za+= d(K,I)dB@ exp

I n the gauge
+Tr lnQ+(&())-l
1. (4.26) +Tr In&( -K)~(@+(O))-~
1
$J,=O,
i.e., for (Y= 0, we have

x e X p {~ ~ ~ ( L , + K , J Y . ) + T r l ~ Q + ( ~ ~. ( (4.27)
0))l} We can see from (4.26) that za+
corresponds to the
field equations for K,,, Ipi,and Br which are obtained
from (4.20)-(4.22) by the substitution B+P+B + r .
I n (4.26) and (4.27), d(K,r) is equal to
a+. -
rI rI dx,(x) (4.28) $= -aG,$+, GN+ -B+ =O , (4.20)
P<# 6%

2 S MATRIX FOR YANG-MILLS AND GRAVITATIONAL FIELDS 2853

the corresponding transformation of the metric tensor is


(4.21)

z(x) =el+) , J J x ) =detde&(x)/dx, (4.38)


It i s seen from (4.19) and (4.22) that B satisfies the
free equation and the Jacobian of the transformation of the integra-
Q+(),.BY= 0. (4.25) tion measure u!(K) is

Let us write [ - K ( x ) ] - ~ in the form oa~d(gr(B))/d(gr(@~)


= DetJJ.6+20fl(x)G1n(z(x)
-y) . (4.39)
[-K(X)]-y= l+a(Z). (4.32)
In order to calculate the DB,
we note that the matrix
The expression (4.19) then acquires the form inverse to the G(z(x) --y) is
G(x-z(y))Js(Y) (4.40)
B+(x)=B+(X)+ ( Q + - l X a ~ + A v c B ) ( X ) . (4.33)
*

If we formally use the rule of the calculation of the


I t is clear that in the second term of (4.33), integrating determinant of the product of the matrices, then
by pa.rts can change the direction of the Q+(O)operation
(at least in the perturbation theory) and one can use Dets(z(x) -y)=Det-1/2J,(x)G(x-y). (4.41)
(4.25). If
Finally we observe that the field equations (4.20)- DetJ,(x))G(x--y)# 1, (4.42)
(4.22) coincide with (4.20)-(4.22). then the invariant gravitation measure is
Thus the generating functional (4.26) leads to the
same field equations for K#, r u k # , Br, and consequently
to the same S matrix, as the generating functional
n n dgw(x).
r_i.
(4.43)
(4.26) does.
Note also that all the Heisenberg operators belonging This result does not agree with the form of the invariant
to class (4.5) must lead to the same S matrix according measure
to the Borchers theorem. (We ignore the question of the
meaning of ~ l , , as an operator function of K,,. See also 2 rsp
g(-5/2)(x)dg,,(x) =nnz &<
dg,(-62o)(X) , (4.44)
the analogous statement for the case of nonlinear
chiral Lagrangians in Ref. 19.) which is proposed by a number of authors.z0
Now we pass to the proof of gauge invariance. We Note that the integration measure over the group of
first prove that the S matrix is independent of the type the gauge coordinate transformations has a formz
of gauge condition which is analogous to (4.39) :

! b h ; K , r ) =0 , (4.34) dp~=Det~Jt(x)G(Z(x)--y) n d?r(z) ,


..P
(4.45)

i.e., that the S matrix corresponding to the generating


functional (4.27) is independent of the form of the
dpR=Det-lJf(x)G(x--y) n dZr(x).
z.0
(4.46)

function +,,.
Define the function &(K,r) by the relation If (4.42) is true, then the left and the right measures
are different. When proving the gauge independence of
the S matrix, this fact should be taken into account
[in particular, in (4.35) one should use the d p ~ ] .
The formal proof of the invariance of the S matrix
can be made in the general case DetJ(x)G(x-y)# 1 [then
Here S is an element of the coordinate gauge trans- it is necessary to assume the P-independence of the
formation group, and dp(S) is the measure of group S matrix). However, taking into account that the
integration. (For more details on the coordinate gauge arbitrary functions~ ( xare) the coordinates themselves,
transformation group see Ref. 12.) we can expect that
Let us explain some peculiarities of the coordinate
DetJ(x)G(x-y)= 1. (4.47)
group transformations. Under the transformation
Indeed, the G(z(x)-y) can be considered as the matrix
X -+ P(x), (4.36)
19 S. Coleman, J. Wess, and B. Zumino, Phys. Rev. 177, 2239 (1969).
20 C. W. Misner, Rev. Mod. Phys. 29, 497 (1957); J. R. Klauder, Nuovo Cimento 19, 1059 (1961); B. Laurent, Arkiv Fysik 16, 279 (1959); B. S. DeWitt, J. Math. Phys. 3, 1073 (1962).


of the permutation of the points. The corresponding Thus we prove that the S matrix of the gravitation field
finite-dimensional matrix has determinant 1. Therefore, is independent of the type of gauge condition (4.34).
the 6(z(x)--y) has determinant 1 if it is calculated by The proof that the S matrix is independent of the
the use of the finite-dimensional approximations. In as for the case when the gauge function is fixed can be
this case (4.47) follows from (4.41) and made similarly to Sec. 11. For this purpose substitution
of (4.6) or (4.6) should be made in expression (4.26)
dgI,=d/.W=dg=~ @(.x). (4.48) with
X.I(
tqX)= - (sa/zff)(c~*-llr~~)(~). (4.55)
Furthermore, d p has the property (2.44), and the Then
) d(K,r) for arbitrary ps are invariant
measures d ( ~and
1
under the gauge coordinate transformations. Below we Lo- ---$pSpYJ/y+KpJPY 3 Lo
assume that (4.47) is true. 2a
As in Sec. I1 the property (2.44) enables one to prove 1
the gauge invariance of &(K,r). We must know the
function &(K,r) only for K~~ and I,X satisfying the
condition (4.34). In this case the group integral is con- and the variation of the term T r In in (4.26) is compen-
centrated in the neighborhood of the unit element. The sated by the resulting Jacobian. We omit the corre-
gauge transformations have the form (4.6), (4.7), and sponding cumbersome calculations.*
&(S)=rId S W 3 (4.49) Let us pass to the consideration of some particular
3,s gauges.
As in Sec. I1 we obtain A. Harmonic Condition
AC(KJ7 I , P O Consider the class of gauges determined by the
function
= / d p ( x ) 6( (Qpv+Tp)(x)}
-Det-Q. (4.50) g1y+ a , g q X ) , g= (d-g)gpV. (4.57)
The harmonic condition corresponds to
Keeping (4.50) in mind, the expression (4.27) can be
(4.58)
rewritten in the form
We shall use gfiy as independent variables. By means of
Zo, = \d(Kj r)6{$,(X; r))A+(K,1) (4.14) (with p = $ ) , (4.23), and (4.24), we find
J
el%=
6~~(gxuaxau+aXgxua,)+axg~~a~.
(4.59)
Xexp[ i ~ ~ ~ ~ ( ~ o + (4.51)
~ ~ ~ ~ The
~ ~ generating
) ] . functional is equal to*I

Consider another gauge condition:

Multiply (4.51) by (b+I(K,r) and perform the gauge +Tr InQ~fi-l]. (4.60)
transformation
r p v X-+
-+ K,,~S-~,
K~~ rS-lpVh. In transverse gauge (a=O), we have
The quantities LO,A,, and A+1 are invariant. Further-
more, the following substitution can be made on the
mass shell:
K~-~ -+ ~
K ~ , ,J
JCP
V V (4.53)
as discussed in Sec. 11.
Then we obtain 1
+4 Tr lngrvd,dy~-l S{ a p g y ( x ) ) . (4.61)

The Feynman rules for calculation of the generating


functional (4.60) in powers of

- - A#- %- ,p (4.62)
21 E. S. Fradkin and I. V. Tyutin, CNR Laboratorio di Cibernetica report, Napoli, 1969 (unpublished).


are are

(a) L L = LO- (1/2a) a,h#'~,xc@"--~ (0)1 (a) Lint2= -


Lo (I/2a)+,26@p+v2-L(0)
-i T r In(G,~+[(hXu~x~, -i T r ln{6,,+Ch,,O+(d,h~,-+a,h~~)
+a ~ f i ~ ~ d , ) 6 , P axh"'a,]U
+ ). (4.63) X ( 6 u u ~ , 6 , ~ + 6 " @ ~ , 6 , g - 6 6 " ~ ~ , ) ] 0 )(4.70)
.
These should be taken as the interaction Lagrangian.
These are taken as the interaction Lagrangian. (b) L(o)zis the Lagrangian of the linearized theory
(b) L(,,)' is the Lagrangian of the linearized theory

L (0) I = t a&xapiLyx -[(a+ 1)/2a]a,h'9%x.


-@,ha&, h= 6,,h~'. (4.64)

The lowering and raising of indices in (4.64) are done


by the Minkowski tensors 6," and 6".
(c) The free propagator of h g p is calculated from
(4.64) to be The raising of indices in (4.71) is accomplished by the
Minkowski tensor 6,'.
(c) The free propagator of h,, is calculated from
(4.71) to be
D,V,xuz(P)= --i[6,6~o--,~6..-6,,6,~
+[(a+ 1)/P21(6,xP"Pu+~,~P"Px
+6vAf',ps+ ~ P U P , P X ) ] $ - ~ . (4.72)
The Feynman rules for a= -1 were also given by
Mandelstam.

C. Dirac Gauge
+ ~ L p ~ p ~ + 6 ~ ~ p * p " (4.65)
)]~!.
We give the arguments which show that the S matrix
obtained by Popov and Faddeeve in the Dirac gaugeI7
The Feynman rules for the gravitational field in the coincides with the S matrix in the covariant gauges.
gauge a,gMv=O were also obtained by Fadeev and Consider t.he following set of gauge conditions:
Popov.3
(4-g)eikPik0= 0 ,
G3k= a i [ ( - g ( 3 ) ) l / 3 e i k1-- 0 . (4.73)
B. Linearized Form of Harmonic Condition Here
Consider the class of gauges described by the function g(3)= detgik, g= (l/g00)g(3), eikgkj= Sji. (4.74)
$'p2s 6"X(a,gAp-%a&Acr). (4.66) In gauge (4.73) it is natural to use the first-order
formalism. We choose g"" and P r A as independent
We choose g, as independent variables. We then obtain variables. With the help of (4.14) (with a=0), (4.15),
with the help of (4.13) (with p=O), (4.32), and (4.24) and (4.23) one finds
Q O= ~[-2b-ie'"nk0-2~i(g0'c/g~O)e~~rlK0
~
erv2=gPvO
+(augp8-+argm8)
X (6""~3,6,@+ 6'flc3r6vu- PflaV). (4.67) +&e'krlk' -2 i s k e i h ] d - g , (4.75)
Qaoj= 0 , (4.76)
The generating functional isz1
Qj30=- 2 ~ i T i j - e ' i ' ( - g ( 3 ) ) ' 1 3 ( + ~ ~ l e k j a ~ - 6 ; ~ a ' ) , (4.77)
1 pij=(1/gOO)gOmei"(-g(3))1/3

X ( + g , , n e k G k -q6mjal -36&,) , (4.78)


pyi=
-fzi(-g(3))~/3elial
+Tr InQ2rd-1]. (4.68) +6j'(-g(a))l/3e'"alarn. (4.79)
From (4.75)-(4.79), we obtain
The Feynman rules for the perturbation calculation of
(4.68) in powers of (',(o),,O= 0 , Q,(o)ji= -6jip-+pf)lc3j,

h#"=g#"-6#" (4.69) 0L3 ( 0 )t.O= I).


Qa"J),,i= (4.80)


The generating functional is g" and rVxfiand consequently to the same S matrix as
(4.81) does.
23'
I [
d(grY,r)6{+sfi) exp a jdz(Lo+gY'J,.)
Now let us integrate in (4.89) over all the I'pvx except
Faik. One can show6 that Lo takes the form

Lo 3 ? r " g i k - - a ( a , g ) . (4.90)
+Tr lnQ300?-1+Tr lnBj'(Q3(o)-1)ki], (4.81)
Here H ( r , g ) is the Hamiltonian of the gravitation
B.i=Q3Tji=
1- (-g(3))1/3(6j"e'ndld~+~elid~dj). field, the explicit form of which we do not need. ?rik are
the canonical momenta for gik:
According to the general arguments given in this
section, the generating functional (4.81) on the mass ?rik= [d(
-g(3))/dgao)(eikelm-eirekm)rol,, (4.91)
shell is equivalent to (4.60), (4.61), and (4.68).
Now we transform expression (4.75). Using (4.25) With the help of (4.91) the gauge condition (4.73)
and (4.80) the field equations for B can be obtained: can be rewritten exactly in the form given by Diracl?:

VBo= 0 , (Gj%'V+f6ae'dldj)Bi=
0. (4.82) $,30Egik?ri'l:=0, + i=- a i [ ( - g ( 3 ) ) 1 / 3 e i k ] = o . (4.92)

The only physical solution of (4.82) is B,,? 0. Thus g''" Let us pass from the integration over r C i k and g g v in
and rvx'satisfy (4.20) and (4.21) with B,(3)=O; i.e., (4.89) to the integration over riband gfip.The resulting
relations (4.4) for r d are valid, and gfi" satisfies the Jacobian can be omitted. The proof of this fact is
usual Einstein equations. analogous to that of the possibility of arbitrary choice
Substituting (4.4) into (4.75), one obtains of the functional integration variables belonging to
class (4.5).
Q30'= (d-g(3))eikVidk(da) The final expression for the generating functional in
-(da)(2/-g(3))eikVidk, (4.83) gauge (4.73) or (4.92) is
viak= didk-'y'ikdl. (4.84)
Here yikL is the three-dimensional Christoff el symbol,
and a= (go0)-l. Let us find the expression for F o i k with
Zff=
J d(g,a)d{+afi)

the help of the Einstein equations for ghV, and substitute


it into the relation
Xexp i
[J ds(?r"g;k-H(?r,g)fgikJik)

- h. O = o ,
rk .+ i i k r O . zk -
e i k t ~

which must be true according to (4.73). From (4.85) one


(4.85) f T r lnAt--'+Tr lnBji(Qa(o)-l)kj. (4.93)
1
obtains Expression (4.93) has been obtained by Popov and
Faddeeva with the help of another method closely
(.\/-g@))eViak(4a/CY)+( d 4 ( 4 + 3 ) ) ~ ( 3 ) = 0 . (4.86) connected with the canonical quantization procedure.
Note once more that the S matrix corresponding to
Here R(3)=e'kR(3)ik,and 11(3)ik is the three-dimensional (4.93) is equal to that in covariant gauges.
curvature tensor of second rank.
Finally, the expression for QaOOtakes the form
V. CONCLUSION
~ ~ -~ 0 ,= (dav (4.87)
The present paper has been devoted to constructing
A= (1/-g(3))R("+(1/--g(3))eikVidk. (4.88) the S matrix in theories invariant under gauge groups.
Though the only cases considered were those of the
Thus we can see that the expression Yang-Mills field and gravitation, the method developed
can in principle be applied to arbitrary theories (the
Z'=
s [/
d(gfig,r)6{q3qexp i dx(Lo+g@yJ,,u)
theories of the Yang-Mills and the gravitational fields
are apparently the only gauge theories of physical
interest12),particularly in the cases where no connection

1
+Tr lnA.\/or+Tr InBj-Tr 1nQ3(") (4.89) with the canonical scheme can be traced. Furthermore,
the method suggested proves to be convenient for con-
structing the perturbation expansion of the S matrix in
for the generating functional with gauge condition theories partially invariant under a gauge group, the
(4.73) can be used instead of (4.81). The generating power of divergence in the S matrix being considerably
functional (4.89) leads to the same field equations for reduced.


In this paper no attention was paid to possible interactions with other
particles. The latter would not affect our considerations, however.
We would like to discuss briefly the problems which have not yet been solved.
(1) Owing to divergences, there is an important problem of introducing a
regularization which will not affect the group properties of the theory.
Recall that non-gauge-invariant regularization in electrodynamics creates
the photon mass. From the more recent view, the resulting photon mass is due
to Schwinger terms or, in the end, to the singular character of products of
field operators at coincident points. In nonlinear theories this problem
becomes even more complicated. The Schwinger terms affect even the
renormalization constant, as for instance in the case of the Yang-Mills field.
(2) There is an interesting question whether the gravitation field is
renormalizable in the framework of perturbation theory. (We mean here the
usual perturbation expansion with respect to a coupling constant and not the
method of Fradkin and Efimov.22) It is convenient to treat this problem using
the variables $h^{\mu\nu}$ and $\Gamma^{\mu}_{\nu\lambda}$ in the first-order
formalism where there are two vertices: a vertex $\Gamma\Gamma h$ and the
vertex responsible for the interaction of $h^{\mu\nu}$ with the fictitious
B field. The formal estimate of degrees of growth leads to the conclusion
that the theory is of unrenormalizable type.

ACKNOWLEDGMENTS

The authors are grateful to the participants in the theoretical seminar at
the P. N. Lebedev Institute for useful discussions. One of the authors (E. F.)
thanks Dr. Popov and Dr. Faddeev for being so kind as to acquaint him with
Ref. 6 prior to publication, and Professor E. R. Caianiello for his kind
hospitality. He is also grateful to Professor Abdus Salam for hospitality at
the International Centre for Theoretical Physics, Trieste.

22 E. S. Fradkin, Nucl. Phys. 49, 624 (1963); 76, 588 (1966); G. V. Efimov, Zh. Eksperim. i Teor. Fiz. 44, 2107 (1963) [Soviet Phys. JETP 17, 1417 (1963)]; Nuovo Cimento 32, 1046 (1964).

BULLETIN OF THE
AMERICAN MATHEMATICAL SOCIETY
Volume 78, Number 5, September 1972

MISSED OPPORTUNITIES

BY FREEMAN J. DYSON

It is important for him who wants to discover not to confine himself
to one chapter of science, but to keep in touch with various others.
JACQUES HADAMARD

1. Introduction. The purpose of the Gibbs lectures is officially defined
as "to enable the public and the academic community to become aware
of the contribution that mathematics is making to present-day thinking
and to modern civilization." This puts me in a difficult position. I happen
to be a physicist who started life as a mathematician. As a working
physicist, I am acutely aware of the fact that the marriage between
mathematics and physics, which was so enormously fruitful in past
centuries, has recently ended in divorce. Discussing this divorce, the
physicist Res Jost remarked the other day, "As usual in such affairs,
one of the two parties has clearly got the worst of it." During the last
twenty years we have seen mathematics rushing ahead in a golden age
of luxuriant growth, while theoretical physics left on its own has become
a little shabby and peevish. So I am forced to give this lecture an emphasis
different from that intended by the founders. Instead of talking about
the contribution that mathematics is making to present-day thinking
in my field, I shall talk about the contribution that mathematics ought
to have made but did not. I shall examine in detail some examples of
missed opportunities, occasions on which mathematicians and physicists
lost chances of making discoveries by neglecting to talk to each other.
My purpose in calling attention to such incidents is not to blame the
mathematicians or to excuse the physicists for our failure in the last
twenty years to equal the great achievements of the past. My purpose
is not to lament the past but to mould the future.
It is obviously absurd for me to imagine that I can mould the future
with a one-hour lecture. The fact that Hilbert in 1900 [1] and Minkowski
in 1908 [2] succeeded in doing it does not give me any confidence that
I can do it too. But at least I have learned from Hilbert and Minkowski
that one does not influence people by talking in generalities. Hilbert
and Minkowski gave specific suggestions of things that mathematicians
and physicists could profitably think about. I shall try to follow their
style. I shall try to convince you by examining actual cases that the progress
of both mathematics and physics has in the past been seriously retarded
by our unwillingness to listen to one another. And I will end with an
attempt to identify some areas in which opportunities for future discoveries
are now being missed.

Josiah Willard Gibbs Lecture, given under the auspices of the American Mathematical
Society, January 17, 1972; received by the editors January 17, 1972.

6. General coordinate invariance. Up to now, my examples of missed
opportunities have been mathematical discoveries which actually occurred,
although they could have occurred a long time earlier. In such cases
one can be sure that an opportunity existed, but it existed only in the past.
I now come to the more difficult task of identifying missed opportunities
that are still open. Here one can no longer be sure that the opportunity
is real, but if it is real then it has the virtue of existing in the present.
The past opportunities which I discussed have one important feature
in common. In every case there was an empirical finding that two disparate
or incompatible mathematical concepts were juxtaposed in the description
of a single situation. Taking the four examples in turn, the pairs of dis-
parate concepts were respectively: modular functions and Lie algebras,
field equations and particle dynamics, Lorentz invariance and Galilean
invariance, quaternion algebra and Grassmann algebra. In each case the
opportunity offered to the pure mathematician was to create a wider
conceptual framework within which the pair of disparate elements would
find a harmonious coexistence. I take this to be my methodological
principle in looking for opportunities that are still open. I look for situa-
tions in which the juxtaposition of a pair of incompatible concepts is
acknowledged but unexplained.
The most glaring incompatibility of concepts in contemporary physics
is that between Einstein's principle of general coordinate invariance and
all the modern schemes for a quantum-mechanical description of nature.
Einstein based his theory of general relativity [28] on the principle that
God did not attach any preferred labels to the points of space-time. This
principle requires that the laws of physics should be invariant under the
Einstein group E, which consists of all one-to-one and twice-differentiable
transformations of the coordinates. By making full use of the invariance
under E, Einstein was able to deduce the precise form of his law of gravi-
tation from general requirements of mathematical simplicity without any
arbitrariness. He was also able to reformulate the whole of classical
physics (electromagnetism and hydrodynamics) in E-invariant fashion,
and so determine unambiguously the mutual interactions of matter, radia-
tion and gravitation within the classical domain. There is no part of physics
more coherent mathematically and more satisfying aesthetically than this
classical theory of Einstein based upon E-invariance.

On the other hand, all the currently viable formalisms for describing
nature quantum-mechanically use a much smaller invariance group.
The analysis of Bacry and Lévy-Leblond [21] indicates the extreme range
of quantum-mechanical kinematical groups that have been contemplated.
In practice all serious quantum-mechanical theories are based either on
the Poincaré group P or the Galilei group G. This means that a class of
preferred inertial coordinate-systems is postulated a priori in flat
contradiction to Einstein's principle. The contradiction is particularly
uncomfortable, because Einstein's principle of general coordinate invariance
has such an attractive quality of absoluteness. A physicist's intuition tells
him that, if Einstein's principle is valid at all, it ought to be valid for the
whole of physics, quantum-mechanical as well as classical. If the principle
were not universally valid, it is difficult to understand why Einstein
achieved such deeply coherent insights into nature by assuming it to be so.
To make the mathematical incompatibility more definite, I will focus
attention on one of the competing schemes for describing a quantum-
mechanical universe. I choose the scheme which is most carefully based
on rigorous mathematical definitions and which is also general enough
to encompass a wide variety of physical systems. This scheme is the
Algebra of Local Observables of Haag and Kastler [29]. The six axioms
of Haag and Kastler are the following.
(1) Existence of local observables. To every region B (i.e. open set with
compact closure) in 4-dimensional space-time there corresponds an
abstract C*-algebra $\mathcal{A}(B)$.
(2) Isotony. If $B_1 \supset B_2$ then $\mathcal{A}(B_1) \supset \mathcal{A}(B_2)$,
and $\mathcal{A}(B_1)$ and $\mathcal{A}(B_2)$ have a common unit element.
(3) Existence of quasilocal observables. The union of all $\mathcal{A}(B)$ is a
normed *-algebra, the completion of which is a C*-algebra $\mathcal{A}$, the algebra
of quasilocal observables. It is supposed that all physically measurable
quantities are elements of $\mathcal{A}$.
(4) Poincaré invariance. To every element L of the Poincaré group
there corresponds an automorphism $\alpha_L$ of $\mathcal{A}$, such that
$\alpha_L(\mathcal{A}(B)) = \mathcal{A}(LB)$ for every region B.
(5) Local commutativity. If two regions $B_1$ and $B_2$ are completely
spacelike with respect to each other (i.e. if $B_2$ lies outside every light-cone
with vertex in $B_1$), then $\mathcal{A}(B_1)$ and $\mathcal{A}(B_2)$ commute.
(6) Primitivity. $\mathcal{A}$ admits a faithful algebraically irreducible
representation.
These axioms, taken together with the axioms defining a C*-algebra
[30], are a distillation into abstract mathematical language of all the
general truths that we have learned about the physics of microscopic
systems during the last 50 years. They describe a mathematical structure
of great elegance whose properties correspond in many respects to the
facts of experimental physics. In some sense, the axioms represent the
most serious attempt that has yet been made to define precisely what
physicists mean by the words "observability, causality, locality, relativistic
invariance," which they are constantly using or abusing in their everyday
speech.
If we look at the axioms in detail, we see that (1), (2), (3) and (6) are
consistent with Einstein's general coordinate invariance, but (4) and (5)
are inconsistent with it. Axioms (4) and (5), the axioms of Poincaré
invariance and local commutativity, require the Poincaré group to be
built into the structure of space-time. If we try to replace the Poincaré
group P by the Einstein group E, we have no way to define a space-like
relationship between two regions, and axiom (5) becomes meaningless.
I therefore propose as an outstanding opportunity still open to the pure
mathematicians, to create a mathematical structure preserving the main
features of the Haag-Kastler axioms but possessing E-invariance instead
of P-invariance.
I had better warn any mathematician who intends to respond to my
challenge that his task will not be easy. No merely formal rearrangement
of the Haag-Kastler axioms can possibly be sufficient. For we know that
Einstein could construct his E-invariant classical theory of 1916 only by
bringing in the full resources of Riemannian differential geometry. He
needed a metric tensor to give his space-time a structure independent of
coordinate-systems. Therefore an E-invariant axiom of local commutativity
to replace axiom (5) will require at least some quantum-mechanical
analog of Riemannian geometry. Some analog of a metric tensor must
be introduced in order to give a meaning to space-like separation. The
answer to my challenge will necessarily involve a delicate weaving together
of concepts from differential geometry, functional analysis, and
abstract algebra. With these words of warning I leave the problem to you.
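
Why a metric is needed to give meaning to axiom (5) can be seen in a toy calculation. The sketch below is an illustration only and is not part of the lecture: it works in 1+1 dimensions with c = 1, and the function names (`interval2`, `boost`, `shear`) and the two sample events are assumptions chosen for the example. It evaluates the flat-space interval between two events before and after a Lorentz boost (an element of P) and after an invertible relabelling t -> t + kx (an element of E that is not in P).

```python
import math


def interval2(a, b):
    """Flat-space interval squared between events (t, x); signature (-, +), c = 1."""
    dt = b[0] - a[0]
    dx = b[1] - a[1]
    return -dt * dt + dx * dx


def boost(event, v):
    """Lorentz boost with velocity v (|v| < 1): a transformation belonging to P."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    t, x = event
    return (gamma * (t - v * x), gamma * (x - v * t))


def shear(event, k):
    """Invertible smooth relabelling t -> t + k*x: allowed in E, but not in P."""
    t, x = event
    return (t + k * x, x)


a, b = (0.0, 0.0), (1.0, 2.0)

s2 = interval2(a, b)                                 # positive: spacelike pair
s2_boost = interval2(boost(a, 0.5), boost(b, 0.5))   # sign unchanged by the boost
s2_shear = interval2(shear(a, 2.0), shear(b, 2.0))   # flat formula on new labels

print("original :", s2)
print("boosted  :", s2_boost)
print("relabeled:", s2_shear)
```

The boost leaves the pair spacelike, while the relabelling makes the same pair look timelike when the flat formula is applied naively to the new labels. Restoring an invariant meaning requires a metric tensor that transforms along with the coordinates.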

REFERENCES
1. D. Hilbert, Mathematische Probleme, Lecture to the Second Internat. Congress of
Math. (Paris, 1900), Arch. Math. und Phys. (3) 1 (1901), 44-63; 213-237; English transl.,
Bull. Amer. Math. Soc. 8 (1902), 437-479.
2. H. Minkowski, Raum und Zeit, Lecture to the 80th Assembly of Natural Scientists
(Köln, 1908), Phys. Z. 10 (1909), 104-111; English transl., The principle of relativity,
Aberdeen Univ. Press, Aberdeen, 1923.
21. H. Bacry and J.-M. Lévy-Leblond, Possible kinematics, J. Mathematical Phys. 9
(1968), 1605-1614. MR 38 #6821. To save time I have slightly misstated their conclusion;
each of the groups D, P and N can occur in two alternative forms, so that the number of
possibilities is strictly speaking 11 rather than 8.
28. A. Einstein, Die Grundlage der allgemeinen Relativitätstheorie, Ann. Phys. 49 (1916),
769-822.
29. R. Haag and D. Kastler, An algebraic approach to quantum field theory, J. Mathematical
Phys. 5 (1964), 848-861. MR 29 #3144.
30. J. Dixmier, Les C*-algèbres et leurs représentations, Cahiers Scientifiques, fasc. 29,
Gauthier-Villars, Paris, 1964. MR 30 #1404.

A Brief Remark for "Missed Opportunities"

Freeman Dyson

Looking back on this lecture thirty years later, I have the impression
that things have improved. Mathematicians and physicists are listening
more to one another now than they were then. Ideas and methods are
spreading more easily between the two disciplines. On the other hand,
there is now a widening gulf of incomprehension between two groups of
physicists, one group doing string theory and the other group doing other
kinds of physics. The string theorists and the mathematicians understand
each other, but the gap that used to separate mathematics from physics
now separates string theory from the rest of physics.
Chapter 7

Gauge Theories of Gravity*

*K. Hayashi, T. Nakano, R. Utiyama and T. Fukuyama, C. N. Yang, W. T. Ni, Y. M. Cho, J. P. Hsu

491

Progress of Theoretical Physics, Vol. 38, No. 2, August 1967

Extended Translation Invariance and Associated Gauge Fields

Kenji HAYASHI and Tadao NAKANO*
Department of Physics, Kyoto University, Kyoto
*Department of Physics, Osaka City University, Osaka

(Received April 1, 1967)

Gauge fields together with nonlinear field equations to govern them are introduced by
requiring that the Lagrangian should be invariant under an extended translation in
space-time, i.e. a translation in which four parameters are replaced by four arbitrary
coordinate-dependent functions. A prescription is given to convert a non-invariant
canonical (pseudo) energy-momentum tensor into an invariant one.
The symmetric part of these field equations is examined for the two cases: (1) under
linear and non-relativistic approximation, it reduces to the classical gravitational-field
equation, (2) for static and spherically symmetric field, its solution is shown to
correspond to Schwarzschild's solution. The antisymmetric part has no classical analogues,
for there are no sources of skew-symmetric energy-momentum tensors in the classical
experiments. A reasonable method is proposed to eliminate this redundant field.

1. Introduction
Since it was suggested that the electromagnetic interaction is best understood
in terms of a principle of gauge invariance, under a gauge transformation with
a coordinate-dependent function, there have been a number of attempts to deduce
the existence of gauge fields coupled to conserved currents, starting with the idea
of extended transformations.1)
It was shown that the invariance under the n-parameter Lie group of
transformation referred to space-time and/or fields leads to the conservation of n
generators. Further, invariance requirement under an extended transformation,
i.e. a transformation whose n parameters are replaced by n space-time dependent
functions, necessitates the introduction of n (generally) non-commuting vector
fields together with field equations which they must satisfy.2)
The purpose of this paper is to deduce the existence of a gravitational field
from the translational invariance in an extended sense just mentioned above. In
order to construct the gravitational interaction, Utiyama has proposed to introduce
24 new field variables by postulating the invariance under an extended
four-dimensional rotation which is specified by six skew-symmetric arbitrary functions
$\omega_{ij}(x)$.3) However, the self-inconsistency of his scheme was pointed out by Kibble
who has claimed that it is necessary to consider the extension of full 10-parameter
inhomogeneous Lorentz group in place of the restricted six-parameter group.4)
Then, our method is different from both of them and will be shown to be one
of the simplest ways of discussing the gravitational interaction within the Lagrangian
formalism of the unquantized fields in that we need the minimal transformation
group (translation group) and its extension necessary and sufficient to deduce it.*)
In the following section, a general formulation of the extended translation is
given within the classical Lagrangian framework and a prescription is presented to
convert a "pseudo" energy-momentum tensor into an invariant energy-momentum
tensor. In § 3, we apply it to the system consisting of the spinor field and the
new fields.
We identify the symmetric part of these new fields with the classical
gravitational field by means of the linear approximation to the non-linear field variables.
In § 4, for the purposes of comparison we shall consider the static and spherically
symmetric field in which the exact solution of Einstein's equation of gravitation
has been well known and verified by the observations. In § 5, elimination of
the antisymmetric part of the new fields will be attempted and the final section
is devoted to a discussion of the results.

§ 2. General formulation
We start with the Lagrangian density**),***)

$$L_0 = L_0(q^A, q^A_{,k}, x_k), \qquad q^A_{,k} \equiv \partial q^A/\partial x_k ,$$

where $q^A$ are a set of fields ($A = 1, 2, \ldots, N$). The action integral referred to
an arbitrary four-dimensional domain $\Omega$,

$$I = \int_\Omega L_0\, d^4x ,$$

is invariant under the following infinitesimal transformation:

$$q^A(x) \to q'^A(x') = q^A(x) + \delta q^A(x), \qquad x_k \to x'_k = x_k + \delta x_k ,$$

if the following identity holds true at any world point (independently of the
behavior of $q^A$ and its derivatives):

$$\delta L_0 + L_0\, \delta x_{k,k} \equiv \delta^* L_0 + (L_0\, \delta x_k)_{,k} \equiv 0 , \tag{2.3}$$

where $\delta^* L_0 = \delta L_0 - L_{0,k}\, \delta x_k$ is often called a substantial variation in the sense that
a variation caused by the coordinate transformation is subtracted.****) Up to
the variation of first order, it follows that

*) Einstein's theory of general relativity has been based on the general covariance under the

extended translation within the classical mechanics.


**) It is assumed that the Lagrangian to be considered hereinafter contains the first derivatives
of the field variable at most.
***) We use the imaginary fourth coordinate $x_4 = ict$.
****) A summation convention for dummy indices is used throughout this paper.

(2.1')

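which is used in deriving the above identity. Consider a translation

$$\delta q^A = 0, \qquad \delta x_k = \varepsilon_k \quad (\varepsilon_k:\ \text{infinitesimal parameter}); \tag{2.4}$$

then the following identity must hold in order to preserve the invariance
under this transformation:

$$\frac{\partial L_0}{\partial x_k} \equiv \frac{dL_0}{dx_k} - \frac{\partial L_0}{\partial q^A}\frac{\partial q^A}{\partial x_k} - \frac{\partial L_0}{\partial q^A_{,l}}\frac{\partial q^A_{,l}}{\partial x_k} = 0 , \tag{2.5}$$

which obviously implies that the Lagrangian invariant under the translation has
no explicit x-dependence; hence we shall consider exclusively the Lagrangian

$$L_0 = L_0(q^A, q^A_{,k}) . \tag{2.6}$$

Equation (2.3) is rewritten as

$$[L_0]_{q^A}\, \delta^* q^A + S_{k,k} \equiv 0 , \tag{2.7}$$

and the equation of motion is abbreviated as*)

$$[L_0]_{q^A} \equiv \frac{\partial L_0}{\partial q^A} - \bigl(L_{0,q^A_{,k}}\bigr)_{,k} = 0 , \tag{2.8}$$

where

$$S_k = L_{0,q^A_{,k}}\, \delta q^A - T_{kl}\, \delta x_l , \qquad L_{0,q^A_{,k}} \equiv \partial L_0/\partial q^A_{,k} , \qquad T_{kl} = L_{0,q^A_{,k}}\, q^A_{,l} - \delta_{kl}\, L_0 . \tag{2.9}$$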
If the action integral is invariant under the translation (2.4), the conservation
law of the energy-momentum tensor defined above follows,
$$T_{kl,k} = 0 , \tag{2.10}$$
on account of the field equation (2.8).
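The step from the rewritten identity to (2.10) can be spelled out as follows. This is only a sketch under the assumptions already made in the text (constant $\varepsilon_l$, no explicit x-dependence of $L_0$, and $T_{kl}$, $S_k$ read as in (2.9)); it adds no new result.

```latex
% On solutions of the field equation (2.8) the rewritten form of (2.3) reduces to
S_{k,k} = 0 .
% For the translation (2.4), \delta q^A = 0 and \delta x_l = \varepsilon_l = const., so (2.9) gives
S_k = -T_{kl}\,\varepsilon_l
\quad\Longrightarrow\quad
T_{kl,k}\,\varepsilon_l = 0
\quad\Longrightarrow\quad
T_{kl,k} = 0 ,
% because the four \varepsilon_l are independent.
% Direct check, using (2.8) and the absence of explicit x-dependence (2.6):
T_{kl,k}
  = \bigl(L_{0,q^A_{,k}}\bigr)_{,k}\, q^A_{,l} + L_{0,q^A_{,k}}\, q^A_{,lk} - \frac{dL_0}{dx_l}
  = \frac{\partial L_0}{\partial q^A}\, q^A_{,l} + L_{0,q^A_{,k}}\, q^A_{,kl} - \frac{dL_0}{dx_l}
  = 0 .
```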
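Next we consider the extended translation**)
$$\delta q^A = 0, \qquad \delta x^\mu = \varepsilon^\mu(x) \quad (\varepsilon^\mu(x):\ \text{infinitesimal arbitrary function}). \tag{2.11}$$
The invariance property of $L_0$ under the translation (2.4) breaks down in this
case; the variation of the derivative does not vanish,
$$\delta q^A_{,\mu} = (\delta q^A)_{,\mu} - q^A_{,\nu}\, \delta x^\nu{}_{,\mu} = -\varepsilon^\nu{}_{,\mu}\, q^A_{,\nu} . \tag{2.12}$$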
*) This is called the Euler equation and is derived by postulating $\delta I = 0$ under the condition that
$\delta^* q^A$ should vanish on the boundary surface of the integration domain.
**) In this case the Greek indices are used for convenience.

We shall further require the invariance of the action integral under the extended
translation, by defining the covariant derivative through which the new field
$a_k{}^\mu(x)$ is introduced so as to satisfy our postulate:
$$D_k q^A \equiv \{\delta_k{}^\mu + a_k{}^\mu(x)\}\, q^A_{,\mu} \equiv b_k{}^\mu(x)\, q^A_{,\mu} , \tag{2.13}$$
$$\delta(D_k q^A) = 0 . \tag{2.14}$$
In order to satisfy Eq. (2.14), it follows immediately that
$$\delta b_k{}^\mu = \varepsilon^\mu{}_{,\nu}\, b_k{}^\nu . \tag{2.15}$$
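The transformation law (2.15) follows in one line from (2.12) and (2.14). The sketch below assumes the reading $D_k q^A = b_k{}^\mu q^A_{,\mu}$ of (2.13) and only relabels dummy indices.

```latex
\delta(D_k q^A)
  = \delta b_k{}^{\mu}\, q^A_{,\mu} + b_k{}^{\mu}\, \delta q^A_{,\mu}
  = \delta b_k{}^{\mu}\, q^A_{,\mu} - b_k{}^{\mu}\, \varepsilon^{\nu}{}_{,\mu}\, q^A_{,\nu}
  = \bigl(\delta b_k{}^{\mu} - \varepsilon^{\mu}{}_{,\nu}\, b_k{}^{\nu}\bigr)\, q^A_{,\mu} .
% Requiring this to vanish for arbitrary q^A_{,\mu} gives
\delta b_k{}^{\mu} = \varepsilon^{\mu}{}_{,\nu}\, b_k{}^{\nu} .
```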
Therefore we recover the invariance of the action integral even under the extended
translation (i) by simply replacing $q^A_{,k}$ in the original Lagrangian by the covariant
derivative $D_k q^A$ defined above:
$$L_0(q^A, q^A_{,k}) \to L(q^A, q^A_{,\mu}, b_k{}^\mu) \equiv L(q^A, D_k q^A) \equiv L_0(q^A, q^A_{,k} \to D_k q^A) , \tag{2.16}$$
hence its variation associated with (2.11) vanishes identically,
$$\delta L \equiv 0 , \tag{2.16}$$
and further (ii) by multiplying $L$ by a certain function $b(x)$ so as to satisfy
the required identity (2.3):
$$\delta\mathfrak{L} + \mathfrak{L}\, \varepsilon^\mu{}_{,\mu} \equiv 0 , \qquad \mathfrak{L} \equiv bL . \tag{2.17}$$
Accordingly, the transformation property of $b$ has to be
$$\delta b = -\varepsilon^\mu{}_{,\mu}\, b . \tag{2.18}$$
The next tasks are then to construct such a function $b(x)$ and the invariant field
strength from $b_k{}^\mu$ and its first derivatives. For these purposes, it is necessary
to define the field $b_{k\mu}$ inverse to $b_k{}^\mu$ by the following orthogonality relations:
$$b_k{}^\mu\, b_{k\nu} = \delta^\mu{}_\nu , \qquad b_k{}^\mu\, b_{l\mu} = \delta_{kl} . \tag{2.19}$$
Consequently, it follows that
$$\delta b_{k\mu} = -\varepsilon^\nu{}_{,\mu}\, b_{k\nu} ,$$
hence we choose
$$b = \det(b_{k\mu}) ,$$
because it has the desired property (2.18). In other words, the invariant volume
element becomes $b\, d^4x$ instead of $d^4x$. Suppose that we obtain a free Lagrangian
$L_G$ for the new field; the action integral then becomes
$$I = \int_\Omega \mathfrak{L}_T\, d^4x ,$$
where
$$\mathfrak{L}_T = \mathfrak{L} + \mathfrak{L}_G = b\,(L + L_G) ,\ \text{*)} \tag{2.20}$$
and $L_G$ consists of the invariant field strength. We shall write for short
$$Q_a = (q^A, b_k{}^\mu) . \tag{2.21}$$
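The claim that $b = \det(b_{k\mu})$ has the property (2.18) can be checked numerically to first order. The sketch below is only an illustration: the helper names `det` and `first_order_check` are hypothetical, the matrix `b_cov` is a random stand-in for $b_{k\mu}$, and `grad_eps` is a random stand-in for $\varepsilon^\nu{}_{,\mu}$, with the transformation law $\delta b_{k\mu} = -\varepsilon^\nu{}_{,\mu} b_{k\nu}$ taken as read above.

```python
import random


def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det(minor)
    return total


def first_order_check(seed=0, scale=1e-6):
    """Return (b, actual change of b, change predicted by delta b = -eps^mu_{,mu} b)."""
    rng = random.Random(seed)
    n = 4
    # b_cov[k][mu] stands in for b_{k mu}: identity plus a perturbation small
    # enough to keep the matrix strictly diagonally dominant, hence invertible.
    b_cov = [[(1.0 if k == mu else 0.0) + 0.2 * rng.uniform(-1, 1)
              for mu in range(n)] for k in range(n)]
    # grad_eps[nu][mu] stands in for eps^nu_{,mu}, taken to be infinitesimal.
    grad_eps = [[scale * rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
    # Assumed transformation law: delta b_{k mu} = - eps^nu_{,mu} b_{k nu}.
    b_new = [[b_cov[k][mu] - sum(grad_eps[nu][mu] * b_cov[k][nu] for nu in range(n))
              for mu in range(n)] for k in range(n)]
    b = det(b_cov)
    actual = det(b_new) - b
    predicted = -sum(grad_eps[mu][mu] for mu in range(n)) * b
    return b, actual, predicted


if __name__ == "__main__":
    b, actual, predicted = first_order_check()
    print("b =", b)
    print("actual change    =", actual)
    print("predicted change =", predicted)
```

The two changes agree up to terms of second order in `grad_eps`, which is all that an infinitesimal statement such as (2.18) asserts.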
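The invariance of the action integral follows from the following identity analogous
to (2.3), written for the total density $\mathfrak{L}_T$ of (2.20):
$$\delta^*\mathfrak{L}_T + (\mathfrak{L}_T\, \varepsilon^\mu)_{,\mu} \equiv 0 ,$$
which is just shown above to hold by means of our prescription. The above
identity is rewritten as before,
$$[\mathfrak{L}_T]_{Q_a}\, \delta^* Q_a + S^\mu{}_{,\mu} \equiv 0 , \tag{2.22}$$
where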

(2.23)

(2.24)

(2-25)

As $\varepsilon^\mu$, $\varepsilon^\mu{}_{,\nu}$ and $\varepsilon^\mu{}_{,\nu\lambda}$ are chosen arbitrarily inside the integration domain $\Omega$, the
second term of Eq. (2.25) resolves itself into the three identities
* & ; ([L]bkybku)*p$
(mTy$fY)~p~o, (2.26)

;& ; + (T +a)
rL1.kpbk + (LP6kpXbkY).hE0
, (2 - 27)
,
eyfi&Ekx,pbkv30 (2* 28)
among which there are only two independent identities, for the last identity
implies
+
L:kx,, LEk,., = 0 (2 28)
and the differentiation of the second identity yields the first one, by making use
of (2.28).

*) In the standard terminology of tensor algebra, $\mathfrak{L}$ is named the tensor density.
**) ${}^m\tilde{T}_\mu{}^\nu$, $\tilde{t}_\mu{}^\nu$ are the canonical energy-momentum tensors.

Furthermore, it suggests that the invariant field strength must contain
an antisymmetric combination of $b_{k\mu,\nu}$ with respect to the Greek subscripts, and
then the contraction of these indices has to be performed; finally we have*)
$$c_{klm} = 2\, b_{k[\mu,\nu]}\, b_l{}^\mu\, b_m{}^\nu = 2\, b_{k\mu}\, b_{[l}{}^{\nu}\, b_{m]}{}^{\mu}{}_{,\nu} ,$$
with
$$c_{k(lm)} = 0 .$$
It should be noticed that there exist infinitely many conservation laws for generalized
energy-momentum tensors in addition to the one implied directly by (2.26),
(f():an arbitrary function) ,

(2 - 29)

Under the extended translation only the Greek indices are associated with the
transformation properties. There then arises a question: what role does the
Latin index play? To see it, we consider the four-dimensional rotation of the
field variables only,
$$\delta x^\mu = 0 , \qquad \delta q^A = T q^A , \tag{2.30}$$
under the assumption that $L$ is kept invariant under (2.30) and $L_0$ under the
Lorentz transformation specified by
$$\delta x_k = \omega_{kl}\, x_l \quad (\omega_{(kl)} = 0) , \qquad \delta q^A = T q^A .$$
The assumed invariance properties respectively yield the following identities:
$$L_{,q^A}\, T q^A + L_{,D_k q^A}\, T D_k q^A + L_{,b_k{}^\mu}\, \delta b_k{}^\mu \equiv 0 , \tag{2.31}$$
$$L_{0,q^A}\, T q^A + L_{0,q^A_{,k}}\, T q^A_{,k} - L_{0,q^A_{,k}}\, q^A_{,l}\, \omega_{lk} \equiv 0 ,$$
and (2.31) passes into
$$(\delta b_k{}^\mu - \omega_{kl}\, b_l{}^\mu)\, L_{,D_k q^A}\, q^A_{,\mu} = 0 ,$$
by making use of the relation (2.16). Hence the transformation property
$$\delta b_k{}^\mu = \omega_{kl}\, b_l{}^\mu$$
is established; that is, the Latin index is related to the four-dimensional rotation and $b_k{}^\mu$
transforms as a four-vector under it (the same is true for $b_{k\mu}$, too).
The quantity $b_k{}^\mu$ is the contravariant vector as it transforms contragradiently
to $q^A_{,\mu}$, and $b_{k\mu}$ the covariant vector as it transforms cogradiently to (2.12). $b_k{}^\mu$
is to be referred to as a vierbein system because of its dual character under
the extended translation and the four-dimensional rotation specified by (2.11)
and (2.30), respectively. These situations are made clear and are summed up
by the following statement. Under the combination of these two independent
transformations,
$$\delta x^\mu = \varepsilon^\mu(x) , \qquad \delta q^A = T q^A , \tag{2.30}$$
$L$ stays invariant if $b_k{}^\mu$ (or equivalently $b_{k\mu}$) transforms as follows:
$$\delta b_k{}^\mu = \varepsilon^\mu{}_{,\nu}\, b_k{}^\nu + \omega_{kl}\, b_l{}^\mu , \qquad \delta b_{k\mu} = -\varepsilon^\nu{}_{,\mu}\, b_{k\nu} + \omega_{kl}\, b_{l\mu} . \tag{2.30}$$

*) $A_{(\mu\nu)} = (1/2)(A_{\mu\nu} + A_{\nu\mu})$, $A_{[\mu\nu]} = (1/2)(A_{\mu\nu} - A_{\nu\mu})$.
**) The conserved quantities will be explicitly given later in an invariant manner.

The field strength $c_{klm}$ is reducible under the four-dimensional rotation; the
irreducible parts of it are calculated by means of the standard method:
i) an irreducible tensor of rank 3,
$$c^T_{klm} = c_{(kl)m} - (1/3)\,(\delta_{kl}\, c^V_m - \delta_{m(k}\, c^V_{l)}) ,$$
ii) a vector,
$$c^V_k = c_{mmk} = (b\, b_k{}^\mu)_{,\mu}/b ,$$
iii) an axial vector,
$$c^A_k = i\,\varepsilon_{klmn}\, c_{lmn}/6 .$$
The tensor $c_{klm}$ is represented in terms of these irreducible tensors,
$$c_{klm} = (4/3)\, c^T_{k[lm]} + (2/3)\, \delta_{k[l}\, c^V_{m]} + i\,\varepsilon_{klmn}\, c^A_n .$$
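The decomposition above can be verified numerically for an arbitrary tensor that is antisymmetric in its last two indices. The following sketch assumes the reading of i) to iii) given here, with Euclidean index conventions (imaginary fourth coordinate) and the bracket conventions $A_{(kl)} = (A_{kl} + A_{lk})/2$, $A_{[kl]} = (A_{kl} - A_{lk})/2$. All function names are hypothetical, and the random tensor merely stands in for the field strength.

```python
import itertools
import random

N = 4


def perm_sign(p):
    """Sign of a permutation given as a tuple of distinct integers."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign
    return sign


# Levi-Civita symbol EPS[k][l][m][n], normalised so that EPS[0][1][2][3] = +1.
EPS = [[[[0] * N for _ in range(N)] for _ in range(N)] for _ in range(N)]
for p in itertools.permutations(range(N)):
    EPS[p[0]][p[1]][p[2]][p[3]] = perm_sign(p)


def delta(i, j):
    return 1.0 if i == j else 0.0


def random_field_strength(seed=0):
    """Random c[k][l][m], antisymmetric in (l, m), standing in for c_{klm}."""
    rng = random.Random(seed)
    c = [[[0.0] * N for _ in range(N)] for _ in range(N)]
    for k in range(N):
        for l in range(N):
            for m in range(l + 1, N):
                c[k][l][m] = rng.uniform(-1, 1)
                c[k][m][l] = -c[k][l][m]
    return c


def irreducible_parts(c):
    """Return (cT, cV, cA) as defined in i), ii), iii)."""
    R = range(N)
    cV = [sum(c[m][m][k] for m in R) for k in R]
    cA = [1j / 6 * sum(EPS[k][l][m][n] * c[l][m][n] for l in R for m in R for n in R)
          for k in R]
    cT = [[[0.5 * (c[k][l][m] + c[l][k][m])
            - (delta(k, l) * cV[m]
               - 0.5 * (delta(m, k) * cV[l] + delta(m, l) * cV[k])) / 3
            for m in R] for l in R] for k in R]
    return cT, cV, cA


def recombine(cT, cV, cA):
    """(4/3) cT_{k[lm]} + (2/3) delta_{k[l} cV_{m]} + i eps_{klmn} cA_n."""
    R = range(N)
    return [[[(4 / 3) * 0.5 * (cT[k][l][m] - cT[k][m][l])
              + (2 / 3) * 0.5 * (delta(k, l) * cV[m] - delta(k, m) * cV[l])
              + 1j * sum(EPS[k][l][m][n] * cA[n] for n in R)
              for m in R] for l in R] for k in R]


def max_error(c):
    """Largest deviation between c and its recombined irreducible parts."""
    r = recombine(*irreducible_parts(c))
    R = range(N)
    return max(abs(r[k][l][m] - c[k][l][m]) for k in R for l in R for m in R)


if __name__ == "__main__":
    print("max reconstruction error:", max_error(random_field_strength()))
```

The same helpers can be used to confirm that the rank-3 part is traceless, which is what makes the three pieces independent.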

We shall require that $\mathfrak{L}_G$ should be of the quadratic form in the first derivative
of $b_k{}^\mu$. Thus, we choose
$$\mathfrak{L}_G = b\,(\alpha\, c^T_{klm} c^T_{klm} + \beta\, c^V_k c^V_k + \gamma\, c^A_k c^A_k + \delta)\ \text{*)} \tag{2.32}$$
with the arbitrary constants $\alpha$, $\beta$, $\gamma$, $\delta$. Inserting the above Lagrangian into
(2.27), we obtain after some algebra
Bkl =7nTkl 9
(2 * 33)
Bkl = -bmipbFklm,p CmVFkmL + (1/2) C l m 7 l ~ ; c m-
n cm7tkFnin1 + ~ L Z L ~(2, 34)
*

F/ci;clm = 4b {aC.&m] + v]
PJk[lCm - (1/6) ~)I&kL?JlIIC?~) (2 35)

(2 * 36)

where all the tensors are converted into the local tensors in order to preserve
the invariance under the extended translation.
We shall manage to write down the equation of motion (2.33) in a simpler
form analogous to the divergence form (2.27),
$$-(b_l{}^\mu\, b_m{}^\nu\, F_{klm})_{,\nu} = b_l{}^\mu\, ({}^m T_{kl} + t_{kl}) , \tag{2.37}$$

where

(2.38)

(2.39)

It should be emphasized that $t_{kl}$ remains invariant under the extended translation
while $\tilde{t}_\mu{}^\nu$ is not invariant, as is easily shown (hence a tilde has been attached
to the canonical energy-momentum tensor density $\tilde{t}_\mu{}^\nu$).*) From (2.37), we obtain
the conservation law
$$\{b_l{}^\mu\, ({}^m T_{kl} + t_{kl})\}_{,\mu} = 0 , \tag{2.40}$$
which is essentially the same as (2.26), although $t_{kl}$ is preferred to $\tilde{t}_\mu{}^\nu$. In a
manner similar to that stated above, (2.29) turns out to be
$$(b_l{}^\mu\, T_l^{(\Lambda)})_{,\mu} = 0 , \qquad T_l^{(\Lambda)} = \Lambda_{(k)}\, ({}^m T_{kl} + t_{kl}) - \Lambda_{(k),\nu}\, b_m{}^\nu\, F_{klm} \tag{2.41}$$
($\Lambda_{(k)}(x)$ is an arbitrary function).
Hence there exist innumerable conserved quantities
$$P^{(\Lambda)} = \int d^3x\; b_l{}^4\, T_l^{(\Lambda)} , \tag{2.42}$$
where of course the conserved vector corresponding to (2.40) is included by
a particular choice of $\Lambda_{(k)}(x)$:
$$(\Lambda_{(k)}) = (\Lambda^{(l)}_{(k)}) = (\delta_{kl}) , \qquad P_k = \int d^3x\; b_l{}^4\, ({}^m T_{kl} + t_{kl}) . \tag{2.43}$$
A function $\Lambda_{(k)}$ need not be a vector under the four-dimensional rotation.


Before closing this section, we shall resolve (2.33) into the symmetric and
skew-symmetric parts**)

*) $\tilde{t}_\mu{}^\nu$ corresponds to the famed pseudo energy-momentum tensor of the gravitational field,
which does not transform as a tensor under the general coordinate transformation.
**) Just recall that the symmetrized (not the canonical) energy-momentum tensor of the matter
source is of physical significance.

(2.48)

where ti2 does not contain an arbitrary constant 13 ; it can b e rewritten i n terms
of the irreducible tensor components,
ti2 = (1/2)ib {a- (4/9) r } (2ClklJllljCIP,,JCjA+ E,.lJljc,lLcjA
+ 3i (6k.C,,AC7,d -CkA4CLA)} .
(2.48)
In deriving (2.46) and (2.47) a special care is taken in order to eliminate the
second derivate of the field variables from the definitions of ti? and tiL,by dint
of the useful identity
(2.49)
If we put
(2.50)
the above two equations (2.46) and (2.47) yield after the differentiation
(2.51)
(2.52)

§ 3. Linear approximation in spinor-vierbein interaction

The field equations proposed in § 2 are non-linear with respect to the field
variables $b_k{}^\mu$. We know that a linear theory (Newton's theory) accounts, with
a considerable degree of accuracy, for the motion of bodies under gravitational
forces. We shall discuss the interaction between the spinor field and the vierbein
field $b_k{}^\mu$ by assuming that both the difference
$$b_{k\mu} - \delta_{k\mu} = a_{k\mu}$$
and its first derivatives are so small compared to unity that the quadratic
terms in $a_{k\mu}$ and/or its derivatives lead only to secondary effects and are
hereafter neglected. In this linear approximation, all the Greek indices are
replaced by Latin indices, as there remain no differences between them. From
the orthogonality relations (2.19), it follows that
$$a_k{}^l = -a_{lk} ,$$

and the various non-linear quantities pass into the linearized ones,

a
~ (km'l~)
(3-1)
ck V
CkA== (1/3) ze*Zm<Z2m'

The field equations (2-44) and (2-45) become

(a -2/9) {flt
(3-2)
(V9) r>
(3-3)
with
= amm , D = 9mdm .
Provided that the relation (2-50) holds, these two equations are completely
decomposed into the symmetric and anti-symmetric parts,

a,(lm)'mk ~ kl

(3-4)
- {a - (4/9) rl (3-5)

With the help of the convenient notations

(1/2) Stla (3-6)

we shall be able to simplify the form of these equations to some extent,


kl 2o(fcm'7,U) + Ski (Smn'mn ~ K^ ~ ~ ^T (kl) , (3 7)

^Aitm'mn == ~ K?"-t [fc{] . (3 8)

Further, by imposing the generalized Lorentz conditions on $S_{kl}$ and $A_{kl}$

*' K$ corresponds to the cosmological constant, hence it will be neglected hereinafter.



we obtain
( 3 * lo)
(3.11)
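On the other hand, we choose the Lagrangian of a spinor field, say an
electron, as a matter field,
$$L_0 = (1/2)\,(\bar\psi\gamma_k\psi_{,k} - \bar\psi_{,k}\gamma_k\psi) + m\bar\psi\psi , \tag{3.12}$$
hence we obtain the invariant substitute of it by the prescription stated in § 2,
$$L = (1/2)\, b_k{}^\mu\, (\bar\psi\gamma_k\psi_{,\mu} - \bar\psi_{,\mu}\gamma_k\psi) + m\bar\psi\psi . \tag{3.13}$$
The equation of motion derived from $\mathfrak{L} = bL$ is
$$b_k{}^\mu\gamma_k\psi_{,\mu} + (1/2)\, c^V_k\gamma_k\psi + m\psi = 0 , \tag{3.14}$$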
which is reduced by the linear approximation (3.1) and the divergence-free
condition (3.9) to
rk +
(3, - li (SkA- (l/2) S,pS) dr, 1A,,3, - (1/4) li-S.,}4t ?)L$ = 0 , (3* 14)
(SWL,,,= 1.
On multiplying by the dual operator, we obtain the differential equation of second
order,
{ (1 +/cs)0 -- 2 K s k , @ k o ? L - (1/4) R (ns)14
+ i { K (SknL.$TIL(1/2) S,la,) +
- 6 k ~ $ / ? t 2=~O , ( 3 * 15)
where
rrrl=a,, tidti .
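The equation of motion (3.14) quoted above can be checked directly from $\mathfrak{L} = bL$. The sketch below reads (3.13) as $L = (1/2)\, b_k{}^\mu (\bar\psi\gamma_k\psi_{,\mu} - \bar\psi_{,\mu}\gamma_k\psi) + m\bar\psi\psi$, treats $\bar\psi$ as an independent field, and uses the expression $c^V_k = (b\, b_k{}^\mu)_{,\mu}/b$ from § 2; these readings are assumptions about the garbled source rather than new input.

```latex
% Euler equation for \bar\psi:
\frac{\partial\mathfrak{L}}{\partial\bar\psi}
  - \Bigl(\frac{\partial\mathfrak{L}}{\partial\bar\psi_{,\mu}}\Bigr)_{,\mu}
  = b\,\bigl[\tfrac12\, b_k{}^{\mu}\gamma_k\psi_{,\mu} + m\psi\bigr]
    + \bigl(\tfrac12\, b\, b_k{}^{\mu}\gamma_k\psi\bigr)_{,\mu}
  = b\, b_k{}^{\mu}\gamma_k\psi_{,\mu}
    + \tfrac12\,(b\, b_k{}^{\mu})_{,\mu}\,\gamma_k\psi + b\, m\psi = 0 .
% Dividing by b and inserting c^V_k = (b\, b_k{}^{\mu})_{,\mu}/b:
b_k{}^{\mu}\gamma_k\psi_{,\mu} + \tfrac12\, c^V_k\,\gamma_k\psi + m\psi = 0 .
```

The vector part $c^V_k$ of the field strength therefore enters the spinor equation solely through the derivative of the density factor $b\, b_k{}^\mu$.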
Now let us proceed to the non-relativistic limit of (3.10), (3.11) and
(3.15) ;
n&=- l i m r l r O O =

1
--KO,

fls,,= 0 = OLSd , (a, b = 1,2, 3), ( 3 * 16)


=0 ,
OA,,
E$J= {p2/(Z//Z) Sua t
- (/i/l~/2) ( I E ~ / ~ Nfi,L ) ~ T ~ ~(3}* 17)
where the well-known relations
i3,G = (/it +E )+ , - Lau+ = pu$ , (LL -= I, 2, 3)

*) One is permitted to set $S_{kl} = 0 = A_{kl}$ except for $S_{00}$ only, as there are no components of
sources for them.

are used and $\rho$ denotes the density of matter sources.


Equation (3.17) turns out to be
$$E\psi = \{p^2/(2m) - (\kappa m/2)\, S_{00}\}\,\psi , \tag{3.18}$$
where we neglected the last term containing $\kappa$.
Upon comparing the potential term in this Schroedinger equation with the
Newtonian potential of gravitation $\varphi$, we find that
$$\varphi = -(\kappa/2)\, S_{00} = a_{44} . \tag{3.19}$$
If the field does not change quickly with time, i.e. $S_{00}$ is almost static, it follows
from (3.16) and (3.19) that
$$\Delta\varphi = -(\kappa/2)\,\Delta S_{00} = (\kappa^2/2)\,\rho ,$$
with which the Newtonian equation
$$\Delta\varphi = 4\pi k\rho$$
is to be compared. In this manner we are able to determine the coupling constant
of the symmetric field,
$$\kappa = \sqrt{8\pi k} ,$$
(k = 1.06 x lO-g-= 5.2 x lop6cm2 in the natural units).
We close this section by remarking that the coupling constant of the antisymmetric
field $A_{kl}$ cannot be determined, as there seems to be no source of skew-symmetric
energy-momentum tensors in classical experiments.

§ 4. Comparison with Einstein's theory

In this section we shall compare the symmetric part of our equation (2.44)
with Einstein's equation of gravitation, by defining the symmetric metric tensor by
$$g^{\mu\nu} = b_k{}^\mu\, b_k{}^\nu , \qquad g_{\mu\nu} = b_{k\mu}\, b_{k\nu} . \tag{4.1}$$
The Christoffel three-index symbol of the second kind is given by
x +b w d .
ry = blx {bk(pbmFklm (4.2)
The Einstein equation takes the form

where
R, = 2{ G i d X ) - r;J>,p}
7

.
R =gpYRpy
Transition from the Greek indices into the Latin ones (as already shown in
(2.36), for example) yields
$$G_{kl} = b_k{}^\mu\, b_l{}^\nu\, G_{\mu\nu} = \kappa^2\; {}^m T_{(kl)} , \tag{4.3}$$
with which our symmetric equation (2.44) (divided by $b$),
$$\kappa^2 B_{(kl)} = \kappa^2\; {}^m T_{(kl)} , \tag{4.4}$$
should be compared. By the use of the previous notation (3.7) provided that
the relation (2.50) holds, the difference of these two equations is represented by
l i 2 B ( k l )- G ~ L-= (4i2)- ( -8i&(kmltsCL)mnCiA + 6CkACkA+ 36klC,AC,A). (4.5)
It should be noticed that each term of the difference contains the axial vector $c^A_k$.
Now we have to check whether it vanishes or not.
Apart from the linear approximation, where the above difference vanishes
exactly, there is the well-known solution of Schwarzschild for the spherically
symmetric field. In particular, we consider its static case: the field variables do
not depend on time and a mass point is situated at the origin under the influence
of the spherically symmetric forces. In this situation, the components of $b_{k\mu}$
transform according to the laws
abaa = @ahbaa$- Uapbap 9
I

under the three-dimensional rotation

It is easy to construct the general form of $b_{k\mu}$ such that it has the required
transformation properties mentioned above:
$$b_{a\alpha} = \delta_{a\alpha}\,(1 + A) + B\, X_a X_\alpha , \qquad b_{a4} = iC\, X_a , \qquad b_{4\alpha} = iD\, X_\alpha , \qquad b_{44} = 1 + E ,$$
where
$$X_a = x_a/r , \qquad r^2 = x_a x_a ,$$
and $A$, $B$, $C$, $D$, $E$ are functions of $r$ only.
A simple calculation yields, by making use of these general forms,
$$c^A_k = 0 . \tag{4.9}$$
Consequently, the equivalence between the solution of Schwarzschild and ours
is established.
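The vanishing of the axial vector for the static, spherically symmetric form can be checked numerically. For an invertible vierbein, $c^A_k$ is proportional to $b_{k\sigma} V^\sigma$ with $V^\sigma = \varepsilon^{\sigma\lambda\mu\nu}\, b_{p\lambda}\, \partial_\nu b_{p\mu}$, so it suffices to test $V^\sigma = 0$. The sketch below is illustrative only: the radial profiles chosen for $A, \ldots, E$, the test point, the counter-example `twisted_vierbein` and all function names are hypothetical, and derivatives are taken by central differences.

```python
import itertools
import math


def perm_sign(p):
    """Sign of a permutation given as a tuple of distinct integers."""
    sign = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                sign = -sign
    return sign


def static_spherical_vierbein(x):
    """b[k][mu] of the static spherical form; index 3 is the imaginary fourth coordinate."""
    r = math.sqrt(x[0] ** 2 + x[1] ** 2 + x[2] ** 2)
    X = [x[a] / r for a in range(3)]
    # Hypothetical radial profiles, chosen only for illustration.
    A, B, C, D, E = 0.3 / r, 0.2 / r ** 2, 0.1 / r, -0.4 / r, 0.5 / r
    b = [[0j] * 4 for _ in range(4)]
    for a in range(3):
        for al in range(3):
            b[a][al] = (1 + A) * (a == al) + B * X[a] * X[al]
        b[a][3] = 1j * C * X[a]
        b[3][a] = 1j * D * X[a]
    b[3][3] = 1 + E
    return b


def twisted_vierbein(x):
    """Counter-example without spherical symmetry: identity plus b[0][1] = x_3."""
    b = [[complex(k == mu) for mu in range(4)] for k in range(4)]
    b[0][1] = complex(x[2])
    return b


def axial_dual(vierbein, x, h=1e-5):
    """V^sigma = eps^{sigma lam mu nu} sum_p b_{p lam} d_nu b_{p mu} (central differences)."""
    b = vierbein(x)
    db = []  # db[nu][p][mu] approximates d b_{p mu} / d x_nu
    for nu in range(4):
        xp, xm = list(x), list(x)
        xp[nu] += h
        xm[nu] -= h
        bp, bm = vierbein(xp), vierbein(xm)
        db.append([[(bp[p][mu] - bm[p][mu]) / (2 * h) for mu in range(4)]
                   for p in range(4)])
    V = [0j] * 4
    for perm in itertools.permutations(range(4)):
        sigma, lam, mu, nu = perm
        V[sigma] += perm_sign(perm) * sum(b[p][lam] * db[nu][p][mu] for p in range(4))
    return V


if __name__ == "__main__":
    point = (0.7, -0.4, 1.1, 0.0)
    print("spherical:", [abs(v) for v in axial_dual(static_spherical_vierbein, point)])
    print("twisted:  ", axial_dual(twisted_vierbein, point))
```

The counter-example is included only to show that the test is not vacuous: a vierbein without the spherical structure generally gives a non-zero $V^\sigma$.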

§ 5. Elimination of the antisymmetric field

We shall discuss various choices of the arbitrary constants introduced in the
free Lagrangian $\mathfrak{L}_G$ for the $b_k{}^\mu$ field (2.32), except for the trivial case $\alpha = \beta = \gamma = 0$.
In particular we lay emphasis on the possible vanishing of the skew-symmetric
energy-momentum tensor generated by a matter field, about which no definite
statements have been given in the previous sections. The well-known procedure
of symmetrizing energy-momentum tensors by adding the canonical spin
angular momentum cannot be applied to the present case.
To be specific, we shall base our arguments on the spinor Lagrangian (3.12).
As it is invariant under the translation in the usual sense (see (2.4)), there are
the conservation laws of the symmetric and antisymmetric energy-momentum
tensors separately,
$$T_{(kl),l} = 0 , \tag{5.1}$$
$$T_{[kl],l} = 0 , \tag{5.2}$$
where
$$T_{kl} = (1/2)\,(\bar\psi\gamma_k\psi_{,l} - \bar\psi_{,l}\gamma_k\psi) .$$
As stated at the end of § 2, we could obtain the separate conservation laws (2.51)
and (2.52) closely similar to the above ones, provided that $\alpha$ and $\beta$ satisfy the
relation given by (2.50). Now we shall assume it, although we cannot find any
a priori reason to require such separate conservation laws in an extended sense;
however, the result obtained in § 4 seems to support our assumption. Various
cases are investigated in order.
Case 1: $\alpha + \beta = 0$.
The field equations for the $b_k{}^\mu$ field become
- {bzPb,,"h3a - 28n&J
- ( 3 / 2 )iekznc,cc,z
[cklnr A
tk) ,
} = bz" (mT~,tz)
ry + (5 - 3)
- (bfb,,"b (3/2) i[a!- (4/9) rl E ~ L ? ~ , ~ c , ! ) 1" =bf ("TCkl, -ttit) . (5 * 4)
As is easily shown, it is impossible to make both sides of Eq. (5.4) vanish
identically without further conditions.
Case 2: $\alpha + \beta = 0$ and $\alpha - (4/9)\gamma = 0$.
We find
t;l =0 , t:l= t , l (5.5)
and
$$B_{[kl]} = 0 . \tag{5.6}$$
However, the skew-symmetric part of the energy-momentum tensor ${}^m T_{[kl]}$
cannot vanish identically so long as the matter field exists. It is interesting to check
whether its derivatives should vanish or not. By making use of the equation of
motion (3.14) and the generalized Klein-Gordon equation, which is derived from
(3.14) by multiplying by a proper dual operator,
(0'-
17z2)$= {id,,DmD,,+cmVD,,+ ( 1 / 2 ) b , ' ~ ~ ~ ~
+ ( i / 2 )~ w n b ? , +I P(1/4)
~ ~ pcmVcmV> @,*)

Dn=DmD,, (5* 7)
we observe the connection between the energy-momentum tensor and its derivative
as follows
mFLk= ( 1 / 2 )bbi' ($rk$pp- $,'rk$) = b" TLk , (5.8)
(bL'makL)'p= -CmkLmTmL, (5.8')
(bLPmTlk)'p = - b ( 1 / 4 )Eklmncjmn ($r5rL$'j - $'.jr57L@) +
$- (bbmpcrnkt~rLfi)'p, (5 * 9 )
(l7LPtkL)'p +
= - CmkLtmL 2bkYb,1[p'v] .
(bL"b?nXFn~m)'X (5 * 10)
In fact, the sum of (5.5) and (5.10) vanishes if the field equation for the $b_k{}^\mu$
field is employed. Thus, the derivative of the skew-symmetric energy-momentum
tensor cannot be made zero by any choice of the free parameters.
Case 3: $\alpha + \beta = 0$, $\alpha - (4/9)\gamma = 0$, and we add an axial-vector interaction to the
matter field.
First, we consider the local homogeneous Lorentz transformation (compare
with (2.30)),
$$\delta x^\mu = 0 , \qquad \delta\psi = (i/4)\,\omega_{kl}(x)\,\sigma_{kl}\,\psi , \qquad \delta\bar\psi = -(i/4)\,\omega_{kl}(x)\,\bar\psi\,\sigma_{kl} \qquad (\omega_{(kl)}(x) = 0) . \tag{5.11}\ \text{**)}$$
Under this transformation, the modified Lagrangian $L$ (3.13) is no longer
invariant. According to our recipe we should introduce new fields in order to
make the theory invariant.***) Instead of introducing new fields, however, we
shall here add some interaction terms consisting of the gravitational field strength.
There are the vector, the axial-vector and the tensor couplings constructed in
terms of $c^V_k$, $c^A_k$ and $c^T_{klm}$ in the form of tri-linear interactions. On examining
the respective transformation properties, we find a promising one, that is, an

*) Its linearized form is given by (3.15). It should be noticed that the covariant derivatives
do not commute with each other, yielding the invariant field strength, $[D_k, D_l] = c_{mkl}\, D_m$.
**) $\omega_{kl}(x)$ and its first derivative are assumed to vanish on the boundary surface of the
integration domain.
***) Detailed discussion of it will be made in a forthcoming paper.

axial-vector coupling,
$$L_A = -(3i/4)\, c^A_k\, \bar\psi\gamma_5\gamma_k\psi , \tag{5.12}$$
and the modified Lagrangian
$$L + L_A \tag{5.13}$$
remains invariant under the extended four-dimensional rotation (and of course
under the extended translation). From (5.13), the equation of motion is replaced
by
$$b_k{}^\mu\gamma_k\psi_{,\mu} + (1/2)\, c^V_k\gamma_k\psi - (3i/4)\, c^A_k\gamma_5\gamma_k\psi + m\psi = 0 . \tag{5.14}$$
It should be noticed, on the other hand, that the action integral
$$I_G = \int_\Omega d^4x\, \mathfrak{L}_G$$
is kept invariant even under (5.11), because of the particular choice of the
arbitrary coefficients (see (5.6)),
$$\delta I_G = \int_\Omega B_{kl}\,\omega_{kl}\, d^4x = \int_\Omega B_{[kl]}\,\omega_{kl}\, d^4x = 0 .$$
Now the total Lagrangian density becomes
$$\mathfrak{L} + \mathfrak{L}_A + \mathfrak{L}_G .$$
The sum of the anti-symmetric parts


+
T& = (l/2) b,",(6rl,fir,- 6 , ~ d-) (3i/4) c & $ T ~ ~ ~ ~ $
mTCkzl
- (1/4) {crn"&rkrlrrn$ 3. bmP($rkrlrm$~,,
-t6Prkr1r7n$J)
-6p1Cd'$rm$ + bmP($rmfi.,+ iLrm$)1) + (1/2> c&FrLl$
+ 6;"k <&TL,$% +&dZI$)
is shown to vanish after some algebra with the aid of the equation of motion
(5.14) and the similar one for $\bar\psi$. Consequently, we have managed to symmetrize
the energy-momentum tensor of the matter field; in this case the additional
interaction Lagrangian of an axial-vector type plays a role similar to that
played by the canonical spin angular momentum in the conventional manner for a
flat space. The additional interaction gives rise to a spin interaction with a spinor
field, which is explicitly observed in the non-relativistic limit,

§ 6. Discussion

We have discussed the extension of the translation group and introduced
the new field including the gravitational field in a manner quite analogous to
the electromagnetic case. It should be stressed that the geometrical interpretation
in terms of a Riemannian space may be given if desired and if necessary.
What differs from the electromagnetic field lies in the fact that our field
strength is decomposed into the three irreducible parts. The free Lagrangian
for the new fields is constructed with the four arbitrary constants.*)
In performing the linear approximation to the complicated non-linear field
equations and making comparison between Einstein's equation of gravitation
and ours, we have assumed one relation among the arbitrary constants. This
assumption is equivalent to the requirement that there should exist
conservation laws of the energy-momentum tensors quite similar to those implied by the
translational invariance in a narrow sense for a free matter field. The relation
assumed above has enabled us 1) to decompose the linearized field equations
into one for the symmetric field variable and the other for the antisymmetric
field variable, respectively, and 2) to identify the symmetric solution of the
non-linear field equations with Schwarzschild's solution in the spherical symmetry.
We have been incapable of making any definite and conclusive statement
concerning the antisymmetric part of the field equations, except for the proposed
prescription to eliminate it as a redundant field by adding the tri-linear
interaction Lagrangian of the axial-vector type besides the particular choice of the
free parameters. In this case the total Lagrangian remains invariant even under
the extended four-dimensional rotation.
In a forthcoming paper, the extension of the homogeneous Lorentz group
and its detailed consequences will be reported, where it is emphatically aimed
to introduce a massive gauge field in an invariant-theoretic way.

References

1) For example, C. N. Yang and R. L. Mills, Phys. Rev. 96 (1954), 191.
2) See, for example, Kodi Husimi, Lecture on Variational Principle in Field Theory in 1943
at the Meeting of Japan Physical Society (edited by the Japan Ministry of Education, 1944).
Earlier references are listed in detail therein.
3) R. Utiyama, Phys. Rev. 101 (1956), 1597.
4) T. W. B. Kibble, J. Math. Phys. 2 (1961), 212.
5) D. W. Sciama, Festschrift for Infeld (Pergamon Press, New York, 1961), p. 415. We are
indebted to Dr. Kei Semba for informing us of this paper.

*) The field strength of the electromagnetic field is irreducible by itself and there remains
only one free parameter in constructing the free electromagnetic Lagrangian.

612

Progress of Theoretical Physics, Vol. 45, No. 2, February 1971

Gravitational Field as a Generalized Gauge Field

Ryoyu UTIYAMA and Takeshi FUKUYAMA

Department of Physics, Osaka University, Osaka

(Received August 17, 1970)

It is shown that a symmetric tensor field of the second rank $A_{\mu\nu}(x)$ should be introduced
in order to retain the invariance of the action-integral under a generalized translation
$x^\mu \to x^\mu + \xi^\mu(x)$, provided that the original action-integral is invariant under inhomogeneous
Lorentz transformations. It is further proved that the generalized gauge field $A_{\mu\nu}$ should
appear in the Lagrangian in exactly the same fashion as the metric tensor $g_{\mu\nu}$ does in Einstein's
theory of gravitation.
Some general features are also discussed with respect to a law of conservation of some
physical quantity which becomes no longer valid when the interaction with the generalized
gauge field takes place, provided that the associated group is non-Abelian.

§ 1. Introduction
A gravitational field was first interpreted as a kind of generalized gauge
field by one of the present authors1) by introducing a system of tetrads $h_k{}^\mu(x)$
and extending the Lorentz transformation of the tetrads at each world point to a
larger group depending upon six arbitrary functions of x instead of six parameters.
Besides this article, some authors2),3) tried to introduce a gravitational field by
extending the translation group to a general transformation of coordinates
$$x^\mu \to x^\mu + \xi^\mu(x) ,$$
but their arguments seem rather unsatisfactory and complicated.
Many groups of transformations depending on parameters have been found
in connection with the different kinds of conservation laws. Among these groups
it is well known that the group of phase-transformations of complex fields was
extended to the gauge transformation depending on an arbitrary scalar function
$\lambda(x)$ connected with the existence of an electromagnetic field. The invariance
under rotations in the iso-spin space was extended to the invariance under a
generalized rotation group by an adjoined introduction of the Yang-Mills field.
The most well-known group, namely the translation group, has been conjectured
to be related with the gravitational field because the gravitational field is, following
Einstein's equation, produced by the energy-momentum tensor of material fields, the
conservation of which holds owing to the invariance of the material system under
a translation of coordinates. In spite of such a conjecture, however, there has not
been any convincing article which shows the gravitational field being derivable
from the postulate that the action integral of a material system is invariant under
a group of generalized translations depending upon four arbitrary functions of x.

The aim of the present paper is to show that a tensor field of the second
rank should be introduced in order to retain the invariance of the action-integral
and that this tensor field should appear in the original Lagrangian of the material
field in exactly the same way as the metric tensor $g_{\mu\nu}$ does in Einstein's theory
of gravitation. This conclusion is derived from the assumption that the original
Lagrangian is invariant under inhomogeneous Lorentz transformations, but the
invariance under rotations of tetrads has not been assumed.
In addition to the derivation of the gravitational field, some general features
are discussed with respect to the laws of conservation. It is shown that a physical
quantity owned by some field, say $\phi_A(x)$, which is conserved owing to the
invariance of the action-integral of $\phi_A$ under some parameter-group of
transformations, becomes unable to satisfy the law of conservation when the original
$\phi$-field begins to interact with a generalized gauge field associated with the group
mentioned above, provided that this group of transformations is non-Abelian. The
conservation is recovered only when the quantity carried by the generalized gauge
field is taken into account together with that possessed by the field $\phi_A$.
The present procedure of introducing the interaction of a gravitational field
with a material system might find application in a derivation of an S-matrix
for a material system interacting with a gravitational field if the Lorentz-invariant
S-matrix is known for this material system without the gravitational
interaction.

§ 2. Fundamental postulate

Consider a field $\phi_A(x)$ ($A = 1, 2, \ldots, N$) with a Lagrangian density

$$L(\phi_A, \phi_{A,\mu}) , \qquad \phi_{A,\mu} \equiv \partial\phi_A/\partial x^\mu .$$

Let us assume that the action-integral

$$I = \int L\, d^4x$$

is invariant under the following groups of transformations:

i) translation
$$x^\mu \to x'^\mu = x^\mu + a^\mu \quad (a^\mu = \text{constant parameter}),$$

ii) Lorentz transformation

where $\varepsilon_{\mu\nu}$ is an infinitesimal parameter and is restricted by the condition



Epv = y p p .E P , = - E U P 7

~/&=yp=
I -1
1
0
p=v=11,2,3,
P+Y
p=v=O.
9

Here it has been assumed that the field $\phi_A$ is a kind of tensor and the
transformation-coefficient $C$ is an appropriate sum of products of Kronecker's $\delta$.
Our fundamental postulate is that the action integral should be invariant
under the generalized translations, a generalization of (i) and (ii)
depending upon four infinitesimal arbitrary functions of x in place of the four
parameters $a^\mu$, i.e.
$$x^\mu \to x'^\mu = x^\mu + \xi^\mu(x) . \tag{2.1}$$
In order to realize this postulate, the original arguments of the Lagrangian,
for example $\partial_\lambda\phi_A$, should be replaced with an appropriately defined covariant
derivative $\nabla_\lambda\phi_A$ by introducing a new field $A_K(x)$.
Let a Lagrangian

$$L_1(\phi_A, \phi_{A,\lambda}, A_K, A_{K,\lambda})$$

be a substitute for the original one. Since the parameter $a^\mu$ in (i) behaves as
a vector under Lorentz transformations, it is plausible to assume that the new
field $A_K$ is a covariant tensor of the r-th rank, following the conclusion of the
theory of generalized gauge fields.1) Thus the field $A_{\mu_1\cdots\mu_r}(x)$
has to be transformed under the transformation (2.1) in the following way:

8Ap,...pr OD (;,2lr):.
(5)= A,,l...vr f, , (2.2)
where the transformation coefficient D is

The expression of 6$A for the Lorentz transformation has the form
8$A= $B c?; f , v
*

for the variation of $\phi_A$ under the transformation (2.1). This expression of $\delta\phi_A$ gives rise not only to terms proportional to $\partial\xi/\partial x$ but also to terms containing $\partial^2\xi/\partial x\,\partial x$ in the variation of the action integral. In order to cancel these terms, it is necessary (a) to let the Lagrangian $L_1$ depend upon $\partial A_K/\partial x$ in addition to $A_K$, if $A_K$ is a tensor as we have assumed, or (b) to change the definition (2.2) of $\delta A_K$ in such a way that a term containing $\partial^2\xi/\partial x\,\partial x$ appears in the definition of $\delta A_K$, provided that $\partial A_K/\partial x$ does not appear in $L_1$. The approach (b) means an abandonment of the tensor character of the A-field. In the present paper, however, we assume the A-field to be a tensor, and consequently $L_1$ depends on both $A_K$ and $\partial A_K/\partial x$.

Gravitational Field as a Generalized Gauge Field 615

The postulate that the new action integral

$I_1 = \int L_1\, d^4x$

should be invariant under (2.1) leads to the following identities (see Appendix A):
a, { [ L Y .C2;. $B + [Li]lar.
D (:::::L:)Y, . Ab,. . . b r }
+ [Ll]a.ar.Aa,...a,,p~O
+ {LI]-$A,~ , (2 - 4)
a, { [LJ.c<;. + [L,]+I-.D (
$B y
,...a:); -0 , .Ab,...br -T / l ) p } (2.5)
[Li]* c$ * $B + [ L I ] ~ D. .(:::$)$
~~ * A b , ...a,

+ { p and Y interchanged} S O , (2 7)
where the following abbreviations have been used :

§ 3. Determination of the type of gauge field $A_{(\mu_1\cdots\mu_r)}$

The identity (2.7) shows that $\phi_{A,\nu}$ and $A_{\alpha_1\cdots\alpha_r,\nu}$ should be included in $L_1$ only through a particular linear combination $\nabla_\kappa\phi_A$ of the following type:
$\nabla_\kappa\phi_A = \partial_\kappa\phi_A + \phi_B\, M^{B\,\alpha_1\cdots\alpha_r\,\nu}_{A\,\kappa}(x)\, A_{\alpha_1\cdots\alpha_r,\nu}$, (3.1)
where the coefficient $M$ is to be determined later and probably depends on $x$. $\nabla_\kappa\phi_A$ is called in what follows a covariant derivative of $\phi_A$.
The Lagrangian $L_1$ can be rewritten in terms of $\nabla_\kappa\phi_A$ as

$L_1(\phi_A, \phi_{A,\kappa}, A, \partial A/\partial x) = L_2(\phi_A, \nabla_\kappa\phi_A, A)$. (3.2)

Making use of (3.1) and rewriting (2.7) in terms of $L_2$, we have

dLZ . .$ B . [{~~.C~p+M~i...ar
I...a,)p 4 .A b l...brt. ~ ( L i . , . ~ r

ark$,

+ {Y and ,u interchanged}]=O .
This expression suggests that the coefficient $M$ should be a linear combination of $C$:
~;;,-u,.v (5) = c;;.y&,-awa (x). (3.3)
By substituting (3.3) for M's in the above identity, (2.7) gives

- 2s;. 8#]= - 8: (616;+ &sf)


'AQ l...br ( x ). [ { Y ~ " . u r u a . ~ ( {/!
~ ~and +
: ; Y: ~interchanged}].
~)~} (3.4)

Since the left-hand side of (3.4) is symmetric with respect to the subscripts $\lambda$ and $\beta$, the same should also be true for the right-hand side. Thus we find*)
- yga;-a,vn
y3-'arua
-
= y(Ti;.arva .
In what follows, for the sake of simplicity, let the discussion be restricted to the case that the field $A$ is an irreducible tensor of the $r$-th rank with a full symmetry:

$A_{(\alpha_1\cdots\alpha_r)}(x)$.

The double contraction of (3.4) by putting $\nu = \lambda$ and $\mu = \beta$ and making use of the definition (2.3) gives a relation

208; = zr. y : y a k j - a r - l - p ) y a (X) .A(al..,ar-lp)


(5) *

If we define $A^{(\alpha_1\cdots\alpha_r)}(x)$ by
A(al'..ur) (x)cc Y&)
I...u,-,a)ua
9 (3.5)
the above relation is written as
* A ( 5 1 . . . a , ~ ,., ) ~ ~ ~
A(a1".5r-1a) (3* 6)
This result shows that $r$ should be $> 1$; otherwise (3.6) leads to a contradiction.
The relation (3.5) allows us to represent $Y$ in terms of $A^{(\alpha_1\cdots)}$ as
,...a,. ua
y > $ y ? - ) v a= (XI . z((fl...*r))(L8)
A('J,-JJr) , (3.7)
where the coefficient $Z$ on the right-hand side is an appropriate sum of products of Kronecker's $\delta$.
Substituting (3.7) for $Y$ in (3.4) and making use of the definition (2.3) of $D$, we have an important relation
- A(a,...5r.,p)
28:8{fi]=~* A(bl'..*T) {Z:fl...lrr)(~,q)
I-ar-,u)un +
,Z~f,'.:~~~rj($fn}. ( 3 . 8 )
From our assumption that $A^{(a_1\cdots a_r)}$ and $A_{(b_1\cdots b_r)}$ are both fully symmetric with respect to their suffixes, it is plausible to assume that the undetermined coefficient $Z$ is also symmetric with respect to both the superscripts $(a_1\cdots a_r)$ and the subscripts $(b_1\cdots b_r)$, in addition to the symmetric pair of subscripts $(\lambda\beta)$. Since we have no information about the symmetry of $Z$ with respect to the extra superscripts $\nu$ and $\alpha$, $Z$ having the above-mentioned symmetry with respect to the suffixes can be expressed in terms of products of Kronecker's $\delta$ in the following way:

*) $(\beta\lambda)$ means that $Y$ is symmetric with respect to the suffixes inside the parentheses.

tc i: (a,...a . )
[ O ( * , ...> . . ! : f i r ) . t d.,kS$;:::;.!$
o;isyp+ s$::::pzl*,)0~i8,1 ...h,. )8;;;ij) ,
i=l 1. J (L+ij)

(3-9)
where $a$, $b$, $c$ and $d$ are undetermined constants. The symbol $\delta^{(\cdots)}_{(\cdots)}$ in (3.9) means

where the summation should be taken over all the permutations of $(b_1\cdots b_r)$.
Inserting the expression (3.9) into (3.8), and taking contractions of (3.8) with respect to many different pairs of suffixes, we arrive at the result
$r = 2$, $a = b = 0$, $c = -d = \tfrac{1}{4}$ (3.10)
with the normalization
$A^{(\mu\rho)}(x)\, A_{(\rho\nu)}(x) = \delta^{\mu}_{\nu}$. (3.11)
The details of the derivation of this result are given in Appendix B. (3.10) determines $Z$, $Y$ and $\nabla_\kappa\phi_A$ as follows:
= 3 [o$:,,z8;,SY, +
z((;$;;& On((6a,,la) , ) J ; , ~ Y p

+o(a,a,)
(b,B) o*,S;+61~:~?6;,6~-2o(l,,
a (*]b2)1> Yala,)/J(W (3.12)

Y ~ ~ z . A( ,(, u~, , ), . = ~ A a { A a , , s + f l ~ a , x - A ~ ~ . a } = d p U l ( ~ ) , (3.13)


Vd.4 = &$A + .c,:.
4;* .
$3 (3.14)
$A^{\lambda}_{\mu\nu}$ in (3.13) is nothing but the Christoffel symbol $\Gamma^{\lambda}_{\mu\nu}$, provided that our $A_{\mu\nu}$ is identified with the metric tensor $g_{\mu\nu}$.
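
As a rough numerical illustration of this identification, the sketch below evaluates the standard Christoffel formula $\Gamma^{\lambda}_{\mu\nu} = \tfrac{1}{2} g^{\lambda\alpha}(g_{\alpha\mu,\nu} + g_{\alpha\nu,\mu} - g_{\mu\nu,\alpha})$ by finite differences. The choice of the unit 2-sphere as the metric, the step size and all function names are assumptions made only for this example; they do not come from the paper.

```python
import math

def metric(x):
    """Metric of the unit 2-sphere in coordinates x = (theta, phi); an assumed example."""
    theta, _ = x
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def d_metric(x, k, h=1e-5):
    """Central-difference derivative of the metric with respect to x^k."""
    xp, xm = list(x), list(x)
    xp[k] += h
    xm[k] -= h
    gp, gm = metric(xp), metric(xm)
    n = len(x)
    return [[(gp[i][j] - gm[i][j]) / (2 * h) for j in range(n)] for i in range(n)]

def christoffel(x):
    """gamma[l][m][v] = (1/2) g^{la} (d_v g_{am} + d_m g_{av} - d_a g_{mv})."""
    n = len(x)
    g_inv = inverse_2x2(metric(x))
    dg = [d_metric(x, k) for k in range(n)]  # dg[k][i][j] = d_k g_{ij}
    return [[[0.5 * sum(g_inv[l][a] * (dg[v][a][m] + dg[m][a][v] - dg[a][m][v])
                        for a in range(n))
              for v in range(n)]
             for m in range(n)]
            for l in range(n)]

theta = 1.0
gamma = christoffel((theta, 0.3))
print(gamma[0][1][1], -math.sin(theta) * math.cos(theta))  # Gamma^theta_{phi phi}
print(gamma[1][0][1], math.cos(theta) / math.sin(theta))   # Gamma^phi_{theta phi}
```

The two printed pairs compare the numerical values with the familiar closed forms for the sphere, $-\sin\theta\cos\theta$ and $\cot\theta$.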

§ 4. Derivation of Lagrangian
Before beginning a discussion about (2.6), let us consider a slight extension of the Lorentz invariance of the original Lagrangian $L$.
The Lorentz invariance of the original Lagrangian can be made manifest by writing explicitly the metric tensor $\eta_{\mu\nu}$. It is easily seen, however, that this manifestly invariant expression of the action integral can also be invariant under an affine transformation*)
$x^\mu \to x'^\mu = a^\mu{}_\nu\, x^\nu + a^\mu$, $\det(a^\mu{}_\nu) \neq 0$, (4.1)

*) In the case of an affine transformation, since the coefficient $a^\mu{}_\nu$ has no such restriction as $\eta_{\mu\nu}\, a^\mu{}_\alpha\, a^\nu{}_\beta = \eta_{\alpha\beta}$, there cannot exist such a covariant affine tensor as $g_{\mu\nu}$ whose components are kept unchanged under the transformation (4.1).

if the metric tensor in $L$ is replaced with a constant covariant affine tensor $C_{\mu\nu}$ and, at the same time, if $L$ is replaced with

L ( d A > d A , X , C p w )= J l T . L ( d A , d A , x , C p Y ) y

C =d e t ( C J . (4.2)
The invariance of $I = \int \mathcal{L}\, d^4x$ under an infinitesimal affine transformation
$x^\mu \to x'^\mu = x^\mu + \lambda^\mu{}_\nu\, x^\nu + a^\mu$

where

Making use of (4.3), we can rewrite

Now let us return to a discussion of $L_2$ defined by (3.2) as follows:

where the following relations have been employed :

The expression {...} on the right-hand side of (4.6) vanishes owing to the commutation relation

[cgyc:]= 8; * c;- 8;. c;y (4.7)


where $C^\mu{}_\nu$ is an $N \times N$ matrix whose $(A, B)$ element is $(A|C^\mu{}_\nu|B) = C^{A\mu}_{B\nu}$. The commutation relation (4.7) holds owing to the fact that the matrix $C^\mu{}_\nu$ is an $N$-th order representation of the generators of the group of affine transformations and, in fact, we can easily derive the relation (4.7) by considering a simple example.
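
One simple example of this kind can be sketched numerically. The homogeneous part of the affine group is generated by the matrix units $E_{ij}$ of $gl(n)$, which satisfy $[E_{ij}, E_{kl}] = \delta_{jk} E_{il} - \delta_{li} E_{kj}$; this is the same type of relation as (4.7). The index placement of (4.7) is not legible in this copy, so the correspondence with $C^\mu{}_\nu$ is an assumption, and the helper names below are purely illustrative.

```python
def unit(n, i, j):
    """Matrix unit E_ij: 1 in row i, column j, zero elsewhere."""
    return [[1 if (r == i and c == j) else 0 for c in range(n)] for r in range(n)]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[r][k] * b[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]

def commutator(a, b):
    """[a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[r][c] - ba[r][c] for c in range(n)] for r in range(n)]

def expected(n, i, j, k, l):
    """Right-hand side delta_jk E_il - delta_li E_kj, entry by entry."""
    e_il, e_kj = unit(n, i, l), unit(n, k, j)
    d_jk, d_li = int(j == k), int(l == i)
    return [[d_jk * e_il[r][c] - d_li * e_kj[r][c] for c in range(n)]
            for r in range(n)]

def relation_holds(n):
    """Check the commutation relation for every pair of matrix units of gl(n)."""
    idx = range(n)
    return all(commutator(unit(n, i, j), unit(n, k, l)) == expected(n, i, j, k, l)
               for i in idx for j in idx for k in idx for l in idx)

print(relation_holds(4))
```

The check runs over all index combinations in four dimensions; any other faithful representation of the same generators obeys the same relation.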
Comparing (4.5) with (4.6) and remembering that the right-hand side of (4.6) identically vanishes, we are led to the conclusion that $L_2$ should have the same functional form as $\mathcal{L}$, namely
Jm*L(dA, vX$A,

=5 ( $ A , r d A > A,)Y

where
A = det (A,).
The discussion so far developed neither compels us to interpret $A_{\mu\nu}$ as the gravitational field nor gives any information on the signature of $A_{\mu\nu}$, but in order to let the field equation of $\phi_A$ be hyperbolic, $A_{\mu\nu}$ should have the signature $-, +, +, +$. In place of our $A_{\mu\nu}$ one can consider a particular tensor field $B_{\mu\nu}$ which can be derived by introducing a system of curvilinear coordinates $u$ into the Minkowskian space, that is,

The question whether our $A_{\mu\nu}$ is identical with the fictitious gravitational field $B_{\mu\nu}$ or is an entity completely different from $B_{\mu\nu}$, giving a non-vanishing curvature tensor,*) is to be answered by the field equation of $A_{\mu\nu}$. Thus if Einstein's equation is taken as a field equation, our $A_{\mu\nu}$ describes a permanent gravitational field produced by the material field $\phi_A$.

§ 5. Law of conservation derived from identities (2.4) and (2.5)

The equation of the field $\phi_A(x)$
$[L_1]^A = 0$
gives rise to an interesting relation when it is inserted into the identities (2.4) and (2.5).
By recalling the definition (2.3), the identity (2.5) becomes

while (2.4) reads

*) The term "curvature tensor" means the Riemann-Christoffel curvature tensor obtained when $A_{\mu\nu}$ is substituted for the metric tensor $g_{\mu\nu}$ in the ordinary definition of the curvature tensor.

8S, 1
8X = y S a b.A,,,,
__- , (5- 3 )

where the following notations have been employed :

S, = S * A,,. (5.4)
The relationship between $S$ and $T$ is given by (2.6) with the aid of (5.1):

S, = TZ),- d~FpC, (5 * 5)
where

The antisymmetry of $F$ with respect to its pair of superscripts is due to the identity (2.7).
The relations (5.5) and (5.2) and the fact that $S_{\mu\nu}$ is a symmetric tensor density, as is easily seen from the definition (5.4), show that $S$ should be regarded as an energy-momentum tensor density of the field $\phi_A$ interacting with the field $A$.
Taking into account the transformation property of $S$ under (2.1), we can rewrite (5.3) in a covariant form

V S ~ =o
, , (5.3)
where
s u p=S,/JK ,
and the covariant derivative of S u pis

+ A;, *Sap
V k S , = dxSYP - A:p. S u b .

The existence of the non-vanishing right-hand side of (5.3) means that the energy-momentum of $\phi_A$ is not conserved owing to the interaction of $\phi_A$ with $A$. It is well known that (5.3) can be transformed into the expression

$\partial_\nu\{S_\mu{}^\nu + t_\mu{}^\nu\} = 0$ (5.6)
with the aid of the field equation of $A$, where $t_\mu{}^\nu$ represents a pseudo energy-momentum tensor density of the field $A$.
This result, that the energy-momentum of $\phi_A$ can no longer be conserved when the interaction of $\phi_A$ with $A$ takes place, is a consequence of the general theory of non-Abelian gauge fields, on which a brief explanation will be given in the next section.

§ 6. General remarks on the law of conservation*)

Consider a field $\phi_A(x)$, ($A = 1, 2, \ldots, N$), the field equation of which is derived from the action integral

$I = \int L_{(\phi)}(\phi_A, \phi_{A,\mu})\, d^4x$.

Let us assume that $I$ is invariant under a group of transformations depending on parameters $\epsilon_a$, ($a = 1, 2, \ldots, r$):**)
n
I

6$A(x) =$A(x) -&(x) =$B*CBAea.


For simplicity, the transformation of coordinates is excluded from our discussion.
The invariance of $I$ leads to a set of identities

The postulate that $I$ should be invariant even under an extended group, which depends upon arbitrary functions Ra(x) instead of the parameters $\epsilon_a$, necessitates the introduction of a generalized gauge field $A_{a\mu}(x)$ with a transformation property

R c (x)+ Ra, p .
6Aa, = Ab, .MbaC. (6.2)
This postulate of invariance gives rise to the following identities, if one follows a line of argument similar to that given in Appendix A:

where the new Lagrangian L, is

Ll ($A7 $A,*, Asp) -


The identity (6.5) implies that $L_1$ should depend upon $\phi_{A,\mu}$ and $A_{a\mu}$ only through an invariant derivative defined by

*) The first half of the content of this section is a review of the paper I.I.I., but our definition of $j_{(a)}$ is different from that given by (1.27) on page 1601 of I.I.I.
**) For brevity, coordinate transformations are not considered in this section.

The identity (6.3) is transformed into the following expression with the aid of (6.6) and (6.7):

(6.3)

where Lie's commutation relations*)


a b
[C, C] =facb * t?
have been used together with the reasonable assumption

A comparison of (6.3) with (6.1) suggests that $L_1$ should be chosen to have the same functional form as $L_{(\phi)}$, namely,
$L_1(\phi_A, \phi_{A,\kappa}, A_{a\kappa}) = L_{(\phi)}(\phi_A, \nabla_\kappa\phi_A)$.

Let us define the (a)-current $j$ by

It is easily seen from the definition (6.8) that j ( + )has a transformation property
8j{i;A=j::;-fCaa. L (x)
which leads to the definition of the invariant derivative of j,,,:
V Lj:;; = 9 , j{:) - A,..-f~.j::;. (6.9)
Making use of (6.9) and assuming the field equation of $\phi_A$

$[L_1]^A = 0$,

we can derive the equations of continuity for the (a)-current from the identity (6.3):
$\nabla_\mu\, j^{\mu}_{(a)} = 0$. (6.10)
(6.10) shows that the (a)-charge of the $\phi_A$-field defined by

is no longer conserved except for the Abelian case, in which the structure constants $f$ vanish.

Let us assume that a Lagrangian density $L_{(A)}(A_{a\mu}, A_{a\mu,\nu})$ is chosen in such a way that the action integral $I_A = \int L_{(A)}\, d^4x$ is invariant under the transformation (6.2). Then in a completely similar way we can derive many identities, among which the following identity corresponds to (6.4):
among which the following identity corresponds t o (6.4):
a
*) The (B, A ) element of the N x N matrix C is defined by
a a
(BlCIA)=C$.

(6.11)

where

(6-12)

is employed, (6-11) reads


] 0,
d, [j j f ) ) j + j ( q 2 = (6 - 13)
where the second term defined by

(6.14)

is interpreted as the (a)-charge carried by the A-field. The relation (5.6) in § 5 is a particular example of (6.13). In the former case, the energy and momentum stand for the (a)-charge of the present section.
The field equation (6.12) shows that the A-field emerges from a current density of the (a)-charge. This interpretation of (6.12), together with the fact that the A-field possesses the (a)-charge as is shown by (6.13), leads to the conclusion that the A-field can be produced by itself. This is the reason why a generalized gauge field associated with a non-Abelian group should obey a set of non-linear field equations.

Appendix
A. Derivation of identities (2.4)-(2.7)
Consider a variation of

11 = J L( d A , dA,x, A(,...)>A(a1...),x> d 4 z
9

due to the variations of $A and A(al...)


$A (x)- d l (x)= ad,= 6BC::. ~ Y ,Y
=6A(,,...
A [ a l . . . I ( -Acu1...)(x)
~) ,=A(e,...).D(8::::)Y,.fYy
which are caused by a transformation of coordinates
$x^\mu \to x'^\mu = x^\mu + \delta x^\mu = x^\mu + \xi^\mu(x)$.
$\delta I_1$ of the first order with respect to $\xi^\mu$ is given by

-t 8Ll 8A(+),
8A(U,...L
x
+ A1. Cyp} d *x
,

Let us introduce another kind of variation defined by

$\bar\delta\phi_A(x) = \phi'_A(x) - \phi_A(x) = \delta\phi_A(x) - \phi_{A,\kappa}\,\delta x^\kappa$,

which has a convenient property

In terms of this

SI, =

Substituting the definitions (A.1) for $\delta\phi_A$ and $\delta A_{(\alpha_1\cdots)}$ in the above expression and putting $\delta I_1 = 0$, we are led to the identities. In particular, when $\xi^\mu(x) = a^\mu + \lambda^\mu{}_\nu\, x^\nu$ and $A_{(\alpha_1\cdots)}(x)$ is replaced with a constant affine tensor $C_{\mu\nu}$ (and consequently $A_{(\alpha_1\cdots),\kappa} = 0$), the coefficient of each parameter $a^\mu$ or $\lambda^\mu{}_\nu$ in $\delta I_1$ should identically vanish because the domain of integration can be arbitrarily chosen. The relations thus obtained are the identities (4.3) and (4.4), where $\mathcal{L}$ takes the place of the present $L_1$.
On the contrary, if $\xi^\mu(x)$ is an arbitrary function of $x$, the first and the second terms of the right-hand side in (A.3) can be transformed by a partial integration to the following form:

where $\xi^\mu$ can take any value inside $S$. Thus we have the identity

[ * . * I p [a,{
= [L,lA.C2;.6B+ [L,]l.D(~::::)~
*A(b,...))

+ [ L J Ay 5 , , + [LJ-)
* * yo
A(a,*..),p] (A.5)
which is nothing but the identity (2.4). Inserting (A.5) into (A.4), we have

u,=l 3 , [ - . . 1 * . d 4 ~ = 0
for arbitrary t P s . By putting equal to zero the coefficients of cf, and its deriva-
tives i n the above identity, ( 2 . 5 ) - ( 2 . 7 ) can be derived.

where the following abbreviations have been employed :


A=
P-
A(fiaz.-+lA( p a r - . a t )>
AffY= A ( a Y a a . . . a r ) A t B a
DP- P s-.aT)

etc.
The double contraction of (B.1) by putting $\rho = \lambda$ and $\nu = \beta$ leads to
A: (5ra + 2 (27-+ 3 ) . r . b + 10. r (r+ 3 ) - c+ 2r(r- 1)(7- + 3 ) .d )==ZOb:,
which allows to put

= 6;
A aP-= A ( ~ % - . ~ r r lA(pa,...al.)
.
and c on s e q u e n t ly
A(%--%)
. A(a,..-ar) = 4 -
Inserting (B.3) into (B.2), we have
$5ra + 2(2r+3)rb + 10r(r+3)c + 2r(r-1)(r+3)d = 20$.
In a similar way, the contraction of (B.1) by putting $\alpha = \rho$ and $\mu = \lambda$ leads to
$6ra + 2r(2r+3)b + 10r(r+3)c + 2r(r-1)(r+3)d = 20$, (B.4)
where the normalization (B.3) and (B.3) have been employed. The third type of contraction $\alpha = \lambda$ and $\beta = \rho$ gives a relation
$15ra + 10r(2r+3)b + 2r(7r+3)c + 2r(2r+3)(r-1)d = 10$. (B.5)

A subtraction of (B.2) from (B.4) gives
$ra = 0$
or
$a = 0$. (B.6)
If (B.1) is contracted by putting $\mu = \lambda$, we obtain a relation
{Z (T- +
1)* r - b G ( r - 1) -T. C S2r(r- 1) (rt l)d}A:,
= (5 - rb - r ( r + a ) c - Y(T- 1) .d}8s
;; - 2r(7-+ 2) b .s;aY,. (3.7)
Since the left-hand side of (B.7) is symmetric with respect to the pairs of suffixes $(\alpha, \nu)$ and $(\beta, \rho)$, the same should be true for the right-hand side. Thus we have
~ - T ~ - T ( T + ~ ) c - T ( T - ~ ) -d=-2r(r+2) * b , (B.8)
and consequently (B .7) becomes
+
{2r(r--I)b 6 ( r - 1) . T . c + 2 ~ ( 7 - 1 ) ( ~ + -ld)} A G = - 4 r ( r f 2) .b-8{%)).
(B .7)
Similarly, contractions fi = p and a =X lead to
{CT+ T(T- 1)d)A:; = (1- (2r+3) rb - T ( T + 1)C- T(T - 1)d}8:;;; (B -9)
and
{ ~ T ( Y - l)b + r ( r - 1)
C+ ~ ( r l)d}A$
- z= (1- 5rb - ~ ( r 1)
+C- T(T- 1) d}@$]
(B - 10)
respectively.
In (B.7), (B.9) and (B.10), if the coefficients of $A$ do not vanish, these relations show that $A$ should be proportional to $\delta$. Recalling the normalization (B.3), we have to put

A;; = +8$p). (B .11)


Thus (B.9) and (B.10) give
$5(2r+3)rb + (5r+7)rc + (2r+3)(r-1)rd = 5$ (B.9')
and
$5(2r+3)rb + (7r+3)rc + (2r+3)(r-1)rd = 5$ (B.10')
respectively.
A subtraction of (B.9') from (B.10') gives
$(2r-4)\,rc = 0$. (B.12)
If we choose $r = 2$, then (B.11) becomes unsolvable relations
(x)= $ 8 g ) .
A(x)*A,,

On the contrary, if $c = 0$ is chosen, we have three relations:


(B . 8 ) -+ ( 2 ~ +
3) T*b = T ( T - 1)d - 5 ,
(B.9), (B-10)+5(2~-+3)- ~ . r l l + ( 2 ~ $ 3 () ~ * - .7-.d=5,
l)
( B - 7 ) - > 3 ( 2 ~ + 3 ).b4- (~-1)(r+l)d=C)-
These three equations are unsolvable with respect to $b$ and $d$. Therefore (B.12) cannot hold.
Thus we arrive at the conclusion that all the coefficients on both sides of relations (B.7), (B.9) and (B.10) should vanish. Vanishing of the right-hand side of (B.7) gives
$b = 0$, (B.13)
while the left-hand side of (B.7) gives a relation when it vanishes:
$3c + (r+1)d = 0$. (B.14)
Similarly, (B.9) gives a couple of relations
$c + (r-1)d = 0$, (B.15)
$1 = r(r+1)c + r(r-1)d$. (B.16)
Finally from (B.10) we obtain relations

C S (r-l)d=O. (3 * 17)
$1 = r(r+1)c + r(r-1)d$. (B.18)
A comparison of (B.15) with (B.17) gives
$(r-2)\,d = 0$.
Since the case $d = 0$ leads to a contradiction, as is easily seen, we obtain
$r = 2$.
It is easily verified that the solution of our problem is
$a = b = 0$, $c = -d = \tfrac{1}{4}$.
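
The final step can be checked with exact rational arithmetic. The sketch below uses only those relations of this appendix that are legible in this copy: (B.14), (B.15) and (B.16) with $a = b = 0$, plus the contracted relations following (B.2), (B.4) and (B.5) as a cross-check. The function names are illustrative and not part of the paper.

```python
from fractions import Fraction

def solve_for_rank(r):
    """Return (c, d) solving (B.14)-(B.16) with a = b = 0, or None if impossible."""
    # (B.14): 3c + (r+1)d = 0 and (B.15): c + (r-1)d = 0 are homogeneous in (c, d).
    determinant = 3 * (r - 1) - (r + 1)  # equals 2r - 4
    if determinant != 0:
        return None  # only c = d = 0 remains, which contradicts (B.16)
    # The determinant vanishes only for r = 2, where both relations give d = -c.
    # (B.16): 1 = r(r+1)c + r(r-1)d with d = -c fixes c.
    c = Fraction(1, r * (r + 1) - r * (r - 1))
    return c, -c

def residuals(r, a, b, c, d):
    """Left side minus right side of the three contracted relations."""
    first = 5*r*a + 2*(2*r + 3)*r*b + 10*r*(r + 3)*c + 2*r*(r - 1)*(r + 3)*d - 20
    b4 = 6*r*a + 2*r*(2*r + 3)*b + 10*r*(r + 3)*c + 2*r*(r - 1)*(r + 3)*d - 20
    b5 = 15*r*a + 10*r*(2*r + 3)*b + 2*r*(7*r + 3)*c + 2*r*(2*r + 3)*(r - 1)*d - 10
    return first, b4, b5

solutions = {r: solve_for_rank(r) for r in range(1, 7)}
print(solutions)
print(residuals(2, 0, 0, *solutions[2]))
```

Only the rank $r = 2$ admits a solution in this scan, with $c = -d = 1/4$, and all three residuals vanish for it.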

References

1) R. Utiyama, Phys. Rev. 101 (1956), 1597.
   This paper is referred to as I.I.I.
2) T. W. B. Kibble, J. Math. Phys. 2 (1961), 212.
3) T. Nakano & K. Hayashi, Prog. Theor. Phys. 38 (1967), 491.
4) cf. I.I.I.

VOLUME 33, NUMBER 7 PHYSICAL REVIEW LETTERS 12 AUGUST 1974

Integral Formalism for Gauge Fields

C. N. Yang
Institute of Theoretical Physics, University of Wrocław, Wrocław, Poland, and
Institute of Theoretical Physics, State University of New York, Stony Brook, New York 11790
(Received 10 June 1974)

A new integral formalism for gauge fields is described. Further developments are presented, including gravitation equations related to, but not identical with, Einstein's equations.

It was pointed out by Weyl many years ago that the electromagnetic field can be formulated in terms of an Abelian gauge transformation. This idea was extended^1 in 1954 to the concept of gauge fields for non-Abelian groups. That formulation, like the Weyl formulation for electromagnetism, was based on the replacement of $\partial_\mu$ by $\partial_\mu - ieB_\mu$. One might call such formulations differential formulations. It is the purpose of the present paper to reformulate the concept of gauge fields in an integral formalism. The new formalism is conceptually superior to the differential formalism and allows for natural developments of additional concepts. It further allows a mathematical and physical discussion of the gravitational field as a gauge field, resulting in equations related, but not identical, to Einstein's.
The basic point is the fact that electromagnetism is a nonintegrable phase factor, a fact discussed many years ago by Dirac, Peierls, and others, and more recently by many authors.^2 This fact is now generalized as follows:
Definition of a gauge field.-Consider a manifold with points on it labeled by $x^\mu$ ($\mu = 1, 2, \ldots, n$) and consider a gauge $G$ which is a Lie group with generators $X_k$ ($k = 1, 2, \ldots, m$). [For $G = U(1)$ we have electromagnetism; for $G$ non-Abelian we have non-Abelian gauge fields.] Define a path-dependent (i.e., nonintegrable) phase factor $\varphi_{AB}$ as an element of the group $G$ associated with path $AB$ between two points $A$ and $B$ on the manifold. The association is to have the group property: $\varphi_{ABC} = \varphi_{AB}\,\varphi_{BC}$, where the paths $AB$ and $BC$ are segments of $ABC$. Furthermore for an infinitesimal path $A$ to $A + dx^\mu$ the phase factor is close to the identity $I$ of $G$, so that^3
$\varphi_{A(A+dx)} = I + b_\mu^k(x)\, X_k\, dx^\mu$. (1)
The function $b_\mu^k(x)$ defined on the manifold will be called a gauge potential; $\varphi_{AB}$ will be called a gauge phase factor.
With this definition additional concepts and theorems are naturally developed. We summarize some of these below. Details will be published elsewhere.
Gauge field strength.-Consider a path $ABCDA$ forming the border of an infinitesimal parallelogram with sides $dx$ and $dx'$. $\varphi_{ABCDA}$ can be computed by multiplying four phase factors like (1) together, resulting in

where

$f_{\mu\nu}^k$ will be called a gauge field, or gauge field strength. They are the Faraday-Maxwell fields when $G = U(1)$.
Gauge transformation.-A gauge transformation in the integral formalism is defined by a transformation
$\varphi_{AB} \to \varphi'_{AB} = \xi_A\, \varphi_{AB}\, \xi_B^{-1}$, (5)
where $\xi_A$ is an element of $G$ which depends on the point $A$. It is clear that under (5)
$\varphi_{ABCDA} \to \varphi'_{ABCDA} = \xi_A\, \varphi_{ABCDA}\, \xi_A^{-1}$. (6)
Thus

where $R$ is the adjoint representation for the element $\xi$. The simple transformation property (7) is the definition for the concept that $f_{\mu\nu}^k$ is gauge covariant. Generalization to other representations $R$ of $G$ for a gauge-covariant quantity $\psi_J$ is immediate^3:

$b_\mu^k$ is not gauge covariant; $f_{\mu\nu}^k$ is.
Gauge-covariant differentiation.-To retain


gauge covariance in differentiation we define
$\psi_{K|\mu} = \partial_\mu\psi_K + b_\mu^k\,(K|Z_k|J)\,\psi_J$, (9)
where $Z_k$ is the matrix representation of $X_k$. Generalization to other cases is obvious. An interesting theorem is that
$f_{\mu\nu|\lambda}^k + f_{\nu\lambda|\mu}^k + f_{\lambda\mu|\nu}^k = 0$, (10)
which is the gauge-Bianchi identity.
Introduction of a Riemannian metric.-So far we need no metric for the manifold. Now we introduce a metric for it and discuss arbitrary coordinate transformations. We come then naturally to Riemannian covariant quantities and doubly covariant derivatives. $b_\mu^k$ is Riemannian covariant, since $\varphi_{AB}$ is coordinate-system independent. $f_{\mu\nu}^k$ is doubly covariant. We have
$\psi_{K\|\mu} = \psi_{K|\mu}$,
etc. It is easily shown that
$f_{\mu\nu\|\lambda}^k + f_{\nu\lambda\|\mu}^k + f_{\lambda\mu\|\nu}^k = 0$ (12)
which is satisfied by all gauge fields on all Riemannian manifolds.
Source of gauge fields.-We define, in analogy with electromagnetism, a source four-vector $J_\mu^k$ for a gauge field:
$J_\mu^k = g^{\nu\lambda} f_{\mu\nu\|\lambda}^k = f_{\mu\nu}^{k\ \|\nu}$. (13)
After some computation one derives a theorem:
$g^{\mu\lambda} J_{\mu\|\lambda}^k = 0$ (conserved current), (14)
which in electromagnetism states charge conservation. In Ref. 1 this was Eq. (14). One can also generalize Eqs. (15) and (16) of Ref. 1, leading to the concept of total charge.
Parallel-displacement gauge field.-For any Riemannian manifold, the important concept of parallel displacement defines, along any path $AB$, a linear relationship between any vector $V_A$ at $A$ and its parallel vector $V_B$ at $B$. Thus parallel displacement is defined by an $n \times n$ matrix $M_{AB}$ which gives this linear relationship. $M_{AB}$ is a representation of an element of $GL(n)$. Thus we have the following:
Theorem-Parallel displacement defines a gauge field with $G$ being $GL(n)$. The index $k$ has $n^2$ values and we write $k = (\alpha\beta)$. The gauge potential and gauge fields are respectively

It is important to recognize that in this definition we have chosen a fixed coordinate system. A coordinate transformation would generate a linear transformation in the vector spaces $V_A$ and $V_B$. In other words $M_{AB} \to N_A M_{AB} N_B^{-1}$. Comparison with (5) shows thus that a coordinate transformation generates a simultaneous gauge transformation of the parallel-displacement gauge potential. In fact, the usual nonlinear term in the transformation of $\{{}^{\alpha}_{\beta\gamma}\}$ is precisely the nonlinear term needed in the gauge transformation of the gauge noncovariant quantity $b_\mu^k$. In this connection we observe that for $GL(n)$,

where the semicolon represents the usual Riemannian covariant differentiation with $\alpha$ and $\beta$ treated as usual contravariant and covariant indices. The rule works also in general. E.g.,

Nontrivial sourceless gauge fields.-Gauge fields for which $f_{\mu\nu}^k \neq 0$ and $J_\mu^k = 0$ are of physical interest. So far only nonanalytic examples^4 are known.
We now can construct two general types of examples.
(a) Consider the natural Riemannian geometry of a semisimple Lie group. Its parallel-displacement gauge field is sourceless and analytic.
(b) Consider the same Riemannian manifold of a group $G$ as above in (a). Define $\varphi_{AB}$ as that for an infinitesimal path $AB$, pAa=@-B). This gauge phase factor which is itself an element of $G$ gives a gauge field which is analytic and sourceless.
Pure spaces.-A Riemannian manifold for which the parallel-displacement gauge field is sourceless will be called a pure space. A necessary and sufficient condition for a pure space is

A four-dimensional Einstein space, i.e., one for


which $R_{\alpha\beta} = 0$, is a pure space.
Gravitational field as a gauge field.-The electromagnetic field and the usual gauge fields are special cases of gauge fields, satisfying (12) and (13). A natural question is whether one should identify these same equations for the parallel-displacement gauge field as the equations for the gravitational field. There are advantages in this identification and we shall come back to this topic in a later communication. If one adopts this identification then gravitational equations are third-order differential equations^5 for $g_{\mu\nu}$. A pure gravitational field is then described by a pure space as defined above.
Variational principles.-Equation (13) with $J_\mu^k = 0$ follows from a variational principle $\delta\int \mathcal{L}\sqrt{-g}\, d^nx = 0$, where

In the variation $g_{\mu\nu}$ is kept fixed and $b_\mu^k$ is varied, and $f_{\mu\nu}^k$ is given by (3); $C_k{}^{ab}$ are not varied. One could also find a variational principle which is satisfied by a pure space (19). Choose $C_k{}^{ab}$ to be the structure constants for $GL(n)$, given by (16). Write the $\mathcal{L}$ of (20) as a functional of $b_\mu^{(\alpha\beta)}$ and $g^{\mu\nu}$:

which of course also contains derivatives of $b_\mu^{(\alpha\beta)}$ and $g^{\mu\nu}$. Now form the variation

$\times \sqrt{-g}\, d^nx = 0$, (22)
in which $b_\mu^{(\alpha\beta)}$ and $g^{\mu\nu}$ are independently varied. The resultant equations are satisfied by (15) and (19).
It is a great pleasure to acknowledge the warm hospitality extended to me during my visit to the Institute of Theoretical Physics at Wrocław where this paper was written.

^1 C. N. Yang and R. L. Mills, Phys. Rev. 96, 191 (1954).
^2 S. Mandelstam, Ann. Phys. (New York) 2 1, 25 (1962); I. Białynicki-Birula, Bull. Acad. Pol. Sci., Ser. Sci. Math. Astron. Phys. 11, 135 (1963).
^3 We use the summation convention for repeated indices. Greek indices run from 1 to $n$. Lower case Latin indices run from 1 to $m$. $m$ of course is also the dimension of the adjoint representation of $G$. Upper case Latin indices run from 1 to $M$, where $M$ is the dimension of a representation of $G$.
^4 T. T. Wu and C. N. Yang, in Properties of Matter under Unusual Conditions, edited by H. Mark and S. Fernbach (Wiley, New York, 1969), p. 349.
^5 R. Utiyama, Phys. Rev. 101, 1597 (1956), had concluded that Einstein's equations are gauge-field equations. We believe that was an unnatural interpretation of gauge fields.
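
The construction of the gauge field strength from four phase factors, and the transformation rules (5) and (6), can be illustrated for the Abelian case $G = U(1)$. The sketch below is only an illustration under stated assumptions: the phase factor of a short segment is taken as $\exp(i\, b_\mu dx^\mu)$, the potential describes a uniform field of hypothetical strength `FIELD`, and the gauge function is arbitrary. None of these choices comes from the paper, and sign conventions may differ from those of the missing equations (2) to (4).

```python
import cmath
import math

FIELD = 0.7  # hypothetical uniform field strength (illustrative value)

def potential(x, y):
    """Gauge potential (b_x, b_y) with constant curl d_x b_y - d_y b_x = FIELD."""
    return (-0.5 * FIELD * y, 0.5 * FIELD * x)

def phase_factor(p, q):
    """U(1) phase factor for the straight segment p -> q (midpoint rule)."""
    mx, my = (p[0] + q[0]) / 2, (p[1] + q[1]) / 2
    bx, by = potential(mx, my)
    return cmath.exp(1j * (bx * (q[0] - p[0]) + by * (q[1] - p[1])))

def gauge(p):
    """Hypothetical gauge function xi_A, an element of U(1) at each point."""
    return cmath.exp(1j * math.sin(3.0 * p[0]) * math.cos(2.0 * p[1]))

def transformed_phase_factor(p, q):
    """Transformation (5): phi_AB -> xi_A * phi_AB * xi_B^{-1}."""
    return gauge(p) * phase_factor(p, q) / gauge(q)

def loop(factor, a, dx, dy):
    """Multiply the four factors around the parallelogram ABCDA."""
    pa = a
    pb = (a[0] + dx, a[1])
    pc = (a[0] + dx, a[1] + dy)
    pd = (a[0], a[1] + dy)
    return factor(pa, pb) * factor(pb, pc) * factor(pc, pd) * factor(pd, pa)

start, dx, dy = (0.3, -0.2), 1e-2, 1e-2
plain = loop(phase_factor, start, dx, dy)
transformed = loop(transformed_phase_factor, start, dx, dy)
print(cmath.phase(plain), FIELD * dx * dy)  # loop phase vs. field strength times area
print(abs(plain - transformed))             # loop factor unchanged by (5) for U(1)
```

For $U(1)$ the conjugation in (6) is trivial, so the loop factor itself is gauge invariant; for a non-Abelian group it would instead transform by conjugation with $\xi_A$.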


Yang's Comments Concerning Gravitational and Gauge Fields

C. N. Yang, Talking about Physics Research and Teaching

The 5th talk at the Graduate School of the Chinese University of Science and Technology, Beijing, China (1986/5/27 - 1986/6/12).

XX: What is your view concerning the unification of gravitational field and gauge field?
Yang: Comparing their formulas, one finds that they are very similar. There is no question that they have an intimate relation. But what kind of relation it is, after all, is still a controversial problem.
Both $F_{\mu\nu}$ in gauge field and $R_{\mu\nu}$ in gravitational field are curvatures in geometry. $R_{\mu\nu}$ is the second order derivative of $g_{\mu\nu}$. Therefore, Einstein's gravitational field equation is a second order differential equation of $g_{\mu\nu}$, while the equation of motion $\partial^\mu F_{\mu\nu} = \ldots$ for gauge field is a first order differential equation of curvature. This is what happens in electromagnetism. The concept that both $F_{\mu\nu}$ and $R_{\mu\nu}$ are curvatures is absolutely primary. I think that this is absolutely unchangeable. Since $\partial^\mu F_{\mu\nu} = \ldots$ is the first order differential equation of curvature, the gravitational field equation should be a third order differential equation of $g_{\mu\nu}$. This is a clue that Einstein's theory of gravity needs to be modified. In 1974, when the geometric structure was clarified, I proposed a new gravitational equation. However, I knew how to write down the left-hand side of the equation and could not write down the right-hand side. At present, this problem has not been solved.

VOLUME 35, NUMBER 5 PHYSICAL REVIEW LETTERS 4 AUGUST 1975

Yang's Gravitational Field Equations*

Wei-Tou Ni
Institute of Physics, National Tsing Hua University, Hsinchu, Taiwan, Republic of China
(Received 27 March 1975)

Yang's gravitational field equations in vacuo can be regarded as derivative equations of both Einstein's equations and Nordström's equations, and embrace all their solutions. Yang's equations admit monopole gravitational radiations; therefore no analog of the Birkhoff theorem can be valid. The most general static spherical-symmetric solution contains four arbitrary parameters. In particular, $ds^2 = -dt^2 + (1 + c_1/r + c_2 r^2)^{-1}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2)$ is a two-parameter exact solution. This metric possesses no gravitational red shifts.

Recently, by using an integral formalism for gauge fields, Yang has proposed the following gravitational field equations for pure spaces:
$R_{ij;k} - R_{ik;j} = 0$. (1)
It is interesting to notice that (1) are satisfied by all vacuum solutions of Einstein's general relativity
$G_{ij} + \Lambda g_{ij} = 0$ (2)
and Nordström's second theory^{2,3} (in Einstein-Fokker form)
$R = 0$, $C_{ijkl} = 0$, (3)
where $C_{ijkl}$ is the Weyl tensor. In fact, from Bianchi identities and the definition of the Weyl tensor, a simple calculation shows that (1) are equivalent to

$R_{,i} = 0$, $C_{ijkl}{}^{;l} = 0$. (5)
Equations (4) are differentiated equations (antisymmetrized in $j$ and $k$) of (2), while (5) are differentiated equations (summed over $l$) of (3). Hence Yang's theory is a derivative theory of both Einstein's theory and Nordström's theory. It is amusing and intriguing that the two eminent and structurally different theories of gravity could be embraced in a single set of equations.
Since Nordström's theory admits monopole radiations, Yang's theory admits them too. Therefore no analog of the Birkhoff theorem of general relativity could be valid for Yang's theory. More specifically, (1) have the following time-dependent spherical-symmetric solution (the time dependence cannot be transformed away by coordinate changes):

where $c_0$ is an arbitrary constant, and $f$ and $g$ are two arbitrary functions. Hence spherical symmetry does not imply time independence.
In the static spherical-symmetric case, the metric can be put in the form
$ds^2 = -e^{2\Phi(r)}\,dt^2 + e^{2\Lambda(r)}\,dr^2 + r^2(d\theta^2 + \sin^2\theta\, d\varphi^2)$. (7)
For this metric (1) reduce to
A - 2 A 2 + A + + + f 2 + r - 2 ( 1 -e2*) = O
and
392

VOLUME 35, NUMBER 5 PHYSICAL REVIEW LETTERS 4 AUGUST 1975

An integration of (9) gives
$\Phi'' + \Phi'^2 - \Phi'\Lambda' + 2r^{-1}(\Phi' - \Lambda') + r^{-2}(1 - e^{2\Lambda}) + \tfrac{1}{2}R_0 e^{2\Lambda} = 0,$ (10)
where $R_0$ is the constant scalar curvature. Both (8) and (10) are second order in $\Lambda$ and $\Phi'$; hence the general solution of (8) and (10) has four arbitrary parameters. (Because of nonlinearity there might be a discrete number of such sets of four parameters.) The additive constant in $\Phi$ can be removed by a change of time scale. Therefore the general static spherical-symmetric solution has four arbitrary parameters. This demonstrates that the solution of (1) is much richer than that of (2) (two parameters) or (3) (one parameter). As a matter of fact
$ds^2 = -dt^2 + (1 + c_1/r + c_2 r^2)^{-1}dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\varphi^2)$ (11)
is a solution of (1) but not of (2) or (3); (11) possesses no gravitational red shifts.⁷ The problems of boundary conditions and sources for (1) deserve extensive studies to clear up this richness of solutions. In view of the present success of the renormalization of gauge theories, these studies could contribute to the solution of the long-standing problems of quantum gravity.
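As a consistency check on (11), one can substitute $\Phi = 0$ into the static equations. The sketch below reads (8) and (10) in the forms given in its comments, which is an assumption about the intended equations. It also uses the shorthand $u(r) = e^{-2\Lambda}$, which is not in the original.

```latex
% Assumed reading of Eqs. (8) and (10) for the metric (7):
%   (8):  \Lambda'' - 2\Lambda'^2 + \Lambda'\Phi' + \Phi'^2 + r^{-2}(1 - e^{2\Lambda}) = 0
%   (10): \Phi'' + \Phi'^2 - \Phi'\Lambda' + 2r^{-1}(\Phi' - \Lambda')
%           + r^{-2}(1 - e^{2\Lambda}) + (1/2) R_0 e^{2\Lambda} = 0
% Shorthand (not in the original): u(r) = e^{-2\Lambda} = 1 + c_1/r + c_2 r^2, with \Phi = 0.
\begin{align*}
\Lambda' &= -\frac{u'}{2u}, \qquad
\Lambda'' - 2\Lambda'^2 = -\frac{u''}{2u}, \qquad
1 - e^{2\Lambda} = \frac{u - 1}{u},\\
u \times (8):\quad & -\tfrac{1}{2}u'' + \frac{u - 1}{r^2}
  = -\Big(\frac{c_1}{r^3} + c_2\Big) + \Big(\frac{c_1}{r^3} + c_2\Big) = 0,\\
u \times (10):\quad & \frac{u'}{r} + \frac{u - 1}{r^2} + \tfrac{1}{2}R_0
  = \Big(-\frac{c_1}{r^3} + 2c_2\Big) + \Big(\frac{c_1}{r^3} + c_2\Big) + \tfrac{1}{2}R_0
  = 3c_2 + \tfrac{1}{2}R_0 .
\end{align*}
```

Under these assumptions (8) holds identically and (10) holds with $R_0 = -6c_2$, so $c_1$ and $c_2$ remain free. Since $g_{tt} = -1$, the absence of a gravitational red shift is immediate.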
I am grateful to K. S. Cheng, J. C. Shaw, and E. Yen for their helpful discussions on the background of Yang's paper.

*Work supported in part by the National Science Council, Republic of China.


¹C. N. Yang, Phys. Rev. Lett. 33, 445 (1974).
²G. Nordström, Ann. Phys. (Leipzig) 42, 533 (1913), and 43, 1101 (1914).
³A. Einstein and A. D. Fokker, Ann. Phys. (Leipzig) 44, 321 (1914).
⁴W.-T. Ni, Astrophys. J. 176, 769 (1972).
⁵The second term in (2) is a constant term and, hence, does not contribute to (4).
⁶In view of this result, the recent claim of Thompson [Phys. Rev. Lett. 34, 507 (1975)] that the Birkhoff theorem of general relativity generalizes to (1) is invalid.
⁷W.-T. Ni, report presented at the annual meeting of the Chinese Physical Society, Chung-Li, Taiwan, 26 January 1975 (unpublished).

Einstein Lagrangian as the translational Yang-Mills Lagrangian*


Y. M. Cho†
The Enrico Fermi Institute, and The Department of Physics, University of Chicago, Chicago, Illinois 60637
(Received 1 December 1975; revised manuscript received 6 May 1976)
The gauge theory of translation with a Yang-Mills-type Lagrangian quadratic in the field strengths is shown to be precisely Einstein's theory of gravitation and the corresponding gauge transformation is identified as the general coordinate transformation. The gauge potentials of the translation group are interpreted as the nontrivial part of the vierbein fields and the gauge field strengths are given in terms of the anholonomity of the local orthonormal basis one starts with.

I. INTRODUCTION

Gauge theories of the Yang-Mills type and Einstein's theory of gravitation have a common feature: the self-interaction of the fields. Then one is led to ask whether Einstein's theory itself is a gauge theory. Of course this is an old question and many people²,³ have suggested that Einstein's theory can be viewed as the gauge theory of the four-dimensional translation group $T_4$. Unfortunately certain features seem not to have been fully clarified so far, and it is precisely these features that bear out the complete relationship between the Yang-Mills and Einstein theories.

In this paper we show that if one only applies the gauge principle (this includes a Yang-Mills-type Lagrangian quadratic in the field strengths) for the group of translation $T_4$ of space-time, the gauge theory that one obtains is unique and becomes precisely Einstein's theory of gravitation. In this $T_4$ gauge formalism of Einstein's theory the translational gauge potentials are identified as the nontrivial part of the vierbein fields and the gauge field strengths are given in terms of the commutator coefficients (i.e., the anholonomity) of the local orthonormal basis one starts with.

To prove that the unique gauge theory of the translation is Einstein's theory, it is important to observe that although the gauge group $T_4$ is Abelian, it is not an internal-symmetry group and acts on space-time itself. Fortunately the geometric meaning of gauge theories has been well understood by now in terms of the bundle picture.⁴⁻⁷ The power of this bundle picture has been appreciated by Cho and Freund⁶ in unifying gauge theories with gravitation and also recently by Wu and Yang. In the following we will first prove our claim in a formal way constructing the eight-dimensional bundle of the translation group $T_4$ over space-time and then will give a precise physical meaning to this translational bundle. For the details about the bundle formalism of gauge theories we refer the reader to Ref. 5.

14 2521

II. THE GAUGE THEORY OF THE TRANSLATION GROUP

Let us assume that the structural group G of our bundle P is $T_4$ with four commuting generators $\xi_a$ (a = 1, 2, 3, 4),
$[\xi_a, \xi_b] = 0,$ (1)
and that the base manifold M is the four-dimensional space-time with an orthonormal basis at each point, i.e., four orthonormal vector fields $e_i$ (i = 1, 2, 3, 4) with the commutation relations
$[e_i, e_j] = T_{ij}{}^{k} e_k.$ (2)
Of course the basis independence of a theory is one of the basic principles in physics, and one can choose any other basis if one wants to, but for obvious reasons the local orthonormal basis (2) is the natural one to start with in our problem. Notice that this orthonormal basis is not in general a coordinate basis since the basis vectors do not commute. If we introduce a coordinate basis $\partial_\mu$ ($\mu$ = 1, 2, 3, 4) with
$[\partial_\mu, \partial_\nu] = 0,$
then $e_i$ can be written in terms of the vierbein fields $h_i^\mu$,
$e_i = h_i^\mu \partial_\mu,$
and correspondingly we have
$T_{ij}{}^{k} = (\partial_i h_j^\mu - \partial_j h_i^\mu) h_\mu^k.$ (3)
Here $\partial_i = h_i^\mu \partial_\mu$ is the directional derivative in the direction of $e_i$ and $h_\mu^i$ are the inverse vierbein fields
$h_i^\mu h_\mu^j = \delta_i^j, \quad h_\mu^i h_i^\nu = \delta_\mu^\nu.$
Observe that due to the commutation relation (2) of the basis $e_i$, the directional derivatives $\partial_i$ do not commute either.

At this point we would like to emphasize that all the above expressions are just a matter of a formalism and we have not assumed that our space-time is curved. Eventually we will create a curvature by introducing gauge fields associated with $T_4$ symmetry, but we start with a flat space-time and so far our $h_i^\mu$ remain trivial.

Now, given a connection form⁸ $\omega = \omega^a \xi_a$ in the bundle P, the gauge potentials $B_i^a$ are as usual given by the connection coefficients of $\tilde e_i$, the lift of $e_i$ into a four-dimensional gauge-defining submanifold $\sigma$, i.e., a cross section of P:
$\omega^a(\tilde e_i) = \kappa B_i^a,$ (4)
where we have introduced a dimensional constant $\kappa$ (of dimension of a length) to give the canonical dimension to the gauge potentials $B_i^a$. This $\kappa$ will serve as the coupling constant for the gauge group $T_4$ and will be related to the gravitational constant later on.

With $\hat e_i$ (i = 1, 2, 3, 4) as the horizontal lift of $e_i$ and $\xi_a^*$ (a = 1, 2, 3, 4) as the fundamental vector fields which are vertical, we clearly have
$[\xi_a^*, \xi_b^*] = 0,$
$[\xi_a^*, \hat e_k] = 0,$ (5)
$[\hat e_i, \hat e_j] = T_{ij}{}^{k} \hat e_k - \kappa G_{ij}{}^{a} \xi_a^*,$
where $G_{ij}{}^{a}$ are the vertical components of the commutator coefficients of the horizontal lift vector fields $\hat e_i$. The first two equations come from the definition and the third is due to the fact that the projection of $[\hat e_i, \hat e_j]$ down to the base manifold is the same as $[e_i, e_j]$. Notice that because of the Abelian character of the $\xi_a^*$'s the group action on the bundle space is really a translation and there is no "rotation" whatsoever. Mathematically this means that the holonomy group of P is $T_4$ and not the Lorentz group.

Let us recall that in the bundle picture the gauge field strengths are given by the vertical commutator coefficients of two horizontal vector fields, i.e., by $G_{ij}{}^{a}$. To find the gauge field strengths in terms of potentials $B_i^a$, notice that from Eq. (4) and from the definition of $\omega^a$,
$\omega^a(\hat e_i) = 0,$
$\omega^a(\xi_b^*) = \delta^a_b,$
it follows that
$\hat e_i = \tilde e_i - \kappa B_i^a \xi_a^*,$
and
$[\hat e_i, \hat e_j] = [\tilde e_i - \kappa B_i^a \xi_a^*, \tilde e_j - \kappa B_j^b \xi_b^*]$
$= T_{ij}{}^{k} \tilde e_k - \kappa(\partial_i B_j^a - \partial_j B_i^a)\xi_a^*$
$= T_{ij}{}^{k} \hat e_k - \kappa[(\partial_i B_j^a - \partial_j B_i^a) - T_{ij}{}^{k} B_k^a]\xi_a^*.$ (5')
Thus one has
$G_{ij}{}^{a} = (\partial_i B_j^a - \partial_j B_i^a) - T_{ij}{}^{k} B_k^a.$ (6)

Now following Kibble's suggestion² we interpret the gauge potentials $B_i^\mu$ as the nontrivial part of the vierbein fields
$h_i^\mu = \delta_i^\mu + \kappa B_i^\mu.$ (7)
This means that we now have created the curvature of space-time by introducing the gauge potentials $B_i^\mu$ for $T_4$ and making the vierbein fields $h_i^\mu$ nontrivial. Notice that the decomposition of $h_i^\mu$ into $\delta_i^\mu$ and $B_i^\mu$ is basis-dependent since $\delta_i^\mu$ is not invariant under a rotation of the local orthonormal basis $e_i$. From Eqs. (3), (6), and (7) one finds that
$G_{ij}{}^{a} = \frac{1}{\kappa} T_{ij}{}^{k} \delta_k^a.$ (6')
The gauge field strengths $G_{ij}{}^{a}$ are thus determined by the commutation coefficients $T_{ij}{}^{k}$ of the orthonormal basis vectors that one starts with. We would like to emphasize here that once the connection $\omega$ (i.e., the gauge potentials in physical terms) is given, the gauge field strengths $G_{ij}{}^{a}$ are uniquely determined from the geometrical structure of the bundle and are not something that one can define otherwise as sometimes suggested.

Gauge transformations in this picture are changes of bundle cross sections. If we change the cross section $\sigma$ to $\sigma'$ by a four-translation $\theta^a(x)$ (a = 1, 2, 3, 4) in the four-dimensional fiber space [geometrically $\theta^a(x)$ are simply a set of transition functions that relate $\sigma'$ to $\sigma$], we clearly have
$\tilde e_i' = \tilde e_i + (\partial_i \theta^a)\xi_a^*$
and (8)
$B_i'^a = \frac{1}{\kappa}\omega^a(\tilde e_i') = B_i^a + \frac{1}{\kappa}\partial_i \theta^a.$
From Eq. (7) this means that under the gauge transformation we have
$h_i^\mu \to h_i'^\mu = h_i^\mu + \partial_i \theta^\mu$
$= h_i^\nu(\delta_\nu^\mu + \partial_\nu \theta^\mu)$
$= h_i^\nu X_\nu^\mu,$
where
$X_\nu^\mu = \delta_\nu^\mu + \partial_\nu \theta^\mu,$
and now the gauge transformation is unambiguously identified as a general coordinate transformation in the coordinate basis $\partial_\mu$:
$\partial_\mu \to \partial_\mu' = (\delta_\mu^\nu + \partial_\mu \theta^\nu)\partial_\nu$
$= X_\mu^\nu \partial_\nu.$ (9)
Notice that gauge transformations (or equivalently general coordinate transformations) do not change
the orthonormal basis $e_i$: they change the coordinate basis $\partial_\mu$. Also notice that under a local gauge transformation one has
$G'_{ij}{}^{a} = (\partial_i B'_j{}^{a} - \partial_j B'_i{}^{a}) - T_{ij}{}^{k} B'_k{}^{a}$
$= G_{ij}{}^{a} + \frac{1}{\kappa}(\partial_i \partial_j - \partial_j \partial_i)\theta^a - \frac{1}{\kappa} T_{ij}{}^{k} \partial_k \theta^a$
$= G_{ij}{}^{a},$ (10)
i.e., $G_{ij}{}^{a}$ are invariant under gauge transformations as expected for an Abelian gauge group.

Once we have the gauge field strengths $G_{ij}{}^{a}$, we can write in the manner of Yang and Mills the most general Lagrangian quadratic in these $G_{ij}{}^{a}$. Using Eq. (6') we have
$\mathcal{L} = \sqrt{-g}\, G_{ij}{}^{a} G_{kl}{}^{b} (a\,\eta^{ik}\eta^{jl}\eta_{ab} + b\,\eta^{ik}\delta^l_a \delta^j_b + c\,\eta^{ik}\delta^j_a \delta^l_b)$
$= \frac{1}{\kappa^2}\sqrt{-g}\,(a\,T_{ijk}T_{ijk} + b\,T_{ijk}T_{ikj} + c\,T_{ijj}T_{ikk}),$
where
$\sqrt{-g} = \det(h_\mu^i),$
a, b, and c are for the time being arbitrary constants, and we have used the flat metric $\eta_{ab}$ for the fiber space. Notice that in our formalism we do not need a Riemannian metric a priori.

The crucial question now is whether the constants a, b, and c are really arbitrary. To answer this question let us point out that the above Lagrangian is basis dependent as it is obtained using Eq. (6'). Now, if the theory is going to have any meaning at all, it should not depend upon which orthonormal frame one starts with. This means that if one chooses a different set of $e_i$'s, the Lagrangian should differ only by a total divergence. We now show that this consistency requirement removes all the arbitrariness in a, b, and c.

Notice that under an infinitesimal change of orthonormal frame, one has
$h_i^\mu(x) \to h_i'^\mu(x) = h_i^\mu(x) + \omega_{ik} h_k^\mu(x),$ (11)
where $\omega_{ik}(x) = -\omega_{ki}(x)$ are six infinitesimal functions so that
$\delta\mathcal{L} = \sqrt{-g}\,(2a\,T_{ijk}\delta T_{ijk} + 2b\,T_{ijk}\delta T_{ikj} + 2c\,T_{ijj}\delta T_{ikk})$
$= \sqrt{-g}\,[2a\,T_{ijk}(\partial_i \omega_{jk} - \partial_j \omega_{ik}) + 2b\,T_{ijk}(\partial_i \omega_{kj} - \partial_k \omega_{ij}) + 2c\,T_{ijj}\partial_k \omega_{ki}]$
$= \sqrt{-g}\,[(4a - 2b)\,T_{ijk}\partial_i \omega_{jk} - 2b\,T_{ijk}\partial_k \omega_{ij} + 2c\,T_{ijj}\partial_k \omega_{ki}]$
$= \sqrt{-g}\,[(4a - 2b)\,T_{ijk}\partial_i \omega_{jk} - (2b + c)\,T_{ijk}\partial_k \omega_{ij}] - 2c\,\partial_\mu(h_i^\mu \sqrt{-g}\,\partial_k \omega_{ki}).$ (12)
Here the last equality comes from the following identity:
$\partial_\mu(h_i^\mu \sqrt{-g}\,\partial_k \omega_{ki}) = -\sqrt{-g}\,(\tfrac{1}{2} T_{ijk}\partial_k \omega_{ij} + T_{ijj}\partial_k \omega_{ki}).$

Now in the last line of Eq. (12) the third term is explicitly a total divergence. But each of the first two terms cannot be made into a total divergence and one is forced to choose a : b : c = 1 : 2 : -4 to satisfy the consistency requirement. So the Lagrangian should have the form

(13)
We would like to emphasize that any other linear combination in $\mathcal{L}$ does not yield a meaningful theory. Now it is readily seen that $\mathcal{L}$ is (again up to a divergence) precisely Einstein's Lagrangian. This completes our argument that the four-dimensional translational gauge theory with Lagrangian quadratic in the field strengths is precisely Einstein's theory of gravitation. The fact that the Lagrangian (13) is equivalent to Einstein's one is of course well known. But the geometrical meaning of this Lagrangian does not seem to have been fully understood. We now understand it as the translational gauge formalism of Einstein's theory of gravitation.

At this point one may wonder whether we have required a Lorentz gauge invariance by imposing the independence of the theory under the local Lorentz transformation (11) of the orthonormal basis $e_i$. Even so, however, one does not need to introduce Lorentz gauge fields in one's theory if one has only scalars and internal gauge fields as one's source fields.⁹ This is so because scalars are singlets under the Lorentz transformation and also the internal gauge fields do not couple directly to the gauge fields of the Lorentz group owing to the gauge invariance of the internal symmetry. In any event it should be made clear that the independence of the theory under the local Lorentz transformation (11) is a consistency condition that one has to require for one's theory.¹⁰ In the presence of spinor source fields, of course, this consistency condition naturally leads us to introduce the gauge fields of the Lorentz group to the theory and one obtains the Einstein-Cartan¹¹ theory of gravitation as has been argued by Kibble.²,¹² In this case the translational gauge group is replaced by the Poincaré group.⁹

III. PHYSICAL INTERPRETATION

We now wish to make clear how in the presence of source fields the translation group $T_4$ acts on them. By doing so we will give a precise physical meaning to our bundle of translation group. Remember that we have treated our bundle just like a principal fiber bundle of an internal-symmetry group except for one crucial difference, i.e., the identification of Eq. (7). This equation interlocks the fiber space of the translation group $T_4$ with space-time and allows us to speak of our $T_4$ as a space-time symmetry rather than an internal symmetry. We will first justify this basic equation and clarify the meaning of the translation group $T_4$.

Let us consider a scalar field $\phi(x)$ as the source field for simplicity and start with an action integral written in the usual coordinate basis of a global Minkowski frame:

(14)
Clearly under a global translation of the coordinate
$x^\mu \to x'^\mu = x^\mu + \epsilon^\mu,$ (15)
where $\epsilon^\mu$ are four infinitesimal parameters, one has
$\delta\phi = \phi(x') - \phi(x) = \epsilon^\mu \partial_\mu \phi(x).$ (16)
This suggests that one may view the generators $\xi_\mu$ of $T_4$ as the coordinate derivatives $\partial_\mu$. For the moment let us take this point of view. Now under a local translation one has to introduce a covariant derivative to keep the action integral (14) invariant. But since the covariant basis is the one in which the components of the metric remain flat, it is quite natural to identify $\partial_i$ as the covariant derivative of $\partial_\mu$. Then one is led to the following equation:
$\partial_i = h_i^\mu \partial_\mu$
$= (\delta_i^\mu + \kappa B_i^\mu)\partial_\mu.$ (17)
In short, interpreting $\xi_\mu$ as $\partial_\mu$ and identifying the local orthonormal basis as the covariant basis one easily obtains Eq. (7). In fact this rather intuitive interpretation has been given by Kibble.²

Now one can easily write down the action integral which is invariant under a local translation and independent of a choice of a local orthonormal basis. Including the kinetic term of the translational gauge fields one has¹³
$I = \int \sqrt{-g}\,\big[\,\ldots + \frac{1}{\kappa^2}(\tfrac{1}{4} T_{ijk}T_{ijk} + \tfrac{1}{2} T_{ijk}T_{ikj} - T_{ijj}T_{ikk})\big]\,d^4x.$ (18)
Clearly the Lagrangian (18) describes Einstein's theory of gravitation in the presence of the scalar source field $\phi(x)$ provided
$\kappa^2 = 16\pi G,$ (19)
where G is the gravitational constant. Thus the coupling parameter $\kappa$ of the group $T_4$ is indeed related to the gravitational constant.

Now we will give another interpretation, i.e., the one for our bundle formalism of the translational gauge theory, which allows us to keep the complete parallelism between a gauge theory of an internal symmetry and that of the space-time symmetry $T_4$. Notice that in the above interpretation we started from the usual coordinate basis of a global Minkowski frame. But clearly one should be able to start with any other basis as well. Indeed it may be more desirable to construct the theory in a basis-independent way. So let us start from the beginning with a local orthonormal frame $e_i$ and write down the action integral (14) as

In this case one is led to have the following covariant derivative $D_i$ for the bundle of $T_4$:
$D_i = \partial_i - \kappa B_i^\mu \xi_\mu.$ (20)
This is dictated by the geometry of the bundle⁵ since the covariant derivative of $\partial_i$ in the bundle formalism is given by the horizontal lift $\hat\partial_i$ of $\partial_i$. Also in this picture the group actions are interpreted to transform the field components along the fiber space keeping the physical space-time points invariant. This means that under the translation (15) one should have
$\delta\phi = \phi'(x') - \phi(x) = 0,$ (21)
where $\theta^\mu$ ($\mu$ = 1, 2, 3, 4) are the fiber-space coordinate variables as before and we have identified the generators $\xi_\mu$ as $\partial/\partial\theta^\mu$, which is again dictated by the geometry of the bundle.⁵ Notice the difference between this equation and Eq. (16). Thus in this bundle picture $\xi_\mu$ annihilate the source fields as the fields remain invariant at each physical space-time point and do not depend upon the fiber-space variables. Observe here that this fiber-space dependence of the source fields has been derived, not assumed, from what one means by the translational invariance. The way the gauge fields of $T_4$ couple to source fields is then given by Eq. (7):
$D_i \phi = (\partial_i - \kappa B_i^\mu \xi_\mu)\phi$
$= \partial_i \phi = (\delta_i^\mu + \kappa B_i^\mu)\partial_\mu \phi.$ (22)
Thus the gauge fields of $T_4$ couple to source fields as if the generators of $T_4$ were the coordinate derivatives $\partial_\mu$ and one arrives at the action integral (18) as before due to Eq. (7).

One can choose either of the interpretations above. The first one emphasizes too much the usual global Minkowski coordinate frame but gives an intuitively clear meaning to the translational symmetry, whereas the other has the merit of treating the theory in a basis-independent way and allows us to keep the parallelism between the gauge theory of the space-time symmetry $T_4$ and that of an internal symmetry, with the identification of Eq. (7).
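The identification (7) carries two short algebraic consequences that the argument of Sec. II relies on, and they can be collected in one place. The block below is only a sketch. It assumes Eq. (3) in the form $T_{ij}{}^{k} = (\partial_i h_j^\mu - \partial_j h_i^\mu) h_\mu^k$, Eq. (6) as $G = (\partial B - \partial B) - TB$, and that the two non-divergence terms in the last line of Eq. (12) carry the coefficients $(4a - 2b)$ and $(2b + c)$.

```latex
\begin{align*}
% Insert h_i^\mu = \delta_i^\mu + \kappa B_i^\mu (Eq. (7)) into Eq. (3); the constant \delta_i^\mu drops out of the derivatives
T_{ij}{}^{k}\,h_k^{\mu} &= \partial_i h_j^{\mu} - \partial_j h_i^{\mu}
  = \kappa\left(\partial_i B_j^{\mu} - \partial_j B_i^{\mu}\right),\\
% Substitute this into Eq. (6): the terms proportional to T B cancel
G_{ij}{}^{\mu} &= \frac{1}{\kappa}\,T_{ij}{}^{k}\left(\delta_k^{\mu} + \kappa B_k^{\mu}\right) - T_{ij}{}^{k} B_k^{\mu}
  = \frac{1}{\kappa}\,T_{ij}{}^{k}\,\delta_k^{\mu},\\
% Frame independence up to a total divergence requires both remaining terms of Eq. (12) to vanish
4a - 2b &= 0, \qquad 2b + c = 0 \quad\Longrightarrow\quad a : b : c = 1 : 2 : -4 .
\end{align*}
```

The second line says that, once (7) is imposed, the field strengths are fixed entirely by the commutation coefficients of the orthonormal frame. The third line reproduces the ratio quoted in the text.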

IV. CONCLUSIONS

We have shown that the gauge theory of the four-dimensional translation group is unique and becomes precisely the vierbein formalism of Einstein's theory of gravitation as far as one chooses the Lagrangian to be the lowest possible combinations, quadratic in field strengths. The gauge potentials of the translation group are interpreted as the nontrivial part of the vierbein fields and the corresponding gauge transformations are shown to be the general coordinate transformations.

Yang¹⁴ has recently proposed a GL(4) gauge theory of gravitation of a Yang-Mills quadratic type which differs from Einstein's theory and may conflict with observation.¹⁵ A Yang-Mills-type gauge theory of gravitation which gives Einstein's theory is, as we have seen, the gauge theory of the translation group $T_4$.

ACKNOWLEDGMENTS

It is a great pleasure to thank Professor P. G. O. Freund, Professor R. Geroch, and Professor Y. Nambu for many illuminating discussions, especially about the consistency condition, and for careful readings of the manuscript. I am also grateful to Dr. J. Friedman and Dr. R. Wald for many discussions.

*Work supported in part by the National Science Foundation under Contract No. PHY74-08833 A01.
†Present address: Department of Physics, New York University, New York, New York 10003.
¹See, e.g., S. L. Glashow and M. Gell-Mann, Ann. Phys. (N.Y.) 15, 437 (1961); R. P. Feynman, Lectures on Gravitation (Caltech, Pasadena, Calif., 1963).
²T. W. B. Kibble, J. Math. Phys. 2, 212 (1961).
³See, e.g., K. Hayashi and T. Nakano, Prog. Theor. Phys. 38, 491 (1967); G. D. Kerlick, thesis, Princeton Univ., 1975 (unpublished); J. M. Nester, report, 1974 (unpublished). See also F. A. Kaempffer, Phys. Rev. 165, 1420 (1968).
⁴A. Trautman, Rep. Math. Phys. 1, 29 (1970).
⁵Y. M. Cho, J. Math. Phys. 16, 2029 (1975).
⁶Y. M. Cho and P. G. O. Freund, Phys. Rev. D 12, 1711 (1975).
⁷T. T. Wu and C. N. Yang, Phys. Rev. D 12, 3845 (1975). See also L. N. Chang, K. I. Macrae, and F. Mansouri, ibid. 13, 235 (1976).
⁸Our notation here is the same as the one in Ref. 5.
⁹Y. M. Cho, Phys. Rev. D (to be published).
¹⁰There appear different opinions in the literature with which we do not agree. See, e.g., K. Hayashi, Nuovo Cimento 16A, 639 (1973); Gen. Relativ. Gravit. 4, 1 (1974).
¹¹E. Cartan, C. R. Acad. Sci. 174, 593 (1922); Ann. Ecole Normale 40, 325 (1923); 41, 1 (1924).
¹²Kibble's work followed after the pioneering work of Utiyama. See R. Utiyama, Phys. Rev. 101, 1597 (1956). See also D. W. Sciama, Recent Developments in General Relativity (Pergamon, New York, 1962).
¹³In this paper our signature of the metric is (+, -, -, -).
¹⁴C. N. Yang, Phys. Rev. Lett. 33, 445 (1974).
¹⁵R. Pavelle, Phys. Rev. Lett. 34, 1114 (1975).

Volume 119B, number 4,5,6 PHYSICS LETTERS 23/30 December 1982

DE SITTER AND POINCARE GAUGE-INVARIANT FERMION LAGRANGIANS AND GRAVITY *

J.P. HSU
Physics Department, Southeastern Massachusetts University, North Dartmouth, MA 02747, USA

Received 20 July 1982

We present a new fermion lagrangian which possesses exact symmetry under the local de Sitter group. The lagrangian
involves new scale gauge fields related to the newtonian force and the usual Yang-Mills phase gauge fields related to a
new gravitational spin force between two fermions. Generalization of the usual gauge theory for external symmetry
groups is also discussed.

I t has been suggested that gravity is related to gauge Mills phase gauge fields. They have different trans-
fields of four-dimensional symmetry such as the de formation property and, therefore, must be treated as
Sitter group [ 1,2]. The idea is quite interesting diffcrent and independent fields.
because the de Sitter group possesses the maxiniuin Lct 11sconsidcr the generalization o f ypJ,+ in the
four-dimensional symmetry [3] and is the unique gen- form for ;I non-abelian external symmetry group:
eralization of the PoincarC group. It also suggests tlie
existence of a new gravitational spin forcc between rql1C, 1 (1)
objects with nonzero net spin densities. The dc Sitter where rcL
involves both tlic Dirac matrices a n d scale
group is a rotational group in de Sitter space, which is gauge fields e j , and llic gauge-covariant derivative D,
the hypersurface of a four-dimensional sphcre of a hy- contains pliase gauge fields /$ = (oh, 0;):
perbolic character in one direction, einbeddcd in a
five-dimensional space. The radius of the spiierc is
rg 3 e j p= + efi i(yiyk - # y j > / 4 ~
denoted by L . The de Sitter group reduces to tlie
PoincarC group in tlie flat space limit L + m. =fizz*, (2)
One important ingredient in a realistic gauge theory
of gravity is the fermion field - a sourcc of the gravi-
tational field. But in previous discussions [ 1,2] one
either ignored the fermion field or discussed a fermion
lagrangian which has only upproximate symmetry un-
der local de Sitter gauge transformations. I t appears
that one cannot get a ferniion lagrangian with exuct
external gauge symmetry if one just employs the usual
Ej = ( 2 L e p , e$./L) .

Yang-Mills fields, i.e. phase gauge fields [4]. The quantity ZA is the matrix representation of the
In this paper, we present a new fermion lagrangian, S0(3,2) de Sitter group generators:
which has exact symmetry under the local de Sitter
group. It is necessary that the lagrangian involves new [Z,, Zc] = if&ZA , A = i, j k ; etc. (4)
scale gauge fields in addition t o the usual Yang- The local de Sitter gauge transformations are given by
$ $ = Ed$ >
* The work is supported in part by Southeastern Massachusetts +-

University. bp bl=tdbptd + (apEd> tdl/(k) I (5)


399


r, rp= tdr@t:l,
--f (5 cond) u ik .
g ~ LelpyekT7 (1 2 cond)
where For large L , gp is approximately the same as g@.In
the limit L w, it is natural t o interpret e f as the
&j= exp [ i d(x) Z
, 1. +

vierbein component and gpuas the spacetime metric.


The gauge functions d (x) = (wi(x), w(x)) are real Thus we can interpret gpuas the spacetime metric in
and arbitrary. the present theory with the de Sitter gauge group. We
It can be shown that q(rpD, + m) J, is invariant are able to define the a f f n e connection F;, and the
under the local de Sitter gauge transformations (5): Riemann curvature tensor Ra*u in terms of the new
gauge-invariant metric gpyb y the usual relations:
r(rlDI+ rn) (L = T(rpDp+ m) (L . (6) -OL -1-ho!
We stress that this symmetry property holds if and rrW- i g (~,.F,A + a,g, - ~ A Z ~ ,,, )P ~ g h=, st:,
only if both e/ and efi are introduced.
Note that the field e$ is dimensionless and is
Pbu= a,F& - apF%+F$,Fg - FLF&. (13)
related to a change in the scale rather than a change In this way, FEU,EauUand g are all invariant under
in the phase [ 4 ] ,so that e j may be termed a scale the local de Sitter gauge transformations (5). The
gauge field. In view of the presence of this new scale invariant lagrangian for these fields is uniquely deter-
gauge field, the present gauge theory is a generaliza- mined by the principle o f gauge invariance and the
tion of the Yang-Mills theory. Such a generalization principle of general covariance:
appears to be necessary because the de Sitter group is
an exrerrial symmetry group, in which the generator Id4x(-det gpu)1/2[ ~ ( i p P D , , J , + h.c.)
ZA does not commute with ~ k in, contrast to the case
of an internal symmetry group. - rn$$ - bTr FPuF9+ (87d3-l I?] , (14)
The phasc field strength F;,, is given by
where G is a constant and =f?OLpvagw.Field equa-
(DpD, - DUD,) J, V$JA
J, , (7) tions for e j , l$ and J, can bc determined from the
lagrangian (14).
i.e.
When we takc the limit L +-,some components
F$ = a,e - au$ t gt&.b,byC. (8) of gauge ficlds, i.e. 6; and $,. disappear from the
theory and wc obtain the following lagrangian:
One can verify that Fpu= F;,ZA is gauge covariant
and transforms as follows: I = (-det gp,)/2
FMY FbU= tdFpYth .
-+ (9) X [.C+ - iTr(F;,Z,, F$Zb) g w g u p + (8nG)-l R ] ,
Thus, Tr(F,,t.,p) i s a gauge-invariant quantity:
R f RUp,gpu
T C I u F & ) = Tr(?puFad , (10)
which is usually used as the lagrangian for the phase (a,r,q,- aur&t r;,,r& - r&rg,)gwu,
field e. =

We observe that T r ( P P ) is gauge invariant: r;, = :$OL(augpA + apgb - ag,) ,


Tr(r,P) = Tr(rpr). (1 1) g ~ u = e p e ~ik i k .
I k7) 9 gpv=epeuqlk >
Thus, we expect that e j enters the lagrangian for the
scale field through the combination e:e[=6; ~ e i e r = :S ,
EMU= ~ r ( r r , / 4 )
Tr(F&Z, FipZb) = 2F$FZqirn Vkn ,
= qkerei + e;eL,,, qimqin/2~2
a = ik , b = mn ,
+gp asL+-, (12)

3 29
Volume 1199, number 4,5,6 PHYSICS LETTERS 23/30 December 1982

(i.e. the de Sitter group) and infinite L (i.e. the Poin-


car6 group).
(1 5 cond) Our interpretation o f gauge fields for external sym-
metry groups differs from previous interpretations by
where the last term in ( 1 5) is identical with Einsteins Kibble and others [S-71 (see also refs. [2,8]). We may
lagrangian. The lagrangian (1 5) is invariant under local remark that a contact spin force between fermions
Poincare gauge transformations (i.e. the transforma- has been discussed by Kibble based on a non-gauge-
tions (5) with L -+ -) and general coordinate transfor- invariant fermion lagrangian. Since the external four-
mations (with the spacetime metric g,?. The PoincarC dimensional symmetry group is a fundamental symme-
gauge-invariant lagrangian (1 5) differs from that dis- try of nature, the prediction of the new long-range
cussed by Kibble [S]. From the viewpoint of symme- gravitational spin force should be taken seriously.
try, the lagrangian (1 5) is more satisfactory than We conclude that gauge field theory based on exter-
Kibbles lagrangian involving fermions. nal four-dimensional symmetry groups dictates the
Physically, the radius L of the de Sitter space is presence of a new scale gauge field, which differs
probably very large. In this case, physical effects of from the Yang-Mills phase gauge field [9]. Further-
bh and e/clk are negligible. Experimentally, the differ- more, the theory predicts a new long-range gravita-
ence between the de Sitter gauge-invariant lagrangian tional spin force between fermions. It appears that the
(14) and the PoincarC gauge-invariant lagrangian (1 5) quantization of these fields cannot be acconiplished
cannot be distinguished in the near future. So the im- by a straightforward application of the usual quantiza-
portant fields are b i and e / , which are generated by tion procedure for Yang-Mills fields. This needs fur-
the spin density and mass density, respectively. This ther investigation.
can be seen from the field equations derived from (1 5).
For example, the source of b: in the approximation of References
static and weak fields is
[ I ] P.C. Wcst, Pliys. Lctt. 76B (1978) 569;
P.K. Townscnd, Pliys. Rcv. D I 5 (1977) 2795;
Wu Yung-shi, Lcc Kcn-dao and Kuo Ilan-ying, Kexuc
=0, P+O> (16) Tongbao 19 ( I 974) 509.
J.P. Hsu, Pliys. Rcv. Lctt. 4 2 (1979) 934, 1920(E);
where U ( r ) and aj are respectively the positive-energy Nuovo Ciriicnto 6 1 8 (1981) 249;
Pauli spinor and the Pauli matrices. Of course, the see also: S. Fujita, cd., The Ta-You Wu Fcstscluift:
result (16) can also be derived from the de Sitter gauge- Science of matter (Gordon and Breach, 1978) pp. 65-73;
L.L. Smallcy, preprint.
invariant lagrangian (14) with the approximation of
F.J. Dyson, Bull. Am. Math. Soc. 78 (1972) 635.
very large L. C.N. Yang and R.L. Mills, Phys. Rev. 96 (1954) 191;
These gauge fields are interpreted as follows: Graviry C.N. Yang, Ann. NY Acad. Sci. 294 (1977) 8 6 ; Phys.
is related to scale gauge fields rather than the usual Today (Junc 1980) 42.
Yang-Mills gauge fields because the scale gauge field I S ] T.W.B. Kibble, J. Math. Phys. 2 (1961) 212.
161 See, for example, R. Utiyama, Phys. Rev. 101 (1956)
is generated by the mass density, according to gauge-
1597;
invariant lagrangians. The massless Yang-Mills field b, C.N. Yang, Phys. Rev. Lett. 33 (1974) 445.
is generated by the spin density of fermions and corre- 171 P.C. West, Pliys. Lett. 76B (1978) 569;
sponds t o a new long-range force between two bodies W. Drechslcr, Phys. Lett. 107B (1981) 415.
with nonzero spin densities. The strength of this new [8] F.W. Hehl, J. Nitsch and P. van der Heyde, Gravitation
and Poincari gauge field theory with quadratic lagrangian,
force is determined by a new dimensionless coupling
Univ. of Cologne preprint (1980).
constant g2, which is independent of the newtonian [ 9 ] J.P. Hsu, Gcncralized theory with external gauge symme-
gravitational constant G. These hold for both finite L try, SMU preprint (1982).

Chapter 8

Alternate Approaches to Gravity:

Roads Less Traveled By*

*P. A. M. Dirac, K. Hayashi, T. Shirafuji, A. A. Logunov, M. A. Mestvirishvili, J. P. Hsu

PHYSICAL REVIEW VOLUME 114, NUMBER 3 MAY 1, 1959

Fixation of Coordinates in the Hamiltonian Theory of Gravitation


P. A. M. DIRAC
Institute for Advanced Study, Princeton, New Jersey
(Received December 10, 1958)

The theory of gravitation is usually expressed in terms of an arbitrary system of coordinates. This results
in the appearance of weak equations connecting the Hamiltonian dynamical variables that describe a state
at a certain time, leading to supplementary conditions on the wave function after quantization. It is then
difficult to specify the initial state in any practical problem.
To remove the difficulty one must eliminate the weak equations by fixing the coordinate system. The
general procedure for this elimination is here described. A particular way of fixing the coordinate system is
then proposed and its effect on the Poisson bracket relations is worked out.

INTRODUCTION AND NOTATION

The problem of putting Einstein's equations for the gravitational field into the Hamiltonian form, as a preliminary to quantization, has recently received a good deal of attention, because of the development of mathematical methods sufficiently powerful to make it tractable.

The Hamiltonian form involves the concept of a physical state at a certain time, which means in a relativistic theory a state on a certain three-dimensional space-like surface in space-time. At first people¹,² chose the space-like surface independent of the coordinates $x^\mu$, which enabled them to preserve the four-dimensional symmetry of the equations. Later it was realized³,⁴ that one could effect a substantial simplification, at the expense of giving up four-dimensional symmetry, by choosing a system of coordinates such that the three-dimensional surfaces $x^0 = \text{constant}$ are all space-like and dealing with the physical states on these surfaces.

The main features of the Hamiltonian formalism will be recapitulated here. The notation will be that used by the author,⁴ with the exception that the sign of the $g_{\mu\nu}$ will be changed throughout, to make $g_{00}$ negative. Greek suffixes take on the values 0, 1, 2, 3, lower-case Roman suffixes take on the values 1, 2, 3, the determinant of the $g_{\mu\nu}$ is $-J^2$, the determinant of the $g_{rs}$ is $K^2$, and the reciprocal matrix to $g_{rs}$ is $e^{rs}$. A lower suffix $\nu$ added to a field quantity denotes an ordinary derivative, while $|\nu$ added to it denotes the covariant derivative.

We shall deal with the gravitational field in interaction with other fields, or possibly particles. Spinor fields are excluded, as they require a special treatment.
___
* The author's stay at the Institute for Advanced Study was supported by the National Science Foundation.
¹ F. A. E. Pirani and A. Schild, Phys. Rev. 79, 986 (1950).
² Bergmann, Penfield, Schiller, and Zatzkis, Phys. Rev. 80, 81 (1950).
³ Pirani, Schild, and Skinner, Phys. Rev. 87, 452 (1952).
⁴ P. A. M. Dirac, Proc. Roy. Soc. (London) A246, 333 (1958).

We have an action density of the form

$$\mathcal{L} = \mathcal{L}_G + \mathcal{L}_M,$$

where $\mathcal{L}_G$ is the action density of the gravitational field alone, involving the $g_{\mu\nu}$ and their first derivatives, and $\mathcal{L}_M$ is the action density of the other fields, involving the other field quantities, $q_M$ say, and their first derivatives and involving also the $g_{\mu\nu}$ but not derivatives of the $g_{\mu\nu}$. The gravitational action density is

$$\mathcal{L}_G = (16\pi\gamma)^{-1} J g^{\mu\nu}\left(\Gamma^{\sigma}_{\mu\rho}\Gamma^{\rho}_{\nu\sigma} - \Gamma^{\rho}_{\mu\nu}\Gamma^{\sigma}_{\rho\sigma}\right), \tag{1}$$

where $\gamma$ is the gravitational constant, occurring in the numerator of Newton's law of force. To save writing, we shall take

$$16\pi\gamma = 1. \tag{2}$$

HAMILTONIAN FORM OF GRAVITATIONAL THEORY

We shall deal with the physical state on the surface $x^0 = t$ and shall set up Hamiltonian equations of motion to determine how the state varies as $t$ varies. The Hamiltonian is, by the usual definition

[...]

where the sum is over all the nongravitational dynamical coordinates $q_M$.

It is evident that there must be a good deal of arbitrariness in the equations of motion on account of the arbitrariness in the system of coordinates $x^\mu$. In the first place we see that the $g_{\mu 0}$ can vary with $t$ in an arbitrary way. To describe the geometry of the surface $x^0 = t$ and also the system of coordinates $x^r$ in it, we need the $g_{rs}$ at all points on the surface, but we do not need the $g_{\mu 0}$, which refer only to intervals going outside the surface. Different values for the $g_{\mu 0}$ correspond to different choices of a neighboring surface $x^0 = t + \epsilon$ and to different systems of coordinates $x^r$ in the neighboring surface, and these are completely arbitrary with a given initial surface $x^0 = t$.

We get the simplest form for the equations of motion if we describe the physical state on the surface $x^0 = t$ entirely in terms of dynamical variables that are independent of the $g_{\mu 0}$. Let us consider the kind of quantities that can enter into such a description.

Suppose there is a vector field $A_\mu$. The three covariant components $A_r$ on the surface remain invariant under a change of coordinates which leaves the coordinates of each point on the surface invariant. So these $A_r$ will enter into the description. We cannot have $A_0$, but we have instead the normal component of $A_\mu$, namely $A_\mu l^\mu$, where $l^\mu$ is the unit normal. Similarly for a tensor $B_{\mu\nu}$, which may be the covariant derivative $A_{\mu|\nu}$ of $A_\mu$, we have the quantities $B_{rs}$, $B_{r\nu}l^\nu$, $B_{\mu s}l^\mu$, $B_{\mu\nu}l^\mu l^\nu$. Each of these quantities is unaffected by a change of coordinates which leaves the points on the surface invariant and is thus independent of the $g_{\mu 0}$.

It should be noted that, for a vector $A_\mu$, the ordinary and covariant derivatives $A_{r s}$ and $A_{r|s}$ are both independent of the $g_{\mu 0}$. Their difference, namely $\Gamma^{\mu}_{rs}A_\mu$, is thus independent of the $g_{\mu 0}$. We may take $A_\mu$ here to be the unit normal, namely

[...]

is independent of the $g_{\mu 0}$. This quantity may be called the invariant velocity of $g_{rs}$, as it consists of the ordinary velocity $g_{rs0}$ multiplied by a certain factor and with certain terms added on, so as to produce a quantity independent of the choice of coordinate system outside the surface $x^0 = t$.

With the physical state described in this way, one easily finds⁴ that for a dynamical variable $\eta$ not involving the $g_{\mu 0}$, $d\eta/dx^0$ is of the form

[...]

with $\xi_L$ and $\xi_s$ independent of the $g_{\mu 0}$. We need equations of motion to determine $\xi_L$, $\xi_s$ for any $\eta$. The coefficients $(-g^{00})^{-1/2}$, $g_{r0}$ in (6) are arbitrary and not restricted by the equations of motion.

One gets equations of motion of the form (6) from a Hamiltonian of the form

$$H = \int \left\{ (-g^{00})^{-1/2}\,\mathcal{H}_L + g_{r0}\,e^{rs}\,\mathcal{H}_s \right\} d^3x, \tag{7}$$

with $\mathcal{H}_L$ and $\mathcal{H}_s$ independent of the $g_{\mu 0}$ and vanishing weakly. It has been shown⁴ that the Hamiltonian (3) takes the form (7) provided the dynamical coordinates describing the nongravitational fields are chosen to be independent of the $g_{\mu 0}$, in the way discussed above, and provided one takes for $\mathcal{L}_G$, instead of (1), an expression which differs from (1) by a perfect differential and which does not contain the velocities $g_{\mu 00}$, namely

$$\mathcal{L}_G = J g^{\mu\nu}\left(\Gamma^{\sigma}_{\mu\rho}\Gamma^{\rho}_{\nu\sigma} - \Gamma^{\rho}_{\mu\nu}\Gamma^{\sigma}_{\rho\sigma}\right) + (J g^{00})_0\,(g^{r0}/g^{00})_r - (J g^{00})_r\,(g^{r0}/g^{00})_0. \tag{8}$$

With this $\mathcal{L}_G$, the momenta $p^{\mu 0}$ conjugate to $g_{\mu 0}$ vanish weakly, which results in the degrees of freedom described by $g_{\mu 0}$, $p^{\mu 0}$ dropping out from the Hamiltonian formalism. The weak equations $p^{\mu 0} = 0$ give, when one passes to the quantum theory, the conditions $p^{\mu 0}\psi = 0$, which show that the wave function $\psi$ does not involve the $g_{\mu 0}$.

The surviving gravitational momenta are

$$p^{rs} = K\left(e^{ra}e^{sb} - e^{rs}e^{ab}\right)\Gamma^{0}_{ab}\,/\,(-g^{00})^{1/2}. \tag{9}$$

They are built up from the invariant velocities (5). The fundamental Poisson bracket (P.b.) relations for them are

[...]

The expressions for $\mathcal{H}_L$ and $\mathcal{H}_s$ in (7) are found to be

$$\mathcal{H}_L = K^{-1}\left(p^{rs}p_{rs} - \tfrac{1}{2}\,p_r{}^r p_s{}^s\right) + B + \left(K^{-1}(K^2 e^{rs})_r\right)_s + \mathcal{H}_{ML}, \tag{11}$$

$$\mathcal{H}_s = p^{ab}g_{ab\,s} - 2\,(p^{ab}g_{as})_b + \mathcal{H}_{Ms}, \tag{12}$$

where

$$B = \tfrac{1}{4} K g_{rs\,u}\,g_{ab\,v}\left\{ (e^{ra}e^{sb} - e^{rs}e^{ab})\,e^{uv} + 2\,(e^{ru}e^{ab} - e^{ra}e^{bu})\,e^{sv} \right\}, \tag{13}$$

and $\mathcal{H}_{ML}$, $\mathcal{H}_{Ms}$ are the contributions arising from the nongravitational fields. It should be noted that the terms $B + (K^{-1}(K^2 e^{rs})_r)_s$ are equal to the density of the three-dimensional scalar curvature of the surface $x^0 = t$.

We have the weak equations

$$\mathcal{H}_L \approx 0, \qquad \mathcal{H}_s \approx 0. \tag{14}$$

They are $\chi$ equations or secondary constraints. To see where they come from, we note that Einstein's field equations are

$$R_\mu{}^\nu - \tfrac{1}{2}\,g_\mu^\nu R_\sigma{}^\sigma = \tfrac{1}{2}\,T_\mu{}^\nu, \tag{15}$$

where $T_\mu{}^\nu$ is the stress tensor produced by the nongravitational fields. The left-hand side of (15) contains second derivatives of the $g_{\alpha\beta}$ and thus in general contains accelerations $g_{\alpha\beta 00}$. The right-hand side of (15) contains no derivatives of the $g_{\alpha\beta}$. Now the well-known identities

$$\left(R_\mu{}^\nu - \tfrac{1}{2}\,g_\mu^\nu R_\sigma{}^\sigma\right)_{|\nu} = 0$$

may be written

$$\left(R_\mu{}^0 - \tfrac{1}{2}\,g_\mu^0 R_\sigma{}^\sigma\right)_0 = -\left(R_\mu{}^r - \tfrac{1}{2}\,g_\mu^r R_\sigma{}^\sigma\right)_r + \cdots, \tag{16}$$

where the $+\cdots$ at the end indicates that some further terms, not involving third derivatives of the $g_{\alpha\beta}$, must be added on. The right-hand side of (16) evidently does not contain any third time derivatives $g_{\alpha\beta 000}$. Thus the left-hand side cannot involve third time derivatives, so $R_\mu{}^0 - \tfrac{1}{2}g_\mu^0 R_\sigma{}^\sigma$ cannot involve accelerations $g_{\alpha\beta 00}$. Thus if we take $\nu = 0$ in (15), we get equations involving only dynamical coordinates and velocities. By substituting for the velocities here in terms of the momenta, we get four equations between dynamical coordinates and momenta only, which yield (14).

The main part of the Hamiltonian is obtained by putting $g_{\mu 0} = -\delta_{\mu 0}$ in (7) and is thus

$$H_{\text{main}} = \int \left\{ K^{-1}\left(p^{rs}p_{rs} - \tfrac{1}{2}\,p_r{}^r p_s{}^s\right) + B + \mathcal{H}_{ML} \right\} d^3x, \tag{17}$$

after removal of a surface integral at infinity. The removal of this surface integral does not disturb the validity of $H_{\text{main}}$ for giving equations of motion, but it results in $H_{\text{main}}$ not vanishing weakly.

We can write the total Hamiltonian (7) in the form

$$H = H_{\text{main}} + \int \left\{ (-g^{00})^{-1/2} - 1 \right\} \mathcal{H}_L\, d^3x + \int g_{r0}\,e^{rs}\,\mathcal{H}_s\, d^3x, \tag{18}$$

when it appears as $H_{\text{main}}$ with arbitrary linear combinations of $\mathcal{H}_L$, and of $\mathcal{H}_s$, for various values of $x^r$, added on. These additional terms in the Hamiltonian produce terms in the equations of motion in addition to those produced by $H_{\text{main}}$, corresponding to the surface $x^0 = t$ undergoing arbitrary deformations and having arbitrary changes of its coordinate system $x^r$ as $t$ varies.

NEED FOR FIXATION OF THE COORDINATES

To specify a physical state at a particular time in the classical theory, we must choose numerical values for all the dynamical coordinates and momenta so as to satisfy the constraints (14). This involves solving some differential equations, so it is not such a straightforward matter as specifying a state in particle dynamics.

In the quantum theory the situation is more complicated. The constraints (14) go over into the conditions on the wave function

$$\mathcal{H}_L\,\psi = 0, \tag{19}$$

$$\mathcal{H}_s\,\psi = 0. \tag{20}$$

To specify a state at a particular time involves obtaining a solution of Eqs. (19), (20), which are functional equations.

Equation (20) expresses merely that $\psi$ must be invariant under changes of the coordinate system $x^r$ in the surface $x^0 = t$. To get $\psi$ to satisfy this equation is thus not difficult. Equation (19) expresses the requirement that the state shall be specified in a way that is independent of deformations of the surface $x^0 = t$. The treatment of such deformations is essentially as complicated as the treatment of the passage from the surface $x^0 = t$ to a neighboring surface $x^0 = t + \epsilon$, so to get $\psi$ to satisfy (19) is essentially as complicated as solving the equations of motion. Thus we have the situation that we cannot specify the initial state for a problem without solving the equations of motion. The formalism is thus not suitable for dealing with practical problems.

The difficulty does not arise in the weak-field approximation, because then many of the terms in (19) get neglected and the remaining ones, if expressed in terms of Fourier components, are easy to handle.

To obtain a practical formalism of greater accuracy than the weak-field approximation, it is necessary to introduce into the theory some new constraint that fixes the surface $x^0 = t$, so that we no longer have the possibility of making arbitrary deformations in it. Then the supplementary condition (19) gets eliminated. We

may also introduce some further constraints that fix the coordinate system $x^r$ in the surface. While not essential for getting a practical formalism, such further constraints serve to simplify the formalism by eliminating the conditions (20), and so making the task of specifying the initial state a trivial one.

The fixation of coordinates is advantageous also in the weak-field approximation, because it leads to some degrees of freedom dropping out from the formalism, the procedure being similar to the elimination of the longitudinal waves in electrodynamics.

When dealing with gravitational waves, people usually restrict the coordinate system by introducing the harmonic conditions

$$(J g^{\mu\nu})_\nu = 0.$$

These conditions would be quite unsuitable in the present formalism because they involve the $g_{\mu 0}$, which the present formalism allows to be completely arbitrary. Any restriction imposed on the $g_{\mu 0}$ would not help one in dealing with Eqs. (14) or (19) and (20). We need some restrictions which affect only the variables involved in (14), namely $g_{rs}$ and $p^{rs}$, and possibly also the nongravitational variables.

GENERAL METHOD

Let us examine the general principles which come into play when we introduce some new restrictions or constraints on the dynamical variables in a Hamiltonian theory. Suppose we have a number of weak equations $\chi_n = 0$ $(n = 1, 2, \ldots, N)$, which may be either primary or secondary constraints. We are taking $N$ to be finite for definiteness, but the same principles apply with $N$ infinite. Suppose further that these weak equations are all first-class, so that

$$[\chi_n, \chi_{n'}] = 0.$$

Now introduce some new restrictions, say the $M$ independent equations

$$Y_m = 0, \qquad m = 1, 2, \ldots, M$$

with $M \le N$. They are, of course, weak equations. Suppose that none of them (and no linear combination of them) has zero P.b. with all the $\chi$'s, so that they are all second-class constraints. They will cause $M$ of the $\chi$'s to become second-class, while $N - M$ of the $\chi$'s (or linear combinations of them) remain first-class.

Suppose $\chi_1, \chi_2, \ldots, \chi_M$ become second-class, while $\chi_{M+1}, \ldots, \chi_N$ remain first-class. We now have the $2M$ second-class constraints $\chi_m = 0$, $Y_m = 0$ $(m = 1, 2, \ldots, M)$. Let us write $\chi_m = Y_{M+m}$, so that the $2M$ second-class constraints become $Y_s = 0$ $(s = 1, 2, \ldots, 2M)$.

There is no place for second-class weak equations in the quantum theory, so we have to transform them in some way. We shall see that we can change them into strong equations (holding as equations between operators in the quantum theory) provided we adopt a new definition of P.b.'s, which corresponds to the number of effective degrees of freedom being reduced by $M$.

In simple cases we can pick out directly the degrees of freedom that have to be dropped and those that survive. Let us take the special case when $M$ of the equations $Y_s = 0$ are

$$p_m = 0, \qquad m = 1, 2, \ldots, M. \tag{21}$$

The remaining $M$ of them must then contain all the variables $q_m$ independently, (otherwise the $p_m$ would not all be second-class) and so it must be possible to solve them for the $q_m$ and write them as

$$q_m = f_m(q_{M+1}, q_{M+2}, \ldots, p_{M+1}, p_{M+2}, \ldots). \tag{22}$$

We now see that the degrees of freedom associated with $q_m$, $p_m$ $(m = 1, 2, \ldots, M)$ cease to play an effective role in the dynamics. We can use Eqs. (21) and (22) to eliminate the variables $p_m$ and $q_m$ from the theory, which implies using these equations as definitions or as strong equations. We then work with P.b.'s that refer only to the other degrees of freedom.

In the general case one retains all the dynamical variables and merely changes their P.b.'s to correspond to the reduction in the number of degrees of freedom. To do this one first sets up the matrix of all the P.b.'s $[Y_s, Y_{s'}]$. It can be shown that this matrix has a nonvanishing determinant, provided there is no linear combination of the $Y_s$ that is first-class. One must then obtain the reciprocal matrix $C_{ss'}$, satisfying

$$C_{ss'}\,[Y_{s'}, Y_{s''}] = \delta_{ss''}. \tag{23}$$

Note that $C_{ss'}$ is a skew matrix, like $[Y_s, Y_{s'}]$. One then defines new P.b.'s by the formula

$$[\xi, \eta]^* = [\xi, \eta] - [\xi, Y_s]\,C_{ss'}\,[Y_{s'}, \eta]. \tag{24}$$

It can be checked⁵ that the new P.b.'s satisfy all the fundamental relations that P.b.'s ought to satisfy.

From (23) and (24) we see at once that $[\xi, Y_s]^* = 0$ for any $\xi$. Thus the $Y_s$ now have zero P.b. with everything, so that we can consider the equations $Y_s = 0$ as strong equations and use them before working out P.b.'s.

In applying this method to the gravitational case we desire, of course, that the change in the P.b.'s shall not be too complicated. In particular, we would like to have no change at all in the P.b. of two quantities, neither of which involves the gravitational variables $g_{rs}$, $p^{rs}$. This result is ensured provided the two conditions hold: (i) The $Y_m$ $(m = 1, 2, \ldots, M)$ involve only the gravitational variables; (ii) The P.b.'s $[Y_m, Y_{m'}]$ all vanish. The proof is as follows.

We have already $[\chi_m, \chi_{m'}] = 0$ from the assumption that the $\chi$'s were originally first-class. With the further condition $[Y_m, Y_{m'}] = 0$ we have $[Y_s, Y_{s'}] = 0$ except when $1 \le s \le M$ and $M + 1 \le s' \le 2M$ or vice versa. This
___
⁵ P. A. M. Dirac, Can. J. Math. 2, 129 (1950).

leads to $C_{ss'} = 0$ except when $1 \le s \le M$ and $M + 1 \le s' \le 2M$ or vice versa. The surviving elements of $C$ are thus $C_{m,\,M+m'} = -C_{M+m',\,m}$. The elements $C_{m,\,M+m'}$ form a matrix of $M$ rows and columns, which is the reciprocal of the matrix $[\chi_{m'}, Y_{m''}]$.

The formula (24) now reduces to

$$[\xi, \eta]^* - [\xi, \eta] = -C_{m,\,M+m'}\left( [\xi, Y_m][\chi_{m'}, \eta] - [\xi, \chi_{m'}][Y_m, \eta] \right). \tag{25}$$

If $\xi$ and $\eta$ do not involve the gravitational variables, the condition (i) above leads to $[\xi, Y_m] = 0$ and $[Y_m, \eta] = 0$, so the right-hand side of (25) vanishes.

The introduction of the new constraints into the theory, when combined with the appropriate change in the P.b.'s, leaves the Hamiltonian first-class. It follows that the Hamiltonian equations of motion preserve all the constraints.

FIXATION OF THE SURFACE

To fix the surface $x^0 = t$, the natural conditions to take are

$$p_r{}^r = g_{rs}\,p^{rs} = 0. \tag{26}$$

This involves bringing into the theory one $Y$ equation for each point of the surface.

One easily checks that

$$[\mathcal{H}_s,\; g'_{uv}\,p'^{uv}] = g_{uv}\,p^{uv}\,\delta_{,s}(x - x') = 0,$$

so the conditions (26) do not disturb the first-class character of the equations $\mathcal{H}_s = 0$. This means that the conditions (26) do not restrict the coordinate system $x^r$ in the surface, a result which is evident from the tensor character of (26). The conditions (26) mean geometrically that the surface shall have a maximum three-dimensional "area". The equations (26) and $\mathcal{H}_L = 0$ are now second-class and we can use them to eliminate one degree of freedom at each point of space.

We have

$$[g_{rs},\; p'_u{}^u] = g_{rs}\,\delta(x - x').$$

It follows that the ratios of the $g_{rs}$ at any point have zero P.b.'s with $p_u{}^u$ at all points of the surface. Let us put

$$K = \kappa^3, \qquad \tilde g_{rs} = g_{rs}\,\kappa^{-2}, \qquad \tilde e^{rs} = e^{rs}\,\kappa^{2}. \tag{27}$$

Then $\tilde g_{rs}$ involves only such ratios and has zero P.b. with $p_u{}^u$ at all points. There are five independent $\tilde g_{rs}$, as their determinant is unity. The $\tilde e^{rs}$ form the reciprocal matrix to the matrix $\tilde g_{rs}$, and also have the determinant unity.

We have

$$[K^2,\; p'_u{}^u] = 3K^2\,\delta(x - x'),$$

and so

$$[\ln\kappa,\; p'_u{}^u] = \tfrac{1}{2}\,\delta(x - x'). \tag{28}$$

Put

$$\tilde p^{rs} = \left(p^{rs} - \tfrac{1}{3}\,e^{rs}g_{ab}\,p^{ab}\right)\kappa^2, \qquad \tilde p_{rs} = \tilde g_{ra}\,\tilde g_{sb}\,\tilde p^{ab}. \tag{29}$$

We find that $\tilde g_{rs}$ and $\tilde p^{rs}$ have zero P.b. with $p_u{}^u$ and $\kappa$ at all points.

Let us change our basic dynamical coordinates from the six $g_{rs}$ to the five independent $\tilde g_{rs}$ and $\ln\kappa$. The momentum conjugate to $\ln\kappa$ is now, from (28), just $2p_u{}^u$, and the momenta conjugate to the $\tilde g_{rs}$ are certain functions of the $\tilde p^{rs}$ and $\tilde g_{rs}$.

The conditions (26) now take the form (21) and we have the equations $\mathcal{H}_L = 0$ playing the role of (22). To put them into the form of (22) we must solve them, with the help of (26), to get $\kappa$ expressed in terms of quantities having zero P.b. with $p_u{}^u$ and $\kappa$. Such quantities are the $\tilde g_{rs}$, $\tilde p^{rs}$, and the nongravitational variables.

From (11), the equation $\mathcal{H}_L = 0$ gives,

[...]

in which we look upon the $g_{rs}$ in $B$ and $\mathcal{H}_{ML}$ as expressed in terms of the $\tilde g_{rs}$ and $\kappa$. This is a difficult equation to solve generally for $\kappa$. However, for gravitational fields that are not too strong, the important terms are those that involve second derivatives of $\kappa$, i.e., those on the left-hand side. We can therefore obtain the solution by a method of successive approximation, first putting $\kappa = 1$ on the right and solving the resulting simplified equation, then substituting the first approximation for $\kappa$ on the right and solving to get the second approximation, and so on. We shall consider this equation further in the next section, with reference to a particular system of coordinates, and for the present we shall assume that the solution has been obtained.

Following the method of the preceding section for dealing with the second-class equations (21) and (22), we express $H_{\text{main}}$ and $\mathcal{H}_s$ in terms of the variables $\tilde g_{rs}$, $\tilde p^{rs}$, $p_u{}^u$, and $\kappa$, and then eliminate $p_u{}^u$ and $\kappa$ from them by means of (26) and the solution of (30), which we may now use as strong equations. The elimination from $\mathcal{H}_s$ is trivial, as we get from (12), using (26),

$$\mathcal{H}_s = \tilde p^{ab}\,\tilde g_{ab\,s} - 2\,(\tilde p^{ab}\,\tilde g_{as})_b + \mathcal{H}_{Ms}. \tag{31}$$

If the nongravitational field variables are suitably chosen, $\mathcal{H}_{Ms}$ will not contain $\kappa$. The elimination from $H_{\text{main}}$ leads to an expression

$$H^*_{\text{main}} = \int \left( \kappa^{-3}\,\tilde p^{rs}\tilde p_{rs} + B + \mathcal{H}_{ML} \right) d^3x, \tag{32}$$

in which $\kappa$ is understood to have the appropriate value. The integrand here may be considered as the energy density or mass density. The complete Hamiltonian is now

$$H^*_{\text{main}} + \int g_{r0}\,e^{rs}\,\mathcal{H}_s\, d^3x. \tag{33}$$

The term corresponding to the freedom of deformation of the surface, i.e., the middle term of (18), has disappeared.
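The bracket modification of formulas (23) and (24) can be illustrated on a finite-dimensional toy system. The following is a minimal sketch, not taken from the paper: it assumes two degrees of freedom and the two second-class constraints p1 = 0 and q1 - 3 q2 = 0, which have the shape of the special case (21), (22). Linear phase-space functions are stored as coefficient vectors, so every Poisson bracket is a number; the names `pb`, `inverse`, and `new_pb` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction

N = 2  # degrees of freedom; a linear function is a vector (q1, q2, p1, p2)

def pb(a, b):
    """Poisson bracket of two linear phase-space functions (a constant)."""
    return sum(a[i] * b[N + i] - a[N + i] * b[i] for i in range(N))

def inverse(m):
    """Gauss-Jordan inverse of a small square matrix, in exact arithmetic."""
    n = len(m)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        # Raises StopIteration if the matrix is singular, i.e. if some
        # combination of the constraints is first-class.
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def new_pb(a, b, ys):
    """Modified bracket of formula (24) for the second-class constraints ys."""
    c = inverse([[pb(y, z) for z in ys] for y in ys])  # reciprocal matrix of (23)
    correction = sum(pb(a, ys[s]) * c[s][t] * pb(ys[t], b)
                     for s in range(len(ys)) for t in range(len(ys)))
    return pb(a, b) - correction

q1, q2, p1, p2 = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]
Y = [p1, [1, -3, 0, 0]]  # assumed constraints: p1 = 0 and q1 - 3*q2 = 0

print(new_pb(q1, p2, Y))  # bracket of q1 with p2 once the constraints are strong
print(new_pb(q2, p2, Y))  # bracket of the surviving pair q2, p2
```

With these assumed constraints the new bracket of q1 with p2 equals 3, as one would expect after using q1 = 3 q2 as a strong equation, while the surviving pair q2, p2 keeps its bracket of 1 and every constraint has zero new bracket with everything.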

We now have a Hamiltonian formalism in which the degree of freedom described by $p_u{}^u$ and $\ln\kappa$ has dropped out. The Hamiltonians (32) and (33) are first-class even with the condition (26), so they lead to equations of motion that preserve (26). The procedure of substituting for $\kappa$ in the derivation of $H^*_{\text{main}}$ caused the introduction of the right amount of $\mathcal{H}_L$ into the Hamiltonian to ensure the preservation of (26).

FIXATION OF COORDINATES IN THE SURFACE

To get the theory into a more convenient form, one must also fix the coordinate system $x^r$ in the surface. The most natural conditions to take for this purpose, from the geometrical point of view, are the harmonic conditions in three dimensions:

$$(K e^{rs})_s = 0. \tag{34}$$

However, (34) does not have zero P.b. with (26), so if we adopt (34) together with (26) we must change the P.b. relationships between the nongravitational variables. To avoid this inconvenience, it is better to replace (34) by

$$\tilde e^{rs}{}_s = 0, \tag{35}$$

which does have zero P.b. with (26).

With the coordinates fixed by (35), Eq. (30) reduces to

$$-4\nabla^2\kappa = \kappa^{-3}\,\tilde p^{rs}\tilde p_{rs} + B + \mathcal{H}_{ML}, \tag{36}$$

where $\nabla^2$ denotes the Laplacian operator with respect to the metric $\tilde g_{rs}$, namely

$$\nabla^2 = \tilde e^{rs}\,\partial^2/\partial x^r\,\partial x^s. \tag{37}$$

The right-hand side in (36) equals the integrand in (32) and is the mass density. To interpret (36), let us restore the gravitational constant into the theory in accordance with (2). It then becomes

$$-(4\pi\gamma)^{-1}\nabla^2\kappa = 16\pi\gamma\,\kappa^{-3}\,\tilde p^{rs}\tilde p_{rs} + (16\pi\gamma)^{-1}B + \mathcal{H}_{ML}. \tag{38}$$

We now see that $\kappa - 1$ is the Newtonian potential generated by the mass density in a space with the metric $\tilde g_{rs}$. The fact that $\kappa$ occurs in the right-hand side of (38) can be understood as due to the Newtonian potential itself having some influence on the mass density which generates it.

Let us examine the term with $B$ in (38). The expression (13) for $B$, written in terms of the new variables, is

$$B = \tfrac{1}{4}\kappa^{-1}\left(\kappa\,\tilde g_{rs\,u} + 2\kappa_u\,\tilde g_{rs}\right)\left(\kappa\,\tilde g_{ab\,v} + 2\kappa_v\,\tilde g_{ab}\right) \times \left\{ (\tilde e^{ra}\tilde e^{sb} - \tilde e^{rs}\tilde e^{ab})\,\tilde e^{uv} + 2\,(\tilde e^{ru}\tilde e^{ab} - \tilde e^{ra}\tilde e^{bu})\,\tilde e^{sv} \right\}.$$

With the help of the equation

$$\tilde g_{rs\,u}\,\tilde e^{rs} = 0,$$

which follows from the determinant of the $\tilde g_{rs}$ being unity, and of the equation

$$\tilde g_{rs\,u}\,\tilde e^{ru} = 0,$$

which follows from (35), this reduces to

$$B = \tfrac{1}{4}\kappa\,\tilde g_{rs\,u}\,\tilde g_{ab\,v}\left( \tilde e^{ra}\tilde e^{sb}\tilde e^{uv} - 2\,\tilde e^{ra}\tilde e^{bu}\tilde e^{sv} \right) - 2\kappa^{-1}\,\tilde e^{uv}\kappa_u\kappa_v. \tag{39}$$

The last term here, divided by $16\pi\gamma$, can be interpreted as the mass density (or energy density) of the Newtonian field with the potential $\kappa - 1$. It is negative definite, corresponding to the Newtonian force being attractive. The remaining terms of $B$, together with the first term on the right-hand side of (38), give the energy density of the gravitational waves.

THE NEW POISSON BRACKETS

With the coordinates fixed by (35), the P.b.'s of the gravitational variables with one another and with the nongravitational variables will be altered. The new P.b.'s are given by formula (25) with $Y_m$ replaced by $\tilde e^{ru}{}_u$ and $\chi_{m'}$ replaced by $\mathcal{H}'_s$. It thus reads

$$[\xi, \eta]^* - [\xi, \eta] = -\iint C_r{}^s(x, x')\left\{ [\xi, \tilde e^{ru}{}_u]\,[\mathcal{H}'_s, \eta] - [\xi, \mathcal{H}'_s]\,[\tilde e^{ru}{}_u, \eta] \right\} d^3x\, d^3x'. \tag{40}$$

The coefficient $C_r{}^s(x, x')$ is the reciprocal of the matrix $[\mathcal{H}'_s, \tilde e^{ru}{}_u]$ and thus satisfies

$$\int C_r{}^s(x'', x')\,[\mathcal{H}'_s, \tilde e^{tu}{}_u]\, d^3x' = g_r^t\,\delta(x - x''). \tag{41}$$

Evaluating the P.b. here, we get

[...]

which reduces to

$$\nabla^2 C_r{}^t(x', x) + \tfrac{1}{3}\,\tilde e^{ts}\,C_r{}^u(x', x)_{us} = g_r^t\,\delta(x - x'), \tag{42}$$

with $\nabla^2$ defined by (37).

This equation may be considered for fixed $x'$, when it is a differential equation for the unknown functions $C_r{}^t(x', x)$ in the variables $x$. The important domain for $x$ is now the neighborhood of $x'$, since when $x$ is far from $x'$ the functions $C_r{}^t(x', x)$ are small. We can therefore get an approximate solution by considering the space as flat in this domain, so that the $\tilde e^{ab}$ are constants. With this approximation we get, on differentiating (42) with respect to $x^t$,

$$\nabla^2 C_r{}^t{}_t = \tfrac{3}{4}\,\delta_r(x - x'). \tag{43}$$

The solution of this equation is

[...]

where $|x - x'|$ denotes the distance from $x$ to $x'$ with respect to the metric $\tilde g_{rs}$,

$$|x - x'| = \left\{ \tilde g_{rs}\,(x^r - x'^r)(x^s - x'^s) \right\}^{1/2}. \tag{44}$$

Equation (42) now becomes

$$\nabla^2 C_r{}^t = g_r^t\,\delta(x - x') + \frac{1}{16\pi}\,\tilde e^{ts}\left(|x - x'|^{-1}\right)_{rs},$$

whose solution is

[...]

One could get the solution of (42) to a higher accuracy by substituting for the $\tilde e^{ab}$ in the left-hand side of (42), (remembering that $\tilde e^{ab}$ occurs also in the operator $\nabla^2$) their Taylor expansions in powers of $x - x'$ and using the first approximation for $C_r{}^t$ in those terms in which it occurs with a factor $x^r - x'^r$. By a process of successive approximation one could get the solution to any desired accuracy.

With the coefficients $C_r{}^s(x, x')$ in (40) determined, the new P.b.'s are determined. It should be noted that the new P.b. of any nongravitational variable with $\tilde g_{rs}$ or $\tilde e^{rs}$ vanishes. However, its new P.b. with $\tilde p^{rs}$ does not vanish.

QUANTIZATION

To pass over to the quantum theory, we must make all our dynamical variables into operators satisfying commutation relations corresponding to the new P.b.'s. We must then pick out a complete set of commuting observables. We may take these to consist of the $\tilde e^{rs}$ at all points $x'$, together with a complete set of commuting nongravitational observables, $f$ say. We can then set up the wave function as a function of these variables,

$$\psi(\tilde e^{rs}, f).$$

The effective domain of $\psi$ is that for which the $\tilde e^{rs}$ are restricted to have the determinant unity and to satisfy $\tilde e^{rs}{}_s = 0$. $\psi$ may be considered as undefined outside this domain. When we operate on $\psi$ with $\tilde p^{ab}$ or with any dynamical variable in the theory, we get another wave function defined in the same domain, on account of $\tilde p^{ab}$ commuting with the determinant of the $\tilde e^{rs}$ and with $\tilde e^{rs}{}_s$.

There are no supplementary conditions to be imposed on $\psi$. We can choose it arbitrarily to correspond to the initial state in any problem. There is just one equation for $\psi$, the Schrödinger equation

[...]

which fixes the state at later times.

For the theory to be self-consistent it is necessary that the space-like surface on which the state is defined shall always remain space-like. The condition for this is that $K^2$, the determinant of the $g_{rs}$, shall remain always positive. In the present formalism this means $\kappa > 0$, with $\kappa$ determined by (36). If the mass density is always positive, (36) shows that $\kappa > 1$ and there is no trouble. Difficulties arise only where there is a large negative density. This occurs very close to a point particle, on account of the last term in (39). The gravitational treatment of point particles thus brings in one further difficulty, in addition to the usual ones in the quantum theory.
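As a worked check, and only as a sketch of how the reduction presumably goes, the left-hand side of (36) can be recovered from the divergence term of (11) using the substitutions (27) and the coordinate condition (35). The intermediate steps below are not quoted from the paper; they use only $K = \kappa^3$, $e^{rs} = \tilde e^{rs}\kappa^{-2}$, the symmetry of $\tilde e^{rs}$, and $\tilde e^{rs}{}_s = 0$.

```latex
\begin{align*}
K^2 e^{rs} &= \kappa^{6}\,\kappa^{-2}\,\tilde e^{rs} = \kappa^{4}\,\tilde e^{rs},\\
(K^2 e^{rs})_r &= 4\kappa^{3}\kappa_r\,\tilde e^{rs} + \kappa^{4}\,\tilde e^{rs}{}_r
               = 4\kappa^{3}\kappa_r\,\tilde e^{rs} \quad \text{by (35)},\\
\bigl(K^{-1}(K^2 e^{rs})_r\bigr)_s &= \bigl(4\kappa_r\,\tilde e^{rs}\bigr)_s
               = 4\,\tilde e^{rs}\kappa_{rs} = 4\nabla^2\kappa .
\end{align*}
```

With $p_r{}^r = 0$ from (26) one also has $K^{-1}p^{rs}p_{rs} = \kappa^{-3}\tilde p^{rs}\tilde p_{rs}$, so the constraint $\mathcal{H}_L = 0$ reads $\kappa^{-3}\tilde p^{rs}\tilde p_{rs} + B + 4\nabla^2\kappa + \mathcal{H}_{ML} = 0$, which is (36).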

PHYSICAL REVIEW D VOLUME 19, NUMBER 12 15 JUNE 1979

New general relativity


Kenji Hayashi
Institute of Physics, University of Tokyo (Komaba), Tokyo, Japan

Takeshi Shirafuji
Physics Department, Saitama University, Saitama, Japan
(Received 6 February 1979)
A gravitational theory is formulated on the Weitzenböck space-time, characterized by the vanishing
curvature tensor (absolute parallelism) and by the torsion tensor formed of four parallel vector fields. This
theory is called new general relativity, since Einstein in 1928 first gave its original form. New general
relativity has three parameters $c_1$, $c_2$, and $\lambda$, besides the Einstein constant $\kappa$. In this paper we choose
$c_1 = 0 = c_2$, leaving open $\lambda$. We prove, among other things, that (i) a static, spherically symmetric
gravitational field is given by the Schwarzschild metric, that (ii) in the weak-field approximation an
antisymmetric field of zero mass and zero spin exists, besides gravitons, and that (iii) new general relativity
agrees with all the experiments so far carried out.

I. INTRODUCTION

In 1928 Einstein introduced the notion of absolute parallelism and tried to unify gravitation and electromagnetism, using tetrads with 16 degrees of freedom. His attempt failed because there was no Schwarzschild solution in his simplified field equation. Later, in 1961 Møller revived Einstein's idea,³ and Pellegrini and Plebanski found a Lagrangian formulation for absolute parallelism. Recently this formalism was reconsidered by Møller.⁵

In 1967, quite independently, Hayashi and Nakano started to formulate the gauge theory of the space-time translation group: This theory was of no geometrical construction, but it was shown that, for a static, isotropic gravitational field, a symmetric part of their field equations is identical with the Einstein field equation in general relativity, and that, in the weak-field approximation, an antisymmetric part describes the propagation of an antisymmetric field, whose source is related to the intrinsic spin of spin-½ fundamental particles. Miyamoto and Nakano estimated effects of exchanging this field in the microscopic system. In later years Hayashi further developed the gauge theory into a more elaborate framework and fixed the final form in 1973. Quite recently Hayashi pointed out the connection between the gauge theory of the space-time translation group and absolute parallelism.¹⁰

Now we wish to unify these two developments mentioned above, following the geometry of underlying space-time structure. The Riemann-Cartan space-time $U_4$ is a paracompact, Hausdorff, connected $C^\infty$ four-dimensional manifold endowed with a locally Lorentzian metric $g$ and a linear affine connection $\Gamma$ which is metric,

[...]

From this equation we get

[...]

where the first term denotes the Levi-Civita connection,

[...]

and the second stands for the contortion tensor,

[...]

with the torsion tensor

[...]

In terms of the affine connection the curvature tensor is given by

[...]

The Riemann-Cartan space-time has both the curvature tensor and the torsion tensor. From this space-time follow two very interesting models of space-time. One is the well-known Riemann space-time $V_4$, which is obtained from the $U_4$ by setting the torsion tensor to be identically vanishing. From (1.2) follows the Levi-Civita connection. It is well known that general relativity is the theory of gravitation on this space-time, and that it ascribes gravitation to the Riemann-Christoffel curvature tensor formed of the Levi-Civita connection.

Another interesting model is the Weitzenböck space-time $A_4$, which is obtained from the $U_4$ by setting the curvature tensor to be identically vanishing,

[...]

Or, to put it equivalently, the Weitzenböck space-

19 3524 © 1979 The American Physical Society

time is obtained by requiring the $U_4$ to admit absolute parallelism, i.e., to have a quadruplet (specified by $k = 0, 1, 2, 3$) of linearly independent parallel vector fields, $b = \{b_k\} = \{b_k{}^\mu\}$, which is defined by

$$D^*_\nu b_k{}^\lambda = \partial_\nu b_k{}^\lambda + \Gamma^\lambda_{\mu\nu}\, b_k{}^\mu = 0. \tag{1.8}$$

Solving this equation we find the nonsymmetric affine connection,

$$\Gamma^\lambda_{\mu\nu} = b_k{}^\lambda\, \partial_\nu b^k{}_\mu, \tag{1.9}$$

and the torsion tensor,

$$T^\lambda{}_{\mu\nu} = \Gamma^\lambda_{\mu\nu} - \Gamma^\lambda_{\nu\mu} = b_k{}^\lambda (\partial_\nu b^k{}_\mu - \partial_\mu b^k{}_\nu). \tag{1.10}$$

Here $b^* = \{b^k\} = \{b^k{}_\mu\}$ is also a quadruplet of parallel vector fields, which is inverse to $b$. It is straightforward to see that the curvature tensor indeed vanishes identically [see (1.7)]. See Fig. 1 for reduction of the Riemann-Cartan space-time.

We will give the name, new general relativity, to the theory of gravitation on the Weitzenböck space-time, since Einstein in 1928, after inventing general relativity, considered absolute parallelism for the first time, and the main consequences of the present theory will be analogous to those of general relativity so far as macroscopic phenomena are concerned. New general relativity attributes gravitation to the torsion tensor formed of the parallel vector fields.

As is well known, general relativity is formulated by the following fundamental assumptions, which we will compare with those of new general relativity: (A) Underlying space-time is the Riemann space-time, which has the metric tensor as the basic structure. All physical laws are expressed by equations that are covariant or form invariant under the group of general coordinate transformations. (B) The equivalence principle. (C) Gravitational field equations are derivable from the action principle. (D) The field equations are partial differential equations in the field variables of not higher than the second order. (E) The gravitational field is exhaustively described by the metric tensor alone.

In new general relativity the fundamental assumptions are as follows: (A') Underlying space-time is the Weitzenböck space-time, which has a quadruplet of the parallel vector fields as the fundamental structure. These parallel vector fields give rise to the metric tensor as a by-product. All physical laws are expressed by equations that are covariant or form invariant under the group of general coordinate transformations; (B') The equivalence principle is valid only in classical physics. (C') and (D') are the same as (C) and (D), but at this time we start from the microscopic action principle. (E') The gravitational field is exclusively described by a quadruplet of the parallel vector fields. As is closely related to (E'), we need to assume: (F') All physical laws are expressed by equations that are covariant or form invariant under the group of global Lorentz transformations. When general relativity is extended to the domain of microscopic system, one must use tetrads and has to assume: (F) All physical laws are expressed by equations that are covariant or form invariant under the group of local Lorentz transformations.

[Figure: boxes labeled "Riemann-Cartan Space-Time" and "Minkowski Space-Time"]
FIG. 1. The reduction of space-time is made in two particular cases: One is the Riemann space-time $V_4$ with a curvature tensor only (R), and the other the Weitzenböck space-time $A_4$ with a torsion tensor alone

We shall formulate new general relativity in the following manner: In Sec. II geometry of the Weitzenböck space-time is described in some detail, with emphasis on spinor wave functions defined in this space-time. In Sec. III microscopic matter Lagrangians are considered, such as of the electromagnetic field, of spin-1/2 fundamental particles and so forth. Their equations of motion are derived and then approximated by the WKB method to yield, in the classical limit, the geodesics of the metric $g$, along which point particles and light rays are defined to move. In Sec. IV a gravitational Lagrangian is constructed by the requirement of invariance under (1) the group of general coordinate transformations, (2) the group of global Lorentz transformations, (3) the parity operation, and by the demand that (4) the Lagrangian be quadratic in the torsion tensor. Gravitational field equations are derived, with three unknown parameters, $c_1$, $c_2$, and $c_3$. In Sec. V a static, spherically symmetric field outside a massive neutral body is determined, with two parameters, $c_1$ and $c_2$; in this case a term proportional to $c_3$ is vanishing identically. In Sec. VI comparison with all

3526 KENJI HAYASHI AND TAKESHI SHIRAFUJI 19

the experiments so far carried out is made; firstly, we clarify how the equivalence principle is violated in microscopic systems only, and secondly, upper bounds for the parameters, $c_1$ and $c_2$, are obtained. In Sec. VII the free parameters, $c_1$ and $c_2$, are classified into two classes with $c_3$ arbitrary; $c_1 = 0 = c_2$ and $c_1 \neq 0 \neq c_2$. (Other cases are forbidden.) The rest of the present paper is concerned with the former choice of free parameters, and hence the static, isotropic field is given by the Schwarzschild solution. In this case new general relativity has only one free parameter, $\lambda = \ldots$, besides the Einstein gravitational constant $\kappa$. In Sec. VIII the group of local Lorentz transformations, which we do not assume, is seen to emerge as the dynamical symmetry group for a static, isotropic field. This new situation demands the extension of absolute parallelism. In Sec. IX, as microscopic applications, the weak-field approximation to gravitational field equations is performed. In Sec. X the coupling of an antisymmetric field to matter is discussed; it propagates in vacuum, mediating a long-range, spin-spin force among spin-1/2 fundamental particles with a coupling strength $\delta$, which is estimated by precise experimental values of quantum electrodynamics. In Sec. XI the Birkhoff theorem, that a spherically symmetric gravitational field in empty space must be static, with a metric given by the Schwarzschild solution, is proved in new general relativity. In Sec. XII we draw conclusions.

In our conventions the middle part of the Greek alphabet, $\mu, \nu, \lambda, \ldots$, refers to 0, 1, 2, and 3, while the initial part, $\alpha, \beta, \gamma, \ldots$, denotes 1, 2, and 3. In a similar way the middle part of the Latin alphabet, $i, j, k, \ldots$, means 0, 1, 2, and 3, while the initial part, $a, b, c, \ldots$, denotes 1, 2, and 3.

II. GEOMETRY OF THE WEITZENBÖCK SPACE-TIME

The space-time $M$ is assumed to be a paracompact, Hausdorff, connected $C^\infty$ four-dimensional manifold with a locally Lorentzian metric $g$. Let $U$ be a local coordinate neighborhood of $p \in M$ with local coordinates $x = \{x^\mu\}$, then we can introduce the coordinate basis $E = \{E_\mu\} = \{(\partial/\partial x^\mu)_p\}$ with $\mu = 0, 1, 2$, and 3, and the dual basis $E^* = \{E^\mu\} = \{(dx^\mu)_p\}$. Every vector $V$ at $p$ can be written as $V = V^\mu E_\mu$. In particular the metric tensor $g$ is written as

where the metric components are simply the inner product of the coordinate basis vectors,

These components are used to raise and lower Greek indices.

By definition there exists a global system of four orthonormal vector fields $b(p) = \{b_i(p)\}$, such that

$$g(b_i, b_j) = g(b_j, b_i) = \eta_{ij}, \tag{2.3}$$

where $\eta = (\eta_{ij}) = \mathrm{diag}(-1, +1, +1, +1)$. Thus the vector fields, $b(p) = \{b_i(p)\}$, are expressed in the old basis by

$$b_i = b_i{}^\mu E_\mu; \tag{2.4a}$$

equivalently, a global system of four orthonormal vector fields $b^*(p) = \{b^i(p)\}$, which are dual to $b(p) = \{b_i(p)\}$, is written in the old basis by

$$b^i = b^i{}_\mu E^\mu. \tag{2.4b}$$

Conversely, it follows that

$$E_\mu = b^i{}_\mu\, b_i, \tag{2.4c}$$

$$E^\mu = b_i{}^\mu\, b^i. \tag{2.4d}$$

Here the coefficients, $\{b_i{}^\mu\}$ or $\{b^i{}_\mu\}$, are 16 functions, and must satisfy

$$b^i{}_\mu b_i{}^\nu = \delta_\mu{}^\nu, \qquad b^i{}_\mu b_j{}^\mu = \delta^i{}_j, \tag{2.5a}$$

$$b^i{}_\mu \eta_{ij} b^j{}_\nu = g_{\mu\nu}, \qquad b_i{}^\mu g_{\mu\nu} b_j{}^\nu = \eta_{ij}. \tag{2.5b}$$

It should be remarked that Latin indices of $b = \{b_i\}$ and $b^* = \{b^i\}$ mean that they are Lorentz vectors: $b_i$ is the covariant vector and $b^i$ is the contravariant vector. From (2.4) it follows that for any vector $V = V^\mu E_\mu = V^i b_i$ the components satisfy

$$V^\mu = b_i{}^\mu V^i, \qquad V^i = b^i{}_\mu V^\mu. \tag{2.6}$$

This rule of converting Greek to Latin indices and vice versa is applied to any tensor of higher rank.

Now, in the Weitzenböck space-time the covariant derivative, denoted by $D^*$, defines absolute parallelism with respect to the global system of the four orthonormal vector fields $b$. By definition $D^*$ satisfies

$$D^* b_i = 0, \tag{2.7a}$$

or in the coefficient form,

$$D^*_\nu b_i{}^\lambda = \partial_\nu b_i{}^\lambda + \Gamma^\lambda_{\mu\nu}\, b_i{}^\mu = 0. \tag{2.7b}$$

From (2.4), (2.5), and (2.7) we find

$$D^*_\nu E_\mu = \Gamma^\lambda_{\mu\nu} E_\lambda \tag{2.8}$$

with the affine connection of absolute parallelism,

$$\Gamma^\lambda_{\mu\nu} = b_i{}^\lambda\, \partial_\nu b^i{}_\mu = -b^i{}_\mu\, \partial_\nu b_i{}^\lambda. \tag{2.9}$$

Here the global system of the four orthonormal vector fields is called a quadruplet of the parallel vector fields, or simply the parallel vector fields.

For a vector field $V(x) = V^i(x)\, b_i = V^\mu(x)\, E_\mu$, the covariant derivative is given by

$$D^* V = (D^*_\nu V^i)\, E^\nu \otimes b_i = (D^*_\nu V^\mu)\, E^\nu \otimes E_\mu, \tag{2.10}$$

where

$$D^*_\nu V^i = \partial_\nu V^i, \tag{2.11}$$

$$D^*_\nu V^\mu = \partial_\nu V^\mu + \Gamma^\mu_{\lambda\nu} V^\lambda. \tag{2.12}$$

Thus, for the components $V^i$ with respect to a quadruplet of the parallel vector fields, the covariant derivative coincides with the usual derivative.

In the Weitzenböck space-time absolute parallelism of vectors at different points of $M$ is defined in the following way: Consider a vector $V(p) = V^i b_i(p)$ at $p$ and a vector $W(q) = W^i b_i(q)$ at $q$, where the point $q$ can be arbitrarily separated from $p$. The parallelism of $V$ and $W$ is manifest: If their components are equal with each other,

$$V^i = W^i, \tag{2.13}$$

then the two vectors, $V(p)$ and $W(q)$, are parallel with each other and of equal length.

In passing we make the remark that Latin indices are used to denote components with respect to a quadruplet of the parallel vector fields, and are raised and lowered by the Minkowski metric tensor, $\{\eta_{ij}\}$ or $\{\eta^{ij}\}$.

The affine connection, $\Gamma^* = \{\Gamma^\lambda_{\mu\nu}\}$, is not symmetric with respect to the exchange of lower two indices. The torsion tensor is given by

$$T^\lambda{}_{\mu\nu} = \Gamma^\lambda_{\mu\nu} - \Gamma^\lambda_{\nu\mu} = b_k{}^\lambda (\partial_\nu b^k{}_\mu - \partial_\mu b^k{}_\nu). \tag{2.14}$$

The curvature tensor formed of $\Gamma^*$ identically vanishes [see (1.7)], since parallel transfer of a vector is path independent owing to absolute parallelism. Thus the Weitzenböck space-time is characterized by the torsion tensor alone, and reduces to the Minkowski space-time provided the torsion tensor vanishes globally. See Fig. 1 for reduction of the Riemann-Cartan space-time. In the Minkowski space-time the parallel vector fields, which define absolute parallelism, coincide with the coordinate basis of a Cartesian coordinate system.

When a quadruplet of the parallel vector fields $b$ is subject to a global, proper, orthochronous Lorentz transformation,

$$b'_i = \Lambda^j{}_i\, b_j, \tag{2.15a}$$

$$\Lambda^m{}_i\, \eta_{mn}\, \Lambda^n{}_j = \eta_{ij}, \quad \det\Lambda = 1, \quad \Lambda^0{}_0 \ge 1, \quad \partial_\mu \Lambda^i{}_j = 0, \tag{2.15b}$$

new absolute parallelism defined by new parallel vector fields $b'$ is equivalent to the original one. So geometry of the Weitzenböck space-time is invariant under the global, proper, orthochronous Lorentz group, $L_+^\uparrow = \{\Lambda = (\Lambda^i{}_j) \in GL(4, R),\ \Lambda^m{}_i \eta_{mn} \Lambda^n{}_j = \eta_{ij},\ \det\Lambda = 1,\ \Lambda^0{}_0 \ge 1,\ \partial_\mu \Lambda^i{}_j = 0\}$. In conformity with this invariance of underlying geometry, we demand that physical laws should be invariant under the action of $L_+^\uparrow$. We call this the global Lorentz invariance.

In the Weitzenböck space-time, spinors are introduced as quantities which transform like two-valued representations of the proper, orthochronous Lorentz group $L_+^\uparrow$. Most elementary spinors are four kinds of two-component spinors, i.e., contravariant spinor $\{\xi^A\}$, covariant spinor $\{\chi_A\}$, dotted contravariant spinor $\{\xi^{\dot A}\}$, and dotted covariant spinor $\{\chi_{\dot A}\}$ for $A = 1$ and 2. Dotted spinors transform like the complex conjugate of undotted spinors. A spinor of higher rank is a quantity which transforms like a direct product of two-component spinors. A vector $V = V^i b_i$ is identified with a mixed spinor of second rank with components $V^{A\dot B}$,

$$V^{A\dot B} = \sigma_i{}^{A\dot B}\, V^i, \tag{2.16}$$

where $\{\sigma_i{}^{A\dot B}\}$ is a set of Hermitian $2 \times 2$ matrices satisfying

where

(2.18)

One of the simplest choices, which we take in this paper, is

where $\{\sigma_1, \sigma_2, \sigma_3\}$ is a set of the Pauli matrices.

The four-component Dirac spinor $\psi$ is defined by a direct sum of a covariant spinor and a dotted contravariant spinor, and is written as a single column matrix

(2.20)

The conjugate Dirac spinor $\bar\psi$ is obtained from $\psi$ by

(2.21)

Now we extend the definition of absolute parallelism to include spinors. Consider a spinor at $p$, say a contravariant two-component spinor $\{\xi^A(p)\}$, and another spinor at $q$ of the same type, say, $\{\zeta^A(q)\}$. If components of these spinors are equal, i.e., $\xi^A(p) = \zeta^A(q)$, then two spinors are defined to

be parallel and of the same magnitude. From (2.16) it follows that absolute parallelism of spinors implies absolute parallelism of vectors and tensors: In fact, for two vectors $V$ at point $p$ and $W$ at another point $q$, equality of spinor components, $V^{A\dot B}(p) = W^{A\dot B}(q)$, implies $V^i(p) = W^i(q)$, because $\{\sigma_i{}^{A\dot B}\}$ is independent of space-time position.

When a spinor at point $p$ is parallel transferred to another point $q$, its components are kept unchanged owing to absolute parallelism. Therefore, the covariant derivative $D^*_\mu$ of spinors coincides with the usual derivative $\partial_\mu$.

Finally, we make the following important remark: The parallel vector fields $b$ are different from the so-called tetrad fields $e$ by an arbitrary, position-dependent Lorentz transformation, which is called a local Lorentz transformation.

III. MATTER LAGRANGIAN AND EQUATIONS OF MOTION FOR TEST PARTICLES

A. Matter Lagrangian

In new general relativity we do not identify the six extra degrees of freedom of the parallel vector fields with the electromagnetic field strength, since we now know that such an attempt failed. Instead, we take the electromagnetic potential $A = \{A_\mu\}$ as the dynamical variable independent of the parallel vector fields. The matter part of the action is then represented as a sum of the action of fundamental particles and fields, i.e., of the electromagnetic field and several kinds of spin-1/2 fundamental particles;

where $g$ is

$$g = \det(g_{\mu\nu}) = -[\det(b^k{}_\mu)]^2 < 0, \tag{3.2}$$

and $L_{\rm int}$ represents nongravitational interactions among fundamental particles and fields. Here the index $i$ in the second term labels spin-1/2 fundamental particles such as quarks, electrons, muons, electron-neutrinos, muon-neutrinos, etc., all of which can be described in fairly good approximation by spinor wave functions obeying the Dirac equation. If there exist other fundamental fields besides the electromagnetic field, their action must be added to (3.1): Gauge fields for internal symmetry of fundamental particles, if they exist, can be included in (3.1) in a similar manner to the electromagnetic field.

The gauge invariance of the electromagnetic interaction shall be assumed to hold in new general relativity, because this invariance plays the fundamental role in quantum electrodynamics. The electromagnetic Lagrangian density $L_{\rm em}$ is then given by

$$L_{\rm em} = -\tfrac{1}{4}\, g^{\mu\rho} g^{\nu\sigma} F_{\mu\nu} F_{\rho\sigma}, \tag{3.3}$$

with

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu, \tag{3.4}$$

which is of the same form as the electromagnetic Lagrangian density used in general relativity.

Absolute parallelism is applied to spinor wave functions of fundamental spin-1/2 particles, and the Dirac Lagrangian density $L_D$ is given by$^{15}$

$$L_D = \tfrac{1}{2}\, i\hbar\, b_k{}^\mu \left[\bar\psi \gamma^k D^*_\mu \psi - (D^*_\mu \bar\psi)\gamma^k \psi\right] - m\bar\psi\psi. \tag{3.5a}$$

In this paper we use the unit, $\hbar = c = 1$, but throughout this section we write $\hbar$ explicitly for convenience of taking the semiclassical limit. For spinors the covariant derivative $D^*_\mu$ coincides with the usual derivative

$$D^*_\mu \psi = \partial_\mu \psi. \tag{3.6a}$$

If we use the covariant derivative $\nabla_\mu$ of general relativity,

$$\nabla_\mu \psi = \left(\partial_\mu + \tfrac{i}{2} A_{ij\mu} S^{ij}\right)\psi, \tag{3.6b}$$

with respect to the Ricci rotation coefficients $\{A_{ij\mu}\}$,

$$A_{ij\mu} = b^k{}_\mu A_{ijk}, \qquad A_{ijk} = -\tfrac{1}{2}(T_{ijk} - T_{jik} - T_{kij}), \tag{3.7}$$

then $L_D$ of (3.5a) can be rewritten as

where $\{a^\mu\}$ is the axial-vector part of the torsion tensor,

$$a^\mu = b_k{}^\mu a^k = \tfrac{1}{6}\, \varepsilon^{\mu\nu\rho\sigma} T_{\nu\rho\sigma}. \tag{3.8}$$

This covariant derivative $\nabla_\mu$, when applied to tensors whose indices are written in Greek, becomes the usual covariant derivative with respect to the Levi-Civita connection (1.3). Here the completely antisymmetric tensors, $\varepsilon = \{\varepsilon^{\mu\nu\rho\sigma}\}$ and $\varepsilon^* = \{\varepsilon_{\mu\nu\rho\sigma}\}$, with respect to the coordinate basis are defined by

$$\varepsilon^{\mu\nu\rho\sigma} = \frac{1}{\sqrt{-g}}\, \delta^{\mu\nu\rho\sigma}, \qquad \varepsilon_{\mu\nu\rho\sigma} = \sqrt{-g}\, \delta_{\mu\nu\rho\sigma}, \tag{3.9}$$

where $\delta = \{\delta^{\mu\nu\rho\sigma}\}$ and $\delta^* = \{\delta_{\mu\nu\rho\sigma}\}$ are the completely antisymmetric tensor densities of weight $-1$ and $+1$, respectively, with normalizations $\delta^{0123} = +1$ and $\delta_{0123} = -1$. So the completely antisymmetric tensors with respect to the parallel vector fields are defined by

$$\varepsilon^{ijkl} = b^i{}_\mu b^j{}_\nu b^k{}_\rho b^l{}_\sigma\, \varepsilon^{\mu\nu\rho\sigma}, \qquad \varepsilon_{ijkl} = b_i{}^\mu b_j{}^\nu b_k{}^\rho b_l{}^\sigma\, \varepsilon_{\mu\nu\rho\sigma}, \tag{3.10}$$

where $\varepsilon^{(0)(1)(2)(3)} = +1$ and $\varepsilon_{(0)(1)(2)(3)} = -1$ with Lorentz (Latin) indices in parentheses.

Variation of $L_M$ with respect to $A_\mu$ gives

(3.11)

where the electromagnetic current is defined by

$$j^\mu = \frac{\delta L_{\rm int}}{\delta A_\mu}. \tag{3.12}$$

Equation (3.11) is just the Maxwell equation in general relativity, and hence the law of electromagnetism is entirely free from the influence of absolute parallelism. In space-time with a given background metric $g$, electromagnetic waves propagate in the same manner as in general relativity: In the short-wavelength limit, in particular, light rays propagate along the null geodesics of the metric $g$.

The Dirac equation is derived from $L_D$ by taking variation with respect to $\bar\psi$,

$$\left[i\hbar\, b_k{}^\mu \gamma^k \left(D^*_\mu + \tfrac{1}{2} v_\mu\right) - m\right]\psi = 0, \tag{3.13a}$$

or equivalently,

$$\left(i\hbar\, b_k{}^\mu \gamma^k \nabla_\mu - \tfrac{3}{4}\hbar\, a_k \gamma_5 \gamma^k - m\right)\psi = 0, \tag{3.13b}$$

where $\{v_\mu\}$ is the vector part of the torsion tensor,

$$v_\mu = T^\lambda{}_{\lambda\mu}, \tag{3.14}$$

and only the gravitational interaction is included.

B. Equations of motion for massive Dirac particles

We shall derive two equations of motion for a freely falling Dirac particle, i.e., the equation of orbit and the equation of spin precession, by applying the WKB approximation method to the Dirac equation (3.13).

The particle of spin 1/2 is usually represented by a four-component spinor wave function obeying the first-order Dirac equation. However, it is well known that it can equally well be described by a two-component spinor wave function obeying a second-order wave equation. So there are two equivalent ways to take the classical limit for the particle of spin 1/2, in accordance with which a wave equation is considered; a first-order wave equation or a second-order one.

For our present purpose of deriving the spin equation in addition to the orbit equation, it is much more convenient to start from a second-order wave equation rather than from the Dirac equation (3.13). We thus introduce a two-component spinor wave function $\phi$ by$^{20}$

$$\phi = \tfrac{1}{2}(1 + \gamma_5)\psi. \tag{3.15}$$

Then, by means of the Dirac equation (3.13), this can be solved with respect to $\psi$,

$$\psi = \left[1 + (i/m)\hbar\, b_k{}^\mu \gamma^k \left(D^*_\mu + \tfrac{1}{2} v_\mu\right)\right]\phi, \tag{3.16}$$

and so we find that $\phi$ satisfies the second-order wave equation,

$$\left[i\hbar\, b_k{}^\mu \gamma^k \left(D^*_\mu + \tfrac{1}{2} v_\mu\right) - m\right] \left[i\hbar\, b_l{}^\nu \gamma^l \left(D^*_\nu + \tfrac{1}{2} v_\nu\right) + m\right]\phi = 0, \tag{3.17}$$

to which we apply the WKB approximation method. We seek a semiclassical solution of (3.17) with the following form:

$$\phi = \exp\left(\frac{i}{\hbar} S\right)\phi_0, \tag{3.18}$$

by assuming that $\hbar$ is very small compared to $S$. Using (3.18) in (3.17) and then putting each order of $(\hbar/i)$ to zero, we find (up to the first order)

$$(\hbar/i)^0:\quad g^{\mu\nu}(\partial_\mu S)(\partial_\nu S) + m^2 = 0, \tag{3.19}$$

$$(\hbar/i)^1:\quad \left\{2 g^{\mu\nu}(\partial_\mu S)\left(D^*_\nu + \tfrac{1}{2} v_\nu\right) - b_k{}^\mu b_l{}^\nu \left[D^*_\mu(\partial_\nu S)\right]\gamma^k\gamma^l\right\}\phi_0 = 0. \tag{3.20a}$$

The last equation is rewritten as

(3.20b)

with help of the relation between $D^*_\mu$ and $\nabla_\mu$,

$$D^*_\mu \phi_0 = \left(\nabla_\mu + \tfrac{i}{2} K_{ij\mu} S^{ij}\right)\phi_0, \qquad D^*_\mu(\partial_\nu S) = \nabla_\mu(\partial_\nu S) - K^\lambda{}_{\nu\mu}\, \partial_\lambda S, \tag{3.21}$$

with $\{K_{ij\nu}\}$ and $\{K_{\lambda\mu\nu}\}$ being the contortion tensor defined by

$$K_{ij\nu} = b_i{}^\lambda b_j{}^\mu K_{\lambda\mu\nu} = \tfrac{1}{2}\, b_i{}^\lambda b_j{}^\mu (T_{\lambda\mu\nu} - T_{\mu\lambda\nu} - T_{\nu\lambda\mu}) = -A_{ij\nu}. \tag{3.22}$$

The applicability condition of the semiclassical solution (3.18) is that when it is used in (3.17) the terms of order $(\hbar/i)^0$ are much larger than those of order $(\hbar/i)^1$. Estimating $|\partial_\mu S|/\hbar \sim 1/(\text{wavelength}) \equiv 1/\lambda$, $|D^*_\mu \phi_0| \sim |\phi_0|/w$ with $w$ the width of the wave packet, and $|D^*_\mu(\partial_\nu S)| \sim \hbar/\lambda L$ with $L$ being the distance over which the parallel vector fields $\{b^k{}_\mu\}$ vary considerably, we obtain the following inequality:

$$L \gg \lambda, \qquad w \gg \lambda. \tag{3.23}$$

Equation (3.19) is the Hamilton-Jacobi equation which describes the motion of freely falling particles in general relativity.$^{21}$ The complete solution $S(x; \alpha_1, \alpha_2, \alpha_3)$ with three free parameters, $\alpha_1$, $\alpha_2$, and $\alpha_3$, determines the classical orbit by

$$\frac{\partial S}{\partial \alpha_a} = \beta_a\ (= \text{const}), \qquad (a = 1, 2, 3). \tag{3.24}$$
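The way (3.19) and (3.24) together fix an orbit can be illustrated in the simplest setting. The following sketch is not taken from the paper: it assumes flat Minkowski space with metric diag(-1, +1, +1, +1), units with c = 1, and the complete solution S = -E t + alpha . x with E = sqrt(m^2 + |alpha|^2). All function names are hypothetical.

```python
import math

def energy(alpha, m):
    """E(alpha) fixed by the Hamilton-Jacobi equation -E^2 + |alpha|^2 + m^2 = 0."""
    return math.sqrt(m * m + sum(a * a for a in alpha))

def action(t, x, alpha, m):
    """Assumed complete solution S = -E t + alpha . x (flat space, signature -+++)."""
    return -energy(alpha, m) * t + sum(a * xi for a, xi in zip(alpha, x))

def hj_residual(alpha, m):
    """g^{mu nu} (dS/dx^mu)(dS/dx^nu) + m^2 for g = diag(-1, 1, 1, 1)."""
    e = energy(alpha, m)
    return -e * e + sum(a * a for a in alpha) + m * m

def orbit(t, alpha, beta, m):
    """Solve dS/d(alpha_a) = beta_a for x^a(t): x^a = beta_a + (alpha_a / E) t."""
    e = energy(alpha, m)
    return [b + a / e * t for a, b in zip(alpha, beta)]

def dS_dalpha(t, x, alpha, m, a, h=1e-6):
    """Central finite difference of S with respect to alpha_a."""
    up = list(alpha)
    up[a] += h
    dn = list(alpha)
    dn[a] -= h
    return (action(t, x, up, m) - action(t, x, dn, m)) / (2 * h)

def four_velocity(alpha, m):
    """U^mu = (1/m) g^{mu nu} dS/dx^nu, giving (E/m, alpha/m) in flat space."""
    return [energy(alpha, m) / m] + [a / m for a in alpha]

if __name__ == "__main__":
    alpha, beta, m = [0.3, -0.4, 1.2], [1.0, 2.0, 3.0], 1.0
    print("Hamilton-Jacobi residual:", hj_residual(alpha, m))
    print("position at t = 2.5:", orbit(2.5, alpha, beta, m))
    print("four-velocity:", four_velocity(alpha, m))
```

Under these assumptions the orbit is a straight line with velocity alpha/E, and the four-velocity has unit timelike norm, as expected for a free particle when the torsion vanishes.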

When the trajectory $x(\tau)$ defined by (3.24) is parametrized by the proper time $\tau$, it satisfies the geodesic equation,

$$\frac{dx^\mu}{d\tau} = \frac{1}{m}\, g^{\mu\nu}\partial_\nu S \equiv U^\mu \quad \text{(four-velocity)}, \tag{3.25}$$

(3.26)

Given the solution $S(x; \alpha_1, \alpha_2, \alpha_3)$ of (3.19), Eq. (3.20) can be solved to define the spinor wave function $\phi_0$ in terms of $S$. By virtue of (3.16), the semiclassical expression for the Dirac spinor wave function in terms of $S$ and $\phi_0$ is given by

$$\psi = \exp\left(\frac{i}{\hbar} S\right)\psi_0, \tag{3.27}$$

$$\psi_0 = (1 - b^k{}_\mu U^\mu \gamma_k)\,\phi_0. \tag{3.28}$$

The probability current, $j^\mu = b_k{}^\mu \bar\psi \gamma^k \psi$, then takes the following form in the semiclassical approximation:

$$j^\mu = \rho\, U^\mu, \tag{3.29}$$

where $\rho$ is defined by

$$\rho = -2\, b^k{}_\mu U^\mu\, \bar\phi_0 \gamma_k \phi_0. \tag{3.30}$$

Equation (3.20b) of $\phi_0$ ensures that $j^\mu$ satisfies the continuity equation,

(3.31)

The expression (3.29) for the probability current shows that, in the semiclassical approximation, the probability may be regarded as following along the classical trajectory.

We can form a wave packet by superposing the solutions of (3.27) with different values of parameters, $\alpha = (\alpha_1, \alpha_2, \alpha_3)$:

where $\rho(\alpha)$ is a weight function with a sharp peak of width $\Delta\alpha$ at $\alpha = \bar\alpha = (\bar\alpha_1, \bar\alpha_2, \bar\alpha_3)$. Here a sharp peak means that the following conditions are satisfied: (1) The ratio $\hbar/\Delta\alpha$ is negligibly small compared to the macroscopic scale, and (2) the ratio $\hbar/(\Delta\alpha)^2$ satisfies the inequality

(3.32b)

Since $\psi_0(x; \alpha)$ is not highly oscillating with respect to $\alpha$, it follows from the inequality (3.32b) that Eq. (3.32a) can be rewritten as

(3.32c)

In the integral of (3.32c) compensation takes place almost everywhere except in the space-time region satisfying

According to the condition (1) stated above, the right-hand side of (3.33a) is negligibly small compared to the macroscopic scale. Therefore, in the macroscopic scale, the wave packet (3.32c) has nonvanishing amplitude only along a world line $x(\tau)$ defined by

(3.33b)

Here $\tau$ is the proper time along the world line. The wave packet thus propagates along a classical trajectory $x(\tau)$ satisfying the geodesic equation (3.26).

Now we turn our consideration to the motion of spin for a spin-1/2 particle described by the spinor wave function (3.32c). The spin polarization is described by the spinor wave function $\psi_0(x(\tau); \bar\alpha)$, since other two factors of (3.32c) are scalar functions which have nothing to do with intrinsic spin polarization. We introduce a new spinor wave function $\psi_0'(x(\tau); \bar\alpha)$ by

$$\psi_0' = \frac{1}{\sqrt{\rho}}\, \psi_0. \tag{3.34a}$$

Then it is normalized like

$$b_k{}^\mu\, \bar\psi_0' \gamma^k \psi_0' = U^\mu \quad \text{(four-velocity)}, \tag{3.34b}$$

and accordingly it can be taken as the normalized spinor wave function which describes spin polarization. From (3.20a)-(3.20b) and (3.28)-(3.31), it follows after a little algebra that the normalized spinor wave function $\psi_0'$ satisfies

(3.35a)

or equivalently,

(3.35b)

where $D^*/d\tau$ and $\nabla/d\tau$ mean covariant differentiation along a classical trajectory $x(\tau)$, $D^*/d\tau = U^\mu D^*_\mu$ and $\nabla/d\tau = U^\mu \nabla_\mu$, respectively. Equations (3.35) describe the temporal change of the spin state as a spin-1/2 particle moves along a classical

trajectory $x(\tau)$. The second term of (3.35b) represents the effect of absolute parallelism on the spin precession in new general relativity.

We define the spin vector $\{S^\mu\}$ by

$$S^\mu = -i\, b_k{}^\mu\, \bar\psi_0' \gamma_5 \gamma^k \psi_0', \tag{3.36a}$$

which has only three independent components, because (3.28) and (3.34a) give

$$U_\mu S^\mu = 0. \tag{3.36b}$$

It follows from (3.35) that $\{S^\mu\}$ satisfies

(3.37a)

or equivalently,

(3.37b)

These are the classical equations of spin precession in new general relativity. The right-hand side of (3.37b) represents the effect of absolute parallelism. When the axial-vector part of the torsion tensor vanishes, (3.37b) reduces to the equation of spin precession in general relativity.

For a nonrelativistic particle in a weak gravitational field, (3.37b) becomes

$$\frac{d\mathbf{S}}{dt} = -\tfrac{3}{2}\, \mathbf{a} \times \mathbf{S}, \tag{3.38}$$

where $\mathbf{S}$ and $\mathbf{a}$ are the space components of $\{S^\mu\}$ and $\{a^\mu\}$, respectively.

C. Equations of motion for neutrinos and antineutrinos

Neutrinos and antineutrinos are described by two-component spinor wave functions which are obtained from (four-component) Dirac spinor wave functions of (2.20) by putting $\chi = 0$ for neutrinos and $\xi = 0$ for antineutrinos, respectively. We shall consider only antineutrinos, since neutrinos can be treated in a similar manner. For antineutrinos, the Dirac equation (3.13a) becomes the Weyl equation for a right-handed massless particle,$^{23}$

(3.39)

The semiclassical solution, $\chi = \exp(iS/\hbar)\chi_0$, must satisfy

(3.40)

and hence we get

$$\det(b_k{}^\mu \sigma^k \partial_\mu S) = -g^{\mu\nu}(\partial_\mu S)(\partial_\nu S) = 0, \tag{3.41}$$

which is just the Hamilton-Jacobi equation for a massless particle in general relativity. The classical trajectory $x(\sigma)$ defined by (3.24) satisfies the following equations:

$$\frac{dx^\mu}{d\sigma} = g^{\mu\nu}\partial_\nu S \equiv p^\mu \quad \text{(four-momentum)}, \tag{3.42}$$

(3.43)

where $\sigma$ is the affine parameter along the trajectory: The normalization of $\sigma$ is fixed by (3.42). The four momentum $\{p^\mu\}$ is null, due to (3.41). The current $\{j^\mu = b_k{}^\mu \chi^\dagger \sigma^k \chi\}$ is also null, because $\chi$ is a two-component spinor. Since these two null vectors, $\{p^\mu\}$ and $\{j^\mu\}$, are orthogonal with each other by virtue of (3.40), they must be proportional to one another,

$$j^\mu = \rho\, p^\mu, \tag{3.44}$$

where $\rho$ is a positive-definite scalar function. We can apply the same argument as for the massive case, to show that antineutrinos move along the classical trajectory satisfying (3.43) in the short-wavelength limit.

We take $\chi_0/\sqrt{\rho}$ as the normalized spinor wave function for antineutrinos. The spin vector (3.36a) then becomes

(3.45)

showing that antineutrinos are of helicity $+\tfrac{1}{2}$ as they should. Therefore, the classical equations of motion for neutrinos and antineutrinos in new general relativity are the same as those in general relativity.

IV. GRAVITATIONAL LAGRANGIAN IN VACUUM

We shall construct a gravitational Lagrangian density in vacuum,

$$I_G = \int d^4x\, \sqrt{-g}\, L_G. \tag{4.1}$$

For this purpose we enumerate the basic postulates which the above action must obey:

(1) Invariance under the group of general coordinate transformations; for arbitrary change of coordinates the parallel vector fields transform like

$$b'_k{}^\mu(x') = \frac{\partial x'^\mu}{\partial x^\nu}\, b_k{}^\nu(x). \tag{4.2}$$

(2) Invariance under the group of global, proper, orthochronous Lorentz transformations $L_+^\uparrow$; for its element $\Lambda = (\Lambda_k{}^l)$ with $\Lambda^T \eta \Lambda = \eta$, $\det\Lambda = 1$, $\Lambda_0{}^0 \ge 1$, and $\partial_\mu \Lambda = 0$, the parallel vector fields change like

$$b'_k{}^\mu(x) = \Lambda_k{}^l\, b_l{}^\mu(x). \tag{4.3}$$

(3) $L_G$ be invariant under the parity operation; by parity operation we mean the Lorentz transformation, $b_{(0)} \to b_{(0)}$, and $b_{(a)} \to -b_{(a)}$, where Lorentz

indices are enclosed by parentheses.

(4) $L_G$ be quadratic terms in the torsion tensor, besides a cosmological term.

Now, the torsion tensor is given by

$$T^\lambda{}_{\mu\nu} = b_k{}^\lambda(\partial_\nu b^k{}_\mu - \partial_\mu b^k{}_\nu), \tag{4.4}$$

which is reducible with respect to the group of global Lorentz transformations. It is convenient to perform an irreducible decomposition, from which we can construct a gravitational Lagrangian density. The torsion tensor is decomposed into three irreducible parts under this group:

$$t_{\lambda\mu\nu} = \tfrac{1}{2}(T_{\lambda\mu\nu} + T_{\mu\lambda\nu}) + \tfrac{1}{6}(g_{\nu\lambda} v_\mu + g_{\nu\mu} v_\lambda) - \tfrac{1}{3}\, g_{\lambda\mu} v_\nu, \tag{4.5}$$

$$v_\mu = T^\lambda{}_{\lambda\mu}, \tag{4.6}$$

$$a_\mu = \tfrac{1}{6}\, \varepsilon_{\mu\nu\rho\sigma} T^{\nu\rho\sigma}, \tag{4.7}$$

where $\varepsilon^* = \{\varepsilon_{\mu\nu\rho\sigma}\}$ is the completely antisymmetric tensor, introduced in (3.9). In fact, let $D(m, n)$ be an irreducible representation of the proper orthochronous Lorentz group, where $2m$ and $2n$ take non-negative integer numbers.$^{24b}$ Then the tensor $\{t_{\lambda\mu\nu}\}$ transforms according to $D(\tfrac{3}{2}, \tfrac{1}{2}) \oplus D(\tfrac{1}{2}, \tfrac{3}{2})$ of 16 dimensions (the Young table [21] minus traces), the vector $\{v_\mu\}$ according to $D(\tfrac{1}{2}, \tfrac{1}{2})$ of 4 dimensions, and finally the axial-vector $\{a_\mu\}$ according to $D(\tfrac{1}{2}, \tfrac{1}{2})$ of 4 dimensions (the Young table [111]).

The torsion tensor is conversely written in terms of the three irreducibles,

$$T_{\lambda\mu\nu} = \tfrac{2}{3}(t_{\lambda\mu\nu} - t_{\lambda\nu\mu}) + \tfrac{1}{3}(g_{\lambda\mu} v_\nu - g_{\lambda\nu} v_\mu) + \varepsilon_{\lambda\mu\nu\rho}\, a^\rho. \tag{4.8}$$

The tensor $\{t_{\lambda\mu\nu}\}$ has the following properties derived from the defining equation (4.5):

$$t_{\lambda\mu\nu} = t_{\mu\lambda\nu}, \tag{4.9}$$

$$g^{\mu\nu} t_{\lambda\mu\nu} = 0 = g^{\lambda\mu} t_{\lambda\mu\nu}, \tag{4.10}$$

$$t_{\lambda\mu\nu} + t_{\mu\nu\lambda} + t_{\nu\lambda\mu} = 0. \tag{4.11}$$

The above postulates of (1) to (4) require that the most general Lagrangian density be of the form

$$L_G = a_1 (t^{\lambda\mu\nu} t_{\lambda\mu\nu}) + a_2 (v^\mu v_\mu) + a_3 (a^\mu a_\mu) + a_0, \tag{4.12}$$

where $a_1$, $a_2$, and $a_3$ are free parameters, while $a_0$ is a cosmological term.

In Appendix A we will treat a case of lifting up the postulate of (3), by adding to (4.12) parity-violating terms like $(v^\mu a_\mu)$ and $(\varepsilon^{\mu\nu\rho\sigma} t_{\lambda\mu\nu} t^\lambda{}_{\rho\sigma})$. In the rest of this paper we shall neglect the cosmological term, so we have the gravitational action of

$$I_G = \int d^4x\, \sqrt{-g}\, \left[a_1 (t^{\lambda\mu\nu} t_{\lambda\mu\nu}) + a_2 (v^\mu v_\mu) + a_3 (a^\mu a_\mu)\right]. \tag{4.13}$$

Next we observe the identity

(4.14)

where $R(\{\})$ denotes the Riemann scalar curvature. It is given by the contraction of the Ricci tensor, which is again the contraction of the Riemann-Christoffel curvature tensor:

$$R^\rho{}_{\sigma\mu\nu}(\{\}) = \partial_\mu \{{}^{\,\rho}_{\sigma\nu}\} - \partial_\nu \{{}^{\,\rho}_{\sigma\mu}\} + \{{}^{\,\rho}_{\lambda\mu}\}\{{}^{\,\lambda}_{\sigma\nu}\} - \{{}^{\,\rho}_{\lambda\nu}\}\{{}^{\,\lambda}_{\sigma\mu}\}, \tag{4.15}$$

$$R_{\mu\nu}(\{\}) = R^\lambda{}_{\mu\lambda\nu}(\{\}), \tag{4.16}$$

$$R(\{\}) = g^{\mu\nu} R_{\mu\nu}(\{\}). \tag{4.17}$$

Here the symbol $\{{}^{\,\lambda}_{\mu\nu}\}$ denotes the Levi-Civita connection (1.3) of the Riemann space-time. Since the Weitzenböck space-time has the vanishing of the curvature tensor [see (1.7)], the identity of (4.14) should be taken as purely mathematical.

Using $\kappa = 8\pi G/c^4 = 8\pi G$ with $G$ the Newton gravitational constant, we finally rewrite the gravitational action in the following form:

$$I_G = \int d^4x\, \sqrt{-g}\, \left(\frac{1}{2\kappa} R(\{\}) + c_1 (t^{\lambda\mu\nu} t_{\lambda\mu\nu}) + c_2 (v^\mu v_\mu) + c_3 (a^\mu a_\mu)\right). \tag{4.18}$$

Comparing (4.13) with (4.18), we find that the parameters are effectively (under integration symbol) related to each other by

$$c_1 = a_1 + \frac{1}{3\kappa}, \qquad c_2 = a_2 - \frac{1}{3\kappa}, \qquad c_3 = a_3 + \frac{3}{4\kappa}. \tag{4.19}$$

It should be mentioned that one of the free parameters, $c_1$, $c_2$, and $c_3$, must be nonzero; otherwise, the left-hand side of a gravitational field equation would become symmetric, while the right-hand side, the energy-momentum tensor of spin-1/2 fundamental particles, would become nonsymmetric. This is a contradiction.

It is easy to derive a gravitational field equation. For the sake of completeness we write down a gravitational field equation when matter is

present, by adding to the vacuum gravitational action a matter action $I_M$, which satisfies the postulates of (1) to (4),

$$I = I_G + I_M, \tag{4.20}$$

$$I_M = \int d^4x\, \sqrt{-g}\, L_M. \tag{4.21}$$

From this action follows the following field equation, by taking variation with respect to the parallel vector fields $b^k{}_\mu$ and then multiplying with $\eta^{kl} b_l{}^\nu$:

$$G^{\mu\nu}(\{\}) + 2\kappa D^*_\lambda F^{\mu\nu\lambda} + 2\kappa v_\lambda F^{\mu\nu\lambda} + 2\kappa H^{\mu\nu} - \kappa g^{\mu\nu} L = \kappa T^{\mu\nu}. \tag{4.22}$$

Here the first term denotes the Einstein tensor of general relativity,

$$G_{\mu\nu}(\{\}) = R_{\mu\nu}(\{\}) - \tfrac{1}{2}\, g_{\mu\nu} R(\{\}), \tag{4.23}$$

and the tensor $\{F^{\mu\nu\lambda}\}$ stands for

$$F^{\mu\nu\lambda} = c_1 (t^{\mu\nu\lambda} - t^{\mu\lambda\nu}) + c_2 (g^{\mu\nu} v^\lambda - g^{\mu\lambda} v^\nu) - \tfrac{1}{3}\, c_3\, \varepsilon^{\mu\nu\lambda\rho} a_\rho = -F^{\mu\lambda\nu}. \tag{4.24}$$

The fourth term $\{H^{\mu\nu}\}$ is defined by

$$H^{\mu\nu} = T^{\rho\sigma\mu} F_{\rho\sigma}{}^\nu - \tfrac{1}{2}\, T^{\nu\rho\sigma} F^\mu{}_{\rho\sigma} = H^{\nu\mu}, \tag{4.25}$$

which is shown to be symmetric upon inserting the irreducible decomposition (4.8) of the torsion tensor. Finally, $L$ is given by

$$L = c_1 (t^{\lambda\mu\nu} t_{\lambda\mu\nu}) + c_2 (v^\mu v_\mu) + c_3 (a^\mu a_\mu). \tag{4.26}$$

A source term is, as usual, defined by

$$\delta \int d^4x\, \sqrt{-g}\, L_M = \int d^4x\, \sqrt{-g}\, T_k{}^\mu\, \delta b^k{}_\mu = -\int d^4x\, \sqrt{-g}\, T^k{}_\mu\, \delta b_k{}^\mu. \tag{4.27}$$

Therefore, an energy-momentum tensor is given by

(4.28a)

or equivalently,

(4.28b)

For instance, the energy-momentum tensor of the electromagnetic field and the Dirac field is calculated from the above formulas with (3.3) and (3.5) to be

$$T^{\rm em}_{\mu\nu} = F_{\mu\rho} F_{\nu\sigma}\, g^{\rho\sigma} + g_{\mu\nu} L_{\rm em}, \tag{4.29}$$

(4.30a)

with the Dirac Lagrangian $L_D^{\rm GR}$ used in general relativity

$$L_D^{\rm GR} = \tfrac{1}{2}\, i\, b_k{}^\mu \left[\bar\psi \gamma^k \nabla_\mu \psi - (\nabla_\mu \bar\psi)\gamma^k \psi\right] - m\bar\psi\psi. \tag{4.30b}$$

It is useful to split the gravitational field equation into the symmetric and antisymmetric parts:

$$G^{\mu\nu}(\{\}) + 2\kappa D^*_\lambda F^{(\mu\nu)\lambda} + 2\kappa v_\lambda F^{(\mu\nu)\lambda} + 2\kappa H^{\mu\nu} - \kappa g^{\mu\nu} L = \kappa T^{(\mu\nu)}, \tag{4.31}$$

$$2 D^*_\lambda F^{[\mu\nu]\lambda} + 2 v_\lambda F^{[\mu\nu]\lambda} = T^{[\mu\nu]}, \tag{4.32}$$

where

Furthermore, it is often useful to rewrite the field equation in Latin indices:

V. THE STATIC, ISOTROPIC GRAVITATIONAL FIELD IN VACUUM

Let us consider a static, isotropic gravitational field produced by a static, spherical body, assuming that the spin of constituent particles of a body, if it exists, can be completely neglected. The state of a central body then does not change under space inversion, besides time reversal and space rotation. Therefore, it is possible to find a set of coordinates, $x^0 = t$, $x^1$, $x^2$, and $x^3$, such that the parallel vector fields $b = \{b_k\} = \{b_k{}^\mu\}$ are form invariant under time reversal, space inversion, and space rotation,

$$t \to -t, \qquad b_{(0)} \to -b_{(0)} \qquad \text{(time reversal)}, \tag{5.1a}$$
3534 KENJI HAYASHI AND TAKESHI SHIRAFUJI 19
    $x^\alpha \to -x^\alpha$,  $b_{(a)} \to -b_{(a)}$    (space inversion),    (5.1b)

    $x^\alpha \to R^\alpha{}_\beta x^\beta$,  $b_{(a)} \to R_{ab}\, b_{(b)}$    (space rotation),    (5.1c)

where, to avoid confusion, Latin indices in $b_k$ are enclosed in parentheses, and $R = (R_{ab}) = (R^\alpha{}_\beta)$ is a position-independent $3 \times 3$ orthogonal matrix,

    $R R^T = R^T R = I$,  $\det R = 1$.    (5.1d)

Then, as is shown in Appendix B, by using the freedom to redefine the radius

    $x'^\alpha = \Phi(r)\, x^\alpha$,  $r = (x^\beta x^\beta)^{1/2}$,

we can assume, without loss of generality, that the parallel vector fields $b = \{b_k\} = \{b_k{}^{\mu}(x)\}$ have a diagonal form,

    $b_{(0)}{}^{0} = C(r)$,  $b_{(0)}{}^{\alpha} = 0 = b_{(a)}{}^{0}$,  $b_{(a)}{}^{\alpha} = D(r)\, \delta_a{}^{\alpha}$,    (5.2)

with two unknown functions of $r$, $C$ and $D$. The invariant distance $ds^2$ is then expressed in the isotropic form,

    $ds^2 = g_{\mu\nu} dx^\mu dx^\nu = \eta_{kl}\, b^k{}_\mu b^l{}_\nu\, dx^\mu dx^\nu = -A(r)\, dt^2 + B(r)\, dx^\alpha dx^\alpha$,    (5.3a)

where the metric coefficients, $A$ and $B$, are related to $C$ and $D$ by

    $A = 1/C^2$,  $B = 1/D^2$.    (5.3b)

The set of coordinates $\{x^\mu\}$ is therefore the isotropic coordinate system.

We write the gravitational field equation (4.22) as

    $I^{\mu\nu} = \kappa T^{\mu\nu}$,    (5.4a)

where $\{I^{\mu\nu}\}$ is defined by

    $I^{\mu\nu} \equiv G^{\mu\nu}(\{\}) + 2\kappa D_\lambda F^{\mu\nu\lambda} + 2\kappa v_\lambda F^{\mu\nu\lambda} + 2\kappa H^{\mu\nu} - \kappa g^{\mu\nu} L'$.    (5.4b)

For a static, isotropic gravitational field with (5.2), $\{I^{\mu\nu}\}$ is given by

    (5.5)

where the parameter $\epsilon$ is a constant defined by

    $\epsilon = \frac{\kappa (c_1 + c_2)}{1 + \kappa (c_1 + 4 c_2)}$,    (5.6)

and a prime means differentiation with respect to $r$. It is shown in Appendix C that the constant $[1 + \kappa (c_1 + 4 c_2)]$ is a nonzero number.

There is no appearance of the parameter $c_3$, but only the parameters $c_1$ and $c_2$, owing to a static, isotropic gravitational field. In other words, we can say nothing about the parameter $c_3$ in this case. Now we proceed to study a solution of the field equation (5.4) with (5.5) for the following three cases: (A) the Newtonian limit, (B) the post-Newtonian approximation, and (C) an exact solution in vacuum.

A. The Newtonian limit

We assume that a central gravitating body is a nonrelativistic system with all the components of $T^{\alpha\beta}$ being negligibly small compared to $T^{00}$; $T^{00} \gg |T^{\alpha\beta}| \approx 0$. Then the gravitational field is weak; the metric coefficients, $A$ and $B$, are nearly unity, $A \approx 1 \approx B$, and terms quadratic in $A'$ and $B'$ can be ignored in the field equation (5.4) with (5.5). We then find that the gravitational field equation in the Newtonian limit is given by

    $-[1 + \kappa (c_1 + 4 c_2)] \left\{ \epsilon A'' + (1 - 2\epsilon) B'' + \frac{2}{r} [\epsilon A' + (1 - 2\epsilon) B'] \right\} = \kappa \rho_0$,    (5.7a)
19 NEW GENERAL RELATIVITY 3535

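    $(1 - 2\epsilon) A' + B' = 0$.    (5.7b)

The external solution satisfying the boundary condition,

    $\lim_{r \to \infty} A(r) = \lim_{r \to \infty} B(r) = 1$,    (5.8)

is

    $A(r) = 1 - \frac{2}{(1 - \epsilon)(1 - 4\epsilon)[1 + \kappa (c_1 + 4 c_2)]} \frac{Gm}{r}$,    (5.9a)

    $B(r) = 1 + \frac{2 (1 - 2\epsilon)}{(1 - \epsilon)(1 - 4\epsilon)[1 + \kappa (c_1 + 4 c_2)]} \frac{Gm}{r}$,    (5.9b)

with $m$ the total mass of the source,

    $m = \int \rho_0(\mathbf{x})\, d^3x = 4\pi \int r^2 \rho_0(r)\, dr$.    (5.10)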
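As a quick numerical cross-check of the external solution, the sketch below evaluates (5.9a)-(5.9b) and confirms that they satisfy (5.7b) and approach the boundary values of (5.8). The function names, the finite-difference step, and the sample values of $Gm$, $\epsilon$, and $n_0 = 1 + \kappa(c_1 + 4c_2)$ are illustrative assumptions, not quantities taken from the paper.

```python
def metric_coefficients(r, gm, eps, n0):
    """External solution (5.9a)-(5.9b); n0 stands for 1 + kappa*(c1 + 4*c2)."""
    denom = (1 - eps) * (1 - 4 * eps) * n0
    a = 1 - 2 * gm / (denom * r)
    b = 1 + 2 * (1 - 2 * eps) * gm / (denom * r)
    return a, b


def derivative(f, r, h=1e-4):
    """Central finite difference, accurate enough for a 1/r profile."""
    return (f(r + h) - f(r - h)) / (2 * h)


def residual_57b(r, gm, eps, n0):
    """Left-hand side of (5.7b): (1 - 2*eps)*A' + B', which should vanish."""
    d_a = derivative(lambda s: metric_coefficients(s, gm, eps, n0)[0], r)
    d_b = derivative(lambda s: metric_coefficients(s, gm, eps, n0)[1], r)
    return (1 - 2 * eps) * d_a + d_b


if __name__ == "__main__":
    print(residual_57b(10.0, 1.0, 0.01, 1.02))
    print(metric_coefficients(1e12, 1.0, 0.01, 1.02))
```

The residual vanishes for any choice of `n0` because the same denominator multiplies both $A - 1$ and $B - 1$; only their ratio, $-(1 - 2\epsilon)$, enters (5.7b).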

It was found in Sec. III that the trajectory of a test particle is determined by the geodesic equation (3.26), which reduces for a nonrelativistic particle to

    $\frac{d^2 x^\alpha}{dt^2} = -\frac{1}{2} \frac{\partial A}{\partial x^\alpha} = -\frac{1}{(1 - \epsilon)(1 - 4\epsilon)[1 + \kappa (c_1 + 4 c_2)]} \frac{\partial}{\partial x^\alpha} \left( -\frac{Gm}{r} \right)$.    (5.11a)

Here the solution (5.9a) is used in the final step. We demand that the trajectory of a nonrelativistic test particle, specified by $x^\alpha(t)$, obeys the Newton equation of motion

    $\frac{d^2 x^\alpha}{dt^2} = -\frac{\partial \phi}{\partial x^\alpha}$,    (5.11b)

where $\phi$ is a gravitational potential, which takes the form

    $\phi = -Gm/r$    (5.11c)

for a gravitational field around a spherical body with mass $m$. Accordingly, the parameters, $c_1$ and $c_2$, must satisfy the condition

    $(1 - \epsilon)(1 - 4\epsilon)[1 + \kappa (c_1 + 4 c_2)] = 1$,    (5.12a)

which we shall assume hereafter. This condition is called the Newton approximation condition. In terms of $\bar{c}_1 = \kappa c_1$ and $\bar{c}_2 = \kappa c_2$, the Newton approximation condition reads as

    $4 \bar{c}_1 + \bar{c}_2 + 9 \bar{c}_1 \bar{c}_2 = 0$.    (5.12b)

From this follow the two cases, $c_1 = 0 = c_2$ and $c_1 \neq 0 \neq c_2$. See Fig. 2 for the curve specified by (5.12b).

FIG. 2. The curve of (5.12b): $\bar{c}_1 = \kappa c_1$ and $\bar{c}_2 = \kappa c_2$; $\bar{c}_2 = -4 \bar{c}_1 / (1 + 9 \bar{c}_1)$.

Now, combining (5.6) and (5.12b), we find

    $\bar{c}_1 = -\frac{\epsilon}{3 (1 - \epsilon)}$,  $\bar{c}_2 = \frac{4 \epsilon}{3 (1 - 4\epsilon)}$.    (5.12c)

Since $\epsilon$ is observable in solar-system experiments, as will be shown in Sec. VI, we draw the curves of (5.12c) versus $\epsilon$ in Fig. 3.

B. Vacuum solution in the post-Newtonian approximation

The field equation (5.4) with (5.5) can be rewritten in vacuum as follows,

    (5.13a)

    (5.13b)

    (5.13c)
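The algebra linking (5.6), the Newton approximation condition (5.12a)-(5.12b), and the parametrization (5.12c) in subsection A can be checked with exact rational arithmetic. The following is a minimal sketch; the function names and the sample value of $\epsilon$ are illustrative assumptions, and the arguments are the dimensionless products $\kappa c_1$ and $\kappa c_2$.

```python
from fractions import Fraction


def epsilon(c1_bar, c2_bar):
    """Parameter epsilon of (5.6), with c1_bar = kappa*c1 and c2_bar = kappa*c2."""
    return (c1_bar + c2_bar) / (1 + c1_bar + 4 * c2_bar)


def newton_condition_lhs(c1_bar, c2_bar):
    """Left-hand side of (5.12a); the Newton approximation condition sets it to 1."""
    eps = epsilon(c1_bar, c2_bar)
    return (1 - eps) * (1 - 4 * eps) * (1 + c1_bar + 4 * c2_bar)


def couplings_from_epsilon(eps):
    """Parametrization (5.12c) of kappa*c1 and kappa*c2 in terms of epsilon."""
    c1_bar = -eps / (3 * (1 - eps))
    c2_bar = 4 * eps / (3 * (1 - 4 * eps))
    return c1_bar, c2_bar


if __name__ == "__main__":
    eps = Fraction(-1, 250)  # arbitrary small sample value
    c1_bar, c2_bar = couplings_from_epsilon(eps)
    print(c1_bar, c2_bar)
    print(epsilon(c1_bar, c2_bar) == eps)
    print(newton_condition_lhs(c1_bar, c2_bar) == 1)
    print(4 * c1_bar + c2_bar + 9 * c1_bar * c2_bar == 0)
```

Using `Fraction` avoids floating-point round-off, so the three identities hold exactly rather than to within a tolerance.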

FIG. 3. The curves of (5.12c): The solid curve is for $\bar{c}_1$, and the dashed curve is for $\bar{c}_2$.

In the spatial region far outside the Schwarzschild radius, i.e., $r \gg GM$, the metric coefficients, $A(r)$ and $B(r)$, can be expanded in a small parameter $(GM/r)$,

    $A(r) = 1 - 2 \frac{GM}{r} + 2\beta \left( \frac{GM}{r} \right)^2 + \cdots$,    (5.14a)

    $B(r) = 1 + 2\gamma \frac{GM}{r} + 2\delta \left( \frac{GM}{r} \right)^2 + \cdots$,    (5.14b)

where $M$ is the gravitational mass of a central gravitating body, and $\beta$, $\gamma$, and $\delta$ are expansion parameters to be determined by the field equation. Using (5.14) in (5.13), and putting each order of $(GM/r)$ equal to zero, we find that the parameters, $\beta$, $\gamma$, and $\delta$, are given by

    $\beta = 1 - \epsilon/2$,  $\gamma = 1 - 2\epsilon$,  $\delta = \tfrac{3}{4} (1 - 3\epsilon + \tfrac{8}{3} \epsilon^2)$.    (5.15)

It is to be noticed that the Newton approximation condition (5.12a) is not used to derive (5.15), although the above results are consistent with (5.12a).

C. Exact vacuum solution

The gravitational field studied in the previous two subsections is weak. Now we derive an exact solution of the vacuum field equation (5.13), which allows us to study a strong gravitational field in new general relativity.

After slight modification of a combination of (5.13b) and (5.13c) we obtain

    (5.16)

This equation can be integrated to give

    (5.17a)

with $f_1$ an integration constant, which can be fixed by using (5.14) with (5.15) in (5.17a):

    $f_1 = (1 - \epsilon)(1 - 4\epsilon)(GM)^2$.    (5.17b)

In the same way we get from $[3 \times (5.13b) - (5.13c) - 2 \times (5.13a)]$

    $(1 - 3\epsilon) \frac{A'}{A} + 2\epsilon \frac{B'}{B} = (AB)^{-1/2} \frac{f_2}{r^2}$,    (5.18a)

with $f_2$ an integration constant given by

    $f_2 = 2 (1 - \epsilon)(1 - 4\epsilon)(GM)$.    (5.18b)

From the combination $[(1 - 5\epsilon) \times (5.17a) + 2\epsilon \times (5.18a)]$ it follows that

    $\frac{d}{dr} (AB)^{1/2} = 2\epsilon \frac{GM}{r^2} + \frac{1 - 5\epsilon}{2} \frac{(GM)^2}{r^3}$,    (5.19)

and, therefore, remembering that the boundary condition for $A$ and $B$, denoted by (5.8), is expressed by

    $\lim_{r \to \infty} AB = 1$,    (5.20)

we obtain

    $(AB)^{1/2} = 1 - 2\epsilon \frac{GM}{r} - \frac{1 - 5\epsilon}{4} \left( \frac{GM}{r} \right)^2 = \left( 1 - \frac{GM}{pr} \right) \left( 1 + \frac{GM}{qr} \right)$,    (5.21)

where two constants, $p$ and $q$, are defined by

    $p \equiv \frac{2}{1 - 5\epsilon} \{ [(1 - \epsilon)(1 - 4\epsilon)]^{1/2} - 2\epsilon \} = 2 + \epsilon + O(\epsilon^2)$,
    $q \equiv \frac{2}{1 - 5\epsilon} \{ [(1 - \epsilon)(1 - 4\epsilon)]^{1/2} + 2\epsilon \} = 2 + 9\epsilon + O(\epsilon^2)$.    (5.22a)

Here $\epsilon$ is assumed to be

    $\epsilon < \tfrac{1}{4}$,    (5.22b)

which covers the important case of $\epsilon = 0$; for $\tfrac{1}{4} < \epsilon < 1$, $p$ and $q$ become complex values. Substitution of (5.21) into (5.17a) and (5.18a) finally gives

    $A(r) = \left( 1 - \frac{GM}{pr} \right)^{p} \left( 1 + \frac{GM}{qr} \right)^{-q}$,  $B(r) = \left( 1 - \frac{GM}{pr} \right)^{2 - p} \left( 1 + \frac{GM}{qr} \right)^{2 + q}$.    (5.23)

It can be shown by direct calculations that this solution indeed satisfies the field equation (5.13).
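The consistency of (5.19), (5.21), and (5.22a) can be spot-checked numerically. In the sketch below the expanded form $1 - 2\epsilon x - \tfrac{1}{4}(1 - 5\epsilon)x^2$, with $x = GM/r$, is what integrating (5.19) under the boundary condition (5.20) gives. The expression used for $q$ is an assumption: it is the conjugate of $p$, chosen so that the factorized form of (5.21) reproduces that expansion. Function names and sample values are illustrative.

```python
import math


def p_q(eps):
    """Constants p and q of (5.22a); the form of q is assumed (conjugate of p)."""
    root = math.sqrt((1 - eps) * (1 - 4 * eps))  # real only for eps < 1/4, cf. (5.22b)
    p = 2 * (root - 2 * eps) / (1 - 5 * eps)
    q = 2 * (root + 2 * eps) / (1 - 5 * eps)
    return p, q


def sqrt_ab_factored(x, eps):
    """Factorized form of (AB)^(1/2) in (5.21), with x = GM/r."""
    p, q = p_q(eps)
    return (1 - x / p) * (1 + x / q)


def sqrt_ab_expanded(x, eps):
    """(AB)^(1/2) from integrating (5.19) with the boundary condition (5.20)."""
    return 1 - 2 * eps * x - (1 - 5 * eps) * x * x / 4


def d_sqrt_ab_dr(r, gm, eps):
    """Right-hand side of (5.19)."""
    return 2 * eps * gm / r**2 + (1 - 5 * eps) * gm**2 / (2 * r**3)


if __name__ == "__main__":
    eps, x = 0.01, 0.3
    print(p_q(eps))
    print(sqrt_ab_factored(x, eps) - sqrt_ab_expanded(x, eps))
```

The factorization works because $1/p - 1/q = 2\epsilon$ and $1/(pq) = (1 - 5\epsilon)/4$; at $\epsilon = 0$ both constants reduce to 2, the Schwarzschild case.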

The parallel vector fields of (5.2) are thus given by

    $b_{(0)}{}^{0} = \left( 1 - \frac{GM}{pr} \right)^{-p/2} \left( 1 + \frac{GM}{qr} \right)^{q/2}$,  $b_{(a)}{}^{\alpha} = \left( 1 - \frac{GM}{pr} \right)^{-1 + p/2} \left( 1 + \frac{GM}{qr} \right)^{-1 - q/2} \delta_a{}^{\alpha}$,  $b_{(0)}{}^{\alpha} = 0 = b_{(a)}{}^{0}$    (5.24)

in a static, isotropic gravitational field. The invariant distance $ds^2$ of (5.3a) becomes

    $ds^2 = -\left( 1 - \frac{GM}{pr} \right)^{p} \left( 1 + \frac{GM}{qr} \right)^{-q} dt^2 + \left( 1 - \frac{GM}{pr} \right)^{2 - p} \left( 1 + \frac{GM}{qr} \right)^{2 + q} [dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2)]$,    (5.25)

where we have introduced the spherical polar coordinates by

    $x^1 = r \sin\theta \cos\phi$,  $x^2 = r \sin\theta \sin\phi$,  $x^3 = r \cos\theta$.    (5.26)

If the parameter $\epsilon$ of (5.6) is exactly zero, then two constants, $p$ and $q$, are exactly equal to 2, and hence this metric coincides with the Schwarzschild metric written in the isotropic coordinates:

    $ds^2 = -\left( \frac{1 - GM/2r}{1 + GM/2r} \right)^2 dt^2 + \left( 1 + \frac{GM}{2r} \right)^4 [dr^2 + r^2 (d\theta^2 + \sin^2\theta\, d\phi^2)]$.    (5.27)

VI. COMPARISON WITH EXPERIMENTS

A. The equivalence principle

It has been verified experimentally to very high accuracy$^{26}$ that the world line of a freely falling test body is independent of its composition and structure. The equivalence principle implies that the unique world line of a test body coincides with the geodesics of the metric $g$. It was shown in Sec. III that by taking the short-wavelength limit of the Maxwell and Dirac equations the photon and Dirac particles in the classical limit are to travel along the geodesics of the metric $g$. Thus, new general relativity is compatible with the equivalence principle in this limit.

In general relativity implications of the equivalence principle are concisely expressed by the conservation law,

    $\nabla_\nu T_{GR}^{\mu\nu} = 0$,    (6.1)

where $\{T_{GR}^{\mu\nu}\}$ is the matter energy-momentum tensor appearing on the right-hand side of the Einstein field equation. It follows from the conservation law that the world line of a freely falling test body is the geodesics of the metric $g$. The characteristic feature of general relativity is that the conservation law of (6.1) is a consequence of the Einstein gravitational field equation, and hence that mechanical equations of motion for matter are consequences of the same gravitational field equation.

Now we shall show that almost the same property holds also in new general relativity based on the Weitzenböck space-time. From the invariance of the gravitational action under the group of general coordinate transformations follows the identity$^{27}$

    (6.2)

where

    $\sqrt{-g}\, B_k{}^{\nu} \equiv \sqrt{-g}\, b_{k\mu} B^{\mu\nu} = -\delta(\sqrt{-g}\, L_G)/\delta b^k{}_\nu$.    (6.3)

After slight modification, this identity can be rewritten as

    $\nabla_\nu B^{\mu\nu} - K^{\nu\lambda\mu} B_{\nu\lambda} = 0$,    (6.4)

where $\{K^{\nu\lambda\mu}\}$ is the contortion tensor given by (3.22). From the definition of (6.3) it follows that the gravitational field equation takes the form

    $B^{\mu\nu} = T^{\mu\nu}$    (6.5)

with the matter energy-momentum tensor $\{T^{\mu\nu}\}$ defined by (4.28). Using (6.5) in the identity (6.4), we get the response equation to gravitation,

    $\nabla_\nu T^{\mu\nu} - K^{\nu\lambda\mu} T_{\nu\lambda} = 0$.    (6.6)

This is the conservation law of new general relativity, corresponding to the conservation law (6.1) of general relativity. The energy-momentum tensor $\{T^{\mu\nu}\}$ is not symmetric in new general relativity. However, an antisymmetric part $\{T^{[\mu\nu]}\}$ is due to the contribution from the intrinsic spin

of spin-1/2 fundamental particles. For macroscopic bodies such as a test body employed in terrestrial experiments and astrophysical objects such as planets and stars, effects due to the intrinsic spin of spin-1/2 fundamental particles can be ignored, and hence their energy-momentum tensor can be supposed to be symmetric and of the same form as that of general relativity. Therefore, an energy-momentum tensor of macroscopic bodies satisfies the conservation law

    $\nabla_\nu T^{\mu\nu} = 0$    (6.7)

owing to the antisymmetric property of the contortion tensor $\{K^{\nu\lambda\mu}\}$ with respect to $\nu$ and $\lambda$. The only exception seems to be compact stellar objects such as neutron stars and black holes: The spin direction of neutrons may happen to be aligned over the macroscopic scale inside neutron stars. If this is indeed the case, the gravitational response of neutron matter should be described by Eq. (6.6) instead of by the conservation law (6.1).

The equivalence principle is thus satisfied for macroscopic bodies in new general relativity, and the world line of a test body coincides with the geodesics of the metric $g$, although the metric $g$ itself may be different from that of general relativity.

In the microscopic scale new general relativity violates the equivalence principle, since effects due to the intrinsic spin of spin-1/2 fundamental particles cannot be ignored there, and an antisymmetric part of the energy-momentum tensor should be seriously taken into account. The motion of the intrinsic spin of a freely falling spin-1/2 fundamental particle, for example, does not satisfy the equivalence principle. As was shown in (3.37b), the spin vector $\{S^\mu\}$ obeys the equation of motion

    $\frac{D S^\mu}{d\tau} = -\tfrac{3}{2} \epsilon^{\mu\nu\rho\sigma} u_\nu a_\rho S_\sigma$    (6.8a)

with

    (6.8b)

where $\tau$ is the proper time. In order for this equation of motion for the spin vector to meet with the equivalence principle, the right-hand side should vanish. Therefore, unless the axial-vector part $\{a^\mu\}$ of the torsion tensor happens to vanish identically, the equation of motion for the spin vector violates the equivalence principle. Another important implication of new general relativity for microscopic phenomena is the prediction of universal spin-spin interaction, caused by an antisymmetric part of the energy-momentum tensor. This interaction, if it exists, contributes to the hyperfine splitting of the atomic energy levels, and it shall be discussed in Sec. X.

B. Comparison with solar-system experiments

Since the invariant distance $ds^2$ of (5.3a) is written in the isotropic coordinates, the post-Newtonian parameters of the expansion (5.14a)-(5.14b), $\beta$ and $\gamma$, are the Eddington-Robertson parameters. Thus, by virtue of (5.15), the Eddington-Robertson parameters of new general relativity are given by

    $\beta = 1 - \epsilon/2$,  $\gamma = 1 - 2\epsilon$.    (6.9)

The values of $\beta$ and $\gamma$ have been measured by the solar-system experiments:

    $\gamma = 1.00 \pm 0.06$    (retardation of radio waves$^{30}$),    (6.10a)

    $\gamma = 1.014 \pm 0.018$    (solar deflection$^{31}$),    (6.10b)

    $\tfrac{1}{3} (2 + 2\gamma - \beta) = 1.003 \pm 0.005$    (perihelion advances$^{32}$),    (6.11)

    $\eta \equiv 4\beta - \gamma - 3 = -0.001 \pm 0.015$    (lunar laser ranging$^{33}$).    (6.12)

From (6.9) it follows that the Nordtvedt parameter $\eta$ is vanishing in new general relativity;

    $\eta = 0$.    (6.13)

For the sake of safety, we here adopt the value (6.10b) for $\gamma$. Using (6.9) in (6.10b) and (6.11), we get

    $\epsilon = -0.007 \pm 0.009$ from (6.10b),
    $\epsilon = -0.003 \pm 0.004$ from (6.11).    (6.14a)

Combining these two values for $\epsilon$ as if they were independent, we are led to

    $\epsilon = -0.004 \pm 0.004$.    (6.14b)

This value of $\epsilon$ satisfies our assumption of $\epsilon < \tfrac{1}{4}$; see (5.22b).

By virtue of (5.12c), two dimensionless constants, $\kappa c_1$ and $\kappa c_2$, can be expressed as

    $\kappa c_1 = -\frac{\epsilon}{3 (1 - \epsilon)}$,  $\kappa c_2 = \frac{4\epsilon}{3 (1 - 4\epsilon)}$.    (6.15)

Use of (6.14b) then gives

    $\kappa c_1 = 0.001 \pm 0.001$,  $\kappa c_2 = -0.005 \pm 0.005$.    (6.16)

Rewriting the gravitational Lagrangian density $L_G$ of (4.18) as

    $L_G = \frac{1}{2\kappa} [R(\{\}) + 2\kappa c_1 (t^{\lambda\mu\nu} t_{\lambda\mu\nu}) + 2\kappa c_2 (v^\mu v_\mu) + 2\kappa c_3 (a^\mu a_\mu)]$,    (6.17)

we find that the strength of the $c_1$ and $c_2$ terms are severely restricted by the solar-system experiments.

VII. THE CASE OF $c_1 = 0 = c_2$

We have seen in the last section that the $c_1$ and $c_2$ terms of $L_G$ are, if they exist, very severely restricted by the solar-system experiments. Therefore, we shall henceforth ignore these two terms and assume that $c_1 = 0 = c_2$. The case of $c_1 \neq 0 \neq c_2$ shall be discussed in a separate paper.

The gravitational Lagrangian density $L_G$ then becomes

    $L_G = \frac{1}{2\kappa} R(\{\}) + c_3 (a^\mu a_\mu)$,    (7.1)

and the gravitational field equation, (4.31) and (4.32), can be expressed as

    $G^{\mu\nu}(\{\}) + K^{\mu\nu} = \kappa T^{(\mu\nu)}$,    (7.2)

    $b_i{}^{\mu} b_j{}^{\nu}\, \partial_\rho (\sqrt{-g}\, J^{ij\rho}) = \lambda \sqrt{-g}\, T^{[\mu\nu]}$,    (7.3)

where we have introduced a new parameter $\lambda$ by

    $\lambda = \frac{9}{4 c_3}$,    (7.4)

and $\{K^{\mu\nu}\}$ and $\{J^{ij\mu}\}$ are defined by

    $K^{\mu\nu} = \frac{\kappa}{\lambda} \{ \tfrac{1}{2} [\epsilon^{\mu\rho\sigma\lambda} (T^{\nu}{}_{\rho\sigma} - T_{\rho\sigma}{}^{\nu}) + \epsilon^{\nu\rho\sigma\lambda} (T^{\mu}{}_{\rho\sigma} - T_{\rho\sigma}{}^{\mu})] a_\lambda - \tfrac{3}{2} a^\mu a^\nu - \tfrac{3}{4} g^{\mu\nu} a^\lambda a_\lambda \}$,    (7.5)

    $J^{ij\mu} = b^i{}_\rho b^j{}_\sigma J^{\rho\sigma\mu} = -\tfrac{3}{2} b^i{}_\rho b^j{}_\sigma\, \epsilon^{\rho\sigma\mu\nu} a_\nu$.    (7.6)

Taking the combination of $[(7.2) + (\kappa/\lambda) \times (7.3)]$, the gravitational field equation is rewritten as

    $G^{\mu\nu}(\{\}) + L^{\mu\nu} = \kappa T^{\mu\nu}$    (7.7)

with $\{L^{\mu\nu}\}$ defined by

    (7.8)

where $\{a_i = b_i{}^{\mu} a_\mu\}$ is a scalar with respect to general coordinate transformations.

As is evident by the definition of the torsion tensor (4.4) and its irreducible components of (4.5)-(4.7), the second term $\{L^{\mu\nu}\}$ of (7.7) does not transform like a tensor under a local Lorentz transformation

    $b^k(x) \to b'^k(x) = \Lambda^k{}_l(x)\, b^l(x)$,  $\Lambda^k{}_m(x)\, \eta_{kl}\, \Lambda^l{}_n(x) = \eta_{mn}$.    (7.9)

The energy-momentum tensor of the electromagnetic field depends on the parallel vector fields $b$ only through the metric tensor, $g_{\mu\nu} = b^k{}_\mu \eta_{kl} b^l{}_\nu$, and hence it is locally Lorentz invariant. The energy-momentum tensor of spin-1/2 fundamental particles, however, is not locally Lorentz invariant, due to the second term of the second line of (4.30a), i.e., $\tfrac{1}{4} \epsilon_{klmn} b^k{}_\mu K^{lm}{}_\nu \bar\psi \gamma_5 \gamma^n \psi$. Thus, the energy-momentum tensor of matter is not locally Lorentz invariant, unless effects due to the intrinsic spin of spin-1/2 fundamental particles can be neglected. Therefore, the gravitational field equation of (7.7) is not invariant under a local Lorentz transformation.

The gravitational field equation is considerably simplified in the particular case which satisfies the following two conditions: (1) The axial-vector part of the torsion tensor vanishes identically,

    $a^\mu = \tfrac{1}{6} \epsilon^{\mu\nu\rho\sigma} b_{k\nu} (\partial_\rho b^k{}_\sigma - \partial_\sigma b^k{}_\rho) = 0$,    (7.10)

and (2) effects due to the intrinsic spin of spin-1/2 fundamental particles can be neglected. The first condition implies that the left-hand side of (7.7) becomes the Einstein tensor $G^{\mu\nu}(\{\})$. The second condition, on the other hand, allows us to treat spin-1/2 fundamental particles as if they were spinless; the energy-momentum tensor $\{T^{\mu\nu}\}$ on the right-hand side of (7.7) can then be identified with the energy-momentum tensor $\{T_{GR}^{\mu\nu}\}$ used in general relativity. Thus, in this particular case the gravitational field equation (7.7) is identical with the Einstein field equation,

    $G^{\mu\nu}(\{\}) = \kappa T_{GR}^{\mu\nu}$.    (7.11)

For example, suppose that the metric in the invariant distance,

    $ds^2 = -A(x) (dx^0)^2 + B(x) (dx^1)^2 + C(x) (dx^2)^2 + D(x) (dx^3)^2$,    (7.12)

is an exact solution of the Einstein field equation (7.11), where $A(x)$, $B(x)$, $C(x)$, and $D(x)$ are functions of $x$. Define the parallel vector fields $b = \{b_k\}$ by

    $b_{(0)} = \frac{1}{\sqrt{A}} E_0$,  $b_{(1)} = \frac{1}{\sqrt{B}} E_1$,  $b_{(2)} = \frac{1}{\sqrt{C}} E_2$,  $b_{(3)} = \frac{1}{\sqrt{D}} E_3$,    (7.13)

with $E_\mu = \partial/\partial x^\mu$ and Latin indices in $b_k$ enclosed in parentheses. Then they form a system of four orthonormal vectors with their contravariant components given by

    $b_{(0)}{}^{0} = 1/\sqrt{A}$,  $b_{(1)}{}^{1} = 1/\sqrt{B}$,  $b_{(2)}{}^{2} = 1/\sqrt{C}$,  $b_{(3)}{}^{3} = 1/\sqrt{D}$,  $b_k{}^{\mu} = 0$ otherwise,    (7.14a)

and their covariant components given by

    $b^{(0)}{}_{0} = \sqrt{A}$,  $b^{(1)}{}_{1} = \sqrt{B}$,  $b^{(2)}{}_{2} = \sqrt{C}$,  $b^{(3)}{}_{3} = \sqrt{D}$,  $b^k{}_\mu = 0$ otherwise.    (7.14b)

In this case the axial-vector part of the torsion tensor, formed of $b$, vanishes identically;

    $a^\mu = 0$.    (7.15)

Therefore, the parallel vector fields (7.13) are an exact solution of the gravitational field equation (7.7) with the source term, $T^{\mu\nu} = T_{GR}^{\mu\nu}$. The metric of the form (7.12) covers, among others, a number of static vacuum solutions with high symmetry of the Einstein field equation,$^{34}$ such as the Schwarzschild solution, the Reissner-Nordström solution and the Weyl solution, and the Friedmann solution in cosmology.

VIII. GEOMETRY OF THE EXTENDED WEITZENBÖCK SPACE-TIME

In this section we shall consider the particular case discussed in the last section: Namely, we shall assume that the parallel vector fields, $b = \{b_k\}$, satisfy both the condition (7.10) and the condition that spin-1/2 fundamental particles can be treated as if they were spinless. The gravitational field equation (7.7) is then apparently of the same form as the Einstein field equation, but the geometrical backgrounds of these two equations are quite different: In general relativity the Einstein equation defines the Riemann space-time, while in new general relativity the gravitational field equation is to define the parallel vector fields of the Weitzenböck space-time.

Let the parallel vector fields, $b = \{b_k\}$, be a solution of the gravitational field equation (7.7): Namely, we suppose that $b = \{b_k\}$ simultaneously satisfies both the condition (7.10) and the Einstein field equation (7.11). New parallel vector fields, $b' = \{b'_k\}$, obtained from $b$ by a local Lorentz transformation (7.9), also satisfy the Einstein field equation by virtue of the local Lorentz invariance of the Einstein field equation. The condition (7.10), on the other hand, is not fulfilled by $b'$ in general, because the axial-vector part of the torsion tensor, $\{a^\mu\}$, transforms like

    (8.1)

under a local Lorentz transformation (7.9). Here $\Lambda_{kl}$ is defined by $\Lambda_{kl} = \eta_{km} \Lambda^m{}_l$.

The new parallel vector fields $b'$ thus satisfy the gravitational field equation (7.7), if and only if the transformation matrix $[\Lambda^k{}_l(x)]$ obeys the condition,

    $\epsilon^{\mu\nu\rho\sigma} b^k{}_\nu b^l{}_\rho\, \Lambda_{mk}(x)\, \Lambda^m{}_{l,\sigma} = 0$,    (8.2)

which ensures the condition (7.10) for $b'$. In the present particular case, therefore, the gravitational field equation (7.7) is invariant under those local Lorentz transformations which satisfy the condition (8.2), and the parallel vector fields are defined by the gravitational field equation with ambiguity of making those local Lorentz transformations. This ambiguity does not lead to any observable effects, because the Maxwell and Dirac equations, (3.11) and (3.13), respectively, are also invariant under those transformations.

In the Weitzenböck space-time the parallel vector fields should be defined only with arbitrariness of making a global Lorentz transformation, and there is no room for making any local Lorentz transformations. In the present particular case, however, the new parallel vector fields $b'$ connected with $b$ by a local Lorentz transformation satisfying (8.2) should be regarded as equivalent to $b$, because the Maxwell, Dirac, and gravitational field equations are all invariant under the transformation from $b$ to $b'$. We are thus forced to generalize the concept of absolute parallelism in the following manner: Absolute parallelism defined by $b'$ shall be regarded as equivalent to that defined by $b$, provided that $b$ and $b'$ are connected with each other by a local Lorentz transformation subject to (8.2). We shall refer to this new parallelism as extended absolute parallelism, and space-time endowed with extended absolute parallelism shall be called the extended Weitzenböck space-time. The geometry of the extended Weitzenböck space-time then is invariant under those local Lorentz transformations which fulfil the condition (8.2).

For given parallel vector fields $b$, we denote by $\Lambda(b)$ the set of those local Lorentz transformations which fulfil the condition (8.2). The set $\Lambda(b)$ does not form a Lie group: Namely, for two elements of $\Lambda(b)$, $\Lambda$ and $\Lambda'$, the inverse $\Lambda^{-1}$ and the product $\Lambda' \Lambda$ do not belong to $\Lambda(b)$ in general. However, for an infinitesimal local Lorentz transformation,

    $\Lambda^k{}_l(x) = \delta^k{}_l + \omega^k{}_l(x)$,  $\omega_{kl} + \omega_{lk} = 0$,  $|\omega^k{}_l| \ll 1$,    (8.3)

the condition (8.2) becomes

    $\epsilon^{\mu\nu\rho\sigma} b^k{}_\nu b^l{}_\rho\, \omega_{kl,\sigma} = 0$,    (8.4)

by neglecting the second- and higher-order terms of $\omega_{kl}$. Since the condition (8.4) is linear in $\omega_{kl}$, the infinitesimal neighborhood of the unit element in $\Lambda(b)$ has some of a Lie-algebra property: The inverse, $(\Lambda^{-1})^k{}_l = \delta^k{}_l - \omega^k{}_l$, and the product, $(\Lambda' \Lambda)^k{}_l = \delta^k{}_l + \omega^k{}_l + \omega'^k{}_l$, satisfy (8.4) for any two infinitesimal local Lorentz transformations, $\Lambda$ and $\Lambda'$, belonging to $\Lambda(b)$.

As an example of the extended Weitzenböck space-time, consider the static isotropic space-time, which has the Schwarzschild metric for the present case of $c_1 = 0 = c_2$. Written in the isotropic coordinates used in Sec. V, the Schwarzschild metric is expressed in the isotropic form,

    $ds^2 = -A(r)\, dt^2 + B(r)\, dx^\alpha dx^\alpha$,    (8.5)

with

    $A(r) = \left( \frac{1 - GM/2r}{1 + GM/2r} \right)^2$,  $B(r) = \left( 1 + \frac{GM}{2r} \right)^4$,    (8.6)

and the parallel vector fields $b$ defined by (5.24) are

    $b_{(0)} = \frac{1}{\sqrt{A}} E_0$,  $b_{(a)} = \frac{1}{\sqrt{B}} \delta_a{}^{\alpha} E_\alpha$,    (8.7)

with $\{E_\mu\}$ the coordinate basis; $E_\mu = \partial/\partial x^\mu$. The axial-vector part of the torsion tensor, formed of $b$, thus vanishes identically, and so the static, isotropic space-time is the extended Weitzenböck space-time. Besides the parallel vector fields $b$ of (8.7), there exists an infinitely large number of parallel vector fields, which are all equivalent to $b$ and with each other: All parallel vector fields are related to $b$ of (8.7) by local Lorentz transformations satisfying the condition (8.2). The condition (8.4) for an infinitesimal local Lorentz transformation specified by (8.3) can easily be solved in this case: The solution which leaves $b$ static is

    (8.8)

with $H_\alpha$ arbitrary small functions independent of $t$, where Latin (Lorentz) indices in $\omega_{kl}$ are enclosed by parentheses.

For a finite local Lorentz transformation it is not easy to solve Eq. (8.2). However, we can find a special kind of the parallel vector fields in the static, isotropic space-time by looking for such a set of coordinates $\{x'^\mu\}$ that the Schwarzschild metric is expressed in the diagonal form of (7.12). The parallel vector fields, defined by (7.13), in such coordinates are equivalent to the parallel vector fields of (8.7) defined in the isotropic coordinates. An example is given by the spherical polar coordinates $(t, r, \theta, \phi)$ introduced by (5.26): The Schwarzschild metric reads as

    $ds^2 = -A(r)\, dt^2 + B(r)\, dr^2 + C(r)\, d\theta^2 + D(r, \theta)\, d\phi^2$,    (8.9)

with $A$ and $B$ still given by (8.6), and

    $C(r) = r^2 B(r)$,  $D(r, \theta) = r^2 \sin^2\theta\, B(r)$.    (8.10)

Thus, the system of four orthonormal vectors, $b'$,

    (8.11)

with

    (8.12)

is also a solution of the gravitational field equation in vacuum, and can be taken as the parallel vector fields, which are related to $b$ of (8.7) by a local space rotation

    (8.13)

    $(R_{ab}) = \begin{pmatrix} \sin\theta \cos\phi & \sin\theta \sin\phi & \cos\theta \\ \cos\theta \cos\phi & \cos\theta \sin\phi & -\sin\theta \\ -\sin\phi & \cos\phi & 0 \end{pmatrix}$.    (8.14)

The parallel vector fields $b'$ are usually used as tetrad fields in quantum field theory for the Schwarzschild space-time. The Schwarzschild metric is of the form (7.12) also in the Kruskal-Szekeres coordinates,$^{37}$ and so we can use this coordinate system to form parallel vector fields $b''$ by (7.13).

IX. THE WEAK-FIELD APPROXIMATION

Further insights into new general relativity can be gained by applying the set of the gravitational field equation, (7.2) and (7.3), to weak-field situations

    (9.1a)

since in this case the particle spectrum of the new general relativity can be clarified by the use

of the unitary, irreducible representations of the Poincaré group. In this situation we can expand the field equation in $a^k{}_\mu$ and $c_k{}^{\mu}$ and can keep only lowest terms. Thus we need not distinguish Latin indices from Greek indices, which are now raised and lowered by the Minkowski metric tensor, $\{\eta^{\mu\nu}\}$ or $\{\eta_{\mu\nu}\}$: We shall use Greek indices throughout this section. From (2.5a) follows

    $a^{\mu}{}_{\nu} + c_{\nu}{}^{\mu} = 0$,    (9.1b)

and hence we take $\{a_{\mu\nu}\}$ as the basic field variable.

We shall decompose the weak field $\{a_{\mu\nu}\}$ into its symmetric and antisymmetric parts,

    $a_{\mu\nu} = \tfrac{1}{2} h_{\mu\nu} + A_{\mu\nu}$,    (9.2)

with $h_{\mu\nu} = h_{\nu\mu}$ and $A_{\mu\nu} = -A_{\nu\mu}$. The components of the metric tensor are then written as

    $g_{\mu\nu} = \eta_{\mu\nu} + h_{\mu\nu}$.    (9.3)

The antisymmetric field makes no contribution to the space-time metric in this approximation, implying that it is associated with the intrinsic spin of spin-1/2 fundamental particles.

The Einstein tensor becomes to lowest order in $h_{\mu\nu}$

    $G_{\mu\nu}(\{\}) = -\tfrac{1}{2} [\Box \bar{h}_{\mu\nu} - \partial^\lambda (\partial_\mu \bar{h}_{\nu\lambda} + \partial_\nu \bar{h}_{\mu\lambda}) + \eta_{\mu\nu} \partial^\rho \partial^\sigma \bar{h}_{\rho\sigma}]$,    (9.4)

where $\Box = \partial^\mu \partial_\mu$ and $\bar{h}_{\mu\nu}$ are defined as usual by

    $\bar{h}_{\mu\nu} = h_{\mu\nu} - \tfrac{1}{2} \eta_{\mu\nu} h$,  $h = \eta^{\mu\nu} h_{\mu\nu}$.    (9.5)

The second term of (7.2), i.e., $\{K_{\mu\nu}\}$, is of second order in $h_{\mu\nu}$ and $A_{\mu\nu}$, and hence can be ignored. The left-hand side of (7.3), when indices are lowered, becomes in the weak-field approximation

    $b_{i\mu} b_{j\nu}\, \partial_\rho (\sqrt{-g}\, J^{ij\rho}) = -[\Box A_{\mu\nu} - \partial^\lambda (\partial_\mu A_{\lambda\nu} - \partial_\nu A_{\lambda\mu})]$,    (9.6a)

because the axial-vector part $\{a^\mu\}$ of the torsion tensor is given by

    $a^\mu = \tfrac{1}{3} \epsilon^{\mu\nu\rho\sigma} \partial_\nu A_{\rho\sigma}$.    (9.6b)

Thus in the weak-field approximation the symmetric and antisymmetric parts of the gravitational field equation are given by

    $\Box \bar{h}_{\mu\nu} - \partial^\lambda (\partial_\mu \bar{h}_{\nu\lambda} + \partial_\nu \bar{h}_{\mu\lambda}) + \eta_{\mu\nu} \partial^\rho \partial^\sigma \bar{h}_{\rho\sigma} = -2\kappa T_{(\mu\nu)}$,    (9.7)

    $\Box A_{\mu\nu} - \partial^\lambda (\partial_\mu A_{\lambda\nu} - \partial_\nu A_{\lambda\mu}) = -\lambda T_{[\mu\nu]}$.    (9.8)

It follows from these equations that the symmetric field $\{h_{\mu\nu}\}$ and the antisymmetric field $\{A_{\mu\nu}\}$ are completely decoupled from each other. The $\{h_{\mu\nu}\}$ obeys the linearized Einstein field equation. The nonsymmetric energy-momentum tensor $\{T_{\mu\nu}\}$ is taken to lowest order in the weak fields; namely, it is independent of $\{h_{\mu\nu}\}$ and $\{A_{\mu\nu}\}$, and satisfies the ordinary conservation law in special relativity,

    $\partial^\nu T_{\mu\nu} = 0$,    (9.9)

by virtue of the response equation (6.6).

Multiplying $\partial^\nu$ on both sides of (9.7) and (9.8), we find that both the symmetric and antisymmetric parts of $\{T_{\mu\nu}\}$ satisfy the conservation law,

    $\partial^\nu T_{(\mu\nu)} = 0$,    (9.10a)

    $\partial^\nu T_{[\mu\nu]} = 0$.    (9.10b)

By virtue of (9.9), these two equations are not independent of each other. The conservation law (9.10b) imposes a severe restriction on the form of the spin tensor of matter. In fact, due to the Tetrode formula in special relativity,$^{38}$

    $2 T^{[\mu\nu]} = \partial_\rho S^{\mu\nu\rho}$,    (9.11)

Eq. (9.10b) is automatically satisfied if and only if a spin tensor, $\{S^{\mu\nu\rho}\}$, is totally antisymmetric with respect to its three indices. Thus, the gravitational field equation demands that a spin tensor be expressed as

    $S^{\mu\nu\rho} = \epsilon^{\mu\nu\rho\sigma} J_{5\sigma}$    (9.12)

by an axial-vector current, $\{J_5{}^{\mu}\}$. For Dirac particles, $\{J_{5\mu}\}$ is given by

    $J_{5\mu} = -\tfrac{1}{2} \bar\psi \gamma_5 \gamma_\mu \psi$.    (9.13)

As can be checked by direct calculation, the linearized field equations of (9.7)-(9.8) are invariant under gauge transformations,

    $h'_{\mu\nu} = h_{\mu\nu} - \partial_\mu \xi_\nu - \partial_\nu \xi_\mu$,    (9.14a)

    $A'_{\mu\nu} = A_{\mu\nu} + \partial_\mu H_\nu - \partial_\nu H_\mu$,    (9.14b)

with $\xi_\mu$ and $H_\mu$ arbitrary small functions which leave the fields weak. In a particular case, $\xi_\mu = 2 H_\mu = \Lambda_\mu$, these gauge transformations give rise to an infinitesimal coordinate transformation, $x^\mu \to x'^\mu = x^\mu + \Lambda^\mu(x)$. By means of these gauge freedoms, we can put the gauge conditions,

    $\partial^\nu \bar{h}_{\mu\nu} = 0$,    (9.15)

    $\partial^\nu A_{\mu\nu} = 0$,    (9.16)

which we shall assume henceforth. Then the field equations of (9.7)-(9.8) become

    $\Box \bar{h}_{\mu\nu} = -2\kappa T_{(\mu\nu)}$,    (9.17)

    $\Box A_{\mu\nu} = -\lambda T_{[\mu\nu]}$.    (9.18)

We shall restrict our discussions to the antisymmetric field $\{A_{\mu\nu}\}$, because the physics of the $\{h_{\mu\nu}\}$ field is well known.$^{39}$ The retarded solution of (9.18) is given by

    (9.19)

Suppose that we observe the gravitational field in the space region far outside a source; therefore, we can calculate the solution of (9.19) to lowest order in $1/r = 1/|\mathbf{x}|$, using the expansion

    (9.20)

We assume that the energy-momentum tensor can be expressed as a Fourier integral or as a sum of Fourier components; suppose we calculate for a single Fourier component,

    $T_{[\mu\nu]}(\mathbf{x}, t) = T_{[\mu\nu]}(\mathbf{x}, \omega)\, e^{-i\omega t} + \bar{T}_{[\mu\nu]}(\mathbf{x}, \omega)\, e^{+i\omega t}$,    (9.21)

where a bar means complex conjugation. In the wave zone the solution (9.19) then becomes just like a plane wave

    $A_{\mu\nu}(\mathbf{x}, t) = d_{\mu\nu}(\mathbf{x}, \omega)\, e^{ikx} + \bar{d}_{\mu\nu}(\mathbf{x}, \omega)\, e^{-ikx}$,    (9.22)

with the wave vector,

    $\mathbf{k} \equiv \omega \hat{\mathbf{x}}$,  $k^0 \equiv \omega$,  $(\hat{\mathbf{x}} \equiv \mathbf{x}/r)$,    (9.23)

and the polarization tensor

    (9.24)

The wave vector $\{k^\mu\}$ is a null vector, and the polarization tensor satisfies the conditions,

    $k^\nu d_{\mu\nu}(\mathbf{x}, \omega) = 0$,  $k^\mu d_{\mu\nu}(\mathbf{x}, \omega) = 0$,    (9.25)

by virtue of the conservation law (9.10b). Since $r$ is very large, the $\mathbf{x}$ dependence of $d_{\mu\nu}(\mathbf{x}, \omega)$ can be neglected, and so the plane wave (9.22) satisfies the d'Alembert equation, $\Box A_{\mu\nu}(\mathbf{x}, t) = 0$, and the gauge condition (9.16).

The energy-momentum tensor $\{t^{\mu\nu}\}$ of the $\{A_{\mu\nu}\}$ field is given by

    $t^{\mu\nu} = -[\partial L_A / \partial(\partial_\mu A_{\rho\sigma})]\, \partial^\nu A_{\rho\sigma} + \eta^{\mu\nu} L_A$,    (9.26)

where $L_A$ is the linearized Lagrangian density of the $\{A_{\mu\nu}\}$ field,

    (9.27)

We use the plane wave solution (9.22) in (9.26), and average $t^{\mu\nu}$ over a space-time region much larger than $|\mathbf{k}|^{-1}$. The average kills all terms proportional to $\exp(\pm 2ikx)$, and we are left with only the $x$-independent terms,

    $\langle t^{\mu\nu} \rangle = (2/\lambda)\, \mathrm{Re} (k^\mu k^\nu d^{\rho\sigma} \bar{d}_{\rho\sigma} - 2 k^\rho k^\nu d^{\mu\sigma} \bar{d}_{\rho\sigma}) + (2/\lambda)\, \eta^{\mu\nu} k_\rho k^\sigma d^{\rho\tau} \bar{d}_{\sigma\tau}$.    (9.28)

Using the condition (9.25), we then find

    $\langle t^{\mu\nu} \rangle = (4/\lambda)\, |d_{12}|^2\, k^\mu k^\nu$,    (9.29)

where we have chosen the direction of $\mathbf{k}$ as the third axis. Therefore, only the (12) component is physically significant, and the energy density $t^{00}$ is positive definite if the constant $\lambda$ is positive.

The (12) component, $d_{12}$, does not change at all under a rotation around the third axis; in fact, for such a rotation the rotation matrix $(R_{ab})$ satisfies $R_{13} = 0 = R_{23}$, and hence the $d_{12}$ transforms like a scalar,

    (9.30)

The physically significant component $d_{12}$ is thus of helicity zero. In the terminology of elementary particle physics the $\{A_{\mu\nu}\}$ field is a massless field of spin 0.

In the above discussion the $\{A_{\mu\nu}\}$ field is assumed to be a classical field. The quantization of the $\{A_{\mu\nu}\}$ field can be performed consistently, and the resulting quantized theory does not involve ghost states.$^{40}$

The space components of the solution (9.22), which decrease as $1/r$, contribute to the energy-momentum tensor, but the $(0\alpha)$ components do not, and hence they are of no physical significance. This fact suggests that the next terms of $A_{0\alpha}(\mathbf{x}, t)$, which are proportional to $1/r^2$, are important. In order to eliminate the $(1/r)$ term from $A_{0\alpha}(\mathbf{x}, t)$, we rewrite the $(0\alpha)$ components of (9.19) as follows:

    (9.31)

with $\xi_\alpha$ given by

    (9.32)

The integral in (9.31) can be rewritten by using the relations, (9.11) and (9.12):

[The rewritten integral is illegible in the source.]

where $\mathbf{S} = (S_{\alpha})$ is a total intrinsic spin of the source,

$$S_{\alpha}(t) = \int d^{3}x\, J_{\alpha}(\mathbf{x},t) = \tfrac{1}{2}\epsilon_{\alpha\beta\gamma}\int d^{3}x\, S^{\beta\gamma 0}(\mathbf{x},t). \tag{9.33}$$

We now perform a gauge transformation (9.14b) with $\Lambda_{\mu}$ given by

$$\Lambda_{\alpha} = \xi_{\alpha}, \quad \Lambda_{0} = 0. \tag{9.34}$$

Then, dropping a prime on $A_{\mu\nu}$, we finally get

[Eq. (9.35) is illegible in the source.]

Since $\xi_{\alpha}$ decrease as $1/r$, the change of the space components $A_{\alpha\beta}$, which involves the derivatives $\partial_{\alpha}\xi_{\beta}$, decreases as $1/r^{2}$, and hence the $(1/r)$ terms of $A_{\alpha\beta}$ do not change under this gauge transformation. The expression (9.35) is to be compared with the asymptotic expression for $h_{0\alpha}(\mathbf{x},t)$,$^{41}$

[Eq. (9.36) is illegible in the source.]

where $\mathbf{M} = \{M_{\alpha}\}$ is a total angular momentum of the source,

$$M_{\alpha}(t) = \epsilon_{\alpha\beta\gamma}\int d^{3}x\, x^{\beta}\,T^{\gamma 0}(\mathbf{x},t). \tag{9.37}$$

See Table I for an illustration of $\{h_{\mu\nu}\}$ and $\{A_{\mu\nu}\}$ in the asymptotic region.

TABLE I. Asymptotic expressions for $h_{\mu\nu}$ and $A_{\mu\nu}$ far from a weakly gravitating system. The result for $h_{\mu\nu}$ is well known (Ref. 41), but we list it here for comparison's sake.

[Table body illegible in the source. The only legible entries are the closing terms "+ [gravitational radiation terms that die out as O(1/r)]", which appear twice.]

X. COUPLING OF AN ANTISYMMETRIC FIELD

As is well known, the symmetric field $\{h_{\mu\nu}\}$ can be neglected in atomic phenomena. So we shall study the coupling of an antisymmetric field, assuming that the metric tensor is the Minkowski metric tensor,

$$g_{\mu\nu} = \eta_{\mu\nu}. \tag{10.1}$$

It is then convenient to employ a Cartesian coordinate system $\{x^{\mu}\}$: The tetrad fields associated with it, which we denote by $e = \{e_k\} = \{e_k{}^{\mu} = \delta_k{}^{\mu}\}$, are related to the parallel vector fields $b$ by a local Lorentz transformation,

[Eq. (10.2), which expresses $b_k$ through $e_k$ by a transformation $\Lambda(x)$ that differs from the identity by a term linear in $A_{\mu\nu}(x)$, is illegible in the source.]

Here we assume that an antisymmetric field $\{A_{\mu\nu}\}$ is so weak that we can neglect the second- and higher-order terms of $\{A_{\mu\nu}\}$.

In Sec. II, the Dirac spinor wave function was introduced by referring to the parallel vector fields $b$; we denote it here by $\psi_b$. The Dirac spinor wave function $\psi_e$, which is defined by referring to the tetrad fields $e$, is related to $\psi_b$ by the local Lorentz transformation (10.2);

[Eq. (10.3), of the form $\psi_e = U(\Lambda)\psi_b$ with $U(\Lambda)$ linear in $A_{\mu\nu}S^{\mu\nu}$, is illegible in the source.]

It should be remarked here that the spinor wave function $\psi_e$ is usually used in atomic physics to describe the electron.

Suppose that $\psi_b$ satisfies the Dirac equation (3.13b), then Eq. (10.3) implies that $\psi_e$ satisfies

[Eq. (10.4), a Dirac equation containing an $a_{\mu}\gamma_5\gamma^{\mu}$ term, is illegible in the source.]

by virtue of the following property of the covariant derivative $\nabla_{\mu}$:

[Eq. (10.5) is illegible in the source.]

Here $\nabla_{\mu}^{(b)}$ and $\nabla_{\mu}^{(e)}$ mean the covariant derivative defined by the Ricci rotation coefficients formed of $b$ and $e$, respectively: $\nabla_{\mu}^{(e)}$ coincides with the usual derivative $\partial_{\mu}$, since $e_k{}^{\mu} = \delta_k{}^{\mu}$.

Now we apply the Dirac equation (10.4) to the electron in the hydrogen atom, including the electromagnetic interaction between the electron and the proton by the minimal principle

$$\partial_{\mu} \rightarrow \partial_{\mu} + ieA_{\mu}, \tag{10.6}$$

where $(-e)$ is the electric charge of the electron, and the electromagnetic potential $\{A^{\mu}\} = (A^{0}, \mathbf{A})$ is given by

[Eq. (10.7) is illegible in the source.]

Here the vector potential is due to the magnetic
moment of the proton; $M_p$, $\mathbf{S}_p$, and $g_p$ are the mass, the spin, and the gyromagnetic ratio of the proton, respectively. The Dirac equation (10.4) then becomes

[Eq. (10.8), the Dirac equation (10.4) with the substitution (10.6), is illegible in the source.]

for the electron in the hydrogen atom.

For the proton at rest at the origin, the axial-vector current of (9.13) is given by

$$J_{0} = 0, \quad \mathbf{J} = \mathbf{S}_p\,\delta^{3}(\mathbf{x}). \tag{10.9}$$

Use of this in (9.11)-(9.12) shows that space-space components of the antisymmetric part of $T_{\mu\nu}$, $T_{[\alpha\beta]}$, vanish identically; therefore, we find that

$$A_{\alpha\beta} = 0 \tag{10.10a}$$

around the proton. On the other hand, the $(0\alpha)$ components of an antisymmetric field are given by (9.35):

[Eq. (10.10b) is illegible in the source.]

Using (10.10a)-(10.10b) in (9.6b), we obtain the axial-vector part of the torsion tensor around the proton at rest,

[Eq. (10.11) is illegible in the source.]

In order to evaluate the effects due to an antisymmetric field, we rewrite the Dirac equation (10.8) into two-component wave equations,

[Eqs. (10.12a) and (10.12b) are illegible in the source.]

where we put

[Eq. (10.13) is illegible in the source.]

and used the standard representation of the $\gamma$ matrices.$^{42}$ Here $\mathbf{p}$ denotes the momentum operator; $p^{\alpha} = -i\partial/\partial x^{\alpha}$.

In the Pauli approximation, in which (10.12b) may be approximated to

[Equation illegible in the source.]

we get

$$i\,\frac{\partial\varphi}{\partial t} = H\varphi, \tag{10.14}$$

[Eq. (10.15), the Hamiltonian $H$, is illegible in the source.]

The last term of (10.15), which consists of two parts, describes the spin-spin interaction of the electron and the proton: One is due to the magnetic moment of the proton, and the other due to an antisymmetric field.

The spin-spin coupling due to an antisymmetric field is not restricted to the case of the electron and the proton, but quite universal. For any two spin-$\tfrac{1}{2}$ particles, A and B, separated by $\mathbf{r}$, we can show in the similar way that in the nonrelativistic approximation the coupling with an antisymmetric field leads to universal spin-spin coupling,

[Eq. (10.16) is illegible in the source.]

where $\mathbf{S}_A$ and $\mathbf{S}_B$ are the spin vectors of the spin-$\tfrac{1}{2}$ particles, A and B, respectively. This spin-spin coupling makes a contribution to the hyperfine splitting of energy levels in atoms and muonium (the bound state of an electron and a positive muon).

Let us first consider the hyperfine structure interval $\Delta\nu(\mathrm{H})$ of the ground state of the hydrogen atom. We denote by $\Delta\nu_{\mathrm{QED}}(\mathrm{H})$ the theoretical value which is based on conventional quantum electrodynamics and on the assumption that the proton is a Dirac particle without internal structure. Adding possible corrections to $\Delta\nu_{\mathrm{QED}}(\mathrm{H})$, we express $\Delta\nu(\mathrm{H})$ as

$$\Delta\nu(\mathrm{H}) = \Delta\nu_{\mathrm{QED}}(\mathrm{H})\left[1 + \delta_N^{(p)} + \delta_A(\mathrm{H})\right]. \tag{10.17}$$

Here $\delta_N^{(p)}$ is the correction due to internal structure of the proton: The precise value of $\delta_N^{(p)}$ is not known at present, but it is estimated to be 1-2 ppm.$^{43}$ The last term $\delta_A(\mathrm{H})$ is a possible correction which arises from universal spin-spin coupling of (10.16): From the expression (10.15) for the Hamiltonian, we obtain

$$\delta_A(\mathrm{H}) = \frac{\lambda/16\pi}{(e^{2}/4\pi)\,g_p/4mM_p} = 0.012 \times \frac{\lambda}{4\pi} \times (\mathrm{GeV})^{2}. \tag{10.18}$$

The theoretical value $\Delta\nu_{\mathrm{QED}}(\mathrm{H})$ is in good agreement with the experimental value$^{44}$;

[Eq. (10.19) is illegible in the source.]

Since the correction $\delta_N^{(p)}$ is of the order of 1 ppm, we estimate the upper limit on $\delta_A(\mathrm{H})$ as
$$\delta_A(\mathrm{H}) \lesssim 5\times10^{-6}. \tag{10.20}$$

Because of ambiguity in the correction $\delta_N^{(p)}$, it seems difficult to estimate the value of $\delta_A(\mathrm{H})$ with higher precision than (10.20). Combining (10.18) and (10.20), we get

$$\frac{\lambda}{4\pi} \lesssim 4\times10^{-4}\ (\mathrm{GeV})^{-2}. \tag{10.21}$$

Next we shall consider the hyperfine structure interval of the ground state of muonium. The theoretical value $\Delta\nu_{\mathrm{QED}}(e\mu)$ based on conventional quantum electrodynamics agrees well with the experimental value$^{45}$;

[Eq. (10.22) is illegible in the source.]

The value of $\Delta\nu_{\mathrm{exp}}(e\mu)$ is known with much higher precision than $\Delta\nu_{\mathrm{QED}}(e\mu)$, due to uncertainty in our knowledge of the fundamental constants $\mu_{\mu}/\mu_p$ and $\alpha$; here $\mu_{\mu}$ and $\mu_p$ are the magnetic moments of the muon and the proton, respectively, and $\alpha$ is the fine-structure constant. Possible fractional correction to $\Delta\nu_{\mathrm{QED}}(e\mu)$, $\delta_A(e\mu)$, which arises from universal spin-spin coupling of (10.16), is obtained from (10.18) by replacing the proton parameters, $M_p$ and $g_p$, with the muon parameters, $M_{\mu}$ and $g_{\mu}$;

$$\delta_A(e\mu) = \frac{\lambda/16\pi}{(e^{2}/4\pi)\,g_{\mu}/4mM_{\mu}}. \tag{10.23}$$

Here $M_{\mu}$ is the muon mass and $g_{\mu}$ is the gyromagnetic ratio of the muon. From (10.22) we estimate the upper limit on $\delta_A(e\mu)$ as

[Eq. (10.24), an upper limit on $\delta_A(e\mu)$, is illegible in the source.]

This upper limit can be improved, provided that the fundamental constants $\mu_{\mu}/\mu_p$ and $\alpha$ would be known with higher precision. Using (10.23) in (10.24), we obtain

$$\frac{\lambda}{4\pi} \lesssim 3\times10^{-4}\ (\mathrm{GeV})^{-2}. \tag{10.25}$$

Summing up, we conclude from (10.21) and (10.25) that the square of the coupling strength of an antisymmetric field is bounded by $\lambda/4\pi \lesssim 3\times10^{-4}$ (GeV)$^{-2}$. This result is in agreement with the quantum-field-theoretical estimation of Miyamoto and Nakano.$^{7}$

XI. TIME-DEPENDENT SPHERICALLY SYMMETRIC FIELDS

We now turn to a spherically symmetric, but not necessarily static, gravitational field in vacuum. When we considered the static, isotropic gravitational field in Sec. V, we assumed that the state of a central gravitating spherical body does not change under space inversion, besides it is invariant under time reversal and space rotation. This is the case either if constituent particles of a spherical body are spinless, or if the spin of constituent particles is randomly distributed and can be ignored. If the spin of constituent particles of a spherical body happens to be polarized to outward (or inward) radial direction, however, the spin state of the gravitating body changes under space inversion: In fact, if the spin of constituent particles is polarized to outward radial direction, then after space inversion the spin is polarized to inward radial direction. Therefore, we assume here that the parallel vector fields $b = \{b_k\} = \{b_k{}^{\mu}\}$ and their dual $b^{*} = \{b^{k}\} = \{b^{k}{}_{\mu}\}$ are form invariant under space rotation (5.1c), but not necessarily form invariant under time reversal (5.1a) and space inversion (5.1b). It is shown in Appendix B that we can then take the following expression for $b^{*} = \{b^{k}\} = \{b^{k}{}_{\mu}\}$:

$$(b^{k}{}_{\mu}) = \begin{pmatrix} C & 0 \\ Hx^{a} & D\delta^{a}{}_{\alpha} + F\epsilon_{a\alpha\beta}x^{\beta} \end{pmatrix}, \tag{11.1a}$$

where $C$, $D$, $F$, and $H$ are unknown functions of $t$ and $r = (x^{\alpha}x^{\alpha})^{1/2}$. The parallel vector fields $b = \{b_k\} = \{b_k{}^{\mu}\}$ are then represented as

$$(b_{k}{}^{\mu}) = \begin{pmatrix} \dfrac{1}{C} & -\dfrac{H}{CD}\,x^{\alpha} \\ 0 & \dfrac{D\delta_{a}{}^{\alpha} + (F^{2}/D)\,x^{a}x^{\alpha} + F\epsilon_{a\alpha\beta}x^{\beta}}{D^{2} + r^{2}F^{2}} \end{pmatrix}, \tag{11.1b}$$

and the invariant distance $ds^{2}$ is expressed in a rotationally invariant form,

$$ds^{2} = -(C^{2} - r^{2}H^{2})\,dt^{2} + 2DH\,dt\,(x^{\alpha}dx^{\alpha}) + (D^{2} + r^{2}F^{2})\,dx^{\alpha}dx^{\alpha} - F^{2}(x^{\alpha}dx^{\alpha})^{2}. \tag{11.2}$$

In empty space the antisymmetric part (7.3) of the gravitational field equation reads

$$\partial_{\mu}\left(\sqrt{-g}\,\epsilon^{ijmn}\,b_{m}{}^{\mu}\,b_{n}{}^{\nu}\,a_{\nu}\right) = 0. \tag{11.3}$$

The axial-vector part $\{a^{\mu}\}$ of the torsion tensor is expressed in terms of the unknown functions, $C$, $D$, $F$, and $H$, as

$$\sqrt{-g}\,a^{\mu} = \begin{cases} P, & \text{for } \mu = 0, \\ Q\,x^{\alpha}, & \text{for } \mu = \alpha, \end{cases} \tag{11.4}$$
with $P$ and $Q$ defined by

$$P = 2DF + \tfrac{2}{3}\,r\,(DF' - D'F), \qquad Q = -\tfrac{4}{3}\,HF + \tfrac{2}{3}\,(\dot{D}F - D\dot{F}), \tag{11.5}$$

where a dot and a prime denote $\partial/\partial t$ and $\partial/\partial r$, respectively. Using (11.4) in (11.3), we get

$$\frac{2F\,(HP + DQ)}{D^{2} + r^{2}F^{2}}\;x^{a} = 0 \tag{11.6}$$

[Eq. (11.7) is illegible in the source.]

for $(ij) = (a,b)$. Equation (11.6) gives

$$F = 0, \tag{11.8a}$$

or

$$HP + DQ = 0. \tag{11.8b}$$

It follows from (11.5) that if $F = 0$, then $P$ and $Q$ vanish identically. On the other hand, if (11.8b) is satisfied, (11.7) gives

[Eq. (11.9) is illegible in the source.]

which can be readily integrated to give

[Eq. (11.10) is illegible in the source.]

with $f(t)$ an unknown function of $t$. We impose the boundary condition at spatial infinity as

$$\lim_{r\to\infty} b^{k}{}_{\mu} = \delta^{k}{}_{\mu}, \qquad \lim_{r\to\infty} b_{k}{}^{\mu} = \delta_{k}{}^{\mu}. \tag{11.11}$$

Then unknown functions, $C$, $D$, $F$, and $H$, satisfy

$$\lim_{r\to\infty} C(t,r) = \lim_{r\to\infty} D(t,r) = 1, \tag{11.12}$$

$$\lim_{r\to\infty} rH(t,r) = \lim_{r\to\infty} rF(t,r) = 0, \tag{11.13}$$

and hence from (11.5) it follows that

$$\lim_{r\to\infty} rP(t,r) = \lim_{r\to\infty} rQ(t,r) = 0. \tag{11.14}$$

Because of the boundary condition (11.13) for $F$, the integral in the exponent of (11.10) converges for $r \to \infty$, and so the exponential factor of (11.10) approaches a finite positive value for $r \to \infty$. Therefore, in order to satisfy the boundary condition (11.14), the unknown function $f(t)$ must vanish, and hence we get

$$P(t,r) = 0, \qquad Q(t,r) = 0, \tag{11.15}$$

by virtue of (11.8b) and (11.10). It then follows from (11.4) that the axial-vector part of the torsion tensor must vanish,

$$a^{\mu} = 0 \tag{11.16}$$

for a spherically symmetric gravitational field in vacuum. The symmetric part (7.2) of the gravitational field equation now becomes the Einstein equation in vacuum,

$$G_{\mu\nu}(\{\,\}) = 0. \tag{11.17}$$

According to the Birkhoff theorem$^{46}$ in general relativity, a spherically symmetric solution of (11.17) must be static and is given by the Schwarzschild solution.

We have thus shown that a spherically symmetric solution of the gravitational field equations (7.2)-(7.3) with source terms absent must coincide with the static, isotropic field in vacuum studied in Secs. V and VII, i.e., the Schwarzschild solution. This is just the Birkhoff theorem of new general relativity.

XII. CONCLUSION

We have formulated new general relativity and proved the following:

(1) The equations of motion for spin-$\tfrac{1}{2}$ fundamental particles and photons are approximated by the WKB approximation method to yield, in the classical limit, the geodesics of the metric $g$, the extremal curve. This is the correspondence principle in new general relativity.

(2) In the case of $c_1 = 0 = c_2$ the gravitational action is of the form

[Eq. (12.1) is illegible in the source.]

Here $\kappa$ is the Einstein gravitational constant, $\kappa = 8\pi G/c^{4} = 8\pi G$, and $\lambda$ is a new parameter, bounded by an upper limit on $\lambda/4\pi$ [value illegible in the source; cf. (10.25)] from precise experiments in quantum electrodynamics. (We leave open the possibility that $\lambda$ would be equal to $\kappa$, i.e., $\lambda = \kappa$.) What differs from general relativity is the second term, which consists of the axial-vector part $\{a^{\mu}\}$ of the torsion tensor.

From this action follows the gravitational field equation,

$$G_{\mu\nu}(\{\,\}) + K_{\mu\nu} = \kappa T_{\mu\nu}, \tag{12.2}$$

where

[Eq. (12.3) is illegible in the source.]

Here $a_i = b_i{}^{\mu}a_{\mu}$ is a vector with respect to global Lorentz transformations, but a scalar with res-
pect to general coordinate transformations. Obviously, the field equation is invariant under global Lorentz transformations but violates the local Lorentz invariance in general.

(3) In the static, isotropic gravitational field the axial-vector part of the torsion tensor is identically vanishing, and the solution is given by the Schwarzschild solution.

(4) New general relativity agrees with all the experiments which have so far been carried out, as general relativity does.

(5) In the weak-field approximation to the gravitational field equation, it splits into two separate equations; one is for the symmetric field $\{\bar{h}_{\mu\nu}\}$, and the other is for the antisymmetric field $\{A_{\mu\nu}\}$,

$$\Box\bar{h}_{\mu\nu} = -2\kappa T_{(\mu\nu)}, \tag{12.4}$$

$$\Box A_{\mu\nu} = -\lambda T_{[\mu\nu]}, \tag{12.5}$$

with the conditions, $\partial_{\mu}\bar{h}^{\mu\nu} = 0$ and $\partial_{\mu}A^{\mu\nu} = 0$. The first equation describes the propagation of a graviton with zero mass and spin 2, and the second means the propagation of a zero-mass and zero-spin particle, which exerts spin-dependent force among spin-$\tfrac{1}{2}$ fundamental particles.

(6) In microscopic processes the equivalence principle is violated by means of the antisymmetric field described by (12.5), which is coupled to spin-$\tfrac{1}{2}$ fundamental particles. However, in the macroscopic scale the equivalence principle is recovered.

(7) In new general relativity the Birkhoff theorem, that a spherically symmetric gravitational field in empty space must be static, with a metric given by the Schwarzschild solution, is proved.

At this point we summarize several important features of new general relativity in comparison with general relativity; see Table II.

Finally, we emphasize that new general relativity, originally due to Einstein in 1928, is a gravitational theory that is acceptable on the experimental and theoretical grounds.

At present it seems impossible to detect the differences between general relativity and new general relativity. Among other things, it is highly expected to see what are Kerr-like solutions, i.e., stationary and axially symmetric solutions, in new general relativity, since the Kerr solution in general relativity has the total angular momentum, to which the axial-vector part $\{a^{\mu}\}$ of the torsion tensor in new general relativity may contribute.

ACKNOWLEDGMENT

One of the authors (K.H.) wishes to thank members of the Max Planck Institut für Physik und Astrophysik for discussions and suggestions during his stay there in 1976-1978, when the present work began.

APPENDIX A: PARITY-VIOLATING TERMS, $(v^{\mu}a_{\mu})$ AND $(\epsilon^{\mu\nu\rho\sigma}t^{\lambda}{}_{\mu\nu}t_{\lambda\rho\sigma})$

In Sec. IV we constructed the gravitational Lagrangian density $L_G$ of (4.12) by postulating the basic principles (1)-(4). We now lift up the postulate of (3), then we can add to $L_G$ of (4.12) parity-violating terms like $(v^{\mu}a_{\mu})$ and $(\epsilon^{\mu\nu\rho\sigma}t^{\lambda}{}_{\mu\nu}t_{\lambda\rho\sigma})$:

$$L_G = a_1(t^{\lambda\mu\nu}t_{\lambda\mu\nu}) + a_2(v^{\mu}v_{\mu}) + a_3(a^{\mu}a_{\mu}) + a_4(v^{\mu}a_{\mu}) + a_5(\epsilon^{\mu\nu\rho\sigma}t^{\lambda}{}_{\mu\nu}t_{\lambda\rho\sigma}), \tag{A1}$$

where a cosmological term is neglected. Because of the identity,

[Eq. (A2) is illegible in the source.]

we can drop the $a_5$ term, absorbing it into the $a_4$ term. Accordingly, the gravitational action of (4.18) involves one additional parity-violating term;

$$I_G = \int d^{4}x\,\sqrt{-g}\left(\frac{1}{2\kappa}R(\{\,\}) + c_1(t^{\lambda\mu\nu}t_{\lambda\mu\nu}) + c_2(v^{\mu}v_{\mu}) + c_3(a^{\mu}a_{\mu}) + c_4(v^{\mu}a_{\mu})\right), \tag{A3}$$

where the parameters, $c_1$, $c_2$, and $c_3$, are given by (4.19) and $c_4$ is given by

$$c_4 = a_4. \tag{A4}$$

The gravitational field equations are then given by (4.22) with the tensor $\{F^{\mu\nu\lambda}\}$ of (4.24) involving an additional term,

[Eq. (A5) is illegible in the source.]

The tensor $\{H^{\mu\nu}\}$ is still defined by (4.25), and $L$ is

$$L = c_1(t^{\lambda\mu\nu}t_{\lambda\mu\nu}) + c_2(v^{\mu}v_{\mu}) + c_3(a^{\mu}a_{\mu}) + c_4(v^{\mu}a_{\mu}). \tag{A6}$$

It is to be noticed that the choice of parameters in Ref. 4 corresponds to the case of $c_3 = 0 \neq c_4$.

In a static, isotropic gravitational field, for which the parallel vector fields $b$ are of a diagonal form (5.2), there is no appearance of the parameters, $c_3$ and $c_4$, in the gravitational field equation. Therefore, all the results of Sec. V still hold true independently of the parity-violating
TABLE II. Comparison of new general relativity with general relativity.

                          General relativity                          New general relativity

Space-time                Riemann space-time                          Weitzenböck space-time
                                                                      (absolute parallelism)

Connection                Levi-Civita connection                      Nonsymmetric affine connection
                          $\{{}^{\lambda}_{\mu\nu}\}$                 $\Gamma^{\lambda}{}_{\mu\nu} = b_k{}^{\lambda}\partial_{\nu}b^{k}{}_{\mu}$

Basic structure           Metric tensor                               Parallel vector fields
                          $g = \{g_{\mu\nu}\}$                        $b = \{b_k\} = \{b_k{}^{\mu}\}$

Gravitation               Riemann-Christoffel                         Torsion tensor
                          curvature tensor                            $T^{\lambda}{}_{\mu\nu} = b_k{}^{\lambda}(\partial_{\mu}b^{k}{}_{\nu} - \partial_{\nu}b^{k}{}_{\mu})$

Transformation group      General coordinate                          General coordinate
                          transformation group                        transformation group
                          (Local Lorentz group)                       Global Lorentz group

The Birkhoff theorem      Yes                                         Yes

Static, isotropic         The Schwarzschild solution                  The Schwarzschild solution
gravitational field

Static, axially           The Kerr solution                           Not yet found
symmetric gravitational
field

Very strong, static,      Black holes                                 Black holes
isotropic field

Newtonian approximation   Yes                                         Yes

Weak-field                Symmetric field $\{\bar{h}_{\mu\nu}\}$;     Symmetric field $\{\bar{h}_{\mu\nu}\}$;
approximation             $\Box\bar{h}_{\mu\nu} = -2\kappa T_{\mu\nu}$   $\Box\bar{h}_{\mu\nu} = -2\kappa T_{(\mu\nu)}$
                          with $\partial_{\mu}\bar{h}^{\mu\nu} = 0$   with $\partial_{\mu}\bar{h}^{\mu\nu} = 0$
                                                                      Antisymmetric field $\{A_{\mu\nu}\}$;
                                                                      $\Box A_{\mu\nu} = -\lambda T_{[\mu\nu]}$
                                                                      with $\partial_{\mu}A^{\mu\nu} = 0$

Quantum                   Graviton;                                   Graviton;
                          spin 2 and massless                         spin 2 and massless
                                                                      Scalar particle; positive energy,
                                                                      spinless and massless

Theory                    Macroscopic                                 Microscopic

Equivalence principle     Yes                                         Yes, for macroscopic phenomena
                                                                      No, for microscopic phenomena
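
Table II contrasts the metric tensor with the parallel vector fields as the basic structure of the two theories. As a rough numerical illustration of how the two are related, the sketch below builds the metric from a tetrad through $g_{\mu\nu} = \eta_{kl}\,b^{k}{}_{\mu}b^{l}{}_{\nu}$ and compares it with the rotationally invariant line element (11.2). The explicit tetrad components used here ($b^{(0)}{}_{0} = C$, $b^{(a)}{}_{0} = Hx^{a}$, $b^{(a)}{}_{\alpha} = D\delta_{a\alpha} + F\epsilon_{a\alpha\beta}x^{\beta}$) are an assumption read off from Sec. XI and Appendix B, and all function names and sample values are hypothetical.

```python
def levi_civita(i, j, k):
    """Three-dimensional Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def tetrad(C, D, F, H, x):
    """Assumed spherically symmetric tetrad b^k_mu at spatial point x.

    Rows are the Lorentz index k, columns the coordinate index mu;
    index 0 is time in both cases.
    """
    b = [[0] * 4 for _ in range(4)]
    b[0][0] = C
    for a in range(3):
        b[a + 1][0] = H * x[a]
        for al in range(3):
            eps_x = sum(levi_civita(a, al, be) * x[be] for be in range(3))
            b[a + 1][al + 1] = D * (a == al) + F * eps_x
    return b


def metric_from_tetrad(b):
    """g_{mu nu} = eta_{kl} b^k_mu b^l_nu with eta = diag(-1, 1, 1, 1)."""
    eta = [-1, 1, 1, 1]
    return [[sum(eta[k] * b[k][m] * b[k][n] for k in range(4))
             for n in range(4)] for m in range(4)]


def metric_from_line_element(C, D, F, H, x):
    """Components read off from the line element (11.2)."""
    r2 = sum(xi * xi for xi in x)
    g = [[0] * 4 for _ in range(4)]
    g[0][0] = -(C * C - r2 * H * H)
    for a in range(3):
        g[0][a + 1] = g[a + 1][0] = D * H * x[a]
        for c in range(3):
            g[a + 1][c + 1] = (D * D + r2 * F * F) * (a == c) - F * F * x[a] * x[c]
    return g


if __name__ == "__main__":
    # Arbitrary integer sample values keep the comparison exact.
    sample = (2, 3, 5, 7, [1, -2, 4])
    same = metric_from_tetrad(tetrad(*sample)) == metric_from_line_element(*sample)
    print("tetrad reproduces (11.2):", same)
```

The functions $C$, $D$, $F$, $H$ are evaluated at a single point and passed as plain numbers, so this checks only the algebraic relation between tetrad and metric, not any field equation.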

$c_4$ term of $L_G$. In particular, the values of the parameters, $c_1$ and $c_2$, are severely restricted by the solar-system experiments, as is shown in (6.16).

We shall thus assume henceforth in this appendix that $c_1 = 0 = c_2$. Furthermore, in order to elucidate effects of the parity-violating $c_4$ term of $L_G$, we here apply the gravitational field equation to weak-field situations, where (9.1a) is satisfied. Using the notation introduced in Sec. IX, we find that the symmetric and antisymmetric parts of the gravitational field equation are given by

[Eqs. (A7) and (A8) are illegible in the source.]

where $\{\tilde{A}_{\mu\nu}\}$ is the dual of $\{A_{\mu\nu}\}$,

$$\tilde{A}_{\mu\nu} = \tfrac{1}{2}\epsilon_{\mu\nu\rho\sigma}A^{\rho\sigma}. \tag{A9}$$

The nonsymmetric energy-momentum tensor $\{T_{\mu\nu}\}$ satisfies the ordinary conservation law (9.9). Corresponding to the invariance of the gravitational field equation (4.22) under general coordinate transformations, the linearized field equation (A7)-(A8) is invariant under gauge trans-
formation,

$$h'_{\mu\nu} = h_{\mu\nu} - (\partial_{\mu}\Lambda_{\nu} + \partial_{\nu}\Lambda_{\mu}), \qquad A'_{\mu\nu} = A_{\mu\nu} + (\partial_{\mu}\Lambda_{\nu} - \partial_{\nu}\Lambda_{\mu}), \tag{A10}$$

where $\Lambda_{\mu}$ are four small but otherwise arbitrary functions that leave $\{h_{\mu\nu}\}$ and $\{A_{\mu\nu}\}$ weak. For the symmetric field $\{h_{\mu\nu}\}$, the most convenient choice of gauge is to put the harmonic condition (9.15), which we shall assume henceforth.

It is to be noticed that the field equation (A7)-(A8) is not invariant under space inversion and time reversal, when $c_4$ does not vanish. This is, of course, a direct consequence of the fact that the gravitational Lagrangian density of (A3) involves the term $c_4(v^{\mu}a_{\mu})$, which changes sign under space inversion and time reversal. This apparent parity violation, however, does not lead to any observable effects in the weak-field approximation, as will be shown below.

Multiplying $\partial^{\nu}$ on (A7) [or (A8)], we get

[Eq. (A11) is illegible in the source.]

of which the retarded solution is given by

[Eq. (A12) is illegible in the source.]

since $c_4$ is assumed to be nonvanishing. Using (9.4), (9.15), and (A12) in (A7), the field equation of $h_{\mu\nu}$ reads

[Eqs. (A13)-(A15) are illegible in the source.]

which is interpreted as the gravitational radiation produced by the source $\{T_{\mu\nu}\}$. Inspection of (A15) shows that if $\{\partial_{\lambda}T^{[\lambda\mu]}\}$ does not identically vanish, the field $\{h_{\mu\nu}\}$ propagates inside the light cone as if it is massive.

It seems natural, however, to restrict the theoretical framework of gravitation by requiring that gravitational radiation should propagate on the light cone with the speed of light. In view of this criterion, the case of $c_4 \neq 0$ should be disregarded unless the energy-momentum tensor satisfies

$$\partial_{\lambda}T^{[\lambda\mu]} = 0, \tag{A16}$$

in addition to the ordinary energy-momentum conservation law (9.9). Therefore, we shall assume (A16) hereafter. Then the spin tensor $\{S^{\mu\nu\lambda}\}$ is totally antisymmetric with respect to its three indices, and is represented as (9.12).

It follows from (A13) and (A16) that the symmetric field $\{h_{\mu\nu}\}$ satisfies the field equation

$$\Box\bar{h}_{\mu\nu} = -2\kappa T_{(\mu\nu)}, \tag{A17}$$

which is nothing but the field equation (9.17) in the case of $c_4 = 0$. Consequently, we find that the symmetric field $\{h_{\mu\nu}\}$ is not influenced at all by the parity-violating $c_4$ term of $L_G$.

From (A12) and (A16) it follows that

$$\partial_{\nu}\tilde{A}^{\mu\nu} = 0, \tag{A18a}$$

which, in view of (9.6b), is equivalent to the vanishing of the axial-vector part of the torsion tensor;

$$a^{\mu} = 0. \tag{A18b}$$

Therefore, $\{A_{\mu\nu}\}$ can be represented as curl of a vector field $\{B_{\mu}\}$,

$$A_{\mu\nu} = \partial_{\mu}B_{\nu} - \partial_{\nu}B_{\mu}. \tag{A19}$$

Using (9.11)-(9.12), (A18a), and (A19) in (A8), we find that the field equation of $\{A_{\mu\nu}\}$ is rewritten as

[Eq. (A20) is illegible in the source.]

the retarded solution of which is given by (A19), with $B_{\mu}$ defined by

[Eq. (A21) is illegible in the source.]

It follows from (A18b) that an antisymmetric field does not couple with spin-$\tfrac{1}{2}$ fundamental particles [see the Dirac equation (3.13b)]. On the other hand, the electromagnetic field is decoupled from an antisymmetric field, since the former interacts with the gravitational field through the metric tensor $\{g_{\mu\nu}\}$. Consequently, an antisymmetric field (A19) does not interact with fundamental particles and fields, and so it is entirely devoid of physical reality.
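
The step from (A18a) to (A19) relies on the fact that the divergence of the dual of a curl vanishes identically. The following sketch, offered only as a consistency check, verifies the converse direction in components: for $A_{\rho\sigma} = \partial_{\rho}B_{\sigma} - \partial_{\sigma}B_{\rho}$ the contraction $\tfrac{1}{2}\epsilon^{\mu\nu\rho\sigma}\partial_{\nu}A_{\rho\sigma}$ is zero because second derivatives commute. The second derivatives $\partial_{\nu}\partial_{\rho}B_{\sigma}$ are represented by an arbitrary integer array that is symmetric in its first two indices; that array, the function names, and the neglect of index raising (which cannot affect whether the result vanishes in flat space) are assumptions of this illustration.

```python
from fractions import Fraction
from itertools import permutations


def perm_sign(perm):
    """Sign of a permutation of (0, 1, 2, 3), found by sorting with swaps."""
    p = list(perm)
    sign = 1
    for i in range(len(p)):
        while p[i] != i:
            j = p[i]
            p[i], p[j] = p[j], p[i]
            sign = -sign
    return sign


# Nonzero components of the four-dimensional Levi-Civita symbol.
EPSILON = {p: perm_sign(p) for p in permutations(range(4))}


def sample_second_derivatives():
    """Hypothetical values of d_nu d_rho B_sigma, symmetric in (nu, rho)."""
    return [[[(1 + min(nu, rho)) * (3 + max(nu, rho)) ** 2 + 7 * sigma * (nu + rho)
              for sigma in range(4)]
             for rho in range(4)]
            for nu in range(4)]


def divergence_of_dual(d2B):
    """Return (1/2) eps^{mu nu rho sigma} d_nu A_{rho sigma} for A = curl B."""
    result = []
    for mu in range(4):
        total = Fraction(0)
        for (m, nu, rho, sigma), sign in EPSILON.items():
            if m != mu:
                continue
            d_A = d2B[nu][rho][sigma] - d2B[nu][sigma][rho]
            total += Fraction(sign, 2) * d_A
        result.append(total)
    return result


if __name__ == "__main__":
    print(divergence_of_dual(sample_second_derivatives()))
```

Replacing the sample array by one that is not symmetric in its first two indices would in general give a nonzero result, which shows that the cancellation comes from the commuting derivatives and not from the particular numbers chosen.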
The present case of $c_4 \neq 0$ is invariant under the gauge transformation (A10). Using this gauge freedom, we have put the harmonic condition (9.15), which is necessary to eliminate unphysical components of the symmetric field $\{h_{\mu\nu}\}$. We are still left with the freedom to perform a gauge transformation (A10) with $\Lambda_{\mu}$ satisfying the d'Alembert equation,

$$\Box\Lambda_{\mu} = 0. \tag{A22}$$

It follows from (A21), however, that $\{B_{\mu}\}$ satisfies the inhomogeneous d'Alembert equation,

[Eq. (A23) is illegible in the source.]

if matter exists. Accordingly, a gauge transformation (A10) with (A22) is insufficient to make an antisymmetric field (A19) vanishing in that space-time region where there exists nonvanishing source, $\{J_{\mu}\} \neq 0$.

Therefore, an antisymmetric field (A19), although it is unphysical, cannot be eliminated by a symmetry transformation of the present case of $c_4 \neq 0$. This situation is to be contrasted with that of the electromagnetic field, in which unphysical components of the electromagnetic potential $\{A_{\mu}\}$ can be eliminated by choosing an appropriate gauge. It is unreasonable to accept a theory involving such unphysical degrees of freedom that cannot be removed by a symmetry transformation of a theory. Consequently, we should disregard the case of $c_4 \neq 0$.

APPENDIX B: SPHERICALLY SYMMETRIC PARALLEL VECTOR FIELDS

Consider the parallel vector fields for a spherically symmetric (but time-dependent in general) system. We mean by "spherically symmetric" that it is possible to choose "quasi-Minkowskian" coordinates, $x^{1}, x^{2}, x^{3}, x^{0} = t$, such that the parallel vector fields $b^{*} = \{b^{k}\} = \{b^{k}{}_{\mu}\}$ are form invariant under space rotation

$$x^{\alpha} \rightarrow R^{\alpha}{}_{\beta}x^{\beta}, \qquad b^{(a)} \rightarrow R^{a}{}_{b}\,b^{(b)}, \tag{B1}$$

where $R = (R^{\alpha}{}_{\beta}) = (R^{a}{}_{b})$ is an orthogonal $3\times3$ matrix

$$RR^{t} = R^{t}R = I, \qquad \det R = 1.$$

The most general expression of $\{b^{k}{}_{\mu}\}$ can then be given by

$$(b^{k}{}_{\mu}) = \begin{pmatrix} C & Gx^{\alpha} \\ Hx^{a} & D\delta^{a}{}_{\alpha} + Ex^{a}x^{\alpha} + F\epsilon_{a\alpha\beta}x^{\beta} \end{pmatrix}, \tag{B2}$$

where $C$, $D$, $E$, $F$, $G$, and $H$ are unknown functions of $t$ and $r = (x^{\alpha}x^{\alpha})^{1/2}$. We are, however, still free to redefine the time coordinate and the radius by

$$t' = \Phi(t,r), \qquad x'^{\alpha} = \psi(t,r)\,x^{\alpha}, \tag{B3}$$

with $\Phi$ and $\psi$ arbitrary functions of $t$ and $r$. Under arbitrary coordinate transformation $x^{\mu} \rightarrow x'^{\mu}$, the parallel vector fields $\{b^{k}{}_{\mu}\}$ transform like covariant vectors

$$b'^{k}{}_{\mu}(x') = (\partial x^{\nu}/\partial x'^{\mu})\,b^{k}{}_{\nu}(x). \tag{B4}$$

For a redefinition (B3) of $t$ and $r$, the transformation coefficients $(\partial x^{\nu}/\partial x'^{\mu})$ are given by

[Eqs. (B5) and (B6) are illegible in the source.]

Using (B5) in (B4), we obtain

[Eqs. (B7a)-(B7d) are illegible in the source.]

Inspection of (B7b) shows that the $(0\alpha)$ components, $b'^{(0)}{}_{\alpha}$, can be eliminated by setting

[Eq. (B8) is illegible in the source.]

In particular, if $C$, $D$, $E$, $F$, $G$, and $H$ are all time independent, $\Phi$ can be taken as

[Eq. (B9) is illegible in the source.]

where $\psi(r)$ is a function of $r$. Now we assume $G$ to be zero, then the $E$ term in $b'^{(a)}{}_{\alpha}$ can be eliminated by putting

[Eq. (B10) is illegible in the source.]
The parallel vector fields (B2) then take the following form:

$$(b^{k}{}_{\mu}) = \begin{pmatrix} C & 0 \\ Hx^{a} & D\delta^{a}{}_{\alpha} + F\epsilon_{a\alpha\beta}x^{\beta} \end{pmatrix}. \tag{B11}$$

Further reduction of $\{b^{k}{}_{\mu}\}$ is impossible: Any of $C$, $D$, $F$, and $H$ cannot be put to zero in addition to $E$ and $G$. This is evident from (B7a) and (B7d) for $C$, $D$, and $F$. To prove this for $H$, we assume that the $(a0)$ components, $b^{(a)}{}_{0}$, were eliminated from (B11) by a suitable redefinition of $t$ and $r$, then Eqs. (B7b)-(B7d) show that the function $\psi$ must satisfy

$$\psi' = 0, \qquad \psi H - \dot{\psi}D = 0.$$

However, these two conditions of $\psi$ are not compatible with each other, since $D$ and $H$ are, in general, functions of $r$.

Now we turn to a static, spherically symmetric system. We assume furthermore that the spin of constituent particles of the system, if it exists, can be completely neglected: This means that there is no physical distinction between the left- and right-handed coordinate system to describe it. Then the parallel vector fields are form invariant under time reversal and space inversion,

$$t \rightarrow -t, \qquad b^{(0)} \rightarrow -b^{(0)} \quad \text{(time reversal)} \tag{B12}$$

$$x^{\alpha} \rightarrow -x^{\alpha}, \qquad b^{(a)} \rightarrow -b^{(a)} \quad \text{(space inversion)} \tag{B13}$$

in addition to space rotation (B1). Hence the $(a0)$ components, $b^{(a)}{}_{0}$, and the $F$ term in the $(a\alpha)$ components, $b^{(a)}{}_{\alpha}$, must vanish. The parallel vector fields (B11) then become

$$(b^{k}{}_{\mu}) = \begin{pmatrix} C & 0 \\ 0 & D\delta^{a}{}_{\alpha} \end{pmatrix}, \tag{B14}$$

where $C$ and $D$ are unknown functions of $r$ alone.

APPENDIX C: PROOF OF $1 + \kappa(c_1 + 4c_2) \neq 0$

The field equations for the static isotropic gravitational field become

$$-[1 - \kappa(c_1 - 2c_2)]B'' - \kappa(c_1 + c_2)A'' - \frac{2}{r}\left\{\kappa(c_1 + c_2)A' + [1 - \kappa(c_1 - 2c_2)]B'\right\} = \kappa T_{00}, \tag{C1a}$$

$$[1 - \kappa(c_1 - 2c_2)]A' + [1 + \kappa(c_1 + 4c_2)]B' = 0 \tag{C1b}$$

in the Newtonian limit. (Dividing (C1) by $[1 + \kappa(c_1 + 4c_2)]$ gives (5.7).) Assume that $1 + \kappa(c_1 + 4c_2) = 0$, then (C1b) gives $[1 - \kappa(c_1 - 2c_2)]A' = 0$. If $A' = 0$, the equation of motion (5.11a) for a nonrelativistic test particle becomes

[Equation illegible in the source.]

which contradicts the Newton equation of motion. If $1 - \kappa(c_1 - 2c_2) = 0$, on the other hand, $c_1$ and $c_2$ are given by $c_1 = 1/3\kappa$ and $c_2 = -1/3\kappa$, respectively. Using these values of $c_1$ and $c_2$ in (C1a) gives

$$0 = \kappa T_{00},$$

which is a contradiction. Therefore, $[1 + \kappa(c_1 + 4c_2)]$ should not be zero.
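
The algebra of the second case in Appendix C can be checked with exact rational arithmetic. The sketch below solves the two conditions $1 + \kappa(c_1 + 4c_2) = 0$ and $1 - \kappa(c_1 - 2c_2) = 0$ for $c_1$ and $c_2$, and then evaluates the two coefficient combinations that multiply the derivatives of $A$ and $B$ in (C1a). The function names and the sample value of $\kappa$ are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def solve_c1_c2(kappa):
    """Solve c1 + 4*c2 = -1/kappa and c1 - 2*c2 = 1/kappa exactly."""
    inv = Fraction(1) / kappa
    # Subtracting the second condition from the first gives 6*c2 = -2/kappa.
    c2 = (-inv - inv) / 6
    c1 = inv + 2 * c2
    return c1, c2


def c1a_coefficients(kappa, c1, c2):
    """Return the combinations [1 - kappa(c1 - 2 c2)] and kappa(c1 + c2)."""
    return 1 - kappa * (c1 - 2 * c2), kappa * (c1 + c2)


if __name__ == "__main__":
    kappa = Fraction(5, 7)  # arbitrary nonzero sample value
    c1, c2 = solve_c1_c2(kappa)
    print("c1 * 3 kappa =", c1 * 3 * kappa)
    print("c2 * 3 kappa =", c2 * 3 * kappa)
    print("coefficients in (C1a):", c1a_coefficients(kappa, c1, c2))
```

Both coefficient combinations vanish for these values, so the left-hand side of (C1a) is identically zero whatever $A$ and $B$ are, which is the contradiction $0 = \kappa T_{00}$ stated in the text.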

'A. Einstein, (a) Sitzungsber. Preuss. Akad. Wiss. 217 "R. WeitZenback, Inuariantentheorie (Noordhoff, Gron-
(1928);@) 224 (1928); (c) 2 (1929);(d) 156 (1929); (e) ingen, 1923); Chap. XIII, Sec. 7.
18 (1930);(f) 401 (1930). "See, e.g., E. T. Davies and K. Ywo. in Convegno
'A. Einstein and W. Mayer, Sitzungsber. Preuss. Akad. Internazionale Celebratiuo del Centenario della Nas-
Wiss. 110 (1930). cita di Tulli Levi-Ciuita, Atti dei Convegni Ltncei
k. Mbller, K. Dan. Vidensk. Selsk. Mat. Fys. Skr. 1, (Academia Nazionale dei Lincei, Roma, 1975); p. 53.
No. 1 0 (1961). '3Throughout this paper we mean by "geodesics" the
4C. Pellegrini and J. Plebanski, K. Dan. Vidensk. Selsk. shortest (or longest) possible path between two points,
Mat. Fys. Skr. 2, No. 4 (1962). "length" being measured by the metric g.
k. Mbller, K. Dan. Vidensk. Selsk. Mat. Fys. Skr. 89, "See, for example, K. Hayashi and T. Shirafuji, Prog.
No. 13 (1978). Theor. Phys. 57, 302 (1977).
6K.Hayashi and T. Nakano. Prog. Theor. Phys. a, 491 ' 5 ~convention
r of the y m a t r i c e s i s a s follows:
0967).
%.Miyamoto and T. Nakano, Prog. Theor. Phys. 2, {yi. r')=-2q''. S i ' = ( i / 4 ) [ y i , 7'1,
295 (1971).
*K.Hayashi, (a) Gen. Relativ. Gravit. 4, 1 (1973);@)
y 5= i y 'y 'y2y '.
Lett. Nuovo Cimentn 5, 529 (1972); (c)5, 739 (1972); In the spinor representation (2.20)of J I , the Y matrices
e,
(d) 2, 883 (1972);(e) Phys. Lett. 497 (1973); are
(f) K B , 497 (1973).
'K. Hayashi, Nuovo Cimento g , 639 (1973).
'OK. Hayashi, Phys. Lett.E, 441 (1977).
16C. Mbller, The Theory of Relativity (Clarendon, Ox- U. S. A.46, 871 (1960);Phys. Rev. Lett. 2,215 (1960).
ford, 1952). "A. S. Eddington, The Mathematical Theory of Relativ-
"Although the magnitude of spin vanishes in the classi- i t y (Cambridge Univ. Press, Cambridge, England,
c a l limit F- 0, the spin polarization h a s the meaning- 1924), 2nd edition, p. 105; H. P. Robertson, in Space
ful classical limit. The classical equation of spin Age Astronomy, edited by A. J. Deutsch and W. B.
precession in a homogeneous electromagnetic field is Klempler (Academic, New York, 1962). p. 228.
now well established and employed in the experimental "J. D. Anderson et a l . . Astrophys. J. 200, 221 (1975).
study of the anomalous magnetic moment of muons and "E. B. Fomalont and R. A. Sramek, Phys. Rev. Lett.

PHYSICAL REVIEW D, VOLUME 24, NUMBER 12, 15 DECEMBER 1981

Brief Reports
Brief Reports are short papers which report on completed research which, while meeting the usual Physical Review standards of scien-
tific quality, does not warrant a regular article. (Addenda to papers previously published in the Physical Review by the same authors are
included in Brief Reports.) A Brief Report may be no longer than 3½ printed pages and must be accompanied by an abstract. The same
publication schedule as for regular articles is followed, and page proofs are sent to authors.

Addendum to "New general relativity"


Kenji Hayashi
Institute of Physics, Tokyo University (Komaba), Tokyo 153, Japan

Takeshi Shirafuji
Physics Department, Saitama University, Urawa, Saitama 338, Japan
(Received 28 July 1981)
We make a short comment on our new general relativity formulated on the Weitzenböck space-time. The new
general relativity considered here has one free parameter besides the Einstein constant $\kappa$. The total action is
invariant under a class of local Lorentz transformations, besides being invariant under general coordinate and global
Lorentz transformations. The consequences of this restricted local Lorentz invariance are studied.

In a previous paper¹ we studied a gravitational theory based on the Weitzenböck space-time with
absolute parallelism. This theory attributes gravity to the torsion of space-time, defined by²

$$T^{\lambda}{}_{\mu\nu} = b_i{}^{\lambda}\,(\partial_\nu b^{i}{}_{\mu} - \partial_\mu b^{i}{}_{\nu}) \tag{1}$$

with $b = \{b_i\} = \{b_i{}^{\mu}\}$ a quartet of the parallel vector fields (or simply the parallel vector fields).
Postulating that the gravitational Lagrangian density should be quadratic in the torsion tensor and conserve
parity, we found that it is represented by

$$L_G = a_1\,(t^{\lambda\mu\nu} t_{\lambda\mu\nu}) + a_2\,(v^{\mu} v_{\mu}) + a_3\,(a^{\mu} a_{\mu}), \tag{2a}$$

where $a_1$, $a_2$, and $a_3$ are free parameters. Or equivalently, it is rewritten by

$$L_G = (1/2\kappa)\,R(\{\}) + c_1\,(t^{\lambda\mu\nu} t_{\lambda\mu\nu}) + c_2\,(v^{\mu} v_{\mu}) + c_3\,(a^{\mu} a_{\mu}) + \text{a total derivative} \tag{2b}$$

with $\kappa = 8\pi G$, $c_1 = a_1 + (1/3\kappa)$, $c_2 = a_2 - (1/3\kappa)$, and $c_3 = a_3 + (3/4\kappa)$, where $G$ denotes the
Newtonian gravitational constant. Here $t_{\lambda\mu\nu}$, $v_\mu$, and $a_\mu$ are three irreducible building blocks of the
torsion tensor, while $R(\{\})$ is the ordinary scalar curvature defined by the Christoffel symbol. We have compared
the theory with solar-system experiments, and found that the parameters $c_1$ and $c_2$ are severely
restricted by the currently available experimental data: $\kappa c_1 = 0.001 \pm 0.001$ and $\kappa c_2 = -0.005 \pm 0.005$.
As for the parameter $c_3$, solar-system experiments do not give any restriction at all.

In view of this severe restriction on $\kappa c_1$ and $\kappa c_2$, we have proposed a particular model for which
the parameters $c_1$ and $c_2$ are exactly vanishing: $c_1 = c_2 = 0$. It is the purpose of this paper to make a
brief comment on the internal consistency of this model; in particular, we put forward an argument
against a recent statement of internal inconsistency.³

The Lagrangian density of (2b) now becomes

$$L_G = (1/2\kappa)\,R(\{\}) + c_3\,(a^{\mu} a_{\mu}), \tag{3}$$

with a total-derivative term neglected, and it has the following invariance property besides the
invariance under general coordinate and global Lorentz transformations: namely, the $L_G$ of (3) is
invariant under those local Lorentz transformations,

$$b'_i{}^{\mu} = b_j{}^{\mu}\,\Lambda^{j}{}_{i}(x), \quad\text{or simply}\quad b' = b\,\Lambda(x), \tag{4}$$

with $\Lambda(x) = \{\Lambda^{i}{}_{j}(x)\}$ and $\Lambda^{T}\eta\,\Lambda = \eta$, which leave the axial-vector part of the torsion tensor
$a^{\mu}$ invariant. Using the definition of $a$, $a^{\mu} = \tfrac{1}{6}\,\epsilon^{\mu\nu\rho\sigma} T_{\nu\rho\sigma}$, we see that the constraint
imposed on $\Lambda(x)$ is given by

$$\epsilon^{ijmn}\, b_i{}^{\mu}\, b_n{}^{\nu}\, \Lambda^{l}{}_{j}(x)\, \partial_\nu \Lambda_{lm}(x) = 0, \tag{5}$$

where $\Lambda_{lm}(x) = \eta_{lk}\Lambda^{k}{}_{m}(x)$. This property of $L_G$, which we shall hereafter refer to as the
restricted local Lorentz invariance, is a characteristic feature of the present model with $c_1 = c_2 = 0$,
and it has some consequences which we shall now discuss.

Consider the variation of the action under an infinitesimal local Lorentz transformation (4)
constrained by (5), i.e., $\Lambda^{i}{}_{j}(x) = \delta^{i}{}_{j} + \omega^{i}{}_{j}(x)$ with

$$\epsilon^{ijmn}\, b_i{}^{\mu}\, b_n{}^{\nu}\, \partial_\nu \omega_{jm}(x) = 0, \tag{6}$$

24 3312 © 1981 The American Physical Society



where $\omega_{ij}(x) = \eta_{im}\,\omega^{m}{}_{j}(x)$, $\omega_{ij}(x) + \omega_{ji}(x) = 0$, and $|\omega_{ij}(x)| \ll 1$. The matter part of
the action changes like

$$\delta\!\int d^4x\,\sqrt{-g}\,L_M = \int d^4x\,\sqrt{-g}\,\Big(T^{[ij]} + \frac{1}{\sqrt{-g}}\,\frac{\delta(\sqrt{-g}\,L_M)}{\delta q}\,\frac{i}{2}\,S^{ij} q\Big)\,\omega_{ij}(x), \tag{7}$$

where $T^{ij}$ is the energy-momentum tensor of matter,

$$\sqrt{-g}\,T^{ij} = b^{i}{}_{\mu}\,\delta(\sqrt{-g}\,L_M)/\delta b_{j\mu}, \tag{8}$$

and $S^{ij}$ is the infinitesimal Lorentz generator for the matter fields which we denote collectively by
$q$. The gravitational part of the action, on the other hand, does not change under this transformation:

$$\delta\!\int d^4x\,\sqrt{-g}\,L_G = -\int d^4x\,\sqrt{-g}\,B^{[ij]}\,\omega_{ij}(x) = 0, \tag{9}$$

where $B^{ij}$ is defined by

$$\sqrt{-g}\,B^{ij} = -\,b^{i}{}_{\mu}\,\delta(\sqrt{-g}\,L_G)/\delta b_{j\mu}. \tag{10}$$

Therefore, the gravitational field equation, which now reads as

$$B^{ij} = T^{ij}, \tag{11}$$

requires that the $T^{[ij]}$ should also satisfy

$$\int d^4x\,\sqrt{-g}\,T^{[ij]}\,\omega_{ij}(x) = 0 \tag{12}$$

when the matter fields obey their field equations $\delta(\sqrt{-g}\,L_M)/\delta q = 0$. According to (7), this condition
on $T^{[ij]}$ is automatically guaranteed if and only if the matter part of the action is invariant under
infinitesimal local Lorentz transformations constrained by (6). Namely, it is required by consistency
of the gravitational field equation that the matter part of the Lagrangian density $L_M$ should
have the same restricted local Lorentz invariance as the gravitational Lagrangian density of (3).

The Lagrangian density of the fundamental particles of spin 1/2, which is derived from the
special-relativistic one by the minimal prescription, is given by

$$L_M = (i/2)\,b_k{}^{\mu}\,(\bar q\gamma^{k}\partial_\mu q - \partial_\mu\bar q\,\gamma^{k} q) - m\,\bar q q
     = \big[(i/2)\,b_k{}^{\mu}\,(\bar q\gamma^{k}\nabla_\mu q - \nabla_\mu\bar q\,\gamma^{k} q) - m\,\bar q q\big] + \tfrac{3}{4}\,(\bar q\gamma^{\mu}\gamma_5 q)\,a_\mu, \tag{13}$$

where $\nabla_\mu$ denotes the covariant derivative with respect to the Ricci rotation coefficients formed of
$\{b_i\}$: the terms in the square brackets are invariant under any local Lorentz transformations.
Therefore, the Lagrangian density $L_M$ of (13) has the required restricted local Lorentz invariance.
For the Rarita-Schwinger field of spin 3/2, however, the minimal Lagrangian density does not possess
this invariance property, and one must add some nonminimal coupling terms in an ad hoc manner to
ensure the restricted local Lorentz invariance. For the gauge fields of internal symmetry of the
fundamental particles, such as photons, $W^{\pm}$ mesons, $Z$ mesons, and gluons, the Lagrangian
density is constructed by the usual Yang-Mills procedure, and hence it is described in terms of
the metric tensor alone; namely, the Lagrangian density for the gauge fields of internal symmetry
is invariant under any local Lorentz transformations.

The equations of motion for matter fields are derived from $L_M$ by the action principle, and so they
are covariant under the local Lorentz transformations constrained by (5). That is, we have

(14)

where we denote the Lorentz transformation rule of $q$ by $q' = U(\Lambda)q$.

It is thus impossible to distinguish experimentally (i.e., by observing the motion of test bodies)
two parallel vector fields $b$ and $b'$ from one another, if these two parallel vector fields are related
to each other by a local Lorentz transformation satisfying (5). In other words, these two parallel
vector fields should be interpreted as physically equivalent with each other. We can therefore
divide the set of parallel vector fields into equivalence classes. Observing the motion of test bodies,
one cannot unambiguously specify a single quartet of parallel vector fields but only an
equivalence class of parallel vector fields. Consequently, the underlying space-time of the present
model with $c_1 = c_2 = 0$ is not a Weitzenböck space-time but a new class of space-time which may be
classified somewhere between the Riemann and the Weitzenböck space-times. We shall call this new
class of space-time the extended Weitzenböck space-time.

The gravitational field equation symbolically denoted by (11) reads


(15)

where $G(\{\})$ is the ordinary Einstein tensor formed of the metric tensor. This equation is not
form invariant under the local Lorentz transformations constrained by (5), because it involves terms,
such as the torsion tensor $T_{\lambda\mu\nu}$, which are not covariant under those local Lorentz transformations.
This fact poses a serious question about internal consistency of the present model: namely,
does the gravitational field equation indeed define only an equivalence class of parallel vector
fields? If the present model is internally consistent, the answer should be affirmative. When
a quartet of parallel vector fields $b$ is a solution of the gravitational field equation, then any other
member $b'$ of the same equivalence class should also be a solution at the same time.

At present we do not have a definite answer to the above question. However, we shall give a
formal argument which suggests that the answer may indeed be affirmative.

Consider two parallel vector fields, $b = \{b_i{}^{\mu}\}$ and $b' = \{b'_i{}^{\mu}\}$, which are related to each other by a
local Lorentz transformation (4) constrained by (5). Since the constraint equation (5) involves the
parallel vector fields $b$ explicitly, its solution [i.e., the transformation matrix $\Lambda(x)$] depends on
$b$. This is the reason why the gravitational field equation of (15) is not form invariant under the
local Lorentz transformations constrained by (5). We shall assume that the solution $\Lambda(x)$ of (5)
changes smoothly as $b$ is varied. Then, when $b$ is varied to $b + \delta b$, the $\Lambda(x)$ changes to
$\Lambda(x)[1 + \Omega(x)] = \{\Lambda^{i}{}_{k}(x)[\delta^{k}{}_{j} + \Omega^{k}{}_{j}(x)]\}$, where $\Omega_{ij}(x) + \Omega_{ji}(x) = 0$ and
$|\Omega_{ij}(x)| \ll 1$ with $\Omega_{ij}(x) = \eta_{ik}\,\Omega^{k}{}_{j}(x)$. According to the constraint of (5) with $b$ replaced by
$b + \delta b$, the $\Omega_{ij}(x)$ must satisfy

$$\epsilon^{ijmn}\Big\{ b_i{}^{\mu} b_n{}^{\nu}\big[\partial_\nu\Omega_{jm} + 2\,\Lambda^{l}{}_{j}(\partial_\nu\Lambda_{lk})\,\Omega^{k}{}_{m}\big]
 + (b_i{}^{\mu}\,\delta b_n{}^{\nu} + b_n{}^{\nu}\,\delta b_i{}^{\mu})\,\Lambda^{l}{}_{j}(\partial_\nu\Lambda_{lm}) \Big\} = 0, \tag{16}$$

for which we assume that the solution depending linearly on $\delta b$ may be represented as an integral
form:

(17)

From the restricted local Lorentz invariance of the total Lagrangian density $L = L_G + L_M$, we have

(18)

where $\delta b'$ is defined by $b' + \delta b' = (b + \delta b)\Lambda(1 + \Omega)$. Taking the difference of these two equations
integrated over whole space-time, and then using (17), we obtain the following transformation rule
for the Euler derivative $\delta(\sqrt{-g}\,L)/\delta b_{i\mu}$:

(19)

where $\Lambda_k{}^{m}(x) = \eta^{mj}\Lambda_{kj}(x)$. The second term under the integral sign in (19) arises due to the
fact that the local Lorentz transformation matrix $\Lambda(x)$ does depend on the parallel vector fields,
and it represents the peculiar feature of the transformation law of the gravitational field equation
in the present model with $c_1 = c_2 = 0$. Because of this second term in (19), the gravitational field
equation is not covariant under the local Lorentz transformations constrained by (5).
Nevertheless, it does follow from (14) and (19) that if $b$ and $q$ obey their field equations

$$\delta(\sqrt{-g}\,L)/\delta b_{i\mu} = 0, \qquad \delta(\sqrt{-g}\,L)/\delta q = 0, \tag{20}$$

then $b'$ and $q'$ also satisfy their field equations at the same time. This result indicates that the
parallel vector fields are determined by the gravitational field equation up to a freedom of making the
local Lorentz transformations constrained by (5). We can thus conclude that the present model with
$c_1 = c_2 = 0$ is internally consistent if the Lorentz transformation matrix $\Lambda(x)$ constrained by (5)
depends smoothly on $b$. More specifically, we have assumed the integral representation (17) for
$\Omega_{ij}(x)$, which we have not yet succeeded in proving.

¹K. Hayashi and T. Shirafuji, Phys. Rev. D 19, 3524 (1979).
²We use the same notations and conventions as in Ref. 1.
³W. Kopczyński, University of Cologne report (unpublished).

Progress of Theoretical Physics, Vol. 74, No. 1, July 1985

Relativistic Theory of Gravitation

A. A. LOGUNOV and M. A. MESTVIRISHVILI
USSR State Committee for Utilization of Atomic Energy
Institute for High Energy Physics, Serpukhov, Moscow

(Received October 22, 1984)

In the present paper a relativistic theory of gravity (RTG) is unambiguously constructed on the basis
of special relativity and the geometrization principle. In it, the gravitational field is treated as a
Faraday-Maxwell spin-2 and spin-0 physical field possessing energy and momentum. The source of the
gravitational field is the total conserved energy-momentum tensor of matter and of the gravitational field in
Minkowski space. In the RTG, the conservation laws are strictly fulfilled for the energy-momentum and
for the angular momentum of matter and the gravitational field. The theory explains the whole available
set of experiments on gravity. In virtue of the geometrization principle, the Riemannian space in our
theory is of field origin, since it appears as an effective force space due to the action of the gravitational field
on matter. The RTG leads to an exceptionally strong prediction: the Universe is not closed but just
flat. This suggests that a missing mass should exist in the Universe in some form of matter.

1. Introduction
In this paper the relativistic theory of gravitation (RTG) is constructed on the basis
of special relativity, and the ideas of Poincaré, Minkowski, Einstein and Hilbert receive
further development. The investigations of the authors¹⁾ are also reflected and
developed here.
At first the principle of relativity was applied to mechanical phenomena only. But then
Henri Poincaré formulated it as a universal principle for all physical phenomena:
The laws of physical phenomena should be the same both for an observer at rest and for
one who is in a state of uniform translational motion. So, we do not and cannot
have any means to distinguish whether we are in such a motion or not. Even now it is
commonly thought that the essence of the principle of relativity is restricted to the existence of
only one class of coordinate systems, the so-called inertial reference frames, within which
physical processes take place in the same way. However, as shown in Ref. 4), the
pseudo-Euclidean space-time geometry discovered by Minkowski allows one to formulate the
generalized principle of relativity, valid both for the class of inertial and that of noninertial
frames. The generalized principle of relativity was formulated in Ref. 4): Whichever
physical reference system is chosen, inertial or noninertial, one can always find an
infinite set of other frames in which physical phenomena proceed in the same way as in the
initial reference frame. Thus, we do not and cannot have any experimental means to
distinguish in which particular reference frame out of this infinite set we are.
The discovery of the pseudo-Euclidean space-time geometry allows one to formulate
physical laws both in inertial and noninertial reference frames, and thus to disprove the
erroneous statement on the inapplicability of the special theory of relativity to accelerated
reference frames. This means that when describing physical phenomena in Minkowski
space we may choose any reference frame adequate for the
given problem and, hence, set a corresponding metric tensor $\gamma_{ik}$ of Minkowski space.
According to the ideology of general relativity (GR), the special principle of relativity
cannot be applied to gravitational phenomena. It was at this very central point that,
almost seventy years ago, Einstein and Hilbert turned away from the special theory of
relativity when constructing GR. This resulted in giving up the conservation laws for
energy-momentum and angular momentum, as well as in the development of unphysical
concepts of the nonlocalizability of gravitational energy, and of many other things which
have nothing to do with gravity. These two eminent scientists left the surprisingly simple
Minkowski space with the maximal, ten-parameter group of space motions and entered the
maze of Riemannian geometry, which entangled the following generations of
physicists engaged in gravity. Some authors even consider giving up the energy-momentum
conservation laws in GR to be the most important principal step of this theory,
which overthrew the concept of energy. But it would be too thoughtless to renounce
the most important law of nature, i.e., the conservation law of energy-momentum and
angular momentum of a closed system, without sound experimental grounds. It was
shown in Refs. 1) that, since GR does not and cannot have conservation laws for the
energy-momentum of both matter and the gravitational field, the inertial mass defined in
Einstein's theory has no physical meaning; the gravitational radiation flux, as it is defined in
GR, can always be annihilated by a corresponding choice of the admissible reference
frame; and hence the Einstein quadrupole formula for gravitational radiation
does not follow from GR. General relativity does not, in principle, imply that a binary
system loses energy because of gravitational radiation. GR does not have the
classical Newtonian limit and, consequently, does not satisfy the most fundamental
principle of physics, i.e., the correspondence principle. This is what the absence of
energy-momentum conservation laws leads to, should one reject dogmatism, think seriously
over the heart of the problem and perform an almost elementary analysis. All of
this testifies to the fact that GR is not a satisfactory physical theory. Therefore, the
problem of constructing a classical theory of gravity which would satisfy all the requirements
imposed on a physical theory is quite vital.
As opposed to GR, our theory is based on the special principle of relativity which
we, following Poincaré, consider universal and, consequently, applicable to gravitational
phenomena as well. Thus, in our approach the conservation laws for
energy-momentum and angular momentum are fulfilled strictly and have a covariant character.
Therefore, our theory contains no pseudotensors and, as a consequence, no unphysical
concepts of the nonlocalizability of gravitational energy arise. Figuratively speaking, our
overriding problem is to construct, without leaving Minkowski space, an effective field
Riemannian space with the help of a tensor gravitation field and the geometrization
principle, with the conservation laws for matter being strictly fulfilled. This will allow
us to use, if necessary, a Riemannian space already endowed with the conservation laws for
matter. Note that a Riemannian space constructed in such a way is, literally, of field
origin, since the effective force space is generated by a gravitation field of Faraday-Maxwell
type. Thus, in the present paper we shall carry out this program, developing the
ideas of Refs. 6). In doing so we necessarily preserve the Hilbert-Einstein
equations, supplementing them with four new field equations. According to the new equations,
a gravitation field has, in the general case, only spins 2 and 0. This theory changes
the conventional concepts of space-time influenced by GR, takes us out of the maze of
Riemannian geometry and is in the spirit of the modern theories of elementary particle
physics. Everything turns out surprisingly simple and natural. The only thing to

wonder at is that the way to this simplicity and lucidity took 70 years. It follows from our
theory that the Einstein general principle of relativity has neither physical sense nor any
physical content.
The theory developed in this paper is based on the concept of a gravitation field being
a physical field of Faraday-Maxwell type and possessing energy-momentum. Thus, a
gravitation field, like all other physical fields, is characterized by an energy-momentum
tensor. We consider a gravitation field as a spin-2 and spin-0
physical field, and a free gravitation field to have spin 2. The space-time geometry for all
physical fields is pseudo-Euclidean (Minkowski space). Thus, the conservation laws for
the energy-momentum and angular momentum of a closed system are rigorously fulfilled.
This is the principal distinction between our theory and the Einstein GR. Another
important problem arising in the construction of a theory of gravity is that concerning the
interaction between a gravitation field and matter. We take the gravitation field to be
universal, and to act on all forms of matter identically. We construct our theory on the
basis of the geometrization principle, which says that the equations of motion of matter
under the action of the tensor gravitation field $\phi^{ik}$ in Minkowski space with the metric
tensor $\gamma^{ik}$ may be identically represented as the equations of motion of matter in the
effective Riemannian space-time with the metric tensor $g^{ik}$ depending on the gravitational
field $\phi^{ik}$ and the metric tensor $\gamma^{ik}$. In this way we introduce the concept of an effective
Riemannian space of field nature. Proceeding from Minkowski space and the geometrization
principle, the Lagrangian density has the general form

$$L = L_g(\tilde\gamma^{ik}, \tilde\phi^{ik}) + L_M(\tilde g^{ik}, \Phi_A), \tag{A}$$

where $\tilde\phi^{ik} = \sqrt{-\gamma}\,\phi^{ik}$ is the density of the tensor gravitational field $\phi^{ik}$,
$\tilde g^{ik} = \sqrt{-g}\,g^{ik}$ is the density of the metric tensor of the Riemannian space, and
$\tilde\gamma^{ik} = \sqrt{-\gamma}\,\gamma^{ik}$ is the density of the metric tensor of Minkowski space. $\Phi_A$ are the fields of
matter.
In this theory, the Lagrangian density of the gravitational field depends on the metric
tensor $\gamma^{ik}$ and the gravitational field $\phi^{ik}$. That is why it crucially differs from GR,
where the Lagrangian density depends only on the metric tensor of the Riemannian space
$g^{ik}$. Thus, in our theory, contrary to GR, the geometrization of the Lagrangian
density of the gravitational field is not complete.
2. Geometrization principle and general relations
in the relativistic theory of a tensor field
Without any loss of generality, we shall assume the tensor density $\tilde g^{ik}$ of the metric
tensor of Riemannian space-time to be a local function depending on the tensor density $\tilde\gamma^{ik}$
of the metric tensor of Minkowski space and the tensor density $\tilde\phi^{ik}$ of the gravitation field.
Let the Lagrangian density $L_M$ depend only on the fields $\Phi_A$, on their first-order
covariant derivatives, and also on the tensor density $\tilde g^{ik}$, in virtue of the geometrization
principle. The Lagrangian density of the gravitational field is taken to depend on the
tensor density $\tilde\gamma^{ik}$ and its first-order partial derivatives, as well as on the gravitation field
density $\tilde\phi^{ik}$ and its first-order covariant derivatives with respect to the Minkowski metric.
To obtain the conservation laws, we use the invariance of the action under an infinitesimal
coordinate shift. Indeed, since the action is a scalar, the variations of the actions of

matter, $\delta J_M$, and of the gravitational field, $\delta J_g$, vanish under an arbitrary
infinitesimal coordinate transformation. We first calculate the variation of the action of
matter under the transformation
$$x'^{i} = x^{i} + \xi^{i}(x), \tag{1.1}$$
$\xi^{i}$ being the infinitesimal shift four-vector:

where div denotes divergence terms inessential for our consideration.


The Euler variation is defined as usual:

Under coordinate transformation (1.1) the variations $\delta_L \tilde g^{\mu\nu}$, $\delta_L \Phi_A$ are easy to calculate,
provided one uses their transformation law:

Here and in what follows $D_\mu$ is the covariant derivative with respect to the Minkowski
metric. Putting these expressions into (1.2) and integrating by parts, one obtains

In virtue of the arbitrariness of the vector $\xi^{\mu}$, one finds from the condition $\delta J_M = 0$ a strong
identity

Introduce the notations:

Then the lhs of relation (1.5) may be represented as:

The rhs of this equality may easily be brought to the form

Proceeding from this, strong identity (1.5) may be written in the form

Due to the least action principle, the equations for matter fields have the form
$$\frac{\delta L_M}{\delta \Phi_A} = 0. \tag{1.9}$$
Taking into account the above equation, one may find from strong identity (1.8) the weak
identity:
8,(p.-$+o. (1.10)

Note, that the density of the energy-momentum tensor of matter in Riemannian space 2
is expressed v i a P as follows:

&TPu= f P o - L2g f l v F . (1.11)

Thus, expression (1.10) entails the covariant equation for matter conservation in Riemannian
space:
$$\nabla_\mu T^{\mu\nu} = 0. \tag{1.12}$$
If the number of equations for matter is four, then instead of equations for matter (1.9)
one can always use equivalent equations (1.12).
The variation of the action integral may be written in the equivalent form

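Here the variations $\delta_L \tilde\phi^{\mu\nu}$, $\delta_L \tilde\gamma^{\mu\nu}$ under coordinate transformation (1.1) equal

$$\delta_L \tilde\phi^{\mu\nu} = \tilde\phi^{\alpha\nu} D_\alpha \xi^{\mu} + \tilde\phi^{\alpha\mu} D_\alpha \xi^{\nu} - D_\alpha(\xi^{\alpha}\tilde\phi^{\mu\nu}), \tag{1.14}$$
$$\delta_L \tilde\gamma^{\mu\nu} = \tilde\gamma^{\alpha\nu} D_\alpha \xi^{\mu} + \tilde\gamma^{\alpha\mu} D_\alpha \xi^{\nu} - \tilde\gamma^{\mu\nu} D_\alpha \xi^{\alpha}. \tag{1.15}$$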
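
The shorter last term in (1.15), compared with (1.14), follows only from the covariant constancy of the Minkowski metric density. The following is a brief sketch of that step; the symbol $\tilde A^{\mu\nu}$ for a generic symmetric tensor density of weight one, and the index labels, are chosen here for illustration.

```latex
% Variation of a symmetric tensor density of weight one under x' = x + \xi:
\begin{align*}
\delta_L \tilde A^{\mu\nu}
  &= \tilde A^{\alpha\nu} D_\alpha \xi^{\mu}
   + \tilde A^{\alpha\mu} D_\alpha \xi^{\nu}
   - D_\alpha\!\left(\xi^{\alpha} \tilde A^{\mu\nu}\right). \\
% For the Minkowski metric density the covariant derivative vanishes,
% D_\alpha \tilde\gamma^{\mu\nu} = 0, so the last term reduces to
D_\alpha\!\left(\xi^{\alpha} \tilde\gamma^{\mu\nu}\right)
  &= \tilde\gamma^{\mu\nu} D_\alpha \xi^{\alpha}
   + \xi^{\alpha} D_\alpha \tilde\gamma^{\mu\nu}
   = \tilde\gamma^{\mu\nu} D_\alpha \xi^{\alpha}.
\end{align*}
```

No such simplification is available for $\tilde\phi^{\mu\nu}$, whose covariant derivative with respect to the Minkowski metric does not vanish in general, which is why (1.14) keeps the full divergence term.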

Substituting the expressions for the variations $\delta_L \tilde\phi^{\mu\nu}$, $\delta_L \tilde\gamma^{\mu\nu}$, $\delta_L \Phi_A$ into (1.13) and integrating
by parts, in virtue of the arbitrariness of $\xi^{\mu}$ one comes to the strong identity

It should be noted that this identity is valid irrespective of the fulfillment of the
equations of motion for matter and for gravitation field.
Let us introduce the notations:

Comparing identities (1.8) and (1.16), we find



(1.18)
Analogously, it follows from the invariance of the action of the gravitation field under
coordinate transformations (1.1) that:

Here

Adding expressions (1.18) and (1.19), we get

(1.21)
Here
,-PU= ,-y+ (1.22)
In virtue of the least action principle, the equations of the gravitation field have the form

Taking these equations into account, we obtain from (1.21) an extremely important
identity:
? ) = Fuu-+Puut).
T- u Y1- ~ ~ u PpvDu(
BPUVu(
(1-24)

As may easily be verified, the density of the energy-momentum tensor of matter in
Riemannian space is equal to

G T P U = 29
, P U - r - P Y T- . (1-25)

Similarly, the density of the total energy-momentum tensor in Minkowski space equals

On applying these expressions, relation (1.24) takes the form


V T, = DptYP
. (1-27)
This equality reflects the geometrization principle.
In a pseudo-Euclidean space, the covariant divergence of the sum of densities of the
energy-momentum tensors of matter and gravitation field is exactly equal to that in an
effective Riemannian space, but only of the density of the energy-momentum tensor of
matter. Provided the equations of motion for matter are satisfied, one has

D tv = V T, = 0 . (1.28)
From the covariant equation for matter conservation in Riemannian space, it is not
clear what is conserved, while from the conservation law for the total energy-momentum
tensor $t^{\mu\nu}$ in Minkowski space it is clear that the energy-momentum of both matter and
gravitation field is conserved. Thus, in this theory Riemannian space appears as a
result of the action of the gravitation field on all forms of matter. That is why it is an
effective Riemannian space of field origin. Minkowski space finds its exact physical
reflection in the conservation laws for the energy-momentum tensor and the angular
momentum of matter and gravitation field taken together.
Since there are ten Killing vectors in a flat space, there are, consequently, ten
conserved integral quantities for a closed system of fields. If the number of equations
of motion for matter is four, then instead of them we may use the equations expressing the
total energy-momentum tensor conservation in Minkowski space:
D(tL+ G U ) =o . (1.29)
This equation, together with those for the gravitation field, defines all the unknown
characteristics of matter and gravitation field. It is worth noting that both matter and gravitation
field in our theory are characterized by energy-momentum tensors. As a result, in our
theory, contrary to GR, no pseudotensors arise and, hence, there are no unphysical
concepts of the nonlocalizability of gravitation energy.
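
The count of ten Killing vectors in flat space quoted above can be made concrete with a short sketch. It assumes Cartesian (Galilean) coordinates, in which the Killing equation reads $\partial_\mu\xi_\nu + \partial_\nu\xi_\mu = 0$, and the standard parametrization $\xi_\mu = a_\mu + \omega_{\mu\nu}x^\nu$ with antisymmetric $\omega_{\mu\nu}$; the function names are illustrative and not taken from the paper.

```python
from itertools import combinations

DIM = 4  # space-time dimension


def killing_generators():
    """Basis of xi_mu = a_mu + w_{mu nu} x^nu: 4 translations and 6 antisymmetric w."""
    gens = []
    for k in range(DIM):  # translations: constant a_mu, w = 0
        a = [0] * DIM
        a[k] = 1
        gens.append((a, [[0] * DIM for _ in range(DIM)]))
    for m, n in combinations(range(DIM), 2):  # rotations and boosts: a = 0
        w = [[0] * DIM for _ in range(DIM)]
        w[m][n], w[n][m] = 1, -1
        gens.append(([0] * DIM, w))
    return gens


def xi(gen, x):
    """Covariant components xi_mu at the point x."""
    a, w = gen
    return [a[m] + sum(w[m][n] * x[n] for n in range(DIM)) for m in range(DIM)]


def gradient(gen, x):
    """grad[n][m] = d xi_m / d x^n; a unit difference is exact because xi is linear."""
    base = xi(gen, x)
    grad = []
    for n in range(DIM):
        shifted = list(x)
        shifted[n] += 1
        grad.append([s - b for s, b in zip(xi(gen, shifted), base)])
    return grad


def satisfies_killing(gen, x):
    """Check d_mu xi_nu + d_nu xi_mu = 0 for all index pairs."""
    g = gradient(gen, x)
    return all(g[m][n] + g[n][m] == 0 for m in range(DIM) for n in range(DIM))


GENERATORS = killing_generators()
POINT = [3, -1, 4, 2]  # arbitrary test point
ALL_KILLING = all(satisfies_killing(g, POINT) for g in GENERATORS)
```

The four translations correspond to energy-momentum conservation and the six antisymmetric generators to angular-momentum conservation, which together give the ten integral quantities mentioned in the text.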
If we, following Hilbert and Einstein, choose the Lagrangian density of a gravitation
field in a completely geometrized form, i.e., depending on the metric tensor $g_{\mu\nu}$ of
Riemannian space and its derivatives only, e.g.,

$$L_g = \sqrt{-g}\,R,$$
where $R$ is the scalar curvature of Riemannian space, then in virtue of the field equations
the energy-momentum tensor density of the free gravitation field in Minkowski space
would always be equal to zero:

(1.30)

Thus, it is impossible, in principle, to construct a completely geometrized Lagrangian of
the tensor physical field possessing energy and momentum. Therefore, a theory
constructed on the basis of a completely geometrized Lagrangian cannot describe the
physical gravitation field of Faraday-Maxwell type in Minkowski space. As was stated
in the literature (see, e.g., Ref. 8)), in Minkowski space one can unambiguously find with the
help of the spin-2 tensor field the GR gravitation field Lagrangian which equals the scalar
curvature $R$. However, these papers have no physical sense, as long as the
energy-momentum tensor is zero, as is seen from (1.30). Therefore, all these papers are
meaningless from the viewpoint of physics, and their results are inaccurate.
3. Relation between canonical energy-momentum tensor
and Hilbert tensor
The Lagrangian density of the gravitational field depends on the metric tensor density
$\tilde\gamma^{\mu\nu}$, the density $\tilde\phi^{\mu\nu}$ of the tensor gravitation field and on their first-order derivatives.

Under coordinate transformation (1.1), the action variation $\delta J_g$ is zero and, hence,

Here
JA=-[a~~+K~ADa~p,
where the density of the canonical tensor r a A is equal to

and the density of the tensor of rank three K, is

Putting into (2.1) formulae (1.14) and (1.15) for the variations $\delta_L \tilde\phi^{\mu\nu}$, $\delta_L \tilde\gamma^{\mu\nu}$, we shall
obtain in virtue of the arbitrariness of the volume $\Omega$ the following strong identity:

Since the shift vector $\xi^{\alpha}$ is arbitrary, the last expression leads to the following identities:

D, raL= - L6P
D a P y , (2.6)

l3L -
rpa- D A K , =d ~a ~ - 8 6Lg -or 6L a 6Lg -ur
c a ~ d + 2 g J P a - 6 p6 p b r Y , (2.7)
66 p 64
KFA=- K i a . (2-8)
Henceforth our theory is based on the linear relation between the metric tensor density
$\tilde g^{\mu\nu}$ of the effective Riemannian space and the density $\tilde\phi^{\mu\nu}$ of the tensor gravitation field
$$\tilde g^{\mu\nu} = \tilde\gamma^{\mu\nu} + \tilde\phi^{\mu\nu}. \tag{2.9}$$
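
To make the linear relation (2.9) concrete, the sketch below recovers the effective metric from the two densities. It relies on a fact that holds only in four dimensions: since $\tilde g^{\mu\nu} = \sqrt{-g}\,g^{\mu\nu}$, one has $\det\tilde g^{\mu\nu} = (\sqrt{-g})^4/g = g$, so $\sqrt{-g}$ can be read off from the density itself. The numerical field values and the helper names are illustrative assumptions, not data from the paper.

```python
import math


def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(m) if r != i]


def det(m):
    """Determinant by Laplace expansion along the first row (adequate for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def inverse(m):
    """Inverse from the adjugate: inv[i][j] = cofactor(j, i) / det(m)."""
    d = det(m)
    n = len(m)
    return [[(-1) ** (i + j) * det(minor(m, j, i)) / d for j in range(n)]
            for i in range(n)]


def effective_metric(gamma_density, phi_density):
    """Return (g~^{mu nu}, g^{mu nu}, g_{mu nu}) from g~ = gamma~ + phi~ in 4 dimensions."""
    n = len(gamma_density)
    g_density = [[gamma_density[a][b] + phi_density[a][b] for b in range(n)]
                 for a in range(n)]
    g_det = det(g_density)  # equals det(g_{mu nu}) when n == 4
    if g_det >= 0:
        raise ValueError("density does not correspond to a Lorentzian metric")
    root = math.sqrt(-g_det)  # sqrt(-g)
    g_upper = [[x / root for x in row] for row in g_density]
    g_lower = inverse(g_upper)
    return g_density, g_upper, g_lower


# Minkowski density in Cartesian coordinates, where sqrt(-gamma) = 1.
GAMMA = [[1.0, 0, 0, 0], [0, -1.0, 0, 0], [0, 0, -1.0, 0], [0, 0, 0, -1.0]]
# Hypothetical constant field density, chosen only to exercise the formulas.
PHI = [[3.0, 1.0, 0, 0], [1.0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

G_DENSITY, G_UPPER, G_LOWER = effective_metric(GAMMA, PHI)
```

A useful consistency check is that the determinant of the recovered covariant metric reproduces the determinant of the density, as the four-dimensional identity above requires.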
In this case we shall obtain the equalities:

Using these equalities we find

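Here $\gamma^\lambda_{\mu\nu}$ are the Christoffel symbols of Minkowski space:

$\gamma^\lambda_{\mu\nu} = \frac{1}{2}\gamma^{\lambda\sigma}(\partial_\mu\gamma_{\sigma\nu} + \partial_\nu\gamma_{\sigma\mu} - \partial_\sigma\gamma_{\mu\nu})$ .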

As a result of elementary calculations, one obtains for IT,"^ the following expression:

Since the density of the energy-momentum tensor of the gravitation field is

identity (2.7) may be written in the form

It is just the identity which establishes the relation between the Hilbert tensor density in
Minkowski space and that of the canonical energy momentum tensor.
For further use it is convenient to introduce the following quantity as a characteristic of the gravitation field:

$t_\alpha^\mu = \tau_\alpha^\mu - D_\lambda K_\alpha^{\mu\lambda}$ . (2.13)
By virtue of identity (2.12), this quantity coincides exactly with the density of the Hilbert energy-momentum tensor in the case of a free gravitation field.

§ 4. The basic identity

As was shown in Ref. 9), the symmetric tensor $\phi_{ik}$ of rank two may be represented as the sum of irreducible representations: one with spin 2, one with spin 1 and two with spin 0:
$\phi_{ik} = [P_2 + P_1 + P_0 + P_{0'}]_{ik}^{lm}\,\phi_{lm}$ . (3.1)

The quantities $P_s$ are conveniently written in the momentum representation. Let us introduce the auxiliary operators

$X_{ik} = \frac{1}{\sqrt{3}}\left[\gamma_{ik} - \frac{q_i q_k}{q^2}\right]$ , $Y_{ik} = \frac{q_i q_k}{q^2}$ , (3.2)

with the help of which $P_s$ may be represented in the form

$P_0 = X_{ni}X_{ml}$ , $P_{0'} = Y_{ni}Y_{ml}$ , (3.3)

$P_2 = \frac{3}{2}\left[X_{il}X_{nm} + X_{im}X_{nl}\right] - X_{ni}X_{ml}$ . (3.5)

In the x-representation the projection operators $P_s$ are nonlocal integro-differential ones:

~ ~ 4 ~ ~ = J d 4 yY )~4 m, l (~Y )(. x ,

With the help of expressions (3.3)-(3.5) one may easily verify that only the operators $P_2$ and $P_0$ are conserved:
$q^l (P_2)_{ni,ml} = q^l (P_0)_{ni,ml} = 0$ , $q^m (P_2)_{ni,ml} = q^m (P_0)_{ni,ml} = 0$ . (3.6)
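The algebra of the spin projectors can be checked numerically. The following is only an illustrative sketch under stated assumptions: it works in Galilean coordinates with $\gamma_{ik} = \mathrm{diag}(1,-1,-1,-1)$, takes one fixed non-null momentum $q$, builds $P_2$, $P_0$ and $P_{0'}$ from the auxiliary operators $X$ and $Y$, and assumes for the spin-1 projector the standard expression $P_1 = \frac{\sqrt{3}}{2}(X_{il}Y_{nm} + X_{im}Y_{nl} + X_{nl}Y_{im} + X_{nm}Y_{il})$, which is not quoted in the text. All function names are hypothetical.

```python
import itertools
import math

ETA = [1.0, -1.0, -1.0, -1.0]   # diagonal of the Minkowski metric (Galilean coordinates)
R = range(4)
IDX = list(itertools.product(R, repeat=4))


def eta(a, b):
    return ETA[a] if a == b else 0.0


def build_projectors(q_up):
    """Return P2, P1, P0, P0' as dicts keyed by the index tuple (n, i, m, l)."""
    q_lo = [ETA[i] * q_up[i] for i in R]
    q2 = sum(q_up[i] * q_lo[i] for i in R)
    if abs(q2) < 1e-12:
        raise ValueError("q must not be a null vector")
    s3 = math.sqrt(3.0)
    # Auxiliary operators: Y is longitudinal, X is the transverse part divided by sqrt(3).
    Y = [[q_lo[i] * q_lo[k] / q2 for k in R] for i in R]
    X = [[(eta(i, k) - Y[i][k]) / s3 for k in R] for i in R]
    P0 = {(n, i, m, l): X[n][i] * X[m][l] for n, i, m, l in IDX}
    P0p = {(n, i, m, l): Y[n][i] * Y[m][l] for n, i, m, l in IDX}
    P2 = {(n, i, m, l): 1.5 * (X[i][l] * X[n][m] + X[i][m] * X[n][l]) - X[n][i] * X[m][l]
          for n, i, m, l in IDX}
    # Assumed standard spin-1 projector (not taken from the text).
    P1 = {(n, i, m, l): 0.5 * s3 * (X[i][l] * Y[n][m] + X[i][m] * Y[n][l]
                                    + X[n][l] * Y[i][m] + X[n][m] * Y[i][l])
          for n, i, m, l in IDX}
    return {"P2": P2, "P1": P1, "P0": P0, "P0'": P0p}


def compose(A, B):
    """Operator product: contract the second index pair of A with the first pair of B."""
    return {(n, i, j, k): sum(A[n, i, m, l] * ETA[m] * ETA[l] * B[m, l, j, k]
                              for m in R for l in R)
            for n, i, j, k in IDX}


def identity():
    """Unit operator on symmetric rank-two tensors."""
    return {(n, i, m, l): 0.5 * (eta(n, m) * eta(i, l) + eta(n, l) * eta(i, m))
            for n, i, m, l in IDX}


def trace(P):
    """Number of independent components carried by the projector."""
    return sum(P[n, i, n, i] * ETA[n] * ETA[i] for n in R for i in R)


def max_abs_diff(A, B):
    return max(abs(A[key] - B[key]) for key in IDX)


def max_contraction(P, q_up):
    """Largest component of q^l P_{ni,ml}; zero for a conserved (transverse) operator."""
    return max(abs(sum(q_up[l] * P[n, i, m, l] for l in R))
               for n in R for i in R for m in R)
```

With these helpers one can confirm that the four operators are mutually orthogonal projectors summing to the unit operator, that they carry 5, 3, 1 and 1 components respectively, and that only $P_2$ and $P_0$ vanish on contraction with $q$.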
As may be easily verified, the tensor field admits only one local operator of lowest order which is linear in the field. It equals
$f_{ik} = \Box\,[(P_2 - 2P_0)\phi]_{ik}$ (3.7)
and its divergence is identically equal to zero:
$\partial^i f_{ik} = 0$ . (3.8)
The field $f_{ik}$ describes only spins 2 and 0; in a more detailed form,
$f_{ik} = \Box\theta_{ik} - \partial_i\partial^m\theta_{mk} - \partial_k\partial^m\theta_{mi} + \gamma_{ik}\partial^m\partial^l\theta_{ml}$ , (3.9)
$\theta_{ik} = \phi_{ik} - \frac{1}{2}\gamma_{ik}\phi$ . (3.10)

In terms of covariant derivatives this operator becomes

$J^{mn} = D_p D_q(\tilde g^{qn}\gamma^{pm} + \tilde g^{qm}\gamma^{pn} - \tilde g^{mn}\gamma^{pq} - \tilde g^{pq}\gamma^{mn})$ . (3.11)
Or, in another form,
$J^{mn} = -\gamma^{pq}D_pD_q\tilde g^{mn} + D^mD_q\tilde g^{qn} + D^nD_q\tilde g^{qm} - \gamma^{mn}D_pD_q\tilde g^{pq}$ .
One may easily verify that the following identity holds:
$D_m J^{mn} \equiv 0$ . (3.12)
We have called it basic, since it is of fundamental importance for the construction of the relativistic theory of gravity.
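In Galilean coordinates the basic identity becomes purely algebraic after the plane-wave substitution $D_p \to i q_p$. The sketch below is an illustration only, assuming the operator has the form $J^{mn} = -\gamma^{pq}D_pD_q\tilde g^{mn} + D^mD_q\tilde g^{qn} + D^nD_q\tilde g^{qm} - \gamma^{mn}D_pD_q\tilde g^{pq}$; the common factor $i^2 = -1$ is dropped, and the function names are hypothetical. It evaluates the momentum-space image of $J^{mn}$ for an arbitrary symmetric array and contracts it with $q_m$.

```python
import random

ETA = [1.0, -1.0, -1.0, -1.0]   # diagonal Minkowski metric gamma^{mn} in Galilean coordinates
R = range(4)


def basic_operator(q_lo, g_up):
    """Momentum-space image of J^{mn} (overall factor -1 dropped) for a symmetric g^{mn}."""
    q_up = [ETA[m] * q_lo[m] for m in R]
    q2 = sum(q_up[m] * q_lo[m] for m in R)
    qg = [sum(q_lo[p] * g_up[p][n] for p in R) for n in R]   # q_p g^{pn}
    qgq = sum(q_lo[n] * qg[n] for n in R)                    # q_p q_q g^{pq}
    return [[q_up[m] * qg[n] + q_up[n] * qg[m]
             - q2 * g_up[m][n]
             - (ETA[m] if m == n else 0.0) * qgq
             for n in R] for m in R]


def divergence(q_lo, J):
    """Contraction q_m J^{mn}; it vanishes identically for the basic operator."""
    return [sum(q_lo[m] * J[m][n] for m in R) for n in R]


def random_symmetric(rng):
    a = [[rng.uniform(-1.0, 1.0) for _ in R] for _ in R]
    return [[0.5 * (a[m][n] + a[n][m]) for n in R] for m in R]
```

The divergence vanishes for every symmetric input, without any field equation or gauge condition being imposed, which is what makes the identity usable as the left-hand side of a field equation with a conserved source.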

§ 5. Equations of the relativistic theory of gravitation


In the present section we construct relativistic equations for matter and for the gravitation field in the framework of special relativity and the geometrization principle.
The relation between the effective metric of the field Riemannian space and the gravitation field may be chosen in the simplest form:
$\tilde g^{ik} = \sqrt{-g}\,g^{ik} = \sqrt{-\gamma}\,\gamma^{ik} + \sqrt{-\gamma}\,\phi^{ik}$ . (4.1)
In our theory, the field variable of the gravitation field is the tensor $\phi^{ik}$. Let us assume that in the general case the gravitation field has only spins 2 and 0 and that a free gravitation field has spin 2. Such physical requirements, as is seen in § 4, lead in Galilean coordinates to the following four equations for the gravitational field:
$\partial_i\phi^{ik} = \partial_i\tilde g^{ik} = 0$ . (4.2)
Similar conditions were sometimes used in the GR as a specific class of
harmonic coordinate conditions to solve island-type problems. It was Fock who paid special attention to the importance of harmonic coordinate conditions in solving island problems. He wrote as follows: "The above remarks concerning the privileged character of the harmonic system of coordinates should not be understood, in any case, as some kind of prohibition of the use of other coordinate systems. Nothing is more alien to our point of view than such an interpretation." He went on: "Likewise, in the case of the Theory of Gravitation, the existence of harmonic coordinates, defined apart from a Lorentz transformation, though a fact of primary theoretical and practical importance, does not in any way preclude the use of other, non-harmonic, coordinate systems." From the point of view of our theory, when solving island problems, Fock was unconsciously working with ordinary Galilean coordinates in an inertial reference frame, which are, as is known from special relativity, definitely singled-out coordinates. Therefore, in Fock's calculations for island systems the harmonic conditions were not coordinate conditions, as he thought them to be, but, as will be seen from our theory, field equations in Galilean coordinates of an inertial reference frame. It was due to this very fact that they played such an important role in his specific calculations, which neither Fock nor others even suspected.
Thus, Fock considered harmonic conditions to be no more than privileged coordinate conditions applicable to island-type problems only. This is quite natural, since he and all his eminent predecessors were captured by Riemannian geometry, which basically gave no possibility of deeper insight into the problem. In order to make a step ahead and impose these conditions as universal covariant field equations, it was necessary to give up the ideology of GR, leave the maze of Riemannian geometry and apply the special principle of relativity in defiance of GR, as well as to introduce the concept of the gravitation field as a Faraday-Maxwell physical field possessing energy and momentum. All of this was realized in our theory, with the choice of coordinates being arbitrary and set only by the metric tensor $\gamma^{ik}$ of Minkowski space, as is generally accepted in elementary particle theory. As for Eqs. (4.2), in our theory they are comprehensive and universal because they are gravitation field equations. They have nothing to do with the choice of coordinates. In Minkowski space these equations are written in the covariant form

Judging by § 3, we conclude that these field equations automatically exclude spins 1 and 0' from the gravitational tensor field. Thus, we have already constructed four covariant equations (4.3) for the fourteen unknown variables of the gravitation field and of matter. To construct the other ten equations, we draw a simple but far-reaching analogy with the electromagnetic field. As is known, the Maxwell equations of electrodynamics may be written in the covariant form as follows:

Due to conservation of electromagnetic current one has

$D_\nu j^\nu = 0$ . (4.5)

The left-hand side of the Maxwell equations (4.4) is constructed in such a way that its divergence is identically zero. It follows that spin 0 is excluded from the vector field with spins 1 and 0. By analogy with electrodynamics, we shall construct equations for the tensor gravitation field. The only tensor of the second rank which is conserved is the energy-momentum tensor of matter and the gravitation field in Minkowski space:
$D_m(t_g^{mn} + t_M^{mn}) = 0$ . (4.6)
Therefore, it would be natural to choose it as the complete source of the gravitation field. We have already obtained four equations (4.3) describing the gravitation field. Therefore, in order to achieve a complete description of the ten unknown field variables and of the four variables of matter, it is necessary to write ten more equations. As proved in § 4, the only identically conserved linear tensor operator is $J^{mn}$. Hence, by analogy with electrodynamics, for the remaining ten equations one should necessarily take the following ones:
$D_pD_q[\tilde g^{qn}\gamma^{pm} + \tilde g^{qm}\gamma^{pn} - \tilde g^{mn}\gamma^{pq} - \tilde g^{pq}\gamma^{mn}] = \lambda(t_g^{mn} + t_M^{mn})$ . (4.7)
Equations of this type guarantee automatically that the conservation law for the energy-momentum tensor of matter and the gravitational field is fulfilled in Minkowski space:
$D_m(t_g^{mn} + t_M^{mn}) = 0$ . (4.8)
Besides, as a consequence, the covariant equation for matter conservation in Riemannian space is also satisfied:
$\nabla_m T^{mn} = 0$ . (4.9)
Since the divergence of the left-hand side of Eq. (4.7) is identically zero, four components corresponding to spins 1 and 0' are excluded from the tensor field $\phi^{ik}$ containing representations with spins 2, 1, 0, 0'. The systems of Eqs. (4.3) and (4.7) determine completely the unknown variables of matter and the gravitation field. The principal problem is to find out whether there is a Lagrangian density for the gravitation field with spins 2 and 0 which would automatically lead to equations (4.7) by virtue of the least action principle. The most general Lagrangian density of the gravitation field $\phi^{ik}$ describing spins 2 and 0 and quadratic in the first-order derivatives has the form

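$L_g = a\,\tilde g_{mk}\tilde g_{nq}\tilde g^{lp}D_l\tilde g^{kq}D_p\tilde g^{mn} + b\,\tilde g_{kq}D_m\tilde g^{pq}D_p\tilde g^{mk} + c\,\tilde g_{mk}\tilde g_{nq}\tilde g^{lp}D_l\tilde g^{mk}D_p\tilde g^{nq}$ . (4.10)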

The contraction of the covariant derivatives taken with respect to the Minkowski metric is realized with the help of the effective metric tensor $g_{ik}$ of Riemannian space. In this way we guarantee that the gravitation field acts on itself in the same way as it acts on matter. The constants a, b and c in the Lagrangian are arbitrary so far.
Because of the geometrization principle, the Lagrangian density of matter is
$L_M = L_M(\tilde g^{ik}, \phi_A)$ . (4.11)
In virtue of the least action principle, the system of equations for a gravitation field is as
follows:

(4.12)

Here relation (4.1) was taken into account. For this system of equations to be presented in the form of (4.7), it is necessary to choose the constants a, b and c of the Lagrangian density in a definite and unique way.
This suggests that the Lagrangian of the gravitation field with spins 2 and 0 in Minkowski space is determined unambiguously. In order to make such a choice of the coefficients a, b and c, let us calculate the density of the energy-momentum tensor for matter and the gravitational field.
Let us introduce the notations

(4-13)

Taking into account the definitions of 7"" we find

(4.14)

Similarly

(4-15)

Using equality (4-14) we obtain

Calculating the variation of the total Lagrangian with respect to $\gamma_{mn}$ and taking the field equations into account, we obtain

where
H?=( g"DiG*"+ Gn'Dig*? g h u . (4-19)

In order that no new equations for the field $\phi^{ik}$ arise from the equality
$D_m t^{mn} = 0$ ,
which would otherwise lead to an overdetermination of the system of equations, it is necessary and sufficient for the coefficients a, b and c to satisfy the following conditions:
$a = -b/2$ , $c = b/4$ . (4.20)

Thus, with such a choice of the constants one comes to the identity
$D_m t^{mn} = 0$ ,
which is contained in Eq. (4.7). With regard to the choice of the coefficients as in (4.20),

expression (4.18) takes the form


$D_pD_q(\tilde g^{qn}\gamma^{pm} + \tilde g^{qm}\gamma^{pn} - \tilde g^{mn}\gamma^{pq} - \tilde g^{pq}\gamma^{mn}) = \frac{1}{2b}(t_g^{mn} + t_M^{mn})$ , (4.21)

which coincides with Eq. (4.7), obtained earlier by analogy with electrodynamics, provided one puts
$2b = 1/\lambda$ .
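So, the only Lagrangian density of the form
$L_g = -\frac{1}{32\pi}\left[\tilde g_{kq}D_m\tilde g^{pq}D_p\tilde g^{km} - \frac{1}{2}\tilde g_{km}\tilde g_{nq}\tilde g^{lp}D_l\tilde g^{kq}D_p\tilde g^{mn} + \frac{1}{4}\tilde g_{km}\tilde g_{nq}\tilde g^{lp}D_l\tilde g^{km}D_p\tilde g^{nq}\right]$ (4.22)
leads to the field equations in the form of (4.21). According to the correspondence principle, the constant $\lambda$ is chosen to be equal to
$\lambda = -16\pi$ . (4.23)
This Lagrangian density can be presented in the form
$L_g = \frac{1}{32\pi}\left[\tilde G^l_{mn}D_l\tilde g^{mn} - \tilde g^{mn}\tilde G^k_{mk}\tilde G^l_{nl}\right]$ , (4.24)

where the tensor of rank three, $\tilde G^k_{lm}$, is defined by the formula

$\tilde G^k_{lm} = \frac{1}{2}\tilde g^{kp}(D_m\tilde g_{lp} + D_l\tilde g_{mp} - D_p\tilde g_{lm})$ . (4.25)

It may also be presented in the form

$L_g = -\frac{1}{16\pi}\sqrt{-g}\,g^{mn}\left[G^k_{lm}G^l_{kn} - G^l_{mn}G^k_{lk}\right]$ . (4.26)

Such a Lagrangian was considered for the first time by Rosen in Ref. 2). In (4.26) the tensor of the third rank, $G^k_{lm}$, is equal to
$G^k_{lm} = \frac{1}{2}g^{kp}(D_m g_{pl} + D_l g_{pm} - D_p g_{lm})$ . (4.27)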

With consideration for Eq. (4.3), the complete system of equations for matter and gravitational field will be 6)
$\gamma^{pq}D_pD_q\tilde g^{mn} = 16\pi(t_g^{mn} + t_M^{mn})$ , (4.28)
$D_m\tilde g^{mn} = 0$ , (4.29)
or, in the Galilean coordinate system,
$\Box\tilde g^{mn} = 16\pi(t_g^{mn} + t_M^{mn})$ , $\partial_m\tilde g^{mn} = 0$ .
Should we confine ourselves only to the first system of equations, (4.28), then the separation of the Riemannian space metric into the Minkowski space metric and the tensor gravitational field would be of a conditional character and would not have any physical meaning. The second system of four field equations, (4.29), separates decisively all that relates to the inertia forces from all that is connected with the gravitational field. The two systems of Eqs. (4.28) and (4.29) are generally covariant. The corresponding physical conditions on the behaviour of the gravitational field are imposed, as usual, within a given coordinate system, for example a Galilean one. In the framework of GR, one cannot formulate such conditions for the metric $g^{mn}$ while remaining in Riemannian space, since the asymptotics of the metric always depends on the choice of the three-dimensional coordinate system. It should also be noted that the equations of motion of matter are contained in the given system of equations. The density of the energy-momentum tensor of the gravitational field in Minkowski space is equal to

(4.30)

for Lagrangian density (4.22). Here, as we can see, there automatically appears the second-rank curvature tensor $R_{pq}$ of Riemannian space. Similarly, the tensor density of the matter energy-momentum in Minkowski space is equal to

for Lagrangian density (4.11).


In deriving expressions (4-30) and (4.31) we used the identity

and the equality

which directly follows from the expression for coupling (4.1). Substituting the expressions for the energy-momentum tensors of matter and gravitational field into field equations (4.7), we transform them to the form of the Hilbert-Einstein equations
8x( T p u - 1
TgfiuT)
(4.34)

Thus, the system of Eqs. (4.7) is equivalent to the system of Hilbert-Einstein equations. The complete system of equations for matter and gravitational field, (4.28) and (4.29), is equivalent to the system of equations

$D_m\tilde g^{mn} = 0$ . (4.36)
It is worth mentioning that Eqs. (4.36) are general and universal, since they are field equations describing a gravitational field with spins 2 and 0. The choice of a reference frame (or coordinate system) is determined by the metric tensor $\gamma^{mn}$ of Minkowski space. Hence, Eqs. (4.36) do not impose any restrictions on the choice of a coordinate system. Consequently, the system of Eqs. (4.36) excludes spins 1 and 0' in the density of the tensor field $\tilde\phi^{mn}$, leaving only spins 2 and 0. The required six components of the gravitational field, corresponding to these spins, and four components of matter are defined from field equations (4.28) or from the equivalent Hilbert-Einstein equations (4.35). The system of equations for the gravitational field, (4.28) and (4.29), may be expressed in a somewhat different form through the Hilbert energy-momentum tensor density in Riemannian space. For this purpose, however, we have to obtain some relations using the specific expression (4.24) obtained earlier for the Lagrangian density of the gravitational field:

where

Using these expressions, we calculate then the density of the tensor of the third rank,
K F , with formula (2.10).
With account for the equality

Using this expression and definition (2.13), we shall have for 5,the following expression:
1
$,, = rpa--D~oF
16n
- (4-38)

where the density of the asymmetric tensor '


:
0 is equal to
&A= gpuD6[@A"Ga!J-
,augAu], (4.39)
and the well-known expression (3.11) is denoted through J a u .
Expression (4.38) is exactly the one we will need in what follows. In preparation for further calculations we now derive an identity often used in the literature. In Galilean coordinates, the Lagrangian density of the gravitational field, (4.24), takes the form

where in this case

The quantities $\tilde G^l_{mn}$ are tensors of the third rank with respect to linear transformations of the coordinates. Therefore $L_g$ will be a scalar density with respect to the same transformations. From the invariance of the action with respect to linear transformations, we have

6J g = /d4x d, J
D
[ ++L 8] =0 . (4.40)

Here
J = - t n r a A
f l?$A6a[p, (4.41)
where the density of the canonical tensor r: is equal to

aLg ,
raA= -62Lg+&GPua(a,gpu) (4.42)

and the density of the tensor of the third rank, l??, is in this case

Rp=2 a(a,.&aL) gaY-6/ta dLg gcr.


a ( d A8 g=) (4.43)

Substituting into (4.40) the formula for the variation $\delta_L\tilde g^{\mu\nu}$ with respect to linear transformations, we obtain, by virtue of the arbitrariness of the volume $\Omega$, the identity

from which the following identities follow directly:

Since

then from identity (4-46) we have

rpa-adl&A= --[J-s
8n R p a - 1~ 8 p a R.] (4.48)

Considering the equality

and the expressions for &, in Galilean coordinates we will come to


16nKf = a,( 6 p A GUY- 6 p g a A ) + 0, , (4.49)

where 0, is the density of the antisymmetric tensor,


O,A= gpua,(~ A u g a u - g a u g i y ) . (4.50)

Substituting expression (4.49) into (4.48), we obtain the identity

In the curvature tensor one can always identically replace the ordinary derivatives by the derivatives covariant in the Minkowski metric, leaving it unchanged; therefore expression (4.51) may be presented in the covariant form:

In this case the canonical tensor density in (4.52) will be equal to expression (2.3):

where the Lagrangian density $L_g$ is already presented in terms of the derivatives covariant in the Minkowski metric, (4.24).
Using identity (4.52) we may present expression (4.38) in the form

(4.53)

As we have already established, the system of equations of matter and gravitational field, (4.28) and (4.29), is equivalent to the system of Eqs. (4.35) and (4.36). With the help of expression (4.53) the system of equations of matter and gravitational field may also be rewritten in another equivalent form:

Here $T_\mu^\nu$ is the Hilbert energy-momentum tensor density (1.6) for matter in Riemannian space. It is quite obvious that, by virtue of (4.54) and (4.55), the conservation law for the energy-momentum tensor of matter and gravitational field has the form

D,(T,'+k)=O. (4.56)
The covariant matter conservation law in Riemannian space may identically be presented
in the form

V T,,"= 8, T,,"-+ TPb&g,a' D , T,,' - GivTi'= 0 . (4.57)

From comparison of (4-56) and (4-57),we have

As seen from this expression, matter acquires energy and momentum directly from the gravitational field, the total energy-momentum tensor of matter and gravitational field always being rigorously conserved. The construction of RTG on the basis of Minkowski space and the geometrization principle allowed us to deal only with covariant quantities at every stage of our reasoning. Here we give briefly some of the results following from this theory. The post-Newtonian parameters in it are equal to


$\gamma = \beta = 1$ , $\alpha_1 = \alpha_2 = \alpha_3 = \zeta_1 = \zeta_2 = \zeta_3 = \zeta_4 = \zeta_W = 0$ .
Therefore the theory describes the whole set of gravitation experiments available at present. It strictly satisfies the correspondence principle and predicts the existence of gravitational waves in the Faraday-Maxwell spirit, which possess both energy and momentum. By virtue of the geometrization principle, the curvature of the effective field Riemannian space appears as a result of the action of the gravitational field on matter. Gravitational waves in this theory propagate like electromagnetic ones. Since our theory is based on the special relativity principle, the inertial mass of the system is well defined,

$m = \int d^3x\,[t_g^{00} + t_M^{00}]$ ,

and is a scalar with respect to three-dimensional transformations of coordinates. The quantity

$P^i = \int d^3x\,[t_g^{0i} + t_M^{0i}]$

is an energy-momentum four-vector with respect to any coordinate transformations; similarly, the angular momentum is also a tensor with respect to any coordinate transformations in four-dimensional Minkowski space. It may also be shown that for any static island system, the inertial mass is exactly equal to its active gravitating mass. The given theory has extraordinary predictive force: it leads to a strictly definite development of the Universe. 6) According to it, the Universe is not closed; it is flat by virtue of Eqs. (4.29):
$ds^2 = c^2 d\tau^2 - V(\tau)(dx^2 + dy^2 + dz^2)$ .
The expansion of the Universe is defined by the function $V(\tau)$, which is easily calculated from the field equations. Equations (4.28) and (4.29) readily show that the total energy density of matter and gravitational field is zero at every instant of the development of the Universe. The present density of all forms of matter, $\rho_0$, should be equal to its critical density $\rho_c$,
$\rho_0 = \rho_c = 3H^2/8\pi G$ ,

where H is the Hubble constant.
The Universe expands infinitely, the deceleration parameter $q$ being equal to
$q = 1/2$ .
The age T of the Universe is defined by the formula
$T = 2/(3H)$ .
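To give a feeling for the numbers, the formulas for the critical density and the age can be evaluated directly. The sketch below is illustrative only: the value H = 70 km/s/Mpc, the physical constants and all function names are assumptions not taken from the text. It also checks that a power-law expansion with scale factor proportional to $\tau^{2/3}$ (the flat, pressureless-matter case, assumed here because it reproduces both quoted relations) gives a deceleration parameter of 1/2 and an age of 2/(3H).

```python
import math

G_NEWTON = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
METRES_PER_MPC = 3.0857e22  # one megaparsec in metres
SECONDS_PER_GYR = 3.156e16  # one billion years in seconds


def hubble_si(h0_km_s_mpc):
    """Convert a Hubble constant from km/s/Mpc to 1/s."""
    return h0_km_s_mpc * 1.0e3 / METRES_PER_MPC


def critical_density(hubble):
    """rho_c = 3 H^2 / (8 pi G), in kg/m^3 for H in 1/s."""
    return 3.0 * hubble ** 2 / (8.0 * math.pi * G_NEWTON)


def age_flat_universe(hubble):
    """T = 2 / (3 H), in seconds for H in 1/s."""
    return 2.0 / (3.0 * hubble)


def power_law_parameters(n, t):
    """For a scale factor a(t) = t**n return (H, q), H = a'/a, q = -a a''/a'**2."""
    a = t ** n
    a1 = n * t ** (n - 1)
    a2 = n * (n - 1) * t ** (n - 2)
    return a1 / a, -a * a2 / a1 ** 2


H0 = hubble_si(70.0)                           # assumed illustrative value
RHO_C = critical_density(H0)                   # kg/m^3
AGE_GYR = age_flat_universe(H0) / SECONDS_PER_GYR
```

For the assumed Hubble constant the critical density comes out at roughly $10^{-26}$ kg/m^3 and the age at somewhat under ten billion years; both scale simply with H, as $H^2$ and $1/H$ respectively.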
The density of the observed usual matter is much smaller than the critical density $\rho_c$. Since the given theory is based on fundamental general physical principles and is constructed unambiguously, its predictions on the character of the development of the Universe are so general that they necessarily demand the existence of a missing mass in some form of matter. Hence, in the Universe there must exist a missing mass such that the total density of matter is equal to the critical value $\rho_c$.

Acknowledgements

The authors express their deep gratitude to V. A. Ambartsumyan, N . N . Bogolubov,


A. A. Vlasov, S. S. Gershtein and A. N. Tavkhelidze for valuable discussions.

References
1) A. A. Logunov et al., Theor. Math. Phys. 40 (1979), 291.
V. I. Denisov and A. A. Logunov, Theor. Math. Phys. 50 (1982), 3; Elementary Particle and Atomic Nucleus Physics (1982), v. 13, part 3, p. 757; Modern Problems of Mathematics (Moscow, VINITI of Acad. of Sciences, USSR, 1982), v. 21.
2) N. Rosen, Phys. Rev. 57 (1940), 147; Ann. of Phys. 22 (1963), 1.
A. Papapetrou, Proc. Roy. Irish Acad. A52 (1948), 11.
S. Gupta, Proc. Roy. Soc. A65 (1952), 608.
W. Thirring, Ann. of Phys. 16 (1961), 69.
3) H. Poincaré, Present and Future of Mathematical Physics (Bulletin des Sciences Mathématiques, December, 1904), v. 28, ser. 2, p. 302; The Monist (January, 1905), v. XV, No. 1; Relativity Principle, ed. A. A. Tyapkin (Atomizdat, Moscow, 1973).
4) A. A. Logunov, Lectures on the Theory of Relativity (Moscow Univ., 1984), (in Russian).
5) A. Einstein, Collected Works (Nauka, Moscow, 1965), v. I.
W. Pauli, Theory of Relativity (Pergamon Press, 1965).
C. Møller, The Theory of Relativity (Clarendon Press, Oxford, 1972).
L. I. Mandelstam, Lectures on Optics, Relativity Theory and Quantum Mechanics (Nauka, Moscow, 1973), p. 218.
6) A. A. Logunov and A. A. Vlasov, Minkowski Space as the Basis of the Physical Gravitation Theory (Moscow Univ., Moscow, 1984); TMF (1984), v. 60, p. 319; Spherically Symmetric Solution in Gravitation Theory Based on Minkowski Space (Moscow Univ., Moscow, 1984); TMF (1984), v. 60, p. 163.
A. A. Vlasov, A. A. Logunov and M. A. Mestvirishvili, IHEP Preprint 84-156 (Serpukhov, 1984).
7) V. A. Fock, Theory of Space, Time and Gravitation (Pergamon Press, London, 1959).
8) V. I. Ogievetsky and I. V. Polubarinov, Ann. of Phys. 35 (1965), 167; JINR Preprint P-2106 (1965).
9) C. Fronsdal, Suppl. Nuovo Cim. 9 (1958), 416.
K. J. Barnes, J. Math. Phys. 6 (1965), 788.
10) De-Donder, La Gravifique Einsteinienne (Paris, 1921); Théorie des Champs Gravifiques (Paris, 1926).
V. A. Fock, J. of Phys. 1 (1939), 81; Rev. Mod. Phys. 29 (1957), 235.

Authors' comments:
'.....I think it appropriate to say that in the paper "Relativistic Theory of Gravitation"
some aspects that are of great importance and appeared in the later works had not
been taken into consideration. Therefore, the book "The Theory of Gravity" Moscow
NAUKA, 2001 (also see gr-qc/0210005 V2 21 Oct 2002), is more useful and
complete.'

Yang-Mills Gravity: A Union of the Einstein-Grossmann Metric with Yang-Mills Tensor Fields in Flat Spacetime with Translation Symmetry

Jong-Ping Hsu
Department of Physics,
University of Massachusetts Dartmouth
North Dartmouth, MA 02747-2300, USA

Based on spacetime translation gauge symmetry, we discuss a new Yang-Mills gravity for interacting spinor and tensor fields in flat spacetime. The theory possesses both a Yang-Mills action with quadratic gauge curvature of spin-2 fields and a particle action involving an effective Einstein-Grossmann metric for the motions of matter. The effective Einstein-Grossmann metric is naturally and completely determined by the translation gauge symmetry in flat spacetime. Such a Yang-Mills gravity holds in both inertial and non-inertial frames. Its interaction Lagrangian of the spin-2 gauge field has a maximum coupling of 4-vertex and is much simpler than that in the Hilbert-Einstein action of metric fields. The results of the post-Newtonian approximation are consistent with gravitational experiments. Yang-Mills gravity in flat spacetime can shed light on renormalizable quantum gravity.

In general relativity, the structure of couplings for $g_{\mu\nu}$ is very complicated and poses non-trivial technical and conceptual difficulties from the viewpoint of quantum field theory. However, as a classical field theory, one of the strengths of general relativity lies in its successful equations of motion for objects and light rays, based on the Einstein-Grossmann metric $g_{\mu\nu}dx^\mu dx^\nu$. [1] As for renormalizable quantum field theory, Yang-Mills fields have the best track record in theory and experiment, provided the underlying spacetime is flat. In this paper, we show that Yang-Mills fields with spacetime translation gauge symmetry have special features which provide a natural union of the Einstein-Grossmann metric and the gravitational Yang-Mills field. The union implies: (i) The framework is applicable to all general frames of reference (both inertial and non-inertial) in which the spacetime is characterized by a vanishing Riemann-Christoffel curvature tensor. (ii) The effective Einstein-Grossmann metric originates physically from a spin-2 Yang-Mills field in flat spacetime.
The formulations for electromagnetic and Yang-Mills fields associated with internal gauge groups have been developed extensively. They are based on the replacement $\partial_\mu \to \partial_\mu + igB^a_\mu\tau_a$, where the $\tau_a$ are the constant matrix representations of the gauge groups, which have little to do with external spacetime. For external gauge groups related to spacetime, e.g., the de Sitter group or the Poincaré group, the gauge invariant Lagrangian involving fermions turns out to be richer in content. [2]
In this paper, we concentrate on a specific simple external gauge group of translations T(4) in flat spacetime. The translation group T(4) is the Abelian subgroup of the Poincaré group. It is particularly interesting because it is the minimal gauge group related to the conserved energy-momentum tensor, which couples to a spin-2 field $\phi_{\mu\nu}$. However, the generators of the translation group are the displacement operators, $p_\mu = i\partial_\mu$ ($c = \hbar = 1$), in inertial frames. In a general frame (inertial or non-inertial) with a metric tensor $P_{\mu\nu}$, we replace $\partial_\mu$ by $D_\mu$, i.e., the partial covariant derivative with respect to the Levi-Civita connection (or the metric tensor $P_{\mu\nu}$) in flat spacetime. Thus, the replacement in Yang-Mills gravity takes a different form: $D_\mu \to D_\mu - ig\phi_{\mu\nu}p^\nu \equiv J_{\mu\nu}D^\nu$, where $D^\nu = P^{\nu\lambda}D_\lambda$ in such a gauge theory. The generators of this non-compact translation group do not have a constant matrix representation. It is precisely this unique property that leads naturally to an effective Einstein-Grossmann metric in flat spacetime. Such an effective metric emerges from the Lagrangian of matter fields (as shown in equations (5) and (6) below). Furthermore, the displacement operator of the translation gauge group dictates that the coupling constant g in $J_{\mu\nu}$ must have the dimension of length and that the interaction cannot have both attractive and repulsive forces, in sharp contrast to the dimensionless real coupling constants in electrodynamics and other Yang-Mills theories associated with internal gauge groups.
The new formalism of external gauge symmetry for translations in flat spacetime leads to a gauge-invariant action involving fermions. It suggests (a) the massless Yang-Mills spin-2 field in flat spacetime as the gravitational gauge field [3], (b) a new gravitational gauge equation in both inertial and non-inertial frames and (c) an effective metric $G_{\mu\nu}dx^\mu dx^\nu$ for the motion of classical objects. In the post-Newtonian approximation, the present gauge field equation is consistent with classical tests such as the perihelion shift of Mercury and the time delay of radar echoes.
Let us consider the local spacetime translation with an arbitrary infinitesimal vector gauge-function $\Lambda^\mu(x)$,

The basic point is that this transformation has a dual interpretation: (i) a shift of the spacetime coordinates by an infinitesimal vector gauge-function $\Lambda^\mu(x)$, and (ii) an arbitrary infinitesimal transformation. These two mathematical implications of the transformation (1) dictate the following gauge transformation of spacetime translations for physical quantities
in the Lagrangian of fields (e.g., L + b p in (12) below):

ax'Pl &.lPrn d,P1 &an


x-
&Ul
... -
&Vna
- -
&'a1 "'&'an '
where $\mu_1$, $\nu_1$, $\alpha_1$, $\beta_1$, etc. are spacetime indices.
The spinor $\psi(x)$ and $D_\mu\psi$ transform, by definition, as a scalar function $Q(x)$ and a covariant vector $Q_\mu(x)$, respectively, under the translation gauge transformation. We use $D_\mu$ to denote the partial covariant derivative with respect to the metric tensor $P_{\mu\nu}(x)$ in a general reference frame. The translation gauge transformation (2) is formally similar to the Lie variations in the coordinate transformation.
For an example of $P_{\mu\nu}(x)$ in the flat spacetime of a general reference frame, let us consider the accelerated Wu transformation between an inertial frame $F_I(w_I, x_I, y_I, z_I)$ and a non-inertial frame $F(w, x, y, z)$. Suppose $F_I$ is at rest and the frame F moves in the x-direction with an initial velocity $\beta_0$ and a constant linear acceleration $\alpha_0$, [4]:

where $\beta = \beta_0 + \alpha_0 w$, $\gamma = 1/\sqrt{1-\beta^2}$ and $\gamma_0 = 1/\sqrt{1-\beta_0^2}$. In the zero acceleration limit, $\alpha_0 \to 0$, the Wu transformations (3) reduce to the usual 4-dimensional transformations, $w_I = \gamma_0(w + \beta_0 x)$, $x_I = \gamma_0(x + \beta_0 w)$, $y_I = y$, $z_I = z$. In the special case $\beta_0 = 0$, the Wu transformations (3) are the same as the accelerated Møller transformations, [5] provided one makes a change of the time variable, $w \to w^* = (1/\alpha_0)\tanh^{-1}(\alpha_0 w)$. The Wu transformation preserves the spacetime interval $ds^2$:

$ds^2 = dw_I^2 - d\mathbf{r}_I^2 = W^2 dw^2 - d\mathbf{r}^2 = P_{\mu\nu}dx^\mu dx^\nu$ , $W^2 = \gamma^4(\gamma_0^{-2} + \alpha_0 x)^2$ . (4)

All constant-linear-acceleration frames of reference have a metric tensor of the form $P_{\mu\nu} = (W^2, -1, -1, -1)$. [6] The existence of the finite Wu transformations (3) implies that the spacetime of the constant-linear-acceleration frames is flat, i.e., has zero Riemann-Christoffel curvature tensor. The metric tensor $P_{\mu\nu}$ for a general frame of reference with zero Riemann-Christoffel curvature tensor may be called the Poincaré metric tensor. In the limit of zero acceleration, $\alpha_0 \to 0$, $P_{\mu\nu}$ in (4) reduces to the Minkowski metric tensor $\eta_{\mu\nu}$ of inertial frames.

In order to see the connection of such a spin-2 field and the gravitational field, let us consider the kinetic term in the Lagrangian of a scalar field $\Phi$: $(1/2)P^{\mu\nu}D_\mu\Phi D_\nu\Phi$. In the presence of the spin-2 field $\phi_{\mu\nu}$, the translation gauge symmetry dictates the replacement,

Thus, we have

$G^{\alpha\beta} = P_{\mu\nu}J^{\mu\alpha}J^{\nu\beta} = P^{\alpha\beta} + 2g\phi^{\alpha\beta} + g^2\phi^{\alpha\lambda}\phi^{\beta\sigma}P_{\lambda\sigma}$ , $D_\mu\Phi = \partial_\mu\Phi$ ,


where $G^{\alpha\beta}$ formally resembles a metric tensor. One can see from (5) that it appears as if the geometry of the spacetime is changed from a Euclidean spacetime to a non-Euclidean spacetime due to the presence of the spin-2 field $\phi_{\mu\nu}$. As suggested by the action for a quantum field involving the kinetic term in (5), the action $S_p$ for the motion of classical objects is effectively assumed to take the form,

where $G_{\mu\nu}dx^\mu dx^\nu$ denotes the effective Einstein-Grossmann metric for motions of classical
objects.
This action (6) for particles suggests a simple and natural union of the Einstein-Grossmann
metric for motions of classical objects and Yang-Mills fields for gravity with the flat-spacetime
translation gauge group: namely, the spin-2 gauge field and its interaction with fermion
matter actually take place in flat spacetime; only the equation of motion of classical objects
is derived from the classical action $S_p$, which happens to have a form similar to the
Einstein-Grossmann metric $g_{\mu\nu}dx^\mu dx^\nu$. We stress that the effective metric tensor $G_{\mu\nu}$ in (5) and (6)
is completely determined by the Yang-Mills action with translation gauge symmetry.
The present theory of Yang-Mills gravity is formulated on the basis of the translation
gauge symmetry and the postulate of the effective metric tensor in (6) for the motion of
a classical particle in such a spin-2 field. However, from the field-theoretic viewpoint, the
real physical spacetime is still flat and the fundamental metric tensor is still $P_{\mu\nu}$ in general
frames of reference. Thus, $G_{\mu\nu}$ in (6) is treated as merely an effective metric tensor for
the motion of a classical object in the presence of the spin-2 gauge field, in the sense that the
4-dimensional effective interval is $ds_{ei}^2 = G_{\mu\nu}dx^\mu dx^\nu$ in the action $S_p$ for classical objects
such as planets, stars or light rays. We note that the Poincaré metric tensor $P_{\mu\nu}$ is a purely
geometrical property of spacetime and does not contain the physical field $\phi_{\mu\nu}$. This property is
important because it enables Yang-Mills gravity to have a very simple coupling, namely,
the maximum coupling is a 4-vertex (in Feynman diagrams).
The translational gauge invariance requires that a symmetric spin-2 field, $\phi_{\mu\nu} = \phi_{\nu\mu}$,
must couple to the fermion field $\psi$ via the energy-momentum tensor. We postulate the
following gauge-invariant fermion action $S_\psi$ in a general frame:

$$S_\psi = \int L_\psi \sqrt{-P}\, d^4x, \qquad (7)$$

$$\{\Gamma_\mu, \Gamma_\nu\} = 2P_{\mu\nu}(x), \qquad \Gamma_\mu = \gamma_a e^a{}_\mu, \qquad P = \det(P_{\mu\nu}), \qquad (8)$$

$$\Delta_\mu\psi = J_{\mu\nu}D^\nu\psi, \qquad D_\nu\psi = \partial_\nu\psi, \qquad J_{\mu\nu} = P_{\mu\nu} + g\phi_{\mu\nu}, \qquad (9)$$

where $\gamma_a$ and $e^a{}_\mu$ are respectively the constant Dirac matrices and the tetrads, while $\Delta_\mu$
is the translational gauge-covariant derivative. If one considers $e_{\mu a}J^{\mu\nu}$ in the action (7)
as an effective tetrad, $E^\alpha{}_a = e_{\mu a}J^{\mu\alpha}$, then the corresponding effective metric tensor is
$E^\alpha{}_a E^\beta{}_b\eta^{ab} = e_{\mu a}J^{\mu\alpha}e_{\nu b}J^{\nu\beta}\eta^{ab} = P_{\mu\nu}J^{\mu\alpha}J^{\nu\beta} = G^{\alpha\beta}$, which is the same as that in (5). We
may remark that the Lagrangian $L_\psi$ in (7) can be symmetrized.
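The translational gauge transformation (2) leads to $L_\psi \to L_\psi - \Lambda^\lambda\partial_\lambda L_\psi$. Therefore, the
fermion Lagrangian $\sqrt{-P}\,L_\psi$ only changes by a divergence,

where we have used the gauge transformation,

This gauge transformation can be obtained by using

$$P_{\mu\nu} - \Lambda^\lambda\partial_\lambda P_{\mu\nu} - P_{\mu\beta}\partial_\nu\Lambda^\beta - P_{\alpha\nu}\partial_\mu\Lambda^\alpha = [(1 - \Lambda^\lambda\partial_\lambda)P_{\alpha\beta}](\delta^\alpha_\mu - \partial_\mu\Lambda^\alpha)(\delta^\beta_\nu - \partial_\nu\Lambda^\beta)$$

for an infinitesimal vector function $\Lambda^\mu(x)$. The last term in (10) does not contribute to the
action (7) because of Gauss's theorem. Thus, the action $S_\psi$ is translational gauge invariant.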
As usual, the gauge curvature (or the gauge-field strength) is given by the commutator
of two gauge covariant derivatives $\Delta_\mu = J_{\mu\nu}D^\nu$:

$$[J_{\mu\nu}D^\nu, J_{\alpha\beta}D^\beta] = \left(J_{\mu\nu}(D^\nu J_{\alpha\beta}) - J_{\alpha\nu}(D^\nu J_{\mu\beta})\right)D^\beta = C_{\mu\alpha\beta}D^\beta. \qquad (11)$$

The gauge curvature $C_{\mu\alpha\beta}$ satisfies the algebraic identities, (i) cyclicity, $C_{\mu\alpha\beta} + C_{\alpha\beta\mu} + C_{\beta\mu\alpha} = 0$,
and (ii) antisymmetry, $C_{\mu\alpha\beta} = -C_{\alpha\mu\beta}$, which can be directly verified. By examining
all possible terms of the scalar quadratic gauge curvature, such as $C_{\mu\alpha\beta}C^{\mu\alpha\beta}$, $C_{\mu\alpha}{}^{\alpha}C^{\mu\beta}{}_{\beta}$,
etc., by using the antisymmetry property $C_{\mu\alpha\beta} = -C_{\alpha\mu\beta}$ and by interchanging dummy
indices, we find that there are only two independent quadratic terms, $C_{\mu\alpha\beta}C^{\mu\beta\alpha}$ and
$C_{\mu\alpha}{}^{\alpha}C^{\mu\beta}{}_{\beta}$, for the T(4) gauge curvature. All other terms can be expressed in terms of
them, e.g., $C_{\mu\alpha\beta}C^{\mu\alpha\beta} = 2C_{\mu\alpha\beta}C^{\mu\beta\alpha}$. Thus, the T(4) gauge invariant Lagrangian density
is naturally assumed to be a linear combination of these two independent terms,
$C_{\mu\alpha\beta}C^{\mu\beta\alpha} + fC_{\mu\alpha}{}^{\alpha}C^{\mu\beta}{}_{\beta}$. For simplicity, the constant f is chosen to be f = -1. That
is, one has the simplest linearized equation when a gauge condition is imposed. In this case,
the linearized gauge field equation turns out to be the same as the linearized Einstein
equation in general relativity. Thus, we postulate the action for the spinor-tensor fields to
be
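$$S_{\phi\psi} = \int L_{\phi\psi}\sqrt{-P}\,d^4x, \qquad L_{\phi\psi} = \frac{1}{2g^2}\left(C_{\mu\alpha\beta}C^{\mu\beta\alpha} - C_{\mu\alpha}{}^{\alpha}C^{\mu\beta}{}_{\beta}\right) + L_\psi, \qquad (12)$$

where $L_{\phi\psi}\sqrt{-P}$ only changes by a divergence under the gauge transformation (2). Note that
the quadratic gauge curvature in (12) can be written as $(1/2g^2)(\tfrac{1}{2}C_{\mu\alpha\beta}C^{\mu\alpha\beta} - C_{\mu\alpha}{}^{\alpha}C^{\mu\beta}{}_{\beta})$.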
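
The reduction of quadratic invariants quoted above, $C_{\mu\alpha\beta}C^{\mu\alpha\beta} = 2C_{\mu\alpha\beta}C^{\mu\beta\alpha}$, follows from cyclicity and antisymmetry alone, so it can be spot-checked with random numbers. The sketch below is a hedged illustration, not the author's derivation: it builds a random array `K[m][a][b]` symmetric in its last two indices (a stand-in for $J_{\mu\nu}D^\nu J_{\alpha\beta}$), forms $C_{\mu\alpha\beta} = K_{\mu\alpha\beta} - K_{\alpha\mu\beta}$, and raises indices with the Minkowski metric. All names are hypothetical.

```python
import random

ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal Minkowski metric, used to raise indices

def random_curvature(n=4, seed=0):
    """Random C[m][a][b] that is antisymmetric in (m, a) and cyclic in (m, a, b)."""
    rng = random.Random(seed)
    k = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for m in range(n):
        for a in range(n):
            for b in range(a, n):
                k[m][a][b] = k[m][b][a] = rng.uniform(-1.0, 1.0)  # symmetric in (a, b)
    return [[[k[m][a][b] - k[a][m][b] for b in range(n)]
             for a in range(n)] for m in range(n)]

def quadratic_invariant(c, swap_last_two):
    """Return C_{mab} C^{mab} (swap_last_two=False) or C_{mab} C^{mba} (True)."""
    n = len(c)
    total = 0.0
    for m in range(n):
        for a in range(n):
            for b in range(n):
                raised = c[m][b][a] if swap_last_two else c[m][a][b]
                total += ETA[m] * ETA[a] * ETA[b] * c[m][a][b] * raised
    return total

curvature = random_curvature()
straight = quadratic_invariant(curvature, swap_last_two=False)
swapped = quadratic_invariant(curvature, swap_last_two=True)
cyclic_residual = max(
    abs(curvature[m][a][b] + curvature[a][b][m] + curvature[b][m][a])
    for m in range(4) for a in range(4) for b in range(4)
)
print(straight, 2.0 * swapped, cyclic_residual)
```

The first two printed numbers should agree to rounding error, and the cyclic residual should vanish.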

For simplicity, let us choose inertial frames rather than general non-inertial frames in the
following discussions of experimental implications. One of the ways to do this is to consider
the case,

$$P_{\mu\nu} = \eta_{\mu\nu}, \qquad J_{\mu\nu} = \eta_{\mu\nu} + g\phi_{\mu\nu}, \qquad (13)$$

and all tensor indices are raised and lowered by $\eta_{\mu\nu}$. For weak fields, the linearized gauge
equation for the spin-2 field can be derived from (12):

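$$\partial_\lambda\partial^\lambda\phi_{\mu\nu} - \partial^\lambda\partial_\mu\phi_{\lambda\nu} - \eta_{\mu\nu}\left(\partial_\lambda\partial^\lambda\phi^\sigma{}_\sigma - \partial^\alpha\partial^\beta\phi_{\alpha\beta}\right) + \partial_\mu\partial_\nu\phi^\lambda{}_\lambda - \partial^\lambda\partial_\nu\phi_{\lambda\mu} = -gT_{\mu\nu}, \qquad (14)$$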
where $\mu$ and $\nu$ are understood to be symmetrized, and $T_{\mu\nu} = \bar\psi i\gamma_\mu\partial_\nu\psi$ is the
energy-momentum tensor of a free fermion. In the calculations of the effective metric tensor and the
gravitational quadrupole radiation, it is not necessary to symmetrize $\mu$ and $\nu$ explicitly in
the gauge-field equation. The gauge symmetry allows us to impose the usual gauge condition
for the massless spin-2 field in (14),

The gauge equation (14) takes a simple form:

which turns out to be formally the same as Einstein's equation for a weak gravitational field.
The classical particle action $S_p = -\int m\,ds_{ei}$, with the effective interval $(ds_{ei})^2 = G_{\mu\nu}dx^\mu dx^\nu$,
leads to the variation

$$\delta S_p = -mG_{\mu\nu}(dx^\mu/ds_{ei})\,\delta x^\nu, \qquad (17)$$

if one considers only the actual path with one of its end points variable [8]. Thus we have
the Hamilton-Jacobi equation for a particle with mass m,

$$I^{\mu\nu}(\partial_\mu S)(\partial_\nu S) - m^2 = 0, \qquad I^{\mu\lambda}G_{\lambda\nu} = \delta^\mu_\nu. \qquad (18)$$

In the Newtonian limit, (16) with the only non-vanishing component, $T_{00} = m\delta^3(\mathbf{r})$,
leads to $g\phi_{00} = -g^2m/(8\pi r)$. Also, $I^{00}$ in the Hamilton-Jacobi equation (18) has the usual
result $I^{00} = 1 + 2Gm/r$ in this limit, where G is the gravitational constant. Based on these
results, together with $I^{\mu\nu}$ in (18) and $G_{\mu\nu}$ in (6), we obtain the relation

(19)

by solving (16) with the spherical coordinates, $x^\mu = (w, r, \theta, \phi)$, to the first-order
approximation.
In order to show that the theory is viable beyond the first-order approximation (19), let
us compare the result (18) with the perihelion shift of Mercury [7, 8], which is sensitive
to the coefficient appearing in the second-order term in $I^{00}$ or $G_{00}$. We solve the non-linear
gauge field equations by the method of successive approximation and carry out the related

post-Newtonian approximation to the second order. To accomplish this calculation and to
have well-defined gauge fields, we include the usual gauge condition (15), so that the total
Lagrangian takes the form in a general frame:
1
Lt-= {L++ - ( ( / ( 2 g 2 ) [ D P J p-a 2 D a J ~ ] [ D J v-a ~ D , J , ] } ~ . (20)
2
We derive the following exact gauge field equations in a general frame:

where $\mu$ and $\nu$ are understood to be symmetrized.


It suffices to consider an inertial frame and a static and spherically symmetric system,
in which gauge fields are produced by an object at rest with mass m. Based on symmetry
considerations [9], the non-vanishing components of the exterior solutions $\phi_{\mu\nu}(r)$ are
$\phi_{00}(r)$, $\phi_{11}(r)$, $\phi_{22}(r)$ and $\phi_{33}(r) = \phi_{22}\sin^2\theta$, where $x^\mu = (w, r, \theta, \phi)$.
To solve the static gauge field, let us write

For $(\mu, \nu) = (0,0), (1,1), (2,2), (3,3)$, the gauge-field equation (21) can be written respectively
as

[Equations (23)-(25), the radial equations for the functions S, R and T, are not recoverable from the source.]

after tedious but straightforward calculations. The equation for the component $(\mu, \nu) = (3,3)$
is the same as equation (25).
The second-order approximations of the fields in (23)-(25) are solved and, hence, (22)
leads to the following results:

The parameter u o can be determined by the Newtonian approximation of $g\phi_{00}$

Equations (22) and (26) lead to the second-order approximation of the gauge field $g\phi_{00}$:

Gm G2m2
9400 = --r +-2r2 ' g411=--+-Grm GF2 (1- + -
,2) , (28)

which is consistent with the $\xi$-independent first-order solutions in (19).
Based on (21) and (6) with $P_{\mu\nu} = (1, -1, -r^2, -r^2\sin^2\theta)$, we obtain the effective metric
tensor,

which are $\xi$-dependent, just like the gauge field $\phi_{\mu\nu}$. In order to see the $\xi$-independent
physical results and their agreement with gravitational experiments, let us carry out the
expansion to the second order in all components $G_{00}$, $G_{11}$, $G_{22}$ and $G_{33}$ in the usual spherical
coordinates. For any given value of $\xi$, we can make a change of variable: $\rho^2 = r^2(1 + 2Gm/r +
[G^2m^2/r^2][4/\xi - 3])$, where the zeroth- and first-order terms are independent of $\xi$. We
obtain the following effective metric tensors $G_{\mu\nu}(\rho)$:

2Gm .40
Goo([>)= 1 - -$ 7 , G h ( p ] = -
P P-
(I+---
27 2::m2)
1 (30)

G22(p) = - p 2 , G33(p) = -p2.?in26', Ao 0,


which are $\xi$-independent. This is interesting because, although Yang-Mills gravity is
formulated in flat spacetime, the spacetime translation gauge symmetry leads to an effective
metric tensor $G_{\mu\nu}(\rho)$ in (5), (6) and (30) to the second order. Note that only the second-order
term in $G_{\mu\nu}(\rho)$, i.e., $2G^2m^2/\rho^2$, differs from the corresponding term in Einstein's theory.
As we shall see below, this difference is too small to be detected with the available
apparatus [9]. We have shown that the effective metric tensor $G_{\mu\nu}(\rho)$ in the spherical coordinates
is independent of the gauge parameter $\xi$ in the post-Newtonian approximation.
The inverses of the non-vanishing components of $G_{\mu\nu}$ are given by

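$$I^{00}(\rho) = \frac{1}{G_{00}(\rho)} = 1 + \frac{2Gm}{\rho} + \frac{4G^2m^2}{\rho^2}, \qquad I^{11}(\rho) = \frac{1}{G_{11}(\rho)} = -\left[1 - \frac{2Gm}{\rho} + \frac{6G^2m^2}{\rho^2}\right],$$

$$I^{22}(\rho) = \frac{1}{G_{22}(\rho)} = -\frac{1}{\rho^2}, \qquad I^{33}(\rho) = \frac{1}{G_{33}(\rho)} = -\frac{1}{\rho^2\sin^2\theta}. \qquad (31)$$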
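
As a consistency check on the second-order coefficients, one can treat $I^{00}$ and $|I^{11}|$ as truncated power series in $x = Gm/\rho$ and multiply or invert them exactly. The sketch below assumes the coefficients $1 + 2x + 4x^2$ and $1 - 2x + 6x^2$ as read from (31); the helper names are hypothetical. It gives $I^{00}|I^{11}| = 1 + 6x^2$ to second order, with $G_{00} = 1 - 2x$ and $|G_{11}| = 1 + 2x - 2x^2$ at the same order.

```python
from fractions import Fraction

ORDER = 2  # keep terms up to x^2, where x = G m / rho

def series_mul(a, b, order=ORDER):
    """Multiply two power series (coefficient lists), truncating above x^order."""
    out = [Fraction(0)] * (order + 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            if i + j <= order:
                out[i + j] += Fraction(a_i) * Fraction(b_j)
    return out

def series_inv(a, order=ORDER):
    """Invert a power series with nonzero constant term, truncating above x^order."""
    inv = [Fraction(1) / Fraction(a[0])]
    for n in range(1, order + 1):
        s = sum(Fraction(a[k]) * inv[n - k] for k in range(1, min(n, len(a) - 1) + 1))
        inv.append(-s / Fraction(a[0]))
    return inv

i00 = [1, 2, 4]        # I^00   = 1 + 2x + 4x^2  (assumed reading of (31))
i11_abs = [1, -2, 6]   # |I^11| = 1 - 2x + 6x^2  (assumed reading of (31))

product = series_mul(i00, i11_abs)   # coefficients of I^00 |I^11|
g00 = series_inv(i00)                # coefficients of G_00 = 1 / I^00
g11_abs = series_inv(i11_abs)        # coefficients of |G_11| = 1 / |I^11|
print(product, g00, g11_abs)
```

For comparison, a metric with $g^{00}|g^{11}| = 1$ would give a product with no $x^2$ term.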
Let us choose $\theta = \pi/2$; the Hamilton-Jacobi equation for a planet with mass $m_p$ has the
following form:

By the general procedure for solving this equation, we look for an action S in the form
$S = -E_o w + M\phi + f(\rho)$. The action S is found to be

where $E_o$ and $M$ are respectively the constant energy and angular momentum. As usual, the
trajectory is determined by $\partial S/\partial M$ = constant:

$$I^{00}|I^{11}| \approx 1 + \frac{6G^2m^2}{\rho^2},$$

where $I^{00}$ and $|I^{11}|$ are given by (31). This term $I^{00}|I^{11}|$ differs from the corresponding term
in Einstein's theory, in which $g^{00}|g^{11}| = 1$. For the approximate trajectory, we write (34) as
a differential equation with $\sigma = 1/\rho$. We have

We differentiate equation (35) with respect to $\phi$ and obtain

$$\frac{d^2\sigma}{d\phi^2} = \frac{1}{P} - \sigma(1 - Q) + 3Gm\sigma^2, \qquad P = \frac{M^2}{m_p^2Gm}.$$
Einstein's theory does not have this type of correction term Q because it has the relation
$g^{00}|g^{11}| = 1$. The new correction term Q is of the order of $(Gm/P)\beta^2$, which is the result
of the relation $I^{00}|I^{11}| > 1$ in (34). This correction is extremely small and undetectable
because the velocity $\beta$ of the planet is very small in comparison with the speed of light.
To see the effect of this correction Q on the perihelion shift, we solve (36) by a change
of variable $\bar\sigma = \sigma(1 - Q)$. We can write equation (36) as

By the usual successive approximation [7], we obtain the solution

where e is the eccentricity. The semimajor axis a can be expressed in terms of e and P:
$a = P/(1 - e^2)$. The perihelion shift for one revolution of the planet is given by

2na 6nGm
P P

where the second term shows the difference between the present Yang-Mills gravity and
Einstein's theory. This result shows that the observable perihelion shift is independent of
the gauge parameter $\xi$, which appears in the second-order approximation of the solution of
the spin-2 gauge field $g\phi_{\mu\nu}$. Since the observational accuracy of the perihelion shift of
Mercury is about 1%, the prediction (40) of Yang-Mills gravity can be tested only when
$(E_o^2 - m_p^2)/m_p^2 \approx \beta^2 \approx 0.01$. Thus, it is not possible to test the small correction in (40) of
Yang-Mills gravity in the solar system.
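
To give a sense of scale for the leading term $6\pi Gm/P$, the following sketch evaluates it for Mercury in SI units, with $P = a(1 - e^2)$ and the factor $c^2$ restored. The orbital elements and solar constants are standard reference values and are assumptions here, not numbers taken from the text; the small correction term of Yang-Mills gravity is not included.

```python
import math

# Assumed reference values (not from the text above).
GM_SUN = 1.32712440018e20      # heliocentric gravitational constant, m^3 / s^2
C = 299_792_458.0              # speed of light, m / s
A_MERCURY = 5.7909e10          # semimajor axis of Mercury, m
E_MERCURY = 0.2056             # orbital eccentricity of Mercury
PERIOD_DAYS = 87.969           # orbital period of Mercury, days
ARCSEC_PER_RAD = 180.0 * 3600.0 / math.pi

def perihelion_shift_per_orbit(gm, semimajor_axis, eccentricity):
    """Leading-order shift 6 pi G m / (c^2 P) in radians, with P = a (1 - e^2)."""
    semi_latus_rectum = semimajor_axis * (1.0 - eccentricity**2)
    return 6.0 * math.pi * gm / (C**2 * semi_latus_rectum)

orbits_per_century = 36525.0 / PERIOD_DAYS
shift_arcsec_per_century = (
    perihelion_shift_per_orbit(GM_SUN, A_MERCURY, E_MERCURY)
    * orbits_per_century
    * ARCSEC_PER_RAD
)
print(f"{shift_arcsec_per_century:.2f} arcsec per century")
```

With these inputs the result is close to 43 arcseconds per century, so a 1% measurement resolves the leading term but not a correction suppressed by a further factor of order $\beta^2$.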
For the bending of light, from the eikonal equation (47) with $I^{\mu\nu}$ given by (31), one obtains
the trajectory of the ray, which is the same as (36) with $m_p \to 0$ and $E_o$ replaced by
$\omega_o = -\partial\psi/\partial w$ (c = 1). Following the usual procedure, we obtain

4Gmw, 3nGmw,
A$%-
M
where the second term in the bracket is negligible. So there is no observable difference
between Yang-Mills gravity and Einstein's theory in the known experiments.

If one compares the fermion equation, $(i\Gamma_\mu(P^{\mu\nu} + g\phi^{\mu\nu})D_\nu - m)\psi = 0$, derived from (12)
with the Dirac equation in quantum electrodynamics [i.e., $(i\gamma^\mu\partial_\mu - e\gamma^\mu A_\mu - m)\psi = 0$], one
can see a distinct difference: namely, the kinematic term $i\gamma^\mu\partial_\mu$ and the electromagnetic
coupling term $e\gamma^\mu A_\mu$ have a different relative sign if one takes the complex conjugate of the
Dirac equation. This implies the presence of both repulsive and attractive forces between
two charges. However, there is no change in the relative sign of the kinematic term and
the spin-2 coupling term in our fermion equation when one takes the complex conjugate.
This provides a natural explanation that the gravitational force is only attractive between
two fermions, in contrast to that in electrodynamics.
Yang-Mills gravity with the translational symmetry [10] has a well-defined conservation
law for the energy-momentum tensor, just as in ordinary field theory. It is
believed that such a spacetime gauge theory can shed light on quantum gravity because
(i) the maximum interaction vertex is a 4-vertex, just as in the usual Yang-Mills theory
with internal Lie groups, and (ii) it is based on gauge symmetry, which could minimize the
ultraviolet divergences. In light of these discussions, it is possible to understand gravity
based on the spacetime gauge theory with the translational symmetry in general frames of
reference.
The author would like to thank Zhenhua Ning for his help. The work is supported in
part by the Potz Science Fund and the Jing Shin Research Fund of the UMass Dartmouth
Foundation.

Note added. The energy-momentum tensor of gravitation $t_{\mu\nu}$ is defined by the field equation
(21) (with $\xi = 0$) written in the following form, $D^\lambda D_\lambda\phi_{\mu\nu} = -g(T_{\mu\nu} + t_{\mu\nu})$, in a general
frame. Using the usual approximations and gauge condition (15), we can calculate the
average energy-momentum of a gravitational plane wave and the power. For example, the
power $P_o$ emitted per unit solid angle in the direction $\mathbf{x}/|\mathbf{x}|$ can be written as

where $T_{\mu\nu}(\mathbf{k}, \omega)$ is defined as follows [9]: Suppose one observes this radiation in the wave zone;
one can write the polarization tensor in terms of the Fourier transform of $T_{\mu\nu}$:

$$T_{\mu\nu}(\mathbf{k}, \omega) \equiv \int d^3x\, T_{\mu\nu}(\mathbf{x}, \omega)\exp(-i\mathbf{k}\cdot\mathbf{x}),$$

where the polarization tensor $e_{\mu\nu}(\mathbf{x}, \omega)$ is defined by the relation $\phi_{\mu\nu}(\mathbf{x}, t) \approx [e_{\mu\nu}(\mathbf{x}, \omega)
\exp(-ik_\lambda x^\lambda) + \text{c.c.}]$. The approximate result for the power emitted per solid angle in
Yang-Mills gravity turns out to be the same as that obtained in Einstein's theory [9].

References
[1] A. Einstein and M. Grossmann, Z. Math. Physik 62, 225 (1913). See also F. J. Dyson,
Bull. Am. Math. Soc. 78 (1972) 635, and S. Weinberg, Gravitation and Cosmology
(John Wiley and Sons, 1972) pp. 285-289.
[2] Jong-Ping Hsu, Phys. Lett. 119B (1982) 328. Such a gauge theory predicts a new
gravitational spin force produced by fermion spin densities, in addition to the usual
gravitational force produced by the mass density. The usual Yang-Mills-type formalism
(with Faddeev-Popov gauge compensation terms) for internal gauge groups is more
difficult to apply to this case.
[3] The idea of an effective Riemannian spacetime due to the presence of a symmetric
spin-2 field in flat spacetime was extensively discussed by Logunov and others. Their
theory has a different gauge transformation and a completely different Lagrangian. See
A. A. Logunov, The Theory of Gravity (trans. by G. Pontecorvo, Moscow, Nauka,
2001) and references therein. For a discussion of the spin-2 field, see also H. van Dam
and M. Veltman, Nucl. Phys. B22 (1970) 397, and S. Weinberg, The Quantum Theory
of Fields, vol. 1, Foundations (Cambridge Univ. Press, 1995) pp. 246-255.
[4] Jong-Ping Hsu and Leonardo Hsu, Nuovo Cimento B 112 (1997) 575 and Chin. J.
Phys. 35 (1997) 407. Jong-Ping Hsu, Einstein's Relativity and Beyond - New Symmetry
Approaches (World Scientific, Singapore, 2000), Chapters 21-23. Daniel Schmidt and
Jong-Ping Hsu, Intern. J. Modern Phys. A (2005, to be published).
[5] C. Møller, Danske Vid. Sel. Mat.-Fys. 20 (1943) No. 19; Ta-You Wu and Y. C. Lee,
Intern. J. Theoretical Phys. 5 (1972) 307.
[6] For arbitrary linear accelerations with limiting 4-dimensional symmetry, one has
$P_{\mu\nu}dx^\mu dx^\nu = W^2dw^2 + 2U\,dw\,dx - dr^2$. See J. P. Hsu, Chin. J. Phys. 40 (2002) 265.
[7] L. Landau and E. Lifshitz, The Classical Theory of Fields (trans. by M. Hamermesh,
Addison-Wesley, 1951) p. 58 and pp. 312-316. See also S. Weinberg, ref. 1, pp. 185-201.
[8] Wei-Tou Ni, in The Proceedings of the Fourth International Workshop on Gravitation
and Astrophysics (Ed. L. Liu, J. Luo, X. Z. Li and J. P. Hsu, World Scientific, 2000)
pp. 1-19.
[9] S. Weinberg, Gravitation and Cosmology (John Wiley and Sons, 1972) p. 178 and pp.
259-273.
[10] Within the framework of curved spacetime, gauge gravity with translation gauge symmetry
was discussed by Y. M. Cho, Phys. Rev. 14, 2515 (1976). His formulation and
results are very much different from ours. For example, Cho made the additional assumption
$t_\mu\psi = 0$ for the gauge covariant derivative, so that one has $\Delta_i\psi = (\partial_i + B_i^\mu t_\mu)\psi = \partial_i\psi$,
where $t_\mu$ are the translation group generators. Furthermore, he assumed $\partial_i\psi = h_i^\mu\partial_\mu\psi$,
where $h_i^\mu = \delta_i^\mu + fB_i^\mu$. This assumption is equivalent to assuming the curved spacetime.
The Yang-Mills formulation of gauge symmetry and gauge fields does not make these
additional assumptions.
Chapter 9

Experimental Tests of Gravitational Theories*

*W. T. Ni, J. H. Taylor, Jr.



EMPIRICAL FOUNDATIONS OF THE RELATIVISTIC GRAVITY

WEI-TOU NI
Center for Gravitation and Cosmology, Solar-System Division,
Purple Mountain Observatory, Chinese Academy of Sciences,
No. 2, Beijing W. Rd., Nanjing, China 210008

Received 16 February 2005

In 1859, Le Verrier discovered the Mercury perihelion advance anomaly. This anomaly turned out to
be the first relativistic-gravity effect observed. During the 141 years to 2000, the precision of
laboratory and space experiments, and astrophysical and cosmological observations on relativistic
gravity have improved by 3 orders of magnitude. In 1999, we envisaged a 3-6 order improvement in
the next 30 years in all directions of tests of relativistic gravity. In 2000, the interferometric
gravitational wave detectors began their runs to accumulate data. In 2003, the measurement of
relativistic Shapiro time-delay of the Cassini spacecraft determined the relativistic-gravity parameter
γ to be 1.000021 ± 0.000023 of general relativity --- a 1.5-order improvement. In October 2004,
Ciufolini and Pavlis reported a measurement of the Lense-Thirring effect on the LAGEOS and
LAGEOS2 satellites to be 0.99 ± 0.10 of the value predicted by general relativity. In April 2004,
Gravity Probe B (Stanford relativity gyroscope experiment to measure the Lense-Thirring effect to
1%) was launched and has been accumulating science data for more than 170 days now. μSCOPE
(MICROSCOPE: MICRO-Satellite à traînée Compensée pour l'Observation du Principe
d'Equivalence) is on its way for a 2007 launch to test the Galileo equivalence principle to 10.. LISA
Pathfinder (SMART2), the technological demonstrator for the LISA (Laser Interferometer Space
Antenna) mission, is well on its way for a 2008 launch. STEP (Satellite Test of Equivalence
Principle) and ASTROD (Astrodynamical Space Test of Relativity using Optical Devices) are in the
good planning stage. Various astrophysical tests and cosmological tests of relativistic gravity will
reach precision and ultra-precision stages. Clock tests and atomic interferometry tests of relativistic
gravity will reach an ever-increasing precision. These will give revived interest and development both
in experimental and theoretical aspects of gravity, and may lead to answers to some profound
questions of gravity and the cosmos.

1. Introduction
A dimensionless parameter $\xi(\mathbf{x}, t)$ characterizing the strength of gravity at a spacetime
point P [with coordinates $(\mathbf{x}, t)$] due to a gravitating source is the ratio of the negative
of the potential energy, mU (due to this source), to the inertial mass-energy $mc^2$ of a test
body at P, i.e.,

$$\xi(\mathbf{x}, t) = U(\mathbf{x}, t)/c^2. \qquad (1)$$

Here $U(\mathbf{x}, t)$ is the gravitational potential. For a point source with mass M in Newtonian gravity,

$$\xi(\mathbf{x}, t) = GM/(Rc^2), \qquad (2)$$

where R is the distance to the source. For a nearly Newtonian system, we can use the
Newtonian potential for U. The strength of gravity for various configurations is tabulated
in Table 1.

Table 1. The strength of gravity for various configurations

Source                                   Field position    Strength of gravity
Sun                                      Earth orbit       $1.0 \times 10^{-8}$
Sun                                      Jupiter orbit     $1.9 \times 10^{-9}$
Earth                                    Earth surface     $0.7 \times 10^{-9}$
Earth                                    Moon's orbit      $1.2 \times 10^{-11}$
Galaxy                                   Solar system      - $10^{-6}$
Significant part of observed universe    Our galaxy        1 -
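
The point-source entries of Table 1 follow directly from equation (2). The sketch below evaluates $GM/(Rc^2)$ for four configurations; the gravitational parameters and distances are standard reference values and are assumptions here, not data taken from the text.

```python
# Assumed reference values (not from the text above).
C = 299_792_458.0             # speed of light, m / s
GM_SUN = 1.32712440018e20     # m^3 / s^2
GM_EARTH = 3.986004418e14     # m^3 / s^2
AU = 1.495978707e11           # astronomical unit, m

def gravity_strength(gm, distance):
    """Dimensionless strength of gravity GM / (R c^2) for a point source."""
    return gm / (distance * C**2)

strengths = {
    ("Sun", "Earth orbit"): gravity_strength(GM_SUN, 1.0 * AU),
    ("Sun", "Jupiter orbit"): gravity_strength(GM_SUN, 5.204 * AU),
    ("Earth", "Earth surface"): gravity_strength(GM_EARTH, 6.371e6),
    ("Earth", "Moon's orbit"): gravity_strength(GM_EARTH, 3.844e8),
}

for (source, position), value in strengths.items():
    print(f"{source:6s} {position:14s} {value:.2e}")
```

The galactic and cosmological rows are order-of-magnitude estimates and are not reproduced by this point-source formula.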

The development of gravity theory stems from experiments. Newton's theory of
gravity [1] is empirically based on Kepler's laws [2] (which are based on Brahe's
observations) and Galileo's law of free fall [3] (which is based on Galileo's experiment
of motions on inclined planes and with free-fall trajectories). Toward the middle of
the nineteenth century, astronomical observations accumulated a precision which enabled Le
Verrier [4] to discover the Mercury perihelion advance anomaly in 1859. This anomaly is
the first relativistic-gravity effect observed. The Michelson-Morley experiment [5], via
various developments [6], prompted the final establishment of the special relativity
theory in 1905 [7, 8]. Motivation for putting electromagnetism and gravity into the same
theoretical framework [7], the precision of the Eötvös experiment [9] on the equivalence, and the
formulation of the Einstein equivalence principle [10], together with the perihelion advance
anomaly, led to the road for general relativity theory [11] in 1915.
In 1919, the observation of gravitational deflection of light passing near the Sun
during a solar eclipse [12] made general relativity famous and popular. The perihelion
advance of Mercury, the gravitational deflection of light passing near the Sun, together
with the establishment of the gravitational redshift, constitute three classical tests of
general relativity. With the development of technology and the advent of the space era, Shapiro
[13] proposed a fourth test --- time delay of radar echoes in a gravitational field. This,
together with the more precise test of the equivalence principle [14] and the Pound-Rebka
redshift experiment [15] in the 1960s, marked the beginning of a new era for testing
relativistic gravity. Since the beginning of this era, we have seen 3-4 order improvements
for old tests together with many new tests. The technological development is now ripe, and we
are in a position to discern 3-5 order further improvements in testing relativistic
gravity in the coming 25 years (2005-2030). This will enable us to test the second-order
relativistic-gravity effects. A road map of experimental progress in gravity together with
its theoretical implication is shown in Table 2.


Table 2. A road map (Highlights) for gravity. D denotes dynamical effect; EP denotes equivalence
principle effect.
~~~

Exoeriment I Precision t Theory


Keplers laws
(i) Galieleos EP
Galileos experiment on inclined 5x (EP)
(ii) The motion with constant
planes (1592) 5 (D) force has constant acceleratior
Newtons pendulum experiment
10 (EP) 9:
(1687 [I])
Observation of celestial-body 10 (EP)
Newtons inverse square law
motions of the solar system (-1687) 10 - (D)
Anomalous advance of Mercurys
perihelion based on 397 meridian *
10. (D)
and 14 transit observations (1 859
[4])
Anomalous advance of Mercurys
perihelion based on transit
lO-(D)
observations from 1677 to 1881
(1882 [16])
Michelson-Morley Experiment Special relativity
10. I
(1887 [5])
Eotvos experiment (1 889 [9]) 10. (EP) 6. *
Light deflection experiment (1 91 9
[ 121)
sx lo- (D)
Roll-Krotkov-Dicke experiment

10. (EP)
1 Binary pulsar observation (I979
1 O-I3 (D) IT
II
3eneral relativity with
:osmological constant
Cassini time-delay experiment [56] I lo- - (D) !/
LAGEOS gravity experiment 1671 I
10. (DI 6. #
GP-B Experiment [68] (D) 6
uSCOPE Exueriment 11091 I

t In terms of dominant observable effects


9 Confirmation of Galileos equivalence principle
* Leading to general relativity
# Confirming the prediction of general relativity
7 Confirming the quadrupole radiation formula of general relativity
11 In terms of relativistic parameters, the precision is 10
%Testing frame dragging of general relativity
!Testing Galileos equivalence principle
$ In terms of relativistic parameters, the precision will be - lo-


This review is a five-year update from a previous review article (W.-T. Ni,
"Empirical tests of the relativistic gravity: the past, the present and the future", in Recent
advances and cross-century outlooks in physics: interplay between theory and
experiment: proceedings of the Conference held on March 18-20, 1999 in Atlanta,
Georgia, editors, Pisin Chen and Cheuk-Yin Wong [Singapore: World Scientific,
2000]; and pp. 1-19 in Gravitation and Astrophysics, editors, Liu L, Luo J, Li X-Z and
Hsu J-P [Singapore: World Scientific, 2000]).
In section 2, we review Mercury's perihelion advance and events leading to
general relativity. In section 3, we discuss the classical tests. In section 4, we review
precision measurement tests and the foundations of relativistic gravity. In section 5, we
review solar system tests since the revival (1960). In sections 6 and 7, we discuss
astrophysical tests and cosmological tests respectively. In section 8, we discuss
gravitational-wave observations in relation to testing relativistic gravity. In section 9, we
discuss next-generation experiments in progress, planned and proposed. In section 10, we
give an outlook. In the appendix, we discuss empirical tests associated with the
Eddington-Robertson formalism.

2. Mercury's Perihelion Advance and Events Leading to General Relativity


In 1781, Herschel discovered the planet Uranus. Over the years, Uranus persistently
wandered away from its expected Newtonian path. In 1834, Hussey suggested that the
deviation is due to perturbation of an undiscovered planet. In 1846, Le Verrier predicted
the position of this new planet. On September 25, 1846, Galle and d'Arrest found the
new planet, Neptune, within one degree of arc of Le Verrier's calculation. This
symbolized the great achievement of Newton's theory. [22]
With the discovery of Neptune, Newton's theory of gravitation was at its peak. As
the orbit determination of Mercury reached 1 O-', relativistic effect of gravity showed up.
In 1859, Le Verrier discovered the anomalous perihelion advance of Mercury [4].
In 1840, Arago suggested to Le Verrier to work on the subject of Mercury's motion.
Le Verrier published a provisional theory in 1843. It was tested at the 1848 transit of
Mercury and there was not close agreement. As to the cause, Le Verrier [23] wrote
"Unfortunately, the consequences of the principle of gravitation have not be deduced in
many particulars with a sufficient rigour: we will not be able to decide, when faced with
a disagreement between observation and theory, whether this results completely from
analytical errors or whether it is due in part to the imperfection of our knowledge of
celestial physics." [24]
In 1859, Le Verrier [4] published a more sophisticated theory of Mercury's motion.
This theory was sufficiently rigorous for any disagreement with observation to be taken
quite confidently as indicating a new scientific fact. In this paper, he used two sets of
observations --- a series of 397 meridian observations of Mercury taken at the Paris
Observatory between 1801 and 1842, and a set of observations of 14 transits of Mercury.
The transit data are more precise and the uncertainty is of the order of 1". The calculated
planetary perturbations of Mercury are listed in Table 3. In addition to these perturbations,
there is a 5025"/century general precession in the observational data due to the
precession of the equinox. The fit of observational data with theoretical calculations has
discrepancies. These discrepancies turned out to be due to relativistic-gravity effects. Le
Verrier attributed these discrepancies to an additional 38" per century anomalous


advance in the perihelion of Mercury.


Newcomb [16] in 1882, with improved calculations and data set, obtained 42".95
per century anomalous perihelion advance of Mercury. The value more recently was
(42".98 ± 0.04)/century [25].
In the last half of the 19th century, efforts to account for the anomalous perihelion
advance of Mercury went in two general directions: (i) searching for the planet Vulcan,
intra-Mercurial matter and the like; (ii) modification of the gravitation law. Neither kind of
effort was successful. For modification of the gravitational law, Clairaut's
hypothesis, Hall's hypothesis and velocity-dependent force laws were considered. The
successful solution awaited the development of general relativity.
In 1887, the result of the Michelson-Morley experiment posed a serious problem to
Newtonian mechanics. A series of developments [6] led to Poincaré's adding to the five
classical principles of Physics the Principle of Relativity [26, 27] in 1904 --- "The laws of
physical phenomena must be the same for a fixed observer and for an observer in
rectilinear and uniform motion so that we have no possibility of perceiving whether or
not we are dragged in such a motion", and to the seminal works of Poincaré [7] and
Einstein [8] in 1905. In [7], Poincaré attempted to develop a relativistic theory of gravity
and mentioned gravitational waves propagating with the speed of light based on Lorentz
invariance.
A crucial milestone toward a viable relativistic theory of gravity was established
when Einstein proposed his equivalence principle in 1907 [10]. This principle had a firm
empirical basis due to the precision experiment of Eötvös [9]. In this same paper,
Einstein predicted gravitational redshift. With Special Relativity and the Einstein
Equivalence Principle, geometrization of classical physics in the large came naturally in
the setting of a pseudo-Riemannian manifold. The years up to 1915 were full of debates
and arguments between a number of physicists (Abraham, Einstein, Nordström, ...)
concerned with developing a new relativistic gravitational theory [28]. In 1915,
Einstein's General Relativity was proposed [11, 29] and the anomalous perihelion
advance of Mercury was explained.

Table 3. Planetary perturbations of the perihelion of Mercury [4]

Venus      280".6/century
Earth       83".6/century
Mars         2".6/century
Jupiter    152".6/century
Saturn       7".2/century
Uranus       0".1/century
Total      526".7/century
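
The total in Table 3 is the plain sum of the individual planetary contributions, which also serves as a check on the digits read from the table. The sketch below performs that sum; the Saturn, Uranus and total entries are readings of partly illegible digits and should be treated as assumptions.

```python
# Planetary perturbations of Mercury's perihelion, arcseconds per century (Table 3).
perturbations = {
    "Venus": 280.6,
    "Earth": 83.6,
    "Mars": 2.6,
    "Jupiter": 152.6,
    "Saturn": 7.2,   # assumed reading of a damaged digit
    "Uranus": 0.1,   # assumed reading of a damaged digit
}

total_perturbation = sum(perturbations.values())
largest_contributor = max(perturbations, key=perturbations.get)
print(f"total = {total_perturbation:.1f} arcsec/century, largest = {largest_contributor}")
```

The anomalous advance discussed in the text (38" per century in Le Verrier's analysis) is what remains after these perturbations and the general precession are removed from the observed motion.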

3. Classical Tests
The perihelion advance anomaly of Mercury, the deflection of light passing the limb of
the Sun and the gravitational redshift are the three classical tests of relativistic gravity.
Using EEP, Einstein [30] derived the deflection of light passing the limb of the Sun in
1911. This agrees with the deflection of light derived by using the particle model of light in
the late 18th century. Before 1915, observations of light deflection were not successful
due to war and weather. Einstein's general relativity doubled the prediction of the
deflection of light (1".75). The 1919 British solar eclipse expeditions reported reasonably
good agreement with the prediction of Einstein's relativity. Before 1960, there were
several such observations. The accuracy of these observations was not better than
10-20%.
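
The factor of two between the 1911 value and the general-relativistic value can be illustrated numerically. This is a minimal sketch, assuming the standard weak-field expression (1 + gamma) 2GM/(c^2 b) for a ray with impact parameter b, with gamma = 1 for general relativity and gamma = 0 reproducing the equivalence-principle (particle-model) value. The solar radius and the function name are assumptions, not values taken from the text.

```python
import math

GM_SUN = 1.32712440018e20   # m^3/s^2, heliocentric gravitational constant
C = 299_792_458.0           # m/s, speed of light
R_SUN = 6.957e8             # m, nominal solar radius (assumed value)
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi


def limb_deflection_arcsec(gamma=1.0, impact_parameter=R_SUN):
    """Weak-field deflection (1 + gamma) * 2GM / (c^2 b), in arcseconds."""
    angle_rad = (1.0 + gamma) * 2.0 * GM_SUN / (C**2 * impact_parameter)
    return angle_rad * RAD_TO_ARCSEC


general_relativity = limb_deflection_arcsec(gamma=1.0)
equivalence_principle_only = limb_deflection_arcsec(gamma=0.0)
print(f"{general_relativity:.2f} arcsec vs {equivalence_principle_only:.2f} arcsec")
```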
After Einstein [10] proposed the gravitational redshift, Freundlich [31] started the
long effort to disentangle the gravitational redshift of solar and other stellar spectral lines
from other causes. Over the next five decades, astronomers did not agree on whether
gravitational redshift was empirically established [32]. This question was finally settled and
gravitational redshift confirmed by Pound and Rebka [15] using the Mössbauer effect. The
improved result of Pound and Snider [15] confirmed the redshift prediction to 1%
accuracy.

4. Precision Measurement and Foundations of Relativistic Gravity


The foundation of relativistic gravity at present rests on the equivalence of local physics
everywhere in spacetime. This equivalence is called the Einstein equivalence principle
(EEP) [10]. Its validity guarantees the universal implementation of metrology and
standards. Precision metrology or measurement, in turn, tests its validity. Possible
violations will give clues to the origin of gravity.
The most tested part of equivalence is the Galileo equivalence principle (the
universality of free fall). In the study of the theoretical relations between the Galileo
equivalence principle and the Einstein equivalence principle, we [33, 34] proposed the
χ-g framework summarized in the following interaction Lagrangian density

$$L_I = -\frac{1}{16\pi}\,\chi^{ijkl}F_{ij}F_{kl} - A_k j^k (-g)^{1/2} - \sum_I m_I \frac{ds_I}{dt}\,\delta(\mathbf{x}-\mathbf{x}_I), \tag{3}$$

where $\chi^{ijkl} = \chi^{klij} = -\chi^{klji}$ is a tensor density of the gravitational fields (e.g., $g_{ij}$, $\varphi$, etc.) and
$j^k$, $F_{ij} \equiv A_{j,i} - A_{i,j}$ have the usual meaning. The gravitation constitutive tensor density $\chi^{ijkl}$
dictates the behavior of electromagnetism in a gravitational field and has 21 independent
components in general. For a metric theory (when EEP holds), $\chi^{ijkl}$ is determined
completely by the metric $g^{ij}$ and equals $(-g)^{1/2}[(1/2) g^{ik}g^{jl} - (1/2) g^{il}g^{kj}]$. Here we use this
framework to look into the foundation of relativistic gravity empirically.
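
The count of 21 independent components and the index symmetries of the metric form can be checked directly. The sketch below is an illustration only: it assumes a diagonal metric (Minkowski, signature +,-,-,-) so that no matrix library is needed, and all function names are hypothetical.

```python
from itertools import product


def metric_chi(g_inv_diag, sqrt_neg_g):
    """chi^{ijkl} = (-g)^{1/2} [(1/2) g^{ik} g^{jl} - (1/2) g^{il} g^{kj}] for diagonal g^{ij}."""
    def g_inv(i, j):
        return g_inv_diag[i] if i == j else 0.0

    return {
        (i, j, k, l): sqrt_neg_g * 0.5
        * (g_inv(i, k) * g_inv(j, l) - g_inv(i, l) * g_inv(k, j))
        for i, j, k, l in product(range(4), repeat=4)
    }


def has_constitutive_symmetries(chi):
    """Check chi^{ijkl} = chi^{klij} = -chi^{klji} for every index combination."""
    return all(
        chi[i, j, k, l] == chi[k, l, i, j] and chi[i, j, k, l] == -chi[k, l, j, i]
        for i, j, k, l in product(range(4), repeat=4)
    )


# Minkowski metric: g^{ij} = diag(1, -1, -1, -1) and (-g)^{1/2} = 1.
chi = metric_chi([1.0, -1.0, -1.0, -1.0], 1.0)

# Antisymmetry in each index pair leaves 6 pairs (i < j); symmetry under exchange
# of the two pairs leaves a symmetric 6 x 6 array, i.e. 6 * 7 / 2 entries.
pairs = [(i, j) for i in range(4) for j in range(i + 1, 4)]
independent_components = len(pairs) * (len(pairs) + 1) // 2
print(has_constitutive_symmetries(chi), independent_components)
```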
The condition for no birefringence (no splitting, no retardation) for electromagnetic
wave propagation in all directions in the weak field limit gives ten constraints on the $\chi$'s.
With these ten constraints, $\chi$ can be written in the following form

$$\chi^{ijkl} = (-H)^{1/2}[(1/2)H^{ik}H^{jl} - (1/2)H^{il}H^{kj}]\,\psi + \varphi\, e^{ijkl}, \tag{4}$$

where $H = \det(H_{ij})$, $H_{ij}$ is a metric which generates the light cone for electromagnetic
propagation, and $e^{ijkl}$ is the completely antisymmetric symbol with $e^{0123} = 1$ [35-37].
Recently, Lämmerzahl and Hehl have shown that this non-birefringence guarantees,
without approximation, a Riemannian light cone, i.e., Eq. (4) [38].
Eq. (4) is verified empirically to high accuracy from pulsar observations and from
polarization measurements of extragalactic radio sources and will be discussed in §6 on
the astrophysical tests. Let us now look into the empirical constraints for $H^{ij}$ and $\varphi$. In Eq.
(3), $ds$ is the line element determined from the metric $g_{ij}$. From Eq. (4), the gravitational
coupling to electromagnetism is determined by the metric $H_{ij}$ and two scalar fields $\varphi$ and
$\psi$. If $H_{ij}$ is not proportional to $g_{ij}$, then the hyperfine levels of the lithium atom, the
beryllium atom, the mercury atom and other atoms will have additional shifts. But this is
not observed to high accuracy in Hughes-Drever experiments [39]. Therefore $H_{ij}$ is
proportional to $g_{ij}$ to a certain accuracy. Since a change of $H^{ik}$ to $\lambda H^{ik}$ does not affect $\chi^{ijkl}$
in Eq. (4), we can define $H_{11} = g_{11}$ to remove this scale freedom [35, 40].
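
The scale freedom mentioned here can be verified numerically: rescaling the contravariant light-cone metric by a constant factor leaves the metric part of Eq. (4) unchanged, because the factor from the two inverse metrics cancels the factor from the determinant. The sketch below assumes a diagonal metric with hypothetical entries; the axion-like term of Eq. (4) does not involve the metric and is omitted.

```python
from itertools import product


def chi_from_light_cone_metric(h_inv_diag, psi=1.0):
    """(-H)^{1/2} [(1/2) H^{ik} H^{jl} - (1/2) H^{il} H^{kj}] psi for a diagonal H^{ij}.

    H is the determinant of the covariant metric, i.e. 1 / det(H^{ij}).
    """
    det_inv = 1.0
    for value in h_inv_diag:
        det_inv *= value
    sqrt_neg_h = (-1.0 / det_inv) ** 0.5

    def h_inv(i, j):
        return h_inv_diag[i] if i == j else 0.0

    return {
        (i, j, k, l): sqrt_neg_h * 0.5 * psi
        * (h_inv(i, k) * h_inv(j, l) - h_inv(i, l) * h_inv(k, j))
        for i, j, k, l in product(range(4), repeat=4)
    }


H_INV = [1.2, -0.9, -1.1, -1.0]   # hypothetical diagonal H^{ij}, signature (+,-,-,-)
SCALE = 3.0                       # arbitrary constant rescaling factor

chi_original = chi_from_light_cone_metric(H_INV)
chi_rescaled = chi_from_light_cone_metric([SCALE * value for value in H_INV])
max_difference = max(abs(chi_original[key] - chi_rescaled[key]) for key in chi_original)
print(max_difference)
```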
In Hughes-Drever experiments [39], $\Delta m/m \le 0.5 \times 10^{-…}$ or $\Delta m/m_{e.m.} \le 0.3 \times$ …,
where $m_{e.m.}$ is the electromagnetic binding energy. Using Eq. (4) in Eq. (3), we have three
kinds of contributions to $\Delta m/m_{e.m.}$. These three kinds are of the order of (i) $(H_{\mu\nu} - g_{\mu\nu})$, (ii)
$(H_{0\mu} - g_{0\mu})v$, and (iii) $(H_{00} - g_{00})v^2$ respectively [35, 40]. Here the Greek indices $\mu$, $\nu$
denote space indices. Considering the motion of laboratories from earth rotation, in the
solar system and in our galaxy, we can set limits on various components of $(H_{ij} - g_{ij})$ from
Hughes-Drever experiments as follows:

$|H_{\mu\nu} - g_{\mu\nu}| / U \le 10^{-…}$
$|H_{0\mu} - g_{0\mu}| / U \le$ …
$|H_{00} - g_{00}| / U \le 10^{-…}$

where $U$ (~ …) is the galactic gravitational potential.
Eötvös-Dicke experiments [9, 14, 41-43] are performed on unpolarized test bodies;
the latest such experiments [43] reach a precision of 3 × …. In essence, these
experiments show that unpolarized electric and magnetic energies follow the same
trajectories as other forms of energy to certain accuracy. The constraints on Eq. (4) are

$|H_{00} - g_{00}| / U <$ … (7)


where U is the solar gravitational potential at the earth.
In 1976, Vessot and Levine [44] used an atomic hydrogen maser clock in a space
probe to test and confirm the metric gravitational redshift to an accuracy of 1.4 × …
[45]. The space probe attained an altitude of 10,000 km above the earth's surface. With
Eq. (6), the constraint on Eq. (4) is

$|H_{00} - g_{00}| / U \le 1.4 \times$ … (8)
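
To give a sense of the size of the effect tested by the space-probe experiment, the potential part of the redshift between the ground and a clock at 10,000 km altitude can be estimated. This is a rough sketch only: it assumes a spherical, non-rotating Earth and ignores the second-order Doppler shift of the moving probe, and the constants and function name are assumptions rather than values from the text.

```python
GM_EARTH = 3.986004418e14   # m^3/s^2, geocentric gravitational constant
C = 299_792_458.0           # m/s, speed of light
R_EARTH = 6.371e6           # m, mean Earth radius (assumed ground-clock radius)


def potential_redshift(altitude_m):
    """Fractional frequency shift (U_ground - U_probe) / c^2 for a clock at the given altitude."""
    r_probe = R_EARTH + altitude_m
    return GM_EARTH / C**2 * (1.0 / R_EARTH - 1.0 / r_probe)


shift = potential_redshift(1.0e7)   # 10,000 km, the altitude quoted in the text
print(f"{shift:.2e}")
```

A fractional shift of a few parts in 10^10 is what the maser clock has to resolve; the accuracy of the test is the fraction of this shift that can be confirmed.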

Thus, we see that for the constraint on $|H_{00} - g_{00}| / U$, Hughes-Drever experiments give
the most stringent limit. However, the STEP mission concept [46] proposes to improve the
WEP experiment by five orders of magnitude. This will again lead in precision in
determining $H_{00}$.
The theory (3) with $\chi^{ijkl}$ given by

$$\chi^{ijkl} = (-g)^{1/2}[(1/2) g^{ik}g^{jl} - (1/2) g^{il}g^{kj} + \varphi\,\varepsilon^{ijkl}], \tag{9}$$

where $\varphi$ is a scalar or pseudoscalar function of the gravitational field and $\varepsilon^{ijkl} = (-g)^{-1/2}e^{ijkl}$,
is studied in [47] and [48]. In (3), particles considered have charges but no spin. To
include spin-1/2 particles, we can add the Lagrangian for Dirac particles. Experimental
tests of the equivalence principle for polarized-bodies are reviewed in [49].
To include QCD and other gauge interactions, we have generalized the χ-g
framework [50]. Now we are working on a more comprehensive generalization to
include a framework to test special relativity, and a framework to test the gravitational
interactions of scalar particles and particles with spins together with gauge fields.

5. Solar System Tests


Over the last forty years, we have seen great advances in the dynamical testing of relativistic
gravity. This is largely due to interplanetary radio ranging and lunar laser ranging.
Interplanetary radio ranging and tracking provided more stimulus and progress at first.
However, with accuracy improved from 20-30 cm to 2 cm and the long accumulation of
observation data, lunar laser ranging reaches similar accuracy in determining relativistic
parameters as compared to interplanetary radio ranging. Table 4 gives such a comparison.

Table 4. Relativity-parameter determination from interplanetary radio ranging and from lunar laser
ranging.

Parameter                      Value from solar-system determinations                Value from lunar laser ranging
β                              … (J2 (Sun) = (2.3±5.2) × 10^-…, fitted)               1.00012±0.0011 [55, 56]
                               1.000±0.0001 [53] (EPM2004 fitting)
γ (PPN Space Curvature)        1.000±0.002 [25] (Viking ranging time delay)          1.000±0.005 [54]
                               0.9985±0.0021 [52] (Solar-System Tests)
                               1.000021±0.000023 [56] (Cassini S/C Ranging)
                               0.9999±0.0001 [53] (EPM2004 fitting)
Geodetic Precession                                                                  0.997±0.007 [54]
                                                                                     0.9981±0.0064 [55]
Strong Equivalence Principle                                                         (3.2±4.6) × 10^-13 [54]
                                                                                     (-2.0±2.0) × 10^-… [55, 43]
Temporal Change in G           (2±4) × 10^-…/yr [57] (Viking Lander Ranging)          (1±8) × 10^-12/yr [54]
                               ±10 × 10^-12/yr [58] (Viking Lander Ranging)          (0.4±0.9) × 10^-12/yr [55]
                               ±2.0 × 10^-…/yr [59] (Mercury & Venus Ranging)
                               ±(1.1-1.8) × 10^-…/yr [60] (Solar-System Tests)

In the last column of Table 4, the values come from two references [54] and [55]. In
[55], Williams et al. used a total of 15 553 LLR normal-point data in the period of March
1970 to April 2004 from Observatoire de la Côte d'Azur, McDonald Observatory and
Haleakala Observatory in their determination. Each normal point comprises from 3 to
about 100 photons. The weighted rms scatter after their fits for the last ten years of
ranges is about 2 cm (about $5 \times 10^{-11}$ of range).
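
The quoted fractional precision follows from dividing the scatter by the Earth-Moon distance. A minimal check, assuming a mean one-way distance of 3.844 × 10^8 m (a value not given in the text):

```python
MEAN_EARTH_MOON_DISTANCE_M = 3.844e8   # assumed mean one-way Earth-Moon distance
RANGE_SCATTER_M = 0.02                 # 2 cm weighted rms scatter quoted in the text


def fractional_scatter(scatter_m, range_m):
    """Ratio of the ranging scatter to the measured range."""
    return scatter_m / range_m


fraction = fractional_scatter(RANGE_SCATTER_M, MEAN_EARTH_MOON_DISTANCE_M)
print(f"{fraction:.1e}")   # about 5e-11
```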
In 2003, Bertotti, Iess and Tortora [56] reported a measurement of the frequency
shift of radio photons due to the relativistic Shapiro time-delay effect from the Cassini
spacecraft as they passed near the Sun during the June 2002 solar conjunction. From this
measurement, they determined $\gamma$ to be 1.000021 ± 0.000023.
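
How the Shapiro delay depends on $\gamma$ can be sketched with the standard logarithmic approximation, delay = (1 + gamma)(GM/c^3) ln(4 r1 r2 / b^2), valid when the impact parameter b is much smaller than both distances. The geometry below (spacecraft distance and impact parameter) is hypothetical and only meant to show orders of magnitude; the actual experiment measured the frequency shift, i.e. the time variation of this delay, rather than the delay itself.

```python
import math

GM_SUN = 1.32712440018e20   # m^3/s^2
C = 299_792_458.0           # m/s
AU = 1.495978707e11         # m
R_SUN = 6.957e8             # m, nominal solar radius (assumed value)


def shapiro_delay(r_emit, r_recv, impact_parameter, gamma=1.0):
    """One-way delay (1 + gamma) * (GM / c^3) * ln(4 r1 r2 / b^2), in seconds."""
    return (1.0 + gamma) * GM_SUN / C**3 * math.log(
        4.0 * r_emit * r_recv / impact_parameter**2
    )


# Hypothetical conjunction geometry: Earth at 1 AU, spacecraft at 8.4 AU,
# signal passing at 1.6 solar radii from the centre of the Sun.
delay = shapiro_delay(1.0 * AU, 8.4 * AU, 1.6 * R_SUN)
shifted = shapiro_delay(1.0 * AU, 8.4 * AU, 1.6 * R_SUN, gamma=1.0 + 1e-4)
sensitivity = shifted - delay
print(f"delay = {delay:.3e} s, change for d(gamma) = 1e-4: {sensitivity:.2e} s")
```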
With the advent of VLBI (Very Long Baseline Interferometry) at radio wavelengths,
the gravitational deflection of radio waves by the Sun from astrophysical radio sources
has been observed, and the accuracy of observation had been improved to $1.7 \times 10^{-3}$ for $\gamma$ [61,
62] in 1995. A recent analysis using VLBI data from 1979-1999 improved this result by
about four times to 0.99983 ± 0.00045 [63]. Fomalont and Kopeikin [64] measured the
effect of retardation of gravity by the field of moving Jupiter via VLBI observation of
light bending from a quasar.
The solar-system measurements have made possible the creation of high-accuracy
planetary and lunar ephemerides. The two most complete series of ephemerides are the
numerical DE ephemerides of JPL [65] and the EPM ephemerides of the Institute of
Applied Astronomy [53]. They are of the same level of accuracy and can be used to fit
experiments/observations and to determine astronomical constants. Krasinsky and
Brumberg [66] used these two series of ephemerides to analyze the major planet motions
and the AU (Astronomical Unit); Pitjeva [53] has recently used the EPM framework to
determine the AU and obtained 1 AU = 149 597 870 696.0 m. The JPL DE410
determination of this number is 1 AU = 149 597 870 697.4 m. The difference of 1.4 m
represents the realistic error in the determination of the AU. Pitjeva's [53] determination
of $\beta$ and $\gamma$ is obtained simultaneously with estimations for the solar oblateness and the
possible variability of the gravitational constant.
In 1918, Lense and Thirring predicted that the rotation of a body like Earth will drag
the local inertial frames of reference around it in general relativity. In 2004, Ciufolini and
Pavlis [67] reported a measurement of this Lense-Thirring effect on the two Earth
satellites, LAGEOS and LAGEOS2; it is 0.99 ± 0.10 of the value predicted by general
relativity. In the same year, Gravity Probe B (a space mission to test general relativity
using cryogenic gyroscopes in orbit) was launched in April and aims at measurement of
the Lense-Thirring effect to about 1% [68].
With the Hipparcos mission, very accurate measurements of star positions at various
elongations from the Sun were accumulated. Most of the measurements were at
elongations greater than 47° from the Sun. At these angles, the relativistic light
deflections are typically a few mas; it is 4.07 mas according to general relativity at right
angles to the solar direction for an observer at 1 AU from the Sun. In the Hipparcos
measurements, each abscissa on a reference great-circle has a typical precision of 3 mas
for a star of 8-9 mag. There are about 3.5 million abscissae generated, and the
precision in angle or similar parameter determination is in the … range. Froeschlé, Mignard
and Arenou [69] analyzed these Hipparcos data and determined the light deflection
parameter $\gamma$ to be 0.997 ± 0.003. This result demonstrated the power of precision optical
astrometry.
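
The 4.07 mas figure can be reproduced from the standard weak-field expression for a distant source seen at elongation chi by an observer at distance r from the Sun, namely (1 + gamma)(GM/(c^2 r)) cot(chi/2). The sketch below assumes this formula and a circular 1 AU observer position; the function name is illustrative.

```python
import math

GM_SUN = 1.32712440018e20   # m^3/s^2
C = 299_792_458.0           # m/s
AU = 1.495978707e11         # m
RAD_TO_MAS = 180.0 * 3600.0 * 1000.0 / math.pi


def deflection_mas(elongation_deg, gamma=1.0, observer_distance=AU):
    """Light deflection (1 + gamma) * GM / (c^2 r) * cot(elongation / 2), in milliarcseconds."""
    half_angle = math.radians(elongation_deg) / 2.0
    angle_rad = (1.0 + gamma) * GM_SUN / (C**2 * observer_distance) / math.tan(half_angle)
    return angle_rad * RAD_TO_MAS


at_right_angles = deflection_mas(90.0)
at_47_degrees = deflection_mas(47.0)
print(f"{at_right_angles:.2f} mas at 90 deg, {at_47_degrees:.2f} mas at 47 deg")
```

Because the signal falls off only as cot(chi/2), a large number of measurements far from the Sun can still constrain $\gamma$ at the level quoted above.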

6. Astrophysical Tests
In the early days, astronomical observations of the solar system provided the basis
for developing gravitation theories. With increasingly precise observations, astrophysics
and cosmology are becoming more important for such developments. Precise timing
of pulsars provides:
(i) confirmation of the quadrupole radiation formula for gravitational radiation [17],
(ii) an additional testing ground for Post-Newtonian Parameters [17],
(iii) a test of nonbirefringence of propagation of electromagnetic waves in a gravitational
field, and
(iv) an upper limit on background gravitational-wave radiation [70-73].
For (i), (ii) and (iv), we refer to the references cited. Here, we discuss (iii).
With the null-birefringence observations of pulsar pulses and micropulses before
1980, the relations (4) for testing EEP are empirically verified to … [35-37].
With the present pulsar observations, these limits would be improved; such a detailed
analysis is in [74]. Analyzing the data from polarization measurements of extragalactic
radio sources, Haugan and Kauffmann [75] inferred that the resolution for null-
birefringence is 0.02 cycle at 5 GHz. This corresponds to a time resolution of $4 \times 10^{-12}$ s
and gives much better constraints. With a detailed analysis and more extragalactic radio
observations, (4) would be tested down to $10^{-28}$-$10^{-29}$ at cosmological distances. In 2002,
Kostelecký and Mewes [76] used polarization measurements of light from cosmologically
distant astrophysical sources to yield stringent constraints down to 2 × …. The
electromagnetic propagation in Moffat's nonsymmetric gravitational theory fits the χ-g
framework. Krisher [77], and Haugan and Kauffmann [75] have used the pulsar data and
extragalactic radio observations to constrain it.

7. Cosmological Tests
In an attempt to find a static cosmology, Einstein added a cosmological constant Λ to
his equation. The term containing Λ can be interpreted as a modification of Einstein's
equation or it can be interpreted simply as vacuum stress-energy.
Although Einstein considered the proposal of this term the biggest blunder of his life,
the value of Λ needs to be determined using cosmological observations.
Recent evidence suggests that Type Ia supernovae (SNeIa) can be used as precise
cosmological distance indicators [78]. Early results with these SNeIa observations imply
that there is not enough gravitating matter to close the universe [18, 19] and that
currently the expansion of the Universe is accelerating [20, 21], indicating that the Λ-density
(cosmological term, dark energy or quintessence) is larger than the ordinary-matter
density. More supernovae observations together with more precise cosmic background
anisotropy measurements will be important in testing and determining the gravitational
equation in the cosmological context.
In section 4, we mentioned a nonmetric theory [33, 34, 47] in discussing the
foundations of relativistic gravity. Theories with spontaneous direction [79] and axion
theories also have such an electromagnetic interaction. The effect of $\varphi$ [in (9)] in this
theory is to change the phase of two different circular polarizations in a gravitational field
and to give polarization rotation for linearly polarized light [47, 79, 80]. Using polarization
observations of radio galaxies, Carroll, Field and Jackiw [79, 80] put a limit of 0.1 on $\Delta\varphi$
over cosmological distances. Using a different analysis of polarization observations of
radio galaxies, Nodland and Ralston [81] found an indication of anisotropy in
electromagnetic propagation over cosmological distances with a birefringence scale of
order … m (i.e., about 0.1-0.2 Hubble distance). This gave $\Delta\varphi \sim 5$-$10$ over the Hubble
distance. Later analyses [82-86] did not confirm this result and put a limit of $\Delta\varphi \le 1$
over the cosmological distance scale.
The natural coupling strength $\varphi$ is of order 1. However, the isotropy of our
observable universe to … may lead to a change $\Delta\varphi$ of $\varphi$ over cosmological distance
scale … smaller [87]. Hence, observations to test and measure $\Delta\varphi$ to … are significant
and they are promising. In 2002, the DASI microwave interferometer observed the
polarization of the cosmic background. With the axial interaction (9), the polarization
anisotropy is shifted relative to the temperature anisotropy. In 2003, WMAP (Wilkinson
Microwave Anisotropy Probe) [88] found that the polarization and temperature are
correlated. This gives a constraint of $10^{-1}$ on $\Delta\varphi$ [89]. Planck Surveyor [90] will be
launched in 2007 with better polarization-temperature measurement and will give a
sensitivity to $\Delta\varphi$ of $10^{-2}$-$10^{-3}$. A dedicated future experiment on cosmic microwave
background radiation will reach $10^{-5}$-$10^{-6}$ $\Delta\varphi$-sensitivity. This is very significant as a
positive result may indicate that our patch of inflationary universe has a 'spontaneous
polarization' in the fundamental law of electromagnetic propagation influenced by
neighboring patches and we can observe neighboring patches through a determination of
this fundamental physical law; if a negative result turns out at this level, it may give a
good constraint on superstring theories as axions are natural to superstring theories.
In section 4, we mentioned Hughes-Drever experiments [39] to test the spatial
anisotropy. Here we mention tests of cosmic spatial anisotropy using polarized electrons.
Following Phillips' pioneering work [91], we [92] and Berglund et al. [93] improved the
sensitivity. In 2000, we used a rotatable torsion balance carrying a transversely spin-
polarized ferrimagnetic Dy6Fe23 mass to test the cosmic spatial anisotropy, and have
achieved an order-of-magnitude improvement [94, 95] over previous experiments; the
anomalous transverse energy splitting of spin states of the electron due to spatial anisotropy
is constrained to be less than 6 × … GeV. The Eöt-Wash group also improved on their
experiment and gave a constraint of $1.2 \times 10^{-…}$ GeV [96].

8. Gravitational-Wave Observations
The importance of gravitational-wave detection is twofold: (i) as probes to fundamental
physics and cosmology, especially black hole physics and early cosmology, and (ii) as
tools in astronomy and astrophysics to study compact objects and to count them. We
follow [97] to extend the conventional classification of gravitational-wave frequency
bands [98] into the ranges:
(i) High-frequency band (1-10 kHz): This is the frequency band that ground
gravitational-wave detectors are most sensitive to.
(ii) Low-frequency band (100 nHz - 1 Hz): This is the frequency band that space
gravitational-wave experiments are most sensitive to.
(iii) Very-low-frequency band (300 pHz - 100 nHz): This is the frequency band that the
pulsar timing experiments are most sensitive to.
(iv) Extremely-low-frequency band (1 aHz - 10 fHz): This is the frequency band that the
cosmic microwave anisotropy and polarization experiments are most sensitive to.
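
The classification above can be expressed as a small lookup. This is an illustrative sketch only: the band edges are taken from the list, except that the lower edge of the high-frequency band is read here as 1 Hz, which is an assumption (the text writes "1-10 kHz"). Frequencies falling between the listed ranges return `None`.

```python
# Band edges in Hz. The 1 Hz lower edge of the high-frequency band is an assumption.
BANDS = [
    ("high-frequency", 1.0, 1.0e4),
    ("low-frequency", 1.0e-7, 1.0),
    ("very-low-frequency", 3.0e-10, 1.0e-7),
    ("extremely-low-frequency", 1.0e-18, 1.0e-14),
]


def classify(frequency_hz):
    """Return the name of the first listed band containing the frequency, or None."""
    for name, low, high in BANDS:
        if low <= frequency_hz <= high:
            return name
    return None


for f in (100.0, 1.0e-3, 1.0e-8, 1.0e-16, 1.0e-12):
    print(f, classify(f))
```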
The cryogenic resonant bar detectors have already reached a strain sensitivity of
$10^{-21}/(\mathrm{Hz})^{1/2}$ in the kHz region. Five such detectors --- ALLEGRO, AURIGA,
EXPLORER, NAUTILUS and NIOBE --- have been on the air, forming a network with
their bar axes quasi-parallel in a continuous search for bursts. The TAMA (300 m armlength)
interferometer started accumulating data in 2000. GEO and the kilometer-size laser-
interferometric gravitational-wave detectors --- LIGO and VIRGO --- took runs and
started to accumulate data also, with a strain sensitivity goal of $10^{-23}/(\mathrm{Hz})^{1/2}$ in the
frequency range around 100 Hz. Various limits on the gravitational-wave strains for different
sources become significant. For example, analysis of data collected during the second
LIGO science run set strain upper limits as low as a few times … for some pulsar
sources; these translate into limits on the equatorial ellipticities of the pulsars, which are
smaller than … for the four closest pulsars [99].
Space interferometers (LISA [100, 101], ASTROD [102, 103]) for gravitational-
wave detection hold the most promise. LISA (Laser Interferometer Space Antenna) [100]
is aimed at detection of low-frequency (… to 1 Hz) gravitational waves with a strain
sensitivity of $4 \times 10^{-…}/(\mathrm{Hz})^{1/2}$ at 1 mHz. There are abundant sources for LISA: galactic
binaries (neutron stars, white dwarfs, etc.). Extra-galactic targets include supermassive
black hole binaries, supermassive black hole formation, and cosmic background
gravitational waves. A launch date of 2013 is hoped for.
For the very-low-frequency band and for the extremely-low-frequency band, it is
more convenient to express the sensitivity in terms of energy density per logarithmic
frequency interval divided by the cosmic closure density $\rho_c$ for a cosmic background of
gravitational waves, i.e., $\Omega_g(f)$ $(= (f/\rho_c)\,d\rho(f)/df)$.
The upper limits from pulsar timing observations on a gravitational wave
background are about $\Omega_g \le 10^{-…}$ in the frequency range 4-40 nHz [70], and $\Omega_g \le 4 \times$ …
at $6 \times 10^{-…}$ Hz [72]. More pulsar observations with extended periods of time will improve
the limits by two orders of magnitude in the lifetime of present ground and space
gravitational-wave-detector projects. The COBE microwave-background quadrupole
anisotropy measurement [104, 105] gives a limit $\Omega_g$ (1 aHz) ~ $10^{-9}$ on the extremely-low-
frequency gravitational-wave background [106, 107]. Ground and balloon experiments
probe smaller-angle anisotropies and, hence, higher-frequency background. WMAP [108]
and Planck Surveyor [90] space missions could probe anisotropies with $l$ up to 2000 and
with higher sensitivity.

9. The Next 25 Years (2005-2030)


During the last 146 years, the precision of laboratory and space experiments, and of
astrophysical and cosmological observations on relativistic gravity, has improved by 3-4
orders of magnitude. In 2000, the ground long interferometric gravitational wave
detectors began their runs to accumulate data. In April 2004, Gravity Probe B (Stanford
relativity gyroscope experiment to measure the Lense-Thirring effect to 1%) [68] was
launched and has been accumulating science data for more than 170 days now. μSCOPE
(MICROSCOPE: MICRO-Satellite à traînée Compensée pour l'Observation du Principe
d'Equivalence) [109] is on its way for a 2007 launch to test the Galileo equivalence principle
to …. LISA Pathfinder (SMART2), the technological demonstrator for the LISA
(Laser Interferometer Space Antenna) mission, is well on its way for a 2008 launch.
STEP (Satellite Test of Equivalence Principle) [46], and ASTROD (Astrodynamical
Space Test of Relativity using Optical Devices) [102, 103] are in the good planning stage.
Various astrophysical tests and cosmological tests of relativistic gravity will reach
precision and ultra-precision stages. Clock tests and atomic interferometry tests of
relativistic gravity will reach an ever-increasing precision. In the next 25 years, we
envisage a 3-5 order improvement in all directions of tests of relativistic gravity. These
will give revived interest and development both in experimental and theoretical aspects
of gravity, and may lead to answers to some profound questions of gravity and the
cosmos.
In this section, we illustrate this expectation by looking into various ongoing /
proposed experiments related to the determination of the PPN space curvature parameter
$\gamma$. Some motivations for determining $\gamma$ precisely to … are given in [110, 111].
Table 5 lists the aimed accuracy of such experiments.
GP-B is an ongoing experiment [68] using quartz gyros at low temperature to
measure the Lense-Thirring precession and the geodetic precession. The geodetic
precession gives a measure of $\gamma$.
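
The dependence of the geodetic precession on $\gamma$ can be made explicit with the standard PPN expression for a gyroscope on a circular orbit of radius r, rate = (gamma + 1/2)(GM)^{3/2}/(c^2 r^{5/2}). The sketch below is an order-of-magnitude illustration: the orbit radius is a hypothetical value for a low polar orbit, not a mission parameter taken from the text.

```python
import math

GM_EARTH = 3.986004418e14        # m^3/s^2, geocentric gravitational constant
C = 299_792_458.0                # m/s
SECONDS_PER_YEAR = 3.15576e7     # Julian year
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi
ORBIT_RADIUS_M = 7.02e6          # hypothetical circular orbit, roughly 650 km altitude


def geodetic_rate_arcsec_per_year(gamma=1.0, radius=ORBIT_RADIUS_M):
    """Geodetic precession (gamma + 1/2) * (GM)^{3/2} / (c^2 r^{5/2}), in arcsec per year."""
    rate_rad_per_s = (gamma + 0.5) * GM_EARTH**1.5 / (C**2 * radius**2.5)
    return rate_rad_per_s * SECONDS_PER_YEAR * RAD_TO_ARCSEC


rate_gr = geodetic_rate_arcsec_per_year(gamma=1.0)
rate_shifted = geodetic_rate_arcsec_per_year(gamma=1.01)
print(f"{rate_gr:.2f} arcsec/yr; ratio for gamma = 1.01: {rate_shifted / rate_gr:.5f}")
```

Since the rate scales as (gamma + 1/2), a fractional measurement error on the precession translates into an error on $\gamma$ that is 1.5 times larger.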
Bepi-Colombo [112] is planned for a launch in 2013 to Mercury. A simulation
predicts that the determination of $\gamma$ can reach 2 × … [113].
GAIA (Global Astrometric Interferometer for Astrophysics) [114] is an astrometric
mission concept aiming at the broadest possible astrophysical exploitation of optical
interferometry using a modest baseline length (~3 m). GAIA is planned to be launched in
2013. At the present study, GAIA aims at limit magnitude 21, with survey completeness
to visual magnitude 19-20, and proposes to measure the angular positions of 35 million
objects (to visual magnitude V=15) to 10 μas accuracy and those of 1.3 billion objects
(to V=20) to 0.2 mas accuracy. The observing accuracy of V=10 objects is aimed at 4
μas. To increase the weight of measuring the relativistic light deflection parameter $\gamma$,
GAIA is planned to do measurements at elongations greater than 35° (as compared to
essentially 47° for Hipparcos) from the Sun. With all these, a simulation shows that
GAIA could measure $\gamma$ to $1 \times 10^{-…}$ - $2 \times 10^{-7}$ accuracy [115].

Table 5. Aimed accuracy of PPN space parameter γ for various ongoing / proposed experiments.
The types of experiments (deflection, retardation or geodetic precession) are given in the
parentheses.

Experiment                           Aimed accuracy of γ
Bepi-Colombo [113] (retardation)     2 × …
GAIA [115] (deflection)              1 × 10^-… - 2 × 10^-7
ASTROD I [116] (retardation)         1 × 10^-7
LATOR [117]                          1 × 10^-8
ASTROD [118]                         1 × 10^-…

In the ranging experiments, the retardations (Shapiro time delays) of the
electromagnetic waves are measured to give $\gamma$. In the astrometric experiments, the
deflections of the electromagnetic waves are measured to give $\gamma$. These two kinds of
experiments complement each other in determining $\gamma$. The ASTROD I (Single Spacecraft
Astrodynamical Space Test of Relativity using Optical Devices) mission concept [116] is
to use a drag-free spacecraft orbiting around the Sun using 2-way laser pulse ranging and
laser-interferometric ranging between Earth and spacecraft to measure $\gamma$ and other
relativistic parameters precisely. The $\gamma$ parameter can be separated from the study of the
Shapiro delay variation. The uncertainty on the Shapiro delay measurement depends on
the uncertainties introduced by the atmosphere, timing systems of the ground and space
segments, and the drag-free noise. A simulation shows that an uncertainty of $10^{-7}$ on the
determination of $\gamma$ is achievable.
LATOR (Laser Astrometric Test Of Relativity) [117] proposed to use laser
interferometry between two micro-spacecraft in solar orbit, and a 100 m baseline multi-
channel stellar optical interferometer placed on the ISS (International Space Station) to
do spacecraft astrometry for a precise measurement of $\gamma$.
For ASTROD (Astrodynamical Space Test of Relativity) [102, 103, 118], 3
spacecraft, advanced drag-free systems, and mature laser interferometric ranging will be
used and the resolution is subwavelength. The accuracy of measuring $\gamma$ and other
parameters will depend on the stability of the lasers and/or clocks. An uncertainty of 1 × …
on the determination of $\gamma$ is achievable in the time frame of 2015-2025.


10. Outlook
Physics is an empirical science, and so is gravitation. The road map for gravitation is clearly
empirical. As precision is increased by orders of magnitude, we are in a position to
explore deeper into the origin of gravitation. The current and coming generations
hold such promise.

Acknowledgements
I would like to thank the National Natural Science Foundation (Grant No. 10475114),
and the Foundation of Minor Planets of Purple Mountain Observatory for supporting this
work.

Appendix
Since many readers are more familiar with the parametrization given by Eddington (A. S.
Eddington, The Mathematical Theory of Relativity [2nd ed., Cambridge University Press,
1924]) and Robertson (H. P. Robertson, p. 228 in Space Age Astronomy, ed. by A. J.
Deutsch and W. H. Klemperer [Academic Press, New York, 1962]) in testing relativistic
gravity, with the recommendation of J. P. Hsu (one of the editors of this book), I add this
explanatory appendix.
The Eddington-Robertson parametrization of the metric is

$$ds^2 = (1 - 2\alpha(GM/r) + 2\beta(GM/r)^2 + \ldots)\,dt^2 - (1 + 2\gamma(GM/r) + \ldots)(dr^2 + r^2 d\theta^2 + r^2\sin^2\theta\, d\phi^2),$$

where $\alpha$, $\beta$, $\gamma$ are the Eddington-Robertson relativistic parameters. For matter, $\alpha$ can be
absorbed into $G$; the Newtonian limit then requires $G$ to be the Newtonian gravitational
constant. To test the Einstein Equivalence Principle, for electromagnetism, we could set
$\alpha$, $\beta$ and $\gamma$ to $\alpha_{em}$, $\beta_{em}$ and $\gamma_{em}$, and the metric corresponds to $H_{ij}$ in section 4; for matter,
we then have $\alpha$, $\beta$ and $\gamma$ equal to $\alpha_{matter}$ (= 1), $\beta_{matter}$ and $\gamma_{matter}$, and the metric corresponds
to $g_{ij}$. The constraints (5, 7, 8) become

For dynamical tests of $\beta$ and $\gamma$, Table 4 and Table 5 and the associated discussions in
section 5 and section 9 apply.
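
As an illustration of a dynamical test in this parametrization, the relativistic perihelion advance per orbit for matter (with $\alpha$ = 1) is, in the standard result, (2 + 2 gamma - beta)/3 times 6 pi GM/(c^2 a (1 - e^2)). The sketch below evaluates it for Mercury; the orbital elements are assumed textbook values, the solar quadrupole contribution is ignored, and the function name is illustrative.

```python
import math

GM_SUN_OVER_C2 = 1476.625        # m, GM_sun / c^2
SEMI_MAJOR_AXIS_M = 5.7909e10    # Mercury semi-major axis (assumed value)
ECCENTRICITY = 0.20563           # Mercury orbital eccentricity (assumed value)
ORBITAL_PERIOD_DAYS = 87.969     # Mercury sidereal period (assumed value)
RAD_TO_ARCSEC = 180.0 * 3600.0 / math.pi


def perihelion_advance_arcsec_per_century(beta=1.0, gamma=1.0):
    """(2 + 2*gamma - beta)/3 * 6*pi*GM / (c^2 a (1 - e^2)) per orbit, scaled to a century."""
    semi_latus_rectum = SEMI_MAJOR_AXIS_M * (1.0 - ECCENTRICITY**2)
    per_orbit_rad = (
        (2.0 + 2.0 * gamma - beta) / 3.0 * 6.0 * math.pi * GM_SUN_OVER_C2 / semi_latus_rectum
    )
    orbits_per_century = 36525.0 / ORBITAL_PERIOD_DAYS
    return per_orbit_rad * orbits_per_century * RAD_TO_ARCSEC


advance_gr = perihelion_advance_arcsec_per_century()
print(f"{advance_gr:.2f} arcsec/century")
```

With $\beta$ = $\gamma$ = 1 this gives about 43 arcseconds per century, the anomalous advance explained by general relativity in 1915.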

References
1. I. Newton, Philosophiae Naturalis Principia Mathematica (London, 1687).
2. J. Kepler, Astronomia nova de motibus stellae Martis (Prague, 1609); Harmonice mundi (Linz,
1619).
3. G. Galilei, Discorsi e dimostrazioni matematiche intorno a due nuove scienze (Elzevir, Leiden,
1638).
4. U. J. J. Le Verrier, Théorie du mouvement de Mercure, Ann. Observ. imp. Paris (Mém.) 5, 1-196
(1859).
5. A. A. Michelson and E. W. Morley, Am. J. Sci. 34, 333 (1887).
6. H. A. Lorentz, Kon. Neder. Akad. Wet. Amsterdam. Versl. Gewone Vergad. Wis. en Natuurkd. Afd.
6, 809 (1904), and references therein.
7. H. Poincaré, C. R. Acad. Sci. 140, 1504 (1905), and references therein.
8. A. Einstein, Ann. Phys. 17, 891 (1905).
9. R. V. Eötvös, Math. Naturwiss. Ber. Ungarn 8, 65 (1889).
10. A. Einstein, Jahrb. Radioakt. Elektronik 4, 411 (1907); Corrections by Einstein in Jahrb.
Radioakt. Elektronik 5, 98 (1908); English translations by H. M. Schwartz in Am. J. Phys. 45,
512, 811, 899 (1977).
11. A. Einstein, Preuss. Akad. Wiss. Berlin, Sitzber. 778, 799, 831, 844 (1915).
12. F. Dyson, A. Eddington and C. Davidson, Phil. Trans. Roy. Soc. 220A, 291 (1920); F. Dyson,
A. Eddington and C. Davidson, Mem. Roy. Ast. Soc. 62, 291 (1920).
13. I. I. Shapiro, Phys. Rev. Lett. 13, 789 (1964).
14. P. G. Roll, R. Krotkov and R. H. Dicke, Ann. Phys. (U.S.A.) 26, 442 (1964).
15. R. V. Pound and G. A. Rebka, Phys. Rev. Lett. 4, 337 (1960); R. V. Pound and J. L. Snider,
Phys. Rev. Lett. 13, 539 (1964).
16. S. Newcomb, "Discussion and results of observations on transits of Mercury from 1677 to
1881", Astr. Pap. am. Ephem. naut. Alm., 1, 367-487 (U.S. Govt. Printing Office, Washington,
D.C., 1882).
17. J. H. Taylor, Rev. Mod. Phys. 66, 711 (1994); and references therein.
18. P. M. Garnavich, et al., Astrophys. J. 493, 53 (1998).
19. S. Perlmutter, et al., Nature 391, 51 (1998).
20. A. G. Riess, et al., Astron. J. 116, 1009 (1998).
21. S. Perlmutter, et al., Astrophys. J. 517(2), 565-586 (1999).
22. P. Moore, The Story of Astronomy, 5th revised edition (New York, Grosset & Dunlap
Publishing, 1977).
23. U. J. J. Le Verrier, C. R. Acad. Sci. Paris 29, 1 (1849); the English translation is from [24].
24. N. T. Roseveare, Mercury's Perihelion from Le Verrier to Einstein (Oxford, Clarendon Press,
1982); the reader is referred to this book for a thorough study of the history related to
Mercury's perihelion advance.
25. I. I. Shapiro, in General Relativity and Gravitation, ed. N. Ashby, D. F. Bartlett and W. Wyss,
p. 313 (Cambridge, Cambridge University Press, 1990).
26. H. Poincaré, L'état actuel et l'avenir de la physique mathématique, Bulletin des Sciences
Mathématiques, Tome 28, 2e série (reorganized 39-1), 306 (1904); the English translation is
from [27].
27. C. Marchal, Sciences 97-2 (April, 1997) and English translation provided by the author.
28. J. Stachel, p. 249 in Twentieth Century Physics, Vol. I, ed. L. M. Brown, A. Pais and B.
Pippard (New York, AIP Press, 1995).
29. D. Hilbert, Königl. Gesell. d. Wiss. Göttingen, Nachr., Math.-Phys. Kl., 395 (1915).
30. A. Einstein, Ann. Phys. (Germany) 35, 898 (1911).
31. K. Hentschel, Erwin Finlay-Freundlich and testing Einstein's theory of relativity, Arch. Hist.
Exact Sci. 47, 143 (1994); and references therein.
32. E. G. Forbes, Ann. Sci. 17, 143 (1961).
33. W.-T. Ni, Bull. Am. Phys. Soc. 19, 655 (1974).
34. W.-T. Ni, Phys. Rev. Lett. 38, 301 (1977).
35. W.-T. Ni, "Equivalence Principles and Precision Experiments", pp. 647-651 in Precision
Measurement and Fundamental Constants II, ed. by B. N. Taylor and W. D. Phillips, Natl.
Bur. Stand. (U.S.), Spec. Publ. 617 (1984).
36. W.-T. Ni, "Timing Observations of the Pulsar Propagations in the Galactic Gravitational Field
as Precision Tests of the Einstein Equivalence Principle", pp. 441-448 in Proceedings of the
Second Asian-Pacific Regional Meeting of the International Astronomical Union, ed. by B.
Hidayat and M. W. Feast (Published by Tira Pustaka, Jakarta, Indonesia, 1984).
37. W.-T. Ni, "Equivalence Principles, Their Empirical Foundations, and the Role of Precision
Experiments to Test Them", pp. 491-517 in Proceedings of the 1983 International School and
Symposium on Precision Measurement and Gravity Experiment, Taipei, Republic of China,
January 24-February 2, 1983, ed. by W.-T. Ni (Published by National Tsing Hua University,
Hsinchu, Taiwan, Republic of China, June, 1983).
38. C. Lämmerzahl and F. W. Hehl, Phys. Rev. D 70, 105022 (7) (2004).
39. V. W. Hughes, H. G. Robinson, and V. Beltran-Lopez, Phys. Rev. Lett. 4, 342 (1960); V.
Beltran-Lopez, H. G. Robinson, and V. W. Hughes, Bull. Am. Phys. Soc. 6, 424 (1961); R. W.
P. Drever, Phil. Mag. 6, 683 (1962); J. F. Ellena, W.-T. Ni and T.-S. Ueng, IEEE Transactions
on Instrumentation and Measurement IM-36, 175 (1987); T. E. Chupp, R. J. Hoare, R. A.
Loveman, E. R. Oteiza, J. M. Richardson, M. E. Wagshul, Phys. Rev. Lett. 63, 1541 (1989).
40. W.-T. Ni, "Implications of Hughes-Drever Experiments", pp. 519-529 in Proceedings of the
1983 International School and Symposium on Precision Measurement and Gravity Experiment,
Taipei, Republic of China, January 24-February 2, 1983, ed. by W.-T. Ni (Published by
National Tsing Hua University, Hsinchu, Taiwan, Republic of China, June, 1983).
41. R. V. Eotvos, D. Pekar and E. Fekete, Ann. Phyx (Leipzig) 68, I 1 ( I 922).
42. V. B. Braginsky and V. I. Panov, Zh. Eksp. Teor. FI=. 61, 873 (1971) [Sov. Phys. JETP 34, 463
(1972)].
43. Y. Su, etal.,Phys. Rev. D 50, 3614(1994); S. Baessler, etal., Phys. Rev. Lett. 83, 3585 (1999);
E. G. Adelberger, Class. Quantum Grav. 18,2397 (2001).
44. R. F. C. Vessot, and M. W. Levine, Gen. Rel. Grav. 10, 181 (1979).
45. R. F. C. Vessot, et al., Phys. Rev. Lett. 45, 2081 (1980).
46. ESA SCI(93)4, STEP (Satellite test of the equivalence principle) report on the phase A study
(1993).
47. W.-T. Ni, A Nonmetric Theory of Gravity, preprint, Montana State University, Bozeman,
Montana, USA (1973). The paper is available via
http://gravity5.phys.nthu.edu.tw/webpage/article4/index.html.
48. W.-T. Ni, "Spin, Torsion and Polarized Test-Body Experiments", pp. 531-540 in Proceedings
of the 1983 International School and Symposium on Precision Measurement and Gravity
Experiment, Taipei, Republic of China, January 24-February 2, 1983, ed. by W.-T. Ni
(Published by National Tsing Hua University, Hsinchu, Taiwan, Republic of China, June,
1983).
49. W.-T. Ni, Searches for the role of polarization and spin in gravitation, review article in
preparation for Reports on Progress in Physics (2005).
50. W.-T. Ni, Phys. Lett. A 120, 174 (1987).
51. C. M. Will, and K. Nordtvedt, Jr., Astrophys. J., 177, 757 (1972); and references therein.
52. J. D. Anderson, E. L. Lau, S. Turyshev, J. G. Williams, and M. M. Nieto, Bulletin of the
American Astronomical Society 34, 660 (2002).
53. E. Pitjeva "Precise determination of the motion of planets and some astronomical constants
from modern observations", 12 p, to be published in IAU Coll. N 196 / Transit of Venus: new
views of the solar system and galaxy (ed. D.W. Kurtz), Cambridge: Cambridge University
Press, 2005.
54. J. G. Williams, X. X. Newhall, and J. O. Dickey, Phys. Rev. D 53, 6730 (1996).
55. J. G. Williams, S. G. Turyshev, and D. H. Boggs, Phys. Rev. Lett. 93, 261101(4) (2004).
56. B. Bertotti, L. Iess, and P. Tortora, Nature 425, 374-376 (2003).
57. R. W. Hellings, P. J. Adams, J. D. Anderson, M. S. Keesey, E. L. Lau, E. M. Standish, V. M.
Canuto and I. Goldman, Phys. Rev. Lett. 51, 1609 (1983).
58. R. D. Reasenberg, Philos. Trans. R. Soc. London, Ser. A 310, 227 (1983); J. F. Chandler, R. D.
Reasenberg, and I. I. Shapiro, Bull. Amer. Astron. Soc. 25, 1233 (1993).
59. J. D. Anderson, J. K. Campbell, R. F. Jurgens, E. L. Lau, X. X. Newhall, M. A. Slade III, and E.
M. Standish Jr., "Recent Developments in Solar-System Tests of General Relativity", in
Proceedings of the 6th Marcel Grossmann Meeting on General Relativity, Ed. H. Sato and T.
Nakamura, p. 353 (Singapore, World Scientific, 1992).
60. J. G. Williams, J. D. Anderson, D. H. Boggs, E. L. Lau, and J. O. Dickey, Bulletin of the
American Astronomical Society 33, p. 836 (2001).


61. D. S. Robertson, W. E. Carter, and W. H. Dillinger, Nature 349, 768 (1991).
62. D. E. Lebach, et al., Phys. Rev. Lett. 75, 1439 (1995).
63. S. S. Shapiro, J. L. Davis, D. E. Lebach, and J. S. Gregory, Phys. Rev. Lett. 92, 121101(4)
(2004).
64. E. B. Fomalont, and S. M. Kopeikin, Astrophys. J., 598, 704 (2003).
65. E. M. Standish, JPL Interoffice Memorandum, 312.N-03-009, 16 p (2003).
66. G. A. Krasinsky and V. A. Brumberg, Celest. Mech. & Dyn. Astron. 90, 267-288 (2004).
67. I. Ciufolini, and E. C. Pavlis, Nature 431, 958-960 (2004); N. Ashby, Nature 431, 918-919
(2004).
68. GP-B (Gravity Probe B) http://einstein.stanford.edu/; C. W. F. Everitt, S. Buchman, D. B.
DeBra, G. M. Keiser, J. M. Lockhart, B. Muhlfelder, B. W. Parkinson, J. P. Turneaure, and
other members of the Gravity Probe B team, Gravity Probe B: Countdown to Launch, pp.
52-82 in Gyros, Clocks, Interferometers...: Testing Relativistic Gravity in Space, eds. C.
Lämmerzahl, C. W. F. Everitt, F. W. Hehl (Berlin, Springer-Verlag, 2001).
69. M. Froeschlé, F. Mignard and F. Arenou, p. 49 in Proceedings of the ESA Symposium
Hipparcos-Venice 97, 13-16 May, Venice, Italy, ESA SP-402 (July 1997).
70. M. P. McHugh, G. Zalamansky, F. Vernotte, and E. Lantz, Phys. Rev. D 54, 5993 (1996), and
references therein.
71. G. Zalamansky, C. Robert, F. Vernotte et al., Mon. Not. Roy. Astron. Soc. 288 (2), 533-537
(1997).
72. A. N. Lommen, Proceedings of the 270 WE-Heraeus Seminar on: Neutron Stars, Pulsars and
Supernova remnants, Physikzentrum, Bad Honnef, Germany, Jan. 21-25, 2002, eds. W.
Becker, H. Lesch and J. Trümper, MPE Report 278, pp. 114-125; astro-ph/0208572.
73. V. A. Potapov, Y. P. Ilyasov, V. V. Oreshko, and A. E. Rodin, Astronomy Letters 29 (4), 241-
245 (2003); S. M. Kopeikin, V. A. Potapov, Mon. Not. Roy. Astron. Soc. 355 (2), 395-412
(2004), and references therein.
74. H.-W. Huang, Pulsar Timing and Equivalence Principle Tests, Master thesis, National Tsing
Hua University (Hsinchu, 2002).
75. M. P. Haugan and T. F. Kauffmann, Phys. Rev. D 52, 3168 (1995).
76. V. A. Kostelecky and M. Mewes, Phys. Rev. D 66, 056005(24) (2002).
77. T. P. Krisher, Phys. Rev. D 44, R2211 (1991).
78. A. G. Riess, W. H. Press and R. P. Kirshner, Astrophys. J., 473, 588 (1996), and references
therein.
79. S. M. Carroll, G. B. Field, R. Jackiw, Phys. Rev. D 41, 1231 (1990).
80. S. M. Carroll and G. B. Field, Phys. Rev. D 43, 3789 (1991).
81. B. Nodland and J. P. Ralston, Phys. Rev. Lett. 78, 3043 (1997).
82. J. F. C. Wardle, R. A. Perley, and M. H. Cohen, Phys. Rev. Lett. 79, 1801 (1997).
83. D. J. Eisenstein and E. F. Bunn, Phys. Rev. Lett. 79, 1957 (1997).
84. S. M. Carroll and G. B. Field, Phys. Rev. Lett. 79, 2394 (1997).
85. T. J. Loredo, E. A. Flanagan, and I. M. Wasserman, Phys. Rev. D 56, 7507 (1997).
86. S. M. Carroll, Phys. Rev. Lett. 81, 3067 (1998).
87. W.-T. Ni, Spin in Gravity, pp. 439-456 in Gyros, Clocks, Interferometers...: Testing
Relativistic Gravity in Space, eds. C. Lämmerzahl, C. W. F. Everitt, F. W. Hehl (Berlin,
Springer-Verlag, 2001).
88. A. Kogut, D. N. Spergel, C. Barnes, et al., Astrophys. J. Suppl. 148 (1), 161-173 (2003); A.
Kogut, New Astron. Rev., 47(11-12), 977-986 (2003).
89. W.-T. Ni, Chin. Phys. Lett. 22(1), 33-35 (2005).
90. www.rssd.esa.int/index.php?project=PLANCK
91. R. Phillips, Phys. Rev. Lett. 59, 1784 (1987).
92. S.-L. Wang, W.-T. Ni and S.-S. Pan, Modern Phys. Lett. A 8, 3715 (1993).
93. C. J. Berglund et al., Phys. Rev. Lett. 75, 1879 (1995).
94. L.-S. Hou and W.-T. Ni, Test of Spatial Anisotropy for Polarized Electrons Using a
Rotatable Torsion Balance, p. 143 in Proceedings of the International Workshop on
Gravitation and Astrophysics, Tokyo, 1997, edited by K. Kuroda (ICRR, Tokyo, 1998).
95. L.-S. Hou, W.-T. Ni and Y.-C. M. Li, Phys. Rev. Lett. 90 (20), 201101 (4) (2003); Preprint
physics/0009012 (2000).
96. B. R. Heckel et al., in CPT and Lorentz Symmetry II, V. A. Kostelecky, ed. (World Scientific,
Singapore, 2002).
97. K. S. Thorne, Gravitational Waves, p. 160 in Particle and Nuclear Astrophysics and Cosmology
in the Next Millennium, ed. E. W. Kolb and R. D. Peccei (World Scientific, Singapore, 1995).
98. W.-T. Ni, "ASTROD and gravitational waves", pp. 117-129 in Gravitational Wave Detection,
edited by K. Tsubono, M.-K. Fujimoto and K. Kuroda (Universal Academy Press, Tokyo,
Japan, 1997).
99. B. Abbott et al., "Limits on gravitational wave emission from selected pulsars using LIGO data",
arXiv: gr-qc/0410007 v2, 19 Jan. 2005 (2005).
100. LISA, Pre-Phase A Report, second edition, July (1998).
101. LISA, Laser Interferometer Space Antenna: A Cornerstone Mission for the Observation of
Gravitational Waves, ESA System and Technology Study Report, ESA-SCI 11 (2000).
102. A. Bec-Borsenberger, J. Christensen-Dalsgaard, M. Cruise, A. Di Virgilio, D. Gough, M.
Keiser, A. Kosovichev, C. Laemmerzahl, J. Luo, W.-T. Ni, A. Peters, E. Samain, P. H.
Scherrer, J.-T. Shy, P. Touboul, K. Tsubono, A.-M. Wu, and H.-C. Yeh, "Astrodynamical
Space Test of Relativity using Optical Devices ASTROD---A Proposal Submitted to ESA in
Response to Call for Mission Proposals for Two Flexi-Mission F2/F3", January 31, 2000.
103. W.-T. Ni, Int. J. Mod. Phys. D 11 (7), 947-962 (2002).
104. G. F. Smoot et al., Astrophys. J. 396, L1 (1992).
105. C. L. Bennett et al., Astrophys. J. 464, L1 (1996).
106. L. M. Krauss and M. White, Phys. Rev. Lett. 69, 969 (1992).
107. R. L. Davis, H. M. Hodges, G. F. Smoot, P. J. Steinhardt, and M. S. Turner, Phys. Rev. Lett.
69, 1856 (1992).
108. C. L. Bennett, M. Halpern, G. Hinshaw, et al., Astrophys. J. Suppl. 148 (1), 1-27 (2003).
109. P. Touboul, M. Rodrigues, G. Metris and B. Tatry, MICROSCOPE, testing the equivalence
principle in space, C. R. Acad. Sci. Ser. IV 2(9), 1271-1286 (2001).
110. T. Damour and K. Nordtvedt, Jr., Phys. Rev. Lett. 70, 2217 (1993).
111. T. Damour, F. Piazza, and G. Veneziano, Phys. Rev. D 66, 046007 (2002); Preprint
hep-th/0205111 (2002).
112. http://www.esa.int/esaSC/120391-index-0-m.html
113. A. Milani, D. Vokrouhlicky, D. Villani, C. Bonanno, and A. Rossi, Phys. Rev. D 66 (8),
082001(21) (2004).
114. http://www.esa.int/esaSC/120377-index-0-m.html
115. A. Vecchiato, M. G. Lattanzi, B. Bucciarelli, et al., Astron. Astrophys. 399 (1), 337-342
(2003).
116. W.-T. Ni, G. Bao, Y. Bao, et al., J. Korean Phys. Soc. 45, S118-S123 (2004).
117. S. G. Turyshev, M. Shao, and K. Nordtvedt, Class. Quantum Grav. 21, 2773-2799 (2004).
118. W.-T. Ni, S. Shiomi and A.-C. Liao, Class. Quantum Grav. 21, S641 (2004).

Binary pulsars and relativistic gravity*

Joseph H. Taylor, Jr.
Princeton University, Princeton, New Jersey 08544

*Nobel Lecture, presented to the Royal Swedish Academy of Sciences on 8 December 1993.
Reviews of Modern Physics, Vol. 66, No. 3, July 1994, p. 711. ©1994 The American Physical Society

I. SEARCH AND DISCOVERY

Work leading to the discovery of the first pulsar in a binary system began more than twenty
years ago, so it seems reasonable to begin with a bit of history. Pulsars burst onto the scene
(Hewish et al., 1968) in February 1968, about a month after I completed my Ph.D. at Harvard
University. Having accepted an offer to remain there on a post-doctoral fellowship, I was
looking for an interesting new project in radio astronomy. When Nature announced the discovery
of a strange new rapidly pulsating radio source, I immediately drafted a proposal, together
with Harvard colleagues, to observe it with the 92 m radio telescope of the National Radio
Astronomy Observatory. By late spring we had detected and studied all four of the pulsars
which by then had been discovered by the Cambridge group, and I began thinking about how to
find further examples of these fascinating objects, which were already thought likely to be
neutron stars.

Pulsar signals are generally quite weak, but have some unique characteristics that suggest
effective search strategies. Their otherwise noise-like signals are modulated by periodic,
impulsive waveforms; as a consequence, dispersive propagation through the interstellar medium
makes the narrow pulses appear to sweep rapidly downward in frequency. I devised a computer
algorithm for recognizing such periodic, dispersed signals in the inevitable background noise,
and in June 1968 we used it to discover the fifth known pulsar (Huguenin et al., 1968).

Since pulsar emissions exhibited a wide variety of new and unexpected phenomena, we observers
put considerable effort into recording and studying their details and peculiarities. A pulsar
model based on strongly magnetized, rapidly spinning neutron stars was soon established as
consistent with most of the known facts (Gold, 1968). The model was strongly supported by the
discovery of pulsars inside the glowing, gaseous remnants of two supernova explosions, where
neutron stars should be created (Large et al., 1968; Staelin and Reifenstein, 1968), and also
by an observed gradual lengthening of pulsar periods (Richards and Comella, 1969) and
polarization measurements that clearly suggested a rotating source (Radhakrishnan and Cooke,
1969). The electrodynamical properties of a spinning, magnetized neutron star were studied
theoretically (Goldreich and Julian, 1969) and shown to be plausibly capable of generating
broadband radio noise detectable over interstellar distances. However, the rich diversity of
the observed radio pulses suggested magnetospheric complexities far beyond those readily
incorporated in theoretical models. Many of us suspected that detailed understanding of the
pulsar emission mechanism might be a long time coming, and that, in any case, the details
might not turn out to be fundamentally illuminating.

In September 1969 I joined the faculty at the University of Massachusetts, where a small group
of us planned to build a large, cheap radio telescope especially for observing pulsars. Our
telescope took several years to build, and during this time it became clear that whatever the
significance of their magnetospheric physics, pulsars were interesting and potentially
important to study for quite different reasons. As the collapsed remnants of supernova
explosions, they could provide unique experimental data on the final stages of stellar
evolution, as well as an opportunity to study the properties of nuclear matter in bulk.
Moreover, many pulsars had been shown to be remarkably stable natural clocks (Manchester and
Peters, 1972), thus providing an alluring challenge to the experimenter, with consequences and
applications about which we could only speculate at the time. For such reasons as these, by
the summer of 1972 I was devoting a large portion of my research time to the pursuit of
accurate timing measurements of known pulsars, using our new telescope in western
Massachusetts, and to planning a large-scale pulsar search that would use bigger telescopes at
the national facilities.

I suspect it is not unusual for an experiment's motivation to depend, at least in part, on
private thoughts quite unrelated to avowed scientific goals. The challenge of a good
intellectual puzzle, and the quiet satisfaction of finding a clever solution, must certainly
rank highly among my own incentives and rewards. If an experiment seems difficult to do, but
plausibly has interesting consequences, one feels compelled to give it a try. Pulsar searching
is the perfect example: it's clear that there must be lots of pulsars out there, and, once
identified, they are not so very hard to observe. But finding each one for the first time is a
formidable task, one that can become a sort of detective game. To play the game you invent an
efficient way of gathering clues, sorting, and assessing them, hoping to discover the
identities and celestial locations of all the guilty parties.

Most of the several dozen pulsars known in early 1972 were discovered by examination of
strip-chart records, without benefit of further signal processing. Nevertheless, it was clear
that digital computer techniques would be essential parts of more sensitive surveys. Detecting
new pulsars is necessarily a multidimensional process; in addition to the usual variables of
two spatial coordinates, one must also search thoroughly over wide ranges of period and
dispersion measure. Our first pulsar survey, in 1968, sought evidence of pulsar signals by
computing the discrete Fourier transforms of long sequences of intensity samples, allowing for
the expected narrow pulse shapes by summing the amplitudes of a dozen or more harmonically
related frequency components. I first described this basic algorithm (Burns and Clark, 1969)
as part of a discussion of pulsar search techniques, in 1969. An efficient
dispersion-compensating algorithm was conceived and implemented soon afterward (Manchester
et al., 1972; Taylor, 1974), permitting extension of the method to two dimensions.
Computerized searches over period and dispersion measure, using these basic algorithms, have
by now accounted for discovery of the vast majority of nearly 600 known pulsars, including
forty in binary systems (Taylor et al., 1993; Camilo, 1994).

In addition to private stimuli related to "the thrill of the chase," my outwardly expressed
scientific motivation for planning an extensive pulsar survey in 1972 was a desire to double
or triple the number of known pulsars. I had in mind the need for a more solid statistical
basis for drawing conclusions about the total number of pulsars in the Galaxy, their spatial
distribution, how they fit into the scheme of stellar evolution, and so on. I also realized
(Taylor, 1972) that "it would be highly desirable . . . to find even one example of a pulsar
in a binary system, for measurement of its parameters could yield the pulsar mass, an
extremely important number." Little did I suspect that just such a discovery would be made, or
that it would have much greater significance than anyone had foreseen! In addition to its own
importance, the binary pulsar PSR 1913+16 is now recognized as the harbinger of a new class of
unusually short-period pulsars with numerous important applications.

An up-to-date map of known pulsars on the celestial sphere is shown in Fig. 1. The binary
pulsar PSR 1913+16 is found in a clump of objects close to the Galactic plane around longitude
50°, a part of the sky that passes directly overhead at the latitude of the Arecibo
Observatory in Puerto Rico. Forty of these pulsars, including PSR 1913+16, were discovered in
the survey that Russell Hulse and I carried out with the 305 m Arecibo telescope (Hulse and
Taylor, 1974, 1975a, 1975b).

FIG. 1. Distribution of 558 pulsars in Galactic coordinates. The Galactic center is in the
middle, and longitude increases to the left. (The position of PSR 1913+16 is labeled.)

Figure 2 illustrates the periods and spin-down rates of known pulsars, with those in binary
systems marked by larger circles around the dots. All radio pulsars slow down gradually in
their own rest frames, but the slow-down rates vary over nine orders of magnitude. Figure 2
makes it clear that binary pulsars are special in this regard. With few exceptions, they have
unusually small values of both period and period derivative, an important factor which helps
to make them especially suitable for high-precision timing measurements.

FIG. 2. Periods and period derivatives of known pulsars. Binary pulsars, denoted by larger
circles around the dots, generally have short periods and small derivatives. Symbols aligned
near the bottom represent pulsars for which the slow-down rate has not yet been measured.
(Horizontal axis: Period (s), logarithmic.)

Much of the detailed implementation and execution of our 1973-74 Arecibo survey was carried
out by Russell Hulse. He describes that work, and particularly the discovery of PSR 1913+16,
in his accompanying lecture (Hulse, 1994). The significant consequences of our discovery have
required accurate timing measurements extending over many years, and since 1974-76 I have
pursued these with a number of other collaborators. I shall now turn to a description of these
observations.


II. CLOCK-COMPARISON EXPERIMENTS

Pulsar timing experiments are straightforward in concept: one measures pulse times of arrival
(TOAs) at the telescope, and compares them with time kept by a stable reference clock. A
remarkable wealth of information about a pulsar's spin, location in space, and orbital motion
can be obtained from such simple measurements. For binary pulsars, especially, the task of
analyzing a sequence of TOAs often assumes the guise of another intricate detective game.
Principal clues in this game are the recorded TOAs. The first and most difficult objective is
the assignment of unambiguous pulse numbers to each TOA, despite the fact that some of the
observations may be separated by months or even years from their nearest neighbors. During
such inevitable gaps in the data, a pulsar may have rotated through as many as $10^7$-$10^{10}$
turns, and in order to extract the maximum information content from the data, these integers
must be recovered exactly. Fortunately, the correct sequence of pulse numbers is easily
recognized, once attained, so you can tell when the game has been won.

A block diagram of equipment used for recent pulsar timing observations (Taylor, 1991) at
Arecibo is shown in Fig. 3. Incoming radio-frequency signals from the antenna are amplified,
converted to intermediate frequency, and passed through a multichannel spectrometer equipped
with square-law detectors. A bank of digital signal averagers accumulates estimates of a
pulsar's periodic wave form in each spectral channel, using a precomputed digital ephemeris
and circuitry synchronized with the observatory's master clock. A programmable synthesizer,
its output frequency adjusted once a second in a phase-continuous manner, compensates for
changing Doppler shifts caused by accelerations of the pulsar and the telescope. Average
profiles are recorded once every few minutes, together with appropriate time tags. A log is
kept of small measured offsets (typically of order 1 μs) between the observatory clock and the
best available standards at national time-keeping laboratories, with time transfer
accomplished via satellites in the Global Positioning System.

FIG. 3. Simplified block diagram of equipment used for timing pulsars at Arecibo. (Labels
recoverable from the diagram: "From antenna", "Synthesizer", "GPS Satellites", "UTC(NIST)
Boulder".)

An example of pulse profiles recorded during timing observations of PSR 1913+16 is presented
in Fig. 4, which shows intensity profiles for 32 spectral channels spanning the frequency
range 1383-1423 MHz, followed by a "de-dispersed" profile at the bottom. In a five-minute
observation such as this, the signal-to-noise ratio is just high enough for the double-peaked
pulse shape of PSR 1913+16 to be evident in the individual channels. Pulse arrival times are
determined by measuring the phase offset between each observed profile and a long-term average
with much higher signal-to-noise ratio. Differential dispersive delays are removed, the
adjusted offsets are averaged over all channels, and the resulting mean value is added to the
time tag to obtain an equivalent TOA. Nearly 5000 such five-minute measurements have been
obtained for PSR 1913+16 since 1974, using essentially this technique. Through a number of
improvements in the data-taking systems (Taylor et al., 1976; McCulloch et al., 1979; Taylor
et al., 1979; Taylor and Weisberg, 1982, 1989; Stinebring et al., 1992), the typical
uncertainties have been reduced from around 300 μs in 1974 to 15-20 μs since 1981.

FIG. 4. Pulse profiles obtained on April 24, 1992 during a five-minute observation of PSR
1913+16. The characteristic double-peaked shape, clearly seen in the de-dispersed profile at
the bottom, is also discernible in the 32 individual spectral channels. (Horizontal axis:
Time (s); channels run from 1383 to 1423 MHz.)
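
The step in which a TOA is obtained from "the phase offset between each observed profile and a long-term average" can be sketched as a circular cross-correlation. This is a simplified illustration under stated assumptions, not the observatory's procedure: the function names and the double-peaked template are hypothetical, the 0.059 s period is an illustrative value, the offset is resolved only to a whole phase bin (practical timing fits a fractional offset), and dispersion removal and channel averaging are omitted.

```python
import math


def best_circular_lag(profile, template):
    """Lag (in bins) at which the profile best matches the template."""
    n = len(template)
    best_lag, best_corr = 0, -math.inf
    for lag in range(n):
        corr = sum(profile[(i + lag) % n] * template[i] for i in range(n))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag


def toa_from_profile(profile, template, time_tag_s, period_s):
    """Add the measured phase offset, converted to seconds, to the time tag."""
    n = len(template)
    lag = best_circular_lag(profile, template)
    if lag > n // 2:          # express the offset as a signed shift
        lag -= n
    return time_tag_s + lag * period_s / n


def double_peaked_template(n_bins=64):
    """Hypothetical high signal-to-noise template with two Gaussian components."""
    return [math.exp(-0.5 * ((i - 20) / 2.0) ** 2)
            + 0.7 * math.exp(-0.5 * ((i - 30) / 2.0) ** 2)
            for i in range(n_bins)]


template = double_peaked_template()
shift_bins = 5                                  # pulse arrives 5 bins late
profile = [template[(i - shift_bins) % 64] for i in range(64)]
period_s = 0.059                                # illustrative pulse period
toa = toa_from_profile(profile, template, time_tag_s=100.0, period_s=period_s)
print(best_circular_lag(profile, template), toa)
```

A positive lag here means the observed pulse is delayed relative to the template, so the equivalent TOA is later than the time tag.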


III. MODEL FITTING

In the process of data analysis, each measured topocentric TOA, say $t_{\rm obs}$, must be
transformed to a corresponding proper time of emission $T$ in the pulsar frame. Under the
assumption of a deterministic spin-down law, the rotational phase of the pulsar is given by

$$\phi(T) = \nu T + \tfrac{1}{2}\dot{\nu} T^2 , \qquad (1)$$

where $\phi$ is measured in cycles, $\nu \equiv 1/P$ is the rotation frequency, $P$ the period,
and $\dot{\nu}$ the slowdown rate. Since a topocentric TOA is a relativistic space-time event,
it must be transformed as a four-vector. The telescope's location at the time of a measurement
is obtained from a numerically integrated solar-system model, together with published data on
the Earth's unpredictable rotational variations. As a first step one normally transforms to
the solar-system barycenter, using the weak-field, slow-motion limit of general relativity.
The necessary equations include terms depending on the positions, velocities, and masses of
all significant solar-system bodies. Next, one accounts for propagation effects in the
interstellar medium; and finally, for the orbital motion of the pulsar itself.

With presently achievable accuracies, all significant terms in the relativistic transformation
can be summarized in the single equation

$$
\begin{aligned}
T ={}& t_{\rm obs} - t_0 + \Delta_C - D/f^2 + \Delta_{R\odot}(\alpha,\delta,\mu_\alpha,\mu_\delta,\pi) \\
     & + \Delta_{E\odot} - \Delta_{S\odot}(\alpha,\delta) \\
     & - \Delta_R(x,e,P_b,T_0,\omega,\dot{\omega},\dot{P}_b) - \Delta_E(\gamma) - \Delta_S(r,s) . \qquad (2)
\end{aligned}
$$

Here $t_0$ is a nominal equivalent TOA at the solar-system barycenter; $\Delta_C$ represents
measured clock offsets; $D/f^2$ is the dispersive delay for propagation at frequency $f$
through the interstellar medium; $\Delta_{R\odot}$, $\Delta_{E\odot}$, and $\Delta_{S\odot}$
are propagation delays and relativistic time adjustments within the solar system; and
$\Delta_R$, $\Delta_E$, and $\Delta_S$ are similar terms for effects within a binary pulsar's
orbit. Subscripts on the various $\Delta$'s indicate the nature of the time-dependent delays,
which include "Römer," "Einstein," and "Shapiro" delays in the solar system and in the pulsar
orbit. The Römer terms have amplitudes comparable to the orbital periods times $v/2\pi c$,
where $v$ is the orbital velocity and $c$ the speed of light. The Einstein terms, representing
the integrated effects of gravitational redshift and time dilation, are smaller by another
factor $ev/c$, where $e$ is the orbital eccentricity. The Shapiro time delay is a result of
reduced velocities that accompany the well-known bending of light rays propagating close to a
massive object. The delay amounts to about 120 μs for one-way lines of sight grazing the Sun,
and the magnitude depends logarithmically on the angular impact parameter. The corresponding
delay within a binary pulsar orbit depends on the companion star's mass, the orbital phase,
and the inclination $i$ between the orbital angular momentum and the line of sight.

Figure 5 illustrates the combined orbital delay $\Delta_R + \Delta_E + \Delta_S$ for PSR
1913+16, plotted as a function of orbital phase. Despite the fact that the Einstein and
Shapiro effects are orders of magnitude smaller than the Römer delay, they can still be
measured separately if the precision of available TOAs is high enough. In fact, the available
precision is very high indeed, as one can see from the lone data point shown in Fig. 5 with
50 000$\sigma$ error bars.

FIG. 5. Orbital delays observed for PSR 1913+16 during July, 1988. The uncertainty of an
individual five-minute measurement is typically 50 000 times smaller than the error bar shown.
(Horizontal axis: Orbital phase, $P_b$ = 7.75 h.)

Equations (1) and (2) have been written to show explicitly the most significant dependences of
pulsar phase on as many as nineteen a priori unknowns. In addition to the rotational frequency
$\nu$ and spin-down rate $\dot{\nu}$, these phenomenological parameters include a reference
arrival time $t_0$, the dispersion constant $D$, celestial coordinates $\alpha$ and $\delta$,
proper-motion terms $\mu_\alpha$ and $\mu_\delta$, and annual parallax $\pi$. For binary
pulsars the terms on the third line of Eq. (2), with as many as ten significant orbital
parameters, are also required. The additional parameters include five that would be necessary
even in a purely Keplerian analysis of orbital motion: the projected semimajor axis
$x \equiv a_1 \sin i / c$, eccentricity $e$, binary period $P_b$, longitude of periastron
$\omega$, and time of periastron $T_0$. If the experimental precision is high enough,
relativistic effects can yield the values of five further "post-Keplerian" parameters: the
secular derivatives $\dot{\omega}$ and $\dot{P}_b$, the Einstein parameter $\gamma$, and the
range and shape of the orbital Shapiro delay, $r$ and $s \equiv \sin i$. Several earlier
versions of this formalism for treating timing measurements of binary pulsars exist (Blandford
and Teukolsky, 1976; Epstein, 1977; Haugan, 1989), and have been historically important to our
progress with the PSR 1913+16 experiment. The elegant framework outlined here was derived
during 1985-86 by Damour and Deruelle (1985, 1986).

Model parameters are extracted from a set of TOAs by calculating the pulsar phases $\phi(T)$
from Eq. (1) and minimizing the weighted sum of squared residuals,


racy. For each system the orbital period Pb and project-


ed semimajor axis x can be combined to give the mass
function.

Here m I and m 2 are the masses of the pulsar and com-


panion in units of the Sun's mass, M a ; I use the short-
hand notations s =sini, T a =GMa / c '=4.925 490 947
X s, where G is the Newtonian constant of gravity.
In the absence of other information, the mass function
cannot provide unique solutions for rn I , rn2, or s. Never-
theless, likely values of m 2 can be estimated by assuming
a pulsar mass close to 1.4Ma (the Chandrasekhar limit
for white dwarfs) and the median value cosi=O.5, which
FIG. 6. Schematic diagram of the analysis of pulsar timing implies s=0.87. With this approach one can distinguish
measurements carried out by the computer program TEMPO. three categories of binary pulsars, which I shall discuss
The essential functions are all described in the text.

(3)

with respect to each parameter to be determined. In this equation, $n_i$ is the closest integer to $\phi(T_i)$, and $\sigma_i$ is the estimated uncertainty of the $i$th TOA. In a valid and reliable solution the value of $\chi^2$ will be close to the number of degrees of freedom, i.e., the number of measurements $N$ minus the number of adjustable parameters. Parameter errors so large that the closest integer to $\phi(T_i)$ may not be the correct pulse number are invariably accompanied by huge increases in $\chi^2$; this is the reason for my earlier statement that correct pulse numbering is easily recognizable, once attained. In addition to providing a list of fitted parameter values and their estimated uncertainties, the least-squares solution produces a set of post-fit residuals, or differences between measured TOAs and those predicted by the model (see Fig. 6). The post-fit residuals are carefully examined for evidence of systematic trends that might suggest experimental errors, or some inadequacy in the astrophysical model, or perhaps deep physical truths about the nature of gravity.

Necessarily some model parameters will be easier to measure than others. When many TOAs are available, spaced over many months or years, it generally follows that at least the pulsar's celestial coordinates, spin parameters, and Keplerian orbital elements will be measurable with high precision, often as many as 6-14 significant digits. As we will see, the relativistic parameters of binary pulsar orbits are generally much more difficult to measure, but the potential rewards for doing so are substantial.

IV. THE NEWTONIAN LIMIT

Thirty-five binary pulsar systems have now been studied well enough to determine their basic parameters, including the Keplerian orbital elements, with good accu-

FIG. 7. Masses of the companions of binary pulsars, plotted as a function of orbital eccentricity (horizontal axis: log[orbital eccentricity], from -6 to 0). Near the marked location of PSR 1913+16, three distinct symbols have merged into one; these three binary systems, as well as their two nearest neighbors in the graph, are thought to be pairs of neutron stars. The two pulsars at the upper right are accompanied by high-mass main-sequence stars, while the remainder are believed to have white-dwarf companions.

by reference to Fig. 7: a plot of binary pulsar companion masses versus orbital eccentricities.

Twenty-eight of the binary systems in Fig. 7 have orbital eccentricities $e < 0.25$ and low-mass companions likely to be degenerate dwarfs. Most of these have nearly circular orbits; indeed, the only ones with eccentricities more than a few percent are located in globular clusters, and their orbits have probably been perturbed by near collisions with other stars. Five of the binaries have much larger eccentricities and likely companion masses of $0.8 M_\odot$ or more; these systems are thought to be pairs of neutron stars, one of which is the detectable pulsar. The large orbital eccentricities are almost certainly the result of rapid ejection of mass in the supernova explosion creating the second neutron star. Finally, at the upper right of Fig. 7 we find two binary pulsars that

Rev. Mod. Phys., Vol. 66, No.3, July 1994



716 Joseph H. Taylor, Jr.: Binary pulsars and relativistic gravity

move in eccentric orbits around high-mass main-sequence stars. These systems have not yet evolved to the stage of a second supernova explosion. Unlike the binary pulsars with compact companions, these two systems have orbits that could be significantly modified by complications such as tidal forces or mass loss.
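
As an aside on the least-squares criterion of Sec. III, the role of the nearest-integer pulse number can be sketched numerically. This is a minimal illustration, assuming the goodness-of-fit statistic has the form chi^2 = sum_i [(phi(T_i) - n_i) / (sigma_i / P)]^2, with phi in cycles and P the pulse period. The constant-frequency phase model, the function name `timing_chi2`, and the synthetic TOAs are all hypothetical and are not the actual timing software.

```python
import random

def timing_chi2(toas, sigmas, nu):
    """Chi-squared for an assumed constant-frequency model phi(T) = nu * T.

    toas   : arrival times in seconds
    sigmas : TOA uncertainties in seconds
    nu     : trial rotation frequency in Hz
    """
    period = 1.0 / nu
    chi2 = 0.0
    for t, s in zip(toas, sigmas):
        phase = nu * t                 # predicted pulse phase in cycles
        n = round(phase)               # closest integer pulse number
        chi2 += ((phase - n) / (s / period)) ** 2
    return chi2

# Synthetic data for a hypothetical 59 ms pulsar with 20 microsecond TOA errors.
random.seed(0)
true_nu = 1.0 / 0.059
sigmas = [20e-6] * 200
toas = [random.randrange(2_000_000) / true_nu + random.gauss(0.0, 20e-6)
        for _ in range(200)]

chi2_good = timing_chi2(toas, sigmas, true_nu)               # correct model
chi2_bad = timing_chi2(toas, sigmas, true_nu * (1 + 1e-6))   # pulse numbering lost
```

With the correct frequency, `chi2_good` comes out near the number of measurements (200 here). A fractional frequency error of only one part in a million scrambles the pulse numbering, and `chi2_bad` is larger by many orders of magnitude, which is why a correct pulse count is easy to recognize.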

V. GENERAL RELATIVITY AS A TOOL

As Russell Hulse and I suggested in the discovery paper for PSR 1913+16 (Hulse and Taylor, 1975a), it should be possible to combine measurements of relativistic orbital parameters with the mass function, thereby determining masses of both stars and the orbital inclination. In the post-Keplerian (PK) framework outlined above, each measured PK parameter defines a unique curve in the $(m_1, m_2)$ plane, valid within a specified theory of gravity. Experimental values for any two PK parameters (say, $\dot\omega$ and $\gamma$, or perhaps $r$ and $s$) establish the values of $m_1$, $m_2$, and $s$ unambiguously. In general relativity the equations for the five most significant PK parameters are as follows (Damour and Deruelle, 1986; Taylor and Weisberg, 1989; Damour and Taylor, 1992):

$$\dot\omega = 3\left[\frac{P_b}{2\pi}\right]^{-5/3} (T_\odot M)^{2/3} (1-e^2)^{-1}, \tag{5}$$

Again the masses $m_1$, $m_2$, and $M \equiv m_1 + m_2$ are expressed in solar units. I emphasize that the left-hand sides of Eqs. (5) through (9) represent directly measurable quantities, at least in principle. Any two such measurements, together with the well-determined values of $e$ and $P_b$, will yield solutions for $m_1$ and $m_2$, as well as explicit predictions for the remaining PK parameters.

FIG. 8. Measurements of the Shapiro time delay in the PSR 1855+09 system (horizontal axis: orbital phase). The theoretical curve corresponds to Eq. (10), and the fitted values of $r$ and $s$ can be used to determine the masses of the pulsar and companion star.

FIG. 9. The masses of ten neutron stars, measured by observing relativistic effects in binary pulsar orbits (horizontal axis: neutron star mass in solar masses). Asterisks after pulsar names denote companions to the observed pulsars. Systems shown: PSR 1534+12, PSR 1802-07, PSR 1855+09, PSR 1913+16, PSR 2127+11C, and PSR 2303+46.

The binary systems most likely to yield measurable PK parameters are those with large masses and high eccentricities and which are astrophysically "clean," so that their orbits are overwhelmingly dominated by the gravitational interactions between two compact masses. The five pulsars clustered near PSR 1913+16 in Fig. 7 would seem to be especially good candidates, and this has been borne out in practice. In the most favorable circumstances, even binary pulsars with low-mass companions and nearly circular orbits can yield significant post-Keplerian measurements. The best present example is PSR 1855+09: its orbital plane is nearly parallel to the line of sight, greatly magnifying the orbital Shapiro delay. The relevant measurements (Rawley et al., 1988; Ryba and Taylor, 1991; Kaspi et al., 1994) are illustrated in Fig. 8, together with the fitted function $\Delta_S(r,s)$, in this case closely approximated by

$$\Delta_S = -2r \log\left(1 - s\cos[2\pi(\phi - \phi_0)]\right), \tag{10}$$

where $\phi$ is the orbital phase in cycles and $\phi_0 = 0.4823$ the phase of superior conjunction. The fitted values of $r$ and $s$ yield the masses $m_1 = 1.50^{+0.26}_{-0.14}$, $m_2 = 0.258^{+0.028}_{-0.016}$. In a similar way, all binary pulsars with two measurable PK parameters yield solutions for their component masses. At present, most of the experimental data on the masses of neutron stars (see Fig. 9) come from such timing analyses of binary pulsar systems (Taylor and Dewey, 1988; Thorsett et al., 1993, and references therein).
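
To make Eqs. (5) and (10) concrete, here is a minimal numerical sketch. Only the approximate orbital period and eccentricity of PSR 1913+16 and the value of the superior-conjunction phase (0.4823) come from the text. The constant `T_SUN` (G M_sun / c^3, about 4.925 microseconds), the total mass of 2.828 solar masses, the Shapiro parameters `r` and `s`, and the reading of "log" as the natural logarithm are assumptions made for illustration.

```python
import math

T_SUN = 4.925490947e-6  # G*M_sun/c^3 in seconds (assumed value)

def omega_dot_deg_per_yr(pb_s, ecc, total_mass):
    """Periastron advance from Eq. (5), converted from rad/s to degrees per year."""
    rad_per_s = (3.0 * (pb_s / (2.0 * math.pi)) ** (-5.0 / 3.0)
                 * (T_SUN * total_mass) ** (2.0 / 3.0) / (1.0 - ecc ** 2))
    seconds_per_year = 365.25 * 86400.0
    return math.degrees(rad_per_s) * seconds_per_year

def shapiro_delay(phase, r, s, phase0=0.4823):
    """Shapiro delay of Eq. (10), in the units of r; phase is in orbital cycles."""
    return -2.0 * r * math.log(1.0 - s * math.cos(2.0 * math.pi * (phase - phase0)))

# PSR 1913+16-like orbit: P_b of about 7.75 h, e of about 0.617 (illustrative).
wdot = omega_dot_deg_per_yr(27907.0, 0.617, 2.828)

# Nearly edge-on orbit (s close to 1), with an assumed r of 1.27 microseconds.
r_assumed, s_assumed = 1.27e-6, 0.999
delay_conj = shapiro_delay(0.4823, r_assumed, s_assumed)          # superior conjunction
delay_quarter = shapiro_delay(0.4823 + 0.25, r_assumed, s_assumed)
delay_opposite = shapiro_delay(0.4823 + 0.5, r_assumed, s_assumed)
```

For these inputs `wdot` is a little over 4.2 degrees per year. The Shapiro delay peaks sharply at superior conjunction when `s` approaches 1, which is why a nearly edge-on orbit makes the effect measurable.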

VI. TESTING FOR GRAVITATIONAL WAVES

If three or more post-Keplerian parameters can be measured for a particular pulsar, the system becomes over-determined, and the extra experimental degrees of freedom transform it into a calibrated laboratory for testing relativistic gravity. Each measurable PK parameter beyond the first two provides an explicit, quantitative test. Because the velocities and gravitational energies in a high-mass binary pulsar system can be significantly relativistic, strong-field and radiative effects come into play. Two binary pulsars, PSRs 1913+16 and 1534+12, have now been timed well enough and long enough to yield three or more PK parameters. Each one provides significant tests of gravitation beyond the weak-field, slow-motion limit (Damour and Taylor, 1992; Taylor et al., 1992).

FIG. 10. Accumulated shift of the times of periastron in the PSR 1913+16 system, relative to an assumed orbit with constant period (horizontal axis: year, 1975 to 1990). The parabolic curve represents the general relativistic prediction for energy losses from gravitational radiation.

PSR 1913+16 has an orbital period $P_b \approx 7.8$ h, eccentricity $e \approx 0.62$, and mass function $f_1 = 0.13 M_\odot$. With the available data quality and time span, the Keplerian orbital parameters are actually determined with fractional accuracies of a few parts per million, or better. In addition, the PK parameters $\dot\omega$, $\gamma$, and $\dot P_b$ are determined with fractional accuracies better than $3 \times 10^{-6}$, $5 \times 10^{-4}$, and $4 \times 10^{-3}$, respectively (Taylor and Weisberg, 1989; Taylor, 1993). Within any viable relativistic theory of gravity, the values of $\dot\omega$ and $\gamma$ yield the values of $m_1$ and $m_2$ and a corresponding prediction for $\dot P_b$ arising from the damping effects of gravitational radiation. At present levels of accuracy, a small kinematic correction (approximately 0.5% of the observed $\dot P_b$) must be included to account for accelerations of the solar system and the binary pulsar system in the Galactic gravitational field (Damour and Taylor, 1991). After doing so, we find that Einstein's theory passes this extraordinarily stringent test with a fractional accuracy better than 0.4% (see Figs. 10 and 11). The clock-comparison experiment for PSR 1913+16 thus provides direct experimental proof that changes in gravity propagate at the speed of light, thereby creating a dissipative mechanism in an orbiting system. It necessarily follows that gravitational radiation exists and has a quadrupolar nature.
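
The parabolic trend of Fig. 10 can be sketched numerically. The snippet below assumes the standard general-relativistic quadrupole formula for the orbital decay rate (quoted from the general literature, since its explicit form is not reproduced in this passage), illustrative masses of 1.44 and 1.39 solar masses, and the approximation that the accumulated shift in periastron time after an elapsed time t is roughly Pb_dot * t^2 / (2 * P_b). The function names are hypothetical.

```python
import math

T_SUN = 4.925490947e-6  # G*M_sun/c^3 in seconds (assumed value)

def pb_dot_gr(pb_s, ecc, m1, m2):
    """Orbital period derivative (s/s) from quadrupole gravitational radiation."""
    enhancement = ((1 + (73 / 24) * ecc**2 + (37 / 96) * ecc**4)
                   / (1 - ecc**2) ** 3.5)
    return (-(192 * math.pi / 5)
            * (pb_s / (2 * math.pi)) ** (-5 / 3)
            * enhancement
            * T_SUN ** (5 / 3) * m1 * m2 * (m1 + m2) ** (-1 / 3))

def periastron_shift(years, pb_s, pbdot):
    """Accumulated periastron-time shift (s) relative to a constant-period orbit."""
    t = years * 365.25 * 86400.0
    return 0.5 * pbdot * t**2 / pb_s

# PSR 1913+16-like inputs (period and eccentricity rounded; masses assumed).
pbdot = pb_dot_gr(27907.0, 0.617, 1.44, 1.39)
shift_after_18_yr = periastron_shift(18.0, 27907.0, pbdot)
```

Under these assumptions `pbdot` is about -2.4e-12 and the periastron passages arrive roughly 14 s early after 18 years, with the shift growing quadratically in time as in the parabola of Fig. 10.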
FIG. 11. Solid curves correspond to Eqs. (5)-(7) together with the measured values of $\dot\omega$, $\gamma$, and $\dot P_b$ (horizontal axis: pulsar mass in solar masses). Their intersection at a single point (within the experimental uncertainty of about 0.35% in $\dot P_b$) establishes the existence of gravitational waves. Dashed curves correspond to the predicted values of parameters $r$ and $s$; these quantities should become measurable with a modest improvement in data quality.

PSR 1534+12 was discovered just three years ago, in a survey by Aleksander Wolszczan (1991) that again used the huge Arecibo telescope to good advantage. This pulsar promises eventually to surpass the results now available from PSR 1913+16. It has orbital period $P_b = 10.1$ h, eccentricity $e \approx 0.27$, and mass function $f_1 \approx 0.31 M_\odot$. Moreover, with a stronger signal and narrower pulse than PSR 1913+16, its TOAs have considerably smaller measurement uncertainties, around 3 μs for five-minute


observations. Results based on 15 months of data (Taylor et al., 1992) have already produced significant measurements of four PK parameters: $\dot\omega$, $\gamma$, $r$, and $s$. In recent work not yet published, Wolszczan and I have measured the orbital decay rate, $\dot P_b$, and found it to be in accord with general relativity at about the 20% level. In fact, all measured parameters of the PSR 1534+12 system are consistent within general relativity, and it appears that when the full experimental analysis is complete, Einstein's theory will have passed three more very stringent tests under strong-field and radiative conditions.

I do not believe that general relativity necessarily contains the last valid words to be written about the nature of gravity. The theory is not, of course, a quantum theory, and at its most fundamental level the universe appears to obey quantum-mechanical rules. Nevertheless, our experiments with binary pulsars show that, whatever the precise directions of future theoretical work may be, the correct theory of gravity must make predictions that are asymptotically close to those of general relativity over a vast range of classical circumstances.

ACKNOWLEDGMENTS

Russell Hulse and I have many individuals to thank for their important work, both experimental and theoretical, without which our discovery of PSR 1913+16 could not have borne fruit so quickly or so fully. Most notable among these are Roger Blandford, Thibault Damour, Lee Fowler, Peter McCulloch, Joel Weisberg, and the skilled and dedicated technical staff of the Arecibo Observatory.

REFERENCES

Blandford, R., and S. A. Teukolsky, 1976, Arrival-time analysis for a pulsar in a binary system, Astrophys. J. 205, 580-591.
Burns, W. R., and B. G. Clark, 1969, Pulsar search techniques, Astron. Astrophys. 2, 280-287.
Camilo, F., 1994, Millisecond pulsar searches, in Lives of the Neutron Stars, NATO ASI Series, edited by A. Alpar (Kluwer, Dordrecht).
Damour, T., and N. Deruelle, 1985, General relativistic celestial mechanics of binary systems. I. The post-Newtonian motion, Ann. Inst. Henri Poincaré Phys. Théor. 43, 107-132.
Damour, T., and N. Deruelle, 1986, General relativistic celestial mechanics of binary systems. II. The post-Newtonian timing formula, Ann. Inst. Henri Poincaré Phys. Théor. 44, 263-292.
Damour, T., and J. H. Taylor, 1991, On the orbital period change of the binary pulsar PSR 1913+16, Astrophys. J. 366, 501-511.
Damour, T., and J. H. Taylor, 1992, Strong-field tests of relativistic gravity and binary pulsars, Phys. Rev. D 45, 1840-1868.
Epstein, R., 1977, The binary pulsar: Post Newtonian timing effects, Astrophys. J. 216, 92-100.
Gold, T., 1968, Rotating neutron stars as the origin of the pulsating radio sources, Nature 218, 731-732.
Goldreich, P., and W. H. Julian, 1969, Pulsar electrodynamics, Astrophys. J. 157, 869-880.
Haugan, M. P., 1985, Post-Newtonian arrival-time analysis for a pulsar in a binary system, Astrophys. J. 296, 1-12.
Hewish, A., S. J. Bell, J. D. H. Pilkington, P. F. Scott, and R. A. Collins, 1968, Observation of a rapidly pulsating radio source, Nature 217, 709-713.
Huguenin, G. R., J. H. Taylor, L. E. Goad, A. Hartai, G. S. F. Orsten, and A. K. Rodman, 1968, New pulsating radio source, Nature 219, 576.
Hulse, R. A., 1994, The discovery of the binary pulsar, in Les Prix Nobel (The Nobel Foundation).
Hulse, R. A., and J. H. Taylor, 1974, A high sensitivity pulsar survey, Astrophys. J. 191, L59-L61.
Hulse, R. A., and J. H. Taylor, 1975a, Discovery of a pulsar in a binary system, Astrophys. J. 195, L51-L53.
Hulse, R. A., and J. H. Taylor, 1975b, A deep sample of new pulsars and their spatial extent in the galaxy, Astrophys. J. 201, L55-L59.
Kaspi, V. M., J. H. Taylor, and M. Ryba, 1994, High-precision timing of millisecond pulsars. III. Long-term monitoring of PSRs B1855+09 and B1937+21, Astrophys. J. (in press).
Large, M. I., A. E. Vaughan, and B. Y. Mills, 1968, A pulsar supernova association, Nature 220, 340-341.
Manchester, R. N., and W. L. Peters, 1972, Pulsar parameters from timing observations, Astrophys. J. 173, 221-226.
Manchester, R. N., J. H. Taylor, and G. R. Huguenin, 1972, New and improved parameters for twenty-two pulsars, Nature Phys. Sci. 240, 74.
McCulloch, P. M., J. H. Taylor, and J. M. Weisberg, 1979, Tests of a new dispersion-removing radiometer on binary pulsar PSR 1913+16, Astrophys. J. 227, L133-L137.
Radhakrishnan, V., and D. J. Cooke, 1969, Magnetic poles and the polarization structure of pulsar radiation, Astrophys. Lett. 3, 225-229.
Rawley, L. A., J. H. Taylor, and M. M. Davis, 1988, Fundamental astrometry and millisecond pulsars, Astrophys. J. 326, 947-953.
Richards, D. W., and J. M. Comella, 1969, The period of pulsar NP 0532, Nature 222, 551-552.
Ryba, M. F., and J. H. Taylor, 1991, High precision timing of millisecond pulsars. I. Astrometry and masses of the PSR 1855+09 system, Astrophys. J. 371, 739-748.
Staelin, D. H., and E. C. Reifenstein, III, 1968, Pulsating radio sources near the Crab Nebula, Science 162, 1481-1483.
Stinebring, D. R., V. M. Kaspi, D. J. Nice, M. F. Ryba, J. H. Taylor, S. E. Thorsett, and T. H. Hankins, 1992, A flexible data acquisition system for timing pulsars, Rev. Sci. Instrum. 63, 3551-3555.
Taylor, J. H., 1972, A high sensitivity survey to detect new pulsars, research proposal submitted to the US National Science Foundation, September, 1972.
Taylor, J. H., 1974, A sensitive method for detecting dispersed radio emission, Astron. Astrophys. Suppl. Ser. 15, 367.
Taylor, J. H., 1991, Millisecond pulsars: Nature's most stable clocks, Proc. IEEE 79, 1054-1062.
Taylor, J. H., 1993, Testing relativistic gravity with binary and millisecond pulsars, in General Relativity and Gravitation 1992, edited by R. J. Gleiser, C. N. Kozameh, and O. M. Moreschi (Institute of Physics, Bristol), pp. 287-294.
Taylor, J. H., and R. J. Dewey, 1988, Improved parameters for four binary pulsars, Astrophys. J. 332, 770-776.


Taylor, J. H., L. A. Fowler, and P. M. McCulloch, 1979, Measurements of general relativistic effects in the binary pulsar PSR 1913+16, Nature 277, 431.
Taylor, J. H., R. A. Hulse, L. A. Fowler, G. W. Gullahorn, and J. M. Rankin, 1976, Further observations of the binary pulsar PSR 1913+16, Astrophys. J. 206, L53-L58.
Taylor, J. H., R. N. Manchester, and A. G. Lyne, 1993, Catalog of 558 pulsars, Astrophys. J. Suppl. Ser. 88, 529-568.
Taylor, J. H., and J. M. Weisberg, 1982, A new test of general relativity: Gravitational radiation and the binary pulsar PSR 1913+16, Astrophys. J. 253, 908-920.
Taylor, J. H., and J. M. Weisberg, 1989, Further experimental tests of relativistic gravity using the binary pulsar PSR 1913+16, Astrophys. J. 345, 434-450.
Taylor, J. H., A. Wolszczan, T. Damour, and J. M. Weisberg, 1992, Experimental constraints on strong-field relativistic gravity, Nature 355, 132-136.
Thorsett, S. E., Z. Arzoumanian, M. M. McKinnon, and J. H. Taylor, 1993, The masses of two binary neutron star systems, Astrophys. J. 405, L29-L32.
Wolszczan, A., 1991, A nearby 37.9 ms radio pulsar in a relativistic binary system, Nature 350, 688-690.



Chapter 10

Other Perspectives*

*T. T. Wu, C. N. Yang, R. L. Mills, H. L. O'Raifeartaigh, N. Straumann,


P. M. Ho, S. Weinberg, P. J. E. Peebles and B. Ratra,

PHYSICAL REVIEW D VOLUME 12, NUMBER 12 15 DECEMBER 1975

Concept of nonintegrable phase factors and global formulation of gauge fields

Tai Tsun Wu*
Gordon McKay Laboratory, Harvard University, Cambridge, Massachusetts 02138

Chen Ning Yang†
Institute for Theoretical Physics, State University of New York, Stony Brook, New York 11794
(Received 8 September 1975)

Through an examination of the Bohm-Aharonov experiment an intrinsic and complete description of electromagnetism in a space-time region is formulated in terms of a nonintegrable phase factor. This concept, in its global ramifications, is studied through an examination of Dirac's magnetic monopole field. Generalizations to non-Abelian groups are carried out, and result in identification with the mathematical concept of connections on principal fiber bundles.

I. MOTIVATION AND INTRODUCTION

The concept of the electromagnetic field was conceived by Faraday and Maxwell to describe electromagnetic effects in a space-time region. According to this concept, the field strength $f_{\mu\nu}$ describes electromagnetism. It was later realized,¹ however, that $f_{\mu\nu}$ by itself does not, in quantum theory, completely describe all electromagnetic effects on the wave function of the electron. The famous Bohm-Aharonov experiment, first beautifully performed by Chambers,² showed that in a multiply connected region where $f_{\mu\nu} = 0$ everywhere there are physical experiments for which the outcome depends on the loop integral

$$\frac{e}{\hbar c}\oint A_\mu\,dx^\mu \tag{1}$$

around an unshrinkable loop. This raises the question of what constitutes an intrinsic and complete description of electromagnetism. In the present paper we wish to discuss this question and also its generalization to non-Abelian gauge fields.

An examination of the Bohm-Aharonov experiment indicates that in fact only the phase factor

$$\exp\left(\frac{ie}{\hbar c}\oint A_\mu\,dx^\mu\right), \tag{2}$$

and not the phase (1), is physically meaningful. In other words, the phase (1) contains more information than the phase factor (2). But the additional information is not measurable. This simple point, probably implicitly recognized by many authors, is discussed in Sec. II. It leads to the concept of nonintegrable (i.e., path-dependent) phase factor as the basis of a description of electromagnetism.

This concept has been taken³ as the basis of the definition of a gauge field. The discussions in Ref. 3, however, centered only on the local properties of gauge fields. To extend the concept to global problems we analyze in Sec. III the field produced by a magnetic monopole. We demonstrate how the quantization of the pole strength, a striking result due to Dirac,⁴ is understood in this concept of electromagnetism. The demonstration is closely related to that in the original Dirac paper. Dirac discussed the phase factor of the wave function of an electron (which, among other things, depends on the electron energy). Our emphasis is on the nonintegrable electromagnetic phase factor (which does not depend on such quantities as the energy of the electron).

The monopole discussion leads to the recognition that in general the phase factor (and indeed the vector potential $A_\mu$), can only be properly defined in each of many overlapping regions of space-time. In the overlap of any two regions there exists a gauge transformation relating the phase factors defined for the two regions. This discussion is made more precise in Sec. IV. It leads to the definition of global gauges and global gauge transformations.

In Sec. V generalizations to non-Abelian gauge groups are made. The special cases of SU$_2$ and SO$_3$ gauge fields are discussed in Secs. VI and VII. A surprising result is that the monopole types are quite different for SU$_2$ and SO$_3$ gauge fields and for electromagnetism.

The mathematics of these results is in fact well known to the mathematicians in fiber bundle theory. An identification table of terminologies is given in Sec. V. We should emphasize that our interest in this paper does not lie in the beautiful, deep, and general mathematical development in fiber bundle theory. Rather we are concerned with the necessary concepts to describe the physics of gauge theories. It is remarkable that these concepts have already been intensively studied as mathematical constructs.

Section VIII discusses a gedanken generalized Bohm-Aharonov experiment for SU$_2$ gauge fields.

12 3845

3846 TAI TSUN WU AND CHEN NING YANG 12

Unfortunately, the experiment is not feasible unless the mass of the gauge particle vanishes. In the last section we make several remarks.
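
The distinction between a phase and a phase factor can be illustrated numerically. The sketch below works in assumed units where hc/e = 1 (so that ħc/e = 1/2π) and takes the phase factor for a loop enclosing magnetic flux Ω to be exp(ieΩ/ħc). The function names and the flux values are hypothetical.

```python
import cmath
import math

FLUX_QUANTUM = 1.0                             # hc/e in the assumed units
HBAR_C_OVER_E = FLUX_QUANTUM / (2 * math.pi)   # hbar*c/e in the same units

def phase(flux):
    """Phase (e / hbar c) * flux accumulated around a loop enclosing `flux`."""
    return flux / HBAR_C_OVER_E

def phase_factor(flux):
    """Phase factor exp(i * phase), the quantity argued to be measurable."""
    return cmath.exp(1j * phase(flux))

flux_a = 0.37
flux_b = flux_a + 3 * FLUX_QUANTUM     # differs by an integer multiple of hc/e
flux_c = flux_a + 0.5 * FLUX_QUANTUM   # differs by half of hc/e

phases_differ = phase(flux_a) != phase(flux_b)
same_factor = abs(phase_factor(flux_a) - phase_factor(flux_b)) < 1e-12
different_factor = abs(phase_factor(flux_a) - phase_factor(flux_c)) > 1.0
```

Here `flux_a` and `flux_b` give different phases but the same phase factor, so under the argument of the text they would be indistinguishable outside the flux region, whereas `flux_c` gives a different phase factor and hence a different fringe pattern.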
II. DESCRIPTION OF ELECTROMAGNETISM

The Bohm-Aharonov experiment explores the electromagnetic effect on an electron beam (Fig. 1) in a doubly connected region where the electromagnetic field is zero. As predicted by Aharonov and Bohm, the fringe shift is dependent on the phase factor (2), which is equal to

$$\exp\left(\frac{ie}{\hbar c}\,\Omega\right),$$

where $\Omega$ is the magnetic flux in the cylinder. Thus two cases a and b for which

$$\Omega_a - \Omega_b = \text{integer} \times (hc/e) \tag{3}$$

give the same interference fringes in the experiment. This we shall state and prove as follows.

FIG. 1. Bohm-Aharonov experiment (Refs. 1, 2). A magnetic flux is in the cylinder. Outside of the cylinder the field strength $f_{\mu\nu} = 0$. (Labels in the figure: electron beam, cylinder, interference plane.)

Theorem 1: If (3) is satisfied, no experiment outside of the cylinder can differentiate between cases a and b.

Consider first an electron outside of the cylinder. We look for a gauge transformation on the electron wave function $\psi_a$ and the vector potential $(A_\mu)_a$ for case a, which changes them into the corresponding quantities for case b, i.e. we try to find $S = e^{-i\alpha}$ such that

$$S = S_{ab} = (S_{ba})^{-1}, \qquad \psi_b = S^{-1}\psi_a, \tag{4}$$

$$(A_\mu)_b = (A_\mu)_a + \frac{\hbar c}{e}\frac{\partial\alpha}{\partial x^\mu}. \tag{5}$$

For this gauge transformation to be definable, $S$ must be single-valued, but $\alpha$ itself need not be. Now $(A_\mu)_b - (A_\mu)_a$ is curlless; hence (5) can always be solved for $\alpha$. But it is multiple-valued with an increment of

$$\Delta\alpha = \frac{e}{\hbar c}\oint\left[(A_\mu)_b - (A_\mu)_a\right]dx^\mu = \frac{e}{\hbar c}(\Omega_b - \Omega_a) \tag{6}$$

every time one goes around the cylinder. If (3) is satisfied, $\Delta\alpha = 2\pi \times$ integer and $S$ is single-valued. Case a and case b outside of the cylinder are then gauge-transformable into each other, and no physically observable effects would differentiate them. The same argument obviously holds if one studies the wave function of an interacting system of particles provided the charges of the particles are all integral multiples of $e$. Thus we have shown the validity of Theorem 1.

We conclude: (a) The field strength $f_{\mu\nu}$ underdescribes electromagnetism, i.e., different physical situations in a region may have the same $f_{\mu\nu}$. (b) The phase (1) overdescribes electromagnetism, i.e., different phases in a region may describe the same physical situation. What provides a complete description that is neither too much nor too little is the phase factor (2).

Expression (2) is less easy to use (especially when one makes generalizations to non-Abelian groups) as a fundamental concept than the concept of a phase factor for any path from $P$ to $Q$

$$\Phi_{QP} = \exp\left(\frac{ie}{\hbar c}\int_P^Q A_\mu\,dx^\mu\right), \tag{7}$$

provided that an arbitrary gauge transformation

$$\exp\left(\frac{ie}{\hbar c}\int_P^Q A_\mu\,dx^\mu\right) \to \exp(i\alpha(Q))\,\exp\left(\frac{ie}{\hbar c}\int_P^Q A_\mu\,dx^\mu\right)\exp(-i\alpha(P)) \tag{8}$$

does not change the prediction of the outcome of any physical measurements. Following Ref. 3, we shall call the phase factor (7) a nonintegrable (i.e., path-dependent) phase factor.

Electromagnetism is thus the gauge-invariant manifestation of a nonintegrable phase factor. We shall develop this theme further in the next section.

III. FIELD DUE TO A MAGNETIC MONOPOLE

The definition of a nonintegrable phase factor (7) in a general case may present problems. To illustrate the problem, let us study the magnetic monopole field of Dirac.⁴ Consider a static magnetic monopole of strength $g \neq 0$ at the origin $\vec r = 0$ and take the region $R$ of space-time under consideration to be all space-time minus the origin $\vec r = 0$. We shall now show the following:
12 CONCEPT OF NONINTEGRABLE PHASE FACTORS AND ... 3847

Theorem 2: There does not exist a singuiarity- The gauge transformation i n the overlap of the two
f r e e A ,, over all R . regions i s
Lf a singularity-free A , , does exist throughout R ,
consider the loop integral $ A,,&' f o r time t =0
around a circle at fixed spherical coordinates
S = S o b=exp(-i a)= exp ?Z@).
-
-
r and 0 with azimuthal angle 4 = O 2 R . This in- This i s a n allowed gauge transformation if and
tegral, denoted by n ( ~0), f o r r>O, is equal to the only if S is single-valued, i.e.
magnetic flux through a cap bounded by the loop,
or more explicitly n(r,0) = 2ng(l - cos0). At 0 = 0, *=integer =D ,
RC
O ( r ,0) = 0. Increasing 0 leads to a continuous in-
crease in s2 till one approaches 0 = n , a t which which i s D i r a c ' s quantization. With (13) we have
O ( 7 , n) = 4ng. (9)
But a t 0 = n the loop shrinks to a point. Therefore To define the phase factor for a path we r e f e r to
n(r,n) = 0 since A ,has no singularity. We have Fig. 2 , where a point in the overlapping region,
thus reached a contradiction and Theorem 2 is such a s point P, is regarded as two points P, and
proved. P,. If a path is entirely within region a o r b , we de-
With an A, which has singularities, the nonin- fine 0 along the path by (7) with (Au).o r ( A , ) , in
tegrable phase factor becomes undefined if the path the integrand in the exponent. If the path Q P is -
goes through a singularity. This difficulty m u s t be entirely within the overlapping region we have
resolved in order to use a nonintegrable phase then two possible phas.e factors and aOhpb.
factor as a fundamental concept to describe elec- It is easy to prove that
tromagnetism. It can be resolved in the following
way. Let us seek to divide R into two overlapping
regions R , and R, and to define (A,), and (Au),, i.e.,
each singularity-free in their respective regions,
so that (i) their curls a r e equal to the magnetic
field and (ii) in the cwrlapping region (A,,), and which merely states that (Au),and (A,,)b a r e related
(Au),are related by gauge transformation. One by a gauge transformation with the transformation
possible choice is to take the regions to be factor (12).
F o r a path that c r i s s c r o s s e s in and out of the
R,: O S B < n / 2 + 6 O<T, 05@<2n, allt
(10) overlapping region, such as A - B - - -
C D E in
R,: n / 2 - 6 < 0 S n O<T, 05$<2r, allt Fig. 2, the definition of 0 is
with a n overlap extending throughout n/2 - 6< 0 *P,DCflA= aP,D~S,(D)a,b,flbSb,(B)afl~A ' (15)
< n/2 + 6. (We assume 0 < 6 5 n/2.) Take
Notice that fixing the path but sliding the points
B and D along i t does not change aEDcflA [because
of formulas like (14')] s o long as B and D remain
in the overlapping region.
The phase factor so defined satisfies the group
proper@, eag'J
@EDCBA= @ED=*D~CBA

= %?Db'DbCflA
=*EDc@caA etc. (16)
The relationship between the electromagnetic field
and the phase factor around a loop is the s a m e as
r 7
usual. One only has to be careful that if the start-
ing and terminating point A is in the overlapping
region, the phase factor is taken to be aAaflA,
=aAbBAb, and not +A,BAb o r The phase
factor around the loop is then equal to

FIG. 2. Schematic diagram illustrating the relationship between $R_a$ and $R_b$.

where $\Omega$ is the magnetic flux through a cap bordered by the loop. Notice that because of Dirac's quantization condition, the phase factor is the same whichever way one chooses the cap provided it does not pass through the point $\vec r = 0$ (any $t$).

We have satisfactorily resolved the difficulty mentioned at the beginning of this section, provided Dirac's quantization condition (13) is satisfied. We shall now prove the following.

Theorem 3: If (13) is not satisfied (the above method of resolving the difficulty would not work since) there exists no division of $R$ into overlapping regions $R_a, R_b, R_c, \ldots$ so that condition (i) and (ii) stated above, properly generalized to the case of more than two regions, would hold.

To prove this statement, observe that if such a division is possible, one could generalize (15) and arrive at a satisfactory definition of the phase factor. The phase factor around a loop is then a continuous function of the loop. Take the loop to be a parallel on the sphere $r$ fixed, $t = 0$, $\theta$ fixed, $\phi = 0 \to 2\pi$. The phase factor defined by the generalization of (15) is equal to

$$\exp\left[\frac{ie}{\hbar c}\,2\pi g(1-\cos\theta)\right].$$

This is not equal to unity when $\theta = \pi$, since (13) is assumed to be invalid. Thus we have a contradiction.

Theorem 3 shows that if Dirac's quantization condition (13) is not satisfied, then the field of a magnetic monopole of strength $g$ cannot be taken as a realizable physical situation in $R$. (Of course, if one excludes the half-line $x = y = 0$, $z < 0$, or any half-line starting from $\vec r = 0$ leading to infinity, then it is possible to have any value for $g$.) This conclusion is the same as Dirac's, but viewed from a somewhat different point of emphasis.

IV. GENERAL DEFINITION OF GAUGE AND GLOBAL GAUGE TRANSFORMATION

Assuming that (13) holds, to round out our concept of a nonintegrable phase factor the question of the flexibility in the choice of the overlapping regions and the flexibility in the choice of $A_\mu$ in the regions must be faced. Both of these questions are related to gauge transformations.

Consider a gauge transformation $\xi$ in $R_b$ ($\xi$ will be assumed to be many times differentiable, but not necessarily analytic), resulting in a new potential $(A_\mu)_b'$. We shall illustrate schematically the transformation by elevating the region b in Figure 3(a).

One could extend the region b. One could also contract it, provided the whole $R$ remains covered.

One could create a new region by considering a subregion of b as an additional region $R_c$ [Figure 3(b)], and define the gauge transformation connecting them as the identity transformation so that $(A_\mu)_b = (A_\mu)_c$. One can then elevate $R_c$ and contract $R_b$, which results in Fig. 3(c).

Through operations of the kind mentioned in the last three paragraphs, which we shall call distortions, we arrive at a large number of possibilities, each with a particular choice of overlapping regions and with a particular choice of gauge transformation from the original $(A_\mu)_a$ or $(A_\mu)_b$ to the new $A_\mu$ in each region. Each of such possibilities will be called a gauge (or global gauge). This definition is a natural generalization of the usual concept, extended to deal with the intricacies of the field of a magnetic monopole.

For each choice of gauge there is a definition of a nonintegrable phase factor for every path. The group condition $\Phi_{C_a B A_a} = \Phi_{C_a B_b}\Phi_{B_b A_a}$ is always satisfied.

Notice that the original gauge we started with was characterized by (a) specifying [in (10)] the regions [$R_a$ and $R_b$] and (b) specifying the gauge transformation factor (12) in the overlap (between $R_a$ and $R_b$). It does not refer to any specific $A_\mu$. [A distortion may of course lead to no changes in characterizations (a) and (b). Thus two different gauges may share the same characterizations (a) and (b).] In the case of the monopole field, we had chosen the vector potential to be given by (11). But, in fact, we can attach to this gauge any $(A_\mu)_a$ and $(A_\mu)_b$ provided they are gauge-transformed into each other by (12) in the region of overlap. (The resultant $f_{\mu\nu}$ is, of course, not a monopole field in general.) Thus a gauge is a concept not tied to any specific vector potential. We shall call the process of distortion leading from one gauge to another a global gauge transformation. It is also a concept not tied to any specific vector potential. It is a natural generalization of the usual gauge transformation.

FIG. 3. Distortions allowed in gauge transformation.

The collection of gauges that can be globally gauge-transformed into each other will be said to
belong to the same gauge type.

The phase factor around a loop starts and ends at the same point in the same region. Thus it does not change under any global gauge transformation, i.e., we have, for Abelian gauge fields, the following.

Theorem 4a: The phase factor around any loop is invariant under a global gauge transformation.

It follows trivially from this, by taking an infinitesimal loop, that

Theorem 5a: The field strength $f_{\mu\nu}$ is invariant under a global gauge transformation.

For a given value of $D$, the gauge defined by (10) and (12) will be denoted by $\mathcal{G}_D$. For $D \neq D'$, the relationship, or rather the lack of relationship, between $\mathcal{G}_D$ and $\mathcal{G}_{D'}$ is shown by Theorem 6.

Theorem 6: For $D \neq D'$, $\mathcal{G}_D$ and $\mathcal{G}_{D'}$ are not related by a global gauge transformation, i.e., they are not of the same gauge type.

To prove this theorem we use Theorem 7.

Theorem 7: Between two gauge fields defined on the same gauge there exists a continuous interpolating gauge field defined on the same gauge.

To prove Theorem 7, we simply make a linear interpolation between the two original gauge fields which we shall denote by $(A_\mu)^{(\alpha)}$ and $(A_\mu)^{(\beta)}$:

$$A_\mu^{(t)} = t\,(A_\mu)^{(\alpha)} + (1 - t)\,(A_\mu)^{(\beta)}, \qquad 0 \le t \le 1. \tag{18}$$

In an overlap between regions $a$ and $b$ this interpolating vector potential assumes values $(A_\mu^{(t)})_a$ and $(A_\mu^{(t)})_b$ which are related by the proper gauge transformation belonging to this overlap. Thus we have proved Theorem 7.

Now go back to Theorem 6 and assume it to be invalid. Then we can gauge-transform the vector potential belonging to the monopole of strength $D'\hbar c/2e$ to the gauge $\mathcal{G}_D$. For this gauge we have then two monopole fields of different pole strengths. Using Theorem 7 we interpolate between them and obtain unquantized magnetic monopoles, which contradict Theorem 3.

Notice that although in this proof of Theorem 6 we have used two specific gauge fields, the theorem itself does not refer to any specific gauge fields at all.

By the same argument as used in the proof of Theorem 7, any gauge field defined on $\mathcal{G}_D$ must have a magnetic monopole of strength $D\hbar c/2e$ at the excluded point $\vec{r} = 0$, in addition to possible fields produced by electric charges and currents. Thus the total magnetic flux around the origin $\vec{r} = 0$ is equal to $(2\pi\hbar c/e)D$ for any gauge field defined on $\mathcal{G}_D$. We shall state this as a theorem and give another proof of it.

Theorem 8: Consider gauge $\mathcal{G}_D$ and define any gauge field on it. The total magnetic flux through a sphere around the origin $\vec{r} = 0$ is independent of the gauge field and only depends on the gauge:

$$\iint f_{\mu\nu}\,dx^\mu dx^\nu = -\frac{i\hbar c}{e}\oint \frac{\partial}{\partial x^\mu}(\ln S_{ab})\,dx^\mu, \tag{19}$$

where $S$ is the gauge transformation defined by (12) for the gauge $\mathcal{G}_D$ in question, and the integral is taken around any loop around the origin $\vec{r} = 0$ in the overlap between $R_a$ and $R_b$, such as the equator on a sphere $r = 1$.

To prove this theorem we observe that the flux through the upper half of the sphere $r = 1$ is equal to the following integral around the equator:

[...] (20a)

The flux through the lower half is equal to a similar integral around the equator:

[...] (20b)

Hence

[...] (21)

which completes the proof. Using (13) and (12), the right-hand side of (21) is equal to $4\pi g$, as expected.

If one starts with any gauge which is of the same gauge type as $\mathcal{G}_D$, and makes a global gauge transformation on it, the total flux is not changed by Theorem 5a. Thus (19), which depends only on the gauge, is in fact the same for all gauges of the same type. Notice that if there are more regions in a gauge than two, (19) should be replaced by a sum of line integrals along paths that are in the various overlaps between the regions. For a case of three regions there are three paths, which are illustrated in Fig. 4. Along each path the integral is of the form (19) with $S$ denoting the gauge transformation factor, such as (12), between the two regions containing the path. To prove Theorem 8 in this case one need only add three loop integrals together, each of the form of (20a) and (20b), and notice that along each path the integrand is always the difference of the vector potential $A_\mu$ between two regions, very much as in (21).

FIG. 4. Case of three regions for Theorem 8. The three paths from $P$ to $Q$ are in the three overlapping regions between $(R_a, R_b)$, $(R_b, R_c)$, and $(R_c, R_a)$.

The first proof we gave above of Theorem 8 is easy and is obvious to a physicist. The second proof is more involved but is more intrinsic. The theorem is a special case of the Chern-Weil theorem which evolved from the famous Gauss-Bonnet-Allendoerfer-Weil-Chern theorem, a seminal development in contemporary mathematics. We want to emphasize two consequences of the theorem. (i) The right-hand side of (19) is independent of the gauge field, and only depends on the gauge type. (ii) The right-hand side of (19) has as integrand the gradient of $\ln S$. Since $S$ is single-valued, the integral must be equal to an integral multiple of a constant (in this case $2\pi i$). A remarkable fact is that these consequences remain valid in the general mathematical theorem, which is very deep.

V. GENERALIZATION TO NON-ABELIAN GAUGE FIELD

So far we have only considered electromagnetism and described it in terms of an Abelian gauge field that corresponds to the group $U_1$, or equivalently $SO_2$. On the basis of the discussions in the preceding section, the generalization to the non-Abelian case can be carried out without much difficulty. For a local region this has been done in Ref. 3. Extension to global considerations is our present focus of interest.

A gauge is defined by (a) a particular choice of overlapping regions and (b) a particular choice of single-valued gauge transformations $S_{ab}$ in the overlapping regions. The choice of gauge transformations clearly must satisfy the following two conditions.

(1) In the overlapping region $R_a \cap R_b$, the gauge transformations $S_{ab}$ from $a$ to $b$ and $S_{ba}$ from $b$ to $a$ are related by

$$S_{ab} S_{ba} = 1,$$

where 1 is the identity element of the gauge group.

(2) If three regions $R_a$, $R_b$, and $R_c$ overlap, then there are gauge transformations $S_{ab}$, $S_{ba}$, $S_{ac}$, $S_{ca}$, $S_{bc}$, $S_{cb}$ so that

$$S_{ab} S_{bc} S_{ca} = 1, \ \text{etc.}$$

in $R_a \cap R_b \cap R_c$.

As in the case of electromagnetism, both the concept of a gauge and the concept of a global gauge transformation are not tied to any specific gauge potentials, denoted in general by $b_\mu^k$.

The nonintegrable phase factor for a given path is now an element of the gauge group. We shall still call it a phase factor. Since these phase factors do not in general commute with each other, Theorems 4a and 5a for the Abelian case need to be modified as follows.

Theorem 4: Under a global gauge transformation, the phase factor around any loop remains in the same class. The class does not depend on which point is taken as the starting point around the loop.

Theorem 5: The field strength $f_{\mu\nu}^k$ is covariant under a global gauge transformation.

Only Theorem 4 is not immediately transparent. For a loop $ABCA$, under a gauge transformation

$$\Phi_{ABCA} \to \Phi'_{ABCA} = S(A)\,\Phi_{ABCA}\,S^{-1}(A).$$

Thus $\Phi_{ABCA}$ and $\Phi'_{ABCA}$ are in the same class. Also around the same loop if we change the starting point from $A$ to $C$,

$$\Phi_{CABC} = \Phi_{CA}\,\Phi_{ABCA}\,\Phi_{AC}.$$

Hence changing the starting point does not change the class.

Theorem 4 defines the class of a loop. This concept is the generalization of the phase factor for electromagnetism around a loop with the magnetic flux as the exponent. It is a gauge-invariant concept.

These concepts have been extensively studied by the mathematicians in the framework of more general mathematical constructs. A translation of terminology is given in Table I.
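The content of Theorem 4 can be checked numerically on a small example. The sketch below is an illustration only and is not taken from the paper: it assumes the gauge group is $SU_2$, draws random phase factors for the three legs of a loop $ABCA$, composes them left to right along the path, and uses the trace as a label of the class (for $SU_2$ the trace fixes the conjugacy class). All function and variable names are hypothetical.

```python
import math
import random

def matmul(a, b):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate; for a unitary matrix this is its inverse."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(a):
    return a[0][0] + a[1][1]

def su2(theta, axis):
    """SU(2) element cos(theta/2) - i sin(theta/2) n.sigma for a unit axis n."""
    nx, ny, nz = axis
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[complex(c, -s * nz), complex(-s * ny, -s * nx)],
            [complex(s * ny, -s * nx), complex(c, s * nz)]]

rng = random.Random(0)

def random_su2():
    v = [rng.gauss(0, 1) for _ in range(3)]
    norm = math.sqrt(sum(x * x for x in v))
    return su2(rng.uniform(0, 2 * math.pi), [x / norm for x in v])

# Assumed phase factors on the legs A->B, B->C, C->A of the loop.
phi_ab, phi_bc, phi_ca = random_su2(), random_su2(), random_su2()
# Assumed gauge transformation values at the three points.
s_a, s_b, s_c = random_su2(), random_su2(), random_su2()

def transform(s_x, phi_xy, s_y):
    """Leg phase factor after the gauge transformation: S(X) phi S(Y)^-1."""
    return matmul(matmul(s_x, phi_xy), dagger(s_y))

loop_a = matmul(matmul(phi_ab, phi_bc), phi_ca)            # Phi_ABCA
loop_a_prime = matmul(matmul(transform(s_a, phi_ab, s_b),
                             transform(s_b, phi_bc, s_c)),
                      transform(s_c, phi_ca, s_a))         # Phi'_ABCA
loop_c = matmul(matmul(phi_ca, phi_ab), phi_bc)            # Phi_CABC

# The individual matrices differ, but the class label (trace) is shared.
print(trace(loop_a), trace(loop_a_prime), trace(loop_c))
```

The transformed loop equals $S(A)\,\Phi_{ABCA}\,S^{-1}(A)$ because the factors $S^{-1}(B)S(B)$ and $S^{-1}(C)S(C)$ cancel in the product, which is why the trace is unchanged.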



TABLE I. Translation of terminology.

Gauge field terminology                Bundle terminology

gauge (or global gauge)                principal coordinate bundle
gauge type                             principal fiber bundle
gauge potential $b_\mu^k$              connection on a principal fiber bundle
$S_{ba}$ (see Sec. V)                  transition function
phase factor $\Phi_{QP}$               parallel displacement
field strength $f_{\mu\nu}^k$          curvature
source (a) $J_\mu^k$                   ?
electromagnetism                       connection on a $U_1(1)$ bundle
isotopic spin gauge field              connection on a $SU_2$ bundle
Dirac's monopole quantization          classification of $U_1(1)$ bundle according to first Chern class
electromagnetism without monopole      connection on a trivial $U_1(1)$ bundle
electromagnetism with monopole         connection on a nontrivial $U_1(1)$ bundle

(a) I.e., electric source. This is the generalization (see Ref. 3) of the concept of electric charges and currents.
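The table entry linking Dirac's monopole quantization to the first Chern class, and consequence (ii) of Theorem 8, can be illustrated with a short numerical sketch. It is not part of the paper. It assumes that on the equator the transition function has the form $S(\phi) = e^{iD\phi}$ (the form that (12) takes once the quantization condition (13) is imposed; neither equation is reproduced in this section), and it works in units $\hbar = c = e = 1$, so that the right-hand side of (19) becomes $2\pi D$. The function names are hypothetical.

```python
import cmath
import math

def winding_number(transition, samples=2000):
    """Number of times transition(phi) circles the origin as phi runs 0 -> 2*pi.

    Accumulates the small phase increments between neighbouring samples,
    which is a discrete version of (1 / 2*pi*i) times the loop integral of d(ln S).
    """
    total = 0.0
    prev = transition(0.0)
    for k in range(1, samples + 1):
        cur = transition(2 * math.pi * k / samples)
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))

def monopole_transition(d):
    """Assumed transition function exp(i * d * phi) on the equator."""
    return lambda phi: cmath.exp(1j * d * phi)

def distorted_transition(d):
    """Same gauge type: a single-valued, smooth distortion of the phase."""
    return lambda phi: cmath.exp(1j * (d * phi + 0.3 * math.sin(phi)))

for d in (-2, 0, 1, 3):
    n = winding_number(monopole_transition(d))
    n_distorted = winding_number(distorted_transition(d))
    flux = 2 * math.pi * n  # total flux in units hbar = c = e = 1
    print(d, n, n_distorted, flux)
```

Because $S$ is single-valued, the accumulated phase is forced to be an integer multiple of $2\pi$, and a smooth single-valued distortion of the transition function leaves that integer unchanged.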

Chen Ning Yang and Tai Tsun Wu in Leiden (1984)


(Photo courtesy of Judy Wong)

Gauge fields
Robert Mills
Physics Department, The Ohio State University, Columbus, Ohio 43210
(Received 15 January 1987; accepted for publication 6 February 1989)
This article is a survey of the history and ideas of gauge theory. Described here are the gradual
emergence of symmetry as a driving force in the shaping of physical theory; the elevation of
Noether's theorem, relating symmetries to conservation laws, to a fundamental principle of
nature; and the force of the idea (the gauge principle) that the symmetries of nature, like the
interactions themselves, should be local in character. The fundamental role of gauge fields in
mediating the interactions of physics springs from Noether's theorem and the gauge principle in a
remarkably clean and elegant way, leaving, however, some tantalizing loose ends that might prove
to be the clue to a future deeper level of understanding. The example of the electromagnetic field
as the prototype gauge theory is discussed in some detail and serves as the basis for examining the
similarities and differences that emerge in generalizing to non-Abelian gauge theories. The article
concludes with a brief examination of the dream of total unification: all the forces of nature in a
single unified gauge theory, with the differences among the forces due to the specific way in which
the fundamental symmetries are broken in the local environment.

FOREWORD

This article appeared first in Chinese translation, in Ziran Zazhi (Nature Journal, published in Shanghai), August 1987 and, subsequently, in a shortened English version in the proceedings of a conference on the History of Modern Gauge Theories (July 1987, Utah State University, Logan, UT), edited by Arnold Rosenblum and published by Plenum Press. Both Ziran Zazhi and Plenum Press have kindly given their permission for its use in the American Journal of Physics.

I hope to be forgiven the absence of a bibliography. The discursive and descriptive nature of the article has not seemed to require references and it would be difficult to do justice to the voluminous literature on the subject without major changes in the character of the article.

I. INTRODUCTION: THE SWEEP OF CHANGE

The 20th century has been a time of tremendous ferment and change in our understanding of the world around us. It has not been simply a matter of discovering new phenomena or adding new features to existing theories, though indeed those things have happened to a remarkable extent: It goes much deeper than that. The laws of physics, as we try to state them, have a basic character that reflects our understanding of the nature of reality at the deepest philosophical level; it is this most basic character that has been repeatedly and violently upset just within the lifetime of many of us.

No one figure stands larger in 20th-century physics, or symbolizes more fully this revolution or, rather, this series of revolutions, in our understanding of nature than Albert Einstein (1879-1955). In devising the special and general theories of relativity, Einstein singlehandedly revised, radically and permanently, our conception of space and time. He also played a key role in the even deeper revolution represented by quantum theory, which shattered our deterministic view of physical laws (though Einstein himself could not accept this), and gave the observer a basic role in fixing physical reality whose significance we have still, in my view, not fully grasped. The welding of these two conceptual revolutions, of relativity and quantum theory, into the view of nature known as quantum field theory leads to yet another radical departure, not implied by either relativity or quantum theory alone, from our conventional understanding, involving in this case the very nature of elementary particles. These are no longer seen as primordial indestructible matter, but as field quanta, that is, the quantum excitations of different fundamental fields, in exact analogy to the way in which photons, the particlelike units of electromagnetic energy, are understood as quantum excitations of the electromagnetic field.

Finally, I want to draw attention to Einstein's dream of a unified geometrical picture of all the fundamental forces of nature, generalizing his most fruitful and most beautiful creation, the general theory of relativity. This theory sees gravity as the manifestation of a curvature of space and time by which the motion of all objects (whether planets or ping-pong balls) is seen not as the result of actual gravitational forces acting on the object, but as the object's attempt to follow the straightest possible path (the geodesic) among the bumps and hollows of space-time itself. Einstein spent most of his later years in a herculean, but vain, effort to extend this geometrical approach to include electromagnetic forces. It is ironic, therefore, that the last 30 years, starting, in fact, the year before Einstein died, have seen the development of a point of view, the principle of gauge invariance, that generalizes in an entirely unexpected way the geometrical character of Einstein's general theory of relativity and succeeds in providing a satisfactory description of all the forces of nature. It is not yet a unified theory (there are three independent gauge theories in the currently accepted picture), but most physicists feel that full unification is only a matter of time.

The gauge principle, which might also be described as a principle of local symmetry, is a statement about the invariance properties of physical laws. It requires that every continuous symmetry be a local symmetry, a phrase that is explained and discussed in detail later. This requirement involves a far greater degree of symmetry in nature than is normally conceived and the idea would probably never have been proposed if there were not familiar examples of local symmetry already known, namely, general relativity and electromagnetic theory. It is so severe a constraint that

there is only a very limited class of theories that can meet it and extremely little arbitrariness in the forms of interaction that are allowed.

It is the purpose of this article to explore this idea of gauge invariance: to tell something of the history of the idea, to give a survey of some of the physics needed to understand the principle, and to describe the logic of the gauge invariance principle and how it gives rise to very particular forms of physical theories. Three key concepts run through the discussion, distinct but tightly interrelated: symmetry, conservation laws, and gauge fields. We shall look first at how these interrelationships appear in the case of classical electromagnetic theory, which is the simplest example of a gauge field theory, and then examine how this provides the stimulus and basis for the generalization to what are now called non-Abelian gauge theories. I shall then give a sketch of the developments in more recent years: the resolution of a variety of formidable difficulties and the formulation of the Standard Model, the current widely accepted picture of the elementary particles and the gauge fields by which they interact.

While our present understanding of physics involves a unification of type in the sense that all the force fields of nature are seen to be of the same character, namely, gauge fields, I shall discuss, finally, the hope that most physicists share of a more complete unification: the hope of showing that all the forces are associated with a single gauge theory, with a single multicomponent gauge field whose different components are related to each other in some completely symmetrical way. This dream has already been partially realized in the Standard Model through the unification of the electromagnetic and weak interactions, but there still are major obstacles in the way of its final realization.

II. THE BEGINNINGS OF THE GAUGE IDEA

The key ideas leading up to the introduction of generalized gauge fields came from Noether, Weyl, and London. The underlying trend, of which gauge symmetry is a particular manifestation, is the growing realization in this century of the importance of symmetry to our basic understanding of the universe, to the point where it is now felt that it is the underlying symmetry of physical laws that drives the system, that determines the structure of the laws and the number and character of the elementary particles. This is a characteristically 20th-century development. Prior to this, symmetries were seen as accidental and if the theories of physics showed certain symmetrical structures, that was nice, but not of fundamental importance.

A. Noether

Emmy Noether (1882-1935) is regarded in mathematical circles as one of the most important mathematicians of this century, though not in fact for the work that has made her known to physicists. The theorem that bears her name, which has been the keystone in the development of symmetry as a guiding force in physics and for which (and for which alone) she is known in physics, is hardly known to mathematicians, who honor her for her work on commutative rings and algebraic number theory. In each case, though, she is noted for probing the underlying concepts upon which mathematical disciplines are based.

Noether's theorem, proved in 1918 while she was still (as a woman) striving for recognition at the University of Göttingen, relates in particular to variational principles as they apply to physics. The substance of the theorem, for our purposes, is that for every symmetry of nature there is a corresponding conservation law and for every conservation law there is a symmetry. In the Lagrangian formulation of a physical theory, the symmetry in question is a symmetry of the Lagrangian; since the form of the Lagrangian determines the equations of motion of the system being described, this means that it is a symmetry of those equations of motion, that is, of the physical theory itself.

Physics is characterized by a number of deeply significant conservation laws such as energy, linear and angular momentum, and electric charge, together with a number of others that have emerged in more recent years. Noether's theorem shows explicitly how these are related to the very structure (the symmetry, in fact) of physical laws. All theorems are proved on the basis of given hypotheses, and in the case of Noether's theorem, the most important assumption is that the equations of motion of physics are derivable from a variational principle, known as Hamilton's principle. Hamilton's principle states that for the true trajectory of a system (its history as a function of time) the time integral of the Lagrangian is stationary with respect to small changes of that trajectory away from its true shape. There is no obvious reason why the equations of motion should have this character. It is easy to devise a universe whose equations of motion do not satisfy any such variational principle, but in our universe they always do.

Hamilton's principle was first discovered in connection with mechanical systems, where the Lagrangian turns out to be the difference between the kinetic and potential energies, but the principle is easily extended to include velocity-dependent forces of certain types, including the magnetic force on a moving charged particle. (Hamilton's principle cannot be extended to include dissipative forces, but elementary forces are never dissipative.) Finally, it has turned out that systems of a completely nonmechanical nature, such as the electromagnetic field, also can be described in this way: Maxwell's equations, the equations of motion of the electromagnetic field, can also be derived from Hamilton's principle, though with a Lagrangian that seems to have nothing to do with such things as kinetic and potential energy. We have become so used to this state of affairs that when we are trying to devise a new theory we invariably look for the correct Lagrangian, assuming almost without question that the new theory must also obey Hamilton's principle. It always seems to work and, in consequence, we always have Noether's theorem also: the relation between symmetries and conservation laws.

When we make the transition to quantum theory, Noether's theorem is still true, although in most ways of doing quantum theory the proof looks very different. One such proof is outlined in Sec. III. It seems to me quite possible that Noether's theorem is the more fundamental fact: that the physical theories that we devise to describe the universe about us have the structure they do because of this fundamental relationship between symmetries and conservation laws. If this is so, then Noether's theorem becomes a principle rather than a theorem, like the principles of equivalence in special and general relativity; we should say then that classical physical laws take the Lagrangian form and quantum theory takes its characteristic Hamiltonian form as a consequence of Noether's principle.
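Hamilton's principle, as stated above, can be made concrete with a small numerical sketch. The example is an illustration and not part of the article: it assumes a particle of unit mass in uniform gravity, with Lagrangian equal to kinetic minus potential energy, a simple discretization of the time integral, and a sine-shaped perturbation that vanishes at both endpoints. All names and parameter values are hypothetical.

```python
import math

def action(path, dt, m=1.0, g=9.8):
    """Discretized time integral of L = m v^2 / 2 - m g x along a sampled path."""
    s = 0.0
    for x0, x1 in zip(path[:-1], path[1:]):
        v = (x1 - x0) / dt           # velocity on this time step
        x_mid = 0.5 * (x0 + x1)      # height at the middle of the step
        s += (0.5 * m * v * v - m * g * x_mid) * dt
    return s

def true_path(n, total_time, g=9.8):
    """Free-fall trajectory that starts and ends at x = 0."""
    dt = total_time / n
    v0 = 0.5 * g * total_time
    return [v0 * (k * dt) - 0.5 * g * (k * dt) ** 2 for k in range(n + 1)]

def perturbed(path, eps):
    """Add eps * sin(pi * t / T) to the path; the endpoints stay fixed."""
    n = len(path) - 1
    return [x + eps * math.sin(math.pi * k / n) for k, x in enumerate(path)]

n, total_time = 2000, 1.0
dt = total_time / n
base = true_path(n, total_time)
eps = 1e-3

# First-order change of the action under the perturbation (central difference).
first_variation = (action(perturbed(base, eps), dt)
                   - action(perturbed(base, -eps), dt)) / (2 * eps)
# What remains is of second order in eps.
second_order = action(perturbed(base, eps), dt) - action(base, dt)

print("first-order variation:", first_variation)
print("second-order change:  ", second_order)
```

The first-order variation vanishes up to rounding error, while the change that remains scales with the square of the perturbation size, which is what "stationary" means here.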
B. Weyl and London

The idea of gauge invariance also had its inception in 1918. Hermann Weyl (1885-1955), a friend of Noether's at Göttingen, had been deeply influenced by Einstein and shared his vision of seeing electromagnetism as a manifestation of some kind of local symmetry, similar to the local symmetry that characterizes the general theory of relativity. In the case of general relativity, the symmetry in question is an invariance of the form of the basic equations under arbitrary curvilinear coordinate transformations, corresponding to the physical requirement that physical laws appear the same to all observers regardless of the state of motion (accelerating, rotating, or whatever) of their reference frames.

The invariance that Weyl hoped to exploit was an invariance with respect to change of scale: the requirement that physical laws be the same if the scale of all length measurements is changed by the same overall factor. Weyl wanted to require a local gauge invariance in which the scale changes are allowed to be different at different points in space and time, analogous to the curvilinear coordinate transformations of general relativity. The associated conservation law, by Noether's theorem, was to be the conservation of electric charge.

Einstein pointed out serious flaws in the idea and it lay dormant until 1927, by which time Schrödinger had introduced his wave equation for quantum theory (in 1926) and complex wavefunctions were seen to play a role in physics. In 1927, Fritz London (1900-1954) pointed out that the symmetry associated with electric charge conservation was not a scale invariance, but a phase invariance, i.e., the invariance of quantum theory under an arbitrary change in the complex phase of the wavefunction (explained in detail in Sec. II). The invariance under a global phase change (multiplication of the wavefunction by a constant phase factor $e^{i\theta}$) was trivial in fact; the nontrivial fact was that the existence of the electromagnetic field allows a much broader kind of invariance, invariance under a local phase change, in which the phase factor varies arbitrarily from one point to another in space-time. That is, $\theta$ becomes an arbitrary function of $x$, $y$, $z$, and $t$, the coordinates of space and time. How this works is explained in Sec. III.

Weyl also played a part in this modification of his idea and continued to use the name "gauge symmetry" to describe it, although it was now a misnomer, since the word "gauge" historically refers to a choice of length scale, rather than to the assignment of complex phases.

C. Yang and Mills

For some 25 years the idea of gauge invariance (almost always thought of in terms of local gauge invariance) was seen as a specific characteristic of electromagnetic theory, useful in various ways (e.g., as a check on the validity of calculation procedures), but not of more fundamental significance. Local gauge invariance was felt to have additional implications within electromagnetic theory, such as zero mass for the photon (although it was hard to make a rigorous proof of this) and the constancy, among the different elementary particles, of the elementary unit of electric charge.

The idea that local gauge invariance might have a more universal significance in physics began to be considered in the early 1950s, particularly by C. N. Yang (Yang Chen Ning) (1922- ), who was then at the Institute for Advanced Study in Princeton, NJ. For some years prior to this, since the time when Yang was a graduate student at the Southwest Associated University in Kunming, China, he had been much impressed with the relationship between charge conservation and gauge invariance and, in particular, by the fact that the whole structure of electromagnetic theory would be uniquely determined by the sole requirement of gauge invariance. After coming to the United States in 1945, as a graduate student at the University of Chicago, Yang began the attempt to generalize the gauge invariance argument to other conservation laws, in particular the conservation of isospin. Many conservation laws of various sorts have appeared since then, but at that time the only conservation law that bore a useful similarity to electric charge was the conservation of isospin (usually referred to then as isotopic spin). Isospin was an imperfect conservation law, violated by electromagnetic and weak interactions, but apparently strictly true for strong interactions. One could easily imagine a world with only the strong interaction, where the conservation of isospin and the associated symmetries would be exactly valid. If the gauge invariance idea could be generalized, the result should be a complete theory of the strong interaction, with isospin as the charge responsible for the interactions and the newly invented gauge field as the glue playing the same role as the electromagnetic field in electrodynamics.

During the academic year 1953-1954, Yang was a visitor to Brookhaven National Laboratory, about 80 km east of New York City on Long Island. Here the Cosmotron, then the biggest particle accelerator in the world (accelerating protons to energies of 2-3 GeV) was just beginning to produce the abundance of new and unfamiliar particles that have transformed the face of physics in the years since. I was at Brookhaven also, on a postdoctoral appointment, and was assigned to the same office as Yang. (I was still belatedly writing my dissertation, the study of a possible contribution to the fourth-order Lamb shift, under the guidance of Norman Kroll at Columbia University in New York.) Yang, who has demonstrated on a number of occasions his generosity to young physicists beginning their careers, told me about his idea of generalizing gauge invariance and we discussed it at some length. Having some background in quantum electrodynamics, I was able to contribute something to the discussions, especially with regard to the quantization procedures, and to a small degree in working out the formalism; however, the key ideas were Yang's. The predicted quanta would have a spin of 1, like the photon, but would also have isospin of 1, like the pi meson: This means that they would form a charge triplet, with positive, negative, and neutral states, just like the pion. The question of renormalizability was far beyond us, as was the question of the mass of the gauge field quantum. These questions were not to be resolved for another 10-15 years, by which time there had been an order-of-magnitude increase in the sophistication of physicists' understanding of quantum field theory.

At about the same time (also in 1954), Ronald Shaw, a student of Abdus Salam at Cambridge University in England, was also thinking deeply about possible generalizations of the idea of gauge invariance, influenced in particular by lecture notes of Schwinger. Shaw's unpublished doctoral dissertation (1954) on "The Problem of Particle Types and Other Contributions to the Theory of Elementary

Particles" includes a section ("Invariance under General Isotopic Spin Transformations") that closely and independently parallels the argument of the 1954 paper of Yang and myself and duplicates the basic equations of non-Abelian gauge theory. There seems to be no question that the time was ripe for this development.

III. THE GAUGE PHILOSOPHY: LOCAL SYMMETRY

The idea at the core of gauge theory, as mentioned earlier, is the local symmetry principle: Every continuous symmetry of nature is a local symmetry. Let me now explain more fully what these words mean. First, what is a "symmetry of nature"? We say that an object is symmetrical if its appearance is unchanged by some transformation; we can say that its properties are "invariant" under the transformation. Look at some examples: An equilateral triangle looks the same if it is rotated by 120° or if it is reflected with respect to any of its altitudes, while a sphere has a much richer class of symmetries, being invariant under a rotation of any magnitude about any of its diameters or a reflection in any of its median planes. Another type of symmetry, called a translational symmetry, is illustrated by an infinitely long cylinder, which is invariant under any displacement parallel to its axis. The point, then, is that a symmetry necessarily involves some invariance property.

Now the world as we see it is not particularly symmetrical because the objects in it are irregular in shape and location. The rules of behavior of the physical world have a great deal of symmetry, however; this is what I mean by symmetries of nature. The fact that experiments work the same in China and in the US reflects an invariance of the laws of physics under spatial displacements and rotations, and hence a symmetry. The fact that the same experimental results are obtained at different times reveals the time displacement symmetry of those laws, etc.

The transformations we normally consider are global: a single rotation, for example, on the entire universe. Invariance under such a transformation is referred to as a "global symmetry." The meaning of a local symmetry, on the other hand, would be that the objects or physical laws in question are invariant under a "local transformation," which is in fact a large number of separate transformations with a different one at every point in space and time. As an example, consider a long circular cylinder, which is clearly invariant under rotations about its axis. Now imagine that

tromagnetic theory and general relativity, both of which, as we shall see, show just this kind of local symmetry.

In general relativity, the relevant local symmetry is an invariance under arbitrary curvilinear coordinate transformations, as if one were making a different space-time rotation at every point. This is a sort of four-dimensional analog of the sliced cylinder described above, with the difference that we are now talking about the invariance, not of an object, but of physical laws. It is well known that insisting on this general invariance leads inevitably to a complete theory of the gravitational force.

In the electromagnetic case, which I talk about in more detail in Sec. IV, the symmetry in question is an invariance under changes in phase of complex fields or wavefunctions,

$$\psi \to \psi\,e^{i\theta}, \tag{1}$$

and it becomes a local symmetry if $\theta$ is taken to be an arbitrary function of the space-time coordinates $(x, y, z, t)$. This may not seem to have anything to do with electromagnetism but, in fact, as we shall see, this local symmetry can be realized only if we introduce an additional field with all the familiar properties of the electromagnetic field.

The electromagnetic case was in fact the inspiration for the further development of the gauge field idea. The fact that examples of local symmetry were already known to exist in nature strongly suggested that this might be a general principle and that we should examine other observed symmetries of nature to see if the same thing happens in every case. In Sec. IV, I will show how the gauge principle works in detail, first for the electromagnetic case and then for the non-Abelian generalization.

IV. CONSERVED QUANTITIES, SYMMETRIES, AND GAUGE FIELDS

We now need to find out how the assumption of local symmetry can lead to a physical theory and how it can determine the character of that theory. We find that in every case there is a characteristic logical pattern that emerges, represented graphically in Fig. 1, connecting conserved quantities, symmetries of nature, and gauge fields. First, as we have seen, there is Noether's theorem, which states that for every conservation law there is an associated symmetry and vice versa; second, there is the fact, mentioned above, that the requirement of local symmetry leads to a gauge field theory of a particular well-determined character; and third, we find that the gauge field theory determined in this way necessarily includes interactions
the cylinder is sliced into a large number N , say, of very thin
between the gauge field and the conserved quantity with
disks. The system is now invariant under a transformation
which we started. Thus we have the astonishing fact that
in which every ring is rotated through a different angle and
the resulting symmetry, a local symmetry, is a much richer
symmetry than the original one. The original global sym-
metry is an invariance under a set of transformations de- Noethers
Theorem
scribed by a single parameter, the angle of rotation, while in
the local symmetry case, we have invariance under a much Symmetry
Quont i t y
larger set of transformations described by N different an-
gles of rotation.
We now apply this idea, not to objects, but to the laws of Local
physics, and ask that somehow they manage to have this Symmetry
much richer type ofsymmetry that we saw in the case of the Gauge
thinly sliced cylinder. Why would one think it possible that Field
the laws of physics should have such an extended symme-
try? It would seem extremely unlikely if it were not for the
fact that we already know of two examples in nature: elec- Fig. I . The logical pattern of a gauge theory
516

for every true conservation law there is a complete theory of a gauge field for which the given conserved quantity is the source. The only restriction is that the conservation law be associated with a continuous symmetry (this would exclude, for example, parity, which is associated with reflection symmetry). The resulting theory has just one free parameter, the interaction strength. We shall now discuss in detail the different links in this logical pattern.

A. Noether's theorem

Noether's theorem, connecting conservation laws with symmetry, was originally developed in 1918 as a theorem in the calculus of variations, with an immediate application to classical Lagrangian mechanics. This theorem takes a particularly elegant and general form, however, in relation to Hamiltonian mechanics, whether of the classical or quantum mechanical type. I shall present Noether's theorem in the quantum context, but it will easily be seen by those who know about such things that the entire discussion can be brought back to the classical domain by simply replacing commutators with Poisson brackets throughout.

The first thing to observe is that the Hermitian linear operators of quantum theory play a double role: Each operator represents, of course, one of the dynamical variables of the theory; it also serves, however, as the generator of a class of transformations. Suppose, for example, that $\hat{A}$ is the linear operator associated with the observable $A$, with the usual relationship that if the system is in the state represented by the state vector $\Psi$, then the expectation value of $A$ is given by

$$\langle A \rangle = \Psi^{*}\hat{A}\,\Psi, \tag{2}$$

which illustrates the relation of the operator $\hat{A}$ to the corresponding physical observable $A$. [Here, the caret indicates an operator and the asterisk indicates conjugation. Note that the left side of Eq. (2) represents a physical quantity, the mean value of a physical variable, while the right side is a mathematical expression.] On the other hand, the same operator can be used to generate a unitary transformation on the state vectors of the system:

$$\Psi \to \Psi' = e^{-i\lambda\hat{A}}\,\Psi, \tag{3}$$

where $\lambda$ is an arbitrary real parameter. Such a transformation on the state vector may represent a transformation on the physical state itself, so that, for example, if $\hat{A}$ is taken as $\hat{P}_z$, the $z$ component of the momentum operator, then $\Psi'$ represents the state that differs from $\Psi$ by a displacement $\hbar\lambda$ in the $z$ direction. Every continuous family of transformations is generated in this way and is thus associated uniquely with one of the physical variables of the theory.

Now, one of the well-known features of these linear operators is that they do not in general commute with each other (that is, multiplication of operators is not commutative) and whether or not two operators commute has physical significance. This significance takes different forms depending on which role each of the two operators is playing. Thus if two operators $\hat{A}$ and $\hat{B}$ commute, we can make four different statements about the variables and transformations associated with these operators:

(1) The transformations generated by $\hat{A}$ and $\hat{B}$ commute with each other. Thus rotations about different axes do not commute, while displacements in different directions do commute, that is, you get a different net rotation depending on the sequence in which the rotations are performed. This is directly associated with the fact that the different components of the angular momentum operator do not commute with each other, while the different components of the linear momentum operator do commute with each other.

(2) The dynamical variable $A$ is invariant under the transformations generated by $\hat{B}$. For example, the $x$ component of position of a particle is invariant under displacements in the $y$ direction (the operator $\hat{x}$ commutes with the operator $\hat{P}_y$), but is not invariant under displacements in the $x$ direction (the operator $\hat{x}$ does not commute with the operator $\hat{P}_x$). In mathematical form, if $\hat{A}$ represents the variable $A$ and $\hat{B}$ generates the transformation

$$\Psi' = e^{-i\lambda\hat{B}}\,\Psi, \tag{4}$$

then the expectation value of $A$ is unaltered by the transformation

$$\Psi'^{*}\hat{A}\,\Psi' = \Psi^{*}\hat{A}\,\Psi. \tag{5}$$

(3) The dynamical variable $B$ is invariant under the transformations generated by $\hat{A}$ [as in (2)].

(4) The dynamical variables $A$ and $B$ are simultaneously measurable with arbitrary precision; there is no uncertainty principle relating them. (Another way of saying this is that they have a complete set of simultaneous eigenstates, i.e., states for which both $A$ and $B$ have exactly predictable values.)

Any one of these four statements is true only if the corresponding operators commute, so we can say that they are equivalent: each implies the others. In particular, the equivalence of (2) and (3) says that the dynamical variable $A$ is invariant under the transformations generated by $\hat{B}$ if and only if the dynamical variable $B$ is invariant under the transformations generated by $\hat{A}$.

Now, our interest is in the case that $\hat{B}$, say, is taken to be the Hamiltonian operator $\hat{H}$, which is the operator associated with the total energy of the system. The transformations generated by $\hat{H}$ are time displacements, which is why $\hat{H}$ is the operator that appears in the Schrödinger equation,

$$i\hbar\,\frac{d\Psi}{dt} = \hat{H}\,\Psi, \tag{6}$$

whose integrated form,

$$\Psi(t) = e^{-i\hat{H}t/\hbar}\,\Psi(0), \tag{7}$$

corresponds to the general form (3). Thus we see why it is the energy operator that governs the dynamics of the system.

To say that a dynamical variable $A$ is invariant under the transformations generated by $\hat{H}$ is to say that $A$ is a constant of the motion, that is, its expectation value in any state will be invariant under any time displacement. On the other hand, the statement that $H$ is invariant under the transformations generated by $\hat{A}$ is to say that $\hat{A}$ represents a symmetry of the dynamical laws because it is $\hat{H}$ that determines those dynamical laws. We now see (and this is our modified version of Noether's theorem) that these two statements are equivalent since each is equivalent to the statement that the operators $\hat{A}$ and $\hat{H}$ commute:

The dynamical variable $A$ is conserved if and only if the dynamical laws are invariant under the transformations generated by $\hat{A}$.
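The equivalence between conservation and commutation with the Hamiltonian can be checked numerically on a toy system. What follows is a minimal Python sketch, not taken from the article. It assumes a two-level system in units with $\hbar = 1$, a diagonal Hamiltonian $H = \mathrm{diag}(1, -1)$, and the Pauli matrices $\sigma_z$ and $\sigma_x$ as stand-ins for two dynamical variables. All function and variable names are hypothetical and chosen only for illustration.

```python
import cmath

def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(X, Y):
    """Return [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]

def is_zero(M, tol=1e-12):
    """True if every entry of M vanishes within the tolerance."""
    return all(abs(x) < tol for row in M for x in row)

def expectation(A, psi):
    """Expectation value psi^dagger A psi, as in Eq. (2), for normalized psi."""
    n = len(psi)
    total = sum(psi[i].conjugate() * A[i][j] * psi[j]
                for i in range(n) for j in range(n))
    return total.real

def evolve(energies, psi0, t):
    """psi(t) = exp(-i H t) psi(0), as in Eq. (7), for H = diag(energies), hbar = 1."""
    return [cmath.exp(-1j * E * t) * c for E, c in zip(energies, psi0)]

# Assumed toy model: H is diagonal, so its eigenvalues are its diagonal entries.
energies = [1.0, -1.0]
H = [[1.0, 0.0], [0.0, -1.0]]
A = [[1.0, 0.0], [0.0, -1.0]]   # sigma_z: commutes with H
B = [[0.0, 1.0], [1.0, 0.0]]    # sigma_x: does not commute with H

psi0 = [0.6 + 0j, 0.8 + 0j]     # normalized initial state
times = [0.0, 0.3, 0.7, 1.1]

A_values = [expectation(A, evolve(energies, psi0, t)) for t in times]
B_values = [expectation(B, evolve(energies, psi0, t)) for t in times]

print("[A, H] = 0:", is_zero(commutator(A, H)), " <A>(t):", A_values)
print("[B, H] = 0:", is_zero(commutator(B, H)), " <B>(t):", B_values)
```

In this sketch the expectation value of the commuting variable stays fixed at every sampled time, while that of the non-commuting variable oscillates. This is the pattern the theorem describes, under the stated assumptions about the toy system.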

B. Local symmetry and gauge fields: The electromagnetic case

We now come to the next link in the diagram shown in Fig. 1, where we relate the assumption of local symmetry to the existence of gauge fields. Suppose we consider some particular conservation law, such as that of the electric charge $Q$, and identify the associated symmetry transformations generated by the Hermitian operator $\hat{Q}$. The action of this transformation on a complex field $\psi(x)$ associated with particles of charge $ne$ is to change its phase by an amount proportional to $n$:

$$\psi(x) \to \psi'(x) = e^{-ine\theta}\,\psi(x). \tag{8}$$

In what follows it is convenient to work with infinitesimal transformations for which Eq. (8) reduces, by Taylor expansion of the exponential, to

$$\psi'(x) = (1 - ine\theta)\,\psi(x). \tag{9}$$

Here, $x$ represents the space-time coordinates and $\theta$ is the arbitrary infinitesimal parameter of the transformation. You can think of $\psi(x)$ as essentially equivalent to the Schrödinger or Dirac wavefunction. The invariance comes from the fact that any quantity that is physically observable involves the factors $\psi^{*}$ and $\psi$, so that the phase factors cancel. If several different fields are involved, as in a production or decay process, then the conservation of charge in the process is exactly related to the cancellation of these phase factors. The cancellation works the same for derivatives of $\psi$ as long as $\theta$ is taken as a constant with respect to $x$.

Now, however, I want to introduce the idea of local symmetry, as discussed in Sec. III, that is, I want to make a different transformation at each different point in space-time, which corresponds exactly to allowing $\theta$ to be an arbitrary function $\theta(x)$:

$$\psi'(x) = e^{-ine\theta(x)}\,\psi(x) \tag{10}$$

$$\approx [1 - ine\theta(x)]\,\psi(x) \tag{11}$$

or

$$\delta\psi(x) \equiv \psi'(x) - \psi(x) \tag{12}$$

$$= -ine\theta(x)\,\psi(x). \tag{13}$$

The invariance of expressions such as $\psi^{*}\psi$ is easy to see, but the invariance is lost in expressions that involve derivatives of $\psi$ since derivatives of $\theta$ appear as well. Thus (letting $\partial_\mu$ represent the partial derivative with respect to $x^\mu$, $x^0 = ct$, $x^1 = x$, $x^2 = y$, $x^3 = z$) we find that $\partial_\mu\psi(x)$ transforms as

$$\delta(\partial_\mu\psi) = -ine\theta(x)\,\partial_\mu\psi - ine(\partial_\mu\theta)\,\psi, \tag{14}$$

with the first term of standard form [Eq. (13)] and an awkward second term involving the derivative of $\theta(x)$. In order to achieve local symmetry, we have to get rid of the second term in some way. To do this, we must replace the conventional derivative $\partial_\mu\psi$ by a "covariant" derivative $D_\mu\psi$ whose transformation law is required to be of the standard form,

$$\delta(D_\mu\psi) = -ine\theta(x)\,D_\mu\psi. \tag{15}$$

This will make expressions such as $\psi^{*}D_\mu\psi$, for instance, invariant under the transformation. How can we accomplish this? We simply introduce the electromagnetic potentials (as you may have anticipated if you are familiar with the standard form of the Schrödinger equation in the presence of the electromagnetic field), that is, we combine the conventional derivative $\partial_\mu$ with a vector field $A_\mu$,

$$D_\mu = \partial_\mu + ineA_\mu, \tag{16}$$

with the transformation law for $A_\mu$ chosen in such a way that Eq. (15) is satisfied. If you substitute expression (16) into Eq. (15), and use Eq. (14) for the variation of $\partial_\mu\psi(x)$, then the terms that are left, after removing the terms that cancel, give

$$-ine(\partial_\mu\theta)\,\psi + ine\,\delta A_\mu\,\psi = 0, \tag{17}$$

which is satisfied if $A_\mu$ satisfies the transformation law

$$\delta A_\mu(x) = \partial_\mu\theta(x), \tag{18}$$

the familiar gauge transformation for electromagnetic potentials.

These expressions can be rewritten in a more familiar notation: $A_0 = \phi$, the scalar potential, and $-A_i$ $(i = 1,2,3)$ are the components of the vector potential $\mathbf{A}$, so that Eq. (18) is equivalent to

$$\delta\phi = \frac{1}{c}\frac{\partial\theta}{\partial t}, \tag{19}$$

$$\delta\mathbf{A} = -\nabla\theta. \tag{20}$$

What have we proved? We started with a conservation law, the conservation of electric charge with its associated symmetry, and then (as if we had never heard of the electromagnetic field) we showed that the requirement that the symmetry be local forces us to introduce a gauge field, which turns out to be nothing but the familiar electromagnetic field. Before we turn to the more general kinds of gauge fields, let us continue with the example of the electromagnetic field and see how the gauge invariance condition fixes the form of the theory.

The standard procedure for setting up a physical theory is to start with a classical Lagrangian formulation and then go through a well-defined "quantization" procedure to generate the appropriate quantum mechanical theory. The classical equations of motion are generated from the Lagrangian by a variational principle (Hamilton's principle), which is discussed in Sec. II and explained in standard mechanics texts. Then the quantization procedure (a prescription for constructing the canonical momentum operators, the Hamiltonian operator, and the quantum mechanical equations of motion) guarantees that the resulting theory satisfies the important Correspondence Principle, which is the requirement that the theory be entirely consistent with the classical theory in the macroscopic domain, where the classical theory is known to be valid. We need only discuss the Lagrangian since the form of the theory is determined once that is known.

In the case of a field theory, the important quantity is the Lagrangian density $L$ (the Lagrangian itself is the space integral of $L$), which depends on the various fields and their partial derivatives and which must be a relativistic invariant in order that the resulting theory be Lorentz invariant. Indeed, the Lagrangian density must display all the symmetries that are required of the theory that it is to generate.

Thus we see that if we want to construct a dynamical theory of our new gauge field, we need to construct an appropriate Lagrangian density $L$ which is both Lorentz
and gauge invariant. If we add some fairly standard requirements about the physical reasonableness of the field equations, we find that $L$ is uniquely determined apart from trivial factors. If our gauge field interacts with just a single charged particle field $\psi$, then we expect to get two field equations, one for $\psi$ and one for the gauge field $A_\mu$. Now suppose that we already know the Lagrangian density for $\psi$ in the absence of the gauge field. It certainly involves derivatives of $\psi$, and to achieve gauge invariance we have found [Eq. (16)] that the ordinary partial derivative must be replaced by the gauge covariant derivative $D_\mu$. Thus $L$ now automatically includes an interaction term involving $A_\mu$ and $\psi$ that shows exactly how the charged particles behave in the presence of a given gauge field $A_\mu$. If $\psi$ is a Dirac field (for a relativistic spin-1/2 fermion), for example, then the appropriate terms in $L$ take the form

$$L_\psi = \bar{\psi}\,(i\gamma^\mu D_\mu - m)\,\psi, \tag{21}$$

where $\bar{\psi}$ is related to $\psi^{*}$, the conjugate of $\psi$. (The speed of light $c$ has been taken as equal to 1 and the repeated index $\mu$ is summed over by a standard summation convention.) In the classical limit, it can easily be shown that the interaction term in (21),

$$L_{\text{int}} = -ne\,\bar{\psi}\gamma^\mu\psi\,A_\mu, \tag{22}$$

corresponds exactly to the Lorentz force (the familiar electrostatic and magnetic forces) on the charged particle.

To get the dynamical equations for the gauge field, we need to add a term to $L$ that is again both Lorentz and gauge invariant and involves derivatives of $A_\mu$, so that the resulting equations for $A_\mu$ will have the character of field equations. We first look for gauge invariant expressions involving $A_\mu$ and find that the only one we can make is the combination

$$f_{\mu\nu} = \partial_\nu A_\mu - \partial_\mu A_\nu. \tag{23}$$

Since the variation of $A_\mu$ under a gauge transformation is simply the space-time gradient of $\theta(x)$ [Eq. (18)], the cross derivatives of $\theta(x)$ that appear in Eq. (23) cancel, leaving $f_{\mu\nu}$ invariant. The gauge invariant combination $f_{\mu\nu}$ is the familiar tensor representation of the electric and magnetic fields $\mathbf{E}$ and $\mathbf{H}$; the elements $f_{0i}$ $(i = 1,2,3)$ are the components of $\mathbf{E}$ and the elements $f_{jk}$ give the components of $\mathbf{H}$. Next, we need to construct a relativistic invariant (Lorentz scalar) out of $f_{\mu\nu}$ to serve as the Lagrangian density for the field; the only way to do this that produces physically reasonable field equations is to take the scalar product

$$L_A = -\tfrac{1}{4}\,f_{\mu\nu}f^{\mu\nu}, \tag{24}$$

where $\mu$ and $\nu$ are summed over by the summation convention mentioned above and the superscripts (contravariant indices) indicate an additional factor of $-1$ when either $\mu$ or $\nu$ has the value 1, 2, or 3. The factor $-\tfrac{1}{4}$ merely serves to set the scale of the electromagnetic field vectors.

When the standard procedure is followed for generating the classical field equations from the Lagrangian density (24), with the interaction term (22), the result is simply Maxwell's equations. The interaction term generates a source term for Maxwell's equations, which is found to be just the charge-current density

$$j_\mu = ne\,\bar{\psi}\gamma_\mu\psi. \tag{25}$$

In the covariant notation that we have been using, the field equations, which as I said are just Maxwell's equations, take the form

$$\partial_\nu f^{\mu\nu} = j^\mu. \tag{26}$$

One might ask whether there should be some kind of gauge-covariant derivative, such as $D_\mu$, in Eq. (26); the answer is that there is no difference in this case since the extra term in $D_\mu$ [Eq. (16)] is proportional to the charge of the relevant particle and the charge of the photon is zero.

We have now completed the pattern shown in Fig. 1. The requirement of local symmetry has not only generated a gauge field of uniquely determined structure, but has dictated almost uniquely the form of the interaction: the precise form of the forces on the charged particle and the precise way in which the electric charge-current density serves as the source for the gauge field.

C. Local symmetry and gauge fields: The general case

Now we are ready to look at the more general case, in which we start with an arbitrary conservation law and, by the same logic that we used above, develop the theory of a gauge field whose relationship to the conserved quantity will be the same as that of the electromagnetic field to the electric charge.

The most obvious candidate for the conserved quantity might seem to be energy or momentum, but these in fact turn out to be much more complicated. The associated local symmetry is the invariance under local coordinate transformations and the associated gauge field, as mentioned in Sec. III, is the gravitational field; however, there are a number of sources of confusion that make this a poor case to present here. I discuss this case in more detail in Sec. VI.

The more typical case of a non-Abelian gauge field is obtained when the symmetry involved is an internal symmetry, i.e., one that is not associated with any kind of coordinate transformation. The phase invariance associated with charge conservation is one example of an internal symmetry; there can be others of this type associated with other conserved particle numbers such as baryon or lepton number. These examples are exactly equivalent in form to the electromagnetic case, so that the result of the logic would be a gauge field identical in structure to the electromagnetic field, but coupled to the appropriate conserved particle number. This is not what happens.

The other kind of internal symmetry is associated with families of identical particles, such as isospin multiplets (if one can treat them as identical) or multiplets of colored quarks. The conserved quantities in this case are associated with the quantum numbers that label the members of the multiplets, together with certain operators that induce transitions from one member of a multiplet to another. There is thus a family of operators that on one hand correspond to the conserved dynamical variables, e.g., isospin or color and, on the other hand, correspond to a group of transformations, the symmetry group of the multiplets. The fact that these operators do not in general commute with each other is precisely what makes this case so very different from the electromagnetic case, as we shall see in what follows. In each case it is the mathematical structure of the symmetry group that determines the structure of the gauge field and the form of the interaction. The symmetry groups have names according to their structure, so that one
speaks of U(1), the simple phase symmetry group of electromagnetic theory or SU(2), the isospin symmetry group. [SU(2) is also the symmetry group of spatial rotations in quantum theory, which is why the language of isospin is so similar to the language of angular momentum, even though they have nothing to do with each other physically.]

I want to describe in qualitative terms what happens when we apply the gauge philosophy to one of these noncommuting (non-Abelian) symmetry groups. The historical example was the isospin case and indeed this is the simplest case to consider, though it should be emphasized that the same mathematical forms appear for any non-Abelian symmetry group, with only minor changes. I shall follow the example of the electromagnetic field step by step and try to clarify the differences.

First, the conserved electric charge operator $\hat{Q}$ is replaced by the family of operators $\hat{T}_\alpha$, which we take as the three components of isospin. The $z$ component $\hat{T}_3$ (which has nothing to do with the $z$ direction in real space) labels, by its eigenvalues, the members of an isospin multiplet, while the $x$ and $y$ components $\hat{T}_1$ and $\hat{T}_2$, which do not commute with $\hat{T}_3$, generate transformations that mix the different values of $T_3$. [In the case of color, the symmetry group is called SU(3) and there are eight operators $\hat{T}_\alpha$, two of which commute with each other so that their eigenvalues can serve to label the members of a color multiplet, a two-dimensional array.]

It is characteristic of all these symmetry groups that the set of operators $\hat{T}_\alpha$ is closed under the commutation relation, that is, the commutator is always just a linear combination of the same operators $\hat{T}_\alpha$. Thus we write

$$[\hat{T}_\alpha, \hat{T}_\beta] = i\,c_{\alpha\beta\gamma}\,\hat{T}_\gamma, \tag{27}$$

where the coefficients $c_{\alpha\beta\gamma}$, called the structure constants, are a set of numerical constants that completely characterize the local structure of the symmetry group; again, $\gamma$ is summed over. In the case of isospin, the coefficients take the values $0, \pm 1$ in such a way as to give the usual angular-momentum-type commutation relations

$$[\hat{T}_1, \hat{T}_2] = i\hat{T}_3, \ \text{etc.} \tag{28}$$

The complex field $\psi$ now has several components, depending on the size of the multiplet we want to describe. For the isospin case we can consider as an example just the nucleon doublet, consisting of the proton and neutron, so that

$$\psi = \begin{pmatrix} \psi_p \\ \psi_n \end{pmatrix}. \tag{29}$$

Equation (29) is the wavefunction for a single nucleon, which may be in the proton state, with amplitude $\psi_p$, or in the neutron state, with amplitude $\psi_n$. [The basic multiplet for color SU(3) would be a color triplet of quarks described by a three-component wavefunction $\psi$.]

The general transformation on the nucleon wavefunction $\psi$, generalizing Eq. (9), is

$$\psi \to \psi' = e^{-ig\boldsymbol{\theta}\cdot\mathbf{T}}\,\psi \tag{30}$$

$$= (1 - ig\boldsymbol{\theta}\cdot\mathbf{T})\,\psi, \tag{31}$$

where $\psi'$ represents a different mixture of the proton and neutron states than does $\psi$. The change in $\psi$ is given by

$$\delta\psi = -ig\boldsymbol{\theta}\cdot\mathbf{T}\,\psi. \tag{32}$$

There are now three arbitrary infinitesimal parameters $\theta_\alpha$ describing the transformation and the symbols $T_\alpha$ here represent $2 \times 2$ matrices which govern the mixing of the components $\psi_p$ and $\psi_n$ in the transformation. These matrices (proportional to the Pauli spin matrices) have the same commutation relations (27) as the operators $\hat{T}_\alpha$ and provide what is called a representation of the symmetry group. $\boldsymbol{\theta}$ and $\mathbf{T}$ are the three-component vectors whose components are the parameters $\theta_\alpha$ and the matrices $T_\alpha$.

Now, in exact parallel to the electromagnetic case, we try to impose the condition of local symmetry, which means that we allow $\boldsymbol{\theta}$ to be an arbitrary vector-valued function of $x$, so that

$$\delta\psi(x) = -ig\boldsymbol{\theta}(x)\cdot\mathbf{T}\,\psi(x), \tag{33}$$

in close analogy to Eq. (13). Again, we find that while terms not involving derivatives cause no problems, the variation of the derivative of $\psi$ has an additional term involving the derivative of $\boldsymbol{\theta}$, as in Eq. (14). To get rid of these we again need to replace the conventional derivative $\partial_\mu$ with a covariant derivative $D_\mu$, defined in terms of a new field whose transformation rule can have a term proportional to $\partial_\mu\boldsymbol{\theta}(x)$. The correct form for $D_\mu\psi$ is given by

$$D_\mu\psi = (\partial_\mu + ig\mathbf{B}_\mu\cdot\mathbf{T})\,\psi, \tag{34}$$

with a new field $\mathbf{B}_\mu(x)$ that is a vector with respect to Lorentz transformations and also a vector with respect to isospin rotations. It therefore has 12 components in all. When the condition of covariance is imposed on $D_\mu\psi$, namely,

$$\delta(D_\mu\psi) = -ig\boldsymbol{\theta}\cdot\mathbf{T}\,(D_\mu\psi) \tag{35}$$

[like $\psi$ itself, Eq. (33)], then we find that the transformation rule for $\mathbf{B}_\mu$ is completely determined. Because of the fact that the matrices $T_\alpha$ do not commute, we find that the transformation rule is more complicated than that for the electromagnetic potentials $A_\mu$:

The second term on the right side of (36), which is absent in the electromagnetic case [Eq. (18)], is extremely important and is associated in one way or another with all the interesting and novel features of non-Abelian gauge fields (i.e., fields associated with non-Abelian symmetry groups): This term reflects the isovector character of our new field (it is the normal expression for the change in a vector under rotation) and the fact that the quanta of this field must carry isospin. This is in contrast to the case of photons, which do not carry electric charge. The new quanta interact with each other in a nonlinear way, just as photons would if they were charged.

It may be helpful to think about this analogy a little more carefully. If photons were electrically charged, then they would interact with each other by way of the electromagnetic field, i.e., in the language of quantum field theory, by the exchange of virtual photons! Even if no other charged particles were present, the photons would not be free. Another way to visualize this state of affairs is in terms of the classical electromagnetic field equations: There would be nonzero electric charge and current densities wherever the electromagnetic field strengths were nonzero and the dynamical behavior of the fields would be influenced by these charge and current densities in a complicated nonlinear way. Many of the usual properties of the electromagnetic
field, in particular the linear superposition of wave solu- to each point x of space-time. The term fiber bundle re-
tions, would be lost. fers to theanalogy with a collection ofthin threads or fibers
Photons are not charged, of course, but the B quanta m e bound together into a much thicker bundle. Each thread
in the sense that they carry isospin and isospin is the corresponds to one of the local vector spaces at a point x ;
source-the analog of charge-for theB field. The descrip- the collection of these into a product space is the bundle.
tion of the last paragraph may not apply to the electromag- (There is also a principal fiber bundle consisting of the
netic case, but it applies almost exactly to the B field. Ex- collection of parameter spaces, each attached to a point x ,
cept for this self-interaction, the field equations for each associated with the local transformations discussed
isospin component of the B field are identical to those for above.)
the electromagnetic field. The process of moving from one point x to a neighboring
Let us now complete the description of the B field, still point requires us to define a connection between the two
following the logical sequence of the electromagnetic case. associated local vector spaces, giving rise to the covariant
Recall first that the components B,, ( x ) are analogous to derivatives introduced in Eqs. (16) and (34), and which
the electromagnetic potentials A,, ( x ) , so that we need to bear a very close relationship to the covariant derivatives of
construct the electric and magnetic field analogs. These general relativity theory. The failure of the covariant de-
analogs take the covariant form fPv,which now has the rivatives to commute is an indication of a kind of curva-
character of an isovector, like B,,, and is given [cp. Eq. ture in this extended space, with the quantities fPt.giving a
( 2 3 ) l by precise characterization of that curvature, again in close
analogy to the curvature tensor of general relativity. It
f f i Y = a V B ,-, a , , B , - g B , , X B , . (37) seems very clear that this geometrical structure represents
The electric field, whose spatial components E, are given a wonderful and unexpected vindication of Einsteins vi-
by fa, has the character of an isovector also, so that it has sion of a unified geometrical picture of all the forces of
nine components in all instead of the familiar three compo- nature.
nents. The spatial components H, of the magnetic field, This completes our outline of the non-Abelian gauge
given by the elements $f_{ijk}$, have the same character.

The final steps, as with the electromagnetic field, are to construct the Lagrangian density and field equations for the $\mathbf{B}$ field. Because it must be Lorentz and gauge invariant, the Lagrangian density can only be

$$\mathcal{L} = -\tfrac{1}{4}\,\mathbf{f}_{\mu\nu}\cdot\mathbf{f}^{\mu\nu} \qquad (38)$$

[cp. Eq. (24)], while the field equations derived from (38) have the form

$$D_\mu \mathbf{f}^{\mu\nu} = \mathbf{j}^{\nu}, \qquad (39)$$

like Eq. (26). Here $\mathbf{j}^{\nu}$ is the isospin current density associated with the other particles present (in our example, the neutron-proton field),

$$\mathbf{j}^{\nu} = g\,\bar{\psi}\gamma^{\nu}\mathbf{T}\psi. \qquad (40)$$

The covariant derivative $D_\mu$ of the field $\mathbf{f}$ is no longer equivalent to the regular derivative $\partial_\mu$, since $\mathbf{f}$ is an isovector. The rule is

$$D_\mu \mathbf{f} = \partial_\mu \mathbf{f} + g\,\mathbf{B}_\mu \times \mathbf{f}, \qquad (41)$$

in close analogy with Eqs. (36) and (37).

Now, the form of Eq. (39) does not tell us that $\mathbf{j}^{\nu}$ is conserved, and in view of our discussion we should not expect it to be, since the $\mathbf{B}$ field carries isospin as well. However, if the second term on the right side of (41) is taken to the right side of Eq. (39), then Eq. (39) takes the form

$$\partial_\mu \mathbf{f}^{\mu\nu} = \mathbf{j}^{\nu} - g\,\mathbf{B}_\mu \times \mathbf{f}^{\mu\nu}, \qquad (42)$$

and since the four-dimensional divergence of the left side of (42) is automatically zero (because $\mathbf{f}^{\mu\nu}$ is antisymmetric in $\mu$ and $\nu$), we see that it is the right side of (42) that represents the full conserved isospin current, including the contribution of the $\mathbf{B}$ field.

It is important to mention that the structures we have developed are seen by mathematicians as having a deeply geometrical character, which is realized in the mathematical theory of fiber bundles. The space in which this geometrical character manifests itself is a kind of product space (the associated fiber bundle, to be more precise) in which a local vector space, with elements $\psi(x)$, is attached

field logic, using the example of isospin as the conserved quantity that serves as the source for the gauge field $\mathbf{B}$. At the time the theory was invented, isospin was the only reasonable candidate for this role and the hope was that the resulting gauge field might serve as the carrier of the strong interaction. There was a general feeling at the time that the idea, while beautiful, was beset by too many difficulties to be acceptable. A number of years had to pass and a number of sophisticated developments had to occur before it was discovered how the difficulties could be resolved and thus the non-Abelian gauge fields seen as an acceptable description of the fundamental interactions of nature.

V. RESOLVING DIFFICULTIES: THE GAUGE THEORY OF THE ELECTROWEAK INTERACTION

As mentioned briefly in Sec. II, the mass of the gauge quantum and the renormalizability of the theory were seen from the beginning as major problems. The resolution of these problems occurred over the course of time as part of the long and intricate process by which the modern gauge theories of the electroweak and strong interactions were developed. A proper account of these developments is beyond the scope of this article, but I shall give a brief description of how these particular difficulties were finally resolved.

The problem with mass is essentially that the forces to be described are short-range forces and thus require massive interaction quanta (as explained below), while the introduction of such a mass into a gauge theory destroys the gauge invariance, as well as making it virtually impossible that the theory be renormalizable.

The question of renormalizability has to do with the handling of certain infinite expressions that seem to arise inevitably when doing any kind of relativistic quantum field theory. For certain kinds of theory, including quantum electrodynamics, there is a logical procedure (discussed below) for dealing with these infinities that permits us to
make realistic finite calculations for real physical effects. The procedures are not mathematically rigorous, but they work; in the case of quantum electrodynamics (QED) they work with astonishing precision. Theories for which these procedures work are called "renormalizable"; nonrenormalizable theories are useless in practice and apparently unphysical.

A. Mass of gauge quanta

Our experience with gauge fields led us to suppose that the quanta would have to have zero mass in order for the theory to satisfy gauge invariance. Now, the range of an interaction is inversely proportional to the mass of the mediating quanta and zero-mass quanta imply long-range interactions, like the electromagnetic and gravitational forces. The strong and weak interactions, however, are of extremely short range and fall off exponentially at distances of the order of the nuclear size. (This is why the nuclei of neighboring atoms, for example, interact with each other only by electromagnetic forces, even though the nuclear force is much stronger.) Thus we seem to run into a contradiction and we ask the following question: Is it possible for the gauge quanta to acquire a mass in some way without violating gauge invariance?

In what we now believe is a reliable picture of fundamental particles and interactions at our present level of understanding, there are three separate gauge theories: the Glashow-Weinberg-Salam theory for electromagnetic and weak interactions, the color gauge theory for strong interactions, and general relativity for the gravitational interaction. The first two of these theories, together with the spectrum of elementary particles associated with them, make up what is now referred to as the "Standard Model." In each of these three gauge theories the question of the mass of the quanta takes a different form. In gravitation theory, the mass is simply zero and the forces are long range. In quantum chromodynamics (QCD), the theory of the color gauge field that mediates the strong interaction, the mass is zero, but the confinement of color charge (discussed below) prevents its manifestation as a long-range force. Finally, in the electroweak theory, which unifies the electromagnetic and weak interactions, all the gauge quanta except the photon acquire a mass through a mechanism known as "spontaneous symmetry breaking," which does not in fact spoil either the gauge invariance of the theory or its renormalizability.

The principle of spontaneous symmetry breaking is that the actual symmetry of a system may be less than the symmetry of the underlying physical laws. This is obvious in the world around us, which is full of objects that are neither translation nor rotation invariant, even though the physical laws of which they are a manifestation are both translation and rotation invariant. The new discovery is that even the vacuum state (the supposedly blank canvas on which the universe is painted) can fail to exhibit the full symmetry of the laws of physics. If the vacuum is unique, that is, if there is only one state of lowest energy, then it must indeed have the full symmetry of the laws, but if it is not unique, i.e., if the vacuum is a degenerate state, then this is no longer the case and for each unsymmetrical vacuum state there will be others of the same minimal energy related to the first by the various symmetry transformations under which the physical laws are invariant. A famous example is the ferromagnet, which, when cooled to absolute zero and thus brought to its state of lowest energy, must come to a magnetized state with some definite orientation even though any orientation is allowed. This ground state of the ferromagnet is the analog of the vacuum state for our universe.

The symmetry of the underlying theory, then, is spontaneously broken in the sense that the universe we see, existing within the framework of such an unsymmetrical vacuum state, will exhibit this broken symmetry even in the way that the physical laws appear to operate. Now, the broken symmetry that we want to look at is the gauge symmetry of the electromagnetic and weak interactions, viewed together as associated with a single four-component gauge field. There is no evidence that the vacuum state for such a theory is degenerate, so the authors of the theory forced the vacuum to be degenerate by means of a device due to Higgs, which consists of the artificial introduction of an additional field, called the Higgs field, with properties chosen so as to make the vacuum state degenerate. In the example worked out in Sec. IV, where the symmetry was the isospin symmetry SU(2), the Higgs field might be taken to be an isospin vector field, with a quartic form for the field energy chosen in such a way that the field configuration of minimum energy is a uniform nonzero field with a definite but arbitrary orientation in the isospin vector space. Like the magnetization of the ferromagnet, this oriented Higgs field gives the observed universe a preferred orientation, now in isospin space, and provides an elegant model for the observed broken symmetry.

In the standard model, however, isospin symmetry is treated as accidental and is not associated with a gauge field [although efforts to extend the standard model to a larger symmetry group that will unify the strong and electroweak interactions (Grand Unified Theories, or GUTs) invariably incorporate the symmetry of flavor, of which isospin is one facet]. In the gauge theory of electroweak interactions, the symmetry group is taken to be SU(2) × U(1), where U(1) is the phase symmetry discussed in Sec. IV in connection with the electromagnetic field [Eq. (8)] and SU(2) is the same symmetry group used to describe isospin symmetry, but now applied to a somewhat different symmetry referred to as "weak isospin." Weak isospin symmetry, unlike ordinary isospin, relates the weakly interacting particles, so that the electron and its neutrino form a doublet (with extra complications associated with left- and right-handed spin asymmetries that I will not go into here), as well as the muon and tau with their respective neutrinos. The isospin multiplets of hadrons (the strongly interacting particles) are modified so that, for example, the proton is paired, not with the neutron alone, but with a quantum mixture (linear superposition) of the neutron, the $\Lambda$, and the $\Sigma$.

This theory predicts the existence of four gauge quanta: a neutral photonlike object, sometimes called the $X$, associated with the U(1) symmetry and a weak-isospin triplet associated with the SU(2) symmetry, whose members are referred to as $W^0$, $W^\pm$. The Higgs symmetry-breaking mechanism now has several consequences: The $W^\pm$ particles acquire a mass and the $X$ and the $W^0$ are mixed, so that the neutral particles you see in nature are two different linear combinations of these two particles. One of these, christened the $Z^0$, has a mass and the other, the familiar photon, is massless. The masses of the $W^\pm$ and the $Z^0$ are governed by the structure of the uniform Higgs field back-
ground and do not affect the basic gauge invariance of the theory. The interaction strength associated with all four particles is essentially the same and the observed weakness of the weak interaction, mediated by the $W^\pm$ and the $Z^0$, is understood as a consequence of their masses, which are taken to be large enough (90-100 proton masses) to agree with what we see.

B. Renormalizability

Renormalizability was the other problem, reflecting one of the deep questions in quantum field theory, a reminder of the fact that the theory has no rigorous mathematical foundation and indeed has some serious flaws if viewed from a strictly mathematical viewpoint. Let us review just enough of the history of this problem to see how it relates to the new non-Abelian gauge theories.

The big problem is what are called "divergences," or infinities. Since the early days of electromagnetic theory and on into the era of QED, certain calculations of what should be physically reasonable effects have given infinite answers. The first is the infinite electrostatic self-energy of the classical electron (or of any charged point particle). The energy density of the electric field is proportional to $|\mathbf{E}|^2$ and the field at a distance $r$ from a particle of charge $e$ is proportional to $e/r^2$; thus one ought to be able to find the total energy in the electric field of the particle by integrating.

If the particle has radius $a$, say, the result is an integral proportional to $e^2\int r^{-2}\,dr$, where the integration is from $a$ to $\infty$, giving $e^2/a$. If the electron is a point particle, then $a = 0$ and the energy is infinite. Is this physically reasonable? One might argue that this is an unobservable energy and that, therefore, its value is physically meaningless. Difficulties arise on account of relativity theory, however, which tells us that energy is equivalent to mass and that such a point particle should therefore have infinite inertia, which would certainly be an observable effect. A finite radius $a$, on the other hand, also produces contradictions with the requirements of relativistic invariance and we are left with a dilemma that is still with us, though in a more abstract form, 80 years later.

Since the advent of QED, the difficulty has shown up in the form of divergent integrals in the calculation of all sorts of physical effects. The historic example was the Lamb shift, referring to the measurement by Lamb and Retherford in 1947 of the extremely small shift of the spectral lines of hydrogen due to quantum effects in the electromagnetic field. The experiment revealed, and the theory in principle predicted, a contribution smaller by a factor $\alpha$ (the fine-structure constant, approximately equal to $1/137$) than the smallest previously identified contribution. The actual calculation of this contribution was straightforward to set up using standard principles of quantum mechanical perturbation theory, as it is called, and gave the proper factor $\alpha$, multiplied, however, by an expression involving divergent integrals, that is, with a value equal to infinity. This highly unphysical result showed that there was something seriously wrong with the basic ideas of QED and caused considerable anxiety among physicists at the time.

The problem was solved, after a fashion, when it was realized that the fundamental parameters involved, the mass and electric charge of the electron, needed to be renormalized. These two parameters, $m$ and $e$, appear in the theory from the beginning and represent the mass and charge of the noninteracting electron, usually referred to as the "bare mass" and the "bare charge." What was realized, then, was that the electromagnetic interaction, in addition to shifting atomic energy levels, would also alter the observed mass and charge. To obtain a prediction of the observed values, one needs to do another calculation; the resulting expressions are equal to the bare values plus corrections of order $\alpha$, $\alpha^2$, etc. The corrected values are called the renormalized mass and charge. I called this a "prediction," but that is not correct because in fact we do not know what the bare values are and we do know what are the observed, or renormalized, values. What we must do, then, is use these expressions to deduce the bare parameters from the observed values. This process is called "renormalization" and would be necessary and indeed quite straightforward even if there were no divergences in the theory. Predictions of experimental quantities such as the Lamb shift must be reexpressed in terms of the renormalized mass and charge since these are the observed values.

What happens, though, is that in the calculation of the renormalized mass and charge the same sort of divergent integrals appear as were found in the original Lamb shift calculation. What does this mean? First, it does not mean that the renormalized values are infinite (since they are the observed values), but that the bare values differ by some infinite factor from the observed values. The process of renormalization, then, requires us to manipulate these infinite quantities, which is clearly not allowed mathematically, but seems to be necessary to arrive at any kind of answer.

What we do in practice is find some artificial way of making the integrals converge so that the expressions we need to manipulate are finite. There are a number of ways of doing this, closely analogous to keeping a finite radius $a$ for the electron in the calculation of the electron self-energy discussed above (and indeed the self-energy is the mass). The divergent expressions are now finite for finite $a$ and become infinite only when $a$ is set equal to zero. The renormalization procedure can now be carried through and the expressions for the bare charge and mass in terms of the observed values substituted into the expression for the Lamb shift (or any other experimentally observable quantity). The astonishing result is that the troublesome integrals cancel out and disappear completely and when the cutoff parameter ($a$ in our example) is set equal to zero the result is still finite. The theory could now be compared with experiment (remember that the effect predicted is extremely tiny) and the agreement was spectacular. The initial agreement was at the level of a few parts per million and has steadily improved over the years.

Clearly, we have in some sense a correct theory, even though the procedures are mathematically unsound. In the case of QED, the results are to a large degree independent of what cutoff scheme is used; there are a number of different ways of replacing the divergent integrals with finite expressions and the end result, when the divergent parts have canceled, seems to be always the same. A theory for which this procedure works is called "renormalizable." It is easy to find otherwise reasonable theories that are not renormalizable since there are a number of things that can go wrong; thus a very important fact about QED is that it is renormalizable and can therefore be used to make highly
precise predictions about the tiny effects such as the Lamb shift that validate its claim to be true.

It is evident, now, that if non-Abelian gauge theories are to be taken seriously, their renormalizability becomes a question of great importance. The nonlinear character of these theories, which is related directly, as discussed in Sec. IV, to their self-interaction, makes this question very difficult indeed, even if the quanta are massless. To make matters worse, the first such theory to show signs of really being true was the Glashow-Weinberg-Salam theory of the electroweak interaction and as we have seen the gauge quanta (apart from the photon) are massive. If the masses are put in directly, in violation of gauge invariance, the theory is found to be nonrenormalizable, but what if the masses arise from spontaneous symmetry breaking? This remained an open question for some time and glimpses of the truth were beginning to be seen by several physicists when, in 1971, Gerard 't Hooft, a young graduate student from the Netherlands, solved the problem completely. In a truly remarkable piece of work, 't Hooft confirmed the renormalizability of massless gauge theories, already shown implicitly by others, and went on to renormalize the massive case provided the masses were generated by spontaneous symmetry breaking, the Higgs mechanism.

This major breakthrough cleared the way for wide acceptance of the Glashow-Weinberg-Salam theory of the electroweak interaction. The authors of the idea had intended to produce an example of a theory with broken symmetry using the Higgs mechanism in an ad hoc way and hardly expected that the result would correspond to reality. In fact, however, the Higgs mechanism not only provided the quanta with masses, but allowed for complete renormalizability and resulted, surprisingly, in a practical theory that has enjoyed a remarkable degree of experimental confirmation.

VI. QCD AND THE STANDARD MODEL

We turn now to a brief description of the second of the two gauge theories that make up the Standard Model, namely, quantum chromodynamics (QCD), or color gauge theory, the gauge theory of the strong interactions. Again, there is a long and complicated history leading up to the point where such a theory could even be proposed since the major ingredients of the theory are impossible to observe directly. The color gauge field is unobservable; the conserved color charge is unobservable; and the quarks, which are the particles that carry the color charge, are unobservable. Actually, the last comment is not quite fair because although free quarks have never been seen and are probably impossible to produce, one can nevertheless say that the quarks inside the neutron or proton have been observed indirectly through the deep inelastic scattering of high-energy probes, in a manner very similar to the observation of the atomic nucleus by Rutherford and his co-workers in 1909, through the scattering of alpha particles.

The first stages leading to the development of color gauge theory involved the realization that the hadrons (the proton, neutron, and other strongly interacting particles, both mesons and baryons) could not be elementary particles. For one thing, there was no logical reason for identifying some of the hadrons as more fundamental than the rest, while schemes for treating all of them as equally fundamental proved unfruitful. For another thing, the hadrons had more structure than seemed appropriate for elementary particles, even taking into account the big effect that virtual quantum processes would have as a result of their large interaction strength. Finally, the observed symmetry patterns among the hadrons, described by quantum numbers known collectively as "flavor" (which is like a multidimensional extension of isospin), strongly suggested a composite structure, since it would take only a small number of subnuclear particles, with the correct fundamental symmetry characteristics, to permit the construction of the large variety of observed particles.

These subnuclear particles, christened "quarks" by Murray Gell-Mann, who was responsible for the present form of this hypothesis, were at first three in number. Two of the quarks, called the $u$ and $d$ (for "up" and "down"), constituted an isospin doublet and generated all the observed isospin multiplets, while the third, the $s$ (for "strange") quark, was responsible for the additional dimension of flavor known as strangeness.

Each of the known baryons (strongly interacting fermions) could then be modeled as a bound state of three quarks and each of the mesons (strongly interacting bosons) could be modeled as a bound state of a quark and an antiquark. It was also found to be essential (after some distress with the idea and a number of unsuccessful efforts to avoid it) to give the quarks fractional electric charge, so that the $u$ quark has charge $+\tfrac{2}{3}$ and the $d$ and $s$ quarks each have charge $-\tfrac{1}{3}$, in units of $e$, the proton charge.

In more recent years, two additional flavors of quark have been identified, the $c$ (for "charm") and $b$ (for "bottom," or sometimes "beauty"), while a sixth flavor, to be known as $t$ (for "top," or "truth"), is widely felt to be needed to complete the picture. This would then match the six varieties of lepton in some deeper flavor symmetry that may some day unify the strong and electroweak interactions. In the standard model, though, flavor symmetry is not included as a true symmetry and is not associated with a gauge field. The many efforts to model the strong interactions in such a way were unsuccessful and the next level of truth was found to involve a deeper degree of complexity, as we shall now see.

The next chapter in this story came with the realization that the different flavors did not provide enough quarks to explain what we observed. In the first place, there was a difficulty with the Pauli exclusion principle. In order to explain several of the observed baryons, one had to suppose that three quarks of the same variety were bound in what was essentially the same orbital state. The quarks, though, had to be spin-$\tfrac{1}{2}$ fermions, just like electrons, which meant that there could be no more than two in the same orbital state, corresponding to the two possible spin states. (The idea that quarks might be neither bosons nor fermions, but obey some more complicated kind of "parastatistics," was also explored, particularly by O. W. Greenberg in 1964, but has not proved fruitful.) The other difficulty was in finding a force that would bind the quarks together so strongly that they could never (or only rarely) escape and yet would not show up as a comparably strong force among the observed hadrons, seen as bound configurations of quarks. In 1965, Han and Nambu pointed the way to a solution of these difficulties by suggesting (in connection with a model of quarks with integer rather than fractional charge) that there might be additional quantum numbers needed to de-
scribe quarks, so that each flavor of quark could come in three varieties, later to be called "colors," and linked by a new symmetry, with the structure known as SU(3). This dealt nicely with the exclusion principle and the color charge, as one might call it, could provide the basis for a binding force for the quarks.

It was not until 1972, though, when gauge theories had become more popular, that the idea of binding the quarks with a gauge field was put forward. In that year, Gell-Mann, Fritzsch, and Bardeen introduced the term "color" to describe these additional degrees of freedom and proposed a gauge theory based on the color SU(3) symmetry group. This also followed the observation by Adler that one can calculate the rate of decay of the pi meson into two photons and that the result depends strongly on the number of types of constituent particle. Adler found that the decay rate is off by a factor of about 3 if one uses only the flavored quarks, but gives very reasonable agreement if one uses the color triplets.

One of the strong motivations for the new color SU(3) model was the need to explain why quarks were never observed. The hope was that there might be what is called "color confinement," which would require that the force between color charges is strong enough at large distances to prevent the charges ever being separated. It is as if the Coulomb potential between electric charges, instead of falling off as $1/r$, were logarithmic, or even linear, at large distances. Then it would take an infinite amount of work to separate two opposite charges to an infinite distance from each other and any unbalanced charges would inevitably find each other and recombine into neutral composites. As it is, in fact, the long-range Coulomb force is such that free electric charges have a strong tendency to recombine and most matter at the macroscopic scale is very close to being electrically neutral.

In the case of quarks, the color charge is seen as a vector in a two-dimensional plane and the three different colors correspond to vectors at angles of 120° from each other, as shown in Fig. 2(a). In order to obtain a color neutral combination, it is necessary to add the color vectors in multiples of three. Thus, to get vectors to add to zero, three vectors, one of each type, can be added, as in Fig. 2(b), or a vector can be added to its negative (corresponding to an antiquark), as in Fig. 2(c). The first possibility corresponds to the baryons, consisting of three quarks, and the second possibility corresponds to the mesons, consisting of a quark-antiquark pair. The use of the term "color" for this conserved vector quantity was motivated by the analogy to the three primary colors, usually taken as red, blue, and green or red, blue, and yellow. In order to achieve white, the neutral color, you must combine all three primary colors, or else combine a color with its complementary color, in a nice analogy with the two ways of adding the three color vectors as described above. In contrast to ordinary color, though, the three types of color charge are totally indistinguishable from each other, reflecting the symmetry upon which the gauge theory is based.

[Fig. 2. Vector addition for color charge: panels (a), (b), (c).]

(A complication needs to be mentioned here: The conserved color charge is in actuality a vector in an eight-dimensional space, in analogy to the three-dimensional isospin space. Just as the isospin components fail to commute, so that the eigenvalues of just one of them, referred to as $I_3$, label the members of each isospin multiplet, in the same way just two of the eight components of color can be taken to be commuting, so that the quark multiplets are labeled by those two parameters and give rise to the two-dimensional vectors discussed above.)

The gauge quanta have come to be referred to as "gluons," in reference to the long-standing puzzlement as to the nature of the "glue" that holds nuclear matter together; the color gauge field is often referred to as the gluon field. The quantum field theory of the color gauge field is called "quantum chromodynamics" (QCD) in analogy to quantum electrodynamics (QED), the quantum theory of the electromagnetic field. ("Chromo-" is a prefix, from a Greek root, meaning color.) The gluons are massless (there are no broken symmetries in this theory), but because they inevitably carry color charge (see Sec. IV), they are also confined, just as are the quarks. This, together with the color neutrality of all the physical hadrons, prevents the gluon field from generating a long-range force. Many people have speculated that there might be color neutral bound configurations of gluons, referred to as "glueballs," but they would necessarily be massive.

Whether the gluon field does in fact produce confinement is still an unanswered question, but there are strong indications that it does, as a direct consequence of the nonlinear character of such a gauge field. In contrast to confinement, which concerns the behavior of the field at large distances from the source, there is also a very important question as to what happens at small distances. As mentioned at the beginning of this section, high-energy probes have pointed to the presence of quarks in the interior of baryons and furthermore have indicated that the quarks move very freely at these small distances, as if the forces between them were much weaker than the theory seems to demand. This weakness of the interaction at small distances has come to be called "asymptotic freedom," in reference to the fact that very small distances correspond to very large momentum transfers (related to the Heisenberg uncertainty principle). It was very important, then, that any proposed theory show this sort of behavior; it has been one of the triumphs of QCD that Gross and Wilczek and, independently, Politzer, were able to show in 1973 that this was indeed the case.

With this settling of the problem of asymptotic freedom, all the major difficulties in formulating gauge theories of the strong, electromagnetic, and weak interactions were essentially solved, giving rise to what we have referred to as the Standard Model, the picture of the world as consisting of quarks and leptons, together with the two basic gauge fields associated with the strong and electroweak interactions.

I have not yet discussed the gravitational field, which seems to me to be the mystery at the center of the puzzle. It is clear that Einstein's general theory of relativity can be
understood in many ways as the gauge theory associated with the symmetries of space and time, but it does not fit easily into the pattern I have tried to describe. The local symmetry involved is the invariance of the theory under arbitrary curvilinear coordinate transformations, which can be seen as a kind of local Poincaré invariance (the complete symmetry of special relativity). The conserved quantities associated with Poincaré invariance, namely, energy, momentum, and angular momentum, are to some extent the sources for the gravitational field, as the gauge philosophy requires, but there are important differences. In the first place, the local transformations associated with the coordinate transformations of general relativity are general affine transformations, a much broader class than the Poincaré group and too broad in fact to describe the symmetries we see in our local environment. Closely associated with this is the fact that the quantities in general relativity that play the role of the gauge potentials, namely, the Christoffel symbols (or the affine connection in mathematical terms), are not dynamically independent gauge potentials at all, but are derived from the metric tensor, which in some way then acts the part of the gauge potential.

In the second place, the sources of the gravitational field in conventional general relativity are just the energy and momentum distributions, associated through Noether's theorem not with the whole Poincaré group, but just with a subgroup of it, namely, the translation symmetry of space and time, as if general relativity were the gauge theory of the translation symmetries alone. Many people, then, have tried to extend the gauge symmetry to the full Poincaré group (theories with torsion, in the usual terminology), which would predict forces associated with angular momentum distributions. This remains very speculative.

Finally, gravitation theory has proved extremely difficult to quantize, with the problems I have described appearing in their worst possible form, together with other problems not present in the other kinds of gauge theory.

Nonetheless, these three basic interaction theories seem to give a remarkably coherent picture of the universe as we know it. Classical gravitation theory, which is general relativity as Einstein developed it, has survived many experimental tests and fits beautifully with all our conceptions of

ending with a coherent, if still incomplete, picture in which all the basic interactions of nature are understood in terms of three fundamental gauge fields: the gluon field, the electroweak field, and the gravitational field.

In this presentation, I have greatly oversimplified the complex developments leading up to the Standard Model and I must do the same now with the multitudinous developments and efforts of recent years extending beyond that model. Two principal lines of thought have been directed toward the goal of unification. There has long been a desire on the part of scientists, stimulated especially by Einstein, to see coherence in the laws of physics, a feeling that all the laws should be part of a single pattern, consequences in a unified way of a single fundamental principle governing all of nature. The history of physics has seen many unifications and our present picture, consisting of the Standard Model plus general relativity, represents spectacular progress toward this end in that all the diverse areas of 19th- and 20th-century physics are drawn together into a very compact set of elementary constituents and interaction laws; furthermore, the interactions are all associated with gauge theories that follow the same basic principles.

It is clear, though, that the job is not finished. The dream is to unify the basic interaction fields into a single theory, presumably a gauge theory associated with some great and natural symmetry group that comprises all the symmetries of our current picture and breaks down through some kind of spontaneous symmetry breaking, like that associated with the electroweak gauge theory, into the different subsymmetries. Gravity will undoubtedly be the most difficult to incorporate into such a unified theory; thus the first step was expected to be a unification of the strong and electroweak theories into a GUT. In such a scheme it is evident that the six (presumed) flavors of quarks will be symmetrically related to the six different leptons and inevitably, on account of the symmetry breaking, will be able very weakly to transform into each other. If quarks can turn into leptons, though, it means that an otherwise stable hadron (the hadrons are the baryons and mesons, the strongly interacting particles) would ultimately decay. In fact, the only stable hadron is the proton (though the neutron can also be stable when bound in a nucleus), so the
stellar behavior and cosmology. The Glashow-Weinberg- first characteristic of a GUT would be. proton decay. The
Salam theory of the electroweak interaction has also had lifetime would have to exceed around 10 years, which is
remarkable success in describing experimental results and, greater than the age of our universe by a factor of around
while it is extremely difficult to do any but the crudest lo2,so you might expect it to be impossible to observe.
calculations with QCD, there are many indications that it However, there are many protons in the matter around us,
does give a correct description of the strong interactions. In for example, about lo3 in a ton of water, so there is in fact a
Sec. VII I shall try to give a summary of the ideas presented real chance of seeing them. Major experimental efforts
here and some account of the directions that are being ex- have been made to do so in a number of different laborato-
plored in trying to find a single unified theory encompass- ries, using huge vats of liquid surrounded by electronic
ing all the fundamental interactions-a theory of every- eyes, in deep underground mines in order to minimize com-
thing, as some people have called it. sic ray background. Thus far the results have been negative
and the limits obtained on the decay rate are sufficiently
low to have forced us to abandon at least the most reasona-
VII. T H E FUTURE A THEORY OF EVERYTHING? ble forms of GUT. The efforts continue, but the interest in
GUTS has decreased considerably. One of the untidy ele-
In the preceding pages I have tried to describe some of ments in the search for a GUT is the Higgs, the pre-
the history of the idea of gauge symmetrics and gauge fields sumed quantum of the Higgs field, introduced in an ad hoc
starting with Noethers theorem-or Noethers principIe, way, as you will recall, to provide a symmetry breaking
perhaps (see Sec. 111)-relating symmetrics and conserva- mechanism for the electroweak gauge theory. Such a parti-
tion laws; going on to develop the logic of gauge theory and cle seems to have no fundamental reason of its own for
the unique way in which the gauge philosophy generates existing since it seems esthetically unsatisfactory to SUP-
theories associated with given natural symmetrics; and pose that nature invented it for that purpose alone. What

many hope is that the dynamics of the fermion and gauge fields will be found to produce
spontaneous symmetry breaking without the help of such a device, a mechanism referred to as
dynamical symmetry breaking.
Beyond the unification of the strong and electroweak forces there lies the dream of total
unification of gravity into a single theory of everything. This sort of work is much more
speculative than in the case of GUTs because of the extra degrees of complexity in
gravitation and its failure to conform to the pattern of the other gauge theories. The major
reason for these differences is that gravitation is associated with what is called a
dynamical symmetry group having to do with transformations on space and time, the medium in
which dynamical events take place, while the other gauge theories are associated with
internal symmetry groups whose transformations involve only nondynamical degrees of freedom,
the flavor and color parameters labeling the different elementary particles.
While it may turn out that the role of gravity is indeed intrinsically different from that
of the other forces, there are nevertheless two lines of attack on the problem of
unification that are elegant enough to stand at least some chance of being true, namely,
Kaluza-Klein and supersymmetry theories. The idea in supersymmetric theories is to extend
the symmetry groups of nature to include transformations that mix boson and fermion fields.
This can be done without violating the conservation of fermion number by allowing the cross
terms in the transformation matrices, the terms associated with boson-fermion mixing, to be
what are called Grassmann elements, which are totally anticommuting analogs of ordinary c
numbers. While the supersymmetry transformations never turn bosons completely into fermions,
or vice versa, such theories do predict matching pairs of bosons and fermions, including
even the gauge quanta (photons and photinos, gluons and gluinos, etc.). This proliferation
of particles may be somewhat embarrassing, but what is startling is that a spin-2 gauge
field arises very naturally in this context, with many of the right properties to be a
lively candidate for the gravitational field. It is linked to a spin-3/2 gravitino field
which might or might not have observable consequences.
Another very significant approach is the Kaluza-Klein idea, growing out of the idea of
Kaluza (1921) and Klein (1926), that the basic structure of space-time might in fact have
more than the usual four dimensions. If the additional dimensions are tightly curled up, so
that you could not go more than a minute fraction of a cm, say, in such a direction without
returning to your starting point, then at the level where ordinary physics takes place only
four dimensions would be observed. The basic symmetries of nature, however, would be
realized in this higher dimensioned space-time and the basic quantum field theory would be a
field theory in that higher number of dimensions. On one hand, dynamical symmetries
involving the extra dimensions would appear at our level as internal symmetries, allowing
for the possibility (not completely realized at this point) of relating the two types of
symmetry, while, on the other hand, the problems with divergences and other ambiguities that
beset quantum field theories would at least take a different form and might even prove to be
soluble.
One exciting development that combines these ideas is known as ten-dimensional superstring
theory. The theory is a supersymmetric gauge theory in a ten-dimensional space-time, with
six of the dimensions assumed to be highly compact, as suggested above, and with the
additional feature that the fundamental entities out of which matter is to be built are not
point particles, but strings, tiny open or closed loops that can be thought of as
dislocations in the basic structure of space and time and whose transformation symmetries
are the basis for an associated gauge theory. With this precise combination of dimensions
and structures it seems that many of the difficulties mentioned above might find their
natural resolution, such as the canceling of divergences and anomalies in the gauge
theories, the natural appearance of the gravitational field, and the natural concealment of
the extra bosons and fermions generated by supersymmetry. There are many aspects of such a
theory that could never be directly tested since they involve the structure of space-time
and matter at a totally inaccessible level: There are those who regard the exercise as
futile for this reason. One could argue, though, that if such a theory should be found to
provide a consistent basis for understanding physics at the level at which we do observe the
world and if there is no other way of doing this that anyone can think of, then elegance
alone is a sufficient reason for treating it as true.
There seems to be little doubt now that the ultimate theory, if it is ever accurately
identified, will turn out to be a gauge theory. My own feeling is that there will have to be
at least one more major conceptual revolution before that final goal is achieved. While
gauge theories are easy to formulate at the classical level, the process of quantizing gauge
theories is quite awkward, involving either noncovariant procedures or the introduction of
unphysical degrees of freedom, all of which suggests to me that we may be starting from an
incorrect understanding of quantum theory itself. If the most basic theory of the universe
is a quantum gauge theory, then a gauge theory should be the most natural thing (if not
perhaps the only thing) that can be quantized, rather than the most awkward; indeed, you
should be able to formulate a quantum gauge theory directly, without going through the
intermediate stage of the classical theory. Will the ultimate theory be a gauge theory of
the full group of unitary transformations on the Hilbert space of quantum theory, so that
quantum states themselves will be understood locally, rather than globally?

MAGNETIC MONOPOLES, FIBER BUNDLES, AND GAUGE FIELDS

Chen Ning Yang


Institute for Theoretical Physics
State University of New York at Stony Brook
Stony Brook, New York 11794

The reports in this monograph have shown great enthusiasm and exuberance for
the unification of various interactions through the concept of gauge fields. I would
like to emphasize a point that has not yet been explicitly stated by any of the other
authors: gauge fields are deeply related to some profoundly beautiful ideas of
contemporary mathematics, ideas that are the driving forces of part of the
mathematics of the last 40 years. Recalling the relationship between physics and
mathematics in earlier periods, general relativity and Riemannian geometry,
quantum mechanics and Hilbert space, it is all too obvious that physicists may again
be zeroing in on a fundamental new secret of nature.
The mathematical development referred to above is the theory of fiber bundles.
It may appear, a priori, that this theory is quite abstract and is unrelated to the
structure of the physical world. To show that this is not true, we will start with a
simple demonstration that electromagnetism and quantum mechanics together lead
naturally to nontrivial fiber bundles. We will then trace the early history of the
gauge field concept and its generalization, emphasizing three related but different
conceptual motivations, each of which leads to a general formulation of gauge
fields.

MAGNETIC MONOPOLES AND NONTRIVIAL BUNDLES

The magnetic monopole is the magnetic charge. Though the idea of magnetic
monopoles probably was discussed in classical physics early in the history of
electricity and magnetism, modern discussions of this concept date back only to 1931,
when the important paper of Dirac pointed out that magnetic monopoles in
quantum mechanics exhibit some extra and subtle features. In particular, with the
existence of a magnetic monopole of strength g, electric charges and magnetic
charges must necessarily be quantized in quantum mechanics. We will give a new
derivation of this result below.
If one wants to describe the wave function of an electron in the field of a
magnetic monopole, it is necessary to find the vector potential A around the
monopole. Dirac chose a vector potential that has a string of singularities. The
necessity of such a string of singularities is obvious if we prove the following
theorem (Reference 2):

Theorem: Consider a magnetic monopole of strength g ≠ 0 at the origin, and
consider a sphere of radius R around the origin. There does not exist a vector
potential A for the monopole magnetic field that is singularity free on the sphere.
This theorem can be proved easily in the following way. If there were a
singularity-free A, we would consider the loop integral

$$\oint A_\mu \, dx^\mu$$

around a parallel on the sphere, as indicated in FIGURE 1. According to Stokes'
theorem, this loop integral is equal to the total magnetic flux through the cap α:

$$\oint A_\mu \, dx^\mu = \Omega_\alpha .$$

Similarly, we can apply Stokes' theorem to cap β, obtaining

$$\oint A_\mu \, dx^\mu = \Omega_\beta .$$

Here, Ω_α and Ω_β are the total upward magnetic fluxes through caps α and β, both of
which are bordered by the parallel. Subtracting these two equations, we obtain

$$0 = \Omega_\alpha - \Omega_\beta ,$$

which is equal to the total flux out of the sphere, which, in turn, is equal to 4πg ≠ 0.
We have thus reached a contradiction.
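
As a check on the last step, the total flux can be worked out directly. The sketch below assumes the Coulomb-like monopole field B = g r̂ / r², which is consistent with the value 4πg quoted above; the symbols B and S_R are introduced here only for illustration.

```latex
% Total outward flux of the monopole field through the sphere S_R of radius R,
% assuming B = g \hat{r} / r^2 (an assumption consistent with the 4\pi g above).
\begin{align*}
\oint_{S_R} \mathbf{B}\cdot d\mathbf{S}
  &= \int_0^{2\pi}\! d\phi \int_0^{\pi} \frac{g}{R^2}\, R^2 \sin\theta \, d\theta \\
  &= 2\pi g \,\bigl[-\cos\theta\bigr]_0^{\pi} \\
  &= 4\pi g .
\end{align*}
% The result does not depend on R, so a singularity-free A on any such sphere
% would force 0 = 4\pi g, which fails for g \neq 0.
```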
Having proved this theorem, we observe that R is arbitrary. Thus, one con-
cludes that there must be a string(s) of singularities in the vector potential to
describe the monopole field. Yet, we know that the magnetic field around the
monopole is singularity free. This fact suggests that the string of singularities is not
a real physical difficulty. Indeed, the situation is reminiscent of the problem that
one faces when one wants to find a parametrization of the surface of the globe. The
coordinate system that we usually use, latitude and longitude, is not singularity free.
It has singularities at the north pole and at the south pole. Yet, the surface of the
globe is evidently devoid of singularities. We deal with this situation usually in the
manner illustrated in FIGURE 2. We consider a rubber sheet with nicely defined
coordinates and stretch and wrap it downward onto the globe, so that it covers more
than the northern hemisphere. Similarly, we consider another rubber sheet with
nicely defined coordinates and stretch and wrap it upward, so it covers more than

FIGURE 1. A sphere of radius R with a magnetic monopole at its center. The parallel
divides the sphere into two caps α and β.

FIGURE 2. Method of parametrizing the globe.

88 Annals New York Academy of Sciences

the southern hemisphere. We now have a double system of coordinates to describe
the points on the globe. The description is analytic in the domain covered by each
sheet, if the globe had experienced no violence in the stretching and wrapping. In
the overlapping region covered by both sheets, one has two coordinate systems that
are transformable into each other by an analytic nonvanishing Jacobian. This
double coordinate system is an entirely satisfactory way to parametrize the globe.
Following this idea, we will now attempt to exorcise the string of singularities in
the monopole problem by dividing space into two regions. We will call the points
outside of the origin, above the lower cone in FIGURE 3, region R_a. Similarly, we
will call the points outside of the origin, under the upper cone, R_b. The union of
these two regions gives all points outside of the origin. In R_a, we will choose a
vector potential for which there is only one nonvanishing component of A, the
azimuthal component:

$$(A_\phi)_a = \frac{g}{r\sin\theta}\,(1 - \cos\theta).$$

It is important to notice that this vector potential has no singularities anywhere in
R_a. Similarly, in R_b, we choose the vector potential

$$(A_\phi)_b = -\frac{g}{r\sin\theta}\,(1 + \cos\theta),$$

which has no singularities in R_b. It is simple to prove that the curl of either of these
two potentials gives correctly the magnetic field of the monopole.
In the region of overlap, because both of the two sets of vector potentials share
the same curl, the difference between them must be curlless and therefore must be a
gradient. Indeed, a simple calculation shows

$$(A_\mu)_a - (A_\mu)_b = \partial_\mu \alpha, \qquad \alpha = 2g\phi, \tag{6}$$

where φ is the azimuthal angle. The Schrödinger equation for an electron in the
monopole field is thus

$$\left[\frac{1}{2m}(\mathbf{p} - e\mathbf{A}_a)^2 + V\right]\psi_a = E\psi_a, \qquad
\left[\frac{1}{2m}(\mathbf{p} - e\mathbf{A}_b)^2 + V\right]\psi_b = E\psi_b,$$

FIGURE 3. Division of space outside of monopole g into overlapping regions R_a and R_b.

where ψ_a and ψ_b are, respectively, the wave functions in the two regions. The fact
that the two vector potentials in these two equations are different by a gradient tells
us, by the well-known gauge principle, that ψ_a and ψ_b are related by a phase factor
transformation

$$\psi_a = S\psi_b, \qquad S = \exp(ie\alpha), \tag{7}$$

or

$$\psi_a = [\exp(2iq\phi)]\,\psi_b, \qquad q = eg. \tag{8}$$

Around the equator, which is entirely in R_a, ψ_a is single valued. Similarly,
because the equator is also entirely in R_b, ψ_b is single valued around the equator.
Therefore, S must return to its original value when one goes around the equator.
That fact implies Dirac's quantization condition:

$$2q = \text{integer}. \tag{9}$$
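
The chain of steps in this section can be checked explicitly. The following sketch assumes the standard two-patch potentials with only an azimuthal component, (A_φ)_a = g(1 − cos θ)/(r sin θ) and (A_φ)_b = −g(1 + cos θ)/(r sin θ), spherical coordinates (r, θ, φ), and units with ħ = c = 1; B_r denotes the radial magnetic field and is notation introduced here.

```latex
% 1. Either potential reproduces the monopole field (only A_\phi is nonzero):
\begin{align*}
B_r &= \frac{1}{r\sin\theta}\,\frac{\partial}{\partial\theta}\bigl(\sin\theta\,A_\phi\bigr), \\
(A_\phi)_a = \frac{g\,(1-\cos\theta)}{r\sin\theta}
  &\;\Longrightarrow\; B_r = \frac{1}{r\sin\theta}\cdot\frac{g\sin\theta}{r} = \frac{g}{r^2}, \\
(A_\phi)_b = -\frac{g\,(1+\cos\theta)}{r\sin\theta}
  &\;\Longrightarrow\; B_r = \frac{1}{r\sin\theta}\cdot\frac{g\sin\theta}{r} = \frac{g}{r^2}.
\end{align*}
% B_\theta = -(1/r)\,\partial_r (r A_\phi) = 0 because r A_\phi does not depend on r.

% 2. Their difference is a pure gradient with \alpha = 2 g \phi:
\[
(A_\phi)_a - (A_\phi)_b = \frac{2g}{r\sin\theta}
  = \frac{1}{r\sin\theta}\,\frac{\partial}{\partial\phi}\,(2g\phi).
\]

% 3. Single-valuedness of S = \exp(2iq\phi) around the equator:
\[
S(\phi + 2\pi) = S(\phi)
  \;\Longleftrightarrow\; e^{4\pi i q} = 1
  \;\Longleftrightarrow\; 2q = 2eg \in \mathbb{Z}.
\]
```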

HILBERT SPACE OF SECTIONS

Two ψ's, ψ_a and ψ_b, in R_a and R_b, respectively, that satisfy the condition of
transition (Equation 8) in the overlap region are called a section by the
mathematicians. We see that around a monopole, the electron wave function is a
section and not an ordinary function. We will call these functions wave sections.
Different wave sections (which belong to different energies, for example)
clearly satisfy the same condition of transition (Equation 8) with the same q. Thus,
we need to develop the concept of a Hilbert space of sections. To develop this
concept, we define the scalar product of two sections ξ, η (for the same q) by

$$(\eta, \xi) = \int \eta^{*}\,\xi \; d^3r. \tag{10}$$

(The question of convergence at r = 0 and r = ∞ is ignored here.) Notice that in the
overlap

$$\eta_a^{*}\,\xi_a = \eta_b^{*}\,\xi_b,$$

so that Equation 10 is well defined.

It is clear that if ξ is a section, xξ is also a section, because

$$x\xi_a = S\,(x\xi_b).$$

Thus, x is an operator in the Hilbert space of sections. Similarly, we prove that the
components of (p − eA) are operators, but those of p are not. Furthermore, x and
p − eA are both Hermitian.

Following Fierz, we will now attempt to construct angular momentum
operators. Define

$$\mathbf{L} = \mathbf{r} \times (\mathbf{p} - e\mathbf{A}) - \frac{q\,\mathbf{r}}{r}.$$

It is clear that L_x, L_y, and L_z are Hermitian operators on the Hilbert space of
sections. The following commutation rules can be easily verified:

$$[L_x, x] = 0, \qquad [L_x, y] = iz, \qquad [L_x, z] = -iy,$$
$$[L_x, p_x - eA_x] = 0, \qquad [L_x, p_y - eA_y] = i(p_z - eA_z), \qquad
[L_x, p_z - eA_z] = -i(p_y - eA_y). \tag{13}$$

It follows from these commutation rules that

$$[L_x, L_y] = iL_z, \quad \text{etc.} \tag{14}$$

Equation 13, together with its consequence (Equation 14), shows that L_x, L_y, and
L_z are the angular momentum operators. We emphasize that neither the Hilbert
space nor these operators possess any singularities. (The singularities of A_a and A_b
are not real singularities, because they occur outside of R_a and R_b, respectively.)

MONOPOLE HARMONICS Y_{q,l,m}

Because [r², L] = 0, we can diagonalize r² and study operators L for fixed r².
That is, we will study sections of the form

w -6K
where ξ is a section dependent only on angular coordinates θ and φ. L operates,
then, on angular sections.
Equation 14 shows that [L², L_z] = 0. Simultaneous diagonalization produces
the familiar multiplets with eigenvalues l(l + 1) and m,

$$L^2\, Y_{q,l,m} = l(l+1)\, Y_{q,l,m}, \qquad L_z\, Y_{q,l,m} = m\, Y_{q,l,m}, \tag{15}$$

where l = 0, 1/2, 1, ..., and for each value of l, m ranges from −l to +l in integral
steps of increment. Y_{q,l,m} are eigensections, which are called (Reference 3) monopole
harmonics. The allowed values of l and m are

$$l = |q|,\ |q| + 1,\ |q| + 2,\ \ldots, \qquad m = -l,\ -l + 1,\ \ldots,\ l. \tag{16}$$

Each of these l, m combinations occurs exactly once. One can choose each Y
normalized, so that

$$\int_0^{\pi} \sin\theta \, d\theta \int_0^{2\pi} |Y_{q,l,m}|^2 \, d\phi = 1.$$
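
The lower bound l ≥ |q| in Equation 16 can be made plausible with a short computation. The sketch below assumes the definition L = r × (p − eA) − q r/r given above; the shorthand Λ = r × (p − eA) and r̂ = r/r is introduced here and is not notation from the original.

```latex
% Shorthand introduced here: \Lambda = r \times (p - eA), \hat{r} = r / r.
% Since \hat{r}\cdot\Lambda = \Lambda\cdot\hat{r} = 0, the cross terms drop out:
\begin{align*}
\hat{\mathbf{r}}\cdot\mathbf{L}
  &= \hat{\mathbf{r}}\cdot\boldsymbol{\Lambda} - q\,\hat{\mathbf{r}}\cdot\hat{\mathbf{r}} = -q, \\
\mathbf{L}^2
  &= \boldsymbol{\Lambda}^2 + q^2
  \;\Longrightarrow\; l(l+1) \ge q^2 .
\end{align*}
% With l restricted to 0, 1/2, 1, ..., the value l = |q| - 1/2 already fails,
% because (|q| - 1/2)(|q| + 1/2) = q^2 - 1/4 < q^2, and smaller l fail as well.
%
% Example: q = 1/2 gives l = 1/2, 3/2, 5/2, ..., and each multiplet contains
% 2l + 1 = 2, 4, 6, ... values of m.
```

This only bounds l from below; that l − |q| is an integer follows from the single-valuedness conditions discussed in the text.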

Different Y_{q,l,m} (for fixed q) are orthogonal, a fact one easily proves in the usual
way from Equation 15.
The explicit values of Y_{q,l,m} in terms of Jacobi polynomials were given in
Reference 3. They were obtained from Equation 15, in exactly the same way one
usually obtains the spherical harmonics Y_{l,m}. Indeed,

$$Y_{l,m} = Y_{0,l,m}.$$

The collection of Y_{q,l,m} for fixed q and values of l, m given by Equation 16 form
(Reference 3) a complete orthonormal set of angular sections.
Each (Y_{q,l,m})_a is analytic in R_a; so is (Y_{q,l,m})_b in R_b. Thus, all of the
discontinuities, cusps, and singularities in A and in ψ are removed in a very smooth way.

Remarks: (A) It is important to realize that the above-described way of using
(A_μ)_a and (A_μ)_b together to describe the magnetic field of a monopole has an
additional advantage: It gives the magnetic field H correctly everywhere. In older
papers, one often used a single A with a string of singularities. Because, by
definition,

$$\nabla\cdot(\nabla\times\mathbf{A}) = 0,$$

the magnetic field described by ∇ × A must have continuous flux lines. Thus, its
flux lines consist of the dotted lines of FIGURE 4, plus the bundle of lines described
by the solid line, so as to make the net flux at the origin zero. Thus, ∇ × A does
not correctly describe the magnetic field of the monopole, a point already
emphasized by Wentzel.
(B) For ordinary spherical harmonics, there are many important theorems, such
as the spherical harmonics addition theorem and the decomposition of products of
spherical harmonics by use of Clebsch-Gordan coefficients. These theorems can be
generalized to monopole harmonics (Reference 6).
(C) In the approximately 40 years since Dirac's first paper on monopoles, the
subject has been beset with difficulties due to singularities. Now that we have
removed the difficulty of string singularities through the introduction of the
concept of sections, it is revealed that there is yet another difficulty, which we will
call the Lipkin-Weisberger-Peshkin difficulty. This difficulty occurs in studying
the radial wave function of a Dirac electron around a monopole (TABLE 1). It can
be removed through the introduction of a small extra magnetic moment for the
Dirac electron.
(D) It is instructive to go back to the reasoning represented in FIGURE 1 and
attempt to repeat the steps for the combined (A_μ)_a, (A_μ)_b description of the magnetic
field. Choose the parallel to be the equator. Then,

$$\oint (A_\mu)_a \, dx^\mu = \Omega_\alpha, \qquad \oint (A_\mu)_b \, dx^\mu = \Omega_\beta .$$

FIGURE 4. Magnetic flux lines due to A. Because ∇ · (∇ × A) = 0, flux lines are
everywhere continuous. Therefore, there is return flux along the solid line.

TABLE 1
DIFFICULTIES AND METHODS OF SOLUTION FOR STUDYING THE MOTION OF A DIRAC ELECTRON
IN THE FIELD OF A MAGNETIC MONOPOLE

Angular Wave Function                 Radial Wave Function

Difficulty of string singularity,     Lipkin-Weisberger-Peshkin difficulty,
solved by introducing sections        solved by introducing extra magnetic moment

Thus,

$$\Omega_\alpha - \Omega_\beta = \oint \bigl[(A_\mu)_a - (A_\mu)_b\bigr]\, dx^\mu ,$$

which is, by Equation 6, equal to the increment of α around the equator, that is,
2g(2π) = 4πg.
We have arrived at an identity. I have provided this simple argument because it
is exactly the gist of the proof of the famous Gauss-Bonnet-Allendoerfer-Weil-
Chern theorem and the later Chern-Weil theorem, which play seminal roles in
contemporary mathematics.
In fact, gauge fields, of which electromagnetism is the simplest example, are
conceptually identical to some mathematical concepts in fiber bundle theory.
TABLE 2 gives translations for the terminologies used by physicists, on the one
hand, and mathematicians, on the other. We notice that, in particular, Dirac's
monopole quantization (Equation 9) is identical to the mathematical concept of
classification of U(1) bundles according to the first Chern class.
The last two entries of TABLE 2 identify electromagnetism without and with
magnetic monopoles with connections on trivial and nontrivial U(1) bundles,
respectively. Why is electromagnetism without monopoles trivial? We can gain some
understanding by looking at a paper loop and a Moebius strip (FIGURE 5). If they are
cut along the dotted lines, each would break into two pieces. Looking at the
resultant pieces, we cannot differentiate between the two. The paper loop and the
Moebius strip are different only in the way the resultant pieces are put together. For
the latter, a twist of one of the resultant pieces is necessary. The difference between
a trivial and a nontrivial bundle resides only in the processes of joining: for the
nontrivial bundle, a twist is needed in the joining process. In the case of
electromagnetism, the joining process is given by Equation 7 or 8. If there is no
monopole, S = 1, and the bundle is trivial. If there is a monopole, S ≠ 1, and the
bundle is nontrivial. (We may describe the nontrivial nature by saying that a twist
of phase is necessary.)
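
The "twist of phase" can be quantified as a winding number. The sketch below assumes the transition function S = exp(2iqφ) of Equation 8, evaluated around the equator; the symbol n for the winding number is introduced here for illustration.

```latex
% Winding number of the transition function S = \exp(2 i q \phi) around the
% equator (n is notation introduced here):
\begin{align*}
n &= \frac{1}{2\pi i}\oint S^{-1}\, dS
   = \frac{1}{2\pi i}\int_0^{2\pi} (2iq)\, d\phi
   = 2q .
\end{align*}
% n = 0: S can be taken equal to 1 and the bundle is trivial (no monopole).
% n \neq 0: the two patches are joined with n full turns of phase, and this
% integer is what labels the U(1) bundle in the classification mentioned above.
```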

EARLY HISTORY OF THE CONCEPT OF GAUGE FIELDS

Einstein's discovery of the relationship between gravitation and the geometry of
space-time stimulated work by many great geometers: Levi-Civita, Cartan, Weyl,
and others. In his book, Raum, Zeit und Materie (space, time and matter), Weyl

TABLE 2
TRANSLATION OF TERMINOLOGIES

Gauge Field Terminology               Bundle Terminology

Gauge (or global gauge)               principal coordinate bundle
Gauge type                            principal fiber bundle
Gauge potential b_μ^k                 connection on a principal fiber bundle
S (Equation 8)                        transition function
Phase factor Φ                        parallel displacement
Field strength f_μν^k                 curvature
Source (electric) J_μ^k               ?
Electromagnetism                      connection on a U(1) bundle
Isotopic spin gauge field             connection on a SU(2) bundle
Dirac's monopole quantization         classification of U(1) bundle according
                                      to first Chern class
Electromagnetism without monopole     connection on a trivial U(1) bundle
Electromagnetism with monopole        connection on a nontrivial U(1) bundle

FIGURE 5. Examples of trivial (left) and nontrivial (or Moebius strips, right)
fiber bundles.

attempted to unify gravity and electromagnetism through the use of the geometric
concept of a space-time-dependent scale change. The basic idea is summarized
below.

                 x^μ    →    x^μ + dx^μ

scale            1      →    1 + S_μ dx^μ

f                f      →    f + (∂f/∂x^μ) dx^μ

scale change     f      →    f + (∂/∂x^μ + S_μ) f dx^μ

In the summary above, the first line indicates how the scale changes in going from a
point x^μ to a neighboring point x^μ + dx^μ of space-time. The second line shows how
a function of space-time changes as a result of the change in argument from x^μ to
x^μ + dx^μ. Finally, if the scale change is applied to the function f, one obtains at
x^μ + dx^μ the product

$$\left[f + \frac{\partial f}{\partial x^\mu}\,dx^\mu\right]\left(1 + S_\mu\,dx^\mu\right).$$

Expanding to first order in the small displacement gives the last line in the
summary. The increment in f is, then,

$$\left(\frac{\partial}{\partial x^\mu} + S_\mu\right) f\,dx^\mu . \tag{18}$$


Weyl tried to incorporate electromagnetism into a geometric theory by
identifying the vector potential A_μ with a space-time-dependent S_μ, generating scale
changes as described. This attempt proved, however, unsuccessful.
In 1925, the concepts of quantum mechanics emerged. A key concept in
quantum mechanics is the replacement of the momentum p_μ in the classical
Hamiltonian by an operator:

$$p_\mu \rightarrow -i\hbar\,\frac{\partial}{\partial x^\mu}.$$

For a charged particle, the replacement is

$$p_\mu - \frac{e}{c}A_\mu \rightarrow -i\hbar\left[\frac{\partial}{\partial x^\mu}
  - i\,\frac{e}{\hbar c}A_\mu\right]. \tag{19}$$

In 1927, Fock (Reference 10) observed that one could base quantum electrodynamics on
this operator. London (Reference 11) pointed out the similarity of Fock's work to
Weyl's earlier work. Comparing Equations 18 and 19, Weyl's identification would be
correct if one makes the replacement

$$S_\mu \rightarrow -i\,\frac{e}{\hbar c}A_\mu .$$

In other words, instead of a scale change

$$\left(1 + S_\mu\,dx^\mu\right) = \exp\left(S_\mu\,dx^\mu\right),$$

one considers a phase change

$$\left[1 - i\,\frac{e}{\hbar c}A_\mu\,dx^\mu\right]
  = \exp\left[-i\,\frac{e}{\hbar c}A_\mu\,dx^\mu\right], \tag{20}$$

which can be thought of as an imaginary scale change. Weyl put all of these
expressions together (Reference 12) in a remarkable paper (which also first discussed
the two-component theory of a spin-1/2 particle) in which the transformation of the
electromagnetic potential

$$A_\mu \rightarrow A'_\mu = A_\mu + \partial_\mu \alpha \quad
  \text{(second-type transformation)}, \tag{21}$$

and the associated phase transformation

$$\psi \rightarrow \psi' = \psi\,\exp(ie\alpha/\hbar c) \quad
  \text{(first-type transformation)}, \tag{22}$$

of the wave function of a charged particle were explicitly discussed.

Although the phase change factor (Equation 20) is no longer a scale factor, Weyl

kept the earlier terminology*† that he used in 1918-20 and called both the
transformation (Equation 20) and the associated phase change of wave functions
gauge transformations.
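
That the two transformations belong together can be seen in one line. The sketch below assumes the second-type transformation A_μ → A_μ + ∂_μ α and the first-type transformation ψ → ψ exp(ieα/ħc) quoted above; the symbol D_μ for the bracket in Equation 19 is notation introduced here.

```latex
% D_\mu (introduced here) denotes the bracket in Equation 19:
%   D_\mu = \partial_\mu - i (e/\hbar c) A_\mu .
\begin{align*}
D'_\mu \psi'
  &= \Bigl[\partial_\mu - i\tfrac{e}{\hbar c}\,(A_\mu + \partial_\mu\alpha)\Bigr]
     \bigl(e^{ie\alpha/\hbar c}\,\psi\bigr) \\
  &= e^{ie\alpha/\hbar c}
     \Bigl[\partial_\mu + i\tfrac{e}{\hbar c}\,\partial_\mu\alpha
           - i\tfrac{e}{\hbar c}\,A_\mu - i\tfrac{e}{\hbar c}\,\partial_\mu\alpha\Bigr]\psi \\
  &= e^{ie\alpha/\hbar c}\, D_\mu \psi .
\end{align*}
% The \partial_\mu\alpha terms cancel, so D_\mu\psi picks up the same phase as
% \psi and equations built from D_\mu keep their form.
```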

Generalization: With the discovery of many new particles after World War II,
physicists explored various couplings between the elementary particles. Many
possible couplings can be written down, and the desire to find a principle to choose
among the many possibilities was one of the motivations for an attempt to
generalize Weyl's gauge principle for electromagnetism. The point here is that for
electromagnetism, the gauge principle determines, all at once, the way in which any
particle of charge qe, a conserved quantity, serves as a source of the electromagnetic
field. Because the isotopic spin I is also conserved, a natural question was, Does
there exist a generalized gauge principle that determines the way in which I serves as
the source of a new field?
Another motivation for an attempt at generalization is the observation that the
conservation of I implies that the proton and the neutron are similar. Which to call
a proton or, indeed, which superposition of the two to call a proton, is a convention
that one can select arbitrarily (if the electromagnetic interaction is switched off). If
one requires this freedom of choice to be independent for observers at different
space-time points, that is, if one requires localized freedom of choice, one is led to a
generalization of the gauge principle.
These two motivations were, of course, intertwined and led quite naturally to the
formulation of non-Abelian gauge fields.
A third approach to a generalized gauge principle came later and is the
integral formalism of gauge fields. It starts from the observation that the gauge
principle of Weyl deals with a phase factor (Equation 20) between two neighboring
points. Along a path from space-time point A to space-time point B, the resultant
phase factor is

$$\exp\left[-i\,\frac{e}{\hbar c}\int_A^B A_\mu\,dx^\mu\right],$$

which is path dependent, that is, nonintegrable. (Dirac had already discussed, in
1931, non-integrable phases for wave functions.) If one analyzes the meaning of
electromagnetism in quantum mechanics, especially through a discussion of the
Bohm-Aharonov experiment (Reference 20),‡ one reaches the conclusion that
electromagnetism is the gauge invariant manifestation of a non-integrable phase factor.
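
The path dependence can be made concrete by comparing two paths with the same end points. The sketch below assumes the phase factor exp[−i(e/ħc)∫A_μ dx^μ] built from Equation 20 and a purely magnetic configuration; the path labels C_1, C_2 and the enclosed flux Φ_B are notation introduced here.

```latex
% Two paths C_1 and C_2 from A to B (labels introduced here). The ratio of
% their phase factors is the phase factor of the closed loop C = C_1 - C_2:
\begin{align*}
\frac{\exp\bigl[-i\tfrac{e}{\hbar c}\int_{C_1} A_\mu\,dx^\mu\bigr]}
     {\exp\bigl[-i\tfrac{e}{\hbar c}\int_{C_2} A_\mu\,dx^\mu\bigr]}
  &= \exp\Bigl[-i\tfrac{e}{\hbar c}\oint_{C} A_\mu\,dx^\mu\Bigr].
\end{align*}
% For a static magnetic field and a spatial loop, Stokes' theorem relates the
% loop integral to the enclosed flux \Phi_B (up to the sign convention for
% A_\mu dx^\mu), so the closed-loop factor is \exp[\mp i e \Phi_B / \hbar c].
% It equals 1 only when \Phi_B is an integer multiple of 2\pi\hbar c / e, and
% it is unchanged by A_\mu \to A_\mu + \partial_\mu\alpha with single-valued \alpha.
```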
Once this conclusion is reached, a natural generalization is to replace a

* The idea of scale invariance, discussed in Reference 9, was developed earlier, in 1918-19,
in three papers by Weyl (submitted on May 2 and June 8, 1918 and on January 7, 1919). In the
first two of them, he used the term Masstab Invarianz (see Reference 14); in the third paper,
he settled on the term Eich Invarianz.
The English translation of Eich Invarianz was "calibration invariance" in Henry Brose's
1921 translation of the fourth edition of Weyl's book Space, Time and Matter (republished
by Dover). The translation "gauge invariance" was not used, I suspect, until after Weyl's
1929 article.¹² It appeared (probably not for the first time) in Dirac's article of 1931.
† The transformation (Equation 21) that leaves field strengths unchanged must have been
known in the nineteenth century. It did not, however, seem to have a specific name. In the
many editions of Föppl-Abraham-Becker-Sauter on electricity and magnetism, which started
in 1894, Eich or gauge was not used until the 1964 English translation Electromagnetic
Fields and Interactions, in which the term "Lorentz gauge" was inserted in a footnote.
‡ The experiment was performed by Chambers.


nonintegrable phase factor by a nonintegrable element of a Lie group. One
thus obtains naturally an integral formalism of gauge fields.
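
The replacement just described can be written out as a short LaTeX sketch, placing the Abelian phase factor next to its Lie-group generalization. The path-ordering symbol $\mathcal{P}$, the matrix-valued potential $b_\mu^k X_k$ with generators $X_k$, the group elements $\xi(x)$, and all sign and normalization conventions are notation assumed here for illustration; they may differ from the conventions of the paper.

```latex
% Abelian case: phase factor along a path from A to B
\Phi_{BA} = \exp\!\left(\frac{ie}{\hbar c}\int_A^B A_\mu\,dx^\mu\right)

% Two paths with the same end points differ by a closed-loop factor.
% By Stokes' theorem it depends only on the field strength through a
% surface \Sigma bounded by the loop:
\exp\!\left(\frac{ie}{\hbar c}\oint A_\mu\,dx^\mu\right)
  = \exp\!\left(\frac{ie}{2\hbar c}\int_\Sigma F_{\mu\nu}\,dx^\mu\wedge dx^\nu\right)

% Non-Abelian case (assumed notation): a path-ordered element of the Lie group,
% with the coupling constant absorbed in b_\mu^k
\Phi_{BA} = \mathcal{P}\exp\!\left(\int_A^B b_\mu^k X_k\,dx^\mu\right),
\qquad
\Phi_{BA} \;\to\; \xi(B)\,\Phi_{BA}\,\xi(A)^{-1}
```

In the Abelian case the loop factor is unchanged by a gauge transformation, which is one way to read the statement that electromagnetism is the gauge-invariant manifestation of a non-integrable phase factor.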
We illustrate in FIGURE 6 the three approaches to the general concept of gauge
fields. The three approaches are, of course, deeply interrelated, because phases,
symmetry, and conservation laws are themselves related.
It is my opinion that, conceptually, the integral formalism of gauge fields is to
be preferred to the earlier differential approach. The integral formalism has more
structure and more meaning. It brings to the fore problems of global topology not
easily formulated in terms of the differential approach. For example, in our earlier
discussion of the field around the magnetic monopole, we did not introduce the
concept of nonintegrable phase factors. We did not run into any conceptual dif-
ficulties, only because we had not raised such questions as a rotation of the coor-
dinate axes. As soon as such questions are raised, it becomes apparent that the
integral formalism is superior, because it specifies that intrinsic meaning is
unrelated to the choice of coordinate axes and of regions R_a and R_b.
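
A minimal worked example of this point, for a monopole of strength $g$ at the origin, is sketched below. It assumes the standard two-region choice in which R_a excludes the negative z axis and R_b excludes the positive z axis; the explicit potentials, the symbol $S$, and the sign and unit conventions are assumptions made for illustration and may differ from those used earlier in the paper.

```latex
% Vector potentials in the two regions (spherical coordinates r, \theta, \phi;
% only the \phi components are nonzero)
(A_\phi)_a = \frac{g}{r\sin\theta}\,(1-\cos\theta), \qquad
(A_\phi)_b = -\frac{g}{r\sin\theta}\,(1+\cos\theta)

% In the overlap the two differ by a gradient, i.e. by a gauge transformation
(A_\phi)_a - (A_\phi)_b = \frac{2g}{r\sin\theta}
  = \frac{1}{r\sin\theta}\,\frac{\partial}{\partial\phi}\,(2g\phi)

% Phase factor relating the wave functions in the overlap
\psi_a = S\,\psi_b, \qquad S = \exp\!\left(\frac{2ieg}{\hbar c}\,\phi\right)

% S must be single valued under \phi \to \phi + 2\pi, which gives Dirac's quantization
\frac{2eg}{\hbar c} = \text{integer}
```

Neither potential is defined everywhere, and each depends on the choice of axes and regions, but the phase factors built from them do not; this is the sense in which the integral formalism carries the intrinsic meaning.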
Differential formalism, however, is used in computing. (The relationship
between differential and integral formalisms is quite similar to that between Lie
algebras and Lie groups.) In fact, a gauge-Riemannian calculus has been
developed.2
Electromagnetism is, as we have seen, a gauge field. That gravitation is a gauge
field is universally accepted, although exactly how it is a gauge field is a matter still
to be clarified. Whether weak and strong interactions are also due to gauge
fields is a matter that has been intensively studied in recent years,²³ together with
the question of the renormalizability of non-Abelian gauge fields.²⁴ If one may
borrow a term used by the biologists, one would say that there is gradually forming
a dogma that all interactions are due to gauge fields. Because of the
mathematical difficulties involved in the solution of quantized gauge fields,

FIGURE 6. Three motivations that led to the concept of gauge fields. (Labels legible in the
diagram: nonintegrable phase, integral formalism, gauge, conserved qua[...], source of field.)

§ Abers and Lee²⁴ also contains a review of earlier works of R. P. Feynman, L. D. Faddeev,
V. N. Popov, and M. T. Veltman.

however, I believe it will be a long time before the question can be definitively
answered as to exactly how strong and weak interactions are due to gauge fields.
Reflecting on how the concepts basic to gauge fields were formulated by
physicists, we see that at every step, the development was tied to the problem of the
conceptual description of the physical world. First, Maxwell's equations originated
with the four fundamental experimental laws of electricity and magnetism and with
Faraday's introduction of the concepts of field and flux. Maxwell's equations and
the principles of quantum mechanics led to the idea of gauge invariance. Attempts
to generalize this idea, motivated by physical concepts of phases, symmetry, and
conservation laws, led to the theory of non-Abelian gauge fields. That non-Abelian
gauge fields are conceptually identical to ideas in the beautiful theory of fiber
bundles, developed by mathematicians without reference to the physical world, was
a great marvel to me. In 1975, I discussed my feelings with Chern, and said, "This
is both thrilling and puzzling, since you mathematicians dreamed up these concepts
out of nowhere." He immediately protested, "No, no, these concepts were not
dreamed up. They were natural and real."

REFERENCES

1. DIRAC, P. A. M. 1931. Proc. Roy. Soc. A133: 60.
2. WU, T. T. & C. N. YANG. 1975. Phys. Rev. D 12: 3845.
3. WU, T. T. & C. N. YANG. 1976. Nucl. Phys. B 107: 365.
4. FIERZ, M. 1944. Helv. Phys. Acta 17: 27.
5. WENTZEL, G. 1966. Progr. Theor. Phys. Suppl. 37-38: 163.
6. WU, T. T. & C. N. YANG. 1977. To be published.
7. LIPKIN, H. J., W. I. WEISBERGER & M. PESHKIN. 1969. Ann. Phys. 53: 203.
8. KAZAMA, Y., C. N. YANG & A. S. GOLDHABER. 1977. Phys. Rev. D. In press.
9. WEYL, H. 1920. Raum, Zeit und Materie. 3rd edit. Springer Verlag. Berlin-Heidelberg. New York.
10. FOCK, V. 1927. Z. Phys. 39: 226.
11. LONDON, F. 1927. Z. Phys. 42: 375.
12. WEYL, H. 1929. Z. Phys. 56: 330.
13. PAULI, W. 1933. Handbuch der Physik. 2nd edit. Vol. 24(1): 83. Geiger and Scheel;
    PAULI, W. 1941. Rev. Mod. Phys. 13: 203.
14. WEYL, H. 1918. Sitzber. Preuss. Akad. Wiss.: 465; WEYL, H. 1918. Math. Z. 2: 384;
    WEYL, H. 1919. Ann. Phys. 59: 101.
15. WEYL, H. 1921. Space, Time and Matter. Dover Publications, Inc. New York, N.Y.
16. 1964. Electromagnetic Fields and Interactions. Blaisdell Publishing Co. Waltham, Mass.
17. YANG, C. N. & R. MILLS. 1954. Phys. Rev. 95: 631.
18. YANG, C. N. & R. MILLS. 1954. Phys. Rev. 96: 191.
19. YANG, C. N. 1974. Phys. Rev. Lett. 33: 445.
20. AHARONOV, Y. & D. BOHM. 1959. Phys. Rev. 115: 485; CHAMBERS, R. G. 1960.
    Phys. Rev. Lett. 5: 3.
21. YANG, C. N. 1975. Proc. Sixth Hawaii Topical Conf. Particle Phys.
22. UTIYAMA, R. 1956. Phys. Rev. 101: 1957.
23. WEINBERG, S. 1967. Phys. Rev. Lett. 19: 1264; SALAM, A. 1968. In Elementary Particle
    Theory. N. Svartholm, Ed. Almquist and Forlag. Stockholm, Sweden.
24. 'T HOOFT, G. 1971. Nucl. Phys. B 35: 167; ABERS, E. S. & B. W. LEE. 1973. Phys. Rep.
    9C: 1.

Gauge theory: Historical origins and some modern developments


Lochlainn O'Raifeartaigh
Dublin Institute for Advanced Studies, Dublin 4, Ireland

Norbert Straumann
Institut für Theoretische Physik der Universität Zürich-Irchel, Zürich, Switzerland
One of the major developments of twentieth-century physics has been the gradual recognition that a
common feature of the known fundamental interactions is their gauge structure. In this article the
authors review the early history of gauge theory, from Einstein's theory of gravitation to the
appearance of non-Abelian gauge theories in the fifties. The authors also review the early history of
dimensional reduction, which played an important role in the development of gauge theory. A
description is given of how, in recent times, the ideas of gauge theory and dimensional reduction have
emerged naturally in the context of string theory and noncommutative geometry.

CONTENTS

I. Introduction
II. Weyl's Attempt to Unify Gravitation and Electromagnetism
   A. Weyl's generalization of Riemannian geometry
   B. Electromagnetism and gravitation
   C. Einstein's objection and reactions of other physicists
III. Weyl's 1929 Classic: "Electron and Gravitation"
   A. Tetrad formalism
   B. The new form of the gauge principle
IV. The Early Work of Kaluza and Klein
V. Klein's 1938 Theory
VI. The Pauli Letters to Pais
VII. Yang-Mills Theory
VIII. Recent Developments
   A. Gauge theory and strings
      1. Introduction
      2. Gauge properties of open bosonic strings
      3. Gravitational properties of closed bosonic strings
      4. The presence of matter
      5. Fermionic and heterotic strings: supergravity and non-Abelian gauge theory
      6. The internal symmetry group G
      7. Dimensional reduction and the heterotic symmetry group E8 × E8
   B. Gauge theory and noncommutative geometry
      1. Simple example
      2. Application to the standard model
         a. The Kaluza-Klein mechanism
         b. The noncommutative mechanism
Acknowledgments
References

I. INTRODUCTION

It took decades until physicists understood that all known fundamental interactions can be
described in terms of gauge theories. Our historical account begins with Einstein's general
theory of relativity, which is a non-Abelian gauge theory of a special type (see Secs. III
and VII). That other gauge theories emerged, in a slow and complicated process, gradually from
general relativity, and that their common geometrical structure is best expressed in terms of
connections of fiber bundles, is now widely recognized. Thus H. Weyl was right when he wrote in
the preface to the first edition of Space, Time, Matter (Raum, Zeit, Materie) early in 1918:
"Wider expanses and greater depths are now exposed to the searching eye of knowledge, regions
of which we had not even a presentiment. It has brought us much nearer to grasping the plan
that underlies all physical happening" (Weyl, 1922).

It was Weyl himself who in 1918 made the first attempt to extend general relativity in order
to describe gravitation and electromagnetism within a unifying geometrical framework (Weyl,
1918). This brilliant proposal contains the germs of all mathematical aspects of a non-Abelian
gauge theory, as we shall make clear in Sec. II. The words gauge (Eich-) transformation and
gauge invariance appeared for the first time in this paper, but in the everyday meaning of
change of length or change of calibration.¹

Einstein admired Weyl's theory as "a coup of genius of the first rate . . . ," but immediately
realized that it was physically untenable: "Although your idea is so beautiful, I have to
declare frankly that, in my opinion, it is impossible that the theory corresponds to Nature."
This led to an intense exchange of letters between Einstein (in Berlin) and Weyl [at the
Eidgenössische Technische Hochschule (ETH) in Zürich], part of which has now been published in
Vol. 8 of The Collected Papers of Albert Einstein (1987). [The article of Straumann (1987)
gives an account of this correspondence, which is preserved in the Archives of the ETH.] No
agreement was reached, but Einstein's intuition proved to be right.

Although Weyl's attempt was a failure as a physical theory, it paved the way for the correct
understanding of gauge invariance. Weyl himself reinterpreted his original theory after the
advent of quantum theory in a seminal paper (Weyl, 1929), which we shall discuss at length in
Sec. III. Parallel developments by other workers and interconnections are indicated in Fig. 1.

¹The German word eichen probably comes from the Latin aequare, i.e., equalizing the length to
a standard one.

Reviews of Modern Physics, Vol. 72, No. 1, January 2000. ©2000 The American Physical Society

FIG. 1. Key papers in the development of gauge theories. (Labels legible in the diagram
include: V-A, Feynman and Gell-Mann, others; Schwinger, Glashow, Salam, Weinberg, ...)

At the time Weyl's contributions to theoretical physics were not appreciated very much, since
they did not really add new physics. The attitude of the leading theoreticians was expressed
with familiar bluntness in a letter by Pauli to Weyl of July 1, 1929, after he had seen a
preliminary account of Weyl's work:

    Before me lies the April edition of the Proc. Nat. Acad. (US). Not only does it contain
    an article from you under "Physics" but shows that you are now in a "Physical
    Laboratory": from what I hear you have even been given a chair in "Physics" in America.
    I admire your courage; since the conclusion is inevitable that you wish to be judged, not
    for success in pure mathematics, but for your true but unhappy love for physics.
    (Translated from Pauli, 1979.)

Weyl's reinterpretation of his earlier speculative proposal had actually been suggested before
by London and Fock, but it was Weyl who emphasized the role of gauge invariance as a symmetry
principle from which electromagnetism can be derived. It took several decades until the
importance of this symmetry principle, in its generalized form to non-Abelian gauge groups
developed by Yang, Mills, and others, also became fruitful for a description of the weak and
strong interactions. The mathematics of the non-Abelian generalization of Weyl's 1929 paper
would have been an easy task for a mathematician of his rank, but at the time there was no
motivation for this from the physics side. The known properties of the weak and strong nuclear
interactions, in particular their short-range behavior, did not point to a gauge-theoretical
description. We all know that the gauge symmetries of the standard model are very hidden, and
it is therefore not astonishing that progress was very slow indeed.

In this paper we present only the history up to the invention of Yang-Mills theory in 1954.
The independent discovery of this theory by other authors has already been described
(O'Raifeartaigh, 1997). Later history covering the application of the Yang-Mills theory to the
electroweak and strong interactions is beyond our scope. The main features of these
applications are well known and are covered in contemporary textbooks. One modern development
that we do wish to mention, however, is the emergence of both gauge theory and dimensional
reduction in two fields other than traditional quantum field theory, namely, string theory and
noncommutative geometry, as their emergence in these fields is a natural extension of the
early history. Indeed in string theory both gauge invariance and dimensional reduction occur
in such a natural way that it is probably not an exaggeration to say that, had they not been
found earlier, they would have been discovered in this context. The case of noncommutative
geometry is a little different, as the gauge principle is used as an input, but the change
from a continuum to a discrete structure produces qualitatively new features. Amongst these is
an interpretation of the Higgs field as a gauge potential and the emergence of a dimensional
reduction that avoids the usual embarrassment concerning the fate of the extra dimensions.

A fuller account of the early history of gauge theory is given by O'Raifeartaigh (1997). There
one can also find English translations of the most important papers of the early period, as
well as Pauli's letters to Pais on non-Abelian Kaluza-Klein reductions. These works underlie
the diagram in Fig. 1.

II. WEYL'S ATTEMPT TO UNIFY GRAVITATION AND ELECTROMAGNETISM

On the 1st of March 1918 Weyl writes in a letter to Einstein:

    These days I succeeded, as I believe, to derive electricity and gravitation from a
    common source. . . .

Einstein's prompt reaction by postcard indicates already a physical objection, which he
explained in detail shortly afterwards. Before we come to this we have to describe Weyl's
theory of 1918.

A. Weyl's generalization of Riemannian geometry

Weyl's starting point was purely mathematical. He felt a certain uneasiness about Riemannian
geometry, as is clearly expressed by the following sentences early in his paper:

    But in Riemannian geometry described above there is contained a last element of geometry
    at a distance (ferngeometrisches Element), with no good reason,


    as far as I can see; it is due only to the accidental development of Riemannian geometry
    from Euclidean geometry. The metric allows the two magnitudes of two vectors to be
    compared, not only at the same point, but at any arbitrarily separated points. A true
    infinitesimal geometry should, however, recognize only a principle for transferring the
    magnitude of a vector to an infinitesimally close point and then, on transfer to an
    arbitrary distant point, the integrability of the magnitude of a vector is no more to be
    expected than the integrability of its direction.

After these remarks Weyl turns to physical speculation and continues as follows:

    On the removal of this inconsistency there appears a geometry that, surprisingly, when
    applied to the world, explains not only the gravitational phenomena but also the
    electrical. According to the resultant theory both spring from the same source, indeed in
    general one cannot separate gravitation and electromagnetism in a unique manner. In this
    theory all physical quantities have a world geometrical meaning; the action appears from
    the beginning as a pure number. It leads to an essentially unique universal law; it even
    allows us to understand in a certain sense why the world is four dimensional.

In brief, Weyl's geometry can be described as follows (see also Audretsch, Gähler, and
Straumann, 1984). First, the space-time manifold $M$ is equipped with a conformal structure,
i.e., with a class $[g]$ of conformally equivalent Lorentz metrics $g$ (and not a definite
metric as in general relativity). This corresponds to the requirement that it should only be
possible to compare lengths at one and the same world point. Second, it is assumed, as in
Riemannian geometry, that there is an affine (linear) torsion-free connection which defines a
covariant derivative $\nabla$ and respects the conformal structure. Differentially this means
that for any $g \in [g]$ the covariant derivative $\nabla g$ should be proportional to $g$:

$$\nabla g = -2A \otimes g \quad (\nabla_\lambda g_{\mu\nu} = -2A_\lambda g_{\mu\nu}), \tag{1}$$

where $A = A_\mu\,dx^\mu$ is a differential 1-form.

Consider now a curve $\gamma:[0,1] \to M$ and a parallel-transported vector field $X$ along
$\gamma$. If $l$ is the length of $X$, measured with a representative $g \in [g]$, we obtain
from Eq. (1) the following relation between $l(p)$ for the initial point $p = \gamma(0)$ and
$l(q)$ for the end point $q = \gamma(1)$:

$$l(q) = \exp\!\left(-\int_\gamma A\right) l(p). \tag{2}$$

Thus the ratio of lengths in $q$ and $p$ (measured with $g \in [g]$) depends in general on the
connecting path $\gamma$ (see Fig. 2). The length is only independent of $\gamma$ if the curl
of $A$,

$$F = dA \quad (F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu), \tag{3}$$

vanishes.

FIG. 2. Path dependence of parallel displacement and transport of length in Weyl space.

The compatibility requirement (1) leads to the following expression for the Christoffel
symbols in Weyl's geometry:

$$\Gamma^\mu_{\nu\lambda} = \tfrac{1}{2} g^{\mu\sigma}\left(g_{\lambda\sigma,\nu} + g_{\sigma\nu,\lambda} - g_{\nu\lambda,\sigma}\right) + g^{\mu\sigma}\left(g_{\lambda\sigma}A_\nu + g_{\sigma\nu}A_\lambda - g_{\nu\lambda}A_\sigma\right). \tag{4}$$

The second $A$-dependent term is a characteristic new piece in Weyl's geometry, which has to be
added to the Christoffel symbols of Riemannian geometry.

Until now we have chosen a fixed, but arbitrary, metric in the conformal class $[g]$. This
corresponds to a choice of calibration (or gauge). Passing to another calibration with metric
$\bar g$, related to $g$ by

$$\bar g = e^{2\lambda} g, \tag{5}$$

we find that the potential $A$ in Eq. (1) will also change to $\bar A$, say. Since the
covariant derivative has an absolute meaning, $\bar A$ can easily be worked out: On the one
hand we have, by definition,

$$\nabla \bar g = -2\bar A \otimes \bar g, \tag{6}$$

and on the other hand we find for the left side with Eq. (1)

$$\nabla \bar g = \nabla(e^{2\lambda} g) = 2\,d\lambda \otimes \bar g + e^{2\lambda}\,\nabla g = 2\,d\lambda \otimes \bar g - 2A \otimes \bar g. \tag{7}$$

Thus

$$\bar A = A - d\lambda \quad (\bar A_\mu = A_\mu - \partial_\mu\lambda). \tag{8}$$

This shows that a change of calibration of the metric induces a gauge transformation for $A$:

$$g \to e^{2\lambda} g, \qquad A \to A - d\lambda. \tag{9}$$

Only gauge classes have an absolute meaning. [The Weyl connection is, however, gauge
invariant. This is conceptually clear, but can also be verified by direct calculation from
Eq. (4).]

B. Electromagnetism and gravitation

Turning to physics, Weyl assumes that his "purely infinitesimal geometry" describes the
structure of space-time and consequently he requires that physical laws satisfy a double
invariance: (1) They must be invariant with respect to arbitrary smooth coordinate
transformations;


(2) They must be gauge invariant, i.e., invariant with respect to the substitutions of Eq. (9)
for an arbitrary smooth function $\lambda$.

Nothing is more natural to Weyl than identifying $A_\mu$ with the vector potential and
$F_{\mu\nu}$ in Eq. (3) with the field strength of electromagnetism. In the absence of
electromagnetic fields ($F_{\mu\nu} = 0$) the scale factor, $\exp(-\int_\gamma A)$ in Eq. (2),
for length transport becomes path independent (integrable) and one can find a gauge such that
$A_\mu$ vanishes for simply connected space-time regions. In this special case, it is the same
situation as in general relativity.

Weyl proceeds to find an action that is generally invariant as well as gauge invariant and
that would give the coupled field equations for $g$ and $A$. We do not want to enter into
this, except for the following remark. In his first paper Weyl (1918) proposes what we now
call the Yang-Mills action:

$$S(g,A) = -\tfrac{1}{4}\int \mathrm{Tr}(\Omega \wedge {*\Omega}). \tag{10}$$

Here $\Omega$ denotes the curvature form and $*\Omega$ its Hodge dual.² Note that the latter
is gauge invariant, i.e., independent of the choice of $g \in [g]$. In Weyl's geometry the
curvature form splits as $\Omega = \hat\Omega + F$, where $\hat\Omega$ is the metric piece
(Audretsch, Gähler, and Straumann, 1984). Correspondingly, the action also splits,

$$\mathrm{Tr}(\Omega \wedge {*\Omega}) = \mathrm{Tr}(\hat\Omega \wedge {*\hat\Omega}) + F \wedge {*F}. \tag{11}$$

The second term is just the Maxwell action. Weyl's theory thus contains formally all aspects
of a non-Abelian gauge theory.

Weyl emphasizes, of course, that the Einstein-Hilbert action is not gauge invariant. Later
work by Pauli (1919) and by Weyl himself (1918, 1922) soon led to the conclusion that the
action of Eq. (10) could not be the correct one, and other possibilities were investigated
(see the later editions of Space, Time, Matter).

Independent of the precise form of the action, Weyl shows that in his theory gauge invariance
implies the conservation of electric charge in much the same way as general coordinate
invariance leads to the conservation of energy and momentum.³ This beautiful connection
pleased him particularly: ". . . [it] seems to me to be the strongest general argument in
favour of the present theory, insofar as it is permissible to talk of justification in the
context of pure speculation." The invariance principles imply five "Bianchi-type" identities.
Correspondingly, the five conservation laws follow in two independent ways from the coupled
field equations and may be termed the eliminants of the latter. These structural connections
hold also in modern gauge theories.

C. Einstein's objection and reactions of other physicists

After this sketch of Weyl's theory we come to Einstein's striking counterargument, which he
first communicated to Weyl by postcard (see Fig. 3). The problem is that if the idea of a
nonintegrable length connection (scale factor) is correct, then the behavior of clocks would
depend on their history. Consider two identical atomic clocks in adjacent world points and
bring them along different world trajectories which meet again in adjacent world points.
According to Eq. (2) their frequencies would then generally differ. This is in clear
contradiction with empirical evidence, in particular with the existence of stable atomic
spectra. Einstein therefore concludes (see Straumann, 1987):

    . . . (if) one drops the connection of the ds to the measurement of distance and time,
    then relativity loses all its empirical basis.

Nernst shared Einstein's objection and demanded on behalf of the Berlin Academy that it be
printed in a short amendment to Weyl's article. Weyl had to accept this. One of us has
described elsewhere (Straumann, 1987; see also Vol. 8 of Einstein, 1987) the intense and
instructive subsequent correspondence between Weyl and Einstein. As an example, let us quote
from one of the last letters of Weyl to Einstein:

    This [insistence] irritates me of course, because experience has proven that one can rely
    on your intuition; so unconvincing as your counterarguments seem to me, as I have to
    admit. . .
    By the way, you should not believe that I was driven to introduce the linear differential
    form in addition to the quadratic one by physical reasons. I wanted, just to the
    contrary, to get rid of this "methodological inconsistency (Inkonsequenz)" which has been
    a bone of contention to me already much earlier. And then, to my surprise, I realized
    that it looked as if it might explain electricity. You clap your hands above your head
    and shout: But physics is not made this way! (Weyl to Einstein 10 December 1918).

Weyl's reply to Einstein's criticism was, generally speaking, this: The real behavior of
measuring rods and clocks (atoms and atomic systems) in arbitrary electromagnetic and
gravitational fields can be deduced only from a dynamical theory of matter.

Not all leading physicists reacted negatively. Einstein transmitted a very positive first
reaction by Planck, and Sommerfeld wrote enthusiastically to Weyl that there was ". . . hardly
doubt, that you are on the correct path and not on the wrong one."

In his encyclopedia article on relativity Pauli (1921) gave a lucid and precise presentation
of Weyl's theory, but commented on Weyl's point of view very critically. At the end he states:

²The integrand in Eq. (10) is indeed just the expression
$R_{\alpha\beta\gamma\delta}R^{\alpha\beta\gamma\delta}\sqrt{-g}\,dx^0 \wedge \cdots \wedge dx^3$
in local coordinates which is used by Weyl ($R_{\alpha\beta\gamma\delta}$ = the curvature
tensor of the Weyl connection).
³We adopt here the somewhat naive interpretation of energy-momentum conservation for generally
invariant theories of the older literature.


FIG. 3. Postcard from Einstein to Weyl, 15 April 1918. From Archives of Eidgenössische
Technische Hochschule, Zürich.

    . . . In summary one may say that Weyl's theory has not yet contributed to getting closer
    to the solution of the problem of matter.

Eddington's reaction was at first very positive but he soon changed his mind and denied the
physical relevance of Weyl's geometry.

The situation was later appropriately summarized by London (1927) as follows:

    In the face of such elementary experimental evidence, it must have been an unusually
    strong metaphysical conviction that prevented Weyl from abandoning the idea that Nature
    would have to make use of the beautiful geometrical possibility that was offered. He
    stuck to his conviction and evaded discussion of the above-mentioned contradictions
    through a rather unclear re-interpretation of the concept of "real state," which,
    however, robbed his theory of its immediate physical meaning and attraction.

In this remarkable paper, London suggested a reinterpretation of Weyl's principle of gauge
invariance within the new quantum mechanics: The role of the metric is taken over by the wave
function, and the rescaling of the metric has to be replaced by a phase change of the wave
function.

In this context an astonishing early paper by Schrödinger (1922) has to be mentioned, which
also used Weyl's "world geometry" and is related to Schrödinger's later invention of wave
mechanics. This precursor relation was discovered by Raman and Forman (1969). [See also the
discussion by C. N. Yang in Schrödinger (1987).]

Simultaneously with London, Fock (1927) arrived along a completely different line at the
principle of gauge invariance in the framework of wave mechanics. His approach was similar to
that of Klein, which will be discussed in detail (in Sec. IV).

The contributions of Schrödinger (1922), London (1927), and Fock (1927) are discussed in the
book of O'Raifeartaigh (1997), where English translations of the original papers can also be
found. Here, we concentrate on Weyl's seminal paper "Electron and Gravitation."

III. WEYL'S 1929 CLASSIC: "ELECTRON AND GRAVITATION"

Shortly before his death late in 1955, Weyl wrote for his Selecta (Weyl, 1956) a postscript to
his early attempt in 1918 to construct a unified field theory. There he expressed his deep
attachment to the gauge idea and adds (p. 192):

    Later the quantum-theory introduced the Schrödinger-Dirac potential $\psi$ of the
    electron-positron field; it carried with it an experimentally-based principle of
    gauge-invariance which guaranteed the conservation of charge, and connected the $\psi$
    with the electromagnetic potentials $\phi_i$ in the same way that my speculative theory
    had connected the gravitational potentials $g_{ik}$ with the $\phi_i$, and measured the
    $\phi_i$ in known atomic, rather than unknown cosmological units. I have no doubt but
    that the correct context for the principle of gauge-invariance is here and not, as I
    believed in 1918, in the intertwining of electromagnetism and gravity.

This reinterpretation was developed by Weyl in one of the great papers of this century (Weyl,
1929). Weyl's


classic not only gives a very clear formulation of the with observation, is that the exponent of the factor
gauge principle, but contains, in addition, several other multiplying $ is not real but purely imaginary. 9
important concepts and results-in particular his two- now plays the role that Einsteins ds played before,
component spinor theory. The richness and scope of the It seems to me that this new principle of gauge-
paper is clearly visible from the following table of con- invariance, which follows not from speculation but
tents: from experiment, tells us that the electromagnetic
Introduction. Relationship of General Relativity to field is a necessary accompanying phenomcnon,
the quantum-theoretical field equations of the not of gravitation, but ol the material wave-field
spinning electron: mass, gauge-invariance, distant- rcpresented by $. Since gauge-invariance involves
parallelism. Expected modifications of the Dirac an arbitrary function A it has the character oE gen-
theory. - I . Two-cornponcnt theory: the wave eral relativity and can naturally only be under-
function $has only two componcnts. -51. Connec- stood in that context.
tion between the transformation of the t) and the We shall soon enter into Weyls justification, which is,
transformation of a normal tetrad in four-
not surprisingly, strongly associated with genera1 relativ-
dimensional space. Asymmetry of past and future,
ity. Before this we have to describe his incorporation of
of left and right. -52. In General Relativity the
metric at a given point is determined by a normal the Dirac theory into general relativity, which he
tetrad. Components of vectors relative to the tet- achieved with the help of the tetrad formalism.
rad and coordinates. Covariant differentiation of One of the reasons for adapting the Dirac theory of
4. -§3. Generally invariant form of the Dirac action, characteristic for the wave-field of matter. -§4. The differential conservation law of energy and momentum and the symmetry of the energy-momentum tensor as a consequence of the double-invariance (1) with respect to coordinate transformations (2) with respect to rotation of the tetrad. Momentum and moment of momentum for matter. -§5. Einstein's classical theory of gravitation in the new analytic formulation. Gravitational energy. -§6. The electromagnetic field. From the arbitrariness of the gauge-factor in $\psi$ appears the necessity of introducing the electromagnetic potential. Gauge invariance and charge conservation. The space-integral of charge. The introduction of mass. Discussion and rejection of another possibility in which electromagnetism appears, not as an accompanying phenomenon of matter, but of gravitation.

The modern version of the gauge principle is already spelled out in the introduction:

    The Dirac field-equations for $\psi$ together with the Maxwell equations for the four potentials $f_p$ of the electromagnetic field have an invariance property which is formally similar to the one which I called gauge-invariance in my 1918 theory of gravitation and electromagnetism; the equations remain invariant when one makes the simultaneous replacements

    $$ \psi \ \text{by}\ e^{i\lambda}\psi \quad\text{and}\quad f_p \ \text{by}\ f_p - \frac{\partial\lambda}{\partial x^p}, $$

    where $\lambda$ is understood to be an arbitrary function of position in four-space. Here the factor $e/ch$, where $-e$ is the charge of the electron, $c$ is the speed of light, and $h/2\pi$ is the quantum of action, has been absorbed in $f_p$. The connection of this gauge invariance to the conservation of electric charge remains untouched. But a fundamental difference, which is important to obtain agreement

the spinning electron to gravitation had to do with Einstein's recent unified theory, which invoked a distant parallelism with torsion. Wigner (1929) and others had noticed a connection between this theory and the spin theory of the electron. Weyl did not like this and wanted to dispense with teleparallelism. In the introduction he says:

    I prefer not to believe in distant parallelism for a number of reasons. First my mathematical intuition objects to accepting such an artificial geometry; I find it difficult to understand the force that would keep the local tetrads at different points and in rotated positions in a rigid relationship. There are, I believe, two important physical reasons as well. The loosening of the rigid relationship between the tetrads at different points converts the gauge-factor $e^{i\lambda}$, which remains arbitrary with respect to $\psi$, from a constant to an arbitrary function of space-time. In other words, only through the loosening of the rigidity does the established gauge-invariance become understandable.

This thought is carried out in detail after Weyl has set up his two-component theory in special relativity, including a discussion of P and T invariance. He emphasizes thereby that the two-component theory excludes a linear implementation of parity and remarks: "It is only the fact that the left-right symmetry actually appears in Nature that forces us to introduce a second pair of $\psi$ components." To Weyl the mass problem is thus not relevant for this. Indeed he says: "Mass, however, is a gravitational effect; thus there is hope of finding a substitute in the theory of gravitation that would produce the required corrections."

4 At the time it was thought by Weyl, and indeed by all physicists, that the two-component theory required a zero mass. In 1957, after the discovery of parity nonconservation, it was found that the two-component theory could be consistent with a finite mass. See Case (1957).
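
As a small numerical illustration of the replacement rule quoted above ($\psi$ by $e^{i\lambda}\psi$ together with $f_p$ by $f_p - \partial\lambda/\partial x^p$), the following Python sketch checks on a one-dimensional slice that a derivative of the assumed form $(\partial_p + i f_p)\psi$ picks up only the overall phase $e^{i\lambda}$, whereas the plain derivative does not. The sample functions `psi`, `f`, and `lam` and all function names are hypothetical choices made for this illustration, and the sign in the covariant derivative is an assumption chosen to match the quoted rule; other conventions for the charge and the potential differ by signs.

```python
import cmath
import math


# Hypothetical sample fields on a one-dimensional slice; any smooth choice works.
def psi(x):
    return (1.0 + 0.5 * x) * cmath.exp(2j * x)


def dpsi(x):
    # Analytic derivative of psi (product rule).
    return 0.5 * cmath.exp(2j * x) + (1.0 + 0.5 * x) * 2j * cmath.exp(2j * x)


def f(x):
    # Potential component f_p, with the factor e/(c*hbar) absorbed.
    return math.sin(x)


def lam(x):
    # Arbitrary gauge function lambda(x).
    return 0.3 * x * x


def dlam(x):
    return 0.6 * x


def covariant_derivative(psi_val, dpsi_val, f_val):
    # Assumed form: D psi = (d/dx + i f) psi.
    return dpsi_val + 1j * f_val * psi_val


def gauge_covariance_residual(x):
    """|D'psi' - exp(i lam) D psi| after psi -> exp(i lam) psi, f -> f - dlam."""
    phase = cmath.exp(1j * lam(x))
    psi_new = phase * psi(x)
    dpsi_new = phase * (dpsi(x) + 1j * dlam(x) * psi(x))
    f_new = f(x) - dlam(x)
    lhs = covariant_derivative(psi_new, dpsi_new, f_new)
    rhs = phase * covariant_derivative(psi(x), dpsi(x), f(x))
    return abs(lhs - rhs)


def plain_derivative_residual(x):
    """Same comparison for the ordinary derivative, which is not covariant."""
    phase = cmath.exp(1j * lam(x))
    dpsi_new = phase * (dpsi(x) + 1j * dlam(x) * psi(x))
    return abs(dpsi_new - phase * dpsi(x))


if __name__ == "__main__":
    for x in (-1.0, 0.0, 0.7, 2.5):
        print(x, gauge_covariance_residual(x), plain_derivative_residual(x))
```

The first residual vanishes up to rounding for every sample point, while the second is nonzero wherever $\partial\lambda/\partial x \neq 0$; this is the sense in which a space-time-dependent phase requires the potential as a compensating field.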

Rev. Mod. Phys., Vol. 72, No. 1, January 2000

L. O'Raifeartaigh and N. Straumann: Gauge theory: origins and modern developments

A. Tetrad formalism

In order to incorporate his two-component spinors into general relativity, Weyl was forced to make use of local tetrads (Vierbeine). In Sec. 2 of his paper he develops the tetrad formalism in a systematic manner. This was presumably independent work, since he does not give any reference to other authors. It was, however, mainly E. Cartan (1928) who demonstrated the usefulness of locally defined orthonormal bases (also called moving frames) for the study of Riemannian geometry.

In the tetrad formalism the metric is described by an arbitrary basis of orthonormal vector fields $\{e_\alpha(x);\ \alpha = 0,1,2,3\}$. If $\{e^\alpha(x)\}$ denotes the dual basis of 1-forms, the metric is given by

$$ g = \eta_{\mu\nu}\, e^\mu(x) \otimes e^\nu(x), \qquad (\eta_{\mu\nu}) = \mathrm{diag}(1,-1,-1,-1). \tag{12} $$

Weyl emphasizes, of course, that only a class of such local tetrads is determined by the metric: the metric is not changed if the tetrad fields are subject to space-time-dependent Lorentz transformations:

$$ e^\alpha(x) \to \Lambda^\alpha{}_\beta(x)\, e^\beta(x). \tag{13} $$

With respect to a tetrad, the connection forms $\omega = (\omega^\alpha{}_\beta)$ have values in the Lie algebra of the homogeneous Lorentz group:

$$ \omega_{\alpha\beta} + \omega_{\beta\alpha} = 0. \tag{14} $$

(Indices are raised and lowered with $\eta^{\alpha\beta}$ and $\eta_{\alpha\beta}$, respectively.) They are determined (in terms of the tetrad) by the first structure equation of Cartan:

$$ de^\alpha + \omega^\alpha{}_\beta \wedge e^\beta = 0. \tag{15} $$

(For a textbook derivation see Straumann, 1984.) Under local Lorentz transformations [Eq. (13)] the connection forms transform in the same way as the gauge potential of a non-Abelian gauge theory:

$$ \omega(x) \to \Lambda(x)\,\omega(x)\,\Lambda^{-1}(x) - d\Lambda(x)\,\Lambda^{-1}(x). \tag{16} $$

The curvature forms $\Omega = (\Omega^\alpha{}_\beta)$ are obtained from $\omega$ in exactly the same way as the Yang-Mills field strength from the gauge potential:

$$ \Omega = d\omega + \omega \wedge \omega \tag{17} $$

(second structure equation).

For a vector field $V$, with components $V^\alpha$ relative to $\{e_\alpha\}$, the covariant derivative $DV$ is given by

$$ DV^\alpha = dV^\alpha + \omega^\alpha{}_\beta V^\beta. \tag{18} $$

Weyl generalizes this in a unique manner to spinor fields $\psi$:

Here, the $\sigma^{\alpha\beta}$ describe infinitesimal Lorentz transformations (in the representation of $\psi$). For a Dirac field these are the familiar matrices

(For two-component Weyl fields, one has similar expressions in terms of the Pauli matrices.)

With these tools the action principle for the coupled Einstein-Dirac system can be set up. In the massless case the Lagrangian is

where the first term is just the Einstein-Hilbert Lagrangian (which is linear in $\Omega$). Weyl discusses, of course, immediately the consequences of the following two symmetries:

(i) local Lorentz invariance,
(ii) general coordinate invariance.

B. The new form of the gauge principle

All this is a kind of a preparation for the final section of Weyl's paper, which has the title "electric field." Weyl says:

    We come now to the critical part of the theory. In my opinion the origin and necessity for the electromagnetic field is in the following. The components $\psi_1, \psi_2$ are, in fact, not uniquely determined by the tetrad but only to the extent that they can still be multiplied by an arbitrary "gauge-factor" $e^{i\lambda}$. The transformation of the $\psi$ induced by a rotation of the tetrad is determined only up to such a factor. In special relativity one must regard this gauge-factor as a constant because here we have only a single point-independent tetrad. Not so in general relativity; every point has its own tetrad and hence its own arbitrary gauge-factor; because by the removal of the rigid connection between tetrads at different points the gauge-factor necessarily becomes an arbitrary function of position.

In this manner Weyl arrives at the gauge principle in its modern form and emphasizes "From the arbitrariness of the gauge factor in $\psi$ appears the necessity of introducing the electromagnetic potential." The first term $d\psi$ in Eq. (19) now has to be replaced by the covariant gauge derivative $(d - ieA)\psi$, and the nonintegrable scale factor (2) of the old theory is now replaced by a phase factor:

which corresponds to the replacement of the original gauge group $\mathbb{R}$ by the compact group U(1). Accordingly, the original Gedankenexperiment of Einstein translates now to the Aharonov-Bohm effect, as was first pointed out by Yang (1980). The close connection between gauge invariance and conservation of charge is again revealed. The current conservation follows, as in the original theory, in two independent ways: On the


one hand, it is a consequence of the field equations for matter plus gauge invariance. On the other hand, however, it is also a consequence of the field equations for the electromagnetic field plus gauge invariance. This corresponds to an identity in the coupled system of field equations that has to exist as a result of gauge invariance. All this is now familiar to students of physics and does not need to be explained in more detail.

Much of Weyl's paper appeared also in his classic book The Theory of Groups and Quantum Mechanics (Weyl, 1981). There he mentions the transformation of his early gauge-theoretic ideas: "This principle of gauge invariance is quite analogous to that previously set up by the author, on speculative grounds, in order to arrive at a unified theory of gravitation and electricity. But I now believe that this gauge invariance does not tie together electricity and gravitation, but rather electricity and matter."

When Pauli saw the full version of Weyl's paper he became more friendly and wrote (Pauli, 1979, p. 518):

    In contrast to the nasty things I said, the essential part of my last letter has since been overtaken, particularly by your paper in Z. f. Physik. For this reason I have afterward even regretted that I wrote to you. After studying your paper I believe that I have really understood what you wanted to do (this was not the case in respect of the little note in the Proc. Nat. Acad.). First let me emphasize that side of the matter concerning which I am in full agreement with you: your incorporation of spinor theory into gravitational theory. I am as dissatisfied as you are with distant parallelism and your proposal to let the tetrads rotate independently at different space-points is a true solution.

In brackets Pauli adds:

    Here I must admit your ability in Physics. Your earlier theory with $g'_{ik} = \lambda g_{ik}$ was pure mathematics and unphysical. Einstein was justified in criticizing and scolding. Now the hour of your revenge has arrived.

Then he remarks, in connection with the mass problem,

    Your method is valid even for the massive [Dirac] case. I thereby come to the other side of the matter, namely, the unsolved difficulties of the Dirac theory (two signs of $m_0$) and the question of the 2-component theory. In my opinion these problems will not be solved by gravitation ... the gravitational effects will always be much too small.

Many years later, Weyl summarized this early tortuous history of gauge theory in an instructive letter (Seelig, 1960) to the Swiss writer and Einstein biographer C. Seelig, which we reproduce in an English translation.

    The first attempt to develop a unified field theory of gravitation and electromagnetism dates to my first attempt in 1918, in which I added the principle of gauge invariance to that of coordinate invariance. I myself have long since abandoned this theory in favour of its correct interpretation: gauge invariance as a principle that connects electromagnetism not with gravitation but with the wave-field of the electron. Einstein was against it [the original theory] from the beginning, and this led to many discussions. I thought that I could answer his concrete objections. In the end he said "Well, Weyl, let us leave it at that! In such a speculative manner, without any guiding physical principle, one cannot make Physics." Today one could say that in this respect we have exchanged our points of view. Einstein believes that in this field [Gravitation and Electromagnetism] the gap between ideas and experience is so wide that only the path of mathematical speculation, whose consequences must, of course, be developed and confronted with experiment, has a chance of success. Meanwhile my own confidence in pure speculation has diminished, and I see a need for a closer connection with quantum-physics experiments, since in my opinion it is not sufficient to unify Electromagnetism and Gravity. The wave-fields of the electron and whatever other irreducible elementary particles may appear must also be included.

Independently of Weyl, Fock (1929) also incorporated the Dirac equation into general relativity using the same method. On the other hand, Tetrode (1928), Schrödinger (1932), and Bargmann (1932) reached this goal by starting with space-time-dependent $\gamma$ matrices, satisfying $\{\gamma_\mu, \gamma_\nu\} = 2g_{\mu\nu}$. A somewhat later work by Infeld and van der Waerden (1932) is based on spinor analysis.

IV. THE EARLY WORK OF KALUZA AND KLEIN

Early in 1919 Einstein received a paper of Theodor Kaluza, a young mathematician (Privatdozent) and consummate linguist in Königsberg. Inspired by the work of Weyl one year earlier, he proposed another geometrical unification of gravitation and electromagnetism by extending space-time to a five-dimensional pseudo-Riemannian manifold. Einstein reacted very positively. On 21 April 1919 he writes, "The idea of achieving [a unified theory] by means of a five-dimensional cylinder world never dawned on me. ... At first glance I like your idea enormously." A few weeks later he adds: "The formal unity of your theory is startling." For unknown reasons, Einstein submitted Kaluza's paper to the Prussian Academy after a delay of two years (Kaluza, 1921).

Kaluza was actually not the first who envisaged a five-dimensional unification. It is astonishing to note that G. Nordström had this idea already in 1914 (Nordström, 1914). We recall that Nordström had worked out in several papers (Nordström, 1912, 1913a, 1913b) a scalar theory of gravitation that was regarded by Einstein as


the only serious competitor to general relativity. (In collaboration with Fokker, Einstein gave this theory a generally covariant, conformally flat form.) Nordström started in his unification attempt with five-dimensional electrodynamics and imposed the cylinder condition, that the fields should not depend on the fifth coordinate. Then the five-dimensional gauge potential ${}^{(5)}A$ splits as ${}^{(5)}A = A + \phi\,dx^5$, where $A$ is a four-dimensional gauge potential and $\phi$ is a space-time scalar field. The Maxwell field splits correspondingly, ${}^{(5)}F = F + d\phi \wedge dx^5$, and hence the free Maxwell Lagrangian becomes

In this manner Nordström arrived at a unification of his theory of gravity and electromagnetism. [The matter source (five-current) is decomposed correspondingly.] It seems that this early attempt left, as far as we know, no traces in the literature.

We now return to Kaluza's attempt. Like Nordström he assumes the cylinder condition. Then the five-dimensional metric tensor splits into the four-dimensional fields $g_{\mu\nu}$, $A_\mu$, and $\phi$. Kaluza's identification of the electromagnetic potential is not quite the right one, because he chooses it equal to $g_{\mu 5}$ (up to a constant), instead of taking the quotient $g_{\mu 5}/g_{55}$. This does not matter in his further analysis, because he considers only the linearized approximation of the field equations. Furthermore, the matter part is only studied in a nonrelativistic approximation. In particular, the five-dimensional geodesic equation is only written in this limit. Then the scalar contribution to the four-force becomes negligible and an automatic split into the usual gravitational and electromagnetic parts is obtained.

Kaluza was aware of the limitations of his analysis, but he was confident of being on the right track, as becomes evident from the final paragraph of his paper:

    In spite of all the physical and theoretical difficulties which are encountered in the above proposal it is hard to believe that the derived relationships, which could hardly be surpassed at the formal level, represent nothing more than a malicious coincidence. Should it sometime be established that the scheme is more than an empty formalism this would signify a new triumph for Einstein's General Theory of Relativity, whose suitable extension to five dimensions is our present concern.

For good reasons the role of the scalar field was unclear to him, except in the limiting situation of his analysis, where $\phi$ becomes the negative of the gravitational potential. Kaluza was, however, well aware that the scalar field could play an important role, and he makes some speculative remarks in this direction.

In the classical part of his first paper, Klein (1926a) improves on Kaluza's treatment. He assumes, however, beside the condition of cylindricity, that $g_{55}$ is a constant. Following Kaluza, we keep here the scalar field $\phi$ and write the Kaluza-Klein ansatz for the five-dimensional metric ${}^{(5)}g$ in the form

$$ {}^{(5)}g = \phi^{-1/3}\,(g - \phi\,\omega\otimes\omega), \tag{23} $$

where $g = g_{\mu\nu}\,dx^\mu dx^\nu$ is the space-time metric and $\omega$ is a differential 1-form of the type

$$ \omega = dx^5 + \kappa A_\mu\,dx^\mu. \tag{24} $$

Like $\phi$, $A = A_\mu\,dx^\mu$ is independent of $x^5$; $\kappa$ is a coupling constant to be determined. The convenience of the conformal factor $\phi^{-1/3}$ will become clear shortly.

Klein considers the subgroup of five-dimensional coordinate transformations which respect the form (23) of the d = 5 metric:

$$ x^\mu \to x^\mu, \qquad x^5 \to x^5 + f(x^\mu). \tag{25} $$

Indeed, the pull-back of ${}^{(5)}g$ is again of the form (23) with

$$ g \to g, \qquad \phi \to \phi, \qquad A \to A + \frac{1}{\kappa}\,df. \tag{26} $$

Thus $A = A_\mu\,dx^\mu$ transforms like a gauge potential under the Abelian gauge group (25) and is therefore interpreted as the electromagnetic potential. This is further justified by the most remarkable result derived by Kaluza and Klein, often called the Kaluza-Klein miracle. It turns out that the five-dimensional Ricci scalar ${}^{(5)}R$ splits as follows:

$$ {}^{(5)}R = \phi^{1/3}\left( R + \frac{1}{4}\kappa^2\phi\,F_{\mu\nu}F^{\mu\nu} - \frac{1}{6\phi^2}(\nabla\phi)^2 + \frac{1}{3}\Delta\ln\phi \right). \tag{27} $$

For $\phi \equiv 1$ this becomes the Lagrangian of the coupled Einstein-Maxwell system. In view of the gauge group (25), this split is actually no miracle, because no other gauge-invariant quantities can be formed.

For the development of gauge theory this dimensional reduction was particularly important, because it revealed a close connection between coordinate transformations in higher-dimensional spaces and gauge transformations in space-time.

With Klein we consider the d = 5 Einstein-Hilbert action

$$ {}^{(5)}S = -\frac{1}{\kappa^2 L}\int {}^{(5)}R\,\sqrt{|{}^{(5)}g|}\;d^5x, \tag{28} $$

assuming that the higher-dimensional space is a cylinder with $0 \le x^5 \le L = 2\pi R_5$. Since

$$ \sqrt{|{}^{(5)}g|}\;d^5x = \sqrt{-g}\,\phi^{-1/3}\,d^4x\,dx^5 \tag{29} $$

we obtain

$$ {}^{(5)}S = -\int\left( \frac{1}{\kappa^2}R + \frac{1}{4}\phi\,F_{\mu\nu}F^{\mu\nu} - \frac{1}{6\kappa^2\phi^2}(\nabla\phi)^2 \right)\sqrt{-g}\;d^4x. \tag{30} $$

5 For instance, Einstein extensively discussed Nordström's second version in his famous Vienna lecture "On the Foundations of the Problem of Gravitation" (23 September 1913) and made it clear that Nordström's theory was a viable alternative to his own attempt with Grossmann. [See Doc. 17 of Vol. 4 of the Collected Papers of Albert Einstein (Einstein, 1987).]
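
The way a shift of the fifth coordinate acts as a gauge transformation can be checked with a few lines of Python. The sketch below pulls back the 1-form $\omega = dx^5 + \kappa A_\mu dx^\mu$ of Eq. (24) under the map of Eq. (25) and reads off the new potential. It assumes the cylinder condition (no dependence on $x^5$), and the function names and sample numbers are hypothetical choices for illustration only.

```python
def shift_jacobian(grad_f):
    """Jacobian dPhi^N/dx^M of the map Phi: (x^mu, x^5) -> (x^mu, x^5 + f(x))."""
    n = len(grad_f) + 1
    jac = [[1.0 if row == col else 0.0 for col in range(n)] for row in range(n)]
    for mu, df in enumerate(grad_f):
        jac[n - 1][mu] = df  # d(x^5 + f)/dx^mu
    return jac


def pullback_covector(omega, jac):
    """Components of the pulled-back 1-form: sum over N of (dPhi^N/dx^M) * omega_N."""
    n = len(omega)
    return [sum(jac[N][M] * omega[N] for N in range(n)) for M in range(n)]


def transformed_potential(A, grad_f, kappa):
    """Return (A', coefficient of dx^5) after the shift x^5 -> x^5 + f(x)."""
    # omega = dx^5 + kappa * A_mu dx^mu has components (kappa * A_mu, 1).
    omega = [kappa * a for a in A] + [1.0]
    new = pullback_covector(omega, shift_jacobian(grad_f))
    return [c / kappa for c in new[:-1]], new[-1]


if __name__ == "__main__":
    A = [0.1, -0.2, 0.3, 0.4]        # sample potential components (hypothetical)
    grad_f = [1.0, 2.0, -1.0, 0.5]   # sample gradient of the gauge function f
    A_new, dx5_coeff = transformed_potential(A, grad_f, kappa=2.0)
    print(A_new, dx5_coeff)
```

With these conventions each component of the potential changes by $\partial_\mu f/\kappa$ while the coefficient of $dx^5$ stays equal to 1, so the form (24) is preserved. Because the added term is a gradient, the field strength $F = dA$ is unaffected.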


Our choice of the conformal factor $\phi^{-1/3}$ in Eq. (23) was made so that the gravitational part in Eq. (30) is just the Einstein-Hilbert action, if we choose

$$ \kappa^2 = 16\pi G. \tag{31} $$

For $\phi \equiv 1$ a beautiful geometrical unification of gravitation and electromagnetism is obtained.

We pause by noting that nobody in the early history of Kaluza-Klein theory seems to have noticed the following inconsistency in putting $\phi \equiv 1$ [see, however, Lichnerowicz (1995)]: The field equations for the dimensionally reduced action (30) are just the five-dimensional equations ${}^{(5)}R_{ab} = 0$ for the Kaluza-Klein ansatz (23). Among these, the $\phi$ equation, which is equivalent to ${}^{(5)}R_{55} = 0$, becomes

For $\phi \equiv 1$ this implies the unphysical result $F_{\mu\nu}F^{\mu\nu} = 0$. This conclusion is avoided if one proceeds in the reverse order, i.e., by putting $\phi \equiv 1$ in the action (30) and varying afterwards. However, if the extra dimension is treated as physical, a viewpoint adopted by Klein (as we shall see), it is clearly essential that one maintain consistency with the d = 5 field equations. This is an example of the crucial importance of scalar fields in Kaluza-Klein theories.

Kaluza and Klein both studied the d = 5 geodesic equation. For the metric (23) this is just the Euler-Lagrange equation for the Lagrangian

(33)

Since $x^5$ is cyclic, we have the conservation law ($m$ = mass of the particle)

(34)

If use of this is made in the other equations, we obtain

Clearly, $p_5$ has to be interpreted as $q/\kappa$, where $q$ is the charge of the particle,

$$ p_5 = q/\kappa. \tag{36} $$

The physical significance of the last term in Eq. (35) remained obscure. Much later, Jordan (1949, 1955) and Thiry (1948, 1951) tried to make use of the new scalar field to obtain a theory in which the gravitational constant is replaced by a dynamical field. Further work by Jordan (1949, 1955), Fierz (1956), and Brans and Dicke (1961) led to a much studied theory, which has been for many years a serious competitor of general relativity. Generalized versions (Bergmann, 1968) have recently played a role in models of inflation (see, for example, Steinhardt, 1993). The question of whether the low-energy effective theory of string theories, say, has Brans-Dicke-type interactions has lately been investigated for instance by Damour and collaborators (Damour and Polyakov, 1994).

Since the work of Fierz (published in German, Fierz, 1956) is not widely known, we briefly describe its main point. Quoting Pauli, Fierz emphasizes that, in theories containing both tensor and scalar fields, the tensor field appearing most naturally in the action of the theory can differ from the physical metric by some conformal factor depending on the scalar fields. In order to decide which is the atomic-unit metric and thus the gravitational constant, one has to look at the coupling to matter. The physical metric $g_{\mu\nu}$ is the one to which matter is universally coupled (in accordance with the principle of equivalence). For instance, the action for a spin-0 massive matter field $\psi$ should take the form

A unit of length is then provided by the Compton wavelength $1/m$, and test particles fall along geodesics of $g_{\mu\nu}$. Fierz specializes Jordan's theory (with two free constants) such that the Maxwell density, expressed in terms of the physical metric, is not multiplied with a spacetime-dependent function. Otherwise the vacuum would behave like a variable dielectric and this would have unwanted consequences, although the refraction is 1: The fine structure constant would become a function of spacetime, changing the spectra of galaxies over cosmological distances.

With these arguments Fierz arrives at a theory which was later called the Brans-Dicke theory. He did not, however, confront the theory with observations, because he did not believe in its physical relevance. [The intention of Fierz's publication was mainly pedagogical (Fierz, 1999, private communication).]

Equation (36) brings us to the part of Klein's first paper that is related to quantum theory. There he interprets the five-dimensional geodesic equation as the geometrical optical limit of the wave equation ${}^{(5)}\Box\,\Psi = 0$ on the higher-dimensional space and establishes for special situations a close relation of the dimensionally reduced wave equation with Schrödinger's equation, which had been discovered in the same year. His ideas are more clearly spelled out shortly afterwards in a brief Nature note entitled "The Atomicity of Electricity as a Quantum Theory Law" (Klein, 1926b). There Klein says in connection with Eq. (36):

    The charge $q$, so far as our knowledge goes, is always a multiple of the electronic charge $e$, so that we may write

    $$ p_5 = n\,\frac{e}{\kappa} \qquad [n \in \mathbb{Z}]. \tag{38} $$

    This formula suggests that the atomicity of electricity may be interpreted as a quantum theory law. In fact, if the five-dimensional space is assumed to be closed in the direction of $x^5$ with period $L$, and if


    we apply the formalism of quantum mechanics to our geodesics, we shall expect $p_5$ to be governed by the following rule:

    $$ p_5 = n\,\frac{h}{L}, \tag{39} $$

    $n$ being a quantum number, which may be positive or negative according to the sense of motion in the direction of the fifth dimension, and $h$ the constant of Planck.

Comparing Eqs. (38) and (39), Klein finds the value of the period $L$,

$$ L = \frac{h\kappa}{e} = 0.8\times10^{-30}\ \text{cm}, \tag{40} $$

and adds:

    The small value of this length together with the periodicity in the fifth dimension may perhaps be taken as a support of the theory of Kaluza in the sense that they may explain the non-appearance of the fifth dimension in ordinary experiments as the result of averaging over the fifth dimension.

Klein concludes this note with the daring speculation that the fifth dimension might have something to do with Planck's constant:

    In a former paper the writer has shown that the differential equation underlying the new quantum mechanics of Schrödinger can be derived from a wave equation of a five-dimensional space, in which $h$ does not appear originally, but is introduced in connection with the periodicity in $x^5$. Although incomplete, this result, together with the considerations given here, suggests that the origin of Planck's quantum may be sought just in this periodicity in the fifth dimension.

This was not the last time that such speculations have been put forward. The revival of (supersymmetric) Kaluza-Klein theories in the eighties (Appelquist, Chodos, and Freund, 1987; Kubyshin et al., 1989) led to the idea that the compact dimensions would necessarily give rise to an enormous quantum vacuum energy via the Casimir effect. There were attempts to exploit this vacuum energy in a self-consistent approach to compactification, with the hope that the size of the extra dimensions would be calculable as a pure number times the Planck length. Consequently the gauge-coupling constant would then be calculable.

Coming back to Klein we note that he would also have arrived at Eq. (39) by the dimensional reduction of his five-dimensional equation. Indeed, if the wave field $\psi(x, x^5)$ is Fourier decomposed with respect to the periodic fifth coordinate,

one obtains for each amplitude $\psi_n(x)$ [for the metric (23) with $\phi \equiv 1$] the following four-dimensional wave equation:

where $D_\mu$ is the doubly covariant derivative (with respect to $g_{\mu\nu}$ and $A_\mu$) with the charge

$$ q_n = n\,\frac{\kappa}{R_5}. \tag{43} $$

This shows that the mass of the $n$th mode is

$$ m_n = |n|\,\frac{1}{R_5}. \tag{44} $$

Combining Eq. (43) with $q_n = ne$, we obtain, as before, Eq. (40) or

$$ R_5 = \frac{2}{\sqrt{\alpha}}\,l_{Pl}, \tag{45} $$

where $\alpha$ is the fine-structure constant and $l_{Pl}$ is the Planck length.

Equations (43) and (44) imply a serious defect of the five-dimensional theory: The (bare) masses of all charged particles ($|n| \ge 1$) are of the order of the Planck mass

$$ m_n = n\,\frac{\sqrt{\alpha}}{2}\,m_{Pl}. \tag{46} $$

The pioneering papers of Kaluza and Klein were taken up by many authors. For some time the projective theories of Veblen (1933), Hoffmann (1933), and Pauli (1933) played a prominent role. These are, however, just equivalent formulations of Kaluza's and Klein's unification of the gravitational and the electromagnetic field (Bergmann, 1942; Ludwig, 1951).

Einstein's repeated interest in five-dimensional generalizations of general relativity has been described by Bergmann (1942) and Pais (1982) and will not be discussed here.

V. KLEIN'S 1938 THEORY

The first attempt to go beyond electromagnetism and gravitation and apply Weyl's gauge principle to the nuclear forces occurred in a remarkable paper by Oskar Klein, presented at the Kazimierz Conference on New Theories in Physics (Klein, 1938). Assuming that the mesons proposed by Yukawa were vectorial, Klein proceeded to construct a Kaluza-Klein-like theory which would incorporate them. As in the original Kaluza-Klein theory he introduced only one extra dimension but his theory differed from the original in two respects:

(i) The fields were not assumed to be completely independent of the fifth coordinate $x^5$ but to depend on it through a factor $e^{-iex^5}$ where $e$ is the electric charge.

(ii) The five-dimensional metric tensor was assumed to be of the form

$$ g_{\mu\nu}(x), \qquad g_{55} = 1, \qquad g_{\mu 5} = \beta\chi_\mu(x), \tag{47} $$


where $g_{\mu\nu}$ was the usual four-dimensional Einstein metric, $\beta$ was a constant, and $\chi_\mu(x)$ was a matrix-valued field of the form

where the $\sigma$'s are the usual Pauli matrices and $A_\mu(x)$ is what we would now call an SU(2) gauge potential. This was a most remarkable ansatz considering that it implies a matrix-valued metric, and it is not clear what motivated Klein to make it. The reason that he multiplied the present-day SU(2) matrix by $\sigma_3$ is that $\sigma_3$ represented the charge matrix for the fields.

Having made this ansatz, Klein proceeded in the standard Kaluza-Klein manner and obtained, instead of the Einstein-Maxwell equations, a set of equations that we would now call the Einstein-Yang-Mills equations. This is a little surprising because Klein inserted only electromagnetic gauge invariance. However, one can see how the U(1) gauge invariance of electromagnetism could generalize to SU(2) gauge invariance by considering the field strengths. The SU(2) form of the field strengths corresponding to the $B_\mu$ and $\tilde B_\mu$ fields, namely,

$$ F^{B}_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu + ie\,(A_\mu B_\nu - A_\nu B_\mu), \tag{49} $$

$$ F^{\tilde B}_{\mu\nu} = \partial_\mu \tilde B_\nu - \partial_\nu \tilde B_\mu - ie\,(A_\mu \tilde B_\nu - A_\nu \tilde B_\mu), \tag{50} $$

actually follows from the electromagnetic gauge principle $\partial_\mu \to D_\mu = \partial_\mu + ie(1-\sigma_3)A_\mu$, given that the three vector fields belong to the same 2 × 2 matrix. The more difficult question is why the expression

$$ F^{A}_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu - ie\,(\tilde B_\mu B_\nu - \tilde B_\nu B_\mu) \tag{51} $$

for the field strength corresponding to $A_\mu$ contained a bilinear term when most other vector-field theories, such as the Proca theory, contained only the linear term. The reason is that the geometrical nature of the dimensional reduction meant that the usual space-time derivative $\partial_\mu$ was replaced by the covariant space-time derivative $\partial_\mu + ie(1-\sigma_3)\chi_\mu/2$, with the result that the usual curl $\partial\wedge\chi$ was replaced by $\partial_\mu\chi_\nu - \partial_\nu\chi_\mu + (ie/2)[\chi_\mu, \chi_\nu]$, whose third component is just the expression for $F^{A}_{\mu\nu}$.

Being primarily interested in the application of his theory to nuclear physics, Klein immediately introduced the nucleons, treating them as an isodoublet $\psi(x)$ on which the matrix $\chi_\mu$ acted by multiplication. In this way he was led to field equations of the familiar SU(2) form, namely,

$$ (\gamma\cdot D + M)\,\psi(x) = 0, \qquad D_\mu = \partial_\mu + \frac{ie}{2}(1-\sigma_3)\chi_\mu. \tag{52} $$

However, although the equations of motion for the vector fields $A_\mu$, $B_\mu$, and $\tilde B_\mu$ would be immediately recognized today as those of an SU(2) gauge-invariant theory, this was not at all obvious at the time and Klein does not seem to have been aware of it. Indeed, he immediately proceeded to break the SU(2) gauge invariance by assigning ad hoc mass terms to the $B_\mu$ and $\tilde B_\mu$ fields.

An obvious weakness of Klein's theory is that there is only one coupling constant $\beta$, which implies that the nuclear and electromagnetic forces would be of approximately the same strength, in contradiction with experiment. Furthermore, the nuclear forces would not be charge independent, as they were known to be at the time. These weaknesses were noticed by Møller, who, at the end of the talk, objected to the theory on these grounds. Klein's answer to these objections was astonishing: this problem could easily be solved, he said, because the strong interactions could be made charge independent (and the electromagnetic field separated) by introducing one more vector field $C_\mu$ and generalizing the 2 × 2 matrix $\chi_\mu$

In other words, he there and then generalized what was effectively a (broken) SU(2) gauge theory to a broken SU(2) × U(1) gauge theory. In this way, he anticipated the mathematical structure of the standard electroweak theory by 21 years!

Klein has certainly not forgotten his ambitious proposal of 1938, in contrast to what has been suspected by Gross (1995). In his invited lecture at the Berne Congress in 1955 (Klein, 1956) he came back to some main aspects of his early attempt and concluded with the statement:

    On the whole, the relation of the theory to the five-dimensional representation of gravitation and electromagnetism on the one hand and to symmetric meson theory on the other hand, through the appearance of the charge invariance group, may perhaps justify the confidence in its essential soundness.

VI. THE PAULI LETTERS TO PAIS

The next attempt to write down a gauge theory for the nuclear interactions was due to Pauli. During a discussion following a talk by Pais at the 1953 Lorentz Conference in Leiden (Pais, 1953), Pauli said:

    ... I would like to ask in this connection whether the transformation group with constant phases can be amplified in way analogous to the gauge group for electromagnetic potentials in such a way that the meson-nucleon interaction is connected with the amplified group ...

Stimulated by this discussion, Pauli worked on this problem and drafted a manuscript to Pais that begins with the heading (Pauli, 1999):

    Written down July 22-25, 1953, in order to see how it looks. Meson-Nucleon Interaction and Differential Geometry.

In this manuscript, Pauli generalizes the Kaluza-Klein theory to a six-dimensional space and arrives through dimensional reduction at the essentials of an SU(2) gauge theory. The extra dimensions form a two-sphere $S^2$ with space-time-dependent metrics on which SU(2)


operates in a space-time-dependent manner. Pauli develops first in "local language" the geometry of what we now call a fiber bundle with a homogeneous space as typical fiber [in his case $S^2 = SU(2)/U(1)$]. Studying the curvature of the higher-dimensional space, Pauli automatically finds, for the first time, the correct expression for the non-Abelian field strength.

Since it is somewhat difficult to understand exactly what Pauli did, we give some details, using more familiar formulations and notations.

Pauli considers the six-dimensional total space $M \times S^2$, where $S^2$ is the two-sphere on which SO(3) acts in the canonical manner. He distinguishes among the diffeomorphisms (coordinate transformations) those which leave $M$ pointwise fixed and induce space-time-dependent rotations on $S^2$:

$$(x, y) \to [x, R(x)\cdot y].$$ (54)

Then Pauli postulates a metric on $M \times S^2$ that is supposed to satisfy three assumptions. These lead him to what is now called the non-Abelian Kaluza-Klein ansatz: The metric $\hat{g}$ on the total space is constructed from a space-time metric $g$, the standard metric $\gamma$ on $S^2$, and a Lie-algebra-valued 1-form,

$$A = A^a T_a, \qquad A^a = A^a_\mu\, dx^\mu,$$ (55)

on $M$ [$T_a$, $a = 1,2,3$, are the standard generators of the Lie algebra of SO(3)] as follows: If $K^i_a\,\partial/\partial y^i$ are the three Killing fields on $S^2$, then

$$\hat{g} = g - \gamma_{ij}\,[dy^i + K^i_a(y)A^a] \otimes [dy^j + K^j_b(y)A^b].$$ (56)

In particular, the nondiagonal metric components are

$$\hat{g}_{\mu i} = A^a_\mu(x)\,\gamma_{ij}K^j_a.$$ (57)

Pauli does not say that the coefficients of $A^a_\mu$ in Eq. (57) are the components of the three independent Killing fields. This is, however, his result, which he formulates in terms of homogeneous coordinates for $S^2$. He determines the transformation behavior of $A^a_\mu$ under the group (54) and finds in matrix notation what he calls "the generalization of the gauge group":

$$A_\mu \to R^{-1}A_\mu R + R^{-1}\partial_\mu R.$$ (58)

With the help of $A_\mu$, he defines a covariant derivative, which is used to derive "field strengths" by applying a generalized curl to $A_\mu$. This is exactly the field strength that was later introduced by Yang and Mills. To our knowledge, apart from Klein's 1938 paper, it appears here for the first time. Pauli says that "this is the true physical field, the analog of the field strength" and he formulates what he considers to be his "main result":

"The vanishing of the field strength is necessary and sufficient for the $A^a_\mu(x)$ in the whole space to be transformable to zero."

It is somewhat astonishing that Pauli did not work out the Ricci scalar for $\hat{g}$ as for the Kaluza-Klein theory. One reason may be connected with his remark on the Kaluza-Klein theory in Note 23 of his relativity article (Pauli, 1958) concerning the five-dimensional curvature scalar (p. 230):

"There is, however, no justification for the particular choice of the five-dimensional curvature scalar P as integrand of the action integral, from the standpoint of the restricted group of the cylindrical metric [gauge group]. The open problem of finding such a justification seems to point to an amplification of the transformation group."

In a second letter (Pauli, 1999), Pauli also studies the dimensionally reduced Dirac equation and arrives at a mass operator that is closely related to the Dirac operator in internal space $(S^2, \gamma)$. The eigenvalues of the latter operator had been determined by him long before (Pauli, 1939). Pauli concludes with the statement: "So this leads to some rather unphysical shadow particles."

VII. YANG-MILLS THEORY

In his Hermann Weyl Centenary Lecture at the ETH (Yang, 1980), C. N. Yang commented on Weyl's remark "The principle of gauge-invariance has the character of general relativity since it contains an arbitrary function $\lambda$, and can certainly only be understood in terms of it" (Weyl, 1968) as follows:

"The quote above from Weyl's paper also contains something which is very revealing, namely, his strong association of gauge invariance with general relativity. That was, of course, natural since the idea had originated in the first place with Weyl's attempt in 1918 to unify electromagnetism with gravity. Twenty years later, when Mills and I worked on non-Abelian gauge fields, our motivation was completely divorced from general relativity and we did not appreciate that gauge fields and general relativity are somehow related. Only in the late 1960s did I recognize the structural similarity mathematically of non-Abelian gauge fields with general relativity and understand that they both were connections mathematically."

Later, in connection with Weyl's strong emphasis of the relation between gauge invariance and conservation of electric charge, Yang continues with the following instructive remarks:

"Weyl's reason, it turns out, was also one of the melodies of gauge theory that had very much appealed to me when as a graduate student I studied field theory by reading Pauli's articles. I made a number of unsuccessful attempts to generalize gauge theory beyond electromagnetism, leading finally in 1954 to a collaboration with Mills in which we developed a non-Abelian gauge theory. In [. . .] we stated our motivation as follows:

The conservation of isotopic spin points to the existence of a fundamental invariance law similar to the conservation of electric charge. In the latter case, the electric charge serves as a source of electromagnetic field; an important concept in this case is gauge invariance, which is closely connected with (1) the equation of motion of the electro-


magnetic field, (2) the existence of a current density, and (3) the possible interactions between a charged field and the electromagnetic field. We have tried to generalize this concept of gauge invariance to apply to isotopic spin conservation. It turns out that a very natural generalization is possible.

Item (2) is the melody referred to above. The other two melodies, (1) and (3), were what had become pressing in the early 1950s when so many new particles had been discovered and physicists had to understand how they interact with each other.

I had met Weyl in 1949 when I went to the Institute for Advanced Study in Princeton as a young "member." I saw him from time to time in the next years, 1949-1955. He was very approachable, but I don't remember having discussed physics or mathematics with him at any time. His continued interest in the idea of gauge fields was not known among the physicists. Neither Oppenheimer nor Pauli ever mentioned it. I suspect they also did not tell Weyl of the 1954 papers of Mills and mine. Had they done that, or had Weyl somehow come across our paper, I imagine he would have been pleased and excited, for we had put together two things that were very close to his heart: gauge invariance and non-Abelian Lie groups."

It is indeed astonishing that during those late years neither Pauli nor Yang ever talked with Weyl about non-Abelian generalizations of gauge invariance.

With the background of Sec. VI, the following story of spring 1954 becomes more understandable. In late February, Yang was invited by Oppenheimer to return to Princeton for a few days and to give a seminar on his joint work with Mills. Here is Yang's report (Yang, 1983):

"Pauli was spending the year in Princeton, and was deeply interested in symmetries and interactions. (He had written in German a rough outline of some thoughts, which he had sent to A. Pais. Years later F. J. Dyson translated this outline into English. It started with the remark, 'Written down July 22-25, 1953, in order to see how it looks,' and had the title 'Meson-Nucleon Interaction and Differential Geometry.') Soon after my seminar began, when I had written down on the blackboard,

$$(\partial_\mu - i\epsilon B_\mu)\psi,$$

Pauli asked, 'What is the mass of this field $B_\mu$?' I said we did not know. Then I resumed my presentation, but soon Pauli asked the same question again. I said something to the effect that that was a very complicated problem, we had worked on it and had come to no definite conclusions. I still remember his repartee: 'That is not sufficient excuse.' I was so taken aback that I decided, after a few moments' hesitation, to sit down. There was general embarrassment. Finally Oppenheimer said, 'We should let Frank proceed.' I then resumed, and Pauli did not ask any more questions during the seminar.

I don't remember what happened at the end of the seminar. But the next day I found the following message:

February 24, Dear Yang, I regret that you made it almost impossible for me to talk with you after the seminar. All good wishes. Sincerely yours, W. Pauli.

I went to talk to Pauli. He said I should look up a paper by E. Schrödinger, in which there were similar mathematics.[6] After I went back to Brookhaven, I looked for the paper and finally obtained a copy. It was a discussion of spacetime-dependent representations of the $\gamma_\mu$ matrices for a Dirac electron in a gravitational field. Equations in it were, on the one hand, related to equations in Riemannian geometry and, on the other, similar to the equations that Mills and I were working on. But it was many years later when I understood that these were all different cases of the mathematical theory of connections on fiber bundles."

[6] E. Schrödinger, Sitzungsberichte der Preussischen Akademie der Wissenschaften, 1932, p. 105.

Later Yang adds:

"I often wondered what he [Pauli] would say about the subject if he had lived into the sixties and seventies."

At another occasion (Yang, 1980) he remarked:

"I venture to say that if Weyl were to come back today, he would find that amidst the very exciting, complicated and detailed developments in both physics and mathematics, there are fundamental things that he would feel very much at home with. He had helped to create them."

Having quoted earlier letters from Pauli to Weyl, we add what Weyl said about Pauli in 1946 (Weyl, 1980):

"The mathematicians feel near to Pauli since he is distinguished among physicists by his highly developed organ for mathematics. Even so, he is a physicist; for he has to a high degree what makes the physicist; the genuine interest in the experimental facts in all their puzzling complexity. His accurate, instructive estimate of the relative weight of relevant experimental facts has been an unfailing guide for him in his theoretical investigations. Pauli combines in an exemplary way physical insight and mathematical skill."

To conclude this section, let us emphasize the main differences between general relativity and Yang-Mills theories. Mathematically, the so(1,3)-valued connection forms $\omega$ in Sec. III.A and the Lie-algebra-valued gauge potential $A$ are on the same footing; they are both representatives of connections in (principal) fiber bundles


over the space-time manifold. Equation (17) translates into the formula for the Yang-Mills field strength $F$,

$$F = dA + A \wedge A.$$ (59)

In general relativity one has, however, additional geometric structure, since the connection derives from a metric, or the tetrad fields $e^\alpha(x)$, through the first structure equation (15). This is shown schematically in Fig. 4. [In bundle theoretical language one can express this as follows: The principal bundle of general relativity, i.e., the orthonormal frame bundle, is constructed from the base manifold and its metric, and has therefore additional structure, implying, in particular, the existence of a canonical 1-form (soldering form), whose local representatives are the tetrad fields; see, for example, Bleecker (1981).]

[FIG. 4. General relativity vs Yang-Mills theory.]

Another important difference is that the gravitational Lagrangian $*R = \tfrac{1}{2} R_{\alpha\beta} \wedge *(e^\alpha \wedge e^\beta)$ is linear in the field strengths, whereas the Yang-Mills Lagrangian $F \wedge *F$ is quadratic.

VIII. RECENT DEVELOPMENTS

The developments after 1958 consisted in the gradual recognition that, contrary to phenomenological appearances, Yang-Mills gauge theory could describe weak and strong interactions. This important step was again very difficult, with many hurdles to overcome. One of them was the mass problem, which was solved, probably in a preliminary way, through spontaneous symmetry breaking. Of critical significance was the recognition that spontaneously broken gauge theories are renormalizable. On the experimental side the discovery and intensive investigation of the neutral current was, of course, crucial. For the gauge description of the strong interactions, the discovery of asymptotic freedom was decisive. That the SU(3) color group should be gauged was also not at all obvious. And then there was the confinement idea, which explains why quarks and gluons do not exist as free particles. All this is described in numerous modern textbooks and does not have to be repeated.

The next step in creating a more unified theory of the basic interactions will probably be much more difficult. All major theoretical developments of the last twenty years, such as grand unification, supergravity, and supersymmetric string theory, are almost completely separated from experience. There is a great danger that theoreticians may get lost in pure speculations. As in the first unification proposal of Hermann Weyl, they may create beautiful and highly relevant mathematics, which does not, however, describe Nature. In the latter case history shows, however, that such ideas can one day also become fruitful for physics. It may, therefore, be appropriate to conclude with some remarks on current attempts in string theory and noncommutative geometry.

A. Gauge theory and strings

1. Introduction

So far we have considered gravitation and gauge theory only within the context of local field theory. However, gravitation and gauge theory also occur naturally in string theory (Green, Schwarz, and Witten, 1987; Polchinski, 1998). Indeed, whereas in field theory they are optional extras that are introduced on phenomenological grounds (equality of inertial and gravitational mass, divergence-free character of the magnetic field, etc.), in string theory they occur as an intrinsic part of the structure. Thus string theory is a very natural setting for gravitation and gauge fields. One might go so far as to say that, had string theory preceded field theory historically, the gravitational and gauge fields might have emerged in a completely different manner. An interesting feature of string gauge theories is that the choice of gauge group is quite limited.

String theory is actually a natural setting not only for gravitational and gauge fields but also for the Kaluza-Klein mechanism. Historically, the most obvious difficulty with Kaluza-Klein reductions was that there was no experimental evidence and no theoretical need for any extra dimensions. String theory changes this situation dramatically. As is well known, string theory is conformally invariant only if the dimension $d$ of the target space is $d = 10$ or $d = 26$, according to whether the string is supersymmetric or not. Thus, in contrast to field theory, string theory points to the existence of extra dimensions and even specifies their number.

We shall treat an important specific case of dimensional reduction within string theory, namely, the toroidal reduction from 26 to 10 dimensions, in Sec. VIII.A.7. However, since no phenomenologically satisfactory reduction from 26 or 10 to 4 dimensions has yet emerged, and the dimensional reduction in string theory is rather similar to that in ordinary field theory (Appelquist, Chodos, and Freund, 1987; Kubyshin et al., 1989), we shall not consider any other case, but refer the reader to the literature.

Instead we shall concentrate on the manner in which gauge theory and gravitation occur in the context of dimensionally unreduced string theory. We shall rely heavily on the result (Yau, 1985) that if a massless vector field theory with polarization vector $\xi_\mu$ and on-shell momentum $p_\mu$ ($p^2 = 0$) is invariant with respect to the transformation

$$\xi_\mu(p) \to \xi_\mu(p) + \eta(p)\,p_\mu,$$ (60)

where $\eta(p)$ is an arbitrary scalar, then it must be a gauge theory. Similarly, we shall rely on the result that a


second-rank symmetric-tensor theory must be a gravitational theory if the polarization vector satisfies

[Eq. (61)]

and if the theory is invariant with respect to

$$\xi_{\mu\nu}(p) \to \xi_{\mu\nu}(p) + \eta_\mu p_\nu + \eta_\nu p_\mu$$ (62)

for arbitrary $\eta_\mu(p)$, and $p^2 = 0$ (Weinberg, 1965; Wald, 1986, and references therein; Feynman, 1995).

2. Gauge properties of open bosonic strings

To fix our ideas we first recall the form of the path integral for the bosonic string (Green, Schwarz, and Witten, 1987; Polchinski, 1998), namely,

[Eq. (63)]

where $\sigma$ are the coordinates, $d^2\sigma$ is the diffeomorphic-invariant measure, and $h_{\alpha\beta}$ is the metric on the two-dimensional world sheet of the string, while $\eta_{\mu\nu}$ is the Lorentz metric for the 26-dimensional target space in which the string, with coordinates $X^\mu(\sigma)$, moves. Thus the $X^\mu(\sigma)$ may be regarded as fields in a two-dimensional quantum field theory. The action in Eq. (63), and hence the classical two-dimensional field theory, is conformally invariant, but, as is well known, the quantum theory is conformally invariant only if $N = 26$ or $N = 10$ in the supersymmetric version.

The open bosonic string is the one in which gauge fields occur. Indeed, one might go so far as to say that the open string is a natural nonlocal generalization of a gauge field. The ends of the open strings are usually assumed to be attached to quarks, and thus there is a certain qualitative resemblance between the open bosonic strings and the gluon flux lines that link the quarks in theories of quark confinement. We wish to make the relationship between gauge fields and open bosonic strings more precise.

As is well known (Green et al., 1987; Polchinski, 1998), the vacuum state $|0\rangle$ of the open string is a scalar tachyon and the first excited states are $X^\mu_+(\tau)|0\rangle = \int d(\sin s)\, X^\mu_+(\tau,s)|0\rangle$, where $\tau$ and $s$ are the time and space components of $\sigma$ and $X^\mu_+(\tau,s)$ is the positive-frequency part of $X^\mu(\tau,s)$. For a suitable standard value of the normal-ordering parameter for the Noether generators of the conformal (Virasoro) group, these states are massless, i.e., $p^2 = 0$, where $p_\mu$ is the 26-dimensional momentum. Furthermore, they are the only massless states. Thus if there are gauge fields in the theory, these are the states that must be identified with them. On the other hand, since all the other (massive) states in the Fock space of the string are formed by acting on $|0\rangle$ with higher moments of $X(\sigma)$, we see that the operators $X^\mu_+(\tau)$ are the prototypes of the operators that create the whole string. It is in this sense that the string can be regarded as a nonlocal generalization of a gauge field.

The question is: how is the identification of the massless states $X^\mu_+(\tau)|0\rangle$ to be justified? The justification comes about through the so-called vertex operators for the emission of the on-shell massless states. The vertices are the analogs of the ordinary Feynman diagrams in quantum field theory and take the form

[Eq. (64)]

where $\xi_\mu$ is the polarization vector, $p_\mu$ is the momentum ($p^2 = 0$) of the emitted particle, and $\partial_s$ is a spacelike derivative. This vertex operator is to be inserted in the functional integral (63). Although the form of this vertex is not deduced from a second-quantized theory of strings (which does not yet exist), the vertex operator (64) is generally accepted as the correct one, because it is the only vertex that is compatible with the two-dimensional structure and conformal invariance of the string, and that reduces to the usual vertex in the point-particle limit. Suppose now that we make the transformation $\xi_\mu \to \xi_\mu + \eta(p)\,p_\mu$ in Eq. (60). Then the vertex $V(\xi,p)$ acquires the additional term

$$\eta(p)\int ds\; e^{ip\cdot X(\sigma)}\, p\cdot\partial_s X(\sigma) = -i\,\eta(p)\int ds\; \partial_s\!\left(e^{ip\cdot X(\sigma)}\right),$$ (65)

which is an integrable factor that attaches itself to the two ends of the string and thus displays the gauge-covariant character of $V(\xi,p)$. The important point is that this gauge covariance is not imposed from outside, but is an intrinsic property of the string. It is a consequence of the fact that the string is conformally invariant (which dictates the form of the vertex operator) and has an internal structure (manifested by the fact that it has an internal two-dimensional integration).

3. Gravitational properties of closed bosonic strings

Just as the open bosonic string is the one in which gauge fields naturally occur, the closed bosonic string is the one in which gravitational fields naturally occur. It turns out, in fact, that a gravitational field, a dilaton field, and an antisymmetric tensor field occur in the closed string in the same way that the gauge field appears in the open string. The ground state $|0\rangle$ of the closed string is again a tachyon but the new feature is that, for the standard value of the normal-ordering constant, the first excited states are massless states of the form $X^\mu_+(\sigma)X^\nu_+(\sigma)|0\rangle$ and it is the symmetric, trace, and antisymmetric parts of the two-tensor formed by the $X$'s that are identified with the gravitational, dilaton, and antisymmetric tensor fields, respectively. The question is how the identification with the gravitational field is to be justified, and again the answer is by means of a vertex operator, this time for the emission of an on-shell graviton. The vertex operator that describes the emission of an on-shell massless spin-2 field (graviton) of polarization $\xi_{\mu\nu}$ and momentum $p_\mu$, where $p^2 = 0$, is

[Eq. (66)]
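
The gauge-variation argument for the two vertices can be written out in a few lines. The sketch below assumes the standard textbook forms $V(\xi,p)=\int ds\, e^{ip\cdot X}\,\xi_\mu\,\partial_s X^\mu$ for the open string and $V_g(\xi,p)=\int d^2\sigma\, e^{ip\cdot X}\,\xi_{\mu\nu}\,\partial_\alpha X^\mu\,\partial^\alpha X^\nu$ for the closed string. These forms are assumptions made here for illustration, since the displays of Eqs. (64) and (66) are not reproduced in this text. Normalization and normal ordering are ignored, a flat world-sheet metric is assumed, and the endpoint values $s=0,\pi$ are a convention chosen for the sketch.

```latex
% Open string (assumed vertex): shift \xi_\mu \to \xi_\mu + \eta(p) p_\mu and use
%   \partial_s e^{ip\cdot X} = i (p\cdot\partial_s X) e^{ip\cdot X}.
\begin{align*}
\delta V
  &= \eta(p)\int_0^{\pi} ds\; e^{ip\cdot X(\sigma)}\, p\cdot\partial_s X(\sigma) \\
  &= -i\,\eta(p)\int_0^{\pi} ds\; \partial_s\!\left(e^{ip\cdot X(\sigma)}\right) \\
  &= -i\,\eta(p)\,\Bigl[e^{ip\cdot X(\tau,s)}\Bigr]_{s=0}^{s=\pi}.
\end{align*}

% Closed string (assumed vertex): shift
%   \xi_{\mu\nu} \to \xi_{\mu\nu} + \eta_\mu p_\nu + \eta_\nu p_\mu
% and integrate by parts; a closed world sheet has no boundary term.
\begin{align*}
\delta V_g
  &= 2\int d^2\sigma\; e^{ip\cdot X}\,(p\cdot\partial^\alpha X)\,(\eta\cdot\partial_\alpha X) \\
  &= -2i\int d^2\sigma\; \partial^\alpha\!\left(e^{ip\cdot X}\right)\,\eta\cdot\partial_\alpha X \\
  &= 2i\int d^2\sigma\; e^{ip\cdot X}\;\eta\cdot\Delta X,
  \qquad \Delta \equiv \partial^\alpha\partial_\alpha .
\end{align*}
```

Under these assumptions the open-string variation is a pure endpoint term, which is the content of Eq. (65), while the closed-string variation is proportional to $\Delta X$ and so vanishes once the field equation $\Delta X = 0$ is used.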


Already at this stage there is a feature that does not arise in the gauge-field case: Since the vertex operator is bilinear in the field $X$ it has to be normal ordered, and it turns out that the normal ordering destroys the classical conformal invariance, unless

$$\xi^\mu{}_\mu = 0 \quad\text{and}\quad p^\mu \xi_{\mu\nu} = 0.$$ (67)

We next make the momentum-space version of an infinitesimal coordinate transformation, namely,

$$\xi_{\mu\nu} \to \xi_{\mu\nu} + \eta_\mu(p)\,p_\nu + \eta_\nu(p)\,p_\mu.$$ (68)

Under this transformation the vertex $V_g$ picks up an additional term of the form

[Eq. (69)]

In analogy with the electromagnetic case we can carry out a partial integration. However, this time the expression does not vanish completely but reduces to

[Eq. (70)]

where $\Delta$ denotes the two-dimensional Laplacian. On the other hand, $\Delta X(\sigma) = 0$ is just the classical field equation for $X(\sigma)$, and it can be shown that even in the quantized version it is effectively zero (Green, Schwarz, and Witten, 1987; Polchinski, 1998). Thus, thanks to the dynamics, we have invariance with respect to the transformations (68). But Eq. (67) and invariance with respect to Eq. (68) are just the conditions (61) and (62) discussed earlier for the vertex to be a gravitational field. As in the gauge-field case, the important point is that the general coordinate invariance is not imposed, but is a consequence of the conformal invariance and internal structure of the string.

The appearance of a scalar field in this context is not too surprising, since a scalar also appeared in the Kaluza-Klein reduction. What is more surprising is the appearance of an antisymmetric tensor. From the point of view of traditional local gravitational and gauge-field theory the presence of an additional antisymmetric tensor field seems at first sight to be an embarrassment. But it turns out to play an essential role in maintaining conformal invariance (cancellation of anomalies), so its presence is to be welcomed.

4. The presence of matter

Of course, the open bosonic string is not the whole story any more than pure gauge fields are the whole story in quantum field theory. One still has to introduce quantities that correspond to fermions (and possibly scalars) at the zero-mass level. There are essentially two ways to do this. The first is the Chan-Paton mechanism (Green, Schwarz, and Witten, 1987; Polchinski, 1998), which dates from the days of strong-interaction string theory. In this mechanism one simply attaches charged particles to the open ends of the string. These charged particles are not otherwise associated with any string and thus the mechanism is rather ad hoc and leads to a hybrid of string and field theories. But it has the merit of introducing charged particles directly and thus emphasizing the relationship between strings and gauge fields.

The Chan-Paton mechanism has the further merit of allowing a simple generalization to the non-Abelian case. This is done by replacing the charged particles by particles belonging to the fundamental representations of compact internal symmetry groups $G$, typically quarks $q_a(x)$ and antiquarks $\bar{q}_b(x)$. The vertex operator then generalizes to one with double labels $(a,b)$ and represents non-Abelian gauge fields in much the same way that the simple bosonic string represents an Abelian gauge field.

An interesting restriction arises from the fact that since the string represents gauge fields, and gauge fields belong to the adjoint representation of the gauge group, the vertex function must belong to the adjoint representation. This implies that even at the tree level the tensor product of the fundamental group representation with itself must produce only the adjoint representation, and this restricts $G$ to be one of the classical groups SO(n), Sp(2n), and U(n). Furthermore, it is found that U(n) violates unitarity at the one-loop level, which leaves only SO(n) and Sp(2n). Finally, these groups require symmetrization and antisymmetrization in the indices $a$ and $b$ to produce only the adjoint representation, and this implies that the string is oriented (symmetric with respect to its end points). When all these conditions are satisfied it can be shown that the non-Abelian vertex corresponding to Eq. (64) is covariant with respect to non-Abelian gauge transformations corresponding to $\xi_\mu \to \xi_\mu + \eta(p)\,p_\mu$ above. But since these transformations are nonlinear the proof is more difficult than in the Abelian case.

5. Fermionic and heterotic strings: Supergravity and non-Abelian gauge theory

The Chan-Paton version of gauge string has the obvious disadvantage that the charged fields (quarks) are not an intrinsic part of the theory. A second method of introducing fermions is to place them in the string itself. This is done by replacing the kinetic term $(\partial X)^2$ by a Dirac term $\bar{\psi}\,\partial\!\!\!/\,\psi$ in the Lagrangian density. Interesting cases are those in which the number of fermion components just matches the number of bosons, so that the Lagrangian is supersymmetric. In that case the condition for quantum conformal invariance reduces from $d = 26$ to $d = 10$. An interesting case from the point of view of gauge theory and dimensional reduction is the heterotic string, in which the left-handed part forms a superstring and the right-handed part forms a bosonic string in


which 16 of the bosons are fermionized. For the het- trivial representations of G, and these are the ones ob-
erotic string the Lagrangian in the bosonic-string path tained from the tensor products of Eq. (73) with the
integral (63) is replaced by the Lagrangian space-time scalars AAlO). At this point one must make a
p=10 p=10 A=32 choice about the internal symmetry group G. The sim-
z
p=l
a,xpa*x,-2 z fl
p=l
d-$p+-2
A=l
A! d , ~ ! , plest choice is evidently G=SO(32), and it is obtained
by assigning antiperiodic boundary conditions to all the
(71) fermion fields A. (Assigning periodic boundary condi-
where the $s and As are Majorana-Weyl fermions and tions to all of them violates the masslessness condition.)
the As belong to a representation (labeled with A ) of an Since the product states continue to belong to the ad-
internal symmetry group G. It is only through the As joint representation of SO(32), they are the natural
that the internal symmetry group enters. The left- and candidates for states associated with non-Abelian gauge-
right-handed parts of the theory are conformally invari- fields, and an analysis of the vertex operators associated
ant for quite different reasons. The left-handed part of with these states confirms that they do indeed corre-
the Xs and the left-handed fermions are conformally spond to SO(32) gauge fields.
invariant, because together they form the left-handed In sum, the heterotic string produces both supergrav-
part of a superstring (this is why the summation over the ity and non-Abelian gauge theory.
Xs is only from 1 up to 10). The right-handed part of
the Xs and the right-handed fermions AA are confor-
7. Dimensional reduction and the heterotic
mally invariant because, from the point of view of
symmetry group E8x E8
anomalies, two Majorana-Weyl fermions are equivalent
to one boson and thus the system is equivalent to the A variety of other left-handed internal symmetry
right-handed part of a 26-dimensional bosonic string. groups GCSO(32) can be obtained by assigning peri-
(This is why the index A runs from 1 to 32.) The fact odic and antiperiodic boundary conditions to the fermi-
that there are 32 fermions obviously puts strong restric- ons AA of the heterotic string in a nonuniform manner.
tions on the choice of the internal symmetry group G. However, apart from the SO(32) case just discussed, the
We now examine the particle content of the theory, only assignment that satisfies unitarity at the one-loop
using the light-cone gauge, where there are no redun- level is an equipartition of the 32 fermions into two sets
dant fields. There are no tachyons for the left-moving of 16, with mixed boundary conditions. This would ap-
fields; the first excited states are massless and take the pear, at first sight, to lead to an SO( 16) X SO(16) inter-
form

$|i\rangle_L$ and $|a\rangle_L$, (72)

where the $|i\rangle_L$ for $i=1\ldots 8$ are the left-handed components of a massless space-time vector
in the eight transverse directions in the light-cone gauge and the $|a\rangle_L$ are the components of a
massless fermion in one of the two fundamental spinor representations of the same space-time SO(8)
group. These states are all $G$ invariant.

The first excited states for the right-moving sector are

$|i\rangle_R$ and $\lambda\lambda|0\rangle$, (73)

where the $|i\rangle_R$ are the right-handed analogs of the $|i\rangle_L$ and the $\lambda\lambda|0\rangle$
states are massless space-time scalars. The states $|i\rangle_R$ are $G$ invariant but the states
$\lambda\lambda|0\rangle$ belong to the adjoint representation of $G$ and thus it is only through these
states that the internal symmetry enters at the massless level.

The physical states are obtained by tensoring the left- and right-moving states (72) and (73). On
tensoring the right-handed states with the vectors in Eq. (73) we obviously obtain states that are $G$
invariant, and they turn out to be just the states that would occur in $N=1$ supergravity. An analysis
of the vertex operators, similar to that carried out above for closed bosonic strings, confirms that
these fields do indeed correspond to supergravity.

6. The internal symmetry group G

From the point of view of non-Abelian gauge theory the interesting states are those belonging to the non-

nal symmetry and gauge group, on the same grounds as SO(32) above, but a closer analysis shows that it
actually leads to a larger group, namely $E_8\times E_8$, which actually has the same dimension (496) as
SO(32). This group is quite attractive for grand unification theory, as it breaks naturally to $E_6$,
which is one of the favorite grand unified theory groups.

Once we accept that SO(16)$\times$SO(16) is a gauge group and that a rigid internal symmetry group
$E_8\times E_8$ exists, it follows immediately that $E_8\times E_8$ must be a gauge group, because the
action of the rigid generators of $E_8\times E_8$ on the SO(16)$\times$SO(16) gauge fields produces
$E_8\times E_8$ gauge fields.

This reduces the problem to the existence of a rigid $E_8\times E_8$ symmetry, but, within the context
of our present methods, this is a rather convoluted process. One must introduce special representations
of SO(16)$\times$SO(16), project out some of the resulting states, and construct vertices that represent
the elements of the coset $(E_8\times E_8)/[\mathrm{SO}(16)\times\mathrm{SO}(16)]$. Luckily there is a much
more intuitive way to establish the existence of the $E_8\times E_8$ symmetry, and, as this way provides
a very nice example of dimensional reduction within string theory, we shall now sketch it.

We have already remarked that, from the point of view of Virasoro anomalies, the 32 right-handed
Majorana-Weyl fermions $\lambda$ are equivalent to the right-handed parts of 16 bosons. This relationship
can be carried farther by bosonizing the fermions according to
$\lambda^{\pm}(\sigma)={:}\exp[\pm i\phi_R(\sigma)]{:}$, where $\phi_R(\sigma)$ is a right-moving bosonic
field, compactified so that $0\le\phi_R(\sigma)<2\pi$. In that

Rev. Mod. Phys., Vol. 72,No. 1, January 2000



L. ORaifeartaigh and N. Straumann: Gauge theory: origins and modern developments 19

case we may regard the right-handed part of the heterotic string as originating in the right-handed part
of an ordinary 26-dimensional bosonic string, in which 16 of the 26 right-moving bosonic fields
$X_R(\sigma)$ have been fermionized by letting $X^a_R(\sigma)\to\phi^a(\sigma)$ for
$0\le\phi^a(\sigma)<2\pi$ and $a=1\ldots 16$. Since the $X$'s correspond to coordinates in the target
space of the string, this is equivalent to a toroidal compactification of 16 of the target-space
dimensions and thus is equivalent to a Kaluza-Klein-type dimensional reduction from 26 to 10 dimensions.
It turns out that the toroidal compactification and conversion to fermions is consistent only if the
lattice that defines the 16-dimensional torus is even and self-dual. But it is well known that there are
only two such lattices, called $D_{16}^{+}$ and $E_8+E_8$, and since these have automorphism groups SO(32)
and $E_8\times E_8$, respectively, one sees at once where the origin of these symmetry groups lies. The
further reduction from ten to four dimensions is, of course, another question. One of the more attractive
proposals is that the quotient, six-dimensional space, be a Calabi-Yau space (Green, Schwarz, and Witten,
1987; Yau, 1985), but we do not wish to pursue this question further here.

B. Gauge theory and noncommutative geometry

The recent development of noncommutative geometry by Connes (1994) has permitted the generalization of
gauge-theory ideas to the case in which the standard differential manifolds (Minkowskian, Euclidean,
Riemannian) become mixtures of differential and discrete manifolds. The differential operators then
become mixtures of ordinary differential operators and matrices. From the point of view of the fundamental
physical interactions, the interest in such a generalization of gauge theory is that the Higgs field and
its potential, which are normally introduced in an ad hoc manner, appear as part of the gauge-field
structure. Indeed the Higgs field emerges as the component of the gauge potential in the discrete
direction and the Higgs potential, like the self-interaction of the gauge field, emerges from the square
of the curvature. The theory also relates to Kaluza-Klein theory because the Higgs field and potential can
also be regarded as coming from a dimensional reduction in which the discrete direction in the gauge group
is reduced to an internal direction.

1. Simple example

To explain the idea in its simplest form we follow Connes (1994) and use as an example the simplest
nontrivial case, namely, when the continuous manifold is a four-dimensional compact Riemannian manifold
with gauge group U(1) and the discrete manifold consists of just two points. With respect to the new
discrete (two-point) direction the zero-forms (functions) $\omega_0(x)$ are taken to be diagonal
$2\times 2$ matrices with ordinary scalar functions as entries:

(74)

where $\Omega_0$ denotes the space of zero-forms. The essential new feature is the introduction of a
discrete component $d$ of the outer derivative. This is defined as a self-adjoint off-diagonal matrix,
i.e.,

(75)

with constant entries $k$. (More generally one could take the off-diagonal elements in $d$ to be
complex-conjugate bounded operators, but that will not be necessary for our purpose.) The outer derivative
of the zero-forms with respect to $d$ is obtained by commutation,

(76)

The noncommutativity enters in the fact that $d\wedge\omega_0$ does not commute with the forms in
$\Omega_0$. The one-forms $\omega_1$ are taken to be off-diagonal matrices,

(77)

where the $u(x)$'s are ordinary scalar functions and $\Omega_1$ denotes the space of one-forms. Note
that, according to Eq. (76), the discrete component of the outer derivative maps $\Omega_0$ into
$\Omega_1$. The outer derivative of a one-form with respect to $d$ is obtained by anticommutation. Thus

$d\wedge\omega_1\equiv\{d,\omega_1\}=[u_b(x)+u_a(x)]\,k\,I\in\Omega_0$, (78)

where $I$ is the unit $2\times 2$ matrix. It is easy to check that with this definition we have
$d\wedge d\wedge=0$ on both $\Omega$-spaces.

The U(1) gauge group is a zero-form and is the direct sum of the U(1) gauge groups on the two sectors of
the zero-forms. Thus it has elements of the form

$U(x)=\begin{pmatrix}e^{i\alpha(x)}&0\\0&e^{i\beta(x)}\end{pmatrix}\in U(1)$. (79)

Its action on both $\Omega_0$ and $\Omega_1$ is by conjugation. Thus under a gauge transformation the
zero-forms are invariant and the one-forms transform according to

$\omega_1(x)\to\omega_1^{\lambda}(x)=U^{-1}(x)\,\omega_1(x)\,U(x)$. (80)

Explicitly,

(81)

where $\lambda(x)=\beta(x)-\alpha(x)$.

The discrete component of a connection takes the form

(82)

and thus resembles a Hermitian one-form. But, being a connection, it is assumed to transform with respect
to $U(x)$ as

$V(x)\to V_{\lambda}(x)=U^{-1}(x)\,V(x)\,U(x)+e^{-1}U^{-1}(x)\,dU(x)$, (83)


where $e$ is a constant. The transformation law (83) is the natural extension of the conventional
transformation law for connection forms.

The discrete component of the covariant outer derivative is defined to be

$D=d+eV(x)=\cdots$ (84)

where

$\phi(x)=v(x)+c$, $c=k/e$. (85)

The outer derivative with $D$ is formed in the same way as with $d$, namely, by commutation and
anticommutation on the forms $\Omega_0$ and $\Omega_1$, respectively. From Eq. (83) it follows in the
usual way that $D$ transforms covariantly with respect to the U(1) gauge group, i.e.,

$D[\phi(x)]\to D[\phi_{\lambda}(x)]=U^{-1}(x)\,D[\phi(x)]\,U(x)$, (86)

where

$\phi_{\lambda}(x)=e^{i\lambda(x)}\phi(x)$. (87)

This is consistent with the fact that $D$ acts on the gauge group by commutation,

Note that, although the component $v(x)$ of the connection does not transform covariantly with respect to
U(1), the field $\phi(x)$ does. Since $\phi(x)$ is also a space-time scalar, it can therefore be
identified as a Higgs field. As we shall see, the fact that $\phi(x)$ rather than $v(x)$ is identified as
the Higgs field is of great importance for the Higgs potential.

Having defined the covariant derivative, we can proceed to construct the curvature. In an obvious notation
this can be written as

(88)

where $F_{\mu\nu}$ is the conventional curvature and

$F_{d\mu}=\partial_{\mu}V-d\wedge A_{\mu}+[A_{\mu},V]$ (89)

where $D_{\mu}$ is the conventional covariant derivative. The interesting component is $F_{dd}$, which
turns out to be

$F_{dd}=d\wedge V+eV^{2}$. (90)

The explicit form of Eq. (90) is easily computed to be

$F_{dd}=[k(v+v^{*})+e\,v v^{*}]\,I=e(|\phi|^{2}-c^{2})\,I$. (91)

Since it is $\phi(x)$ that must be identified as a Higgs field, the relationship between Eq. (91) and the
standard U(1) Higgs potential is obvious.

Before applying the above formalism to physics, however, we have to introduce fermion fields $\psi(x)$.
These are taken to be column vectors of ordinary fermions $\psi_a(x)$.

The action of the gauge group and the covariant derivative on them is by ordinary multiplication, i.e.,

(93)

and

(94)

respectively. As might be expected from the fact that the fermions are U(1) covariant, it is the
U(1)-covariant field $\phi(x)$, and not the component $v(x)$ of the connection, that couples to them in
Eq. (94).

2. Application to the standard model

As has already been mentioned, the immediate physical interest of the noncommutative gauge theory lies in
its application to the standard model of the fundamental interactions. The new feature is that it produces
the Higgs field and its potential as natural consequences of gauge theory, in contrast to ordinary field
theory in which they are introduced in an ad hoc phenomenological manner. The mechanism by which they are
produced is very like that used in Kaluza-Klein reduction so, to put the noncommutative mechanism into
perspective, let us first digress a little to recall the usual Kaluza-Klein mechanism.

a. The Kaluza-Klein mechanism

Consider the gauge-fermion Lagrangian density in $4+n$ dimensions, namely,

(95)

where $A,B=1\ldots 4+n$. If we let $\mu,\nu=0\ldots 3$ and $r,s=4\ldots n$ and assume that the fields do
not depend on the coordinates $x^{r}$, the Dirac operator and the curvature decompose into

$F_{AB}=\cdots$

and

$\gamma^{A}D_{A}=\gamma^{\mu}D_{\mu}+\gamma^{r}A_{r}$, (96)

respectively, and hence the Lagrangian (95) decomposes into

$L=-\tfrac{1}{4}\,\mathrm{Tr}(F_{\mu\nu})^{2}-\tfrac{1}{2}(D_{\mu}A_{r})^{2}+\bar\psi\gamma^{\mu}D_{\mu}\psi+\cdots$ (97)


The extra components $A_r$ of the gauge potential are space-time scalars and may therefore be identified
as Higgs fields. Thus the dimensional reduction produces a standard kinetic term, a standard Yukawa term,
and a potential for the Higgs fields. The problem is that the Higgs potential is not the one required for
the standard model. In particular, its minimum does not force $|A_r|$ to assume the fixed nonzero value
that is necessary to produce the masses of the gauge fields and leptons.

b. The noncommutative mechanism

As we shall now see, the noncommutative mechanism is very similar to the Kaluza-Klein mechanism. But it
eliminates the artificial assumption that the fields do not depend on the extra coordinates and it produces
a Higgs potential that is of the same form as those used in standard models. As in the Kaluza-Klein case,
the procedure is to start with the formal gauge-fermion Lagrangian (95) and expand around the conventional
four-dimensional gauge and fermion fields.

From the discussion of the previous section we see that if we expand the Dirac operator and the Yang-Mills
curvature in this way we obtain

(98)

respectively, where $g$ is a constant whose value cannot be fixed as the theory does not relate the scales
of $D_{\mu}$ and $D$. The resemblance between Eq. (98) and the corresponding Kaluza-Klein expression (96)
is striking.

It is clear from Eq. (98) that for the noncommutative case the formal Yang-Mills-fermion Lagrangian (95)
decomposes to

(99)

where

Since the field $\phi(x)$ is a scalar that transforms covariantly with respect to the U(1) gauge group it
may be interpreted as a Higgs field. Hence, in analogy with the Kaluza-Klein mechanism, the noncommutative
mechanism produces a standard kinetic term, a standard Yukawa term, and a potential for the Higgs field.
The difference lies in the form of the potential, which is no longer the square of a commutator. From
Eq. (91) we have

But this is just the renormalizable potential that is used to produce the spontaneous breakdown of U(1)
invariance. Putting all the new contributions together, we see that the introduction of the discrete
dimension and its associated gauge potential $\phi(x)$ produces exactly the extra terms

that describe the Higgs sector of the standard U(1) model. Thus, when the concept of manifold is
generalized in the manner dictated by noncommutative geometry, the standard Higgs sector emerges in a
natural way. Note, however, that since there are three undetermined parameters in Eq. (99), the
noncommutative approach does not achieve any new unification in the sense of reducing the number of
parameters. However, it considerably reduces the ranges of the parameters, places strong restrictions on
the matter-field representations, and even rules out the exceptional groups as gauge groups (Schücker,
1997).

Of course the above model is only a toy one, since it uses the gauge group U(1)$\times$U(1) rather than
the gauge groups U(2) and S[U(2)$\times$U(3)] of the standard electroweak and electroweak-strong models or
the gauge groups of grand unified theory.

However, the general structure provided by noncommutative geometry can be applied to any gauge group.
Connes himself (Connes, 1994) has applied it to the standard model. There is some difficulty in applying
it to grand unified theories because of the restrictions on fermion representations, but a modified
version has been applied to grand unified theories in the work of Chamseddine et al. (1993). As in the toy
model, the noncommutative approach does not achieve any new unification in the sense of reducing the
number of parameters, though, as already mentioned, it introduces some important restrictions. Most
importantly, it provides a new and interesting interpretation.

ACKNOWLEDGMENTS

We are indebted to C. N. Yang for important remarks which improved the paper. A number of people gave us
positive reactions and welcome comments. We thank, in particular, D. Giulini, F. Hehl, U. Lindström,
T. Schücker, and D. Vassilevich. Special thanks go to T. Damour for some pertinent remarks which led to
several improvements of the printed version.

REFERENCES

Appelquist, T., A. Chodos, and P. G. O. Freund, 1987, Modern Kaluza-Klein Theories (Addison-Wesley, London).
Audretsch, J., F. Gähler, and N. Straumann, 1984, Commun. Math. Phys. 95, 41.


Bargmann, V., 1932, Sitzungsber. K. Preuss. Akad. Wiss., Phys. Math. Kl. 346.
Bergmann, P. G., 1942, An Introduction to the Theory of Relativity (Prentice-Hall, New York), Chaps. XVII and XVIII.
Bergmann, P. G., 1968, Int. J. Theor. Phys. 1, 25.
Bleecker, D., 1981, Gauge Theory and Variational Principles (Addison-Wesley, London).
Brans, C. H., and R. H. Dicke, 1961, Phys. Rev. 124, 925.
Cartan, E., 1928, Leçons sur la Géométrie des Espaces de Riemann, 2nd ed. (Gauthier-Villars, Paris).
Case, C. M., 1957, Phys. Rev. 107, 307.
Chamseddine, A. H., G. Felder, and J. Fröhlich, 1993, Nucl. Phys. B 395, 672.
Chandrasekharan, K., 1986, Ed., Hermann Weyl, 1885-1985 (Springer, New York).
Connes, A., 1994, Noncommutative Geometry (Academic, New York).
Damour, T., and A. M. Polyakov, 1994, Nucl. Phys. B 423, 532.
Dicke, R. H., 1962, Phys. Rev. 125, 2163.
Einstein, A., 1987, The Collected Papers of Albert Einstein, edited by J. Stachel, D. C. Cassidy, and R. Schulmann (Princeton University, Princeton, NJ).
Feynman, R. P., 1995, Feynman Lectures on Gravitation, edited by B. Hatfield (Addison-Wesley).
Fierz, M., 1956, Helv. Phys. Acta 29, 128.
Fierz, M., 1999, private communication.
Fock, V., 1921, Z. Phys. 39, 226.
Fock, V., 1929, Z. Phys. 57, 261.
Green, M., J. Schwarz, and E. Witten, 1987, Theory of Strings and Superstrings (Cambridge University, Cambridge, England).
Gross, D., 1995, in The Oskar Klein Centenary Symposium, edited by U. Lindström (World Scientific, Singapore), p. 94.
Hoffmann, B., 1933, Phys. Rev. 37, 88.
Infeld, L., and B. L. van der Waerden, 1932, Sitzungsber. K. Preuss. Akad. Wiss., Phys. Math. Kl. 380 and 474.
Jordan, P., 1949, Nature (London) 164, 637.
Jordan, P., 1955, Schwerkraft und Weltall, 2nd ed. (Vieweg, Braunschweig).
Kaluza, Th., 1921, Sitzungsber. K. Preuss. Akad. Wiss., Phys. Math. Kl., 966; for an English translation see O'Raifeartaigh, 1997.
Klein, O., 1926a, Z. Phys. 37, 895; for an English translation see O'Raifeartaigh, 1997.
Klein, O., 1926b, Nature (London) 118, 516.
Klein, O., 1938, 1938 Conference on New Theories in Physics, held in Kasimierz, Poland; reproduced in O'Raifeartaigh, 1997.
Klein, O., 1956, in Proceedings of the Berne Congress, Helv. Phys. Acta, Suppl. IV, 58.
Kubyshin, Y. A., J. M. Mourão, G. Rudolph, and I. P. Volobujev, 1989, Dimensional Reduction of Gauge Theories, Spontaneous Compactification and Model Building, Springer Lecture Notes in Physics No. 349 (Springer, New York).
Lichnerowicz, A., 1955, Théories Relativistes de la Gravitation et de l'Électromagnétisme (Masson, Paris), Chap. 4.
London, F., 1927, Z. Phys. 42, 375.
Ludwig, G., 1951, Fortschritte der Projektiven Relativitätstheorie (Vieweg, Braunschweig).
Nordström, G., 1912, Phys. Z. 13, 1126.
Nordström, G., 1913a, Ann. Phys. (Leipzig) 40, 856.
Nordström, G., 1913b, Ann. Phys. (Leipzig) 42, 533.
Nordström, G., 1914, Phys. Z. 15, 504.
O'Raifeartaigh, L., 1997, The Dawning of Gauge Theory (Princeton University, Princeton, NJ).
Pais, A., 1953, Conference in Honour of H. A. Lorentz, Leiden 1953, Physica (Amsterdam) 19, 869.
Pais, A., 1982, Subtle is the Lord: The Science and Life of Albert Einstein (Oxford University, New York).
Pauli, W., 1919, Phys. Z. 20, 457.
Pauli, W., 1921, Relativitätstheorie, Encyklopädie der Mathematischen Wissenschaften (Leipzig, Teubner), Vol. 5.3, p. 539.
Pauli, W., 1933, Ann. Phys. (Leipzig) 18, 305.
Pauli, W., 1939, Helv. Phys. Acta 12, 147.
Pauli, W., 1958, Theory of Relativity (Pergamon, New York).
Pauli, W., 1979, Wissenschaftlicher Briefwechsel, Vol. I: 1919-1929 (Springer, Berlin), p. 505. (Translation of the letter by L. O'Raifeartaigh).
Pauli, W., 1999, Wissenschaftlicher Briefwechsel, Vol. IV, Part II (Springer, Berlin), Letters 1614 and 1682.
Polchinski, J., 1998, String Theory, Vols. I, II, Cambridge Monographs on Mathematical Physics (Cambridge University, Cambridge, England).
Raman, V., and P. Forman, 1969, Hist. Stud. Phys. Sci. 1, 291.
Schrödinger, E., 1922, Z. Phys. 12, 13.
Schrödinger, E., 1932, Sitzungsber. K. Preuss. Akad. Wiss., Phys. Math. Kl. 105.
Schrödinger, E., 1987, Schrödinger: Centenary Celebration of a Polymath, edited by C. Kilmister (Cambridge University, New York/Cambridge, England).
Schücker, T., 1997, Geometry and Forces, in Proceedings of the 1997 EMS Summer School on Noncommutative Geometry and Applications, Monsaraz and Lisbon, edited by P. Almeida, to appear (hep-th/9712095).
Seelig, C., 1960, Albert Einstein (Europa, Zurich), p. 274.
Steinhardt, P. J., 1993, Class. Quantum Grav. 10, 33.
Straumann, N., 1984, General Relativity and Relativistic Astrophysics, Texts and Monographs in Physics (Springer, Berlin).
Straumann, N., 1987, Phys. Bl. (Germany) 43 (11), 414.
Tetrode, H., 1928, Z. Phys. 50, 336.
Thiry, Y. R., 1948, C. R. Acad. Sci. 226, p. 216, 1881.
Thiry, Y. R., 1951, Thèse (Université de Paris).
Veblen, O., 1933, Projektive Relativitätstheorie (Springer, Berlin).
Wald, R. M., 1986, Phys. Rev. D 33, 3613.
Weinberg, S., 1965, Phys. Rev. 138, A988.
Weyl, H., 1918, Gravitation und Elektrizität, Sitzungsber. Deutsch. Akad. Wiss. Berlin, Klasse für ... pp. 465-480. See also H. Weyl, 1968, Gesammelte Abhandlungen, edited by K. Chandrasekharan (Springer, Berlin). An English translation is given in O'Raifeartaigh, 1997.
Weyl, H., 1922, Space, Time, Matter (Methuen, London, and Dover, New York). Translated from the 4th German Edition. [Raum, Zeit, Materie, 8. Auflage (Springer, Berlin, 1993)].
Weyl, H., 1929, Elektron und Gravitation. I, Z. Phys. 56, 330.
Weyl, H., 1946, Memorabilia, in Hermann Weyl, edited by K. Chandrasekharan (Springer, New York), p. 85.
Weyl, H., 1956, Selecta (Birkhäuser, Boston).
Weyl, H., 1968, Gesammelte Abhandlungen, edited by K. Chandrasekharan (Springer, Berlin), Vol. III, p. 229.


Weyl, H., 1980, in Hermann Weyl, edited by K. Chandrasekharan (Springer, Berlin), p. 85.
Weyl, H., 1981, Gruppentheorie und Quantenmechanik (Wissenschaftliche Buchgesellschaft, Darmstadt), Nachdruck der 2. Aufl., Leipzig 1931. [English translation: Group Theory and Quantum Mechanics, Dover, New York (1950)].
Wigner, E., 1929, Z. Phys. 53, 592.
Yang, C. N., 1980, Hermann Weyl's Contribution to Physics, in Hermann Weyl, 1885-1985, edited by K. Chandrasekharan (Springer, New York), p. 7.
Yang, C. N., 1983, Selected Papers 1945-1980 with Commentary (Freeman, San Francisco), p. 525.
Yau, S. T., 1985, Compact three-dimensional Kähler Manifolds with zero Ricci curvature, in Symposium on Anomalies, Geometry, and Topology, edited by W. Bardeen and A. White (World Scientific, Singapore), p. 395. See also references therein.


String theory as a generalization of gauge symmetry

Pei-Ming Ho

Department of Physics, National Taiwan University, Taipei, Taiwan, R.O.C.

pmho@phys.ntu.edu.tw

This is a brief review of the relation between string theory and usual gauge theories including
Einstein's gravity and Yang-Mills theory. In particular, we would like to explain how string theory
extends or generalizes ideas behind gauge symmetry and Einstein's general relativity. Most of the
material concerning this article can be found in [1, 2], if no further reference is provided.

1 There is gravity
Let us start with some basic facts about string theory. The first and foremost thing to say is that string
theory includes gravity. More importantly, string theory contains quantum gravity, and is the only
theory of quantum gravity which admits an ultraviolet-finite and unitary perturbation theory. On the
other hand, there is no consistent perturbation theory for the canonical quantization of Einstein's
theory of pure gravity.
Contrary to particles, for which it is extremely hard to find a well-behaved (causal, unitary,
renormalizable) interaction with the gravitons, strings are almost always accompanied by gravity. As each
excited state of a string can be reinterpreted as a particle, the massless spin-two excitation of a closed
string is identified with the graviton. This excitation exists in all five superstring theories (type I, type
IIA, type IIB, heterotic SO(32) and heterotic $E_8\times E_8$) as well as the bosonic string theory.
A string can have (infinitely) many vibration modes. The graviton is a massless spin-2 oscillation
mode of a closed string. Each oscillation mode should be matched with a certain particle in spacetime.
Each particle is associated with a vertex operator defined in the 1+1 dimensional quantum field
theory living on the string worldsheet. Interactions among particles are determined in string theory
by correlation functions of vertex operators. A Feynman diagram for a spacetime particle scattering
process corresponds to a path integral of the worldsheet theory. Unlike particle theory, where interaction
vertices and propagators are built in a Lagrangian, all ingredients of Feynman diagrams are fixed
in string theory by the definition of a free string.
There are two complementary approaches to see how gravitational interaction is determined in
string theory. The first approach is to compute the Feynman diagram of a scattering process involving
gravitons, in the flat, trivial background. From the result one can construct order by order a field
theory involving the metric to reproduce the scattering amplitudes.
The second approach is to consider a string propagating in a perturbative deformation of the flat
background. The spacetime metric appears in the kinetic term of the string worldsheet action. The
requirement of (quantum) conformal invariance then imposes a strong constraint on the metric. The
constraint involves the metric and its derivatives, and can be viewed as the equation of motion of the
metric. In this approach, the equation of motion for the spacetime metric is in fact a self-consistency
condition. This is a remarkable feature of string theory which is not unique to gravity. Dynamics and
kinematical constraints are unified in string theory.
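To make this consistency condition concrete, the block below writes the worldsheet action of a string in
a background metric and the leading-order condition for conformal invariance in their standard textbook
form. It is a sketch, not a formula quoted from this review: the dilaton is taken constant and all other
background fields are set to zero, and the symbols $\alpha'$ (which sets the string tension), $h_{ab}$
(the worldsheet metric) and $\beta^{G}_{\mu\nu}$ (the beta function of the metric coupling) are notation
assumed here.

```latex
% Worldsheet (nonlinear sigma model) action in a curved background G_{\mu\nu}(X)
S = \frac{1}{4\pi\alpha'} \int d^2\sigma \, \sqrt{h}\, h^{ab}\,
    G_{\mu\nu}(X)\, \partial_a X^\mu\, \partial_b X^\nu ,
% Quantum conformal invariance requires the beta function of G_{\mu\nu} to vanish
\beta^{G}_{\mu\nu} = \alpha' R_{\mu\nu} + O(\alpha'^2) = 0 .
```

At leading order in $\alpha'$ this is the vacuum Einstein equation $R_{\mu\nu}=0$; the $O(\alpha'^2)$
terms are the corrections that become important near the string scale.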
Of course, the results of the two approaches agree with each other. They also agree with the
Einstein equation at low energies for weak coupling (Newton) constant. At high energies (compared
with the energy scale of the string tension), string theory modifies Einstein's theory, but with general
covariance intact.

2 There is a bigger gauge symmetry


In addition to the graviton, strings have other oscillation modes including many other gauge fields. For
open strings, there is a massless spin-1 oscillation mode corresponding to a gauge field in spacetime.
Non-Abelian gauge symmetry also arises in string theory in various ways. One possibility is to
associate a so-called Chan-Paton factor to each endpoint of an open string. The massless spin-1
oscillation mode is then labelled by two indices, and the corresponding spacetime gauge field is promoted
to a matrix. The result is that the U(1) gauge symmetry is enhanced to a non-Abelian gauge
symmetry. This happens for open strings in type I string theory, and the gauge group has to be SO(32)
for self-consistency. Another possibility is to add currents corresponding to a global symmetry on
the string worldsheet, or to compactify the spacetime in a particular way so that new massless spin-1
oscillation modes appear.
Similarly, when there are multiple coincident D-branes, an open string stretched
between two D-branes naturally acquires labels specifying the two D-branes. For N coincident D-branes,
there is a U(N) gauge symmetry on the D-brane worldvolume.
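The counting behind these statements can be sketched in a few lines. The helper below is hypothetical and
only counts labels: an oriented open string with endpoint labels (i, j) running over n values gives n*n
massless vectors, the dimension of U(n), while for unoriented strings it is assumed that only the
combinations antisymmetric under i <-> j survive, which gives the dimension of SO(n).

```python
from itertools import product


def chan_paton_states(n: int, oriented: bool = True) -> int:
    """Count massless spin-1 states of an open string whose two endpoints
    each carry a label in 0..n-1 (illustrative helper, not from the text)."""
    labels = list(product(range(n), repeat=2))  # ordered pairs (i, j)
    if oriented:
        return len(labels)  # n*n states: adjoint of U(n)
    # Unoriented case: keep one state per antisymmetric pair i < j,
    # which is the adjoint of SO(n).
    return sum(1 for i, j in labels if i < j)


print(chan_paton_states(3))                   # 9   = dim U(3)
print(chan_paton_states(32, oriented=False))  # 496 = dim SO(32)
print(2 * 248)                                # 496 = dim E8 x E8
```

The last two lines reproduce the coincidence that SO(32) and $E_8\times E_8$ have the same dimension, 496.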
In addition to the massless spin-1 oscillation mode, there are (infinitely) many other oscillation
modes corresponding to gauge fields of higher spins. It is believed that these gauge fields are massive
due to a spontaneous symmetry breaking mechanism, and the symmetry may be restored in the high
energy limit. String theory extends general covariance and usual gauge symmetry to a much larger symmetry.
There are higher spin gauge fields with arbitrarily many Lorentz indices. Furthermore, as the symmetry is
so highly intertwined, it is tempting to conjecture that its self-consistency uniquely fixes the dynamics
of the whole theory.
Witten's cubic string field theory [5] manifestly exhibits the gauge symmetry of bosonic open
string theory. The action of the cubic string field theory is

$S=\int\left[\Psi*(Q\Psi)+\tfrac{1}{3}\,\Psi*\Psi*\Psi\right]$

where $\Psi[X(\sigma)]$ is the string wave function (a state of the string worldsheet theory in the formulation
using BRST quantization) and $Q$ is the BRST charge for conformal symmetry. The product labelled
by $*$ defines an associative algebra for the string states, which essentially tells us how to glue two
strings together into one string. This action has the gauge symmetry

We see from this formula that the string wave function is itself a gauge field for a huge gauge symmetry.

The string wave function $\Psi$ can be represented as a generic state in the Hilbert space of the string
worldsheet theory in BRST quantization. In terms of an expansion of creation operators,

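$\Psi=\int d^{d}k\,\big[\phi(k)+iA_{\mu}(k)\,\alpha^{\mu}_{-1}+\alpha(k)\,b_{-1}c_{0}+iB_{\mu}(k)\,\alpha^{\mu}_{-2}+B_{\mu\nu}(k)\,\alpha^{\mu}_{-1}\alpha^{\nu}_{-1}$
$\qquad+\beta_{0}(k)\,b_{-2}c_{0}+\beta_{1}(k)\,b_{-1}c_{-1}+i\beta_{\mu}(k)\,\alpha^{\mu}_{-1}b_{-1}c_{0}+\ldots\big]\,c_{1}|0;k\rangle$, (3)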

where $\alpha^{\mu}_{n}$ are the bosonic operators representing fluctuations of the spacetime coordinate $X^{\mu}$, and $c_n$,
$b_n$ belong to the ghost sector. The coefficients $\phi(k)$, $A_{\mu}(k)$, $\alpha(k)$, $B_{\mu\nu}(k)$, etc. are Fourier transforms of
spacetime fields. Their gauge transformations are given by

1
sp, = -(a2-2);,
2
1
bpo = -(LJ2-2)1,
2

Obviously, $A_{\mu}$ is the massless spin-1 gauge field mentioned above. $B_{\mu\nu}$ is a symmetric rank-2 gauge
potential. At higher mass levels one finds gauge fields at higher ranks. Except for $A_{\mu}$, the other gauge
fields are massive, signaling a spontaneous symmetry breakdown.
The physical meaning of the huge gauge symmetry of string theory has not yet been thoroughly
explored. People also suspect that there is another huge hidden (global) symmetry which is spontaneously
broken, and that one should study the high energy limit where the symmetry may be restored
[6]. The 2 dimensional string theory is the best understood toy model of strings. It has the $w_\infty$
symmetry and the symmetry algebra is strong enough to determine all scattering amplitudes. It has
been conjectured by Gross and his collaborators [6] that the full symmetry may uniquely determine
the dynamics for higher dimensional strings as well. It is possible that the huge global symmetry has
the same origin as the huge gauge symmetry of string theory, or that its existence is a necessity for
self-consistency due to the higher spin gauge symmetries [7].
While the general covariance of general relativity is embedded in a bigger gauge symmetry, we do
not seem to fully comprehend the physical notion which generalizes the equivalence principle. What
are the gedanken experiments analogous to the elevator in free fall?

3 Geometry is induced
Before the advent of string theory, spacetime was put in by hand as a stage on which particles interact
and events take place. In string theory, spacetime can be a derived notion. From the viewpoint of the
string worldsheet action, the spacetime coordinates of the string are scalar fields on the worldsheet.
Properties of spacetime are determined by properties of these scalar fields. Of course one can also
view the spacetime coordinates of a point particle as scalar fields on the worldline. The crucial difference
is that while string worldsheet theory determines the dynamics of spacetime geometry, the particle
worldline theory does not.
As spacetime is a derived notion, the geometrical structure of spacetime is something that should
be extracted from the theory. We have already mentioned above how the dynamics of the spacetime
metric is determined in string theory, yet Riemannian geometry is not the only geometrical structure
the spacetime can possess.
Noncommutative geometry is a mathematical notion that generalizes classical geometry [8]. A classical
manifold is commutative, that is, the algebra of functions on the manifold is a commutative
algebra. Mathematicians noticed that, although traditionally one visualizes a manifold as a collection
of points (with some topology), the algebra of functions provides an alternative equivalent description
to some extent. Hence it is natural to relax the definition of geometry to allow the algebra of functions
to be noncommutative. Such a space no longer admits the picture of a set of points, but various
geometric structures and quantities can be defined.
In a field theory, the base space can be made noncommutative by imposing nontrivial commutation
relations for the coordinates. In string theory, the commutativity of spacetime coordinates depends
on the quantization of the scalar fields on the worldsheet. In a suitable background, noncommutative
geometry arises automatically.
In string theory, there are solitonic objects called D-branes. They are submanifolds of spacetime
on which open strings end. Turning on a background field called the NS-NS B-field, one can quantize
the open string and find that the coordinates of the endpoints are noncommutative [9], with the
noncommutativity depending on the B-field background. The D-brane field theory is then conveniently
defined as a noncommutative field theory [10].
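For a constant B-field the noncommutativity described above is usually written in the following standard
way. This is a sketch rather than a formula quoted from this review: the constant antisymmetric matrix
$\theta^{ij}$ (fixed by the B-field background) and the star product $\star$ are notation assumed here.

```latex
% Endpoint coordinates no longer commute
[x^i, x^j] = i\,\theta^{ij},
% Products of fields on the D-brane are replaced by the Moyal star product
(f \star g)(x) = f(x)\,
  \exp\!\Big(\tfrac{i}{2}\,\overleftarrow{\partial_i}\,\theta^{ij}\,
  \overrightarrow{\partial_j}\Big)\, g(x)
  = f g + \tfrac{i}{2}\,\theta^{ij}\,\partial_i f\,\partial_j g + O(\theta^2),
% The coordinate functions then reproduce the commutator above
x^i \star x^j - x^j \star x^i = i\,\theta^{ij}.
```

The last line follows because $x^i\star x^j=x^ix^j+\tfrac{i}{2}\theta^{ij}$ exactly, and $\theta^{ij}$ is
antisymmetric.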
Another way to see the noncommutativity of a D-brane is to consider scattering amplitudes of open
strings [12]. In other words, one probes the geometric properties of the D-brane worldvolume using open
strings. Due to the B-field background, the interactions at low energies are most conveniently described
by a field theory living on a noncommutative space. With other background fields (graviphoton) turned
on, the spacetime itself can also be noncommutative [13].
In addition to noncommutative geometry, there might be more exotic geometrical notions which
we can learn from string theory. In general, one probes the spacetime via strings. The geometry
of spacetime is an induced notion derived from decoding string interactions in a certain way. It is
conceivable that there can be ambiguity in the decoding process, corresponding to the freedom of
making a change of variables (field redefinition). For example, a noncommutative field theory can be
reinterpreted as a commutative field theory with higher derivative interactions [11]. We will make
more comments below on other ambiguities in defining spacetime properties.
Matrix models provide another way to demonstrate the idea. The BFSS model [14] and the IKKT
model [15] are conjectured to be equivalent descriptions of string theory. Spacetime coordinates correspond
to $N \times N$ matrices with $N \to \infty$. As matrices the coordinates are generically noncommutative.
Interactions in the models would make it impossible to have large extended dimensions in spacetime
if there were no supersymmetry to guarantee flat directions in the moduli space. The large scale,
(roughly) commutative spacetime around us is not given to us without warrant. The choice of the
effective spacetime dimension by our universe can be translated into questions about the free energy
of the theory [16].
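
The statement that matrix-valued coordinates are generically noncommutative can be made concrete with a small example. The sketch below builds the standard "clock" and "shift" matrices, which obey $UV = \omega VU$ with $\omega = e^{2\pi i/N}$. This pair is chosen purely for illustration and is not the BFSS or IKKT model itself; all function names are hypothetical.

```python
import cmath

def clock_and_shift(n):
    """Return the n x n clock matrix U, the shift matrix V and omega = exp(2*pi*i/n)."""
    omega = cmath.exp(2j * cmath.pi / n)
    U = [[omega ** i if i == j else 0 for j in range(n)] for i in range(n)]
    V = [[1 if i == (j + 1) % n else 0 for j in range(n)] for i in range(n)]
    return U, V, omega

def matmul(a, b):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def max_abs_diff(a, b, scale=1):
    """Largest entry of |a - scale * b|."""
    return max(abs(x - scale * y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

U, V, omega = clock_and_shift(5)
UV, VU = matmul(U, V), matmul(V, U)
# UV and VU differ, but only by the phase omega.
print(max_abs_diff(UV, VU), max_abs_diff(UV, VU, omega))
```

As $N$ grows the phase $\omega$ approaches 1, so the two matrices commute approximately; this is one simple way in which a nearly commutative geometry can emerge from matrices.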
T-duality (including Mirror symmetry) is another example of the ambiguity in extracting the
spacetime structure from string theory. The simplest setting of T-duality has one spatial
direction compactified on a circle. Let the radius be denoted R. It turns out that this theory is
equivalent to another string theory with a dual spatial direction compactified on a circle of radius 1/R in
string units. The two theories dual to each other are different descriptions of the same physical system.
The same physical state can have totally different descriptions in the two theories. For instance, a
string winding on a circle can be matched with a Kaluza-Klein mode (momentum eigenstate) in the
dual theory. But a state is always matched with another state with the same energy. Physicists trying
to describe a physical process can adopt either one of the two theories and its language. They may
sound very different in words, but the two theories agree when it comes to the prediction of the result
of a measurement in terms of pure numbers.
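
The matching of winding and Kaluza-Klein states can be checked on the spectrum. The sketch below assumes the textbook mass formula for a closed bosonic string on a circle in units with $\alpha' = 1$, $M^2 = (n/R)^2 + (wR)^2 + 2(N + \tilde N - 2)$, where $n$ is the momentum number, $w$ the winding number and $N + \tilde N$ the total oscillator level. The formula and the function names are assumptions brought in for illustration; they are not written out in the text.

```python
from fractions import Fraction

def mass_squared(n, w, radius, osc_level):
    """Mass squared of a closed bosonic string state on a circle (alpha' = 1 assumed).

    n: Kaluza-Klein momentum number, w: winding number,
    radius: circle radius in string units, osc_level: N + N-tilde.
    """
    return (Fraction(n) / radius) ** 2 + (w * radius) ** 2 + 2 * (osc_level - 2)

def t_dual(n, w, radius):
    """T-duality map: exchange momentum and winding numbers and invert the radius."""
    return w, n, 1 / radius

R = Fraction(3, 2)
# A string wound once at radius R has the same mass as a unit-momentum
# (Kaluza-Klein) state at radius 1/R.
print(mass_squared(0, 1, R, 2), mass_squared(1, 0, 1 / R, 2))
```

Exact rational arithmetic is used so that the equality of the two spectra can be tested without floating-point tolerance.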
In general, shapes and topology of spacetime can be different between equivalent theories. Even
the spacetime dimension is not independent of the choice of description/theory. Superstrings in 10
dimensions are believed to be equivalent to M-theory in 11 dimensions. Physics is not only
observer-dependent, as it already is in Einstein's theory, but is now also theory-dependent. In fact,
all five superstring theories are believed to be dual to one another, and also to (a quantum version of)
the 11 dimensional theory of supergravity.

4 Summary
To summarize, gauge symmetry is generalized to include gauge fields of arbitrary spins in string theory.
Einstein's theory of spacetime is also extended to more general geometric structures. The notion that
physics is observer-dependent, which we first learned from the theory of relativity, is magnified to
the notion that physics is theory-dependent. It is also believed that Yang's inspiring saying that
"symmetry dictates interaction" is fully honoured by string theory.
An obvious difference between string theory and general relativity or Yang-Mills theory is that the
discovery of the latter was motivated by beautiful theoretical notions on symmetry and geometry, but
the discovery of string theory was an accident. What is the symmetry principle of string theory? This
is one of the most important problems of string theory.
We have so far avoided discussions of holography and the cosmological constant in this article. The
holographic principle is argued to be a salient feature of quantum gravity [17]. In string theory we have
seen remarkable evidence of it [18]. It is generally viewed to be of utmost importance for quantum
gravity, but so far our understanding of it remains at a technical level. On the other hand, our
understanding of the cosmological constant problem is not even at the technical level. It might be that the
secrets of holography and the cosmological constant are deeply hidden inside string theory waiting to be
discovered, and their revelation will bring us a drastic conceptual breakthrough so that string theory
will no longer be called string theory.

Acknowledgment
This work is supported in part by the National Science Council and National Center for Theoretical
Sciences, Taiwan, R.O.C. and the Center for Theoretical Physics at National Taiwan University.

References

[1] M. B. Green, J. H. Schwarz and E. Witten, Superstring Theory. Vol. 1: Introduction, Cambridge University Press; M. B. Green, J. H. Schwarz and E. Witten, Superstring Theory. Vol. 2: Loop Amplitudes, Anomalies And Phenomenology, Cambridge University Press.

[2] J. Polchinski, String theory. Vol. 1: An introduction to the bosonic string, Cambridge University Press; J. Polchinski, String theory. Vol. 2: Superstring theory and beyond, Cambridge University Press.

[3] See, for instance, J. Isberg, U. Lindstrom, B. Sundborg and G. Theodoridis, Classical and quantized tensionless strings, Nucl. Phys. B 411, 122 (1994) [arXiv:hep-th/9307108]; B. Sundborg, Stringy gravity, interacting tensionless strings and massless higher spins, Nucl. Phys. Proc. Suppl. 102, 113 (2001) [arXiv:hep-th/0103247]; E. Sezgin and P. Sundell, Massless higher spins and holography, Nucl. Phys. B 644, 303 (2002) [Erratum-ibid. B 660, 403 (2003)] [arXiv:hep-th/0205131]; C. S. Chu, P. M. Ho and F. L. Lin, Cubic string field theory in pp-wave background and background independent Moyal structure, JHEP 0209, 003 (2002) [arXiv:hep-th/0205218].

[4] See, for instance, C. Fronsdal, Massless Fields With Integer Spin, Phys. Rev. D 18, 3624 (1978); J. Fang and C. Fronsdal, Massless Fields With Half Integral Spin, Phys. Rev. D 18, 3630 (1978); M. A. Vasiliev, Progress in higher spin gauge theories, Prepared for 9th Marcel Grossmann Meeting on Recent Developments in Theoretical and Experimental General Relativity, Gravitation and Relativistic Field Theories (MG 9), Rome, Italy, 2-9 Jul 2000; M. A. Vasiliev, Higher spin gauge theories in various dimensions, Fortsch. Phys. 52, 702 (2004) [arXiv:hep-th/0401177].

[5] E. Witten, Noncommutative Geometry And String Field Theory, Nucl. Phys. B 268, 253 (1986).

[6] D. J. Gross and P. Mende, Phys. Lett. B 197, 129 (1987); Nucl. Phys. B 303, 407 (1988); D. J. Gross, High energy symmetry of string theory, Phys. Rev. Lett. 60, 1229 (1988); Phil. Trans. R. Soc. Lond. A 329, 401 (1989); D. J. Gross and J. L. Manes, The high energy behavior of open string theory, Nucl. Phys. B 326, 73 (1989). See section 6 for details.


[7] C. T. Chan, P. M. Ho and J. C. Lee, Ward identities and high-energy scattering amplitudes in string theory, Nucl. Phys. B 708, 99 (2005) [arXiv:hep-th/0410194].

[8] A. Connes, Noncommutative Geometry, Academic Press.

[9] C. S. Chu and P. M. Ho, Noncommutative open string and D-brane, Nucl. Phys. B 550, 151 (1999) [arXiv:hep-th/9812219].

[10] A. Connes, M. R. Douglas and A. Schwarz, Noncommutative geometry and matrix theory: Compactification on tori, JHEP 9802, 003 (1998) [arXiv:hep-th/9711162].

[11] N. Seiberg and E. Witten, String theory and noncommutative geometry, JHEP 9909, 032 (1999) [arXiv:hep-th/9908142].

[12] V. Schomerus, D-branes and deformation quantization, JHEP 9906, 030 (1999) [arXiv:hep-th/9903205].

[13] H. Ooguri and C. Vafa, The C-deformation of gluino and non-planar diagrams, Adv. Theor. Math. Phys. 7, 53 (2003) [arXiv:hep-th/0302109]; N. Seiberg, Noncommutative superspace, N = 1/2 supersymmetry, field theory and string theory, JHEP 0306, 010 (2003) [arXiv:hep-th/0305248].

[14] T. Banks, W. Fischler, S. H. Shenker and L. Susskind, M theory as a matrix model: A conjecture, Phys. Rev. D 55, 5112 (1997) [arXiv:hep-th/9610043].

[15] N. Ishibashi, H. Kawai, Y. Kitazawa and A. Tsuchiya, A large-N reduced model as superstring, Nucl. Phys. B 498, 467 (1997) [arXiv:hep-th/9612115].

[16] H. Aoki, S. Iso, H. Kawai, Y. Kitazawa and T. Tada, Space-time structures from IIB matrix model, Prog. Theor. Phys. 99, 713 (1998) [arXiv:hep-th/9802085].

[17] G. 't Hooft, Dimensional reduction in quantum gravity, [arXiv:gr-qc/9310026]; L. Susskind, The World as a hologram, J. Math. Phys. 36, 6377 (1995) [arXiv:hep-th/9409089].

[18] J. M. Maldacena, The large N limit of superconformal field theories and supergravity, Adv. Theor. Math. Phys. 2, 231 (1998) [Int. J. Theor. Phys. 38, 1113 (1999)] [arXiv:hep-th/9711200]; O. Aharony, S. S. Gubser, J. M. Maldacena, H. Ooguri and Y. Oz, Large N field theories, string theory and gravity, Phys. Rept. 323, 183 (2000) [arXiv:hep-th/9905111].

The cosmological constant problem*

Steven Weinberg
Theory Group, Department of Physics, University of Texas, Austin, Texas 78712

Astronomical observations indicate that the cosmological constant is many orders of magnitude smaller
than estimated in modern theories of elementary particles. After a brief review of the history of this
problem, five different approaches to its solution are described.

Reviews of Modern Physics, Vol. 61, No. 1, January 1989. Copyright ©1988 The American Physical Society.

*Morris Loeb Lectures in Physics, Harvard University, May 2, 3, 5, and 10, 1988.

CONTENTS

I. Introduction
II. Early History
III. The Problem
IV. Supersymmetry, Supergravity, Superstrings
V. Anthropic Considerations
   A. Mass density
   B. Ages
   C. Number counts
VI. Adjustment Mechanisms
VII. Changing Gravity
VIII. Quantum Cosmology
IX. Outlook
Acknowledgments
References

As I was going up the stair,
I met a man who wasn't there.
He wasn't there again today,
I wish, I wish he'd stay away.
    Hughes Mearns

I. INTRODUCTION

Physics thrives on crisis. We all recall the great progress made while finding a way out of various crises of the past: the failure to detect a motion of the Earth through the ether, the discovery of the continuous spectrum of beta decay, the $\tau$-$\theta$ problem, the ultraviolet divergences in electromagnetic and then weak interactions, and so on. Unfortunately, we have run short of crises lately. The standard model of electroweak and strong interactions currently faces neither internal inconsistencies nor conflicts with experiment. It has plenty of loose ends; we know no reason why the quarks and leptons should have the masses they have, but then we know no reason why they should not.

Perhaps it is for want of other crises to worry about that interest is increasingly centered on one veritable crisis: theoretical expectations for the cosmological constant exceed observational limits by some 120 orders of magnitude.¹ In these lectures I will first review the history of this problem and then survey the various attempts that have been made at a solution.

¹For a good nonmathematical description of the cosmological constant problem, see Abbott (1988).

II. EARLY HISTORY

After completing his formulation of general relativity in 1915-1916, Einstein (1917) attempted to apply his new theory to the whole universe. His guiding principle was that the universe is static: "The most important fact that we draw from experience is that the relative velocities of the stars are very small as compared with the velocity of light." No such static solution of his original equations could be found (any more than for Newtonian gravitation), so he modified them by adding a new term involving a free parameter $\lambda$, the cosmological constant:²

$$R_{\mu\nu} - \tfrac{1}{2} g_{\mu\nu} R - \lambda g_{\mu\nu} = -8\pi G\, T_{\mu\nu} . \tag{2.1}$$

Now, for $\lambda > 0$, there was a static solution for a universe filled with dust of zero pressure and mass density

$$\rho = \frac{\lambda}{8\pi G} . \tag{2.2}$$

Its geometry was that of a sphere $S_3$, with proper circumference $2\pi r$, where

$$r = 1/\sqrt{8\pi\rho G} , \tag{2.3}$$

so the mass of the universe was

$$M = 2\pi^2 r^3 \rho = \frac{\pi}{4}\, \lambda^{-1/2} G^{-1} . \tag{2.4}$$

²The notation used here for metrics, curvatures, etc., is the same as in Weinberg (1972).

In some popular history accounts, it was Hubble's discovery of the expansion of the universe that led Einstein to retract his proposal of a cosmological constant. The real story is more complicated, and more interesting.

One disappointment came almost immediately. Einstein had been pleased at the connection in his model between the mass density of the universe and its geometry, because, following Mach's lead, he expected that the mass distribution of the universe should set inertial frames. It was therefore unpleasant when his friend de Sitter, with whom Einstein remained in touch during the war, in 1917 proposed another apparently static cosmological model with no matter at all. (See de Sitter, 1917.) Its line element (using the same coordinate system as de Sitter, but in a different notation) was

$$d\tau^2 = \frac{1}{\cosh^2 Hr} \left[ dt^2 - dr^2 - H^{-2} \tanh^2 Hr \, ( d\theta^2 + \sin^2\theta \, d\varphi^2 ) \right] , \tag{2.5}$$

with $H$ related to the cosmological constant by

$$H = \sqrt{\lambda/3} \tag{2.6}$$

and $\rho = p = 0$. Clearly matter was not needed to produce inertia.

At about this time, the redshift of distant objects was being discovered by Slipher. Over the period from 1910 to the mid-1920s, Slipher (1924) observed that galaxies (or, as then known, spiral nebulae) have redshifts $z = \Delta\lambda/\lambda$ ranging up to 6%, and only a few have blueshifts. Weyl pointed out in 1923 that de Sitter's model would exhibit such a redshift, increasing with distance, because although the metric in de Sitter's coordinate system is time independent, test bodies are not at rest; there is a nonvanishing component of the affine connection

$$\Gamma^{r}_{tt} = -H \sinh Hr \, \tanh Hr \tag{2.7}$$

giving a redshift proportional to distance

$$z \simeq Hr \quad \text{for } Hr \ll 1 . \tag{2.8}$$

In his influential textbook, Eddington (1924) interpreted Slipher's redshifts in terms of de Sitter's "static" universe.

But of course, although the cosmological constant was needed for a static universe, it was not needed for an expanding one. Already in 1922, Friedmann (1924) had described a class of cosmological models, with line element (in modern notation)

$$d\tau^2 = dt^2 - R^2(t) \left[ \frac{dr^2}{1 - kr^2} + r^2 ( d\theta^2 + \sin^2\theta \, d\varphi^2 ) \right] . \tag{2.9}$$

These are comoving coordinates; the universe expands or contracts as $R(t)$ increases or decreases, but the galaxies keep fixed coordinates $r, \theta, \varphi$. The motion of the cosmic scale factor is governed by an energy-conservation equation

$$\left( \frac{dR}{dt} \right)^2 = -k + \tfrac{1}{3} R^2 ( 8\pi G \rho + \lambda ) . \tag{2.10}$$

The de Sitter model is just the special case with $k = 0$ and $\rho = 0$; in order to put the line element (2.5) in the more general form (2.9), it is necessary to introduce new coordinates,

$$t' = t - H^{-1} \ln\cosh Hr , \qquad r' = H^{-1} \exp(-Ht) \sinh Hr , \qquad \theta' = \theta , \quad \varphi' = \varphi , \tag{2.11}$$

and then drop the primes. However, we can also easily find expanding solutions with $\lambda = 0$ and $\rho > 0$. Pais (1982) quotes a 1923 letter of Einstein to Weyl, giving his reaction to the discovery of the expansion of the universe: "If there is no quasi-static world, then away with the cosmological term!"

III. THE PROBLEM

Unfortunately, it was not so easy simply to drop the cosmological constant, because anything that contributes to the energy density of the vacuum acts just like a cosmological constant. Lorentz invariance tells us that in the vacuum the energy-momentum tensor must take the form

$$\langle T_{\mu\nu} \rangle = -\langle \rho \rangle g_{\mu\nu} . \tag{3.1}$$

(A minus sign appears here because we use a metric which for flat space-time has $g_{00} = -1$.) Inspection of Eq. (2.1) shows that this has the same effect as adding a term $8\pi G \langle\rho\rangle$ to the effective cosmological constant

$$\lambda_{\rm eff} = \lambda + 8\pi G \langle\rho\rangle . \tag{3.2}$$

Equivalently we can say that the Einstein cosmological constant contributes a term $\lambda/8\pi G$ to the total effective vacuum energy

$$\rho_V = \langle\rho\rangle + \lambda/8\pi G = \lambda_{\rm eff}/8\pi G . \tag{3.3}$$

A crude experimental upper bound on $\lambda_{\rm eff}$ or $\rho_V$ is provided by measurements of cosmological redshifts as a function of distance, the program begun by Hubble in the late 1920s. The present expansion rate is today estimated as

$$\left[ \frac{1}{R} \frac{dR}{dt} \right]_{\rm now} \equiv H_0 = 50\text{-}100 \ {\rm km/sec\,Mpc} = ( \tfrac{1}{2}\text{-}1 ) \times 10^{-10}/{\rm yr} .$$

Furthermore, we do not see gross effects of the curvature of the universe, so very roughly

$$|k| / R^2_{\rm now} \lesssim H_0^2 .$$

Finally, the ordinary nonvacuum mass density of the universe is not much greater than its critical value

$$| \rho - \langle\rho\rangle | \lesssim 3 H_0^2 / 8\pi G .$$

Hence (2.10) shows that

$$| \lambda_{\rm eff} | \lesssim H_0^2$$

or, in physicists' units,

$$| \rho_V | \lesssim 10^{-29} \ {\rm g/cm^3} \approx 10^{-47} \ {\rm GeV}^4 . \tag{3.4}$$

A more precise observational bound will be discussed in Sec. V, but this one will be good enough for our present purposes.

As everyone knows, the trouble with this is that the energy density $\langle\rho\rangle$ of empty space is likely to be enormously larger than $10^{-47}$ GeV$^4$. For one thing, summing the zero-point energies of all normal modes of some field of mass $m$ up to a wave number cutoff $\Lambda \gg m$ yields a vacuum energy density (with $\hbar = c = 1$)

$$\langle\rho\rangle = \int_0^{\Lambda} \frac{4\pi k^2 \, dk}{(2\pi)^3} \, \tfrac{1}{2} \sqrt{k^2 + m^2} \simeq \frac{\Lambda^4}{16\pi^2} . \tag{3.5}$$
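
The orders of magnitude in this section can be checked numerically. The sketch below is only an illustration: the physical constants are standard values that are assumed here rather than taken from the text, and the function names are hypothetical. It evaluates the density scale $3H_0^2/8\pi G$ behind the bound (3.4), integrates the zero-point sum numerically to compare it with $\Lambda^4/16\pi^2$, and estimates the mismatch for a cutoff at $(8\pi G)^{-1/2}$.

```python
import math

# Assumed standard constants (not given in the text).
G_NEWTON_CGS = 6.674e-8        # Newton's constant in cm^3 g^-1 s^-2
KM_PER_MPC = 3.0857e19         # kilometres in one megaparsec
G_PER_CM3_PER_GEV4 = 2.32e17   # 1 GeV^4 (hbar = c = 1) expressed in g/cm^3
PLANCK_MASS_GEV = 1.221e19     # G = 1 / PLANCK_MASS_GEV**2 in natural units

def density_bound_gev4(h0_km_s_mpc):
    """3 H0^2 / (8 pi G), the density scale of the bound (3.4), in GeV^4."""
    h0 = h0_km_s_mpc / KM_PER_MPC                           # Hubble rate in s^-1
    rho_cgs = 3 * h0 ** 2 / (8 * math.pi * G_NEWTON_CGS)    # g/cm^3
    return rho_cgs / G_PER_CM3_PER_GEV4

def zero_point_density(cutoff, mass, steps=2000):
    """Simpson-rule value of int_0^cutoff 4 pi k^2 dk / (2 pi)^3 * sqrt(k^2 + m^2) / 2.

    steps must be even for Simpson's rule."""
    def integrand(k):
        return k * k * math.sqrt(k * k + mass * mass) / (4 * math.pi ** 2)
    h = cutoff / steps
    total = integrand(0.0) + integrand(cutoff)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3

def leading_term(cutoff):
    """The large-cutoff estimate cutoff^4 / (16 pi^2)."""
    return cutoff ** 4 / (16 * math.pi ** 2)

# Cutoff at (8 pi G)^(-1/2), then the mismatch in powers of ten.
planck_cutoff = PLANCK_MASS_GEV / math.sqrt(8 * math.pi)
orders = math.log10(leading_term(planck_cutoff) / density_bound_gev4(100.0))
print(density_bound_gev4(100.0), zero_point_density(1.0, 0.01) / leading_term(1.0), orders)
```

With these assumed inputs the mismatch comes out well above one hundred orders of magnitude, in line with the "some 120 orders of magnitude" quoted in the Introduction; the exact figure depends on the value adopted for $H_0$.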


If we believe general relativity up to the Planck scale, then we might take $\Lambda \simeq (8\pi G)^{-1/2}$, which would give

$$\langle\rho\rangle \approx 2^{-10} \pi^{-4} G^{-2} = 2 \times 10^{71} \ {\rm GeV}^4 . \tag{3.6}$$

But we saw that $| \langle\rho\rangle + \lambda/8\pi G |$ is less than about $10^{-47}$ GeV$^4$, so the two terms here must cancel to better than 118 decimal places. Even if we only worry about zero-point energies in quantum chromodynamics, we would expect $\langle\rho\rangle$ to be of order $\Lambda^4_{\rm QCD}/16\pi^2$, or $10^{-6}$ GeV$^4$, requiring $\lambda/8\pi G$ to cancel this term to about 41 decimal places.

Perhaps surprisingly, it was a long time before particle physicists began seriously to worry about this problem, despite the demonstration in the Casimir effect of the reality of zero-point energies.³ Since the cosmological upper bound on $| \langle\rho\rangle + \lambda/8\pi G |$ was vastly less than any value expected from particle theory, most particle theorists simply assumed that for some unknown reason this quantity was zero. But cosmologists generally continued to keep an open mind, analyzing cosmological data in terms of models with a possibly nonvanishing cosmological constant.

³Casimir (1948) showed that quantum fluctuations in the space between two flat conducting plates with separation $d$ would produce a force per unit area equal to $\hbar c \pi^2 / 240 d^4$, or $1.30 \times 10^{-18}$ dyn cm$^2$/$d^4$. This was measured by Sparnaay (1957), who found a force per area of $(1\text{-}4) \times 10^{-18}$ dyn cm$^2$/$d^4$, when $d$ was varied between 2 and 10 $\mu$m.

In fact, as far as I know, the first published discussion of the contribution of quantum fluctuations to the effective cosmological constant was triggered by astronomical observations. In the late 1960s it seemed that an excessively large number of quasars were being observed with redshifts clustered about $z = 1.95$. Since $1 + z$ is the ratio of the cosmic scale factor $R(t)$ at present to its value at the time the light now observed was emitted, this could be explained if the universe loitered for a while at a value of $R(t)$ equal to 1/2.95 times the present value. A number of authors [Petrosian, Salpeter, and Szekeres (1967); Shklovsky (1967); Rowan-Robinson (1968)] proposed that such a loitering could be accounted for in a model proposed by Lemaître (1927, 1931). In this model there is a positive cosmological constant $\lambda_{\rm eff}$ and positive curvature $k = +1$, just as in the static Einstein model, while the mass of the universe is taken close to the Einstein value (2.4). The scale factor $R(t)$ starts at $R = 0$ and then increases; however, when the mass density drops to near the Einstein value (2.2), the universe behaves for a while like a static Einstein universe, until the instability of this model takes over and the universe starts expanding again. In order for this idea to explain a preponderance of redshifts at $z = 1.95$, the vacuum energy density $\rho_V$ would have to be $(2.95)^3$ times the present nonvacuum mass density $\rho_{M0}$.

These considerations led Zeldovich (1967) to attempt to account for a nonzero vacuum energy density in terms of quantum fluctuations. As we have seen, the zero-point energies themselves gave far too large a value for $\langle\rho\rangle$, so Zeldovich assumed that these were canceled by $\lambda/8\pi G$, leaving only higher-order effects: in particular, the gravitational force between the particles in the vacuum fluctuations. (In Feynman diagram terms, this corresponds to throwing away the one-loop vacuum graphs, but keeping those with two loops.) Taking $\Lambda^3$ particles of energy $\Lambda$ per unit volume gives the gravitational self-energy density of order

$$\langle\rho\rangle \approx ( G \Lambda^2 / \Lambda^{-1} ) \Lambda^3 = G \Lambda^6 . \tag{3.7}$$

For no clear reason, Zeldovich took the cutoff $\Lambda$ as 1 GeV, which yields a density $\langle\rho\rangle \approx 10^{-38}$ GeV$^4$, much smaller than from zero-point energies themselves, but still larger than the observational bound (3.4) on $| \langle\rho\rangle + \lambda/8\pi G |$ by some 9 orders of magnitude. Neither Zeldovich nor anyone else felt encouraged to pursue these ideas.

The real beginning of serious worry about the vacuum energy seems to date from the success of the idea of spontaneous symmetry breaking in the electroweak theory. In this theory, the scalar field potential takes the form (with $\mu^2 > 0$, $g > 0$)

$$V = V_0 - \mu^2 \phi^\dagger \phi + g ( \phi^\dagger \phi )^2 . \tag{3.8}$$

At its minimum this takes the value

$$\langle\rho\rangle = V_{\rm min} = V_0 - \frac{\mu^4}{4g} . \tag{3.9}$$

Apparently some theorists felt that $V$ should vanish at $\phi = 0$, which would give $V_0 = 0$, so that $\langle\rho\rangle$ would be negative definite!⁴

⁴Veltman (1975) attributes this view to Linde (1974), himself (quoted as to be published), and Dreitlein (1974). However, Linde's paper does not seem to me to take this position. Dreitlein's paper proposed that Eq. (3.9) could give an acceptably small value of $\langle\rho\rangle$, with $\mu/\sqrt{g}$ fixed by the Fermi coupling constant of weak interactions, if $\mu$ is very small, of order MeV. Veltman's paper gives experimental arguments against this possibility.

In the electroweak theory this would give $\langle\rho\rangle \simeq -g (300\ {\rm GeV})^4$, which even for $g$ as small as $\alpha^2$ would yield $| \langle\rho\rangle | \simeq 10^{6}$ GeV$^4$, larger than the bound on $\rho_V$ by a factor $10^{53}$. Of course we know of no reason why $V_0$ or $\lambda$ must vanish, and it is entirely possible that $V_0$ or $\lambda$ cancels the term $-\mu^4/4g$ (and higher-order corrections), but this example shows vividly how unnatural it is to get a reasonably small effective cosmological constant. Moreover, at early times the effective temperature-dependent potential has a positive coefficient for $\phi^\dagger\phi$, so the minimum then is at $\phi = 0$, where $V(\phi) = V_0$. Thus, in order to get a zero cosmological constant today, we have to put up with an enormous cosmological constant at times before the electroweak phase transition. [This is not in conflict with experiment; in fact, the phase transition occurs at a temperature $T$ of
order $\mu/\sqrt{g}$, so the black-body radiation present at that
time has an energy density of order p4/g2, larger than with c independent of gpv. With this L , there are no
the vacuum energy by a factor I /g (Bludman and Ruder- solutions of Eq. (3.11), unless for some reason the
man, 1977).] At even earlier times there were other tran- coefficient c vanishes when (3.10) is satisfied.
sitions, implying a n even larger early value for the Now that the problem has been posed, we turn to its
effective cosmological constant. This is currently regard- possible solution. The next five sections will describe five
ed as a good thing; the large early cosmological constant directions that have been taken in trying to solve the
would drive cosmic inflation, solving several of the long- problem of the cosmological constant.
standing problems of cosmological theory (Guth, I98 1;
Albrecht and Steinhardt, 1982; Linde, 1982). We want to
IV. SUPERSYMMETRY. SUPERGRAVITY,
explain why the effective cosmological constant is small SUPERSTRINGS
now, not why it was always small.
Before closing this section, I want to take up a peculiar Shortly after the development of four-dimensional glo-
aspect of the problem of the cosmological constant. The bally supersymmetric field theories, Zumino ( 1975)point-
appearance of an effective cosmological constant makes it ed out that supersymmetry in these theories would, if un-
impossible to find any solutions of the Einstein Reld equa- broken, imply a vanishing vacuum energy. The argu-
tions in which g , i s the constant Minkowski term vrv. ment is very simple: the supersymmetry generators Q,
That is, the original symmetry of general covariance, satisfy an anticommutation relation
which is always broken by the appearance of any given
metric gNv, cannot, without fine-tuning, be broken in (4.1)
such a way as to preserve the subgroup of space-time
where u and fl are two-component spin indices; u,,u2.
translations.
and o3 are the Pauli matrices; o o = l ; and PP is the
This situation is unusual. Usually if a theory is invari-
energy-momentum 4-vector operator. If supersymmetry
ant under some group G, we would not expect to have to
fine-tune the parameters of the theory in order to Rnd is unbroken, then the vacuum state 10)satisfies
vacuum solutions that preserve any given subgroup (4.2)
H C G. For instance, in the electroweak theory, there is a
finite range of parameters in which any number of dou- and from (4.1) and (4.2) we infer that the vacuum has
blet scalars will get vacuum expectation values that vanishing energy and momentum
preserve a U(1) subgroup of SU(Z)XU( 1). So why will
this not work for the translational subgroup of the group
(OIPI0)=0 .
of general coordinate transformations? Suppose we look This result can also be obtained by considering the poten-
for a solution of the field equations that preserves transla- tial V ( d , d * ) for the chiral scalar fields 4 of a globally su-
tional invariance. With all fields constant, the field equa- persymmetric theory:
tions for matter and gravity are
(4.3)
(3.101
where W ( 4 )is the so-called superpotential. (Gauge de-
(3.11) grees of freedom are ignored here, but they would not
change the argument.) The condition for unbroken su-
With N $s, these are N 3 - 6 equations for N f 6 un- persymmetry is that W be stationary in 4,which would
knowns, so one might expect a solution without fine- imply that V take its minimum value,
tuning. The problem is that when (3.10) is satisfied, the
dependence of L on g,, is too simple to allow a solution ( p >= Vmin=o . (4.4)
of (3.1 1). There is a GL(4) symmetry that survives as a
vestige of general covariance even when we constrain the Quantum effects do not change this conclusion, because
fields to be constants: under the GL(4) transformation with boson-fermion symmetry, the fermion loops cancel
the bason ones.
g,,-AP,A uYgp I (3.12) The trouble with this result is that supersymmetry is
broken in the real world, and in this case either (4.1) or
$i + D y ( A )tcl, ; (3.13) (4.3) shows that the vacuum energy is positive-definite.
the Lagrangian transforms as a density, If this vacuum energy were the sole contribution to the
effective cosmological constant, then the effect of super-
l-+DetAL. (3.14) symmetry would be to convert the problem of the cosmo-
When Eq. (3.10) is satisfied, this implies that L trans- logical constant from a crisis into a disaster.
Fortunately this is not the whole story. It is not possi-
forms as in (3.14) under (3.12) alone. This has the unique
ble to decide the value of the effective cosmological con-
solution
stant unless we explicitly introduce gravitation into the
L=c(Detg)2 , (3.15) theory. Any globally supersymmetric theory that in-

Rev. Mod. PhYS., VOI. 61. No. I,


January 1989
573

Steven Weinberg: The cosmological constant problem 5

volves gravity is inevitably a locally supersymmetric su- point of V. Thus in supergravity the problem of the
pergravity theory. In such a theory the effective cosmo- cosmological constant is no more a disaster, but just as
logical constant is given by the expectation value of the much a crisis, as in nonsupersymmetric theories.
potential, but the potential is now given by (Cremmer O n the other hand, supergravity theories offer oppor-
et a l . , 1978, 1979; Barbieri et al., 1982; Witten and tunities for changing the context of the cosmological con-
Bagger, 1982) stant problem, if not yet for solving it. Cremmer et al.
(1983) have noted that there is a class of Kahler poten-
V(4,q5*)= e x p ( 8 r G K ) [ D WCO-')\(Dj
i W)*
tials and superpotentials that, for a broad range of most
-24rGI W I 2 ] , (4.5) parameters, automatically yield an equilibrium scalar
field configuration in which V=O, even though super-
where K (1#,4*) is a real function of both I$ and 4' known symmetry is broken. Here is a somewhat generalized
as the Kahler potential, DiW is a sort of covariant version: the Kahler potential is
derivative
K = -3 In1 T+ T* -h ( Ca,Ca*)I / 8 r G
(4.6)
+R(S",S"*) (4.8)
and (.!?-I)> is the inverse of a metric while the superpotential is
W = W , ( C ' ) + W 2 ( S " ), (4.9)
(4.7)
and T,C",S" are all chiral scalar fields. No constraints
The condition for unbroken supersymmetry is now are placed on the functions h (C",C"*), t ( S " , S " * ) ,
D, W=O. This again yields a stationary point of the po- , W2CS"),except that h and R are real, and
W I ( C a )or
tential, but now it is one at which V is generally negative. functions all depend only on the fields indicated; in par-
In fact, even if we fine-tuned W so that there were a su- ticular, the superpotential must be independent of the
persymmetric stationary point at which W =O and hence single chiral scalar T.
V=O, such a solution would not, in general, be the state With these conditions the potential (4.5) takes the form
of lowest energy, though it would be stable [Coleman and
de Luccia (19801, Weinberg (1982)l. It should, however,
be mentioned that if there is a set of field values at which
W=O and Di W=O for all i in lowest order of perturba-
tion theory, then the theory has a supersymmetric equi-
librium configuration with Y = O to all orders of pertur-
bation theory, though not necessarily beyond perturba- (4.10)
tion theory (Grisaru, Siege], and Rocek, 1979). The same
is believed to be true in superstring perturbation theory where ( J V - ' ) is
~ ~the reciprocal of the matrix
(Dine and Seiberg, 1986; Friedan, Martinec, and Shenker, a2h
1986; Martinec, 1986; Attick, Moore, and Sen, 1987; Pb
= (4.11)
Morozov and Perelomov, 1987).
acR*acb
'

Without fine-tuning, we can generally find a nonsuper- The matrices and grim are necessarily positive-
symmetric set of scalar field values at which V = O and definite, because of their role in the kinetic part of the
Di W#O, but this would not normally be a stationary scalar Lagrangian

(4.12)
I
Hence Eq. (4.10) is positive and therefore, without further fine-tuning, may be expected to have a stationary point with $V = 0$, specified by the conditions

$\partial W/\partial C^a = D_n W = 0$ . (4.13)

But this is not necessarily a supersymmetric configuration, because here

$D_a W = \partial W/\partial C^a + 8\pi G\,(\partial K/\partial C^a)\,W$ , (4.14)

and this does not necessarily vanish. (However, to have supersymmetry broken, it is essential that the superpotential actually depend on all of the chiral scalars $S^n$, because otherwise the conditions $D_n W = 0$ would require $W = 0$ and hence $D_a W = 0$.)

Rev. Mod. Phys., Vol. 61, No. 1, January 1989

Steven Weinberg: The cosmological constant problem

The superpotential $W$ depends on $C$ and $S$, but not on $T$, so the conditions (4.13) will generally fix the values of $C$ and $S$ at the minimum of $V$, while leaving $T$ undetermined. The field $T$ enters the potential only in the overall scale of the part that depends on the $C$, so such theories are called "no-scale" models. An intensive phenomenological study of these models was carried out at CERN for several years following 1983 (Ellis, Lahanas, et al., 1984; Ellis, Kounnas, et al., 1984; Barbieri et al., 1985).

Of course, these models do not solve the cosmological constant problem, because neither Eq. (4.8) nor Eq. (4.9) is dictated by any known physical principle. In particular, in order to cancel the second term in Eq. (4.9), it is essential that the coefficient of the logarithm in the first term in (4.8) be given the apparently arbitrary value $-3/8\pi G$.

It was therefore exciting when, in some of the first work on the physical implications of superstring theory, it was found that compactification of six of the ten original dimensions yielded a four-dimensional supergravity theory with Kähler potential and superpotential of the form (4.8) and (4.9). Specifically, Witten (1985) found a Kähler potential of the form (4.8), with $h$ quadratic in the $C$'s and $\tilde K = -\ln(S + S^*)/8\pi G$, but with a superpotential that depended solely on the $C$'s. By including nonperturbative gaugino condensation effects, Dine et al. (1985) were able to give the superpotential a dependence on $S$ (though they did not treat the dependence of the Kähler potential or superpotential on the $C$ fields). In this work, the $S$ field is a complex function (now often called r) of four-dimensional dilaton and axion fields, while the $T$ field represents the scale of the compactified six-dimensional manifold. The factor 3 in Eq. (4.8) arises in these models because one compactifies on a complex manifold with $(10-4)/2 = 3$ complex dimensions (Chang et al., 1988).

Intriguing as these results are, they have not been taken seriously (even by the original authors) as a solution of the cosmological constant problem. The trouble is that no one expects the simple structures (4.8) and (4.9) to survive beyond the lowest order of perturbation theory, because they are not protected by any symmetry that survives down to accessible energies.

Recently Moore (1987a, 1987b) has attempted a more specifically "stringy" attack on the problem. Early work by Rohm (1984) and Polchinski (1986) had shown that in the calculation of the vacuum energy density, the sum over zero-point energies can be converted into an integral over a complex "modular parameter" $\tau$. (In string theories, two-dimensional conformal symmetry makes the tree-level vacuum energy vanish.) Last year Moore pointed out that for some special compactifications there is a discrete symmetry of modular space, known as Atkin-Lehner symmetry, that makes the integral over $\tau$ vanish despite the absence of space-time supersymmetry. So far, the only examples where this occurs entail a compactification to two rather than four space-time dimensions, but it does not seem unlikely that four-dimensional examples could be found. A more serious obstacle is that the Atkin-Lehner symmetry seems irretrievably tied to one-loop order.

Indeed, it is very hard to see how any property of supergravity or superstring theory could make the effective cosmological constant sufficiently small. It is not enough that the vacuum energy density cancel in lowest order, or to all finite orders of perturbation theory; even nonperturbative effects like ordinary QCD instantons would give far too large a contribution to the effective cosmological constant if not canceled by something else. According to our modern theories, properties of elementary particles, like approximate baryon and lepton conservation, are dictated by gauge symmetries of the standard model, which survive down to accessible energies. We know of no such symmetry (aside from the unrealistic example of unbroken supersymmetry) that could keep the effective cosmological constant sufficiently small. It is conceivable that in supergravity the property of having zero effective cosmological constant does survive to low energies without any symmetry to guard it, but this would run counter to all our experience in physics.

V. ANTHROPIC CONSIDERATIONS

I now turn to a very different approach to the cosmological constant, based on what Carter (1974) has called the anthropic principle.⁵ Briefly stated, the anthropic principle has it that the world is the way it is, at least in part, because otherwise there would be no one to ask why it is the way it is. There are a number of different versions of this principle, ranging from those that are so weak as to be trivial to those that are so strong as to be absurd. Three of these versions seem worth distinguishing here.

⁵ Recent discussions of the anthropic principle are given in the books by Davies (1982) and Barrow and Tipler (1986), and in articles by Carter (1983), Page (1987), and Rees (1987).

(i) In one very weak version, the anthropic principle amounts simply to the use of the fact that we are here as one more experimental datum. For instance, recall M. Goldhaber's joke that we know in our bones that the lifetime of the proton must be greater than about $10^{16}$ yr, because otherwise we would not survive the ionizing particles produced by proton decay in our own bodies. No one can argue with this version, but it does not help us to explain anything, such as why the proton lives so long. Nor does it give very useful experimental information; certainly experimental physicists (including Goldhaber) have provided us with better limits on the proton lifetime.


(ii) In one rather strong version, the anthropic principle states that the laws of nature, which are otherwise incomplete, are completed by the requirement that conditions must allow intelligent life to arise, the reason being that science (and quantum mechanics in particular) is meaningless without observers. I do not know how to reach a decision about such matters and will simply state my own view, that although science is clearly impossible without scientists, it is not clear that the universe is impossible without science.

(iii) A moderate version of the anthropic principle, sometimes known as the "weak anthropic principle," amounts to an explanation of which of the various possible eras or parts of the universe we inhabit, by calculating which eras or parts of the universe we could inhabit. An example is provided by what I think is the first use of anthropic arguments in modern physics, by Dicke (1961), in response to a problem posed by Dirac (1937). In effect, Dirac had noted that a combination of fundamental constants with the dimensions of a time turns out to be roughly of the order of the present age of the universe:

$\hbar^2/G c\, m_\pi^3 = 4.5\times 10^{10}$ yr . (5.1)

[There are various other ways of writing this relation, such as replacing $m_\pi$ with various combinations of particle masses and introducing powers of $e^2/\hbar c$. Dirac's original "large-number" coincidence is equivalent to using Eq. (5.1) as a formula for the age of the universe, with $m_\pi$ replaced by $(137\, m_p m_e^2)^{1/3} = 183$ MeV. In fact, there are so many different possibilities that one may doubt whether there is any coincidence that needs explaining.] Dirac reasoned that if this connection were a real one, then, since the age of the universe increases (linearly) with time, some of the constants on the left side of (5.1) must change with time. He guessed that it is $G$ that changes, like $1/t$. [Zee (1985) has applied similar arguments to the cosmological constant itself.] In response to Dirac, Dicke pointed out that the question of the age of the universe could only arise when the conditions are right for the existence of life. Specifically, the universe must be old enough so that some stars will have completed their time on the main sequence to produce the heavy elements necessary for life, and it must be young enough so that some stars would still be providing energy through nuclear reactions. Both the upper and lower bounds on the ages of the universe at which life can exist turn out to be roughly (very roughly) given by just the quantity (5.1). Hence there is no need to suppose that any of the fundamental constants vary with time to account for the rough agreement of the quantity (5.1) with the present age of the universe.

It is this "weak anthropic principle" that will be applied here. Its relevance arises from the fact that, in some modern cosmological models, the universe does have parts or eras in which the effective cosmological constant takes a wide variety of values. Here are some examples.

(1) The vacuum energy may depend on a scalar field vacuum expectation value that changes slowly as the universe expands, as in a model of Banks (1985).

(2) In a model of Linde (1986, 1987, 1988b), fluctuations in scalar fields produce exponentially expanding regions of the universe, within which further fluctuations produce further subuniverses, and so on. Since these subuniverses arise from fluctuations in the fields, they have differing values of various "constants" of nature.

(3) The universe may go through a very large number of first-order phase transitions in which bubbles of smaller vacuum energy form; within these bubbles there form further bubbles of even smaller vacuum energy, and so on. This can happen if the potential for some scalar field has a large number of small bumps, as in a model of Abbott (1985). Alternatively, the bubble walls may be elementary membranes coupled to a 3-form gauge potential $A_{\mu\nu\lambda}$, as in the work of Brown and Teitelboim (1987a, 1987b).

(4) The universe may start in a quantum state in which the cosmological constant does not have a precise value. Any "measurement" of the properties of the universe yields a variety of possible values for the cosmological constant, with a priori probabilities determined by the initial state (Hawking, 1987a). We will see examples of this in Secs. VII and VIII.

In models of these types, it is perfectly sensible to apply anthropic considerations to decide which era or part of the universe we could inhabit, and hence which values of the cosmological constant we could observe.

A large cosmological constant would interfere with the appearance of life in different ways, depending on the sign of $\lambda_{\rm eff}$. For a large positive $\lambda_{\rm eff}$ the universe very early enters an exponentially expanding de Sitter phase, which then lasts forever. The exponential expansion interferes with the formation of gravitational condensations, but once a clump of matter becomes gravitationally bound, its subsequent evolution is unaffected by the cosmological constant. Now, we do not know what weird forms life may take, but it is hard to imagine that it could develop at all without gravitational condensations out of an initially smooth universe. Therefore the anthropic principle makes a rather crisp prediction: $\lambda_{\rm eff}$ must be small enough to allow the formation of sufficiently large gravitational condensations (Weinberg, 1987).

This has been worked out quantitatively, but we can easily understand the main result without detailed calculations. We know that in our universe gravitational condensation had already begun at a redshift $z_c \geq 4$. At this time, the energy density was greater than the present mass density $\rho_{M0}$ by a factor $(1+z_c)^3 \geq 125$. A cosmological constant has little effect as long as the nonvacuum energy density is larger than $\rho_V$, so one can conclude that a vacuum energy density $\rho_V$ no larger than, say, $100\,\rho_{M0}$ would not be large enough to prevent gravitational condensations. [The quantitative analysis of Weinberg


(1987) shows that for $k = 0$, a vacuum energy density no greater than $\pi^2 (1+z_c)^3 \rho_{M0}/3$ would not prevent gravitational condensation at a redshift $z_c$; this is $410\,\rho_{M0}$ for $z_c = 4$.] This result suggests strongly that if it is the anthropic principle that accounts for the smallness of the cosmological constant, then we would expect a vacuum energy density $\rho_V \sim (10\text{-}100)\,\rho_{M0}$, because there is no anthropic reason for it to be any smaller.

Is such a large vacuum energy density observationally allowed? There are a number of different types of astronomical data that indicate differing answers to this question.

A. Mass density

If, as often assumed, the universe now has negligible spatial curvature, then

$\Omega_V + \Omega_{M0} = 1$ , (5.2)

where $\Omega_V$ and $\Omega_{M0}$ are the ratios of the vacuum energy density and the present mass density to the critical density

$\rho_{\rm crit} = 3H_0^2/8\pi G$ . (5.3)

The dynamics of clusters of galaxies seems to indicate that $\Omega_{M0}$ is in the range 0.1-0.2 (Knapp and Kormendy, 1987), which with these assumptions would indicate a value for $\rho_V/\rho_{M0}$ in the range 4-9. If we discount the evidence from the dynamics of clusters of galaxies, then $\Omega_{M0}$ could be as small as 0.02 (Knapp and Kormendy, 1987), corresponding to a value of $\rho_V/\rho_{M0} = 50$. [See also Bahcall et al. (1987).]

B. Ages

In a dust-dominated universe with $k = 0$ and $\rho_V = 0$, the age of the universe is $t_0 = 2/3H_0$. For $H_0 = 100$ km/sec Mpc, this is $7\times 10^9$ yr, considerably less than the ages usually estimated for globular clusters (Renzini, 1986). On the other hand, for a dust-dominated universe with $k = 0$ and $\rho_V \neq 0$, the present age of an object that formed at a redshift $z_c$ is

(5.4)

For instance, for $z_c = 4$ and $\rho_V/\rho_{M0} = 9$ (i.e., $\Omega_{M0} = 0.1$), this gives an age $1.1\,H_0^{-1}$ in place of $\tfrac{2}{3}H_0^{-1}$. This is not in conflict with globular cluster ages even for Hubble constants near 100 km/sec Mpc.

These considerations of cosmic age and density have led a number of astronomers to suggest a fairly large positive cosmological constant, with $\rho_V > \rho_M$ [de Vaucouleurs (1982, 1983); Peebles (1984, 1987a, 1987b); Turner, Steigman, and Krauss (1984)]. However, there recently has appeared a strong argument against this view, which we shall now consider.
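The round numbers quoted in subsections A and B can be reproduced with a few lines of arithmetic. The following is a minimal sketch, assuming the standard flat ($k = 0$) dust-plus-vacuum Friedmann solution, in which the cosmic time at redshift $z$ is $t(z) = \tfrac{2}{3}H_0^{-1}\,\Omega_V^{-1/2}\,\sinh^{-1}[(\rho_V/\rho_{M0})^{1/2}(1+z)^{-3/2}]$. This closed form is a textbook result and is not copied from Eq. (5.4), whose display did not survive in this copy. The function names and the unit-conversion constants are illustrative choices.

```python
import math

KM_PER_MPC = 3.0857e19   # kilometres in one megaparsec (approximate)
SEC_PER_YR = 3.156e7     # seconds in one year (approximate)


def hubble_time_yr(h0_km_s_mpc):
    """Hubble time 1/H0 in years, for H0 given in km/sec/Mpc."""
    return KM_PER_MPC / h0_km_s_mpc / SEC_PER_YR


def density_ratio(omega_m0):
    """rho_V / rho_M0 implied by Omega_V + Omega_M0 = 1, Eq. (5.2)."""
    return (1.0 - omega_m0) / omega_m0


def cosmic_time(z, rho_ratio):
    """Cosmic time at redshift z, in units of 1/H0, for a flat dust
    universe with rho_V / rho_M0 = rho_ratio (assumed textbook solution)."""
    if rho_ratio == 0.0:
        return (2.0 / 3.0) * (1.0 + z) ** -1.5
    omega_v = rho_ratio / (1.0 + rho_ratio)
    x = math.sqrt(rho_ratio) * (1.0 + z) ** -1.5
    return (2.0 / 3.0) * math.asinh(x) / math.sqrt(omega_v)


def object_age(z_c, rho_ratio):
    """Present age, in units of 1/H0, of an object formed at redshift z_c."""
    return cosmic_time(0.0, rho_ratio) - cosmic_time(z_c, rho_ratio)


def condensation_bound(z_c):
    """Bound pi^2 (1 + z_c)^3 / 3 on rho_V, in units of rho_M0."""
    return math.pi ** 2 * (1.0 + z_c) ** 3 / 3.0


if __name__ == "__main__":
    print("t0 for pure dust, H0 = 100:", (2.0 / 3.0) * hubble_time_yr(100.0), "yr")
    print("rho_V/rho_M0 for Omega_M0 = 0.1, 0.2, 0.02:",
          density_ratio(0.1), density_ratio(0.2), density_ratio(0.02))
    print("object age, z_c = 4, rho_V/rho_M0 = 9:", object_age(4.0, 9.0), "/ H0")
    print("condensation bound, z_c = 4:", condensation_bound(4.0), "rho_M0")
```

Under these assumptions the pure-dust age for $H_0 = 100$ km/sec Mpc comes out near $6.5\times 10^9$ yr, the age of an object formed at $z_c = 4$ with $\rho_V/\rho_{M0} = 9$ is about $1.09\,H_0^{-1}$, and the condensation bound at $z_c = 4$ is about $411\,\rho_{M0}$, in line with the rounded values in the text.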

C. Number counts

Loh and Spillar (1986) have carried out a survey of numbers of galaxies as a function of redshift, subsequently analyzed by Loh (1986). For a uniformly distributed class of objects that are all bright enough to be detectable at redshifts $\leq z_{\max}$, the number of objects observed at redshift less than $z \leq z_{\max}$ in a dust-dominated universe with $k = 0$ is

$N(<z) \propto \int_{1/(1+z)}^{1} ds\, s^{-1/2}\left(1 + \rho_V s^3/\rho_{M0}\right)^{-1/2} \left[\int_s^1 ds'\, s'^{-1/2}\left(1 + \rho_V s'^3/\rho_{M0}\right)^{-1/2}\right]^2$ . (5.5)
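To see how such counts respond to a vacuum energy, the comoving volume out to redshift $z$ can be evaluated numerically. The sketch below assumes the standard $k = 0$ dust-plus-vacuum Friedmann expansion, for which the coordinate distance to an object seen at scale factor $s$ (with $s = 1$ today) is proportional to $\int_s^1 ds'\, s'^{-1/2}(1 + \rho_V s'^3/\rho_{M0})^{-1/2}$, and the number of uniformly distributed objects is proportional to the cube of that distance. The overall normalization is dropped, so only ratios of counts at different redshifts are meaningful. The function names and the midpoint-rule integrator are illustrative choices and are not taken from the analysis of Loh and Spillar.

```python
def coordinate_distance(z, rho_ratio, steps=20000):
    """Unnormalized coordinate distance out to redshift z for a flat dust
    universe with rho_V / rho_M0 = rho_ratio, by the midpoint rule."""
    s_min = 1.0 / (1.0 + z)
    ds = (1.0 - s_min) / steps
    total = 0.0
    for i in range(steps):
        s = s_min + (i + 0.5) * ds
        total += ds / (s ** 0.5 * (1.0 + rho_ratio * s ** 3) ** 0.5)
    return total


def relative_counts(z, rho_ratio):
    """Unnormalized N(<z): the integral of r^2 dr equals r^3 / 3 for k = 0."""
    return coordinate_distance(z, rho_ratio) ** 3 / 3.0


def count_shape(z, z_ref, rho_ratio):
    """Ratio N(<z) / N(<z_ref), which is independent of the normalization."""
    return relative_counts(z, rho_ratio) / relative_counts(z_ref, rho_ratio)


if __name__ == "__main__":
    for rho_ratio in (0.0, 1.0, 9.0):
        print(rho_ratio, count_shape(0.5, 1.0, rho_ratio))
```

In this sketch the fraction of objects at $z < 0.5$ among those at $z < 1$ falls as $\rho_V/\rho_{M0}$ grows, which is the kind of shape dependence a redshift survey can use to constrain the vacuum energy.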

Of course, in the real world there are always some objects too dim to be seen. Loh's analysis allowed for an unknown luminosity distribution, assuming only that its shape does not evolve with time. Under these assumptions, he found that the vacuum energy must be quite small: specifically,

$\rho_V/\rho_{M0} = 0.1^{+0.2}_{-0.4}$ .

This is more than 3 orders of magnitude below the anthropic upper bound discussed earlier. If the effective cosmological constant is really this small, then we would have to conclude that the anthropic principle does not explain why it is so small. [However, there are reasons to be cautious in reaching this conclusion. Bahcall and Tremaine (1986) have recently reanalyzed the data of Loh and Spillar, using a plausible model of galaxy evolution in which the shape of the luminosity distribution


does change with time. They considered only the case $\rho_V = 0$, leaving $\Omega_{M0}$ undetermined, and found that evolution in this model could increase or decrease the inferred value of $\Omega_{M0}$ by as much as unity. Presumably it would also have a similarly large effect on the inferred value of $\rho_V/\rho_{M0}$ when $\Omega_{M0} + \Omega_V$ is constrained to be unity. In addition, the redshifts of Loh and Spillar are photometric and therefore less certain than those obtained from shifts of individual spectral lines.]

Now let us consider a cosmological constant of the other sign, $\lambda_{\rm eff} < 0$. Here the cosmological constant does not interfere with the formation of gravitational condensations. Instead (for $k = 0$ or $k = +1$), the whole universe collapses to a singularity in a finite time $T$. The anthropic constraint here is simply that the universe last long enough for the appearance of life (Barrow and Tipler, 1986), say, $T \geq 0.5\,H_0^{-1}$, where $H_0^{-1}$ is the Hubble time in our universe. For a dust-dominated universe with $k = 0$, we have

(5.6)

so the anthropic constraint here is just

$|\rho_V| \lesssim \rho_{M0}$ . (5.7)

In this case the anthropic principle can explain why the cosmological constant is as small as found by Loh (1986), but not much smaller. On the other hand, a negative cosmological constant would not help with the cosmic mass and age problems.

Before closing this section, let me take up one possibility that may confront us in a few years. Suppose it really is confirmed that, as suggested by cosmic ages and densities, there is a cosmological constant with $\rho_V$ of order $\rho_{M0}$. Would we then have any alternative to an anthropic explanation for this value of $\rho_V$? The mass density $\rho_M$ changes with time, so without anthropic considerations it is very hard to explain why a constant $\rho_V$ should equal the value that $\rho_M$ happens to have at present. But perhaps $\rho_V$ really is not constant. For instance, Peebles and Ratra (1988) and Ratra and Peebles (1988) have considered a model in which the vacuum energy depends on a scalar field that changes as the universe expands. In order to qualify as a vacuum energy, it is only necessary for $\rho_V$ to be accompanied with a pressure $p_V = -\rho_V$; the value of $\rho_V$ can change if the vacuum exchanges energy with matter and radiation. The conservation of energy then relates the change of $\rho_V$ to the change in the densities of matter (with $p_M = 0$) and radiation (with $p_R = \rho_R/3$):

(5.8)

Freese et al. (1987) have considered the possibility that energy is exchanged only between the vacuum and matter, or the vacuum and radiation, in such a way that either $\rho_V/\rho_M$ or $\rho_V/\rho_R$ remain constant, respectively (see also Reuter and Wetterich, 1987). In order for the vacuum to transfer energy to ordinary matter in such a way that $\rho_V/\rho_M$ remains fixed, and if baryon number is conserved, then it would be necessary to create baryon-antibaryon pairs at a sufficient rate to produce a troublesome $\gamma$-ray background. Alternatively, if the vacuum transfers energy to radiation in such a way that $\rho_V/\rho_R$ remains constant, and if $\rho_V$ is comparable with the present mass density $\rho_{M0}$, then $\rho_V/\rho_R$ must be rather large, completely changing the results of cosmological nucleosynthesis.

One more possibility that was not considered by Freese et al. is that the vacuum transfers energy to radiation, avoiding the problems of baryon-antibaryon annihilation, but in such a way as to keep a fixed ratio $\rho_V/\rho_M$ rather than $\rho_V/\rho_R$. However, this also does not work. With $\rho_V = \epsilon\rho_M$ and $R^3\rho_M$ constant, Eq. (5.8) yields

(5.9)

So that there is no interference with calculations of cosmological nucleosynthesis, we need

$\rho_R = \rho_{R0}\,[R_0/R]^4$

and therefore

(5.10)

Thus, even if we are willing to suppose that the vacuum energy changes with time, a vacuum energy density comparable with the present mass density seems very difficult to explain on other than anthropic grounds.

VI. ADJUSTMENT MECHANISMS

I now turn to an idea that has been tried by virtually everyone who has worried about the cosmological constant [see, e.g., Dolgov (1982); Wilczek and Zee (1983); Wilczek (1984, 1985); Peccei, Solà, and Wetterich (1987); Barr and Hochberg (1988)]. Suppose there is some scalar $\phi$ whose source is proportional to the trace of the energy-momentum tensor

$\Box^2\phi \propto T^\mu{}_\mu \propto R$ . (6.1)

(Here $T_{\mu\nu}$ is the total energy-momentum tensor that includes a possible cosmological constant term $-\lambda g_{\mu\nu}/8\pi G$.) Suppose also that $T^\mu{}_\mu$ depends on $\phi$ and vanishes at some field value $\phi_0$. Then $\phi$ will evolve until it reaches an equilibrium value $\phi_0$, where $T^\mu{}_\mu = 0$, and the


Einstein field equations have a flat-space solution.

Of course, we do not observe such a scalar field, but for these purposes it can couple as weakly as we like; a weak coupling simply implies that the equilibrium value $\phi_0$ is very large. In this respect the scalar $\phi$ is analogous to the axion, especially in its later "invisible" version [Kim (1979); Dine, Fischler, and Srednicki (1981)].

Even very weakly coupled, it is possible that the $\phi$ field could have interesting effects, because it must have very small mass. If it has any nonzero mass $m_\phi$, then at energies below $m_\phi$ we can work with an effective Lagrangian in which $\phi$ has been integrated out, and so does not appear explicitly. But massless fields like the gravitational and electromagnetic field will still appear in this effective Lagrangian, and their vacuum fluctuations will contribute to the effective cosmological constant. In order to keep $\rho_V < 10^{-47}$ GeV$^4$, we need the scalar field adjustment to cancel the effect of gravitational and electromagnetic field fluctuations down to frequencies $10^{-12}$ GeV; then for this purpose we must have $m_\phi < 10^{-12}$ GeV. A field this light will have a macroscopic range: $\hbar/m_\phi c \gtrsim 0.01$ cm.

Unfortunately it seems to be impossible to construct a theory with one or more scalar fields having the assumed properties. This can be seen in very general terms. What we want is to find an equilibrium solution of the field equations in which $g_{\mu\nu}$ and all matter fields $\psi_n$ (perhaps tensors as well as scalars) are constant in space-time. For such constant fields the Euler-Lagrange equations are simply

$\partial L/\partial g_{\mu\nu} = 0$ , (6.2)

$\partial L/\partial \psi_n = 0$ . (6.3)

As we saw in Sec. III, the problem is in satisfying the trace of the gravitational field equation. To make a solution natural, we would like this trace to be a linear combination of the $\psi_n$ field equations; that is, we want

$2 g_{\mu\nu}\,\partial L/\partial g_{\mu\nu} = \sum_n f_n(\psi)\,\partial L/\partial \psi_n$ (6.4)

for all constant $g_{\mu\nu}$ and $\psi_n$. This can be restated as a symmetry condition: for constant fields the Lagrangian must be invariant under the transformation

$\delta g_{\mu\nu} = 2\epsilon g_{\mu\nu}$ , $\delta\psi_n = -\epsilon f_n(\psi)$ . (6.5)

With this condition, if we find a solution $\psi^{(0)}$ of the Euler-Lagrange equations for $\psi_n$,

$\partial L/\partial \psi_n \big|_{\psi = \psi^{(0)}} = 0$ , (6.6)

then the trace of the field equation for $g_{\mu\nu}$ is automatically satisfied.

The problem is that under these assumptions, it is impossible (without fine-tuning $L$) to find a solution to the field equations (6.3) for the $\psi$. To see this, we replace the $N$ fields $\psi_n$ with $N-1$ fields $\sigma_a$ (not necessarily scalars) and one scalar $\phi$, in such a way that the symmetry transformation (6.5) takes the form

$\delta g_{\mu\nu} = 2\epsilon g_{\mu\nu}$ , $\delta\sigma_a = 0$ , $\delta\phi = -\epsilon$ . (6.7)

[To do this, we first define a "transverse" surface $S$ in field space by an equation $T(\psi) = 0$, where $T(\psi)$ is any function on which $\sum_n (\partial T/\partial\psi_n) f_n(\psi)$ does not vanish. We take $\sigma_a$ as any set of coordinates on this $(N-1)$-dimensional surface, and define $\psi_n(\sigma;\phi)$ as the solution of the ordinary differential equation $d\psi_n/d\phi = f_n(\psi)$ subject to the condition that at $\phi = 0$, $\psi_n$ is at the point on $S$ with coordinates $\sigma$. The condition that $S$ be a transverse surface ensures that, at least within a finite region of field space, any point $\psi_n$ is on just one of these trajectories.] This symmetry simply ensures that for constant fields the Lagrangian can depend on $g_{\lambda\nu}$ and $\phi$ only in the combination $e^{2\phi} g_{\lambda\nu}$. The general arguments of Sec. III show that when the field equations for $\sigma$ are satisfied, the Lagrangian must take the form

$L = e^{4\phi}\,({\rm Det}\, g)^{1/2}\, L_0(\sigma)$ . (6.8)

We see that the source of $\phi$ is the trace of the energy-momentum tensor

$\partial L/\partial\phi = T^\mu{}_\mu\,({\rm Det}\, g)^{1/2}$ , (6.9)

$T^{\mu\nu} = g^{\mu\nu} e^{4\phi} L_0(\sigma)$ . (6.10)

It is true that if there were a value of $\phi$ where $L$ is stationary in $\phi$, then the trace of the Einstein field equations would automatically be satisfied at this point, but clearly there is no such stationary field value (unless, of course, we fine-tune $L_0$ so that it vanishes at its stationary point). To put this another way, since $L$ depends on $\phi$ and $g_{\mu\nu}$ only in the combination $\hat g_{\mu\nu} = e^{2\phi} g_{\mu\nu}$ (and derivatives of $\phi$ and $\hat g_{\mu\nu}$), we might as well redefine the metric as $\hat g_{\mu\nu}$ instead of $g_{\mu\nu}$. Then $\phi$ is just a scalar with only derivative couplings and clearly cannot help with our problem.⁶

⁶ This remark is due to Polchinski (1987).

As one example of many failed attempts along this line, let us consider a proposal of Peccei, Solà, and Wetterich (1987).⁷ They observed that the symmetry (6.5) or (6.7) may be broken by conformal anomalies, such as those that produce the $\beta$ function of quantum chromodynamics, in such a way that the effective Lagrangian becomes

$L_{\rm eff} = ({\rm Det}\, g)^{1/2}\,[e^{4\phi} L_0(\sigma) + \phi\,\Theta^\mu{}_\mu]$ , (6.11)

where $\Theta^\mu{}_\mu$ represents the effect of the conformal anomaly. The source of the $\phi$ field is now

$\partial L_{\rm eff}/\partial\phi = ({\rm Det}\, g)^{1/2}\,(T^\mu{}_\mu + \Theta^\mu{}_\mu)$ , (6.12)

with $T$ the previous energy-momentum tensor (6.10). Now we can find an equilibrium solution for the $\phi$ field, at a value $\phi_0$ such that

$4 e^{4\phi_0} L_0 + \Theta^\mu{}_\mu = 0$ . (6.13)

The trouble is that this is not the condition for a flat-space solution; the Einstein equation for a constant metric is

(6.14)

which is not the same as (6.13). The point is that just calling the anomalous term in (6.11) $\Theta^\mu{}_\mu$ does not make it a term in the trace of the energy-momentum tensor to which $g_{\mu\nu}$ is coupled. This result is not surprising, since (6.11) does not obey the symmetry (6.7). One cannot have it both ways: either we preserve the symmetry, in which case there is no equilibrium solution for $\phi$, or we break the symmetry, in which case such an equilibrium solution does not imply a solution of the field equations for a constant metric. (Also see Coughlan et al., 1988; Wetterich, 1988.)

⁷ An equation essentially equivalent to (6.11) appeared in the preprint version of the paper by Peccei, Solà, and Wetterich (1987). In the published version this equation was removed, and it was acknowledged that fine-tuning is still needed to make the cosmological constant vanish. However, this equation was quoted in the meantime in a paper by Ellis, Tsamis, and Voloshin (1987), which mostly deals with the observable consequences of the light scalar particle in this model.

In a slightly different version of this general class of models, we can try coupling a scalar field so that it is the curvature scalar $R$ rather than the trace of the energy-momentum tensor that directly serves as the source of the scalar field. [See, e.g., Dolgov (1982); Barr (1987); Ford (1987).] For instance, we might take the Lagrangian as

(6.15)

This has a flat-space solution with $g_{\mu\nu} = \eta_{\mu\nu}$ and $\phi = \phi_0$ (a constant), provided that

$U(\phi_0) = \infty$ . (6.16)

However, as the above authors observed, the effective gravitational coupling in this theory is given by

$G_{\rm eff} = \dfrac{G}{1 + 16\pi G\, U(\phi_0)} = 0$ . (6.17)

This is not much progress; we always knew that a nonzero vacuum energy does not prevent a flat-space solution if the gravitational constant is zero.

The no-go theorem proved in this section should not be regarded as closing off all hope in this direction.⁸ No-go theorems have a way of relying on apparently technical assumptions that later turn out to have exceptions of great physical interest. (A famous example is the Coleman-Mandula theorem.) More discouraging than any theorem is the fact that many theorists have tried to invent adjustment mechanisms to cancel the cosmological constant, but without any success so far.

⁸ For instance, we assumed that in the solution for flat space all fields are constant, but it might be that this solution preserves only some combination of translation and gauge invariance, in which case some gauge-noninvariant fields might vary with space-time position. (This is the case for the 3-form gauge field model discussed at the end of Sec. VII and in Sec. VIII.) Furthermore, it is possible that the foliation of field space, which allows us to replace the $\psi_n$ with $\sigma_a$ and $\phi$, does not work throughout the whole of field space.

VII. CHANGING GRAVITY

A number of authors have suggested changing the rules of classical general relativity in such a way that the cosmological constant appears as a constant of integration, unrelated to any parameters in the action [Van der Bij et al. (1982); Weinberg (1983); Wilczek and Zee (1983); Buchmüller and Dragon (1988a, 1988b)]. This does not solve the cosmological constant problem, but it does change it in a suggestive way.

I will describe one version of this idea, in which one maintains general covariance, but reinterprets the formalism so that the determinant of the metric is not a dynamical field. Any theory can be written in a way that is formally generally covariant, so by the usual arguments we can take the action for gravity and matter as

(7.1)

where $\psi$ are a set of matter fields appearing in the matter action $I_M$. ($I_M$ includes a possible cosmological constant term $-\lambda\int\sqrt{g}\,d^4x/8\pi G$.) The variational derivative of Eq. (7.1) with respect to the metric is

(7.2)

where, as usual, $T^{\mu\nu}$ is the variational derivative of $I_M$ with respect to $g_{\mu\nu}$. In ordinary general relativity all components of the metric are dynamical fields, so Eq. (7.2) vanishes for all $\mu,\nu$, yielding the usual Einstein field equations. However, just because we use a generally covariant formalism does not mean that we are committed to treating all components of the metric as dynamical fields. For instance, we all learn in childhood how to write the equations of Newtonian mechanics in general curvilinear spatial coordinate systems, without supposing that the 3-metric has to obey any field equations at all.

In particular, if the determinant $g$ is not dynamical, then the action only has to be stationary with respect to variations in the metric that keep the determinant fixed,


i.e., for which g@v6gP,=O; hence only the traceless part This is consistent only if ZPy is traceless; however, Ein-
of (7.2) needs to vanish, yielding the field equation stein took for tPV not the full energy-momentum tensor of
matter and radiation, but just the traceless tensor of radi-
RPV--$gPVR= -8&(TPV--fgPvTA).) . (7.3) ation alone. This is, of course, conserved only outside
matter. In such regions there is no difference between
This is just the traceless part of the Einstein field equa- Eqs. (7.8) and (7.3),so by the same calculation as shown
tions; these equations evidently contain less information here, Einstein was able to recover Eq. (7.7), with A a con-
than Einsteins, but as we shall see, not much less. Be- stant of integration. However, inside matter, Eq. (7.8)is
cause the whole formalism is generally covariant, the different from (7.3),the difference being that the right-
energy-momentum tensor satisfies the usual conservation hand side of Eq. (7.3)includes the traceless part of the
law
energy-momentum tensor of matter. A consequence of
TPV;,=O , (7.4) this difference is that in charged matter R is an undeter-
mined function, except that it is constant along world
and of course the Bianchi identities still hold, lines.
$$(R^{\mu\nu} - \tfrac{1}{2} g^{\mu\nu} R)_{;\mu} = 0 . \tag{7.5}$$

The full Einstein field equations are automatically consistent with (7.4) and (7.5), but for the traceless part we get a nontrivial consistency condition. Taking the covariant derivative of Eq. (7.3) with respect to $x^\mu$ yields

$$\tfrac{1}{4}\,\partial_\nu R = 8\pi G\,\tfrac{1}{4}\,\partial_\nu T^\lambda{}_\lambda ,$$

or, in other words, $R - 8\pi G\,T^\lambda{}_\lambda$ is a constant, which we will call $-4\Lambda$:

$$R - 8\pi G\,T^\lambda{}_\lambda = -4\Lambda \quad \text{(constant)} . \tag{7.6}$$

From (7.3) and (7.6), we obtain

$$R^{\mu\nu} - \tfrac{1}{2} g^{\mu\nu} R - \Lambda g^{\mu\nu} = -8\pi G\,T^{\mu\nu} . \tag{7.7}$$

Thus we recover the Einstein field equations, but with a cosmological constant that has nothing to do with any terms in the action or vacuum fluctuations, arising, instead, as a mere integration constant. To put this another way, Eq. (7.3) does not involve a cosmological constant; the contributions of vacuum fluctuations automatically cancel on the right-hand side of Eq. (7.3), so this equation does have flat-space solutions in the absence of matter and radiation. The remaining problem in this formulation is: why should we choose the flat-space solutions?

Before proceeding with this theory, I should pause to mention that it is closely related to a proposal made long ago by Einstein (1919). After his formulation of general relativity and its application to cosmology, Einstein turned to the old problem of a field theory of matter. In a paper titled "Do Gravitational Fields Play an Essential Part in the Structure of the Elementary Particles of Matter?" he proposed to replace the original gravitational field equation with the equation

$$R_{\mu\nu} - \tfrac{1}{4} g_{\mu\nu} R = -8\pi G\,T_{\mu\nu} . \tag{7.8}$$

I will also take the opportunity of this pause to comment on the connection between the formulation described here and that of Zee (1985) and Buchmüller and Dragon (1988a, 1988b). These authors take as their starting point the assumption that the action is invariant not under the group of all coordinate transformations, but only under the subgroup of transformations $x^\mu \to x'^\mu$ with $\mathrm{Det}(\partial x'/\partial x) = 1$. This is not really in conflict with the formulation presented here; the general covariance of Eq. (7.1) is achieved at the cost of introducing a metric that is partly nondynamical (just as we can make Newtonian mechanics formally Lorentz invariant by introducing a nondynamical quantity, the velocity of the reference frame). However, in giving up general covariance, one may be led to a theory with unnecessary elements. Under transformations with $\mathrm{Det}(\partial x'/\partial x) = 1$, the determinant of the metric $g$ behaves just like any scalar field, so one can introduce arbitrary functions of $g$ here and there in the action. There is nothing wrong with this, but it is not necessary, no different from inserting a new scalar field into the theory.

Now let us return to the theory described by the field equations (7.3). In my view, the key question in deciding whether this is a plausible classical theory of gravitation is whether it can be obtained as the classical limit of any physically satisfactory quantum theory of gravitation. To help in answering this, and also to illuminate the points raised in the previous paragraph, let us look at a simple model (Teitelboim, 1982) that shares several features with the theory of gravitation studied here.

Consider a free relativistic particle, with space-time trajectory $x^\mu(s)$ parametrized by a variable $s$. In order for the action to be invariant under arbitrary reparametrizations $s \to s'(s)$, we must introduce an einbein $g(s)$, with transformation rule

(7.9)

The action may then be taken as

(7.10)

9 This was pointed out to me by someone in the audience of the lectures at Harvard. I thank my informant for this interesting historical reference.
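
As an illustration of the einbein model, here is a minimal sketch of the standard reparametrization-invariant description of a free particle. It is an assumption, not a transcription: the transformation rule, the action $I$, the momentum $p_\mu$, and the Hamiltonian $H$ below are the textbook forms in a metric of signature $+,+,+,-$, and the paper's own Eqs. (7.9), (7.10), (7.13), and (7.14) may differ in normalization.

```latex
% Assumed textbook einbein action (hypothetical normalization), signature +,+,+,-
\begin{align*}
  g'(s')\,ds' &= g(s)\,ds , \qquad x'^\mu(s') = x^\mu(s) , \\
  I[x,g] &= \frac{1}{2}\int ds\,\Bigl[\, g^{-1}(s)\,\frac{dx^\mu}{ds}\frac{dx_\mu}{ds} - m^2 g(s) \Bigr] , \\
  p_\mu &\equiv \frac{\partial L}{\partial (dx^\mu/ds)} = g^{-1}(s)\,\frac{dx_\mu}{ds} , \\
  \delta_x I = 0 \;&\Rightarrow\; \frac{dp_\mu}{ds} = 0 , \\
  \delta_g I = 0 \;&\Rightarrow\; -\frac{1}{2}\bigl( p^\mu p_\mu + m^2 \bigr) = 0
      \;\Rightarrow\; p^\mu p_\mu = -m^2 , \\
  H &= p_\mu \frac{dx^\mu}{ds} - L = \frac{1}{2}\, g(s)\,\bigl( p^\mu p_\mu + m^2 \bigr) .
\end{align*}
```

In this form $g(s)$ enters $H$ linearly, so integrating over it in a phase-space path integral produces a delta function of $p^\mu p_\mu + m^2$ at each $s$. If the $g$ variation is dropped, only $dp_\mu/ds = 0$ survives and $p^\mu p_\mu$ becomes an integration constant.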

Rev. Mod. Phys., Vol. 61, No. 1, January 1989



Steven Weinberg: The cosmological constant problem 13

The conditions that $I$ be stationary with respect to variations in $x^\mu(s)$ and $g(s)$ are, respectively,

$$\frac{dp_\mu}{ds} = 0 , \tag{7.11}$$

$$p^\mu p_\mu = -m^2 , \tag{7.12}$$

where $p_\mu$ is the canonical conjugate to $x^\mu$:

(7.13)

However, just because we choose to write the action in a reparametrization-invariant way does not necessarily mean that we must treat the einbein $g(s)$ as a dynamical quantity. If we treat $x^\mu(s)$, but not $g(s)$, as dynamical variables, then we obtain Eq. (7.11), but not (7.12). Of course, Eq. (7.11) implies that $p^\mu p_\mu$ is a constant [just as Eq. (7.3) implies that $R - 8\pi G\,T^\lambda{}_\lambda$ is constant]. If we like, we can call this constant $-m^2$, but this is now a mere integration constant, unrelated to anything in the original action.

Now to quantization. The Hamiltonian here is

(7.14)

so in quantum mechanics we calculate amplitudes by the functional integral

$$A = \int [dx][dp][dg]\,\cdots \tag{7.15}$$

The einbein $g(s)$ has no canonical conjugate, and so appears here only as a Lagrange multiplier, whose integral yields a factor

$$\prod_s \delta(p^\mu p_\mu + m^2) . \tag{7.16}$$

Presumably the classical theory in which $g$ is not dynamical would be obtained as the classical limit of a quantum theory in which we do not do a functional integral over $g(s)$, and hence do not get the factor (7.16). But then there would be nothing to keep $p^\mu$ timelike. This is such a trivial theory that it is hard to say that anything goes wrong physically; but we may anticipate that in less trivial theories, we need a field to serve as a Lagrange multiplier for every negative norm degree of freedom like $p^0$. This is the case, for instance, in string theories, where the integration over the world-sheet metric is needed to enforce the Virasoro conditions on physical states.

The quantum theory of gravitation can be put in similar terms. Using the Arnowitt-Deser-Misner (1962) formalism, we calculate amplitudes as functional integrals,

$$Z = \int [dh_{ij}][d\pi^{ij}][d\mathcal{N}][dN^i]\,\cdots \tag{7.17}$$

Here $h_{ij}$, $\mathcal{N}$, and $N^i$ parametrize the 4-metric, with line element given by

$$d\tau^2 = (h^{-1}\mathcal{N}^2 - N^i N^j h_{ij})\,dt^2 - 2 h_{ij} N^i\,dx^j\,dt - h_{ij}\,dx^i\,dx^j , \tag{7.18}$$

$$h \equiv \mathrm{Det}(h_{ij}) . \tag{7.19}$$

Furthermore, $\pi^{ij}$ is the canonical conjugate to $h_{ij}$, and $\mathcal{H}$ and $\mathcal{H}_i$ are functions of $h_{ij}$ and $\pi^{ij}$ and their space derivatives, given by

$$\mathcal{H} = [\ldots]\,G_{ij,kl}\,\pi^{ij}\pi^{kl} - {}^{(3)}R , \tag{7.20}$$

$$\mathcal{H}_i = -2 h_{ij} \nabla_k \pi^{jk} , \tag{7.21}$$

where ${}^{(3)}R$ is the scalar curvature and $\nabla_k$ is the covariant derivative, both calculated using the 3-metric $h_{ij}$, and

$$G_{ij,kl} \equiv h_{ik} h_{jl} + h_{il} h_{jk} - h_{ij} h_{kl} . \tag{7.22}$$

We see that $\mathcal{N}$ and $N^i$ just act as Lagrange multipliers for $\mathcal{H}$ and $\mathcal{H}_i$, respectively. Moreover, from (7.18), we see that $\mathcal{N}^2$ is just the quantity whose status is under question here, the determinant of the 4-metric

$$\mathcal{N} = (\mathrm{Det}\,g_{\mu\nu})^{1/2} . \tag{7.23}$$

Thus, just as the integral over the einbein $g(s)$ enforced the constraint $p^\mu p_\mu = -m^2$, the integral over $\mathrm{Det}\,g$ enforces the constraint

$$\mathcal{H} = 2\Lambda . \tag{7.24}$$

The two conditions are quite similar. Just as $\eta_{\mu\nu}$ has signature $+,+,+,-$, the quantity (7.22), viewed as a $6 \times 6$ matrix, has signature $+,+,+,+,+,-$. Hence the integration over $\mathrm{Det}\,g_{\mu\nu}$ has the effect of eliminating one negative norm degree of freedom for each $x$, $\pi^{ij} \propto (h^{-1})^{ij}$, just as the integral over the einbein $g(s)$ allows one to eliminate the variable $p^0$. However, for gravity there is a "potential" term in $\mathcal{H}$, proportional to the 3-curvature, and it is not entirely clear to me that it really is necessary to constrain $\mathcal{H}$ to take a fixed value. For the present, the question of whether it is necessary to integrate over $\mathrm{Det}\,g_{\mu\nu}$ must be left open. [Recent work by Henneaux and Teitelboim (1988) shows that there is a sensible generally covariant quantum version of the classical theory described by Eq. (7.3).]

Before closing this section, I should note that several authors have made a rather different suggestion, which also has the effect of converting the cosmological constant from a function of parameters in the action into a constant of the motion (Aurilia et al., 1980; Witten, 1983; Henneaux and Teitelboim, 1984). They proposed adding to the action a term

10 In order to obtain this result, I have defined $\mathcal{H}$ and $\mathcal{N}$ differently from the usual $\mathcal{H}$ and $N$, by moving a factor $h^{1/2}$ from $N$ to $\mathcal{H}$.


(7.25)

where $F_{\mu\nu\rho\sigma}$ is the exterior derivative of a 3-form gauge field $A_{\nu\rho\sigma}$,

$$F_{\mu\nu\rho\sigma} = \partial_{[\mu} A_{\nu\rho\sigma]} , \tag{7.26}$$

and $g \equiv -\mathrm{Det}\,g_{\mu\nu}$. Since $F^{\mu\nu\rho\sigma}$ is totally antisymmetric, it can be expressed as

$$F^{\mu\nu\rho\sigma} = c\,\epsilon^{\mu\nu\rho\sigma}/\sqrt{g} , \tag{7.27}$$

where $\epsilon^{\mu\nu\rho\sigma}$ is the Levi-Civita tensor density, with $\epsilon^{0123} = 1$, and $c$ is a scalar field. The field equation for $A$ is

$$F^{\mu\nu\rho\sigma}{}_{;\mu} = 0 , \tag{7.28}$$

so, using (7.27)

$$\frac{\partial c}{\partial x^\mu} = 0 . \tag{7.29}$$

But the action (7.25) then takes the form

(7.30)

In other words, whatever else contributes to the cosmological constant, there is one term that depends on the integration constant $c$,

$$\Lambda_F = 4\pi G c^2 . \tag{7.31}$$

Again, this does not solve the cosmological constant problem, but it does change the way it arises.

If $\Lambda$ is a constant of integration, then in a quantum theory we expect the state vector of the universe to be a superposition of states with different values of $\Lambda$, in which case the anthropic considerations of Sec. V would set a bound on the effective cosmological constant.


VIII. QUANTUM COSMOLOGY

The last approach to the cosmological constant problem that I shall describe here is based on the application of quantum mechanics to the whole universe. In 1984 Hawking (1984b) described how in quantum cosmology there could arise a distribution of values for the effective cosmological constant, with an enormous peak at $\Lambda_{\rm eff} = 0$. Very recently, this approach has been revived in an exciting paper by Coleman (1988b), using a new mechanism for producing a distribution of values for the cosmological constant (that rests in part on other work of Hawking and Coleman) and finding an even sharper peak. Related ideas have also been recently discussed by Banks (1988). Before describing the work of Coleman and Hawking, I will have to say something about quantum cosmology in general.

Most treatments of quantum cosmology are based on the "wave function of the universe," a function $\Psi[h,\phi]$ of the 3-metric and matter fields on a spacelike surface. [The 3-metric $h_{ij}$ can be conveniently defined by adapting the space-time coordinate system so that the spacelike surface has constant $t$, and then decomposing the 4-metric $g_{\mu\nu}$ as in Eq. (7.19).] This wave function satisfies a sort of Schrödinger equation, known as the Wheeler-DeWitt equation [DeWitt (1967); Wheeler (1968)]:

(8.1)

with notation explained in Sec. VII (except that we now include a matter energy density $T_{00}$, in which the canonical conjugate of a matter field $\phi$ is replaced with $\delta/\delta\phi$). It will be very important in what follows that we express the solution as a Euclidean path integral

(8.2)

where we integrate over all Euclidean-signature 4-metrics $g_{\mu\nu}$ and matter fields $\phi$ defined on a 4-manifold $M_4$ that have the 3-manifold $M_3[h,\phi]$ with 3-metric $h_{ij}$ and matter fields $\phi$ as a boundary. (The Wheeler-DeWitt equation is the constraint obtained from integrating the Lagrange multiplier as discussed in Sec. VII.) Here $S$ is the Euclidean action

$$S = \cdots + \text{matter terms} + \text{surface terms} . \tag{8.3}$$

Since Eq. (8.1) is a differential equation in an infinite-dimensional space [the set of $h_{ij}(x)$ and $\phi(x)$ for all $x$], it has an infinite variety of solutions, which can be specified by giving the 4-manifold in Eq. (8.2) other boundaries, besides the $M_3[h,\phi]$ on which the 3-metric and matter fields are specified. Hartle and Hawking (1983) proposed as a cosmological initial condition that the manifold $M_4$ should have no boundaries other than $M_3[h,\phi]$. We will see that Coleman's (1988b) approach does not depend critically on the choice of initial conditions.

There are technical problems associated with this formalism. One is an operator-ordering ambiguity: there are various ways of ordering the $h_{ij}$ fields and $\delta/\delta h_{ij}$ operators in (8.1), all of which have (8.2) as solution, but

11 The Euclidean action $S$ is opposite in sign to what we would get if we replaced the metric $g_{\mu\nu}$ in the action $I$ in Eq. (7.1) with one of signature $+,+,+,+$. This sign of $S$ is chosen so that ordinary matter makes a positive contribution to $S$.

12 The insertion of factors $h^{-a}$ and $h^{a}$ (the exponent is illegible in this copy) in Eq. (8.1) represents one choice of operator ordering, which is made in order to allow the derivation of the conservation equation (8.8).


with different ways of calculating the measure $[dg][d\phi]$ (Hawking and Page, 1986). Another problem, potentially more worrisome, is that for gravity the Euclidean action (8.3) is not bounded below. Gibbons, Hawking, and Perry (1978) have proposed rotating the contour of integration for the overall scale of the 4-metric so that it runs parallel to the imaginary axis. We will not need to go into these technicalities here, because it will turn out that we only need to deal with the effective action at its equilibrium point.

A problem that is more relevant to us here has to do with the probabilistic interpretation of the wave function $\Psi$ and of Euclidean path integrals like (8.2). Hawking has proposed (1984a, 1984c) that $\exp(-S[g,\phi])$ should be regarded as proportional to the probability of a particular metric and matter field history. It is not immediately clear what is meant by this: even supposing that we had the godlike ability to measure the gravitational and matter fields throughout space-time, it would be in a space-time of Lorentzian rather than Euclidean signature. However, since we can (sometimes) go from one signature to another by a complex coordinate transformation, it may be that a Euclidean history $g_{\mu\nu}(x)$, $\phi(x)$ can be interpreted in terms of correlations of scalar quantities, just as if the space-time were Lorentzian. In much of Hawking's work (e.g., Hawking, 1979), these questions are avoided by using the formalism only to calculate the probability that, in the space-time history of the universe, there is a spacelike 3-surface with a given 3-metric $h_{ij}(x)$ and matter fields $\phi(x)$. For instance, with Hartle-Hawking (1983) initial conditions, we would integrate over all closed 4-manifolds that contain such a 3-surface. If this surface bisects the 4-manifold, then it can be regarded as the boundary of the two halves of the 4-manifold, and so the integral is (with some qualifications) just the square of the wave function (8.2). But questions still arise concerning the probabilistic interpretation of $\Psi$, particularly with regard to normalization. If $|\Psi[h,\phi]|^2$ is the probability density that there exists some 3-surface on which the 3-metric is $h_{ij}(x)$ and the matter fields are $\phi(x)$, then we would not simply want to set the functional integral of $|\Psi[h,\phi]|^2$ over $h_{ij}(x)$ and $\phi(x)$ equal to unity, because in this functional integral we are summing up possibilities that are not exclusive; if the universe has some $h_{ij}(x)$ and $\phi(x)$ on one 3-surface, then it may also have some other $h'_{ij}(x)$ and $\phi'(x)$ on some other 3-surface. After all, you would not expect that the probabilities that you ever in your life have flipped a coin and gotten heads, and that you ever in your life have flipped a coin and gotten tails, should add up to unity.

I would like to offer an interpretation of what is meant by treating $|\Psi[h,\phi]|^2$ as a probability density, which seems to me implicit in Hawking's writings (and may already be stated explicitly somewhere in the literature). As everyone has recognized, the problem has to do with the role of time in quantum gravity. [See, e.g., Hartle (1987).] The problems raised here do not arise in asymptotically flat cosmologies, because in such theories there is a natural definition of time, and we generally ask for the probabilities that the fields have certain values at a definite time. However, here time is a coordinate with no objective significance, and this coordinate time is even imaginary. As Augustine (398) warned, "I must not allow my mind to insist that time is something objective." Heeding this warning, suppose we choose some time-keeping field $a(x,t)$, for instance, the trace of the energy-momentum tensor, and use its value to define a local time $a$. Each value of $a$ defines a 3-surface, on which the coordinate time $t$ is a function $t(x,a)$ defined implicitly by

$$a(x, t(x,a)) = a . \tag{8.4}$$

We are then interested in the probability that the tangential components of the metric and all matter fields other than $a(x,t)$ have specified values on this surface. Calling these quantities $b_n(x,t)$, we see that the probability density for the $b_n(x,t)$ to have the values $\beta_n(x)$ at local time $a$ is

$$\cdots \times \prod_{n,x} \delta\bigl(b_n(x,t(x,a)) - \beta_n(x)\bigr) , \tag{8.5}$$

with $N$ a normalization factor, determined by the condition that the total probability of finding any value for the $b_n(x)$ at local time $a$ should be unity:

(8.6)

[This usually makes $N$ a function of $a$, because in (8.5) and (8.6) we integrate only over matter and metric histories for which Eq. (8.4) is satisfied on some 3-surface. With some boundary conditions, this condition is automatically satisfied, and then $N$ is $a$-independent. For instance, if $M_4$ has two boundaries, on which $a(x)$ is required to take values $a_1$ and $a_2$, then there are 3-surfaces on which (8.4) is satisfied for all $a$ in the range $a_1 < a < a_2$.] Where the surface of constant $a$ bisects the 4-space, $P_a[\beta]$ can be written as proportional to the square of the wave function $\Psi[a,\beta]$, but with $a$ constant in 3-space.

13 This quote is not merely a display of useless erudition. Book XI of Augustine's Confessions contains a famous discussion of the nature of time, and it seems to have become a tradition to quote from this chapter in writing about quantum cosmology. Thus Hawking (1979) quotes "What did God do before He made Heaven and Earth? I do not answer as one did merrily: He was preparing a Hell for those that ask such questions. For at no time had God not made anything, for time itself was made by God." Coleman (1988a) quotes "The past is present memory." To this, I can add one more very relevant quote: "I confess to you, Lord, that I still do not know what time is. Yet I confess too that I do know that I am saying this in time, that I have been talking about time for a long time, . . . ."


Coleman (1988b) short-circuits many of the problems that arise in giving a probabilistic interpretation to Euclidean path integrals by using such integrals only to calculate expectation values: the expectation value of an arbitrary scalar field $A(x)$, which may depend on the metric and matter fields and their derivatives, is taken as

$$\langle A \rangle = \frac{\int [dg][d\phi]\,A(x)\,\exp(-S[g,\phi])}{\int [dg][d\phi]\,\exp(-S[g,\phi])} . \tag{8.7}$$

The general covariance of the theory makes $\langle A \rangle$ independent of $x$. In fact, it should be emphasized that this sort of expectation value includes an average over the time in the history of the universe that $A$ is measured. On the other hand, the probability $P_a[\beta]$ discussed above is the expectation value of a nonlocal operator, the delta function in (8.5), and refers to a specific local time $a$.

(I should mention here that there is a very different and apparently unrelated approach to the problem of giving a probabilistic interpretation to the wave function $\Psi$. The Wheeler-DeWitt equation (8.1) is somewhat like the Klein-Gordon equation for a particle in a scalar potential and leads immediately to a somewhat similar conservation law (now given for pure gravity):

$$\cdots \times \mathrm{Im}\left[ \Psi^*[h]\,\frac{\delta}{\delta h_{kl}(x)}\,\Psi[h] \right] \cdots \tag{8.8}$$

Since the beginning, it was hoped that such a conservation law could be used to construct a suitable probability density (DeWitt, 1967). Usually (8.8) is stated in a minisuperspace context, where $h_{ij}(x)$ is constrained to depend on only a finite number of parameters. Since $G_{ij,kl}\,h^{kl} = -h_{ij}$, it is natural to treat the overall scale of $h_{ij}$ as a sort of global time coordinate, and take as a probability density the corresponding component of the conserved current in (8.8). I wish to point out here that such a construction is not limited to any particular minisuperspace formulation, but can be carried out in the general case. Take $\Psi$ to depend on a global time

(8.9)

and an arbitrary (in fact, infinite) number of other parameters $\zeta_n[h]$, all $\zeta_n$ independent of the overall scale of $h_{ij}(x)$:

(8.10)

We also introduce a Jacobian $J(\zeta,T)$ and write the functional measure as

$$[dh] = J\,[d\zeta]\,dT . \tag{8.11}$$

Multiplying Eq. (8.8) with $\delta(T[h] - T)$ and doing an integral over $x$ and a functional integral over $h_{ij}(x)$, we easily find a constancy condition

(8.12)

The trouble here is, of course, the same as that encountered in giving a probabilistic interpretation to the Klein-Gordon equation: the integrand in (8.12) is not, in general, positive. Banks, Fischler, and Susskind (1985) and Vilenkin (1986, 1988a) have considered minisuperspace models in which $\Psi$ is complex, with increasing phase, for which the integrand of Eq. (8.12) is positive-definite; however, this is not the case in general, and, in particular, not for Hartle-Hawking boundary conditions. For a recent more general discussion, see Vilenkin (1988b).)

I now want to give a simplified description of Hawking's (1984b) proposed solution of the cosmological constant problem, using for this purpose parts of Coleman's (1988b) analysis. In order to make the cosmological constant into a dynamical variable, Hawking introduces a 3-form gauge field $A_{\mu\nu\lambda}$ of the sort described at the end of Sec. VII. According to the general ideas of Euclidean quantum cosmology, the probability distribution for the scalar $c(x)$ defined by Eq. (7.27) at any one point $x = x_1$ is

$$P(c) = \langle \delta(c(x_1) - c) \rangle \propto \int [dA][dg][d\phi]\,\delta(c(x_1) - c)\,\exp(-S[A,g,\phi]) . \tag{8.13}$$

It is well known that such functional integrals can be expressed as exponentials of the effective action at its stationary point. In the present case, we have

$$P(c) \propto \exp(-\Gamma[A_c, g_c, \phi_c]) , \tag{8.14}$$

where $\Gamma[A,g,\phi]$ is the total action (the sum of one-particle irreducible graphs with external lines replaced with fields $A$, $g$, $\phi$) and the subscript $c$ indicates that this quantity is to be evaluated at a point where $\Gamma$ is stationary with respect to any variations in $A_{\mu\nu\lambda}(x)$, $g_{\mu\nu}(x)$, or $\phi(x)$ that leave $c(x_1) = c$ fixed. Now, among all the possible stationary points of $\Gamma$, there is one that can be found knowing only the effective action relevant to large

14 The usual proof, for the case without a delta function in the integrand, proceeds by adding a term $\int J\Omega$ to the action, where $\Omega$ denotes the various fields, and $J$ is a set of corresponding currents. The path integral is then $\exp[-W(J)] \equiv \int d\Omega\,\exp(-S - \int J\Omega)$. The effective action is defined by the Legendre transformation $\Gamma(\Omega) = W(J_\Omega) - \int J_\Omega \Omega$, where $J_\Omega$ is the current that produces a given expectation value $\Omega = \delta W/\delta J$. The condition for zero current is that $\Gamma(\Omega)$ be stationary with respect to $\Omega$, and at this point $\Gamma(\Omega) = W(0)$. The delta function in (8.13) can be dealt with by writing it as an integral $\int d\omega\,\exp\{i\omega[c(x_1) - c]\}$. One can then use the above theorem to evaluate the functional integral before integrating over $\omega$, now with no restriction on $c(x)$, and then doing the $\omega$ integral.


4-manifolds. In this case, it is convenient to set all fields except $A_{\mu\nu\lambda}$ and $g_{\mu\nu}$ equal to their ($A$- and $g$-dependent) stationary values, in which case the effective action can be expanded in inverse powers of the size of the manifold

(8.15)

the omitted terms involving more than two derivatives of $g$ and/or $A$. As we saw in Sec. VII, the condition that this be stationary in $A_{\mu\nu\lambda}$ [for variations that keep $c(x_1)$ fixed] is that $F^{\mu\nu\lambda\rho}$ have vanishing covariant divergence, from which it follows that $c$ in Eq. (7.27) is constant; hence

(8.16)

where

$$\Lambda(c) = \frac{c^2}{2} + \lambda . \tag{8.17}$$

The condition that this be stationary in $g_{\mu\nu}$ is, of course, that $g_{\mu\nu}$ satisfy the Einstein field equations with cosmological constant $\Lambda(c)$. For any such solution, $R = -4\Lambda(c)$, so at the stationary point

(8.18)

With Hartle-Hawking boundary conditions, the solution of the Einstein equations for $\Lambda(c) > 0$ is a 4-sphere of proper circumference $2\pi r$, where

(8.19)

yielding a probability density proportional to

$$\exp(-\Gamma_{\rm eff}) = \exp[3\pi/G\Lambda(c)] . \tag{8.20}$$

On the other hand, for $\Lambda(c) < 0$ the solutions can be made compact by imposing periodicity conditions, but they all have $\Gamma_{\rm eff} \ge 0$. Hawking's conclusion is that the probability density has an infinite peak for $\Lambda(c) \to 0+$; hence, after normalizing $P$,

$$P(c) = \delta(c - c_0) , \tag{8.21}$$

where $c_0$ is the value of $c$ (assuming there is one), for which $\Lambda(c) = 0$.

It is important that the quantity $\Lambda(c)$ is the true effective cosmological constant, previously called $\Lambda_{\rm eff}$, that would be measured in gravitational phenomena at long ranges. The constant $\lambda$ in Eq. (8.15) includes all effects of fields other than $g_{\mu\nu}$ and $A_{\mu\nu\lambda}$, including all quantum fluctuations. Hence the result (8.21), if valid, really does solve the cosmological constant problem.

We can check that this result is not invalidated by the terms neglected in Eq. (8.16). For a large radius $r$, the exhibited terms in (8.16) are of order $\Lambda r^4/G$ and $r^2/G$, respectively, while a term with $D > 4$ derivatives would yield a contribution to $\Gamma_{\rm eff}$ of order $(mr)^{4-D}$, where $m$ is some combination of the Planck mass and elementary-particle masses. For $\Lambda(c) \lesssim m^2$, this shifts the size of the manifold by

$$\delta r/r \approx G\Lambda(c)\,[\Lambda(c)/m^2]^{(D-4)/2} \ll 1 .$$

The change in the stationary value of the action is then

$$\delta\Gamma_{\rm eff} \approx [\Lambda(c)/m^2]^{(D-4)/2} \ll 1 ,$$

so these higher-derivative terms have no effect on the singularity (8.20).

Coleman (1988b) does not need to introduce a 3-form gauge field $A_{\mu\nu\lambda}$; rather, in order to make the cosmological constant into a dynamical variable, he considers the effect of topological fixtures known as "wormholes." An explicit example of a wormhole is provided by the metric (Hawking, 1987b, 1988)

$$ds^2 = (1 + b^2/x_\mu x^\mu)^2\,dx_\nu\,dx^\nu . \tag{8.22}$$

This appears to have a singularity at $x^\mu = 0$, but the line element is invariant under the transformation

$$x^\mu \to x'^\mu = x^\mu\,b^2/x_\nu x^\nu , \tag{8.23}$$

so the region $x^\mu x_\mu < b^2$ actually has the same geometry as that with $x^\mu x_\mu > b^2$. The space described by Eq. (8.22) therefore consists of two asymptotically flat 4-spaces, joined together at the 3-surface with $x^\mu x_\mu = b^2$, a 3-sphere known as a "baby universe." This 4-metric is not a solution of the classical Einstein equations (though it does have $R = 0$), but this is not very relevant; the action is

$$S = 3\pi b^2/G , \tag{8.24}$$

so the factor $\exp(-S)$ suppresses the effects of all

15 Such an effective action may be used as the input for calculations in which we include quantum effects only from virtual massless particles with $|q|$ less than some cutoff $\Lambda$. Such effects are, of course, finite, and their $\Lambda$ dependence is to be canceled by giving the coefficients in $\Gamma_{\rm eff}$ a suitable $\Lambda$ dependence. (This point of view is described by Weinberg, 1979b.) In order to prevent these quantum effects from generating an unacceptable cosmological constant, the cutoff $\Lambda$ must be taken very small.

16 This property is shared by an imaginative solution to the cosmological constant problem proposed by Linde (1988a).

17 The importance of quantum fluctuations in space-time topology at small scales has been emphasized for many years by Wheeler (e.g., 1964), and more recently by Hawking (1978) and Strominger (1984). Such "space-time foam" was considered as a mechanism for canceling a cosmological constant by Hawking (1983).


wormholes except those of Planck dimensions or less, for which quantum effects are surely important. [A model with classical wormhole solutions, based on a 2-form axion, has been presented by Giddings and Strominger (1988a).]

If Planck-sized wormholes can connect asymptotically flat 4-spaces, then they can connect any 4-spaces that are large compared to the Planck scale. We are therefore led to consider contributions to the Euclidean path integral from large 4-spaces [like the 4-sphere in Hawking's (1984b) theory] connected to themselves and each other with Planck-sized wormholes. Each wormhole can be regarded as the creation and subsequent destruction of a baby universe [like the 3-sphere of proper circumference $4\pi b$ in Hawking's (1987b, 1988) wormhole model], and such baby universes may also appear as part of the boundary of the 4-manifold.

What are the effects of these wormholes and baby universes? At scales large compared with the scale of the baby universe, the creation or destruction of a baby universe can only show up through the insertion of a local operator in the path integral. The various types of baby universes can be classified according to the form of these local operators. The effect of creating and destroying arbitrary numbers of baby universes of all types can thus be expressed by adding a suitable term in the action

(8.25)

where $a_i$ and $a_i^\dagger$ are the annihilation and creation operators for a baby universe of type $i$, and $O_i(x)$ is the corresponding local operator. [This was first stated by Hawking (1987b). Creation and annihilation operators for baby universes were earlier used by Strominger (1984). For a proof of Eq. (8.25), see Coleman (1988a) and Giddings and Strominger (1988b).] The path integral over all 4-manifolds with given boundary conditions is to be calculated as

$$\int [dg][d\phi]\,e^{-S} = \int_{\rm No} [dg][d\phi]\,\langle B|\,e^{-S}\,|B\rangle , \tag{8.26}$$

where "No" means that wormholes and baby universes are excluded, and $|B\rangle$ is a normalized baby-universe state depending on the boundary conditions. For instance, with Hartle-Hawking boundary conditions, $|B\rangle$ is the empty state

$$a_i |B\rangle = 0 . \tag{8.27}$$

These baby universes have an important effect even if none of them appear as part of the boundary of the 4-manifold, as would be the case for Hartle-Hawking boundary conditions. Hawking (1987b, 1988) has suggested that since the baby universes are unobservable, their effect is an effective loss of quantum coherence. [See also Hawking (1982); Teitelboim (1982); Strominger (1984); Lavrelashvili, Rubakov, and Tinyakov (1987, 1988); Giddings and Strominger (1988b). A contrary view was taken by Gross (1984).] Recently Coleman (1988a) has argued (convincingly, in my view) for a different interpretation [see also Giddings and Strominger (1988b)]. The state $|B\rangle$ in Eq. (8.26) may always be expanded in eigenstates of the operators $a_i + a_i^\dagger$:

$$|B\rangle = \int f_B(\alpha) \prod_i d\alpha_i\,|\alpha\rangle , \tag{8.28}$$

$$(a_i + a_i^\dagger)\,|\alpha\rangle = \alpha_i\,|\alpha\rangle , \tag{8.29}$$

(8.30)

the function $f_B(\alpha)$ depending on the boundary conditions. For instance, for Hartle-Hawking conditions, $|B\rangle$ satisfies Eq. (8.27), and so

$$f_B(\alpha) = \prod_i \pi^{-1/4} \exp(-\alpha_i^2/2) . \tag{8.31}$$

(With $n$ baby universes on the boundary of the 4-space, this would be multiplied with a Hermite polynomial of order $n$.) In the state $|\alpha\rangle$, the effect of the creation and annihilation of baby universes is to change the action $S$ to

$$S_\alpha = S + \sum_i \alpha_i \int O_i(x)\,d^4x . \tag{8.32}$$

That is, the coupling constant multiplying each possible local term $\int O_i\,d^4x$ is changed by an amount $\alpha_i$. As soon as we start to make any sort of measurements, the state of the universe breaks up into an incoherent superposition of these $|\alpha\rangle$'s, each appearing with a priori probability $|f_B(\alpha)|^2$; but for each term we have an ordinary wormhole-free quantum theory, with $\alpha$-dependent action (8.32).

If all we want is to explain why the cosmological constant is not enormous, then our work is essentially done. The effective cosmological constant is a function of the $\alpha_i$, because among the $O_i$ there is a simple operator $O_1 = \sqrt{g}$, whose coefficient contributes a term $8\pi G\alpha_1$ to $\Lambda$, and also because the vacuum energy $\langle\rho\rangle$ depends on the couplings of all interactions, each of which has a term proportional to one of the $\alpha_i$. Now, generic baby-universe states $|B\rangle$ will have components $|\alpha\rangle$ for which $\Lambda_{\rm eff}(\alpha)$ is very small, as well as others for which it is enormous. The anthropic considerations of Sec. V tell us that any scientist who asks about the value of the cosmological constant can only be living in components $|\alpha\rangle$ for which $\Lambda_{\rm eff}$ is quite small, for otherwise galaxies and stars could never have formed (for $\Lambda_{\rm eff} > 0$), or else there would not be time for life to evolve (for $\Lambda_{\rm eff} < 0$).

However, it is of great interest to ask whether the effective cosmological constant is really zero, or just small enough to satisfy anthropic bounds, in which case it should show up observationally. The probability of getting any particular value of the $\alpha_i$, and hence of finding a given value of $\Lambda_{\rm eff}$, is not just given by the function $|f_B(\alpha)|^2$ arising from the boundary conditions, but is also affected by the functional integral itself.

In calculating this effect, Coleman (1988b) observed that although we are to integrate only over connected 4-manifolds, on a scale much larger than the wormhole


scale those manifolds that appear disconnected will really be connected by wormholes. Hence any sort of probability density or expectation value will contain as a factor a sum over disconnected manifolds consisting of arbitrary numbers of closed connected wormhole-free components. Just as for Feynman diagrams, this sum is the exponential of the path integral for a single closed connected wormhole-free manifold

$$\sum_{\rm disconnected\ manifolds} \int [dg]\, \exp(-S_\alpha[g]) = \exp\left(\int_{CC} [dg]\, \exp(-S_\alpha[g])\right), \tag{8.33}$$

where CC indicates that we include only closed connected wormhole-free manifolds, and $S_\alpha[g]$ is the action (8.32) with all fields other than $g_{\mu\nu}(x)$ integrated out.

The path integral in (8.33) can be evaluated by precisely the same methods as described above in connection with Hawking's (1984b) model [and used for this purpose by Coleman (1988b)]. The result is that the probability density for $\lambda_{\rm eff}$ contains a factor (for $\lambda_{\rm eff} > 0$)

$$F = \exp\left(\exp\left[\frac{3\pi}{G\lambda_{\rm eff}}\right]\right). \tag{8.34}$$

The fact that this is now an exponential of an exponential, instead of a mere exponential, is not essential in solving the cosmological constant problem (though it is important in fixing other constants, as described at the end of this section). Either way, the probability distribution has an infinite peak at $\lambda_{\rm eff} \to 0+$, which, after normalizing so that the total probability is unity, means that $P(\alpha)$ has a factor

$$P(\alpha) \propto \delta(\lambda_{\rm eff}(\alpha)). \tag{8.35}$$

In addition, as in Hawking's case, from the way that $F$ has been calculated it is clear that this $\lambda_{\rm eff}$ is the constant that appears in the effective action for pure gravity with all high-energy fluctuations integrated out; hence it is the cosmological constant relevant to astronomical observation.

Has the cosmological constant problem been solved? Perhaps so, but there are still some things to worry about in Coleman's approach, as also in the earlier work of Hawking. Here is a short list of qualms.

(1) Does Euclidean quantum cosmology have anything to do with the real world? It is essential to both Coleman and Hawking that the path integral be given by a stationary point of the Euclideanized action; the conclusion would be completely wiped out if in place of $\exp(3\pi/G\lambda_{\rm eff})$ we had found $\exp(3\pi i/G\lambda_{\rm eff})$. Some of the technical and conceptual difficulties of Euclidean quantum cosmology were discussed at the beginning of this section.

(2) What are the boundary conditions? It is always possible that the essential singularity in $\exp(\exp[3\pi/G\lambda(\alpha)])$ is canceled by an essential zero in the a priori probability $|f_B(\alpha)|^2$. However, this is not the case for Hartle-Hawking boundary conditions, where $|f_B(\alpha)|$ is a simple Gaussian. Moreover, Coleman (1988b) has shown that in his theory such an essential zero would be destroyed by almost any perturbation of the boundary conditions; instead of its being unnatural to have zero cosmological constant, it would be highly unnatural not to. Still, the problem of boundary conditions is disturbing, because it reminds us that quantum cosmology is an incomplete theory.

(3) Are wormholes real? Coleman's calculation depends on there being a clear separation between the very large 4-manifolds, for which the long-range effective action is stationary (and large and negative), and very small wormholes, whose contribution to the action is of order unity (and generally positive). Furthermore, the wormholes have been assumed to be so well separated that we can ignore their interactions (the dilute gas approximation). It may be possible to construct a theory in which the wormhole scale [like $b$ in Eq. (8.22)] is somewhat larger than the Planck scale, large enough to allow the wormhole metric to be calculated classically, but we would still have to ask whether this is actually the case. Hawking (1984b) does not need to worry about wormholes, but how do we know that the 3-form gauge field is real? A related question for both authors: even granting the existence of the stationary point of the action at which $\Gamma_{\rm eff} = -3\pi/\lambda G$, how do we know that this is the dominant stationary point?

(4) What about the other terms in the effective action? For instance, suppose we include the 6-derivative term^19

$$\cdots + \zeta(\alpha)\, G(\alpha)\, R_{\mu\nu}{}^{\lambda\rho} R_{\lambda\rho}{}^{\sigma\tau} R_{\sigma\tau}{}^{\mu\nu}, \tag{8.36}$$

with $\zeta(\alpha)$ a dimensionless coefficient that, like $\lambda$ and $G$, depends on the baby-universe parameters $\alpha_i$. Hawking and Coleman found a stationary point of this action for which $\Gamma_{\rm eff} \to -\infty$ when $\lambda(\alpha)G(\alpha) \to 0$, but for this purpose it is essential that $\zeta(\alpha)$ remain bounded in this limit. (We recall that in our previous discussion of the higher-derivative terms in $\Gamma_{\rm eff}$ we assumed that the coefficient $m^{4-D}$ of terms with $D \ge 4$ derivatives remains less than $\lambda^{-(D-4)/2}$ as $\lambda \to 0$.) But if we can let $1/\lambda G$ go to infinity, then why not let $\zeta$ go to infinity also? In particular, why not use a dimensional factor $1/\lambda(\alpha)$ in place of $G(\alpha)$ in

18 This sum actually includes manifolds that are truly not connected by wormholes or anything else, but their contribution is a harmless multiplicative factor, which will cancel out anyway in normalizing $P(\alpha)$.

19 Terms involving the Ricci tensor $R_{\mu\nu}$ or its trace $R$ are not included here, because they represent merely a redefinition of the metric; see, e.g., Weinberg (1979a). The 4-derivative term $R_{\mu\nu\lambda\rho}R^{\mu\nu\lambda\rho}$ is not included, because it can be combined with terms involving $R_{\mu\nu}$ or $R$ to make a topological invariant and is therefore physically unmeasurable for fixed large-scale topology.
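The statement that the double exponential acts like a delta function after normalization can be illustrated numerically. The following is a minimal sketch, not part of the original argument: it reads Eq. (8.34) as F = exp(exp[3π/(Gλ_eff)]), evaluates log F on a small hypothetical grid of values of the dimensionless product Gλ_eff, and normalizes the weights. The grid values and the function names are assumptions made purely for illustration.

```python
import math

def log_weight(x):
    """Return log F for F = exp(exp(3*pi/x)), where x stands for G*lambda_eff > 0."""
    return math.exp(3.0 * math.pi / x)

def normalized_weights(xs):
    """Normalize F over a finite grid, working with log F to avoid overflow."""
    logs = [log_weight(x) for x in xs]
    top = max(logs)
    # exp(log F - max log F) is at most 1, so nothing overflows here;
    # hugely negative arguments simply underflow to 0.0.
    rel = [math.exp(value - top) for value in logs]
    total = sum(rel)
    return [r / total for r in rel]

# Hypothetical grid of G*lambda_eff values (dimensionless), for illustration only.
grid = [0.02, 0.05, 0.1, 0.5, 1.0]
weights = normalized_weights(grid)

for x, w in zip(grid, weights):
    print(f"G*lambda_eff = {x:5.2f}   normalized weight = {w:.3e}")
```

On any such grid essentially all of the normalized weight lands on the smallest positive value of Gλ_eff, which is the discrete analogue of the factor δ(λ_eff(α)) in Eq. (8.35).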


the last term of Eq. (8.36)? This would completely invalidate the analysis of the singularity in the probability density $P(\alpha)$, and could well wipe it out.

The last of these four qualms suggests some interesting possibilities. Suppose we do assume that for some reason constants like $\zeta(\alpha)$ in Eq. (8.36) are bounded. Then the effect of wormholes is not only to fix $\lambda(\alpha)$ at zero, but also to fix these other constants at their lower or upper bounds. [I think this is the correct interpretation of what Coleman (1988b) calls "the big fix."] For instance, for $\zeta(\alpha)$ bounded and $|\lambda(\alpha)G(\alpha)| \ll 1$, the action (8.36) is stationary for a sphere of proper circumference $2\pi r$, where

$$r^2 = \frac{3}{\lambda} - \frac{64\pi\zeta G^2\lambda}{3}, \tag{8.37}$$

for which the effective action takes the value

$$\Gamma_{\rm eff} = -\frac{3\pi}{G\lambda} - \frac{128\zeta G\lambda\pi^2}{3}. \tag{8.38}$$

Thus the probability distribution $\exp[\exp(-\Gamma_{\rm eff})]$ not only has an infinite peak at $\lambda(\alpha) = 0$, but also contains a factor

$$\exp\left[\frac{128\pi^2}{3}\,\zeta(\alpha)\,G\lambda\,\exp\left(\frac{3\pi}{G\lambda}\right)\right]. \tag{8.39}$$

For $G\lambda \to 0$, the quantity $G\lambda\exp(3\pi/G\lambda)$ becomes infinite, so the normalized probability will have a delta function at the upper bound of $\zeta(\alpha)$. All constants in the effective action for gravitation, including terms with any numbers of derivatives, can be calculated in this way,^20 but they all have to be bounded as $\lambda(\alpha)G(\alpha) \to 0$ for any of this to make sense.

It may be that the bounds (if any) on parameters like $\zeta(\alpha)$ arise from the details of wormhole physics, in which case these remarks are not going to be useful numerically for some time. However, there is another more exciting possibility, that there are just unitarity bounds, which could be calculated working only with low-energy effective theory itself. Of course, we are not likely to be able to measure parameters like $\zeta(\alpha)$, but it would still be nice to be able to calculate them, because up to now the only really unsatisfactory feature of the quantum theory of gravitation has been the apparent arbitrariness of this infinite set of parameters.

IX. OUTLOOK

All of the five approaches to the cosmological constant problem described in Secs. IV-VIII remain interesting. At present, the fifth, based on quantum cosmology, appears the most promising. However, if wormholes (or 3-form gauge fields) do produce a distribution of values for the cosmological constant, but without an infinite peak at $\lambda_{\rm eff} = 0$, then we will have to fall back on the anthropic principle to explain why $\lambda_{\rm eff}$ is not enormously larger than allowed by observation. Alternatively, it may be some change in the theory of gravity, like that described here in Sec. VII, that produces the distribution in values for $\lambda_{\rm eff}$. The approaches based on supersymmetry and adjustment mechanisms described in Secs. IV and VI seem least promising at present, but this may change.

All five approaches have one other thing in common: They show that any solution of the cosmological constant problem is likely to have a much wider impact on other areas of physics or astronomy. One does not need to explain the potential importance of supergravity and superstrings. A light scalar like that needed for adjustment mechanisms could show up macroscopically, as a fifth force. Changing gravity by making ${\rm Det}\, g_{\mu\nu}$ not dynamical would make us rethink our quantum theories of gravitation, and wormholes might force all the constants in these theories to their outer bounds. Finally, and of greatest interest to astronomy, if it is only anthropic constraints that keep the effective cosmological constant within empirical limits, then this constant should be rather large, large enough to show up before long in astronomical observations.

Note added in proof. As might have been expected, in the time since this report was submitted for publication there have appeared a large number of preprints that follow up on various aspects of the work of Coleman (1988b) and Banks (1988). Here is a partial list: Accetta et al. (1988); Adler (1988); Fischler and Susskind (1988); Giddings and Strominger (1988c); Gilbert (1988); Grinstein and Wise (1988); Gupta and Wise (1988); Hosoya (1988); Klebanov, Susskind, and Banks (1988); Myers and Periwal (1988); Polchinski (1988); Rubakov (1988). I am not able to review all of these papers here. However, I do want to mention two further qualms, regarding Coleman's proposed solution of the cosmological constant problem, that are raised by some of these papers. First, Fischler and Susskind (1988), partly on the basis of conversations with V. Kaplunovsky, have pointed out that the exponential damping of large wormholes may be overcome by Coleman's double exponential. If this were the case, we would be confronted with closely packed wormholes of macroscopic as well as Planck scales. This would be a disaster for Coleman's proposed solution of

20 To the extent that it will become possible to calculate functions like $\lambda(\alpha)$, $G(\alpha)$, $\zeta(\alpha)$, etc., in terms of the parameters in an underlying fundamental theory, such as a string theory, the location of the delta functions in $F$ may allow us to infer something about the values of the $\alpha_i$ and of the parameters in the underlying theory. However, without such an underlying theory, it is impossible to use calculations of $\lambda$, $G$, $\zeta$, etc., to infer anything about the observed parameters of some intermediate theory like the standard model. This is because, in addition to charges, masses, etc., the standard model implicitly also involves parameters $\lambda_0$, $G_0$, $\zeta_0$, ... appearing in the effective action for gravitation. When we integrate out the quarks, leptons, and gauge and Higgs bosons, we obtain new values for $\lambda$, $G$, $\zeta$, etc.; but these new values depend on an equal number of unknowns $\lambda_0$, $G_0$, $\zeta_0$, etc., as well as on charges and masses.
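The stationary-sphere results quoted as Eqs. (8.37) and (8.38) can be checked numerically. The sketch below rests on assumptions that are not spelled out in the text: it takes the action of a 4-sphere with squared radius x = r² to be πλx²/(3G) − 2πx/G − 128π²ζG/x, a form chosen because it reproduces the leading value −3π/(Gλ) and the sign needed for a peak at the upper bound of ζ. It then compares the numerically located stationary point with the first-order expressions r² ≈ 3/λ − 64πζG²λ/3 and Γ_eff ≈ −3π/(Gλ) − 128π²ζGλ/3. The parameter values and function names are hypothetical.

```python
import math

def gamma_eff(x, G, lam, zeta):
    """Assumed action of a 4-sphere with squared radius x = r**2."""
    einstein_part = math.pi * lam * x**2 / (3.0 * G) - 2.0 * math.pi * x / G
    cubic_part = -128.0 * math.pi**2 * zeta * G / x
    return einstein_part + cubic_part

def stationary_x(G, lam, zeta, steps=50):
    """Newton iteration on d(gamma_eff)/dx, started at the zeta = 0 solution."""
    x = 3.0 / lam
    for _ in range(steps):
        first = (2.0 * math.pi * lam * x / (3.0 * G) - 2.0 * math.pi / G
                 + 128.0 * math.pi**2 * zeta * G / x**2)
        second = (2.0 * math.pi * lam / (3.0 * G)
                  - 256.0 * math.pi**2 * zeta * G / x**3)
        x -= first / second
    return x

# Illustrative values only (units with G = 1); not taken from the text.
G, lam, zeta = 1.0, 0.01, 1.0

x_num = stationary_x(G, lam, zeta)
x_first_order = 3.0 / lam - 64.0 * math.pi * zeta * G**2 * lam / 3.0
gamma_num = gamma_eff(x_num, G, lam, zeta)
gamma_first_order = (-3.0 * math.pi / (G * lam)
                     - 128.0 * math.pi**2 * zeta * G * lam / 3.0)

print(f"r^2:       numerical {x_num:.4f}   first order {x_first_order:.4f}")
print(f"Gamma_eff: numerical {gamma_num:.4f}   first order {gamma_first_order:.4f}")
```

The two columns agree up to corrections of second order in Gλ, which is the accuracy to which the expansions are meant to hold when |λG| is much less than 1.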


the cosmological constant problem, and would also indicate that we do not fully understand how to use Euclidean path integrals in quantum cosmology. Next, Polchinski (1988) has found that the Euclidean path integral over closed, connected, wormhole-free manifolds inside the exponential in (8.33) has a phase that might eliminate the peak in the probability distribution at zero cosmological constant. As pointed out here in footnote 15, when we use an effective action $\Gamma_{\rm eff}$ to evaluate such path integrals, the effective action must be taken as an input to calculations in which we include quantum fluctuations in massless particle fields with momenta up to some ultraviolet cutoff $\Lambda$. This cutoff must be taken as the same as the infrared cutoff that was used in calculating $\Gamma_{\rm eff}$, so that all fluctuations are taken into account. It was remarked in footnote 15 that $\Lambda$ must be taken very small, to avoid reintroducing a cosmological constant, but as Polchinski now remarks, no matter how small we take $\Lambda$, the integral over fluctuations in the gravitational field with momenta less than $\Lambda$ produces a phase in the integral. Since this phase appears inside the exponential in Eq. (8.33), if its real part is not positive definite there would be no exponential peak at zero cosmological constant. On the other hand, in the absence of wormholes this phase would appear as an overall factor in front of a single exponential, so it would not affect the peaking at zero cosmological constant found by Hawking (1984b).

ACKNOWLEDGMENTS

I have been greatly helped in preparing this review by conversations with many colleagues. Here is a list of a few of those to whom my thanks are especially due. Section II: G. Holton; Sec. III: E. Witten; Sec. IV: S. de Alwis, J. Polchinski, E. Witten; Sec. V: P. J. E. Peebles, P. Shapiro, E. Vishniac; Sec. VI: J. Polchinski; Sec. VII: C. Teitelboim, F. Wilczek; Sec. VIII: L. Abbott, S. Coleman, B. DeWitt, W. Fischler, S. Giddings, L. Susskind, C. Teitelboim, F. Wilczek. Of course, they take no responsibility for anything that I may have gotten wrong. Research was supported in part by the Robert A. Welch Foundation and NSF Grant No. PHY 8605978.

REFERENCES

Abbott, L., 1985, Phys. Lett. B 150, 427.
Abbott, L., 1988, Sci. Am. 258 (No. 5), 106.
Accetta, F. S., A. Chodos, F. Cooper, and B. Shao, 1988, "Fun with the wormhole calculus," Yale University Preprint No. YCTP-P20-88.
Adler, S. L., 1988, "On the Banks-Coleman-Hawking argument for the vanishing of the cosmological constant," Institute for Advanced Study Preprint No. IASSNS-HEP-88/35.
Albrecht, A., and P. J. Steinhardt, 1982, Phys. Rev. Lett. 48, 1220.
Arnowitt, R., S. Deser, and C. W. Misner, 1962, in Gravitation: An Introduction to Current Research, edited by L. Witten (Wiley, New York), p. 227.
Attick, J., G. Moore, and A. Sen, 1987, Institute for Advanced Studies preprint.
Augustine, 398, Confessions, translated by R. S. Pine-Coffin (Penguin Books, Harmondsworth, Middlesex, 1961), Book XI.
Bahcall, J., T. Piran, and S. Weinberg, 1987, Eds., Dark Matter in the Universe: 4th Jerusalem Winter School for Theoretical Physics (World Scientific, Singapore).
Bahcall, S. R., and S. Tremaine, 1988, Astrophys. J. 326, L1.
Banks, T., 1985, Nucl. Phys. B 249, 332.
Banks, T., 1988, University of California, Santa Cruz, Preprint No. SCIPP 88/09.
Banks, T., W. Fischler, and L. Susskind, 1985, Nucl. Phys. B 262, 159.
Barbieri, R., E. Cremmer, and S. Ferrara, 1985, Phys. Lett. B 163, 143.
Barbieri, R., S. Ferrara, D. V. Nanopoulos, and K. S. Stelle, 1982, Phys. Lett. B 113, 219.
Barr, S. M., 1987, Phys. Rev. D 36, 1691.
Barr, S. M., and D. Hochberg, 1988, Phys. Lett. B 211, 49.
Barrow, J. D., and F. J. Tipler, 1986, The Anthropic Cosmological Principle (Clarendon, Oxford).
Bernstein, J., and G. Feinberg, 1986, Eds., Cosmological Constants (Columbia University, New York).
Bludman, S. A., and M. Ruderman, 1977, Phys. Rev. Lett. 38, 255.
Brown, J. D., and C. Teitelboim, 1987a, Phys. Lett. B 195, 177.
Brown, J. D., and C. Teitelboim, 1987b, Nucl. Phys. B 297, 787.
Buchmüller, W., and N. Dragon, 1988a, University of Hannover Preprint No. ITP-UH 1/88.
Buchmüller, W., and N. Dragon, 1988b, Phys. Lett. B 207, 292.
Carter, B., 1974, in International Astronomical Union Symposium 63: Confrontation of Cosmological Theories with Observational Data, edited by M. S. Longair (Reidel, Dordrecht), p. 291.
Carter, B., 1983, in The Constants of Physics, Proceedings of a Royal Society Discussion Meeting, 1983, edited by W. H. McCrea and M. J. Rees (printed for The Royal Society, London, at the University Press, Cambridge), p. 137.
Casimir, H. B. G., 1948, Proc. K. Ned. Akad. Wet. 51, 635.
Chang, N.-P., D.-X. Li, and J. Perez-Mercader, 1988, Phys. Rev. Lett. 60, 882.
Coleman, S., 1988a, Nucl. Phys. B 307, 867.
Coleman, S., 1988b, "Why there is nothing rather than something: A theory of the cosmological constant," Harvard University Preprint No. HUTP-88/A022.
Coleman, S., and F. de Luccia, 1980, Phys. Rev. D 21, 3305.
Coughlan, G. D., I. Kani, G. G. Ross, and G. Segre, 1988, CERN Preprint No. TH. 5014/88.
Cremmer, E., S. Ferrara, C. Kounnas, and D. V. Nanopoulos, 1983, Phys. Lett. B 133, 61.
Cremmer, E., B. Julia, J. Scherk, S. Ferrara, L. Girardello, and P. van Nieuwenhuizen, 1978, Phys. Lett. B 79, 231.
Cremmer, E., B. Julia, J. Scherk, S. Ferrara, L. Girardello, and P. van Nieuwenhuizen, 1979, Nucl. Phys. B 147, 105.
Davies, P. C. W., 1982, The Accidental Universe (Cambridge University, Cambridge).
de Sitter, W., 1917, Mon. Not. R. Astron. Soc. 78, 3 (reprinted in Bernstein and Feinberg, 1986).
de Vaucouleurs, G., 1982, Nature (London) 299, 303.
de Vaucouleurs, G., 1983, Astrophys. J. 268, 468, Appendix B.
DeWitt, B., 1967, Phys. Rev. 160, 1113.
Dicke, R. H., 1961, Nature (London) 192, 440.
Dine, M., W. Fischler, and M. Srednicki, 1981, Phys. Lett. B 104, 199.


Dine, M., R. Rohm, N. Seiberg, and E. Witten, 1985, Phys. Lett. B 156, 55.
Dine, M., and N. Seiberg, 1986, Phys. Rev. Lett. 57, 2625.
Dirac, P. A. M., 1937, Nature (London) 139, 323.
Dolgov, A. D., 1982, in The Very Early Universe: Proceedings of the 1982 Nuffield Workshop at Cambridge, edited by G. W. Gibbons, S. W. Hawking, and S. T. C. Siklos (Cambridge University, Cambridge), p. 449.
Dreitlein, J., 1974, Phys. Rev. Lett. 34, 777.
Eddington, A. S., 1924, The Mathematical Theory of Relativity, 2nd Ed. (Cambridge University, London).
Einstein, A., 1917, Sitzungsber. Preuss. Akad. Wiss. Phys.-Math. Kl. 142 [English translation in The Principle of Relativity (Methuen, 1923, reprinted by Dover Publications), p. 177; and in Bernstein and Feinberg, 1986].
Einstein, A., 1919, Sitzungsber. Preuss. Akad. Wiss., Phys.-Math. Kl. [English translation in The Principle of Relativity (Methuen, 1923, reprinted by Dover Publications), p. 191].
Ellis, J., C. Kounnas, and D. V. Nanopoulos, 1984, Nucl. Phys. B 241, 373.
Ellis, J., A. B. Lahanas, D. V. Nanopoulos, and K. Tamvakis, 1984, Phys. Lett. B 134, 429.
Ellis, J., N. C. Tsamis, and M. Voloshin, 1987, Phys. Lett. B 194, 291.
Fischler, W., and L. Susskind, 1988, "A wormhole catastrophe," Texas Preprint No. UTTG-26-88.
Ford, L. H., 1987, Phys. Rev. D 35, 2339.
Freese, K., F. C. Adams, J. A. Frieman, and E. Mottola, 1987, Nucl. Phys. B 287, 797.
Friedan, D., E. Martinec, and S. Shenker, 1986, Nucl. Phys. B 271, 93.
Friedmann, A., 1924, Z. Phys. 21, 326 [English translation in Bernstein and Feinberg, 1986, Eds., Cosmological Constants (Columbia University, New York)].
Gibbons, G. W., S. W. Hawking, and M. J. Perry, 1978, Nucl. Phys. B 138, 141.
Giddings, S. B., and A. Strominger, 1988a, Nucl. Phys. B 306, 890.
Giddings, S. B., and A. Strominger, 1988b, Nucl. Phys. B 307, 854.
Giddings, S. B., and A. Strominger, 1988c, "Baby universes, third quantization, and the cosmological constant," Harvard Preprint No. HUTP-88/A036.
Gilbert, G., 1988, "Wormhole induced proton decay," Caltech Preprint No. CALT-68-1524.
Grinstein, B., and M. B. Wise, 1988, "Light scalars in quantum gravity," Caltech Preprint No. CALT-68-1505.
Grisaru, M. T., W. Siegel, and M. Rocek, 1979, Nucl. Phys. B 159, 429.
Gross, D. J., 1984, Nucl. Phys. B 236, 349.
Gupta, A. K., and M. B. Wise, 1988, "Comment on wormhole correlations," Caltech Preprint No. CALT-68-1520.
Guth, A. H., 1981, Phys. Rev. D 23, 347.
Hartle, J. B., 1987, in Gravitation in Astrophysics: Cargese 1986, edited by B. Carter and J. B. Hartle (Plenum, New York), p. 329.
Hartle, J. B., and S. W. Hawking, 1983, Phys. Rev. D 28, 2960.
Hawking, S. W., 1978, Nucl. Phys. B 144, 349.
Hawking, S. W., 1979, in Three Hundred Years of Gravitation, edited by S. W. Hawking and W. Israel (Cambridge University, Cambridge).
Hawking, S. W., 1982, Commun. Math. Phys. 87, 395.
Hawking, S. W., 1983, Philos. Trans. R. Soc. London, Ser. A 310, 303.
Hawking, S. W., 1984a, Nucl. Phys. B 239, 257.
Hawking, S. W., 1984b, Phys. Lett. B 134, 403.
Hawking, S. W., 1984c, in Relativity, Groups and Topology II, NATO Advanced Study Institute Session XL, Les Houches 1983, edited by B. S. DeWitt and Raymond Stora (Elsevier, Amsterdam), p. 336.
Hawking, S. W., 1987a, remarks quoted by M. Gell-Mann, Phys. Scr. T15, 202 (1987).
Hawking, S. W., 1987b, Phys. Lett. B 195, 337.
Hawking, S. W., 1988, Phys. Rev. D 37, 904.
Hawking, S., and D. Page, 1986, Nucl. Phys. B 264, 185.
Henneaux, M., and C. Teitelboim, 1984, Phys. Lett. B 143, 415.
Henneaux, M., and C. Teitelboim, 1988, "The cosmological constant and general covariance," University of Texas preprint.
Hosoya, A., 1988, "A diagrammatic derivation of Coleman's vanishing cosmological constant," Hiroshima Preprint No. RRK-88-28.
Kim, J., 1979, Phys. Rev. Lett. 43, 103.
Klebanov, I., L. Susskind, and T. Banks, 1988, "Wormholes and cosmological constant," SLAC Preprint No. SLAC-Pub.-4705.
Knapp, G. R., and J. Kormendy, 1987, Eds., Dark Matter in the Universe: I.A.U. Symposium No. 117 (Reidel, Dordrecht).
Lavrelashvili, G. V., V. A. Rubakov, and P. G. Tinyakov, 1987, Pis'ma Zh. Eksp. Teor. Fiz. 46, 134 [JETP Lett. 46, 167 (1987)].
Lavrelashvili, G. V., V. A. Rubakov, and P. G. Tinyakov, 1988, Nucl. Phys. B 299, 757.
Lemaître, G., 1927, Ann. Soc. Sci. Bruxelles, Ser. 1 47, 49.
Lemaître, G., 1931, Mon. Not. R. Astron. Soc. 91, 483.
Linde, A. D., 1974, Pis'ma Zh. Eksp. Teor. Fiz. 19, 320 [JETP Lett. 19, 183 (1974)].
Linde, A. D., 1982, Phys. Lett. B 129, 389.
Linde, A. D., 1986, Phys. Lett. B 175, 395.
Linde, A. D., 1987, Phys. Scr. T15, 169.
Linde, A. D., 1988a, Phys. Lett. B 200, 272.
Linde, A. D., 1988b, Phys. Lett. B 202, 194.
Loh, E. D., 1986, Phys. Rev. Lett. 57, 2865.
Loh, E. D., and E. J. Spillar, 1986, Astrophys. J. 303, 154.
Martinec, E., 1986, Phys. Lett. B 171, 189.
Moore, G., 1987a, Nucl. Phys. B 293, 139.
Moore, G., 1987b, Institute for Advanced Study Preprint No. IASSNS-HEP-87/59, to be published in the proceedings of the Cargese School on Nonperturbative Quantum Field Theory.
Morozov, A., and A. Perelomov, 1987, Phys. Lett. B 199, 209.
Myers, R. C., and V. Periwal, 1988, "Constants and correlations in the Coleman calculus," Santa Barbara Preprint No. NSF-ITP-88-151.
Page, D., 1987, in The World and I (in press).
Pais, A., 1982, 'Subtle is the Lord . . .': The Science and the Life of Albert Einstein (Oxford University, New York).
Peccei, R. D., J. Sola, and C. Wetterich, 1987, Phys. Lett. B 195, 183.
Peebles, P. J. E., 1984, Astrophys. J. 284, 439.
Peebles, P. J. E., 1987a, in Proceedings of the Summer Study on the Physics of the Superconducting Super Collider, edited by R. Donaldson and J. Marx (Division of Particles and Fields of the APS, New York).
Peebles, P. J. E., 1987b, Publ. Astron. Soc. Pac., in press.
Peebles, P. J. E., and B. Ratra, 1988, Astrophys. J. Lett. 325, L17.
Petrosian, V., E. E. Salpeter, and P. Szekeres, 1967, Astrophys. J. 147, 1222.
Polchinski, J., 1986, Commun. Math. Phys. 104, 37.


Polchinski, J., 1987, private communication.
Polchinski, J., 1988, in preparation.
Ratra, B., and P. J. E. Peebles, 1988, Phys. Rev. D 37, 3406.
Rees, M. J., 1987, New Sci. August 6, 1987, p. 43.
Renzini, A., 1986, in Galaxy Distances and Deviations from Universal Expansion, edited by B. F. Madore and R. B. Tully (Reidel, Dordrecht), p. 177.
Reuter, M., and C. Wetterich, 1987, Phys. Lett. 188, 38.
Rohm, R., 1984, Nucl. Phys. B 237, 553.
Rowan-Robinson, M., 1968, Mon. Not. R. Astron. Soc. 141, 445.
Rubakov, V. A., 1988, "On the third quantization and the cosmological constant," DESY preprint.
Shklovsky, I., 1967, Astrophys. J. 150, L1.
Slipher, V. M., 1924, table in Eddington (1924), The Mathematical Theory of Relativity, 2nd Ed. (Cambridge University, London), p. 162.
Sparnaay, M. J., 1957, Nature (London) 180, 334.
Strominger, A., 1984, Phys. Rev. Lett. 52, 1733.
Teitelboim, C., 1982, Phys. Rev. D 25, 3159.
Turner, M. S., G. Steigman, and L. M. Krauss, 1984, Phys. Rev. Lett. 52, 2090.
Van der Bij, J. J., H. Van Dam, and Y. J. Ng, 1982, Physica 116A, 307.
Veltman, M., 1975, Phys. Rev. Lett. 34, 777.
Vilenkin, A., 1986, Phys. Rev. D 33, 3560.
Vilenkin, A., 1988a, Phys. Rev. D 37, 888.
Vilenkin, A., 1988b, Tufts Preprint No. TUTP-88-3.
Weinberg, S., 1972, Gravitation and Cosmology (Wiley, New York).
Weinberg, S., 1979a, in General Relativity: An Einstein Centenary Survey, edited by S. W. Hawking and W. Israel (Cambridge University, Cambridge), p. 800.
Weinberg, S., 1979b, Physica 96A, 327.
Weinberg, S., 1982, Phys. Rev. Lett. 48, 1776.
Weinberg, S., 1983, unpublished remarks at the workshop on "Problems in Unification and Supergravity," La Jolla Institute, 1983.
Weinberg, S., 1987, Phys. Rev. Lett. 59, 2607.
Wetterich, C., 1988, Nucl. Phys. B 302, 668.
Wheeler, J. A., 1964, in Relativity, Groups and Topology, Lectures Delivered at Les Houches, 1963, edited by B. DeWitt and C. DeWitt (Gordon and Breach, New York), p. 317.
Wheeler, J. A., 1968, in Battelle Rencontres, edited by C. DeWitt and J. A. Wheeler (Benjamin, New York).
Wilczek, F., 1984, Phys. Rep. 104, 143.
Wilczek, F., 1985, in How Far Are We from the Gauge Forces: Proceedings of the 1983 Erice Conference, edited by A. Zichichi (Plenum, New York), p. 208.
Wilczek, F., and A. Zee, 1983, unpublished work quoted by Zee (1985), in High Energy Physics: Proceedings of the Annual Orbis Scientiae, edited by S. L. Mintz and A. Perlmutter (Plenum, New York).
Witten, E., 1983, in Proceedings of the 1983 Shelter Island Conference on Quantum Field Theory and the Fundamental Problems of Physics, edited by R. Jackiw, N. N. Khuri, S. Weinberg, and E. Witten (MIT, Cambridge, Massachusetts), p. 273.
Witten, E., 1985, Phys. Lett. B 155, 151.
Witten, E., and J. Bagger, 1982, Phys. Lett. B 115, 202.
Zee, A., 1985, in High Energy Physics: Proceedings of the 20th Annual Orbis Scientiae, 1983, edited by S. L. Mintz and A. Perlmutter (Plenum, New York).
Zel'dovich, Ya. B., 1967, Pis'ma Zh. Eksp. Teor. Fiz. 6, 883 [JETP Lett. 6, 316 (1967)].
Zumino, B., 1975, Nucl. Phys. B 89, 535.


REVIEWS OF MODERN PHYSICS, VOLUME 75, APRIL 2003

The cosmological constant and dark energy


P. J. E. Peebles
Joseph Henry Laboratories, Princeton University, Princeton, New Jersey 08544

Bharat Ratra
Department of Physics, Kansas State University, Manhattan, Kansas 66506
(Published 22 April 2003)

Physics welcomes the idea that space contains energy whose gravitational effect approximates that of Einstein's cosmological constant, $\Lambda$; today the concept is termed dark energy or quintessence. Physics also suggests that dark energy could be dynamical, allowing for the arguably appealing picture of an evolving dark-energy density approaching its natural value, zero, and small now because the expanding universe is old. This would alleviate the classical problem of the curious energy scale of a millielectron volt associated with a constant $\Lambda$. Dark energy may have been detected by recent cosmological tests. These tests make a good scientific case for the context, in the relativistic Friedmann-Lemaître model, in which the gravitational inverse-square law is applied to the scales of cosmology. We have well-checked evidence that the mean mass density is not much more than one-quarter of the critical Einstein-de Sitter value. The case for detection of dark energy is not yet as convincing but still serious; we await more data, which may be derived from work in progress. Planned observations may detect the evolution of the dark-energy density; a positive result would be a considerable stimulus for attempts at understanding the microphysics of dark energy. This review presents the basic physics and astronomy of the subject, reviews the history of ideas, assesses the state of the observational evidence, and comments on recent developments in the search for a fundamental theory.
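The "energy scale of a millielectron volt" mentioned in the abstract can be made concrete with a short calculation. The sketch below is an illustration only: it assumes a Hubble constant of 70 km/s/Mpc and a dark-energy density parameter of 0.7, neither of which is stated in the abstract, computes the critical density 3H0²/(8πG), and converts the corresponding dark-energy density into an energy scale by taking the fourth root of ρc²(ħc)³. The function and constant names are hypothetical.

```python
import math

# Physical constants in SI units (standard values, rounded).
G_NEWTON = 6.674e-11   # m^3 kg^-1 s^-2
C_LIGHT = 2.998e8      # m s^-1
HBAR = 1.0546e-34      # J s
EV = 1.602e-19         # joules per electron volt
MPC = 3.0857e22        # metres per megaparsec

def critical_density(h0_km_s_mpc):
    """Critical mass density 3 H0^2 / (8 pi G) in kg m^-3."""
    h0 = h0_km_s_mpc * 1.0e3 / MPC          # convert to s^-1
    return 3.0 * h0**2 / (8.0 * math.pi * G_NEWTON)

def dark_energy_scale_ev(h0_km_s_mpc, omega_lambda):
    """Energy scale (rho c^2 (hbar c)^3)^(1/4) in eV for a constant dark-energy density."""
    energy_density = omega_lambda * critical_density(h0_km_s_mpc) * C_LIGHT**2
    return (energy_density * (HBAR * C_LIGHT)**3) ** 0.25 / EV

# Assumed inputs, chosen for illustration; not taken from the text.
H0_ASSUMED = 70.0             # km s^-1 Mpc^-1
OMEGA_LAMBDA_ASSUMED = 0.7    # dimensionless

scale = dark_energy_scale_ev(H0_ASSUMED, OMEGA_LAMBDA_ASSUMED)
print(f"critical density  : {critical_density(H0_ASSUMED):.3e} kg/m^3")
print(f"dark-energy scale : {scale * 1e3:.2f} meV")
```

Because the scale depends only on the fourth root of the density, changing the assumed inputs by tens of percent moves the result very little; it stays at a few millielectron volts.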

CONTENTS

I. Introduction
   A. The issues for observational cosmology
   B. The opportunity for physics
   C. Some explanations
II. Basic Concepts
   A. The Friedmann-Lemaître model
   B. The cosmological constant
   C. Inflation and dark energy
III. Historical Remarks
   A. Einstein's thoughts
   B. The development of ideas
      1. Early indications of $\Lambda$
      2. The coincidences argument against $\Lambda$
      3. Vacuum energy and $\Lambda$
   C. Inflation
      1. The scenario
      2. Inflation in a low-density universe
   D. The cold-dark-matter model
   E. Dark energy
      1. The XCDM parametrization
      2. Decay by emission of matter or radiation
      3. Cosmic field defects
      4. Dark-energy scalar field
IV. The Cosmological Tests
   A. The theories
      1. General relativity
      2. The cold-dark-matter model for structure formation
   B. The tests
      1. Thermal cosmic microwave background radiation
      2. Light-element abundances
      3. Expansion times
      4. The redshift-angular-size and redshift-magnitude relations
      5. Galaxy counts
      6. The gravitational lensing rate
      7. Dynamics and the mean mass density
      8. The baryon mass fraction in clusters of galaxies
      9. The cluster mass function
      10. Biasing and the development of nonlinear mass density fluctuations
      11. The anisotropy of the cosmic microwave background radiation
      12. The mass autocorrelation function and nonbaryonic matter
      13. The gravitational inverse-square law
   C. The state of the cosmological tests
V. Concluding Remarks
Note added in proof
Acknowledgments
Appendix: Recent Dark-Energy Scalar Field Research
References

I. INTRODUCTION

There is significant observational evidence for the detection of Einstein's cosmological constant, $\Lambda$, or a component of the material content of the universe that varies only slowly with time and space and so acts like $\Lambda$. We shall use the term dark energy for $\Lambda$ or a component that acts like it. Detection of dark energy would be a new clue to an old puzzle: the gravitational effect of the zero-point energies of particles and fields. The total with other energies, that are close to homogeneous and nearly independent of time, acts as dark energy. What is

©2003 The American Physical Society



560 P. J. E. Peebles and Bharat Ratra: The cosmological constant and dark energy

puzzling is that the value of the dark-energy density has The reader is referred to Leibundgut's (2001, Sec. 4) dis-
to be tiny compared to what is suggested by dimensional cussion of astrophysical hazards. Astronomers have
analysis; the startling new evidence is that it may be dif- checks for this and other issues of interpretation when
ferent from the only other natural value, zero. considering the observations used in cosmological tests.
The main question to consider now is whether to ac- But it takes nothing away from this careful and elegant
cept the evidence for detection of dark energy. We out- work to note that the checks are seldom convincing, be-
line the nature of the case in this section. After review- cause the astronomy is complicated and what can be
ing the basic concepts of the relativistic world model in observed is sparse. What is more, we do not know ahead
Sec. II, and in Sec. III reviewing the history of ideas, we of time that the physics well tested on scales ranging
present in Sec. IV a more detailed assessment of the from the laboratory to the Solar System survives the
cosmological tests and the evidence for detection of $\Lambda$ or enormous extrapolation to cosmology.
The situation is by no means hopeless. We now have
its analog in dark energy.
significant cross-checks from the consistency of results
There is little new to report on the big issue for based on independent applications of the astronomy and
physics (why the dark-energy density is so small) since of the physics of the cosmological model. If the physics
Weinberg's (1989) review in this journal. But there have or astronomy was faulty we would not expect consis-
been analyses of a simpler idea: can we imagine that the tency from independent lines of evidence, apart from
present dark-energy density is evolving, perhaps ap- the occasional accident and the occasional tendency to
proaching zero? Models are introduced in Secs. II.C and stop the analysis when it approaches the right answer.
III.E, and recent work is summarized in more detail in We have to demand abundant evidence of consistency,
the Appendix. Feasible advances in cosmological tests and that is starting to appear.
could detect evolution of the dark-energy density, and The case for detection of A or dark energy com-
perhaps its gravitational response to large-scale fluctua- mences with the Friedmann-Lemaitre cosmological
tions in the mass distribution. This would substantially model. In this model the expansion history of the uni-
motivate the search for a more fundamental physics verse is determined by a set of dimensionless parameters
model for dark energy. whose sum is normalized to unity,
A. The issues for observational cosmology

We will make two points. First, cosmology has a sub-


stantial observational and experimental basis, which
supports many aspects of the standard model as almost The first, nMo, is a measure of the present mean mass
certainly being good approximations to reality. Second, density in nonrelativistic matter, mainly baryons and
the empirical basis is not nearly as strong for cosmology nonbaryonic dark matter. The second, f l R 0 - 1 X is
as it is for the standard model of particle physics: in a measure of the present mass in the relativistic 3-K
cosmology it is not yet a matter of measuring the param- thermal cosmic microwave background radiation, which
eters in a well-established theory. almost homogeneously fills space, and the accompanying
To explain the second point we direct our attention to low-mass neutrinos. The third is a measure of A or the
those more accustomed to experiments in the laboratory present value of the dark-energy equivalent. The fourth,
than to astronomy-related observations of astronomers RKo, is an effect of the curvature of space. We review
Tantalus principle: one can look at distant objects but some details of these parameters in the next section, and
never touch them. For example, the observations of su- of their measurements in Sec. IV.
pernovae in distant galaxies offer evidence of dark en- The most direct evidence for detection of dark energy
ergy, under the assumption that distant and nearby su- comes from observations of supernovae of a type whose
pernovae are drawn from the same statistical sample intrinsic luminosities are close to uniform (after subtle
(that is, that they are statistically similar enough for the astronomical corrections, a few details of which are dis-
purpose of this test). There is no direct way to check cussed in Sec. IV.B.4). The observed brightness as a
function of the wavelength shift of the radiation probes
this, and it is easy to imagine differences between distant
the geometry of spacetime, in what has come to be
and nearby supernovae of the same nominal type. More
called the redshift-magnitude relation.2 The measure-
distant supernovae are seen in younger galaxies, because ments agree with the relativistic cosmological model
of the travel time of light, and these younger galaxies with aKO=O, meaning no space curvature, and RAo
tend to have more massive rapidly evolving stars with -0.7, meaning nonzero A. A model with RAo=O is two
lower heavy-element abundances. How do we know that
the properties of the supernovae are not also different?
The apparent magnitude is m = -2.5 log,,f plus a constant,
where f is the detected energy flux density in a chosen wave-
Sahni and Starobinsky (2000): Carroll (2001): Weinberg length band. The standard measure of the wavelength shift,
(2001); Witten (2001): and Ellwanger (2002) present more re- due to the expansion of the universe, is the redshift z defined
cent reviews. in Eq. (7) below.

Rev. Mod. Phys., Vol. 75, No. 2, April 2003


594

P. J. E. Peebles and Bharat Ratra: The cosmological constant and dark energy 561

or three standard deviations off the best fit, depending on the data set and analysis technique. This is an important indication, but 2 to 3$\sigma$ is not convincing, even when we can be sure that systematic errors are under reasonable control. And we have to consider that there may be a significant systematic error from differences between distant, high-redshift, and nearby, low-redshift, supernovae.

There is a check, based on the cold-dark-matter (CDM) model³ for structure formation. The fit of the model to the observations reviewed in Sec. IV.B yields two key constraints. First, the angular power spectrum of fluctuations in the temperature of the 3-K thermal cosmic microwave background radiation across the sky indicates that $\Omega_{K0}$ is small. Second, the power spectrum of the spatial distribution of the galaxies requires $\Omega_{M0} \sim 0.25$. Similar estimates of $\Omega_{M0}$ follow from independent lines of observational evidence. The rate of gravitational lensing prefers a somewhat larger value (if $\Omega_{K0}$ is small), and some dynamical analyses of systems of galaxies prefer lower $\Omega_{M0}$. But the differences could all result from measurement uncertainties. Since $\Omega_{R0}$ in Eq. (1) is small, the conclusion is that $\Omega_{\Lambda 0}$ is large, in excellent agreement with the supernovae result.

Caution is in order, however, because this check depends on the CDM model for structure formation. We cannot see the dark matter, so we naturally assign it the simplest properties possible. Maybe it is significant that the model has observational problems with galaxy formation, as discussed in Sec. IV.A.2, or maybe these problems are only apparent, due to the complications of the astronomy. We are going to have to determine which is correct before we can have confidence in the role of the CDM model in cosmological tests. We will get a strong hint from current precision angular distribution measurements of the 3-K thermal cosmic microwave background radiation.⁴ If the results match precisely the prediction of the relativistic model for cosmology and the CDM model for structure formation, with parameter choices that agree with the constraints from all the other cosmological tests, there will be strong evidence that we are approaching a good approximation to reality, and the completion of the great program of cosmological tests that commenced in the 1930s. But all that is in the future.

We wish to emphasize that the advances in the empirical basis for cosmology already are very real and substantial. How firm the conclusion is depends on the issue, of course. Every competent cosmologist we know accepts as established beyond reasonable doubt that the universe is expanding and cooling in a near homogeneous and isotropic way from a hotter denser state: how else could space, which is transparent now, have been filled with radiation that has relaxed to a thermal spectrum? The debate is when the expansion commenced or became a meaningful concept. Some whose opinions and research we respect question the extrapolation of the gravitational inverse-square law, in its use in estimates of masses in galaxies and systems of galaxies, and of $\Omega_{M0}$. We agree that this law is one of the hypotheses to be tested. Our conclusion from the cosmological tests of Sec. IV is that the law passes significant, though not yet complete, tests, and that we already have a strong scientific case, resting on the abundance of cross-checks, that the matter density parameter $\Omega_{M0}$ is about one-quarter. The case for detection of $\Omega_{\Lambda 0}$ is significant too, but not yet as compelling.

For the most part the results of the cosmological tests agree wonderfully well with accepted theory. But the observational challenges to the tests are substantial: we are drawing profound conclusions from very limited information. We have to be liberal when considering ideas about what the universe is like, and conservative when accepting ideas into the established canon.

B. The opportunity for physics

Unless there is some serious and quite unexpected flaw in our understanding of the principles of physics we can be sure the zero-point energy of the electromagnetic field at laboratory wavelengths is real and measurable, as in the Casimir (1948) effect.⁵ Like all energy, this zero-point energy has to contribute to the source term in Einstein's gravitational field equation. If, as seems likely, the zero-point energy of the electromagnetic field is close to homogeneous and independent of the velocity of the observer, it manifests itself as a positive contribution to Einstein's $\Lambda$, or dark energy. The zero-point energies of the fermions make a negative contribution. Other contributions, perhaps including the energy densities of fields that interact only with themselves and

³ The model is named after the nonbaryonic cold dark matter that is assumed to dominate the masses of galaxies in the present universe. There are more assumptions in the CDM model, of course; they are discussed in Secs. III.D and IV.A.2.

⁴ At the time of writing the Microwave Anisotropy Probe (MAP) satellite is collecting data; the project is described in http://map.gsfc.nasa.gov/

⁵ See Bordag, Mohideen, and Mostepanenko (2001) for a recent review. The attractive Casimir force between two parallel conducting plates results from the boundary condition that suppresses the number of modes of oscillation of the electromagnetic field between the plates, thus suppressing the energy of the system. One can understand the effect at small separation without reference to the quantum behavior of the electromagnetic field, such as in the analysis of the van der Waals interaction in quantum mechanics, by taking account of the term in the particle Hamiltonian for the Coulomb potential energy between the charged particles in the two separate neutral objects. But a more complete treatment, as discussed by Cohen-Tannoudji, Dupont-Roc, and Grynberg (1992), replaces the Coulomb interaction with the coupling of the charged particles to the electromagnetic-field operator. In this picture the van der Waals interaction is mediated by the exchange of virtual photons. With either way of looking at the Casimir effect (the perturbation of the normal modes or the exchange of virtual quanta of the unperturbed modes) the effect is the same, the suppression of the energy of the system.


gravity, might have either sign. The value of the sum suggested by dimensional analysis is much larger than what is allowed by the relativistic cosmological model. The only other natural value is $\Lambda = 0$. If $\Lambda$ really is tiny but not zero, this introduces a most stimulating though enigmatic clue to the physics yet to be discovered.

To illustrate the problem we outline an example of a contribution to $\Lambda$. The energy density in the 3-K thermal cosmic microwave background radiation, which amounts to $\Omega_{R0} \sim 5 \times 10^{-5}$ in Eq. (1) (ignoring the neutrinos), peaks at wavelength $\lambda \sim 2$ mm. At this Wien peak the photon occupation number is about one-fifteenth. The zero-point energy amounts to half the energy of a photon at the given frequency. This means the zero-point energy in the electromagnetic field at wavelengths $\lambda \sim 2$ mm amounts to a contribution $\delta\Omega_{\Lambda 0} \sim 4 \times 10^{-4}$ to the density parameter in $\Lambda$ or the dark energy. The sum over the modes scales as $\lambda^{-4}$ [as illustrated in Eq. (37)]. Thus a naive extrapolation to visible wavelengths determines that the contribution amounts to $\delta\Omega_{\Lambda 0} \sim 5 \times 10^{10}$, already a ridiculous number.

The situation can be compared to the development of the theory of weak interactions. The Fermi pointlike interaction model is strikingly successful for a considerable range of energies, but it was clear from the start that the model fails at high energy. A fix was discussed (mediate the interaction by an intermediate boson) and eventually incorporated into the even more successful electroweak theory. General relativity and quantum mechanics are extremely successful over a considerable range of length scales, provided we agree not to use the rules of quantum mechanics to count the zero-point energy density in the vacuum, even though we know we have to count the zero-point energies in all other situations. There are thoughts on improving the situation, though they seem to be less focused than was the case for the Fermi model. Perhaps a new energy component spontaneously cancels the vacuum energy density or the new component varies slowly with position and here and there happens to cancel the vacuum energy density well enough to allow observers like us to exist. Whatever the nature of the more perfect theory, it must reproduce the successes of general relativity and quantum mechanics. That includes the method of representing the material content of the observable universe (all forms of mass and energy) by the stress-energy tensor, and the relation between the stress-energy tensor and the curvature of macroscopic spacetime. One part has to be adjusted.

The numerical values of the parameters in Eq. (1) also are enigmatic, and possibly trying to tell us something. The evidence is that the parameters have the approximate values

$$\Omega_{\Lambda 0} \sim 0.7, \qquad \Omega_{DM0} \sim 0.2, \qquad \Omega_{B0} \sim 0.05. \tag{2}$$

We have written $\Omega_{M0}$ in two parts: $\Omega_{B0}$ measures the density of the baryons we know exist and $\Omega_{DM0}$ measures the hypothetical nonbaryonic cold dark matter we need to fit the cosmological tests. The parameters $\Omega_{B0}$ and $\Omega_{DM0}$ have similar values but represent different things (baryonic and nonbaryonic matter) and $\Omega_{\Lambda 0}$, which is thought to represent something completely different, is not much larger. Also, if the parameters were measured when the universe was one-tenth its present size the time-independent $\Lambda$ parameter would contribute $\Omega_{\Lambda} \sim 0.003$. That is, we seem to have come on the scene just as $\Lambda$ has become an important factor in the expansion rate. These curiosities surely are in part accidental, but maybe in part physically significant. In particular, one might imagine that the dark-energy density represented by $\Lambda$ is rolling to its natural value, zero, but is very small now because we measure it when the universe is very old. We shall discuss efforts along this line to at least partially rationalize the situation.

C. Some explanations

We have to explain our choice of nomenclature. Basic concepts of physics say that space contains homogeneous zero-point energy, and perhaps also energy that is homogeneous or nearly so in other forms, real or effective (such as from counter terms in gravity physics, which make the net energy density cosmologically acceptable). In the literature this near homogeneous energy has been termed vacuum energy, the sum of vacuum energy and quintessence (Caldwell, Davé, and Steinhardt, 1998), and dark energy (Turner, 1999). We have adopted the last term, and we shall refer to the dark-energy density $\rho_\Lambda$ that manifests itself as an effective version of Einstein's cosmological constant, but one that may vary slowly with time and position.⁶

Our subject involves two quite different traditions, in physics and astronomy. Each has familiar notation, and familiar ideas that may be in the air but not in recent literature. Our attempt to take account of these traditions commences with the summary in Sec. II of the basic notation with brief explanations. We expect that readers will find some of these concepts trivial and others of some use, and that the useful parts will be different for different readers.

We offer in Sec. III our reading of the history of ideas on $\Lambda$ and its generalization to dark energy. This is a fascinating and we think edifying illustration of how science may advance in unexpected directions. It is relevant to an understanding of the present state of research in cosmology, because traditions inform opinions, and people have had mixed feelings about $\Lambda$ ever since Einstein (1917) introduced it 85 years ago. The concept never entirely disappeared in cosmology because a series of observations hinted at its presence, and because to some cosmologists $\Lambda$ fits the formalism too well to be ignored. The search for the physics of the vacuum, and its possible relation to $\Lambda$, has a long history too. Despite

⁶ The dark energy should of course be distinguished from a hypothetical gas of particles with velocity dispersion large enough that the distribution is close to homogeneous.
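
The zero-point estimate in Sec. I.B can be checked with a few lines of arithmetic. The following Python sketch is an illustration added here, not part of the original review. It uses only numbers quoted in the text ($\Omega_{R0} \sim 5 \times 10^{-5}$, an occupation number of about one-fifteenth at the 2 mm Wien peak, and the $\lambda^{-4}$ scaling); the choice of 600 nm to stand for "visible wavelengths" and the helper name `extrapolate` are assumptions made for the example.

```python
# Order-of-magnitude check of the zero-point contribution to the dark energy.
# Inputs quoted in the text:
omega_r0 = 5e-5            # CMB photon density parameter (neutrinos ignored)
occupation = 1.0 / 15.0    # photon occupation number at the Wien peak

# A mode of frequency nu carries occupation * h*nu in photons and h*nu / 2 in
# zero-point energy, so zero-point / photon energy = 0.5 / occupation.
delta_omega_peak = omega_r0 * 0.5 / occupation


def extrapolate(delta_omega, lam_from_m, lam_to_m):
    """Rescale a zero-point contribution using the lambda**-4 mode-sum scaling."""
    return delta_omega * (lam_from_m / lam_to_m) ** 4


lam_peak_m = 2e-3      # 2 mm, from the text
lam_visible_m = 6e-7   # ASSUMPTION: 600 nm stands in for "visible wavelengths"
delta_omega_visible = extrapolate(delta_omega_peak, lam_peak_m, lam_visible_m)

print(f"zero-point contribution near 2 mm:   {delta_omega_peak:.1e}")
print(f"naive extrapolation to the visible: {delta_omega_visible:.1e}")
```

With these inputs the first number comes out near $4 \times 10^{-4}$ and the second within a factor of a few of $10^{11}$, the same orders of magnitude as quoted in the text; the exact value of the second depends on the wavelength adopted for the visible band.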


the common and strong suspicion that $\Lambda$ must be negligibly small, because any other acceptable value is absurd, all this history has made the community well prepared for the recent observational developments that argue for the detection of $\Lambda$.

Our approach in Sec. IV to the discussion of the evidence for detection of $\Lambda$, from the cosmological tests, also requires explanation. One occasionally reads that the tests will show us how the world will end. That certainly seems interesting, but it is not the main point: why should we trust an extrapolation into the indefinite future of a theory that we can at best show is a good approximation to reality?⁷ As we remarked in Sec. I.A, the purpose of the tests is to check the approximation to reality, by checking the physics and astronomy of the standard relativistic cosmological model, along with any viable alternatives that may be discovered. We take our task to be the identification of the aspects of the standard theory that enter the interpretation of the measurements and thus are or may be empirically checked or measured.

II. BASIC CONCEPTS

A. The Friedmann-Lemaître model

The standard world model is close to homogeneous and isotropic on large scales, and lumpy on small scales: the effect of mass concentrations in galaxies, stars, people, etc. The length scale at the transition from nearly smooth to strongly clumpy is about 10 Mpc. We use here and throughout the standard astronomers' length unit,

$$1\ \mathrm{Mpc} = 3.1 \times 10^{24}\ \mathrm{cm} = 3.3 \times 10^{6}\ \text{light years}. \tag{3}$$

To be more definite, imagine that many spheres of radius 10 Mpc are placed at random, and the mass within each is measured. At this radius the rms fluctuation in the set of values of masses is about equal to the mean value. On smaller scales the departures from homogeneity are progressively more nonlinear; on larger scales the density fluctuations are perturbations to the homogeneous model. From now on we mention these perturbations only when relevant for the cosmological tests.

The expansion of the universe means the distance $l(t)$ between two well-separated galaxies varies with world time $t$ as

$$l(t) \propto a(t), \tag{4}$$

where the expansion or scale factor $a(t)$ is independent of the choice of galaxies. It is an interesting exercise, for those who have not already thought to do so, to check that Eq. (4) is required to preserve homogeneity and isotropy.⁸

The rate of change of the distance in Eq. (4) is the speed

$$v = dl/dt = Hl, \qquad H = \dot{a}/a, \tag{5}$$

where the overdot means the derivative with respect to world time $t$ and $H$ is the time-dependent Hubble parameter. When $v$ is small compared to the speed of light this is Hubble's law. The present value of $H$ is Hubble's constant, $H_0$. When needed we will use⁹

$$H_0 = 100h\ \mathrm{km\ s^{-1}\ Mpc^{-1}} = 67 \pm 7\ \mathrm{km\ s^{-1}\ Mpc^{-1}} = (15 \pm 2\ \mathrm{Gyr})^{-1}, \tag{6}$$

at two standard deviations. The first equation defines the dimensionless parameter $h$.

Another measure of the expansion follows by considering the stretching of the wavelength of light received from a distant galaxy. The observed wavelength $\lambda_{\rm obs}$ of a feature in the spectrum that had wavelength $\lambda_{\rm em}$ at emission satisfies

$$1 + z = \frac{\lambda_{\rm obs}}{\lambda_{\rm em}} = \frac{a(t_{\rm obs})}{a(t_{\rm em})}, \tag{7}$$

⁷ Observations may now have detected $\Lambda$, at a characteristic energy scale of a millielectron volt [Eq. (47)]. We have no guarantee that an even lower-energy scale does not exist; such a scale could first become apparent through cosmological tests.

⁸ We feel we have to comment on a few details about Eq. (4) to avoid contributing to debates that are more intense than seem warranted. Think of the world time $t$ as the proper time kept by each of a dense set of observers, each moving so that all the others are isotropically moving away, and with the times synchronized to a common energy density, $\rho(t)$, in the near homogeneous expanding universe. The distance $l(t)$ is the sum of the proper distances between neighboring observers, all measured at time $t$, and along the shortest distance between the two observers. The rate of increase of the distance, $dl/dt$, may exceed the velocity of light. This is no more problematic in relativity theory than is the large speed at which the beam of a flashlight on Earth may swing across the face of the Moon (assuming an adequately tight beam). Space sections at fixed $t$ may be noncompact, and the total mass of a homogeneous universe formally infinite. As far as is known this is not meaningful: we can only assert that the universe is close to homogeneous and isotropic over observable scales, and that what can be observed is a finite number of baryons and photons.

⁹ The numerical values in Eq. (6) are determined from an analysis of all available measurements of $H_0$ prior to mid-1999 (Gott et al., 2001). They are a very reasonable summary of the current situation. For instance, the Hubble Space Telescope Key Project summary measurement value $H_0 = 72 \pm 8\ \mathrm{km\ s^{-1}\ Mpc^{-1}}$ ($1\sigma$ uncertainty; Freedman et al., 2001) is in very good agreement with Eq. (6), as is the recent Tammann et al. (2001) summary value $H_0 = 60 \pm 6\ \mathrm{km\ s^{-1}\ Mpc^{-1}}$ (approximate $1\sigma$ systematic uncertainty). This is an example of the striking change in the observational situation over the previous five years: the uncertainty in $H_0$ has decreased by more than a factor of 3, making it one of the better-measured cosmological parameters.
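
The unit conversions behind Eqs. (3) and (6) are easy to verify. The short Python sketch below is an illustration added here, not taken from the original; the constant names and the function `hubble_time_gyr` are introduced for the example, and the conversion factors are standard rounded values.

```python
# Unit conversions for the megaparsec and for the Hubble time 1/H0.
KM_PER_MPC = 3.0857e19         # kilometres in one megaparsec
SECONDS_PER_GYR = 3.15576e16   # seconds in 1e9 Julian years
CM_PER_LIGHT_YEAR = 9.4607e17  # centimetres in one light year


def hubble_time_gyr(h0_km_s_mpc):
    """Return 1/H0 in Gyr for a Hubble constant given in km/s/Mpc."""
    h0_per_second = h0_km_s_mpc / KM_PER_MPC
    return 1.0 / (h0_per_second * SECONDS_PER_GYR)


mpc_in_cm = KM_PER_MPC * 1.0e5
mpc_in_light_years = mpc_in_cm / CM_PER_LIGHT_YEAR

print(f"1 Mpc = {mpc_in_cm:.2e} cm = {mpc_in_light_years:.2e} light years")
for h0 in (60.0, 67.0, 74.0):
    print(f"H0 = {h0:.0f} km/s/Mpc  ->  1/H0 = {hubble_time_gyr(h0):.1f} Gyr")
```

The central value $H_0 = 67\ \mathrm{km\ s^{-1}\ Mpc^{-1}}$ gives a Hubble time a little under 15 Gyr, and the range 60 to 74 spans roughly 13 to 16 Gyr, consistent with the rounded $(15 \pm 2\ \mathrm{Gyr})^{-1}$ in Eq. (6).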


where the expansion factor $a$ is defined in Eq. (4) and $z$ is the redshift. That is, the wavelength of freely traveling radiation stretches in proportion to the factor by which the universe expands. To understand this, imagine that a large part of the universe is enclosed in a cavity with perfectly reflecting walls. The cavity expands with the general expansion, the widths proportional to $a(t)$. Electromagnetic radiation is a sum of the normal modes that fit the cavity. At interesting wavelengths the mode frequencies are much larger than the rate of expansion of the universe, so adiabaticity says a photon in a mode stays there, and its wavelength thus must vary as $\lambda \propto a(t)$, as stated in Eq. (7). The cavity disturbs the long-wavelength part of the radiation, but the disturbance can be made exceedingly small by choosing a large cavity.

Equation (7) defines the redshift $z$. The redshift is a convenient label for epochs in the early universe, where $z$ exceeds unity. A good exercise for the student is to check that when $z$ is small Eq. (7) reduces to Hubble's law, where $\lambda z$ is the first-order Doppler shift in the wavelength $\lambda$, and Hubble's parameter $H$ is given by Eq. (5). Thus Hubble's law may be written as $cz = Hl$ (where we have put in the speed of light).

These results follow from the symmetry of the cosmological model and conventional local physics; we do not need general relativity theory. When $z \gtrsim 1$ we need relativistic theory to compute the relations among the redshift and other observables. An example is the relation between redshift and apparent magnitude used in the supernova test. Other cosmological tests check consistency among these relations, and this checks the world model.

In general relativity the second time derivative of the expansion factor satisfies

$$\frac{\ddot{a}}{a} = -\frac{4}{3}\pi G(\rho + 3p). \tag{8}$$

The gravitational constant is $G$. Here and throughout we choose units to set the velocity of light to unity. The mean mass density, $\rho(t)$, and the pressure, $p(t)$, counting all contributions including dark energy, satisfy the local energy-conservation law

$$\dot{\rho} = -3\frac{\dot{a}}{a}(\rho + p). \tag{9}$$

The first term on the right-hand side represents the decrease of mass density due to the expansion that more broadly disperses the matter. The $p\,dV$ work in the second term is a familiar local concept, and is meaningful in general relativity. But one should note that energy does not have a general global meaning in this theory.

The first integral of Eqs. (8) and (9) is the Friedmann equation

$$\dot{a}^2 = \frac{8}{3}\pi G\rho a^2 + \text{const}. \tag{10}$$

It is conventional to rewrite this as

$$\left(\frac{\dot{a}}{a}\right)^2 = H_0^2 E(z)^2 = H_0^2\left[\Omega_{M0}(1+z)^3 + \Omega_{R0}(1+z)^4 + \Omega_{\Lambda 0} + \Omega_{K0}(1+z)^2\right]. \tag{11}$$

The first equation defines the function $E(z)$ that is introduced for later use. The second equation assumes constant $\Lambda$; the time-dependent dark-energy case is reviewed in Secs. II.C and III.E. The first term in the last part of Eq. (11) represents nonrelativistic matter with negligibly small pressure; one sees from Eqs. (7) and (9) that the mass density in this form varies with the expansion of the universe as $\rho_M \propto a^{-3} \propto (1+z)^3$. The second term represents radiation and relativistic matter, with pressure $p_R = \rho_R/3$, whence $\rho_R \propto (1+z)^4$. The third term is the effect of Einstein's cosmological constant, or a constant dark-energy density. The last term, discussed in more detail below, is the constant of integration in Eq. (10). The four density parameters $\Omega_{i0}$ are the fractional contributions to the square of Hubble's constant, $H_0^2$, that is, $\Omega_{i0} = 8\pi G\rho_{i0}/(3H_0^2)$. At the present epoch, $z = 0$, the present value of $\dot{a}/a$ is $H_0$, and the $\Omega_{i0}$ sum to unity [Eq. (1)].

In this notation, Eq. (8) is

$$\frac{\ddot{a}}{a} = -H_0^2\left[\Omega_{M0}(1+z)^3/2 + \Omega_{R0}(1+z)^4 - \Omega_{\Lambda 0}\right]. \tag{12}$$

The constant of integration in Eqs. (10) and (11) is related to the geometry of spatial sections at constant world time. Recall that in general relativity events in spacetime are labeled by the four coordinates $x^\mu$ of time and space. Neighboring events at separation $dx^\mu$ have invariant separation $ds$ defined by the line element

$$ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu. \tag{13}$$

The repeated indices are summed, and the metric tensor $g_{\mu\nu}$ is a function of position in spacetime. If $ds^2$ is positive then $ds$ is the proper (physical) time measured by an observer who moves from one event to the other; if negative, $|ds|$ is the proper distance between the events measured by an observer who is moving so the events are seen to be simultaneous.

In the flat spacetime of special relativity one can choose coordinates so the metric tensor has the Minkowskian form

$$\eta_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{14}$$

A freely falling, inertial observer can choose locally Minkowskian coordinates, such that along the path of the observer $g_{\mu\nu} = \eta_{\mu\nu}$ and the first derivatives of $g_{\mu\nu}$ vanish.
It is conventional to rewrite this as vanish.


In the homogeneous world model we can choose coordinates so the metric tensor is of the form that results in the line element

$$ds^2 = dt^2 - a(t)^2\left[\frac{dr^2}{1 + Kr^2} + r^2(d\theta^2 + \sin^2\theta\, d\phi^2)\right] = dt^2 - K^{-1}a(t)^2\left[d\chi^2 + \sinh^2\chi\,(d\theta^2 + \sin^2\theta\, d\phi^2)\right]. \tag{15}$$

In the second expression, which assumes $K > 0$, the radial coordinate is $r = K^{-1/2}\sinh\chi$. The expansion factor $a(t)$ appears in Eq. (4). If $a$ were constant and the constant $K$ vanished this would represent the flat spacetime of special relativity in polar coordinates. The key point for now is that $\Omega_{K0}$ in Eq. (11), which represents the constant of integration in Eq. (10), is related to the constant $K$:

$$\Omega_{K0} = K/(H_0 a_0)^2, \tag{16}$$

where $a_0$ is the present value of the expansion factor $a(t)$. Cosmological tests that are sensitive to the geometry of space constrain the value of the parameter $\Omega_{K0}$, and $\Omega_{K0}$ and the other density parameters $\Omega_{i0}$ in Eq. (11) determine the expansion history of the universe.

It is useful for what follows to recall that the metric tensor in Eq. (15) satisfies Einstein's field equation, a differential equation we can write as

$$G_{\mu\nu} = 8\pi G\,T_{\mu\nu}. \tag{17}$$

The left side is a function of $g_{\mu\nu}$ and its first two derivatives; it represents the geometry of spacetime. The stress-energy tensor $T_{\mu\nu}$ represents the material contents of the universe, including particles, radiation, fields, and zero-point energies. An observer in a homogeneous and isotropic universe, moving so the universe is observed to be isotropic, would measure the stress-energy tensor to be

$$T_{\mu\nu} = \begin{pmatrix} \rho & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ 0 & 0 & p & 0 \\ 0 & 0 & 0 & p \end{pmatrix}. \tag{18}$$

This diagonal form is a consequence of the symmetry; the diagonal components define the pressure and energy density. With Eq. (18), the differential equation (17) yields the expansion-rate equations (11) and (12).

B. The cosmological constant

Special relativity is very successful in laboratory physics. Thus one might guess that any inertial observer would see the same vacuum. A freely moving inertial observer represents spacetime in the neighborhood by locally Minkowskian coordinates, with the metric tensor $\eta_{\mu\nu}$ given in Eq. (14). A Lorentz transformation to an inertial observer with another velocity does not change this Minkowski form. The same must be true of the stress-energy tensor of the vacuum, if all observers see the same vacuum, so it has to be of the form

$$T^{\Lambda}_{\mu\nu} = \rho_\Lambda g_{\mu\nu}, \tag{19}$$

where $\rho_\Lambda$ is a constant, in a general coordinate labeling. When writing this contribution to the stress-energy tensor separately from the rest, we bring the field equation (17) to

$$G_{\mu\nu} = 8\pi G\,(T_{\mu\nu} + \rho_\Lambda g_{\mu\nu}). \tag{20}$$

This is Einstein's (1917) revision of the field equation of general relativity, where $\rho_\Lambda$ is proportional to his cosmological constant $\Lambda$; his reason for writing down this equation is discussed in Sec. III.A. In many dark-energy scenarios $\rho_\Lambda$ is a slowly varying function of time and its stress-energy tensor differs slightly from Eq. (19), so the observed properties of the vacuum depend on the observer's velocity.

One sees from Eqs. (14), (18), and (19) that the new component in the stress-energy tensor resembles an ideal fluid with negative pressure,

$$p_\Lambda = -\rho_\Lambda. \tag{21}$$

This fluid picture is of limited use, but the following properties are worth noting.

The stress-energy tensor of an ideal fluid with four-velocity $u^\mu$ generalizes from Eq. (18) to $T^{\mu\nu} = (\rho + p)u^\mu u^\nu - p\,g^{\mu\nu}$. The equations of fluid dynamics follow from the vanishing of the divergence of $T^{\mu\nu}$. Let us consider the simple case of locally Minkowskian coordinates, meaning free fall, and a fluid that is close to homogeneous. By the latter we mean the fluid velocity $\vec{v}$ (the space part of the four-velocity $u^\mu$) and the density fluctuation $\delta\rho$ from homogeneity may be treated in linear perturbation theory. Then the equations of energy and momentum conservation are

$$\delta\dot{\rho} + (\langle\rho\rangle + \langle p\rangle)\nabla\cdot\vec{v} = 0, \qquad (\langle\rho\rangle + \langle p\rangle)\dot{\vec{v}} + c_s^2\nabla\delta\rho = 0, \tag{22}$$

where $c_s^2 = dp/d\rho$ and the mean density and pressure are $\langle\rho\rangle$ and $\langle p\rangle$. These combine to

$$\delta\ddot{\rho} = c_s^2\nabla^2\delta\rho. \tag{23}$$

If $c_s^2$ is positive this is a wave equation, and $c_s$ is the speed of sound.

The first of Eqs. (22) is the local energy-conservation law, as in Eq. (9). If $p = -\rho$, the $p\,dV$ work cancels the $\rho\,dV$ part: the work done to increase the volume cancels the effect of the increased volume. This has to be so for a Lorentz-invariant stress-energy tensor, of course, where all inertial observers see the same vacuum. Another way to see this is to note that the energy flux density in Eqs. (22) is $(\langle\rho\rangle + \langle p\rangle)\vec{v}$. This vanishes when

¹⁰ These arguments have been familiar, in some circles, for a long time, though in our experience, discussed more often in private than in the literature. Early statements of elements are in Lemaître (1934) and McCrea (1951); see Kragh (1999, pp. 397 and 398) for a brief historical account.
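
The step from the two linearized conservation laws, Eqs. (22), to the single equation for the density fluctuation, Eq. (23), can be written out explicitly. The lines below are a sketch of that algebra added for illustration; they assume, as the linear treatment in the text does, that the mean density, mean pressure and $c_s^2$ can be treated as constants in the locally Minkowskian frame. The plane-wave trial solution, with wave vector $\vec{k}$ and frequency $\omega$, is introduced here and is not in the original.

```latex
% Time derivative of the energy equation in Eqs. (22):
\delta\ddot{\rho} + (\langle\rho\rangle + \langle p\rangle)\,\nabla\cdot\dot{\vec{v}} = 0 .
% Divergence of the momentum equation in Eqs. (22):
(\langle\rho\rangle + \langle p\rangle)\,\nabla\cdot\dot{\vec{v}} + c_s^2\,\nabla^2\delta\rho = 0 .
% Subtracting the second from the first eliminates the velocity:
\delta\ddot{\rho} = c_s^2\,\nabla^2\delta\rho .
% Plane-wave trial solution (illustrative assumption):
\delta\rho \propto e^{\,i(\vec{k}\cdot\vec{x} - \omega t)}
\quad\Longrightarrow\quad \omega^2 = c_s^2 k^2 .
```

For $c_s^2 > 0$ the frequency is real and the disturbance propagates at speed $c_s$. For $c_s^2 < 0$ the frequency is imaginary, so one of the two modes grows exponentially with time, which is the sense in which a fluid with negative $c_s^2$ is unstable.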


$p = -\rho$: the streaming velocity loses meaning. When $c_s^2$ is negative Eq. (23) shows that the fluid is unstable, in general. But when $p = -\rho$ the vanishing divergence of $T^{\mu\nu}$ becomes the condition shown in Eq. (22), that $\rho = \langle\rho\rangle + \delta\rho$ is constant.

There are two measures of gravitational interactions with a fluid: the passive gravitational mass density determines how the fluid streaming velocity is affected by an applied gravitational field, and the active gravitational mass density determines the gravitational field produced by the fluid. When the fluid velocity is nonrelativistic the expression for the former in general relativity is $\rho + p$, as one can determine by writing out the covariant divergence of $T^{\mu\nu}$. This vanishes when $p = -\rho$, consistent with the loss of meaning of the streaming velocity. The latter is $\rho + 3p$, as one can see from Eq. (8). Thus a fluid with $p = -\rho/3$, if somehow kept homogeneous and static, would produce no gravitational field. In the model in Eqs. (19) and (21) the active gravitational mass density is negative when $\rho_\Lambda$ is positive. When this positive $\rho_\Lambda$ dominates the stress-energy tensor, $\ddot a$ is positive: the rate of expansion of the universe increases. In the language of Eq. (20), this cosmic repulsion is a gravitational effect of the negative active gravitational mass density, not a new force law.

The homogeneous active mass represented by $\Lambda$ changes the equation of relative motion of freely moving test particles in the nonrelativistic limit to

$$\frac{d^2\vec r}{dt^2} = \vec g + \Omega_{\Lambda 0} H_0^2\, \vec r, \tag{24}$$

where $\vec g$ is the relative gravitational acceleration produced by the distribution of ordinary matter. For an illustration of the size of the last term consider its effect on our motion in a nearly circular orbit around the center of the Milky Way galaxy. The Solar System is moving at speed $v_c = 220$ km s$^{-1}$ at radius $r = 8$ kpc. The ratio of the acceleration $g_\Lambda$ produced by $\Lambda$ to the total gravitational acceleration $g = v_c^2/r$ is

$$g_\Lambda / g = \Omega_{\Lambda 0} H_0^2 r^2 / v_c^2 \sim 10^{-5}, \tag{25}$$

a small number. Since we are near the edge of the luminous part of our galaxy, a search for the effect of $\Lambda$ on the internal dynamics of galaxies such as the Milky Way does not look promising. The precision of celestial dynamics in the Solar System is much greater, but the effect of $\Lambda$ is very much smaller; $g_\Lambda/g \sim 10^{-22}$ for the orbit of the Earth.

One can generalize Eq. (19) to a variable $\rho_\Lambda$, by taking $p_\Lambda$ to be negative but different from $-\rho_\Lambda$. But if the dynamics were that of a fluid, with pressure a function of $\rho_\Lambda$, stability would require $c_s^2 = dp_\Lambda/d\rho_\Lambda > 0$, from Eq. (23), which seems quite contrived. A viable working model for a dynamical $\rho_\Lambda$ is the dark energy of a scalar field with self-interaction potential chosen to make the variation of the field energy acceptably slow, as discussed next.

C. Inflation and dark energy

The negative active gravitational mass density associated with a positive cosmological constant is an early precursor of the inflation picture of the early universe; inflation in turn is one precursor of the idea that $\Lambda$ might generalize into evolving dark energy.

To begin, we review some aspects of causal relations between events in spacetime. Neglecting space curvature, a light ray moves a proper distance $dl = a(t)\,dx = dt$ in time interval $dt$, so the integrated coordinate displacement is

$$x = \int \frac{dt}{a(t)}. \tag{26}$$

If $\Omega_{\Lambda 0} = 0$ this integral converges in the past: we see distant galaxies that at the time of observation cannot have seen us since the singular start of expansion at $a = 0$. This particle horizon problem is curious: how could distant galaxies in different directions in the sky know to look so similar? The inflation idea is that in the early universe the expansion history approximates that of de Sitter's (1917) solution to Einstein's field equation for $\Lambda > 0$ and $T_{\mu\nu} = 0$ in Eq. (20). We can choose the coordinate labels in this de Sitter spacetime so space curvature vanishes. Then Eqs. (11) and (12) show that the expansion parameter is

$$a \propto e^{H_\Lambda t}, \tag{27}$$

where $H_\Lambda$ is a constant. As one sees by working the integral in Eq. (26), here everyone can have seen everyone else in the past. The details need not concern us; for the following discussion two concepts are important. First, the early universe acts like an approximation to de Sitter's solution because it is dominated by a large effective cosmological constant, or dark-energy density. Second, the dark energy is modeled as that of a near homogeneous field, $\Phi$.

In this scalar field model, motivated by grand unified models of very-high-energy particle physics, the action of the real scalar field $\Phi$ (in units chosen so that Planck's constant $\hbar$ is unity) is

11 Lest we contribute to a wrong problem for the student we note that a fluid with $p = -\rho/3$ held in a container would have net positive gravitational mass, from the pressure in the container walls required for support against the negative pressure of the contents. We have finessed the walls by considering a homogeneous situation. We believe Whittaker (1935) gives the first derivation of the relativistic active gravitational mass density. Whittaker also presents an example of the general proposition that the active gravitational mass of an isolated stable object is the integral of the time-time part of the stress-energy tensor in the locally Minkowskian rest frame. Misner and Putman (1959) give the general demonstration.

12 This assumes that the particles are close enough for application of the ordinary operational definition of proper relative position. The parameters in the last term follow from Eqs. (8) and (21).


$$S = \int d^4x\, \sqrt{-g}\, \left[ \tfrac{1}{2}\, g^{\mu\nu}\, \partial_\mu\Phi\, \partial_\nu\Phi - V(\Phi) \right]. \tag{28}$$

The potential-energy density $V$ is a function of the field $\Phi$, and $g$ is the determinant of the metric tensor. When the field is spatially homogeneous [in the line element of Eq. (15)], and space curvature may be neglected, the field equation is

$$\ddot\Phi + 3\,\frac{\dot a}{a}\,\dot\Phi + \frac{dV}{d\Phi} = 0. \tag{29}$$

The stress-energy tensor of this homogeneous field is diagonal (in the rest frame of an observer moving so that the universe is seen to be isotropic), with time and space parts along the diagonal

$$\rho_\Phi = \tfrac{1}{2}\dot\Phi^2 + V, \qquad p_\Phi = \tfrac{1}{2}\dot\Phi^2 - V. \tag{30}$$

If the scalar field varies slowly in time, so that $\dot\Phi^2 \ll V$, the field energy approximates the effect of Einstein's cosmological constant, with $p_\Phi \simeq -\rho_\Phi$.

The inflation picture assumes that the near exponential expansion of Eq. (27) in the early universe lasts long enough so that every bit of the present observable universe has seen every other bit, and presumably has discovered how to relax to almost exact homogeneity. The field may then start varying rapidly enough to produce the entropy of our universe, and the field or the entropy may produce the baryons, leaving $\rho_\Phi$ small or zero. But one can imagine that the late time evolution of $\rho_\Phi$ is slow. If it is slower than the evolution in the mass density in matter, there comes a time when $\rho_\Phi$ again dominates, and the universe appears to have a cosmological constant.

A model for this late time evolution assumes a potential of the form

$$V = \kappa/\Phi^{\alpha}, \tag{31}$$

where the constant $\kappa$ has dimensions of mass raised to the power $\alpha + 4$. For simplicity let us suppose the universe after inflation, but at high redshift, is dominated by matter or radiation, with mass density $\rho$, that drives the power-law expansion, $a \propto t^n$. Then the power-law solution to the field Eq. (29) with the potential in Eq. (31) is

$$\Phi \propto t^{2/(2+\alpha)}, \tag{32}$$

and the ratio of the mass densities in the scalar field and in matter or radiation is

$$\rho_\Phi/\rho \propto t^{4/(2+\alpha)}. \tag{33}$$

In the limit at which the parameter $\alpha$ approaches zero, $\rho_\Phi$ is constant, and this model is equivalent to Einstein's $\Lambda$.

When $\alpha > 0$ the field $\Phi$ in this model grows arbitrarily large at large time, so $\rho_\Phi \to 0$, and the universe approaches the Minkowskian spacetime of special relativity. This is within a simple model, of course. It is easy to imagine that in other models $\rho_\Phi$ approaches a constant positive value at large time, and spacetime approaches the de Sitter solution, or $\rho_\Phi$ passes through zero and becomes negative, causing spacetime to collapse to a Big Crunch.

The power-law model with $\alpha > 0$ has two properties that seem desirable. First, the solution in Eq. (32) is said to be an attractor (Ratra and Peebles, 1988) or a tracker (Steinhardt, Wang, and Zlatev, 1999), meaning it is the asymptotic solution for a broad range of initial conditions at high redshift. That includes relaxation to a near homogeneous energy distribution even when gravity has collected the other matter into nonrelativistic clumps. Second, the energy density in the attractor solution decreases less rapidly than that of matter and radiation. This allows us to realize the scenario: after inflation but at high redshift the field energy density $\rho_\Phi$ is small so it does not disturb the standard model for the origin of the light elements, but eventually $\rho_\Phi$ dominates and the universe acts as if it had a cosmological constant, but one that varies slowly with position and time. We comment on details of this model in Sec. III.E.

III. HISTORICAL REMARKS

These comments on what people were thinking are gleaned from the literature and supplemented by private discussions and our own recollections. More is required for a complete history of the subject, of course, but we hope we have captured the main themes and the way in which these themes have evolved into the present appreciation of the situation.

A. Einstein's thoughts

Einstein disliked the idea of an island universe in asymptotically flat spacetime, because a particle could leave the island and move arbitrarily far from all the other matter in the universe, yet preserve all its inertial properties, which he considered a violation of Mach's idea of the relativity of inertia. Einstein's (1917) cosmological model accordingly assumes that the universe is homogeneous and isotropic, on average, thus removing the possibility of arbitrarily isolated particles. Einstein had no empirical support for this assumption, yet it agrees with modern precision tests. There is no agreement as to whether this is more than a lucky guess.

Motivated by the observed low velocities of the then known stars, Einstein assumed that the large-scale structure of the universe is static. He introduced the cosmological constant to reconcile this picture with his general relativity theory. In the notation of Eq. (12), one sees that a positive value of $\Omega_{\Lambda 0}$ can balance the positive values of $\Omega_{M0}$ and $\Omega_{R0}$ for consistency with $\ddot a = 0$. The balance is unstable: a small perturbation to the mean mass density or the mass distribution causes expansion or contraction of the whole or parts of the universe. One sees this in Eq. (24): the mass distribution can be chosen


so the two terms on the right-hand side cancel, but the balance can be upset by redistributing the mass.

Einstein did not consider the cosmological constant to be part of the stress-energy term: his form for the field equation [in the streamlined notation of Eq. (17)] is

$$G_{\mu\nu} - 8\pi G \rho_\Lambda g_{\mu\nu} = 8\pi G\, T_{\mu\nu}. \tag{34}$$

The left-hand side contains the metric tensor and its derivatives; a new constant of nature, $\Lambda$, appears in the addition to Einstein's original field equation. One can equally place Einstein's new term on the right-hand side of the equation, as in Eq. (20), and count $\rho_\Lambda g_{\mu\nu}$ as part of the source term in the stress-energy tensor. The distinction becomes interesting when $\rho_\Lambda$ takes part in the dynamics, and the field equation is properly written with $\rho_\Lambda$, or its generalization, as part of the stress-energy tensor. One would then be able to say that the differential equation of gravity physics has not changed from Einstein's original form; instead there is a new component in the content of the universe.

Having assumed that the universe is static, Einstein did not write down the differential equation for $a(t)$, and so did not see the instability. Friedmann (1922, 1924) found the evolving homogeneous solution, but had the misfortune to do so before the astronomy became suggestive. Slipher's measurements of the spectra of the spiral nebulae (galaxies of stars) showed most are shifted toward the red, and Eddington (1924, pp. 161 and 162) remarked that that might be a manifestation of the second, repulsive term in Eq. (24). Lemaître (1927) introduced the relation between Slipher's redshifts and a homogeneous matter-filled expanding relativistic world model. He may have been influenced by Hubble's work, which led to the publication (Hubble, 1929) of the linear redshift-distance relation [Eq. (5)]: as a graduate student at MIT Lemaître attended a lecture by Hubble.

In Lemaître's (1927) solution, the expanding universe traces asymptotically back to Einstein's static case. Lemaître then turned to what he called the primeval atom, which is now termed the Big Bang model. This solution expands from densities so large that they require some sort of quantum treatment, passes through a quasistatic approximation to Einstein's solution, and then continues expanding to de Sitter's (1917) empty space solution. To modern tastes, this loitering model requires incredibly special initial conditions, as will be discussed. Lemaître liked it because the loitering epoch allows the expansion time to be acceptably long for Hubble's (1929) estimate of $H_0$, which is an order-of-magnitude high.

The record shows Einstein never liked the $\Lambda$ term. His view of how general relativity might fit Mach's principle was disturbed by de Sitter's (1917) solution to Eq. (34) for empty space ($T_{\mu\nu} = 0$) with $\Lambda > 0$. Pais (1982, p. 288) pointed out that Einstein, in a letter to Weyl in 1923, commented on the effect of $\Lambda$ in Eq. (24): "According to De Sitter, two material points that are sufficiently far apart, continue to be accelerated and move apart. If there is no quasistatic world, then away with the cosmological term." We do not know whether at this time Einstein was influenced by Slipher's redshifts or Friedmann's expanding world model.

13 To help motivate the introduction of $\Lambda$, Einstein (1917) mentioned a modification of Newtonian gravity physics that could render the theory well defined when the mass distribution is homogeneous. In Einstein's example, similar to what was considered by Seeliger and Neumann in the mid-1890s, the modified field equation for the gravitational potential $\varphi$ is $\nabla^2\varphi - \lambda\varphi = 4\pi G\rho_M$. This allows the nonsingular homogeneous static solution $\varphi = -4\pi G\rho_M/\lambda$. In this example the potential for an isolated point mass is the Yukawa form, $\varphi \propto e^{-\sqrt{\lambda}\, r}/r$. Trautman (1965) pointed out that this is not the nonrelativistic limit of general relativity with the cosmological term. Rather, Eq. (24) follows from $\nabla^2\varphi = 4\pi G(\rho_M - 2\rho_\Lambda)$, where the active gravitational mass density of the $\Lambda$ term is $\rho_\Lambda + 3p_\Lambda = -2\rho_\Lambda$. Norton (1999) reviewed the history of ideas of this Seeliger-Neumann Yukawa-type potential in gravity physics.

14 North (1965) reviews the confused early history of ideas on the possible astronomical significance of de Sitter's solution for an empty universe with $\Lambda > 0$; we add a few comments regarding the physics that contributed to the discovery of the expanding world model. Suppose an observer in de Sitter's spacetime holds a string tied to a source of light, so the source stays at fixed physical distance $r \ll H_\Lambda^{-1}$. The source is much less massive than the observer, the gravitational frequency shift due to the observer's mass may be neglected, and the observer is moving freely. Then the observer receives light from the source shifted to the red by $\delta\lambda/\lambda = -(H_\Lambda r)^2/2$. The observed redshifts of particles moving on geodesics depend on the initial conditions. Stars in the outskirts of our galaxy are held at fixed mean distances from Earth by their motions. The mean shifts of the spectra of light from these stars include this quadratic de Sitter term as well as the much larger Doppler and ordinary gravitational shifts. The prescription for initial conditions that reproduces the linear redshift-distance relation for distant galaxies follows Weyl's (1923) principle: the world particle geodesics trace back to a near common position in the remote past, in the limiting case of the Friedmann-Lemaître model at $\Omega_{M0} \to 0$. This spatially homogeneous coordinate labeling of de Sitter's spacetime, with space sections with negative curvature, already appears in de Sitter [1917, Eq. (15)], and is repeated in Lanczos (1922). This line element is the second expression in our Eq. (15) with $a \propto \cosh H_\Lambda t$. Lemaître (1925) and Robertson (1928) present the coordinate labeling for the spatially flat case, where the line element is $ds^2 = dt^2 - e^{2H_\Lambda t}(dx^2 + dy^2 + dz^2)$ [in the choice of symbols and signature in Eqs. (15) and (27)]. Lemaître (1925) and Robertson (1928) note that particles at rest in this coordinate system present a linear redshift-distance relation, $v = H_\Lambda r$, at small $v$. Robertson (1928) estimated $H_\Lambda$, and Lemaître (1927) its analog for the Friedmann-Lemaître model, from published redshifts and Hubble's galaxy distances. Their estimates are not far off Hubble's (1929) published value.

15 To the present way of thinking the lengthy debate about the singularity in de Sitter's static solution, chronicled by North (1965), seems surprising, because de Sitter (1917) and Klein (1918) had presented de Sitter's solution as a sphere embedded in 4-plus-1-dimensional flat space, with no physical singularity.
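
As a rough numerical illustration of the size of the repulsive term in Eq. (24), which figures in Eddington's remark and in Einstein's letter to Weyl, one can evaluate the ratio in Eq. (25) directly. The sketch below is in Python. The values $\Omega_{\Lambda 0} = 0.7$ and $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$, the Earth's orbital speed of 29.8 km s$^{-1}$, and the function name are assumptions made for illustration; they are not taken from the text.

```python
# Sketch: ratio of the Lambda acceleration to the ordinary gravitational
# acceleration for a circular orbit, following Eq. (25):
#   g_Lambda / g = Omega_L0 * H0^2 * r^2 / v^2
# OMEGA_L0, H0 and the Earth's orbital speed are illustrative assumptions.

KM = 1.0e3                 # metres per kilometre
KPC = 3.0857e19            # metres per kiloparsec
MPC = 1.0e3 * KPC          # metres per megaparsec
AU = 1.495978707e11        # metres per astronomical unit

OMEGA_L0 = 0.7             # assumed density parameter of Lambda
H0 = 70.0 * KM / MPC       # assumed Hubble constant, converted to 1/s


def lambda_to_gravity_ratio(r_m, v_m_s, omega_l0=OMEGA_L0, h0=H0):
    """Return g_Lambda / g for orbital radius r_m (m) and speed v_m_s (m/s)."""
    g_lambda = omega_l0 * h0 ** 2 * r_m   # last term of Eq. (24)
    g = v_m_s ** 2 / r_m                  # centripetal acceleration v^2 / r
    return g_lambda / g


# Solar System orbit in the Milky Way: r = 8 kpc, v = 220 km/s (from the text).
milky_way = lambda_to_gravity_ratio(8.0 * KPC, 220.0 * KM)
# Earth's orbit around the Sun: r = 1 AU, v = 29.8 km/s (assumed value).
earth = lambda_to_gravity_ratio(1.0 * AU, 29.8 * KM)

print(f"Milky Way orbit: g_Lambda/g = {milky_way:.1e}")
print(f"Earth orbit:     g_Lambda/g = {earth:.1e}")
```

With these assumed inputs the ratio comes out at a few times $10^{-6}$ for the Sun's galactic orbit and close to $10^{-22}$ for the Earth's orbit, in line with the order-of-magnitude estimates quoted with Eq. (25).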


The earliest published comments we have found on Einstein's opinion of $\Lambda$ within the evolving world model (Einstein, 1931; Einstein and de Sitter, 1932) make the point that, since not all the terms in the expansion-rate equation (11) are logically required, and the matter term surely is present and likely dominates over radiation at low redshift, a reasonable working model drops $\Omega_{K0}$ and $\Omega_{\Lambda 0}$ and ignores $\Omega_{R0}$. This simplifies the expansion-rate equation to what has come to be called the Einstein-de Sitter model,

$$\frac{\dot a^2}{a^2} = \frac{8}{3}\pi G \rho_M, \tag{35}$$

where $\rho_M$ is the mass density in nonrelativistic matter; here $\Omega_M = 8\pi G\rho_M/(3H^2)$ is unity. The left side is a measure of the kinetic energy of expansion per unit mass, and the right-hand side a measure of the negative of the gravitational potential energy. In effect, this model universe expands with escape velocity.

Einstein and de Sitter point out that Hubble's estimate of $H_0$ and de Sitter's estimate of the mean mass density in galaxies are not inconsistent with Eq. (35) (and since both quantities scale with distance in the same way, this result is not affected by the error in the distance scale that affected Hubble's initial measurement of $H_0$). But the evidence shows now that the mass density is about one-quarter of what is predicted by this equation, as we will discuss.

Einstein and de Sitter (1932) remarked that the curvature term in Eq. (11) is "essentially determinable, and an increase in the precision of the data derived from observations will enable us in the future to fix its sign and determine its value." This is happening, 70 years later. The cosmological constant term is measurable, in principle, too, and may now have been detected. But Einstein and de Sitter said only that the theory of an expanding universe with finite mean mass density can be reached without the introduction of $\Lambda$.

Further to this point, in the appendix of the second edition of his book, The Meaning of Relativity, Einstein (1945, p. 127) states that the introduction of the "cosmologic member" (Einstein's terminology for $\Lambda$) into the equations of gravity, "though possible from the point of view of relativity, is to be rejected from the point of view of logical economy," and that if "Hubble's expansion had been discovered at the time of the creation of the general theory of relativity, the cosmologic member would never have been introduced. It seems now so much less justified to introduce such a member into the field equations, since its introduction loses its sole original justification, that of leading to a natural solution of the cosmologic problem." Einstein knew that without the cosmological constant the expansion time derived from Hubble's estimate of $H_0$ is uncomfortably short compared to estimates of the ages of the stars, and opined that that might be a problem with the star ages. The big error, the value of $H_0$, was corrected by 1960 (Sandage, 1958, 1962).

Gamow (1970, p. 44) recalls that "when I was discussing cosmological problems with Einstein, he remarked that the introduction of the cosmological term was the biggest blunder he ever made in his life." This certainly is consistent with all of Einstein's written comments that we have seen on the cosmological constant per se; we do not know whether Einstein was also referring to the missed chance to predict the evolution of the universe.

B. The development of ideas

1. Early indications of $\Lambda$

In the classic book, The Classical Theory of Fields, Landau and Lifshitz (1951, p. 338) second Einstein's opinion of the cosmological constant $\Lambda$, stating there is "no basis whatsoever" for adjustment of the theory to include this term. The empirical side of cosmology is not much mentioned in this book, however (though there is a perceptive comment on the limited empirical support for the homogeneity assumption; p. 332). In the Supplementary Notes to the English translation of his book, Theory of Relativity, Pauli (1958, p. 220) also endorses Einstein's position.

Discussions elsewhere in the literature on how one might find empirical constraints on the values of the cosmological parameters usually take account of $\Lambda$. The continued interest was at least in part driven by indications that $\Lambda$ might be needed to reconcile theory and observations. Here are three examples.

First, the expansion time is uncomfortably short if $\Lambda = 0$. Sandage's recalibration of the distance scale in the 1960s indicates $H_0 \simeq 75$ km s$^{-1}$ Mpc$^{-1}$. If $\Lambda = 0$ this shows that the time of expansion from densities too high for stars to have existed is $< H_0^{-1} \simeq 13$ Gyr, maybe less than the ages of the oldest stars, then estimated to be greater than about 15 Gyr. Sandage (1961a) points out that the problem is removed by adding a positive $\Lambda$. The present estimates reviewed below (Sec. IV.B.3) are not far from these numbers, but still too uncertain for a significant case for $\Lambda$.

Second, counts of quasars as a function of redshift show a peak at $z \sim 2$, as would be produced by the loitering epoch in Lemaître's $\Lambda$ model (Petrosian, Salpeter, and Szekeres, 1967; Shklovsky, 1967; Kardashev, 1967). The peak is now well established, centered at $z \sim 2.5$ (Croom et al., 2001; Fan et al., 2001). It is usually interpreted as the evolution in the rate of violent activity in the nuclei of galaxies, though in the absence of a loitering epoch the indicated sharp variation in quasar activity with time is curious (but certainly could be a consequence of astrophysics that is not well understood).

The third example is the redshift-magnitude relation. Sandage's (1961a) analysis indicates this is a promising method of distinguishing world models. The Gunn and Oke (1975) measurement of this relation for giant elliptical galaxies, with Tinsley's (1972) correction for evolution of the star population from assumed formation at high redshift, indicates curvature away from the linear relation in the direction that, as Gunn and Tinsley


(1975) discuss, could only be produced by $\Lambda$ (within general relativity theory). The new application of the redshift-magnitude test, to type-Ia supernovae (Sec. IV.B.4), is not inconsistent with the Gunn-Oke measurement; we do not know whether this agreement of the measurements is significant, because Gunn and Oke were worried about galaxy evolution.

2. The coincidences argument against $\Lambda$

An argument against an observationally interesting value of $\Lambda$, from our distrust of accidental coincidences, has been in the air for decades, and became very influential in the early 1980s with the introduction of the inflation scenario for the very early universe.

If the Einstein-de Sitter model in Eq. (35) were a good approximation at the present epoch, an observer measuring the mean mass density and Hubble's constant when the age of the universe was one-tenth the present value, or ten times the present age, would reach the same conclusion, that the Einstein-de Sitter model is a good approximation. That is, we would flourish at a time that is not special in the course of evolution of the universe. If, on the other hand, two or more of the terms in the expansion-rate equation (11) made substantial contributions to the present value of the expansion rate, it would mean that we are present at a special epoch, because each term in Eq. (11) varies with the expansion factor in a different way. To put this in more detail, we imagine that the physics of the very early universe, when the relativistic cosmological model became a good approximation, set the values of the cosmological parameters. The initial values of the contributions to the expansion-rate equation had to have been very different from each other, and exceedingly specially fixed, to yield two $\Omega_{i0}$'s with comparable values. This would be a most remarkable and unlikely coincidence. The multiple coincidences required for the near vanishing of $\dot a$ and $\ddot a$ at a redshift not much larger than unity makes an even stronger case against Lemaître's loitering model, with this line of argument.

The earliest published comment we have found on this point is by Bondi (1960, p. 166), in the second edition of his book Cosmology. Bondi notes the "remarkable property" of the Einstein-de Sitter model: the dimensionless parameter we now call $\Omega_M$ is independent of the time at which it is computed (since it is unity). The coincidences argument follows and extends Bondi's comment. It is presented in McCrea (1971, p. 151). When Peebles was a postdoctoral research associate, in the early 1960s, in R. H. Dicke's gravity research group, the coincidences argument was discussed, but published much later (Dicke, 1970, p. 62; Dicke and Peebles, 1979). We do not know its provenance in Dicke's group, whether from Bondi, McCrea, Dicke, or someone else. We would not be surprised to learn others had similar thoughts.

The coincidences argument is sensible but not a proof, of course. The discovery of the 3-K thermal cosmic microwave background radiation gave us a term in the expansion-rate equation that is down from the dominant one by four orders of magnitude, not such a large factor by astronomical standards. This might be counted as a first step away from the argument. From the dynamics of galaxies the evidence that $\Omega_{M0}$ is less than unity is another step (Peebles, 1984, p. 442; 1986). And yet another is the development of the evidence that the $\Lambda$ and dark-matter terms differ by only a factor of 3 [Eq. (2)]. This last piece is the most curious, but the community has come to accept it, for the most part. The precedent makes Lemaître's loitering model more socially acceptable.

A socially acceptable value of $\Lambda$ cannot be such as to make life impossible, of course. But perhaps the most productive interpretation of the coincidences argument is that it demands a search for a more fundamental underlying model. This is discussed further in Sec. III.E and the Appendix.

3. Vacuum energy and $\Lambda$

Another tradition to consider is the relation between $\Lambda$ and the vacuum or dark-energy density. In one approach to the motivation for the Einstein field equation, taken by McVittie (1956) and others, $\Lambda$ appears as a constant of integration (of the expression for local conservation of energy and momentum). McVittie (1956, p. 35) emphasizes that, as a constant of integration, $\Lambda$ "cannot be assigned any particular value on a priori grounds." Interesting variants of this line of thought are still under discussion (Weinberg, 1989; Unruh, 1989; and references therein).

The notion of $\Lambda$ as a constant of integration may be related to the issue of the zero point of energy. In laboratory physics one measures and computes energy differences. But the net energy matters for gravity physics, and one can imagine that $\Lambda$ represents the difference between the true energy density and the sum at which one arrives by laboratory physics. Eddington (1939) and Lemaître (1934, 1949) make this point.

16 Early measurements of the redshift-magnitude relation were meant in part to test the Steady State cosmology of Bondi and Gold (1948) and Hoyle (1948). Since Steady State cosmology assumes spacetime is independent of time its line element has to have the form of the de Sitter solution with $\Omega_{K0} = 0$ and the expansion parameter in Eq. (27). The measured curvature of the redshift-magnitude relation is in the direction predicted by Steady State cosmology. But this cosmology fails other tests discussed in Sec. IV.B.

17 If $\Lambda$ were negative and the magnitude too large there would not be enough time for the emergence of life such as ours. If $\Lambda$ were positive and too large the universe would expand too rapidly to allow galaxy formation. Our existence, which requires something resembling the Milky Way galaxy to contain and recycle heavy elements, thus provides an upper bound on the value of $\Lambda$. Such anthropic considerations are discussed by Weinberg (1987, 2001) and Vilenkin (2001), and references therein.
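
The coincidences argument of Sec. III.B.2 can be made concrete with a short calculation. The Python sketch below assumes the standard form of the expansion-rate equation, $(H/H_0)^2 = \Omega_{M0}a^{-3} + \Omega_{R0}a^{-4} + \Omega_{K0}a^{-2} + \Omega_{\Lambda 0}$, as a stand-in for Eq. (11). The function name and the illustrative parameters $\Omega_{M0} = 0.3$, $\Omega_{\Lambda 0} = 0.7$ are assumptions, and the epochs are labeled by expansion factor rather than by age.

```python
# Sketch: how the fractional contributions to the expansion rate change with
# the expansion factor a (a = 1 today). Assumed form of the expansion-rate
# equation:
#   (H/H0)^2 = Om0 * a^-3 + Or0 * a^-4 + Ok0 * a^-2 + Ol0,
# with the curvature term fixed by Ok0 = 1 - Om0 - Or0 - Ol0.


def density_parameters(a, omega_m0, omega_r0, omega_l0):
    """Return each term's fractional contribution to (H/H0)^2 at expansion factor a."""
    omega_k0 = 1.0 - omega_m0 - omega_r0 - omega_l0
    terms = {
        "M": omega_m0 * a ** -3,   # nonrelativistic matter
        "R": omega_r0 * a ** -4,   # radiation
        "K": omega_k0 * a ** -2,   # space curvature
        "L": omega_l0,             # cosmological constant
    }
    e2 = sum(terms.values())       # (H/H0)^2 at this epoch
    return {name: value / e2 for name, value in terms.items()}


for a in (0.1, 1.0, 10.0):
    eds = density_parameters(a, 1.0, 0.0, 0.0)    # Einstein-de Sitter
    mix = density_parameters(a, 0.3, 0.0, 0.7)    # illustrative matter + Lambda
    print(f"a = {a:5.1f}  EdS Omega_M = {eds['M']:.3f}  "
          f"mixed model Omega_L/Omega_M = {mix['L'] / mix['M']:.3g}")
```

In the Einstein-de Sitter case the matter fraction is unity at every epoch, which is Bondi's remarkable property. In the mixed case the ratio of the $\Lambda$ and matter terms scales as $a^3$, so it is of order unity only in a narrow range of epochs around $a = 1$ for the assumed parameters.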


Bronstein (1933) carries the idea further, allowing for transfer of energy between ordinary matter and that represented by $\Lambda$. In our notation, Bronstein expresses this picture by generalizing Eq. (9) to

$$\dot\rho_\Lambda = -\dot\rho - 3\,\frac{\dot a}{a}\,(\rho + p), \tag{36}$$

where $\rho$ and $p$ are the energy density and pressure of ordinary matter and radiation. Bronstein goes on to propose a violation of local energy conservation, a thought that no longer seems interesting. North (1965, p. 81) finds Eddington's (1939) interpretation of the zero point of energy also somewhat hard to defend. But for our purpose the important point is that the idea of $\Lambda$ as a form of energy has been present, in at least some circles, for many years.

The zero-point energy of fields contributes to the dark-energy density. To make physical sense the sum over the zero-point mode energies must be cut off at a short distance or a high frequency up to which the model under consideration is valid. The integral of the zero-point energy ($k/2$) of normal modes (of wave number $k$) of a massless real bosonic scalar field ($\Phi$), up to the wave-number cutoff $k_c$, gives the vacuum energy density the quantum-mechanical expectation value

$$\rho_\Phi = \int_0^{k_c} \frac{4\pi k^2\, dk}{(2\pi)^3}\, \frac{k}{2} = \frac{k_c^4}{16\pi^2}. \tag{37}$$

Nernst (1916) seems to have been the first to write down this equation, in connection with the idea that the zero-point energy of the electromagnetic field fills the vacuum, as a light aether, that could have physically significant properties. This was before Heisenberg and Schrödinger: Nernst's hypothesis is that each degree of freedom, to which classical statistical mechanics assigns energy $kT/2$, has Nullpunktsenergie $h\nu/2$. This would mean that the ground-state energy of a one-dimensional harmonic oscillator is $h\nu$, twice the correct value. Nernst's expression for the energy density in the electromagnetic field thus differs from Eq. (37) by a factor of 2 (after taking account of the two polarizations), which is wonderfully close. For a numerical example, Nernst noted that if the cutoff frequency were $\nu = 10^{20}$ Hz, or $\sim 0.4$ MeV, the energy density of the Lichtäther (light aether) would be $10^{23}$ erg cm$^{-3}$, or about 100 g cm$^{-3}$.

By the end of the 1920s Nernst's hypothesis was replaced with the demonstration that in quantum mechanics the zero-point energy of the vacuum is as real as any other. W. Pauli, in unpublished work in the 1920s, repeated Nernst's calculation, with the correct factor of 2, taking $k_c$ to correspond to the classical electron radius. Pauli knew the value of $\rho_\Lambda$ was quite unacceptable: the radius of the static Einstein universe with this value of $\rho_\Lambda$ "would not even reach to the moon" (Rugh and Zinkernagel, 2002, p. 5). The modern version of this "physicists' cosmological constant problem" is even

18 Kragh (1996, p. 36) describes Bronstein's motivation and history. We discuss this model in more detail in Sec. III.E, and comment on why decay of dark energy into ordinary matter or radiation would be hard to reconcile with the thermal spectrum of the 3-K cosmic microwave background radiation. Decay into the dark sector may be interesting.
"Equation (37), which usually figures in discussions of the
vacuum energy puzzle, gives a helpful indication of the situa-
tion: the zero-point energy of each mode is real and the sum is motion, where the stress-energy tensor is diagonal, which is
large. The physics is seriously incomplete, however. The elimi- not unexpected because we need a preferred frame to define
nation of spatial momenta with magnitudes k > k , only makes k , . It is unacceptable as a model for the properties of dark
sense if there is a preferred reference frame in which k , is energy, of course. For example, if the dark-energy density were
defined. Magueijo and Smolin (2002) mention a related issue: normalized to the value now under discussion, and varied as
In which reference frame is the Planck momentum of a virtual p A a a ( f ) - 4 ,it would quite mess up the standard model for the
particle at the threshold for new phenomena? In both cases origin of the light elements. We get a more acceptable model
one may implicitly choose the rest frame for the large-scale for the behavior of p A from the second prescription, with the
distribution of matter and radiation. It seems strange to think cutoff at a fixed physical momentum. If we also want to satisfy
that the microphysics is concerned about large-scale structure, local energy conservation we must take the pressure to be
but perhaps this happens in a sea of interacting fields. The p a = - p a . This does not contradict the derivation ofpo in the
cutoff in Eq. (37) might be applied at a fixed comoving wave first prescription, because the second situation cannot be de-
number k c m a ( [ ) - ' , or at a fixed physical value of k,. The first scribed by an action: the pressure must be stipulated, not de-
prescription can be described by an action written as a sum of rived. What is worse is that the known fields at laboratory
terms 62/2+k2@Z/[2a(r)']for the allowed modes. The zero- momenta certainly do not allow this stipulation; they are well
point energy of each mode scales with the expansion of the described by analogs of the action in the first prescription. This
universe as a ( [ ) - ' . and the sum over modes scales as po quite unsatisfactory situation illustrates how far we are from a
consistent with k , m a ( t ) - ' . In the limit of exact spa- theory of vacuum energy.
tial homogeneity, an equivalent approach uses the spatial av- "A helpful discussion of Nernst's ideas on cosmology is that
erage of the standard expression for the field stress-energy ten- of Kragh (1996, pp. 151-157).
sor. Indeed, DeWitt (1975) and Akhmedov (2002) show that "This is discussed by Enz and Thellung (1960); Enz (1974);
the vacuum expectation value of the stress-energy tensor, ex- Rugh and Zinkernagel (2002, pp. 4 and 5), and Straumann
pressed as an integral cutoff at k = k , , and computed in the (2002).
preferred coordinate frame, is diagonal with a space part p o "In an unpublished letter in 1930, G. Gamow considered the
=pe/3, for the massless field we are considering. That is, in gravitational consequences of the Dirac sea (Dolgov, 1989, p.
this prescription the vacuum zero-point energy acts like a ho- 230). We thank A. Dolgov for helpful correspondence on this
mogeneous sea of radiation. This defines a preferred frame of point.


more acute, because a natural value for k_c is thought to be much larger than what Nernst or Pauli used.23

While there was occasional discussion of this issue in the middle of the 20th century (as in the quote from N. Bohr in Rugh and Zinkernagel, 2002, p. 5), the modern era begins with the paper by Zeldovich (1967) that convinced the community to consider the possible connection between the vacuum energy density of quantum physics and Einstein's cosmological constant.24

If the physics of the vacuum looks the same to any inertial observer its contribution to the stress-energy tensor is the same as Einstein's cosmological constant [Eq. (19)]. Lemaître (1934) notes this: in order that absolute motion, i.e., motion relative to the vacuum, may not be detected, we must associate a pressure p = -ρc² to the energy density ρc² of vacuum. Gliner (1965) goes further, presenting the relation between the metric tensor and the stress-energy tensor of a vacuum that is the same to any inertial observer. But it was Zeldovich (1968) who presented the argument clearly enough and at the right time to catch the attention of the community.

With the development of the concept of broken symmetry in the now standard model for particle physics came the idea that the expansion and cooling of the universe is accompanied by a sequence of first-order phase transitions accompanying the symmetry breaking. Each first-order transition has a latent heat that appears as a contribution to an effective time-dependent Λ(t) or dark-energy density.25 The decrease in value of the dark-energy density at each phase transition is much larger than the acceptable present value (within relativistic cosmology); the natural presumption is that the dark energy is negligible now. This final condition seems bizarre, but the picture led to the very influential concept of inflation. We discussed the basic elements in connection with Eq. (27); we now turn to some implications.
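
The relation p = -ρc² attributed to Lemaître above can be illustrated with the ideal-fluid stress-energy tensor. The following is a standard textbook sketch, written with c = 1 and metric signature (+,-,-,-); it is not an equation taken from the review.

```latex
\begin{align}
T_{\mu\nu} &= (\rho + p)\,u_\mu u_\nu - p\,g_{\mu\nu}
  && \text{(ideal fluid with four-velocity } u_\mu\text{)} \\
\rho + p = 0 \;&\Rightarrow\; T_{\mu\nu} = \rho\,g_{\mu\nu}
  && \text{(no dependence on } u_\mu\text{: the same for every inertial observer)} \\
\dot\rho &= -3\,\frac{\dot a}{a}\,(\rho + p) = 0
  && \text{(the density stays constant as the universe expands)}
\end{align}
```

The first line is the only place where a velocity enters, so requiring that no observer can detect motion relative to the vacuum removes that term and fixes p = -ρ; the last line is local energy conservation in an expanding universe, which then keeps ρ constant, as for a cosmological constant.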

23 In terms of an energy scale ε_Λ defined by ρ_Λ = ε_Λ⁴, the Planck energy G^(-1/2) is about 30 orders of magnitude larger than the observed value of ε_Λ. This is, of course, an extreme case, since many of the theories of interest break down well below the Planck scale. Furthermore, in addition to other contributions, one may add a counterterm to Eq. (37) to predict any value of ρ_Λ. With reference to this point, it is interesting to note that while Pauli did not publish his computation of ρ_Λ, he remarks in his famous 1933 Handbuch der Physik review on quantum mechanics that it is more consistent to exclude a zero-point energy for each degree of freedom as this energy, evidently from experience, does not interact with the gravitational field (Rugh and Zinkernagel, 2002, p. 5). Pauli was fully aware that one must take account of zero-point energies in the binding energies of molecular structure, for example (and we expect he was aware that what contributes to the energy contributes to the gravitational mass). He chose to drop the section with the above comment from the second (1958) edition of the review (Pauli, 1980, pp. iv and v). In a globally supersymmetric field theory there are equal numbers of bosonic and fermionic degrees of freedom, and the net zero-point vacuum energy density ρ_Λ vanishes (Iliopoulos and Zumino, 1974; Zumino, 1975). However, supersymmetry is not a symmetry of low-energy physics, or even at the electroweak unification scale. It must be broken at low energies, and the proper setting for a discussion of the zero-point ρ_Λ in this case is locally supersymmetric supergravity. Weinberg (1989, p. 6) notes it is very hard to see how any property of supergravity or superstring theory could make the effective cosmological constant sufficiently small. Witten (2001) and Ellwanger (2002) review more recent developments on this issue in the superstring/M theory/branes scenario.

24 For subsequent more detailed discussions of this issue, see Zeldovich (1981), Weinberg (1989), Carroll, Press, and Turner (1992), Sahni and Starobinsky (2000), Carroll (2001), and Rugh and Zinkernagel (2002).
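
The gap of about 30 orders of magnitude between the Planck energy and the dark-energy scale ε_Λ can be checked numerically. The sketch below assumes illustrative present-day values Ω_Λ0 = 0.7 and a Hubble parameter h = 0.7 (in units of 100 km s^-1 Mpc^-1); these inputs, the rounded constants, and the function name are choices made here, not values fixed by the footnote.

```python
import math

# Rounded constants (assumed values, not from the review)
RHO_CRIT_GEV_CM3 = 1.054e-5    # critical density divided by h^2, GeV cm^-3
HBARC_GEV_CM = 1.973e-14       # hbar * c, GeV cm
PLANCK_ENERGY_GEV = 1.221e19   # G^(-1/2) in units hbar = c = 1, GeV


def dark_energy_scale_gev(omega_lambda=0.7, h=0.7):
    """Return eps_Lambda in GeV, defined by rho_Lambda = eps_Lambda**4."""
    rho = omega_lambda * RHO_CRIT_GEV_CM3 * h**2   # GeV cm^-3
    rho_natural = rho * HBARC_GEV_CM**3            # GeV^4 (hbar = c = 1)
    return rho_natural ** 0.25


eps_lambda = dark_energy_scale_gev()
orders = math.log10(PLANCK_ENERGY_GEV / eps_lambda)

print(f"eps_Lambda = {eps_lambda * 1e12:.1f} meV")
print(f"Planck energy / eps_Lambda = 10^{orders:.1f}")
```

For these inputs ε_Λ is a few meV and the ratio is a little above 10^30. Because ε_Λ depends on the fourth root of the density, modest changes in Ω_Λ0 or h shift the exponent only slightly.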


V. CONCLUDING REMARKS

We cannot demonstrate that there does not exist some other physics, applied to some other cosmology, that equally well agrees with the cosmological tests. The same applies to the whole enterprise of physical science, of course. Parts of physics are so densely checked that they are quite convincing approximations to reality. The web of tests is much less dense in cosmology, but, as we have tried to demonstrate, by no means negligible, and growing tighter.

A decade ago there was not much discussion on how to test general relativity theory on the scales of cosmology. That was in part because the theory seemed so logically compelling, and certainly in part also because there was not much evidence to work with. The empirical situation is much better now. We mentioned two tests, namely, of the relativistic active gravitational mass density and the gravitational inverse-square law. The consistency of constraints on Ω_M0 from dynamics and the redshift-magnitude relation adds a test of the effect of space curvature on the expansion rate. These tests are developing; they will be greatly improved by work in progress. There certainly may be surprises in the gravity physics of cosmology at redshifts z ≲ 10^10, but it is already clear that if so the surprises will be subtly hidden.

A decade ago the direction the theory of large-scale structure formation would take was not at all clear. Now the simple CDM model has had enough success that there is good reason to expect the standard model ten years from now will resemble CDM. We have listed challenges to this structure-formation model. Some may merely be a result of the complications of interpreting the theory and observations. Others may prove to be real and drive adjustments of the model. Fixes certainly will include one element from the ideas of structure formation that were under debate a decade ago: explosions that rearrange matter in ways that are difficult to compute. Fixes may also include primeval isocurvature departures from homogeneity, such as in spacetime curvature fluctuations frozen in during inflation, and perhaps in new cosmic fields. It would not be surprising if cosmic field defects, that have such a good pedigree from particle physics, also find a role in structure formation. And a central point to bear in mind is that fixes, which do not seem unlikely, could affect the cosmological tests.

A decade ago we had significant results from the cosmological tests. For example, estimates of the product H_0 t_0 suggested we might need positive Λ, though the precision was not quite adequate for a convincing case. That is still so; the community will be watching for further advances. We had fairly good constraints on Ω_B0 from the theory of the origin of the light elements. The abundance measurements are improving; an important recent development is the detection of deuterium in gas clouds at redshifts z ~ 3. Ten years ago we had useful estimates of masses from peculiar motions on relatively small scales, but also more mixed messages from larger scales. The story seems more coherent now, but there is still room for improved consistency. We had a case for nonbaryonic dark matter, from the constraint on Ω_B0 and from the CDM model for structure formation. The case is tighter now, most notably due to the successful fit of the CDM model prediction to the measurements of the power spectrum of the 3-K cosmic microwave background radiation anisotropy. In 1990 there were believable observations of galaxies identified as radio sources at z ~ 3. Now the distributions of galaxies and the intergalactic medium at z ~ 3 are mapped out in impressive detail, and we are seeing the development of a semi-empirical picture of galaxy formation and evolution. Perhaps that will lead us back to the old dream of using galaxies as markers for cosmological tests.

Until recently it made sense to consider the constraints on one or two of the parameters of the cosmology while holding all the rest at reasonable values. That helps us understand what the measurements are probing; it is the path we have followed in Sec. IV.B. But the modern and very sensible trend is to consider joint fits of large numbers of parameters to the full suite of observations.107 This includes a measure of the biasing or antibiasing of the distribution of galaxies relative to mass, rather than our qualitative argument that one usefully approximates the other. In a fully satisfactory cosmological test the parameter set will also include parametrized departures from standard physics extrapolated to the scales of cosmology.

Community responses to advances in empirical evidence are not always close to linear. The popularity of the Einstein-de Sitter model continued longer into the 1990s than seems logical to us, and the switch to the now standard ΛCDM cosmological model, with flat space sections, nonbaryonic cold dark matter, and dark energy, arguably is more abrupt than warranted by the

107 Examples are Cole et al. (1997), Jenkins et al. (1998), Lineweaver (2001), and Percival et al. (2002).


advances in the evidence. Our review leads us to conclude that there is now a good scientific case arguing that the matter density parameter is Ω_M0 ~ 0.25, and a fairly good case that about three-quarters of that is not baryonic. The cases for dark energy and for the ΛCDM model are significant, too, though obscured by observational issues of whether we have an adequate picture of structure formation. But we expect that rapid advances in the observations of structure formation will soon dissipate these clouds, and, considering the record, likely reveal new clouds over the standard model for cosmology a decade from now.

A decade ago the high-energy-physics community had a well-defined challenge to show why the dark-energy density vanishes. Now there seems to be both a new challenge and clue: determine why the dark-energy density is exceedingly small but not zero. The present state of ideas can be compared with the state of research on structure formation a decade ago: in both situations there are many lines of thought but not a clear picture of which is the best direction to take. The big difference is that a decade ago we could be reasonably sure that observations in progress would guide us to a better understanding of how structure formed. Untangling the physics of dark matter and dark energy and their role in gravity physics is a much more subtle challenge, but, we hope, will be guided by advances in the exploration of the phenomenology. Perhaps in another ten years this will include detecting the evolution of the dark energy, and detecting the gravitational response of the dark-energy distribution to the large-scale mass distribution. There may be three unrelated phenomena to deal with: dark energy, dark matter, and a vanishing sum of zero-point energies and whatever goes with them. Or the phenomena may be related. Because our only evidence of dark matter and dark energy is from their gravity, it is a natural and efficient first step to suppose that their properties are as simple as allowed by the phenomenology. However, it makes sense to watch for hints of more complex physics within the dark sector.

The past eight decades have seen steady advances in the technology used for the cosmological tests, from telescopes to computers; advances in the theoretical concepts underlying the tests; and progress through the learning curves on applying the concepts and technology. We see the results: the basis for cosmology is much firmer than it was a decade ago. And the basis surely will be a lot more solid a decade from now.

Einstein's cosmological constant, and the modern variant, dark energy, have figured in a broad range of topics under discussion in physics and astronomy, in at least some circles, for much of the past eight decades. Many of these issues undoubtedly have been discovered more than once. But in our experience such ideas tend to persist for a long time at low visibility and sometimes low fidelity. Thus the community has been very well prepared for the present evidence for detection of dark energy. And for the same reason we believe that dark energy, whether constant, or rolling toward zero, or maybe even increasing, still will be an active topic of research, in at least some circles, a decade from now, whatever the outcome of the present work on the cosmological tests. Though this much is clear, we see no basis for a prediction of whether the standard cosmology a decade from now will be a straightforward elaboration of ΛCDM, or whether there will be more substantial changes of direction.

Note added in proof

The measurements of the angular distribution of the 3-K cosmic microwave background radiation by the Wilkinson Microwave Anisotropy Probe (WMAP) were released (in Bennett et al., 2003; Spergel et al., 2003, and references therein) after this review was completed. It is appropriate to comment on how the results of this superb experiment have changed our assessment of the cosmological tests.

The discussion in Sec. IV.C led us to conclude that the case for the detection of dark energy is not as compelling as the case for dark matter, because there are fewer cross checks. WMAP changes that. The ΛCDM model gives an excellent fit to the WMAP measurements. The parameters required by this fit, including density parameters 0.19 ≲ Ω_M0 ≲ 0.35, 0.65 ≲ Ω_Λ0 ≲ 0.81, and -0.02 ≲ Ω_K0 ≲ 0.06 (all at two standard deviations), are in good agreement with other constraints (as summarized in Sec. IV). In particular, a pre-WMAP survey of the constraints on Ω_M0 from a combination of the dynamical, baryon fraction, power spectrum, weak lensing, and cluster abundance measurements indicates 0.2 ≲ Ω_M0 ≲ 0.35 (at two standard deviations, Chen and Ratra, 2003b), in striking accord with the WMAP estimate. The fit to the WMAP measurements and the overall consistency of parameter constraints is strong evidence that the ΛCDM model is a good approximation to reality. This evidence increases the weight of parameter estimates that depend on the ΛCDM model. And the model fit to the WMAP measurements requires the presence of dark energy, provided the Hubble constant is within acceptable distance of the astronomical measurements. This is distinct from the line of argument for Ω_Λ0 ~ 0.7 described in Sec. IV. It provides the wanted cross check that makes a convincing case for the detection of dark energy.

Issues remain. The ΛCDM fit to WMAP indicates the density parameter in baryons is 0.021 ≲ Ω_B0 h² ≲ 0.024 (at two standard deviations), consistent with what is indicated by the standard Big Bang nucleosynthesis model and measurements of the primeval deuterium abundance [Eq. (62) and footnote 101]. To be resolved are the somewhat different estimates of Ω_B0 h² from the helium and lithium abundances. The temperature anisotropy autocorrelation function is consistent with zero at angular separations greater than about 60° (consistent with the earlier but less emphatic COBE result mentioned in footnote 100), and seems not likely to be consistent with the ΛCDM prediction. Maybe this is an un-


likely statistical fluctuation. Or maybe it is telling us about the physics of the dark energy.

We take this opportunity to refer to Padmanabhan's (2002) recently completed review of the cosmological constant, with particular emphasis on possible resolutions of the physicists' cosmological constant problem and the physics of the resulting dark energy models.

ACKNOWLEDGMENTS

We are indebted to Pia Mukherjee, Michael Peskin, and Larry Weaver for detailed comments on drafts of this review. We thank Uwe Thumm for help in translating and discussing papers written in German. We have also benefited from discussions with Neta Bahcall, Robert Caldwell, Gang Chen, Andrea Cimatti, Mark Dickinson, Michael Dine, Masataka Fukugita, Salman Habib, David Hogg, Avi Loeb, Stacy McGaugh, Paul Schechter, Chris Smeenk, Gary Steigman, Ed Turner, Michael Turner, Jean-Philippe Uzan, David Weinberg, and Simon White. B.R. acknowledges support from NSF CAREER Grant No. AST-9875031, and P.J.E.P. acknowledges support in part from the NSF.

REFERENCES

Abazajian, K., G. M. Fuller, and M. Patel, 2001, Phys. Rev. D 64, 023501.
Abbott, L. F., and M. B. Wise, 1984, Nucl. Phys. B 244, 541.
Abell, G. O., 1958, Astrophys. J., Suppl. Ser. 3, 211.
Akhmedov, E. Kh., 2002, e-print hep-th/0204048.
Albrecht, A., and P. J. Steinhardt, 1982, Phys. Rev. Lett. 48, 1220.
Alcock, C., and B. Paczyński, 1979, Nature (London) 281, 358.
Allen, S. W., R. W. Schmidt, and A. C. Fabian, 2002, Mon. Not. R. Astron. Soc. 335, 256.
Alpher, R. A., and R. Herman, 2001, Genesis of the Big Bang (Oxford University, Oxford).
Amendola, L., 1999, Phys. Rev. D 60, 043501.
Amendola, L., 2000, Phys. Rev. D 62, 043511.
Amendola, L., and D. Tocchini-Valentini, 2002, Phys. Rev. D 66, 043528.
Aragón-Salamanca, A., C. M. Baugh, and G. Kauffmann, 1998, Mon. Not. R. Astron. Soc. 297, 427.
Arai, K., M. Hashimoto, and T. Fukui, 1987, Astron. Astrophys. 179, 17.
Armendariz-Picon, C., V. Mukhanov, and P. J. Steinhardt, 2001, Phys. Rev. D 63, 103510.
Baccigalupi, C., A. Balbi, S. Matarrese, F. Perrotta, and N. Vittorio, 2002, Phys. Rev. D 65, 063520.
Baccigalupi, C., S. Matarrese, and F. Perrotta, 2000, Phys. Rev. D 62, 123510.
Bacon, D. J., R. J. Massey, A. R. Refregier, and R. S. Ellis, 2002, e-print astro-ph/0203134.
Bahcall, J. N., and R. A. Wolf, 1968, Astrophys. J. 152, 701.
Bahcall, N. A., R. Cen, R. Davé, J. P. Ostriker, and Q. Yu, 2000, Astrophys. J. 541, 1.
Bahcall, N. A., and X. Fan, 1998, Astrophys. J. 504, 1.
Bahcall, N. A., J. P. Ostriker, S. Perlmutter, and P. J. Steinhardt, 1999, Science 284, 1481.
Bahcall, N. A., et al., 2003, Astrophys. J. 585, 182.
Banks, T., and W. Fischler, 2001, e-print hep-th/0102077.
Bardeen, J. M., P. J. Steinhardt, and M. S. Turner, 1983, Phys. Rev. D 28, 679.
Barr, S. M., and D. Hochberg, 1988, Phys. Lett. B 211, 49.
Barr, S. M., and D. Seckel, 2001, Phys. Rev. D 64, 123513.
Barreiro, T., E. J. Copeland, and N. J. Nunes, 2000, Phys. Rev. D 61, 127301.
Bartlett, J. G., and J. Silk, 1990, Astrophys. J. 353, 399.
Bartolo, N., and M. Pietroni, 2000, Phys. Rev. D 61, 023518.
Battye, R. A., M. Bucher, and D. Spergel, 1999, e-print astro-ph/9908047.
Bean, R., and A. Melchiorri, 2002, Phys. Rev. D 65, 041302.
Bennett, C. L., et al., 2003, e-print astro-ph/0302207.
Bernardeau, F., and R. Schaeffer, 1992, Astron. Astrophys. 255, 1.
Bertolami, O., 1986, Fortschr. Phys. 34, 829.
Bertolami, O., and P. J. Martins, 2000, Phys. Rev. D 61, 064007.
Binétruy, P., 1999, Phys. Rev. D 60, 063502.
Binétruy, P., and J. Silk, 2001, Phys. Rev. Lett. 87, 031102.
Birkel, M., and S. Sarkar, 1997, Astropart. Phys. 6, 197.
Blais-Ouellette, S., P. Amram, and C. Carignan, 2001, Astron. J. 121, 1952.
Blanchard, A., R. Sadat, J. G. Bartlett, and M. Le Dour, 2000, Astron. Astrophys. 362, 809.
Bludman, S. A., and M. A. Ruderman, 1977, Phys. Rev. Lett. 38, 255.
Blumenhagen, R., B. Körs, D. Lüst, and T. Ott, 2002, Nucl. Phys. B 641, 235.
Blumenthal, G. R., A. Dekel, and J. R. Primack, 1988, Astrophys. J. 326, 539.
Blumenthal, G. R., S. M. Faber, J. R. Primack, and M. J. Rees, 1984, Nature (London) 311, 517.
Bode, P., J. P. Ostriker, and N. Turok, 2001, Astrophys. J. 556, 93.
Bond, J. R., 1988, in The Early Universe, edited by W. G. Unruh and G. W. Semenoff (Reidel, Dordrecht), p. 283.
Bond, J. R., R. Crittenden, R. L. Davis, G. Efstathiou, and P. J. Steinhardt, 1994, Phys. Rev. Lett. 72, 13.
Bondi, H., 1960, Cosmology (Cambridge University, Cambridge).
Bondi, H., and T. Gold, 1948, Mon. Not. R. Astron. Soc. 108, 252.
Bordag, M., U. Mohideen, and V. M. Mostepanenko, 2001, Phys. Rep. 353, 1.
Borgani, S., P. Rosati, P. Tozzi, S. A. Stanford, P. R. Eisenhardt, C. Lidman, B. Holden, R. D. Ceca, C. Norman, and G. Squires, 2001, Astrophys. J. 561, 13.
Boughn, S. P., and R. G. Crittenden, 2001, e-print astro-ph/0111281.
Bousso, R., 2000, J. High Energy Phys. 0011, 038.
Branchini, E., W. Freudling, L. N. Da Costa, C. S. Frenk, R. Giovanelli, M. P. Haynes, J. J. Salzer, G. Wegner, and I. Zehavi, 2001, Mon. Not. R. Astron. Soc. 326, 1191.
Branchini, E., I. Zehavi, M. Plionis, and A. Dekel, 2000, Mon. Not. R. Astron. Soc. 313, 491.
Brandenberger, R. H., 2001, e-print hep-ph/0101119.
Brax, P., and J. Martin, 2000, Phys. Rev. D 61, 103502.
Brax, P., J. Martin, and A. Riazuelo, 2000, Phys. Rev. D 62, 103505.
Brax, P., J. Martin, and A. Riazuelo, 2001, Phys. Rev. D 64, 083505.
Bridle, S., R. G. Crittenden, A. Melchiorri, M. P. Hobson, R. Kneissl, and A. N. Lasenby, 2002, Mon. Not. R. Astron. Soc. 335, 1193.


Bronstein, M., 1933, Phys. Z. Sowjetunion 3, 73.
Brown, G. S., and B. M. Tinsley, 1974, Astrophys. J. 194, 555.
Buchalter, A., D. J. Helfand, R. H. Becker, and R. L. White, 1998, Astrophys. J. 494, 503.
Bucher, M., and D. N. Spergel, 1999, Phys. Rev. D 60, 043505.
Bucher, M., and N. Turok, 1995, Phys. Rev. D 52, 5538.
Burgess, C. P., P. Martineau, F. Quevedo, G. Rajesh, and R.-J. Zhang, 2002, J. High Energy Phys. 0203, 052.
Burles, S., K. M. Nollett, and M. S. Turner, 2001, Astrophys. J. Lett. 552, L1.
Burstein, D., 2000, in Cosmic Flows Workshop, edited by S. Courteau and J. Willick, Astronomical Society of the Pacific Conference Proceedings No. 201 (Astronomical Society of the Pacific, San Francisco), p. 178.
Caldwell, R. R., 2002, Phys. Lett. B 545, 23.
Caldwell, R. R., R. Davé, and P. J. Steinhardt, 1998, Phys. Rev. Lett. 80, 1582.
Canuto, V., P. J. Adams, S.-H. Hsieh, and E. Tsiang, 1977, Phys. Rev. D 16, 1643.
Canuto, V., and J. F. Lee, 1977, Phys. Lett. 72B, 281.
Carlstrom, J. E., M. Joy, L. Grego, G. Holder, W. L. Holzapfel, S. LaRoque, J. J. Mohr, and E. D. Reese, 2001, in Constructing the Universe with Clusters of Galaxies, edited by F. Durret and G. Gerbal (in press), e-print astro-ph/0103480.
Carretta, E., R. G. Gratton, G. Clementini, and F. F. Pecci, 2000, Astrophys. J. 533, 215.
Carroll, S. M., 1998, Phys. Rev. Lett. 81, 3067.
Carroll, S. M., 2001, Living Rev. Relativ. 4, 1.
Carroll, S. M., W. H. Press, and E. L. Turner, 1992, Annu. Rev. Astron. Astrophys. 30, 499.
Casimir, H. B. G., 1948, Proc. K. Ned. Akad. Wet. 51, 635.
Chaboyer, B., and L. M. Krauss, 2002, Astrophys. J. Lett. 567, L45.
Chen, B., and F.-L. Lin, 2002, Phys. Rev. D 65, 044007.
Chen, G., and B. Ratra, 2003a, Astrophys. J. 582, 586.
Chen, G., and B. Ratra, 2003b, e-print astro-ph/0302002.
Chen, X., R. J. Scherrer, and G. Steigman, 2001, Phys. Rev. D 63, 123504.
Chiba, T., 1999, Phys. Rev. D 60, 083508.
Chiba, T., T. Okabe, and M. Yamaguchi, 2000, Phys. Rev. D 62, 023511.
Chimento, L. P., and A. S. Jakubi, 1996, Int. J. Mod. Phys. D 5, 71.
Choi, K., 1999, e-print hep-ph/9912218.
Cimatti, A., et al., 2002, Astron. Astrophys. Lett. 391, L1.
Coble, K., S. Dodelson, M. Dragovan, K. Ganga, L. Knox, J. Kovac, B. Ratra, and T. Souradeep, 2003, Astrophys. J. 584, 585.
Coc, A., E. Vangioni-Flam, M. Cassé, and M. Rabiet, 2002, Phys. Rev. D 65, 043510.
Cohen-Tannoudji, C., J. Dupont-Roc, and G. Grynberg, 1992, Atom-Photon Interactions (Wiley, New York), p. 121.
Colberg, J. M., et al., 2000, Mon. Not. R. Astron. Soc. 319, 209.
Cole, S., D. H. Weinberg, C. S. Frenk, and B. Ratra, 1997, Mon. Not. R. Astron. Soc. 289, 37.
Coleman, S., and F. De Luccia, 1980, Phys. Rev. D 21, 3305.
Colley, W. N., J. R. Gott, and C. Park, 1996, Mon. Not. R. Astron. Soc. 281, L82.
Cooray, A. R., 1999, Astrophys. J. 524, 504.
Copeland, E. J., N. J. Nunes, and F. Rosati, 2000, Phys. Rev. D 62, 123503.
Corasaniti, P. S., and E. J. Copeland, 2002, Phys. Rev. D 65, 043004.
Croom, S. M., R. J. Smith, B. J. Boyle, T. Shanks, N. S. Loaring, L. Miller, and I. J. Lewis, 2001, Mon. Not. R. Astron. Soc. 322, L29.
Cyburt, R. H., B. D. Fields, and K. A. Olive, 2001, New Astron. 6, 215.
Dalal, N., K. Abazajian, E. Jenkins, and A. Manohar, 2001, Phys. Rev. Lett. 87, 141302.
Daly, R., and E. J. Guerra, 2001, e-print astro-ph/0109383.
Danese, L., G. L. Granato, L. Silva, M. Magliocchetti, and G. De Zotti, 2002, in The Mass of Galaxies at Low and High Redshift, edited by R. Bender and A. Renzini (Springer, Berlin), in press.
Dasgupta, K., C. Herdeiro, S. Hirano, and R. Kallosh, 2002, Phys. Rev. D 65, 126002.
Davé, R., D. N. Spergel, P. J. Steinhardt, and B. D. Wandelt, 2001, Astrophys. J. 547, 574.
Davis, A. C., M. Dine, and N. Seiberg, 1983, Phys. Lett. 125B, 487.
Davis, M., G. Efstathiou, C. S. Frenk, and S. D. M. White, 1985, Astrophys. J. 292, 371.
Davis, M., and P. J. E. Peebles, 1983a, Annu. Rev. Astron. Astrophys. 21, 109.
Davis, M., and P. J. E. Peebles, 1983b, Astrophys. J. 267, 465.
Davis, R. L., 1987, Phys. Rev. D 35, 3705.
de Blok, W. J. G., and A. Bosma, 2002, Astron. Astrophys. 385, 816.
de Blok, W. J. G., S. S. McGaugh, A. Bosma, and V. C. Rubin, 2001, Astrophys. J. Lett. 552, L23.
Deffayet, C., G. Dvali, and G. Gabadadze, 2002, Phys. Rev. D 65, 044023.
de la Macorra, A., and G. Piccinelli, 2000, Phys. Rev. D 61, 123503.
de la Macorra, A., and C. Stephan-Otto, 2001, Phys. Rev. Lett. 87, 271301.
de Oliveira-Costa, A., M. Tegmark, L. A. Page, and S. P. Boughn, 1998, Astrophys. J. Lett. 509, L9.
de Ritis, R., A. A. Marino, C. Rubano, and P. Scudellaro, 2000, Phys. Rev. D 62, 043506.
de Sitter, W., 1917, Mon. Not. R. Astron. Soc. 78, 3.
DeWitt, B. S., 1975, Phys. Rep. 19, 295.
Dicke, R. H., 1968, Astrophys. J. 152, 1.
Dicke, R. H., 1970, Gravitation and the Universe (American Philosophical Society, Philadelphia).
Dicke, R. H., and P. J. E. Peebles, 1979, in General Relativity, edited by S. W. Hawking and W. Israel (Cambridge University, Cambridge, England), p. 504.
Dimopoulos, K., and J. W. F. Valle, 2001, e-print astro-ph/0111417.
Dine, M., 1996, e-print hep-ph/9612389.
Di Pietro, E., and J. Demaret, 2001, Int. J. Mod. Phys. D 10, 231.
Dirac, P. A. M., 1937, Nature (London) 139, 323.
Dirac, P. A. M., 1938, Proc. R. Soc. London, Ser. A 165, 199.
Dodelson, S., et al., 2002, Astrophys. J. 572, 140.
Dodelson, S., M. Kaplinghat, and E. Stewart, 2000, Phys. Rev. Lett. 85, 5276.
Dolgov, A. D., 1983, in The Very Early Universe, edited by G. W. Gibbons, S. W. Hawking, and S. T. C. Siklos (Cambridge University, Cambridge, England), p. 449.


Dolgov, A. D., 1989, in The Quest for the Fundamental Con- Flores, R. A., and J. R. Primack, 1994, Astrophys. J. Lett. 427,
stants in Cosmology, edited by J. Audouze and J. Tran Thanh L1.
Van (Editions Frontikres, Gif-sur-Yvette), p. 227. Ford, L. H., 1987, Phys. Rev. D 35,2339.
Doran, M., and J. Jackel, 2002, Phys. Rev. D 66, 043519. Fosalba, P., 0.Dore, and F. R. Bouchet, 2002, Phys. Rev. D 65,
Doran, M., M. Lilley, J. Schwindt, and C. Wetterich, 2001, As- 063003.
trophys. J. 559, 501. Freedman, W. L., 2002, Int. J. Mod. Phys. A 17, 51, 58.
Dreitlein, J., 1974, Phys. Rev. Lett. 33, 1243. Freedman, W. L., et al., 2001, Astrophys. J. 553, 47.
Dubinski, J., and R. G. Carlberg, 1991, Astrophys. J. 378, 496. Freese, K., F. C. Adams, J. A. Frieman, and E. Mottola, 1987,
Durrer, R., B. Novosyadlyj, and S . Apunevych, 2003, Astro- Nucl. Phys. B 287, 797.
phys. J. 583, 33. Frewin, L. A., and J. E. Lidsey, 1993, Int. J. Mod. Phys. D 2,
Dvali, G., and S.-H. H. Tye, 1999, Phys. Lett. B 450, 72. 323.
Eddington, A. S., 1924, The Mathematical Theory of Relativity Friedland, A,, H. Muruyama, and M. Perelstein, 2003, Phys.
(Cambridge University, Cambridge, England). Rev. D (in press).
Eddington, A. S., 1939, Sci. Prog. 34, 225. Friedmann, A., 1922, Z. Phys. 10, 377 [English translation in
Efstathiou, G., 1999, Mon. Not. R. Astron. SOC.310, 842. Cosmological Constants, edited by J. Bernstein and G. Fein-
Efstathiou, G., 2002, Mon. Not. R. Astron. SOC.332, 193. berg (Columbia University, New York, 1986), p. 491.
Efstathiou, G., and J. R. Bond, 1999, Mon. Not. R. Astron. Friedmann, A,, 1924, Z. Phys. 21, 326 [English translation in
SOC.304, 75. Cosmological Constants, edited by J. Bernstein and G. Fein-
Efstathiou, G., and M. J. Rees, 1988, Mon. Not. R. Astron. berg (Columbia University, New York, 1986), p. 591.
SOC. 230, 5P. Frieman, J. A., C. T. Hill, A. Stebbins, and I. Waga, 1995, Phys.
Efstathiou, G., W. J. Sutherland, and S . J. Maddox, 1990, Na- Rev. Lett. 75, 2077.
ture (London) 348,705. Fry, J. N., 1984, Astrophys. J. 279, 499.
Fry, J. N., 1985, Phys. Lett. 158B, 211.
Einstein, A., 1917, Sitzungsher. K. Preuss. Akad. Wiss. 142
Fry, J. N., 1994, Phys. Rev. Lett. 73, 215.
[English translation in The Principle of Relativity (Dover,
Fry, J. N., and E. Gaztaiiaga, 1993, Astrophys. J. 4l3,447.
New York, 1952), p. 1771.
Fujii, Y., 1982, Phys. Rev. D 26,2580.
Einstein, A,, 1931, Sitzungsber. Preuss. Akad. Wiss., Phys. Fujii, Y.,2000, Gravitation Cosmol. 6, 107.
Math. K1. 235. Fujii, Y., and T. Nishioka, 1990, Phys. Rev. D 42, 361.
Einstein, A., 1945, The Meaning of Relalivity (Princeton Uni- Fukugita, M., T. Futamase, and M. Kasai, 1990, Mon. Not. R.
versity, Princeton, NJ). Astron. SOC.246, 24P.
Einstein, A,, and W. de Sitter, 1932, Proc. Natl. Acad. Sci. Fukugita, M., C. J. Hogan, and P. J. E. Peebles, 1998, Astro-
U.S.A. 18, 213. phys. J. 503, 518.
Ellwanger, U., 2002, e-print hep-ph/0203252. Gamow, G., 1970, My World Line (Viking, New York).
Endo, M., and T. Fukui, 1977, Gen. Relativ. Gravit. 8, 833. Ganga, K., B. Ratra, J. 0. Gundersen, and N. Sugiyama, 1997,
Enz, C. P., 1974, in Physical Reality and Mathematical Descrip- Astrophys. J. 484, 7.
tion, edited by C. P. Enz and J. Mehra (Reidel, Dordrecht), p. Garcia-Bellido, J., J. Rabadin, and F. Zamora, 2002, J. High
124. Energy Phys. 0201, 036.
Enz, C. P., and A. Thellung, 1960, Helv. Phys. Acta 33, 839. Garnavich, P. M., et aL, 1998, Astrophys. J. 509, 74.
Eriksson, M., and R. Amanullah, 2002, Phys. Rev. D 66, Gasperini, M., 1987, Phys. Lett. B 194, 347.
023530. Gasperini, M., F. Piazza, and G. Veneziano, 2002, Phys. Rev. D
Etoh, T., M. Hashimoto, K. Arai, and S . Fujimoto, 1997, As- 65, 023508.
tron. Astrophys. 325, 893. Gebhardt, K., et al., 2000, Astrophys. J. Lett. 539, L13.
Evrard, A. E., 1989, Astrophys. J. Lett. 341, L71. Geller, M. J., and P. J. E. Peebles, 1973, Astrophys. J. 184, 329.
Faber, S . M., G. Wegner, D. Burstein, R. L. Davies, A. Georgi, H., H. R. Quinn, and S . Weinberg, 1974, Phys. Rev.
Dressler, D. Lynden-Bell, and R. J. Terlevich, 1989, Astro- Lett. 33, 451.
phys. J., Suppl. Ser. 69, 763. Gerke, B., and G. Efstathiou, 2002, Mon. Not. R. Astron. SOC.
Falco, E. E., C. S . Kochanek, and J. A. Muiioz, 1998, Astro- 335, 33.
phys. J. 494, 47. Giovannini, M., E. Keihanen, and H. Kurki-Suonio, 2002,
Fan, X., et al., 2001, Astrophys. J. 122, 2833. Phys. Rev. D 66, 043504.
Faraoni, V., 2000, Phys. Rev. D 62, 023504. Giudice, G. F., and R. Rattazzi, 1999, Phys. Rep. 322, 419.
Feldman, H. A., J. A. Frieman, J. N. Fry, and R. Scoccimarro, Gliner, E. B., 1965, Zh. Eksp. Teor. Fiz. 49, 542 [Sov. Phys.
2001, Phys. Rev. Lett. 86, 1434. JETP 22, 378 (1966)l.
Ferrarese, L., and D. Merritt, 2000, Astrophys. J. Lett. 539, L9. Gonzilez-Diaz, P. F., 2000, Phys. Rev. D 62, 023513.
Ferreira, P. G., and M. Joyce, 1998, Phys. Rev. D 58, 023503. Goobar, A,, and S . Perlmutter, 1995, Astrophys. J. 450, 14.
Fields, B. D., and S . Sarkar, 2002, Phys. Rev. D 66, 010001. Gorski, K. M., B. Ratra, R. Stompor, N. Sugiyama, and A. J.
Fischler, W., A. Kashani-Poor, R. McNees, and S . Paban, 2001, Banday, 1998, Astrophys. J., Suppl. Ser. 114, 1.
J. High Energy Phys. 0107, 003. Gorski, K. M., B. Ratra, N. Sugiyama, and A. J. Banday, 1995,
Fischler, W., B. Ratra, and L. Susskind, 1985, Nucl. Phys. B Astrophys. J. Lett. 444,L65.
259, 730. Gorski, K. M., J. Silk, and N. Vittorio, 1992, Phys. Rev. Lett.
Fisher, K. B., C. A. Scharf, and 0. Lahav, 1994, Mon. Not. R. 68, 733.
Astron. SOC.266, 219. Gott, J. R., 1982, Nature (London) 295, 304.
Fixsen, D. J., E. S . Cheng, J. M. Gales, J. C. Mather, R. A. Gott, J. R., 1997, in Critical Dialogs in Cosmology, edited by
Shafer, and E. L. Wright, 1996, Astrophys. J. 473, 576. N. Turok (World Scientific, Singapore), p. 519.


Gott, J. R., M. S. Vogeley, S. Podariu, and B. Ratra, 2001, Huterer, D., and M. S. Turner, 2001, Phys. Rev. D 64, 123527.
Astrophys. J. 549, 1. Hwang, J.-c., and H. Noh, 2001, Phys. Rev. D 64, 103509.
Green, A. M., and J. E. Lidsey, 2000, Phys. Rev. D 61, 067301. Ikebe, Y., T. H. Reiprich, H. Bohringer, Y. Tanaka, and T.
Gudmundsson, E. H., and G. Bjornsson, 2002, Astrophys. J. Kitayama, 2002, Astron. Astrophys. 383, 773.
565, 1. Iliopoulos, J., and B. Zumino, 1974, Nucl. Phys. B 76, 310.
Gum, J. E., 1967, Astrophys. J. 147, 61. Israel, F. l?, 1998, Astron. Astrophys. Rev. 8, 237.
Gunn, J. E., and J. B. Oke, 1975, Astrophys. J. 195, 255. Jenkins, A., C. S . Frenk, E R. Pearce, P. A. Thomas, J . M.
Gunn, J. E., and B. M. Tinsley, 1975, Nature (London) 257, Colberg, S. D. M. White, H. M. P. Couchman, J. A. Peacock,
454. G. Efstathiou, and A. H. Nelson, 1998, Astrophys. J. 499, 20.
Gurvits, L. T., K. I. Kellermann, and S . Frey, 1999, Astron. John, M. V., and K. B. Joseph, 2000, Phys. Rev. D 61,087304.
Astrophys. 342, 378. Johri, V. B., 2002, Class. Quantum Grav. 19,5959.
Guth, A. H., 1981, Phys. Rev. D 23, 347. Joyce, M., and T. Prokopec, 1998, Phys. Rev. D 57, 6022.
Guth, A. H., 1997, The Inflationary Universe (Addison-Wesley, Juszkiewicz, R., ? G. Ferreira, H. A. Feldman, A. H. Jaffe, and
Reading). M. Davis, 2000, Science 287, 109.
Guth, A. H., and S.-Y. Pi, 1982, Phys. Rev. Lett. 49, 1110. Kaganovich, A. B., 2001, Phys. Rev. D 63, 025022.
Guyot, M., and Ya. B. Zeldovich, 1970, Astron. Astrophys. 9, Kahn, F. D., and L. Woltjer, 1959, Astrophys. J. 130,705.
227. Kaiser, N., 1984, Astrophys. J. Lett. Ed. 284, L9.
Halliwell, J. J., 1987, Phys. Lett. B 185, 341. Kaiser, N., 1987, Mon. Not. R. Astron. SOC.227, 1.
Halpern, M., H. P. Gush, and E. H. Wishnow, 1991, in After the Kamionkowski, M., B. Ratra, D. N. Spergel, and N.Sugiyama,
First Three Minutes, edited by S. S. Holt, C. L. Bennett, and 1994a, Astrophys. J. Lett. 434, L1.
V. Trimble (AIP, New York), p. 53. Kamionkowski, M., D. N. Spergel, and N. Sugiyama, 1994b,
Halverson, N. W., et al., 2002, Astrophys. J. 568, 38. Astrophys. J. Lett. 426, L57.
Halyo, E., 2001a, J. High Energy Phys. 0110, 025. Kamionkowski, M., and N. Toumbas, 1996, Phys. Rev. Lett. 77,
Halyo, E., 2001b, e-print hep-ph10105341. 587.
Hamilton, A. J. S., and M. Tegmark, 2002, Mon. Not. R. As- Kardashev, N., 1967, Astrophys. J . Lett. 150, L135.
tron. SOC.330, 506. Kay, S. T., F. R. Pearce, C. S. Frenk, and A. Jenkins, 2002,
Hamilton, J.-Ch., and K. Ganga, 2001, Astron. Astrophys. 368, Mon. Not. R. Astron. SOC.330, 113.
760. Kazanas, D., 1980, Astrophys. J. Lett. Ed. 241, L59.
Harrison, E. R., 1970, Phys. Rev. D 1,2726. Kim, J. E., 2000, J. High Energy Phys. 0006, 016.
Hawking, S. W., 1982, Phys. Lett. 115B, 295. Kirzhnitz, D. A,, and A. D. Linde, 1974, Zh. Eksp. Teor. Fiz.
He, X.-G., 2001, e-print astro-phi0105005. 67, 1263 [Sov. Phys. JETP 40,628 (1975)J
Hebecker, A., and C. Wetterich, 2001, Phys. Lett. B 497, 281. Klein, F., 1918, Nachr. Ges. Wiss. Goettingen, Math.-Phys. K1.
Helbig, P., D. Marlow, R. Quast, I? N. Wilkinson, I. W. A. December, 394.
Browne, and L. V. E. Koopmans, 1999, Astron. Astrophys., Klypin, A., A. V. Kratsov, 0. Valenzuela, and F. Prada, 1999,
Suppl. Ser. 136, 297. Astrophys. J. 522, 82.
Hellerman, S., N. Kaloper, and L. Susskind, 2001, J. High En- Knebe, A., J. E. G. Devriendt, A. Mahmood, and J. Silk, 2002,
ergy Phys. 0106, 003. Mon. Not. R. Astron. SOC.329, 813.
Hiscock, W. A., 1986, Phys. Lett. 166B, 285. Knox, L., and L. Page, 2000, Phys. Rev. Lett. 85, 1366.
Hivon, E., F. R. Bouchet, S. Colombi, and R. Juszkiewicz, Kofman, L. A,, and A. A. Starobinsky, 1985, Pisma Astron.
1995, Astron. Astrophys. 298, 643. Zh. 11, 643 [Sov. Astron. Lett. 11, 271 (1985)].
Hoekstra, H., H. K. C. Yee, and M. D. Gladders, 2002, New Kogut, A., A. J. Banday, C. L. Bennett, K. M. Gbrski, G. Hin-
Astron. Rev. 46, 767. shaw, G. E Smoot, and E. L. Wright, 1996, Astrophys. J. Lett.
Holden, D. J., and D. Wands, 2000, Phys. Rev. D 61,043506. 464, L5.
Holtzman, J. A., 1989, Astrophys. J., Suppl. Ser. 71, 1. Kolb, E. W., and S. Wolfram, 1980, Astrophys. J. 239, 428.
Horvat, R., 1999, Mod. Phys. Lett. A 14, 2245. Kolda, C., and W. Lahneman, 2001, e-print hep-ph/0105300.
Hoyle, F., 1948, Mon. Not. R. Astron. SOC.108, 372. Kolda, C., and D. H . Lyth, 1999, Phys. Lett. B 458, 197.
Hoyle, F., 1959, in Paris Symposium on Radio Astronomy, IAU Komatsu, E., B. D. Wandelt, D. N. Spergel, A. J. Banday, and
Symposium 9, edited by R. N. Bracewell (Stanford Univer- K. M. Gbrski, 2002, Astrophys. J. 566, 19.
sity, Stanford), p. 529. Kosowsky, A,, 2002, in Modern Cosmology, edited by S.
Hoyle, F., and R. J. Tayler, 1964, Nature (London) 203, 1108. Bonometto, V. Gorini, and U. Moschella (IOP, Bristol), p.
Hradecky, V., C. Jones, R. H. Donnelly, S. G. Djorgovski, R. R. 219.
Gal, and S. C. Odewahn, 2000, Astrophys. J. 543, 521. Kragh, H., 1996, Cosmology and Controversy (Princeton Uni-
Hu, W., 1998, Astrophys. J. 506, 485. versity, Princeton, NJ).
Hu, W., and P. J. E. Peebles, 2000, Astrophys. J. Lett. 528, L61. Kragh, H., 1999, in The Expanding Worlds of General Relativ-
Hu, W., and N. Sugiyama, 1996, Astrophys. J. 471, 542. ity, edited by H. Goenner, J. Renn, J. Ritter, and T. Sauer
Huang, J.J., 1985, Nuovo Cimento SOC.Ital. Fis., B 87B, 148. (Birkhauser, Boston), p. 377.
Hubble, E., 1929, Proc. Natl. Acad. Sci. U S A . 15, 168. Krauss, L. M., and B. Chaboyer, 2001, e-print
Hubble, E., 1936, Realm o f t h e Nebulae (Yale University, New astro-ph10111597.
Haven). Reprinted (Dover, New York, 1958). Kruger, A. T., and J. W. Norbury, 2000, Phys. Rev. D 61,
Huey, G., and G. Lidsey, 2001, Phys. Lett. B 514, 217. 087303.
Huey, G., and R. Tavakol, 2002, Phys. Rev. D 65, 043504. Kujat, J., A. M. Linn, R. J. Scherrer, and D. H. Weinberg, 2002,
Humason, M. L., N. U. Mayall, and A. R. Sandage, 1956, As- Astrophys. J. 572, 1.
tron. J. 61. 97. Kyae, B., and Q. Shafi, 2002, Phys. Lett. B 526,379.


Lahay O.? P. 5. Lilje, J. R. Primack, and M. J. Rees, 1991! Mizuno, S., and K.-i. Maeda, 2001, Phys. Rev. D 64, 123521.
Mon. Not. R. Astron. SOC.251, 128. Moffat, J. N.,2001, e-print hep-thl0105017.
Lanczos, K., 1922, Phys. 2. 23, 539. M o h o ! P., S. A. Levshskov, M. Dessauges-Zavadsky. and S.
Landau. L. D., and E. M. Lifshitz, 1951, The C I ~ ~ s i c Theory
al DOdorico, 2002, Astron. Astrophys. 381, Lh4.
of Fields (Pergarnon, Oxford). Moore. B., 1994, Nature (London) 370: 629.
Lands, S. D., 2002, Astrophys. 1. Lett. 567, L1. Moore, B., S . Ghigna, F. Governato, G . Lake, T. Quinn, J.
Larsen, E, J. E van der Schaar. and R. G. Leigh, 2002, J. High Stadel, and I? Tozzi, 1999a. Astrophys. J . Lett. 524. L19.
Energy Phys. 0204- 047. Moore, B., T. Quinn, F, Governato, J. Stadel, and G. Lake,
Lau. Y.-K.,1985, Aust. J. Phys 38,547. 1999b, Mon. Not. R. Astron. SOC.310, 1147.
Lazarides. G.* 2002, e-print hep-pW020.1294. Mukherjee, P., B. Dendison. B. Ratra, J. H. Simonetti, K.
Lee, A . T., e t a / . :2001, Astrophys. J. Lett. 561,L1. Ganga, and J.-Ch. Hamikon, 2002, Astrophys. J. 579, 83.
Leibundgut, B., 2001, Annu. Rev. Astron. Astrophys. 39, 67. Mukherjee, P., M . P. Hobson. and A. N. Lasenby, 2000: Mon.
Lemaitre, G., 1925, J. Math. Phys. (Cambridge, Mass.) 4, 188. Not. R. Astron. Sac. 318, 1157.
Lernahre, G., 1927? Ann. SOC.Sci. Bruxelles, Ser. 1 47, 49 Munshi! D., and Y. Wang, 2003>Astrophys. J. 583. 566.
[Man. Not. R. Astron. SOC.91, 483 (1931)l. Myung, Y. S., 2001, Mod. Phys. Lett. A 16,1963.
Lemaitre, G., 1934. Proc. Natl. Acad. Sci. U.S.A. 20, 12. Narayanan, V. K., D. N. Spergel, R. Davi, and C . 2 Ma, 2000,
Lemaitre, G . , 1949, in Alberf Einstein: PhilosopheT-Scientist7 Astrophys. J . Lett. 543, L103.
edited by E! A. Schilpp (Library of Living Philosophers. Nernst, W., 1916, Verh. Dtsch. Phys. Ges. 18, 53.
Evanston), p. 437. Netterfield, C. B., et al., 2002, Astrophys. J. 571, 604.
Liddle, A. R., and R. J. Scherrer, 1999, Phys. Rev. D 59. Newman, J. A., and M. Davis, 2000, Astrophys. J. Lett. 534.
023509. L11.
Lightman, A. P., and P. L. Schechter, 1990, Astrophys. J.! Ng. S . C. C., N. J. Nunes, and E Rosati, 2001, Phys. Rev. D 64,
Suppl. Ser. 74, 831. 083510.
Lima, J. A. S., and J. S. Alcaniz, 2001, Braz. J. Phys. 31, 583. Ng, S. C. C., and D . L. Wiltshire, 2001, Phys. Rev. D 63.
Lima, J. A. S., and J. S. Alcaniz, 2002, Astrophys. J. 566, 15. 023503.
Linde., A. D., 1974, Pisma Zh. Eksp. Teor. F i r 19, 320 [JETF Nilles, H. P., 1985, in New Trendr in Particle Theory, edited by
Lett. 19, 183 (1974)l. L. Lusanna (World Scientific. Singapore). p. 119.
Linde, A. D., 1982, Php. Lett. 108B, 389, Nolan, L. A.. J. S. Dunlop, R. Jimenez, and A. F. Heavens,
Lineweaver. C., 2001, e-print astro-ph10112381. 2001, e-print astro-ph10103450.
Loh, E. D., and E. J . Spillar, 1986, Astrophys. J. Lett. Ed. 307. Noomura, Y., 7. Watari, and T. Yanagida, 2000, Phys. Lett. B
L1. 484,103.
Lucchin, S.. and S. Matarrese, 1985a, Phys. Rev. D 32, 1316. North, J. D., 1965, The Memure of the Universe (Oxford Uni-
Lucchin, S.. and S. Matarrese! 1985b, Phjs. Lett. 164B, 282. versity, Oxford). Reprinted (Dover. New York. 1990).
Lynden-Bell, D., 1969, Nature (London) 223, 690. Norton, J. D., 1999, in The Expanding Worlds of General Rela-
Lyth, D. H., and E. D. Stewart, 1990, Phys. Lett. B 252,336. tivity, edited by H. Goennes, 3. Renn, J. Ritter, and T. Sauer
Magueijo, J.: and L. Smolin, 2002, Phys. Rev. Lett. 88, 190403. (Birkhauser, Boston), p. 271.
Majumdar, A. S., 2001, Phys. Rev. D 64, 083503. Olson, T. S., and T. F. Jordan, 1987, Phys. Rev. D 35, 3258.
Mak, M. K.:J. A. Belinchon, and T. Harko. 2002, Int. J. Mod. Oort, J. H., 1958, Proceedings of the 11th Solvay Conference,
Phys. D 11, 1265. Structure and Evolution of the Universe (Stoops, Brussels), p.
Maldacena, J., and C. Nuiiez, 2001, Int. J . Mod. Phys. A 16! 163.
822. Ott, T., 2001, Phys. Rev. D 64,023518.
Maor. I., R. Brustein, J. McMahon, and P. J. Steinhardt, 2002, Oukbir, J., and A. Blanchard, 1992, Astron. Astrophys. 262,
Phys. Rev. D 65,123003. 721.
Marriage, T. A., 2002, e-print astro-ph/0203153. Overduin, J. IM., and E I . Cooperstock, 1998, Phys. Rev. D 58,
Masiero, A., M. Pietroni, and F. Rosati, 2000, Phys. Rev. D 61, 043506.
023504. Overduin, J. M., P. S. Wesson, and S. Bowyer, 1993, Astrophys.
Mason, B. S.?el al.,20MI e-print astro-ph10205384. J. 404, 1.
Mathis, H.. and S. D. M. White, 2002, Mon. Not. R. Astron. Ozer, M., and M.0. Taha, 1986, Phys. Lett. B 171, 363.
SOC. 337, 1193. Padilla, N. D., M. E. Merchin, C. A. Valotto, D. F. Lambas,
Matyjasek, J., 1995+Phys. Rev. D 51, 4154. and M. A. G. Maia. 2001, Astrophys. J. 554, 873.
McCrea, W. H., 1951, Proc. R. Soc. London, Ser. A 206, 562. Padmanabhan, T.. 2002, e-print hep-th/0212290.
McCrea, W. H., 1971, Q. J. R. Astron. SOC. 12,140. Pais, A.. 1982, Subtle is the Lord ... (Oxford University, New
McVittie, G. C., 1956, General Relativity and Cosrnoiogy York) ,
(Chapman and Hall, London). Papovich, C., M. Dickinson, and H. C. Ferguson, 2002, e-print
Medved. A. J. M., 2002, Class. Quantum Grav. 19,4511. astro-ph/0201221.
Misziros. P., 1974, Astron. Astrophys. 37* 225. Park, C.-G., C. Park, B. Ratra, and M. Tegmark, 2001, Asrro-
Mitgrom, M.. 1983- Astrophys. J. 270, 365. phys. J. 556, 582.
Miller. A . D., et a!., 2002a. Astrophjs. J.? Suppl. Ser. 140, 115. Pauli, W., 1958, Theory of Relativity (Pergamon, New York).
Miller, C. J., R. C. Nichol. C. Genovese, and L.Wasserman, Reprinted (Dover, New York, 1981).
2002b, Astrophys. J. Lett. 565, L67. Pauli, W.? 1980, General Principles of Quantum Mechanics
Miralda-Escude, J., 2002, Asrrophys. J. 564. 60. (Springer, Berlin).
Misner, C. W., 1969, Phys. Rev. Lett. 22, 1071. Peacock, J. A., et a[., 2001, Nature (London) 410, 169.
Misner. C. W.* and P. Putnam, 1959, Pbys. Rev, 116. 1045. Peebles, P. J. E.. 1965, Astrophys. J. 142, 1317.


Peebles, P. J. E., 1966, Astrophys. J. 146, 542. Premadi, P.,H. Martel, R. Matzner, and T. Futamase, 2001,
Peebles, P. J. E., 1971, Physical Cosmology (Princeton Univer- Astropbys. J., Suppl. Ser. 135, 7.
sity, Princeton, NJ). Primack, J., 2002, e-print astro-ph/0205391.
Peebles, F! J. E., 1980a, The Large-Scale Structure of the Uni- Pryke, C., N. W. Halverson, E. M. Leitch, J. Kovac, J. E. Carl-
verse (Princeton University, Princeton, NJ). strom, W. L. Holzapfel, and M. Dragovan, 2002, Astrophys. J.
Peebles, I? J. E., 1980b, Ann. N.Y. Acad. Sci. 336, 167. 568, 46.
Peebles, P. J. E., 1982, Astrophys. J. Lett. Ed. 263, L1. Pskovskii, Yu. P., 1977, Astron. Zh. 54, 1188 [Sov. Astron. 21,
Peebles, P. J. E., 1984, Astrophys. J. 284, 439. 675 (1977)].
Peebles, P. J. E., 1986, Nature (London) 321, 27. Quevedo, F., 1996, in Workshops of Particles and Fields and
Peebles, F! J. E., 1987, Astrophys. J. Lett. Ed. 315, L73. Phenomenology of Fundamental Interactions, edited by J. C.
Peebles, ? J. E., 1989a, in Large Scale Structure and Motions in D'Olivio, A. Fernandez, and M. A. Perez, AIP Conf. Proc.
the Universe, edited by M. Meuetti, G. Giuricin, and F. No. 359 (AIR Woodbury, NY), p. 202.
Mardirossian (Kluwer, Dordrecht), p. 119. Ratra, B., 1985, Phys. Rev. D 31, 1931.
Peebles, P. J. E., 1989b, J. R. Astron. SOC.Can. 83, 363. Ratra, B., 1989, Phys. Rev. D 40, 3939.
Peebles, P. J. E., 1993, Principles of Physical Cosmology (Prin- Ratra, B., 1991, Phys. Rev. D 43, 3802.
ceton University, Princeton). Ratra, B., 1992a, Phys. Rev. D 45, 1913.
Peebles, ? J. E., 2001, Astrophys. J. 557, 495. Ratra, B., 1992b, Astrophys. J. Lett. 391, L1.
Peebles, P. J. E., 2002, e-print astro-ph/0201015. Ratra, B., 1994, Phys. Rev. D 50,5252.
Peebles, F! J. E., R. A. Daly, and R. Juszkiewicz, 1989, Astro- Ratra, B., and ? J. E. Peebles, 1988, Phys. Rev. D 37, 3406.
phys. J. 347, 563. Ratra, B., and P. J. E. Peebles, 1994, Astrophys. J. Lett. 432,
Peebles, P. J. E., S . D. Phelps, E. J. Shaya, and R. B. Tully, L5.
2001, Astrophys. J. 554, 104. Ratra, B., and F! J. E. Peebles, 1995, Phys. Rev. D 52, 1837.
Peebles, P. J. E., and B. Ratra, 1988, Astrophys. J. Lett. Ed. Ratra, B., and A. Quillen, 1992, Mon. Not. R. Astron. SOC.259,
325, L17. 738.
Peebles, P. J. E., S . Seager, and W. Hu, 2000, Astrophys. J. Lett. Ratra, B., R. Stompor, K. Ganga, G. Rocha, N. Sugiyama, and
K. M. Gbrski, 1999, Astrophys. J. 517, 549.
539, L1.
Ratra, B., N. Sugiyama, A. J. Banday, and K. M. Gbrski, 1997,
Peebles, P: J. E., and J. Silk, 1990, Nature (London) 346,233.
Astrophys. J. 481, 22.
Peebles, P. J. E., and A. Vilenkin, 1999, Phys. Rev. D 59, Refregier, A., J. Rhodes, and E. J. Groth, 2002, Astrophys. J.
063505. Lett. 572, L131.
Peebles, P. J. E., and J. T. Yu, 1970, Astrophys. J. 162, 815. Refsdal, S., 1970, Astrophys. J. 159, 357.
Percival, W. J., et al., 2001, Mon. Not. R. Astron. SOC.327, Riess, A. G., ef al., 1998, Astrophys. J. 116, 1009.
1297. Rindler, W., 1956, Mon. Not. R. Astron. SOC.116, 662.
Percival, W. J., et al., 2002, Mon. Not. R. Astron. SOC.337, Robertson, H. P., 1928, Philos. Mag. 5, 835.
1068. Robertson, H. P, 1955, Publ. Astron. SOC.Pac. 67, 82.
Perlmutter, S., et al., 1999a, Astrophys. J. 517, 565. Rossi, G. C., and G. Veneziano, 1984, Phys. Lett. 138B, 195.
Perlmutter, S., M. S. Turner, and M. White, 1999b, Phys. Rev. Roussel, H., R. Sadat, and A. Blanchard, 2000, Astron. Astro-
Lett. 83, 670. phys. 361, 429.
Perrotta, E, C. Baccigalupi, and S . Matarrese, 2000, Phys. Rev. Rubakov, V A., M. V Sazhin, and A. V Veryaskin, 1982, Phys.
D 61, 023507. Lett. 115B, 189.
Peskin, M. E., 1997, in Fields, Strings, and Duality, edited by C. Rubano, C., and P. Scudellaro, 2002, Gen. Relativ. Gravit. 34,
Efthimiou and B. Greene (World Scientific, Singapore), p. 307.
729. Rugh, S. E., and H. Zinkernagel, 2002, Stud. Hist. Philos. Mod.
Petrosian, V., E. Salpeter, and ? Szekeres, 1967, Astrophys. J. Phys. 33, 663.
147, 1222. Sachs, R. K., and A. M. Wolfe, 1967, Astrophys. J. 147, 73.
Phillipps, S., S. P. Driver, W. J. Couch, A. Fernandez-Soto, ? Sahni, V., M. Sami, and ' ISouradeep, 2002, Phys. Rev. D 65,
D. Bristow, S . C. Odewahn, R. A. Windhorst, and K. Lan- 023518.
zetta, 2000, Mon. Not. R. Astron. SOC.319, 807. Sahni, V., and A. Starobinsky, 2000, Int. J. Mod. Phys. D 9,373.
Phillips, M. M., 1993, Astrophys. J. Lett. 413, L105. Sahni, V, and L. Wang, 2000, Phys. Rev. D 62, 103517.
Phillips, N. G., and A. Kogut, 2001, Astrophys. J. 548, 540. Sandage, A., 1958, Astropbys. J. 127, 513.
Pierpaoli, E., D. Scott, and M. White, 2001, Mon. Not. R. As- Sandage, A,, 1961a, Astrophys. J. 133,355.
iron. SOC.325, 77. Sandage, A., 1961b, The Hubble Atlas of Galaxies (Carnegie
Plionis, M., 2002, e-print astro-ph/0205166. Institution, Washington).
Podariu, S . , R. A. Daly, M. P. Mory, and B. Ratra, 2003, As- Sandage, A., 1962, in Problems of Extragalactic Research, ed-
trophys. J. 584, 577. ited by G. C. McVittie (McMillan, New York), p. 359.
Podariu, S., P. Nugent, and B. Ratra, 2001, Astrophys. J. 553, Sandage, A,, 1988, Annu. Rev. Astron. Astrophys. 26, 561.
39. Sarkar, S., 2002, e-print hep-ph/0201140.
Podariu, S., and B. Ratra, 2000, Astrophys. J. 532, 109. Sato, K., 1981a, Mon. Not. R. Astron. SOC.195, 467.
Podariu, S., and B. Ratra, 2001, Astrophys. J. 563, 28. Sato, K., 1981b, Phys. Lett. WB, 66.
Podariu, S., T. Souradeep, J. R. Gott, B. Ratra, and M. S. Sato, K., N. Terasawa, and J. Yokoyama, 1989, in The Quest for
Vogeley, 2001, Astrophys. J. 559, 9. the Fundamental Constants in Cosmology, edited by J. Au-
Polenta, G., et a[., 2002, Astrophys. J. Lett. 572, L27. douze and J. Tran Thanh Van (Editions Fronti&res,Gif-sur-
Pollock, M. D., 1980, Mon. Not. R. Astron. SOC.193, 825. Yvette), p. 193.


Schindler, S., 2001, e-print astro-ph/0107028. Thomas, D., and G . Kauffmann, 1999, in Spectrophoiomelric
Sciama, D. W, 2001, Astrophys. Space Sci. 276. 151. Dating of Sfars and Galuxies, Astronomical Society of the
Scoccimarro. R., H. A. Feldman, J. N. Fry, and J. A. Frieman, Pacific Conference Proceedings No. 192, edited by I. Huberg,
2001, Astrophys. J. 546, 652. S . Heap, and R. Cornett (Astronomical Society of the Pacific,
Scott, P. E, el al., 2002. e-print astro-ph10205380. San Francisco), p. 261.
Seljak, U., 2002, Mon. Not. R. Astron. SOC.337, 769. Thuan, T. X., and Y. I. Izotov, 2002, in Matter in rhe Universe,
Sellwood, I . A,, and A. Kosowsky, 2M1, in Gas and Ga!axy edited by F. Jetzer, K. Pretzl, and R. Vcn Steiger (Kluwer,
Evolurion, Astronomical Society of the Pacific No. 240, ed- Dordrechtj, in press.
ited by J. E. Hibbard, M. Rupen, and J . H. van Gorkum Tinsley, B. M., 1972, Astrophys. J. 178, 319.
(Astronomical Society of the Pacific, San Francisco), p. 311. Totani, T., Y. Yoshii, T. Maihara, F, Iwamuro, and K. Moto-
Sen, A. A., and S. Sethi, 2002, Phys. Lett. B 532* 159. hara, 2001, Astrophys. J. 559, 592.
Shafi, Q., and C. Wetterich, 1985, Phys. Lett. 152B,51. Townsend, P K.*2001, J. High Energy Phys. 0111,042.
Shandarin, S . F., H. A. Feldman, Y. Xu. and M. Tegmark, 2002, Trager, S. C. ?S . M. Faber, G. Worthey. and J. J. Gonzilez,
Astrophys. J., Suppl. Ser. 141, 1. 2000, Astron. J. 119, 1645.
Shiu, G . , and S . H . 3. Tye, 2001, Phys. Lett. B 516, 421. Trautman, A., 1965, in Lectures an General Relativity, edited
Shklovsky. J.: 1967, Astrophys. J. Lett. 150, L1 by A. Trautrnan, F. A . E. Pirani, and H. Bondi (Prentice-Hall,
Shvartsman, V F., 1969, Pisma Zh. Eksp. Teor. Fiz. 9, 315 Englewood Cliffs, NJ),p, 230.
[JETP Lett. 9, 184 (196911. Tully, R. B., R. S . Somemilk, N. Trentham, and M. A. W.
Sigad, Y., A. Eldar, A. Dekel, M. A. Strauss, and A. Yahil. Verheijen, 2002, Astrophys. J. 569, 573.
1998, Astrophys. J. 495,516. Turner, E. L., 1990, Astrophys. J. Lett. 365,L43.
Silk, J., 1967, Nature (London) 215, 1155. Turner, M. S . , 1999, in The GaLacficHalo, Astronomical Soci-
Silk, J., 1968, Astrophys. J. 151,459. ety of the Pacific Conference Proceedings No. 165, edited by
Silk, J., and N. Vittorio, 1987*Astrophys. J. 317, 564. B. K. Gibson, T.S . Axelrod, and M. E. Putnam (Astronomi-
Singh, T. P.: and T. Padmanabhan, 1988, Int. 3. Mod. Phys. A 3, cal Society of the Pacific, San Francisco), p. 431.
1593. Turner, M. S.* and M. White, 1997- Phys. Rev. D 56, R4439.
Skordis- C., and A. Albrecht, 2002, Phys. Rev. D 66,043523. Turner. M. S.. and L. Widrow, 1988, Phys. Rev. D 37,2743.
Smith: S., 1936, Astrophys. J. 83. 23. Unruh. W. G . , 1989, Phys. Rev. D 40,1048.
Smoot, G . F., er d , 1992, Astrophys. J. Lett. 396. L1. Urefia-Lbpez, L. A., and T. Matos, 2000, Phys. Rev. D 62,
Sommer-Larsen, J., and A. Dolgov. 2001, Astrophys. J. 551, 081302.
608. Uson, J. M., and D.T. Wilkinson, 1984, Nature (London) 3 Y ,
Souradeep, T.. and B. Ratra, 2001, Astrophys. J. 560, 28. 427.
Spergel, D., and U.-L. Pen, 1997, Astrophys. I. Lett. 491, L67. Uzan, J.-P., 1999, Phys. Rev. D 59, 123510.
Spergel, D. N., and P. J. Steinhardt, 2000, Phys. Rev. Lett. 84, Uzan, J.-P., 2003, Rev. Mod. Phys. 75, 403.
3760. Uzan, J.-P,, and E Bernardeau, 2001, Phys. Rev. D 64, 083004.
Spergel, D. N., et a l , 2003, e-print asuo-ph/0302209. Uzawa, K., and J. Soda, 2001, Mod. Phys. Lett. A 16, 1089.
Spokoiny. B., 1993, Phys. Lett. B 315,40. Van Waerbeke, L., Y. Mallier, R. Pe116, U.-L. Pen. H. J. Mc-
Starobinsky, A. A,, 1982, Phys. Lett. 117J3, 175. Cracken, and B. Jain, 2002, Astron. Astrophys. 393, 369,
Starobinsky, A. A,, 1998- Gravitation Cosmol. 4, 88. Veltman, M., 1975, Phys. Rev. Lett. 34,777.
Steigman. G., 2002>private communication. Verde, L., er al., 2002, Mon. Not. R. Astron. SOC.335, 432.
Steigman, G., D. N. Schrarnm, and J. E. Gunn, 1977, Phys Viana, I? T. I?, R . C. Nichol. and A. R. Liddle, 2002, Astrophys.
Lett. 66B,202. J. Lett. 569, L75.
Steinhardt, P. J., and N. Turok, 2002, Science 296, 1436. Vilenkin, A., 1984, Phys. Rev. Lett. 53, 1016.
Steinhardt, P. I.,L. Wang, and I. Zlatev, 1999, Phys. Rev. D 59, Vilenkin. A., 2001, e-print astro-ph10106083.
123504. Vilenkin, A,, and E. l? S . Shellard, 1994, Cosmic Strings and
Stoehr. F,, S . D. M. White, G. Tormen, and V. Springel, 2002, Other TopoIogical Defecrs (Cambridge University, Cam-
Mon. Not. R. Astron. SOC.335, L84. bridge, England).
Stompor, R.. el d ,2001, Astrophys. J. Lett. 561, L7. Vishwakarma, R. G., 2001, Class. Quantum Grav. 18. 1159.
Straumann, N., 2002, e-print astro-ph10203330, Vittorio, N., and J. Silk, 1985, Astrophys. J. Lett. Ed. 297, L1.
Sugiyama, N.. and N. Gouda, 1992, Prog. Theor. Phys. 88,803. Waga, I., and J. A. Frieman, 20W, Phys. Rev. D 62, 043521.
Sunyaev. R.A,, and Ya. B. Zeldovich, 1970, Astrophys. Space Wang, L., and l? J . Steinhardt, 1998, Astrophys. J. 508, 483.
Sci. 7, 3. Wang, X.. M. Tegmark, and M. Zaldarriaga, 2002, Phys. Rev. D
Susskind, L., 1979, Php. Rev. D 20,2619. 65, 123001.
Sutherland, W., H. Tadros, G . Efstathiou, C. S. Frenk, 0 . Wang, Y.,and G. Lovelace, 2001, Astrophys. J. Lett. 562, L l l j ,
Keeble. S. Maddox, R. G. McMahon, S. Oliver. M . Rowan- Wasserman, I., 2002, Phys. Rev. D 66, 123511.
Robinson. and s. D. M. White, 1999, Mon. Not. R. Astron. Weinberg, S . , 1987, Phys. Rev. Lett. 59. 2607.
SOC. 300s. 289. Weinberg, S., 1989. Rev. Mod. Phys. 61, 1.
Tadros, H., et al., 1999, Mon. Not. R. Astron. SOC.305, 527. Weinberg, S., 2001, in Relalivisric Astrophysics, edited by J. C.
Tammann, G. A.? B. Reindl, E Thim, A. Saha: and A. Wheeler and H. Martel, ALP Conf. Proc. No. 586 (AIP,
Sandage. 2001. in A New Era in Cosmology, Astronomical Melville, NY), p. 893.
Society of the Pacific Conference Proceedings, edited by T. Weiss, N., 1987, Phys. Lett. B 197, 42.
Shanks and N. Metcalf (Astronomical Society of the Pacific, Weller, J., and A. Albrecht, 2002, Phys. Rev. D 65, 103512.
San Francisco), in press. Wetterich, C., 1988, Nucl. Phys. B 302, 668.
Tasitsiomi, A,, 2002, e-print astro-phi0205464. Wey5 H., 1923, Phys. 2. 24, 230.


White, M., and C. S. Kochanek, 2001, Astrophys. J. 560,539. Yamamoto, K., M. Sasaki, and T. Tanaka, 1995, Astrophys. J.
White, S . D. M., 1992, in Clusters and Superclusters of Galax- 455,412.
ies, edited by A. C. Fabian (Kluwer, Dordrecht), p. 17. Zaldarriaga, M., D. N. Spergel, and U. Seljak, 1997, Astrophys.
White, S . D. M., J. I? Navarro, A. E. Evrard, and C. S. Frenk, J. 288, 1.
1993, Nature (London) 366,429. Zee, A,, 1980, Phys. Rev. Lett. 44, 703.
Whittaker, E. T., 1935, Proc. R. SOC. London, Ser. A 149,384. Zee, A., 1985, in High Energy Physics, edited by S . L. Mintz
Wilczek, F., 1985, in How Far Are We from the Gauge Forces, and A. Perlmutter (Plenum, New York), p. 211.
edited by A. Zichichi (Plenum, New York), p. 157. Zeldovich, Ya. B., 1964, Astron. Zh. 41,19 [Sov. Astron. 8, 13
Wilkinson, D. T., and P. J. E. Peebles, 1990, in The Cosmic (1964)i.
Microwave Background: 25 Years Later, edited by N. Man-
Zeldovich, Ya. B., 1967, Zh. Eksp. Teor. Fiz., Pisma Red. 6,
883 [JETP Lett. 6,316 (1967)l.
dolesi and N. Vittorio (Kluwer, Dordrecht), p. 17.
Zeldovich, Ya. B.,1968, Usp. Fiz. Nauk 95,209 [Sov. Phys.
Wilson, G., N. Kaiser, and G. A. Luppino, 2001, Astrophys. J.
Usp. 11,381 (1968)l.
556,601. Zeldovich, Ya. B., 1972, Mon. Not. R. Astron. SOC.160,1P.
Witten, E., 2001, in Sources and Detection of Dark Matter and Zeldovich, Ya. B., 1978, in IAU Symposium 79, The Large-
Dark Energy in the Universe, edited by D. B. Cline (Springer, Scale Structure of the Universe, edited by M. S . Longair and J.
Berlin), p. 27. Einasto (Reidel, Dordrecht), p. 409.
Worthey, G., 1994, Astrophys. J., Suppl. Ser. 95,107. Zeldovich, Ya. B., 1981, Usp. Fiz. Nauk 133,479 [Sov. Phys.
Wu, J.-H. P., et al., 2001a, Astrophys. J., Suppl. Ser. 132,1. Usp. 24, 216 (1981)l.
Wu, J.-H. P., et al., 2001b. Phys. Rev. Lett. 87,251303. Zeldovich, Ya. B., I . Yu. Kobzarev, and L. B. Okun, 1974, Zh.
Wu, X.-P., and F. Hammer, 1993, Mon. Not. R. Astron. SOC. Eksp. Teor. Fiz. 67,3 [Sov.Phys. JETP 40,1 (1975)J.
262,187. Zimdahl, W., D. J. Schwarz, A. B. Balakin, and D. Pa&,
Wyithe, J. S. B., and A. Loeb, 2002, Astrophys. J. 581,886. 2001, Phys. Rev. D 64,063501.
Yahiro, M., G. J. Mathews, K. Ichiki, T. Kajino, and M. Orito, Zumino, B., 1975, Nucl. Phys. B 89, 535.
2002, Phys. Rev. D 65,063502. Zwicky, F., 1933, Helv. Phys. Acta 26,241.

Appendices*

*J. P. Hsu, D. Fine, S. L. Marateck



Marcel Grossmann

1878-1936

Marcel Grossmann studied mathematics at the Zurich Polytechnikum [later
named the Eidgenössische Technische Hochschule (ETH)] and was a staunch and
helpful friend of Einstein. In 1905 Einstein dedicated his doctoral thesis to Grossmann.
The collaboration of Einstein and Grossmann led to a ground-breaking paper, "Outline
of a Generalized Theory of Relativity and of a Theory of Gravitation," which was
published in 1913 and was one of the two fundamental papers that established
Einstein's theory of gravity. They also collaborated on a second paper in 1914. Einstein had
already moved to Berlin when this paper came out. Hoffmann commented: "In retrospect
it is heartbreaking to see how close the collaborators came to achieving their goal.
Practically all the needed mathematical ingredients were there, and, as Einstein remarked
later, he and Grossmann had considered the actual field equations only to discard them
for what at the time seemed compelling reasons...." [1]
Grossmann was a member of an old Swiss family from the Zurich area. His
father managed a textile factory. Grossmann was known for both his quick grasp of new
subjects and his depth of understanding. Speaking of their time together at the Zurich
Polytechnikum, Einstein described him as an excellent student of mathematics whose
lecture notes were so complete and well-organized that they could have been
published. Grossmann appreciated the character of his friend Einstein, who concentrated
on his own reading (of books by Helmholtz, Boltzmann, Maxwell and Hertz) and joyful
exploration of the physical world. Grossmann told his father that Einstein had great
potential. The two met regularly, discussing their studies and much more; Einstein often
visited Grossmann's home at Thalwil to play music. Since Einstein did not attend class
regularly and did not pay particularly close attention to some of his mathematics classes
at that time, Grossmann helped Einstein to pass the examinations by lending him his
lecture notes and helping him with problems.
Marcel Grossmann told his father about Einstein's difficulties in finding a steady
job more than a year after graduation in 1900. His father recommended Einstein strongly
to a friend, the Director of the Swiss Patent Office. As a result, Einstein found a haven in
which to concentrate on physical problems, as well as the means to marry his friend and
study partner Mileva Maric. Einstein considered this help to be "the greatest thing Marcel
Grossmann did for me as a friend."
In 1900 Grossmann graduated from the Zurich Polytechnikum and became an
assistant to the geometer W. Fiedler. He continued to do research on non-Euclidean
geometry and taught in high schools for the next seven years. In 1902, he earned his
doctorate from the University of Zurich with the thesis "On the Metrical Properties of
Collinear Structures." He also published two geometry books for high school students
and three papers on non-Euclidean geometry. At a remarkably young age, Grossmann
was appointed full professor of descriptive geometry at the Eidgenössische Technische
Hochschule in 1907. His doctoral thesis and his favorite subject both concerned non-Euclidean
geometry, which, as luck would have it, paved the way for his celebrated collaboration
with Einstein later. Grossmann was a teacher of outstanding ability who trained many
mathematicians in geometry. [2]
In the meantime, his friend Einstein was attempting to obtain a position at Berne
University, so at the end of 1907 he submitted his 1905 paper on special relativity, "On
the Electrodynamics of Moving Bodies," to Berne University as an inaugural thesis. The
paper was rejected as being incomprehensible! [3] Einstein was bitterly disappointed and
gave up his dream of becoming a university professor for a while. He became interested
in a teaching position at the Technical School. Einstein wrote a letter to Grossmann for
advice and explained:
"Do not imagine that I am driven to such careerist ways by megalomania or some
other questionable passion; rather, I came to this hankering only because of an ardent
desire to be able to continue my private scientific work under less unfavorable
conditions, as you will certainly understand...."
As a professor of geometry, Grossmann organized summer courses for high
school teachers. In 1910 he became one of the founders of the Swiss Mathematical
Society. Within a year, he became Dean of the mathematics-physics section of the
Eidgenössische Technische Hochschule. As a new Dean, he made an effort to persuade
Einstein to return to the ETH. As destiny would have it, Einstein agreed in 1912: "I am
extraordinarily happy about the prospect of returning to Zurich." Around this time,
Einstein sought to formulate mathematically his ideas on the general theory of relativity;
he turned to his friend for assistance: "Grossmann, you must help me, or else I'll go
crazy!" After discussions with Grossmann, Einstein was sure that he was on the right
path. Grossmann introduced Einstein to the absolute differential calculus, started by E. B.
Christoffel (1864) and fully developed by Ricci and Levi-Civita (1901). Grossmann
facilitated Einstein's unique synthesis of mathematical and theoretical physics in what is
still today considered the most elegant and powerful theory of gravity: The General
Theory of Relativity. [4]
Pais has an interesting, lucid and detailed analysis of the Einstein-Grossmann
paper (I. Physical Part, by A. Einstein; II. Mathematical Part, by M. Grossmann): [5]
"Grossmann's concluding section starts as follows. The problem of the
formulation of the differential equations of a gravitation field draws attention to the
differential invariants .... and .... covariants of .... $ds^2 = g_{\mu\nu}\,dx^\mu dx^\nu$. He then presents to
Einstein the major tensor of the future theory: the Christoffel four-index symbol, now
better known as the Riemann-Christoffel (curvature) tensor: $R^{\lambda}{}_{\mu\nu\kappa} = ....$ From this
tensor it is .... possible to derive a second-rank tensor of the second order [in the
derivatives of $g_{\mu\nu}$], the Ricci tensor: $R_{\mu\nu} = R^{\lambda}{}_{\mu\lambda\nu}$...."
Unfortunately, Grossmann reached an erroneous conclusion: "It turns out,
however, that in the special case of the infinitely weak, static gravitational field this
tensor does not reduce to the expression $\Delta\varphi$." Alas, the symmetry property of general
covariance and the freedom of choosing a coordinate condition were not properly
understood by the collaborators.
A crucial change of destiny for their collaboration occurred in 1913 when Planck
and Nernst came to Zurich to persuade Einstein to go to Berlin. As a result Einstein left
Zurich in March 1914. During the next year, the endeavor for Einstein in Berlin was to
really understand his own idea of general covariance and the key role the Ricci tensor
played in his theory.
A page from the draft of Einstein's landmark paper "The Foundation of the
General Theory of Relativity" was included in his collected works. [6] Einstein wrote:
".... Finally, I want to acknowledge gratefully my friend, the mathematician
Grossmann, whose help not only saved me the effort of studying the pertinent
mathematical literature, but who also helped me in my search for the field equations of
gravitation." For some reason, Einstein did not publish this page together with his
landmark paper.
Grossmann died of multiple sclerosis in 1936. Einstein wrote a letter to
Grossmann's wife to convey his heartfelt appreciation of Grossmann's kindness:
".... Our student days together come back to me. He is a model student; I untidy
and a dreamer. He on excellent terms with the teachers and grasping everything easily; I
aloof and discontented, not very popular. But we were good friends and our
conversations over iced coffee at the Metropol every few weeks belong among my nicest
memories. Then the end of the studies .... I suddenly abandoned by everyone, facing life
not knowing which way to turn. But he stood by me and through him and his father I
came to Haller in the Patent Office a few years later. In a way, this saved my life; not
that I would have died without it, but I would have been intellectually stunted."
But Einstein did not write an obituary shortly after Grossmann's death. Pais
commented: "I have a sense of regret that Einstein did not do something for which he
had often demonstrated a talent and sensitivity."
In the last year of Einstein's life (1955), he wrote of Grossmann, of their
collaboration, and how the latter had "checked through the literature and soon discovered
that the mathematical problem had already been solved by Riemann, Ricci, and
Levi-Civita." Einstein also wrote: "The need to express at least once in my life my gratitude to
Marcel Grossmann gave me the courage to write this ... autobiographical sketch." [7,8]
In 1975, the International Center for Theoretical Physics, Trieste, Italy announced
the Marcel Grossmann Meeting on the Recent Progress of the Fundamentals of General
Relativity as follows:
"Marcel Grossmann was associated with Albert Einstein in elucidating the
mathematical basis of general relativity. In commemoration of his contributions, a
Marcel Grossmann meeting on the Recent Progress of the Fundamentals of General
Relativity will be held at the International Center for Theoretical Physics in Trieste
during the period 7-12 July 1975 under the directorship of Professor R. Ruffini (Institute
for Advanced Study, Princeton, N.J., USA) and cosponsored by the University of Trieste.
The theme of the Meeting will cover recent advances in the mathematical techniques of
general relativity as well as progress in the physics of relativistic field theories...."
Since this first meeting, an International Conference in memory of Marcel
Grossmann's contribution to theoretical physics has been held regularly.
The life of Marcel Grossmann is a splendid symphony of friendship.

We would like to thank Anna Révay-Grossmann and Carlo Révay for materials
regarding their grandfather Marcel Grossmann.

References

1. B. Hoffmann and H. Dukas, Albert Einstein, Creator and Rebel (The Viking Press,
1972), p. 105.
2. L. Kollros, Prof. Dr. Marcel Grossmann, 1878-1936 (Extrait des Actes de la Société
Helvétique des Sciences Naturelles, Genève 1937), pp. 325-329. A. Pais, Subtle
is the Lord... (Oxford Univ. Press, 1982), Chapter 12.
3. See ref. 1, p. 86. This is understandable: Even a thinker like Mach, whose previous
criticism of absolute space and absolute motion played a major role in paving
the way for Einstein, was to say harsh things about Einstein's special relativity.
4. R. Ruffini, Marcel Grossmann (unpublished).
5. See paper A in Chapter 2 of this volume. A. Pais, ref. 2.
6. The Collected Papers of Albert Einstein (Ed. J. Stachel, Princeton Univ. Press, 1993).
English translator Anna Beck. Vol. 5.
7. A. Einstein, in Helle Zeit, Dunkle Zeit (C. Seelig, Ed.), Europa Verlag, Zurich,
1956. [A. Einstein, Erinnerungen - Souvenirs (100 Jahre Eidgenössische
Technische Hochschule, 28 Jahrgang, 1955)].
8. A. Pais, Subtle is the Lord... (Oxford Univ. Press, 1982), p. 225.

J. P. Hsu and D. Fine

Marcel Grossmann [photo courtesy of Anna Révay-Grossmann]



Remembering Robert L. Mills

A few months ago, I was saddened to learn that Robert L. Mills had passed away. As a
fellow graduate of Columbia College, I feel a special bond with Mills and am therefore
submitting this memorial note.

Mills, who shared with C. N. Yang the 1980 Rumford Premium Prize from the American
Academy of Arts and Sciences for development of a generalized gauge invariant field
theory, died on 27 October 1999 from prostate cancer. His passing was a great loss to his
family, friends, and the physics community.

[Photo: Robert L. Mills]

Mills was born on 15 April 1927 in Englewood, New Jersey. He graduated from George
School in Pennsylvania in early 1944 and, in March, entered Columbia College in New
York. While there, he enlisted in the US Merchant Marines in the last year of World
War II; he served until 1947.

On leave from the service, he attended classes at Columbia, where his father was an
economics professor. In 1948, his senior year, Mills was a winner of the Putnam national
college mathematics contest. The mathematical ability he displayed there was evident
throughout his career as a theoretical physicist. He then studied at Cambridge
University, where he received first-class honors in the mathematical tripos and a
master's degree. Mills returned to Columbia and got his PhD in 1955 under Norman Kroll
for a thesis on radiative corrections in quantum electrodynamics.

From 1953 to 1955, Mills was a research associate at Brookhaven National Laboratory and
shared an office with Yang. During that time, they developed what is now known as the
Yang-Mills theory, a non-abelian local gauge invariant theory that would become one of
the pivotal concepts of physics. It formed the model for non-abelian gauge theories that
followed and is thus one of the bases for the standard model of elementary particles and
string theory.² Yang-Mills also has applications to mathematics.

From 1955 to 1956, Mills was a member of the Institute for Advanced Study in Princeton,
New Jersey. He then joined the physics department of Ohio State University and became a
full professor in 1962. He remained at OSU until his retirement in 1995. His research was
in quantum field theory, the theory of alloys, and many-body theory. He worked with
Andrew Sessler on many-body theory; later, Leon Cooper joined in the effort. That work,
besides producing papers that appeared in Physical Review and Physical Review Letters,
resulted in Mills's writing a book, Propagators for Many-Particle Systems: An Elementary
Treatment (Gordon and Breach, 1969). He later wrote Space, Time and Quanta: An
Introduction to Contemporary Physics (W. H. Freeman, 1994). For his outstanding
dedication to his students, Mills received OSU's Rosalene Sedgwick Faculty Service Award.
With his wife, Lee, he shared the OSU International Community Service Award. He was a
visiting professor at many schools and a visiting scientist at CERN. After his
retirement, he taught for a year as a Fulbright scholar at St. Patrick's College in
Ireland.

According to Sessler, Robert was even-tempered and simply a joy to work with. His
coworkers enjoyed interacting with him. A memorial piece in the 2000 OSU Physics
Department Magazine concludes with this statement: "A gentleman of unfailing good humor
and sincere and active concern for helping others, Robert Mills will be long remembered
with great respect and affection."

While preparing this letter, I couldn't help but observe Mills's devotion to his friends
and their devotion to him, as exemplified by his interaction with Yang and Cooper, both
Nobel laureates. The following comments were obtained via private communication.

"In 1953-1954, I was visiting Brookhaven and Bob was my office mate. We discussed many
things in physics, from the experimental results pouring out of the new Cosmotron, to
theoretical topics like renormalization and the Ward identity. It was in that year that
we found the very elegant and unique generalization of Maxwell's equation. We were
pleased by the beauty of the generalization, but neither of us had anticipated its great
impact on physics 20 years later.

"Bob spent one year, I think it was 1955-1956, at the Institute for Advanced Study in
Princeton and we resumed our collaboration. One fruit of that was a paper on the
overlapping divergence in the photon propagator which, however, was not written up for
publication³ until 1966, when he and his family visited us for the summer just after I
had moved to [SUNY] Stony Brook.

"Bob was an old-fashioned man. Among all the physicists that I know, he was certainly
one of the most honest and the most sincere.

"Bob had a brilliant mind. He was very quick at grasping new ideas. I shall treasure the
memory of our intensive collaboration and of our many discussions on diverse topics
ranging from accelerator theory to the theory of computability."
-C. N. Yang

"Bob Mills and I, with Andy Sessler, wrote a paper discussing possible superfluidity of
helium-3. In it, we suggested that the electron pairing due to a phonon mediated
electron-electron interaction could be duplicated among ³He atoms due to the atom-atom
potential. Although the solution in ³He turned out to be somewhat more complicated, the
basic idea was vindicated with the discovery, about ten years later, of the
superfluidity of ³He.

"Bob Mills was a talented, creative physicist. We miss him."
-Leon Cooper

I would be remiss in compiling this tribute to Mills if I didn't mention the direct or
indirect influence of Yang-Mills on some of the advances establishing the standard model.
These advances, tours de force all, illustrate the wonderful synergy of theoretical and
experimental physics and include the Glashow-Weinberg-Salam (GWS) theory, the
Glashow-Iliopoulos-Maiani (GIM) model, the successful searches for neutral currents and
for the gauge particles W and Z⁰, the proof of the renormalization of Yang-Mills
theories, and quantum chromodynamics encompassing asymptotic freedom and quark
confinement. This is a splendid legacy indeed.

Many thanks to Lee Mills for so generously giving of her time to provide me with
information on her husband's career.

References
1. C. N. Yang, R. L. Mills, Phys. Rev. 96, 191 (1954).
2. For details on Yang-Mills in the development of gauge theory, see L. O'Raifeartaigh,
N. Straumann, Rev. Mod. Phys. 72, 1 (2000); also see L. O'Raifeartaigh, The Dawning of
Gauge Theory, Princeton U. Press, Princeton, N.J. (1997), for a reprinting of
fundamental papers in gauge theory up to 1956, with commentaries.
3. R. L. Mills, C. N. Yang, Prog. Theor. Phys. Suppl. 37, 507 (1966).
4. L. N. Cooper, R. L. Mills, A. M. Sessler, Phys. Rev. 114, 1377 (1959).

Samuel L. Marateck
(marateck@cs.nyu.edu)
Courant Institute of Mathematical Sciences
New York University
New York City
