
CLASSIC SCIENTIFIC PAPERS

1895 – 1937
Contents

Wilhelm Röntgen - Über eine neue Art von Strahlen - (1895)
Wilhelm Röntgen - On a new kind of rays - (1895)
Pierre and Marie Curie - Sur une nouvelle substance fortement radio-active, contenue dans la pechblende - (1898)
Pierre and Marie Curie - Sobre una nueva sustancia fuertemente radiactiva contenida en la pechblenda - (1898)
Max Planck - Zur Theorie des Gesetzes der Energieverteilung im Normalspectrum - (1900)
Max Planck - On the Law of Distribution of Energy in the Normal Spectrum - (1900)
Albert Einstein - Zur Elektrodynamik bewegter Körper - (1905)
Albert Einstein - On the Electrodynamics of Moving Bodies - (1905)
Albert Einstein - Ist die Trägheit eines Körpers von seinem Energieinhalt abhängig? - (1905)
Albert Einstein - Does the Inertia of a Body depend upon its Energy-Content? - (1905)
Niels Bohr - On the Constitution of Atoms and Molecules - (1913)
Werner Heisenberg - Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen - (1925)
Werner Heisenberg - Quantum-Theoretical Re-Interpretation of Kinematic and Mechanical Relations - (1925)
Erwin Schrödinger - Quantisierung als Eigenwertproblem - (1926)
Erwin Schrödinger - Quantisation as a Problem of Proper Values - (1926)
Werner Heisenberg - Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik - (1927)
Werner Heisenberg - The Actual Content of Quantum Theoretical Kinematics and Mechanics - (1927)
Alexander Fleming - On the Antibacterial Action of Cultures of a Penicillium - (1929)
Edwin Hubble - A Relation between Distance and Radial Velocity among Extra-Galactic Nebulae - (1929)
Alan Turing - On Computable Numbers, with an Application to the Entscheidungsproblem - (1937)
W. C. RÖNTGEN
ON A NEW KIND OF RAYS
Sitzungsberichte der Physikalisch-Medizinischen Gesellschaft zu Würzburg,
1895, p. 132, and 1896, p. 10

REPRINT
on the occasion of the 100th anniversary of the
Physikalisch-Medizinische Gesellschaft zu Würzburg

Its members included:
[names illegible]
From the report of the third session, 23 January 1896.
Herr Röntgen, greeted with lively, prolonged applause, delivers his announced lecture on "A New Kind of Rays" *). Towards its close, the shadow outline of the skeleton of a human hand is photographed by the new method, namely the right hand of the Society's honorary president, Herr v. Kölliker. The latter thanks the speaker on behalf of the Society for a communication that has no equal in significance in the annals of its sessions, and proposes three cheers for Herr Röntgen, in which the members and the whole audience, crowding the lecture hall of the Physical Institute, join three times with loud voice and thunderous applause. Herr v. Kölliker's proposal to call the new "X-rays" "Röntgen's rays" from now on unleashes renewed general jubilation.
In the discussion opened by the first chairman, Herr v. Kölliker and Herr Röntgen speak about the possibility of putting the new rays to use for medical purposes **).
The first chairman then closes this highly significant session, once more expressing his very special thanks to the speaker for having chosen the organ of the Physikalisch-Medizinische Gesellschaft for the first publication of his investigations.

*) Cf. Sitzungsberichte 1895, p. 132.

**) Herr v. Kölliker remarks that the new discovery will presumably also be of great importance in the field of medicine; the abundant material of the local clinics offers opportunity to use the X-rays for examining patients by transillumination, and support for the physicians from Herr Röntgen may well be hoped for. For the present it appears to be surgical conditions, above all changes in the bony skeleton, that are accessible to exploration by the new rays.
Herr Röntgen replies that, to transilluminate parts of the body substantially thicker than arms and legs, more intense tubes than those used so far must be constructed, and that he is occupied with this task. Which internal parts of the human body can be made visible with the improved tubes cannot be said at present; that depends on the degree of their transparency, which has not yet been investigated, and on their position in the body.
W. C. Röntgen: On a New Kind of Rays.
First Communication.
1. If the discharges of a fairly large Ruhmkorff coil are passed through a Hittorf vacuum tube, or through a sufficiently evacuated Lenard, Crookes or similar apparatus, and the tube is covered with a fairly close-fitting jacket of thin black cardboard, then in the completely darkened room a paper screen coated with barium platinocyanide and brought near the apparatus is seen to light up brightly, to fluoresce, at every discharge, regardless of whether the coated side or the other side of the screen faces the discharge apparatus. The fluorescence is still noticeable at a distance of 2 m from the apparatus.
One easily convinces oneself that the cause of the fluorescence proceeds from the discharge apparatus and from no other part of the circuit.
2. The most striking feature of this phenomenon is that an agent capable of producing vivid fluorescence passes through the black cardboard jacket, which transmits no visible or ultraviolet rays of sunlight or of the electric arc; one will therefore first want to investigate whether other bodies also possess this property.
One soon finds that all bodies are transparent to it, but in very different degrees. I give a few examples. Paper is very transparent 1): behind a bound book of about 1000 pages I still saw the fluorescent screen light up distinctly; the printer's ink offers no noticeable hindrance. Fluorescence likewise appears behind a double pack of whist cards; a single card held between apparatus and screen is almost imperceptible to the eye. A single sheet of tinfoil is also scarcely perceptible; only after several layers have been laid on top of one another is their shadow seen clearly on the screen. Thick blocks of wood are still transparent; boards of fir wood two to three cm thick absorb only very little. A layer of aluminium about 15 mm thick weakened the effect quite considerably but was not able to make the fluorescence disappear entirely. Discs of hard rubber several cm thick still let rays *) through. Glass plates of equal thickness behave differently according to whether they contain lead (flint glass) or not; the former are much less transparent than the latter. If the hand is held between the discharge apparatus and the screen, the darker shadows of the bones of the hand are seen within the only slightly dark shadow image of the hand. Water, carbon disulphide and various other liquids, examined in mica vessels, prove to be very transparent. I have not been able to find that hydrogen is substantially more transparent than air. Behind plates of copper, silver, lead, gold and platinum the fluorescence is still clearly recognisable, but only if the plate is not too thick. Platinum 0.2 mm thick is still transparent; the silver and copper plates may be thicker. Lead 1.5 mm thick is practically opaque and was therefore frequently used on account of this property. A wooden rod of square cross-section (20 × 20 mm), one side of which is painted white with lead paint, behaves differently according to how it is held between apparatus and screen: it has almost no effect when the X-rays pass parallel to the painted side, but casts a dark shadow when the rays have to traverse the paint. The salts of the metals, solid or in solution, can be arranged in a series with respect to their transparency similar to that of the metals themselves.

1) By the "transparency" of a body I denote the ratio of the brightness of a fluorescent screen held close behind the body to the brightness which the screen shows under the same conditions but without the body interposed.
3. The experimental results cited, and others, lead to the conclusion that the transparency of the different substances, assuming equal layer thickness, is determined essentially by their density: at least, no other property makes itself felt to so high a degree as this one.
That density is nevertheless not the sole determining factor is shown by the following experiments. I examined the transparency of plates of nearly equal thickness of glass, aluminium, calcite and quartz; the densities of these substances turned out to be approximately equal, and yet it was quite evident that the calcite is considerably less transparent than the other bodies, which behaved much alike among themselves. I did not notice a particularly strong fluorescence of the calcite (cf. p. 4 below), especially in comparison with the glass.

*) For brevity I should like to use the expression "rays" and, to distinguish them from others, the name "X-rays". Cf. p. 9 below.
4. With increasing thickness all bodies become less transparent. In order perhaps to find a relation between transparency and layer thickness, I have made photographic exposures (cf. p. 4 below) in which the photographic plate was partly covered with layers of tinfoil of stepwise increasing number of sheets; a photometric measurement will be made when I am in possession of a suitable photometer.
5. Sheets of platinum, lead, zinc and aluminium were rolled out to such a thickness that all appeared nearly equally transparent. The following table gives the measured thickness in mm, the relative thickness referred to that of the platinum sheet, and the density.

        Thickness     Relative thickness     Density
Pt      0.018 mm        1                    21.5
Pb      0.05 mm         3                    11.3
Zn      0.10 mm         6                     7.1
Al      3.5 mm        200                     2.6

From these values it can be seen that the transparency of different metals is by no means equal when the product of thickness and density is the same. The transparency increases at a much greater rate than that product decreases.
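The conclusion of section 5 can be checked by multiplying thickness and density for each row of the table. The following is only an illustrative sketch: the helper name `thickness_density_products` is hypothetical, and the unit of density (g/cm^3) is an assumption, since the table states none.

```python
# Values from the table in section 5: (thickness in mm, density).
# The table gives no unit for the density; g/cm^3 is assumed here.
SHEETS = {
    "Pt": (0.018, 21.5),
    "Pb": (0.05, 11.3),
    "Zn": (0.10, 7.1),
    "Al": (3.5, 2.6),
}


def thickness_density_products(sheets):
    """Return thickness * density for each metal (hypothetical helper)."""
    return {metal: thickness * density
            for metal, (thickness, density) in sheets.items()}


products = thickness_density_products(SHEETS)
reference = products["Pt"]
for metal, product in products.items():
    # Print the product and its ratio to the platinum value.
    print(f"{metal}: thickness x density = {product:.3f} "
          f"({product / reference:.1f} x Pt)")
```

Although all four sheets appeared about equally transparent, the product grows steadily from platinum to aluminium, by more than a factor of twenty overall, which is the point made in the text.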
6. The fluorescence of barium platinocyanide is not the only recognisable effect of the X-rays. It should first be mentioned that other bodies fluoresce as well, for example the calcium compounds known as phosphors, then uranium glass, ordinary glass, calcite, rock salt, etc.
Of particular importance in many respects is the fact that photographic dry plates have proved to be sensitive to the X-rays. One is thus able to fix many a phenomenon, whereby deceptions are more easily excluded; and wherever it was at all possible I have checked every more important observation made by eye on the fluorescent screen with a photographic exposure.
Here the property of the rays of passing almost unhindered through thinner layers of wood, paper and tinfoil is very useful; the exposures can be made in a lighted room with the photographic plate enclosed in its holder or in a paper wrapping. On the other hand, this property also means that undeveloped plates must not be left lying near the discharge apparatus for any length of time protected merely by the customary cover of cardboard and paper.
It still appears doubtful whether the chemical action on the silver salts of the photographic plate is exerted directly by the X-rays. It is possible that this action arises from the fluorescent light which, as stated above, is produced in the glass plate or perhaps in the gelatine layer. "Films", incidentally, can be used just as well as glass plates.
I have not yet demonstrated experimentally that the X-rays are also able to exert a heating effect; yet this property may well be assumed to exist, since the fluorescence phenomena have demonstrated the capacity of the X-rays to be transformed, and since it is certain that not all the incident X-rays leave the body again as such.
The retina of the eye is insensitive to our rays; the eye brought close to the discharge apparatus perceives nothing, although according to the experience gained the media contained in the eye must be sufficiently transparent to the rays.
7. Having recognised the transparency of various bodies of relatively great thickness, I hastened to find out how the X-rays behave on passing through a prism, whether they are deflected in it or not. Experiments with water and carbon disulphide in mica prisms with a refracting angle of about 30° showed no deflection whatever, either on the fluorescent screen or on the photographic plate. For comparison, the deflection of light rays was observed under the same conditions; the deflected images lay on the plate about 10 mm and about 20 mm, respectively, from the undeflected one. With a hard-rubber prism and an aluminium prism, likewise of about 30° refracting angle, I obtained images on the photographic plate in which a deflection can perhaps be recognised. The matter is very uncertain, however, and the deflection, if present at all, is in any case so small that the refractive index of the X-rays in the substances named could be at most 1.05. With the fluorescent screen I could not observe any deflection in this case either.
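To give a feeling for the size of the deflections discussed in the prism experiments, the sketch below computes the deviation produced by a 30° prism. It rests on assumptions that are not in the text: the prism is taken to be used at minimum deviation in air, and the refractive indices for water (1.333) and carbon disulphide (1.63) are approximate modern values for visible light. The function name `minimum_deviation_deg` is hypothetical.

```python
import math


def minimum_deviation_deg(n, apex_deg):
    """Deviation (degrees) of a prism in air at minimum deviation.

    Uses delta = 2 * asin(n * sin(A / 2)) - A for refractive index n
    and apex (refracting) angle A.
    """
    half_apex = math.radians(apex_deg) / 2.0
    return math.degrees(2.0 * math.asin(n * math.sin(half_apex))) - apex_deg


APEX_DEG = 30.0  # refracting angle quoted in the text

# Upper bound on the refractive index of the X-rays given in the text.
xray_bound = minimum_deviation_deg(1.05, APEX_DEG)

# Visible light; the indices are assumed modern values, not from the text.
water = minimum_deviation_deg(1.333, APEX_DEG)
carbon_disulphide = minimum_deviation_deg(1.63, APEX_DEG)

print(f"n = 1.05 (bound for X-rays): {xray_bound:.2f} degrees")
print(f"water, visible light:        {water:.2f} degrees")
print(f"carbon disulphide, visible:  {carbon_disulphide:.2f} degrees")
print(f"ratio water / CS2:           {water / carbon_disulphide:.2f}")
```

Under these assumptions the two deviations for visible light stand in a ratio of roughly one to two, in line with the displacements of about 10 mm and about 20 mm reported for the light images, while an index of 1.05 would correspond to a deviation of only about a degree and a half.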
Experiments with prisms of denser metals have so far given no certain result, because of their low transparency and the consequently low intensity of the transmitted rays.
In view of this state of affairs on the one hand, and on the other of the importance of the question whether the X-rays can be refracted on passing from one medium into another or not, it is very fortunate that this question can be investigated in another way than with the aid of prisms. Finely powdered bodies, in sufficiently thick layers, let through only little of the incident light, and that scattered, as a consequence of refraction and reflection: if the powders now prove to be just as transparent to the X-rays as the coherent substance, equal masses being assumed, it is thereby demonstrated that neither refraction nor regular reflection is present to any appreciable extent. The experiments were carried out with finely powdered rock salt, with fine silver powder obtained electrolytically, and with the zinc dust widely used in chemical investigations; in all cases no difference was found between the transparency of the powder and that of the coherent substance, either when observing on the fluorescent screen or on the photographic plate.
That the X-rays cannot be concentrated with lenses is self-evident after what has been said; a large hard-rubber lens and a glass lens did in fact prove to be without effect. The shadow image of a round rod is darker in the middle than at the edge; that of a tube filled with a substance more transparent than the material of the tube is lighter in the middle than at the edge.
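The remark about the shadows of a rod and of a filled tube follows from the path lengths of straight rays through a circular cross-section. The sketch below illustrates this under assumptions that are not in the text: the rays are parallel, the attenuation is exponential in the path length, and the attenuation coefficients and radii are arbitrary illustrative numbers. The function names `chord` and `transmission` are hypothetical.

```python
import math


def chord(radius, x):
    """Path length of a straight ray through a circle at lateral offset x."""
    if abs(x) >= radius:
        return 0.0
    return 2.0 * math.sqrt(radius ** 2 - x ** 2)


def transmission(x, outer_r, inner_r=0.0, mu_wall=1.0, mu_fill=0.0):
    """Transmitted fraction, assuming exponential attenuation (an assumption)."""
    fill_path = chord(inner_r, x)               # path inside the filling
    wall_path = chord(outer_r, x) - fill_path   # path inside the wall material
    return math.exp(-(mu_wall * wall_path + mu_fill * fill_path))


# Solid rod: the longest path lies through the middle, so the middle is darkest.
rod_middle = transmission(0.0, outer_r=1.0)
rod_near_edge = transmission(0.9, outer_r=1.0)

# Tube with a more transparent filling: the wall path is shortest in the
# middle and longest where the ray just grazes the inner wall.
tube_middle = transmission(0.0, outer_r=1.0, inner_r=0.8, mu_wall=1.0, mu_fill=0.1)
tube_at_wall = transmission(0.8, outer_r=1.0, inner_r=0.8, mu_wall=1.0, mu_fill=0.1)

print(f"rod:  middle {rod_middle:.3f}, near edge {rod_near_edge:.3f}")
print(f"tube: middle {tube_middle:.3f}, at inner wall {tube_at_wall:.3f}")
```

With these illustrative numbers the rod transmits least through its middle, whereas the tube transmits most through its middle, matching the two shadow images described.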
8. The question of the reflection of the X-rays may be regarded as settled by the experiments of the preceding paragraph, in the sense that no appreciable regular reflection of the rays takes place at any of the substances examined. Other experiments, which I shall pass over here, lead to the same result.
One observation must be mentioned, however, which at first sight seems to show the opposite. I exposed to the X-rays a photographic plate protected against light rays by black paper, with the glass side facing the discharge apparatus; the sensitive layer was covered, except for one part left free, with bright plates of platinum, lead, zinc and aluminium arranged in a star pattern. On the developed negative it can be clearly seen that the blackening under the platinum, the lead and especially under the zinc is stronger than at the other places; the aluminium had exerted no effect at all. It thus appears that the three metals named reflect the rays; other causes for the stronger blackening would still be conceivable, however, and to be certain I placed, in a second experiment, a piece of thin aluminium foil between the sensitive layer and the metal plates; this foil is opaque to ultraviolet rays but very transparent to the X-rays. Since essentially the same result was again obtained, a reflection of X-rays at the metals named is demonstrated.
If this fact is taken together with the observation that powders are just as transparent as coherent bodies, and further that bodies with rough surfaces behave, both in the passage of the X-rays and in the experiment last described, exactly like polished bodies, one arrives at the view that regular reflection, as stated, does not take place, but that bodies behave towards the X-rays as turbid media do towards light.
Since I could also detect no refraction at the transition from one medium to another, it appears as if the X-rays move with the same velocity in all bodies, namely in a medium that is present everywhere and in which the particles of the bodies are embedded. The latter form an obstacle to the propagation of the X-rays, and in general a greater one the denser the body concerned.
9. Accordingly, it would be possible that the arrangement of the particles in the body also exerts an influence on its transparency, so that, for example, a piece of calcite of equal thickness would be differently transparent according to whether it is traversed in the direction of the axis or perpendicular to it. Experiments with calcite and quartz, however, gave a negative result.
10. As is well known, Lenard, in his fine experiments on the Hittorf cathode rays transmitted through a thin aluminium foil, came to the result that these rays are processes in the ether and that they travel diffusely in all bodies. We have been able to say something similar of our rays.
In his latest paper Lenard determined the absorptive power of various bodies for the cathode rays and found it, among others, for air at atmospheric pressure to be 4.10, 3.40 and 3.10, referred to 1 cm, according to the rarefaction of the gas contained in the discharge apparatus. To judge from the discharge voltage estimated from the spark gap, I was dealing in my experiments mostly with rarefactions of about the same magnitude, and only rarely with smaller or greater ones. With L. Weber's photometer (I do not possess a better one) I succeeded in comparing with one another, in atmospheric air, the intensities of the fluorescent light of my screen at two distances from the discharge apparatus, about 100 and 200 mm, and from three experiments agreeing quite well with one another I found that they vary inversely as the squares of the respective distances of the screen from the discharge apparatus. Accordingly, air retains a much smaller fraction of the X-rays passing through it than of the cathode rays. This result is also entirely in agreement with the observation mentioned above that the fluorescent light can still be perceived at a distance of 2 m from the discharge apparatus.
Other bodies in general behave like air: they are more transparent to the X-rays than to the cathode rays.
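The comparison in section 10 can be made concrete with a short calculation. It assumes, which the text does not spell out, that the absorptive power quoted from Lenard is the coefficient of an exponential law referred to 1 cm of path; the function names `inverse_square_ratio` and `surviving_fraction` are hypothetical.

```python
import math


def inverse_square_ratio(near_mm, far_mm):
    """Ratio of intensities at two distances under the inverse-square law."""
    return (far_mm / near_mm) ** 2


def surviving_fraction(alpha_per_cm, path_cm):
    """Fraction left after path_cm, assuming exponential absorption."""
    return math.exp(-alpha_per_cm * path_cm)


NEAR_MM, FAR_MM = 100.0, 200.0  # screen distances quoted in the text

# Purely geometric weakening between the two screen positions.
geometric_ratio = inverse_square_ratio(NEAR_MM, FAR_MM)
print(f"inverse-square intensity ratio near/far: {geometric_ratio:.1f}")

# Extra weakening that rays absorbed like cathode rays would suffer over
# the additional 10 cm of air, for the three coefficients quoted.
extra_path_cm = (FAR_MM - NEAR_MM) / 10.0
for alpha in (4.10, 3.40, 3.10):
    fraction = surviving_fraction(alpha, extra_path_cm)
    print(f"alpha = {alpha:.2f} per cm: surviving fraction {fraction:.1e}")
```

Under this assumption, absorption of the size found for cathode rays would leave a vanishingly small fraction after a further 10 cm of air, so an intensity that simply follows the inverse-square law implies that air absorbs the X-rays far less.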
11. A further very remarkable difference between the behaviour of the cathode rays and that of the X-rays lies in the fact that, despite many efforts, I have not succeeded in obtaining a deflection of the X-rays by a magnet, even in very strong magnetic fields.
Deflectability by a magnet, however, has so far been regarded as a characteristic feature of the cathode rays; Hertz and Lenard did observe that there are different kinds of cathode rays, which "differ from one another in their production of phosphorescence, their absorbability and their deflectability by a magnet", but a considerable deflection was nevertheless perceived in all the cases they examined, and I do not believe that this characteristic will be given up without compelling reason.
12. According to experiments undertaken specially for this purpose, it is certain that the place on the wall of the discharge apparatus that fluoresces most strongly is to be regarded as the principal point of origin of the X-rays, which spread out in all directions. The X-rays thus proceed from the place where, according to the statements of various investigators, the cathode rays strike the glass wall. If the cathode rays are deflected inside the discharge apparatus by a magnet, one sees that the X-rays too proceed from a different place, namely again from the end point of the cathode rays.
For this reason too the X-rays, which cannot be deflected, cannot simply be cathode rays transmitted or reflected unchanged by the glass wall. The greater density of the gas outside the discharge vessel cannot, according to Lenard, be held responsible for the great difference in deflectability.
I therefore come to the result that the X-rays are not identical with the cathode rays, but that they are produced by the cathode rays in the glass wall of the discharge apparatus.
13. This production takes place not only in glass but, as I was able to observe with an apparatus closed by an aluminium sheet 2 mm thick, also in this metal. Other substances are to be examined later.
14. The justification for using the name "rays" for the agent proceeding from the wall of the discharge apparatus I derive in part from the quite regular shadow formation which appears when more or less transparent bodies are brought between the apparatus and the fluorescent screen (or the photographic plate).
I have observed many such shadow images, whose production sometimes has a quite particular charm, and have in part also photographed them; thus I possess, for example, photographs of the shadows of the profile of a door separating the rooms in which the discharge apparatus on the one side and the photographic plate on the other were set up; of the shadows of the bones of the hand; of the shadow of a wire wound concealed on a wooden spool; of a set of weights enclosed in a small box; of a compass in which the magnetic needle is entirely enclosed by metal; of a piece of metal whose inhomogeneity becomes noticeable through the X-rays; etc.
Further proof of the rectilinear propagation of the X-rays is a pinhole photograph which I was able to make of the discharge apparatus wrapped in black paper; the image is faint but unmistakably correct.
15. I have searched a great deal for interference phenomena of the X-rays, but unfortunately without success, perhaps only because of their low intensity.
16. Experiments to establish whether electrostatic forces can influence the X-rays in any way have been begun but are not yet concluded.
17. If one asks what the X-rays, which cannot be cathode rays, actually are, one may at first, misled by their vivid fluorescent and chemical effects, think of ultraviolet light. One immediately encounters serious objections, however. For if the X-rays were ultraviolet light, this light would have to have the following properties:
a) that on passing from air into water, carbon disulphide, aluminium, rock salt, glass, zinc, etc., it can undergo no appreciable refraction;
b) that it cannot be regularly reflected to any appreciable extent by the bodies named;
c) that it therefore cannot be polarised by the means otherwise in use;
d) that its absorption is influenced by no other property of bodies so much as by their density.
That is to say, one would have to assume that these ultraviolet rays behave quite differently from the infrared, visible and ultraviolet rays known hitherto.
I have not been able to bring myself to this and have looked for another explanation.
A kind of relationship between the new rays and light rays appears to exist; at least the shadow formation, the fluorescence and the chemical action, which occur with both kinds of rays, point to it. Now it has long been known that, besides the transverse vibrations of light, longitudinal vibrations can also occur in the ether and, in the view of various physicists, must occur. Admittedly their existence has not yet been clearly demonstrated, and their properties have therefore not yet been investigated experimentally.
Should not the new rays, then, be ascribed to longitudinal vibrations in the ether?
I must confess that in the course of the investigation I have become more and more accustomed to this idea, and I therefore permit myself to express this conjecture here, although I am very well aware that the explanation given still requires further substantiation.
Würzburg. Physical Institute of the University. 28 Dec. 1895.
Preliminary communication.
Hand of the anatomist von Kölliker. Taken with X-rays at the Physical Institute of the University of Würzburg by Professor Dr. W. C. Röntgen.
Second Communication.
(Submitted as a contribution.)
Since my work must be interrupted for several weeks, I permit myself to communicate some new results already now in what follows.
18. At the time of my first publication I knew that the X-rays are able to discharge electrified bodies, and I suspect that it was also the X-rays, and not the cathode rays transmitted unchanged by the aluminium window of his apparatus, which produced the effect described by Lenard on distant electrified bodies. I waited to publish my experiments, however, until I was in a position to communicate results free from objection.
Such results can probably be obtained only if the observations are made in a space which is not only completely protected against the electrostatic forces proceeding from the vacuum tube, the lead wires, the induction apparatus, etc., but is also closed against air coming from the neighbourhood of the discharge apparatus.
For this purpose I had a box made of zinc sheets soldered together, large enough to hold me and the necessary apparatus, and closed airtight everywhere except for an opening that can be closed by a zinc door. The wall opposite the door is for the most part covered with lead; at a place near the discharge apparatus, which is set up outside the box, the zinc wall together with the lead plate laid over it was cut out over a width of 4 cm, and the opening is again closed airtight with a thin aluminium sheet. Through this window the X-rays can enter the observation box.
I have now observed the following:
a) Positively or negatively electrified bodies set up in air are discharged when irradiated with X-rays, and the more rapidly the more intense the rays are. The intensity of the rays was judged by their effect on a fluorescent screen or on a photographic plate.
It is in general immaterial whether the electrified bodies are conductors or insulators. So far I have also found no specific difference in the behaviour of the various bodies with regard to the rate of discharge, nor in the behaviour of positive and negative electricity. It is not excluded, however, that small differences exist.
b) If an electrified conductor is surrounded not by air but by a solid insulator, e.g. paraffin, the irradiation has the same effect as passing a flame connected to earth over the insulating cover.
c) If this insulating cover is enclosed by a close-fitting conductor connected to earth, which like the insulator is to be transparent to X-rays, the irradiation exerts on the inner, electrified conductor no effect detectable with my means.
d) The observations communicated under a, b and c indicate that air irradiated by the X-rays has acquired the property of discharging electrified bodies with which it comes into contact.
e) If this is really so, and if in addition the air retains this property for some time after it has been exposed to the X-rays, then it must be possible to discharge electrified bodies which are not themselves struck by the X-rays by conveying irradiated air to them.
One can convince oneself in various ways that this conclusion is in fact correct. I should like to describe one experimental arrangement, even though it is not the simplest.
I used a brass tube 3 cm wide and 45 cm long; a few centimetres from one end a part of the tube wall is cut away and replaced by a thin aluminium sheet; at the other end a brass sphere fastened to a metal rod is introduced into the tube, insulated and with an airtight seal. Between the sphere and the closed end of the tube a small side tube is soldered on, which can be connected to a suction device; when suction is applied, the brass sphere is bathed in air which on its way through the tube has passed the aluminium window. The distance from the window to the sphere is more than 20 cm.
I set up this tube in the zinc box in such a way that the X-rays could enter through the aluminium window of the tube, perpendicular to its axis; the insulated sphere then lay outside the range of these rays, in the shadow. The tube and the zinc box were in conducting connection with one another; the sphere was connected to a Hankel electroscope.
It was now found that a charge (positive or negative) imparted to the sphere was not influenced by the X-rays as long as the air in the tube remained at rest, but that the charge immediately decreased considerably when irradiated air was conveyed to the sphere by vigorous suction. If the sphere was given a constant potential by connection to accumulators, and irradiated air was drawn continuously through the tube, an electric current arose, as if the sphere had been connected to the tube wall by a poor conductor.
f) The question arises in what way the air can lose again the property
imparted to it by the X-rays. Whether it loses it with time of its own
accord, i.e. without coming into contact with other bodies, is still
undecided. It is certain, on the other hand, that a brief contact with a
body of large surface, which need not be electrified, can render the air
inactive. If, for example, one pushes a sufficiently thick plug of cotton
wool so far into the tube that the irradiated air must pass through the
cotton wool before it reaches the electrified ball, the charge of the ball
remains unchanged even during suction.
If the plug sits at a place in front of the aluminium window, one obtains
the same result as without cotton wool: a proof that dust particles are not
the cause of the observed discharge.
Wire gauze acts similarly to cotton wool; but the gauze must be very fine,
and many layers must be laid over one another, if the irradiated air that
has passed through is to be inactive. If these gauzes are not, as assumed so
far, earthed, but connected with a source of electricity of constant
potential, I have always observed what I had expected; but these
experiments are not yet concluded.


g) If the electrified bodies are in dry hydrogen instead of air, they are
likewise discharged by the X-rays. The discharge in hydrogen seemed to me
to proceed somewhat more slowly, yet this statement is still uncertain
because of the difficulty of obtaining equal intensity of the X-rays in
successive experiments.
The manner in which the apparatus was filled with hydrogen should exclude
the possibility that the condensed layer of air initially present on the
surface of the bodies played an essential part in the discharge.
h) In highly evacuated spaces the discharge of a body struck directly by
the X-rays takes place much more slowly (in one case, for example, about
70 times more slowly) than in the same vessels filled with air or hydrogen
at atmospheric pressure.
i) Experiments on the behaviour of a mixture of chlorine and hydrogen
under the influence of the X-rays have been begun.
j) Finally I should like to mention that the results of investigations on
the discharging action of the X-rays in which the influence of the
surrounding gas was left out of account are in many cases to be accepted
with caution.
19. In some cases it is advantageous to insert a Tesla apparatus
(condenser and transformer) between the discharge apparatus supplying the
X-rays and the Ruhmkorff coil. This arrangement has the following
advantages: first, the discharge apparatus is less easily punctured and
heats up less; second, the vacuum is maintained for a longer time, at least
in my home-made apparatus; and third, some apparatus yield more intense
X-rays. With apparatus that were evacuated too little or too much to
function well with the Ruhmkorff coil alone, the use of the Tesla
transformer rendered good service.
The question suggests itself (and I therefore permit myself to mention it,
without being able for the present to contribute anything to its answer)
whether X-rays can also be produced by a continuous discharge with constant
discharge potential, or whether fluctuations of this potential are not
rather absolutely necessary for their production.
20. In § 13 of my first communication it is stated that the X-rays can
arise not only in glass but also in aluminium. In continuing the
investigation in this direction, no solid body has been found that is not
capable of producing X-rays under the influence of the cathode rays. Nor
has any reason become known to me why liquid and gaseous bodies should not
behave in the same way.
Quantitative differences in the behaviour of the different bodies have,
however, emerged. If, for example, one lets the cathode rays fall on a plate
of which one half consists of platinum sheet 0.3 mm thick and the other
half of aluminium sheet 1 mm thick, one observes, in the photographic image
of this double plate taken with the pinhole camera, that the platinum sheet
emits many more X-rays on the (front) side struck by the cathode rays than
the aluminium sheet does on the same side. From the rear side, on the other
hand, practically no X-rays proceed from the platinum, but relatively many
from the aluminium. The latter rays are produced in the front layers of the
aluminium and have passed through the plate.
An explanation of this observation is easily devised, yet it may be
advisable first to learn further properties of the X-rays.
It should be mentioned, however, that the fact found also has a practical
significance. According to my experience so far, platinum is best suited
for producing X-rays of the greatest possible intensity. For some weeks I
have been using with good success a discharge apparatus in which a concave
mirror of aluminium acts as cathode, and a platinum sheet, inclined at 45°
to the axis of the mirror and placed at the centre of curvature, acts as
anode.
21. In this apparatus the X-rays proceed from the anode. As I must
conclude from experiments with apparatus of different shapes, it is
immaterial with regard to the intensity of the X-rays whether the place
where these rays are produced is the anode or not.
Specially for the experiments with the alternating currents of the Tesla
transformer, a discharge apparatus is being made in which both electrodes
are concave aluminium mirrors whose axes form a right angle with each
other; at the common centre of curvature a platinum plate is mounted which
receives the cathode rays. The usefulness of this apparatus will be
reported on later.
Concluded: 9 March 1896.
Würzburg. Physical Institute of the University.

Editorial note: Besides these two communications, a third communication
appeared in the Sitzungsberichte der K. preuss. Akad. der Wissensch. zu
Berlin, Jahrgang 1897.
All three communications were subsequently republished in the Annalen der
Physik, Band 64, 1, 1898.
The pictures are photographs of original apparatus kept in the Röntgen
memorial room of the Physical Institute of the University of Würzburg.
Figure 1. First page of the original manuscript.

Figure 2. Two of the tube types that Röntgen used in the first experiments.
Dozens of tubes of the lower type ("absolute vacuum") were used up by
Röntgen during the investigation.
Figure 3. Lead sheet with windows of various metals for investigating absorption.
The lead sheet is bent backwards at the two short sides for inserting a
photographic plate in a leather cassette (cf. Section 2).

Figure 4. Two slits in lead sheets, crudely assembled by Röntgen with corks.
They served to stop down a defined beam of rays.
Figure 5. Prisms of hard rubber and aluminium and a hollow prism of mica plates
were placed on the horizontal lead plate of Figure 4; any deflection of the rays
would have had to become detectable in this way (cf. Section 7).

Figure 6. The electromagnet with which Röntgen attempted the deflection of the rays
(cf. Section 11).
Glasser, O. (1958). Dr. W.C. Röntgen. Springfield, Illinois, Charles C. Thomas.
Mould, R. F. (1993). A Century of X-rays and Radioactivity in Medicine, With Emphasis on
Photographic Records of the Early Years. Bristol, Institute of Physics.
Röntgen, W. C. (1895). "Über eine neue Art von Strahlen. Vorläufige Mittheilung."
Sitzungsberichte der Physikalisch-Medizinischen Gesellschaft zu Würzburg 137: 132-141.
Sur une nouvelle substance fortement radio-active contenue dans la
pechblende

Note de P. et M. Curie et G. Bémont. C.R. T.127 (1898) 1215-1217


On a new strongly radioactive substance
contained in pitchblende
Note by P. and M. Curie and G. Bémont. C. R. T.127 (1898) 1215-1217
cathodic spectrum and dispersion, as implying unjustified analogies with ordinary light.
But, on his hypothesis, the simple rays are due to different potentials and hence have
different velocities of propagation; not to mention that they are consecutive. Now, the
simple rays of ordinary light that have passed through a transparent body, even a gaseous
one, are in the same case. The analogies are, on the contrary, clear-cut (1).”

PHYSICS. On a new strongly radioactive substance contained in pitchblende (2).
Note by Mr. P. Curie, Mrs. P. Curie and Mr. G. Bémont, presented by Mr. Becquerel.

“Two of us have shown that, by purely chemical procedures, one can extract from
pitchblende a strongly radioactive substance. This substance is close to bismuth in its
analytical properties. We expressed the opinion that pitchblende perhaps contained a new
element, for which we proposed the name polonium (3).
“The investigations that we are now pursuing are in agreement with the first results
obtained; in the course of these investigations, however, we have found a second strongly
radioactive substance, entirely distinct from the first in its chemical properties. Indeed,
polonium is precipitated from acid solution by hydrogen sulfide; its salts are soluble in
acids, and water precipitates them from these solutions; polonium is completely
precipitated by ammonia.
“The new radioactive substance that we have just found has the chemical appearance of
nearly pure barium: it is precipitated neither by hydrogen sulfide, nor by ammonium
sulfide, nor by ammonia; the sulfate is insoluble in water and in acids; the carbonate is
insoluble in water; the chloride, very soluble in water, is insoluble in concentrated
hydrochloric acid and in alcohol. Finally, this substance gives the easily recognizable
spectrum of barium.
“We believe, nevertheless, that this substance, although consisting for the most part of
barium, contains in addition a new element that imparts radioactivity to it and that is,
moreover, close to barium in its chemical properties.
“The following are the reasons that argue in favour of this view:
“1° Barium and its compounds are not ordinarily radioactive; now, one of us has shown
that radioactivity appears to be an atomic property, persisting in all the chemical and
physical states of matter (4). On this view, the radioactivity of our substance, not being
due to barium, must be attributed to another element.
“2° The first substances that we obtained had, in the state of hydrated chloride, a
radioactivity 60 times stronger than that of metallic uranium (the radioactive intensity
being evaluated by the magnitude of the conductivity of the air in our plate apparatus).
On dissolving these chlorides in water and precipitating a part with alcohol, the
precipitated part is much more active than the part remaining in solution. Relying on this
fact, one can carry out a series of fractionations that yield more and more active
chlorides. In this way we obtained chlorides with an activity 900 times greater than that
of uranium. We were stopped by the lack of substance, and, judging from the progress of
the operations, it is to be expected that the activity would have increased still further
had we been able to continue. These facts can be explained by the presence of a
radioactive element whose chloride would be less soluble in alcoholic water than that of
barium.
“3° Mr. Demarçay kindly agreed to examine the spectrum of our substance, with an
obligingness for which we cannot thank him enough. The results of his examination are set
out in a special Note following ours. Mr. Demarçay found a line in the spectrum that does
not seem to be due to any known element. This line, barely visible with the chloride 60
times more active than uranium, became prominent with the chloride enriched by
fractionation to an activity of 900 times that of uranium. The intensity of this line thus
increases at the same time as the radioactivity, and we think that there is in this a very
serious reason for attributing it to the radioactive part of our substance.

(1) Mr. Goldstein exaggerates when he calls the dispersion of cathode rays a so-called
dispersion, comparable to the dispersion produced by a rotating mirror.
(2) This work was carried out at the Municipal School of Industrial Physics and Chemistry.
(3) Mr. P. Curie and Mrs. P. Curie, Comptes rendus, t. CXXVII, p. 175.
(4) Mrs. P. Curie, Comptes rendus, t. CXXVI, p. 1101.
“The various reasons that we have just enumerated lead us to believe that the new
radioactive substance contains a new element, to which we propose to give the name radium.
“We have determined the atomic weight of our active barium by determining the chlorine in
the anhydrous chloride. We found numbers that differ very little from those obtained in
parallel with inactive barium chloride; the numbers for the active barium are nevertheless
always a little larger, but the difference is of the order of magnitude of the
experimental errors.
“The new radioactive substance certainly contains a very large proportion of barium; in
spite of this, the radioactivity is considerable. The radioactivity of radium must
therefore be enormous.
“Uranium, thorium, polonium, radium and their compounds make air a conductor of
electricity and act photographically on sensitive plates. From these two points of view,
polonium and radium are considerably more active than uranium and thorium. With radium and
polonium, good impressions are obtained on photographic plates after only half a minute of
exposure, whereas several hours are needed to obtain the same result with uranium and
thorium.
“The rays emitted by the compounds of polonium and of radium make barium platinocyanide
fluorescent; from this point of view their action is analogous to that of the Röntgen
rays, but considerably weaker. For this experiment, a very thin aluminium foil, on which a
thin layer of barium platinocyanide has been spread, is placed over the active substance;
in the dark, the platinocyanide appears faintly luminous opposite the active substance.
“One thus obtains a source of light, a very feeble one to tell the truth, which
nevertheless functions without a source of energy. There is here a contradiction, at least
an apparent one, with Carnot's principle.
“Uranium and thorium produce no luminosity under these conditions, their action probably
being too weak (5).”

(5) May we be permitted to thank here Mr. Suess, Correspondent of the Institute, Professor
at the University of Vienna. Thanks to his kind intervention, we were able

9. On the Law of the Energy Distribution in the Normal Spectrum;
by Max Planck.
(Communicated in another form to the German Physical Society,
sessions of 19 October and 14 December 1900, Verhandlungen
2. p. 202 and p. 237. 1900.)

Introduction.
The recent spectral measurements of O. Lummer and E. Pringsheim 1), and
even more strikingly those of H. Rubens and F. Kurlbaum 2), which at the
same time confirmed a result obtained earlier by H. Beckmann 3), have
shown that the law of the energy distribution in the normal spectrum,
first derived by W. Wien from molecular-kinetic considerations and later by
me from the theory of electromagnetic radiation, does not possess general
validity.
The theory therefore requires an improvement in any case, and in what
follows I shall attempt to carry one out on the basis of the theory of
electromagnetic radiation that I have developed. For this it will be
necessary above all to find, in the chain of inferences that led to Wien's
energy distribution law, the link that is capable of modification; it will
then be a matter of removing this link from the chain and creating a
suitable replacement for it.
That the physical foundations of the electromagnetic theory of radiation,
including the hypothesis of "natural radiation", withstand even a sharpened
criticism I have shown in my last paper 4) on this subject; and since, to
my knowledge, the calculations also contain no error, the proposition also
remains valid that the law of the energy distribution in the normal
spectrum is completely determined once one succeeds in calculating the
entropy S of an irradiated, monochromatically vibrating resonator as a
function of its vibrational energy U.

1) O. Lummer and E. Pringsheim, Verhandl. der Deutsch. Physikal.
Gesellsch. 2. p. 163. 1900.
2) H. Rubens and F. Kurlbaum, Sitzungsber. d. k. Akad. d.
Wissensch. zu Berlin, 25 October 1900, p. 929.
3) H. Beckmann, Inaug.-Dissertation, Tübingen 1898. Cf. also
H. Rubens, Wied. Ann. 69. p. 582. 1899.
4) M. Planck, Ann. d. Phys. 1. p. 719. 1900.
Annalen der Physik. IV. Folge. 4.

For then one obtains from the relation $dS/dU = 1/\vartheta$ the dependence
of the energy U on the temperature $\vartheta$, and, since on the other hand
the energy U is linked by a simple relation 1) to the radiation density of
the corresponding frequency, also the dependence of this radiation density
on the temperature. The normal energy distribution is then the one in
which the radiation densities of all the different frequencies possess the
same temperature.
Thus the whole problem reduces to the single task of determining S as a
function of U, and the most essential part of the following investigation
is devoted to the solution of this task. In my first treatise on this
subject I had set down S directly by definition, without further
justification, as a simple expression in U, and had contented myself with
showing that this form of the entropy satisfies all the demands that
thermodynamics places on it. I believed at that time that it was also the
only one of its kind, and that Wien's law, which follows from it, therefore
necessarily possesses general validity. On a later, closer investigation 2)
it became apparent to me, however, that there must be other expressions as
well which accomplish the same, and that therefore in any case a further
condition is needed in order to calculate S uniquely. I believed I had
found such a condition in the proposition, which at that time seemed to me
immediately plausible, that in an infinitely small irreversible change of a
system of N identical resonators situated in the same stationary radiation
field and nearly in thermal equilibrium, the associated increase of their
total entropy $S_N = N S$ depends only on their total energy $U_N = N U$
and its changes, but not on the energy U of the individual resonators.

1) Cf. equation (8) below.
2) M. Planck, l. c. p. 730 ff.

This proposition in turn leads necessarily to Wien's energy distribution
law. But since the latter is not confirmed by experience, one is forced to
the conclusion that this proposition too cannot be correct in its
generality and is therefore to be removed from the theory. 1)
Another condition must therefore now be introduced which permits the
calculation of S, and to accomplish this a closer examination of the
meaning of the concept of entropy is necessary. The untenability of the
assumption made earlier gives a hint as to the direction of the line of
thought to be followed. In what follows a path is described which yields a
new, simple expression for the entropy and with it a new radiation
formula, which does not appear to contradict any of the facts established
so far.

1) Compare the criticisms that this proposition has already met with: by
W. Wien (Rapport für den Pariser Congress 2. p. 40. 1900) and by
O. Lummer (l. c. 2. p. 92. 1900).

I. Calculation of the entropy of a resonator as a function of its energy.

§ 1. Entropy implies disorder, and this disorder, according to the
electromagnetic theory of radiation, rests, for the monochromatic
vibrations of a resonator, even when it is in a permanently stationary
radiation field, on the irregularity with which it constantly changes its
amplitude and its phase, provided one considers time intervals that are
large compared with the period of one vibration but small compared with the
duration of a measurement. If amplitude and phase were absolutely constant,
that is, the vibrations perfectly homogeneous, no entropy could exist and
the vibrational energy would have to be completely freely convertible into
work. The constant energy U of a single stationary vibrating resonator is
accordingly to be taken only as a time average, or, what amounts to exactly
the same thing, as the simultaneous average of the energies of a large
number N of identical resonators situated in the same stationary radiation
field, far enough apart from one another not to influence each other
directly. In this sense we shall in future speak of the mean energy U of a
single resonator. Then to the total energy
(1) $U_N = N U$
of such a system of N resonators there corresponds a certain total entropy
(2) $S_N = N S$
of the same system, where S represents the mean entropy of a single
resonator, and this entropy $S_N$ rests on the disorder with which the
total energy $U_N$ is distributed among the individual resonators.
§ 2. We now set the entropy $S_N$ of the system, apart from an arbitrary
additive constant, proportional to the logarithm of the probability W that
the N resonators together possess the energy $U_N$, thus:
(3) $S_N = k \log W + \text{const.}$
This stipulation amounts, in my opinion, basically to a definition of the
said probability W; for in the assumptions underlying the electromagnetic
theory of radiation we possess no clue whatever for speaking of such a
probability in a definite sense. In favour of the suitability of the
stipulation thus made, one can point from the outset to its simplicity and
its close kinship with a theorem of the kinetic theory of gases 1).
§ 3. It is now a matter of finding the probability W that the N
resonators together possess the vibrational energy $U_N$. For this it is
necessary to regard $U_N$ not as a continuous, unlimitedly divisible
quantity, but as a discrete quantity composed of a whole number of finite
equal parts. If we call such a part an energy element $\varepsilon$, we
must accordingly set:
(4) $U_N = P \cdot \varepsilon$,
where P denotes an integer, in general a large one, while we leave the
value of $\varepsilon$ still undecided.

1) L. Boltzmann, Sitzungsber. d. k. Akad. d. Wissensch. zu Wien
(II) 76. p. 428. 1877.

Now it is evident that the distribution of the P energy elements among the
N resonators can take place only in a finite, quite definite number of
ways. Each such way of distribution we call, after an expression used by
L. Boltzmann for a similar concept, a "complexion". If one designates the
resonators by the numerals 1, 2, 3 ... N, writes these in order side by
side, and places under each resonator the number of energy elements falling
to it in some arbitrarily made distribution, one obtains for each
complexion a symbol of the following form:
 1  2  3  4  5  6  7  8  9 10
 7 38 11  0  9  2 20  4  4  5
Here N = 10, P = 100 is assumed. The number $\mathfrak{R}$ of all possible
complexions is evidently equal to the number of all possible arrangements
of numerals that can be obtained in this way, for given N and P, for the
lower row. For the sake of clarity it may further be noted that two
complexions are to be regarded as different if the corresponding
arrangements contain the same numerals but in a different order.
From the theory of combinations the number of all possible complexions is
thus found to be
$\mathfrak{R} = \frac{N (N+1)(N+2) \cdots (N+P-1)}{1 \cdot 2 \cdot 3 \cdots P} = \frac{(N+P-1)!}{(N-1)!\,P!}$.
Now according to Stirling's theorem, to a first approximation:
$N! = N^N$,
consequently, to a corresponding approximation,
$\mathfrak{R} = \frac{(N+P)^{N+P}}{N^N \cdot P^P}$.
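The counting in § 3 can be checked numerically. The following is a minimal sketch, not part of Planck's paper: the function names are hypothetical, and the exact logarithm is taken from the standard-library `math.lgamma` rather than from any formula in the text. It compares the exact number of complexions $(N+P-1)!/((N-1)!\,P!)$ with the first-order Stirling estimate $(N+P)^{N+P}/(N^N P^P)$.

```python
import itertools
import math


def count_complexions(n_resonators: int, n_elements: int) -> int:
    """Exact number of complexions: (N+P-1)! / ((N-1)! P!)."""
    return math.comb(n_resonators + n_elements - 1, n_elements)


def brute_force_count(n_resonators: int, n_elements: int) -> int:
    """Enumerate every ordered distribution of P elements over N resonators.

    Only practical for very small N and P; used as an independent check.
    """
    return sum(
        1
        for occupation in itertools.product(range(n_elements + 1), repeat=n_resonators)
        if sum(occupation) == n_elements
    )


def log_complexions_exact(n_resonators: int, n_elements: int) -> float:
    """Natural logarithm of the exact count, via the log-gamma function."""
    n, p = n_resonators, n_elements
    return math.lgamma(n + p) - math.lgamma(n) - math.lgamma(p + 1)


def log_complexions_stirling(n_resonators: int, n_elements: int) -> float:
    """Natural logarithm of the estimate (N+P)^(N+P) / (N^N P^P)."""
    n, p = n_resonators, n_elements
    return (n + p) * math.log(n + p) - n * math.log(n) - p * math.log(p)


if __name__ == "__main__":
    # The example of the text: N = 10 resonators, P = 100 energy elements.
    print(count_complexions(10, 100))
    # The crude Stirling form is only meant for large N and P.
    for n, p in [(10, 100), (10**4, 10**5)]:
        print(n, p, log_complexions_exact(n, p), log_complexions_stirling(n, p))
```

For N = 10 and P = 100 the two logarithms still differ noticeably. The agreement becomes close only when N and P are both large, which is the regime the entropy calculation assumes.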

§ 4. The hypothesis on which we now wish to base the further calculation
reads as follows: The probability W that the N resonators together possess
the vibrational energy $U_N$ is proportional to the number $\mathfrak{R}$
of all complexions possible in the distribution of the energy $U_N$ among
the N resonators; or in other words: any particular complexion is just as
probable as any other particular complexion. Whether this hypothesis really
holds in nature can in the last resort be tested only by experience.
Conversely, however, should experience one day have decided in its favour,
it will be possible to draw from the validity of this hypothesis further
conclusions about the more special nature of the resonator vibrations,
namely about the character of the "indifferent original ranges, comparable
in magnitude" that occur there, in the phraseology of J. v. Kries. 1) In
the present state of the question, however, a further pursuit of this line
of thought would seem premature.
§ 5. According to the hypothesis introduced, in connection with equation
(3), the entropy of the system of resonators under consideration is, with a
suitable determination of the additive constant:
(5) $S_N = k \log \mathfrak{R} = k \{ (N+P) \log (N+P) - N \log N - P \log P \}$
and with regard to (4) and (1):
$S_N = k N \left\{ \left(1 + \frac{U}{\varepsilon}\right) \log \left(1 + \frac{U}{\varepsilon}\right) - \frac{U}{\varepsilon} \log \frac{U}{\varepsilon} \right\}$.
Thus, by (2), the entropy S of a resonator as a function of its energy U is:
(6) $S = k \left\{ \left(1 + \frac{U}{\varepsilon}\right) \log \left(1 + \frac{U}{\varepsilon}\right) - \frac{U}{\varepsilon} \log \frac{U}{\varepsilon} \right\}$.
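The step from (5) to (6) is stated without intermediate algebra. As a hedged reconstruction, not taken from the paper, it can be written out by substituting $P = U_N/\varepsilon = N U/\varepsilon$ from (4) and (1) into (5):

```latex
\begin{align*}
\frac{S_N}{k}
  &= (N+P)\log(N+P) - N\log N - P\log P \\
  &= N\Bigl(1+\tfrac{U}{\varepsilon}\Bigr)
       \Bigl[\log N + \log\Bigl(1+\tfrac{U}{\varepsilon}\Bigr)\Bigr]
     - N\log N
     - N\tfrac{U}{\varepsilon}
       \Bigl[\log N + \log\tfrac{U}{\varepsilon}\Bigr] \\
  &= N\Bigl\{\Bigl(1+\tfrac{U}{\varepsilon}\Bigr)
       \log\Bigl(1+\tfrac{U}{\varepsilon}\Bigr)
     - \tfrac{U}{\varepsilon}\log\tfrac{U}{\varepsilon}\Bigr\}
\end{align*}
```

The terms in $\log N$ cancel because their coefficients sum to $N(1+U/\varepsilon) - N - N U/\varepsilon = 0$. Dividing by N, as (2) requires, gives the entropy (6) of a single resonator.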

II. Introduction of Wien's displacement law.

§ 6. Next to Kirchhoff's theorem of the proportionality of emissive and
absorptive power, the so-called displacement law discovered by W. Wien 2)
and named after him, which includes as a special application the
Stefan-Boltzmann law of the dependence of the total radiation on the
temperature, forms the most valuable constituent of the firmly established
foundation of the theory of heat radiation.

1) Joh. v. Kries, Die Principien der Wahrscheinlichkeitsrechnung
p. 36. Freiburg 1886.
2) W. Wien, Sitzungsber. d. k. Akad. d. Wissensch. zu Berlin,
9 Febr. 1893. p. 55.

In the form given to it by M. Thiesen 1) it reads:
$E \cdot d\lambda = \vartheta^5 \, \psi(\lambda \vartheta) \cdot d\lambda$,
where $\lambda$ denotes the wavelength, $E\,d\lambda$ the spatial density
of the "black" radiation 2) belonging to the spectral region $\lambda$ to
$\lambda + d\lambda$, $\vartheta$ the temperature, and $\psi(x)$ a certain
function of the single argument x.
1) M. Thiesen, Verhandl. d. Deutsch. Phys. Gesellsch. 2. p. 66. 1900.
2) One could perhaps speak still more fittingly of a "white" radiation, in
an appropriate generalization of what is already now understood by
perfectly white light.

§ 7. We now wish to examine what Wien's displacement law says about the
dependence of the entropy S of our resonator on its energy U and its
characteristic period, and indeed at once in the general case in which the
resonator is situated in an arbitrary diathermanous medium. For this
purpose we first generalize Thiesen's form of the law to radiation in an
arbitrary diathermanous medium with the velocity of light c. Since we have
to consider not the total radiation but monochromatic radiation, it becomes
necessary, in comparing different diathermanous media, to introduce the
frequency $\nu$ instead of the wavelength $\lambda$.
If we therefore denote the spatial density of the radiant energy belonging
to the spectral region $\nu$ to $\nu + d\nu$ by $u\,d\nu$, we have to
write: $u\,d\nu$ instead of $E\,d\lambda$, $c/\nu$ instead of $\lambda$,
and $c\,d\nu/\nu^2$ instead of $d\lambda$. This gives:
$u = \vartheta^5 \, \frac{c}{\nu^2} \, \psi\!\left(\frac{c\vartheta}{\nu}\right)$.
Now according to the well-known Kirchhoff-Clausius law, the energy of
definite temperature $\vartheta$ and definite frequency $\nu$ emitted per
unit time by a black surface into a diathermanous medium is inversely
proportional to the square $c^2$ of the velocity of propagation; hence the
spatial energy density u is inversely proportional to $c^3$, and we obtain:
$u = \frac{\vartheta^5}{\nu^2 c^3} \, f\!\left(\frac{\vartheta}{\nu}\right)$,
where the constants of the function f are independent of c.
Instead of this we may also write, if f each time, here and in what
follows, denotes a new function of a single argument:
(7) $u = \frac{\nu^3}{c^3} \, f\!\left(\frac{\vartheta}{\nu}\right)$,
and we see from this, among other things, the well-known result that the
radiant energy of definite temperature and frequency contained in the cube
of a wavelength, $u \lambda^3$, is the same for all diathermanous media.
§ 8. In order now to pass from the spatial radiation density u to the
energy U of a stationary resonator vibrating in the radiation field with
the same frequency $\nu$, we use the relation expressed in equation (34) of
my treatise on irreversible radiation processes 1):
$\mathfrak{K} = \frac{\nu^2}{c^2} \, U$
($\mathfrak{K}$ is the intensity of a monochromatic, linearly polarized
ray), which together with the well-known equation:
$u = \frac{8\pi \mathfrak{K}}{c}$
yields the relation:
(8) $u = \frac{8\pi\nu^2}{c^3} \, U$.
From this and from (7) it follows that:
$U = \nu \, f\!\left(\frac{\vartheta}{\nu}\right)$,
where c now no longer occurs at all. Instead of this we may also write:
$\vartheta = \nu \, f\!\left(\frac{U}{\nu}\right)$.

1) M. Planck, Ann. d. Phys. 1. p. 99. 1900.

§ 9. Endlich führen wir auch noch die Entropie S des Resonators ein, indem wir setzen:
\[ \frac{1}{\vartheta} = \frac{dS}{dU} \tag{9} \]
Dann ergiebt sich:
\[ \frac{dS}{dU} = \frac{1}{\nu} \cdot f\left(\frac{U}{\nu}\right) \]
und integrirt:
\[ S = f\left(\frac{U}{\nu}\right) \tag{10} \]
d. h. die Entropie des in einem beliebigen diathermanen Medium schwingenden Resonators ist von
der einzigen Variabeln U/ν abhängig und enthält ausserdem nur universelle Constante. Dies ist
die einfachste mir bekannte Fassung des Wien'schen Verschiebungsgesetzes.
§ 10. Wenden wir das Wien'sche Verschiebungsgesetz in der letzten Fassung auf den Ausdruck (6)
der Entropie S an, so erkennen wir, dass das Energieelement ε proportional der Schwingungszahl
ν sein muss, also:
\[ \epsilon = h \cdot \nu \]
und somit:
\[ S = k\left\{\left(1+\frac{U}{h\nu}\right)\log\left(1+\frac{U}{h\nu}\right) - \frac{U}{h\nu}\log\frac{U}{h\nu}\right\} \]
Hierbei sind h und k universelle Constante.
Durch Substitution in (9) erhält man:
\[ \frac{1}{\vartheta} = \frac{k}{h\nu}\log\left(1+\frac{h\nu}{U}\right) \]
\[ U = \frac{h\nu}{e^{h\nu/k\vartheta}-1} \tag{11} \]
und aus (8) folgt dann das gesuchte Energieverteilungsgesetz:
\[ u = \frac{8\pi h\nu^3}{c^3}\cdot\frac{1}{e^{h\nu/k\vartheta}-1} \tag{12} \]
oder auch, wenn man mit den in § 7 angegebenen Substitutionen statt der Schwingungszahl ν
wieder die Wellenlänge λ einführt:
\[ E = \frac{8\pi c h}{\lambda^5}\cdot\frac{1}{e^{ch/k\lambda\vartheta}-1} \tag{13} \]
Die Ausdrücke für die Intensität und für die Entropie der im diathermanen Medium
fortschreitenden Strahlung, sowie den Satz der Vermehrung der gesamten Entropie bei
nichtstationären Strahlungsvorgängen denke ich an anderer Stelle abzuleiten.
III. Zahlenwerte.
§ 11. Die Werte der beiden Naturconstanten h und k lassen sich mit Hülfe der vorliegenden
Messungen ziemlich genau berechnen. F. Kurlbaum 1) hat gefunden, dass, wenn man mit S_t die
gesamte Energie bezeichnet, die von 1 qcm eines auf t° C. befindlichen schwarzen Körpers in
1 sec in die Luft gestrahlt wird:
\[ S_{100} - S_0 = 0{,}0731\ \frac{\text{Watt}}{\text{cm}^2} = 7{,}31\cdot 10^5\ \frac{\text{erg}}{\text{cm}^2\,\text{sec}} \]
Daraus ergiebt sich die räumliche Dichte der gesamten Strahlungsenergie in der Luft bei der
absoluten Temperatur 1:
\[ \frac{4\cdot 7{,}31\cdot 10^5}{3\cdot 10^{10}\cdot(373^4 - 273^4)} = 7{,}061\cdot 10^{-15}\ \frac{\text{erg}}{\text{cm}^3\,\text{grad}^4} \]
Andererseits ist nach (12) die räumliche Dichte der gesamten strahlenden Energie für ϑ = 1:
\[ u^* = \int_0^\infty u\,d\nu = \frac{8\pi h}{c^3}\int_0^\infty \frac{\nu^3\,d\nu}{e^{h\nu/k}-1} = \frac{8\pi h}{c^3}\int_0^\infty \nu^3\left(e^{-h\nu/k}+e^{-2h\nu/k}+e^{-3h\nu/k}+\cdots\right)d\nu \]
und durch gliedweise Integration:
\[ u^* = \frac{8\pi h}{c^3}\cdot 6\left(\frac{k}{h}\right)^4\left(1+\frac{1}{2^4}+\frac{1}{3^4}+\frac{1}{4^4}+\cdots\right) = \frac{48\pi k^4}{c^3 h^3}\cdot 1{,}0823 \]
Setzt man dies = 7,061·10⁻¹⁵, so ergiebt sich, da c = 3·10¹⁰:
\[ \frac{k^4}{h^3} = 1{,}1682\cdot 10^{15} \tag{14} \]

1) F. Kurlbaum, Wied. Ann. 65. p. 759. 1898.

§ 12. O. Lummer und E. Pringsheim 1) haben das Product λ_m ϑ, wo λ_m die Wellenlänge des
Maximums von E in Luft bei der Temperatur ϑ bedeutet, zu 2940 μ·grad bestimmt. Also in
absolutem Maass:
\[ \lambda_m\vartheta = 0{,}294\ \text{cm}\cdot\text{grad} \]
Andererseits folgt aus (13), wenn man den Differentialquotienten von E nach λ gleich Null
setzt, wodurch λ = λ_m wird:
\[ \left(1 - \frac{ch}{5k\lambda_m\vartheta}\right)\cdot e^{ch/k\lambda_m\vartheta} = 1 \]
und aus dieser transcendenten Gleichung:
\[ \lambda_m\vartheta = \frac{ch}{4{,}9651\cdot k} \]
Folglich:
\[ \frac{h}{k} = \frac{4{,}9651\cdot 0{,}294}{3\cdot 10^{10}} = 4{,}866\cdot 10^{-11} \]
Hieraus und aus (14) ergeben sich die Werte der Naturconstanten:
\[ h = 6{,}55\cdot 10^{-27}\ \text{erg}\cdot\text{sec} \tag{15} \]
\[ k = 1{,}346\cdot 10^{-16}\ \frac{\text{erg}}{\text{grad}} \tag{16} \]
Das sind dieselben Zahlen, welche ich in meiner früheren Mitteilung angegeben habe.

1) O. Lummer und E. Pringsheim, Verhandl. der Deutschen Physikal. Gesellsch. 2. p. 176. 1900.
(Eingegangen 7. Januar 1901.)
On the Law of Distribution of Energy in the Normal Spectrum
Max Planck

Annalen der Physik, vol. 4, p. 553 ff (1901)

This PDF file was typeset with LATEX based on the HTML file at
http://dbhs.wvusd.k12.ca.us/webdocs/Chem-History/Planck-1901/Planck-1901.html.
Please report typos etc. to Koji Ando at Kyoto University (ando@kuchem.kyoto-u.ac.jp).

The recent spectral measurements made by O. Lummer and E. Pringsheim [1], and even more notably
those by H. Rubens and F. Kurlbaum [2], which together confirmed an earlier result obtained by
H. Beckmann [3], show that the law of energy distribution in the normal spectrum, first derived by
W. Wien from molecular-kinetic considerations and later by me from the theory of electromagnetic
radiation, is not valid generally.
In any case the theory requires a correction, and I shall attempt in the following to accomplish
this on the basis of the theory of electromagnetic radiation which I developed. For this purpose it will
be necessary first to find in the set of conditions leading to Wien’s energy distribution law that term
which can be changed; thereafter it will be a matter of removing this term from the set and making an
appropriate substitution for it.
In my last article [4] I showed that the physical foundations of the electromagnetic radiation theory,
including the hypothesis of “natural radiation”, withstand the most severe criticism; and since to my
knowledge there are no errors in the calculations, the principle persists that the law of energy distribution
in the normal spectrum is completely determined when one succeeds in calculating the entropy S of an
irradiated, monochromatic, vibrating resonator as a function of its vibrational energy U. Since one then
obtains, from the relationship dS/dU = 1/θ, the dependence of the energy U on the temperature θ, and
since the energy is also related to the density of radiation at the corresponding frequency by a simple
relation [5], one also obtains the dependence of this density of radiation on the temperature. The normal
energy distribution is then the one in which the radiation densities of all different frequencies have the
same temperature.
Consequently, the entire problem is reduced to determining S as a function of U, and it is to this task
that the most essential part of the following analysis is devoted. In my first treatment of this subject I
had expressed S, by definition, as a simple function of U without further foundation, and I was satisfied
to show that this form of entropy meets all the requirements imposed on it by thermodynamics. At that
time I believed that this was the only possible expression and that consequently Wien’s law, which follows
from it, necessarily had general validity. In a later, closer analysis [6], however, it appeared to me that there
must be other expressions which yield the same result, and that in any case one needs another condition
in order to be able to calculate S uniquely. I believed I had found such a condition in the principle, which
at the time seemed to me perfectly plausible, that in an infinitely small irreversible change in a system,
near thermal equilibrium, of N identical resonators in the same stationary radiation field, the increase in
the total entropy $S_N = N S$ with which it is associated depends only on its total energy $U_N = N U$ and
the changes in this quantity, but not on the energy U of individual resonators. This theorem leads again
to Wien’s energy distribution law. But since the latter is not confirmed by experience one is forced to
conclude that even this principle cannot be generally valid and thus must be eliminated from the theory [7].
Thus another condition must now be introduced which will allow the calculation of S, and to accomplish
this it is necessary to look more deeply into the meaning of the concept of entropy. Consideration
of the untenability of the hypothesis made formerly will help to orient our thoughts in the direction
indicated by the above discussion. In the following a method will be described which yields a new, simpler
expression for entropy and thus provides also a new radiation equation which does not seem to conflict
with any facts so far determined.

[1] O. Lummer and E. Pringsheim, Transactions of the German Physical Society 2 (1900), p. 163.
[2] H. Rubens and F. Kurlbaum, Proceedings of the Imperial Academy of Science, Berlin, October 25, 1900, p. 929.
[3] H. Beckmann, Inaugural dissertation, Tübingen 1898. See also H. Rubens, Wied. Ann. 69 (1899), p. 582.
[4] M. Planck, Ann. d. Phys. 1 (1900), p. 719.
[5] Compare with equation (8).
[6] M. Planck, loc. cit., pp. 730 ff.
[7] Moreover one should compare the critiques previously made of this theorem by W. Wien (Report of the
Paris Congress 2, 1900, p. 40) and by O. Lummer (loc. cit., 1900, p. 92).

1 Calculations of the Entropy of a Resonator as a Function of its Energy


§1. Entropy depends on disorder and this disorder, according to the electromagnetic theory of radiation
for the monochromatic vibrations of a resonator when situated in a permanent stationary radiation
field, depends on the irregularity with which it constantly changes its amplitude and phase, provided
one considers time intervals large compared to the time of one vibration but small compared to the
duration of a measurement. If amplitude and phase both remained absolutely constant, which means
completely homogeneous vibrations, no entropy could exist and the vibrational energy would have to
be completely free to be converted into work. The constant energy U of a single stationary vibrating
resonator accordingly is to be taken as a time average, or what is the same thing, as a simultaneous average
of the energies of a large number N of identical resonators, situated in the same stationary radiation field,
and which are sufficiently separated so as not to influence each other directly. It is in this sense that we
shall refer to the average energy U of a single resonator. Then to the total energy
\[ U_N = N U \tag{1} \]
of such a system of N resonators there corresponds a certain total entropy
\[ S_N = N S \tag{2} \]
of the same system, where S represents the average entropy of a single resonator and the entropy $S_N$
depends on the disorder with which the total energy $U_N$ is distributed among the individual resonators.

§2. We now set the entropy $S_N$ of the system proportional, within an arbitrary additive constant, to
the logarithm of the probability W that the N resonators together have the energy $U_N$:
\[ S_N = k \log W + \text{constant} \tag{3} \]
In my opinion this actually serves as a definition of the probability W, since in the basic assumptions
of electromagnetic theory there is no definite evidence for such a probability. The suitability of this
expression is evident from the outset, in view of its simplicity and close connection with a theorem from
kinetic gas theory [8].

§3. It is now a matter of finding the probability W that the N resonators together possess the
vibrational energy $U_N$. For this it is necessary to interpret $U_N$ not as a continuous, infinitely divisible
quantity, but as a discrete quantity composed of an integral number of finite equal parts. Let us call each
such part the energy element ε; consequently we must set
\[ U_N = P \epsilon \tag{4} \]
where P represents an integer, generally large, while the value of ε is yet uncertain.
(The above paragraph in the original German)
Es kommt nun darauf an, die Wahrscheinlichkeit W dafür zu finden, dass die N Resonatoren
insgesamt die Schwingungsenergie $U_N$ besitzen. Hierzu ist es notwendig, $U_N$ nicht
als eine stetige, unbeschränkt teilbare, sondern als eine discrete, aus einer ganzen Zahl von
endlichen gleichen Teilen zusammengesetzte Grösse aufzufassen. Nennen wir einen solchen
Teil ein Energieelement ε, so ist mithin zu setzen
\[ U_N = P \epsilon \]
wobei P eine ganze, im allgemeinen grosse Zahl bedeutet . . . .

[8] L. Boltzmann, Proceedings of the Imperial Academy of Science, Vienna, (II) 76 (1877), p. 428.
Now it is evident that the P energy elements can be distributed among the N resonators only in a
finite, integral, definite number of ways. Every such form of distribution we call, after an expression
used by L. Boltzmann for a similar idea, a “complex”. If one denotes the resonators by the numbers 1,
2, 3, ... N, and writes these side by side, and if one sets under each resonator the number of energy
elements assigned to it by some arbitrary distribution, then one obtains for every complex a pattern of
the following form:

    1   2   3   4   5   6   7   8   9  10
    7  38  11   0   9   2  20   4   4   5

Here we assume N = 10, P = 100. The number R of all possible complexes is obviously equal to the
number of arrangements that one can obtain in this fashion for the lower row, for a given N and P. For
the sake of clarity we should note that two complexes must be considered different if the corresponding
number patterns contain the same numbers but in a different order.
From combination theory one obtains the number of all possible complexes as:
\[ R = \frac{N(N+1)(N+2)\cdots(N+P-1)}{1\cdot 2\cdot 3\cdots P} = \frac{(N+P-1)!}{(N-1)!\,P!} \]
Now according to Stirling’s theorem, we have in the first approximation:
\[ N! = N^N \]
Consequently, the corresponding approximation is:
\[ R = \frac{(N+P)^{N+P}}{N^N \cdot P^P} \]
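The counting argument above can be checked numerically. The following is a minimal Python sketch, not part of the original paper; the function names are hypothetical and chosen only for illustration. It compares the closed form for R with a brute-force enumeration of number patterns for a small case, and compares log R with the logarithm of the first-approximation formula, assuming that only the leading term of Stirling's theorem is kept, as in the text.

```python
from itertools import product
from math import comb, lgamma, log


def count_complexes(n_resonators: int, n_elements: int) -> int:
    """Closed form from the text: R = (N + P - 1)! / ((N - 1)! P!)."""
    return comb(n_resonators + n_elements - 1, n_elements)


def count_by_enumeration(n_resonators: int, n_elements: int) -> int:
    """Brute-force count of ordered patterns whose entries sum to P (small cases only)."""
    return sum(
        1
        for pattern in product(range(n_elements + 1), repeat=n_resonators)
        if sum(pattern) == n_elements
    )


def log_r_exact(n: int, p: int) -> float:
    """log R from the factorial formula, using lgamma to avoid huge integers."""
    return lgamma(n + p) - lgamma(n) - lgamma(p + 1)


def log_r_stirling(n: int, p: int) -> float:
    """log of the approximation R = (N+P)^(N+P) / (N^N P^P)."""
    return (n + p) * log(n + p) - n * log(n) - p * log(p)


if __name__ == "__main__":
    # Small case: both counts must agree.
    print(count_by_enumeration(3, 4), count_complexes(3, 4))
    # The example of the text, N = 10 resonators and P = 100 energy elements.
    print(count_complexes(10, 100))
    # Ratio of exact to approximate log R for growing system size.
    for n in (10, 1000, 100000):
        p = 10 * n
        print(n, log_r_exact(n, p) / log_r_stirling(n, p))
```

The approximation is crude for N = 10, but the ratio of the two logarithms approaches 1 as N and P grow, which is the regime the entropy calculation relies on.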

§4. The hypothesis which we want to establish as the basis for further calculation proceeds as follows:
the probability W that the N resonators collectively possess the vibrational energy $U_N$ is
proportional to the number R of all possible complexes formed by distribution of the energy $U_N$ among
the N resonators; or in other words, any given complex is just as probable as any other. Whether this
actually occurs in nature one can, in the last analysis, prove only by experience. But should experience
finally decide in its favor it will be possible to draw further conclusions from the validity of this hypothesis
about the particular nature of resonator vibrations; namely in the interpretation put forth by J. v. Kries [9]
regarding the character of the “original amplitudes, comparable in magnitude but independent of each
other”. As the matter now stands, further development along these lines would appear to be premature.

§5. According to the hypothesis introduced in connection with equation (3), the entropy of the system
of resonators under consideration is, after suitable determination of the additive constant:
\[ S_N = k \log R = k\{(N+P)\log(N+P) - N\log N - P\log P\} \tag{5} \]
and by considering (4) and (1):
\[ S_N = kN\left\{\left(1+\frac{U}{\epsilon}\right)\log\left(1+\frac{U}{\epsilon}\right) - \frac{U}{\epsilon}\log\frac{U}{\epsilon}\right\} \]
Thus, according to equation (2) the entropy S of a resonator as a function of its energy U is given by:
\[ S = k\left\{\left(1+\frac{U}{\epsilon}\right)\log\left(1+\frac{U}{\epsilon}\right) - \frac{U}{\epsilon}\log\frac{U}{\epsilon}\right\} \tag{6} \]

[9] Joh. v. Kries, The Principles of Probability Calculation (Freiburg, 1886), p. 36.
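The step from equation (5) to equation (6) uses only P = N U/ε, which follows from (1) and (4). As an illustration, the short Python sketch below evaluates both expressions and confirms that the system entropy equals N times the single-resonator entropy. The function names and the sample values of N, P and ε are assumptions made for the example, and k is set to 1 in arbitrary units.

```python
from math import log


def system_entropy(n: int, p: int, k: float = 1.0) -> float:
    """Equation (5): S_N = k{(N+P) log(N+P) - N log N - P log P}."""
    return k * ((n + p) * log(n + p) - n * log(n) - p * log(p))


def resonator_entropy(u: float, eps: float, k: float = 1.0) -> float:
    """Equation (6): S = k{(1 + U/eps) log(1 + U/eps) - (U/eps) log(U/eps)}."""
    x = u / eps
    return k * ((1 + x) * log(1 + x) - x * log(x))


if __name__ == "__main__":
    n, p, eps = 1000, 7500, 0.2   # illustrative values only
    u = p * eps / n               # from (1) and (4): N U = P eps
    print(system_entropy(n, p), n * resonator_entropy(u, eps))
```

Both printed numbers agree up to floating-point rounding, since the terms in log N cancel exactly in the algebra.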
2 Introduction of Wien’s Displacement Law
§6. Next to Kirchhoff’s theorem of the proportionality of emissive and absorptive power, the so-called
displacement law, discovered by and named after W. Wien [10], which includes as a special case the
Stefan-Boltzmann law of dependence of total radiation on temperature, provides the most valuable contribution
to the firmly established foundation of the theory of heat radiation. In the form given by M. Thiesen [11]
it reads as follows:
\[ E \cdot d\lambda = \theta^5 \psi(\lambda\theta) \cdot d\lambda \]
where λ is the wavelength, E · dλ represents the volume density of the “black-body” radiation [12] within
the spectral region λ to λ + dλ, θ represents temperature and ψ(x) represents a certain function of the
argument x only.

§7. We now want to examine what Wien’s displacement law states about the dependence of the entropy
S of our resonator on its energy U and its characteristic period, particularly in the general case where the
resonator is situated in an arbitrary diathermic medium. For this purpose we next generalize Thiesen’s
form of the law for the radiation in an arbitrary diathermic medium with the velocity of light c. Since we
do not have to consider the total radiation, but only the monochromatic radiation, it becomes necessary
in order to compare different diathermic media to introduce the frequency ν instead of the wavelength λ.
Thus, let us denote by u · dν the volume density of the radiation energy belonging to the spectral
region ν to ν + dν; then we write: u · dν instead of E · dλ; c/ν instead of λ, and c · dν/ν² instead of dλ.
From which we obtain
\[ u = \theta^5 \frac{c}{\nu^2} \cdot \psi\left(\frac{c\theta}{\nu}\right) \]
Now according to the well-known Kirchhoff-Clausius law, the energy emitted per unit time at the frequency
ν and temperature θ from a black surface in a diathermic medium is inversely proportional to the square
of the velocity of propagation c²; hence the energy density u is inversely proportional to c³ and we have:
\[ u = \frac{\theta^5}{\nu^2 c^3} \cdot f\left(\frac{\theta}{\nu}\right) \]
where the constants associated with the function f are independent of c.

In place of this, if f represents a new function of a single argument each time, here and in what
follows, we can write:
\[ u = \frac{\nu^3}{c^3} \cdot f\left(\frac{\theta}{\nu}\right) \tag{7} \]
and from this we see, among other things, that as is well known, the radiant energy u · λ³ contained in a
cube of one wavelength at a given temperature and frequency is the same for all diathermic media.
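The chain of substitutions in this section can be written out step by step. The following LaTeX sketch is added for orientation only; it takes the $c^{-3}$ dependence from the Kirchhoff-Clausius law as given, exactly as the text does, and the symbol $\tilde f$ is introduced here merely to distinguish the "new function" from the previous one.

```latex
\begin{align*}
E\,d\lambda &= \theta^{5}\,\psi(\lambda\theta)\,d\lambda,
  \qquad \lambda = \frac{c}{\nu}, \quad |d\lambda| = \frac{c\,d\nu}{\nu^{2}} \\
u\,d\nu &= \theta^{5}\,\psi\!\left(\frac{c\theta}{\nu}\right)\frac{c\,d\nu}{\nu^{2}}
  \;\Longrightarrow\;
  u = \theta^{5}\,\frac{c}{\nu^{2}}\,\psi\!\left(\frac{c\theta}{\nu}\right) \\
u &= \frac{\theta^{5}}{\nu^{2}c^{3}}\,f\!\left(\frac{\theta}{\nu}\right)
   = \frac{\nu^{3}}{c^{3}}\left(\frac{\theta}{\nu}\right)^{5} f\!\left(\frac{\theta}{\nu}\right)
   = \frac{\nu^{3}}{c^{3}}\,\tilde f\!\left(\frac{\theta}{\nu}\right),
  \qquad \tilde f(x) := x^{5} f(x) \\
u\,\lambda^{3} &= \frac{\nu^{3}}{c^{3}}\,\tilde f\!\left(\frac{\theta}{\nu}\right)\cdot\frac{c^{3}}{\nu^{3}}
   = \tilde f\!\left(\frac{\theta}{\nu}\right)
\end{align*}
```

The last line shows why the energy in a cube of one wavelength no longer depends on c, and hence not on the medium.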

§8. In order to go from the energy density u to the energy U of a stationary resonator situated in the
radiation field and vibrating with the same frequency ν, we use the relation expressed in equation (34)
of my paper on irreversible radiation processes [13]:
\[ K = \frac{\nu^2}{c^2} U \]
(K is the intensity of a monochromatic, linearly polarized ray), which together with the well-known
equation:
\[ u = \frac{8\pi K}{c} \]
yields the relation:
\[ u = \frac{8\pi\nu^2}{c^3} U \tag{8} \]

[10] W. Wien, Proceedings of the Imperial Academy of Science, Berlin, February 9, 1893, p. 55.
[11] M. Thiesen, Transactions of the German Physical Society 2 (1900), p. 66.
[12] Perhaps one should speak more appropriately of a “white” radiation, to generalize what one already
understands by total white light.
[13] M. Planck, Ann. d. Phys. 1 (1900), p. 99.
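From this and from equation (7) follows:
\[ U = \nu \cdot f\left(\frac{\theta}{\nu}\right) \]
where now c does not appear at all. In place of this we may also write:
\[ \theta = \nu \cdot f\left(\frac{U}{\nu}\right) \]

§9. Finally, we introduce the entropy S of the resonator by setting
\[ \frac{1}{\theta} = \frac{dS}{dU} \tag{9} \]
We then obtain:
\[ \frac{dS}{dU} = \frac{1}{\nu} \cdot f\left(\frac{U}{\nu}\right) \]
and integrated:
\[ S = f\left(\frac{U}{\nu}\right) \tag{10} \]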
that is, the entropy of a resonator vibrating in an arbitrary diathermic medium depends only on the
variable U/ν, containing besides this only universal constants. This is the simplest form of Wien’s
displacement law known to me.

§10. If we apply Wien’s displacement law in the latter form to equation (6) for the entropy S, we then
find that the energy element ε must be proportional to the frequency ν, thus:
\[ \epsilon = h\nu \]
and consequently:
\[ S = k\left\{\left(1+\frac{U}{h\nu}\right)\log\left(1+\frac{U}{h\nu}\right) - \frac{U}{h\nu}\log\frac{U}{h\nu}\right\} \]
Here h and k are universal constants.
By substitution into equation (9) one obtains:
\[ \frac{1}{\theta} = \frac{k}{h\nu}\log\left(1+\frac{h\nu}{U}\right) \]
\[ U = \frac{h\nu}{e^{h\nu/k\theta}-1} \tag{11} \]
and from equation (8) there then follows the energy distribution law sought for:
\[ u = \frac{8\pi h\nu^3}{c^3}\cdot\frac{1}{e^{h\nu/k\theta}-1} \tag{12} \]
or by introducing the substitutions given in §7, in terms of wavelength λ instead of the frequency:
\[ E = \frac{8\pi c h}{\lambda^5}\cdot\frac{1}{e^{ch/k\lambda\theta}-1} \tag{13} \]
I plan to derive elsewhere the expressions for the intensity and entropy of radiation progressing in a
diathermic medium, as well as the theorem for the increase of total entropy in nonstationary radiation
processes.
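As a numerical illustration of §10, the Python sketch below evaluates equations (11), (12) and (13) and checks two consistency conditions: that inverting the logarithmic relation for 1/θ recovers the temperature used in (11), and that the wavelength form (13) equals the frequency form (12) multiplied by c/λ², as the substitutions of §7 require. The function names are hypothetical; the constants are the values quoted in equations (15) and (16) and the paper's c, and the sample temperature and wavelength are arbitrary choices.

```python
from math import expm1, log1p, pi

H = 6.55e-27   # erg*sec, value from equation (15)
K = 1.346e-16  # erg/deg, value from equation (16)
C = 3e10       # cm/sec, value used in the paper


def resonator_energy(nu: float, theta: float) -> float:
    """Equation (11): U = h nu / (exp(h nu / k theta) - 1)."""
    return H * nu / expm1(H * nu / (K * theta))


def inverse_temperature(u_res: float, nu: float) -> float:
    """1/theta = (k / h nu) log(1 + h nu / U), from (6) with eps = h nu."""
    return K / (H * nu) * log1p(H * nu / u_res)


def density_per_frequency(nu: float, theta: float) -> float:
    """Equation (12), via equation (8): u = (8 pi nu^2 / c^3) U."""
    return 8 * pi * nu**2 / C**3 * resonator_energy(nu, theta)


def density_per_wavelength(lam: float, theta: float) -> float:
    """Equation (13): E = (8 pi c h / lambda^5) / (exp(c h / k lambda theta) - 1)."""
    return 8 * pi * C * H / lam**5 / expm1(C * H / (K * lam * theta))


if __name__ == "__main__":
    theta, lam = 1500.0, 2.0e-4   # degrees absolute, cm (illustrative values)
    nu = C / lam
    # Round trip: temperature recovered from the resonator energy.
    print(1 / inverse_temperature(resonator_energy(nu, theta), nu))
    # Equation (13) against equation (12) times |d nu / d lambda| = c / lambda^2.
    print(density_per_wavelength(lam, theta),
          density_per_frequency(nu, theta) * C / lam**2)
```

`expm1` and `log1p` are used instead of `exp(x) - 1` and `log(1 + x)` only to keep the evaluation accurate when hν/kθ is small; the formulas are otherwise those of the text.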

3 Numerical Values
§11. The values of both universal constants h and k may be calculated rather precisely with the aid of
available measurements. F. Kurlbaum [14], designating the total energy radiating into air from 1 sq cm of
a black body at temperature t °C in 1 sec by $S_t$, found that:
\[ S_{100} - S_0 = 0.0731\ \frac{\text{watt}}{\text{cm}^2} = 7.31\cdot 10^5\ \frac{\text{erg}}{\text{cm}^2\cdot\text{sec}} \]
From this one can obtain the energy density of the total radiation energy in air at the absolute temperature
1:
\[ \frac{4\cdot 7.31\cdot 10^5}{3\cdot 10^{10}\cdot(373^4 - 273^4)} = 7.061\cdot 10^{-15}\ \frac{\text{erg}}{\text{cm}^3\cdot\text{deg}^4} \]
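On the other hand, according to equation (12) the energy density of the total radiant energy for θ = 1 is:
\[ u^* = \int_0^\infty u\,d\nu = \frac{8\pi h}{c^3}\int_0^\infty \frac{\nu^3\,d\nu}{e^{h\nu/k}-1} = \frac{8\pi h}{c^3}\int_0^\infty \nu^3\left(e^{-h\nu/k}+e^{-2h\nu/k}+e^{-3h\nu/k}+\cdots\right)d\nu \]
and by termwise integration:
\[ u^* = \frac{8\pi h}{c^3}\cdot 6\left(\frac{k}{h}\right)^4\left(1+\frac{1}{2^4}+\frac{1}{3^4}+\frac{1}{4^4}+\cdots\right) = \frac{48\pi k^4}{c^3 h^3}\cdot 1.0823 \]
If we set this equal to 7.061 · 10⁻¹⁵, then, since c = 3 · 10¹⁰ cm/sec, we obtain:
\[ \frac{k^4}{h^3} = 1.1682\cdot 10^{15} \tag{14} \]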
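The arithmetic of §11 can be reproduced in a few lines. The Python sketch below is an illustration rather than part of the paper; the variable and function names are assumptions. It recomputes the energy density per deg⁴ from Kurlbaum's figure, sums the series 1 + 1/2⁴ + 1/3⁴ + ... numerically, and solves for k⁴/h³. The factor 4/c converts the energy emitted per unit area and time into an energy density, as in the fraction given in the text.

```python
from math import pi

C = 3e10            # cm/sec, value used in the paper
DELTA_S = 7.31e5    # erg/(cm^2 sec), Kurlbaum's S_100 - S_0


def radiation_density_constant() -> float:
    """Energy density at absolute temperature 1: 4 (S_100 - S_0) / (c (373^4 - 273^4))."""
    return 4 * DELTA_S / (C * (373**4 - 273**4))


def series_sum(terms: int = 100000) -> float:
    """Partial sum of 1 + 1/2^4 + 1/3^4 + ..., quoted as 1.0823 in the text."""
    return sum(1 / n**4 for n in range(1, terms + 1))


def k4_over_h3() -> float:
    """Solve 48 pi k^4 / (c^3 h^3) * series = density for k^4 / h^3, equation (14)."""
    return radiation_density_constant() * C**3 / (48 * pi * series_sum())


if __name__ == "__main__":
    print(radiation_density_constant())  # compare with 7.061e-15 in the text
    print(series_sum())                  # compare with 1.0823
    print(k4_over_h3())                  # compare with 1.1682e15, equation (14)
```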

§12. O. Lummer and E. Pringsheim [15] determined the product λ_m θ, where λ_m is the wavelength of
maximum energy in air at temperature θ, to be 2940 micron·degree. Thus, in absolute measure:
\[ \lambda_m\theta = 0.294\ \text{cm}\cdot\text{deg} \]
On the other hand, it follows from equation (13), when one sets the derivative of E with respect to λ
equal to zero, thereby finding λ = λ_m:
\[ \left(1 - \frac{ch}{5k\lambda_m\theta}\right)\cdot e^{ch/k\lambda_m\theta} = 1 \]
and from this transcendental equation:
\[ \lambda_m\theta = \frac{ch}{4.9651\cdot k} \]
consequently:
\[ \frac{h}{k} = \frac{4.9651\cdot 0.294}{3\cdot 10^{10}} = 4.866\cdot 10^{-11} \]
From this and from equation (14) the values for the universal constants become:
\[ h = 6.55\cdot 10^{-27}\ \text{erg}\cdot\text{sec} \tag{15} \]
\[ k = 1.346\cdot 10^{-16}\ \frac{\text{erg}}{\text{deg}} \tag{16} \]
These are the same numbers that I indicated in my earlier communication.

[14] F. Kurlbaum, Wied. Ann. 65 (1898), p. 759.
[15] O. Lummer and E. Pringsheim, Transactions of the German Physical Society 2 (1900), p. 176.
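The two steps of §12 can likewise be retraced numerically. The Python sketch below, added as an illustration with hypothetical names, finds the nonzero root of the transcendental equation (1 − x/5)·eˣ = 1 by bisection, where x stands for ch/(kλ_mθ), and then combines h/k with equation (14) to recover h and k. Bisection is simply one convenient root finder; the paper does not say how the root was obtained.

```python
from math import exp

C = 3e10                 # cm/sec, value used in the paper
LAMBDA_M_THETA = 0.294   # cm*deg, Lummer and Pringsheim
K4_OVER_H3 = 1.1682e15   # equation (14)


def wien_root(lo: float = 4.0, hi: float = 5.0, tol: float = 1e-12) -> float:
    """Bisection for the nonzero root of (1 - x/5) exp(x) = 1 on [lo, hi]."""
    def f(x: float) -> float:
        return (1 - x / 5) * exp(x) - 1

    # f(4) > 0 and f(5) < 0, and f is decreasing on this interval.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def planck_constants() -> tuple[float, float]:
    """Return (h, k) from h/k = x * lambda_m*theta / c and k^4/h^3 from (14)."""
    h_over_k = wien_root() * LAMBDA_M_THETA / C
    k = K4_OVER_H3 * h_over_k**3   # k = (k^4 / h^3) * (h / k)^3
    h = h_over_k * k
    return h, k


if __name__ == "__main__":
    print(wien_root())          # compare with 4.9651 in the text
    print(planck_constants())   # compare with equations (15) and (16)
```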
3. Zur Elektrodynamik bewegter Körper;
von A. Einstein.

Daß die Elektrodynamik Maxwells - wie dieselbe gegenwärtig aufgefaßt zu werden pflegt - in ihrer
Anwendung auf bewegte Körper zu Asymmetrien führt, welche den Phänomenen nicht anzuhaften
scheinen, ist bekannt. Man denke z. B. an die elektrodynamische Wechselwirkung zwischen einem
Magneten und einem Leiter. Das beobachtbare Phänomen hängt hier nur ab von der Relativbewegung
von Leiter und Magnet, während nach der üblichen Auffassung die beiden Fälle, daß der eine oder
der andere dieser Körper der bewegte sei, streng voneinander zu trennen sind. Bewegt sich nämlich
der Magnet und ruht der Leiter, so entsteht in der Umgebung des Magneten ein elektrisches Feld von
gewissem Energiewerte, welches an den Orten, wo sich Teile des Leiters befinden, einen Strom
erzeugt. Ruht aber der Magnet und bewegt sich der Leiter, so entsteht in der Umgebung des Magneten
kein elektrisches Feld, dagegen im Leiter eine elektromotorische Kraft, welcher an sich keine
Energie entspricht, die aber - Gleichheit der Relativbewegung bei den beiden ins Auge gefaßten
Fällen vorausgesetzt - zu elektrischen Strömen von derselben Größe und demselben Verlaufe
Veranlassung gibt, wie im ersten Falle die elektrischen Kräfte.
Beispiele ähnlicher Art, sowie die mißlungenen Versuche, eine Bewegung der Erde relativ zum
„Lichtmedium“ zu konstatieren, führen zu der Vermutung, daß dem Begriffe der absoluten Ruhe nicht
nur in der Mechanik, sondern auch in der Elektrodynamik keine Eigenschaften der Erscheinungen
entsprechen, sondern daß vielmehr für alle Koordinatensysteme, für welche die mechanischen
Gleichungen gelten, auch die gleichen elektrodynamischen und optischen Gesetze gelten, wie dies
für die Größen erster Ordnung bereits erwiesen ist. Wir wollen diese Vermutung (deren Inhalt im
folgenden „Prinzip der Relativität“ genannt werden wird) zur Voraussetzung erheben und außerdem
die mit ihm nur scheinbar unverträgliche Voraussetzung einführen, daß sich das Licht im leeren
Raume stets mit einer bestimmten, vom Bewegungszustande des emittierenden Körpers unabhängigen
Geschwindigkeit V fortpflanze. Diese beiden Voraussetzungen genügen, um zu einer einfachen und
widerspruchsfreien Elektrodynamik bewegter Körper zu gelangen unter Zugrundelegung der
Maxwellschen Theorie für ruhende Körper. Die Einführung eines „Lichtäthers“ wird sich insofern
als überflüssig erweisen, als nach der zu entwickelnden Auffassung weder ein mit besonderen
Eigenschaften ausgestatteter „absolut ruhender Raum“ eingeführt, noch einem Punkte des leeren
Raumes, in welchem elektromagnetische Prozesse stattfinden, ein Geschwindigkeitsvektor zugeordnet
wird.
Die zu entwickelnde Theorie stützt sich - wie jede andere Elektrodynamik - auf die Kinematik des
starren Körpers, da die Aussagen einer jeden Theorie Beziehungen zwischen starren Körpern
(Koordinatensystemen), Uhren und elektromagnetischen Prozessen betreffen. Die nicht genügende
Berücksichtigung dieses Umstandes ist die Wurzel der Schwierigkeiten, mit denen die
Elektrodynamik bewegter Körper gegenwärtig zu kämpfen hat.
I. Kinematischer Teil.
§ 1. Definition der Gleichzeitigkeit.

Es liege ein Koordinatensystem vor, in welchem die Newtonschen mechanischen Gleichungen gelten.
Wir nennen dies Koordinatensystem zur sprachlichen Unterscheidung von später einzuführenden
Koordinatensystemen und zur Präzisierung der Vorstellung das „ruhende System“.
Ruht ein materieller Punkt relativ zu diesem Koordinatensystem, so kann seine Lage relativ zu
letzterem durch starre Maßstäbe unter Benutzung der Methoden der euklidischen Geometrie bestimmt
und in kartesischen Koordinaten ausgedrückt werden.
Wollen wir die Bewegung eines materiellen Punktes beschreiben, so geben wir die Werte seiner
Koordinaten in Funktion der Zeit. Es ist nun wohl im Auge zu behalten, daß eine derartige
mathematische Beschreibung erst dann einen physikalischen Sinn hat, wenn man sich vorher darüber
klar geworden ist, was hier unter „Zeit“ verstanden wird. Wir haben zu berücksichtigen, daß alle
unsere Urteile, in welchen die Zeit eine Rolle spielt, immer Urteile über gleichzeitige
Ereignisse sind. Wenn ich z. B. sage: „Jener Zug kommt hier um 7 Uhr an,“ so heißt dies etwa:
„Das Zeigen des kleinen Zeigers meiner Uhr auf 7 und das Ankommen des Zuges sind gleichzeitige
Ereignisse.“ 1)
Es könnte scheinen, daß alle die Definition der „Zeit“ betreffenden Schwierigkeiten dadurch
überwunden werden könnten, daß ich an Stelle der „Zeit“ die „Stellung des kleinen Zeigers meiner
Uhr“ setze. Eine solche Definition genügt in der Tat, wenn es sich darum handelt, eine Zeit zu
definieren ausschließlich für den Ort, an welchem sich die Uhr eben befindet; die Definition
genügt aber nicht mehr, sobald es sich darum handelt, an verschiedenen Orten stattfindende
Ereignisreihen miteinander zeitlich zu verknüpfen, oder - was auf dasselbe hinausläuft -
Ereignisse zeitlich zu werten, welche in von der Uhr entfernten Orten stattfinden.
Wir könnten uns allerdings damit begnügen, die Ereignisse dadurch zeitlich zu werten, daß ein
samt der Uhr im Koordinatenursprung befindlicher Beobachter jedem von einem zu wertenden Ereignis
Zeugnis gebenden, durch den leeren Raum zu ihm gelangenden Lichtzeichen die entsprechende
Uhrzeigerstellung zuordnet. Eine solche Zuordnung bringt aber den Übelstand mit sich, daß sie vom
Standpunkte des mit der Uhr versehenen Beobachters nicht unabhängig ist, wie wir durch die
Erfahrung wissen. Zu einer weit praktischeren Festsetzung gelangen wir durch folgende Betrachtung.
Befindet sich im Punkte A des Raumes eine Uhr, so kann ein in A befindlicher Beobachter die
Ereignisse in der unmittelbaren Umgebung von A zeitlich werten durch Aufsuchen der mit diesen
Ereignissen gleichzeitigen Uhrzeigerstellungen. Befindet sich auch im Punkte B des Raumes eine
Uhr - wir wollen hinzufügen, „eine Uhr von genau derselben Beschaffenheit wie die in A
befindliche“ - so ist auch eine zeitliche Wertung der Ereignisse in der unmittelbaren Umgebung
von B durch einen in B befindlichen Beobachter möglich. Es ist aber ohne weitere Festsetzung
nicht möglich, ein Ereignis in A mit einem Ereignis in B zeitlich zu vergleichen; wir haben
bisher nur eine „A-Zeit“ und eine „B-Zeit“, aber keine für A und B gemeinsame „Zeit“ definiert.
Die letztere Zeit kann nun definiert werden, indem man durch Definition festsetzt, daß die
„Zeit“, welche das Licht braucht, um von A nach B zu gelangen, gleich ist der „Zeit“, welche es
braucht, um von B nach A zu gelangen. Es gehe nämlich ein Lichtstrahl zur „A-Zeit“ t_A von A
nach B ab, werde zur „B-Zeit“ t_B in B gegen A zu reflektiert und gelange zur „A-Zeit“ t'_A nach
A zurück. Die beiden Uhren laufen definitionsgemäß synchron, wenn
\[ t_B - t_A = t'_A - t_B. \]

1) Die Ungenauigkeit, welche in dem Begriffe der Gleichzeitigkeit zweier Ereignisse an
(annähernd) demselben Orte steckt und gleichfalls durch eine Abstraktion überbrückt werden muß,
soll hier nicht erörtert werden.
Wir nehmen an, daß diese Definition des Synchronismus in widerspruchsfreier Weise möglich sei,
und zwar für beliebig viele Punkte, daß also allgemein die Beziehungen gelten:
1. Wenn die Uhr in B synchron mit der Uhr in A läuft, so läuft die Uhr in A synchron mit der
Uhr in B.
2. Wenn die Uhr in A sowohl mit der Uhr in B als auch mit der Uhr in C synchron läuft, so laufen
auch die Uhren in B und C synchron relativ zueinander.
Wir haben so unter Zuhilfenahme gewisser (gedachter) physikalischer Erfahrungen festgelegt, was
unter synchron laufenden, an verschiedenen Orten befindlichen, ruhenden Uhren zu verstehen ist
und damit offenbar eine Definition von „gleichzeitig“ und „Zeit“ gewonnen. Die „Zeit“ eines
Ereignisses ist die mit dem Ereignis gleichzeitige Angabe einer am Orte des Ereignisses
befindlichen, ruhenden Uhr, welche mit einer bestimmten, ruhenden Uhr, und zwar für alle
Zeitbestimmungen mit der nämlichen Uhr, synchron läuft.
Wir setzen noch der Erfahrung gemäß fest, daß die Größe
\[ \frac{2\,\overline{AB}}{t'_A - t_A} = V \]
eine universelle Konstante (die Lichtgeschwindigkeit im leeren Raume) sei.
Wesentlich ist, daß wir die Zeit mittels im ruhenden System ruhender Uhren definiert haben; wir
nennen die eben definierte Zeit wegen dieser Zugehörigkeit zum ruhenden System „die Zeit des
ruhenden Systems“.

§ 2. Über die Relativität von Längen und Zeiten.

Die folgenden Überlegungen stützen sich auf das Relativitätsprinzip und auf das Prinzip der
Konstanz der Lichtgeschwindigkeit, welche beiden Prinzipien wir folgendermaßen definieren.
1. Die Gesetze, nach denen sich die Zustände der physikalischen Systeme ändern, sind unabhängig
davon, auf welches von zwei relativ zueinander in gleichförmiger Translationsbewegung
befindlichen Koordinatensystemen diese Zustandsänderungen bezogen werden.
2. Jeder Lichtstrahl bewegt sich im „ruhenden“ Koordinatensystem mit der bestimmten
Geschwindigkeit V, unabhängig davon, ob dieser Lichtstrahl von einem ruhenden oder bewegten
Körper emittiert ist. Hierbei ist
\[ \text{Geschwindigkeit} = \frac{\text{Lichtweg}}{\text{Zeitdauer}}, \]
wobei „Zeitdauer“ im Sinne der Definition des § 1 aufzufassen ist.
Es sei ein ruhender starrer Stab gegeben; derselbe besitze, mit einem ebenfalls ruhenden
Maßstabe gemessen, die Länge l. Wir denken uns nun die Stabachse in die X-Achse des ruhenden
Koordinatensystems gelegt und dem Stabe hierauf eine gleichförmige Paralleltranslationsbewegung
(Geschwindigkeit v) längs der X-Achse im Sinne der wachsenden x erteilt. Wir fragen nun nach der
Länge des bewegten Stabes, welche wir uns durch folgende zwei Operationen ermittelt denken:
a) Der Beobachter bewegt sich samt dem vorher genannten Maßstabe mit dem auszumessenden Stabe
und mißt direkt durch Anlegen des Maßstabes die Länge des Stabes, ebenso, wie wenn sich
auszumessender Stab, Beobachter und Maßstab in Ruhe befänden.
b) Der Beobachter ermittelt mittels im ruhenden Systeme aufgestellter, gemäß § 1 synchroner,
ruhender Uhren, in welchen Punkten des ruhenden Systems sich Anfang und Ende des auszumessenden
Stabes zu einer bestimmten Zeit t befinden. Die Entfernung dieser beiden Punkte, gemessen mit
dem schon benutzten, in diesem Falle ruhenden Maßstabe ist ebenfalls eine Länge, welche man als
„Länge des Stabes“ bezeichnen kann.
Nach dem Relativitätsprinzip muß die bei der Operation a) zu findende Länge, welche wir „die
Länge des Stabes im bewegten System“ nennen wollen, gleich der Länge l des ruhenden Stabes sein.
Die bei der Operation b) zu findende Länge, welche wir „die Länge des (bewegten) Stabes im
ruhenden System“ nennen wollen, werden wir unter Zugrundelegung unserer beiden Prinzipien
bestimmen und finden, daß sie von l verschieden ist.
Die allgemein gebrauchte Kinematik nimmt stillschweigend an, daß die durch die beiden erwähnten
Operationen bestimmten Längen einander genau gleich seien, oder mit anderen Worten, daß ein
bewegter starrer Körper in der Zeitepoche t in geometrischer Beziehung vollständig durch
denselben Körper, wenn er in bestimmter Lage ruht, ersetzbar sei.
Wir denken uns ferner an den beiden Stabenden ( A und B)
IJhren angebracht, welche mit den Uhren des ruhenden Systems
synchron sind, d. h. deren Angaben jeweilen der ,,Zeit des
ruhenden SystemsrLan den Orten, an welchen sie sich gerade
befinden, entsprechen ; diese Uhren sind also ,,synchron im
ruhenden System".
We imagine further that with each clock there is an observer moving with it, and that these observers apply to the two clocks the criterion for the synchronous rate of two clocks established in § 1. Let a ray of light depart from $A$ at the time 1) $t_A$, be reflected at $B$ at the time $t_B$, and return to $A$ at the time $t'_A$. Taking into account the principle of the constancy of the velocity of light, we find:

$$t_B - t_A = \frac{r_{AB}}{V - v}$$

and

$$t'_A - t_B = \frac{r_{AB}}{V + v},$$

where $r_{AB}$ denotes the length of the moving rod, measured in the stationary system. Observers moving with the rod would thus find that the two clocks are not synchronous, while observers in the stationary system would declare the clocks to be synchronous.

1) "Time" here means "time of the stationary system" and also "position of the hands of the moving clock located at the place under discussion".
We see, then, that we cannot attach any absolute meaning to the concept of simultaneity; two events which are simultaneous when viewed from one coordinate system can no longer be regarded as simultaneous when viewed from a system in motion relative to that system.
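
The two travel times above can be checked numerically. The following is a minimal sketch, not part of the original paper; the function name `light_times` and the choice of units with $V = 1$ are assumptions made for illustration only.

```python
def light_times(r_ab, v, V=1.0):
    """Travel times of the light ray along a rod of length r_ab moving at v,
    as judged in the stationary system (hypothetical helper).

    Returns (t_B - t_A, t'_A - t_B)."""
    outbound = r_ab / (V - v)  # light chases the receding end B
    inbound = r_ab / (V + v)   # end A runs towards the returning light
    return outbound, inbound

out, back = light_times(r_ab=1.0, v=0.5)
print(out, back)  # the two legs differ whenever v != 0
```

The § 1 criterion demands equal times for both legs, so the moving observers find it violated as soon as $v \neq 0$.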

§ 3. Theory of the Transformation of Coordinates and Time from the Stationary System to a System in Uniform Translational Motion Relative to It.

Let there be given in "stationary" space two coordinate systems, that is, two systems of three rigid material lines each, mutually perpendicular and issuing from one point. Let the X-axes of the two systems coincide, and let their Y- and Z-axes respectively be parallel. Let each system be provided with a rigid measuring rod and a number of clocks, and let the two measuring rods, and likewise all the clocks of the two systems, be exactly alike.
Now let a (constant) velocity $v$ be imparted to the origin of one of the two systems ($k$) in the direction of increasing $x$ of the other, stationary system ($K$), and let this velocity be communicated also to the coordinate axes, the relevant measuring rod, and the clocks. To every time $t$ of the stationary system $K$ there then corresponds a definite position of the axes of the moving system, and for reasons of symmetry we are entitled to assume that the motion of $k$ can be such that the axes of the moving system are parallel to the axes of the stationary system at the time $t$ (by "$t$" we always denote a time of the stationary system).
We now imagine space to be measured both from the stationary system $K$ by means of the stationary measuring rod and from the moving system $k$ by means of the measuring rod moving with it, and the coordinates $x, y, z$ and $\xi, \eta, \zeta$ respectively to be obtained in this way. Further, let the time $t$ of the stationary system be determined, for all points of that system at which there are clocks, by means of the clocks at rest in the stationary system and light signals in the manner indicated in § 1; likewise let the time $\tau$ of the moving system be determined, for all points of the moving system at which there are clocks at rest relative to that system, by applying the method of light signals given in § 1 between the points at which the latter clocks are located.
To every system of values $x, y, z, t$ which completely determines the place and time of an event in the stationary system there belongs a system of values $\xi, \eta, \zeta, \tau$ fixing that event relative to the system $k$, and the task is now to find the system of equations connecting these quantities.
In the first place it is clear that the equations must be linear, on account of the properties of homogeneity which we attribute to space and time.
If we put $x' = x - vt$, it is clear that a point at rest in the system $k$ has a definite system of values $x', y, z$ independent of time. We first determine $\tau$ as a function of $x', y, z$ and $t$. To this end we have to express in equations that $\tau$ is nothing other than the totality of the readings of clocks at rest in the system $k$ which have been synchronized according to the rule given in § 1.
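From the origin of the system $k$ let a ray of light be sent at the time $\tau_0$ along the X-axis to $x'$, and from there be reflected at the time $\tau_1$ towards the origin of coordinates, where it arrives at the time $\tau_2$; we must then have

$$\tfrac{1}{2}(\tau_0 + \tau_2) = \tau_1,$$

or, by inserting the arguments of the function $\tau$ and applying the principle of the constancy of the velocity of light in the stationary system:

$$\frac{1}{2}\left[\tau(0,0,0,t) + \tau\left(0,0,0,\,t + \frac{x'}{V-v} + \frac{x'}{V+v}\right)\right] = \tau\left(x',0,0,\,t + \frac{x'}{V-v}\right).$$

Hence, if $x'$ is chosen infinitesimally small:

$$\frac{1}{2}\left(\frac{1}{V-v} + \frac{1}{V+v}\right)\frac{\partial\tau}{\partial t} = \frac{\partial\tau}{\partial x'} + \frac{1}{V-v}\,\frac{\partial\tau}{\partial t},$$

or

$$\frac{\partial\tau}{\partial x'} + \frac{v}{V^2 - v^2}\,\frac{\partial\tau}{\partial t} = 0.$$
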
It is to be noted that instead of the origin of coordinates we might have chosen any other point as the point of departure of the light ray, and the equation just obtained therefore holds for all values of $x', y, z$.
An analogous consideration, applied to the $\mathrm{H}$- and $\mathrm{Z}$-axes, yields, when we bear in mind that light along these axes, viewed from the stationary system, always propagates with the velocity $\sqrt{V^2 - v^2}$:

$$\frac{\partial\tau}{\partial y} = 0, \qquad \frac{\partial\tau}{\partial z} = 0.$$

Since $\tau$ is a linear function, it follows from these equations that

$$\tau = a\left(t - \frac{v}{V^2 - v^2}\,x'\right),$$

where $a$ is a function $\varphi(v)$ as yet unknown, and where for brevity it is assumed that at the origin of $k$, $\tau = 0$ when $t = 0$.
With the help of this result it is easy to determine the quantities $\xi, \eta, \zeta$ by expressing in equations that light (as required by the principle of the constancy of the velocity of light in combination with the principle of relativity) also propagates with the velocity $V$ when measured in the moving system. For a light ray emitted at the time $\tau = 0$ in the direction of increasing $\xi$ we have

$$\xi = V\tau,$$

or

$$\xi = aV\left(t - \frac{v}{V^2 - v^2}\,x'\right).$$

But the light ray moves relative to the origin of $k$, when measured in the stationary system, with the velocity $V - v$, so that

$$\frac{x'}{V - v} = t.$$

If we insert this value of $t$ in the equation for $\xi$, we obtain

$$\xi = a\,\frac{V^2}{V^2 - v^2}\,x'.$$

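In an analogous manner we find, by considering light rays moving along the two other axes,

$$\eta = V\tau = aV\left(t - \frac{v}{V^2 - v^2}\,x'\right),$$

where

$$\frac{y}{\sqrt{V^2 - v^2}} = t; \qquad x' = 0;$$

thus

$$\eta = a\,\frac{V}{\sqrt{V^2 - v^2}}\,y$$

and

$$\zeta = a\,\frac{V}{\sqrt{V^2 - v^2}}\,z.$$

Substituting for $x'$ its value, we obtain

$$\tau = \varphi(v)\,\beta\left(t - \frac{v}{V^2}\,x\right), \quad \xi = \varphi(v)\,\beta\,(x - vt), \quad \eta = \varphi(v)\,y, \quad \zeta = \varphi(v)\,z,$$

where

$$\beta = \frac{1}{\sqrt{1 - \left(\frac{v}{V}\right)^2}}$$

and $\varphi$ is an as yet unknown function of $v$. If no assumption whatever is made about the initial position of the moving system and about the zero point of $\tau$, an additive constant is to be added to the right-hand side of each of these equations.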
We now have to prove that every light ray, measured in the moving system, propagates with the velocity $V$ if, as we have assumed, this is the case in the stationary system; for we have not yet furnished the proof that the principle of the constancy of the velocity of light is compatible with the principle of relativity.
At the time $t = \tau = 0$ let a spherical wave be emitted from the origin of coordinates, common to the two systems at that time, and let it spread in the system $K$ with the velocity $V$. If $(x, y, z)$ is a point just reached by this wave, then

$$x^2 + y^2 + z^2 = V^2 t^2.$$

We transform this equation with the aid of our transformation equations and obtain, after a simple calculation:

$$\xi^2 + \eta^2 + \zeta^2 = V^2 \tau^2.$$

The wave under consideration is therefore a spherical wave with velocity of propagation $V$ also when viewed in the moving system. This shows that our two fundamental principles are compatible with one another.
In the transformation equations that have been developed there still occurs an unknown function $\varphi$ of $v$, which we will now determine.
For this purpose we introduce a third coordinate system $K'$, which is in parallel translational motion relative to the system $k$, parallel to the $\Xi$-axis, such that its origin moves along the $\Xi$-axis with the velocity $-v$. At the time $t = 0$ let all three origins coincide, and for $t = x = y = z = 0$ let the time $t'$ of the system $K'$ be zero. We call $x', y', z'$ the coordinates measured in the system $K'$, and obtain by a twofold application of our transformation equations:

$$\begin{aligned}
t' &= \varphi(-v)\,\beta(-v)\left\{\tau + \frac{v}{V^2}\,\xi\right\} = \varphi(v)\,\varphi(-v)\,t,\\
x' &= \varphi(-v)\,\beta(-v)\,\{\xi + v\tau\} = \varphi(v)\,\varphi(-v)\,x,\\
y' &= \varphi(-v)\,\eta = \varphi(v)\,\varphi(-v)\,y,\\
z' &= \varphi(-v)\,\zeta = \varphi(v)\,\varphi(-v)\,z.
\end{aligned}$$

Since the relations between $x', y', z'$ and $x, y, z$ do not contain the time $t$, the systems $K$ and $K'$ are at rest relative to one another, and it is clear that the transformation from $K$ to $K'$ must be the identical transformation. Thus:

$$\varphi(v)\,\varphi(-v) = 1.$$

We now ask for the meaning of $\varphi(v)$. We consider the part of the $\mathrm{H}$-axis of the system $k$ that lies between $\xi = 0, \eta = 0, \zeta = 0$ and $\xi = 0, \eta = l, \zeta = 0$. This part of the $\mathrm{H}$-axis is a rod moving perpendicularly to its axis with the velocity $v$ relative to the system $K$, whose ends have in $K$ the coordinates

$$x_1 = vt, \quad y_1 = \frac{l}{\varphi(v)}, \quad z_1 = 0$$

and

$$x_2 = vt, \quad y_2 = 0, \quad z_2 = 0.$$

The length of the rod, measured in $K$, is therefore $l/\varphi(v)$; this gives the meaning of the function $\varphi$. For reasons of symmetry it is now evident that the length, measured in the stationary system, of a given rod moving perpendicularly to its axis can depend only on the velocity, and not on the direction and sense of the motion. The length of the moving rod measured in the stationary system therefore does not change if $v$ is interchanged with $-v$. Hence:

$$\frac{l}{\varphi(v)} = \frac{l}{\varphi(-v)},$$

or

$$\varphi(v) = \varphi(-v).$$

From this relation and the one found previously it follows that $\varphi(v) = 1$, so that the transformation equations found become:

$$\tau = \beta\left(t - \frac{v}{V^2}\,x\right), \quad \xi = \beta\,(x - vt), \quad \eta = y, \quad \zeta = z,$$

where

$$\beta = \frac{1}{\sqrt{1 - \left(\frac{v}{V}\right)^2}}.$$
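
As an illustration of the transformation equations of § 3 with $\varphi(v) = 1$, the following minimal sketch checks numerically that a point on the light sphere $x^2 + y^2 + z^2 = V^2 t^2$ is mapped to a point on $\xi^2 + \eta^2 + \zeta^2 = V^2 \tau^2$. The function names (`beta`, `to_moving`, `wave_form`) and the units with $V = 1$ are assumptions for this example, not notation from the paper.

```python
import math

def beta(v, V=1.0):
    """Factor beta = 1 / sqrt(1 - (v/V)^2)."""
    return 1.0 / math.sqrt(1.0 - (v / V) ** 2)

def to_moving(x, y, z, t, v, V=1.0):
    """Map (x, y, z, t) in K to (xi, eta, zeta, tau) in k."""
    b = beta(v, V)
    return b * (x - v * t), y, z, b * (t - v * x / V**2)

def wave_form(x, y, z, t, V=1.0):
    """x^2 + y^2 + z^2 - V^2 t^2, which vanishes on the light sphere."""
    return x * x + y * y + z * z - (V * t) ** 2

event = (3.0, 4.0, 0.0, 5.0)           # lies on the sphere for V = 1
moved = to_moving(*event, v=0.6)
print(wave_form(*event), wave_form(*moved))  # both vanish up to rounding

back = to_moving(*moved, v=-0.6)       # the transformation with -v undoes it
print(back)
```

Applying the transformation with $-v$ to the result recovers the original event, which mirrors the relation $\varphi(v)\,\varphi(-v) = 1$.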

§ 4. Physical Meaning of the Equations Obtained, Concerning Moving Rigid Bodies and Moving Clocks.
We consider a rigid sphere 1) of radius $R$ which is at rest relative to the moving system $k$, and whose centre lies at the origin of coordinates of $k$. The equation of the surface of this sphere, which moves relative to the system $K$ with the velocity $v$, is:

$$\xi^2 + \eta^2 + \zeta^2 = R^2.$$

Expressed in $x, y, z$, the equation of this surface at the time $t = 0$ is:

$$\frac{x^2}{\left(\sqrt{1 - \left(\frac{v}{V}\right)^2}\right)^2} + y^2 + z^2 = R^2.$$

A rigid body which, measured at rest, has the shape of a sphere therefore has, in a state of motion and viewed from the stationary system, the shape of an ellipsoid of revolution with the axes

$$R\sqrt{1 - \left(\frac{v}{V}\right)^2}, \quad R, \quad R.$$

Thus, while the Y and Z dimensions of the sphere (and therefore of every rigid body of whatever shape) do not appear modified by the motion, the X dimension appears shortened in the ratio $1 : \sqrt{1 - (v/V)^2}$, that is, the more strongly the greater $v$ is. For $v = V$ all moving objects, viewed from the "stationary" system, shrink into plane figures. For velocities greater than that of light our considerations become meaningless; we shall find in what follows, moreover, that in our theory the velocity of light physically plays the part of infinitely great velocities.
It is clear that the same results hold for bodies at rest in the "stationary" system when viewed from a system in uniform motion.
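
A short numerical illustration of the contracted sphere follows. It is a sketch under the assumption of units with $V = 1$; the helper name `ellipsoid_axes` is hypothetical.

```python
import math

def ellipsoid_axes(R, v, V=1.0):
    """Semi-axes (along X, Y, Z) of a sphere of rest radius R moving at v
    along X, as judged from the stationary system."""
    return R * math.sqrt(1.0 - (v / V) ** 2), R, R

for v in (0.0, 0.6, 0.99):
    print(v, ellipsoid_axes(1.0, v))  # only the X axis shrinks as v grows
```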
1) That is, a body which has the shape of a sphere when examined at rest.

We imagine further that one of the clocks which are able to indicate the time $t$ when at rest relative to the stationary system, and the time $\tau$ when at rest relative to the moving system, is located at the origin of coordinates of $k$ and so adjusted that it indicates the time $\tau$. What is the rate of this clock, viewed from the stationary system?
Between the quantities $x$, $t$ and $\tau$, which refer to the position of this clock, the following equations evidently hold:

$$\tau = \frac{1}{\sqrt{1 - \left(\frac{v}{V}\right)^2}}\left(t - \frac{v}{V^2}\,x\right)$$

and

$$x = vt.$$

Therefore

$$\tau = t\sqrt{1 - \left(\frac{v}{V}\right)^2} = t - \left(1 - \sqrt{1 - \left(\frac{v}{V}\right)^2}\right)t,$$

from which it follows that the reading of the clock (viewed in the stationary system) falls behind by $\left(1 - \sqrt{1 - (v/V)^2}\right)$ seconds per second, or, neglecting quantities of the fourth and higher order, by $\tfrac{1}{2}(v/V)^2$ seconds.
From this there follows a peculiar consequence. If at the points $A$ and $B$ of $K$ there are stationary clocks which, viewed in the stationary system, are synchronous, and if the clock at $A$ is moved with the velocity $v$ along the connecting line to $B$, then on the arrival of this clock at $B$ the two clocks are no longer synchronous; the clock moved from $A$ to $B$ lags behind the clock that has been at $B$ from the beginning by $\tfrac{1}{2}\,t\,v^2/V^2$ seconds (up to quantities of the fourth and higher order), where $t$ is the time the clock takes to travel from $A$ to $B$.
It is at once apparent that this result still holds if the clock moves from $A$ to $B$ along any polygonal line, and also when the points $A$ and $B$ coincide.
If we assume that the result proved for a polygonal line holds also for a continuously curved line, we obtain the theorem: if there are two synchronous clocks at $A$, and one of them is moved along a closed curve with constant velocity until it returns to $A$, which may take $t$ seconds, then on its arrival at $A$ the latter clock lags behind the clock that has remained at rest by $\tfrac{1}{2}\,t\,(v/V)^2$ seconds. From this we conclude that a balance-wheel clock located at the Earth's equator must run more slowly, by a very small amount, than an exactly similar clock located at one of the Earth's poles and subject to otherwise identical conditions.
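
The size of the lag, and the quality of the second-order approximation, can be explored with the following sketch. The function name `clock_lag` is hypothetical, and the exact expression is rewritten in an algebraically equivalent form that avoids cancellation for small $v/V$.

```python
import math

def clock_lag(t, v, V=1.0):
    """Lag of the moved clock after t seconds of stationary-system time.

    Returns (exact, approx) where approx = (1/2) t (v/V)^2."""
    x = (v / V) ** 2
    # t * (1 - sqrt(1 - x)), written as t * x / (1 + sqrt(1 - x))
    exact = t * x / (1.0 + math.sqrt(1.0 - x))
    approx = 0.5 * t * x
    return exact, approx

print(clock_lag(1.0, 0.6))    # large v: the approximation is visibly off
print(clock_lag(1.0, 1e-4))   # small v: both values nearly coincide
```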

§ 5. Addition Theorem of Velocities.

In the system $k$ moving along the X-axis of the system $K$ with the velocity $v$, let a point move in accordance with the equations:

$$\xi = w_\xi\,\tau, \quad \eta = w_\eta\,\tau, \quad \zeta = 0,$$

where $w_\xi$ and $w_\eta$ denote constants.
Required: the motion of the point relative to the system $K$. If, with the help of the transformation equations developed in § 3, we introduce the quantities $x, y, z, t$ into the equations of motion of the point, we obtain:

$$x = \frac{w_\xi + v}{1 + \frac{v\,w_\xi}{V^2}}\;t, \quad y = \frac{\sqrt{1 - \left(\frac{v}{V}\right)^2}}{1 + \frac{v\,w_\xi}{V^2}}\;w_\eta\,t, \quad z = 0.$$

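The law of the parallelogram of velocities thus holds in our theory only to a first approximation. We put:

$$U^2 = \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2, \quad w^2 = w_\xi^2 + w_\eta^2$$

and

$$\alpha = \operatorname{arctg}\frac{w_\eta}{w_\xi};$$

$\alpha$ is then to be regarded as the angle between the velocities $v$ and $w$. After a simple calculation we obtain:

$$U = \frac{\sqrt{\left(v^2 + w^2 + 2vw\cos\alpha\right) - \left(\frac{v\,w\sin\alpha}{V}\right)^2}}{1 + \frac{v\,w\cos\alpha}{V^2}}.$$
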
It is noteworthy that $v$ and $w$ enter into the expression for the resultant velocity symmetrically. If $w$ also has the direction of the X-axis ($\Xi$-axis), we obtain:

$$U = \frac{v + w}{1 + \frac{v\,w}{V^2}}.$$

It follows from this equation that the composition of two velocities which are less than $V$ always results in a velocity less than $V$. For if we put $v = V - \kappa$, $w = V - \lambda$, where $\kappa$ and $\lambda$ are positive and less than $V$, then:

$$U = V\,\frac{2V - \kappa - \lambda}{2V - \kappa - \lambda + \frac{\kappa\lambda}{V}} < V.$$

It follows further that the velocity of light $V$ cannot be altered by composition with a velocity less than that of light. For this case we obtain:

$$U = \frac{V + w}{1 + \frac{w}{V}} = V.$$
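
The addition theorem lends itself to a direct numerical check. The sketch below implements the general formula for $U$; the function name `compose` and the units with $V = 1$ are assumptions made for this example.

```python
import math

def compose(v, w, alpha=0.0, V=1.0):
    """Resultant speed U of velocities v and w enclosing the angle alpha."""
    radicand = (v * v + w * w + 2.0 * v * w * math.cos(alpha)
                - (v * w * math.sin(alpha) / V) ** 2)
    return math.sqrt(radicand) / (1.0 + v * w * math.cos(alpha) / V**2)

print(compose(0.5, 0.5))   # parallel case: (v + w) / (1 + v w / V^2)
print(compose(0.9, 0.9))   # stays below V
print(compose(1.0, 0.3))   # composing with V returns V (up to rounding)
```

Swapping `v` and `w` leaves the result unchanged, in line with the symmetry noted in the text.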

We might also have obtained the formula for $U$, for the case in which $v$ and $w$ have the same direction, by compounding two transformations in accordance with § 3. If, in addition to the systems $K$ and $k$ figuring in § 3, we introduce a third coordinate system $k'$ in parallel motion relative to $k$, whose origin moves along the $\Xi$-axis with the velocity $w$, we obtain between the quantities $x, y, z, t$ and the corresponding quantities of $k'$ equations which differ from those found in § 3 only in that the place of "$v$" is taken by the quantity

$$\frac{v + w}{1 + \frac{v\,w}{V^2}};$$

from this we see that such parallel transformations, as they must, form a group.
We have now derived the propositions we need of the kinematics corresponding to our two principles, and we proceed to show their application in electrodynamics.

II. Electrodynamical Part.

§ 6. Transformation of the Maxwell-Hertz Equations for Empty Space. On the Nature of the Electromotive Forces Arising from Motion in a Magnetic Field.

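Let the Maxwell-Hertz equations for empty space hold for the stationary system $K$, so that we have:

$$\begin{aligned}
\frac{1}{V}\frac{\partial X}{\partial t} &= \frac{\partial N}{\partial y} - \frac{\partial M}{\partial z}, &
\frac{1}{V}\frac{\partial L}{\partial t} &= \frac{\partial Y}{\partial z} - \frac{\partial Z}{\partial y},\\
\frac{1}{V}\frac{\partial Y}{\partial t} &= \frac{\partial L}{\partial z} - \frac{\partial N}{\partial x}, &
\frac{1}{V}\frac{\partial M}{\partial t} &= \frac{\partial Z}{\partial x} - \frac{\partial X}{\partial z},\\
\frac{1}{V}\frac{\partial Z}{\partial t} &= \frac{\partial M}{\partial x} - \frac{\partial L}{\partial y}, &
\frac{1}{V}\frac{\partial N}{\partial t} &= \frac{\partial X}{\partial y} - \frac{\partial Y}{\partial x},
\end{aligned}$$

where $(X, Y, Z)$ denotes the vector of the electric force and $(L, M, N)$ that of the magnetic force.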
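If we apply to these equations the transformation developed in § 3, by referring the electromagnetic processes to the coordinate system introduced there, moving with the velocity $v$, we obtain the equations:

$$\begin{aligned}
\frac{1}{V}\frac{\partial X}{\partial \tau} &= \frac{\partial}{\partial \eta}\,\beta\left(N - \frac{v}{V}Y\right) - \frac{\partial}{\partial \zeta}\,\beta\left(M + \frac{v}{V}Z\right),\\
\frac{1}{V}\frac{\partial}{\partial \tau}\,\beta\left(Y - \frac{v}{V}N\right) &= \frac{\partial L}{\partial \zeta} - \frac{\partial}{\partial \xi}\,\beta\left(N - \frac{v}{V}Y\right),\\
\frac{1}{V}\frac{\partial}{\partial \tau}\,\beta\left(Z + \frac{v}{V}M\right) &= \frac{\partial}{\partial \xi}\,\beta\left(M + \frac{v}{V}Z\right) - \frac{\partial L}{\partial \eta},\\
\frac{1}{V}\frac{\partial L}{\partial \tau} &= \frac{\partial}{\partial \zeta}\,\beta\left(Y - \frac{v}{V}N\right) - \frac{\partial}{\partial \eta}\,\beta\left(Z + \frac{v}{V}M\right),\\
\frac{1}{V}\frac{\partial}{\partial \tau}\,\beta\left(M + \frac{v}{V}Z\right) &= \frac{\partial}{\partial \xi}\,\beta\left(Z + \frac{v}{V}M\right) - \frac{\partial X}{\partial \zeta},\\
\frac{1}{V}\frac{\partial}{\partial \tau}\,\beta\left(N - \frac{v}{V}Y\right) &= \frac{\partial X}{\partial \eta} - \frac{\partial}{\partial \xi}\,\beta\left(Y - \frac{v}{V}N\right),
\end{aligned}$$

where

$$\beta = \frac{1}{\sqrt{1 - \left(\frac{v}{V}\right)^2}}.$$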

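The principle of relativity now requires that the Maxwell-Hertz equations for empty space also hold in the system $k$ if they hold in the system $K$; that is, that for the vectors of the electric and magnetic force, $(X', Y', Z')$ and $(L', M', N')$, of the moving system $k$, defined by their ponderomotive effects on electric and magnetic masses respectively, the following equations hold:

$$\begin{aligned}
\frac{1}{V}\frac{\partial X'}{\partial \tau} &= \frac{\partial N'}{\partial \eta} - \frac{\partial M'}{\partial \zeta}, &
\frac{1}{V}\frac{\partial L'}{\partial \tau} &= \frac{\partial Y'}{\partial \zeta} - \frac{\partial Z'}{\partial \eta},\\
\frac{1}{V}\frac{\partial Y'}{\partial \tau} &= \frac{\partial L'}{\partial \zeta} - \frac{\partial N'}{\partial \xi}, &
\frac{1}{V}\frac{\partial M'}{\partial \tau} &= \frac{\partial Z'}{\partial \xi} - \frac{\partial X'}{\partial \zeta},\\
\frac{1}{V}\frac{\partial Z'}{\partial \tau} &= \frac{\partial M'}{\partial \xi} - \frac{\partial L'}{\partial \eta}, &
\frac{1}{V}\frac{\partial N'}{\partial \tau} &= \frac{\partial X'}{\partial \eta} - \frac{\partial Y'}{\partial \xi}.
\end{aligned}$$
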
Evidently the two systems of equations found for the system $k$ must express exactly the same thing, since both systems of equations are equivalent to the Maxwell-Hertz equations for the system $K$. Since, further, the equations of the two systems agree except for the symbols representing the vectors, it follows that the functions occurring at corresponding places in the systems of equations must agree up to a factor $\psi(v)$, common to all functions of the one system of equations, independent of $\xi, \eta, \zeta$ and $\tau$, but possibly dependent on $v$. We therefore have the relations:

$$\begin{aligned}
X' &= \psi(v)\,X, & L' &= \psi(v)\,L,\\
Y' &= \psi(v)\,\beta\left(Y - \frac{v}{V}N\right), & M' &= \psi(v)\,\beta\left(M + \frac{v}{V}Z\right),\\
Z' &= \psi(v)\,\beta\left(Z + \frac{v}{V}M\right), & N' &= \psi(v)\,\beta\left(N - \frac{v}{V}Y\right).
\end{aligned}$$

If we now form the inverse of this system of equations, first by solving the equations just obtained, and second by applying the equations to the inverse transformation (from $k$ to $K$), which is characterized by the velocity $-v$, it follows, when we take into account that the two systems of equations thus obtained must be identical, that:

$$\psi(v)\cdot\psi(-v) = 1.$$

Further, for reasons of symmetry, 1)

$$\psi(v) = \psi(-v);$$

therefore

$$\psi(v) = 1,$$

and our equations assume the form:

$$\begin{aligned}
X' &= X, & L' &= L,\\
Y' &= \beta\left(Y - \frac{v}{V}N\right), & M' &= \beta\left(M + \frac{v}{V}Z\right),\\
Z' &= \beta\left(Z + \frac{v}{V}M\right), & N' &= \beta\left(N - \frac{v}{V}Y\right).
\end{aligned}$$
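
The field transformation can be tried out on a simple case. The following sketch (function name `transform_fields` and units with $V = 1$ are assumptions for illustration) transforms a purely magnetic field with only $N \neq 0$ and shows that an electric component $Y'$ appears in the moving system, changing sign with $v$.

```python
import math

def transform_fields(E, B, v, V=1.0):
    """Transform electric force E = (X, Y, Z) and magnetic force
    B = (L, M, N) from K to a system k moving with velocity v along X."""
    X, Y, Z = E
    L, M, N = B
    b = 1.0 / math.sqrt(1.0 - (v / V) ** 2)
    u = v / V
    E_k = (X, b * (Y - u * N), b * (Z + u * M))
    B_k = (L, b * (M + u * Z), b * (N - u * Y))
    return E_k, B_k

E_k, B_k = transform_fields((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), v=0.6)
print(E_k, B_k)   # Y' is non-zero although Y vanishes in K
print(transform_fields((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), v=-0.6)[0])
```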
For the interpretation of these equations we remark the following. Let there be a point charge of electricity which, measured in the stationary system $K$, is of magnitude "one", that is, which when at rest in the stationary system exerts a force of 1 dyne on an equal quantity of electricity at a distance of 1 cm. According to the principle of relativity this electric charge is also of magnitude "one" when measured in the moving system. If this quantity of electricity is at rest relative to the stationary system, then by definition the vector $(X, Y, Z)$ equals the force acting on it. If the quantity of electricity is at rest relative to the moving system (at least at the instant in question), then the force acting on it, measured in the moving system, equals the vector $(X', Y', Z')$. The first three of the above equations may therefore be put into words in the following two ways:
1. If a unit electric point pole is in motion in an electromagnetic field, there acts upon it, in addition to the electric force, an "electromotive force" which, neglecting terms multiplied by the second and higher powers of $v/V$, equals the vector product of the velocity of the unit pole and the magnetic force, divided by the velocity of light. (Old manner of expression.)
2. If a unit electric point pole is in motion in an electromagnetic field, the force acting on it equals the electric force present at the location of the unit pole, which is obtained by transforming the field to a coordinate system at rest relative to the unit electric pole. (New manner of expression.)

1) If, for example, $X = Y = Z = L = M = 0$ and $N \neq 0$, then it is clear for reasons of symmetry that when $v$ changes sign without changing its numerical value, $Y'$ must also change sign without changing its numerical value.
The analogous statement holds for "magnetomotive forces". We see that in the theory developed here the electromotive force plays only the part of an auxiliary concept, which owes its introduction to the circumstance that the electric and magnetic forces do not exist independently of the state of motion of the coordinate system.
It is further clear that the asymmetry mentioned in the introduction, in the treatment of the currents produced by the relative motion of a magnet and a conductor, disappears. Questions as to the "seat" of the electrodynamic electromotive forces (unipolar machines) also become pointless.

§ 7. Theory of Doppler's Principle and of Aberration.

In the system $K$, very far from the origin of coordinates, let there be a source of electrodynamic waves, which in a part of space containing the origin may be represented to a sufficient degree of approximation by the equations:

$$\begin{aligned}
X &= X_0 \sin\Phi, & L &= L_0 \sin\Phi,\\
Y &= Y_0 \sin\Phi, & M &= M_0 \sin\Phi,\\
Z &= Z_0 \sin\Phi, & N &= N_0 \sin\Phi,
\end{aligned}
\qquad \Phi = \omega\left(t - \frac{ax + by + cz}{V}\right).$$

Here $(X_0, Y_0, Z_0)$ and $(L_0, M_0, N_0)$ are the vectors that determine the amplitude of the wave train, and $a, b, c$ are the direction cosines of the wave normals.
We ask about the character of these waves when they are examined by an observer at rest in the moving system $k$. By applying the transformation equations for the electric and magnetic forces found in § 6 and the transformation equations for the coordinates and the time found in § 3, we obtain immediately:

$$\begin{aligned}
X' &= X_0 \sin\Phi', & L' &= L_0 \sin\Phi',\\
Y' &= \beta\left(Y_0 - \frac{v}{V}N_0\right)\sin\Phi', & M' &= \beta\left(M_0 + \frac{v}{V}Z_0\right)\sin\Phi',\\
Z' &= \beta\left(Z_0 + \frac{v}{V}M_0\right)\sin\Phi', & N' &= \beta\left(N_0 - \frac{v}{V}Y_0\right)\sin\Phi',
\end{aligned}$$

$$\Phi' = \omega'\left(\tau - \frac{a'\xi + b'\eta + c'\zeta}{V}\right),$$

where we have put

$$\omega' = \omega\,\beta\left(1 - a\,\frac{v}{V}\right), \quad a' = \frac{a - \frac{v}{V}}{1 - a\,\frac{v}{V}}, \quad b' = \frac{b}{\beta\left(1 - a\,\frac{v}{V}\right)}, \quad c' = \frac{c}{\beta\left(1 - a\,\frac{v}{V}\right)}.$$
From the equation for $\omega'$ it follows: if an observer moves with the velocity $v$ relative to an infinitely distant light source of frequency $\nu$, in such a way that the connecting line "light source-observer" makes the angle $\varphi$ with the velocity of the observer referred to a coordinate system at rest relative to the light source, then the frequency $\nu'$ of the light perceived by the observer is given by the equation:

$$\nu' = \nu\,\frac{1 - \cos\varphi\,\frac{v}{V}}{\sqrt{1 - \left(\frac{v}{V}\right)^2}}.$$

This is Doppler's principle for arbitrary velocities. For $\varphi = 0$ the equation takes the clear form:

$$\nu' = \nu\,\sqrt{\frac{1 - \frac{v}{V}}{1 + \frac{v}{V}}}.$$

We see that, in contrast to the customary view, $\nu' = \infty$ for $v = -V$.
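
The Doppler formula for arbitrary velocities can be evaluated directly. This is an illustrative sketch; the function name `doppler` and the units with $V = 1$ are assumptions, not part of the paper.

```python
import math

def doppler(nu, v, phi=0.0, V=1.0):
    """Frequency perceived by an observer moving with velocity v, where phi
    is the angle between the line 'light source-observer' and v."""
    return nu * (1.0 - math.cos(phi) * v / V) / math.sqrt(1.0 - (v / V) ** 2)

print(doppler(1.0, 0.6))                # receding along the line of sight
print(doppler(1.0, -0.6))               # approaching: frequency is raised
print(doppler(1.0, 0.6, math.pi / 2))   # transverse motion still shifts nu
```

For `phi = 0` the result agrees with the simplified square-root form, and it grows without bound as `v` approaches `-V`.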
If we call $\varphi'$ the angle between the wave normal (direction of the ray) in the moving system and the connecting line "light source-observer", the equation for $a'$ takes the form:

$$\cos\varphi' = \frac{\cos\varphi - \frac{v}{V}}{1 - \frac{v}{V}\cos\varphi}.$$

This equation expresses the law of aberration in its most general form. If $\varphi = \pi/2$, the equation takes the simple form:

$$\cos\varphi' = -\frac{v}{V}.$$

We still have to find the amplitude of the waves as it appears in the moving system. If we call $A$ and $A'$ the amplitude of the electric or magnetic force measured in the stationary and in the moving system respectively, we obtain:

$$A'^2 = A^2\,\frac{\left(1 - \frac{v}{V}\cos\varphi\right)^2}{1 - \left(\frac{v}{V}\right)^2},$$

which for $\varphi = 0$ simplifies to:

$$A'^2 = A^2\,\frac{1 - \frac{v}{V}}{1 + \frac{v}{V}}.$$

It follows from the equations developed that to an observer approaching a light source with the velocity $V$, this light source would have to appear infinitely intense.

§ 8. Transformation of the Energy of Light Rays. Theory of the Radiation Pressure Exerted on Perfect Mirrors.

Since $A^2/8\pi$ equals the energy of light per unit volume, we have to regard $A'^2/8\pi$, according to the principle of relativity, as the energy of light in the moving system. Thus $A'^2/A^2$ would be the ratio of the energy of a given light complex "measured in motion" to that "measured at rest", if the volume of a light complex were the same whether measured in $K$ or in $k$. This, however, is not the case. If $a, b, c$ are the direction cosines of the wave normals of the light in the stationary system, then no energy passes through the surface elements of the spherical surface

$$(x - Vat)^2 + (y - Vbt)^2 + (z - Vct)^2 = R^2,$$

which moves with the velocity of light; we may therefore say that this surface permanently encloses the same light complex. We ask for the quantity of energy which this surface encloses when viewed in the system $k$, that is, for the energy of the light complex relative to the system $k$.
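The spherical surface, viewed in the moving system, is an ellipsoidal surface, which at the time $\tau = 0$ has the equation:

$$\left(\beta\xi - a\beta\frac{v}{V}\xi\right)^2 + \left(\eta - b\beta\frac{v}{V}\xi\right)^2 + \left(\zeta - c\beta\frac{v}{V}\xi\right)^2 = R^2.$$

If we call $S$ the volume of the sphere and $S'$ that of this ellipsoid, a simple calculation shows that:

$$\frac{S'}{S} = \frac{\sqrt{1 - \left(\frac{v}{V}\right)^2}}{1 - \frac{v}{V}\cos\varphi}.$$

If, then, we call $E$ the light energy enclosed by the surface considered as measured in the stationary system, and $E'$ as measured in the moving system, we obtain:

$$\frac{E'}{E} = \frac{\frac{A'^2}{8\pi}\,S'}{\frac{A^2}{8\pi}\,S} = \frac{1 - \frac{v}{V}\cos\varphi}{\sqrt{1 - \left(\frac{v}{V}\right)^2}},$$

which for $\varphi = 0$ simplifies to:

$$\frac{E'}{E} = \sqrt{\frac{1 - \frac{v}{V}}{1 + \frac{v}{V}}}.$$

It is noteworthy that the energy and the frequency of a light complex vary with the state of motion of the observer according to the same law.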
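
That energy and frequency transform alike can be verified by multiplying the amplitude ratio of § 7 by the volume ratio $S'/S$. The sketch below does this numerically; all function names and the dimensionless variable `u` (standing for $v/V$) are assumptions for illustration.

```python
import math

def amplitude_sq_ratio(u, cos_phi):
    """A'^2 / A^2 with u = v/V."""
    return (1.0 - u * cos_phi) ** 2 / (1.0 - u * u)

def volume_ratio(u, cos_phi):
    """S'/S: volume of the light complex in k relative to K."""
    return math.sqrt(1.0 - u * u) / (1.0 - u * cos_phi)

def energy_ratio(u, cos_phi):
    """E'/E as the product of energy density ratio and volume ratio."""
    return amplitude_sq_ratio(u, cos_phi) * volume_ratio(u, cos_phi)

def frequency_ratio(u, cos_phi):
    """nu'/nu from Doppler's principle."""
    return (1.0 - u * cos_phi) / math.sqrt(1.0 - u * u)

for u, c in [(0.3, 1.0), (0.6, 0.5), (0.9, -0.2)]:
    print(energy_ratio(u, c), frequency_ratio(u, c))  # pairs should agree
```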
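Now let the coordinate plane $\xi = 0$ be a perfectly reflecting surface, at which the plane waves considered in the last section are reflected. We ask for the light pressure exerted on the reflecting surface, and for the direction, frequency and intensity of the light after reflection.
Let the incident light be defined by the quantities $A$, $\cos\varphi$, $\nu$ (referred to the system $K$). Viewed from $k$, the corresponding quantities are:

$$A' = A\,\frac{1 - \frac{v}{V}\cos\varphi}{\sqrt{1 - \left(\frac{v}{V}\right)^2}}, \quad
\cos\varphi' = \frac{\cos\varphi - \frac{v}{V}}{1 - \frac{v}{V}\cos\varphi}, \quad
\nu' = \nu\,\frac{1 - \frac{v}{V}\cos\varphi}{\sqrt{1 - \left(\frac{v}{V}\right)^2}}.$$

For the reflected light we obtain, referring the process to the system $k$:

$$A'' = A', \quad \cos\varphi'' = -\cos\varphi', \quad \nu'' = \nu'.$$

Finally, by transforming back to the stationary system $K$, we obtain for the reflected light:

$$A''' = A''\,\frac{1 + \frac{v}{V}\cos\varphi''}{\sqrt{1 - \left(\frac{v}{V}\right)^2}} = A\,\frac{1 - 2\frac{v}{V}\cos\varphi + \left(\frac{v}{V}\right)^2}{1 - \left(\frac{v}{V}\right)^2},$$

$$\cos\varphi''' = \frac{\cos\varphi'' + \frac{v}{V}}{1 + \frac{v}{V}\cos\varphi''} = -\frac{\left(1 + \left(\frac{v}{V}\right)^2\right)\cos\varphi - 2\frac{v}{V}}{1 - 2\frac{v}{V}\cos\varphi + \left(\frac{v}{V}\right)^2},$$

$$\nu''' = \nu''\,\frac{1 + \frac{v}{V}\cos\varphi''}{\sqrt{1 - \left(\frac{v}{V}\right)^2}} = \nu\,\frac{1 - 2\frac{v}{V}\cos\varphi + \left(\frac{v}{V}\right)^2}{1 - \left(\frac{v}{V}\right)^2}.$$
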
The energy (measured in the stationary system) incident on unit area of the mirror per unit time is evidently $\frac{A^2}{8\pi}(V\cos\varphi - v)$. The energy leaving unit area of the mirror per unit time is $\frac{A'''^2}{8\pi}(-V\cos\varphi''' + v)$. By the energy principle, the difference of these two expressions is the work done by the light pressure per unit time. If we set this equal to the product $P \cdot v$, where $P$ is the light pressure, we obtain:

$$P = 2\,\frac{A^2}{8\pi}\,\frac{\left(\cos\varphi - \frac{v}{V}\right)^2}{1 - \left(\frac{v}{V}\right)^2}.$$

To a first approximation we obtain, in agreement with experience and with other theories,

$$P = 2\,\frac{A^2}{8\pi}\cos^2\varphi.$$
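
The chain of steps in this section (transform to the mirror's system, reflect, transform back, then balance the energy) can be reproduced numerically and compared with the closed formula for $P$. This is a minimal sketch; the function names and the dimensionless `u` (for $v/V$, which must be non-zero here) are assumptions for illustration.

```python
import math

def boost(A, cos_phi, u):
    """Amplitude and direction cosine after passing to a system moving
    with velocity u = v/V along the X-axis."""
    g = 1.0 / math.sqrt(1.0 - u * u)
    return A * g * (1.0 - u * cos_phi), (cos_phi - u) / (1.0 - u * cos_phi)

def pressure_from_energy_balance(A, cos_phi, u, V=1.0):
    """Light pressure P from (incident - reflected energy flux) = P * v."""
    A1, c1 = boost(A, cos_phi, u)    # incident light seen from k
    A2, c2 = A1, -c1                 # reflection at the mirror at rest in k
    A3, c3 = boost(A2, c2, -u)       # reflected light back in K
    v = u * V
    incident = A * A / (8.0 * math.pi) * (V * cos_phi - v)
    reflected = A3 * A3 / (8.0 * math.pi) * (-V * c3 + v)
    return (incident - reflected) / v

def pressure_closed_form(A, cos_phi, u):
    """Closed expression for P given in the text."""
    return 2.0 * A * A / (8.0 * math.pi) * (cos_phi - u) ** 2 / (1.0 - u * u)

print(pressure_from_energy_balance(1.0, 0.8, 0.3))
print(pressure_closed_form(1.0, 0.8, 0.3))
```

Both routes give the same value, which is a useful cross-check on the signs in the back-transformation.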

All problems in the optics of moving bodies can be solved by the method used here. The essential point is that the electric and magnetic force of the light that is influenced by a moving body are transformed to a coordinate system at rest relative to the body. In this way every problem in the optics of moving bodies is reduced to a series of problems in the optics of bodies at rest.

§ 9. Transformation of the Maxwell-Hertz Equations with Convection Currents Taken into Account.

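We start from the equations:

$$\begin{aligned}
\frac{1}{V}\left\{u_x\,\rho + \frac{\partial X}{\partial t}\right\} &= \frac{\partial N}{\partial y} - \frac{\partial M}{\partial z}, &
\frac{1}{V}\frac{\partial L}{\partial t} &= \frac{\partial Y}{\partial z} - \frac{\partial Z}{\partial y},\\
\frac{1}{V}\left\{u_y\,\rho + \frac{\partial Y}{\partial t}\right\} &= \frac{\partial L}{\partial z} - \frac{\partial N}{\partial x}, &
\frac{1}{V}\frac{\partial M}{\partial t} &= \frac{\partial Z}{\partial x} - \frac{\partial X}{\partial z},\\
\frac{1}{V}\left\{u_z\,\rho + \frac{\partial Z}{\partial t}\right\} &= \frac{\partial M}{\partial x} - \frac{\partial L}{\partial y}, &
\frac{1}{V}\frac{\partial N}{\partial t} &= \frac{\partial X}{\partial y} - \frac{\partial Y}{\partial x},
\end{aligned}$$

where

$$\rho = \frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y} + \frac{\partial Z}{\partial z}$$

denotes $4\pi$ times the density of electricity and $(u_x, u_y, u_z)$ the velocity vector of the electricity. If we imagine the electric charges invariably bound to small rigid bodies (ions, electrons), these equations are the electromagnetic basis of the Lorentzian electrodynamics and optics of moving bodies.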
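If these equations, which may hold in the system $K$, are transformed to the system $k$ with the help of the transformation equations of § 3 and § 6, we obtain the equations:

$$\begin{aligned}
\frac{1}{V}\left\{u_\xi\,\rho' + \frac{\partial X'}{\partial \tau}\right\} &= \frac{\partial N'}{\partial \eta} - \frac{\partial M'}{\partial \zeta}, &
\frac{1}{V}\frac{\partial L'}{\partial \tau} &= \frac{\partial Y'}{\partial \zeta} - \frac{\partial Z'}{\partial \eta},\\
\frac{1}{V}\left\{u_\eta\,\rho' + \frac{\partial Y'}{\partial \tau}\right\} &= \frac{\partial L'}{\partial \zeta} - \frac{\partial N'}{\partial \xi}, &
\frac{1}{V}\frac{\partial M'}{\partial \tau} &= \frac{\partial Z'}{\partial \xi} - \frac{\partial X'}{\partial \zeta},\\
\frac{1}{V}\left\{u_\zeta\,\rho' + \frac{\partial Z'}{\partial \tau}\right\} &= \frac{\partial M'}{\partial \xi} - \frac{\partial L'}{\partial \eta}, &
\frac{1}{V}\frac{\partial N'}{\partial \tau} &= \frac{\partial X'}{\partial \eta} - \frac{\partial Y'}{\partial \xi},
\end{aligned}$$

where

$$u_\xi = \frac{u_x - v}{1 - \frac{u_x v}{V^2}}, \quad u_\eta = \frac{u_y}{\beta\left(1 - \frac{u_x v}{V^2}\right)}, \quad u_\zeta = \frac{u_z}{\beta\left(1 - \frac{u_x v}{V^2}\right)},$$

$$\rho' = \frac{\partial X'}{\partial \xi} + \frac{\partial Y'}{\partial \eta} + \frac{\partial Z'}{\partial \zeta} = \beta\left(1 - \frac{v\,u_x}{V^2}\right)\rho.$$

Since, as follows from the addition theorem of velocities (§ 5), the vector $(u_\xi, u_\eta, u_\zeta)$ is nothing other than the velocity of the electric charges measured in the system $k$, it is thereby shown that, on the basis of our kinematical principles, the electrodynamic foundation of Lorentz's theory of the electrodynamics of moving bodies conforms to the principle of relativity.
It may also be briefly remarked that the following important theorem can easily be deduced from the equations developed: if an electrically charged body moves arbitrarily in space and its charge does not change when viewed from a coordinate system moving with the body, then its charge also remains constant when viewed from the "stationary" system $K$.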

§ 10. Dynamics of the (Slowly Accelerated) Electron.

Let a point-like particle carrying an electric charge $\varepsilon$ (called an "electron" in what follows) move in an electromagnetic field; about its law of motion we assume only the following:
If the electron is at rest at a given epoch, its motion in the next instant of time follows the equations

$$\mu\,\frac{d^2 x}{dt^2} = \varepsilon X, \quad \mu\,\frac{d^2 y}{dt^2} = \varepsilon Y, \quad \mu\,\frac{d^2 z}{dt^2} = \varepsilon Z,$$

where $x, y, z$ denote the coordinates of the electron and $\mu$ the mass of the electron, as long as it moves slowly.
Now, secondly, let the electron have the velocity $v$ at a certain epoch. We seek the law according to which the electron moves in the immediately following instant of time.
Without affecting the generality of the argument, we may and will assume that the electron, at the moment we consider it, is at the origin of coordinates and moves along the X-axis of the system $K$ with the velocity $v$. It is then evident that at the given moment ($t = 0$) the electron is at rest relative to a coordinate system $k$ moving in parallel along the X-axis with the constant velocity $v$.
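From the assumption made above, in combination with the principle of relativity, it is clear that in the immediately following time (for small values of $t$) the electron, viewed from the system $k$, moves according to the equations:

$$\mu\,\frac{d^2 \xi}{d\tau^2} = \varepsilon X', \quad \mu\,\frac{d^2 \eta}{d\tau^2} = \varepsilon Y', \quad \mu\,\frac{d^2 \zeta}{d\tau^2} = \varepsilon Z',$$

where the symbols $\xi, \eta, \zeta, \tau, X', Y', Z'$ refer to the system $k$. If we further stipulate that for $t = x = y = z = 0$ we shall have $\tau = \xi = \eta = \zeta = 0$, then the transformation equations of §§ 3 and 6 hold, so that:

$$\begin{aligned}
\tau &= \beta\left(t - \frac{v}{V^2}\,x\right), & & \\
\xi &= \beta\,(x - vt), & X' &= X,\\
\eta &= y, & Y' &= \beta\left(Y - \frac{v}{V}N\right),\\
\zeta &= z, & Z' &= \beta\left(Z + \frac{v}{V}M\right).
\end{aligned}$$

With the help of these equations we transform the above equations of motion from the system $k$ to the system $K$ and obtain:

$$\text{(A)} \qquad \frac{d^2 x}{dt^2} = \frac{\varepsilon}{\mu}\,\frac{1}{\beta^3}\,X, \quad
\frac{d^2 y}{dt^2} = \frac{\varepsilon}{\mu}\,\frac{1}{\beta}\left(Y - \frac{v}{V}N\right), \quad
\frac{d^2 z}{dt^2} = \frac{\varepsilon}{\mu}\,\frac{1}{\beta}\left(Z + \frac{v}{V}M\right).$$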

Following the usual approach, we now ask for the "longitudinal" and "transverse" mass of the moving electron. We write the equations (A) in the form

$$\begin{aligned}
\mu\,\beta^3\,\frac{d^2 x}{dt^2} &= \varepsilon X = \varepsilon X',\\
\mu\,\beta^2\,\frac{d^2 y}{dt^2} &= \varepsilon\beta\left(Y - \frac{v}{V}N\right) = \varepsilon Y',\\
\mu\,\beta^2\,\frac{d^2 z}{dt^2} &= \varepsilon\beta\left(Z + \frac{v}{V}M\right) = \varepsilon Z',
\end{aligned}$$

and note first that $\varepsilon X'$, $\varepsilon Y'$, $\varepsilon Z'$ are the components of the ponderomotive force acting on the electron, as viewed in a system moving at this moment with the electron, with the same velocity as the electron. (This force might be measured, for example, by a spring balance at rest in the last-mentioned system.) If we now call this force simply "the force acting on the electron" and maintain the equation

mass × acceleration = force,

and if we further stipulate that the accelerations are to be measured in the stationary system $K$, we obtain from the above equations:

$$\text{Longitudinal mass} = \frac{\mu}{\left(\sqrt{1 - \left(\frac{v}{V}\right)^2}\right)^3},$$

$$\text{Transverse mass} = \frac{\mu}{1 - \left(\frac{v}{V}\right)^2}.$$

Naturally, with a different definition of force and of acceleration we would obtain other values for the masses; this shows that one must proceed very cautiously when comparing different theories of the motion of the electron.
We remark that these results about the mass also hold for ponderable material points; for a ponderable material point can be made into an electron (in our sense) by adding an arbitrarily small electric charge.
Wir bestimmen die kinetische Energie des Elektrons.
Bewegt sich ein Elektron vom Koordinatenursprung des Systems
K aus mit der Anfangsgeschwindigkeit 0 beständig auf der
X-Achse unter der Wirkung einer elektrostatischen Kraft X,
so ist klar, daß die dem elektrostatischen Felde entzogene
Energie den Wert $\int \varepsilon X\,dx$ hat. Da das Elektron langsam
beschleunigt sein soll und infolgedessen keine Energie in Form
von Strahlung abgeben möge, so muß die dem elektrostatischen
Felde entzogene Energie gleich der Bewegungsenergie W des
Elektrons gesetzt werden. Man erhält daher, indem man beachtet,
daß während des ganzen betrachteten Bewegungsvorganges
die erste der Gleichungen (A) gilt:

$$ W = \int \varepsilon X\,dx = \mu \int_0^v \beta^3 v\,dv = \mu V^2 \left\{ \frac{1}{\sqrt{1 - (v/V)^2}} - 1 \right\}. $$

W wird also für v = V unendlich groß.
Überlichtgeschwindigkeiten haben - wie bei unseren früheren Resultaten
- keine Existenzmöglichkeit.
Auch dieser Ausdruck für die kinetische Energie muß dem
oben angeführten Argument zufolge ebenso für ponderable
Massen gelten.
Wir wollen nun die aus dem Gleichungssystem (A) resultierenden,
dem Experimente zugänglichen Eigenschaften der
Bewegung des Elektrons aufzählen.
1. Aus der zweiten Gleichung des Systems (A) folgt, daß
eine elektrische Kraft Y und eine magnetische Kraft N dann
gleich stark ablenkend wirken auf ein mit der Geschwindigkeit
v bewegtes Elektron, wenn Y = N · v/V. Man ersieht also, daß
die Ermittelung der Geschwindigkeit des Elektrons aus dem
Verhältnis der magnetischen Ablenkbarkeit $A_m$ und der
elektrischen Ablenkbarkeit $A_e$ nach unserer Theorie für beliebige
Geschwindigkeiten möglich ist durch Anwendung des Gesetzes:

$$ \frac{A_m}{A_e} = \frac{v}{V}. $$

Diese Beziehung ist der Prüfung durch das Experiment
zugänglich, da die Geschwindigkeit des Elektrons auch direkt,
z. B. mittels rasch oszillierender elektrischer und magnetischer
Felder, gemessen werden kann.
2. Aus der Ableitung für die kinetische Energie des
Elektrons folgt, daß zwischen der durchlaufenen
Potentialdifferenz P und der erlangten Geschwindigkeit v des Elektrons
die Beziehung gelten muß:

$$ P = \int X\,dx = \frac{\mu}{\varepsilon} V^2 \left\{ \frac{1}{\sqrt{1 - (v/V)^2}} - 1 \right\}. $$

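3. Wir berechnen den Krümmungsradius R der Bahn,
wenn eine senkrecht zur Geschwindigkeit des Elektrons wirkende
magnetische Kraft N (als einzige ablenkende Kraft) vorhanden
ist. Aus der zweiten der Gleichungen (A) erhalten wir:

$$ -\frac{d^2y}{dt^2} = \frac{v^2}{R} = \frac{\varepsilon}{\mu}\,\frac{v}{V}\,N \sqrt{1 - (v/V)^2} $$

oder

$$ R = V^2\,\frac{\mu}{\varepsilon} \cdot \frac{v/V}{\sqrt{1 - (v/V)^2}} \cdot \frac{1}{N}. $$
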
Diese drei Beziehungen sind ein vollständiger Ausdruck
für die Gesetze, nach denen sich gemäß vorliegender Theorie
das Elektron bewegen muß.
Zum Schlusse bemerke ich, daß mir beim Arbeiten an
dem hier behandelten Probleme mein Freund und Kollege
M. Besso treu zur Seite stand und daß ich demselben manche
wertvolle Anregung verdanke.
Bern, Juni 1905.
(Eingegangen 30. Juni 1905.)
ON THE ELECTRODYNAMICS OF MOVING
BODIES
By A. EINSTEIN
June 30, 1905

It is known that Maxwell’s electrodynamics, as usually understood at the
present time, when applied to moving bodies, leads to asymmetries which do
not appear to be inherent in the phenomena. Take, for example, the recipro-
cal electrodynamic action of a magnet and a conductor. The observable phe-
nomenon here depends only on the relative motion of the conductor and the
magnet, whereas the customary view draws a sharp distinction between the two
cases in which either the one or the other of these bodies is in motion. For if the
magnet is in motion and the conductor at rest, there arises in the neighbour-
hood of the magnet an electric field with a certain definite energy, producing
a current at the places where parts of the conductor are situated. But if the
magnet is stationary and the conductor in motion, no electric field arises in the
neighbourhood of the magnet. In the conductor, however, we find an electro-
motive force, to which in itself there is no corresponding energy, but which gives
rise—assuming equality of relative motion in the two cases discussed—to elec-
tric currents of the same path and intensity as those produced by the electric
forces in the former case.
Examples of this sort, together with the unsuccessful attempts to discover
any motion of the earth relatively to the “light medium,” suggest that the
phenomena of electrodynamics as well as of mechanics possess no properties
corresponding to the idea of absolute rest. They suggest rather that, as has
already been shown to the first order of small quantities, the same laws of
electrodynamics and optics will be valid for all frames of reference for which the
equations of mechanics hold good.1 We will raise this conjecture (the purport
of which will hereafter be called the “Principle of Relativity”) to the status
of a postulate, and also introduce another postulate, which is only apparently
irreconcilable with the former, namely, that light is always propagated in empty
space with a definite velocity c which is independent of the state of motion of the
emitting body. These two postulates suffice for the attainment of a simple and
consistent theory of the electrodynamics of moving bodies based on Maxwell’s
theory for stationary bodies. The introduction of a “luminiferous ether” will
prove to be superfluous inasmuch as the view here to be developed will not
require an “absolutely stationary space” provided with special properties, nor
assign a velocity-vector to a point of the empty space in which electromagnetic
processes take place.
1 The preceding memoir by Lorentz was not at this time known to the author.
The theory to be developed is based—like all electrodynamics—on the kine-
matics of the rigid body, since the assertions of any such theory have to do
with the relationships between rigid bodies (systems of co-ordinates), clocks,
and electromagnetic processes. Insufficient consideration of this circumstance
lies at the root of the difficulties which the electrodynamics of moving bodies
at present encounters.

I. KINEMATICAL PART
§ 1. Definition of Simultaneity
Let us take a system of co-ordinates in which the equations of Newtonian
mechanics hold good.2 In order to render our presentation more precise and
to distinguish this system of co-ordinates verbally from others which will be
introduced hereafter, we call it the “stationary system.”
If a material point is at rest relatively to this system of co-ordinates, its
position can be defined relatively thereto by the employment of rigid standards
of measurement and the methods of Euclidean geometry, and can be expressed
in Cartesian co-ordinates.
If we wish to describe the motion of a material point, we give the values of
its co-ordinates as functions of the time. Now we must bear carefully in mind
that a mathematical description of this kind has no physical meaning unless
we are quite clear as to what we understand by “time.” We have to take into
account that all our judgments in which time plays a part are always judgments
of simultaneous events. If, for instance, I say, “That train arrives here at 7
o’clock,” I mean something like this: “The pointing of the small hand of my
watch to 7 and the arrival of the train are simultaneous events.”3
It might appear possible to overcome all the difficulties attending the defini-
tion of “time” by substituting “the position of the small hand of my watch” for
“time.” And in fact such a definition is satisfactory when we are concerned with
defining a time exclusively for the place where the watch is located; but it is no
longer satisfactory when we have to connect in time series of events occurring
at different places, or—what comes to the same thing—to evaluate the times of
events occurring at places remote from the watch.
We might, of course, content ourselves with time values determined by an
observer stationed together with the watch at the origin of the co-ordinates,
and co-ordinating the corresponding positions of the hands with light signals,
given out by every event to be timed, and reaching him through empty space.
But this co-ordination has the disadvantage that it is not independent of the
standpoint of the observer with the watch or clock, as we know from experience.
2 i.e. to the first approximation.
3 We shall not here discuss the inexactitude which lurks in the concept of simultaneity of
two events at approximately the same place, which can only be removed by an abstraction.
We arrive at a much more practical determination along the following line of
thought.
If at the point A of space there is a clock, an observer at A can determine the
time values of events in the immediate proximity of A by finding the positions
of the hands which are simultaneous with these events. If there is at the point B
of space another clock in all respects resembling the one at A, it is possible for
an observer at B to determine the time values of events in the immediate neigh-
bourhood of B. But it is not possible without further assumption to compare,
in respect of time, an event at A with an event at B. We have so far defined
only an “A time” and a “B time.” We have not defined a common “time” for
A and B, for the latter cannot be defined at all unless we establish by definition
that the “time” required by light to travel from A to B equals the “time” it
requires to travel from B to A. Let a ray of light start at the “A time” $t_A$ from
A towards B, let it at the “B time” $t_B$ be reflected at B in the direction of A,
and arrive again at A at the “A time” $t'_A$.
In accordance with definition the two clocks synchronize if

$$ t_B - t_A = t'_A - t_B. $$
We assume that this definition of synchronism is free from contradictions,
and possible for any number of points; and that the following relations are
universally valid:—
1. If the clock at B synchronizes with the clock at A, the clock at A syn-
chronizes with the clock at B.
2. If the clock at A synchronizes with the clock at B and also with the clock
at C, the clocks at B and C also synchronize with each other.
Thus with the help of certain imaginary physical experiments we have set-
tled what is to be understood by synchronous stationary clocks located at dif-
ferent places, and have evidently obtained a definition of “simultaneous,” or
“synchronous,” and of “time.” The “time” of an event is that which is given
simultaneously with the event by a stationary clock located at the place of
the event, this clock being synchronous, and indeed synchronous for all time
determinations, with a specified stationary clock.
In agreement with experience we further assume the quantity
$$ \frac{2\,\mathrm{AB}}{t'_A - t_A} = c, $$
to be a universal constant—the velocity of light in empty space.
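
The synchronization rule and the two-way definition of c can be tried out numerically. The following is only an illustrative sketch: the function names `are_synchronous` and `two_way_speed`, the tolerance, and the sample clock separation are assumptions introduced here, not part of the original text.

```python
def are_synchronous(t_a, t_b, t_a_return, tol=1e-12):
    """Criterion of section 1: outbound and return light times must agree."""
    return abs((t_b - t_a) - (t_a_return - t_b)) <= tol


def two_way_speed(distance_ab, t_a, t_a_return):
    """Round-trip speed 2AB / (t'_A - t_A), read off clock A alone."""
    return 2.0 * distance_ab / (t_a_return - t_a)


C = 299_792_458.0   # m/s, value of c used as input for this sketch
AB = 3.0e8          # m, arbitrary separation of the two clocks (assumption)

t_a = 0.0                   # "A time" of emission
t_b = t_a + AB / C          # "B time" of reflection, if clock B is synchronous
t_a_return = t_b + AB / C   # "A time" of return

print(are_synchronous(t_a, t_b, t_a_return))         # clock B set correctly
print(are_synchronous(t_a, t_b + 0.25, t_a_return))  # clock B offset by 0.25 s
print(two_way_speed(AB, t_a, t_a_return))
```

Note that `two_way_speed` needs only the readings of clock A, which is why it can be stated before any convention about distant clocks is fixed.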
It is essential to have time defined by means of stationary clocks in the
stationary system, and the time now defined being appropriate to the stationary
system we call it “the time of the stationary system.”

§ 2. On the Relativity of Lengths and Times


The following reflexions are based on the principle of relativity and on the
principle of the constancy of the velocity of light. These two principles we define
as follows:—

1. The laws by which the states of physical systems undergo change are not
affected, whether these changes of state be referred to the one or the other of
two systems of co-ordinates in uniform translatory motion.
2. Any ray of light moves in the “stationary” system of co-ordinates with
the determined velocity c, whether the ray be emitted by a stationary or by a
moving body. Hence
$$ \text{velocity} = \frac{\text{light path}}{\text{time interval}} $$
where time interval is to be taken in the sense of the definition in § 1.
Let there be given a stationary rigid rod; and let its length be l as measured
by a measuring-rod which is also stationary. We now imagine the axis of the
rod lying along the axis of x of the stationary system of co-ordinates, and that
a uniform motion of parallel translation with velocity v along the axis of x in
the direction of increasing x is then imparted to the rod. We now inquire as to
the length of the moving rod, and imagine its length to be ascertained by the
following two operations:—
(a) The observer moves together with the given measuring-rod and the rod
to be measured, and measures the length of the rod directly by superposing the
measuring-rod, in just the same way as if all three were at rest.
(b) By means of stationary clocks set up in the stationary system and syn-
chronizing in accordance with § 1, the observer ascertains at what points of the
stationary system the two ends of the rod to be measured are located at a definite
time. The distance between these two points, measured by the measuring-rod
already employed, which in this case is at rest, is also a length which may be
designated “the length of the rod.”
In accordance with the principle of relativity the length to be discovered by
the operation (a)—we will call it “the length of the rod in the moving system”—
must be equal to the length l of the stationary rod.
The length to be discovered by the operation (b) we will call “the length
of the (moving) rod in the stationary system.” This we shall determine on the
basis of our two principles, and we shall find that it differs from l.
Current kinematics tacitly assumes that the lengths determined by these two
operations are precisely equal, or in other words, that a moving rigid body at
the epoch t may in geometrical respects be perfectly represented by the same
body at rest in a definite position.
We imagine further that at the two ends A and B of the rod, clocks are
placed which synchronize with the clocks of the stationary system, that is to say
that their indications correspond at any instant to the “time of the stationary
system” at the places where they happen to be. These clocks are therefore
“synchronous in the stationary system.”
We imagine further that with each clock there is a moving observer, and
that these observers apply to both clocks the criterion established in § 1 for the
synchronization of two clocks. Let a ray of light depart from A at the time4 $t_A$,
let it be reflected at B at the time $t_B$, and reach A again at the time $t'_A$. Taking
into consideration the principle of the constancy of the velocity of light we find
that

$$ t_B - t_A = \frac{r_{AB}}{c - v} \quad\text{and}\quad t'_A - t_B = \frac{r_{AB}}{c + v} $$

where $r_{AB}$ denotes the length of the moving rod, measured in the stationary
system. Observers moving with the moving rod would thus find that the two
clocks were not synchronous, while observers in the stationary system would
declare the clocks to be synchronous.
4 “Time” here denotes “time of the stationary system” and also “position of hands of the
moving clock situated at the place under discussion.”
So we see that we cannot attach any absolute signification to the concept of
simultaneity, but that two events which, viewed from a system of co-ordinates,
are simultaneous, can no longer be looked upon as simultaneous events when
envisaged from a system which is in motion relatively to that system.
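
A short numerical sketch may make the asymmetry of the two light travel times concrete. The function name `light_times_along_moving_rod` and the choice of units with c = 1 are assumptions made for this illustration only.

```python
def light_times_along_moving_rod(r_ab, v, c):
    """Travel times, in stationary-system time, of a ray sent from A to B
    and back along a rod of stationary-system length r_ab moving at v."""
    outbound = r_ab / (c - v)   # t_B - t_A: the ray chases the receding end B
    inbound = r_ab / (c + v)    # t'_A - t_B: end A approaches the returning ray
    return outbound, inbound


c = 1.0      # units chosen so that c = 1 (assumption of this sketch)
r_ab = 1.0   # rod length as measured in the stationary system

for v in (0.0, 0.1, 0.5, 0.9):
    out, back = light_times_along_moving_rod(r_ab, v, c)
    print(f"v = {v:.1f}: out = {out:.4f}, back = {back:.4f}, "
          f"difference = {out - back:.4f}")
```

Only for v = 0 do the two intervals agree, so clocks that show stationary-system time fail the criterion of § 1 when it is applied by observers moving with the rod.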

§ 3. Theory of the Transformation of Co-ordinates and
Times from a Stationary System to another System in
Uniform Motion of Translation Relatively to the Former
Let us in “stationary” space take two systems of co-ordinates, i.e. two sys-
tems, each of three rigid material lines, perpendicular to one another, and issuing
from a point. Let the axes of X of the two systems coincide, and their axes of
Y and Z respectively be parallel. Let each system be provided with a rigid
measuring-rod and a number of clocks, and let the two measuring-rods, and
likewise all the clocks of the two systems, be in all respects alike.
Now to the origin of one of the two systems (k) let a constant velocity v
be imparted in the direction of the increasing x of the other stationary system
(K), and let this velocity be communicated to the axes of the co-ordinates, the
relevant measuring-rod, and the clocks. To any time of the stationary system K
there then will correspond a definite position of the axes of the moving system,
and from reasons of symmetry we are entitled to assume that the motion of k
may be such that the axes of the moving system are at the time t (this “t” always
denotes a time of the stationary system) parallel to the axes of the stationary
system.
We now imagine space to be measured from the stationary system K by
means of the stationary measuring-rod, and also from the moving system k
by means of the measuring-rod moving with it; and that we thus obtain the
co-ordinates x, y, z, and ξ, η, ζ respectively. Further, let the time t of the
stationary system be determined for all points thereof at which there are clocks
by means of light signals in the manner indicated in § 1; similarly let the time
τ of the moving system be determined for all points of the moving system at
which there are clocks at rest relatively to that system by applying the method,
given in § 1, of light signals between the points at which the latter clocks are
located.
To any system of values x, y, z, t, which completely defines the place and
time of an event in the stationary system, there belongs a system of values ξ,
η, ζ, τ, determining that event relatively to the system k, and our task is now
to find the system of equations connecting these quantities.
In the first place it is clear that the equations must be linear on account of
the properties of homogeneity which we attribute to space and time.
If we place $x' = x - vt$, it is clear that a point at rest in the system k must
have a system of values $x'$, y, z, independent of time. We first define τ as a
function of $x'$, y, z, and t. To do this we have to express in equations that τ is
nothing else than the summary of the data of clocks at rest in system k, which
have been synchronized according to the rule given in § 1.
From the origin of system k let a ray be emitted at the time $\tau_0$ along the
X-axis to $x'$, and at the time $\tau_1$ be reflected thence to the origin of the
co-ordinates, arriving there at the time $\tau_2$; we then must have
$\tfrac{1}{2}(\tau_0 + \tau_2) = \tau_1$, or,
by inserting the arguments of the function τ and applying the principle of the
constancy of the velocity of light in the stationary system:

$$ \frac{1}{2}\left[\tau(0, 0, 0, t) + \tau\left(0, 0, 0,\ t + \frac{x'}{c - v} + \frac{x'}{c + v}\right)\right] = \tau\left(x', 0, 0,\ t + \frac{x'}{c - v}\right). $$

Hence, if $x'$ be chosen infinitesimally small,

$$ \frac{1}{2}\left(\frac{1}{c - v} + \frac{1}{c + v}\right)\frac{\partial\tau}{\partial t} = \frac{\partial\tau}{\partial x'} + \frac{1}{c - v}\frac{\partial\tau}{\partial t}, $$

or

$$ \frac{\partial\tau}{\partial x'} + \frac{v}{c^2 - v^2}\frac{\partial\tau}{\partial t} = 0. $$
It is to be noted that instead of the origin of the co-ordinates we might have
chosen any other point for the point of origin of the ray, and the equation just
obtained is therefore valid for all values of $x'$, y, z.
An analogous consideration, applied to the axes of Y and Z, it being borne
in mind that light is always propagated along these axes, when viewed from the
stationary system, with the velocity $\sqrt{c^2 - v^2}$, gives us

$$ \frac{\partial\tau}{\partial y} = 0, \quad \frac{\partial\tau}{\partial z} = 0. $$

Since τ is a linear function, it follows from these equations that

$$ \tau = a\left(t - \frac{v}{c^2 - v^2}\,x'\right) $$

where a is a function φ(v) at present unknown, and where for brevity it is
assumed that at the origin of k, τ = 0, when t = 0.
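
One way to gain confidence in this result is to substitute it back into the reflection condition and evaluate both sides numerically. The sketch below does this; the names `tau_linear`, `tau_absolute` and `sync_residual`, and the sample numbers, are assumptions chosen for illustration. The second function is a deliberately naive contrast case, not something proposed in the text.

```python
def tau_linear(x_prime, t, v, c, a=1.0):
    """tau = a * (t - v * x' / (c^2 - v^2)), the linear form found above."""
    return a * (t - v * x_prime / (c**2 - v**2))


def tau_absolute(x_prime, t, v, c, a=1.0):
    """Contrast case: a 'time' that ignores position, tau = a * t."""
    return a * t


def sync_residual(tau_fn, x_prime, t, v, c):
    """(1/2) * (tau_0 + tau_2) - tau_1 for a ray sent from the origin of k
    to x' and back, with travel times taken in the stationary system."""
    t_out = x_prime / (c - v)    # outbound travel time in K
    t_back = x_prime / (c + v)   # return travel time in K
    tau_0 = tau_fn(0.0, t, v, c)
    tau_1 = tau_fn(x_prime, t + t_out, v, c)
    tau_2 = tau_fn(0.0, t + t_out + t_back, v, c)
    return 0.5 * (tau_0 + tau_2) - tau_1


# Sample values in units with c = 1 (arbitrary choices for this sketch).
print(sync_residual(tau_linear, 2.0, 1.5, 0.6, 1.0))    # zero up to rounding
print(sync_residual(tau_absolute, 2.0, 1.5, 0.6, 1.0))  # clearly nonzero
```

The linear form satisfies the condition for any value of the factor a, which is why a remains undetermined at this stage.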
With the help of this result we easily determine the quantities ξ, η, ζ by
expressing in equations that light (as required by the principle of the constancy
of the velocity of light, in combination with the principle of relativity) is also
propagated with velocity c when measured in the moving system. For a ray of
light emitted at the time τ = 0 in the direction of the increasing ξ

$$ \xi = c\tau \quad\text{or}\quad \xi = ac\left(t - \frac{v}{c^2 - v^2}\,x'\right). $$

But the ray moves relatively to the initial point of k, when measured in the
stationary system, with the velocity c − v, so that

$$ \frac{x'}{c - v} = t. $$

If we insert this value of t in the equation for ξ, we obtain

$$ \xi = a\,\frac{c^2}{c^2 - v^2}\,x'. $$
In an analogous manner we find, by considering rays moving along the two other
axes, that

$$ \eta = c\tau = ac\left(t - \frac{v}{c^2 - v^2}\,x'\right) $$

when

$$ \frac{y}{\sqrt{c^2 - v^2}} = t, \quad x' = 0. $$

Thus

$$ \eta = a\,\frac{c}{\sqrt{c^2 - v^2}}\,y \quad\text{and}\quad \zeta = a\,\frac{c}{\sqrt{c^2 - v^2}}\,z. $$

Substituting for $x'$ its value, we obtain

$$
\begin{aligned}
\tau &= \varphi(v)\beta(t - vx/c^2),\\
\xi &= \varphi(v)\beta(x - vt),\\
\eta &= \varphi(v)y,\\
\zeta &= \varphi(v)z,
\end{aligned}
$$

where

$$ \beta = \frac{1}{\sqrt{1 - v^2/c^2}}, $$

and φ is an as yet unknown function of v. If no assumption whatever be made
as to the initial position of the moving system and as to the zero point of τ, an
additive constant is to be placed on the right side of each of these equations.
We now have to prove that any ray of light, measured in the moving system,
is propagated with the velocity c, if, as we have assumed, this is the case in the
stationary system; for we have not as yet furnished the proof that the principle
of the constancy of the velocity of light is compatible with the principle of
relativity.
At the time t = τ = 0, when the origin of the co-ordinates is common to the
two systems, let a spherical wave be emitted therefrom, and be propagated with
the velocity c in system K. If (x, y, z) be a point just attained by this wave,
then

$$ x^2 + y^2 + z^2 = c^2 t^2. $$

Transforming this equation with the aid of our equations of transformation
we obtain after a simple calculation

$$ \xi^2 + \eta^2 + \zeta^2 = c^2 \tau^2. $$

The wave under consideration is therefore no less a spherical wave with
velocity of propagation c when viewed in the moving system. This shows that
our two fundamental principles are compatible.5
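
The "simple calculation" can be spot-checked numerically. In the sketch below the function names `transform` and `wavefront_residual`, the sample event, and the trial value of φ are all assumptions made for illustration; φ is passed as an arbitrary number because it has not yet been determined at this point of the argument.

```python
import math


def transform(x, y, z, t, v, c, phi=1.0):
    """Transformation equations of this section with the still-unknown
    factor phi(v) supplied as a plain number."""
    beta = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    tau = phi * beta * (t - v * x / c**2)
    xi = phi * beta * (x - v * t)
    eta = phi * y
    zeta = phi * z
    return xi, eta, zeta, tau


def wavefront_residual(x, y, z, t, c):
    """x^2 + y^2 + z^2 - c^2 t^2, which vanishes on the spherical wave."""
    return x**2 + y**2 + z**2 - (c * t)**2


c = 1.0
event = (1.2, 0.0, 1.6, 2.0)   # a point reached by the wave at t = 2
print(wavefront_residual(*event, c))

xi, eta, zeta, tau = transform(*event, v=0.5, c=c, phi=1.3)
print(wavefront_residual(xi, eta, zeta, tau, c))   # also zero up to rounding
```

The residual vanishes in the moving system whatever value is tried for φ, which is consistent with the text fixing φ only afterwards by a separate argument.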
In the equations of transformation which have been developed there enters
an unknown function φ of v, which we will now determine.
For this purpose we introduce a third system of co-ordinates K′, which
relatively to the system k is in a state of parallel translatory motion parallel to
the axis of Ξ,† such that the origin of co-ordinates of system K′ moves with
velocity −v on the axis of Ξ. At the time t = 0 let all three origins coincide, and
when t = x = y = z = 0 let the time t′ of the system K′ be zero. We call the
co-ordinates, measured in the system K′, x′, y′, z′, and by a twofold application
of our equations of transformation we obtain

$$
\begin{aligned}
t' &= \varphi(-v)\beta(-v)(\tau + v\xi/c^2) = \varphi(v)\varphi(-v)t,\\
x' &= \varphi(-v)\beta(-v)(\xi + v\tau) = \varphi(v)\varphi(-v)x,\\
y' &= \varphi(-v)\eta = \varphi(v)\varphi(-v)y,\\
z' &= \varphi(-v)\zeta = \varphi(v)\varphi(-v)z.
\end{aligned}
$$

Since the relations between x′, y′, z′ and x, y, z do not contain the time t,
the systems K and K′ are at rest with respect to one another, and it is clear that
the transformation from K to K′ must be the identical transformation. Thus

$$ \varphi(v)\varphi(-v) = 1. $$

5 The equations of the Lorentz transformation may be more simply deduced directly from
the condition that in virtue of those equations the relation $x^2 + y^2 + z^2 = c^2 t^2$ shall have as
its consequence the second relation $\xi^2 + \eta^2 + \zeta^2 = c^2 \tau^2$.
† Editor’s note: In Einstein’s original paper, the symbols (Ξ, H, Z) for the co-ordinates of the
moving system k were introduced without explicitly defining them. In the 1923 English translation,
(X, Y, Z) were used, creating an ambiguity between X co-ordinates in the fixed system K and the
parallel axis in moving system k. Here and in subsequent references we use Ξ when referring to the
axis of system k along which the system is translating with respect to K. In addition, the reference
to system K′ later in this sentence was incorrectly given as “k” in the 1923 English translation.
We now inquire into the signification of φ(v). We give our attention to that
part of the axis of Y of system k which lies between ξ = 0, η = 0, ζ = 0 and
ξ = 0, η = l, ζ = 0. This part of the axis of Y is a rod moving perpendicularly
to its axis with velocity v relatively to system K. Its ends possess in K the
co-ordinates

$$ x_1 = vt, \quad y_1 = \frac{l}{\varphi(v)}, \quad z_1 = 0 $$

and

$$ x_2 = vt, \quad y_2 = 0, \quad z_2 = 0. $$
The length of the rod measured in K is therefore l/φ(v); and this gives us the
meaning of the function φ(v). From reasons of symmetry it is now evident that
the length of a given rod moving perpendicularly to its axis, measured in the
stationary system, must depend only on the velocity and not on the direction
and the sense of the motion. The length of the moving rod measured in the
stationary system does not change, therefore, if v and −v are interchanged.
Hence follows that l/φ(v) = l/φ(−v), or

φ(v) = φ(−v).

It follows from this relation and the one previously found that φ(v) = 1, so that
the transformation equations which have been found become

$$
\begin{aligned}
\tau &= \beta(t - vx/c^2),\\
\xi &= \beta(x - vt),\\
\eta &= y,\\
\zeta &= z,
\end{aligned}
$$

where

$$ \beta = 1/\sqrt{1 - v^2/c^2}. $$
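
As a quick consistency check on the final equations, one can apply the transformation with velocity v and then again with velocity −v, which by the argument of this section must return the original event. The helper name `lorentz`, the units with c = 1, and the sample event are assumptions of this sketch.

```python
import math


def lorentz(x, y, z, t, v, c=1.0):
    """Final transformation of this section (phi = 1).
    Returns (xi, eta, zeta, tau) for an event (x, y, z, t)."""
    beta = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return beta * (x - v * t), y, z, beta * (t - v * x / c**2)


event = (3.0, -1.0, 2.0, 5.0)            # (x, y, z, t), arbitrary sample
moving = lorentz(*event, v=0.6)          # K -> k
back = lorentz(*moving, v=-0.6)          # k -> K', which is at rest in K

print(moving)
print(back)
print(all(math.isclose(p, q, abs_tol=1e-12) for p, q in zip(event, back)))
```

The round trip reproducing the input is the numerical counterpart of the statement that the transformation from K to K′ is the identity.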

§ 4. Physical Meaning of the Equations Obtained in
Respect to Moving Rigid Bodies and Moving Clocks
We envisage a rigid sphere6 of radius R, at rest relatively to the moving
system k, and with its centre at the origin of co-ordinates of k. The equation of
the surface of this sphere moving relatively to the system K with velocity v is

$$ \xi^2 + \eta^2 + \zeta^2 = R^2. $$

6 That is, a body possessing spherical form when examined at rest.
The equation of this surface expressed in x, y, z at the time t = 0 is

$$ \frac{x^2}{\left(\sqrt{1 - v^2/c^2}\right)^2} + y^2 + z^2 = R^2. $$

A rigid body which, measured in a state of rest, has the form of a sphere,
therefore has in a state of motion, viewed from the stationary system, the
form of an ellipsoid of revolution with the axes

$$ R\sqrt{1 - v^2/c^2}, \quad R, \quad R. $$

Thus, whereas the Y and Z dimensions of the sphere (and therefore of every
rigid body of no matter what form) do not appear modified by the motion, the
X dimension appears shortened in the ratio $1 : \sqrt{1 - v^2/c^2}$, i.e. the greater the
value of v, the greater the shortening. For v = c all moving objects, viewed from
the “stationary” system, shrivel up into plane figures.† For velocities greater
than that of light our deliberations become meaningless; we shall, however, find
in what follows, that the velocity of light in our theory plays the part, physically,
of an infinitely great velocity.
It is clear that the same results hold good of bodies at rest in the “stationary”
system, viewed from a system in uniform motion.
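
The shortening along X is easy to tabulate. The following sketch simply evaluates the three semi-axes given above; the function name `ellipsoid_axes` and the units with c = 1 are assumptions made for illustration.

```python
import math


def ellipsoid_axes(radius, v, c=1.0):
    """Semi-axes, as judged from the stationary system, of a body that is
    a sphere of the given radius when at rest and moves along X at v."""
    return radius * math.sqrt(1.0 - v**2 / c**2), radius, radius


for v in (0.0, 0.1, 0.6, 0.9, 1.0):
    ax, ay, az = ellipsoid_axes(1.0, v)
    print(f"v/c = {v:.1f}: X axis = {ax:.4f}, Y axis = {ay:.4f}, Z axis = {az:.4f}")
```

At v = c the X axis shrinks to zero while Y and Z are untouched, which is the "plane figure" limit mentioned in the text.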
Further, we imagine one of the clocks which are qualified to mark the time
t when at rest relatively to the stationary system, and the time τ when at rest
relatively to the moving system, to be located at the origin of the co-ordinates
of k, and so adjusted that it marks the time τ . What is the rate of this clock,
when viewed from the stationary system?
Between the quantities x, t, and τ, which refer to the position of the clock,
we have, evidently, x = vt and

$$ \tau = \frac{1}{\sqrt{1 - v^2/c^2}}\left(t - vx/c^2\right). $$

Therefore,

$$ \tau = t\sqrt{1 - v^2/c^2} = t - \left(1 - \sqrt{1 - v^2/c^2}\right)t $$

whence it follows that the time marked by the clock (viewed in the stationary
system) is slow by $1 - \sqrt{1 - v^2/c^2}$ seconds per second, or, neglecting
magnitudes of fourth and higher order, by $\tfrac{1}{2}v^2/c^2$.
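
To see how good the second-order approximation is, the exact lag and the approximate lag can be compared for a few speeds. The names `lag_exact` and `lag_second_order` and the chosen speeds are assumptions of this sketch.

```python
import math


def lag_exact(v, c=1.0):
    """Seconds lost per second of stationary time: 1 - sqrt(1 - v^2/c^2)."""
    return 1.0 - math.sqrt(1.0 - (v / c)**2)


def lag_second_order(v, c=1.0):
    """Approximation (1/2) * v^2/c^2, dropping fourth and higher orders."""
    return 0.5 * (v / c)**2


for v in (0.01, 0.1, 0.5, 0.9):
    exact = lag_exact(v)
    approx = lag_second_order(v)
    print(f"v/c = {v}: exact = {exact:.6e}, approx = {approx:.6e}, "
          f"difference = {exact - approx:.2e}")
```

The difference between the two grows roughly like the fourth power of v/c, in line with the order of the neglected terms.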
From this there ensues the following peculiar consequence. If at the points A
and B of K there are stationary clocks which, viewed in the stationary system,
are synchronous; and if the clock at A is moved with the velocity v along the
line AB to B, then on its arrival at B the two clocks no longer synchronize,
but the clock moved from A to B lags behind the other which has remained at
B by $\tfrac{1}{2}tv^2/c^2$ (up to magnitudes of fourth and higher order), t being the time
occupied in the journey from A to B.
† Editor’s note: In the 1923 English translation, this phrase was erroneously translated as “plain
figures”. I have used the correct “plane figures” in this edition.
It is at once apparent that this result still holds good if the clock moves from
A to B in any polygonal line, and also when the points A and B coincide.
If we assume that the result proved for a polygonal line is also valid for a
continuously curved line, we arrive at this result: If one of two synchronous
clocks at A is moved in a closed curve with constant velocity until it returns to
A, the journey lasting t seconds, then by the clock which has remained at rest
the travelled clock on its arrival at A will be $\tfrac{1}{2}tv^2/c^2$ second slow. Thence we
conclude that a balance-clock7 at the equator must go more slowly, by a very
small amount, than a precisely similar clock situated at one of the poles under
otherwise identical conditions.

§ 5. The Composition of Velocities


In the system k moving along the axis of X of the system K with velocity v,
let a point move in accordance with the equations

$$ \xi = w_\xi \tau, \quad \eta = w_\eta \tau, \quad \zeta = 0, $$

where $w_\xi$ and $w_\eta$ denote constants.
Required: the motion of the point relatively to the system K. If with the help
of the equations of transformation developed in § 3 we introduce the quantities
x, y, z, t into the equations of motion of the point, we obtain

$$
\begin{aligned}
x &= \frac{w_\xi + v}{1 + v w_\xi/c^2}\,t,\\
y &= \frac{\sqrt{1 - v^2/c^2}}{1 + v w_\xi/c^2}\,w_\eta t,\\
z &= 0.
\end{aligned}
$$

Thus the law of the parallelogram of velocities is valid according to our
theory only to a first approximation. We set

$$
\begin{aligned}
V^2 &= \left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2,\\
w^2 &= w_\xi^2 + w_\eta^2,\\
a &= \tan^{-1}(w_\eta/w_\xi),
\end{aligned}
$$ †

a is then to be looked upon as the angle between the velocities v and w. After
a simple calculation we obtain

$$ V = \frac{\sqrt{(v^2 + w^2 + 2vw\cos a) - (vw\sin a/c)^2}}{1 + vw\cos a/c^2}. $$

7 Not a pendulum-clock, which is physically a system to which the Earth belongs. This
case had to be excluded.
† Editor’s note: This equation was incorrectly given in Einstein’s original paper and the 1923
English translation as $a = \tan^{-1} w_y/w_x$.
It is worthy of remark that v and w enter into the expression for the resultant
velocity in a symmetrical manner. If w also has the direction of the axis of X,
we get
$$ V = \frac{v + w}{1 + vw/c^2}. $$

It follows from this equation that from a composition of two velocities which
are less than c, there always results a velocity less than c. For if we set
v = c − κ, w = c − λ, κ and λ being positive and less than c, then

$$ V = c\,\frac{2c - \kappa - \lambda}{2c - \kappa - \lambda + \kappa\lambda/c} < c. $$

It follows, further, that the velocity of light c cannot be altered by composition
with a velocity less than that of light. For this case we obtain

$$ V = \frac{c + w}{1 + w/c} = c. $$
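
Both statements, that two velocities below c never compose to c or more and that c composed with any smaller velocity stays c, can be tried out directly. The function name `compose` and the units with c = 1 are assumptions of this small sketch, which covers only the collinear case.

```python
def compose(v, w, c=1.0):
    """Resultant of two velocities along the same axis: (v + w) / (1 + v*w/c^2)."""
    return (v + w) / (1.0 + v * w / c**2)


# Two sub-light velocities always give a sub-light resultant.
for v, w in ((0.5, 0.5), (0.9, 0.9), (0.99, 0.99)):
    print(f"{v} composed with {w}: {compose(v, w):.6f}")

# Composing the velocity of light with a smaller velocity leaves it unchanged.
for w in (0.0, 0.3, 0.9):
    print(f"c composed with {w}: {compose(1.0, w):.6f}")
```

The formula is symmetric in v and w, so `compose(v, w)` and `compose(w, v)` agree, matching the remark in the text.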
We might also have obtained the formula for V, for the case when v and w have
the same direction, by compounding two transformations in accordance with §
3. If in addition to the systems K and k figuring in § 3 we introduce still another
system of co-ordinates k′ moving parallel to k, its initial point moving on the
axis of Ξ† with the velocity w, we obtain equations between the quantities x,
y, z, t and the corresponding quantities of k′, which differ from the equations
found in § 3 only in that the place of “v” is taken by the quantity

$$ \frac{v + w}{1 + vw/c^2}; $$

from which we see that such parallel transformations, necessarily, form a group.
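
The closure property can be checked numerically: transforming first with v and then with w should give the same result as a single transformation with the composed velocity. The names `boost` and `compose`, the restriction to the x and t co-ordinates, and the sample numbers are assumptions of this sketch.

```python
import math


def boost(x, t, v, c=1.0):
    """Transformation of section 3 restricted to (x, t); returns (xi, tau)."""
    beta = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return beta * (x - v * t), beta * (t - v * x / c**2)


def compose(v, w, c=1.0):
    """Composed velocity (v + w) / (1 + v*w/c^2)."""
    return (v + w) / (1.0 + v * w / c**2)


x, t = 2.0, 3.0    # arbitrary event in K
v, w = 0.5, 0.3    # velocity of k in K, and of k' in k

two_steps = boost(*boost(x, t, v), w)   # K -> k -> k'
one_step = boost(x, t, compose(v, w))   # K -> k' directly

print(two_steps)
print(one_step)
```

The two printed pairs agree up to rounding, which is what it means for successive parallel transformations to be again a transformation of the same family.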
We have now deduced the requisite laws of the theory of kinematics cor-
responding to our two principles, and we proceed to show their application to
electrodynamics.

II. ELECTRODYNAMICAL PART


§ 6. Transformation of the Maxwell-Hertz Equations for
Empty Space. On the Nature of the Electromotive Forces
Occurring in a Magnetic Field During Motion
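Let the Maxwell-Hertz equations for empty space hold good for the
stationary system K, so that we have

$$
\begin{aligned}
\frac{1}{c}\frac{\partial X}{\partial t} &= \frac{\partial N}{\partial y} - \frac{\partial M}{\partial z}, &
\frac{1}{c}\frac{\partial L}{\partial t} &= \frac{\partial Y}{\partial z} - \frac{\partial Z}{\partial y},\\
\frac{1}{c}\frac{\partial Y}{\partial t} &= \frac{\partial L}{\partial z} - \frac{\partial N}{\partial x}, &
\frac{1}{c}\frac{\partial M}{\partial t} &= \frac{\partial Z}{\partial x} - \frac{\partial X}{\partial z},\\
\frac{1}{c}\frac{\partial Z}{\partial t} &= \frac{\partial M}{\partial x} - \frac{\partial L}{\partial y}, &
\frac{1}{c}\frac{\partial N}{\partial t} &= \frac{\partial X}{\partial y} - \frac{\partial Y}{\partial x},
\end{aligned}
$$

where (X, Y, Z) denotes the vector of the electric force, and (L, M, N) that of
the magnetic force.
† Editor’s note: “X” in the 1923 English translation.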
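If we apply to these equations the transformation developed in § 3, by
referring the electromagnetic processes to the system of co-ordinates there
introduced, moving with the velocity v, we obtain the equations

$$
\begin{aligned}
\frac{1}{c}\frac{\partial X}{\partial \tau} &= \frac{\partial}{\partial \eta}\left\{\beta\left(N - \frac{v}{c}Y\right)\right\} - \frac{\partial}{\partial \zeta}\left\{\beta\left(M + \frac{v}{c}Z\right)\right\},\\
\frac{1}{c}\frac{\partial}{\partial \tau}\left\{\beta\left(Y - \frac{v}{c}N\right)\right\} &= \frac{\partial L}{\partial \zeta} - \frac{\partial}{\partial \xi}\left\{\beta\left(N - \frac{v}{c}Y\right)\right\},\\
\frac{1}{c}\frac{\partial}{\partial \tau}\left\{\beta\left(Z + \frac{v}{c}M\right)\right\} &= \frac{\partial}{\partial \xi}\left\{\beta\left(M + \frac{v}{c}Z\right)\right\} - \frac{\partial L}{\partial \eta},\\
\frac{1}{c}\frac{\partial L}{\partial \tau} &= \frac{\partial}{\partial \zeta}\left\{\beta\left(Y - \frac{v}{c}N\right)\right\} - \frac{\partial}{\partial \eta}\left\{\beta\left(Z + \frac{v}{c}M\right)\right\},\\
\frac{1}{c}\frac{\partial}{\partial \tau}\left\{\beta\left(M + \frac{v}{c}Z\right)\right\} &= \frac{\partial}{\partial \xi}\left\{\beta\left(Z + \frac{v}{c}M\right)\right\} - \frac{\partial X}{\partial \zeta},\\
\frac{1}{c}\frac{\partial}{\partial \tau}\left\{\beta\left(N - \frac{v}{c}Y\right)\right\} &= \frac{\partial X}{\partial \eta} - \frac{\partial}{\partial \xi}\left\{\beta\left(Y - \frac{v}{c}N\right)\right\},
\end{aligned}
$$

where

$$ \beta = 1/\sqrt{1 - v^2/c^2}. $$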

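Now the principle of relativity requires that if the Maxwell-Hertz equations
for empty space hold good in system K, they also hold good in system k; that
is to say that the vectors of the electric and the magnetic force, $(X', Y', Z')$
and $(L', M', N')$, of the moving system k, which are defined by their
ponderomotive effects on electric or magnetic masses respectively, satisfy the
following equations:
$$\begin{aligned}
\frac{1}{c}\frac{\partial X'}{\partial \tau} &= \frac{\partial N'}{\partial \eta} - \frac{\partial M'}{\partial \zeta}, &
\frac{1}{c}\frac{\partial L'}{\partial \tau} &= \frac{\partial Y'}{\partial \zeta} - \frac{\partial Z'}{\partial \eta},\\
\frac{1}{c}\frac{\partial Y'}{\partial \tau} &= \frac{\partial L'}{\partial \zeta} - \frac{\partial N'}{\partial \xi}, &
\frac{1}{c}\frac{\partial M'}{\partial \tau} &= \frac{\partial Z'}{\partial \xi} - \frac{\partial X'}{\partial \zeta},\\
\frac{1}{c}\frac{\partial Z'}{\partial \tau} &= \frac{\partial M'}{\partial \xi} - \frac{\partial L'}{\partial \eta}, &
\frac{1}{c}\frac{\partial N'}{\partial \tau} &= \frac{\partial X'}{\partial \eta} - \frac{\partial Y'}{\partial \xi}.
\end{aligned}$$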

Evidently the two systems of equations found for system k must express
exactly the same thing, since both systems of equations are equivalent to the
Maxwell-Hertz equations for system K. Since, further, the equations of the two
systems agree, with the exception of the symbols for the vectors, it follows that
the functions occurring in the systems of equations at corresponding places must
agree, with the exception of a factor ψ(v), which is common for all functions
of the one system of equations, and is independent of ξ, η, ζ and τ but depends
upon v. Thus we have the relations
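$$\begin{aligned}
X' &= \psi(v)X, & L' &= \psi(v)L,\\
Y' &= \psi(v)\beta\left(Y - \frac{v}{c}N\right), & M' &= \psi(v)\beta\left(M + \frac{v}{c}Z\right),\\
Z' &= \psi(v)\beta\left(Z + \frac{v}{c}M\right), & N' &= \psi(v)\beta\left(N - \frac{v}{c}Y\right).
\end{aligned}$$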
If we now form the reciprocal of this system of equations, firstly by solving
the equations just obtained, and secondly by applying the equations to the
inverse transformation (from k to K), which is characterized by the velocity −v,
it follows, when we consider that the two systems of equations thus obtained
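must be identical, that ψ(v)ψ(−v) = 1. Further, from reasons of symmetry⁸
ψ(v) = ψ(−v), and therefore
$$\psi(v) = 1,$$
and our equations assume the form
$$\begin{aligned}
X' &= X, & L' &= L,\\
Y' &= \beta\left(Y - \frac{v}{c}N\right), & M' &= \beta\left(M + \frac{v}{c}Z\right),\\
Z' &= \beta\left(Z + \frac{v}{c}M\right), & N' &= \beta\left(N - \frac{v}{c}Y\right).
\end{aligned}$$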
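As an illustration of these transformation equations, here is a short Python sketch (an addition, not from the paper). The function name `transform_fields`, the unit choice c = 1 and the sample field values are assumptions. It applies the transformation with velocity v and then with −v, which should return the original fields, in line with ψ(v)ψ(−v) = 1.

```python
import math


def transform_fields(E, B, v, c=1.0):
    """Map (X, Y, Z) and (L, M, N) in K to the primed fields in k.

    k moves along the x-axis of K with velocity v. beta follows the paper's
    usage, beta = 1/sqrt(1 - v^2/c^2) (written gamma in modern notation).
    """
    X, Y, Z = E
    L, M, N = B
    beta = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    E_prime = (X, beta * (Y - v / c * N), beta * (Z + v / c * M))
    B_prime = (L, beta * (M + v / c * Z), beta * (N - v / c * Y))
    return E_prime, B_prime


E, B = (1.0, 2.0, -0.5), (0.3, -1.2, 0.7)   # arbitrary sample field values
E1, B1 = transform_fields(E, B, 0.6)        # from K to k
E2, B2 = transform_fields(E1, B1, -0.6)     # back from k to K (velocity -v)
```

The components along the direction of motion are unchanged, while the transverse electric and magnetic components mix; this is the sense in which the two forces do not exist independently of the state of motion of the co-ordinate system.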
As to the interpretation of these equations we make the following remarks: Let
a point charge of electricity have the magnitude “one” when measured in the
stationary system K, i.e. let it when at rest in the stationary system exert a
force of one dyne upon an equal quantity of electricity at a distance of one cm.
By the principle of relativity this electric charge is also of the magnitude “one”
when measured in the moving system. If this quantity of electricity is at rest
relatively to the stationary system, then by definition the vector (X, Y, Z) is
equal to the force acting upon it. If the quantity of electricity is at rest relatively
to the moving system (at least at the relevant instant), then the force acting
upon it, measured in the moving system, is equal to the vector $(X', Y', Z')$.
Consequently the first three equations above allow themselves to be clothed in
words in the two following ways:—
1. If a unit electric point charge is in motion in an electromagnetic field,
there acts upon it, in addition to the electric force, an “electromotive force”
which, if we neglect the terms multiplied by the second and higher powers of
v/c, is equal to the vector-product of the velocity of the charge and the magnetic
force, divided by the velocity of light. (Old manner of expression.)
⁸ If, for example, X = Y = Z = L = M = 0, and N ≠ 0, then from reasons of symmetry it is clear
that when v changes sign without changing its numerical value, $Y'$ must also change sign
without changing its numerical value.

2. If a unit electric point charge is in motion in an electromagnetic field,
the force acting upon it is equal to the electric force which is present at the
locality of the charge, and which we ascertain by transformation of the field to
a system of co-ordinates at rest relatively to the electrical charge. (New manner
of expression.)
The analogy holds with “magnetomotive forces.” We see that electromotive
force plays in the developed theory merely the part of an auxiliary concept,
which owes its introduction to the circumstance that electric and magnetic forces
do not exist independently of the state of motion of the system of co-ordinates.
Furthermore it is clear that the asymmetry mentioned in the introduction
as arising when we consider the currents produced by the relative motion of a
magnet and a conductor, now disappears. Moreover, questions as to the “seat”
of electrodynamic electromotive forces (unipolar machines) now have no point.

§ 7. Theory of Doppler’s Principle and of Aberration


In the system K, very far from the origin of co-ordinates, let there be a
source of electrodynamic waves, which in a part of space containing the origin
of co-ordinates may be represented to a sufficient degree of approximation by
the equations

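$$\begin{aligned}
X &= X_0 \sin\Phi, & L &= L_0 \sin\Phi,\\
Y &= Y_0 \sin\Phi, & M &= M_0 \sin\Phi,\\
Z &= Z_0 \sin\Phi, & N &= N_0 \sin\Phi,
\end{aligned}$$

where
$$\Phi = \omega\left\{t - \frac{1}{c}(lx + my + nz)\right\}.$$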

Here $(X_0, Y_0, Z_0)$ and $(L_0, M_0, N_0)$ are the vectors defining the amplitude of
the wave-train, and l, m, n the direction-cosines of the wave-normals. We wish
to know the constitution of these waves, when they are examined by an observer
at rest in the moving system k.
Applying the equations of transformation found in § 6 for electric and mag-
netic forces, and those found in § 3 for the co-ordinates and the time, we obtain
directly

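$$\begin{aligned}
X' &= X_0 \sin\Phi', & L' &= L_0 \sin\Phi',\\
Y' &= \beta(Y_0 - vN_0/c)\sin\Phi', & M' &= \beta(M_0 + vZ_0/c)\sin\Phi',\\
Z' &= \beta(Z_0 + vM_0/c)\sin\Phi', & N' &= \beta(N_0 - vY_0/c)\sin\Phi',\\
\Phi' &= \omega'\left\{\tau - \frac{1}{c}(l'\xi + m'\eta + n'\zeta)\right\}
\end{aligned}$$

where
$$\begin{aligned}
\omega' &= \omega\beta(1 - lv/c),\\
l' &= \frac{l - v/c}{1 - lv/c},\\
m' &= \frac{m}{\beta(1 - lv/c)},\\
n' &= \frac{n}{\beta(1 - lv/c)}.
\end{aligned}$$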

From the equation for $\omega'$ it follows that if an observer is moving with velocity
v relatively to an infinitely distant source of light of frequency ν, in such a way
that the connecting line “source-observer” makes the angle φ with the velocity
of the observer referred to a system of co-ordinates which is at rest relatively
to the source of light, the frequency $\nu'$ of the light perceived by the observer is
given by the equation
$$\nu' = \nu\,\frac{1 - \cos\phi\cdot v/c}{\sqrt{1 - v^2/c^2}}.$$

This is Doppler’s principle for any velocities whatever. When φ = 0 the equation
assumes the perspicuous form
$$\nu' = \nu\sqrt{\frac{1 - v/c}{1 + v/c}}.$$

We see that, in contrast with the customary view, when v = −c, $\nu' = \infty$.
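The Doppler formula is easy to evaluate directly. The Python sketch below is an illustrative addition; the function name `doppler`, the unit choice c = 1 and the sample values are assumptions. With φ = 0 and v = 0.6c the general formula and the special form both give ν′ = ν/2, and reversing the sign of v gives ν′ = 2ν.

```python
import math


def doppler(nu, v, phi, c=1.0):
    """Frequency perceived by an observer moving with velocity v; phi as defined above."""
    return nu * (1.0 - math.cos(phi) * v / c) / math.sqrt(1.0 - v**2 / c**2)


same_direction = doppler(1.0, 0.6, 0.0)       # phi = 0, v > 0
opposite_direction = doppler(1.0, -0.6, 0.0)  # phi = 0, v < 0
special_form = math.sqrt((1.0 - 0.6) / (1.0 + 0.6))  # the phi = 0 formula for v = 0.6c
```

As v approaches −c the denominator tends to zero while the numerator stays finite, which is the divergence noted in the text.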


If we call the angle between the wave-normal (direction of the ray) in the
moving system and the connecting line “source-observer” $\phi'$, the equation for
$\phi'$† assumes the form
$$\cos\phi' = \frac{\cos\phi - v/c}{1 - \cos\phi\cdot v/c}.$$
This equation expresses the law of aberration in its most general form. If
$\phi = \frac{1}{2}\pi$, the equation becomes simply
$$\cos\phi' = -v/c.$$
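A small numerical check of the aberration law, written as a Python sketch; the function name `aberration` and the unit choice c = 1 are assumptions of this illustration, not part of the paper.

```python
import math


def aberration(phi, v, c=1.0):
    """Angle phi' in the moving system for a ray at angle phi in the stationary system."""
    cos_phi_prime = (math.cos(phi) - v / c) / (1.0 - math.cos(phi) * v / c)
    # Clamp against rounding so that acos always receives a value in [-1, 1].
    cos_phi_prime = max(-1.0, min(1.0, cos_phi_prime))
    return math.acos(cos_phi_prime)


phi_prime = aberration(math.pi / 2, 0.5)   # expect cos(phi') = -v/c = -0.5
unchanged = aberration(0.0, 0.5)           # a ray along the motion keeps its direction
```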

We still have to find the amplitude of the waves, as it appears in the moving
system. If we call the amplitude of the electric or magnetic force A or $A'$
respectively, accordingly as it is measured in the stationary system or in the
moving system, we obtain
$$A'^2 = A^2\,\frac{(1 - \cos\phi\cdot v/c)^2}{1 - v^2/c^2}$$
which equation, if φ = 0, simplifies into
$$A'^2 = A^2\,\frac{1 - v/c}{1 + v/c}.$$
† Editor’s note: Erroneously given as “$l'$” in the 1923 English translation, propagating an error,
despite a change in symbols, from the original 1905 paper.

It follows from these results that to an observer approaching a source of light
with the velocity c, this source of light must appear of infinite intensity.

§ 8. Transformation of the Energy of Light Rays. Theory
of the Pressure of Radiation Exerted on Perfect Reflectors
Since $A^2/8\pi$ equals the energy of light per unit of volume, we have to regard
$A'^2/8\pi$, by the principle of relativity, as the energy of light in the moving system.
Thus $A'^2/A^2$ would be the ratio of the “measured in motion” to the “measured
at rest” energy of a given light complex, if the volume of a light complex were
the same, whether measured in K or in k. But this is not the case. If l, m, n are
the direction-cosines of the wave-normals of the light in the stationary system,
no energy passes through the surface elements of a spherical surface moving
with the velocity of light:—

$$(x - lct)^2 + (y - mct)^2 + (z - nct)^2 = R^2.$$

We may therefore say that this surface permanently encloses the same light
complex. We inquire as to the quantity of energy enclosed by this surface,
viewed in system k, that is, as to the energy of the light complex relatively to
the system k.
The spherical surface—viewed in the moving system—is an ellipsoidal sur-
face, the equation for which, at the time τ = 0, is

$$(\beta\xi - l\beta\xi v/c)^2 + (\eta - m\beta\xi v/c)^2 + (\zeta - n\beta\xi v/c)^2 = R^2.$$

If S is the volume of the sphere, and $S'$ that of this ellipsoid, then by a simple
calculation
$$\frac{S'}{S} = \frac{\sqrt{1 - v^2/c^2}}{1 - \cos\phi\cdot v/c}.$$

Thus, if we call the light energy enclosed by this surface E when it is measured in
the stationary system, and $E'$ when measured in the moving system, we obtain
$$\frac{E'}{E} = \frac{A'^2 S'}{A^2 S} = \frac{1 - \cos\phi\cdot v/c}{\sqrt{1 - v^2/c^2}},$$

and this formula, when φ = 0, simplifies into
$$\frac{E'}{E} = \sqrt{\frac{1 - v/c}{1 + v/c}}.$$

It is remarkable that the energy and the frequency of a light complex vary
with the state of motion of the observer in accordance with the same law.
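That the two quantities obey the same law can be confirmed numerically. In this Python sketch (an illustrative addition; the function names, the unit choice c = 1 and the sample angles and velocities are assumptions) the energy ratio is built from the amplitude ratio of § 7 and the volume ratio given above, and then compared with the Doppler factor.

```python
import math


def doppler_factor(cos_phi, v, c=1.0):
    """Frequency ratio nu'/nu from § 7."""
    return (1.0 - cos_phi * v / c) / math.sqrt(1.0 - v**2 / c**2)


def energy_ratio(cos_phi, v, c=1.0):
    """E'/E = (A'^2 / A^2) * (S'/S), assembled from the two ratios in the text."""
    amplitude_sq = (1.0 - cos_phi * v / c) ** 2 / (1.0 - v**2 / c**2)
    volume = math.sqrt(1.0 - v**2 / c**2) / (1.0 - cos_phi * v / c)
    return amplitude_sq * volume


samples = [(math.cos(a), v) for a in (0.0, 0.7, 2.0) for v in (-0.8, 0.1, 0.5)]
max_gap = max(abs(energy_ratio(cp, v) - doppler_factor(cp, v)) for cp, v in samples)
```

The squared amplitude alone would give the square of the Doppler factor; it is the change in the volume of the light complex that reduces the exponent to one.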
Now let the co-ordinate plane ξ = 0 be a perfectly reflecting surface, at
which the plane waves considered in § 7 are reflected. We seek for the pressure
of light exerted on the reflecting surface, and for the direction, frequency, and
intensity of the light after reflexion.
Let the incidental light be defined by the quantities A, cos φ, ν (referred to
system K). Viewed from k the corresponding quantities are

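$$\begin{aligned}
A' &= A\,\frac{1 - \cos\phi\cdot v/c}{\sqrt{1 - v^2/c^2}},\\
\cos\phi' &= \frac{\cos\phi - v/c}{1 - \cos\phi\cdot v/c},\\
\nu' &= \nu\,\frac{1 - \cos\phi\cdot v/c}{\sqrt{1 - v^2/c^2}}.
\end{aligned}$$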

For the reflected light, referring the process to system k, we obtain

$$\begin{aligned}
A'' &= A',\\
\cos\phi'' &= -\cos\phi',\\
\nu'' &= \nu'.
\end{aligned}$$

Finally, by transforming back to the stationary system K, we obtain for the
reflected light
$$\begin{aligned}
A''' &= A''\,\frac{1 + \cos\phi''\cdot v/c}{\sqrt{1 - v^2/c^2}} = A\,\frac{1 - 2\cos\phi\cdot v/c + v^2/c^2}{1 - v^2/c^2},\\
\cos\phi''' &= \frac{\cos\phi'' + v/c}{1 + \cos\phi''\cdot v/c} = -\frac{(1 + v^2/c^2)\cos\phi - 2v/c}{1 - 2\cos\phi\cdot v/c + v^2/c^2},\\
\nu''' &= \nu''\,\frac{1 + \cos\phi''\cdot v/c}{\sqrt{1 - v^2/c^2}} = \nu\,\frac{1 - 2\cos\phi\cdot v/c + v^2/c^2}{1 - v^2/c^2}.
\end{aligned}$$

The energy (measured in the stationary system) which is incident upon unit
area of the mirror in unit time is evidently $A^2(c\cos\phi - v)/8\pi$. The energy leaving
the unit of surface of the mirror in the unit of time is $A'''^2(-c\cos\phi''' + v)/8\pi$.
The difference of these two expressions is, by the principle of energy, the work
done by the pressure of light in the unit of time. If we set down this work as
equal to the product Pv, where P is the pressure of light, we obtain
$$P = 2\cdot\frac{A^2}{8\pi}\,\frac{(\cos\phi - v/c)^2}{1 - v^2/c^2}.$$

In agreement with experiment and with other theories, we obtain to a first
approximation

$$P = 2\cdot\frac{A^2}{8\pi}\cos^2\phi.$$
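The energy-balance argument for the pressure can be retraced numerically. The Python sketch below is an illustrative addition; the function names, the unit choice c = 1 and the sample values are assumptions. It computes the reflected amplitude and direction from the formulas of this section, forms (incident energy − leaving energy)/v, and compares the result with the closed expression for P and with its first-order limit.

```python
import math


def reflected(A, cos_phi, v, c=1.0):
    """Amplitude and direction cosine of the reflected light, referred to system K."""
    b = v / c
    D = 1.0 - 2.0 * cos_phi * b + b * b
    A3 = A * D / (1.0 - b * b)
    cos3 = -((1.0 + b * b) * cos_phi - 2.0 * b) / D
    return A3, cos3


def pressure_closed_form(A, cos_phi, v, c=1.0):
    """P = 2 (A^2 / 8 pi) (cos phi - v/c)^2 / (1 - v^2/c^2)."""
    b = v / c
    return 2.0 * A**2 / (8.0 * math.pi) * (cos_phi - b) ** 2 / (1.0 - b * b)


def pressure_energy_balance(A, cos_phi, v, c=1.0):
    """Work per unit time divided by v, from incident minus leaving energy."""
    A3, cos3 = reflected(A, cos_phi, v, c)
    incident = A**2 * (c * cos_phi - v) / (8.0 * math.pi)
    leaving = A3**2 * (-c * cos3 + v) / (8.0 * math.pi)
    return (incident - leaving) / v


P_balance = pressure_energy_balance(1.0, 0.8, 0.3)
P_closed = pressure_closed_form(1.0, 0.8, 0.3)
P_first_order = 2.0 / (8.0 * math.pi) * 0.8**2   # 2 (A^2 / 8 pi) cos^2(phi) with A = 1
```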

All problems in the optics of moving bodies can be solved by the method
here employed. What is essential is, that the electric and magnetic force of the
light which is influenced by a moving body, be transformed into a system of
co-ordinates at rest relatively to the body. By this means all problems in the
optics of moving bodies will be reduced to a series of problems in the optics of
stationary bodies.

§ 9. Transformation of the Maxwell-Hertz Equations
when Convection-Currents are Taken into Account
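We start from the equations
$$\begin{aligned}
\frac{1}{c}\left\{\frac{\partial X}{\partial t} + u_x\rho\right\} &= \frac{\partial N}{\partial y} - \frac{\partial M}{\partial z}, &
\frac{1}{c}\frac{\partial L}{\partial t} &= \frac{\partial Y}{\partial z} - \frac{\partial Z}{\partial y},\\
\frac{1}{c}\left\{\frac{\partial Y}{\partial t} + u_y\rho\right\} &= \frac{\partial L}{\partial z} - \frac{\partial N}{\partial x}, &
\frac{1}{c}\frac{\partial M}{\partial t} &= \frac{\partial Z}{\partial x} - \frac{\partial X}{\partial z},\\
\frac{1}{c}\left\{\frac{\partial Z}{\partial t} + u_z\rho\right\} &= \frac{\partial M}{\partial x} - \frac{\partial L}{\partial y}, &
\frac{1}{c}\frac{\partial N}{\partial t} &= \frac{\partial X}{\partial y} - \frac{\partial Y}{\partial x},
\end{aligned}$$

where
$$\rho = \frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y} + \frac{\partial Z}{\partial z}$$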

denotes 4π times the density of electricity, and $(u_x, u_y, u_z)$ the velocity-vector of
the charge. If we imagine the electric charges to be invariably coupled to small
rigid bodies (ions, electrons), these equations are the electromagnetic basis of
the Lorentzian electrodynamics and optics of moving bodies.
Let these equations be valid in the system K, and transform them, with the
assistance of the equations of transformation given in §§ 3 and 6, to the system
k. We then obtain the equations

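$$\begin{aligned}
\frac{1}{c}\left\{\frac{\partial X'}{\partial \tau} + u_\xi\rho'\right\} &= \frac{\partial N'}{\partial \eta} - \frac{\partial M'}{\partial \zeta}, &
\frac{1}{c}\frac{\partial L'}{\partial \tau} &= \frac{\partial Y'}{\partial \zeta} - \frac{\partial Z'}{\partial \eta},\\
\frac{1}{c}\left\{\frac{\partial Y'}{\partial \tau} + u_\eta\rho'\right\} &= \frac{\partial L'}{\partial \zeta} - \frac{\partial N'}{\partial \xi}, &
\frac{1}{c}\frac{\partial M'}{\partial \tau} &= \frac{\partial Z'}{\partial \xi} - \frac{\partial X'}{\partial \zeta},\\
\frac{1}{c}\left\{\frac{\partial Z'}{\partial \tau} + u_\zeta\rho'\right\} &= \frac{\partial M'}{\partial \xi} - \frac{\partial L'}{\partial \eta}, &
\frac{1}{c}\frac{\partial N'}{\partial \tau} &= \frac{\partial X'}{\partial \eta} - \frac{\partial Y'}{\partial \xi},
\end{aligned}$$

where
$$\begin{aligned}
u_\xi &= \frac{u_x - v}{1 - u_x v/c^2},\\
u_\eta &= \frac{u_y}{\beta(1 - u_x v/c^2)},\\
u_\zeta &= \frac{u_z}{\beta(1 - u_x v/c^2)},
\end{aligned}$$

and
$$\rho' = \frac{\partial X'}{\partial \xi} + \frac{\partial Y'}{\partial \eta} + \frac{\partial Z'}{\partial \zeta} = \beta(1 - u_x v/c^2)\rho.$$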

Since, as follows from the theorem of addition of velocities (§ 5), the vector
$(u_\xi, u_\eta, u_\zeta)$ is nothing else than the velocity of the electric charge, measured in
the system k, we have the proof that, on the basis of our kinematical principles,
the electrodynamic foundation of Lorentz’s theory of the electrodynamics of
moving bodies is in agreement with the principle of relativity.
In addition I may briefly remark that the following important law may easily
be deduced from the developed equations: If an electrically charged body is in
motion anywhere in space without altering its charge when regarded from a
system of co-ordinates moving with the body, its charge also remains—when
regarded from the “stationary” system K—constant.
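One way to make the closing remark concrete is sketched below in Python. This is an interpretation added for illustration, not a derivation taken from the paper: the helper `rest_density` (the density multiplied by √(1 − u²/c²), which accounts for the contraction of the moving charge element) and all function names and sample values are assumptions. The sketch transforms the velocity and the density from K to k with the formulas above and checks that this quantity has the same value in both systems.

```python
import math


def transform_velocity(u, v, c=1.0):
    """(u_x, u_y, u_z) in K -> (u_xi, u_eta, u_zeta) in k, per the formulas above."""
    ux, uy, uz = u
    beta = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    denom = 1.0 - ux * v / c**2
    return ((ux - v) / denom, uy / (beta * denom), uz / (beta * denom))


def transform_density(rho, u, v, c=1.0):
    """rho' = beta (1 - u_x v / c^2) rho."""
    beta = 1.0 / math.sqrt(1.0 - v**2 / c**2)
    return beta * (1.0 - u[0] * v / c**2) * rho


def rest_density(rho, u, c=1.0):
    """Hypothetical helper: rho * sqrt(1 - u^2/c^2), the density seen moving with the charge."""
    speed_sq = sum(component**2 for component in u)
    return rho * math.sqrt(1.0 - speed_sq / c**2)


u, rho, v = (0.3, 0.4, -0.2), 2.5, 0.6   # arbitrary sample values with c = 1
u_k = transform_velocity(u, v)
rho_k = transform_density(rho, u, v)
```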

§ 10. Dynamics of the Slowly Accelerated Electron


Let there be in motion in an electromagnetic field an electrically charged
particle (in the sequel called an “electron”), for the law of motion of which we
assume as follows:—
If the electron is at rest at a given epoch, the motion of the electron ensues
in the next instant of time according to the equations

$$\begin{aligned}
m\frac{d^2x}{dt^2} &= \epsilon X,\\
m\frac{d^2y}{dt^2} &= \epsilon Y,\\
m\frac{d^2z}{dt^2} &= \epsilon Z,
\end{aligned}$$

where x, y, z denote the co-ordinates of the electron, and m the mass of the
electron, as long as its motion is slow.

Now, secondly, let the velocity of the electron at a given epoch be v. We
seek the law of motion of the electron in the immediately ensuing instants of
time.
Without affecting the general character of our considerations, we may and
will assume that the electron, at the moment when we give it our attention, is at
the origin of the co-ordinates, and moves with the velocity v along the axis of X
of the system K. It is then clear that at the given moment (t = 0) the electron
is at rest relatively to a system of co-ordinates which is in parallel motion with
velocity v along the axis of X.
From the above assumption, in combination with the principle of relativity, it
is clear that in the immediately ensuing time (for small values of t) the electron,
viewed from the system k, moves in accordance with the equations

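$$\begin{aligned}
m\frac{d^2\xi}{d\tau^2} &= \epsilon X',\\
m\frac{d^2\eta}{d\tau^2} &= \epsilon Y',\\
m\frac{d^2\zeta}{d\tau^2} &= \epsilon Z',
\end{aligned}$$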

in which the symbols ξ, η, ζ, $X'$, $Y'$, $Z'$ refer to the system k. If, further, we
decide that when t = x = y = z = 0 then τ = ξ = η = ζ = 0, the transformation
equations of §§ 3 and 6 hold good, so that we have
$$\begin{aligned}
\xi &= \beta(x - vt),\quad \eta = y,\quad \zeta = z,\quad \tau = \beta(t - vx/c^2),\\
X' &= X,\quad Y' = \beta(Y - vN/c),\quad Z' = \beta(Z + vM/c).
\end{aligned}$$

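With the help of these equations we transform the above equations of motion
from system k to system K, and obtain
$$\left.\begin{aligned}
\frac{d^2x}{dt^2} &= \frac{\epsilon}{m\beta^3}X\\
\frac{d^2y}{dt^2} &= \frac{\epsilon}{m\beta}\left(Y - \frac{v}{c}N\right)\\
\frac{d^2z}{dt^2} &= \frac{\epsilon}{m\beta}\left(Z + \frac{v}{c}M\right)
\end{aligned}\right\}\qquad\text{(A)}$$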
Taking the ordinary point of view we now inquire as to the “longitudinal”
and the “transverse” mass of the moving electron. We write the equations (A)
in the form
$$\begin{aligned}
m\beta^3\frac{d^2x}{dt^2} &= \epsilon X = \epsilon X',\\
m\beta^2\frac{d^2y}{dt^2} &= \epsilon\beta\left(Y - \frac{v}{c}N\right) = \epsilon Y',\\
m\beta^2\frac{d^2z}{dt^2} &= \epsilon\beta\left(Z + \frac{v}{c}M\right) = \epsilon Z',
\end{aligned}$$
and remark firstly that $\epsilon X'$, $\epsilon Y'$, $\epsilon Z'$ are the components of the ponderomotive
force acting upon the electron, and are so indeed as viewed in a system moving
at the moment with the electron, with the same velocity as the electron. (This
force might be measured, for example, by a spring balance at rest in the
last-mentioned system.) Now if we call this force simply “the force acting upon the
electron,”⁹ and maintain the equation mass × acceleration = force, and if we
also decide that the accelerations are to be measured in the stationary system
K, we derive from the above equations
$$\begin{aligned}
\text{Longitudinal mass} &= \frac{m}{\left(\sqrt{1 - v^2/c^2}\right)^3},\\
\text{Transverse mass} &= \frac{m}{1 - v^2/c^2}.
\end{aligned}$$
With a different definition of force and acceleration we should naturally
obtain other values for the masses. This shows us that in comparing different
theories of the motion of the electron we must proceed very cautiously.
We remark that these results as to the mass are also valid for ponderable
material points, because a ponderable material point can be made into an elec-
tron (in our sense of the word) by the addition of an electric charge, no matter
how small.
We will now determine the kinetic energy of the electron. If an electron
moves from rest at the origin of co-ordinates of the system K along the axis
of X under the action of an electrostatic force X, it is clear that the energy
withdrawn from the electrostatic field has the value $\int \epsilon X\,dx$. As the electron is
to be slowly accelerated, and consequently may not give off any energy in the
form of radiation, the energy withdrawn from the electrostatic field must be put
down as equal to the energy of motion W of the electron. Bearing in mind that
during the whole process of motion which we are considering, the first of the
equations (A) applies, we therefore obtain
$$W = \int \epsilon X\,dx = m\int_0^v \beta^3 v\,dv = mc^2\left\{\frac{1}{\sqrt{1 - v^2/c^2}} - 1\right\}.$$

Thus, when v = c, W becomes infinite. Velocities greater than that of light
have, as in our previous results, no possibility of existence.
This expression for the kinetic energy must also, by virtue of the argument
stated above, apply to ponderable masses as well.
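The integral for the kinetic energy can be checked against its closed form. The following Python sketch is an illustrative addition; the function names, the midpoint rule, the step count and the unit choice m = c = 1 are assumptions. It integrates m β³ v dv numerically from 0 to v and also compares the closed form with ½mv² at low velocity.

```python
import math


def beta(v, c=1.0):
    """beta = 1/sqrt(1 - v^2/c^2), in the paper's notation."""
    return 1.0 / math.sqrt(1.0 - v**2 / c**2)


def kinetic_energy_closed(m, v, c=1.0):
    """W = m c^2 (1/sqrt(1 - v^2/c^2) - 1)."""
    return m * c**2 * (beta(v, c) - 1.0)


def kinetic_energy_integral(m, v, c=1.0, steps=20_000):
    """Midpoint-rule estimate of m * integral from 0 to v of beta^3 u du."""
    h = v / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += beta(u, c) ** 3 * u
    return m * total * h


W_closed = kinetic_energy_closed(1.0, 0.8)      # 1/0.6 - 1 for v = 0.8c
W_numeric = kinetic_energy_integral(1.0, 0.8)
W_slow = kinetic_energy_closed(1.0, 1e-3)       # should be close to 0.5 * v^2
```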
We will now enumerate the properties of the motion of the electron which
result from the system of equations (A), and are accessible to experiment.
1. From the second equation of the system (A) it follows that an electric
force Y and a magnetic force N have an equally strong deflective action on an
electron moving with the velocity v, when Y = Nv/c. Thus we see that it is
possible by our theory to determine the velocity of the electron from the ratio
9 The definition of force here given is not advantageous, as was first shown by M. Planck.

It is more to the point to define force in such a way that the laws of momentum and energy
assume the simplest form.

of the magnetic power of deflexion $A_m$ to the electric power of deflexion $A_e$, for
any velocity, by applying the law
$$\frac{A_m}{A_e} = \frac{v}{c}.$$
This relationship may be tested experimentally, since the velocity of the
electron can be directly measured, e.g. by means of rapidly oscillating electric
and magnetic fields.
2. From the deduction for the kinetic energy of the electron it follows that
between the potential difference, P, traversed and the acquired velocity v of the
electron there must be the relationship
$$P = \int X\,dx = \frac{m}{\epsilon}c^2\left\{\frac{1}{\sqrt{1 - v^2/c^2}} - 1\right\}.$$

3. We calculate the radius of curvature of the path of the electron when a
magnetic force N is present (as the only deflective force), acting perpendicularly
to the velocity of the electron. From the second of the equations (A) we obtain
$$-\frac{d^2y}{dt^2} = \frac{v^2}{R} = \frac{\epsilon}{m}\frac{v}{c}N\sqrt{1 - \frac{v^2}{c^2}}$$
or
$$R = \frac{mc^2}{\epsilon}\cdot\frac{v/c}{\sqrt{1 - v^2/c^2}}\cdot\frac{1}{N}.$$

These three relationships are a complete expression for the laws according
to which, by the theory here advanced, the electron must move.
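The three relationships can be collected into a few lines of Python. This is an illustrative sketch only: the function names, the parameter `eps` (standing for the charge of the electron) and the dimensionless toy values are assumptions, not data from the paper. The final lines check that the radius formula is consistent with the centripetal acceleration obtained from the second of the equations (A).

```python
import math


def deflection_ratio(v, c):
    """Relation 1: A_m / A_e = v / c."""
    return v / c


def potential_difference(v, m, eps, c):
    """Relation 2: P = (m / eps) c^2 (1/sqrt(1 - v^2/c^2) - 1)."""
    return (m / eps) * c**2 * (1.0 / math.sqrt(1.0 - v**2 / c**2) - 1.0)


def radius_of_curvature(v, N, m, eps, c):
    """Relation 3: R = (m c^2 / eps) * (v/c) / sqrt(1 - v^2/c^2) / N."""
    return (m * c**2 / eps) * (v / c) / math.sqrt(1.0 - v**2 / c**2) / N


# Dimensionless toy values (m = eps = c = 1), chosen only for illustration.
m, eps, c, N, v = 1.0, 1.0, 1.0, 2.0, 0.6
R = radius_of_curvature(v, N, m, eps, c)
centripetal = v**2 / R
from_eq_A = (eps / m) * (v / c) * N * math.sqrt(1.0 - v**2 / c**2)
```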
In conclusion I wish to say that in working at the problem here dealt with
I have had the loyal assistance of my friend and colleague M. Besso, and that I
am indebted to him for several valuable suggestions.

About this Document

This edition of Einstein’s On the Electrodynamics of Moving Bodies is
based on the English translation of his original 1905 German-language paper
(published as Zur Elektrodynamik bewegter Körper, in Annalen der Physik.
17:891, 1905) which appeared in the book The Principle of Relativity, published
in 1923 by Methuen and Company, Ltd. of London. Most of the
papers in that collection are English translations from the German Das
Relativitätsprinzip, 4th ed., published in 1922 by Teubner. All of these sources
are now in the public domain; this document, derived from them, remains in
the public domain and may be reproduced in any manner or medium without
permission, restriction, attribution, or compensation.
Numbered footnotes are as they appeared in the 1923 edition; editor’s
notes are marked by a dagger (†) and appear in sans serif type. The 1923
English translation modified the notation used in Einstein’s 1905 paper to
conform to that in use by the 1920’s; for example, c denotes the speed of
light, as opposed to the V used by Einstein in 1905.
This edition was prepared by John Walker. The current version of this
document is available in a variety of formats from the editor’s Web site:

http://www.fourmilab.ch/

Ist die Trägheit eines Körpers von seinem Energieinhalt abhängig?
By A. Einstein
Annalen der Physik 18, p. 639 (1905)

The results of an electrodynamic investigation that I recently published in
these Annalen 1) lead to a very interesting conclusion, which is to be derived
here.
There I took as a basis the Maxwell-Hertz equations for empty space, together
with Maxwell's expression for the electromagnetic energy of space, and in
addition the principle:
The laws according to which the states of physical systems change are
independent of which of two co-ordinate systems, in uniform parallel translation
relative to each other, these changes of state are referred to (principle of
relativity).
On these foundations 2) I derived, among other things, the following result
(loc. cit. § 8):
Let a system of plane light waves, referred to the co-ordinate system
(x, y, z), possess the energy l; let the ray direction (wave-normal) make the
angle φ with the x-axis of the system. If one introduces a new co-ordinate
system (ξ, η, ζ), in uniform parallel translation relative to the system
(x, y, z), whose origin moves with the velocity v along the x-axis, then the
said quantity of light, measured in the system (ξ, η, ζ), possesses the energy
$$l^* = l\,\frac{1 - \frac{v}{V}\cos\phi}{\sqrt{1 - (v/V)^2}},$$
where V denotes the velocity of light. We make use of this result in what
follows.

1) A. Einstein, Ann. d. Phys. 17, p. 891, 1905.
2) The principle of the constancy of the velocity of light used there is of
course contained in Maxwell's equations.

Now let there be in the system (x, y, z) a body at rest whose energy, referred
to the system (x, y, z), is $E_0$. Relative to the system (ξ, η, ζ), moving as
above with the velocity v, let the energy of the body be $H_0$.
Let this body emit, in a direction making the angle φ with the x-axis, plane
light waves of energy L/2 (measured relative to (x, y, z)) and simultaneously
an equal quantity of light in the opposite direction. The body thereby remains
at rest with respect to the system (x, y, z). The energy principle must hold
for this process, and indeed (by the principle of relativity) with respect to
both co-ordinate systems. If we call $E_1$ and $H_1$ the energy of the body
after the emission of light, measured relative to the system (x, y, z) and
(ξ, η, ζ) respectively, then using the relation given above we obtain
$$\begin{aligned}
E_0 &= E_1 + \frac{L}{2} + \frac{L}{2},\\
H_0 &= H_1 + \frac{L}{2}\,\frac{1 - \frac{v}{V}\cos\phi}{\sqrt{1 - (v/V)^2}} + \frac{L}{2}\,\frac{1 + \frac{v}{V}\cos\phi}{\sqrt{1 - (v/V)^2}} = H_1 + \frac{L}{\sqrt{1 - (v/V)^2}}.
\end{aligned}$$
By subtraction one obtains from these equations:
$$(H_0 - E_0) - (H_1 - E_1) = L\left\{\frac{1}{\sqrt{1 - (v/V)^2}} - 1\right\}.$$
The two differences of the form H − E occurring in this expression have simple
physical meanings. H and E are energy values of the same body, referred to two
co-ordinate systems in motion relative to each other, the body being at rest
in one of the systems (system (x, y, z)). It is therefore clear that the
difference H − E can differ from the kinetic energy K of the body with respect
to the other system (system (ξ, η, ζ)) only by an additive constant C, which
depends on the choice of the arbitrary additive constants of the energies H
and E. We can therefore set
$$\begin{aligned}
H_0 - E_0 &= K_0 + C,\\
H_1 - E_1 &= K_1 + C,
\end{aligned}$$
since C does not change during the emission of light. We thus obtain:
$$K_0 - K_1 = L\left\{\frac{1}{\sqrt{1 - (v/V)^2}} - 1\right\}.$$
The kinetic energy of the body with respect to (ξ, η, ζ) decreases as a result
of the emission of light, and by an amount independent of the qualities of the
body. Furthermore, the difference $K_0 - K_1$ depends on the velocity in the
same way as the kinetic energy of the electron (loc. cit. § 10).
Neglecting quantities of the fourth and higher order, we can set
$$K_0 - K_1 = \frac{L}{V^2}\,\frac{v^2}{2}.$$
From this equation it follows immediately:
If a body gives off the energy L in the form of radiation, its mass decreases
by $L/V^2$. Here it is evidently inessential that the energy withdrawn from
the body passes precisely into energy of radiation, so that we are led to the
more general conclusion:
The mass of a body is a measure of its energy content; if the energy changes
by L, the mass changes in the same sense by $L/(9\cdot 10^{20})$, if the
energy is measured in ergs and the mass in grams.
It is not excluded that with bodies whose energy content is variable to a high
degree (e.g. with radium salts) a test of the theory will succeed.
If the theory corresponds to the facts, radiation transfers inertia between
the emitting and absorbing bodies.
Bern, September 1905.
(Received 27 September 1905.)
DOES THE INERTIA OF A BODY DEPEND
UPON ITS ENERGY-CONTENT?
By A. EINSTEIN
September 27, 1905

The results of the previous investigation lead to a very interesting conclusion,
which is here to be deduced.
I based that investigation on the Maxwell-Hertz equations for empty space,
together with the Maxwellian expression for the electromagnetic energy of space,
and in addition the principle that:—
The laws by which the states of physical systems alter are independent of
the alternative, to which of two systems of coordinates, in uniform motion of
parallel translation relatively to each other, these alterations of state are referred
(principle of relativity).
With these principles∗ as my basis I deduced inter alia the following result
(§ 8):—
Let a system of plane waves of light, referred to the system of co-ordinates
(x, y, z), possess the energy l; let the direction of the ray (the wave-normal)
make an angle φ with the axis of x of the system. If we introduce a new system
of co-ordinates (ξ, η, ζ) moving in uniform parallel translation with respect to
the system (x, y, z), and having its origin of co-ordinates in motion along the
axis of x with the velocity v, then this quantity of light—measured in the system
(ξ, η, ζ)—possesses the energy

$$l^* = l\,\frac{1 - \frac{v}{c}\cos\phi}{\sqrt{1 - v^2/c^2}}$$

where c denotes the velocity of light. We shall make use of this result in what
follows.
Let there be a stationary body in the system (x, y, z), and let its energy,
referred to the system (x, y, z), be $E_0$. Let the energy of the body relative to the
system (ξ, η, ζ) moving as above with the velocity v, be $H_0$.
Let this body send out, in a direction making an angle φ with the axis
of x, plane waves of light, of energy $\frac{1}{2}L$ measured relatively to (x, y, z), and
simultaneously an equal quantity of light in the opposite direction. Meanwhile
the body remains at rest with respect to the system (x, y, z). The principle of
∗ The principle of the constancy of the velocity of light is of course contained in Maxwell’s

equations.

energy must apply to this process, and in fact (by the principle of relativity)
with respect to both systems of co-ordinates. If we call the energy of the body
after the emission of light $E_1$ or $H_1$ respectively, measured relatively to the
system (x, y, z) or (ξ, η, ζ) respectively, then by employing the relation given
above we obtain
$$\begin{aligned}
E_0 &= E_1 + \tfrac{1}{2}L + \tfrac{1}{2}L,\\
H_0 &= H_1 + \tfrac{1}{2}L\,\frac{1 - \frac{v}{c}\cos\phi}{\sqrt{1 - v^2/c^2}} + \tfrac{1}{2}L\,\frac{1 + \frac{v}{c}\cos\phi}{\sqrt{1 - v^2/c^2}}\\
&= H_1 + \frac{L}{\sqrt{1 - v^2/c^2}}.
\end{aligned}$$

By subtraction we obtain from these equations
$$H_0 - E_0 - (H_1 - E_1) = L\left\{\frac{1}{\sqrt{1 - v^2/c^2}} - 1\right\}.$$

The two differences of the form H − E occurring in this expression have simple
physical significations. H and E are energy values of the same body referred
to two systems of co-ordinates which are in motion relatively to each other,
the body being at rest in one of the two systems (system (x, y, z)). Thus it is
clear that the difference H − E can differ from the kinetic energy K of the body,
with respect to the other system (ξ, η, ζ), only by an additive constant C, which
depends on the choice of the arbitrary additive constants of the energies H and
E. Thus we may place

$$\begin{aligned}
H_0 - E_0 &= K_0 + C,\\
H_1 - E_1 &= K_1 + C,
\end{aligned}$$

since C does not change during the emission of light. So we have
$$K_0 - K_1 = L\left\{\frac{1}{\sqrt{1 - v^2/c^2}} - 1\right\}.$$

The kinetic energy of the body with respect to (ξ, η, ζ) diminishes as a result
of the emission of light, and the amount of diminution is independent of the
properties of the body. Moreover, the difference $K_0 - K_1$, like the kinetic energy
of the electron (§ 10), depends on the velocity.
Neglecting magnitudes of fourth and higher orders we may place
$$K_0 - K_1 = \frac{1}{2}\frac{L}{c^2}v^2.$$
From this equation it directly follows that:—

If a body gives off the energy L in the form of radiation, its mass diminishes
by $L/c^2$. The fact that the energy withdrawn from the body becomes energy of
radiation evidently makes no difference, so that we are led to the more general
conclusion that
The mass of a body is a measure of its energy-content; if the energy changes
by L, the mass changes in the same sense by $L/(9 \times 10^{20})$, the energy being
measured in ergs, and the mass in grammes.
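The two numerical claims of this passage can be illustrated with a short Python sketch (an addition, not from the paper; the function names, the modern CGS value of c and the sample inputs are assumptions). It compares the exact loss of kinetic energy with the approximation ½(L/c²)v² at a small velocity, and evaluates the mass equivalent of 9 × 10²⁰ ergs, which should be close to one gramme.

```python
import math

C_CGS = 2.99792458e10  # velocity of light in cm/s (modern value, an assumption here)


def kinetic_energy_drop(L, v, c=C_CGS):
    """Exact K0 - K1 = L (1/sqrt(1 - v^2/c^2) - 1)."""
    return L * (1.0 / math.sqrt(1.0 - (v / c) ** 2) - 1.0)


def kinetic_energy_drop_approx(L, v, c=C_CGS):
    """Approximation (1/2) (L / c^2) v^2, neglecting fourth and higher orders."""
    return 0.5 * (L / c**2) * v**2


def mass_change(L, c=C_CGS):
    """Mass in grammes equivalent to an energy L in ergs."""
    return L / c**2


exact = kinetic_energy_drop(1.0, 0.01 * C_CGS)
approx = kinetic_energy_drop_approx(1.0, 0.01 * C_CGS)
one_gramme = mass_change(9.0e20)
```

The approximate expression has the form ½·(mass)·v² with L/c² in the place of the mass, which is the step that leads to the statement about the diminution of mass.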
It is not impossible that with bodies whose energy-content is variable to a
high degree (e.g. with radium salts) the theory may be successfully put to the
test.
If the theory corresponds to the facts, radiation conveys inertia between the
emitting and absorbing bodies.

About this Document

This edition of Einstein’s Does the Inertia of a Body Depend upon its
Energy-Content is based on the English translation of his original 1905
German-language paper (published as Ist die Trägheit eines Körpers von seinem
Energieinhalt abhängig?, in Annalen der Physik. 18:639, 1905) which appeared
in the book The Principle of Relativity, published in 1923 by Methuen and
Company, Ltd. of London. Most of the papers in that collection are English
translations by W. Perrett and G.B. Jeffery from the German Das
Relativitätsprinzip, 4th ed., published in 1922 by Teubner. All of these sources are
now in the public domain; this document, derived from them, remains in the
public domain and may be reproduced in any manner or medium without
permission, restriction, attribution, or compensation.
The footnote is as it appeared in the 1923 edition. The 1923 English
translation modified the notation used in Einstein’s 1905 paper to conform
to that in use by the 1920’s; for example, c denotes the speed of light, as
opposed to the V used by Einstein in 1905. In this paper Einstein uses L to
denote energy; the italicised sentence in the conclusion may be written as
the equation “$m = L/c^2$” which, using the more modern E instead of L to
denote energy, may be trivially rewritten as “$E = mc^2$”.
This edition was prepared by John Walker. The current version of this
document is available in a variety of formats from the editor’s Web site:

http://www.fourmilab.ch/

N. Bohr, Philos. Mag. 26, 1 (1913)

On the Constitution of Atoms and Molecules

N. Bohr,
Dr. phil. Copenhagen
(Received July 1913)

Introduction

In order to explain the results of experiments on scattering of α rays by
matter Prof. Rutherford¹ has given a theory of the structure of atoms.
According to this theory, the atom consists of a positively charged nucleus
surrounded by a system of electrons kept together by attractive forces from
the nucleus; the total negative charge of the electrons is equal to the positive
charge of the nucleus. Further, the nucleus is assumed to be the seat of
the essential part of the mass of the atom, and to have linear dimensions
exceedingly small compared with the linear dimensions of the whole atom.
The number of electrons in an atom is deduced to be approximately equal to
half the atomic weight. Great interest is to be attributed to this atom-model;
for, as Rutherford has shown, the assumption of the existence of nuclei, as
those in question, seems to be necessary in order to account for the results
of the experiments on large angle scattering of the α rays.²
In an attempt to explain some of the properties of matter on the basis
of this atom-model we meet, however, with difficulties of a serious nature
arising from the apparent instability of the system of electrons: difficulties
purposely avoided in atom-models previously considered, for instance, in the
one proposed by Sir J.J. Thomson.³ According to the theory of the latter
the atom consists of a sphere of uniform positive electrification, inside which
the electrons move in circular orbits.
1 E. Rutherford, Phil. Mag. XXI. p. 669 (1911).
2 See also Geiger and Marsden, Phil. Mag. April 1913.
3 J.J. Thomson, Phil. Mag. VII. p. 237 (1904).

The principal difference between the atom-models proposed by Thomson
and Rutherford consists in the circumstance that the forces acting on the
electrons in the atom-model of Thomson allow of certain configurations and
motion of the electrons for which the system is in a stable equilibrium; such
configurations, however, apparently do not exist for the second atom-model.
The nature of the difference in question will perhaps be most clearly seen by
noticing that among the quantities characterizing the first atom a quantity
appears – the radius of the positive sphere – of dimensions of a length and of
the same order of magnitude as the linear extension of the atom, while such
a length does not appear among the quantities characterizing the second
atom, viz. the charges and masses of the electrons and the positive nucleus;
nor can it be determined solely by help of the latter quantities.
The way of considering a problem of this kind has, however, undergone
essential alterations in recent years owing to the development of the theory
of the energy radiation, and the direct affirmation of the new assumptions
introduced in this theory, found by experiments on very different phenomena
such as specific heats, photoelectric effect, Röntgen-rays, &c. The result of
the discussion of these questions seems to be a general acknowledgment of
the inadequacy of the classical electrodynamics in describing the behaviour
of systems of atomic size.4 Whatever the alteration in the laws of motion of
the electrons may be, it seems necessary to introduce in the laws in question
a quantity foreign to the classical electrodynamics, i.e., Planck’s constant, or
as it often is called the elementary quantum of action. By the introduction
of this quantity the question of the stable configuration of the electrons in
the atoms is essentially changed, as this constant is of such dimensions and
magnitude that it, together with the mass and charge of the particles, can
determine a length of the order of magnitude required.
This paper is an attempt to show that the application of the above ideas
to Rutherford’s atom-model affords a basis for a theory of the constitution
of atoms. It will further be shown that from this theory we are led to a
theory of the constitution of molecules.
In the present first part of the paper the mechanism of the binding of
electrons by a positive nucleus is discussed in relation to Planck’s theory. It
will be shown that it is possible from the point of view taken to account in
a simple way for the law of the line spectrum of hydrogen. Further, reasons
are given for a principal hypothesis on which the considerations contained
in the following parts are based.
4 See f. inst., “Théorie du rayonnement et les quanta.” Rapports de la réunion à
Bruxelles, Nov. 1911, Paris, 1912.

I wish here to express my thanks to Prof. Rutherford for his kind and
encouraging interest in this work.

Part I. – Binding of Electrons by Positive Nuclei.

§ 1. General Considerations

The inadequacy of the classical electrodynamics in accounting for the
properties of atoms from an atom-model as Rutherford’s, will appear very clearly
if we consider a simple system consisting of a positively charged nucleus of
very small dimensions and an electron describing closed orbits around it.
For simplicity, let us assume that the mass of the electron is negligibly small
in comparison with that of the nucleus, and further, that the velocity of the
electron is small compared with that of light.
Let us at first assume that there is no energy radiation. In this case the
electron will describe stationary elliptical orbits. The frequency of revolution
ω and the major-axis of the orbit 2a will depend on the amount of energy
W which must be transferred to the system in order to remove the electron
to an infinitely great distance apart from the nucleus. Denoting the charge
of the electron and of the nucleus by – e and E respectively and the mass
of the electron by m, we thus get

$$ \omega = \frac{\sqrt{2}}{\pi} \cdot \frac{W^{3/2}}{eE\sqrt{m}}, \qquad 2a = \frac{eE}{W}. \tag{1} $$

Further, it can easily be shown that the mean value of the kinetic energy of
the electron taken for a whole revolution is equal to W . We see that if the
value of W is not given, there will be no values of ω and a characteristic for
the system in question.
Let us now, however, take the effect of the energy radiation into account,
calculated in the ordinary way from the acceleration of the electron. In this
case the electron will no longer describe stationary orbits. W will continu-
ously increase, and the electron will approach the nucleus describing orbits
of smaller and smaller dimensions, and with greater and greater frequency;
the electron on the average gaining in kinetic energy at the same time as the
whole system loses energy. This process will go on until the dimensions of
the orbit are the same order of magnitude as the dimensions of the electron
or those of the nucleus. A simple calculation shows that the energy radiated
out during the process considered will be enormously great compared with
that radiated out by ordinary molecular processes.
It is obvious that the behaviour of such a system will be very different
from that of an atomic system occurring in nature. In the first place, the
actual atoms in their permanent state seem to have absolutely fixed dimensions
and frequencies. Further, if we consider any process, the result seems always
to be that after a certain amount of energy characteristic for the systems in
question is radiated out, the system will again settle down in a stable state
of equilibrium, in which the distances apart of the particles are of the same
order of magnitude as before the process.
Now the essential point in Planck’s theory of radiation is that the energy
radiation from an atomic system does not take place in the continuous way
assumed in the ordinary electrodynamics, but that it, on the contrary, takes
place in distinctly separated emissions, the amount of energy radiated out
from an atomic vibrator of frequency ν in a single emission being equal to
τ hν, where τ is an entire number, and h is a universal constant.5
Returning to the simple case of an electron and a positive nucleus consid-
ered above, let us assume that the electron at the beginning of the interaction
with the nucleus was at a great distance apart from the nucleus, and had
no sensible velocity relative to the latter. Let us further assume that the
electron after interaction has taken place has settled down in a stationary
orbit around the nucleus. We shall, for reasons referred to later, assume
that the orbit in question is circular: this assumption will, however, make
no alteration in the calculations for systems containing only a single electron.
Let us now assume that, during the binding of the electron, a homogeneous
radiation is emitted of a frequency ν, equal to half the frequency
of revolution of the electron in its final orbit; then from Planck’s theory,
we might expect that the amount of energy emitted by the process considered
is equal to τhν, where h is Planck’s constant and τ an entire number. If we
assume that the radiation emitted is homogeneous, the second assumption
concerning the frequency of the radiation suggests itself, since the frequency
of revolution of the electron at the beginning of the emission is 0. The ques-
tion, however, of the rigorous validity of both assumptions, and also of the
application made of Planck’s theory, will be more closely discussed in § 3.
5 See f. inst., M. Planck, Ann. d. Phys. XXXI. p. 758 (1910); XXXVII. p. 612 (1912);
Verh. Phys. Ges. 1911, p. 138.

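Putting
$$ W = \tau h \frac{\omega}{2}, \tag{2} $$
we get by help of the formula (1)
$$ W = \frac{2\pi^2 m e^2 E^2}{\tau^2 h^2}, \qquad \omega = \frac{4\pi^2 m e^2 E^2}{\tau^3 h^3}, \qquad 2a = \frac{\tau^2 h^2}{2\pi^2 m e E}. \tag{3} $$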
If in these expressions we give τ different values, we get a series of values
for W , ω, and a corresponding to a series of configurations of the system.
According to the above considerations, we are led to assume that these
configurations will correspond to states of the system in which there is no
radiation of energy; states which consequently will be stationary as long as
the system is not disturbed from outside. We see that the value of W is
greatest if τ has its smallest value 1. This case will therefore correspond
to the most stable state of the system, i.e., will correspond to the binding of
the electron for the breaking up of which the greatest amount of energy is
required.
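
The passage from (1) and (2) to (3) can be checked numerically. The Python sketch below is an editorial illustration and not part of the paper: the function names are hypothetical, and the constants are simply the CGS values quoted in the text. It evaluates the expressions (3) and verifies that they satisfy both the mechanical relations (1) and the condition (2).

```python
import math

def stationary_state(tau, e, E, m, h):
    """Expressions (3): W, omega and 2a for the state numbered tau."""
    W = 2 * math.pi**2 * m * e**2 * E**2 / (tau**2 * h**2)
    omega = 4 * math.pi**2 * m * e**2 * E**2 / (tau**3 * h**3)
    two_a = tau**2 * h**2 / (2 * math.pi**2 * m * e * E)
    return W, omega, two_a

def kepler(W, e, E, m):
    """Relations (1): omega and 2a of the orbit with binding energy W."""
    omega = math.sqrt(2) / math.pi * W**1.5 / (e * E * math.sqrt(m))
    two_a = e * E / W
    return omega, two_a

# Illustrative CGS values taken from the text: e in esu, e/m in esu per gram, h in erg sec.
e = E = 4.7e-10
m = e / 5.31e17
h = 6.5e-27

for tau in (1, 2, 3):
    W, omega, two_a = stationary_state(tau, e, E, m, h)
    omega_1, two_a_1 = kepler(W, e, E, m)
    assert math.isclose(omega, omega_1, rel_tol=1e-9)           # consistent with (1)
    assert math.isclose(two_a, two_a_1, rel_tol=1e-9)           # consistent with (1)
    assert math.isclose(W, tau * h * omega / 2, rel_tol=1e-9)   # consistent with (2)
```

The loop also makes the scaling visible: W falls off as 1/τ², ω as 1/τ³, and 2a grows as τ².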
Putting in the above expressions τ = 1 and E = e, and introducing the
experimental values
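$$ e = 4.7 \cdot 10^{-10}, \quad \frac{e}{m} = 5.31 \cdot 10^{17}, \quad h = 6.5 \cdot 10^{-27}, $$
we get
$$ 2a = 1.1 \cdot 10^{-8}\ \mathrm{cm}, \quad \omega = 6.2 \cdot 10^{15}\ \frac{1}{\mathrm{sec}}, \quad \frac{W}{e} = 13\ \mathrm{volt}. $$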
We see that these values are of the same order of magnitude as the linear
dimensions of the atoms, the optical frequencies, and the ionization-potentials.
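
The arithmetic behind these three values can be repeated directly. The following Python sketch is an editorial addition; it assumes that the quoted constants are in CGS electrostatic units and uses the conversion 1 statvolt = 299.79 volt, which is not stated in the paper.

```python
import math

# Constants as quoted in the text (assumed CGS electrostatic units).
e = 4.7e-10            # charge of the electron, esu
e_over_m = 5.31e17     # esu per gram
h = 6.5e-27            # erg sec
m = e / e_over_m       # mass of the electron, g

# Assumption not in the text: 1 statvolt = 299.79 volt.
STATVOLT_IN_VOLT = 299.79

tau = 1
E = e  # nucleus of charge e

two_a = tau**2 * h**2 / (2 * math.pi**2 * m * e * E)          # cm
omega = 4 * math.pi**2 * m * e**2 * E**2 / (tau**3 * h**3)    # 1/sec
W = 2 * math.pi**2 * m * e**2 * E**2 / (tau**2 * h**2)        # erg
W_over_e_volt = W / e * STATVOLT_IN_VOLT                      # volt

print(f"2a    = {two_a:.2g} cm")
print(f"omega = {omega:.2g} 1/sec")
print(f"W/e   = {W_over_e_volt:.0f} volt")
```

With the 1913 constants the results round to the figures given in the text; modern values of e, m and h would shift them slightly.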
The general importance of Planck’s theory for the discussion of the
behaviour of atomic systems was originally pointed out by Einstein.6 The
considerations of Einstein have been developed and applied on a number
of different phenomena, especially by Stark, Nernst, and Sommerfeld. The
agreement as to the order of magnitude between values observed for the
frequencies and dimensions of the atoms, and values for these quantities cal-
culated by considerations similar to those given above, has been the subject
of much discussion. It was first pointed out by Haas,7 in an attempt to
6 A. Einstein, Ann. d. Phys. XVII. p. 132 (1905); XX. p. 199 (1906); XXII. p. 180
(1907).
7 A.E. Haas, Jahrb. d. Rad. u. El. VII. p. 261 (1910). See further, A. Schidlof, Ann. d.
Phys. XXXV. p. 90 (1911); E. Wertheimer, Phys. Zeitschr. XII. p. 409 (1911), Verh.
deutsch. Phys. Ges. 1912, p. 431; F.A. Lindemann, Verh. deutsch. Phys. Ges. 1911, pp.
482, 1107; F. Haber, Verh. deutsch. Phys. Ges. 1911, p. 1117.

explain the meaning and the value of Planck’s constant on the basis of J.J.
Thomson’s atom-model, by help of the linear dimensions and frequency of
an hydrogen atom. Systems of the kind considered in this paper, in which
the forces between the particles vary inversely as the square of the distance,
are discussed in relation to Planck’s theory by J.W. Nicholson.8 In a series
of papers this author has shown that it seems to be possible to account for
lines of hitherto unknown origin in the spectra of the stellar nebulae and
that of the solar corona, by assuming the presence in these bodies of certain
hypothetical elements of exactly indicated constitution. The atoms of these
elements are supposed to consist simply of a ring of a few electrons surround-
ing a positive nucleus of negligibly small dimensions. The ratios between
the frequencies corresponding to the lines in question are compared with the
ratios between the frequencies corresponding to different modes of vibration
of the ring of electrons. Nicholson has obtained a relation to Planck’s theory
showing that the ratios between the wave-lengths of different sets of lines of
the coronal spectrum can be accounted for with great accuracy by assum-
ing that the ratio between the energy of the system and the frequency of
rotation of the ring is equal to an entire multiple of Planck’s constant. The
quantity Nicholson refers to as the energy is equal to twice the quantity
which we have denoted above by W. In the latest paper cited Nicholson has
found it necessary to give the theory a more complicated form, still, how-
ever, representing the ratio of energy to frequency by a simple function of
whole numbers.
The excellent agreement between the calculated and observed values of
the ratios between the wave-lengths in question seems a strong argument in
favour of the validity of the foundation of Nicholson’s calculations. Serious
objections, however, may be raised against the theory. These objections are
intimately connected with the problem of the homogeneity of the radiation
emitted. In Nicholson’s calculations the frequency of lines in a line-spectrum
is identified with the frequency of vibration of a mechanical system in a
distinctly indicated state of equilibrium. As a relation from Planck’s theory
is used, we might expect that the radiation is sent out in quanta; but systems
like those considered, in which the frequency is a function of the energy,
cannot emit a finite amount of a homogeneous radiation; for, as soon as
the emission of radiation is started, the energy and also the frequency of
the system are altered. Further, according to the calculation of Nicholson,
the systems are unstable for some modes of vibration. Apart from such
8 J.W. Nicholson, Month. Not. Roy. Astr. Soc. LXXII. pp. 49, 139, 677, 693, 729
(1912).

objections – which may be only formal (see p. 23) – it must be
remarked, that the theory in the form given does not seem to be able to
account for the well-known laws of Balmer and Rydberg connecting the
frequencies of the lines in the line-spectra of the ordinary elements.
It will now be attempted to show that the difficulties in question disap-
pear if we consider the problems from the point of view taken in this paper.
Before proceeding it may be useful to restate briefly the ideas characterizing
the calculations on p. 5. The principal assumptions used are:

(1) That the dynamical equilibrium of the systems in the stationary states
can be discussed by help of the ordinary mechanics, while the passing
of the systems between different stationary states cannot be treated
on that basis.

(2) That the latter is followed by the emission of a homogeneous radiation,
for which the relation between the frequency and the amount of energy
emitted is the one given by Planck’s theory.

The first assumption seems to present itself; for it is known that the
ordinary mechanics cannot have an absolute validity, but will only hold in
calculations of certain mean values of the motion of the electrons. On the
other hand, in the calculations of the dynamical equilibrium in a stationary
state in which there is no relative displacement of the particles, we need not
distinguish between the actual motions and their mean values. The second
assumption is in obvious contrast to the ordinary ideas of electrodynamics,
but appears to be necessary in order to account for experimental facts.
In the calculations on page 5 we have further made use of the more
special assumptions, viz., that the different stationary states correspond to
the emission of a different number of Planck’s energy-quanta, and that the
frequency of the radiation emitted during the passing of the system from a
state in which no energy is yet radiated out to one of the stationary states,
is equal to half the frequency of revolution of the electron in the latter state.
We can, however (see § 3), also arrive at the expressions (3) for the stationary
states by using assumptions of somewhat different form. We shall, therefore,
postpone the discussion of the special assumptions, and first show how by
the help of the above principal assumptions, and of the expressions (3) for
the stationary states, we can account for the line-spectrum of hydrogen.

§ 2. Emission of Line-spectra

Spectrum of Hydrogen. – General evidence indicates that an atom of
hydrogen consists simply of a single electron rotating round a positive nucleus of
charge e.9 The reformation of a hydrogen atom, when the electron has been
removed to great distances away from the nucleus – e.g. by the effect of
electrical discharge in a vacuum tube – will accordingly correspond to the
binding of an electron by a positive nucleus considered on p. 5. If in (3)
we put E = e, we get for the total amount of energy radiated out by the
formation of one of the stationary states,

$$ W_\tau = \frac{2\pi^2 m e^4}{\tau^2 h^2}. $$
The amount of energy emitted by the passing of the system from a state
corresponding to τ = τ1 to one corresponding to τ = τ2, is consequently
$$ W_{\tau_2} - W_{\tau_1} = \frac{2\pi^2 m e^4}{h^2} \cdot \left( \frac{1}{\tau_2^2} - \frac{1}{\tau_1^2} \right). $$
If now we suppose that the radiation in question is homogeneous, and that
the amount of energy emitted is equal to hν, where ν is the frequency of the
radiation, we get
$$ W_{\tau_2} - W_{\tau_1} = h\nu, $$
and from this
$$ \nu = \frac{2\pi^2 m e^4}{h^3} \cdot \left( \frac{1}{\tau_2^2} - \frac{1}{\tau_1^2} \right). \tag{4} $$
We see that this expression accounts for the law connecting the lines in
the spectrum of hydrogen. If we put τ2 = 2 and let τ1 vary, we get the
ordinary Balmer series. If we put τ2 = 3, we get the series in the ultra-red
observed by Paschen10 and previously suspected by Ritz. If we put τ2 = 1
and τ2 = 4, 5, . . . , we get series respectively in the extreme ultra-violet and
the extreme ultra-red, which are not observed, but the existence of which
may be expected.
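
As an illustration of how formula (4) generates the series, the following Python sketch (an editorial addition with hypothetical function names) tabulates the first lines for τ2 = 1, 2 and 3. It uses the observed value of the factor outside the bracket quoted later in this section, and it assumes a speed of light of 2.998 · 10¹⁰ cm/sec, a value not given in the paper, to convert frequencies into wave-lengths.

```python
K_OBSERVED = 3.290e15   # factor outside the bracket in (4), 1/sec (from the text)
C_LIGHT = 2.998e10      # speed of light in cm/sec (assumed; not given in the paper)

def line_frequency(tau1, tau2, K=K_OBSERVED):
    """Formula (4): frequency emitted in passing from state tau1 to state tau2."""
    return K * (1 / tau2**2 - 1 / tau1**2)

def wavelength_angstrom(tau1, tau2):
    """Wave-length of the same line in units of 10^-8 cm."""
    return C_LIGHT / line_frequency(tau1, tau2) * 1e8

for tau2 in (1, 2, 3):
    for tau1 in range(tau2 + 1, tau2 + 4):
        lam = wavelength_angstrom(tau1, tau2)
        print(f"tau2 = {tau2}, tau1 = {tau1}: wave-length = {lam:.0f} x 10^-8 cm")
```

The lines for τ2 = 2 fall in the visible region, those for τ2 = 1 in the ultra-violet and those for τ2 = 3 in the ultra-red, in line with the assignments made in the text.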
9 See f. inst. N. Bohr, Phil. Mag. XXV. p. 24 (1913). The conclusion drawn in
the paper cited is strongly supported by the fact that hydrogen, in the experiments on
positive rays of Sir J.J. Thomson, is the only element which never occurs with a positive
charge corresponding to the loss of more than one electron (comp. Phil. Mag. XXIV. p.
672 (1912)).
10 F. Paschen, Ann. d. Phys. XXVII. p. 565 (1908).

The agreement in question is quantitative as well as qualitative. Putting
$$ e = 4.7 \cdot 10^{-10}, \quad \frac{e}{m} = 5.31 \cdot 10^{17} \quad \text{and} \quad h = 6.5 \cdot 10^{-27}, $$
we get
$$ \frac{2\pi^2 m e^4}{h^3} = 3.1 \cdot 10^{15}. $$
The observed value for the factor outside the bracket in the formula (4) is
$$ 3.290 \cdot 10^{15}. $$
The agreement between the theoretical and observed values is inside the
uncertainty due to experimental errors in the constants entering in the
expression for the theoretical value. We shall in § 3 return to consider the
possible importance of the agreement in question.
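
The comparison can be repeated with a few lines of Python. This is an editorial sketch, not part of the paper; it assumes the quoted constants are in CGS electrostatic units, and the variable names are hypothetical.

```python
import math

# Constants as quoted in the text (assumed CGS electrostatic units).
e = 4.7e-10
h = 6.5e-27
m = e / 5.31e17

def rydberg_factor(e, m, h):
    """Factor outside the bracket in formula (4): 2 pi^2 m e^4 / h^3."""
    return 2 * math.pi**2 * m * e**4 / h**3

K_theory = rydberg_factor(e, m, h)
K_observed = 3.290e15
relative_difference = (K_observed - K_theory) / K_observed

print(f"theory   : {K_theory:.2e} 1/sec")
print(f"observed : {K_observed:.2e} 1/sec")
print(f"relative difference: {relative_difference:.1%}")

# At fixed e/m the factor scales as e^5 / h^3, so a 1 per cent change in e
# alone shifts the theoretical value by roughly 5 per cent.
shift = rydberg_factor(1.01 * e, 1.01 * e / 5.31e17, h) / K_theory - 1
print(f"effect of a 1% change in e: {shift:.1%}")
```

The strong dependence on e and h is one way to see why a discrepancy of a few per cent lies inside the experimental uncertainty of the constants.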
It may be remarked that the fact, that it has not been possible to observe
more than 12 lines of the Balmer series in experiments with vacuum tubes,
while 33 lines are observed in the spectra of some celestial bodies, is just
what we should expect from the above theory. According to the equation
(3) the diameter of the orbit of the electron in the different stationary states
is proportional to τ². For τ = 12 the diameter is equal to 1.6 · 10⁻⁶ cm,
or equal to the mean distance between the molecules in a gas at a pressure of
about 7 mm mercury; for τ = 33 the diameter is equal to 1.2 · 10⁻⁵ cm,
corresponding to the mean distance of the molecules at a pressure of about
0.02 mm mercury. According to the theory the necessary condition for the
appearance of a great number of lines is therefore a very small density of
the gas; for simultaneously to obtain an intensity sufficient for observation
the space filled with the gas must be very great. If the theory is right, we
may therefore never expect to be able in experiments with vacuum tubes to
observe the lines corresponding to high numbers of the Balmer series of the
emission spectrum of hydrogen; it might, however, be possible to observe
the lines by investigation of the absorption spectrum of this gas (see § 4).
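
The orders of magnitude in this argument can be reproduced as follows. The Python sketch is an editorial illustration: the mean distance between molecules is estimated here as n^(-1/3), with n obtained from an ideal-gas scaling of an assumed number density of 2.687 · 10¹⁹ molecules per cm³ at 0 °C and 760 mm mercury. Neither that density nor this way of estimating the distance is stated in the paper.

```python
BASE_DIAMETER_CM = 1.1e-8   # 2a for tau = 1, from § 1
LOSCHMIDT = 2.687e19        # molecules per cm^3 at 0 C, 760 mm mercury (assumed)

def orbit_diameter_cm(tau):
    """Diameter of the orbit in state tau; by (3) it grows as tau squared."""
    return BASE_DIAMETER_CM * tau**2

def mean_distance_cm(pressure_mm_mercury):
    """Rough mean distance between molecules, taken as n^(-1/3) (assumption)."""
    n = LOSCHMIDT * pressure_mm_mercury / 760.0
    return n ** (-1.0 / 3.0)

for tau, pressure in ((12, 7.0), (33, 0.02)):
    print(f"tau = {tau}: diameter = {orbit_diameter_cm(tau):.2g} cm, "
          f"mean distance at {pressure} mm = {mean_distance_cm(pressure):.2g} cm")
```

Under these assumptions the orbit diameter and the mean molecular distance come out of the same order at each of the two pressures named in the text.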
It will be observed that we in the above way do not obtain other series of
lines, generally ascribed to hydrogen; for instance, the series first observed
by Pickering11 in the spectrum of the star ζ Puppis, and the set of series
recently found by Fowler12 by experiments with vacuum tubes containing a
mixture of hydrogen and helium. We shall, however, see that, by help of the
above theory, we can account naturally for these series of lines if we ascribe
them to helium.
11 E.C. Pickering, Astrophys. J. IV. p. 369 (1896); V. p. 92 (1897).
12 A. Fowler, Month. Not. Roy. Astr. Soc. LXXIII. Dec. 1912.

A neutral atom of the latter element consists, according to Rutherford’s
theory, of a positive nucleus of charge 2e and two electrons. Now considering
the binding of a single electron by a helium nucleus, we get putting E = 2e
in the expressions (3) on page 5, and proceeding in exactly the same way
as above,
$$ \nu = \frac{8\pi^2 m e^4}{h^3} \cdot \left( \frac{1}{\tau_2^2} - \frac{1}{\tau_1^2} \right) = \frac{2\pi^2 m e^4}{h^3} \cdot \left( \frac{1}{(\tau_2/2)^2} - \frac{1}{(\tau_1/2)^2} \right). $$
If we in this formula put τ2 = 1 or τ2 = 2, we get series of lines in the
extreme ultra-violet. If we put τ2 = 3, and let τ1 vary, we get a series which
includes 2 of the series observed by Fowler, and denoted by him as the first
and second principal series of the hydrogen spectrum. If we put τ2 = 4,
we get the series observed by Pickering in the spectrum of ζ Puppis. Every
second of the lines in this series is identical with a line in the Balmer series
of the hydrogen spectrum; the presence of hydrogen in the star in question
may therefore account for the fact that these lines are of a greater intensity
than the rest of the lines in the series. The series is also observed in the
experiments of Fowler, and denoted in his paper as the Sharp series of the
hydrogen spectrum. If we finally in the above formula put τ2 = 5, 6, . . ., we
get series, the strong lines of which are to be expected in the ultra-red.
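
The statement that every second line of the series with τ2 = 4 coincides with a Balmer line can be verified with exact rational arithmetic. The Python sketch below is an editorial addition with hypothetical function names; like the text, it neglects the mass of the electron in comparison with that of the nucleus, so the coincidences come out exact.

```python
from fractions import Fraction

def hydrogen_bracket(tau1, tau2):
    """Bracket of formula (4), i.e. the frequency in units of 2 pi^2 m e^4 / h^3."""
    return Fraction(1, tau2**2) - Fraction(1, tau1**2)

def helium_bracket(tau1, tau2):
    """Same quantity for a nucleus of charge E = 2e: four times the hydrogen bracket."""
    return 4 * hydrogen_bracket(tau1, tau2)

# Balmer lines (tau2 = 2), indexed by their frequency in the common unit.
balmer = {hydrogen_bracket(t, 2): t for t in range(3, 40)}

# Series with tau2 = 4, ascribed in the text to helium.
for tau1 in range(5, 13):
    nu = helium_bracket(tau1, 4)
    if nu in balmer:
        print(f"tau1 = {tau1}: coincides with the Balmer line tau1 = {balmer[nu]}")
    else:
        print(f"tau1 = {tau1}: no Balmer counterpart")
```

The even values of τ1 reproduce Balmer lines with τ1 halved, while the odd values give lines lying between them.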
The reason why the spectrum considered is not observed in ordinary
helium tubes may be that in such tubes the ionization of helium is not
so complete as in the star considered or in the experiments of Fowler, where
a strong discharge was sent through a mixture of hydrogen and helium.
The condition for the appearance of the spectrum is, according to the above
theory, that helium atoms are present in a state in which they have lost both
their electrons. Now we must assume that the amount of energy to be used in
removing the second electron from a helium atom is much greater than that
to be used in removing the first. Further, it is known from experiments on
positive rays, that hydrogen atoms can acquire a negative charge; therefore
the presence of hydrogen in the experiments of Fowler may effect that more
electrons are removed from some of the helium atoms than would be the
case if only helium were present.
Spectra of other substances. – In case of systems containing more
electrons we must – in conformity with the result of experiments – expect more
complicated laws for the line-spectra than those considered. I shall try to
show that the point of view taken above allows, at any rate, a certain
understanding of the laws observed. According to Rydberg’s theory – with the
generalization given by Ritz13 – the frequency corresponding to the lines of
the spectrum of an element can be expressed by

$$ \nu = F_r(\tau_1) - F_s(\tau_2), $$

where τ1 and τ2 are entire numbers, and F1, F2, F3, . . . are functions of
τ which approximately are equal to K/(τ + a1)², K/(τ + a2)², . . . K is a universal
constant, equal to the factor outside the bracket in the formula (4) for the
spectrum of hydrogen. The different series appear if we put τ1 or τ2 equal
to a fixed number and let the other vary.
The circumstance that the frequency can be written as a difference be-
tween two functions of entire numbers suggests an origin of the lines in the
spectra in question similar to the one we have assumed for hydrogen; i.e.
that the lines correspond to a radiation emitted during the passing of the
system between two different stationary states. For system containing more
than one electron the detailed discussion may be very complicated, as there
will be many different configurations of the electrons which can be taken
into consideration as stationary states. This may account for the different
sets of series in the line spectra emitted from the substances in question.
Here I shall only try to show how, by help of the theory, it can be simply
explained that the constant K entering in Rydberg’s formula is the same for
all substances. Let us assume that the spectrum in question corresponds to
the radiation emitted during the binding of an electron; and let us further
assume that the system including the electron considered is neutral. The
force on the electron, when at a great distance apart from the nucleus and the
electrons previously bound, will be very nearly the same as in the above case of
the binding of an electron by a hydrogen nucleus. The energy corresponding
to one of the stationary states will therefore for τ great be very nearly equal
to that given by the expression (3) on p. 5, if we put E = e. For τ great we
consequently get

$$ \lim [\tau^2 \cdot F_1(\tau)] = \lim [\tau^2 \cdot F_2(\tau)] = \ldots = \frac{2\pi^2 m e^4}{h^3}, $$
in conformity with Rydberg’s theory.
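
The limit can be illustrated numerically for terms of the approximate Rydberg form K/(τ + a)². In the Python sketch below, an editorial addition, the values of a are purely hypothetical and chosen only to show the trend; K is the observed hydrogen factor quoted in § 2.

```python
K = 3.290e15   # observed factor from formula (4), 1/sec

def rydberg_term(tau, a, K=K):
    """Approximate form of the functions F_1, F_2, ...: K / (tau + a)^2."""
    return K / (tau + a) ** 2

# Hypothetical values of the constants a_1, a_2 (illustration only).
for a in (0.10, 0.35):
    for tau in (1, 10, 100, 1000):
        ratio = tau**2 * rydberg_term(tau, a) / K
        print(f"a = {a}, tau = {tau}: tau^2 F(tau) / K = {ratio:.4f}")
```

Whatever value of a is chosen, the product τ² · F(τ) approaches K as τ grows, which is the content of the limit above.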

13 W. Ritz, Phys. Zeitschr. IX. p. 521 (1908).

§ 3. General Considerations Continued

We shall now return to the discussion (see p. 7) of the special assumptions
used in deducing the expression (3) on p. 5 for the stationary states of a
system consisting of an electron rotating round a nucleus.
For one, we have assumed that the different stationary states correspond
to an emission of a different number of energy-quanta. Considering systems
in which the frequency is a function of the energy, this assumption, however,
may be regarded as improbable; for as soon as one quantum is sent out the
frequency is altered. We shall now see that we can leave the assumption used
and still retain the equation (2) on p. 5, and thereby the formal analogy
with Planck’s theory.
Firstly, it will be observed that it has not been necessary, in order to
account for the law of the spectra by help of the expressions (3) for the
stationary states, to assume that in any case a radiation is sent out corre-
sponding to more than a single energy-quantum, hν. Further information
on the frequency of the radiation may be obtained by comparing calcula-
tions of the energy radiation in the region of slow vibrations based on the
above assumptions with calculations based on the ordinary mechanics. As
is known, calculations on the latter basis are in agreement with experiments
on the energy radiation in the named region.
Let us assume that the ratio between the total amount of energy emitted
and the frequency of revolution of the electron for the different stationary
states is given by the equation W = f(τ) · hω, instead of by the equation
(2). Proceeding in the same way as above, we get in this case instead of (3)

$$ W = \frac{\pi^2 m e^2 E^2}{2h^2 f^2(\tau)}, \qquad \omega = \frac{\pi^2 m e^2 E^2}{2h^3 f^3(\tau)}. $$

Assuming as above that the amount of energy emitted during the passing
of the system from a state corresponding to τ = τ1 to one for which τ = τ2
is equal to hν, we get instead of (4)
$$ \nu = \frac{\pi^2 m e^2 E^2}{2h^3} \cdot \left( \frac{1}{f^2(\tau_2)} - \frac{1}{f^2(\tau_1)} \right). $$

We see that in order to get an expression of the same form as the Balmer
series we must put f (τ ) = cτ .
In order to determine c let us now consider the passing of the system
between two successive stationary states corresponding to τ = N and τ =
N − 1; introducing f(τ) = cτ, we get for the frequency of the radiation
emitted
$$ \nu = \frac{\pi^2 m e^2 E^2}{2c^2 h^3} \cdot \frac{2N - 1}{N^2 (N - 1)^2}. $$
For the frequency of revolution of the electron before and after the
emission we have
$$ \omega_N = \frac{\pi^2 m e^2 E^2}{2c^3 h^3 N^3} \quad \text{and} \quad \omega_{N-1} = \frac{\pi^2 m e^2 E^2}{2c^3 h^3 (N - 1)^3}. $$

If N is great the ratio between the frequency before and after the emission
will be very near equal to 1; and according to the ordinary electrodynamics
we should therefore expect that the ratio between the frequency of radiation
and the frequency of revolution is also very nearly equal to 1. This condition
will only be satisfied if c = 1/2. Putting f (τ ) = τ /2, we, however, again
arrive at the equation (2) and consequently at the expression (3) for the
stationary states.
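
The condition c = 1/2 can be seen by forming the ratio ν/ω_N from the two expressions above, in which the common factor π²me²E²/h³ cancels. The Python sketch below is an editorial illustration using exact fractions; the function name is hypothetical.

```python
from fractions import Fraction

def radiation_to_revolution_ratio(N, c):
    """Ratio nu / omega_N for the passing from tau = N to tau = N - 1,
    with f(tau) = c * tau. The common factor pi^2 m e^2 E^2 / h^3 cancels."""
    nu = Fraction(2 * N - 1, N**2 * (N - 1) ** 2) / (2 * c**2)
    omega_N = Fraction(1, N**3) / (2 * c**3)
    return nu / omega_N

for c in (Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    ratios = [float(radiation_to_revolution_ratio(N, c)) for N in (10, 1000, 100000)]
    print(f"c = {c}: nu/omega_N for N = 10, 1000, 100000 -> {ratios}")
```

Algebraically the ratio is c · N(2N − 1)/(N − 1)², which tends to 2c for great N; it approaches 1 only for c = 1/2.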
If we consider the passing of the system between two states corresponding
to τ = N and τ = N − n, where n is small compared with N , we get with
the same approximation as above, putting f (τ ) = τ /2,

$$ \nu = n\omega. $$

The possibility of an emission of a radiation of such a frequency may also be
interpreted from analogy with the ordinary electrodynamics, as an electron
rotating round a nucleus in an elliptical orbit will emit a radiation which
according to Fourier’s theorem can be resolved into homogeneous compo-
nents, the frequency of which are nω, if ω is the frequency of revolution of
the electron.
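
The approximation ν = nω can be checked in the same way for f(τ) = τ/2. The following Python sketch is an editorial addition; frequencies are expressed in units of π²me²E²/h³, and the function name is hypothetical.

```python
from fractions import Fraction

def frequency_ratio(N, n):
    """Ratio nu / omega_N for the passing from tau = N to tau = N - n,
    with f(tau) = tau / 2. In units of pi^2 m e^2 E^2 / h^3:
    nu = 2 (1/(N - n)^2 - 1/N^2) and omega_N = 4 / N^3."""
    nu = 2 * (Fraction(1, (N - n) ** 2) - Fraction(1, N**2))
    omega_N = Fraction(4, N**3)
    return nu / omega_N

for n in (1, 2, 3):
    for N in (10, 1000, 1000000):
        print(f"n = {n}, N = {N}: nu/omega_N = {float(frequency_ratio(N, n)):.6f}")
```

For n small compared with N the ratio approaches the whole number n, matching the harmonics nω of the classical motion.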
We are thus led to assume that the interpretation of the equation (2) is
not that the different stationary states correspond to an emission of differ-
ent numbers of energy-quanta, but that the frequency of the energy emitted
during the passing of the system from a state in which no energy is yet
radiated out to one of the different stationary states, is equal to different
multiples of ω/2, where ω is the frequency of revolution of the electron in the
state considered. From this assumption we get exactly the same expressions
as before for the stationary states, and from these by help of the principal
assumptions on p. 7 the same expression for the law of the hydrogen spec-
trum. Consequently we may regard our preliminary considerations on p. 5
only as a simple form of representing the results of the theory.

Before we leave the discussion of this question, we shall for a moment
return to the question of the significance of the agreement between the ob-
served and calculated values of the constant entering in the expressions (4)
for the Balmer series of the hydrogen spectrum. From the above consider-
ation it will follow that, taking the starting-point in the form of the law of
the hydrogen spectrum and assuming that the different lines correspond to
a homogeneous radiation emitted during the passing between different
stationary states, we shall arrive at exactly the same expression for the constant
in question as that given by (4), if we only assume (1) that the radiation is
sent out in quanta hν, and (2) that the frequency of the radiation emitted
during the passing of the system between successive stationary states will
coincide with the frequency of revolution of the electron in the region of slow
vibrations.
As all the assumptions used in this latter way of representing the theory
are of what we may call a qualitative character, we are justified in expecting
– if the whole way of considering is a sound one – an absolute agreement
between the values calculated and observed for the constant in question, and
not only an approximate agreement. The formula (4) may therefore be of
value in the discussion of the results of experimental determinations of the
constants e, m, and h.
While there obviously can be no question of a mechanical foundation of
the calculations given in this paper, it is, however, possible to give a very
simple interpretation of the result of the calculation on p. 5 by help of sym-
bols taken from the ordinary mechanics. Denoting the angular momentum
of the electron round the nucleus by M , we have immediately for a circular
orbit πM = T /ω, where ω is the frequency of revolution and T the kinetic
energy of the electron; for a circular orbit we further have T = W (see p. 3)
and from (2), p. 5, we consequently get

$$ M = \tau M_0, $$

where
$$ M_0 = \frac{h}{2\pi} = 1.04 \cdot 10^{-27}. $$

If we therefore assume that the orbit of the electron in the stationary
states is circular, the result of the calculation on p. 5 can be expressed by
the simple condition: that the angular momentum of the electron round the
nucleus in a stationary state of the system is equal to an entire multiple of
a universal value, independent of the charge on the nucleus. The possible
importance of the angular momentum in the discussion of atomic systems
in relation to Planck’s theory is emphasized by Nicholson.14
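
The angular-momentum condition can be confirmed from the expressions (3) for a circular orbit. The Python sketch below is an editorial illustration with hypothetical names; it assumes the constants quoted in § 1 in CGS electrostatic units, and computes the angular momentum directly as mass times speed times radius.

```python
import math

# Constants as quoted in the text (assumed CGS electrostatic units).
e = 4.7e-10
h = 6.5e-27
m = e / 5.31e17

def circular_orbit(tau, E=e):
    """Angular momentum M, kinetic energy T and binding energy W of the
    circular orbit in state tau, built from the expressions (3)."""
    W = 2 * math.pi**2 * m * e**2 * E**2 / (tau**2 * h**2)
    omega = 4 * math.pi**2 * m * e**2 * E**2 / (tau**3 * h**3)
    radius = tau**2 * h**2 / (2 * math.pi**2 * m * e * E) / 2
    speed = 2 * math.pi * radius * omega
    M = m * speed * radius
    T = m * speed**2 / 2
    return M, T, W

M0 = h / (2 * math.pi)
for tau in (1, 2, 3):
    M, T, W = circular_orbit(tau)
    print(f"tau = {tau}: M / M0 = {M / M0:.6f}, T / W = {T / W:.6f}")
```

The ratio M/M0 comes out equal to τ and T equal to W, independently of the charge E chosen for the nucleus.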
The great number of different stationary states we do not observe except
by investigation of the emission and absorption of radiation. In most of the
other physical phenomena, however, we only observe the atoms of the matter
in a single distinct state, i.e., the state of the atoms at low temperature.
From the preceding considerations we are immediately led to the assumption
that the “permanent” state is the one among the stationary states during
the formation of which the greatest amount of energy is emitted. According
to the equation (3) on p. 5, this state is the one which corresponds to τ = 1.

§ 4. Absorption of Radiation

In order to account for Kirchhoff’s law it is necessary to introduce
assumptions on the mechanism of absorption of radiation which correspond to those
we have used considering the emission. Thus we must assume that a system
consisting of a nucleus and an electron rotating round it under certain cir-
cumstances can absorb a radiation of a frequency equal to the frequency of
the homogeneous radiation emitted during the passing of the system between
different stationary states. Let us consider the radiation emitted during the
passing of the system between two stationary states A1 and A2 correspond-
ing to values for τ equal to τ1 and τ2 , τ1 > τ2 . As the necessary condition
of the radiation in question was the presence of systems in the state A1 , we
must assume that the necessary condition for an absorption of the radiation
is the presence of systems in the state A2 .
These considerations seem to be in conformity with experiments on
absorption in gases. In hydrogen gas at ordinary conditions for instance
there is no absorption of a radiation of a frequency corresponding to the
line-spectrum of this gas; such an absorption is only observed in hydrogen
gas in a luminous state. This is what we should expect according to the
above. We have on p. 9 assumed that the radiation in question was emitted
during the passing of the systems between stationary states corresponding
to τ ≥ 2. The state of the atoms in hydrogen gas at ordinary conditions
should, however, correspond to τ = 1; furthermore, hydrogen atoms at
ordinary conditions combine into molecules, i.e., into systems in which the
electrons have frequencies different from those in the atoms (see Part III).
From the circumstance that certain substances in a non-luminous state, as,
for instance, sodium vapour, absorb radiation corresponding to lines in the
line-spectra of the substances, we may, on the other hand, conclude that
the lines in question are emitted during the passing of the system between
two states, one of which is the permanent state.

14
J.W. Nicholson, loc. cit. p. 679.

How much the above considerations differ from an interpretation based
on the ordinary electrodynamics is perhaps most clearly shown by the fact
that we have been forced to assume that a system of electrons will absorb
a radiation of a frequency different from the frequency of vibration of the
electrons calculated in the ordinary way. It may in this connexion be of
interest to mention a generalization of the considerations to which we are led
by experiments on the photo-electric effect and which may be able to throw
some light on the problem in question. Let us consider a state of the system in
which the electron is free, i.e., in which the electron possesses kinetic energy
sufficient to remove to infinite distances from the nucleus. If we assume
that the motion of the electron is governed by the ordinary mechanics and
that there is no (sensible) energy radiation, the total energy of the system
– as in the above considered stationary states – will be constant. Further,
there will be perfect continuity between the two kinds of states, as the
difference between frequency and dimensions of the system in successive
stationary states will diminish without limit if τ increases. In the following
considerations we shall for the sake of brevity refer to the two kinds of states
in question as “mechanical” states; by this notation only emphasizing the
assumption that the motion of the electron in both cases can be accounted
for by the ordinary mechanics.
Tracing the analogy between the two kinds of mechanical states, we
might now expect the possibility of an absorption of radiation, not only
corresponding to the passing of the system between two different stationary
states, but also corresponding to the passing between one of the stationary
states and a state in which the electron is free; and as above, we might
expect that the frequency of this radiation was determined by the equation
E = hν, where E is the difference between the total energy of the system
in the two states. As will be seen, such an absorption of radiation is just
what is observed in experiments on ionization by ultra-violet light and by
Röntgen rays. Obviously, we get in this way the same expression for the
kinetic energy of an electron ejected from an atom by photo-electric effect
as that deduced by Einstein,15 i.e., T = hν − W , where T is the kinetic
energy of the electron ejected, and W the total amount of energy emitted
during the original binding of the electron.
15
A. Einstein, Ann. d. Phys. XVII. p. 146 (1905).
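
The relation T = hν − W lends itself to a small numerical illustration. The sketch below is not part of the original paper: it assumes CGS units, takes h = 6.5 · 10⁻²⁷ and, for W, the rounded value 2.0 · 10⁻¹¹ erg that Part II of this paper quotes for a single electron with F = 1. The function names are hypothetical.

```python
# Illustrative sketch of T = h*nu - W (CGS units: erg, 1/s).
# H and W_BINDING are the rounded values quoted in Part II of the paper;
# the function names are invented for this example.

H = 6.5e-27          # Planck's constant, erg*s (value used in the paper)
W_BINDING = 2.0e-11  # energy emitted during binding, erg (Part II, F = 1)


def threshold_frequency(w_binding: float, h: float = H) -> float:
    """Lowest frequency able to free the electron: nu = W / h."""
    return w_binding / h


def ejected_kinetic_energy(nu: float, w_binding: float, h: float = H) -> float:
    """Kinetic energy T = h*nu - W of the ejected electron.

    Raises ValueError below the threshold, where no ionization is expected.
    """
    t = h * nu - w_binding
    if t < 0:
        raise ValueError("frequency below threshold W/h: electron stays bound")
    return t


if __name__ == "__main__":
    nu0 = threshold_frequency(W_BINDING)
    print(f"threshold frequency: {nu0:.3e} 1/s")
    print(f"T at 2*nu0: {ejected_kinetic_energy(2 * nu0, W_BINDING):.3e} erg")
```

Under these assumed values, radiation of twice the threshold frequency leaves the ejected electron with a kinetic energy equal to W itself.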

The above considerations may further account for the result of some ex-
periments of R.W. Wood16 on absorption of light by sodium vapour. In these
experiments, an absorption corresponding to a very great number of lines
in the principal series of the sodium spectrum is observed, and in addition
a continuous absorption which begins at the head of the series and extends
to the extreme ultra-violet. This is exactly what we should expect accord-
ing to the analogy in question, and, as we shall see, a closer consideration
of the above experiments allows us to trace the analogy still further. As
mentioned on p. 9 the radii of the orbits of the electrons will for stationary
states, corresponding to high values for τ be very great compared with or-
dinary atomic dimensions. This circumstance was used as an explanation of
the non-appearance in experiments with vacuum-tubes of lines correspond-
ing to the higher numbers in the Balmer series of the hydrogen spectrum.
This is also in conformity with experiments on the emission spectrum of
sodium; in the principal series of the emission spectrum of this substance
rather few lines are observed. Now in Wood’s experiments the pressure was
not very low, the states corresponding to high values for τ could therefore
not appear; yet in the absorption spectrum about 50 lines were detected.
In the experiments in question we consequently observe an absorption of
radiation which is not accompanied by a complete transition between two
different stationary states. According to the present theory we must assume
that this absorption is followed by an emission of energy during which the
systems pass back to the original stationary state. If there are no collisions
between the different systems this energy will be emitted as a radiation of
the same frequency as that absorbed, and there will be no true absorption
but only a scattering of the original radiation; a true absorption will not
occur unless the energy in question is transformed by collisions into kinetic
energy of free particles. In analogy we may now from the above experiments
conclude that a bound electron – also in cases in which there is no
ionization – will have an absorbing (scattering) influence on a homogeneous
radiation, as soon as the frequency of the radiation is greater than W/h,
where W is the total amount of energy emitted during the binding of the
electron. This would be highly in favour of a theory of absorption as the one
sketched above, as there can in such a case be no question of a coincidence
of the frequency of the radiation and a characteristic frequency of vibration
of the electron. It will further be seen that the assumption, that there will
be an absorption (scattering) of any radiation corresponding to a transition
between two different mechanical states, is in perfect analogy with the
assumption generally used that a free electron will have an absorbing
(scattering) influence on light of any frequency. Corresponding considerations
will hold for the emission of radiation.

16
R.W. Wood, Physical Optics, p. 513 (1911).

In analogy to the assumption used in this paper that the emission of
line-spectra is due to the re-formation of atoms after one or more of the
lightly bound electrons are removed, we may assume that the homogeneous
Röntgen radiation is emitted during the settling down of the systems after one
of the firmly bound electrons escapes, e.g. by impact of cathode particles.17
In the next part in this paper, dealing with the constitution of atoms, we
shall consider the question more closely and try to show that a calculation
based on this assumption is in quantitative agreement with the results of
experiments: here we shall only mention briefly a problem with which we
meet in such a calculation.
Experiments on the phenomena of X-rays suggest that not only the emis-
sion and absorption of radiation cannot be treated by the help of the ordinary
electrodynamics, but not even the result of a collision between two electrons
of which the one is bound in an atom. This is perhaps most clearly shown by
some very instructive calculations on the energy of β-particles emitted from
radioactive substances recently published by Rutherford.18 These calcula-
tions strongly suggest that an electron of great velocity in passing through
an atom and colliding with the electrons bound will lose energy in distinct
finite quanta. As is immediately seen, this is very different from what we
might expect if the result of the collisions was governed by the usual me-
chanical laws. The failure of the classical mechanics in such a problem might
also be expected beforehand from the absence of anything like equipartition
of kinetic energy between free electrons and electrons bound in atoms. From
the point of view of the “mechanical” states we see, however, that the follow-
ing assumption – which is in accord with the above analogy – might be able
to account for the result of Rutherford’s calculation and for the absence of
equipartition of kinetic energy: two colliding electrons, bound or free, will,
after the collision as well as before, be in mechanical states. Obviously, the
introduction of such an assumption would not make any alteration neces-
sary in the classical treatment of a collision between two free particles. But,
considering a collision between a free and a bound electron, it would follow
that the bound electron by the collision could not acquire a less amount of
energy than the difference in energy corresponding to successive stationary
states, and consequently that the free electron which collides with it could
not lose a less amount.

17
Compare J.J. Thomson, Phil. Mag. XXIII. p. 456 (1912).
18
E. Rutherford, Phil. Mag. XXIV. pp. 453 & 893 (1912).

The preliminary and hypothetical character of the above considerations
need not be emphasized. The intention, however, has been to show that
the sketched generalization of the theory of the stationary states possibly
may afford a simple basis of representing a number of experimental facts
which cannot be explained by help of the ordinary electrodynamics, and
that the assumptions used do not seem to be inconsistent with experiments
on phenomena for which a satisfactory explanation has been given by the
classical dynamics and the wave theory of light.

§ 5. The Permanent State of an Atomic System

We shall now return to the main object of this paper – the discussion of the
“permanent” state of a system consisting of nuclei and bound electrons. For
a system consisting of a nucleus and an electron rotating round it, this state
is, according to the above, determined by the condition that the angular
momentum of the electron round the nucleus is equal to h/2π.
On the theory of this paper the only neutral atom which contains a single
electron is the hydrogen atom. The permanent state of this atom should
correspond to the values of a and ω calculated on p. 5. Unfortunately,
however, we know very little of the behaviour of hydrogen atoms on account
of the small dissociation of hydrogen molecules at ordinary temperatures. In
order to get a closer comparison with experiments, it is necessary to consider
more complicated systems.
Considering systems in which more electrons are bound by a positive
nucleus, a configuration of the electrons which presents itself as a permanent
state is one in which the electrons are arranged in a ring round the nucleus. In the
discussion of this problem on the basis of the ordinary electrodynamics, we
meet – apart from the question of the energy radiation – with new difficulties
due to the question of the stability of the ring. Disregarding for a moment
this latter difficulty, we shall first consider the dimensions and frequency of
the systems in relation to Planck’s theory of radiation.
Let us consider a ring consisting of n electrons rotating round a nucleus
of charge E, the electrons being arranged at equal angular intervals around the
circumference of a circle of radius a.
The total potential energy of the system consisting of the electrons and
the nucleus is
$$P = -\frac{ne}{a}\cdot(E - e s_n),$$
where
$$s_n = \frac{1}{4}\sum_{s=1}^{s=n-1} \operatorname{cosec}\frac{s\pi}{n}.$$
For the radial force exerted on an electron by the nucleus and the other
electrons we get
$$F = -\frac{1}{n}\cdot\frac{dP}{da} = -\frac{e}{a^2}\cdot(E - e s_n).$$
Denoting the kinetic energy of an electron by T and neglecting the elec-
tromagnetic forces due to the motion of the electrons (see Part II), we get,
putting the centrifugal force on an electron equal to the radial force,
$$\frac{2T}{a} = \frac{e}{a^2}\cdot(E - e s_n),$$
or
$$T = \frac{e}{2a}\cdot(E - e s_n).$$
From this we get for the frequency of revolution
$$\omega = \frac{1}{2\pi}\sqrt{\frac{e\,(E - e s_n)}{m a^3}}.$$
The total amount of energy W necessary to transfer to the system in order
to remove the electrons to infinite distances apart from the nucleus and from
each other is
$$W = -P - nT = \frac{ne}{2a}\cdot(E - e s_n) = nT,$$
equal to the total kinetic energy of the electrons.
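
The ring formulas above can be evaluated directly. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, the quantities are in CGS units, and the check at the end only confirms that the expressions for T and ω agree with T = ½mv² for v = 2πaω.

```python
import math


def s_n(n: int) -> float:
    """s_n = (1/4) * sum_{s=1}^{n-1} cosec(s*pi/n); zero for a single electron."""
    return 0.25 * sum(1.0 / math.sin(s * math.pi / n) for s in range(1, n))


def ring_kinetic_energy(n: int, big_e: float, e: float, a: float) -> float:
    """Kinetic energy per electron, T = e (E - e s_n) / (2a)."""
    return e * (big_e - e * s_n(n)) / (2.0 * a)


def ring_frequency(n: int, big_e: float, e: float, m: float, a: float) -> float:
    """Frequency of revolution, omega = (1/2pi) sqrt(e (E - e s_n) / (m a^3))."""
    return math.sqrt(e * (big_e - e * s_n(n)) / (m * a**3)) / (2.0 * math.pi)


def ring_binding_energy(n: int, big_e: float, e: float, a: float) -> float:
    """Energy W = n T needed to remove all n electrons to infinity."""
    return n * ring_kinetic_energy(n, big_e, e, a)


if __name__ == "__main__":
    for n in range(1, 9):
        print(f"s_{n} = {s_n(n):.3f}")

    # Illustrative numbers only: two electrons round a nucleus of charge 2e.
    e, m, a = 4.7e-10, 8.85e-28, 0.55e-8
    t = ring_kinetic_energy(2, 2 * e, e, a)
    v = 2.0 * math.pi * a * ring_frequency(2, 2 * e, e, m, a)
    print(f"T = {t:.3e} erg, (1/2) m v^2 = {0.5 * m * v**2:.3e} erg")
```

The only change from the single-electron case is the replacement of E by E − e s_n, which is why one helper for s_n is enough to cover rings of any size.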
We see that the only difference between the above formulae and those holding
for the motion of a single electron in a circular orbit round a nucleus is the
exchange of E for E − esn . It is also immediately seen that corresponding to
the motion of an electron in an elliptical orbit round a nucleus, there will be
a motion of the n electrons in which each rotates in an elliptical orbit with
the nucleus in the focus, and the n electrons at any moment are situated at
equal angular intervals on a circle with the nucleus as the centre. The major
axis and frequency of the orbit of the single electrons will for this motion
be given by the expressions (1) on p. 3 if we replace E by E − esn and W
by W/n. Let us now suppose that the system of n electrons rotating in a
ring round a nucleus is formed in a way analogous to the one assumed for
a single electron rotating round a nucleus. It will thus be assumed that the
electrons, before the binding by the nucleus, were at a great distance apart
from the latter and possessed no sensible velocities, and also that during
the binding a homogeneous radiation is emitted. As in the case of a single
electron, we have here that the total amount of energy emitted during the
formation of the system is equal to the final kinetic energy of the electrons.
If we now suppose that during the formation of the system the electrons at
any moment are situated at equal angular intervals on the circumference of a
circle with the nucleus in the centre, from analogy with the considerations
on p. 5, we are here led to assume the existence of a series of stationary
configurations in which the kinetic energy per electron is equal to τ hω/2,
where τ is an entire number, h Planck’s constant, and ω the frequency of
revolution. The configuration in which the greatest amount of energy is
emitted is, as before, the one in which τ = 1. This configuration we shall
assume to be the permanent state of the system if the electrons in this state
are arranged in a single ring. As for the case of a single electron we get
that the angular momentum of each of the electrons is equal to h/2π. It
may be remarked that instead of considering the single electrons we might
have considered the ring as an entity. This would, however, lead to the same
result, for in this case the frequency of revolution ω will be replaced by the
frequency nω of the radiation from the whole ring calculated from ordinary
electrodynamics, and T by the total kinetic energy nT .
There may be many other stationary states corresponding to other ways
of forming the system. The assumption of the existence of such states seems
necessary in order to account for the line-spectra of systems containing more
than one electron (p. 11); it is also suggested by the theory of Nicholson
mentioned on p. 6, to which we shall return in a moment. The consideration
of the spectra, however, gives, as far as I can see, no indication of the
existence of stationary states in which all the electrons are arranged in a ring
and which correspond to greater values for the total energy emitted than
the one we above have assumed to be the permanent state. Further, there
may be stationary configurations of a system of n electrons and a nucleus
of charge E in which all the electrons are not arranged in a single ring. The
question, however, of the existence of such stationary configurations is not
essential for our determination of the permanent state, as long as we assume
that the electrons in this state of the system are arranged in a single ring.
Systems corresponding to more complicated configurations will be discussed
on p. 24.
Using the relation T = hω/2 we get, by help of the above expressions
for T and ω, values for a and ω corresponding to the permanent state of the
system which only differ from those given by the equations (3) on p. 5, by
exchange of E for E − esn .
The question of stability of a ring of electrons rotating round a positive
charge is discussed in great detail by Sir J.J. Thomson.19 An adaptation of
Thomson’s analysis for the case here considered of a ring rotating round a
nucleus of negligibly small linear dimensions is given by Nicholson.20 The
investigation of the problem in question naturally divides in two parts: one
concerning the stability for displacements of the electrons in the plane of the
ring; one concerning displacements perpendicular to this plane. As Nicholson’s
calculations show, the answer to the question of stability differs very
much in the two cases in question. While the ring for the latter displacements
in general is stable if the number of electrons is not great, the ring is
in no case considered by Nicholson stable for displacement of the first kind.
According, however, to the point of view taken in this paper, the ques-
tion of stability for displacements of the electrons in the plane of the ring is
most intimately connected with the question of the mechanism of the bind-
ing of the electrons, and like the latter cannot be treated on the basis of
the ordinary dynamics. The hypothesis of which we shall make use in the
following is that the stability of a ring of electrons rotating round a nucleus
is secured through the above condition of the universal constancy of the an-
gular momentum, together with the further condition that the configuration
of the particles is the one by the formation of which the greatest amount of energy
is emitted. As will be shown, this hypothesis is, concerning the question of
stability for a displacement of the electrons perpendicular to the plane of
the ring, equivalent to that used in ordinary mechanical calculations.
Returning to the theory of Nicholson on the origin of lines observed in
the spectrum of the solar corona, we shall now see that the difficulties men-
tioned on p. 7 may be only formal. In the first place, from the point of
view considered above the objection as to the instability of the systems for
displacements of the electrons in the plane of the ring may not be valid. Fur-
ther, the objection as to emission of the radiation in quanta will not have
reference to the calculations in question, if we assume that in the coronal
spectrum we are not dealing with a true emission but only with a scattering
of radiation. This assumption seems probable if we consider the conditions
in the celestial body in question: for on account of the enormous rarefaction
of the matter there may be comparatively few collisions to disturb the
stationary states and to cause a true emission of light corresponding to the
transition between different stationary states; on the other hand there will
in the solar corona be intense illumination of light of all frequencies which
may excite the natural vibrations of the systems in the different stationary
states. If the above assumption is correct, we immediately understand the
entirely different form for the laws connecting the lines discussed by
Nicholson and those connecting the ordinary line-spectra considered in
this paper.

19
Loc. cit.
20
Loc. cit.

Proceeding to consider systems of more complicated constitution, we
shall make use of the following theorem, which can be very simply proved:
“In every system consisting of electrons and positive nuclei, in which the nuclei
are at rest and the electrons move in circular orbits with a velocity small
compared with the velocity of light, the kinetic energy will be numerically
equal to half the potential energy.”
By help of this theorem we get – as in the previous cases of a single
electron or of a ring rotating round a nucleus – that the total amount of
energy emitted, by the formation of the systems from a configuration in
which the distances apart of the particles are infinitely great and in which
the particles have no velocities relative to each other, is equal to the kinetic
energy of the electrons in the final configuration.
In analogy with the case of a single ring we are here led to assume that
corresponding to any configuration of equilibrium a series of geometrically
similar, stationary configurations of the system will exist in which the kinetic
energy of every electron is equal to the frequency of revolution multiplied
by τ h/2, where τ is an entire number and h Planck’s constant. In any such
series of stationary configurations the one corresponding to the greatest
amount of energy emitted will be the one in which τ for every electron is
equal to 1. Considering that the ratio of kinetic energy to frequency for a
particle rotating in a circular orbit is equal to π times the angular momentum
round the center of the orbit, we are therefore led to the following simple
generalization of the hypotheses mentioned on pp. 15 and 22.
“In any molecular system consisting of positive nuclei and electrons in
which the nuclei are at rest relative to each other and the electrons move in
circular orbits, the angular momentum of every electron round the centre of
its orbit will in the permanent state of the system be equal to h/2π, where h
is Planck’s constant.”21
In analogy with the considerations on p. 23, we shall assume that a
configuration satisfying this condition is stable if the total energy of the
system is less than in any neighbouring configuration satisfying the same
condition of the angular momentum of the electrons.

21
In the considerations leading to this hypothesis we have assumed that the velocity of
the electrons is small compared with the velocity of light. The limits of the validity of this
assumption will be discussed in Part II.

As mentioned in the introduction, the above hypothesis will be used in a
following communication as a basis for a theory of the constitution of atoms
and molecules. It will be shown that it leads to results which seem to be in
conformity with experiments on a number of different phenomena.
The foundation of the hypothesis has been sought entirely in its relation
with Planck’s theory of radiation; by help of considerations given later it
will be attempted to throw some further light on the foundation of it from
another point of view.

April 5, 1913

N. Bohr, Philos. Mag. 26, 476 (1913)

On the Constitution of Atoms and Molecules

N. Bohr,
Dr. phil. Copenhagen
(Received July 1913)

Part II. – Systems containing only a Single Nucleus1

§ 1 General Assumptions

Following the theory of Rutherford, we shall assume that the atoms of the
elements consist of a positively charged nucleus surrounded by a cluster
of electrons. The nucleus is the seat of the essential part of the mass of
the atom, and has linear dimensions exceedingly small compared with the
distance apart of the electrons in the surrounding cluster.
As in the previous paper, we shall assume that the cluster of electrons is
formed by the successive binding by the nucleus of electrons initially nearly
at rest, energy at the same time being radiated away. This will go on until,
when the total negative charge on the bound electrons is numerically equal to
the positive charge on the nucleus, the system will be neutral and no longer
able to exert sensible forces on electrons at distances from the nucleus great
in comparison with the dimensions of the orbits of the bound electrons. We
may regard the formation of helium from α rays as an observed example of
a process of this kind, an α particle on this view being identical with the
nucleus of a helium atom.

1
Part I was published in Phil. Mag. XXVI. p. 1 (1913).

On account of the small dimensions of the nucleus, its internal structure
will not be of sensible influence on the constitution of the cluster of electrons,
and consequently will have no effect on the ordinary physical and chemical
properties of the atom. The latter properties on this theory will depend
entirely on the total charge and mass of the nucleus; the internal structure
of the nucleus will be of influence only on the phenomena of radioactivity.
From the result of experiments on large-angle scattering of α-rays, Rutherford2
found an electric charge on the nucleus corresponding per atom
to a number of electrons approximately equal to half the atomic weight.
This result seems to be in agreement with the number of electrons per atom
calculated from experiments on scattering of Röntgen radiation.3 The total
experimental evidence supports the hypothesis4 that the actual number of
electrons in a neutral atom with a few exceptions is equal to the number
which indicates the position of the corresponding element in the series of
elements arranged in order of increasing atomic weight. For example on this
view, the atom of oxygen which is the eighth element of the series has eight
electrons and a nucleus carrying eight unit charges.
We shall assume that the electrons are arranged at equal angular inter-
vals in coaxial rings rotating round the nucleus. In order to determine the
frequency and dimensions of the rings we shall use the main hypothesis of
the first paper, viz., that in the permanent state of an atom the angular
momentum of every electron round the centre of its orbit is equal to the
universal value h/2π, where h is Planck’s constant. We shall take as a con-
dition of stability, that the total energy of the system in the configuration in
question is less than in any neighbouring configuration satisfying the same
condition of the angular momentum of the electrons.
If the charge on the nucleus and the number of electrons in the different
rings is known, the condition in regard to the angular momentum of the
electrons will, as shown in § 2, completely determine the configuration of
the system, i.e., the frequency of revolution and the linear dimensions of the
rings. Corresponding to different distributions of the electrons in the rings,
however, there will, in general, be more than one configuration which will
satisfy the condition of the angular momentum together with the condition
of stability.
2
Comp. also Geiger and Marsden, Phil. Mag. XXV. p. 604 (1913).
3
Comp. C.G. Barkla, Phil. Mag. XXI. p. 648 (1911).
4
Comp. A.v.d. Broek, Phys. Zeitschr. XIV. p. 32 (1913).

In § 3 and § 4 it will be shown that, on the general view of the formation of
the atoms, we are led to indications of the arrangement of the electrons in the
rings which are consistent with those suggested by the chemical properties
of the corresponding element.
In § 5 it will be shown that it is possible from the theory to calculate the
minimum velocity of cathode rays necessary to produce the characteristic
Röntgen radiation from the element, and that this is in approximate
agreement with the experimental values.
In § 6 the phenomena of radioactivity will be briefly considered in relation
to the theory.

§ 2 Configuration and Stability of the System

Let us consider an electron of charge e and mass m which moves in a circular
orbit of radius a with a velocity v small compared with the velocity of light.
Let us denote the radial force acting on the electrons by (e²/a²) F ; F will in
general be dependent on a. The condition of dynamical equilibrium gives
$$\frac{m v^2}{a} = \frac{e^2}{a^2} F.$$
Introducing the condition of universal constancy of the angular momen-
tum of the electron, we have
$$m v a = \frac{h}{2\pi}.$$

From these two conditions we now get
$$a = \frac{h^2}{4\pi^2 e^2 m} \cdot F^{-1} \quad\text{and}\quad v = \frac{2\pi e^2}{h} \cdot F; \tag{1}$$
and for the frequency of revolution ω consequently
$$\omega = \frac{4\pi^2 e^4 m}{h^3} \cdot F^2. \tag{2}$$
If F is known, the dimensions and frequency of the corresponding orbit are
simply determined by (1) and (2). For a ring of n electrons rotating round
a nucleus of charge N e we have (comp. Part I., p. 20)

$$F = N - s_n, \quad\text{where}\quad s_n = \frac{1}{4}\sum_{s=1}^{s=n-1} \operatorname{cosec}\frac{s\pi}{n}.$$

The values for sn from n = 1 to n = 16 are given in the table 1.
For systems consisting of nuclei and electrons in which the first are at
rest and the latter move in circular orbits with a velocity small compared
with the velocity of light, we have shown (see Part I., p. 21) that the
total kinetic energy of the electrons is equal to the total amount of energy
emitted during the formation of the system from an original configuration in
which all the particles are at rest and at infinite distances from each other.
Denoting this amount of energy by W , we consequently get
$$W = \sum \frac{m}{2} v^2 = \frac{2\pi^2 e^4 m}{h^2} \sum F^2. \tag{3}$$
Putting in (1), (2), and (3) e = 4.7 · 10⁻¹⁰, e/m = 5.31 · 10¹⁷, and h = 6.5 · 10⁻²⁷
we get
$$a = 0.55 \cdot 10^{-8} F^{-1}, \qquad v = 2.1 \cdot 10^{8} F,$$
$$\omega = 6.2 \cdot 10^{15} F^2, \qquad W = 2.0 \cdot 10^{-11} \sum F^2. \tag{4}$$
In neglecting the magnetic forces due to the motion of the electrons
we have in Part I. assumed that the velocities of the particles are small
compared with the velocity of light. The above calculations show that for
this to hold, F must be small compared with 150. As will be seen, the latter
condition will be satisfied for all the electrons in the atoms of elements of
low atomic weight and for a greater part of the electrons contained in the
atoms of the other elements.
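
The numerical values in (4) can be reproduced from the constants quoted above. The sketch below is an illustration only, in CGS units; the function name and the dictionary keys are hypothetical, and the velocity of light, which this passage does not quote, is assumed to be 3.0 · 10¹⁰ cm/s. The frequency is computed as v/(2πa), which follows from the two expressions in (1).

```python
import math

# Constants as quoted in the text (CGS / electrostatic units).
E_CHARGE = 4.7e-10        # e
E_OVER_M = 5.31e17        # e/m
H = 6.5e-27               # Planck's constant h
M = E_CHARGE / E_OVER_M   # electron mass m, derived from e and e/m

# Assumption: the velocity of light is not quoted in this passage.
C_LIGHT = 3.0e10          # cm/s


def orbit(F: float) -> dict:
    """Radius, velocity, frequency and energy of one circular orbit for a given F."""
    a = H**2 / (4 * math.pi**2 * E_CHARGE**2 * M) / F   # from (1)
    v = 2 * math.pi * E_CHARGE**2 / H * F               # from (1)
    omega = v / (2 * math.pi * a)                       # revolutions per second
    w = 0.5 * M * v**2                                  # one term of the sum in (3)
    return {"a": a, "v": v, "omega": omega, "W": w}


if __name__ == "__main__":
    for name, value in orbit(1.0).items():
        print(f"{name} = {value:.3e}")
    # Value of F at which v would formally reach the velocity of light.
    print(f"v = c at F = {C_LIGHT / orbit(1.0)['v']:.0f}")
```

With these inputs the velocity would formally reach that of light at F of about 140, the same order as the bound of 150 stated in the text.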
If the velocity of the electrons is not small compared with the velocity
of light, the constancy of the angular momentum no longer involves a
constant ratio between the energy and the frequency of revolution. Without
introducing new assumptions, we cannot therefore in this case determine the
configuration of the systems on the basis of the consideration in Part I. Con-
siderations given later suggest, however, that the constancy of the angular
momentum is the principal condition. Applying this condition for velocities
not small compared with the velocity of light, we get the same expression
for v as that given by (1), while the quantity m in the expressions for a and
ω is replaced by m/√(1 − v²/c²), and in the expression for W by
$$m \cdot 2\,\frac{c^2}{v^2} \cdot \left(1 - \sqrt{1 - \frac{v^2}{c^2}}\right).$$
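
As a quick check on these two replacements, both should leave m unchanged when v is small compared with c. The following sketch (hypothetical function names, with beta standing for v/c) evaluates the two factors by which m is multiplied.

```python
import math


def mass_factor_orbit(beta: float) -> float:
    """Factor multiplying m in the expressions for a and omega: 1/sqrt(1 - beta^2)."""
    return 1.0 / math.sqrt(1.0 - beta**2)


def mass_factor_energy(beta: float) -> float:
    """Factor multiplying m in the expression for W: 2 (1 - sqrt(1 - beta^2)) / beta^2."""
    return 2.0 * (1.0 - math.sqrt(1.0 - beta**2)) / beta**2


if __name__ == "__main__":
    for beta in (0.001, 0.01, 0.1, 0.5, 0.9):
        print(f"v/c = {beta}: "
              f"orbit factor = {mass_factor_orbit(beta):.6f}, "
              f"energy factor = {mass_factor_energy(beta):.6f}")
```

Both factors tend to 1 as v/c tends to 0, so the expressions (1), (2) and (3) are recovered for slow electrons.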

As stated in Part I., a calculation based on the ordinary mechanics gives
the result, that a ring of electrons rotating round a positive nucleus in general
is unstable for displacement of the electrons in the plane of the ring. In order
to escape from this difficulty, we have assumed that the ordinary principles
of mechanics cannot be used in the discussion of the problem in question,
any more than in the discussion of the connected problem of the mechanism
of binding of electrons. We have also assumed that the stability for such
displacement is secured through the introduction of the hypothesis of the
universal constancy of the angular momentum of the electrons.
As is easily shown, the latter assumption is included in the condition of
stability in § 1. Consider a ring of electrons rotating round a nucleus, and
assume that the system is in dynamical equilibrium and that the radius of the
ring is $a_0$, the velocity of the electrons $v_0$, the total kinetic energy
$T_0$, and the potential energy $P_0$. As shown in Part I. (p. 21) we have
$P_0 = -2T_0$. Next consider a configuration of the system in which the
electrons, under influence of extraneous forces, rotate with the same angular
momentum round the nucleus in a ring of radius $a = \alpha a_0$. In this case
we have $P = \frac{1}{\alpha} P_0$, and on account of the uniformity of the
angular momentum $v = \frac{1}{\alpha} v_0$ and $T = \frac{1}{\alpha^2} T_0$.
Using the relation $P_0 = -2T_0$, we get
$$P + T = \frac{1}{\alpha} P_0 + \frac{1}{\alpha^2} T_0 = P_0 + T_0 + T_0 \left( 1 - \frac{1}{\alpha} \right)^2 .$$
We see that the total energy of the new configuration is greater than in
the original. According to the condition of stability in § 1 the system is
consequently stable for the displacement considered. In this connexion, it
may be remarked that in Part I. we have assumed that the frequency of
radiation emitted or absorbed by the systems cannot be determined from the
frequencies of vibration of the electrons in the plane of the orbits, calculated
by help of the ordinary mechanics. We have, on the contrary, assumed
that the frequency of the radiation is determined by the condition hν = E,
where ν is the frequency, h Planck’s constant, and E the difference in energy
corresponding to two different “stationary” states of the system.
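The energy comparison for the displaced ring can be checked numerically. The following is a minimal sketch, not part of the original paper: it works in units where T0 = 1, and the function names `total_energy` and `energy_excess` are illustrative only. It evaluates P + T directly from P = P0/α and T = T0/α² and compares the excess over the equilibrium value with the closed form T0(1 − 1/α)².

```python
def total_energy(alpha, t0=1.0):
    """P + T for a ring of radius alpha * a0 at fixed angular momentum.

    Uses P = P0 / alpha and T = T0 / alpha**2 with P0 = -2 * T0.
    """
    p0 = -2.0 * t0
    return p0 / alpha + t0 / alpha**2


def energy_excess(alpha, t0=1.0):
    """Energy of the displaced ring minus that of the equilibrium ring."""
    return total_energy(alpha, t0) - total_energy(1.0, t0)


if __name__ == "__main__":
    for alpha in (0.5, 0.9, 1.0, 1.1, 2.0):
        closed_form = (1.0 - 1.0 / alpha) ** 2  # T0 * (1 - 1/alpha)^2, T0 = 1
        print(alpha, energy_excess(alpha), closed_form)
```

The excess vanishes only at α = 1 and is positive on either side, which is the sense in which the ring is stable for this displacement.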
In considering the stability of a ring of electrons rotating round a nucleus
for displacements of the electrons perpendicular to the plane of the ring,
imagine a configuration of the system in which the electrons are displaced by
$\delta z_1, \delta z_2, \ldots, \delta z_n$ respectively, and suppose that the
electrons, under influence of extraneous forces, rotate in circular orbits
parallel to the original plane with the same radii and the same angular
momentum round the axis of the system as before. The kinetic energy is
unaltered by the displacement, and neglecting powers of the quantities
$\delta z_1, \ldots, \delta z_n$ higher than the second, the increase of the
potential energy of the system is given by
$$\frac{1}{2} \frac{e^2}{a^3} N \sum (\delta z)^2 - \frac{1}{32} \frac{e^2}{a^3} \sum \sum \left| \operatorname{cosec}^3 \frac{\pi (r - s)}{n} \right| (\delta z_r - \delta z_s)^2 ,$$
where a is the radius of the ring, N e the charge on the nucleus, and n the
number of electrons. According to the condition of stability in § 1 the system
is stable for the displacement considered, if the above expression is positive
for arbitrary values of $\delta z_1, \ldots, \delta z_n$. By a simple
calculation it can be shown that the latter condition is equivalent to the
condition
$$N > p_{n,0} - p_{n,m}, \qquad (5)$$
where $m$ denotes the whole number (smaller than $n$) for which
$$p_{n,k} = \frac{1}{8} \sum_{s=1}^{s=n-1} \cos \left( 2k \frac{s\pi}{n} \right) \operatorname{cosec}^3 \frac{s\pi}{n}$$
has its smallest value. This condition is identical with the condition of
stability for displacements of the electrons perpendicular to the plane of the
ring, deduced by help of ordinary mechanical considerations.⁵
A suggestive illustration is obtained by imagining that the displacements
considered are produced by the effect of extraneous forces acting on the
electrons in a direction parallel to the axis of the ring. If the displacements
are produced infinitely slowly the motion of the electrons will at any moment
be parallel to the original plane of the ring, and the angular momentum of
each of the electrons round the centre of its orbit will obviously be equal to
its original value; the increase in the potential energy of the system will be
equal to the work done by the extraneous forces during the displacements.
We are thus led to assume that the ordinary mechanics can be used in
calculating the vibrations of the electrons perpendicular to the plane of the
ring – contrary to the case of vibrations in the plane of the ring. This
assumption is supported by the apparent agreement with observations obtained
by Nicholson in his theory of the origin of lines in the spectra of the solar
corona and stellar nebulae (see Part I. pp. 6 & 23). In addition it will be
shown later that the assumption seems to be in agreement with experiments on
dispersion.
The following table gives the values of $s_n$ and $p_{n,0} - p_{n,m}$ from
$n = 1$ to $n = 16$.

Table 1.

  n    s_n      p_{n,0} − p_{n,m}        n    s_n      p_{n,0} − p_{n,m}

  1    0         0                       9    3.328    13.14
  2    0.25      0.25                   10    3.863    18.13
  3    0.577     0.58                   11    4.416    23.60
  4    0.957     1.41                   12    4.984    30.80
  5    1.377     2.43                   13    5.565    38.57
  6    1.828     4.25                   14    6.159    48.38
  7    2.305     6.35                   15    6.764    58.83
  8    2.805     9.56                   16    7.379    71.65

⁵ Comp. J.W. Nicholson, Month. Not. Roy. Astr. Soc. 72. p. 52 (1912).

We see from the table that the number of electrons which can rotate in
a single ring round a nucleus of charge N e increases only very slowly for
increasing N ; for N = 20 the maximum value is n = 10; for N = 40, n = 13; for
N = 60, n = 15. We see, further, that a ring of n electrons cannot rotate in
a single ring round a nucleus of charge ne unless n < 8.
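The entries of Table 1 and the ring sizes just quoted can be recomputed from condition (5). The sketch below is an illustration rather than the author's procedure. The expression for s_n is not stated in this section; the code assumes the Part I definition s_n = ¼ Σ cosec(sπ/n), which reproduces the tabulated column. The function names (`s_n`, `p`, `stability_threshold`, `max_ring_size`) are hypothetical.

```python
import math


def s_n(n):
    """Assumed Part I definition: s_n = (1/4) * sum_{s=1}^{n-1} cosec(s*pi/n)."""
    return 0.25 * sum(1.0 / math.sin(s * math.pi / n) for s in range(1, n))


def p(n, k):
    """p_{n,k} = (1/8) * sum_{s=1}^{n-1} cos(2*k*s*pi/n) * cosec^3(s*pi/n)."""
    return 0.125 * sum(
        math.cos(2 * k * s * math.pi / n) / math.sin(s * math.pi / n) ** 3
        for s in range(1, n)
    )


def stability_threshold(n):
    """p_{n,0} - p_{n,m}, where m is the whole number k < n minimising p_{n,k}."""
    return p(n, 0) - min(p(n, k) for k in range(n))


def max_ring_size(N, n_max=40):
    """Largest n <= n_max for which condition (5), N > p_{n,0} - p_{n,m}, holds."""
    return max(n for n in range(1, n_max + 1) if N > stability_threshold(n))


if __name__ == "__main__":
    for n in range(1, 17):
        print(n, round(s_n(n), 3), round(stability_threshold(n), 2))
    print([max_ring_size(N) for N in (20, 40, 60)])
```

With N = n the same condition holds for n = 7 but fails for n = 8, in line with the remark that a neutral single ring requires n < 8.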
In the above we have supposed that the electrons move under the influence
of a stationary radial force and that their orbits are exactly circular. The
first condition will not be satisfied if we consider a system containing several
rings of electrons which rotate with different frequencies. If, however, the
distance between the rings is not small in comparison with their radii, and if
the ratio between their frequencies is not near to unity, the deviation from
circular orbits may be very small and the motion of the electrons to a close
approximation may be identical with that obtained on the assumption that
the charge on the electrons is uniformly distributed along the circumference
of the rings. If the ratio between the radii of the rings is not near to unity, the
conditions of stability on this assumption may also be considered sufficient.
We have assumed in § 1 that the electrons in the atoms rotate in coaxial
rings. The calculation indicated that only in the case of systems containing
a great number of electrons will the planes of the rings separate; in the case
of systems containing a moderate number of electrons, all the rings will be
situated in a single plane through the nucleus. For the sake of brevity, we
shall therefore here only consider the latter case.
Let us consider an electric charge $E$ uniformly distributed along the
circumference of a circle of radius $a$.
At a point distant $z$ from the plane of the ring, and at a distance $r$ from
the axis of the ring, the electrostatic potential is given by
$$U = \frac{1}{\pi} E \int_0^{\pi} \frac{d\vartheta}{\sqrt{a^2 + r^2 + z^2 - 2ar\cos\vartheta}} .$$
Putting in this expression $z = 0$ and $\frac{r}{a} = \tan^2\alpha$, and using
the notation
$$K(\alpha) = \int_0^{\pi/2} \frac{d\vartheta}{\sqrt{1 - \sin^2\alpha \cos^2\vartheta}} ,$$
we get for the radial force exerted on an electron in a point in the plane of
the ring
$$e \frac{\partial U}{\partial r} = \frac{Ee}{r^2} Q(\alpha),$$
where
$$Q(\alpha) = \frac{2}{\pi} \sin^4\alpha \left[ K(2\alpha) - \cot\alpha \cdot K'(2\alpha) \right].$$
The corresponding force perpendicular to the plane of the ring at a
distance $r$ from the center of the ring and at a small distance $\delta z$
from its plane is given by
$$e \frac{\partial U}{\partial z} = \frac{Ee\,\delta z}{r^3} R(\alpha),$$
where
$$R(\alpha) = \frac{2}{\pi} \sin^6\alpha \left[ K(2\alpha) + \tan(2\alpha) \cdot K'(2\alpha) \right].$$
A short table of the functions $Q(\alpha)$ and $R(\alpha)$ is given on p. 485.
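The functions Q(α) and R(α) can be evaluated by direct quadrature. The following is a rough sketch under stated assumptions, not the method used in the paper: K′ is taken to be the derivative of K with respect to its argument, the prefactor 2/π is used in both expressions, and the integrals are approximated by the midpoint rule. The helper name `_k_and_derivative` is hypothetical. Evaluated literally, the expression for Q comes out negative for α below 45°, whereas Table 2 as reproduced here lists unsigned values, so the sketch is best compared with the table in magnitude. The expressions are singular at α = 45°.

```python
import math


def _k_and_derivative(beta, steps=4000):
    """Return K(beta) and dK/dbeta, midpoint rule on [0, pi/2]."""
    sin_b, cos_b = math.sin(beta), math.cos(beta)
    h = (math.pi / 2) / steps
    k_val = dk_val = 0.0
    for i in range(steps):
        c2 = math.cos((i + 0.5) * h) ** 2
        d = 1.0 - sin_b**2 * c2
        k_val += h / math.sqrt(d)
        # d/dbeta of (1 - sin^2(beta) cos^2(theta))^(-1/2)
        dk_val += h * sin_b * cos_b * c2 / d**1.5
    return k_val, dk_val


def Q(alpha_deg):
    """Q(alpha) = (2/pi) sin^4(alpha) [K(2a) - cot(a) K'(2a)]."""
    a = math.radians(alpha_deg)
    k_val, dk_val = _k_and_derivative(2 * a)
    return 2 / math.pi * math.sin(a) ** 4 * (k_val - dk_val / math.tan(a))


def R(alpha_deg):
    """R(alpha) = (2/pi) sin^6(alpha) [K(2a) + tan(2a) K'(2a)]."""
    a = math.radians(alpha_deg)
    k_val, dk_val = _k_and_derivative(2 * a)
    return 2 / math.pi * math.sin(a) ** 6 * (k_val + math.tan(2 * a) * dk_val)


if __name__ == "__main__":
    for deg in (20, 25, 30, 35, 40, 50, 55, 60, 65, 70):
        print(deg, round(math.tan(math.radians(deg)) ** 2, 3),
              round(Q(deg), 3), round(R(deg), 3))
```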
Next consider a system consisting of a number of concentric rings of
electrons which rotate in the same plane round a nucleus of charge $Ne$. Let
the radii of the rings be $a_1, a_2, \ldots$, and the number of electrons on
the different rings $n_1, n_2, \ldots$
Putting $a_r/a_s = \tan^2(\alpha_{r,s})$ we get for the radial force acting on
an electron in the $r$th ring $\frac{e^2}{a_r^2} F_r$, where
$$F_r = N - s_{n_r} - \sum n_s Q(\alpha_{r,s});$$
the summation is to be taken over all the rings except the one considered.
If we know the distribution of the electrons in the different rings, from the
relation (1) on p. 478, we can, by help of the above, determine $a_1, a_2, \ldots$.
The calculation can be made by successive approximations, starting from
a set of values for the $\alpha$'s, and from them calculating the $F$'s, and then
redetermining the $\alpha$'s by the relation (1) which gives
$F_s/F_r = a_r/a_s = \tan^2(\alpha_{r,s})$, and so on.
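The successive approximation described above can be sketched as follows. This is an illustration under stated assumptions and not the author's own computation: relation (1) of Part I is assumed to give a_r = a0/F_r, with ω_r = ω0·F_r² and W = W0·Σ n_r F_r²; s_n is taken from its Part I definition; Q is evaluated by midpoint quadrature with the prefactor 2/π; and the starting values assume that inner rings screen the nucleus completely. All function names are hypothetical.

```python
import math


def s_n(n):
    """Assumed Part I definition: (1/4) * sum_{s=1}^{n-1} cosec(s*pi/n)."""
    return 0.25 * sum(1.0 / math.sin(s * math.pi / n) for s in range(1, n))


def Q(ratio, steps=2000):
    """Q(alpha) with tan^2(alpha) = ratio = a_r / a_s (midpoint quadrature)."""
    a = math.atan(math.sqrt(ratio))
    sin_b, cos_b = math.sin(2 * a), math.cos(2 * a)
    h = (math.pi / 2) / steps
    k_val = dk_val = 0.0
    for i in range(steps):
        c2 = math.cos((i + 0.5) * h) ** 2
        d = 1.0 - sin_b**2 * c2
        k_val += h / math.sqrt(d)
        dk_val += h * sin_b * cos_b * c2 / d**1.5
    return 2 / math.pi * math.sin(a) ** 4 * (k_val - dk_val / math.tan(a))


def solve_rings(N, ns, tol=1e-10, max_iter=500):
    """Iterate F_r = N - s_{n_r} - sum_s n_s Q(a_r/a_s); rings ordered from inside."""
    # Starting guess: each ring sees the nucleus screened by all inner electrons.
    F = [N - s_n(n) - sum(ns[:r]) for r, n in enumerate(ns)]
    for _ in range(max_iter):
        radii = [1.0 / f for f in F]  # assumed relation a_r = a0 / F_r
        new_F = [
            N - s_n(ns[r]) - sum(ns[s] * Q(radii[r] / radii[s])
                                 for s in range(len(ns)) if s != r)
            for r in range(len(ns))
        ]
        converged = max(abs(x - y) for x, y in zip(F, new_F)) < tol
        F = new_F
        if converged:
            break
    radii = [1.0 / f for f in F]
    W = sum(n * f**2 for n, f in zip(ns, F))  # in units of W0 (assumed relation)
    return radii, [f**2 for f in F], W


if __name__ == "__main__":
    print(solve_rings(3, [2, 1]))  # compare with the values quoted for 3(2,1)
```

The simple fixed-point update is adequate when the ratio of the radii is far from unity; close to unity Q varies rapidly and a damped update would probably be needed.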
As in the case of a single ring it is supposed that the systems are stable
for displacements of the electrons in the plane of their orbits. In a calculation
such as that on p. 480, the interaction of the rings ought strictly to
be taken into account. This interaction will involve that the quantities F
are not constant, as for a single ring rotating round a nucleus, but will vary
with the radii of the rings; the variation in F , however, if the ratio between
the radii of the rings is not very near to unity, will be too small to be of
influence on the result of the calculation.
Considering the stability of the systems for a displacement of the
electrons perpendicular to the plane of the rings, it is necessary to
distinguish between displacements in which the centres of gravity of the
electrons in the single rings are unaltered, and displacements in which all the
electrons inside the same ring are displaced in the same direction. The
condition of stability for the first kind of displacements is given by the
condition (5) on p. 481, if for every ring we replace $N$ by a quantity $G_r$
determined by the condition that $\frac{e^2}{a_r^3} G_r \delta z$ is equal to
the component perpendicular to the plane of the ring of the force – due to the
nucleus and the electrons in the other rings – acting on one of the electrons
if it has received a small displacement $\delta z$. Using the same notation as
above, we get
$$G_r = N - \sum n_s R(\alpha_{r,s}).$$

If all the electrons in one of the rings are displaced in the same direction
by help of extraneous forces, the displacement will produce corresponding
displacements of the electrons in the other rings; and this interaction will be
of influence on the stability. For example, consider a system of m concentric
rings rotating in a plane round a nucleus of charge N e, and let us assume
that the electrons in the different rings are displaced perpendicular to the
plane by δz1 , δz2 , . . . , δzm respectively. With the above notation the increase
in the potential energy of the system is given by

$$\frac{1}{2} N \sum n_r \frac{e^2}{a_r^3} (\delta z_r)^2 - \frac{1}{4} \sum \sum n_r n_s \frac{e^2}{a_r^3} R(\alpha_{r,s}) (\delta z_r - \delta z_s)^2 .$$
2 an 4 ar

The condition of stability is that this expression is positive for arbitrary
values of $\delta z_1, \ldots, \delta z_m$. This condition can be worked out
simply in the usual way. It is not of sensible influence compared with the
condition of stability for the displacements considered above, except in cases
where the system contains several rings of few electrons.

The following table, containing the values of Q(α) and R(α) for every
fifth degree from α = 20° to α = 70°, gives an estimate of the order of
magnitude of these functions:

Table 2.

  α (degrees)   tan² α    Q(α)     R(α)

  20            0.132     0.001    0.002
  25            0.217     0.005    0.011
  30            0.333     0.021    0.048
  35            0.490     0.080    0.217
  40            0.704     0.373    1.549
  45            1.000     -        -
  50            1.420     1.708    4.438
  55            2.040     1.233    1.839
  60            3.000     1.093    1.301
  65            4.599     1.037    1.115
  70            7.548     1.013    1.041

tan² α indicates the ratio between the radii of the rings
($\tan^2(\alpha_{r,s}) = a_r/a_s$).
The values of Q(α) show that unless the ratio of the radii of the rings is
nearly unity the effect of outer rings on the dimensions of inner rings is
very small, and that the corresponding effect of inner rings on outer is to
neutralize approximately the effect of a part of the charge on the nucleus
corresponding to the number of electrons on the ring. The values of R(α)
show that the effect of outer rings on the stability of inner – though greater
than the effect on the dimensions – is small, but that unless the ratio between
the radii is very great, the effect of inner rings on the stability of outer is
considerably greater than to neutralize a corresponding part of the charge
of the nucleus.
The maximum number of electrons which the innermost ring can contain
without being unstable is approximately equal to that calculated on p. 482 for
a single ring rotating round a nucleus. For the outer rings, however, we get
considerably smaller numbers than those determined by the condition (5) if
we replace $Ne$ by the total charge on the nucleus and on the electrons of
inner rings.
If a system of rings rotating round a nucleus in a single plane is stable for
small displacements of the electrons perpendicular to this plane, there will
in general be no stable configurations of the rings, satisfying the condition
of the constancy of the angular momentum of the electrons, in which all the
rings are not situated in the plane. An exception occurs in the special case
of two rings containing equal numbers of electrons; in this case there may be
a stable configuration in which the two rings have equal radii and rotate in
parallel planes at equal distances from the nucleus, the electrons in the one
ring being situated just opposite the intervals between the electrons in the
other ring. The latter configuration, however, is unstable if the configuration
in which all the electrons in the two rings are arranged in a single ring is
stable.

§ 3 Constitution of Atoms containing very few Electrons

As stated in § 1, the condition of the universal constancy of the angular
momentum of the electrons, together with the condition of stability, is in
most cases not sufficient to determine completely the constitution of the
system. On the general view of formation of atoms, however, and by making
use of the knowledge of the properties of the corresponding elements, it will
be attempted, in this section and the next, to obtain indications of what
configurations of the electrons may be expected to occur in the atoms. In
these considerations we shall assume that the number of electrons in the
atom is equal to the number which indicates the position of the corresponding
element in the series of elements arranged in order of increasing atomic
weight.
Exceptions to this rule will be supposed to occur only at such places in
the series where deviations from the periodic law of the chemical properties
of the elements are observed. In order to show clearly the principles used
we shall first consider with some detail those atoms containing very few
electrons.
For sake of brevity we shall, by the symbol $N(n_1, n_2, \ldots)$, refer to a
plane system of rings of electrons rotating round a nucleus of charge $Ne$,
satisfying the condition of the angular momentum of the electrons with the
approximation used in § 2. $n_1, n_2, \ldots$ are the numbers of electrons in
the rings, starting from inside. By $a_1, a_2, \ldots$ and
$\omega_1, \omega_2, \ldots$ we shall denote the radii and frequencies of the
rings taken in the same order. The total amount of energy $W$ emitted by the
formation of the system shall simply be denoted by $W[N(n_1, n_2, \ldots)]$.

N =1 Hydrogen.

In Part I. we have considered the binding of an electron by a positive
nucleus of charge $e$, and have shown that it is possible to account for the
Balmer spectrum of hydrogen on the assumption of the existence of a series
of stationary states in which the angular momentum of the electron round
the nucleus is equal to entire multiples of the value $h/2\pi$, where $h$ is
Planck’s constant. The formula found for the frequencies of the spectrum was
$$\nu = \frac{2\pi^2 e^4 m}{h^3} \left( \frac{1}{\tau_2^2} - \frac{1}{\tau_1^2} \right),$$
where $\tau_1$ and $\tau_2$ are entire numbers. Introducing the values for $e$,
$m$, and $h$ used on p. 479, we get for the factor before the bracket
$3.1 \cdot 10^{15}$;⁶ the value observed for the constant in the Balmer
spectrum is $3.290 \cdot 10^{15}$.
For the permanent state of a neutral hydrogen atom we get from the
formulae (1) and (2) in § 2, putting $F = 1$,
$$1(1): \quad a = \frac{h^2}{4\pi^2 e^2 m} = 0.55 \cdot 10^{-8}, \quad \omega = \frac{4\pi^2 e^4 m}{h^3} = 6.2 \cdot 10^{15}, \quad W = \frac{2\pi^2 e^4 m}{h^2} = 2.0 \cdot 10^{-11}.$$
These values are of the order of magnitude to be expected. For $W/e$ we
get 0.043, which corresponds to 13 volts; the value for the ionizing potential
of a hydrogen atom, calculated by Sir J.J. Thomson from experiments on
positive rays, is 11 volts.⁷ No other definite data, however, are available for
hydrogen atoms. For sake of brevity, we shall in the following denote the
values for $a$, $\omega$ and $W$ corresponding to the configuration 1(1) by
$a_0$, $\omega_0$, and $W_0$.
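The factor before the bracket can be recomputed from the constants quoted in footnote 6. The sketch below is illustrative only: it takes e, e/m and e/h from that footnote (electrostatic units), derives m and h from them, and evaluates 2π²e⁴m/h³. The function names are hypothetical.

```python
import math

# Values quoted in footnote 6 (electrostatic CGS units).
E_CHARGE = 4.78e-10   # e
E_OVER_M = 5.31e17    # e / m
E_OVER_H = 7.27e16    # e / h

M_ELECTRON = E_CHARGE / E_OVER_M   # m derived from e and e/m
H_PLANCK = E_CHARGE / E_OVER_H     # h derived from e and e/h


def balmer_factor(e=E_CHARGE, m=M_ELECTRON, h=H_PLANCK):
    """Factor 2*pi^2*e^4*m / h^3 in the hydrogen frequency formula."""
    return 2 * math.pi**2 * e**4 * m / h**3


def hydrogen_frequency(tau2, tau1, factor=None):
    """nu = factor * (1/tau2^2 - 1/tau1^2) for entire numbers tau2 < tau1."""
    if factor is None:
        factor = balmer_factor()
    return factor * (1.0 / tau2**2 - 1.0 / tau1**2)


if __name__ == "__main__":
    print(f"{balmer_factor():.3e}")           # to be compared with 3.26e15
    print(f"{hydrogen_frequency(2, 3):.3e}")  # first line of the Balmer series
```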
⁶ This value is that calculated in the first part of the paper. Using the
values $e = 4.78 \cdot 10^{-10}$ (see R.A. Millikan, Brit. Assoc. Rep. 1912,
p. 410), $e/m = 5.31 \cdot 10^{17}$ (see P. Gmelin, Ann. d. Phys. XXVIII.
p. 1086 (1909) and A.H. Bucherer, Ann. d. Phys. XXXVII. p. 597 (1912)), and
$e/h = 7.27 \cdot 10^{16}$ (calculated by Planck’s theory from the experiments
of E. Warburg, G. Leithäuser, E. Hupka, and C. Müller, Ann. d. Phys. XL.
p. 611 (1913)) we get $2\pi^2 e^4 m/h^3 = 3.26 \cdot 10^{15}$ in very close
agreement with observations.
⁷ J.J. Thomson, Phil. Mag. XXIV. p. 218 (1912).
At distances from the nucleus, great in comparison with $a_0$, the system
1(1) will not exert sensible forces on free electrons. Since, however, the
configuration:

1(2)   a = 1.33 a0,   ω = 0.563 ω0,   W = 1.13 W0

corresponds to a greater value for W than the configuration 1(1), we may
expect that a hydrogen atom under certain conditions can acquire a negative
charge. This is in agreement with experiments on positive rays. Since
W[1(3)] is only 0.54 W0, a hydrogen atom cannot be expected to be able to
acquire a double negative charge.
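The single-ring values quoted for hydrogen can be reproduced with a few lines. This is a sketch under assumptions taken from Part I rather than from this section: for a single ring of n electrons round a nucleus of charge Ne it assumes F = N − s_n, a = a0/F, ω = ω0·F² and W = W0·n·F², with s_n = ¼ Σ cosec(sπ/n). The function names are hypothetical.

```python
import math


def s_n(n):
    """Assumed Part I definition: (1/4) * sum_{s=1}^{n-1} cosec(s*pi/n)."""
    return 0.25 * sum(1.0 / math.sin(s * math.pi / n) for s in range(1, n))


def single_ring(N, n):
    """Return (a/a0, omega/omega0, W/W0) for the configuration N(n)."""
    F = N - s_n(n)            # effective charge acting on each electron
    return 1.0 / F, F**2, n * F**2


if __name__ == "__main__":
    for n in (1, 2, 3):
        a, omega, W = single_ring(1, n)
        print(f"1({n}): a = {a:.3f} a0, omega = {omega:.3f} omega0, W = {W:.2f} W0")
```

Under the same assumptions the function also returns the single-ring values quoted further on for helium, lithium and beryllium, for instance `single_ring(2, 2)` for the configuration 2(2).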

N =2 Helium.

As shown in Part I., using the same assumptions as for hydrogen, we
must expect that during the binding of an electron by a nucleus of charge
$2e$, a spectrum is emitted, expressed by
$$\nu = \frac{2\pi^2 m e^4}{h^3} \left( \frac{1}{(\tau_2/2)^2} - \frac{1}{(\tau_1/2)^2} \right).$$
This spectrum includes the spectrum observed by Pickering in the star
ζ Puppis and the spectra recently observed by Fowler in experiments with
vacuum tubes filled with a mixture of hydrogen and helium. These spectra
are generally ascribed to hydrogen.
For the permanent state of a positively charged helium atom, we get

2(1)   a = ½ a0,   ω = 4 ω0,   W = 4 W0.

At distances from the nucleus great compared with the radius of the bound
electron, the system 2(1) will, to a close approximation, act on an electron
as a simple nucleus of charge $e$. For a system consisting of two electrons
and a nucleus of charge $2e$, we may therefore assume the existence of a
series of stationary states in which the electron most lightly bound moves
approximately in the same way as the electron in the stationary states of
a hydrogen atom. Such an assumption has already been used in Part I. in
an attempt to explain the appearance of Rydberg’s constant in the formula
for the line-spectrum of any element. We can, however, hardly assume the
existence of a stable configuration in which the two electrons have the same
angular momentum round the nucleus and move in different orbits, the one
outside the other. In such a configuration the electrons would be so near to
each other that the deviations from circular orbits would be very great. For
the permanent state of a neutral helium atom, we shall therefore adopt the
configuration

2(2) a = 0.571a0 , ω = 3.06ω0 , W = 6.13W0 .

Since
$$W[2(2)] - W[2(1)] = 2.13 W_0,$$
we see that both electrons in a neutral helium atom are more firmly bound
than the electron in a hydrogen atom. Using the values on p. 488, we get
$$2.13 \cdot \frac{W_0}{e} = 27 \text{ volts}, \qquad 2.13 \cdot \frac{W_0}{h} = 6.6 \cdot 10^{15} \; 1/\text{sec}.$$
These values are of the same order of magnitude as the value observed for
the ionization potential in helium, 20.5 volts,⁸ and the value for the
frequency of the ultra-violet absorption in helium determined by experiments
on dispersion, $5.9 \cdot 10^{15}$ 1/sec.⁹
The frequency in question may be regarded as corresponding to vibrations
in the plane of the ring (see p. 480). The frequency of vibration
of the whole ring perpendicular to the plane, calculated in the ordinary way
(see p. 482), is given by $\nu = 3.27\omega_0$. The fact that the latter
frequency is great compared with that observed might explain that the number
of electrons in a helium atom, calculated by help of Drude’s theory from the
experiments on dispersion, is only about two-thirds of the number to be
expected. (Using $e/m = 5.31 \cdot 10^{17}$ the value calculated is 1.2.)
For a configuration of a helium nucleus and three electrons, we get

2(3) a = 0.703a0 , ω = 2.02ω0 , W = 6.07W0 .

Since W for this configuration is smaller than for the configuration 2(2),
the theory indicates that a helium atom cannot acquire a negative charge.
This is in agreement with experimental evidence, which shows that helium
atoms have no “affinity” for free electrons.¹⁰
⁸ J. Franck u. G. Hertz, Verh. d. Deutsch. Phys. Ges. XV. p. 34 (1913).
⁹ C. and M. Cuthbertson, Proc. Roy. Soc. A. LXXXIV. p. 13 (1910). (In a
previous paper (Phil. Mag. Jan. 1913) the author took the values for the
refractive index in helium, given by M. and C. Cuthbertson, as corresponding
to atmospheric pressure; these values, however, refer to double atmospheric
pressure. Consequently the value there given for the number of electrons in a
helium atom calculated from Drude’s theory has to be divided by 2.)
¹⁰ See J. Franck, Verh. d. Deutsch. Phys. Ges. XII. p. 613 (1910).
In a later paper it will be shown that the theory offers a simple
explanation of the marked difference in the tendency of hydrogen and helium
atoms to combine into molecules.

N =3 Lithium.

In analogy with the cases of hydrogen and helium we must expect that
during the binding of an electron by a nucleus of charge $3e$, a spectrum is
emitted, given by
$$\nu = \frac{2\pi^2 m e^4}{h^3} \left( \frac{1}{(\tau_2/3)^2} - \frac{1}{(\tau_1/3)^2} \right).$$
On account of the great energy to be spent in removing all the electrons
bound in a lithium atom (see below) the spectrum considered can only be
expected to be observed in extraordinary cases.
In a recent note Nicholson¹¹ has drawn attention to the fact that in the
spectra of certain stars, which show the Pickering spectrum with special
brightness, some lines occur the frequencies of which to a close
approximation can be expressed by the formula
$$\nu = K \cdot \left( \frac{1}{4} - \frac{1}{(m \pm 1/3)^2} \right),$$
where $K$ is the same constant as in the Balmer spectrum of hydrogen. From
analogy with the Balmer- and Pickering-spectra, Nicholson has suggested
that the lines in question are due to hydrogen.
It is seen that the lines discussed by Nicholson are given by the above
formula if we put $\tau_2 = 6$. The lines in question correspond to
$\tau_1 = 10$, 13 and 14; if we for $\tau_2 = 6$ put $\tau_1 = 9$, 12 and 15,
we get lines coinciding with lines of the ordinary Balmer-spectrum of
hydrogen. If we in the above formula put $\tau_2 = 1$, 2, and 3, we get series
of lines in the ultra-violet. If we put $\tau_2 = 4$ we get only a single line
in the visible spectrum, viz.: for $\tau_1 = 5$ which gives
$\nu = 6.662 \cdot 10^{14}$, or a wave-length $\lambda = 4503 \cdot 10^{-8}$ cm
closely coinciding with the wave-length $4504 \cdot 10^{-8}$ cm of one of the
lines of unknown origin in the table quoted by Nicholson. In this table,
however, no lines occur corresponding to $\tau_2 = 5$.
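The correspondences just described can be verified with a short calculation. The sketch below is an illustration only: it takes K = 3.290·10¹⁵ as the Balmer constant quoted earlier, writes the lithium formula as ν = K[1/(τ2/3)² − 1/(τ1/3)²], and assumes the round value c = 3.0·10¹⁰ cm/sec for the velocity of light, which is not stated in this section. Function names are hypothetical.

```python
import math

K = 3.290e15   # Balmer constant quoted in the text, 1/sec
C = 3.0e10     # assumed round value for the velocity of light, cm/sec


def lithium_line(tau2, tau1, k=K):
    """nu = K * (1/(tau2/3)^2 - 1/(tau1/3)^2)."""
    return k * (1.0 / (tau2 / 3.0) ** 2 - 1.0 / (tau1 / 3.0) ** 2)


def nicholson_line(m_shifted, k=K):
    """nu = K * (1/4 - 1/x^2), with x = m +/- 1/3 (or x = m for Balmer lines)."""
    return k * (0.25 - 1.0 / m_shifted**2)


if __name__ == "__main__":
    # tau2 = 6: tau1 = 10, 13, 14 give x = tau1/3 = m +/- 1/3;
    # tau1 = 9, 12, 15 give whole x, i.e. ordinary Balmer lines.
    for tau1 in (9, 10, 12, 13, 14, 15):
        print(tau1, lithium_line(6, tau1), nicholson_line(tau1 / 3.0))
    nu = lithium_line(4, 5)
    print(nu, C / nu)  # frequency and wave-length (cm) of the single visible line
```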
For the permanent state of a lithium atom with two positive charges we
get a configuration

3(1)   a = ⅓ a0,   ω = 9 ω0,   W = 9 W0.

¹¹ J.W. Nicholson, Month. Not. Roy. Astr. Soc. LXXIII. 382 (1913).
A permanent configuration in which two electrons move in different orbits
around each other must for lithium be considered still less probable than
for helium, as the ratio between the radii of the orbits would
be still nearer to unity. For a lithium atom with a single positive charge we
shall, therefore, adopt the configuration:

3(2)   a = 0.364 a0,   ω = 7.56 ω0,   W = 15.13 W0.

Since W[3(2)] − W[3(1)] = 6.13 W0 we see that the first two electrons in a
lithium atom are very strongly bound compared with the electron in a hydrogen
atom; they are still more rigidly bound than the electrons in a helium atom.
From a consideration of the chemical properties we should expect the
following configuration for the electrons in a neutral lithium atom:

3(2,1)   a1 = 0.362 a0,   ω1 = 7.65 ω0,
         a2 = 1.182 a0,   ω2 = 0.716 ω0,    W = 16.02 W0.

This configuration may be considered as highly probable also from a
dynamical point of view. The deviation of the outermost electron from a
circular orbit will be very small, partly on account of the great values of
the ratio between the radii, and of the ratio between the frequencies of the
orbits of the inner and outer electrons, partly also on account of the
symmetrical arrangement of the inner electrons. Accordingly, it appears
probable that the three electrons will not arrange themselves in a single ring
and form the system:

3(3)   a = 0.413 a0,   ω = 5.87 ω0,   W = 17.61 W0,

although W for this configuration is greater than for 3(2,1).
Since W[3(2,1)] − W[3(2)] = 0.89 W0, we see that the outer electron in
the configuration 3(2,1) is bound even more lightly than the electron in a
hydrogen atom. The difference in the firmness of the binding corresponds
to a difference of 1.4 volts in the ionization potential. A marked difference
between the electron in hydrogen and the outermost electron in lithium lies
also in the greater tendency of the latter electron to leave the plane of the
orbits. The quantity G considered in § 2, which gives a kind of measure for
the stability for displacements perpendicular to this plane, is thus for the
outer electron in lithium only 0.55, while for hydrogen it is 1. This may have
a bearing on the explanation of the apparent tendency of lithium atoms to
take a positive charge in chemical combinations with other elements.
For a possible negatively charged lithium atom we may expect the
configuration:

3(2,2)   a1 = 0.362 a0,   ω1 = 7.64 ω0,
         a2 = 1.516 a0,   ω2 = 0.436 ω0,    W = 16.16 W0.

It should be remarked that we have no detailed knowledge of the
properties in the atomic state, either for lithium or hydrogen, or for most of
the elements considered below.

N =4 Beryllium.

For reasons analogous to those considered for helium and lithium we may
for the formation of a neutral beryllium atom assume the following states:

4(1)     a = 0.25 a0,     ω = 16 ω0,       W = 16 W0,
4(2)     a = 0.267 a0,    ω = 14.06 ω0,    W = 28.13 W0,
4(2,1)   a1 = 0.263 a0,   ω1 = 14.46 ω0,
         a2 = 0.605 a0,   ω2 = 2.74 ω0,    W = 31.65 W0,
4(2,2)   a1 = 0.262 a0,   ω1 = 14.60 ω0,
         a2 = 0.673 a0,   ω2 = 2.21 ω0,    W = 33.61 W0,

although the configurations:

4(3)     a = 0.292 a0,    ω = 11.71 ω0,    W = 35.14 W0,
4(4)     a = 0.329 a0,    ω = 9.26 ω0,     W = 37.04 W0,

correspond to smaller values for the total energy than the configurations
4(2,1) and 4(2,2).
From analogy we get further for the configuration of a possible negatively
charged atom,

4(2,3)   a1 = 0.263 a0,   ω1 = 14.51 ω0,
         a2 = 0.803 a0,   ω2 = 1.55 ω0,    W = 33.66 W0.

Comparing the outer ring of the atom considered with the ring of a
helium atom, we see that the presence of the inner ring of two electrons
in the beryllium atom markedly changes the properties of the outer ring;
partly because the outer electrons in the configuration adopted for a neutral
beryllium atom are more lightly bound than the electrons in a helium atom,
and partly because the quantity G, which for helium is equal to 2, for the
outer ring in the configuration 4(2,2) is only equal to 1.12.
Since W[4(2,3)] − W[4(2,2)] = 0.05 W0, the beryllium atom will further
have a definite, although very small, affinity for free electrons.

§ 4 Atoms containing greater numbers of electrons

From the examples discussed in the former section it will appear that the
problem of the arrangement of the electrons in the atoms is intimately
connected with the question of the confluence of two rings of electrons
rotating round a nucleus outside each other, and satisfying the condition of
the universal constancy of the angular momentum. Apart from the necessary
conditions of stability for displacements of the electrons perpendicular to the
plane of the orbits, the present theory gives very little information on this
problem. It seems, however, possible by the help of simple considerations to
throw some light on the question.
Let us consider two rings rotating round a nucleus in a single plane, the
one outside the other. Let us assume that the electrons in the one ring
act upon the electrons in the other as if the electric charge were uniformly
distributed along the circumference of the ring, and that the rings with this
approximation satisfy the condition of the angular momentum of the
electrons and stability for displacements perpendicular to their plane.
Now suppose that, by help of suitable imaginary extraneous forces acting
parallel to the axis of the rings, we pull the inner ring slowly to one side.
During this process, on account of the repulsion from the inner ring, the
outer will move to the opposite side of the original plane of the rings. During
the displacements of the rings the angular momentum of the electrons round the
axis of the system will remain constant, and the diameter of the inner ring
will increase while that of the outer will diminish. At the beginning of the
displacement the magnitude of the extraneous forces to be applied to the
original inner ring will increase but thereafter decrease, and at a certain
distance between the planes of the rings the system will be in a configuration
of equilibrium. This equilibrium, however, will not be stable. If we let the
rings slowly return they will either reach their original position, or they
arrive at a position in which the ring, which originally was the outer, is now
the inner, and vice versa.
If the charge of the electrons were uniformly distributed along the
circumference of the rings, we could by the process considered at most obtain
an interchange of the rings, but obviously not a junction of them. Taking,
however, the discrete distribution of the electrons into account, it can be
shown that in the special case when the numbers of electrons on the two
rings are equal, and when the rings rotate in the same direction, the rings
will unite by the process, provided that the final configuration is stable. In
this case the radii and the frequency of the rings will be equal in the unstable
configuration of equilibrium mentioned above. In reaching this configuration
the electrons in the one ring will further be situated just opposite the
intervals between the electrons in the other, since such an arrangement will
correspond to the smallest total energy. If now we let the rings return to
their original plane, the electrons in the one ring will pass into the intervals
between the electrons in the other, and form a single ring. Obviously the
ring thus formed will satisfy the same condition of the angular momentum
of the electrons as the original rings.
If the two rings contain unequal numbers of electrons the system will
during a process such as that considered behave very differently, and, con-
trary to the former case, we cannot expect that the rings will flow together,
if by help of extraneous forces acting parallel to the axis of the system they
are displaced slowly from their original plane. It may in this connexion be
noticed that the characteristic for the displacements considered is not the
special assumption about the extraneous forces, but only invariance of the
angular momentum of the electrons round the centre of the rings; displace-
ments of this kind take in the present theory a similar position to arbitrary
displacements in the ordinary mechanics.
The above considerations may be taken as an indication that there is
greater tendency for the confluence of two rings when each contains the
same number of electrons. Considering the successive binding of electrons
by a positive nucleus, we conclude from this that, unless the charge on the
nucleus is very great, rings of electrons will only join together if they contain
equal numbers of electrons; and that accordingly the numbers of electrons
on inner rings will only be 2, 4, 8, . . .. If the charge of the nucleus is very
great the rings of electrons first bound, if few in number, will be very close
together, and we must expect that the configuration will be very unstable,
and that a gradual interchange of electrons between the rings will be greatly
facilitated.
This assumption in regard to the number of electrons in the rings is
strongly supported by the fact that the chemical properties of the elements
of low atomic weight vary with a period of 8. Further, it follows that the
number of electrons on the outermost ring will always be odd or even,
according as the total number of electrons in the atom is odd or even. This
has a suggestive relation to the fact that the valency of an element of low
atomic weight always is odd or even according as the number of the element
in the periodic series is odd or even.
For the atoms of the elements considered in the former section we have
assumed that the two electrons first bound are arranged in a single ring,
and, further, that the two next electrons are arranged in another ring. If
N ≥ 4 the configuration N (4) will correspond to a smaller value for the
total energy than the configuration N (2,2). The greater the value of N the
closer will the ratio between the radii of the rings in the configuration N (2,2)
approach unity, and the greater will be the energy emitted by an eventual
confluence of the rings. The particular member of the series of the elements
for which the four innermost electrons will be arranged for the first time in
a single ring cannot be determined from the theory. From a consideration
of the chemical properties we can hardly expect that it will have taken
place before boron (N = 5) or carbon (N = 6), on account of the observed
trivalency and tetravalency respectively of these elements; on the other hand,
the periodic system of the elements strongly suggests that already in neon
(N = 10) an inner ring of eight electrons will occur. Unless N > 14 the
configuration N (4,4) corresponds to a smaller value for the total energy than
the configuration N (8); already for N ≥ 10 the latter configuration, however,
will be stable for displacements of the electrons perpendicular to the plane
of their orbits. A ring of 16 electrons will not be stable unless N is very
great; but in such a case the simple considerations mentioned do not apply.
The confluence of two rings of equal number of electrons, which rotate
round a nucleus of charge N e outside a ring of n electrons already bound,
must be expected to take place more easily than the confluence of two similar
rings rotating round a nucleus of charge (N − n) · e; for the stability of the
rings for a displacement perpendicular to their plane will (see § 2) be smaller
in the first than the latter case. This tendency for stability to decrease
for displacements perpendicular to the plane of the ring will be especially
marked for the outer rings of electrons of a neutral atom. In the latter
case we must expect the confluence of rings to be greatly facilitated and
in certain cases it may even happen that the number of electrons in the
outer ring may be greater than in the next, and that the outer ring may
show deviations from the assumption of 1, 2, 4, 8 electrons in the rings, e.g.
the configurations 5(2,3) and 6(2,4) instead of the configuration 5(2,2,1)
and 6(2,2,2). We shall here not discuss further the intricate question of the

arrangement of the electrons in the outer ring. In the scheme given below the
number of electrons in this ring is arbitrarily put equal to the normal valency
of the corresponding element; i.e. for electronegative and electropositive
elements respectively the number of hydrogen atoms and twice the number
of oxygen atoms with which one atom of the element combines.
Such an arrangement of the outer electrons is suggested by considera-
tions of atomic volumes. As is well known, the atomic volume of the elements
is a periodic function of the atomic weights. If arranged in the usual way
according to the periodic system, the elements inside the same column have
approximately the same atomic volume, while this volume changes consider-
ably from one column to another, being greatest for columns corresponding
to the smallest valency 1 and smallest for the greatest valency 4. An
approximate estimate of the radius of the outer ring of a neutral atom can
be obtained by assuming that the total force due to the nucleus and the
inner electrons is equal to that from a nucleus of charge ne, where n is the
number of electrons in the ring. Putting F = n − s_n in the equation (1) on
p. 478, and denoting the value of a for n = 1 by a_0, we get for n = 2,
a = 0.57 a_0; for n = 3, a = 0.41 a_0; and for n = 4, a = 0.33 a_0. Accordingly
the arrangement chosen for the electrons will involve a variation in the
dimensions of the outer ring similar to the variation in the atomic volumes
of the corresponding elements.
It must, however, be borne in mind that the experimental determinations of
atomic volumes in most cases are deduced from consideration of molecules
rather than atoms.
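The estimate of the outer-ring radius can be reproduced numerically. The sketch below assumes the expression for the ring-repulsion term used earlier in the paper, s_n = (1/4) Σ csc(sπ/n) for s = 1 … n − 1, and the fact that equation (1) makes the radius inversely proportional to F; the function names are illustrative only.

```python
import math

def s_n(n: int) -> float:
    """Ring-repulsion term s_n = (1/4) * sum_{s=1}^{n-1} 1/sin(s*pi/n)."""
    return 0.25 * sum(1.0 / math.sin(s * math.pi / n) for s in range(1, n))

def radius_ratio(n: int) -> float:
    """Radius of an outer ring of n electrons in units of a_0 (the n = 1 value).

    The radius is proportional to 1/F; with F = n - s_n and s_1 = 0
    this gives a / a_0 = 1 / (n - s_n).
    """
    return 1.0 / (n - s_n(n))

ratios = {n: radius_ratio(n) for n in range(1, 5)}
for n, ratio in ratios.items():
    print(f"n = {n}: a = {ratio:.2f} a_0")
```

The ring shrinks as the number of outer electrons grows from 1 to 4, which is the trend compared with the atomic volumes in the text.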
From the above we are led to the following possible scheme for the ar-
rangement of the electrons in light atoms: –

1(1)         9(4,4,1)       17(8,4,4,1)
2(2)        10(8,2)         18(8,8,2)
3(2,1)      11(8,2,1)       19(8,8,2,1)
4(2,2)      12(8,2,2)       20(8,8,2,2)
5(2,3)      13(8,2,3)       21(8,8,2,3)
6(2,4)      14(8,2,4)       22(8,8,2,4)
7(4,3)      15(8,4,3)       23(8,8,4,3)
8(4,2,2)    16(8,4,2,2)     24(8,8,4,2,2)
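A small consistency check of the scheme may be helpful. The sketch below transcribes the table into a dictionary (the name SCHEME and the helper are illustrative) and verifies three properties claimed in the text: the ring occupations add up to the total number of electrons, the outer ring is odd or even together with that total, and the outer-ring occupation repeats with a period of 8.

```python
# Ring occupations from the scheme: total electrons -> (inner ring, ..., outer ring)
SCHEME = {
    1: (1,), 2: (2,), 3: (2, 1), 4: (2, 2),
    5: (2, 3), 6: (2, 4), 7: (4, 3), 8: (4, 2, 2),
    9: (4, 4, 1), 10: (8, 2), 11: (8, 2, 1), 12: (8, 2, 2),
    13: (8, 2, 3), 14: (8, 2, 4), 15: (8, 4, 3), 16: (8, 4, 2, 2),
    17: (8, 4, 4, 1), 18: (8, 8, 2), 19: (8, 8, 2, 1), 20: (8, 8, 2, 2),
    21: (8, 8, 2, 3), 22: (8, 8, 2, 4), 23: (8, 8, 4, 3), 24: (8, 8, 4, 2, 2),
}

def outer(n: int) -> int:
    """Number of electrons on the outermost ring of the atom with n electrons."""
    return SCHEME[n][-1]

# Every configuration must account for all n electrons.
totals_ok = all(sum(rings) == n for n, rings in SCHEME.items())
# Outer ring is odd or even according as n is odd or even.
parity_ok = all(outer(n) % 2 == n % 2 for n in SCHEME)
# Outer-ring occupation (taken as the valency) repeats with a period of 8.
period_ok = all(outer(n) == outer(n + 8) for n in range(1, 17))

print(totals_ok, parity_ok, period_ok)
```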
Without any fuller discussion it seems not unlikely that this constitution
of the atoms will correspond to properties of the elements similar with those
observed.
In the first place there will be a marked periodicity with a period of 8.
Further, the binding of the outer electrons in every horizontal series of the

above scheme will become weaker with increasing number of electrons per
atom, corresponding to the observed increase of the electropositive character
for an increase of atomic weight of the elements in every single group of the
periodic system. A corresponding agreement holds for the variation of the
atomic volumes.
In the case of atoms of higher atomic weight the simple assumptions used
do not apply. A few indications, however, are suggested from consideration
of the variations in the chemical properties of the elements. At the end of
the 3rd period of 8 elements we meet with the iron-group. This group takes a
particular position in the system of the elements, since it is the first time that
elements of neighbouring atomic weight show similar chemical properties.
This circumstance indicates that the configurations of the electrons in the
elements of this group differ only in the arrangement of the inner electrons.
The fact that the period in the chemical properties of the elements after the
iron-group is no longer 8, but 18, suggests that elements of higher atomic
weight contain a recurrent configuration of 18 electrons in the innermost
rings. The deviation from 2, 4, 8, 16 may be due to a gradual interchange of
electrons between the rings, such as is indicated on p. 495. Since a ring of
18 electrons will not be stable the electrons may be arranged in two parallel
rings (see p. 486). Such a configuration of the inner electrons will
act upon the outer electrons in very nearly the same way as a nucleus of charge
(N − 18) · e. It might therefore be possible that with increase of N another
configuration of the same type will be formed outside the first, such as is
suggested by the presence of a second period of 18 elements.
On the same lines, the presence of the group of the rare earths indicates
that for still greater values of N another gradual alteration of the inner-
most rings will take place. Since, however, for elements of higher atomic
weight than those of this group, the laws connecting the variation of the
chemical properties with the atomic weight are similar to those between the
elements of low atomic weight, we may conclude that the configuration of
the innermost electrons will be again repeated. The theory, however, is not
sufficiently complete to give a definite answer to such problems.

§ 5 Characteristic Röntgen Radiation

According to the theory of emission of radiation given in Part I., the ordinary
line-spectrum of an element is emitted during the reformation of an atom

when one or more of the electrons in the outer rings are removed. In analogy
it may be supposed that the characteristic Röntgen radiation is sent out
during the settling down of the system if electrons in inner rings are removed
by some agency, e.g. by impact of cathode particles. This view of the
origin of the characteristic Röntgen radiation has been proposed by Sir J. J.
Thomson.
Without any special assumption in regard to the constitution of the
radiation, we can from this view determine the minimum velocity of the
cathode rays necessary to produce the characteristic Röntgen radiation of
a special type by calculating the energy necessary to remove one of the
electrons from the different rings. Even if we know the numbers of electrons
in the rings, a rigorous calculation of this minimum energy might still
be complicated, and the result largely dependent on the assumptions used;
for, as mentioned in Part I., p. 19, the calculation cannot
be performed entirely on the basis of the ordinary mechanics. We can,
however, obtain very simply an approximate comparison with experiments
if we consider the innermost ring and as a first approximation neglect the
repulsion from the electrons in comparison with the attraction of the nucleus.
Let us consider a simple system consisting of a bound electron rotating in a
circular orbit round a positive nucleus of charge Ne. From the expressions
(1) on p. 478 we get for the velocity of the electron, putting F = N,

$$v = \frac{2\pi e^{2}}{h}\, N = 2.1 \cdot 10^{8} \cdot N.$$

The total energy to be transferred to the system in order to remove
the electron to an infinite distance from the nucleus is equal to the kinetic
energy of the bound electron. If, therefore, the electron is removed to a great
distance from the nucleus by impact of another rapidly moving electron, the
smallest kinetic energy possessed by the latter when at a great distance from
the nucleus must necessarily be equal to the kinetic energy of the bound
electron before the collision. The velocity of the free electron therefore must
be at least equal to v.
According to Whiddington’s experiments¹² the velocity of cathode rays
just able to produce the characteristic Röntgen radiation of the so-called
K-type – the hardest type of radiation observed – from an element of atomic
weight A is for elements from Al to Se approximately equal to A · 10^8
cm/sec. As seen this is equal to the above calculated value for v, if we put
N = A/2.
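The comparison with Whiddington's rule can be checked with a few lines of arithmetic. The sketch below assumes the CGS values of e and h quoted in Part I of the paper (1913 values, not modern ones); the function names are illustrative.

```python
import math

# Constants in CGS units as quoted in Part I (1913 values; an assumption of this sketch).
E_CHARGE = 4.7e-10   # charge of the electron, esu
PLANCK_H = 6.5e-27   # Planck's constant, erg s

def inner_electron_velocity(nuclear_charge: float) -> float:
    """v = (2 pi e^2 / h) * N in cm/sec, neglecting the other electrons."""
    return 2.0 * math.pi * E_CHARGE ** 2 / PLANCK_H * nuclear_charge

def whiddington_velocity(atomic_weight: float) -> float:
    """Empirical minimum cathode-ray velocity A * 10^8 cm/sec."""
    return atomic_weight * 1.0e8

coefficient = inner_electron_velocity(1.0)
print(f"2 pi e^2 / h = {coefficient:.2e} cm/sec")

# With N = A/2 the two velocities agree to within a few per cent for any A.
for atomic_weight in (27.0, 56.0, 79.0):
    ratio = inner_electron_velocity(atomic_weight / 2.0) / whiddington_velocity(atomic_weight)
    print(f"A = {atomic_weight:.0f}: calculated / observed = {ratio:.2f}")
```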
¹² R. Whiddington, Proc. Roy. Soc. A. LXXXV. p. 323 (1911).

Since we have obtained approximate agreement with experiment by as-
cribing the characteristic Röntgen radiation of the K-type to the innermost
ring, it is to be expected that no harder type of characteristic radiation will
exist. This is strongly indicated by observations of the penetrating power of
γ-rays.¹³
It is worthy of remark that the theory gives not only nearly the right
value for the energy required to remove an electron from the outer ring, but
also the energy required to remove an electron from the innermost ring. The
approximate agreement between the calculated and experimental values is
all the more striking if it is recalled that the energies required in the two cases
for an element of atomic weight 70 differ by a ratio of 1000.
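The order of magnitude of this ratio can be made plausible with a rough estimate. In the approximation used above the kinetic energy of a bound electron varies as v², i.e. as the square of the effective charge F. The sketch assumes F = N = A/2 for the innermost ring and, as a further simplifying assumption not taken from the text, F ≈ 1 for an electron in the outer ring.

```python
def removal_energy_ratio(atomic_weight: float, outer_effective_charge: float = 1.0) -> float:
    """Ratio of inner-ring to outer-ring removal energies, taken as (F_inner / F_outer)^2.

    F_inner = A / 2 follows the text; outer_effective_charge = 1 is an assumption.
    """
    inner_effective_charge = atomic_weight / 2.0
    return (inner_effective_charge / outer_effective_charge) ** 2

ratio_70 = removal_energy_ratio(70.0)
print(f"A = 70: energy ratio of about {ratio_70:.0f}")
```

For A = 70 this gives a ratio of the order of 1000, in line with the statement above.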
In connexion with this it should be emphasized that the remarkable
homogeneity of the characteristic Röntgen radiation – indicated by experi-
ments on absorption of the rays, as well as by the interference observed in
recent experiments on diffraction of Röntgen rays in crystals – is in agree-
ment with the main assumption used in part I. (see p. 7) in considering the
emission of line-spectra, viz. that the radiation emitted during the passing
of the systems between different stationary states is homogeneous.
Putting in (4) F = N, we get for the diameter of the innermost ring
approximately 2a = (1/N) · 10^−8 cm. For N = 100 this gives 2a = 10^−10 cm,
a value which is very small in comparison with ordinary atomic dimensions
but still very great compared with the dimensions to be expected for the
nucleus. According to Rutherford’s calculation the dimensions of the latter
are of the same order of magnitude as 10^−12 cm.
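The quoted diameter can be recomputed from the expression for the radius, a = h² / (4π² e² m F). The sketch below again assumes the CGS constants quoted in Part I (e, e/m and h); the names are illustrative.

```python
import math

# 1913 constants from Part I, CGS units (an assumption of this sketch).
E_CHARGE = 4.7e-10            # esu
E_OVER_M = 5.31e17            # esu per gram
PLANCK_H = 6.5e-27            # erg s
ELECTRON_MASS = E_CHARGE / E_OVER_M

def ring_diameter(effective_charge: float) -> float:
    """Diameter 2a = 2 h^2 / (4 pi^2 e^2 m F) in cm."""
    radius = PLANCK_H ** 2 / (4.0 * math.pi ** 2 * E_CHARGE ** 2 * ELECTRON_MASS * effective_charge)
    return 2.0 * radius

diameter_100 = ring_diameter(100.0)
NUCLEAR_SIZE = 1.0e-12        # order of magnitude quoted from Rutherford, cm

print(f"2a for N = 100: {diameter_100:.2e} cm")
print(f"ratio to nuclear dimensions: {diameter_100 / NUCLEAR_SIZE:.0f}")
```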

§ 6 Radioactive Phenomena

According to the present theory the cluster of electrons surrounding the
nucleus is formed with emission of energy, and the configuration is
determined by the condition that the energy emitted is a maximum. The stability
involved by these assumptions seems to be in agreement with the general
properties of matter. It is, however, in striking opposition to the phenomena
of radioactivity, and according to the theory the origin of the latter phenom-
ena may therefore be sought elsewhere than in the electronic distribution
round the nucleus.
¹³ Comp. E. Rutherford, Phil. Mag. XXIV. p. 453 (1912).

A necessary consequence of Rutherford’s theory of the structure of atoms
is that the α-particles have their origin in the nucleus. On the present theory
it seems also necessary that the nucleus is the seat of the expulsion of the
high-speed β-particles. In the first place, the spontaneous expulsion of a
β-particle from the cluster of electrons surrounding the nucleus would be
something quite foreign to the assumed properties of the system. Further,
the expulsion of an α-particle can hardly be expected to produce a lasting
effect on the stability of the cluster of electrons. The effect of the expulsion
will be of two different kinds. Partly the particle may collide with the
bound electrons during its passing through the atom. This effect will be
analogous to that produced by bombardment of atoms of other substances
by α-rays and cannot be expected to give rise to a subsequent expulsion
of β-rays. Partly the expulsion of the particle will involve an alteration in
the configuration of the bound electrons, since the charge remaining on the
nucleus is different from the original. In order to consider the latter effect
let us regard a single ring of electrons rotating round a nucleus of charge
N e, and let us assume that an α-particle is expelled from the nucleus in
a direction perpendicular to the plane of the ring. The expulsion of the
particle will obviously not produce any alteration in the angular momentum
of the electrons; and if the velocity of the α-particle is small compared with
the velocity of the electrons – as it will be if we consider inner rings of an
atom of high atomic weight – the ring during the expulsion will expand
continuously, and after the expulsion will take the position claimed by the
theory for a stable ring rotating round a nucleus of charge (N − 2) · e. The
consideration of this simple case strongly indicates that the expulsion of an
α-particle will not have a lasting effect on the stability of the internal rings
of electrons in the residual atom.
The question of the origin of β-particles may also be considered from
another point of view, based on a consideration of the chemical and physical
properties of the radioactive substances. As is well known, several of these
substances have very similar chemical properties and have hitherto resisted
every attempt to separate them by chemical means. There is also some
evidence that the substances in question show the same line-spectrum.¹⁴ It has
been suggested by several writers that the substances are different only in
radio-active properties and atomic weight but identical in all other
physical and chemical respects. According to the theory, this would mean that
the charge on the nucleus, as well as the configuration of the surrounding
electrons, was identical in some of the elements, the only difference being
¹⁴ See A. S. Russell and R. Rossi, Proc. Roy. Soc. A. LXXXVII. p. 478 (1912).

the mass and the internal condition of the nucleus. From the considerations
of § 4 this assumption is already strongly suggested by the fact that the
number of radioactive substances is greater than the number of places at
our disposal in the periodic system. If, however, the assumption is right,
the fact that two apparently identical elements emit β-particles of different
velocities, shows that the β-rays as well as the α-rays have their origin in
the nucleus.
This view of the origin of α- and β-particles explains very simply the
way in which the change in the chemical properties of the radioactive sub-
stances is connected with the nature of the particles emitted. The results of
experiments are expressed in the two rules:¹⁵
1. Whenever an α-particle is expelled the group in the periodic system
to which the resultant product belongs is two units less than that to which
the parent body belongs.
2. Whenever a β-particle is expelled the group of the resultant body is
1 unit greater than that of the parent.
As will be seen this is exactly what is to be expected according to the
considerations of § 4.
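The two rules can be expressed as simple bookkeeping on the nucleus. The sketch below assumes that an α-particle carries away two units of positive charge and a β-particle one unit of negative charge, so that the nuclear charge changes by −2 and +1 respectively; the change of atomic weight (4 units for α, none for β) is an additional assumption not stated in the passage, and the starting values are only illustrative.

```python
GROUP_SHIFT = {"alpha": -2, "beta": +1}    # change of group (and of nuclear charge)
WEIGHT_SHIFT = {"alpha": -4, "beta": 0}    # assumed change of atomic weight

def expel(nuclear_charge: int, atomic_weight: int, particle: str) -> tuple[int, int]:
    """Return (nuclear charge, atomic weight) of the product after one expulsion."""
    if particle not in GROUP_SHIFT:
        raise ValueError(f"unknown particle: {particle!r}")
    return (nuclear_charge + GROUP_SHIFT[particle],
            atomic_weight + WEIGHT_SHIFT[particle])

# One alpha-change followed by two beta-changes.
state = (92, 238)
history = [state]
for particle in ("alpha", "beta", "beta"):
    state = expel(*state, particle)
    history.append(state)

print(history)
```

After one α- and two β-changes the nuclear charge, and with it the place in the periodic system, is restored while the atomic weight differs, which corresponds to the chemically inseparable substances discussed above.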
In escaping from the nucleus, the β-rays may be expected to collide with
the bound electrons in the inner rings. This will give rise to an emission
of a characteristic radiation of the same type as the characteristic Rönt-
gen radiation emitted from elements of lower atomic weight by impact of
cathode-rays. The assumption that the emission of γ-rays is due to
collisions of β-rays with bound electrons is proposed by Rutherford¹⁶ in order
to account for the numerous groups of homogeneous β-rays expelled from
certain radioactive substances.

In the present paper it has been attempted to show that the application
of Planck’s theory of radiation to Rutherford’s atom-model through the
introduction of the hypothesis of the universal constancy of the angular
momentum of the bound electrons, leads to results which seem to be in
agreement with experiments.
In a later paper the theory will be applied to systems containing more
than one nucleus.

¹⁵ See A. S. Russell, Chem. News, CVII. p. 49 (1913); G. v. Hevesy, Phys. Zeitschr.
XIV. p. 49 (1913); K. Fajans, Phys. Zeitschr. XIV. pp. 131 & 136 (1913); Verh. d.
deutsch. Phys. Ges. XV. p. 240 (1913); F. Soddy, Chem. News, CVII. p. 97 (1913).
¹⁶ E. Rutherford, Phil. Mag. XXIV. pp. 453 & 893 (1912).

879

Über quantentheoretische Umdeutung kinematischer und mechanischer Beziehungen.
(On the Quantum-Theoretical Reinterpretation of Kinematic and Mechanical Relations.)
By W. Heisenberg in Göttingen.
(Received 29 July 1925.)

The present paper attempts to establish a basis for a quantum-theoretical
mechanics founded exclusively upon relations between quantities which are
in principle observable.

It is well known that the formal rules generally used in quantum theory
for calculating observable quantities (e.g. the energy of the hydrogen atom)
are open to the serious objection that they contain, as an essential element,
relations between quantities which apparently cannot be observed in principle
(such as the position and period of revolution of the electron). These rules
thus evidently lack any intuitive physical foundation, unless one still wishes
to hold on to the hope that the hitherto unobservable quantities may later be
made accessible to experiment. This hope might be regarded as justified if the
rules in question were internally consistent and applicable to a clearly
delimited range of quantum-theoretical problems. Experience shows, however,
that only the hydrogen atom and its Stark effect conform to these formal rules
of quantum theory; that fundamental difficulties arise already in the problem
of "crossed fields" (a hydrogen atom in electric and magnetic fields of
different directions); that the reaction of atoms to periodically varying
fields certainly cannot be described by these rules; and finally that an
extension of the quantum rules to the treatment of atoms with several
electrons has proved impossible. It has become customary to describe this
failure of the quantum-theoretical rules, which were essentially characterized
by the application of classical mechanics, as a deviation from classical
mechanics. This description can hardly be regarded as appropriate, however,
when one considers that the Einstein-Bohr frequency condition (which is valid
quite generally) already represents such a complete renunciation of classical
mechanics, or rather, from the standpoint of wave theory, of the kinematics
underlying this mechanics, that even in the simplest quantum-theoretical
problems a validity of classical mechanics simply cannot be contemplated. In
this situation it seems more advisable to give up entirely the hope of
observing the hitherto unobservable quantities (such as the position and period
of revolution of the electron), to concede at the same time that the partial
agreement of the quantum rules with experience is more or less accidental, and
to try to develop a quantum-theoretical mechanics, analogous to classical
mechanics, in which only relations between observable quantities occur. As the
most important first steps towards such a quantum-theoretical mechanics one
may regard, besides the frequency condition, Kramers' dispersion theory 1) and
the papers building further upon this theory 2). In what follows we shall try
to set out some new quantum-mechanical relations and to use them for the
complete treatment of a few special problems. We shall restrict ourselves to
problems with one degree of freedom.

Zeitschrift für Physik. Bd. XXXIII.
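§ 1. In classical theory the radiation of a moving electron (in the wave
zone, i.e. where $\mathfrak{E} \sim \mathfrak{H} \sim 1/r$) is not given solely
by the expressions

$$\mathfrak{E} = \frac{e}{r^{3}c^{2}}\,[\mathfrak{r}\,[\mathfrak{r}\,\dot{\mathfrak{v}}]], \qquad \mathfrak{H} = \frac{e}{r^{2}c^{2}}\,[\dot{\mathfrak{v}}\,\mathfrak{r}];$$

in the next approximation further terms are added, e.g. of the form

$$\frac{e}{r c^{3}}\,\dot{\mathfrak{v}}\,\mathfrak{v},$$

which may be called "quadrupole radiation", and in still higher approximation
terms e.g. of the form

$$\frac{e}{r c^{4}}\,\dot{\mathfrak{v}}\,\mathfrak{v}^{2};$$

in this way the approximation can be carried as far as desired. (In the
foregoing, $\mathfrak{E}$ and $\mathfrak{H}$ denote the field strengths at the
field point, e the charge of the electron, $\mathfrak{r}$ the distance of the
electron from the field point, and $\mathfrak{v}$ the velocity of the electron.)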
1) H. A. Kramers, Nature 113, 673, 1924.
2) M. Born, ZS. f. Phys. 26, 379, 1924. H. A. Kramers and W. Heisenberg,
ZS. f. Phys. 31, 681, 1925. M. Born and P. Jordan, ZS. f. Phys. (in press).

One may ask what these higher terms would have to look like in quantum
theory. Since in classical theory the higher approximations can be calculated
simply once the motion of the electron, or its Fourier representation, is
given, one will expect something similar in quantum theory. This question has
nothing to do with electrodynamics; rather, and this seems to us particularly
important, it is of a purely kinematic nature. We can pose it in its simplest
form as follows: let a quantum-theoretical quantity be given which takes the
place of the classical quantity x(t); what quantum-theoretical quantity then
takes the place of x(t)²?
Before we can answer this question, we must recall that in quantum theory
it has not been possible to assign to the electron a point in space as a
function of time by means of observable quantities. Even in quantum theory,
however, an emission of radiation can be assigned to the electron. This
radiation is described, first, by the frequencies, which appear as functions
of two variables, in quantum theory in the form

$$\nu(n, n-\alpha) = \frac{1}{h}\,\{W(n) - W(n-\alpha)\},$$

and in classical theory in the form

$$\nu(n, \alpha) = \alpha \cdot \nu(n) = \alpha\,\frac{1}{h}\,\frac{dW}{dn}.$$

(Here n·h = J, one of the canonical constants.)
As characteristic for the comparison of classical theory with quantum theory
in respect of the frequencies, one can write down the combination relations:
Classical:
$$\nu(n, \alpha) + \nu(n, \beta) = \nu(n, \alpha+\beta).$$
Quantum-theoretical:
$$\nu(n, n-\alpha) + \nu(n-\alpha, n-\alpha-\beta) = \nu(n, n-\alpha-\beta)$$
or
$$\nu(n-\beta, n-\alpha-\beta) + \nu(n, n-\beta) = \nu(n, n-\alpha-\beta).$$
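The quantum-theoretical combination relation follows directly from writing each frequency as a difference of two term values. The sketch below checks it with exact rational arithmetic; the hydrogen-like term values W(n) = −1/n² (in units with h = 1) are a hypothetical choice made only for illustration, and any other W(n) would serve equally well.

```python
from fractions import Fraction

def W(n: int) -> Fraction:
    """Hypothetical term values, hydrogen-like, in units with h = 1."""
    return Fraction(-1, n * n)

def nu(n: int, m: int) -> Fraction:
    """Quantum-theoretical frequency nu(n, m) = (W(n) - W(m)) / h with h = 1."""
    return W(n) - W(m)

n, alpha, beta = 6, 2, 3

# nu(n, n-alpha) + nu(n-alpha, n-alpha-beta) = nu(n, n-alpha-beta)
lhs = nu(n, n - alpha) + nu(n - alpha, n - alpha - beta)
rhs = nu(n, n - alpha - beta)
combination_holds = (lhs == rhs)

# Unlike the classical case, nu(n, n-2) is not twice nu(n, n-1).
harmonic_like = (nu(n, n - 2) == 2 * nu(n, n - 1))

print(combination_holds, harmonic_like)
```

The second check illustrates why the quantum frequencies cannot be read as overtones α·ν(n) of a single fundamental frequency.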


Besides the frequencies, the amplitudes are needed, second, for the
description of the radiation. The amplitudes can be regarded as complex
vectors (each with six independent components); they determine polarization
and phase. They too are functions of the two variables n and α, so that the
corresponding part of the radiation is represented by the following expression:
Quantum-theoretical:
$$\mathrm{Re}\,\{\mathfrak{A}(n, n-\alpha)\, e^{i\omega(n, n-\alpha)t}\}. \tag{1}$$
Classical:
$$\mathrm{Re}\,\{\mathfrak{A}_{\alpha}(n)\, e^{i\omega(n)\cdot\alpha t}\}. \tag{2}$$

At first sight the phase contained in $\mathfrak{A}$ seems to have no
physical significance in quantum theory, since the frequencies of quantum
theory are in general not commensurable with their harmonics. We shall see
at once, however, that in quantum theory too the phase has a definite
significance, analogous to that in classical theory. If we now consider a
definite quantity x(t) in classical theory, it can be regarded as represented
by a set of quantities of the form

$$\mathfrak{A}_{\alpha}(n)\, e^{i\omega(n)\cdot\alpha t},$$

which, according as the motion is periodic or not, combine into a sum or into
an integral that represents x(t):

$$x(n, t) = \sum_{\alpha=-\infty}^{+\infty} \mathfrak{A}_{\alpha}(n)\, e^{i\omega(n)\cdot\alpha t}$$
or
$$x(n, t) = \int_{-\infty}^{+\infty} \mathfrak{A}_{\alpha}(n)\, e^{i\omega(n)\cdot\alpha t}\, d\alpha. \tag{2a}$$
Such a combination of the corresponding quantum-theoretical quantities
does not seem possible without arbitrariness, and therefore not meaningful,
because the quantities n and n − α enter on an equal footing. One can,
however, regard the set of quantities

$$\mathfrak{A}(n, n-\alpha)\, e^{i\omega(n, n-\alpha)t}$$

as a representative of the quantity x(t) and then try to answer the question
posed above: by what is the quantity x(t)² represented?
Classically the answer is evidently:

$$\mathfrak{B}_{\beta}(n)\, e^{i\omega(n)\beta t} = \sum_{\alpha=-\infty}^{+\infty} \mathfrak{A}_{\alpha}\,\mathfrak{A}_{\beta-\alpha}\, e^{i\omega(n)(\alpha+\beta-\alpha)t} \tag{3}$$
or
$$\mathfrak{B}_{\beta}(n)\, e^{i\omega(n)\beta t} = \int_{-\infty}^{+\infty} \mathfrak{A}_{\alpha}\,\mathfrak{A}_{\beta-\alpha}\, e^{i\omega(n)(\alpha+\beta-\alpha)t}\, d\alpha, \tag{4}$$

where then

$$x(t)^{2} = \sum_{\beta=-\infty}^{+\infty} \mathfrak{B}_{\beta}(n)\, e^{i\omega(n)\beta t} \tag{5}$$
or
$$x(t)^{2} = \int_{-\infty}^{+\infty} \mathfrak{B}_{\beta}(n)\, e^{i\omega(n)\beta t}\, d\beta. \tag{6}$$

In quantum theory the simplest and most natural assumption seems to be to
replace the relations (3), (4) by the following:

$$\mathfrak{B}(n, n-\beta)\, e^{i\omega(n, n-\beta)t} = \sum_{\alpha=-\infty}^{+\infty} \mathfrak{A}(n, n-\alpha)\,\mathfrak{A}(n-\alpha, n-\beta)\, e^{i\omega(n, n-\beta)t} \tag{7}$$
or
$$\mathfrak{B}(n, n-\beta)\, e^{i\omega(n, n-\beta)t} = \int_{-\infty}^{+\infty} d\alpha\, \mathfrak{A}(n, n-\alpha)\,\mathfrak{A}(n-\alpha, n-\beta)\, e^{i\omega(n, n-\beta)t}, \tag{8}$$

and indeed this kind of composition follows almost inevitably from the
combination relation of the frequencies. If one makes the assumptions (7) and
(8), one also recognizes that the phases of the quantum-theoretical
$\mathfrak{A}$ have just as great a physical significance as those in
classical theory: only the origin of time, and hence a phase constant common
to all the $\mathfrak{A}$, is arbitrary and without physical significance; but
the phase of the individual $\mathfrak{A}$ enters essentially into the
quantity $\mathfrak{B}$ 1). A geometrical interpretation of such
quantum-theoretical phase relations in analogy with classical theory seems at
present scarcely possible.
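The composition rule (7) can be tried out on a finite set of states. In the sketch below the amplitudes are stored as a square array indexed by the two states, the sum over α is truncated to the states available, and the term values are hypothetical numbers chosen only for illustration (units with h/2π = 1). It checks that attaching the time factors before or after forming the product gives the same result, which is the point at which the combination relation of the frequencies enters.

```python
import cmath

# Hypothetical term values; omega(n, m) = W(n) - W(m) in units with h/2pi = 1.
LEVELS = [0.0, 1.0, 2.5, 4.5]
SIZE = len(LEVELS)

def omega(n: int, m: int) -> float:
    return LEVELS[n] - LEVELS[m]

def quantum_product(a, b):
    """B(n, m) = sum over intermediate states k of A(n, k) * A'(k, m), as in (7)."""
    return [[sum(a[n][k] * b[k][m] for k in range(SIZE)) for m in range(SIZE)]
            for n in range(SIZE)]

def with_time_factor(a, t: float):
    """Attach exp(i * omega(n, m) * t) to every amplitude A(n, m)."""
    return [[a[n][m] * cmath.exp(1j * omega(n, m) * t) for m in range(SIZE)]
            for n in range(SIZE)]

# Arbitrary complex amplitudes (illustrative values only).
A = [[complex(n + 1, m - n) for m in range(SIZE)] for n in range(SIZE)]
t = 0.7

product_then_time = with_time_factor(quantum_product(A, A), t)
time_then_product = quantum_product(with_time_factor(A, t), with_time_factor(A, t))

max_diff = max(abs(product_then_time[n][m] - time_then_product[n][m])
               for n in range(SIZE) for m in range(SIZE))
print(max_diff)
```

Written this way the rule is the row-by-column product of two square arrays, with the truncation to four states being an assumption of the example.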
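If we ask further for the representative of the quantity x(t)³, we find
without difficulty:
Classical:
$$\mathfrak{C}(n, \gamma) = \sum_{\alpha=-\infty}^{+\infty}\,\sum_{\beta=-\infty}^{+\infty} \mathfrak{A}_{\alpha}(n)\,\mathfrak{A}_{\beta}(n)\,\mathfrak{A}_{\gamma-\alpha-\beta}(n). \tag{9}$$
Quantum-theoretical:
$$\mathfrak{C}(n, n-\gamma) = \sum_{\alpha=-\infty}^{+\infty}\,\sum_{\beta=-\infty}^{+\infty} \mathfrak{A}(n, n-\alpha)\,\mathfrak{A}(n-\alpha, n-\alpha-\beta)\,\mathfrak{A}(n-\alpha-\beta, n-\gamma) \tag{10}$$
or the corresponding integrals.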


In a similar way all quantities of the form x(t)^n can be represented in
quantum theory, and if any function f[x(t)] is given, one can evidently always
find the quantum-theoretical analogue, provided this function can be expanded
in a power series in x. An essential difficulty arises, however, if we
consider two quantities x(t), y(t) and ask for the product x(t) y(t).

1) Cf. also H. A. Kramers and W. Heisenberg, loc. cit. The phases enter
essentially into the expressions used there for the induced scattering moment.

If x(t) is characterized by $\mathfrak{A}$ and y(t) by $\mathfrak{B}$, we
obtain as the representation of x(t)·y(t):
Classical:
$$\mathfrak{C}_{\beta}(n) = \sum_{\alpha=-\infty}^{+\infty} \mathfrak{A}_{\alpha}(n)\,\mathfrak{B}_{\beta-\alpha}(n).$$
Quantum-theoretical:
$$\mathfrak{C}(n, n-\beta) = \sum_{\alpha=-\infty}^{+\infty} \mathfrak{A}(n, n-\alpha)\,\mathfrak{B}(n-\alpha, n-\beta).$$

Whereas classically x(t)·y(t) is always equal to y(t) x(t), this need not in
general be the case in quantum theory. In special cases, e.g. in forming
x(t)·x(t)², this difficulty does not arise.
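A minimal numerical illustration of this difficulty, under the assumption that the sums are truncated to two states and with arbitrary integer amplitudes chosen only for the example: the product formed by the quantum-theoretical rule depends on the order of the factors, whereas products of powers of the same quantity do not.

```python
def quantum_product(a, b):
    """C(n, m) = sum over intermediate states k of A(n, k) * B(k, m)."""
    size = len(a)
    return [[sum(a[n][k] * b[k][m] for k in range(size)) for m in range(size)]
            for n in range(size)]

# Illustrative amplitude arrays for two quantities x and y (two states only).
x = [[0, 1],
     [1, 0]]
y = [[0, 2],
     [3, 0]]

xy = quantum_product(x, y)
yx = quantum_product(y, x)

# The special case x * x^2 versus x^2 * x.
x_squared = quantum_product(x, x)
x_times_x2 = quantum_product(x, x_squared)
x2_times_x = quantum_product(x_squared, x)

print(xy, yx)
print(x_times_x2 == x2_times_x)
```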
If, as in the question posed at the beginning of this section, one is
dealing with expressions of the form $v(t)\,\dot{v}(t)$, then in quantum theory
$v\dot{v}$ should be replaced by

$$\frac{v\dot{v} + \dot{v}v}{2},$$

in order that $v\dot{v}$ appears as the differential coefficient of
$v^{2}/2$. In a similar way it should always be possible to give natural
quantum-theoretical mean values, which are, however, hypothetical to an even
greater degree than the formulae (7) and (8).
Apart from the difficulty just described, formulae of the type (7), (8)
should quite generally suffice to express also the interaction of the
electrons in an atom by means of the characteristic amplitudes of the
electrons.
§ 2. After these considerations, which were concerned with the kinematics
of quantum theory, we turn to the mechanical problem, which aims at the
determination of the $\mathfrak{A}$, ν, W from the given forces of the system.
In the earlier theory this problem is solved in two steps:
1. Integration of the equation of motion
$$\ddot{x} + f(x) = 0. \tag{11}$$
2. Determination of the constant for periodic motions through
$$\oint p\, dq = \oint m\dot{x}\, dx = J\ (= nh). \tag{12}$$

If one sets out to construct a quantum-theoretical mechanics as closely
analogous to classical mechanics as possible, it is very natural to take over
the equation of motion (11) directly into quantum theory. In order not to
leave the firm foundation of quantities that are in principle observable, it
is only necessary to replace the quantities $\ddot{x}$ and f(x) by their
quantum-theoretical representatives known from § 1. In classical theory it is
possible to seek the solution of (11) by expressing x as a Fourier series or
Fourier integral with undetermined coefficients (and frequencies); in general,
however, we then obtain infinitely many equations with infinitely many
unknowns, or integral equations, which can be transformed into simple
recursion formulae for the $\mathfrak{A}$ only in special cases. In quantum
theory, however, we are for the present dependent on this method of solving
(11), since, as discussed above, no quantum-theoretical function directly
analogous to the function x(n, t) could be defined.
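As a consequence, the quantum-theoretical solution of (11) can at first be
carried out only in the simplest cases. Before we go into such simple
examples, let us derive the quantum-theoretical determination of the constant
according to (12). We assume, then, that the motion is (classically) periodic:

$$x = \sum_{\alpha=-\infty}^{+\infty} a_{\alpha}(n)\, e^{i\alpha\omega_{n}t}; \tag{13}$$

then

$$m\dot{x} = m \sum_{\alpha=-\infty}^{+\infty} a_{\alpha}(n)\cdot i\alpha\omega_{n}\, e^{i\alpha\omega_{n}t}$$

and

$$\oint m\dot{x}\, dx = \oint m\dot{x}^{2}\, dt = 2\pi m \sum_{\alpha=-\infty}^{+\infty} a_{\alpha}(n)\, a_{-\alpha}(n)\,\alpha^{2}\omega_{n}.$$

Since furthermore $a_{-\alpha}(n) = \overline{a_{\alpha}(n)}$ (x is to be
real), it follows that

$$\oint m\dot{x}^{2}\, dt = 2\pi m \sum_{\alpha=-\infty}^{+\infty} |a_{\alpha}(n)|^{2}\,\alpha^{2}\omega_{n}. \tag{14}$$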
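The closed-form value of the phase integral, 2πm Σ |a_α|² α² ω_n, which results from inserting the Fourier series (13) into ∮ m ẋ² dt, can be checked against a direct numerical integration over one period. The Fourier coefficients, mass and frequency below are arbitrary illustrative values, not taken from the paper.

```python
import cmath
import math

MASS = 1.3     # illustrative value
OMEGA = 2.0    # illustrative fundamental angular frequency

# Coefficients a_alpha with a_{-alpha} = conjugate(a_alpha), so that x(t) is real.
positive = {1: 0.5 + 0.2j, 2: 0.1 - 0.3j}
coeffs = dict(positive)
coeffs.update({-alpha: a.conjugate() for alpha, a in positive.items()})

def velocity(t: float) -> float:
    """dx/dt for x = sum_alpha a_alpha * exp(i * alpha * omega * t)."""
    value = sum(a * 1j * alpha * OMEGA * cmath.exp(1j * alpha * OMEGA * t)
                for alpha, a in coeffs.items())
    return value.real  # the imaginary part vanishes up to rounding

def phase_integral_numeric(steps: int = 2000) -> float:
    """Integral of m * xdot^2 over one period (rectangle rule on a periodic integrand)."""
    period = 2.0 * math.pi / OMEGA
    dt = period / steps
    return sum(MASS * velocity(k * dt) ** 2 * dt for k in range(steps))

def phase_integral_formula() -> float:
    """2 * pi * m * sum_alpha |a_alpha|^2 * alpha^2 * omega."""
    return 2.0 * math.pi * MASS * sum(abs(a) ** 2 * alpha ** 2 * OMEGA
                                      for alpha, a in coeffs.items())

numeric = phase_integral_numeric()
closed_form = phase_integral_formula()
print(numeric, closed_form)
```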

This phase integral has hitherto usually been set equal to an integral
multiple of h, i.e. equal to n·h. Such a condition, however, not only fits
very artificially into the mechanical calculation; it also appears arbitrary,
even from the earlier standpoint, in the sense of the correspondence
principle. For according to the correspondence principle the J are fixed as
integral multiples of h only up to an additive constant, and in place of (14)
one would naturally have

$$\frac{d}{dn}(nh) = \frac{d}{dn} \oint m\dot{x}^{2}\, dt,$$

that is,

$$h = 2\pi m \sum_{\alpha=-\infty}^{+\infty} \alpha\, \frac{d}{dn}\left(\alpha\,\omega_{n}\,|a_{\alpha}|^{2}\right). \tag{15}$$

Such a condition, however, likewise fixes the $a_{\alpha}$ only up to a
constant, and this indeterminacy has given rise empirically to difficulties
through the occurrence of half-integral quantum numbers.
If we ask for a quantum-theoretical relation between observable quantities
corresponding to (14) and (15), the missing uniqueness is restored of its own
accord.
Admittedly only equation (15) possesses a simple quantum-theoretical
transformation connected with Kramers' dispersion theory 1):

$$h = 4\pi m \sum_{\alpha=0}^{\infty} \{\,|a(n, n+\alpha)|^{2}\,\omega(n, n+\alpha) - |a(n, n-\alpha)|^{2}\,\omega(n, n-\alpha)\,\}, \tag{16}$$

yet this relation suffices here for the unique determination of the a; for the
constant that is at first undetermined in the quantities a is fixed
automatically by the condition that there shall be a normal state from which
no further radiation takes place. If the normal state is denoted by $n_{0}$,
then all

$$a(n_{0}, n_{0}-\alpha) = 0 \quad (\text{for } \alpha > 0).$$

The question of half-integral or integral quantization should therefore not
be able to arise in a quantum-theoretical mechanics that uses only relations
between observable quantities.
Equations (11) and (16) together contain, if they can be solved, a complete determination not only of the frequencies and energies but also of the quantum-theoretical transition probabilities. The actual mathematical execution, however, succeeds at first only in the simplest cases. A particular complication also arises in many systems, for example the hydrogen atom, because the solutions correspond partly to periodic and partly to aperiodic motions, with the result that the quantum-theoretical series (7), (8) and equation (16) always split into a sum and an integral. Quantum-mechanically, a separation into "periodic and aperiodic motions" cannot in general be carried out.
Nevertheless, one could perhaps regard equations (11) and (16) as a satisfactory solution of the mechanical problem, at least in principle, if it could be shown that this solution agrees with, or at least does not contradict, the quantum-mechanical relations known so far; that is, that a small perturbation of a mechanical problem gives rise to additional terms in the energy or in the frequencies which correspond precisely to the expressions found by Kramers and Born, in contrast to those which the classical theory would yield. It would further have to be investigated whether, in the quantum-theoretical interpretation proposed here, an energy integral $m\dfrac{\dot{x}^2}{2} + U(x) = \text{const}$ corresponds in general to equation (11), and whether the energy so obtained satisfies the condition $\Delta W = h \cdot \nu$, analogous to the classical relation $\nu = \dfrac{dW}{dJ}$. Only a general answer to these questions could demonstrate the inner connection of the quantum-mechanical attempts made so far and lead to a quantum mechanics that operates consistently with observable quantities alone. Apart from a general relation between Kramers' dispersion formula and equations (11) and (16), we can answer the questions posed above only in the very special cases that can be solved by simple recursion.

1) This relation was already given, on the basis of considerations of dispersion, by W. Kuhn, ZS. f. Phys. 33, 408, 1925, and Thomas, Naturw. 13, 1925.
That general relation between Kramers' dispersion theory and our equations (11), (16) consists in the following. From equation (11) (that is, from its quantum-theoretical analogue) it follows, just as in the classical theory, that the oscillating electron behaves like a free electron towards light of much shorter wavelength than all the characteristic oscillations of the system. This result also follows from Kramers' theory if equation (16) is taken into account as well. Indeed, Kramers finds for the moment induced by the wave $E \cos 2\pi\nu t$:
$$M = e^2 E \cos 2\pi\nu t \cdot \frac{2}{h} \sum_{\alpha=0}^{\infty} \left\{ \frac{|a(n, n+\alpha)|^2\, \nu(n, n+\alpha)}{\nu^2(n, n+\alpha) - \nu^2} - \frac{|a(n, n-\alpha)|^2\, \nu(n, n-\alpha)}{\nu^2(n, n-\alpha) - \nu^2} \right\},$$
hence for $\nu \gg \nu(n, n+\alpha)$
$$M = -\frac{2 E e^2 \cos 2\pi\nu t}{\nu^2 \cdot h} \sum_{\alpha=0}^{\infty} \bigl\{ |a(n, n+\alpha)|^2\, \nu(n, n+\alpha) - |a(n, n-\alpha)|^2\, \nu(n, n-\alpha) \bigr\},$$
which on account of (16) goes over into
$$M = -\frac{e^2 E \cos 2\pi\nu t}{\nu^2 \cdot 4\pi^2 m} .$$
§ 3. As the simplest example, the anharmonic oscillator will be treated in what follows:
$$\ddot{x} + \omega_0^2 x + \lambda x^2 = 0. \tag{17}$$
Classically this equation can be satisfied by an ansatz of the form
$$x = \lambda a_0 + a_1 \cos \omega t + \lambda a_2 \cos 2\omega t + \lambda^2 a_3 \cos 3\omega t + \dots + \lambda^{\tau-1} a_\tau \cos \tau\omega t + \dots,$$
where the $a$ are power series in $\lambda$ which begin with a term free of $\lambda$. Quantum-theoretically we try an analogous ansatz and represent $x$ by terms of the form
$$\lambda a(n,n);\quad a(n,n-1)\cos\omega(n,n-1)t;\quad \lambda a(n,n-2)\cos\omega(n,n-2)t;\quad \dots\quad \lambda^{\tau-1} a(n,n-\tau)\cos\omega(n,n-\tau)t\ \dots$$
The recursion formulae for determining the $a$ and $\omega$ read (up to terms of order $\lambda$), according to equations (3), (4) and (7), (8) respectively:

Classical:
$$\begin{aligned}
\omega_0^2\, a_0(n) + \frac{a_1^2(n)}{2} &= 0;\\
-\omega^2 + \omega_0^2 &= 0;\\
(-4\omega^2 + \omega_0^2)\, a_2(n) + \frac{a_1^2}{2} &= 0;\\
(-9\omega^2 + \omega_0^2)\, a_3(n) + a_1 a_2 &= 0;\\
&\dots
\end{aligned} \tag{18}$$

Quantum-theoretical:
$$\begin{aligned}
\omega_0^2\, a_0(n) + \frac{a^2(n+1, n) + a^2(n, n-1)}{4} &= 0;\\
-\omega^2(n, n-1) + \omega_0^2 &= 0;\\
\bigl(-\omega^2(n, n-2) + \omega_0^2\bigr)\, a(n, n-2) + \frac{a(n, n-1)\, a(n-1, n-2)}{2} &= 0;\\
\bigl(-\omega^2(n, n-3) + \omega_0^2\bigr)\, a(n, n-3) + \frac{a(n, n-1)\, a(n-1, n-3)}{2} + \frac{a(n, n-2)\, a(n-2, n-3)}{2} &= 0;\\
&\dots
\end{aligned} \tag{19}$$

To this is added the quantum condition:

Classical ($J = nh$):
$$1 = 2\pi m\, \frac{d}{dJ} \sum_{\tau=-\infty}^{+\infty} \tau^2\, \frac{|a_\tau|^2\, \omega}{4} .$$

Quantum-theoretical:
$$h = \pi m \sum_{\tau=0}^{\infty} \bigl[ |a(n+\tau, n)|^2\, \omega(n+\tau, n) - |a(n, n-\tau)|^2\, \omega(n, n-\tau) \bigr].$$

In first approximation this gives, classically as well as quantum-theoretically:
$$a_1^2(n) \quad\text{or}\quad a^2(n, n-1) = \frac{(n + \text{const})\, h}{\pi m \omega_0} . \tag{20}$$

Quantum-theoretically the constant in (20) can be determined by the condition that $a(n_0, n_0 - 1)$ shall be zero in the normal state. If we number the $n$ so that $n$ becomes zero in the normal state, that is $n_0 = 0$, it follows that
$$a^2(n, n-1) = \frac{n h}{\pi m \omega_0} .$$
From the recursion equations (18) it then follows that in the classical theory $a_\tau$ (in first approximation in $\lambda$) takes the form $\kappa(\tau)\, n^{\tau/2}$, where $\kappa(\tau)$ is a factor independent of $n$. In the quantum theory (19) gives
$$a(n, n-\tau) = \kappa(\tau) \sqrt{\frac{n!}{(n-\tau)!}}, \tag{21}$$
where $\kappa(\tau)$ is the same proportionality factor, independent of $n$. For large values of $n$ the quantum-theoretical value of $a_\tau$ naturally goes over asymptotically into the classical one.
For the energy it is natural to try the classical expression
$$\frac{m\dot{x}^2}{2} + m\omega_0^2\, \frac{x^2}{2} + \frac{m\lambda}{3}\, x^3 = W,$$
which in the approximation worked out here really is constant in the quantum theory as well, and which according to (19), (20) and (21) has the value:

Classical:
$$W = \frac{n h \omega_0}{2\pi} . \tag{22}$$

Quantum-theoretical [according to (7), (8)]:
$$W = \frac{\bigl(n + \tfrac{1}{2}\bigr)\, h \omega_0}{2\pi} \tag{23}$$
(up to quantities of order $\lambda^2$).

According to this view, then, even for the harmonic oscillator the energy cannot be represented by "classical mechanics", that is by (22), but has the form (23).
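The step from (20) to (23) can be illustrated with a small exact calculation. The sketch below is not part of the paper: it assumes units with $m = 1$, $\omega_0 = 1$, $h = 2\pi$, and it assumes that the constant term of $x^2$ in state $n$ is $[a^2(n+1,n) + a^2(n,n-1)]/4$, which is how the product rule (7) is read here for a single cosine term. The function names are hypothetical.

```python
from fractions import Fraction

# Units assumed only for this sketch: m = 1, omega_0 = 1, h = 2*pi,
# so h/(pi*m*omega_0) = 2 and h*omega_0/(2*pi) = 1.

def a_squared(n):
    """a^2(n, n-1) from (20), with the constant fixed by a(0, -1) = 0."""
    return Fraction(2 * n)

def quantum_condition(n):
    """pi*m*omega_0*[a^2(n+1,n) - a^2(n,n-1)] expressed in units of h."""
    # pi*m*omega_0/h = 1/2 in the chosen units
    return Fraction(1, 2) * (a_squared(n + 1) - a_squared(n))

def energy(n):
    """Constant term of m*xdot^2/2 + m*omega_0^2*x^2/2 for lambda = 0.

    Assumption: the constant term of x^2 is [a^2(n+1,n) + a^2(n,n-1)]/4,
    and the same expression times omega_0^2 holds for xdot^2.
    """
    mean_x2 = (a_squared(n + 1) + a_squared(n)) / 4
    kinetic = Fraction(1, 2) * mean_x2
    potential = Fraction(1, 2) * mean_x2
    return kinetic + potential

levels = [energy(n) for n in range(5)]
print(levels)
```

In these units the printed values are the energies in multiples of $h\omega_0/2\pi$; the half-integer offset comes entirely from averaging over the two neighbouring transitions $n+1 \to n$ and $n \to n-1$.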
The more exact calculation, including the higher approximations in $W$, $a$, $\omega$, will be carried out on the simpler example of the anharmonic oscillator of the type:
$$\ddot{x} + \omega_0^2 x + \lambda x^3 = 0 .$$
Classically one can put here:
$$x = a_1 \cos \omega t + \lambda a_3 \cos 3\omega t + \lambda^2 a_5 \cos 5\omega t + \dots,$$
and by analogy we try quantum-theoretically the ansatz
$$a(n, n-1)\cos\omega(n, n-1)t;\quad \lambda a(n, n-3)\cos\omega(n, n-3)t;\quad \dots$$

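The quantities $a$ are again power series in $\lambda$, whose first term, as in (21), has the form:
$$a(n, n-\tau) = \kappa(\tau) \sqrt{\frac{n!}{(n-\tau)!}},$$
as one finds by working out the equations corresponding to (18), (19).
If the calculation of $\omega$, $a$ according to (18), (19) is carried through to the approximation $\lambda^2$ and $\lambda$ respectively, one obtains:
$$\omega(n, n-1) = \omega_0 + \lambda\, \frac{3 n h}{8\pi \omega_0^2 m} - \lambda^2\, \frac{3 h^2}{256\, \omega_0^5 m^2 \pi^2}\, (17 n^2 + 7) + \dots \tag{24}$$
$$a(n, n-1) = \sqrt{\frac{n h}{\pi \omega_0 m}} \left( 1 - \lambda\, \frac{3 n h}{16 \pi \omega_0^3 m} + \dots \right). \tag{25}$$
$$a(n, n-3) = \frac{1}{32} \sqrt{\frac{h^3}{\pi^3 \omega_0^7 m^3}\, n (n-1)(n-2)} \left( 1 - \lambda\, \frac{39 (n-1) h}{32 \pi \omega_0^3 m} \right). \tag{26}$$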

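The energy, which is defined as the constant term of
$$\frac{m\dot{x}^2}{2} + m\omega_0^2\, \frac{x^2}{2} + \frac{m\lambda}{4}\, x^4$$
(I could not prove in general that the periodic terms are really all zero; in the terms worked out this was the case), comes out as
$$W = \frac{\bigl(n + \tfrac{1}{2}\bigr) h \omega_0}{2\pi} + \lambda\, \frac{3 \bigl(n^2 + n + \tfrac{1}{2}\bigr) h^2}{8 \cdot 4\pi^2 \omega_0^2 \cdot m} - \lambda^2\, \frac{h^3}{512\, \pi^3 \omega_0^5 m^2} \left( 17 n^3 + \frac{51}{2} n^2 + \frac{59}{2} n + \frac{21}{2} \right). \tag{27}$$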

This energy can also be calculated by the Kramers-Born procedure, by regarding the term $\dfrac{m\lambda}{4} x^4$ as a perturbation of the harmonic oscillator. One then really does arrive again at exactly the result (27), which seems to me a remarkable support for the quantum-mechanical equations taken as a basis. Furthermore, the energy calculated from (27) satisfies the formula [cf. (24)]:
$$\frac{\omega(n, n-1)}{2\pi} = \frac{1}{h} \bigl[ W(n) - W(n-1) \bigr],$$
which is likewise to be regarded as a necessary condition for the possibility of determining the transition probabilities in accordance with equations (11) and (16).
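The consistency between (24) and (27) stated above can be checked term by term. The following is a minimal sketch, not the author's calculation: it assumes units with $h = 2\pi$, $m = 1$, $\omega_0 = 1$, in which the coefficients of (24) and (27) become rational numbers, and it compares $\omega(n, n-1)$ with $W(n) - W(n-1)$ exactly. The function names are hypothetical.

```python
from fractions import Fraction

# Units assumed for this sketch: h = 2*pi, m = 1, omega_0 = 1, so 2*pi/h = 1.

def W(n, lam):
    """Energy (27) truncated after the lambda^2 term."""
    n = Fraction(n)
    w0 = n + Fraction(1, 2)
    # 3 h^2 / (8 * 4 pi^2) = 3/8 in these units
    w1 = Fraction(3, 8) * (n**2 + n + Fraction(1, 2))
    # h^3 / (512 pi^3) = 1/64 in these units
    w2 = -Fraction(1, 64) * (17 * n**3 + Fraction(51, 2) * n**2
                             + Fraction(59, 2) * n + Fraction(21, 2))
    return w0 + lam * w1 + lam**2 * w2

def omega(n, lam):
    """Frequency (24) truncated after the lambda^2 term."""
    n = Fraction(n)
    # 3 h / (8 pi) = 3/4 and 3 h^2 / (256 pi^2) = 3/64 in these units
    return 1 + lam * Fraction(3, 4) * n - lam**2 * Fraction(3, 64) * (17 * n**2 + 7)

lam = Fraction(1, 100)  # arbitrary small coupling, chosen only for illustration
mismatches = [n for n in range(1, 20) if omega(n, lam) != W(n, lam) - W(n - 1, lam)]
print(mismatches)
```

Because both expressions are polynomials in $\lambda$ of degree two, exact rational arithmetic suffices; an empty list of mismatches means the frequency condition holds for every tested $n$ through second order.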

Finally, the rotator may be adduced as an example, and attention drawn to the relation of equations (7), (8) to the intensity formulae for the Zeeman effect 1) and for the multiplets 2).
Let the rotator be represented by an electron that circles a nucleus at the constant distance $a$. The "equations of motion" then state, classically as well as quantum-theoretically, only that the electron describes a plane, uniform rotation about the nucleus at the constant distance $a$ with angular velocity $\omega$. The "quantum condition" gives according to (12):
$$h = \frac{d}{dn} \bigl( 2\pi m a^2 \omega \bigr),$$
according to (16):
$$h = 2\pi m \bigl\{ a^2 \omega(n+1, n) - a^2 \omega(n, n-1) \bigr\},$$
from which it follows in both cases that:
$$\omega(n, n-1) = \frac{h \cdot (n + \text{const})}{2\pi m a^2} .$$
The condition that the radiation shall vanish in the normal state ($n_0 = 0$) leads to the formula:
$$\omega(n, n-1) = \frac{h \cdot n}{2\pi m a^2} . \tag{28}$$


The energy becomes
$$W = \frac{m}{2}\, v^2$$
or, according to (7), (8),
$$W = \frac{m}{2}\, a^2\, \frac{\omega^2(n, n-1) + \omega^2(n+1, n)}{2} = \frac{h^2}{8\pi^2 m a^2} \left( n^2 + n + \frac{1}{2} \right), \tag{29}$$
which again satisfies the relation $\omega(n, n-1) = \dfrac{2\pi}{h} \bigl[ W(n) - W(n-1) \bigr]$. As support for formulae (28) and (29), which deviate from the theory customary until now, it may be regarded that according to Kratzer 3) many band spectra (including those for which the existence of an electronic angular momentum is improbable) seem to require formulae of the type (28), (29), which, for the sake of the classical-mechanical theory, one has so far tried to explain by half-integral quantisation.
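The rotator formulae (28) and (29) can be checked in the same way as the oscillator. The sketch below assumes units with $h = 2\pi$, $m = 1$, $a = 1$ (an assumption made only for illustration, so that $h/2\pi m a^2 = 1$) and uses hypothetical function names.

```python
from fractions import Fraction

# Units assumed for this sketch: h = 2*pi, m = 1, a = 1,
# so h/(2*pi*m*a^2) = 1 and h^2/(8*pi^2*m*a^2) = 1/2.

def omega(n):
    """omega(n, n-1) from (28)."""
    return Fraction(n)

def energy_from_average(n):
    """W = (m/2) a^2 [omega^2(n,n-1) + omega^2(n+1,n)] / 2, first form of (29)."""
    return Fraction(1, 2) * (omega(n) ** 2 + omega(n + 1) ** 2) / 2

def energy_closed_form(n):
    """W = h^2/(8 pi^2 m a^2) (n^2 + n + 1/2), second form of (29)."""
    return Fraction(1, 2) * (Fraction(n) ** 2 + n + Fraction(1, 2))

checks = [
    (energy_from_average(n) == energy_closed_form(n),
     omega(n) == energy_closed_form(n) - energy_closed_form(n - 1))
    for n in range(1, 8)
]
print(all(both_forms and freq for both_forms, freq in checks))
```

The first entry of each pair confirms that the two forms of (29) agree; the second confirms the frequency condition, with $2\pi/h = 1$ in the chosen units.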

1) Goudsmit and R. de L. Kronig, Naturw. 13, 90, 1925; H. Hönl, ZS. f. Phys. 31, 340, 1925.
2) R. de L. Kronig, ZS. f. Phys. 31, 885, 1925; A. Sommerfeld and H. Hönl, Sitzungsber. d. Preuß. Akad. d. Wiss. 1925, p. 141; H. N. Russell, Nature 115, 835, 1925.
3) Cf., e.g., A. Kratzer, Sitzungsber. d. Bayr. Akad. 1922, p. 107.

To arrive at the Goudsmit-Kronig-Hönl formulae for the rotator, we must leave the domain of problems with one degree of freedom and assume that the rotator, oriented in some direction in space, performs a very slow precession $\mathfrak{o}$ about the axis $z$ of an external field. Let the quantum number corresponding to this precession be called $m$. The motion is then represented by the quantities
$$\begin{aligned}
z:&\quad a(n, n-1;\, m, m)\, \cos \omega(n, n-1) t;\\
x + i y:&\quad b(n, n-1;\, m, m-1)\, e^{i[\omega(n, n-1) + \mathfrak{o}] t};\\
&\quad b(n, n-1;\, m-1, m)\, e^{i[-\omega(n, n-1) + \mathfrak{o}] t}.
\end{aligned}$$
The equations of motion read simply:
$$x^2 + y^2 + z^2 = a^2,$$
which according to (7) gives rise to the equations 1):
$$\begin{aligned}
\tfrac{1}{2} \bigl\{ &\tfrac{1}{2} a^2(n, n-1;\, m, m) + b^2(n, n-1;\, m, m-1) + b^2(n, n-1;\, m, m+1)\\
&+ \tfrac{1}{2} a^2(n+1, n;\, m, m) + b^2(n+1, n;\, m-1, m) + b^2(n+1, n;\, m+1, m) \bigr\} = a^2 .
\end{aligned} \tag{30}$$
$$\begin{aligned}
\tfrac{1}{2}\, a(n, n-1;\, m, m)\, a(n-1, n-2;\, m, m) ={}& b(n, n-1;\, m, m+1)\, b(n-1, n-2;\, m+1, m)\\
&+ b(n, n-1;\, m, m-1)\, b(n-1, n-2;\, m-1, m).
\end{aligned} \tag{31}$$
To this is added, according to (16), the quantum condition:
$$2\pi m \bigl\{ b^2(n, n-1;\, m, m-1)\, \omega(n, n-1) - b^2(n, n-1;\, m-1, m)\, \omega(n, n-1) \bigr\} = (m + \text{const})\, h. \tag{32}$$
The classical relations corresponding to these equations,
$$\left.\begin{aligned}
\tfrac{1}{2} a_0^2 + b_1^2 + b_{-1}^2 &= a^2;\\
\tfrac{1}{4} a_0^2 &= b_1 b_{-1};\\
2\pi m \bigl( b_{+1}^2 - b_{-1}^2 \bigr)\, \omega &= (m + \text{const})\, h
\end{aligned}\right\} \tag{33}$$
suffice (apart from the undetermined constant attached to $m$) to fix $a_0$, $b_1$, $b_{-1}$ uniquely.
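The solution of the quantum-theoretical equations (30), (31), (32) that presents itself most simply reads:
$$\begin{aligned}
b(n, n-1;\, m, m-1) &= a \sqrt{\frac{(n + m + 1)(n + m)}{4 \bigl(n + \tfrac{1}{2}\bigr) n}};\\
b(n, n-1;\, m-1, m) &= a \sqrt{\frac{(n - m)(n - m + 1)}{4 \bigl(n + \tfrac{1}{2}\bigr) n}};\\
a(n, n-1;\, m, m) &= a \sqrt{\frac{(n + m + 1)(n - m)}{\bigl(n + \tfrac{1}{2}\bigr) n}} .
\end{aligned}$$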

1) Equation (30) is essentially identical with the Ornstein-Burger sum rules.

These expressions agree with the formulae of Goudsmit, Kronig and Hönl. It is not easy to see, however, that these expressions represent the only solution of (30), (31), (32), although this seems probable to me when the boundary conditions are taken into account (vanishing of the $a$, $b$ at the "boundary"; cf. the papers by Kronig, Sommerfeld and Hönl, and Russell cited above).
A consideration similar to the one carried out here also leads, for the intensity formulae of the multiplets, to the result that the intensity rules mentioned are in agreement with equations (7) and (16). This result may again be claimed as support, in particular, for the correctness of the kinematic equation (7).
Whether a method for determining quantum-theoretical data by relations between observable quantities, such as the one proposed here, can already be regarded as satisfactory in principle, or whether this method still represents far too crude an attack on the physical problem of a quantum-theoretical mechanics, which for the present is evidently very involved, will become clear only through a more thorough mathematical investigation of the method, which has been used here very superficially.

Göttingen, Institut für theoretische Physik.


3. Quantisation as a Problem of Proper Values (Quantisierung als Eigenwertproblem);
by E. Schrödinger.
(First Communication.)

§ 1. In this communication I should like first to show, for the simplest case of the (non-relativistic and unperturbed) hydrogen atom, that the usual quantisation rule can be replaced by another requirement in which there is no longer any mention of "whole numbers". Rather, the integrality arises in the same natural way as, say, the integrality of the number of nodes of a vibrating string. The new conception is capable of generalisation and, I believe, touches very deeply on the true nature of the quantum rules.
The usual form of the latter is connected with the Hamilton partial differential equation:
$$H\!\left(q, \frac{\partial S}{\partial q}\right) = E . \tag{1}$$
A solution of this equation is sought which can be represented as a sum of functions, each of a single one of the independent variables $q$.
We now introduce for $S$ a new unknown $\psi$ such that $\psi$ would appear as a product of corresponding functions of the individual coordinates. That is, we put
$$S = K \lg \psi . \tag{2}$$
The constant $K$ must be introduced for dimensional reasons; it has the dimension of an action. One thus obtains
$$H\!\left(q, \frac{K}{\psi} \frac{\partial \psi}{\partial q}\right) = E . \tag{1'}$$
We now do not look for a solution of equation (1'); instead we impose the following requirement. If the variation of mass is neglected, equation (1') can always be brought into the form: quadratic form of $\psi$ and its first derivatives $= 0$; if it is taken into account, this is possible at least in the one-electron problem. We seek those functions $\psi$, real, single-valued, finite and twice continuously differentiable throughout configuration space, which make the integral of the quadratic form just mentioned 1), extended over the whole of configuration space, an extremum. By this variational problem we replace the quantum conditions.
For $H$ we shall first take the Hamiltonian function of the Kepler motion and show that the requirement set up can be fulfilled for all positive values of $E$, but only for a discrete set of negative values. That is, the variational problem mentioned has a discrete and a continuous spectrum of eigenvalues. The discrete spectrum corresponds to the Balmer terms, the continuous one to the energies of the hyperbolic orbits. For numerical agreement to hold, $K$ must be given the value $h/2\pi$.
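Since the choice of coordinates is immaterial for setting up the variational equations, we choose rectangular Cartesian coordinates. Then (1') reads in our case ($e$, $m$ are the charge and mass of the electron):
$$\left(\frac{\partial \psi}{\partial x}\right)^2 + \left(\frac{\partial \psi}{\partial y}\right)^2 + \left(\frac{\partial \psi}{\partial z}\right)^2 - \frac{2m}{K^2} \left(E + \frac{e^2}{r}\right) \psi^2 = 0 ; \tag{1''}$$
$$r = \sqrt{x^2 + y^2 + z^2} .$$
And our variational problem reads
$$\delta J = \delta \iiint dx\, dy\, dz \left[ \left(\frac{\partial \psi}{\partial x}\right)^2 + \left(\frac{\partial \psi}{\partial y}\right)^2 + \left(\frac{\partial \psi}{\partial z}\right)^2 - \frac{2m}{K^2} \left(E + \frac{e^2}{r}\right) \psi^2 \right] = 0, \tag{3}$$
the integral being extended over all space. From this one finds in the usual way
$$\frac{1}{2}\, \delta J = \int df\, \delta\psi\, \frac{\partial \psi}{\partial n} - \iiint dx\, dy\, dz\, \delta\psi \left[ \Delta\psi + \frac{2m}{K^2} \left(E + \frac{e^2}{r}\right) \psi \right] = 0. \tag{4}$$
Hence we must have, first,
$$\Delta\psi + \frac{2m}{K^2} \left(E + \frac{e^2}{r}\right) \psi = 0 \tag{5}$$
and, second, the integral to be extended over the infinitely distant closed surface must satisfy
$$\int df\, \delta\psi\, \frac{\partial \psi}{\partial n} = 0. \tag{6}$$
(It will turn out that, because of this last requirement, we still have to supplement our variational problem by a requirement on the behaviour of $\delta\psi$ at infinity, so that the continuous eigenvalue spectrum asserted above really exists. But more of that later.)

1) It does not escape me that this formulation is not entirely unambiguous.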
The solution of (5) can be effected (for example) in spatial polar coordinates $r$, $\vartheta$, $\varphi$, by writing $\psi$ as the product of a function of $r$, a function of $\vartheta$ and a function of $\varphi$. The method is sufficiently well known. For the dependence on the polar angles a surface harmonic results; for the dependence on $r$ (we shall call this function $\chi$) one easily obtains the differential equation:
$$\frac{d^2 \chi}{dr^2} + \frac{2}{r} \frac{d\chi}{dr} + \left( \frac{2mE}{K^2} + \frac{2me^2}{K^2 r} - \frac{n(n+1)}{r^2} \right) \chi = 0, \tag{7}$$
$$n = 0, 1, 2, 3, \dots$$
The restriction of $n$ to whole numbers is, as is well known, necessary in order that the dependence on the polar angles be single-valued. We require solutions of (7) that remain finite for all non-negative real values of $r$. Now 1) equation (7) has two singularities in the complex $r$-plane, at $r = 0$ and $r = \infty$, of which the second is an "indeterminate point" (essential singularity) of all integrals, while the first is not (for any integral). These two singularities form precisely the end points of our real interval. In such a case it is known that the requirement of remaining finite at the end points is equivalent, for the function $\chi$, to a boundary condition. In general the equation has no integral at all that remains finite at both end points; such an integral exists only for certain distinguished values of the constants appearing in the equation. These distinguished values are what we have to determine.

1) For guidance in the treatment of equation (7) I owe the greatest thanks to Hermann Weyl. For the assertions not proved in what follows I refer to L. Schlesinger, Differentialgleichungen (Sammlung Schubert Nr. 13, Göschen 1900, especially chapters 3 and 5).
The state of affairs just emphasised is the salient point of the whole investigation.
We first consider the singular point $r = 0$. The so-called indicial equation (determinierende Fundamentalgleichung), which determines the behaviour of the integrals at this point, is
$$\varrho(\varrho - 1) + 2\varrho - n(n+1) = 0 \tag{8}$$
with the roots
$$\varrho_1 = n, \qquad \varrho_2 = -(n+1). \tag{8'}$$
The two canonical integrals at this point therefore belong to the exponents $n$ and $-(n+1)$. Since $n$ is not negative, only the first of them is of use to us. Because it belongs to the greater exponent, it is represented by an ordinary power series beginning with $r^n$. (The other integral, which does not interest us, can contain a logarithm because of the integral difference between the exponents.) Since the next singular point lies only at infinity, the power series mentioned converges everywhere and represents an entire transcendental function. We therefore establish:
The required solution is an entire transcendental function, uniquely determined (up to an irrelevant constant factor), which belongs to the exponent $n$ at $r = 0$.
The task now is to investigate the behaviour of this function at infinity on the positive real axis. For this purpose we simplify equation (7) by the substitution
$$\chi = r^{\alpha}\, U, \tag{9}$$
in which $\alpha$ is chosen so that the term with $1/r^2$ drops out. For this $\alpha$ must take one of the two values $n$, $-(n+1)$, as is easily verified. Equation (7) then assumes the form:
$$\frac{d^2 U}{dr^2} + \frac{2(\alpha + 1)}{r} \frac{dU}{dr} + \frac{2m}{K^2} \left( E + \frac{e^2}{r} \right) U = 0. \tag{7'}$$
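The remark that $\alpha$ must equal $n$ or $-(n+1)$ can be verified in a few lines. The following worked step is a sketch added for illustration; it uses only the substitution (9) and equation (7), with primes denoting derivatives with respect to $r$.

```latex
\begin{align*}
\chi &= r^{\alpha} U, \qquad
\chi' = r^{\alpha}\Bigl(U' + \frac{\alpha}{r}\,U\Bigr), \qquad
\chi'' = r^{\alpha}\Bigl(U'' + \frac{2\alpha}{r}\,U' + \frac{\alpha(\alpha-1)}{r^{2}}\,U\Bigr),\\[4pt]
0 &= U'' + \frac{2(\alpha+1)}{r}\,U'
   + \frac{\alpha(\alpha+1) - n(n+1)}{r^{2}}\,U
   + \frac{2m}{K^{2}}\Bigl(E + \frac{e^{2}}{r}\Bigr)U ,\\[4pt]
\alpha(\alpha+1) - n(n+1) &= (\alpha - n)(\alpha + n + 1) = 0
\quad\Longrightarrow\quad \alpha = n \ \text{ or } \ \alpha = -(n+1).
\end{align*}
```

The second line is equation (7) divided by $r^{\alpha}$; the $1/r^{2}$ term disappears exactly when $\alpha$ is a root of the indicial equation (8).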
Its integrals belong, at $r = 0$, to the exponents $0$ and $-2\alpha - 1$. For the first value of $\alpha$, $\alpha = n$, the first of these integrals, and for the second value, $\alpha = -(n+1)$, the second of these integrals is an entire transcendental function and leads, according to (9), to the required solution, which is of course unique. We therefore lose nothing by confining ourselves to one of the two values of $\alpha$. We choose
$$\alpha = n. \tag{10}$$
Our solution $U$ then belongs to the exponent $0$ at $r = 0$. Mathematicians call equation (7') Laplace's equation. The general type is
$$U'' + \left( \delta_0 + \frac{\delta_1}{r} \right) U' + \left( \varepsilon_0 + \frac{\varepsilon_1}{r} \right) U = 0. \tag{7''}$$
In our case the constants have the values
$$\delta_0 = 0, \qquad \delta_1 = 2(\alpha + 1), \qquad \varepsilon_0 = \frac{2mE}{K^2}, \qquad \varepsilon_1 = \frac{2me^2}{K^2} . \tag{11}$$
This type of equation is comparatively simple to treat for the reason that the so-called Laplace transformation, which in general again yields an equation of the second order, here leads to one of the first order, which can be solved by quadratures. This permits a representation of the solutions of (7'') themselves by integrals in the complex domain. I quote here only the final result. 1) The integral
$$U = \int_L e^{zr}\, (z - c_1)^{\alpha_1 - 1} (z - c_2)^{\alpha_2 - 1}\, dz \tag{12}$$
is a solution of (7'') for a path of integration $L$ for which
$$\int_L \frac{d}{dz} \Bigl[ e^{zr}\, (z - c_1)^{\alpha_1} (z - c_2)^{\alpha_2} \Bigr] dz = 0. \tag{13}$$

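The constants $c_1$, $c_2$, $\alpha_1$, $\alpha_2$ have the following values. $c_1$ and $c_2$ are the roots of the quadratic equation
$$z^2 + \delta_0 z + \varepsilon_0 = 0 \tag{14}$$
and
$$\alpha_1 = \frac{\varepsilon_1 + \delta_1 c_1}{c_1 - c_2}, \qquad \alpha_2 = -\frac{\varepsilon_1 + \delta_1 c_2}{c_1 - c_2} . \tag{14'}$$
In the case of equation (7') we therefore have, according to (11) and (10),
$$c_1 = +\sqrt{\frac{-2mE}{K^2}}, \quad c_2 = -\sqrt{\frac{-2mE}{K^2}}; \qquad \alpha_1 = \frac{me^2}{K\sqrt{-2mE}} + n + 1, \quad \alpha_2 = -\frac{me^2}{K\sqrt{-2mE}} + n + 1 . \tag{14''}$$

1) Cf. L. Schlesinger, loc. cit. The theory is due to H. Poincaré and J. Horn.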
The integral representation (12) makes it possible not only to survey the asymptotic behaviour of the totality of solutions when $r$ goes to infinity in a definite manner, but also to state this behaviour for one definite solution, which is always much more difficult.
We shall now first exclude the case in which $\alpha_1$ and $\alpha_2$ are real integers. When this case occurs, it always occurs for both quantities simultaneously, and it does so if and only if
$$\frac{me^2}{K\sqrt{-2mE}} = \text{a real integer}. \tag{15}$$
We therefore assume for now that (15) is not fulfilled.
The behaviour of the totality of solutions for a definite manner in which $r$ becomes infinite (we shall always have in mind $r$ becoming infinite through real positive values) is then 1) characterised by the behaviour of the two linearly independent solutions which are obtained by the following two specialisations of the path of integration $L$, and which we shall call $U_1$ and $U_2$. In both cases let $z$ come from infinity and return there along the same path, in such a direction that
$$\lim_{z = \infty} e^{zr} = 0, \tag{16}$$
that is, the real part of $zr$ is to become negatively infinite. Condition (13) is thereby satisfied. In between, let the point $c_1$ be encircled once in the one case (solution $U_1$), and the point $c_2$ in the other case (solution $U_2$).
For very large real positive values of $r$ these two solutions are represented asymptotically (in Poincaré's sense) by
$$\begin{aligned}
U_1 &\sim e^{c_1 r}\, r^{-\alpha_1}\, (-1)^{\alpha_1} \bigl( e^{2\pi i \alpha_1} - 1 \bigr)\, \Gamma(\alpha_1)\, (c_1 - c_2)^{\alpha_2 - 1},\\
U_2 &\sim e^{c_2 r}\, r^{-\alpha_2}\, (-1)^{\alpha_2} \bigl( e^{2\pi i \alpha_2} - 1 \bigr)\, \Gamma(\alpha_2)\, (c_2 - c_1)^{\alpha_1 - 1},
\end{aligned} \tag{17}$$
where we content ourselves here with the first term of the asymptotic series, which proceed in integral negative powers of $r$.

1) If (15) is fulfilled, at least one of the two paths of integration described in the text becomes unusable, since it yields a vanishing result.
We now have to distinguish the two cases $E \gtrless 0$. Let first
1. $E > 0$. We note first that the non-fulfilment of (15) is thereby guaranteed eo ipso, because this quantity becomes purely imaginary. Further, according to (14''), $c_1$ and $c_2$ also become purely imaginary. The exponential functions in (17) are therefore, since $r$ is real, periodic functions that remain finite. The values of $\alpha_1$ and $\alpha_2$ according to (14'') show that $U_1$ and $U_2$ both tend to zero like $r^{-n-1}$. The same must therefore hold for our entire transcendental solution $U$, whose behaviour we are seeking, however it may be composed linearly of $U_1$ and $U_2$. Further, (9) together with (10) shows that the function $\chi$, that is, the entire transcendental solution of the equation (7) originally given, still tends to zero like $1/r$, since it arises from $U$ by multiplication by $r^n$. We can therefore state:
The Euler differential equation (5) of our variational problem has, for every positive $E$, solutions which are single-valued, finite and continuous throughout space and which tend to zero at infinity like $1/r$, under continual oscillations. The surface condition (6) will have to be discussed later.
2. $E < 0$. In this case the possibility (15) is not excluded eo ipso, but for the present we adhere to its agreed exclusion. Then, according to (14'') and (17), $U_1$ grows beyond all limits for $r = \infty$, whereas $U_2$ vanishes exponentially. Our entire transcendental function $U$ (and the same holds for $\chi$) will therefore remain finite if and only if $U$ is identical with $U_2$ up to a numerical factor. This, however, is not the case. One recognises it as follows: if in (12) one chooses for the path of integration $L$ a closed circuit round both points $c_1$ and $c_2$, a circuit which, because the sum $\alpha_1 + \alpha_2$ is an integer, is really closed on the Riemann surface of the integrand and hence satisfies condition (13) eo ipso, then it is easy to show that the integral (12) represents our entire transcendental function $U$. For it can be expanded in a series of positive powers of $r$ which converges at any rate for sufficiently small $r$, therefore satisfies the differential equation (7'), and therefore must coincide with the series for $U$. Hence: $U$ is represented by (12) if $L$ is a closed circuit round both points $c_1$ and $c_2$. This closed circuit can, however, be distorted so that it appears as an additive combination of the two paths of integration considered earlier, which belong to $U_1$ and $U_2$, and indeed with non-vanishing factors, say $1$ and $e^{2\pi i \alpha_1}$. Therefore $U$ cannot coincide with $U_2$, but must also contain $U_1$. Q.E.D.
Our entire transcendental function $U$, which alone among the solutions of (7') comes into consideration for the solution of the problem, therefore does not remain finite for large $r$ under the assumptions made. Subject to the investigation of completeness, that is, the proof that our procedure finds all linearly independent solutions of the problem, we may therefore state:
For negative $E$ which do not satisfy condition (15), our variational problem has no solution.
We now have only to investigate that discrete set of negative values of $E$ which satisfy condition (15). $\alpha_1$ and $\alpha_2$ are then both integers. Of the two paths of integration which earlier gave us the fundamental system $U_1$, $U_2$, the first must certainly be modified in order to give something non-vanishing. For $\alpha_1 - 1$ is certainly positive, so the point $c_1$ is now neither a branch point nor a pole of the integrand, but an ordinary zero. The point $c_2$ can also become regular, namely when $\alpha_2 - 1$ is also not negative. In every case, however, two suitable paths of integration can easily be given, and the integration along them can even be carried out in closed form in terms of known functions, so that the behaviour of the solutions can be completely surveyed.
For let
$$\frac{me^2}{K\sqrt{-2mE}} = l; \qquad l = 1, 2, 3, 4, \dots \tag{15'}$$
Then according to (14'')
$$\alpha_1 - 1 = l + n, \qquad \alpha_2 - 1 = -l + n. \tag{14'''}$$
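The constants defined by (14) and (14') can be evaluated numerically to illustrate both regimes: purely imaginary $c_1$, $c_2$ with real part of $\alpha_{1,2}$ equal to $n+1$ for $E > 0$, and integer $\alpha_{1,2}$ when (15') holds. The sketch below assumes units with $m = e = K = 1$ (so that (15') corresponds to $E = -1/2l^2$); the function name and default arguments are hypothetical.

```python
import cmath

def laplace_constants(E, n, m=1.0, e=1.0, K=1.0):
    """Return (c1, c2, alpha1, alpha2) from (11), (14), (14') with alpha = n."""
    delta1 = 2.0 * (n + 1)              # delta_0 = 0 by (11)
    eps0 = 2.0 * m * E / K**2
    eps1 = 2.0 * m * e**2 / K**2
    c1 = cmath.sqrt(-eps0)              # roots of z^2 + eps0 = 0
    c2 = -c1
    alpha1 = (eps1 + delta1 * c1) / (c1 - c2)
    alpha2 = -(eps1 + delta1 * c2) / (c1 - c2)
    return c1, c2, alpha1, alpha2

# E > 0: c1, c2 purely imaginary, Re(alpha) = n + 1 for both exponents.
c1_pos, c2_pos, a1_pos, a2_pos = laplace_constants(E=0.5, n=2)

# E < 0 satisfying (15') with l = 3, n = 1: alpha1 - 1 = l + n, alpha2 - 1 = -l + n.
l, n = 3, 1
c1_neg, c2_neg, a1_neg, a2_neg = laplace_constants(E=-1.0 / (2 * l**2), n=n)

print(a1_pos, a2_pos)
print(a1_neg, a2_neg)
```

For real positive $r$ one has $|r^{-\alpha}| = r^{-\mathrm{Re}\,\alpha}$, so the first case reproduces the decay like $r^{-n-1}$ stated for $E > 0$, while the second case gives the integer exponents of (14''').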
One now has to distinguish the two cases $l \le n$ and $l > n$. Let first
a) $l \le n$. Then $c_2$ and $c_1$ lose every singular character, but in return become suitable to serve as starting or end points of the path of integration for the purpose of fulfilling condition (13). A third point suitable for this is negative real infinity. Every path between two of these three points yields a solution, and of these three solutions any two are linearly independent, as one easily confirms by evaluating the integrals in closed form. In particular, the entire transcendental solution is given by the path of integration from $c_1$ to $c_2$. For one sees at once, without evaluating it, that this integral remains regular for $r = 0$. I emphasise this because the actual evaluation is rather apt to obscure this circumstance. On the other hand, the evaluation shows that the integral grows beyond all limits for positively infinite $r$. One of the other two integrals remains finite for large $r$, but it becomes infinite for $r = 0$.
We therefore obtain no solution of the problem in the case $l \le n$.
b) $l > n$. Then according to (14''') $c_1$ is a zero and $c_2$ a pole, of at least the first order, of the integrand. Two independent integrals are then furnished: one by the path that leads from $z = -\infty$ to the zero, avoiding the pole as a precaution; the other by the residue at the pole. The latter is the entire transcendental function. We shall state its evaluated value, but multiplied at once by $r^n$, whereby according to (9) and (10) we obtain the solution $\chi$ of the equation (7) originally given. (The irrelevant multiplicative constant is adjusted arbitrarily.) One finds:
$$\chi = f\!\left( r\, \frac{\sqrt{-2mE}}{K} \right); \qquad f(x) = x^n e^{-x} \sum_{k=0}^{l-n-1} \frac{(-2x)^k}{k!} \binom{l+n}{l-n-1-k} . \tag{18}$$
One recognises that this is indeed a usable solution, since it remains finite for all real non-negative $r$. Moreover, its exponential vanishing at infinity guarantees the surface condition (6). We summarise the results for negative $E$:
For negative $E$ our variational problem has solutions if and only if $E$ satisfies condition (15). The integer $n$, which gives the order of the surface harmonic appearing in the solution, may then be given only values smaller than $l$ (of which at least one is always available). The part of the solution depending on $r$ is given by (18).

(Annalen der Physik. IV. Folge. 79.)
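The radial function (18) can be checked against equation (7) by exact polynomial arithmetic. The sketch below is an illustration under stated assumptions: it uses the dimensionless variable $x = r\sqrt{-2mE}/K$, in which (7) together with (15') becomes $\chi'' + \tfrac{2}{x}\chi' + \bigl(-1 + \tfrac{2l}{x} - \tfrac{n(n+1)}{x^2}\bigr)\chi = 0$, and writes $f(x) = e^{-x} Q(x)$ with a polynomial $Q$, so that the equation reduces to $x^2 Q'' + (2x - 2x^2) Q' + \bigl(2(l-1)x - n(n+1)\bigr) Q = 0$. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial

def radial_polynomial(l, n):
    """Coefficients (lowest power first) of Q(x) = x^n * sum_k ..., f(x) = exp(-x) Q(x)."""
    coeffs = [Fraction(0)] * n  # the factor x^n shifts all powers by n
    for k in range(l - n):
        coeffs.append(Fraction((-2) ** k, factorial(k)) * comb(l + n, l - n - 1 - k))
    return coeffs

def derivative(p):
    """Derivative of a polynomial given as a coefficient list."""
    return [i * c for i, c in enumerate(p)][1:]

def ode_residual(l, n):
    """Coefficients of x^2 Q'' + (2x - 2x^2) Q' + (2(l-1)x - n(n+1)) Q."""
    q = radial_polynomial(l, n)
    q1 = derivative(q)
    q2 = derivative(q1)
    res = [Fraction(0)] * (len(q) + 2)
    for i, c in enumerate(q2):
        res[i + 2] += c                  # x^2 Q''
    for i, c in enumerate(q1):
        res[i + 1] += 2 * c              # 2x Q'
        res[i + 2] -= 2 * c              # -2x^2 Q'
    for i, c in enumerate(q):
        res[i + 1] += 2 * (l - 1) * c    # 2(l-1)x Q
        res[i] -= n * (n + 1) * c        # -n(n+1) Q
    return res

all_zero = all(
    all(c == 0 for c in ode_residual(l, n))
    for l in range(1, 7) for n in range(l)
)
print(radial_polynomial(3, 0), all_zero)
```

For $l = 3$, $n = 0$ the coefficient list corresponds to $3 - 6x + 2x^2$; a vanishing residual for every admissible pair $(n, l)$ tested confirms that (18) solves (7) when $E$ satisfies (15').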
By counting the constants in the surface harmonics ($2n + 1$, as is well known) one finds further:
The solution found contains, for an admissible combination $(n, l)$, exactly $2n + 1$ arbitrary constants; for a prescribed value of $l$, therefore, $l^2$ arbitrary constants.
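The count of $l^2$ constants follows from summing $2n + 1$ over the admissible orders $n = 0, 1, \dots, l - 1$; the short calculation below is added only to make that step explicit.

```latex
\[
\sum_{n=0}^{l-1} (2n + 1) \;=\; 2 \cdot \frac{(l-1)\,l}{2} + l \;=\; l^{2}.
\]
```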
We have thus confirmed, in their main features, the assertions made at the outset about the eigenvalue spectrum of our variational problem; nevertheless gaps still remain.
First, there is the proof of the completeness of the whole system of eigenfunctions that has been established. I shall not concern myself with this in the present note. From experience elsewhere one may presume that no eigenvalues have escaped us.
Second, it must now be recalled that the eigenfunctions established for positive $E$ do not without further ado solve the variational problem in the form in which it was originally posed, because at infinity they tend to zero only like $1/r$, so that $\partial\psi/\partial r$ on a large sphere tends to zero only like $1/r^2$. The surface integral (6) therefore remains just of the order of $\delta\psi$ at infinity. If one wishes really to obtain the continuous spectrum as well, one must add a further condition to the problem: for instance that $\delta\psi$ shall vanish at infinity, or at least that it shall tend to a constant value independent of the direction in which one goes to spatial infinity; in the latter case the surface harmonics make the surface integral vanish.
§ 2. Condition (15) gives
$$-E_l = \frac{m e^4}{2 K^2 l^2} . \tag{19}$$
Thus the well-known Bohr energy levels, which correspond to the Balmer terms, are obtained if the constant $K$, which we had to introduce in (2) for dimensional reasons, is given the value
$$K = \frac{h}{2\pi}, \tag{20}$$
for which
$$-E_l = \frac{2\pi^2 m e^4}{h^2 l^2} . \tag{19'}$$
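The relation between condition (15) and the Bohr levels can be illustrated numerically. The sketch below works in units with $m = e = K = 1$, so that energies are in multiples of $me^4/K^2$; the conversion constant `HARTREE_EV` (the value of $me^4/K^2$ in electron volts for $K = h/2\pi$) is an added assumption, not a number taken from the paper, and the function names are hypothetical.

```python
HARTREE_EV = 27.211386  # assumed conversion: m*e^4/K^2 in eV for K = h/(2*pi)

def bohr_level(l):
    """E_l from (19) in units of m*e^4/K^2."""
    return -1.0 / (2.0 * l**2)

def condition_15(E):
    """m*e^2 / (K*sqrt(-2*m*E)) in units with m = e = K = 1 (requires E < 0)."""
    return 1.0 / (-2.0 * E) ** 0.5

levels_ev = {l: bohr_level(l) * HARTREE_EV for l in range(1, 5)}
for l, energy_ev in levels_ev.items():
    print(l, round(energy_ev, 4), condition_15(bohr_level(l)))
```

Each printed row shows the principal quantum number, the level in electron volts, and the value of the left-hand side of (15), which returns the same integer $l$.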

Ihiser 1 ist die Hauptquantenzahl. 11 + 1 hat Analogie mit


~ l e rAzimutalquantenzahl, die weitere Aufspaltung dieser Zahl
bei der naheren Bestimmung der Kugelflachenfunktionen kann
mit der Aufspaltung des Aeimutalquants in ein ,,aquatorialea"
und ein ,,polares" Quant in Analogie gesetzt werden. Diese
Zablen bestimmen hier das System der Knotenlinien auf der
Kugel. Auch die ,,radiale QuantenzahlJclZ - n 1 bestimmt -
genau die Zahl der ,,Knotenkugeln", denn man kann sich leicht
iiberzeugen, da6 die Fnnktion f ( z ) in (18) genau I - n 1 -
positive reelle Wurzeln hat. - Die positiven E-Werte ent-
sprechen dem Kontinuum der hyperbolischen Bahnen, denen
inan in gewissem Sinn die radiale Quantenzahl 00 zuschreiben
kann. Dem entspricht, da8, wie wir gesehen haben, die he-
treffenden Losungsfunktionen unter lestiindiyen Oszillationen
ins Unendliche hinaus laufen.
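As a rough numerical check of (19) and (20), the following sketch evaluates $-E_l = m e^4 / (2 K^2 l^2)$ with $K = h/2\pi$. The CGS-Gaussian constants and the function name `bohr_level_ev` are assumptions of this illustration and do not appear in the original paper.

```python
import math

# CGS-Gaussian constants (rounded CODATA values; an assumption of this sketch)
M_E = 9.1093837e-28           # electron mass in g
E_CHARGE = 4.80320471e-10     # elementary charge in esu
H = 6.62607015e-27            # Planck constant in erg s
ERG_PER_EV = 1.602176634e-12  # conversion factor from erg to eV


def bohr_level_ev(l: int) -> float:
    """Return E_l = -m e^4 / (2 K^2 l^2) in eV, with K = h / (2 pi)."""
    if l < 1:
        raise ValueError("the principal quantum number l must be >= 1")
    k = H / (2.0 * math.pi)
    energy_erg = -M_E * E_CHARGE**4 / (2.0 * k**2 * l**2)
    return energy_erg / ERG_PER_EV


if __name__ == "__main__":
    for l in range(1, 5):
        print(f"l = {l}: E_l = {bohr_level_ev(l):.4f} eV")
```

With these constants the lowest level comes out close to -13.6 eV and the levels scale as $1/l^2$, which is the Balmer-term pattern the text refers to.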
It is also of interest that the region within which the functions (18) differ appreciably from zero, and within which their oscillations take place, is in any case of the general order of magnitude of the major axis of the associated ellipse. The factor, multiplied by which the radius vector appears as the argument of the constant-free function $f$, is (naturally) the reciprocal of a length, and this length is

(21) $\dfrac{K}{\sqrt{-2mE}} = \dfrac{K^2 l}{m e^2} = \dfrac{h^2 l}{4\pi^2 m e^2} = \dfrac{a_l}{l}$,

where $a_l$ is the semi-axis of the $l$-th elliptic orbit. (The equations follow from (19) together with the known relation $E_l = -\dfrac{e^2}{2 a_l}$.)
The quantity (21) gives the order of magnitude of the range of the roots for small $l$ and $n$; for then it may be assumed that the roots of $f(x)$ are of the order of unity. That is of course no longer the case when the coefficients of the polynomial are large numbers. I do not wish to go into a more exact estimate of the roots now, but believe that the above assertion will thereby be fairly accurately confirmed.
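The chain of equalities in (21) can be checked numerically. The sketch below assumes the Bohr-level formula $-E_l = m e^4/(2K^2 l^2)$ with $K = h/2\pi$ and the semi-axis $a_l = l^2 h^2/(4\pi^2 m e^2)$; the CGS constants and all function names are illustrative choices, not taken from the paper.

```python
import math

# CGS-Gaussian constants (rounded CODATA values; an assumption of this sketch)
M_E = 9.1093837e-28        # electron mass in g
E_CHARGE = 4.80320471e-10  # elementary charge in esu
H = 6.62607015e-27         # Planck constant in erg s
K = H / (2.0 * math.pi)    # the constant K fixed by (20)


def level_erg(l: int) -> float:
    """Energy level E_l = -m e^4 / (2 K^2 l^2) in erg."""
    return -M_E * E_CHARGE**4 / (2.0 * K**2 * l**2)


def length_scale_cm(l: int) -> float:
    """Length K / sqrt(-2 m E_l) that scales the radius vector in f."""
    return K / math.sqrt(-2.0 * M_E * level_erg(l))


def semi_axis_cm(l: int) -> float:
    """Semi-axis a_l = l^2 h^2 / (4 pi^2 m e^2) of the l-th Bohr ellipse."""
    return l**2 * H**2 / (4.0 * math.pi**2 * M_E * E_CHARGE**2)


if __name__ == "__main__":
    for l in range(1, 4):
        print(l, length_scale_cm(l), semi_axis_cm(l) / l)
```

For every $l$ the two printed lengths should agree to rounding error, which is the content of the last equality in (21).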
§ 3. It is of course very natural to relate the function $\psi$ to a vibration process in the atom, a process that possesses, in a higher degree than the electron orbits, the reality that is nowadays often doubted for them. I originally also intended to found the new formulation of the quantum rule in this more intuitive way, but then preferred the above neutral mathematical form because it brings out the essential point more clearly. The essential point, it seems to me, is that the mysterious "requirement of whole numbers" no longer enters the quantum rule, but has been traced, so to speak, one step further back: it has its ground in the finiteness and single-valuedness of a certain space function.
I also do not wish, as yet, to discuss further the possible ways of picturing this vibration process, before somewhat more complicated cases have been successfully calculated in the new formulation. It is not settled that its results will be a mere copy of the usual quantum theory. For example, the relativistic Kepler problem, when worked out exactly according to the prescription given at the outset, leads remarkably to half-integer partial quanta (radial and azimuthal quantum).
Nevertheless, a few remarks on the vibration picture may be permitted here. Above all, I should not like to leave unmentioned that I owe the stimulus for these considerations in the first place to the ingenious theses of M. Louis de Broglie 1) and to reflection on the spatial distribution of those "phase waves" of which he has shown that a whole number of them, measured along the orbit, always falls on each period or quasi-period of the electron. The main difference is that de Broglie thinks of progressive waves, whereas we, if we attach the vibration picture to our formulae, are led to stationary eigenvibrations. I have recently shown 1) that the Einstein gas theory can be based on the consideration of such stationary eigenvibrations, for which the dispersion law of de Broglie's phase waves is assumed. The above considerations for the atom could have been presented as a generalization of those considerations on the gas model.

1) L. de Broglie, Ann. de Physique (10) 3, p. 22, 1925 (Thèses, Paris 1924).
If one takes the individual functions (18), multiplied by a surface harmonic of order $n$, as the description of eigenvibration processes, then the quantity $E$ must have something to do with the frequency of the process in question. Now in vibration problems one is accustomed to the "parameter" (usually called $\lambda$) being proportional to the square of the frequency. But first, such an assumption would in the present case lead to imaginary frequencies precisely for the negative $E$-values; second, the quantum theorist's instinct tells him that the energy must be proportional to the frequency itself and not to its square.
The contradiction is resolved as follows. For the "parameter" $E$ of the variational equation (5) no natural zero level is fixed for the time being, especially since the unknown function $\psi$ appears multiplied, besides by $E$, by a function of $r$ which can be changed by a constant, with a corresponding change of the zero level of $E$. Consequently the "expectation of the vibration theorist" has to be corrected to the effect that not $E$ itself (what we have so far called so and shall continue to call so) but $E$ increased by a certain constant is expected to be proportional to the square of the frequency. Now let this constant be very large compared with the magnitudes of all negative $E$-values that occur [which are, after all, bounded by (15)]. Then, first, the frequencies become real, and second, our $E$-values, since they correspond to only relatively small frequency differences, actually become very approximately proportional to these frequency differences. That, in turn, is all that the "natural instinct" of the quantum theorist can demand, as long as the zero level of the energy is not fixed.

1) To appear shortly in the Physik. Zeitschr.
The view that the frequency of the vibration process is given, say, by

(22) $\nu = C'\sqrt{C + E} = C'\sqrt{C} + \dfrac{C'}{2\sqrt{C}}\,E + \ldots$

where $C$ is a constant very large compared with all $E$, has yet another very valuable advantage. It provides an understanding of Bohr's frequency condition. According to the latter, the emission frequencies are proportional to the $E$-differences, and so by (22) also to the differences of the eigenfrequencies $\nu$ of those hypothetical vibration processes. Moreover, the eigenfrequencies are all very large compared with the emission frequencies and agree closely among themselves. The emission frequencies thus appear as deep "difference tones" of the eigenvibrations themselves, which take place at a much higher frequency. That, when the energy passes over from one normal vibration to another, something appears (I mean the light wave) whose frequency is that frequency difference, is very understandable; one need only imagine that the light wave is causally connected with the beats that necessarily occur at every point of space during the transition, and that the frequency of the light is determined by the number of times per second that the intensity maximum of the beat process recurs.
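The argument around (22) can be illustrated numerically: when $C$ is very large compared with every $E$, differences of the exact eigenfrequencies $C'\sqrt{C+E}$ are very nearly proportional to differences of $E$. The sketch below uses arbitrary units, Balmer-like levels $E_l = -1/l^2$, and the values $C = 10^6$ and $C' = 1$; all of these numbers and names are assumptions chosen only for illustration.

```python
import math

C = 1.0e6      # large additive constant, arbitrary units (assumed value)
C_PRIME = 1.0  # proportionality factor C' (assumed value)


def level(l: int) -> float:
    """Balmer-like level E_l = -1 / l^2 in arbitrary units."""
    return -1.0 / l**2


def eigenfrequency(l: int) -> float:
    """Exact form of (22): nu = C' * sqrt(C + E_l)."""
    return C_PRIME * math.sqrt(C + level(l))


def beat_frequency(upper: int, lower: int) -> float:
    """Difference of two eigenfrequencies, i.e. the 'difference tone'."""
    return eigenfrequency(upper) - eigenfrequency(lower)


def linear_estimate(upper: int, lower: int) -> float:
    """First-order term of (22): C' / (2 sqrt(C)) * (E_upper - E_lower)."""
    return C_PRIME / (2.0 * math.sqrt(C)) * (level(upper) - level(lower))


if __name__ == "__main__":
    for upper, lower in ((2, 1), (3, 1), (3, 2)):
        print(upper, lower, beat_frequency(upper, lower), linear_estimate(upper, lower))
```

In this toy setting the eigenfrequencies are of order $10^3$ while the beat frequencies are of order $10^{-4}$, so the emission lines indeed appear as low difference tones of much higher, nearly equal eigenfrequencies.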
It may raise misgivings that these conclusions rest on relation (22) in its approximate form (after expansion of the square root), whereby Bohr's frequency condition itself seemingly acquires the character of an approximation formula. This is only apparent, however, and is wholly avoided when one develops the relativistic theory, through which alone a deeper understanding is gained. The large additive constant $C$ is naturally most intimately connected with the rest energy $mc^2$ of the electron. The seemingly repeated and independent appearance of the constant $h$ [which had, after all, already been introduced by (20)] in the frequency condition is also explained, or rather avoided, by the relativistic theory. Unfortunately, however, its rigorous working out still meets for the present with certain difficulties touched on above.
It is hardly necessary to emphasize how much more congenial it would be to imagine that in a quantum transition the energy passes over from one form of vibration into another, than to imagine jumping electrons. The change of the form of vibration can take place continuously in space and time; it can readily last as long as the emission process lasts according to experience (canal-ray experiments of W. Wien); and nevertheless, if during this transition the atom is exposed for a comparatively short time to an electric field that detunes the eigenfrequencies, the beat frequencies will at once be detuned along with them, and indeed only just as long as the field acts. This experimentally established fact has, as is well known, so far presented the greatest difficulties to understanding; compare, for instance, the discussion in the well-known attempt at a solution by Bohr, Kramers and Slater.
For the rest, in the joy over all these things being brought humanly closer, one must not forget that the idea that the atom, when it does not radiate, vibrates at any time in the form of one eigenvibration, that this idea, I say, if it has to be maintained, is still very far removed from the natural picture of a vibrating system. For a macroscopic system, as is well known, does not behave in this way, but in general yields a potpourri of its eigenvibrations. But one should not fix one's view on this point prematurely. Even a potpourri of eigenvibrations in the individual atom would do no harm, provided that no beat frequencies occur other than those which the atom is, according to experience, capable of emitting under suitable circumstances. Nor does the actual simultaneous emission of many of these spectral lines by the same atom contradict any experience. One could thus well imagine that only in the normal state (and approximately in certain "metastable" states) does the atom vibrate with one eigenfrequency and for just this reason does not radiate, namely because no beats occur. Excitation would consist in a simultaneous excitation of one or several other eigenfrequencies, whereby beats arise that give rise to the emission of light.
In any case, I am inclined to believe that the eigenfunctions belonging to the same frequency are in general all excited simultaneously. For multiplicity of the eigenvalues corresponds, in the language of the previous theory, to degeneracy. The arbitrary distribution of the energy among the eigenfunctions belonging to one eigenvalue probably corresponds to the reduction of the quantization of degenerate systems.

Addendum in proof, 28 February 1926.

For the case of the classical mechanics of conservative systems, the variational problem can be formulated more elegantly than was done at the outset, without explicit reference to Hamilton's partial differential equation, as follows. Let $T(q, p)$ be the kinetic energy as a function of the coordinates and momenta, $V$ the potential energy, $d\tau$ the volume element of configuration space "measured rationally", i.e. not simply the product $dq_1\, dq_2 \ldots dq_n$, but this product divided by the square root of the discriminant of the quadratic form $T(q, p)$. (Cf. Gibbs, Statistical Mechanics.) Then $\psi$ is to make the "Hamilton integral"

(23) $\displaystyle\int d\tau \left\{ K^2\, T\!\left(q, \frac{\partial\psi}{\partial q}\right) + \psi^2 V \right\}$

stationary under the normalizing auxiliary condition

(24) $\displaystyle\int \psi^2\, d\tau = 1$.

The eigenvalues of this variational problem are, as is well known, the stationary values of the integral (23) and, according to our thesis, yield the quantum levels of the energy.
On (14'') it may further be remarked that in the quantity $\alpha_2$ one essentially has before one the well-known Sommerfeld expression $-\dfrac{B}{\sqrt{A}} + \sqrt{C}$ (cf. "Atombau", 4th ed., p. 775).
Zürich, Physikalisches Institut der Universität.
(Received 27 January 1926.)

Printed by Metzger & Wittig, Leipzig.


https://ntrs.nasa.gov/search.jsp?R=19840008978 2020-04-04T14:05:22+00:00Z

NASA TECHNICAL MEMORANDUM NASA TM-77379

THE ACTUAL CONTENT OF QUANTUM THEORETICAL KINEMATICS AND MECHANICS

Translation of "Über den anschaulichen Inhalt der quantentheoretischen Kinematik und Mechanik", Zeitschrift für Physik, v. 43, no. 3-4, pp. 172-198, 1927.

NATIONAL AERONAUTICS AND SPACE ADMINISTRATION
WASHINGTON, D.C. 20546 DECEMBER 1983

First, exact definitions are supplied in this paper for the terms: position, velocity, energy, etc. (of the electron, for instance), such that they are valid also in quantum mechanics; then we shall show that canonically conjugated variables can be determined simultaneously only with a characteristic uncertainty. This uncertainty is the intrinsic reason for the occurrence of statistical relations in quantum mechanics. Their mathematical formulation is made possible by the Dirac-Jordan theory. Beginning from the basic principles thus obtained, we shall show how macroscopic processes can be understood from the viewpoint of quantum mechanics. Several imaginary experiments are discussed to elucidate the theory.

THE ACTUAL CONTENT OF QUANTUM THEORETICAL KINEMATICS AND MECHANICS [/172]*

By W. Heisenberg, Institute for Theoretical Physics of the University, Copenhagen, Denmark

SUMMARY. First, exact definitions are supplied in this paper for the terms: position, velocity, energy, etc. (of the electron, for instance), such that they are valid also in quantum mechanics; then we shall show that canonically conjugated variables can be determined simultaneously only with a characteristic uncertainty (§1). This uncertainty is the intrinsic reason for the occurrence of statistical relations in quantum mechanics. Their mathematical formulation is made possible by the Dirac-Jordan theory (§2). Beginning from the basic principles thus obtained, we shall show how macroscopic processes can be understood from the viewpoint of quantum mechanics (§3). Several imaginary experiments are discussed to elucidate the theory (§4).

We believe we understand a theory intuitively if in all simple cases we can qualitatively imagine the theory's experimental consequences and if we have simultaneously realized that the application of the theory excludes internal contradictions. For instance: we believe we understand Einstein's concept of a finite three-dimensional space intuitively, because we can imagine the experimental consequences of this concept without contradictions. Of course, these consequences contradict our customary intuitive space-time beliefs. But we can convince ourselves that the possibility of applying this customary view of space and time cannot be deduced either from our laws of thinking or from experience. The intuitive interpretation of quantum mechanics is still full of internal contradictions, which become apparent in the battle of opinions on the theory of continuums and discontinuums, corpuscles and waves. This alone tempts us to believe that an interpretation of quantum mechanics is not going to be possible in the customary terms of kinematic and mechanical concepts. Quantum theory, after all, derives from the attempt to break with those customary concepts of kinematics and replace them with relations between concrete, experimentally derived values. Since this appears to have succeeded, the mathematical structure of quantum mechanics, on the other hand, will not require revision. By the same token, a revision of the space-time geometry for small spaces and times will also not be necessary, since by a choice of arbitrarily heavy masses the laws of quantum mechanics can be made to approach the classical laws as closely as desired, no matter how small the spaces and times. [/173] The fact that a revision of the kinematic and mechanical concepts is required seems to follow immediately from the basic equations of quantum mechanics. Given a mass $m$, it is readily understandable, in our customary understanding, to speak of the position and of the velocity of the center of gravity of that mass $m$. But in quantum mechanics, a relation $pq - qp = \frac{h}{2\pi i}$ exists between mass, position and velocity. We thus have good reasons to suspect the uncritical application of the terms "position" and "velocity". If we admit that for very small spaces and times discontinuities are somehow typical, then the failure of precisely the concepts of "position" and "velocity" becomes immediately plausible: if, for instance, we imagine the one-dimensional motion of a mass point, then in a continuum theory it will be possible to trace the trajectory curve $x(t)$ for the particle (or rather, for its center of mass) (see Fig. 1), with the tangent to the curve indicating the velocity in each case. In a discontinuum theory, in contrast, instead of the curve we shall have a series of points at finite distances (see Fig. 2). In this case it is obviously pointless to talk of the velocity at a certain position, since the velocity can be defined only by means of two positions and consequently, conversely, two different velocities correspond to each point.

[Fig. 1: continuous trajectory curve x(t). Fig. 2: series of points at finite distances.]

* Numbers in the margin indicate foreign pagination.

The question thus arises whether it might not be possible, by means of a more precise analysis of those kinematic and mechanical concepts, to clear up the contradictions currently existing in an intuitive interpretation of quantum mechanics, and thus to achieve an intuitive understanding of the relations of quantum mechanics.*

§ 1. The concepts: position, path, velocity, energy [/174]

In order to be able to follow the quantum-mechanical behavior of any object, it is necessary to know the object's mass and the interactive forces with any fields or other objects. Only then is it possible to set up the Hamiltonian function for the quantum-mechanical system. [The considerations below shall in general refer to non-relativistic quantum mechanics, since the laws of quantum-theory electrodynamics are not completely known yet.*] No further statements regarding the object's "gestalt" are necessary: the totality of those interactive forces is best designated by the term "gestalt".

* This paper was written as a consequence of the efforts and wishes expressed clearly by other scientists, much earlier, before quantum mechanics was developed. I particularly remember Bohr's papers on the basic tenets of quantum theory (for instance, Z. f. Physik 13, 117 (1923)) and Einstein's discussions on the relation between wave fields and light quanta. In more recent times, the problems here mentioned were discussed most clearly by W. Pauli, who also answered some of the questions that arise ("Quantentheorie", Handbuch d. Phys. ["Quantum theory", Handbook of Physics] Vol. XXIII, subsequently cited as l.c.). Quantum mechanics has changed little in the formulation Pauli gave to these problems. It is also a special pleasure for me here to thank Mr. W. Pauli for the stimulation I derived from our oral and written discussions, which have substantially contributed to this paper.

If we want to clearly understand what is meant by the words "position of the object" - for instance, an electron - (relative to a given reference system), then we must indicate the definite experiments by means of which we intend to determine the "position of the electron". Otherwise the words are meaningless. In principle, there is no shortage of experiments that permit a determination of the "position of the electron", even to any desired precision. For instance: illuminate the electron and look at it under the microscope. The highest precision attainable here in the determination of the position is substantially determined by the wavelength of the light used. But in principle one could build, say, a γ-ray microscope and by means of it determine the position as precisely as desired. In this determination, however, a secondary circumstance becomes essential: the Compton effect. Any observation of the scattered light coming from the electron (into the eye, onto a photographic plate, into a photocell) presupposes a photoelectric effect, that is, it can also be interpreted as a light quantum striking the electron, being reflected or diffracted there, and then - deflected once again by the microscope's lens - finally triggering the photoelectric effect. [/175] At the instant of the determination of its position - i.e., the instant at which the light quantum is diffracted by the electron - the electron discontinuously changes its momentum. That change will be more pronounced, the smaller the wavelength of the light used, i.e., the more precise the position determination is to be. At the instant at which the electron's position is known, therefore, its momentum can become known only to the order of magnitude corresponding to that discontinuous change. That is, the more precisely the position is determined, the more imprecisely will the momentum be known, and vice versa. This provides us with a direct, intuitive clarification of the relation $pq - qp = \frac{h}{2\pi i}$. Let $q_1$ be the precision to which the value of $q$ is known ($q_1$ is approximately the average error of $q$), or here, the wavelength of the light; $p_1$ is the precision to which the value of $p$ can be determined, or in this case, the discontinuous change in $p$ during the Compton effect. According to the basic equations of the Compton effect, the relation between $p_1$ and $q_1$ is then

$$p_1 q_1 \sim h. \qquad (1)$$

That relation (1) stands in a direct mathematical connection with the commutation relation $pq - qp = \frac{h}{2\pi i}$ will be shown below. Here we shall point out that equation (1) is the precise expression for the fact that we once sought to describe by dividing the phase space into cells of size $h$.

* However, significant progress was made very recently through the work of P. Dirac [Proc. Roy. Soc. (A) 114, 243 (1927)] and subsequent studies.
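The trade-off behind relation (1) can be made concrete with an order-of-magnitude sketch. It assumes, as the microscope argument suggests, that the position error is about one wavelength and that the Compton recoil is about the photon momentum $h/\lambda$; both identifications, the sample wavelengths, and the function names are simplifying assumptions of this illustration.

```python
H = 6.62607015e-27  # Planck constant in erg s (CGS units assumed)


def position_uncertainty_cm(wavelength_cm: float) -> float:
    """q1: resolving power of the microscope, taken as about one wavelength."""
    return wavelength_cm


def momentum_kick(wavelength_cm: float) -> float:
    """p1: Compton recoil, taken as about the photon momentum h / lambda (g cm/s)."""
    return H / wavelength_cm


if __name__ == "__main__":
    # Illustrative wavelengths: visible light, X-rays, gamma rays
    for lam in (5e-5, 1e-8, 1e-10):
        q1 = position_uncertainty_cm(lam)
        p1 = momentum_kick(lam)
        print(f"lambda = {lam:.0e} cm: q1 = {q1:.1e} cm, p1 = {p1:.2e} g cm/s")
```

Shortening the wavelength improves $q_1$ and worsens $p_1$ by the same factor, so the product stays of order $h$ whatever light is used.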

Other experiments can also be performed to determine the electron's position, such as collision experiments. A very precise determination of the position requires collisions with very fast particles, since for slow electrons the diffraction phenomena - which according to Einstein are a consequence of the de Broglie waves (see for instance the Ramsauer effect) - preclude a precise determination of the position. Thus, once again, in a precise position measurement the electron's momentum changes discontinuously, and a simple estimate of the precision with the equations of the de Broglie waves once again leads to equation (1).

This discussion seems to define the concept "position of the electron" clearly enough and we only need to add a word about the "size" of the electron. If two very fast particles strike the electron sequentially in the very brief time interval $\Delta t$, then the two positions of the electron defined by these two particles lie very close together, separated by a distance $\Delta l$. From the laws observed for α-particles we conclude that $\Delta l$ can be reduced to a magnitude of the order of $10^{-12}$ cm, provided $\Delta t$ is sufficiently small and the particles selected are sufficiently fast. [/176] That is the meaning when we say that the electron is a particle whose radius is not greater than $10^{-12}$ cm.

Let us move on to the concept of the "path of the electron." By path or trajectory we mean a series of points in space (in a given reference system) that the electron adopts as successive "positions." Since we already know what "position at a certain time" means, there are no new difficulties here. It is still readily understood that the often used expression, for instance, "the 1S orbit of the electron in the hydrogen atom" makes no sense from our point of view. Because in order to measure this 1S orbit, we would have to illuminate the atom with light whose wavelength is considerably shorter than $10^{-8}$ cm. But one light quantum of this kind of light would be sufficient to throw the electron completely out of its "orbit" (for which reason never more than a single point of this "path" could be defined in space) and hence the word "path" is not very sensible or meaningful here. This can be easily derived from the experimental possibilities, even without any knowledge of the new theories.
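The claim that a single such light quantum ejects the electron can be checked with a short estimate. The sketch compares the photon energy $hc/\lambda$ with the hydrogen ground-state binding energy, taken here as roughly 13.6 eV; the constants and the function name are assumptions of this illustration.

```python
H_C_EV_CM = 1.239841984e-4   # h*c in eV cm (rounded CODATA value)
BINDING_ENERGY_1S_EV = 13.6  # approximate binding energy of the hydrogen 1S state


def photon_energy_ev(wavelength_cm: float) -> float:
    """Energy of one light quantum, E = h c / lambda, in eV."""
    return H_C_EV_CM / wavelength_cm


if __name__ == "__main__":
    for lam in (1e-8, 1e-9):
        ratio = photon_energy_ev(lam) / BINDING_ENERGY_1S_EV
        print(f"lambda = {lam:.0e} cm: E = {photon_energy_ev(lam):.0f} eV, "
              f"about {ratio:.0f} times the binding energy")
```

Already at $10^{-8}$ cm the quantum carries several hundred times the binding energy, which is why no second point of the "orbit" could be measured on the same atom.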

In contrast, the imaginary position measurements can be performed for many atoms in a 1S state. (Atoms in a given "stationary" state, for instance, can in principle be isolated by the Stern-Gerlach experiment.) Thus, for a given state of an atom, for instance 1S, a probability function must exist for the electron's positions, such that it corresponds, on the average, to the classical trajectory over all phases, and that can be established by measurements to any desired precision. According to Born* this function is given by $\psi_{1S}(q)\,\bar\psi_{1S}(q)$, if $\psi_{1S}(q)$ is the Schroedinger wave function corresponding to the state 1S. I want to join Dirac* and Jordan*, in view of subsequent generalizations, in saying: the probability is given [/177] by $S(1S,q)\,\bar S(1S,q)$, where $S(1S,q)$ is that column of the transformation matrix $S(E,q)$ from $E$ to $q$ which corresponds to $E = E_{1S}$ ($E$ = energy).
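To see what such a probability function looks like in practice, the following sketch evaluates $\psi_{1S}\bar\psi_{1S}$ for hydrogen. It assumes the standard textbook form $\psi_{1S}(r) = e^{-r/a}/\sqrt{\pi a^3}$ with lengths measured in units of the Bohr radius $a$; that explicit form, the trapezoidal integration, and all function names are additions for illustration and are not given in the paper.

```python
import math


def psi_1s(r: float, a: float = 1.0) -> float:
    """Hydrogen 1S wave function exp(-r/a) / sqrt(pi a^3) (real-valued)."""
    return math.exp(-r / a) / math.sqrt(math.pi * a**3)


def radial_density(r: float, a: float = 1.0) -> float:
    """Probability per unit radius: 4 pi r^2 * psi * conj(psi)."""
    return 4.0 * math.pi * r**2 * psi_1s(r, a) ** 2


def total_probability(r_max: float = 40.0, steps: int = 20000) -> float:
    """Trapezoidal integral of the radial density from 0 to r_max."""
    h = r_max / steps
    total = 0.5 * (radial_density(0.0) + radial_density(r_max))
    total += sum(radial_density(i * h) for i in range(1, steps))
    return total * h


def most_probable_radius(r_max: float = 10.0, steps: int = 10000) -> float:
    """Grid search for the radius at which the radial density peaks."""
    h = r_max / steps
    return max((i * h for i in range(steps + 1)), key=radial_density)


if __name__ == "__main__":
    print("total probability:", total_probability())
    print("most probable radius (units of a):", most_probable_radius())
```

The density integrates to one and peaks near $r = a$, so repeated position measurements on many 1S atoms would cluster around the radius of the classical Bohr orbit without tracing any orbit.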

In the fact that in quantum theory, for a given state - for instance 1S - only the probability function for the electron position can be given, we may see, as do Born and Jordan, a characteristic statistical feature of quantum theory, quite in contrast to the classical theory. On the other hand, if we want to, we can say with Dirac that the statistics came in via our experiments. Because in classical theory, too, only the probability of a certain electron position could be given, if and as long as we do not know the atom's phases. Rather, the difference between classical and quantum mechanics consists in this: classically, we can always assume the phases to have been determined in a previous experiment. But in reality this is impossible, because every experiment to determine the phase would either destroy or modify the atom. In a definite stationary "state" of the atom, the phases are undetermined in principle, which we may consider a direct clarification of the known equations

$$Et - tE = \frac{h}{2\pi i} \quad \text{or} \quad Jw - wJ = \frac{h}{2\pi i}$$

($J$: action variable, $w$: angle variable).

* The statistical meaning of the de Broglie waves was first formulated by A. Einstein [Sitzungsber. d. preuss. Akad. d. Wiss. 1925, p. 3]. This statistical element then plays a significant role for M. Born, W. Heisenberg and P. Jordan, "Quantum mechanics II" [Z. f. Phys. 35, 557 (1926)], especially chapter 4, §3, and P. Jordan [Z. f. Phys. 37, 376 (1926)]; it is analyzed mathematically in a fundamental paper by M. Born [Z. f. Phys. 38, 803 (1926)] and used for the interpretation of the collision phenomena. The foundation for using the probability theorem from the transformation theory for matrices can be found in: W. Heisenberg [Z. f. Phys. 40, 501 (1926)], P. Jordan [ibid. 40, 661 (1926)], W. Pauli [note in Z. f. Phys. 41, 81 (1927)], P. Dirac [Proc. Roy. Soc. (A) 113, 621 (1926)], P. Jordan [Z. f. Phys. 40, 809 (1926)]. The statistical side of quantum mechanics in general is discussed by P. Jordan [Naturwiss. 15, 105 (1927)] and M. Born [Naturwiss. 15, 238 (1927)].


The word "velocity" of an object is easily defined by measurement, if it is a force-free motion. For instance, the object can be illuminated with red light and then the particle's velocity can be determined by the Doppler effect of the scattered light. The determination of the velocity will be the more precise, the longer the wavelength of the light used, since then the particle's velocity change per light quantum due to the Compton effect will be the smaller. The position determination becomes correspondingly uncertain, as required by equation (1). If the velocity of the electron in an atom is to be measured at a certain instant, we should have to make the nuclear charge and the forces due to the other electrons disappear at that instant, so that the motion may proceed force-free after that instant, and then perform the determination described above. As was the case earlier, we once again can convince ourselves that a function $p(t)$ for a certain state of the atom - say, 1S - cannot be defined. In contrast, there again will be a function for [/178] the probability of $p$ in this state, which according to Dirac and Jordan will have the value $S(1S,p)\,\bar S(1S,p)$. Again, $S(1S,p)$ means the column of the transformation matrix $S(E,p)$ of $E$ into $p$ that corresponds to $E = E_{1S}$.

Finally, let us point out the experiments that allow the meas-
urement of the energy or the value of the action variables J.
Such experiment_ are particularly important since only with
their aid will we be able to define what we mean, when we talk
about the discontinuous change of the energy or or J. The

8
Franck-Hertz collision experiments permit the tracing back of
the energy measurements on atoms to the energy measurements of
electrons moving in a straight line, because of the validity
of the energy theorem in the quantum theory. In principle,
this measurement can be made as precise as desired, if only
we forego the simultaneous determination of the electron
position, i.e., of the phase (see above, the determination of p),
corresponding to the relation $Et - tE = h/(2\pi i)$. The
Stern-Gerlach experiment permits the determination of the magnetic
or an average electric moment of the atom, i.e., the measurement
of magnitudes that depend only on the action variables J. The
phases remain undetermined in principle. If it is not sensible
to talk of the frequency of a light wave at a given instant, it
is not possible either to speak of the energy of an atom at a
particular instant. In the Stern-Gerlach experiment this
corresponds to the situation that the precision of the energy
measurement will be the smaller, the shorter the time interval
during which the atom is under the influence of the deflecting
force*. For an upper limit for the deflecting force is
given by the fact that the potential energy of that deflecting
force inside the beam of rays can vary only by quantities that
are considerably smaller than the energy differences of the
stationary states, if a determination of the stationary states'
energy is to be possible. If $E_1$ is the quantity of energy that
satisfies that condition ($E_1$ at the same time is a measure of
the precision of that energy measurement), then $E_1/d$ is the
maximum value for the deflecting force, if d is the width of
the ray beam (measurable by means of the width of the slit
used). The angular deflection of the atom beam is then
$E_1 t_1/(d\,p)$, where $t_1$ is the period of time during which
the atoms are under the effect of the deflecting force, and p the
impulse of the atoms in the direction of the beam. This deflection
must be at least of the same order of magnitude as the natural
beam broadening caused by diffraction in the slit, in order for a
measurement to be possible. The angular deflection due to
diffraction is approximately $\lambda/d$, where $\lambda$ is the
de Broglie wavelength, i.e.,

$$\frac{\lambda}{d} \sim \frac{E_1 t_1}{d\,p}, \quad \text{or, since } \lambda = \frac{h}{p}, \quad E_1 t_1 \sim h. \tag{2}$$

This equation corresponds to equation (1) and it shows that a
precise energy determination can be attained only through a
corresponding uncertainty in the time.

* Cf. also W. Pauli, l.c., p. 61.
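
As a rough numerical illustration of relation (2), the following Python sketch evaluates the energy precision compatible with a given interaction time. The function names and the idea of testing a pair (E1, t1) against h are assumptions made for illustration only; relation (2) is an order-of-magnitude statement, so the numbers are estimates rather than sharp bounds.

```python
H = 6.62607015e-34  # Planck constant in J*s (SI defining value)


def min_energy_precision(t1: float) -> float:
    """Order-of-magnitude energy precision E1 ~ h / t1 from relation (2)."""
    if t1 <= 0:
        raise ValueError("interaction time must be positive")
    return H / t1


def deflection_resolvable(e1: float, t1: float) -> bool:
    """Compare the deflection E1*t1/(d*p) with the diffraction angle h/(d*p).

    The slit width d and the impulse p cancel, leaving E1*t1 >= h.
    """
    return e1 * t1 >= H


# Hypothetical example: an atom spends one microsecond in the deflecting field.
print(min_energy_precision(1e-6))
```

Halving the time spent in the field doubles the estimated energy spread, which is the content of the sentence preceding relation (2).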

§ 2 The Dirac-Jordan theory

We would like to summarize the results of the previous section
and generalize them in this statement: All concepts used in
classical theory to describe a mechanical system can also be
defined exactly for atomic processes, in analogy to the classical
concepts. But, purely as a matter of experience, the experiments
that serve for such definitions carry an inherent uncertainty, if
we expect from them the simultaneous determination of two
canonically conjugated variables. The degree of this uncertainty
is given by equation (1), widened to include any canonically
conjugated variables. It is reasonable here to compare the quantum
theory with the special theory of relativity. According to the
theory of relativity, the term "simultaneous" can only be defined
by experiments in which the propagation velocity of light
plays an essential role. If there were a "sharper" definition
of simultaneity - for instance, signals that propagate infinitely
rapidly - then the theory of relativity would be impossible.
But since such signals do not exist - because the velocity
of light already appears in the definition of simultaneity -
room is available for the postulate of a constant velocity
of light, and therefore this postulate is not contradicted
by the appropriate use of the terms "position, velocity, time".
The situation is similar in regard to the definition of the
concepts "electron position and velocity" in quantum theory.
All the experiments we could use to define these terms necessarily
contain the uncertainty expressed by equation (1), even
though they permit an exact definition of the individual concepts
p and q. If experiments existed that allowed a "more
precise" definition of p and q than that corresponding to
equation (1), then the quantum theory would be impossible. This
uncertainty - which is fixed by equation (1) - now provides the
space for the relations that find their terse expression in
the commutation relations of quantum mechanics,

$$pq - qp = \frac{h}{2\pi i}.$$

This equation becomes possible without having to change the
physical meaning of the variables p and q.


For those physical phenomena for which a quantum theory
formulation is still unknown (for instance, electrodynamics),
equation (1) represents a demand that may be helpful in finding
the new laws. For quantum mechanics, equation (1) can be derived
from the Dirac-Jordan formulation, by means of a minor
generalization. If for a certain value η of an arbitrary parameter
we can determine the position q of the electron at q' with a
precision $q_1$, then we can express this fact by means of a
probability amplitude S(η,q) that will be noticeably different
from zero only in an area of approximate dimension $q_1$ around
q'. We can thus say, more specifically,

$$S(\eta,q) \propto \exp\left[-\frac{(q-q')^2}{2q_1^2} - \frac{2\pi i}{h}\,p'(q-q')\right], \quad \text{i.e.,} \quad S\bar{S} \propto \exp\left[-\frac{(q-q')^2}{q_1^2}\right]. \tag{3}$$

We thus have for the probability amplitude corresponding to p:

$$S(\eta,p) = \int S(\eta,q)\,S(q,p)\,dq. \tag{4}$$

In agreement with Jordan, we can say for S(q,p) that

$$S(q,p) = e^{2\pi i\,pq/h}. \tag{5}$$

In that case, according to (4), S(η,p) will be noticeably
different from zero only for values of p for which
$2\pi(p-p')q_1/h$ is not substantially larger than 1. More
especially, in the case of (3) we shall have:

$$S(\eta,p) \propto \int \exp\left[\frac{2\pi i\,(p-p')\,q}{h} - \frac{(q-q')^2}{2q_1^2}\right] dq,$$

i.e.,

$$S(\eta,p) \propto \exp\left[-\frac{(p-p')^2}{2p_1^2} + \frac{2\pi i}{h}\,q'(p-p')\right], \quad \text{that is} \quad S\bar{S} \propto \exp\left[-\frac{(p-p')^2}{p_1^2}\right],$$

where

$$p_1 q_1 = \frac{h}{2\pi}. \tag{6}$$

Thus, assumption (3) for S(η,q) corresponds to the experimental
fact that the value p' of p and the value q' of q were measured
[with the precision restriction (6)].
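
The step from the Gaussian amplitude (3) to the width relation (6) can be checked numerically. The sketch below is only an illustration under stated assumptions: Planck's constant is set to 1 in arbitrary units, the integral (4) is approximated by a plain Riemann sum, and the helper names (`packet_q`, `amplitude_p`, `momentum_width`) as well as the sample values of q', p' and q1 are hypothetical.

```python
import cmath
import math

H = 1.0  # Planck constant in arbitrary units (assumption: only the product p1*q1 matters)


def packet_q(q, q_prime, p_prime, q1):
    """Amplitude (3): Gaussian of width q1 around q' carrying mean impulse p'."""
    return cmath.exp(-(q - q_prime) ** 2 / (2 * q1 ** 2)
                     - 2j * math.pi * p_prime * (q - q_prime) / H)


def amplitude_p(p, q_prime, p_prime, q1, points=4001, span=10.0):
    """Riemann-sum approximation of (4) with S(q,p) = exp(2*pi*i*p*q/h) from (5)."""
    dq = 2 * span * q1 / (points - 1)
    total = 0j
    for k in range(points):
        q = q_prime - span * q1 + k * dq
        total += packet_q(q, q_prime, p_prime, q1) * cmath.exp(2j * math.pi * p * q / H) * dq
    return total


def momentum_width(q_prime, p_prime, q1):
    """Estimate p1 from |S(p'+delta)|^2 / |S(p')|^2 = exp(-delta^2 / p1^2)."""
    delta = 0.1 * H / (2 * math.pi * q1)
    ratio = (abs(amplitude_p(p_prime + delta, q_prime, p_prime, q1)) ** 2
             / abs(amplitude_p(p_prime, q_prime, p_prime, q1)) ** 2)
    return delta / math.sqrt(-math.log(ratio))


q1 = 0.5
p1 = momentum_width(q_prime=0.3, p_prime=2.0, q1=q1)
print(p1 * q1, H / (2 * math.pi))  # the two numbers should agree closely
```

Changing `q1` in the example changes `p1` in the opposite direction while the product stays fixed, as relation (6) states.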

The purely mathematical characteristic of the Dirac-Jordan
formulation of quantum mechanics is that the relations between
p, q, E, etc., can be written as equations between very general
matrices, such that any given quantum theory variable
appears as a diagonal matrix. The feasibility of such
a notation seems reasonable if we visualize the matrices as
tensors (for instance, moments of inertia) in multidimensional
spaces, among which mathematical relations exist. The axes of
the coordinate system in which these mathematical relations
are expressed can always be placed along the main axes of one
of these tensors. It is after all always possible to characterize
the mathematical relation between two tensors A and B by
means of transformation formulae that will convert a system of
coordinates oriented along the main axes of A, into one oriented
along the main axes of B. The latter formulation corresponds
to Schroedinger's theory. In contrast, Dirac's notation
of the q-numbers must be considered the truly "invariant"
formulation of quantum mechanics, independent of all coordinate
systems. If we want to derive physical results from
that mathematical model, then we must assign numerical values
to the quantum mechanics variables, i.e., the matrices (or
"tensors" in multidimensional space). This is to be understood
as meaning that in that multidimensional space a certain
direction is arbitrarily chosen (that is, established by the
kind of experiment performed), and then the "value" of the
matrix is asked for (for instance, the value of the moment of
inertia, in that picture), in the direction chosen. This question
has unequivocal meaning only if the direction chosen coincides
with one of the matrix' main axes: in that case there
will be an exact answer to the question. If the direction
chosen deviates but little from one of the matrix' main
directions, we can still talk, with a certain imprecision given by
the relative inclination, that is, with a certain probable error,
of the "value" of the matrix in the direction chosen. We can thus
state: it is possible to assign a number to every quantum
theory variable, or matrix, which provides its "value", with a
certain probable error. The probable error depends on the system
of coordinates. For each quantum mechanics variable there
exists one system of coordinates for which the probable error
vanishes, for that variable. Thus, a given experiment can
never provide precise information on all quantum mechanics
variables: rather, it divides the physical variables into
"known" and "unknown" (or: more or less precisely known)
variables, in a manner characteristic for that experiment. The
results of two experiments can be derived precisely from each
other only when the two experiments divide the physical variables
in the same manner into "known" and "unknown" (i.e., if
the tensors in that multidimensional space already used for
visualization are "viewed" from the same direction, in both
experiments). If two experiments cause two different distributions
into "known" and "unknown" variables, then the relation
of the results of those experiments can be given appropriately
only statistically.
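
The tensor picture can be made concrete with a small Python sketch. It is an illustration in modern terms, not the text's own formalism: the "value" in a direction v is taken to be the quadratic form v.A.v, and the "probable error" is read as the square root of v.A^2.v - (v.A.v)^2. Both readings, the function name `value_and_spread` and the sample matrix are assumptions.

```python
import math


def value_and_spread(matrix, direction):
    """Return (v.A.v, sqrt(v.A^2.v - (v.A.v)^2)) for a real symmetric 2x2 matrix A."""
    norm = math.hypot(direction[0], direction[1])
    v = (direction[0] / norm, direction[1] / norm)
    av = (matrix[0][0] * v[0] + matrix[0][1] * v[1],
          matrix[1][0] * v[0] + matrix[1][1] * v[1])
    mean = v[0] * av[0] + v[1] * av[1]
    mean_square = av[0] ** 2 + av[1] ** 2  # equals v.A^2.v because A is symmetric
    return mean, math.sqrt(max(mean_square - mean ** 2, 0.0))


A = [[2.0, 0.0], [0.0, 5.0]]            # main axes lie along the coordinate axes
print(value_and_spread(A, (1.0, 0.0)))  # along a main axis: the spread vanishes
print(value_and_spread(A, (1.0, 1.0)))  # tilted direction: a nonzero spread appears
```

Along a main axis the question "what is the value?" has an exact answer; for a tilted direction the answer carries a spread that grows with the inclination.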


Let us perform an imaginary experiment, to discuss these
statistical relations more precisely. We shall start by sending a
Stern-Gerlach beam of atoms through a field F1 that is so
inhomogeneous in the beam direction, that it causes noticeably
numerous transitions due to a "shaking effect". The atom beam
is then allowed to run unimpeded, but then a second field shall
begin, F2, as inhomogeneous as F1. We shall assume that it is
possible to measure the number of atoms in the different
stationary states, between F1 and F2 and also beyond F2, by means
of a magnetic field applied as required. Let us assume the
atoms' radiative forces to be zero. If we know that an atom was
in the energy state $E_n$ before passing through F1, then we can
express this experimental fact by assigning a wave function to
the atom - for instance, in p-space - with a certain energy $E_n$
and the undetermined phase $\beta_n$:

$$S(E_n,p) = \psi(E_n,p)\,e^{-2\pi i E_n(\alpha+\beta_n)/h}.$$

After passing through field F1, the function will have become*

$$S(E_n,p) \xrightarrow{F_1} \sum_m c_{nm}\,\psi(E_m,p)\,e^{-2\pi i E_m(\alpha+\beta_m)/h}. \tag{7}$$

* See P. Dirac, Proc. Roy. Soc. (A) 112, 661 (1926) and M. Born,
Z. f. Phys. 40, 167 (1926).

Let us assume that here the $\beta_m$ are arbitrarily fixed, such
that the $c_{nm}$ are unequivocally determined by F1. The matrix
$c_{nm}$ transforms the energy values before passing through F1 to
those after passing through F1. If behind F1 we perform a
determination of the stationary states - for instance, by means
of an inhomogeneous magnetic field - then we shall find, with
a probability of $c_{nm}\bar{c}_{nm}$, that the atom has passed
from the state n to the state m. If we determine experimentally
that the atom has actually acquired the state m, then in the
subsequent calculations we shall have to assign it the function
$S_m$ with an indeterminate phase, instead of the function
$\sum_m c_{nm} S_m$. Through the experimental determination
"state m" we select, from among the different possibilities
($c_{nm}$), a certain m and simultaneously destroy, as we shall
explain below, whatever remained of phase relations in the
variables $c_{nm}$. When the beam passes through F2, we repeat the
same procedure used for F1. Let $d_{nm}$ be the coefficients of the
transformation matrix that converts the energies before F2 to those
after F2. If no determination of the state is performed between
F1 and F2, then the eigenfunction is transformed according
to the following pattern:

$$S(E_n,p) \xrightarrow{F_1} \sum_m c_{nm}\,S(E_m,p) \xrightarrow{F_2} \sum_m \sum_l c_{nm}\,d_{ml}\,S(E_l,p). \tag{8}$$

Let $\sum_m c_{nm} d_{ml} = e_{nl}$. If the stationary state of the
atom is determined, after F2, we shall find the state l with a
probability of $e_{nl}\bar{e}_{nl}$. If, in contrast, we determined
"state m" between F1 and F2, then the probability for l behind F2
is given by $d_{ml}\bar{d}_{ml}$. Repeating the entire experiment
several times (determining the state, each time, between F1 and
F2) we shall then observe the state l, behind F2, with the relative
frequency $Z_{nl} = \sum_m c_{nm}\bar{c}_{nm}\,d_{ml}\bar{d}_{ml}$.
This expression does not agree with $e_{nl}\bar{e}_{nl}$. For this
reason Jordan (l.c.) mentions an "interference of the
probabilities". I, for one, would not agree with this, because the
two experiments leading to $e_{nl}\bar{e}_{nl}$ or $Z_{nl}$,
respectively, are really physically different. In one case the atom
suffers no disturbance between F1 and F2; in the other it is
disturbed by the equipment that makes the determination of the
stationary states possible. The consequence of this equipment
is that the "phase" of the atom changes by quantities that are
uncontrollable in principle, just as the impulse was changed
in the determination of the electron's position (cf. § 1). The
magnetic field for the determination of the state between F1
and F2 will change the eigenvalues E, and during the observation
of the atom beam (I am thinking of something like a Wilson
track) the atoms will be slowed down in different degrees,
statistically, and in an uncontrollable manner. As a consequence,
the final transformation matrix $e_{nl}$ (from the energy
values before F1 to those after leaving F2) is no longer given
by $\sum_m c_{nm} d_{ml}$, and instead each term of the sum will
have, in addition, an unknown phase factor. Hence, all we can
expect is for the average value of $e_{nl}\bar{e}_{nl}$, over all
eventual phase changes, to be equal to $Z_{nl}$. A simple
calculation shows this to be the case.
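
The "simple calculation" can be illustrated numerically. The following Python sketch is a minimal check under stated assumptions: the coefficients c_nm and d_ml are arbitrary sample numbers, the unknown phase factor of each term is written as exp(i*phi_m), and the average over phases is taken on a uniform grid, which makes the cross terms cancel exactly. The function names are hypothetical.

```python
import cmath
import itertools
import math


def interference_probability(c_row, d_col):
    """|e_nl|^2 with e_nl = sum_m c_nm * d_ml (no determination between F1 and F2)."""
    return abs(sum(c * d for c, d in zip(c_row, d_col))) ** 2


def measured_probability(c_row, d_col):
    """Z_nl = sum_m |c_nm|^2 * |d_ml|^2 (state determined between F1 and F2)."""
    return sum(abs(c) ** 2 * abs(d) ** 2 for c, d in zip(c_row, d_col))


def phase_averaged_probability(c_row, d_col, steps=4):
    """Average |sum_m c_nm d_ml exp(i*phi_m)|^2 over a uniform grid of phases phi_m."""
    grid = [2 * math.pi * k / steps for k in range(steps)]
    total = 0.0
    count = 0
    for phases in itertools.product(grid, repeat=len(c_row)):
        amp = sum(c * d * cmath.exp(1j * phi) for c, d, phi in zip(c_row, d_col, phases))
        total += abs(amp) ** 2
        count += 1
    return total / count


c_row = [0.6, 0.8]  # sample coefficients c_nm for fixed n (illustrative values)
d_col = [0.8, 0.6]  # sample coefficients d_ml for fixed l (illustrative values)
print(interference_probability(c_row, d_col))
print(measured_probability(c_row, d_col))
print(phase_averaged_probability(c_row, d_col))
```

With undisturbed phases the first number differs from the second; once every term carries an uncontrolled phase, the average reproduces the second number.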

Thus, following certain statistical rules, we can draw
conclusions, based on one experiment, regarding the results
possible for another. The other experiment selects, by itself and
from among all the possibilities, one particular one, thus limiting
the possibilities for all subsequent experiments. This
interpretation of the equation for the transformation matrix S, or
Schroedinger's wave equation, is possible only because the sum
of solutions is again a solution. Here we can see the deeper
meaning of the linearity of Schroedinger's equations; for this
reason they can be understood only as equations for waves in the
phase space, and for this same reason we would consider any
attempt to replace these equations - for instance, in the
relativistic case (for several electrons) - by non-linear
equations as doomed to fail.

§ 3 The transition from micro to macromechanics

I believe the analyses performed in the preceding sections of
the terms "electron position", "velocity", "energy", etc., have
sufficiently clarified the concepts of quantum theory kinematics
and mechanics, so that an intuitive understanding of the
macroscopic processes must also be possible, from the point of
view of quantum mechanics. The transition from micro to
macromechanics has already been dealt with by Schroedinger*, but I
do not believe that Schroedinger's considerations address the
essence of the problem, for the following reasons: according
to Schroedinger, in highly excited states a sum of the
eigenvibrations will yield a not overly large wave packet, that in
its turn, under periodic changes of its size, performs the
periodic motions of the classical "electron". The following
objections can be raised here: If the wave packet had such
properties as described here, then the radiation emitted by
the atom could be developed into a Fourier series in which the
frequencies of the harmonic vibrations are integer multiples
of the fundamental frequency. Instead, the frequencies of the
spectral lines emitted by the atom are never integer multiples
of a fundamental frequency, according to quantum mechanics -
with the exception of the special case of the harmonic
oscillator. Thus Schroedinger's consideration is applicable only to
the harmonic oscillator considered by him, while in all other
cases in the course of time the wave packet spreads over all
space surrounding the atom. The higher the atom's excitation
state, the slower will be the spreading of the wave packet.
But it will occur, if one waits long enough. The argument used
above for the radiation emitted by an atom can be used, for the
time being, against all attempts of a direct transition from
quantum to classical mechanics, for high quantum numbers. For
this reason, it used to be attempted to circumvent that argument
by pointing to the natural radiation width of the stationary
states; certainly improperly, since in the first place this
way out is already blocked for the hydrogen atom, because of
insufficient radiation at higher states; in the second place,
the transition from quantum to classical mechanics must be
understandable without borrowing from electrodynamics. Bohr** has
repeatedly pointed out these known difficulties, in the past,
that make a direct connection between quantum and classical
theory difficult. If we explained them here again in such
detail, it is because apparently they have been forgotten.

* E. Schroedinger, Naturwiss. 14, 664 (1926).
** N. Bohr, Basic Postulates of Quantum Theory, l.c.

I believe the genesis of the classical "orbit" can be precisely
formulated thus: the "orbit" only comes into being by our
observing it. Let us assume an atom in its thousandth excitation
state. The dimensions of the orbit are relatively large
here, already, so that it is sufficient, in the sense of § 1,
to determine the electron's position with a light of relatively
long wavelength. If the determination of the electron's
position is not to be too uncertain, then one consequence of
Compton recoil will be that after the collision, the atom will
be in some state between, say, the 950th and the 1050th. At
the same time, the electron's impulse can be derived - to a
precision given by equation (1) - from the Doppler effect. The
experimental fact so obtained can be characterized by means of
a wave packet - or better, probability packet - in q-space, of
a size given by the wavelength of the light used, essentially
composed of eigenfunctions between the 950th and the
1050th eigenfunction, and through the corresponding packet in
p-space. After a certain time, a new position determination is
performed, to the same precision. According to § 2, its result
can be expressed only statistically; possible positions are all
those within the now already spread wave packet, with a
calculable probability. This would in no way be different in
classical theory, since in classical theory the result of the
second position determination could also be given only
statistically, due to the uncertainty in the first determination;
in addition, the system's orbits would also spread in classical
theory similarly to the wave packet. However, the laws of
statistics themselves are different, in quantum mechanics and
classical theory. The second position determination selects a
definite q from among all those possible, thus limiting the
possibilities for all subsequent determinations. After the second
position determination, the results for later measurements can be
calculated only by again assigning to the electron a "smaller"
wave packet of dimension λ (wavelength of the light used for the
observation). Thus, each position determination reduces the wave
packet again to its original dimension λ. The "values" of the
variables p and q are known to a certain precision, during all
experiments. That within these limits of precision the values of
p and q follow the classical equations of motion can be
concluded directly from the laws of quantum mechanics,

$$\dot{p} = -\frac{\partial H}{\partial q}; \qquad \dot{q} = \frac{\partial H}{\partial p}. \tag{9}$$
But as we mentioned, the orbit can only be calculated
statistically from the initial conditions, which we may consider a
consequence of the uncertainty existing in principle in the initial
conditions. The laws of statistics are different for quantum
mechanics and classical theory. Under certain conditions, this
can lead to gross macroscopic differences between classical and
quantum theory. Before discussing an example of this, I want
to show by means of a simple mechanical system - the force-free
motion of a mass point - how the transition to the classical
theory discussed above is to be formulated mathematically. The
equations of motion are (for unidimensional motion)

$$H = \frac{1}{2m}\,p^2; \qquad \dot{q} = \frac{1}{m}\,p; \qquad \dot{p} = 0. \tag{10}$$


Since time can be treated as a parameter (as a "c-number") if
there are no external, time-dependent forces, the solution
to this equation is:

$$q = \frac{1}{m}\,p_0\,t + q_0; \qquad p = p_0, \tag{11}$$

where $p_0$ and $q_0$ represent impulse and position at time t = 0.
At time t = 0 [see equations (3) to (6)], let $q_0 = q'$ be
measured with precision $q_1$, $p_0 = p'$ with precision $p_1$. If
from the "values" of $p_0$ and $q_0$ we are to derive the "value"
of q at time t, then according to Dirac and Jordan we must find
that transformation function that transforms all matrices
in which $q_0$ appears as a diagonal matrix, into matrices in
which q appears as the diagonal matrix. In the matrix pattern
in which $q_0$ appears as the diagonal matrix, $p_0$ can be
replaced by the operator $\frac{h}{2\pi i}\frac{\partial}{\partial q_0}$.
According to Dirac [l.c. equation (11)] we then have for the
transformation amplitude sought, $S(q_0,q)$, the differential
equation

$$\left\{\frac{t}{m}\,\frac{h}{2\pi i}\,\frac{\partial}{\partial q_0} + q_0\right\} S(q_0,q) = q\,S(q_0,q), \tag{12}$$

$$S(q_0,q) = \text{const}\cdot\exp\left[\frac{2\pi i\,m}{h\,t}\int (q-q_0)\,dq_0\right]. \tag{13}$$

Thus $S\bar{S}$ is independent of $q_0$, i.e., if at time t = 0,
$q_0$ is known exactly, then at any time t > 0 all values of q are
equally likely, i.e., the probability that q lies within a
finite range is generally zero. This is quite clear, intuitively,
because the exact determination of $q_0$ leads to an infinitely
large Compton recoil. The same would of course be true
of any mechanical system. However, if at time t = 0, $q_0$ is
known only to a precision $q_1$ and $p_0$ to precision $p_1$, then
[cf. equation (3)]

$$S(\eta,q_0) = \text{const}\cdot\exp\left[-\frac{(q_0-q')^2}{2q_1^2} - \frac{2\pi i}{h}\,p'(q_0-q')\right],$$

and the probability function for q will have to be calculated
from the equation

$$S(\eta,q) = \int S(\eta,q_0)\,S(q_0,q)\,dq_0.$$

We obtain

$$S(\eta,q) = \text{const}\cdot\int \exp\left[\frac{2\pi i\,m}{t\,h}\left\{q_0\left(q - \frac{t}{m}\,p'\right) - \frac{q_0^2}{2}\right\} - \frac{(q'-q_0)^2}{2q_1^2}\right] dq_0. \tag{14}$$

If we introduce the abbreviation

$$\beta = \frac{t\,h}{2\pi\,m\,q_1^2}, \tag{15}$$

then the exponent in (14) becomes

$$-\frac{1}{2q_1^2}\left\{q_0^2\left(1+\frac{i}{\beta}\right) - 2q_0\left(q' + \frac{i}{\beta}\left(q - \frac{t}{m}\,p'\right)\right) + q'^2\right\}.$$

The term in $q'^2$ can be included in the constant factor
(independent of q); by integration we obtain

$$S(\eta,q) = \text{const}\cdot\exp\left[\frac{1}{2q_1^2}\,\frac{\left(q' + \frac{i}{\beta}\left(q - \frac{t}{m}p'\right)\right)^2}{1+\frac{i}{\beta}}\right] = \text{const}\cdot\exp\left[-\frac{\left(q - \frac{t}{m}p' - i\beta q'\right)^2\left(1-\frac{i}{\beta}\right)}{2q_1^2\,(1+\beta^2)}\right]. \tag{16}$$

From which follows

$$S(\eta,q)\,\bar{S}(\eta,q) = \text{const}\cdot\exp\left[-\frac{\left(q - \frac{t}{m}p' - q'\right)^2}{q_1^2\,(1+\beta^2)}\right]. \tag{17}$$

Thus, at time t the electron is at position $(t/m)p' + q'$ to
a precision $q_1\sqrt{1+\beta^2}$. The "wave packet" or better,
the "probability packet" has become larger by a factor of
$\sqrt{1+\beta^2}$. According to (15), β is proportional to the
time t, inversely proportional to the mass - this is immediately
plausible - and inversely proportional to $q_1^2$. Too great a
precision in $q_0$ has a greater uncertainty in $p_0$ as a
consequence and hence also leads to an increased uncertainty in q.
The parameter η, which we introduced above for formal reasons,
could be eliminated in all equations, here, since it does not
enter in the calculations.
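
To get a feeling for the magnitudes involved in the spreading of the probability packet, the abbreviation β = th/(2πm q1²) and the resulting width q1·sqrt(1+β²) can be evaluated numerically. The sketch below assumes SI units, an electron, and an initial width of about one angstrom; these sample values and the function names are illustrative assumptions.

```python
import math

H = 6.62607015e-34                # Planck constant in J*s
ELECTRON_MASS = 9.1093837015e-31  # kg (CODATA 2018 value)


def beta(t, mass, q1):
    """Abbreviation beta = t*h / (2*pi*m*q1^2)."""
    return t * H / (2 * math.pi * mass * q1 ** 2)


def packet_width(t, mass, q1):
    """Width of the probability packet at time t: q1 * sqrt(1 + beta^2)."""
    return q1 * math.sqrt(1 + beta(t, mass, q1) ** 2)


q1 = 1e-10  # initial position precision of about one angstrom (assumption)
# Time at which beta = sqrt(3), i.e. at which the packet has doubled in width.
t_double = math.sqrt(3) * 2 * math.pi * ELECTRON_MASS * q1 ** 2 / H
print(t_double, packet_width(t_double, ELECTRON_MASS, q1) / q1)
```

Because β scales as 1/(m q1²), a heavier particle or a coarser initial position determination spreads correspondingly more slowly.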

As an example of how the difference between the classical laws
of statistics and those from quantum theory can lead to gross
macroscopic differences in the results from both theories, under
certain conditions, let us briefly discuss the reflection of an
electron stream by a grating. If the lattice constant is of the
order of magnitude of the de Broglie wavelength of the electron,
then the reflection will occur in certain discrete directions in
space, as does the light at a grating. Here, classical theory
yields macroscopically something grossly different. And yet, we
cannot find a contradiction against classical theory in the orbit
of a single electron. We could do it, if somehow we could direct
the electron to a certain location on a grating line and there
establish that the reflection did not occur classically. But if we
want to determine the electron's position so precisely that we
could say at which location on a grating line it would impact,
then the electron would acquire such a velocity, due to this
determination, that the de Broglie wavelength of the electron
would be reduced to the point that in this approximation, the
electron would be actually reflected in the direction prescribed
by classical theory, without contradicting the laws of quantum
theory.

§ 4 Discussion of some special, imaginary experiments

According to the intuitive interpretation of quantum theory
attempted here, the points in time at which transitions - the
"quantum jumps" - occur should be experimentally determinable
in as concrete a manner as, for instance, the energies of
stationary states. The precision to which such a point in time can
be determined is given by equation (2) as h/ΔE*, if ΔE is the
change in energy accompanying the transition. We are thinking
of an experiment such as the following: Let an atom, in state
2 at time t = 0, return to its normal state 1 by emitting
radiation. We could then assign to the atom, in analogy to equation
(7), the eigenfunction

$$S(t,p) = e^{-\alpha t}\,\psi(E_2,p)\,e^{-2\pi i E_2 t/h} + \sqrt{1 - e^{-2\alpha t}}\;\psi(E_1,p)\,e^{-2\pi i E_1 t/h} \tag{18}$$

if we assume that the radiation damping will express itself in
the eigenfunction by means of a factor of the form $e^{-\alpha t}$
(the true dependence may not be that simple). Let us send this atom
through an inhomogeneous magnetic field, to measure its energy,
as is customary in the Stern-Gerlach experiment, except that
the inhomogeneous field shall follow the atom beam for a good
portion of the path. The corresponding acceleration could be
measured by dividing the entire path followed by the atom beam
in the magnetic field into small partial paths, at the end of
each of which we measure the beam's deflection. Depending on
the atom beam's velocity, the division into partial paths will
correspond, for the atom, also to a division into partial time
intervals Δt. According to § 1, equation (2), to the interval
Δt corresponds a precision in the energy of h/Δt. The probability
of measuring a certain energy can be directly derived from
S(p,E) and is hence calculated in the interval from nΔt to
(n+1)Δt by means of

$$S(p,E) = \int_{n\Delta t}^{(n+1)\Delta t} S(t,p)\,e^{2\pi i E t/h}\,dt.$$

If at time (n+1)Δt we make the determination, "state 2", then
for all subsequent events we may no longer assign to the atom
the eigenfunction (18), but one derived from (18) if we
replace t with t-(n+1)Δt. If, in contrast, we determine "state
1", then from then on we must assign to the atom the
eigenfunction

$$\psi(E_1,p)\,e^{-2\pi i E_1 t/h}.$$

Thus, in a series of intervals Δt we would first observe "state
2", then continuously "state 1". To make a differentiation of
the two states possible, Δt must not fall below h/ΔE. Thus, the
transition-point in time can be determined with that precision.
We conceive of the experiment above entirely in the sense of
the old interpretation of quantum theory, as explained by
Planck, Einstein and Bohr, when we speak of a discontinuous
change of energy. Since such an experiment can be performed,
in principle, agreement as to its results must be possible.

* See W. Pauli, l.c., p. 12.

In Bohr's basic postulate of the quantum theory, the energy
of an atom, as well as the values of the action variables J,
has the privilege over other items to be determined (such as
the position of the electron, etc.) that its numerical value
can always be given. This privileged position held by energy
over other quantum mechanics magnitudes is owed strictly to
the circumstance that in a closed system, it represents an
integral of the equations of motion (for the energy matrix we
have E = const.). In contrast, in open systems the energy
has no preference over other quantum mechanics variables. In
particular, it will be possible to conceive of experiments
in which the atom's phases w are precisely measurable and
for which then the energy will remain, in principle,
undetermined, corresponding to a relation $Jw - wJ = h/(2\pi i)$,
or $J_1 w_1 \sim h$. Such an experiment is provided by resonance
fluorescence, for instance. If an atom is irradiated with an
eigenfrequency of, say, $\nu_{12} = (E_2 - E_1)/h$, then the atom
will vibrate in phase with the external radiation, in which case
in principle it is senseless to ask in which state - $E_1$ or
$E_2$ - the atom is vibrating. The phase relation between atom
and external radiation can be determined, for instance, by
means of the phase relations among many atoms (Wood's
experiments). If one does not want to use experiments involving
radiation, the phase relation can also be measured by performing
precise position measurements in the sense of § 1 for the
electron, at different times, relative to the phase of the
light used for illumination (for many atoms). To each atom
we could then assign a "wave function" such as

$$S(q,t) = c_2\,\psi_2(E_2,q)\,e^{-2\pi i (E_2 t + \beta)/h} + \sqrt{1-c_2^2}\;\psi_1(E_1,q)\,e^{-2\pi i E_1 t/h}. \tag{19}$$

Here $c_2$ depends on the intensity and β on the phase of the
illuminating light. Thus, the probability of a certain position
q is

$$S(q,t)\,\bar{S}(q,t) = c_2^2\,\psi_2\bar{\psi}_2 + (1-c_2^2)\,\psi_1\bar{\psi}_1 + c_2\sqrt{1-c_2^2}\left(\psi_2\bar{\psi}_1\,e^{-\frac{2\pi i}{h}[(E_2-E_1)t+\beta]} + \bar{\psi}_2\psi_1\,e^{+\frac{2\pi i}{h}[(E_2-E_1)t+\beta]}\right). \tag{20}$$

The periodic term in (20) can be experimentally separated
from the non-periodic one, since the position determination can
be performed at different phases of the illuminating light.

In a known imaginary experiment proposed by Bohr, the atoms of
a Stern-Gerlach atom beam are initially excited to resonance
fluorescence, at a certain location, by means of light
irradiation. After a certain length, the atoms pass through an
inhomogeneous magnetic field; the radiation emitted by the atoms
can be observed over the entire length of their path, before and
behind the magnetic field. Before the atoms enter the magnetic
field, they exhibit normal resonance fluorescence, i.e., in
analogy to the dispersion theory, we must assume that all atoms
emit spherical waves in phase with the incident light. At
first, this latter interpretation stands in conflict with what
a rough application of the light quanta theory or the basic
rules of quantum theory indicate: from it one would conclude
that only a few atoms would be raised to an "upper state"
by the absorption of a light quantum and hence, that all of
the resonance radiation would come from intensively radiating
excited centers. Thus, it used to be tempting to say: the
concept of light quanta can be called upon here only for the
energy-impulse balance; "in reality" all atoms radiate in the
lower state as a weak and coherent spherical wave. Once the atoms
have passed through the magnetic field, there can hardly be
any doubt left that the atom beam has split into two beams
of which one corresponds to atoms in the higher state and the
other, to atoms in the lower state. If the atoms in the lower
state were radiating, this would be a gross infringement of
the energy theorem, because all of the excitation energy is
contained in the fraction with the higher state. Rather, there
can be no doubt that behind the magnetic field, only the atom
beam with the upper states is emitting light - and non-coherent
light, at that - from the few intensively radiating atoms in
the upper state. As Bohr showed, this imaginary experiment makes
particularly clear how careful we must be with the application
of the concept "stationary state". From the conception of the
quantum theory developed here, Bohr's experiment can be
discussed without any difficulty. In the outer radiation field
the phases of the atoms are determined and hence there is no
sense in talking of the energy of the atom. Even after the atom
has left the radiation field we cannot say that it is in a
certain stationary state, if we are asking for coherence
characteristics of the radiation. But experiments can be performed
to test in which state the atom is; the result of this experiment
can only be given statistically. Such an experiment is actually
performed by the inhomogeneous magnetic field. Behind the
magnetic field, the energies of the atoms are determined and
hence their phases are undetermined. The radiation is incoherent
and emitted only by atoms in the upper state. The magnetic
field determines the energies and hence destroys the phase
relations. Bohr's imaginary experiment provides a beautiful
clarification of the fact that the energy of the atom is also,
"in reality", not a number, but a matrix. The law of conservation
applies to the matrix energy and hence also to the value
of the energy, as precisely as it is measured, in each case.

Analytically, the cancellation of the phase relations can be /19--3


followed approximately thus: let Q be the coordinates of the
atom's center of mass; we can then assign to the atom (instead
of (19)) the eigen-function

26
i OR,GINAL PAG_ ?_
• OF POOR QUALITY I

s(Q,Os(q, t) -- s(_, _,0 ('_D

where S(Q,t) is a function that [as S(n,q) in (1611 is differ- o

_;! ent from zero in only a small area around a point in Q-space, i
'_ and propagates with the velocity of the atoms in the direction
_} of the beam. The probability of a relative amplitude q for
some values Q is given by the integral of _

S(Q,q,t)S(O,q,t) over Q, i.e., via (20). "

The eigenfunction (21), however, will change in the magnetic
field in a calculable manner, and because of the differing
deflection of the atoms in the upper and the lower state, will
have become, behind the magnetic field,

[Equation (22), not legible in the scan: S(Q,q,t) written as the sum
of an upper-state term containing S_2(Q,t) and a lower-state term
containing S_1(Q,t).]

S_1(Q,t) and S_2(Q,t) will be functions in Q-space differing
from zero only in a small area surrounding a point. But this
point is different for S_1 and for S_2. Hence S_1 S_2 is zero
everywhere. Hence, the probability of a relative amplitude q and a
definite value Q is

The periodic term in (20) has disappeared and with it, the
possibility of measuring a phase relation. The result of the
statistical position determination will always be the same,
regardless of the phase of the incident light for which it was
determined. We may assume that experiments with radiation whose
theory has not yet been fully elaborated will yield the same
results regarding the phase relations of atoms to the incident
light.
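
The step "S_1 S_2 is zero everywhere" can be illustrated numerically. The sketch below is not taken from the paper: it assumes, purely for illustration, that the two packets are real one-dimensional Gaussians of equal width on the Q axis, and the names `gaussian_packet` and `overlap` are hypothetical helpers. It shows that the overlap integral, which multiplies the periodic cross term, is of order one while the packets coincide and becomes negligible once they have been deflected apart.

```python
import math

def gaussian_packet(center, width):
    """Return a real Gaussian amplitude f(Q) normalised so that the
    integral of f(Q)**2 over Q equals 1 (an illustrative assumption)."""
    norm = (1.0 / (math.pi * width ** 2)) ** 0.25
    return lambda Q: norm * math.exp(-((Q - center) ** 2) / (2.0 * width ** 2))

def overlap(f, g, lo=-50.0, hi=50.0, steps=20000):
    """Approximate the integral of f(Q) * g(Q) dQ with the midpoint rule."""
    dQ = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        Q = lo + (i + 0.5) * dQ
        total += f(Q) * g(Q)
    return total * dQ

width = 1.0
# Before the magnet: both components sit at the same place.
same_place = overlap(gaussian_packet(0.0, width), gaussian_packet(0.0, width))
# Behind the magnet: the two components have been deflected apart.
separated = overlap(gaussian_packet(-10.0, width), gaussian_packet(10.0, width))

print(f"overlap before separation: {same_place:.6f}")
print(f"overlap after separation:  {separated:.3e}")
```

With the packets twenty widths apart the cross term is suppressed far below anything measurable, which is the sense in which the phase relation can no longer be observed.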

Finally, let us examine the relation between equation (2),
E_1 t_1 ~ h, and a problem complex discussed by Ehrenfest* and two
other researchers by means of Bohr's correspondence principle,
in two important papers**. Ehrenfest and Tolman speak of "weak
quantization" when a quantized periodic motion is subdivided,
by quantum jumps or other disturbances, into time intervals
that can not be considered long in relation to the system's
period. Supposedly, in this case there are not only the exact
energy values from quantum theory, but also - with a lower a
priori probability that can be qualitatively indicated - energy
values that do not differ too much from the quantum theory-based
values. In quantum mechanics, such a behavior is to be
interpreted as follows: since the energy is really changed, due to
other disturbances or to quantum jumps, each energy measurement
has to be performed in the interval between two disturbances,
if it is to be unequivocal. This provides an upper limit to t_1
in the sense of § 1. Thus the energy value E_0 of a quantized
state is also measured only with a precision E_1 ~ h/t_1. Here,
the question whether the system "really" adopts energy values
E that differ from E_0 - with the correspondingly smaller
statistical weight - or whether their experimental determination is
due only to the uncertainty of the measurement, is pointless,
in principle. If t_1 is smaller than the system's period, then
there is no longer any sense in talking of discrete stationary
states or discrete energy values.
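
The order-of-magnitude statement E_1 ~ h/t_1 can be turned into a small numerical check. The following is a minimal sketch and not part of the paper: the function names are hypothetical, the relation is treated as an equality although the text states it only qualitatively, and the example system (a harmonic oscillator of frequency ν with level spacing hν) is an assumption chosen because its period 1/ν makes the last sentence above easy to verify.

```python
PLANCK_H = 6.62607015e-34  # Planck constant in J s (exact SI value)

def energy_precision(t1_seconds):
    """Order-of-magnitude energy precision E_1 ~ h / t_1 for a measurement
    that has to fit into an undisturbed interval of length t_1."""
    if t1_seconds <= 0:
        raise ValueError("t1 must be positive")
    return PLANCK_H / t1_seconds

def levels_resolvable(level_spacing_joules, t1_seconds):
    """True when the attainable precision is finer than the level spacing."""
    return energy_precision(t1_seconds) < level_spacing_joules

# Assumed example: oscillator of frequency nu, level spacing h * nu.
nu = 1.0e14                    # Hz, illustrative value
spacing = PLANCK_H * nu        # J
period = 1.0 / nu              # s

print(levels_resolvable(spacing, 10.0 * period))   # interval of ten periods
print(levels_resolvable(spacing, 0.1 * period))    # interval of a tenth of a period
```

For this assumed system the levels are resolvable exactly when t_1 exceeds the period, which is the condition stated in the text.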

In a similar context, Ehrenfest and Breit (l.c.) point out the


following paradox: let us imagine a rotator - for instance, in
the shape of a gear wheel - fitted with a mechanism that after
f revolutions just reverses the direction of rotation. Let us
further assume that the gear wheel acts on a rack that can be
linearly displaced between two blocks. After the specified num-
ber of revolutions, the blocks force the rack, and hence the
wheel, to reverse direction. The true period T of the system is

* P. Ehrenfest and G. Breit, Z. f. Phys. 9, 207 (1922) and P.
Ehrenfest and R. C. Tolman, Phys. Rev. 24, 287 (1924); see also
the discussion in N. Bohr, Basic postulates of quantum theory,
l.c.
** Mr. W. Pauli pointed this relation out to me.

long in relation to the period t of the wheel; the discrete
energy steps are correspondingly dense, and denser, the greater
T is. Since from the point of view of a consistent quantum
theory all stationary states have the same statistical weight, for
a sufficiently large T practically all energy values will occur
with the same frequency - in contrast to what we would expect
for the rotator. Initially, this paradox becomes even sharper
when we consider our points of view. Because in order to
establish whether the system will adopt the discrete energy
values corresponding to a pure rotator singly or with special
frequency, or whether it will adopt all possible values (i.e.,
values corresponding to the small energy steps h/T) with the
same probability, a time t_1 is sufficient, which is small in
relation to T (but >> t). That is, although the large period
for such measurements never becomes effective, it apparently
manifests itself in that all possible energy values can occur.
We believe that such experiments for the determination of the
system's total energy would actually yield all possible energy
values with the same probability; and this is not due to the
large period T, but to the linearly displaceable rack. Even if
the system should find itself in a state whose energy
corresponds to the rotator quantization, by means of external
forces acting on the rack it can be easily taken to states
that do not correspond to the rotator quantization*. The
coupled system rotator-rack simply has periodicity
characteristics that are different from those of the rotator. The
solution of the paradox rather lies in the following: if we want
to measure the energy of the rotator alone, then we shall first
have to dissolve the coupling between rotator and rack. In
classical theory, for a sufficiently small mass of the rack the
dissolution of the coupling could occur without energy changes
and therefore there the energy of the total system could be
equated to that of the rotator (for a small rack mass). In

* According to Ehrenfest and Breit, this can occur not at all,
or only rarely, due to forces acting on the wheel.

quantum mechanics, the interaction energy between rack and
wheel is at least of the same order of magnitude as one of
the rotator's energy steps (even for a small rack mass, a high
zero-point energy remains for the elastic interaction between
wheel and rack!). Once the coupling is dissolved, the rack and
the wheel individually adopt their quantum theory energy
values. Thus, to the extent that we can measure the energy
values of the rotator alone, we will always find the values
prescribed by quantum theory, with the precision allowed by
the experiment. Even for a vanishingly small rack mass will
the energy of the coupled system be different from that of the
rotator. The energy of the coupled system can adopt all
possible values (those allowed by T-quantization) with the same
probability.

Quantum theory kinematics and mechanics are vastly different
from classical. But the applicability of classical kinematic
and mechanical concepts can not be deduced either from the
laws that govern our thinking, or from experience. We are
entitled to this conclusion by the relation (1), p_1 q_1 ~ h. Since
the momentum, position, energy, etc., of an electron are
precisely defined concepts, we need not be discouraged by the fact
that the fundamental equation (1) contains only a qualitative
statement. Since, in addition, we can qualitatively conceive of
the theory's experimental consequences, in all simple cases,
we shall no longer have to view quantum mechanics as not
intuitive or abstract*. If we admit this, then we would of course

also like to be able to derive the quantitative laws of
quantum mechanics directly from the intuitive foundations, i.e.,
essentially, from relation (1). For this reason Jordan
attempted to interpret the equation

as a probability relation. We can not agree, however, with
that interpretation (§ 2). Rather, we believe that the
quantitative laws can be understood, to begin with, according to the
principle of the greatest possible simplicity, starting from
the intuitive foundations. If, for instance, the X coordinate
of the electron no longer is a "number" - as can be concluded
experimentally, from equation (1) - then the simplest conceivable
assumption [that does not contradict (1)] is that this X
coordinate is a diagonal term of a matrix whose non-diagonal terms
are expressed in an uncertainty, or respectively, by other kinds
of transformations (cf. for instance § 4). Perhaps the statement
that the velocity in the X-direction "in reality" is not a
number, but a diagonal term in a matrix is no more unintuitive and
abstract than the determination that the electric field
intensity "in reality" is the time portion of an antisymmetrical
tensor of the space-time world. The expression "in reality" is
just as much or as little justified here as it is for any other
description of natural phenomena in mathematical terms. As soon
as we admit that all quantum theory variables "in reality" are
matrices, the quantitative laws follow without difficulty.

* Schroedinger described quantum mechanics as a formal theory,
of frightening, even repulsive un-intuitiveness and
abstraction. The value of the mathematical (and to that extent,
intuitive) penetration of the laws of quantum mechanics
accomplished by Schroedinger can certainly not be praised highly
enough. However, in terms of the principled, physical
questions, I believe the popular intuitiveness of wave mechanics
has deflected it from the straight path that had been marked
by the works of Einstein and de Broglie on the one hand, and
by quantum mechanics, on the other.

If one assumes that the interpretation of quantum mechanics
attempted here is valid at least in its essential points, then we
may be allowed to discuss its main consequences, in a few words.
We have not assumed that quantum theory - in contrast to
classical theory - is an essentially statistical theory, in the sense
that starting from exact data we can only draw statistical
conclusions. Among others, the known experiments by Geiger and
Bothe speak against such an assumption. Rather, in all cases
in which relations exist between variables, in classical
theory, that can really be measured precisely, the corresponding
exact relations exist also in quantum theory (momentum and
energy theorems). But in the rigorous formulation of the law of
causality - "If we know the present precisely, we can calculate
the future" - it is not the conclusion that is faulty, but the
premise. We simply can not know the present in principle in all
its parameters. Therefore all perception is a selection from a
totality of possibilities and a limitation of what is possible
in the future. Since the statistical nature of quantum theory
is so closely linked to the uncertainty in all observations or
perceptions, one could be tempted to conclude that behind the
observed, statistical world a "real" world is hidden, in which
the law of causality is applicable. We want to state explicitly
that we believe such speculations to be both fruitless and
pointless. The only task of physics is to describe the relation
between observations. The true situation could rather be
described better by the following: Because all experiments are
subject to the laws of quantum mechanics and hence to equation
(1), it follows that quantum mechanics once and for all
establishes the invalidity of the law of causality.

Addendum at the time of correction. After closing this paper,
new investigations by Bohr have led to viewpoints that allow a
considerable broadening and refining of the analysis of quantum
mechanics relations attempted here. In this context, Bohr
called my attention to the fact that I had overlooked some
essential points in some discussions of this work. Above all, the
uncertainty in the observation is not due exclusively to the
existence of discontinuities, but is directly related to the
requirement of doing justice simultaneously to the different
experiences expressed by corpuscular theory on the one hand,
and by wave theory on the other. For instance, in the use of
an imaginary γ-ray microscope, the divergence of the ray beam
must be taken into account. The first consequence of this is
that in the observation of the electron's position, the
direction of the Compton recoil will only be known with some
uncertainty, which will then lead to relation (1). It is
furthermore not sufficiently stressed that rigorously, the simple
theory of the Compton effect can be applied only to free
electrons. As professor Bohr made very clear, the care necessary in
the application of the uncertainty relationship is essential
above all in a general discussion of the transition from micro-
to macro-mechanics. Finally, the considerations on resonance
fluorescence are not entirely correct, because the relation
between the phase of the light and that of the motion of the
electrons is not as simple as assumed here. I am greatly
indebted to professor Bohr for being permitted to know and discuss
during their gestation those new investigations by Bohr,
mentioned above, dealing with the conceptual structure of quantum
theory, and to be published soon.


ON THE ANTIBACTERIAL ACTION OF CULTURES OF A
PENICILLIUM, WITH SPECIAL REFERENCE TO THEIR
USE IN THE ISOLATION OF B. INFLUENZAE.
ALEXANDER FLEMING, F.R.C.S.
From the Laboratories of the Inoculation Department, St Mary's Hospital, London.

Received for publication May 10th, 1929.

WHILE working with staphylococcus variants a number of culture-plates
were set aside on the laboratory bench and examined from time to time. In
the examinations these plates were necessarily exposed to the air and they
became contaminated with various micro-organisms. It was noticed that
around a large colony of a contaminating mould the staphylococcus colonies
became transparent and were obviously undergoing lysis (see Fig. 1).
Subcultures of this mould were made and experiments conducted with a
view to ascertaining something of the properties of the bacteriolytic substance
which had evidently been formed in the mould culture and which had diffused
into the surrounding medium. It was found that broth in which the mould
had been grown at room temperature for one or two weeks had acquired
marked inhibitory, bactericidal and bacteriolytic properties to many of the
more common pathogenic bacteria.

CHARACTERS OF THE MOULD.


The colony appears as a white fluffy mass which rapidly increases in size
and after a few days sporulates, the centre becoming dark green and later in
old cultures darkens to almost black. In four or five days a bright yellow
colour is produced which diffuses into the medium. In certain conditions a
reddish colour can be observed in the growth.
In broth the mould grows on the surface as a white fluffy growth changing
in a few days to a dark green felted mass. The broth becomes bright yellow
and this yellow pigment is not extracted by CHCl3. The reaction of the broth
becomes markedly alkaline, the pH varying from 8.5 to 9. Acid is produced
in three or four days in glucose and saccharose broth. There is no acid
production in 7 days in lactose, mannite or dulcite broth.
Growth is slow at 37° C. and is most rapid about 20° C. No growth is
observed under anaerobic conditions.
In its morphology this organism is a penicillium and in all its characters
it most closely resembles P. rubrum. Biourge (1923) states that he has never
found P. rubrum in nature and that it is an "animal de laboratoire." This
penicillium is not uncommon in the air of the laboratory.
IS THE ANTIBACTERIAL BODY ELABORATED IN CULTURE BY ALL MOULDS?
A number of other moulds were grown in broth at room temperature and
the culture fluids were tested for antibacterial substances at various intervals
up to one month. The species examined were: Eidamia viridiscens, Botrytis
cineria, Aspergillus fumigatus, Sporotrichum, Cladosporium, Penicillium, 8
strains. Of these it was found that only one strain of penicillium produced
any inhibitory substance, and that one had exactly the same cultural characters
as the original one from the contaminated plate.
It is clear, therefore, that the production of this antibacterial substance is
not common to all moulds or to all types of penicillium.
In the rest of this article allusion will constantly be made to experiments
with filtrates of a broth culture of this mould, so for convenience and to avoid
the repetition of the rather cumbersome phrase "Mould broth filtrate," the
name "penicillin" will be used. This will denote the filtrate of a broth
culture of the particular penicillium with which we are concerned.
METHODS OF EXAMINING CULTURES FOR ANTIBACTERIAL SUBSTANCE.
The simplest method of examining for inhibitory power is to cut a furrow
in an agar plate (or a plate of other suitable culture material), and fill this in
with a mixture of equal parts of agar and the broth in which the mould has
grown. When this has solidified, cultures of various microbes can be streaked
at right angles from the furrow to the edge of the plate. The inhibitory
substance diffuses very rapidly in the agar, so that in the few hours before the
microbes show visible growth it has spread out for a centimetre or more in
sufficient concentration to inhibit growth of a sensitive microbe. On further
incubation it will be seen that the proximal portion of the culture for perhaps
one centimetre becomes transparent, and on examination of this portion of the
culture it is found that practically all the microbes are dissolved, indicating
that the anti-bacterial substance has continued to diffuse into the agar in
sufficient concentration to induce dissolution of the bacteria. This simple
method therefore suffices to demonstrate the bacterio-inhibitory and bacterio-
lytic properties of the mould culture, and also by the extent of the area of
inhibition gives some measure of the sensitiveness of the particular microbe
tested. Fig. 2 shows the degree of inhibition obtained with various microbes
tested in this way.
The inhibitory power can be accurately titrated by making serial dilutions
of penicillin in fresh nutrient broth, and then implanting all the tubes with
the same volume of a bacterial suspension and incubating them. The inhibi-
tion can then readily be seen by noting the opacity of the broth.
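
The reading of such a dilution series can be sketched in a few lines. This is only an illustration of the bookkeeping, not a procedure given in the paper: the function name `inhibition_titre`, the convention that a clear tube means inhibition, and the example readings are all assumptions made for the sketch.

```python
def inhibition_titre(readings):
    """Return the highest dilution factor that still shows no growth.

    `readings` maps a dilution factor N (meaning "1 in N") to True when the
    broth in that tube stayed clear after incubation and to False when it
    turned opaque. Tubes are scanned from the strongest penicillin
    concentration to the weakest, stopping at the first tube with growth.
    Returns None when even the strongest concentration shows growth.
    """
    titre = None
    for factor in sorted(readings):
        if not readings[factor]:
            break
        titre = factor
    return titre

# Hypothetical readings for one sample (illustrative values only).
example = {20: True, 40: True, 100: True, 200: True, 400: False, 800: False}
print(inhibition_titre(example))
```

Stopping at the first opaque tube means an isolated clear tube at a weaker concentration is not counted, which is a deliberately cautious choice for this sketch.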
For the estimation of the antibacterial power of a mould culture it is
unnecessary to filter as the mould grows only slowly at 37° C., and in 24 hours,
when the results are read, no growth of mould is perceptible. Staphylococcus
is a very suitable microbe on which to test the broth as it is hardy, lives well
in culture, grows rapidly, and is very sensitive to penicillin.
The bactericidal power can be tested in the same way except that at
intervals measured quantities are explanted so that the number of surviving
microbes can be estimated.
PROPERTIES OF THE ANTIBACTERIAL SUBSTANCE.
Effect of heat.-Heating for 1 hour at 56° or 80° C. has no effect on the
antibacterial power of penicillin. Boiling for a few minutes hardly affects it
(see Table II). Boiling for 1 hour reduces it to less than one quarter its
previous strength if the fluid is alkaline, but if it is neutral or very slightly acid
then the reduction is much less. Autoclaving for 20 minutes at 115° C.
practically destroys it.
Effect of filtration.-Passage through a Seitz filter does not diminish the
antibacterial power. This is the best method of obtaining sterile active mould
broth.
Solubility.-It is freely soluble in water and weak saline solutions. My
colleague, Mr. Ridley, has found that if penicillin is evaporated at a low
temperature to a sticky mass the active principle can be completely extracted
by absolute alcohol. It is insoluble in ether or chloroform.
Rate of development of inhibitory substance in culture.-A 500 c.c.
Erlenmeyer flask containing 200 c.c. of broth was planted with mould spores
and incubated at room temperature (10° to 20° C.). The inhibitory power of
the broth to staphylococcus was tested at intervals.
    After 5 days complete inhibition in 1 in 20 dilution.
    After 6 days complete inhibition in 1 in 40 dilution.
    After 7 days complete inhibition in 1 in 200 dilution.
    After 8 days complete inhibition in 1 in 500 dilution.
Grown at 20° C. the development of the active principle is more rapid and
a good sample will completely inhibit staphylococci in a 1 in 500 or 1 in 800
dilution in 6 or 7 days. As the culture ages the antibacterial power falls and
may in 14 days at 20° C. have almost disappeared.
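
The day-by-day titres listed above for the flask kept at room temperature can be compared with a short calculation. The sketch below only restates those four figures; the dictionary and function names are hypothetical, and the fold change between consecutive days is an illustrative summary, not a quantity the paper reports.

```python
# Day of incubation -> highest dilution ("1 in N") giving complete inhibition,
# as listed in the text for the room-temperature flask.
TITRE_BY_DAY = {5: 20, 6: 40, 7: 200, 8: 500}

def daily_fold_changes(titre_by_day):
    """Map each day (after the first) to titre(day) / titre(previous day)."""
    days = sorted(titre_by_day)
    return {
        later: titre_by_day[later] / titre_by_day[earlier]
        for earlier, later in zip(days, days[1:])
    }

for day, fold in daily_fold_changes(TITRE_BY_DAY).items():
    print(f"day {day}: {fold:g}-fold the previous day's titre")
```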
The antibacterial power of penicillin falls when it is kept at room tem-
perature. The rate of this fall can be seen from Table I.
TABLE I.-Effect of Keeping at Room Temperature on the
Anti-Staphylococcal Power of Penicillin.
Growth of staphylococcus in dilutions of penicillin as under.
1/20. 1/40. 1/60. 1/80. 1/100. 1/200. 1/300. 1/400. 1/600. 1/800. 1/1000. Control.
At time of filtr'ation . ++ ++
After 4 days ..- ± ++ ++
,7, .- _-± + + ++ ++
9-± + + ++ ++
,,13 ,,- - - - - + + + + ++ +
15 , . - ± + + + + + + + + ++ ++
If the reaction of penicillin is altered from its original pH of 9 to a pH of
6.8 it is much more stable.
The small drops of bright yellow fluid which collect on the surface of the
mould may have a high antibacterial titre. One specimen of such fluid
completely inhibited the growth of staphylococci in a dilution of 1 in 20,000
while the broth in which the mould was growing, tested at the same time,
inhibited staphylococcal growth in 1 in 800.
If the mould is grown on solid medium and the felted mass picked off and
BRITISH JOURNAL OF EXPERIMENTAL PATHOLOGY, VOL X, No. 3.


FIG. 1.-Photograph of a culture-plate showing the dissolution of staphylococcal
colonies in the neighbourhood of a penicillium colony.

FIG. 2.

FIG. 3.-Photograph of a culture-plate (Fildes medium) which had been
evenly planted with a mixture of staphylococci and B. influenzae. Six
drops of penicillin were then spread over the lower half of the plate.
Note complete inhibition of staphylococci in the penicillin treated area
with resultant pure culture of B. influenzae.

FIG. 4.-Photograph of a culture-plate (Fildes medium) which had been
evenly planted with nasal mucus from an individual suffering from a
"cold." Six drops of penicillin were spread over the lower half of the
plate before incubation. Note profuse growth of staphylococci and
diphtheroid bacilli in untreated half, whereas in treated half only some
three colonies of B. influenzae are seen.
TABLE II.-Inhibitory Power of Penicillin (Heated and Unheated) on
Various Microbes (Agar Plate Method).

                                             Extent of inhibition in mm. from
                                             penicillin embedded in agar, serum
                                             agar, or blood agar plates.
Type of microbe.                             Unheated.    Boiled for 1 minute.
Experiment 1:
  Staphylococcus pyogenes                       23             21
  Streptococcus                                 17             17
  Streptococcus viridans (mouth)                17             15
  Diphtheroid bacillus                          27             22
  Sarcina                                       10             10
  Micrococcus lysodeikticus                      6              7
  Micrococcus from air (1)                      20             16
  Micrococcus from air (2)                       4              9
  B. anthracis                                   0              0
  B. typhosus                                    0              0
  Enterococcus                                   0              0
Experiment 2:
  Staphylococcus pyogenes                       24
  Streptococcus                                 30
  Streptococcus viridans (mouth)                25
  Pneumococcus                                  30
  Diphtheroid bacillus                          35
  B. pyocyaneus                                  0
  B. pneumoniae (Friedländer)                    0
  B. coli                                        0
  B. paratyphosus A                              0
Experiment 3:
  Staphylococcus pyogenes                       16
  Gonococcus                                    16
  Meningococcus                                 17
Experiment 4:
  Staphylococcus pyogenes                       17
  Staphylococcus epidermidis                    18
  Streptococcus pyogenes                        15
  Streptococcus viridans (faeces)                5
  B. diphtheriae (2 strains)                    14
  Diphtheroid bacillus                          10
  Gram-negative coccus from the mouth (1)       12
  Gram-negative coccus from the mouth (2)        0
  B. coli                                        0
  B. influenzae (Pfeiffer) 6 strains             0

[TABLE III, printed sideways in the original and not recoverable from the
scan: inhibitory power of penicillin on various microbes when diluted in
nutrient broth.]

extracted in normal salt solution for 24 hours it is found that the extract has
bacteriolytic properties.
If this extract is mixed with a thick suspension of staphylococcus
and incubated for 2 hours at 45° C. it will be found that the opacity
of the suspension has markedly diminished and after 24 hours the previously
opaque suspension will have become almost clear.
Influence of the medium on the antibacterial titre of the mould culture.-So
far as has been ascertained nutrient broth is the most suitable medium for the
production of penicillin. The addition of glucose or saccharose, which are
fermented by the mould with the production of acid, delays or prevents the
appearance of the antibacterial substance. Dilution of the broth with water
delays the formation of the antibacterial substance and diminishes the
concentration which is ultimately reached.
INHIBITORY POWER OF PENICILLIN ON THE GROWTH OF BACTERIA.
Tables II and III show the extent to which various microbes, pathogenic
and non-pathogenic, are inhibited by penicillin. The first table shows the
inhibition by the agar plate method and the second shows the inhibitory power
when diluted in nutrient broth.
Certain interesting facts emerge from these Tables. It is clear that
penicillin contains a bacterio-inhibitory substance which is very active towards
some microbes while not affecting others. The members of the coli-typhoid
group are unaffected as are other intestinal bacilli such as B. pyocyaneus,
B. proteus and V. cholerae. Other bacteria which are insensitive to penicillin
are the enterococcus, some of the Gram-negative cocci of the mouth,
Friedländer's pneumobacillus, and B. influenzae (Pfeiffer), while the action on
B. dysenteriae (Flexner), and B. pseudo-tuberculosis rodentium is almost
negligible. The anthrax bacillus is completely inhibited in a 1 in 10 dilution
but in this case the inhibitory influence is trifling when compared with the
effect on the pyogenic cocci.
It is on the pyogenic cocci and on bacilli of the diphtheria group that the
action is most manifest.
Staphylococci are very sensitive, and the inhibitory effect is practically
the same on all strains, whatever the colour or type of the staphylococcus.
Streptococcus pyogenes is also very sensitive. There were small differences
in the titre with different strains, but it may be said generally that it is
slightly more sensitive than staphylococcus.
Pneumococci are equally sensitive with Streptococcus pyogenes.
The green streptococci vary very considerably, a few strains being almost
unaffected while others are as sensitive as S. pyogenes. Gonococci, meningococci,
and some of the Gram-negative cocci found in nasal catarrhal conditions are
about as sensitive as are staphylococci. Many of the Gram-negative cocci
found in the mouth and throat are, however, quite insensitive.
B. diphtheriae is less affected than staphylococcus but is yet completely
inhibited by a 1% dilution of a fair sample of penicillin.
It may be noted here that penicillin, which is strongly inhibitory to many
bacteria, does not inhibit the growth of the original penicillium which was
used in its preparation.

The Rate of Killing of Staphylococci by Penicillin.


Some bactericidal agents like the hypochlorites are extremely rapid in
their action, others like flavine or novarsenobillon are slow. Experiments
were made to find into which category penicillin fell.
To 1 c.c. volumes of dilutions in broth of penicillin were added 10 c.mm.
volumes of a 1 in 1000 dilution of a staphylococcus broth culture. The tubes
were then incubated at 37°C. and at intervals 10 c.mm. volumes were removed
and plated with the following result:
Number of colonies developing after sojourn in
penicillin in concentrations as under:
                  Control.   1/80.   1/40.   1/20.   1/10.
Before               27        27      27      27      27
After 2 hours       116        73      51      48      23
After 4½ hours       ∞         13       1       2       5
After 8 hours        ∞          0       0       0       0
After 12 hours       ∞          0       0       0       0
It appears, therefore, that penicillin belongs to the group of slow acting
antiseptics, and the staphylococci are only completely killed after an interval
of over 4½ hours even in a concentration 30 or 40 times stronger than is
necessary to inhibit completely the culture in broth. In the weaker
concentrations it will be seen that at first there is growth of the staphylococci and
only after some hours are the cocci killed off. The same thing can be seen if
a series of dilutions of penicillin in broth are heavily infected with staphylo-
coccus and incubated. If the cultures are examined after four hours it may
be seen that growth has taken place apparently equally in all the tubes but
when examined after being incubated overnight, the tubes containing penicillin
in concentrations greater than 1 in 300 or 1 in 400 are perfectly clear while
the control tube shows a heavy growth. This is a clear illustration of the
bacteriolytic action of penicillin.
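
The colony counts in the killing experiment above can be summarised programmatically. The sketch below is an illustration only: it assumes the third sampling time is 4½ hours (the scan is ambiguous at that point), it leaves out the control column because its later entries are uncountable growth, and the names `first_sterile_time` and `peak_growth_ratio` are hypothetical.

```python
# Sampling times in hours; 4.5 is an assumed reading of a damaged table entry.
HOURS = [0, 2, 4.5, 8, 12]

# Colonies developing from each plated sample, by penicillin concentration.
COLONIES = {
    "1/80": [27, 73, 13, 0, 0],
    "1/40": [27, 51, 1, 0, 0],
    "1/20": [27, 48, 2, 0, 0],
    "1/10": [27, 23, 5, 0, 0],
}

def first_sterile_time(counts, hours=HOURS):
    """Return the earliest sampling time at which no colonies developed."""
    for hour, count in zip(hours, counts):
        if count == 0:
            return hour
    return None

def peak_growth_ratio(counts):
    """Largest colony count in the series relative to the starting count."""
    return max(counts) / counts[0]

for concentration, counts in COLONIES.items():
    print(concentration,
          "sterile by hour", first_sterile_time(counts),
          "peak ratio", round(peak_growth_ratio(counts), 2))
```

In every concentration the first sterile sample is the one taken at 8 hours, and in the weaker concentrations the count first rises above its starting value, which matches the description of an initial growth followed by slow killing.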

TOXICITY OF PENICILLIN.
The toxicity to animals of powerfully antibacterial mould broth filtrates
appears to be very low. Twenty c.c. injected intravenously into a rabbit were
not more toxic than the same quantity of broth. Half a c.c. injected
intraperitoneally into a mouse weighing about 20 gm. induced no toxic symptoms.
Constant irrigation of large infected surfaces in man was not accompanied by
any toxic symptoms, while irrigation of the human conjunctiva every hour for
a day had no irritant effect.
In vitro penicillin which completely inhibits the growth of staphylococci in
a dilution of 1 in 600 does not interfere with leucocytic function to a greater
extent than does ordinary broth.
USE OF PENICILLIN TO DEMONSTRATE OTHER BACTERIAL INHIBITIONS.
When materials like saliva or sputum are plated it is not uncommon to see,
where the implant is thick, an almost pure culture of streptococci and
pneumococci, and where the implant is thinner and the streptococcal colonies
are more widely separated, other colonies appear, especially those of
Gram-negative cocci. These Gram-negative cocci are inhibited by the streptococci
(probably by the peroxide they produce in their growth) and it is only when
the mass effect of the streptococci is reduced that they appear in the culture.
Penicillin may be used to give a striking demonstration of this inhibition
of bacteria by streptococci and pneumococci. Sputum is spread thickly on a
culture plate, and then 5 or 6 drops of penicillin are spread over one half of it.
After incubation it may be seen that on the half untreated with penicillin there
is a confluent growth of streptococci and pneumococci and nothing else, while
on the penicillin-treated half many Gram-negative cocci appear which were
inhibited by the streptococci and pneumococci, and can only flourish when
these are themselves inhibited by the penicillin.
If some active penicillin is embedded in a streak across an agar plate
planted with saliva an interesting growth sometimes results. On the portion
most distal from the penicillin there are many streptococci, but these are
obscured by coarsely growing cocci, so that the resultant growth is a copious
confluent rough mass. These coarse growing cocci are extremely penicillin
sensitive and stop growing about 25 mm. from the embedded penicillin. Then
there is a zone of about 1 cm. wide of pure streptococci, then they are inhibited
by the penicillin, and as soon as that happens Gram-negative cocci appear and
grow right up to the embedded penicillin. The three zones of growth produced
in this way are very striking.
USE OF PENICILLIN IN THE ISOLATION OF B. INFLUENZAE (PFEIFFER)
AND OTHER ORGANISMS.
It sometimes happens that in the human body a pathogenic microbe may
be difficult to isolate because it occurs in association with others which grow
more profusely and which mask it. If in such a case the first microbe is
insensitive to penicillin and the obscuring microbes are sensitive, then by the
use of this substance these latter can be inhibited while the former are allowed
to develop normally. Such an example occurs in the body, certainly with
B. influenzae (Pfeiffer) and probably with Bordet's whooping-cough bacillus
and other organisms. Pfeiffer's bacillus, occurring as it does in the respiratory
tract, is usually associated with streptococci, pneumococci, staphylococci and
Gram-negative cocci. All of these, with the exception of some of the
Gram-negative cocci, are highly sensitive to penicillin and by the addition of some
of this to the medium they can be completely inhibited while B. influenzae is
unaffected. A definite quantity of the penicillin may be incorporated with the
molten culture medium before the plates are made, but an easier and very
satisfactory method is to spread the infected material, sputum, nasal mucus,
etc., on the plate in the usual way and then over one half of the plate spread
2 to 6 drops (according to potency) of the penicillin. This small amount of
fluid soaks into the agar and after cultivation for 24 hours it will be found that
the half of the plate without the penicillin will show the normal growth while
on the penicillin treated half there will be nothing but B. influenzae with
Gram-negative cocci and occasionally some other microbe. This makes it
infinitely easier to isolate these penicillin-insensitive organisms, and repeatedly
B. influenzae has been isolated in this way when the bacilli have not been seen in
films of sputum and when it has not been possible to detect them in plates
not treated with penicillin. Of course if this method is adopted then a
medium favourable for the growth of B. influenzae must be used, e.g. boiled
blood agar, as by the repression of the pneumococci and the staphylococci the
symbiotic effect of these, so familiar in cultures of sputum on blood agar, is
lost and if blood agar alone is used the colonies of B. influenzae may be so
minute as to be easily missed.
Figs. 3 and 4 are photographs of culture-plates made after the method
described above. On the plate shown in Fig. 3 a mixture of staphylococci and
B. influenzae was spread over the whole plate of Fildes medium (Fildes, 1921),
then 6 drops of penicillin were spread over the lower half of the plate. The
upper half shows the mixed culture while the lower half gives a pure culture
of B. influenzae. Fig. 4 represents a culture of nasal mucus from a "cold"
made on the same medium. Here, on the upper half (untreated with peni-
cillin) staphylococci and diphtheroid bacilli grow abundantly, while on the
treated (lower) half only some three or four colonies of B. influenzae appear.
In conjunction with my colleague, Dr. McLean, a series of cultures were
made from the throats of 25 nurses warded for "influenza." The swabs were
planted on boiled blood agar and over half of each plate was spread 3 or 4
drops of penicillin. The results are set forth in Table IV.

TABLE IV.-Summary of Results obtained from Post-Nasal Swabs in
25 Consecutive Cases of "Influenza."
Without penicillin. With penicillin. Without penicillin. With penicillin.

1. ++ + . - ++ +
+ 14. ++ - - - + -
2. + - ++ - ++ +
++ 15. ++ - - - + +
3. ++ ++ + - + - 16. ++ - _ - +
4. + .- _ - + l 17. + - -4+
5. + _- - ++ - 18. ++ - -+
6. -++ - - . - +1 ++ 19. + + + - - + ++
7. + + ++ .- + + 2).+ - - - -+
8.+ + - .- + - 21.+- - - + +
9. + + - .- + + 22. ++ - - .+- t -
1-.+ + *.- + - 23.-i - + .- + +
11. ++ - - - + - 24. ++ ++ - ++ -
12. + + + ++ - + + 25. + + - ++ . - - -
13. + ++ ++ - + ++
In the above Table account has only been taken of the common microbes
found in these cultures. In some there were a few diphtheroid bacilli which
were always penicillin sensitive, and in others there were Gram-negative
bacilli which were penicillin insensitive, although they were inhibited by
streptococci or pneumococci. Pneumococci and streptococci were classed
together, as complete tests were not made to differentiate one from the other.
(From the appearance of the colonies and the morphological characters
pneumococci were evidently present in most cases in much larger numbers
than were streptococci.)
The swabs were generally planted thickly and in some cases where the
growth on the portion of the plate without penicillin was almost confluent,
the cultures were sampled by taking smears from thick portions of the growth.
In these cases it is possible that the results given do not give a quite complete
picture of the cultures. This, however, does not affect the present argument
that by the addition of penicillin to the culture medium, and the consequent
inhibition of the pyogenic cocci, the isolation of B. influenzae is very much
easier. And in a number of cases it was isolated when it was completely
missed in the cultures without penicillin.
It is quite immaterial how many pneumococci and streptococci are present
in a specimen (they are completely inhibited), and even a few B. influenzae
can be isolated from a mixture with an enormous number of these cocci.
From a number of observations which have been made on sputum, post-
nasal and throat swabs it seems likely that by the use of penicillin, organisms
of the B. influenzae group will be isolated from a great variety of pathological
conditions as well as from individuals who are apparently healthy.
DISCUSSION.
It has been demonstrated that a species of penicillium produces in culture
a very powerful antibacterial substance which affects different bacteria in
different degrees. Speaking generally it may be said that the least sensitive
bacteria are the Gram-negative bacilli, and the most susceptible are the
pyogenic cocci. Inhibitory substances have been described in old cultures of
many organisms; generally the inhibition is more or less specific to the
microbe which has been used for the culture, and the inhibitory substances are
seldom strong enough to withstand even slight dilution with fresh nutrient
material. Penicillin is not inhibitory to the original penicillium used in its
preparation.
Emmerich and other workers have shown that old cultures of B. pyo-
cyaneus acquire a marked bacteriolytic power. The bacteriolytic agent,
pyocyanase, possesses properties similar to penicillin in that its heat resistance
is the same and it exists in the filtrate of a fluid culture. It resembles
penicillin also in that it acts only on certain microbes. It differs however in
being relatively extremely weak in its action and in acting on quite different
types of bacteria. The bacilli of anthrax, diphtheria, cholera and typhoid are
those most sensitive to pyocyanase, while the pyogenic cocci are unaffected,
but the percentages of pyocyaneus filtrate necessary for the inhibition of these
organisms were 40, 33, 40 and 60 respectively (Bocchia, 1909). This degree
of inhibition is hardly comparable with the 0.2% or less of penicillin which is
necessary to completely inhibit the pyogenic cocci or the 1% necessary for
B. diphtheriae.
Penicillin, in regard to infections with sensitive microbes, appears to have
some advantages over the well-known chemical antiseptics. A good sample
will completely inhibit staphylococci, Streptococcus pyogenes and pneumococcus
in a dilution of 1 in 800. It is therefore a more powerful inhibitory agent than
is carbolic acid and it can be applied to an infected surface undiluted as it is
non-irritant and non-toxic. If applied, therefore, on a dressing, it will still be
effective even when diluted 800 times which is more than can be said of the
chemical antiseptics in use. Experiments in connection with its value in the
treatment of pyogenic infections are in progress.
In addition to its possible use in the treatment of bacterial infections
penicillin is certainly useful to the bacteriologist for its power of inhibiting
unwanted microbes in bacterial cultures so that penicillin insensitive bacteria
can readily be isolated. A notable instance of this is the very easy isolation of
Pfeiffer's bacillus of influenza when penicillin is used.
In conclusion my thanks are due to my colleagues, Mr. Ridley and Mr.
Craddock, for their help in carrying out some of the experiments described in
this paper, and to our mycologist, Mr. la Touche, for his suggestions as to the
identity of the penicillium.
SUMMARY.
1. A certain type of penicillium produces in culture a powerful antibacterial
substance. The antibacterial power of the culture reaches its maximum in
about 7 days at 20° C. and after 10 days diminishes until it has almost
disappeared in 4 weeks.
2. The best medium found for the production of the antibacterial substance
has been ordinary nutrient broth.
3. The active agent is readily filterable and the name "penicillin" has
been given to filtrates of broth cultures of the mould.
4. Penicillin loses most of its power after 10 to 14 days at room tempera-
ture but can be preserved longer by neutralization.
5. The active agent is not destroyed by boiling for a few minutes but in
alkaline solution boiling for 1 hour markedly reduces the power. Autoclaving
for 20 minutes at 115° C. practically destroys it. It is soluble in alcohol but
insoluble in ether or chloroform.
6. The action is very marked on the pyogenic cocci and the diphtheria
group of bacilli. Many bacteria are quite insensitive, e.g. the coli-typhoid
group, the influenza-bacillus group, and the enterococcus.
7. Penicillin is non-toxic to animals in enormous doses and is non-irritant.
It does not interfere with leucocytic function to a greater degree than does
ordinary broth.
8. It is suggested that it may be an efficient antiseptic for application to,
or injection into, areas infected with penicillin-sensitive microbes.
9. The use of penicillin on culture plates renders obvious many bacterial
inhibitions which are not very evident in ordinary cultures.
10. Its value as an aid to the isolation of B. influenzae has been
demonstrated.
REFERENCES.
BIOURGE.-(1923) 'Des moisissures du groupe Penicillium Link.' Louvain, p. 172.
EMMERICH, LOEW AND KORSCHUN.-(1902) Zbl. Bakt., 30, 1.
BOCCHIA.-(1909) Ibid., 50, 220.
FILDES, P.-(1920) Brit. J. Exp. Path., 1, 129.
168 ASTRONOMY: E. HUBBLE PROC. N. A. S.
appearance the spectrum is very much like spectra of the Milky Way
clouds in Sagittarius and Cygnus, and is also similar to spectra of binary
stars of the W Ursae Majoris type, where the widening and depth of the
lines are affected by the rapid rotation of the stars involved.
The wide shallow absorption lines observed in the spectrum of N. G. C.
7619 have been noticed in the spectra of other extra-galactic nebulae, and
may be due to a dispersion in velocity and a blending of the spectral types
of the many stars which presumably exist in the central parts of these
nebulae. The lack of depth in the absorption lines seems to be more
pronounced among the smaller and fainter nebulae, and in N. G. C. 7619
the absorption is very weak.
It is hoped that velocities of more of these interesting objects will soon
be available.

A RELATION BETWEEN DISTANCE AND RADIAL VELOCITY
AMONG EXTRA-GALACTIC NEBULAE
By EDWIN HUBBLE
MOUNT WILSON OBSERVATORY, CARNEGIE INSTITUTION OF WASHINGTON
Communicated January 17, 1929
Determinations of the motion of the sun with respect to the extra-
galactic nebulae have involved a K term of several hundred kilometers
which appears to be variable. Explanations of this paradox have been
sought in a correlation between apparent radial velocities and distances,
but so far the results have not been convincing. The present paper is a
re-examination of the question, based on only those nebular distances
which are believed to be fairly reliable.
Distances of extra-galactic nebulae depend ultimately upon the appli-
cation of absolute-luminosity criteria to involved stars whose types can
be recognized. These include, among others, Cepheid variables, novae,
and blue stars involved in emission nebulosity. Numerical values depend
upon the zero point of the period-luminosity relation among Cepheids,
the other criteria merely check the order of the distances. This method
is restricted to the few nebulae which are well resolved by existing instru-
ments. A study of these nebulae, together with those in which any stars
at all can be recognized, indicates the probability of an approximately
uniform upper limit to the absolute luminosity of stars, in the late-type
spirals and irregular nebulae at least, of the order of M (photographic) =
-6.3.¹ The apparent luminosities of the brightest stars in such nebulae
are thus criteria which, although rough and to be applied with caution,
VOL. 15, 1929 ASTRONOMY: E. HUBBLE 169
furnish reasonable estimates of the distances of all extra-galactic systems
in which even a few stars can be detected.
TABLE 1
NEBULAE WHOSE DISTANCES HAVE BEEN ESTIMATED FROM STARS INVOLVED OR FROM
MEAN LUMINOSITIES IN A CLUSTER
OBJECT            m_s      r        v       m_t     M_t
S. Mag.            ..     0.032   + 170     1.5    -16.0
L. Mag.            ..     0.034   + 290     0.5    -17.2
N. G. C. 6822      ..     0.214   - 130     9.0    -12.7
          598      ..     0.263   -  70     7.0    -15.1
          221      ..     0.275   - 185     8.8    -13.4
          224      ..     0.275   - 220     5.0    -17.2
         5457     17.0    0.45    + 200     9.9    -13.3
         4736     17.3    0.5     + 290     8.4    -15.1
         5194     17.3    0.5     + 270     7.4    -16.1
         4449     17.8    0.63    + 200     9.5    -14.5
         4214     18.3    0.8     + 300    11.3    -13.2
         3031     18.5    0.9     -  30     8.3    -16.4
         3627     18.5    0.9     + 650     9.1    -15.7
         4826     18.5    0.9     + 150     9.0    -15.7
         5236     18.5    0.9     + 500    10.4    -14.4
         1068     18.7    1.0     + 920     9.1    -15.9
         5055     19.0    1.1     + 450     9.6    -15.6
         7331     19.0    1.1     + 500    10.4    -14.8
         4258     19.5    1.4     + 500     8.7    -17.0
         4151     20.0    1.7     + 960    12.0    -14.2
         4382      ..     2.0     + 500    10.0    -16.5
         4472      ..     2.0     + 850     8.8    -17.7
         4486      ..     2.0     + 800     9.7    -16.8
         4649      ..     2.0     +1090     9.5    -17.0
Mean                                               -15.5
m_s = photographic magnitude of brightest stars involved.
r = distance in units of 10^6 parsecs. The first two are Shapley's values.
v = measured velocities in km./sec. N. G. C. 6822, 221, 224 and 5457 are recent
determinations by Humason.
m_t = Holetschek's visual magnitude as corrected by Hopmann. The first three
objects were not measured by Holetschek, and the values of m_t represent
estimates by the author based upon such data as are available.
M_t = total visual absolute magnitude computed from m_t and r.

Finally, the nebulae themselves appear to be of a definite order of
absolute luminosity, exhibiting a range of four or five magnitudes about
an average value M (visual) = -15.2.¹ The application of this statistical
average to individual cases can rarely be used to advantage, but where
considerable numbers are involved, and especially in the various clusters
of nebulae, mean apparent luminosities of the nebulae themselves offer
reliable estimates of the mean distances.
Radial velocities of 46 extra-galactic nebulae are now available, but
individual distances are estimated for only 24. For one other, N. G. C.
3521, an estimate could probably be made, but no photographs are avail-
able at Mount Wilson. The data are given in table 1. The first seven
distances are the most reliable, depending, except for M 32 the companion of
M 31, upon extensive investigations of many stars involved. The next
thirteen distances, depending upon the criterion of a uniform upper limit
of stellar luminosity, are subject to considerable probable errors but are
believed to be the most reasonable values at present available. The last
four objects appear to be in the Virgo Cluster. The distance assigned
to the cluster, 2 × 10^6 parsecs, is derived from the distribution of nebular
luminosities, together with luminosities of stars in some of the later-type
spirals, and differs somewhat from the Harvard estimate of ten million
light years.²
The data in the table indicate a linear correlation between distances and
velocities, whether the latter are used directly or corrected for solar motion,
according to the older solutions. This suggests a new solution for the solar
motion in which the distances are introduced as coefficients of the K term,
i. e., the velocities are assumed to vary directly with the distances, and
hence K represents the velocity at unit distance due to this effect. The
equations of condition then take the form
$$rK + X \cos\alpha \cos\delta + Y \sin\alpha \cos\delta + Z \sin\delta = v.$$
Two solutions have been made, one using the 24 nebulae individually,
the other combining them into 9 groups according to proximity in direc-
tion and in distance. The results are
            24 OBJECTS          9 GROUPS
X          - 65 ± 50           +  3 ± 70
Y          +226 ± 95           +230 ± 120
Z          -195 ± 40           -133 ± 70
K          +465 ± 50           +513 ± 60 km./sec. per 10^6 parsecs.
A           286°                269°
D          + 40°               + 33°
V_0         306 km./sec.        247 km./sec.

For such scanty material, so poorly distributed, the results are fairly
definite. Differences between the two solutions are due largely to the
four Virgo nebulae, which, being the most distant objects and all sharing
the peculiar motion of the cluster, unduly influence the value of K and
hence of V_0. New data on more distant objects will be required to reduce
the effect of such peculiar motion. Meanwhile round numbers, inter-
mediate between the two solutions, will represent the probable order of
the values. For instance, let A = 277°, D = +36° (Gal. long. = 32°,
lat. = +18°), V_0 = 280 km./sec., K = +500 km./sec. per million par-
secs. Mr. Stromberg has very kindly checked the general order of these
values by independent solutions for different groupings of the data.
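As a rough illustration of how the distance coefficient K is fixed by the data, the following sketch fits the single-parameter relation v = K r to the 24 pairs (r, v) of table 1 by least squares. It is a simplification and not the solution described above: the solar-motion terms X, Y, Z are omitted because the positions of the nebulae are not tabulated here, so the value obtained should not be expected to reproduce +465 or +513. The names `TABLE_1`, `slope_through_origin` and `K_SIMPLE` are illustrative only.

```python
# Distances r (units of 10^6 parsecs) and velocities v (km/sec) from table 1.
TABLE_1 = [
    (0.032, 170), (0.034, 290), (0.214, -130), (0.263, -70),
    (0.275, -185), (0.275, -220), (0.45, 200), (0.5, 290),
    (0.5, 270), (0.63, 200), (0.8, 300), (0.9, -30),
    (0.9, 650), (0.9, 150), (0.9, 500), (1.0, 920),
    (1.1, 450), (1.1, 500), (1.4, 500), (1.7, 960),
    (2.0, 500), (2.0, 850), (2.0, 800), (2.0, 1090),
]


def slope_through_origin(pairs):
    """Least-squares K for v = K * r, with no intercept and no solar-motion terms."""
    numerator = sum(r * v for r, v in pairs)
    denominator = sum(r * r for r, _ in pairs)
    return numerator / denominator


K_SIMPLE = slope_through_origin(TABLE_1)
print(f"K with solar motion ignored: {K_SIMPLE:.0f} km/sec per 10^6 parsecs")
```

Because the four Virgo nebulae carry the largest values of r, they dominate both sums, which is the sensitivity remarked upon in the text.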
A constant term, introduced into the equations, was found to be small
and negative. This seems to dispose of the necessity for the old constant
K term. Solutions of this sort have been published by Lundmark,³ who
replaced the old K by k + lr + mr^2. His favored solution gave k = 513,
as against the former value of the order of 700, and hence offered little
advantage.
TABLE 2
NEBULAE WHOSE DISTANCES ARE ESTIMATED FROM RADIAL VELOCITIES
OBJECT             v       v_s      r      m_t     M_t
N. G. C.  278    + 650    -110    1.52    12.0    -13.9
          404    -  25    - 65     ..     11.1      ..
          584    +1800    + 75    3.45    10.9    -16.8
          936    +1300    +115    2.37    11.1    -15.7
         1023    + 300    - 10    0.62    10.2    -13.8
         1700    + 800    +220    1.16    12.5    -12.8
         2681    + 700    - 10    1.42    10.7    -15.0
         2683    + 400    + 65    0.67     9.9    -14.3
         2841    + 600    - 20    1.24     9.4    -16.1
         3034    + 290    -105    0.79     9.0    -15.5
         3115    + 600    +105    1.00     9.5    -15.5
         3368    + 940    + 70    1.74    10.0    -16.2
         3379    + 810    + 65    1.49     9.4    -16.4
         3489    + 600    + 50    1.10    11.2    -14.0
         3521    + 730    + 95    1.27    10.1    -15.4
         3623    + 800    + 35    1.53     9.9    -16.0
         4111    + 800    - 95    1.79    10.1    -16.1
         4526    + 580    - 20    1.20    11.1    -14.3
         4565    +1100    - 75    2.35    11.0    -15.9
         4594    +1140    + 25    2.23     9.1    -17.6
         5005    + 900    -130    2.06    11.1    -15.5
         5866    + 650    -215    1.73    11.7    -14.5
Mean                                      10.5    -15.3

The residuals for the two solutions given above average 150 and 110
km./sec. and should represent the average peculiar motions of the in-
dividual nebulae and of the groups, respectively. In order to exhibit
the results in a graphical form, the solar motion has been eliminated from
the observed velocities and the remainders, the distance terms plus the
residuals, have been plotted against the distances. The run of the re-
siduals is about as smooth as can be expected, and in general the form of
the solutions appears to be adequate.
The 22 nebulae for which distances are not available can be treated in
two ways. First, the mean distance of the group derived from the mean
apparent magnitudes can be compared with the mean of the velocities
corrected for solar motion. The result, 745 km./sec. for a distance of
1.4 × 10^6 parsecs, falls between the two previous solutions and indicates
a value for K of 530 as against the proposed value, 500 km./sec.
Secondly, the scatter of the individual nebulae can be examined by
assuming the relation between distances and velocities as previously
determined. Distances can then be calculated from the velocities cor-
rected for solar motion, and absolute magnitudes can be derived from the
apparent magnitudes. The results are given in table 2 and may be
compared with the distribution of absolute magnitudes among the nebulae
in table 1, whose distances are derived from other criteria. N. G. C. 404

[Figure 1: velocity in km./sec. (vertical axis) plotted against distance in parsecs (horizontal axis, 0 to 2 × 10^6 parsecs).]
FIGURE 1
Velocity-Distance Relation among Extra-Galactic Nebulae.
Radial velocities, corrected for solar motion, are plotted against
distances estimated from involved stars and mean luminosities of
nebulae in a cluster. The black discs and full line represent the
solution for solar motion using the nebulae individually; the circles
and broken line represent the solution combining the nebulae into
groups; the cross represents the mean velocity corresponding to
the mean distance of 22 nebulae whose distances could not be esti-
mated individually.

can be excluded, since the observed velocity is so small that the peculiar
motion must be large in comparison with the distance effect. The object
is not necessarily an exception, however, since a distance can be assigned
for which the peculiar motion and the absolute magnitude are both within
the range previously determined. The two mean magnitudes, -15.3
and -15.5, the ranges, 4.9 and 5.0 mag., and the frequency distributions
are closely similar for these two entirely independent sets of data; and
even the slight difference in mean magnitudes can be attributed to the
selected, very bright, nebulae in the Virgo Cluster. This entirely unforced
agreement supports the validity of the velocity-distance relation in a very
evident matter. Finally, it is worth recording that the frequency distribu-
tion of absolute magnitudes in the two tables combined is comparable
with those found in the various clusters of nebulae.
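The entries of table 2 can be checked with a short computation. The sketch below assumes, as the tabulated figures suggest, that the distance is r = (v - v_s)/K with the round value K = 500, and that the absolute magnitude follows from the usual distance-modulus relation M = m + 5 - 5 log10(d), with d in parsecs. The function names are illustrative and are not taken from the paper.

```python
import math

K = 500.0  # km/sec per 10^6 parsecs, the round value adopted in the text


def distance_mpc(v, v_s, k=K):
    """Distance in units of 10^6 parsecs from the velocity corrected for solar motion."""
    return (v - v_s) / k


def absolute_magnitude(m, r_mpc):
    """Distance-modulus relation M = m + 5 - 5 log10(d), with d in parsecs."""
    return m + 5.0 - 5.0 * math.log10(r_mpc * 1.0e6)


# (N. G. C. number, v, v_s, m_t) for three rows of table 2
rows = [(278, 650, -110, 12.0), (584, 1800, 75, 10.9), (4594, 1140, 25, 9.1)]
for ngc, v, v_s, m_t in rows:
    r = distance_mpc(v, v_s)
    print(ngc, round(r, 2), round(absolute_magnitude(m_t, r), 1))
```

For these three rows the computed distances and magnitudes agree with the tabulated r and M_t to the precision printed.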
The results establish a roughly linear relation between velocities and
distances among nebulae for which velocities have been previously pub-
lished, and the relation appears to dominate the distribution of velocities.
In order to investigate the matter on a much larger scale, Mr. Humason
at Mount Wilson has initiated a program of determining velocities
of the most distant nebulae that can be observed with confidence.
These, naturally, are the brightest nebulae in clusters of nebulae.
The first definite result,⁴ v = +3779 km./sec. for N. G. C. 7619, is
thoroughly consistent with the present conclusions. Corrected for the
solar motion, this velocity is +3910, which, with K = 500, corresponds to
a distance of 7.8 × 10^6 parsecs. Since the apparent magnitude is 11.8,
the absolute magnitude at such a distance is -17.65, which is of the
right order for the brightest nebulae in a cluster. A preliminary dis-
tance, derived independently from the cluster of which this nebula appears
to be a member, is of the order of 7 × 10^6 parsecs.
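The arithmetic for N. G. C. 7619 may be written out as follows, assuming the usual distance-modulus relation with the distance r expressed in parsecs (the relation itself is not stated in the text):

```latex
\begin{aligned}
r &= \frac{v}{K} = \frac{3910}{500} \approx 7.8 \times 10^{6}\ \text{parsecs},\\
M &= m + 5 - 5\log_{10} r
   = 11.8 + 5 - 5\log_{10}\!\left(7.8 \times 10^{6}\right) \approx -17.66 .
\end{aligned}
```

This agrees with the quoted value of -17.65 to within rounding.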
New data to be expected in the near future may modify the significance
of the present investigation or, if confirmatory, will lead to a solution
having many times the weight. For this reason it is thought premature
to discuss in detail the obvious consequences of the present results. For
example, if the solar motion with respect to the clusters represents the
rotation of the galactic system, this motion could be subtracted from the
results for the nebulae and the remainder would represent the motion of
the galactic system with respect to the extra-galactic nebulae.
The outstanding feature, however, is the possibility that the velocity-
distance relation may represent the de Sitter effect, and hence that numer-
ical data may be introduced into discussions of the general curvature of
space. In the de Sitter cosmology, displacements of the spectra arise
from two sources, an apparent slowing down of atomic vibrations and a
general tendency of material particles to scatter. The latter involves an
acceleration and hence introduces the element of time. The relative im-
portance of these two effects should determine the form of the relation
between distances and observed velocities; and in this connection it may
be emphasized that the linear relation found in the present discussion is a
first approximation representing a restricted range in distance.
¹ Mt. Wilson Contr., No. 324; Astroph. J., Chicago, Ill., 64, 1926 (321).
² Harvard Coll. Obs. Circ., 294, 1926.
³ Mon. Not. R. Astr. Soc., 85, 1925 (865-894).
⁴ These PROCEEDINGS, 15, 1929 (167).
230 A. M. TURING [Nov. 12,

ON COMPUTABLE NUMBERS, WITH AN APPLICATION TO
THE ENTSCHEIDUNGSPROBLEM

By A. M. TURING.

[Received 28 May, 1936.—Read 12 November, 1936.]

The "computable" numbers may be described briefly as the real
numbers whose expressions as a decimal are calculable by finite means.
Although the subject of this paper is ostensibly the computable numbers,
it is almost equally easy to define and investigate computable functions
of an integral variable or a real or computable variable, computable
predicates, and so forth. The fundamental problems involved are,
however, the same in each case, and I have chosen the computable numbers
for explicit treatment as involving the least cumbrous technique. I hope
shortly to give an account of the relations of the computable numbers,
functions, and so forth to one another. This will include a development
of the theory of functions of a real variable expressed in terms of com-
putable numbers. According to my definition, a number is computable
if its decimal can be written down by a machine.
In §§ 9, 10 I give some arguments with the intention of showing that the
computable numbers include all numbers which could naturally be
regarded as computable. In particular, I show that certain large classes
of numbers are computable. They include, for instance, the real parts of
all algebraic numbers, the real parts of the zeros of the Bessel functions,
the numbers π, e, etc. The computable numbers do not, however, include
all definable numbers, and an example is given of a definable number
which is not computable.
Although the class of computable numbers is so great, and in many
ways similar to the class of real numbers, it is nevertheless enumerable.
In § 8 I examine certain arguments which would seem to prove the contrary.
By the correct application of one of these arguments, conclusions are
reached which are superficially similar to those of Gödel†. These results

† Gödel, "Über formal unentscheidbare Sätze der Principia Mathematica und
verwandter Systeme, I", Monatshefte Math. Phys., 38 (1931), 173-198.
1936.] ON COMPUTABLE NUMBERS. 231

have valuable applications. In particular, it is shown (§ 11) that the
Hilbertian Entscheidungsproblem can have no solution.
In a recent paper Alonzo Church† has introduced an idea of "effective
calculability", which is equivalent to my "computability", but is very
differently defined. Church also reaches similar conclusions about the
Entscheidungsproblem‡. The proof of equivalence between "computa-
bility" and "effective calculability" is outlined in an appendix to the
present paper.

1. Computing machines.
We have said that the computable numbers are those whose decimals
are calculable by finite means. This requires rather more explicit
definition. No real attempt will be made to justify the definitions given
until we reach § 9. For the present I shall only say that the justification
lies in the fact that the human memory is necessarily limited.
We may compare a man in the process of computing a real number to a
machine which is only capable of a finite number of conditions q_1, q_2, ..., q_R
which will be called "m-configurations". The machine is supplied with a
"tape" (the analogue of paper) running through it, and divided into
sections (called "squares") each capable of bearing a "symbol". At
any moment there is just one square, say the r-th, bearing the symbol 𝔖(r)
which is "in the machine". We may call this square the "scanned
square". The symbol on the scanned square may be called the "scanned
symbol". The "scanned symbol" is the only one of which the machine
is, so to speak, "directly aware". However, by altering its m-configu-
ration the machine can effectively remember some of the symbols which
it has "seen" (scanned) previously. The possible behaviour of the
machine at any moment is determined by the m-configuration q_n and the
scanned symbol 𝔖(r). This pair q_n, 𝔖(r) will be called the "configuration":
thus the configuration determines the possible behaviour of the machine.
In some of the configurations in which the scanned square is blank (i.e.
bears no symbol) the machine writes down a new symbol on the scanned
square: in other configurations it erases the scanned symbol. The
machine may also change the square which is being scanned, but only by
shifting it one place to right or left. In addition to any of these operations
the m-configuration may be changed. Some of the symbols written down

† Alonzo Church, "An unsolvable problem of elementary number theory", American
J. of Math., 58 (1936), 345-363.
‡ Alonzo Church, "A note on the Entscheidungsproblem", J. of Symbolic Logic, 1
(1936), 40-41.
will form the sequence of figures which is the decimal of the real number
which is being computed. The others are just rough notes to "assist the
memory". It will only be these rough notes which will be liable to erasure.
It is my contention that these operations include all those which are used
in the computation of a number. The defence of this contention will be
easier when the theory of the machines is familiar to the reader. In the
next section I therefore proceed with the development of the theory and
assume that it is understood what is meant by "machine", "tape",
"scanned", etc.

2. Definitions.
Automatic machines.
If at each stage the motion of a machine (in the sense of § 1) is completely
determined by the configuration, we shall call the machine an "auto-
matic machine" (or a-machine).
For some purposes we might use machines (choice machines or
c-machines) whose motion is only partially determined by the configuration
(hence the use of the word "possible" in §1). When such a machine
reaches one of these ambiguous configurations, it cannot go on until some
arbitrary choice has been made by an external operator. This would be the
case if we were using machines to deal with axiomatic systems. In this
paper I deal only with automatic machines, and will therefore often omit
the prefix a-.

Computing machines.
If an a-machine prints two kinds of symbols, of which the first kind
(called figures) consists entirely of 0 and 1 (the others being called symbols of
the second kind), then the machine will be called a computing machine.
If the machine is supplied with a blank tape and set in motion, starting
from the correct initial m-configuration, the subsequence of the symbols
printed by it which are of the first kind will be called the sequence computed
by the machine. The real number whose expression as a binary decimal is
obtained by prefacing this sequence by a decimal point is called the
number computed by the machine.
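The passage from a computed sequence to the number computed can be illustrated on a finite prefix. The sketch below, in which the function name `number_from_figures` is merely illustrative, evaluates the binary fraction formed by prefacing a list of figures with a point, and applies it to the first twenty figures of 010101...; since only a prefix is used, the result is an approximation to the number computed and not the number itself.

```python
from fractions import Fraction


def number_from_figures(figures):
    """Exact value of the binary fraction 0.f1 f2 f3 ... for a finite list of figures."""
    total = Fraction(0)
    for position, figure in enumerate(figures, start=1):
        if figure not in (0, 1):
            raise ValueError("figures must be 0 or 1")
        total += Fraction(figure, 2 ** position)
    return total


prefix = [0, 1] * 10  # first 20 figures of the sequence 010101...
approx = number_from_figures(prefix)
print(approx, float(approx))
```

The longer the prefix, the closer the value lies to 1/3, which is the number whose binary expansion is .010101...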
At any stage of the motion of the machine, the number of the scanned
square, the complete sequence of all symbols on the tape, and the
m-configuration will be said to describe the complete configuration at that
stage. The changes of the machine and tape between successive complete
configurations will be called the moves of the machine.
Circular and circle-free machines.


If a computing machine never writes down more than a finite number
of symbols of the first kind, it will be called circular. Otherwise it is said to
be circle-free.
A machine will be circular if it reaches a configuration from which there
is no possible move, or if it goes on moving, and possibly printing symbols
of the second kind, but cannot print any more symbols of the first kind.
The significance of the term "circular" will be explained in §8.
Computable sequences and numbers.
A sequence is said to be computable if it can be computed by a circle-free
machine. A number is computable if it differs by an integer from the
number computed by a circle-free machine.
We shall avoid confusion by speaking more often of computable
sequences than of computable numbers.

3. Examples of computing machines.


I. A machine can be constructed to compute the sequence 010101....
The machine is to have the four m-configurations "b", "c", "e", "f"
and is capable of printing "0" and "1". The behaviour of the machine is
described in the following table in which "R" means "the machine moves
so that it scans the square immediately on the right of the one it was
scanning previously". Similarly for "L". "E" means "the scanned
symbol is erased" and "P" stands for "prints". This table (and all
succeeding tables of the same kind) is to be understood to mean that for
a configuration described in the first two columns the operations in the
third column are carried out successively, and the machine then goes over
into the m-configuration described in the last column. When the second
column is left blank, it is understood that the behaviour of the third and
fourth columns applies for any symbol and for no symbol. The machine
starts in the m-configuration b with a blank tape.

    Configuration              Behaviour
m-config.   symbol      operations   final m-config.
b           None        P0, R        c
c           None        R            e
e           None        P1, R        f
f           None        R            b
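
The table for example I can be run mechanically. The following Python sketch is one possible reading of it, assuming the four m-configurations are named b, c, e, f and that None stands for a blank square; the names TABLE_I, run and figures are hypothetical helpers, not Turing's notation.

```python
# (m-config, scanned symbol) -> (operations, final m-config); None = blank square.
TABLE_I = {
    ("b", None): (["P0", "R"], "c"),
    ("c", None): (["R"], "e"),
    ("e", None): (["P1", "R"], "f"),
    ("f", None): (["R"], "b"),
}

def run(table, start, moves):
    """Carry out `moves` moves from a blank tape; return {square: symbol}."""
    tape, position, state = {}, 0, start
    for _ in range(moves):
        operations, state = table[(state, tape.get(position))]
        for op in operations:
            if op == "R":
                position += 1
            elif op == "L":
                position -= 1
            elif op == "E":
                tape.pop(position, None)
            else:                      # "P0", "P1": print the symbol after "P"
                tape[position] = op[1:]
    return tape

def figures(tape):
    """Symbols of the first kind (0 and 1), read from left to right."""
    return "".join(tape[i] for i in sorted(tape) if tape[i] in ("0", "1"))

print(figures(run(TABLE_I, "b", 8)))   # prints 0101
```

Eight moves are two passes through the four m-configurations, so four figures are printed, on alternate squares.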
234 A. M. TURING [NOV. 12,

If (contrary to the description in §1) we allow the letters L, R to appear
more than once in the operations column we can simplify the table
considerably.

m-config.   symbol      operations   final m-config.
b           None        P0           b
b           0           R, R, P1     b
b           1           R, R, P0     b

II. As a slightly more difficult example we can construct a machine to
compute the sequence 001011011101111011111.... The machine is to
be capable of five m-configurations, viz. "o", "q", "p", "f", "b" and of
printing "ə", "x", "0", "1". The first three symbols on the tape will
be "əə0"; the other figures follow on alternate squares. On the
intermediate squares we never print anything but "x". These letters serve to
"keep the place" for us and are erased when we have finished with them.
We also arrange that in the sequence of figures on alternate squares there
shall be no blanks.

    Configuration                       Behaviour
m-config.   symbol          operations                          final m-config.
b                           Pə, R, Pə, R, P0, R, R, P0, L, L    o
o           1               R, Px, L, L, L                      o
o           0                                                   q
q           Any (0 or 1)    R, R                                q
q           None            P1, L                               p
p           x               E, R                                q
p           ə               R                                   f
p           None            L, L                                p
f           Any             R, R                                f
f           None            P0, L, L                            o
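
The behaviour of machine II can be checked by simulation. The sketch below is a Python reading of the table under stated assumptions: the m-configurations are named b, o, q, p, f, the place-holding symbol is written "ə", None stands for a blank square, and "Any" matches any non-blank symbol. The names TABLE_II, run and figures are hypothetical.

```python
# (m-config, scanned symbol) -> (operations, final m-config).
TABLE_II = {
    ("b", None):  ("Pə R Pə R P0 R R P0 L L", "o"),
    ("o", "1"):   ("R Px L L L", "o"),
    ("o", "0"):   ("", "q"),
    ("q", "Any"): ("R R", "q"),
    ("q", None):  ("P1 L", "p"),
    ("p", "x"):   ("E R", "q"),
    ("p", "ə"):   ("R", "f"),
    ("p", None):  ("L L", "p"),
    ("f", "Any"): ("R R", "f"),
    ("f", None):  ("P0 L L", "o"),
}

def run(table, start, moves):
    """Carry out `moves` moves from a blank tape; return {square: symbol}."""
    tape, position, state = {}, 0, start
    for _ in range(moves):
        symbol = tape.get(position)
        key = (state, symbol)
        if key not in table and symbol is not None:
            key = (state, "Any")       # fall back to the "Any" line
        operations, state = table[key]
        for op in operations.split():
            if op == "R":
                position += 1
            elif op == "L":
                position -= 1
            elif op == "E":
                tape.pop(position, None)
            else:                      # "Px": print the symbol after "P"
                tape[position] = op[1:]
    return tape

def figures(tape):
    """Symbols of the first kind (0 and 1), read from left to right."""
    return "".join(tape[i] for i in sorted(tape) if tape[i] in ("0", "1"))

print(figures(run(TABLE_II, "b", 200)))
```

Because figures are only ever added at the right-hand end and never erased, the string of figures on the tape after any number of moves is a prefix of the sequence being computed.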

To illustrate the working of this machine a table is given below of the
first few complete configurations. These complete configurations are
described by writing down the sequence of symbols which are on the tape,
with the m-configuration written below the scanned symbol. The
successive complete configurations are separated by colons.

: :əə0 0:əə0 0:əə0 0:əə0 0  :əə0 0 1:
 b   o     q       q       q      p
əə0 0 1:əə0 0 1:əə0 0 1:əə0 0 1:
   p     p        f         f
əə0 0 1:əə0 0 1  :əə0 0 1 0:
      f         f       o
əə0 0 1x0: ....
    o

This table could also be written in the form
b:əəo0 0:əəq0 0:..., (C)
in which a space has been made on the left of the scanned symbol and the
m-configuration written in this space. This form is less easy to follow, but
we shall make use of it later for theoretical purposes.
The convention of writing the figures only on alternate squares is very
useful: I shall always make use of it. I shall call the one sequence of
alternate squares F-squares and the other sequence E-squares. The symbols on
E-squares will be liable to erasure. The symbols on F-squares form a
continuous sequence. There are no blanks until the end is reached. There
is no need to have more than one E-square between each pair of F-squares:
an apparent need of more E-squares can be satisfied by having a sufficiently
rich variety of symbols capable of being printed on E-squares. If a
symbol β is on an F-square S and a symbol α is on the E-square next on the
right of S, then S and β will be said to be marked with α. The
process of printing this α will be called marking β (or S) with α.

4. Abbreviated tables.
There are certain types of process used by nearly all machines, and
these, in some machines, are used in many connections. These processes
include copying down sequences of symbols, comparing sequences, erasing
all symbols of a given form, etc. Where such processes are concerned we
can abbreviate the tables for the m-configurations considerably by the use
of "skeleton tables". In skeleton tables there appear capital German
letters and small Greek letters. These are of the nature of "variables '".
By replacing each capital German letter throughout by an ^^-configuration
236 A. M. TURING [Nov. 12,

and each small Greek letter by a symbol, we obtain the table for an
m-configuration.
The skeleton tables are to be regarded as nothing but abbreviations:
they are not essential. So long as the reader understands how to obtain
the complete tables from the skeleton tables, there is no need to give any
exact definitions in this connection.
Let us consider an example:

m-config.       Symbol   Behaviour   Final m-config.
f(ℭ, 𝔅, α)      ə        L           f1(ℭ, 𝔅, α)
f(ℭ, 𝔅, α)      not ə    L           f(ℭ, 𝔅, α)
f1(ℭ, 𝔅, α)     α                    ℭ
f1(ℭ, 𝔅, α)     not α    R           f1(ℭ, 𝔅, α)
f1(ℭ, 𝔅, α)     None     R           f2(ℭ, 𝔅, α)
f2(ℭ, 𝔅, α)     α                    ℭ
f2(ℭ, 𝔅, α)     not α    R           f1(ℭ, 𝔅, α)
f2(ℭ, 𝔅, α)     None     R           𝔅

From the m-configuration f(ℭ, 𝔅, α) the machine finds the symbol of form α
which is farthest to the left (the "first α") and the m-configuration then
becomes ℭ. If there is no α then the m-configuration becomes 𝔅.

If we were to replace ℭ throughout by q (say), 𝔅 by r, and α by x, we
should have a complete table for the m-configuration f(q, r, x). f is called
an "m-configuration function" or "m-function".
The only expressions which are admissible for substitution in an
m-function are the m-configurations and symbols of the machine. These
have to be enumerated more or less explicitly: they may include expressions
such as p(e, x); indeed they must if there are any m-functions used at all.
If we did not insist on this explicit enumeration, but simply stated that
the machine had certain m-configurations (enumerated) and all
m-configurations obtainable by substitution of m-configurations in certain
m-functions, we should usually get an infinity of m-configurations; e.g., we might
say that the machine was to have the m-configuration q and all
m-configurations obtainable by substituting an m-configuration for ℭ in p(ℭ). Then
it would have q, p(q), p(p(q)), p(p(p(q))), ... as m-configurations.
Our interpretation rule then is this. We are given the names of the
m-configurations of the machine, mostly expressed in terms of m-functions.
We are also given skeleton tables. All we want is the complete table for
the m-configurations of the machine. This is obtained by repeated
substitution in the skeleton tables.
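
The interpretation rule, repeated substitution in a skeleton table, can be sketched in Python. The function below generates the complete rows for f(ℭ, 𝔅, α) once an m-configuration is substituted for each German letter and a symbol for the Greek letter. It assumes the usual reading of the skeleton table for f (three m-configurations f, f1, f2); the function name f_table and the tuple layout are illustrative choices only.

```python
def f_table(C, B, alpha):
    """Rows (m-config, symbol, operations, final m-config) for f(C, B, alpha)."""
    f = f"f({C},{B},{alpha})"
    f1 = f"f1({C},{B},{alpha})"
    f2 = f"f2({C},{B},{alpha})"
    return [
        (f,  "ə",            ["L"], f1),
        (f,  "not ə",        ["L"], f),
        (f1, alpha,          [],    C),
        (f1, f"not {alpha}", ["R"], f1),
        (f1, "None",         ["R"], f2),
        (f2, alpha,          [],    C),
        (f2, f"not {alpha}", ["R"], f1),
        (f2, "None",         ["R"], B),
    ]

# Substituting q for the first German letter, r for the second, x for alpha.
for row in f_table("q", "r", "x"):
    print(row)
```

Nesting works in the same way: an argument such as "e1(q,b,x)" is simply another string naming an m-configuration, so a finite list of enumerated names yields a finite complete table.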

Further examples.
(In the explanations the symbol "→" is used to signify "the machine
goes into the m-configuration. . . .")

m-config.       Symbol   Behaviour   Final m-config.
e(ℭ, 𝔅, α)                           f(e1(ℭ, 𝔅, α), 𝔅, α)
e1(ℭ, 𝔅, α)              E           ℭ

From e(ℭ, 𝔅, α) the first α is erased and → ℭ. If there is no α, → 𝔅.

e(𝔅, α)                              e(e(𝔅, α), 𝔅, α)

From e(𝔅, α) all letters α are erased and → 𝔅.

The last example seems somewhat more difficult to interpret than
most. Let us suppose that in the list of m-configurations of some machine
there appears e(b, x) (= q, say). The table is

e(b, x)                              e(e(b, x), b, x)

or

q                                    e(q, b, x).

Or, in greater detail:

q                                    e(q, b, x)
e(q, b, x)                           f(e1(q, b, x), b, x)
e1(q, b, x)              E           q.

In this we could replace e1(q, b, x) by q' and then give the table for f (with
the right substitutions) and eventually reach a table in which no
m-functions appeared.

m-config.       Symbol   Behaviour   Final m-config.
pe(ℭ, β)                             f(pe1(ℭ, β), ℭ, ə)
pe1(ℭ, β)       Any      R, R        pe1(ℭ, β)
pe1(ℭ, β)       None     Pβ          ℭ

From pe(ℭ, β) the machine prints β at the end of the sequence of symbols
and → ℭ.

l(ℭ)                     L           ℭ
r(ℭ)                     R           ℭ
f'(ℭ, 𝔅, α)                          f(l(ℭ), 𝔅, α)
f''(ℭ, 𝔅, α)                         f(r(ℭ), 𝔅, α)

From f'(ℭ, 𝔅, α) it does the same as for f(ℭ, 𝔅, α) but moves to the
left before → ℭ.

c(ℭ, 𝔅, α)                           f'(c1(ℭ), 𝔅, α)
c1(ℭ)           β                    pe(ℭ, β)

c(ℭ, 𝔅, α). The machine writes at the end the first symbol marked α
and → ℭ.

The last line stands for the totality of lines obtainable from it by
replacing β by any symbol which may occur on the tape of the machine
concerned.

m-config.       Symbol   Behaviour   Final m-config.
ce(ℭ, 𝔅, α)                          c(e(ℭ, 𝔅, α), 𝔅, α)
ce(𝔅, α)                             ce(ce(𝔅, α), 𝔅, α)

ce(𝔅, α). The machine copies down in order at the end all symbols
marked α and erases the letters α; → 𝔅.

re(ℭ, 𝔅, α, β)                       f(re1(ℭ, 𝔅, α, β), 𝔅, α)
re1(ℭ, 𝔅, α, β)          E, Pβ       ℭ

re(ℭ, 𝔅, α, β). The machine replaces the first α by β and → ℭ.
→ 𝔅 if there is no α.

re(𝔅, α, β)                          re(re(𝔅, α, β), 𝔅, α, β)

re(𝔅, α, β). The machine replaces all letters α by β; → 𝔅.

cr(ℭ, 𝔅, α)                          c(re(ℭ, 𝔅, α, a), 𝔅, α)
cr(𝔅, α)                             cr(cr(𝔅, α), re(𝔅, a, α), α)

cr(𝔅, α) differs from ce(𝔅, α) only in that the letters α are not
erased. The m-configuration cr(𝔅, α) is taken up when no letters "a"
are on the tape.

m-config.          Symbol   Behaviour   Final m-config.
cp(ℭ, 𝔄, 𝔈, α, β)                       f'(cp1(ℭ, 𝔄, β), f(𝔄, 𝔈, β), α)
cp1(ℭ, 𝔄, β)       γ                    f'(cp2(ℭ, 𝔄, γ), 𝔄, β)
cp2(ℭ, 𝔄, γ)       γ                    ℭ
cp2(ℭ, 𝔄, γ)       not γ                𝔄

The first symbol marked α and the first marked β are compared. If
there is neither α nor β, → 𝔈. If there are both and the symbols are alike,
→ ℭ. Otherwise → 𝔄.

cpe(ℭ, 𝔄, 𝔈, α, β)                      cp(e(e(ℭ, ℭ, β), ℭ, α), 𝔄, 𝔈, α, β)

cpe(ℭ, 𝔄, 𝔈, α, β) differs from cp(ℭ, 𝔄, 𝔈, α, β) in that in the case when
there is similarity the first α and β are erased.

cpe(𝔄, 𝔈, α, β)                         cpe(cpe(𝔄, 𝔈, α, β), 𝔄, 𝔈, α, β)

cpe(𝔄, 𝔈, α, β). The sequence of symbols marked α is compared with
the sequence marked β. → 𝔈 if they are similar. Otherwise → 𝔄. Some
of the symbols α and β are erased.

m-config.          Symbol   Behaviour   Final m-config.
q(ℭ)               Any      R           q(ℭ)
q(ℭ)               None     R           q1(ℭ)
q1(ℭ)              Any      R           q(ℭ)
q1(ℭ)              None                 ℭ
q(ℭ, α)                                 q(q1(ℭ, α))
q1(ℭ, α)           α                    ℭ
q1(ℭ, α)           not α    L           q1(ℭ, α)

q(ℭ, α). The machine finds the last symbol of form α. → ℭ.

pe2(ℭ, α, β)                            pe(pe(ℭ, β), α)

pe2(ℭ, α, β). The machine prints α β at the end.

ce2(𝔅, α, β)                            ce(ce(𝔅, β), α)
ce3(𝔅, α, β, γ)                         ce(ce2(𝔅, β, γ), α)

ce3(𝔅, α, β, γ). The machine copies down at the end first the symbols
marked α, then those marked β, and finally those marked γ; it erases the
symbols α, β, γ.

e(ℭ)               ə        R           e1(ℭ)
e(ℭ)               not ə    L           e(ℭ)
e1(ℭ)              Any      R, E, R     e1(ℭ)
e1(ℭ)              None                 ℭ

From e(ℭ) the marks are erased from all marked symbols. → ℭ.

5. Enumeration of computable sequences.


A computable sequence γ is determined by a description of a machine
which computes γ. Thus the sequence 001011011101111... is determined
by the table on p. 234, and, in fact, any computable sequence is capable of
being described in terms of such a table.
It will be useful to put these tables into a kind of standard form. In the
first place let us suppose that the table is given in the same form as the first
table, for example, I on p. 233. That is to say, that the entry in the operations
column is always of one of the forms E : E, R : E, L : Pα : Pα, R : Pα, L : R : L :
or no entry at all. The table can always be put into this form by
introducing more m-configurations. Now let us give numbers to the
m-configurations, calling them q_1, ..., q_R, as in §1. The initial
m-configuration is always to be called q_1. We also give numbers to the
symbols S_1, ..., S_m and, in particular, blank = S_0, 0 = S_1, 1 = S_2.
The lines of the table are now of form

m-config.   Symbol   Operations   Final m-config.
q_i         S_j      PS_k, L      q_m               (N1)
q_i         S_j      PS_k, R      q_m               (N2)
q_i         S_j      PS_k         q_m               (N3)

Lines such as

q_i         S_j      E, R         q_m

are to be written as

q_i         S_j      PS_0, R      q_m

and lines such as

q_i         S_j      R            q_m

to be written as

q_i         S_j      PS_j, R      q_m

In this way we reduce each line of the table to a line of one of the forms
(N1), (N2), (N3).
From each line of form (N1) let us form an expression q_i S_j S_k L q_m;
from each line of form (N2) we form an expression q_i S_j S_k R q_m;
and from each line of form (N3) we form an expression q_i S_j S_k N q_m.
Let us write down all expressions so formed from the table for the
machine and separate them by semi-colons. In this way we obtain a
complete description of the machine. In this description we shall replace
q_i by the letter "D" followed by the letter "A" repeated i times, and S_j by
"D" followed by "C" repeated j times. This new description of the
machine may be called the standard description (S.D). It is made up
entirely from the letters "A", "C", "D", "L", "R", "N", and from
";".
If finally we replace "A" by "1", "C" by "2", "D" by "3", "L"
by "4", "R" by "5", "N" by "6", and ";" by "7" we shall have a
description of the machine in the form of an arabic numeral. The integer
represented by this numeral may be called a description number (D.N) of
the machine. The D.N determine the S.D and the structure of the
machine uniquely. The machine whose D.N is n may be described as
M(n).

To each computable sequence there corresponds at least one description
number, while to no description number does there correspond more than
one computable sequence. The computable sequences and numbers are
therefore enumerable.
Let us find a description number for the machine I of § 3. When we
rename the m-configurations its table becomes:

q_1   S_0   PS_1, R   q_2
q_2   S_0   PS_0, R   q_3
q_3   S_0   PS_2, R   q_4
q_4   S_0   PS_0, R   q_1

Other tables could be obtained by adding irrelevant lines such as

q_1   S_1   PS_1, R   q_2

Our first standard form would be

q_1 S_0 S_1 R q_2; q_2 S_0 S_0 R q_3; q_3 S_0 S_2 R q_4; q_4 S_0 S_0 R q_1;.

The standard description is

DADDCRDAA;DAADDRDAAA;DAAADDCCRDAAAA;DAAAADDRDA;

A description number is
31332531173113353111731113322531111731111335317
and so is
3133253117311335311173111332253111173111133531731323253117
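
The passage from a table in standard form to its S.D and D.N is purely mechanical, and the two description numbers quoted above can be checked with a few lines of Python. In this sketch each line of the table is written as a tuple (i, j, k, move, m) standing for q_i S_j S_k move q_m; the names MACHINE_I, standard_description and description_number are assumptions made for the illustration.

```python
# Machine I of §3 in standard form: (i, j, k, move, m) means q_i S_j S_k <move> q_m.
MACHINE_I = [
    (1, 0, 1, "R", 2),
    (2, 0, 0, "R", 3),
    (3, 0, 2, "R", 4),
    (4, 0, 0, "R", 1),
]

# Letter-to-figure replacement used to pass from the S.D to the D.N.
DIGITS = {"A": "1", "C": "2", "D": "3", "L": "4", "R": "5", "N": "6", ";": "7"}

def standard_description(lines):
    """Build the S.D: q_i -> D followed by i A's, S_j -> D followed by j C's."""
    parts = []
    for i, j, k, move, m in lines:
        parts.append("D" + "A" * i + "D" + "C" * j + "D" + "C" * k
                     + move + "D" + "A" * m + ";")
    return "".join(parts)

def description_number(sd):
    """Replace each letter of the S.D by its figure and read the result as an integer."""
    return int("".join(DIGITS[letter] for letter in sd))

sd = standard_description(MACHINE_I)
print(sd)
print(description_number(sd))

# Adding the irrelevant line q_1 S_1 PS_1, R q_2 gives a second D.N for the same machine.
print(description_number(standard_description(MACHINE_I + [(1, 1, 1, "R", 2)])))
```

Python integers are unbounded, so the description number can be held exactly however long the table is.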
A number which is a description number of a circle-free machine will be
called a satisfactory number. In § 8 it is shown that there can be no general
process for determining whether a given number is satisfactory or not.

6. The universal computing machine.


It is possible to invent a single machine which can be used to compute
any computable sequence. If this machine U is supplied with a tape on
the beginning of which is written the S.D of some computing machine M,
then U will compute the same sequence as M. In this section I explain
in outline the behaviour of the machine. The next section is devoted to
giving the complete table for U.
Let us first suppose that we have a machine M' which will write down on
the F-squares the successive complete configurations of M. These might
be expressed in the same form as on p. 235, using the second description,
(C), with all symbols on one line. Or, better, we could transform this
description (as in §5) by replacing each m-configuration by "D" followed
by "A" repeated the appropriate number of times, and by replacing each
symbol by "D" followed by "C" repeated the appropriate number of
times. The numbers of letters "A" and "C" are to agree with the numbers
chosen in §5, so that, in particular, "0" is replaced by "DC", "1" by
"DCC", and the blanks by "D". These substitutions are to be made
after the complete configurations have been put together, as in (C).
Difficulties arise if we do the substitution first. In each complete
configuration the blanks would all have to be replaced by "D", so that the complete
configuration would not be expressed as a finite sequence of symbols.
If in the description of the machine II of §3 we replace "o" by "DAA",
"ə" by "DCCC", "q" by "DAAA", then the sequence (C) becomes:
DA:DCCCDCCCDAADCDDC:DCCCDCCCDAAADCDDC:... (C1)
(This is the sequence of symbols on F-squares.)
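
The substitution that turns a complete configuration into a string of the letters A, C, D can be sketched as follows. The dictionaries assume the numbering given in the text (b is the initial m-configuration and so becomes "DA", "o" becomes "DAA", "q" becomes "DAAA", and "ə" becomes "DCCC"); the function name encode_configuration and the representation of a configuration as (tape, position, m-configuration) are hypothetical.

```python
STATE_CODE = {"b": "DA", "o": "DAA", "q": "DAAA"}
SYMBOL_CODE = {" ": "D", "0": "DC", "1": "DCC", "ə": "DCCC"}

def encode_configuration(tape, position, state):
    """Encode one complete configuration in form (C): the m-configuration is
    inserted immediately to the left of the scanned symbol."""
    tape = tape.ljust(position)        # pad with blanks up to the scanned square
    out = []
    for index, symbol in enumerate(tape):
        if index == position:
            out.append(STATE_CODE[state])
        out.append(SYMBOL_CODE[symbol])
    if position == len(tape):          # scanned square lies just past the written symbols
        out.append(STATE_CODE[state])
    return "".join(out)

# The first three complete configurations of machine II.
configurations = [("", 0, "b"), ("əə0 0", 2, "o"), ("əə0 0", 2, "q")]
print(":".join(encode_configuration(*c) for c in configurations))
```

Only the written part of the tape is encoded, which is why the substitution has to be made after the configuration has been put together: the unwritten blanks to the right are never turned into "D".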
It is not difficult to see that if M can be constructed, then so can M'.
The manner of operation of M' could be made to depend on having the rules
of operation (i.e., the S.D) of M written somewhere within itself (i.e. within
M'); each step could be carried out by referring to these rules. We have
only to regard the rules as being capable of being taken out and ex-
changed for others and we have something very akin to the universal
machine.
One thing is lacking: at present the machine M' prints no figures. We
may correct this by printing between each successive pair of complete
configurations the figures which appear in the new configuration but not
in the old. Then (C1) becomes
DDA:0:0:DCCCDCCCDAADCDDC:DCCC... (C2)

It is not altogether obvious that the E-squares leave enough room for
the necessary "rough work", but this is, in fact, the case.
The sequences of letters between the colons in expressions such as
(C1) may be used as standard descriptions of the complete configurations.
When the letters are replaced by figures, as in §5, we shall have a numerical
description of the complete configuration, which may be called its
description number.

7. Detailed description of the universal machine.


A table is given below of the behaviour of this universal machine. The
m-configurations of which the machine is capable are all those occurring in
the first and last columns of the table, together with all those which occur
when we write out the unabbreviated tables of those which appear in the
table in the form of m-functions. E.g., e(anf) appears in the table and is an
m-function. Its unabbreviated table is (see p. 239)

e(anf)      ə        R          e1(anf)
e(anf)      not ə    L          e(anf)
e1(anf)     Any      R, E, R    e1(anf)
e1(anf)     None                anf

Consequently e1(anf) is an m-configuration of U.
When U is ready to start work the tape running through it bears on it
the symbol ə on an F-square and again ə on the next E-square; after this,
on F-squares only, comes the S.D of the machine followed by a double
colon "::" (a single symbol, on an F-square). The S.D consists of a
number of instructions, separated by semi-colons.
Each instruction consists of five consecutive parts
(i) "D" followed by a sequence of letters "A". This describes the
relevant m-configuration.
(ii) "D" followed by a sequence of letters "C". This describes the
scanned symbol.
(iii) "D" followed by another sequence of letters "C". This
describes the symbol into which the scanned symbol is to be changed.
(iv) "L", "R", or "N", describing whether the machine is to move
to left, right, or not at all.
(v) "D" followed by a sequence of letters "A". This describes the
final m-configuration.
The machine U is to be capable of printing "A", "C", "D", "0",
"1", "u", "v", "w", "x", "y", "z". The S.D is formed from ";",
"A", "C", "D", "L", "R", "N".
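
The five-part structure of an instruction lends itself to a direct check. The Python sketch below splits an S.D at its semi-colons and matches each instruction against the five parts (i) to (v); the regular expression and the function name parse_sd are illustrative assumptions, and the example string is the S.D of machine I from §5.

```python
import re

# Parts (i)-(v): D A...A, D C...C, D C...C, one of L R N, D A...A.
INSTRUCTION = re.compile(r"(DA+)(DC*)(DC*)([LRN])(DA+)")

def parse_sd(sd):
    """Return a list of (i, j, k, move, m) tuples, one per instruction of the S.D."""
    instructions = []
    for chunk in sd.split(";"):
        if not chunk:                  # text after the final semi-colon is empty
            continue
        match = INSTRUCTION.fullmatch(chunk)
        if match is None:
            raise ValueError(f"not a well-formed instruction: {chunk!r}")
        state, scanned, printed, move, final = match.groups()
        instructions.append((len(state) - 1, len(scanned) - 1,
                             len(printed) - 1, move, len(final) - 1))
    return instructions

print(parse_sd("DADDCRDAA;DAADDRDAAA;DAAADDCCRDAAAA;DAAAADDRDA;"))
```

Each part begins with "D", so the split between the scanned symbol and the printed symbol is unambiguous even when the scanned symbol is the blank, which is written as a bare "D".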

Subsidiary skeleton table.

m-config.      Symbol   Operations    Final m-config.
con(ℭ, α)      Not A    R, R          con(ℭ, α)
con(ℭ, α)      A        L, Pα, R      con1(ℭ, α)
con1(ℭ, α)     A        R, Pα, R      con1(ℭ, α)
con1(ℭ, α)     D        R, Pα, R      con2(ℭ, α)
con2(ℭ, α)     C        R, Pα, R      con2(ℭ, α)
con2(ℭ, α)     Not C    R, R          ℭ

con(ℭ, α). Starting from an F-square, S say, the sequence C of symbols
describing a configuration closest on the right of S is marked out with
letters α. → ℭ.

con(ℭ, ). In the final configuration the machine is scanning the square
which is four squares to the right of the last square of C. C is left
unmarked.

The table for U.

m-config.   Symbol         Operations                      Final m-config.
b                                                          f(b1, b1, ::)
b1                         R, R, P:, R, R, PD, R, R, PA    anf

b. The machine prints :DA on the F-squares after :: → anf.

anf                                                        g(anf1, :)
anf1                                                       con(kom, y)

anf. The machine marks the configuration in the last complete
configuration with y. → kom.

kom         ;              R, Pz, L                        con(kmp, x)
kom         z              L, L                            kom
kom         not z nor ;    L                               kom

kom. The machine finds the last semi-colon not marked with z. It marks
this semi-colon with z and the configuration following it with x.

kmp                                                        cpe(e(kom, x, y), sim, x, y)

kmp. The machine compares the sequences marked x and y. It erases all
letters x and y. → sim if they are alike. Otherwise → kom.

anf. Taking the long view, the last instruction relevant to the last
configuration is found. It can be recognised afterwards as the instruction
following the last semi-colon marked z. → sim.

m-config.   Symbol   Operations        Final m-config.
sim                                    f'(sim1, sim1, z)
sim1                                   con(sim2, )
sim2        A                          sim3
sim2        not A    R, Pu, R, R, R    sim2
sim3        not A    L, Py             e(mk, z)
sim3        A        L, Py, R, R, R    sim3

sim. The machine marks out the instructions. That part of the
instructions which refers to operations to be carried out is marked with u,
and the final m-configuration with y. The letters z are erased.

mk                                     g(mk1, :)
mk1         not A    R, R              mk1
mk1         A        L, L, L, L        mk2
mk2         C        R, Px, L, L, L    mk2
mk2         :                          mk4
mk2         D        R, Px, L, L, L    mk3
mk3         not :    R, Pv, L, L, L    mk3
mk3         :                          mk4
mk4                                    con(l(l(mk5)), )
mk5         Any      R, Pw, R          mk5
mk5         None     P:                sh

mk. The last complete configuration is marked out into four sections.
The configuration is left unmarked. The symbol directly preceding it is
marked with x. The remainder of the complete configuration is divided
into two parts, of which the first is marked with v and the last with w.
A colon is printed after the whole. → sh.

sh                                     f(sh1, inst, u)
sh1                  L, L, L           sh2
sh2         D        R, R, R, R        sh3
sh2         not D                      inst
sh3         C        R, R              sh4
sh3         not C                      inst
sh4         C        R, R              sh5
sh4         not C                      pe2(inst, 0, :)
sh5         C                          inst
sh5         not C                      pe2(inst, 1, :)

sh. The instructions (marked u) are examined. If it is found that they
involve "Print 0" or "Print 1", then 0: or 1: is printed at the end.

m-config.   Symbol   Operations   Final m-config.
inst                              g(l(inst1), u)
inst1       α        R, E         inst1(α)
inst1(L)                          ce5(ov, v, y, x, u, w)
inst1(R)                          ce5(ov, v, x, u, y, w)
inst1(N)                          ce5(ov, v, x, y, u, w)
ov                                e(anf)

inst. The next complete configuration is written down, carrying out the
marked instructions. The letters u, v, w, x, y are erased. → anf.

8. Application of the diagonal process.


It may be thought that arguments which prove that the real numbers
are not enumerable would also prove that the computable numbers and
sequences cannot be enumerable*. It might, for instance, be thought
that the limit of a sequence of computable numbers must be computable.
This is clearly only true if the sequence of computable numbers is defined
by some rule.
Or we might apply the diagonal process. "If the computable sequences
are enumerable, let α_n be the n-th computable sequence, and let φ_n(m) be
the m-th figure in α_n. Let β be the sequence with 1 − φ_n(n) as its n-th
figure. Since β is computable, there exists a number K such that
1 − φ_n(n) = φ_K(n) all n. Putting n = K, we have 1 = 2φ_K(K), i.e. 1 is
even. This is impossible. The computable sequences are therefore not
enumerable".
The fallacy in this argument lies in the assumption that β is computable.
It would be true if we could enumerate the computable sequences by finite
means, but the problem of enumerating computable sequences is equivalent
to the problem of finding out whether a given number is the D.N of a
circle-free machine, and we have no general process for doing this in a finite
number of steps. In fact, by applying the diagonal process argument
correctly, we can show that there cannot be any such general process.
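
The diagonal construction itself is easy to state as a program once an enumeration is given by finite means; the difficulty lies entirely in obtaining such an enumeration. The Python sketch below applies the construction to a small hand-made list of sequences. The list toy and the function diagonal are assumptions for illustration, and indices start from 0 here rather than from 1 as in the text.

```python
def diagonal(sequences, length):
    """First `length` figures of beta, whose n-th figure is 1 - phi_n(n)."""
    return [1 - sequences[n](n) for n in range(length)]

# A toy, finite "enumeration": each entry maps an index m to the m-th figure.
toy = [
    lambda m: 0,            # 000000...
    lambda m: 1,            # 111111...
    lambda m: m % 2,        # 010101...
    lambda m: (m + 1) % 2,  # 101010...
]

beta = diagonal(toy, len(toy))
print(beta)   # prints [1, 0, 1, 1]
```

For the toy list every entry is known in advance to produce a figure for every index. For the computable sequences proper, producing the list would require deciding which description numbers belong to circle-free machines, which is exactly the step the argument cannot supply.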
The simplest and most direct proof of this is by showing that, if this
general process exists, then there is a machine which computes β. This
proof, although perfectly sound, has the disadvantage that it may leave
the reader with a feeling that "there must be something wrong". The
proof which I shall give has not this disadvantage, and gives a certain
insight into the significance of the idea "circle-free". It depends not on
constructing β, but on constructing β', whose n-th figure is φ_n(n).

* Cf. Hobson, Theory of functions of a real variable (2nd ed., 1921), 87, 88.

Let us suppose that there is such a process; that is to say, that we can
invent a machine D which, when supplied with the S.D of any computing
machine M will test this S.D and if M is circular will mark the S.D with the
symbol "u" and if it is circle-free will mark it with "s". By combining
the machines D and U we could construct a machine H to compute the
sequence β'. The machine D may require a tape. We may suppose that
it uses the E-squares beyond all symbols on F-squares, and that when it
has reached its verdict all the rough work done by D is erased.
The machine H has its motion divided into sections. In the first N − 1
sections, among other things, the integers 1, 2, ..., N − 1 have been written
down and tested by the machine D. A certain number, say R(N − 1), of
them have been found to be the D.N's of circle-free machines. In the N-th
section the machine D tests the number N. If N is satisfactory, i.e., if it
is the D.N of a circle-free machine, then R(N) = 1 + R(N − 1) and the first
R(N) figures of the sequence of which a D.N is N are calculated. The
R(N)-th figure of this sequence is written down as one of the figures of the
sequence β' computed by H. If N is not satisfactory, then R(N) = R(N − 1)
and the machine goes on to the (N + 1)-th section of its motion.
From the construction of H we can see that H is circle-free. Each
section of the motion of H comes to an end after a finite number of steps.
For, by our assumption about D, the decision as to whether N is satisfactory
is reached in a finite number of steps. If N is not satisfactory, then the
N-th section is finished. If N is satisfactory, this means that the machine
M(N) whose D.N is N is circle-free, and therefore its R(N)-th figure can be
calculated in a finite number of steps. When this figure has been calculated
and written down as the R(N)-th figure of β', the N-th section is finished.
Hence H is circle-free.
Now let K be the D.N of H. What does H do in the K-th section of
its motion? It must test whether K is satisfactory, giving a verdict "s"
or "u". Since K is the D.N of H and since H is circle-free, the verdict
cannot be "u". On the other hand the verdict cannot be "s". For if it
were, then in the K-th section of its motion H would be bound to compute
the first R(K − 1) + 1 = R(K) figures of the sequence computed by the
machine with K as its D.N and to write down the R(K)-th as a figure of the
sequence computed by H. The computation of the first R(K) − 1 figures
would be carried out all right, but the instructions for calculating the
R(K)-th would amount to "calculate the first R(K) figures computed by
H and write down the R(K)-th". This R(K)-th figure would never be
found. I.e., H is circular, contrary both to what we have found in the last
paragraph and to the verdict "s". Thus both verdicts are impossible
and we conclude that there can be no machine D.
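
The bookkeeping performed by H, namely the counter R(N) and the choice of the R(N)-th figure, can be followed in a small Python sketch. Everything here is a stand-in: the dictionary toy plays the part of an enumeration of machines by description number, and the predicate passed as is_satisfactory plays the part of the hypothetical machine D. Such a predicate is available only because the toy list is finite and labelled by hand; the argument above shows that nothing of the kind can exist for description numbers in general.

```python
from itertools import islice

def beta_prime(machines, is_satisfactory, sections):
    """Figures written down by H during its first `sections` sections.

    machines[N]() must return an iterator over the figures computed by the
    machine whose D.N is N; is_satisfactory stands in for the machine D.
    """
    figures, r = [], 0                   # r holds R(N - 1) at the start of section N
    for n in range(1, sections + 1):
        if is_satisfactory(n):
            r += 1                       # R(N) = 1 + R(N - 1)
            first_r = list(islice(machines[n](), r))
            figures.append(first_r[-1])  # the R(N)-th figure of machine N
    return figures

def zeros():
    while True:
        yield 0

def ones():
    while True:
        yield 1

def alternating():
    while True:
        yield 0
        yield 1

# Toy enumeration: only the numbers 1, 3 and 4 are treated as satisfactory.
toy = {1: zeros, 3: alternating, 4: ones}
print(beta_prime(toy, lambda n: n in toy, 5))   # prints [0, 1, 1]
```

The contradiction in the text arises when H itself appears in the enumeration: the call corresponding to machines[K]() would then have to produce the very figure that the section is in the middle of computing.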

We can show further that there can be no machine E which, when
supplied with the S.D of an arbitrary machine M, will determine whether M
ever prints a given symbol (0 say).
We will first show that, if there is a machine E, then there is a general
process for determining whether a given machine M prints 0 infinitely
often. Let M1 be a machine which prints the same sequence as M, except
that in the position where the first 0 printed by M stands, M1 prints 0̄.
M2 is to have the first two symbols 0 replaced by 0̄, and so on. Thus, if M
were to print

ABA01AAB0010AB...,

then M1 would print

ABA0̄1AAB0010AB...

and M2 would print

ABA0̄1AAB0̄010AB....

Now let F be a machine which, when supplied with the S.D of M, will
write down successively the S.D of M, of M1, of M2, ... (there is such a
machine). We combine F with E and obtain a new machine, G. In the
motion of G first F is used to write down the S.D of M, and then E tests
it, :0: is written if it is found that M never prints 0; then F writes the S.D
of M1, and this is tested, :0: being printed if and only if M1 never prints 0,
and so on. Now let us test G with E. If it is found that G never prints 0,
then M prints 0 infinitely often; if G prints 0 sometimes, then M does not
print 0 infinitely often.
Similarly there is a general process for determining whether M prints 1
infinitely often. By a combination of these processes we have a process
for determining whether M prints an infinity of figures, i.e. we have a process
for determining whether M is circle-free. There can therefore be no
machine E.
The expression "there is a general process for determining..." has
been used throughout this section as equivalent to "there is a machine
which will determine ... ". This usage can be justified if and only if we
can justify our definition of "computable". For each of these "general
process" problems can be expressed as a problem concerning a general
process for determining whether a given integer n has a property G(n) [e.g.
G(n) might mean "n is satisfactory" or "n is the Gödel representation of
a provable formula"], and this is equivalent to computing a number
whose n-th figure is 1 if G(n) is true and 0 if it is false.

9. The extent of the computable numbers.


No attempt has yet been made to show that the "computable" numbers
include all numbers which would naturally be regarded as computable. All
arguments which can be given are bound to be, fundamentally, appeals
to intuition, and for this reason rather unsatisfactory mathematically.
The real question at issue is "What are the possible processes which can be
carried out in computing a number?"
The arguments which I shall use are of three kinds.
(a) A direct appeal to intuition.
(b) A proof of the equivalence of two definitions (in case the new
definition has a greater intuitive appeal).
(c) Giving examples of large classes of numbers which are
computable.
Once it is granted that computable numbers are all "computable",
several other propositions of the same character follow. In particular, it
follows that, if there is a general process for determining whether a formula
of the Hilbert function calculus is provable, then the determination can be
carried out by a machine.

I. [Type (a)]. This argument is only an elaboration of the ideas of §1.
Computing is normally done by writing certain symbols on paper. We
may suppose this paper is divided into squares like a child's arithmetic book.
In elementary arithmetic the two-dimensional character of the paper is
sometimes used. But such a use is always avoidable, and I think that it
will be agreed that the two-dimensional character of paper is no essential
of computation. I assume then that the computation is carried out on
one-dimensional paper, i.e. on a tape divided into squares. I shall also
suppose that the number of symbols which may be printed is finite. If we
were to allow an infinity of symbols, then there would be symbols differing
to an arbitrarily small extent†. The effect of this restriction of the number
of symbols is not very serious. It is always possible to use sequences of
symbols in the place of single symbols. Thus an Arabic numeral such as
17 or 999999999999999 is normally treated as a single symbol. Similarly
in any European language words are treated as single symbols (Chinese,
however, attempts to have an enumerable infinity of symbols). The
difference from our point of view between the single and compound symbols
is that the compound symbols, if they are too lengthy, cannot be observed
at one glance. This is in accordance with experience. We cannot tell at
a glance whether 9999999999999999 and 999999999999999 are the same.

† If we regard a symbol as literally printed on a square we may suppose that the square
is 0 ≤ x ≤ 1, 0 ≤ y ≤ 1. The symbol is defined as a set of points in this square, viz. the
set occupied by printer's ink. If these sets are restricted to be measurable, we can define
the "distance" between two symbols as the cost of transforming one symbol into the
other if the cost of moving unit area of printer's ink unit distance is unity, and there is an
infinite supply of ink at x = 2, y = 0. With this topology the symbols form a
conditionally compact space.

The behaviour of the computer at any moment is determined by the
symbols which he is observing, and his "state of mind" at that moment.
We may suppose that there is a bound B to the number of symbols or
squares which the computer can observe at one moment. If he wishes to
observe more, he must use successive observations. We will also suppose
that the number of states of mind which need be taken into account is finite.
The reasons for this are of the same character as those which restrict the
number of symbols. If we admitted an infinity of states of mind, some of
them would be "arbitrarily close" and would be confused. Again, the restriction
is not one which seriously affects computation, since the use of more compli-
cated states of mind can be avoided by writing more symbols on the tape.
Let us imagine the operations performed by the computer to be split up
into "simple operations" which are so elementary that it is not easy to
imagine them further divided. Every such operation consists of some change
of the physical system consisting of the computer and his tape. We know
the state of the system if we know the sequence of symbols on the tape,
which of these are observed by the computer (possibly with a special
order), and the state of mind of the computer. We may suppose that in a
simple operation not more than one symbol is altered. Any other changes
can be split up into simple changes of this kind. The situation in regard to
the squares whose symbols may be altered in this way is the same as in
regard to the observed squares. We may, therefore, without loss of
generality, assume that the squares whose symbols are changed are always
"observed" squares.
Besides these changes of symbols, the simple operations must include
changes of distribution of observed squares. The new observed squares
must be immediately recognisable by the computer. I think it is reasonable
to suppose that they can only be squares whose distance from the closest
of the immediately previously observed squares does not exceed a certain
fixed amount. Let us say that each of the new observed squares is within
L squares of an immediately previously observed square.
In connection with "immediate recognisability", it may be thought
that there are other kinds of square which are immediately recognisable.
In particular, squares marked by special symbols might be taken as immediately
1936.] ON COMPUTABLE NUMBERS. 251

recognisable. Now if these squares are marked only by single


symbols there can be only a finite number of them, and we should not upset
our theory by adjoining these marked squares to the observed squares. If,
on the other hand, they are marked by a sequence of symbols, we
cannot regard the process of recognition as a simple process. This is a
fundamental point and should be illustrated. In most mathematical
papers the equations and theorems are numbered. Normally the numbers
do not go beyond (say) 1000. It is, therefore, possible to recognise a
theorem at a glance by its number. But if the paper was very long, we
might reach Theorem 157767733443477 ; then, further on in the paper, we
might find "... hence (applying Theorem 157767733443477) we have ... ".
In order to make sure which was the relevant theorem we should have to
compare the two numbers figure by figure, possibly ticking the figures off
in pencil to make sure of their not being counted twice. If in spite of this
it is still thought that there are other "immediately recognisable" squares,
it does not upset my contention so long as these squares can be found by
some process of which my type of machine is capable. This idea is
developed in III below.
The simple operations must therefore include:
(a) Changes of the symbol on one of the observed squares.
(b) Changes of one of the squares observed to another square
within L squares of one of the previously observed squares.
It may be that some of these changes necessarily involve a change of
state of mind. The most general single operation must therefore be taken
to be one of the following:
(A) A possible change (a) of symbol together with a possible
change of state of mind.
(B) A possible change (b) of observed squares, together with a
possible change of state of mind.
The operation actually performed is determined, as has been suggested
on p. 250, by the state of mind of the computer and the observed symbols.
In particular, they determine the state of mind of the computer after the
operation is carried out.
We may now construct a machine to do the work of this computer. To
each state of mind of the computer corresponds an "m-configuration" of
the machine. The machine scans B squares corresponding to the B squares
observed by the computer. In any move the machine can change a symbol
on a scanned square or can change any one of the scanned squares to another
square distant not more than L squares from one of the other scanned
squares. The move which is done, and the succeeding configuration, are
determined by the scanned symbol and the m-configuration. The
machines just described do not differ very essentially from computing
machines as defined in § 2, and corresponding to any machine of this type
a computing machine can be constructed to compute the same sequence,
that is to say the sequence computed by the computer.
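
The machine described above can be illustrated with a minimal sketch, assuming the simplest case in which one square is scanned at a time (B = 1) and the scanned square moves by at most one square per move (L = 1). The table format, the names `run` and `ALTERNATE`, and the toy machine itself are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

BLANK = " "

def run(table, start, steps):
    """Run a one-tape machine for `steps` moves on an initially blank tape.

    `table` maps (m_configuration, scanned_symbol) to
    (symbol_to_print, head_move, next_m_configuration), head_move in {-1, 0, +1}.
    """
    tape = defaultdict(lambda: BLANK)
    position, m_configuration = 0, start
    for _ in range(steps):
        # The move is determined by the m-configuration and the scanned symbol.
        symbol, move, m_configuration = table[(m_configuration, tape[position])]
        tape[position] = symbol
        position += move
    if not tape:
        return ""
    return "".join(tape[i] for i in range(min(tape), max(tape) + 1))

# Toy machine (an assumption for illustration): prints 0, 1, 0, 1, ...
# on successive squares, alternating between two m-configurations.
ALTERNATE = {
    ("b", BLANK): ("0", +1, "c"),
    ("c", BLANK): ("1", +1, "b"),
}

print(run(ALTERNATE, "b", 6))
```

The whole behaviour is carried by a finite table, which is the point of restricting both the symbols and the states of mind to finite sets.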

II. [Type (b)].


If the notation of the Hilbert functional calculus† is modified so as to
be systematic, and so as to involve only a finite number of symbols, it
becomes possible to construct an automatic‡ machine $\mathcal{K}$, which will find
all the provable formulae of the calculus§.
Now let $\alpha$ be a sequence, and let us denote by $G_\alpha(x)$ the proposition
"The $x$-th figure of $\alpha$ is 1", so that‖ $-G_\alpha(x)$ means "The $x$-th figure of $\alpha$
is 0". Suppose further that we can find a set of properties which define
the sequence $\alpha$ and which can be expressed in terms of $G_\alpha(x)$ and of the
propositional functions $N(x)$ meaning "$x$ is a non-negative integer" and
$F(x, y)$ meaning "$y = x + 1$". When we join all these formulae together
conjunctively, we shall have a formula, $\mathfrak{A}$ say, which defines $\alpha$. The terms
of $\mathfrak{A}$ must include the necessary parts of the Peano axioms, viz.,

$$(\exists u)N(u) \;\&\; (x)\bigl(N(x) \to (\exists y)F(x, y)\bigr) \;\&\; \bigl(F(x, y) \to N(y)\bigr),$$

which we will abbreviate to P.


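When we say "$\mathfrak{A}$ defines $\alpha$", we mean that $-\mathfrak{A}$ is not a provable
formula, and also that, for each $n$, one of the following formulae ($A_n$) or
($B_n$) is provable.

$$\mathfrak{A} \;\&\; F^{(n)} \to G_\alpha(u^{(n)}), \tag{$A_n$}$$
$$\mathfrak{A} \;\&\; F^{(n)} \to \bigl(-G_\alpha(u^{(n)})\bigr), \tag{$B_n$}$$

where $F^{(n)}$ stands for $F(u, u') \;\&\; F(u', u'') \;\&\; \ldots \;\&\; F(u^{(n-1)}, u^{(n)})$¶.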

† The expression "the functional calculus" is used throughout to mean the restricted
Hilbert functional calculus.
‡ It is most natural to construct first a choice machine (§ 2) to do this. But it is
then easy to construct the required automatic machine. We can suppose that the choices
are always choices between two possibilities 0 and 1. Each proof will then be determined
by a sequence of choices $i_1, i_2, \ldots, i_n$ ($i_1 = 0$ or 1, $i_2 = 0$ or 1, ..., $i_n = 0$ or 1), and hence
the number $2^n + i_1 2^{n-1} + i_2 2^{n-2} + \ldots + i_n$ completely determines the proof. The automatic
machine carries out successively proof 1, proof 2, proof 3, ....
§ The author has found a description of such a machine.
‖ The negation sign is written before an expression and not over it.
¶ A sequence of $r$ primes is denoted by $^{(r)}$.
I say that $\alpha$ is then a computable sequence: a machine $\mathcal{K}_\alpha$ to compute
$\alpha$ can be obtained by a fairly simple modification of $\mathcal{K}$.
We divide the motion of $\mathcal{K}_\alpha$ into sections. The $n$-th section is devoted
to finding the $n$-th figure of $\alpha$. After the $(n-1)$-th section is finished a double
colon :: is printed after all the symbols, and the succeeding work is done
wholly on the squares to the right of this double colon. The first step is to
write the letter "A" followed by the formula ($A_n$) and then "B" followed
by ($B_n$). The machine $\mathcal{K}_\alpha$ then starts to do the work of $\mathcal{K}$, but whenever
a provable formula is found, this formula is compared with ($A_n$) and with
($B_n$). If it is the same formula as ($A_n$), then the figure "1" is printed, and
the $n$-th section is finished. If it is ($B_n$), then "0" is printed and the section
is finished. If it is different from both, then the work of $\mathcal{K}$ is continued
from the point at which it had been abandoned. Sooner or later one of
the formulae ($A_n$) or ($B_n$) is reached; this follows from our hypotheses
about $\alpha$ and $\mathfrak{A}$, and the known nature of $\mathcal{K}$. Hence the $n$-th section will
eventually be finished. $\mathcal{K}_\alpha$ is circle-free; $\alpha$ is computable.
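
The section-by-section search just described can be sketched as follows. This is only an illustration under stated assumptions: `theorems` is a hypothetical stand-in for the machine that lists all provable formulae, formulae are represented as plain Python tuples, and the enumeration is restarted at the beginning of each section for simplicity.

```python
from itertools import count

def figures(theorems, a_formula, b_formula, how_many):
    """Return the first `how_many` figures of the defined sequence.

    theorems:  zero-argument callable giving a fresh iterator over all
               provable formulae (hypothetical stand-in for the prover).
    a_formula: n -> the formula (A_n), provable when the n-th figure is 1.
    b_formula: n -> the formula (B_n), provable when the n-th figure is 0.
    """
    result = []
    for n in range(how_many):          # one "section" per figure
        a_n, b_n = a_formula(n), b_formula(n)
        for formula in theorems():     # loops for ever if neither is provable
            if formula == a_n:
                result.append(1)
                break
            if formula == b_n:
                result.append(0)
                break
    return result

def toy_theorems():
    """Toy enumeration (assumption): proves A_n for even n, B_n for odd n."""
    for k in count():
        yield ("other", k)             # irrelevant theorems are skipped
        yield ("A", k) if k % 2 == 0 else ("B", k)

print(figures(toy_theorems, lambda n: ("A", n), lambda n: ("B", n), 5))
```

Each section terminates only because one of the two formulae is guaranteed to appear; without that hypothesis the inner loop would never end.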
It can also be shown that the numbers $\alpha$ definable in this way by the use
of axioms include all the computable numbers. This is done by describing
computing machines in terms of the function calculus.
It must be remembered that we have attached rather a special meaning
to the phrase "$\mathfrak{A}$ defines $\alpha$". The computable numbers do not include all
(in the ordinary sense) definable numbers. Let $\delta$ be a sequence whose
$n$-th figure is 1 or 0 according as $n$ is or is not satisfactory. It is an immediate
consequence of the theorem of § 8 that $\delta$ is not computable. It is (so
far as we know at present) possible that any assigned number of figures of $\delta$
can be calculated, but not by a uniform process. When sufficiently many
figures of $\delta$ have been calculated, an essentially new method is necessary in
order to obtain more figures.
III. This may be regarded as a modification of I or as a corollary of II.
We suppose, as in I, that the computation is carried out on a tape; but we
avoid introducing the "state of mind" by considering a more physical
and definite counterpart of it. It is always possible for the computer to
break off from his work, to go away and forget all about it, and later to come
back and go on with it. If he does this he must leave a note of instructions
(written in some standard form) explaining how the work is to be con-
tinued. This note is the counterpart of the "state of mind". We will
suppose that the computer works in such a desultory manner that he never
does more than one step at a sitting. The note of instructions must enable
him to carry out one step and write the next note. Thus the state of progress
of the computation at any stage is completely determined by the note of
instructions and the symbols on the tape. That is, the state of the system
may be described by a single expression (sequence of symbols), consisting
of the symbols on the tape followed by $\Delta$ (which we suppose not to appear
elsewhere) and then by the note of instructions. This expression may be
called the "state formula". We know that the state formula at any
given stage is determined by the state formula before the last step was
made, and we assume that the relation of these two formulae is expressible
in the functional calculus. In other words, we assume that there is an
axiom $\mathfrak{A}$ which expresses the rules governing the behaviour of the
computer, in terms of the relation of the state formula at any stage to the
state formula at the preceding stage. If this is so, we can construct a
machine to write down the successive state formulae, and hence to
compute the required number.

10. Examples of large classes of numbers which are computable.


It will be useful to begin with definitions of a computable function of
an integral variable and of a computable variable, etc. There are many
equivalent ways of defining a computable function of an integral
variable. The simplest is, possibly, as follows. If $\gamma$ is a computable
sequence in which 0 appears infinitely† often, and $n$ is an integer, then let
us define $\xi(\gamma, n)$ to be the number of figures 1 between the $n$-th and the
$(n+1)$-th figure 0 in $\gamma$. Then $\varphi(n)$ is computable if, for all $n$ and some $\gamma$,
$\varphi(n) = \xi(\gamma, n)$. An equivalent definition is this. Let $H(x, y)$ mean
$\varphi(x) = y$. Then, if we can find a contradiction-free axiom $\mathfrak{A}_\varphi$, such that
$\mathfrak{A}_\varphi \to P$, and if for each integer $n$ there exists an integer $N$, such that

$$\mathfrak{A}_\varphi \;\&\; F^{(N)} \to H(u^{(n)}, u^{(\varphi(n))}),$$

and such that, if $m \ne \varphi(n)$, then, for some $N'$,

$$\mathfrak{A}_\varphi \;\&\; F^{(N')} \to \bigl(-H(u^{(n)}, u^{(m)})\bigr),$$

then $\varphi$ may be said to be a computable function.
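
The first definition can be made concrete with a short sketch. It assumes that the figures 0 are numbered from 1, so that the first value counts the 1s between the first and second 0; the paper does not fix this convention. The names `xi` and `encode` are hypothetical.

```python
def xi(gamma, n):
    """Number of figures 1 between the n-th and (n+1)-th figure 0 of gamma.

    gamma is any iterable of 0s and 1s in which 0 appears often enough;
    zeros are numbered from 1 (an assumed convention).
    """
    zeros_seen = 0
    ones = 0
    for figure in gamma:
        if figure == 0:
            zeros_seen += 1
            if zeros_seen == n + 1:
                return ones
        elif zeros_seen == n:
            ones += 1

def encode(phi):
    """Build a sequence gamma with xi(gamma, n) == phi(n) for n = 1, 2, ..."""
    n = 1
    while True:
        yield 0
        for _ in range(phi(n)):
            yield 1
        n += 1

print(xi(encode(lambda n: n * n), 3))
```

The encoding works in both directions: any machine printing such a sequence yields the function, and any function given by a rule yields such a sequence.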
We cannot define general computable functions of a real variable, since
there is no general method of describing a real number, but we can define
a computable function of a computable variable. If $n$ is satisfactory,
let $\gamma_n$ be the number computed by $\mathcal{M}(n)$, and let

$$\alpha_n = \tan\bigl(\pi(\gamma_n - \tfrac{1}{2})\bigr),$$

† If $\mathcal{M}$ computes $\gamma$, then the problem whether $\mathcal{M}$ prints 0 infinitely often is of the
same character as the problem whether $\mathcal{M}$ is circle-free.
unless $\gamma_n = 0$ or $\gamma_n = 1$, in either of which cases $\alpha_n = 0$. Then, as $n$
runs through the satisfactory numbers, $\alpha_n$ runs through the computable
numbers†. Now let $\varphi(n)$ be a computable function which can be
shown to be such that for any satisfactory argument its value is satisfactory‡.
Then the function $f$, defined by $f(\alpha_n) = \alpha_{\varphi(n)}$, is a computable
function and all computable functions of a computable variable are
expressible in this form.
Similar definitions may be given of computable functions of several
variables, computable-valued functions of an integral variable, etc.
I shall enunciate a number of theorems about computability, but I
shall prove only (ii) and a theorem similar to (iii).

(i) A computable function of a computable function of an integral or
computable variable is computable.

(ii) Any function of an integral variable defined recursively in terms
of computable functions is computable. I.e. if $\varphi(m, n)$ is computable, and
$r$ is some integer, then $\eta(n)$ is computable, where

$$\eta(0) = r, \qquad \eta(n) = \varphi\bigl(n, \eta(n-1)\bigr).$$

(iii) If $\varphi(m, n)$ is a computable function of two integral variables, then
$\varphi(n, n)$ is a computable function of $n$.

(iv) If $\varphi(n)$ is a computable function whose value is always 0 or 1, then
the sequence whose $n$-th figure is $\varphi(n)$ is computable.
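
Theorem (ii) can be illustrated by a small sketch, assuming the recursion scheme is η(0) = r, η(n) = φ(n, η(n−1)), which is the scheme the proof of (ii) encodes in its axiom. The name `define_recursively` and the factorial example are hypothetical.

```python
def define_recursively(phi, r):
    """Return eta with eta(0) = r and eta(n) = phi(n, eta(n - 1))."""
    def eta(n):
        value = r
        for k in range(1, n + 1):
            # Each step uses only the previous value and a computable phi.
            value = phi(k, value)
        return value
    return eta

# Example (assumption): the factorial arises from phi(n, p) = n * p and r = 1.
factorial = define_recursively(lambda n, previous: n * previous, 1)
print([factorial(n) for n in range(6)])
```

Because every value is reached after finitely many applications of φ, the loop always terminates when φ itself does.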
Dedekind's theorem does not hold in the ordinary form if we replace
"real" throughout by "computable". But it holds in the following form:

(v) If $G(\alpha)$ is a propositional function of the computable numbers and

(a) $(\exists\alpha)(\exists\beta)\{G(\alpha) \;\&\; (-G(\beta))\}$,

(b) $G(\alpha) \;\&\; (-G(\beta)) \to (\alpha < \beta)$,

and there is a general process for determining the truth value of $G(\alpha)$, then

† A function $\alpha_n$ may be defined in many other ways so as to run through the
computable numbers.
‡ Although it is not possible to find a general process for determining whether a given
number is satisfactory, it is often possible to show that certain classes of numbers are
satisfactory.
there is a computable number $\xi$ such that

$$G(\alpha) \to \alpha \le \xi, \qquad -G(\alpha) \to \alpha \ge \xi.$$
In other words, the theorem holds for any section of the computables
such that there is a general process for determining to which class a given
number belongs.
Owing to this restriction of Dedekind's theorem, we cannot say that a
computable bounded increasing sequence of computable numbers has a
computable limit. This may possibly be understood by considering a
sequence such as
$$-1,\ -\tfrac{1}{2},\ -\tfrac{1}{4},\ -\tfrac{1}{8},\ -\tfrac{1}{16},\ \tfrac{1}{2},\ \ldots.$$

On the other hand, (v) enables us to prove

(vi) If $\alpha$ and $\beta$ are computable and $\alpha < \beta$ and $\varphi(\alpha) < 0 < \varphi(\beta)$, where
$\varphi(\alpha)$ is a computable increasing continuous function, then there is a unique
computable number $\gamma$, satisfying $\alpha < \gamma < \beta$ and $\varphi(\gamma) = 0$.
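
For a special case, (vi) can be illustrated by bisection. The sketch below assumes that the function has rational coefficients and is evaluated at rational points, so that the sign test is exact; for a general computable function such a comparison with 0 is not available, and the theorem rests on (v) rather than on this procedure. The name `root` is hypothetical.

```python
from fractions import Fraction

def root(phi, a, b, eps):
    """Rational within eps of the zero of an increasing phi on [a, b].

    Assumes phi(a) < 0 < phi(b) and that phi maps rationals to rationals,
    so the comparisons below are exact.
    """
    low, high = Fraction(a), Fraction(b)
    while high - low > eps:
        mid = (low + high) / 2
        value = phi(mid)
        if value == 0:
            return mid
        if value < 0:
            low = mid      # the zero lies to the right of mid
        else:
            high = mid     # the zero lies to the left of mid
    return (low + high) / 2

# Example: the positive zero of x^2 - 2, an algebraic number.
approximation = root(lambda x: x * x - 2, 1, 2, Fraction(1, 10**6))
print(float(approximation))
```

The same procedure applied to a polynomial with rational coefficients on an interval where it is increasing gives approximations to a real algebraic number to any assigned accuracy.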
Computable convergence.
We shall say that a sequence $\beta_n$ of computable numbers converges
computably if there is a computable integral valued function $N(\varepsilon)$ of the
computable variable $\varepsilon$, such that we can show that, if $\varepsilon > 0$ and $n > N(\varepsilon)$
and $m > N(\varepsilon)$, then $|\beta_n - \beta_m| < \varepsilon$.
We can then show that
(vii) A power series whose coefficients form a computable sequence of
computable numbers is computably convergent at all computable points
in the interior of its interval of convergence.
(viii) The limit of a computably convergent sequence is computable.
And with the obvious definition of " uniformly computably convergent":
(ix) The limit of a uniformly computably convergent computable
sequence of computable functions is a computable function. Hence
(x) The sum of a power series whose coefficients form a computable
sequence is a computable function in the interior of its interval of
convergence.
From (viii) and $\pi = 4\bigl(1 - \tfrac{1}{3} + \tfrac{1}{5} - \ldots\bigr)$ we deduce that $\pi$ is computable.
From $e = 1 + 1 + \tfrac{1}{2!} + \tfrac{1}{3!} + \ldots$ we deduce that $e$ is computable.
From (vi) we deduce that all real algebraic numbers are computable.
From (vi) and (x) we deduce that the real zeros of the Bessel functions
are computable.
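
Computable convergence can be made concrete for the series for e. The sketch below uses exact rational arithmetic and the elementary tail bound that the terms after 1/k! sum to less than 2/(k+1)!; the function name `e_within` is hypothetical, and the bound is a standard estimate rather than something stated in the paper.

```python
from fractions import Fraction

def e_within(eps):
    """Return a rational approximating e from below with error less than eps."""
    total = Fraction(0)
    term = Fraction(1)   # invariant: term == 1 / k!
    k = 0
    while True:
        total += term
        # The omitted tail 1/(k+1)! + 1/(k+2)! + ... is below 2/(k+1)!.
        if 2 * term / (k + 1) < eps:
            return total
        k += 1
        term /= k

print(float(e_within(Fraction(1, 10**12))))
```

The stopping rule plays the part of the function N(ε) in the definition of computable convergence: it tells in advance how many terms suffice for a given accuracy.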

Proof of (ii).
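Let $H(x, y)$ mean "$\eta(x) = y$", and let $K(x, y, z)$ mean "$\varphi(x, y) = z$".
$\mathfrak{A}_\varphi$ is the axiom for $\varphi(x, y)$. We take $\mathfrak{A}_\eta$ to be

$$\mathfrak{A}_\varphi \;\&\; P \;\&\; \bigl(F(x, y) \to G(x, y)\bigr) \;\&\; \bigl(G(x, y) \;\&\; G(y, z) \to G(x, z)\bigr)$$
$$\&\; \bigl(F^{(r)} \to H(u, u^{(r)})\bigr) \;\&\; \bigl(F(v, w) \;\&\; H(v, x) \;\&\; K(w, x, z) \to H(w, z)\bigr)$$
$$\&\; \bigl[H(w, z) \;\&\; G(z, t) \lor G(t, z) \to (-H(w, t))\bigr].$$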

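I shall not give the proof of consistency of $\mathfrak{A}_\eta$. Such a proof may be
constructed by the methods used in Hilbert and Bernays, Grundlagen der
Mathematik (Berlin, 1934), p. 209 et seq. The consistency is also clear
from the meaning.
Suppose that, for some $n$, $N$, we have shown

$$\mathfrak{A}_\eta \;\&\; F^{(N)} \to H(u^{(n-1)}, u^{(\eta(n-1))}),$$

then, for some $M$,

$$\mathfrak{A}_\varphi \;\&\; F^{(M)} \to K(u^{(n)}, u^{(\eta(n-1))}, u^{(\eta(n))}),$$

$$\mathfrak{A}_\eta \;\&\; F^{(M)} \to F(u^{(n-1)}, u^{(n)}) \;\&\; H(u^{(n-1)}, u^{(\eta(n-1))}) \;\&\; K(u^{(n)}, u^{(\eta(n-1))}, u^{(\eta(n))}),$$

and

$$\mathfrak{A}_\eta \;\&\; F^{(M)} \to \bigl[F(u^{(n-1)}, u^{(n)}) \;\&\; H(u^{(n-1)}, u^{(\eta(n-1))}) \;\&\; K(u^{(n)}, u^{(\eta(n-1))}, u^{(\eta(n))}) \to H(u^{(n)}, u^{(\eta(n))})\bigr].$$

Hence $\mathfrak{A}_\eta \;\&\; F^{(M)} \to H(u^{(n)}, u^{(\eta(n))})$.
Also $\mathfrak{A}_\eta \;\&\; F^{(r)} \to H(u, u^{(\eta(0))})$.
Hence for each $n$ some formula of the form

$$\mathfrak{A}_\eta \;\&\; F^{(M)} \to H(u^{(n)}, u^{(\eta(n))})$$

is provable. Also, if $M' \ge M$ and $M' \ge m$ and $m \ne \eta(n)$, then

$$\mathfrak{A}_\eta \;\&\; F^{(M')} \to G(u^{(\eta(n))}, u^{(m)}) \lor G(u^{(m)}, u^{(\eta(n))})$$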
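and

$$\mathfrak{A}_\eta \;\&\; F^{(M')} \to \bigl[\{G(u^{(\eta(n))}, u^{(m)}) \lor G(u^{(m)}, u^{(\eta(n))})\} \;\&\; H(u^{(n)}, u^{(\eta(n))}) \to (-H(u^{(n)}, u^{(m)}))\bigr].$$

Hence $\mathfrak{A}_\eta \;\&\; F^{(M')} \to \bigl(-H(u^{(n)}, u^{(m)})\bigr)$.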


The conditions of our second definition of a computable function are
therefore satisfied. Consequently rj is a computable function.
Proof of a modified form of (iii).
Suppose that we are given a machine $\mathcal{N}$, which, starting with a tape
bearing on it əə followed by a sequence of any number of letters "F" on
F-squares and in the m-configuration $b$, will compute a sequence $\gamma_n$
depending on the number $n$ of letters "F". If $\varphi_n(m)$ is the $m$-th figure of
$\gamma_n$, then the sequence $\beta$ whose $n$-th figure is $\varphi_n(n)$ is computable.
We suppose that the table for $\mathcal{N}$ has been written out in such a way
that in each line only one operation appears in the operations column. We
also suppose that $\Xi$, $\Theta$, $\bar{0}$, and $\bar{1}$ do not occur in the table, and we replace
ə throughout by $\Theta$, 0 by $\bar{0}$, and 1 by $\bar{1}$. Further substitutions are then
made. Any line of form

    $\mathfrak{A}$    $\alpha$    $P\bar{0}$    $\mathfrak{B}$

we replace by

    $\mathfrak{A}$    $\alpha$    $P\bar{0}$    re($\mathfrak{B}$, $u$, $h$, $k$)

and any line of the form

    $\mathfrak{A}$    $\alpha$    $P\bar{1}$    $\mathfrak{B}$

by

    $\mathfrak{A}$    $\alpha$    $P\bar{1}$    re($\mathfrak{B}$, $v$, $h$, $k$)

and we add to the table the following lines:

    $u$                                        pe($u_1$, 0)
    $u_1$    $R, Pk, R, P\Theta, R, P\Theta$    $u_2$
    $u_2$                                      re($u_3$, $u_3$, $k$, $h$)
    $u_3$                                      pe($u_2$, F)

and similar lines with $v$ for $u$ and 1 for 0 together with the following line

    $c$      $R, P\Xi, R, Ph$                   $b$.

We then have the table for the machine $\mathcal{N}'$ which computes $\beta$. The
initial m-configuration is $c$, and the initial scanned symbol is the second ə.
11. Application to the Entscheidungsproblem.

The results of § 8 have some important applications. In particular, they
can be used to show that the Hilbert Entscheidungsproblem can have no
solution. For the present I shall confine myself to proving this particular
theorem. For the formulation of this problem I must refer the reader to
Hilbert and Ackermann's Grundzüge der Theoretischen Logik (Berlin,
1931), chapter 3.
I propose, therefore, to show that there can be no general process for
determining whether a given formula $\mathfrak{A}$ of the functional calculus K is
provable, i.e. that there can be no machine which, supplied with any one
$\mathfrak{A}$ of these formulae, will eventually say whether $\mathfrak{A}$ is provable.
It should perhaps be remarked that what I shall prove is quite different
from the well-known results of Gödel†. Gödel has shown that (in the formalism
of Principia Mathematica) there are propositions $\mathfrak{A}$ such that neither
$\mathfrak{A}$ nor $-\mathfrak{A}$ is provable. As a consequence of this, it is shown that no proof
of consistency of Principia Mathematica (or of K) can be given within that
formalism. On the other hand, I shall show that there is no general method
which tells whether a given formula $\mathfrak{A}$ is provable in K, or, what comes to
the same, whether the system consisting of K with $-\mathfrak{A}$ adjoined as an
extra axiom is consistent.
If the negation of what Gödel has shown had been proved, i.e. if, for each
$\mathfrak{A}$, either $\mathfrak{A}$ or $-\mathfrak{A}$ is provable, then we should have an immediate solution
of the Entscheidungsproblem. For we can invent a machine $\mathcal{K}$ which will
prove consecutively all provable formulae. Sooner or later $\mathcal{K}$ will reach
either $\mathfrak{A}$ or $-\mathfrak{A}$. If it reaches $\mathfrak{A}$, then we know that $\mathfrak{A}$ is provable. If it
reaches $-\mathfrak{A}$, then, since K is consistent (Hilbert and Ackermann, p. 65), we
know that $\mathfrak{A}$ is not provable.
Owing to the absence of integers in K the proofs appear somewhat
lengthy. The underlying ideas are quite straightforward.
Corresponding to each computing machine $\mathcal{M}$ we construct a formula
Un($\mathcal{M}$) and we show that, if there is a general method for determining
whether Un($\mathcal{M}$) is provable, then there is a general method for determining
whether $\mathcal{M}$ ever prints 0.
The interpretations of the propositional functions involved are as
follows :

$R_{S_l}(x, y)$ is to be interpreted as "in the complete configuration $x$ (of
$\mathcal{M}$) the symbol on the square $y$ is $S_l$".

† Loc. cit.
$I(x, y)$ is to be interpreted as "in the complete configuration $x$ the
square $y$ is scanned".
$K_{q_m}(x)$ is to be interpreted as "in the complete configuration $x$ the
m-configuration is $q_m$".
$F(x, y)$ is to be interpreted as "$y$ is the immediate successor of $x$".
Inst$\{q_i\, S_j\, S_k\, L\, q_l\}$ is to be an abbreviation for

$$(x, y, x', y')\Bigl\{\bigl(R_{S_j}(x, y) \;\&\; I(x, y) \;\&\; K_{q_i}(x) \;\&\; F(x, x') \;\&\; F(y', y)\bigr)$$
$$\to \Bigl(I(x', y') \;\&\; R_{S_k}(x', y) \;\&\; K_{q_l}(x') \;\&\; (z)\bigl[F(y', z) \lor \bigl(R_{S_j}(x, z) \to R_{S_k}(x', z)\bigr)\bigr]\Bigr)\Bigr\}.$$

Inst$\{q_i\, S_j\, S_k\, R\, q_l\}$ and Inst$\{q_i\, S_j\, S_k\, N\, q_l\}$
are to be abbreviations for other similarly constructed expressions.
Let us put the description of $\mathcal{M}$ into the first standard form of § 6. This
description consists of a number of expressions such as "$q_i\, S_j\, S_k\, L\, q_l$" (or
with R or N substituted for L). Let us form all the corresponding expressions
such as Inst$\{q_i\, S_j\, S_k\, L\, q_l\}$ and take their logical sum. This we call
Des($\mathcal{M}$).
The formula Un($\mathcal{M}$) is to be

$$(\exists u)\bigl[N(u) \;\&\; (x)\bigl(N(x) \to (\exists x')F(x, x')\bigr) \;\&\; (y, z)\bigl(F(y, z) \to N(y) \;\&\; N(z)\bigr)$$
$$\&\; (y)R_{S_0}(u, y) \;\&\; I(u, u) \;\&\; K_{q_1}(u) \;\&\; \mathrm{Des}(\mathcal{M})\bigr]$$
$$\to (\exists s)(\exists t)\bigl[N(s) \;\&\; N(t) \;\&\; R_{S_1}(s, t)\bigr].$$

$[N(u) \;\&\; \ldots \;\&\; \mathrm{Des}(\mathcal{M})]$ may be abbreviated to $A(\mathcal{M})$.


When we substitute the meanings suggested on pp. 259-60 we find that
Un($\mathcal{M}$) has the interpretation "in some complete configuration of $\mathcal{M}$, $S_1$
(i.e. 0) appears on the tape". Corresponding to this I prove that
(a) If $S_1$ appears on the tape in some complete configuration of $\mathcal{M}$, then
Un($\mathcal{M}$) is provable.
(b) If Un($\mathcal{M}$) is provable, then $S_1$ appears on the tape in some complete
configuration of $\mathcal{M}$.
When this has been done, the remainder of the theorem is trivial.
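LEMMA 1. If $S_1$ appears on the tape in some complete configuration of
$\mathcal{M}$, then Un($\mathcal{M}$) is provable.
We have to show how to prove Un($\mathcal{M}$). Let us suppose that in the
$n$-th complete configuration the sequence of symbols on the tape is
$S_{r(n,0)}, S_{r(n,1)}, \ldots, S_{r(n,n)}$, followed by nothing but blanks, and that the
scanned symbol is the $i(n)$-th, and that the m-configuration is $q_{k(n)}$. Then
we may form the proposition

$$R_{S_{r(n,0)}}(u^{(n)}, u) \;\&\; R_{S_{r(n,1)}}(u^{(n)}, u') \;\&\; \ldots \;\&\; R_{S_{r(n,n)}}(u^{(n)}, u^{(n)})$$
$$\&\; I(u^{(n)}, u^{(i(n))}) \;\&\; K_{q_{k(n)}}(u^{(n)})$$
$$\&\; (y)\bigl(F(y, u') \lor F(u, y) \lor F(u', y) \lor \ldots \lor F(u^{(n-1)}, y) \lor R_{S_0}(u^{(n)}, y)\bigr),$$

which we may abbreviate to $CC_n$.

As before, $F(u, u') \;\&\; F(u', u'') \;\&\; \ldots \;\&\; F(u^{(r-1)}, u^{(r)})$ is abbreviated
to $F^{(r)}$.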
I shall show that all formulae of the form $A(\mathcal{M}) \;\&\; F^{(n)} \to CC_n$ (abbreviated
to $CF_n$) are provable. The meaning of $CF_n$ is "The $n$-th complete
configuration of $\mathcal{M}$ is so and so", where "so and so" stands for the actual
$n$-th complete configuration of $\mathcal{M}$. That $CF_n$ should be provable is
therefore to be expected.
$CF_0$ is certainly provable, for in the complete configuration the symbols
are all blanks, the m-configuration is $q_1$, and the scanned square is $u$, i.e.
$CC_0$ is
$$(y)R_{S_0}(u, y) \;\&\; I(u, u) \;\&\; K_{q_1}(u).$$
$A(\mathcal{M}) \to CC_0$ is then trivial.
We next show that $CF_n \to CF_{n+1}$ is provable for each $n$. There are
three cases to consider, according as in the move from the $n$-th to the
$(n+1)$-th configuration the machine moves to left or to right or remains
stationary. We suppose that the first case applies, i.e. the machine
moves to the left. A similar argument applies in the other cases. If
$r(n, i(n)) = a$, $r(n+1, i(n+1)) = c$, $k(i(n)) = b$, and $k(i(n+1)) = d$,
then Des($\mathcal{M}$) must include Inst$\{q_a\, S_b\, S_d\, L\, q_c\}$ as one of its terms, i.e.

$$\mathrm{Des}(\mathcal{M}) \to \mathrm{Inst}\{q_a\, S_b\, S_d\, L\, q_c\}.$$

Hence
$$A(\mathcal{M}) \;\&\; F^{(n+1)} \to \mathrm{Inst}\{q_a\, S_b\, S_d\, L\, q_c\} \;\&\; F^{(n+1)}.$$

But
$$\mathrm{Inst}\{q_a\, S_b\, S_d\, L\, q_c\} \;\&\; F^{(n+1)} \to (CC_n \to CC_{n+1})$$
is provable, and so therefore is
$$A(\mathcal{M}) \;\&\; F^{(n+1)} \to (CC_n \to CC_{n+1})$$
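and
$$\bigl(A(\mathcal{M}) \;\&\; F^{(n)} \to CC_n\bigr) \to \bigl(A(\mathcal{M}) \;\&\; F^{(n+1)} \to CC_{n+1}\bigr),$$
i.e. $CF_n \to CF_{n+1}$.
$CF_n$ is provable for each $n$. Now it is the assumption of this lemma
that $S_1$ appears somewhere, in some complete configuration, in the sequence
of symbols printed by $\mathcal{M}$; that is, for some integers $N$, $K$, $CC_N$ has
$R_{S_1}(u^{(N)}, u^{(K)})$ as one of its terms, and therefore $CC_N \to R_{S_1}(u^{(N)}, u^{(K)})$ is
provable. We have then

$$CC_N \to R_{S_1}(u^{(N)}, u^{(K)})$$
and
$$A(\mathcal{M}) \;\&\; F^{(N)} \to CC_N.$$
We also have
$$(\exists u)A(\mathcal{M}) \to (\exists u)(\exists u') \ldots (\exists u^{(N')})\bigl(A(\mathcal{M}) \;\&\; F^{(N)}\bigr),$$
where $N' = \max(N, K)$. And so

$$(\exists u)A(\mathcal{M}) \to (\exists u)(\exists u') \ldots (\exists u^{(N')})\, R_{S_1}(u^{(N)}, u^{(K)}),$$
$$(\exists u)A(\mathcal{M}) \to (\exists u^{(N)})(\exists u^{(K)})\, R_{S_1}(u^{(N)}, u^{(K)}),$$
$$(\exists u)A(\mathcal{M}) \to (\exists s)(\exists t)\, R_{S_1}(s, t),$$
i.e. Un($\mathcal{M}$) is provable.
This completes the proof of Lemma 1.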

LEMMA 2. If Un($\mathcal{M}$) is provable, then $S_1$ appears on the tape in some
complete configuration of $\mathcal{M}$.
If we substitute any propositional functions for function variables in
a provable formula, we obtain a true proposition. In particular, if we
substitute the meanings tabulated on pp. 259-260 in Un($\mathcal{M}$), we obtain a
true proposition with the meaning "$S_1$ appears somewhere on the tape in
some complete configuration of $\mathcal{M}$".

We are now in a position to show that the Entscheidungsproblem cannot
be solved. Let us suppose the contrary. Then there is a general
(mechanical) process for determining whether Un($\mathcal{M}$) is provable. By
Lemmas 1 and 2, this implies that there is a process for determining whether
$\mathcal{M}$ ever prints 0, and this is impossible, by § 8. Hence the Entscheidungsproblem
cannot be solved.

In view of the large number of particular cases of solutions of the
Entscheidungsproblem for formulae with restricted systems of quantors, it
is interesting to express Un($\mathcal{M}$) in a form in which all quantors are at the
beginning. Un($\mathcal{M}$) is, in fact, expressible in the form
$$(u)(\exists x)(w)(\exists u_1) \ldots (\exists u_n)\,\mathfrak{B}, \tag{I}$$
where $\mathfrak{B}$ contains no quantors, and $n = 6$. By unimportant modifications
we can obtain a formula, with all essential properties of Un($\mathcal{M}$), which is of
form (I) with $n = 5$.

Added 28 August, 1936.


APPENDIX.

Computability and effective calculability


The theorem that all effectively calculable ($\lambda$-definable) sequences are
computable and its converse are proved below in outline. It is assumed
that the terms "well-formed formula" (W.F.F.) and "conversion" as used
by Church and Kleene are understood. In the second of these proofs the
existence of several formulae is assumed without proof; these formulae
may be constructed straightforwardly with the help of, e.g., the
results of Kleene in "A theory of positive integers in formal logic",
American Journal of Math., 57 (1935), 153-173, 219-244.
The W.F.F. representing an integer $n$ will be denoted by $N_n$. We shall
say that a sequence $\gamma$ whose $n$-th figure is $\varphi_\gamma(n)$ is $\lambda$-definable or effectively
calculable if $1 + \varphi_\gamma(n)$ is a $\lambda$-definable function of $n$, i.e. if there is a W.F.F.
$M_\gamma$ such that, for all integers $n$,

$$\{M_\gamma\}(N_n) \ \text{conv}\ N_{\varphi_\gamma(n)+1},$$

i.e. $\{M_\gamma\}(N_n)$ is convertible into $\lambda xy.x(x(y))$ or into $\lambda xy.x(y)$ according as
the $n$-th figure of $\gamma$ is 1 or 0.
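
The formulae representing integers can be imitated with ordinary functions. The sketch below is only an analogy, assuming that the formula for n applies its first argument n times to its second, as in λxy.x(y) for 1 and λxy.x(x(y)) for 2; the names `numeral`, `to_int` and `successor` are hypothetical.

```python
def numeral(n):
    """Function analogue of N_n: x, y -> x applied n times to y."""
    def with_x(x):
        def with_y(y):
            for _ in range(n):
                y = x(y)
            return y
        return with_y
    return with_x

def to_int(formula):
    """Read off the integer by counting applications of 'add one' to 0."""
    return formula(lambda k: k + 1)(0)

# Successor written purely with application: it adds one more use of x.
successor = lambda n: lambda x: lambda y: x(n(x)(y))

print(to_int(numeral(2)), to_int(successor(numeral(2))))
```

On this reading, a figure of the sequence is recovered from the resulting numeral by subtracting one, since the formula yields 2 for the figure 1 and 1 for the figure 0.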
To show that every $\lambda$-definable sequence $\gamma$ is computable, we have to
show how to construct a machine to compute $\gamma$. For use with machines it
is convenient to make a trivial modification in the calculus of conversion.
This alteration consists in using $x, x', x'', \ldots$ as variables instead of
$a, b, c, \ldots$. We now construct a machine $\mathcal{L}$ which, when supplied with the
formula $M_\gamma$, writes down the sequence $\gamma$. The construction of $\mathcal{L}$ is somewhat
similar to that of the machine $\mathcal{K}$ which proves all provable formulae
of the functional calculus. We first construct a choice machine $\mathcal{L}_1$ which,
if supplied with a W.F.F., $M$ say, and suitably manipulated, obtains any
formula into which $M$ is convertible. $\mathcal{L}_1$ can then be modified so as to
yield an automatic machine $\mathcal{L}_2$ which obtains successively all the formulae
into which $M$ is convertible (cf. foot-note p. 252). The machine $\mathcal{L}$
includes $\mathcal{L}_2$ as a part. The motion of the machine $\mathcal{L}$ when supplied
with the formula $M_\gamma$ is divided into sections of which the $n$-th is
devoted to finding the $n$-th figure of $\gamma$. The first stage in this $n$-th section
is the formation of $\{M_\gamma\}(N_n)$. This formula is then supplied to the
machine $\mathcal{L}_2$, which converts it successively into various other formulae.
Each formula into which it is convertible eventually appears, and each, as
it is found, is compared with

$$\lambda x\bigl[\lambda x'\bigl[\{x\}(\{x\}(x'))\bigr]\bigr], \ \text{i.e.}\ N_2,$$
and with
$$\lambda x\bigl[\lambda x'\bigl[\{x\}(x')\bigr]\bigr], \ \text{i.e.}\ N_1.$$

If it is identical with the first of these, then the machine prints the figure 1
and the $n$-th section is finished. If it is identical with the second, then 0
is printed and the section is finished. If it is different from both, then the
work of $\mathcal{L}_2$ is resumed. By hypothesis, $\{M_\gamma\}(N_n)$ is convertible into one of
the formulae $N_2$ or $N_1$; consequently the $n$-th section will eventually be
finished, i.e. the $n$-th figure of $\gamma$ will eventually be written down.

To prove that every computable sequence $\gamma$ is $\lambda$-definable, we must
show how to find a formula $M_\gamma$ such that, for all integers $n$,
$$\{M_\gamma\}(N_n) \ \text{conv}\ N_{1+\varphi_\gamma(n)}.$$
Let $\mathcal{M}$ be a machine which computes $\gamma$ and let us take some description
of the complete configurations of $\mathcal{M}$ by means of numbers, e.g. we may take
the D.N of the complete configuration as described in § 6. Let $\xi(n)$ be
the D.N of the $n$-th complete configuration of $\mathcal{M}$. The table for the
machine $\mathcal{M}$ gives us a relation between $\xi(n+1)$ and $\xi(n)$ of the form

$$\xi(n+1) = \rho_\gamma\bigl(\xi(n)\bigr),$$

where $\rho_\gamma$ is a function of very restricted, although not usually very simple,
form: it is determined by the table for $\mathcal{M}$. $\rho_\gamma$ is $\lambda$-definable (I omit the proof
of this), i.e. there is a W.F.F. $A_\gamma$ such that, for all integers $n$,

$$\{A_\gamma\}(N_{\xi(n)}) \ \text{conv}\ N_{\xi(n+1)}.$$

Let $U_\gamma$ stand for
$$\lambda u\bigl[\{\{u\}(A_\gamma)\}(N_r)\bigr],$$
where $r = \xi(0)$; then, for all integers $n$,
$$\{U_\gamma\}(N_n) \ \text{conv}\ N_{\xi(n)}.$$
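It may be proved that there is a formula $V$ such that
$\{\{V\}(N_{\xi(n+1)})\}(N_{\xi(n)})$
conv $N_1$ if, in going from the $n$-th to the $(n+1)$-th
complete configuration, the figure 0 is
printed,
conv $N_2$ if the figure 1 is printed,
conv $N_3$ otherwise.
Let $W_\gamma$ stand for

$$\lambda u\bigl[\{\{V\}(\{A_\gamma\}(\{U_\gamma\}(u)))\}(\{U_\gamma\}(u))\bigr],$$

so that, for each integer $n$,

$$\{\{V\}(N_{\xi(n+1)})\}(N_{\xi(n)}) \ \text{conv}\ \{W_\gamma\}(N_n),$$
and let $Q$ be a formula such that
$$\{\{Q\}(W_\gamma)\}(N_s) \ \text{conv}\ N_{r(s)},$$
where $r(s)$ is the $s$-th integer $q$ for which $\{W_\gamma\}(N_q)$ is convertible into either
$N_1$ or $N_2$. Then, if $M_\gamma$ stands for

$$\lambda w\bigl[\{W_\gamma\}(\{\{Q\}(W_\gamma)\}(w))\bigr],$$

it will have the required property†.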

The Graduate College,
Princeton University,
New Jersey, U.S.A.

† In a complete proof of the $\lambda$-definability of computable sequences it would be best to
modify this method by replacing the numerical description of the complete configurations
by a description which can be handled more easily with our apparatus. Let us choose
certain integers to represent the symbols and the m-configurations of the machine.
Suppose that in a certain complete configuration the numbers representing the successive
symbols on the tape are $s_1 s_2 \ldots s_n$, that the $m$-th symbol is scanned, and that the
m-configuration has the number $t$; then we may represent this complete configuration by the formula

$$\bigl[[N_{s_1}, N_{s_2}, \ldots, N_{s_{m-1}}],\ [N_t, N_{s_m}],\ [N_{s_{m+1}}, \ldots, N_{s_n}]\bigr],$$

where $[a, b]$ stands for $\lambda u\bigl[\{\{u\}(a)\}(b)\bigr]$,

$[a, b, c]$ stands for $\lambda u\bigl[\{\{\{u\}(a)\}(b)\}(c)\bigr]$,

etc.
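
The bracket notation of this footnote can be imitated with functions, in the same spirit as the numerals. This is a sketch under the assumption that [a, b] hands its components to whatever function it is applied to, one at a time; the names `pair`, `triple`, `first` and `second` are hypothetical.

```python
def pair(a, b):
    """Analogue of [a, b]: a function that feeds a and then b to its argument."""
    return lambda u: u(a)(b)

def triple(a, b, c):
    """Analogue of [a, b, c], built in the same way with three components."""
    return lambda u: u(a)(b)(c)

# Selectors: applying a pair to a two-argument chooser extracts a component.
first = lambda p: p(lambda a: lambda b: a)
second = lambda p: p(lambda a: lambda b: b)

# A toy configuration (assumption): m-configuration number 7 scanning symbol 3.
scanned = pair(7, 3)
print(first(scanned), second(scanned))
```

Grouping the tape to the left of the scanned square, the scanned square with the m-configuration, and the tape to the right in this way keeps each part directly accessible, which is what makes the description easier to handle than a single description number.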