
Group Theory in Physics: Lecture Notes

Rodolfo Alexander Diaz S.


Universidad Nacional de Colombia
Departamento de Fı́sica
Bogotá, Colombia

June 24, 2013


Contents

1 Sets and functions 9


1.1 Partitions and equivalence relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.2 Functions, Mappings and transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

2 Linear or vector spaces 13


2.1 Definition of a linear vector space . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2.2 Algebraic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.3 Vector subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.4 Dimension and bases in vector spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.5 Mappings and transformations in vector spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.6 Linear transformations of a vector space into itself . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.6.1 Projection operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.7 Normed vector spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.7.1 Convergent sequences, Cauchy sequences and completeness . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.7.2 The importance of completeness in Physics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.7.3 The concept of continuity and its importance in Physics . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.8 Banach Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.8.1 Continuous linear transformations of a Banach space into scalars . . . . . . . . . . . . . . . . . . . . . . 23
2.8.2 Continuous linear transformations of a Banach space into itself . . . . . . . . . . . . . . . . . . . . . . . 24
2.9 Hilbert spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.9.1 Orthonormal sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.9.2 The conjugate space H ∗ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.9.3 The conjugate and the adjoint of an operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.10 Normal operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.11 Self-Adjoint operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.12 Unitary operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.13 Projections on Hilbert spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

3 Basic theory of representations for finite-dimensional vector spaces 34


3.1 Representation of vectors and operators in a given basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.2 Change of coordinates of vectors under a change of basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3 Change of the matrix representative of linear transformations under a change of basis . . . . . . . . . . . . . . 37
3.4 Active and passive transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.5 Theory of representations on finite dimensional Hilbert spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.5.1 Representation of linear operators in finite dimensional Hilbert spaces . . . . . . . . . . . . . . . . . . . 40
3.6 Determinants and traces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.7 Rectangular matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.8 Symmetric and antisymmetric matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.9 The eigenvalue problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.9.1 Matrix representative of the eigenvalue problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.9.2 Eigenvectors and the canonical problem of matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.10 Normal operators and the spectral theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.10.1 A qualitative discussion of the spectral theorem in infinite dimensional Hilbert spaces . . . . . . . . . . 49
3.11 The concept of “hyperbasis” . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.12 Definition of an observable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.13 Complete sets of commuting observables (C.S.C.O.) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.14 Some terminology concerning Physics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.15 The Hilbert Space L2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54


3.15.1 The wave function space ̥ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55


3.16 Discrete orthonormal basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.16.1 Dirac delta function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.17 Closure relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.18 Introduction of hyperbases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.18.1 Orthonormality and Closure relations with hyperbases . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.18.2 Inner product and norm in terms of the components of a vector in a hyperbases . . . . . . . . . . . . . 60
3.19 Some specific continuous bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.19.1 Plane waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
3.19.2 “Delta functions” . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.20 Tensor products of vector spaces, definition and properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.20.1 Scalar products in tensor product spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.20.2 Tensor product of operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.20.3 The eigenvalue problem in tensor product spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.20.4 Complete sets of commuting observables in tensor product spaces . . . . . . . . . . . . . . . . . . . . . . 65
3.21 Restrictions of an operator to a subspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.22 Functions of operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.22.1 Some commutators involving functions of operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.23 Differentiation of operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.23.1 Some useful formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

4 State space and Dirac notation 70


4.1 Dirac notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.2 Elements of the dual or conjugate space Er∗ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.3 The correspondence between bras and kets with hyperbases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.4 The action of linear operators in Dirac notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.5 Projectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.6 Hermitian conjugation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.6.1 The adjoint operator A† in Dirac notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.6.2 Mathematical objects and hermitian conjugation in Dirac notation . . . . . . . . . . . . . . . . . . . . . 76
4.7 Theory of representations of E in Dirac notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.7.1 Orthonormalization and closure relation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.7.2 Representation of operators in Dirac notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
4.8 Change of representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.8.1 The transfer matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.8.2 Transformation of the coordinates of a ket . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.8.3 Transformation of the coordinates of a bra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.8.4 Transformation of the matrix elements of an operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.9 Representation of the eigenvalue problem in Dirac notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.9.1 C.S.C.O. in Dirac notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.10 The continuous bases |r⟩ and |p⟩ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.10.1 Orthonormalization and closure relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.10.2 Coordinates of kets and bras in {|r⟩} and {|p⟩} . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.10.3 Changing from the {|r⟩} representation to {|p⟩} representation and vice versa . . . . . . . . . . . . . . . 86
4.10.4 The R and P operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.10.5 The eigenvalue problem for R and P . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.11 General properties of two conjugate observables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
4.11.1 The eigenvalue problem of Q . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.11.2 The action of Q, P and S (λ) in the {|q⟩} basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.11.3 Representation in the {|pi} basis and the symmetrical role of P and Q . . . . . . . . . . . . . . . . . . . 91

5 Some features of matrices and operators in C2 and R3 93


5.1 Diagonalization of a 2 × 2 hermitian matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.1.1 Formulation of the problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.1.2 Eigenvalues and eigenvectors of K . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.1.3 Eigenvalues and eigenvectors of H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
5.2 Some general properties of 3 × 3 real matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.2.1 Real antisymmetric 3 × 3 matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
5.2.2 Decomposition of a 3 × 3 matrix in its antisymmetric and symmetric parts . . . . . . . . . . . . . . . . 97

6 Abstract Group Theory 99


6.1 Groups: Definitions and basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.2 Examples of abstract groups and further properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.3 Examples of group realizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.4 Groups of transformations and isomorphisms between groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.5 Subgroups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.6 Symmetric groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
6.6.1 Cycle structures in permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
6.6.2 Cayley’s theorem and regular permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
6.7 Resolution of a group in cosets, Lagrange’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
6.8 Conjugacy classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
6.9 Conjugate and Invariant subgroups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
6.10 The factor group G/ℜ. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
6.11 Homomorphisms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
6.12 A group as a direct product of some subgroups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
6.13 Direct product of groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.14 Classes, subgroups, invariant subgroups and quotient groups from S4 (optional) . . . . . . . . . . . . . . . . . . 128

7 Group representations 131


7.1 Comments on notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
7.2 The concept of representation and basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
7.3 Examples of construction of matrix representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
7.4 Equivalent and inequivalent representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
7.5 Reducible and irreducible representations, invariant subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
7.6 Unitary representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
7.7 Schur’s lemmas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
7.8 Orthonormality and completeness relations of irreducible matrix representations . . . . . . . . . . . . . . . . . 143
7.8.1 Orthonormality of irreducible matrix representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
7.8.2 Geometrical interpretation of the orthonormality relation of irreducible matrix representations . . . . . 145
7.8.3 Completeness relations for irreducible matrix representations . . . . . . . . . . . . . . . . . . . . . . . . 145
7.9 Examples of application of the orthonormality and completeness condition for irreducible matrix representations 146
7.10 Orthonormality and completeness relations for irreducible characters . . . . . . . . . . . . . . . . . . . . . . . . 147
7.10.1 Criteria of irreducibility for representations of finite groups through their character tables . . . . . . . . 150
7.11 The regular representation of a finite group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
7.12 Reduction of the regular representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152

8 Additional issues on group representations 155


8.1 Direct product representations and Clebsch-Gordan Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
8.1.1 Definition and basic properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
8.1.2 Coupled and decoupled bases and Clebsch-Gordan coefficients . . . . . . . . . . . . . . . . . . . . . . . . 157
8.1.3 The importance of direct product representations in Physics . . . . . . . . . . . . . . . . . . . . . . . . . 160
8.2 Construction of representations in vector spaces of functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
8.2.1 Further properties of the operators OT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
8.2.2 Invariant functions and invariant operators in Vf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
8.2.3 Some examples of representation on vector spaces of functions . . . . . . . . . . . . . . . . . . . . . . . 163
8.3 The adjoint and the complex conjugate representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
8.3.1 Conditions for the equivalence of D∗ (G) and D (G) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
8.3.2 Conditions for the equivalence of D and D∗ , real representations . . . . . . . . . . . . . . . . . . . . . . 168
8.4 Square roots of group elements (optional) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
8.4.1 Other square roots of group elements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.4.2 Square roots and ambivalent classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

9 Irreducible basis vectors and operators 176


9.1 Irreducible basis vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
9.2 Reduction of vectors by projection operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
9.2.1 Definition of true projections from generalized projections . . . . . . . . . . . . . . . . . . . . . . . . . . 180
9.2.2 The reduction of direct product representations with the projection method . . . . . . . . . . . . . . . . 182
9.3 Irreducible operators and the Wigner-Eckart theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182

10 A brief introduction to algebraic systems 185


10.1 Groups and vector spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 185
10.2 Rings: definitions and properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
10.2.1 Rings with identity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
10.2.2 The structure of rings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
10.2.3 Homomorphisms and isomorphisms for rings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
10.3 Algebras . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196

11 Group algebra and the reduction of the regular representation 199


11.1 Left ideals and invariant subspaces of the group algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
11.2 Decomposition of finite dimensional algebras in left ideals: projections . . . . . . . . . . . . . . . . . . . . . . . 202
11.3 Idempotents as generators of left-ideals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
11.4 Complete reduction of the regular representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
11.4.1 Generation of the idempotent associated with the identity representation . . . . . . . . . . . . . . . . . 207
11.5 The reduction of the regular representation of C3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
11.5.1 Generation of the idempotents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
11.5.2 Checking for primitive idempotence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
11.5.3 Checking for inequivalent primitive idempotents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210

12 Representations of the permutation group 212


12.1 One dimensional representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
12.2 Partitions and Young diagrams . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
12.3 Symmetrizers, anti-symmetrizers, and irreducible symmetrizers of Young tableaux . . . . . . . . . . . . . . . . 215
12.4 Symmetrizers, antisymmetrizers, and irreducible symmetrizers of Young tableaux associated with S3 . . . . . . 216
12.4.1 Properties of idempotents and left-ideals of S3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217
12.5 General properties of Young tableaux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
12.5.1 Examples of the general properties of Young tableux . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
12.6 Irreducible representations of Sn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224

13 Symmetry classes of tensors 227


13.1 The role of the general linear group Gm and the permutation group Sn on the tensor space Vmn . . . . . . . . . 227
13.1.1 Definition of the general linear group Gm and the tensor space Vmn . . . . . . . . . . . . . . . . . . . . . 227
13.1.2 Realization of Gm on the tensor space Vmn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
13.1.3 Realization of Sn on the tensor space Vmn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
13.1.4 Interplay of Gm and Sn on the tensor space Vmn . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
13.2 Totally symmetric tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
13.3 Totally anti-symmetric tensors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
13.4 Reduction of the tensor space V23 in irreducible invariant subspaces under S3 and G2 . . . . . . . . . . . . . . . 236
13.4.1 Irreducible invariant subspaces under S3 generated by Θm and Θm(23) . . . . . . . . . . . . . . . . . . . 237
13.4.2 Irreducible invariant subspaces under G2 generated by Θm and Θm(23) . . . . . . . . . . . . . . . . . . . 238
13.4.3 Reduction of V23 in irreducible subspaces under S3 and G2 . . . . . . . . . . . . . . . . . . . . . . . . . . 240
13.5 Reduction of the tensor space Vmn into irreducible tensors of the form |λ, α, a⟩ . . . . . . . . . . . . . . . . . 240

14 One dimensional continuous groups 244


14.1 The rotation group SO (2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244
14.1.1 The generator of SO (2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
14.1.2 Irreducible representations of SO (2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246
14.1.3 Invariant integration measure, orthonormality and completeness relations . . . . . . . . . . . . . . . . . 248
14.1.4 Multi-valued representations of SO(2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
14.1.5 Conjugate basis vectors for SO (2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 250
14.2 Continuous translational group in one dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
14.2.1 Conjugate basis vectors for T1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253
14.3 General comments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 253

15 Rotations in three-dimensional space: The group SO (3) 255


15.1 The Euler angles parameterization of a three-dimensional rotation . . . . . . . . . . . . . . . . . . . . . . . . . 256
15.1.1 The Euler angles in the X-convention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
15.1.2 Euler angles in the Y −convention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259
15.2 The angle-and-axis-parameterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260
15.2.1 Proper orthogonal transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260

15.2.2 Real proper orthogonal matrices in three-dimensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260


15.3 Euler’s theorem for rotations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
15.3.1 The angle-axis parameterization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262
15.3.2 Parameterization of rotations by successive fixed-axis rotations . . . . . . . . . . . . . . . . . . . . . . . . 263
15.3.3 Relation between the angle-axis parameters and the Euler angles (y − convention) . . . . . . . . . . . . 264
15.4 One-parameter subgroups, generators and Lie algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264
15.5 Irreducible representations of the SO (3) Lie algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
15.5.1 General properties of J 2 , J3 , and J± . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268
15.6 Matrices of the generators for any (j) −representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
15.7 Matrices of the group elements for any (j) −representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 270
15.8 Matrix representations of generators and group elements for j = 0 . . . . . . . . . . . . . . . . . . . . . . . . . 271
15.9 Matrix representations of generators and group elements for j = 1/2 . . . . . . . . . . . . . . . . . . . . . . . . 271
15.9.1 Matrix representations of the generators (j = 1/2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271
15.9.2 Matrix representations of the group elements (j = 1/2) . . . . . . . . . . . . . . . . . . . . . . . . . . . 272
15.10Matrix representations of generators and group elements for j = 1 . . . . . . . . . . . . . . . . . . . . . . . . . 274
15.10.1 Matrix representations of the generators (j = 1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
15.10.2 Matrix representations of the group elements (j = 1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
15.11Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 275
15.12Some features of the irreducible representations of the SO (3) group . . . . . . . . . . . . . . . . . . . . . . . . 275
15.13Direct product representations of SO (3) and their reduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
15.13.1 Properties of the direct product representations of SO (3) . . . . . . . . . . . . . . . . . . . . . . . . . . 276
15.13.2 Reduction of the direct product representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 278
15.13.3 Clebsch-Gordan coefficients for SO (3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281
15.14Irreducible spherical tensors and the Wigner-Eckart theorem in SO(3). . . . . . . . . . . . . . . . . . . . . . . . 283
15.14.1 Irreducible spherical tensors and its properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283
15.14.2 The Wigner-Eckart theorem for SO (3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
15.15Cartesian components of tensor operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284
15.16Cartesian components of a second rank tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 285
15.16.1 Decomposition of a second rank tensor in its symmetric and antisymmetric part . . . . . . . . . . . . . 286
15.16.2 Transformation of the trace of a second rank-tensor under SO (3) . . . . . . . . . . . . . . . . . . . . . . 287
15.16.3 Transformation of the antisymmetric part of a second rank-tensor under SO (3) . . . . . . . . . . . . . . 287
15.16.4 Transformation of the symmetric part of a second rank-tensor under SO (3) . . . . . . . . . . . . . . . . 288
15.16.5 Decomposition of V32 in invariant irreducible subspaces under SO (3) . . . . . . . . . . . . . . . . . . . . 288
15.16.6 Summary of results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289

16 The group SU(2) and additional properties of SO (3) 291


16.1 Relation between SO (3) and SU (2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
16.2 Cartesian parameterization of SU (2) matrices and the group manifold . . . . . . . . . . . . . . . . . . . . . . . 293
16.3 An alternative way to see the relation between SO (3) and SU (2) (optional) . . . . . . . . . . . . . . . . . . . . 294
16.4 Representation matrices for SU (2): The tensor method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
16.5 Invariant integration measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297
16.5.1 Invariant measure in different sets of coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
16.6 Invariant integration measure, general approach for compact Lie groups . . . . . . . . . . . . . . . . . . . . . . 300
16.6.1 Application of the general method to SU (2) and SO (3) . . . . . . . . . . . . . . . . . . . . . . . . . . . 301
16.7 Orthonormality relations of D(j) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
16.8 Completeness relations of D(j) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 304
16.9 Completeness relations for Bose-Einstein and Fermi-Dirac functions . . . . . . . . . . . . . . . . . . . . . . . . 305
16.9.1 Summary of properties of Bose-Einstein and Fermi-Dirac functions . . . . . . . . . . . . . . . . . . . . . 306
16.10Completeness relations for partially separable functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 306
16.10.1 Completeness for λ = 0 and spherical harmonics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
16.11Generalized projection operators in SO (3) and SU (2) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
16.12Differential equations and recurrence relations for the D(j) functions . . . . . . . . . . . . . . . . . . . . . . . . 308
16.12.1 Some useful formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
16.12.2 Recurrence relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311
16.12.3 Differential equations for D(j) (φ, θ, ψ) functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 312
16.13Group-Theoretical interpretation of the spherical harmonics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 315
16.13.1 Transformation under rotation and addition theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316
16.13.2 Decomposition of products of Ylm with the same arguments . . . . . . . . . . . . . . . . . . . . . . . . 316
16.13.3 Recursion formulas for Ylm (θ, φ) with l fixed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 317
16.13.4 Recursion formulas for Ylm (θ, φ) with m fixed . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318

16.13.5 Symmetry relations for spherical harmonics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 318


16.13.6 Orthonormality and completeness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
16.14Group theory, special functions and generalized Fourier analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
16.15Properties of the D(j) (φ, θ, ψ) representations of SO (3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
16.15.1 “Special” unitarity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319
16.15.2 Other properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 320
16.15.3 Properties in the Condon-Shortley convention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321

17 Applications in Physics of SO (3) and SU (2) 325


17.1 Applications of SO (3) for a particle in a central potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
17.1.1 Characterization of states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
17.1.2 Asymptotic plane wave states . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
17.1.3 Partial wave decomposition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 326
17.2 Kinematic effects, dynamic effects and group theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328
17.3 Transformation properties of fields under SO (3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329
17.3.1 Transformation of multicomponent fields under SO(3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330
17.4 Transformation properties of operators under SO (3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 331
17.4.1 Transformation properties of local operators under SO (3) . . . . . . . . . . . . . . . . . . . . . . . . . . 332
17.5 Applications of the generalized projection operators in SO (3) and SU (2) . . . . . . . . . . . . . . . . . . . . . 334
17.5.1 Single particle states with spin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
17.5.2 Two particle states with spin . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
17.5.3 Scattering of two particles with spin: partial-wave decomposition . . . . . . . . . . . . . . . . . . . . . . 337

18 Euclidean Groups in two dimensions 339


18.1 The Euclidean group in two dimensions E2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 340
18.2 One-dimensional representations of E2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
18.3 Basic Lie algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 344
18.4 Unitary irreducible representations of E2 : Lie algebra method . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
18.5 The induced representation method and the plane-wave basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348
18.6 Relation between the angular momentum and plane wave bases . . . . . . . . . . . . . . . . . . . . . . . . . . . 349
18.7 Differential equations, recursion formulas and addition theorem of the Bessel functions . . . . . . . . . . . . . . 351
18.7.1 Recursion formulas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 351
18.7.2 Differential equation for Bessel functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 353
18.7.3 Addition theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
18.7.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
18.8 Method of group contraction: SO (3) and E2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355
18.8.1 Relation between the irreducible representations of SO (3) and E2 . . . . . . . . . . . . . . . . . . . . . 357
18.8.2 Relation between representation functions of SO (3) and E2 . . . . . . . . . . . . . . . . . . . . . . . . . 357

19 General Treatment of continuous groups 358


19.1 The notion of continuity in a group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
19.2 Noether theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358
19.3 Lie Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358

A Definition and properties of angular momentum 361


A.1 Definition of angular momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
A.2 Algebraic properties of the angular momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
A.2.1 Algebra of the operators J2 , J3 , J+ , J− . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361
A.3 Structure of the eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 362
A.3.1 General features of the eigenvalues of J2 and J3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363
A.3.2 Determination of the eigenvalues of J2 and J3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364
A.4 Properties of the eigenvectors of J2 and J3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366
A.4.1 Generation of eigenvectors by means of the operators J+ and J− . . . . . . . . . . . . . . . . . . . . . . 366
A.4.2 Summary of results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
A.5 Construction of a standard basis from a C.S.C.O . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368
A.6 Decomposition of E in subspaces of the type E (j, k) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369

B Addition of two angular momenta 370


B.1 Total and partial angular momenta . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370
B.2 Addition of two angular momenta with j(1) = j(2) = 1/2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371
B.2.1 Eigenvalues of J3 and their degeneracy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
B.2.2 Diagonalization of J2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372
B.2.3 Eigenstates of J2 and J3 : singlet and triplet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373
B.3 General method of addition of two angular momenta . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374
B.3.1 Forming the tensor space and the associated angular momenta . . . . . . . . . . . . . . . . . . . . . . . 375
B.3.2 Total angular momentum and its relations of commutation . . . . . . . . . . . . . . . . . . . . . . . . . 375
B.3.3 Change of basis to be carried out . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
B.3.4 Eigenvectors of J2 and J3 : Case of j1 = j2 = 1/2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 376
B.3.5 Eigenvalues of J3 and their degeneracy: general case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377
B.3.6 Eigenvalues of J2 : general case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378
B.4 Eigenvectors common to J2 and J3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
B.4.1 Special case j1 = j2 = 1/2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 380
B.5 Eigenvectors of J2 and J3 : general case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 381
B.5.1 Determination of the vectors |JM⟩ of the subspace E (j1 + j2 ) . . . . . . . . . . . . . . . . . . . . . . . . 382
B.5.2 Determination of the vectors |JM⟩ in the other subspaces . . . . . . . . . . . . . . . . . . . . . . . . . . 382

C Transformation from the decoupled basis to the coupled basis and Clebsch-Gordan coefficients in SO (3) 384
C.1 Properties of the Clebsch-Gordan coefficients for SO (3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384
C.1.1 Selection rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
C.1.2 Unitarity of the transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385
C.1.3 Recurrence relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386
C.1.4 Phase conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387
C.1.5 Signs of some C-G coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388
C.1.6 Changing the order of j1 and j2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 389
C.1.7 Simultaneous change of the sign of m1 , m2 and M . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
C.1.8 Evaluation of ⟨m, −m (j, j) 0, 0⟩ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390
C.1.9 Some specific Clebsch Gordan coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
Chapter 1

Sets and functions

We shall develop only the notions of sets and functions required for our later work. We shall assume that the concept of an
element is clear enough as a well-defined object or entity. A set is then an aggregate of such elements, considered together
or as a whole; that is, the set can be regarded as a single entity by itself. A class or collection is a set of sets, and it is
sometimes useful to introduce the concept of a family, which is in turn a set of classes or collections1 . It is very important to
note that the terms element, set, collection and family are not intended to be used rigidly. Their usage depends on our attitude
and on the context: for instance, a straight line can be thought of as a set of points but also as a single entity (element). The
cartesian plane can be thought of as a single entity (element), as a set of points, as a set of straight lines parallel to the x-axis,
etc. If the lines are seen as sets of points, then we can consider the plane as a collection of straight lines parallel to the x-axis,
and in turn the three-dimensional space can be considered as a family of cartesian planes. This flexibility in our thinking about
elements, sets, collections and families is extremely important in both physical and mathematical reasoning. For instance, a
system of particles is considered as a set when we regard the “particles” as single indivisible systems, but we could then
introduce corrections due to the fact that our “particles” consist in turn of more elementary entities. From this point of view
our system of particles is now a collection of the “particles”, and the latter are sets instead of elements.

1.1 Partitions and equivalence relations


Definition 1.1 A partition of a non-empty set S is a collection {Si } of non-empty subsets of S which are pairwise disjoint
and whose union equals S. The subsets Si are called partition sets.

In other words, a partition is a way of splitting a non-empty set S into non-empty subsets {Si } such that each element of S
belongs to one and only one of those subsets.

Example 1.1 (a) For the set {1, 2, 3, 4, 5}, the collection {(1, 3, 5) , (2, 4)} gives a partition and the collection {(1, 2) , (3, 5) , (4)}
gives another one.

Example 1.2 Let X be the set of all points in the coordinate plane. Let Sx ≡ {(x, y) : y ∈ R}, i.e. the set of all points with
the same x-coordinate (a vertical line). The collection {Sx : x ∈ R} is a partition of X.

Of course, many partitions are possible for a given set.

Definition 1.2 A binary relation in the set S is a mathematical symbol or a verbal phrase, which we denote here by R, such
that for a given ordered pair (x, y) of elements of S the statement x R y is meaningful in the sense that it can be classified
definitely as true or false. The symbol x R y reads as “x is related by R to y”. Similarly, x R̸ y says that x is not related by
R to y.

Example 1.3 Let X be the set of all integers and let R mean “is less than” usually denoted by <. For instance, we have
3 < 6, 8 ≮ 6, 5 ≮ 5 etc.

Definition 1.3 Let X be a set and R a relation between ordered pairs of elements in X. This relation is said to be reflexive
if x R x ∀x ∈ X i.e. if x is related by R to itself. The relation is symmetric when x R y if and only if y R x, ∀x, y ∈ X.
The relation is transitive if the couple of statements x R y and y R z imply that x R z, for all x, y, z ∈ X.

Example 1.4 Let X be the set of all real numbers and let R mean “is less than” denoted by <. This relation is not reflexive
since x ≮ x. It is not symmetric because if x < y then y ≮ x. It is transitive because x < y and y < z implies x < z. The
relation ≤ (less than or equal) is reflexive because x ≤ x is true for any x ∈ R. It is not symmetric because x ≤ y does not
imply y ≤ x for all x, y ∈ R. It is transitive since x ≤ y and y ≤ z implies x ≤ z for all x, y, z ∈ R.
1 The term “class” will be used in group theory with another meaning, so we prefer the term collection in set theory.
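
For finite sets, the three properties above can be verified by direct enumeration. The Python sketch below (the helper name check_relation is our own, chosen only for illustration) tests the two relations of example 1.4 restricted to a small set of integers.

    from itertools import product

    def check_relation(X, rel):
        """Return (reflexive, symmetric, transitive) for the relation `rel` on the finite set X."""
        reflexive  = all(rel(x, x) for x in X)
        symmetric  = all(rel(y, x) for x, y in product(X, X) if rel(x, y))
        transitive = all(rel(x, z) for x, y, z in product(X, X, X)
                         if rel(x, y) and rel(y, z))
        return reflexive, symmetric, transitive

    X = range(5)
    print(check_relation(X, lambda x, y: x < y))    # (False, False, True)
    print(check_relation(X, lambda x, y: x <= y))   # (True, False, True)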


Definition 1.4 Let X be a set and R a binary relation between ordered pairs of elements in X. This relation is said to be an
equivalence relation, which we denote by ≃, if it has the following properties: (1) x ≃ x for every x (reflexivity); (2)
x ≃ y ⇒ y ≃ x (symmetry); (3) x ≃ y and y ≃ z ⇒ x ≃ z (transitivity). By extension, any relation in the set X
that satisfies these axioms is called an equivalence relation.

Example 1.5 Let X be the set of all real numbers and let R mean “is equal to” usually denoted by =. This is an equivalence
relation. (1) x = x, (2) x = y ⇔ y = x. (3) x = y and y = z implies x = z.

Definition 1.5 For a given partition {Si } of the set S we can induce a binary relation in S in the following way: Let (x, y)
be an ordered pair in S and x ∼ y means “x and y belong to the same partition set”.

The most remarkable result in this section is that the binary relation described above is an equivalence relation, and that
conversely a given equivalence relation in the set S induces a unique partition of S.

Theorem 1.1 Let S be a non-empty set and let {Si } be a given partition of S. The relation x ∼ y established in definition
1.5 is an equivalence relation. Conversely, if we can establish an equivalence relation x ≃ y between elements of S, such a
relation induces a unique partition {Fk } in which elements of a given partition set are related by ≃, i.e. x and y are in the
same partition set, if and only if x ≃ y.

Proof : Let S be a non-empty set and let {Si } be a given partition of S. For the relation x ∼ y established in definition
1.5, we see that (a) x ∼ x since x obviously belongs to the same partition set as x itself. (b) If x belongs to the same partition
set as y, it is clear that y belongs to the same partition set as x so the relation is symmetric. (c) Transitivity is also obvious.
Now for the converse. Let ≃ be an equivalence relation for the elements of the set S. If x is an element of S we define
the subset [x] ≡ {y : y ≃ x}, which we call the equivalence set of x; it consists of all elements which are equivalent to x. We
shall show that the collection of all distinct equivalence sets forms a partition of S. By reflexivity x ∈ [x] for each element x
in S so each equivalence set is non-empty and their union is S. Now we show that any couple of equivalence sets [x1 ], [x2 ] are
either disjoint or coincident. Suppose that [x1 ] and [x2 ] are not disjoint so they have a common element z. Since z belongs to
both equivalence sets we see that z ≃ x1 and z ≃ x2 and by symmetry x1 ≃ z. Let y ∈ [x1 ] hence y ≃ x1 . Since y ≃ x1 and
x1 ≃ z then by transitivity y ≃ z. Further, since y ≃ z and z ≃ x2 transitivity says that y ≃ x2 , so that y is in [x2 ]. Since y
was chosen arbitrarily in [x1 ], we see that [x1 ] ⊆ [x2 ]. Starting with w ∈ [x2 ] with a similar procedure we see that [x2 ] ⊆ [x1 ],
so that [x1 ] = [x2 ]. QED.
We have shown that a given partition of a non-empty set S induces naturally an equivalence relation, by saying that two
elements in S are equivalent if and only if they belong to the same partition set. Reciprocally, an equivalence relation for
elements of S induces a partition of S in which the elements within a partition set are equivalent to each other. The partition sets
are called equivalence sets or, more generally, equivalence classes. Each equivalence class can be generated by starting
with any of its elements and finding all elements that are equivalent to it. Theorem 1.1 shows that there is no distinction
between partitions of a set and equivalence relations in the set. They are a single mathematical idea considered from different
points of view. The approach chosen depends on our purposes.
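
As a concrete illustration of theorem 1.1, the following minimal Python sketch (the helper name partition_from_relation and the congruence-mod-3 relation are our own choices, not part of the notes) builds the partition induced by an equivalence relation on a small finite set.

    def partition_from_relation(S, equiv):
        """Group the elements of the finite set S into equivalence classes,
        assuming `equiv` is reflexive, symmetric and transitive on S."""
        classes = []
        for x in S:
            for cls in classes:
                if equiv(x, cls[0]):      # x is equivalent to the class representative
                    cls.append(x)
                    break
            else:                          # no existing class matched: start a new one
                classes.append([x])
        return classes

    # Congruence modulo 3 is an equivalence relation on the integers.
    print(partition_from_relation(range(10), lambda x, y: (x - y) % 3 == 0))
    # [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]

Each returned list is an equivalence set [x], and together they form a partition of the set, exactly as the theorem asserts.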

1.2 Functions, Mappings and transformations


A function consists of three objects: two sets X and Y and a rule f which assigns to each element x in X a single, completely
determined element y in Y . The element y corresponding to a given x is usually denoted by f (x), and it is called the image of
x under the rule f or the value of f at the element x. The rule is called a mapping, transformation or operator; the
term mapping is perhaps the most usual in a general context. The set X is called the domain and the set of elements y that
are the image of some x is called the range (in other words, the range is the set of all values f (x) obtained as x runs over X).
A function whose range consists of only one point is called a constant function.

Definition 1.6 Mappings on a set S: If we have a set S of objects, which we call points, a mapping M of the set S of
points on itself is a recipe to associate with each point p of the set, an image point p′ which is also in the set. Symbolically
p −−M−→ p′     or     p′ = M p ;   ∀p ∈ S

i.e. p′ is the image of p under the mapping M . We also call it a mapping M on the set S.

The set could be finite or infinite. In the case of finite sets of points, we can show the association explicitly, for example
 
M ≡ ( a → b , b → c , c → a )     (1.1)

in the case of infinite sets, it is customary to use a functional law; for instance, we can define a mapping on the real numbers
x ∈ R such that x → x′ = 2x.
Two mappings M and M ′ over a set of points S are identical if M p = M ′ p, ∀p ∈ S. One important mapping is the identity
which is defined as Ip = p, ∀p ∈ S.
We can also define the composite of mappings, which consists of successive mappings acting on the set of points. If a certain
mapping M1 takes p into p′ while another mapping M2 takes p′ into p′′ , then we have

p′ = M1 p ,   p′′ = M2 p′ = M2 (M1 p) = M2 M1 p ≡ (M2 M1 ) p

and since p′′ ∈ S we conclude that the composite M2 M1 is another mapping that makes the association p −−M2 M1−−→ p′′ . Taking
a longer chain of mappings we can see that
p′′′ = M3 p′′ = M3 (M2 M1 ) p
alternatively
p′′′ = M3 p′′ = M3 (M2 p′ ) ≡ (M3 M2 ) p′ = (M3 M2 ) M1 p
and since it is valid for all p ∈ S we have
M3 (M2 M1 ) = (M3 M2 ) M1
showing that mappings are associative.

Example 1.6 Let us use the mapping defined in Eq. (1.1) along with the following one

 
M ′ ≡ ( a → b , b → b , c → b )     (1.2)
and form the composite (M ′ M ) S, where S ≡ (a, b, c); we easily obtain

M S = ( a → b , b → c , c → a )   ⇒   M ′ (M S) = ( b → b , c → b , a → b )

M ′ (M S) = ( a → b → b , b → c → b , c → a → b )

if we make the composite in the opposite way

M (M ′ S) = ( a → b → c , b → b → c , c → b → c )
this simple example shows us that mappings are in general non-commutative, i.e. M M ′ ≠ M ′ M .
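
The same computation can be mirrored with finite mappings stored as Python dictionaries; this is a small sketch of ours (the helper name compose is not part of the notes).

    # The finite mappings of Eqs. (1.1) and (1.2) on the set S = {a, b, c}
    M  = {'a': 'b', 'b': 'c', 'c': 'a'}
    Mp = {'a': 'b', 'b': 'b', 'c': 'b'}   # M'

    def compose(f, g):
        """Composite mapping: apply g first, then f."""
        return {p: f[g[p]] for p in g}

    print(compose(Mp, M))   # M'M : {'a': 'b', 'b': 'b', 'c': 'b'}
    print(compose(M, Mp))   # MM' : {'a': 'c', 'b': 'c', 'c': 'c'}

The two composites differ, in agreement with the conclusion that mappings need not commute.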

Theorem 1.2 The composite of mappings on a set of points is an associative operation but not necessarily commutative.

We can also define mappings from a set of points S into another set of points S ′ . In that case the points in the domain
p ∈ S and the image points p′ ∈ S ′ belong to different sets S and S ′ . This kind of mapping has the same properties shown
above. There are some important kinds of mappings:

Definition 1.7 An onto mapping is a mapping in which every element p′ ∈ S ′ is the image of at least one point p ∈ S of
the domain.

When a mapping M : S → S ′ is onto we say that it is a mapping of S onto S ′ . When this is not the case (or we are not
sure) we say that it is a mapping of S into S ′ .

Definition 1.8 A one-to-one mapping is a mapping in which no two points in the domain set have the same image.

Definition 1.9 A mapping M is a one-to-one mapping of S onto S ′ if no two points of the domain set S have the same image
in S ′ , and every point p′ ∈ S ′ is image of one (and only one) point p ∈ S. When a mapping M : S → S ′ is one-to-one and
onto, we also say that it is a one-to-one correspondence.

The mapping M defined in (1.1) is one-to-one and onto while the one defined in (1.2) is not. The identity mapping is
one-to-one and onto. Given a one-to-one mapping M of S onto S ′ we can find its inverse: if p′ = M p, then there
is a mapping R such that Rp′ = p, ∀p ∈ S and p′ ∈ S ′ . This is a mapping of S ′ into S, which we usually denote as R ≡ M −1 .
Indeed this mapping is also one-to-one and onto.

Theorem 1.3 If M is a one-to-one mapping of S onto S ′ , its inverse M −1 is a one-to-one mapping of S ′ onto S.

Proof : Let p′ , q ′ ∈ S ′ with p′ ≠ q ′ . Since M is an onto mapping there exist p, q ∈ S such that M p = p′ and M q = q ′ ,
and since each point has a single image we must have p ≠ q. Thus, M −1 (p′ ) = p and M −1 (q ′ ) = q, so that M −1 (p′ ) ≠ M −1 (q ′ ),
showing that M −1 is one-to-one.
Now let p ∈ S. Since M p = p′ ∈ S ′ and M −1 p′ = p, we see that for any p ∈ S there exists p′ ∈ S ′ such that M −1 p′ = p,
showing that M −1 is onto. QED.
The following theorem is left to the reader

Theorem 1.4 If M is a one-to-one mapping of S onto S ′ and M ′ is a one-to-one mapping of S ′ onto S ′′ , then the composite
mapping M ′ M is a one-to-one mapping of S onto S ′′ .

Since p′ = M p ⇒ M −1 p′ = M −1 M p = p, ∀p ∈ S, we conclude that M −1 M = I, the identity transformation. In a
similar fashion we can show that M M −1 = I. It is also easy to prove that (M −1 )−1 = M . For instance, the inverse of M in
Eq. (1.1) is given by

M ≡ ( a → b , b → c , c → a )   ⇒   M −1 = ( a → c , b → a , c → b )

The inverse of a composite of one-to-one correspondences is easily obtained:

p′ = M p ,   p′′ = M ′ p′   ⇒   p′′ = (M ′ M ) p   ⇒   p = (M ′ M )−1 p′′

p = M −1 p′ ,   p′ = M ′−1 p′′   ⇒   p = M −1 M ′−1 p′′

therefore

(M ′ M )−1 = M −1 M ′−1

i.e. the inverse of the composite or product of one-to-one and onto mappings is the composite of the inverse mappings in
reverse order. Permutations are very important examples of one-to-one and onto mappings.
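
A quick numerical check of this rule, sketched in Python with the same dictionary representation used above (the helpers compose and inverse, and the particular invertible mapping M ′ used here, are our own choices for illustration):

    def compose(f, g):
        """Composite mapping: apply g first, then f."""
        return {p: f[g[p]] for p in g}

    def inverse(f):
        """Inverse of a one-to-one, onto mapping stored as a dictionary."""
        return {image: p for p, image in f.items()}

    M  = {'a': 'b', 'b': 'c', 'c': 'a'}     # the mapping of Eq. (1.1)
    Mp = {'a': 'b', 'b': 'a', 'c': 'c'}     # an invertible mapping M' chosen for this check

    lhs = inverse(compose(Mp, M))           # (M'M)^(-1)
    rhs = compose(inverse(M), inverse(Mp))  # M^(-1) M'^(-1)
    print(lhs == rhs)                       # True: inverses compose in reverse order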

Definition 1.10 (Permutation): A permutation is a one-to-one mapping of a finite non-empty set of points onto itself.

Some examples of one-to-one and onto mappings are the following.

Example 1.7 According to definition 1.10, a permutation can be carried out as follows: consider a set of n boxes, each
containing one element, namely ai in the i-th box, with i = 1, · · · , n. We interchange the elements in such a way that the
element ai ends up in the pi -th box, with pi ∈ {1, · · · , n}. One example of this type of mapping is

1 → 3 , 2 → 1 , 3 → 2 , 4 → 4
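As an informal aside (not part of the original notes), the permutation above and the general rules for composites and inverses can be checked with a tiny Python sketch; the dictionaries p and q and the helper functions are hypothetical names introduced only for this illustration.

```python
# A minimal sketch: permutations of {1, 2, 3, 4} as Python dictionaries,
# illustrating composition, the identity mapping and the inverse.
p = {1: 3, 2: 1, 3: 2, 4: 4}     # the permutation of Example 1.7
q = {1: 2, 2: 1, 3: 4, 4: 3}     # another (hypothetical) permutation

def compose(m2, m1):
    """Composite mapping m2 m1: first apply m1, then m2."""
    return {x: m2[m1[x]] for x in m1}

def inverse(m):
    """Inverse of a one-to-one and onto mapping: swap points and images."""
    return {image: x for x, image in m.items()}

identity = {x: x for x in p}
assert compose(inverse(p), p) == identity                          # M^{-1} M = I
assert compose(p, inverse(p)) == identity                          # M M^{-1} = I
assert inverse(compose(q, p)) == compose(inverse(p), inverse(q))   # (M'M)^{-1} = M^{-1} M'^{-1}
```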

More formally we can define the following

Example 1.8 Translations: the point x on the X-axis is transformed into x′ = x + c, where c is an arbitrary constant.

Example 1.9 Linear non-singular transformations in n−dimensional space. With respect to a fixed coordinate system, the
point (x1 , . . . , xn ) is mapped into the point (x′1 , . . . , x′n ) such that

x′i = Σ_{j=1}^{n} aij xj ; i = 1, . . . , n

this mapping is a transformation if and only if the matrix formed by the elements aij has non-zero determinant, i.e. is
non-singular.
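A short numerical sketch of Example 1.9 (assuming NumPy; the matrix of coefficients aij below is a hypothetical choice):

```python
# A sketch: the linear map x'_i = sum_j a_ij x_j on R^3 is one-to-one and onto
# precisely when the matrix (a_ij) is non-singular, i.e. det(a) != 0.
import numpy as np

a = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 3.0],
              [1.0, 0.0, 1.0]])          # hypothetical coefficients a_ij

x = np.array([1.0, -2.0, 0.5])
x_prime = a @ x                          # the image point (x'_1, x'_2, x'_3)

assert abs(np.linalg.det(a)) > 1e-12     # non-singular, hence invertible
x_back = np.linalg.solve(a, x_prime)     # recover the original point from x'
assert np.allclose(x_back, x)
```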
Chapter 2

Linear or vector spaces

We shall describe the most important properties of linear or vector spaces. This treatment is not rigorous at all, and only
some simple proofs are shown. Our aim is only to provide a framework for our subsequent developments.

2.1 Definition of a linear vector space


Definition 2.1 We define the set of scalars as either the set of all real numbers R or the set of all complex numbers C. When
we want to talk about scalars without specifying whether they are real or complex, we denote them as S.
Definition 2.2 Any non-empty set of objects V = {xi } form a linear space (or a vector space) if there is a “sum” operation
defined between the elements, and a “multiplication” by scalars (i.e. the system of real or complex numbers) such that
1. If xi ∈ V , and α ∈ S, then αxi ∈ V
2. If xi , xj ∈ V , then xi + xj ∈ V
3. xi + xj = xj + xi , ∀xi , xj ∈ V
4. xi + (xj + xk ) = (xi + xj ) + xk , ∀xi , xj , xk ∈ V
5. (α + β) xi = αxi + βxi ; ∀xi ∈ V and ∀α, β ∈ S
6. α (xi + xj ) = αxi + αxj , ∀xi , xj ∈ V and ∀α ∈ S
7. (αβ) xi = α (βxi ) ; ∀xi ∈ V and ∀α, β ∈ S
8. 1xi = xi ; ∀xi ∈ V
9. ∃ an element 0 ∈ V such that xi + 0 = xi , ∀xi ∈ V
10. ∀xi ∈ V , ∃ an element in V denoted by −xi such that xi + (−xi ) = 0
The element 0 is usually called the null vector or the origin. The element −x is called the additive inverse of x. We should
distinguish the symbols 0 (scalar) and 0 (vector). The two operations defined here (sum and product by scalars) are called
linear operations. A linear space is real (complex) if we consider the scalars as the set of real (complex) numbers.
Let us see some simple examples
Example 2.1 The set of all real (complex) numbers with ordinary addition and multiplication taken as the linear operations.
This is a real (complex) linear space.
Example 2.2 The set Rn (Cn ) of all n-tuples of real (complex) numbers is a real (complex) linear space under the following
linear operations
x ≡ (x1 , x2 , . . . , xn ) ; y ≡ (y1 , y2 , . . . , yn )
αx ≡ (αx1 , αx2 , . . . , αxn ) ; x + y ≡ (x1 + y1 , x2 + y2 , . . . , xn + yn )
Example 2.3 The set of all bounded continuous real functions defined on a given interval [a, b] of the real line, with the linear
operations defined pointwise as
(f + g) (x) = f (x) + g (x) ; (αf ) (x) = αf (x) ; x ∈ [a, b]
Some very important kinds of vector spaces are the ones containing certain sets of functions with some specific properties.
We can consider, for example, the set of functions defined on a certain interval with some condition of continuity, integrability,
etc. For instance, in quantum mechanics we use a vector space of functions.


2.2 Algebraic properties


Some algebraic properties arise from the axioms:
1. The origin or identity 0 must be unique. Assuming another identity 0′ we have that x + 0′ = 0′ + x = x for all x ∈ V.
Then 0′ = 0′ + 0 = 0. Hence 0′ = 0.
2. The additive inverse of any given vector x is unique. Assume that x′ is another inverse of x then
x′ = x′ + 0 = x′ + (x+ (−x)) = (x′ + x) + (−x) = 0 + (−x) = −x ⇒ x′ = −x

3. The equality xi + xk = xj + xk implies xi = xj . To see it, we simply add −xk on both sides.
(xi + xk ) + (−xk ) = (xj + xk ) + (−xk ) ⇒ xi + [xk + (−xk )] = xj + [xk + (−xk )] ⇒ xi + 0 = xj + 0 ⇒ xi = xj
This property is usually called the rearrangement lemma.
4. α · 0 = 0. We see it from α · 0 + αx = α · (0 + x) = αx = 0 + αx and applying the rearrangement lemma.
5. 0 · x = 0. It proceeds from 0 · x + αx = (0 + α) x = αx = 0 + αx and using the rearrangement lemma.
6. (−1) x = −x. We see it from x+ (−1) x = 1 · x + (−1) x = (1 + (−1)) x = 0x = 0 = x+ (−x) and the rearrangement
lemma.
7. If αx = 0 then α = 0 or x = 0. For if α ≠ 0 we can multiply both sides of the equation by α−1 to give α−1 (αx) = α−1 0
⇒ (α−1 α) x = 0 ⇒ 1x = 0 ⇒ x = 0. If x ≠ 0 we prove that α = 0 by assuming α ≠ 0 and finding a contradiction: this
is immediate from the above procedure, which shows that starting with α ≠ 0 we arrive at x = 0.
It is customary to simplify the notation x + (−y) and write it as x − y; such an operation is called subtraction.

2.3 Vector subspaces


Definition 2.3 A non-empty subset M of V is a vector subspace of V if M is a vector space in its own right with respect to
the linear operations defined in V .
This is equivalent to the condition that M contains all sums, negatives and scalar multiples of its elements. The other
properties are derived directly from the superset V . Further, since −x = (−1) x, this reduces to saying that M must be closed
under addition and scalar multiplication.
When M is a proper subset of V it is called a proper subspace of V . The zero space {0} and the full space V itself are
trivial subspaces of V .
The following concept is useful to study the structure of vector subspaces of a given vector space,
Definition 2.4 Let S = {x1 , .., xn } be a non-empty finite subset of V , then the vector
x = α1 x1 + α2 x2 + . . . + αn xn (2.1)
is called a linear combination of the vectors in S.
We can redefine a vector subspace by saying that a non-empty subset M of V is a linear subspace of V , if it is closed under
the formation of linear combinations. If S is a subset of V we can see that the set of all linear combinations of vectors in S is
a vector subspace of V , we denote this subspace as [S] and call it the vector subspace spanned by S. It is clear that [S] is the
smallest subspace of V that contains S. Similarly, for a given subspace M a non-empty subset S of M is said to span M if
[S] = M . Note that the closure of a vector space under an arbitrary linear combination can be proved by induction from the
closure property of vector spaces under linear operations. Notice additionally, that the proof of induction only guarantees the
closure under any finite sum of terms, if we have an infinite sum of terms (e.g. a series) we cannot ensure that the result is
an element of the space, this is the reason to define linear combinations as finite sums. If we want a property of closure under
some infinite sums additional structure should be added as we shall see later.
Suppose now that M and N are linear subspaces of V . Consider the set M + N of all sums of the form x + y with x ∈ M
and y ∈ N . Since M and N are subspaces, this sum is the subspace spanned by the union of both subspaces M +N = [M ∪ N ].
It could happen that M + N = V in this case we say that V is the sum of M and N . In turn it means that every vector in
V is expressible as a sum of a vector in M plus a vector in N . Further, in some cases any element z of V is expressible in a
unique way as such a sum, in this case we say that V is the direct sum of M and N and it is denoted by
V =M ⊕N
we shall establish the conditions for a sum to become a direct sum

Theorem 2.1 Let a vector space V be the sum of two of its subspaces V = M + N . Then V = M ⊕ N ⇔ M ∩ N = {0} .

Proof: Assume first that V = M ⊕ N , we shall suppose that ∃ z 6= 0 with z ∈ M ∩ N , and deduce a contradiction from it.
We can express z in two different ways z = z + 0 with z ∈ M and 0 ∈ N or z = 0 + z with 0 ∈ M and z ∈ N . This contradicts
the definition of a direct sum.
Now assume M ∩ N = {0}, by hypothesis V = M + N so that any z ∈ V can be expressed by z = x1 + y1 with
x1 ∈ M and y1 ∈ N . Suppose that there is another decomposition z = x2 + y2 with x2 ∈ M and y2 ∈ N . Hence
x1 + y1 = x2 + y2 ⇒ x1 − x2 = y1 − y2 ; but x1 − x2 ∈ M and y1 − y2 ∈ N . Since they are equal, then both belong to the
intersection so x1 − x2 = y1 − y2 = 0 then x1 = x2 and y1 = y2 showing that the decomposition must be unique. QED.
When two vector subspaces of a given space have only the zero vector in common, it is customary to call them disjoint
subspaces. It is understood that it does not correspond to disjointness in the set-theoretical sense, after all two subspaces of a
given space cannot be disjoint as sets, since any subspace must contain 0. Thus no confusion arises from this practice.
The concept of direct sum can be generalized when more subspaces are involved.

Definition 2.5 Let V be a vector space and let {M1 , .., Mn } be a collection of subspaces of V . We say that V is the direct sum
of the collection {M1 , .., Mn } and denote it as

V = M1 ⊕ M2 ⊕ . . . ⊕ Mn

when each z ∈ V can be expressed uniquely in the form

z = x1 + x2 + . . . + xn ; xi ∈ Mi

Theorem 2.2 Let V be a vector space and let {M1 , .., Mn } be a collection of subspaces of V . If V = M1 + .. + Mn , this sum
becomes a direct sum if and only if each Mi is disjoint from the subspace spanned by the others.

Proof: It is enough to realize that

V = M1 + M2 + .. + Mn = M1 + [M2 + .. + Mn ] = M1 + [∪_{i=2}^{n} Mi ]

but according to theorem 2.1, V = M1 ⊕ [M2 + .. + Mn ] if and only if M1 ∩ [∪_{i=2}^{n} Mi ] = {0}; proceeding similarly for the
other Mi 's, we arrive at the condition above. QED.
Note that the condition established by theorem 2.2 is stronger than requiring that any given Mi be disjoint from each of
the others. The previous facts can be illustrated by a simple example. The most general non-zero proper subspaces of R3 are
lines or planes that pass through the origin. Thus let us define

M1 = {(x1 , 0, 0)} , M2 = {(0, x2 , 0)} , M3 = {(0, 0, x3 )}


M4 = {(0, x2 , x3 )} , M5 = {(x1 , 0, x3 )} , M6 = {(x1 , x2 , 0)}

M1 , M2 , M3 are the coordinate axes of R3 and M4 , M5 , M6 are its coordinate planes. R3 can be expressed by direct sums of
these spaces in several ways
R3 = M 1 ⊕ M 2 ⊕ M 3 = M 1 ⊕ M 4 = M 2 ⊕ M 5 = M 3 ⊕ M 6
for the case of R3 = M1 ⊕ M2 ⊕ M3 we see that the subspace spanned by M2 and M3 i.e. M2 + M3 = [M2 ∪ M3 ] = M4 is
disjoint from M1 . Similarly M2 ∩ [M1 ∪ M3 ] = {0} = M3 ∩ [M1 ∪ M2 ]. It is because of this, that we have a direct sum.
Now let us take M3 , M6 and M ′ defined as a line on the plane M4 that passes through the origin making an angle θ with
the axis x3 such that 0 < θ < π/2, since R3 = M3 + M6 it is clear that

R3 = M3 + M6 + M ′ ; M3 ∩ M6 = M3 ∩ M ′ = M6 ∩ M ′ = {0} (2.2)

however this is not a direct sum because M3 + M6 = R3 , so that M ′ ∩ (M3 + M6 ) = M ′ ≠ {0}. Although the subspaces are
pairwise disjoint, there is at least one subspace that is not disjoint from the subspace spanned by the others. Let us show
that there are many decompositions for a given vector z ∈ R3 when we use the sum in (2.2). Since R3 = M3 + M6 a possible
decomposition is z = x + y + 0 with x ∈ M3 , y ∈ M6 , 0 ∈ M ′ . Now let us take an arbitrary non-zero element w of M ′ ; clearly
M3 + M6 = R3 contains M ′ so that w = x′ + y′ with x′ ∈ M3 , y′ ∈ M6 . Now we write z = x + y = (x − x′ ) + (y − y′ ) + x′ + y′
then z = (x − x′ ) + (y − y′ ) + w. We see that (x − x′ ) is in M3 and (y − y′ ) is in M6 . Now, since w ∈ M ′ and w 6= 0 this is
clearly a different decomposition with respect to the original one. An infinite number of different decompositions are possible
since w is arbitrary.
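The failure of uniqueness can also be checked numerically. The following sketch (assuming NumPy, with M ′ taken along the hypothetical direction (0, 1, 1) inside M4 ) exhibits two different decompositions of the same vector:

```python
# A sketch: M3 = z-axis, M6 = xy-plane, M' = line along (0, 1, 1) in R^3.
# The sum M3 + M6 + M' is all of R^3, yet decompositions are not unique,
# so the sum is not a direct sum.
import numpy as np

z = np.array([1.0, 2.0, 3.0])

# first decomposition: z = x + y + 0 with x in M3, y in M6, 0 in M'
x1, y1, w1 = np.array([0.0, 0.0, 3.0]), np.array([1.0, 2.0, 0.0]), np.zeros(3)

# second decomposition: move a non-zero w of M' into the picture
w2 = np.array([0.0, 1.0, 1.0])                 # w = x' + y' with x' in M3, y' in M6
x2 = x1 - np.array([0.0, 0.0, 1.0])            # x - x'  (still in M3)
y2 = y1 - np.array([0.0, 1.0, 0.0])            # y - y'  (still in M6)

assert np.allclose(x1 + y1 + w1, z)
assert np.allclose(x2 + y2 + w2, z)
assert not (np.allclose(x1, x2) and np.allclose(y1, y2) and np.allclose(w1, w2))
```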
Finally, it can be proved that for any given subspace M in V it is always possible to find another subspace N in V such
that V = M ⊕ N . Nevertheless, for a given M the subspace N is not necessarily unique. A simple example is the following: in
R2 any line crossing the origin is a subspace M , and we can define N as any line crossing the origin as long as it is not collinear
with M ; for any N satisfying this condition we have V = M ⊕ N .

2.4 Dimension and bases in vector spaces


Definition 2.6 Let V be a vector space and S = {x1 , . . . xn } a finite non-empty subset of V . The set S is defined as linearly
dependent if there is a set of scalars {α1 , .., αn } not all of them zero such that

α1 x1 + α2 x2 + . . . + αn xn = 0 (2.3)

if S is not linearly dependent, we say that it is linearly independent; this means that in Eq. (2.3) all coefficients αi must be
zero. Thus linear independence of S means that the only solution of Eq. (2.3) is the trivial one. When non-trivial solutions
exist the set is linearly dependent.

What is the utility of the concept of linear independence of a given set S? To see it, let us examine a given vector x in
[S]; each of these vectors arises from linear combinations of vectors in S

x = α1 x1 + α2 x2 + . . . + αn xn ; xi ∈ S (2.4)

Theorem 2.3 Let V be a vector space, let S = {x1 , .., xn } be an ordered, finite, non-empty subset of V . Let [S] be the
set formed by all linear combinations of S. The set S is linearly independent if and only if the corresponding ordered set
{α1 , . . . , αn } associated with any x ∈ [S] by Eq. (2.4), is unique.

Proof: Let us assume first that S is linearly independent. Suppose there is another decomposition of x as a linear
combination of elements of S
x = β1 x1 + β2 x2 + .. + βn xn ; xi ∈ S (2.5)
subtracting (2.5) from (2.4) we have

0 = (α1 − β1 ) x1 + (α2 − β2 ) x2 + .. + (αn − βn ) xn

but linear independence requires that only the trivial solution exists, thus αi = βi and the ordered set of coefficients is unique.
Now let us assume that the ordered set {α1 , .., αn } associated with any x ∈ [S], is unique. Let us take an arbitrary null
linear combination of S
γ1 x1 + γ2 x2 + .. + γn xn = 0 (2.6)
since 0 ∈ [S], the set {γ1 , . . . , γn } is unique by hypothesis, and since {0, . . . , 0} is a solution of Eq. (2.6), it is the only solution.
QED.
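Numerically, the content of theorem 2.3 for a finite set in Rm is easy to check: linear independence amounts to the matrix whose columns are the xi having full column rank, and the coefficients of any x ∈ [S] are then obtained by solving a linear system. A sketch assuming NumPy (the vectors below are hypothetical):

```python
# A sketch: a linearly independent set S = {x1, x2, x3} in R^3 and the unique
# coefficients of a vector x in [S].
import numpy as np

S = np.column_stack([[1.0, 0.0, 0.0],
                     [0.0, 1.0, 0.0],
                     [1.0, 1.0, 1.0]])            # columns x1, x2, x3

assert np.linalg.matrix_rank(S) == S.shape[1]      # S is linearly independent

x = 2.0 * S[:, 0] - 1.0 * S[:, 1] + 0.5 * S[:, 2]  # some vector x in [S]
alphas = np.linalg.solve(S, x)                     # the unique ordered coefficients
assert np.allclose(alphas, [2.0, -1.0, 0.5])
```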
Theorem 2.3 is very important for the theory of representations of vector spaces. The discussion above permits us to define
linear independence for an arbitrary (not necessarily finite) non-empty set S

Definition 2.7 Let S be an arbitrary non-empty subset of a vector space V . The set S is linearly independent if every finite
non-empty subset of S is linearly independent in the sense established by definition 2.6.

As before, an arbitrary non-empty set S is linearly independent if and only if any vector x ∈ [S] can be written in a unique
way as a linear combination of vectors in S.

Definition 2.8 A basis B for a vector space V , is a linearly independent subset of V that spans V , i.e. [B] = V

Bases are the most important linearly independent sets in V . It can be checked that B is a basis if and only if it is a
maximal linearly independent set, in the sense that any proper superset of B must be linearly dependent. We shall establish
without proof a very important theorem concerning bases of vector spaces

Theorem 2.4 If S is a linearly independent set of vectors in a vector space V , there exists a basis B in V such that S ⊆ B.

In words, given a linearly independent set S, if it is not already a basis of V , it is always possible to add some elements to
S for it to become a basis. A linearly independent set is non-empty by definition and cannot contain the null vector1 . Hence,
we see that if V = {0} it does not contain any basis; but if V ≠ {0} we can take a non-zero element x of V , the set {x}
is linearly independent, and theorem 2.4 guarantees that V has a basis that contains {x}. This proves that

Theorem 2.5 Every non-zero vector space has a basis

Now, since any set consisting of a single non-zero vector can be enlarged to become a basis, it is clear that any non-zero
vector space contains an infinite number of bases2 . It is worth looking for general features shared by all bases of a given linear
space. The first theorem in such a direction is the following
1 A linear combination α1 x1 + . . . + αk xk + β · 0 = 0 possesses (at least) an infinite set of solutions with α1 = . . . = αk = 0 and β arbitrary.
2 For example, if x ≠ 0 and α ≠ 0, 1 we see that αx ≠ x. We can form a different basis with each value of α (i.e. with each αx vector) for a fixed x.

Theorem 2.6 Let S = {x1 , x2 , . . . , xn } be a finite, ordered, and non-empty subset of the linear space V . If n = 1 then S is
linearly dependent⇔ x1 = 0. If n > 1 and x1 6= 0 then S is linearly dependent if and only if some one of the vectors x2 , ..., xn
is a linear combination of the vectors in the ordered set S that precede it.

Proof: The first assertion is trivial, so we assume n > 1 and x1 ≠ 0. Assuming that one of the vectors xi in the set
x2 , ..., xn is a linear combination of the preceding ones we have

xi = α1 x1 + ... + αi−1 xi−1 ⇒ α1 x1 + ... + αi−1 xi−1 − 1 · xi = 0

since the coefficient of xi is −1 (non-zero), this is a non-trivial linear combination of elements of S that equals zero. Thus S is
linearly dependent. We now assume that S is linearly dependent, hence the equation

α1 x1 + ... + αn xn = 0

has a solution with at least one non-zero coefficient. Let us define αi as the last non-zero coefficient; since x1 ≠ 0 then i > 1
and we have

α1 x1 + ... + αi xi + 0 · xi+1 + ... + 0 · xn = 0 ⇒ xi = − (α1 /αi ) x1 − ... − (αi−1 /αi ) xi−1
and xi is written as a linear combination of the vectors that precede it in the ordered set S. QED
The next theorem provides an important structural feature of the collection of bases in certain linear spaces

Theorem 2.7 If a given non-zero linear space V has a finite basis B1 = {e1 , ..., en } with n elements, then any other basis
B2 = {fi } of V must be finite and also with n elements.

Proof : We first prove that B2 is finite by assuming that it is infinite and arriving at a contradiction. Each ei is a linear
combination of some fj 's; the fj 's that appear in the linear combination of at least one ei form a finite subset S of B2 . Since
B2 is infinite, there exists a vector fj0 ∈ B2 which is not in S. But fj0 is a linear combination of the ei 's and therefore of the
vectors in S. It shows that S ∪ {fj0 } is a linearly dependent subset of B2 , but it contradicts the fact that B2 is a basis.
Since B2 is finite, we can write it as
B2 = {f1 , . . . , fm }

for some positive integer m. We shall show that m = n. Since the ei 's span V , f1 is a linear combination of the ei 's.
Therefore, the set S1 ≡ {f1 , e1 , . . . , en } is linearly dependent. Thus, according to theorem 2.6, one of the ei 's, which we
denote as ei0 , is a linear combination of the vectors in S1 that precede it. Hence, we can delete this vector, defining S2 =
{f1 , e1 , . . . , ei0 −1 , ei0 +1 , . . . , en }, and this set still spans V . Once again, f2 is a linear combination of the vectors in S2 , so the
set S3 ≡ {f1 , f2 , e1 , . . . , ei0 −1 , ei0 +1 , . . . , en } is linearly dependent. Applying theorem 2.6 once more, some vector in S3 must
be a linear combination of the preceding ones; and since the fj 's are linearly independent, such a vector must be one of the ei 's.
After deleting this vector, the set that remains still spans V . Continuing this way, it is clear that we cannot run out of the
ei 's before exhausting the fj 's; for if we did, theorem 2.6 would say that one of the fj 's is a linear combination of the preceding ones,
contradicting the fact that the fj 's are linearly independent. This argument shows that the number of fj 's cannot exceed the
number of ei 's, which means that m ≤ n. We can reverse the role of the ei 's and fj 's to prove that n ≤ m. Therefore, n = m.
QED.
The following theorem (that we give without proof) gives a complete structure to this part of the theory of vector spaces

Theorem 2.8 Let V be a non-zero vector space. If B1 = {ei } and B2 = {uj } are two bases of the vector space, then B1 and
B2 are sets with the same cardinality.

This theorem is valid even when the bases are sets with infinite cardinality. This result says that the cardinality of a
basis is a universal attribute of the vector space since it does not depend on the particular basis used. Hence the following are
natural definitions

Definition 2.9 The dimension of a non-zero vector space is the cardinality of any of its bases. If V = {0} the dimension is
defined to be zero.

Definition 2.10 A vector space is finite-dimensional if its dimension is a non-negative integer. Otherwise, it is infinite-
dimensional.

As with any abstract algebraic system, vector spaces require a theory of representations in which the most abstract set is
replaced by another set with more tangible objects. However, for the representation to preserve the abstract properties of the
vector space, set equivalence and linear operations must be preserved. This induces the following definition

Definition 2.11 Let V and V ′ be two vector spaces with the same system of scalars. An isomorphism of V onto V ′ is a
one-to-one mapping f of V onto V ′ such that f (x + y) = f (x) + f (y) and f (αx) = αf (x)

Definition 2.12 Two vector spaces with the same system of scalars are called isomorphic if there exists an isomorphism of
one onto the other.

When we say that there exists a one-to-one mapping from V onto V ′ we are establishing the equivalence of V and V ′ as
sets, so they have the same cardinality (the same “number of elements”). The remaining properties guarantee the preservation
of linear operations. Let x, y ∈ V , such that x′ ≡ f (x) ∈ V ′ and y′ ≡ f (y) ∈ V ′ ; now let x + y = z. The properties

f (x + y) = f (x) + f (y) , f (αx) = αf (x)

can be restated as

(x + y)′ = x′ + y′ , (αx)′ = αx′

in words, if x → x′ , y → y′ through the mapping f and x + y = z, x′ + y′ = z′ , then z → z′ through the same mapping. A
similar relation occurs with the scalar product. Therefore, to say that two vector spaces are isomorphic means that they are
abstractly identical with respect to their structure as vector spaces. It is easy to prove that isomorphisms between vector spaces
are equivalence relations, that is: (a) the identity is an isomorphism of V onto itself (reflexivity) (b) If f is an isomorphism of
V onto V ′ then f −1 (x′ ) exists, is one-to-one and onto and the relations

f −1 (x′ + y′ ) = f −1 (x′ ) + f −1 (y′ ) ; f −1 (αx′ ) = αf −1 (x′ )

also hold (symmetry). Finally, (c) if g is an isomorphism of V ′ onto V ′′ then the composed mapping h (x) ≡ g (f (x)) of V onto
V ′′ is also an isomorphism (transitivity), i.e. h (x) is one-to-one and onto, with h (x + y) = h (x) + h (y) and h (αx) = αh (x).
Now let V be a non zero finite dimensional space. If n is its dimension, there exists a basis B = {e1 , .., en } whose elements
are written in a definite order. Each vector x in V can be written uniquely in the form

x = α1 e1 + .. + αn en

so the n−tuple (α1 , .., αn ) is uniquely determined by x. If we define a mapping f by f (x) = (α1 , .., αn ) we see that this is an
isomorphism of V onto Rn or Cn depending on the system of scalars defined for V . It leads to:

Theorem 2.9 Any real (complex) non-zero finite dimensional vector space of dimension n is isomorphic to Rn (Cn ).

Indeed, this theorem can be extended to vector spaces of arbitrary dimensions; we shall not discuss this topic here. For
now, it suffices to realize that the isomorphism established here is not unique, for it depends on the basis chosen and even on
the order of the vectors in a given basis. It can be shown also that two vector spaces V and V ′ are isomorphic if and only if they
have the same scalars and the same dimension.
From the results above, we could be tempted to say that the abstract concept of vector space is not useful anymore,
and that we can concentrate on Rn or Cn only, in the case of finite-dimensional vector spaces. However, this is not true because
on one hand the isomorphism depends on the basis chosen, and most results are desirable to be written in a basis-independent
way. But even more important, almost all vector spaces studied in Mathematics and Physics possess some additional structure
(topological or algebraic) that is not necessarily preserved by the previous isomorphisms.
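The basis dependence of the isomorphism of theorem 2.9 is easy to exhibit numerically; in the sketch below (assuming NumPy) the second basis is a hypothetical choice:

```python
# A sketch: the coordinate map f(x) = (alpha_1, ..., alpha_n) is linear,
# but two different (ordered) bases of R^3 give two different isomorphisms.
import numpy as np

B1 = np.eye(3)                                   # standard basis e1, e2, e3
B2 = np.column_stack([[1.0, 1.0, 0.0],
                      [0.0, 1.0, 1.0],
                      [0.0, 0.0, 1.0]])          # another (hypothetical) basis

def coords(basis, x):
    """Coordinates of x in the given basis: solve basis @ alpha = x."""
    return np.linalg.solve(basis, x)

x = np.array([2.0, 3.0, 5.0])
y = np.array([-1.0, 0.0, 4.0])

assert np.allclose(coords(B2, x + y), coords(B2, x) + coords(B2, y))   # f(x+y) = f(x)+f(y)
assert np.allclose(coords(B2, 3.0 * x), 3.0 * coords(B2, x))           # f(ax) = a f(x)
assert not np.allclose(coords(B1, x), coords(B2, x))                   # different bases, different maps
```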

2.5 Mappings and transformations in vector spaces


For two vector spaces V and V ′ with the same system of scalars we can define a mapping T of V into V ′ that preserves linear
properties
T (x + y) = T (x) + T (y) ; T (αx) = αT (x)
T is called a linear transformation. We can say that linear transformations are isomorphisms of V into V ′ since linear operations
are preserved. T also preserves the origin and negatives

T (0) = T (0 · 0) = 0 · T (0) = 0 ; T (−x) = T ((−1) x) = (−1) T (x) = −T (x) (2.7)

It happens frequently that the states in physical systems are vectors of a given vector space (especially in quantum mechanics).
Hence, the transformations of these vectors are also important in Physics because they will represent transformations in the
states of the physical system. On the other hand, we shall see that the set of all linear transformations are in turn vector
spaces with their own internal organization.
Let us now define some basic operations with linear transformations, a natural definition of the sum of two linear transfor-
mations is of the form
(T + U ) (x) ≡ T (x) + U (x) (2.8)
and a natural definition of multiplication by scalars is

(αT ) (x) ≡ αT (x) (2.9)



finally the zero and negative linear transformations are defined as


0 (x) ≡ 0 ; (−T ) (x) ≡ −T (x) (2.10)
with these definitions it is immediate to establish the following
Theorem 2.10 Let V and V ′ be two vector spaces with the same system of scalars. The set of all linear transformations of
V into V ′ with the linear operations defined by Eqs. (2.8, 2.9, 2.10) is itself a vector space.
Proof : Let us define as β (V, V ′ ) the set of all linear transformations of V into V ′ . Let us check for some of the axioms, if
T, U ∈ β (V, V ′ ) then their linear operations defined by Eqs. (2.8, 2.9) yields
(λT ) (α1 x + α2 y) ≡ λT (α1 x + α2 y) = λ [α1 T (x) + α2 T (y)] = α1 λT (x) + α2 λT (y)
= α1 (λT ) (x) + α2 (λT ) (y)
(T + U ) (α1 x + α2 y) ≡ T (α1 x + α2 y) + U (α1 x + α2 y) = α1 T (x) + α2 T (y) + α1 U (x) + α2 U (y)
= α1 (T + U ) (x) + α2 (T + U ) (y)
hence λT and T + U are also linear transformations. Eq. (2.10) ensures the existence of the zero element and of the additive
inverse of each T ∈ β (V, V ′ ). The remaining axioms are easily proved. QED.
The most interesting cases are the linear transformations of V into itself and the linear transformations of V into the vector
space of scalars (real or complex). We shall study now the first case.

2.6 Linear transformations of a vector space into itself


In this case we usually speak of linear transformations on V . The first immediate consequence is the capability of defining the
composition of operators (or product of operators)
(T U ) (x) ≡ T (U (x)) (2.11)
associativity and distributivity properties can easily be derived
T (U V ) = (T U ) V ; T (U + V ) = T U + T V
(T + U ) V = T V + U V ; α (T U ) = (αT ) U = T (αU )
we prove for instance

[(T + U ) V ] (x) = (T + U ) (V (x)) = T (V (x)) + U (V (x))


= (T V ) (x) + (U V ) (x) = (T V + U V ) (x)
commutativity does not hold in general. It is also possible for the product of two non-zero linear transformations to be zero.
An example of non commutativity is the following: we define on the space P of polynomials p (x) the linear operators M and
D
M (p) ≡ xp ; D (p) ≡ dp/dx ⇒ (M D) (p) = M (D (p)) = x D (p) = x dp/dx

(DM ) (p) = D (M (p)) = D (xp) = x dp/dx + p
and M D ≠ DM. Consider now the linear transformations on R2 given by

Ta ((x1 , x2 )) = (x1 , 0) ; Tb ((x1 , x2 )) = (0, x2 ) ⇒ Ta Tb = Tb Ta = 0

thus Ta ≠ 0 and Tb ≠ 0 but Ta Tb = Tb Ta = 0.
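Both phenomena can be reproduced with small matrices. In the sketch below (assuming NumPy) the operators Ta and Tb are exact, while M and D are truncated to polynomials of degree at most two, which is already enough to see that M D ≠ DM:

```python
# A sketch: non-commuting operators and a zero product of non-zero operators.
import numpy as np

Ta = np.diag([1.0, 0.0])        # Ta(x1, x2) = (x1, 0)
Tb = np.diag([0.0, 1.0])        # Tb(x1, x2) = (0, x2)
assert np.allclose(Ta @ Tb, 0) and np.allclose(Tb @ Ta, 0)   # Ta, Tb != 0 but Ta Tb = Tb Ta = 0

# Coefficients (c0, c1, c2) represent p(x) = c0 + c1 x + c2 x^2.
D = np.array([[0.0, 1.0, 0.0],   # differentiation: p -> c1 + 2 c2 x
              [0.0, 0.0, 2.0],
              [0.0, 0.0, 0.0]])
M = np.array([[0.0, 0.0, 0.0],   # multiplication by x, truncated to degree 2
              [1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
assert not np.allclose(M @ D, D @ M)   # MD != DM
```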


Another natural definition is the identity operator I
I (x) ≡ x
we see that I 6= 0 ⇔ V 6= {0}. Further
IT = T I = T
for every linear operator T on V . For any scalar α the operator αI is called scalar multiplication since
(αI) (x) = αI (x) = αx
it is well known that for a mapping of V into V ′ to admit an inverse of V ′ into V , it must be one-to-one and onto. In this
context this induces the following definition

Definition 2.13 A linear transformation T on V is called non-singular if it is one-to-one and onto, and singular otherwise.
When T is non-singular its inverse can be defined so that
T T −1 = T −1 T = I
it can be shown that when T is non-singular T −1 is also a non-singular linear transformation.
Definition 2.14 Let V be a vector space and T a linear transformation on V . The range R and the null space (or kernel) K of
T in V are defined as
R ≡ {z ∈ V : T (x) = z for some x ∈ V } ; K ≡ {z ∈ V : T (z) = 0}
in words, the range is the set of all images of T through V and the kernel or null space is the set of elements of V that are
mapped into the null element 0. If T is onto, R = V , if T is one-to-one, K = {0}, if T is one-to-one and onto then R = V
and K = {0}.
Theorem 2.11 The range R and the null space (or kernel) K of T in V are vector subspaces of V
Proof : If z ∈ R there is some x ∈ V such that T (x) = z, hence αT (x) = αz and T (αx) = αz, since αx ∈ V it proves that
αz is in R. Now if z′ is also in R exists x′ such that T (x′ ) = z′ therefore T (x) + T (x′ ) = z + z′ so that T (x + x′ ) = z + z′ .
Since x + x′ ∈ V it proves that z + z′ ∈ R.
Now, if z ∈ K then T (z) = 0 hence αT (z) = 0 and T (αz) = 0, therefore αz ∈ K. If z′ also belongs to K then T (z′ ) = 0
and T (z) + T (z′ ) = T (z + z′ ) = 0 showing that z + z′ ∈ K. QED.
For future purposes the following theorem is highly relevant
Theorem 2.12 If T is a linear transformation on V , then T is non-singular ⇔ T (B) is a basis for V whenever B is.
Proof : This theorem is valid for any vector space but we shall restrict the proof to finite-dimensional vector spaces. Let
B ≡ {e1 , . . . , en } be a basis of V , and assume that T is non-singular, it suffices to prove that the set T (B) ≡ {T e1 , . . . , T en }
is linearly independent. Let us consider a null linear combination of the elements in T (B)
α1 T (e1 ) + . . . + αn T (en ) = 0 ⇔ (2.12)
T (α1 e1 + . . . + αn en ) = 0 (2.13)
since T is one-to-one, its kernel in V is {0}. Thus, the only solution for T (x) = 0 is x = 0. Hence the only solution for Eqs.
(2.12, 2.13) is
α1 e1 + . . . + αn en = 0 (2.14)
and since B is linearly independent, the only solution for (2.14) and hence for (2.12) is the trivial one, proving that T (B) is
linearly independent.
Now we assume that T (B) is a basis for V whenever B is. Let y be an arbitrary vector in V , since T (B) is a basis then
y = αi T (ei ) = T (αi ei )
and since αi ei ≡ z ∈ V there exists a vector z in V such that T (z) = y. Since y is arbitrary, T is onto. Now, we shall assume
that T (x) = T (y), since B is a basis we have x = αi ei and y = βi ei , from our hypothesis we have
T (x) − T (y) = 0 ⇒ T (x − y) = T ((αi − βi ) ei ) = 0
⇒ (αi − βi ) T (ei ) = 0 (2.15)
sum over repeated indices. Since T (B) is a basis, the only solution for (2.15) is the trivial one. Hence αi = βi for all indices
and x = y. Therefore T (x) = T (y) implies x = y and T is one-to-one. QED.

2.6.1 Projection operators


We shall discuss some very important types of linear transformations. Let V be the direct sum of two subspaces V = M ⊕ N
it means that any vector z in V can be written in a unique way as z = x + y with x ∈ M and y ∈ N . Since x is uniquely
determined by z, this decomposition induces a natural mapping of V onto M in the form
P (z) = x
it is easy to show that this transformation is linear and is called the projection on M along N . The most important property of
these transformations is that they are idempotent i.e. P 2 = P . We can see it taking into account that the unique decomposition
of x is x = x + 0 so that
P 2 (z) = P (P (z)) = P (x) = x = P (z)
The opposite is also true i.e. a given idempotent linear transformation induces a decomposition of the space V in a direct sum
of two subspaces

Theorem 2.13 If P is a linear transformation on a vector space V , P is idempotent⇔there exists subspaces M and N in V
such that V = M ⊕ N and P is the projection on M along N .

Proof : We already showed that a decomposition in a direct sum induces a projection; to prove the opposite let us define M
and N in the form
M ≡ {P (z) : z ∈ V } ; N = {z : P (z) = 0}
M and N are vector subspaces and correspond to the range and the null space (or kernel) of the transformation P respectively.
We show first that M + N = V , this follows from the identity

z = P (z) + (I − P ) (z) (2.16)

P (z) belongs to M by definition; now

P ((I − P ) (z)) = (P (I − P )) (z) = (P − P 2 ) (z) = (P − P ) (z) = 0 (z) = 0

thus (I − P ) (z) belongs to the null space N so M + N = V . To prove that this is a direct sum we must show that M and N
are disjoint (theorem 2.1). For this, assume that we have a given element P (z) in M that is also in N then

P (P (z)) = 0 ⇒ P 2 (z) = P (z) = 0

thus the common element P (z) must be the zero element. Hence, M and N are disjoint and V = M ⊕ N . Further, from (2.16)
P is the projection on M along N .
Of course in z = x + y with x ∈ M , y ∈ N we can define a projection P ′ (z) = y on N along M . In this case
V = M ⊕ N = N ⊕ M but now M is the null space and N is the range. It is easy to see that P ′ = I − P .
On the other hand, we have seen that for a given subspace M in V we can always find another subspace N such that
V = M ⊕ N so for a given M we can find a projector with range M and null space N . However, N is not unique so that
different projections can be defined on M .
Finally, it is easy to see that the range of a projector P corresponds to the set of points fixed under P i.e. M =
{P (z) : z ∈ V } = {z : P (z) = z}.
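As a concrete sketch (assuming NumPy), the matrix below is an idempotent operator on R2 : it is the projection on M = {(x1 , 0)} along the hypothetical complement N = {(t, t)}:

```python
# A sketch: an oblique (non-orthogonal) projection P on R^2 with P^2 = P.
import numpy as np

P = np.array([[1.0, -1.0],
              [0.0,  0.0]])
assert np.allclose(P @ P, P)                 # idempotent: P^2 = P

z = np.array([3.0, 2.0])
x = P @ z                                    # component in M (the range of P)
y = (np.eye(2) - P) @ z                      # component in N (the null space of P)
assert np.allclose(x + y, z)                 # z = x + y, the direct-sum decomposition
assert np.isclose(x[1], 0.0)                 # x lies in M = {(x1, 0)}
assert np.isclose(y[0], y[1])                # y lies in N = {(t, t)}
```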

2.7 Normed vector spaces


Inspired by the vectors of Rn , for which we define their lengths in a natural way, we can define lengths of vectors in abstract
vector spaces by assuming an additional structure

Definition 2.15 A normed vector space N is a vector space in which to each vector x there corresponds a real number denoted
by kxk with the following properties: (1) kxk ≥ 0 and kxk = 0 ⇔ x = 0; (2) kx + yk ≤ kxk + kyk; (3) kαxk = |α| kxk.

As well as allowing us to define a length for vectors, the norm permits us to define a distance between two vectors x and y in
the following way
d (x, y) ≡ kx − yk
it is easy to verify that this definition accomplishes the properties of a metric

d (x, y) ≥ 0 and d (x, y) = 0 ⇔ x = y


d (x, y) = d (y, x) ; d (x, z) ≤ d (x, y) + d (y, z)

in turn, the introduction of a metric permits us to define two crucial concepts: (a) convergence of sequences, (b) continuity of
functions of N into itself (or into any metric space).
We shall examine both concepts briefly

2.7.1 Convergent sequences, Cauchy sequences and completeness


If X is a metric space with metric d a given sequence in X

{xn } = {x1 , .., xn , ...}

is convergent if there exists a point x in X such that for each ε > 0, there exists a positive integer n0 such that d (xn , x) < ε
for all n ≥ n0 . x is called the limit of the sequence. A very important fact in metric spaces is that any convergent sequence
has a unique limit.

Further, assume that x is the limit of a convergent sequence; it is clear that for each ε > 0 there exists n0 such that
m, n ≥ n0 ⇒ d (x, xm ) < ε/2 and d (x, xn ) < ε/2. Using the properties of the metric we have

m, n ≥ n0 ⇒ d (xm , xn ) ≤ d (xm , x) + d (x, xn ) < ε/2 + ε/2 = ε

a sequence with this property is called a Cauchy sequence. Thus, any convergent sequence is a Cauchy sequence. The opposite
is not necessarily true. As an example, let X be the interval (0, 1]; the sequence xn = 1/n is a Cauchy sequence but is not
convergent, since the point 0 (to which it wants to converge) is not in X. Thus, convergence depends not only on the sequence
itself, but also on the space in which it lies. Some authors call Cauchy sequences “intrinsically convergent” sequences.
A complete metric space is a metric space in which every Cauchy sequence is convergent. The space (0, 1] is not complete,
but it can be made complete by adding the point 0 to form [0, 1]. In fact, any non-complete metric space can be completed
by adjoining some appropriate points. It is a fundamental fact that the real line, the complex plane and Rn , C n are complete
metric spaces.
We define an open sphere of radius r centered at x0 as the set of points such that

Sr (x0 ) = {x ∈ X : d (x, x0 ) < r}

and an open set is a subset A of the metric space such that for any x ∈ A there exists an open sphere Sr (x) such that
Sr (x) ⊆ A.
For a given subset A of X a point x in X is a limit point of A if each open sphere centered on x contains at least one point
of A different from x.
A subset A is a closed set if it contains all its limit points. There is an important theorem concerning closed metric
subspaces of a complete metric space

Theorem 2.14 Let X be a complete metric space and Y a metric subspace of X. Then Y is complete ⇔ it is closed.

2.7.2 The importance of completeness in Physics


In either classical or quantum mechanics it is usual to encounter series of the form

Σ_{n=1}^{∞} cn ψn

In quantum mechanics, ψn are functions belonging to the state space that describe physical states and cn are some appropriate
coefficients. In classical mechanics, ψn are usually solutions of a linear differential equation whose superposition with coefficients
cn forms the most general solution; the set of all solutions usually forms a vector space.
For this series to have any physical sense, it must be convergent. To analyze convergence we should construct the sequence
of partial sums

{ Σ_{n=1}^{1} cn ψn , Σ_{n=1}^{2} cn ψn , Σ_{n=1}^{3} cn ψn , ... }

if this series is “intrinsically” convergent, the corresponding sequence of partial sums should be a Cauchy sequence. Any series
that defines a Cauchy sequence has a bounded norm

k Σ_{n=1}^{∞} cn ψn k < ∞

it would then be desirable that an intrinsically convergent series given by a superposition of physical states ψn be another
physical state ψ. In other words, the limit of the partial sums should be within the vector space that describe our physical
states. To ensure this property we should demand completeness of the vector space that describe the physical states of the
system.
On the other hand, it is usual to work with subspaces of the general physical space. If we want to guarantee for a series in
a given subspace to be also convergent, we should require for the subspace to be complete by itself, and according to theorem
2.14 it is equivalent to require the subspace to be closed with respect to the total space. Therefore, closed subspaces of the
general space of states would be particularly important in quantum mechanics.

2.7.3 The concept of continuity and its importance in Physics


The concept of continuity arises naturally for mappings of a metric space into another metric space. Let f be a mapping of
(X, d1 ) into (Y, d2 ); we say that f is continuous at x0 ∈ X if for each ε > 0 there exists δ > 0 such that d1 (x, x0 ) < δ ⇒
d2 (f (x) , f (x0 )) < ε. The mapping is said to be continuous if it is continuous at each point of its domain.
Continuity is also an essential property in Physics, since for most physical observables or states we require some kind of
“smoothness” or “well behavior”. Continuity is perhaps the weakest condition of well behavior usually required in Physics.
We have previously defined isomorphisms as mappings that preserve all structure concerning a general vector space. It is
then natural to characterize mappings that preserve the structure of a set as a metric space

Definition 2.16 If X, Y are two metric spaces with metrics d1 and d2 a mapping f of X into Y is an isometry if d1 (x, x′ ) =
d2 (f (x) , f (x′ )) ∀x, x′ ∈ X. If there exists an isometry of X onto Y , we say that X is isometric to Y .

It is clear that an isometry is necessarily one-to-one. If X is isometric to Y then the points of these spaces can be put
in a one to one correspondence in such a way that the distance between pairs of corresponding points are the same. In that
sense, isometric spaces are abstractly identical as metric spaces. For instance, if we endow a vector space V with a metric then
another metric vector space V ′ will be identical to V as metric and vector space if and only if there is an isometric isomorphism
between them. Isometry preserves the metric (distances) while isomorphism preserves the vector structure (linear operations). Of course
a norm-preserving mapping is an isometry for the metric induced by such a norm. Thus for our purposes norm preserving
mappings will be isometries.

2.8 Banach Spaces


From our experience in classical mechanics we have seen that the concept of a vector space is useful especially when we
associate a length to the vectors; this induces the concept of normed vector spaces, and the norm in turn induces a metric, i.e. a
natural concept of the distance between vectors. The metric structure in turn leads us to the concepts of convergent sequences and
continuity of functions. In particular, the previous discussion concerning completeness inclines us in favor of spaces that are
complete. Then we are directly led to normed and complete linear spaces

Definition 2.17 A Banach space is a normed and complete vector space

As in any vector space, linear transformations are crucial in the characterization of Banach spaces. Since a notion of
continuity is present in these spaces and continuity is associated with well behavior in Physics, it is natural to concentrate our
attention on continuous linear transformations of a Banach space B into itself or into the set of scalars. Transformations of
B into itself will be useful when we want to study possible modifications of the vectors (for instance the time evolution of the
vectors describing the state of the system). On the other hand, transformations of B into the scalars will be useful when we
are interested in connecting the state of a system (represented by a vector) with a measurement (which is a number).
Before considering each specific type of continuous linear transformation, we should clarify what the meaning of continuity
of a linear transformation is. Since continuity depends on the metric induced on the space, we should define for a given space
of linear transformations on a Banach space B, a given metric. We shall do it by first defining a norm in the space of linear
transformations. Specifically, we shall define the following norm

kT k = sup {|T (x)| : kxk ≤ 1} (2.17)

We shall refer to the metric induced by this norm when we talk about the continuity of any linear transformation of a Banach
space into itself or into the scalars. It can be shown that for this norm, continuity is equivalent to boundedness.
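For a matrix operator on Rn equipped with the Euclidean norm, the norm (2.17) coincides with the largest singular value of the matrix. The sketch below (assuming NumPy, with a randomly generated matrix) checks the supremum over the unit ball numerically, together with the boundedness property kT (x)k ≤ kT k kxk:

```python
# A sketch: operator norm of a matrix and continuity-as-boundedness.
import numpy as np

rng = np.random.default_rng(0)
T = rng.normal(size=(3, 3))

op_norm = np.linalg.norm(T, 2)               # largest singular value of T

samples = rng.normal(size=(10000, 3))
samples /= np.linalg.norm(samples, axis=1, keepdims=True)    # points on the unit sphere
assert np.max(np.linalg.norm(samples @ T.T, axis=1)) <= op_norm + 1e-9

x = rng.normal(size=3)
assert np.linalg.norm(T @ x) <= op_norm * np.linalg.norm(x) + 1e-9   # boundedness
```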

2.8.1 Continuous linear transformations of a Banach space into scalars


Let us consider first the continuous linear transformations of B into the scalars. This induces the following

Definition 2.18 A real (or complex) functional is a continuous linear transformation of a real (or complex) normed linear
space into R (or C).

Definition 2.19 The set of all functionals on a normed linear space N is called the conjugate space of N and is denoted by
N ∗.

For the case of general normed spaces (and even for Banach spaces), the structure of their conjugate spaces is in general
very intricate. However, we shall see that conjugate spaces are much simpler when an additional structure (inner product) is
added to Banach spaces.

2.8.2 Continuous linear transformations of a Banach space into itself


Let us discuss now the continuous linear transformations of Banach spaces into themselves.

Definition 2.20 An operator is a continuous linear transformation of a normed space into itself.

Definition 2.21 The set of all operators on a Banach space, is denoted by ß(T )

Indeed, the set of operators on a Banach space is itself a space with a rich structure as can be seen from the following
theorem

Theorem 2.15 The space ß(T ) previously defined, forms an algebra

A particularly useful result in Physics is the following

Theorem 2.16 If a one-to-one linear transformation T of a Banach space onto itself is continuous, then its inverse is auto-
matically continuous

Though we do not provide a proof, it is important to note that this result requires the explicit use of completeness (it is
not valid for a general normed space). We see then that completeness gives us another desirable property in Physics: if a given
transformation is continuous and its inverse exists, this inverse transformation is also continuous.
Let us now turn to projectors on Banach spaces. For general vector spaces projectors are defined as idempotent linear
transformations. For Banach spaces we will require an additional structure, which is continuity

Definition 2.22 A projector in a Banach space B, is defined as an idempotent operator on B

The consequences of the additional structure of continuity for projectors in Banach spaces are of particular interest in
quantum mechanics

Theorem 2.17 If P is a projection on a Banach space B, and if M and N are its range and null space. Then M and N are
closed subspaces of B such that B = M ⊕ N

The reciprocal is also true

Theorem 2.18 Let B be a Banach space and let M and N be closed subspaces of B such that B = M ⊕ N . If z = x + y is
the unique representation of a vector z in B with x in M and y in N , then the mapping P defined by P (z) = x is a projection
on B whose range and null space are M and N respectively.

These properties are interesting in the sense that the subspaces generated by projectors are closed subspaces of a complete
space, and then they are complete by themselves. We have already said that dealing with complete subspaces is particularly
important in quantum mechanics.
There is an important limitation with Banach spaces. If a closed subspace M is given, though we can always find many
subspaces N such that B = M ⊕ N , there is no guarantee that any of them is closed. So there is no guarantee that M
alone generates a projection in our present sense. The solution of this inconvenience is another motivation to endow B with
an additional structure (inner product).
Finally, the definition of the conjugate N ∗ of a normed linear space N allows us to associate to each operator on the normed
linear space N an operator on N ∗ in the following way. Let us form a complex number c0 with three objects: an operator T
on N , a functional f on N and an element x ∈ N . We take this procedure: we map x into T (x) and then map this new element
of N into the scalar c0 through the functional f

x → T (x) → f (T (x)) = c0 (2.18)

Now we get the same number with another set of three objects: an operator T ∗ on N ∗ , a functional f on N (the same functional
of the previous procedure) and an element x ∈ N (the same element stated before). The steps are now the following: we start
with the functional f in N ∗ and map it into another functional through T ∗ , then we apply this new functional to the element
x and produce the number c0 . Schematically it is

f → T ∗ (f ) → [T ∗ (f )] (x) = c0 (2.19)
with this we are defining an appropriate mapping f ′ such that f ′ (x) gives our number. In turn, it induces an operator on N ∗
that maps f into f ′ , and this is the newly defined operator T ∗ on N ∗ . Equating Eqs. (2.18, 2.19) we have

[T ∗ (f )] (x) ≡ f (T (x)) (2.20)



where f is a functional on N i.e. an element in N ∗ , T an operator on N and x an element of N . If for a given T we have that
Eq. (2.20) holds for f and x arbitrary, we have induced a new operator T ∗ on N ∗ from T . It can be shown that T ∗ is also
linear and continuous i.e. an operator. When inner product is added to the structure, this operator becomes much simpler.
By using the norm (2.17) applied to operators on B ∗ we have
kT ∗ k = sup {kT ∗ (f )k : kf k ≤ 1}
it can be proved that
kT ∗ k = kT k (2.21)

such that the mapping T → T ∗ is norm preserving and therefore an isometry; we can also see that

(αT1 + βT2 )∗ = αT1∗ + βT2∗ ; I ∗ = I ; (T1 T2 )∗ = T2∗ T1∗ (2.22)
Since linear operations are preserved, the mapping T → T ∗ is an isometric isomorphism. However, the product is reversed
under the mapping; this shows that the spaces ß(T ) and ß(T ∗ ) are equivalent as metric and vector spaces but they are not
equivalent as algebras (the spaces are not isomorphic as algebras because of the non-preservation of the product).
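In finite dimension the construction can be made concrete. If a functional on Rn is represented by a vector a through f (x) = a · x, the defining relation (2.20) forces T ∗ to act on a as the transposed matrix, and the isometry (2.21) is the statement that a matrix and its transpose have the same largest singular value. A sketch assuming NumPy:

```python
# A sketch: the conjugate operator T* realized as matrix transposition on R^3.
import numpy as np

rng = np.random.default_rng(1)
T = rng.normal(size=(3, 3))     # an operator on R^3
a = rng.normal(size=3)          # represents the functional f(x) = a . x
x = rng.normal(size=3)

lhs = (T.T @ a) @ x             # [T*(f)](x), with T* acting on a as T.T
rhs = a @ (T @ x)               # f(T(x))
assert np.isclose(lhs, rhs)                                       # Eq. (2.20)
assert np.isclose(np.linalg.norm(T.T, 2), np.linalg.norm(T, 2))   # ||T*|| = ||T||, Eq. (2.21)
```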

2.9 Hilbert spaces


In R3 , it is customary to define a set of three orthonormal vectors ui such that any vector in R3 can be written as x = αi ui
(sum over repeated indices). The dot product is defined such that
x · y ≡ kxk kyk cos θ (2.23)
the dot product is a good mathematical tool for many purposes in solid analytic geometry. If we accept the statement that
the zero vector is orthogonal to every vector we can say that the dot product is null if and only if both vectors are orthogonal.
Let {vi } be a given basis (not necessarily orthonormal) of R3 ; any two vectors in R3 are expressed in the form
x = αi vi ; y = βj vj (2.24)
the dot product and the norm of these two vectors can be written

x · y = (αi vi ) · (βj vj ) = αi βj vi · vj ≡ αi βj mij

x · x = kxk2 = (αi vi ) · (αj vj ) = αi αj vi · vj ≡ αi αj mij
These expressions can be in general complicated. Notice that these and other algebraic operations with dot products become
much easier when an orthonormal basis is used since in this case we have mij = δij so that x · y = αi βi and x · x = αi αi .
These facts put orthonormal basis in a privileged position among other bases.
Further, an attempt to extend these ideas to C 3 permits us to define the inner product in this space in the following way:
given the vectors (2.24) where the α's and β's are complex, we define

(x, y) = (α∗i vi ) · (βj vj ) = α∗i βj (vi · vj ) = α∗i βj mij

the conjugate on α is included in order to obtain the appropriate norm of complex vectors from the inner product of such a
vector with itself. It can be seen by using an orthonormal basis in which mij = δij

(x, x) = kxk2 = α∗i αi = |αi |2
the simplification above comes from the extension of the concept of orthogonality to complex vectors, they are orthogonal if
and only if (x, y) = 0.
In both the real and complex cases, the concept of orthogonality was very important not only because of the geometry
but also because of the algebra. We observe for instance, that no angle like the one in (2.23) can be defined in the complex
case, but the algebra of inner products continues being simple and useful. On the same ground, we were able to talk about
orthogonality in the complex case via the inner product and exploit the advantages of orthonormal sets, although two vectors
of the complex plane are not “perpendicular”.
In the same way, in abstract vector spaces it is not so clear how to use the concept of orthogonality in a geometrical way, but
from the discussion above it is clear that the extension of the concept would represent great simplifications in the algebraic
sense. Notwithstanding, we shall see that the extension of the concept of inner product will also provide some geometrical
interpretations.
As always in mathematics, a natural extension should come from the extrapolation of the essential properties of the concept
in the restricted setting. The inner product in the complex and real spaces has the following properties

(x, αy + βz) = α (x, y) + β (x, z) ; (x, y) = (y, x)∗ ; (x, x) = kxk2

we are led to the following

Definition 2.23 A Hilbert space is a real or complex Banach space whose norm arises from an inner product, which in turn
is defined as a complex function (x, y) of the vectors x and y with the following properties

(x, αy + βz) = α (x, y) + β (x, z)

(x, y) = (y, x)∗

(x, x) = kxk2

Definition 2.24 Two vectors x, y in a Hilbert space are said to be orthogonal if (x, y) = 0, we denote it as x ⊥ y. A vector
is said to be normal or unitary if (x, x) = 1.

Theorem 2.19 Let H be a Hilbert space. From the axioms of the inner product, the following properties hold

|(x, y)| ≤ kxk kyk (2.25)

kx + yk2 + kx − yk2 = 2 kxk2 + 2 kyk2 (2.26)

4 (x, y) = kx + yk2 − kx − yk2 − i kx + iyk2 + i kx − iyk2 (2.27)

x ⊥ y ⇒ kx + yk2 = kx − yk2 = kxk2 + kyk2 (2.28)

Proof: As a matter of illustration, let us prove Eq. (2.26), known as the parallelogram law

kx + yk2 + kx − yk2 = (x + y, x + y) + (x − y, x − y) = (x, x + y) + (y, x + y) + (x, x − y) − (y, x − y)
= (x, x) + (x, y) + (y, x) + (y, y) + (x, x) − (x, y) − (y, x) + (y, y)
= 2 (x, x) + 2 (y, y) = 2 kxk2 + 2 kyk2

QED.
Equation (2.25) is known as the Schwarz inequality. Eq. (2.26) is known as the parallelogram law because in plane geometry
it reduces to the theorem which says that the sum of the squares of the sides of a parallelogram equals the sum of the squares
of its diagonals. Beyond its geometrical interpretation, this law says that only certain Banach spaces can be converted into
Hilbert spaces: only those normed complete spaces in which the norm obeys the parallelogram law can become a Hilbert space.
Further, if for a given norm the parallelogram law is satisfied, then Eq. (2.27) gives us the recipe to define an inner product
from such a norm. Finally, for reasons easy to visualize, Eq. (2.28) is called the Pythagorean theorem.
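Both identities are easy to check numerically. The sketch below (assuming NumPy) uses np.vdot, which conjugates its first argument and therefore matches the convention of definition 2.23:

```python
# A sketch: parallelogram law (2.26) and polarization identity (2.27) in C^3.
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=3) + 1j * rng.normal(size=3)
y = rng.normal(size=3) + 1j * rng.normal(size=3)
norm2 = lambda v: np.vdot(v, v).real            # ||v||^2 = (v, v)

# parallelogram law
assert np.isclose(norm2(x + y) + norm2(x - y), 2 * norm2(x) + 2 * norm2(y))

# polarization identity: the inner product recovered from norms alone
pol = norm2(x + y) - norm2(x - y) - 1j * norm2(x + 1j * y) + 1j * norm2(x - 1j * y)
assert np.isclose(pol, 4 * np.vdot(x, y))
```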
Let H be a Hilbert space. A vector x ∈ H is said to be orthogonal to a non-empty set S ⊆ H, if x ⊥ y for all y ∈ S. The
orthogonal complement of S is the set of all vectors in H that are orthogonal to S, it is denoted as S ⊥ . Two non-empty sets
M ⊆ H and N ⊆ H are orthogonal if x ⊥ y for all x ∈ M and for all y ∈ N ; this is denoted as M ⊥ N . If M is a closed
vector subspace of H then M ⊥ is also closed in H. The following theorems are important for physical purposes

Theorem 2.20 If M and N are closed vector subspaces of a Hilbert space H such that M ⊥ N , then the linear subspace
M + N is also closed

Theorem 2.21 If M is a closed linear subspace of a Hilbert space H, then H = M ⊕ M ⊥

Thus we see that the span of the union of orthogonal closed subspaces preserves the closure property, and so the completeness
property too. In addition, theorem 2.21 says that given a closed subspace of H we can always find a closed subspace that
generates H by direct sum together with it; moreover, the closed subspace that does the job is the orthogonal complement. It
means that for any given closed subspace M we can define a projection with range M and null space M ⊥ . Contrast this with
the problem arising in Banach spaces, in which we cannot guarantee the closure of the complementary space.
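In Rn the decomposition H = M ⊕ M ⊥ of theorem 2.21 can be computed explicitly: if the columns of a matrix A span M , the orthogonal projector on M is P = A(AT A)−1 AT . A sketch assuming NumPy (the spanning vectors are hypothetical):

```python
# A sketch: orthogonal projection onto M and the decomposition z = x + y
# with x in M and y in the orthogonal complement of M.
import numpy as np

A = np.column_stack([[1.0, 0.0, 1.0],
                     [0.0, 1.0, 1.0]])          # columns span M
P = A @ np.linalg.solve(A.T @ A, A.T)           # projector with range M, null space M-perp

z = np.array([2.0, -1.0, 3.0])
x = P @ z                                       # component in M
y = z - x                                       # component in M-perp
assert np.allclose(P @ P, P)                    # P is idempotent
assert np.allclose(A.T @ y, 0.0)                # y is orthogonal to every vector of M
```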

2.9.1 Orthonormal sets


An orthonormal set {ei } in H is a non-empty subset of H such that if i ≠ j then ei ⊥ ej , and kei k = 1 for all i. This set could
be of any cardinality (not necessarily countable). The zero Hilbert space has no orthonormal sets. The following theorems are
of great practical interest

Theorem 2.22 Let {e1 , .., en } be a finite orthonormal set in H. If x is a vector in H we have

Σ_{i=1}^{n} |(ei , x)|2 ≤ kxk2 (2.29)

x − Σ_{i=1}^{n} (ei , x) ei ⊥ ej ; j = 1, .., n (2.30)

We can give the following interpretation of this theorem: Eq. (2.29) says that the sum of the squares of the components of a
vector along the various orthogonal directions defined by the orthonormal set cannot exceed the squared length of the vector.
Similarly, Eq. (2.30) says that if we subtract from a vector its components in several perpendicular directions, the resultant
has no components left in those directions.
The following theorem shows that the coefficients obtained for a given vector from an orthonormal set are not arbitrary

Theorem 2.23 If {ei } is an orthonormal set in a Hilbert space H, and if x is any vector in H, the set S = {ei : |(ei , x)|2 ≠ 0}
is either empty or countable.

These results permit to extend theorem 2.22 for arbitrary orthonormal sets

Theorem 2.24 Let {ei } be an arbitrary orthonormal set in H. If x is a vector in H we have

Σ_i |(ei , x)|2 ≤ kxk2 (2.31)

x − Σ_i (ei , x) ei ⊥ ej for every ej in the set (2.32)

where the symbol of sum means the following: defining the set S = {ei : |(ei , x)|2 ≠ 0}, we define the sum to be zero (number or
vector) when S is empty. If S is finite, the definitions in (2.31, 2.32) coincide with the ones in (2.29, 2.30); if S is countably
infinite, the sums become series Σ_{n=1}^{∞} for a given order of the set S = {e1 , .., ei , ..}, in which case the limit of the series is
independent of the order chosen for S.

Definition 2.25 An orthonormal set in H is said to be complete if it is maximal, that is, if it is impossible to add an element
e to the set while preserving the orthonormality in the new set.

Theorem 2.25 Every orthonormal set in a Hilbert space is contained in a complete orthonormal set

Theorem 2.26 Every non-zero Hilbert space contains a complete orthonormal set

Theorem 2.27 Every orthonormal set is linearly independent

Theorem 2.28 Let H be a Hilbert space and {ei } an orthonormal set in H. The following conditions are equivalent to one
another

\{e_i\} \text{ is complete} \qquad (2.33)

x \perp \{e_i\} \Rightarrow x = 0 \qquad (2.34)

\forall x \in H \Rightarrow x = \sum_i (e_i, x)\, e_i \qquad (2.35)

\forall x \in H \Rightarrow \|x\|^2 = \sum_i |(e_i, x)|^2 \qquad (2.36)

This is perhaps the most important theorem in terms of applications in Physics, and in particular quantum mechanics.
It is convenient to discuss some terminology related with it. The numbers (ei , x) are called the Fourier coefficients of x, and
Eq. (2.35) is its Fourier expansion. Eq. (2.36) is called Parseval’s equation. All these equations refer to a given complete
orthonormal set.
This sequence of theorems is similar to the one developed in the general theory of vector spaces, with complete
orthonormal sets replacing the concept of bases, and Fourier expansions replacing linear combinations.
It is clear that for finite dimensional spaces Fourier expansions become linear combinations. On the other hand, since
orthonormal sets are linearly independent (Theorem 2.27), it is easy to see that in the case of finite dimensional spaces
complete orthonormal sets are linearly independent sets that generate any vector by linear combinations. Hence, complete
orthonormal sets are bases.
For infinite dimensional spaces the story is different. If we remember that linear combinations are finite by definition,
we see that in this case Fourier expansions are not linear combinations. For a given linearly independent set to be a basis, it
is necessary that any vector of the space can be written as a linear combination of such a set. Bases certainly exist for Hilbert
spaces according to theorem 2.5, but complete orthonormal sets are NOT bases in the sense defined in the general theory of
vector spaces.
Moreover, theorem 2.23 shows that the Fourier expansion given in Eq. (2.35) is always countable. This is a remarkable result
because it means that the Fourier expansion for a given complete orthonormal set is always a series, even if the cardinality of
the complete orthonormal set is greater than ℵ₀ (the cardinality of the integers).
The informal discussion above can be formally proved to produce the following statement

Theorem 2.29 A Hilbert space is finite dimensional if and only if every complete orthonormal set is a basis.

However, owing to the analogy between bases and complete orthonormal sets the following theorem is quite expected

Theorem 2.30 Any two complete orthonormal sets of a given Hilbert space have the same cardinality.

And this fact induces a natural definition

Definition 2.26 The orthogonal dimension of a Hilbert space H is the cardinality of any complete orthonormal set in H.

It is important to keep in mind the difference between the dimension and the orthogonal dimension of a Hilbert space of
infinite dimension.

2.9.2 The conjugate space H ∗


We have defined the conjugate space of a Banach space B as the set of all functionals on B, i.e. of all continuous linear mappings
of B into the scalars. We said, however, that the structure of the conjugate space of an arbitrary Banach space is very complex.
Fortunately, this is not the case for Hilbert spaces, in which the inner product provides a natural association between H and
H ∗.
Let y be a fixed vector in H and consider the function fy defined by

fy (x) ≡ (y, x) (2.37)

it is easy to prove linearity

fy (αx1 + βx2 ) = (y, αx1 + βx2 ) = α (y, x1 ) + β (y, x2 )


fy (αx1 + βx2 ) = αfy (x1 ) + βfy (x2 )

continuity comes from the Schwarz inequality

|fy (x)| = |(y, x)| ≤ ‖y‖ ‖x‖ ⇒ ‖fy‖ ≤ ‖y‖

then fy is bounded and so continuous. Indeed it can be shown that ‖fy‖ = ‖y‖. We then have found an algorithm to
generate some functionals from the mapping
y → fy (2.38)

described above; this is a norm-preserving mapping of H into H ∗ . However, it can be shown that this is in fact a mapping of
H onto H ∗ , as stated in the following

Theorem 2.31 Let H be a Hilbert space, and f an arbitrary functional in H ∗ . Then there exists a unique vector y ∈ H such
that
f (x) = (y, x) ∀x ∈ H
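In finite dimension, theorem 2.31 can be illustrated with a minimal Python sketch (assuming numpy; the vector y and the dimension are illustrative choices): the representing vector of a functional is recovered from the values of the functional on an orthonormal basis, since f(ei) = (y, ei) = yi∗.

import numpy as np

n = 4
y = np.array([1 + 2j, 0.5, -1j, 3.0])              # the (hidden) representing vector
f = lambda x: np.vdot(y, x)                        # the functional f(x) = (y, x)

basis = np.eye(n)                                  # orthonormal basis {e_i}
y_recovered = np.conj([f(e) for e in basis])       # y_i = f(e_i)^*  since f(e_i) = (y, e_i) = y_i^*
print(np.allclose(y_recovered, y))                 # True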

Since the mapping (2.38) is norm-preserving, we wonder whether it is also linear3 . This is not the case, because

fy1 +y2 (x) = (y1 + y2 , x) = (y1 , x) + (y2 , x) = fy1 (x) + fy2 (x)
fαy (x) = (αy, x) = α∗ (y, x) = α∗ fy (x)

such that
fy1 +y2 = fy1 + fy2 ; fαy = α∗ fy (2.39)

However, the mapping (2.38) is an isometry (it preserves the metric), since

‖fx − fy‖ = ‖fx−y‖ = ‖x − y‖

we can characterize H ∗ in the following way

Theorem 2.32 H ∗ is a Hilbert space with respect to the inner product defined by (fx , fy ) = (y, x).
3 It is important not to confuse the mapping described by Eq. (2.37) with the mapping in Eq. (2.38). The former is defined from H into the

complex space C while the latter is from H onto H ∗ .



2.9.3 The conjugate and the adjoint of an operator


A really crucial aspect of the theory of Hilbert spaces in Physics is the theory of operators (continuous linear transformations
of H into itself). For instance, observables in quantum mechanics appear as eigenvalues of some of these operators.
We have defined the conjugate of an operator for Banach spaces, but Banach spaces are still too general to yield a rich structural theory
of operators. The natural correspondence between H and H ∗ will provide a natural relation between a given operator on H
and its corresponding conjugate operator on H ∗ .
Let T be an operator on a Banach space B. We defined an operator on B ∗ denoted T ∗ and called the conjugate of T by
Eq. (2.20)
[T ∗ (f )] (x) = f (T (x)) (2.40)
and Eqs. (2.21, 2.22) say that T → T ∗ is an isometric isomorphism (as vector spaces) between the spaces of linear operators
on H and H ∗ . We shall see that the natural correspondence between H and H ∗ permits us to induce in turn an operator T † on
H from the operator T ∗ on H ∗ . The procedure is the following: starting from a vector y in H we map it into its corresponding
functional fy ; then we map fy by the operator T ∗ to get another functional fz ; then we map this functional into its (unique)
corresponding vector z in H. The scheme reads

y → fy → T ∗ fy = fz → z (2.41)

the whole process is a mapping of y to z i.e. of H into itself. We shall write it as a single mapping of H into itself in the form

y → z ≡ T †y

the operator T † induced in this way from T ∗ is called the adjoint operator. Its action can be understood in the context of
H only as we shall see. For every vector x ∈ H we use the definition of T ∗ Eq. (2.40) to write

[T ∗ (fy )] (x) = fy (T (x)) = (y, T x)



[T ∗ fy ] (x) = fz (x) = (z, x) = (T † y, x)

where we have also used Eqs. (2.37, 2.41). Hence



(y, T x) = (T † y, x) ∀x, y ∈ H (2.42)

we can see that Eq. (2.42) defines T † uniquely and we can take it as an alternative definition of the adjoint operator associated
with T . It can also be verified that T † is indeed an operator, i.e. that it is continuous and linear. We can also prove the
following

Theorem 2.33 The adjoint operation T → T † is a one-to-one onto mapping with these properties

(T1 + T2 )† = T1† + T2† , (αT )† = α∗ T † , (T † )† = T
(T1 T2 )† = T2† T1† ; ‖T †‖ = ‖T ‖ ; ‖T † T ‖ = ‖T T †‖ = ‖T ‖²
0† = 0 , I † = I (2.43)

If T is non-singular then T † is also non-singular and

(T † )−1 = (T −1 )†
†
Notice for instance that (T † )† = T implies that

(T y, x) = (y, T † x) ∀x, y ∈ H (2.44)

We define the commutator of a couple of operators T1 , T2 as

[T1 , T2 ] ≡ T1 T2 − T2 T1

this operation has the following properties

[T1 , T2 ] = − [T2 , T1 ] (2.45)


[αT1 + βT2 , T3 ] = α [T1 , T3 ] + β [T2 , T3 ] (2.46)
[T1 , αT2 + βT3 ] = α [T1 , T2 ] + β [T1 , T3 ] (2.47)
[T1 T2 , T3 ] = T1 [T2 , T3 ] + [T1 , T3 ] T2 (2.48)
[T1 , T2 T3 ] = T2 [T1 , T3 ] + [T1 , T2 ] T3 (2.49)

[[T1 , T2 ] , T3 ] + [[T3 , T1 ] , T2 ] + [[T2 , T3 ] , T1 ] = 0 (2.50)


Such properties can be proved directly from the definition: Eq. (2.45) shows antisymmetry and Eqs. (2.46, 2.47) prove
linearity. Finally, relation (2.50) is called the Jacobi identity, which is a manifestation of the non-associativity of this algebra.
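The identities (2.45)-(2.50) are easy to test numerically for matrices. The following is a minimal Python sketch (assuming numpy; the random matrices are illustrative only) checking the product rule (2.48) and the Jacobi identity (2.50).

import numpy as np

def comm(a, b):
    return a @ b - b @ a        # the commutator [a, b]

rng = np.random.default_rng(0)
T1, T2, T3 = (rng.standard_normal((4, 4)) for _ in range(3))

lhs = comm(T1 @ T2, T3)
rhs = T1 @ comm(T2, T3) + comm(T1, T3) @ T2        # Eq. (2.48)
print(np.allclose(lhs, rhs))                       # True

jacobi = comm(comm(T1, T2), T3) + comm(comm(T3, T1), T2) + comm(comm(T2, T3), T1)
print(np.allclose(jacobi, 0))                      # Eq. (2.50): True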
It can be seen that the space of operators on a Hilbert space H (denoted B(H)) is a Banach space and, more generally, a
Banach algebra. This structure permits an elegant theory of the operators on Hilbert spaces.
Most physical theories work on Hilbert spaces. In addition, the most important operators on Hilbert spaces in Physics
are self-adjoint and unitary operators, which are precisely operators that have a specific relation with their adjoints.

2.10 Normal operators


 
Definition 2.27 An operator N on a Hilbert space H that commutes with its adjoint, [N, N †] = 0, is called a normal operator

There are two reasons to study normal operators: (a) from the mathematical point of view, they are the most general type
of operators for which a simple structure theory is possible; (b) they contain as special cases the most important operators in
Physics: self-adjoint and unitary operators.
It is clear that if N is normal then αN is. Further, the limit N of any convergent sequence of normal operators {Nk } is
also normal

‖N N † − N † N ‖ ≤ ‖N N † − Nk Nk†‖ + ‖Nk Nk† − Nk† Nk‖ + ‖Nk† Nk − N † N ‖

= ‖N N † − Nk Nk†‖ + ‖Nk† Nk − N † N ‖ → 0

then N N † − N † N = 0 and N is normal; thus we have proved

Theorem 2.34 The set of all normal operators on H is a closed subset of B(H) that is closed under scalar multiplication

It is natural to wonder whether the sum and product of normal operators are normal. In general they are not, but we can establish
conditions under which these closure relations hold

Theorem 2.35 If N1 and N2 are normal operators on H with the property that either commutes with the adjoint of the other,
then N1 + N2 and N1 N2 are normal.

The following are useful properties for the sake of calculations in quantum mechanics

Theorem 2.36 An operator N on H is normal ⇔ ‖N x‖ = ‖N † x‖ ∀x ∈ H

Theorem 2.37 If N is a normal operator on H then ‖N ²‖ = ‖N ‖²
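Theorems 2.36 and 2.37 can be tested numerically. A minimal Python sketch (assuming numpy, with the operator norm computed as the largest singular value; the matrices are illustrative only):

import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
U = np.linalg.qr(M)[0]                              # a unitary matrix
D = np.diag([1.0 + 2.0j, -0.5j, 3.0])
N = U @ D @ U.conj().T                              # U D U† is normal by construction
Nd = N.conj().T
print(np.allclose(N @ Nd, Nd @ N))                  # N commutes with its adjoint

x = np.array([1.0, 2.0 - 1j, 0.5j])
print(np.isclose(np.linalg.norm(N @ x), np.linalg.norm(Nd @ x)))   # theorem 2.36

op_norm = lambda A: np.linalg.norm(A, 2)            # operator norm = largest singular value
print(np.isclose(op_norm(N @ N), op_norm(N) ** 2))  # theorem 2.37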

2.11 Self-Adjoint operators


We have said that the space of operators on a Hilbert space H (denoted B(H)) is a special type of algebra (a Banach algebra),
which has an algebraic structure similar to that of the complex numbers, except for the fact that the former is non-commutative.
In particular, both are complex algebras with a natural mapping of the space into itself, of the form T → T †
and z → z ∗ respectively. The most important subsystem of the complex plane is the real line, defined by the relation z = z ∗ ;
the corresponding subsystem in B(H) is therefore defined by T = T † , and an operator satisfying this condition is called a
self-adjoint operator. This is the simplest relation that can be established between an operator and its adjoint. It is clear that
self-adjoint operators are normal. Further, we already know that 0† = 0 and I † = I, thus they are self-adjoint. A real linear
combination of self-adjoint operators is also self-adjoint

(αT1 + βT2 )† = α∗ T1† + β ∗ T2† = αT1 + βT2

Further, if {Tn } is a sequence of self-adjoint operators that converges to a given operator T , then T is also self-adjoint:

‖T − T †‖ ≤ ‖T − Tn‖ + ‖Tn − Tn†‖ + ‖Tn† − T †‖ = ‖T − Tn‖ + ‖(Tn − T )†‖ = ‖T − Tn‖ + ‖Tn − T ‖ = 2 ‖T − Tn‖ → 0

shows that ‖T − T †‖ = 0, so that T = T † . This proves the following

Theorem 2.38 The self-adjoint operators in B(H) are a closed real linear subspace of B(H) and therefore a real Banach space
which contains the identity transformation

Unfortunately, the product of self-adjoint operators is not necessarily self-adjoint, hence they do not form an algebra. The
only statement in that direction is the following

Theorem 2.39 If T1 , T2 are self-adjoint operators on H, their product is self-adjoint if and only if [T1 , T2 ] = 0

It can be easily proved that

Theorem 2.40 If T is an operator on a Hilbert space H then T = 0 ⇔ (x, T y) = 0 ∀x, y ∈ H.

It can be seen also that

Theorem 2.41 If T is an operator on a complex Hilbert space H then T = 0 ⇔ (x, T x) = 0 ∀x ∈ H.

It should be emphasized that the proof of theorem 2.41 makes explicit use of the fact that the scalars are complex numbers
and not merely the real system.
The following theorem shows that the analogy between self-adjoint operators and real numbers goes beyond the simple
analogy from which the former arise

Theorem 2.42 An operator T on H is self-adjoint⇔ (x, T x) is real ∀x ∈ H.

A special type of self-adjoint operator is the following

Definition 2.28 A positive operator on H is a self-adjoint operator such that (x, T x) ≥ 0, ∀x ∈ H. Further, if (x, T x) ≥
0, and (x, T x) = 0 ⇔ x = 0, we say that the operator is positive-definite. If (x, T x) ≥ 0, and (x, T x) = 0 for some x ≠ 0, we
say that the operator is positive-singular.

It is clear that the following operators are positive: 0, I, T T † , T † T . Note also that the analogous elements in the
complex plane are non-negative numbers: 0, 1, zz ∗ = z ∗ z = |z|².

Theorem 2.43 If A is a positive operator then I + A is non-singular

Continuing the analogy between B(H) and the algebra of complex numbers, we can see that a complex number can be
written in terms of its real and imaginary parts in the form

z = a1 + ia2 ; a1 ≡ (z + z ∗)/2 , a2 ≡ (z − z ∗)/(2i)
in a similar way we can decompose an arbitrary operator T on H in the form

T = A1 + iA2 ; A1 ≡ (T + T †)/2 ; A2 ≡ (T − T †)/(2i) (2.51)
it is clear that A1 and A2 are self-adjoint, so they can be called the “real” and “imaginary” components of the operator T . If
T is self-adjoint its imaginary part is zero, as expected. We can see that it is precisely because of the non-commutativity of the
self-adjoint operators that non-normal operators exist

Theorem 2.44 If T is an operator on H it is normal ⇔ its real and imaginary parts commute
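The decomposition (2.51) and theorem 2.44 can be illustrated with a short numerical sketch (assuming numpy; the random matrix is illustrative only): split a matrix into its "real" and "imaginary" self-adjoint parts and check that it is normal exactly when the parts commute.

import numpy as np

rng = np.random.default_rng(3)
T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
Td = T.conj().T

A1 = (T + Td) / 2                       # "real" part, self-adjoint
A2 = (T - Td) / (2j)                    # "imaginary" part, self-adjoint
print(np.allclose(T, A1 + 1j * A2))     # Eq. (2.51)

is_normal = np.allclose(T @ Td, Td @ T)
parts_commute = np.allclose(A1 @ A2, A2 @ A1)
print(is_normal == parts_commute)       # theorem 2.44 (both False for a generic T)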

2.12 Unitary operators


Perhaps the most important subsystem of the complex plane after the real line is the unit circle, characterized by the equation
zz ∗ = z ∗ z = |z|² = 1. This leads to a natural definition of a special subset of the normal operators

Definition 2.29 An operator U on H which satisfies the equation U U † = U † U = I is said to be unitary

Unitary operators are thus the analogues of complex numbers of unit absolute value. In words, unitary operators are
those non-singular operators whose inverses equal their adjoints; they are thus mappings of H onto itself. The geometric
significance of these operators can be clarified with the following theorem

Theorem 2.45 If T is an operator on H, the following conditions are equivalent to one another

T †T = I (2.52)
(T x, T y) = (x, y) ∀x, y ∈ H (2.53)
‖T (x)‖ = ‖x‖ ∀x ∈ H (2.54)

In general an operator T with any of the properties (2.52-2.54) is an isometric isomorphism of H into itself, since T
preserves linear operations, as well as the inner product and the norm (and thus the metric). For finite-dimensional spaces
any of them is a necessary and sufficient condition for T to be unitary. Nevertheless, this is not the case when we deal with
infinite-dimensional spaces. Let us see an example: consider the operator T in C ∞ given by

T {x1 , x2 , ...} = {0, x1 , x2 , ...}

which preserves norms but has no inverse. The point is that this is an isometric isomorphism into H but not onto H (the
image does not contain any element of C ∞ with a non-null first component). So in the case of infinite dimension, the condition
of being onto must be added to the conditions (2.52-2.54) for an operator to be unitary.

Theorem 2.46 An operator on H is unitary ⇔ it is an isometric isomorphism of H onto itself.

In words, unitary operators are those one-to-one and onto operators that preserve all structure relevant for a Hilbert space:
linear operations, inner products, norm and metric.
In practice, unitary operators usually appear in Physics as operations that keep the norm of the vectors unaltered (like
rotations in ordinary space). Indeed, this statement is the definition usually utilized in Physics books.
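As a concrete finite-dimensional illustration (a sketch assuming numpy; the rotation angle and test vectors are arbitrary), a rotation matrix in ordinary three-dimensional space satisfies U†U = UU† = I and preserves norms and inner products, in agreement with theorems 2.45 and 2.46.

import numpy as np

theta = 0.7
U = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])   # rotation about the z axis

print(np.allclose(U.T @ U, np.eye(3)), np.allclose(U @ U.T, np.eye(3)))   # unitary (here real, i.e. orthogonal)

x, y = np.array([1.0, 2.0, 3.0]), np.array([-1.0, 0.5, 2.0])
print(np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x)))   # Eq. (2.54)
print(np.isclose(np.dot(U @ x, U @ y), np.dot(x, y)))         # Eq. (2.53)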
There is another theorem useful in the theory of representations for Hilbert spaces which is also used sometimes as the
definition

Theorem 2.47 An operator T on H is unitary ⇔ T {ei } is a complete orthonormal set whenever {ei } is.

Another important characteristic for physical applications is the following

Theorem 2.48 The set of all unitary operators on H forms a group (see definition 6.1 page 99).

2.13 Projections on Hilbert spaces


In Banach spaces we defined projections as idempotent continuous linear transformations or, equivalently, as idempotent
operators. We also saw that a pair of closed subspaces such that B = M ⊕ N induces a projection and vice versa. We saw,
however, that for a given closed subspace M of B there is not necessarily another closed subspace such that B = M ⊕ N .
In contrast, theorem 2.21 guarantees that for a given closed subspace M of a Hilbert space H there always exists a
decomposition with another closed subspace in the form H = M ⊕M ⊥ . Besides, in this decomposition the closed complementary
space is precisely the orthogonal complement of M . Since orthogonality is a very important new concept that arises from Hilbert
spaces, we shall concentrate on projections induced by this particular decomposition. It is then natural to look for the new
features required by a given projection in order to have M as its range and M ⊥ as its null space

Theorem 2.49 If P is a projection (with the definitions 2.20, 2.22 given for Banach spaces) on H with range M and null
space N then M ⊥ N ⇔ P = P † and in this case N = M ⊥ .

A projection whose range and null space are perpendicular is called an orthogonal projection. Indeed, orthogonal
projections are the only ones that are relevant in the theory of operators on Hilbert spaces, so we shall redefine the concept
of projection once again

Definition 2.30 A projection on a Hilbert space will be defined as an idempotent, continuous, and self-adjoint linear trans-
formation. If idempotent, continuous, non-self-adjoint linear transformations are of some use, we call them non-orthogonal
projections.

The following facts are easy to show: 0 and I are projections, and they are distinct if and only if H ≠ {0}. P is the
projection on M ⇔ I − P is the projection on M ⊥ .
We can also see that
x ∈ M ⇔ P x = x ⇔ ‖P x‖ = ‖x‖
it can also be seen that P is a positive operator and ‖P ‖ ≤ 1.
It sometimes occurs in Physics that a given operator T on H maps a proper subspace M of H into itself. The following chain
of definitions permits us to study this kind of operator

Definition 2.31 Let T be an operator on H, and M a closed vector subspace of H. M is said to be invariant under T if
T (M ) ⊆ M .

In this case the restriction of T to M can be regarded as an operator of M into itself. A more interesting situation occurs
when M and M ⊥ are invariant under T

Definition 2.32 If both M and M ⊥ are invariant under T , we say that M reduces T or that T is reduced by M .

This situation invites us to study T by restricting its domain to M and M ⊥ . The projections provide the most relevant
information for these scenarios

Theorem 2.50 A closed vector subspace M is invariant under an operator T ⇔ M ⊥ is invariant under T †

Theorem 2.51 A closed vector subspace M reduces an operator T ⇔ M is invariant under both T and T †

Theorem 2.52 If P is the projection on a closed vector subspace M of H, M is invariant under an operator T ⇔ T P = P T P

Theorem 2.53 If P is the projection on a closed vector subspace M of H, M reduces an operator T ⇔ T P = P T

Theorem 2.54 If P and Q are projections on closed linear subspaces M and N then M ⊥ N ⇔ P Q = 0 ⇔ QP = 0

We wonder whether the sum of projections in our present sense is also a projection. This is the case only under certain
conditions

Theorem 2.55 If P1 , .., Pn are projections on closed subspaces M1 , .., Mn of a Hilbert space H, then the sum P = P1 + .. + Pn
is a projection ⇔ the Pi ’s are pairwise orthogonal, i.e. Pi Pj = δij Pi ; in that case P is the projection on M = M1 + .. + Mn .
Chapter 3

Basic theory of representations for finite-dimensional vector spaces

In this section we intend to establish an equivalence between abstract objects, such as elements of vector spaces and linear
transformations, and a more tangible language suitable for explicit calculations. This is the gist of the theory of representations
for vector spaces

3.1 Representation of vectors and operators in a given basis


If n is the dimension of a finite-dimensional vector space V , a set of n linearly independent vectors in V forms a basis for
the vector space. Given a certain ordered basis {ui } ≡ {u1 , .., un } in a vector space V , any vector can be written as a linear
combination of such a basis; we shall use the convention of sum over repeated indices

x = xi ui (3.1)

The coefficients xi are called the coordinates of the vector x, relative to the ordered basis {ui }. Linear independence ensures
that the set of coordinates (x1 , .., xn ) is unique when the basis is ordered in a well-defined way. Hence, such a set of coordinates
forms a representation of the vector x with respect to the ordered basis {ui }.
A mapping T of V into itself, associates each vector x with another vector y in V

y = Tx

if the mapping is one-to-one and onto it admits an inverse1

x = T −1 y

if the transformation is linear we have


T (αx+βy) = αT x + βT y ∀x, y ∈ V
where α and β are complex numbers. The definition of T is intrinsic and does not depend on the particular basis chosen for
the vector space. Notwithstanding, for many practical purposes we define a representation of both the vectors and operators
in a basis {ui }. In that case, we can describe the action of T by a transformation of coordinates (in the same basis)

yi = Ti (x1 , x2 , . . . , xn ) i = 1, . . . , n

if Ti admits an inverse we get


xi = Ti−1 (y1 , y2 , . . . , yn ) i = 1, . . . , n
the necessary and sufficient condition for the existence of the inverse is that the Jacobian J ≡ det (∂Ti /∂xj ) be different from zero.
On the other hand, if we assume that T is a linear transformation we can write

y = T x = T (xi ui ) = xi T ui (3.2)

Eq. (3.2) says that y is a linear combination of the vectors T ui , and the coefficients of the combination (coordinates)
coincide with the coordinates of x in the basis ui . The vectors T ui must be linear combinations of {uj } and we denote the
coefficients of these linear combinations as Tji
vi ≡ T ui = uj Tji (3.3)
1 If the mapping is only one-to-one but not onto, the inverse still exists, but restricted to the vector subspace onto which the vectors x ∈ V are
mapped.


the real or complex coefficients Tji can be organized in a square arrangement of the form
 
T \equiv \begin{pmatrix} T_{11} & T_{12} & \cdots & T_{1n} \\ T_{21} & T_{22} & \cdots & T_{2n} \\ \vdots & \vdots & & \vdots \\ T_{n1} & T_{n2} & \cdots & T_{nn} \end{pmatrix}

this square arrangement symbolized as T is called the matrix representative (of n × n dimension) of the linear transformation
T relative to the ordered basis {ui }. Substituting Eq. (3.3) in Eq. (3.2)

yj uj = uj Tji xi

and since the uj are linearly independent


yj = Tji xi
this operation is represented by the following notation
    
\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} T_{11} & T_{12} & \cdots & T_{1n} \\ T_{21} & T_{22} & \cdots & T_{2n} \\ \vdots & \vdots & & \vdots \\ T_{n1} & T_{n2} & \cdots & T_{nn} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \qquad (3.4)

\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} T_{11}x_1 + T_{12}x_2 + \ldots + T_{1n}x_n \\ T_{21}x_1 + T_{22}x_2 + \ldots + T_{2n}x_n \\ \vdots \\ T_{n1}x_1 + T_{n2}x_2 + \ldots + T_{nn}x_n \end{pmatrix} \qquad (3.5)

where the LHS (left-hand side) of Eqs. (3.4, 3.5) are column vector arrangements. Eq. (3.5) is usually written in the form

y = Tx

the last equality appears in matrix notation where T is the matrix representative of the linear operator T in the ordered basis
ui . Similarly, x and y are the coordinate representatives of the intrinsic vectors in the same ordered basis. Eq. (3.3) shows
clearly how to construct the matrix T, i.e. applying the operator to each vector in the basis, and writing the new vectors as
linear combinations of the basis. The coefficient of the i − th new vector associated to the j − th element of the basis gives the
element Tji in the associated matrix. Observe that for a matrix representative to be possible, the linearity was fundamental
in the procedure.
On the other hand, since we are looking for an isomorphism among linear transformations on V and the set of matrices
(as an algebra), we should define linear operations and product of matrices in such a way that these operations are preserved
in the algebra of linear transformations. In other words, if we denote by [T ] the matrix representative of T in a given ordered
basis we should find operations with matrices such that

[T1 + T2 ] = [T1 ] + [T2 ] ; [αT ] = α [T ] ; [T1 T2 ] = [T1 ] [T2 ]

we examine first the product by a scalar, according to the definition (2.9) we have

(αT ) (ui ) = α (T ui ) = α (uj Tji ) = uj (αTji ) ⇒


(αT ) (ui ) = uj (αTji ) ⇒ (uj ) (αT )ji = uj (αTji )

using linear independence we obtain the algorithm for scalar multiplication

(αT )ji = αTji

Now for the sum we use the definition 2.8

(T + U ) uj = T uj + U uj = ui Tij + ui Uij = ui (Tij + Uij ) ⇒


(T + U ) uj = ui (Tij + Uij ) ⇒ ui (T + U )ij = ui (Tij + Uij )

and along with linear independence it leads to


(T + U )ij = (Tij + Uij )
Moreover, for multiplication (composition) we use definition 2.11

(T U ) ui = T (U ui ) = T (uj Uji ) = Uji T (uj ) = Uji (T uj ) = Uji (uk Tkj ) ⇒


(T U ) ui = (Tkj Uji ) uk ⇒ uk (T U )ki = uk (Tkj Uji )

linear independence gives


(T U )ki = Tkj Uji (3.6)
It can be easily shown that the matrix representations of the operators 0 and I are unique and equal in any basis; they
correspond to [0]ij = 0 and [I]ij = δij .
Finally, we can check from Eq. (3.3) that the mapping T → [T ] is one-to-one and onto. This completes the proof of the
isomorphism between the set of linear transformations and the set of matrices as algebras.
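The construction of Eq. (3.3) can be made concrete with a small numerical sketch (assuming numpy; the maps T and U are arbitrary illustrative choices): the i-th column of the representative matrix holds the coordinates of T ui in the ordered basis, and the matrix of a composition is the product of the matrices, Eq. (3.6).

import numpy as np

# arbitrary linear transformations of R^3, given intrinsically as functions
T = lambda v: np.array([2 * v[0] + v[2], v[1] - v[0], 3 * v[2]])
U = lambda v: np.array([v[1], v[2], v[0]])        # cyclic permutation of coordinates

basis = [np.array([1.0, 0.0, 0.0]),
         np.array([0.0, 1.0, 0.0]),
         np.array([0.0, 0.0, 1.0])]               # canonical ordered basis {u_i}

def matrix_of(op):
    # column i holds the coordinates of op(u_i): (op u_i)_j = T_{ji}
    return np.column_stack([op(u) for u in basis])

Tm, Um = matrix_of(T), matrix_of(U)
TUm = matrix_of(lambda v: T(U(v)))                # matrix of the composition T U
print(np.allclose(TUm, Tm @ Um))                  # Eq. (3.6): True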
On the other hand, owing to the one-to-one correspondence T ↔ [T ] and the preservation of all operations, we see that
non-singular linear transformations (i.e. invertible linear transformations) should correspond to invertible matrices. We denote
by [T −1 ] the matrix representative of T −1 , and our goal is to establish the algorithm for this inverse matrix. The definition of the
inverse of the linear transformation is
T T −1 = T −1 T = I
since the representation of the identity is always [I]ij = δij , the corresponding matrix representation of this equation is
   
[T ]ik [T −1 ]kj = [T −1 ]ik [T ]kj = δij (3.7)

this equation can be considered as the definition of the inverse of a matrix if it exists. A natural definition is then

Definition 3.1 A matrix which does not admit an inverse is called a singular matrix. Otherwise, we call it a non-singular
matrix.

Since T −1 is unique, the corresponding matrix is also unique, so the inverse of a matrix is unique when it exists. We shall
see later that a necessary and sufficient condition for a matrix to have an inverse is that its determinant must be non-zero.
The algebra of matrices of dimension n × n is called the total matrix algebra An . The preceding discussion can be summarized
in the following

Theorem 3.1 If B = {u1 , .., un } is an ordered basis of a vector space V of dimension n, the mapping T → [T ] which
assigns to every linear transformation on V its matrix relative to B, is an isomorphism of the algebra of the set of all linear
transformations on V onto the total matrix algebra An . Such an isomorphism preserves linear operations as well as composition
of linear transformations.

Theorem 3.2 If B = {u1 , .., un } is an ordered basis of a vector space V of dimension n, and T a linear transformation whose
matrix relative to B is [aij ], then T is non-singular ⇔ [aij ] is non-singular, and in this case [aij ]−1 = [T −1 ].

Definition 3.2 Let A be an n × n matrix characterized by the elements aij . We define the transpose of A (symbolized as AT
or Ã) as the n × n matrix with elements given by ãij ≡ aji . This is the matrix obtained when columns are interchanged with
rows in the matrix A.

Theorem 3.3 Let A, B be two n × n matrices characterized by the elements aij , bij respectively. We have

\widetilde{AB} = \tilde{B}\tilde{A} \; ; \quad (AB)^{-1} = B^{-1} A^{-1}

Proof:

\left(\tilde{B}\tilde{A}\right)_{ij} = \left(\tilde{B}\right)_{ik}\left(\tilde{A}\right)_{kj} = b_{ki}\, a_{jk} = a_{jk}\, b_{ki} = (AB)_{ji} = \left(\widetilde{AB}\right)_{ij}

the proof for the inverse follows by direct inspection using associativity and the fact that the inverse is unique
 
(AB) B−1 A−1 = A BB−1 A−1 = AIA−1 = I

QED.

3.2 Change of coordinates of vectors under a change of basis


We have already seen that any vector space has an infinite number of bases. Notwithstanding, once a given basis is obtained,
any other one can be found by a linear transformation of the original basis.
Let {uj } be our “original” ordered basis and {u′j } any other ordered basis. Each u′i is a linear combination of the original
basis
u′i = aij uj i = 1, . . . , n (3.8)
Linear independence of {ui } ensures the uniqueness of the coefficients aij . The natural question is whether we require any
condition on the matrix representation [aij ] in Eq. (3.8) to ensure that the set {u′j } be linearly independent. If we remember

that there is a one-to-one correspondence between matrices and linear transformations we see that aij must correspond to a
(unique) linear transformation A. In explicit notation Eq. (3.8) becomes

\begin{pmatrix} u'_1 \\ \vdots \\ u'_n \end{pmatrix} = \begin{pmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & \ddots & \vdots \\ A_{n1} & \cdots & A_{nn} \end{pmatrix} \begin{pmatrix} u_1 \\ \vdots \\ u_n \end{pmatrix} \qquad (3.9)

now appealing to theorem 2.12 we see that u′j is a basis if and only if A is non-singular, but A is non-singular if and only if
[A]ij = aij is a non-singular matrix. Equation (3.9) can be written in matrix notation as

u′ = Au (3.10)

the new set {u′i } is a basis if and only if the matrix A is non-singular. Any vector x can be written in both bases

x = xi ui = x′i u′i = x′i aij uj = x′j aji ui (3.11)

where we have used the fact that i, j are dummy indices. Now, owing to the linear independence of ui

xi = x′j aji = ãij x′j ; ãij ≡ aji

where ãij ≡ aji indicates the transpose of the matrix A. In matrix form we have

u′ = Au , x = Ã x′ (3.12)

and using Eq. (3.12) we get

x′ = Ã−1 x (3.13)
observe that if the original basis transforms to the new one by a non-singular matrix A (Eq. 3.10), the original coordinates
transform to the new ones by the matrix Ã−1 (Eq. 3.13). It is easy to show that (Ã)−1 is the transpose of A−1 , hence Ã is non-singular if and
only if A is non-singular. Therefore Eq. (3.13) makes sense whenever A is non-singular.
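A short numerical check of Eqs. (3.10)-(3.13) (a sketch assuming numpy; the change-of-basis matrix is a generic random matrix, assumed non-singular): the new basis vectors are u′i = aij uj, the coordinates transform with Ã−1, and the intrinsic vector is unchanged.

import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))                    # assumed non-singular (generic random matrix)
u = [np.array([1.0, 0, 0]), np.array([0, 1.0, 0]), np.array([0, 0, 1.0])]     # old basis
u_new = [sum(A[i, j] * u[j] for j in range(3)) for i in range(3)]             # u'_i = a_ij u_j, Eq. (3.8)

x = np.array([1.0, -2.0, 0.5])                     # coordinates x_i in the old basis
x_new = np.linalg.inv(A.T) @ x                     # x' = (A~)^{-1} x, Eq. (3.13)

vec_old = sum(x[i] * u[i] for i in range(3))       # x = x_i u_i
vec_new = sum(x_new[i] * u_new[i] for i in range(3))
print(np.allclose(vec_old, vec_new))               # same intrinsic vector: True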
Defining the transpose of a column matrix as

x̃ = (x1 , x2 , . . . , xn )

i.e. as a row matrix, Eq. (3.11) can be written as


x = x̃u = x̃′ u′
which gives a convenient notation for the coordinate form of vectors in different bases.
It is important to emphasize that the vector x has an intrinsic meaning while its coordinates depend on the basis chosen.

3.3 Change of the matrix representative of linear transformations under a change of basis
Let us define an intrinsic equation for a linear transformation T of V into itself

y = Tx (3.14)

y and x denote here intrinsic vectors while y, x are their representation in coordinates under a given ordered basis. Starting
with the ordered basis {ui } we write equation (3.14) in matrix form

y = Tx (3.15)

for any other ordered basis {u′i } the matrix and coordinate representatives are different and we write them as

y′ = T′ x′ (3.16)

we remark that Eqs. (3.15) and (3.16) represent the same intrinsic equation (3.14).
Since we know the relation between the coordinate representatives given by Eq. (3.13), our goal here is to know the relation
between the matrix representatives of T . Using Eq. (3.13) we find

y′ = Ã−1 y = Ã−1 Tx = Ã−1 TÃ Ã−1 x = (Ã−1 TÃ)(Ã−1 x)
y′ = T′ x′ (3.17)

where we have defined


T′ ≡ Ã−1 TÃ (3.18)
from Eqs. (3.17, 3.18) we see that T′ is the representative matrix of the operator T in the new basis u′i where the matrix Ã−1
gives the transformation between coordinates from the old basis to the new one Eq. (3.13). We remember that A must be
non-singular to represent a change of basis.

Definition 3.3 The transform of a matrix A (also called a similarity transformation) by a non singular matrix S, is defined
as A′ = SAS−1 . The matrices A′ and A are said to be equivalent.

Eq. (3.18) shows that the new matrix representation of T (i.e. T′ ), is equivalent2 to the old matrix representation T, and
the transform of T by Ã−1 is T′ .
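A numerical sketch of Eq. (3.18) (assuming numpy; the matrices are generic illustrative choices, with A assumed non-singular): the matrix of T in the new basis, T′ = Ã−1 T Ã, reproduces in primed coordinates the same intrinsic relation y = Tx.

import numpy as np

rng = np.random.default_rng(5)
T = rng.standard_normal((3, 3))                    # matrix of the operator in the old basis
A = rng.standard_normal((3, 3))                    # non-singular change-of-basis matrix, Eq. (3.10)

x = np.array([0.3, -1.0, 2.0])
y = T @ x                                          # Eq. (3.15), old coordinates

At_inv = np.linalg.inv(A.T)
x_new, y_new = At_inv @ x, At_inv @ y              # Eq. (3.13)
T_new = At_inv @ T @ A.T                           # Eq. (3.18): T' = (A~)^{-1} T A~
print(np.allclose(T_new @ x_new, y_new))           # Eq. (3.16) holds: True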
We can also consider a transformation S from a vector space V into another V ′

x′ = Sx, x = S −1 x′

For S −1 to be linear, it is necessary that V and V ′ be of the same dimensionality. If a linear operator T is defined in V , then
T and S induce a linear operator in V ′ in the following way: we map x′ of V ′ into y′ of V ′ as
 
x′ → x = S −1 x′ → y = T x = T S −1 x′ → y′ = Sy = S T S −1 x′

hence the mapping x′ → y′ has been performed as



x′ → y′ = ST S −1 (x′ )

Of course, we can define a mapping T ′ of V ′ into itself that does the job in a single step; thus

T ′ ≡ ST S −1 ; y′ = T ′ (x′ ) (3.19)

The transformation given by (3.19) is also a similarity transformation. Although the transformations shown in (3.18) and
(3.19) resemble each other, they have fundamental differences. In (3.18) we are representing the same mathematical object by taking
different bases, and it is a matrix equation. By contrast, Eq. (3.19) expresses a relation between two different mathematical
transformations acting on different spaces3 , and the equation is intrinsic, independent of the basis.

3.4 Active and passive transformations


In Physics, it is important to differentiate between two types of transformations, the passive ones and the active ones. We can
understand passive transformations by examining the transformations y → y′ , x → x′ and T → T ′ to go from Eq. (3.15) to
Eq. (3.16), if we remember that both are representatives of the same intrinsic equation (3.14) we realize that the mappings
described above do not change the vectors or the transformation but only their representatives. These mappings (called passive
mappings) thus correspond to a change in the basis and not to a change on the mathematical objects by themselves.
In contrast, an active mapping or transformation transforms a mathematical object into another one. For instance, in
the first of Eqs. (3.19) we map a linear transformation on V into a different linear transformation on V ′ , the mathematical
object itself has changed. Similarly the mapping x′ → y′ through T ′ described by the second of Eqs. (3.19) is an active
transformation because x′ and y′ are two different vectors.
The difference between passive and active mappings or transformations should be clear from the context. For instance
Eqs. (3.18) and (3.19) are identical in form from the algebraic point of view, but (3.18) represents a passive transformation (a
change of basis or a change of representation), while (3.19) represents an active one.

3.5 Theory of representations on finite dimensional Hilbert spaces


We shall study n−dimensional Hilbert spaces. We remember that an inner product is a mapping that takes an ordered pair of
vectors x, y in a vector space V, and associates to it a scalar α denoted by α = (x, y) such that

(x, y) = (y, x)∗ ; (x, βy) = β (x, y) ; (x1 + x2 , y) = (x1 , y) + (x2 , y)
(x, x) ≥ 0, and (x, x) = 0 ⇔ x = 0
2 Similarity transformations provide an equivalence relation between two matrices. Thus, the expression equivalent matrices becomes logical. In

addition, we see that T and T′ describe the same mathematical object (though in different bases), so that the term equivalence acquires more sense
in this context.
3 It could be argued that both spaces are identical since they have the same dimensionality. This is true only for their properties as general vector

spaces, but not necessarily for any additional algebraic or topological structure on them.

the definition of the inner product is intrinsic (basis independent). The norm of a vector is defined as ‖x‖² ≡ (x, x). This in
turn allows us to normalize the vectors, i.e. to construct vectors with norm or “length” equal to one by the rule

ui = xi /√(xi , xi ) = xi /‖xi‖ (3.20)

such that (ui , ui ) = 1. Different inner products defined on the same vector space lead to different Hilbert spaces. Another
important concept that arises from the inner product is that of orthogonality. An orthonormal set is a set {xi } with xi ∈ H
such that
(xi , xj ) = δij
The theory of representations of a finite dimensional Hilbert space is particularly simple if we realize that in finite dimension, the
Fourier expansion given by Eq. (2.35) becomes a linear combination, the series in (2.36) to calculate the norm becomes a finite
sum, and finally complete orthonormal sets become bases. These are the main ideas that lead to the theory of representations
in a Hilbert space
Our first goal is to find the way in which the coordinates of a given vector are obtained from the inner product. We first see
the form of the coordinates when the basis is a complete orthonormal set. Rewriting the Fourier expansion (2.35)
in finite dimension and using sum over repeated indices we have

x = (ui , x) ui = xi ui

so the coordinate of a vector x associated with the normal vector ui is given by

xi = (ui , x)

Let us now see how an arbitrary inner product can be calculated using an orthonormal basis

(x, y) = (xi ui , yj uj ) = x∗i yj (ui , uj ) = x∗i yj δij = x∗i yi (3.21)

the norm of a vector is also easily seen to be

‖x‖² = (x, x) = x∗i xi = |xi | |xi | (3.22)
if the basis {vi } is not an orthonormal set, we can express the scalar product by determining the numbers

mij ≡ (vi , vj ) (3.23)

the properties of the inner product lead to mij = m∗ji . These numbers form a matrix that we shall call the metric matrix.
Defining (A†)ij ≡ A∗ji (the adjoint or hermitian conjugate of the matrix A) we find that m = m† ; from the definition of the
adjoint matrix we see that (AB)† = B† A† . A matrix that coincides with its adjoint is called self-adjoint or hermitian. The
metric matrix is hermitian. We shall now see that, knowing the metric matrix in a certain basis, we can find any possible inner
product

(x, y) = (xi vi , yj vj ) = x∗i yj (vi , vj ) = x∗i mij yj


(x, y) = x† my

and the norm becomes


(x, x) = x∗i mij xj = x† mx (3.24)
representing x as a one column matrix, x† is a one row matrix with the coordinates conjugated. The quantities of the form
x† Ay, with A hermitian, are called hermitian forms. If additionally we impose that x† Ax ≥ 0, we have a positive definite
hermitian form4 .
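A minimal numerical sketch of Eqs. (3.23, 3.24) (assuming numpy; the basis and coordinates are illustrative only): for a non-orthonormal basis {vi} of C², the inner product computed from coordinates is x† m y, with m the metric matrix.

import numpy as np

v1 = np.array([1.0, 0.0])
v2 = np.array([1.0, 1.0])                              # a non-orthonormal basis of C^2
basis = [v1, v2]

m = np.array([[np.vdot(a, b) for b in basis] for a in basis])   # metric matrix m_ij = (v_i, v_j)
print(np.allclose(m, m.conj().T))                      # m is hermitian

x, y = np.array([2.0 + 1j, -1.0]), np.array([0.5, 1j]) # coordinates relative to {v_i}
vec_x = x[0] * v1 + x[1] * v2
vec_y = y[0] * v1 + y[1] * v2
direct = np.vdot(vec_x, vec_y)                         # (x, y) computed from the vectors themselves
via_metric = x.conj() @ m @ y                          # x† m y, as in Eq. (3.24)
print(np.isclose(direct, via_metric))                  # True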

Gram-Schmidt process for orthonormalization of linearly independent sets


From the previous discussion, it is very clear that complete orthonormal sets possess many advantages with respect to other
sets of linearly independent vectors. This leads us to study the possibility of finding an orthonormal set from a given set
of linearly independent vectors in a Hilbert space. The so-called Gram-Schmidt orthonormalization process starts from an
arbitrary set of independent vectors {x1 , x2 , .., xn , ...} on H and exhibits a recipe to construct a corresponding orthonormal
set {u1 , u2 , .., un , ...} with the property that for each n the vector subspace spanned by {u1 , u2 , .., un } is the same as the one
spanned by {x1 , x2 , .., xn }.
4 An inner product guarantees that the hermitian form constructed with the metric matrix is positive-definite. However, it is usual in relativity to

define a pseudo-metric that leads to non positive definite hermitian forms. Observe that the metric tensor in relativity has some negative diagonal
elements which would be forbidden if they arose from an authentic inner product.

The gist of the procedure is based on Eqs. (2.32, 3.20). We start by normalizing the vector x1

u1 = x1 /‖x1‖

now we subtract from x2 its component along u1 to obtain x2 − (u1 , x2 ) u1 , and normalize it

u2 = [x2 − (u1 , x2 ) u1 ] / ‖x2 − (u1 , x2 ) u1‖

it should be emphasized that x2 is not a scalar multiple of x1 , so that the denominator above is non-zero. It is clear that u2 is
a linear combination of x1 , x2 and that x2 is a linear combination of u1 , u2 . Therefore, {u1 , u2 } spans the same subspace as
{x1 , x2 }. The next step is to subtract from x3 its components in the directions u1 and u2 to get a vector orthogonal to u1
and u2 , according to Eq. (2.32). Then we normalize the result and find

u3 = [x3 − (u1 , x3 ) u1 − (u2 , x3 ) u2 ] / ‖x3 − (u1 , x3 ) u1 − (u2 , x3 ) u2‖

once again {u1 , u2 , u3 } spans the same subspace as {x1 , x2 , x3 }. Continuing this way we clearly obtain an orthonormal set
{u1 , u2 , .., un , ...} with the stated properties.
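The recipe can be written compactly. The following Python sketch (assuming numpy and the standard inner product; the helper name gram_schmidt and the sample vectors are illustrative only) orthonormalizes a list of linearly independent vectors exactly as described above.

import numpy as np

def gram_schmidt(vectors):
    # returns an orthonormal list spanning the same subspaces as the input (assumed independent)
    ortho = []
    for x in vectors:
        v = x.astype(complex)
        for u in ortho:
            v = v - np.vdot(u, v) * u            # subtract the component (u, x) u
        ortho.append(v / np.linalg.norm(v))      # normalize; nonzero since the x's are independent
    return ortho

xs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
us = gram_schmidt(xs)
gram = np.array([[np.vdot(a, b) for b in us] for a in us])
print(np.allclose(gram, np.eye(3)))              # (u_i, u_j) = delta_ij: True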
Many important orthonormal sets arise from sequences of simple functions to which we apply the Gram-Schmidt process.
In the space L² of square integrable functions associated with the interval [−1, 1], the functions xⁿ (n = 0, 1, 2, ..) are linearly
independent. Applying the Gram-Schmidt procedure to this set we obtain the orthonormal set of the Legendre polynomials.
In the space L² of square integrable functions associated with the entire real line, the functions xⁿ e^{−x²/2} (n = 0, 1, 2, ..)
are linearly independent. Applying the Gram-Schmidt procedure to this set we obtain the normalized Hermite functions.
In the space L² associated with the interval [0, +∞), the functions xⁿ e^{−x} (n = 0, 1, 2, ..) are linearly independent.
Orthonormalizing this set we obtain the normalized Laguerre functions.
Each of these orthonormal sets described above can be shown to be complete in their corresponding Hilbert spaces.

3.5.1 Representation of linear operators in finite dimensional Hilbert spaces


First of all let us see how to construct the matrix representation of a linear operator by taking advantage of the inner product.
Eq. (3.3) shows us how to construct the matrix representation of T in a given basis by applying the operator to each element
ui of such a basis

T ui = uj Tji ⇒ (uk , T ui ) = (uk , uj Tji ) = (uk , uj ) Tji


⇒ (uk , T ui ) = mkj Tji

if the basis is orthonormal then mkj = δkj and


Tki = (uk , T ui ) (3.25)
Eq. (3.25) gives the way to construct an element of the matrix representative of an operator T on H through the inner product
and using an orthonormal basis.
Now we turn to the problem of finding a relation between the matrix representative of an operator and the matrix rep-
resentative of its adjoint. If we have a linear operator T on a Hilbert space, another operator called its adjoint and denoted
as T † exists, such that
(T x, y) = (x, T † y) ∀x, y ∈ V
the matrix representative of T † has a rather simple relation with the matrix representative of T when an orthonormal basis is
used
(T x, y) = (T (xi ui ) , yk uk ) = (xi T (ui ) , yk uk ) = x∗i yk (T ui , uk )
and using (3.3) we find
(T x, y) = x∗i yk (uj Tji , uk ) = x∗i yk T∗ji δjk = x∗i yk T∗ki = x∗i (\tilde{T}^*)_{ik} yk (3.26)
on the other hand we have

(T x, y) = (x, T † y) = (xi ui , T † (yk uk )) = x∗i (ui , yk T † uk ) = x∗i yk (ui , uj (T †)jk ) = x∗i (ui , uj ) (T †)jk yk = x∗i δij (T †)jk yk
(T x, y) = x∗i (T †)ik yk (3.27)

Equating Eqs. (3.27, 3.26) and taking into account that x and y are arbitrary, we have

(T †)_{ik} = \tilde{T}^*_{ik} \;\; \Rightarrow \;\; T^{\dagger} = \tilde{T}^* \qquad (3.28)

and so the matrix representative of T † is the conjugate transpose of the matrix representative of T . Once again, it is important
to emphasize that this is only valid in an orthonormal basis; it can easily be proved that for an arbitrary basis described by
the metric matrix m, the matrix representation of T † is m−1 \tilde{T}^* m. Remembering that an operator is hermitian or
self-adjoint if it coincides with its adjoint operator (T = T †), i.e. (T x, y) = (x, T y) , ∀x, y ∈ V, we conclude that in an
orthonormal basis, hermitian operators are represented by matrices which coincide with their conjugate transpose. It is then
natural to make the following definition:

Definition 3.4 A hermitian matrix is a square matrix which coincides with its conjugate transpose, i.e. T_{ik} = \tilde{T}^*_{ik}.

We should insist, however, on the fact that hermitian operators correspond to hermitian matrices only if the basis in which
the operator is represented is orthonormal. In particular, the form to calculate the norm described in (3.22) is usually taken
for granted, and it is easy to forget that it only applies in orthonormal bases, as we can see from (3.24). This is because, when
the basis {vi } is not orthonormal, the coordinates of a vector with respect to {vi } are not given by Fourier coefficients of the
form described in Eq. (2.35)
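A quick numerical confirmation of Eq. (3.28) (a sketch assuming numpy; the matrix and vectors are random illustrative choices): with the canonical orthonormal basis of C³, the conjugate transpose of the matrix of T reproduces the defining relation of the adjoint.

import numpy as np

rng = np.random.default_rng(6)
T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
T_adj = T.conj().T                                 # candidate matrix of T† in an orthonormal basis

x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
print(np.isclose(np.vdot(T @ x, y), np.vdot(x, T_adj @ y)))   # (Tx, y) = (x, T†y): True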
Now assume that we go from an orthonormal basis ui into another orthonormal basis u′i . We know from theorem 2.47 that
a linear operator is unitary if and only if it transforms a complete orthonormal set into another complete orthonormal set,
then if A is a unitary operator we have

δij = (Aui , Auj ) = (u′i , u′j ) = (uk aki , um amj ) = a∗ki amj (uk , um ) = a∗ki amj δkm
δij = a∗ki akj = ã∗ik akj

so the matrix of the transformation from ui into u′i satisfies

A† A = 1

so that A† is a left-inverse of A. In finite dimensions, this implies that A† is also a right-inverse of A and that such an inverse
is unique; therefore
A† A = AA† = 1
from which these kinds of matrices are non-singular. Therefore a matrix that transforms an orthonormal basis into another
orthonormal basis must satisfy
A† = A−1
By theorem 3.1 these matrices are associated with unitary operators as long as we use an orthonormal basis; thus it is natural
to call them unitary matrices.

3.6 Determinants and traces


A very important property of any matrix is its determinant denoted by |A| or by det A, and is a real or complex number
associated with the matrix. Its construction was primarily motivated by the study of simultaneous linear equations. We assume
that the reader is familiar with the concept and the calculation of this quantity. We have mentioned that a matrix admits
an inverse if and only if its determinant is non-null. This is because the inverse of a matrix A depends on (det A)−1 . The
determinant of the transpose coincides with the determinant of the matrix
det Ã = det A (3.29)

for the conjugate matrix (in which we conjugate each of its elements) we get

det (A∗ ) = (det A)∗ (3.30)

Additionally it can be demonstrated that the determinant of the product is the product of the determinants

det (AB) = (det A) · (det B) (3.31)

and since the determinant of the identity is 1 we get


  
1 = det 1 = det (AA−1 ) = (det A) · det (A−1 )

so that
det (A−1 ) = (det A)−1 (3.32)
if any row or column is multiplied by a scalar α, the determinant is also multiplied by the scalar. For example in three
dimensions

\det \begin{pmatrix} \alpha a_{11} & \alpha a_{12} & \alpha a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = \det \begin{pmatrix} a_{11} & \alpha a_{12} & a_{13} \\ a_{21} & \alpha a_{22} & a_{23} \\ a_{31} & \alpha a_{32} & a_{33} \end{pmatrix} = \alpha \det \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} \qquad (3.33)

so that if we multiply an n × n matrix by a scalar, the determinant is

det (αA) = αn det A (3.34)

in particular

det (−A) = (−1)ⁿ det A (3.35)
another important property is the trace of the matrix defined as the sum of its diagonal elements

T rA = aii (3.36)

we emphasize the sum over repeated indices. We prove that

T r [AB] = T r [BA] (3.37)

in this way
T r [AB] = (AB)ii = aik bki = bki aik = (BA)kk = T r [BA]
it is important to see that the trace is cyclic invariant, i.e.

\mathrm{Tr}\left[A^{(1)} A^{(2)} \ldots A^{(n-2)} A^{(n-1)} A^{(n)}\right] = \mathrm{Tr}\left[A^{(n)} A^{(1)} A^{(2)} \ldots A^{(n-2)} A^{(n-1)}\right] = \mathrm{Tr}\left[A^{(n-1)} A^{(n)} A^{(1)} A^{(2)} \ldots A^{(n-2)}\right] \qquad (3.38)

and so on. To prove it, we define


B ≡ A(1) A(2) . . . A(n−1)
so that

\mathrm{Tr}\left[A^{(1)} A^{(2)} \ldots A^{(n-2)} A^{(n-1)} A^{(n)}\right] = \mathrm{Tr}\left[B A^{(n)}\right] = \mathrm{Tr}\left[A^{(n)} B\right] = \mathrm{Tr}\left[A^{(n)} A^{(1)} A^{(2)} \ldots A^{(n-2)} A^{(n-1)}\right]

and taking into account that the indices (1) , (2) , ... are dummy, any cyclic change is possible. It is worth saying that property
(3.37) does not mean that the matrices can be commuted to calculate the trace; for instance, for three or more matrices the
trace is not the same for any order of the matrices, only cyclic changes are possible. In that sense, we should interpret (3.37)
as a cyclic change and not as a commutation.
But the most important property of traces and determinants is that they are invariant under a similarity transformation

det A′ = det (BAB−1 ) = (det B) · (det A) · det (B−1 ) = (det B) · (det A) · (det B)−1
⇒ det A′ = det A

where we have used (3.31) and (3.32). Now for the invariance of the trace

\mathrm{Tr} A' = \mathrm{Tr}\left(BAB^{-1}\right) = \sum_{i=1}^{n} \left(BAB^{-1}\right)_{ii} = \sum_{ikl} b_{ik}\, a_{kl}\, \bar{b}_{li} = \sum_{ikl} \bar{b}_{li}\, b_{ik}\, a_{kl} = \sum_{kl} \delta_{kl}\, a_{kl} = \sum_{k} a_{kk} = \mathrm{Tr} A

where b̄li denotes matrix elements of B−1 . Alternatively we can see it by using the cyclic invariance of the trace Eq. (3.38),
such that    
T r [A′ ] = T r BAB−1 = T r B−1 BA = T rA (3.39)
the invariance of determinants and traces under similarity transformations is a fact of major importance, because all representations
of a given linear transformation are related to each other by similarity transformations. This means that determinants and
traces are intrinsic quantities that can be attributed to the linear transformations themselves, thus

Definition 3.5 We define the trace and the determinant of a given linear transformation of V into itself by calculating the
trace and determinant of the matrix representative of the linear transformation in any basis.
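These invariances are easy to verify numerically. The following sketch (assuming numpy; the matrices are random illustrative choices, with S assumed non-singular) checks the cyclic property (3.38) for three matrices and the invariance of the trace and the determinant under a similarity transformation.

import numpy as np

rng = np.random.default_rng(7)
A, B, C = (rng.standard_normal((4, 4)) for _ in range(3))

print(np.isclose(np.trace(A @ B @ C), np.trace(C @ A @ B)))     # cyclic change, Eq. (3.38)
print(np.isclose(np.trace(A @ B @ C), np.trace(B @ C @ A)))
# note: np.trace(A @ C @ B) is in general different, since only cyclic changes are allowed

S = rng.standard_normal((4, 4))                                 # assumed non-singular (generic)
A_prime = S @ A @ np.linalg.inv(S)                              # similarity transform of A
print(np.isclose(np.trace(A_prime), np.trace(A)))               # Eq. (3.39)
print(np.isclose(np.linalg.det(A_prime), np.linalg.det(A)))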

3.7 Rectangular matrices


A rectangular matrix is an arrangement of numbers consisting of m rows and n columns. In that case we say that the matrix
has dimensions m × n. The elements of such a matrix will be of the form

(A)ik = aik ; i = 1, . . . , m ; k = 1, . . . , n

the transpose of this matrix would have dimensions n × m. A column vector arrangement (from now on, we shall call it simply
a “vector”, though it is not necessarily a vector in the full sense of the word) is a rectangular matrix of dimension m × 1; its
transpose (a row “vector”) is a rectangular matrix of dimensions 1 × m.
Now, it would be desirable to extrapolate the algorithm of square matrices composition to calculate products of rectangular
matrices
cij ≡ aik bkj
It is observed that this extrapolation of the matrix product to the case of rectangular matrices C = AB, can be defined
consistently only if the number of columns of A coincides with the number of rows of B.
AB = C if A ≡ Am×n and B ≡ Bn×d ⇒ Cm×d
In particular, the product of a column vector (m × 1 matrix) with an m × m matrix in the form xA cannot be defined.
Nevertheless, the product of the transpose of the vector (row vector) and the matrix A in the form x̃A can be defined.
In a similar fashion, the product Ax̃ cannot be defined but Ax can. From these considerations the quantities Ax and x̃A
correspond to a new column vector and a new row vector respectively.
From the dimensions of the rectangular matrices we see that

A_{m×n} ⇒ Ã_{n×m} and B_{n×d} ⇒ B̃_{d×n}

and the product AB is defined. However, their transposes can only be multiplied in the opposite order, i.e. in the order B̃Ã.
Indeed, it is easy to prove that, as in the case of square matrices, the transpose of a product is the product of the transpose
of each matrix in the product, but with the product in the opposite order. Applying this property it can be seen that

\widetilde{(Ax)} = \tilde{x}\tilde{A} \;\; ; \;\; \widetilde{(\tilde{x}A)} = \tilde{A}x
where we have taken into account that the transpose of the transpose is the original matrix.

3.8 Symmetric and antisymmetric matrices


If a matrix coincides with its transpose

aij = aji ⇔ A = Ã

we say that it is a symmetric matrix. If the matrix coincides with minus its transpose

aij = −aji ⇔ A = −Ã

we say that the matrix is antisymmetric. It is clear that all diagonal elements of an antisymmetric matrix are null, and
hence such matrices are traceless.
Note that it is always possible to decompose any matrix into a symmetric and an antisymmetric part

A = \hat{A} + \bar{A} \;\; ; \;\; \hat{A} \equiv \frac{1}{2}\left(A + \tilde{A}\right) \;\; ; \;\; \bar{A} \equiv \frac{1}{2}\left(A - \tilde{A}\right)
a_{ij} = \hat{a}_{ij} + \bar{a}_{ij} \;\; ; \;\; \hat{a}_{ij} \equiv \frac{a_{ij} + a_{ji}}{2} \;,\; \bar{a}_{ij} = \frac{a_{ij} - a_{ji}}{2} \qquad (3.40)
and the transpose of A is also a combination of the same components

\tilde{A} = \hat{A} - \bar{A}
a real n × n symmetric matrix has n (n + 1) /2 independent components (e.g. the diagonal elements, and all elements above
the diagonal), while a real n × n antisymmetric matrix has n (n − 1) /2 independent components (e.g. all elements below the
main diagonal). This is consistent with the fact that an arbitrary matrix can be separated into symmetric and antisymmetric
components, since

\frac{n(n+1)}{2} + \frac{n(n-1)}{2} = n^2
gives the correct degrees of freedom of an arbitrary n × n real matrix. For complex matrices all degrees of freedom are
duplicated. Now, since the trace is invariant under a similarity transformation, it is sometimes useful to separate the trace
as an independent degree of freedom in the decomposition (3.40). Since the antisymmetric component in (3.40) is already
traceless, we decompose the symmetric part into two symmetric matrices, one of them traceless and the other containing the
trace as its only degree of freedom

\hat{A} = \hat{A}_{tl} + (\mathrm{Tr}A)\, \hat{A}_{t} \;\; ; \;\; \left(\hat{A}_{tl}\right)_{ij} = \hat{a}_{ij} - \delta_{ij}\delta_{im}\,(\mathrm{Tr}A) \;,\; \left(\hat{A}_{t}\right)_{ij} = \delta_{ij}\delta_{im} \;\; ; \;\; 1 \le m \le n \qquad (3.41)
 
in words, the elements of \hat{A}_{t} are zero except for one diagonal element associated with a given m, for which (\hat{A}_{t})_{mm} = 1. On
the other hand, \hat{A}_{tl} is identical to \hat{A} except for (\hat{A}_{tl})_{mm} = \hat{a}_{mm} − TrA. The latter element ensures that \hat{A}_{tl} is traceless while
\hat{A}_{t} only contains the trace as a degree of freedom. We shall illustrate these issues for n = 3 in Sec. 5.2.2, page 97.
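A compact numerical sketch of the decompositions (3.40, 3.41) (assuming numpy; the matrix is a random illustrative choice, and we take m = 1, i.e. the first diagonal element, for the matrix that carries the trace):

import numpy as np

rng = np.random.default_rng(8)
A = rng.standard_normal((3, 3))

A_sym  = (A + A.T) / 2                     # symmetric part, Eq. (3.40)
A_anti = (A - A.T) / 2                     # antisymmetric part (automatically traceless)
print(np.allclose(A, A_sym + A_anti), np.isclose(np.trace(A_anti), 0.0))

A_t = np.zeros((3, 3)); A_t[0, 0] = 1.0    # (A_t)_ij = delta_ij delta_i1, choosing m = 1
A_tl = A_sym - np.trace(A) * A_t           # traceless symmetric remainder, Eq. (3.41)
print(np.isclose(np.trace(A_tl), 0.0))
print(np.allclose(A_sym, A_tl + np.trace(A) * A_t))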

3.9 The eigenvalue problem


If T is a linear transformation on a vector space of finite dimension n, the simplest thing that the linear transformation can
do to a vector is to produce a “dilatation” or “contraction” on it, possibly changing the “sense” of the “arrow” but keeping
its “direction”. In algebraic words, certain vectors can be transformed by T into scalar multiples of themselves. If x is a vector in
H this operation is given by
T x = λx (3.42)
a non-zero vector x such that Eq. (3.42) holds, is called an eigenvector of T , and the corresponding scalar λ is called an
eigenvalue of T . Each eigenvalue has one or more eigenvectors associated with it and to each eigenvector corresponds a unique
eigenvalue.
Let us assume for a moment that the set of eigenvalues for a given T is non-empty. For a given λ consider the set M of all
its eigenvectors together with the vector 0 (which is not an eigenvector); we denote these vectors as x_i^{(λ)}. M is a linear subspace
of H, as we see by taking an arbitrary linear combination of vectors in M

T\left(\alpha_i x_i^{(\lambda)}\right) = \alpha_i T x_i^{(\lambda)} = \alpha_i \lambda x_i^{(\lambda)} = \lambda \left(\alpha_i x_i^{(\lambda)}\right) \Rightarrow
T\left(\alpha_i x_i^{(\lambda)}\right) = \lambda \left(\alpha_i x_i^{(\lambda)}\right)

such that a linear combination is also an eigenvector with the same eigenvalue5 . Indeed, for Hilbert spaces it can be shown
that M is a closed vector subspace of H. As any vector space, M has many bases, and if H is finite dimensional, complete
orthonormal sets are bases. The dimension of M is thus the maximum number of linearly independent eigenvectors associated
with λ. M is called the vector eigenspace generated by the eigenvalue λ. This discussion induces the following

Definition 3.6 A given eigenvalue λ in Eq. (3.42) is called n−fold degenerate if n is the dimension of the eigenspace M of
H generated by λ. In other words, n is the maximum number of linearly independent eigenvectors of λ. If n = 1 we say that
λ is non-degenerate.

Even for non-degenerate eigenvalues we always have an infinite number of eigenvectors, for if x^{(λ)} is an eigenvector, then
αx^{(λ)} is also an eigenvector for any non-zero scalar α. Eq. (3.42) can be written equivalently as

(T − λI) x = 0 (3.43)

let us return to the problem of the existence of eigenvalues; we illustrate such a problem with the following example

Example 3.1 The operator T on C ∞ given by

T {x1 , x2 , ...} = {0, x1 , x2 , ...} (3.44)

is an operator on a Hilbert space that has no eigenvalues. This can be seen by observing that the eigenvalue equation for this operator combined with Eq. (3.44) yields

T {x1 , x2 , ...} = λ {x1 , x2 , ...} = {0, x1 , x2 , ...}

if λ = 0 then all xi = 0, so x is not an eigenvector. If λ ≠ 0 we obtain λx1 = 0 and λx2 = x1 , λx3 = x2 , etc., leading again to the null vector.

We then confront the problem of characterizing the type of operators that admit eigenvalues. In the finite-dimensional case, we shall see that the theory of representations and the fundamental theorem of algebra ensure the existence of eigenvalues for an arbitrary operator.

3.9.1 Matrix representative of the eigenvalue problem


The one-to-one correspondence between matrices and operators in the finite dimensional case permits a matrix representation of the eigenvalue equation (3.42). Let T be the n × n matrix associated with the operator T and x the column vector representative of x (an n × 1 matrix). Eq. (3.42) is written as

Tx = λx (3.45)

which is the eigenvalue equation associated with the matrix. The idea is to solve for the eigenvalues and eigenvectors in a given representation. The values λ are in general complex. According to our previous discussion, the eigenvalue is
5 The 0 vector must be included explicitly to take into account the trivial linear combination, since by definition 0 is not an eigenvector.

the "dilatation" or "contraction" factor; if it is a negative real number it "inverts the sense of the arrow". Let us rewrite the eigenvalue equation as
(T − λ1) x = 0 (3.46)
for simplicity we shall use n = 3 but the arguments are valid for arbitrary finite dimensions. In three dimensions the explicit
form of (3.46) becomes

(T11 − λ) X1 + T12 X2 + T13 X3 = 0


T21 X1 + (T22 − λ) X2 + T23 X3 = 0
T31 X1 + T32 X2 + (T33 − λ) X3 = 0 (3.47)

This set of homogeneous equations for X1 , X2 , X3 has non trivial solution only if the determinant of the system is null, therefore
 
T11 − λ T12 T13
det (T − λ1) = det  T21 T22 − λ T23  = 0 (3.48)
T31 T32 T33 − λ

this condition is known as the secular or characteristic equation of the matrix. The variables to be found are the eigenvalues λ associated with the matrix. It is worth saying that even if non-trivial solutions exist, the set of homogeneous equations (3.47) does not give us definite values for all the components of the eigenvectors, but only for the ratios among these components. This can be understood from either algebraic or geometric arguments. From the algebraic point of view, it is related to the fact that the product of the eigenvector x with any scalar is also an eigenvector, as can be seen immediately from (3.46)6. Geometrically, this implies that only the "direction" of the eigenvector is determined, but neither its "length" nor its "sense". This is particularly apparent in three dimensions. Since T represents a linear transformation, it is clear that if T preserves the direction of x, i.e. Tx = λx, it also preserves the "direction" of the vector αx for arbitrary α.
When the determinant (3.48) is expanded, we observe that the solution of the secular equation reduces to finding the roots of a polynomial of degree n. Appealing to the fundamental theorem of algebra we always have exactly n complex roots, some of which could be repeated, so that we could have fewer than n distinct roots. In general we can construct no more than n linearly independent vectors xk, each one associated with an eigenvalue λk. So far, the set of eigenvalues is associated with a matrix; in order to associate it with its corresponding operator, we should be sure that the set of eigenvalues is the same for any representation of the operator, i.e. that all equivalent matrices have the same set of eigenvalues.

Theorem 3.4 If two n × n matrices are equivalent i.e. T ′ = ST S −1 then both have the same set of eigenvalues.

In summary, the fundamental theorem of Algebra together with the intrinsic meaning of the set of eigenvalues, solves the
problem of the existence of eigenvalues for linear transformations on finite-dimensional vector spaces.
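A small numerical check of theorem 3.4 may help fix ideas; the matrix below and the use of numpy are illustrative assumptions, not taken from the text.

```python
import numpy as np

# Similar (equivalent) matrices share the same set of eigenvalues, which are the
# roots of the secular polynomial det(T - lambda*1) = 0.
T = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
S = np.random.rand(3, 3) + 2.0 * np.eye(3)        # generically invertible
T_prime = S @ T @ np.linalg.inv(S)                # T' = S T S^{-1}

eig_T  = np.sort_complex(np.linalg.eigvals(T))
eig_Tp = np.sort_complex(np.linalg.eigvals(T_prime))
print(np.allclose(eig_T, eig_Tp))                 # True: same spectrum

# the eigenvalues are also the roots of the characteristic (secular) polynomial
coeffs = np.poly(T)                               # coefficients of det(lambda*1 - T)
print(np.allclose(np.sort_complex(np.roots(coeffs)), eig_T))
```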

Definition 3.7 The set of eigenvalues of T is called its spectrum and is denoted by σ (T ).

Theorem 3.5 If T is an arbitrary linear transformation on a finite dimensional complex vector space, the spectrum of T constitutes a non-empty finite subset of the complex plane. The number of elements in this subset does not exceed the dimension n of the space.

Some other important theorems related with the set of eigenvalues are the following

Theorem 3.6 T is singular ⇔ 0 ∈ σ (T ).



Theorem 3.7 If T is non-singular, then λ ∈ σ (T) ⇔ λ⁻¹ ∈ σ (T⁻¹).

More information about the spectral resolution of some types of operators in a Hilbert space will be given by means of the
spectral theorem. By now, we turn to the problem of the sets of eigenvectors and its relation with the canonical problem of
matrices.

3.9.2 Eigenvectors and the canonical problem of matrices


Since we can have many representations of a given operator by changing the basis, many matrix representatives can be constructed. It is natural to wonder whether it is possible to choose the basis in such a way that the matrix representative is as simple as possible. In practice, the simplest matrices are diagonal matrices, i.e. matrices for which Tij = 0 for i ≠ j. Thus, we are
looking for a basis under which the matrix representative of a given operator T is diagonal. Starting with a given basis {ui } we
obtain a matrix representative of T (denoted by T), we wonder whether there exists another basis {u′i } for which the matrix
6 Alternatively, this can be seen from the fact that the secular equation only has a non-trivial solution when one or more of the equations is linearly dependent on the others. In such a case there are fewer independent equations than variables and hence an infinite number of solutions.

representative T′ of T is diagonal. From Eqs. (3.10, 3.18) we see that T and T′ are related by a similarity transformation that also gives us the transformation among the bases
$$u' = \tilde{A}u\ ;\qquad \mathbf{T}' = \tilde{A}^{-1}\,\mathbf{T}\,\tilde{A} \qquad (3.49)$$

We shall see that for finite dimensional matrices, the canonical problem of matrices is intimately related to the structure of their eigenvectors. Let us consider the representation Xk of the eigenvectors of T with respect to the original basis {ui}. We denote the i−th coordinate of the k−th eigenvector in the form Xik (with respect to the original basis). We can assemble a square array with these eigenvectors, placing them side by side as column vectors. In three dimensions, such an arrangement
has the form  
X11 X12 X13
X ≡ (X1 X2 X3 ) =  X21 X22 X23  (3.50)
X31 X32 X33
Eqs. (3.46) are written for each eigenvalue λk and its corresponding eigenvector Xk in the form

(T − λk 1) Xk = 0 ⇒ TXk = λk Xk no sum over k (3.51)

writing Eqs. (3.51) in components with respect to the basis {ui} we get (for n dimensions)
$$\sum_{j=1}^{n} T_{ij}X_{jk} = \lambda_k X_{ik} \;\Rightarrow\; \sum_{j=1}^{n} T_{ij}X_{jk} = \sum_{j=1}^{n} X_{ij}\,\delta_{jk}\,\lambda_k \qquad (3.52)$$

in the two previous equations there is no sum over the repeated index k. The Xjk element is the j−th component of the Xk
vector. Now, the quantity δjk λk can be associated with a diagonal matrix, in three dimensions this matrix is written as
 
$$\boldsymbol{\lambda} \equiv \begin{pmatrix} \lambda_1 & 0 & 0 \\ 0 & \lambda_2 & 0 \\ 0 & 0 & \lambda_3 \end{pmatrix} \qquad (3.53)$$

in matrix form, Eq. (3.52) reads
TX = Xλ
multiplying on the left by X⁻¹ we find
X⁻¹TX = λ (3.54)
which corresponds to a similarity transformation acting on T. Note that the matrix X built from the eigenvectors is the transformation matrix (comparing with 3.49 we have X ≡ Ã). We see then that the matrix T is diagonalized by X by means of a similarity transformation, and the elements of the diagonal correspond to the eigenvalues (λk associated with the column vector Xk of the matrix X in Eq. 3.50). When there are some degenerate eigenvalues, i.e. some of them acquire the same value, it is not always possible to diagonalize the matrix T. This is because in that case the eigenvectors that form the matrix X are not necessarily linearly independent. If any given column vector of the matrix is linearly dependent on the others, the determinant of X is zero and X⁻¹ does not exist.
On the other hand, when diagonalization is possible, the determinant and the trace of T can be calculated taking into
account that such quantities are invariant under a similarity transformation, therefore
 
det T = det X−1 TX = det λ = λ1 λ2 . . . λn (3.55)
 −1 
T rT = T r X TX = T rλ = λ1 + λ2 + . . . + λn (3.56)

so that the determinant and the trace of a diagonalizable matrix are simply the product and the sum of its eigenvalues, respectively.
In summary, a canonical form of a given matrix can be obtained as long as the eigenvectors of the matrix form a basis; the question that remains open is under what conditions the eigenvectors form a basis, and this is part of the program of the spectral theorem.
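As a hedged numerical illustration of Eqs. (3.54)-(3.56) (the matrix below and the numpy calls are assumptions for the example, not taken from the text):

```python
import numpy as np

# The matrix X of column eigenvectors diagonalizes T when the eigenvectors are
# linearly independent; determinant and trace follow from the eigenvalues.
T = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])          # symmetric, hence diagonalizable

lam, X = np.linalg.eig(T)                # columns of X are the eigenvectors X_k
Lambda = np.linalg.inv(X) @ T @ X        # Eq. (3.54): X^{-1} T X = diag(lambda)
print(np.allclose(Lambda, np.diag(lam)))             # True

print(np.isclose(np.linalg.det(T), np.prod(lam)))    # Eq. (3.55): det T = product
print(np.isclose(np.trace(T),      np.sum(lam)))     # Eq. (3.56): Tr T  = sum
```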

3.10 Normal operators and the spectral theorem


Let T be an operator on a finite-dimensional Hilbert space H. By theorem 3.5 the spectrum σ(T) is a non-empty finite set of complex numbers with cardinality less than or equal to the dimension n of H. Let λ1, . . . , λm be the set of distinct eigenvalues; let M1, . . . , Mm be their corresponding eigenspaces; and let P1, . . . , Pm be the projections on these eigenspaces. The spectral theorem is the assertion that the following three statements are equivalent to one another

I) The Mi's are pairwise orthogonal and $H = M_1 \oplus \ldots \oplus M_m$.
II) The Pi's are pairwise orthogonal, $I = \sum_{i=1}^{m} P_i$, and $T = \sum_{i=1}^{m} \lambda_i P_i$.
III) T is normal.
The assertion I) means that any vector x ∈ H can be expressed uniquely in the form

x = x1 + . . . + xm ; xi ∈ Mi ; (xi , xj ) = 0 f or i 6= j (3.57)

applying T on both sides and using linearity

T x = T x1 + . . . + T xm = λ1 x1 + . . . + λm xm (3.58)

this shows the action of T on each element of H in a pattern that is transparent from the geometrical point of view. It is convenient to write it in terms of projections on each Mi. Taking into account that Mj ⊆ Mi⊥ for each i and for every j ≠ i, we obtain from Eq. (3.57) that
Pi x = xi
from which it follows

Ix = x = x1 + . . . + xm = P1 x + . . . + Pm x
Ix = (P1 + . . . + Pm ) x ; ∀x ∈ H

therefore
$$I = \sum_{i=1}^{m} P_i \qquad (3.59)$$

and relation (3.58) gives

T x = λ1 x1 + . . . + λm xm = λ1 P1 x + . . . + λm Pm x
T x = (λ1 P1 + . . . + λm Pm ) x ; ∀x ∈ H

hence
$$T = \sum_{i=1}^{m} \lambda_i P_i \qquad (3.60)$$

Eq. (3.60) is called the spectral resolution of the operator T. In this resolution it is to be understood that all the λi's are distinct and that the Pi's are non-zero projections which are pairwise orthogonal and satisfy condition (3.59). It can be shown that the spectral resolution is unique when it exists. These facts show that I) ⇒ II).
Now, we look for the conditions that the operator must satisfy in order to be decomposed as in Eq. (3.60). From Eq. (3.60) we see that
T † = λ∗1 P1 + . . . + λ∗m Pm (3.61)
and multiplying (3.60) with (3.61), and using the fact that the Pi's are pairwise orthogonal, we have
$$TT^{\dagger} = \left(\sum_{i=1}^{m} \lambda_i P_i\right)\left(\sum_{k=1}^{m} \lambda_k^{*} P_k\right) = \sum_{i=1}^{m}\sum_{k=1}^{m} \lambda_i \lambda_k^{*}\, P_i P_k = \sum_{i=1}^{m}\sum_{k=1}^{m} \lambda_i \lambda_k^{*}\, P_i^{2}\,\delta_{ik}$$
$$TT^{\dagger} = \sum_{k=1}^{m} |\lambda_k|^{2} P_k \qquad (3.62)$$

and multiplying in the opposite order we obtain the same result
$$T^{\dagger}T = \sum_{k=1}^{m} |\lambda_k|^{2} P_k \qquad (3.63)$$

from which we see that
$$\left[T, T^{\dagger}\right] = 0$$
and the operator must be normal. We have proved that I) ⇒ II) ⇒ III). To complete the proof we should show that III) ⇒ I)
i.e. that every normal operator T on H satisfies conditions I).
This task is accomplished by the following chain of theorems

Theorem 3.8 If T is normal, x is an eigenvector of T with eigenvalue λ ⇔ x is an eigenvector of T † with eigenvalue λ∗ .

Theorem 3.9 If T is normal the Mi′ s are pairwise orthogonal



Theorem 3.10 If T is normal, each Mi reduces T .


Theorem 3.11 If T is normal, the Mi′ s span H.
For most applications theorem 3.9 is rewritten as
Theorem 3.12 If T is normal, two eigenvectors of T corresponding to different eigenvalues are orthogonal. In particular this
is valid for self-adjoint and unitary operators.
Assume that T = T†. Since for a given eigenvector x there is a unique eigenvalue λ, we see from theorem 3.8 that λ = λ∗, so the corresponding eigenvalues are real. Now assume for a normal operator T that σ(T) is a subset of the real line; using the spectral resolution of T†, Eq. (3.61), we find
T † = λ∗1 P1 + . . . + λ∗m Pm = λ1 P1 + . . . + λm Pm = T
we have the following
Theorem 3.13 Let T be a normal operator on a Hilbert space of finite dimension H with distinct eigenvalues {λ1 , . . . , λm },
then T is self-adjoint ⇔ each λi is real.
It is important to emphasize that the hypothesis of real eigenvalues leads to the self-adjointness of the operator only if normality is part of the hypothesis (because of the use of the spectral theorem). It does not discard the possibility of having non-normal operators with real spectrum; in that case such operators would not be self-adjoint. In addition, it is worth remembering that self-adjoint operators were constructed as the analogue of "the real line subset" in the algebra of operators, so the fact that their eigenvalues are all real is a quite expected result.
A special type of self-adjoint operators are the positive operators, for which
(x, T x) ≥ 0 ∀x ∈ H (3.64)
applying the spectral resolution of T on xi ∈ Mi, with xi ≠ 0, we have
$$T x_i = \sum_{k=1}^{m} \lambda_k P_k x_i = \sum_{k=1}^{m} \lambda_k x_i\, \delta_{ik} = \lambda_i x_i$$

and using it in Eq. (3.64) we find
$$(x_i, T x_i) = (x_i, \lambda_i x_i) = \lambda_i (x_i, x_i) \geq 0 \quad \text{(no sum over } i\text{)}$$
$$\lambda_i \|x_i\|^{2} \geq 0 \;\Rightarrow\; \lambda_i \geq 0$$
on the other hand, by assuming that a normal operator T has a real non-negative spectrum we obtain
$$(x, T x) = \left(x, \sum_{i=1}^{m} \lambda_i P_i x\right) = \left(\sum_{k=1}^{m} x_k, \sum_{i=1}^{m} \lambda_i x_i\right) = \sum_{k=1}^{m}\sum_{i=1}^{m} \lambda_i (x_k, x_i) = \sum_{k=1}^{m}\sum_{i=1}^{m} \lambda_i\, \delta_{ki}\, (x_k, x_k)$$
$$(x, T x) = \sum_{k=1}^{m} \lambda_k \|x_k\|^{2} \geq 0$$

we see then that


Theorem 3.14 Let T be a normal operator on a Hilbert space of finite dimension H with distinct eigenvalues {λ1 , . . . , λm },
then T is positive ⇔ λi ≥ 0 for each i = 1, . . . , m.
It is clear from theorem 3.13 that a normal positive operator must be self-adjoint. Now, for a normal operator T, a necessary and sufficient condition for T to be unitary is that T†T = I (in finite dimension it is not necessary to show that TT† = I). Using Eqs. (3.59, 3.63), the condition for unitarity is
$$T^{\dagger}T = I \;\Rightarrow\; \sum_{k=1}^{m} |\lambda_k|^{2} P_k = I \;\Rightarrow\; \sum_{k=1}^{m} |\lambda_k|^{2} P_k = \sum_{k=1}^{m} P_k$$

multiplying by Pi and using the pairwise orthogonality of the projectors
$$\sum_{k=1}^{m} |\lambda_k|^{2} P_k P_i = \sum_{k=1}^{m} P_k P_i \;\Rightarrow\; |\lambda_i|^{2} P_i^{2} = P_i^{2} \;\Rightarrow\; |\lambda_i|^{2} P_i = P_i$$

so that |λi| = 1. This procedure also shows that if T is a normal operator in which |λi| = 1 for each i, then TT† = I and T is unitary. Hence we have

Theorem 3.15 Let T be a normal operator on a Hilbert space of finite dimension H with distinct eigenvalues {λ1 , .., λm },
then T is unitary ⇔ |λi | = 1 for each i = 1, . . . , m.

Now, remembering that unitary operators were constructed as the analogue of "the unit circle subset" in the algebra of operators, the fact that their eigenvalues lie on the unit circle of the complex plane is quite natural.
Now we are prepared to discuss the canonical problem for normal matrices. We denote by ni the dimension of each eigenspace Mi; it is clear that
$$n_1 + n_2 + \ldots + n_m = n$$
Each Mi contains ni linearly independent vectors $\left\{x_1^{i}, \ldots, x_{n_i}^{i}\right\}$ that can be orthonormalized by a Gram-Schmidt process, say to $\left\{u_1^{i}, \ldots, u_{n_i}^{i}\right\}$. If we do this for each Mi, the set formed by the union of these orthonormal sets
$$\{u\} \equiv \bigcup_{i=1}^{m}\left\{u_1^{i}, \ldots, u_{n_i}^{i}\right\}$$

is clearly an orthonormal set, because all vectors corresponding to different Mi's are orthogonal according to theorem 3.9. In addition, since the Mi's span H according to theorem 3.11, this orthonormal set is complete and hence a basis. Therefore, for any normal operator T on H we can always form a complete orthonormal set of eigenvectors. If we use these complete orthonormal eigenvectors to form the matrix of diagonalization, Eq. (3.50), we see that the matrix obtained is a unitary matrix; it is clear that for these matrices the inverse always exists7, and therefore the diagonalization can be carried out. Then we have the following

Theorem 3.16 The diagonalization of a normal matrix T can be performed by a similarity transformation of the form T′ =
U TU−1 where U is a unitary matrix.

This is of particular interest because it means that given a matrix representative of T in a basis consisting of a complete
orthonormal set, there exists another complete orthonormal set for which the matrix representative acquires its canonical form.
Further, it is easy to see that the canonical form of a normal matrix is given by
$$\mathrm{diag}\big(\underbrace{\lambda_1, \ldots, \lambda_1}_{n_1\ \text{times}},\ \underbrace{\lambda_2, \ldots, \lambda_2}_{n_2\ \text{times}},\ \ldots,\ \underbrace{\lambda_m, \ldots, \lambda_m}_{n_m\ \text{times}}\big)$$

where the elements out of the diagonal are zero and each λi is repeated ni times (λi is ni −fold degenerate). It is easily seen
that the matrix representation of Pi in this orthonormal basis is
$$P_1 = \begin{pmatrix} 1_{n_1\times n_1} & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}\ ;\quad P_2 = \begin{pmatrix} 0_{n_1\times n_1} & 0 & 0 \\ 0 & 1_{n_2\times n_2} & 0 \\ 0 & 0 & 0 \end{pmatrix}\ ;\quad \ldots\ ;\quad P_m = \begin{pmatrix} 0 & 0 & 0 \\ 0 & \ddots & 0 \\ 0 & 0 & 1_{n_m\times n_m} \end{pmatrix}$$

and the matrix representation of the spectral decomposition becomes clear.
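A minimal numerical sketch of the spectral resolution for a Hermitian (hence normal) matrix may be useful here; the specific matrix and the numpy-based construction of the projectors are illustrative assumptions.

```python
import numpy as np

# Spectral resolution T = sum_i lambda_i P_i for a Hermitian matrix: build each
# orthogonal projector P_i from an orthonormal basis of the eigenspace M_i.
T = np.array([[3.0, 0.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])          # eigenvalues: 3 (twofold degenerate) and 1

lam, U = np.linalg.eigh(T)               # U is unitary: orthonormal eigenvector columns
distinct = np.unique(np.round(lam, 10))

P = {}                                   # one projector per distinct eigenvalue
for l in distinct:
    cols = U[:, np.isclose(lam, l)]      # orthonormal basis of the eigenspace M_l
    P[l] = cols @ cols.conj().T

print(np.allclose(sum(P.values()), np.eye(3)))             # Eq. (3.59): sum P_i = I
print(np.allclose(sum(l * P[l] for l in distinct), T))     # Eq. (3.60): T = sum lambda_i P_i
print(all(np.allclose(P[a] @ P[b], 0)                      # pairwise orthogonality
          for a in P for b in P if a != b))
```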

3.10.1 A qualitative discussion of the spectral theorem in infinite dimensional Hilbert spaces
The rigorous discussion of the infinite dimensional case of the spectral theorem is beyond the scope of this survey. We shall only speak qualitatively about the difficulties that arise when we go to infinite dimension. For simplicity we assume that A is a self-adjoint operator, whose spectral resolution is given by
$$A = \sum_{i=1}^{m} \lambda_i P_i$$

7 It can be seen by combining theorem 3.6 with the fact that λi 6= 0 for each i (see theorem 3.15).

since the eigenvalues are real we can order them in a natural way in the form λ1 < λ2 < . . . < λm, and we use the Pi's to define new projections

Pλ0 = 0
Pλ1 = P1
Pλ2 = P1 + P2
....
Pλm = P1 + ... + Pm = I

the spectral decomposition of the self-adjoint operator A can be written as
$$A = \lambda_1 P_1 + \lambda_2 P_2 + \ldots + \lambda_m P_m = \lambda_1 \left(P_{\lambda_1} - P_{\lambda_0}\right) + \lambda_2 \left(P_{\lambda_2} - P_{\lambda_1}\right) + \ldots + \lambda_m \left(P_{\lambda_m} - P_{\lambda_{m-1}}\right)$$
$$A = \sum_{i=1}^{m} \lambda_i \left(P_{\lambda_i} - P_{\lambda_{i-1}}\right)$$

if we define
$$\Delta P_{\lambda_i} \equiv P_{\lambda_i} - P_{\lambda_{i-1}}$$
we can rewrite the decomposition of A as
$$A = \sum_{i=1}^{m} \lambda_i\, \Delta P_{\lambda_i}$$

which suggests the integral representation
$$A = \int \lambda\, dP_{\lambda} \qquad (3.65)$$
in this form, the spectral decomposition of a self-adjoint operator is valid for infinite dimensional Hilbert spaces. For normal operators we have a similar pattern
$$N = \int \lambda\, dP_{\lambda} \qquad (3.66)$$

The first problem in carrying out this generalization is that an operator on H need not have eigenvalues at all, as illustrated by example 3.1, page 44. In this general case the spectrum of T is defined as
$$\sigma(T) = \{\lambda : T - \lambda I \text{ is singular}\}$$
when H is finite dimensional, σ(T) consists entirely of eigenvalues. In the infinite dimensional case we can only say that σ(T) is non-empty, closed and bounded. Once this difficulty is overcome we should give a precise meaning to the integrals (3.65, 3.66) and prove the validity of those relations. For example, in quantum mechanics what we usually do is to require that operators related to physical quantities be self-adjoint operators whose eigenvectors provide a complete orthonormal set (not all self-adjoint operators in infinite dimension satisfy this condition); self-adjoint operators that fulfill this condition are called observables. Whether a given operator is an observable or not must be determined after its eigenvalue equation is solved. For this kind of operators the spectral theorem in its present form can be extended to infinite dimensions.
It is worth emphasizing that the existence of eigenvalues in the finite dimensional case came from the fundamental theorem of algebra, which in turn came from the fact that the characteristic equation of a finite dimensional matrix is a polynomial equation. An extension to infinite dimension clearly does not lead to a polynomial equation, so we cannot resort to the fundamental theorem of algebra.

3.11 The concept of “hyperbasis”


Suppose that the vector space that concerns us is V, which is a proper subspace of a bigger vector space W. As any vector space, W has a basis {wi} that generates any vector in W by linear combinations. It is obvious that any vector of V can be generated through linear combinations of {wi}. However, there are at least two reasons for which {wi} is not a basis for V: (a) at least one element of the set {wi} is not in V, and one of the conditions for a given set S to be a basis of a given vector space V is that S ⊆ V; (b) given a basis {vi} of V, the sets {wi} and {vi} do not in general have the same cardinality, and we know that different bases must have the same cardinality.
Let us see a simple example: let us use an orthonormal basis of R3 given by
1 1 1
u1 ≡ √ (1, 1, 1) ; u2 ≡ √ (4, −1, −3) ; u3 = √ (−2, 7, −5)
3 26 78

to generate all vectors of the XY plane. The coordinates of the ui are written with respect to the ordinary cartesian coordinates. Since these vectors generate R3 it is clear that they generate the XY plane, which is a proper subset of R3. Notwithstanding, none of the vectors ui lies in the XY plane; all the elements of this "hyperbasis" are outside of the vector space we intend to expand. Further, any basis of XY has two elements while our hyperbasis has three elements. Therefore, the cardinality of the hyperbasis is higher than the dimension of the space that we shall study. For our purposes, however, what really matters is that any vector in XY can be generated as a unique linear combination of {u1, u2, u3}. For instance, the vector x of the XY plane represented by (3, −2, 0) in ordinary cartesian coordinates, is represented in this hyperbasis as

$$x = (u_1, x)\, u_1 + (u_2, x)\, u_2 + (u_3, x)\, u_3$$
$$= \left[\frac{1}{\sqrt{3}}(1, 1, 1)\cdot(3, -2, 0)\right] u_1 + \left[\frac{1}{\sqrt{26}}(4, -1, -3)\cdot(3, -2, 0)\right] u_2 + \left[\frac{1}{\sqrt{78}}(-2, 7, -5)\cdot(3, -2, 0)\right] u_3$$
$$x = \frac{1}{\sqrt{3}}\, u_1 + \frac{14}{\sqrt{26}}\, u_2 - \frac{20}{\sqrt{78}}\, u_3$$
note that in this case an element of the plane is given by a triple with respect to the hyperbasis, in this case
$$x = \left(\frac{1}{\sqrt{3}},\ \frac{14}{\sqrt{26}},\ -\frac{20}{\sqrt{78}}\right)$$
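The expansion above can be verified numerically; the following short numpy check is an added illustration, not part of the original text.

```python
import numpy as np

# Verify the hyperbasis expansion of the XY-plane vector (3, -2, 0) in R^3.
u1 = np.array([ 1.0,  1.0,  1.0]) / np.sqrt(3)
u2 = np.array([ 4.0, -1.0, -3.0]) / np.sqrt(26)
u3 = np.array([-2.0,  7.0, -5.0]) / np.sqrt(78)

x = np.array([3.0, -2.0, 0.0])
coeffs = [u @ x for u in (u1, u2, u3)]          # (u_i, x)
print(coeffs)                                   # [1/sqrt(3), 14/sqrt(26), -20/sqrt(78)]

x_rebuilt = sum(c * u for c, u in zip(coeffs, (u1, u2, u3)))
print(np.allclose(x_rebuilt, x))                # True: the expansion reproduces x
```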
in quantum mechanics and in the solution of some differential equations, a similar strategy is used, but for complete orthonormal sets instead of bases. The Hilbert space L2 that concerns us has infinite countable orthogonal dimension, but we shall frequently use complete orthonormal sets of a bigger space with infinite continuous orthogonal dimension. Therefore, we shall expand the vectors of L2 in terms of hyper-complete orthonormal sets {vx} with continuous cardinality. In general, the elements vx of the bigger space will be outside of L2. However, as before, a Fourier expansion (instead of a linear combination) will be possible with these hyper-complete orthonormal sets.
Notice that for any cardinality of the orthogonal dimension of a Hilbert space, the Fourier expansion Eq. (2.35) is always a series. This is by virtue of theorem 2.23, which says that the non-zero Fourier coefficients of any vector are always countable, even if the complete orthonormal set has a higher cardinality. However, such a theorem is valid for complete orthonormal sets in which all the elements of the set lie in the space under consideration. If we use a hyper-complete orthonormal set, the elements of such a set do not lie in the space that we are expanding, thus theorem 2.23 does not necessarily hold. Consequently, when continuous hyper-complete orthonormal sets are used, we shall obtain integrals instead of series in our Fourier expansions. Does it make any sense to replace series by integrals? It suffices to observe that it is in general easier to evaluate integrals in closed form than series.
Finally, it is important to emphasize that even with hyper-complete orthonormal sets (or with hyperbases), the expansion of a given vector is unique because of the linear independence of the elements of such a set. We recall that this uniqueness is essential in the representation theory of vector spaces.

3.12 Definition of an observable


Measurements in Physics are always real numbers. In quantum mechanics, such measurements are related to the eigenvalues of some operators on a Hilbert space. It is then natural to associate measurements with eigenvalues of self-adjoint operators, since their spectra are always real.
For any finite-dimensional Hilbert space it is always possible to form a complete orthonormal set with the eigenvectors of
a normal operator, and in particular with the eigenvectors of a self-adjoint operator. However, in infinite dimensional Hilbert
spaces this is not necessarily the case. Therefore, we establish the following

Definition 3.8 A given self-adjoint operator A on H is called an observable, if there exists a complete orthonormal set of
eigenvectors of A.

The following sets of theorems are of central importance in quantum mechanics

Theorem 3.17 If two operators A and B commute and if x is an eigenvector of A, then Bx is also an eigenvector of A with
the same eigenvalue. If λ is non-degenerate x is also an eigenvector of B. If λ is n−fold degenerate, the eigensubspace Mλ is
invariant under B.

Since x is an eigenvector of A we have

Ax = λx ⇒ BAx = λBx ⇒ ABx = λBx



where we have used the fact that A and B commute, hence

A (Bx) = λ (Bx)

which proves that Bx is an eigenvector of A with eigenvalue λ. Observe that if λ is non-degenerate all its eigenvectors are "collinear", hence Bx must be collinear with x, i.e. Bx = cx, and x is also an eigenvector of B.
On the other hand, if λ is n−fold degenerate, we can only say that Bx lies in the n dimensional eigensubspace Mλ of A. In other words, if x ∈ Mλ then Bx ∈ Mλ.
Another way to express the previous theorem is

Theorem 3.18 If two operators A and B commute, every eigensubspace of A is invariant under B.

Of course, the roles of A and B can be interchanged.

Theorem 3.19 If two normal operators A and B commute, and if x1 , x2 are two eigenvectors of A with different eigenvalues,
then (x1 , Bx2 ) = 0

By hypothesis we have
Ax1 = λ1 x1 ; Ax2 = λ2 x2
but from theorem 3.17, Bx2 is an eigenvector of A with eigenvalue λ2. Now, from theorem 3.12, since λ1 ≠ λ2 then Bx2 is orthogonal to x1 and the theorem is proved.
The previous theorems do not use the concept of observable8 , but the following one does

Theorem 3.20 Let A and B be two observables in a Hilbert space H. Then A and B commute ⇔ one can construct a complete
orthonormal set in H with eigenvectors common to A and B.

Assume that A and B commute. We shall denote the normalized eigenvectors of A as $u_n^{i}$
$$A u_n^{i} = \lambda_n u_n^{i}\ ;\quad i = 1, \ldots, g_n$$
where gn is the degree of degeneracy of λn. For n ≠ n′ the eigenvectors are orthogonal, and for n = n′ and i ≠ i′ we can always orthonormalize the vectors in each eigensubspace of A, so that
$$\left(u_n^{i}, u_k^{j}\right) = \delta_{nk}\,\delta_{ij}$$

let us write H as a decomposition of the eigenspaces of A (taking into account that A is an observable)

H = M1 ⊕ M2 ⊕ M3 ⊕ ...

there are two cases. For each one-dimensional Mk (each non-degenerate λk) all vectors in Mk are "collinear" and they are also eigenvectors of B.
In the other case, gp > 1 and Mp is gp-dimensional. We can only say that Mp is invariant under B. Consider the restriction of A and B to the subspace Mp. Since all the vectors in Mp are eigenvectors of A with eigenvalue λp, the restriction of A to Mp has a matrix representative $A^{(p)}_{ij}$ of the form
$$A^{(p)}_{ij} = \left(v_p^{i}, A v_p^{j}\right) = \left(v_p^{i}, \lambda_p v_p^{j}\right) = \lambda_p \left(v_p^{i}, v_p^{j}\right) = \lambda_p\, \delta_{ij}$$
thus the matrix representation of A(p) is λpI for any complete orthonormal set in Mp (not necessarily the original one). Now let us consider the matrix representative of the restriction B(p) of B on Mp, writing this representation in our original orthonormal set
$$B^{(p)}_{ij} = \left(u_p^{i}, B u_p^{j}\right)$$
since B is a self-adjoint operator this matrix is self-adjoint, and according to theorem 3.16 it can always be diagonalized by a unitary transformation, which in turn means that there exists an orthonormal set $\left\{v_p^{i}\right\}$ in Mp for which the matrix representative of B(p) is diagonal; hence
$$B^{(p)}_{ij} = \left(v_p^{i}, B v_p^{j}\right) = B_i^{(p)}\, \delta_{ij}$$
which means that the new complete orthonormal set in Mp consists of eigenvectors of B
$$B v_p^{i} = B_i^{(p)}\, v_p^{i}$$
8 They use however the assumption that the operators involved possess eigenvalues, which is not guaranteed in infinite dimension.


and since Mp contains only eigenvectors of A, it is clear that $\left\{v_p^{i}\right\}$ is a complete orthonormal set in Mp whose elements are common eigenvectors of A and B. Proceeding in this way with all eigensubspaces of A of dimension greater than one, we obtain a complete orthonormal set in H in which the elements of the set are common eigenvectors of A and B.
It is important to emphasize that for a given Mp the orthonormal set chosen a priori does not in general consist of eigenvectors of B, but it is always possible to obtain another orthonormal set whose elements are eigenvectors of B and, by definition, they are also eigenvectors of A.
Now let us prove that if A and B are observables with a complete orthonormal set of common eigenvectors then they commute. Let us denote the complete orthonormal set of common eigenvectors as $u_{n,p}^{i}$; then
$$A B u_{n,p}^{i} = b_p A u_{n,p}^{i} = a_n b_p u_{n,p}^{i}$$
$$B A u_{n,p}^{i} = a_n B u_{n,p}^{i} = a_n b_p u_{n,p}^{i}$$
therefore
$$[A, B]\, u_{n,p}^{i} = 0$$
since the $u_{n,p}^{i}$ form a complete orthonormal set, then [A, B] = 0. QED.


It is also very simple to show that if A and B are commuting observables with eigenvalues an and bp and with common eigenvectors $u_{n,p}^{i}$, then
$$C = A + B$$
is also an observable with eigenvectors $u_{n,p}^{i}$ and eigenvalues cn,p = an + bp.
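A hedged numerical sketch of the construction used in theorem 3.20 for finite-dimensional matrices may clarify the procedure; the specific matrices A and B below are illustrative assumptions.

```python
import numpy as np

# Two commuting Hermitian matrices admit a common orthonormal eigenbasis, built
# by diagonalizing B inside each (possibly degenerate) eigenspace of A.
A = np.diag([1.0, 1.0, 2.0])                       # eigenvalue 1 is twofold degenerate
B = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 5.0]])                    # Hermitian and commutes with A
assert np.allclose(A @ B, B @ A)

common = []                                        # columns: common eigenvectors
a_eig, U = np.linalg.eigh(A)
for a in np.unique(np.round(a_eig, 10)):
    V = U[:, np.isclose(a_eig, a)]                 # orthonormal basis of eigenspace M_a
    B_res = V.conj().T @ B @ V                     # restriction of B to M_a
    _, W = np.linalg.eigh(B_res)                   # diagonalize B inside M_a
    common.append(V @ W)
common = np.hstack(common)

# each column is an eigenvector of both A and B (real case: Rayleigh quotient
# equals the eigenvalue for a normalized eigenvector)
for v in common.T:
    print(np.allclose(A @ v, (v @ A @ v) * v), np.allclose(B @ v, (v @ B @ v) * v))
```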

3.13 Complete sets of commuting observables (C.S.C.O.)



Consider an observable A and a complete orthonormal set $\left\{u_n^{i}\right\}$ of the Hilbert space that consists of eigenvectors of A. If none of the eigenvalues of A is degenerate, then the eigenvalues determine the eigenvectors in a unique way (within multiplicative constant factors). All the eigensubspaces Mi are one-dimensional and the complete orthonormal set is simply denoted by {un}. This means that there is only one complete orthonormal set (except for multiplicative phase factors) associated with the eigenvectors of the observable A. We say that A constitutes by itself a C.S.C.O.
On the other hand, if some eigenvalues of A are degenerate, the specification of the set {an } of eigenvalues is not enough
to determine a complete orthonormal set for H because any orthonormal set in each eigensubspace Mn can be part of such a
complete orthonormal set. Thus the complete orthonormal set determined by the eigenvectors of A is not unique and it is not
a C.S.C.O.
Now we add a second observable B that commutes with A, and construct a complete orthonormal set common to A and B. By definition, A and B constitute a C.S.C.O. if the complete orthonormal set common to both is unique (within constant phase factors for each of the vectors in the complete set). In other words, it means that any given pair of eigenvalues an, bp determines the associated common normalized eigenvector uniquely, except for a phase factor.
In theorem 3.20 we constructed the complete orthonormal set common to A and B by solving the eigenvalue equation of B within each eigensubspace defined by A. For A and B to constitute a C.S.C.O. it is necessary and sufficient that within each Mn the gn eigenvalues of B be distinct9. In this case, since all eigenvectors $v_n^{i}$ in each Mn have the same eigenvalue an of A, they will be distinguished by the gn distinct eigenvalues $b_i^{(n)}$ associated with these eigenvectors of B. Note that it is not necessary that the eigenvalues of B be non-degenerate; we can have two (or more) equal eigenvalues of B associated with two (or more) distinct eigensubspaces Mn and Mk of A. We only require not to have degeneracy of the eigenvalues of B within a given eigensubspace Mn of A. Indeed, if B were non-degenerate it would be a C.S.C.O. by itself.
On the other hand, if for at least one pair {an, bp} there exist two or more linearly independent eigenvectors common to A and B, they are not a C.S.C.O. Let us add a third observable C that commutes with both A and B, and proceed as above. When to the pair {an, bp} there corresponds only one eigenvector common to A and B, then according to theorem 3.17 it is automatically an eigenvector of C as well. On the contrary, if the eigensubspace Mn,p is gn,p-dimensional, we can construct within it an orthonormal set of eigenvectors of C. Proceeding in this way with each Mn,p we can construct a complete orthonormal set with eigenvectors common to A, B, C. These three observables are a C.S.C.O. if this complete orthonormal set is unique (except for multiplicative phase factors). Once again, if Mn,p has the eigenvectors $u_{n,p}^{i}$ common to A and B, this occurs if and only if all gn,p eigenvalues of C, denoted as $c_k^{(n,p)}$, are distinct. As before, C can be degenerate, as long as degenerate eigenvalues are not repeated within a single eigenspace Mn,p of A and B. Therefore, a given triple of eigenvalues {an, bp, ck} of A, B, C has a unique common eigenvector within a multiplicative factor. If two or more linearly independent eigenvectors common to A, B, C can be constructed for a given set {an, bp, ck}, we can add a fourth observable D that commutes with those three operators, and so on.
9 If Mn is one-dimensional, theorem 3.17 says that an eigenvector of A in Mn is automatically an eigenvector of B, and it is clearly uniquely determined, except for multiplicative factors. Only the case in which Mn has more than one dimension is non-trivial.

Definition 3.9 A set of observables {A, B, C, ..} is called a complete set of commuting observables (C.S.C.O.) if (i) All
observables commute pairwise, (ii) specifying the set of eigenvalues {an , bp , ck , ..} of the observables determines a unique (within
phase factors) complete orthonormal set of eigenvectors common to all the observables.

An equivalent form is the following

Definition 3.10 A set of observables {A, B, C, ..} is called a complete set of commuting observables (C.S.C.O.) if there is a
unique complete orthonormal set (within phase factors) of common eigenvectors.

It is obvious that if a given set is a C.S.C.O. we can add any observable that commutes with the observables of the set and
the new set is also a C.S.C.O. However, for most of our purposes we shall be interested in “minimal C.S.C.O.” in the sense
that by removing any observable of the set, the new set is not complete any more.
If a given set {A1, . . . , An} of observables is a C.S.C.O., a set of eigenvalues {ak1, . . . , akn} determines a unique common normalized eigenvector (within a phase factor), so it is natural to denote the vector as $u_{a_{k_1}, a_{k_2}, \ldots, a_{k_n}}$. On the other hand, in the context of quantum mechanics, a global phase carries no physical information. Therefore, all normalized vectors associated with {ak1, . . . , akn} have the same physical information; this fact reinforces the qualification of "unique" for these vectors, although they are not unique from the mathematical point of view.

3.14 Some terminology concerning Physics


We have defined linear combinations as finite sums. A basis in a vector space is thus a set of linearly independent vectors such that any vector of the space can be written as a finite sum of elements of the basis (multiplied by the appropriate scalars). Notably, bases always exist, even in an infinite-dimensional vector space. However, in practice it is not easy to find a basis in an infinite dimensional Hilbert space. In this case, it is more usual to utilize complete orthonormal sets; they play a role similar to bases in the sense that they generate any vector in a unique way, but the difference is that complete orthonormal sets expand a vector in a series (Fourier expansion) while bases do it in finite sums.
In quantum mechanics, the state of a physical system is described by a vector belonging to an infinite dimensional Hilbert space. Similarly, the set of all solutions of many differential equations (in either classical or quantum scenarios) forms an infinite dimensional Hilbert space. Thus, infinite dimensional Hilbert spaces are the framework of quantum mechanics and of many classical problems as well. Now, in either classical or quantum mechanics, we say basis to mean a complete orthonormal set, and the series expansion is usually called a linear combination. Since, when we deal with infinite dimensional Hilbert spaces, we never use bases in the mathematical sense, no confusion arises with this terminology. Self-adjoint operators are usually called hermitian operators. The conjugate space H∗ of H is usually called the dual space of H. The vectors in our Hilbert space are called kets, while the corresponding elements in the dual space (the functionals) are called bras.
In addition, the Hilbert space we work with is a separable space, so that its dimension is countable (countably infinite). We shall resort however to some hyperbases which are of continuous cardinality; the elements of these hyperbases do not belong to our Hilbert space. Consequently, the elements of a hyperbasis will not be physical states, but we shall call them continuous bases. Nevertheless, they will be very useful for practical calculations.
In addition, there will be a change of notation to facilitate the mathematical calculations; it is called the Dirac notation.

3.15 The Hilbert Space L2


In the formalism of quantum mechanics the information of a quantum particle is described by a function of space and time denoted as ψ(r, t) and called the wave function. The quantity |ψ(r, t)|² dx dy dz will be interpreted as the probability of finding the particle, at time t, in the volume element dx dy dz. Since the particle must be somewhere in space, we must demand that the integral over the whole volume be equal to unity
$$\int dV\, |\psi(r, t)|^{2} = 1$$
where the integration extends over all space. However, in certain cases we could assume that the particle is confined to a given volume, and the integral will be restricted to such a volume.
The discussion above leads to the fact that the space of physical states of one particle should be described by square-integrable wave functions. The state space is then the Hilbert space L2 of the square-integrable functions in a given volume. For a system of several particles we will have a space with similar features, but for now we will concentrate on the space that describes a single particle. On the other hand, in Hilbert spaces coming from solutions of differential equations of either classical or quantum Physics, we usually demand boundedness (translated into square-integrability) of the solutions in a given volume. Thus, the solutions of many differential equations usually lead us to the space L2 as well.
For several reasons we cannot specify in general the state space of a particle. First of all, several physical considerations can lead us to the fact that the particle is confined to a certain bounded volume. For instance, in one dimension it is not the

same to work with the space of functions that are square integrable on the whole real line as with (say) the space of functions that are square integrable in a bounded interval. In other words, different regions of square integrability lead us to different L2 spaces. On the other hand, it is usual to demand, besides square integrability, that the functions satisfy additional regularity conditions: for example, to be defined all along the interval, or to be continuous, differentiable, etc. The specific conditions depend on the particular context, and they are required to define the state space completely.
For example, it has no physical meaning to have a function that is discontinuous at a given point, since no experiment can measure a real phenomenon at scales below a certain threshold. We could then be tempted to say that we must demand the functions to be continuous. However, this is not necessarily the case, since some non-physical functions could help us to figure out what is happening. Let us take some familiar examples in classical mechanics: it is usual in electrostatics to assume the presence of a surface charge, which leads to a discontinuity in the electric field; in the real world a charge is distributed in a very thin but finite layer, and the discontinuity is replaced by a very steep curve. Indeed, a surface charge is equivalent to an infinite volume density, but we have seen that this assumption provides a simple picture of many electrostatic phenomena even though it is not a real physical state. Classical waves represented by a single plane wave in optics are another good example, since it is not possible to have a real wave that is totally monochromatic (a physical state is always a superposition of several plane waves), but many wave phenomena are easier to study with these non-physical states, and indeed many real physical phenomena, such as the laws of geometric optics, are predicted by using them.
In summary, depending on our purposes (and attitudes) we could demand to have only physical states, or decide to study some non-physical ones that are obtained when some physical parameters are set at extreme values. In conclusion, our assumptions on the functions to work with affect the definition of the Hilbert space of states that we should use as a framework.
In particular, in quantum mechanics, given the volume V in which a particle can stay, we say that our space of states is a subspace of the Hilbert space L2 of the square integrable functions in the volume V. We denote by ̥ the subspace of states, so that ̥ ⊆ L2. For this subspace to be a Hilbert space, it must be closed (for completeness to be maintained).

3.15.1 The wave function space ̥


According to the discussion above, we can only say that the wave function space that describes our physical states in quantum mechanics is a closed vector subspace of L2 for a volume determined by our physical conditions. What really matters is to be sure that the additional conditions imposed on our functions keep ̥ as a closed vector subspace. For instance, if we assume continuity and/or differentiability, it is easy to show that a finite linear combination preserves these conditions. Less evident is to ensure that a series preserves these conditions (for the subspace to be closed in L2), but we shall not be concerned with this problem here, nor shall we discuss the aspects concerning the completeness of L2. We then limit ourselves to establishing the vector space character of L2. Let ψ1, ψ2 ∈ L2; we show that

ψ (r) = λ1 ψ1 (r) + λ2 ψ2 (r)


is a square integrable function. For this, we expand |ψ(r)|²
$$|\psi(r)|^{2} = |\lambda_1|^{2}|\psi_1(r)|^{2} + |\lambda_2|^{2}|\psi_2(r)|^{2} + \lambda_1^{*}\lambda_2\,\psi_1^{*}(r)\psi_2(r) + \lambda_1\lambda_2^{*}\,\psi_1(r)\psi_2^{*}(r)$$
now for the last two terms we have
$$|\lambda_1^{*}\lambda_2\,\psi_1^{*}(r)\psi_2(r)| = |\lambda_1\lambda_2^{*}\,\psi_1(r)\psi_2^{*}(r)| \leq |\lambda_1||\lambda_2|\left[|\psi_1(r)|^{2} + |\psi_2(r)|^{2}\right]$$
hence
$$|\psi(r)|^{2} \leq |\lambda_1|^{2}|\psi_1(r)|^{2} + |\lambda_2|^{2}|\psi_2(r)|^{2} + 2|\lambda_1||\lambda_2|\left[|\psi_1(r)|^{2} + |\psi_2(r)|^{2}\right]$$

and the integral of each of the functions on the right-hand side converges. Then the integral
$$\int |\psi(r)|^{2}\, dV$$
converges, so ψ is a square integrable function.


The scalar product will be defined as
$$(\varphi, \psi) = \int dV\, \varphi^{*}(r)\, \psi(r) \qquad (3.67)$$

it can be shown that this integral always converges if ϕ and ψ belong to L2. We should check that this definition satisfies the properties of an inner product; these properties arise directly from the definition

(ϕ, λ1 ψ1 + λ2 ψ2 ) = λ1 (ϕ, ψ1 ) + λ2 (ϕ, ψ2 ) ; (λ1 ϕ1 + λ2 ϕ2 , ψ) = λ∗1 (ϕ1 , ψ) + λ∗2 (ϕ2 , ψ)


(ϕ, ψ) = (ψ, ϕ)∗ ; (ψ, ψ) ≡ kψk2 ≥ 0 and (ψ, ψ) = 0 ⇔ ψ = 0

let us mention some important linear operators on functions ψ (r) ∈ ̥.


The parity operator defined as
Πψ (x, y, z) = ψ (−x, −y, −z)

the product operator X defined as


Xψ (x, y, z) = xψ (x, y, z)

and the differentiation operator with respect to x denoted as Dx

∂ψ (x, y, z)
Dx ψ (x, y, z) =
∂x
it is important to notice that the operators X and Dx, acting on a function ψ(r) ∈ ̥, can transform it into a function that is not square integrable. Thus they are not operators of ̥ into ̥ nor onto ̥. However, the non-physical states obtained are frequently useful for practical calculations.
The commutator of the product and differentiation operator is of central importance in quantum mechanics
 
$$[X, D_x]\,\psi(r) = \left(x\frac{\partial}{\partial x} - \frac{\partial}{\partial x}\,x\right)\psi(r) = x\frac{\partial}{\partial x}\psi(r) - \frac{\partial}{\partial x}\left[x\,\psi(r)\right] = x\frac{\partial}{\partial x}\psi(r) - x\frac{\partial}{\partial x}\psi(r) - \psi(r)$$
$$[X, D_x]\,\psi(r) = -\psi(r)\qquad \text{for all } \psi(r) \in ̥$$

therefore
[X, Dx ] = −I (3.68)
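Eq. (3.68) can be checked symbolically; the following sympy sketch (an added illustration, assuming a sufficiently smooth ψ) applies the commutator to an arbitrary function.

```python
import sympy as sp

# Symbolic check of [X, D_x] = -I acting on an arbitrary smooth function psi(x).
x = sp.symbols('x')
psi = sp.Function('psi')(x)

X_then_Dx = sp.diff(x * psi, x)        # D_x (X psi)
Dx_then_X = x * sp.diff(psi, x)        # X (D_x psi)

commutator = Dx_then_X - X_then_Dx     # [X, D_x] psi = X D_x psi - D_x X psi
print(sp.simplify(commutator))         # -> -psi(x), i.e. [X, D_x] = -I
```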

3.16 Discrete orthonormal basis


The Hilbert space L2 (and thus ̥) has a countably infinite dimension, so that any authentic basis of ̥ must be infinite but discrete. A discrete orthonormal basis {ui(r)} with ui(r) ∈ ̥ should follow the rules given in section 2.9.1. Thus, from our definition (3.67) of the inner product, orthonormality is characterized by
$$(u_i, u_j) = \int d^{3}r\; u_i^{*}(r)\, u_j(r) = \delta_{ij}$$
the expansion of any wave function (vector) of this space is given by the Fourier expansion described by Eq. (2.35)
$$\psi(r) = \sum_i c_i\, u_i(r)\ ;\qquad c_i = (u_i, \psi) = \int d^{3}r\; u_i^{*}(r)\, \psi(r) \qquad (3.69)$$

using the terminology for finite dimensional spaces we call the series a linear combination and ci are the components or
coordinates, which correspond to the Fourier coefficients. Such coordinates provide the representation of ψ (r) in the basis
{ui (r)}. It is very important to emphasize that the expansion of a given ψ (r) must be unique for {ui } to be a basis, in this
case this is guaranteed by the form of the Fourier coefficients.
Now, if the Fourier expansions of two wave functions are
$$\varphi(r) = \sum_j b_j\, u_j(r)\ ;\qquad \psi(r) = \sum_i c_i\, u_i(r)$$
the scalar product and the norm can be expressed in terms of the components or coordinates of the vectors according to Eqs. (3.21, 3.22)
$$(\varphi, \psi) = \sum_i b_i^{*} c_i\ ;\qquad (\psi, \psi) = \sum_i |c_i|^{2} \qquad (3.70)$$

and the matrix representation of an operator T in a given orthonormal basis {ui } is obtained from Eq. (3.25)

Tij ≡ (ui , T uj )
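As a concrete example of Eqs. (3.69) and (3.70) (one-dimensional, hence an illustrative assumption), the following sketch expands ψ(x) = x(L − x) in the orthonormal set u_n(x) = √(2/L) sin(nπx/L) on [0, L] and checks Parseval's relation numerically.

```python
import numpy as np
from scipy.integrate import quad

# Fourier coefficients in a discrete orthonormal basis, and Parseval's relation.
L = 1.0
u   = lambda n, x: np.sqrt(2.0 / L) * np.sin(n * np.pi * x / L)   # orthonormal on [0, L]
psi = lambda x: x * (L - x)

# c_n = (u_n, psi), Eq. (3.69)
c = np.array([quad(lambda x: u(n, x) * psi(x), 0, L)[0] for n in range(1, 60)])

norm2_direct = quad(lambda x: psi(x) ** 2, 0, L)[0]   # (psi, psi) computed directly
print(norm2_direct, np.sum(c ** 2))                   # Eq. (3.70): the two values agree
```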

3.16.1 Dirac delta function


The Dirac delta function is a powerful tool to express the fact that a given orthonormal set is complete. It is also useful to convert point, linear and surface densities into equivalent volumetric densities. It is important to emphasize that the Dirac delta function is not really a function but a distribution. In the language of functional analysis it is a functional (one-form) that acts on vector spaces of functions, assigning to each element of such a space a real number in the following way: Let V be a vector space of real-valued functions defined on the domain (b, c) with certain properties of continuity, derivability, integrability, etc. The Dirac delta distribution is a mapping that assigns to each element f(x) of V a real number by the following rule10
$$\int_{b}^{c} f(x)\, \delta(x - a)\, dx \equiv \begin{cases} f(a) & \text{if } a \in (b, c) \\ 0 & \text{if } a \notin [b, c] \end{cases}$$

We shall mention in passing that with this distribution it is possible to write a point charge (or mass) density located at r0 as an equivalent volumetric density
$$\rho(r) = q\, \delta(r - r_0) \qquad (3.71)$$
after adequate integrations, this density properly reproduces the total charge as well as the potential and field generated by such a density.
There are several sequences of functions that converge to the Dirac delta function; one of the most frequently used is the following
$$f_n(x - a) = \frac{n}{\sqrt{\pi}}\, e^{-n^{2}(x - a)^{2}} \qquad (3.72)$$
it can be shown that when the limit n → ∞ is taken, the definition and all basic properties of the Dirac delta distribution are reproduced. Note that all Gaussian functions contained in this sequence have unit area and are centered around a. Further, the larger the value of n, the sharper and higher the Gaussian bells, in such a way that the area is preserved. Consequently, for large values of n the area is concentrated in a small neighbourhood around a. In the limit n → ∞, the whole area is concentrated in an arbitrarily small interval around a.
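The following numerical sketch (an added illustration; the test function and integration limits are assumptions) shows how the Gaussian sequence (3.72) acts more and more like δ(x − a) on a smooth function as n grows.

```python
import numpy as np
from scipy.integrate import quad

# For increasing n, the integral of f(x) * f_n(x - a) approaches f(a).
a = 0.7
f   = lambda x: np.cos(3.0 * x)                         # arbitrary smooth test function
f_n = lambda x, n: (n / np.sqrt(np.pi)) * np.exp(-n**2 * (x - a)**2)

for n in (5, 25, 125):
    val, _ = quad(lambda x, n=n: f(x) * f_n(x, n), a - 2.0, a + 2.0, points=[a])
    print(n, val)                                        # approaches f(a)
print("f(a) =", f(a))
```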
Some basic properties of the Dirac delta function are the following:
$$\int_{-\infty}^{\infty} \delta(x - a)\, dx = 1\ ;\qquad \int f(r)\, \nabla\delta(r - r_0)\, dV = -\left.\nabla f\right|_{r = r_0} \qquad (3.73)$$
$$\delta(ax) = \frac{1}{|a|}\, \delta(x)\ ;\qquad \delta(r - r_0) = \delta(r_0 - r) \qquad (3.74)$$
$$x\, \delta(x) = 0\ ;\qquad \delta\left(x^{2} - e^{2}\right) = \frac{1}{2|e|}\left[\delta(x + e) + \delta(x - e)\right] \qquad (3.75)$$
It is worth emphasizing that, owing to its distribution nature, the Dirac delta function makes no sense by itself, but only within an integral. For example, when we say that δ(ax) = (1/|a|) δ(x), we are not talking about a numerical coincidence between both members, but about an identity that must be applied in the vector space of functions in which we are working, in the form
$$\int_{b}^{c} f(x)\, \delta(ax)\, dx = \int_{b}^{c} f(x)\, \frac{1}{|a|}\, \delta(x)\, dx \qquad \forall\, f(x) \in V \text{ and } \forall\, a \in \mathbb{R},\ a \neq 0$$
Strictly speaking, the mapping can be done over the complex numbers with analogous properties. In the same fashion, it
is necessary to clarify that the equivalent volumetric density of a point charge (and all equivalent densities generated by a
delta function) is indeed a distribution. For example, the density described by (3.71), makes sense only within integrals that
generate the total charge, the potential or the field. Ordinary densities are functions but equivalent densities are distributions.
In summary, what we construct by means of the equivalent volumetric density is a distribution that produces the correct
mapping to reproduce the total charge, potential and field.
In more than one dimension, the delta function becomes a product of one-dimensional deltas. The property $\int \delta^{(n)}(\mathbf{x})\, d^{n}x = 1$, applied in n dimensions, shows that the delta function is not dimensionless; its dimension is $x^{-n}$.
Another outstanding property of the Dirac delta function is the following
$$\delta\left[g(x)\right] = \sum_j \frac{1}{|g'(x_j)|}\, \delta(x - x_j) \qquad (3.76)$$
where g′(x) is the derivative of g(x) and the xj are the simple zeros of the function g(x):
$$g(x_j) = 0\ ,\qquad g'(x_j) \neq 0$$

10 It is customary to define the Dirac delta "function" as $\delta(r) = \begin{cases} \infty & \text{if } r = 0 \\ 0 & \text{if } r \neq 0 \end{cases}$ together with $\int \delta(x)\, dx = 1$. This definition is based on an erroneous conception of the Dirac delta distribution as a function. In spite of this, we shall talk about the Dirac delta function from now on, to be in agreement with the literature.

and the summation is performed over all simple zeros of g(x). If g(x) has multiple zeros (that is, roots xj for which g′(xj) = 0), the expression δ[g(x)] makes no sense. Note that the properties
$$\delta(-x) = \delta(x)\ ,\qquad \delta(ax) = \frac{1}{|a|}\, \delta(x)$$
are special cases of property (3.76).

We now relate the delta function to the completeness of orthonormal sets. Note that in the case of finite-dimensional vector spaces, completeness can be proved simply by checking that the number of linearly independent vectors is equal to the dimension of the space. By contrast, in vector spaces with countably infinite dimension, we could have a countably infinite set of linearly independent vectors that is still incomplete; in that case it can be completed by adding a finite or countably infinite set of linearly independent vectors (since in that case the cardinality of the set of vectors does not change). In other words, an orthonormal set could have the cardinality of the orthogonal dimension of the space and still be incomplete. For this reason, the proof of completeness is particularly important.

3.17 Closure relations


For an arbitrary vector ψ(r) of ̥ to be expandable in the set of normalized linearly independent vectors {ui(r)}, it is necessary for the set which defines the basis to be complete. The completeness condition can be obtained by replacing the Fourier coefficients cn in the expansion of ψ(r)
$$\psi(r) = \sum_n c_n u_n(r) = \sum_n (u_n, \psi)\, u_n(r) = \sum_n \int_{A}^{B} u_n^{*}(r')\, \psi(r')\, u_n(r)\, d^{3}r'$$
$$\psi(r) = \int_{A}^{B} \psi(r') \left[\sum_n u_n^{*}(r')\, u_n(r)\right] d^{3}r'$$

where the integral with limits A and B denotes a triple volume integral. On the other hand
$$\psi(r) = \int_{A}^{B} \psi(r')\, \delta(r - r')\, d^{3}r'$$

equating the two latter expressions and taking into account that ψ(r′) is arbitrary, we get
$$\sum_n u_n^{*}(r')\, u_n(r) = \delta(r - r') \qquad (3.77)$$

tracing these steps back, we see that the relation above guarantees that any function within the space can be expanded in terms of the set {un(r)}. In turn, we see that the expansion associated with a given ordered basis {un(r)} is unique, which is a consequence of the linear independence of the set. Therefore, Eq. (3.77) is known as the completeness or closure relation.
We shall study several complete sets that satisfy property (3.77). The proof of completeness of these sets is, however, beyond the scope of this manuscript.

3.18 Introduction of hyperbases


In the case of discrete basis each element ui (r) is square integrable and thus belong to L2 and in general to ̥ as well. As
explained before, it is sometimes convenient to use some hyperbases in which the elements of the basis do not belong to either
L2 or ̥, but in terms of which a function in ̥ can be expanded, the hyperbasis {u (k, r)} will have in general a continuous
cardinality with k denoting the continuous index that labels each vector in the hyperbasis. According to our previous discussions
the Fourier expansions made with this hyperbasis are not series but integrals, these integrals will be called continuous linear
combinations.

3.18.1 Orthonormality and Closure relations with hyperbases


In the hyperbasis {u(k, r)}, k is a continuous index defined in a given interval [c, d]. Such an index plays the role of the index n in discrete bases. We shall see that a consistent way of expressing orthonormality for this continuous basis is11
$$(u_k, u_{k'}) = \int_{A}^{B} u^{*}(k, r)\, u(k', r)\, d^{3}r = \delta(k - k') \qquad (3.78)$$
11 From now on we shall say continuous bases, on the understanding that they are indeed hyperbases.

we show it by reproducing the results obtained with discrete bases. Expanding an arbitrary function ψ(r) of our Hilbert space as a continuous linear combination of the basis gives
$$\psi(r) = \int_{c}^{d} c(k)\, u(k, r)\, dk$$
then we have
$$(u_{k'}, \psi) = \left(u_{k'}, \int_{c}^{d} c(k)\, u(k, r)\, dk\right) = \int_{A}^{B} u^{*}(k', r) \left[\int_{c}^{d} c(k)\, u(k, r)\, dk\right] d^{3}r$$
$$= \int_{c}^{d} c(k) \left[\int_{A}^{B} u^{*}(k', r)\, u(k, r)\, d^{3}r\right] dk = \int_{c}^{d} c(k)\, (u_{k'}, u_k)\, dk = \int_{c}^{d} c(k)\, \delta(k - k')\, dk = c(k')$$

from which the Fourier coefficients of the continuous expansion are evaluated as
$$c(k') = (u_{k'}, \psi) \qquad (3.79)$$

when the Fourier coefficients are associated with continuous linear combinations (integrals) they are usually called Fourier
transforms. In this case, a vector is represented as a continuous set of coordinates or components, where the components or
coordinates are precisely the Fourier transforms.
Therefore, in terms of the inner product, the calculation of the Fourier coefficients in a continuous basis (Fourier transforms), given by Eq. (3.79), coincides with their calculation in discrete bases, Eq. (3.69). Eq. (3.79) in turn guarantees that the expansion in a given ordered continuous basis is unique12. Those facts in turn depend strongly on our definition of orthonormality in the continuous regime, Eq. (3.78), showing the consistency of such a definition. After all, we should remember that hyperbases are constructed as useful tools and not as physical states; in that sense we should not expect a "true orthonormality relation" between them13.
Let us now obtain the closure relation
$$\psi(r) = \int_{c}^{d} c(k)\, u(k, r)\, dk = \int_{c}^{d} (u_k, \psi)\, u(k, r)\, dk$$
$$\psi(r) = \int_{c}^{d} \left[\int_{A}^{B} u^{*}(k, r')\, \psi(r')\, d^{3}r'\right] u(k, r)\, dk$$
$$\psi(r) = \int_{A}^{B} \left[\int_{c}^{d} u^{*}(k, r')\, u(k, r)\, dk\right] \psi(r')\, d^{3}r'$$
on the other hand
$$\psi(r) = \int_{A}^{B} \delta(r - r')\, \psi(r')\, d^{3}r'$$
from which we find
$$\int_{c}^{d} u^{*}(k, r')\, u(k, r)\, dk = \delta(r - r') \qquad (3.80)$$

which defines the closure relation for a continuous basis {u(k, r)}.
From the discussion above, the closure relations for discrete or continuous bases can be interpreted as "representations" of the Dirac delta function. A similar situation occurs with the orthonormality relation, but only for continuous bases.
It is worth emphasizing at this point that a given representation of the delta in a given space cannot be applied to another space. For example, it is possible to have an r−dimensional vector space of functions V1 with a basis {vn(r)} that defines a closure relation $\sum_{n=1}^{r} v_n^{*}(r')\, v_n(r) = \delta_1(r - r')$; let us think about another (r + k)−dimensional vector space denoted by V2, with V2 ⊃ V1, such that a basis {um} of V2 includes the previous basis plus other linearly independent vectors; the closure relation is $\sum_{n=1}^{r+k} u_n^{*}(r')\, u_n(r) = \delta_2(r - r')$. What is the difference between δ1(r − r′) and δ2(r − r′)? The answer lies in the distribution nature of the badly named Dirac delta function; the fundamental property of this distribution tells us that for all functions ψ(r′) that belong to V1 we have
$$\psi(r) = \int_{A}^{B} \psi(r') \left[\sum_n v_n^{*}(r')\, v_n(r)\right] d^{3}r' = \int_{A}^{B} \psi(r')\, \delta_1(r - r')\, d^{3}r'$$

12 Remember that for a given set of vectors to constitute a basis, it is important not only to be able to expand any vector in terms of the elements of the set; it is also necessary for the expansion of each vector to be unique. In an ordinary basis (not a hyperbasis) this is guaranteed by linear independence; in our continuous set it is guaranteed by our definition of orthonormality, which expresses the linear independence of the set.
13 It is clear for example that with r = r′ the “orthonormality” relation diverges, so it is not a normalization in the mathematical sense.

however, if the function ψ (r) does not belong to V1 but it belongs to V2 then δ1 (r − r′ ) is not an adequate distribution to
represent this function. This is a general property of the distributions, since they are defined solely by means of the way in
which they map the functions of a specific vector space into the scalars. A representation of the Dirac delta (and in general of
any distribution) is linked to a very specific vector space of functions.

3.18.2 Inner product and norm in terms of the components of a vector in a hyperbasis
Let us take two vectors ϕ and ψ that belong to ̥. Both can be expressed as continuous linear combinations of a continuous basis {u_k}
\[
\psi(\mathbf{r}) = \int_c^d dk\; u(k, \mathbf{r})\, c(k) \quad ; \quad \varphi(\mathbf{r}) = \int_c^d dk'\; u(k', \mathbf{r})\, b(k')
\]
now the idea is to write their scalar product in terms of the continuous set of components of each vector, i.e. in terms of their Fourier transforms c (k) and b (k′). The scalar product is
\[
(\varphi, \psi) = \int_A^B d^3r\; \varphi^*(\mathbf{r})\, \psi(\mathbf{r}) = \int_c^d dk' \int_c^d dk\; b^*(k')\, c(k) \int_A^B d^3r\; u^*(k', \mathbf{r})\, u(k, \mathbf{r})
\]

now using the orthonormality relation Eq. (3.78) we have
\[
(\varphi, \psi) = \int_c^d dk' \int_c^d dk\; b^*(k')\, c(k)\, \delta(k - k') \;\Rightarrow\; (\varphi, \psi) = \int_c^d dk\; b^*(k)\, c(k) \tag{3.81}
\]
the norm is obtained simply by taking ϕ = ψ, then
\[
(\psi, \psi) = \|\psi\|^2 = \int_c^d dk\; |c(k)|^2 \tag{3.82}
\]

Eqs. (3.81, 3.82) are clearly the continuous analogs of Eq. (3.70) for discrete bases.
In summary, the basic relations obtained in discrete bases (inner products, norms, Fourier coefficients, orthonormality, completeness, etc.) possess the same structure in continuous bases with the following replacements
\[
i\ (\text{discrete}) \leftrightarrow k\ (\text{continuous}) \quad , \quad \sum_i \leftrightarrow \int dk \quad , \quad \delta_{ij} \leftrightarrow \delta(k - k')
\]
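As a purely numerical illustration of the discrete side of this correspondence (not part of the formal development), the following Python/NumPy sketch builds an arbitrary orthonormal basis of a finite-dimensional space and checks that the Fourier coefficients c_i = (u_i, ψ) reproduce the vector and its norm; the basis, dimension and random seed are illustrative choices.

import numpy as np

n = 8
rng = np.random.default_rng(0)
# a random unitary matrix: its columns u_i form an orthonormal basis of C^n
U, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
psi = rng.normal(size=n) + 1j * rng.normal(size=n)

c = U.conj().T @ psi                    # Fourier coefficients c_i = (u_i, psi)
psi_rebuilt = U @ c                     # psi = sum_i c_i u_i
print(np.allclose(psi, psi_rebuilt))    # True: the expansion is exact and unique
print(np.isclose(np.vdot(psi, psi), np.sum(np.abs(c) ** 2)))  # True: ||psi||^2 = sum_i |c_i|^2

In a continuous basis the sums above would be replaced by the integrals over k discussed in the text.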

3.19 Some specific continuous bases


3.19.1 Plane waves
We shall use a continuous basis represented by the set
\[
\left\{ z\, e^{i\mathbf{p}\cdot\mathbf{r}/\hbar} \right\} \quad ; \quad z \equiv \left( \frac{1}{2\pi\hbar} \right)^{3/2}
\]
where p is the continuous index that labels the different vectors of the basis. Indeed, p represents three continuous indices p_x, p_y, p_z. By now ℏ is simply a mathematical constant, but it will become highly relevant in Physics. We consider the space of square-integrable functions over the whole space, and all integrals are understood to be triple integrals. The continuous linear combination of a given square-integrable function is given by
\[
\psi(\mathbf{r}) = \left( \frac{1}{2\pi\hbar} \right)^{3/2} \int_{-\infty}^{\infty} d^3p\; \bar{\psi}(\mathbf{p})\, e^{i\mathbf{p}\cdot\mathbf{r}/\hbar}
\]

it is clear that ψ̄ (p) provides the continuous set of coordinates of the vector ψ (r) under our continuous basis. They are
thus the Fourier transforms of ψ (r) with respect to the basis of plane waves. It is useful to define

vp (r) ≡ zeip·r/~ (3.83)

from which the Fourier transforms can be calculated by Eq. (3.79)
\[
c(k) = (u_k, \psi) \;\Rightarrow\; \bar{\psi}(\mathbf{p}) = (v_{\mathbf{p}}, \psi) = \left( \frac{1}{2\pi\hbar} \right)^{3/2} \int_{-\infty}^{\infty} d^3r\; e^{-i\mathbf{p}\cdot\mathbf{r}/\hbar}\, \psi(\mathbf{r})
\]

the basic relation in Fourier analysis
\[
\frac{1}{(2\pi)^3} \int_{-\infty}^{\infty} d^3k\; e^{i\mathbf{k}\cdot\mathbf{u}} = \delta^3(\mathbf{u}) \tag{3.84}
\]

can be used with the assignments k → p/ℏ and u → (r − r′) to show that
\[
\int_{-\infty}^{\infty} d^3p\; v_{\mathbf{p}}^*(\mathbf{r}')\, v_{\mathbf{p}}(\mathbf{r}) = \frac{1}{(2\pi\hbar)^3} \int_{-\infty}^{\infty} d^3p\; e^{\frac{i}{\hbar}\mathbf{p}\cdot(\mathbf{r} - \mathbf{r}')} = \delta^3(\mathbf{r} - \mathbf{r}') \tag{3.85}
\]

by comparing it with Eq. (3.80), we see that (3.85) expresses the completeness relation for the continuous basis {v_p} in the space of functions that are square-integrable over the whole physical space. The orthonormality relation can also be obtained from the property (3.84), but with the assignments k → r/ℏ and u → p − p′
\[
(v_{\mathbf{p}}, v_{\mathbf{p}'}) = \frac{1}{(2\pi\hbar)^3} \int_{-\infty}^{\infty} d^3r\; e^{-\frac{i}{\hbar}\mathbf{r}\cdot(\mathbf{p} - \mathbf{p}')} = \delta^3(\mathbf{p}' - \mathbf{p}) = \delta^3(\mathbf{p} - \mathbf{p}') \tag{3.86}
\]
by using p = p′ in Eq. (3.86) it is clear that $\|v_{\mathbf{p}}\|^2 = (v_{\mathbf{p}}, v_{\mathbf{p}})$ is divergent. Thus, the plane waves are not square-integrable over the whole space. Therefore, the elements of this continuous basis do not belong to the Hilbert space under study.
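A minimal numerical sketch of the plane-wave coordinates (not part of the original development; it assumes ℏ = 1, one dimension, and a grid approximation of the integrals) is the following Python/NumPy fragment, which computes the Fourier transform of a square-integrable function with the FFT and checks that the norm, Eq. (3.82), is the same in the position and momentum “bases”.

import numpy as np

N, L = 2048, 40.0
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
dx = x[1] - x[0]
psi = np.exp(-x**2) * np.exp(2j * x)          # a square-integrable test function

p = 2 * np.pi * np.fft.fftfreq(N, d=dx)       # momentum grid (hbar = 1)
dp = 2 * np.pi / (N * dx)
# grid approximation of psi_bar(p) = (2*pi)^(-1/2) * Int dx e^{-ipx} psi(x)
# (up to an overall phase coming from the grid offset, irrelevant for the norm)
psi_bar = np.fft.fft(psi) * dx / np.sqrt(2 * np.pi)

norm_x = np.sum(np.abs(psi) ** 2) * dx
norm_p = np.sum(np.abs(psi_bar) ** 2) * dp
print(np.isclose(norm_x, norm_p))             # True: same norm in both representations

The plane waves themselves, by contrast, would give a divergent norm, as discussed above.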

3.19.2 “Delta functions”


We shall use a continuous basis of “highly improper” functions defined by

ξr0 (r) ≡ δ (r − r0 ) (3.87)

{ξr0 (r)} represents the set of delta functions centered at each of the points r0 of the whole space. These functions are not square-integrable, so {ξr0 (r)} ∉ ̥. Nevertheless, the following relations are valid for functions that belong to ̥
\[
\psi(\mathbf{r}) = \int d^3r_0\; \psi(\mathbf{r}_0)\, \delta(\mathbf{r} - \mathbf{r}_0) \quad ; \quad \psi(\mathbf{r}_0) = \int d^3r\; \psi(\mathbf{r})\, \delta(\mathbf{r}_0 - \mathbf{r})
\]

rewriting them appropriately we have


\[
\psi(\mathbf{r}) = \int d^3r_0\; \psi(\mathbf{r}_0)\, \xi_{\mathbf{r}_0}(\mathbf{r}) \tag{3.88}
\]
\[
\psi(\mathbf{r}_0) = \int d^3r\; \xi_{\mathbf{r}_0}^*(\mathbf{r})\, \psi(\mathbf{r}) = (\xi_{\mathbf{r}_0}, \psi) \tag{3.89}
\]
Eq. (3.88) gives ψ (r) ∈ ̥ as a continuous linear combination of the set {ξr0}, where the ψ (r0) are the Fourier transforms. On the other hand, (3.89) indicates that the Fourier transforms are evaluated as usual.
By using the properties of the Dirac delta function, it is possible to prove that the set {ξr0} satisfies the orthonormality and completeness relations
\[
\left( \xi_{\mathbf{r}_0}, \xi_{\mathbf{r}_0'} \right) = \int d^3r\; \delta(\mathbf{r} - \mathbf{r}_0)\, \delta(\mathbf{r} - \mathbf{r}_0') = \delta(\mathbf{r}_0 - \mathbf{r}_0')
\]
and
\[
\int d^3r_0\; \xi_{\mathbf{r}_0}^*(\mathbf{r}')\, \xi_{\mathbf{r}_0}(\mathbf{r}) = \int d^3r_0\; \delta(\mathbf{r}' - \mathbf{r}_0)\, \delta(\mathbf{r} - \mathbf{r}_0) = \delta(\mathbf{r} - \mathbf{r}')
\]

Note that the non-physical functions that constitute a continuous basis can usually be seen as limits in which one or more parameters of a physically realizable state are taken to extreme (non-physical) values.
As an example, the Dirac delta can be taken as the limit of the Gaussians given by Eq. (3.72)
\[
f_n(x - a) = \frac{n}{\sqrt{\pi}}\, e^{-n^2 (x - a)^2}
\]

for each value of n these functions are square-integrable, continuous, and differentiable, so they could describe a physical system. Notwithstanding, by taking n → ∞, the functions are no longer square-integrable and lose all of these well-behaved properties.
Concerning plane waves, physical states (in both classical and quantum mechanics) consist of superpositions of plane waves with a finite spectral width of frequencies ∆ν; by taking the limit ∆ν → 0 we obtain a monochromatic (non-physical) wave, corresponding to a single plane wave.
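The delta-like behaviour of the Gaussians f_n can be checked numerically. The following Python/NumPy sketch (illustration only; the test function and the grid are arbitrary choices) shows that the integral of f_n(x − a) against a smooth function approaches its value at x = a as n grows.

import numpy as np

x = np.linspace(-10.0, 10.0, 200001)
dx = x[1] - x[0]
psi = np.cos(x) * np.exp(-x**2 / 8)          # an arbitrary smooth, square-integrable function
a = 1.3

for n in (1, 4, 16, 64):
    f_n = (n / np.sqrt(np.pi)) * np.exp(-n**2 * (x - a) ** 2)
    approx = np.sum(f_n * psi) * dx          # Int f_n(x-a) psi(x) dx
    print(n, approx)                         # converges to psi(a) = cos(1.3) * exp(-1.3**2 / 8)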

3.20 Tensor products of vector spaces, definition and properties


Let V1 and V2 be two vector spaces of dimension n1 and n2 . Vectors and operators on each of them will be denoted by labels
(1) and (2) respectively.
Definition 3.11 The vector space V is called the tensor product of V1 and V2
V ≡ V1 ⊗ V2
if there is associated with each pair of vectors x (1) ∈ V1 and y (2) ∈ V2 a vector in V denoted by x (1) ⊗ y (2) and called
the tensor product of x (1) and y (2), and in which this correspondence satisfies the following conditions: (a) It is linear with
respect to multiplication by a scalar
[αx (1)] ⊗ y (2) = α [x (1) ⊗ y (2)] ; x (1) ⊗ [βy (2)] = β [x (1) ⊗ y (2)] (3.90)
(b) It is distributive with respect to addition
[x (1) + x′ (1)] ⊗ y (2) = x (1) ⊗ y (2) + x′ (1) ⊗ y (2)
x (1) ⊗ [y (2) + y′ (2)] = x (1) ⊗ y (2) + x (1) ⊗ y′ (2) (3.91)
(c) When a basis is chosen in each space, say {ui (1)} in V1 and {vj (2)} in V2 , the set of vectors ui (1) ⊗ vj (2) constitutes a
basis in V . If n1 and n2 are finite, the dimension of the tensor product space V is n1 n2 .
An arbitrary couple of vectors x (1), y (2) can be written in terms of the bases {u_i (1)} and {v_j (2)} respectively, in the form
\[
x(1) = \sum_i a_i\, u_i(1) \quad ; \quad y(2) = \sum_j b_j\, v_j(2)
\]
Using Eqs. (3.90, 3.91) we see that the expansion of the tensor product is given by
\[
x(1) \otimes y(2) = \sum_i \sum_j a_i b_j\; u_i(1) \otimes v_j(2)
\]

so that the components of the tensor product of two vectors are the products of the components of the two vectors of the
product. It is clear that the tensor product is commutative i.e. V1 ⊗ V2 = V2 ⊗ V1 and x (1) ⊗ y (2) = y (2) ⊗ x (1)
On the other hand, it is important to emphasize that there exist in V some vectors that cannot be written as tensor products of a vector in V1 with a vector in V2. Nevertheless, since {u_i (1) ⊗ v_j (2)} is a basis in V, any vector in V can be expanded in it
\[
\psi = \sum_i \sum_j c_{ij}\; u_i(1) \otimes v_j(2) \tag{3.92}
\]
in other words, given a set of n1 n2 coefficients of the form c_ij, it is not always possible to write them as products of the form a_i b_j of n1 numbers a_i and n2 numbers b_j; we cannot always find a couple of vectors in V1 and V2 such that ψ = x (1) ⊗ y (2).
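This can be visualized numerically. In the Python/NumPy sketch below (an illustration, with arbitrary dimensions and random vectors), the coefficients of a product vector are obtained with np.kron, and the factorizability test used is the rank of the n1 × n2 matrix formed by the coefficients c_ij: product vectors give rank one, generic vectors do not.

import numpy as np

n1, n2 = 3, 4
rng = np.random.default_rng(1)
a = rng.normal(size=n1) + 1j * rng.normal(size=n1)    # components of x(1)
b = rng.normal(size=n2) + 1j * rng.normal(size=n2)    # components of y(2)

product_vec = np.kron(a, b)                           # c_ij = a_i b_j, flattened
generic_vec = rng.normal(size=n1 * n2) + 1j * rng.normal(size=n1 * n2)

print(np.linalg.matrix_rank(product_vec.reshape(n1, n2)))   # 1: factorizable
print(np.linalg.matrix_rank(generic_vec.reshape(n1, n2)))   # > 1: not a tensor product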

3.20.1 Scalar products in tensor product spaces


If there are inner products defined in the spaces V1 and V2 we can define an inner product in the tensor product space V . For
a couple of vectors in V of the form x (1) ⊗ y (2) the inner product can be written as
(x′ (1) ⊗ y′ (2) , x (1) ⊗ y (2)) = (x′ (1) , x (1))(1) (y′ (2) , y (2))(2)

where the symbols (, )(1) and (, )(2) denote the inner product of each of the spaces of the product. From this, we can see that
if the bases {ui (1)} and {vj (2)} are orthonormal in V1 and V2 respectively, then the basis {ui (1) ⊗ vj (2)} also is
(ui (1) ⊗ vj (2) , uk (1) ⊗ vm (2)) = (ui (1) , uk (1))(1) (vj (2) , vm (2))(2) = δik δjm

Now, for arbitrary vectors in V, we use the expansion (3.92) and the basic properties of the inner product
\[
(\psi, \phi) = \left( \sum_i \sum_j c_{ij}\, u_i(1) \otimes v_j(2)\;,\; \sum_k \sum_m b_{km}\, u_k(1) \otimes v_m(2) \right)
= \sum_{i,j} \sum_{k,m} c_{ij}^* b_{km}\, \left( u_i(1) \otimes v_j(2),\, u_k(1) \otimes v_m(2) \right)
= \sum_{i,j} \sum_{k,m} c_{ij}^* b_{km}\, \delta_{ik} \delta_{jm}
\]
\[
(\psi, \phi) = \sum_{i,j} c_{ij}^*\, b_{ij}
\]
It is easy to show that with these definitions the new product satisfies the axioms of an inner product.
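The factorization of the inner product for product vectors can be checked numerically; the following Python/NumPy sketch (illustration only, with randomly chosen vectors and np.vdot conjugating the first argument, as in our convention) verifies it.

import numpy as np

rng = np.random.default_rng(2)
def rand_vec(n):                               # helper: random complex vector
    return rng.normal(size=n) + 1j * rng.normal(size=n)

x, xp = rand_vec(3), rand_vec(3)
y, yp = rand_vec(4), rand_vec(4)

lhs = np.vdot(np.kron(xp, yp), np.kron(x, y))  # (x'(1) (x) y'(2), x(1) (x) y(2))
rhs = np.vdot(xp, x) * np.vdot(yp, y)          # (x'(1), x(1)) (y'(2), y(2))
print(np.isclose(lhs, rhs))                    # True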

3.20.2 Tensor product of operators


Consider a linear transformation A (1) defined on V1; we associate with it a linear operator Ã (1) acting on V as follows: when Ã (1) is applied to a tensor of the type x (1) ⊗ y (2) we define
\[
\tilde{A}(1) \left[ x(1) \otimes y(2) \right] = \left[ A(1)\, x(1) \right] \otimes y(2)
\]
when the operator is applied to an arbitrary vector in V, this definition is easily extended because of the linearity of the transformation
\[
\tilde{A}(1)\, \psi = \tilde{A}(1) \sum_i \sum_j c_{ij}\, u_i(1) \otimes v_j(2) = \sum_i \sum_j c_{ij}\, \tilde{A}(1) \left[ u_i(1) \otimes v_j(2) \right]
= \sum_i \sum_j c_{ij} \left[ A(1)\, u_i(1) \right] \otimes v_j(2) \tag{3.93}
\]

the extension B̃ (2) of a linear transformation in V2 is obtained in a similar way
\[
\tilde{B}(2)\, \psi = \sum_i \sum_j c_{ij}\; u_i(1) \otimes \left[ B(2)\, v_j(2) \right]
\]
finally, if we consider two operators A (1), B (2) defined in V1 and V2 respectively, we can define their tensor product A (1) ⊗ B (2) as
\[
\left[ A(1) \otimes B(2) \right] \psi = \sum_i \sum_j c_{ij} \left[ A(1)\, u_i(1) \right] \otimes \left[ B(2)\, v_j(2) \right] \tag{3.94}
\]

it is easy to show that A (1) ⊗ B (2) is also a linear operator. From Eqs. (3.93, 3.94) we can realize that the extension of the operator A (1) on V1 to an operator Ã (1) on V can be seen as the tensor product of A (1) with the identity operator I (2) on V2. A similar situation occurs with the extension B̃ (2)
\[
\tilde{A}(1) = A(1) \otimes I(2) \quad ; \quad \tilde{B}(2) = I(1) \otimes B(2)
\]
Now let us apply the operators A (1) ⊗ B (2) and Ã (1) B̃ (2) to an arbitrary element of a basis {u_i (1) ⊗ v_j (2)} of V
\[
\left[ A(1) \otimes B(2) \right] u_i(1) \otimes v_j(2) = \left[ A(1)\, u_i(1) \right] \otimes \left[ B(2)\, v_j(2) \right]
\]
\[
\tilde{A}(1)\tilde{B}(2)\; u_i(1) \otimes v_j(2) = \tilde{A}(1) \left\{ u_i(1) \otimes \left[ B(2)\, v_j(2) \right] \right\} = \left[ A(1)\, u_i(1) \right] \otimes \left[ B(2)\, v_j(2) \right]
\]
therefore, the tensor product A (1) ⊗ B (2) coincides with the ordinary composition of the two operators Ã (1) and B̃ (2) on V
\[
A(1) \otimes B(2) = \tilde{A}(1)\tilde{B}(2)
\]

additionally, it can be shown that operators of the form Ã (1) and B̃ (2) commute in V. To see it, we apply their products in both orders to an arbitrary vector of a basis {u_i (1) ⊗ v_j (2)} of V
\[
\tilde{A}(1)\tilde{B}(2)\; u_i(1) \otimes v_j(2) = \tilde{A}(1) \left\{ u_i(1) \otimes \left[ B(2)\, v_j(2) \right] \right\} = \left[ A(1)\, u_i(1) \right] \otimes \left[ B(2)\, v_j(2) \right]
\]
\[
\tilde{B}(2)\tilde{A}(1)\; u_i(1) \otimes v_j(2) = \tilde{B}(2) \left\{ \left[ A(1)\, u_i(1) \right] \otimes v_j(2) \right\} = \left[ A(1)\, u_i(1) \right] \otimes \left[ B(2)\, v_j(2) \right]
\]
therefore we have
\[
\left[ \tilde{A}(1), \tilde{B}(2) \right] = 0 \quad \text{or} \quad A(1) \otimes B(2) = B(2) \otimes A(1)
\]
An important special case of linear operators is that of projectors. As with any other linear operators, the tensor product of a projector in V1 with a projector in V2 defines a projector in V. Let M1 and N1 be the range and null space of a projector in V1, and M2, N2 the range and null space of a projector in V2
\[
V_1 = M_1 \oplus N_1 \;;\; x(1) = x_M(1) + x_N(1) \;;\; x_M(1) \in M_1,\; x_N(1) \in N_1 \;;\; P_1(x(1)) = x_M(1)
\]
\[
V_2 = M_2 \oplus N_2 \;;\; y(2) = y_M(2) + y_N(2) \;;\; y_M(2) \in M_2,\; y_N(2) \in N_2 \;;\; P_2(y(2)) = y_M(2)
\]
\[
(P_1 \otimes P_2)\,(x(1) \otimes y(2)) = \left[ P_1 x(1) \right] \otimes \left[ P_2 y(2) \right] = x_M(1) \otimes y_M(2)
\]
for an arbitrary vector we have
\[
(P_1 \otimes P_2)\, \psi = (P_1 \otimes P_2) \sum_i \sum_j c_{ij}\, u_i(1) \otimes v_j(2)
= \sum_i \sum_j c_{ij} \left[ P_1 u_i(1) \right] \otimes \left[ P_2 v_j(2) \right]
= \sum_i \sum_j c_{ij}\; u_{i,M}(1) \otimes v_{j,M}(2)
\]

finally, as in the case of vectors, there exist some operators on V that cannot be written as tensor products of the form A (1) ⊗ B (2).
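In matrix language the extensions above are Kronecker products; the Python/NumPy sketch below (illustration only, with random matrices of arbitrary dimensions) checks the two identities just derived: A (1) ⊗ B (2) = Ã (1) B̃ (2), and the commutation of the extensions.

import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 3))
B = rng.normal(size=(4, 4))
I3, I4 = np.eye(3), np.eye(4)

A_ext = np.kron(A, I4)      # A~(1) = A(1) (x) I(2)
B_ext = np.kron(I3, B)      # B~(2) = I(1) (x) B(2)

print(np.allclose(A_ext @ B_ext, np.kron(A, B)))    # True: A(1) (x) B(2) = A~(1) B~(2)
print(np.allclose(A_ext @ B_ext, B_ext @ A_ext))     # True: [A~(1), B~(2)] = 0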

3.20.3 The eigenvalue problem in tensor product spaces


Let us assume that we have solved the eigenvalue problem for an operator A (1) of V1 . We want to seek for information
concerning the eigenvalue problem for the extension of this operator to the tensor product space V . For simplicity, we shall
assume a discrete spectrum
\[
A(1)\, x_n^i(1) = a_n\, x_n^i(1) \quad ; \quad i = 1, 2, \ldots, g_n \quad ; \quad x_n^i(1) \in V_1
\]
where g_n is the degeneracy associated with a_n. We want to solve the eigenvalue problem for the extension of this operator in V = V1 ⊗ V2
\[
\tilde{A}(1)\, \psi = \lambda \psi \quad ; \quad \psi \in V_1 \otimes V_2
\]
from the definition of such an extension, we see that a vector of the form x_n^i (1) ⊗ y (2), for any y (2) ∈ V2, is an eigenvector of Ã (1) with eigenvalue a_n
\[
\tilde{A}(1) \left[ x_n^i(1) \otimes y(2) \right] = \left[ A(1)\, x_n^i(1) \right] \otimes y(2) = a_n \left[ x_n^i(1) \otimes y(2) \right]
\]

it is natural to ask whether a complete set of linearly independent eigenvectors of Ã (1) can be generated in this way. We shall see that this is true if A (1) is an observable in V1. Assuming it, the set of orthonormal eigenvectors {x_n^i (1)} forms a basis in V1. If we now take an orthonormal basis {y_m (2)} in V2, then the set of vectors
\[
\left\{ \psi_n^{i,m} \right\} \equiv \left\{ x_n^i(1) \otimes y_m(2) \right\}
\]
forms an orthonormal basis in V. It is clear that the set {ψ_n^{i,m}} consists of eigenvectors of Ã (1) with eigenvalues a_n, and since they form a basis, a complete orthonormal set of eigenvectors of Ã (1) has been generated with the procedure explained above. This in turn means that if A (1) is an observable in V1, its extension Ã (1) is also an observable in V. Further, the spectrum of Ã (1) coincides with the spectrum of A (1). Notwithstanding, it is worth saying that if N2 is the dimension of V2 and a_n is g_n-fold degenerate in V1, it will be (g_n · N2)-fold degenerate in V. This is because for a given eigenvector x_n^i (1) in V1, there are N2 linearly independent eigenvectors ψ_n^{i,m} ≡ x_n^i (1) ⊗ y_m (2), since m = 1, ..., N2.
We know that each eigenvalue a_n of A (1) in V1 defines an eigensubspace V_{1,a_n} in V1 of dimension g_n. The corresponding eigensubspace generated by a_n in V is a subspace V_{a_n} of dimension N2 · g_n. The projector onto V_{1,a_n} is written as
\[
V_1 = V_{1,a_n} \oplus V_{1,a_n}^{\perp} \;;\; x(1) = x_{a_n}(1) + x_{a_n}^{\perp}(1) \;;\; x_{a_n}(1) \in V_{1,a_n},\; x_{a_n}^{\perp}(1) \in V_{1,a_n}^{\perp} \;;\; P_1^{a_n}(x(1)) = x_{a_n}(1)
\]
and its extension to V is defined as
\[
\tilde{P}_1^{a_n} \equiv P_1^{a_n} \otimes I_2 \quad ; \quad \tilde{P}_1^{a_n}\, \psi_n^{i,m} \equiv \tilde{P}_1^{a_n} \left[ x_n^i(1) \otimes y_m(2) \right] = \left[ P_1^{a_n} x_n^i(1) \right] \otimes y_m(2) = x_{a_n}(1) \otimes y_m(2)
\]

So that P̃_1^{a_n} is a mapping of V onto V_{a_n} ≡ V_{1,a_n} ⊗ V2. Now assume that we have a sum of operators of both spaces
\[
C = \tilde{A}(1) + \tilde{B}(2)
\]
where A (1) and B (2) are observables in their corresponding spaces, with the following eigenvalues and eigenvectors
\[
A(1)\, x_n^i(1) = a_n\, x_n^i(1) \quad ; \quad i = 1, 2, \ldots, g_n \quad ; \quad x_n^i(1) \in V_1
\]
\[
B(2)\, y_m^k(2) = b_m\, y_m^k(2) \quad ; \quad k = 1, 2, \ldots, h_m \quad ; \quad y_m^k(2) \in V_2
\]

we have seen that Ã (1) and B̃ (2) commute, so they should have a common basis of eigenvectors in V. This basis is precisely the tensor product of their eigenvectors
\[
\tilde{A}(1) \left[ x_n^i(1) \otimes y_m^k(2) \right] = a_n \left[ x_n^i(1) \otimes y_m^k(2) \right]
\]
\[
\tilde{B}(2) \left[ x_n^i(1) \otimes y_m^k(2) \right] = b_m \left[ x_n^i(1) \otimes y_m^k(2) \right]
\]
and they are also eigenvectors of C = Ã (1) + B̃ (2)
\[
\left[ \tilde{A}(1) + \tilde{B}(2) \right] \left[ x_n^i(1) \otimes y_m^k(2) \right] = (a_n + b_m) \left[ x_n^i(1) \otimes y_m^k(2) \right]
\]
\[
C \left[ x_n^i(1) \otimes y_m^k(2) \right] = c_{nm} \left[ x_n^i(1) \otimes y_m^k(2) \right] \quad ; \quad c_{nm} = a_n + b_m
\]

So that if C = Ã (1) + B̃ (2), the eigenvalues of C are the sums of the eigenvalues of Ã (1) and B̃ (2). Besides, we can form a basis of eigenvectors of C by taking the tensor product of the bases of A (1) and B (2).
It is important to emphasize that even if a_n and b_m are non-degenerate, it is possible for c_{nm} to be degenerate. Assume that a_n and b_m are non-degenerate, and for a given c_{nm} let us define the set of pairs {(n_j, m_j) : j = 1, ..., q} such that a_{n_j} + b_{m_j} = c_{nm}. In that case, the eigenvalue c_{nm} is q-fold degenerate, and every eigenvector corresponding to this eigenvalue can be written as
\[
\sum_{j=1}^{q} c_j \left[ x_{n_j}(1) \otimes y_{m_j}(2) \right]
\]
in this case there are eigenvectors of C that are not tensor products.
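A small numerical sketch of these statements (illustration only; the diagonal matrices are hypothetical examples chosen so that the spectra are non-degenerate but some sums coincide) is the following Python/NumPy fragment.

import numpy as np

A = np.diag([0.0, 1.0, 2.0])            # non-degenerate spectrum of A(1)
B = np.diag([0.0, 1.0])                 # non-degenerate spectrum of B(2)
C = np.kron(A, np.eye(2)) + np.kron(np.eye(3), B)   # C = A~(1) + B~(2)

print(sorted(np.linalg.eigvalsh(C)))    # [0, 1, 1, 2, 2, 3]: the sums a_n + b_m
# the eigenvalues 1 and 2 are twofold degenerate even though a_n and b_m are not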

3.20.4 Complete sets of commuting observables in tensor product spaces


For simplicity assume that A (1) forms a C.S.C.O. by itself in V1 , while {B (2) , C (2)} constitute a C.S.C.O. in V2 . We shall
show that by gathering the operators of the C.S.C.O. in V1 with the operators of C.S.C.O. in V2 , we form a C.S.C.O. in V
with their corresponding extensions.
Since A (1) is a C.S.C.O. in V1, all its eigenvalues are non-degenerate in V1
\[
A(1)\, x_n(1) = a_n\, x_n(1)
\]
the ket x_n (1) is then unique within a constant factor. In V2 the set of two operators {B (2), C (2)} defines common eigenvectors {y_{mp} (2)} that are unique in V2 within constant factors
\[
B(2)\, y_{mp}(2) = b_m\, y_{mp}(2) \quad ; \quad C(2)\, y_{mp}(2) = c_p\, y_{mp}(2)
\]

In V, the eigenvalues a_n are N2-fold degenerate. Similarly, there are N1 linearly independent eigenvectors of B (2) and C (2) associated with a given pair of eigenvalues (b_m, c_p). However, the eigenvectors that are common to the three commuting observables Ã (1), B̃ (2), C̃ (2) are unique within constant factors
\[
\tilde{A}(1) \left[ x_n(1) \otimes y_{mp}(2) \right] = a_n \left[ x_n(1) \otimes y_{mp}(2) \right]
\]
\[
\tilde{B}(2) \left[ x_n(1) \otimes y_{mp}(2) \right] = b_m \left[ x_n(1) \otimes y_{mp}(2) \right]
\]
\[
\tilde{C}(2) \left[ x_n(1) \otimes y_{mp}(2) \right] = c_p \left[ x_n(1) \otimes y_{mp}(2) \right]
\]
since {x_n (1)} and {y_{mp} (2)} are bases in V1 and V2, we see that {x_n (1) ⊗ y_{mp} (2)} is a basis in V constituted by common eigenvectors of the three operators. Thus the set {Ã (1), B̃ (2), C̃ (2)} is a C.S.C.O. in V.

3.21 Restrictions of an operator to a subspace


It is useful in many applications to be able to restrict an operator to a certain subspace Vq of a given vector space V . Let us
assume

V = V1 ⊕ . . . ⊕ Vq ⊕ . . .
x = x1 + . . . + xq + . . . ; with xi ∈ Vi

Projectors, which are the natural operators to “restrict” a vector by removing the components orthogonal to a given subspace, will also be the natural operators to restrict operators. Let P_q be the projector onto a subspace V_q. A priori, we could think of defining a restriction by “restricting the vector” on which the operator acts. This is done by subtracting all components orthogonal to the subspace V_q by applying the projection, and then letting the operator A act on this projection, so that
\[
A P_q\, x = A\, x_q
\]
in this case we have restricted the domain of A to the subspace V_q, but once the operator A is applied to all vectors in V_q, the range could lie outside of V_q. Hence, the projector must be applied again after the application of A in order to restrict the range appropriately. We then define the restriction Â_q of the operator A to the subspace V_q as
\[
\hat{A}_q \equiv P_q\,(A P_q) = P_q A P_q
\]

so that both the domain and the range are restricted to V_q. It can be easily checked that the matrix representation of Â_q reduces to a submatrix in the V_q space. Let q_k be the dimension of V_q, and let us use an ordered basis such that the first q_k vectors span V_q. Using such a basis we have
\[
\left( \hat{A}_q \right)_{ij} = \left( u_i, \hat{A}_q u_j \right) = \left( u_i, P_q A P_q u_j \right) = \left( P_q u_i, A P_q u_j \right)
\]
\[
\left( P_q u_i, A P_q u_j \right) =
\begin{cases}
(u_i, A u_j) & \text{if } i, j \le q_k \\
0 & \text{if } i > q_k \text{ and/or } j > q_k
\end{cases}
\]

observe that the submatrix associated with i, j ≤ qk (i.e. associated with the Vq subspace), remains the same with respect to
the non-restricted matrix. But the elements outside of such a submatrix are zeros, showing that the new operator only acts in
Vq .
It is important to emphasize that the restriction Â_q of an operator A differs from A itself, because we are changing the mapping. In the special case in which the subspace V_q is invariant under A, the range of A is automatically restricted to V_q when the domain is restricted to V_q. Thus, in that case the restriction can be defined with only one projector
\[
\hat{A}_q \equiv A P_q
\]
so when V_q is invariant under A, the mapping described by Â_q is identical to the mapping described by A when such mappings are restricted to the domain V_q.
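The submatrix structure of P_q A P_q can be seen directly with matrices. The Python/NumPy sketch below (illustration only; dimensions and the random matrix are arbitrary) restricts an operator to the subspace spanned by the first q basis vectors.

import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(5, 5))
q = 2
P = np.zeros((5, 5))
P[:q, :q] = np.eye(q)                   # projector onto V_q (first q basis vectors)

A_restricted = P @ A @ P
print(np.allclose(A_restricted[:q, :q], A[:q, :q]))   # True: the q x q submatrix is unchanged
print(np.count_nonzero(A_restricted[q:, :]),
      np.count_nonzero(A_restricted[:, q:]))          # 0 0: everything outside it vanishes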

3.22 Functions of operators


Let A be an arbitrary operator. The operator A^n with n a non-negative integer is easily defined as
\[
A^0 \equiv I \quad , \quad A^n = A A \cdots A \;\; (n \text{ times})
\]
similarly, for negative integers a consistent definition is
\[
A^{-n} \equiv \left( A^{-1} \right)^n \quad \text{with} \quad A A^{-1} = A^{-1} A = I
\]

it is useful to define functions of operators. Assume that a function F can be expanded in a certain domain in the following way
\[
F(z) = \sum_{n=0}^{\infty} f_n z^n \tag{3.95}
\]
by definition, the function F (A) of the operator A corresponds to an expansion of the form (3.95) with the same coefficients f_n
\[
F(A) = \sum_{n=0}^{\infty} f_n A^n \tag{3.96}
\]
for instance, the function e^A of the operator A reads
\[
e^A = \sum_{n=0}^{\infty} \frac{A^n}{n!} = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \ldots
\]

the convergence of series of the type (3.96) depends on the eigenvalues of A and the radius of convergence of the function
(3.95). We shall not treat this topic in detail.
If F (z) is a real function the coefficients fn are real. On the other hand, if A is hermitian then F (A) also is, as can be
seen from (3.96). Owing to the analogy between real numbers and hermitian operators this relation is quite expected. Now,
assume that xi,k is an eigenvector of A with eigenvalue ai we then have

Axi,k = ai xi,k ⇒ An xi,k = ani xi,k

and applying F (A) as given by Eq. (3.96) to this eigenvector we find
\[
F(A)\, x_{i,k} = \sum_{n=0}^{\infty} f_n a_i^n\, x_{i,k} = \left( \sum_{n=0}^{\infty} f_n a_i^n \right) x_{i,k} = F(a_i)\, x_{i,k}
\]

so that if xi,k is an eigenvector of A with eigenvalue ai , then xi,k is also eigenvector of F (A) with eigenvalue F (ai ). The fact
that all eigenvectors of A are also eigenvectors of F (A) has many important implications. (a) If A is observable then F (A) also
is. (b) If A is diagonalizable (this is the case for observables), the matrix of diagonalization of A (constituted by a complete

set of its eigenvectors) also diagonalizes F (A). It means that we can find a basis in which the matrix representative of A is
diagonal with the eigenvalues ai in the diagonal and that in such a basis, the operator F (A) has also a diagonal representation
with elements F (a_i) in the diagonal. For example, let σ_z be an operator that in a certain basis has the matrix representation
\[
\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}
\]
in the same basis we have
\[
e^{\sigma_z} = \begin{pmatrix} e^1 & 0 \\ 0 & e^{-1} \end{pmatrix} = \begin{pmatrix} e & 0 \\ 0 & 1/e \end{pmatrix}
\]
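The same statement can be checked for a non-diagonal, diagonalizable matrix: applying F to the eigenvalues in the diagonalizing basis reproduces F (A). The following Python sketch (illustration only; the hermitian matrix is an arbitrary example and scipy.linalg.expm is used as an independent reference) shows this for F = exp.

import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 0.5],
              [0.5, -1.0]])            # hermitian, hence diagonalizable
vals, vecs = np.linalg.eigh(A)

expA_spectral = vecs @ np.diag(np.exp(vals)) @ vecs.conj().T
print(np.allclose(expA_spectral, expm(A)))   # True: e^A has eigenvalues e^{a_i}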
if A and B do not commute, then in general the operators F (A) and F (B) do not commute either. For instance
\[
e^A e^B = \sum_{n=0}^{\infty} \frac{A^n}{n!} \sum_{m=0}^{\infty} \frac{B^m}{m!} = \sum_{n=0}^{\infty} \sum_{m=0}^{\infty} \frac{A^n B^m}{n!\, m!} \tag{3.97}
\]
\[
e^B e^A = \sum_{m=0}^{\infty} \frac{B^m}{m!} \sum_{n=0}^{\infty} \frac{A^n}{n!} = \sum_{m=0}^{\infty} \sum_{n=0}^{\infty} \frac{B^m A^n}{m!\, n!} \tag{3.98}
\]
\[
e^{A+B} = \sum_{n=0}^{\infty} \frac{(A+B)^n}{n!} \tag{3.99}
\]
these three expressions are in general different from each other unless [A, B] = 0. We see by direct inspection of Eqs. (3.97, 3.98, 3.99) that if A and B commute, then F (A) and F (B) also do. Notice that if A, B are observables, A and B commute if and only if they can be diagonalized simultaneously, and then so can F (A) and F (B). This is another way to see that if [A, B] = 0 then [F (A), F (B)] = 0.
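The inequality of the three expressions for non-commuting operators is easy to exhibit with small matrices. The Python sketch below (illustration only; the 2 × 2 matrices are hypothetical examples) compares e^A e^B, e^B e^A and e^{A+B}.

import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
print(np.allclose(A @ B, B @ A))                            # False: [A, B] != 0

print(np.allclose(expm(A) @ expm(B), expm(B) @ expm(A)))    # False
print(np.allclose(expm(A) @ expm(B), expm(A + B)))          # False
# replacing B by a multiple of A (so that [A, B] = 0) makes the expressions coincide
print(np.allclose(expm(A) @ expm(2 * A), expm(A + 2 * A)))  # True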

3.22.1 Some commutators involving functions of operators


Theorem 3.21 Suppose we have two operators A and B such that B commutes with their commutator, that is
\[
[B, C] = 0 \quad ; \quad C \equiv [A, B] \tag{3.100}
\]
if F (B) is a function of the operator B then we have
\[
[A, F(B)] = [A, B]\, F'(B) \tag{3.101}
\]
where F′ (B) is the derivative of F (B) “with respect to B”, defined as
\[
F(B) = \sum_{n=0}^{\infty} f_n B^n \;\Rightarrow\; F'(B) \equiv \sum_{n=0}^{\infty} n f_n B^{n-1} \tag{3.102}
\]

Proof: The commutator [A, F (B)] is given by
\[
[A, F(B)] = \left[ A, \sum_{n=0}^{\infty} f_n B^n \right] = \sum_{n=0}^{\infty} f_n [A, B^n] \tag{3.103}
\]
we first prove that
\[
[A, B^n] = [A, B]\, n B^{n-1} = n B^{n-1} [A, B] \tag{3.104}
\]
the second equality comes directly from Eq. (3.100). The first equality can be proved by induction. For n = 0 we have B^n = I and both sides clearly vanish. Now let us assume that it works for n and show that it is satisfied by n + 1. Applying Eq. (2.49), and taking into account Eqs. (3.104, 3.100), we have
\[
\left[ A, B^{n+1} \right] = [A, B B^n] = [A, B] B^n + B [A, B^n] = [A, B] B^n + B [A, B]\, n B^{n-1}
= C B^n + B C n B^{n-1} = C B^n + n C B B^{n-1} = C (n+1) B^n
\]
\[
\left[ A, B^{n+1} \right] = [A, B]\, (n+1) B^n
\]
which shows the validity of Eq. (3.104). Replacing Eq. (3.104) in Eq. (3.103), we find
\[
[A, F(B)] = [A, B] \sum_{n=0}^{\infty} f_n\, n B^{n-1} = [A, B]\, F'(B)
\]

QED.

Corollary 3.22 If both operators commute with their commutator, the equations
\[
[A, F(B)] = [A, B]\, F'(B) \quad ; \quad [G(A), B] = [A, B]\, G'(A) \tag{3.105}
\]
are satisfied simultaneously. A very important case in Physics occurs when [A, B] = αI. In that case, we have
\[
[A, B] = \alpha I \;\Rightarrow\; [A, F(B)] = \alpha F'(B) \quad ; \quad [G(A), B] = \alpha G'(A) \tag{3.106}
\]
Example 3.2 Let us evaluate [e^{At}, B] under the condition [A, [A, B]] = [B, [A, B]] = 0. Applying Eq. (3.102) with F (A) = e^{At} we have
\[
F(A) = e^{At} = \sum_{n=0}^{\infty} \frac{(At)^n}{n!} = \sum_{n=0}^{\infty} \frac{t^n}{n!} A^n \;\Rightarrow\; F'(A) \equiv \sum_{n=0}^{\infty} n\, \frac{t^n}{n!} A^{n-1} = \sum_{n=1}^{\infty} \frac{t^n}{(n-1)!} A^{n-1} = t \sum_{k=0}^{\infty} \frac{(At)^k}{k!}
\]
\[
F'(A) = t\, e^{At} \tag{3.107}
\]
Combining Eqs. (3.105, 3.107) we find
\[
\left[ e^{At}, B \right] = t\, [A, B]\, e^{At} \tag{3.108}
\]
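Equation (3.108) can be tested numerically with matrices whose commutator commutes with both of them. In the Python sketch below (illustration only), A = E_12 and B = E_23 are 3 × 3 matrices with a single non-zero entry, so that C = [A, B] = E_13 commutes with A and B, exactly as the hypotheses of the theorem require.

import numpy as np
from scipy.linalg import expm

A = np.zeros((3, 3)); A[0, 1] = 1.0          # E_12
B = np.zeros((3, 3)); B[1, 2] = 1.0          # E_23
C = A @ B - B @ A                            # = E_13
print(np.allclose(A @ C, C @ A), np.allclose(B @ C, C @ B))   # True True

t = 0.7
lhs = expm(A * t) @ B - B @ expm(A * t)      # [e^{At}, B]
rhs = t * C @ expm(A * t)                    # t [A, B] e^{At}
print(np.allclose(lhs, rhs))                 # True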

3.23 Differentiation of operators


Let A (z) be an operator that depends on an arbitrary variable z. We define the derivative of A (z) with respect to z as
\[
\frac{dA}{dz} = \lim_{\Delta z \to 0} \frac{A(z + \Delta z) - A(z)}{\Delta z} \tag{3.109}
\]
provided that this limit exists. Applying A to an arbitrary vector x, using a basis {u_i} independent of z, and applying Eq. (3.3) we have
\[
A(z)\, x = A(z)\, x_i u_i = x_i\, A(z)\, u_i = x_i u_j A_{ji}(z) \tag{3.110}
\]
since dA/dz is another operator, it makes sense to talk about its matrix representation
\[
\frac{dA(z)}{dz}\, x = \frac{dA(z)}{dz}\, x_i u_i = x_i\, \frac{dA(z)}{dz}\, u_i = x_i u_j \left[ \frac{dA(z)}{dz} \right]_{ji} \tag{3.111}
\]
Applying the derivative to both extremes of Eq. (3.110), and taking into account that the basis {u_i} and the vector x are independent of z, we have
\[
\frac{dA(z)}{dz}\, x = x_i u_j\, \frac{dA_{ji}(z)}{dz} \tag{3.112}
\]
comparing Eqs. (3.111, 3.112) we obtain
\[
\left[ \frac{dA(z)}{dz} \right]_{ji} = \frac{dA_{ji}(z)}{dz} \tag{3.113}
\]
so the matrix representing the derivative of A is obtained by taking the derivative of each of its elements14. The differentiation rules are similar to the ones in ordinary calculus
\[
\frac{d}{dz}(F + G) = \frac{dF}{dz} + \frac{dG}{dz} \quad ; \quad \frac{d}{dz}(F G) = \frac{dF}{dz}\, G + F\, \frac{dG}{dz} \tag{3.114}
\]
except that care must be taken with the order of appearance of the operators involved. Let us examine the second of these equations. Since FG is just another operator, we can use Eq. (3.113) to obtain
\[
\left[ \frac{d(FG)}{dz} \right]_{ji} = \frac{d}{dz}(FG)_{ji} = \frac{d}{dz}\left[ F_{jk} G_{ki} \right] = \left( \frac{d F_{jk}}{dz} \right) G_{ki} + F_{jk} \left( \frac{d G_{ki}}{dz} \right)
= \left[ \frac{dF}{dz} \right]_{jk} G_{ki} + F_{jk} \left[ \frac{dG}{dz} \right]_{ki}
\]
in matrix form we see that
\[
\frac{d(\mathbf{FG})}{dz} = \frac{d\mathbf{F}}{dz}\, \mathbf{G} + \mathbf{F}\, \frac{d\mathbf{G}}{dz}
\]
we already knew that there is a one-to-one isomorphism from the operators onto the matrices that preserves product, sum and
scalar product. In this section, we have seen that this relation is also valid for the derivatives of these operators, at least when
such a derivative exists.
14 Care must be taken to distinguish between the derivative in Eq. (3.102) and the derivative in Eq. (3.109). In Eq. (3.102) the derivative is taken

with respect to B as the “variable of derivation”. On the other hand, in Eq. (3.109) the variable to derive with, is a parameter z from which our
operator depends on.

3.23.1 Some useful formulas


Applying the differentiation rules we can develop some identities for functions of operators. Let us calculate the derivative of the operator e^{At}. By definition we have
\[
e^{At} = \sum_{n=0}^{\infty} \frac{(At)^n}{n!}
\]
differentiating the series term by term we have
\[
\frac{d}{dt} e^{At} = \sum_{n=0}^{\infty} \frac{A^n}{n!}\, n t^{n-1} = \sum_{n=1}^{\infty} \frac{A^n}{n!}\, n t^{n-1} = A \sum_{n=1}^{\infty} \frac{(At)^{n-1}}{(n-1)!} = A \left[ \sum_{k=0}^{\infty} \frac{(At)^k}{k!} \right] = \left[ \sum_{k=0}^{\infty} \frac{(At)^k}{k!} \right] A
\]
where we have used the assignment k = n − 1. The series in the brackets is e^{At} once again, so we have
\[
\frac{d}{dt} e^{At} = A\, e^{At} = e^{At} A \tag{3.115}
\]
in this case e^{At} and A commute because only one operator is involved15. Suppose now that we want to differentiate e^{At} e^{Bt}. Applying Eqs. (3.114, 3.115) we have
\[
\frac{d}{dt}\left( e^{At} e^{Bt} \right) = \frac{d\, e^{At}}{dt}\, e^{Bt} + e^{At}\, \frac{d\, e^{Bt}}{dt} = A e^{At} e^{Bt} + e^{At} B e^{Bt}
\]
the operator A can pass over e^{At} if desired, but not over e^{Bt} unless A and B commute. Similarly, B can pass over e^{Bt} but not over e^{At}.
However, even when a single operator appears, we should sometimes be careful with the order. For instance, if A (t) is an arbitrary function of time then
\[
\frac{d}{dt}\, e^{A(t)} \neq \frac{dA}{dt}\, e^{A(t)} \tag{3.116}
\]
it can be checked that A (t) and dA (t)/dt must commute with each other for the equality to be valid.

Theorem 3.23 Let A and B be two operators that commute with their commutator. They satisfy the relation
\[
[A, [A, B]] = [B, [A, B]] = 0 \;\Rightarrow\; e^A e^B = e^{A+B}\, e^{\frac{1}{2}[A,B]} \qquad (\text{Glauber's formula}) \tag{3.117}
\]
Proof: Let us define F (t), with t real, as
\[
F(t) \equiv e^{At} e^{Bt} \;;\;
\frac{dF(t)}{dt} = A e^{At} e^{Bt} + e^{At} B e^{Bt} = A \left( e^{At} e^{Bt} \right) + e^{At} B e^{-At} \left( e^{At} e^{Bt} \right) = \left( A + e^{At} B e^{-At} \right) e^{At} e^{Bt}
\]
\[
\frac{dF(t)}{dt} = \left( A + e^{At} B e^{-At} \right) F(t) \tag{3.118}
\]
since A, B commute with their commutator, we can apply Eq. (3.108), so that
\[
\left[ e^{At}, B \right] = t [A, B]\, e^{At} \;\Rightarrow\; e^{At} B = B e^{At} + t [A, B]\, e^{At} \;\Rightarrow\; e^{At} B e^{-At} = B + t [A, B]
\]
substituting this expression in Eq. (3.118) we get
\[
\frac{dF(t)}{dt} = \left\{ A + B + t [A, B] \right\} F(t) \tag{3.119}
\]
by hypothesis, A + B commutes with [A, B], so that the differential equation (3.119) can be integrated as if A + B and [A, B] were numbers
\[
F(t) = F(0)\, e^{(A+B)t + \frac{1}{2}[A,B]\, t^2}
\]
setting t = 0 we see that F (0) = I, thus we obtain
\[
F(t) = e^{(A+B)t + \frac{1}{2}[A,B]\, t^2}
\]
setting t = 1 and taking into account again that A + B commutes with [A, B], we obtain (3.117). It is necessary to emphasize that this equation is valid only if A and B commute with [A, B]. QED.
15 Compare Eq. (3.115) with Eq. (3.107).
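Glauber's formula can also be tested numerically with the same nilpotent matrices used before (illustration only; A = E_12 and B = E_23 are hypothetical examples whose commutator commutes with both of them).

import numpy as np
from scipy.linalg import expm

A = np.zeros((3, 3)); A[0, 1] = 1.0
B = np.zeros((3, 3)); B[1, 2] = 1.0
C = A @ B - B @ A                               # [A, B], commutes with A and B

lhs = expm(A) @ expm(B)
rhs = expm(A + B) @ expm(0.5 * C)
print(np.allclose(lhs, rhs))                    # True: e^A e^B = e^{A+B} e^{[A,B]/2}
print(np.allclose(lhs, expm(A + B)))            # False: the correction factor is essential here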
Chapter 4

State space and Dirac notation

We have defined the space of Physical states as the one constituted by functions ψ (r) square-integrable in a given volume.
The space with these characteristics is denoted by L2 , but since in general we add some requirements to these functions, we
actually work in a subspace ̥ ⊆ L2 . On the other hand, we have seen that several bases can be constructed to represent those
functions. Therefore, the Physical system will be described by either the functions ψ (r) or by the set of its coordinates in a
given representation. When the representation is discrete we have a numerable set of coordinates (Fourier coefficients) while
in the case of continuous bases, the set of coordinates is continuous as well (Fourier transforms). In particular, the continuous
basis denoted as ξr0 (r) shows that the function ψ (r) can be considered as a coordinate system as well, because in this basis,
each coordinate is defined as ψ (r0 ) i.e. the value of ψ at each fixed point r0 of the volume1 .
We have now a situation similar to the one obtained in R3 , we can define a vector by a triple of coordinates in any basis
defined by a set of coordinate axes. However, vectors in R3 can be defined geometrically (intrinsically), and its algebra can be
performed in a coordinate-free form.
In the same way, we wish to define our state vector in a coordinate free (or intrinsic) way. The abstract space of state
vectors of a particle is denoted as Er which should be isometrically isomorphic with ̥. We should also define the notation and
algebra on the Er space.
Though we initially start with Er as identical to ̥, we shall see that it permits a generalization of the formalism when the
states in ̥do not contain all the Physical information of the system, as is the case when spin degrees of freedom are introduced
in the formalism. Hence, the algebra that we shall develop now will be valid when these generalizations are carried out. In
developing this algebra we are going to present the Dirac notation which is useful in practical calculations

4.1 Dirac notation


We are going to establish a one-to-one correspondence between the states of ̥ and the states of Er , though the latter will be
extended later. Thus to every square-integrable function ψ (r) in ̥ we make to correspond an abstract vector in Er in the form

ψ (r) ↔ |ψi

an abstract vector in the notation |ψi will be called a ket. Notice that no r−dependence appears in |ψi. Indeed, ψ (r) is
interpreted in this framework as a representation of |ψi in which each ψ (r) is a coordinate in the basis given by ξr (r′ ).
Therefore, r plays the role of index (three continuous indices) for the particular basis used.
The space of states of a particle in one dimension is denoted as Ex , while in three dimensions is Er .

4.2 Elements of the dual or conjugate space Er∗


In section 2.9.2 we defined a one-to-one correspondence between vectors (kets) of a Hilbert space and functionals (bras) in the
conjugate (dual) space in the following way (see Eqs. 2.37, 2.38)

|ψi ↔ f|ψi ; f|ψi (|ϕi) ≡ (|ψi , |ϕi)

Dirac notation designates f|ψi as hψ| which is called a bra. The correspondence above and the inner product will be written
as
|ψi ∈ Er ↔ hψ| ∈ Er∗ ; hψ| (|ϕi) ≡ (|ψi , |ϕi)
1 Notice that this is a simple way of defining an scalar field. A scalar field is completely delimited by defining its value at each point of the space

in which the field is defined (at a given time). In this case the number of coordinates is clearly the number of points in our space.


it induces a natural notation for the inner product

((|ψi , |ϕi)) ≡ hψ| ϕi

this is also called a bracket (i.e. the union of a bra with a ket). Let us now write the properties developed in section 2.9.2
Eq. (2.39), with this new notation

fα|ψi+β|ϕi = α∗ f|ψi + β ∗ f|ϕi


α |ψi + β |ϕi ∈ Er ↔ α∗ hψ| + β ∗ hϕ| ∈ Er∗

which is consistent with the properties of the inner product

(α |ψi + β |ϕi , |χi) = (α∗ hψ| + β ∗ hϕ|) |χi ⇒


hαψ + βϕ| χi = α∗ hψ| χi + β ∗ hϕ| χi

since the functionals (bras) are linear by definition, a linear combination of kets gives

f|ψi (α |ϕi + β |χi) ≡ αf|ψi (|ϕi) + βf|ψi (|χi)

in Dirac notation it reads


hψ| αϕ + βχi = α hψ| ϕi + β hψ| χi
from these facts it is clear that for any scalar α

|αψi = α |ψi ; hαψ| = α∗ hψ| (4.1)

now since
\[
(|\psi\rangle, |\varphi\rangle) = (|\varphi\rangle, |\psi\rangle)^* \;\Rightarrow\; \langle \psi | \varphi \rangle = \langle \varphi | \psi \rangle^*
\]

4.3 The correspondence between bras and kets with hyperbases


We have seen that hyperbases are sets of elements from which any element of the space can be expanded despite those
elements do not belong to the space under study. On the other hand, we have seen that the correspondence between vectors
and functionals (kets and bras) is one-to-one and onto. However, when hyperbases are used we shall see that some linear
functionals (bras) can be well-defined while there is not a well-defined corresponding vector (ket)
Assume for example that we have a ket in ̥ given by a sufficiently regular function ξ_{x0}^{(ε)}(x) such that
\[
\int_{-\infty}^{\infty} dx\; \xi_{x_0}^{(\varepsilon)}(x) = 1
\]
with the form of a peak of height ∼ 1/ε and width ∼ ε centered at x = x0. If ε ≠ 0 then |ξ_{x0}^{(ε)}⟩ ∈ Ex. Let ⟨ξ_{x0}^{(ε)}| ∈ Ex* be its associated bra. The idea is to have a function that converges to the Dirac delta function when ε → 0. For each |ψ⟩ ∈ Ex we have that
\[
\langle \xi_{x_0}^{(\varepsilon)} | \psi \rangle = \left( \xi_{x_0}^{(\varepsilon)}, \psi \right) = \int_{-\infty}^{\infty} dx\; \xi_{x_0}^{(\varepsilon)}(x)\, \psi(x) \tag{4.2}
\]

now we let ε approach zero, and we find that
\[
\lim_{\varepsilon \to 0} \xi_{x_0}^{(\varepsilon)} \notin \text{̥}_x
\]
since the square of its norm behaves as 1/ε and diverges. Nevertheless, in the limit ε → 0 the expression (4.2) is still well-defined, so that ⟨ξ_{x0}^{(ε)}| is still associated with a functional that can be applied to any element of the state space. We shall denote this bra as ⟨ξ_{x0}|; this functional associates with each vector |ψ⟩ ∈ Ex the value ψ (x0) taken by the associated wave function in ̥x at the point x0
\[
\lim_{\varepsilon \to 0} \langle \xi_{x_0}^{(\varepsilon)} | = \langle \xi_{x_0} | \in E_x^* \quad ; \quad \text{if } |\psi\rangle \in E_x \Rightarrow \langle \xi_{x_0} | \psi \rangle = \psi(x_0)
\]

then the bra ⟨ξ_{x0}| ∈ Ex* exists but there is no ket associated with it in the space Ex.
This asymmetry is associated with the use of a hyperbasis. The elements of the hyperbasis do not belong to ̥x and so have no associated elements in Ex either. However, the inner product of such an element with any element of ̥x is well-defined, and it permits us to associate a bra belonging to Ex*. Indeed, by the theory of Hilbert spaces the corresponding ket should exist; what really happens is that we cannot construct it as an element of Ex, which is perfectly understandable since such elements are out of our Hilbert space.
Notice that we have indeed extended the concept of inner product and we have applied it to elements out of our Hilbert
space. For practical reasons it is usual to associate the bras hξx0 | ∈ Ex∗ to the “generalized ket” |ξx0 i that are not physical
states but are advantageous from the practical point of view.
Another example is the continuous basis consisting of plane waves truncated outside an interval of width L
\[
v_{p_0}^{(L)}(x) = \frac{1}{\sqrt{2\pi\hbar}}\, e^{i p_0 x/\hbar} \quad ; \quad -\frac{L}{2} \le x \le \frac{L}{2}
\]
with the function v_{p0}^{(L)}(x) going rapidly to zero outside that interval, while keeping continuity and differentiability. The associated ket is denoted by |v_{p0}^{(L)}⟩
\[
v_{p_0}^{(L)}(x) \in \text{̥}_x \;\leftrightarrow\; | v_{p_0}^{(L)} \rangle \in E_x
\]

the square of the norm is ∼ L/2πℏ, which diverges as L → ∞. Therefore
\[
\lim_{L \to \infty} | v_{p_0}^{(L)} \rangle \notin E_x
\]

now we consider the limit of the bra ⟨v_{p0}^{(L)}| associated with |v_{p0}^{(L)}⟩ and applied to an arbitrary vector |ψ⟩ ∈ Ex
\[
\langle v_{p_0}^{(L)} | \psi \rangle = \left( v_{p_0}^{(L)}, \psi \right) \simeq \frac{1}{\sqrt{2\pi\hbar}} \int_{-L/2}^{L/2} dx\; e^{-i p_0 x/\hbar}\, \psi(x)
\]
in the limit L → ∞ we find ψ̄ (p0), i.e. the Fourier transform of ψ (x) evaluated at p = p0. From this we see that the inner product converges and is well-defined
\[
\lim_{L \to \infty} \langle v_{p_0}^{(L)} | \equiv \langle v_{p_0} | \in E_x^*
\]
but it does not correspond to a ket given by the limit of the kets |v_{p0}^{(L)}⟩.
E
We could view the results above from the following point of view: the ket |ξ_{x0}⟩ means the ket given by ξ_{x0}^{(ε)} with ε much smaller than any other length involved in the problem, so we are really working in Ex. The results obtained at the end depend very little on ε as long as it is much smaller than any other length in the problem. Certainly, the set {ξ_{x0}^{(ε)}} does not form an orthonormal basis and does not satisfy a closure relation for ε ≠ 0, but it approaches the orthonormality and closure conditions as ε becomes very small.
The introduction of generalized kets ensures that we balance bras and kets in the limits discussed above. Generalized kets do not have finite norm, but they can acquire a finite inner product with the kets of our space of states.

4.4 The action of linear operators in Dirac notation


Linear operators are characterized easily in Dirac notation

|ψ ′ i = A |ψi ; |ψi , |ψ ′ i ∈ Ex
A (α |ψi + β |ϕi) = αA |ψi + βA |ϕi

the product of operators writes


AB |ψi = A (B |ψi)

it is also important to calculate the inner product between |ϕi and |ψ ′ i = A |ψi in the form

(|ϕi , |ψ ′ i) = (|ϕi , A |ψi) = hϕ| (A |ψi)

this is usually denoted simply as


hϕ| (A |ψi) ≡ hϕ| A |ψi ≡ hϕ| Aψi

4.5 Projectors
The simplest of all projectors are the ones in which the ranges are one dimensional subspaces of the Hilbert space. Let {|ψi}
be the one dimensional space spanned by the single non-zero ket |ψi. The projector P|ψi takes an arbitrary ket |ϕi ∈ Ex and
maps it into {|ψi} i.e.
P|ψi |ϕi = α |ψi ; α ≡ hψ| ϕi
in Dirac notation it could be written as
P|ψi ≡ |ψi hψ| ; P|ψi |ϕi = (|ψi hψ|) |ϕi = |ψi hψ| ϕi = α |ψi (4.3)
the most important property of a projector is idempotence, so that
\[
P_{|\psi\rangle}^2 \equiv (|\psi\rangle\langle\psi|)(|\psi\rangle\langle\psi|) = |\psi\rangle \langle\psi|\psi\rangle \langle\psi| = P_{|\psi\rangle} \;\Rightarrow\; \langle\psi|\psi\rangle = 1
\]
so the definition of P_{|ψ⟩} in Eq. (4.3) as a projector is consistent only if |ψ⟩ is normalized.
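In a finite-dimensional “ket” space the statement is easy to check numerically: the outer product matrix representing |ψ⟩⟨ψ| is idempotent precisely when ⟨ψ|ψ⟩ = 1. A minimal Python/NumPy sketch (illustration only) follows.

import numpy as np

rng = np.random.default_rng(5)
psi = rng.normal(size=4) + 1j * rng.normal(size=4)

P_raw = np.outer(psi, psi.conj())                 # |psi><psi| with psi not normalized
print(np.allclose(P_raw @ P_raw, P_raw))          # False

psi /= np.linalg.norm(psi)                        # normalize: <psi|psi> = 1
P = np.outer(psi, psi.conj())
print(np.allclose(P @ P, P))                      # True: idempotent
print(np.allclose(P, P.conj().T))                 # True: hermitian, i.e. an orthogonal projector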
Now we can write the projector onto a subspace of more than one dimension. If nj is the dimension of the subspace
(nj )
Mj ⊆ Ex we can define the projector from a complete orthonormal set
 i
u j ; i = 1, .., nj (4.4)
that spans such a subspace

(n1 ) (nj )
Ex = M1 ⊕ . . . ⊕ Mj ⊕ ...
x = x1 + . . . + xj + . . .
n1 nj
X (1)
X (j)
x = αi ui1 + . . . + αi uij + . . .
i=1 i=1
(n) 
αk ≡ ukn , x

nj
X (j)
PMj x = xj = αi uij
i=1
nj
X 
PMj x = uij , x uij
i=1

in Dirac notation it reads
\[
P_{M_j} |x\rangle = \sum_{i=1}^{n_j} \langle u_j^i | x \rangle\, | u_j^i \rangle = \sum_{i=1}^{n_j} | u_j^i \rangle \langle u_j^i | x \rangle
\]
thus a direct notation for the projector is
\[
P_{M_j} \equiv \sum_{i=1}^{n_j} | u_j^i \rangle \langle u_j^i | \tag{4.5}
\]
it is clear that this is a projector as long as Eq. (4.4) defines an orthonormal set that spans M_j^{(n_j)} of dimension n_j.
\[
P_{M_j}^2 = \left( \sum_{i=1}^{n_j} | u_j^i \rangle \langle u_j^i | \right) \left( \sum_{k=1}^{n_j} | u_j^k \rangle \langle u_j^k | \right)
= \sum_{i=1}^{n_j} \sum_{k=1}^{n_j} | u_j^i \rangle \langle u_j^i | u_j^k \rangle \langle u_j^k |
= \sum_{i=1}^{n_j} \sum_{k=1}^{n_j} | u_j^i \rangle\, \delta_{ik}\, \langle u_j^k |
= \sum_{i=1}^{n_j} | u_j^i \rangle \langle u_j^i | = P_{M_j}
\]

If we have an observable A, its spectrum of eigenvectors forms a basis and we can construct a complete orthonormal set. In
that case, the spectral theorem (assuming it can be extended to infinite dimension for observables) says that the identity and
the observable A itself can be decomposed by means of the projectors built on each eigensubspace of the observable, if Mi is
the eigensubspace generated by the eigenvalue λi of A we have that
Ex = M1 ⊕ . . . ⊕ Mi ⊕ . . .
x = x1 + . . . + xi + . . .
Pi x = xi

in Dirac notation we have
\[
P_i = \sum_{j=1}^{n_i} | u_i^j \rangle \langle u_i^j |
\]
the spectral theorem says that
\[
\sum_i P_i = \sum_{i=1}^{\infty} \sum_{j=1}^{n_i} | u_i^j \rangle \langle u_i^j | = I \tag{4.6}
\]
\[
\sum_i \lambda_i P_i = \sum_{i=1}^{\infty} \sum_{j=1}^{n_i} \lambda_i\, | u_i^j \rangle \langle u_i^j | = A \tag{4.7}
\]
these forms will be applied frequently in quantum mechanics. Notice that Eq. (4.6) is valid if and only if {u_i^j} is a complete orthonormal set. Thus the decomposition of the identity into projectors is usually taken as the closure relation for the basis (or hyperbasis) in which we are working.
It is also usual to work with a more general type of projector of the form

P = |ψi hϕ| (4.8)

applying an arbitrary vector on it we find


|ψi hϕ| χi = α |ψi ; α ≡ hϕ| χi
this is a projector onto the one-dimensional subspace {|ψ⟩}. This operator is idempotent only if ⟨ϕ|ψ⟩ = 1; moreover, it defines a non-orthogonal projection, since we shall see later that this operator is not self-adjoint or hermitian.

4.6 Hermitian conjugation


We have defined the action of a linear operator on a ket. We see that it induces a natural action of the operator on the bra

f|ϕi (A |ψi) = (|ϕi , A |ψi) ≡ gA|ϕi (|ψi) ∀ |ψi ∈ Ex (4.9)

the definition of the new functional gA|ϕi from a given f|ϕi and a given A is written in Dirac notation as2

A
f|ϕi ≡ hϕ| → gA|ϕi ≡ hϕ| A (4.10)

and Eq. (4.9) is written as


hϕ| (A |ψi) = (hϕ| A) (|ψi) (4.11)
so it is written simply as
hϕ| A |ψi
we should check that g is indeed a functional i.e. that it is a continuous linear mapping of the vectors into the complex
numbers, the basic properties of functionals are reproduced

gαA|ϕi+βA|χi (ψ) = α∗ gA|ϕi (|ψi) + β ∗ gA|χi (|ψi)


gA|ϕi (α |ψi + β |χi) = αgA|ϕi (|ψi) + βgA|ϕi (|χi)

Further, the association (4.10) is linear, to see it, we write a linear combination of bras

hϕ| = λ1 hϕ1 | + λ2 hϕ2 | (4.12)

which means that


hϕ| ψi = λ1 hϕ1 | ψi + λ2 hϕ2 | ψi ; ∀ |ψi ∈ Ex
then

(hϕ| A) (|ψi) = hϕ| (A |ψi) = (λ1 hϕ1 | + λ2 hϕ2 |) (A |ψi)


= λ1 hϕ1 | (A |ψi) + λ2 hϕ2 | (A |ψi)
= λ1 (hϕ1 | A) |ψi + λ2 (hϕ2 | A) |ψi
2 Notice that g
A|ψi is a new functional induced from f|ϕi and A. Of course gA|ψi must be associated to some vector i.e. gA|ψi = f|χi for some
|χi in our vector space, but it does not concern us. In particular, it is very important to observe that gA|ψi 6= fA|ψi .

since ψ is arbitrary we find


hϕ| A = λ1 hϕ1 | A + λ2 hϕ2 | A
notice that it is different to start with a linear combination of kets

|ϕi = λ1 |ϕ1 i + λ2 |ϕ2 i (4.13)

from starting with the same linear combination of bras Eq. (4.12). Because the bra associated with Eq. (4.13) is given by

hϕ| = λ∗1 hϕ1 | + λ∗2 hϕ2 |

which differs from Eq. (4.12) owing to the antilinearity of the mapping described by Eq. (2.39), in page 28. The order is
important, the new bra induced from hϕ| by the operator A is written as hϕ| A and not in the form A hϕ|. For instance, if we
apply these relations to a ket the first expression hϕ| A |ψi is a complex number, while the second A hϕ| ψi = αA is another
operator.

4.6.1 The adjoint operator A† in Dirac notation


In Dirac notation we write |ψ ′ i = A |ψi ≡ |Aψi. We now want to know what is the corresponding bra |ψ ′ i ↔ hψ ′ | ≡ hAψ|. In
mathematical notation the question is

|ψi → f|ψi ; |ψ ′ i = A |ψi ≡ |Aψi ⇒


?
|ψ ′ i → f|ψ′ i

to elucidate the answer we apply an arbitrary vector |ϕi to the functional we want to find

fA|ψi (|ϕi) = f|ψ′ i (|ϕi) = hψ ′ |ϕi = hAψ| ϕi = hψ| A† ϕi

where we have applied property (2.44). Now we apply property (4.11) to get
 
f|ψ′ i (|ϕi) = hψ| A† ϕ = hψ| A† (|ϕi)

since this is valid for |ϕi arbitrary we find


f|ψ′ i ≡ hψ ′ | = hψ| A†
in Dirac notation we have then

|ψ ′ i = A |ψi ≡ |Aψi

hψ | = hψ| A† ≡ hAψ|

notice that as before, the mapping of the dual space into itself is denoted with the operator defined on the right-hand side and
not on the left3 . Further by assigning A = λI and taking into account that A† = λ∗ I we have that

\[
\langle \psi' | = \langle \lambda\psi | = \langle \lambda I \psi | = \langle \psi |\, (\lambda I)^{\dagger} = \langle \psi |\, \lambda^* I \;\Rightarrow\; \langle \lambda\psi | = \lambda^* \langle \psi |
\]

in agreement with Eq. (4.1). On the other hand since

hψ ′ | ϕi = hϕ| ψ ′ i∗

we see that
\[
\langle \psi | A^{\dagger} | \varphi \rangle \equiv \langle \psi | A^{\dagger}\varphi \rangle = \langle A\psi | \varphi \rangle = \langle \varphi | A\psi \rangle^*
\]
obtaining finally
\[
\langle \psi | A^{\dagger} | \varphi \rangle = \langle \varphi | A | \psi \rangle^* \tag{4.14}
\]
and we recall the most important properties of the adjoint operators (see Eqs. (2.43))
\[
\left( A^{\dagger} \right)^{\dagger} = A \quad , \quad (\alpha A + \beta B)^{\dagger} = \alpha^* A^{\dagger} + \beta^* B^{\dagger} \tag{4.15}
\]
\[
(AB)^{\dagger} = B^{\dagger} A^{\dagger} \tag{4.16}
\]
3 Stricktly speaking, a mapping of the dual (or conjugate) space into itself is carried out by the conjugate operator instead of the adjoint operator

since the latter maps the Hilbert space into itself and not the dual (see Sec. 2.9.3). Notwithstanding, from the practical point of view this subtlety
is irrelevant.

4.6.2 Mathematical objects and hermitian conjugation in Dirac notation


In general, the order of bras, kets and operators is of major importance, the only objects we can put in any order are scalars,
for instance the mathematical objects

λ hϕ| B |ψi ; λ hψ| B |ϕi ; λ hψ| ϕiB ; λ |ψi hϕ| B (4.17)

are all distinct each other, the first and second are (different) complex numbers, while the last two are (different) operators,
as can be verified by applying an arbitrary vector on the right-hand side of these objects. However, expressions like

λ |ψi hϕ| B ; |ψi λ hϕ| B ; |ψi hϕ| λB ; |ψi hϕ| Bλ

are all equal, indeed we could think about the multiplication by a scalar as equivalent to the operator λI which commutes
with everything.
We shall now define a useful operation that we call hermitian conjugation. Our basic objects are kets, bras, operators and
scalars. In general words, hermitian conjugations are mappings induced by the existence of the dual E ∗ of our Hilbert space
E. These mappings posses the following features

1. A ket |ψi ∈ E is naturally mapped into a bra hψ| ∈ E ∗ .

2. A bra hψ| ∈ E ∗ is naturally mapped into an element of the conjugate space of E ∗ , i.e on E ∗∗ . However, for Hilbert spaces
it can be shown that E ∗∗ = E hence the bra is mapped into its corresponding ket4 .

3. An operator A in ß(E) is mapped naturally into the conjugate vector A∗ in ß(E ∗ ) but the inner product structure permits
in turn to define another operator A† in ß(E) from A∗ and from the practical point of view we regard A∗ and A† as
identical. Thus the hermitian conjugation in this case will be the mapping A → A† .

4. Now finally for scalars. Taking into account that for all practical uses scalars λ can be considered as operators in ß(E) of

the form λI, we see that the natural hermitian conjugation gives λI → (λI) = λ∗ . Therefore, the natural conjugation

operation is λ → λ .

5. We notice now that the hermitian conjugation reverses the order of the objects to which it is applied. We have seen that

(A |ψi) = hψ| A† , Eq. (4.16) shows that the order of a product of operators is reversed when we apply the “adjointness”
(or hermitian conjugation) on that product, when scalars are involved the place in which scalars are located is irrelevant.

By the same token, let us see what the hermitian conjugate of the non-orthogonal projection defined in (4.8) is
\[
P = |\psi\rangle\langle\varphi| \quad ; \quad P^{\dagger} = \left( |\psi\rangle\langle\varphi| \right)^{\dagger}
\]
applying Eq. (4.14) we find
\[
\langle \chi | \left( |\psi\rangle\langle\varphi| \right)^{\dagger} | \eta \rangle = \left[ \langle \eta | \left( |\psi\rangle\langle\varphi| \right) | \chi \rangle \right]^* = \langle \eta | \psi \rangle^* \langle \varphi | \chi \rangle^* = \langle \chi | \varphi \rangle \langle \psi | \eta \rangle = \langle \chi | \left( |\varphi\rangle\langle\psi| \right) | \eta \rangle \quad ; \quad \forall\, |\eta\rangle, |\chi\rangle \in E
\]
then we have
\[
\left( |\psi\rangle\langle\varphi| \right)^{\dagger} = |\varphi\rangle\langle\psi| \tag{4.18}
\]

once again, the hermitian conjugation converts each object in its hermitian conjugate and reverse the order of such objects.
These observations permit to give a rule to obtain the hermitian conjugate of a mathematical object composed by a
juxtaposition of bras, kets, operators and scalars. The rule is

1. Replace each object by its hermitian conjugate

|ψi → hψ| , hϕ| → |ϕi , A → A† , λ → λ∗

2. Reverse the order of the factors, taking into account that the position of the scalars is not relevant.
4 In Banach spaces, the property B ∗∗ = B is called reflexibity and is not in general satisfied. For Hilbert spaces, reflexibity is automatic from

which we can assign the dual element of a dual element to the original vector. This is another satisfying property of Hilbert spaces, not accomplished
by general Banach spaces.

For example, the hermitian conjugates of the objects defined in (4.17) are given by
\[
\left[ \lambda \langle \varphi | B | \psi \rangle \right]^{\dagger} = \langle \psi | B^{\dagger} | \varphi \rangle\, \lambda^* = \lambda^* \langle \psi | B^{\dagger} | \varphi \rangle = \left[ \lambda \langle \varphi | B | \psi \rangle \right]^*
\]
\[
\left[ \lambda \langle \psi | B | \varphi \rangle \right]^{\dagger} = \langle \varphi | B^{\dagger} | \psi \rangle\, \lambda^* = \lambda^* \langle \varphi | B^{\dagger} | \psi \rangle = \left[ \lambda \langle \psi | B | \varphi \rangle \right]^*
\]
\[
\left[ \lambda \langle \psi | \varphi \rangle B \right]^{\dagger} = B^{\dagger} \langle \varphi | \psi \rangle\, \lambda^* = \lambda^* \langle \varphi | \psi \rangle B^{\dagger} = \left( \lambda \langle \psi | \varphi \rangle \right)^* B^{\dagger}
\]
\[
\left[ \lambda\, |\psi\rangle\langle\varphi|\, B \right]^{\dagger} = B^{\dagger} |\varphi\rangle\langle\psi|\, \lambda^* = \lambda^* B^{\dagger} |\varphi\rangle\langle\psi| = \lambda^* B^{\dagger} \left[ |\psi\rangle\langle\varphi| \right]^{\dagger}
\]
where we have used Eq. (4.14). In the first two expressions the original mathematical objects are scalars and hence the
hermitian conjugates are also scalars (the complex conjugates of the original scalars). In the third expression the original
object is an operator multiplied by a scalar, thus its hermitian conjugate is also an operator multiplied by a scalar (the adjoint
of the original operator and the complex conjugate of the scalar). In the fourth expression, the original object is a product of
two operators and a scalar (a scalar times a projection times the operator B) and its adjoint is the product of the conjugate
scalar with the adjoint of each of the operators in reverse order. In each case, the scalars are located in the most convenient
place since their positions are unimportant. Indeed, we can put the conjugate of the scalars in any place, for instance in the
case
† † ∗
[λ |χi hψ| B |ϕi] = [λ hψ| B |ϕi |χi] = λ∗ hψ| B |ϕi hχ|
that coincides with the rules when we take into account Eq. (4.14).
It is important to see that according to (4.18) the projectors given by (4.3) are hermitian, thus according to theorem
2.49, they are orthogonal projectors (i.e. projectors in the sense of a Hilbert space), this in turn says that the sums in (4.5)
are also orthogonal projectors (see theorem 2.55). On the other hand, the projectors described by (4.8) with |ϕi = 6 |ψi are
non-hermitian and consequently they are non-orthogonal projections.

4.7 Theory of representations of E in Dirac notation


For most of our purposes we shall use a representation with respect to orthonormal bases. The particular problem suggests
the particular basis to work with. Most of the developments here are not new but gives us a very good opportunity of using
the Dirac notation and be aware of its great advantages as a tool for calculations. We are going to describe the representation
theory in both discrete and continuous bases.

4.7.1 Orthonormalization and closure relation


In Dirac notation, the orthonormality of a set of discrete {|ui i} or continuous {|wα i} orthonormal kets is expressed by
hui |uj i = δij ; hwα |wα′ i = δ (α − α′ )
we emphasize once again that hwα |wα i diverges so that |wα i does not have a bounded norm and thus it does not belong to
our state space. We call |wα i generalized kets because they can be used to expand any ket of our state space.
A discrete set {|u_i⟩} or a continuous one {|w_α⟩} constitutes a basis if each ket |ψ⟩ of our state space can be expanded in a unique way on each of these sets
\[
|\psi\rangle = \sum_i c_i\, |u_i\rangle \quad ; \quad |\psi\rangle = \int d\alpha\; c(\alpha)\, |w_\alpha\rangle \tag{4.19}
\]
the problem is considerably simplified if we assume that the bases are orthonormal, because in that case we can extract the coefficients by applying a bra ⟨u_k| or ⟨w_{α′}| on both sides of these equations
\[
\langle u_k | \psi \rangle = \langle u_k | \sum_i c_i |u_i\rangle = \sum_i c_i \langle u_k | u_i \rangle = \sum_i c_i\, \delta_{ki} = c_k
\]
\[
\langle w_{\alpha'} | \psi \rangle = \langle w_{\alpha'} | \int d\alpha\; c(\alpha) |w_\alpha\rangle = \int d\alpha\; c(\alpha) \langle w_{\alpha'} | w_\alpha \rangle = \int d\alpha\; c(\alpha)\, \delta(\alpha - \alpha') = c(\alpha')
\]
from which we obtain the familiar result
\[
c_k = \langle u_k | \psi \rangle \quad ; \quad c(\alpha') = \langle w_{\alpha'} | \psi \rangle \tag{4.20}
\]
replacing the Fourier coefficients (4.20) in the expansions (4.19) we find
\[
|\psi\rangle = \sum_i \langle u_i | \psi \rangle\, |u_i\rangle = \sum_i |u_i\rangle \langle u_i | \psi \rangle = \left( \sum_i |u_i\rangle \langle u_i | \right) |\psi\rangle
\]
\[
|\psi\rangle = \int d\alpha\; \langle w_\alpha | \psi \rangle\, |w_\alpha\rangle = \int d\alpha\; |w_\alpha\rangle \langle w_\alpha | \psi \rangle = \left( \int d\alpha\; |w_\alpha\rangle \langle w_\alpha | \right) |\psi\rangle
\]
since this is valid for any ket |ψ⟩ ∈ E, the operators in parentheses must be the identity operator on E
\[
P_{\{u_i\}} \equiv \sum_i |u_i\rangle \langle u_i | = I \quad ; \quad P_{\{w_\alpha\}} \equiv \int d\alpha\; |w_\alpha\rangle \langle w_\alpha | = I \tag{4.21}
\]

we can reverse the steps and show that, by applying the identity in the form given by Eqs. (4.21), any |ψ⟩ ∈ E must be a unique linear combination of {|u_i⟩} or {|w_α⟩}
\[
|\psi\rangle = I |\psi\rangle = P_{\{u_i\}} |\psi\rangle = \left( \sum_i |u_i\rangle \langle u_i | \right) |\psi\rangle = \sum_i |u_i\rangle \langle u_i | \psi \rangle
\;\Rightarrow\; |\psi\rangle = \sum_i c_i |u_i\rangle \;;\; c_i \equiv \langle u_i | \psi \rangle \tag{4.22}
\]
\[
|\psi\rangle = I |\psi\rangle = P_{\{w_\alpha\}} |\psi\rangle = \left( \int d\alpha\; |w_\alpha\rangle \langle w_\alpha | \right) |\psi\rangle = \int d\alpha\; |w_\alpha\rangle \langle w_\alpha | \psi \rangle
\;\Rightarrow\; |\psi\rangle = \int d\alpha\; c(\alpha) |w_\alpha\rangle \;;\; c(\alpha) \equiv \langle w_\alpha | \psi \rangle
\]

these facts show that Eqs. (4.21) manifest a closure relation in Dirac notation. This is consistent with our discussion in Sec.
4.5 that led to Eq. (4.6), in which we saw that each element of the form |ui i hui | is a projector operator and Eqs. (4.21) are
decompositions of the identity in projectors5. In other words, the projector given by the sums in (4.21) has the whole space
as its range. In the case of the continuous basis, they are “hyperprojectors” but we shall call them projectors from now on.
Hence the representation of a ket |ψ⟩ in a discrete basis is given by the set of its Fourier coefficients {⟨u_i|ψ⟩}; it is usually written as a column matrix
\[
|\psi\rangle = \begin{pmatrix} \langle u_1 | \psi \rangle \\ \langle u_2 | \psi \rangle \\ \vdots \\ \langle u_i | \psi \rangle \\ \vdots \end{pmatrix}
= \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_i \\ \vdots \end{pmatrix}
\]
the representation of a ket |ψ⟩ in a continuous basis is given by the set of its Fourier transforms {⟨w_α|ψ⟩}; it is usually written as a “continuous column matrix”
\[
|\psi\rangle = \begin{pmatrix} \vdots \\ \langle w_\alpha | \psi \rangle \\ \vdots \end{pmatrix}
= \begin{pmatrix} \vdots \\ c(\alpha) \\ \vdots \end{pmatrix}
\]

the representation of a bra can be obtained by the same insertion of the identity as follows
\[
\langle \psi | = \langle \psi |\, I = \langle \psi |\, P_{\{u_i\}} = \sum_i \langle \psi | u_i \rangle \langle u_i | = \sum_i c_i^* \langle u_i | \quad ; \quad c_i = \langle u_i | \psi \rangle
\]
which can also be obtained by taking the hermitian conjugate of Eq. (4.22) and applying (4.1). For a continuous basis the process is similar
\[
\langle \psi | = \langle \psi |\, I = \langle \psi |\, P_{\{w_\alpha\}} = \int d\alpha\; \langle \psi | w_\alpha \rangle \langle w_\alpha | = \int d\alpha\; c^*(\alpha) \langle w_\alpha | \quad ; \quad c(\alpha) = \langle w_\alpha | \psi \rangle
\]
in matrix notation the bra is represented as a one-row matrix of the coefficients, in both the discrete and continuous cases
\[
\langle \psi | = \begin{pmatrix} \langle \psi | u_1 \rangle & \langle \psi | u_2 \rangle & \cdots & \langle \psi | u_i \rangle & \cdots \end{pmatrix}
= \begin{pmatrix} c_1^* & c_2^* & \cdots & c_i^* & \cdots \end{pmatrix}
\]
5 In Eq. (4.6) the lower index labels the eigenvalue and the upper index indicates the degree of degeneracy of the given eigenvalue. In Eq. (4.21)

the single index runs over all different eigenvectors.




\[
\langle \psi | = \begin{pmatrix} \cdots & c^*(\alpha) & \cdots \end{pmatrix}
\]
by comparing with the representation of the corresponding ket |ψ⟩, we see that the representation of the bra is obtained by transposing the matrix representative of the ket (i.e. converting the column into a row) and taking the complex conjugate of each element.
Let us reproduce the inner product expressions (3.70) and (3.81) by insertion of the identity with projectors
X
hϕ| ψi = hϕ| I |ψi = hϕ| P{ui } |ψi = hϕ| ui ihui |ψi
i
X
hϕ| ψi = b∗i ci ; bi = hui | ϕi ; ci = hui |ψi
i

Z
hϕ| ψi = hϕ| I |ψi = hϕ| P{wα } |ψi = dα hϕ| wα ihwα |ψi
Z
hϕ| ψi = dα b∗ (α) c (α) ; b (α) = hwα | ϕi ; c (α) = hwα |ψi

in matrix form we can see the inner product as the product of a row matrix times a column matrix
\[ \langle\varphi|\psi\rangle = \begin{pmatrix} b_1^* & b_2^* & \cdots & b_i^* & \cdots \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_i \\ \vdots \end{pmatrix} = \sum_i b_i^*\, c_i \]
in continuum form we have
\[ \langle\varphi|\psi\rangle = \begin{pmatrix} \cdots & b^*(\alpha) & \cdots \end{pmatrix} \begin{pmatrix} \vdots \\ c(\alpha) \\ \vdots \end{pmatrix} = \int d\alpha\, b^*(\alpha)\, c(\alpha) \]
and the norms are obtained by setting φ = ψ, i.e. b_i = c_i or b(α) = c(α)
\[ \langle\psi|\psi\rangle = \|\psi\|^2 = \sum_i |c_i|^2 = \int d\alpha\, |c(\alpha)|^2 \]
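These discrete-basis formulas can be checked numerically. The following minimal sketch (an illustration that is not part of the original text; it assumes Python with numpy, a finite-dimensional truncation of E, and a randomly generated orthonormal basis) verifies the closure relation, the expansion |ψ⟩ = Σ c_i |u_i⟩ and the inner product ⟨φ|ψ⟩ = Σ b_i* c_i.

import numpy as np

rng = np.random.default_rng(0)
n = 5  # finite-dimensional truncation, for illustration only

# random orthonormal basis {|u_i>}: columns of a unitary matrix
U, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
u = [U[:, i] for i in range(n)]

psi = rng.normal(size=n) + 1j * rng.normal(size=n)   # arbitrary ket |psi>
phi = rng.normal(size=n) + 1j * rng.normal(size=n)   # arbitrary ket |phi>

# Fourier coefficients c_i = <u_i|psi>, b_i = <u_i|phi>
c = np.array([np.vdot(ui, psi) for ui in u])
b = np.array([np.vdot(ui, phi) for ui in u])

# closure relation (4.21): sum_i |u_i><u_i| = I
P = sum(np.outer(ui, ui.conj()) for ui in u)
print(np.allclose(P, np.eye(n)))                               # True

# expansion (4.22) and inner product as a row matrix times a column matrix
print(np.allclose(sum(ci * ui for ci, ui in zip(c, u)), psi))  # True
print(np.allclose(np.vdot(phi, psi), np.sum(b.conj() * c)))    # True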

4.7.2 Representation of operators in Dirac notation


Let us see the representation of an operator A under a basis {ui } or {wα }. In Sec. 3.5.1 Eq. (3.25), we saw that a matrix
representative of A under the basis {ui } is given by
Aij = hui | Auj i = hui | A |uj i
and in a continuous basis
A (α, α′ ) = hwα | A |wα′ i
they are arranged in square matrices with a countably infinite or a continuous number of rows and columns
\[ A = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1j} & \cdots \\ A_{21} & A_{22} & \cdots & A_{2j} & \cdots \\ \vdots & \vdots & & \vdots & \\ A_{i1} & A_{i2} & \cdots & A_{ij} & \cdots \\ \vdots & \vdots & & \vdots & \end{pmatrix} \;;\qquad A = \begin{pmatrix} & \vdots & \\ \cdots & A(\alpha,\alpha') & \cdots \\ & \vdots & \end{pmatrix} \]
it is interesting to see the matrix representative of a product of operators, obtained by insertion of the identity
\[ (AB)_{ij} = \langle u_i|AB|u_j\rangle = \langle u_i|A\,I\,B|u_j\rangle = \langle u_i|A\,P_{\{u_i\}}\,B|u_j\rangle = \sum_k \langle u_i|A|u_k\rangle\langle u_k|B|u_j\rangle = \sum_k A_{ik}B_{kj} \]

which coincides with the algorithm for matrix multiplication developed in Sec. 3.1, Eq. (3.6). We can easily develop the matrix multiplication algorithm for continuous matrices
\[ (AB)(\alpha,\beta) = \langle w_\alpha|AB|w_\beta\rangle = \langle w_\alpha|A\,P_{\{w_\gamma\}}\,B|w_\beta\rangle = \int d\gamma\, \langle w_\alpha|A|w_\gamma\rangle\langle w_\gamma|B|w_\beta\rangle = \int d\gamma\, A(\alpha,\gamma)\,B(\gamma,\beta) \tag{4.23} \]

now let us see the matrix representative of the ket |ψ ′ i given by


A |ψi = |ψ ′ i
from the knowledge of the components of |ψi and A, in a given representation {ui }
X
|ψi = ci |ui i ; Aik = hui | A |uk i ; ci ≡ hui |ψi
i

The coordinates of |ψ′⟩ in this basis yield
X
c′i = hui |ψ ′ i = hui | A |ψi = hui | AI |ψi = hui | AP{ui } |ψi = hui | A |uk i huk | ψi
k
X
c′i = Aik ck (4.24)
k

so that X XX
|ψ ′ i = c′i |ui i = Aik ck |ui i
i i k
we can obtain this alternatively as
X X XX
|ψ ′ i = A |ψi = IAI |ψi = |ui i hui | A |uk i huk | ψi = |ui i hui | A |uk i huk | ψi
i k i k
X XX
|ψ ′ i = c′i |ui i = Aik ck |ui i
i i k

the transformation of the coefficients given in Eq. (4.24) can be displayed explicitly as
\[ \begin{pmatrix} c'_1 \\ c'_2 \\ \vdots \\ c'_i \\ \vdots \end{pmatrix} = \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1j} & \cdots \\ A_{21} & A_{22} & \cdots & A_{2j} & \cdots \\ \vdots & \vdots & & \vdots & \\ A_{i1} & A_{i2} & \cdots & A_{ij} & \cdots \\ \vdots & \vdots & & \vdots & \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_i \\ \vdots \end{pmatrix} \]
with a continuous basis {wα } we have
Z
c′ (α) = hwα | ψ ′ i = hwα | A |ψi = hwα | AI |ψi = hwα | AP{wα } |ψi = dβ hwα | A |wβ i hwβ |ψi
Z
c′ (α) = dβ A (α, β) c (β)

which is the continuous extension of multiplication of a matrix with a column vector.
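As a finite-dimensional numerical illustration (a sketch that is not part of the original text; it assumes Python with numpy, a randomly generated orthonormal basis and random operators), one can check that A_ij = ⟨u_i|A|u_j⟩ represents the operator, that (AB)_ij = Σ_k A_ik B_kj, and that the coordinates transform as c'_i = Σ_k A_ik c_k.

import numpy as np

rng = np.random.default_rng(1)
n = 4
# orthonormal basis {|u_i>} as columns of a unitary matrix
U, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))

A_op = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))  # abstract operator A
B_op = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))  # abstract operator B
psi = rng.normal(size=n) + 1j * rng.normal(size=n)

# matrix representatives A_ij = <u_i|A|u_j>, B_ij = <u_i|B|u_j>
A = U.conj().T @ A_op @ U
B = U.conj().T @ B_op @ U

# (AB)_ij = sum_k A_ik B_kj, obtained by inserting P_{u_i}
print(np.allclose(U.conj().T @ (A_op @ B_op) @ U, A @ B))      # True

# coordinates of |psi'> = A|psi>: c'_i = sum_k A_ik c_k, Eq. (4.24)
c = U.conj().T @ psi                 # c_i  = <u_i|psi>
c_prime = U.conj().T @ (A_op @ psi)  # c'_i = <u_i|A|psi>
print(np.allclose(c_prime, A @ c))   # True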


Let us see the representation of the bra hψ| A
XX
hψ| A = hψ| IAI = hψ| ui i hui | A |uj i huj |
i j
XX
= c∗i Aij huj |
i j

Therefore, the bra ⟨ψ|A is represented by the product of the row matrix that represents ⟨ψ| times the square matrix representing A, respecting the order
\[ \langle\psi|A = \begin{pmatrix} c_1^* & c_2^* & \cdots & c_i^* & \cdots \end{pmatrix} \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1j} & \cdots \\ A_{21} & A_{22} & \cdots & A_{2j} & \cdots \\ \vdots & \vdots & & \vdots & \\ A_{i1} & A_{i2} & \cdots & A_{ij} & \cdots \\ \vdots & \vdots & & \vdots & \end{pmatrix} \]

observe that the matrix product is not defined in the opposite order, thus we cannot give meaning to A hψ|.
In many cases, it is also interesting to calculate the element hϕ| A |ψi in terms of the coordinates of the bra and the ket
and in terms of the components of A. To do it, we insert an expansion of the identity twice
XX
hϕ| A |ψi = hϕ| IAI |ψi = hϕ| P{ui } AP{ui } |ψi = hϕ| ui i hui | A |uj i huj |ψi
i j
XX
hϕ| A |ψi = b∗i Aij cj ; bi = hui | ϕi, Aij = hui | A |uj i , cj = huj |ψi
i j

which in matrix form is written as a bilinear form
\[ \langle\varphi|A|\psi\rangle = \begin{pmatrix} b_1^* & b_2^* & \cdots & b_i^* & \cdots \end{pmatrix} \begin{pmatrix} A_{11} & A_{12} & \cdots & A_{1j} & \cdots \\ A_{21} & A_{22} & \cdots & A_{2j} & \cdots \\ \vdots & \vdots & & \vdots & \\ A_{i1} & A_{i2} & \cdots & A_{ij} & \cdots \\ \vdots & \vdots & & \vdots & \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_i \\ \vdots \end{pmatrix} \tag{4.25} \]
this is the natural way of superposing the representations of hϕ|, A, and |ψi respecting the order. The result is of course a
number. The extension for continuous bases is
Z Z
hϕ| A |ψi = hϕ| P{wα } AP{wβ } |ψi = dα dβ hϕ| wα i hwα | A |wβ i hwβ |ψi

and we obtain
Z Z
hϕ| A |ψi = dα dβ b∗ (α) A (α, β) c (β)

b (α) = hwα | ϕi ; A (α, β) = hwα | A |wβ i ; c (β) = hwβ |ψi


notice that Eq. (4.11) expresses the associativity of the matrix expressions given by Eq. (4.25).
Finally, the projection operator P = |ψi hψ| has matrix representative given by
Pij = hui | P |uj i = hui | ψihψ |uj i = ci c∗j
in matrix language it is written as
\[ |\psi\rangle\langle\psi| = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_i \\ \vdots \end{pmatrix} \begin{pmatrix} c_1^* & c_2^* & \cdots & c_j^* & \cdots \end{pmatrix} = \begin{pmatrix} c_1c_1^* & c_1c_2^* & \cdots & c_1c_j^* & \cdots \\ c_2c_1^* & c_2c_2^* & \cdots & c_2c_j^* & \cdots \\ \vdots & \vdots & & \vdots & \\ c_ic_1^* & c_ic_2^* & \cdots & c_ic_j^* & \cdots \\ \vdots & \vdots & & \vdots & \end{pmatrix} \]
this representation is particularly simple when P = |uk i huk | i.e. when the ket that forms the projector is part of the basis.
The matrix representation of the adjoint operator is obtained by using property (4.14)
\[ \left(A^\dagger\right)_{ij} = \langle u_i|A^\dagger|u_j\rangle = \langle u_j|A|u_i\rangle^* = A^*_{ji} \;;\qquad A^\dagger(\alpha,\beta) = \langle w_\alpha|A^\dagger|w_\beta\rangle = \langle w_\beta|A|w_\alpha\rangle^* = A^*(\beta,\alpha) \]
these results coincide with those obtained in Eq. (3.28). If A is hermitian then A = A† and
\[ A_{ij} = A^*_{ji} \;;\qquad A(\alpha,\beta) = A^*(\beta,\alpha) \tag{4.26} \]
in particular, applying these conditions for i = j or α = β we see that the diagonal elements of an hermitian matrix are real. These facts are valid only if the basis is orthonormal; otherwise the matrix representing the adjoint operator takes another form.
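These relations can be verified numerically in a finite-dimensional truncation (an illustrative sketch, not part of the text; it assumes Python with numpy and a random orthonormal basis):

import numpy as np

rng = np.random.default_rng(2)
n = 4
# orthonormal basis {|u_i>} as columns of a unitary matrix
U, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
A_op = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))   # abstract operator

A = U.conj().T @ A_op @ U                 # A_ij       = <u_i|A|u_j>
A_dag = U.conj().T @ A_op.conj().T @ U    # (A^dag)_ij = <u_i|A^dag|u_j>

# (A^dag)_ij = A_ji^* when the basis is orthonormal
print(np.allclose(A_dag, A.conj().T))     # True

# a hermitian operator has a hermitian matrix with real diagonal elements, Eq. (4.26)
H_op = A_op + A_op.conj().T
H = U.conj().T @ H_op @ U
print(np.allclose(H, H.conj().T))         # True
print(np.allclose(H.diagonal().imag, 0))  # True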

4.8 Change of representations


In a representation characterized by a given orthonormal basis {|ui i} the kets, bras and operators have some specific matrix
representatives. We want to write the matrix representative of these objects in a new orthonormal basis {|tk i} using the Dirac
notation6 .
6 This problem is a bit less general than the one treated in Sec. (3), because in that section the bases involved are not necessarily orthonormal.
However, in this case we are treating the problem in infinite dimension.



4.8.1 The transfer matrix


For future purposes we define the matrix S in the form
\[ S_{ik} \equiv \langle u_i|t_k\rangle \;;\qquad \left(S^\dagger\right)_{ki} = S^*_{ik} = \langle t_k|u_i\rangle \]
To give a geometrical meaning to S, define V_i^{(k)} ≡ S_ik and let V^{(k)} be the k-th column vector with components S_ik. Then it is clear that V^{(k)} is the matrix representative (column matrix) of the element |t_k⟩ in the basis {|u_i⟩}. We then construct a square matrix by putting these column vectors side by side
\[ S = \begin{pmatrix} V^{(1)} & V^{(2)} & \cdots \end{pmatrix} = \begin{pmatrix} S_{11} & S_{12} & \cdots \\ S_{21} & S_{22} & \cdots \\ \vdots & \vdots & \end{pmatrix} \]
We can also see that S is a unitary matrix
\[ \left(S^\dagger S\right)_{km} = \sum_i S^\dagger_{ki} S_{im} = \sum_i \langle t_k|u_i\rangle\langle u_i|t_m\rangle = \langle t_k|P_{\{u_i\}}|t_m\rangle = \langle t_k|t_m\rangle = \delta_{km} \]
\[ \left(S S^\dagger\right)_{ij} = \sum_k S_{ik} S^\dagger_{kj} = \sum_k \langle u_i|t_k\rangle\langle t_k|u_j\rangle = \langle u_i|P_{\{t_k\}}|u_j\rangle = \langle u_i|u_j\rangle = \delta_{ij} \]
consequently
\[ S^\dagger S = S S^\dagger = I \]
On the other hand, we will also require the closure and orthonormalization relations with both bases
X
P{ui } = |ui i hui | = I ; hui | uj i = δij
i
X
P{tk } = |tk i htk | = I ; htk | tm i = δkm
k

we shall soon see that S accounts for the transformation of coordinates and matrix representations when a change of basis is carried out. For this reason it is usually called the “transfer matrix”. We recall that the transfer matrix is guaranteed to be unitary only if both bases involved are orthonormal.

4.8.2 Transformation of the coordinates of a ket


The coordinates of a ket |ψi in the basis {|ui i} are hui | ψi ≡ |ψi(ui ) . To know the coordinates in the new basis htk | ψi, in terms
of the old ones, we insert the closure relation for {|uk i} in the element htk | ψi
X X †
htk | ψi = htk | ui i hui | ψi = Ski hui | ψi
i i
(t)
X † (u) (t)
ck = Ski ci ; c = S † c(u)
i

The inverse relation can be obtained by taking into account that S † = S −1


c(t) = S −1 c(u) ⇒ c(u) = Sc(t)
or alternatively by inserting an identity in the element hui | ψi
X X
hui | ψi = hui | tk i htk | ψi = Sik htk | ψi
k k
(u)
X (t) (u)
ci = Sik ck ; c = Sc(t)
k

4.8.3 Transformation of the coordinates of a bra


We insert the identity in the element hψ| tk i
X X
hψ| tk i = hψ| ui i hui | tk i = hψ| ui iSik
i i
∗(t)
X ∗(u)
ck = ci c∗(t) = e
Sik ⇒ e c∗(u) S
i

similarly
c∗(u) = e
e c∗(t) S †

4.8.4 Transformation of the matrix elements of an operator


We start with htk | A |tm i and insert two identities
XX X † (u)
htk | A |tm i = htk | IAI |tm i = htk | ui i hui | A |uj i huj |tm i = Ski Aij Sjm
i j i,j
(t)
X † (u) (t) † (u)
Akm = Ski Aij Sjm ; A =S A S (4.27)
i,j

and the inverse relation is obtained from


X X (t) †
huk | A |um i = huk | ti i hti | A |tj i htj |um i = Ski Aij Sjm
i,j i,j
(u)
X (t) †
Akm = Ski Aij Sjm ; A(u) = SA(t) S † (4.28)
i,j

or taking into account that S † = S −1 .
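The change-of-basis relations above can be checked numerically (an illustrative sketch, not part of the text; it assumes Python with numpy and two randomly generated orthonormal bases stored as the columns of unitary matrices):

import numpy as np

rng = np.random.default_rng(3)
n = 4
# two orthonormal bases {|u_i>} and {|t_k>}, stored as columns of unitary matrices
U, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
T, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))

# transfer matrix S_ik = <u_i|t_k>
S = U.conj().T @ T
print(np.allclose(S.conj().T @ S, np.eye(n)))    # S is unitary

psi = rng.normal(size=n) + 1j * rng.normal(size=n)
A_op = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

c_u = U.conj().T @ psi          # coordinates in {|u_i>}
c_t = T.conj().T @ psi          # coordinates in {|t_k>}
print(np.allclose(c_t, S.conj().T @ c_u))        # c^(t) = S^dag c^(u)

A_u = U.conj().T @ A_op @ U
A_t = T.conj().T @ A_op @ T
print(np.allclose(A_t, S.conj().T @ A_u @ S))    # A^(t) = S^dag A^(u) S, Eq. (4.27)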

4.9 Representation of the eigenvalue problem in Dirac notation


For a given observable A the eigenvalue problem reads
A |ψi = λ |ψi
we want to construct its matrix representation in a basis {ui }. We first multiply by a bra of the form hui | on both sides
hui | A |ψi = λhui |ψi
and insert an identity
X
hui | A |uj i huj |ψi = λhui |ψi
j
X
Aij cj = λci ; ci ≡ hui |ψi ; Aij ≡ hui | A |uj i
j

with ci and Aij the matrix elements of |ψi and A in the basis {ui }. This expression can be rewritten as
X
[Aij − λδij ] cj = 0
j

which is the well known expression for the eigenvalue problem in matrix form.
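In a finite-dimensional truncation this matrix eigenvalue problem can be solved with standard linear-algebra routines. The following sketch (illustrative only; it assumes Python with numpy and a randomly generated hermitian matrix) verifies that each eigenvector satisfies Σ_j [A_ij − λδ_ij] c_j = 0.

import numpy as np

rng = np.random.default_rng(4)
n = 4
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = M + M.conj().T             # a hermitian matrix A_ij in some orthonormal basis

lam, C = np.linalg.eigh(A)     # eigenvalues (ascending) and eigenvector columns

for k in range(n):
    c = C[:, k]
    # check that [A_ij - lambda delta_ij] c_j = 0
    print(np.allclose((A - lam[k] * np.eye(n)) @ c, 0))   # True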

4.9.1 C.S.C.O. in Dirac notation


Assume that a given set of observables {A_1, ..., A_m} forms a C.S.C.O. Then a given set of eigenvalues \(\{a^{(1)}_{n_1}, ..., a^{(m)}_{n_m}\}\) defines a unique normalized eigenvector common to all the observables (within a phase factor). We shall see later that any set of kets that differ in a global phase factor
\[ |\psi\rangle,\ e^{i\theta_1}|\psi\rangle,\ \ldots,\ e^{i\theta_k}|\psi\rangle \]
have the same physical information. Thus, the normalized ket associated with the set \(\{a^{(1)}_{n_1}, ..., a^{(m)}_{n_m}\}\) is unique from the physical point of view. Therefore, it is usual to denote the corresponding ket in the form |ψ_{n_1,...,n_m}⟩ or simply as |n_1, n_2, ..., n_m⟩, and the set of eigenvalues are called quantum numbers.
\[ A_i\, |n_1, \ldots, n_i, \ldots, n_m\rangle = a^{(i)}_{n_i}\, |n_1, \ldots, n_i, \ldots, n_m\rangle \;;\quad i = 1, \ldots, m \]

4.10 The continuous bases |ri and |pi


From the wave functions space ̥ we have constructed the abstract space Er such that there is an isometric isomorphism of
̥ onto Er , therefore they are abstractly identical as Hilbert spaces. Consequently, an element ψ (r) ∈ ̥ has a unique image
|ψi ∈ Er and vice versa. In particular, the inner product must be preserved by this correspondence
|ψi ↔ ψ (r) ; |ϕi ↔ ϕ (r) ; hψ| ↔ ψ ∗ (r) ; hϕ| ↔ ϕ∗ (r)
Z
(|ϕi , |ψi) = (ϕ, ψ) ≡ hϕ| ψi = d3 r ϕ∗ (r) ψ (r)

Er will describe the state space of a spinless particle. We have discussed before that ψ (r) can also be interpreted as a
representation of the abstract ket |ψi in the continuous basis {ξr (r′ )} defined in Eq. (3.87). We also saw that ξr (r′ ) are not
elements of ̥, but they can be used to expand any element of ̥ in a unique way. We call ξr (r′ ) “generalized wave functions”
and it is natural to associate with them some “generalized kets” denoted as |ri that do not belong to Er but can expand any
element of Er in such a way that if ψ (r) ↔ |ψi then the expansion of ψ (r) under ξr (r′ ) has the same coefficients as the
expansion of |ψi under |ri Z Z
ψ (r) = dr′ c (r′ ) ξr′ (r) ; |ψi = dr′ c (r′ ) |r′ i

We denote this association as ξr ↔ |ri. Similarly, for the continuous basis defined in Eq. (3.83) by {vp (r)} which has plane
waves as “generalized wave functions”, we shall have a continuous basis of Er denoted as |p0 i

ξr (r′ ) ↔ |ri ; vp (r) ↔ |pi

therefore, using the bases {ξr (r′ )} and {vp (r)} of ̥ we have defined two continuous basis in Er denoted as {|ri} and {|pi}.
Consequently, all bras, kets and operators in Er will have a continuous matrix representation in these bases. The basis {|ri} is
labeled by three continuous indices x, y, z which are the coordinates of a point in three dimensional space. Similarly, the basis
{|pi} is labeled by three continuous indices px , py , pz which are components of a cartesian vector.

4.10.1 Orthonormalization and closure relations


We shall calculate hr |r′ i using the definition of the scalar product in Er
Z Z
hr |r′ i = d3 r′′ ξr∗ (r′′ ) ξr′ (r′′ ) = d3 r′′ δ (r′′ − r) δ (r′′ − r′ )

hr |r′ i = δ (r − r′ ) (4.29)

similarly
\[ \langle p|p'\rangle = \int d^3r\, v_p^*(r)\, v_{p'}(r) = \left(\frac{1}{2\pi\hbar}\right)^3 \int d^3r\, e^{-ip\cdot r/\hbar}\, e^{ip'\cdot r/\hbar} = \left(\frac{1}{2\pi\hbar}\right)^3 \int d^3r\, e^{-i(p-p')\cdot r/\hbar} = \delta(p-p') \]

where we have used property (3.84). The closure relations for {|ri} and {|pi} are written according with the second of Eqs.
(4.21) integrating over three indices instead of one. The orthonormality and closure relations for these bases are then

hr |r′ i = δ (r − r′ ) ; hp |p′ i = δ (p − p′ ) (4.30)


Z Z
d3 r |ri hr| = I ; d3 p |pi hp| = I (4.31)

4.10.2 Coordinates of kets and bras in {|ri} and {|pi}


Consider an arbitrary ket |ψi corresponding to a wave function ψ (r). The closure relations for {|ri} and {|pi} permits to
expand |ψi as Z Z Z Z
|ψi = d3 r |ri hr| ψi = d3 r c (r) |ri ; |ψi = d3 p |pi hp| ψi = d3 p c̄ (p) |pi (4.32)

the coefficients c (r) = hr| ψi and c̄ (p) = hp| ψi are calculated as follows
\[ \langle r|\psi\rangle = \int d^3r'\, \xi_r^*(r')\,\psi(r') = \int d^3r'\, \delta(r'-r)\,\psi(r') = \psi(r) \]
\[ \langle p|\psi\rangle = \int d^3r\, v_p^*(r)\,\psi(r) = \left(\frac{1}{2\pi\hbar}\right)^{3/2} \int d^3r\, e^{-ip\cdot r/\hbar}\,\psi(r) = \bar\psi(p) \]

hence
c (r) = hr| ψi = ψ (r) ; c̄ (p) = hp| ψi = ψ̄ (p) (4.33)
the coefficients c (r) of the expansion of |ψi under {|ri} are the wave functions evaluated at the point r, this fact reinforces
the interpretation of the wave function as the representation of |ψi under the basis |ri. The coefficients c̄ (p) are the Fourier
transforms of the wave function; these coefficients ψ̄ (p) are usually called “wave functions in momentum space”. Since they
represent the same abstract vector |ψi, it is clear that ψ (r) and ψ̄ (p) contain the same physical information. This can also
be seen by taking into account that given ψ (r) then ψ̄ (p) is uniquely determined and vice versa. On the other hand, by
comparing Eqs. (4.32, 4.33) with Eqs. (3.88, 3.89) we see that if ψ (r) ↔ |ψi then the expansion of ψ (r) under ξr (r′ ) has the

same coefficients as the expansion of |ψi under |ri as we demanded. Similar situation occurs with the basis {vp } in ̥ and the
basis |pi in Er .
An important particular case arises when |ψi = |pi which is indeed a generalized ket. Assuming that all the relations above
are also valid for generalized kets, and taking into account that |pi ↔ vp (r), then Eq. (4.33) gives
\[ \langle r|p\rangle = v_p(r) = \left(\frac{1}{2\pi\hbar}\right)^{3/2} e^{ip\cdot r/\hbar} \tag{4.34} \]

the same result is obtained by taking into account the equality of the inner product of vectors in ̥ and vectors in Er when
this equality is extended to generalized vectors
Z Z
hr| pi = (|ri , |pi) = (ξr , vp ) = d r ξr (r ) vp (r ) = d3 r′ δ (r′ − r) vp (r′ ) = vp (r)
3 ′ ∗ ′ ′

applying Eq. (4.33) for |ψi = |r′ i ↔ ψ (r) = ξr′ (r) we find

hr| r′ i = ξr′ (r) = δ (r − r′ )

which is consistent with the orthonormalization relation. Similar arguments lead to
\[ \langle p|r\rangle = v_p^*(r) = \left(\frac{1}{2\pi\hbar}\right)^{3/2} e^{-ip\cdot r/\hbar} \;;\qquad \langle p|p'\rangle = \delta(p-p') \]

Assume that we have an orthonormal basis {ui (r)} in ̥ and an orthonormal basis {|ui i} in Er such that ui (r) ↔ |ui i.
Starting with the closure relation for {|ui i} in Er
X
|ui i hui | = I
i

and evaluating the matrix element of it between |ri and |r′ i we have
X
hr |ui i hui | r′ i = hr| I |r′ i = hr| r′ i
i

and using Eqs. (4.33, 4.30) we find


X
ui (r) u∗i (r′ ) = δ (r − r′ )
i

which is the closure relation as it was expressed in Eq. (3.77) for {ui (r)} in ̥, reversing the steps we can obtain the closure
relation for {|ui i} in Er starting from the closure relation for {ui (r)} in ̥7 .
Notice that the inner product of two kets in terms of their coordinates under the basis {|ri} is a particular case of Eq.
(3.81). Equivalently, we obtain it by insertion of the identity
Z
hϕ |ψi = d3 r hϕ |ri hr |ψi

and interpreting the components hϕ |ri and hr |ψi as in Eq. (4.33)


Z
hϕ |ψi = d3 r ϕ∗ (r) ψ (r)

a similar procedure can be done for the basis {|pi}


Z Z
hϕ |ψi = d p hϕ |pi hp |ψi = d3 p ϕ̄∗ (p) ψ̄ (p)
3

from which it is obtained Z Z


3 ∗
d r ϕ (r) ψ (r) = d3 p ϕ̄∗ (p) ψ̄ (p)

this is a well-known property of the Fourier transforms.
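This relation between ψ(r) and ψ̄(p), and the equality of the two norms, can be illustrated numerically in one dimension (a sketch that is not part of the text; it assumes Python with numpy, ħ = 1, an arbitrary Gaussian wave packet on a finite grid, and an FFT approximation of the Fourier transform):

import numpy as np

hbar = 1.0
N, L = 2048, 40.0
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
dx = x[1] - x[0]

# a normalized Gaussian wave packet with mean momentum p0
p0, sigma = 3.0, 1.0
psi = (1 / (np.pi * sigma**2) ** 0.25) * np.exp(-x**2 / (2 * sigma**2) + 1j * p0 * x / hbar)

# discrete approximation of psi_bar(p) = (2 pi hbar)^(-1/2) * integral dx e^{-i p x/hbar} psi(x)
p = 2 * np.pi * hbar * np.fft.fftfreq(N, d=dx)
psi_bar = np.fft.fft(psi) * dx * np.exp(-1j * p * x[0] / hbar) / np.sqrt(2 * np.pi * hbar)
dp = 2 * np.pi * hbar / (N * dx)

# Parseval: integral |psi(x)|^2 dx = integral |psi_bar(p)|^2 dp  (both ~ 1 here)
print(np.sum(np.abs(psi) ** 2) * dx, np.sum(np.abs(psi_bar) ** 2) * dp)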


7 Notice that I (r, r′ ) = hr′ | I |ri = hr′ | ri = δ (r − r′ ) shows that the Dirac delta can be seen as the representation of the identity under the

continuous hyperbasis {|ri}.



4.10.3 Changing from the {|ri} representation to {|pi} representation and vice versa
The procedure is similar to the one in section 4.8, but for continuous bases. If we consider the change from {|r⟩} to {|p⟩}, the unitary transfer matrix S for the change of basis is
\[ S(r,p) = \langle r|p\rangle = \left(\frac{1}{2\pi\hbar}\right)^{3/2} e^{ip\cdot r/\hbar} \tag{4.35} \]

a ket |ψi is represented as ψ (r) in {|ri} and we know well that in {|pi} it is given by ψ̄ (p). Here we see that it is consistent
with the formalism developed in Sec. 4.8

\[ \langle p|\psi\rangle = \int d^3r\, \langle p|r\rangle\langle r|\psi\rangle = \int d^3r\, S^\dagger(r,p)\, \langle r|\psi\rangle \]
\[ \bar\psi(p) = \left(\frac{1}{2\pi\hbar}\right)^{3/2} \int d^3r\, e^{-ip\cdot r/\hbar}\,\psi(r) \tag{4.36} \]
similarly
\[ \langle r|\psi\rangle = \int d^3p\, \langle r|p\rangle\langle p|\psi\rangle = \int d^3p\, S(r,p)\, \langle p|\psi\rangle \]
\[ \psi(r) = \left(\frac{1}{2\pi\hbar}\right)^{3/2} \int d^3p\, e^{ip\cdot r/\hbar}\,\bar\psi(p) \tag{4.37} \]

the representation of bras can be obtained by hermitian conjugation of the relations with kets.
Now, for a given operator, the matrix elements in {|p⟩} read A(p′, p) = ⟨p′|A|p⟩. Inserting two identities we get
\[ \langle p'|A|p\rangle = \int d^3r' \int d^3r\, \langle p'|r'\rangle\langle r'|A|r\rangle\langle r|p\rangle = \int d^3r' \int d^3r\, S^\dagger(r',p')\, A(r',r)\, S(r,p) \]
which is the continuous generalization of (4.27). Using (4.35) we find
\[ A(p',p) = \left(\frac{1}{2\pi\hbar}\right)^3 \int d^3r' \int d^3r\, e^{-i(p'\cdot r' - p\cdot r)/\hbar}\, A(r',r) \]
the inverse relation is obtained from
\[ \langle r'|A|r\rangle = \int d^3p' \int d^3p\, \langle r'|p'\rangle\langle p'|A|p\rangle\langle p|r\rangle = \int d^3p' \int d^3p\, S(r',p')\, A(p',p)\, S^\dagger(r,p) \]
this is the continuous generalization of (4.28). From (4.35) we find
\[ A(r',r) = \left(\frac{1}{2\pi\hbar}\right)^3 \int d^3p' \int d^3p\, e^{i(p'\cdot r' - p\cdot r)/\hbar}\, A(p',p) \]

4.10.4 The R and P operators


Let |ψi be an arbitrary ket of Er and ψ (r) = ψ (x, y, z) the corresponding wave function. We define an operator X in the
form8
|ψ ′ i = X |ψi
8 The operator X does not belong to ß(E ), because for some square integrable functions ψ (r), the function ψ ′ (r) defined in Eq. (4.38) is not
r
square integrable.

such that in the {|ri} representation the associated wave function ψ′ (r) = ψ′ (x, y, z) is given by

ψ ′ (x, y, z) = xψ (x, y, z) (4.38)

so in the {|ri} representation, it corresponds to the operator that multiplies the wave function by x. We should emphasize
however, that the operator X is defined on the Er state space. Eq. (4.38) can be expressed by

hr| X |ψi = hr| ψ ′ i = ψ ′ (r) = xψ (r) = xhr |ψi

Of course, we can introduce the operators Y and Z in a similar way

hr| X |ψi = xhr |ψi , hr| Y |ψi = yhr |ψi , hr| Z |ψi = zhr |ψi ; |ri = |x, y, zi (4.39)

we can consider X, Y, Z as the “components” of a “vector operator” R, by now it only means a condensed notation inspired
in the fact that x, y, z are the components of the ordinary vector r.
These operators can be easily manipulated in the {|ri} representation. For instance, the element hϕ| X |ψi can be calculated
as Z Z
hϕ| X |ψi = d r hϕ| ri hr| X |ψi = d3 r ϕ∗ (r) x ψ (r)
3

similarly, we define the operators Px , Py , Pz that forms the “vector operator” P, such that their action in the {|pi} represen-
tation is given by

hp| Px |ψi = px hp |ψi , hp| Py |ψi = py hp |ψi , hp| Pz |ψi = pz hp |ψi ; |pi = |px , py , pz i (4.40)

however, when we require to work with both operators simultaneously, we should choose only one basis. Hence, it is important
to know how the operator P acts in the {|ri} representation, and how the operator R acts in the {|pi} representation.
Let us first look for the way in which the operator P acts in the {|ri} representation. For this, we use Eqs. (4.33, 4.34,
4.40) to evaluate
\[ \langle r|P_x|\psi\rangle = \int d^3p\, \langle r|p\rangle\langle p|P_x|\psi\rangle = \int d^3p\, \langle r|p\rangle\, p_x\, \langle p|\psi\rangle = \left(\frac{1}{2\pi\hbar}\right)^{3/2} \int d^3p\, e^{ip\cdot r/\hbar}\, p_x\, \bar\psi(p) \tag{4.41} \]
to evaluate this term we start with the expression of the Fourier transform, Eq. (4.37)
\[ \psi(r) = \left(\frac{1}{2\pi\hbar}\right)^{3/2} \int d^3p\, e^{ip\cdot r/\hbar}\, \bar\psi(p) \]
\[ \frac{\partial\psi(r)}{\partial x} = \left(\frac{1}{2\pi\hbar}\right)^{3/2} \int d^3p\, \frac{\partial}{\partial x}\left[e^{ip\cdot r/\hbar}\right]\bar\psi(p) = \left(\frac{1}{2\pi\hbar}\right)^{3/2} \int d^3p\, \left(\frac{i}{\hbar}\,p_x\right) e^{ip\cdot r/\hbar}\, \bar\psi(p) \]
so that
\[ \frac{\hbar}{i}\frac{\partial\psi(r)}{\partial x} = \left(\frac{1}{2\pi\hbar}\right)^{3/2} \int d^3p\, p_x\, e^{ip\cdot r/\hbar}\, \bar\psi(p) \tag{4.42} \]
differentiating repeatedly we find
\[ \frac{\partial^n\psi(r)}{\partial x^n} = \left(\frac{1}{2\pi\hbar}\right)^{3/2} \int d^3p\, \left(\frac{i}{\hbar}\,p_x\right)^n e^{ip\cdot r/\hbar}\, \bar\psi(p) \]
replacing (4.42) in (4.41) we obtain
\[ \langle r|P_x|\psi\rangle = \frac{\hbar}{i}\frac{\partial\psi(r)}{\partial x} \]
and similarly for P_y, P_z. In vector form we summarize this as
\[ \langle r|P|\psi\rangle = \frac{\hbar}{i}\nabla\langle r|\psi\rangle \tag{4.43} \]
in the {|ri} representation, the operator P coincides with the differential operator acting on the wave functions. Let us
calculate hϕ| Px |ψi in the {|ri} representation
Z Z  
~ ∂
hϕ| Px |ψi = d3 r hϕ |ri hr| Px |ψi = d3 r ϕ∗ (r) ψ (r) (4.44)
i ∂x

of great importance are the commutators among the components Pi , Ri . We shall calculate them by applying an arbitrary ket
|ψi on such a commutator, and using the {|ri} representation. For instance

\[ \langle r|[X,P_x]|\psi\rangle = \langle r|(XP_x - P_xX)|\psi\rangle = \langle r|X|P_x\psi\rangle - \langle r|P_x|X\psi\rangle = x\,\langle r|P_x|\psi\rangle - \frac{\hbar}{i}\frac{\partial}{\partial x}\langle r|X|\psi\rangle \]
\[ = x\,\frac{\hbar}{i}\frac{\partial}{\partial x}\langle r|\psi\rangle - \frac{\hbar}{i}\frac{\partial}{\partial x}\big[x\,\langle r|\psi\rangle\big] = x\,\frac{\hbar}{i}\frac{\partial}{\partial x}\langle r|\psi\rangle - x\,\frac{\hbar}{i}\frac{\partial}{\partial x}\langle r|\psi\rangle - \frac{\hbar}{i}\langle r|\psi\rangle \]
so that
hr| [X, Px ] |ψi = i~ hr| ψi
since this is valid for any ket |ψi and any generalized ket |ri of the basis, we conclude that

[X, Px ] = i~I

it is usual to omit the identity operator since it is not important for practical calculations. In a similar way, we can calculate
the other commutators, to condense notation it is convenient to define

R1 ≡ X, R2 ≡ Y, R3 ≡ Z, P1 ≡ Px , P2 ≡ Py , P3 ≡ Pz

to write
[Ri , Rj ] = [Pi , Pj ] = 0 ; [Ri , Pj ] = i~δij (4.45)
they are called canonical commutation relations. These relations are intrinsic and should not depend on the basis in which we
derive them.
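The canonical commutation relation can also be checked numerically in the {|r⟩} representation (an illustrative sketch, not part of the text; it assumes Python with numpy, ħ = 1, a one-dimensional grid and an arbitrary smooth test function, with P_x realized as (ħ/i) d/dx through finite differences):

import numpy as np

hbar = 1.0
N, L = 4000, 40.0
x = np.linspace(-L / 2, L / 2, N)
dx = x[1] - x[0]
psi = np.exp(-x**2 / 2 + 2j * x)      # a smooth, rapidly decaying test function

def Px(f):
    # momentum operator in the {|r>} representation: (hbar/i) d/dx via central differences
    return (hbar / 1j) * np.gradient(f, dx)

commutator = x * Px(psi) - Px(x * psi)     # (X Px - Px X) |psi>

# away from the grid edges the result reproduces i*hbar*psi, Eq. (4.45)
inner = slice(10, -10)
print(np.allclose(commutator[inner], 1j * hbar * psi[inner], atol=1e-3))   # True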
We can show that R and P are hermitian operators. For example let us show that X is hermitian
Z Z Z ∗
3 3 ∗ 3 ∗
hϕ| X |ψi = d r hϕ |ri hr| X |ψi = d r ϕ (r) x ψ (r) = d r ψ (r) x ϕ (r)

hϕ| X |ψi = hψ| X |ϕi

since this is valid for arbitrary kets |ψi and |ϕi, and taking into account Eq. (4.14) we conclude that X = X † . For Px we see
that
Z Z Z ∗
3 3 ∗ 3 ∗
hϕ| Px |ψi = d p hϕ |pi hp| Px |ψi = d p ϕ̄ (p) px ψ̄ (p) = d p ψ̄ (p) px ϕ̄ (p)

hϕ| Px |ψi = hψ| Px |ϕi

and Px = Px† . The procedure is the same for the other components of R and P

R = R† , P = P†

There is an alternative proof of the hermiticity of P by using its action in the {|ri} representation given by Eq. (4.43).
Integrating Eq. (4.44) by parts we have
Z Z ∞  
~ ∂
hϕ| Px |ψi = dy dz dx ϕ∗ (r) ψ (r)
i −∞ ∂x
Z  Z ∞ 
~ x=∞ ∂ ∗
= dy dz [ϕ∗ (r) ψ (r)]x=−∞ − dx ψ (r) ϕ (r)
i −∞ ∂x

since the scalar product hϕ| ψi is convergent, ϕ∗ (r) ψ (r) approaches zero when x → ±∞. Hence the first term on the
right-hand side vanishes and we find
Z  Z ∗
~ ∂ ∗ ~ ∂
hϕ| Px |ψi = − d3 r ψ (r) ϕ (r) = d3 r ψ ∗ (r) ϕ (r)
i ∂x i ∂x

hϕ| Px |ψi = hψ| Px |ϕi

two things deserve attention, first the presence of the i factor is essential because i∂/∂x is hermitian but ∂/∂x is not. Second,
we have used explicitly the fact that |ψi and |ϕi belong to Er by assuming that the scalar product hϕ| ψi is convergent, so this
proof is not valid for generalized kets.

4.10.5 The eigenvalue problem for R and P


Let us calculate the matrix element X (r′ , r) of the operator X in the basis {|ri}
X (r′ , r) = hr′ | X |ri = x′ hr′ | ri = x′ δ (r − r′ ) = xδ (r − r′ ) = x hr′ | ri
hr′ | Xri = x hr′ | ri
so the components of the ket X |ri in the {|r′ i} representation are equal to the ones of the ket |ri = |x, y, zi multiplied by x
X |ri = x |ri
we proceed in the same way for Y and Z
X |ri = x |ri , Y |ri = y |ri , Z |ri = z |ri ; |ri = |x, y, zi
the kets |ri are eigenkets common to X, Y, Z. The set {|ri} of common eigenvectors of X, Y, Z forms a basis showing that
{X, Y, Z} is a complete set of commuting observables. On the other hand, the specification of the three eigenvalues x0 , y0 , z0
determines uniquely the “normalized” eigenvector |r0 i except for a phase eiθ . In the {|ri} representation the coordinates of
|r0 i are δ (x − x0 ) δ (y − y0 ) δ (z − z0 ). Therefore, the set {X, Y, Z} constitutes a C.S.C.O. in Er .
Analogous reasoning shows that for the commuting observables {Px , Py , Pz } the eigenvalues and eigenvectors are
Px |pi = px |pi , Py |pi = py |pi , Pz |pi = pz |pi ; |pi = |px , py , pz i
since {|pi} is a basis the operators Px , Py , Pz are observables. Because the set of eigenvalues (p0x , p0y , p0z ) determines uniquely
the vector |p0⟩, the set {Px, Py, Pz} constitutes a C.S.C.O. in Er.
It is worth pointing out that X is not a C.S.C.O. by itself in the Er state space, because when x0 is specified, y0 and z0 can take any real values. Therefore, x0 is an infinitely degenerate eigenvalue. Nevertheless, in the state space Ex of a particle in
one dimension, X constitutes a C.S.C.O. since the eigenvalue x0 determines uniquely the eigenvector |x0 i, and its coordinates
in the {|xi} representation are given by δ (x − x0 ).
It can also be shown that the set {X, Py , Pz } constitutes a C.S.C.O. since they commute with each other, and for a set of
eigenvalues {x0 , p0y , p0z } there is a unique eigenvector whose associated wave function is
1 i(p0y y+p0z z)/~
ψx0 ,p0y ,p0z (x, y, z) = δ (x − x0 ) e
2π~
of course, similar C.S.C.O. are built from the sets
{Y, Px , Pz } , {Z, Px , Py }

4.11 General properties of two conjugate observables


Two arbitrary observables Q and P are called conjugate if they obey the commutation rule
[Q, P ] = i~ (4.46)
such pairs of observables are frequently encountered in quantum mechanics. The position and momentum observables are good
examples, as can be seen in Eq. (4.45). However, in what follows all properties are derived from the commutation rule (4.46)
and the fact that Q and P are observables, regardless of the specific form of the operators. Let us define the operator S (λ) that
depends on a real parameter λ as
S (λ) = e−iλP/~ (4.47)
since P is observable and so hermitian, the operator S (λ) is unitary
S † (λ) = eiλP/~ = S −1 (λ) = S (−λ) (4.48)
since P obviously commutes with itself, Eq. (3.117) of page 69 leads to
S (λ) S (µ) = S (λ + µ) (4.49)
now we calculate the commutator [Q, S (λ)]. To do it, we take into account that [Q, P ] = i~ clearly commutes with Q and P ,
therefore we can apply theorem 3.21, Eq. (3.101) to obtain
 
\[ [Q, S(P)] = [Q,P]\, S'(P) = i\hbar\left(-\frac{i\lambda}{\hbar}\right) e^{-i\lambda P/\hbar} = \lambda S(P) \]
where we have written S (P ) instead of S (λ) to emphasize that when applying Eq. (3.101) we are considering S as a function
of the operator P (so the derivative is with respect to P ). Rewriting it in the old notation we have
[Q, S (λ)] = λS (λ) ⇒ QS (λ) − S (λ) Q = λS (λ)
QS (λ) = S (λ) [Q + λ] (4.50)

4.11.1 The eigenvalue problem of Q


Suppose that Q has a non-zero eigenvector |qi, with eigenvalue q

Q |qi = q |qi (4.51)

applying Eq. (4.50) on the vector |qi we have

QS (λ) |qi = S (λ) [Q + λ] |qi = S (λ) [q + λ] |qi


Q [S (λ) |qi] = [q + λ] [S (λ) |qi] (4.52)

therefore, S (λ) |qi is also an eigenvector of Q with eigenvalue q + λ. Note that S (λ) |qi is non-zero because S (λ) is unitary
so the norm of |qi is preserved. On the other hand, since λ can take any real value, we conclude that by starting with
an eigenvector of Q, we can construct another eigenvector of Q with any real eigenvalue by applying the appropriate S (λ).
Consequently, the spectrum of Q is continuous and consists of all real values.
Note that this result shows in particular that conjugate operators Q, P cannot exist in finite-dimensional vector spaces, since for the latter the spectrum must be finite. They do not even exist strictly in spaces of denumerable dimension such as L², for which the spectrum must be at most denumerable; hence the eigenvectors |q⟩ form a hyperbasis in L².
Let us now show that if any given q is non-degenerate, then all the other eigenvalues of Q are also non-degenerate. For this
we assume that the eigenvalue q + λ is at least two-fold degenerate and arrive to a contradiction. From this hypothesis, there
are at least two orthogonal eigenvectors |q + λ, αi and |q + λ, βi associated with the eigenvalue q + λ

hq + λ, β |q + λ, αi = 0 (4.53)

now consider the two vectors S (−λ) |q + λ, αi and S (−λ) |q + λ, βi from Eq. (4.52) we see that

QS (−λ) |q + λ, αi = [q + λ + (−λ)] S (−λ) |q + λ, αi = qS (−λ) |q + λ, αi


QS (−λ) |q + λ, βi = [q + λ + (−λ)] S (−λ) |q + λ, βi = qS (−λ) |q + λ, βi

so S (−λ) |q + λ, αi and S (−λ) |q + λ, βi are two non-zero eigenvectors9 associated with the eigenvalue q. Calculating the inner
product of them
hq + λ, β| S † (−λ) S (−λ) |q + λ, αi = hq + λ, β |q + λ, αi = 0

where we have used Eq. (4.53) and the fact that S (λ) is unitary. Thus, we arrive to the fact that S (−λ) |q + λ, αi and
S (−λ) |q + λ, βi are two orthogonal (and so linearly independent) eigenvectors associated with q, contradicting the hypothesis
that q is non-degenerate. This result can be extended to find that the eigenvalues of Q must all have the same degree of
degeneracy.
We now look for the eigenvectors. We fix the relative phases of the different eigenvectors of Q with respect to the eigenvector
|0i associated with the eigenvalue 0, by setting
|qi ≡ S (q) |0i (4.54)

applying S (λ) on both sides of (4.54) and using (4.49), we get

S (λ) |qi = S (λ) S (q) |0i = S (λ + q) |0i = |q + λi

and the corresponding bra gives


hq| S † (λ) = hq + λ|

now using Eq. (4.48) we see that S † (λ) = S (−λ) from which

hq| S (−λ) = hq + λ| ⇒ hq| S (λ) = hq − λ|

where we have replaced λ → −λ in the last step. In summary the action of S (λ) on the eigenvectors |qi of Q are given by

S (λ) |qi = |q + λi ; hq| S (λ) = hq − λ| (4.55)

now we can characterize the action of the operators P, Q and S (λ) in either the {|qi} basis or the {|pi} basis.

9 They are non-zero because |q + λ, αi and |q + λ, βi are non-zero by hypothesis, and S (λ) is unitary.

4.11.2 The action of Q, P and S (λ) in the {|qi} basis


Since Q is an observable, the set of eigenvectors {|qi} of Q forms a basis. A given ket |ψi in our Hilbert space can be written
in the {|qi} basis as Z
|ψi = dq |qi ψ (q) ; ψ (q) ≡ hq |ψi

let us calculate the fourier transform of Q |ψi in the {|qi} basis

hq| Q |ψi = qhq |ψi = qψ (q)

where we have used (4.51) and the hermiticity of Q. The action of Q on |ψi in the {|qi} basis, reduces to a simple multiplication
with its associated eigenvalue. The action of S (λ) on |ψi in this basis is also simple

hq| S (λ) |ψi = hq − λ| ψi = ψ (q − λ) (4.56)

where we have used (4.55). Note that a function f (x − a) is the function that at the point x = x0 + a, takes on the value
f (x0 ), so that it is the function obtained from f (x) by a translation of +a. Therefore, Eq. (4.56), shows that the action of
S (λ) on |ψi in the basis {|qi} , can be described as a translation of the wave function over a distance +λ parallel to the q−axis.
So S (λ) is usually called the translation operator.
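The translation property (4.56) can be illustrated numerically (a sketch that is not part of the text; it assumes Python with numpy, ħ = 1, a periodic grid and an arbitrary Gaussian packet, and applies S(λ) = e^{−iλP/ħ} by multiplying with e^{−iλp/ħ} in the momentum representation via an FFT):

import numpy as np

hbar = 1.0
N, L = 2048, 60.0
q = np.linspace(-L / 2, L / 2, N, endpoint=False)
dq = q[1] - q[0]
p = 2 * np.pi * hbar * np.fft.fftfreq(N, d=dq)

psi = np.exp(-(q - 1.0) ** 2)     # a wave packet centered at q = 1
lam = 5.0                         # translation distance

# apply S(lambda) = exp(-i lambda P/hbar) in the momentum representation
psi_translated = np.fft.ifft(np.exp(-1j * lam * p / hbar) * np.fft.fft(psi))

# the result is psi(q - lambda): the packet is now centered at q = 1 + lambda
print(np.allclose(psi_translated, np.exp(-(q - 1.0 - lam) ** 2), atol=1e-8))   # True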
The action of P on |ψi in the {|qi} basis is a bit longer to obtain. Let ε be an infinitesimal quantity such that
\[ S(-\varepsilon) = e^{i\varepsilon P/\hbar} = I + i\frac{\varepsilon}{\hbar}\,P + O(\varepsilon^2) \]
therefore
\[ \langle q|S(-\varepsilon)|\psi\rangle = \langle q|\left[I + i\frac{\varepsilon}{\hbar}P + O(\varepsilon^2)\right]|\psi\rangle = \psi(q) + i\frac{\varepsilon}{\hbar}\langle q|P|\psi\rangle + O(\varepsilon^2) \tag{4.57} \]
on the other hand, from Eq. (4.56) we have
\[ \langle q|S(-\varepsilon)|\psi\rangle = \psi(q+\varepsilon) \tag{4.58} \]
and comparing (4.57) with (4.58) we have
\[ \psi(q+\varepsilon) = \psi(q) + i\frac{\varepsilon}{\hbar}\langle q|P|\psi\rangle + O(\varepsilon^2) \;\Rightarrow\; i\frac{\varepsilon}{\hbar}\langle q|P|\psi\rangle = \psi(q+\varepsilon) - \psi(q) - O(\varepsilon^2) \]
solving for ⟨q|P|ψ⟩ and taking into account that ε is infinitesimal we have
\[ \langle q|P|\psi\rangle = \frac{\hbar}{i}\lim_{\varepsilon\to 0}\frac{\psi(q+\varepsilon)-\psi(q)}{\varepsilon} = \frac{\hbar}{i}\frac{d}{dq}\psi(q) \tag{4.59} \]
so the action of P on a ket in the {|q⟩} basis is that of \(\frac{\hbar}{i}\frac{d}{dq}\).

4.11.3 Representation in the {|pi} basis and the symmetrical role of P and Q
The wave function vp (q) associated with the eigenvector |pi of P with eigenvalue p in the {|qi} basis, is given by

vp (q) = hq |pi ; P |pi = p |pi (4.60)

we can evaluate v_p(q) by using Eqs. (4.59, 4.60)
\[ v_p(q) = \langle q|p\rangle = \frac{1}{p}\langle q|\,p\,|p\rangle = \frac{1}{p}\langle q|P|p\rangle = \frac{\hbar}{ip}\frac{d}{dq}\langle q|p\rangle \;\Rightarrow\; v_p(q) = \frac{\hbar}{ip}\frac{d v_p(q)}{dq} \tag{4.61} \]
here we are assuming that Eq. (4.59) can be extended to generalized kets. If we choose ⟨q|0⟩ to be real, the differential equation (4.61) has the (normalized) solution
\[ v_p(q) = \langle q|p\rangle = \frac{1}{\sqrt{2\pi\hbar}}\, e^{ipq/\hbar} \]

we can then write
\[ |p\rangle = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} dq\; e^{ipq/\hbar}\, |q\rangle \]
a wave function in the {|p⟩} representation is given by
\[ \bar\psi(p) = \langle p|\psi\rangle = \int dq\, \langle p|q\rangle\langle q|\psi\rangle = \frac{1}{\sqrt{2\pi\hbar}} \int_{-\infty}^{\infty} dq\; e^{-ipq/\hbar}\, \psi(q) \]
which is the Fourier transform of ψ(q).


It can be shown that the action of the P operator in the {|p⟩} representation is associated with multiplication by p, while the representation of Q corresponds to the operation iħ d/dp. Therefore, the results are symmetrical in the {|q⟩} and {|p⟩} bases. This comes from the fact that we can interchange Q and P at no more cost than changing the sign of the commutator in (4.46). The analogue of the translation operator in the {|p⟩} basis is the operator defined by
\[ T(\alpha) = e^{i\alpha Q/\hbar} \]
which acts as a translation in the |p⟩ space. The arguments developed for the basis {|q⟩} can be repeated in the basis {|p⟩} by interchanging P with Q and i with −i everywhere. As a matter of curiosity, in Classical Mechanics the Hamilton equations are also symmetrical in the conjugate variables (Q, P), and we can interchange them at no more cost than a change of sign.
We emphasize again that the results obtained in this section depend only on the canonical commutation rule (4.46) and the fact that Q and P are observables, and not on the explicit form of the Q and P operators. Thus, the position and momentum operators are only special cases of Q and P.
Chapter 5

Some features of matrices and operators in


C2 and R3

R3 is a very useful vector space in Physics for quite obvious reasons. In quantum mechanics, the 2-dimensional complex space
C2 is of major importance since spinors lie in this space. Thus, it is worth presenting some important details of certain matrices defined in these spaces.

5.1 Diagonalization of a 2 × 2 hermitian matrix


This example illustrates many concepts introduced in the eigenvalue problem in a quite simple way. Further, it is useful in
many practical calculations involving systems of two states in quantum mechanics. The eigenvalue problem is very easy but
the determination of eigenvectors could lead easily to complicated expressions. We shall determine the eigenvalues and find
the eigenvectors in a way easy to handle.

5.1.1 Formulation of the problem


Consider an hermitian operator R in a two-dimensional Hilbert space. Its matrix representation in a given orthonormal basis {|ϕ₁⟩, |ϕ₂⟩} reads
\[ H \equiv \begin{pmatrix} \langle\varphi_1|R|\varphi_1\rangle & \langle\varphi_1|R|\varphi_2\rangle \\ \langle\varphi_2|R|\varphi_1\rangle & \langle\varphi_2|R|\varphi_2\rangle \end{pmatrix} = \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix} \tag{5.1} \]
an hermitian operator is described by an hermitian matrix when the basis used is orthonormal. Therefore,
\[ H_{11} = H_{11}^* \;;\quad H_{22} = H_{22}^* \;;\quad H_{12} = H_{21}^* \]
so that the diagonal elements are real. Let us express the matrix in Eq. (5.1) in the equivalent form
\[ H = \begin{pmatrix} \tfrac{1}{2}(H_{11}+H_{22}) & 0 \\ 0 & \tfrac{1}{2}(H_{11}+H_{22}) \end{pmatrix} + \begin{pmatrix} \tfrac{1}{2}(H_{11}-H_{22}) & H_{12} \\ H_{21} & -\tfrac{1}{2}(H_{11}-H_{22}) \end{pmatrix} \]
\[ H = \frac{1}{2}(H_{11}+H_{22})\, I + \frac{1}{2}(H_{11}-H_{22})\, K \;;\qquad K \equiv \begin{pmatrix} 1 & \frac{2H_{21}^*}{H_{11}-H_{22}} \\ \frac{2H_{21}}{H_{11}-H_{22}} & -1 \end{pmatrix} \tag{5.2} \]

and I is the identity matrix. Let |ψ± i be two linearly independent eigenvectors of K
K |ψ± i = κ± |ψ± i (5.3)
applying Eq. (5.2) to the kets |ψ±⟩ we have
\[ H|\psi_\pm\rangle = \frac{1}{2}(H_{11}+H_{22})\, I|\psi_\pm\rangle + \frac{1}{2}(H_{11}-H_{22})\, K|\psi_\pm\rangle = \frac{1}{2}\left[(H_{11}+H_{22}) + (H_{11}-H_{22})\,\kappa_\pm\right]|\psi_\pm\rangle \]
therefore |ψ±⟩ are also eigenvectors of H, with eigenvalues
\[ H|\psi_\pm\rangle = E_\pm |\psi_\pm\rangle \;;\qquad E_\pm \equiv \frac{1}{2}\left[(H_{11}+H_{22}) + (H_{11}-H_{22})\,\kappa_\pm\right] \tag{5.4} \]


note that the problem reduces to finding the eigenvectors of K (which coincide with those of H) and its eigenvalues (which are related to the eigenvalues of H through Eq. 5.4). Solving the problem for K is equivalent to choosing the origin of the eigenvalues at (H11 + H22)/2 = (Tr H)/2. Note that this shift is independent of the basis chosen to write H.

5.1.2 Eigenvalues and eigenvectors of K


For simplicity we define the angles θ, ϕ in terms of the matrix elements H_ij as follows
\[ \tan\theta = \frac{2|H_{21}|}{H_{11}-H_{22}} \;,\quad 0 \le \theta < \pi \tag{5.5} \]
\[ H_{21} = |H_{21}|\, e^{i\varphi} \;,\quad 0 \le \varphi < 2\pi \tag{5.6} \]
so ϕ is the argument of H_21. The matrix K in Eq. (5.2) can then be written as
\[ K = \begin{pmatrix} 1 & \frac{2|H_{21}|\,e^{-i\varphi}}{H_{11}-H_{22}} \\ \frac{2|H_{21}|\,e^{i\varphi}}{H_{11}-H_{22}} & -1 \end{pmatrix} = \begin{pmatrix} 1 & \tan\theta\, e^{-i\varphi} \\ \tan\theta\, e^{i\varphi} & -1 \end{pmatrix} \tag{5.7} \]

the characteristic equation of the matrix (5.7) yields
\[ \det[K - \kappa I] = 0 = (1-\kappa)(-1-\kappa) - \tan^2\theta \;\Rightarrow\; \kappa^2 = 1 + \tan^2\theta = \frac{1}{\cos^2\theta} \]
the eigenvalues of K read
\[ \kappa_+ = \frac{1}{\cos\theta} \;,\qquad \kappa_- = -\frac{1}{\cos\theta} \tag{5.8} \]
and they are real as expected. We can express 1/cos θ in terms of the matrix elements H_ij by using Eq. (5.5) and the fact that cos θ and tan θ have the same sign, since 0 ≤ θ < π:
\[ \frac{1}{\cos\theta} = \sqrt{1+\tan^2\theta} = \sqrt{1 + \frac{4|H_{21}|^2}{(H_{11}-H_{22})^2}} = \frac{\sqrt{(H_{11}-H_{22})^2 + 4|H_{21}|^2}}{H_{11}-H_{22}} \]
\[ \kappa_\pm = \pm\frac{1}{\cos\theta} = \pm\frac{\sqrt{(H_{11}-H_{22})^2 + 4|H_{21}|^2}}{H_{11}-H_{22}} \tag{5.9} \]

let us find the eigenvectors of K. We denote by a and b the components of |ψ₊⟩ in the basis {|ϕ₁⟩, |ϕ₂⟩}. From Eqs. (5.7, 5.8) this eigenvector must satisfy
\[ \begin{pmatrix} 1 & \tan\theta\, e^{-i\varphi} \\ \tan\theta\, e^{i\varphi} & -1 \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} = \frac{1}{\cos\theta}\begin{pmatrix} a \\ b \end{pmatrix} \]
of course only one of the two equations is linearly independent, since only the quotient of the coefficients can be determined; therefore
\[ a + b\tan\theta\, e^{-i\varphi} = \frac{a}{\cos\theta} \;\Rightarrow\; b\tan\theta\, e^{-i\varphi} = a\left(\frac{1}{\cos\theta} - 1\right) \]
multiplying by e^{iϕ/2} and defining 2α ≡ θ, this equation yields
\[ b\,\frac{\sin 2\alpha}{\cos 2\alpha}\, e^{-i\varphi/2} = a\,\frac{1-\cos 2\alpha}{\cos 2\alpha}\, e^{i\varphi/2} \;\Rightarrow\; b\,\sin 2\alpha\, e^{-i\varphi/2} = a\,(1-\cos 2\alpha)\, e^{i\varphi/2} \]
\[ 2b\,\sin\alpha\cos\alpha\, e^{-i\varphi/2} = 2a\,\sin^2\alpha\, e^{i\varphi/2} \;\Rightarrow\; b\,\cos\alpha\, e^{-i\varphi/2} = a\,\sin\alpha\, e^{i\varphi/2} \]
in terms of θ we get
\[ b\,\cos\frac{\theta}{2}\, e^{-i\varphi/2} = a\,\sin\frac{\theta}{2}\, e^{i\varphi/2} \tag{5.10} \]

we demand normalization with the additional requirement that the coefficient a be positive, so we have
\[ |a|^2 + |b|^2 = 1 \;\Rightarrow\; |a|^2 + \left| a\,\frac{\sin\frac{\theta}{2}\, e^{i\varphi/2}}{\cos\frac{\theta}{2}\, e^{-i\varphi/2}} \right|^2 = 1 \;\Rightarrow\; |a|^2\left(1 + \tan^2\frac{\theta}{2}\right) = 1 \;\Rightarrow\; |a|^2 = \cos^2\frac{\theta}{2} \]
so that
\[ a = \cos\frac{\theta}{2} \ge 0 \quad\text{since } 0 \le \theta < \pi \tag{5.11} \]
replacing (5.11) in (5.10) we get
\[ b\,\cos\frac{\theta}{2}\, e^{-i\varphi/2} = \cos\frac{\theta}{2}\,\sin\frac{\theta}{2}\, e^{i\varphi/2} \;\Rightarrow\; b = \sin\frac{\theta}{2}\, e^{i\varphi} \]

so that the eigenvector associated with the eigenvalue κ₊ reads
\[ |\psi'_+\rangle = a|\varphi_1\rangle + b|\varphi_2\rangle = \cos\frac{\theta}{2}\,|\varphi_1\rangle + \sin\frac{\theta}{2}\, e^{i\varphi}\,|\varphi_2\rangle \]
it is clear that |ψ₊⟩ ≡ e^{−iϕ/2}|ψ′₊⟩ is also an eigenvector of K with the same eigenvalue κ₊, and this vector looks more symmetrical. Thus, we define the eigenvector |ψ₊⟩ as¹
\[ |\psi_+\rangle = \cos\frac{\theta}{2}\, e^{-i\varphi/2}\,|\varphi_1\rangle + \sin\frac{\theta}{2}\, e^{i\varphi/2}\,|\varphi_2\rangle \tag{5.12} \]
an analogous calculation gives the eigenvector of K corresponding to κ₋ = −1/cos θ
\[ |\psi_-\rangle = -\sin\frac{\theta}{2}\, e^{-i\varphi/2}\,|\varphi_1\rangle + \cos\frac{\theta}{2}\, e^{i\varphi/2}\,|\varphi_2\rangle \tag{5.13} \]
the eigenvalues of H are obtained by combining Eqs. (5.4, 5.9)
\[ E_\pm \equiv \frac{1}{2}\left[(H_{11}+H_{22}) + (H_{11}-H_{22})\,\kappa_\pm\right] = \frac{1}{2}\left[(H_{11}+H_{22}) \pm (H_{11}-H_{22})\,\frac{\sqrt{(H_{11}-H_{22})^2+4|H_{21}|^2}}{H_{11}-H_{22}}\right] \]
\[ E_\pm = \frac{1}{2}\left[(H_{11}+H_{22}) \pm \sqrt{(H_{11}-H_{22})^2 + 4|H_{21}|^2}\right] \]
it is worth saying that the eigenvalue problem can be solved directly without resorting to the angles θ and ϕ defined in Eqs. (5.5, 5.6). The procedure with the angles is advantageous only if we have to calculate the eigenvectors as well.

5.1.3 Eigenvalues and eigenvectors of H


Let us summarize our results. We consider an hermitian operator R in a two-dimensional Hilbert space, and its matrix representation in the orthonormal basis {|ϕ₁⟩, |ϕ₂⟩}
\[ H \equiv \begin{pmatrix} \langle\varphi_1|R|\varphi_1\rangle & \langle\varphi_1|R|\varphi_2\rangle \\ \langle\varphi_2|R|\varphi_1\rangle & \langle\varphi_2|R|\varphi_2\rangle \end{pmatrix} = \begin{pmatrix} H_{11} & H_{12} \\ H_{21} & H_{22} \end{pmatrix} \tag{5.14} \]
its eigenvalues and eigenvectors are given by
\[ E_\pm = \frac{1}{2}\left[(H_{11}+H_{22}) \pm \sqrt{(H_{11}-H_{22})^2 + 4|H_{21}|^2}\right] \tag{5.15} \]
\[ |\psi_+\rangle = \cos\frac{\theta}{2}\, e^{-i\varphi/2}\,|\varphi_1\rangle + \sin\frac{\theta}{2}\, e^{i\varphi/2}\,|\varphi_2\rangle \tag{5.16} \]
\[ |\psi_-\rangle = -\sin\frac{\theta}{2}\, e^{-i\varphi/2}\,|\varphi_1\rangle + \cos\frac{\theta}{2}\, e^{i\varphi/2}\,|\varphi_2\rangle \tag{5.17} \]
\[ \tan\theta = \frac{2|H_{21}|}{H_{11}-H_{22}} \;,\quad H_{21} = |H_{21}|\,e^{i\varphi} \;;\quad 0 \le \theta < \pi \;,\; 0 \le \varphi < 2\pi \tag{5.18} \]
1 This is equivalent to defining the phase of the coefficient a as −ϕ/2 instead of zero in the process of normalization.

as a matter of consistency we can see that
\[ E_+ + E_- = H_{11} + H_{22} = \mathrm{Tr}\, H \;,\qquad E_+ E_- = H_{11}H_{22} - |H_{12}|^2 = \det H \]
in agreement with Eqs. (3.55, 3.56). From Eq. (5.15), the spectrum becomes degenerate, i.e. E₊ = E₋, when (H₁₁ − H₂₂)² + 4|H₂₁|² = 0, that is when H₁₁ = H₂₂ and H₁₂ = H₂₁ = 0. So a 2 × 2 hermitian matrix has a degenerate spectrum if and only if it is proportional to the identity.
It is worth remarking that although functions of θ are expressed simply in terms of the H_ij elements by means of Eqs. (5.18), this is not the case for functions of θ/2. Thus, when we do calculations with the eigenvectors (5.16, 5.17), it is convenient to keep the results in terms of θ/2 up to the end of the calculation instead of replacing it in terms of the H_ij quantities.
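The formulas (5.15)–(5.18) can be checked against a direct numerical diagonalization (an illustrative sketch, not part of the text; it assumes Python with numpy, and the matrix elements below are arbitrary numerical choices):

import numpy as np

# an arbitrary 2x2 hermitian matrix
H11, H22 = 2.0, -1.0
H21 = 1.5 * np.exp(1j * 0.7)          # H12 = H21*
H = np.array([[H11, np.conj(H21)], [H21, H22]])

# eigenvalues from Eq. (5.15)
root = np.sqrt((H11 - H22) ** 2 + 4 * abs(H21) ** 2)
E_plus, E_minus = 0.5 * ((H11 + H22) + root), 0.5 * ((H11 + H22) - root)

# angles and eigenvectors from Eqs. (5.16)-(5.18)
theta = np.arctan2(2 * abs(H21), H11 - H22)     # 0 <= theta < pi
phi = np.angle(H21)
psi_plus = np.array([np.cos(theta / 2) * np.exp(-1j * phi / 2),
                     np.sin(theta / 2) * np.exp(1j * phi / 2)])
psi_minus = np.array([-np.sin(theta / 2) * np.exp(-1j * phi / 2),
                      np.cos(theta / 2) * np.exp(1j * phi / 2)])

print(np.allclose(H @ psi_plus, E_plus * psi_plus))            # True
print(np.allclose(H @ psi_minus, E_minus * psi_minus))         # True
print(np.allclose(np.linalg.eigvalsh(H), [E_minus, E_plus]))   # True (ascending order)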

5.2 Some general properties of 3 × 3 real matrices


Before establishing the most outstanding properties of matrices in three-dimensional space, we define the Levi-Civita tensor
as a set of components εijk with i, j, k = 1, 2, 3 such that

 1 if i, j, k is an even permutation of 1, 2, 3
εijk = 0 if there are at least two repeated indices

−1 if i, j, k is an odd permutation of 1, 2, 3

for instance ε₁₁₁ = ε₂₁₂ = 0 because they contain repeated indices, while ε₁₂₃ = 1 and ε₂₁₃ = −1 since 213 is obtained by one transposition of 1, 2, 3 (i.e. an odd number of transpositions). It is clear that ε_ijk is totally antisymmetric under the interchange of any two labels, that is
\[ \varepsilon_{ijk} = -\varepsilon_{jik} = -\varepsilon_{ikj} = -\varepsilon_{kji} \]
In cartesian coordinates, the vector product can be expressed through this tensor
\[ (A\times B)_i = \varepsilon_{ijk}\, A_j B_k \]
with sum over repeated indices; this is easily shown by explicit calculation. The following properties come directly from the definition
\[ \varepsilon_{ijk}\varepsilon_{ijk} = 6 \;;\quad \varepsilon_{ijk}\varepsilon_{mjk} = 2\delta_{im} \;;\quad \varepsilon_{ijk}\varepsilon_{mnk} = \delta_{im}\delta_{jn} - \delta_{in}\delta_{jm} \tag{5.19} \]
We shall start characterizing real antisymmetric matrices in three dimensions.

5.2.1 Real antisymmetric 3 × 3 matrices


A real antisymmetric 3 × 3 matrix can be parameterized as
\[ A^a = \begin{pmatrix} 0 & a_{12} & a_{13} \\ a_{21} & 0 & a_{23} \\ a_{31} & a_{32} & 0 \end{pmatrix} = \begin{pmatrix} 0 & a_{12} & -a_{31} \\ -a_{12} & 0 & a_{23} \\ a_{31} & -a_{23} & 0 \end{pmatrix} \tag{5.20} \]
and has only three independent degrees of freedom, so it is natural to arrange them into a vector
\[ \mathbf{v}_A \equiv \begin{pmatrix} a_{23} \\ a_{31} \\ a_{12} \end{pmatrix} \equiv \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \tag{5.21} \]

now if we apply this matrix to an arbitrary vector x we obtain
\[ A^a \mathbf{x} = \begin{pmatrix} 0 & v_3 & -v_2 \\ -v_3 & 0 & v_1 \\ v_2 & -v_1 & 0 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} v_3x_2 - v_2x_3 \\ v_1x_3 - v_3x_1 \\ v_2x_1 - v_1x_2 \end{pmatrix} \tag{5.22} \]
the components of the new vector are precisely those of the vector product
\[ A^a \mathbf{x} = \mathbf{x} \times \mathbf{v}_A \tag{5.23} \]
conversely, any vector product can be associated with an antisymmetric matrix acting on one of the vectors. This can be shown as follows
\[ (\mathbf{x} \times \mathbf{v}_A)_i = \varepsilon_{ijk}\, x_j v_k = (\varepsilon_{ijk} v_k)\, x_j \equiv A^a_{ij}\, x_j = (A^a \mathbf{x})_i \]



in which we have defined
\[ A^a_{ij} \equiv \varepsilon_{ijk}\, v_k \tag{5.24} \]
the antisymmetry of this matrix is clear from the antisymmetry of the Levi-Civita tensor ε_ijk. The inverse relation is obtained by multiplying Eq. (5.24) by ε_ijm and using property (5.19)
\[ \varepsilon_{ijm} A^a_{ij} = \varepsilon_{ijm}\varepsilon_{ijk}\, v_k = 2\delta_{mk} v_k = 2 v_m \;\Rightarrow\; v_m = \frac{1}{2}\varepsilon_{mij} A^a_{ij} \]
for example
\[ (\mathbf{v}_A)_1 \equiv v_1 = \frac{1}{2}\left(\varepsilon_{123}A^a_{23} + \varepsilon_{132}A^a_{32}\right) = \frac{1}{2}\left[\varepsilon_{123}A^a_{23} + (-\varepsilon_{123})(-A^a_{23})\right] = \varepsilon_{123}A^a_{23} = a_{23} \]
where we have used the antisymmetry of ε_ijk and of A^a_ij. This result coincides with Eq. (5.21).
In summary, any real 3 × 3 antisymmetric matrix can be parameterized as
\[ A^a = \begin{pmatrix} 0 & a_{12} & a_{13} \\ a_{21} & 0 & a_{23} \\ a_{31} & a_{32} & 0 \end{pmatrix} = \begin{pmatrix} 0 & a_{12} & -a_{31} \\ -a_{12} & 0 & a_{23} \\ a_{31} & -a_{23} & 0 \end{pmatrix} \equiv \begin{pmatrix} 0 & v_3 & -v_2 \\ -v_3 & 0 & v_1 \\ v_2 & -v_1 & 0 \end{pmatrix} \tag{5.25} \]
where the three degrees of freedom of the matrix can be associated with a vector as follows
\[ \mathbf{v}_A \equiv \begin{pmatrix} a_{23} \\ a_{31} \\ a_{12} \end{pmatrix} \equiv \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \;;\qquad (\mathbf{v}_A)_m = v_m = \frac{1}{2}\varepsilon_{mij}\, a_{ij} \tag{5.26} \]
if we apply this matrix to an arbitrary vector x we obtain
\[ A^a \mathbf{x} = \mathbf{x} \times \mathbf{v}_A \tag{5.27} \]
conversely, any vector product can be associated with an antisymmetric matrix acting on one of the vectors as follows
\[ (\mathbf{x} \times \mathbf{v}_A)_i = A^a_{ij}\, x_j = (A^a \mathbf{x})_i \;;\qquad A^a_{ij} \equiv \varepsilon_{ijk}\, v^A_k \tag{5.28} \]
5.2.2 Decomposition of a 3 × 3 matrix in its antisymmetric and symmetric parts


We saw in Sec. 3.8 that an arbitrary matrix can be decomposed into a symmetric component plus an antisymmetric one. Following that section, we obtain the explicit form of such a decomposition
\[ \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = \hat A + \bar A \;;\quad \hat A \equiv \frac{1}{2}\begin{pmatrix} 2a_{11} & a_{12}+a_{21} & a_{13}+a_{31} \\ a_{12}+a_{21} & 2a_{22} & a_{23}+a_{32} \\ a_{13}+a_{31} & a_{23}+a_{32} & 2a_{33} \end{pmatrix} \]
\[ \bar A \equiv \frac{1}{2}\begin{pmatrix} 0 & a_{12}-a_{21} & a_{13}-a_{31} \\ a_{21}-a_{12} & 0 & a_{23}-a_{32} \\ a_{31}-a_{13} & a_{32}-a_{23} & 0 \end{pmatrix} \tag{5.29} \]
in components it is written as
\[ a_{ij} = \hat a_{ij} + \bar a_{ij} \;;\qquad \hat a_{ij} = \frac{a_{ij}+a_{ji}}{2} \;,\quad \bar a_{ij} = \frac{a_{ij}-a_{ji}}{2} \]
the antisymmetric matrix can be parameterized as in Eq. (5.25)
\[ \bar A = \frac{1}{2}\begin{pmatrix} 0 & a_{12}-a_{21} & a_{13}-a_{31} \\ a_{21}-a_{12} & 0 & a_{23}-a_{32} \\ a_{31}-a_{13} & a_{32}-a_{23} & 0 \end{pmatrix} \equiv \begin{pmatrix} 0 & v_3 & -v_2 \\ -v_3 & 0 & v_1 \\ v_2 & -v_1 & 0 \end{pmatrix} \tag{5.30} \]
combining Eqs. (5.26, 5.30) we obtain the vector associated with the antisymmetric matrix \(\bar A\)
\[ {*}\bar A \equiv \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \frac{1}{2}\begin{pmatrix} a_{23}-a_{32} \\ a_{31}-a_{13} \\ a_{12}-a_{21} \end{pmatrix} \;;\qquad ({*}\bar A)_i = v_i = \frac{1}{2}\varepsilon_{ijk}\,\bar a_{jk} \tag{5.31} \]

the vector ∗Ā is usually called the dual of the matrix Ā. On the other hand, the symmetric part can be decomposed into two symmetric components, a traceless one and another that contains only the trace as a degree of freedom; we carry this out by using Eq. (3.41) with m = 2
\[ \hat A = \hat A_{tl} + \hat A_t = \frac{1}{2}\begin{pmatrix} 2a_{11} & a_{12}+a_{21} & a_{13}+a_{31} \\ a_{12}+a_{21} & -2a_{11}-2a_{33} & a_{23}+a_{32} \\ a_{13}+a_{31} & a_{23}+a_{32} & 2a_{33} \end{pmatrix} + (a_{11}+a_{22}+a_{33})\begin{pmatrix} 0&0&0 \\ 0&1&0 \\ 0&0&0 \end{pmatrix} \]

hence any matrix A can be decomposed into three parts
\[ A = \bar A + \hat A = \bar A + \hat A_{tl} + \hat A_t \;;\qquad \hat A_t \equiv (\mathrm{Tr}\,A)\begin{pmatrix} 0&0&0 \\ 0&1&0 \\ 0&0&0 \end{pmatrix} \]
\[ \bar A \equiv \frac{1}{2}\begin{pmatrix} 0 & a_{12}-a_{21} & a_{13}-a_{31} \\ a_{21}-a_{12} & 0 & a_{23}-a_{32} \\ a_{31}-a_{13} & a_{32}-a_{23} & 0 \end{pmatrix} \equiv \begin{pmatrix} 0 & v_3 & -v_2 \\ -v_3 & 0 & v_1 \\ v_2 & -v_1 & 0 \end{pmatrix} \]
\[ \hat A_{tl} \equiv \frac{1}{2}\begin{pmatrix} 2a_{11} & a_{12}+a_{21} & a_{13}+a_{31} \\ a_{12}+a_{21} & -2a_{11}-2a_{33} & a_{23}+a_{32} \\ a_{13}+a_{31} & a_{23}+a_{32} & 2a_{33} \end{pmatrix} \equiv \frac{1}{2}\begin{pmatrix} k_1 & k_2 & k_3 \\ k_2 & -k_1-k_5 & k_4 \\ k_3 & k_4 & k_5 \end{pmatrix} \tag{5.32} \]

we recall that this further decomposition is motivated by the fact that the trace is an invariant under a similarity transformation,
such that it is often useful to put the trace apart as a degree of freedom.
Recalling that the degrees of freedom of the antisymmetric part can be condensed in a three-vector, we conclude that the
9 degrees of freedom of an arbitrary 3 × 3 real matrix can be decomposed as: (a) The three degrees of freedom vi of the vector
given by Eq. (5.31), (b) The 5 degrees of freedom ki associated with a traceless symmetric matrix A b tl , and (c) The trace of A.
This decomposition is particularly important to study the properties of second-rank tensors under rotations (see Sec. 15.16,
page 15.16).
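The three-part decomposition can be carried out explicitly for a random matrix (an illustrative sketch, not part of the text; it assumes Python with numpy, and follows the trace convention of Eq. (5.32)):

import numpy as np

rng = np.random.default_rng(6)
A = rng.normal(size=(3, 3))

A_antisym = 0.5 * (A - A.T)                       # antisymmetric part: 3 degrees of freedom
A_sym = 0.5 * (A + A.T)                           # symmetric part
A_trace = np.trace(A) * np.diag([0.0, 1.0, 0.0])  # trace part, as in Eq. (5.32)
A_traceless = A_sym - A_trace                     # symmetric traceless part: 5 degrees of freedom

print(np.allclose(A, A_antisym + A_traceless + A_trace))   # True
print(np.isclose(np.trace(A_traceless), 0.0))              # True
print(np.allclose(A_traceless, A_traceless.T))             # True

# dual vector of the antisymmetric part, Eq. (5.31), and the cross-product action
v = 0.5 * np.array([A[1, 2] - A[2, 1], A[2, 0] - A[0, 2], A[0, 1] - A[1, 0]])
x = rng.normal(size=3)
print(np.allclose(A_antisym @ x, np.cross(x, v)))          # True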
Chapter 6

Abstract Group Theory

6.1 Groups: Definitions and basic properties


Definition 6.1 (Abstract Group): A group is a non-empty set of elements G ≡ {ak }, for which a law of composition
ai ∗ aj is well defined ∀ai , aj ∈ G, and satisfies the following axioms

1. If ai , aj ∈ G ⇒ (ai ∗ aj ) ∈ G

2. The law of composition is associative, i.e. ai ∗ aj ∗ ak = (ai ∗ aj ) ∗ ak = ai ∗ (aj ∗ ak ).

3. ∃ e ∈ G / ai ∗ e = ai ∀ai ∈ G.

4. ∀ai ∈ G, ∃ a−1
i ∈ G / ai ∗ a−1
i =e

The law of composition is also called a law of combination, a product, or a multiplication, which is by no means the same as ordinary multiplication, as will become clear after some examples. The first axiom says that the set must be “closed” under
the law of combination i.e. the product between the elements of the group does not take us out of the set. The third axiom
demands the existence of a module or identity element denoted by e, which keeps all the elements of the group unchanged
under the law of combination. Finally, the fourth condition requires that for each element in the group, the corresponding
inverse element (under the law of combination defined for the group) must exist and must belong to the set. It is important
to emphasize that we do not require for the law of combination to be commutative.
Some properties can be developed from the axioms. For instance, the axioms only say that e is a right-identity and that a−1 is a right-inverse. We can prove from the axioms that e is truly an identity and is unique, and also that a−1 is truly an inverse
and also unique for a given a. Let us see the most outstanding properties that can be developed from the axioms. We first
−1
observe that, since a−1 ∈ G, then the fourth axiom says that a−1 ∈ G and that
−1
a−1 a−1 =e (6.1)

we prove now the following properties:

1. (a) e−1 e = e. Since e ∈ G it has a left inverse e−1 ∈ G such that ee−1 = e, multiplying this equation by e−1 on left we
have e−1 ee−1 = e−1 e by associativity e−1 e e−1 = e−1 e, but from the third axiom ae = a ∀a ∈ G in particular
for a = e−1 then

e−1 e e−1 = e−1
−1 −1
but e−1 ∈ G and must also have a left inverse that we denote as e−1 . Multiplying on both sides by e−1 on
the right and using associativity we find
 −1 −1 h −1 i −1
e−1 e e−1 e−1 = e−1 e−1 ⇒ e−1 e e−1 e−1 = e−1 e−1

⇒ e−1 e e = e ⇒ e−1 e = e

where we have used Eq. (6.1) with a−1 = e−1 . Combining this proof with the fourth axiom we have

e−1 e = ee−1 = e (6.2)


(b) a−1 a = e. The fourth axiom says that aa−1 = e then


 
aa−1 = e ⇒ a−1 aa−1 = a−1 e ⇒ a−1 a a−1 = a−1
 −1 −1 h −1 i
⇒ a−1 a a−1 a−1 = a−1 a−1 ⇒ a−1 a a−1 a−1 =e
−1
 −1

⇒ a a e=e ⇒ a a =e

where we have used Eq. (6.1). Combining this with the fourth axiom we have

a−1 a = aa−1 = e (6.3)

so a right-inverse is also a left-inverse.


(c) ea = a. Starting from the previous proof we have
 
a−1 a = e ⇒ a a−1 a = ae ⇒ aa−1 a = a ⇒ ea = a

and using the third axiom we have


ae = ea = a (6.4)
so a right identity is also a left identity.
(d) The identity is unique. Assume that there is another identity e′ then e′ = e′ e = e then e′ necessarily equals e.
(e) For a given a ∈ G the inverse a−1 is unique. Assume another inverse a′ then

a′ = ea′ = a−1 a a′ = a−1 (aa′ ) = a−1 e = a−1 ⇒ a′ = a−1

(f) e = e−1 . We start with ae = a



ae = a ⇒ (ae) e−1 = ae−1 ⇒ a ee−1 = ae−1 .

From Eq. (6.2, 6.4) we get


ae = ae−1 ⇒ a = ae−1 ∀a ∈ G
it shows that e−1 is a left identity, but a left identity is also a right identity and it is unique. Therefore e = e−1 .
−1 −1 −1
(g) a−1 = a: From Eqs. (6.1, 6.3), we have a−1 a−1 = e and a−1 a = e. Consequently, both a−1 and a are
left-inverses of a−1 , but a left-inverse is also a right-inverse, and the inverse is unique.
−1  
(h) (ab) = b−1 a−1 : we have (ab) b−1 a−1 = a bb−1 a−1 = aea−1 = (ae) a−1 = aa−1 = e therefore b−1 a−1 is an
inverse of ab and the inverse is unique.

We summarize these results in the following theorem:

Theorem 6.1 Let G be a group. The identity is a unique element e of G, and for each element a ∈ G, the inverse is a unique
element of G denoted by a−1 . Further, the following properties are satisfied for all a, b ∈ G

ae = ea = a ; aa−1 = a−1 a = e ; e = e−1 (6.5)



−1 −1 −1 −1 −1
a = a ; (ab) =b a (6.6)

The following lemma is extremely useful in developing the theory of group representations.

Lemma 6.1 The rearrangement lemma: Let G be a group, if a, b, c ∈ G and ab = ac then b = c. Similarly, if ba = ca then
b=c

Proof :  
ab = ac ⇒ a−1 (ab) = a−1 (ac) ⇒ a−1 a b = a−1 a c ⇒ eb = ec ⇒ b = c.
For right-hand side (RHS) multiplication, the proof is similar. QED.
This result implies that if b and c are distinct elements in G then ab and ac are also distinct. Hence, if G is a finite set,
and if we settle the elements of G in a given order and multiply each element by a given a ∈ G on the left, what we obtain is
the same set G in a different order i.e. a rearrangement of the original order. Of course, it is also valid for multiplication on
the right, but both rearrangements are in general different since the product is not necessarily commutative. Finally, it is worth noting that the validity of the rearrangement lemma is closely related to the existence of inverses for each element of the
group, and with the associativity axiom.
The group structure also permits that certain simple equations are always solvable

Theorem 6.2 If a, b ∈ G, then the two equations ax = b and ya = b have unique solutions x and y in G, given by x = a−1 b and y = ba−1.

Proof : The fact that they are solutions can be checked directly. To prove the uniqueness of x, assume a different solution x′ with ax′ = b; then ax = ax′ and x = x′ by the rearrangement lemma. For y the argument is analogous. QED.
It should be remarked that in an abstract group we do not endow the elements ai with any particular nature (nor do we care about it). All that really matters is the law of combination that we define among the elements.

Definition 6.2 The number of elements in a group is defined as the order of the group

6.2 Examples of abstract groups and further properties


Now, let us see some examples of abstract groups.

Example 6.1 The unitary set G = {e} in which we define the product e ∗ e = e, forms the trivial group of order one.

Example 6.2 We shall build up the group of order two G = {e, a}; one of these elements must be the identity (say e). Eqs.
(6.5, 6.6) lead to ee = e, ea = ae = a. To complete the law of composition we should determine a ∗ a ≡ a2 . Since (owing to
the first axiom for the groups) a2 should belong to the set, we only have two possibilities, a2 = a or a2 = e. If we assume that
a2 = a then aa = ae and from the rearrangement lemma we get a = e (contradiction). Therefore a2 must be equal to e from
which we have defined the abstract group of two elements uniquely. The last rule implies that a is its own inverse.

To build up the multiplication table of finite groups of higher order it is convenient to use a table of multiplication. We
put the elements of the group in the first column in a given order starting with e. Then we put the group elements in the first
row starting with e and in the same order of the column. For convention the product is written such that the first element in
the law of combination is given by the row while the second element is given by the column in the following way

e ··· ai ··· an
.. ..
. .
ak ··· ak ∗ ai
..
.
an

For the sake of simplicity, we shall omit the notation ak ∗ ai for simply ak ai from now on; unless we consider it necessary. We
can observe that the rearrangement lemma demands that each element appears once and only once in each column and in each
row of the group table.

Example 6.3 To build up the abstract group of three elements G = {e, a, b}, we start constructing the trivial part of the table

e a b
a
b

Each element should appear once and only once in each row and in each column. Under this condition, for the second row we
have only two possibilities a2 = e and a2 = b.
If we assume a2 = e then we require ab = b to fill the second row, but the latter implies a = e which is a contradiction (in
other words if ab = b then b would appear twice in the third column). Therefore the only possibility is a2 = b, and to fill the
second row without repetition the only possibility is ab = e we have so far

e a b
a b e
b

the remaining products are uniquely determined by requiring no repetitions in the second and third columns

e a b
a b e
b e a

There is a unique way of constructing the multiplication table. So there is only one abstract group structure of order three.

        e  a  b  c
        a  b  c  e
C4 =    b  c  e  a     ;   a² = b,  ab = c = a³,  a⁴ = b² = e
        c  e  a  b
Table 6.1: The cyclic group of four elements

        e  a  b  c
        a  e  c  b
D2 =    b  c  e  a     ;   a² = b² = c² = e,  ab = c,  ac = b,  bc = a
        c  b  a  e
Table 6.2: The non-cyclic group of four elements

Example 6.4 It can be shown that two abstract groups can be generated with four elements. Let us start with the trivial
part of the group table
e a b c
a
(6.7)
b
c

to avoid repetition of symbols in each row and column we have two possibilities for ac, they are1 (1) ac = e, (2) ac = b. In this
example we examine the first case (1) ac = e in that case the table becomes

e a b c
a e
b
c

and to avoid repetitions in rows and columns the product bc must be equal to a, then we can fill the fourth column in a unique
way as
e  a  b  c        e  a  b  c
a        e        a        e
b        a        b        a
c                 c        b
the product ab can only be c. Then we can fill the second row uniquely

e  a  b  c        e  a  b  c
a     c  e        a  b  c  e
b        a        b        a
c        b        c        b

the product ca can only be e, with this we fill the remaining of the table

e  a  b  c        e  a  b  c        e  a  b  c
a  b  c  e        a  b  c  e        a  b  c  e
b        a   →    b        a   →    b  c  e  a        (6.8)
c  e     b        c  e  a  b        c  e  a  b

the results are summarized in table 6.1. For reasons that will be clear later, this group is called C4 , the cyclic group of order
four. By considering the second possibility ac = b, several possibilities arise, but only one of them contributes a new group
structure, shown in table 6.2; this non-cyclic group of order four is also known as the four-group or Vierergruppe (the V group).
We shall study this case in detail later (see example 6.23, page 107).

We recall that the law of combination does not have to be commutative, so in general ak ai ≠ ai ak . There are, notwithstanding,
some groups in which the law of combination is such that ak ai = ai ak ∀ai , ak ∈ G.

Definition 6.3 We say that a group is abelian or commutative if ak ai = ai ak ∀ai , ak ∈ G.

We can see for instance, that all the abstract groups of order one, two, three and four explained above, are abelian groups.
The symmetry in tables 6.1, 6.2 manifests the abelian structure of the groups of order four.

e = a1 a2 a3 a4 a5 a6
a2 e a5 a6 a3 a4
a3 a6 e a5 a4 a2
a4 a5 a6 e a2 a3
a5 a4 a2 a3 a6 e
a6 a3 a4 a2 e a5
Table 6.3: The smallest non-abelian group (order 6).

Example 6.5 The smallest non-abelian group is of order 6. It is shown in table 6.3; its non-abelian character is reflected in the
non-symmetric form of this table.

For abelian groups, it is usual to denote the law of combination with the symbol “+” and the identity as “0” so a + 0 =
0 + a = a, a + b = b + a. Of course, the symbol “+” does not mean ordinary sums. The inverse of an element is denoted by −a.
Since we can form the product of an element with itself, aa, we denote it by a2 . Similarly, we denote the inverse element by
a−1 . Then we can settle the following power notation

a−1 a = e ≡ a0   ,   a−m ≡ (a−1 )m = (am )−1

Definition 6.4 Let G be a group, and a ∈ G. The powers of an element a ∈ G are elements obtained in the form am with m
an integer (positive, negative or zero). If m = 0 then a0 ≡ e, and if m < 0 we define am = a−|m| ≡ (a−1 )|m| = (a|m| )−1 .

Definition 6.5 Let G be a group, and a ∈ G. If all the powers of the element a are distinct, a is said to be of infinite order.
If it is not the case, the order of the element is defined as the smallest positive integer power n, such that an = e.

If the element a is not of infinite order, then there is at least one positive integer n such that an = e. If we define n = p − q
where p and q are integers (p > q) then

ap−q = e ⇒ ap a−q = e ⇒ ap a−q aq = eaq


⇒ ap = aq (p > q) (6.9)

Theorem 6.3 Let G be a group and a ∈ G. If a is of finite order n, then the sequence of powers

S (a) ≡ {a0 = e, a1 = a, a2 , . . . , an−1 }

consists of n distinct elements of the group, and no more different elements can be generated by any other power of a. If
ak = e, then k is a (positive, negative, or zero) multiple of n. Finally, the set S (a) also forms a group under the same law of
combination as G.

Proof : Equation (6.9) shows that if a is of finite order, we can find two integers p, q (p > q) such that ap = aq . If we
denote as n the smallest positive integer such that an = e, then p − q ≥ n. To prove that the sequence S (a) consists of
different elements, let us assume that it is not true and arrive to a contradiction. Assume ap = aq (p > q) for p, q < n, then
ap−q = e with 0 < p − q < n, in contradiction with the definition of n. Now we prove that no more than n different elements
are generated from powers of a when a is of order n. Any integer k (positive, negative, or zero) can be written as k = rn + s
with 0 ≤ s < n and r an integer (positive, negative, or zero). To find ak we can evaluate it as
r
ak = arn+s = (an ) as = er as = as

therefore, regardless of the value of k (positive, negative or zero), there exists a value s with 0 ≤ s < n such that ak = as . This also shows
that if ak = e then k is either zero or a multiple of n. In conclusion, if a is of order n, we have that n and only n different
elements can be generated from it. The proof of the fact that S (a) forms a group of order n, under the law of combination
of G, proceeds as follows: The identity is always in the set and for any ap with 0 ≤ p < n, the inverse is given by an−p and
belongs to the set. In addition, ap aq ≡ ak , with k ≡ p + q, and it was already proved that ak = as with 0 ≤ s < n, then ap aq
also belongs to the set. Associativity comes directly from the group G. QED.
Sometimes, the whole group can be generated from powers of a single element; the abstract groups of order one, two
and three explained above show this feature. For the group of order one it is trivial2 . The group of order two G2 = {e, a}
can be generated from a in the following way: G2 = {a0 = e, a1 = a}, and a is of order 2. For the group of order three G3 = {e, a, b} we can write
G3 = {a0 = e, a1 = a, a2 = b}, and a is of order 3; additionally, the element b can generate the group as well. This leads us to
the following
1 It is more advantageous to start with ac than for instance with aa, because the latter only discards one possibility. In the same way, the table

of the group of three elements would have been filled faster by starting with the product ab.
2 It is obvious that for any group, the identity is of order one.

Definition 6.6 A cyclic group is one which can be generated from successive integer powers of a single element of the set.

In the case of a finite cyclic group, the rows and columns of its multiplication table are cyclic permutations of each other.

Example 6.6 As we saw in tables 6.1, 6.2, there are two abstract groups of order four, and one of them (the so-called C4 ) is
cyclic.

6.3 Examples of group realizations


We have seen that the group structure is given by the group table and many properties can be extracted without using any
explicit form for the elements of the group. However, in Physics we seldom deal with abstract elements i.e. abstract groups.
Instead, we work in realizations of such abstract groups in which the elements have a concrete form3 . Here there are some
examples

Example 6.7 The elements are the numbers 1, −1 and the law of combination is ordinary multiplication.

Example 6.8 The elements are the numbers 0, 1 with the following law of combination

0+1=1+0=1 , 0+0=0 , 1+1=0 (6.10)

the symbol “+” is used to emphasize the abelianity of the group, but clearly differs from ordinary addition.

Example 6.9 The n elements consisting of the nth roots of unity under multiplication. This is a cyclic group, which can be generated
from the element e^(2πi/n) . This example shows that we can construct cyclic groups of any finite order.
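A quick numerical illustration of example 6.9 (a Python sketch, not part of the original notes; the tolerance 1e-9 only absorbs floating-point error): the powers of e^(2πi/n) exhaust the n roots of unity, and the set is closed under multiplication.

```python
import cmath

n = 5
w = cmath.exp(2j * cmath.pi / n)          # generator e^{2πi/n}
roots = [w ** k for k in range(n)]        # the n elements of the group

# closure: the product of any two roots is again (numerically) one of the n roots
products_are_roots = all(any(abs(r1 * r2 - r) < 1e-9 for r in roots)
                         for r1 in roots for r2 in roots)
print(products_are_roots)                 # True
```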

Example 6.10 The elements are the integers, the law of combination is ordinary addition, the identity is 0, and any integer k
has its inverse under addition, −k. The group is of infinite order; 0 is of order one, but any other integer is of infinite order.
Further, this is a cyclic group of infinite order, since we can generate all the elements from integer powers of 1: any negative
integer can be written as 1−m = (1−1 )m . We should bear in mind that the law of combination is ordinary addition, so that
the power (1−1 )m means (−1) + (−1) + . . . + (−1) = −m (m times); similarly, any positive number is obtained through 1m = m, and zero is
obtained from 10 = 0 since it is (+1) added zero times. Notice that if the law of combination were ordinary multiplication, this
set would not be a group since the inverse of 0 is not in the set.

Example 6.11 The group of even integers under addition (including zero).

Example 6.12 The powers of 2 under ordinary multiplication

. . . , 2−2 , 2−1 , 20 , 21 , 22 , . . .

this is also a cyclic group whose generator is 2.

Example 6.13 The group of non-singular matrices n × n, with matrix multiplication as the law of combination.

Example 6.14 The set of all real (or complex) numbers under addition is an abelian group. The law of combination is
symbolized by “+” the identity by “0” and the inverse of a is (−a). Observe in particular that theorem 6.1 on page 100 plus
the abelianity of the group, provides a formal proof of the following statements

a + 0 = 0 + a = a ; a − a ≡ a + (−a) = −a + a = 0 ; 0 = −0 (6.11)
− (−a) = a ; − (a + b) = −b + (−a) = −a + (−b) ≡ −a − b (6.12)

6.4 Groups of transformations and isomorphisms between groups


We have seen that in Physics we use group realizations instead of abstract groups. Even more, groups in Physics are related
with symmetries, and a symmetry is an invariance under a transformation, therefore the elements of the group will be the
transformations which leave a system invariant. But we should bear in mind that to define a transformation it is necessary to
determine a set of points for the transformations to act on. Thus, in Physics it is useful to define a group of transformations

Definition 6.7 A group of transformations is a collection of transformations (mappings that are one-to-one and onto) G =
{M1 , M2 , . . .} defined on a given set of points S, i.e. Mi : pk → pik with pk , pik ∈ S, such that
3 When the elements acquire an specific nature we call it a realization of an abstract group or merely a group.

1. The collection contains the identity transformation


2. For every transformation M , its corresponding inverse is also included in the collection
3. If Mi and Mj belong to the collection, so does Mi Mj .

Observe that the definition of group of transformations has some additional ingredients. First of all, the law of combination
is already determined by the transformations themselves (i.e. the composite of the transformations is the law of combination).
Besides, the definitions of the elements themselves (transformations) is not enough and we should further define the set of
points in which such transformations act on. Finally, the associative law is not included in the above set of axioms, since
transformations satisfy it automatically, as shown in Sec. 1.2. In Physics we shall deal with groups of transformations all the
time4 .
Here there are some examples of groups of transformations

Example 6.15 The set of points consists of the points in three dimensional space. The group contains the elements {e, a},
with “e” being the identity transformation and “a” being the transformation that inverts the sign of all the coordinates of each
point: (x, y, z) → (−x, −y, −z).

Example 6.16 The set of points is the same as in the previous example, but now (x, y, z) → (x, y, −z), so that the transformation
“a” defines a reflection in the plane X − Y .

Example 6.17 The points are two elements a1 , a2 , the group of transformations G = {e, P2 } are the possible permutations
among them    
a1 → a1 a1 → a2
P1 = e = ; P2 =
a2 → a2 a2 → a1

Example 6.18 The group D2 defined in table 6.2 is called the dihedral group, and is one of the crystallographic groups.
In a geometrical context this group is obtained with the configuration described in Fig. 6.1a, in which the elements of the
group consist of the following transformations: (a) the identity (leaving the figure unchanged), (b) reflection about the vertical
axis (1, 3), (c) reflection about the horizontal axis (2, 4), (d) rotation around the center by π. Successive combination of these
operations reproduces one of the transformations of the group, and it can be checked that they reproduce the group table 6.2.

Figure 6.1: (a) Configuration associated with the dihedral group D2 of four elements. (b) Configuration associated with the
dihedral group D3 of six elements.

Example 6.19 The points are the ones that define an equilateral triangle, and the elements of the group are {e, a, b}. e is
the identity, “a” is the rotation of 2π/3 around the axis that passes through the center of the triangle and perpendicular to the
plane of it; and “b” is the rotation around the same axis but sweeping an angle of 4π/3.

Example 6.20 The points are the ones that define an equilateral triangle, see Fig. 6.1b. Labelling a given corner of the
triangle with the number 1 and labelling the point in the middle of the side opposite to that corner as 1′ , we define an axis
(1, 1′ ) joining both points. Proceeding similarly to construct the axes (2, 2′ ) and (3, 3′ ) we define six symmetry transformations
for this geometrical configuration as follows (a) The identity transformation, (b) Three reflections about the axes (1, 1′ ), (2, 2′ ) ,
4 In physics we call the transformations as operators and the set of points on which the operators work are usually vector spaces. It is very

important not to confuse the elements of the group (transformations) with the set of points in which they act on.

Abstract group                              e                        a
group of 1, −1 under multiplication         1                        −1
group of 0, 1 under exotic addition         0                        1
group of inversion                          (x, y, z) → (x, y, z)    (x, y, z) → (−x, −y, −z)
group of reflection on the plane X − Y      (x, y, z) → (x, y, z)    (x, y, z) → (x, y, −z)
group of two permutations                   a1 → a1 , a2 → a2        a1 → a2 , a2 → a1
Table 6.4: Realizations of the abstract group of order two, described in examples 6.7, 6.8, 6.15, 6.16 and 6.17.

(3, 3′ ). (c) Rotations around the center by 2π/3 and 4π/3. All six transformations leave the triangular configuration unchanged
except for the labels (1, 2, 3). The multiplication table of this group (called the dihedral group D3 ) reproduces the structure of
the group of order six given by table 6.3, page 103.

A group of transformations is a particular realization of an abstract group. For instance, we see that the groups described
by examples 6.7, 6.8, 6.15, 6.16 and 6.17, are determined by two elements {e, a} such that a2 = e, ea = ae = a. Therefore,
all of them have the same group table, i.e. the same group structure, despite all of them have different elements and laws of
combination5 . This group structure correspond of course, to the one of the abstract group of order 2. In other words, the
groups described in examples 6.7, 6.8, 6.15, 6.16 and 6.17, are realizations of a single abstract group (the abstract group of
order 2). This fact leads us to the concept of isomorphism among groups, for the different groups described above, we can
make the correspondences displayed in table 6.4

Observe that the last three are groups of transformations, but the first two are not.

Definition 6.8 Two groups G and G′ are isomorphic if there exists a one-to-one mapping M of G onto G′ which preserves
the law of group combination. In other words, there is a one-to-one mapping M of G onto G′ : M a = a′ , a ∈ G and a′ ∈ G′
such that

a′ ∗ b′ = (a × b)′ ⇔ (M a) ∗ (M b) = M (a × b)
where “∗” denotes the law of combination defined for the group G′ and “×” determines the law of combination for the group
G. The isomorphism is denoted as G ≈ G′ and it is a relation of equivalence.

The existence of a one-to-one mapping between G and G′ establishes their equivalence as sets. The further requirement
that such a mapping preserves the law of combination translates all group properties from G to G′ and vice versa (because of the
symmetry of the isomorphism relation). Now, if G′ is isomorphic with a third group G′′ , then G is isomorphic with G′′ and all three groups
are essentially identical (because of the transitivity of the isomorphism relation). Finally, it is obvious that G is isomorphic with
itself (reflexivity), with the identity as an isomorphism.
According to this discussion, all groups that are isomorphic to each other come from the same abstract group and are
isomorphic to it. Consequently, all of them share the same group properties which can be obtained by studying the abstract
group only. From this fact we see that the concept of abstract group is very useful.

Example 6.21 The groups described by examples 6.7, 6.8, 6.15, 6.16 and 6.17 are isomorphic to each other and to
the abstract group of two elements. The isomorphism of these groups with the abstract group of two elements is illustrated in
table 6.4.

Example 6.22 The groups defined by Examples 6.10, 6.11, 6.12 are isomorphic to each other, since we can define the correspondences
n ↔ 2n ↔ 2^n among their elements, and all of them are isomorphic to the cyclic group

. . . , a−2 , a−1 , a0 , a1 , a2 , . . .

which is clearly the only abstract cyclic group of denumerable order. By the same token, there is only one structure for the
cyclic group of order n which is the one of the group defined in example 6.9.

The isomorphism among the groups in examples 6.10, 6.11, shows that one group (of infinite order) may be isomorphic to
another one which is a proper subset of it. This cannot be the case in groups of finite order. In the case of groups of finite
order, for them to be isomorphic they should necessarily have the same number of elements.
5 In the case of groups of transformations, we see that examples 6.15, 6.16 are different transformations on the same set of points; while examples

6.15, 6.17 are transformations acting on different sets of points. Nevertheless, all of them belong to the same abstract group.

Example 6.23 In example 6.4, page 102, we considered the product ac as the “input product” and we had only two possibilities:
(1) ac = e that generates the cyclic group and (2) ac = b. We obtain the second abstract structure of order four, by starting
again from the trivial part of the table Eq. (6.7) and considering the second case (2) ac = b. The group table becomes

e  a  b  c
a        b
b                    (6.13)
c

the product bc could be either (A) bc = a or (B) bc = e. Let us consider both possibilities. (A) bc = a: the fourth column of the
table becomes

e  a  b  c        e  a  b  c
a        b        a        b
b        a        b        a
c                 c        e
the product cb must be a, and then the fourth row is filled uniquely

e  a  b  c        e  a  b  c
a        b        a        b
b        a        b        a
c     a  e        c  b  a  e

for ab we have two possibilities: (A1) ab = c, (A2) ab = e. Either possibility fills the table completely

                      e  a  b  c        e  a  b  c
                      a     c  b        a  e  c  b
bc = a and ab = c →   b        a   →    b  c  e  a        (6.14)
                      c  b  a  e        c  b  a  e

                      e  a  b  c        e  a  b  c
                      a     e  b        a  c  e  b
bc = a and ab = e →   b        a   →    b  e  c  a        (6.15)
                      c  b  a  e        c  b  a  e

now let us consider the case (B) bc = e. The table (6.13) becomes

e  a  b  c        e  a  b  c
a        b        a        b
b        e        b        e
c                 c        a

for cb the only possibility is e; therefore the table is completed

         e  a  b  c      e  a  b  c      e  a  b  c      e  a  b  c
         a        b      a        b      a        b      a  e  c  b
bc = e ⇒ b        e  →   b        e  →   b  c     e  →   b  c  a  e        (6.16)
         c     e  a      c  b  e  a      c  b  e  a      c  b  e  a

we summarize our results as follows. Starting with the product ac as the “input product” we have two possibilities: (1) ac = e,
(2) ac = b. The first possibility generates a single table, Eq. (6.8), while the second generates three tables, given by Eqs. (6.14,
6.15, 6.16)

       e  a  b  c          e  a  b  c          E  A  B  C          E  A  B  C
       a  b  c  e          a  e  c  b          A  C  E  B          A  E  C  B
T1 =   b  c  e  a  ; T2 =  b  c  e  a  ; T3 =  B  E  C  A  ; T4 =  B  C  A  E
       c  e  a  b          c  b  a  e          C  B  A  E          C  B  E  A
where T3 and T4 are written in capital letters for the sake of clarity in the following argument. It can be seen that T1 and T3
define isomorphic groups, where the isomorphism T1 ↔ T3 is given by

e↔E ; a↔A ; b↔C ; c↔B

let us show it with some of the products in tables T1 and T3

ab = c ↔ AC = B ; bb = e ↔ CC = E ; bc = a ↔ CB = A

similarly T1 and T4 are isomorphic with the following isomorphism

e↔E ; a↔B ; b↔A ; c↔C

According to example 6.22 there is only one structure for the cyclic group of order n. Consequently, we can also see the
isomorphism between T1 , T3 and T4 by observing that all of them are cyclic. By contrast, T2 is not cyclic, as can be seen from the relations
on the RHS of table 6.2 (no element of T2 has order four). Thus, only two abstract groups of order four exist.
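The isomorphism T1 ↔ T3 can also be verified mechanically. In the following sketch (plain Python, not part of the original notes; the dictionaries merely transcribe the tables T1 and T3 written above) we test the condition of definition 6.8 for every pair of elements.

```python
# Multiplication tables T1 (cyclic) and T3: T[x][y] is the product x*y read from the tables above.
T1 = {
    'e': {'e': 'e', 'a': 'a', 'b': 'b', 'c': 'c'},
    'a': {'e': 'a', 'a': 'b', 'b': 'c', 'c': 'e'},
    'b': {'e': 'b', 'a': 'c', 'b': 'e', 'c': 'a'},
    'c': {'e': 'c', 'a': 'e', 'b': 'a', 'c': 'b'},
}
T3 = {
    'E': {'E': 'E', 'A': 'A', 'B': 'B', 'C': 'C'},
    'A': {'E': 'A', 'A': 'C', 'B': 'E', 'C': 'B'},
    'B': {'E': 'B', 'A': 'E', 'B': 'C', 'C': 'A'},
    'C': {'E': 'C', 'A': 'B', 'B': 'A', 'C': 'E'},
}
phi = {'e': 'E', 'a': 'A', 'b': 'C', 'c': 'B'}   # the proposed isomorphism e<->E, a<->A, b<->C, c<->B

# phi is an isomorphism iff phi(x*y) = phi(x)*phi(y) for all pairs (definition 6.8)
print(all(phi[T1[x][y]] == T3[phi[x]][phi[y]] for x in T1 for y in T1))   # True
```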

Example 6.24 The reader can show that starting with the trivial part of the table with four elements, we can use the product
ab as the “input product” and consider the two possibilities ab = e and ab = c. The same two abstract groups are obtained.

The following two theorems enhance the connection between groups of transformations and abstract groups

Theorem 6.4 The set SA ≡ {Mi : A → A} of all transformations (one-to-one and onto mappings) on a non-empty set A, is
a group under the law of composition of transformations.

Proof : The identity is in the set. Theorem 1.4 says that the composite of two transformations is a transformation. Theorem
1.3 says that the inverse exists and is also a transformation. Associativity is guaranteed by theorem 1.2. QED.

Theorem 6.5 (Cayley’ theorem): Let SG = {Mi : G → G} be the set of all transformations of a group G onto itself. Let f be
a mapping of G into SG defined by f (a) ≡ Ma with a ∈ G, where Ma is the mapping of G onto itself defined as Ma (x) ≡ ax
with x ∈ G. Then f is an isomorphism of G onto a subset B ⊆ SG .

Proof: We prove first that Ma ∈ SG for each a ∈ G. Let x, y ∈ G with x 6= y, so ax 6= ay owing to the rearrangement
lemma, hence Ma (x) ≠ Ma (y) and the mapping Ma is one-to-one. On the other hand, let y be an arbitrary element of G;
there exists an element z ≡ a−1 y ∈ G such that Ma (z) = az = a(a−1 y) = y. Since y was arbitrary, the mapping Ma is onto G.
Hence Ma is a one-to-one mapping of G onto itself, so that Ma ∈ SG . By running over all a ∈ G, we obtain a collection of
transformations B ≡ {Ma : a ∈ G} ⊆ SG .
Now we prove that the mapping f of G into SG , preserves products. By definition f (a) f (b) = Ma Mb where Ma Mb (x) =
Ma (bx) = a (bx) = (ab) x = Mab (x). Since x is arbitrary, we obtain Ma Mb = Mab or equivalently f (a) f (b) = f (ab), showing
the preservation of the product.
Further, if a 6= b the rearrangement lemma says that for any x ∈ G we have Ma (x) = ax 6= bx = Mb (x) and hence
Ma 6= Mb or equivalently f (a) 6= f (b), so the mapping f is one-to-one. QED.
Note that to define SG we do not require the group structure of G, but only its structure as a set. Of course, different
groups Gi defined on the same set G, lead to different isomorphisms fi and so to different ranges i.e. different subgroups Bi in
SG (because in general, fi is into SG but not onto SG )6 . Therefore, theorem 6.5 also leads to

Corollary 6.6 Any group structure Gi on a given set G is isomorphic with a subgroup Bi of SG . Thus, the collection of all
subgroups of SG contains all possible abstract group structures on G.

6.5 Subgroups
If ℜ ⊆ G is a group under the same law of combination defined in G, we say that ℜ is a subgroup of G. In order to check
that ℜ ⊆ G is a subgroup of G we should verify the following requirements

1. If ai , ak ∈ ℜ ⇒ ai ∗ ak ∈ ℜ.

2. If ai ∈ ℜ ⇒ a−1
i ∈ ℜ.

The other properties are inferred from the fact that ℜ is contained in the group G. Associativity is inherited from the whole
group, and the existence of the identity in ℜ can be deduced from the two requirements above.
In the case of finite groups or groups in which all the elements are of finite order, only the first condition must be checked.
In this case if the first condition is satisfied, then for any element a (of order n) in ℜ the element an−1 is also contained in ℜ
and since aan−1 = an = e, we find that the inverse of a belongs to ℜ too.
However, in the case of infinite groups with at least one element of infinite order, both conditions should be verified.
The identity alone and G itself are improper (or trivial) subgroups of G. If a subgroup of G is not improper, we call it a
proper subgroup of G. Finding the proper subgroups of a given group G is one of the main challenges of group theory.

Theorem 6.7 If ℜ1 , ℜ2 are subgroups of G then ℜ1 ∩ ℜ2 is also a subgroup of G.


6 So G and B ⊆ S
i G are the ones that are isomorphic as groups, where Bi is induced by the group structure chosen in G. Note that the specific
group structure appears in the definition Ma (x) = ax.

Proof : Let a1 , a2 ∈ ℜ1 ∩ ℜ2 . Since a1 , a2 ∈ ℜ1 then a1 a2 ∈ ℜ1 . Similarly a1 , a2 ∈ ℜ2 hence a1 a2 ∈ ℜ2 . From this we see


that a1 a2 ∈ ℜ1 ∩ ℜ2 proving the closure axiom. Now if a ∈ ℜ1 ∩ ℜ2 then a ∈ ℜ1 and so a−1 ∈ ℜ1 similarly a−1 ∈ ℜ2 proving
that a−1 ∈ ℜ1 ∩ ℜ2 .

Corollary 6.8 Any number of subgroups of a given group can be intersected with each other, forming a subgroup that contains at least the
identity.

It is easy to check that

Theorem 6.9 If ℜ is a subgroup of G, and ℜ′ is a subgroup of ℜ, then ℜ′ is a subgroup of G as well.

This suggests the possibility of having a sequence of subgroups in the form

G ⊃ ℜ1 ⊃ ℜ2 ⊃ . . .

There could be several sequences. For instance, starting with the group of permutations of 3 symbols (S3 ), we find

S 3 ⊃ ℜ1 ⊃ e ; S 3 ⊃ ℜ2 ⊃ e ; S 3 ⊃ ℜ3 ⊃ e

where         
 1→1   1→3   1→2 
ℜ1 = e,  2 → 3  ; ℜ2 = e,  2 → 2  ; ℜ3 = e,  2 → 1 
     
3→2 3→1 3→3
all finite groups must finish the chain with the identity group.

Example 6.25 The group of three elements is an abelian subgroup of the group of six elements defined by table 6.3. This is
more apparent when we compare examples 6.19, 6.20.

Theorem 6.10 Let G = {ai } be a group. If a ∈ G is an element of finite order n, the subset

S (a) ≡ {a0 , a1 , . . . , an−1 }

forms a subgroup of G of order n, generated by successive non-negative powers of a.

Proof : See discussion in the proof of theorem 6.3, Sec. 6.2. QED.
It is important to emphasize that for ℜ to be considered a subgroup of G, the former must be a group under the same law
of combination defined for G.

Example 6.26 The rational numbers form a group G under addition. The positive rational numbers form a group ℜ under
multiplication (ℜ ⊂ G) but ℜ is not a subgroup of G because we are not using the same law of combination for both. ℜ is not
a group under addition.

6.6 Symmetric groups


According to definition 1.10, page 12, a permutation is a one-to-one mapping of a finite non-empty set of points onto itself.
It is easy to prove by induction that n objects can be permuted in n! different ways. An immediate corollary of theorem 6.4 is

Theorem 6.11 The set of all permutations of degree n

( 1   2   . . .  n  )
( p1  p2  . . .  pn )

forms a group of order n! (and of degree n) under the law of composition of permutations. It is usually denoted as Sn .

The identity element in Sn is denoted by

e = ( 1  2  . . .  n )
    ( 1  2  . . .  n )

and the inverse of a given permutation P is written as

P = ( 1   2   . . .  n  )     ;     P −1 = ( p1  p2  . . .  pn )                    (6.17)
    ( p1  p2  . . .  pn )                  ( 1   2   . . .  n  )

and both are clearly permutations of the same degree. The group Sn of all permutations of n symbols (which is of order n!) is
called the symmetric group of degree n. These groups play a very special role in the theory of finite groups since they exhaust
all possible group structures of any finite order as we shall see soon. Thus, these groups deserve special attention.
In any given permutation such as

P = ( 1  2  3  4  5  6  7  8 )                                                      (6.18)
    ( 2  3  1  5  4  7  6  8 )

the columns can be written in any order. For example, the permutation above can also be written as

P = ( 2  4  1  3  8  5  7  6 )
    ( 3  5  2  1  8  4  6  7 )

since the number 1 continues going to 2, the number 2 still goes to 3; 3 goes to 1 etc.
In order to obtain the composition of two or more permutations, it is useful to change the order of the columns in such a
way that the total transformation becomes apparent

Example 6.27 Let P1 , P2 , P3 be three permutations of degree 4 given by

P1 = ( 1  2  3  4 )   ;   P2 = ( 1  2  3  4 )   ;   P3 = ( 1  2  3  4 )
     ( 4  3  1  2 )            ( 2  3  4  1 )            ( 4  1  2  3 )

we shall obtain the composite P1 P2

P1 P2 = ( 1  2  3  4 ) ( 1  2  3  4 ) = ( 2  3  4  1 ) ( 1  2  3  4 ) = ( 1  2  3  4 )
        ( 4  3  1  2 ) ( 2  3  4  1 )   ( 3  1  2  4 ) ( 2  3  4  1 )   ( 3  1  2  4 )

where we have reordered the columns in permutation P1 , such that the bottom row of P2 coincides with the top row of P1 . It
is then clear that the composite is carried out by adjoining the top row of P2 with the bottom row of P1 . We shall also obtain
the composites P2 P1 , P1 P3 and P1 P2 P3

P2 P1 = ( 1  2  3  4 ) ( 1  2  3  4 ) = ( 4  3  1  2 ) ( 1  2  3  4 ) = ( 1  2  3  4 )
        ( 2  3  4  1 ) ( 4  3  1  2 )   ( 1  4  2  3 ) ( 4  3  1  2 )   ( 1  4  2  3 )

P1 P3 = ( 1  2  3  4 ) ( 1  2  3  4 ) = ( 4  1  2  3 ) ( 1  2  3  4 ) = ( 1  2  3  4 )
        ( 4  3  1  2 ) ( 4  1  2  3 )   ( 2  4  3  1 ) ( 4  1  2  3 )   ( 2  4  3  1 )

P1 P2 P3 = (P1 P2 ) P3 = ( 1  2  3  4 ) ( 1  2  3  4 )
                         ( 3  1  2  4 ) ( 4  1  2  3 )

P1 P2 P3 = ( 4  1  2  3 ) ( 1  2  3  4 ) = ( 1  2  3  4 )
           ( 4  3  1  2 ) ( 4  1  2  3 )   ( 4  3  1  2 )
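The column-reordering procedure of example 6.27 amounts to composing mappings. A minimal Python sketch (not part of the original notes; a permutation is stored as the tuple of images of 1, 2, . . . , n, and the function name compose is ours) reproduces the composites computed above.

```python
def compose(p, q):
    """Composite p∘q of two permutations of equal degree: q acts first, then p.
    A permutation is stored as a tuple t with t[i-1] = image of the symbol i."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

# the permutations of Example 6.27
P1 = (4, 3, 1, 2)
P2 = (2, 3, 4, 1)
P3 = (4, 1, 2, 3)

print(compose(P1, P2))                # (3, 1, 2, 4)  ->  P1 P2
print(compose(P2, P1))                # (1, 4, 2, 3)  ->  P2 P1
print(compose(P1, P3))                # (2, 4, 3, 1)  ->  P1 P3
print(compose(compose(P1, P2), P3))   # (4, 3, 1, 2)  ->  P1 P2 P3 = P1
```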

In addition, the notation for a certain permutation can be shortened by using its cycle structure. To clarify this point,
observe from Eq. (6.18), that the subset of symbols 1, 2, 3 transform among themselves forming a cycle 1, 2, 3 → 2, 3, 1.
Something similar happens with the subsets of symbols {4, 5} ; {6, 7} ; and {8}. Thus, the permutation can be expressed as

P = (123) (45) (67) (8) (6.19)

these cycles have no symbols in common and since the order of the symbols is not relevant in describing the permutation, we
can write the cycles in any order
P = (45) (123) (8) (67)
In order to shorten the notation, it is usual to ignore the elements on the permutation that remain unchanged, the notation
above is simplified to
P = (45) (123) (67)
However, the degree of the permutation must be kept in mind. Further, any cycle is invariant under a cyclic transformation

(123) = (312) = (231)

Example 6.28 Sometimes the cycle structure of a given permutation could become apparent by reordering the columns in the
permutation. For instance

P ′ = ( 1  2  3  4  5  6 ) = ( 1  3  4  2  5  6 ) = (134) (25) (6)
      ( 3  5  4  1  2  6 )   ( 3  4  1  5  2  6 )
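The cycle decomposition can also be extracted algorithmically by following each symbol until it returns to its starting point. A short Python sketch (not part of the original notes; the helper name cycles is ours), applied to P′ of example 6.28:

```python
def cycles(p):
    """Decompose the permutation p (tuple with p[i-1] = image of i) into disjoint cycles."""
    seen, result = set(), []
    for start in range(1, len(p) + 1):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:          # follow the orbit of 'start' until it closes
            seen.add(x)
            cycle.append(x)
            x = p[x - 1]
        result.append(tuple(cycle))
    return result

# P' of Example 6.28: 1->3, 2->5, 3->4, 4->1, 5->2, 6->6
print(cycles((3, 5, 4, 1, 2, 6)))     # [(1, 3, 4), (2, 5), (6,)]
```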

A cycle of two symbols is called a transposition, any cycle can be expressed in terms of transpositions (with some symbols
in common). For example
(123) = (13) (12) = (12) (23) = (23) (13) (6.20)
in general

(1, 2, . . . , n) = (1, n) (1, (n − 1)) . . . (1, 3) (1, 2) = (1, 2) (2, 3) (3, 4) . . . ((n − 1) , n)
= ((n − 1) , n) ((n − 2) , n) . . . (2, n) (1, n) (6.21)

in which the order of the factors is significant7 . The order of the factors matters whenever the transpositions are not disjoint,
that is, when they have symbols in common. In general, Eqs. (6.20, 6.21) show that a permutation can be written in terms of
transpositions in many ways. Notwithstanding, the number of transpositions required to construct one n−cycle of n symbols
is n − 1. As an immediate consequence, any permutation can be written as a product of transpositions. Even more, for any
permutation with a given cycle structure, the number of transpositions needed to construct it is always even or always odd, regardless of
the algorithm followed. The following definition is then natural

Definition 6.9 An even permutation: is a permutation that is generated by an even number of transpositions. We define odd
permutations accordingly. The parity of a permutation is defined as δP = (−1)q with q the number of transpositions required
to generate the permutation. So even permutations have parity +1 while odd permutations have parity −1.

To obtain the parity of a given permutation, the following concept is useful

Definition 6.10 The decrement of a permutation: is the number of symbols minus the number of independent cycles (i.e.
cycles with no elements in common).

Theorem 6.12 If the decrement is even (odd) the permutation is even (odd).

Proof : According to Eq. (6.21), an ni −cycle is constructed with ni − 1 transpositions. Since ni is the number of symbols
in an ni −cycle, a permutation with k independent cycles and n symbols is constructed with a number of transpositions q given
by

q = Σ_{i=1}^{k} (ni − 1) = Σ_{i=1}^{k} ni − Σ_{i=1}^{k} 1 = n − k

Further, it is clear that the parity of a transposition is (−1). Hence the parity of the permutation is (−1)^q = (−1)^(n−k) . QED.
As a matter of consistency, we show that if we multiply a permutation by a transposition, we change the decrement in
±1, so that the parity of the permutation changes. Consider the transposition (ab) by which we multiply our permutation P .
Resolving P in independent cycles there are two possibilities (a) when a, b belong to the same cycle, let us write the cycle as

(a . . . xb . . . y)

it is clear that
(ab) (a . . . xb . . . y) = (a . . . x) (b . . . y)
so that the number of independent cycles increases by one and then the decrement decreases by one (of course the number of
symbols keeps unaltered). (b) If a, b belong to different cycles we can reverse these steps to show that the decrement increases
by one.

Example 6.29 The permutation P1 = (123) (45) (67) (8), has eight symbols (n1 = 8), and there are four independent
cycles (k1 = 4), the decrement is d1 = n1 − k1 = 4. Hence, P1 is even. Similarly for the permutations

P2 = (237) (14) (56) (89) ; n2 = 9, k2 = 4, d2 = 5 (odd)


P3 = (752) (836) (14) ; n3 = 8, k3 = 3, d3 = 5 (odd)
P4 = (235) (46) (89) (1) (7) ; n4 = 9, k4 = 5, d4 = 4 (even)
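Theorem 6.12 translates directly into a short routine: count the independent cycles, form the decrement n − k, and raise (−1) to it. The following Python sketch (not part of the original notes; the helper name parity is ours) reproduces two of the parities of example 6.29.

```python
def parity(p):
    """Parity via the decrement n - k of Theorem 6.12 (k = number of independent cycles)."""
    n, seen, k = len(p), set(), 0
    for start in range(1, n + 1):
        if start not in seen:
            k += 1                     # a new independent cycle begins here
            x = start
            while x not in seen:
                seen.add(x)
                x = p[x - 1]
    return (-1) ** (n - k)

# P1 = (123)(45)(67)(8) of Example 6.29
print(parity((2, 3, 1, 5, 4, 7, 6, 8)))      # 1   (decrement 8 - 4 = 4: even)
# P2 = (237)(14)(56)(89): 1->4, 2->3, 3->7, 4->1, 5->6, 6->5, 7->2, 8->9, 9->8
print(parity((4, 3, 7, 1, 6, 5, 2, 9, 8)))   # -1  (decrement 9 - 4 = 5: odd)
```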

The parity of the product of two (or more) permutations is the product of their parities. Therefore, the product of two
permutations of the same parity (i.e. two odd permutations or two even permutations) results in an even permutation. By the
same token, two permutations of opposite parity produce an odd permutation.
In cyclic notation, the identity in Sn consists of n cycles each with one symbol

Pe = (1) (2) (3) . . . (n) (6.22)

and the inverse of (p1 , p2 , . . . , pm−1 , pm ) can be written as the same numbers in reverse order, or in general any sequence in
which the cyclic order is reversed, for instance

P = (p1 , p2 , . . . , pm−1 , pm ) ⇒ P −1 = (pm , pm−1 , . . . p2 , p1 ) = (p1 , pm , pm−1 , . . . , p2 ) (6.23)



e (12) (23) (31) (123) (321)


(12) e (123) (321) (23) (31)
(23) (321) e (123) (31) (12)
(31) (123) (321) e (12) (23)
(123) (31) (12) (23) (321) e
(321) (23) (31) (12) e (123)
Table 6.5: Multiplication rules for the permutation group S3 , which is isomorphic to the dihedral group D3 . Elements are
labeled with their cycle structure.

Example 6.30 The symmetric group S3 is isomorphic to the dihedral group D3 . We write the rules of multiplication of S3 in
table 6.5, using the cyclic structure. A comparison with table 6.3, page 103 of D3 , shows the isomorphism between them.

From the previous facts we infer that

Theorem 6.13 The set of all even permutations forms a subgroup of Sn (the alternating group of order n!/2, for n > 1). In
contrast, the odd permutations do not form a group.

Proof : Since Sn is finite, it is enough to show closure. We already saw that a product of two even permutations is an even
permutation. By contrast, the product of two odd permutations is even. QED.

6.6.1 Cycle structures in permutations


The cycle structure of a given permutation of degree n, is given by the number of 1−cycles, 2−cycles,. . . , n−cycles of the
permutation. We denote as νi the number of i−cycles. A permutation having ν1 1−cycles, ν2 2−cycles,. . . , νn n−cycles is
denoted as (1ν1 , 2ν2 , . . . , nνn ) or generically as (ν). Let us write a given cycle structure in the following way

P = (·) . . . (·)   (··) . . . (··)   (· · ·) . . . (· · ·)   . . .   (n symbols) . . . (n symbols)
    [ν1 1−cycles]   [ν2 2−cycles]     [ν3 3−cycles]                   [νn n−cycles]

the number of symbols involving 1 − cycles is clearly 1ν1 , the number of symbols involving 2-cycles is 2ν2 , the number of
symbols in 3-cycles is 3ν3 , and so on. Further, each symbol 1, 2, . . . , n appears in one and only one cycle. From this argument
it is clear that
ν1 + 2ν2 + 3ν3 + . . . + nνn = n (6.24)
observe in particular that νn can only take the values 0 or 1. If νn = 1 then all other cycles are not present i.e. νk = δnk .

Example 6.31 In the permutation of Eq. (6.19) with n = 8, there is one 1 − cycle, two 2-cycles, one 3-cycle, and zero
k−cycles for 4 ≤ k ≤ 8. Thus
ν1 = 1, ν2 = 2, ν3 = 1, ν4 = ν5 = ν6 = ν7 = ν8 = 0
and
ν1 + 2ν2 + 3ν3 + . . . + nνn = 1 + 2 · 2 + 3 · 1 + 0 = 8

It is obvious that the number of different cycle structures in Sn is determined by the number of solutions of Eq. (6.24) for
ν1 , . . . , νn non-negative integers. Let us define the following non-negative integers

ν1 + ν2 + ν3 + . . . + νn = λ1
ν2 + ν3 + . . . + νn = λ2
ν3 + . . . + νn = λ3
.. ..
. = .
νn−1 + νn = λn−1
νn = λn (6.25)

from Eqs. (6.24, 6.25) and with νk ≥ 0, the λi ’s satisfy the following properties

λ1 + λ2 + . . . + λn = n ; λ1 ≥ λ2 ≥ . . . ≥ λn ≥ 0 ; λk = non − negative integer (6.26)

Therefore, according to Eqs. (6.25) and (6.26), the cycle structure (1ν1 , 2ν2 , . . . , nνn ) induces a unique partition of n (i.e.
a sequence of n non-negative integers λk in decreasing order whose sum is n). It would be desirable to reverse the steps i.e.
7 For instance (12) (13) = (321) 6= (13) (12).

starting from a partition of n (which is easy to obtain) arrive to a cycle structure. We achieve it by managing Eqs. (6.25) to
get

ν1 = λ1 − λ2
ν2 = λ2 − λ3
.. .
. = ..
νn−1 = λn−1 − λn
νn = λn (6.27)

And it is clear from (6.27), that starting from a given partition we obtain a unique cycle structure. Hence, partitions and
cycle structures are in a one-to-one correspondence. Therefore, the task of finding different cycle structures is reduced to the
task of finding all possible partitions of n in non-negative integers accomplishing the conditions (6.26)8 .

Example 6.32 For S7 , a possible partition of 7 in seven non-negative integers is 7 = 3 + 1 + 1 + 1 + 1 + 0 + 0. We denote it
as (3111100), or even shorter (31^4 ). The cycle structure is calculated from Eq. (6.27)

ν1 = λ1 − λ2 = 3 − 1 = 2   ⇒   two 1−cycles
ν2 = λ2 − λ3 = 1 − 1 = 0   ⇒   zero 2−cycles
ν3 = ν4 = 1 − 1 = 0        ⇒   zero 3−cycles and 4−cycles
ν5 = λ5 − λ6 = 1 − 0 = 1   ⇒   one 5−cycle
ν6 = λ6 − λ7 = 0 − 0 = 0   ⇒   zero 6−cycles
ν7 = λ7 = 0                ⇒   zero 7−cycles

the cycle structure thus consists of two 1−cycles and one 5−cycle, i.e. P = (·) (·) (· · · · ·).

Example 6.33 We can find all possible cycle structures of Sn by obtaining all partitions that satisfy the conditions in Eq.
(6.26). For increasing values of n the partitions can be found based on the partitions of smaller integers. Here we write the
partitions up to n = 7; they are:

n = 1 : (1)
n = 2 : (2), (1^2)
n = 3 : (3), (21), (1^3)
n = 4 : (4), (31), (2^2), (21^2), (1^4)
n = 5 : (5), (41), (32), (31^2), (2^2 1), (21^3), (1^5)
n = 6 : (6), (51), (42), (41^2), (3^2), (321), (31^3), (2^3), (2^2 1^2), (21^4), (1^6)
n = 7 : (7), (61), (52), (51^2), (43), (421), (41^3), (3^2 1), (32^2), (321^2), (31^4), (2^3 1), (2^2 1^3), (21^5), (1^7)

For instance, to write all partitions of n that start with the number p, we should write all partitions of the form (p, [n − p]p ),
where [n − p]p denotes all partitions of the number n − p that start with a number q less than or equal to p. For example, for n = 7,
all partitions that start with the number 2 are of the form (2, [5]2 ), and the partitions [5]2 are all the partitions of 5 that start
with a number less than or equal to 2, that is

[5]2 : (2^2 1), (21^3), (1^5)

and the partitions of the form (2, [5]2 ) are

(2, [5]2 ) : (2^3 1), (2^2 1^3), (21^5)
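The recursive rule just described, “a partition starting with p is p followed by a partition of n − p whose parts do not exceed p”, can be turned into a short routine. The sketch below (Python, not part of the original notes; the helper names partitions and cycle_structure are ours) also converts a partition into its cycle structure via Eq. (6.27).

```python
def partitions(n, max_part=None):
    """All partitions of n into positive parts no larger than max_part, in descending order."""
    if max_part is None:
        max_part = n
    if n == 0:
        return [[]]
    result = []
    for p in range(min(n, max_part), 0, -1):
        for rest in partitions(n - p, p):       # remaining parts must not exceed p
            result.append([p] + rest)
    return result

def cycle_structure(lam, n):
    """Cycle structure (nu_1, ..., nu_n) obtained from a partition via Eq. (6.27)."""
    lam = list(lam) + [0] * (n - len(lam))      # pad with zeros: lambda_1 >= ... >= lambda_n >= 0
    return [lam[i] - lam[i + 1] for i in range(n - 1)] + [lam[n - 1]]

print(len(partitions(7)))                     # 15 partitions of 7, as listed above
print(cycle_structure([3, 1, 1, 1, 1], 7))    # [2, 0, 0, 0, 1, 0, 0]: two 1-cycles, one 5-cycle
```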

Since the null values of λi are not relevant in a given partition, it is convenient to exclude them. We shall use from now on
the following definition

Definition 6.11 A partition λ ≡ {λ1 , λ2 , . . . , λr } of the positive integer n is a sequence of positive integers λi , arranged in
descending order, whose sum is n; that is, λi ≥ λi+1 and Σ_{i=1}^{r} λi = n. (a) Two partitions λ, µ are equal if λi = µi for all i.
(b) We say that λ > µ (µ > λ) if the first non-zero number in the sequence (λi − µi ) is positive (negative).
8 This task is considerably easier than finding all posible non-negative integer solutions of Eq. (6.24).

Theorem 6.14 The number of permutations with a given cycle structure (ν) = (1^ν1 , 2^ν2 , . . . , n^νn ) is given by

n(ν) = n! / [ (ν1 ! · ν2 ! · . . . · νn !) (1^ν1 · 2^ν2 · . . . · n^νn ) ]                    (6.28)
Proof: Let us write the cycle structure in the following way

P = (·) . . . (·)   (··) . . . (··)   (· · ·) . . . (· · ·)   . . .
    [ν1 1−cycles]   [ν2 2−cycles]     [ν3 3−cycles]

There are n cells to place each of the n symbols. There are n! ways to place one symbol in each cell. However, some different
configurations of the symbols represent the same permutation. For instance, if 1 and 2 appears in 1−cycles (1)(2), it is the
same as (2) (1). All the ν1 1−cycles can be permuted (ν1 ! times) without generating a new permutation, all ν2 2−cycles can
be permuted among themselves in ν2 ! ways. In general νi ! is the number of times in which i−cycles can be permuted among
themselves. Thus, a given permutation is replicated a number ν1 !ν2 ! . . . νn ! of times due to the reordering of independent
cycles. On the other hand, for a given order of the cycles, reordering of the symbols within each cycle could also lead to the
same permutation, for instance (123) = (312). For 1−cycles there is only one possible reordering of the symbol so replication
is given by the factor 1 = 1ν1 , a 2−cycle like (12) can appear also as (21) so there is a factor of 2 for each 2−cycle, and
therefore 2ν2 for all the ν2 2−cycles. A 3−cycle like (123) also appears as (231) and (312), so that each one of the ν3 3−cycles
contributes a factor of 3, and then all 3−cycles contribute a factor of 3^ν3 . In general, in νk cycles of k objects one can
make k^νk circular permutations that are not distinct. Thus each permutation (with a fixed ordering of the cycles) is replicated
1^ν1 · 2^ν2 · . . . · n^νn times because of the reordering of the symbols within each cycle. QED.

Exercise 6.1 Calculate the number of permutations for each cycle structure of S7 . Show that the sum of these 15 numbers is
n(ν)1 + . . . + n(ν)15 = 7!, as it must be.
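One possible way to carry out (or check) Exercise 6.1 is to enumerate all cycle structures satisfying Eq. (6.24) and add up the counts of Eq. (6.28). The following self-contained Python sketch (not part of the original notes) does exactly that for n = 7.

```python
from math import factorial
from itertools import product

n = 7
total, structures = 0, 0
# enumerate all cycle structures (nu_1, ..., nu_n) with nu_1 + 2 nu_2 + ... + n nu_n = n, Eq. (6.24)
for nu in product(*(range(n // i + 1) for i in range(1, n + 1))):
    if sum(i * v for i, v in enumerate(nu, start=1)) != n:
        continue
    structures += 1
    denom = 1
    for i, v in enumerate(nu, start=1):
        denom *= factorial(v) * i ** v        # (nu_i)! * i^(nu_i)
    total += factorial(n) // denom            # Eq. (6.28)

print(structures, total, factorial(n))        # 15 5040 5040
```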

6.6.2 Cayley’s theorem and regular permutations


The symmetric groups Sn , are very important because as we will see below, they exhaust all possible group structures for finite
groups.

Theorem 6.15 Cayley’s Theorem for finite groups: Every group G of finite order n, is isomorphic with a subgroup of the
symmetric group Sn .

Proof : This is just a special case of theorem 6.5 and corollary 6.6 applied when G has finite order n, so that SG becomes
Sn . Nevertheless, it is illustrative to prove it by using the notation typical of permutations. Denoting {a1 , . . . , an } the elements
of the group in a prescribed order, we can choose ak ∈ G, and form the sequence

{ak a1 , ak a2 , . . . , ak an }

all these elements are different because of the rearrangement lemma and correspond indeed to a reordering of the group G, we
can then associate to this transformation a permutation in the following way

ak → Pak = ( a1      a2      . . .  an     )                                        (6.29)
           ( ak a1   ak a2   . . .  ak an  )

this association is imposed for all ak ∈ G. For another element aj we have

aj → Paj = ( a1      a2      . . .  an     ) = ( ak a1        ak a2        . . .  ak an       )
           ( aj a1   aj a2   . . .  aj an  )   ( aj (ak a1 )  aj (ak a2 )  . . .  aj (ak an ) )

in the last step we have only reordered the columns. Therefore, there is no change in the assignment for each element and the
permutation is unaltered. The product Paj Pak of both permutations is clearly

Paj Pak = ( ak a1        . . .  ak an       ) ( a1     . . .  an    ) = ( a1           . . .  an          )
          ( aj (ak a1 )  . . .  aj (ak an ) ) ( ak a1  . . .  ak an )   ( aj (ak a1 )  . . .  aj (ak an ) )

Paj Pak = ( a1            . . .  an           ) = Paj ak
          ( (aj ak ) a1   . . .  (aj ak ) an  )

Because of the rearrangement lemma, if ak ≠ aj then Pak ≠ Paj . Thus the mapping ak ↔ Pak is a transformation which
preserves combinations, since Paj Pak = P(aj ak ) . Consequently, the set of permutations {Pa1 , . . . , Pan } is isomorphic with the
group G = {a1 , . . . , an }. From the association given by Eq. (6.29), it is immediate to see that Pe is the identity permutation,
and that Pa−1 = (Pa )−1 , as it must be. QED.

Since Sn for a given n is a group of finite order, it possesses a finite number of subgroups. On the other hand, Cayley’s
theorem says that any group structure of order n is isomorphic with one of these subgroups; therefore, the number of different
group structures that we can build up with n elements must be finite. This assertion reduces considerably the task of constructing
finite group structures.

Example 6.34 To find explicitly the permutation associated to each element of certain group we can check directly in the
group table. As an example, let us take the group table of the dihedral group D2 of order 4 given in table 6.2, page 102.

e a b c
a e c b
b c e a
c b a e
To get the associated permutation for one element (say a ↔ Pa ), we write (e, a, b, c) in the top row, and in the bottom row we
write the row corresponding to the element a in the group table (the second row in this case), whose symbols are a, e, c, b; so the
permutation reads

a ↔ Pa = ( e  a  b  c )
         ( a  e  c  b )
similarly, for the elements b and c we have

b ↔ Pb = ( e  a  b  c )   ;   c ↔ Pc = ( e  a  b  c )
         ( b  c  e  a )                ( c  b  a  e )

relabeling the elements as numbers we write

a ↔ Pa = ( 1  2  3  4 )  ;  b ↔ Pb = ( 1  2  3  4 )  ;  c ↔ Pc = ( 1  2  3  4 )
         ( 2  1  4  3 )              ( 3  4  1  2 )              ( 4  3  2  1 )

hence, all permutations that form the subgroup of S4 isomorphic to D2 are

Pe = ( 1  2  3  4 ) = (1) (2) (3) (4)   ;   Pa = ( 1  2  3  4 ) = (12) (34)
     ( 1  2  3  4 )                              ( 2  1  4  3 )

Pb = ( 1  2  3  4 ) = (13) (24)   ;   Pc = ( 1  2  3  4 ) = (14) (23)               (6.30)
     ( 3  4  1  2 )                        ( 4  3  2  1 )
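The construction of Eq. (6.29) is mechanical once the group table is known. The sketch below (Python, not part of the original notes; the function name regular_permutations is ours) rebuilds the regular permutations of Eq. (6.30) from the D2 table.

```python
def regular_permutations(elements, table):
    """Left regular representation: for each a, the permutation x -> a x read off the group
    table, following Eq. (6.29). table[a][x] is the product a x."""
    perms = {}
    for a in elements:
        # image of position i (element x) is the position of a*x in the element list
        perms[a] = tuple(elements.index(table[a][x]) + 1 for x in elements)
    return perms

# the D2 table (table 6.2), with e, a, b, c relabelled as 1, 2, 3, 4
elements = ['e', 'a', 'b', 'c']
table = {
    'e': {'e': 'e', 'a': 'a', 'b': 'b', 'c': 'c'},
    'a': {'e': 'a', 'a': 'e', 'b': 'c', 'c': 'b'},
    'b': {'e': 'b', 'a': 'c', 'b': 'e', 'c': 'a'},
    'c': {'e': 'c', 'a': 'b', 'b': 'a', 'c': 'e'},
}
print(regular_permutations(elements, table))
# {'e': (1, 2, 3, 4), 'a': (2, 1, 4, 3), 'b': (3, 4, 1, 2), 'c': (4, 3, 2, 1)}  -> Eq. (6.30)
```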

Example 6.35 The cyclic group C3 of order three is isomorphic to the subgroup of S3 described by the elements {e, (123) , (321)}

Example 6.36 The cyclic group of order four C4 defined in table 6.1, page 102, is isomorphic to the subgroup of S4 given by
{e, (1234) , (13) (24) , (4321)}

These examples reveal some interesting characteristics concerning the cyclic structure of the subgroups of permutations
generated in this way

Theorem 6.16 Let V be a subgroup of the symmetric group Sn generated from an abstract group of order n with the assignment
described in Eq. (6.29). The permutations contained in V have the following features.

1. All permutations in V (except the one associated to the identity), change the position of all symbols. This kind of
permutations (along with the identity) are called regular permutations. An immediate consequence is that regular
permutations different from the identity do not have one-cycles.

2. Each symbol occupies a different position in each permutation. For instance, in Eq. (6.30) the element 3 is placed in the
third position in Pe , the fourth position in Pa , the first position in Pb and the second position in Pc . In other words, each symbol
runs over all possible positions, and occupies each position only once. This is a general feature of subgroups containing
only regular permutations.

3. When these regular permutations are resolved in cycles, all the cycles have the same length, this is again a general feature
of subgroups containing regular permutations.

Proof : (1) Suppose that for a given permutation Pa with a 6= e, a given symbol ak is unchanged, it means that aak =
ak ⇒ aak = eak ⇒ a = e (rearrangement lemma), leading to a contradiction. In other words, if ak is unchanged it means that
ak appears twice in the k − th column of the group table which is not possible. (2) If a given element c appears in the same
position (say the k − th position) in two different permutations Pa 6= Pb , it means that c = aak = bak leading to a = b which is

a contradiction. Again, it also implies that c appears twice in the k − th column of the group table. (3) Assume that a given
regular permutation Pa has one m−cycle (m) and one n−cycle (n) with different length so that 0 < m < n. Hence

Pa = (m) (n) (k) = (a1 , . . . , am ) (am+1 , am+2 , . . . , am+n ) (k)

where (k) denotes the remaining cycles. The element (Pa )^m must be in the group, but

(m)^m = (a1 , . . . , am )^m = (a1 ) (a2 ) . . . (am )

i.e. (m)^m is the identity on the subset {a1 , . . . , am }. On the other hand, (n)^m cannot consist of 1−cycles only, because it is
not the identity on the subset {am+1 , . . . , am+n }. Hence (Pa )^m contains at least m 1−cycles and is not the identity, which is
impossible for a regular permutation. QED.
All permutation subgroups of Sn induced from a given abstract group of order n with the method described above
possess the previous features. Observe that the assignment of a permutation Pak to a given element ak ∈ G in Eq. (6.29)
was done by left-hand side (LHS) multiplication with ak . It is clear that all the characteristics shown above remain valid if we use
a “RHS multiplication” with ak , though the specific assignment ak ↔ P ′ak is in general different, i.e. Pak ≠ P ′ak in general. We
shall use the LHS convention from now on.
As a consequence of theorem 6.16, if a regular permutation has a prime number n of symbols, it must be either the identity
or an n−cycle. It follows immediately that, if the order n of a group is prime, the resolution of the corresponding permutations
must be in n−cycles (except for the identity), so the only possible structure is cyclic. Then we find

Theorem 6.17 If the order of a group is a prime number n, the only possible group structure is the cyclic one. So it is
isomorphic with Cn .

This shows the strong limitations that the group axioms impose. There is only one possible structure for groups of prime
order no matter how large this number could be9 .

6.7 Resolution of a group in cosets, Lagrange’s theorem


Definition 6.12 Let ℜ = {bi } ⊆ G be a subgroup of a group G, and p an element in G. A left-coset of ℜ generated by p ∈ G is
defined as the set pℜ ≡ {pbi : bi ∈ ℜ} ⊆ G.

The mapping {bi } → {pbi } is obviously onto. Besides, it is one-to-one because of the rearrangement lemma. Hence, any
left-coset has the same cardinality as the subgroup ℜ that generates it. In addition, let c be an arbitrary element of G, and b a
given element of ℜ. Then c = c(b−1 b) = (cb−1 )b; since cb−1 ∈ G, c is contained in the left-coset (cb−1 )ℜ. Consequently,
the collection of all left-cosets induced by a given subgroup ℜ ⊆ G covers the group, i.e. the union of all those left-cosets
generates G.
Now we shall prove that two given left-cosets are either coincident or disjoint; that is, we should prove that if their intersection
is non-empty, they must coincide. Let pℜ, qℜ be two non-disjoint left-cosets. Hence, there exist hi , hj ∈ ℜ such that phi =
qhj ⇒ q −1 phi = hj ⇒ q −1 p = hj (hi )−1 . Now, since hj (hi )−1 ∈ ℜ, then q −1 p ∈ ℜ. Therefore, (q −1 p)ℜ = ℜ ⇒ q(q −1 p)ℜ = qℜ ⇒
pℜ = qℜ, showing that both left-cosets are coincident. We summarize our results as

Theorem 6.18 Let ℜ be a subgroup of G. The collection of all distinct left-cosets induced by ℜ forms a partition (in the set
theoretical sense, see definition 1.1) of G, in the sense that all distinct left-cosets are disjoint and their union generates G.
Further, each left-coset has the cardinality of the subgroup ℜ that generates it. The same statements are true for right-cosets.

It is important to take into account that except for ℜ itself, left-cosets are not subgroups since they do not contain the
identity. Finally, all these results are valid for right-cosets. Certainly, left-cosets and right-cosets (generated by the same
subgroup ℜ) can be different when the group is non-abelian (i.e. pℜ is in general different from ℜp), but all global properties
are exactly the same.
As a corollary, when the group G is finite of order n, the subgroup ℜ must be of finite order m ≤ n, and the number of
distinct left-cosets (or right-cosets) must be an integer number k, each of them with cardinality m. Since they form a partition
of G, we have that k · m = n i.e. the order of the group, from which we have

Theorem 6.19 (Lagrange’s theorem) If G is a group of finite order, and ℜ ⊆ G is a subgroup of G, the order of the subgroup
ℜ must be a divisor of the order of G.

If g, h are the order of G and ℜ respectively we have

g = kh
9 We should take into account that according to number theory, there is not a maximal prime number.

where the positive integer k is called the index of ℜ in G. A procedure to obtain the resolution of a group in left-cosets goes
as follows: if ℜ = G, the subgroup coincides with the group and the resolution is trivial. Otherwise, we take an element a1 of
G outside of ℜ and form the left-coset a1 ℜ; if the group is not exhausted, we take an element a2 of G outside of ℜ ∪ a1 ℜ and
form the left-coset a2 ℜ; if the group is not exhausted, we take an element outside of ℜ ∪ a1 ℜ ∪ a2 ℜ, and we continue in
this way until the cosets cover the group.
Once again, all results obtained here are also valid for right-cosets. Despite right-cosets are in general different from left-
cosets, the resolution of G by either left-cosets or right-cosets leads to the same index (the same number of cosets), as long as
both are induced by the same subgroup ℜ. 
Let a ∈ G. As we mentioned before, if a is of order k, the set of elements {a0 , . . . , ak−1 } forms a subgroup of G. This
subgroup is called the period of a; it is the smallest subgroup that contains the element a. Since, owing to Lagrange’s theorem,
the order of any subgroup must be a divisor of the order of the group that contains it (if the latter is of finite order), it
follows that

Corollary 6.20 The order of each element in a finite group must be a divisor of the order of the group.

From this corollary it turns out that

Corollary 6.21 Any group of prime order must be cyclic and can be generated from any of its elements except the identity.
Such groups cannot contain non-trivial subgroups (i.e. they only contain the group itself and the identity alone as subgroups).

This is consistent with our previous results. Lagrange’s theorem has the virtue of simplifying significantly the task of finding
proper subgroups of any group of finite order.

Example 6.37 Consider the permutation group S3 . (i) The subgroup ℜ1 = {e, (123) , (321)} has two cosets: ℜ1 itself and
M = {(12) , (23) , (31)} obtained by multiplying the elements of ℜ1 by one of the following elements (12), (23), or (31). (ii)
Now consider the subgroup ℜ2 = {e, (12)}; it induces three left cosets: ℜ2 , M1 , M2 , where M1 = {(23) , (321)} is obtained by
left multiplication of either (23) or (321) with ℜ2 and M2 = {(31) , (123)} obtained from either (31) or (123).
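The coset resolution of example 6.37 is easily automated. In the sketch below (Python, not part of the original notes; permutations are stored as tuples of images, as in the earlier sketches, and the helper names are ours) we reproduce both resolutions.

```python
def compose(p, q):
    """p∘q for permutations stored as tuples of images, q acting first."""
    return tuple(p[q[i] - 1] for i in range(len(q)))

def left_cosets(group, subgroup):
    """Resolution of 'group' into left-cosets of 'subgroup', following the procedure above."""
    cosets, covered = [], set()
    for g in group:
        if g in covered:
            continue
        coset = {compose(g, h) for h in subgroup}
        cosets.append(coset)
        covered |= coset
    return cosets

# S3 with elements written as tuples: e, (12), (23), (31), (123), (321)
e, t12, t23, t31, c123, c321 = (1, 2, 3), (2, 1, 3), (1, 3, 2), (3, 2, 1), (2, 3, 1), (3, 1, 2)
S3 = [e, t12, t23, t31, c123, c321]

print(left_cosets(S3, [e, c123, c321]))   # two cosets: R1 and {(12), (23), (31)}
print(left_cosets(S3, [e, t12]))          # three cosets: R2, {(23), (321)}, {(31), (123)}
```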

Since left-cosets (induced by a subgroup ℜ) form a partition of a group G, then according with theorem 1.1, we can form an
equivalence relation for elements in G. This is carried out by defining that two elements in G are equivalent if and only if they
belong to the same left-coset. To characterize elements belonging to the same left-coset we start by characterizing an arbitrary
element of a given left-coset. For this, we denote [p]L to indicate the left-coset pℜ induced by a subgroup ℜ ≡ {bi } ⊆ G. Let
y ∈ [p]L with p ∈ G. We have

[p]L ≡ pℜ = {y : y = pbk for some bk ∈ ℜ} = {y : p−1 y = bk for some bk ∈ ℜ}

[p]L ≡ pℜ = {y : p−1 y ∈ ℜ}                                                          (6.31)

Theorem 6.22 Let G be a group and ℜ be a subgroup of G. Two elements x, y ∈ G belong to the same left-coset (induced by
ℜ) if and only if x−1 y ∈ ℜ. Two elements x, y belong to the same right-coset (induced by ℜ) if and only if xy −1 ∈ ℜ.

Proof : We prove it for left-cosets, and for right-cosets we have an analogous procedure. Assuming that x and y belong
to the same left-coset pℜ, we have from Eq. (6.31) that p−1 x = bm ∈ ℜ and p−1 y = bk ∈ ℜ; equivalently, x−1 = (bm )−1 p−1
and y = pbk . Therefore x−1 y = (bm )−1 p−1 (pbk ) = (bm )−1 bk , so that x−1 y ∈ ℜ. Conversely, if x−1 y ∈ ℜ then x−1 y = bk for some bk ∈ ℜ,
so that x(x−1 y) = xbk ⇒ y = xbk ∈ xℜ; hence y ∈ [x]L , and obviously x ∈ [x]L , from which both elements belong to the same
left-coset. QED.
In particular, the previous proof shows that any element x ∈ pℜ can generate the same left-coset i.e. [x]L = [p]L . Thus,
relation (6.31) must hold when we replace p by any element x ∈ [p]L . In words, any element of the partition set (left-coset)
can be taken as the “seed” to generate the whole partition set.
Note that x−1 y ∈ ℜ if and only if (x−1 y)−1 = y −1 x ∈ ℜ. Similarly, xy −1 ∈ ℜ if and only if (xy −1 )−1 = yx−1 ∈ ℜ.
However, xy −1 is not in general the inverse of x−1 y. This shows that the fact that x, y belong to the same left-coset does
not necessarily imply that they belong to the same right-coset, and vice versa. It shows that left-cosets and right-cosets are in
general different.

Definition 6.13 Let G be a group, and let ℜ ⊆ G be a subgroup of G. Two elements x, y ∈ G are said left-congruent modulo
ℜ, written x ≃ y (mod ℜ), if x−1 y ∈ ℜ. Two elements x, y ∈ G are said right-congruent modulo ℜ, written x ≈ y (mod ℜ), if
xy −1 ∈ ℜ.

Theorem 6.23 Let G be a group, and ℜ ⊆ G be a subgroup of G. The congruence relations x ≃ y and x ≈ y described in
definition 6.13 are equivalence relations. The congruence x ≃ y generates a partition of G in left-cosets induced by ℜ. Further,
the congruence x ≈ y generates a partition of G in right-cosets induced by ℜ.

Proof: The fact that left (right) congruences generate left- (right-) cosets is a direct consequence of theorem 6.22. The fact
that these congruences are equivalence relations is guaranteed by theorem 1.1 and the fact that left-cosets (and right-cosets)
form a partition of G (theorem 6.18). Notwithstanding, it is illustrative to show the equivalence explicitly. We do it for the left
congruence. From x−1 x = e ∈ ℜ, we have x ≃ x. Since ℜ must contain all its inverses, x−1 y ∈ ℜ if and only if y −1 x ∈ ℜ,
hence x ≃ y if and only if y ≃ x. By the closure of ℜ under multiplication we see that x−1 y ∈ ℜ and y −1 z ∈ ℜ imply
that (x−1 y)(y −1 z) = x−1 z ∈ ℜ, so x ≃ y and y ≃ z imply x ≃ z. QED. It is clear that for abelian groups, left and right
congruences coincide, and thus left-cosets coincide with right-cosets.

6.8 Conjugacy classes


Definition 6.14 Let G be a group, and let a, b ∈ G. The element a is said to be conjugate to b in G, if ∃ u ∈ G such that
uau−1 = b. It is also said that b is the transform of a by u 10 . It can be seen that

1. a is conjugate to itself (taking u = e)


2. If a is conjugate to b, then b is conjugate to a (because a = u⁻¹ b (u⁻¹)⁻¹)

3. If a is conjugate to b, and b is conjugate to c, then a is conjugate to c. It can be seen as follows


u a u⁻¹ = b and v b v⁻¹ = c  ⇒  v (u a u⁻¹) v⁻¹ = c  ⇒  (vu) a (u⁻¹ v⁻¹) = c  ⇒  (vu) a (vu)⁻¹ = c

The conjugation relation is therefore reflexive, symmetric, and transitive (an equivalence relation). Any equivalence relation can be used to separate the elements of the set (the group) into a collection of subsets that form a partition (see theorem 1.1), in the sense that all these subsets are disjoint from each other and their union is the set (the group). The partition sets generated by the conjugation relation will be called conjugacy classes.

Definition 6.15 All elements of a group which are conjugate to each other are said to form a (conjugacy) class.

The conjugacy classes have the following features

• The classes are disjoint subsets that fill the whole set (group). In other words, each element in the group belongs to one
and only one class.

• The identity forms a class by itself

• All the elements in a class have the same order. Proof: Let k be the order of a, and let b = u a u⁻¹ be any element conjugate to a. Then

a^k = e ⇒ b^k = (u a u⁻¹)(u a u⁻¹) · · · (u a u⁻¹)  [k factors]  = u a^k u⁻¹ = u e u⁻¹ = e

If we assume that there is n < k such that b^n = e, we obtain a^n = e, contradicting the hypothesis that k is the order of a. QED.

• If ℜ ⊆ G is a subgroup of G, and a, b ∈ ℜ are conjugate to each other in G, this does not guarantee that a, b are conjugate in ℜ. This is because the element u that relates a and b by conjugation in G can lie outside of ℜ.

• If ℜ ⊆ G is a subgroup of G, and a, b ∈ ℜ are conjugate to each other in ℜ, it is clear that they are also conjugate in G.

• The only class that forms a group is the one consisting of the identity alone.

• Let Ci = {bm} be a conjugacy class of a group G. The set of all inverses of elements in Ci forms a conjugacy class Ci′ ≡ {bm⁻¹} with the same cardinality as Ci. Proof: Let bm, bn ∈ Ci; then ∃u ∈ G such that

bm = u bn u⁻¹ ⇒ bm⁻¹ = (u bn u⁻¹)⁻¹ = u bn⁻¹ u⁻¹

which shows that bm⁻¹ is conjugate to bn⁻¹. Hence, all elements in the set Ci′ belong to the same conjugacy class. Let K ⊇ Ci′ be the class that contains the set Ci′. We shall show that any element of K must be the inverse of some element in Ci, from which K ⊆ Ci′ and K = Ci′. Let v ∈ K; therefore, for each bm⁻¹ ∈ Ci′ ⊆ K, ∃a ∈ G such that v = a bm⁻¹ a⁻¹, so that v⁻¹ = a bm a⁻¹ ∈ Ci, i.e. v is the inverse of an element of Ci. The fact that Ci and Ci′ have the same cardinality comes from the uniqueness of the inverse. QED.
10 To figure out how the concept of conjugation among the elements of a group arises, remember the transformation of a matrix when a change of

basis is done. These similarity transformations form an equivalence relation.



• It could happen that a given conjugacy class Ci of a group G coincides with the conjugacy class Ci′ formed by the inverse
of its elements. In other words, there are some classes for which the inverse of each of its elements belongs to the class.
We call them ambivalent conjugacy classes. The identity alone is a trivial ambivalent conjugacy class. Since classes
form partitions, it is clear that Ci and Ci′ are either coincident or disjoint.
• In an abelian group, each element forms a conjugacy class by itself.
• Let us see explicitly the way in which the conjugate of a certain permutation is obtained. Writing each two-row permutation symbol on a single line as (top row → bottom row), let us define

Pa ≡ (1 … n → a1 … an) ; Pb ≡ (1 … n → b1 … bn) = (a1 … an → b_{a1} … b_{an})

(Pb) Pa (Pb)⁻¹ = (a1 … an → b_{a1} … b_{an}) (1 … n → a1 … an) (b1 … bn → 1 … n) = (b1 … bn → b_{a1} … b_{an})

The result is clear: to obtain the conjugate of Pa through the permutation Pb, we apply the transformation Pb to both the top row and the bottom row of Pa.
• In the symmetric group of degree n, a conjugacy class consists of all permutations with the same cycle structure. To prove it, we note from the previous item that the permutation q p q⁻¹ differs from p only in that the labels appearing in the cycle notation of p are replaced by their images under q, leaving the cycle structure unchanged. For example, (23)(12)(23)⁻¹ = (13) and (123)(12)(123)⁻¹ = (23).
• Hence, finding the number of classes in Sn is equivalent to finding all different cycle structures, i.e. all different partitions of
the integer n fulfilling the properties in Eq. (6.26). Further, the number of elements in a given class (i.e. a given cycle
structure) can be obtained from formula (6.28).
• As a corollary, the permutations in a given class are all even or all odd.
• In any symmetric group Sn , the inverse of an element belongs to the same class as the element itself. It can be shown
from Eq. (6.17), in which we see that P⁻¹ is obtained by interchanging the two rows that symbolize the original permutation
P , thus keeping the cycle structure unaltered. In particular, observe that when P is written in its cycle structure, the
inverse is obtained by keeping the same cycle structure but reversing the order of the symbols in each cycle. For example
if P = (216) (453) (78) (9) the inverse is given by P −1 = (612) (354) (87) (9). Consequently, all conjugacy classes of Sn
are ambivalent.

Example 6.38 Find the conjugate of the permutation

Pa ≡ (1 2 3 4 → 2 1 4 3) = (12)(34)

through the permutation

Pb ≡ (1 2 3 4 → 3 2 4 1) = (134)

Applying Pb to the top row 1234 of Pa we obtain 3241. Applying Pb to the bottom row 2143 of Pa we obtain 2314. The corresponding conjugate is then

(Pb) Pa (Pb)⁻¹ = (3 2 4 1 → 2 3 1 4) = (1 2 3 4 → 4 3 2 1) = (14)(23)

Observe that Pa and its conjugate (Pb) Pa (Pb)⁻¹ have the same cycle structure, as it must be.
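A small Python sketch (with the same assumed tuple encoding of permutations as in the earlier coset sketch, which is not part of the text) reproduces this computation and checks that the cycle structure is preserved.

def compose(p, q):
    # (p*q)(i) = p(q(i)); permutations as tuples of images of 1..n
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)

def cycle_type(p):
    # sorted list of cycle lengths, e.g. (12)(34) in S4 gives [2, 2]
    seen, lengths = set(), []
    for start in range(1, len(p) + 1):
        if start not in seen:
            length, j = 0, start
            while j not in seen:
                seen.add(j)
                j = p[j - 1]
                length += 1
            lengths.append(length)
    return sorted(lengths)

Pa = (2, 1, 4, 3)    # (12)(34)
Pb = (3, 2, 4, 1)    # (134)
conj = compose(compose(Pb, Pa), inverse(Pb))
print(conj)                                  # (4, 3, 2, 1), i.e. (14)(23)
print(cycle_type(conj) == cycle_type(Pa))    # True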

Exercise 6.2 By using the method described in section 6.6.1, show that S4 possesses the following classes

C1 = {e}
C2 = {(12) , (13) , (14) , (23) , (24) , (34)}
C3 = {(12) (34) , (13) (24) , (14) (23)}
C4 = {(123) , (132) , (124) , (142) , (134) , (143) , (234) , (243)}
C5 = {(1234) , (1243) , (1324) , (1342) , (1423) , (1432)}

Figure 6.2: Partitions of S3 (a) by left cosets of ℜ1 , (b) by left cosets of ℜ2 , (c) by its classes.

Exercise 6.3 Show the 15 different classes of S7 (See examples 6.32, 6.33 and exercise 6.1, Page 114).

Example 6.39 Figure 6.2 shows the partition of S3 by (a) left cosets generated with ℜ1 = {e, (123) , (321)}, (b) left cosets
generated with ℜ2 = {e, (12)} (see example 6.37) (c) the classes of S3 .

Notice that for a given group, the partition in left cosets (or right cosets) depends on the subgroup chosen, while the
partition in classes is unique.

6.9 Conjugate and Invariant subgroups


We have said that, with the exception of the class of the identity, none of the classes in a group G are subgroups of G. However, it is possible to gather some classes to form a subgroup of G (there are at least two subgroups that can be formed this way: the identity alone and G itself). Subgroups that consist of elements of G in complete classes are called invariant subgroups. To arrive at the concept of invariant subgroup we first define the conjugate of a subgroup:

Definition 6.16 We define the conjugate of the subgroup ℜ in G by the element a ∈ G, as the set of elements aℜa−1 .

This definition emulates the concept of conjugation between two elements, but thinking of the subgroup ℜ as though it were a single element. Clearly, if a ∈ ℜ, the set of elements aℜa⁻¹ coincides with ℜ.

Theorem 6.24 The conjugate of the subgroup ℜ in G by the element a ∈ G also forms a subgroup in G, with the same
cardinality of ℜ. Further, the relation of conjugation between subgroups of a group G forms an equivalence relation.
  
Proof: Let us write ℜ = {bm}, so aℜa⁻¹ = {a bm a⁻¹}. The product of any two elements of aℜa⁻¹ is (a bi a⁻¹)(a bj a⁻¹) = a bi bj a⁻¹ = a bk a⁻¹ ∈ aℜa⁻¹. Further, for each a bi a⁻¹ the inverse a bi⁻¹ a⁻¹ is also in aℜa⁻¹. Thus aℜa⁻¹ is a subgroup of G. The fact that it has the cardinality of ℜ is a consequence of the rearrangement lemma. The proof that conjugation between subgroups of G forms an equivalence relation is very similar to the proof that conjugate elements in a group form an equivalence relation. QED.

Theorem 6.25 If two subgroups ℜ and ℜ′ of a group G are conjugate to each other in G, they are isomorphic.

Proof : We know that ℜ′ = aℜa−1 for some a ∈ G. Defining the mapping M : bk ∈ ℜ → abk a−1 ∈ ℜ′ , we obtain a
one-to-one mapping of ℜ onto ℜ′ because of the rearrangement lemma. We prove that this mapping preserves the product as
follows

M(bk bm) = a (bk bm) a⁻¹ = (a bk a⁻¹)(a bm a⁻¹) = M(bk) M(bm)
QED.
Theorem 6.25, shows that a given group G can contain several subgroups with the same abstract structure (though with
different elements from the point of view of G).

Definition 6.17 An invariant subgroup ℜ in G is a subgroup that coincides with all its conjugates in G.

In other words, a subgroup ℜ in G is invariant in G if and only if

aℜa−1 = ℜ ; ∀a ∈ G (6.32)

invariant subgroups are also called self-conjugate. This condition can also be expressed as

aℜ = ℜa ; ∀a ∈ G (6.33)

i.e. a subgroup is invariant in G if and only if each left-coset defined in G coincides with its corresponding right-coset defined in G. This could be taken as an alternative definition of the concept of invariant subgroup¹¹.
Further, for an invariant subgroup we see from (6.32) that if an element bi ∈ ℜ, all its conjugate elements a bi a⁻¹ are also in the subgroup. It means that a subgroup ℜ of a group G is invariant in G if and only if it contains its elements in complete conjugacy classes of G. In other words, any given conjugacy class of G is either totally contained in, or disjoint from, an invariant subgroup in G. This would be a third possible definition of invariant subgroup, which is the one we proposed at the beginning of the section¹². It is worth remarking that we say that ℜ is invariant “in G” because ℜ could be a subgroup of another group G′; assume for instance that G′ ⊃ G ⊃ ℜ. If ℜ is an invariant subgroup in G, it does not necessarily mean that ℜ is invariant in G′.
Any group G contains at least two trivial invariant subgroups, namely the improper subgroups {e} and G.

Definition 6.18 A group is called simple if it does not contain any proper invariant subgroup.

Definition 6.19 A group is called semi-simple if it does not contain any proper abelian invariant subgroup.

Since in an abelian group each element constitutes a class by itself, we see that all the subgroups of an abelian group are
invariant. The groups of prime order are simple. Obviously, a simple group is also semi-simple.

Example 6.40 Let us define the subgroup ℜ = {e, (12)(34)}. Looking at the classes of S4 (exercise 6.2, page 119), we see that this is not an invariant subgroup (it contains only one of the elements of the class of (12)(34)). To find the conjugates of ℜ we form the products aℜa⁻¹, with a running over all the elements of S4. The conjugate subgroups are

ℜ1 = {e, (12)(34)} ; ℜ2 = {e, (13)(24)} ; ℜ3 = {e, (14)(23)}

They are isomorphic to each other and isomorphic to C2. This illustrates the fact that a group G can contain (as a subgroup) the same abstract group several times, i.e. represented with different elements of G. Each conjugate subgroup is obtained eight times, from the following elements of S4:

ℜ1 from a = e, (12) , (34) , (12) (34) , (13) (24) , (14) (23) , (1324) , (1423)
ℜ2 from a = (14) , (23) , (132) , (124) , (143) , (234) , (1243) , (1342)
ℜ3 from a = (13) , (24) , (123) , (142) , (134) , (243) , (1234) , (1432)
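The following Python sketch (a rough check, reusing the same assumed encoding of permutations as tuples of images) runs a over all of S4, collects the distinct conjugate subgroups aℜa⁻¹, and counts how many elements of S4 produce each one.

from itertools import permutations
from collections import Counter

def compose(p, q):
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)

S4 = [tuple(p) for p in permutations((1, 2, 3, 4))]
R = [(1, 2, 3, 4), (2, 1, 4, 3)]            # {e, (12)(34)}

conjugates = Counter(
    frozenset(compose(compose(a, h), inverse(a)) for h in R) for a in S4
)
print(len(conjugates))              # 3 distinct conjugate subgroups R1, R2, R3
print(sorted(conjugates.values()))  # [8, 8, 8]: each one arises from 8 elements of S4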

Example 6.41 The subgroup ℜ = {e, (12) (34) , (13) (24) , (14) (23)} is invariant in S4 since it contains two complete classes
of S4 (see exercise 6.2, page 119).

Example 6.42 The subgroup ℜ = {e, (123), (321)} is invariant in S3 since it contains two complete classes of S3. Though this is a necessary and sufficient condition for any subgroup to be invariant, let us verify two explicit cases to see that conjugate elements of ℜ must lie in these two classes:

(12) {e, (123), (321)} (12)⁻¹ = {e, (321), (123)} ; (123) {e, (123), (321)} (123)⁻¹ = {e, (123), (321)}

Example 6.43 The cyclic groups of non-prime order n are neither simple nor semi-simple. For instance, C4 = {e = a⁰, a¹, a², a³} has the subgroup {e, a²}, which is invariant and abelian. The cyclic groups of prime order are simple, as discussed above.

Example 6.44 The group SO(3) of rotations in three dimensions is a simple group. The two-dimensional rotation group SO(2) is not simple, since it contains an infinite number of abelian invariant subgroups consisting of discrete rotations Rn/m by angles that are rational fractions of 2π. It is surprising at first glance that SO(3) is simple while a proper subgroup of it, SO(2), is not. This owes to the fact that the subgroups Rn/m are invariant in SO(2), but they are not invariant in SO(3).

Theorem 6.26 Let an be the alternating group of Sn . Then an is an invariant subgroup in Sn .

Proof: Theorem 6.13 of page 112 has shown that an is a subgroup of Sn. Let P ∈ an, so that P is an even permutation. Let P′ be a permutation in the same class of Sn as P; thus P′ has the same cycle structure as P and so the same parity. Consequently, all elements in the class of P are even permutations¹³, so an contains its elements in complete classes of Sn and is therefore invariant. QED.
11 It is important to emphasize that for ℜ = {b1, . . . , bk, . . .}, the equality expressed by aℜ = ℜa is a relation between the whole sets. It does not necessarily mean that a bk = bk a.
12 It is important to bear in mind that the axioms for ℜ to be a subgroup of G must be checked. Once we are sure that ℜ is a subgroup of G, the

fact that the elements of ℜ are in complete conjugacy classes of G, is a necessary and sufficient condition for ℜ to be invariant in G.
13 Note that a subset formed by complete classes of a group G is not necessarily an invariant subgroup, because it is not necessarily a subgroup at all. For instance, it must contain the class of the identity. As an example, the set of odd permutations of Sn contains complete classes of Sn but is not a subgroup of Sn.

6.10 The factor group G/ℜ.


Definition 6.20 Let ℜ ≡ {hi } be a subgroup of the group G. The multiplication of two left-cosets aℜ and bℜ with a, b ∈ G, is
defined as the set of all the products of the form ahi bhj

(aℜ) (bℜ) ≡ {ahi bhj : ∀hi , hj ∈ ℜ}

multiplication of right-cosets is defined accordingly.

This product is particularly simple in the case in which ℜ is invariant because left-cosets coincide with right-cosets.
Remembering definition 6.13, page 117, we obtain the following theorem

Theorem 6.27 Let G be a group, and let ℜ ⊆ G be an invariant subgroup in G. Two elements x, y ∈ G are left-congruent
modulo ℜ, if and only if they are right-congruent modulo ℜ.

Proof: Since left (right) congruences generate left (right) cosets, the theorem follows from the fact that left cosets coincide with right cosets when ℜ is an invariant subgroup. Alternatively, it can be seen as follows: let x ≃ y (left congruence); hence x⁻¹y ∈ ℜ, therefore

x (x⁻¹y) ∈ xℜ = ℜx ⇒ y ∈ ℜx ⇒ y x⁻¹ ∈ ℜ ⇒ y ≈ x ⇒ x ≈ y

where we have used the symmetry of the right congruence relation ≈. Thus the left congruence x ≃ y implies the right congruence x ≈ y. The converse is proved similarly. QED. This theorem says that the distinction between left-congruence and right-congruence induced by ℜ ⊆ G makes no sense when ℜ is an invariant subgroup in G.

Definition 6.21 Let G be a group, and let ℜ ⊆ G be an invariant subgroup in G. Two elements x, y ∈ G are said congruent
modulo ℜ, written x ≃ y (mod ℜ), if xy −1 ∈ ℜ. Equivalently, two elements x, y ∈ G are said congruent modulo ℜ, written
x ≃ y (mod ℜ), if x−1 y ∈ ℜ.

Theorem 6.28 Let G be a group, and ℜ ⊆ G an invariant subgroup in G. The congruence relation x ≃ y described in
definition 6.21, is an equivalence relation. The congruence x ≃ y generates a partition of G in the cosets induced by ℜ. In
addition, congruences can be multiplied as if they were ordinary equations

x1 ≃ x2 and y1 ≃ y2 implies x1 y1 ≃ x2 y2 (6.34)

Proof: The fact that the congruence is an equivalence relation that generates the cosets comes from the combination of theorem 6.23 with theorem 6.27, along with the fact that left-cosets coincide with right-cosets when they are induced by invariant subgroups. On the other hand, the hypothesis in Eq. (6.34) reads

x2⁻¹ x1 ∈ ℜ and y1 y2⁻¹ ∈ ℜ     (6.35)

From the closure of multiplication in ℜ we have

(x2⁻¹ x1)(y1 y2⁻¹) ∈ ℜ ⇒ a ∈ ℜ , a ≡ x2⁻¹ x1 y1 y2⁻¹

Now a′ ≡ x2 a x2⁻¹ ∈ ℜ, because invariant subgroups contain complete classes. Thus

a′ = x2 (x2⁻¹ x1 y1 y2⁻¹) x2⁻¹ = x1 y1 y2⁻¹ x2⁻¹ = x1 y1 (x2 y2)⁻¹ ∈ ℜ

therefore x1 y1 ≃ x2 y2. QED. Observe that we have used the fact that ℜ is invariant in G in two steps: (a) in Eq. (6.35) we have used the equivalence between left-congruence and right-congruence, valid for invariant subgroups only, and (b) in the fact that invariant subgroups contain complete classes. As we shall see in the next theorem, Eq. (6.34), along with the fact that left-cosets and right-cosets coincide, are the most important properties of invariant subgroups, the ones that make them more important than mere subgroups.

Theorem 6.29 If ℜ is an invariant subgroup in a group G, we can define a multiplication of left (or right) cosets, in the
following way
(aℜ) (bℜ) = (ab) ℜ (6.36)

Proof : Let a and b be two fixed representative elements of the cosets aℜ ≡ [a] and bℜ ≡ [b] respectively. In terms of these
representatives we can write
(aℜ) (bℜ) = a (ℜb) ℜ = a (bℜ) ℜ = (ab) ℜℜ = (ab) ℜ
because of the associativity in G, the closure in ℜ, and the fact that ℜ is an invariant subgroup in G if and only if ℜb = bℜ
∀b ∈ G. It is clear however, that any element a′ ∈ aℜ could be a representative of this coset, so that a′ ℜ = aℜ or [a′ ] = [a].

Similarly, if b′ ∈ bℜ then [b′ ] = [b]. Consequently, we should prove that Eq. (6.36) provides a well-defined operation. That is,
that such an operation does not depend on the representatives chosen in each coset to make it. By applying property (6.34) we
see that x ≃ x1 and y ≃ y1 implies xy ≃ x1 y1 ; which in other words means that [x] = [x1 ] and [y] = [y1 ] implies [xy] = [x1 y1 ]
showing that the operations with cosets are independent of the representative chosen for each coset. QED.
If we see the cosets (including ℜ itself) as elements, and define the law of combination (aℜ) (bℜ) = (ab) ℜ between them
(i.e. as the product showed in Eq. 6.36), we find that the collection {(ai ℜ)} of all distinct cosets induced by ℜ forms a group
under such a law of combination.

Theorem 6.30 Let G be a group and ℜ an invariant subgroup in G. Let G/ℜ ≡ {(ai ℜ)} = {[ai ]} be the set obtained by
gathering all distinct cosets with ai ∈ G, and considering each coset as a single element. The set G/ℜ, with the law of
combination (aℜ) (bℜ) = (ab) ℜ, forms a group called the factor or quotient group G/ℜ. The element ℜ is the identity of this
group, and a−1 ℜ is the inverse of the element aℜ.

Proof : The theorem follows from these observations:

1. The law of combination (aℜ) (bℜ) = (ab) ℜ says that the product of two cosets is again a coset. The following properties
are corollaries of this one.

2. ℜℜ = ℜ, it arises by considering a = b = e, or appealing to the subgroup nature of ℜ.

3. (aℜ) ℜ = aℜ, arises from b = e.


 
4. (aℜ)(a⁻¹ℜ) = (a a⁻¹) ℜℜ = ℜ

5. [(aℜ) (bℜ)] (cℜ) = [(ab) ℜ] (cℜ) = [(ab) c] ℜ = [a (bc)] ℜ = (aℜ) [(bc) ℜ] = (aℜ) [(bℜ) (cℜ)]. Where we have used the
associative axiom for G. QED.

It is clear that G/e = G and G/G = e. If G is finite, the order of G/ℜ is the index of ℜ in G, hence the name quotient group. It is important to insist that a consistent definition of a quotient group G/ℜ can only be given if ℜ is invariant in G, since a well-defined product is only possible if left cosets coincide with right cosets.
 
Example 6.45 The cyclic group of four elements G4 = {a⁰, a¹, a², a³} contains the invariant subgroup ℵ = {a⁰, a²}¹⁴. We first form the cosets that fill the group. The first coset is the subgroup itself; the second is obtained by taking an element outside of the subgroup (say a¹ = a):

aℵ = {a¹, a³}

The resolution of G4 in cosets of ℵ is

G4 = ℵ ∪ (aℵ) = ℵ ∪ (ℵa)

The factor group can then be formed as

G4/ℵ ≡ {ℵ, aℵ} = {{a⁰, a²}, {a, a³}}

The group table is

ℵℵ = ℵ ; (aℵ)(ℵ) = aℵ
(aℵ)(aℵ) = a²ℵ = a²{a⁰, a²} = {a², a⁴} = {a², a⁰} ⇒ (aℵ)(aℵ) = ℵ

Relabeling ℵ ≡ e and aℵ ≡ b we get ee = e, eb = be = b, b² = e. So the factor group G4/ℵ is isomorphic to the abstract group of two elements.

Example 6.46 ℵ = {e, (123) , (321)} is an invariant subgroup in S3 and the coset that fills the group is (ij) ℵ = {(12) , (23) , (31)}
where (ij) is any transposition. Thus, the quotient group consists of two elements S3 /ℵ = {ℵ, (ij) ℵ} with the rule of multipli-
cation:

ℵ∗ℵ = ℵ ; ℵ ∗ [(ij) ℵ] = (ij) ℵ ∗ ℵ = (ij) ℵ


[(ij) ℵ] ∗ ℵ = (ij) ℵ ∗ ℵ = (ij) ℵ ; [(ij) ℵ] ∗ [(ij) ℵ] = (ij)2 ℵ ∗ ℵ = eℵ ∗ ℵ = ℵ

where we have used the fact that any transposition taken twice is the identity. With the assignment ℵ ≡ e and (ij) ℵ ≡ a we
obtain the abstract group C2 again.
14 The reader only has to prove that ℵ is a subgroup, since G4 is abelian (so that every subgroup is invariant).
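As a computational check of Examples 6.45 and 6.46 (a sketch only; the set-wise product below implements definition 6.20 with our own assumed tuple encoding of permutations), one can multiply the cosets of ℵ = {e, (123), (321)} in S3 set-wise and verify that the product of two cosets is again a single coset, reproducing the two-element multiplication table.

def compose(p, q):
    # (p*q)(i) = p(q(i)); permutations as tuples of images of 1..3
    return tuple(p[q[i] - 1] for i in range(len(q)))

H = frozenset({(1, 2, 3), (2, 3, 1), (3, 1, 2)})    # {e, (123), (321)}
M = frozenset(compose((2, 1, 3), h) for h in H)      # the coset (12)H = {(12),(23),(31)}

def coset_product(A, B):
    # set of all products a*b, as in definition 6.20
    return frozenset(compose(a, b) for a in A for b in B)

print(coset_product(H, H) == H)   # True:  H * H = H
print(coset_product(H, M) == M)   # True:  H * (12)H = (12)H
print(coset_product(M, M) == H)   # True: (12)H * (12)H = H, so S3/H is isomorphic to C2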

Example 6.47 The group E1 of all integers under ordinary sum has the set E2 of even integers as an invariant subgroup.
The cosets are M1 ≡ E2 and M2 ≡ 1 + E2 , the quotient group is isomorphic with C2 . Defining the subgroups Em invariant in
E1 as
Em ≡ {. . . , −3m, −2m, −m, 0, m, 2m, 3m, . . .} ; m ≡ positive integer
the cosets would be M1 = Em , M2 = 1 + Em , M3 = 2 + Em , . . . , Mm ≡ (m − 1) + Em . The quotient group E1 /Em is
isomorphic to the cyclic group Cm . This example15 exhibits a feature of infinite groups that finite groups do not possess: Both
groups E1 and Em are isomorphic (with the mapping n ↔ nm with n =all integers), though the quotient group E1 /Em is
non-trivial (for m > 1). For a finite group G, the only invariant subgroup of G isomorphic to G, is G itself. Therefore, the
quotient group must be the identity.

6.11 Homomorphisms
We defined previously an isomorphism between two groups G and G′ as a one-to-one mapping of G onto G′ , such that the
correspondence determined by the mapping preserves the rule of multiplication.

M : G → G′

(M a) ∗ (M b) = M (a × b) ,  or equivalently  a′ ∗ b′ = (a × b)′
where the symbols ∗ and × represent the laws of combination for G′ and G respectively. We could imagine a mapping in which
the correspondence preserves the law of combination, but several elements of G can be mapped into the same element in G′ .
Definition 6.22 Let G, G′ be two groups and let G → G′ be a mapping of G onto G′ . This mapping is called a homomorphism
of G onto G′ if it preserves group multiplication.
Notice that we required this mapping to be onto but not to be one-to-one. Since several elements of G can be mapped in
the same image point in G′ we conclude that for finite groups if g, g ′ are the orders of G, G′ , then g ≥ g ′ . If the equality holds
the homomorphism becomes an isomorphism (an isomorphism is a one-to-one homomorphism).
Theorem 6.31 Let M be a homomorphism of G onto G′. If a ∈ G, let us denote M(a) ≡ a′ ∈ G′. This homomorphism possesses the following properties
1. The identity of G must be mapped into the identity of G′. Proof: Since M(e × a) = M(e) ∗ M(a) ∀a ∈ G, then M(a) = M(e) ∗ M(a) ∀a ∈ G. Now, when a runs over all of G, M(a) runs over all of G′ because M is onto; therefore a′ = (e)′ ∗ a′ ∀a′ ∈ G′. On the other hand, by starting from M(a × e) we also show that a′ = a′ ∗ (e)′ ∀a′ ∈ G′. Then (e)′ = M(e) is the identity in G′, and we denote it as e′. QED.
2. If a ∈ G is mapped into a′ ∈ G′, then a⁻¹ is mapped into (a′)⁻¹. Proof: (a × a⁻¹)′ = (e)′ = e′. On the other hand, (a × a⁻¹)′ = a′ ∗ (a⁻¹)′. Therefore a′ ∗ (a⁻¹)′ = e′, hence (a⁻¹)′ is the unique inverse of a′, from which (a⁻¹)′ = (a′)⁻¹. QED.
3. The set ℜe′ = {ap} of all the elements which are mapped into e′ forms an invariant subgroup of G. Proof: If ai, aj are mapped into e′ then (ai × aj)′ = ai′ ∗ aj′ = e′ ∗ e′ = e′, so ai × aj is also mapped into e′. Further, (ai × ai⁻¹)′ = (e)′ = e′; on the other hand, (ai × ai⁻¹)′ = ai′ ∗ (ai⁻¹)′ = e′ ∗ (ai⁻¹)′ = (ai⁻¹)′, from which (ai⁻¹)′ = e′, and ai⁻¹ is also mapped into e′, showing that ℜe′ is a subgroup of G. Now, to show that the subgroup is invariant in G: if a ∈ ℜe′, then ∀u ∈ G we have (u × a × u⁻¹)′ = u′ ∗ a′ ∗ (u⁻¹)′ = u′ ∗ e′ ∗ (u′)⁻¹ = e′. Thus any conjugate of a is also mapped into the identity. QED.
Now since ℜe′ is an invariant subgroup in G, it induces a resolution of G in cosets G = ∪i (ai ℜe′ ). The invariant subgroup
ℜe′ is called the kernel or the center of the homomorphism M .
Theorem 6.32 Let M be a homomorphism of G onto G′ , and let ℜe′ be the kernel of M . All the elements of a certain coset
bi ℜe′ , are mapped by M into the same image point. If two elements bi and bj of G belong to different cosets, they are mapped
by M into different image points

Proof: If aj ∈ ℜe′ and bi ∈ G then (bi × aj)′ = bi′ ∗ aj′ = bi′ ∗ e′ = bi′ ∀aj ∈ ℜe′. Then any element of bi ℜe′ is mapped into bi′. To prove that elements from different cosets have different images, we shall prove that bi′ = bj′ implies bi ℜe′ = bj ℜe′. Since bi′ = bj′, then (bj′)⁻¹ bi′ = e′, but

(bj′)⁻¹ bi′ = (bj⁻¹)′ (bi)′ = M(bj⁻¹) M(bi) = M(bj⁻¹ bi)

Therefore M(bj⁻¹ bi) = e′, hence bj⁻¹ bi ∈ ℜe′, from which bj⁻¹ bi ℜe′ = ℜe′, so that bi ℜe′ = bj ℜe′. QED.
An important consequence of this theorem is the following
15 We should take into account that m + Em = Em , and in general (nm + k) + Em = k + Em with 0 ≤ k < m and n any integer.

Theorem 6.33 Let M be a homomorphism from G onto G′ , and let ℜe′ be the kernel of M . The group G′ is isomorphic to
G/ℜe′ . If the group is finite, the order of G′ must be the index of ℜe′ in G.

Proof : We shall omit the discrimination of multiplication symbols ∗ and × for G′ and G from now on. The elements of
the factor group G/ℜe′ are all the distinct cosets {bi ℜe′ }. Consider the mapping N : G/ℜe′ → G′ , of the form

N : bi ℜe′ → b′i ∈ G′ (6.37)

By theorem 6.32, N is a well-defined,16 one-to-one, and onto mapping. To see that group multiplication is preserved by N we
see that

N (bi ℜe′ ) N (bj ℜe′ ) = b′i b′j = M (bi ) M (bj ) = M (bi bj ) = (bi bj ) = N [(bi bj ) ℜe′ ]
So N is an isomorphism from G/ℜe′ onto G′ . If G is finite, G/ℜe′ has the same number of elements as G′ so the order of

G is the index of ℜe′ in G. QED.

Corollary 6.34 A homomorphism M of G onto G′ is an isomorphism if and only if ℜe′ = {e}.

From all these properties we see that we can determine the structure of the homomorphism just by identifying the elements of G which are mapped into the identity of G′, i.e. by finding the kernel or center ℜe′ (an invariant subgroup in G) of the homomorphism. Roughly speaking, the size of the kernel tells us how far the homomorphism is from being an isomorphism.

Example 6.48 Let us define a mapping from S3 = {e, (123) , (321) , (12) , (23) , (31)} to C2 = {e′ , a′ } as follows: The three
elements forming the subset H = {e, (123) , (321)} are mapped in e′ and the remaining elements R ≡ {(12) , (23) , (31)} are
mapped in a′ . This mapping is onto, and we see that it preserves multiplication from the fact that the product of any two
elements both from H or both from R results in an element of H, whereas the product of an element of H with an element in
R (in any order) gives an element in R. The center of the homomorphism is H and it can be checked explicitly that it is an
invariant subgroup in S3 . In example 6.37, we showed the cosets induced by H which are {H, (12) H} = {H, R}, these are the
elements of the quotient group G/H, which is clearly isomorphic to C2 . The homomorphism from S3 onto C2 along with the
isomorphism between C2 and S3 /H are illustrated in Fig. 6.3

Figure 6.3: Illustration of the homomorphism from S3 onto C2 and the isomorphism between C2 and S3 /H.

The homomorphism is a transitive and reflexive relation but not necessarily symmetric (it is symmetric only when it
becomes an isomorphism).

Example 6.49 We can establish a homomorphism from the cyclic group of order four G4 to the abstract group of order two
G2 = {e′ , b′ }. The elements of the subgroup ℵ invariant in G4 (defined in example 6.45) are mapped into e′ , while the elements
in the coset aℵ are associated to b′ .

Now, to prove that a given group G′ is not homomorphic to another group G, we should prove that it is not possible to find any homomorphism of G onto G′. Let us see an example.

Theorem 6.35 Let (Q+, ∗) be the group of positive rationals under usual multiplication. Let (Q, +) be the group of all rationals under usual addition. Then (Q+, ∗) is not homomorphic to (Q, +). In particular, these groups are not isomorphic.
16 This is a well-defined mapping because all elements in a given coset are mapped into a single element in the image of M. Thus, if bi and bj are in the same coset, then bi ℜe′ = bj ℜe′ but also bi′ = bj′, so the mapping (6.37) can be written equivalently as N : bj ℜe′ → bj′ ∈ G′ for any bj that belongs to the same coset as bi. Consequently, such a mapping remains the same regardless of our choice of the element in the coset to define it.

Proof: We symbolize the elements of the group (Q, +) as ak ∈ Q. Assuming that (Q+, ∗) is homomorphic to (Q, +), there exists a mapping f of Q onto Q+, with f(ak) = Ak ∈ Q+, such that

f (ai + aj ) = f (ai ) ∗ f (aj ) ; ∀ai , aj ∈ Q

Let us choose in particular ai = aj = ak/2 with ak ∈ Q; the homomorphism demands that

f(ak/2 + ak/2) = f(ak/2) ∗ f(ak/2)  ⇒  f(ak) = [f(ak/2)]²  ∀ak ∈ Q     (6.38)

Now, since 2 ∈ Q+ and f is onto, there exists am ∈ Q such that f(am) = 2. Applying condition (6.38) to am we have

f(am) = [f(am/2)]² = 2  ⇒  f(am/2) = √2

However, it is clear that am/2 ∈ Q, and it is well known that √2 ∉ Q+. Therefore, the last equality contradicts the fact that f maps Q into Q+. Hence (Q+, ∗) is not homomorphic to (Q, +). QED.

6.12 A group as a direct product of some subgroups


In some cases, the group G can be generated from some of its subgroups by an operation called “direct product”. In that case,
much of the structure (and representations) of G can be inferred from the structure (and representations) of those smaller
subgroups.

Definition 6.23 A group G is said to be the direct product of its subgroups ℜ1 , ℜ2 , . . . , ℜn if

1. The elements of different subgroups commute. That is, for all hi ∈ ℜi with i = 1, . . . , n the product h1 h2 . . . hn = h is
the same for any order of the elements.
2. Every element g ∈ G can be written in a unique way as

g = h1 ∗ h2 ∗ . . . ∗ hn (6.39)

where hi ∈ ℜi with i = 1, . . . , n; and all the subgroups are proper subgroups of G. We denote it as:

G = ℜ1 ⊗ ℜ2 ⊗ . . . ⊗ ℜn (6.40)

The subgroups ℜi are called the direct factors of G.

Theorem 6.36 If a group G can be expressed as the direct product of some of its proper subgroups ℜ1 , ℜ2 , . . . , ℜn we have

1. ℜi ∩ ℜj = {e} for i ≠ j. Proof: Assume we have an element a ≠ e common to ℜi and ℜj for i ≠ j; the element a can be written in at least two ways in the form given by Eq. (6.39). The first one is

a = e(1) ∗ e(2) ∗ e(3) ∗ . . . ∗ a(i) ∗ . . . ∗ e(j) ∗ . . . ∗ e(n) , a ∈ ℜi and e ∈ ℜj

and the second one is

a = e(1) ∗ e(2) ∗ e(3) ∗ . . . ∗ e(i) ∗ . . . ∗ a(j) ∗ . . . ∗ e(n) , e ∈ ℜi and a ∈ ℜj

contradicting the uniqueness of the expansion (6.39). QED.

2. Each ℜi is an invariant subgroup in G. Proof: If h′i ∈ ℜi, then g h′i g⁻¹ = (h1 h2 . . . hi . . . hn) h′i (h1 h2 . . . hi . . . hn)⁻¹, and since all elements with j ≠ i commute, then

g h′i g⁻¹ = hi h′i hi⁻¹ (h1 h1⁻¹) · · · (h_{i−1} h_{i−1}⁻¹)(h_{i+1} h_{i+1}⁻¹) · · · (hn hn⁻¹)  ⇒  g h′i g⁻¹ = hi h′i hi⁻¹ ∈ ℜi , ∀g ∈ G

from which ℜi is invariant in G. QED.

Moreover, from the definition it is immediate to see that the order of the factors ℜ1 ⊗ ℜ2 ⊗ . . . ⊗ ℜn is irrelevant, since elements from different subgroups commute. Although we required commutativity between any pair of elements of the different subgroups that form the product, this does not necessarily imply the commutativity of the group G. Indeed, each ℜi could be non-commutative.


Example 6.50 The cyclic group C6 = {a⁰ = e, a, a², a³, a⁴, a⁵} of order 6 can be written as

C3 = {e, a², a⁴} ; C2 = {e, a³}
C6 = C3 ⊗ C2 = C2 ⊗ C3

The elements of C6 are generated as ai aj with ai ∈ C3 and aj ∈ C2 as follows

e = ee ; a = a⁴a³ ; a² = a²e ; a³ = ea³ ; a⁴ = a⁴e ; a⁵ = a²a³

We see that each element of C6 is expressible in one and only one way in terms of the elements of each subgroup. We also see that C2 ∩ C3 = {e} and that the elements of C2 commute with the elements of C3 (all these groups are abelian). Finally, C2 and C3 are invariant in C6 (subgroups of an abelian group).

Theorem 6.37 If G = ℜ1 ⊗ ℜ2 , the quotient group G/ℜ1 is isomorphic to ℜ2 and G/ℜ2 is isomorphic to ℜ1 .

Proof : For G/ℜ1 we have G/ℜ1 = {gℜ1 : g ∈ G}, but since G = ℜ1 ⊗ ℜ2 we find

G/ℜ1 = {h1 h2 ℜ1 : h1 ∈ ℜ1 , h2 ∈ ℜ2 } = {(h1 ℜ1 ) (h2 ℜ1 ) : h1 ∈ ℜ1 , h2 ∈ ℜ2 }


G/ℜ1 = {ℜ1 (h2 ℜ1 ) : h2 ∈ ℜ2 } = {h2 ℜ1 : h2 ∈ ℜ2 }

Therefore, all distinct cosets induced by ℜ1 and generated by the elements of ℜ2 form the whole quotient group G/ℜ1. This fact suggests the following correspondence

T : ℜ2 → G/ℜ1
T : h 2 → h 2 ℜ1

T is onto G/ℜ1 by the discussion above. Assume that h2 ℜ1 = h2′ ℜ1; then ℜ1 = h2⁻¹ h2′ ℜ1, from which h2⁻¹ h2′ ∈ ℜ1. On the other hand, since h2 and h2′ belong to ℜ2, then h2⁻¹ h2′ ∈ ℜ2; but ℜ1 ∩ ℜ2 = {e}. Therefore h2⁻¹ h2′ = e and h2′ = h2. Hence h2 ℜ1 = h2′ ℜ1 implies h2′ = h2, showing that the mapping is one-to-one. The preservation of the multiplication can be seen as

T (hh′ ) = (hh′ ) ℜ1 = (hℜ1 ) (h′ ℜ1 ) = T (h) T (h′ )

then we obtain ℜ2 ≃ G/ℜ1 . Similarly, ℜ1 ≃ G/ℜ2 . QED


This theorem coincides with our intuitive notion of a quotient group. However, any intuitive idea must be formalized; for instance, the converse is not always true. Let ℜ1 be an invariant subgroup in G, and let ℜ2 be a subgroup isomorphic to G/ℜ1; it does not follow that G = ℜ1 ⊗ ℜ2. Let us see an example: S3 has an invariant subgroup H = {e, (123), (321)}, and the quotient group S3/H is isomorphic to any of the subgroups Hi = {e, (jk)} with i, j, k a cyclic permutation of 1, 2, 3. But S3 is not the direct product of H and Hi, because the elements of H and Hi do not commute. Further, note that each Hi is not invariant in S3, and hence cannot be a direct factor of S3. Nevertheless, this counter-example motivates the following definition.

Definition 6.24 Let ℜ1 and ℜ2 be two subgroups of a group G. We say that G is the semi-direct product of ℜ1 and ℜ2 , if the
following conditions are satisfied: (1) ℜ1 is an invariant subgroup in G. (2) ℜ1 ∩ ℜ2 = {e} and (3) any g ∈ G can be written
in a unique way as g = h1 h2 with h1 ∈ ℜ1 and h2 ∈ ℜ2 . The semi-direct product is denoted by

G = ℜ1 ∧ ℜ2

6.13 Direct product of groups


In section 6.12, we have shown how a given group can be written as a direct product of some of its subgroups. It would be
desirable to do the opposite. That is, given two groups G and G′, we wonder how we can construct a new group G′′ from the elements of G and G′.
To obtain the direct product of two groups G ⊗ G′ we form all possible pairs (a, a′ ) with a ∈ G, a′ ∈ G′ . From the point of
view of set theory, this is the cartesian product of the sets G and G′ . Now, for this cartesian product to become a group,
we should establish a law of combination between the ordered pairs, that satisfies the axioms of a group. The product of pairs
is defined as
(a, a′ ) (b, b′ ) ≡ (ab, a′ b′ ) (6.41)
we should remember that each group has its own law of combination, so that ab and a′ b′ really means a×b and a∗b respectively.
If we see each ordered pair as a single element belonging to the new group G ⊗ G′ i.e. A ≡ (a, a′ ) ∈ G ⊗ G′ , equation (6.41)
defines automatically a law of combination for these elements with the following properties

1. AB = (ab, a′ b′ ) hence ab ∈ G and a′ b′ ∈ G′ ⇒ AB ∈ G ⊗ G′ , ∀ A, B ∈ G ⊗ G′



2. ABC = (abc, a′ b′ c′ ) = (a (bc) , a′ (b′ c′ )) = A (BC) = ((ab) c, (a′ b′ ) c′ ) = (AB) C , ∀ A, B, C ∈ G ⊗ G′

3. E = (e, e′) satisfies (a, a′)(e, e′) = (e, e′)(a, a′) = (a, a′) , ∀ A ∈ G ⊗ G′

4. A⁻¹ = (a⁻¹, a′⁻¹), since A⁻¹A = AA⁻¹ = (a⁻¹a, a′⁻¹a′) = (e, e′) = E , ∀ A ∈ G ⊗ G′

Thus, these new elements, along with the law of combination in Eq. (6.41), satisfy the axioms of a group. In the case of
finite groups, it is clear that the order of G ⊗ G′ is the product of the orders of G and G′ . If both groups G and G′ are abelian,
then for all ai , aj ∈ G and for all bk , bl ∈ G′ the ordered pairs are combined as (ai , bk ) (aj , bl ) = (ai aj , bk bl ) = (aj ai , bl bk ) =
(aj , bl ) (ai , bk ). Hence, if two groups are abelian their direct product is abelian too.
Note that we can see the groups G and G′ as the sets of ordered pairs of the form

G ≡ {(a, e′ ) : ∀a ∈ G, e′ ∈ G′ } ; G′ ≡ {(e, a′ ) : ∀a′ ∈ G′ , e ∈ G} (6.42)

both with the law of combination (6.41). In this way, Eq. (6.42) shows G and G′ as subgroups of G ⊗ G′ . It is clear that
elements of G commute with elements of G′ when they are seen as subgroups of G ⊗ G′. Further, any pair in G ⊗ G′ is expressible in a unique way as a product of an element in G with an element in G′, i.e. (a, a′) = (a, e′)(e, a′). Finally, let (c, c′) ∈ G ⊗ G′; then we have

(c, c′)(a, e′)(c, c′)⁻¹ = (c a c⁻¹, c′ e′ c′⁻¹) = (b, e′) ∈ G , ∀(c, c′) ∈ G ⊗ G′ and ∀(a, e′) ∈ G

showing that G is an invariant subgroup in G ⊗ G′ . Similarly, G′ is invariant in G ⊗ G′ . In conclusion, when G and G′ are
seen as subgroups of G ⊗ G′ they satisfy the conditions described in Sec. 6.12 as expected.

Example 6.51 Let us find C2 ⊗ C2′ . Although C2 and C2′ are the same abstract group, we use their elements as independent

C2 ≡ {e1 , a1 } ; C2′ ≡ {e2 , a2 }

where

a1² = e1 ; a2² = e2
the direct product is defined as the cartesian product

C2 ⊗ C2′ ≡ {(e1 , e2 ) , (e1 , a2 ) , (a1 , e2 ) , (a1 , a2 )}


≡ {E = A0 , A1 , A2 , A3 }

in which the “table of multiplication” is defined by

E∗E = E ; E ∗ A1 = (e1 , e2 ) ∗ (e1 , a2 ) = (e1 e1 , e2 a2 ) = (e1 , a2 ) = A1


E ∗ A2 = A2 ; E ∗ A3 = A3

this part of the table is trivial, now we evaluate the other terms

A1 ∗ A2 = (e1 , a2 ) ∗ (a1 , e2 ) = (e1 a1 , a2 e2 ) = (a1 , a2 ) = A3


A1 ∗ A3 = (e1 a1 , a2 a2 ) = (a1 , e2 ) = A2
A2 ∗ A3 = (a1 a1 , e2 a2 ) = (e1 , a2 ) = A1

We can also check that Ai² = E; the rest of the table can be deduced from the fact that the groups (and hence their direct product) are abelian. The direct product obtained this way is isomorphic to the non-cyclic group D2 of order four, defined in table 6.2, page 102.

This example also shows that taking the direct product of groups is another method to generate new groups.
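A computational restatement of Example 6.51 (a sketch only; here each C2 is modelled as {0, 1} under addition mod 2, an assumption of ours rather than the text's notation) builds the ordered pairs and their componentwise product of Eq. (6.41).

from itertools import product

C2 = [0, 1]                                          # abstract C2 modelled additively
G = list(product(C2, C2))                            # the four ordered pairs A0..A3

def multiply(A, B):
    # (a, a')(b, b') = (ab, a'b'), Eq. (6.41), written additively mod 2
    return ((A[0] + B[0]) % 2, (A[1] + B[1]) % 2)

for A in G:
    print([multiply(A, B) for B in G])               # the 4x4 multiplication table of C2 x C2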

6.14 Classes, subgroups, invariant subgroups and quotient groups from S4 (optional)

Many of the concepts developed in this chapter can be illustrated by means of the S4 group of order 24 and degree 4. The
number of classes (i.e. the number of cyclic structures) is given by the number of partitions of its degree n = 4, and Eq.
(6.27) gives the cycle structure associated with each partition. Finally, the number of elements in a class (i.e. of a given
cycle structure) is obtained from Eq. (6.28). Denoting λ{i} a given partition, ν{i} its associated cyclic structure, and n{i} the
number of elements in the given class, we obtain the results of table 6.6

  i     λ{i}            ν{i}                                 n{i}
  1     {4}             ν1 = 4, ν2 = ν3 = ν4 = 0             1
  2     {3, 1}          ν1 = 2, ν2 = 1, ν3 = ν4 = 0          6
  3     {2, 2}          ν1 = 0, ν2 = 2, ν3 = ν4 = 0          3
  4     {2, 1, 1}       ν1 = 1, ν2 = 0, ν3 = 1, ν4 = 0       8
  5     {1, 1, 1, 1}    ν1 = ν2 = ν3 = 0, ν4 = 1             6

Table 6.6: Classification of the partitions λ{i}, the associated cycle structures ν{i}, and the number of elements n{i} in each class of the permutation group of degree 4.

Let us show explicitly the case of the partition λ{3,1} ≡ {3, 1, 0, 0}. From Eq. (6.27) with λ1 = 3, λ2 = 1, λ3 = λ4 = 0 we
find
ν1 = λ1 − λ2 = 3 − 1 = 2 ; ν2 = λ2 − λ3 = 1 − 0 = 1 ; ν3 = λ3 − λ4 = 0 − 0 = 0 ; ν4 = λ4 = 0
therefore, this class corresponds to elements with two 1-cycles and one 2−cycle, for example

(12) (3) (4) ≡ (12) ; (1) (23) (4) ≡ (23) etc.

Now, using n = 4, ν1 = 2, ν2 = 1 and ν3 = ν4 = 0 in Eq. (6.28), the number of elements in this class yields

n{3,1} = 4! / [(ν1! · ν2! · ν3! · ν4!)(1^ν1 · 2^ν2 · 3^ν3 · 4^ν4)] = 4! / [(2! · 1! · 0! · 0!)(1² · 2¹ · 3⁰ · 4⁰)] = 24 / (2 · 2) = 6
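The same counting can be automated; the short Python sketch below (assuming only the class-size formula of Eq. (6.28)) reproduces the n{i} column of table 6.6 and checks that the class sizes add up to 24.

from math import factorial

def class_size(n, nu):
    # nu[k] = number of k-cycles (including 1-cycles); Eq. (6.28):
    # n! / (prod_k nu_k! * prod_k k**nu_k)
    size = factorial(n)
    for k, count in nu.items():
        size //= factorial(count) * k ** count
    return size

classes_S4 = [
    {1: 4},          # identity
    {1: 2, 2: 1},    # transpositions (ij)
    {2: 2},          # products (ij)(kl)
    {1: 1, 3: 1},    # 3-cycles
    {4: 1},          # 4-cycles
]
sizes = [class_size(4, nu) for nu in classes_S4]
print(sizes)         # [1, 6, 3, 8, 6], as in table 6.6
print(sum(sizes))    # 24 = |S4|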

Let us start by enumerating its elements; this is easily done by ordering them by their cycle structure in the order given in table 6.6:

S4 = {e, (12) , (13) , (14) , (23) , (24) , (34) , (12) (34) , (13) (24) , (14) (23) ,
(123) , (124) , (132) , (134) , (142) , (143) , (234) , (243) , (1234) , (1243) , (1324) , (1342) , (1423) , (1432)}

its conjugacy classes are obtained by gathering elements with the same cyclic structure, so there are five conjugacy classes

C1 = {e}
C2 = {(12) , (13) , (14) , (23) , (24) , (34)}
C3 = {(12) (34) , (13) (24) , (14) (23)}
C4 = {(123) , (124) , (132) , (134) , (142) , (143) , (234) , (243)}
C5 = {(1234) , (1243) , (1324) , (1342) , (1423) , (1432)}

According to Lagrange's theorem, since the order of S4 is 24, the proper subgroups can only be of orders 2, 3, 4, 6, 8, 12; and of course there are the trivial ones: S4 itself and the identity alone. Unfortunately, there is no general algorithm to find all possible subgroups of a given group. Hence, we shall show only some of them; however, we will try to be as systematic as possible. Let us first recall some simple facts: (a) there are cyclic groups of all orders, (b) if a permutation P consists of k-cycles and/or 1-cycles only, it is clear that P^k = e.
We shall find the subgroups of increasing order:

• The only subgroup of order one is clearly {e}


• All subgroups {e, g} of order two contain the identity plus a permutation g such that g² = e. This is clearly accomplished by elements g containing only 2-cycles and/or 1-cycles. We can find six subgroups of the form {e, (ij)} and three of the form {e, (ij)(kl)}.

• Groups of order three must be cyclic, of the form {e, g, g²} with g³ = e. This is clearly accomplished by a three-cycle. Therefore, we can find four subgroups of the form {e, g, g²} with g a three-cycle¹⁷.
• There are two different abstract groups of order four, and both of them must be contained at least once in S4 owing to Cayley's theorem.

– We start with the cyclic group of order 4, C4. These are subgroups of the form {e, g, g², g³} with g any 4-cycle. There are three of these subgroups¹⁸.
17 A priori we could think that there are 8 of these subgroups, each one generated by each of the eight 3−cycles. However, some of them generate

the same group (i.e. the same set of specific elements), though in different order.
18 Once again, though there are six 4−cycles, some of them generate the same group in different order.

– The non-cyclic group of order four can be generated by gathering two complete classes C1 and C3 (Klein’s group)

V4 = {e, (12) (34) , (13) (24) , (14) (23)}

it can be guessed by observing that if a group of four elements consists of complete classes, it must contain C1 (the
identity), thus the only possible remaining class is C3 , because the other classes have more than three elements.

• Four subgroups of order six are isomorphic to S3. They are simply the groups of permutations of three of the symbols, keeping the remaining one fixed. Note that cyclic subgroups of the form {e, g, g², g³, g⁴, g⁵} cannot be generated by 6-cycles, since these are not contained in S4.
• There is a subgroup of order eight isomorphic to the dihedral group D4 . Indeed, it can be identified by constructing the
D4 group on geometrical grounds, and then identifying its elements with permutations (similar to the procedure followed
to identify D3 with S3 in Example 6.30 page 112). This subgroup is given by

D4 = {e, (1234) , (13) (24) , (1432) , (13) , (12) (34) , (24) , (14) (23)}

the cyclic group of order 8 cannot be contained in S4 , since the latter does not have 8-cycles.
• An obvious subgroup of order twelve is the alternating group consisting of all even permutations. According to theorem 6.26, page 121, it must be an invariant subgroup. It consists of

T = {e, (12) (34) , (13) (24) , (14) (23) , (123) , (124) , (132) , (134) , (142) , (143) , (234) , (243)}

so it is formed by the union of three complete classes C1 , C3 , C4 . It is isomorphic to the tetrahedral group. To form
explicitly the alternating group we take into account that: (a) it is invariant in S4 (theorem 6.26) so it contains complete
classes, including C1 (the identity). (b) The remaining complete classes must add up to 11 elements. (c) The parity of a
conjugacy class can be checked by examining the decrement of any of its elements (definition 6.10, theorem 6.12).

Although this list of subgroups is not necessarily complete, we can be sure that T and V4 are the only non-trivial invariant subgroups of S4. This can be proved by observing that, according to the number of elements in each class given in table 6.6, there is no way of joining complete classes (including the class of the identity) to obtain 2, 3, 6 or 8 elements. Therefore, subgroups of order 2, 3, 6, 8 cannot be invariant in S4.
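This can also be verified by brute force; the following Python sketch (with the same assumed tuple encoding of permutations as in the earlier sketches) checks that aV4a⁻¹ = V4 for every a ∈ S4, while the same test applied to the two-element subgroup {e, (12)} fails, as expected.

from itertools import permutations

def compose(p, q):
    return tuple(p[q[i] - 1] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p, start=1):
        inv[image - 1] = i
    return tuple(inv)

def is_invariant(G, H):
    # H is invariant in G iff a H a^{-1} = H for every a in G
    H = frozenset(H)
    return all(frozenset(compose(compose(a, h), inverse(a)) for h in H) == H for a in G)

S4 = [tuple(p) for p in permutations((1, 2, 3, 4))]
V4 = [(1, 2, 3, 4), (2, 1, 4, 3), (3, 4, 1, 2), (4, 3, 2, 1)]   # e, (12)(34), (13)(24), (14)(23)

print(is_invariant(S4, V4))                              # True
print(is_invariant(S4, [(1, 2, 3, 4), (2, 1, 3, 4)]))    # False: {e, (12)} is not invariant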
The factor groups obtained from the invariant subgroups T and V4 are isomorphic to C2 and S3 respectively:

S4/T = {T, (ij) T} with (ij) any 2-cycle ; S4/T ≃ C2


S4 /V4 = {V4 , (12) V4 , (23) V4 , (13) V4 , (123) V4 , (321) V4 }
≃ {e, (12) , (23) , (13) , (123) , (321)} ≡ S3
Chapter 7

Group representations

Physics and geometry usually work with groups of transformations rather than abstract groups. Applications of group theory in Physics are usually related to the symmetry transformations of the system under study. In classical Physics, we usually work with the symmetry transformations of the solutions of some partial differential or integral equation, and the set of all solutions of these types of equations frequently forms a vector space. In quantum mechanics, a vector space (Hilbert space) is used as the framework for the underlying theory, so that the construction of the symmetry transformations of the system is more direct. Therefore, in most applications the transformations are linear operators acting on a vector space. It is well known that a matrix representation for a given linear operator on a vector space can always be constructed¹. Therefore, we usually represent a given abstract group G by a group of operators acting on a given vector space V; these operators are denoted by U(G) (we shall use the words “operators” and “linear transformations” interchangeably). In turn, these operators can be described by their matrix representatives D(G).
In our formal developments, we shall assume that the vector spaces on which the operators are defined, are finite-dimensional.
However, most results can be extended to infinite-dimensional vector spaces widely used in Physics. In addition we shall use
complex vector spaces unless stated otherwise. We shall change slightly the notation we have used so far for vector spaces.
Therefore, before entering in the formalism, we shall briefly explain such changes

7.1 Comments on notation


We shall use the convention of summation over repeated upper and lower indices, so that

K_i B^i = ∑_i K_i B^i     (7.1)

but summation over repeated indices on the same level is not implied unless otherwise stated; for instance K_i B_i or K^i B^i refers to a fixed index i. Vectors are denoted in a variety of ways: in normal or boldface notation, x or x. Sometimes Dirac notation will be used, |x⟩. Orthonormal sets will be denoted as {e_i} or {|e_i⟩}. Indeed, the context always clarifies whether a given object is a vector. The mixing between Dirac and normal notation is particularly useful when vector spaces become algebras. The components of a vector x with respect to a basis {e_i} are denoted by x^i, hence x = e_i x^i. If the linear space has a non-trivial metric tensor g_ij, we distinguish contravariant components x^i and covariant components x_i, related by x_i = g_ij x^j. If the metric is trivial then x_i = x^i, but we mostly use x^i. Multiplication of a vector |x⟩ by a scalar is written either as α|x⟩ or |x⟩α, but the latter form will be the most frequently encountered, in order to show any further operation on |x⟩ in a more apparent form. Since lower labels are used for basis vectors |e_i⟩, upper labels will be used to denote the associated bras ⟨e^i|. Similarly, since x^i denotes vector components, x_i will denote bra components; for instance

|x⟩ = |e_i⟩ x^i  ⇔  ⟨x| = x_i ⟨e^i|     (7.2)

consequently we write

x_i† = x^{i∗}     (7.3)

This convention permits us to write the inner product as

⟨x|y⟩ ≡ (x, y) = x_i† y^i     (7.4)

A matrix will be denoted as D^i_k, with i labelling rows and k labelling columns. As long as we use a trivial metric, upper and lower indices are indistinguishable. Thus D^i_k is the same as D_i^k, and we use them interchangeably when necessary. For
1 See Sec. 3.1.


instance, when transpose or hermitian conjugation is applied to a matrix we switch upper and lower indices (as in Eq. 7.3). The transpose of a matrix, denoted by A^T or Ã, is written in terms of elements as

Ã^i_k = A^k_i     (7.5)

and the hermitian conjugation as

A^{†i}_k = A^{∗k}_i = (A^k_i)^∗     (7.6)

If A carries any other indices, we also switch them accordingly. For example,

A_μ^†(g)^m_k = A_μ^∗(g)_k^m = [A_μ(g)^k_m]^∗     (7.7)

This practice ensures that the lower-upper convention for sums works properly. Products of matrices look like

(ABC)^i_k = A^i_j B^j_m C^m_k     (7.8)

However, since lower and upper indices are indistinguishable, we can encounter alternative forms like

(ABC)^i_k = A_{ij} B^{jm} C^m_k = A^i_j B^{jm} C_{mk}     (7.9)

from time to time. All these forms are equivalent as long as the raising and lowering of indices is managed consistently. The identity operator will be denoted by E, hence E^i_j = δ^i_j.

7.2 The concept of representation and basic properties


Assume that we have a set of non-singular linear transformations acting on a given vector space. The composition of operators
is a natural law of combination for them. With this law of combination, associativity is automatically satisfied. If this set of
non-singular operators is closed under composition, contains the identity and the inverse of each of its elements, we have a
“group of operators” or a “group of linear transformations”.

Definition 7.1 If there is a homomorphism from certain group G onto a group of operators U (G), that act on the vector
space V , we say that the set of operators U (G) forms a representation of the group G, in the representation space V . If V
is of dimension n, we say that the representation has degree n or is a n−dimensional representation. If the homomorphism
is an isomorphism (a one-to-one homomorphism) we say that the representation is faithful. Otherwise the representation is
degenerate.

Observe that the concept of representation is related to groups of transformations and is intimately related with the set of
points (vector space) in which the transformations are acting on.
For an element g ∈ G, we denote the operator (on the vector space V ) associated by the homomorphism as U (g), and its
matrix representative as D (g). The homomorphic mapping guarantees that
U : g ∈ G → U(g)  ⇒  U(g1 g2) = U(g1) U(g2) , U(g⁻¹) = [U(g)]⁻¹ ; U(e) = I     (7.10)

Hence the group multiplication of G is preserved by its representation U (G). Strictly speaking, such homomorphisms can be
obtained with mappings U that are not necessarily linear. In that case we talk about non-linear representations. However,
since we shall concentrate on linear representations only, we shall not use this distinction. Observe finally that it is the linear character of the representation that permits us to form matrix representations. The dimension of the matrices is of course,
the dimension of the vector space in which the operators are defined.
It is worth noting that if we have a one-to-one operator S that maps V onto another vector space V′, a representation on V′ is automatically generated by the induced operators that map V′ into itself², i.e. U′(g) = S U(g) S⁻¹. It can be checked that U(g) → S U(g) S⁻¹ is an isomorphic mapping.
that U (g) → S U (g) S −1 is an isomorphic mapping.


In order to construct a matrix representation of G in the vector space V , it is necessary to choose a basis in V . If V is
finite-dimensional of dimension n, we can choose a finite orthonormal basis {ei , i = 1, . . . , n} on V . The operators U (g) are
then represented by n × n matrices that can be constructed from the action of the linear operators on the basis vectors
U(g)|e_i⟩ = |e_j⟩ D(g)^j_i ,  g ∈ G     (7.11)

with sum over up-down repeated indices3 . The first index j is the row-label, and the second one i represents the column-label.
It is well known that a matrix realization represents an isomorphism between U (G) and D (G) in which not only the law
2 See Sec. 3.3, Eq. (3.19), page 38.
3 Compare with Eq. (3.3) in page 34.

of combination but also the operations of sum and scalar product are preserved⁴ (we shall discuss this point later). For now we concentrate on the preservation of the product only; it can be seen by applying the operators in Eq. (7.10) to each basis vector e_i.

U(g1 g2)|e_i⟩ = U(g1) U(g2)|e_i⟩

U(g1) U(g2)|e_i⟩ = U(g1)|e_j⟩ D(g2)^j_i = (U(g1)|e_j⟩) D(g2)^j_i = |e_k⟩ D(g1)^k_j D(g2)^j_i
U(g1 g2)|e_i⟩ = |e_k⟩ D(g1 g2)^k_i

from which we obtain

D(g1)^k_j D(g2)^j_i = D(g1 g2)^k_i

which defines the algorithm for the matrix representatives to preserve composition. If |x⟩ is an arbitrary vector in V we can expand it in a given basis

|x⟩ = |e_i⟩ x^i

and we can find the coordinates of the transformed vector |x′⟩ ≡ U(g)|x⟩ as follows

|x′⟩ = |e_i⟩ x′^i
|x′⟩ ≡ U(g)|x⟩ = U(g)|e_j⟩ x^j = |e_i⟩ D(g)^i_j x^j
⇒ x′^i = D(g)^i_j x^j     (7.12)

Since there is a homomorphic mapping between the elements of the group G = {g : g ∈ G} and the set of operators U(G) = {U(g) : g ∈ G}, and an isomorphism between U(G) and the matrices D(G), we finally obtain a homomorphism between the group G and the matrices D(G). We call D(G) a matrix representation of G. Therefore

D(e) = I_{n×n} ; D(g1 g2) = D(g1) D(g2) ; D(g⁻¹) = D⁻¹(g)

For finite groups, the order of the group and of any faithful representation must coincide. For arbitrary groups, faithful representations must have the same cardinality as the group.
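As an elementary illustration (an example of our own, not taken from the text), the following Python sketch builds a two-dimensional matrix representation of the cyclic group C4, representing the generator by a rotation through π/2, and verifies the defining property D(g1)D(g2) = D(g1 g2).

def matmul(A, B):
    # product of 2x2 matrices given as nested lists
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

I = [[1, 0], [0, 1]]          # D(e)
R = [[0, -1], [1, 0]]         # D(a): rotation by pi/2

D = {0: I}
for k in range(1, 4):
    D[k] = matmul(D[k - 1], R)    # D(a^k) = D(a)^k

# group law of C4: a^i a^j = a^{(i+j) mod 4}; check D(g1)D(g2) = D(g1 g2)
print(all(matmul(D[i], D[j]) == D[(i + j) % 4] for i in range(4) for j in range(4)))   # True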

Example 7.1 Let V = C be the complex plane (i.e. the one-dimensional complex vector space), and let U (g) = 1 for all
g ∈ G. Clearly, U (g1 ) U (g2 ) = 1 · 1 = 1 = U (g1 g2 ), hence g ∈ G → 1 forms a one-dimensional representation. This is called
the trivial representation of G. A trivial representation can also be constructed in any vector space V by defining U (g) = I for
all g ∈ G, where I is the identity in V . If the group has two or more elements, this representation is necessarily degenerate.

Definition 7.2 A non-trivial degenerate representation U (G) of a group G, is a degenerate representation different from the
trivial one described in example 7.1.

Example 7.2 Let G be a set of n × n matrices that satisfies the axioms of a group. Let V = C and U(g) = det g. This is a non-trivial one-dimensional representation, because U(g1 g2) = det(g1 g2) = det g1 · det g2 = U(g1) · U(g2).

The set of elements in G mapped into the identity operator U(e) forms an invariant subgroup ℜ in G (see Sec. 6.11).
Identifying the invariant subgroup ℜ and the disjoint cosets {g_i ℜ} that fill the group as though they were single elements,
provides the quotient group G/ℜ. We also know that all the elements in a given coset are mapped into a single operator,
and elements belonging to different cosets are mapped into different operators. Consequently, the mapping g_i ℜ → U(g_i) of all
distinct cosets onto the operators U(g) defines an isomorphism between the elements of such a quotient group and the set
of operators {U(g)} (see theorem 6.33). Thus, the set {U(g)} is a faithful representation of G/ℜ. For the set of operators to
be a faithful representation of G, it is necessary and sufficient that only the identity in G maps into the identity operator, i.e.
ℜ = {e}. Summarizing

Theorem 7.1 Let U : G → U(G) be a homomorphism from G onto a set of operators U(G) on a vector space V. Let ℜ be
the kernel of this homomorphism. The representation U(G) of the group G is a faithful representation of the quotient
group G/ℜ. Further, U(G) is a faithful representation of G if and only if ℜ = {e}.

Moreover, let ℜ be any invariant subgroup of a group G, and let g_i ℜ → U(g_i ℜ) be a representation of G/ℜ. If g ∈ g_i ℜ, then
the mapping g → g_i ℜ, followed by the mapping g_i ℜ → U(g_i ℜ), is a homomorphism of G onto U(g_i ℜ). Thus a representation
of G/ℜ is also a representation of G. Notice that if ℜ is non-trivial, U(g_i ℜ) must be a non-trivial degenerate representation of
G. Assume conversely that U(G) is a non-trivial degenerate representation of G; then the kernel ℜ of the homomorphism
(which is an invariant subgroup of G) is non-trivial, so that U(G) is faithful in G/ℜ. Therefore, the existence of non-trivial
degenerate representations of G implies the existence of non-trivial invariant subgroups in G. In summary
4 See Sec. 3.1, theorem 3.1, page 36.

Theorem 7.2 Let ℜ be an invariant subgroup in a group G. (a) Any representation U (G/ℜ) of G/ℜ is also a representation
of G. (b) If ℜ is non-trivial, U (G/ℜ) is a non-trivial degenerate representation of G. (c) Conversely, if U (G) is a non-trivial
degenerate representation of G, then G has at least one non-trivial invariant subgroup ℜ such that U (G) defines a faithful
representation of G/ℜ.

Corollary 7.3 All representations (except the trivial one) of simple groups are faithful.

The fact that any representation of G/ℜ is also a representation of G, is very useful since G/ℜ is a “smaller” and usually
simpler group. Therefore, the task of finding representations of G/ℜ is often simpler than finding representations of G directly.
We illustrate these issues in the following example.

Example 7.3 The symmetric group S3 has an invariant subgroup H = {e, (123) , (321)}. The factor group

S3 /H = {H, (12) H}

is isomorphic to C2 = {e, a}. A non-trivial representation of C2 (i.e. of the quotient group) is {e, a} → {1, −1} or equivalently
{H, (12) H} → {1, −1}. This induces a 1-dimensional representation of S3 which assigns 1 to all the elements of H, and −1
to all the elements of the remaining coset (12) H = {(12) , (23) , (13)}. It can be checked that multiplication is preserved.
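The preservation of multiplication can be checked directly. The following Python sketch (not part of the original text; it uses only the standard library and identifies H with the even permutations of three objects) composes permutations of S3 and verifies that the induced map onto {1, −1} is a homomorphism:

from itertools import permutations

# A quick check that the induced map S3 -> {1, -1}, sending the invariant subgroup
# H = {e, (123), (321)} to 1 and the coset (12)H to -1, preserves multiplication.
# Permutations act on {0, 1, 2}; compose(p, q) means "apply q, then p".
def compose(p, q):
    return tuple(p[q[i]] for i in range(3))

def parity(p):                 # +1 for even permutations (the elements of H), -1 otherwise
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return 1 if inversions % 2 == 0 else -1

S3 = list(permutations(range(3)))
assert all(parity(compose(p, q)) == parity(p) * parity(q) for p in S3 for q in S3)
print("the one-dimensional representation of S3 induced by S3/H preserves products")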

7.3 Examples of construction of matrix representations


Example 7.4 Let G be the dihedral group D2 consisting of the identity e, the reflection h about the Y-axis, the reflection v
about the X-axis, and the rotation r by π around the origin, as described in example 6.18, page 105. Let V2 be the bidimensional
euclidean space with basis vectors e1, e2. To obtain a matrix representation we study the transformation of the elements of the
orthonormal basis {e1, e2} under each of the elements of the group, see Fig. 7.1a. For instance, the reflection h about the Y-axis
leaves e2 unchanged while e1 changes its sense, so that
leaves e2 unchanged while e1 changes its sense, so that

he1 = −e1 ; he2 = e2 (7.13)

the reflection v does the opposite, and the rotation r by π clearly changes the sense of both vectors

ve1 = e1 ; ve2 = −e2 ; re1 = −e1 ; re2 = −e2                    (7.14)

the matrix representatives are obtained by applying the transformations (7.13, 7.14) in Eq. (7.11)

U(h) e1 = −e1 ≡ e1 D^1_1(h) + e2 D^2_1(h)
U(h) e2 =  e2 ≡ e1 D^1_2(h) + e2 D^2_2(h)
⇒ D^1_1(h) = −1, D^2_1(h) = 0, D^1_2(h) = 0, D^2_2(h) = 1

in the last step we used the linear independence of e1 , e2 . We get


D(h) = | D^1_1(h)  D^1_2(h) | = | −1  0 |
       | D^2_1(h)  D^2_2(h) |   |  0  1 |

similarly
       
U(v) e1 =  e1 ≡ e1 D^1_1(v) + e2 D^2_1(v)
U(v) e2 = −e2 ≡ e1 D^1_2(v) + e2 D^2_2(v)   ⇒   D(v) = | 1   0 |
                                                       | 0  −1 |

U(r) e1 = −e1 ≡ e1 D^1_1(r) + e2 D^2_1(r)
U(r) e2 = −e2 ≡ e1 D^1_2(r) + e2 D^2_2(r)   ⇒   D(r) = | −1   0 |
                                                       |  0  −1 |

so the matrix representation is


   
D(e) = | 1  0 | ;   D(h) = | −1  0 |
       | 0  1 |            |  0  1 |

D(v) = | 1   0 | ;   D(r) = | −1   0 |                    (7.15)
       | 0  −1 |            |  0  −1 |

It is easy to check that the mapping g → D(g) is an isomorphism. Fig. 7.1a shows graphically the transformations of the
basis vectors under the elements of this group.
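A quick numerical check of this isomorphism can be made with the matrices of Eq. (7.15). The sketch below (assuming numpy; the group table is written out by hand from example 6.18, so both the table and the helper names are ours) verifies that D(a) D(b) = D(ab) for every pair of elements:

import numpy as np

# Matrices of Eq. (7.15) for D2 = {e, h, v, r}; check that g -> D(g) respects the
# group table (h v = r, h r = v, v r = h, and every element squares to e).
D = {'e': np.eye(2),
     'h': np.diag([-1.0, 1.0]),
     'v': np.diag([1.0, -1.0]),
     'r': np.diag([-1.0, -1.0])}
table = {('h', 'v'): 'r', ('v', 'h'): 'r', ('h', 'r'): 'v', ('r', 'h'): 'v',
         ('v', 'r'): 'h', ('r', 'v'): 'h'}
for g in D:                                    # e is the identity and g g = e
    table[('e', g)] = g
    table[(g, 'e')] = g
    table[(g, g)] = 'e'
for (a, b), c in table.items():
    assert np.allclose(D[a] @ D[b], D[c])      # D(a) D(b) = D(ab)
print("D2: the map g -> D(g) of Eq. (7.15) preserves the group law")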

Figure 7.1: Illustration of the transformation of the bidimensional euclidean space under three different groups.

Note that Eq. (7.11) in matrix form reads

ẽ′ = ẽ D(g) ;   ẽ′ ≡ ( U(g)e1 , . . . , U(g)en ) ,   ẽ ≡ ( e1 , . . . , en )                    (7.16)

where the tilde indicates transposition (a row arrangement of the basis vectors). If we transpose this equation we get

U(g) e ≡ e′ = D̃(g) e                    (7.17)

so if we construct the matrix that maps the column vector e into the transformed column vector e′, what we obtain is the
transpose of the matrix representation.
Example 7.5 Let us consider the group of continuous rotations in a plane around the origin O, G = {R(φ) : 0 ≤ φ < 2π}.
Let V2 be the bidimensional euclidean space, so that we obtain a two-dimensional representation. Defining φ as positive in the
counterclockwise sense, the elements U(φ) of the group acting on the orthonormal basis e1, e2 give (see Fig. 7.1b)

e′1 = U(φ) e1 = e1 cos φ + e2 sin φ
e′2 = U(φ) e2 = e1 cos(φ + π/2) + e2 sin(φ + π/2) = e1 (− sin φ) + e2 cos φ

e′ = M(φ) e ;   e′ ≡ | e′1 | ,   e ≡ | e1 |
                     | e′2 |         | e2 |

and the matrix representative D(φ) is obtained by taking into account Eq. (7.17)

| e′1 |   |  cos φ   sin φ | | e1 |
| e′2 | = | − sin φ  cos φ | | e2 |

D(φ) = M̃(φ) = | cos φ  − sin φ |                    (7.18)
               | sin φ    cos φ |

written in the order defined in Eq. (7.11), we write

( e′1  e′2 ) = ( e1  e2 ) | cos φ  − sin φ |
                          | sin φ    cos φ |

The new coordinates are given by Eq. (7.12) such that


x′^i = D(φ)^i_j x^j   ⇒   | x′1 |   | cos φ  − sin φ | | x1 |
                          | x′2 | = | sin φ    cos φ | | x2 |

Example 7.6 Let us now construct a representation of this group in R3 (with the rotation around the x3-axis). This transformation gives

| x′1 |   | cos φ  − sin φ  0 | | x1 |
| x′2 | = | sin φ    cos φ  0 | | x2 |
| x′3 |   |   0        0    1 | | x3 |
This matrix can also be constructed by taking the dot product as the inner product in R3, and using the algorithm to construct
a matrix representative based on the inner product, Eq. (3.25) page 40. We shall use cartesian coordinates {e_i}.

D^j_k(φ) ≡ (e_j , U(φ) e_k) = e_j · [U(φ) e_k]

U(φ) e1 = e1 cos φ + e2 sin φ ,   U(φ) e2 = −e1 sin φ + e2 cos φ ;   U(φ) e3 = e3

D^1_1(φ) = e1 · [U(φ) e1] = e1 · (e1 cos φ + e2 sin φ) = cos φ
D^2_2(φ) = e2 · [U(φ) e2] = e2 · (−e1 sin φ + e2 cos φ) = cos φ
D^1_2(φ) = e1 · [U(φ) e2] = e1 · (−e1 sin φ + e2 cos φ) = − sin φ = −D^2_1(φ)
D^3_3(φ) = 1 ,   D^1_3(φ) = D^3_1(φ) = D^2_3(φ) = D^3_2(φ) = 0

Whatever the algorithm used, we obtain (for both the two-dimensional and the three-dimensional representation) a continuous set
of matrices D(φ), and we can verify that the law of combination gives R(θ) R(φ) = R(θ + φ), showing explicitly that the group
is abelian. Nevertheless, we cannot be sure that this is the only possible representation, and we wonder whether there is
a way to construct other representations, different from the one induced by the transformation of the basis vectors with the
operators.
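The law of combination can also be verified numerically. The following sketch (assuming numpy; D is a helper of ours that builds the matrix of Eq. (7.18)) checks R(θ) R(φ) = R(θ + φ) and the commutativity of the matrices on a grid of angles:

import numpy as np

# Combination law of the rotation matrices of Eq. (7.18), checked on a grid of angles.
def D(phi):
    return np.array([[np.cos(phi), -np.sin(phi)],
                     [np.sin(phi),  np.cos(phi)]])

angles = np.linspace(0.0, 2.0 * np.pi, 7)
for th in angles:
    for ph in angles:
        assert np.allclose(D(th) @ D(ph), D(th + ph))          # D(theta) D(phi) = D(theta + phi)
        assert np.allclose(D(th) @ D(ph), D(ph) @ D(th))       # the group is abelian
print("D(theta) D(phi) = D(theta + phi) verified")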

Example 7.7 Let G be the dihedral group D3 consisting of three reflections and three rotations (one of them being the identity),
as described in example 6.20, page 105. We choose again V2 as the vector space. The basis vectors, denoted here by ex, ey,
transform as shown in Fig. 7.1c. In this figure we are using the cyclic structure of table 6.5, page 112 for S3, because
D3 ≃ S3. Expressing the transformed vectors U(g) e_i in terms of the original basis according to Eq. (7.11) gives six two-dimensional matrices.

D(e) = | 1  0 | ;   D[(123)] = −(1/2) |  1  −√3 |
       | 0  1 |                       | √3    1 |

D[(321)] = −(1/2) |   1  √3 | ;   D[(23)] = | −1  0 |
                  | −√3   1 |               |  0  1 |

D[(12)] = −(1/2) | −1  √3 | ;   D[(31)] = (1/2) |  1   √3 |
                 | √3   1 |                     | √3  −1 |

These examples show that different groups can have representations on the same vector space (the two dimensional euclidean
space in this case).

7.4 Equivalent and inequivalent representations


It is a fact of great importance that for most of the groups of relevance in Physics, the linear representations are limited and
can be classified and enumerated. Once the group structure is given, the structure of non-redundant representations determines
the structure of the vector spaces to a large extent. Since in the process of constructing representations some redundancies
could appear, such redundancies must be removed when the classification and enumeration is carried out. We shall study two
types of redundancies.
The first type of redundancy, which is the topic of this section, concerns the concept of equivalent representations. Let
{U(g)} be a representation of a group G on the vector space V. It is immediate that for any non-singular operator S on V,
the set of operators {U′(g)} ≡ {S U(g) S^{-1}} also forms a representation of G in V. The two sets of operators {U(g)} and
{U′(g)} are said to be related by a similarity transformation; it is easy to show that this is an equivalence relation between
the two representations.

Definition 7.3 Two representations of a group G related by a similarity transformation are called equivalent representations.

Let us construct the collection of all linear representations of a given group in a given vector space V. From a given
representation in this collection, we can generate all representations equivalent to it by performing similarity transformations
with all non-singular linear transformations S on V. The subcollection formed in this way will be called an equivalence class.
Since equivalence classes arise from an equivalence relation, they form a partition of the whole collection of representations in
V.
The arguments above show that when we choose one representation in a given class, the other representations in the
class are redundant because they are all generated by the chosen one. Therefore, in enumerating the representations on a
given vector space, we choose only one representation in each equivalence class. Two representations corresponding to different
equivalence classes are called inequivalent representations. Finally, since the equivalence classes form a partition, we are sure that
this enumeration covers the whole collection of inequivalent representations in V.
The similarity transformation that relates two equivalent representations resembles the similarity transformation that
relates matrix representatives of operators when a change of basis is done. Despite this resemblance, the geometrical interpretation is very different. Let M(T) be the matrix representative of the operator T in the basis u. If a new basis v is obtained
through a non-singular matrix, v′ = Bu, the matrix representative in the new basis is given by Eq. (3.18) page 38

N(T) = B̃^{-1} M(T) B̃ = S M S^{-1} ;   S ≡ B̃^{-1}

so M(T) and N(T) represent the same operator but in different bases; this is called a passive transformation. By contrast,
if T and S are operators, then T′ = S T S^{-1} represents an operator different from T, and the transformation is called an
active transformation. In a similar fashion, M and N could also represent two different (though equivalent) operators, if both
matrices are constructed in the same basis.
Our next natural task consists of finding criteria to know whether two given representations are equivalent or inequivalent.
To do this, we should look for characterizations that remain invariant under similarity transformations. One of them is the
trace. The character χ(g) of g ∈ G in a representation U(G) is defined as χ(g) = Tr U(g). If D(g) is a matrix realization of
U(g) we have

χ^(µ)(g) = Σ_i D^(µ)(g)^i_i

where (µ) denotes the representation. The quantities χ^(µ)(g) are independent of the basis chosen. A relation analogous to
a similarity transformation is the conjugation relation in abstract group theory: if g′ = a g a^{-1} with a, g, g′ ∈ G, we see that
D(g′) = D(a) D(g) D(a)^{-1}, and since the trace is invariant under cyclic permutations we have

Tr D(g′) = Tr[ D(a) D(g) D(a)^{-1} ] = Tr[ D(a)^{-1} D(a) D(g) ] = Tr D(g)

thus conjugate elements have the same character.

Theorem 7.4 Let G be a group. All elements of a given conjugacy class have the same character in a given representation
U (G). Therefore, the group character of a representation is a function of the class label only. Further, equivalent representations
possess the same set of characters.

Consequently, when we list the set of characters of the elements of the group in a certain representation, it is not necessary
to indicate the character of each element. It suffices to make a list of the classes of the group K1, K2, . . . , Kν and indicate the
character associated with each class, {χ^(µ)_1, . . . , χ^(µ)_ν}. For a given representation, this set can be thought of as a ν-dimensional
vector; this geometrical interpretation will be fruitful later on.
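As an illustration of theorem 7.4 (this snippet is not part of the original text), we can compute the characters of the two-dimensional representation of D3 ≃ S3 constructed in example 7.7 and check numerically that elements in the same conjugacy class share the same trace:

import numpy as np

# Characters of the two-dimensional D3 ~ S3 representation of example 7.7:
# conjugate elements (same cycle structure) must have the same trace.
s = np.sqrt(3.0)
D = {
    'e':     np.eye(2),
    '(123)': -0.5 * np.array([[1.0, -s], [s, 1.0]]),
    '(321)': -0.5 * np.array([[1.0,  s], [-s, 1.0]]),
    '(23)':  np.array([[-1.0, 0.0], [0.0, 1.0]]),
    '(12)':  -0.5 * np.array([[-1.0, s], [s, 1.0]]),
    '(31)':  0.5 * np.array([[1.0, s], [s, -1.0]]),
}
chi = {g: np.trace(M) for g, M in D.items()}
print(chi)   # identity: 2, the three 2-cycles: 0, the two 3-cycles: -1
assert np.isclose(chi['(12)'], chi['(23)']) and np.isclose(chi['(23)'], chi['(31)'])
assert np.isclose(chi['(123)'], chi['(321)'])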
We have seen that it is always possible to construct a matrix representative for any linear operator defined on V. Therefore,
the set of matrices {D(g)} forms a faithful matrix representation of the group of operators {U(g)} in V. On the other hand,
since to form a representation we only require a homomorphism from G onto U(G), we cannot conclude that U(G) is, up to
equivalence, the only representation of G in V (and in general it is not). Constructing all non-equivalent representations of a
group in a certain vector space is another main challenge of group theory.

7.5 Reducible and irreducible representations, invariant subspaces


We have discussed a type of redundancy for representations concerning a fixed vector space (or at least vector spaces of the same
dimensionality). In this section, we discuss a second type of redundancy concerning vector spaces of different dimensionality.
Let {U (G)} be a representation of the group G in Vn . Assume that for a given basis, the matrix representation D (G) of
U (G) has the following texture
 
D(g)_{n×n} = | D1(g)_{m×m}        0_{m×(n−m)}         | ;   ∀g ∈ G
             | 0_{(n−m)×m}    D2(g)_{(n−m)×(n−m)} |

where m < n. Symbolically, this is written as a direct sum of the two square submatrices D1 (g) and D2 (g)

D (g) = D1 (g) ⊕ D2 (g)

if we multiply two matrices of this texture we obtain


 
D(ga) D(gb) = | D1(ga) D1(gb)          0          | ;   ∀ ga, gb ∈ G
              |       0          D2(ga) D2(gb) |

and it is immediate to see that D1(g) and D2(g) also form representations in vector spaces of lower dimension than Vn.
In the opposite direction, given a couple of matrix representations we can form another one by taking their direct sum.
Therefore, given D1(G) and D2(G), the matrix representation D(G) contains no additional information. However, the texture of
the matrices depends on the specific basis chosen. In other bases, the block-diagonal form is in general lost. Therefore, when
the basis chosen is not the appropriate one, the matrices are not block-diagonal and it is not evident at first glance whether
there is another basis in which the matrix representation acquires such a texture.
From this discussion it is desirable to characterize this type of redundancy by concepts that are independent of the basis
chosen. We can see that D1 (g) maps a subspace V1 of V with dimension m into itself. Similarly, D2 (g) maps a subspace V2
of dimension (n − m) into itself. This leads to the following

Definition 7.4 Let U (G) be a representation of G in V . Let V1 be a vector subspace of V . We say that V1 is an invariant
vector subspace of V with respect to U (G) if U (g) |xi ∈ V1 for all g ∈ G and all |xi ∈ V1 . An invariant subspace is called
minimal if it does not contain any non-trivial invariant subspace with respect to U (G).

In other words, an invariant subspace with respect to U (G), is a subspace V1 of V in which we can define a restriction of
the operators, such that {U (g)} can be considered as a set of operators that maps V1 into itself. There are two trivial invariant
subspaces under any U (G), the null space {0}, and V itself5 .

Definition 7.5 A representation U (G) of G in V is irreducible, if V is a minimal invariant vector subspace with respect to
U (G). Otherwise, the representation is reducible.

Definition 7.6 Let U (G) be a reducible representation of G in V . Let {Vi } be the collection of all proper invariant vector
subspaces of V with respect to U (G). If for each Vi ∈ {Vi } , the orthogonal complement Vi⊥ is also invariant with respect to
U (G), we say that U (G) is fully reducible or decomposable6 .

Example 7.8 Consider the representation of the dihedral group D2 on the bidimensional euclidean space V2. The 1-dimensional
subspace spanned by e1 is invariant under all four operations, and the same holds for its orthogonal complement spanned by e2. The
2-dimensional representation given in example 7.4, Eq. (7.15), is fully reducible, and the two 1-dimensional representations are
obviously irreducible.

Example 7.9 The 1-dimensional subspaces generated by e1 and e2 are not invariant under the representation U (φ) of the
rotations in a plane given in example 7.5, Eq. (7.18). But if we form the following complex linear combinations

e± = (1/√2) (∓e1 − i e2)                    (7.19)

it can be shown that


U(φ) e+ = e+ e^{−iφ} ;   U(φ) e− = e− e^{iφ}

The two dimensional representation given by Eq. (7.18) can be simplified by using the basis e± which are the eigenvectors of
the matrices in Eq. (7.18). In this new basis these matrices are diagonal7
 
D′(φ) = | e^{−iφ}      0     |
        |    0      e^{iφ}  |

the matrices D′ (φ) are obtained through a similarity transformation of D (φ) by the matrix S, which gives the transformation
from the basis {e1 , e2 } to the basis {e+ , e− } and is defined by Eq. (7.19).
5 The fact that the null space is invariant follows from the fact that T (0) = 0 for any linear transformation (see Eq. 2.7, page 18).
6 The orthogonal complement Vi⊥ of a subspace Vi in V is the set of all elements in V that are orthogonal to every element of Vi. See Sec. 2.9.
7 Note that all matrices D (φ) can be diagonalized simultaneously by the same similarity transformation, because the eigenvectors of D (φ) given

by Eqs. (7.19) are independent of φ.
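This simultaneous diagonalization is easy to check numerically. In the sketch below (assuming numpy; the matrix B is a helper of ours whose columns hold the coordinates of e+ and e− from Eq. (7.19) in the basis {e1, e2}), the same similarity transformation diagonalizes D(φ) for every angle:

import numpy as np

# Diagonalization of the rotation matrices of Eq. (7.18) in the basis e± of Eq. (7.19).
def D(phi):
    return np.array([[np.cos(phi), -np.sin(phi)],
                     [np.sin(phi),  np.cos(phi)]], dtype=complex)

B = np.array([[-1.0, 1.0],
              [-1j, -1j]]) / np.sqrt(2.0)       # columns: coordinates of e+, e-  (Eq. 7.19)
for phi in np.linspace(0.0, 2.0 * np.pi, 9):
    Dp = np.linalg.inv(B) @ D(phi) @ B          # same similarity transformation for every phi
    assert np.allclose(Dp, np.diag([np.exp(-1j * phi), np.exp(1j * phi)]))
print("D'(phi) = diag(exp(-i phi), exp(i phi)) for all phi")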



Assume that V1 is an n1-dimensional proper invariant subspace of V with respect to U(G). We can always construct an
ordered basis {e1, . . . , en} of V, such that the first n1 vectors of the basis span V1. In that case, the matrix representative
constructed according to Eq. (7.11) gives

U(g) |e_i⟩ = |e_j⟩ D(g)^j_i ∈ V1 ,   ∀g ∈ G and for i = 1, . . . , n1

therefore, it is necessary that D(g)^j_i = 0 for i = 1, . . . , n1 and j = n1 + 1, . . . , n. Hence, the matrix representation acquires
the form

D(g)_{n×n} = | D1(g)_{n1×n1}       D′(g)_{n1×(n−n1)}      | ;   ∀g ∈ G
             | 0_{(n−n1)×n1}    D2(g)_{(n−n1)×(n−n1)} |
It is straightforward to check that if D(g) and D(g′) are both of this form, then D(g) D(g′) = D(gg′) is also of this form,
and that Di(g) Di(g′) = Di(gg′) for i = 1, 2. Hence all essential properties of D(G) are already contained in the lower-dimensional representations D1(G) and D2(G). It is then natural to remove these redundancies and concentrate only on
irreducible representations.
Moreover, the set {en1 +1 , . . . , en } spans V2 which is the complement of V1 (if the basis is orthonormal, V2 is the orthogonal
complement of V1 ). If V2 is also invariant, an analogous analysis shows that D′ (g) = 0, and the matrix D (g) becomes
block-diagonal.

D(g)_{n×n} = | D1(g)_{n1×n1}        0_{n1×(n−n1)}        | ;   ∀g ∈ G                    (7.20)
             | 0_{(n−n1)×n1}    D2(g)_{(n−n1)×(n−n1)} |

This is the case when the representation is fully reducible or decomposable8 . When V1 and the complement V2 in V are both
invariant, we say that the representation induces a decomposition of the vector space in two invariant spaces such that

V = V1 ⊕ V2

If we start with an arbitrary basis, the matrices will in general not be displayed in block-diagonal form, even if the representation
is fully reducible. However, if there are proper invariant subspaces under the representation, there will be a change of basis for which all
the matrices D(g) exhibit the same block texture (same structure of submatrices).
It is possible that V1 and/or V2 are not minimal invariant subspaces under U(G). In that case further reductions will be
necessary to obtain irreducible representations. For instance, if V1 is in turn fully reducible there are proper subspaces V11
and V12 of V1 such that V1 = V11 ⊕ V12. Something similar happens in V2 if it is reducible. This process should be repeated
until we obtain only minimal invariant subspaces under U(G).
Let us assume that all reductions are decompositions. In that case, the representation U (G) of operators in V defines a
set of minimal invariant subspaces Wi of V such that

V = W1 ⊕ . . . ⊕ Wp (7.21)

and with respect to these spaces, the operators can also be written as direct sums in the sense of operators

U (g) = U1 (g) ⊕ . . . ⊕ Up (g) ; g ∈ G (7.22)

We say that the group representation U(G) in V has induced a decomposition of V as a direct sum of minimal invariant
subspaces under the representation. Each subspace can be generated by an appropriate choice of basis: let us write {E^i_{ki}} ≡
{e^i_1, . . . , e^i_{ki}} for a set of linearly independent vectors that generate the invariant subspace Wi, so that ki is the dimension of the
invariant subspace. The set {E} ≡ {E^1_{k1}, . . . , E^p_{kp}} provides a basis for V. When the basis is chosen and ordered in this way,
the matrix representation has the form

D(g) = | D1(g)_{k1×k1}        0              0              0         |
       |      0         D2(g)_{k2×k2}        0              0         |   ;   ∀g ∈ G
       |      0               0             ...             0         |
       |      0               0              0       Dp(g)_{kp×kp}    |
Of course k1 + . . . + kp = n, with n the dimension of V. It makes sense to restrict the action of the operators {U (g)} to any
of the minimal invariant subspaces Wi , since we obtain mappings of Wi into itself. Notice that two or more minimal invariant
subspaces could be of the same dimensionality, so that they are isomorphic. Assume for instance that Wi and Wj are isomorphic;
we wonder whether the representations Di(g) and Dj(g) are equivalent or not. It is possible that many equivalent representations
appear in this reduction; they can be represented by a single member of their class, and we consider them as repetitions of a
single representation. If we denote by aν the number of times a given irreducible representation (ν) appears in this
decomposition, we can write the decomposition as

D(g) = Σ_{⊕ν} aν D^ν(g) ;   ∀g ∈ G                    (7.23)

where the sum runs over all inequivalent irreducible representations. Of course, irreducible representations of different dimensionality are inequivalent.
8 However, if D1(g) and D2(g) are in turn reducible, Eq. (7.20) does not guarantee that the representation is fully reducible, because it is possible that further reductions do not lead to a block-diagonal texture.
For a given representation U(G) in V, we should keep in mind that although the texture of the matrices is basis dependent,
the division of the vector space into minimal invariant subspaces is intrinsic. Finding the irreducible representations of a group
in a vector space (and in particular, proving the irreducibility of a certain representation) is a major challenge in group theory.

7.6 Unitary representations


Definition 7.7 If the group representation space V is an inner product space, and if U (g) is a unitary operator for all g ∈ G,
then we say that it is a unitary representation.
Unitary representations are of great importance in Physics since symmetry transformations are naturally related to
unitary operators, because the latter preserve norms, metrics, inner products, angles and most other relevant structures. The following
two theorems increase the importance of this kind of representation.
Theorem 7.5 If a finite-dimensional unitary representation is reducible it is also fully reducible (decomposable)
Proof : Let U (G) be a unitary reducible representation of G in an inner product space V of dimension m. Let V1
be a proper invariant n1 −dimensional subspace under U (G) with n1 < m. We choose an ordered orthonormal basis
{e1 , e2 , . . . , en1 , . . . , em } of V such that {e1 , . . . , en1 } spans V1 . The orthogonal complement V2 of V1 in V , is spanned
by {e_{n1+1}, e_{n1+2}, . . . , e_m}. We shall prove that V2 is also invariant with respect to U(G). Since V1 is invariant,
|e_i(g)⟩ ≡ U(g) |e_i⟩ ∈ V1 for i = 1, . . . , n1. Further, since the U(g)'s are non-singular, the set {|e_i(g)⟩, i = 1, ..., n1} also
forms a basis of V1 for each g ∈ G (see theorem 2.12, page 20). On the other hand, it is clear that ⟨e_j|e_i⟩ = 0 for all
j = n1 + 1, n1 + 2, . . . , m and i = 1, . . . , n1. Since the U(g)'s are unitary we have

⟨e_j|e_i⟩ = ⟨U(g) e_j|U(g) e_i⟩ ≡ ⟨e_j(g)|e_i(g)⟩ = 0 ;   ∀g ∈ G

for all j = n1 + 1, n1 + 2, . . . , m and i = 1, . . . , n1. Hence, for each g ∈ G and for j = n1 + 1, n1 + 2, ..., m, the elements
{|e_j(g)⟩} are orthogonal to the basis {|e_i(g)⟩, i = 1, ..., n1} of V1. Consequently

U(g) |e_j⟩ ≡ |e_j(g)⟩ ∈ V2 ,   for each g ∈ G and for j = n1 + 1, n1 + 2, . . . , m
but {|e_j(g)⟩ ; j = n1 + 1, . . . , m} is a basis for V2 for each g ∈ G because U(g) is non-singular. Therefore, any vector |x⟩ ∈ V2
is a linear combination of them, and we have

U(g) |x⟩ = Σ_{j=n1+1}^{m} U(g) |e_j⟩ x^j = Σ_{j=n1+1}^{m} x^j U(g) |e_j⟩ = Σ_{j=n1+1}^{m} x^j |e_j(g)⟩ ∈ V2                    (7.24)

QED. This theorem can be extended to infinite-dimensional Hilbert spaces as long as the sums (7.24), which become series, are
meaningful, and if the linearity of the unitary operator can be applied to the series (and not only to finite linear combinations).
This is guaranteed by the fact that by applying a unitary operator to a complete orthonormal set in a Hilbert space, we obtain
again a complete orthonormal set. Unitary operators used in Physics usually satisfy these conditions. It should be emphasized,
however, that the order of the group is arbitrary (not necessarily finite).
Theorem 7.6 Every representation U (G) of a finite group G in an inner product space V , is equivalent to a unitary repre-
sentation
Proof: For a pair of vectors x, y in V, we denote the inner product in V as ⟨x|y⟩. Now let us construct the quantity

{x, y} ≡ (1/nG) Σ_{g∈G} ⟨U(g)x | U(g)y⟩                    (7.25)

where nG is the (finite) order of the group. It can be shown that this expression also satisfies the axioms of an inner product.
By applying the rearrangement lemma to this new inner product we get
{x, y} = (1/nG) Σ_{g∈G} ⟨U(gg′)x | U(gg′)y⟩ = (1/nG) Σ_{g∈G} ⟨U(g) U(g′)x | U(g) U(g′)y⟩

{x, y} = {U(g′)x, U(g′)y}                    (7.26)

for any fixed element g′ of the group. This means that the representation U(G) is unitary with respect to this new inner
product. Now we consider a basis {u_i} orthonormal in the original inner product, and a basis {v_i} orthonormal in the new
inner product, as well as the operator T that takes the u_i's into the v_j's.

⟨u_i|u_j⟩ = {v_i, v_j} = δ^i_j ;   v_i = T u_i   ⇒   T x = T(x^i u_i) = x^i T u_i = x^i v_i

⇒ {Tx, Ty} = {x^i v_i, y^j v_j} = x^*_i y^j δ^i_j = x^*_i y^i = ⟨x|y⟩

⇒ {Tx, Ty} = ⟨x|y⟩ ,  ∀x, y ∈ V   ⇔   {x, y} = ⟨T^{-1}x | T^{-1}y⟩ ,  ∀x, y ∈ V                    (7.27)

we have taken into account that T must be non-singular to represent a change of basis⁹. Now consider the equivalent
representation defined by T^{-1}

U′(g) = T^{-1} U(g) T                    (7.28)
using Eqs. (7.28, 7.27, 7.26) we find

⟨U′(g)x |U′(g)y⟩ = ⟨T^{-1} U(g) T x | T^{-1} U(g) T y⟩ = ⟨T^{-1}[U(g) T x] | T^{-1}[U(g) T y]⟩
                 = {U(g) T x, U(g) T y} = {T x, T y} = ⟨x|y⟩   ⇒
⟨U′(g)x |U′(g)y⟩ = ⟨x|y⟩

and the representation U ′ (G) equivalent to U (G) is unitary with respect to the original inner product. QED.
Note that the operator T transforms a basis orthonormal with respect to the original inner product into a basis orthonormal
with respect to the new inner product. T is not unitary because the original and final bases are not both orthonormal with
respect to the same inner product. The theorem is valid for finite groups, but the proof suggests that it can be extended to
infinite groups if we are able to extend the summation in Eq. (7.25) in a way that is meaningful and consistent with the
rearrangement lemma. Indeed, this theorem can be extended to the case of compact continuous Lie groups.
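The averaging construction of the proof is easy to reproduce numerically for a finite group. The sketch below (assuming numpy; the starting non-unitary representation of C2 is an invented example obtained by conjugating diag(1, −1) with a shear) builds the Gram matrix of the new inner product, factorizes it as M = S†S, and checks that the equivalent representation S U(g) S^{-1} is unitary with respect to the original inner product:

import numpy as np

# Hypothetical non-unitary representation of C2 = {e, a}: conjugate diag(1, -1) by a shear.
T = np.array([[1.0, 1.0], [0.0, 1.0]])
A = T @ np.diag([1.0, -1.0]) @ np.linalg.inv(T)
reps = {'e': np.eye(2), 'a': A}
assert np.allclose(A @ A, np.eye(2))                 # group law a a = e
assert not np.allclose(A.conj().T @ A, np.eye(2))    # A itself is not unitary

# Gram matrix of the averaged inner product {x, y} = (1/nG) sum_g (U(g)x)^† (U(g)y)
M = sum(U.conj().T @ U for U in reps.values()) / len(reps)
L = np.linalg.cholesky(M)          # M = L L^†  (M is positive definite)
S = L.conj().T                     # so that M = S^† S
U_prime = {g: S @ U @ np.linalg.inv(S) for g, U in reps.items()}
for U in U_prime.values():
    assert np.allclose(U.conj().T @ U, np.eye(2))    # the equivalent representation is unitary
print("unitarized representation:", {g: np.round(U, 3) for g, U in U_prime.items()})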
Theorems 7.5 and 7.6 have as a corollary that any representation of a finite group is fully reducible. Hence, for finite groups
we can perform successive decompositions until we arrive at irreducible representations. We should remember that a given
irreducible representation could appear more than once within the original reducible representation; this situation is denoted
as

U(G) = U¹(G) ⊕ . . . ⊕ U¹(G) ⊕ U²(G) ⊕ . . . ⊕ U²(G) ⊕ . . . ⊕ U^p(G) ⊕ . . . ⊕ U^p(G)
         (a1 times)             (a2 times)                     (ap times)

U(G) = Σ_{⊕µ=1}^{p} aµ U^µ(G)                    (7.29)

We should also recall that this decomposition is intrinsic because it concerns operators. In contrast, to obtain a block-diagonal
form for the matrix representatives we require an appropriate basis. However, it is very difficult in practice to check the
reducibility or irreducibility of a given representation from its matrix representation. Our next task is to develop some tools to
recognize whether a representation is reducible or not when its matrix representatives are not in block-diagonal form. Moreover,
if the representation is reducible, we should find a way to reduce it into its irreducible representations.

7.7 Schur’s lemmas


We shall prove two lemmas that permit us to develop the central theorems of group representation theory, and provide
us with some criteria to check irreducibility.

Lemma 7.1 (Schur’s lemma 1) Let U (G) and U ′ (G) be two irreducible representations of G in V and V ′ respectively. Let A
be a linear transformation from V ′ to V which satisfies A U ′ (g) = U (g) A for all g ∈ G. It follows that either (i) A = 0, or
(ii) A is an isomorphism from V ′ onto V (i.e. V and V ′ are isomorphic) and U (G) is equivalent to U ′ (G).

Fig. (7.2a) helps to visualize the situation. By hypothesis, ∀x′ ∈ V ′ we have AU ′ (g) x′ = U (g) Ax′ ≡ y ∈ V . Therefore,
we can do the mapping x′ → y in two ways. (1) x′ → Ax′ = x → U (g) x = U (g) Ax′ = y. (2) x′ → U ′ (g) x′ = y ′ → Ay ′ =
AU ′ (g) x′ = y.
Proof : (i) We define the range of A as: R ≡ {|xi ∈ V : |xi = A |x′ i for some |x′ i ∈ V ′ } (see Fig. 7.2b). For any |xi ∈ R
we have U (g) |xi = U (g) A |x′ i = AU ′ (g) |x′ i for some |x′ i ∈ V ′ and for all g ∈ G. Therefore, for any |xi ∈ R and for all
g ∈ G, we have U (g) |xi = A |U ′ (g) x′ i = A |y ′ i ∈ R. This shows that R is an invariant subspace of V with respect to U (g),
but U (g) is irreducible, hence we have either R = {|0i} (so A = 0) or R = V (so A is onto).
9 Since T must be one-to-one and onto, the statement on the LHS of Eq. (7.27) is equivalent to saying that {Tx, Ty} = ⟨x|y⟩, ∀ Tx, Ty ∈ V; defining Tx ≡ x′, Ty ≡ y′ it can be written as {x′, y′} = ⟨T^{-1}x′ | T^{-1}y′⟩, ∀ x′, y′ ∈ V, which is the statement on the RHS of Eq. (7.27). Since T^{-1} is also one-to-one and onto, the same procedure leads to the double implication.

Figure 7.2: Illustration of the proof of Schur's lemma 1.

(ii) Now consider the null space N ′ (in V ′ ) of A, N ′ ≡ {|x′ i ∈ V ′ : A |x′ i = |0i}. (see Fig. 7.2c). For any |x′ i ∈ N ′
then AU ′ (g) |x′ i = U (g) A |x′ i = U (g) |0i = |0i. Hence for any |x′ i ∈ N ′ and ∀g ∈ G, we get A [U ′ (g) |x′ i] = |0i, so that
U ′ (g) |x′ i ∈ N ′ . Hence, N ′ is an invariant subspace of V ′ under U ′ (g). Because of the irreducibility of U ′ (g) we have either:
N ′ = V ′ (so A = 0) or N ′ = {|0′ i} (so that A is one-to-one10).
Finally, if V and V ′ are non-zero vector spaces, a consistent combination of (i) and (ii) leads to two possibilities11 : (a)
A = 0, or (b) A is a linear one-to-one transformation from V ′ onto V . The latter case means that V and V ′ are isomorphic
and A is invertible so that U (G) = AU ′ (G) A−1 , from which U (G) and U ′ (G) are equivalent representations. QED.

Lemma 7.2 (Schur’s lemma 2) Let U (G) be an irreducible representation of a group G on the finite-dimensional vector space
V . Let A be an arbitrary operator in V . If A commutes with all the operators in the representation, that is if A U (g) = U (g) A,
∀g ∈ G, then A must be a multiple of the identity operator

Proof : Let us consider the equation


A |xi = λ |xi (7.30)
whose solutions give the eigenvalues and eigenvectors of A. If V = {|0⟩} the theorem is trivial, so we shall assume that
V ≠ {|0⟩}. We shall use the well-known property that if V ≠ {0} is a finite-dimensional vector space, the set {λ_i} of
all the eigenvalues of A (its spectrum) is a non-empty finite set (see theorem 3.5, page 45). In addition, eigenvectors are
non-null by definition. Consequently, the subspace Mλ of all the eigenvectors associated with a given eigenvalue λ (plus the
null element) is a non-zero subspace of V.
If |xi is an eigenvector associated with a given eigenvalue λ, using our hypothesis and Eq. (7.30), we have

U (g) A |xi = λU (g) |xi ⇒ A [U (g) |xi] = λ [U (g) |xi]

Thus U (g) |xi is an eigenvector of A with eigenvalue λ for all g ∈ G. Therefore, the subspace Mλ of eigenvectors of A associated
with the eigenvalue λ, is an invariant subspace under U (G). Since U (G) is irreducible, Mλ must be either (a) the null space
{|0i} or (b) the whole space V . The first possibility is a contradiction (an eigensubspace must be non-zero). The second
possibility says that there is a unique eigenvalue, and all vectors in V are eigenvectors of A with eigenvalue λ. Consequently,
Eq. (7.30) holds for all vectors in V which implies A = λE. Note that it is possible that λ = 0, and in such a case A = 0.
QED.
10 Assume A |x′ i = A |y ′ i ⇒ A (|x′ i − |y ′ i) = |0′ i. Then, |x′ i − |y ′ i ∈ N ′ but N ′ = |0′ i ⇒ |x′ i − |y ′ i = |0′ i. We conclude that A |x′ i = A |y ′ i ⇒

|x′ i = |y ′ i, and A is one-to-one.


11 If V and/or V ′ are zero vector spaces, then A = 0. But in principle, irreducible representations are not defined in zero vector spaces.
7.8. ORTHONORMALITY AND COMPLETENESS RELATIONS OF IRREDUCIBLE MATRIX REPRESENTATIONS143

nG        order of the group G
µ, ν      labels for inequivalent irreducible representations of G
nµ        the dimension of the µ irreducible representation
D^µ(g)    the matrix corresponding to the element g ∈ G in the µ representation
ζi        notation for the classes; the index i runs over the classes of the group
χ^µ_i     character associated with the class ζi in the µ representation
ni        number of elements in the class ζi
nc        number of classes of the group, i = 1, . . . , nc
Table 7.1: Notation utilized for relevant quantities concerning irreducible inequivalent representations of a group G.

Notice that Schur's lemma 2 has been established for finite-dimensional vector spaces only (the cardinality of the group, however, is arbitrary). The problem with the extension to infinite-dimensional vector spaces lies in the fact that operators do
not necessarily possess eigenvalues at all¹². However, most operators useful in Physics have a complete set of orthonormal
eigenvectors with their associated eigenvalues; this is the case of observables in quantum mechanics¹³. Therefore, the theorem
can be extended to infinite dimensions for most physical applications.
An important consequence of Schur’s lemma 2 is the following

Theorem 7.7 All irreducible representations of any abelian group must be 1-dimensional.

Proof: Let U(G) be an irreducible representation of an abelian group G. Let p be a fixed element of the group. Now,
U(p) U(g) = U(g) U(p) ∀g ∈ G, because the group is abelian. Hence U(p) is an operator that commutes with all the
U(g)'s, and we conclude from Schur's lemma 2 that U(p) = λ_p E. Since p is arbitrary, the representation {U(g)} is equivalent to
the set of operators {λ_p E}. But this representation is reducible, in contradiction with our hypothesis, unless E is the identity
in one dimension. Therefore, U(G) is equivalent to the representation p → λ_p ∈ C for all p ∈ G. QED.
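As a concrete illustration of this theorem (not part of the original text), the n one-dimensional irreducible representations of the cyclic group C_n can be taken as d_k(a^m) = exp(2πi k m/n). The following Python sketch verifies that each of them is a homomorphism and that distinct ones are orthogonal in the sense of the relations derived in the next section:

import numpy as np

# The n one-dimensional irreps of the cyclic group C_n = {e, a, ..., a^(n-1)}:
# d_k(a^m) = exp(2*pi*i*k*m/n).  A minimal numerical illustration.
n = 5
d = lambda k, m: np.exp(2j * np.pi * k * m / n)

# homomorphism: d_k(a^m1) d_k(a^m2) = d_k(a^(m1+m2))
for k in range(n):
    for m1 in range(n):
        for m2 in range(n):
            assert np.isclose(d(k, m1) * d(k, m2), d(k, (m1 + m2) % n))

# orthogonality of distinct irreps: (1/n) sum_m d_mu(a^m)* d_nu(a^m) = delta_mu_nu
for mu in range(n):
    for nu in range(n):
        s = sum(np.conj(d(mu, m)) * d(nu, m) for m in range(n)) / n
        assert np.isclose(s, 1.0 if mu == nu else 0.0)
print("C_%d: %d one-dimensional irreducible representations verified" % (n, n))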

7.8 Orthonormality and completeness relations of irreducible matrix representations
7.8.1 Orthonormality of irreducible matrix representations
In table 7.1 we introduce the notation necessary for our treatment. Let G be a finite group, and U µ (g) and U ν (g) irreducible
representations in spaces Vµ and Vν respectively. Let X be an arbitrary nµ × nν matrix and define
M_x = Σ_{g∈G} D^{µ†}(g) X D^ν(g) ;   D^{µ†}(g) = D^{µ−1}(g)                    (7.31)

Now, by our notation we have D^{µ†}(g)^m_l = D^{µ*}(g)^l_m = [D^µ(g)^l_m]^*. We shall use unitary representations; this is always
possible according to theorem 7.6. Now we calculate the following product
D^{µ−1}(p) M_x D^ν(p) = Σ_{g∈G} D^{µ−1}(p) D^{µ−1}(g) X D^ν(g) D^ν(p) = Σ_{g∈G} D^µ(p^{−1}) D^µ(g^{−1}) X D^ν(g) D^ν(p)
                      = Σ_{g∈G} D^µ(p^{−1}g^{−1}) X D^ν(gp) = Σ_{g∈G} D^µ((gp)^{−1}) X D^ν(gp)
                      = Σ_{g∈G} D^{µ−1}(gp) X D^ν(gp) = Σ_{g∈G} D^{µ†}(gp) X D^ν(gp) = Σ_{g∈G} D^{µ†}(g) X D^ν(g)

where we used the rearrangement lemma in the last step. Therefore

D^{µ−1}(p) M_x D^ν(p) = M_x ,  ∀p ∈ G   ⇒   M_x D^ν(p) = D^µ(p) M_x ,  ∀p ∈ G

Applying Schur's lemmas we have either M_x = 0 and µ ≠ ν, or µ = ν and M_x = c_x E, with c_x a scalar constant and E the
identity operator¹⁴. Let us take X as one element in the set of nν nµ matrices X^k_l (k = 1, . . . , nν ; l = 1, . . . , nµ) with matrix
elements (X^k_l)^i_j = δ^k_j δ^i_l. We find

M_x ≡ M^k_l = Σ_{g∈G} D^{µ†}(g) X^k_l D^ν(g)

(M_x)^m_n ≡ (M^k_l)^m_n = Σ_{g∈G} D^{µ†}(g)^m_i (X^k_l)^i_j D^ν(g)^j_n

(M_x)^m_n ≡ (M^k_l)^m_n = Σ_{g∈G} D^{µ†}(g)^m_i δ^k_j δ^i_l D^ν(g)^j_n = Σ_{g∈G} D^{µ†}(g)^m_l D^ν(g)^k_n

but we have seen that M_x = (M^k_l)^m_n = 0 if µ ≠ ν. We also saw that if µ = ν, then (M_x)^m_n = c_x δ^m_n, which for this case we
denote as (M^k_l)^m_n = c^k_l δ^m_n. These results give us

(M^k_l)^m_n = Σ_{g∈G} D^{µ†}(g)^m_l D^ν(g)^k_n = δ_{µν} δ^m_n c^k_l                    (7.32)

12 See discussion in Sec. 3.10.1, page 49.
13 See definition 3.8, page 51.
14 Since M_x is a rectangular nµ × nν matrix, it could represent a linear transformation from Vν to Vµ. Thus Schur's lemma 1 says that either (i) M_x = 0 or (ii) M_x is an isomorphism between Vν and Vµ, so that M_x must be a square matrix and ν = µ. In the second case we can apply Schur's lemma 2 to say that M_x is proportional to the identity. Note that Schur's lemma 1 does not forbid the possibility that M_x = 0 and µ = ν. Nevertheless, this case is included in Schur's lemma 2, since if µ = ν then M_x = c_x E and in particular c_x could be null.

now let us take for a moment µ = ν, in this case δµν = 1 and the matrices Xlk and Mlk become nµ × nµ square matrices, the
indices m, n, k, l all run from 1 to nµ . We shall also take m = n and sum over m to find
Σ_{m=1}^{nµ} (M^k_l)^m_m = Σ_{g∈G} Σ_{m=1}^{nµ} D^{µ†}(g)^m_l D^µ(g)^k_m = c^k_l Σ_{m=1}^{nµ} δ^m_m

Σ_{g∈G} Σ_{m=1}^{nµ} D^µ(g)^k_m D^{µ†}(g)^m_l = nµ c^k_l

Σ_{g∈G} [D^µ(g) D^{µ†}(g)]^k_l = nµ c^k_l                    (7.33)

and since the representation is unitary we have D^µ(g) D^{µ†}(g) = E_{nµ×nµ}, so that [D^µ(g) D^{µ†}(g)]^k_l = δ^k_l.

Σ_{g∈G} δ^k_l = nµ c^k_l                    (7.34)

If k ≠ l then Eq. (7.34) shows that c^k_l = 0. If k = l, taking into account that the sum on the left-hand side of (7.34) runs
over the nG elements of G, we have

Σ_{g∈G} δ^k_k = nG = nµ c^k_k   ⇒   c^k_k = nG/nµ

and the c^k_l coefficients are

c^k_l = (nG/nµ) δ^k_l

Replacing this in Eq. (7.32), it becomes

Σ_{g∈G} D^{µ†}(g)^m_l D^ν(g)^k_n = (nG/nµ) δ^ν_µ δ^m_n δ^k_l

so we obtain the following

Theorem 7.8 (Orthonormality of irreducible representation matrices): Let {D^µ(g)} denote a matrix realization of a finite-dimensional µ-irreducible representation of a group G of finite order. These matrices satisfy the following condition

(nµ/nG) Σ_{g∈G} D^{µ†}(g)^m_l D^ν(g)^k_n = δ_{µν} δ^k_l δ^m_n                    (7.35)

where µ, ν are two irreducible representations, nµ is the dimensionality of the µ representation and nG is the order of the group.
It is essential to emphasize that this theorem is valid only for irreducible representations, denoted by µ, ν. This is because
its validity depends on Schur's lemmas, which are applicable to irreducible representations only. If we take into account that
for abelian groups all irreducible representations are 1-dimensional, theorem 7.8 leads to the following
Corollary 7.9 If dµ (g) denotes the 1-dimensional irreducible representations of a finite order abelian group, the orthonor-
mality theorem 7.8 becomes
(1/nG) Σ_g d^*_µ(g) d_ν(g) = δ_{νµ} ;   d_µ(g), d_ν(g) ∈ C                    (7.36)
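A numerical check of Eq. (7.35) can be made with two irreducible representations of S3 already at hand: the two-dimensional (orthogonal, hence unitary) representation of example 7.7 and the one-dimensional representation of example 7.3. The sketch below (assuming numpy; the matrices are copied by hand from example 7.7) verifies both the µ = ν and the µ ≠ ν cases:

import numpy as np

s3 = np.sqrt(3.0)
D2rep = {
    'e':     np.eye(2),
    '(123)': -0.5 * np.array([[1.0, -s3], [s3, 1.0]]),
    '(321)': -0.5 * np.array([[1.0,  s3], [-s3, 1.0]]),
    '(23)':  np.array([[-1.0, 0.0], [0.0, 1.0]]),
    '(12)':  -0.5 * np.array([[-1.0, s3], [s3, 1.0]]),
    '(31)':  0.5 * np.array([[1.0, s3], [s3, -1.0]]),
}
sign = {'e': 1.0, '(123)': 1.0, '(321)': 1.0, '(12)': -1.0, '(23)': -1.0, '(31)': -1.0}
nG, n_mu = 6, 2

# mu = nu: (n_mu/n_G) sum_g D^dagger(g)^m_l D(g)^k_n = delta^k_l delta^m_n  (real matrices)
for m in range(2):
    for l in range(2):
        for k in range(2):
            for n in range(2):
                tot = sum(D[l, m] * D[k, n] for D in D2rep.values()) * n_mu / nG
                assert np.isclose(tot, 1.0 if (l == k and m == n) else 0.0)

# mu != nu: the corresponding sum vanishes when one factor comes from the 1-dimensional irrep
for k in range(2):
    for n in range(2):
        assert np.isclose(sum(sign[g] * D2rep[g][k, n] for g in D2rep), 0.0)
print("orthonormality relation (7.35) verified for S3")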

7.8.2 Geometrical interpretation of the orthonormality relation of irreducible matrix representations
A geometrical interpretation of Eq. (7.35) can be facilitated by the following definition

Definition 7.8 (Vector space VG ): Let G = {g1 , g2 , . . . , gnG } be a finite group of order nG . Let us see each element gk of the
group as a (normalized) nG −tuple of the form (0, 0, . . . , gk , . . . , 0) with nG coordinates. Defining linear operations on them, we
obtain a vector space VG of dimension nG in which {|gj i} forms an orthonormal basis.

To see why we call Eq. (7.35) an orthonormality property, let us form an nG-tuple in VG by fixing the indices (ν, k, n). In
that case, a fixed nG-tuple is given by

D̂^ν(G)^k_n ≡ √(nν/nG) D⃗^ν(G)^k_n ≡ √(nν/nG) ( D^ν(g1)^k_n , D^ν(g2)^k_n , . . . , D^ν(g_{nG})^k_n )                    (7.37)

and Eq. (7.35) is the usual orthonormality condition for vectors of the type (7.37). This interpretation becomes clearer by seeing
the components √(nν/nG) D^ν(g_i)^k_n in Eq. (7.37) as the representation of vectors {|ν, k, n⟩} in the {|g_i⟩} basis of VG,

⟨g_i |ν, k, n⟩ = √(nν/nG) D^ν(g_i)^k_n                    (7.38)

which would be the i-th component of the vector D̂^ν(G)^k_n along the |g_i⟩ unit basis vector of the nG-dimensional space
VG. Let us consider the inner product of two vectors in the set {|ν, k, n⟩}

⟨µ, l, m |ν, k, n⟩ = Σ_{i=1}^{nG} ⟨µ, l, m |g_i⟩ ⟨g_i |ν, k, n⟩                    (7.39)

where we have used the completeness of the basis {|gi i}. With our notation (7.38) we see that the RHS of Eq. (7.39) coincides
with the LHS of Eq. (7.35) from which we obtain a more familiar form concerning orthonormality
⟨µ, l, m |ν, k, n⟩ = Σ_{i=1}^{nG} ⟨µ, l, m |g_i⟩ ⟨g_i |ν, k, n⟩ = δ_{µν} δ^k_l δ^m_n                    (7.40)

7.8.3 Completeness relations for irreducible matrix representations


The geometrical interpretation developed in Sec. 7.8.2 opens the way for the following theorem

Theorem 7.10 The number of inequivalent irreducible representations of a finite group is restricted by the condition
Σ_µ n²_µ ≤ nG                    (7.41)

Proof: “Vectors” of the type |µ, k, n⟩ defined by Eqs. (7.37, 7.38) are orthonormal nG-tuples labeled by {|µ, k, n⟩}. Since
for a given µ the labels (k, n) take n²_µ values, Σ_µ n²_µ is the number of vectors in this set. On the other hand,
nG is the number of components of each vector, i.e. the dimension of VG. Since all these vectors are orthogonal and
hence linearly independent, it is clear that the number of vectors in this set cannot exceed the dimension of the vector space VG.
QED.
This theorem is important because it imposes a limit to the number of possible irreducible inequivalent representations {µ}
for a given finite group. Let us state another theorem for which part of the proof will be put off until our study of the regular
representation.

Theorem 7.11 The dimension nµ of each irreducible inequivalent representation of a finite group G is related to the order
nG of the group by

Σ_µ n²_µ = nG                    (7.42)

where the sum runs over all inequivalent irreducible representations of G. Further, the corresponding matrices satisfy a
completeness relation of the form

Σ_µ Σ_{k=1}^{nµ} Σ_{n=1}^{nµ} (nµ/nG) D^µ(g)^k_n D^{µ†}(g′)^n_k = δ_{g g′}                    (7.43)

µ\g e a
d1 (C2 ) 1 1
d2 (C2 ) 1 −1
Table 7.2: Irreducible inequivalent representations of C2 .

Proof: We shall prove Eq. (7.42) in Sec. 7.12. For now we take Eq. (7.42) for granted and prove Eq. (7.43). Following the
arguments in the proof of theorem 7.10, we see that when equality holds in Eq. (7.41), the number of vectors in the linearly
independent set {|µ, k, n⟩} is precisely nG. Thus, such a set forms a basis in VG. Using the completeness of {|µ, k, n⟩} we write

Σ_µ Σ_{k=1}^{nµ} Σ_{n=1}^{nµ} |µ, k, n⟩ ⟨µ, k, n| = E

the matrix representative of these operators in the orthonormal basis {|g⟩} is given by

Σ_µ Σ_{k=1}^{nµ} Σ_{n=1}^{nµ} ⟨g |µ, k, n⟩ ⟨µ, k, n| g′⟩ = ⟨g| E |g′⟩

Σ_µ Σ_{k=1}^{nµ} Σ_{n=1}^{nµ} ⟨g |µ, k, n⟩ ⟨µ, k, n |g′⟩ = δ_{g g′}                    (7.44)

applying Eq. (7.38) in Eq. (7.44) we obtain Eq. (7.43). QED.


If we have found some inequivalent irreducible representations of a given finite group, Eq. (7.42) tells us whether we
have exhausted all of them or not. Further, the equality in Eq. (7.42) means that the number of independent vectors in
the set {|µ, k, n⟩} defined in Eq. (7.37) equals the dimension nG of the vector space VG. Therefore, the set {|µ, k, n⟩} of
nG-component vectors is not only orthonormal but also complete in VG. It is clear from the proof of theorem 7.11 that Eq. (7.43) is a
manifestation of the completeness of the basis {|µ, k, n⟩}¹⁵. In the case of abelian groups, all irreducible representations are 1-dimensional,
so that nµ = 1 for all µ and the number of inequivalent irreducible representations equals nG. On the other hand, these results
can be extended to infinite groups as long as we are able to obtain meaningful summations.

7.9 Examples of application of the orthonormality and completeness conditions for irreducible matrix representations
Orthonormality relations (7.35) can be used to construct new irreducible inequivalent representations from those already
known. We shall illustrate it by the following examples

Example 7.10 Consider the group C2 = {e, a} of order two. Let us start from the trivial representation d1(e) = d1(a) = 1. In
this case nµ = 1 for any representation (abelian group) and nG = 2. Organizing the representation as a normalized vector of the
type (7.37) we have d1(G) = (1/√2)(1, 1). To find another inequivalent irreducible representation we should find a normalized
two-component “vector” orthogonal to the previous one. Since the representation of the identity must be the number 1, we see
that the new vector is of the form d2(G) = (1/√2)(1, x). The condition of normalization along with orthogonality with d1(G)
leads to the unique solution x = −1. The representation is then d2(e) = −d2(a) = 1. Finally, with these two representations,
equality in Eq. (7.42) holds. Hence theorem 7.11 says that there are no more irreducible inequivalent representations. These
representations are summarized in table 7.2.

If some irreducible inequivalent representations are already known, orthonormality and completeness conditions could help
to find other representations. Further, combination of these conditions with other tools to handle simpler representations (like
theorem 7.2) could provide us additional information about representations. Let us see an example

Example 7.11 Consider the dihedral group D2 (table 6.2, page 102). The trivial irreducible representation is {e, a, b, c} →
{1, 1, 1, 1}. The elements {e, a} form an invariant subgroup, and the factor group is given by {(e, a), (b, c)} = {Re, bRe}, which
is isomorphic to C2. Example 7.10 shows that this quotient group has two inequivalent irreducible representations. Now, by
using theorem 7.2, we can obtain two representations of the full group D2 induced by the quotient group C2. They are: the
trivial one, d1(G) = 1, and d2(G) : {e, a, b, c} → {1, 1, −1, −1}¹⁶. Applying the same procedure to the invariant subgroups
15 Note that completeness of {|µ, k, n⟩} is given by P_{|µ,k,n⟩} = Σ_{µ,k,n} |µ, k, n⟩ ⟨µ, k, n| = E (see Eq. 4.21, page 78). So Eq. (7.44) is the matrix representation of completeness in the basis {|g⟩}. Conversely, Eq. (7.40) can be interpreted as completeness of {|g⟩} written in the basis {|µ, k, n⟩}, and Eq. (7.44) can be considered as orthonormality of {|g⟩} written in the basis {|µ, k, n⟩}.
16 The latter is obtained from the representation e → 1, b → −1 of D2/Re ≈ C2 ≡ (e, b), in which all the elements in the kernel Re (i.e. e, a) are mapped into the image of the identity, 1, while all the elements in the coset bRe (i.e. b, c) are mapped into the image of b, given by −1.

µ\g e a b c
d1 (D2 ) 1 1 1 1
d2 (D2 ) 1 1 −1 −1
d3 (D2 ) 1 −1 1 −1
d4 (D2 ) 1 −1 −1 1
Table 7.3: Irreducible inequivalent representations of D2 .

{e, b} and {e, c}, we obtain two additional inequivalent irreducible representations, d3(G) : {e, a, b, c} → {1, −1, 1, −1} and
d4(G) : {e, a, b, c} → {1, −1, −1, 1}. These four representations satisfy the orthonormality and completeness conditions,
so they exhaust the irreducible inequivalent representations of D2. They are summarized in table 7.3; the orthogonality of
the rows of this table represents the orthonormality condition, while the orthogonality of the columns represents completeness¹⁷.
Note that none of the irreducible inequivalent representations of D2 is faithful, since all the homomorphisms come from a
non-trivial kernel.
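Since D2 is abelian, table 7.3 is also its character table, and the orthonormality and completeness conditions reduce to statements about its rows and columns. A short numerical check (assuming numpy; not part of the original text):

import numpy as np

# Rows of table 7.3 (the four 1-dimensional irreps of D2); rows: d1..d4, columns: e, a, b, c.
table = np.array([[1, 1, 1, 1],
                  [1, 1, -1, -1],
                  [1, -1, 1, -1],
                  [1, -1, -1, 1]], dtype=float)
nG = 4
assert np.allclose(table @ table.T / nG, np.eye(4))   # (1/nG) sum_g d_mu(g)* d_nu(g) = delta_mu_nu
assert np.allclose(table.T @ table / nG, np.eye(4))   # completeness: columns orthogonal over the irreps
print("table 7.3 satisfies the orthonormality and completeness conditions")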

7.10 Orthonormality and completeness relations for irreducible characters


The orthonormality and completeness relations for irreducible matrices are very important theoretical results, and we have seen
that they permit us to find new representations from those already known. However, in practice the procedure becomes difficult
as the dimension of the representations increases. Further, matrices depend on the basis chosen. It is then natural to find a way
to work with basis-independent quantities. We have learnt that characters are basis independent. Further, for a given class
of equivalent representations, characters are also independent of the member of the class chosen. Finally, from theorem 7.4 we
see that all elements of the same conjugacy class in a group possess the same character in a given representation, so that
we deal with nc characters instead of nG (nc ≤ nG). Consequently, if we are able to characterize irreducible inequivalent
representations by characters, the number of degrees of freedom decreases considerably and only intrinsic quantities are used.

Lemma 7.3 Let U µ (G) be an irreducible representation of a finite group G in a vector space V . The sum of U µ (g) over all
the elements of a given conjugacy class ζi , is given by
Σ_{h∈ζi} U^µ(h) = (n_i/nµ) χ^µ_i E                    (7.45)

where E is the identity in V , and the remaining quantities are defined in table 7.1, page 143.

Proof: Performing a similarity transformation on the left-hand side of Eq. (7.45), we have

U^µ(g) [ Σ_{h∈ζi} U^µ(h) ] U^µ(g)^{−1} = Σ_{h∈ζi} U^µ(g) U^µ(h) U^µ(g^{−1}) = Σ_{h∈ζi} U^µ(g h g^{−1})

since g h g^{−1} ∈ ζi for all g ∈ G and for all h ∈ ζi, we can use the rearrangement lemma on the elements of the class ζi to find

U^µ(g) [ Σ_{h∈ζi} U^µ(h) ] U^µ(g)^{−1} = Σ_{h∈ζi} U^µ(h)   ⇒

U^µ(g) A_i = A_i U^µ(g) ,  ∀g ∈ G ;   A_i ≡ Σ_{h∈ζi} U^µ(h)

then Schur's lemma 2 says that A_i = c_i E. The constant c_i can be evaluated by taking the trace on both sides of the latter
equation

Tr A_i = c_i Tr E   ⇒   Σ_{h∈ζi} Σ_{k=1}^{nµ} U^µ(h)^k_k = c_i nµ   ⇒   Σ_{h∈ζi} χ^µ(ζ_i) = c_i nµ

n_i χ^µ_i = c_i nµ   ⇒   c_i = (n_i/nµ) χ^µ_i

⇒ A_i = (n_i/nµ) χ^µ_i E

and returning to the definition of A_i, we obtain Eq. (7.45). QED.
17 However, neither rows nor columns are normalized, because the normalization factor that appears in Eq. (7.37) is absent in these rows and

columns.

Theorem 7.12 (Orthonormality and completeness of group characters): The characters of inequivalent irreducible representations of a finite group G satisfy the following relations

Σ_{i=1}^{nc} (n_i/nG) χ^{†i}_µ χ^ν_i = δ_{µν}        (Orthonormality)                    (7.46)

(n_i/nG) Σ_µ χ^µ_i χ^{†j}_µ = δ_{ij}        (Completeness)                    (7.47)

where by notation χ^{†i}_µ = (χ^µ_i)^*. The summation on the label “i” runs over all distinct conjugacy classes of G. The summation
on µ runs over all inequivalent irreducible representations of G.
Proof: Using Eq. (7.35) with m = l, k = n and summing over both indices, we have

(nµ/nG) Σ_{g∈G} Σ_{m=1}^{nµ} Σ_{k=1}^{nµ} D^{µ†}(g)^m_m D^ν(g)^k_k = δ_{µν} Σ_{m=1}^{nµ} Σ_{k=1}^{nµ} δ^k_m δ^m_k

(nµ/nG) Σ_{g∈G} χ^{µ†}(g) χ^ν(g) = δ_{µν} Σ_{m=1}^{nµ} δ^m_m

(nµ/nG) Σ_{i=1}^{nc} n_i χ^{†i}_µ χ^ν_i = nµ δ_{µν}

which gives Eq. (7.46). To prove completeness, we start with Eq. (7.43), summing g over the elements of a fixed class ζi and
summing g ′ over the elements of another given class ζj
Σ_µ Σ_{l=1}^{nµ} Σ_{k=1}^{nµ} (nµ/nG) Σ_{g∈ζi} Σ_{g′∈ζj} D^µ(g)^l_k D^{µ†}(g′)^k_l = Σ_{g′∈ζj} Σ_{g∈ζi} δ_{g g′}                    (7.48)

now from Eq. (7.45) of lemma 7.3, we have

Σ_{g∈ζi} D^µ(g)^l_k = (n_i/nµ) χ^µ_i E^l_k ;   Σ_{g′∈ζj} D^{µ†}(g′)^k_l = (n_j/nµ) χ^{†j}_µ E^k_l                    (7.49)

further, the first sum (over g ∈ ζi) on the right-hand side of (7.48) is clearly zero if g′ ∉ ζi. Now, if g′ ∈ ζi, only one term
within this sum is one (when g = g′) and the others are zero, thus

Σ_{g∈ζi} δ_{g g′} = δ_{ij}                    (7.50)

where ζj is the class associated with g′. Replacing (7.49, 7.50) in (7.48), we have

Σ_µ Σ_{l=1}^{nµ} Σ_{k=1}^{nµ} (nµ/nG) [ (n_i/nµ) χ^µ_i E^l_k ] [ (n_j/nµ) χ^{†j}_µ E^k_l ] = Σ_{g′∈ζj} δ_{ij}

Σ_µ (n_j n_i)/(nµ nG) χ^µ_i χ^{†j}_µ Σ_{l=1}^{nµ} Σ_{k=1}^{nµ} E^l_k E^k_l = n_j δ_{ij}

(n_i n_j/nG) Σ_µ χ^µ_i χ^{†j}_µ (Tr E/nµ) = n_j δ_{ij}

(n_i/nG) Σ_µ χ^µ_i χ^{†j}_µ = δ_{ij}

which reproduces Eq. (7.47). QED.


Once again, the appearance of Eqs. (7.46, 7.47) invites a geometrical interpretation. Let us define “normalized”
characters

χ̂_i ≡ √(n_i/nG) χ_i                    (7.51)

and with the summation convention, Eqs. (7.46, 7.47) become

χ̂^{†i}_µ χ̂^ν_i = δ_{µν}        (orthonormality)                    (7.52)
χ̂^µ_i χ̂^{†j}_µ = δ_{ij}        (completeness)                    (7.53)

Defining

χ̂^µ ≡ ( χ̂^µ_1 , χ̂^µ_2 , . . . , χ̂^µ_{nc} )                    (7.54)

as an nc-component vector and defining a “scalar product” for vectors of this type, Eq. (7.52) can be written as

χ̂^{†µ} · χ̂^ν = δ_{µν}        (orthonormality)                    (7.55)

Corollary 7.13 The number of irreducible inequivalent representations of any finite group G, equals the number of distinct
conjugacy classes of G (denoted by nc ).

The characters χ^µ_i can be organized in an nc × nc matrix; we shall designate µ as the row index and i as the column index.
All column vectors are orthogonal to each other, though they are not normalized; row vectors are in general neither orthogonal
nor normalized. A table of this matrix for any given G is called its character table.
If we instead define the normalized character table, in which we use the normalized characters of Eq. (7.51) as entries,
then rows are orthonormal to each other (orthonormality, expressed by Eq. 7.52) and columns are orthonormal to each other
(completeness, expressed by Eq. 7.53). Therefore, the normalized character table forms an nc × nc unitary matrix.
The representation by characters is much more useful than the matrix representations, because the characters are intrinsic
to the representation and involve many fewer degrees of freedom.

Example 7.12 For abelian groups each element is a class by itself, and all irreducible representations are 1-dimensional.
Therefore D^µ(g) = D^µ(ζi) = χ^µ_i, so that the matrix representations coincide with the character representations. For instance,
the matrix tables 7.2 and 7.3 of C2 and D2, respectively, are also character tables.

Example 7.13 We shall find the character table of the non-abelian group S3 . This group has three classes: the 1-cycle {e}, the
set of 2-cycles {(12) , (23) , (13)} and the 3−cycles {(123) , (321)}. We label i = 1, 2, 3 for the 1−cyclic, 2−cyclic and 3−cyclic
classes respectively. We must have three inequivalent irreducible representations. We already know the trivial one p → E for
all p ∈ S3 . We label it as µ = 1, and its three characters are given by
 
( χ^(1)_1 , χ^(1)_2 , χ^(1)_3 ) = (1, 1, 1)                    (7.56)

A second unidimensional irreducible representation was shown in example 7.3, page 134 (derived from theorem 7.2); it was given by
\[
R_e = \{e, (123), (321)\} \to 1 \quad , \quad (ij)\,R_e = \{(12), (23), (13)\} \to -1
\]
reordering the mapping by putting classes together, we rewrite this homomorphism as
\[
\zeta_1 = \{e\} \to 1 \quad , \quad \zeta_2 = \{(12), (23), (13)\} \to -1 \quad , \quad \zeta_3 = \{(123), (321)\} \to 1
\]
\[
\Rightarrow\; \left( \chi^{(2)}_1,\; \chi^{(2)}_2,\; \chi^{(2)}_3 \right) = (1, -1, 1) \tag{7.57}
\]

The last irreducible inequivalent representation ($\mu = 3$) must be 2-dimensional according to Eq. (7.42), theorem 7.11, since $n_G = 6 = 1^2 + 1^2 + n_3^2$. The first character in $\mu = 3$ will be given by $\chi^{(3)}_1 = \mathrm{Tr}\, D(e) = \mathrm{Tr}\, E = 2$. The other two characters are determined by the orthogonality of the column vectors in the character table. At this point the table reads

µ\i   ζ1   ζ2   ζ3
d1     1    1    1
d2     1   −1    1
d3     2    x    y

orthogonality between the first and second column, and between the first and third column, gives
\[
(1^*,\; 1^*,\; 2^*) \cdot \begin{pmatrix} 1 \\ -1 \\ x \end{pmatrix} = 1\cdot 1 + 1\cdot(-1) + 2x = 0 \;\Rightarrow\; x = 0
\]
\[
(1^*,\; 1^*,\; 2^*) \cdot \begin{pmatrix} 1 \\ 1 \\ y \end{pmatrix} = 1\cdot 1 + 1\cdot 1 + 2y = 0 \;\Rightarrow\; y = -1 \tag{7.58}
\]
so that $x \equiv \chi^{(3)}_2 = 0$ and $y \equiv \chi^{(3)}_3 = -1$, hence
\[
\left( \chi^{(3)}_1,\; \chi^{(3)}_2,\; \chi^{(3)}_3 \right) = (2, 0, -1) \tag{7.59}
\]

The character table of S3 is shown in table 7.4a; it can be checked that all columns are orthogonal to each other but the rows are not. Neither the row vectors nor the column vectors are normalized. We put the conjugate on the elements of the character vector on the LHS of Eqs. (7.58) as a reminder that, if the characters are complex, the first “vector” must be taken as an adjoint (row) vector arrangement.

(a)                         (b)
µ\i   ζ1   ζ2   ζ3          µ\i    ζ1      ζ2      ζ3
d1     1    1    1          d1    1/√6    1/√2    1/√3
d2     1   −1    1          d2    1/√6   −1/√2    1/√3
d3     2    0   −1          d3    2/√6     0     −1/√3

Table 7.4: (a) Character table of the irreducible inequivalent representations of S3. (b) Normalized character table of the irreducible inequivalent representations of S3. The normalized character table defines a unitary matrix.

Finally, replacing the characters by the normalized characters given by Eq. (7.51) in the character table, we construct the
normalized character table. Putting Eqs. (7.56, 7.57, 7.59) together we have
\[
\left( \chi^{(1)}_1, \chi^{(1)}_2, \chi^{(1)}_3 \right) = (1,1,1) \;\; ; \;\; \left( \chi^{(2)}_1, \chi^{(2)}_2, \chi^{(2)}_3 \right) = (1,-1,1) \;\; ; \;\; \left( \chi^{(3)}_1, \chi^{(3)}_2, \chi^{(3)}_3 \right) = (2,0,-1) \tag{7.60}
\]

The row vectors of the normalized character table are obtained by combining Eqs. (7.60, 7.51, 7.54)
\[
\left( \widehat{\chi}^{(1)}_1, \widehat{\chi}^{(1)}_2, \widehat{\chi}^{(1)}_3 \right) = \left( \sqrt{\tfrac{n_1}{n_G}}\,\chi^{(1)}_1,\; \sqrt{\tfrac{n_2}{n_G}}\,\chi^{(1)}_2,\; \sqrt{\tfrac{n_3}{n_G}}\,\chi^{(1)}_3 \right) = \left( \sqrt{\tfrac{1}{6}}\cdot 1,\; \sqrt{\tfrac{3}{6}}\cdot 1,\; \sqrt{\tfrac{2}{6}}\cdot 1 \right) = \left( \tfrac{1}{\sqrt 6},\; \tfrac{1}{\sqrt 2},\; \tfrac{1}{\sqrt 3} \right)
\]
\[
\left( \widehat{\chi}^{(2)}_1, \widehat{\chi}^{(2)}_2, \widehat{\chi}^{(2)}_3 \right) = \left( \sqrt{\tfrac{1}{6}}\cdot 1,\; \sqrt{\tfrac{3}{6}}\cdot (-1),\; \sqrt{\tfrac{2}{6}}\cdot 1 \right) = \left( \tfrac{1}{\sqrt 6},\; -\tfrac{1}{\sqrt 2},\; \tfrac{1}{\sqrt 3} \right)
\]
\[
\left( \widehat{\chi}^{(3)}_1, \widehat{\chi}^{(3)}_2, \widehat{\chi}^{(3)}_3 \right) = \left( \sqrt{\tfrac{1}{6}}\cdot 2,\; \sqrt{\tfrac{3}{6}}\cdot 0,\; \sqrt{\tfrac{2}{6}}\cdot (-1) \right) = \left( \tfrac{2}{\sqrt 6},\; 0,\; -\tfrac{1}{\sqrt 3} \right)
\]

the normalized character table of S3 is shown in table 7.4b. In this case the rows are orthonormal to each other and so are the columns, forming a unitary matrix. Note that the column vectors of the normalized character table differ from the column vectors of the character table only by normalization constants, implying that the character table preserves the orthogonality (but not the normalization) of the column vectors.
It is clear that for any representation of any given finite group, the character of the identity element (usually labeled as i = 1) equals the dimension of the representation. Hence, the first column of the character table gives us the dimension of each irreducible inequivalent representation.
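As a quick consistency check, the unitarity of the normalized character table can be verified numerically. The following Python sketch (an illustration added here, assuming numpy; it is not part of the original notes) builds table 7.4b from the class sizes and characters of S3 and tests Eqs. (7.52, 7.53):
\begin{verbatim}
import numpy as np

n_G = 6
n_i = np.array([1, 3, 2])                    # class sizes of S3: {e}, 2-cycles, 3-cycles
chi = np.array([[1, 1, 1],                   # rows: irreducible representations mu = 1, 2, 3
                [1, -1, 1],                  # columns: classes zeta_1, zeta_2, zeta_3
                [2, 0, -1]], dtype=float)

chi_hat = chi * np.sqrt(n_i / n_G)           # Eq. (7.51): hat{chi}^mu_i = sqrt(n_i/n_G) chi^mu_i

# Orthonormality (7.52): the rows of chi_hat are orthonormal
print(np.allclose(chi_hat @ chi_hat.conj().T, np.eye(3)))   # True
# Completeness (7.53): the columns of chi_hat are orthonormal
print(np.allclose(chi_hat.conj().T @ chi_hat, np.eye(3)))   # True
\end{verbatim}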

7.10.1 Criteria of irreducibility for representations of finite groups through their character
tables
Given an arbitrary representation U (G) of a finite group G, the character table of its irreducible inequivalent representations
can tell us how the irreducible inequivalent representations of G are embedded in U (G). Further, the set of characters of U (G)
tells us whether U (G) is irreducible or not. These tasks are carried out by the following theorems.
Theorem 7.14 Let U (G) be a representation of a finite group G in V with characters {χi } = {χ1 , . . . , χnc }. In the process
of reduction of U (G), the number of times aν that the irreducible representation U ν (G) appears (see Eq. 7.29, page 141), is
determined by
\[
a_\nu = \sum_{i=1}^{n_c} \frac{n_i}{n_G}\, \chi^{\dagger i}_\nu\, \chi_i = \widehat{\chi}^\dagger_\nu \cdot \widehat{\chi} \tag{7.61}
\]

Proof: Taking the trace of the decomposition in Eq. (7.29), page 141, we have
\[
\mathrm{Tr}\, U(g) = \sum_\mu a_\mu\, \mathrm{Tr}\, U_\mu(g) \;\Rightarrow\; \chi_i = \sum_\mu a_\mu\, \chi^\mu_i \;\Rightarrow\; \sqrt{\frac{n_i}{n_G}}\,\chi_i = \sum_\mu a_\mu \sqrt{\frac{n_i}{n_G}}\,\chi^\mu_i \;\Rightarrow\; \widehat{\chi}_i = \sum_\mu a_\mu\, \widehat{\chi}^\mu_i
\]
where we have used Eq. (7.51). Taking the scalar product on both sides with $\widehat{\chi}^\dagger_\nu$ and applying the orthonormality condition Eqs. (7.46, 7.52), we have
\[
\widehat{\chi}^{\dagger i}_\nu\, \widehat{\chi}_i = \sum_\mu a_\mu\, \widehat{\chi}^{\dagger i}_\nu\, \widehat{\chi}^\mu_i \;\Rightarrow\; \widehat{\chi}^{\dagger i}_\nu\, \widehat{\chi}_i = \sum_\mu a_\mu\, \delta_{\nu\mu} \;\Rightarrow\; a_\nu = \widehat{\chi}^\dagger_\nu \cdot \widehat{\chi}
\]
which combined with Eqs. (7.51, 7.54) gives Eq. (7.61). QED.

Example 7.14 Consider the following reducible representation of C2
\[
e \to \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad ; \quad a \to \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\]
the characters are $\chi_i = (2, 0)$. Table 7.2 shows the irreducible inequivalent characters of C2, from which we see that $\chi^{\mu=1}_i = (1, 1)$ and $\chi^{\mu=2}_i = (1, -1)$. Thus theorem 7.14, Eq. (7.61), tells us that
\[
a_1 = \sum_{i=1}^{2} \frac{n_i}{2}\, \chi^{\dagger i}_{\nu=1}\, \chi_i = \frac{n_1}{2}\, \chi^{\dagger 1}_{\nu=1}\, \chi_1 + \frac{n_2}{2}\, \chi^{\dagger 2}_{\nu=1}\, \chi_2 = 1^* \cdot 2 \cdot \frac{1}{2} + 1^* \cdot 0 \cdot \frac{1}{2} = 1
\]
\[
a_2 = \sum_{i=1}^{2} \frac{n_i}{2}\, \chi^{\dagger i}_{\nu=2}\, \chi_i = \frac{n_1}{2}\, \chi^{\dagger 1}_{\nu=2}\, \chi_1 + \frac{n_2}{2}\, \chi^{\dagger 2}_{\nu=2}\, \chi_2 = 1^* \cdot 2 \cdot \frac{1}{2} + (-1)^* \cdot 0 \cdot \frac{1}{2} = 1
\]

hence a1 = a2 = 1, and each irreducible inequivalent representation of C2 appears once in the reduction of the 2-dimensional
representation shown above. So there is a similarity transformation that brings both 2×2 matrices to a totally reduced (diagonal)
form.
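The same computation is easily automated. The following Python sketch (an added illustration with assumed numerical setup, not from the notes) evaluates the multiplicities $a_\nu$ of Eq. (7.61) for the reducible representation of example 7.14:
\begin{verbatim}
import numpy as np

n_G = 2
n_i = np.array([1, 1])                    # C2 is abelian: each element is its own class
chi_irr = np.array([[1, 1],               # characters of the irreducible representations (table 7.2)
                    [1, -1]], dtype=float)
chi_red = np.array([2, 0], dtype=float)   # characters of the reducible representation above

a = (chi_irr.conj() * chi_red * n_i).sum(axis=1) / n_G   # a_nu = sum_i (n_i/n_G) chi_nu^{dagger i} chi_i
print(a)   # [1. 1.]  -> each irreducible representation appears once
\end{verbatim}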

Theorem 7.15 (Necessary and sufficient conditions for irreducibility): Let U (G) be a representation of a finite group G in
V with characters {χi } = {χ1 , . . . , χnc }. A necessary and sufficient condition for U (G) to be irreducible is that
\[
\sum_{i=1}^{n_c} n_i\, |\chi_i|^2 = n_G \qquad \text{i.e.} \qquad \widehat{\chi}^\dagger \cdot \widehat{\chi} = 1 \quad ; \quad \widehat{\chi} \equiv \left( \sqrt{\tfrac{n_1}{n_G}}\,\chi_1,\; \sqrt{\tfrac{n_2}{n_G}}\,\chi_2,\; \ldots,\; \sqrt{\tfrac{n_{n_c}}{n_G}}\,\chi_{n_c} \right) \tag{7.62}
\]

Proof: If $a_\mu$ is the number of times a given irreducible representation $U^\mu(G)$ appears in the reduction of $U(G)$, we have
\[
\widehat{\chi}^\dagger \cdot \widehat{\chi} = \left( a_\mu\, \widehat{\chi}_\mu \right)^\dagger \cdot \left( a_\nu\, \widehat{\chi}_\nu \right) = a^{\mu *}\, a^\nu\; \widehat{\chi}^\dagger_\mu \cdot \widehat{\chi}_\nu = a^{\mu *}\, a^\nu\, \delta_{\mu\nu}
\]
\[
\widehat{\chi}^\dagger \cdot \widehat{\chi} = \sum_\mu |a_\mu|^2 \tag{7.63}
\]
where we have used Eq. (7.55). If $U(G)$ is irreducible, it is equivalent to some irreducible representation $\nu$, so that $a_\mu = \delta^\nu_\mu$; replacing this in Eq. (7.63) we obtain Eq. (7.62). Conversely, if Eq. (7.62) holds, then Eq. (7.63) says that $\sum_\mu |a_\mu|^2 = 1$; since the $a_\mu$ are non-negative integers, the only solutions are $a_\mu = \delta^\nu_\mu$ for a certain fixed $\nu$, showing that $U(G)$ is equivalent to an irreducible representation $U^\nu(G)$. QED.
Since character tables are very useful in applications, there is an extensive list of character tables of symmetry groups in the
literature. In particular, character tables of all the crystallographic point-groups are given in books concerning applications of
group theory in Solid State Physics.
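As a minimal numerical sketch of the criterion of Eq. (7.62) (an added illustration assuming numpy, not part of the notes), one can test character vectors of S3 directly; the vector (3, 1, 0) used below is the set of characters of the defining three-dimensional permutation representation of S3, which is reducible:
\begin{verbatim}
import numpy as np

n_G, n_i = 6, np.array([1, 3, 2])      # order and class sizes of S3

def is_irreducible(chi):
    """True if the character vector chi satisfies sum_i n_i |chi_i|^2 = n_G, Eq. (7.62)."""
    return np.isclose((n_i * np.abs(np.asarray(chi))**2).sum(), n_G)

print(is_irreducible([1, 1, 1]))    # True  (trivial representation)
print(is_irreducible([2, 0, -1]))   # True  (2-dimensional irreducible representation)
print(is_irreducible([3, 1, 0]))    # False (defining permutation representation: reducible)
\end{verbatim}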

7.11 The regular representation of a finite group


Let G be a finite group of order $n_G$, $G \equiv \{g_i,\; i = 1, \ldots, n_G\}$. Consider the elements $(g_1, \ldots, g_{n_G})$ as an $n_G$-tuple in an $n_G$-dimensional space $V_G$ (see definition 7.8, page 145), and let us represent the element $a \in G$ by a permutation of the $n_G$ coordinates in the form
\[
a\, g_i \equiv g_{ai} \quad ; \quad i = 1, 2, \ldots, n_G \tag{7.64}
\]
\[
\begin{pmatrix} g_1 & g_2 & \ldots & g_{n_G} \\ a(g_1) & a(g_2) & \ldots & a(g_{n_G}) \end{pmatrix} = \begin{pmatrix} g_1 & g_2 & \ldots & g_{n_G} \\ g_{a1} & g_{a2} & \ldots & g_{a n_G} \end{pmatrix}
\]
we should remember that this is a regular permutation, so that all elements change their positions for each $a \in G$ except the identity. Note that another way to see Cayley’s theorem from Eq. (7.64) is the following
\[
b \to p_b \;\Leftrightarrow\; b g_i \to g_{bi}
\]
\[
p_a p_b \;\to\; a(b g_i) \to a\, g_{bi} \to g_{abi}
\]
\[
g_{abi} = a\, g_{bi} = a(b g_i) = (ab)\, g_i = c\, g_i = g_{ci} \tag{7.65}
\]
\[
\Rightarrow\; p_a p_b \to p_c \quad \text{with} \quad c \equiv ab
\]

Let us form a representation U (G) of G in $V_G$. Note that each element $g_i$ of the group has a two-fold role: on one hand, it can be seen as a basis vector of $V_G$
\[
g_m \equiv |g_m\rangle \to (0, \ldots, 0, 1, 0, \ldots, 0) \quad \text{(a 1 in the } m\text{-th position)}
\]

or as an element of U (G), i.e. a mapping from $V_G$ onto $V_G$, $g_k : V_G \to V_G$. To avoid too many indices we denote by $|g_k\rangle$ the basis vectors in $V_G$ and by $a, b, c$ the operators in U (G). From the definition (7.64) we have
\[
a\,|g_m\rangle = |g_{am}\rangle = |g_{a1}\rangle \cdot 0 + \ldots + |g_{a(m-1)}\rangle \cdot 0 + |g_{am}\rangle \cdot 1 + |g_{a(m+1)}\rangle \cdot 0 + \ldots + |g_{a n_G}\rangle \cdot 0 \tag{7.66}
\]

the matrix representative of $a \in G$ in the basis $\{|g_k\rangle\}$ is obtained from Eq. (7.11)
\[
a\,|g_m\rangle = |g_k\rangle\, (\Delta_a)^k{}_m \tag{7.67}
\]
combining Eqs. (7.66, 7.67) we obtain
\[
(\Delta_a)^k{}_m = \delta^k{}_{am} \tag{7.68}
\]
if we have $ab = c$, two applications of Eq. (7.66) yield
\[
ab\,|g_m\rangle = a\,|g_{bm}\rangle = |g_{abm}\rangle = |g_{ab1}\rangle \cdot 0 + \ldots + |g_{ab(m-1)}\rangle \cdot 0 + |g_{abm}\rangle \cdot 1 + |g_{ab(m+1)}\rangle \cdot 0 + \ldots + |g_{ab\, n_G}\rangle \cdot 0 \tag{7.69}
\]
\[
c\,|g_m\rangle = |g_{cm}\rangle = |g_{c1}\rangle \cdot 0 + \ldots + |g_{c(m-1)}\rangle \cdot 0 + |g_{cm}\rangle \cdot 1 + |g_{c(m+1)}\rangle \cdot 0 + \ldots + |g_{c\, n_G}\rangle \cdot 0 \tag{7.70}
\]
from Eqs. (7.69, 7.70), the matrix representations of $ab$ and of $c$ are given by
\[
ab\,|g_m\rangle = |g_k\rangle\, (\Delta_{ab})^k{}_m = |g_k\rangle\, \delta^k{}_{abm} \quad ; \quad c\,|g_m\rangle = |g_k\rangle\, (\Delta_c)^k{}_m = |g_k\rangle\, \delta^k{}_{cm}
\]
equating both equations we obtain
\[
ab = c \;\Rightarrow\; \delta^k{}_{abm} = \delta^k{}_{cm} \tag{7.71}
\]
from (7.68, 7.71) it can easily be checked that these matrices form a representation; with $ab = c$ we get
\[
(\Delta_a)^k{}_m\, (\Delta_b)^m{}_j = \delta^k{}_{am}\, \delta^m{}_{bj} = \delta^k{}_{a(bj)} = \delta^k{}_{abj} = \delta^k{}_{cj} = (\Delta_c)^k{}_j = (\Delta_{ab})^k{}_j
\]

Another way to construct such matrices is the following: the group multiplication $g_i g_j = g_k$ can be written formally as
\[
g_i\, g_j = g_k = g_m\, \Delta^m{}_{ij} \quad ; \quad \Delta^m{}_{ij} = \begin{cases} 1 & \text{if } m = k \\ 0 & \text{if } m \neq k \end{cases} \quad ; \quad k = 1, \ldots, n_G \tag{7.72}
\]

Eq. (7.72) can be rewritten as
\[
g_i\,|g_j\rangle = |g_k\rangle = |g_m\rangle\, (\Delta_i)^m{}_j \tag{7.73}
\]
which gives to $g_i$ the role of an operator and to $|g_j\rangle$ the role of a vector. In Eq. (7.72), both $g_i$ and $g_j$ are seen as group elements. If we simplify the notation by writing $a \in G$ in Eq. (7.73), we have
\[
a\,|g_j\rangle = |g_m\rangle\, (\Delta_a)^m{}_j \tag{7.74}
\]
and comparing Eqs. (7.67, 7.74) we see that both types of matrices are the same18.

Theorem 7.16 The matrices $(\Delta_i)^k{}_j \equiv \Delta^k{}_{ij}$, $i, j, k = 1, \ldots, n_G$, defined in Eqs. (7.72, 7.73) form a representation of G. It is called the regular representation. Such matrices are also described by Eqs. (7.67, 7.68).

The name regular representation comes from the fact that they are associated with the regular permutations.
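As an illustration (a sketch added here, not part of the notes), the matrices of Eq. (7.68) can be generated directly from a group multiplication table. The following Python fragment does this for the cyclic group C3, using the indices 0, 1, 2 for $e, a, a^2$, and checks the homomorphism property of Eq. (7.71):
\begin{verbatim}
import numpy as np

# Multiplication table of C3 = {e, a, a^2}: table[i][j] = index of g_i g_j.
table = [[0, 1, 2],
         [1, 2, 0],
         [2, 0, 1]]
n_G = len(table)

def regular_matrix(a):
    """Regular-representation matrix (Delta_a)^k_m = delta^k_{am}, cf. Eq. (7.68)."""
    D = np.zeros((n_G, n_G))
    for m in range(n_G):
        D[table[a][m], m] = 1.0       # a|g_m> = |g_{am}>
    return D

Delta = [regular_matrix(a) for a in range(n_G)]
print(all(np.allclose(Delta[a] @ Delta[b], Delta[table[a][b]])
          for a in range(n_G) for b in range(n_G)))   # True: Delta_a Delta_b = Delta_{ab}
\end{verbatim}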

7.12 Reduction of the regular representation


The importance of the regular representation of a finite group G, lies in the fact that it contains each irreducible inequivalent
representation of the group. The number of times that each inequivalent irreducible representation of G appears in its regular
representation is precisely nµ , i.e. the dimension of the representation.

Theorem 7.17 (Decomposition of the regular representation): (i) The regular representation of any finite group G contains every inequivalent irreducible representation $\mu$ precisely $n_\mu$ times. (ii) We also have the identity
\[
\sum_\mu n_\mu^2 = n_G \tag{7.75}
\]

18 In equation (7.74), ∆a can be interpreted as the matrix representation of the “operator” a on VG in the basis {|gk i}.

The second part is precisely the proof of the missing part of theorem 7.11, related to the completeness relation for the irreducible representation matrices.
Proof: Taking into account the definition of the matrices $\Delta_b$, Eq. (7.68), the character of these matrices is
\[
\chi^R_b = \mathrm{Tr}\, \Delta_b = \sum_{k=1}^{n_G} (\Delta_b)^k{}_k = \sum_{k=1}^{n_G} \delta^k{}_{bk} \tag{7.76}
\]
where the superscript “R” means that we are in the regular representation. The character of the identity is
\[
\chi^R_e = \sum_{k=1}^{n_G} (\Delta_e)^k{}_k = \sum_{k=1}^{n_G} \delta^k{}_{ek} = \sum_{k=1}^{n_G} \delta^k{}_k = n_G \tag{7.77}
\]

on the other hand, when $b \neq e$, $g_{bk} \equiv b g_k \neq g_k$, therefore
\[
(\Delta_b)^k{}_k = \delta^k{}_{bk} = 0 \quad \text{(no sum)} \qquad \forall g_k \in G \ \text{and} \ \forall b \neq e \in G \tag{7.78}
\]
Therefore, for each $b \neq e$, all the diagonal elements of $\Delta_b$ vanish, so that
\[
(\Delta_b)^k{}_k = 0 \quad \text{(no sum)} \qquad \forall k = 1, \ldots, n_G \ \text{and} \ \forall b \neq e \in G \tag{7.79}
\]
Combining Eqs. (7.77, 7.79), we have
\[
\mathrm{Tr}\, \Delta^R_b \equiv \chi^R_b = n_G\, \delta^e_b \tag{7.80}
\]
The number of times each representation occurs can be calculated from theorem 7.14, Eq. (7.61), using Eq. (7.80)
\[
a^R_\mu = \sum_{i=1}^{n_c} \frac{n_i}{n_G}\, \chi^{\dagger i}_\mu\, \chi^R_i = \frac{n_e}{n_G}\, \chi^{\dagger e}_\mu\, \chi^R_e = \frac{1}{n_G}\, n_\mu\, n_G = n_\mu
\]
From this result we see that the decomposition of the regular representation into inequivalent irreducible representations is given by
\[
\Delta^R_b = \sum_\mu n_\mu\, D^\mu(b) \qquad \forall b \in G \tag{7.81}
\]
in particular, using $b = e$ and taking the trace in Eq. (7.81), we obtain
\[
\mathrm{Tr}\, \Delta^R_e = \sum_\mu n_\mu\, \mathrm{Tr}\, D^\mu(e) \;\Rightarrow\; n_G = \sum_\mu n_\mu^2
\]

where we have applied Eq. (7.80). This proves Eq. (7.75). QED. It is essential to check that this proof of Eq. (7.75) does not depend on the completeness theorems, because such an equation was used precisely to prove the completeness theorem 7.11, page 145. Tracking back to the place in which this proof was postponed (see Eq. 7.42, page 145), we see that we had already established the orthonormality theorems for irreducible matrices. We can then proceed to establish lemma 7.3 of page 147, which only depends on the rearrangement and Schur’s lemmas; then we can prove theorem 7.12, but only the part concerning orthonormality of the irreducible characters (which in turn only depends on the orthonormality of the irreducible matrices). The part of theorem 7.12 concerning orthonormality is used to prove theorem 7.14. Finally, after developing the concept of regular representation, theorem 7.14 is used to prove theorem 7.17 and in particular Eq. (7.75). Consequently, this proof is consistent.
Therefore, theorem 7.17 says that the regular representation of any finite group G can generate all the inequivalent irreducible representations of G by reducing it. Choosing an appropriately ordered basis, the representation matrices of all the elements of G can be brought to the block-diagonal form
\[
\mathrm{diag}\Big( D^1,\; \underbrace{D^2, \ldots, D^2}_{n_2\ \text{times}},\; \ldots,\; \underbrace{D^{n_c}, \ldots, D^{n_c}}_{n_{n_c}\ \text{times}} \Big)
\]
where by convention the basis is ordered such that the trivial representation appears first. Each representation $D^\mu$ appears $n_\mu$ times. In particular, the trivial representation appears only once.
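As a numerical illustration (an added sketch assuming numpy, not part of the notes): the regular representation of S3 has characters $\chi^R = (6, 0, 0)$ by Eq. (7.80), and applying Eq. (7.61) with the character table 7.4a indeed returns the multiplicities $n_\mu = (1, 1, 2)$:
\begin{verbatim}
import numpy as np

n_G, n_i = 6, np.array([1, 3, 2])
chi_irr = np.array([[1, 1, 1], [1, -1, 1], [2, 0, -1]], dtype=float)  # table 7.4a
chi_reg = np.array([6, 0, 0], dtype=float)       # chi^R_i = n_G delta^e_i, Eq. (7.80)

a_R = (chi_irr.conj() * chi_reg * n_i).sum(axis=1) / n_G   # Eq. (7.61)
print(a_R)   # [1. 1. 2.]  -> each irreducible representation appears n_mu times
\end{verbatim}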

Example 7.15 Consider the group C2 = {e, a}. The regular representation matrices are given by
\[
\Delta_e = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad ; \quad \Delta_a = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{7.82}
\]
\[
\chi^R_e = 2 \quad , \quad \chi^R_a = 0 \tag{7.83}
\]
using the table of characters of C2 (table 7.2, page 146) we find
\[
a^R_{\mu=1} = 1 \quad ; \quad a^R_{\mu=2} = 1
\]
so that the regular representation contains each irreducible inequivalent representation only once. By the similarity transformation $\Delta'_i = S\, \Delta_i\, S^{-1}$, the matrices (7.82) can be brought to diagonal form
\[
\Delta'_e = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad ; \quad \Delta'_a = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \quad ; \quad S \equiv \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}
\]
this makes explicit the decomposition into two one-dimensional irreducible inequivalent representations.
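A short numerical check of this similarity transformation (an added sketch, not in the original notes):
\begin{verbatim}
import numpy as np

Delta_a = np.array([[0., 1.], [1., 0.]])
S = np.array([[1., 1.], [-1., 1.]])

print(S @ Delta_a @ np.linalg.inv(S))   # [[ 1.  0.]
                                        #  [ 0. -1.]]  -> diag(1, -1): the two 1-dim irreps
\end{verbatim}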

Notice that this representation has been constructed on an $n_G$-dimensional vector space $V_G$, in which the elements of the group $g_k$ act as basis vectors19. In particular, Eq. (7.72) has introduced linear operations (linear combinations) among the elements of the group (though in this case the linear combination is trivial, since only one element contributes to the sum). Note that in $V_G$, as in any other vector space, we must define a sum and a multiplication by scalars; but since one of the bases of $V_G$ is the set of elements of the group, we can also define a product law in $V_G$, based on the group law of combination. In summary, in $V_G$ we can define naturally three operations: sum, scalar multiplication, and multiplication between two elements of the space (composition). We shall see that these laws of combination lead to the structure of an algebra that we shall call the group algebra. A systematic method for the reduction of the regular representation on the group algebra will require studying our space $V_G$ in more detail. We shall do this in chapter 11, after a brief description of the main algebraic systems in chapter 10.

19 See definition 7.8, page 145.


Chapter 8

Additional issues on group representations

In this chapter, we examine some techniques to construct new representations of a given group G from those already known.
In section 8.1 we show that new representations can be constructed from two representations U µ (G) in a vector space U and
U ν (G) in a vector space V by defining the associated representation in the tensor product U ⊗ V of the two vector spaces.
Though the latter representation is in general reducible (even if the component representations are irreducible), the process
of reducing the product representation leads in general to new irreducible representations. In section 8.2 we examine how a
representation U (G) in a vector space V induces a representation in a vector space of functions Vf , whose elements are functions f (x) with the vector space V as their domain; these representations are the ones used in both classical and quantum field theories. On the other hand, if we have a given matrix representation {D (G)}, it is clear by direct inspection that the sets $\{D^*(G)\}$ and $\{\widetilde{D}^{-1}(G)\}$ are representations on the same vector space as D (G). Therefore, the natural question is whether these three representations are equivalent or not. Consequently, section 8.3 concerns the criteria of equivalence of these representations and their relation to the existence of bilinear invariants.

8.1 Direct product representations and Clebsch-Gordan Coefficients


8.1.1 Definition and basic properties
Suppose we have a certain representation U (G) in the vector space U and another representation V (G) in the vector space V . In many physical applications we need to form the direct or tensor product of two or more vector spaces to describe several degrees of freedom. Since the product space is also a vector space, we wonder whether we can form a representation in this new vector space from the representations in the component spaces. We start by briefly defining the concept of direct product vector space, in order to explore representations in such a space1

Definition 8.1 (Direct product vector space): Let U and V be two vector spaces with inner product. Let $\{u_i : i = 1, \ldots, n_u\}$ and $\{v_j : j = 1, \ldots, n_v\}$ be two orthonormal bases of U and V respectively. Let us define the set
\[
\{ w_k \equiv u_i \otimes v_j \; : \; k = (i, j) \; ; \; i = 1, \ldots, n_u \; ; \; j = 1, \ldots, n_v \}
\]
of orthonormal vectors $\{w_k\}$ formed from “formal product vectors” $u_i \otimes v_j$. The direct product space $W = U \otimes V$ consists of all linear combinations of the orthonormal basis $\{w_k\}$, i.e.
\[
W = \left\{ x \; : \; |x\rangle = |w_k\rangle\, x^k \right\}
\]
where $x^k$ are the components of $|x\rangle \in W$ in the basis $\{|w_k\rangle\}$. Further, the inner product in W is such that (a) $\langle w_{k'} | w_k \rangle = \delta^{k'}{}_k = \delta^{i'}{}_i\, \delta^{j'}{}_j$, where $k' = (i', j')$ and $k = (i, j)$; (b) $\langle x | y \rangle = x^\dagger_k\, y^k$ with $x^\dagger_k = \left( x^k \right)^*$.

Definition 8.2 Let A be an operator on U , and let B be an operator on V . We define the operator $C = A \otimes B$ on $W = U \otimes V$ in the following way
\[
C\,|x\rangle \equiv (A \otimes B)\,|w_k\rangle\, x^k = (A \otimes B)\left( |u_i\rangle \otimes |v_j\rangle \right) x^{ij} \tag{8.1}
\]
\[
C\,|x\rangle \equiv \left[ \left( A\,|u_i\rangle \right) \otimes \left( B\,|v_j\rangle \right) \right] x^{ij} \tag{8.2}
\]
In particular, we define the extensions of the operators A, B to the space W in the form
\[
\widetilde{A} \equiv A \otimes I_V \quad ; \quad \widetilde{B} \equiv I_U \otimes B
\]
where $I_U$ and $I_V$ are the identity operators on U and V respectively.


1A more detailed treatment of the tensor product of vector spaces can be seen in Sec. 3.20, page 62.


The previous definition tells us how to construct the matrix representation of $A \otimes B$ from the matrix representations of the original operators
\[
C\,|w_k\rangle = |w_{k'}\rangle\, (D_C)^{k'}{}_k = |u_{i'}\rangle \otimes |v_{j'}\rangle\, (D_C)^{i'j'}{}_{ij}
\]
\[
(A \otimes B)\left( |u_i\rangle \otimes |v_j\rangle \right) = \left( |u_{i'}\rangle\, (D_A)^{i'}{}_i \right) \otimes \left( |v_{j'}\rangle\, (D_B)^{j'}{}_j \right)
\]
the left-hand sides of these equations are equal, so we have
\[
(D_C)^{k'}{}_k \equiv (D_C)^{i'j'}{}_{ij} = (D_A)^{i'}{}_i\, (D_B)^{j'}{}_j \tag{8.3}
\]
note that in $(D_C)^{i'j'}{}_{ij}$ the set $k' \equiv (i', j')$ forms a “single” label for rows while $k \equiv (i, j)$ represents a “single” label for columns. For instance, if $k \equiv (i, j)$ with $i = 1, 2$ and $j = 1, 2, 3$, a natural ordering for k would be
\[
\{(i, j)\} \equiv \{(1,1), (1,2), (1,3), (2,1), (2,2), (2,3)\} \;\leftrightarrow\; k = \{1, 2, 3, 4, 5, 6\}
\]
We shall simplify the notation in Eq. (8.3) and write it simply as
\[
C\,|w_k\rangle = |w_{k'}\rangle\, C^{k'}{}_k \quad ; \quad |w_{k'}\rangle \equiv |u_{i'}\rangle \otimes |v_{j'}\rangle \quad ; \quad C^{k'}{}_k \equiv A^{i'}{}_i\, B^{j'}{}_j \tag{8.4}
\]
we should keep in mind that in Eq. (8.4) the operators A, B, C are written in the bases $\{u_i\}$, $\{v_j\}$ and $\{w_k \equiv u_i \otimes v_j\}$ respectively. It is clear that W is an $n_u \cdot n_v$-dimensional space. The trace of a product operator $C = A \otimes B$ can be calculated from Eq. (8.4) by setting $i' = i$ and $j' = j$ and summing over $i, j$
\[
\mathrm{Tr}\, C = \sum_k C^k{}_k = \sum_i A^i{}_i\, \sum_j B^j{}_j = (\mathrm{Tr}\, A)(\mathrm{Tr}\, B) \tag{8.5}
\]

Theorem 8.1 Let $U^\mu(G)$ and $U^\nu(G)$ be representations of G in vector spaces U and V respectively. The set of operators
\[
U^{\mu\times\nu}(g) \equiv U^\mu(g) \otimes U^\nu(g) \qquad \forall g \in G
\]
on $W = U \otimes V$ also forms a representation of G. Further, the group characters of $U^{\mu\times\nu}(G)$ are equal to the product of the characters of the two representations $U^\mu(G)$ and $U^\nu(G)$, i.e.
\[
\chi^{\mu\times\nu}_i = \chi^\mu_i\, \chi^\nu_i \quad ; \quad i = 1, \ldots, n_c \tag{8.6}
\]
Proof: By the definition of the product of operators, Eqs. (8.1, 8.2), we have
\[
U^{\mu\times\nu}(g)\,|x\rangle \equiv \left[ \left( U^\mu(g)\,|u_i\rangle \right) \otimes \left( U^\nu(g)\,|v_j\rangle \right) \right] x^{ij}
\]
since $U^\mu(G)$ and $U^\nu(G)$ are representations in U and V , we have
\[
U^{\mu\times\nu}(g_1)\, U^{\mu\times\nu}(g_2)\,|x\rangle = \left[ \left( U^\mu(g_1) U^\mu(g_2)\,|u_i\rangle \right) \otimes \left( U^\nu(g_1) U^\nu(g_2)\,|v_j\rangle \right) \right] x^{ij} = \left[ \left( U^\mu(g_1 g_2)\,|u_i\rangle \right) \otimes \left( U^\nu(g_1 g_2)\,|v_j\rangle \right) \right] x^{ij} = U^{\mu\times\nu}(g_1 g_2)\,|x\rangle
\]
further, Eq. (8.6) is a particular case of Eq. (8.5). QED.

Definition 8.3 If $U^\mu(G)$ and $U^\nu(G)$ are representations of G in U and V respectively, the set of operators
\[
U^{\mu\times\nu}(g) \equiv U^\mu(g) \otimes U^\nu(g) \qquad \forall g \in G
\]
on $W \equiv U \otimes V$ is called the direct product representation in W of $U^\mu(G)$ in U and $U^\nu(G)$ in V .

Suppose now that the representations $U^\mu(G)$ and $U^\nu(G)$ are irreducible in U and V . The direct product representation $U^{\mu\times\nu}(g)$, of dimension $n_\mu \cdot n_\nu$, is in general reducible in W . The number of times that a given irreducible representation $\lambda$ appears in this product representation is given by theorem 7.14, Eq. (7.61),
\[
a^{\mu\times\nu}_\lambda = \widehat{\chi}^\dagger_\lambda \cdot \widehat{\chi}^{\mu\times\nu} = \sum_{i=1}^{n_c} \frac{n_i}{n_G}\, \chi^{\lambda *}_i\, \chi^{\mu\times\nu}_i = \sum_{i=1}^{n_c} \frac{n_i}{n_G}\, \chi^{\lambda *}_i\, \chi^\mu_i\, \chi^\nu_i \tag{8.7}
\]
where we have used (8.6).



Example 8.1 Consider the product representation $U^{\mu\times\nu}(S_3)$, where $U^\mu(S_3)$ and $U^\nu(S_3)$ are any of the three irreducible inequivalent representations of S3 discussed in example 7.13. Using the character table 7.4, page 150, we can see that
\[
U^{1\times i} = U^1 \times U^i = 1 \times U^i = U^i \;\Rightarrow\; U^{1\times i} \simeq U^i \quad ; \quad i = 1, 2, 3
\]
since the character of the product is the product of the characters, the set of characters of $U^2 \times U^3$ is
\[
\chi^{2\times 3}_i = \chi^2_i\, \chi^3_i = (1, -1, 1) \times (2, 0, -1) = (1\cdot 2,\; -1\cdot 0,\; 1\cdot(-1)) = (2, 0, -1) \;\Rightarrow\; \chi^{2\times 3}_i = \chi^3_i
\]
therefore we have
\[
U^{2\times 3} = U^2 \times U^3 \simeq U^3
\]
finally, if we evaluate $U^{3\times 3}$ we see that, since $U^3$ is two-dimensional, $U^{3\times 3}$ is four-dimensional. Therefore, it must be reducible. The number of times that each representation appears in $U^{3\times 3}$ is given by Eq. (8.7); according to example 7.13, page 149, we have $n_{i=1} = 1$, $n_{i=2} = 3$, $n_{i=3} = 2$, and using the character table 7.4 we obtain
\[
a^{3\times 3}_{\lambda=1} = \frac{1}{n_G} \sum_{i=1}^{3} n_i\, \chi^{\lambda=1\, *}_i\, \chi^3_i\, \chi^3_i = \frac{1}{6}\left[ 1\cdot 1^*\cdot 2\cdot 2 + 3\cdot 1^*\cdot 0\cdot 0 + 2\cdot 1^*\cdot(-1)\cdot(-1) \right] = 1
\]
similarly
\[
a^{3\times 3}_{\lambda=2} = \frac{1}{6}\left[ 1\cdot 1^*\cdot 2\cdot 2 + 3\cdot(-1)^*\cdot 0\cdot 0 + 2\cdot 1^*\cdot(-1)\cdot(-1) \right] = 1
\]
\[
a^{3\times 3}_{\lambda=3} = \frac{1}{6}\left[ 1\cdot 2^*\cdot 2\cdot 2 + 3\cdot 0^*\cdot 0\cdot 0 + 2\cdot(-1)^*\cdot(-1)\cdot(-1) \right] = 1
\]
hence each $U^i$ with $i = 1, 2, 3$ appears once:
\[
U^{3\times 3} = U^1 \oplus U^2 \oplus U^3
\]
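The same arithmetic can be automated. The following Python sketch (an added illustration assuming numpy and 0-based labels, not part of the notes) evaluates Eq. (8.7) for products of S3 irreducible representations:
\begin{verbatim}
import numpy as np

n_G, n_i = 6, np.array([1, 3, 2])
chi = np.array([[1, 1, 1], [1, -1, 1], [2, 0, -1]], dtype=float)  # character table of S3

def cg_series(mu, nu):
    """Multiplicities a_lambda of Eq. (8.7) for the product U^mu x U^nu."""
    chi_prod = chi[mu] * chi[nu]                       # Eq. (8.6)
    return (chi.conj() * chi_prod * n_i).sum(axis=1) / n_G

print(cg_series(2, 2))   # [1. 1. 1.]  -> U^{3x3} = U^1 + U^2 + U^3 (indices are 0-based here)
print(cg_series(1, 2))   # [0. 0. 1.]  -> U^{2x3} ~ U^3
\end{verbatim}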

Since the direct product of irreducible representations $\mu, \nu$ in U and V is in general reducible, we have
\[
U^{\mu\times\nu}(G) = \sum_{\oplus\lambda} a_\lambda\, U^\lambda(G) \tag{8.8}
\]
where this sum is usually called a Clebsch-Gordan series. Consequently, $W = U \otimes V$ can be decomposed into a direct sum of invariant subspaces $W^\lambda_\alpha$, where $\lambda$ is the label for the irreducible representation and $\alpha = 1, \ldots, a_\lambda$ distinguishes among the $a_\lambda$ subspaces that correspond to the same representation $\lambda$
\[
W = \sum_{\oplus\lambda}\; \sum_{\oplus\alpha=1}^{a_\lambda} W^\lambda_\alpha \tag{8.9}
\]
where the sum over $\lambda$ runs only over the irreducible inequivalent representations of G included in $U^{\mu\times\nu}(G)$, since not necessarily all of them are contained in the product representation.
It is important not to confuse the concept of direct product representations with the concept of direct product of groups. A direct product representation involves a single group G with two representations $U^\nu(g)$ and $U^\mu(g)$ on vector spaces $V_\nu$ and $V_\mu$; the product representation is a new representation of the same group G but on the vector space $V_\nu \otimes V_\mu$. By contrast, a direct product of groups concerns two groups $G_1$ and $G_2$ that generate another group $G = G_1 \otimes G_2$.

8.1.2 Coupled and decoupled bases and Clebsch-Gordan coefficients


We have considered so far the basis $\{w_k \equiv u_i \otimes v_j\}$ of W , generated by the direct product of the bases of U and V . Notwithstanding, the orthonormal basis $\{w_k\}$ does not lead in general to block-diagonal matrices $D^{\mu\times\nu}(G)$ in W .
It is therefore convenient to change to a new orthonormal basis in which the matrices display the block-diagonal texture. This new basis is constructed by building an orthonormal basis $\left\{ w^\lambda_{\alpha l} : l = 1, \ldots, n_\lambda \right\}$ for each subspace $W^\lambda_\alpha$ and putting all of them together. Further, it is convenient to order this basis such that the first $n_1$ basis vectors are in $W^1_1$, the next $n_1$ vectors are in $W^1_2$, and so on. The new basis is then given by
\[
\bigcup_{\lambda=1}^{n_c}\; \bigcup_{\alpha=1}^{a_\lambda} \left\{ w^\lambda_{\alpha l} : l = 1, \ldots, n_\lambda \right\}
\]
\[
\left\{ w^\lambda_{\alpha l} \right\} \equiv \left\{ w^\lambda_{\alpha l} : l = 1, \ldots, n_\lambda \; ; \; \alpha = 1, \ldots, a_\lambda \; ; \; \lambda = \text{irreducible inequivalent reps included in } D^{\mu\times\nu}(G) \right\}
\]

The transformation from the original orthonormal basis $\{|w_k\rangle\}$ to the new orthonormal basis $\left\{ |w^\lambda_{\alpha l}\rangle \right\}$ must be performed by a unitary passive transformation. Using the completeness of $\{|w_k\rangle\}$ we have
\[
|w^\lambda_{\alpha l}\rangle = \sum_k |w_k\rangle\, \langle w_k | w^\lambda_{\alpha l}\rangle \tag{8.10}
\]
but $k = (i, j)$ with $i = 1, \ldots, n_\mu$ ; $j = 1, \ldots, n_\nu$. From Eqs. (8.8, 8.9) it is clear that the decomposition of W into minimal invariant subspaces depends on the irreducible representations $\mu$ and $\nu$ involved in the direct product representation. Thus the coefficients in Eq. (8.10) also depend on the representations $\mu$ and $\nu$ that form the direct product. Hence we denote the Fourier coefficients of the expansion as
\[
\langle w_k | w^\lambda_{\alpha l}\rangle \equiv \langle k\, (\mu, \nu)\, \alpha, \lambda, l\rangle = \langle i, j\, (\mu, \nu)\, \alpha, \lambda, l\rangle \tag{8.11}
\]
\[
|w^\lambda_{\alpha l}\rangle = \sum_{i=1}^{n_\mu} \sum_{j=1}^{n_\nu} |w_{i,j}\rangle\, \langle i, j\, (\mu, \nu)\, \alpha, \lambda, l\rangle \tag{8.12}
\]

Definition 8.4 The complex numbers (8.11) that define the change of basis (8.12) are called Clebsch-Gordan coefficients (CG-coefficients).

Note that, for the unitary matrix formed by these complex numbers, the set $(i, j)$ provides the “row index” while the “column index” is characterized by the set $(\alpha, \lambda, l)$. The labels $(\mu, \nu)$ serve to identify this transformation matrix as the one from $D^{\mu\times\nu}$ to $D^\lambda$; the $(\mu, \nu)$ labels are fixed in the expansion. It is customary in Physics to refer to the bases $\{|w_k\rangle\}$ and $\left\{ w^\lambda_{\alpha l} \right\}$ as the decoupled and coupled bases, respectively.

Theorem 8.2 (Orthonormality and completeness of Clebsch-Gordan coefficients): The CG-coefficients satisfy the following orthonormality and completeness conditions
\[
\sum_\lambda \sum_{\alpha=1}^{a_\lambda} \sum_{l=1}^{n_\lambda} \langle i', j'\, (\mu, \nu)\, \alpha, \lambda, l\rangle\, \langle \alpha, \lambda, l\, (\mu, \nu)\, i, j\rangle = \delta^{i'}{}_i\, \delta^{j'}{}_j \tag{8.13}
\]
\[
\sum_{i=1}^{n_u} \sum_{j=1}^{n_v} \langle \alpha', \lambda', l'\, (\mu, \nu)\, i, j\rangle\, \langle i, j\, (\mu, \nu)\, \alpha, \lambda, l\rangle = \delta^{\alpha'}{}_\alpha\, \delta^{\lambda'}{}_\lambda\, \delta^{l'}{}_l \tag{8.14}
\]
\[
\langle \alpha, \lambda, l\, (\mu, \nu)\, i, j\rangle = \langle i, j\, (\mu, \nu)\, \alpha, \lambda, l\rangle^* \tag{8.15}
\]
where the sum over $\lambda$ runs over the irreducible inequivalent representations of G that are contained in the representation $D^{\mu\times\nu}(G)$.
Proof: Eq. (8.15) is immediate from the definition (8.11). Now, starting from the completeness of $\left\{ |w^\lambda_{\alpha l}\rangle \right\}$ we have
\[
|w_{i,j}\rangle = \sum_{\alpha, \lambda, l} |w^\lambda_{\alpha l}\rangle\, \langle w^\lambda_{\alpha l} | w_{i,j}\rangle \tag{8.16}
\]
applying $\langle w_{i',j'}|$ on both sides, and using the orthonormality of $\{|w_k\rangle\} = \{|w_{i,j}\rangle\}$, we get
\[
\delta^{i'}{}_i\, \delta^{j'}{}_j = \langle w_{i',j'} | w_{i,j}\rangle = \sum_{\alpha, \lambda, l} \langle w_{i',j'} | w^\lambda_{\alpha l}\rangle\, \langle w^\lambda_{\alpha l} | w_{i,j}\rangle
\]
and applying the definition of the CG-coefficients, Eq. (8.11), we find
\[
\delta^{i'}{}_i\, \delta^{j'}{}_j = \sum_{\alpha, \lambda, l} \langle i', j'\, (\mu, \nu)\, \alpha, \lambda, l\rangle\, \langle \alpha, \lambda, l\, (\mu, \nu)\, i, j\rangle
\]
which is Eq. (8.13). If we instead start with the completeness of $\{|w_{i,j}\rangle\}$ and use the orthonormality of $\left\{ |w^\lambda_{\alpha l}\rangle \right\}$, we have
\[
|w^\lambda_{\alpha l}\rangle = \sum_{i,j} |w_{i,j}\rangle\, \langle w_{i,j} | w^\lambda_{\alpha l}\rangle \;\Rightarrow\; \langle w^{\lambda'}_{\alpha' l'} | w^\lambda_{\alpha l}\rangle = \sum_{i,j} \langle w^{\lambda'}_{\alpha' l'} | w_{i,j}\rangle\, \langle w_{i,j} | w^\lambda_{\alpha l}\rangle
\]
\[
\delta^{\alpha'}{}_\alpha\, \delta^{\lambda'}{}_\lambda\, \delta^{l'}{}_l = \sum_{i,j} \langle \alpha', \lambda', l'\, (\mu, \nu)\, i, j\rangle\, \langle i, j\, (\mu, \nu)\, \alpha, \lambda, l\rangle
\]
this is Eq. (8.14). QED.


It is convenient to simplify the notation by using the convention of summation over repeated indices and relabeling
\[
|w_{i,j}\rangle \to |i, j\rangle \quad ; \quad |w^\lambda_{\alpha l}\rangle \to |\alpha, \lambda, l\rangle \quad ; \quad \langle w_{i,j} | w^\lambda_{\alpha l}\rangle \equiv \langle i, j\, |\alpha, \lambda, l\rangle
\]

that is, the “w” letter is removed and the symbols $\mu, \nu$ are dropped since they are fixed. With these conventions we rewrite Eqs. (8.12, 8.16) as
\[
|\alpha, \lambda, l\rangle = |i, j\rangle\, \langle i, j\, |\alpha, \lambda, l\rangle \tag{8.17}
\]
\[
|i, j\rangle = |\alpha, \lambda, l\rangle\, \langle \alpha, \lambda, l\, |i, j\rangle \tag{8.18}
\]
they are clearly the inverse of each other. Applying the operators $U^{\mu\times\nu}(g)$ to both bases and using Eq. (8.4) we get
\[
U^{\mu\times\nu}(g)\,|i, j\rangle = |i', j'\rangle\, D^\mu(g)^{i'}{}_i\, D^\nu(g)^{j'}{}_j \quad ; \quad \forall g \in G \tag{8.19}
\]
\[
U^{\mu\times\nu}(g)\,|\alpha, \lambda, l\rangle = |\alpha, \lambda, l'\rangle\, D^\lambda(g)^{l'}{}_l \quad ; \quad \forall g \in G \quad \text{(no sum over } \lambda\text{)} \tag{8.20}
\]
where we have used the fact that $W^\lambda_\alpha$ is invariant under $U^{\mu\times\nu}(G)$ 2 . Replacing Eq. (8.18) into the left-hand side (LHS) of Eq. (8.19), and using Eq. (8.20), we find
\[
U^{\mu\times\nu}(g)\,|i, j\rangle = U^{\mu\times\nu}(g)\,|\alpha, \lambda, l\rangle\, \langle \alpha, \lambda, l\, |i, j\rangle = |\alpha, \lambda, l'\rangle\, D^\lambda(g)^{l'}{}_l\, \langle \alpha, \lambda, l\, |i, j\rangle
\]
and using Eq. (8.17) we have
\[
U^{\mu\times\nu}(g)\,|i, j\rangle = |i', j'\rangle\, \langle i', j'\, |\alpha, \lambda, l'\rangle\, D^\lambda(g)^{l'}{}_l\, \langle \alpha, \lambda, l\, |i, j\rangle \tag{8.21}
\]
equating Eqs. (8.19, 8.21) and using the linear independence of the set $\{|i, j\rangle\}$, we have
\[
D^\mu(g)^{i'}{}_i\, D^\nu(g)^{j'}{}_j = \langle i', j'\, |\alpha, \lambda, l'\rangle\, D^\lambda(g)^{l'}{}_l\, \langle \alpha, \lambda, l\, |i, j\rangle \tag{8.22}
\]
On the other hand, substituting Eq. (8.17) in the LHS of Eq. (8.20) and using Eq. (8.19) gives
\[
U^{\mu\times\nu}(g)\,|\alpha, \lambda, l\rangle = U^{\mu\times\nu}(g)\,|i, j\rangle\, \langle i, j\, |\alpha, \lambda, l\rangle = |i', j'\rangle\, D^\mu(g)^{i'}{}_i\, D^\nu(g)^{j'}{}_j\, \langle i, j\, |\alpha, \lambda, l\rangle \tag{8.23}
\]
equating Eqs. (8.20, 8.23) we have
\[
|\alpha, \lambda, l''\rangle\, D^\lambda(g)^{l''}{}_l = |i', j'\rangle\, D^\mu(g)^{i'}{}_i\, D^\nu(g)^{j'}{}_j\, \langle i, j\, |\alpha, \lambda, l\rangle
\]
multiplying by $\langle \alpha', \lambda', l'|$ on both sides we find
\[
\langle \alpha', \lambda', l'\, |\alpha, \lambda, l''\rangle\, D^\lambda(g)^{l''}{}_l = \langle \alpha', \lambda', l'\, |i', j'\rangle\, D^\mu(g)^{i'}{}_i\, D^\nu(g)^{j'}{}_j\, \langle i, j\, |\alpha, \lambda, l\rangle
\]
\[
\delta^{\alpha'}_\alpha\, \delta^{\lambda'}_\lambda\, \delta^{l'}_{l''}\, D^\lambda(g)^{l''}{}_l = \langle \alpha', \lambda', l'\, |i', j'\rangle\, D^\mu(g)^{i'}{}_i\, D^\nu(g)^{j'}{}_j\, \langle i, j\, |\alpha, \lambda, l\rangle
\]
\[
\delta^{\alpha'}_\alpha\, \delta^{\lambda'}_\lambda\, D^\lambda(g)^{l'}{}_l = \langle \alpha', \lambda', l'\, |i', j'\rangle\, D^\mu(g)^{i'}{}_i\, D^\nu(g)^{j'}{}_j\, \langle i, j\, |\alpha, \lambda, l\rangle \tag{8.24}
\]
Collecting Eqs. (8.22, 8.24) we obtain a useful theorem.
Theorem 8.3 (Reduction of product representations): The similarity transformation composed of CG coefficients decomposes the direct product representation $D^{\mu\times\nu}$ into its irreducible components. The following reciprocal relations hold
\[
D^\mu(g)^{i'}{}_i\, D^\nu(g)^{j'}{}_j = \langle i', j'\, |\alpha, \lambda, l'\rangle\, D^\lambda(g)^{l'}{}_l\, \langle \alpha, \lambda, l\, |i, j\rangle \tag{8.25}
\]
\[
\delta^{\alpha'}_\alpha\, \delta^{\lambda'}_\lambda\, D^\lambda(g)^{l'}{}_l = \langle \alpha', \lambda', l'\, |i', j'\rangle\, D^\mu(g)^{i'}{}_i\, D^\nu(g)^{j'}{}_j\, \langle i, j\, |\alpha, \lambda, l\rangle \tag{8.26}
\]
Note that the latter equation makes explicit the block-diagonal texture of the matrices in the new basis: the similarity-transformed matrix (RHS of Eq. 8.26) is diagonal in the indices $\lambda, \alpha$ (LHS of Eq. 8.26), indicating the invariance of the subspace $W^\lambda_\alpha$.
Perhaps an easier geometrical interpretation of these results can be obtained by simplifying the notation. Let us denote by $u$ and $v$ the column vectors formed with the sets $\{|w_k\rangle\}$ and $\left\{ |w^\lambda_{\alpha l}\rangle \right\}$ respectively. Let us denote by S the transfer matrix with matrix elements given by $\langle w_k | w^\lambda_{\alpha l}\rangle$ (i.e. by the CG-coefficients, according to Eq. 8.11). In matrix form, Eqs. (8.10, 8.13, 8.14) are written as
\[
v = \widetilde{u}\, S \quad ; \quad S S^\dagger = E_W \quad ; \quad S^\dagger S = E_W
\]
where $E_W$ is the identity in W . The two latter equations are manifestations of the unitarity of the transfer matrix. Since $u$ is orthonormal, the unitarity of S ensures that the new basis $v$ is also orthonormal. Further, Eqs. (8.25, 8.26) can be written as
\[
D^\mu(g) \otimes D^\nu(g) = S \left[ E_\perp \oplus D^\lambda_\alpha(g) \right] S^\dagger \quad ; \quad E_\perp \oplus D^\lambda_\alpha(g) = S^\dagger \left[ D^\mu(g) \otimes D^\nu(g) \right] S \tag{8.27}
\]
where $D^\lambda_\alpha(g)$ acts on $W^\lambda_\alpha$, while $E_\perp$ is the identity on the orthogonal complement of $W^\lambda_\alpha$ with respect to $W = V_\mu \otimes V_\nu$. Therefore, $E_\perp \oplus D^\lambda_\alpha(g)$ is an operator on W , and Eqs. (8.27) express the similarity transformations that connect the coupled and decoupled representations. In particular, the second of equations (8.27) makes apparent the fact that the transfer matrix S takes the product representation into the block-diagonal form with respect to the minimal invariant subspaces.

2 It is clear that $D^\lambda(g)^{l'}{}_l$ cannot depend on $\alpha$, since this label corresponds to different minimal invariant subspaces of W associated with equivalent representations.

8.1.3 The importance of direct product representations in Physics


It is usual in Physics to characterize a system N by a state vector in a given vector space U and another system M by a state vector in a vector space V . If we now want to consider N and M as a new single system, we describe the physical states of the new system in the vector space W = U ⊗ V . Notice that, although the basis in W can be constructed from the bases of U and V , not every vector w ∈ W can be written as the direct product of two vectors in U and V 3 . In Physics, this reflects the fact that the interactions or correlations between the subsystems N and M can create states that are not generated by simply adjoining the two subsystems.
Further, product spaces can also be generated when we need to add more degrees of freedom to a given subsystem. For example, if U describes the orbital degrees of freedom of a single quantum particle, we add the spin variables by taking the tensor product of the orbital state space U with the spinorial state space χ.
On the other hand, operators, and hence group representations, must also be extended properly. Direct products of representations appear when we study the behavior of systems with several degrees of freedom under symmetry operations, or when we study the patterns of transition amplitudes of physical processes derived from the underlying symmetry. Sometimes they are used to study broken symmetries. The addition of angular momenta of two or more particles, and of the orbital and spinor angular momenta of a single particle, are the most common examples. An important result that will permit us to explore transition amplitudes is the Wigner-Eckart theorem, which comes directly from the process of reduction of direct product representations.
It is worth saying that, as in the case of vectors, there exist operators C on the product space W that cannot be generated as products of operators on U and V . In physical problems this is related once again to the fact that two subsystems generate interactions or correlations between them that are not included in the study of each subsystem separately. For instance, the Hamiltonian of the compound system is not necessarily the direct product of the individual Hamiltonians, since the interacting potential arises only after both subsystems are put together. If A and B correspond to the same physical operator in U and V respectively, then A ⊗ B corresponds to the same physical operator in U ⊗ V . For instance, if A corresponds to the x-component of the linear momentum of a particle, p1x , and B is associated with the x-component of the linear momentum of a second particle, p2x , then the operator A ⊗ B is associated with the x-component of the linear momentum of the combined system of two particles.

Example 8.2 Let $\mathbf{x}_1, \mathbf{x}_2$ denote the coordinate vectors of particles 1 and 2 respectively. In Classical Mechanics we characterize the two-particle system by the set $(\mathbf{x}_1, \mathbf{x}_2)$. In Quantum Mechanics, such a characterization is made with the state vector coming from the direct product of state vectors in the Hilbert spaces $H_1$ and $H_2$ associated with each particle. The state vector for each particle reads
\[
|\psi_i\rangle = \int |\mathbf{x}_i\rangle\, \psi_i(\mathbf{x}_i)\, d^3 x_i \quad ; \quad i = 1, 2
\]
where $\psi_i(\mathbf{x}_i)$ is the wave function of the $i$-th particle. The state vectors associated with the combined system of both particles are in the space $H = H_1 \otimes H_2$. The system has coordinates $|\mathbf{x}_1, \mathbf{x}_2\rangle = |\mathbf{x}_1\rangle \otimes |\mathbf{x}_2\rangle$ and state vectors
\[
|\psi_s\rangle = \int |\mathbf{x}_1, \mathbf{x}_2\rangle\, \psi_s(\mathbf{x}_1, \mathbf{x}_2)\, d^3 x_1\, d^3 x_2
\]
if the compound system consists of identical particles, then an additional symmetrization or antisymmetrization will be necessary.

8.2 Construction of representations in vector spaces of functions


In Physics we usually start with a set of operators {Ti } on a vector space V with elements {xj }, in which the set {Ti } is already
a representation of a group. In this section, we develop a general algorithm to find new representations on vector spaces of
functions Vf from this set of operators on V . We start with a certain vector space of functions Vf , and denote an element of this
space as f (x), where the domain of these functions is V . The idea is to take an operator T acting on the elements of the vector
space V , and induce an operator that acts on the functions f (x), in such a way that the law of combination is preserved.
If we have an operator T on V such that
T x = x′ (8.28)
we want to define an induced operator OT on Vf that maps a function f (x) into another function f ′ (x). As a matter of ansatz, let us impose the restriction of preserving the value of the function when both transformations are carried out, i.e.

f ′ (x′ ) = f (x) (8.29)

so that the transformed function


OT f ≡ f ′ (8.30)
3 See Sec. 3.20 page 62.

has the same value at the image point $x'$ as the value of the original function f at the object point x. In other words, if we denote $f(x) = c$, we arrive at the same value c in two ways4: in the first one we start from x in V and map it into the value c through the element f in Vf
\[
x \to f(x) = c
\]
in the second way we also start from x, but take the following steps
\[
x \;\to\; T x = x' \;\to\; f(x') = f(T(x)) \;\to\; f'(x') = [O_T f]\,[T(x)] = c
\]


This procedure clearly gives a relation between T and OT . Eq. (8.29) can be rewritten as
\[
O_T f(T x) = f(x) \; , \quad \forall x \in V, \; \forall f \in V_f, \; \forall T \in U(G) \quad \Leftrightarrow \tag{8.31}
\]
\[
O_T f(x) = f\!\left( T^{-1} x \right) \; , \quad \forall x \in V, \; \forall f \in V_f, \; \forall T \in U(G) \tag{8.32}
\]

now we should check that OT is a representation of a given group G when T is. A second transformation S gives

x′′ = Sx′ = ST x (8.33)

the action of the induced operator OS on a function h ∈ Vf is given by Eq. (8.31), hence

OS h (Sx′ ) = h (x′ ) , ∀x′ ∈ V, ∀h ∈ Vf , ∀S ∈ U (G) (8.34)

applying Eq. (8.34) for h = f ′ and using Eqs. (8.28, 8.29, 8.30) we find

OS f ′ (Sx′ ) = f ′ (x′ ) , ∀x′ ∈ V, ∀f ′ ∈ Vf , ∀S ∈ U (G) ⇒


OS OT f (ST x) = f (x) , ∀x ∈ V, ∀f ∈ Vf , ∀S, T ∈ U (G) (8.35)

on the other hand, ST ∈ {Ti } ≡ U (G), which implies that OST obeys Eq. (8.31) such that

OST f ((ST ) x) = f (x) , ∀x ∈ V, ∀f ∈ Vf , ∀S, T ∈ U (G) (8.36)

comparing Eqs. (8.35, 8.36), we find


OS OT = OST , ∀S, T ∈ U (G) (8.37)
so the product is preserved. Let us replace T = E (the identity) in Eq. (8.31); we obtain

OE f (x) = f (x) , ∀x ∈ V, ∀f ∈ Vf

from which it is clear that OE is the identity as a mapping of Vf onto itself.

OE f = f , ∀f ∈ Vf (8.38)

combining Eqs. (8.37, 8.38) with S = T −1 we get


−1
OT −1 OT = OE ⇒ OT −1 = (OT )

Associativity of the set {OT } is guaranteed because they are mappings.


In summary: Let {Ti } be a representation on V ≡ {xk } of a group G. Let Vf ≡ {f (xk )} be a vector space of functions in
which the domain of the functions is V . From a mapping T on V we can induce a mapping OT on Vf as defined by Eq. (8.32).
Further the set {OTi } forms a new representation of G on the space Vf , induced from the representation {Ti } on V . We shall
see with specific examples that the mapping Ti → OTi is in general a homomorphism and not necessarily an isomorphism.
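A concrete way to see Eq. (8.32) at work (an illustrative sketch assuming numpy, not from the notes): take $V = \mathbb{R}^2$ with a rotation $T = R(\phi)$ and a sample polynomial $f(x, y)$; the induced function $(O_T f)(x) = f(T^{-1}x)$ can be evaluated directly, and the homomorphism property $O_S O_T = O_{ST}$ of Eq. (8.37) checked numerically at sample points.
\begin{verbatim}
import numpy as np

def R(phi):
    """Rotation matrix on V = R^2."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s], [s, c]])

def induced(T):
    """O_T acting on functions f : R^2 -> R, via (O_T f)(x) = f(T^{-1} x), Eq. (8.32)."""
    Tinv = np.linalg.inv(T)
    return lambda f: (lambda x: f(Tinv @ x))

f = lambda x: x[0]**4 + x[1]**2          # a sample (non-invariant) function
S, T = R(0.3), R(1.1)
O_S, O_T, O_ST = induced(S), induced(T), induced(S @ T)

x = np.array([0.7, -1.2])
print(np.isclose(O_S(O_T(f))(x), O_ST(f)(x)))   # True: O_S O_T = O_{ST}, Eq. (8.37)
print(np.isclose(induced(T)(f)(T @ x), f(x)))    # True: O_T f(Tx) = f(x), Eq. (8.31)
\end{verbatim}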

8.2.1 Further properties of the operators OT


It is easy to show that if {Ti } is a linear representation then {OTi } is also a linear representation. To see it, let us apply Eq.
(8.31) to the functions f (x) , g (x) and h (x) ≡ f (x) + g (x) (all of them belonging to Vf ) to obtain

OT h (T x) = h (x) ⇒ OT [f (T x) + g (T x)] = f (x) + g (x) ; ∀f, g ∈ Vf and ∀x ∈ V (8.39)


OT f (T x) = f (x) ; OT g (T x) = g (x) ; ∀f, g ∈ Vf and ∀x ∈ V (8.40)

summing the two Eqs. (8.40) and comparing with the second of Eqs. (8.39) we see that

OT [f (T x) + g (T x)] = OT f (T x) + OT g (T x) ; ∀f, g ∈ Vf and ∀x ∈ V


OT [f (x′ ) + g (x′ )] = OT f (x′ ) + OT g (x′ ) ; ∀f, g ∈ Vf and ∀x′ ∈ V
4 Note that c represents any mathematical object, not necessarily a number. The nature of c depends on the nature of the image points of the

mapping f .

where we have used the fact that T is onto in the last step. Now applying Eq. (8.31) to the functions f (x) and h (x) ≡ αf (x)
we get

OT h (T x) = h (x) ⇒ OT [α f (T x)] = α f (x)


OT f (T x) = f (x) ⇒ αOT f (T x) = αf (x)

comparing the two equations on the RHS and using the fact that T is onto, we obtain

OT [α f (x′ )] = αOT f (x′ ) ; ∀f ∈ Vf , ∀α ∈ C and ∀x′ ∈ V

Therefore the set {OTi } defines a linear representation

OT [αf (x) + βg (x)] = αOT f (x) + βOT g (x) ; ∀f, g ∈ Vf , ∀α ∈ C and ∀x ∈ V (8.41)

so that OT also has a matrix representation. In addition, let f (x) , h (x) be two functions in Vf , if g (x) ≡ f (x) · h (x) also
belongs to Vf , then using Eq. (8.31) we find5

OT [g (T x)] = g (x) ⇒ OT [f (T x) · h (T x)] = f (x) · h (x)


OT f (T x) = f (x) ; OT h (T x) = h (x) ⇒ [OT f (T x)] · [OT h (T x)] = f (x) · h (x)

equating the RHS of the last two lines and using the fact that T is onto, we find

OT [f (x) · h (x)] = OT f (x) · OT h (x) , ∀x ∈ V, ∀f, h ∈ Vf if f (x) · h (x) ∈ Vf

we have changed our notation for the group of operators {U (g)} → {T } in order to simplify our derivations. Let us summarize
our results in our traditional notation
 
−1
OU(g) f (U (g) x) = f (x) , OU(g) f (x) = f U (g) x , ∀x ∈ V, ∀f ∈ Vf (8.42)
OU(g) [αf (x) + βh (x)] = αOU(g) f (x) + βOU(g) h (x) (8.43)
OU(g) [f (x) · h (x)] = OU(g) f (x) · OU(g) h (x) (8.44)

now since Vf is a vector space, it has a basis {fi } of functions. A matrix representation can be obtained with the same
algorithm explained before. That is, by applying the operators OU(g) to the basis vectors (functions) fi as in Eq. (7.11)
\[
O_{U(g)}\,|f_i\rangle = |f_j\rangle\, D\big(U(g)\big)^j{}_i \; , \quad g \in G
\]

8.2.2 Invariant functions and invariant operators in Vf


It could happen that, for a given function f (x) and a given operator OT , we have $O_T f = f$. According to Eqs. (8.31, 8.32) we can express this fact as
\[
O_T f(x) = f(x) \; , \; \forall x \in V \quad \Leftrightarrow \quad f(x) = f\!\left( T^{-1} x \right) \quad \Leftrightarrow \quad f(x) = f(T x) \quad \forall x \in V \tag{8.45}
\]
in that case we say that f is invariant under the operator OT .

Example 8.3 Let f (x) ≡ x4 + y 2 and OT the operator induced by the parity transformation on x ≡ (x, y). Since P x = −x
we have
f (P x) = f (−x) = (−x)4 + (−y)2 = x4 + y 2 = f (x)
so f (P x) = f (x) and hence f is invariant under OP .

Example 8.4 Let $f(\mathbf{x}) \equiv x^2 + y^2$ and OT the operator induced by the rotation transformation on $\mathbf{x} \equiv (x, y)$. Since $f(\mathbf{x}) = \|\mathbf{x}\|^2$ and $R(\phi)\,\mathbf{x} = \mathbf{x}'$ with $\|\mathbf{x}'\| = \|\mathbf{x}\|$, we have
\[
f(R(\phi)\,\mathbf{x}) = f(\mathbf{x}') = \|\mathbf{x}'\|^2 = \|\mathbf{x}\|^2 = f(\mathbf{x})
\]
so $f(R(\phi)\,\mathbf{x}) = f(\mathbf{x})$ and $f(\mathbf{x})$ is invariant under $O_{R(\phi)}$ (for all values of $\phi$).
5 Note that closure under multiplication is not guaranteed by the vector space axioms. Such a closure would be satisfied, in particular, if Vf is also an algebra.

Let us assume that we have an operator R (x) on Vf such that
\[
h(x) = R(x)\, g(x) \; , \quad \forall x \in V \ \text{and} \ g(x), h(x) \in V_f
\]
we have
\[
O_T\left[ R(x)\, g(x) \right] = O_T\, h(x) = h\!\left( T^{-1} x \right) = R\!\left( T^{-1} x \right) g\!\left( T^{-1} x \right) \tag{8.46}
\]
on the other hand, since operators are associative, we have
\[
O_T\left[ R(x)\, g(x) \right] = O_T\, R(x)\, O_T^{-1}\, O_T\, g(x) = \left[ O_T\, R(x)\, O_T^{-1} \right]\left[ O_T\, g(x) \right]
\]
\[
O_T\left[ R(x)\, g(x) \right] = R_T(x)\, g\!\left( T^{-1} x \right) \; , \quad R_T(x) \equiv O_T\, R(x)\, O_T^{-1} \tag{8.47}
\]
comparing Eqs. (8.46, 8.47) we have
\[
R_T(x) \equiv O_T\, R(x)\, O_T^{-1} = R\!\left( T^{-1} x \right) \; , \quad \forall x \in V \quad \Leftrightarrow \tag{8.48}
\]
\[
R_T(T x) = R(x) \; , \quad \forall x \in V \tag{8.49}
\]

so the transformed operator RT at the point T x is the same as the original operator R at the point x. In general, the value of RT at x differs from the value of R at the same point. If it happens that $R_T(x) = R(x)$ for all $x \in V$, we can express this as
\[
R_T(x) = R(x) \; , \; \forall x \in V \quad \Leftrightarrow \quad O_T\, R(x)\, O_T^{-1} = R(x) \quad \Leftrightarrow \quad [O_T, R(x)] = 0
\]

and we say that the operator R (x) is invariant under the transformation OT .
The most important case of invariance arises when the function f and/or the operator R (x) is invariant under all the operators in the representation {OTi }. In that case we say that f and/or R (x) is invariant under the group representation {OTi }.

Example 8.5 It is straightforward to observe that the operator $H(\mathbf{x}) = \partial^2/\partial x^2 + \partial^2/\partial y^2$ is invariant under the group representation $\{O_E, O_P\}$, with P the parity transformation on the space (x, y). It is also invariant under the group representation $\{O_{R(\phi)}\}$, with $R(\phi)$ denoting plane rotations.

8.2.3 Some examples of representation on vector spaces of functions


Example 8.6 Consider the group of operators consisting of the identity and the inversion acting on the three-dimensional Euclidean space, $U(G) = \{U(e), U(I)\}$, so that $U(e)\,\mathbf{x} = \mathbf{x}$ and $U(I)\,\mathbf{x} = -\mathbf{x}$, with $\mathbf{x} \in \mathbb{R}^3$. Let us take a given function $f(\mathbf{x})$. From (8.42) we find
\[
f(\mathbf{x}) = O_e f(e\mathbf{x}) = O_e f(\mathbf{x}) \quad ; \quad f(\mathbf{x}) = O_I f(I\mathbf{x}) = O_I f(-\mathbf{x}) \;\Rightarrow\; f(-\mathbf{x}) = O_I f(\mathbf{x})
\]
$O_e$ is the identity operator, as expected, and $O_I$ maps $f(\mathbf{x})$ into $f(-\mathbf{x})$; this suggests using both $f(\mathbf{x})$ and $f(-\mathbf{x})$ to obtain a mapping of $\{f(\mathbf{x}), f(-\mathbf{x})\}$ onto itself with these operators
\[
O_e f(\mathbf{x}) = f(\mathbf{x}) \;\; ; \;\; O_I f(\mathbf{x}) = f(-\mathbf{x}) \;\; ; \;\; O_e f(-\mathbf{x}) = f(-\mathbf{x}) \;\; ; \;\; O_I f(-\mathbf{x}) = f(\mathbf{x})
\]
and defining $f_1 \equiv f(\mathbf{x})$, $f_2 \equiv f(-\mathbf{x})$, we find
\[
\begin{array}{ll}
O_e f_1 = 1\cdot f_1 + 0\cdot f_2 \qquad & O_e f_1 = f_1\, D^1{}_1(e) + f_2\, D^2{}_1(e) \\
O_e f_2 = 0\cdot f_1 + 1\cdot f_2 \qquad & O_e f_2 = f_1\, D^1{}_2(e) + f_2\, D^2{}_2(e) \\
O_I f_1 = 0\cdot f_1 + 1\cdot f_2 \qquad & O_I f_1 = f_1\, D^1{}_1(I) + f_2\, D^2{}_1(I) \\
O_I f_2 = 1\cdot f_1 + 0\cdot f_2 \qquad & O_I f_2 = f_1\, D^1{}_2(I) + f_2\, D^2{}_2(I)
\end{array}
\]
and the matrix representative is
\[
D(e) = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad ; \quad D(I) = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{8.50}
\]
in this case we are taking $f_1$ and $f_2$ as the basis of a two-dimensional vector space of functions generated by their linear combinations. Of course, the representation of C2 given by Eq. (8.50) must be reducible, since C2 is abelian, so that its irreducible representations are one-dimensional.

Example 8.7 For the same group of operators {e, I} in $\mathbb{R}^3$, let us take an even function $f_+(\mathbf{x}) = f_+(-\mathbf{x})$. Therefore $O_e f_+(\mathbf{x}) = O_I f_+(\mathbf{x}) = f_+(\mathbf{x})$. In this case, $f_+(\mathbf{x})$ and $f_+(-\mathbf{x})$ are linearly dependent, so that we only use a one-dimensional vector space of functions. The representation is the trivial one and is one-dimensional, showing that the mapping is a homomorphism of the form $\{e, I\} \to \{O_e\}$. If we now use an odd function $f_-(-\mathbf{x}) = -f_-(\mathbf{x})$, we obtain a non-trivial one-dimensional representation $D(e) = -D(I) = 1$. In the latter case the mapping is an isomorphism. These two irreducible one-dimensional representations can be obtained from the reducible two-dimensional representation (8.50) by a similarity transformation. The change of basis is given by
\[
f_+ = \frac{f_1 + f_2}{\sqrt{2}} \equiv \frac{f(\mathbf{x}) + f(-\mathbf{x})}{\sqrt{2}} \quad ; \quad f_- = \frac{f_1 - f_2}{\sqrt{2}} \equiv \frac{f(\mathbf{x}) - f(-\mathbf{x})}{\sqrt{2}}
\]
In these cases we have worked with vector spaces of functions of one and two dimensions. In Physics, there are vector spaces
of functions of higher dimensions, and even with infinite dimensions, since we often find that the vector space of solutions of
a given differential equation is of infinite dimension. Group representations on function spaces are very important in Physics.
For example, a function f (x) defined on the Euclidean space (or Minkowski space) is what we usually call a “field”, which is a
practical tool to describe observables in Physics (the wave function field in classical and quantum mechanics, electromagnetic
fields, fluid fields, gauge fields, etc.). The type of field is given by the value of the function: if f (x) is a scalar, a vector, or a tensor, we speak of scalar, vector, and tensor fields respectively.

Example 8.8 Consider the eigenvalue problem for an observable A on the space Vf
\[
A\, h_{nm} = c_n\, h_{nm} \; , \quad m = 1, \ldots, k_n \quad \text{(no sum on } n\text{)}
\]
where $k_n$ is the degree of degeneracy of the eigenvalue $c_n$. It is well known that the set of all eigenvectors of $c_n$, along with the zero vector, provides a vector subspace of Vf . Let {OR } be a group of operators under which the observable A is invariant, i.e. A commutes with all operators in that set. Theorem 3.17 says that $O_R h_{nm}$ is also an eigenvector of A with the same eigenvalue $c_n$
\[
A\left[ O_R\, h_{nm} \right] = c_n\left[ O_R\, h_{nm} \right] \; , \quad m = 1, \ldots, k_n \; , \quad \forall O_R \in \{O_R\} \tag{8.51}
\]
Let us denote by $M_n^{(k_n)}$ the $k_n$-dimensional subspace of Vf associated with $c_n$, and let $\left\{ h_n^1, \ldots, h_n^{k_n} \right\}$ be a basis of $M_n^{(k_n)}$. Eq. (8.51) shows that each $O_R$ maps $M_n^{(k_n)}$ into itself. Consequently, we can construct a representation on the subspace $M_n^{(k_n)}$ by applying each operator to all basis functions $h_n^i$
\[
O_R\, h_n^i = h_n^j\, D(O_R)^j{}_i
\]
in other words, each eigenvalue $c_n$ spans a subspace $M_n^{(k_n)}$ of Vf whose dimension is the degree of degeneracy $k_n$ of $c_n$, and the direct sum of such subspaces generates Vf
\[
V_f = \sum_{\oplus n} M_n^{(k_n)}
\]
and the preceding arguments show that, if A is invariant under the group {OR }, each subspace $M_n^{(k_n)}$ is invariant under such a group.

8.3 The adjoint and the complex conjugate representations


Let {D (G)} be a matrix representation of the group G in the vector space V . A quite obvious way of obtaining new representations on the same vector space is the following: consider the following operations done on the matrices
\[
D \to D^* \; , \quad D \to \widetilde{D} \; , \quad D \to D^{-1} \; , \quad D \to D^\dagger \tag{8.52}
\]

we wonder whether the set of matrices obtained from {D (G)} with any of these operations also forms a representation. For the conjugate of the matrices we have
\[
D^*(g_1 g_2) = \left[ D(g_1 g_2) \right]^* = \left[ D(g_1)\, D(g_2) \right]^* = D^*(g_1)\, D^*(g_2)
\]
hence $\{D^*(G)\}$ also forms a representation, and we call it the complex conjugate representation $D^*(G)$. However, the remaining operations do not form a representation (unless the group is abelian), because they reverse the order of multiplication when the operation is applied to the product of two matrices. For instance
\[
\widetilde{D}(g_1 g_2) = \widetilde{\left[ D(g_1 g_2) \right]} = \widetilde{\left[ D(g_1)\, D(g_2) \right]} = \widetilde{D}(g_2)\, \widetilde{D}(g_1)
\]
and the same happens for the inverse and the adjoint. Note, however, that a representation can be obtained by combining two of the operations given in Eq. (8.52) (except by combining the complex conjugate with another, different operation), since in that case

the order is reversed twice, recovering the original order. For instance, we can form a representation by taking the transpose of the inverse of the matrices
\[
\widetilde{D}^{-1}(g_1 g_2) = \left[ \widetilde{D(g_1 g_2)} \right]^{-1} = \left[ \widetilde{D(g_1)\, D(g_2)} \right]^{-1} = \left[ \widetilde{D}(g_2)\, \widetilde{D}(g_1) \right]^{-1} = \widetilde{D}^{-1}(g_1)\, \widetilde{D}^{-1}(g_2)
\]
this is called the adjoint representation $\overline{D}(G)$
\[
\left\{ \overline{D}(g) \right\} \equiv \left\{ \widetilde{D}^{-1}(g) \right\} = \left\{ \widetilde{D}\!\left( g^{-1} \right) \right\}
\]
 n −1 o n o
e (g) = D
D (g) ≡ D e g −1

this name is a bit misleading since certainly the matrices formed with this representation are not the adjoint matrices D† (G).
Nevertheless, since the adjoint matrices in the usual sense do not form a representation for a non-abelian group, it is not a
real source of confusion.
It is easy to check that the characters of the complex conjugate representation (CCR) are simply the complex conjugates of the characters of the original representation
\[
\mathrm{Tr}\left[ D^*(g) \right] = \left[ \mathrm{Tr}\, D(g) \right]^* = \chi^*(g) \; ; \quad \forall g \in G
\]
Now, to see the relation between the characters of $\overline{D}$ and the characters of D, we observe that $\widetilde{D}^{-1}(g) = \widetilde{D}\!\left( g^{-1} \right)$ and that transposition leaves the trace invariant. Therefore, the characters of the adjoint representation (ADR) are given by
\[
\bar{\chi}(g) = \chi\!\left( g^{-1} \right) \quad \text{or} \quad \bar{\chi}_i = \chi_{i'} \; ; \quad \forall g \in G \ \text{and for all} \ \zeta_i \subseteq G
\]
where $\zeta_{i'}$ is the class of the elements inverse to those in class $\zeta_i$.


Now assuming that U (µ) (G) is an irreducible representation of G in V , the first natural question is whether CCR and ADR
are also irreducible or not.

Theorem 8.4 Let D (G) be a finite-dimensional matrix representation of a finite group G. Then D (G), $\overline{D}$ (G), D∗ (G) are either all reducible or all irreducible.

Proof: Let $\{\chi_i\}$, $\{\chi^*_i\}$ and $\{\bar{\chi}_i\}$ be the sets of characters associated with D (G), D∗ (G), and $\overline{D}$ (G) respectively. It is clear that
\[
\sum_{i=1}^{n_c} n_i\,|\chi_i|^2 = \sum_{g\in G} |\chi(g)|^2 = \sum_{g\in G} \left|\chi\!\left(g^{-1}\right)\right|^2 = \sum_{g\in G} |\bar{\chi}(g)|^2 \;\Rightarrow\; \sum_{i=1}^{n_c} n_i\,|\chi_i|^2 = \sum_{i=1}^{n_c} n_i\,|\bar{\chi}_i|^2
\]
where we have used the rearrangement lemma6 . Now, since $|\chi_i|^2 = |\chi^*_i|^2$, we find
\[
\sum_{i=1}^{n_c} n_i\,|\chi_i|^2 = \sum_{i=1}^{n_c} n_i\,|\chi^*_i|^2 = \sum_{i=1}^{n_c} n_i\,|\bar{\chi}_i|^2
\]
it is obvious that these three expressions are either all equal to or all different from $n_G$. Since theorem 7.15, Eq. (7.62), provides necessary and sufficient conditions for the irreducibility of a given representation of a finite group, we conclude that D (G), $\overline{D}$ (G), D∗ (G) are either all reducible or all irreducible. QED.

Corollary 8.5 Let $D^\mu(G)$ be an irreducible matrix representation of a finite group G. The set of characters $\{\bar{\chi}_i\}$ satisfies orthonormality and completeness relations of the form
\[
\sum_{i=1}^{n_c} \frac{n_i}{n_G}\, \bar{\chi}^{\dagger i}_\mu\, \bar{\chi}^\nu_i = \delta_{\mu\nu} \quad \text{(orthonormality)}
\]
\[
\frac{n_i}{n_G} \sum_\mu \bar{\chi}^\mu_i\, \bar{\chi}^{\dagger j}_\mu = \delta_{ij} \quad \text{(completeness)}
\]
Note that the orthonormality and completeness relations for $\{\chi^*_i\}$ are obtained simply by taking the complex conjugate on both sides of the orthonormality and completeness relations for $\{\chi_i\}$.
Now, if $U^\nu(G)$ is an irreducible representation in V , we have seen that $\overline{U}^\nu(G)$ and $U^{*\nu}(G)$ are irreducible representations on the same vector space V . Hence, the next natural question is whether these three representations are equivalent or not.
6 Of course the number of classes nc and the number of elements ni in each class, are intrinsic properties of the group and not of the representation.

8.3.1 Conditions for the equivalence of D∗ (G) and $\overline{D}$ (G)


Definition 8.5 Let D (G) be a matrix representation of the group G in a vector space V . Let F be a matrix on V . We say
that hy |F xi is an invariant in V under D (G) if

hy |F xi = hD (g) y |F D (g) xi ; ∀g ∈ G and ∀x, y ∈ V

further, if F is non-singular we call it a non-singular invariant. If F is hermitian we call it an hermitian invariant.


Let us assume that D∗ (G) and $\overline{D}$ (G) are equivalent representations in V . Hence, there exists a non-singular matrix F on V such that
\[
D^*(g) = \widetilde{F}^{-1}\, \widetilde{D}^{-1}(g)\, \widetilde{F} \; ; \quad \forall g \in G
\]
taking the transpose on both sides we have
\[
D^\dagger(g) = F\, D^{-1}(g)\, F^{-1} \;\Rightarrow\; D^\dagger(g)\, F\, D(g) = F \; ; \quad \forall g \in G \tag{8.53}
\]

from which it is easy to prove that $\langle y | F x\rangle$ is a non-singular invariant in V under D (G)
\[
\langle y | F x\rangle = \langle y | D^\dagger(g)\, F\, D(g)\, x\rangle = \langle D(g)\, y | F\, D(g)\, x\rangle \; ; \quad \forall g \in G \ \text{and} \ \forall x, y \in V \tag{8.54}
\]
Conversely, if there exists a non-singular matrix F such that $\langle y | F x\rangle$ is invariant in V under U (G), then
\[
\langle y | F x\rangle = \langle D(g)\, y | F\, D(g)\, x\rangle = \langle y | D^\dagger(g)\, F\, D(g)\, x\rangle \; ; \quad \forall g \in G \ \text{and} \ \forall x, y \in V
\]
\[
\langle y \,|\, \left[ F - D^\dagger(g)\, F\, D(g) \right] x\rangle = 0 \; ; \quad \forall g \in G \ \text{and} \ \forall x, y \in V
\]
applying theorem 2.40, page 31, we have
\[
F - D^\dagger(g)\, F\, D(g) = 0 \;\Rightarrow\; D^\dagger(g)\, F\, D(g) = F \; ; \quad \forall g \in G \tag{8.55}
\]

On the other hand, taking the transpose on both sides of Eq. (8.55) we have
\[
\widetilde{D}(g)\, \widetilde{F}\, D^*(g) = \widetilde{F} \;\Rightarrow\; D^*(g) = \widetilde{F}^{-1}\, \widetilde{D}^{-1}(g)\, \widetilde{F} \tag{8.56}
\]
so that D∗ (G) and $\overline{D}$ (G) are equivalent representations. Note that in the last step we have used the non-singularity of F . This can be summarized as

Theorem 8.6 Let D (G) be a matrix representation of the group G in V . The representations D∗ (G) and $\overline{D}$ (G) are equivalent if and only if there exists at least one non-singular invariant $\langle x | F y\rangle$ in V under D (G).

A very important corollary arises when D (G) is a unitary representation (for finite groups unitary representations exhaust
all possible inequivalent representations).

Corollary 8.7 Let D (G) be a unitary matrix representation of the group G in V . Then the representations D∗ (G) and $\overline{D}$ (G) are equal, and there exists at least one non-singular hermitian invariant $\langle x | F y\rangle$ in V under D (G), namely F = E (the identity). In particular, D∗ (G) is equivalent to $\overline{D}$ (G) for any finite group.

Proof: For a unitary representation $D^\dagger = D^{-1}$ and, taking the transpose, $D^* = \overline{D}$. Therefore, D∗ is equivalent to (indeed, equal to) $\overline{D}$. The non-singular hermitian invariant in V under D (G) is trivial: choosing F = E does the work, because unitary matrices leave the inner product invariant. Further, we should remember that for finite groups any representation is equivalent to a unitary representation. QED.
We have seen that if a non-zero invariant F in V under U (G) exists, we are led to Eq. (8.55), which is identical with Eq. (8.53)7 . On the other hand, taking the adjoint of Eq. (8.53) we find
\[
D^\dagger(g)\, F^\dagger\, D(g) = F^\dagger
\]
a procedure similar to the one given by Eq. (8.54) shows that $\langle y | F^\dagger x\rangle$ is also a non-zero invariant in V under U (G). Therefore, we can find two hermitian invariants in V under U (G)
\[
\left\langle y \,\middle|\, \frac{F + F^\dagger}{2}\, x \right\rangle \quad ; \quad \left\langle y \,\middle|\, \frac{F - F^\dagger}{2i}\, x \right\rangle \tag{8.57}
\]
since $F \neq 0$, at least one of these hermitian invariants is non-zero. In particular, if F is already hermitian (or anti-hermitian), then only one non-zero hermitian form can be constructed this way. Hence, the existence of a non-zero matrix F on V such that $\langle y | F x\rangle$ is invariant in V under U (G) leads to the existence of at least one non-zero hermitian invariant in V under U (G). Note that in theorem 8.6, D (G) is not necessarily irreducible.
7 Note that the requirement of non-singularity for F is not necessary to arrive at Eq. (8.55), but it is necessary to arrive at Eq. (8.56).

Theorem 8.8 Let Dµ (G) be an irreducible matrix representation of the group G in V . Assume that there exists a non-singular
invariant hy |F xi in V under Dµ (G). If hy |Hxi is also invariant in V under Dµ (G), then H = αF with α being a complex
number. In particular, it means that the existence of a non-singular invariant in V under Dµ (G) implies that any other
invariant in V under Dµ (G) is either null or non-singular. Further, any matrix H associated with an invariant in V under
Dµ (G) must be proportional to its adjoint, i.e. H = kH † for some complex scalar k.

Proof: If ⟨y|Fx⟩ is a non-singular invariant in V under Dµ(G), then D†FD = F (we omit the superscript µ and the argument g ∈ G for clarity). Taking the inverse on both sides we have

F⁻¹ = D⁻¹ F⁻¹ D†⁻¹ ⇒ HF⁻¹ = H (D⁻¹ F⁻¹ D†⁻¹)

and taking into account that ⟨y|Hx⟩ is also invariant, so that H = D†HD, we find

HF⁻¹ = (D†HD)(D⁻¹ F⁻¹ D†⁻¹) = D† (HF⁻¹) D†⁻¹ ⇒ D†⁻¹ (HF⁻¹) = (HF⁻¹) D†⁻¹

taking the conjugate transpose on both sides we find

(HF⁻¹)† D⁻¹(g) = D⁻¹(g) (HF⁻¹)† ; ∀g ∈ G
(HF⁻¹)† D(g⁻¹) = D(g⁻¹) (HF⁻¹)† ; ∀g ∈ G

running over all g is equivalent to running over all g⁻¹. Therefore, (HF⁻¹)† commutes with all matrices of the irreducible representation D(G). Hence, Schur's lemma 2 says that (HF⁻¹)† = α∗E; taking the conjugate transpose of that expression we have

HF⁻¹ = αE ⇒ H = αF

therefore H is either null (if α = 0) or non-singular (if α ≠ 0). If H = ±H†, this invariant form obviously satisfies the condition H = kH†. If H ≠ ±H†, then Eqs. (8.57) give us two different non-null hermitian invariants; but according to the previous result they cannot be independent, thus

H − H† = α (H + H†) ⇒ (1 − α) H = (1 + α) H† with α ≠ ±1

H = [(1 + α)/(1 − α)] H† with α ≠ ±1

since it is clear that α = ±1 would lead to H = 0. Consequently, H = kH† for some scalar k. QED.


The equivalence of D∗ and D̄, symbolized as D∗ ≈ D̄, means that there exists S on V such that D̃⁻¹(g) = S D∗(g) S⁻¹. Taking the complex conjugate we have

D†⁻¹(g) = S∗ D(g) S∗⁻¹ ⇒ D†⁻¹(g) ≈ D(g)

it is simple to reverse these steps to see that

D∗ ≈ D̄ ⇔ D†⁻¹ ≈ D

Let V be the n-dimensional space in which D(g) is constructed. Hence, if D†⁻¹(g) is not equivalent to D(g), we cannot construct a non-singular hermitian invariant in V under D(G). However, theorem 8.6 does not forbid obtaining non-singular hermitian invariants under another representation D′(G) defined in another vector space V′. With this in mind, we shall construct a non-singular hermitian invariant under D′(G) ≡ D(G) ⊕ D†⁻¹(G) in the 2n-dimensional vector space V′ = V(1) ⊕ V(2), where V(1) and V(2) are identical with V. Let x ∈ V(1) and y ∈ V(2), and let x̄, ȳ be the appropriate extensions of these vectors in the direct sum space⁸. The transformations under D′(G) of x̄ and ȳ are given by

x̄′ = [𝒟(G) ⊕ 𝒟†⁻¹(G)] x̄ = 𝒟(G) x̄
ȳ′ = [𝒟(G) ⊕ 𝒟†⁻¹(G)] ȳ = 𝒟†⁻¹(G) ȳ

where 𝒟(G) and 𝒟†⁻¹(G) are the extended matrices associated with D(G) and D†⁻¹(G), defined as

𝒟(G) ≡ ( D(G)_{n×n}   0_{n×n} )        𝒟†⁻¹(G) ≡ ( 0_{n×n}   0_{n×n}         )
        ( 0_{n×n}      0_{n×n} )    ;               ( 0_{n×n}   D†⁻¹(G)_{n×n} )

D(G) ⊕ D†⁻¹(G) = ( D(G)_{n×n}   0_{n×n}         )
                  ( 0_{n×n}      D†⁻¹(G)_{n×n} )
8 If x = (x1 , . . . , xn ) its extension is x = (x1 , . . . , xn , 0, . . . , 0). Similarly, y = (0, . . . , 0, y1 , . . . , yn ).

a vector in the direct sum can be written as (writing a column vector with components a ∈ V(1), b ∈ V(2) as (a ; b))

ψ = x̄ + ȳ = (x ; 0) + (0 ; y) = (x ; y) ; x ∈ V(1), y ∈ V(2)

a transformation under the representation D′(G) gives

ψ′ = [D(G) ⊕ D†⁻¹(G)] ψ = ( D(G)_{n×n}   0_{n×n}         ) (x ; y)
                           ( 0_{n×n}      D†⁻¹(G)_{n×n} )

ψ′ = (D(g) x ; D†⁻¹(g) y) (8.58)

we can define two non-singular hermitian matrices on V′

F₁ ≡ ( 0_{n×n}   E_{n×n} )        F₂ ≡ ( 0_{n×n}    −iE_{n×n} )
     ( E_{n×n}   0_{n×n} )    ;         ( iE_{n×n}   0_{n×n}  )      (8.59)

it can be checked easily that they provide two non-singular hermitian invariants in V′ under D′(G)

φ† Fᵢ ψ = ⟨φ|Fᵢψ⟩ ; i = 1, 2 ; φ, ψ ∈ V′

let us do it for F₁, denoting

φ = (z ; w) ; φ′ = (D(g) z ; D†⁻¹(g) w)

and using (8.58, 8.59) the transformed bilinear form gives

φ′† F₁ ψ′ = ( z†D†(g) , w†D⁻¹(g) ) F₁ (D(g) x ; D†⁻¹(g) y)
          = ( z†D†(g) , w†D⁻¹(g) ) (D†⁻¹(g) y ; D(g) x) = z†D†(g) D†⁻¹(g) y + w†D⁻¹(g) D(g) x
          = z†y + w†x = ( z† , w† ) (y ; x) = ( z† , w† ) F₁ (x ; y)

φ′† F₁ ψ′ = φ† F₁ ψ

8.3.2 Conditions for the equivalence of D and D ∗ , real representations


In this section we shall assume that G is a finite group or a group in which the rearrangement lemma and the orthogonality
and completeness theorems for characters can be extended to G. In particular, this is the case for compact Lie groups.

Definition 8.6 Let D (G) be a matrix representation of a group G in V . D (G) is called a real representation if D (g) is a
real matrix for each g ∈ G or if all such matrices can be brought to a real form through a similarity transformation, such that

D (G) = D∗ (G)

similarly a set of characters {χµi } in a given representation µ of G is called real if χµi is real for each class ζi ⊆ G. Otherwise
we say that it is a complex set of characters.

Let Dµ(G) be an irreducible representation of G in V. We want to search for the conditions under which Dµ(G) and Dµ∗(G) are equivalent. We have already seen that the set of characters of Dµ∗ is the complex conjugate of the set of characters of Dµ. We can assume without any loss of generality that Dµ(G) is unitary. According to corollary 8.7, we have

Dµ∗(G) = D̄µ(G) ≡ D̃µ⁻¹(G) (8.60)

taking traces we find

χ∗µ(g) = χµ(g⁻¹) (8.61)

On the other hand, by virtue of the orthogonality and completeness theorems of the irreducible characters, and the fact that Dµ∗ is also irreducible, we can say that χµᵢ = χ∗µᵢ for each ζᵢ ⊆ G if and only if the representations Dµ and D∗µ are equivalent. Hence, we obtain

Theorem 8.9 Let Dµ (G) be an irreducible matrix representation of a finite (or compact Lie) group G in a finite-dimensional
vector space V . The set {χµi } is real if and only if Dµ (G) ≈ Dµ∗ (G). In particular, Dµ (G) ≈ Dµ∗ (G) if Dµ (G) is a real
representation.

Theorem 8.9 induces us to consider three cases

1. Dµ (G) is a real representation, so there exists a basis in V such that Dµ (G) acquires a real form and Dµ (g) = Dµ∗ (g) , for
all g ∈ G.
2. Dµ (G) ≈ Dµ∗ (G) but Dµ (G) is not a real representation.
3. Dµ (G) is not equivalent to Dµ∗ (G) such that {χµi } is complex.

Definition 8.7 Real representations (case 1) are also called integer representations. Representations in which Dµ ≈ Dµ∗
but Dµ is not a real representation (case 2) are called half-integer representations. These denominations were given by
Wigner.

Let us assume that Dµ ≈ Dµ∗ or, equivalently, that {χµᵢ} is real. Hence we are dealing with cases 1 and 2, and we have from Eq. (8.61)

χµ(g) = χµ(g⁻¹) (8.62)

further, Eq. (8.60) implies the existence of a similarity transformation through a non-singular matrix S such that

S Dµ(g) S⁻¹ = D̃µ⁻¹(g) = Dµ∗(g) ; ∀g ∈ G (8.63)

taking the inverse on both sides of the first equality we get

S Dµ⁻¹(g) S⁻¹ = D̃µ(g) ⇒ [S Dµ⁻¹(g) S⁻¹][S Dµ(g)] = D̃µ(g)[S Dµ(g)] ⇒

S = D̃µ(g) S Dµ(g) ; ∀g ∈ G (8.64)

since Dµ(G) is unitary, Dµ∗(G) is unitary as well, and from Eq. (8.63) S is also unitary. From Eq. (8.64) it is easy to check that the bilinear form⁹

x̃ S y (8.65)

is invariant under Dµ(G)

x̃′ S y′ = x̃ D̃µ(g) S Dµ(g) y = x̃ S y (since x′ = Dµ(g)x implies x̃′ = x̃ D̃µ(g))

conversely, if there exists a bilinear form x̃Sy invariant under Dµ(G) with S ≠ 0, we can reverse these steps and arrive at Eq. (8.64), which can also be written as

S Dµ(g) = D̃µ⁻¹(g) S ; ∀g ∈ G

applying Schur's lemma 1 and the fact that S ≠ 0, we conclude that S is an isomorphism from V onto itself (so that S is non-singular) and that Dµ(g) is equivalent to D̃µ⁻¹(g), and hence to Dµ∗(g) according to Eq. (8.60). We also conclude that in case 3, in which {χµᵢ} is complex, there is no bilinear form of the type in Eq. (8.65) invariant under Dµ(G) for any S ≠ 0.
Furthermore, if Dµ(G) ≈ D∗µ(G) (cases 1 and 2), Eq. (8.64) must hold. Taking the transpose and the inverse of this equation we find

S̃ = D̃µ(g) S̃ Dµ(g) ; S⁻¹ = Dµ⁻¹(g) S⁻¹ D̃µ⁻¹(g) ; ∀g ∈ G

multiplying these equations yields (no sum over µ)

S⁻¹ S̃ = Dµ⁻¹(g) [S⁻¹ S̃] Dµ(g) ⇔ Dµ(g) [S⁻¹ S̃] = [S⁻¹ S̃] Dµ(g) ; ∀g ∈ G

since Dµ(G) is irreducible, Schur's lemma 2 says that

S⁻¹ S̃ = cE ⇔ S̃ = cS (8.66)

taking the transpose of the last equation we have

S = cS̃ = c(cS) = c²S ⇒ c² = 1 , c = ±1

so that S must be either symmetric or antisymmetric

S̃ = S or S̃ = −S (8.67)

On the other hand, from Eqs. (3.29, 3.35) page 41 we have

det S̃ = det S ; det(−S) = (−1)ⁿ det S (8.68)
9 Note that this bilinear form is not in general an inner product in a complex vector space.

where n is the dimension of the representation. Observe that if {χµi } is real, the minus sign in Eq. (8.67) can only occur for
representations of even dimension. Otherwise, we would find from Eqs. (8.68) that det S = 0, contradicting the fact that S is
unitary when {χµi } is real.
If the matrix S is unitary and symmetric, it is well known that there exists another unitary symmetric matrix B such that

B² = S , B† = B⁻¹ , B̃ = B (8.69)

therefore B∗ = B̃∗ = B⁻¹, so that

B∗ = B⁻¹ ; B = B∗⁻¹ (8.70)

substituting (8.69) in Eq. (8.63) and using Eq. (8.70) we obtain

B² Dµ(g) B⁻² = Dµ∗(g) ⇒ B [B Dµ(g) B⁻¹] B⁻¹ = Dµ∗(g) ⇒ B Dµ(g) B⁻¹ = B⁻¹ Dµ∗(g) B ⇒

B Dµ(g) B⁻¹ = B∗ Dµ∗(g) B∗⁻¹ = [B Dµ(g) B⁻¹]∗ ; ∀g ∈ G

consequently, the unitary representation Dµ(g) is brought to a real form by a similarity transformation through the unitary symmetric matrix B. Since B and Dµ(g) are both unitary, the real form is also unitary; hence the real form consists of real orthogonal matrices

Dµ′(g) ≡ B Dµ(g) B⁻¹ , Dµ′(g) D̃µ′(g) = E , D′ᵢⱼ(g) ∈ ℝ ∀g ∈ G and ∀i, j = 1, . . . , nµ
thus, when S is symmetric in Eq. (8.67) we are led to a real representation (case 1). We shall state without proof that the antisymmetric case in Eq. (8.67) corresponds to case 2, in which Dµ(G) ≈ Dµ∗(G) but Dµ(G) is not a real representation. In particular, the discussion below Eqs. (8.68) shows that case 2 is NOT possible in vector spaces of odd dimension.
Now, in case 3, in which Dµ(G) is not equivalent to Dµ∗(G), there is no bilinear form x̃Sy invariant under Dµ(G) with S ≠ 0; hence S must be null. We can combine the results in the following theorem

Theorem 8.10 Let Dµ(G) be an irreducible unitary matrix representation of a finite (or compact Lie) group G in a finite-dimensional vector space V. If the set {χµᵢ} of characters in the µ representation is real (cases 1 and 2), there exists a bilinear form x̃Sy invariant under Dµ(G) whose matrix S is unitary and either symmetric or antisymmetric

S̃ = S or S̃ = −S (8.71)

The symmetric case corresponds to real (or integer) representations (case 1), while the antisymmetric case corresponds to half-integer representations (case 2). Half-integer representations cannot be constructed in odd-dimensional vector spaces. Further, if the set {χµᵢ} is complex (case 3), there is no bilinear form x̃Sy invariant under Dµ(G) with S ≠ 0, so that S must be null. We summarize these results in the form

S = cS̃ , c ≡ { +1 for integer representations (case 1) ; −1 for half-integer representations (case 2) ; 0 for complex representations (case 3) } (8.72)

Criterion to distinguish the three cases through the set of characters


Let Dµ(G) be an irreducible representation of a finite group G. The matrix constructed as

S = Σ_{g∈G} D̃µ(g) X Dµ(g) (8.73)

where X is an arbitrary nµ × nµ matrix, gives us a bilinear form x̃Sy invariant under Dµ(G), as can be seen from

x̃′ S y′ = x̃ D̃µ(g′) [Σ_{g∈G} D̃µ(g) X Dµ(g)] Dµ(g′) y = x̃ [Σ_{g∈G} D̃µ(g′) D̃µ(g) X Dµ(g) Dµ(g′)] y
        = x̃ [Σ_{g∈G} D̃µ(gg′) X Dµ(gg′)] y = x̃ [Σ_{g∈G} D̃µ(g) X Dµ(g)] y = x̃ S y

where we have used the fact that D̃µ(g′) D̃µ(g) = D̃µ(gg′) and the rearrangement lemma. In case 3 we must have S = 0, and using this in Eq. (8.73) we find

Sⁱⱼ = Σ_{g∈G} D̃µ(g)ⁱₘ Xᵐₙ Dµ(g)ⁿⱼ = 0

as in section 7.8.1 we shall take X as one element of the set of nµ² matrices Xₛₖ (k, s = 1, . . . , nµ) with matrix elements (Xₛₖ)ᵐₙ = δᵏₙ δᵐₛ. Therefore in case 3 we have

0 = Σ_{g∈G} D̃µ(g)ⁱₘ (Xₛₖ)ᵐₙ Dµ(g)ⁿⱼ = Σ_{g∈G} D̃µ(g)ⁱₘ δᵏₙ δᵐₛ Dµ(g)ⁿⱼ = Σ_{g∈G} D̃µ(g)ⁱₛ Dµ(g)ᵏⱼ

⇒ Σ_{g∈G} Dµ(g)ˢᵢ Dµ(g)ᵏⱼ = 0 for all s, i, k, j = 1, . . . , nµ (8.74)

setting k = i and summing over k we have

0 = Σ_{g∈G} Σₖ Dµ(g)ˢₖ Dµ(g)ᵏⱼ = Σ_{g∈G} [Dµ(g) Dµ(g)]ˢⱼ = Σ_{g∈G} Dµ(g²)ˢⱼ for all s, j = 1, . . . , nµ

therefore in case 3 we obtain

Σ_{g∈G} Dµ(g²) = 0 , when Dµ(G) is not equivalent to D∗µ(G) (8.75)

taking the diagonal terms in Eq. (8.75) and summing over all of them, we have in particular

Σ_{g∈G} χµ(g²) = 0 , when Dµ(G) is not equivalent to D∗µ(G) (case 3) (8.76)

Returning to the general case, we can combine Eqs. (8.73, 8.72), i.e. S = cS̃, to find (since the transpose of D̃µ(g) X Dµ(g) is D̃µ(g) X̃ Dµ(g))

Σ_{g∈G} D̃µ(g) X Dµ(g) = c Σ_{g∈G} D̃µ(g) X̃ Dµ(g)

⇒ Σ_{g∈G} D̃µ(g)ⁱₘ Xᵐₙ Dµ(g)ⁿⱼ = c Σ_{g∈G} D̃µ(g)ⁱₘ X̃ᵐₙ Dµ(g)ⁿⱼ

⇒ Σ_{g∈G} Dµ(g)ᵐᵢ Xᵐₙ Dµ(g)ⁿⱼ = c Σ_{g∈G} Dµ(g)ᵐᵢ Xⁿₘ Dµ(g)ⁿⱼ

and using once again Xᵐₙ ≡ (Xₛₖ)ᵐₙ = δᵏₙ δᵐₛ we get

Σ_{g∈G} Dµ(g)ᵐᵢ δᵏₙ δᵐₛ Dµ(g)ⁿⱼ = c Σ_{g∈G} Dµ(g)ᵐᵢ δᵏₘ δⁿₛ Dµ(g)ⁿⱼ ⇒

Σ_{g∈G} Dµ(g)ˢᵢ Dµ(g)ᵏⱼ = c Σ_{g∈G} Dµ(g)ᵏᵢ Dµ(g)ˢⱼ

now setting i = k and s = j and summing over k and j we have

Σ_{g∈G} Σⱼ Σₖ Dµ(g)ʲₖ Dµ(g)ᵏⱼ = c Σ_{g∈G} Σⱼ Σₖ Dµ(g)ᵏₖ Dµ(g)ʲⱼ

Σ_{g∈G} Σⱼ [Dµ(g) Dµ(g)]ʲⱼ = c Σ_{g∈G} [Σₖ Dµ(g)ᵏₖ][Σⱼ Dµ(g)ʲⱼ]

Σ_{g∈G} Σⱼ Dµ(g²)ʲⱼ = c Σ_{g∈G} χµ(g) χµ(g)

we then obtain

Σ_{g∈G} χµ(g²) = c Σ_{g∈G} χµ(g) χµ(g) (8.77)

In cases 1 and 2 the set of characters is real; using this fact in the orthogonality relation Eq. (7.46) page 148 with µ = ν we obtain

Σ_{i=1}^{n_c} nᵢ χµᵢ χµᵢ = n_G = Σ_{g∈G} χµ(g) χµ(g) for cases 1 and 2 (8.78)


returning to the general case, since the representation is unitary we have (χµ)∗ = χ̄µ ≡ χν, where ν corresponds to the complex-conjugate representation obtained from µ; that is, χµᵢ = (χνᵢ)∗. Therefore

Σ_{i=1}^{n_c} nᵢ χµᵢ χµᵢ = Σ_{i=1}^{n_c} nᵢ χµᵢ (χνᵢ)∗ = n_G δµν (8.79)

where we have used the orthogonality relation Eq. (7.46). In case 3, D̄ is not equivalent to D, so that µ ≠ ν; then the orthogonality relation yields¹⁰

Σ_{i=1}^{n_c} nᵢ χµᵢ χµᵢ = 0 = Σ_{g∈G} χµ(g) χµ(g) for case 3 (8.80)

Substituting Eq. (8.78) in Eq. (8.77) we get

Σ_{g∈G} χµ(g²) = c n_G for cases 1 and 2 (8.81)

combining Eqs. (8.76, 8.77) we see that c = 0 for case 3. This result and Eqs. (8.72, 8.81) can be summarized in the following theorem
Theorem 8.11 Let Dµ(G) be an irreducible matrix representation of a finite group G. The set of characters of the µ representation has the property

Σ_{g∈G} χµ(g²) = cµ n_G , where cµ = { +1 for case 1 (Dµ a real representation) ; −1 for case 2 (Dµ ≈ D∗µ but not a real representation) ; 0 for case 3 (Dµ not equivalent to D∗µ) } (8.82)

further, we can find a bilinear form x̃Sy invariant under Dµ(G) in which S = cµ S̃ in each case. If the representation is integer or half-integer (cases 1 and 2), S is a unitary matrix. For case 3, the only bilinear form x̃Sy invariant under Dµ(G) is the one with S = 0.
Equation (8.82) provides a simple criterion to distinguish the three types of representations.
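As a small illustration of how the criterion (8.82) is used in practice, the following sketch evaluates (1/n_G) Σ_g χµ(g²) for the group S3. The permutation model of S3 and its standard two-dimensional irreducible character listed below are assumptions made for the example, not data quoted from the text.

```python
# Sketch of the criterion (8.82) for S3 and its 2-dimensional irrep.
# Assumptions (standard, illustrative): S3 as permutations of {0,1,2}; the 2-dim
# irreducible character chi(e)=2, chi(transposition)=0, chi(3-cycle)=-1.
from itertools import permutations

elements = list(permutations(range(3)))                     # the 6 elements of S3
compose  = lambda a, b: tuple(a[b[i]] for i in range(3))    # (a*b)(i) = a(b(i))

def cycle_type(p):
    seen, lengths = set(), []
    for i in range(3):
        if i not in seen:
            j, n = i, 0
            while j not in seen:
                seen.add(j); j = p[j]; n += 1
            lengths.append(n)
    return tuple(sorted(lengths, reverse=True))

chi = {(1, 1, 1): 2, (2, 1): 0, (3,): -1}                   # 2-dim irreducible character

n_G  = len(elements)
c_mu = sum(chi[cycle_type(compose(g, g))] for g in elements) / n_G
print(c_mu)   # +1 -> case 1: a real (integer) representation
```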

8.4 Square roots of group elements (optional)


Definition 8.8 Let G be a group and g, a ∈ G. If g² = a we say that g is a "square root" of a, and we denote by η(a) the number of solutions g ∈ G of the equation g² = a.

Theorem 8.12 Let G be a group and a, b ∈ G. We have η(a) = η(b) if a and b belong to the same conjugacy class. Thus the number of square roots of a given element can be characterized by the class label, η(a) ≡ ηᵢ with a ∈ ζᵢ. Additionally we have

ηᵢ = ηᵢ′ (8.83)

where ζᵢ′ is the class consisting of the inverse elements of ζᵢ.

Proof: We can see that g is a square root of a if and only if cgc⁻¹ is a square root of cac⁻¹ for each c ∈ G; this can be seen as follows

g² = a ⇒ (cgc⁻¹)² = (cgc⁻¹)(cgc⁻¹) = cg²c⁻¹ = cac⁻¹
(cgc⁻¹)² = cac⁻¹ ⇒ cg²c⁻¹ = cac⁻¹ ⇒ g² = a

Similarly, since (g⁻¹)² = (g²)⁻¹, it is straightforward that g is a square root of a if and only if g⁻¹ is a square root of a⁻¹. Let {gᵢ} be the set of all square roots of a. It is easy to see that the mappings gᵢ → cgᵢc⁻¹ and gᵢ → gᵢ⁻¹ are one-to-one and onto. QED.
Now, for a finite group G, theorem 8.12 says that the LHS of Eq. (8.82) can be rewritten in terms of η(a) as

Σ_{g∈G} χµ(g²) = Σ_{a∈G} η(a) χµ(a) = Σ_{i=1}^{n_c} nᵢ ηᵢ χµᵢ ⇒

Σ_{i=1}^{n_c} nᵢ ηᵢ χµᵢ = cµ n_G
10 For cases 1 and 2 we have µ = ν in Eq. (8.79), from which Eq. (8.78) is reproduced.

multiplying by χ†ʲµ on both sides, summing over µ, and using the completeness relation for characters Eq. (7.47) page 148, we find

Σ_{i=1}^{n_c} (nᵢ/n_G) ηᵢ Σ_{µ=1}^{n_c} χµᵢ χ†ʲµ = Σ_{µ=1}^{n_c} cµ χ†ʲµ ⇒

Σ_{i=1}^{n_c} ηᵢ δᵢʲ = Σ_{µ=1}^{n_c} cµ χ†ʲµ (8.84)

so that we obtain a very important theorem

Theorem 8.13 Let G be a finite group. The number of square roots of a given element a belonging to the class ζⱼ is given by

ηⱼ = Σ_{µ=1}^{n_c} cµ χʲµ (8.85)

Proof: From Eq. (8.84) we only have to take into account that if {χµ} is real, then Eq. (8.85) follows immediately. If {χµ} is complex, theorem 8.11 says that cµ = 0, so that Eq. (8.85) also follows. QED.

Corollary 8.14 The number of square roots of the identity reads

η(E) = Σ_{µ=1}^{n_c} cµ nµ

in other words, the number of square roots of the identity is obtained by summing the dimensions of all integer representations and subtracting the dimensions of all half-integer representations.
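Corollary 8.14 is easy to verify by brute force for a small group. The sketch below assumes the permutation model of S3 and the standard fact that its three irreducible representations (dimensions 1, 1, 2) are all real, i.e. cµ = +1 for each; these inputs are illustrative, not taken from the text.

```python
# Sketch checking corollary 8.14 for S3: eta(E) = sum over mu of c_mu * n_mu.
from itertools import permutations

elements = list(permutations(range(3)))
compose  = lambda a, b: tuple(a[b[i]] for i in range(3))
identity = (0, 1, 2)

eta_E = sum(1 for g in elements if compose(g, g) == identity)   # direct count of g with g^2 = e
print(eta_E)                                                    # 4 (identity + three transpositions)
print(sum(c * n for c, n in [(1, 1), (1, 1), (1, 2)]))          # sum of c_mu * n_mu = 4
```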

8.4.1 Other square roots of group elements


By using the rearrangement lemma we can see that the matrix

Σ_{b∈G} Dµ(b²)

commutes with all the matrices of the irreducible representation Dµ(g)

Dµ(g′⁻¹) [Σ_{b∈G} Dµ(b²)] Dµ(g′) = Σ_{b∈G} Dµ(g′⁻¹) Dµ(b²) Dµ(g′) = Σ_{b∈G} Dµ(g′⁻¹ b² g′) = Σ_{b∈G} Dµ(b²)

consequently, Schur's lemma 2 says that such a matrix is proportional to the identity

Σ_{b∈G} Dµ(b²) = λE (8.86)

to evaluate λ, we take traces on both sides of Eq. (8.86) and use Eq. (8.82)

cµ n_G = λ nµ

replacing this λ in Eq. (8.86) we find

Σ_{b∈G} Dµ(b²) = (cµ n_G / nµ) E (8.87)

multiplying both sides of Eq. (8.87) by Dµ(g) we find

Σ_{b∈G} Dµ(gb²) = (cµ n_G / nµ) Dµ(g) (8.88)

and taking the trace

Σ_{b∈G} χµ(gb²) = (cµ n_G / nµ) χµ(g) (8.89)

redefining g ≡ a² and summing over a we have

Σ_{b∈G} Σ_{a∈G} χµ(a²b²) = (cµ n_G / nµ) Σ_{a∈G} χµ(a²) (8.90)

substituting Eq. (8.82) in Eq. (8.90) we obtain

Σ_{b∈G} Σ_{a∈G} χµ(a²b²) = (cµ n_G)² / nµ (8.91)

Now, let us define ξ(c) as the number of solutions (a, b) of the equation

a²b² = c

in terms of ξ(c), Eq. (8.91) becomes

Σ_{c∈G} ξ(c) χµ(c) = (cµ n_G)² / nµ

as before, it can be shown that ξ(c) depends only on the conjugacy class to which c belongs. Using this fact and the completeness relation for irreducible characters we obtain

ξ(c) = Σ_{µ=1}^{n_c} [(cµ n_G)² / (n_G · nµ)] χµ(c)

in particular, for c = E we find

ξ(E) = n_G Σ_{µ=1}^{n_c} [cµ]²

in words, the number of solutions of a²b² = E (or a² = b⁻²) is n_G times the number of irreducible representations with real characters.
This process can be generalized by multiplying together m equations of the type (8.87), written for the summation variables g₁, g₂, . . . , gₘ, obtaining

Σ_{g₁∈G} Σ_{g₂∈G} · · · Σ_{gₘ∈G} D⁽µ⁾(g₁² g₂² · · · gₘ²) = (cµ n_G / nµ)ᵐ E (8.92)

and taking the trace we have

Σ_{g₁∈G} Σ_{g₂∈G} · · · Σ_{gₘ∈G} χ⁽µ⁾(g₁² g₂² · · · gₘ²) = (cµ n_G / nµ)ᵐ nµ (8.93)

and by the same procedure, we find that the number of solutions ξₘ(E) of the equation

g₁² g₂² · · · gₘ² = E (8.94)

is given by

ξₘ(E) = (n_G)ᵐ⁻¹ Σ_{µ=1}^{n_c} (cµ)ᵐ / (nµ)ᵐ⁻² (8.95)

8.4.2 Square roots and ambivalent classes


Theorem 8.15 The sum of the squares of the numbers of square roots of all the elements of a finite group G is equal to the order n_G times the number of real irreducible characters, that is

Σ_{g∈G} [η(g)]² = Σ_{i=1}^{n_c} nᵢ ηᵢ² = n_G Σ_{µ=1}^{n_c} [cµ]² (8.96)

Proof: Evaluating Eq. (8.85) for a given element g and for g⁻¹, we have

η(g) = Σ_{µ=1}^{n_c} cµ χµ(g) ; η(g⁻¹) = Σ_{ν=1}^{n_c} cν χν(g⁻¹)

multiplying both equations, taking into account that η(g⁻¹) = η(g), and summing over g, we have

Σ_{g∈G} [η(g)]² = Σ_{µ=1}^{n_c} Σ_{ν=1}^{n_c} cµ cν Σ_{g∈G} χµ(g) χν(g⁻¹) (8.97)


we can assume a unitary representation, for which Dν(g⁻¹) = Dν⁻¹(g) = Dν†(g) = D̃ν∗(g). Therefore

χν(g⁻¹) = χν∗(g) ⇔ χνᵢ′ = χ†ᵢν (8.98)

where ζᵢ′ is the class of inverse elements of ζᵢ. Substituting Eq. (8.98) in Eq. (8.97) we obtain

Σ_{g∈G} [η(g)]² = Σ_{µ=1}^{n_c} Σ_{ν=1}^{n_c} cµ cν Σ_{g∈G} χµ(g) χ∗ν(g) = Σ_{µ=1}^{n_c} Σ_{ν=1}^{n_c} cµ cν Σ_{i=1}^{n_c} nᵢ χµᵢ χ†ᵢν
                = n_G Σ_{µ=1}^{n_c} Σ_{ν=1}^{n_c} cµ cν δµν = n_G Σ_{µ=1}^{n_c} [cµ]²

where we have used the orthonormality theorem for irreducible characters. QED.
On the other hand, using Eq. (8.98) we can write the completeness relation for irreducible characters, Eq. (7.47), in terms of the conjugacy classes ζᵢ and ζⱼ′:

(nᵢ/n_G) Σ_µ χµᵢ χ†ʲ′µ = δⁱʲ′ ⇒ (nᵢ/n_G) Σ_µ χµᵢ (χµⱼ′)∗ = δⁱʲ′ ⇒

(nᵢ/n_G) Σ_µ χµᵢ χʲµ = δⁱʲ′

setting i = j and summing over i we obtain

(1/n_G) Σ_{µ=1}^{n_c} Σ_{i=1}^{n_c} nᵢ χµᵢ χⁱµ = Σ_{i=1}^{n_c} δⁱⁱ′ (8.99)

combining Eqs. (8.78, 8.80) and using the definition of cµ in Eq. (8.82) we obtain

Σ_{g∈G} χµ(g) χµ(g) = [cµ]² n_G (8.100)

replacing Eq. (8.100) in Eq. (8.99) we find

Σ_{µ=1}^{n_c} [cµ]² = Σ_{i=1}^{n_c} δⁱⁱ′ (8.101)

Theorem 8.16 Let G be a finite group of order n_G. The number of inequivalent irreducible representations with real characters¹¹ is equal to the number of ambivalent conjugacy classes. The sum of the squares of the numbers of square roots of all the elements of G is equal to n_G multiplied by the number of ambivalent conjugacy classes of G, that is

Σ_{g∈G} [η(g)]² = Σ_{i=1}^{n_c} nᵢ ηᵢ² = n_G Σ_{i=1}^{n_c} δⁱⁱ′ (8.102)

Proof : The LHS of Eq. (8.101) gives the number of inequivalent irreducible representations with real characters, while the
RHS of such an equation provides the number of ambivalent conjugacy classes. The second statement of the theorem follows
by combining Eqs. (8.96, 8.101) from which we obtain Eq. (8.102). QED.

Corollary 8.17 Let G be a finite group. If all its conjugacy classes are ambivalent, all inequivalent irreducible representations
have real characters. In other words, Dµ (G) is equivalent to D∗µ (G) for all µ. This is the case for each symmetric group Sn ,
because all conjugacy classes of any Sn group are ambivalent (see page 119).

11 Or equivalently, the number of inequivalent irreducible representations for which D is equivalent to D ∗ .
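Theorem 8.16 can also be checked directly for a small symmetric group. The sketch below assumes the permutation model of S3 and the standard fact that all three of its conjugacy classes are ambivalent; both inputs are illustrative assumptions, not quotations from the text.

```python
# Sketch verifying Eq. (8.102) for S3: sum of eta(g)^2 = n_G * (number of ambivalent classes).
from itertools import permutations

elements = list(permutations(range(3)))
compose  = lambda a, b: tuple(a[b[i]] for i in range(3))

eta = {a: sum(1 for g in elements if compose(g, g) == a) for a in elements}
lhs = sum(v ** 2 for v in eta.values())        # sum over g of eta(g)^2
print(lhs)                                     # 18
print(len(elements) * 3)                       # n_G times the 3 ambivalent classes = 18
```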


Chapter 9

Irreducible basis vectors and operators

When a physical system possesses a symmetry, the solutions of the classical equation or the state vectors in quantum mechanics
can be classified according with the irreducible representations of the symmetry group. This basic principle is frequently used
in Physics. For example, the spherical harmonics appear often as part of the solution of either classical or quantum problems,
it is because the spherical harmonics reflect the underlying spherical symmetry of the physical systems under study. We shall
see that the concept of irreducible basis vectors will be useful in obtaining the decomposition of arbitrary vectors into their
irreducible components.
In a similar way, the physical observables such as position, momentum, electromagnetic fields etc. can be classified according
with the irreducible representations of the underlying group of symmetries. Under symmetry transformations, observables
transform in a definite way specified through group representation theory, just as do state vectors. For instance, components
of position and momentum transform as vectors under rotations, while components of Tµν (energy-momentum) and Fµν
(electromagnetic field) transform as second rank tensors under the homogeneous Lorentz transformations. The concept of
irreducible operators will also permit a decomposition of arbitrary operators into their irreducible components.
Finally, the combination of reductions of state vectors and operators will permit us to extract information about the state of
the system and measurable quantities contained in the underlying symmetry. In this chapter we develop the general formalism
that will be applied later to specific groups.

9.1 Irreducible basis vectors


Let U (G) be a representation on an inner product vector space V . We shall assume that U (G) is a unitary representation
unless otherwise is specified. Let Vµ ⊆ V be an irreducible (minimal) invariant subspace with respect to U (G) and let
{eµi : i = 1, . . . , nµ } be a basis of Vµ . We define

Definition 9.1 Any set of linearly independent vectors {eµᵢ : i = 1, . . . , nµ} in V that spans an irreducible invariant subspace Vµ ⊆ V associated with an irreducible representation µ is called an irreducible set of basis vectors transforming according to the µ-representation if they transform under U(G) as

U(g)|eµᵢ⟩ = |eµⱼ⟩ Dµ(g)ʲᵢ ; ∀g ∈ G (9.1)

with Dµ(G) an irreducible matrix representation of G.

In other words, the matrix representation of U (g) in Vµ under the basis {|eµi i} is given by Dµ (g). Note that once the
matrix Dµ (g) is fully determined, the basis {eµi } is ordered in a strict way (or viceversa), since Eq. (9.1) shows that |eµi i
transforms according with the elements of the i − th column. The orthonormality and completeness relations satisfied by the
matrices associated with irreducible representations will permit to develop many results for the irreducible basis vectors, and
to obtain the decomposition of arbitrary vectors into their irreducible components

Theorem 9.1 Let U(G) be a representation of a finite group G in a vector space V. Let

{uµᵢ : i = 1, . . . , nµ} and {vνⱼ : j = 1, . . . , nν}

be two sets of orthonormal irreducible basis vectors that span Vµ ⊆ V and Vν ⊆ V respectively. If the two irreducible representations µ and ν are inequivalent, the two minimal invariant subspaces spanned by these bases are orthogonal to each other. If the irreducible representations are equivalent there are two possibilities: (i) if the minimal subspaces spanned by these bases are disjoint (only the zero element in common), then ⟨vʲ|uᵢ⟩ = 0; and (ii) if the intersection of the minimal subspaces contains a non-zero vector, both subspaces coincide and the two bases are related by a unitary transformation |uᵢ⟩ = |vⱼ⟩ Sʲᵢ.


Proof: Since the representation is unitary we have

U†(g) U(g) = E ⇒ (1/n_G) Σ_{g∈G} U†(g) U(g) = E (9.2)

Using Eqs. (9.1, 9.2) we have

⟨vνʲ|uµᵢ⟩ = ⟨vνʲ| [(1/n_G) Σ_{g∈G} U†(g) U(g)] |uµᵢ⟩ = (1/n_G) Σ_{g∈G} [U(g)|vνʲ⟩]† [U(g)|uµᵢ⟩]

⟨vνʲ|uµᵢ⟩ = [(1/n_G) Σ_{g∈G} [Dν†(g)]ʲₖ [Dµ(g)]ˡᵢ] ⟨vνᵏ|uµₗ⟩

and applying theorem 7.8, Eq. (7.35) (valid in general for finite groups), we have

⟨vνʲ|uµᵢ⟩ = (δµν δʲᵢ δˡₖ / nµ) ⟨vνᵏ|uµₗ⟩ = δµν δʲᵢ ⟨vνᵏ|uµₖ⟩ / nµ (9.3)

if the representations are inequivalent, then µ ≠ ν and every vector of one basis is orthogonal to all vectors of the other. Therefore, all vectors in one invariant subspace (say Vµ) are orthogonal to all vectors of the other (say Vν).
On the other hand, if µ = ν, Eq. (9.3) simplifies to

⟨vµʲ|uµᵢ⟩ = δʲᵢ ⟨vµᵏ|uµₖ⟩ / nµ (no sum over µ) (9.4)

to understand the δʲᵢ factor, we should keep in mind that once the matrix representation Dµ(g) is determined, the bases {uµᵢ} and {vµⱼ} are ordered such that

U(g)|uµᵢ⟩ = |uµₖ⟩ Dµ(g)ᵏᵢ ; U(g)|vµⱼ⟩ = |vµₖ⟩ Dµ(g)ᵏⱼ

so they are ordered according to the columns of the matrix. Therefore, when µ = ν we have two cases:
(i) If the subspaces spanned by both bases have only the zero element in common, then the sets {|vⱼ⟩} and {|uᵢ⟩} correspond to different invariant subspaces Vµ⁽α⁾ and Vµ⁽β⁾ associated with the same representation. If one of the inner products in Eq. (9.4) were different from zero, the representation would not be fully reducible; since we have assumed that our representation is unitary, this would contradict theorem 7.5. Therefore, Eq. (9.4) becomes ⟨vʲ|uᵢ⟩ = 0. In this case it is usual that the two subspaces be distinguished by the eigenvalues of some operator outside the set U(G).
(ii) Full reducibility means that the different invariant subspaces Vµα must have only the zero element in common¹. Therefore, if the intersection of the irreducible invariant subspaces spanned by both bases contains a non-zero vector |x⟩, both irreducible invariant subspaces must coincide. Thus, both sets {|vⱼ⟩} and {|uᵢ⟩} are orthonormal bases of the same invariant subspace, connected by a unitary transformation. QED.
Note that theorem 9.1 can be extended to infinite groups, as long as the orthonormality and completeness theorems given by Eqs. (7.35, 7.43) can be extended accordingly. Theorem 9.1 can be regarded as a generalized version of the theorem which says that the eigenvectors of a normal operator corresponding to different eigenvalues are orthogonal to each other.

Example 9.1 Consider the Hilbert space of the states associated with an electron in a hydrogen atom. It is well known from
basic quantum mechanics that states corresponding to different angular momenta (i.e. different irreducible representations of
the rotation group) are orthogonal to each other, irrespective of the radial quantum number. States with the same angular
momenta are orthogonal if they correspond to different radial quantum numbers. Non-vanishing scalar products are obtained
only if both states possesses the same angular momenta and radial quantum numbers. This example also means that theorem
9.1 can be extended to infinite groups with infinite-dimensional representations for some important cases in Physics.

9.2 Reduction of vectors by projection operators


It is important to be able to decompose an arbitrary vector in terms of irreducible basis vectors, and to be able to transform
from an irreducible basis (with respect to a symmetry group or subgroup) to another irreducible basis. Given the symmetry
group and its irreducible inequivalent representations, the natural operators to achieve decompositions into the invariant
1 See theorems 2.1, 2.2, page 15.

subspaces are projectors onto these subspaces. If U (G) is a representation of a finite group G on V , then V is the direct sum
of irreducible invariant subspaces in the form
V = ⊕_{µ,α} Vµα ; µ = 1, . . . , n_c ; α = 0, 1, . . . , aµ (9.5)

where nc is the number of conjugacy classes of G and aµ the number of times in which the irreducible representation µ appears
in the decomposition of U (G). We include the possibility of α = 0 because some irreducible representations µ could be absent
in the decomposition of U (G). A complete set of basis vectors associated with this decomposition can be obtained by using
sets of irreducible orthonormal basis in each Vµα . A set of irreducible basis vectors for a given Vµα will be denoted as {|α, µ, ii}
with i = 1, . . . , nµ and α, µ fixed. Of course, an orthonormal basis of the whole space is obtained with

{|α, µ, ii : i = 1, . . . , nµ ; µ = 1, . . . , nc ; α = 0, 1, . . . , aµ } (9.6)

i.e. when we run over all indices. We insist however, in the fact that some irreducible inequivalent representations µ of G
could not appear in the decomposition of D (G), we can express this fact by setting aµ = 0. When we concentrate on a given
irreducible invariant subspace we shall use the notation {eµi : i = 1, . . . , nµ } for its irreducible basis vectors.

Theorem 9.2 Let U(G) be a representation of the finite group G on V, and Dµ(G) an irreducible matrix representation of G. We define the operators

Pµᵢʲ ≡ (nµ/n_G) Σ_g [Dµ⁻¹(g)]ʲᵢ U(g) (9.7)

then, given an arbitrary vector |x⟩ ∈ V, the set of vectors

{Pµᵢʲ|x⟩} ≡ {Pµᵢʲ|x⟩ , i = 1, . . . , nµ} (9.8)

with fixed j and µ, transforms irreducibly according to the µ-representation (provided the vectors are not all null).
Proof: Let us apply the operators {U(g)} to the vectors Pµᵢʲ|x⟩

U(g) Pµᵢʲ|x⟩ = U(g) [(nµ/n_G) Σ_{g′} [Dµ⁻¹(g′)]ʲᵢ U(g′)|x⟩] = (nµ/n_G) Σ_{g′} U(g) U(g′)|x⟩ [Dµ⁻¹(g′)]ʲᵢ
             = (nµ/n_G) Σ_{g′} U(gg′)|x⟩ [Dµ⁻¹(gg⁻¹g′)]ʲᵢ = (nµ/n_G) Σ_{g′} U(g′)|x⟩ [Dµ⁻¹(g⁻¹g′)]ʲᵢ
             = (nµ/n_G) Σ_{g′} U(g′)|x⟩ [Dµ((g⁻¹g′)⁻¹)]ʲᵢ = (nµ/n_G) Σ_{g′} U(g′)|x⟩ [Dµ(g′⁻¹g)]ʲᵢ

where we have used the rearrangement lemma and the fact that the matrices Dµ(g) form a representation of G. Thus we have

U(g) Pµᵢʲ|x⟩ = (nµ/n_G) Σ_{g′} U(g′)|x⟩ [Dµ(g′⁻¹)]ʲₖ [Dµ(g)]ᵏᵢ = [(nµ/n_G) Σ_{g′} [Dµ⁻¹(g′)]ʲₖ U(g′)|x⟩] [Dµ(g)]ᵏᵢ

U(g) Pµᵢʲ|x⟩ = Pµₖʲ|x⟩ [Dµ(g)]ᵏᵢ (no sum over µ) (9.9)

thus U(g) acting on each element Pµᵢʲ|x⟩ of the set of vectors {Pµₖʲ|x⟩} defined in Eq. (9.8) gives a linear combination of the vectors in the set.
Now, we shall show that if at least one of the vectors in the set given by Eq. (9.8) is non-null, such a set is a basis for the irreducible invariant subspace Vµ associated with the irreducible µ-representation. If nµ = 1 this is trivial, so we assume nµ > 1. We proceed by contradiction, assuming that some of the vectors in the set given by Eq. (9.8) are either null or linearly dependent; we then pick the maximal set of non-null linearly independent vectors in such a set

{Pµᵢʲ|x⟩}_S ≡ {Pµᵢʲ|x⟩ , i = 1, . . . , m} ; 1 ≤ m < nµ

since at least one of these vectors is non-null, this set is non-empty. Now, the set of all linear combinations of {Pµᵢʲ|x⟩}_S generates the same subspace as {Pµᵢʲ|x⟩}, which is of dimension 1 ≤ m < nµ. Further, from Eq. (9.9) and the fact that m ≥ 1, it is clear that the set of all linear combinations of {Pµᵢʲ|x⟩} forms a non-trivial invariant subspace contained in Vµ. Hence, we have generated a non-trivial invariant subspace that is properly contained in Vµ, contradicting the irreducibility of the latter. Therefore, the set {Pµᵢʲ|x⟩} must be linearly independent and a basis for Vµ. QED.
This important theorem says that starting from any non-zero vector |x⟩ ∈ V we can generate an irreducible invariant subspace associated with the µ-representation, with the set in Eq. (9.8) as a basis. This basis is orthogonal but not normalized; it can be normalized if desired. The operators Pµᵢʲ are called generalized projection operators even though they are not projections in the strict sense. They are very important in our subsequent developments.
Note that the index j in Pµᵢʲ provides us with a sequence of nµ generalized projectors

Pµᵢ¹, Pµᵢ², . . . , Pµᵢⁿµ (9.10)

and each one of them can generate the invariant subspace from any |x⟩. Observe that once the matrix representation Dµ(g) is fixed, Eq. (9.7) defines the sequence (9.10) in a well-determined order; that is, Pµᵢʲ is defined through the sequence of nµ rows of the n_G matrices Dµ(g⁻¹).
In the rest of this section we assume that the representations are unitary.
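The construction of Eq. (9.7) is easy to carry out explicitly for a small group. The sketch below assumes, as an illustration, that U(G) is the two-dimensional rotation representation of C3 and that the one-dimensional irreducible characters are χₖ(gᵐ) = exp(2πikm/3); with one-dimensional irreps the generalized projectors reduce to the character projectors of Eq. (9.26) below.

```python
# Sketch of the (character) projectors of Eq. (9.7) for G = C3.
# Assumptions (illustrative): U(G) = 2D rotation rep; chi_k(g^m) = exp(2*pi*i*k*m/3).
import numpy as np

w = np.exp(2j * np.pi / 3)
R = lambda m: np.array([[np.cos(2*np.pi*m/3), -np.sin(2*np.pi*m/3)],
                        [np.sin(2*np.pi*m/3),  np.cos(2*np.pi*m/3)]])
U = [R(m) for m in range(3)]                                  # U(G) for G = C3

def P(k):
    # P_mu = (n_mu/n_G) * sum_g chi_mu(g^{-1}) U(g), here n_mu = 1, n_G = 3
    return sum(w ** (-k * m) * U[m] for m in range(3)) / 3

for k in range(3):
    Pk = P(k)
    assert np.allclose(Pk @ Pk, Pk)                           # idempotent (Eq. 9.15 with i = j)

print(np.allclose(P(1) + P(2), np.eye(2)))                    # True: only k = 1, 2 occur; P(0) = 0
```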
Theorem 9.3 Let U(G) be a representation of a finite group G in V. Let {eνₖ : k = 1, . . . , nν} be a set of irreducible basis vectors in V transforming under U(G) according to the ν-representation, and let Pµᵢʲ be the operators defined in Eq. (9.7). If the irreducible matrix representations Dµ(G) in Eq. (9.7) are unitary, we have

Pµᵢʲ|eνₖ⟩ = |eνᵢ⟩ δµν δʲₖ (9.11)
Proof: We have

Pµᵢʲ|eνₖ⟩ = (nµ/n_G) Σ_g [Dµ⁻¹(g)]ʲᵢ U(g)|eνₖ⟩ = (nµ/n_G) Σ_g U(g)|eνₖ⟩ [Dµ†(g)]ʲᵢ
          = (nµ/n_G) Σ_g |eνₗ⟩ [Dν(g)]ˡₖ [Dµ†(g)]ʲᵢ = |eνₗ⟩ (nµ/n_G) Σ_g [Dν(g)]ˡₖ [Dµ†(g)]ʲᵢ

where we have used the irreducibility of the set {eνₖ} and the fact that the representation is unitary. Now, using the orthonormality condition for irreducible matrix representations Eq. (7.35), we get

Pµᵢʲ|eνₖ⟩ = |eνₗ⟩ δµν δˡᵢ δʲₖ = |eνᵢ⟩ δµν δʲₖ
which gives Eq. (9.11). QED. We see then that Pµᵢʲ annihilates a vector |eνₖ⟩ of a given irreducible basis if the generalized projector corresponds to a representation different from the representation associated with the irreducible basis. In addition, such an annihilation also occurs for µ = ν if the position of the generalized projector in the sequence (9.10) does not coincide with the position of the vector in the irreducible basis set, i.e. if j ≠ k (as long as both orderings are induced by the same matrices Dµ(g)). Finally, if both the associated representations and the positions coincide, i.e. µ = ν and j = k, the generalized projector Pνᵢᵏ takes the basis vector |eνₖ⟩ and converts it into the vector |eνᵢ⟩ associated with the i-th position in the ordered irreducible set. Note that the i-th position is determined by the i-th column of the matrix representation [Dµ⁻¹(g)]ʲᵢ that defines Pνᵢᵏ.
Corollary 9.4

Pµᵢʲ Pνₖˡ = δµν δʲₖ Pµᵢˡ (9.12)

Proof: Theorem 9.2 says that the set

{Pνₖˡ|x⟩ ≡ |eνₖ⟩ , k = 1, . . . , nν} (9.13)

is an irreducible basis for any non-zero |x⟩ ∈ V, provided its elements are not all null. Thus, theorem 9.3 says that we can apply Pµᵢʲ to this irreducible set of basis vectors and use Eq. (9.11); hence

Pµᵢʲ [Pνₖˡ|x⟩] ≡ Pµᵢʲ|eνₖ⟩ = |eνᵢ⟩ δµν δʲₖ = δµν δʲₖ [Pνᵢˡ|x⟩]

Pµᵢʲ [Pνₖˡ|x⟩] = δµν δʲₖ Pµᵢˡ|x⟩ (9.14)

this is valid for arbitrary |x⟩², so Eq. (9.12) holds. QED.
Note that this property is quite similar to idempotence. However, it is not exactly idempotence, because for µ = ν, j = l and i = k, Eq. (9.12) gives

Pµᵢʲ Pµᵢʲ = δµµ δʲᵢ Pµᵢʲ (9.15)

and if i ≠ j these operators are not idempotent. So they are not true projections.
We observe that the number of operators of the type Pµᵢʲ is n_G. This can be seen by observing that i, j = 1, . . . , nµ, so that there are nµ² operators for a fixed µ; summing over all µ we obtain n_G generalized projectors, according to Eq. (7.42).
2 Even if |x⟩ = |0⟩, or if Pνₖˡ|x⟩ = |0⟩ for all elements in the set defined in Eq. (9.13), we get the null vector on both sides of Eq. (9.14).

Theorem 9.5 Let U(G) be a representation of a finite group G in V. The n_G operators U(g), g ∈ G, can be written as linear combinations of the n_G generalized projectors Pµᵢʲ, with µ = 1, . . . , n_c ; i, j = 1, . . . , nµ, in the form

U(g) = Σ_{µ=1}^{n_c} Σ_{i=1}^{nµ} Σ_{j=1}^{nµ} Pµᵢʲ Dµ(g)ⁱⱼ (9.16)

Note that this is the inverse of the defining equation (9.7) of generalized projectors. It can be proved by using the
orthonormality condition for the Dµ (g) matrices. Observe that this theorem resembles the spectral theorem in which a normal
operator can be decomposed in projections and the coefficients of the linear combinations are the eigenvalues of the operator
[see section 3.10, Eq. (3.60), page 47]. In words, this theorem says that all nG operators U (g) of the representation are linear
combinations of the nG generalized projectors3.
Theorem 9.6 Let U(G) be a representation of a finite group G in V, and let Pνₖˡ be the generalized projectors of U(G) associated with the ν-irreducible representation. The following identity holds

U(g) Pνₖˡ = Σᵢ Pνᵢˡ Dν(g)ⁱₖ (no sum over ν) (9.17)

Proof: Using Eqs. (9.16) and (9.12) we find

U(g) Pνₖˡ = [Σ_{µ,i,j} Pµᵢʲ Dµ(g)ⁱⱼ] Pνₖˡ = Σ_{µ,i,j} Dµ(g)ⁱⱼ Pµᵢʲ Pνₖˡ = Σ_{µ,i,j} Dµ(g)ⁱⱼ δµν δʲₖ Pµᵢˡ = Σᵢ Dν(g)ⁱₖ Pνᵢˡ

QED. Note that this is a restatement of theorem 9.2 in pure operator form, as can be seen by comparing Eqs. (9.9, 9.17). It says that Pνₖˡ composed with any operator U(g) of the representation is a linear combination of the generalized projectors.

9.2.1 Definition of true projections from generalized projections


From the generalized projections Pµᵢʲ we observe that the particular value of the superscript j is rather irrelevant for most of our purposes⁴. A second observation is that Eq. (9.15) shows that these operators are not true projectors, since they are not idempotent in general. Nevertheless, Eq. (9.15) shows that they become idempotent as long as i = j. From this discussion we can define true projections in the following way

Definition 9.2 The set of operators {Pµᵢ ≡ Pµᵢʲ⁼ⁱ : i = 1, . . . , nµ} are said to be the projection operators onto the basis vectors {eµᵢ : i = 1, . . . , nµ}. The operator Pµ ≡ Σᵢ Pµᵢ is called the projection operator onto the irreducible invariant subspace Vµ spanned by {eµᵢ}.

We can see that they define true projections. To see this we combine Eq. (9.12) with the definitions

Pµᵢ Pνₖ ≡ Pµᵢʲ⁼ⁱ Pνₖˡ⁼ᵏ = Pµᵢⁱ Pνₖᵏ = δµν δᵢᵏ Pµᵢᵏ = δµν δᵢᵏ Pµᵢⁱ (no sums)

Pµᵢ Pνₖ = δµν δᵢᵏ Pµᵢ (no sums) (9.18)

similarly

Pµ Pν = (Σᵢ Pµᵢ)(Σₖ Pνₖ) = Σ_{i,k} Pµᵢ Pνₖ = Σ_{i,k} δµν δᵢᵏ Pµᵢ = δµν Σᵢ Pµᵢ

Pµ Pν = δµν Pµ (9.19)

Eqs. (9.18, 9.19) show that the sets {Pµi } and {Pµ } define linear idempotent and pairwise orthogonal operators. Thus, they
are projectors in the sense of Hilbert spaces (see definition 2.30, page 32). We shall also see that they are complete

Theorem 9.7 Let U(G) be a representation of a finite group G in V and let Pµᵢ, Pµ be the operators described in definition 9.2. The sets {Pµᵢ} and {Pµ} are complete in V, in the sense that

Σ_{µ,i} Pµᵢ = Σ_µ Pµ = E (identity operator in V)

3 If {U(g)} is a homomorphic image of G but not isomorphic to it, some of the operators in (9.16) for different elements of G will coincide. If ℜe is the kernel induced by the homomorphism, then U(g) = U(g′) if and only if g and g′ belong to the same coset induced by ℜe.
4 An important exception is Eq. (9.16), which requires all values of j and of the other indices of Pµᵢʲ.

Proof: Let {eνₖ} be an irreducible set of basis vectors for a given irreducible invariant subspace Vν. From theorem 9.3, Eq. (9.11), we have

Pµᵢ|eνₖ⟩ ≡ Pµᵢⁱ|eνₖ⟩ = |eνᵢ⟩ δµν δᵢₖ (no sum) (9.20)

Pµ|eνₖ⟩ = Σᵢ Pµᵢ|eνₖ⟩ = Σᵢ |eνᵢ⟩ δµν δᵢₖ = |eνₖ⟩ δµν (9.21)

finally

Σ_µ Pµ|eνₖ⟩ = Σ_µ |eνₖ⟩ δµν = |eνₖ⟩

since this holds for all the irreducible basis vectors of all the irreducible invariant subspaces, we have Σ_µ Pµ = E, as long as the space V is fully reducible in the form of Eq. (9.5). Note that this is the case since we have assumed that the representations are unitary (see theorem 7.5). QED.
At this step, it is convenient to turn back to the notation given by Eq. (9.6)

{|α, µ, k⟩ : k = 1, . . . , nµ ; µ = 1, . . . , n_c ; α = 0, 1, . . . , aµ} (9.22)

in which an irreducible basis set associated with an invariant subspace Vµα is denoted by

{|α, µ, k⟩ : k = 1, . . . , nµ ; (α, µ) fixed}

The effects of the projections and generalized projections on the basis (9.22) of the whole space V are given by Eqs. (9.21, 9.20, 9.11), which can be rewritten as

Pµ|α, ν, k⟩ = |α, ν, k⟩ δνµ (9.23)

Pµᵢ|α, ν, k⟩ = |α, ν, k⟩ δνµ δᵢₖ (9.24)

Pµᵢʲ|α, ν, k⟩ = |α, ν, i⟩ δνµ δʲₖ (9.25)
Note that the action of these operators is not sensitive to the label α. This is logical since Vµ⁽α⁾ and Vµ⁽β⁾ for α ≠ β are essentially identical (though they are orthogonal) with respect to U(G). Although the generalized projections Pµᵢʲ are not projectors in the strict sense, they are the most powerful, since they can be used to construct irreducible sets of basis vectors for the corresponding invariant subspaces starting from an arbitrary non-zero vector |x⟩ ∈ V, as shown by theorem 9.2. Further, they can be used to decompose an arbitrary operator U(g) of the representation in terms of them, according to Eq. (9.16). The latter process cannot be done with the operators Pµᵢ and Pµ.

Example 9.2 Let U(G) be a representation of the finite group G on V, and Dµ(G) an irreducible one-dimensional matrix representation of G. In that case the irreducible matrix representations become c-numbers, and those numbers are the characters of the one-dimensional representation. Therefore Pµᵢʲ = Pµᵢ = Pµ. Consequently, for one-dimensional representations (nµ = 1) the three projections defined above coincide with each other and are given by

Pµᵢʲ = Pµᵢ = Pµ ≡ (1/n_G) Σ_g χµ⁻¹(g) U(g) (9.26)

note that for any one-dimensional representation Dµ(g) = χµ(g) must be non-zero for each g ∈ G; otherwise we could not represent the element g⁻¹.

Example 9.3 Let Vf be the space of square-integrable functions f(x) of the variable x in a given interval. Let G be the group {e, Is}, where Is x = −x. G is isomorphic with C2, so it has two one-dimensional irreducible representations. Using Eq. (9.26), the two generalized projection operators are

P1 ≡ (1/2) [χ1⁻¹(e) U(e) + χ1⁻¹(Is) U(Is)] ; P2 ≡ (1/2) [χ2⁻¹(e) U(e) + χ2⁻¹(Is) U(Is)]

and using the character table of C2 (see table 7.2, page 146), yields

P1 = [E + U(Is)]/2 ; P2 = [E − U(Is)]/2

The "parity" operator U(Is) acts on an element of Vf as follows (see example 8.7, page 164): U(Is) f(x) = f(−x). Hence

P1 f(x) = [f(x) + f(−x)]/2 ≡ f₊(x) ; P2 f(x) = [f(x) − f(−x)]/2 ≡ f₋(x)

we see that P1 + P2 = E, showing the completeness of the projectors. Evaluating the action of the operators U(G) on f±(x) we obtain

U(e) f±(x) = E f±(x) = f±(x)
U(Is) f±(x) = f±(−x) = [f(−x) ± f(−(−x))]/2 = [f(−x) ± f(x)]/2 = ±f±(x)

It is clear that f₊(x) spans an irreducible one-dimensional invariant subspace under U(G), as we saw in example 8.7. The same occurs for f₋(x). The function f₊(x) is even under parity, while f₋(x) is odd. This leads us to the well-known conclusion that for a system with space-inversion symmetry it is advantageous to use even functions f₊(x) and/or odd functions f₋(x) in the function space, because we know their transformation rules under the group (the parity group). This is a simple example showing that generalized projections generate minimal invariant vector subspaces. Less trivial examples will be shown in subsequent developments.

9.2.2 The reduction of direct product representations with the projection method
Let G be a symmetry group and U µ(G), U ν(G) two irreducible representations realized on the irreducible invariant vector spaces Vµ and Vν with bases {|eµᵢ⟩ : i = 1, 2, . . . , nµ} and {|eνₖ⟩ : k = 1, 2, . . . , nν} respectively. Now, let us consider the product representation U µ×ν(G) realized on the vector space Vµ ⊗ Vν. We are able to find the irreducible invariant subspaces of Vµ ⊗ Vν by using the projection operator method as follows: we start with the original basis vectors |k, l⟩ ≡ |eµₖ⟩ ⊗ |eνₗ⟩ and apply the projection operators to form the set

{Pλᵢʲ|k, l⟩ : i = 1, . . . , nλ ; (λ, j, k, l) fixed}

as long as the projection does not yield null vectors⁵, this set spans an irreducible invariant subspace Vλ of Vµ ⊗ Vν. Then, by selecting all the different sets of (λ, j, k, l) we can generate all the irreducible invariant subspaces. The transformation matrix between the original and the new basis gives the CG coefficients. Notice that the transformation matrices (and so the CG coefficients) are matrix representatives of the generalized projectors.
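A very small sketch of this procedure is given below; the choice of G = C3 with its one-dimensional irreps k = 1 and k = 2, and the use of the Kronecker product to build the matrices of the product representation, are illustrative assumptions, not prescriptions of the text.

```python
# Sketch: product representation D^{mu x nu}(g) = D^mu(g) (x) D^nu(g) and a projection.
# Assumptions (illustrative): G = C3 with 1-dim irreps chi_k(m) = exp(2*pi*i*k*m/3).
import numpy as np

w = np.exp(2j * np.pi / 3)
Dmu = [np.array([[w ** m]])       for m in range(3)]   # irrep k = 1 of C3
Dnu = [np.array([[w ** (2 * m)]]) for m in range(3)]   # irrep k = 2 of C3

Dprod = [np.kron(a, b) for a, b in zip(Dmu, Dnu)]       # D^{mu x nu}(g) on V_mu (x) V_nu

# project the product space onto the trivial (k = 0) representation
P0 = sum(Dprod[m] for m in range(3)) / 3
print(P0)    # equals 1 here: the product 1 (x) 2 reduces to the trivial irrep of C3
```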

9.3 Irreducible operators and the Wigner-Eckart theorem


Operators on a vector space V transform in a prescribed way under symmetry transformations. Therefore, like in the case
of vectors, operators are naturally classified by the irreducible representations of the symmetry group. The transformation
properties of vectors and operators lead to considerable simplifications in the structure of observables. The Wigner-Eckart
theorem is a powerful tool in this direction.
If an operator O acts on V, then for a given |x⟩ ∈ V we have

O|x⟩ = |y⟩ ; |x⟩, |y⟩ ∈ V

if we multiply on the left by another operator T we have

T O|x⟩ = T|y⟩ ⇒ T O T⁻¹ T|x⟩ = T|y⟩

O′|x′⟩ = |y′⟩ ; O′ ≡ T O T⁻¹ ; |x′⟩ ≡ T|x⟩ ; |y′⟩ ≡ T|y⟩

for instance, if T expresses a passive transformation such as a change of basis, the operator O′ ≡ T O T⁻¹ expresses the same intrinsic operator as O but written in another basis. If T is an active operator, O and O′ express different operators, but both are connected by a similarity transformation; so, defining similarity as an equivalence relation, we see that O and O′ are members of the same equivalence class.
In summary, the operator T induces a transformation of both the operators and the vectors through the following recipe

T : |x⟩ → T|x⟩ ; ∀|x⟩ ∈ V
T : O → T O T⁻¹ ; ∀O on V
In the previous sections, we worked with invariant sets of vectors, in the sense that under all the operators of a group
representations U (G) they all transform among themselves. More precisely, U (g) applied on any vector of the set is a linear
combination of vectors of the set.
The previous discussion, induces us to define an analogous concept for a set of operators. We then aim to define a type of
invariant set of operators {O1 , . . . , On } , in such a way that under symmetry operations they transform among themselves in
a way similar to Eq. (9.1). That is, any element U (g) of the group representation maps an operator in the set into a linear
combination of operators in the set in which the coefficients of the linear combination have to do with the irreducible elements
of an irreducible representation matrix. Then we define
5 For instance, if a given representation λ is not included in µ × ν, we expect that all vectors in the set {Pλᵢʲ|k, l⟩} vanish.

Definition 9.3 Let G be a group and U(G) a representation of G on the vector space V. Suppose we have a set of operators {Oµᵢ : i = 1, . . . , nµ} on V transforming under the representation U(G) of G as

U(g) Oµᵢ U(g)⁻¹ = Oµⱼ Dµ(g)ʲᵢ ; ∀g ∈ G (9.27)

with Dµ(G) an irreducible matrix representation of G. A set of operators with these properties is called a set of irreducible operators associated with the µ-representation. They are also called irreducible tensors.

We should emphasize, however, that the operators Oµᵢ are defined on V and not only on an invariant vector subspace Vµ. Let us take a set of irreducible vectors {eνⱼ} associated with the ν-representation and the invariant subspace Vν ⊆ V, and a set of irreducible operators {Oµᵢ} under the µ-representation. A natural question concerns the behavior of the nµ · nν vectors of the type Oµᵢ|eνⱼ⟩ under the group of transformations U(G)

U(g)[Oµᵢ|eνⱼ⟩] = [U(g) Oµᵢ U(g)⁻¹][U(g)|eνⱼ⟩] = Oµₖ Dµ(g)ᵏᵢ |eνₗ⟩ Dν(g)ˡⱼ

U(g)[Oµᵢ|eνⱼ⟩] = [Oµₖ|eνₗ⟩] Dµ(g)ᵏᵢ Dν(g)ˡⱼ (9.28)

where we have used Eqs. (9.1, 9.27). Comparing Eq. (9.28) with Eq. (8.4), we see that the set of nµ · nν vectors of the type Oµᵢ|eνⱼ⟩ transforms under U(G) according to the direct product representation Dµ×ν(G). In other words, these vectors can be seen as a "natural" basis for the tensor product Vµ ⊗ Vν

Oµᵢ|eνⱼ⟩ ≡ |(µν) ij⟩ = |uᵢ⟩ ⊗ |vⱼ⟩ ∈ Vµ ⊗ Vν (9.29)

notice, however, that Oµᵢ is an operator defined on V, hence the Oµᵢ|eνⱼ⟩ are vectors in V. Consequently, Vµ ⊗ Vν is an invariant subspace (but not necessarily minimal) of V with respect to U(G). We recall from the discussion in Sec. 8.1 that there are sets of basis vectors

{|w^λ_{αl}⟩ ; λ, α fixed , l = 1, . . . , nλ}

which span the minimal invariant subspaces Vλ⁽α⁾ of Vµ ⊗ Vν with respect to U(G)

Vµ ⊗ Vν = ⊕_{α,λ} Vλ⁽α⁾

and whose matrix representations exhibit the block-diagonal form. On the other hand, Eq. (9.29) shows the set of nµ · nν vectors of the form Oµᵢ|eνⱼ⟩ as basis vectors {|uᵢ⟩ ⊗ |vⱼ⟩} in Vµ ⊗ Vν ⊆ V, coming from products of the bases of each component vector space. According to Sec. 8.1, we can express this "decoupled" basis Oµᵢ|eνⱼ⟩ in terms of the "coupled" basis {|w^λ_{αl}⟩} by means of the Clebsch-Gordan coefficients

Oµᵢ|eνⱼ⟩ = Σ_{α,λ′,l} |w^{λ′}_{αl}⟩ ⟨α, λ′, l (µ, ν) i, j⟩ (9.30)

it is useful to determine the matrix elements ⟨e^k_λ|Oµᵢ|e^ν_j⟩ of the matrix representation of Oµᵢ in the irreducible basis {e^ν_j}. To do this we multiply Eq. (9.30) by ⟨e^k_λ|

⟨e^k_λ|Oµᵢ|e^ν_j⟩ = Σ_{α,λ′,l} ⟨e^k_λ|w^{λ′}_{αl}⟩ ⟨α, λ′, l (µ, ν) i, j⟩ (9.31)

according to theorem 9.1, Eq. (9.3), the inner product between two irreducible invariant bases gives

⟨e^k_λ|w^{λ′}_{αl}⟩ = (1/nλ) δ^{λ′}_λ δ^k_l Σₘ ⟨e^m_λ|w^λ_{αm}⟩ (9.32)

where we have written the sum over repeated indices explicitly. Substituting (9.32) in Eq. (9.31) we have

⟨e^k_λ|Oµᵢ|e^ν_j⟩ = Σ_{α,λ′,l} (1/nλ) δ^{λ′}_λ δ^k_l [Σₘ ⟨e^m_{λ′}|w^{λ′}_{αm}⟩] ⟨α, λ′, l (µ, ν) i, j⟩

⟨e^k_λ|Oµᵢ|e^ν_j⟩ = (1/nλ) Σ_α [Σₘ ⟨e^m_λ|w^λ_{αm}⟩] ⟨α, λ, k (µ, ν) i, j⟩

From this discussion we obtain one of the most useful theorems in group representation theory

Theorem 9.8 (Wigner-Eckart): Let U(G) be a representation of a group G in V. Let {Oµᵢ : i = 1, . . . , nµ} be a set of irreducible tensor operators on V associated with the µ-representation, and {e^λ_k : k = 1, . . . , nλ} a set of irreducible basis vectors associated with the irreducible invariant subspace Vλ ⊆ V. Then

⟨e^k_λ|Oµᵢ|e^ν_j⟩ = Σ_α ⟨α, λ, k (µ, ν) i, j⟩ ⟨λ|Oµ|ν⟩_α

⟨λ|Oµ|ν⟩_α ≡ (1/nλ) Σₘ ⟨e^m_λ|w^λ_{αm}⟩ (9.33)

where the term ⟨λ|Oµ|ν⟩_α is called the reduced matrix element, and the ⟨α, λ, l (µ, ν) i, j⟩ are the Clebsch-Gordan coefficients associated with the change of basis from the decoupled basis {Oµᵢ|e^ν_k⟩}, which is the natural basis {|uᵢ⟩ ⊗ |vⱼ⟩} associated with the invariant subspaces of Vµ ⊗ Vν, to the coupled basis {|w^λ_{αl}⟩}.


The importance of this theorem lies in the fact that the enormous quantity of matrix elements ⟨e^k_λ|Oµᵢ|e^ν_j⟩ can be separated into a factor that is purely group-theoretical in nature (the CG coefficients, which contain all the i, j, k dependence) and a reduced matrix element that contains all the specific properties of the vectors and operators. The CG coefficients are completely determined by group representation theory and can be looked up in published tables. Further, in many important applications, such as in the case of rotations in three dimensions, each irreducible representation λ occurs only once in the reduction of the direct product µ × ν. In that case α = 1 and there is only one reduced matrix element for each (µ, ν, λ). In addition, in many cases the regularities of the Wigner-Eckart theorem exhaust all the structure of the relevant matrix elements required by invariance under the symmetry group.
In quantum mechanics, as well as giving matrix representatives of the operators, the values of ⟨e^k_λ|Oµᵢ|e^ν_j⟩ are useful for calculating expectation values of observables, as well as transition amplitudes. Indeed, the Wigner-Eckart theorem allows us to understand many selection rules in atomic, molecular and nuclear Physics based on symmetry arguments. It is also the case in many problems that we can predict ratios among transition amplitudes on group-theoretical grounds. This is because in many cases the reduced matrix element cancels in the ratio, leaving us with the group-theoretical CG coefficients only.
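As a hedged illustration of this last remark for the rotation group, the sketch below takes the ratio of two matrix elements of a rank-1 tensor operator between angular-momentum states; by the Wigner-Eckart theorem the reduced matrix element cancels and only a ratio of Clebsch-Gordan coefficients survives. It assumes sympy's CG(j1, m1, j2, m2, j3, m3) class, which evaluates ⟨j1 m1; j2 m2|j3 m3⟩; the specific quantum numbers are chosen only for the example.

```python
# Sketch: ratios of matrix elements of a rank-1 tensor operator are fixed by CG coefficients.
# Assumption (illustrative): <j'=1, m'|T^1_q|j=1, m> is proportional to <1 m; 1 q | 1 m'>.
from sympy.physics.quantum.cg import CG

numerator   = CG(1, 0, 1, 1, 1, 1).doit()   # corresponds to m=0, q=1, m'=1
denominator = CG(1, 1, 1, 0, 1, 1).doit()   # corresponds to m=1, q=0, m'=1

print(numerator / denominator)   # the reduced matrix element cancels in this quotient
```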
Chapter 10

A brief introduction to algebraic systems

The concept of set is perhaps the most primitive in mathematics, understanding it as an aggregation of elements without any
internal structure or organization. As some structure is acquired by these sets, we form spaces. In Physics we usually work
with two types of mathematical structures: Topological and algebraic structures1 .
Topological structures concentrate mostly on the properties of some special subsets of the space (usually called open sets),
its operations as subsets (intersection, union and complement), as well as the properties of mappings from a topological space
into another. The concept of topological space is developed taking in mind a way to formulate the concept of continuity of
mappings in its purest form. If topological spaces are also metric spaces, a notion of distance is obtained, and the concept of
convergence of sequences becomes very valuable as well.
On the other hand, algebraic structures concentrate mostly on laws of combinations defined on the elements of the set.
Some laws of combinations are binary operations between a couple of elements of the set, that gives a third element. Some
other laws of combinations are defined between an element of the set and an element external to such a set, this is the case
of the scalar product in linear vector spaces. We then usually impose some properties (axioms) to the laws of combination
such as closure, associativity, commutativity, distributivity, the existence of a module, the existence of the inverse for a given
element etc.
In Mathematics and Physics there are four major algebraic systems useful in applications: Groups, linear or vector spaces,
rings, and algebras. As important as the abstract algebra of these algebraic systems (that is all properties extracted directly
from their axioms), is the theory of representations in which the algebraic system is mapped isomorphically into another
space in which the objects are either easier to characterized or more useful for applications. In vector spaces, the theory of
representations of finite dimensional vector spaces led to the coordinate representation of vectors through a given basis and
to matrix representations of operators. In groups, we are led to representations of the abstract group with a set of group
operators defined on a vector space, which in turn could provide a matrix representation for the elements of the group.
Our purpose in this chapter is to provide a brief treatment of the major algebraic systems, developing the properties
necessary for our subsequent work. Since groups and vector spaces have already been studied in considerable detail, we shall only make
brief comments on them.

10.1 Groups and vector spaces


As can be seen in chapter 6, the abstract group theory can be developed without any mention of vector spaces. Thus, if we
take the definition and axioms of a group as our starting point (see Sec. 6.1), we could define a vector space V as an abelian
group (the law of combination is denoted as x + y) whose elements are called vectors with the property that any scalar α and
any vector x ∈ V can be combined by the operation of scalar multiplication that gives another vector αx ∈ V such that

1. α (x + y) = αx + αy

2. (α + β) x = αx + βx

3. (αβ) x = α (βx)

4. 1 · x = x

Then, a vector space is an abelian additive group in which the elements of the group can be multiplied by scalars with
reasonable properties. The additive inverse (group inverse) of x is denoted as −x while the group identity is symbolized as
“0”.
1 Other important structures are “order structures” but we shall not discuss them here.


From the discussion above, all properties of groups are applicable to vector spaces2 . Vector subspaces are also subgroups
(but the opposite is not necessarily true!). Further, because of the abelianity, all vector subspaces are invariant subgroups,
hence we can define cosets and quotient groups from any vector subspace. Nevertheless, it is natural to ask whether these
quotient groups are also vector spaces (and not only groups). Let us then define cosets generated by a vector subspace (“vector
cosets”), and look for the structure of the quotient groups defined on them. We use a vector subspace M to introduce an
equivalence relation in the vector space V, in analogy with definition 6.21 and theorem 6.28 for group theory

Definition 10.1 Let V be a vector space and let M be a vector subspace in V . We say that an element x ∈ V is congruent
modulo M with another element y ∈ V , and denote it as x ∼ y, if x − y ∈ M .

Theorem 10.1 The relation defined in definition 10.1, is an equivalence relation between the elements of the vector space V .
Hence we can rewrite this relation as x ≡ y.

Proof : Since x − x = 0 ∈ M we see that x ∼ x (reflexivity). If x ∼ y i.e. x − y ∈ M then − (x − y) = y − x ∈ M because M


must contain the inverse of each of its elements, so y ∼ x (symmetry). If x − y ∈ M and y − z ∈ M then (x − y) + (y − z) ∈ M
then x − z ∈ M so that x ∼ z (transitivity). QED.
It is clear that this relation forms a partition in the vector space V , the partition sets can be defined as [x] where x ∈ V
and [x] is the set partition (called coset) generated by x. By definition it is the set of all elements y ∈ V such that y ≡ x
therefore

[x] ≡ {y : y ≡ x} = {y : y − x ∈ M } = {y : y − x = m for some m ∈ M } = {y : y = x + m for some m ∈ M }

it induces the following definition

Definition 10.2 Let M be a subspace of a linear space V . A coset of an element x in V is the set [x] = x + M ≡
{x + m : m ∈ M }

In particular, the null vector generates the coset consisting of the subspace M , since

[x = 0] = 0 + M = M

such that M ≡ [0]. Definition 10.2 is clearly the definition of a coset in abelian groups in which the law of combination is
written as “+” to emphasize its abelianity.

Theorem 10.2 Let M be a subspace of a linear space V . All the distinct cosets of V induced by M form a partition of V (i.e.
they are non-empty, disjoint and their union is the whole space V ) and if addition and scalar multiplication are defined as

(x + M ) + (y + M ) = (x + y) + M (10.1)
α (x + M ) = αx + M (10.2)

or in other notation
[x] + [y] = [x + y] ; α [x] = [αx] (10.3)
the set consisting of all these cosets form a vector space denoted by V /M and called the quotient space of V with respect to M .
The origin of V /M is the coset 0 + M = M and the negative (additive inverse) of x + M is −x + M.

Proof: Cosets form partitions in V because they are sets in V coming from an equivalence relation. Since x + y ∈ V and
αx ∈ V then (x + y) + M and αx + M are also cosets. The fact that they form an additive abelian group is a special case of
theorem 6.30 page 123. In particular, the following implications

M + (y + M ) = (0 + M ) + (y + M ) = (0 + y) + M = (y + M )
(−x + M ) + (x + M ) = (x − x) + M = 0 + M = M

show that M ≡ [0] is the null vector in the quotient vector space, and −x + M ≡ [−x] is the inverse of the coset x + M ≡ [x].
We should also prove the four axioms stated at the beginning of this section, for instance

α {[x] + [y]} ≡ α {(x + y) + M } ≡ α (x + y) + M = (αx + αy) + M = (αx + M ) + (αy + M ) ≡ [αx] + [αy]

the remaining axioms are proved similarly. QED.


As can be seen, all we did was to copy the properties of cosets and quotient groups, except that we defined the
law of combination (10.2) so that the quotient group becomes a vector space. In the case of vector spaces, the quotient vector
2 Notice however, that except for the trivial vector space {0}, all other vector spaces are infinite groups.

Figure 10.1: Geometrical meaning of the quotient vector space R2 /M . This figure also illustrates the addition of two cosets.

space has an interesting geometrical meaning (see Fig. 10.1). Let V be the two-dimensional Euclidean space R², and think
of a vector x as the head of an arrow with its tail at the origin; a proper non-trivial subspace M is then a straight line
through the origin. A coset [x] = x + M is a line parallel to M that passes through the head of the arrow x. The quotient
space R²/M is the space consisting of all lines parallel to M (where each line is regarded as a single element of that set).
We sum two cosets [x] + [y] by adding x + y and forming the line (x + y) + M, which passes through the head of the arrow
z = x + y. Scalar multiplication is obtained similarly.
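The following Python sketch is an addition to these notes (the function names are illustrative, not from the text): it labels each coset x + M of R²/M by the component of x normal to the line M, and checks numerically that the sum of two cosets does not depend on the representatives chosen.

import numpy as np

# M = span{d}: a straight line through the origin of R^2 (an arbitrary illustrative choice)
d = np.array([1.0, 2.0]) / np.sqrt(5.0)   # unit vector along M
n = np.array([-d[1], d[0]])               # unit normal to M

def coset_label(x):
    # the coset [x] = x + M (a line parallel to M) is fixed by the normal component of x
    return np.dot(x, n)

x = np.array([3.0, 1.0])
y = np.array([-1.0, 4.0])
x_alt = x + 2.5 * d        # another representative of [x], shifted along M
y_alt = y - 7.0 * d        # another representative of [y]

# [x] + [y] = [x + y] and alpha[x] = [alpha x] are independent of the representatives
print(np.isclose(coset_label(x + y), coset_label(x_alt + y_alt)))   # True
print(np.isclose(coset_label(2.0 * x), coset_label(2.0 * x_alt)))   # True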
Another concept that arises naturally in vector spaces from group theory, is the concept of direct sum, which comes from
the concept of direct product of groups. Comparing section 6.12 with section 2.3 we see that once the direct product of groups
is defined, direct sums in vector spaces arise naturally just replacing subgroups in group theory by vector subspaces in vector
theory, and taking into account that vector spaces are abelian groups. Therefore, the direct product symbol ⊗ in group theory
is replaced by the symbol ⊕ in vector space theory, in order to emphasize the abelianity of its group structure.
We have seen that linear transformations are of central importance in the theory of vector spaces. Let us take a linear
transformation T from V into V ′ . We have seen that linear transformations preserve linear operations (sum and scalar
multiplication) so that it preserves in particular group operations (sums). Consequently, they are homomorphisms from the
point of view of vector spaces and groups. The null space M of T, defined as the set of all elements x ∈ V such that
T (x) = 0′ ∈ V ′, forms a vector subspace (and so an invariant subgroup) that tells us how far this homomorphism is from
being an isomorphism, on both vector and group theoretical grounds. Sometimes the null space of T is also called the
kernel of T . Since the null space M of T is a vector subspace, we can form cosets in V by the relation x ≡ y so that x − y ∈ M .
Now, two elements of the same coset in V (induced by M ) are mapped in the same element of V ′ . To see it, assume that x ≡ y
so that x − y ∈ M , we see that
T (x) − T (y) = T (x − y) = 0′ ⇒ T (x) = T (y)
and we can define the quotient vector space V /M . It is clear that T induces an isomorphism from V /M onto T (V ). It is easy to see
that T is invertible if and only if T is onto and M = {0} (i.e. T is one-to-one). If there exists an isomorphism T from V onto
V ′ we say that V is isomorphic to V ′ (as vector spaces and as additive abelian groups), so they are identical as vector spaces
and as additive abelian groups.
The previous discussion shows that it is more natural to develop abstract group theory first, and then abstract vector theory.
However, for the theory of representations the situation is quite the opposite: the theory of group representations requires a development of
both the abstract theory and the representation theory of vector spaces.
Further, the set of all linear transformations on V possesses three types of laws of combination (see Sec. 2.6): sum (abelian
group law of combination), scalar product, and product among themselves (composition of linear transformations). Certain
subsets of the set of all linear transformations could form a group or a vector space. However, the existence of three laws of
combination for this kind of set suggests forming other algebraic systems that include all the laws of combination. Along this line of
thinking, we shall discuss two of the other major algebraic systems: rings and algebras3.

10.2 Rings: definitions and properties


Since the group combination law of an abelian group is usually denoted by the symbol “+”, we shall call such groups additive abelian
groups. The inverse of x is denoted as −x and subtraction is naturally defined as x − y ≡ x + (−y).
It can be checked that the set I of all integers is an additive abelian group with respect to ordinary addition. However, it
is equally important to realize that I is also closed under ordinary multiplication and that the multiplication and addition are
related in a way that enriches the structure of the system. In a similar way, the set L of all linear transformations on a vector
space V is closed under addition and multiplication (composition). Further, multiplication and addition in L are related in a
way that is analogous to the interplay between those operations in the set I previously defined. The theory of rings, provides
the general framework to study the interplay of two types of combinations of this kind.

Definition 10.3 A ring R is an additive abelian group which is closed under a second operation called multiplication (the
multiplication of two elements x, y ∈ R is denoted as xy), and that satisfies the following axioms: (i) Multiplication is associative:
(xy) z = x (yz) for all x, y, z ∈ R. (ii) Multiplication is distributive i.e. x (y + z) = xy + xz and (x + y) z = xz + yz.

Roughly speaking, a ring is an additive abelian group with an additional operation called multiplication in which addition
and multiplication are combined in a reasonable way. Notice however, that multiplication is not necessarily commutative, and
that there is not necessarily an identity for multiplication. The latter feature says that the “multiplicative inverse” is not
defined in a general ring.
A final important comment: the multiplication law of a ring is NOT a group multiplication. In fact, we shall see later that,
with respect to multiplication, a ring never forms a group.

Example 10.1 In each of these rings the elements are numbers and addition and multiplication have their ordinary meanings:
(a) The zero number alone, (b) The set I of all integers, (c) The set of all even integers, (d) The set of all rational numbers,
(e) The set of all real numbers, (f ) The set C of all complex numbers, (g) The set of all complex numbers with rational real
and imaginary parts

Example 10.2 The set of all n×n matrices (real or complex) with n a fixed positive integer, with sum and matrix multiplication
as our ring operations. Observe that singular matrices are also included in the set.

Example 10.3 Let m be a positive integer and Im the set of all non-negative integers less than m: Im ≡ {0, 1, . . . , m − 1}.
If a, b ∈ Im we define their “sum” a + b and “product” ab to be the remainders obtained when their ordinary sum and product
are divided by m. If m = 6 then I6 = {0, 1, 2, 3, 4, 5} and 3 + 4 = 1 (the remainder of 7/6), 4 + 5 = 3 (the remainder of 9/6),
1 + 2 = 3 (the remainder of 3/6), 2 · 3 = 0 (the remainder of 6/6), 3 · 5 = 3 (the remainder of 15/6), 1 · 2 = 2 (the remainder
of 2/6). Im with these operations is called the ring of integers modulo m. Let n be an element of Im . The additive inverse
of n is the element k = m ∼ n, where the symbol “∼” denotes ordinary subtraction if n ≠ 0, and m ∼ n ≡ 0 if n = 0.
Therefore, denoting by −n the additive inverse of any element n in the ring Im , we see that −n = m ∼ n. In particular
for I6 we have
−0 = 0, −1 = 5, −2 = 4, −3 = 3, −4 = 2, −5 = 1
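As a small computational illustration (added here, not part of the original text; the helper names are hypothetical), the following Python sketch implements the operations of the ring I6 and reproduces the sums, products and additive inverses quoted above.

m = 6  # the ring I_6 of integers modulo 6

def add(a, b):
    # ring sum: remainder of the ordinary sum divided by m
    return (a + b) % m

def mul(a, b):
    # ring product: remainder of the ordinary product divided by m
    return (a * b) % m

def neg(n):
    # additive inverse: -n = m ~ n, with -0 = 0
    return (m - n) % m

print([neg(n) for n in range(m)])                  # [0, 5, 4, 3, 2, 1]
print(add(3, 4), add(4, 5), mul(2, 3), mul(3, 5))  # 1 3 0 3

Note that mul(2, 3) = 0 even though 2 and 3 are non-zero: this is the phenomenon of divisors of zero discussed below.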

The following are examples of additive abelian groups in which we define a rule of multiplication that DOES NOT form
a ring

Example 10.4 The set of real numbers with ordinary addition, in which the law of multiplication is defined by

a ∗ b ≡ a + ab

where ab is the ordinary multiplication. This multiplication is neither left-distributive nor associative

a ∗ (b + c) = a + a (b + c) = a + ab + ac ; a ∗ b + a ∗ c = (a + ab) + (a + ac) = 2a + ab + ac
⇒ a ∗ (b + c) ≠ a ∗ b + a ∗ c
(a ∗ b) ∗ c = (a + ab) ∗ c = (a + ab) + (a + ab) c = a + ab + ac + abc
a ∗ (b ∗ c) = a ∗ (b + bc) = a + a (b + bc) = a + ab + abc
⇒ (a ∗ b) ∗ c ≠ a ∗ (b ∗ c)
3 Note in particular that the set L of all linear transformations on a given vector space V contains some singular (non-invertible) linear transfor-

mations. Thus, L is not a group under product (composition) of linear transformations. However, some subsets of L could form a group. In fact,
group representations in V are subsets of L.

note however that it is right-distributive

(a + b) ∗ c = (a + b) + (a + b) c = a + b + ac + bc ; a ∗ c + b ∗ c = (a + ac) + (b + bc) = a + b + ac + bc
⇒ (a + b) ∗ c = a ∗ c + b ∗ c

Example 10.5 The set of real numbers with ordinary addition, and with law of multiplication given by

a ∗ b ≡ ab2

where ab denotes ordinary multiplication.

Example 10.6 The set of all n × n matrices under ordinary addition. The multiplication rule is defined as

A ∗ B = AB − BA

where AB is the ordinary matrix multiplication. It is easy to prove that this product is not associative.
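A quick numerical check of examples 10.4 and 10.6 (an added Python sketch with arbitrary test values, not part of the original text): the product a ∗ b = a + ab is right-distributive but neither left-distributive nor associative, and the commutator A ∗ B = AB − BA is not associative either.

import numpy as np

def star(a, b):          # example 10.4: a * b = a + ab
    return a + a * b

a, b, c = 2.0, 3.0, 5.0
print(star(a, b + c) == star(a, b) + star(a, c))     # False: not left-distributive
print(star(a + b, c) == star(a, c) + star(b, c))     # True:  right-distributive
print(star(star(a, b), c) == star(a, star(b, c)))    # False: not associative

def commutator(A, B):    # example 10.6: A * B = AB - BA
    return A @ B - B @ A

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
C = np.array([[1.0, 0.0], [0.0, -1.0]])
print(np.allclose(commutator(commutator(A, B), C),
                  commutator(A, commutator(B, C))))  # False: not associative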

Let us now examine some general properties of rings. We shall see that some familiar facts from elementary algebra are
true but others are false. Any property must be proved from the axioms or from theorems derived from the axioms.
The properties concerning only the sum, are obtained directly from the properties of additive abelian groups, starting from
the definition
a − b ≡ a + (−b) (10.4)
we have [see Eqs. (6.5, 6.6) adapted for abelian groups]

a+0 = 0 + a = a ; a + (−a) = −a + a = 0 ; 0 = −0 (10.5)


− (−a) = a ; − (a + b) = −a − b = −b − a ; ∀a, b ∈ R (10.6)

from the rearrangement lemma for groups, we see that if a + b = a + c then b = c. When we consider multiplication and its
interplay with sum, some interesting situations appear. We shall prove some algebraic properties by using the axioms. For the
sake of illustration, we describe one of them in detail:
x0 = 0 ∀x ∈ R. To prove it, we use the distributive law (ring property) and the fact that 0 is the additive identity
(group property)
x0 + x0 = x (0 + 0) = x0
now we add −x0 (the group inverse of x0) on both sides

(x0 + x0) + (−x0) = x0 + (−x0)

using the associativity of the sum (group associativity) and the fact that −x0 is the “negative” or additive inverse of x0, we
have

x0 + (x0 + (−x0)) = 0 ⇒ x0 + 0 = 0 ⇒
x0 = 0

where we have used the fact that 0 is the additive identity (group property).
Note that although the property x0 = 0 concerns only multiplication, we have proved it by combining the group properties
of the sum with the ring axioms of sum and multiplication.
In a similar way, we can prove that 0x = 0. So the product of two elements of the ring is the zero element if either of the
elements is zero. Surprisingly, the converse is NOT true! The product of two non-zero elements can give the zero element;
indeed, this happens quite often.

Example 10.7 It is well known that there are some non-zero n × n matrices whose product is the zero matrix. Let AB = 0
with A ≠ 0 and B ≠ 0. Assume that A is non-singular; then A⁻¹ exists and A⁻¹AB = A⁻¹ · 0, so that B = 0, leading to
a contradiction. Similarly, assuming that B⁻¹ exists gives a contradiction. Therefore, if AB = 0 with A ≠ 0 and B ≠ 0,
both matrices must be singular (the determinant of both matrices must be zero). For example

A ≡ [ 1   1 ]        B ≡ [  1   0 ]
    [ 0   0 ]            [ −1   0 ]

AB = [ 0   0 ]       BA = [  1    1 ] ≠ 0
     [ 0   0 ]            [ −1   −1 ]

this example also shows that even if AB = 0, with A ≠ 0 and B ≠ 0, it could occur that BA ≠ 0.
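The matrices of example 10.7 can be verified directly; the short numpy snippet below (an addition to the text) checks that AB = 0 while BA ≠ 0, and that both matrices are singular.

import numpy as np

A = np.array([[1, 1],
              [0, 0]])
B = np.array([[ 1, 0],
              [-1, 0]])

print(A @ B)                                 # the zero matrix: A and B are divisors of zero
print(B @ A)                                 # non-zero, even though AB = 0
print(np.linalg.det(A), np.linalg.det(B))    # both determinants vanish: A and B are singular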

Example 10.8 Let us take a ring Im of integers mod m (see example 10.3) such that m is not a prime number. There exists
a divisor k of m such that 2 ≤ k ≤ m − 1 (ordinary order of the real numbers). Clearly, k and m/k (ordinary division) are
non-zero elements of Im . Further, k · (m/k) = 0 under the multiplication defined in this ring (see example 10.3). For instance,
let us take I6 , we have seen that 2 · 3 = 0.

Definition 10.4 An element z in a ring such that either zx = 0 or xz = 0, for some x ≠ 0 is called a divisor of zero. In any
ring with non-zero elements, the element 0 itself is a divisor of zero.

Other properties could be obtained by using the axioms and properties developed above

0 = x0 = x (−y + y) ⇒ x (−y) + xy = 0 ⇒ [x (−y) + xy] + (−xy) = 0 + (−xy)


⇒ x (−y) + [xy + (−xy)] = −xy ⇒ x (−y) + 0 = −xy
⇒ x (−y) = −xy

similarly we have

0 = 0y = (−x + x) y ⇒ (−x) y + xy = 0 ⇒ [(−x) y + xy] − xy = 0 − xy


⇒ (−x) y + [xy − xy] = −xy ⇒ (−x) y = −xy

therefore
x (−y) = (−x) y = −xy
now let us define z ≡ −y, we have (−x) z = x (−z) = x (− (−y)) = xy hence

(−x) (−y) = xy

further

x (y − z) ≡ x (y + (−z)) = xy + x (−z) = xy + (−xz) ≡ xy − xz


(x − y) z ≡ (x + (−y)) z = xz + (−y) z = xz + (−yz) ≡ xz − yz

therefore
x (y − z) = xy − xz ; (x − y) z = xz − yz
Now, let us assume that ax = ay; then ax − ay = 0 and a (x − y) = 0. If a is not a divisor of zero we have that x − y = 0
so that x = y. Similarly, if a is not a divisor of zero then xa = ya implies x = y.
Let us summarize our results

Theorem 10.3 Let R be a ring and x, y, z ∈ R. Defining x − y ≡ x + (−y), and following the ring axioms we have the
following algebraic properties

0 = −0 ; x + 0 = 0 + x = x ; x + (−x) = −x + x = 0 (10.7)
− (−x) = x ; − (x + y) = −x − y = −y − x (10.8)
if x + y = x + z then y = z (10.9)
x0 = 0x = 0 ; x (−y) = (−x) y = −xy ; (−x) (−y) = xy (10.10)
x (y − z) = xy − xz ; (x − y) z = xz − yz (10.11)

if ax = 0 or xa = 0 for some x ≠ 0, we say that a is a divisor of zero. The element 0 is always a divisor of zero. However, it
is possible in a ring to have non-zero divisors of zero. Further, if a is not a divisor of zero then ax = ay or xa = ya implies
x = y. This is a “partial” cancellation law.

Definition 10.5 R is called a commutative ring if xy = yx for all x, y ∈ R.

The rings defined in examples 10.1, 10.3 are commutative but the one in example 10.2 is non-commutative as long as n > 1.

10.2.1 Rings with identity


We have already noted that the axioms for rings do not include the existence of a multiplicative identity; this in turn means
that we cannot define the multiplicative or “ring” inverse of an element (though we can define a group inverse or
negative). Nevertheless, some rings possess a multiplicative identity. For instance, the matrices of example 10.2 clearly have
the identity matrix as a multiplicative identity, while the zero matrix is the additive identity (abelian group identity) or zero
element of the ring.

Definition 10.6 If the ring R contains a non-zero element 1 such that 1x = x1 = x for all x ∈ R we call it the “identity”
and we say that R is a ring with identity.

So we say “the identity” to mean the multiplicative identity element, and we say “the zero or null element” to mean the additive
identity. We can prove that if R has an identity, it must be unique. First of all, we must observe that 1 cannot be a divisor
of zero4 , since 1x = x1 = x ≠ 0 for all x ≠ 0 in R. Assume that 1′ is another identity; then 1′ · 1 = 1 · 1, and since
1 is not a divisor of zero, the cancellation law implies 1′ = 1. In example 10.1 only (a) and (c) have no identity. The ring Im
of example 10.3 has an identity (the number 1) if and only if m > 1.
Now, the existence of an identity gives the possibility for some elements to have a multiplicative inverse.

Definition 10.7 If x ∈ R and there is another element y ∈ R such that xy = yx = 1, we say that y is the inverse of x and
we denote it as x−1 . If x ∈ R has an inverse, x is called regular (or invertible or non-singular). If x ∈ R has no inverse, we
call it a singular element.

In the language of rings, we say “the inverse” to mean the multiplicative inverse, and “the negative” to mean the additive
or group inverse.

Theorem 10.4 Let R be a ring with identity. If x is a divisor of zero then it is singular, in particular 0 is always singular.
If x has an inverse, such an inverse is unique. Further, the identity 1 is always regular.

Proof : If x is a divisor of zero we have either ax = 0 or xa = 0 for some a ≠ 0. We examine first the case in which xa = 0
for some a ≠ 0, and assume that x is non-singular. Let us multiply the latter equation by x⁻¹ on the left to find

x⁻¹ (xa) = x⁻¹ · 0 ⇒ (x⁻¹ x) a = 0 ⇒ a = 0

which is a contradiction. The case in which ax = 0 for some a ≠ 0 is similar. Hence if x is a divisor of zero it must be singular.
Now suppose that x has an inverse x−1 . Thus, x is non-singular and according to the previous proof, it is not a divisor
of zero. Now assume that there exists another inverse of x, denoted by z. It is clear that x⁻¹ x = 1 = zx; since x is not a
divisor of zero, the cancellation law says that x⁻¹ = z.
The fact that 1 is always regular in a ring with identity follows from 1 · 1 = 1 such that 1 is an inverse (and so the inverse)
of itself. QED.
This theorem says that every ring with identity contains at least one singular element (the zero element) and at least one
regular element (the identity). Therefore, no ring with identity has an inverse for all of its elements (at least the zero element
has no inverse); this tells us that a ring with identity cannot form a group with respect to the law of multiplication5. It is worth
emphasizing that the concepts of identity, inverse, regular and singular elements only make sense for rings with identity.
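As an added illustration of theorem 10.4 in the finite ring I6 of example 10.3 (a sketch, not part of the original notes), the snippet below lists the regular elements and the divisors of zero, and confirms that no divisor of zero is regular.

m = 6
elements = range(m)

# x is regular (invertible) if some y satisfies xy = 1 (mod m); commutativity gives yx = 1 too
regular = [x for x in elements if any((x * y) % m == 1 for y in elements)]
# z is a divisor of zero if zx = 0 (mod m) for some x != 0
zero_divisors = [z for z in elements if any((z * x) % m == 0 for x in elements if x != 0)]

print(regular)                               # [1, 5]
print(zero_divisors)                         # [0, 2, 3, 4]
print(set(regular) & set(zero_divisors))     # set(): every divisor of zero is singular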
We have defined the inverse of x (if it exists) in a ring with identity as an element y such that xy = yx = 1. However, in
non-commutative rings it is possible that xy = 1 ≠ yx or that yx = 1 ≠ xy. This induces the following definition.

Definition 10.8 Let R be a ring with identity. We say that x ∈ R is left regular if there exists an element y such that yx = 1,
and the element y is called a left inverse. If x is not left-regular, it is called left-singular. Right-regular, right-inverse, and
right-singular elements are defined in a similar way.

Theorem 10.5 An element x is regular ⇔ it is both left-regular and right-regular.

Proof : It is obvious that if x is regular, then it is both right-regular and left-regular. Now we assume that x is both
left-regular and right-regular, so there exist elements y and z such that yx = 1 and xz = 1. Then we have

y = y1 = y (xz) = (yx) z = 1z = z

hence, if for a given element both left-inverse and right-inverse exist, they must coincide. QED. It should be emphasized
however, that an element could be left-regular and right-singular or vice versa. In other words, it is possible for an element to
have a left inverse but not to have a right-inverse or vice versa. This is an important difference with respect to group theory.
In group theory, both the left inverse and right inverse must exist and coincide.
It is natural to look for rings in which all the elements, except the zero element, are invertible.

Definition 10.9 A ring with identity is called a division ring if all its non-zero elements are regular. A commutative division
ring is called a field.

Division rings are as near as possible to the structure of a group under the multiplication law, as can be seen formally from
the following theorem
4 It is because of this fact that 1 must be different from zero, since 0 is always a divisor of zero. In particular, the zero ring has no identity.
5 Of course, rings without identity are even further from reaching a group structure under the law of multiplication.

Theorem 10.6 Let R be a ring with identity. R is a division ring ⇔ the non-zero elements of R form a group with respect to
multiplication.
Proof : Assume that R is a division ring. By the axioms of rings, multiplication is closed and associative. By definition
there is an identity for multiplication and all non-zero elements of the ring have a unique inverse. Therefore if R is a division
ring, the set R − {0} forms a group under multiplication.
Now assume that the non-zero elements of R form a group with respect to multiplication. By the group axioms, there
is an identity for multiplication and each non-zero element has an inverse, so we obtain a division ring. QED.
Example 10.9 The rational numbers constitute a field under ordinary addition and multiplication, as do real numbers. Fields
are considered the “number systems” of mathematics.
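A finite illustration of theorem 10.6 (an added sketch using the modular rings Im of example 10.3; the function name is hypothetical): for a prime modulus the non-zero elements form a group under multiplication, so Im is a field, while for a composite modulus closure already fails.

def nonzero_elements_form_group(m):
    # check whether the non-zero elements of I_m form a group under multiplication mod m;
    # associativity and the identity 1 are inherited from I_m itself
    nonzero = range(1, m)
    closed = all((a * b) % m != 0 for a in nonzero for b in nonzero)
    inverses = all(any((a * b) % m == 1 for b in nonzero) for a in nonzero)
    return closed and inverses

print(nonzero_elements_form_group(7))   # True:  I_7 is a field
print(nonzero_elements_form_group(6))   # False: 2 * 3 = 0 and 2 has no inverse in I_6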

10.2.2 The structure of rings


Definition 10.10 Let R be a ring. A non-empty set S is called a subring of R if S forms a ring with respect to the same sum
and multiplication defined in R. This is equivalent to demand that S be closed under the formation of sums, negatives, and
products.
In group theory, we have constructed cosets and quotient groups by means of invariant subgroups. Similarly, in the theory
of vector spaces, we could form vector cosets and quotient vector spaces by using vector subspaces. These facts suggest studying
the possibility of forming such structures from subrings.
The first task is to find certain kinds of subrings that could form partitions by means of cosets. These special types of
subrings are called ideals.
Definition 10.11 An ideal in R is a subring I ⊆ R, with the additional properties
if i ∈ I ⇒ xi ∈ I ; ∀x ∈ R (10.12)
if i ∈ I ⇒ ix ∈ I ; ∀x ∈ R (10.13)
if I is a proper subset of R we call it a proper ideal. The zero ideal (the zero element alone), and the ring R itself, are called
trivial ideals.
It is clear that a non-zero ring has at least two ideals: the trivial ones. We say an ideal “in R”, because if T is a ring that
contains R as a subring, the set I is not necessarily an ideal in T (compare this situation with the case of invariant subgroups).
Example 10.10 Consider the ring of all integers under ordinary sum and multiplication. If m is a positive integer, the subset
m̄ ≡ {. . . , −2m, −m, 0, m, 2m, . . .}
constitutes a non-trivial ideal for each m > 1.
Example 10.11 Consider the ring C [0, 1] of all bounded continuous real functions defined on the closed interval [0, 1]. If
X ⊆ [0, 1], the set
I (X) ≡ {f ∈ C [0, 1] : f (x) = 0 for every x ∈ X}
is an ideal in this ring. If X is empty then I (X) = C [0, 1], and if X = [0, 1] then I (X) is the zero ideal.
Theorem 10.7 If R is a ring with identity, the identity 1 cannot be contained in a proper ideal I in R.
Proof : Assume that a proper ideal I contains 1. So it also contains 1x = x for all x ∈ R. Then I = R, contradicting the
fact that I is a proper ideal. QED.
We have seen that the abundance of non-trivial invariant subgroups in a given group dictates many aspects of the structure
of groups (e.g. simple and semisimple groups). In the same way, many aspects of the structure of rings have to do with the
non-trivial ideals contained in it. We illustrate it with the following theorem
Theorem 10.8 If R is a commutative ring with identity, then R is a field ⇔ it has no non-trivial ideals.
Proof : We first assume that R is a field, and show that it has no non-trivial ideals. So we prove that if I is a non-zero
ideal, then I = R. Since I is a non-zero ideal, it has an element a ≠ 0, and since R is a field, a⁻¹ exists and I contains
a⁻¹ a = 1; by theorem 10.7 this implies I = R.
We now assume that R is a commutative ring with identity without non-trivial ideals. Let us take a fixed x ≠ 0. The set
I = {yx : y ∈ R} of all multiples of x by elements of R forms an ideal in R (prove!). I contains 1x = x ≠ 0, so that it is a
non-zero ideal and hence I = R. Therefore, I contains 1, which means that there is an element y in R such that yx = 1 = xy
(commutativity), so x has an inverse. Now, x is an arbitrary non-zero element of R, so R is a division ring and
hence a field (it is commutative by hypothesis). QED.
The next step, if we try to emulate the procedure used for groups, is to define a partition (cosets) of the ring R from a given
ideal I ⊆ R. For this, we define an equivalence relation in R through I, in analogy with definition 6.21 and theorem 6.28 for
group theory

Definition 10.12 Given a ring R and an ideal I ⊆ R. Two elements x, y ∈ R are said to be congruent modulo I, written
x ≃ y (mod I), if x − y ∈ I. Since only one ideal is considered, we write it simply as x ≃ y.

Theorem 10.9 Let R be a ring and I ⊆ R an ideal in R. The congruence relation x ≃ y defined above is an equivalence
relation. Moreover, congruences can be added and multiplied, as if they were ordinary equations

x1 ≃ x2 and y1 ≃ y2 ⇒ x1 + y1 ≃ x2 + y2 and x1 y1 ≃ x2 y2 (10.14)

Proof : x − x = 0 and zero must be contained in any ideal (in any subring); hence x ≃ x and the relation is reflexive. As
in any subring, if x ∈ I then −x ∈ I; therefore if x − y ∈ I then y − x ∈ I and the relation is symmetric. Now, x ≃ y and
y ≃ z imply that x − y ∈ I and y − z ∈ I, but I must be closed under sums so that (x − y) + (y − z) = x − z ∈ I and x ≃ z,
so the relation is transitive.
Now to prove Eq. (10.14), we have by hypothesis that x1 − x2 ∈ I and y1 − y2 ∈ I. Now I must be closed under sums,
negatives and also under products of the type xy and yx with x ∈ R and y ∈ I, hence

(x1 − x2 ) + (y1 − y2 ) ∈ I ; x1 (y1 − y2 ) + (x1 − x2 ) y2 ∈ I (10.15)

and the conclusions in Eq. (10.14) follows from

(x1 + y1 ) − (x2 + y2 ) = (x1 − x2 ) + (y1 − y2 ) ∈ I (10.16)

x1 y1 − x2 y2 = x1 y1 − x1 y2 + x1 y2 − x2 y2 = x1 (y1 − y2 ) + (x1 − x2 ) y2 ∈ I (10.17)


QED. Notice that the fact that the congruence is an equivalence relation only depends on the fact that I is a subring (not
necessarily an ideal), and the same occurs for the first part of Eq. (10.14) (sum property of congruences). However, the proof
of the second part of the property (10.14) (multiplication property of congruences) has used the fact that I is closed under
multiplication of any element of I with any element of R 6 . In other words, it is valid for ideals but not for general subrings.
In fact, this property is the main reason that makes ideals more important than merely subrings (in the same way as invariant
subgroups are more important than merely subgroups).
As is customary, an equivalence relation in R generates a partition of R (theorem 1.1 page 10). The subsets of the partition,
called cosets, are generated by taking an element x ∈ R and forming the set [x] of all elements y ∈ R that are congruent
modulo I with x. If R is not exhausted, we take an element z outside of [x], in order to form [z]. If [x] ∪ [z] does not fill R,
we take an element w outside this union to form [w] and so on, until we exhaust the full ring R.
Let us characterize a single coset [x]. By definition, it is the set of all elements y such that y ≃ x so that

[x] = {y : y ≃ x} = {y : y − x ∈ I} = {y : y − x = i for some i ∈ I}


= {y : y = x + i for some i ∈ I}
[x] = {x + i : i ∈ I} (10.18)

a natural notation for the ring cosets is


[x] ≡ x + I (10.19)
which means the set of all elements of the form x + i with i ∈ I. Either of Eqs. (10.18, 10.19) defines clearly the structure of
ring cosets of R induced by the ideal I.

Definition 10.13 Let R be a ring and I an ideal in R. A ring coset generated by an element x ∈ R and the ideal I in R is
defined as
[x] ≡ x + I ≡ {x + i : i ∈ I} (10.20)

Remember that the same coset can be generated by any other element of its equivalence class, that is, [x] = [y] if and
only if x ≃ y. So the elements x, y are called representatives of the coset which contains them (i.e. of their equivalence class).
All the framework is settled to define the quotient ring R/I of R with respect to I.

Theorem 10.10 Let I be an ideal in a ring R, and let the coset of an element x in R be defined as [x] ≡ x+I = {x + i : i ∈ I}.
The distinct cosets, form a partition of R and if addition and multiplication of cosets are defined as

(x + I) + (y + I) = (x + y) + I ⇔ [x] + [y] = [x + y] (10.21)


(x + I) (y + I) = xy + I ⇔ [x] · [y] = [xy] (10.22)

then the set consisting of all distinct cosets as elements, constitutes a ring under the laws of combination (10.21, 10.22). Such
a ring is denoted by R/I and called the quotient ring of R with respect to I. The zero element of it is 0 + I = I and the negative
of x + I is −x + I. Moreover, if R is commutative, R/I is commutative as well; and if R has an identity 1 and I is a proper
ideal, then R/I has also an identity 1 + I.
6 Observe that x1 and y2 in the proof are not necessarily in I. Thus the second of Eqs. (10.15) is only true for ideals and not for mere subrings.

Proof : The fact that cosets form a partition of R is a direct consequence of their construction by means of an equivalence
relation.
To prove that Eqs. (10.21, 10.22) define a ring we should first be sure that they provide well-defined operations. That is,
that the operations do not depend on the representatives chosen in each coset to make them. By applying property (10.14) we
see that x ≃ x1 and y ≃ y1 implies x + y ≃ x1 + y1 and xy ≃ x1 y1 ; which in other words means that [x] = [x1 ] and [y] = [y1 ]
implies [x + y] = [x1 + y1 ] and [xy] = [x1 y1 ] showing that the operations with cosets are independent of the representative
chosen for each coset.
The fact that the cosets with these operations form a ring is a direct consequence of the ring properties of R. First, it is
easy to check that with respect to addition, I is a subgroup of the additive abelian group R. This subgroup is invariant since
it is a subgroup of an abelian group. Hence R/I is an abelian additive quotient group. As for the other ring properties, they
can be checked directly, for instance

(x + I) [(y + I) (z + I)] = (x + I) [yz + I] = x (yz) + I = (xy) z + I = [(xy) + I] (z + I)


(x + I) [(y + I) (z + I)] = [(x + I) (y + I)] (z + I)
⇒ [x] ([y] [z]) = ([x] [y]) [z]

(x + I) [(y + I) + (z + I)] = (x + I) [(y + z) + I] = x (y + z) + I = (xy + xz) + I


= (xy + I) + (xz + I)
(x + I) [(y + I) + (z + I)] = (x + I) (y + I) + (x + I) (z + I)
[x] ([y] + [z]) = ([x] [y] + [x] [z])

the rest of the theorem is straightforward. QED.
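The quotient-ring construction can be made completely explicit for a small finite example. The following added Python sketch (hypothetical helper names) takes R = I12 and the ideal I = {0, 4, 8} of multiples of 4, builds the ring cosets x + I, and checks that they partition R and that the operations (10.21, 10.22) do not depend on the representatives chosen.

m = 12
R = range(m)
I = frozenset({0, 4, 8})               # the ideal of multiples of 4 in I_12

def coset(x):
    # the ring coset [x] = x + I
    return frozenset((x + i) % m for i in I)

cosets = {coset(x) for x in R}
print(len(cosets), sorted(sorted(c) for c in cosets))   # 4 distinct cosets partition I_12

# [x] + [y] = [x + y] and [x][y] = [xy] are independent of the representatives
well_defined = all(coset(x1 + y1) == coset(x2 + y2) and coset(x1 * y1) == coset(x2 * y2)
                   for x1 in R for y1 in R
                   for x2 in coset(x1) for y2 in coset(y1))
print(well_defined)                    # True

The resulting quotient ring I12/I has four elements and behaves like I4, which anticipates the homomorphism theorems of the next subsection.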

10.2.3 Homomorphisms and isomorphisms for rings


For any algebraic system, it is clear that preservation of the laws of combination leads to the preservation of the abstract
algebraic properties. In vector spaces, we have defined an isomorphism between two vector spaces V and V ′ as a one-to-one
mapping of V onto V ′ that preserves linear operations (sum and scalar product); such transformations are called
non-singular linear transformations, and V , V ′ are considered identical as vector spaces. In group theory, we have defined a
homomorphism from G onto G′ as a mapping that preserves the law of combination. Homomorphic groups are partially
similar as groups, but not identical, since the mapping is not one-to-one. A one-to-one homomorphism is called an isomorphism
and in that case G and G′ are identical as groups. All this discussion can be translated into the language of rings in this way

Definition 10.14 Let R and R′ be two rings. A homomorphism of R into R′ is a mapping f of R into R′ that preserves the
laws of combination
f (x + y) = f (x) + f (y) ; f (xy) = f (x) f (y)

It can be shown7 that f (0) = 0 and f (−x) = −f (x) since

f (0) + f (0) = f (0 + 0) = f (0) ⇒ f (0) = f (0) − f (0) = 0


f (x) + f (−x) = f (x + (−x)) = f (0) = 0 ⇒ f (−x) = −f (x)

we say that the homomorphism preserves the zero and negatives. The image f (R) of R is clearly a subring of R′ and we call
it a homomorphic image of R. If the homomorphism is onto, then f (R) coincides with R′ and we say that R′ is homomorphic
to R. If the homomorphism is one-to-one, the subring f (R) is an isomorphic image of R. In the latter case, f (R) is essentially
identical to R as a ring; the difference is a matter of notation. The purely ring properties of R are reflected with complete
precision in f (R). We should keep in mind, however, that if either of these systems possesses some additional properties, they are
not necessarily mapped into the other space. If a homomorphism is one-to-one and onto, then R and R′ are isomorphic, i.e.
identical as rings.
We have seen that if R and R′ are isomorphic, all ring properties are translated from one to the other with absolute
precision. If we only have a homomorphism, ring properties are reflected with less precision. In group theory we learnt that
the size of the center or kernel of the homomorphism tells us how far the homomorphism is from being an isomorphism. The
discussion that follows resembles the one in Sec. 7.2 that ended with theorems 7.1, 7.2 for group theory.

Definition 10.15 Let f be a homomorphism of R into R′ . The kernel K of this homomorphism is the inverse image in R of
the zero ideal in R′
K ≡ {x : x ∈ R and f (x) = 0}
7 We denote the zero of R and the zero of R′ with the same symbol, even though they are different elements. This causes no confusion.

The following theorem is easy to verify


Theorem 10.11 Let f be a homomorphism of R into R′ . The kernel K of f is an ideal in R and K is the zero ideal ⇔ f is
an isomorphism.
This is analogous to the fact that if f is a homomorphism of a group G onto G′ , the kernel of the homomorphism is an
invariant subgroup in G, and the kernel is the identity alone if and only if the homomorphism is an isomorphism. So roughly
speaking, the size of K tells us the extent to which f fails to be an isomorphism, in both group and ring theory. As in group
theory, for a given homomorphism f with kernel K, the ring cosets generated by the kernel are such that all the elements of a
given ring coset are mapped into a single element of the image, and two elements belonging to different ring cosets are mapped
into different images.
Further, as in the case of vector and group theory, the concepts of homomorphisms and isomorphisms for rings, lead directly
to the theory of ring representations. In a general way, the theory of representations of any mathematical system consists of
finding a mapping whose image preserves all (or part) of the essential properties of the mathematical system but in which the
image has a more familiar or simple form than the original system.
We shall not discuss the general theory of rings, but we shall outline the main ideas followed to unravel a given ring
structure. Suppose that R is a ring whose features are unknown. Now assume that R′ is a homomorphic image of R and R′
is a ring which has been well understood. R′ can give us some aspects of the structure of R but it contains only part of the
information of R as a ring. This information is usually completed with other homomorphic images of R. Each homomorphism
gives a “piece” of the whole picture of R. This procedure could be compared naively with the process done in solid analytic
geometry, in which the form of a figure is studied by means of its cross sections. Each cross section has only part of the
information and a smart management of each of them is necessary to obtain a complete (or quite complete) image of the figure.
The previous strategy leads us to make a systematic study of the homomorphic mappings of R. We remember that in
group theory, G/K is homomorphic to G and all homomorphic images of G are exhausted by characterizing all the invariant
subgroups of G and forming the corresponding quotient groups (see theorem 6.33 page 125). We shall see that an analogous
result appears for rings.
Let R be a ring, and let f be a homomorphism of R onto a ring R′ . Let K be the kernel of f . Since K is an ideal in R we
can define the quotient ring R/K, the mapping defined as
g (x) = x + K
is a homomorphism of R onto R/K, usually called the natural homomorphism. In words, this homomorphism maps each
element of R in its corresponding coset. The fact that this is a homomorphism is obtained from the definition of the operations
in R/K
g (x + y) = (x + y) + K = (x + K) + (y + K) = g (x) + g (y)
g (xy) = xy + K = (x + K) (y + K) = g (x) g (y)
finally we show that R/K is isomorphic with R′ . Let h be a mapping of R/K into R′ defined as
h (x + K) = f (x) (10.23)
First of all, we observe that this is a well-defined map because all elements in a given ring coset are mapped in a single
element in the image of f . Thus, if x ≃ y then x + K = y + K but also f (x) = f (y) so we can write Eq. (10.23) as
h (y + K) = f (y) for any y ≃ x, so that such an equation remains valid regardless of our choice of the element in the ring coset.
It is clear that the mapping h is onto. In addition, elements belonging to different ring cosets are mapped into different images
through f , so that x + K ≠ z + K implies f (x) ≠ f (z); hence the mapping h is one-to-one. The preservation of sum and
multiplication follows from the fact that f is a homomorphism of R onto R′
h ((x + K) (y + K)) = h (xy + K) = f (xy) = f (x) f (y) = h (x + K) h (y + K)
h ((x + K) + (y + K)) = h ((x + y) + K) = f (x + y) = f (x) + f (y) = h (x + K) + h (y + K)
Therefore, h is a well-defined one-to-one mapping of R/K onto R′ that preserves sum and multiplication, and so is an
isomorphism. Since R/K and R′ are isomorphic, we can consider that R/K is the isomorphic image of R under f .
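As a concrete added illustration of this construction (hypothetical names, not from the original text), consider the ring homomorphism f : I12 → I4 given by x → x mod 4. The sketch below checks that f preserves sums and products, that its kernel is the ideal K = {0, 4, 8}, and that all elements of a coset x + K share the same image, so that h(x + K) = f(x) is well defined.

m, n = 12, 4

def f(x):
    # homomorphism f : I_12 -> I_4
    return x % n

R = range(m)
# f preserves both laws of combination (sums/products taken mod 12 on the left, mod 4 on the right)
print(all(f((x + y) % m) == (f(x) + f(y)) % n and
          f((x * y) % m) == (f(x) * f(y)) % n for x in R for y in R))   # True

K = [x for x in R if f(x) == 0]
print(K)                                                                # [0, 4, 8]: the kernel of f

# every element of the ring coset x + K is mapped to the single image f(x)
print(all(f((x + k) % m) == f(x) for x in R for k in K))                # True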
On the other hand, we could proceed in the opposite way. That is, starting with a given ideal I we form the quotient ring
R/I and form the homomorphism
R → R/I ≡ x → x + I
It is clear that I is the kernel of this homomorphism and that all possible homomorphic images of R can be formed in this way
Theorem 10.12 Let R and R′ be two rings, and f a homomorphism of R onto R′ . Let K be the kernel of f . The quotient ring
R/K is isomorphic to R′ so we can consider R/K as the homomorphic image of R under f . Reciprocally, all homomorphic
images of R are obtained by the mappings
R → R/I ≡ x → x + I
when I runs over all ideals of R.

This theorem says that it is not necessary to go beyond the ring R to find out all its homomorphic images. Additionally,
it illustrates the importance of the ideal structure of a given ring: the more ideals, the more homomorphic images.
In particular, if a ring R has no non-trivial ideals, the only homomorphic images of R are the zero ring and R itself;
this implies that any representation is either faithful or trivial (the zero representation). Compare this situation with the one
associated with simple groups (see definition 6.18 page 121, and corollary 7.3 page 134).
If I is an ideal in a ring R, its properties relative to R are reflected in properties of the quotient ring R/I. We have
already mentioned that many properties of R are related to its richness in ideals. Further, the structure theory of R is simpler if
there are quotient rings R/I that are sufficiently simple and familiar.
To illustrate these points we introduce the following concept: a maximal ideal I in a ring R is a proper ideal which is not
properly contained in any other proper ideal.

Theorem 10.13 Let R be a commutative ring with identity. An ideal I in R is maximal ⇔ R/I is a field.

Proof : If I is maximal, R/I is a commutative ring with identity in which there are no non-trivial ideals. Hence, by
theorem 10.8 we see that R/I is a field. We now assume that I is not maximal and prove that R/I is not a field. There are
two possibilities (a) I = R, (b) there is an ideal J such that I ⊂ J ⊂ R. For case (a) R/I is the zero ring so it cannot be a
field (there is no identity). For the case (b) R/I is a commutative ring with identity that contains the non-trivial ideal J/I,
then by theorem 10.8 it cannot be a field either. QED.

10.3 Algebras
We have seen that there are some algebraic systems that possess three laws of combination: sum (abelian group property),
scalar product and multiplication (or composition) between elements of the system. This induces the following

Definition 10.16 A linear or vector space A is called an algebra if its vectors can also be multiplied in such a way that A is
also a ring, and in which scalar and ring multiplication are related in the following way

α (xy) = (αx) y = x (αy) (10.24)

with α a scalar and x, y ∈ A. The concept of algebra is a natural combination of the concepts of vector spaces and rings.

Since algebras are vector spaces, all concepts and properties developed for the latter are also valid for algebras. Thus, some
algebras are real and some are complex according to the scalars used. Every algebra has a well-defined dimension. Since an
algebra is also a ring, it can be commutative or not, and it may or may not have an identity. For algebras with identity we
can talk about inverses and about regular and singular elements.

Definition 10.17 An algebra with identity is a division algebra if, as a ring, it is a division ring.

Definition 10.18 A subalgebra of an algebra A is a non-empty subset A0 ⊆ A, that is an algebra in its own right with the laws
of combination of A. It can be checked that it is equivalent to require that A0 be closed under addition, scalar multiplication
and ring multiplication.

In the language of algebras, the ring multiplication is simply called multiplication.

Example 10.12 (a) The real vector space R of all real numbers is a commutative real algebra with identity, in which sum and
multiplication have their ordinary meaning. In this case, scalar and ring multiplication are the same. (b) The complex vector
space C of all complex numbers, is a complex algebra with identity under ordinary complex sum and multiplication. Once again,
scalar and ring multiplication coincide.

Example 10.13 The set C [0, 1] of all real bounded and continuous functions defined on the interval [0, 1]. This is a real
algebra with identity if sum, multiplication and scalar multiplication are defined pointwise. That is

(f + g)(x) ≡ f (x) + g (x) , (f · g)(x) ≡ f (x) g (x) , (αf )(x) ≡ αf (x) ; ∀f, g ∈ C [0, 1] and ∀x ∈ [0, 1]

Example 10.14 Let V be a vector space and let β (V ) ≡ {T } be the set of all linear transformations of V into itself. This set
forms an algebra if the three laws of combination are defined as

(T + O) x ≡ T x + Ox ; T, O ∈ β (V ) , x ∈ V
(T O) x ≡ T (Ox) ; T, O ∈ β (V ) , x ∈ V
(αT ) x ≡ α (T x) ; T ∈ β (V ) , x ∈ V, α ≡ scalar

this algebra has an identity if and only if V ≠ {0}. This algebra is in general non-commutative and has non-zero divisors of
zero (singular linear transformations).
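For V = R² the linear transformations of example 10.14 can be realized as 2 × 2 matrices; the added numpy sketch below (with arbitrary sample matrices) verifies the compatibility condition (10.24) between scalar and ring multiplication, the non-commutativity of the product, and the existence of non-zero divisors of zero.

import numpy as np

T = np.array([[1.0, 2.0],
              [0.0, 1.0]])
O = np.array([[0.0, 1.0],
              [1.0, 0.0]])
alpha = 3.0

# compatibility of scalar and ring multiplication, Eq. (10.24)
print(np.allclose(alpha * (T @ O), (alpha * T) @ O),
      np.allclose(alpha * (T @ O), T @ (alpha * O)))   # True True

# the algebra is non-commutative ...
print(np.allclose(T @ O, O @ T))                       # False
# ... and contains non-zero divisors of zero (singular transformations)
P = np.array([[1.0, 0.0], [0.0, 0.0]])
Q = np.array([[0.0, 0.0], [0.0, 1.0]])
print(P @ Q)                                           # the zero matrix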

The concept of homomorphism between algebras is a natural extension of the homomorphisms created for other algebraic
systems. That is, a mapping that preserves the laws of combination.

Definition 10.19 A homomorphism f of an algebra A into an algebra A′ is a mapping of A into A′ that preserves the laws
of combination
f (x + y) = f (x) + f (y) ; f (αx) = αf (x) ; f (xy) = f (x) f (y) ∀x, y ∈ A
and f (A) is a homomorphic image of A. If f is onto, then f (A) = A′ and A′ is homomorphic to A. An isomorphism is a
one-to-one homomorphism.

Example 10.15 The set {MT } of matrix representatives of all linear transformations β (V ) of a given finite dimensional
vector space V , forms an algebra that is isomorphic to β (V ), see section 3.1 page 34.

Definition 10.20 An ideal I in an algebra A, is a non-empty subset of A which is a vector subspace when A is considered
as a vector space, and an ideal when A is considered as a ring. An ideal I in this sense is usually called an algebra ideal to
distinguish it from a ring ideal.

Theorem 10.14 Let A be an algebra and I an algebra ideal in A. All the distinct “algebra cosets” generated by I, defined as

[x] ≡ x + I ≡ {x + i : i ∈ I} (10.25)

form another algebra, called the quotient algebra A/I, under the following rules of combination

(x + I) + (y + I) ≡ (x + y) + I ⇔ [x] + [y] ≡ [x + y] (10.26)


α (x + I) ≡ αx + I ⇔ α [x] ≡ [αx] (10.27)
(x + I) (y + I) ≡ xy + I ⇔ [x] · [y] ≡ [xy] (10.28)

If A is an algebra with identity, a non-empty subset I is an algebra ideal ⇔ it is a ring ideal. Hence for algebras with identity,
there is no distinction among ring ideals and algebra ideals.

Proof : Since I is a vector subspace, definition 10.2 says that Eq. (10.25) defines “vector cosets” and forms a partition
of A. Further, theorem 10.2 says that A/I forms a vector space under the laws of combination given by Eqs. (10.26, 10.27).
Similarly, I is a ring ideal, hence definition 10.13 page 193 says that Eq. (10.25) also defines “ring cosets” and theorem 10.10
says that A/I is also a ring under the laws of combination given by Eqs. (10.26, 10.28). It only remains to show that Eqs.
(10.24) relating multiplication and scalar multiplication, holds for our algebra cosets. This can be shown by using the algebra
properties of A and the laws of combination (10.27, 10.28)

α ([x] · [y]) = α ([xy]) = [α (xy)] = [(αx) y] = [αx] · [y]


α ([x] · [y]) = (α [x]) · [y]

similarly

α ([x] · [y]) = α ([xy]) = [α (xy)] = [x (αy)] = [x] · ([αy])


α ([x] · [y]) = [x] · (α [y])

If A is an algebra with an identity 1, it is obvious that an algebra ideal is also a ring ideal. Now, if I is a ring ideal in A, we
can show that I is closed under scalar multiplication since

i ∈ I ⇒ αi = α (1i) = (α1) i ∈ I for all α

where we have used the fact that α1 ∈ A. Thus I is also a vector subspace (it is closed under sum by definition) of A and thus
an algebra ideal. QED.

Definition 10.21 A left-ideal I in an algebra A, is a non-empty subset of A which is a vector subspace when A is considered
as a vector space, and such that
i ∈ I ⇒ xi ∈ I, ∀x ∈ A
A right-ideal in an algebra A, is a vector subspace of A such that

i ∈ I ⇒ ix ∈ I, ∀x ∈ A

if a non-empty subset I ⊆ A is both a left-ideal and a right-ideal, it is also an algebra ideal. That is, I is a vector subspace and
a ring ideal of A (see definition 10.20). In this context, ideals are also called two-sided ideals.

Definition 10.22 A maximal ideal in an algebra A, is a proper ideal that is not properly contained in any other proper ideal.
Maximal left-ideals and maximal right-ideals in an algebra could be defined similarly.

Note that groups and rings could consist of a finite number of elements, while vector spaces and algebras are necessarily
infinite sets8 .
It is worth saying that, although in our present context an algebra is a very precise algebraic system described by definition
10.16, the word “algebra” is frequently used in both Physics and Mathematics to refer to a generic “algebraic system”.

8 This is the case when we restrict the system of scalars in the vector spaces to be the set of all real or complex numbers. However, the theory

of vector or linear spaces can be extended to define the system of scalars as an arbitrary field. Even further, the system of scalars could be defined
as an arbitrary ring. In the latter case we talk about a module, instead of a linear space. However, this generality is out of the scope of the present
treatment.
Chapter 11

Group algebra and the reduction of the


regular representation

Linear group representations U (G) are sets of linear transformations on a given vector space V . However, we can define for
linear transformations not only the group operation but also operations of sum and scalar multiplication. Thus, it is natural
to extend the study of such representations by using other algebraic systems.
On the other hand, we have described the regular representation in section 7.11. First, we formed an n_G-dimensional vector
space G̃ by using the elements {g_1, g_2, . . . , g_{n_G}} of the finite group G as a basis. Hence, the vector space G̃ consists of all
“formal linear combinations” of the form r = g_i r^i where g_i ∈ G and the r^i are complex numbers. In this way, we are forming a
vector space G̃ isomorphic (as a vector space) with C^{n_G}, in which the group elements {g_k} play the role of cartesian
orthonormal vectors {u_k}. Further, the regular representation matrices are defined by g_i g_j = g_m Δ^m_{ij} (see Eq. 7.72 page 152),
where the RHS is interpreted as a “formal sum” because the definition of a group does not involve linear combinations. Note
that for the vector space G̃ with elements r = g_i r^i we can define linear combinations as

αr + βq = α(g_i r^i) + β(g_i q^i) = (αr^i) g_i + (βq^i) g_i = (αr^i + βq^i) g_i                  (11.1)

where α, β are arbitrary complex numbers and r, q ∈ G̃. Here we are applying the fact that linear combinations in G̃ are
identical to linear combinations in C^{n_G} with the assignment g_k ↔ u_k. In addition, the multiplication rule coming from the
group structure serves to define a multiplication rule for the vectors of G̃. Since scalars can be put in any order in the scalar
multiplication, as in C^{n_G}, it is reasonable to postulate the following axiom for the product between two vectors

α (rq) = (αr) q = r (αq) (11.2)


so that the product between vectors can be defined as

rq = (g_i r^i)(g_j q^j) = (g_i g_j) r^i q^j = g_k Δ^k_{ij} r^i q^j                  (11.3)

where Δ^k_{ij} contains the multiplication rule of the group. It is easy to check that with this additional operation, G̃ becomes an
algebra. For instance, using the rules for linear combinations (11.1), multiplication (11.3), as well as the vector space axioms, we
have

r (p + q) = (r^i g_i) [(p^k + q^k) g_k] = r^i (p^k + q^k) g_i g_k = (p^k r^i + q^k r^i) g_i g_k = p^k r^i g_i g_k + q^k r^i g_i g_k
          = (r^i g_i)(p^k g_k) + (r^i g_i)(q^k g_k) = rp + rq

r (pq) = (r^i g_i) [(p^m g_m)(q^n g_n)] = (r^i g_i) [p^m q^n (g_m g_n)] = r^i p^m q^n g_i (g_m g_n) = [(r^i g_i)(p^m g_m)] (q^n g_n) = (rp) q
where we have also used Eq. (11.2) and the associativity of G. In a similar way we can prove (p + q) r = pr + qr, showing that
G̃ is a ring with identity (the identity of the ring is clearly the identity of the group). On the other hand, G̃ is also a vector
space isomorphic with C^{n_G}, and the axiom (11.2) shows that G̃ is a complex algebra with identity, as stated in definition 10.16
page 196.
Definition 11.1 (Group algebra) Let G = {g_1, . . . , g_{n_G}} be a finite group. The space V_G formed by all formal linear combinations
of the elements of the group, i.e. r = g_i r^i where the r^i are complex numbers, is called the group algebra G̃ if we define linear
combinations and products by

αr + βq = g_i (αr^i + βq^i) ; rq = (g_i g_j) r^i q^j = g_k Δ^k_{ij} r^i q^j ; α(rq) = (αr) q = r(αq)
r, q ∈ G̃ ; g_i ∈ G ; r^i, q^j ∈ C


where Δ^k_{ij} contains the multiplication rule of the group [defined in Eq. (7.72), page 152]. It is clear that the identity of the
group becomes an identity for the algebra. Hence, G̃ is an n_G-dimensional complex algebra with identity. The null element is
obtained by the null linear combination of any set of g_k's.

Since the elements of the algebra are vectors, we emphasize it in some cases by using Dirac notation |r⟩ ∈ G̃. Since by
definition the elements g_i ∈ G are a basis for G̃, we can define an inner product on this algebra (as a vector space) as

⟨r|q⟩ = r^{i∗} q^i

with respect to this scalar product, the elements of G form an orthonormal basis

⟨r|q⟩ = (r^{i∗} ⟨g_i|)(q^k |g_k⟩) = r^{i∗} q^k ⟨g_i|g_k⟩ = r^{i∗} q^i  ⇒  ⟨g_i|g_k⟩ = δ^i_k

It can be seen that the elements r of the group algebra G̃ induce a natural mapping on G̃ (as a vector space) by the rule of
group multiplication. This is more apparent by using Dirac notation appropriately. We start with the following identity

r g_i = (g_j r^j) g_i = g_j (r^j g_i) = (g_j g_i) r^j = g_k Δ^k_{ji} r^j

where we have used Eq. (11.2); we should remember that such an equation is a property of an algebra, which is neither a
vector-space nor a ring property. In Dirac notation we rewrite this expression as

r|g_i⟩ = |g_k⟩ r^j Δ^k_{ji} = |g_k⟩ D(r)^k_i ; D(r)^k_i ≡ r^j Δ^k_{ji}                  (11.4)

e instead of an element gi of the basis in Eq. (11.4) we have


If we use any q ∈ G

r |qi = r |gi i q i = |gk i rj ∆kji q i


k
e (as a vector space). In particular, the factor D (r) i defined
this can be interpreted as a linear transformation (operator) on G
in Eq. (11.4) can be interpreted as the matrix representation of r in the basis {|gj i}, when r is taken as a linear operator on
e Therefore, every element of the group algebra G
G. e can be seen either as a vector of the algebra or as an operator on the
e
vector space G. This dual role of the elements of the group algebra as vectors and operators leads to important properties of
the regular representation.
The following example shows that the group algebra G e is not a division algebra.

Example 11.1 Let G e 2 be the complex two-dimensional group algebra associated with the group C2 = {e, a} of two elements
i.e. r ≡ β1 e + β2 a. Elements in G e2 of the form βe ± βa are non-null divisors of zero as can be seen from

(βe + βa) (βe − βa) = β 2 e2 − a2 = β 2 (e − e) = 0

and divisors of zero are singular according with theorem 10.4.

As in the case of groups, we can define representations of the group algebra as linear operators on certain vector spaces.
Indeed we shall use the representations on the group algebra to find the representations of the group that induced the algebra.
n o
Definition 11.2 A representation of the group algebra Ge is a linear mapping from Ge to a set of linear operators U (r) : r ∈ G
e
e and U (r), U (q) are their images
on a vector space V which preserves the laws of combination of the group algebra: if r, q ∈ G
then

U (αr + βq) = αU (r) + βU (q) ; U (rq) = U (r) U (q)


[U (r)] (x)
∈ V ; ∀x ∈ V
n  o
Definition 11.3 An invariant subspace of V under a representation U G e of the group algebra Ge is a subspace Va of V
n  o
e . An irreducible representation of G
such that [U (r)] (x) ∈ Va for all x ∈ Va and for all U (r) ∈ U G e in V , is one that has
no non-trivial invariant subspaces of V .

Theorem 11.1 (i) A representation of G e is also a representation of G and vice versa. (ii) An irreducible representation of
e is also an irreducible representation with respect to G and vice versa.
G
11.1. LEFT IDEALS AND INVARIANT SUBSPACES OF THE GROUP ALGEBRA 201

Proof : (i) Let {U (r)} be a representation of G e in V . Since gi ∈ G,


e we have in particular that U (gi gk ) = U (gi ) U (gk )
∀gi , gk ∈ G, so it forms a representation for the group.
Let {U (g)} be a linear representation of G in V . Hence, U (gi gk ) = U (gi ) U (gk ) ∀gi , gk ∈ G. If we construct the formal
linear combinations of elements of G, and demand “linear properties” on the extensions of U (g) i.e.

U (gi + gj ) ≡ U (gi ) + U (gj ) ; U (αgi ) ≡ αU (gi ) ; ∀gi , gj ∈ G (11.5)

e give1
with these linear properties of the extensions of U (g), the linear operations on G
  
U (αr + βq) = U αgi ri + βgi q i = αri U (gi ) + βq i U (gi ) = αU ri gi + βU q i gi
U (αr + βq) = αU (r) + βU (q) (11.6)

and for the product


  
U gi ri gk q k = ri q k U (gi gk ) = ri q k U (gi ) U (gk ) = U gi ri U gk q k
U (rq) =
U (rq) =
U (r) U (q) (11.7)
n  o n  o
e
(ii) Let U G be an irreducible representation of G e in V . We shall assume that U G e is reducible with respect to
G and we shall arrive to a contradiction. Therefore, assume that there is Va ⊂ V such that [U (gi )] x ∈ Va for all x ∈ Va

and for all U (gi ) ∈ {U (G)}. Since U is a linear mapping of G e into β (V ), we have U (r) = U gi ri = ri U (gi ). Hence,
[U (r)] x = ri [U (gi ) (x)]. Since Va is a subspace, it is closed
n  under
o linear combinations, from which
i
n  ro[U (gi ) (x)] ∈ Va .
Consequently, [U (r)] x ∈ Va for all x ∈ Va and for all U (r) ∈ U G e , contradicting the fact that U G e is an irreducible
representation of G e in V . The reciprocal is proved similarly. QED.
The preceding theorem permits to find irreducible representations of the group G by finding irreducible representations of
its group algebra G.e For it to be of practical importance, we should see whether constructing irreducible representations on
the algebra is easier than constructing them on the group. Indeed, the construction of irreducible representations of the group
algebra is facilitated by the possibility of having linear combinations of the group elements, to form appropriate projector
operators2.

11.1 Left ideals and invariant subspaces of the group algebra


e We have seen that all irreducible inequivalent
The regular representation of a finite group is constructed on the group algebra G.
representations {µ} of the group G, are contained in the regular representation, and a given µ is contained nµ times (theorem
7.17, page 152). Therefore, the group algebra G e can be decomposed (as a vector space) into a direct sum of irreducible invariant
a
subspaces Lµ as follows
nc Xnµ
X
e=
G Laµ
µ=1 a=1

we can find an orthonormal basis for G e by taking orthonormal bases in each Laµ . Further, it is possible to order this basis in
such a way that the first one lies in L1 (which is always 1−dimensional), the next n2 lie in L21 , . . . and so on. With that basis
1

ordered in that way, the matrix representations D (g) are block-diagonal


 
1
 D2 
 
 .. 
 . 
 
 D2 
 
 .. 
 . 
 
 Dnc 
 
 .. 
 . 
Dnc
1 We should distinguish the linearity in Eq. (11.6), from the linearity of the operators {U (g)}. To say that an operator U (g) is linear in V , we

mean [U (g)] (αx + βy) = α [U (g)] (x) + β [U (g)] (y), for all x, y ∈ V . This expresses the linearity of the mapping U (g) : V → V . On the other hand,
the linearity defined in Eq. (11.6), expresses the linearity of the mapping U : G e → β (V ), where β (V ) is the set of all linear operators on V.
2 Note that, despite the products rq are derived from group multiplication, the algebra with identity G, e is not a group under this multiplication (it
is a ring). It is because the trivial linear combinations generates a singular element (the zero of the algebra). Even more, some linear combinations
generate non-null elements that are singular, as illustrated in example 11.1, Page 200.
202 CHAPTER 11. GROUP ALGEBRA AND THE REDUCTION OF THE REGULAR REPRESENTATION

such that a given Dµ appears nµ times. Let us characterize each subspace Laµ . For this we denote the new basis as
 a
kµi : µ = 1, . . . , nc ; a, i = 1, . . . , nµ

Let us take a fixed subspace Laµ its basis is


 a
kµi : i = 1, . . . , nµ
Now, since the subspaces Laµ are invariant, it means that ∀ |ri ∈ Laµ all the elements gk of the group maps |ri in an element of
Laµ i.e.
gk |ri ∈ Laµ ; ∀ |ri ∈ Laµ and ∀gk ∈ G
e can be written as a linear combination of g ′ s. Thus p = gk pk . Taking into account that
on the other hand, any element p ∈ G k
a
Lµ is a vector subspace and hence closed under linear combinations we have

p |ri = pk gk |ri ∈ Laµ e


∀ |ri ∈ Laµ and ∀p ∈ G

in other words, Laµ is closed under left multiplication with any element of G.e Therefore, according with definition 10.21, page
a
197; the invariant subspaces Lµ are also left-ideals. Left-ideals which do not contain properly any non-zero left ideals are said
to be minimal left-ideals. Furthermore, if Laµ is not an invariant subspace (even if it is a vector subspace) under the irreducible
representation µ, there is at least one element gk ∈ G such that gk |ri ∈ / Laµ for some |ri ∈ Laµ , and such a space is not a
left-ideal. Consequently, minimal left-ideals are equivalent to irreducible invariant subspaces. In conclusion, if we identify the
e all inequivalent irreducible representations can be easily found. Once again, we are
minimal left-ideals of the group algebra G,
reflecting a group structure into an algebra structure.

11.2 Decomposition of finite dimensional algebras in left ideals: projections


Though we develop this section for applications on the group algebra, the results are valid for an arbitrary finite-dimensional
algebra A, as long as A can be decomposed as a direct sum of orthogonal left-ideals.
In order to identify the minimal left ideals of the group algebra, a natural approach is to characterize the projectors onto
those left-ideals. We could be tempted to use the projection method developed in Sec. 9.2. Nevertheless, the construction
of those projectors require the knowledge of irreducible matrix representations (see for instance Eq. 9.7). Hence, methods of
Sec. 9.2 are not useful to construct these representations3 . Instead, we shall try to characterize the projectors by using the
properties of left-ideals and linear combinations in the algebraic structure.
e is a vector space, minimal left ideals Laµ (which are vector subspaces) can be characterized by the projectors onto
Since G
Lµ . We denote these projectors as Pµa . According with the discussion on page 32, definition 2.30; for Pµa to be projectors in the
a

sense of Hilbert spaces, they must be linear, continuous, idempotent, self-adjoint operators which are also pairwise orthogonal
i.e.
Paµ Pbν = δ µν δab Paµ
Let us examine the projectors Pµa a bit closer. A projector is defined on a given subspace of the direct sum of all irreducible
e
invariant subspaces (minimal left-ideals) of G
nc nµ
X X  n 
e =
G Laµ = L11 ⊕ L12 ⊕ . . . ⊕ Ln2 2 ⊕ . . . ⊕ L1nc ⊕ . . . ⊕ Lnnc c (11.8)
⊕µ=1 ⊕a=1
1  
|ri = r1 + r21 + . . . + |rn2 i + . . . + rn1 + . . . + rnncnc ; rµa ∈ Laµ (11.9)
2 c


e each rµa is unique we define
since for a given |ri ∈ G,

Pµa |ri ≡ rµa ∈ Laµ (11.10)

now let us apply Pνb to the vector rµa . Since rµa already belongs to an invariant subspace (minimal left ideal), it is clear
that Pνb anhillates it, unless such a vector belongs to the subspace Lbν , in the latter case the projector keeps the vector unaltered,
thus

Pνb Pµa |ri = Pνb rµa = δ ab δµν rµa
Pνb Pµa |ri = δ ab δµν Pµa |ri ; e
∀ |ri ∈ G

e we have
since this is valid for all |ri ∈ G
3 Instead, those methods are useful to construct the minimal invariant subspaces of a vector space V under a group G, when its irreducible

representations are given.


11.2. DECOMPOSITION OF FINITE DIMENSIONAL ALGEBRAS IN LEFT IDEALS: PROJECTIONS 203


Pµa |ri ≡ rµa ∈ Laµ so that if |qi ∈ Laµ ⇒ Pµa |qi = |qi ; Pνb Pµa = Pµa Pνb = δ ab δµν Pµa (11.11)
showing the idempotence and pairwise orthogonality. The self-adjointness is equivalent to say that the vector subspaces Laµ
are orthogonal each other (see theorem 2.49, page 32), this is guaranteed by taking into account that G e is fully reducible (since
G is finite and thus any representation of G is equivalent to an unitary representation). Further, continuity is equivalent to
boundedness. Linearity is left to the reader. These observations says that Pµa defined by Eq. (11.10) are projectors in the
sense of Hilbert spaces.
A convenient way of describing the action of Pµa on |ri is by taking into account the decomposition of the identity of G. e
Remember that G e is an algebra with identity, which is precisely the identity of the group G
 n 
|ei = e11 + e12 + . . . + |en2 2 i + . . . + e1nc + . . . + ennc c ; eaµ ∈ Laµ
we shall see that a realization of the projectors through this decomposition of the identity, is possible by virtue of the dual
e as vectors and operators. We shall establish first the following
role of the elements r ∈ G
PN
Lemma 11.1 If V = ⊕m=1 Vm is a decomposition of V as a direct sum (where m could imply several indices), the decom-
position of the identity
N
X
e= em ; em ∈ Vm
m=1
is such that en em = δmn en .
Proof : The decomposition of en ∈ Vn can be written in two ways
en = 0 + 0 + . . . + 0 + en + 0 + . . . + 0
X N
en = en e = en em = en e1 + en e2 + . . . + en en−1 + en en + en en+1 + . . . + en eN
m=1

since each element in the decomposition is unique then en em = δmn en . QED.


Theorem 11.2 Let G e be a finite-dimensional algebra with identity, that can be decomposed in orthogonal left-ideals in the
form described by Eqs. (11.8). The projection operator Pµa defined in Eq. (11.10), is realized by right-multiplication with eaµ ,
i.e. the action of the projector can be described as
Pµa |ri = reaµ (11.12)
where eaµ is the corresponding projector of the identity e, i.e. eaµ = Pµa e. That is, Eq. (11.12) defines a linear operator with the
properties given by Eq. (11.11). The projectors Pµa have the additional property
e
rPµa = Pµa r ; ∀r ∈ G (11.13)
e in two
Proof : We should prove first that the operator Pµa is linear, this is left to the reader. Now let decompose r ∈ G
ways

nc X
X
r = rµa ; rµa ∈ Laµ
µ=1 a=1

nc X nµ
nc X
X X
r = re = r eaµ = reaµ
µ=1 a=1 µ=1 a=1

but Laµ is a left-ideal, so reaµ ∈ Laµ .


Since the decomposition of r is unique, we have reaµ = rµa ≡ Pµa r. In Dirac notation

Pµa |ri = reaµ .
The idempotence and pairwise orthogonality is a direct consequence of lemma 11.1
 
Pνb Pµa |ri = Pνb reaµ = reaµ ebν = r eaµ ebν = rδ ab δµν eaµ = δ ab δµν reaµ
Pνb Pµa |ri = e ⇒ P b P a = δ ab δµν P a
δ ab δµν Pµa |ri ∀r ∈ G ν µ µ

e over an arbitrary |qi ∈ G


Finally, to verify Eq. (11.13) let us apply rPµa and Pµa r for any r ∈ G e

Pµa r |qi = Pµa |rqi = (rq) eaµ = r qeaµ ; ∀r, q ∈ Ge

rPµa |qi = r qeaµ = r qeaµ ; ∀r, q ∈ G e

e are arbitrary, we obtain Eq. (11.13). QED. In words, Eq. (11.12) says that the action of projectors Pµa on the
since r, |qi ∈ G
left-side is equivalent to the action of the corresponding projection of the identity (eaµ = Pµa e) acting on the right-side.
204 CHAPTER 11. GROUP ALGEBRA AND THE REDUCTION OF THE REGULAR REPRESENTATION

11.3 Idempotents as generators of left-ideals


If we define
nµ nc nµ
X X X
Lµ ≡ e=
Laµ ; ; G Lµ ; Pµ = Pµa
⊕a=1 ⊕µ=1 a=1
nc nµ
X X
r = rµ ; rµ ≡ rµa
µ=1 a=1
nc nµ
X X
e = eµ ; eµ ≡ eaµ
µ=1 a=1

it is straightforward to establish the properties of the projectors Pµ onto the subspaces Lµ . We only have to take into account
that Lµ is also a left-ideal (though it is not minimal), and that decompositions on direct sums are unique.

Theorem 11.3 The projection operator Pµ onto the left-ideals Lµ has the properties

Pµ |ri = |rµ i ∈ Lµ ; if |qi ∈ Lµ ⇒ Pµ |qi = |qi


Pν Pµ = Pµ Pν = δµν Pµ ; rPµ = Pµ r ; ∀r ∈ G e

and Pµ is realized by right-multiplication with eµ in the form: Pµ |ri = |reµ i.

Inspired on lemma 11.1 we make the following definition

Definition 11.4 A given set {eλ } of non-zero elements of the group algebra Ge that satisfy the condition eµ eν = δµν eµ is called
a set of idempotents. Those which satisfy that condition up to an additional normalization, are called essentially idempotents.

The existence of a set of two or more idempotents in the sense of definition 11.4, in a given group algebra Ge would imply
that there are non-null divisors of zero. Showing again that Ge is not in general a division algebra. We shall see later that
most of group algebras have a set of several idempotents. A particular example of idempotents is precisely the set of non-null
components of the identity generated by a given direct sum of subspaces as shown by lemma 11.14.
n o
Definition 11.5 Let eλ be an idempotent, the set reλ : r ∈ G e is clearly a left-ideal. This is called the left-ideal generated
by the idempotent eλ .

This induces the following definition.

Definition 11.6 An idempotent that generates a minimal left-ideal is called a primitive idempotent.

Our next task is to look for a criterion to check whether a given idempotent is primitive or not.

Theorem 11.4 An idempotent ei is primitive, if and only if ei rei = λr ei for all r ∈ G, e where λr is a number that depends on
r.
n o
Proof : (i) Assume that ei is a primitive idempotent so that L = rei : r ∈ G e is a minimal left-ideal. Therefore, the
realization of the group algebra on L is irreducible (irreducible invariant subspace). Now, define an operator Rr on G e induced
e e
by r, in the form Rr |qi ≡ |qei rei i ∀ |qi ∈ G . Since qei r ∈ G then (qei r) ei = Rr |qi ∈ L. On the other hand

Rr s |qi = e
Rr |sqi = |(sq) (ei rei )i = |sqei rei i ; ∀s, q ∈ G
sRr |qi = e
s |q (ei rei )i = |(sq) (ei rei )i = |sqei rei i ; ∀s, q ∈ G

e We see that Rr is a linear operator of G


therefore Rr s = sRr ∀s ∈ G. e into L. If we restrict the domain of Rr to L ⊆ G,
e we
obtain in particular that Rr s = sRr ∀s ∈ G. Now, since L is an irreducible invariant subspace under G, we obtain by applying
Schur’s Lemma that Rr must be proportional to the identity operator in the subspace L. Since |qei i is an element of L for all
e then
|qi ∈ G
e
Rr |qei i = λr e |qei i = λr |qei i = |q (λr ei )i ; ∀ |qi ∈ G
4 For instance, if we decompose the identity of the group algebra in the basis {g } induced by the group, it is obvious that no more than one
i
idempotent is generated from e (i.e. the identity itself). However, by transforming the basis {gi } into another basis {ui } through a unitary
transformation, we obtain several non-null components of the identity with this new basis.
11.3. IDEMPOTENTS AS GENERATORS OF LEFT-IDEALS 205

on the other hand, from the definition of Rr and the idempotence of ei , we also have

Rr |qei i ≡ |(qei ) ei rei i = |q (ei ei ) rei i = |qei rei i = |q (ei rei )i


⇒ |q (λr ei )i = |q (ei rei )i

we can induce an operator Rr in this way from any r ∈ G. e And since the previous relation holds for all q ∈ Ge (in particular
e
for the identity), we conclude that ei rei = λr ei for all r ∈ G.
e and ei = e′ + e′′ where e′ and e′′ are two different idempotents5 . We shall prove that
(ii) Assume ei rei = λr ei for all r ∈ G i i i i
it leads to a contradiction. By definition we have

ei e′i = (e′i + e′′i ) e′i = e′i e′i = e′i ⇒ (ei e′i ) ei = e′i ei
⇒ ei e′i ei = e′i (e′i + e′′i ) ⇒ ei e′i ei = e′i ⇒ λe′i ei = e′i

where we have used our hypothesis in the last step. Using the last result and the definition of an idempotent we obtain

e′i = λe′i ei = e′i e′i = λ2e′i ei

therefore λ2e′ = λe′i from which either λe′i = 1 or λe′i = 0. If λe′i = 1, then ei = e′i and e′′i = 0. If λe′i = 0, then ei = e′′i and
i
e′i = 0. In either case, one of the elements e′i and e′′i is not an idempotent contradicting our assumption. QED.
Primitive idempotents are generators of irreducible representations. If we have a set of primitive idempotents, we want to
know which ones of them generates inequivalent representations.

Theorem 11.5 Two primitive idempotents e1 and e2 generate equivalent irreducible representations if and only if e1 re2 6= 0
e
for some r ∈ G.

Proof : Let L1 and L2 be the two minimal left ideals, generated by e1 and e2 respectively.
e Let q1 ∈ L1 , then q1 s = q1 (e1 re2 ) = (q1 e1 r) e2 ∈ L2 . Consider the linear
(i) Assume that e1 re2 = s 6= 0 for some r ∈ G.
transformation S from L1 into L2 defined as
S
q1 ∈ L1 −
→ q2 = q1 s ∈ L2
e the vector p |q1 i ∈ L1 since L1 is a left-ideal. Hence, S can be applied on p |q1 i. Then we have
now for all p ∈ G,

Sp |q1 i =S |pq1 i ≡ |(pq1 ) si = |p (q1 s)i = p |(q1 s)i ≡ pS |q1 i


⇒ Sp |q1 i = pS |q1 i ; ∀p ∈ G, e ∀q1 ∈ L1
 
e According with Schur’s lemma, the two representations D1 G
in other words, acting on L1 we have Sp = pS, ∀p ∈ G. e and
 
D2 G e realized on L1 and L2 respectively, must be equivalent (since S is a non-zero mapping).
(ii) If the two representations are equivalent, there exists a non-singular linear transformation S, such that

SD1 (p) S −1 = D2 (p) e


; ∀p ∈ G. (11.14)
1 2 e
SD (p) = D (p) S ; ∀p ∈ G. (11.15)

e Now, let us define


Considering S as a linear transformation from L1 onto L2 , it is equivalent to say that Sp = pS, ∀p ∈ G.
6
|si ≡ S |e1 i ∈ L2 . It can be seen that s is different from zero . Moreover

|si ≡ S |e1 i = S |e1 e1 i = Se1 |e1 i = e1 S |e1 i = e1 |si = |e1 si

so that s = e1 s. Further, since s ∈ L2 , and using Eq. (11.12) we see that |si = P2 |si ≡ |se2 i. In normal notation we write
s = se2 from which
s = se2 = (e1 s) e2 = e1 se2 =
6 0 because s 6= 0 ∈ L2 . (11.16)

QED.
5 Denying that e is a primitive idempotent is equivalent to say that the left-ideal L
i ei generated by ei can be decomposed in at least two orthogonal
left-ideals Lei = Le′ ⊕Le′′ which are also invariant subspaces under G. Therefore ei = e′i + e′′ ′ ′′
i with ei ei = 0.
i i
6 Suppose that s = Se = 0. Since e 6= 0 by definition, then S is a divisor of zero. Therefore, theorem 10.4 page 191, implies that S is a singular
1 1
element in contradiction with Eq. (11.14).
206 CHAPTER 11. GROUP ALGEBRA AND THE REDUCTION OF THE REGULAR REPRESENTATION

11.4 Complete reduction of the regular representation


e can be decomposed in left-ideals
Let us summarize the procedure to reduce the regular representation. (i) The group algebra G
Lµ with µ running over all irreducible inequivalent representations
nc
X
e=
G Lµ
⊕µ=1

e with an idempotent eµ which satisfy the condition


(ii) each Lµ is generated by right multiplication of all p ∈ G
nc
X
eµ eν = δµν eµ , and eµ = e
µ=1

(iii) The regular representation contains nµ times each irreducible inequivalent representation µ. Therefore, each Lµ (and the
associated eµ ) can be decomposed into nµ minimal left-ideals Laµ associated with nµ primitive idempotents eaµ
nµ nµ
X X
Lµ = Laµ ; eµ = eaµ ; eaµ ∈ Laµ
⊕a=1 a=1

this primitive idempotents satisfy the conditions

eaµ rebν = δ ab δµν λr eaµ e


∀r ∈ G

in summary, the problem of reducing the regular representation of a group G is reduced to identify all the inequivalent primitive
idempotents. In particular, we shall use this technique for the symmetric group Sn to derive all the inequivalent irreducible
representations of Sn .
We have seen that the complete reduction of the regular representation corresponds to a reduction of the group algebra
into minimal left-ideals. The following theorem shows that this minimal left-ideals, are contained in some special minimal
two-sided ideals

Theorem 11.6 If T is a minimal two-sided ideal and it contains a minimal left-ideal Laµ , then it contains all the other minimal
left-ideals associated with the same µ, and only these.

Proof : Let Laµ ⊆ T . (i) If Laµ and Lbµ correspond to equivalent irreducible representations, theorem 11.5 says that there
exists a non-zero element s ∈ Lbµ such that s = Pµb s = sebµ = eaµ sebµ (see Eq. 11.12, and Eq. 11.16 in the proof of theorem
11.5). Let us take r ∈ Laµ and since Laµ ⊆ T , we see that r ∈ T .
Now, since s ∈ Lbµ then rs ∈ Lbµ because the latter is a left-ideal. On the other hand, since r ∈ T , then rs ∈ T because T is
a two-sided ideal (in particular, a right-ideal). In conclusion, the element rs = rsebµ belong to both Lbµ and T . Hence, Lbµ ⊆ T .
(ii) If Laµ and Lbν are both in T , there exists an element s such that Lbµ s = sLbν and they generate equivalent representations
so that µ = ν. It can be P shown that if s does not exist, then T cannot be minimal. QED.
Now, since Lµ = a Laµ contains all left-minimals associated with the same µ, and only these; it suggest that Lµ are
minimal two-sided ideals. This fact is easy to check explicitly
P
Theorem 11.7 The left-ideals Lµ = a Laµ , associated to a given irreducible representation µ are minimal two-sided ideals
(or simply minimal ideals, see definition 10.21) i.e. each Lµ does not contain properly any non-zero two-sided ideal.

The preceding theorem shows that in the complete reduction of the regular representation we first decomposed G e into
minimal two-sided ideals Lµ for each irreducible inequivalent representation, to further reduce them into minimal left-ideals
Laµ ; which are also irreducible invariant subspaces with respect to the µ−representation.
Note that when G is an abelian group, we are led to a commutative group algebra G. e Any left-ideal in G e is also a
a
two-sided ideal. This shows that Lµ and Lµ must coincide, so that each irreducible representation appears only once. Since
each irreducible inequivalent representation appears nµ times in the regular representation, we conclude that each irreducible
representation of G is one-dimensional. Another way to see it, is by observing that if eaµ is a primitive idempotent that generates
e and the idempotence of ea we have rea ea = rea = λr ea , and the last
Laµ then eaµ reaµ = λr eaµ but using the commutativity of G µ µ µ µ µ
a
equality shows that any Lµ is one-dimensional i.e. all irreducible invariant subspaces under G are one-dimensional. Finally, the
regular representation is nG -dimensional and so G e is, and since each inequivalent irreducible representation is one-dimensional
and appears only once, there are nG inequivalent irreducible representations of G. This discussion is consistent with previous
results obtained for abelian groups.
11.5. THE REDUCTION OF THE REGULAR REPRESENTATION OF C3 207

11.4.1 Generation of the idempotent associated with the identity representation


The idempotent associated with the identity representation for a general finite group yields
1 X
e1 = gi (11.17)
nG
gi ∈G

e for the sum to make sense. From the rearrangement


we should keep in mind that each gi ∈ G is considered as an element of G
lemma, we see that
ge1 = e1 g = e1 ∀g ∈ G
In particular, it leads to
 
1 X 1 X 1 X 1
e1 e1 =  gi  e1 = (gi e1 ) = e1 = nG e 1 = e 1
nG nG nG nG
gi ∈G gi ∈G gi ∈G

and e1 (ge1 ) = e1 e1 = e1 ∀g ∈ G. Since any element r ∈ Ge is written as r = gi ri then we have


!
X X X X
i
e1 re1 = e1 gi r e1 = (e1 gi e1 ) ri = e1 r i = e1 ri
i i i i
X
i
e1 re1 = λr e1 ; λr ≡ r
i

these properties show that e1 is a primitive idempotent. Now let us characterize the left ideal Le1 generated by e1
!
n o X X X X
Le1 ≡ e
re1 : ∀r ∈ G ; re1 = gk r k e1 = (gk e1 ) rk = e1 r k = e1 rk ⇒
k k k k
re1 = λr e1

so Le1 is clearly unidimensional (any element of Le1 is linearly dependent with e1 ). Since ge1 = e1 ∀g ∈ G it corresponds to
the identity representation.

11.5 The reduction of the regular representation of C3


By now we illustrate the reduction of the regular representation with a simple example. This is certainly not the most
advantageous form of generating the inequivalent irreducible representations of C
3 , but it clarifies
the procedure to understand
the general technique. The group C3 = {e, a, b} can be denoted better as C3 = e, a, a−1 . The group C3 is abelian, so that
there are nG = 3 one-dimensional inequivalent irreducible representations, each one ocurring once in the regular representation.

11.5.1 Generation of the idempotents


The idempotent associated with the identity representation comes from Eq. (11.17)
1 
e1 = e + a + a−1 (11.18)
3
We shall propose a second idempotent of the form

e2 = xe + ya + za−1 (11.19)

and try to find values of x, y, z for e2 to become a second idempotent. We then demand orthogonality with e1 and idempotence
 
e1 e2 = 0 ⇒ e + a + a−1 xe + ya + za−1 = 0 (11.20)
−1
 −1
 −1

e2 e2 = e2 ⇒ xe + ya + za xe + ya + za = xe + ya + za (11.21)

evaluating explicitly the orthogonality condition for e1 e2 we have


    
A ≡ e + a + a−1 xe + ya + za−1 = e xe + ya + za−1 + a xe + ya + za−1 + a−1 xe + ya + za−1
  h 2 i
A = xe + ya + za−1 + xa + ya2 + ze + xa−1 + ye + z a−1
   
A = xe + ya + za−1 + xa + ya−1 + ze + xa−1 + ye + za
A = (x + y + z) e + (x + y + z) a + (x + y + z) a−1 = 0
208 CHAPTER 11. GROUP ALGEBRA AND THE REDUCTION OF THE REGULAR REPRESENTATION

e we resort to linear independence to write


remembering that e, a, a−1 are a basis for G,

(x + y + z) = 0 (11.22)

note that we are using the distributivity properties of rings. Now the condition for idempotence, Eq. (11.21) becomes
2 2 2  
(xe) + (ya) + za−1 + 2 (xe) (ya) + 2 (xe) za−1 + 2 (ya) za−1 = xe + ya + za−1

e to expand it as an ordinary polynomial.


where we have used the commutativity of G

x2 e + y 2 a−1 + z 2 a + 2xya + 2xza−1 + 2yze = xe + ya + za−1


  
x2 + 2yz e + z 2 + 2xy a + y 2 + 2xz a−1 = xe + ya + za−1 (11.23)

appealing to linear independence and gathering Eqs. (11.22, 11.23) we have a total of four equations

(x + y + z) = 0 ; x2 + 2yz = x ; z 2 + 2xy = y ; y 2 + 2xz = z (11.24)

from the first of Eqs. (11.24) we have x = −y − z and using it in the second we have

(y + z)2 + 2yz = −y − z ⇒ y 2 + 2yz + z 2 + 2yz + y + z = 0


y 2 + z 2 + 4yz + y + z = 0 (11.25)

multiplying the third of Eqs. (11.24) by z and the fourth one by y and substracting, we have

z 3 + 2xyz = yz ; y 3 + 2xyz = yz ⇒ z 3 − y 3 = 0

⇒ (z − y) y 2 + yz + z 2 = 0 (11.26)

Equation (11.26) has three solutions


1√  4π 1 √  4π
(i) y = z ; (ii) y = i 3 − 1 z = e−i 3 z; (iii) y = − 1 + i 3 z = ei 3 z (11.27)
2 2
replacing each of them in Eq. (11.25) we have

(i) y = z ⇒ z 2 + z 2 + 4z 2 + z + z = 0 ⇒ 2z (3z + 1) = 0
1
(i) y = z ⇒ y = z = 0 or y = z = − (11.28)
3
Let us rewrite Eq. (11.25) 
y 2 + yz + z 2 + 3yz + y + z = 0 (11.29)
and remembering that solutions (ii) and (iii) in Eq. (11.27) are the roots of y 2 + yz + z 2 , we see that Eq. (11.29) simplifies
to 3yz + y + z = 0, when those roots are used. Therefore
1√  1√ 
(ii) 3 · i 3 − 1 z2 + i 3−1 z+z = 0
2 2
1 √  2 1 √ 
(iii) −3· 1+i 3 z − 1+i 3 z+z = 0
2 2
consequently
3√  1√ 
(ii) z = 0 or i 3−1 z+ i 3−1 +1=0
2 2
3 √  1 √ 
(iii) z = 0 or − 1+i 3 z− 1+i 3 +1=0
2 2
solving the linear equation for z we find
 
1 1 √ 1 1 2π
(ii) z = 0 or z = i 3− = ei 3 (11.30)
3 2 2 3
√ !
1 i 3 1 1 2π
(iii) z = 0 or z = − − = e−i 3 (11.31)
3 2 2 3

picking up Eqs. (11.28, 11.30, 11.31) we obtain the solutions of y and z for Eqs. (11.24)
11.5. THE REDUCTION OF THE REGULAR REPRESENTATION OF C3 209

1
(ia) y = z = 0 ; (ib) y = z = −
3
1 i 2π
(iia) z = 0 ; (iib) z = e 3
3
1 −i 2π
(iiia) z = 0 ; (iiib) z = e 3
3
but it is easy to check from Eqs. (11.24), that solutions (ia), (iia) and (iiia), lead to x = y = z = 0, which in turn lead to
e2 = 0, according with Eq. (11.19), but this e2 is not an idempotent (they are non-zero by definition). So the solutions are
1 1 2π 1 2π
(i) y = z = − ; (ii) z = ei 3 ; (iii) z = e−i 3 (11.32)
3 3 3
but solutions (ii) and (iii) in Eq. (11.32) correspond to solutions (ii) and (iii) of Eq. (11.27). Then we have
 
4π 4π 1 i 2π 1 2π
(ii) y = e−i 3 z = e−i 3 e 3 = e−i 3
3 3
 
4π 4π 1 2π 1 2π
(iii) y = ei 3 z = ei 3 e−i 3 = ei 3
3 3
so that
1 1 2π 1 2π
(i) y = z = −; (ii) z = y ∗ = ei 3 ; (iii) z = y ∗ = e−i 3 (11.33)
3 3 3
and x for each solution is easily found from the first of Eqs. (11.24)
   
2 1 2π 2 1 1
x = −y − z ⇒ (i) x = ; (ii) x = −z ∗ − z = −2Re (z) = −2 · cos =− − =
3 3 3 3 2 3
 
1 2π 1
(iii) x = −y − y ∗ = −2Re (y) = −2 · cos =
3 3 3
the final set of solutions is
2 1 1 1 2π 1 1 2π
(i) x = , y = z = − ; (ii) x = , z = y ∗ = ei 3 ; (iii) x = , z = y ∗ = e−i 3 (11.34)
3 3 3 3 3 3
We should remember that the existence of several idempotents is an evidence of the fact that the group algebra is not a
division algebra.

11.5.2 Checking for primitive idempotence


Now we should check whether these solutions correspond to primitive idempotents. Certainly, at least one of them must not be
primitive, because otherwise we would obtain 4 idempotents (e1 associated with the identity representation, and other three),
which is more than the number of irreducible inequivalent representations nc = 3. This is quite expected because our solutions
have been extracted using idempotence and orthogonality with e1 , but such solutions do not guarantee orthogonality of the
solutions nor the condition in theorem 11.4.
(i)
Let us check e2 corresponding to the first solution in Eq. (11.34), replacing that solution in Eq. (11.19) we have
(i) 2 1 1
e2 = e − a − a−1
3 3 3
(i)
using the abelianity of C3 , we see that the multiplication of e2 with the elements of the group gives
(i) (i) (i) 2 1 1
e2 e = ee2 = e2 = e − a − a−1
 3 3 3 
(i) (i) 2 1 1 2 1 1
e2 a = ae2 = a e − a − a−1 = a − a−1 − e
3 3 3 3 3 3
 
(i) (i) 2 1 1 2 1 1
e2 a−1 = a−1 e2 = a−1 e − a − a−1 = a−1 − e − a
3 3 3 3 3 3
(i)
the relations above permit us to check the conditions of theorem 11.4, using the abelianity of C3 , and the idempotence of e2
we have
(i) (i) (i)
e2 ee2 = e2
(i) (i) (i) (i) (i) 1 2 1
e2 ae2 = e2 e2 a = e2 a = − e + a − a−1
3 3 3
210 CHAPTER 11. GROUP ALGEBRA AND THE REDUCTION OF THE REGULAR REPRESENTATION

(i) (i) (i) (i)


therefore e2 ae2 = e2 a 6= λa e2 so that theorem 11.4 predicts that it is not a primitive idempotent.
(ii) (iii)
Let us check e2 , e2 corresponding to the second and third solutions in Eq. (11.34), replacing those solutions in Eq.
(11.19) we have

(iii) 1h 2π 2π
i
e2 ≡ e+ = e + ei 3 a + e−i 3 a−1 (11.35)
3
(ii) 1h 2π 2π
i
e2 ≡ e− = e + e−i 3 a + ei 3 a−1 (11.36)
3
let us see the result of multiplying e+ with the elements of the group
1h 2π 2π
i
ee+ = e+ e = e+ = e + ei 3 a + e−i 3 a−1
3
1 h 2π 2π
i 1h 2π 2π
i
ae+ = e+ a = a e + ei 3 a + e−i 3 a−1 = a + ei 3 a−1 + e−i 3 e
3 3
1h i 2π −1 i 4π
i
−i 2π 1h 2π 2π
i 2π
= e + ae 3 + a e 3 e 3 = e + aei 3 + a−1 e−i 3 e−i 3
3 3

ae+ = e+ a = e−i 3 e+
1 h 2π 2π
i 1h 2π 2π
i
a−1 e+ = e+ a−1 = a−1 e + ei 3 a + e−i 3 a−1 = a−1 + ei 3 e + e−i 3 a
3 3
1h −i 4π −i 2π
−1
i 2π
i 1h 2π 2π
i 2π
= e+e 3 a+e 3 a e 3 = e + ei 3 a + e−i 3 a−1 ei 3
3 3
−1 −1 i 2π
a e+ = e+ a = e e+ 3

so we find
2π 2π
ee+ = e+ e = e+ ; ae+ = e+ a = e−i 3 e+ ; a−1 e+ = e+ a−1 = ei 3 e+ (11.37)
remembering that idempotence is already guaranteed by the solutions (11.34), it is obtained

e+ ee+ = e+ (ee+ ) = e+ e+ = e+
2π 2π
e+ ae+ = e+ (ae+ ) = e−i 3 e+ e+ = e−i 3 e+
 2π 2π
e+ a−1 e+ = e+ a−1 e+ = ei 3 e+ e+ = ei 3 e+

e
we see that the conditions of theorem 11.4, are satisfied by all elements of the group (i.e. for all elements of a basis of G).
e
Since any r ∈ G can be written as
r = λ1 e + λ2 a + λ3 a−1
we see that

e+ re+ = e+ λ1 e + λ2 a + λ3 a−1 e+ = λ1 e+ ee+ + λ2 e+ ae+ + λ3 e+ a−1 e+
2π 2π
e+ re+ = λ1 e+ + λ2 e−i 3 e+ + λ3 ei 3 e+
−i 2π 2π
e+ re+ = λr e+ ; λr ≡ λ1 + λ2 e 3 + λ3 ei 3

so that conditions of the theorem 11.4 are satisfied, and e+ is a primitive idempotent. In the same way we can show that e−
is also a primitive idempotent.

11.5.3 Checking for inequivalent primitive idempotents


The next natural question is whether e+ and e− generate equivalent representations. This question was the motivation to
develop theorem 11.5. According with that theorem we should evaluate e+ re− for all r ∈ G, e it is clearly sufficient to evaluate
e+ ge− for all g ∈ G.
2π  2π
e+ ee− = (e+ e) e− = e+ e− ; e+ ae− = (e+ a) e− = e−i 3 e+ e− ; e+ a−1 e− = e+ a−1 e− = ei 3 e+ e−

we should evaluate e+ e− . Using Eqs. (11.35, 11.36) we have


1h 2π 2π
ih 2π 2π
i
e+ e− = e + ei 3 a + e−i 3 a−1 e + e−i 3 a + ei 3 a−1
9
eh 2π 2π
i ei 2π3
h 2π 2π
i e−i 2π3
h 2π 2π
i
= e + e−i 3 a + ei 3 a−1 + a e + e−i 3 a + ei 3 a−1 + a−1 e + e−i 3 a + ei 3 a−1
9 9 9
1h 2π 2π
i ei 2π3
h 2π 2π
i e−i 2π
3
h 2π 2π
i
= e + e−i 3 a + ei 3 a−1 + a + e−i 3 a−1 + ei 3 e + a−1 + e−i 3 e + ei 3 a
9 9 9
11.5. THE REDUCTION OF THE REGULAR REPRESENTATION OF C3 211

µ e a a−1
1 1 1 1
2π 2π
2 1 e−i 3 ei 3
2π 2π
3 1 ei 3 e−i 3
Table 11.1: Irreducible representations of the group C3 .

eh 4π 4π
i a h 2π 2π
i a−1 h 2π 2π
i
e+ e− = 1 + ei 3 + e−i 3 + e−i 3 + ei 3 + 1 + ei 3 + 1 + e−i 3
9   9   9 
e 4π a 2π a−1 2π
e+ e− = 1 + 2 cos + 2 cos +1 + 2 cos +1
9 3 9 3 9 3
e+ e− = 0

finally, it can be checked that e1 e+ = e1 e− = 0. Therefore, according with theorem 11.5, the three primitive idempotents
e1 , e+ and e− generate inequivalent representations. The idempotent e1 spans the one-dimensional
n o left-ideal L1 , and then
e
generates the identity representation. The idempotent e+ spans the left-ideal L2 ≡ re+ ; ∀r ∈ G , the characters of the
associated irreducible representation of C3 can be found by applying Eq. (11.37) so that
2π 2π
e |e+ i = |e+ i 1 ; a |e+ i = |ae+ i = |e+ i e−i ; a−1 |e+ i = a−1 e+ = |e+ i ei 3
3

  2π 2π

thus, for the idempotent e+ , the representation elements associated with e, a, a−1 are 1, e−i 3 , ei 3 . Similarly, for the
  2π 2π

idempotent e− , the representation elements associated with e, a, a−1 are 1, ei 3 , e−i 3 . Table 11.1 shows the irreducible
representations of the group C3 . It is straightforward to check that these representations satisfy the orthogonality and com-
(i)
pleteness relations. Finally, it can be checked that the non-primitive idempotent e2 is equal to e+ + e− , showing that it is
decomposable.
Chapter 12

Representations of the permutation group

Cayley’s theorem gives a particular relevance to the symmetric or permutation groups Sn since any finite group of order
n is contained as a subgroup of Sn . From the point of view of the theory of representations, it means that knowing the
irreducible representations of symmetric groups would provide some information concerning the irreducible representations of
other finite groups. On the other hand, the irreducible representations of Sn will be useful in the study of finite dimensional
irreducible representations of some of the classical continuous groups, by means of the so-called tensor method. Furthermore,
permutation symmetry is of great relevance in the study of systems of identical particles in quantum mechanics, by virtue of
the symmetrization postulate.
In this chapter we shall construct all inequivalent irreducible representations of Sn for arbitrary n. For this, we introduce
the necessary tools: Young diagrams, Young tableaux and the associated symmetrizers and antisymmetrizers as well as the
irreducible symmetrizers. We shall see that the irreducible symmetrizers provide the projectors (idempotents) necessary to
obtain the irreducible representations on the group algebra space. It leads to the complete decomposition of the regular repre-
sentation of Sn . Next, in chapter 13, we analyze the role of the symmetrizers in the study of finite dimensional representations
of the general linear group of m−dimension GL (m), which is based on the complementary role of the Sn and GL (m) groups
on the space of nth−rank tensors in m−dimensional space.

12.1 One dimensional representations


From now on we shall assume that n ≥ 2. Every symmetric group Sn contains a non-trivial invariant subgroup An called the
alternating group (see definition 6.9, theorem 6.13), consisting of all even permutations within Sn . The quotient group Sn /An
is isomorphic with C2 . So theorem 7.2 says that Sn has two one-dimensional representations induced by the representations
p
of Sn /An ≃ C2 . The first is the identity representation and the second assigns to each permutation p the number (−1) which
p
is 1 for an “even” permutation and (−1) for an “odd” permutation. We define (−1) as the parity of the permutation.
There is another way of obtaining the one dimensional representations of Sn by using the idempotents i.e. the projection
operators on the group algebra.
Pn! Pn! p
Theorem 12.1 The symmetrizer s ≡ p=1 p and the antisymmetrizer a ≡ p=1 (−1) p of the group Sn are essentially
idempotent and primitive.

Proof : If q ∈ Sn we have that


n!
X n!
X n!
X n!
X
qs = q p= qp = p′ = s ; sq = pq = s ∀q ∈ Sn
p=1 p=1 p′ =1 p=1

so that qs = sq = s because of the rearrangement lemma. We have then


" n! # n! n!
X X X
ss = q s= qs = s
q=1 p=1 p=1
ss = n!s ⇒ sqs = (sq) s = ss = n!s ∀q ∈ Sn (12.1)

For the antisymmetrizer a we have


n!
X n!
X n!
X n!
X
p p p′ +q q p′ q
qa = q (−1) p = (−1) qp = (−1) p′ = (−1) (−1) p′ = (−1) a
p=1 p=1 p′ =1 p′ =1
q
aq = qa = (−1) a ∀q ∈ Sn (12.2)

212
12.2. PARTITIONS AND YOUNG DIAGRAMS 213

q
where we have used the fact that multiplication of each element p with q changes the parity according with the factor (−1) ,
if q is an even permutation then pq and qp have the same parity as p, but if q is odd, the permutations pq and qp have the
opposite parity of p. We then have
" n! # n! n! n!
X p
X p
X p p
X
aa = (−1) p a = (−1) (pa) = (−1) (−1) a = a
p=1 p=1 p=1 p=1
q q
aa = n!a ⇒ aqa = (aq) a = (−1) aa = (−1) n!a ∀q ∈ Sn
finally we examine the product as
" n!
# n! n! n!
X p
X X X
as = (−1) p s = (−1)p (ps) = (−1)p s = s (−1)p = 0
p=1 p=1 p=1 p=1
as = sa = 0
Pn! p
where we have used the fact that in the sum p=1 (−1) half of the terms are positive and the other half are negative (for
n ≥ 2). In summary, we have the following properties
qs = sq = s ; ss = n!s ; sqs = n!s ; ∀q ∈ Sn (12.3)
q q
aq = qa = (−1) a ; aa = n!a ; aqa = (−1) n!a ; ∀q ∈ Sn (12.4)
as = sa = 0 (12.5)
Therefore according with definition 11.4 and theorem 11.4, the set {s, a} gives essentially idempotents and each idempotent
is also primitive. QED.
Further, from the discussion in section 11.3, we see that s and a generate irreducible representations of Sn on the group
algebra. Moreover, since sqa = sa = 0 theorem 11.5 says that they generate inequivalent irreducible representations. Each
primitive idempotent s, a generate an invariant subspace (i.e. the minimal left-ideals generated by a and s. See definition 11.5,
page 204). The basis vectors of the irreducible inequivalent representations, are of the form |psi and |pai respectively. The
minimal left-ideals are n o n o
Ls ≡ rs : r ∈ Ge ; La ≡ ra : r ∈ G e
P
but r = pi ri hence
i X X X
rs = (pi s) ri = sri = s ri = αs (12.6)
i i i
P P p
with α a complex number. Hence, Ls is one-dimensional. In the same way, ra = i (pi a) ri = a i (−1) i ri = βa with
β ∈ C and La is also one-dimensional. Since ps = s ∀p ∈ Sn then Ls generates the identity representation1. On the other
hand, pa = (−1)p a generates the representation that assigns +1 if p is even and (−1) if p is odd. Thus, they provide the same
one-dimensional irreducible representations generated by the quotient group Sn /An .

12.2 Partitions and Young diagrams


Young diagrams are useful tools to generate primitive idempotents for all irreducible representations of Sn . From definition
6.11, let us recall that a partition λ ≡ {λ1 , λ2 , . . . , λr } of the integer n is a sequence of positive integers λi arranged in
descending order whose sum equals n
r
X
λk ≥ λk+1 , k = 1, . . . , r − 1 ; λi = n
i=1

Two partitions λ, η are considered equal if λi = ηi for all i. On the other hand λ > η (λ < η) if the first non-zero number in
the sequence (λi − ηi ) is positive (negative).
Definition 12.1 A partition λ = {λ1 , . . . , λi , . . . , λr } of the integer n, is graphically represented by a Young diagram which
consists of n squares arranged in r rows, the i−th one of which contains λi squares.
Example 12.1 For n = 3 there are three distinct partitions: {3} , {2, 1} , {1, 1, 1} the associated Young diagrams are respec-
tively

, ,

1 The symmetrizer s is the idempotent e defined by Eq. (11.17) corresponding to the identity representation, except for the “normalization” n−1 .
1 G
The absent of this normalization makes s an esentially idempotent instead of an idempotent.
214 CHAPTER 12. REPRESENTATIONS OF THE PERMUTATION GROUP

Example 12.2 For n = 4 there are five distinct partitions: {4} , {3, 1} , {2, 2} , {2, 1, 1} and {1, 1, 1, 1} and the associated
Young diagrams are

, , , ,

We have already seen (see section 6.6.1) that there is a one-to-one correspondence between the distinct partitions of n and
the distinct cycle structures of permutations in Sn . In turn, there is a one-to-one correspondence between the distinct cycle
structures in Sn and the distinct conjugacy classes of Sn (see section 6.8).
Therefore, there is a one-to-one correspondence between the conjugacy classes of group elements of Sn and the partitions
of n, and since for each partition there is a Young diagram, it follows that

Theorem 12.2 The number of distinct Young diagrams for a given n, is equal to the number of conjugacy classes of Sn , which
in turn, is the number of inequivalent irreducible representations of Sn .

Where we have also used corollary 7.13, page 149. We also denote ν1 the number of 1−cycles, ν2 the number of 2−cycles
and so on for a given permutation. The relation between partitions and cycle structures is described by Eqs. (6.24, 6.25, 6.27)

ν1 + 2ν2 + . . . + nνn = n (12.7)

ν1 + ν2 + . . . + νn = λ1
ν2 + . . . + νn = λ2
.. ..
. = .
νn = λn (12.8)

ν1 = λ1 − λ2
ν2 = λ2 − λ3
.. .
. = ..
νn−1 = λn−1 − λn
νn = λn (12.9)

of course if λ ≡ {λ1 , . . . , λr } defines the partition and r < n, then elements λr+1 , . . . , λn are zero.

Example 12.3 For S3 the class {e} ≡ {(1) (2) (3)} corresponds to the cycle structure ν1 = 3, ν2 = ν3 = 0 and the partition
(λ1 , λ2 , λ3 ) = (3, 0, 0). The class {(12) , (23) , (31)} has a cycle structure2 ν1 = ν2 = 1, ν3 = 0 and a partition (2, 1, 0). Finally,
the class {(123) , (321)} has a cycle structure ν1 = ν2 = 0, ν3 = 1 and a partition (1, 1, 1).

Definition 12.2 (Young Tableau, Normal Tableau, and Standard Tableau): (i) A Young tableau is obtained by filling the
squares of a Young diagram with numbers 1, 2, . . . , n in any order, using each number once and only once. (ii) A normal
Young tableau is one in which the numbers 1, 2, . . . , n appear in order from left to right and from the top row to the bottom
row. (iii) A standard Young tableau is one in which the numbers in each row appear increasing (not neccesarily in strict order)
to the right and those in each column appear increasing to the bottom.

Example 12.4 For n = 4 some Young tableaux are

1
1 4
1 4 2 2 4 4
2 3 1 4 , , , 3 ,
3 1 3 3
2
2

the normal tableaux are


1
1 2
1 2 3 1 2 2
1 2 3 4 , , , 3 ,
4 3 4 3
4
4
2 Remember that {(12) , (23) , (31)} is an abbreviation of {(12) (3) , (23) (1) , (31) (2)}, so that each element has one 2-cycle and one 1-cycle.
12.3. SYMMETRIZERS, ANTI-SYMMETRIZERS, AND IRREDUCIBLE SYMMETRIZERS OF YOUNG TABLEAUX215

and some standard tableaux are


1
1 3
1 2 4 1 3 2
1 2 3 4 , , , 2 , (12.10)
3 2 4 3
4
4

Of course, all normal tableaux are standard tableaux but the opposite is not necessarily true. In Eq. (12.10) the second,
third, and fourth tableaux are standard but not normal.
Definition 12.3 We denote the normal Young tableau associated with a partition λ by the symbol Θλ .
An arbitrary Young tableau is obtained from the corresponding Θλ by applying an appropriate permutation p on the
numbers 1, 2, . . . , n in the boxes. Then an arbitrary tableau can be expressed uniquely as Θpλ ≡ pΘλ . It is quite obvious that
qΘpλ = Θqp
λ .

Example 12.5 Let p1 , p2 be two permutations of S7 given by


   
1 2 3 4 5 6 7 1 2 3 4 5 6 7
p1 = ; p2 =
3 6 5 2 7 4 1 4 3 7 1 5 6 2
the product p1 p2 yields
  
1 2 3 4 5 6 7 1 2 3 4 5 6 7
p1 p2 =
3 6 5 2 7 4 1 4 3 7 1 5 6 2
  
4 3 7 1 5 6 2 1 2 3 4 5 6 7
=
2 5 1 3 7 4 6 4 3 7 1 5 6 2
 
1 2 3 4 5 6 7
p1 p2 =
2 5 1 3 7 4 6
the normal tableau associated with the partition {4, 2, 1}, and the corresponding permuted tableaux are given by
1 2 3 4 3 6 5 2
Θ{4,2,1} ≡ 5 6 ; p1 Θ{4,2,1} = Θp{4,2,1}
1
= 7 4
7 1
4 3 7 1 2 5 1 3
p2 Θ{4,2,1} = Θp{4,2,1}
2
= 5 6 ; p1 p2 Θ{4,2,1} = Θp{4,2,1}
1 p2
= 7 4
2 6

12.3 Symmetrizers, anti-symmetrizers, and irreducible symmetrizers of Young


tableaux
We shall see that horizontal and vertical permutations of Young tableaux will permit to construct symmetrizers and anti-
symmetrizers of them. These symmetrizers and anti-symmetrizers will in turn permit to construct idempotents to build the
irreducible representations of Sn . We shall have an idempotent for each Young tableau.
Definition 12.4 (Horizontal and vertical permutations): Given a Young tableau Θpλ , we define horizontal permutations {hpλ }
as permutations of the numbers 1, 2, . . . , n in Θpλ such that each number remains in the same row of Θpλ after the permutation.
Similarly, in vertical permutations {vλp } each number appears in the same column of Θpλ after the permutation. In particular,
the identity is both a vertical and a horizontal permutation.
Example 12.6 With respect to the normal Young tableau
1 2 3 4
5 6 7
Θ{4,3,2,1} ≡ Θλ ≡ (12.11)
8 9
10
of S10 the following are horizontal permutations
2 3 4 1 2 3 1 4 2 1 4 3
5 7 6 6 5 7 5 6 7
h1 Θ λ = ; h2 Θ λ = ; h3 Θ λ = (12.12)
9 8 9 8 8 9
10 10 10
216 CHAPTER 12. REPRESENTATIONS OF THE PERMUTATION GROUP

while the following ones are vertical permutations

5 2 7 4 10 6 3 4 1 2 7 4
1 9 3 5 2 7 5 6 3
v1 Θλ = ; v2 Θλ = ; v3 Θλ = (12.13)
10 6 8 9 8 9
8 1 10

It is clear that the cycles comprising a horizontal permutation hpλ must only contain numbers that appears in the same row
of the associated Young tableau Θpλ . In the same manner, the cycles in a vertical permutation vλp must only involve numbers
in the same column of its Young tableau Θpλ .

Example 12.7 The cycle structure of the horizontal and vertical permutations in example 12.6 are given by

h1 = (1, 2, 3, 4) (6, 7) (8, 9) ; h2 = (1, 2, 3) (5, 6) (8, 9) ; h3 = (1, 2) (3, 4) (12.14)


v1 = (1, 5) (8, 10) (6, 9) (3, 7) ; v2 = (1, 10) (2, 6) ; v3 = (3, 7) (12.15)

where we have used commas in the cycle structure, because there is an element of two ciphers (n = 10) in the set to be permuted.

Definition 12.5 (Symmetrizers, antisymmetrizers, and irreducible symmetrizers): The symmetrizer spλ , the anti-symmetrizer
apλ , and the irreducible symmetrizer epλ associated with the Young tableau Θpλ are defined as
X p
spλ ≡ hλ (sum over all horizontal permutations)
h
X vλ
apλ ≡ (−1) vλp (sum over all vertical permutations)
v
X vλ
epλ ≡ spλ apλ = (−1) hpλ vλp (sum over all hpλ and all vλp )
h,v

the irreducible symmetrizer is also called a Young symmetrizer.

12.4 Symmetrizers, antisymmetrizers, and irreducible symmetrizers of Young


tableaux associated with S3
Let us evaluate the symmetrizers, antisymmetrizers, and irreducible symmetrizers of the normal Young tableaux associated
with all the partitions λ1 = {3} , λ2 = {2, 1} , λ3 = {1, 1, 1} of n = 3.
For the partition λ1 = {3} the normal Young tableau reads

Θ1 = 1 2 3 ↔ λ1 = {3}

it is clear that horizontal permutations hp1 include all permutations p of S3 , since the diagram consist of a single row. Only
the identity is a vertical permutation v1p = e, because each column has a single element. Further
X p X
s1 = h1 = p = s (symmetrizer of the full group)
h p
a1 = e
e1 = s1 a1 = se = s (12.16)

Now for the normal Young tableau associated with λ2 = {2, 1}, we have

1 2
Θ2 = ↔ λ2 = {2, 1}
3

horizontal permutations h2 are h2 = e, (12). Vertical permutations are e, (31). Therefore

s2 = e + (12) ; a2 = e − (31) ;
e2 = s2 a2 = [e + (12)] [e − (31)] = [e + (12)] e − [e + (12)] (31)
e2 = e + (12) − (31) − (321) (12.17)

Now, for λ3 = {1, 1, 1}, we have


1
Θ3 = 2 ↔ λ3 = {1, 1, 1}
3
12.4. SYMMETRIZERS, ANTISYMMETRIZERS, AND IRREDUCIBLE SYMMETRIZERS OF YOUNG TABLEAUX ASSOCI

only e is a horizontal permutation h3 . Vertical permutations vλ include all permutations in S3 . Thus


X p
s3 = e ; a3 = (−1) p = a (anti-symmetrizer of the full group) ; e3 = ea = a (12.18)
p

These are all normal tableaux of S3 . As for the standard tableaux, apart from the normal tableaux there is one more

(23) 1 3
Θ2 = ↔ λ2 = {2, 1}
2
(23) (23)
which is clearly equal to (23) Θ2 . The horizontal permutations h2 are e, (31). The vertical permutations v2 are e, (12).
Therefore
(23) (23)
s2 = e + (31) ; a2 = e − (12)
(23)
e2 = [e + (31)] [e − (12)] = [e + (31)] e − [e + (31)] (12)
(23)
e2 = e + (31) − (12) − (123) (12.19)

12.4.1 Properties of idempotents and left-ideals of S3


Let us discuss some properties that can be seen in this specific example, and that suggest the general framework. First we see
that for each Young tableau Θλ , the horizontal permutations {hλ } form a subgroup Shλ ⊆ S3 and the vertical permutations
{vλ } form a subgroup Svλ ⊆ S3 .
Now, we observe that sλ is the total symmetrizer of the subgroup Shλ and aλ is the total anti-symmetrizer of Svλ . Taking
this into account and the fact that hλ ∈ Shλ and vλ ∈ Svλ , we can use the results of Sec. 12.1, Eqs. (12.3, 12.4) to obtain3

sλ hλ = hλ sλ = sλ , aλ vλ = vλ aλ = (−1) aλ ; sλ sλ = nλ sλ ; aλ aλ = nλ aλ

where nλ = λ1 !λ2 ! . . . λn !. Hence, sλ and aλ are esentially idempotents. However, they are not in general, primitive idempotents.
Note that for this example, the irreducible symmetrizers eλ for each standard tableau Θλ of S3
X v
eλ ≡ (−1) λ hλ vλ
h,v

are primitive idempotents. This is obvious for e1 , e3 in Eqs. (12.16, 12.18), because e1 = s and e3 = a i.e. the total symmetrizer
(23)
and the total anti-symmetrizer of the whole group S3 . Moreover, it can be checked that e2 and e2 given by Eqs. (12.17,
12.19), are also primitive idempotents.
We already know that s = e1 and a = e3 generate the two inequivalent irreducible one-dimensional representations of S3 .
Similarly, the primitive idempotent e2 given by Eq. (12.17), generates the two-dimensional representation
n of S3 o(see table 7.4
example 7.13, page 149). We can verify explicitly that the left-ideal generated by e2 i.e. Le2 ≡ re2 : r ∈ Se3 generates a
two-dimensional subspace of the group algebra. To verify this, we see the action of e2 on each element of the basis {pe2 : p ∈ S3 }
of Le2 :

ee2 = e2
(12) e2 = (12) [e + (12) − (31) − (321)] = (12) + e − (321) − (31) = e2
(23) e2 = (23) [e + (12) − (31) − (321)] = (23) + (321) − (123) − (12) ≡ r2
(31) e2 = (31) [e + (12) − (31) − (321)] = (31) + (123) − e − (23)
= − [e + (12) − (31) − (321) + (23) + (321) − (123) − (12)] = −e2 − r2
(123) e2 = (123) [e + (12) − (31) − (321)] = (123) + (31) − (23) − e
= (31) e2 = −e2 − r2
(321) e2 = (321) [e + (12) − (31) − (321)] = (321) + (23) − (12) − (123) = r2

in summary we have

ee2 = e2 , (12) e2 = e2 , (23) e2 = r2 , (31) e2 = −e2 − r2 , (123) e2 = −e2 − r2 , (321) e2 = r2


r2 ≡ (23) + (321) − (123) − (12) (12.20)

so that this irreducible invariant subspace under S3 (minimal left ideal of Se3 ) is spanned by e2 and r2 . The basis chosen
is of course arbitrary. We then see that the irreducible symmetrizers of the Normal Young tableaux generate all irreducible
representations of S3 .
3 Note that Eq. (12.5) cannot be applied because sλ and aλ are symmetrizers and anti-symmetrizers of different subgroups.
218 CHAPTER 12. REPRESENTATIONS OF THE PERMUTATION GROUP

(23)
It can also be checked that e2 in Eq. (12.19) also generates a two-dimensional irreducible representation. Since S3 has
only one two-dimensional irreducible representation, it must be equivalent to the representation generated by e2 . Neverthe-
(23)
less, the invariant subspace (minimal left-ideal), generated by e2 is disjoint from the one generated by e2 . The left-ideal
(23) (23)
Le(23) generated by e2 is spanned by e2 and
2

(23)
r2 = (123) + (23) − (31) − (321) (12.21)
It can be shown that Le(23) is disjoint (only the zero element in common) with each of the left-ideals generated by the other
2
tableaux.
It worths saying that a good way to visualize the left ideals expanded by the irreducible symmetrizers lies in the fact that
the elements of the group gk are seen as an orthonormal basis for the group algebra. Thus for S3 we can assign
e → u1 , (12) → u2 , (13) → u3 , (23) → u4 , (123) → u5 , (321) → u6 (12.22)
for instance from Eqs. (12.17, 12.20) we have
e2 = e + (12) − (31) − (321) = u1 + u2 − u3 + 0 · u4 + 0 · u5 − u6 ≡ (1, 1, −1, 0, 0, −1) (12.23)
r2 ≡ (23) + (321) − (123) − (12) = 0 · u1 − u2 + 0 · u3 + u4 − u5 + u6 ≡ (0, −1, 0, 1, −1, 1) (12.24)
picking up Eqs. (12.16, 12.17, 12.18, 12.19, 12.20, 12.21) we see that the four left ideals are expanded by the following vectors
Le1 → {e1 } = {(1, 1, 1, 1, 1, 1)} ; Le3 → {e3 } = {(1, −1, −1, −1, 1, 1)}
Le2 → {e2 , r2 } = {(1, 1, −1, 0, 0, −1) , (0, −1, 0, 1, −1, 1)}
n o
(23) (23)
Le(23) → e2 , r2 = {(1, −1, 1, 0, 1, 0) , (0, 0, −1, 1, 1, −1)} (12.25)
2

these six vectors are linearly independent as can be seen from their associated matrix, with the vectors being the columns of
the matrix  
1 1 0 1 0 1
 1 1 −1 −1 0 −1 
 
 1 −1 0 1 −1 −1 
det 
 1
 = 54 6= 0
 0 1 0 1 −1  
 1 0 −1 1 1 1 
1 −1 1 0 −1 1
(23)
It is then clear that the four left-ideals generated by the idempotents of the four standard tableaux e1 , e2 , e3 , e2 span the
whole six-dimensional group algebra space Se3
Se3 = Le1 ⊕ Le2 ⊕ Le(23) ⊕ Le3 = Le1 ⊕ 2Le2 ⊕ Le3 (12.26)
2

and that two standard Young tableaux associated with the same Young diagram, generate equivalent representations. In this
case, Le1 appears only once because there is only one standard tableau for its associated Young diagram, and same for Le3 .
Now, Le2 appears twice because there are two standard Young tableaux associated with the corresponding Young diagram. It
coincides with the fact that in the regular representation each irreducible representation appears nµ times.
The identity element of Se3 has a unique decomposition along the four left-ideals in Eq. (12.26). To find it, we expand e in
the basis defined by Eqs. (12.25)

e1 = (1, 1, 1, 1, 1, 1) ; e2 = (1, 1, −1, 0, 0, −1) ; r2 = (0, −1, 0, 1, −1, 1)


(23) (23)
e2 = (1, −1, 1, 0, 1, 0) ; r2 = (0, 0, −1, 1, 1, −1) ; e3 = (1, −1, −1, −1, 1, 1) (12.27)
then we write the identity as a linear combination of this basis
(23) (23)
x1 e1 + x2 e2 + x3 r2 + x4 e2 + x5 r2 + x6 e3 = e (12.28)
and equating Eq. (12.28) by components we obtain
x1 + x2 + 0 · x3 + x4 + 0 · x5 + x6 = 1
x1 + x2 − x3 − x4 + 0 · x5 − x6 = 0
x1 − x2 + 0 · x3 + x4 − x5 − x6 = 0
x1 + 0 · x2 + x3 + 0 · x4 + x5 − x6 = 0
x1 + 0 · x2 − x3 + x4 + x5 + x6 = 0
x1 − x2 + x3 + 0 · x4 − x5 + x6 = 0 (12.29)
the solution of this set of linear equations yields⁴
\[
x_1 = \frac{1}{18}\ ,\quad x_2 = \frac{5}{9}\ ,\quad x_3 = \frac{2}{9}\ ,\quad x_4 = \frac{1}{3}\ ,\quad x_5 = -\frac{2}{9}\ ,\quad x_6 = \frac{1}{18}
\]
so that the decomposition (12.28) of the identity in the basis (12.27) is given by
\[
e = \frac{1}{18}\,e_1 + \frac{5}{9}\,e_2 + \frac{2}{9}\,r_2 + \frac{1}{3}\,e_2^{(23)} - \frac{2}{9}\,r_2^{(23)} + \frac{1}{18}\,e_3 \tag{12.30}
\]
where the unique components of the identity along each left-ideal are given by
\[
\frac{1}{18}\,e_1 \in L_{e_1}\ ;\quad \frac{5}{9}\,e_2 + \frac{2}{9}\,r_2 \in L_{e_2}\ ;\quad \frac{1}{3}\,e_2^{(23)} - \frac{2}{9}\,r_2^{(23)} \in L_{e_2^{(23)}}\ ;\quad \frac{1}{18}\,e_3 \in L_{e_3}
\]
In conclusion, the regular representation of S3 has been fully reduced by using the irreducible symmetrizers associated with
its standard tableaux. In the next two sections we shall generalize these particular observations.
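As a quick cross-check of the numbers above, the following small Python/NumPy sketch (an addition to these notes, not part of the original text; the variable names are arbitrary) rebuilds the six vectors of Eq. (12.27), evaluates the determinant, and solves the system (12.29):

```python
# Basis assignment of Eq. (12.22): e->u1, (12)->u2, (13)->u3, (23)->u4, (123)->u5, (321)->u6
import numpy as np
from fractions import Fraction

# columns: e1, e2, r2, e2^(23), r2^(23), e3  [Eqs. (12.25, 12.27)]
vectors = np.array([
    [1, 1, 1, 1, 1, 1],     # e1
    [1, 1, -1, 0, 0, -1],   # e2
    [0, -1, 0, 1, -1, 1],   # r2
    [1, -1, 1, 0, 1, 0],    # e2^(23)
    [0, 0, -1, 1, 1, -1],   # r2^(23)
    [1, -1, -1, -1, 1, 1],  # e3
]).T                         # vectors as columns

print(round(np.linalg.det(vectors)))     # 54 != 0: the six vectors span the group algebra

e = np.array([1, 0, 0, 0, 0, 0])          # the group identity, e -> u1
x = np.linalg.solve(vectors, e)           # coefficients x1,...,x6 of Eq. (12.28)
print([str(Fraction(c).limit_denominator(50)) for c in x])
# expected: ['1/18', '5/9', '2/9', '1/3', '-2/9', '1/18'], i.e. Eq. (12.30)
```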

12.5 General properties of Young tableaux


We shall explore some general properties of horizontal and vertical permutations and of their corresponding symmetrizers, anti-symmetrizers and irreducible symmetrizers, which are needed to generate the inequivalent irreducible representations of $S_n$. We shall see later that the primitive idempotents of the group algebra of the regular representation of $S_n$ are given by the irreducible symmetrizers. Hence the irreducible symmetrizers become the generators of the irreducible representations of $S_n$.

Lemma 12.1 Let {hλ } and {vλ } be horizontal and vertical permutations of the normal Young tableaux Θλ and let sλ , aλ and
eλ be the associated symmetrizer, anti-symmetrizer and irreducible symmetrizer respectively. The corresponding quantities on
an arbitrary Young tableau Θpλ ≡ pΘλ are given by

\[
h_\lambda^p = p h_\lambda p^{-1}\ ;\quad v_\lambda^p = p v_\lambda p^{-1}\ ;\quad s_\lambda^p = p s_\lambda p^{-1}\ ;\quad a_\lambda^p = p a_\lambda p^{-1}\ ;\quad e_\lambda^p = p e_\lambda p^{-1} \tag{12.31}
\]

This lemma which can be checked by inspection of specific examples, says that all algebraic relations involving the operators
on a normal tableau are also satisfied by the corresponding operators on any tableau associated with the same Young diagram.
Thus, we can concentrate on normal tableaux Θλ only, on the understanding that the same properties are satisfied by arbitrary
Young tableaux Θpλ .

Lemma 12.2 For any given Θλ , the set of horizontal permutations {hλ } forms a subgroup Shλ of Sn , and sλ is the total
symmetrizer of this subgroup. The set of vertical permutations {vλ } also forms a subgroup Svλ of Sn and aλ is its associated
total anti-symmetrizer. Further, sλ and aλ satisfy the relations
\[
s_\lambda h_\lambda = h_\lambda s_\lambda = s_\lambda\ ,\quad a_\lambda v_\lambda = v_\lambda a_\lambda = (-1)^{v_\lambda} a_\lambda\ ,\quad h_\lambda e_\lambda v_\lambda = (-1)^{v_\lambda} e_\lambda\ ;\quad \forall h_\lambda, v_\lambda \tag{12.32}
\]

the symmetrizers and antisymmetrizers satisfy the relations

sλ sλ = ξλ sλ ; aλ aλ = ηλ aλ (12.33)

so that they are essentially idempotents. However, they are not in general primitive idempotents.

Proof: The fact that the set $\{h_\lambda\}$ forms a subgroup of $S_n$ follows by observing that (a) the product of two horizontal permutations is another horizontal permutation, (b) the identity permutation is an element of the set, and (c) the inverse of any element is contained in the set⁵. The same holds for $\{v_\lambda\}$.
The properties (12.32, 12.33) follow from the fact that sλ is the total symmetrizer of the subgroup Shλ and aλ is the total
anti-symmetrizer of the subgroup Svλ , and following a procedure based on the rearrangement lemma similar to the proof of
theorem 12.1. QED.

Lemma 12.3 Let Shλ ≡ {hλ } and Svλ ≡ {vλ } be the subgroups of all horizontal and all vertical permutations associated with
a given Θλ . If p ∈ Sn is expressible as p = hλ vλ , this decomposition is unique. That is, p = h′λ vλ′ implies h′λ = hλ and vλ′ = vλ .
⁴ We could be tempted to use a decomposition of $e$ of the form
\[
e = |e_1\rangle\langle e_1|e\rangle + |e_2\rangle\langle e_2|e\rangle + |r_2\rangle\langle r_2|e\rangle + \left|e_2^{(23)}\right\rangle\left\langle e_2^{(23)}\middle|e\right\rangle + \left|r_2^{(23)}\right\rangle\left\langle r_2^{(23)}\middle|e\right\rangle + |e_3\rangle\langle e_3|e\rangle
\]
however, we cannot decompose $e$ in this way, because this basis is neither normalized nor orthogonal.
5 Strictly speaking only the first condition is necessary for subgroups of finite groups.

Proof: If $p = h'_\lambda v'_\lambda$, then $h_\lambda v_\lambda = h'_\lambda v'_\lambda$, so that $h'^{-1}_\lambda h_\lambda = v'_\lambda v_\lambda^{-1}$. By Lemma 12.2, the LHS of this equation is a horizontal permutation and the RHS is a vertical one. Since the only intersection between them is the identity, we have $h'^{-1}_\lambda h_\lambda = v'_\lambda v_\lambda^{-1} = e$, so that $h'_\lambda = h_\lambda$ and $v'_\lambda = v_\lambda$. QED.

Lemma 12.4 For a given $\Theta_\lambda$ and a given $p \in S_n$, a necessary and sufficient condition for $p \neq h_\lambda v_\lambda$ is that there exist at least two numbers in one row of $\Theta_\lambda$ which appear in the same column of $\Theta_\lambda^p$.

Proof: (i) Assume that $p = h_\lambda v_\lambda$. We can rewrite it as
\[
p = h_\lambda v_\lambda = \left(h_\lambda v_\lambda h_\lambda^{-1}\right) h_\lambda \equiv v_\lambda^{h_\lambda} h_\lambda
\]
where $v_\lambda^{h_\lambda} \equiv h_\lambda v_\lambda h_\lambda^{-1}$ is a vertical permutation associated with the Young tableau $\Theta_\lambda^{h_\lambda}$, according to Lemma 12.1. The Young tableau associated with $p = h_\lambda v_\lambda = v_\lambda^{h_\lambda} h_\lambda$ can be expressed as
\[
\Theta_\lambda^p \equiv p\,\Theta_\lambda = h_\lambda v_\lambda \Theta_\lambda = v_\lambda^{h_\lambda} h_\lambda \Theta_\lambda = v_\lambda^{h_\lambda} \Theta_\lambda^{h_\lambda}
\]
This shows that $\Theta_\lambda^p$ can be obtained in two steps: (a) $\Theta_\lambda \to \Theta_\lambda^{h_\lambda}$ by $h_\lambda$, and then (b) $\Theta_\lambda^{h_\lambda} \to \Theta_\lambda^p$ by $v_\lambda^{h_\lambda}$. In neither of these steps (nor in their combination) is it possible to bring two numbers belonging to the same row into the same column. Therefore, if there are two numbers in one row of $\Theta_\lambda$ that appear in the same column of $\Theta_\lambda^p$, then $p \neq h_\lambda v_\lambda$.
(ii) To prove the converse, assume that there are no two numbers shared by a row of $\Theta_\lambda$ and a column of $\Theta_\lambda^p$. We can obtain the tableau $\Theta_\lambda^p$ from $\Theta_\lambda$ by the following procedure. Start with the numbers appearing in the first column of $\Theta_\lambda^p$. By our hypothesis, they must all belong to different rows of $\Theta_\lambda$, therefore they can be brought to the first column by horizontal permutations applied to each row; the composition of horizontal permutations is again a horizontal permutation (they form a subgroup). Consequently, the numbers in the first column of $\Theta_\lambda^p$ are brought from $\Theta_\lambda$ to the first column by a horizontal permutation $h_\lambda^{(1)}$. Repeating this exercise for the other columns in turn, we obtain a tableau given by
\[
h_\lambda^{(i)} \cdots h_\lambda^{(2)} h_\lambda^{(1)} \Theta_\lambda \equiv h_\lambda \Theta_\lambda \equiv \Theta_\lambda^{h_\lambda}
\]
where $\Theta_\lambda^{h_\lambda}$ differs from $\Theta_\lambda^p$ only by the order of the elements in individual columns. Hence, $\Theta_\lambda^{h_\lambda}$ can be transformed into $\Theta_\lambda^p$ by a vertical permutation $v_\lambda^{h_\lambda}$ (once again, a composition of vertical permutations gives a single vertical permutation). We have then obtained $\Theta_\lambda^p = v_\lambda^{h_\lambda} \Theta_\lambda^{h_\lambda} = v_\lambda^{h_\lambda} h_\lambda \Theta_\lambda$, and applying Lemma 12.1 we see that $v_\lambda^{h_\lambda} = h_\lambda v_\lambda h_\lambda^{-1}$, so that $\Theta_\lambda^p = h_\lambda v_\lambda h_\lambda^{-1} h_\lambda \Theta_\lambda = h_\lambda v_\lambda \Theta_\lambda = p\,\Theta_\lambda$, therefore $p = h_\lambda v_\lambda$. Equivalently, if $p \neq h_\lambda v_\lambda$, there must be at least two numbers that appear in one row of $\Theta_\lambda$ and one column of $\Theta_\lambda^p$. QED.

Lemma 12.5 Given $\Theta_\lambda$ and $p \in S_n$ which is not of the form $h_\lambda v_\lambda$, there exist two transpositions $\tilde{h}_\lambda$, $\tilde{v}_\lambda$ such that $p = \tilde{h}_\lambda\, p\, \tilde{v}_\lambda$.

Proof: Since $p \neq h_\lambda v_\lambda$, Lemma 12.4 says that there exist at least two numbers in one row of $\Theta_\lambda$ that belong to the same column of $\Theta_\lambda^p$. Let $t$ be the transposition of these two numbers. By definition $t$ is a member of the subgroup of horizontal permutations $\{h_\lambda\}$ of $\Theta_\lambda$, and also of the subgroup of vertical permutations $\{v_\lambda^p\}$ of $\Theta_\lambda^p$. Let us denote this element as $t \equiv \tilde{h}_\lambda \equiv \tilde{v}_\lambda^p$. Further, according to Lemma 12.1 and Eq. (12.31), for the vertical transposition $\tilde{v}_\lambda^p$ in $\Theta_\lambda^p$ the corresponding vertical transposition in $\Theta_\lambda$ is given by $\tilde{v}_\lambda = p^{-1} \tilde{v}_\lambda^p\, p = p^{-1} t\, p$, from which we have
\[
\tilde{h}_\lambda\, p\, \tilde{v}_\lambda = t\, p \left(p^{-1} t\, p\right) = t \left(p\, p^{-1}\right) t\, p = t\, t\, p = p
\]
where we have used the obvious fact that a transposition applied twice is the identity. QED.

Lemma 12.6 Given a Young tableau Θλ , if an element r of the group algebra Sen satisfies

\[
h_\lambda\, r\, v_\lambda = (-1)^{v_\lambda}\, r \qquad \forall h_\lambda, v_\lambda \tag{12.34}
\]

then r must be a multiple of the irreducible symmetrizer eλ . More precisely, r = αe eλ , where αe is the coefficient associated
with the identity in the expansion of r, in the basis constituted by the elements of the group Sn .
Proof: We can write $r$ as a linear combination of the elements of the group, $r = \sum_p \alpha_p\, p$. We shall show that (i) $\alpha_p = 0$ if $p \neq h_\lambda v_\lambda$, and (ii) $\alpha_p$ is proportional to $(-1)^{v_\lambda}$ if $p = h_\lambda v_\lambda$. Let us evaluate the LHS of Eq. (12.34):
\[
h_\lambda\, r\, v_\lambda = \sum_{q=1}^{n!} \alpha_q \left(h_\lambda\, q\, v_\lambda\right) = \sum_{p=1}^{n!} \alpha_q\, p\ ;\qquad p \equiv h_\lambda\, q\, v_\lambda
\]

where we have used the rearrangement lemma⁶. Since $q = h_\lambda^{-1}\, p\, v_\lambda^{-1}$, we can write this expression as
\[
h_\lambda\, r\, v_\lambda = \sum_{p=1}^{n!} \alpha_{h_\lambda^{-1} p\, v_\lambda^{-1}}\; p
\]

Expanding the RHS of Eq. (12.34) and equating both sides,
\[
\sum_{p=1}^{n!} \alpha_{h_\lambda^{-1} p\, v_\lambda^{-1}}\; p = (-1)^{v_\lambda} \sum_{p=1}^{n!} \alpha_p\, p\ ;\qquad \forall h_\lambda, v_\lambda
\]
because of the linear independence of the elements of the group (considered as basis vectors in $\tilde{S}_n$), we have
\[
\alpha_{h_\lambda^{-1} p\, v_\lambda^{-1}} = (-1)^{v_\lambda} \alpha_p\ ;\qquad \forall h_\lambda, v_\lambda \tag{12.35}
\]

(i) If $p$ is not of the form $h_\lambda v_\lambda$, Lemma 12.5 says that $p = \tilde{h}_\lambda\, p\, \tilde{v}_\lambda$, where $\tilde{h}_\lambda$, $\tilde{v}_\lambda$ are a horizontal and a vertical transposition respectively. Therefore
\[
\tilde{h}_\lambda^{-1}\, p\, \tilde{v}_\lambda^{-1} = \tilde{h}_\lambda^{-1} \left(\tilde{h}_\lambda\, p\, \tilde{v}_\lambda\right) \tilde{v}_\lambda^{-1} = p \tag{12.36}
\]
Now, since expression (12.35) is valid for all $h_\lambda$, $v_\lambda$, it is valid in particular for $\tilde{h}_\lambda$, $\tilde{v}_\lambda$, and using Eq. (12.36) we have
\[
\alpha_{\tilde{h}_\lambda^{-1} p\, \tilde{v}_\lambda^{-1}} = \alpha_p = (-1)^{\tilde{v}_\lambda} \alpha_p = -\alpha_p
\]
where we have used the fact that a transposition $\tilde{v}_\lambda$ is an odd permutation. Thus $\alpha_p = -\alpha_p$, from which $\alpha_p = 0$ if $p \neq h_\lambda v_\lambda$.
(ii) If $p = h_\lambda v_\lambda$ for some particular $h_\lambda$ and $v_\lambda$, then
\[
h_\lambda^{-1}\, p\, v_\lambda^{-1} = h_\lambda^{-1} \left(h_\lambda v_\lambda\right) v_\lambda^{-1} = e
\]
once again, we can apply Eq. (12.35) for these particular $h_\lambda$ and $v_\lambda$, to find
\[
\alpha_{h_\lambda^{-1} p\, v_\lambda^{-1}} = \alpha_e = (-1)^{v_\lambda} \alpha_p \;\Rightarrow\; \alpha_p = (-1)^{v_\lambda} \alpha_e
\]
where $\alpha_e$ is a constant independent of $p$, i.e. the unique coefficient associated with the identity in the expansion $r = \sum_p \alpha_p\, p$. Summarizing, the only non-zero terms in the expansion of $r$ correspond to the permutations of the form $p = h_\lambda v_\lambda$, for which the coefficients are $\alpha_p = (-1)^{v_\lambda} \alpha_e$; therefore
\[
r = \sum_p \alpha_p\, p = \alpha_e \sum_{h_\lambda, v_\lambda} (-1)^{v_\lambda}\, h_\lambda v_\lambda
\]
In the last step, we have also used the fact that $p$ is expressible in a unique way in the form $p = h_\lambda v_\lambda$ (Lemma 12.3), so that by summing over all $\{h_\lambda\}$ and all $\{v_\lambda\}$ we obtain each $p = h_\lambda v_\lambda$ only once. Therefore, we obtain
\[
r = \alpha_e \sum_{h_\lambda, v_\lambda} (-1)^{v_\lambda}\, h_\lambda v_\lambda = \alpha_e\, e_\lambda
\]
QED.
Lemma 12.7 Given two distinct Young diagrams labelled by $\lambda$ and $\mu$, assume that $\lambda > \mu$. It follows that
\[
a_\mu^q\, s_\lambda^p = s_\lambda^p\, a_\mu^q = e_\mu^q\, e_\lambda^p = 0\ ;\qquad \forall p, q \in S_n \tag{12.37}
\]
Proof: By arguments similar to those in Lemma 12.5, there exists at least one pair of numbers which appears simultaneously in one row of $\Theta_\lambda^p$ and one column of $\Theta_\mu^q$ (as long as $\lambda > \mu$). Denoting the transposition of these two numbers by $t = \tilde{h}_\lambda^p = \tilde{v}_\mu^q$, and using Lemma 12.2, i.e. the first two of Eqs. (12.32), we have
\[
t\, s_\lambda^p = \tilde{h}_\lambda^p\, s_\lambda^p = s_\lambda^p\ ;\quad s_\lambda^p\, t = s_\lambda^p\, \tilde{h}_\lambda^p = s_\lambda^p\ ;\quad t\, a_\mu^q = \tilde{v}_\mu^q\, a_\mu^q = -a_\mu^q\ ;\quad a_\mu^q\, t = a_\mu^q\, \tilde{v}_\mu^q = -a_\mu^q
\]
where we have taken into account that transpositions are odd permutations. We then obtain
\[
t\, s_\lambda^p = s_\lambda^p\, t = s_\lambda^p\ ;\qquad t\, a_\mu^q = a_\mu^q\, t = -a_\mu^q
\]
from which we can deduce
\[
s_\lambda^p\, a_\mu^q = \left(s_\lambda^p\, t\right) a_\mu^q = s_\lambda^p \left(t\, a_\mu^q\right) = -s_\lambda^p\, a_\mu^q \;\Rightarrow\; s_\lambda^p\, a_\mu^q = 0
\]
\[
a_\mu^q\, s_\lambda^p = -\left(a_\mu^q\, t\right) s_\lambda^p = -a_\mu^q \left(t\, s_\lambda^p\right) = -a_\mu^q\, s_\lambda^p \;\Rightarrow\; a_\mu^q\, s_\lambda^p = 0
\]
finally
\[
e_\mu^q\, e_\lambda^p = \left(s_\mu^q\, a_\mu^q\right)\left(s_\lambda^p\, a_\lambda^p\right) = s_\mu^q \left(a_\mu^q\, s_\lambda^p\right) a_\lambda^p = 0
\]
QED.
⁶ We should be careful when using the rearrangement lemma for a linear combination of elements of the group: in the reordering of the group elements we must make sure that the assignment of the coefficients to the group elements in the combination is not changed.

12.5.1 Examples of the general properties of Young tableaux


Young tableaux associated with S10
Let us take again the normal Young tableau of Eq. (12.11) in example 12.6.

Θ{4,3,2,1} ≡ Θλ ≡
1 2 3 4
5 6 7
8 9
10
(12.38)
We also take the horizontal permutation h1 given by Eq. (12.14) and a given permutation p1

 
\[
h_1 = (1,2,3,4)(6,7)(8,9) = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 2 & 3 & 4 & 1 & 5 & 7 & 6 & 9 & 8 & 10 \end{pmatrix} \tag{12.39}
\]
\[
p_1 = (1,2,6)(3,4,5)(7,8) = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 2 & 6 & 4 & 5 & 3 & 1 & 8 & 7 & 9 & 10 \end{pmatrix} \tag{12.40}
\]
\[
p_1^{-1} = (6,2,1)(5,4,3)(8,7) = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 6 & 1 & 5 & 3 & 4 & 2 & 8 & 7 & 9 & 10 \end{pmatrix} \tag{12.41}
\]

the permuted Young tableau $\Theta_\lambda^{p_1}$ yields

Θ_λ^{p1} ≡ p1 Θλ ≡
2 6 4 5
3 1 8
7 9
10
(12.42)
We first evaluate
\[
p_1 h_1 p_1^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 2 & 6 & 4 & 5 & 3 & 1 & 8 & 7 & 9 & 10 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 2 & 3 & 4 & 1 & 5 & 7 & 6 & 9 & 8 & 10 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 6 & 1 & 5 & 3 & 4 & 2 & 8 & 7 & 9 & 10 \end{pmatrix}
\]
\[
= \begin{pmatrix} 7 & 2 & 5 & 4 & 1 & 3 & 9 & 6 & 8 & 10 \\ 8 & 6 & 3 & 5 & 2 & 4 & 9 & 1 & 7 & 10 \end{pmatrix}\begin{pmatrix} 6 & 1 & 5 & 3 & 4 & 2 & 8 & 7 & 9 & 10 \\ 7 & 2 & 5 & 4 & 1 & 3 & 9 & 6 & 8 & 10 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 6 & 1 & 5 & 3 & 4 & 2 & 8 & 7 & 9 & 10 \end{pmatrix}
\]
\[
p_1 h_1 p_1^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 8 & 6 & 3 & 5 & 2 & 4 & 9 & 1 & 7 & 10 \end{pmatrix} = (2,6,4,5)(1,8)(7,9)
\]

Observe that $h_1$ and $p_1 h_1 p_1^{-1}$ have the same cycle structure, as they must:
\[
h_1 = (1,2,3,4)(6,7)(8,9)(5)(10)\ ;\qquad p_1 h_1 p_1^{-1} = (2,6,4,5)(1,8)(7,9)(3)(10) \tag{12.43}
\]
p1
From Eqs. (12.42, 12.43) it is clear that $p_1 h_1 p_1^{-1}$ is a horizontal permutation with respect to $\Theta_\lambda^{p_1}$:

(p1 h1 p1⁻¹) Θ_λ^{p1} ≡ [(2,6,4,5)(1,8)(7,9)(3)(10)]
2 6 4 5
3 1 8
7 9
10

(p1 h1 p1⁻¹) Θ_λ^{p1} =
6 4 5 2
3 8 1
9 7
10
(12.44)

Let us make the assignment [see Eq. (12.41) for $p_1^{-1}$]
\[
2 \leftrightarrow \tilde{1}\ ,\quad 6 \leftrightarrow \tilde{2}\ ,\quad 4 \leftrightarrow \tilde{3}\ ,\quad 5 \leftrightarrow \tilde{4}\ ,\quad 3 \leftrightarrow \tilde{5}\ ,\quad 1 \leftrightarrow \tilde{6}\ ,\quad 8 \leftrightarrow \tilde{7}\ ,\quad 7 \leftrightarrow \tilde{8}\ ,\quad 9 \leftrightarrow \tilde{9}\ ,\quad 10 \leftrightarrow \widetilde{10} \tag{12.45}
\]

from which $\Theta_\lambda^{p_1}$ in Eq. (12.42) can be rewritten as

Θ_λ^{p1} ≡
2 6 4 5
3 1 8
7 9
10
≡
~1 ~2 ~3 ~4
~5 ~6 ~7
~8 ~9
~10
(12.46)

From the horizontal permutation $h_1$ (with respect to $\Theta_\lambda$) in Eq. (12.43), we write the associated horizontal permutation $h_1^{p_1}$ with respect to $\Theta_\lambda^{p_1}$:
\[
h_1^{p_1} = \left(\tilde{1}, \tilde{2}, \tilde{3}, \tilde{4}\right)\left(\tilde{6}, \tilde{7}\right)\left(\tilde{8}, \tilde{9}\right)\left(\tilde{5}\right)\left(\widetilde{10}\right) \tag{12.47}
\]

Combining Eqs. (12.46, 12.47) we find

h1^{p1} Θ_λ^{p1} ≡ h1^{p1}
~1 ~2 ~3 ~4
~5 ~6 ~7
~8 ~9
~10
=
~2 ~3 ~4 ~1
~5 ~7 ~6
~9 ~8
~10
=
6 4 5 2
3 8 1
9 7
10
(12.48)

where in the last step we have reversed the assignment (12.45). We see that Eqs. (12.44, 12.48) coincide, providing a specific example of the validity of the relation
\[
h_\lambda^p = p\, h_\lambda\, p^{-1}
\]
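The conjugation just illustrated can also be checked mechanically. The short Python sketch below (an illustration added to these notes, not part of the original text; the helper names are arbitrary) composes the permutations of Eqs. (12.39-12.41) as mappings and reproduces the bottom row of Eq. (12.43):

```python
# Permutations of {1,...,10} as dictionaries; composition (a*b)(x) = a(b(x))
def perm_from_cycles(cycles, n=10):
    p = {k: k for k in range(1, n + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):  # each entry maps to the next one in the cycle
            p[a] = b
    return p

def compose(a, b):
    return {x: a[b[x]] for x in b}

h1 = perm_from_cycles([(1, 2, 3, 4), (6, 7), (8, 9)])   # Eq. (12.39)
p1 = perm_from_cycles([(1, 2, 6), (3, 4, 5), (7, 8)])   # Eq. (12.40)
p1_inv = {v: k for k, v in p1.items()}                   # Eq. (12.41)

conj = compose(p1, compose(h1, p1_inv))                  # p1 h1 p1^{-1}
print([conj[k] for k in range(1, 11)])
# -> [8, 6, 3, 5, 2, 4, 9, 1, 7, 10], i.e. (2,6,4,5)(1,8)(7,9) as in Eq. (12.43)
```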
Now we take the vertical permutation (12.15)
 
\[
v_1 = (1,5)(8,10)(6,9)(3,7) = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 5 & 2 & 7 & 4 & 1 & 9 & 3 & 10 & 6 & 8 \end{pmatrix} \tag{12.49}
\]

On the other hand,
\[
p_1 v_1 p_1^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 2 & 6 & 4 & 5 & 3 & 1 & 8 & 7 & 9 & 10 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 5 & 2 & 7 & 4 & 1 & 9 & 3 & 10 & 6 & 8 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 6 & 1 & 5 & 3 & 4 & 2 & 8 & 7 & 9 & 10 \end{pmatrix}
\]
\[
= \begin{pmatrix} 9 & 5 & 1 & 7 & 4 & 2 & 10 & 3 & 6 & 8 \\ 9 & 3 & 2 & 8 & 5 & 6 & 10 & 4 & 1 & 7 \end{pmatrix}\begin{pmatrix} 6 & 1 & 5 & 3 & 4 & 2 & 8 & 7 & 9 & 10 \\ 9 & 5 & 1 & 7 & 4 & 2 & 10 & 3 & 6 & 8 \end{pmatrix}\begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 6 & 1 & 5 & 3 & 4 & 2 & 8 & 7 & 9 & 10 \end{pmatrix}
\]
\[
p_1 v_1 p_1^{-1} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 & 10 \\ 9 & 3 & 2 & 8 & 5 & 6 & 10 & 4 & 1 & 7 \end{pmatrix} = (1,9)(2,3)(4,8)(7,10)(5)(6) \tag{12.50}
\]
Once again, $v_1$ and $p_1 v_1 p_1^{-1}$ have the same cycle structure, as they must. The corresponding vertical permutation for $\Theta_\lambda^{p_1}$ reads
\[
v_1^{p_1} = \left(\tilde{1}, \tilde{5}\right)\left(\tilde{8}, \widetilde{10}\right)\left(\tilde{6}, \tilde{9}\right)\left(\tilde{3}, \tilde{7}\right) \tag{12.51}
\]

from Eqs. (12.46, 12.51) we have

v1^{p1} Θ_λ^{p1} ≡ v1^{p1}
~1 ~2 ~3 ~4
~5 ~6 ~7
~8 ~9
~10
=
~5 ~2 ~7 ~4
~1 ~9 ~3
~10 ~6
~8
=
3 6 8 5
2 9 4
10 1
7
(12.52)

where in the last step we went back to the assignment (12.45). Now, combining Eqs. (12.46, 12.50), we find

(p1 v1 p1⁻¹) Θ_λ^{p1} = [(1,9)(2,3)(4,8)(7,10)]
2 6 4 5
3 1 8
7 9
10
=
3 6 8 5
2 9 4
10 1
7
(12.53)

We see that (12.52) and (12.53) coincide, providing a specific example of the identity
\[
v_1^{p_1} = p_1\, v_1\, p_1^{-1}
\]

Young tableaux associated with S6


Based on the group S6 we start with the normal Young tableau associated with the partition {3, 2, 1}

1 2 3
Θλ = Θ{3,2,1} = 4 5
6

We first list the permutations that move symbols only within a single row, that is

h1 = e, h2 = (12) , h3 = (23) , h4 = (13) , h5 = (123) , h6 = (321) , h7 = (45) (12.54)



the first six of these elements are all the permutations of the three symbols 1, 2, 3. The remaining horizontal permutations are obtained by combining the permutations in (12.54). However, since the first six permutations form a subgroup S3, the only new permutations that arise from these products are those obtained by taking the product of h7 with the elements of S3 different from the identity

h8 = h2 h7 = (12) (45) , h9 = h3 h7 = (23) (45) , h10 = h4 h7 = (13) (45)


h11 = h5 h7 = (123) (45) , h12 = h6 h7 = (321) (45) (12.55)

It is clear that the products in (12.55) are commutative, since the factors permute disjoint sets of symbols; hence $h_i h_7 = h_7 h_i$ for $i = 1, 2, 3, 4, 5, 6$. Therefore no new horizontal permutations arise by inverting the order of the products.
It is understood that each hi is considered as an element of S6 . For example

h3 = (23) = (1) (4) (5) (6) (23) , h5 = (4) (5) (6) (123)

by putting permutations (12.54, 12.55) together, we obtain the subgroup Sh of all horizontal permutations associated with
Θ{3,2,1} . This group is of order 12

Sh ≡ {e, (12) , (23) , (13) , (123) , (321) , (45) , (12) (45) , (23) (45) , (13) (45) , (123) (45) , (321) (45)}

note that the product is closed by construction. Since $S_6$ is finite, closure of the product is sufficient to prove that $S_h$ is a subgroup of $S_6$ (see Sec. 6.5, page 108). In a similar way we can construct the subgroup $S_v$ of vertical permutations associated with $\Theta_{\{3,2,1\}}$

Sv ≡ {e, (14) , (16) , (46) , (146) , (641) , (25) , (14) (25) , (16) (25) , (46) (25) , (146) (25) , (641) (25)}
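A brute-force way to obtain these two subgroups (a sketch added here as an illustration, not part of the original notes) is to test which permutations of $S_6$ map every row, respectively every column, of $\Theta_{\{3,2,1\}}$ onto itself:

```python
from itertools import permutations

rows = [{1, 2, 3}, {4, 5}, {6}]      # rows of the normal tableau Theta_{3,2,1}
cols = [{1, 4, 6}, {2, 5}, {3}]      # its columns

def preserves(blocks, perm):
    """perm is a tuple where perm[k-1] is the image of k; check each block maps onto itself."""
    return all({perm[k - 1] for k in block} == block for block in blocks)

s6 = list(permutations(range(1, 7)))
Sh = [p for p in s6 if preserves(rows, p)]   # horizontal permutations
Sv = [p for p in s6 if preserves(cols, p)]   # vertical permutations
print(len(Sh), len(Sv))   # 12 12, matching the two lists above
```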

12.6 Irreducible representations of Sn


Inspired by the results obtained for S3 in Sec. 12.4 and by the lemmas of Sec. 12.5, we shall develop the central theorems of the theory of irreducible representations of Sn. The symmetrizers and anti-symmetrizers associated with each Young tableau form the basis for constructing the primitive idempotents that generate the irreducible invariant subspaces (left-ideals) in the group algebra space. In particular, Lemma 12.1 shows that expressions valid for normal Young tableaux Θλ are also valid for general Young tableaux Θλᵖ. Therefore, although we shall use normal tableaux to simplify the notation, all the results will also be valid for arbitrary Young tableaux.

Theorem 12.3 The symmetrizer, anti-symmetrizer, and irreducible symmetrizer associated with a Young tableau Θλ have the
following properties

sλ raλ = ξr eλ ∀r ∈ Sen (12.56)


e2λ = ηeλ ; η = positive integer (12.57)

where ξr and η are ordinary numbers, and ξr depends on r. Since η ≠ 0 we see that eλ is essentially idempotent.

Proof : (i) Let hλ (vλ ) be an arbitrary horizontal (vertical) permutation associated with Θλ . From Lemma 12.2, Eqs.
(12.32) we see that

\[
h_\lambda (s_\lambda r a_\lambda) v_\lambda = (h_\lambda s_\lambda)\, r\, (a_\lambda v_\lambda) = (s_\lambda r a_\lambda)(-1)^{v_\lambda}
\;\Rightarrow\; h_\lambda (s_\lambda r a_\lambda) v_\lambda = (-1)^{v_\lambda} (s_\lambda r a_\lambda) \qquad \forall h_\lambda, v_\lambda
\]

and applying Lemma 12.6 to the element t ≡ sλ raλ ∈ Sen we obtain t ≡ sλ raλ = αte eλ , with αte the coefficient corresponding
to the identity in the expansion of t in the group basis. Since t depends on r, it is clear that αte depends on r so that we can
denote αte ≡ ξr .
(ii)
e2λ = (sλ aλ ) (sλ aλ ) = sλ (aλ sλ ) aλ ≡ sλ raλ ; r ≡ aλ sλ ∈ Sen
and applying the first part of this theorem sλ raλ = αte eλ such that e2λ = αte eλ , where αte is the coefficient of the identity in the
expansion of t = sλ raλ = e2λ .
(iii) We have seen that e2λ = αte eλ where αte is the coefficient of the identity in the expansion of e2λ in the group basis. The
expansion of e2λ can be written in the form
  
X v
\[
e_\lambda^2 \equiv \left[\sum_{h_\lambda, v_\lambda} (-1)^{v_\lambda}\, h_\lambda v_\lambda\right]\left[\sum_{h'_\lambda, v'_\lambda} (-1)^{v'_\lambda}\, h'_\lambda v'_\lambda\right] = \sum_{h_\lambda, v_\lambda}\ \sum_{h'_\lambda, v'_\lambda} h_\lambda v_\lambda h'_\lambda v'_\lambda\, (-1)^{v_\lambda} (-1)^{v'_\lambda} \tag{12.58}
\]

Now, since $e$ is the common element of the two subgroups $\{h_\lambda\}$ and $\{v_\lambda\}$, it appears at least once in the expansion (12.58), namely in the case
\[
h_\lambda = v_\lambda = h'_\lambda = v'_\lambda = e \;\Rightarrow\; (-1)^{v_\lambda}(-1)^{v'_\lambda} = 1
\]
and the coefficient is positive. Moreover, the identity could occur more than once in the expansion (12.58). This is the case when the equation $h_\lambda v_\lambda h'_\lambda v'_\lambda = e$ has more than one solution. Such an equation implies that $h'_\lambda v'_\lambda = (h_\lambda v_\lambda)^{-1}$, so that the coefficient in the expansion is given by $(-1)^{v_\lambda}(-1)^{v_\lambda^{-1}}$; further, the inverse $p^{-1}$ of any permutation $p$ must have the same parity as $p$, hence $(-1)^{v_\lambda}(-1)^{v_\lambda^{-1}} = 1$. Therefore, when the identity occurs more than once in the expansion (12.58), the relevant coefficient is always a non-vanishing positive integer (equal to the number of distinct solutions of $h_\lambda v_\lambda h'_\lambda v'_\lambda = e$), and $e_\lambda$ is essentially idempotent. QED.
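Theorem 12.3 can be illustrated numerically. The sketch below (an added illustration, not part of the original notes; the helper names are arbitrary) multiplies group-algebra elements of S3 represented as dictionaries of coefficients and checks that the unnormalized mixed symmetrizer e2 = [e + (12)][e − (31)] of Eq. (12.17) satisfies e2·e2 = η e2 with η = 3:

```python
def compose(a, b):               # (a*b)(x) = a(b(x)); permutations as tuples over {0,1,2}
    return tuple(a[b[x]] for x in range(len(a)))

def alg_mult(r, s):              # product of two group-algebra elements (dicts: perm -> coefficient)
    out = {}
    for p, cp in r.items():
        for q, cq in s.items():
            pq = compose(p, q)
            out[pq] = out.get(pq, 0) + cp * cq
    return {p: c for p, c in out.items() if c != 0}

identity, t12, t13 = (0, 1, 2), (1, 0, 2), (2, 1, 0)   # e, (12), (13) on three symbols

s_row = {identity: 1, t12: 1}                  # symmetrizer of the row {1,2} of Theta_{2,1}
a_col = {identity: 1, t13: -1}                 # antisymmetrizer of the column {1,3}
e2 = alg_mult(s_row, a_col)                    # irreducible symmetrizer, Eq. (12.17)

e2_sq = alg_mult(e2, e2)
print(all(e2_sq[p] == 3 * e2[p] for p in e2))  # True: e2^2 = 3*e2, i.e. eta = 3 in Eq. (12.57)
```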
In Sec. 12.4 we found four irreducible symmetrizers for the group S3 : e1 and e3 for the one-dimensional representations as
(23)
well as e2 and e2 for the two equivalent two-dimensional representations. We also saw that these irreducible symmetrizers
were essentially primitive idempotents. The generalization of this statement is
Theorem 12.4 The irreducible symmetrizer eλ associated with the Young tableau Θλ is a primitive essentially idempotent. It
generates an irreducible representation of Sn on the group algebra space Sen .
Proof: We have already shown that eλ is essentially idempotent (see Eq. 12.57). Now, we see that
eλ reλ = (sλ aλ ) r (sλ aλ ) = sλ (aλ rsλ ) aλ ∀r ∈ Sen
Applying theorem 12.3, Eq. (12.56) to the element r′ ≡ aλ rsλ ∈ Sen , we see that
eλ reλ = sλ r′ aλ = ξr′ eλ = λr eλ ; ∀r ∈ Sen
where we have used the fact that r′ depends on r so that we can write ξr′ ≡ λr . Now, according with theorem 11.4, eλ is a
primitive essentially idempotent. QED.
Once again, we saw in Sec. 12.4 that several primitive idempotents for S3 can be generated from the same partition of n i.e.
(23)
from the same Young diagram, this is the case with e2 , e2 . Furthermore, we saw that these idempotents generate equivalent
irreducible representations. We generalize this result as
Theorem 12.5 The irreducible representations generated by eλ and epλ , with p ∈ Sn , are equivalent.
Proof : Applying Lemma 12.1 Eq. (12.31), we see that epλ = peλ p−1 , therefore
 
\[
e_\lambda^p\, p\, e_\lambda = \left(p e_\lambda p^{-1}\right) p\, e_\lambda = p\, e_\lambda \left(p^{-1} p\right) e_\lambda = p\, e_\lambda^2 = \eta\, p\, e_\lambda
\]
where we have used theorem 12.3, Eq. (12.57). Moreover, Eq. (12.57) also says that η 6= 0 so that ηpeλ is non-vanishing.
Therefore, epλ peλ 6= 0 and using theorem 11.5 we see that the irreducible representations generated by epλ and eλ are equivalent.
QED.
On the other hand, Sec. 12.4 shows that for S3 , the primitive essentially idempotents e1 , e2 , e3 associated with distinct
Young diagrams, generate irreducible inequivalent representations of S3 . In general we have
Theorem 12.6 Two irreducible symmetrizers eλ and eµ generate irreducible inequivalent representations if the corresponding
Young diagrams are different i.e. if λ 6= µ.
Proof: Without any loss of generality, we can assume that λ > µ. Let p be an arbitrary element of Sn , we see that
 
\[
e_\mu\, p\, e_\lambda = e_\mu\, p\, e_\lambda \left(p^{-1} p\right) = e_\mu \left(p\, e_\lambda\, p^{-1}\right) p = e_\mu\, e_\lambda^p\, p = \left(e_\mu\, e_\lambda^p\right) p = 0\ ;\qquad \forall p \in S_n
\]

where we have used Lemma 12.1 Eq. (12.31) as well as Lemma 12.7 Eq. (12.37). Now, since any r ∈ Sen is a linear combination
of the elements p ∈ Sn , we have
eµ reλ = 0 ; ∀r ∈ Sen (12.59)
so that theorem 11.5 says that eµ and eλ generate irreducible inequivalent representations. QED.
Corollary 12.7 If λ 6= µ, then epµ eqλ = 0, ∀p, q ∈ Sn .
For λ 6= µ, theorem 12.6 says that eµ ,eλ generate irreducible inequivalent representations so that

eµ p−1 q eλ = 0 ; ∀q, p ∈ Sn

which is a particular case of Eq. (12.59), since p−1 q ∈ Sen . Hence we have
      
\[
e_\mu \left(p^{-1} q\right) e_\lambda = 0 \;\Rightarrow\; e_\mu \left(p^{-1} q\right) e_\lambda\, q^{-1} = 0 \;\Rightarrow\; p\, e_\mu \left(p^{-1} q\right) e_\lambda\, q^{-1} = 0 \;\Rightarrow\; \left(p\, e_\mu\, p^{-1}\right)\left(q\, e_\lambda\, q^{-1}\right) = 0 \;\Rightarrow\; e_\mu^p\, e_\lambda^q = 0
\]
where we have used Lemma 12.1 Eq. (12.31). QED. Note that when λ > µ this result is obtained from Lemma 12.7, Eq.
(12.37).

Theorem 12.8 (Irreducible inequivalent representations of Sn ): The irreducible symmetrizers {eλ } associated with all the
normal Young tableaux {Θλ } generate all the inequivalent irreducible representations of Sn .

Proof: It is clear that the number of normal Young tableaux equals the number of Young diagrams. Now, Theorem 12.2 says that the number $n_c$ of classes of $S_n$ is equal to the number of Young diagrams, hence to the number of normal tableaux. In turn, the number of elements in the set $\{e_\lambda\}$ is equal to the number of normal Young tableaux, hence to $n_c$. Further, by Theorem 12.6, the irreducible representations generated by the set $\{e_\lambda\}$ are all inequivalent to each other. Therefore, from $\{e_\lambda\}$ we obtain $n_c$ inequivalent irreducible representations of $S_n$. QED.
We observe, however, that although the set $\{e_\lambda\}$ associated with all normal tableaux $\Theta_\lambda$ generates all inequivalent irreducible representations of $S_n$, the left-ideals generated by them are not enough to obtain the full decomposition of the regular representation. The reason is that in this decomposition each inequivalent irreducible representation appears $n_\mu$ times. Thus, for each $\mu$ with $n_\mu \geq 2$, we require another $n_\mu - 1$ left-ideals to obtain $L_\mu = \bigoplus_{a=1}^{n_\mu} L_\mu^a$. Since a given $\Theta_\lambda^p$ generates an irreducible representation equivalent to the one generated by $\Theta_\lambda$, it seems plausible to use some of them to generate the remaining left-ideals. Finally, our experience with S3 suggests that the standard tableaux could do the job. In the case of S3 we had a two-dimensional representation and the full reduction was realized by taking into account all the standard tableaux.
From these observations we shall state without proof the theorem governing the complete decomposition of the regular
representation of Sn .

Theorem 12.9 (Full decomposition of the regular representation of Sn ): (i) The left ideals generated by the idempotents
associated with distinct standard Young tableaux are linearly independent. (ii) The direct sum of the left-ideals generated by all
standard Young tableaux spans the whole group algebra space Sen .
Chapter 13

Symmetry classes of tensors

The Young tableau method and the irreducible representations of Sn are useful tools for the construction and classification of
irreducible tensors.

13.1 The role of the general linear group Gm and the permutation group Sn on
the tensor space Vmn
13.1.1 Definition of the general linear group Gm and the tensor space Vmn
Definition 13.1 Let Vm be a m−dimensional vector space, and let {g} be the set of all non-singular (invertible) linear
transformations on Vm . This set forms a group under the law of composition for linear transformations, and it is called the
General Linear Group GL (m, C). In this section we call it simply Gm .
For a given basis {|ii , i = 1, 2, . . . , m} on Vm we shall denote the matrix representation of Gm in the form
g |ii = |ji g j i
when g runs over all elements of Gm , then g j i runs over all m × m invertible matrices (det g 6= 0).
We have defined the direct or tensor product of two vector spaces. This concept could be generalized easily when more
than two vector spaces are involved. Of particular importance is the direct product of identical vector spaces
Definition 13.2 (Tensor space): The direct product space Vm ⊗ Vm ⊗ · · · ⊗ Vm involving n factors of Vm is called the tensor
space, and is denoted by Vmn .
As in any tensor product of vector spaces, a basis for Vmn arises from the bases of each Vm in the form
|i1 i2 · · · in i = |i1 i ⊗ |i2 i ⊗ · · · ⊗ |in i
so $V_m^n$ is an $m^n$-dimensional space. When no confusion arises we denote the basis above as $\{|i\rangle_n\}$. The tensor product space $V_m^n$ consists of all linear combinations of the elements in $\{|i\rangle_n\}$, so if $|x\rangle \in V_m^n$ then we have
|xi = |i1 i2 · · · in i xi1 i2 ···in , |xi = |iin x{i} ; xi1 i2 ···in ≡ x{i}

where the set x{i} defines the tensor components of |xi.

13.1.2 Realization of Gm on the tensor space Vmn


Each element g of the group Gm (defined on Vm ) induces a linear transformation on the tensor space Vmn in the following way
ge |i1 i2 · · · in i = ge [|i1 i ⊗ |i2 i ⊗ · · · ⊗ |in i] ≡ [g |i1 i ⊗ g |i2 i ⊗ · · · ⊗ g |in i]
 
ge |i1 i2 · · · in i = |j1 i g j1 i1 ⊗ |j2 i g j2 i2 ⊗ · · · ⊗ |jn i g jn in
ge |i1 i2 · · · in i = [|j1 i ⊗ |j2 i ⊗ · · · ⊗ |jn i] g j1 i1 g j2 i2 · · · g jn in ≡ |j1 j2 · · · jn i D (g){j} {i}
where $\tilde{g}$ is the natural extension of the linear transformation on $V_m$ to a linear transformation on $V_m^n$. Note that this extension is possible because the vector spaces in the product are identical to one another¹. We shall omit the notation $\tilde{g}$ to indicate the
¹ In Sec. 3.20.2, we considered a linear transformation A(1) on a vector space V1, and defined its extension Ã(1) acting on V1 ⊗ V2 as the tensor product of A(1) with the identity of V2. This extension can be generalized when more than two component spaces are considered. Nevertheless, in our present context it is more useful to define the extension of given linear transformations $A_i$ acting on $V_m^{(i)}$ as the tensor product $A_1 \otimes A_2 \otimes \cdots \otimes A_n$ acting on $V_m^{(1)} \otimes V_m^{(2)} \otimes \cdots \otimes V_m^{(n)} \equiv V_m^n$. We should insist, however, that the latter extension is possible only if each component space $V_m^{(i)}$ is identical with the other components.


extension and write simply

g |iin = |jin D (g){j} {i} ; D (g){j} {i} ≡ g j1 i1 g j2 i2 · · · g jn in , ∀g ∈ Gm (13.1)

It is easy to verify that the set $\{D(g)\}$ consists of $m^n \times m^n$ matrices that form an $m^n$-dimensional representation of $G_m$. Further, for any $|x\rangle \in V_m^n$ we have
\[
g|x\rangle = g|i\rangle_n\, x^{\{i\}} = |j\rangle_n\, D(g)^{\{j\}}{}_{\{i\}}\, x^{\{i\}}
\]
which we rewrite as
\[
g|x\rangle \equiv |x_g\rangle = |j\rangle_n\, x_g^{\{j\}}\ ;\qquad x_g^{\{j\}} \equiv D(g)^{\{j\}}{}_{\{i\}}\, x^{\{i\}}
\]

13.1.3 Realization of Sn on the tensor space Vmn


On the other hand, the symmetric group $S_n$ also has a natural realization on $V_m^n$. We first consider a mapping from $p \in S_n$ into a linear transformation $\tilde{p}$ on $V_m^n$, where $\tilde{p}$ is defined as
\[
\tilde{p}|x\rangle \equiv |x_p\rangle = |i\rangle_n\, x_p^{\{i\}}\ ;\qquad x_p^{\{i\}} = x_p^{i_1 i_2 \cdots i_n} \equiv x^{i_{p_1} i_{p_2} \cdots i_{p_n}}
\]

Once again, this realization of $S_n$ on $V_m^n$ is possible because the component spaces of $V_m^n$ are identical to each other. We shall simplify the notation $\tilde{p}$ to simply $p$ from now on. It is useful to express the action of $p$ on the basis vectors $\{|i\rangle_n\}$ of $V_m^n$, for which we write
\[
p|x\rangle \equiv |x_p\rangle = |i_1 i_2 \cdots i_n\rangle\, x_p^{i_1 i_2 \cdots i_n} = |i_1 i_2 \cdots i_n\rangle\, x^{i_{p_1} i_{p_2} \cdots i_{p_n}} = \left|i_{p^{-1}_1} i_{p^{-1}_2} \cdots i_{p^{-1}_n}\right\rangle x^{i_1 i_2 \cdots i_n}
\]
where we have used the fact that for the summation indices $i_j$ and $i_{p_j}$ we only have to preserve the correspondence between superscripts and subscripts, while the order of summation can be changed. On the other hand, we have
\[
p|x\rangle = p\left[|i_1 i_2 \cdots i_n\rangle\, x^{i_1 i_2 \cdots i_n}\right] = \left[p|i_1 i_2 \cdots i_n\rangle\right] x^{i_1 i_2 \cdots i_n}
\]
Equating the last two equations we have
\[
p|i_1 i_2 \cdots i_n\rangle = \left|i_{p^{-1}_1} i_{p^{-1}_2} \cdots i_{p^{-1}_n}\right\rangle\ ;\qquad p|i\rangle_n = \left|i_{p^{-1}}\right\rangle_n \tag{13.2}
\]
from which we can obtain the matrix representation of {p} as a linear operator on Vmn .

\[
p|i\rangle_n = |j\rangle_n\, D(p)^{\{j\}}{}_{\{i\}} = \left|i_{p^{-1}}\right\rangle_n \;\Rightarrow\; |j_1 j_2 \cdots j_n\rangle\, D(p)^{j_1 j_2 \cdots j_n}{}_{i_1 i_2 \cdots i_n} = \left|i_{p^{-1}_1} i_{p^{-1}_2} \cdots i_{p^{-1}_n}\right\rangle
\]
\[
\Rightarrow\; D(p)^{j_1 j_2 \cdots j_n}{}_{i_1 i_2 \cdots i_n} = \delta^{j_1}{}_{i_{p^{-1}_1}}\, \delta^{j_2}{}_{i_{p^{-1}_2}} \cdots \delta^{j_n}{}_{i_{p^{-1}_n}} \tag{13.3}
\]
1 2 n

so that the matrix representative of p involves permuting the n δ−factors by p. It is useful to see how a product of permutations
acts on the basis vectors. We see it by assuming t = qp, hence t−1 = p−1 q −1 then
E

qp |i1 , i2 , . . . , in i = t |i1 , i2 , . . . , in i = it−1 , it−1 , . . . , it−1
n
= i(p−1 q−1 )1 , i(p−1 q−1 )2 , . . . , i(p−1 q−1 )n
1 2

It can be shown that the matrix representatives $\{D(p) : p \in S_n\}$ described by Eq. (13.3) form a representation of $S_n$. This can be seen as follows:
\[
D(p)^{\{j\}}{}_{\{i\}} = \delta^{j_1}{}_{i_{p^{-1}_1}}\, \delta^{j_2}{}_{i_{p^{-1}_2}} \cdots \delta^{j_n}{}_{i_{p^{-1}_n}} = \delta^{j_{p_1}}{}_{i_1}\, \delta^{j_{p_2}}{}_{i_2} \cdots \delta^{j_{p_n}}{}_{i_n}\ ;\quad p \in S_n,\ D(p) \in \beta(V_m^n) \tag{13.4}
\]
where $\beta(V_m^n)$ is the set of all linear transformations on $V_m^n$. In the last equality we have interchanged the order of the $\delta$ factors, but the assignment of superscripts and subscripts remains unchanged. Let us multiply two of these matrices using both parameterizations of the matrix representation:
\[
D(p)D(q) = \left(\delta^{k_{p_1}}{}_{j_1} \cdots \delta^{k_{p_n}}{}_{j_n}\right)\left(\delta^{j_1}{}_{i_{q^{-1}_1}} \cdots \delta^{j_n}{}_{i_{q^{-1}_n}}\right) = \delta^{k_{p_1}}{}_{i_{q^{-1}_1}} \cdots \delta^{k_{p_n}}{}_{i_{q^{-1}_n}} = \delta^{k_{(pq)_1}}{}_{i_1} \cdots \delta^{k_{(pq)_n}}{}_{i_n} = D(pq)
\]
hence Eq. (13.4) defines a representation of $S_n$ on $V_m^n$.
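The matrices D(p) of Eq. (13.3) are easy to build explicitly. The following sketch (an added illustration, not part of the original notes; m, n and the permutations chosen are arbitrary) constructs them for V_2^3 and verifies the representation property D(p)D(q) = D(pq):

```python
import itertools
import numpy as np

m, n = 2, 3                                            # the tensor space V_2^3
basis = list(itertools.product(range(m), repeat=n))    # the m^n basis labels (i1,...,in)
index = {b: k for k, b in enumerate(basis)}

def D(p):
    """Matrix of p in S_n on V_m^n:  p|i1...in> = |i_{p^-1(1)} ... i_{p^-1(n)}>  (Eq. 13.2)."""
    pinv = [p.index(k) for k in range(n)]
    mat = np.zeros((m**n, m**n))
    for i in basis:
        j = tuple(i[pinv[k]] for k in range(n))
        mat[index[j], index[i]] = 1.0
    return mat

p = (1, 2, 0)                                  # a 3-cycle (0-based images)
q = (1, 0, 2)                                  # the transposition (12)
pq = tuple(p[q[k]] for k in range(n))          # composition p∘q
print(np.allclose(D(p) @ D(q), D(pq)))         # True: D is a representation of S_n
```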



Example 13.1 Consider a tensor element |xi ∈ V35 of the form

|xi = |i1 , i2 , i3 , i4 , i5 i xi1 i2 i3 i4 i5 , ik = 1, 2, 3

consider the permutation p = (13) (245) ∈ S5 and its inverse p−1 = (31) (542) i.e.
   
1 2 3 4 5 1 2 3 4 5
p= , p−1 = (13.5)
3 4 1 5 2 3 5 1 2 4

then we have
p |xi = |i1 , i2 , i3 , i4 , i5 i xip1 ip2 ip3 ip4 ip5 = |i1 , i2 , i3 , i4 , i5 i xi3 i4 i1 i5 i2
since the ik indices are dummy, we can make the assignments

j1 ≡ i3 , j2 ≡ i4 , j3 ≡ i1 , j4 ≡ i5 , j5 ≡ i2

then we can write

p |xi = |j3 , j5 , j1 , j2 , j4 i xj1 j2 j3 j4 j5 = |i3 , i5 , i1 , i2 , i4 i xi1 i2 i3 i4 i5


E

= ip−1 , ip−1 , ip−1 , ip−1 , ip−1 xi1 i2 i3 i4 i5
1 2 3 4 5

where we have renamed jk → ik and used the second of Eqs. (13.5).

13.1.4 Interplay of Gm and Sn on the tensor space Vmn


We have seen that Eqs. (13.1, 13.4) define representations D [Gm ], D [Sn ] of the group Gm and Sn on the vector space Vmn .
Both representations are in general reducible. Since Sn is a finite group, theorems 7.5, 7.6 guarantee that this representation
is fully reducible into irreducible representations. Indeed, we shall see later that the irreducible symmetrizers associated with
Young tableaux can give us the decomposition. In contrast, Gm is an infinite group so that full decomposition is not in general
guaranteed for a reducible representation. Notwithstanding, we shall see later that the reduction of the tensor space Vmn by
Young symmetrizers from the $\tilde{S}_n$ algebra leads naturally to a full decomposition of $D[G_m]$. This is related to the fact that the linear transformations on $V_m^n$ representing $\{g \in G_m\}$ commute with the linear transformations representing $\{p \in S_n\}$, and each set of operators is essentially maximal in having this property.
Note that the result we have described, is a generalization of the following well-known facts (a) A set of commuting
operators share common eigenvectors and (b) a decomposition of reducible subspaces with respect to some subset of the
commuting operators, usually leads to the diagonalization of the remaining operators. As an example, when a Hamiltonian
exhibits a spherical symmetry it is diagonalized by decomposing first with respect to angular momentum operators.

Lemma 13.1 The representation matrices D (Gm ) in Eq. (13.1) and D (Sn ) in Eq. (13.4) satisfy the relation
 
\[
D^{\{j\}}{}_{\{i\}} = D^{\{j_q\}}{}_{\{i_q\}}\ ;\qquad \{i_q\} \equiv \left(i_{q_1} i_{q_2} \cdots i_{q_n}\right)\ ;\qquad q \equiv \begin{pmatrix} 1 & 2 & \cdots & n \\ q_1 & q_2 & \cdots & q_n \end{pmatrix} \tag{13.6}
\]

Proof : The equality follows from the fact that the value of the products in Eqs. (13.1, 13.4) does not depend on the order
in which the n factors are placed. Therefore, permuting the n factors by an arbitrary element q ∈ Sn gives a simultaneous
reshuffling of the superscripts and the subscripts by the same permutation. QED.

Example 13.2 For the tensor space V35 of example 13.1, and for the permutation given by Eq. (13.5), we have
j1 j2 j3 j4 j5 j3 j4 j1 j5 j2
D (g) i1 i2 i3 i4 i5 = g j1 i1 g j2 i2 g j3 i3 g j4 i4 g j5 i5 = g j3 i3 g j4 i4 g j1 i1 g j5 i5 g j2 i2 = D (g) i3 i4 i1 i5 i2
jp1 jp2 jp3 jp4 jp5
= D (g) ip1 ip2 ip3 ip4 ip5

similarly, for D (p) we have


{j}
D (p) {i} = δ j1 ip−1 δ j2 ip−1 δ j3 ip−1 δ j4 ip−1 δ j5 ip−1 = δ j1 i3 δ j2 i5 δ j3 i1 δ j4 i2 δ j5 i4
1 2 3 4 5
j3 j4 j1 j5 j2 jp1
= δ i1 δ i2 δ i3 δ i4 δ i5 =δ ip −1
δ jp2 ip −1
δ jp3 ip −1
δ jp4 ip −1
δ jp5 ip −1
p1 p2 p3 p4 p5

{j} {jp }
D (p) {i} = D (p) {ip }

Definition 13.3 Linear transformations on Vmn satisfying condition (13.6) are called symmetry-preserving.

Theorem 13.1 The two sets of matrices {D (p) : p ∈ Sn } and {D (g) : g ∈ Gm } commute with each other.

Proof: We shall consider the action of $pg$ and $gp$ on the basis vectors of $V_m^n$:
\[
pg|i\rangle_n = p|j\rangle_n\, D(g)^{\{j\}}{}_{\{i\}} = \left|j_{p^{-1}}\right\rangle_n\, D(g)^{\{j\}}{}_{\{i\}} = |j\rangle_n\, D(g)^{\{j_p\}}{}_{\{i\}}
\]
where we have used the fact that $\{j\}$ are dummy indices that can be relabeled, as long as we keep the assignment between superscripts and subscripts unaltered. On the other hand, we have
\[
gp|i\rangle_n = g\left|i_{p^{-1}}\right\rangle_n = |j\rangle_n\, D(g)^{\{j\}}{}_{\{i_{p^{-1}}\}} = |j\rangle_n\, D(g)^{\{j_p\}}{}_{\{i\}}
\]
where we have used Lemma 13.1. The equality of the right-hand sides of the last two equations proves the theorem. QED.

Example 13.3 Consider the space V22 whose elements are second-rank tensors (n = 2) in two-dimensional spaces (m = 2).
We denote the basis of V22 as |++i , |+−i , |−+i , |−−i. S2 is a group of two elements e, (12). For p = e the commutativity is
trivial to establish. Now for p = p−1 = (12) we get
E

pg |±±i = p |i1 i2 i g i1 ± g i2 ± = ip−1 ip−1 g i1 ± g i2 ± = |ip1 ip2 i g i1 ± g i2 ± = |i2 i1 i g i1 ± g i2 ± = |i2 i1 i g i2 ± g i1 ±
1 2

gp |±±i = g |±±i = |i1 i2 i g i1 ± g i2 ± = |i2 i1 i g i2 ± g i1 ± = pg |±±i

where we have used the fact that i1 , i2 = +, − are dummy indices of summation. For the remaining elements of the basis of
V22 we have
E

pg |±∓i = p |i1 i2 i g i1 ± g i2 ∓ = ip−1 ip−1 g i1 ± g i2 ∓ = |ip1 ip2 i g i1 ± g i2 ∓ = |i2 i1 i g i1 ± g i2 ∓ = |i2 i1 i g i2 ∓ g i1 ±
1 2

gp |±∓i = g |∓±i = |i1 i2 i g i1 ∓ g i2 ± = |i2 i1 i g i2 ∓ g i1 ± = pg |±∓i

and clearly these equalities hold for all p ∈ S2 , for all g ∈ G2 and all |xi ∈ V22 .
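The same commutativity can be checked numerically (an added sketch, not part of the original notes; g is taken as a random invertible 2×2 matrix, which illustrates but of course does not prove the theorem):

```python
import numpy as np

rng = np.random.default_rng(0)
g = rng.normal(size=(2, 2))            # a generic element of G_2 (invertible with probability 1)
Dg = np.kron(g, g)                     # D(g) = g ⊗ g on V_2^2, Eq. (13.1)

# D(p) for p = (12): swaps the two factors, |i1 i2> -> |i2 i1>
Dp = np.zeros((4, 4))
for i1 in range(2):
    for i2 in range(2):
        Dp[2 * i2 + i1, 2 * i1 + i2] = 1.0

print(np.allclose(Dp @ Dg, Dg @ Dp))   # True: Theorem 13.1 for m = n = 2
```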

Observe that $p$ can be thought of either as (a) an element of $S_n$, (b) an element of $\tilde{S}_n$, or (c) an element of $\beta(V_m^n)$, i.e. as a linear transformation on $V_m^n$. The latter point of view allows us to see permutations and all quantities derived from them (symmetrizers, anti-symmetrizers, etc.) as operators on the tensor space $V_m^n$. In the same way, since any $r \in \tilde{S}_n$ is a linear combination of permutations, it can be interpreted as a linear combination of linear transformations on $V_m^n$, so $r \in \tilde{S}_n$ can also be associated with a linear transformation on $V_m^n$. These facts lead us to classify tensors according to their transformation properties under $p$ and the quantities related to them.

Definition 13.4 (Tensors of Symmetry $\Theta_\lambda^p$ and Tensors of Symmetry Class $\lambda$): To each Young tableau $\Theta_\lambda^p$ we associate the tensors of symmetry $\Theta_\lambda^p$, consisting of $\left\{e_\lambda^p |\alpha\rangle : |\alpha\rangle \in V_m^n\right\}$. For a given Young diagram corresponding to the partition $\lambda$, the set of tensors $\left\{r e_\lambda |\alpha\rangle : r \in \tilde{S}_n,\ |\alpha\rangle \in V_m^n\right\}$ is said to belong to the symmetry class $\lambda$.

We shall decompose the tensor space Vmn in irreducible invariant subspaces under Sn and Gm by means of the irreducible
symmetrizers associated with the Young tableaux of Sn . If Lpλ is the left-ideal in Sen generated by the irreducible symmetrizer
epλ associated with the Young tableau Θpλ , we shall see that (i) a subspace consisting of tensors of the form r |αi for fixed
|αi ∈ Vmn and any r ∈ Lpλ is irreducibly invariant under Sn . (ii) A subspace of tensors of the form epλ |αi for any |αi ∈ Vmn and
fixed Θpλ , is irreducibly invariant under Gm . (iii) The tensor space Vmn can be decomposed in such a way that the basis acquires
a “factorized” form {|λ, α, ai}, where λ indicates a symmetry class associated with a Young diagram, α labels the irreducible
invariant subspaces under Sn while a labels the irreducible invariant subspaces under Gm .
Definition 13.5 A subspace $T_\lambda(\alpha) \subseteq V_m^n$ consists of the set of tensors $\left\{r e_\lambda |\alpha\rangle,\ r \in \tilde{S}_n\right\}$ for a fixed $|\alpha\rangle \in V_m^n$ and a fixed $\lambda$.

We shall first characterize each subspace of the form Tλ (α)

Theorem 13.2 Let Tλ (α) ⊆ Vmn be a non-zero subspace as described in definition 13.5. Then, Tλ (α) is an irreducible invariant
subspace of Vmn with respect to Sn . Further, the realization of Sn on Tλ (α) coincides with the irreducible representation generated
by eλ on the group algebra Sen .

Proof : (i) Let |xi ∈ Tλ (α). By definition, |xi = reλ |αi for some r ∈ Sen . Now, since pr ∈ Sen we obtain p |xi = (pr) eλ |αi ∈
Tλ (α) for all p ∈ Sn . It shows that Tλ (α) is invariant under Sn .
Since Tλ (α) is non-zero, we see that eλ |αi 6= 0. Let {ri eλ } be a subset of Lλ ∈ Sen that forms a basis of Lλ . Hence, for
all r ∈ Sen (such that reλ ∈ Lλ ) we can write reλ = ri eλ β i from which reλ |αi = ri eλ |αi β i for all r ∈ Sen . Hence, the set
{ri eλ |αi} forms a basis of Tλ (α).

Therefore, the representation of p ∈ Sn on the group algebra Sen is obtained by applying p on the elements of the
basis {|ri eλ i} of Lλ ⊆ Sen
j
p |ri eλ i = |pri eλ i = |rj eλ i D (p) i on Lλ ⊆ Sen ; ∀p ∈ Sn (13.7)
this relation can also be written as
j
pri eλ = rj eλ D (p) i

and the representation of p ∈ Sn on the tensor space Vmn is obtained applying p on the basis {ri eλ |αi} of Tλ (α) ⊆ Vmn
j
pri eλ |αi = (pri eλ ) |αi = rj eλ |αi D (p) i on Tλ (α) ⊆ Vmn ∀p ∈ Sn (13.8)

Since the matrix elements D (p)j i were obtained from an irreducible invariant subspace Lλ of Sen , it corresponds to an
irreducible representation on Sen . Further, Eqs. (13.7, 13.8) show that the realization of Sn on Tλ (α) coincides with the
irreducible representation generated by eλ on the group algebra Sen , so this representation is also irreducible in Vmn . QED.
Note that theorem 13.2 has assumed the hypothesis that $T_\lambda(\alpha) \neq \{0\}$. This hypothesis must be checked for each $\lambda$, since it is not guaranteed that every inequivalent irreducible representation of $S_n$ appears in $V_m^n$. In general, some irreducible representations $\lambda$ of $S_n$ could be absent in $V_m^n$, and in that case $T_\lambda(\alpha)$ will be zero, as we shall see later. This differs from the case of the group algebra $\tilde{S}_n$, which contains every irreducible representation of $S_n$ exactly $n_\mu$ times.

13.2 Totally symmetric tensors


A very important Young diagram is the one associated with the partition {n}, which is given by

Xn!
p
Θ{n} ≡ Θs = ··· , es ≡ s =
p=1
n!

the (normalized) irreducible symmetrizer es associated with the partition {n} and the Young diagram Θ{n} = Θs is just the
symmetrizer of the full group Sn . In Sec. 12.1 Eqs. (12.3, 12.6), we saw that pes = es and res = γes with γ ∈ C, for
all p ∈ Sn and for all r ∈ Sen . Thus, the left-ideal Ls generated by es is one-dimensional and clearly corresponds n to the
o
identity representation. Correspondingly, for any given element |αi ∈ Vm the irreducible subspace Ts (α) ≡ res |αi , r ∈ Sen
n

is generated by all multiples of es |αi so it is also one-dimensional. It is easy to check that all the elements res |αi ∈ Ts (α) are
totally symmetric tensors
n!
! n! n! n!
X p γ X γ X γ X
res |αi = γes |αi = γ |αi = p |αi = p |iin α{i} = |iin α{ip }
p=1
n! n! p=1
n! p=1
n! p=1

Xn!
γ
res |αi = |iin α{ip } (13.9)
n! p=1

so that the components of any res |αi = γes |αi ∈ Ts (α) with respect to the basis {|iin }, are totally symmetric in the n−indices.
Now, according with theorem 13.2 the realization of Sn on Ts (α) must be the one-dimensional representation of the identity
in this space. This is consistent with the fact that all permutations leave a totally symmetric tensor unchanged. This can be
verified easily by applying any q ∈ Sn in Eq. (13.9)
n! n!
γ X γ X
q [res |αi] = q [γes |αi] = q |iin α{ip } = [|iin ] α{iqp }
n! p=1 n! p=1

Xn!
γ
q [res |αi] = [|iin ] α{ip } = γes |αi
n! p=1

where we have used the rearrangement lemma.


On the other hand, |αi has been kept fixed in the discussion above. Thus, the most general tensor space containing totally
symmetric tensors is obtained with Ts (α) running over all |αi ∈ Vmn . According with the definition 13.4 it is precisely the set
of tensors belonging to the symmetry class λ = s. We shall denote the subspace of tensors of the symmetry class s by Ts′ . We
wonder about the dimensionality of Ts′ i.e. to see how many linearly independent vectors of the form res |αi = γes |αi appears
when we run over all |αi ∈ Vmn .

Example 13.4 Consider third rank tensors (n = 3) in two dimensions (m = 2). A basis for this tensor space V23 can be
written as
{|αi i} = {|+ + +i , |+ + −i , |+ − +i , |+ − −i , |− + +i , |− + −i , |− − +i , |− − −i} (13.10)

we write the elements of S3 as

p1 = e, p2 = (12) , p3 = (13) , p4 = (23) , p5 = (123) , p6 = (321)


p−1
5 = p6 ; p−1
6 = p5 ; pk = pk
−1
with k = 1, 2, 3, 4

the normalized irreducible symmetrizers of the Young tableau Θ{3} reads


1
e{3} = [e + (12) + (13) + (23) + (123) + (321)] (13.11)
3!
and the action of each p ∈ S3 on each vector basis of V23 yields
E

pk |i1 , i2 , i3 i = ip−1 , ip−1 , ip−1 = ipk1 , ipk2 , ipk3 ; k = 1, 2, 3, 4
k k2 k3
1 E E

p5 |i1 , i2 , i3 i = ip−1 , ip−1 , ip−1 = ip61 , ip62 , ip63 ; p6 |i1 , i2 , i3 i = ip−1 , ip−1 , ip−1 = ip51 , ip52 , ip53
51 52 53 61 62 63

explicitly we obtain

p1 |i1 , i2 , i3 i = ip1 , ip1 , ip1 = |i1 , i2 , i3 i ; p2 |i1 , i2 , i3 i = ip2 , ip2 , ip2 = |i2 , i1 , i3 i
1 2 3 1 2 3

p3 |i1 , i2 , i3 i = ip3 , ip3 , ip3 = |i3 , i2 , i1 i ; p4 |i1 , i2 , i3 i = ip4 , ip4 , ip4 = |i1 , i3 , i2 i
1 2 3 1 2 3

p5 |i1 , i2 , i3 i = ip6 , ip6 , ip6 = |i3 , i1 , i2 i ; p6 |i1 , i2 , i3 i = ip5 , ip5 , ip5 = |i2 , i3 , i1 i
1 2 3 1 2 3

and using the basis tensors in Eq. (13.10) we have

p |+ + +i = |+ + +i , p |− − −i = |− − −i , ∀p ∈ S3
e |+ + −i = (12) |+ + −i = |+ + −i ; (13) |+ + −i = (123) |+ + −i = |− + +i
(23) |+ + −i = (321) |+ + −i = |+ − +i
e |+ − +i = (13) |+ − +i = |+ − +i ; (12) |+ − +i = (321) |+ − +i = |− + +i
(23) |+ − +i = (123) |+ − +i = |+ + −i

e |+ − −i = (23) |+ − −i = |+ − −i ; (12) |+ − −i = (123) |+ − −i = |− + −i


(13) |+ − −i = (321) |+ − −i = |− − +i
e |− + +i = (23) |− + +i = |− + +i ; (12) |− + +i = (123) |− + +i = |+ − +i
(13) |− + +i = (321) |− + +i = |+ + −i

e |− + −i = (13) |− + −i = |− + −i ; (12) |− + −i = (321) |− + −i = |+ − −i


(23) |− + −i = (123) |− + −i = |− − +i
e |− − +i = (12) |− − +i = |− − +i ; (13) |− − +i = (123) |− − +i = |+ − −i
(23) |− − +i = (321) |− − +i = |− + −i (13.12)

so that the result of applying the irreducible symmetrizer on each element of the basis, is obtained by combining Eqs. (13.11,
13.12), for instance
[e + (12) + (13) + (23) + (123) + (321)]
es |+ + −i = |+ + −i
6
[e + (12)] [(13) + (123)] [(23) + (321)]
= |+ + −i + |+ + −i + |+ + −i
6 6 6
2 |+ + −i + 2 |− + +i + 2 |+ − +i
=
6
|+ + −i + |+ − +i + |− + +i
es |+ + −i =
3
and proceeding in the same way for the other elements of the basis we obtain

es |+ + +i = |+ + +i ; es |− − −i = |− − −i
1
es |+ + −i = es |+ − +i = es |− + +i = [|+ + −i + |+ − +i + |− + +i]
3
1
es |+ − −i = es |− + −i = es |− − +i = [|+ − −i + |− + −i + |− − +i]
3

we see that the three elements


|+ + −i , |+ − +i , |− + +i
generate the same tensor by multiplication with es . This is due to the fact that they are related each other by a permutation

|+ + −i = (23) |+ − +i = (321) |− + +i

and we see that if |x′ i = p |xi for some permutation p we have

es |x′ i = es p |xi = es |xi

and for the same reason, the three vectors

|+ − −i , |− + −i , |− − +i

generate the same tensor through es . In summary, starting with different elements of V2n=3 , we can generate only four linearly
independent totally symmetric tensors.

(1) |α1 i = |+ + +i ; es |α1 i = |+ + +i ≡ |s, 1, 1i (13.13)


(2) |α2 i = |+ + −i ; es |α2 i = [|+ + −i + |+ − +i + |− + +i] /3 ≡ |s, 2, 1i (13.14)
(3) |α7 i = |− − +i ; es |α7 i = [|− − +i + |− + −i + |+ − −i] /3 ≡ |s, 3, 1i (13.15)
(4) |α8 i = |− − −i ; es |α8 i = |− − −i ≡ |s, 4, 1i (13.16)

Note that we have introduced a classification scheme with three labels $|\lambda, \alpha, a\rangle$, where $\lambda \equiv s$ defines the symmetry class associated with the Young diagram $\Theta_s$ consisting of a single row, "α" labels the four irreducible invariant subspaces under S3, and we shall see later that "a" labels the irreducible invariant subspaces under Gm; in this case the label takes the fixed value a = 1. It is clear that the four totally symmetric tensors shown above are invariant under all p ∈ S3. Their linear combinations represent all totally symmetric tensors that can be constructed in V23.
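The counting of Eqs. (13.13-13.16) can also be reproduced by brute force. The sketch below (an added illustration, not part of the original notes) builds the matrices D(p) of Eq. (13.3) for S3 on V_2^3, forms the normalized total symmetrizer, and reads off the number of independent totally symmetric tensors from its rank:

```python
import itertools
import numpy as np

m, n = 2, 3
basis = list(itertools.product(range(m), repeat=n))
index = {b: k for k, b in enumerate(basis)}

def D(p):                                        # Eq. (13.2): p|i1 i2 i3> = |i_{p^-1(1)} ...>
    pinv = [p.index(k) for k in range(n)]
    mat = np.zeros((m**n, m**n))
    for i in basis:
        j = tuple(i[pinv[k]] for k in range(n))
        mat[index[j], index[i]] = 1.0
    return mat

S3 = list(itertools.permutations(range(n)))
es = sum(D(p) for p in S3) / len(S3)             # normalized total symmetrizer e_s
print(np.linalg.matrix_rank(es))                 # 4: the four independent symmetric tensors
```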

13.3 Totally anti-symmetric tensors


The next natural question is whether we can always generate totally anti-symmetric tensors in $V_m^n$. Totally anti-symmetric tensors must be generated by the (normalized) total antisymmetrizer $a = \frac{1}{n!}\sum_p (-1)^p\, p$. This anti-symmetrizer is in turn generated by the partition $\{1, 1, \ldots, 1\}$ of $n$, associated with the Young diagram consisting of a single column:
\[
\Theta_{\{1,1,\ldots,1\}} \equiv \Theta_a = \text{(a single column of $n$ boxes)}\ ,\qquad e_a \equiv a = \frac{1}{n!}\sum_{p=1}^{n!} (-1)^p\, p
\]

Let us check for the conditions to be able to build a totally anti-symmetric tensor in Vmn .

Theorem 13.3 (totally anti-symmetric tensors): Let Vmn be a tensor space of n − th rank. It contains totally anti-symmetric
tensors with respect to the basis {|iin } only if m ≥ n.

Proof: Let p = (kl) be a transposition, using Eq. (12.2) and the fact that a transposition is an odd permutation, we have

(kl) a = a (kl) = −a

consider an element |i1 i2 · · · in i of the natural basis of Vmn in which ik = il with k 6= l; i.e. such that there is a duplication in
the k and l positions. For this element we see that

[(kl) a] |i1 i2 · · · in i = −a |i1 i2 · · · in i


[a (kl)] |i1 i2 · · · in i = a |i1 i2 · · · in i

and since a (kl) = (kl) a we see that −a |i1 i2 · · · in i = a |i1 i2 · · · in i, therefore

a |i1 i2 · · · in i = 0 if ik = il with k 6= l

further, we observe that if n > m all natural basis elements of Vmn contain at least one such duplication. Therefore, the
anti-symmetrizer annihilates all elements of the basis and so all elements of the vector space Vmn . QED.

Corollary 13.4 The tensor space Vnn contains one and only one linearly independent tensor that is totally anti-symmetric
with respect to the basis {|iin }.

Proof : A natural basis for Vnn is described by


{|i1 , i2 , . . . , in i} ; ik = 1, 2, . . . , n ; k = 1, 2, . . . , n
from the arguments in theorem 13.3 for a given basis vector |i1 , i2 , . . . , in i not to be annihilated by the total anti-symmetrizer,
we require ik 6= im if k 6= m. Therefore, each possible value of ik appears once and only once2 in |i1 , i2 , . . . , in i. Consequently,
two basis vectors |i1 , i2 , . . . , in i , |k1 , k2 , . . . , kn i that are not annihilated by a, must be related each other by a permutation of
the indices i.e.
|k1 , k2 , . . . , kn i = |iq1 , iq2 , . . . , iqn i = q −1 |i1 , i2 , . . . , in i ; q ∈ Sn
and applying the antisymmetrizer to one element of the basis that does not vanish we have
 
a |i1 , i2 , . . . , in i = a qq −1 |i1 , i2 , . . . , in i = (aq) q −1 |i1 , i2 , . . . , in i
a |i1 , i2 , . . . , in i = (−1)q a |k1 , k2 , . . . , kn i
and running over all vectors of the basis, the ones not annihilated generates linearly dependent vectors. Further, for any Vnn ,
the natural basis (generated by tensor products of the bases components) certainly contains vectors |i1 , i2 , . . . , in i in which
each possible value of ik appears once and only once3 , showing that an anti-symmetric tensor can always be constructed in
Vnn . QED.
For instance, in V33 the following basis vectors generate linearly dependent anti-symmetric tensors
|1, 2, 3i , |1, 3, 2i , |2, 1, 3i , |3, 1, 2i , |2, 3, 1i , |3, 2, 1i
while the remaining 21 basis vectors are annihilated by the antisymmetrizer, since each one of them contain at least one
duplication. For instance
|1, 1, 1i , |1, 1, 2i , |3, 1, 3i , |2, 2, 1i , |3, 1, 1i . . . etc.
Coming back to arbitrary tensor spaces Vmn , theorem 13.3 says that for Vmn with m < n the irreducible representation
associated with the partition λ ≡ a = {1, . . . , 1} does not appear in Vmn , so that Ta (α) = {0} according with theorem 13.2. On
the other hand, if the condition m ≥ n is satisfied, the total anti-symmetrizer a ≡ ea that generates the irreducible subspace
p
La ⊆ Sen , also generates an irreducible subspace Ta (α) ⊆ Vmn . Since pea = (−1) ea , both La and Ta (α) are one-dimensional
p
and the realization of Sn on both La and Ta (α) corresponds to the one-dimensional representation p → (−1) . An element of
Ta (α) is of the form
p
rea |αi = ri pi ea |αi = (−1) i ri ea |αi
where we have used Eq. (12.2). If we apply a transposition qe to this element, and using the fact that transpositions are odd,
we have
p p
qerea |αi = (−1) i ri qeea |αi = − (−1) i ri ea |αi = −rea |αi
so applying any transposition to any vector in Ta (α) ⊆ Vmn we invert the sign of the vector, which is precisely the condition
of total anti-symmetry. Once again, we are interested in obtaining all linearly independent totally anti-symmetric tensors, for
which we run over all |αi ∈ Vmn .
Example 13.5 Consider the tensor space V23 described in example 13.4. Since 3 > 2 no totally anti-symmetric tensors exist
in such a space. The total anti-symmetrizers annihilates all elements of the basis given by Eq. (13.10) and so all vectors of
the space. For instance, using Eqs. (13.12) we have
1
a{1,1,1} |+ − +i = [e − (12) − (13) − (23) + (123) + (321)] |+ − +i
3!
1
a{1,1,1} |+ − +i = [|+ − +i − |− + +i − |+ − +i − |+ + −i + |+ + −i + |− + +i] = 0 (13.17)
3!
and same for the other elements of the basis. It is because each basis vector contains “+” or “−” at least twice.
Example 13.6 We have seen that there is one and only one linearly independent totally anti-symmetric tensor of rank n in the
tensor space Vnn . It is usually denoted by ε and it is obtained by applying the total anti-symmetrizer (usually non-normalized)
to a basis vector |i1 , i2 , . . . , in i in which each value of ik appears once and only once
n!
X n!
X
a |i1 , i2 , . . . , in i = (−1)p p |i1 , i2 , . . . , in i = (−1)p p−1 |i1 , i2 , . . . , in i
p=1 p=1
n!
X p
a |i1 , i2 , . . . , in i = (−1) |ip1 , ip2 , . . . , ipn i
p=1

2 It is precisely because i takes n-different values. If i took m−different values with m > n, not all the values of i would appear in a basis
k k k
vector.
3 For example |1, 2, . . . , ni and any permutation of it.

where we have used Eq. (13.2), the rearrangement lemma, and the fact that each permutation has the same parity as its inverse.
In two dimensions, denoting the basis as

V22 → {|1, 1i , |1, 2i , |2, 1i , |2, 2i}

its components are given by

a |1, 2i = |1, 2i − |2, 1i = {0 · |1, 1i + 1 · |1, 2i − 1 · |2, 1i + 0 · |2, 2i}

we denote this as ε12 = −ε21 = 1, ε11 = ε22 = 0. In three dimensions we denote the basis of V33 as

|1, 1, 1i , |1, 1, 2i , |1, 1, 3i , |1, 2, 1i , |1, 2, 2i , |1, 2, 3i , |1, 3, 1i , |1, 3, 2i , |1, 3, 3i


|2, 1, 1i , |2, 1, 2i , |2, 1, 3i , |2, 2, 1i , |2, 2, 2i , |2, 2, 3i , |2, 3, 1i , |2, 3, 2i , |2, 3, 3i
|3, 1, 1i , |3, 1, 2i , |3, 1, 3i , |3, 2, 1i , |3, 2, 2i , |3, 2, 3i , |3, 3, 1i , |3, 3, 2i , |3, 3, 3i

the components of the only linearly independent tensor that is totally anti-symmetric yields
n o
−1 −1 −1 −1 −1
a |1, 2, 3i = e−1 − (12) − (13) − (23) + (123) + (321) |1, 2, 3i
a |1, 2, 3i = |1, 2, 3i − |2, 1, 3i − |3, 2, 1i − |1, 3, 2i + |2, 3, 1i + |3, 1, 2i

each of the 27 components is denoted as εijk and take the value 1 (−1) if (ijk) is an even (odd) permutation of (123), and
they are zero if any two indices are equal. In general, for Vnn the totally anti-symmetric tensor can be defined as εi1 i2 ...in in
which a component is equal to +1 for even permutations of (1, 2, . . . , n) equal to −1 for odd permutations of it, and zero when
any index is repeated.
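A direct construction of ε along these lines is short (an added sketch, not part of the original notes; the sign of each permutation is obtained by counting inversions):

```python
import itertools
import numpy as np

n = 3
eps = np.zeros((n, n, n))

def parity(p):
    """+1 for an even permutation, -1 for an odd one (count inversions)."""
    inv = sum(1 for a in range(len(p)) for b in range(a + 1, len(p)) if p[a] > p[b])
    return -1 if inv % 2 else 1

# a|1,2,3> = sum_p (-1)^p |p(1), p(2), p(3)>: the nonzero components of the epsilon tensor
for p in itertools.permutations(range(n)):
    eps[p] = parity(p)

print(eps[0, 1, 2], eps[1, 0, 2], eps[0, 0, 1])   # 1.0 -1.0 0.0
```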

Theorem 13.5 The tensor space Vm2 of second rank tensors (with m ≥ 2), contains m (m + 1) /2 linearly independent tensors
that are totally symmetric, and m (m − 1) /2 linearly independent tensors that are totally anti-symmetric. The set of all linearly
independent tensors that are totally symmetric plus the set of all linearly independent tensors that are totally anti-symmetric,
forms a basis in this space.

Proof : Consider the second rank tensors (n = 2) in m−dimensions (m ≥ 2). Let us consider the action of the total
symmetrizer and anti-symmetrizer. For the total symmetrizer, we have

es |iii = |iii ; i = 1, 2, . . . , m
es |iji = [|iji + |jii] /2 ; i 6= j

for the total anti-symmetrizer we obtain

a |iii = 0 ; a |iji = [|iji − |jii] /2 ; i 6= j

to see how many linearly independent anti-symmetric tensors are generated, we take into account that vectors of the form |iii
do not contribute and that |iji and |jii give redundant information, then the linearly independent antisymmetric tensors are
given by the set
{a |i, ji : i, j = 1, 2, . . . , m and j > i} (13.18)
for i = 1 there are m − 1 vectors of this form (j = 2, 3, . . . , m), for i = 2 there are m − 2 vectors,...,for i = m − 1 there is one
vector. The total number of vectors in the set described in Eq. (13.18) (which are linearly independent) gives
m (m − 1)
(m − 1) + (m − 2) + . . . + 1 =
2
for the symmetric tensors, we only have to take into account that diagonal vectors |iii also contribute, hence the linearly
independent symmetric tensors are given by the set4

{es |i, ji : i, j = 1, 2, . . . , m and j ≥ i} (13.19)

the diagonal terms gives m additional terms, hence the number of elements in this set is
m (m − 1) m (m + 1)
+m=
2 2
and the number of totally symmetric and totally antisymmetric tensors that are linearly independent is
m (m − 1) m (m + 1)
+ = m2
2 2
4 It is left to the reader the proof that the sets defined by Eqs. (13.18, 13.19) provide linearly independent tensors.

which is precisely the dimension of Vm2 . QED.
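The dimension count of Theorem 13.5 is easy to verify numerically (an added sketch, not part of the original notes): the ranks of the projectors built from the transposition (12) give the number of independent symmetric and anti-symmetric tensors.

```python
import numpy as np

def swap_matrix(m):
    """Matrix of the transposition (12) on V_m^2:  |i j> -> |j i>."""
    P = np.zeros((m * m, m * m))
    for i in range(m):
        for j in range(m):
            P[m * j + i, m * i + j] = 1.0
    return P

for m in (2, 3, 4):
    P = swap_matrix(m)
    I = np.eye(m * m)
    sym_rank = np.linalg.matrix_rank((I + P) / 2)    # e_s = (e + (12))/2
    asym_rank = np.linalg.matrix_rank((I - P) / 2)   # a   = (e - (12))/2
    print(m, sym_rank, asym_rank)   # prints m, m(m+1)/2, m(m-1)/2
```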


We have examined the representations generated by the partitions {n} ≡ s and {1, 1, . . . , 1} ≡ a associated with Young
diagrams consisting of a single row and a single column respectively. The irreducible symmetrizers es and ea associated
with these partitions generate totally symmetric and totally antisymmetric tensors respectively, where in the latter case the
condition m ≥ n must be satisfied to get anti-symmetric tensors.
Nevertheless, for n ≥ 3 there are other Young diagrams that can also generate irreducible representations on either Sen or on
Vmn . The Young diagrams not consisting of either a single row or a single column will generate tensors with mixed symmetry5 .

13.4 Reduction of the tensor space V23 in irreducible invariant subspaces under
S3 and G2
In example 13.4, we examined the symmetric 3rd rank tensors in 2−dimensions associated with the Young diagram Θ{3} .
Further, according to theorem 13.3, totally anti-symmetric tensors associated with Θ{1,1,1} cannot be defined (the total
antisymmetrizer annihilates all vectors of the form |αi ∈ V23 ). Let us examine tensors in V23 with symmetry associated with
the remaining Young diagram Θ{2,1} ≡ Θm of S3 . Its normal Young tableau Θλ=m and the irreducible symmetrizer em
associated with the normal tableau read
       1 2
Θm =   3        ;    em = (1/4) [e + (12)] [e − (31)] = (1/4) [e − (31) + (12) − (321)]          (13.20)
we shall see that with respect to S3 , two independent irreducible invariant subspaces of tensors with mixed symmetry, are
generated in V23 . As in example 13.4, we start by examining the action of em on all the basis vectors {|αi i} of V23 given by
Eqs. (13.10). For this, we notice that em annihilates the tensors |+ + +i , |+ − +i , |− − −i and |− + −i. This can be checked
either by explicit calculation or by noting that em antisymmetrizes the first and third positions, and these tensors have identical entries in those positions. Using Eqs. (13.10, 13.12, 13.20), we obtain for the eight basis tensors the following
em |αi i = 0 ; i = 1, 3, 6, 8
em |α2 i = (1/4) [e − (31) + (12) − (321)] |+ + −i = (1/4) [|+ + −i − |− + +i + |+ + −i − |+ − +i]
         = (1/4) [2 |+ + −i − |− + +i − |+ − +i]
em |α4 i = (1/4) [e − (31) + (12) − (321)] |+ − −i = (1/4) [|+ − −i − |− − +i + |− + −i − |− − +i]
         = (1/4) [|+ − −i + |− + −i − 2 |− − +i]
em |α5 i = (1/4) [e − (31) + (12) − (321)] |− + +i = (1/4) [|− + +i − |+ + −i + |+ − +i − |+ + −i]
         = (1/4) [|− + +i + |+ − +i − 2 |+ + −i] = −em |α2 i
em |α7 i = (1/4) [e − (31) + (12) − (321)] |− − +i = (1/4) [|− − +i − |+ − −i + |− − +i − |− + −i]
         = (1/4) [2 |− − +i − |+ − −i − |− + −i] = −em |α4 i          (13.21)
so there are only two linearly independent vectors in the set {em |αi i}; we shall choose em |α2 i and em |α7 i.
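The action (13.21) is easy to reproduce numerically. The sketch below is our own illustration (the helper names and the identification + ↔ 0, − ↔ 1 are assumptions of the example); permutations act by moving the content of slot j to slot p(j), which reproduces the convention used above, and the rank of the resulting set of vectors confirms that em generates only two linearly independent tensors.

```python
import numpy as np
from itertools import product

def basis_index(c):                 # (c1, c2, c3) with + -> 0, - -> 1
    return 4 * c[0] + 2 * c[1] + c[2]

def perm_operator(p):
    """p = (p0, p1, p2): content of slot j is moved to slot p[j]."""
    P = np.zeros((8, 8))
    for c in product((0, 1), repeat=3):
        new = [0, 0, 0]
        for j in range(3):
            new[p[j]] = c[j]
        P[basis_index(new), basis_index(c)] = 1.0
    return P

e    = perm_operator((0, 1, 2))
p12  = perm_operator((1, 0, 2))   # (12)
p31  = perm_operator((2, 1, 0))   # (31)
p321 = perm_operator((2, 0, 1))   # (321)

em = (e - p31 + p12 - p321) / 4   # irreducible symmetrizer of Eq. (13.20)

images = np.array([em[:, basis_index(c)] for c in product((0, 1), repeat=3)])
print(np.linalg.matrix_rank(images))   # -> 2, as found in Eq. (13.21)
```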
On the other hand, it is also useful to calculate the tensors derived from the other standard Young tableau associated with
the same Young diagram.
           1 3
Θ(23)m =   2        ;    e(23)m = (1/4) [e + (31)] [e − (12)] = (1/4) [e + (31) − (12) − (123)]          (13.22)
since e(23)m antisymmetrizes the first and second positions, it annihilates the tensors |+ + +i , |+ + −i , |− − +i , |− − −i, or equivalently the tensors |αi i with i = 1, 2, 7, 8. From |+ − +i and |− + +i we obtain only one linearly independent vector since they differ by a permutation; a similar situation occurs with |+ − −i and |− + −i. Thus we choose the two linearly independent vectors generated by |α3 i and |α6 i. To obtain them it is more useful to express e(23)m taking into account that epm = p em p−1 , so that e(23)m = (23) em (23). Therefore

e(23)m |α3 i = e(23)m |+ − +i = (23) em (23) |+ − +i = (23) em |+ + −i = (23) em |α2 i
e(23)m |α6 i = e(23)m |− + −i = (23) em (23) |− + −i = (23) em |− − +i = (23) em |α7 i
5 Note that for n = 2, Young diagrams cannot generate tensors with mixed symmetry. It is for this reason that totally symmetric and totally anti-symmetric tensors can form a basis in Vm2 (see theorem 13.5).

and using Eqs. (13.21) we obtain

e(23)m |α3 i = (23) em |α2 i = (1/4) [2 |+ − +i − |− + +i − |+ + −i]
e(23)m |α6 i = (23) em |α7 i = (1/4) [2 |− + −i − |+ − −i − |− − +i]          (13.23)
From Eqs. (13.21, 13.23), we see that we can generate all four linearly independent vectors associated with the partition {2, 1} by using |α2 i and |α7 i along with em and (23) em . The latter replaces e(23)m applied on |α3 i and |α6 i, as can be seen in Eq. (13.23).

13.4.1 Irreducible invariant subspaces under S3 generated by Θm and Θ(23)m
We have seen that four linearly independent vectors are generated from the partition {2, 1}, associated with the two standard Young tableaux Θm and Θ(23)m . They are generated by the two vectors |α2 i , |α7 i in combination with the operators em and (23) em . From them we shall construct two two-dimensional subspaces, irreducible and invariant under S3 .
(i) Choosing |α2 i = |+ + −i, we obtain from Eqs. (13.21, 13.23) the independent tensors

em |α2 i = (1/4) [2 |+ + −i − |− + +i − |+ − +i] ≡ |m, 1, 1i          (13.24)
(23) em |α2 i = (1/4) [2 |+ − +i − |− + +i − |+ + −i] ≡ |m, 1, 2i          (13.25)
it can be checked that any element of the set Tm (α2 ) ≡ {r em |+ + −i , r ∈ Se3 } is a linear combination of the two tensors above. Hence, according to theorem 13.2, the set

{em |α2 i , (23) em |α2 i} ≡ {|m, 1, 1i , |m, 1, 2i}

of mixed tensors forms a basis for an irreducible invariant subspace Tm (α2 ) of V23 under S3 , which we denote from now on as Tλ=m (1). To verify this, it suffices to show that pi em |+ + −i is a linear combination of the tensors (13.24, 13.25) for each pi ∈ S3 . We can do the verification explicitly by taking an arbitrary linear combination of the tensors in Eqs. (13.24, 13.25)

c1 |m, 1, 1i + c2 |m, 1, 2i = (1/4) [(2c1 − c2 ) |+ + −i + (2c2 − c1 ) |+ − +i − (c1 + c2 ) |− + +i]          (13.26)
Let us do it for pi = (123), using Eq. (13.24, 13.25) we have

pi em |+ + −i ≡ pi |m, 1, 1i = (123) (1/4) [2 |+ + −i − |− + +i − |+ − +i] = (1/4) [2 |− + +i − |+ − +i − |+ + −i]

(123) |m, 1, 1i = (1/4) [− |+ + −i − |+ − +i + 2 |− + +i]          (13.27)
comparing the RHS of Eqs. (13.26, 13.27) we obtain the equations

2c1 − c2 = −1 , 2c2 − c1 = −1, − (c1 + c2 ) = 2

whose solution is c1 = c2 = −1. Note that this system has a solution even though it is over-determined, indicating that (123) |m, 1, 1i
is a linear combination of the form
(123) |m, 1, 1i = − |m, 1, 1i − |m, 1, 2i
A similar procedure is carried out with the other elements of S3 .
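The relation (123) |m, 1, 1i = − |m, 1, 1i − |m, 1, 2i can also be checked with a short numerical sketch (our own illustration, with hypothetical helpers and the identification + ↔ 0, − ↔ 1 as before):

```python
import numpy as np
from itertools import product

def basis_index(c):
    return 4 * c[0] + 2 * c[1] + c[2]

def perm_operator(p):                    # content of slot j moves to slot p[j]
    P = np.zeros((8, 8))
    for c in product((0, 1), repeat=3):
        new = [0, 0, 0]
        for j in range(3):
            new[p[j]] = c[j]
        P[basis_index(new), basis_index(c)] = 1.0
    return P

e, p12, p31, p321 = (perm_operator(p) for p in [(0, 1, 2), (1, 0, 2), (2, 1, 0), (2, 0, 1)])
em   = (e - p31 + p12 - p321) / 4        # Eq. (13.20)
p123 = perm_operator((1, 2, 0))          # (123)
p23  = perm_operator((0, 2, 1))          # (23)

alpha2 = np.zeros(8); alpha2[basis_index((0, 0, 1))] = 1.0   # |+ + ->
m11 = em @ alpha2            # |m,1,1>, Eq. (13.24)
m12 = p23 @ m11              # |m,1,2> = (23) em |alpha2>, Eq. (13.25)

assert np.allclose(p123 @ m11, -m11 - m12)   # (123)|m,1,1> = -|m,1,1> - |m,1,2>
```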
(ii) Choosing |α7 i = |− − +i, Eqs. (13.21, 13.23) give the independent tensors

em |α7 i = (1/4) [2 |− − +i − |+ − −i − |− + −i] ≡ |m, 2, 1i          (13.28)
(23) em |α7 i = (1/4) [2 |− + −i − |+ − −i − |− − +i] ≡ |m, 2, 2i          (13.29)
and they are the basis of another irreducible invariant subspace under S3 , consisting of tensors with mixed symmetry denoted
by Tλ=m (2).
Since S3 has only one two-dimensional irreducible representation, it is obvious that the realization of S3 either on Tm (1)
or Tm (2) corresponds to this irreducible two-dimensional representation (see table 7.4, page 150).

13.4.2 Irreducible invariant subspaces under G2 generated by Θm and Θ(23)m
It can be observed that the two tensors |m, 1, 1i and |m, 2, 1i given by Eqs. (13.24, 13.28), are two linearly independent tensors
of the form em |αi with |αi ranging over Vmn . They are tensors of the symmetry Θm . We shall see that the subspace spanned
by these two tensors is an invariant irreducible subspace under G2 .
First, we check that the tensors {|m, 1, 1i , |m, 2, 1i} span the subspace

Tm′ (1) ≡ {em |αi , |αi ∈ V23 }

For this, we notice that according to Eqs. (13.21), the subspace Tm′ (1) is spanned by only two linearly independent vectors, say em |+ + −i ≡ |m, 1, 1i and em |− − +i ≡ |m, 2, 1i. The fact that Tm′ (1) is invariant under G2 follows from

g |αi ∈ Vmn ∀ |αi ∈ Vmn ⇒ g em |αi = em g |αi = em [g |αi] ∈ Tm′ (1) ; ∀ |αi ∈ Vmn
where we have used theorem 13.1, page 229. Finally, we show that such a subspace is irreducible with respect to G2 . We will do it by constructing the representation of G2 on Tm′ (1); then we look at the representation matrices and check whether they have any common eigenvectors. Since Tm′ (1) is two-dimensional, if it is reducible the irreducible subspaces must be of dimension one.
Since we only deal with two dimensions, it is convenient to denote the basis of Tm′ (1) as |(±)i

|(+)i ≡ |m, 1, 1i = (1/4) [2 |+ + −i − |− + +i − |+ − +i] ; |(−)i ≡ |m, 2, 1i = (1/4) [2 |− − +i − |+ − −i − |− + −i]          (13.30)
Now we study the action of the elements of G2 on these tensors; this action is defined by Eq. (13.1), i.e. g |i1 i2 i3 i = |j1 j2 j3 i g^{j1}_{i1} g^{j2}_{i2} g^{j3}_{i3} (summation over repeated indices). Hence

g |(+)i = (1/4) g [2 |+ + −i − |− + +i − |+ − +i]
        = (1/2) |i1 j1 k1 i g^{i1}_{+} g^{j1}_{+} g^{k1}_{−} − (1/4) |i2 j2 k2 i g^{i2}_{−} g^{j2}_{+} g^{k2}_{+} − (1/4) |i3 j3 k3 i g^{i3}_{+} g^{j3}_{−} g^{k3}_{+}

Expanding each of the three terms over the eight basis tensors of V23 and collecting the coefficient of each basis tensor, the contributions to |+ + +i and |− − −i cancel, and the surviving terms organize themselves into the combination g^{+}_{+} g^{−}_{−} − g^{+}_{−} g^{−}_{+} :

g |(+)i = (g^{+}_{+} g^{−}_{−} − g^{+}_{−} g^{−}_{+}) { g^{+}_{+} (1/4) [2 |+ + −i − |− + +i − |+ − +i] − g^{−}_{+} (1/4) [2 |− − +i − |+ − −i − |− + −i] }

that is,

g |(+)i = (g^{+}_{+} g^{−}_{−} − g^{+}_{−} g^{−}_{+}) [ |(+)i g^{+}_{+} − |(−)i g^{−}_{+} ]          (13.31)

With a similar procedure we find

g |(−)i = (1/4) g [2 |− − +i − |+ − −i − |− + −i]
        = (1/2) |i1 j1 k1 i g^{i1}_{−} g^{j1}_{−} g^{k1}_{+} − (1/4) |i2 j2 k2 i g^{i2}_{+} g^{j2}_{−} g^{k2}_{−} − (1/4) |i3 j3 k3 i g^{i3}_{−} g^{j3}_{+} g^{k3}_{−}

which, after expanding over the basis tensors and collecting terms in the same way, yields

g |(−)i = (g^{+}_{+} g^{−}_{−} − g^{+}_{−} g^{−}_{+}) [ − |(+)i g^{+}_{−} + |(−)i g^{−}_{−} ]          (13.32)

Eqs. (13.31, 13.32) can be written as

g |(+)i = (det g) [ |(+)i g^{+}_{+} − |(−)i g^{−}_{+} ] ; g |(−)i = (det g) [ − |(+)i g^{+}_{−} + |(−)i g^{−}_{−} ]

so according to Eq. (7.11) the representation matrices are

D (g) = (det g) (  g^{+}_{+}   −g^{+}_{−} )
                ( −g^{−}_{+}    g^{−}_{−} )

it can be checked that these matrices are elements of G2 by themselves, and they do not commute with one another because
G2 is non-abelian. Hence, they do not all have a common eigenvector and therefore the representation is irreducible6 .
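These representation matrices can be verified numerically. The sketch below is our own illustration (assuming NumPy; the helpers and the identification + ↔ 0, − ↔ 1 are assumptions of the example): it applies g ⊗ g ⊗ g to |(±)i for a randomly chosen invertible g and compares the result with Eqs. (13.31, 13.32).

```python
import numpy as np
from itertools import product

def basis_index(c):
    return 4 * c[0] + 2 * c[1] + c[2]

def perm_operator(p):                    # content of slot j moves to slot p[j]
    P = np.zeros((8, 8))
    for c in product((0, 1), repeat=3):
        new = [0, 0, 0]
        for j in range(3):
            new[p[j]] = c[j]
        P[basis_index(new), basis_index(c)] = 1.0
    return P

e, p12, p31, p321 = (perm_operator(p) for p in [(0, 1, 2), (1, 0, 2), (2, 1, 0), (2, 0, 1)])
em = (e - p31 + p12 - p321) / 4

alpha2 = np.zeros(8); alpha2[basis_index((0, 0, 1))] = 1.0   # |+ + ->
alpha7 = np.zeros(8); alpha7[basis_index((1, 1, 0))] = 1.0   # |- - +>
plus, minus = em @ alpha2, em @ alpha7                       # |(+)>, |(-)>

rng = np.random.default_rng(0)
g = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))   # a generic element of G2
G3 = np.kron(np.kron(g, g), g)                               # action of g on V_2^3
detg = np.linalg.det(g)

# Eq. (13.31):  g|(+)> = det(g) [ g^+_+ |(+)> - g^-_+ |(-)> ]
assert np.allclose(G3 @ plus,  detg * (g[0, 0] * plus  - g[1, 0] * minus))
# Eq. (13.32):  g|(-)> = det(g) [ -g^+_- |(+)> + g^-_- |(-)> ]
assert np.allclose(G3 @ minus, detg * (-g[0, 1] * plus + g[1, 1] * minus))
```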
On the other hand, by a similar procedure we can show that the two tensors |m, 1, 2i and |m, 2, 2i defined by Eqs. (13.25, 13.29) are two linearly independent tensors of the form e(23)m |αi. This can be verified observing that (23) em = e(23)m (23). They are tensors of the symmetry Θ(23)m . Further, these two tensors span the subspace {e(23)m |αi , |αi ∈ V23 } ≡ Tm′ (2). The subspace Tm′ (2) ⊂ V23 is invariant and irreducible under G2 . Moreover, the two sets {Tm′ (a) : a = 1, 2} comprise tensors of the symmetry class m, with m denoting the Young diagram (frame) associated with the normal tableau Θm . To reestablish our original notation we shall use “α” instead of “i” from now on; the range of this label is the number of independent tensors that can be generated by eλ |αi with |αi ∈ Vmn .
6 Note that this strategy only works for two-dimensional representations. This is because in that case, block-diagonalization of matrices coincides with total diagonalization. Therefore, reduction of a representation is equivalent to simultaneous (total) diagonalization of all matrices and hence to the existence of a complete set of common eigenvectors.

13.4.3 Reduction of V23 in irreducible subspaces under S3 and G2


In summary, the complete reduction into irreducible tensors |λ, α, ai of the 8−dimensional tensor space V23 is obtained by
gathering the four totally symmetric tensors given in example 13.4 Eqs. (13.13-13.16), and the four mixed tensors given by
Eqs. (13.24, 13.25, 13.28, 13.29)7 .

es |α1 i = |+ + +i ≡ |s, 1, 1i (13.33)


es |α2 i = [|+ + −i + |+ − +i + |− + +i] /3 ≡ |s, 2, 1i (13.34)
es |α7 i = [|− − +i + |− + −i + |+ − −i] /3 ≡ |s, 3, 1i (13.35)
es |α8 i = |− − −i ≡ |s, 4, 1i (13.36)

em |α2 i = (1/4) [2 |+ + −i − |− + +i − |+ − +i] ≡ |m, 1, 1i          (13.37)
(23) em |α2 i = (1/4) [2 |+ − +i − |− + +i − |+ + −i] ≡ |m, 1, 2i          (13.38)
em |α7 i = (1/4) [2 |− − +i − |+ − −i − |− + −i] ≡ |m, 2, 1i          (13.39)
(23) em |α7 i = (1/4) [2 |− + −i − |+ − −i − |− − +i] ≡ |m, 2, 2i          (13.40)
with
|α1 i = |+ + +i ; |α2 i = |+ + −i ; |α7 i = |− − +i ; |α8 i = |− − −i (13.41)
In this case λ = s, m associated with two distinct symmetry classes (Young diagrams). The label α denotes the two distinct
(but equivalent) sets of tensors Tλ (α) invariant under S3 . Finally, the label a indicates the basis elements within each set
Tλ (α), it is associated with distinct symmetries (Young tableaux) in the same symmetry class (same Young diagram).
From another point of view, the label a indicates the irreducible invariant subspace Tλ′ (a) under G2 , and in that case α
labels the basis element within each set Tλ′ (a).
Further, for the class of totally symmetric tensors Ts (α) with α = 1, 2, 3, 4 we obtain four one-dimensional subspaces of
V23 , each subspace is spanned by one of the totally symmetric tensors in Eqs. (13.13-13.16), and each subspace is irreducibly
invariant under the identity representation. On the other hand, Ts′ (a) consists of a single four-dimensional irreducible invariant
subspace under G2 , it is spanned by the four linearly independent totally symmetric tensors of Eqs. (13.13-13.16).
Now we examine the tensors of the class of “mixed symmetry”. The subspace Tm (1) is spanned by the tensors in Eqs. (13.24, 13.25), while the set Tm (2) is spanned by the tensors in Eqs. (13.28, 13.29). On the other hand, Tm′ (1) is spanned by the tensors in Eqs. (13.24, 13.28), while the set Tm′ (2) is spanned by the tensors in Eqs. (13.25, 13.29).
Note that the two sets of two linearly independent mixed tensors can be classified either as belonging to two invariant subspaces under S3 denoted by {Tλ (α) , α = 1, 2}, or as belonging to two invariant subspaces under G2 denoted by {Tm′ (a) , a = 1, 2}. The latter comprise tensors of two distinct symmetries associated with Θm and Θ(23)m , which belong to the same Young diagram and so to the same symmetry class.

13.5 Reduction of the tensor space Vmn into irreducible tensors of the form
|λ, α, ai
The specific example of V23 paves the way for establishing the central theorems concerning the general case.

Theorem 13.6 (i) Two tensor subspaces irreducibly invariant with respect to Sn and belonging to the same symmetry class,
either coincide or are disjoint. (ii) Two tensor subspaces invariant and irreducible with respect to Sn and belonging to different
symmetry classes are always disjoint.

Proof : (i) Let Tλ (α) and Tλ (β) be two invariant subspaces belonging to the same symmetry class λ, i.e. generated by
the same irreducible symmetrizer eλ . If they are not disjoint, they must have a non-zero element in common. Therefore, there
exist non-zero elements s, s′ ∈ Sen such that

seλ |αi = s′ eλ |βi ∈ Tλ (α) ∩ Tλ (β)


⇒ rseλ |αi = rs′ eλ |βi ; ∀r ∈ Sen
7 Note that for Vmn = V23 we have m < n, so that there are no totally anti-symmetric tensors. Consequently, there is no subspace Tλ (α) in V23 associated with λ = {1, . . . , 1}.

when r ranges over all Sen so do rs and rs′ . Consequently, rseλ |αi ranges over all Tλ (α) and rs′ eλ |βi ranges over all Tλ (β).
Hence, if Tλ (α) ∩ Tλ (β) 6= {0} they must coincide.
(ii) Given any two subspaces Tλ (α) and Tµ (β) invariant under Sn , their intersection is also invariant under Sn . If both are irreducible, they must be either disjoint or coincident. But if λ 6= µ they cannot be coincident, so that Tλ (α) ∩ Tµ (β) = {0} if λ 6= µ. QED.
Theorem 13.6 along with theorem 13.2 of page 230, permits the complete decomposition of Vmn in irreducible invariant
subspaces Tλ (α) with respect to Sn . The decomposition can be written as
Vmn = Σ⊕λ Σ⊕α Tλ (α)          (13.42)

where α labels distinct subspaces associated with the same symmetry class (same Young diagram). As we explained before, the
basis elements of the tensors in the various symmetry classes are denoted by |λ, α, ai where a runs from 1 up to the dimension
of the subspace Tλ (α).
On the other hand, theorem 13.2 says that each subspace Tλ (α) contains an irreducible representation of Sn which is identical to the one contained in the left-ideal Lλ ⊆ Sen . Further, theorem 12.5 says that all representations of Sn in the group algebra Sen coming from the same Young diagram are equivalent. Therefore, all subspaces Tλ (α) with λ fixed are associated with equivalent representations. Consequently, the basis of each Tλ (α) can be chosen in such a way that the representation matrices of Sn on Tλ (α) are identical for all α within the same λ, i.e.

p |λ, α, ai = |λ, α, bi Dλ (p)b a (13.43)

independently of α. This is possible because Tλ (α) and Tλ (β) describe the same irreducible representation of Sn .

Definition 13.6 A canonical basis for Vmn is a basis of the form {|λ, α, ai} where each subset of this basis with λ and α fixed
and running over a, spans an irreducible invariant subspace under Sn , and in which the representation matrices of Sn on Tλ (α)
are identical for all α within the same λ, as displayed in Eq. (13.43).

The most outstanding result of this section is that the decomposition of Vmn in irreducible invariant subspaces with respect
to Sn given by Eq. (13.42), automatically provides a decomposition with respect to the general linear group Gm as well. We
saw this feature in section 13.4 for the particular case of V23 .

Theorem 13.7 Let g ∈ Gm and let {|λ, α, ai} be the canonical basis of Vmn induced by the complete decomposition of Vmn
in irreducible invariant subspaces Tλ (α) under Sn . The subspaces Tλ′ (a) spanned by {|λ, α, ai} with fixed λ and fixed a, are
invariant with respect to Gm , and the representations of Gm on Tλ′ (a) are independent of a, that is, they are given by
g |λ, α, ai = |λ, β, ai Dλ (g)β α

Proof : (i) From theorem 13.1 we see that gr = gpk rk = pk grk = rg for all r ∈ Sen and g ∈ Gm . Applying it to a given
element reλ |αi ∈ Tλ (α) and g ∈ Gm we find

g (reλ ) |αi = (reλ ) g |αi ∈ Tλ (gα)

so the operators of the linear group Gm do not change the symmetry class of the tensor, hence

g |λ, α, ai = |λ, β, bi Dλ (g)βb αa

(ii) We now show that Dλ (g) is diagonal in the indices a, b. We first note that for g ∈ Gm and p ∈ Sn

gp |λ, α, ai = g |λ, α, ci Dλ (p)c a = |λ, β, bi Dλ (g)βb αc Dλ (p)c a          (13.44)

where we have used the fact that in the canonical basis, the matrix representation of Sn is independent of α. On the other hand

pg |λ, α, ai = p |λ, β, ci Dλ (g)βc αa = |λ, β, bi Dλ (p)b c Dλ (g)βc αa          (13.45)

by using the notation

Dλ (g)βb αc ≡ [Dλ (g)β α ]b c          (13.46)

the matrix products on the RHS of Eqs. (13.44, 13.45) can be written as

Dλ (g)βb αc Dλ (p)c a = [Dλ (g)β α ]b c Dλ (p)c a = [Dλ (g)β α Dλ (p)]b a          (13.47)
Dλ (p)b c Dλ (g)βc αa = Dλ (p)b c [Dλ (g)β α ]c a = [Dλ (p) Dλ (g)β α ]b a          (13.48)

substituting Eqs. (13.47, 13.48) in Eqs. (13.44, 13.45) we find

gp |λ, α, ai = |λ, β, bi [Dλ (g)β α Dλ (p)]b a          (13.49)
pg |λ, α, ai = |λ, β, bi [Dλ (p) Dλ (g)β α ]b a          (13.50)

from theorem 13.1, we have that gp = pg. Therefore, the two product matrices on the RHS of Eqs. (13.49, 13.50) must coincide. Hence

[Dλ (g)β α Dλ (p)]b a = [Dλ (p) Dλ (g)β α ]b a          (13.51)

from the notation in Eq. (13.46) and for the sake of clarity, we designate quantities in square brackets as matrices in the space of latin indices, and suppress these indices. Then Eq. (13.51) becomes

[Dλ (g)β α ] [Dλ (p)] = [Dλ (p)] [Dλ (g)β α ]          (13.52)

and for a fixed g ∈ Gm this equation is valid for all p ∈ Sn . Therefore, Schur’s Lemma says that the matrix Dλ (g)βb αa must be proportional to the identity matrix in the latin indices, i.e. Dλ (g)βb αc = Dλ (g)β α δ b c . QED.
The next natural task is to check whether the invariant subspaces Tλ′ (a) under Gm are irreducible or not. To establish the relevant theorem, we first prove the following Lemma.

Lemma 13.2 The linear group transformations {D (g) , g ∈ Gm } on Vmn given by Eq. (13.1)
D (g){j} {i} ≡ g j1 i1 g j2 i2 · · · g jn in , ∀g ∈ Gm

span the space of all symmetry-preserving linear transformations K.

Proof : From the definition 13.3, A ∈ K means that

A{i} {j} = A{ip } {jp } ∀p ∈ Sn

Lemma 13.1 says that {g ∈ Gm } ⊆ K. A necessary and sufficient condition for {g ∈ Gm } to span K is that the only linear
functional on K which yields L (g) = 0, ∀g ∈ Gm is L = 0. We shall show8 that this condition is satisfied. By definition of
linear functional on K we see that
L (A) = L{j} {i} A{i} {j} (13.53)
where L{j} {i} are components of L with respect to the dual basis to that which defines the components of A. It can be checked
that the symmetry-preserving linear functional defined as
L̃{j} {i} = (1/n!) Σp∈Sn L{jp } {ip }

produces the same effect on A ∈ K as L{j} {i} . To see this, we use the symmetry-preserving property of A to write

L̃{j} {i} A{i} {j} = (1/n!) Σp∈Sn L{jp } {ip } A{i} {j} = (1/n!) Σp∈Sn L{jp } {ip } A{ip } {jp }
                  = (1/n!) Σp∈Sn L{k} {m} A{m} {k} = (1/n!) n! L{k} {m} A{m} {k}
L̃{j} {i} A{i} {j} = L{j} {i} A{i} {j}

Therefore, we can consider without any loss of generality that L{j} {i} is symmetry-preserving. Since g ∈ K, we can apply Eq.
(13.53) to A = g
L (g) = Lj1 j2 ···jn i1 i2 ···in g i1 j1 g i2 j2 · · · g in jn ∀g ∈ Gm (13.54)
Since it is valid for all g ∈ Gm , it is valid in particular for the case in which

g = ḡ + εφ
8 In this case we see K as a vector space. If we denote hφ| a functional on K, we see that if there is a functional hφ| 6= 0 such that hφ| gi = 0 for

all g ∈ Gm it means that there is an associated vector |φi orthogonal to all the elements g ∈ Gm . Hence, {g} is not a basis so that it does not span
K. The reciprocal can also be proved.

where ḡ and φ are both invertible m × m matrices, and ε an infinitesimal parameter. Substituting this form of g in Eq. (13.54), and expanding in powers of ε (to first order), we obtain

L (g) = Lj1 j2 ···jn i1 i2 ···in (ḡ i1 j1 + εφi1 j1 ) (ḡ i2 j2 + εφi2 j2 ) · · · (ḡ in jn + εφin jn )
      = Lj1 j2 ···jn i1 i2 ···in [ḡ i1 j1 ḡ i2 j2 . . . ḡ in jn + εφi1 j1 ḡ i2 j2 . . . ḡ in jn + εφi2 j2 ḡ i1 j1 ḡ i3 j3 . . . ḡ in jn
        + · · · + εφik jk ḡ i1 j1 . . . ḡ ik−1 jk−1 ḡ ik+1 jk+1 . . . ḡ in jn + · · · + εφin jn ḡ i1 j1 ḡ i2 j2 . . . ḡ in−1 jn−1 ]

and using the symmetry of L, we find


Lj1 j2 ···jn i1 i2 ···in φi1 j1 ḡ i2 j2 · · · ḡ in jn = 0
to first order in ε. Since {φ} is arbitrary (except for the fact that it is invertible), we must have

Lj1 j2 ···jn i1 i2 ···in ḡ i2 j2 · · · ḡ in jn = 0

by repeated use of this argument we obtain that


Lj1 j2 ···jn i1 i2 ···in = 0
from which L = 0. QED.

Theorem 13.8 (Irreducible representations of Gm ): The representations of Gm on the subspace Tλ′ (a) of Vmn are irreducible
representations.

Proof : We shall not provide a complete proof. Instead, we shall discuss the plausibility of the result. Since Gm is, so to speak, the most general group of transformations that commutes with Sn on Vmn , the operators {D (g) , g ∈ Gm } on the subspace Tλ′ must be “complete”; they cannot be reducible. Specifically, consider an arbitrary linear transformation A on the
vector space Tλ′ (a). In tensor components we write

x{i} → y {i} = A{i} {j} x{j}

since x and y belong to the same symmetry class, A must be symmetry preserving in the sense given by definition 13.3

A{i} {j} = A{ip } {jp } ∀p ∈ Sn

further, we know from Lemma 13.1 that linear transformations representing g ∈ Gm are symmetry-preserving. It can be proved that, though A does not necessarily factorize as D (g) in Eq. (13.1), it can be written as a linear combination of the D (g) according to Lemma 13.29 . Since this is true for all linear transformations, the representation {D (g)} must be irreducible. QED.
It is important to remark that the irreducible representations of Gm provided by tensors of various symmetry classes described in this chapter are not the only irreducible representations of the general linear group. This group contains additional finite-
dimensional and infinite-dimensional irreducible inequivalent representations. Further, the tensor method described here can
be applied to many classical Lie groups such as the group of three dimensional rotations SO (3), to characterize their tensors
by their symmetry properties.

9 For this we should prove that A ∈ K, i.e. that it is a symmetry-preserving linear transformation on Vmn . Hence, lemma 13.2 says that it must be a linear combination of {D (g)}. Note that we have established that A is symmetry-preserving as a mapping on Tλ′ (a) ⊆ Vmn but not as a mapping on Vmn .
Chapter 14

One dimensional continuous groups

Continuous groups consist of elements that are labelled by one or more continuous parameters such as (a1 , a2 , . . . , ar ) in
which each variable has a well-defined range. Continuous groups are clearly infinite, but additionally a notion of “nearness”
or “continuity” must be introduced to the set (a1 , a2 , . . . , ar ) of parameters, this set is usually called the manifold of the
continuous group. Further, some conditions of differentiability or analyticity can be required. Indeed, the fact that these groups have an infinite number of elements does not necessarily mean an increase in the complexity of their structure, because continuity and analyticity can lead to considerable simplifications. Notwithstanding, the introduction of these concepts requires adding new mathematical structures concerning analysis, algebra and geometry.
In this chapter we study the simplest examples of continuous groups: the rotational group in a plane SO (2) and the group
of one-dimensional translations T1 . Both depend on a single continuous parameter and are called one-dimensional continuous
groups. They are necessarily abelian, so their structure is quite simple. However, they are the starting point for the study of other multi-dimensional continuous groups, since the theory of the latter is formulated in terms of their one-parameter subgroups.
The general mathematical theory of continuous groups is called the theory of Lie groups. A formal introduction to Lie groups requires notions of topology and differential geometry. We shall limit ourselves to introducing the most important features
of Lie groups by studying the classical Lie groups of space-time in Physics, and then develop part of the general theory of Lie
groups.
In this chapter we analyze the one-parameter continuous groups SO (2) and T1 and introduce some concepts relevant for general Lie groups, such as generators, local properties around the identity element, global properties concerning the topological structure of the manifold, the invariant integration measure, etc.

14.1 The rotation group SO (2)


Consider a system symmetric under rotations in a plane around a fixed point O. We use two cartesian orthonormal vectors
e1 , e2 . Let R (φ) be an element of the group (a rotation operator) characterized by the single parameter φ in the range
0 ≤ φ < 2π. We have already seen in example 7.5, page 135, that the action of R (φ) on each element of the basis of R2 reads

R (φ) e1 = e1 cos φ + e2 sin φ ; R (φ) e2 = −e1 sin φ + e2 cos φ


 
R (φ) ei = ej R (φ)j i   ⇒   R (φ) = ( cos φ   − sin φ )
                                     ( sin φ     cos φ )          (14.1)

if x ∈ R2 and x1 , x2 are its coordinates with respect to {ei }, then x = ei xi transforms under the rotation R (φ) in the form

x → x′ ≡ R (φ) x = [R (φ) ei ] xi = ej R (φ)j i xi

so the transformation of the coordinates is obtained from

x′ = ej x′j ⇒ x′j = R (φ)j i xi          (14.2)
it is clear that rotations leave the norm of the vectors invariant, so that kxk2 = kx′ k2 or equivalently xi xi = x′i x′i . This condition along with Eq. (14.2) leads to

R (φ) R̃ (φ) = E   ∀φ          (14.3)
This is called the orthogonality condition, and matrices satisfying property (14.3) are called orthogonal matrices. It can be
checked that matrices in Eq. (14.1) fulfill this relation. In addition, the orthogonality condition leads to the fact that
det [R (φ) R̃ (φ)] = det E ⇒ det [R (φ)] det [R̃ (φ)] = 1 ⇒ {det [R (φ)]}2 = 1

det R (φ) = ±1


Eq. (14.1) shows that we should impose the additional restriction det R (φ) = 1. It can be shown that orthogonal matrices with det [R (φ)] = −1 correspond to a rotation combined with an inversion (a discrete symmetry). If we are interested in continuous transformations (rotations) only, the elements with det [R (φ)] = −1 must be discarded.
Matrices satisfying the condition det R (φ) = +1 are called special. Thus, the rotations in a plane are described by Special
Orthogonal Real 2 × 2 matrices, we label them by SO (2).

Definition 14.1 (SO(2) group): The set {A} of all real 2 × 2 matrices that satisfy
A Ã = E ; det A = 1

forms a group called the special orthogonal group in two dimensions, and symbolized as SO (2).

Theorem 14.1 There is a one-to-one correspondence between rotations in a plane and SO (2) matrices.

Proof: It is clear that any matrix of rotations (see Eq. 14.1), corresponds to a SO (2) matrix. To prove the reciprocal, we
write a general real 2 × 2 matrix and its inverse in the form

A = ( a   b )   ;   A−1 = [1/(ad − bc)] (  d   −b )
    ( c   d )                           ( −c    a )

the special condition gives det A = ad − bc = 1. Using it, and the orthogonality condition (14.3) in the form Ã = A−1 , we have

( a   c )   =   (  d   −b )     ⇒   a = d and b = −c
( b   d )       ( −c    a )

so that A becomes

A = ( a   −c )   ;   det A = a2 + c2 = 1
    ( c    a )

from which |a| ≤ 1, |c| ≤ 1 and a2 + c2 = 1. Hence, there exists φ ∈ [0, 2π) such that a = cos φ, c = sin φ. Therefore, the matrix
A acquires the structure given by Eq. (14.1). QED.
This correspondence is also valid for arbitrary finite-dimensional SO (n) matrices in the euclidean space of dimension n.
Let us examine the law of composition that defines the group structure. The product R (φ1 ) R (φ2 ) can be obtained either by
algebraic multiplication of matrices or on geometrical grounds to give

R (φ1 ) R (φ2 ) = R (φ1 + φ2 ) (14.4)

with the understanding that if φ1 + φ2 goes outside the range [0, 2π) we have

R (φ) = R (φ ± 2π) (14.5)

Theorem 14.2 (Two-dimensional rotation group): With the law of multiplication R (φ1 ) R (φ2 ) = R (φ1 + φ2 ) and the defi-
−1
nitions R (φ = 0) = E and R (φ) = R (−φ) = R (−φ ± 2π), the two dimensional rotations {R (φ)} form a group called the
R2 or SO (2) group.

The group elements are labelled by the single real continuous parameter φ in the domain [0, 2π). It can be put in a one-to-one
correspondence with all points on the unit circle in two dimensions, which defines the topology of the group parameter space
(group manifold). Although this is the most natural parameterization, it is not unique, since we could label the group elements with any monotonic function ξ (φ) of φ over the above domain. It is clear that the group structure and its representations cannot
be affected by the labelling scheme. This fact leads to important consequences that can be extended to general continuous
groups. We shall discuss them later.

14.1.1 The generator of SO (2)


We shall see that group multiplication and the requirement of continuity provide most of the structure of SO (2), as happens in the general theory of Lie groups. Consider an infinitesimal rotation R (dφ). Differentiability of R (φ) requires that R (dφ) differ from the identity E only by a quantity of first order in dφ; we then parameterize R (dφ) as

R (dφ) ≡ E − idφJ (14.6)

where the factor (−i) is put by convenience. We shall see that J is independent of φ. Now, consider the rotation R (φ + dφ),
from the multiplication law (14.4) and the parameterization (14.6) we have

R (φ + dφ) = R (φ) R (dφ) = R (φ) [E − idφJ]


R (φ + dφ) = R (φ) − idφ R (φ) J

but R (φ + dφ) can also be written as

R (φ + dφ) = R (φ) + [dR (φ) /dφ] dφ

and comparing both parameterizations we find

dR (φ) /dφ = −iR (φ) J ; R (0) ≡ E          (14.7)
the solution of Eq. (14.7) with the boundary condition R (0) ≡ E is unique, so we obtain

Theorem 14.3 (Generator of SO(2)): All two-dimensional rotations can be written in terms of the operator J in the form

R (φ) = e−iφJ ; φ ∈ [0, 2π) (14.8)

and J is called the generator of the group.

This theorem says that most of the structure of the group and its representations are determined by the single generator
J, which in turn was obtained from the continuity and derivability properties in a neighbourhood of the identity. It means
that the local behavior of the group elements around the identity provides a significant part of the group structure. Once
again, this feature is extended to general Lie groups. Note that the group multiplication rule (14.4) is reproduced from the
parameterization (14.8) of the elements of the group. We can then concentrate on the single operator J instead of the infinite
number of elements of the group. Once J is determined all the elements R (φ) ∈ SO (2) are generated by Eq. (14.8).
Nevertheless, Eq. (14.8) does not give all the properties of the rotation group. For instance, the global property given by
Eq. (14.5) cannot be deduced from it. Global properties depend mostly on the topological structure of the manifold,
and they play a role in determining the irreducible representations.
From the matrix representation (14.1) we can deduce an explicit form for the operator J in the basis e1 , e2 . To do this, we
write the matrix (14.1) for R (dφ), up to first order in dφ

R (dφ) = ( 1     −dφ )  =  ( 1   0 )  − idφ ( 0   −i )
         ( dφ      1 )     ( 0   1 )        ( i    0 )

and comparing with (14.6) we obtain

J = ( 0   −i )
    ( i    0 )          (14.9)
then J is a traceless hermitian matrix. It is easy to show that J 2 = E, J 3 = J, therefore

e−iφJ = Σn (−iφJ)n /n! = Σk (−iφJ)2k / (2k)! + Σk (−iφJ)2k+1 / (2k + 1)!
      = E Σk (−1)k φ2k / (2k)! − iJ Σk (−1)k φ2k+1 / (2k + 1)!
      = E cos φ − iJ sin φ

(the sums running over n, k = 0, 1, 2, . . .), that is

e−iφJ = ( cos φ   − sin φ )
        ( sin φ     cos φ )

which reproduces Eq. (14.1).
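As a quick numerical check (a minimal sketch of ours, assuming NumPy and SciPy are available), one can exponentiate the generator directly and compare with the rotation matrix of Eq. (14.1):

```python
import numpy as np
from scipy.linalg import expm

J = np.array([[0, -1j],
              [1j,  0]])          # generator of SO(2), Eq. (14.9)

for phi in (0.3, 1.0, 2.5):
    R = np.array([[np.cos(phi), -np.sin(phi)],
                  [np.sin(phi),  np.cos(phi)]])
    assert np.allclose(expm(-1j * phi * J), R)   # R(phi) = exp(-i phi J)
```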

14.1.2 Irreducible representations of SO (2)


Let {U (φ)} be a representation of SO (2) on a finite-dimensional vector space V . U (φ) is the operator on V associated with
R (φ). The law of multiplication to be preserved is given by Eqs. (14.4, 14.5)

U (φ1 ) U (φ2 ) = U (φ2 ) U (φ1 ) = U (φ1 + φ2 ) (14.10)


U (φ) = U (φ ± 2π) (14.11)

for an infinitesimal transformation we can define a transformation similar to Eq. (14.6). For simplicity, we denote the generator
with the same symbol J though it is understood that in this case J acts on the n−dimensional space V . We then write

U (dφ) = E − idφ J

from the same arguments of section 14.1.1 we obtain

U (φ) = e−iφJ (14.12)

as an operator equation on V . If we choose U (φ) to be unitary, then J must be hermitian. Now, since SO (2) is an abelian group, its irreducible representations must be one-dimensional. Hence, if we choose U (φ) to be unitary and belonging to an irreducible (and so one-dimensional) representation, we must have

U (φ) |αi = |αi e−iφα ∀φ ∈ [0, 2π) (14.13)

for any |αi in a minimal invariant subspace, where α is a phase that in general depends on the vector |αi. Combining (14.13) and (14.12) we see that

U (φ) |αi = e−iφJ |αi = |αi e−iφα ⇒ Σn (−iφJ)n /n! |αi = |αi Σn (−iφα)n /n!
⇒ Σn (−iφ)n J n |αi /n! = |αi Σn (−iφ)n αn /n!

(sums over n = 0, 1, 2, . . .)

since it must be valid for all φ in [0, 2π), we have


J |αi = |αi α (14.14)
therefore, α coincides with an eigenvalue of the hermitian operator J, and so it is a real number. It is straightforward to check
that Eq. (14.13) automatically satisfies the multiplication rule (14.10) for any α, but to satisfy the global constraint (14.11),
we should impose a restriction on the eigenvalue α. The global restriction implying periodicity of 2π leads to

e−iφα = e−iα(φ±2π) ⇒ e∓2iπα = 1

so that α must be an integer. We denote this integer by m, and we write Ecs. (14.13, 14.14) in the form

J |mi = |mi m ; U (m) (φ) |mi = |mi e−imφ (14.15)

the representations arising in this way are classified according with the value of m

1. When m = 0, we have R (φ) → U (0) (φ) = 1, this is the identity representation


2. When m = 1, R (φ) → U (1) (φ) = e−iφ . This is an isomorphism between SO (2) group elements and complex numbers on the unit circle. As R (φ) runs over the group space, U (1) (φ) runs over the unit circle once, in the clockwise sense.
3. When m = −1, R (φ) → U (−1) (φ) = eiφ . An isomorphism between the elements of SO (2) and the complex unit circle. As R (φ) runs over the group space, U (−1) (φ) runs over the unit circle once, in the counter-clockwise sense.
4. When m = ±2, R (φ) → U (∓2) (φ) = e∓i2φ . These are mappings of the group manifold to the unit complex circle,
covering the circle twice in clockwise and counter-clockwise senses respectively.
5. When m = ± |m| is an arbitrary integer, R (φ) → U (m) (φ) = e∓i|m|φ . The mapping from the group manifold to the unit circle covers the latter |m| times, in the clockwise and counter-clockwise senses respectively.

We summarize these results in a theorem

Theorem 14.4 (irreducible representations of SO (2)): All irreducible representations of SO (2) are one-dimensional. The
irreducible representations
 of SO (2) are given by J |mi = |mi m, where m is any integer and J is the generator of the group.
The elements U (m) (φ) associated with the (m) −representation are given by

U (m) (φ) = e−imφ (14.16)

and only the m = ±1 representations are faithful1 .

Note that the defining equation (14.1) for R (φ) is a two-dimensional representation and so it must be reducible2 . Eq.
(14.12) shows that the reduction can be performed by diagonalizing the generator J, that in two dimensions is given by Eq.
(14.9)

J = ( 0   −i )
    ( i    0 )          (14.17)
1 The existence of degenerate representations is related to the fact that SO (2) is not a simple group, as can be seen by combining example 6.44, page 121, with theorem 7.2 and corollary 7.3, page 134.


2 Observe that the matrices R (φ) act on the 2-dimensional real vector space R2 , while the irreducible representations U (m) (φ) act on the one-

dimensional complex vector space C1 . Though both R2 and C1 are represented by a plane, they are totally different as vector spaces.

the eigenvalue equation for J yields

J |e± i = j± |e± i , j± = ±1 , e± = (∓e1 − ie2 ) /√2

so that the new basis provides the two invariant one-dimensional subspaces spanned by e+ and e−

J |e± i = ± |e± i ; R (φ) e± = e± e∓iφ
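A short numerical sketch of this reduction (ours; assuming NumPy, with the eigenvectors normalized as quoted above):

```python
import numpy as np

J = np.array([[0, -1j], [1j, 0]])
e_plus  = np.array([-1, -1j]) / np.sqrt(2)   # e_+ = (-e1 - i e2)/sqrt(2)
e_minus = np.array([ 1, -1j]) / np.sqrt(2)   # e_- = (+e1 - i e2)/sqrt(2)

assert np.allclose(J @ e_plus,  +1 * e_plus)     # J e_+ = +e_+
assert np.allclose(J @ e_minus, -1 * e_minus)    # J e_- = -e_-

phi = 0.7
R = np.array([[np.cos(phi), -np.sin(phi)], [np.sin(phi), np.cos(phi)]])
assert np.allclose(R @ e_plus,  np.exp(-1j * phi) * e_plus)    # R(phi) e_+ = e_+ e^{-i phi}
assert np.allclose(R @ e_minus, np.exp(+1j * phi) * e_minus)   # R(phi) e_- = e_- e^{+i phi}
```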

14.1.3 Invariant integration measure, orthonormality and completeness relations


We want to formulate the orthonormality and completeness relations for the functions U (m) (φ) = e−imφ in analogy with
theorems 7.8, 7.11 for finite groups. Since the representations are one-dimensional, orthogonality and completeness of matrices
coincide with orthogonality and completeness of characters, where the latter are given by theorems 7.46, 7.47. In order to establish these relations for our continuous group, we must use the continuous parameter φ as a label for the elements of {U (m) (φ)} and m as the label of the representation, in the relevant formulas. Moreover, since φ is a continuous label, sums
over group elements must be replaced by integrals, and finite sums over group representations must be replaced by series.
In the process of integration, the integration measure must be well-defined (in other words the “differential volumes” should
be constructed appropriately). In particular, remember that φ is not the only parameter that can be used to label the elements
of the group; any monotonic function ξ (φ) of φ in 0 ≤ φ < 2π can play this role. Nevertheless, for an arbitrary function f of the group elements we see that

∫ dξ f [R (ξ)] = ∫ dφ ξ′ (φ) f [R (φ)] 6= ∫ dφ f [R (φ)]

from which “integration” of f over the group manifold is not well defined a priori. Our task is then to find a natural unambiguous
definition of integral of f over the group manifold that is well-defined in the sense that the integration can be carried out with
any function ξ (φ) monotonic in 0 ≤ φ < 2π, obtaining the same results.
A survey of the theoretical structure of the representation theory for finite groups tells us that the rearrangement lemma
is crucial for the proof of most of the important theorems. Thus, if we want these theorems to be appropriately extended to
the continuous groups, it is necessary to preserve the rearrangement lemma in the integration process. Therefore, we should
find an integration measure such that
∫ dτR f [R] = ∫ dτR f [S −1 R] = ∫ dτSR f [R]

where f [R] is any function of the group elements, S is any element of the group, and dτR is the “differential of volume” or
“measure” associated with the R element of the group. If the group elements are labelled by the parameter ξ, then
dτR = (dτR /dξ) dξ = ρR (ξ) dξ

where ρR (ξ) is the “density” or “weight” function defined from the measure dτR and the parameter ξ.

Definition 14.2 (invariant integration measure): A parameterization R (ξ) of the elements of the group space with an asso-
ciated weight function ρR (ξ) is said to provide an “invariant integration measure” if the equation
∫ dτR f [R] = ∫ dτR f [S −1 R] = ∫ dτSR f [R]          (14.18)

holds for any function ξ (φ) monotonic in 0 ≤ φ < 2π, for all elements S of the group, and for all (integrable) functions f [R]
of the group elements.

It is clear that validity of Eq. (14.18) leads to


dτR = dτSR ∀S ∈ G (14.19)

which in turn leads to a condition on the density or weight functions


ρR (ξ) = dτR /dξR ⇒ ρR (ξ) /ρSR (ξ) = (dτR /dξR ) / (dτSR /dξSR ) = (dτR /dτSR ) (dξSR /dξR ) ⇒

ρR (ξ) /ρSR (ξ) = dξSR /dξR          (14.20)
where we have used (14.19) in the last step. It is clear that this condition is satisfied if we define
ρR (ξ) = dξE /dξR = dξE /dξER          (14.21)

where ξE is the group parameter around the identity element E and3 ξR = ξER is the corresponding parameter around R.
The fact that (14.21) leads to (14.20), can be seen from

ρR (ξ) = dξE /dξR = (dξE /dξSR ) (dξSR /dξR ) = ρSR (ξ) (dξSR /dξR ) ⇒

ρR (ξ) = ρSR (ξ) dξSR /dξR

from which we see that Eq. (14.21) leads to (14.20) consistently and it is a sufficient condition.
In evaluating the RHS of (14.21) R should be considered as fixed4 , the dependence of ξR = ξER on ξE is determined by
the group multiplication rule.
The determination of the measure is simpler when ξSR is linear in ξR . This is the case when ξ = φ i.e. the parameter is
the rotation angle. In that case, dφR = dφE . Group multiplication rule (14.10) leads to5
 
φER = φE + φR = φR ; ρR (φ) = (dφE /dφER )R = 1          (14.22)

We emphasize that in evaluating dφE /dφER we consider R as fixed (hence the subscript R after the parenthesis). From Eq.
(14.22), we obtain the following result

Theorem 14.5 (Invariant integration measure of SO (2)): The rotation angle φ and the volume measure dτR = dφ, provide
the proper invariant integration measure over the SO (2) group space.

If ξ is a general parameterization of the group element, then

dτR = ρR (ξ) dξ = ρR (φ) dφ = dφ

therefore, we must have



ρR (ξ) = dφ/dξ

and an invariant integration of a function f of the elements of the group, in terms of the ξ parameter, must be carried out in the form

∫ dτR f [R] = ∫ dξ ρR (ξ) f [R (ξ)] = ∫ dξ (dφ/dξ) f [R (ξ)]

Although the discussion above may seem rather involved for such simple results, it illustrates the line of reasoning used to obtain the invariant integration measure in general continuous groups. Once the invariant integration measure is properly
defined, orthonormality and completeness relations can be easily written.
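A minimal numerical sketch of the invariance property (ours; the function f and the relabelling ξ (φ) are arbitrary choices of the example): integrating a periodic function of the group elements against dφ is unchanged under left multiplication by a fixed rotation S = R (φ0), and a reparameterized label ξ (φ) reproduces the same result only when the weight ρR (ξ) = dφ/dξ is included.

```python
import numpy as np

phi = np.linspace(0, 2 * np.pi, 20001)
f = lambda ph: np.cos(3 * ph) ** 2 + 0.5 * np.cos(ph)     # arbitrary periodic function on the group
phi0 = 1.234                                              # fixed element S = R(phi0)

I_R  = np.trapz(f(phi), phi)                              # integral of f[R] with measure dphi
I_SR = np.trapz(f((phi + phi0) % (2 * np.pi)), phi)       # integral of f[S^{-1}R]: shifted argument
assert abs(I_R - I_SR) < 1e-6                             # dphi is invariant under the shift

xi = phi + 0.3 * np.sin(phi)                              # a monotonic relabelling xi(phi)
I_naive    = np.trapz(f(phi), xi)                         # d(xi) without the weight: not invariant
I_weighted = np.trapz(f(phi) * np.gradient(phi, xi), xi)  # with rho_R = dphi/dxi: recovers I_R
assert abs(I_naive - I_R) > 1e-2
assert abs(I_weighted - I_R) < 1e-4
```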

Theorem 14.6 The SO (2) representation functions U (m) (φ) defined in theorem 14.4 Eq. (14.16) satisfy the following or-
thonormality and completeness relations
(1/2π) ∫0^{2π} dφ U(n)† (φ) U (m) (φ) = δnm   (orthonormality)          (14.23)

Σn U (n) (φ) U(n)† (φ′ ) = δ (φ − φ′ )   (completeness)          (14.24)

These relations are generalizations to continuous groups of theorems 7.46, 7.47 valid for finite groups. This generalization
is done by a replacement of a finite sum over group elements by the invariant integration over the continuous group parameter,
and by the replacement of a finite sum over the irreducible inequivalent representations, by a series of irreducible inequivalent
representations.
Note further, that theorem 14.6 with U (n) (φ) given by Eq. (14.16) is equivalent to the classical Fourier theorem for periodic
functions, where the discrete label n and the continuous parameter φ are the “conjugate variables”.
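A brief numerical illustration of the orthonormality relation (ours, using a simple Riemann sum for the invariant integral over φ):

```python
import numpy as np

phi = np.linspace(0, 2 * np.pi, 4096, endpoint=False)
dphi = phi[1] - phi[0]

def U(m, ph):
    return np.exp(-1j * m * ph)        # U^(m)(phi) of Eq. (14.16)

# orthonormality: (1/2pi) int dphi U^(n)(phi)* U^(m)(phi) = delta_{nm}
for n in range(-3, 4):
    for m in range(-3, 4):
        overlap = np.sum(np.conj(U(n, phi)) * U(m, phi)) * dphi / (2 * np.pi)
        assert np.isclose(overlap, 1.0 if n == m else 0.0, atol=1e-12)
```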

3 Since ξ = ξ (φ) is a function of φ, it is clear that ξE = ξ (φ = 0).


4 In that sense the variations of ξR = ξER could be considered to arise from the variations around E with R fixed.
5 It is usual to define φ = 0.
E

14.1.4 Multi-valued representations of SO(2)


There is a new feature of continuous groups: the possibility of having multi-valued representations. To understand where they
come from, note that a representation must reproduce the group multiplication rule (14.10); but it is not compulsory that such
a representation reproduce the global periodic property (14.11). Let us start with a simple example considering the mapping

R (φ) → U (1/2) (φ) = e−iφ/2

it does not define a unique representation because

U (1/2) (φ + 2π) = e−iπ−iφ/2 = −U (1/2) (φ) (14.25)

while we expect on Physical grounds that R (φ + 2π) = R (φ). However, since U (1/2) (φ + 4π) = U (1/2) (φ), then Eq. (14.25)
defines a one-to-two mapping where each R (φ) is assigned to two complex numbers ∓e−iφ/2 differing by a factor of (−1). This
is a two-valued representation in the sense that the group multiplication law of SO (2) is preserved if either of the two numbers
corresponding to R (φ) is chosen.
A natural generalization is the following mapping

R (φ) → U (n/m) (φ) = e−inφ/m (14.26)

where n and m are integers with no common factors. For a given pair (n, m) this mapping defines a “m−valued representation”
of SO (2).
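The two-valuedness of U (1/2) can be seen directly with a small numerical illustration (ours):

```python
import numpy as np

U_half = lambda phi: np.exp(-1j * phi / 2)   # the mapping R(phi) -> e^{-i phi/2}

phi1, phi2 = 0.8, 2.1
# the multiplication rule is preserved...
assert np.isclose(U_half(phi1) * U_half(phi2), U_half(phi1 + phi2))
# ...but a 2*pi rotation flips the sign, and only a 4*pi rotation returns to the start
assert np.isclose(U_half(phi1 + 2 * np.pi), -U_half(phi1))
assert np.isclose(U_half(phi1 + 4 * np.pi),  U_half(phi1))
```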
Some questions arise naturally from the discussion above: We ask first whether all continuous groups have multi-valued
representations. Further, if multi-valued representations exist, we wonder whether they are realized in Physical systems.
We shall give only a qualitative answer to the first question. The existence of multi-valued representations is related to connectedness properties of the group manifold, which is a global topological property. In the case of SO (2), its group manifold (or group parameter space) is “multiply connected”: this means that there exist closed “paths” on the unit circle which wind around it m times for all integers m, and which cannot be continuously deformed into each other. The “multiple
connectedness” of the group manifold of SO (2) leads to the existence of m-valued representations for any integer m. Therefore,
we can establish the existence and nature of multi-valued representations from an intrinsic property of the group manifold.
As for the second question, as far as we know only single-valued representations are relevant in classical Physics while
single-valued and double-valued representations are of interest in quantum mechanics (but no others). The double-valued
representations in quantum mechanics arise from the connectedness of the group manifold of symmetries associated with the
Physical 3-dimensional and 4-dimensional spaces. We shall discuss these issues when the full rotation group and the Lorentz
group are discussed.

14.1.5 Conjugate basis vectors for SO (2)


Consider a particle state localized at a position given by the polar coordinates (r, φ) on the 2-dimensional space. A rotation keeps r unaltered, so we shall simplify the notation |r, φi for the localized vector to |φi. The operator U (φ) associated with R (φ), acting on |φ0 i, gives

U (φ) |φ0 i = |φ0 + φi ⇒ (14.27)


|φi = U (φ) |Oi ; 0 ≤ φ < 2π (14.28)

where |Oi represents a “reference state” aligned with a chosen x−axis. The set of vectors {|φi} describing a localized particle constitutes a natural basis in our representation space V . However, another natural basis for this space is the set {|mi}
consisting of eigenstates of the generator J defined in Eqs. (14.15).
We shall look for the relations between both bases {|φi} and {|mi}. A first clarification is that the set {|φi} is indeed a
hyperbasis, since it has the cardinality of the continuum. In contrast, the set {|mi} is a real denumerable basis. Thus the
transfer matrix connecting one basis to the other is not strictly a “square” matrix but a “rectangular” one.
In order to connect both bases, we expand a given |φi in the basis {|mi} of eigenstates of J

|φi = Σm |mi hm| φi = Σm |mi hm| U (φ) |Oi = Σm |mi [hO| U † (φ) |mi]∗
    = Σm |mi [hO| eiφJ |mi]∗ = Σm |mi [eiφm hO| mi]∗

|φi = Σm |mi hm| Oi e−imφ          (14.29)

(all sums running over m = −∞, . . . , ∞).

Since hm| φi is the transfer matrix between two orthonormal bases, it must be a unitary matrix. Further since
hm| φi = hm| Oie−imφ (14.30)
we see that hm| Oi is also unitary, so that hm| Oi = eiαm . On the other hand, each vector |mi spans a one-dimensional irreducible invariant representation of SO (2). Therefore two vectors |mi and |ni with m 6= n cannot be connected by a rotation. Consequently, we can define from {|mi} another orthonormal basis {|m′ i ≡ |mi eiαm } that also consists of eigenvectors of J
and that generates the same invariant subspaces. In this new basis, we see that hm′ | Oi = e−iαm hm| Oi = 1. Using this new
basis Eqs. (14.29, 14.30) are written as

|φi = Σm′ |m′ i hm′ | Oi e−im′φ = Σm′ |m′ i e−im′φ
hm′ | φi = hm′ | Oi e−im′φ = e−im′φ
We can omit the prime notation from now on to obtain

|φi = Σm |mi e−imφ          (14.31)

hm| φi = e−imφ          (14.32)


we can invert Eq. (14.31) multiplying by einφ /2π and integrating over φ, to obtain

|ni = ∫0^{2π} (dφ/2π) |φi einφ          (14.33)
Comparing Eq. (14.16) with Eq. (14.32) we see that, by using the convention hm| Oi = 1 for all m, the “transfer matrix
elements” hm| φi between the two bases {|mi} and {|φi}, are precisely the group representation functions.
Of course an arbitrary vector |ψi can be expanded in either of these bases
|ψi = Σm |mi hm| ψi = ∫0^{2π} (dφ/2π) |φi hφ| ψi
|ψi = Σm |mi ψm = ∫0^{2π} (dφ/2π) |φi ψ (φ) ; ψm ≡ hm| ψi , ψ (φ) ≡ hφ |ψi

where ψm and ψ (φ) are the coordinates of the vector |ψi in the bases {|mi} and {|φi} respectively. These components are
related by

ψ (φ) = hφ |ψi = Σm hφ |mi hm |ψi = Σm eimφ ψm
ψm = hm| ψi = ∫0^{2π} (dφ/2π) hm| φi hφ| ψi = ∫0^{2π} (dφ/2π) ψ (φ) e−imφ

therefore

ψ (φ) = Σm eimφ ψm ; ψm = ∫0^{2π} (dφ/2π) ψ (φ) e−imφ
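A brief numerical illustration of this Fourier-series pair (ours; the sample function ψ (φ) and the truncation |m| ≤ 30 are choices of the example): the coefficients ψm computed from the second relation reconstruct ψ (φ) through the first one.

```python
import numpy as np

phi = np.linspace(0, 2 * np.pi, 2048, endpoint=False)
dphi = phi[1] - phi[0]
psi = np.exp(np.cos(phi)) * np.sin(2 * phi)          # a smooth 2*pi-periodic function

m_values = np.arange(-30, 31)
psi_m = np.array([np.sum(psi * np.exp(-1j * m * phi)) * dphi / (2 * np.pi) for m in m_values])

psi_rec = sum(psi_m[k] * np.exp(1j * m_values[k] * phi) for k in range(len(m_values)))
assert np.allclose(psi_rec.real, psi, atol=1e-10)    # the truncated series already converges
```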
So the relations between the coordinates {ψ (φ)} , {ψm } in the conjugate basis {|φi} , {|mi} are Fourier transforms of each
other. Finally, we shall examine the action of the generator J on the elements of the “localized” basis {|φi}, for which we use
Eq. (14.31)

J |φi = J Σm |mi e−imφ = Σm J |mi e−imφ = Σm |mi m e−imφ = i (d/dφ) Σm |mi e−imφ

J |φi = i (d/dφ) |φi          (14.34)

now, for an arbitrary state |ψi we have

hφ| J |ψi = hJφ |ψi = −i (d/dφ) hφ |ψi = (1/i) (d/dφ) hφ |ψi

hφ| J |ψi = (1/i) (d/dφ) ψ (φ)
in quantum mechanics, J corresponds to the angular momentum operator measured in units of ~. However, this derivation is
purely group-theoretical and based on geometrical grounds, so it is equally valid in classical mechanics.

14.2 Continuous translational group in one dimension


Rotations in a two dimensional plane by an angle φ can be interpreted as translations on the unit circle by the arc length φ.
We shall study now another physically important one-parameter continuous group: The group of continuous translations in
one-dimension denoted by T1 .
We label the coordinate axis as x. An arbitrary element T (x) of the group T1 corresponds to a translation by the distance
x. The vectors on which translations act on, are denoted by |xi. Although Dirac notation is used, we refer to vectors in either
classical or quantum mechanics. Let us assume a physical system with no spatial extension i.e. a physical system “localized”
in the coordinate x0 . We describe the state of this (classical or quantum) system as |x0 i, the action of T (x) on |x0 i yields

T (x) |x0 i ≡ |x + x0 i

then we have

T (x1 ) T (x2 ) |x0 i = T (x1 ) |x2 + x0 i = |x1 + x2 + x0 i = T (x1 + x2 ) |x0 i


T (0) |x0 i ≡ |0 + x0 i = |x0 i ; T (x) T (−x) = T (x + (−x)) = T (0)

so we have the following properties

T (x1 ) T (x2 ) = T (x2 ) T (x1 ) = T (x1 + x2 ) ; ∀x ∈ (−∞, ∞) (14.35)


−1
T (0) = E , T (x) = T (−x) ; ∀x ∈ (−∞, ∞) (14.36)

these are the properties required for T1 ≡ {T (x) ; x ∈ (−∞, ∞)} to form a group. Note that in this case, the manifold is
unbounded so that a global property of the type (14.11) for the rotation group is not necessary for the translation group. This
is an important difference between both groups.
For an infinitesimal displacement denoted by dx, we can parameterize

T (dx) ≡ E − idx P (14.37)

which defines the (displacement independent) generator of translation P . As in the case of rotations, we express T (x + dx) in
two different ways

T (x + dx) = T (x) + [dT (x) /dx] dx          (14.38)
T (x + dx) = T (dx) T (x)          (14.39)

substituting (14.37) in (14.39) we have

T (x + dx) = (E − idx P ) T (x) = T (x) − iP T (x) dx

and comparing with (14.38) we have

dT (x) /dx = −iP T (x)          (14.40)

T (x) = e−iP x          (14.41)

in which we have considered the boundary condition T (0) = E. The procedure is totally analogous to the case of rotations in a plane, since the rule of multiplication is the same and the global property (14.11) was not used to obtain Eqs. (14.7, 14.8). The only difference is that no constraint needs to be imposed on the multiplication rule in the translation group.
Once again, the irreducible representations must be one-dimensional because the group is abelian. If T (x) is to be unitary, P
must be hermitian, and the real eigenvalues of P will be denoted by p. Then we can form representations U (p) (x) in which
T (x) → U (p) (x) and we find
P |pi = |pi p ; U (p) (x) |pi = |pi e−ipx (14.42)
it is easy to check that all group properties (14.35, 14.36) are satisfied by this representation function for any given real value
of p. Thus p is unrestricted as x is.
A general comparison between the rotation and translation groups shows some similarities and differences. The functions U (m) (φ) and U (p) (x) both have similar exponential forms, owing to the similarity of the multiplication rules. For U (m) (φ) the group manifold (label of the group elements) is continuous and bounded, while the label on irreducible representations is discrete and unbounded; this fact is related to the boundedness of the manifold. For U (p) (x) the group manifold is continuous and unbounded, and the label on irreducible representations is also continuous and unbounded; this fact is related to the unboundedness of the manifold.

As before, an appropriate invariant integration measure over the group elements must be applied to maintain the rearrange-
ment lemma. It is easy to show that the usual cartesian infinitesimal displacement dx provides the proper measure. However,
since the range of integration is infinite, not all integrals are strictly convergent in the classical sense. We shall not develop
this part of the theory in detail. We shall simply establish that the previous results are extended to generalized functions with
a generalized concept of orthonormality to obtain
(1/N) ∫_{−∞}^{∞} dx U^(p)(x)† U^(p′)(x) = δ (p − p′)   (orthonormality)

(1/N) ∫_{−∞}^{∞} dp U^(p)(x) U^(p)(x′)† = δ (x − x′)   (completeness)

with N a yet unspecified normalization constant. Since U^(p)(x) = e^{−ipx}, these equations represent the statement of the Fourier theorem for arbitrary (non-periodic) generalized functions. This correspondence tells us that N = 2π.
These equations of orthonormality and completeness also show the conjugate role of the labels (x, p) typical of the Fourier
analysis.

14.2.1 Conjugate basis vectors for T1


The discussion on the conjugate vectors given for SO (2) in Sec. 14.1.5 can be repeated for T1. In this case we are interested in connecting the basis of “localized states” {|x⟩} with the basis of “translationally covariant” states {|p⟩} (eigenvectors of the generator P) given by Eqs. (14.42). In this case both are hyperbases, and the transfer matrix is a continuous “square” matrix.
The bases {|xi} and {|pi} are related by
|x⟩ = ∫_{−∞}^{∞} |p⟩ e^{−ipx} dp/(2π) ; |p⟩ = ∫_{−∞}^{∞} |x⟩ e^{ipx} dx

where the normalization is chosen such that

hx′ |xi = δ (x − x′ ) ; hp′ |pi = δ (p − p′ )

once again, the transfer matrix elements are the group representation functions Eqs. (14.42)

hp |xi = e−ipx

if we expand an arbitrary vector |ψ⟩ in both bases, we have

|ψ⟩ = ∫ |x⟩ ψ (x) dx = ∫ |p⟩ ψ̄ (p) dp/(2π) ; ψ (x) = ⟨x|ψ⟩ , ψ̄ (p) = ⟨p|ψ⟩

and the relations between the coordinates in the two bases are

ψ (x) = ∫ ψ̄ (p) e^{ipx} dp/(2π) ; ψ̄ (p) = ∫ ψ (x) e^{−ipx} dx

as in SO (2), coordinates associated with conjugate bases are Fourier transforms of each other. Furthermore

⟨x| P |ψ⟩ = ⟨Px|ψ⟩ = −i (d/dx) ψ (x)
from which the generator P can be identified with the linear momentum operator in quantum mechanical systems. We emphasize, however, that these results were derived from purely group-theoretical techniques based on geometrical arguments. Therefore, they are valid in either classical or quantum mechanics.

14.3 General comments


We have said that compact Lie groups possess most of the properties of finite groups. However, some properties of the representations are obtained from the structure of their manifolds. As an example, corollary 7.13 on page 149 says that for a finite group the number of irreducible inequivalent representations equals the number of conjugacy classes. An extrapolation to compact Lie groups would say that the number of irreducible inequivalent representations and the number of conjugacy classes have the same cardinality. This is true in the case of T1 (which is not compact), because the number of conjugacy classes (number of elements) and the number of inequivalent irreducible representations both lie in the continuum. Nevertheless, in the case of SO (2) (which is compact), the number of conjugacy classes (number of elements) lies in the continuum while the number of irreducible inequivalent representations is infinite but countable. To understand the difference,

observe that the discrete nature of the irreducible representations of SO (2) comes from the global property (14.11) which is a
property of the manifold (so its origin is not group-theoretical in nature), as can be seen in Sec. 14.1.2. The extrapolation of
corollary 7.13 to the case of T1 works precisely because the global property (14.11) is absent in T1 . This is a good example to
show that properties of the manifold are important in the general structure of representations for Lie groups.
On the other hand, it is important to observe that T1 can describe a translation of any unbounded generalized variable, while SO (2) describes the translation of a generalized bounded coordinate.
Chapter 15

Rotations in three-dimensional space: The group SO (3)

The infinite groups we have considered so far are abelian. This leads to important simplifications since their irreducible
representations are one-dimensional and each element forms its own class. The SO (3) group of rotations in three-dimensional
euclidean space is perhaps the most important of all non-abelian Lie groups. It will illustrate some additional properties of
the Lie groups arising from the non-abelianity. In addition to its importance in the description of three dimensional rotations,
all simple and semi-simple Lie groups of interest in Physics contain SO (3) or its local equivalent SU (2) as a subgroup.

Definition 15.1 (The SO (3) group): The SO (3) group consists of all continuous linear transformations on three dimensional
Euclidean space which leave the length of the coordinate vectors invariant.

Consider a cartesian coordinate frame with orthonormal vectors ei with i = 1, 2, 3. Under a rotation we obtain

R : ei → e′i = ej Rj i (15.1)

where Rj i are the elements of a 3 × 3 matrix of rotation. Let x be an arbitrary euclidean vector such that x = ei xi then
x → x′ under rotation and
x′i = Ri j xj (15.2)
the requirement of invariance of the length |x| = |x′ | implies xi xi = x′i x′i which combined with Eq. (15.2) leads to

Ri k Rj k = δ ij (15.3)

with sum over repeated indices. In matrix form it is given by


R R̃ = R̃ R = E ⇔ R̃ = R^{−1} (15.4)

where R̃ is the transpose of R. Real matrices satisfying this condition have determinant ±1. Since all physical rotations are reached from the identity by a continuous transformation, and since the identity has determinant +1, it follows that all rotations must have determinant +1. We should then impose the additional condition

det R = +1 (15.5)

matrices that satisfy the orthogonal condition (15.4) but with det R = −1, describe a combination of a rotation with a discrete
spatial inversion of the form Is = −E. They are called improper orthogonal transformations. In contrast, rotations are
proper or special orthogonal transformations (det R = +1).
For future purposes, we mention that both the “orthogonal” and the “special” conditions on the rotational matrix R can
be expressed as statements on invariant tensors. Rewriting Rj k = Rj l δ kl in the orthogonal condition (15.3) we can rewrite
such a condition as
Ri k Rj l δ kl = δ ij (15.6)
which expresses the invariance of the second rank tensor δ kl under rotations. Similarly, the special condition (15.5) can be
written as

Ri l Rj m Rk n εlmn = εijk (15.7)


where ε^{ijk} is the totally anti-symmetric third-rank unit tensor. When (i, j, k) = (1, 2, 3), or any even permutation of (1, 2, 3),
the left-hand side of (15.7) is just the determinant, and this equation coincides with (15.5). For odd permutations of (1, 2, 3)


we obtain Eq. (15.5) multiplied by (−1). Finally, if any index is repeated both quantities in (15.7) are null by virtue of the
anti-symmetrical nature of both sides of the equation. This identity says that εlmn is an invariant tensor under rotations.
If we take into account that det R̃ = 1, then Eq. (15.7) remains valid if we replace R → R̃. Therefore, an equivalent form of such an equation is
Rl i Rm j Rn k εlmn = εijk (15.8)
If we perform a rotation R1 followed by another rotation R2 we can express the effect on the basis vectors as follows

R2 (R1 ei ) = R2 ej R1 j i = ek R2 k j R1 j i = ek (R2 R1 )k i ≡ ek R3 k i
R3 ≡ R2 R1 (15.9)

it is easy to check that R3 is also an SO (3) matrix (it can also be visualized on geometrical grounds)

R̃3 = (R2 R1)~ = R̃1 R̃2 = R1^{−1} R2^{−1} = (R2 R1)^{−1} = R3^{−1}
det R3 = det [R2 R1] = det R2 det R1 = (+1) (+1) = +1

It is obvious that the identity matrix is an SO (3) matrix. Further, the inverse R−1 of any given R ∈ SO (3) matrix exists
because the determinant of R is non-zero and
(R^{−1})~ R^{−1} = (R̃)~ R̃ = R R̃ = E ; R^{−1} (R^{−1})~ = R̃ (R̃)~ = R̃ R = E
det (R^{−1}) = 1/(det R) = +1
so that R−1 is also an SO (3) matrix. Finally, associativity is a property of general linear transformations, or equivalently of
matrix multiplication. Thus, the set SO (3) of rotation matrices is a group.
The elements of SO (3) are determined by three continuous parameters. We shall describe the two most common conven-
tions. Another convention coming from SU (2) will be discussed later.

15.1 The Euler angles parameterization of a three-dimensional rotation


15.1.1 The Euler angles in the X-convention
There are many choices for the three independent parameters that determine a three-dimensional rotation. The most common
choice is given by the Euler angles that we describe below: We want to go from a given set X1 X2 X3 of orthonormal coordinate
axes to another set X1′ X2′ X3′ of orthonormal coordinate axes, where both systems of coordinates have a common origin. It is
clear that if we determine the directions of the X1′ X2′ axes, the third axis is uniquely determined as long as the transformation
is continuous (since in that case the chirality of the coordinate system is preserved). To do it, we require to determine the plane
in which the set X1′ X2′ lies and some angle that orientates those axes in such a plane. Fig. 15.1 shows the plane generated by
X1 X2 and also shows the (shadowed) plane generated by X1′ X2′ , those planes form a dihedral angle, and intersect each other
in the nodal line as indicated in Fig. 15.1. In order to pass from the system X1 X2 X3 to the system X1′ X2′ X3′, we must take the X1′ X2′ axes to their final positions. This can be realized schematically in three steps: (a) rotate the system so that the new X1 axis lies in the shadowed plane (i.e. the plane generated by X1′ X2′); (b) rotate the system so that the new X2 axis enters the shadowed plane as well; once the new X1 X2 axes already lie in the X1′ X2′ plane, the last step is (c) a rotation of the axes within the shadowed plane to reach their final orientation.
Let us see the process in detail: (a) First of all, we pass from the X1 X2 X3 system to the X1^(a) X2^(a) X3^(a) system by means of a rotation around the X3 axis through an angle φ, such that the new X1^(a) axis lies along the nodal line; this guarantees that X1^(a) lies within the shadowed plane, and in this case it is clear that X3^(a) = X3. (b) In the next step, we pass from the X1^(a) X2^(a) X3^(a) system to the X1^(b) X2^(b) X3^(b) system in such a way that the new X2^(b) axis stays within the shadowed plane but without taking the axis already introduced out of that plane. Therefore, the rotation must be made around the X1^(a) axis, for this axis not to be taken out of the shadowed plane; we then make a rotation through the angle θ around the X1^(a) axis, where θ is the appropriate angle to introduce the X2^(b) axis within the shadowed plane. In this step it is clear that X1^(b) = X1^(a), and with this procedure we have achieved that the new X1^(b) and X2^(b) axes lie in the plane generated by X1′ X2′. The only missing step is then a rotation within such a plane that carries X1^(b) X2^(b) toward X1′ X2′, which is realized with (c) a rotation around the X3^(b) axis through an angle ψ. In this step X3′ = X3^(b).
Figure 15.1: Set of rotations to go from the axes X1 X2 X3 to the axes X1′ X2′ X3′.

From the discussion above, an arbitrary rotation from the X1 X2 X3 coordinate system to the X1′ X2′ X3′ coordinate system can be realized in the form displayed in Fig. 15.1: we start with a counter-clockwise rotation through the angle φ around X3; the resultant coordinate system is denoted by X1^(a), X2^(a), X3^(a) (of course, X3^(a) coincides with X3). In the second step, we make a counter-clockwise rotation of the new system with respect to X1^(a) by the angle θ as shown in Fig. 15.1, and the resultant system is denoted by X1^(b), X2^(b), X3^(b). The X1^(b) axis (which coincides with X1^(a)) is formed by the intersection between the planes X1 X2 and X1^(b) X2^(b), and it is known as the nodal line. Finally, we make a counter-clockwise rotation ψ around X3^(b) to arrive at the final coordinate system X1′ X2′ X3′. The elements of a complete transformation can be obtained by composition of

the three operations described above. The initial rotation (around X3 ) transforms a vector (or its components) in the form1
 
cos φ − sin φ 0
x(a) = R3 (φ) x ; R3 (φ) =  sin φ cos φ 0  (15.10)
0 0 1

where R3 (φ) is a matrix that describes a change of basis of the form X1 X2 X3 → X1^(a) X2^(a) X3^(a). The notation refers to the fact that it is a rotation around X3 through an angle φ.
The second rotation, around X1^(a), describes a change of basis from the basis X1^(a) X2^(a) X3^(a) toward the basis X1^(b) X2^(b) X3^(b) and
yields
 
1 0 0
x(b) = RN (θ) x(a) ; RN (θ) =  0 cos θ − sin θ  (15.11)
0 sin θ cos θ
the notation RN (θ) means a rotation around the nodal line (N), by an angle θ.
Finally, we describe the change of basis X1^(b) X2^(b) X3^(b) → X1′ X2′ X3′ with the rotation around X3^(b):
 
cos ψ − sin ψ 0
x′ = R3′ (ψ) x(b) ; R3′ (ψ) =  sin ψ cos ψ 0  (15.12)
0 0 1

R3′ (ψ) means that this rotation is around X3′ = X3^(b) by an angle ψ. Therefore, we go from x to x′ by means of the
transformation
x′ = R (φ, θ, ψ) x ; R (φ, θ, ψ) ≡ R3′ (ψ) RN (θ) R3 (φ) (15.13)
calculating the product of the three matrices we obtain the most general matrix of rotation in terms of the Euler angles. For
reasons to be understood later, we shall rewrite the set (φ, θ, ψ) of Euler angles in the form (φx , θx , ψx )
 
cos ψx cos φx − cos θx sin φx sin ψx − cos ψx sin φx − cos θx cos φx sin ψx sin ψx sin θx
R (φx , θx , ψx ) =  sin ψx cos φx + cos θx sin φx cos ψx − sin ψx sin φx + cos θx cos φx cos ψx − cos ψx sin θx  (15.14)
sin θx sin φx sin θx cos φx cos θx

It can be verified that the inverse of this matrix coincides with its transpose, and that its determinant is +1. Therefore, this
is a proper orthogonal (or special orthogonal) matrix.
 
R^{−1} = R̃ =
( cos ψx cos φx − cos θx sin φx sin ψx , sin ψx cos φx + cos θx sin φx cos ψx , sin θx sin φx )
( − cos ψx sin φx − cos θx cos φx sin ψx , − sin ψx sin φx + cos θx cos φx cos ψx , sin θx cos φx )
( sin ψx sin θx , − cos ψx sin θx , cos θx )   (15.15)

It is clear that the range of the angles is given by

0 ≤ φ < 2π, 0 ≤ θ ≤ π, 0 ≤ ψ < 2π
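The construction above can be checked numerically. The sketch below (not part of the original notes; it assumes Python with NumPy, and the angle values are arbitrary) multiplies the three matrices (15.10)–(15.12) as prescribed by Eq. (15.13) and verifies that the result is a proper orthogonal matrix that coincides with the explicit form (15.14).

import numpy as np

def R3(a):      # rotation around the 3-axis, Eq. (15.10)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])

def RN(a):      # rotation around the nodal line (1-axis), Eq. (15.11)
    c, s = np.cos(a), np.sin(a)
    return np.array([[1., 0., 0.], [0., c, -s], [0., s, c]])

def R_euler_x(phi, theta, psi):      # explicit matrix (15.14), x-convention
    cf, sf = np.cos(phi), np.sin(phi)
    ct, st = np.cos(theta), np.sin(theta)
    cp, sp = np.cos(psi), np.sin(psi)
    return np.array([[cp*cf - ct*sf*sp, -cp*sf - ct*cf*sp,  sp*st],
                     [sp*cf + ct*sf*cp, -sp*sf + ct*cf*cp, -cp*st],
                     [st*sf,             st*cf,             ct   ]])

phi, theta, psi = 0.4, 1.1, 2.3                        # arbitrary Euler angles
R = R3(psi) @ RN(theta) @ R3(phi)                      # product of the matrices in Eq. (15.13)
print(np.allclose(R.T @ R, np.eye(3)))                 # orthogonality, Eq. (15.4)
print(np.isclose(np.linalg.det(R), 1.0))               # special condition, Eq. (15.5)
print(np.allclose(R, R_euler_x(phi, theta, psi)))      # agreement with Eq. (15.14)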

There exists, of course, an arbitrariness in the sequence of rotations that can be chosen: the first rotation can be made with respect to any one of the three axes, and in the two subsequent rotations the only limitation is that we cannot carry out two successive rotations with respect to the same axis. Hence, there are a total of 12 possible conventions for a right-handed coordinate system.
In the convention used in this section, the first rotation around X3 was used to introduce the X1 axis in the X1′ X2′ plane, and then we rotated around the X1^(a) axis (nodal line) and finally around the X3^(b) axis. We call this the X1-convention (or the x-convention), because the X1 axis is transformed to be aligned with the nodal line. However, we could equally start rotating around X3 but introduce the X2 axis in the X1′ X2′ plane, so that X2^(a) will be aligned with the nodal line, and the second rotation will be around X2^(a). We call it the X2-convention (or y-convention). The so-called x and y-conventions are the most usual in Physics2. There exists a third convention based on a different algorithm, called the xyz-convention, which is widely used in engineering; we shall not describe it here.
Let us now describe the y−convention
1 The convention used in Eq. (15.10) follows from the definition (15.1). It should be emphasized that in some books the matrix of rotation is defined as x′ = Rx instead of x̃′ = x̃R. So our definition is the transpose of the definition in some books. Taking the transpose is equivalent to changing α → −α in any angle involved.
2 We could for instance start rotating around (say) the X1 axis to align X3 (or X2) with the nodal line. Nevertheless, such conventions are seldom used in the literature. Starting with a rotation around X3 is a very universal practice.

15.1.2 Euler angles in the Y −convention


For the moment we shall distinguish the Euler angles in the x-convention (φx, θx, ψx) described above from the ones in the y-convention (φy, θy, ψy). The intermediate steps followed in the y-convention will be written as
X1 X2 X3 → X̄1^(a) X̄2^(a) X̄3^(a) → X̄1^(b) X̄2^(b) X̄3^(b) → X1′ X2′ X3′
We start with a rotation around X3 that aligns the new X̄2^(a) axis with the nodal line3. It can be done either (a) with a rotation φy ≡ 3π/2 + φx in the counter-clockwise sense or (b) with a rotation π/2 − φx in the clockwise sense4. Both conventions are equivalent. We choose φy = 3π/2 + φx to keep the usual counter-clockwise sense convention. In matrix form this rotation from the basis X1 X2 X3 to the new basis X̄1^(a) X̄2^(a) X̄3^(a) is described by
 
cos φy − sin φy 0
R3 (φy ) =  sin φy cos φy 0 
0 0 1
Next comes a rotation around the nodal line N ≡ X̄2^(a) = X̄2^(b) through an angle θy = θx (since in both conventions θ is the dihedral angle between the planes X1 X2 and X1′ X2′), by which the X̄1^(b) axis is introduced in the X1′ X2′ plane. Comparing with the x-convention, it is clear that X̄2^(b) = X1^(b) (nodal line), and X̄1^(b) = −X2^(b). The matrix of rotation from the basis X̄1^(a) X̄2^(a) X̄3^(a) to the basis X̄1^(b) X̄2^(b) X̄3^(b) is given by
 
R_{X̄2^(a)} (θy) = RN (θy) =
( cos θy , 0 , sin θy )
( 0 , 1 , 0 )
( − sin θy , 0 , cos θy )
Finally, there is a rotation ψy = ψx + π/2 around the X̄3^(b) = X3′ axis. This relation is easily seen by observing that ψx is the angle between the nodal line X1^(b) and X1′, and taking into account that X̄1^(b) = −X2^(b). The matrix describing the change of basis X̄1^(b) X̄2^(b) X̄3^(b) → X1′ X2′ X3′ is given by
cos ψy − sin ψy 0
R3′ (ψy ) =  sin ψy cos ψy 0 
0 0 1
on the other hand the relations between the Euler angles in both conventions are

φx = φy − 3π/2 ; cos φx = cos (φy − 3π/2) = − sin φy ; sin φx = sin (φy − 3π/2) = cos φy
θx = θy ; cos θx = cos θy ; sin θx = sin θy
ψx = ψy − π/2 ; cos ψx = cos (ψy − π/2) = sin ψy ; sin ψx = sin (ψy − π/2) = − cos ψy
Consequently, the total rotational matrix in the y−convention of the Euler angles, could be constructed either by composing
the three matrices associated with each step

R (φy , θy , ψy ) = R3′ (ψy ) RN (θy ) R3 (φy ) (15.16)

or by replacing the relations

cos φx = − sin φy ; sin φx = cos φy


cos θx = cos θy ; sin θx = sin θy
cos ψx = sin ψy ; sin ψx = − cos ψy

in the matrix (15.14) associated with the x−convention. The result is


 
− sin ψy sin φy + cos θy cos φy cos ψy − sin ψy cos φy − cos θy sin φy cos ψy cos ψy sin θy
R (φy , θy , ψy ) =  cos ψy sin φy + cos θy cos φy sin ψy cos ψy cos φy − cos θy sin φy sin ψy sin ψy sin θy  (15.17)
− sin θy cos φy sin θy sin φy cos θy
from now on, whenever the parameterization by Euler angles is used, we shall employ the y−convention, and we simplify the
notation of the angles to (φ, θ, ψ). Nevertheless, it is very important to know the convention used in the construction of the
matrix, when we want to compare results coming from different sources.
3 The positive sense of the nodal line will be preserved with respect to the x-convention. Alignment is understood in the sense that the positive X̄2^(a) axis is parallel with the positive nodal line.
4 The clockwise or counter-clockwise senses are defined with respect to an observer located on the positive side of the axis.

15.2 The angle-and-axis-parameterization


15.2.1 Proper orthogonal transformations
The orthogonality property of rotations leads to the following property
(A − 1) Ã = 1 − Ã

taking the determinant on both sides we get


 
det (A − 1) det Ã = det (1 − Ã)

now, since det Ã = det A = 1, and taking into account that the identity matrix is symmetric, we have

det (A − 1) = det (1 − Ã) = det [(1 − Ã)~] = det (1 − A)

⇒ det (A − 1) = det [− (A − 1)] (15.18)


and applying the property
det (−B) = (−1)^n det B (15.19)
in Eq. (15.18) for proper orthogonal matrices of odd dimension, we see that
det (A − 1) = − det (A − 1)
det (A − 1) = 0 , for odd dimensions (15.20)
and comparing with the general eigenvalue equation, we observe that λ = 1 is a solution for such an equation. In conclusion,
for any proper orthogonal matrix of odd dimension, λ = +1 is one of the eigenvalues. It is important to emphasize that this
conclusion is only valid for orthogonal proper matrices of odd dimension. In addition, if the matrix A is real, we obtain that
if λ is a solution of the secular equation, then λ∗ also is.

15.2.2 Real proper orthogonal matrices in three-dimensions


In the case of three dimensions we have an odd dimension. Hence, one of the eigenvalues equals unity. By convention
we assign λ3 = +1. Since orthogonal matrices are normal, the spectral theorem guarantees that they can be brought to the
canonical form by means of a similarity transformation through a unitary matrix. Hence, the determinant (which is invariant
under a similarity transformation) is the product of the eigenvalues
det A = λ1 λ2 λ3 = λ1 λ2 = 1 (15.21)
now, real orthogonal matrices are special cases of unitary matrices. Further, we recall that the eigenvalues of unitary matrices lie on the unit circle in the complex plane, so we have
|λ1| = |λ2| = λ3 = +1 (15.22)
Taking into account Eqs. (15.21) and (15.22) we write
λ1 = eiΦ1 ; λ2 = eiΦ2 ; λ3 = 1
λ1 λ2 = 1 = eiΦ1 eiΦ2 ⇒ Φ1 = −Φ2 ≡ Φ
so that the eigenvalues are given by
λ1 = eiΦ ; λ2 = e−iΦ ; λ3 = 1
we see that λ1 = λ∗2 which is consistent with the fact that if the matrix A is real, λ∗ is a solution of the secular equation as
long as λ is. This in turn implies that λ1 and λ2 are both complex or both real. They are real when Φ = 0, ±π and complex
otherwise. These facts lead to the following theorem
Theorem 15.1 Let A be a real proper orthogonal matrix in three-dimensions. Its eigenvalues yields
λ1 = eiΦ ; λ2 = e−iΦ ; λ3 = 1 (15.23)
We have three possibilities
1. When Φ = 0, all eigenvalues are +1. This is the trivial case in which the matrix of transformation is the identity.
2. When Φ = ±π, λ1 = λ2 = −1, and λ3 = 1. This transformation can be considered as an inversion in two coordinate
axes keeping the third unaltered, it can be shown by applying the canonical matrix of eigenvalues to an arbitrary vector.
Equivalently, it can be seen as a rotation of π with respect to the third axis.
3. Φ 6= 0, ±π, in this case λ1 and λ2 are complex and λ1 = λ∗2 = eiΦ .
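A quick numerical illustration of theorem 15.1 (a sketch, not part of the original notes; it assumes Python with NumPy and an arbitrary sample rotation): the eigenvalues of a real proper orthogonal 3 × 3 matrix indeed have the structure {1, e^{iΦ}, e^{−iΦ}}.

import numpy as np

# arbitrary proper orthogonal matrix: a rotation about the 3-axis followed by one about the 1-axis
c1, s1 = np.cos(1.2), np.sin(1.2)
c2, s2 = np.cos(0.7), np.sin(0.7)
A = np.array([[1, 0, 0], [0, c2, -s2], [0, s2, c2]]) @ \
    np.array([[c1, -s1, 0], [s1, c1, 0], [0, 0, 1]])

lam = np.linalg.eigvals(A)
print(np.allclose(np.abs(lam), 1.0))            # all moduli equal to 1, Eq. (15.22)
print(np.isclose(np.prod(lam).real, 1.0))       # product of eigenvalues = det A = +1
print(np.min(np.abs(lam - 1.0)) < 1e-12)        # lambda = +1 is always one of the eigenvalues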

15.3 Euler’s theorem for rotations


Theorem 15.2 (Euler’s theorem): A reorientation from a set of axes X1 X2 X3 to another set of axes X1′ X2′ X3′ (with common
origin), can be carried out by a single rotation around a fixed axis passing through the origin with a given angle Ψ of rotation.
Proof: Let A be the matrix that realizes the reorientation. If both sets of axes coincide, the transformation is the identity, so that Ψ = 0 and any axis of rotation will do. So we assume that the rotation is non-trivial. The theorem will be
proved if we show that there is one and only one linearly independent vector x (written in the basis of X1 X2 X3 ) that remains
invariant under the rotation characterized by the real proper orthogonal matrix A. To see this, we look for the solution of the
equation
x′ = Ax = x (15.24)
therefore, the problem reduces to show that λ = 1 is a non-degenerate eigenvalue of the matrix A. This is guaranteed by
theorem 15.1 when the rotation is non-trivial. It is then clear that an eigenvector x of A determines the direction of the axis
of rotation. QED.
It is worth pointing out that this theorem depends on the odd dimensionality of the space. For instance, in two dimensions there is no vector that remains invariant under a rotation. The axis of rotation is perpendicular to the plane and thus outside
of the space.
On the other hand, Eq. (15.24) shows that the eigenvector associated with λ = 1 determines the direction of the axis of
rotation, and since λ = 1 is non-degenerate for a non-trivial rotation, we can determine such an axis uniquely. Once the axis of
rotation is determined, we proceed to find the angle of rotation around such an axis. By means of a similarity transformation
we can obtain an equivalent matrix A′ such that
A′ = BAB−1
if we interpret B as a change of basis for an (active) operator described by A, we can choose the change of basis such that the
new axis X3′ coincides with the axis of rotation5 . In such a coordinate system A′ represents a rotation around the X3′ axis i.e.
in the X1′ X2′ −plane through an angle α. Consequently, the matrix A′ acquires the form
 
cos α − sin α 0
A′ =  sin α cos α 0 (15.25)
0 0 1
the trace of A′ yields
T rA′ = 1 + 2 cos α (15.26)
and recalling that the trace is invariant under a similarity transformation, we have
Tr A′ = Tr A = Σ_{i=1}^{3} a_{ii} = 1 + 2 cos α (15.27)

where aii are the (known) diagonal elements of the matrix A. Therefore, α can be solved in terms of those elements. On the
other hand, let us assume another similarity transformation of A that brings it into the canonical form λ (see Eqs. 3.53, 3.54).
Using the invariance of the trace again and the structure of the eigenvalues Eq. (15.23) we find
T rA = T rλ = 1 + eiΦ + e−iΦ = 1 + 2 cos Φ (15.28)
equating (15.27) with (15.28) we obtain
1 + 2 cos Φ = 1 + 2 cos α
α = ±Φ (15.29)
so that the angle of rotation is equal to one of the complex phases associated with the eigenvalues of A.
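The axis and the angle of a given rotation can thus be extracted directly from the matrix: the axis is the eigenvector associated with λ = 1 (Euler's theorem) and the angle follows from the trace, Eq. (15.27). A minimal sketch (not part of the original notes; it assumes Python with NumPy, and the sample matrix is arbitrary):

import numpy as np

def axis_and_angle(A):
    # axis: eigenvector of A associated with the eigenvalue +1 (Euler's theorem)
    w, v = np.linalg.eig(A)
    n = np.real(v[:, np.argmin(np.abs(w - 1.0))])
    n = n / np.linalg.norm(n)
    # angle from the trace, Eq. (15.27); the sign of the angle remains ambiguous,
    # exactly as discussed in the text
    alpha = np.arccos((np.trace(A) - 1.0) / 2.0)
    return n, alpha

# sample: rotation by 0.9 rad about the 3-axis (so the extracted axis should be e3)
c, s = np.cos(0.9), np.sin(0.9)
A = np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])
print(axis_and_angle(A))   # axis ~ (0, 0, ±1), angle ~ 0.9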
On the other hand, if xk is an eigenvector associated with λk then αxk is also an eigenvector. Therefore, there is an ambiguity in both the magnitude and the sense of the eigenvectors associated with a given eigenvalue. Consequently, there is an ambiguity in the sense of gyration of the axis of rotation, as can be seen in Eq. (15.29). Indeed, it is clear that the solution of the eigenvalue problem does not fix the orthogonal matrix A uniquely. For example, the secular equation shows us that the transpose of the matrix (and so its inverse, for an orthogonal matrix) has the same eigenvalues and eigenvectors as A. This is logical from the geometrical point of view, because the inverse rotation corresponds to the same axis of rotation and to the same angle except for a difference in sign6; but we already saw that the eigenvectors are still eigenvectors under a change of sign, and the eigenvalues of a three-dimensional real proper orthogonal matrix have the structure λi = 1, e^{±iΦ}, which is invariant under a change of sign of the phase Φ. These ambiguities can be at least reduced by assuming that Φ is associated with A and −Φ with A−1, and by fixing the sense of the axes of rotation with the right-hand rule.
5 The interpretation of A as an active operator is only a tool to simplify the arguments, but the results will apply to arbitrary real proper

orthogonal three-dimensional matrices. Indeed in our present context, A represents a change of coordinate axes, so that A is a passive operator.
6 Equivalently, we can think that the inverse of the matrix is associated with the same angle but with an axis of rotation in the opposite sense.

15.3.1 The angle-axis parameterization

Figure 15.2: (A) Illustration of the group manifold for SO (3). The radius of the sphere is r = Ψmax = π. (B) Illustration of the two types of closed curves on the group manifold. The curve (a) is closed in the usual sense, while the curve (b) is closed in the sense that its end points represent the same point of the manifold by virtue of Eq. (15.31).

The previous results show that any rotation can be described by Rn (Ψ), where n is an unit vector specifying the direction of
the axis of rotation, and Ψ describes the angle of rotation around that axis. Any unit vector n requires two degrees of freedom
to be determined, usually two angles (such as the polar and azimuthal angles Θ, Φ). The set of independent parameters can
be (Ψ, Θ, Φ). We are using capital letters to distinguish these angles from the Euler angles. The manifold is then constituted
by the set (Ψ, Θ, Φ) defined on the ranges

0 ≤ Ψ ≤ π , 0 ≤ Θ ≤ π, 0 ≤ Φ < 2π (15.30)

there is a redundancy in this parameterization


R−n (π) = Rn (π) (15.31)

the structure of the group parameter space (or group manifold) can be visualized by associating each rotation to a vector7
c ≡ Ψn, pointing in the direction of n, with magnitude equal to Ψ. The tips of these vectors fill a three-dimensional sphere of
radius Ψmax = π. The group manifold8 is illustrated in Fig. 15.2a. Owing to the redundancy expressed by Eq. (15.31), two
points on the surface of the sphere on opposite ends of a diameter are equivalent to each other.
The sphere with this additional feature is compact (closed and bounded) as well as doubly connected. The latter feature means that the group manifold allows two distinct classes of closed curves: (a) those that can be deformed continuously into a point, and (b) those that must wrap around the sphere once. We see both types of curves in Fig. 15.2b.
The curve (b) in Fig. 15.2b is closed because the ends of the line correspond to opposite ends of the sphere i.e. to the same
point on the manifold. This curve cannot be deformed continuously to a shape like the one of curve (a) since neither end of
the line can move inside the sphere without breaking the curve (that is, when one end moves inside the sphere, the curve is
not closed anymore), and when one end is moved on the surface the other end must keep up with it by staying at the opposite
side of the diameter, otherwise the curve breaks again (i.e. it stops being closed). It can be shown that all curves that wind around the sphere an even number of times can be deformed into a curve like (a) and so into a point. Similarly, all curves that wind around the sphere an odd number of times can be deformed into a curve like (b). These geometrical properties will give part of the structure
of the group representations.
There is a very useful property concerning group multiplication Eq. (15.9). In the angle-and-axis parameterization, we see
that

Theorem 15.3 Two rotations Rn′ (Ψ) and Rn (Ψ) associated with the same angle of rotation Ψ possess the following property

If n′ = Rn with |n| = |n′ | = 1 then Rn′ (Ψ) = RRn (Ψ) R−1 (15.32)

thus the rotational matrix Rn′ (Ψ) is obtained from the rotational matrix Rn (Ψ) (with the same angle of rotation), by a
similarity transformation. In particular, Rn−1 (Ψ) = Rπ Rn (Ψ) Rπ−1 where Rπ is a rotation through an angle π around an axis
perpendicular to n.
7 We should take care with this association. For instance, the basic operation (sum) of vectors is commutative, while rotations are not commutative.

Hence this analog is useful to construct the manifold, but not to analyze the group multiplication.
8 Observe that negative angles Ψ do not appear in this parameterization, as can be seen in Eq. (15.30). Therefore, the inverse of a rotation Rn (Ψ) will be expressed in the form R−n (Ψ) instead of Rn (−Ψ).

Proof: Theorem 15.1, Eq. (15.23), shows that rotations Rn′ (Ψ) and Rn (Ψ) associated with the same angle Ψ possess the same eigenvalues. Therefore, they are associated with the same canonical matrix (except for a possible reordering of the eigenvalues), so that they are equivalent matrices. Further, since they are real normal matrices, the diagonalization of each of
them can be done by a real orthogonal matrix. Therefore, the transformation of similarity is done by means of a real orthogonal
matrix and so by a rotation9 . It is left to the reader to show that the matrix of rotation in the similarity transformation, is
precisely the one that transforms n into n′ . In particular,

Rn−1 (Ψ) = R−n (Ψ)

so that n′ = −n from which it is clear that n′ = Rπ n where Rπ is a rotation through an angle π around an axis perpendicular
to n. QED.
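Eq. (15.32) can also be verified numerically. The sketch below (not part of the original notes; it assumes Python with NumPy and SciPy, and all axes and angles are arbitrary test values) builds rotations as R_n(Ψ) = e^{−iΨ n·J} using the generator matrices (15.50) introduced later in Sec. 15.4, and checks that R_{Rn}(Ψ) = R R_n(Ψ) R^{−1}.

import numpy as np
from scipy.linalg import expm

# generators of SO(3), Eq. (15.50): (J_k)_lm = -i eps_klm
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1
J = -1j * eps

def Rot(n, Psi):
    # R_n(Psi) = exp(-i Psi n.J), Eq. (15.54); the result is a real 3x3 matrix
    return np.real(expm(-1j * Psi * np.einsum('k,kij->ij', n, J)))

n = np.array([0.3, -0.5, 0.81])
n = n / np.linalg.norm(n)                                  # arbitrary unit axis
Psi = 1.3                                                  # arbitrary rotation angle
R = Rot(np.array([0., 0., 1.]), 0.7) @ Rot(np.array([1., 0., 0.]), 0.4)   # arbitrary rotation R
n_prime = R @ n
# Eq. (15.32); since R is orthogonal, R^{-1} = R transposed
print(np.allclose(Rot(n_prime, Psi), R @ Rot(n, Psi) @ R.T))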
As a corollary, we obtain the characterization of the conjugacy classes of SO (3)

Theorem 15.4 (Conjugacy classes of the SO (3) group) A conjugacy class of the group SO (3) consists of all rotations by the
same angle Ψ. In particular, any rotation and its inverse belong to the same class (ambivalent classes).

Proof: Theorem 15.3 proves that rotations associated with the same angle Ψ belong to the same class. Further, if Ψ′ ≠ Ψ the matrix associated with any rotation Rn′ (Ψ′) has different eigenvalues from Rn (Ψ), according to theorem 15.1, Eq. (15.23). Therefore, Rn′ (Ψ′) and Rn (Ψ) are not equivalent, so that they belong to distinct classes. QED.
Observe that in terms of the manifold described in Fig. 15.2, a conjugacy class consists of the set of points in a “spherical
shell” of radius Ψ.

Theorem 15.5 The group SO (3) is simple. Thus, all its representations (except the identity representation) are faithful.

The proof of theorem 15.5 is left to the reader. For now, observe that a given SO (2) is a subgroup of SO (3) but it is not invariant in SO (3), as can be seen from the fact that an SO (2) consists of a set {R±n (Ψ) : 0 ≤ Ψ ≤ π} with n fixed, so that a given SO (2) only contains two elements of each conjugacy class, say Rn (Ψ0) and R−n (Ψ0); hence it contains no complete conjugacy classes. By similar arguments the subgroups of a given SO (2) (see example 6.44, page 121) are not invariant in SO (3), even though they are invariant in SO (2) (since the latter is abelian).

15.3.2 Parameterization of rotations by successive fixed-axis rotations


Let us define as N a unit vector along the nodal line. A rotation in terms of the Euler angles (in the Y −convention) is described
by Eq. (15.16)
R (φ, θ, ψ) = R3′ (ψ) RN (θ) R3 (φ) (15.33)
where R3 (φ) indicates a rotation by φ around X3 , RN (θ) indicates a rotation by θ around the nodal line, and R3′ (ψ) means
rotation by ψ around X3′. However, each of the three rotation operators on the RHS of Eq. (15.33) is built in a different basis. R3 (φ) is written in the basis X1 X2 X3 and describes the change from such a basis to the basis X̄1^(a) X̄2^(a) X̄3^(a). The operator RN (θ) is written in the basis X̄1^(a) X̄2^(a) X̄3^(a) and describes the change of basis X̄1^(a) X̄2^(a) X̄3^(a) → X̄1^(b) X̄2^(b) X̄3^(b). Finally, R3′ (ψ) is written in the basis X̄1^(b) X̄2^(b) X̄3^(b) and describes the change of basis X̄1^(b) X̄2^(b) X̄3^(b) → X1′ X2′ X3′.
It is convenient to express a rotation parameterized with the Euler angles in terms of rotations around the fixed axes, i.e.
around the original basis X1 X2 X3 . To do it, we take into account that in the Euler angle parameterization (y−convention)
we have
X3′ = RN (θ) X3 ; N = R3 (φ) X2 (15.34)
and applying Eqs. (15.34) in Eq. (15.32) we see that
R3′ (ψ) = RN (θ) R3 (ψ) RN (θ)^{−1} (15.35)
RN (θ) = R3 (φ) R2 (θ) R3 (φ)^{−1} (15.36)

substituting (15.35) in Eq. (15.33) we have


R (φ, θ, ψ) = [RN (θ) R3 (ψ) RN (θ)^{−1}] RN (θ) R3 (φ) = RN (θ) R3 (ψ) R3 (φ)
R (φ, θ, ψ) = RN (θ) R3 (ψ + φ) (15.37)

and substituting (15.36) in (15.37) we find


   
R (φ, θ, ψ) = [R3 (φ) R2 (θ) R3^{−1} (φ)] R3 (ψ + φ) = R3 (φ) R2 (θ) [R3^{−1} (φ) R3 (ψ + φ)] = R3 (φ) R2 (θ) R3 (ψ + φ − φ)
9 Indeed, the transformation of similarity can be done with an improper real orthogonal matrix R′ , but the same transformation of similarity is

obtained with R = −R′ which is a proper real orthogonal matrix associated with a continuous rotation.

R (φ, θ, ψ) = R3 (φ) R2 (θ) R3 (ψ) (15.38)


Eq. (15.38) shows that every rotation can be decomposed into a product of rotations around the fixed axes X2, X3. Rotations around each of the fixed axes X1 X2 X3 by a generic angle α are written as
   
cos α − sin α 0 cos α 0 sin α
R3 (α) =  sin α cos α 0  ; R2 (α) =  0 1 0  (15.39)
0 0 1 − sin α 0 cos α
 
1 0 0
R1 (α) =  0 cos α − sin α  (15.40)
0 sin α cos α

replacing these expressions in Eq. (15.38) we reproduce Eq. (15.17).
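The chain of relations (15.34)–(15.38) can be confirmed numerically. In the sketch below (not part of the original notes; it assumes Python with NumPy, and the angle values are arbitrary), the operators R3′(ψ) and RN(θ) are built from the fixed-axis matrices via the conjugations (15.35) and (15.36), and the product (15.33) is compared with the fixed-axis decomposition (15.38).

import numpy as np

def R3(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])

def R2(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0., s], [0., 1., 0.], [-s, 0., c]])

phi, theta, psi = 0.7, 0.5, 1.9                      # arbitrary Euler angles

RN  = R3(phi) @ R2(theta) @ R3(phi).T                # Eq. (15.36): rotation about the nodal line
R3p = RN @ R3(psi) @ RN.T                            # Eq. (15.35): rotation about X3'

lhs = R3p @ RN @ R3(phi)                             # Eq. (15.33): R3'(psi) RN(theta) R3(phi)
rhs = R3(phi) @ R2(theta) @ R3(psi)                  # Eq. (15.38): fixed-axis decomposition
print(np.allclose(lhs, rhs))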

15.3.3 Relation between the angle-axis parameters and the Euler angles (y − convention)
Note that the Euler angles parameterization is more advantageous for algebraic manipulations than the angle-axis one. In
contrast, the latter is more advantageous for geometric interpretations. In general, they provide a complementary picture of
the behavior of rotations. Therefore, it is important to know the relation between the sets of parameters related with both
conventions. It can be shown that the relations between the Euler angles (φ, θ, ψ) in the y−convention and the angles in the
angle-axis parameterization (Φ, Θ, Ψ) are given by
  
Φ = (π + φ − ψ)/2 ; tan Θ = tan (θ/2) / sin ((ψ + φ)/2) ; cos Ψ = 2 cos² (θ/2) cos² ((φ + ψ)/2) − 1 (15.41)
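The last of these relations can be checked against the trace formula (15.28), Tr R = 1 + 2 cos Ψ, which holds for any rotation by an angle Ψ. A small sketch (not part of the original notes; it assumes Python with NumPy, and the Euler angles are arbitrary):

import numpy as np

def R3(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])

def R2(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0., s], [0., 1., 0.], [-s, 0., c]])

phi, theta, psi = 0.9, 0.6, 1.4                         # arbitrary Euler angles
R = R3(phi) @ R2(theta) @ R3(psi)                       # Eq. (15.38)

cos_Psi_from_trace = (np.trace(R) - 1.0) / 2.0          # Eq. (15.28)
cos_Psi_from_15_41 = 2*np.cos(theta/2)**2 * np.cos((phi + psi)/2)**2 - 1.0
print(np.isclose(cos_Psi_from_trace, cos_Psi_from_15_41))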

15.4 One-parameter subgroups, generators and Lie algebra


The angle-and-axis parameterization described in Sec. 15.3.1, was suitable to describe the compactness and connectedness
properties of the manifold. Notwithstanding, for our present purpose, it is better to redefine this parameterization slightly, in
the form
Rn (Ψ) : n ≡ n (Θ, Φ) with 0 ≤ Θ ≤ π/2 , 0 ≤ Φ < 2π , −π < Ψ ≤ π and ‖n‖ = 1
so that we restrict the unit vectors n to those with e3 · n ≥ 0, while Ψ runs over the same range as the parameter of SO (2). In that case the inverse of a rotation is written as10

Rn−1 (Ψ) = Rn (−Ψ) (15.42)

and a conjugacy class consists of all rotations with a fixed value of |Ψ|, showing that for a given rotation, its inverse belongs to the same conjugacy class. Note, however, that the parameterization described in Sec. 15.3.1 is more suitable to characterize the compactness and the doubly connected feature of the manifold. Indeed, topological properties of the manifold must be invariant under a change of the set of parameters used to characterize the manifold. The details of this issue are beyond the scope of the
present treatment.
With this reparameterization, it is clear that for any fixed unit vector n there exists a subgroup of SO (3) of rotations around n. This group is isomorphic to SO (2). Associated with each of these subgroups there is a generator denoted by Jn, such that all elements of the given subgroup can be written as

Rn (Ψ) = e−iΨJn ; −π < Ψ ≤ π (15.43)

which is a one-parameter subgroup of SO (3).

Lemma 15.1 For a fixed unit vector n and an arbitrary rotation R ∈ SO (3), it holds

RJn R−1 = Jn′ ; n′ ≡ Rn (15.44)

Proof: We establish first that

(R J R^{−1})^n = R J^n R^{−1} (15.45)

it is clearly valid for n = 1. Assuming it is valid for n we have

(R J R^{−1})^{n+1} = (R J^n R^{−1})(R J R^{−1}) = R J^n (R^{−1} R) J R^{−1} = R J^{n+1} R^{−1}
10 Note that we define the parameter Ψ in the interval −π < Ψ ≤ π instead of 0 < Ψ ≤ 2π, for Eq. (15.42) to make sense. In this parameterization

for an arbitrary n the corresponding −n is not contained in the parameterization, except for Θ = π/2.

so it is valid for any n. From this we see that

R e^{−iΨJ} R^{−1} = R [Σ_{n=0}^{∞} (−iΨJ)^n / n!] R^{−1} = Σ_{n=0}^{∞} [(−iΨ)^n / n!] R J^n R^{−1} = Σ_{n=0}^{∞} [(−iΨ)^n / n!] (R J R^{−1})^n

R e^{−iΨJ} R^{−1} = e^{−iΨ R J R^{−1}} (15.46)

applying Eq. (15.32), and using Eq. (15.46) we have that


Rn′ (Ψ) = R Rn (Ψ) R^{−1} ⇒ e^{−iΨ J_{n′}} = R e^{−iΨ J_n} R^{−1} = e^{−iΨ R J_n R^{−1}} ⇒
R J_n R^{−1} = J_{n′} ; n′ ≡ Rn

QED.
This lemma says that under rotations Jn behaves as a “vector” in the direction n. In this case Jn is a 3×3 matrix. The
generators along each coordinate axis can be calculated by assuming an infinitesimal rotation around the unit vector ek

Rek (dΨ) = E − idΨ Jek

for e1 , e2 , e3 we compare this equation with the infinitesimal form at first order of Eqs. (15.40, 15.39)
     
1 0 0 1 0 0 0 0 0
R1 (dΨ) =  0 1 −dΨ  =  0 1 0  − i dΨ  0 0 −i  (15.47)
0 dΨ 1 0 0 1 0 i 0
     
1 0 dΨ 1 0 0 0 0 i
R2 (dΨ) =  0 1 0 = 0 1 0  − i dΨ  0 0 0  (15.48)
−dΨ 0 1 0 0 1 −i 0 0
     
1 −dΨ 0 1 0 0 0 −i 0
R3 (dΨ) =  dΨ 1 0 = 0 1 0  − i dΨ  i 0 0  (15.49)
0 0 1 0 0 1 0 0 0

from which we see that


     
0 0 0 0 0 i 0 −i 0
J1 =  0 0 −i  ; J2 =  0 0 0  ; J3 =  i 0 0  (15.50)
0 i 0 −i 0 0 0 0 0
(Jk )l m = −iεklm (15.51)

where εklm is the totally anti-symmetric unit third-rank tensor in three dimensions (see also Eq. 15.7).
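As a quick check (a sketch, not part of the original notes; it assumes Python with NumPy and SciPy), exponentiating the generator J3 of Eq. (15.50) according to Rn(Ψ) = e^{−iΨJn} reproduces the finite rotation matrix R3(Ψ) of Eq. (15.39).

import numpy as np
from scipy.linalg import expm

J3 = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])    # Eq. (15.50)

Psi = 0.8                                              # arbitrary angle
R3_from_generator = np.real(expm(-1j * Psi * J3))      # Eq. (15.43) with n = e3

c, s = np.cos(Psi), np.sin(Psi)
R3_explicit = np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])   # Eq. (15.39)
print(np.allclose(R3_from_generator, R3_explicit))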

Theorem 15.6 (Vector generator J): (i) The set {Jk ; k = 1, 2, 3} behaves under rotations in the same way as coordinate vector
operators, that is
RJk R−1 = Jl Rl k ; k = 1, 2, 3 (15.52)
(ii) The generator of rotations around an arbitrary direction n can be written as

Jn = Jk nk ; n = ek nk ; k = 1, 2, 3 (15.53)

from which it follows that

Rn (Ψ) = e^{−iΨ J_k n^k} ; n = e_k n^k ; k = 1, 2, 3 (15.54)

Proof : (i) Owing to the decomposition given by Eq. (15.38), it suffices to show Eq. (15.52) for the cases R = R2 (Ψ) and
R = R3 (φ). We show it explicitly using Eqs. (15.39, 15.50)
   
cos Ψ 0 sin Ψ 0 0 0 cos Ψ 0 − sin Ψ
−1
R2 (Ψ) J1 R2 (Ψ) =  0 1 0   0 0 −i   0 1 0 
− sin Ψ 0 cos Ψ 0 i 0 sin Ψ 0 cos Ψ
  
cos Ψ 0 sin Ψ 0 0 0
=  0 1 0   −i sin Ψ 0 −i cos Ψ 
− sin Ψ 0 cos Ψ 0 i 0
 
0 i sin Ψ 0
−1
R2 (Ψ) J1 R2 (Ψ) =  −i sin Ψ 0 −i cos Ψ  (15.55)
0 i cos Ψ 0

   
J1 [R2 (Ψ)]^1_1 + J2 [R2 (Ψ)]^2_1 + J3 [R2 (Ψ)]^3_1 = J1 cos Ψ + J2 · 0 + J3 (− sin Ψ)

Jk [R2 (Ψ)]^k_1 =
( 0 , i sin Ψ , 0 )
( −i sin Ψ , 0 , −i cos Ψ )
( 0 , i cos Ψ , 0 )   (15.56)
comparing (15.55) with (15.56) we see that R2 (Ψ) J1 R2^{−1} (Ψ) = Jk [R2 (Ψ)]^k_1. A similar procedure can be used to show that R2 (Ψ) Jm R2^{−1} (Ψ) = Jk [R2 (Ψ)]^k_m for m = 2, 3, and that R3 (Ψ) Jm R3^{−1} (Ψ) = Jk [R3 (Ψ)]^k_m for m = 1, 2, 3.
Moreover, there is an alternative way of showing it by using the invariance relations Eqs. (15.7, 15.3).

R^i_l R^j_m R^k_n ε^{lmn} = ε^{ijk} ⇒ R^i_s R^i_l R^j_m R^k_n ε^{lmn} = R^i_s ε^{ijk} ⇒ δ_{sl} R^j_m R^k_n ε^{lmn} = R^i_s ε^{ijk} ⇒
R^j_m R^k_n ε^{smn} = R^i_s ε^{ijk}

with sum over repeated indices. Now using Eq. (15.51) we have

R^j_m (−iε_{smn}) R^k_n = −iε_{ijk} R^i_s ⇒ R^j_m (J_s)^m_n R^k_n = [J_i]^j_k R^i_s
⇒ R^j_m (J_s)^m_n (R̃)^n_k = [J_i]^j_k R^i_s ⇒ [R J_s R̃]^j_k = [J_i]^j_k R^i_s ⇒ R J_s R̃ = J_i R^i_s

and since R̃ = R^{−1} we obtain Eq. (15.52).

Figure 15.3: (a) Rotations to go from e3 to n (Θ, Φ). (b) Infinitesimal rotation around the X3 −axis.

(ii) Let us consider the rotation

R (Φ, Θ, 0) = R3 (Φ) R2 (Θ) R3 (0) = R3 (Φ) R2 (Θ) (15.57)

where we have used Eq. (15.38). According to Eq. (15.57), this operation consists of a rotation around X2 through the angle Θ followed by a rotation around X3 through an angle Φ, as displayed in Fig. 15.3a. Such a figure also shows that the rotation R (Φ, Θ, 0) brings e3 to n (Θ, Φ). Therefore11

|n (Θ, Φ)⟩ = R (Φ, Θ, 0) |e3⟩ = |e_k⟩ R (Φ, Θ, 0)^k_3 = |e_k⟩ n^k ⇒
|n (Θ, Φ)⟩ = |e_k⟩ n^k ; n^k = R (Φ, Θ, 0)^k_3 (15.58)
11 Note that in the operator R (Φ, Θ, 0) the angles Φ and Θ are the Euler angles of the operator, but in n (Θ, Φ) they are the angles in the angle-axis

parameterization.

from Eqs. (15.44) and (15.52) we have

J_n = R (Φ, Θ, 0) J3 R^{−1} (Φ, Θ, 0) = J_k R (Φ, Θ, 0)^k_3 = J_k n^k

QED.
Eq. (15.53) says that {J1 , J2 , J3 } forms a basis for the generators of all the one-parameter abelian subgroups of SO (3).
We can form a vector space with the linear combinations of {J1, J2, J3}. Further, Eq. (15.54) combined with (15.38) permits us to write rotations in terms of generators and Euler angles

R (φ, θ, ψ) = e−iφJ3 e−iθJ2 e−iψJ3 ; (φ, θ, ψ) ≡ Euler angles (15.59)

Eqs. (15.53, 15.54), or Eq. (15.59), show that in practice we can work with the three basis generators {Jk} instead of the infinite number of elements R (φ, θ, ψ). The three elements {Jk} span a three-dimensional linear vector space, and Eq. (15.53) shows that for any n the element Jn belongs to this vector space

Theorem 15.7 (Lie algebra of SO (3)): Let [{Jk }] be the vector space spanned by the three basis generators {Jk }. If we define
a “multiplication rule” as the commutator of two elements, we obtain an algebraic system that we shall call the Lie algebra
generated by {Jk }. The multiplication rule defined on the basis vectors gives

[Jk , Jl ] = iεklm J m (15.60)

where [Jk , Jl ] denotes the commutator between both operators.

Proof : We first prove the commutator rules (15.60). If k = l it is obvious. Consider the case k = 1 and l = 2. An
infinitesimal rotation around e2 by an angle dΨ is written as

R2 (dΨ) = E − idΨ J2 (15.61)

Now, let us apply this infinitesimal rotation around e2 on J1 , from Eq. (15.52) this rotation gives
k
R2 (dΨ) J1 R2−1 (dΨ) = Jk R2 (dΨ) 1 (15.62)

replacing (15.61) in (15.62) and keeping terms up to first order, we find12

(E − idΨ J2) J1 (E + idΨ J2) = Jk [E − idΨ J2]^k_1 ⇒ (E − idΨ J2)(J1 + idΨ J1 J2) = Jk E^k_1 − idΨ Jk (J2)^k_1

J1 + idΨ J1 J2 − idΨ J2 J1 = J1 − idΨ [J1 (J2)^1_1 + J2 (J2)^2_1 + J3 (J2)^3_1]

using the explicit form of J2, Eq. (15.50), we have

J1 + idΨ [J1, J2] = J1 − idΨ (−iJ3) ⇒ i [J1, J2] = −J3

[J1, J2] = iJ3

by cyclic permutation of the indices, we establish the validity of Eq. (15.60). QED.
It is clear that the multiplication rule in the Lie algebra provides the multiplication rule for the group elements Rn (Ψ) =
e−iΨJn and vice versa13 . As in SO(2), the multiplication rule is generated from the local properties around the identity, and
they give most of the properties of the group representation structure. Nevertheless, as in the case of SO (2) , there will be
global properties that lead to restrictions on the representations. Examples of these global structures are the following

Rn (2π) = E, Rn (π) = Rn (−π) (15.63)

As a matter of consistency, it can be checked directly that the matrices (15.50) satisfy the commutation relations (15.60).
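Such a direct check takes only a few lines (a sketch, not part of the original notes; it assumes Python with NumPy):

import numpy as np

J1 = np.array([[0, 0, 0], [0, 0, -1j], [0, 1j, 0]])
J2 = np.array([[0, 0, 1j], [0, 0, 0], [-1j, 0, 0]])
J3 = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])    # matrices (15.50)

comm = lambda A, B: A @ B - B @ A
print(np.allclose(comm(J1, J2), 1j * J3))   # [J1, J2] = i J3
print(np.allclose(comm(J2, J3), 1j * J1))   # [J2, J3] = i J1
print(np.allclose(comm(J3, J1), 1j * J2))   # [J3, J1] = i J2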

Definition 15.2 A set of operators {J1 , J2 , J3 } are called angular momentum operators if they obey the commutation rules
given by Eqs. (15.60).

We have seen that angular momentum operators are important in both classical and quantum mechanics. In both cases they appear as generators of special functions (spherical harmonics) in which solutions of differential equations can be expanded when some spherical symmetry is apparent. In quantum mechanics, their eigenvalues are the accessible values of angular momenta (measured in units of ~). In this treatment, the commutation relations have arisen from geometrical properties of rotations.
12 It is clear that (E + idΨ J) is the inverse of (E − idΨ J) at first order.
13 A Lie algebra is an algebraic system but not an algebra in the sense defined in Sec. 10.3. Observe in particular that the multiplication is not
associative, hence this algebraic system does not form a ring. For instance, [[J1 , J3 ] , J3 ] = J1 while [J1 , [J3 , J3 ]] = 0.

In classical mechanics, a quantity that is invariant under rotations is called a scalar. In quantum mechanics, a given
observable A of a Physical system commutes with all rotation operators if and only if it commutes with all three generators

[A, Rn (Ψ)] = 0 , ∀Rn (Ψ) ∈ SO (3) ⇔ [A, Jk ] = 0 , k = 1, 2, 3

such an observable is invariant under rotations of the system

A′ = Rn (Ψ) A Rn−1 (Ψ) = A Rn (Ψ) Rn−1 (Ψ) = A

and in analogy with classical mechanics we call them scalars. A very important case is the one in which the Hamiltonian is a
scalar i.e.
[H, Rn (Ψ)] = 0 ⇔ [H, Jk ] = 0 ; k = 1, 2, 3

since in this case the physical system itself is invariant under rotations. Further, the generators Jk will be constants of motion,
because they do not depend explicitly on time, and they commute with the Hamiltonian.

15.5 Irreducible representations of the SO (3) Lie algebra


Since the basis elements of the Lie algebra are generators of infinitesimal rotations, it is easy to see that every representation
of the group is automatically a representation of the associated Lie algebra. Conversely, the general expressions for the group
elements Eqs. (15.54, 15.59) show that a representation of the Lie algebra will give us a representation of the group. However,
if we demand in the group some additional global conditions such as Eqs. (15.63), the representations of the Lie algebra
that lead to appropriate group representations will be restricted. It can be shown that since the group parameter space is
compact (closed and bounded in R3 ), the irreducible representations are finite-dimensional and they are all equivalent to
unitary representations. Consequently, the generators can be chosen as hermitian.
The irreducible (or minimal) invariant subspaces associated with an irreducible representation will be constructed in two
steps. First we choose a “reference vector” as the starting point and second we generate the remaining vectors in the basis
that span the irreducible invariant subspace by applying the appropriate generators. This is the simplest case of the general
method of Cartan to study Lie groups.
There is another strategy that we shall use later in space-time groups, which is generating all the vectors of the irreducible
invariant subspace by sucessive application of the operators associated with the group elements.
It is advantageous to choose the basis of vectors in the representation complex vector space V as the common eigenvectors
of a set of mutually commuting generators (see theorem 3.20, page 52). The Lie algebra shows that J1, J2, J3 do not commute with each other, but the operator
J² ≡ J1² + J2² + J3²
commutes with all the generators Ji . J 2 is then a scalar, and in the language of general Lie groups it is a Casimir operator.

Definition 15.3 (Casimir operator): An operator that commutes with all the generators of a Lie group, is called a Casimir
operator of such a group.

We know that an operator commutes with the generators if and only if it commutes with the elements of the group

[J², Jk] = 0 ; k = 1, 2, 3 ⇔ J² Rn (Ψ) = Rn (Ψ) J² ; ∀Rn (Ψ) ∈ SO (3)

so a Casimir of SO (3) commutes with all elements of the group. Because of Schur’s Lemma, J² is mapped into a multiple
of the identity matrix in any minimal invariant subspace. In other words, all vectors in a given irreducible invariant subspace,
are eigenvectors of J 2 with the same eigenvalue.
It is a universal convention to choose J 2 and J3 as the set of commuting operators. The remaining generators J1 and J2
will be combined to form the raising and lowering operators

J± = J1 ± iJ2 (15.64)

15.5.1 General properties of J 2 , J3 , and J±


In summary, we have the following definitions

J ≡ (J1 , J2 , J3 ) ; J2 ≡ J12 + J22 + J32 (15.65)


J+ ≡ (J1 + iJ2 ) ; J− ≡ (J1 − iJ2 ) (15.66)

with the following algebraic identities

[Ji, Jj] = iεijk Jk ; [J², J] = 0 (15.67)
[J3, J+] = J+ ; [J3, J−] = −J− (15.68)
[J+, J−] = 2J3 ; [J², J±] = 0 (15.69)
J+ J− = J² − J3² + J3 ; J− J+ = J² − J3² − J3 (15.70)
J±† = J∓ (15.71)

such identities can be shown by taking the definitions (15.65, 15.66) and the first of Eqs. (15.67) as a starting point. These properties, as well as the subsequent discussion, are proved in detail in appendix A14. Thus, we shall limit ourselves here to describing the results obtained in appendix A.

Theorem 15.8 The set {|j, mi} of orthonormal eigenvectors common to J 2 and J3 satisfy the following eigenvalue equation

J 2 |j, mi = |j, mi j (j + 1) ; J3 |j, mi = |j, mi m (15.72)

where j takes either non-negative integer values or positive half-odd-integer values


j = 0, 1/2, 1, 3/2, 2, 5/2, . . .
for a given j, if it is integer (half-odd-integer) the allowed values of m are integer (half-odd-integer). For a fixed value of j the
eigenvalue m takes the values
−j, − j + 1, − j + 2, . . . , j − 2, j − 1, j
so for a fixed j0, we can form a (2j0 + 1)-dimensional subspace of the form {|j0, m⟩ ; −j0 ≤ m ≤ j0}

Lemma 15.2 Let |j, mi be an eigenvector common to J2 and J3 with eigenvalues j (j + 1) and m. It follows that (a) m = −j
if and only if J− |j, mi = 0. (b) If m > −j then J− |j, mi 6= 0 and it is an eigenvector of J2 and J3 with eigenvalues j (j + 1)
and (m − 1) respectively.

Lemma 15.3 Let |j, mi be an eigenvector common to J2 and J3 with eigenvalues j (j + 1) and m. It follows that (a) m = j if
and only if J+ |j, m, ki = 0. (b) If m < j then J+ |j, m, ki 6= 0 and it is an eigenvector of J2 and J3 with eigenvalues j (j + 1)
and (m + 1) respectively.

Lemma 15.4 Let |j, mi be an eigenvector common to J2 and J3 with eigenvalues j (j + 1) and m. The action of the lowering
and raising operators on these eigenvectors is given by
J± |j, m⟩ = |j, m ± 1⟩ √(j (j + 1) − m (m ± 1)) (15.73)

The normalization factor on the RHS of this equation could be multiplied by an (m-dependent) phase factor e^{iθ(m)}. Here we have chosen all the normalization constants on the RHS of Eq. (15.73) to be real and positive (zero phase convention). This is also called a canonical basis or “the Condon-Shortley convention”. Two bases with different phase conventions give rise to group representation matrices that may differ by phase factors in non-diagonal terms. Nevertheless, any phase convention leads to identical physical results if it is used consistently.
In the language of group theory we have characterized the irreducible representations of the SO (3) Lie algebra

Theorem 15.9 (Irreducible representations of the SO (3) Lie algebra): The irreducible representations of the Lie algebra
of SO (3), are each characterized by an angular momentum eigenvalue j from the set of non-negative integers or positive
half-odd-integers. The orthonormal basis vectors {|j, mi} can be specified by the following equations

J 2 |j, mi = |j, mi j (j + 1) ; J3 |j, mi = |j, mi m (15.74)


J± |j, m⟩ = |j, m ± 1⟩ √(j (j + 1) − m (m ± 1)) (15.75)

a given irreducible invariant subspace consists of all linear combinations of the set

E (j) ≡ {|j, mi ; j f ixed and − j ≤ m ≤ j} (15.76)

of 2j + 1 orthonormal vectors associated with the different values of m.


14 There are some little differences with the developments in the framework of Quantum Mechanics. For instance, the group operators in quantum
i
mechanics are written as e− ~ J φ instead of e−iJ φ . Therefore, J has dimensions of angular momentum in quantum mechanics, while in our case
they are dimensionless. Thus, the universal constant ~ appears in some of the relations of Eqs. (15.67-15.71). Further, a possible degeneration in
the common eigenvectors of J 2 and J3 is sometimes denoted by |j, m, ki where k denotes the indices corresponding to linearly independent vectors
associated with the same eigenvalues of J2 and J3 .

Proof : We shall only prove that the subspace E (j) is irreducibly invariant. It is enough to prove that the vector space
E (j), is invariant under the action of the three generators of the group. For J3 it is a consequence of Eq. (15.74), for J1 and
J2 we see that
J1 |j, m⟩ = (1/2)(J+ + J−) |j, m⟩ = |j, m + 1⟩ c+ + |j, m − 1⟩ c− (15.77)
J2 |j, m⟩ = (1/2i)(J+ − J−) |j, m⟩ = |j, m + 1⟩ (c+/i) − |j, m − 1⟩ (c−/i) ; c± ≡ (1/2) √(j (j + 1) − m (m ± 1)) (15.78)
so the action of J1,2 is a linear combination of the vectors |j, m ± 1i, which have the same value of j as |j, mi ; so they belong
to the subspace specified.
To prove that the subspace is irreducible, we can take a proper subset of the basis vectors {|j, m⟩} in which a given vector |j, mp⟩ does not appear. Let mp = m0 ± k, where |j, m0⟩ is in the subset (the sign is chosen according to whether m0 is less than or greater than mp). It can be checked that (J1)^k |j, m0⟩ contains the vectors |j, m0 ± k⟩, and one of them is |j, mp⟩. Therefore, the action of (J1)^k takes |j, m0⟩ out of the subspace spanned by the subset. Hence, such a subspace is not invariant. QED.

15.6 Matrices of the generators for any (j) −representation


From Eqs. (15.74, 15.75, 15.77, 15.78) the matrix elements of the generators for a (j) −representation associated with an
invariant irreducible subspace E (j) are given by
⟨j, m| J₃ |j′, m′⟩ = m δ_{jj′} δ_{mm′}     (15.79)

⟨j, m| J² |j′, m′⟩ = j(j+1) δ_{jj′} δ_{mm′}     (15.80)

⟨j, m| J± |j′, m′⟩ = √(j(j+1) − m′(m′±1)) δ_{jj′} δ_{m,m′±1}     (15.81)

⟨j, m| J₁ |j′, m′⟩ = (1/2) δ_{jj′} [√(j(j+1) − m′(m′+1)) δ_{m,m′+1} + √(j(j+1) − m′(m′−1)) δ_{m,m′−1}]     (15.82)

⟨j, m| J₂ |j′, m′⟩ = (1/2i) δ_{jj′} [√(j(j+1) − m′(m′+1)) δ_{m,m′+1} − √(j(j+1) − m′(m′−1)) δ_{m,m′−1}]     (15.83)

We observe that all these matrix elements are proportional to δ_{jj′}, showing the invariance of the subspaces E(j) defined in Eq. (15.76). Further, the matrix (J₃)^(j) is diagonal because we chose X₃ as the axis of quantization (the basis consists of eigenvectors common to J² and J₃); the diagonal elements are the 2j+1 values of m. The matrix (J₊)^(j) only has non-vanishing elements just above the diagonal, while the matrix (J₋)^(j) only has non-vanishing elements just below the diagonal. In the Condon-Shortley convention, the matrix representation of J± is real.
For the matrices (J₁,₂)^(j) the only non-null elements are the ones just above and below the diagonal. In the Condon-Shortley convention (J₁)^(j) is real and symmetric and (J₂)^(j) is purely imaginary and antisymmetric (both are hermitian). Of course, the matrix (J²)^(j) is diagonal because E(j) consists of eigenvectors of J², all with the same eigenvalue, so that its diagonal elements are identical. Consequently, the matrix representation of J² within a subspace E(j) is j(j+1)E, with E the identity matrix of dimension (2j+1) × (2j+1). This result was discussed above from the fact that J² is a Casimir operator, using Schur's lemma.
Since all directions in space are equivalent, it is clear that the choice of the quantization axis is arbitrary. From this it follows that all the Jᵢ must have the same eigenvalues. The eigenvectors are, however, different since the Jᵢ do not commute with each other. Therefore, within a subspace E(j) the eigenvalues of J₁, J₂, J₃ are j, (j−1), . . . , (−j+1), −j. They are also the eigenvalues of any component of the form J_n = J·n with n a unit vector in an arbitrary direction. The common eigenvectors of J² and J₁ are linear combinations of the vectors |j, m⟩ with j fixed. The same happens with the eigenvectors common to J² and J₂.
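As an aside, the matrix elements (15.79-15.83) are straightforward to implement numerically. The following Python sketch (ours, for illustration only; the helper name angular_momentum_matrices is not part of these notes) builds the (2j+1)-dimensional matrices of J₁, J₂, J₃ for any j and checks the commutation relation [J₁, J₂] = iJ₃.

import numpy as np

def angular_momentum_matrices(j):
    """Return (J1, J2, J3) in the canonical basis |j,m>, m = j, j-1, ..., -j,
    built from the matrix elements of Eqs. (15.79-15.83)."""
    dim = int(round(2 * j)) + 1
    m = j - np.arange(dim)                      # m values in descending order
    J3 = np.diag(m).astype(complex)
    # J+ only has elements just above the diagonal: <j,m+1|J+|j,m>
    cplus = np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1))
    Jp = np.zeros((dim, dim), dtype=complex)
    Jp[np.arange(dim - 1), np.arange(1, dim)] = cplus
    Jm = Jp.conj().T                            # J- = (J+)^dagger
    J1 = (Jp + Jm) / 2
    J2 = (Jp - Jm) / (2 * 1j)
    return J1, J2, J3

# quick check of [J1, J2] = i J3 for j = 3/2
J1, J2, J3 = angular_momentum_matrices(1.5)
assert np.allclose(J1 @ J2 - J2 @ J1, 1j * J3)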

15.7 Matrices of the group elements for any (j) −representation


Knowing the action of the generators on the basis vectors of irreducible invariant subspaces E (j) , we can derive the matrix
representations of the group elements. Let us write

U^(j)(φ, θ, ψ) |j, m⟩ = |j, m′⟩ D^(j)(φ, θ, ψ)^{m′}_m     (15.84)

where U (j) (φ, θ, ψ) is the group operator representing the group element R (φ, θ, ψ) in the (j) −representation. Note that
U (j) (φ, θ, ψ) acts on a 2j + 1−dimensional space which only coincides with R3 (the space in which the group elements
R (φ, θ, ψ), act on) when j = 1. Owing to this fact, the representation with j = 1 is called a vector representation of SO (3).
From Eq. (15.59), the matrix representation of U (j) (φ, θ, ψ) yields
D^(j)(φ, θ, ψ)^{m′}_m = ⟨j, m′| U^(j)(φ, θ, ψ) |j, m⟩ = ⟨j, m′| e^{−iφJ₃^(j)} e^{−iθJ₂^(j)} e^{−iψJ₃^(j)} |j, m⟩
                      = ⟨j, m′| e^{−iφm′} e^{−iθJ₂^(j)} e^{−iψm} |j, m⟩ = e^{−iφm′} ⟨j, m′| e^{−iθJ₂^(j)} |j, m⟩ e^{−iψm}

so the matrix representation of U^(j)(φ, θ, ψ) gives finally

D^(j)(φ, θ, ψ)^{m′}_m = e^{−iφm′} d^(j)(θ)^{m′}_m e^{−iψm} ;  d^(j)(θ)^{m′}_m ≡ ⟨j, m′| e^{−iθJ₂^(j)} |j, m⟩ ;  no sum over m, m′     (15.85)

we have already seen that in the Condon-Shortley convention, the matrix representative of J₂ is purely imaginary and antisymmetric. Hence d^(j) = e^{−iθJ₂^(j)} is real, and since this matrix is unitary, it is orthogonal (a real unitary matrix is orthogonal). Therefore, the d^(j)-matrices are real orthogonal in the Condon-Shortley convention. Using the real nature of d^(j)(θ) in Eq. (15.85), we obtain

d^(j)(θ)^{m′}_m = ⟨j, m′| e^{−iθJ₂^(j)} |j, m⟩ = ⟨j, m′| e^{−iθJ₂^(j)} |j, m⟩* = ⟨j, m| e^{iθJ₂^(j)} |j, m′⟩ = ⟨j, m| e^{−i(−θ)J₂^(j)} |j, m′⟩

d^(j)(θ)^{m′}_m = d^(j)(−θ)^{m}_{m′}     (15.86)

this identity will be useful later.


Note finally that the space E (j) irreducibly invariant under either the Lie group or the Lie algebra, is a complex vector
space of 2j + 1 dimensions, i.e. a space of the type C2j+1 . On the other hand, since j can take all non-negative integer values
and all positive half-odd-integer values, there are representations in all finite-dimensions15 .

15.8 Matrix representations of generators and group elements for j = 0


The subspaces E(j = 0) are of dimension 2(0)+1 = 1, i.e. they are isomorphic with C¹ (the one-dimensional complex vector space). The only possible value of m is zero. The matrices (Jᵢ)^(0) are numbers and, according to Eqs. (15.82, 15.83, 15.79), these numbers are zero. The group elements are all mapped to the identity. This is the identity (or scalar) representation.

15.9 Matrix representations of generators and group elements for j = 1/2


The subspaces E (j = 1/2) are of dimension 2 (1/2) + 1 = 2, so they are isomorphic with C2 . The matrices in the subspace
E (j = 1/2) are of dimension 2 × 2 and the basis vectors will be chosen in the order m1 = 1/2, m2 = −1/2.

15.9.1 Matrix representations of the generators (j = 1/2)


The matrix representations are obtained from Eqs. (15.82, 15.83, 15.79, 15.80). Let us calculate the matrix representation of J₁ using (15.82)

(J₁)ᵢⱼ ≡ ⟨1/2, mᵢ| J₁ |1/2, mⱼ⟩ = (1/2) [√(1/2(1/2+1) − mⱼ(mⱼ+1)) δ_{mᵢ,mⱼ+1} + √(1/2(1/2+1) − mⱼ(mⱼ−1)) δ_{mᵢ,mⱼ−1}]

(J₁)ᵢⱼ = (1/2) [√(3/4 − mⱼ(mⱼ+1)) δ_{mᵢ,mⱼ+1} + √(3/4 − mⱼ(mⱼ−1)) δ_{mᵢ,mⱼ−1}]

the diagonal elements are zero as expected, so

(J₁)₁₁^(1/2) ≡ ⟨1/2, 1/2| J₁ |1/2, 1/2⟩ = 0

(J₁)₂₂^(1/2) ≡ ⟨1/2, −1/2| J₁ |1/2, −1/2⟩ = 0
¹⁵ Note in particular that in the Condon-Shortley convention, the generator J₂ is purely imaginary, so we cannot define it as a mapping of a real vector space into itself.

and the non-diagonal terms are

(J₁)₁₂^(1/2) ≡ ⟨1/2, 1/2| J₁ |1/2, −1/2⟩ = (1/2) [√(3/4 − (−1/2)(−1/2+1)) δ_{1/2,−1/2+1} + √(3/4 − (−1/2)(−1/2−1)) δ_{1/2,−1/2−1}]

(J₁)₁₂^(1/2) = (1/2) √(3/4 + 1/4) δ_{1/2,1/2} = 1/2

(J₁)₂₁^(1/2) ≡ ⟨1/2, −1/2| J₁ |1/2, 1/2⟩ = (1/2) [√(3/4 − (1/2)(1/2+1)) δ_{−1/2,1/2+1} + √(3/4 − (1/2)(1/2−1)) δ_{−1/2,1/2−1}]

(J₁)₂₁^(1/2) = 1/2

this element can also be calculated taking into account that the matrix of J₁ is real and symmetric. Such a matrix reads (rows separated by semicolons)

(J₁)^(1/2) = (1/2) [ 0 1 ; 1 0 ]

the matrix representations of the other generators are calculated similarly, and we obtain

(J₁)^(1/2) = (1/2) [ 0 1 ; 1 0 ] ;  (J₂)^(1/2) = (1/2) [ 0 −i ; i 0 ] ;  (J₃)^(1/2) = (1/2) [ 1 0 ; 0 −1 ]     (15.87)

(J²)^(1/2) = (3/4) [ 1 0 ; 0 1 ] ;  (J₊)^(1/2) = [ 0 1 ; 0 0 ] ;  (J₋)^(1/2) = [ 0 0 ; 1 0 ]     (15.88)

15.9.2 Matrix representations of the group elements (j = 1/2)


Using Eq. (15.85), combined with Eqs. (15.87, 15.88) we find the matrix representation for the group elements

D^(j)(φ, θ, ψ)^{m′}_m = e^{−iφm′} d^(j)(θ)^{m′}_m e^{−iψm} ;  d^(j)(θ)^{m′}_m ≡ ⟨j, m′| e^{−iθJ₂^(j)} |j, m⟩ ;  no sum over m, m′     (15.89)

we redefine
J_k^(1/2) ≡ σ_k / 2 ;  k = 1, 2, 3     (15.90)

where σ_k are called the Pauli matrices. From Eqs. (15.87, 15.88, 15.90) we find

σ₁ = [ 0 1 ; 1 0 ] ;  σ₂ = [ 0 −i ; i 0 ] ;  σ₃ = [ 1 0 ; 0 −1 ]     (15.91)

it can be checked by inspection that


σk2 = E ; k = 1, 2, 3 (15.92)
From the second of Eqs. (15.89), used for the representation j = 1/2, we find

d^(1/2)(θ)^{m′}_m = ⟨1/2, m′| e^{−iθJ₂^(1/2)} |1/2, m⟩ = Σ_{n=0}^∞ [(−iθ)ⁿ/n!] ⟨1/2, m′| [J₂^(1/2)]ⁿ |1/2, m⟩

the term ⟨1/2, m′| [J₂^(1/2)]ⁿ |1/2, m⟩ is an element of the matrix representation of the operator [J₂^(1/2)]ⁿ. Hence

d^(1/2)(θ)^{m′}_m = Σ_{n=0}^∞ [(−iθ)ⁿ/n!] {[J₂^(1/2)]ⁿ}^{m′}_m = Σ_{n=0}^∞ {[−iθJ₂^(1/2)]ⁿ/n!}^{m′}_m = Σ_{n=0}^∞ {[−iθ σ₂/2]ⁿ/n!}^{m′}_m

writing it as a matrix equation (and not as elements of the matrix), we have


d^(1/2)(θ) = Σ_{n=0}^∞ [−i(θ/2)σ₂]ⁿ/n! = e^{−i(θ/2)σ₂}     (15.93)

Using (15.92) we can write


e^{−iθJ_m^(1/2)} = e^{−i(θ/2)σ_m} = Σ_{n=0}^∞ [−i(θ/2)σ_m]ⁿ/n! = Σ_{k=0}^∞ [−i(θ/2)σ_m]^{2k}/(2k)! + Σ_{k=0}^∞ [−i(θ/2)σ_m]^{2k+1}/(2k+1)!

= Σ_{k=0}^∞ (−i)^{2k} σ_m^{2k} (θ/2)^{2k}/(2k)! + Σ_{k=0}^∞ (−i)^{2k}(−i) σ_m^{2k+1} (θ/2)^{2k+1}/(2k+1)!

= E Σ_{k=0}^∞ (−1)^k (θ/2)^{2k}/(2k)! − i σ_m Σ_{k=0}^∞ (−1)^k (θ/2)^{2k+1}/(2k+1)!

then we obtain finally


e^{−iθJ_m^(1/2)} = e^{−i(θ/2)σ_m} = E cos(θ/2) − i σ_m sin(θ/2) ;  m = 1, 2, 3     (15.94)
using the explicit form of σ₂, Eq. (15.91), we get

e^{−iθJ₂^(1/2)} = e^{−i(θ/2)σ₂} = cos(θ/2) [ 1 0 ; 0 1 ] − i sin(θ/2) [ 0 −i ; i 0 ]

e^{−iθJ₂^(1/2)} = e^{−i(θ/2)σ₂} = [ cos(θ/2)  −sin(θ/2) ;  sin(θ/2)  cos(θ/2) ]     (15.95)
This resembles a rotation in a plane but with the replacement θ → θ/2; this factor of 1/2 is decisive for the properties of this representation, as we shall see later. Applying Eq. (15.95) in Eq. (15.93) we find

d^(1/2)(θ) = e^{−i(θ/2)σ₂} = [ cos(θ/2)  −sin(θ/2) ;  sin(θ/2)  cos(θ/2) ]     (15.96)

D^(1/2)(φ, θ, ψ)^{m′}_m = e^{−iφm′} d^(1/2)(θ)^{m′}_m e^{−iψm}

since m = ±1/2 and defining m₁ = 1/2, m₂ = −1/2 we find

D^(1/2)(φ, θ, ψ)¹₁ = e^{−iφm₁} d^(1/2)(θ)¹₁ e^{−iψm₁} = e^{−iφ/2} cos(θ/2) e^{−iψ/2}

D^(1/2)(φ, θ, ψ)¹₂ = e^{−iφm₁} d^(1/2)(θ)¹₂ e^{−iψm₂} = e^{−iφ/2} (−sin(θ/2)) e^{iψ/2}

D^(1/2)(φ, θ, ψ)²₁ = e^{−iφm₂} d^(1/2)(θ)²₁ e^{−iψm₁} = e^{iφ/2} sin(θ/2) e^{−iψ/2}

D^(1/2)(φ, θ, ψ)²₂ = e^{−iφm₂} d^(1/2)(θ)²₂ e^{−iψm₂} = e^{iφ/2} cos(θ/2) e^{iψ/2}
so the matrix representation of a group element is given by

D^(1/2)(φ, θ, ψ) = [ e^{−iφ/2} cos(θ/2) e^{−iψ/2}   −e^{−iφ/2} sin(θ/2) e^{iψ/2} ;  e^{iφ/2} sin(θ/2) e^{−iψ/2}   e^{iφ/2} cos(θ/2) e^{iψ/2} ]     (15.97)
if we apply Eq. (15.32) with n′ = Re₂, we find

D^(1/2)[R_{n′}(2π)] = D^(1/2)[R R_{e₂}(2π) R⁻¹] = D^(1/2)[R] D^(1/2)[R_{e₂}(2π)] D^(1/2)[R]⁻¹
                    = D^(1/2)[R] e^{−i(2π)J₂^(1/2)} D^(1/2)[R]⁻¹ = D^(1/2)[R] e^{−iπσ₂} D^(1/2)[R]⁻¹

and using Eq. (15.96) with θ = 2π, we obtain

D^(1/2)[R_{n′}(2π)] = D^(1/2)[R] (−E) D^(1/2)[R]⁻¹ = −E     (15.98)
This result is clearly independent of the direction of n′, since there is always a rotation R that takes e₂ to any given n′. Consequently, in this representation all single complete revolutions are mapped into −E and not into E. The same procedure shows that two complete revolutions, i.e. ψ = 4π about any n, are mapped into E. In general, an odd number of complete revolutions is represented by −E, while an even number of complete revolutions is represented by E. Since for the rotation group SO(3) we demand R(2π) = R(0), the j = 1/2 representation of the Lie algebra yields a double-valued representation of the group. We shall return to this point later. The representations with j = 1/2 are usually called spinorial representations.
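The double-valuedness just described is easy to verify numerically. The sketch below (ours, not part of the notes) exponentiates −iθσ₂/2, checks the closed form (15.96), and confirms that one complete revolution is mapped to −E while two complete revolutions give +E.

import numpy as np
from scipy.linalg import expm

sigma2 = np.array([[0, -1j], [1j, 0]])

def d_half(theta):
    # d^(1/2)(theta) = exp(-i*theta*sigma2/2), cf. Eqs. (15.93, 15.96)
    return expm(-1j * theta * sigma2 / 2)

theta = 0.7
closed_form = np.array([[np.cos(theta / 2), -np.sin(theta / 2)],
                        [np.sin(theta / 2),  np.cos(theta / 2)]])
assert np.allclose(d_half(theta), closed_form)

# one full revolution -> -E ; two full revolutions -> +E
assert np.allclose(d_half(2 * np.pi), -np.eye(2))
assert np.allclose(d_half(4 * np.pi),  np.eye(2))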

15.10 Matrix representations of generators and group elements for j = 1


The subspace E(j = 1) is of dimension 2j+1 = 3, so it is of the type C³. This is the only irreducible representation that acts on a three-dimensional vector space. Further, although these matrices act on a complex three-dimensional space, we can find a similarity transformation that leads to real orthogonal matrices, whose action can be restricted to the real Euclidean three-dimensional space, which is precisely the vector space on which SO(3) was originally defined. Thus, it is not a surprise to find that the matrix representation of the group elements for j = 1 is equivalent to the matrix representation that defined the group SO(3), Eq. (15.17), in the y-convention¹⁶. So j = 1 defines a faithful single-valued representation of SO(3) in three dimensions. For this reason the representation of SO(3) associated with j = 1 is usually called the vector representation.

15.10.1 Matrix representations of the generators (j = 1)


Since the dimension of E (j = 1) is three, the matrix representations of the generators are 3 × 3. We shall order the basis in
the form {|1, 1i , |1, 0i , |1, −1i}, so that m1 = 1, m2 = 0, m3 = −1.
Let us calculate for example the representation of J₂ by using (15.83). This equation shows that the diagonal elements vanish, as well as those in which the indices differ by more than one. Therefore

(J₂)₁₁^(1) = (J₂)₂₂^(1) = (J₂)₃₃^(1) = (J₂)₁₃^(1) = (J₂)₃₁^(1) = 0
for the remaining elements we use (15.83) with j = 1

⟨1, mᵢ| J₂ |1, mⱼ⟩ = (1/2i) [√(1(1+1) − mⱼ(mⱼ+1)) δ_{mᵢ,mⱼ+1} − √(1(1+1) − mⱼ(mⱼ−1)) δ_{mᵢ,mⱼ−1}]

⟨1, mᵢ| J₂ |1, mⱼ⟩ = (1/2i) [√(2 − mⱼ(mⱼ+1)) δ_{mᵢ,mⱼ+1} − √(2 − mⱼ(mⱼ−1)) δ_{mᵢ,mⱼ−1}]
Further, taking into account that the matrix associated with J₂ is purely imaginary and antisymmetric in the Condon-Shortley convention, we only have to calculate two elements

(J₂)₁₂^(1) = ⟨1, m₁| J₂ |1, m₂⟩ = ⟨1, 1| J₂ |1, 0⟩ = (1/2i) [√2 δ_{1,0+1} − √2 δ_{1,0−1}] = (1/2i) √2

(J₂)₁₂^(1) = −i/√2 = −(J₂)₂₁^(1)

(J₂)₂₃^(1) = ⟨1, m₂| J₂ |1, m₃⟩ = ⟨1, 0| J₂ |1, −1⟩ = (1/2i) [√(2 − (−1)((−1)+1)) δ_{0,−1+1} − √(2 − (−1)((−1)−1)) δ_{0,−1−1}]

(J₂)₂₃^(1) = (1/2i) √2  ⇒  (J₂)₂₃^(1) = −i/√2 = −(J₂)₃₂^(1)
the matrix becomes

(J₂)^(1) = (1/√2) [ 0 −i 0 ; i 0 −i ; 0 i 0 ]
the other matrices are obtained similarly, and they are

(J₁)^(1) = (1/√2) [ 0 1 0 ; 1 0 1 ; 0 1 0 ] ;  (J₂)^(1) = (1/√2) [ 0 −i 0 ; i 0 −i ; 0 i 0 ]     (15.99)

(J₃)^(1) = [ 1 0 0 ; 0 0 0 ; 0 0 −1 ] ;  (J²)^(1) = 2 [ 1 0 0 ; 0 1 0 ; 0 0 1 ]     (15.100)

(J₊)^(1) = [ 0 √2 0 ; 0 0 √2 ; 0 0 0 ] ;  (J₋)^(1) = [ 0 0 0 ; √2 0 0 ; 0 √2 0 ]     (15.101)
¹⁶ Note that all our rotation formulas depend on X₃, X₂ and J₃, J₂ (for instance Eqs. 15.38, 15.59). In the x-convention they would be in terms of X₃, X₁ and J₃, J₁. Similarly, in the x-convention, the reduced matrix d^(j)(θ) of Eq. (15.85) would be in terms of J₁ instead of J₂.

It is clear that these are the matrix representations of the generators in the canonical basis {|j, m⟩}. The same generators had already been calculated in a cartesian basis {eᵢ}, see Eqs. (15.50, 15.51) page 265. The set of matrices in Eq. (15.50) is equivalent to the set in Eqs. (15.99, 15.100). To find the similarity transformation, we note that J₃ is diagonal in the canonical basis. Hence, by finding the similarity transformation that diagonalizes J₃ of Eq. (15.50) into the form (15.100), we find the similarity transformation connecting all the generators. The characteristic polynomial of the matrix J₃ in Eq. (15.50) yields

λ(λ² − 1) = 0

the eigenvalues and their corresponding normalized eigenvectors are

λ± = ±1 ⇔ u± = (1/√2) [ ∓1 ; −i ; 0 ] ;  λ₀ = 0 ⇔ u₀ = (1/√2) [ 0 ; 0 ; √2 ] = [ 0 ; 0 ; 1 ]

denoting J_k (J_k^(1)) the matrix representation in the cartesian (canonical) basis, the similarity transformation gives¹⁷

J_k^(1) = S⁻¹ J_k S ;  S = (1/√2) [ −1 0 1 ; −i 0 −i ; 0 √2 0 ] ;  S⁻¹ = S† = (1/√2) [ −1 i 0 ; 0 0 √2 ; 1 i 0 ]     (15.102)

where S is a unitary complex matrix that emphasizes the fact that the diagonalization must be done on the complex vector
space C3 . The relation between the canonical and cartesian bases is obtained by comparing Eq. (15.102) with Eq. (3.18) and
using Eq. (3.12)

[ |+⟩ ; |0⟩ ; |−⟩ ] = S̃ [ e₁ ; e₂ ; e₃ ] = (1/√2) [ −1 −i 0 ; 0 0 √2 ; 1 −i 0 ] [ e₁ ; e₂ ; e₃ ]

from which we obtain

|±⟩ = (1/√2)(∓e₁ − i e₂) ;  |0⟩ = e₃     (15.103)
2
where the RHS of these equations are precisely the expressions for the eigenvectors of J3 written in the cartesian basis.

15.10.2 Matrix representations of the group elements (j = 1)


The d^(1)(θ) matrix is obtained from Eqs. (15.85, 15.99) and reads

d^(1)(θ) = [ (1+cos θ)/2  −sin θ/√2  (1−cos θ)/2 ;  sin θ/√2  cos θ  −sin θ/√2 ;  (1−cos θ)/2  sin θ/√2  (1+cos θ)/2 ]     (15.104)

and from the first of Eqs. (15.85) we obtain D(1) (φ, θ, ψ). Since we have shown that the set of generators in Eqs. (15.99,
15.100) is equivalent to the set of generators in Eq. (15.50), the matrix representations for the group elements generated by
them must also be equivalent. Therefore, the matrix representations of the group elements D(1) (φ, θ, ψ) (in the canonical basis)
are equivalent to the defining matrices of rotations (in the cartesian basis) in three dimensions Eq. (15.17), as anticipated.
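As a cross-check of Eq. (15.104), one can exponentiate the generator (J₂)^(1) of Eq. (15.99) directly. The sketch below (ours, not part of the notes) confirms the closed form and that d^(1)(θ) is real orthogonal, as expected in the Condon-Shortley convention.

import numpy as np
from scipy.linalg import expm

J2 = np.array([[0, -1j, 0],
               [1j,  0, -1j],
               [0,  1j,  0]]) / np.sqrt(2)      # (J2)^(1) of Eq. (15.99)

theta = 1.2
d1 = expm(-1j * theta * J2)                     # d^(1)(theta), cf. Eq. (15.85)

c, s = np.cos(theta), np.sin(theta)
d1_closed = np.array([[(1 + c) / 2, -s / np.sqrt(2), (1 - c) / 2],
                      [s / np.sqrt(2),            c, -s / np.sqrt(2)],
                      [(1 - c) / 2,  s / np.sqrt(2), (1 + c) / 2]])
assert np.allclose(d1, d1_closed)               # matches Eq. (15.104)
assert np.allclose(d1.imag, 0)                  # real orthogonal matrix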

15.11 Summary
It can be verified that the matrix representations constructed for j = 0, 1/2, 1 obey the commutation rules (15.60). It can also be verified that the eigenvalues of the matrices (Jᵢ)^(j) are equal for i = 1, 2, 3. For (Jᵢ)^(1/2) they are given by ±1/2, while for (Jᵢ)^(1) they are given by +1, 0, −1. Summarizing, all characteristics of the angular momentum operators must be reproduced by the matrices calculated above for each representation.

15.12 Some features of the irreducible representations of the SO (3) group


We have discussed irreducible representations of the Lie algebra of SO(3), which lead to irreducible representations of the group. We have seen that the representations with j = 0, 1 are single-valued while the representation with j = 1/2 is double-valued. We generalize these results in the following theorem
¹⁷ Remember that, strictly speaking, the juxtaposition of column eigenvectors gives the inverse of the matrix of transformation (and not the matrix itself), as can be seen in Eq. (3.54), page 46.



Theorem 15.10 The irreducible representations of the Lie algebra of SO (3) (theorem 15.7), when applied to the group, belong
to two distinct categories: (i) When j is a non-negative integer, the representations are single-valued. (ii) when j is a positive
half-odd-integer the representations are double-valued.

Proof : We follow a procedure similar to the one that led us to Eq. (15.98). From Eq. (15.59) we see that the rotation
R3 (2π) can be obtained by setting φ = θ = 0 and ψ = 2π. Using this setting and Eq. (15.85), we have

D^(j)[R₃(2π)]^{m′}_m = D^(j)(e^{−2πiJ₃})^{m′}_m = e^{−i0·m′} d^(j)(0)^{m′}_m e^{−i2πm} = δ^{m′}_m e^{−2mπi} = δ^{m′}_m e^{i2(j−m)π} e^{−i2πj}

now we remember that m is integer (half-odd-integer) if and only if j is integer (half-odd-integer); therefore j − m is a non-negative integer, so that

D^(j)[R₃(2π)]^{m′}_m = δ^{m′}_m e^{−i2πj} = δ^{m′}_m (e^{−iπ})^{2j} = (−1)^{2j} δ^{m′}_m     (15.105)

and since R_n(2π) = R R₃(2π) R⁻¹ for some R, the above result implies D^(j)[R_n(2π)] = (−1)^{2j} E for all n. QED.
The existence of double-valued representations, but not of other multi-valued representations, is tightly related to the double-connectedness of the group manifold discussed in Sec. 15.2. We should take into account that the matrix representations were derived from the Lie algebra, which in turn depends on the group structure in the vicinity of the identity. Hence, there is no control on the global behavior of the matrix elements. For instance, whether the matrices D^(j)[R(2π)], D^(j)[R(4π)], etc. coincide with D[R(0)] = E is a feature that must be checked after all possible representations of the Lie algebra are found.
Notice that multi-valued representations of all orders appear in SO(2), while only single-valued and double-valued representations appear in SO(3). Once again, this is related to the multiple-connectedness of the manifold of each group. We shall see later that there is a one-to-two mapping between SO(3) and the group SU(2), and that all irreducible representations of SO(3) found above correspond to single-valued representations of SU(2). For this reason SU(2) is called the covering group of SO(3).
In classical mechanics, systems possessing rotational symmetry are related to single-valued representations only, while in quantum mechanics fermion systems (systems with half-odd-integer spin) are associated with double-valued representations of SO(3) (through their wave functions), and boson systems (with integer spin) are associated with single-valued representations.

15.13 Direct product representations of SO (3) and their reduction


15.13.1 Properties of the direct product representations of SO (3)
We studied the reduction of direct product representations for any group in Sec. 8.1. We shall apply such a general analysis
to the group SO (3). This analysis is important for at least two reasons (a) The reduction of direct product representations
of SO (3) appears frequently in Physics applications (for example the coupling of two or more angular momenta), and (b)
The reduction of direct product representations provides an alternative method to obtain higher dimensional irreducible
representations from “fundamental” representations of lower dimension.

Let D^(j) and D^(j′) be two irreducible representations of SO(3) on vector spaces V and V′. The product representation D^(j×j′) on the vector space V ⊗ V′ is a (2j+1) × (2j′+1)-dimensional representation. The natural basis of V ⊗ V′ consists of the tensor products of the bases of V and V′

|m, m′⟩ ≡ |j, m⟩ ⊗ |j′, m′⟩     (15.106)

where we have omitted the symbols jj ′ in the basis {|m, m′ i} since they are fixed. Equation (8.4) shows the definition of the
matrix form of the product representation

U(R) |m, m′⟩ = |n, n′⟩ D^(j)(R)ⁿ_m D^(j′)(R)^{n′}_{m′}     (15.107)

We showed in Sec. 8.1 that Eq. (8.4) (and hence Eq. 15.107), provides a representation of the given group (SO (3) in this
case). It can be shown that if j + j ′ is integer (half-odd-integer), the representation is single-valued (double-valued). Further,
unless either j or j ′ is null, the product representation is reducible.

Example 15.1 Let us study the product representation D(1/2×1/2) . We denote the four basis vectors in the product vector
space as
V (1/2) ⊗ V (1/2) → {|++i , |+−i , |−+i , |−−i}
we can show that the vector
|ai ≡ |+−i − |−+i

which is totally antisymmetric in the two indices, is invariant under rotations18 . Applying U (R) on |ai and using Eq. (15.107)
U(R)|a⟩ = U(R)[|+−⟩ − |−+⟩] = |n, n′⟩ D^(1/2)(R)^n_+ D^(1/2)(R)^{n′}_− − |n, n′⟩ D^(1/2)(R)^n_− D^(1/2)(R)^{n′}_+
= |++⟩ D^(1/2)(R)^+_+ D^(1/2)(R)^+_− + |+−⟩ D^(1/2)(R)^+_+ D^(1/2)(R)^−_−
+ |−+⟩ D^(1/2)(R)^−_+ D^(1/2)(R)^+_− + |−−⟩ D^(1/2)(R)^−_+ D^(1/2)(R)^−_−
− |++⟩ D^(1/2)(R)^+_− D^(1/2)(R)^+_+ − |+−⟩ D^(1/2)(R)^+_− D^(1/2)(R)^−_+
− |−+⟩ D^(1/2)(R)^−_− D^(1/2)(R)^+_+ − |−−⟩ D^(1/2)(R)^−_− D^(1/2)(R)^−_+

the terms associated with |++⟩ and |−−⟩ cancel each other so that

U(R)|a⟩ = |+−⟩ [D^(1/2)(R)^+_+ D^(1/2)(R)^−_− − D^(1/2)(R)^+_− D^(1/2)(R)^−_+]
        − |−+⟩ [D^(1/2)(R)^−_− D^(1/2)(R)^+_+ − D^(1/2)(R)^−_+ D^(1/2)(R)^+_−]

U(R)|a⟩ = [|+−⟩ − |−+⟩] [D^(1/2)(R)^+_+ D^(1/2)(R)^−_− − D^(1/2)(R)^+_− D^(1/2)(R)^−_+]

U(R)|a⟩ = |a⟩ det D^(1/2)(R) = |a⟩     (15.108)

therefore |ai spans a one-dimensional subspace invariant under SO (3) and D(1/2×1/2) contains the irreducible representation
D0 at least once. Note that Eq. (15.108) confirms that the representation associated with j = 0 is the identity representation.
We shall see later that
D(1/2×1/2) = D(0) ⊕ D(1)
where D(1) is spanned by the three normalized totally symmetric vectors
|++⟩ ,  (1/√2)(|+−⟩ + |−+⟩) ,  |−−⟩
In order to study the general reduction of direct products, it is necessary to establish the relation between the generators J_n^(j) and J_n^(j′) and the generators J_n^(j×j′) of the direct product representation. We start from the following theorem

Theorem 15.11 Let J(1) and J(2) be two commuting angular momenta. The sum of these angular momenta is also an angular
momentum.

Proof : Let J(1) and J(2) be two commuting arbitrary angular momenta, we shall show that the sum of them

J ≡ J(1) + J(2)

is also an angular momentum. Since each J(α) is an angular momentum, we see that
[Jᵢ^(1), Jⱼ^(1)] = iε_{ijk} J_k^(1) ;  [Jᵢ^(2), Jⱼ^(2)] = iε_{ijk} J_k^(2)

so we have

[Jᵢ, Jⱼ] = [Jᵢ^(1) + Jᵢ^(2), Jⱼ^(1) + Jⱼ^(2)] = [Jᵢ^(1), Jⱼ^(1) + Jⱼ^(2)] + [Jᵢ^(2), Jⱼ^(1) + Jⱼ^(2)]

[Jᵢ, Jⱼ] = [Jᵢ^(1), Jⱼ^(1)] + [Jᵢ^(1), Jⱼ^(2)] + [Jᵢ^(2), Jⱼ^(1)] + [Jᵢ^(2), Jⱼ^(2)]

since the angular momenta J^(1) and J^(2) commute with each other, we have

[Jᵢ, Jⱼ] = [Jᵢ^(1), Jⱼ^(1)] + [Jᵢ^(2), Jⱼ^(2)] = iε_{ijk} J_k^(1) + iε_{ijk} J_k^(2) = iε_{ijk} (J_k^(1) + J_k^(2))

[Jᵢ, Jⱼ] = iε_{ijk} J_k ;  J_k ≡ J_k^(1) + J_k^(2)

QED.
As a consequence, all properties of the angular momenta are also valid for the sum of two commuting angular momenta.
¹⁸ Since V^(1/2) ⊗ V^(1/2) is the direct product of two identical two-dimensional spaces, this is a space of the form V₂² according to the notation of chapter 13. We can also see from theorem 13.5, page 235, that |a⟩ is the only linearly independent totally antisymmetric tensor and that there are three linearly independent totally symmetric tensors. Finally, the same theorem says that the totally antisymmetric tensor plus the three linearly independent totally symmetric tensors form a basis for this space.

Corollary 15.12 If J(1) is an angular momentum on a vector space V1 and J(2) is an angular momentum on a space V2 , the
sum of them defined on the product space V1 ⊗ V2 is also an angular momentum.

Proof : Since J(1) and J(2) are defined on different spaces, we can form the sum (of the extended operators) as a new
operator on the product space. But J(1) and J(2) commute each other on the product space, since they originally belong to
different subspaces. Hence, the sum is an angular momentum by virtue of theorem 15.11. QED.

Theorem 15.13 The generators of a direct product representation are the sums of the generators of its constituent represen-
tations, that is
J_n^(j×j′) = J_n^(j) ⊗ E^(j′) + E^(j) ⊗ J_n^(j′)     (15.109)

we can simplify the notation to write

J_n^(j×j′) = J_n^(j) + J_n^(j′)     (15.110)

Proof: Let us consider the representation of an infinitesimal rotation around an arbitrary axis n.
U^(j)[R_n(dψ)] ⊗ U^(j′)[R_n(dψ)] = U^(j×j′)[R_n(dψ)]     (15.111)

to first order in dψ the LHS of Eq. (15.111) is given by

[E^(j) − i dψ J_n^(j)] ⊗ [E^(j′) − i dψ J_n^(j′)] = E^(j) ⊗ E^(j′) − i dψ [E^(j) ⊗ J_n^(j′) + J_n^(j) ⊗ E^(j′)]

on the other hand, the RHS of Eq. (15.111) is by definition

U^(j×j′)[R_n(dψ)] = E^(j×j′) − i dψ J_n^(j×j′)

comparing the last two equations we obtain Eq. (15.109). Further, corollary 15.12 guarantees that J_n^(j×j′) is also an angular momentum component. QED.
Owing to the result displayed in theorem 15.13, the direct product representations are characterized by the way in which
the generators (angular momenta) are added. Therefore, the characterization of direct product representations of SO (3) is
also expressed as the addition of angular momenta.
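Eq. (15.109) translates directly into Kronecker products. The sketch below (ours; it reuses the hypothetical angular_momentum_matrices helper from the earlier sketch) builds the product-space generators for j = j′ = 1/2 and checks that J₃ is diagonal with eigenvalues m + m′ and that the sum is again an angular momentum.

import numpy as np

def product_generators(Js_a, Js_b):
    # J_n^(j x j') = J_n^(j) (x) E^(j') + E^(j) (x) J_n^(j'), Eq. (15.109)
    Ea = np.eye(Js_a[0].shape[0])
    Eb = np.eye(Js_b[0].shape[0])
    return [np.kron(Ja, Eb) + np.kron(Ea, Jb) for Ja, Jb in zip(Js_a, Js_b)]

half = angular_momentum_matrices(0.5)           # (J1, J2, J3) for j = 1/2
J1, J2, J3 = product_generators(half, half)

# J3 on the product space is diagonal with eigenvalues m + m' = 1, 0, 0, -1
assert np.allclose(J3, np.diag([1, 0, 0, -1]))
# the sum of two commuting angular momenta is an angular momentum: [J1, J2] = i J3
assert np.allclose(J1 @ J2 - J2 @ J1, 1j * J3)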

15.13.2 Reduction of the direct product representation


Here we only outline the procedure, the reader interested in details can go to appendix B. The reduction of the direct product
representation will be done by taking the natural basis of Eq. (15.106) as a starting point.

{|m, m′ i} ≡ {|j, mi ⊗ |j ′ , m′ i} (15.112)

We shall regroup the basis (15.112) to form invariant subspaces by using generators of the direct product space. The procedure
is similar to the one used to generate irreducible representations as described in Sec. 15.5. First of all, from theorem 15.13,
Eq. (15.109) we see that |m, m′⟩ is an eigenvector of J₃ (the generator on the product space)

J₃ |m, m′⟩ = [J₃^(j) ⊗ E^(j′) + E^(j) ⊗ J₃^(j′)] [|j, m⟩ ⊗ |j′, m′⟩]
           = J₃^(j)|j, m⟩ ⊗ E^(j′)|j′, m′⟩ + E^(j)|j, m⟩ ⊗ J₃^(j′)|j′, m′⟩
           = m |j, m⟩ ⊗ |j′, m′⟩ + |j, m⟩ ⊗ m′|j′, m′⟩ = [|j, m⟩ ⊗ |j′, m′⟩] (m + m′)

J₃ |m, m′⟩ = |m, m′⟩ (m + m′)

since −j ≤ m ≤ j and −j ′ ≤ m′ ≤ j ′ , the highest value of J3 is Mmax = j + j ′ . There is only one vector corresponding to
Mmax which is |m, m′ i = |j, j ′ i. For the next-to-highest eigenvalue M = j + j ′ − 1, there are two associated eigenvectors:
|j − 1, j ′ i and |j, j ′ − 1i. The general situation is described in Fig. 15.4, where each point represents a basis eigenvector, and
those vectors with the same eigenvalue M of J3 are connected by dashed lines.
We have already shown that irreducible representations of SO(3) are constructed from eigenvectors common to one generator (usually J₃) and the Casimir operator J². Therefore, from the total angular momentum operator J ≡ J^(1) + J^(2), we shall construct eigenvectors of {J², J₃} with eigenvalues {J(J+1), M} as defined in Eqs. (15.74, 15.75). In this way, a basis of a given irreducible invariant subspace associated with the (J)-representation is given by the kets

{|J, M⟩ ;  M = −J, −J+1, . . . , J−1, J}

Figure 15.4: (a) Illustration of the vector addition of angular momenta in the general case. (b) Pairs of possible values (m, m′ )
for the specific case j = 2, j ′ = 1. In both cases, the points associated with a given value M = m + m′ are located on a straight
line of slope −1 depicted by dash lines.

therefore our task is to link the natural basis {|m, m′ i} with the basis {|J, M i} that generates the irreducible invariant
subspaces.
We start by observing that, since the state with M = j + j′ is unique, it must be the highest member of an irreducible basis of the (J)-representation with J = j + j′

|J = j + j′, M = j + j′⟩ = |j, j′⟩     (15.113)
where the RHS corresponds to the notation for the original basis |m, m′ i = |j, j ′ i, so for J = M = j + j ′ the old eigenvector
coincides with the new one (since there is only one vector with M = j + j ′ ). It can be verified that this is an eigenvector of J 2
with eigenvalue (j + j ′ ) (j + j ′ + 1). To do this, we use Eq. (15.70) of page 269, to write
[J^(j×j′)]² = [J₃^(j×j′)]² + J₃^(j×j′) + J₋^(j×j′) J₊^(j×j′)

operating J² = [J^(j) + J^(j′)]² on the eigenvector

|jj⟩ ⊗ |j′j′⟩ = |j, j′⟩ = |J = j + j′, M = j + j′⟩

we find

J² [|jj⟩ ⊗ |j′j′⟩] = {[J₃^(j×j′)]² + J₃^(j×j′) + J₋^(j×j′) J₊^(j×j′)} [|jj⟩ ⊗ |j′j′⟩]

J² [|jj⟩ ⊗ |j′j′⟩] = [J₃^(j) ⊗ E^(j′) + E^(j) ⊗ J₃^(j′)][J₃^(j) ⊗ E^(j′) + E^(j) ⊗ J₃^(j′)] [|jj⟩ ⊗ |j′j′⟩]
                   + [J₃^(j) ⊗ E^(j′) + E^(j) ⊗ J₃^(j′)] [|jj⟩ ⊗ |j′j′⟩] + J₋^(j×j′) J₊^(j×j′) [|jj⟩ ⊗ |j′j′⟩]

but J₋^(j×j′) J₊^(j×j′) [|jj⟩ ⊗ |j′j′⟩] = 0 since this is the vector with the highest value of M. Then

J² [|jj⟩ ⊗ |j′j′⟩] = [J₃^(j) ⊗ E^(j′) + E^(j) ⊗ J₃^(j′)] [J₃^(j)|jj⟩ ⊗ |j′j′⟩ + |jj⟩ ⊗ J₃^(j′)|j′j′⟩]
                   + [J₃^(j)|jj⟩ ⊗ |j′j′⟩ + |jj⟩ ⊗ J₃^(j′)|j′j′⟩]
                   = [J₃^(j) ⊗ E^(j′) + E^(j) ⊗ J₃^(j′)] (j + j′) [|jj⟩ ⊗ |j′j′⟩] + (j + j′) [|jj⟩ ⊗ |j′j′⟩]

J² [|jj⟩ ⊗ |j′j′⟩] = [(j + j′)² + (j + j′)] [|jj⟩ ⊗ |j′j′⟩]

J² [|jj⟩ ⊗ |j′j′⟩] = [(j + j′)(j + j′ + 1)] [|jj⟩ ⊗ |j′j′⟩]

we can now generate the |J, M⟩ vectors with M = j + j′ − 1, . . . , −j − j′ by repeated application of J₋, e.g.

|J = j + j′, M = j + j′ − 1⟩ √(2(j + j′)) = J₋ |J = j + j′, M = j + j′⟩
                                          = J₋ |j, j′⟩ = |j − 1, j′⟩ √(2j) + |j, j′ − 1⟩ √(2j′)     (15.114)

the vectors on the first line belong to the new basis |J, M i while the ones on the second line belong to the original natural
basis19 of the product space |m, m′ i. The 2 (j + j ′ ) + 1 vectors generated in this way span an invariant subspace associated
with J = j + j ′ .
Now, there are two linearly independent vectors associated with M = j + j ′ − 1. Nevertheless, one of them already
appeared in the J = j + j ′ invariant subspace as can be seen in Eq. (15.114). Therefore, we are left with a unique state,
orthogonal to the former, which must have J = j + j ′ − 1 by the same reasoning as before. We denote this vector as
|J = j + j ′ − 1, M = j + j ′ − 1i. An invariant subspace associated with J = j + j ′ − 1, is generated by succesive application
of J− on the latter vector, obtaining 2 (j + j ′ − 1) + 1 = 2 (j + j ′ ) − 1 vectors.
Further, it can be shown that the set of eigenvalues of J2 given by J (J + 1) are such that J can take the values

j + j ′ , j + j ′ − 1, j + j ′ − 2, . . . , |j − j ′ |

so that the process can be repeated for smaller values of J in unit steps, until we arrive at J = |j − j′|, for which the invariant subspace is of dimension 2|j − j′| + 1. Even if we agree to normalize the vectors in each step of the process, we can still choose an arbitrary phase factor e^{iα} with α a real number; by fixing α we are choosing a convention. Note that the counting of independent basis states works out correctly, as can be seen either from Fig. 15.4a, or by noting that²⁰

[2(j + j′) + 1] + [2(j + j′ − 1) + 1] + . . . + [2|j − j′| + 1] = (2j + 1)(2j′ + 1)

By construction the new basis {|J, M i ; M = −J, . . . , J; J = |j − j ′ | , . . . , |j + j ′ |} is also orthonormal. Thus the matrix of
transformation from the old basis to the new one must be unitary. As we saw in Sec. 8.1, the elements of such a matrix are
the so-called Clebsch-Gordan coefficients

|J, M⟩ = |m, m′⟩ ⟨mm′(jj′)JM⟩ ;  |m, m′⟩ = |J, M⟩ ⟨JM(jj′)mm′⟩     (15.115)

⟨JM(jj′)mm′⟩ = ⟨mm′(jj′)JM⟩*  (unitarity condition)     (15.116)

there is a sum over repeated indices. The constructive process described above allows us to determine the Clebsch-Gordan coefficients up to a common phase factor for each invariant subspace; that is, the phase α could depend on J. Using the Condon-Shortley convention we find

⟨mm′(jj′)JM⟩ are real

⟨j, J − j (jj′) JJ⟩ are positive for all j, j′, J

and the unitarity condition in the Condon-Shortley convention becomes

⟨JM(jj′)mm′⟩ = ⟨mm′(jj′)JM⟩

Though the Condon-Shortley convention is nearly universal, there is (unfortunately) a great variety of notations for the Clebsch-Gordan coefficients, such as

hJM |jj ′ , mm′ i , hJM |jm, j ′ m′ i , C (JM ; jm, j ′ , m′ ) , C (Jjj ′ ; M mm′ ) etc.

the notation adopted here has the advantage of being symmetric and provides a clear distinction between unsummed labels
[jj ′ ] and summation indices [JM and mm′ ].

Example 15.2 Let us return to example 15.1 to examine the reduction of D(1/2×1/2) . From our general discussion the state
|++i corresponds to J = M = 1
|++i = |1, 1i
applying Eq. (15.114) we obtain

|J = 1/2 + 1/2, M = 1/2 + 1/2 − 1⟩ √(2(1/2 + 1/2)) = J₋ |J = 1/2 + 1/2, M = 1/2 + 1/2⟩

¹⁹ The last equality in Eq. (15.114) comes from the fact that J₋^(j×j′) = J₋^(j) ⊗ E^(j′) + E^(j) ⊗ J₋^(j′).
²⁰ It is very important to keep in mind that in the old notation the quantum numbers refer to m and m′, because j and j′ are fixed in the process described here. For instance, |j, −j′⟩ means |m = j, m′ = −j′⟩.



General notation SO (3) notation Description


µ j The first factor irred. representation
ν j′ The second factor irred. representation
λ J Irred. repres. contained in the product repres.
|i, ji ≡ |wk i |m, m′ i Original basis (decoupling basis)
|α, λ, li |J, M i New basis (coupling basis)
Table 15.1: Table of translation from the language of the general theory of reduction of product representations (Sec. 8.1), into
the language of SO (3) in the present section.

= J₋ |m = 1/2, m′ = 1/2⟩ = |1/2 − 1, 1/2⟩ √(2 · 1/2) + |1/2, 1/2 − 1⟩ √(2 · 1/2)

so that

|J = 1; M = 0⟩ √2 = |m = −1/2, m′ = 1/2⟩ + |m = 1/2, m′ = −1/2⟩

|1, 0⟩ = (1/√2) [|−, +⟩ + |+, −⟩]

in a similar way from J− |1, 0i we obtain


|1, −1⟩ √2 = |−−⟩ √2  ⇒  |1, −1⟩ = |−−⟩

so we have a 3-dimensional invariant subspace spanned by the following set of three vectors

{|J = 1, M⟩ ;  M = 1, 0, −1} ≡ {|1, 1⟩, |1, 0⟩, |1, −1⟩} = { |++⟩ ,  (1/√2)[|−+⟩ + |+−⟩] ,  |−−⟩ }

in which all elements are totally symmetric in the two indices (+, −). We usually call them a (symmetric) triplet. The remaining dimension must lead to a one-dimensional irreducible representation (j = 0), which must correspond to the identity representation. According to example 15.1, the one-dimensional identity representation is generated by the vector [|+−⟩ − |−+⟩]/√2, which we call the (antisymmetric) singlet. Note that this antisymmetric singlet is the only vector orthogonal to |1, 0⟩ in the J₃ = 0 subspace. We decompose the product representation as

D(1/2×1/2) = D(1) ⊕ D(0) (15.117)

Another common notation in the literature labels the representations by their dimensions, so D^(1/2) is denoted by 2, D^(0) by 1 and D^(1) by 3; then we write Eq. (15.117) as

2⊗2 = 3⊕1 (15.118)

in this notation the concept of singlet and triplet appears more directly.
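The coefficients obtained in this example can be cross-checked against tabulated values. As an illustration (ours, not part of the notes), sympy provides Clebsch-Gordan coefficients in the Condon-Shortley convention:

from sympy import S, sqrt
from sympy.physics.quantum.cg import CG

half = S(1) / 2
# <1/2,1/2; 1/2,-1/2 | J M> for the triplet |1,0> and the singlet |0,0>
assert CG(half, half, half, -half, 1, 0).doit() == sqrt(2) / 2
assert CG(half, half, half, -half, 0, 0).doit() == sqrt(2) / 2
# the other M = 0 coefficient of the singlet carries the opposite sign,
# reproducing |0,0> = (|+-> - |-+>)/sqrt(2)
assert CG(half, -half, half, half, 0, 0).doit() == -sqrt(2) / 2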

15.13.3 Clebsch-Gordan coefficients for SO (3)


A detailed treatment of the Clebsch-Gordan coefficients of SO (3) and their properties is given in appendix C. Here we
only provide an overview of these properties. Some of the features of the Clebsch-Gordan coefficients come from the general
treatment of the reduction of product representations given in Sec. 8.1 while others are specific for the SO (3) group i.e. for
the algebra of angular momentum. In order to translate general results of Sec. 8.1 to the particular case of SO (3) we translate
in table 15.1 from the general language of Sec. 8.1 into the language of SO (3) developed in this section.
The Clebsch-Gordan coefficients for SO (3) are tabulated in most books on the rotation group. There are several methods
to calculate them, different from the constructive algorithm described here. Incidentally, we have calculated some of the C-G
coefficients when going from Eq. (15.113) to Eq. (15.114), they are

⟨j, j′ (jj′) j + j′, j + j′⟩ = 1

⟨j − 1, j′ (jj′) j + j′, j + j′ − 1⟩ = [j/(j + j′)]^{1/2}

⟨j, j′ − 1 (jj′) j + j′, j + j′ − 1⟩ = [j′/(j + j′)]^{1/2}

some properties of the Clebsch-Gordan coefficients of SO (3) are the following



1. Angular momentum selection rule

⟨mm′(jj′)JM⟩ = 0  unless

m + m′ = M  and  |j − j′| ≤ J ≤ j + j′     (15.119)

This comes directly from the fundamental properties of the algebra of angular momentum, and the rules of addition of
angular momenta.
2. Orthogonality and completeness
Σ_{m,m′} ⟨JM(jj′)mm′⟩ ⟨mm′(jj′)J′M′⟩ = δ_{JJ′} δ_{MM′}

Σ_{J,M} ⟨mm′(jj′)JM⟩ ⟨JM(jj′)nn′⟩ = δ^m_n δ^{m′}_{n′}

they come from the fact that they are the coefficients of a unitary matrix, and for any unitary matrix all rows (and
all columns) are orthonormal each other. This property is satisfied by C-G coefficients of any group as can be seen in
theorem 8.2 on page 158. It is because the unitarity of the matrix is required for any group under study.
3. Symmetry relations
⟨mm′(jj′)JM⟩ = (−1)^{j+j′−J} ⟨m′m(j′j)JM⟩
             = (−1)^{j+j′−J} ⟨−m, −m′ (jj′) J, −M⟩
             = (−1)^{j−J+m} ⟨M, −m′ (Jj′) jm⟩ [(2J + 1)/(2j + 1)]^{1/2}     (15.120)

these symmetry relations are more apparent in the form of the Wigner three-j symbols

( j  j′  J ;  m  m′  −M ) = (−1)^{j−j′+M} ⟨mm′(jj′)JM⟩ / √(2J + 1)

and Eq. (15.120) is equivalent to saying that the three-j symbols are invariant under the following changes

(a) Cyclic permutation of the three columns.

(b) Simultaneous change of sign of the three elements of the bottom row and multiplication of the coefficient by (−1)^{j+j′+J}.

(c) Transposition of any two columns and multiplication by (−1)^{j+j′+J}.
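These symmetries can be checked directly with sympy's Wigner three-j function (a sketch of ours, for illustration):

from sympy import Rational
from sympy.physics.wigner import wigner_3j

j1, j2, j3 = 1, Rational(1, 2), Rational(3, 2)
m1, m2 = 0, Rational(1, 2)
m3 = -(m1 + m2)

base = wigner_3j(j1, j2, j3, m1, m2, m3)
# (a) cyclic permutation of the columns leaves the symbol unchanged
assert wigner_3j(j2, j3, j1, m2, m3, m1).equals(base)
# (b) flipping the signs of the bottom row multiplies by (-1)^(j1+j2+j3)
assert wigner_3j(j1, j2, j3, -m1, -m2, -m3).equals((-1)**(j1 + j2 + j3) * base)
# (c) transposing two columns multiplies by (-1)^(j1+j2+j3)
assert wigner_3j(j2, j1, j3, m2, m1, m3).equals((-1)**(j1 + j2 + j3) * base)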

We have displayed the rotation of the basis from the original one to the one that spans the irreducible invariant subspaces. This reduction translates into the full reduction of the matrix representatives into block-diagonal form. To see the form of the reduction we apply U(R) to the second of Eqs. (15.115) and make use of Eq. (15.107)

U(R)|nn′⟩ = U(R)|J, N⟩ ⟨JN(jj′)nn′⟩

|k, k′⟩ D^(j)(R)^k_n D^(j′)(R)^{k′}_{n′} = |J, M⟩ D^(J)(R)^M_N ⟨JN(jj′)nn′⟩

⟨mm′|k, k′⟩ D^(j)(R)^k_n D^(j′)(R)^{k′}_{n′} = ⟨mm′|J, M⟩ D^(J)(R)^M_N ⟨JN(jj′)nn′⟩

δ_{mk} δ_{m′k′} D^(j)(R)^k_n D^(j′)(R)^{k′}_{n′} = ⟨mm′(jj′)JM⟩ D^(J)(R)^M_N ⟨JN(jj′)nn′⟩

expressing the final result with explicit sum symbols we have

D^(j)(R)^m_n D^(j′)(R)^{m′}_{n′} = Σ_{J,M,N} ⟨mm′(jj′)JM⟩ D^(J)(R)^M_N ⟨JN(jj′)nn′⟩     (15.121)

this result can be obtained alternatively as a particular case of the general expression (8.25), page 159, valid for any group, making the replacements of table 15.1. The inverse relation can be obtained either by inverting Eq. (15.121) or as a particular case of Eq. (8.26), page 159

δ_{JJ′} D^(J)(R)^M_{M′} = Σ_{mm′} Σ_{nn′} ⟨JM(jj′)mm′⟩ D^(j)(R)^m_n D^(j′)(R)^{m′}_{n′} ⟨nn′(jj′)J′M′⟩     (15.122)
mm′ nn′

By comparing with Eq. (8.26), we observe that in Eq. (15.122) there is no factor of the form δ_{αα}. This owes to the fact that each irreducible representation of SO(3) appears only once in the product representation. Relation (15.122) allows us to construct higher dimensional representations from lower dimensional ones.

15.14 Irreducible spherical tensors and the Wigner-Eckart theorem in SO(3).


15.14.1 Irreducible spherical tensors and its properties
We developed in Sec. 9.3, definition 9.3, the concept of irreducible tensor for any given group. We now specialize the concept
for SO (3)

Definition 15.4 (Irreducible spherical tensor): Let {O^s_λ : λ = −s, . . . , s} be a set of operators which transform under a rotation as

U[R] O^s_λ U[R]⁻¹ = Σ_{λ′=−s}^{s} O^s_{λ′} D^(s)(R)^{λ′}_λ     (15.123)

where D(s) (R) is a matrix associated with the s−irreducible representation of SO (3). Such a set of operators is called an
irreducible spherical tensor of angular momentum s, with respect to SO (3). Individual operators in this set are called spherical
components of the tensor.

Theorem 15.14 (Differential characterization of irreducible spherical tensors): If O^s_λ are components of a spherical tensor, then

[J², O^s_λ] = s(s+1) O^s_λ ;  [J₃, O^s_λ] = λ O^s_λ     (15.124)

[J±, O^s_λ] = √(s(s+1) − λ(λ±1)) O^s_{λ±1}     (15.125)

Proof : Consider an infinitesimal rotation around the k-th axis. The LHS of Eq. (15.123) becomes

U[R] O^s_λ U[R]⁻¹ = (E − i dψ J_k) O^s_λ (E + i dψ J_k) = (O^s_λ − i dψ J_k O^s_λ)(E + i dψ J_k)
                  = O^s_λ − i dψ J_k O^s_λ + i dψ O^s_λ J_k + O(dψ²)

U[R] O^s_λ U[R]⁻¹ = O^s_λ − i dψ [J_k, O^s_λ]     (15.126)

while the RHS of Eq. (15.123) gives


Σ_{λ′=−s}^{s} O^s_{λ′} D^(s)(R)^{λ′}_λ = Σ_{λ′=−s}^{s} O^s_{λ′} [E^{λ′}_λ − i dψ (J_k^(s))^{λ′}_λ] = Σ_{λ′=−s}^{s} O^s_{λ′} [δ^{λ′}_λ − i dψ (J_k^(s))^{λ′}_λ]

Σ_{λ′=−s}^{s} O^s_{λ′} D^(s)(R)^{λ′}_λ = O^s_λ − i dψ Σ_{λ′=−s}^{s} O^s_{λ′} (J_k^(s))^{λ′}_λ     (15.127)

Equating expressions (15.126, 15.127), and using the convention of summation over upper-lower repeated indices

[J_k, O^s_λ] = O^s_{λ′} (J_k^(s))^{λ′}_λ     (15.128)

for k = 3, using the matrix representation of J₃ in Eq. (15.79), we see that Eq. (15.128) becomes

[J₃, O^s_λ] = O^s_{λ′} (J₃^(s))^{λ′}_λ = O^s_{λ′} λ δ^{λ′}_λ = λ O^s_λ     (15.129)

which gives the second of Eqs. (15.124). On the other hand

[J±, O^s_λ] = [J₁ ± iJ₂, O^s_λ] = [J₁, O^s_λ] ± i [J₂, O^s_λ] = O^s_{λ′} (J₁^(s) ± iJ₂^(s))^{λ′}_λ

[J±, O^s_λ] = O^s_{λ′} (J±^(s))^{λ′}_λ     (15.130)

where we have used Eq. (15.128). Now, using the matrix representation of J±, Eq. (15.81), we find

[J±, O^s_λ] = O^s_{λ′} √(s(s+1) − λ(λ±1)) δ_{λ′,λ±1} = O^s_{λ±1} √(s(s+1) − λ(λ±1))

which reproduces Eq. (15.125). By a procedure similar to the one that led to Eqs. (15.128, 15.130) we can show that

[J₊J₋, O^s_λ] = O^s_{λ′} (J₊^(s) J₋^(s))^{λ′}_λ = O^s_{λ′} (J₊^(s))^{λ′}_{λ′′} (J₋^(s))^{λ′′}_λ
              = O^s_{λ′} √(s(s+1) − λ′′(λ′′+1)) δ_{λ′,λ′′+1} √(s(s+1) − λ(λ−1)) δ_{λ′′,λ−1}

[J₊J₋, O^s_λ] = [s(s+1) − λ(λ−1)] O^s_λ     (15.131)

and

[J₃², O^s_λ] = O^s_{λ′} (J₃^(s) J₃^(s))^{λ′}_λ = O^s_{λ′} (J₃^(s))^{λ′}_{λ′′} (J₃^(s))^{λ′′}_λ = O^s_{λ′} [λ′′ δ_{λ′,λ′′}][λ δ_{λ′′,λ}]

[J₃², O^s_λ] = λ² O^s_λ     (15.132)

Now from Eq. (15.70) and applying Eqs. (15.131, 15.129, 15.132) we have

[J², O^s_λ] = [J₃² − J₃ + J₊J₋, O^s_λ] = [J₃², O^s_λ] − [J₃, O^s_λ] + [J₊J₋, O^s_λ] = [λ² − λ + s(s+1) − λ(λ−1)] O^s_λ

[J², O^s_λ] = s(s+1) O^s_λ

which proves the first of Eqs. (15.124). QED.

Example 15.3 An operator invariant under rotations (for instance the Hamiltonian associated with a central potential) com-
mutes with all generators of rotations, so that it constitutes an irreducible spherical tensor corresponding to s = 0.
Example 15.4 The set of operators {J₃, −J₊/√2, J₋/√2} forms the spherical components of a vector (s = 1).
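With the matrices from the earlier sketch one can verify these commutators explicitly. Below is a small check (ours, reusing the hypothetical angular_momentum_matrices helper) for the spherical components O₊₁ = −J₊/√2, O₀ = J₃, O₋₁ = J₋/√2 of the vector operator J itself, at j = 1.

import numpy as np

J1, J2, J3 = angular_momentum_matrices(1)
Jp, Jm = J1 + 1j * J2, J1 - 1j * J2

O = {+1: -Jp / np.sqrt(2), 0: J3, -1: Jm / np.sqrt(2)}   # spherical components, s = 1

def comm(A, B):
    return A @ B - B @ A

# [J3, O_lam] = lam * O_lam, second of Eqs. (15.124)
for lam, Olam in O.items():
    assert np.allclose(comm(J3, Olam), lam * Olam)

# [J+, O_lam] = sqrt(s(s+1) - lam(lam+1)) * O_{lam+1}, Eq. (15.125)
for lam in (-1, 0):
    c = np.sqrt(1 * 2 - lam * (lam + 1))
    assert np.allclose(comm(Jp, O[lam]), c * O[lam + 1])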

15.14.2 The Wigner-Eckart theorem for SO (3)


If a Physical system admits a symmetry group such as SO (3), symmetry operations imply relations between physical observ-
ables (operators) which belong to the same irreducible representation. Therefore, meaningful quantities are associated with
irreducible tensors. If a set of operators {Oλs } transforms according with the (s) −representation, their matrix elements within
irreducible Physical states satisfy the Wigner-Eckart theorem (see Sec. 9.3 Eq. 9.33), which translated into the language of
the SO(3) group (see table 15.1) gives

⟨j′m′| O^s_λ |j, m⟩ = ⟨j′m′(s, j)λm⟩ ⟨j′| O^s |j⟩     (15.133)

where the first factor on the RHS is a Clebsch-Gordan coefficient which is determined by group theory, so that it is independent
of the specific operator Oλs . The term hj ′ | Os |ji which is the “reduced matrix element”, depends on Os (it is the “dynamical”
part) but it is independent of m, m′ and λ. In particular, the independence with λ of the reduced matrix, says that such a
matrix can be calculated with any component of the irreducible tensor (note however that the complete matrix depends on λ
through the Clebsch-Gordan coefficient). Without any specific knowledge of the physical system we can derive the following
properties: (a) Selection rules: The matrix elements vanish unless |j − s| ≤ j ′ ≤ j + s and m′ = λ + m. It comes from the
Clebsch-Gordan properties. (b) The branching ratios involving components of a given irreducible tensor, can be calculated by
purely group-theoretical methods as can be seen from

⟨j′m′| O^s_λ |j, m⟩ / ⟨j′n′| O^s_σ |j, n⟩ = [⟨j′m′(s, j)λm⟩ ⟨j′| O^s |j⟩] / [⟨j′n′(s, j)σn⟩ ⟨j′| O^s |j⟩] = ⟨j′m′(s, j)λm⟩ / ⟨j′n′(s, j)σn⟩

hence such branching ratios are quotients of Clebsch-Gordan coefficients.


An interesting application concerns electromagnetic transitions in atoms (visible light, x-ray) and nuclei (γ−ray). Since
the electromagnetic interaction is invariant under three-dimensional rotations, we can use SO (3) as a symmetry group for
these interactions. The electromagnetic transitions involve emission of a photon of angular momentum (s, λ) while the atomic
or nuclear system jumps from an initial state of angular momentum |j, mi to a final state of angular momentum |j ′ , m′ i. In
each case the first quantum number s (or j or j ′ ) corresponds to the magnitude of the angular momentum while the quantum
number λ (or m or m′ ), corresponds to its projection on the quantization X3 −axis. The quantum number λ takes 2s + 1
values, m takes 2j + 1 values and m′ takes 2j ′ + 1 values. According to quantum mechanics the probability (intensity) of each
transition is proportional to |f |2 with f = hj ′ m′ | Oλs |jmi where Oλs is the “multipole transition operator” for the process. Using
the Wigner-Eckart theorem we see that f = f0 hj ′ m′ (s, j) λmi where hj ′ m′ (s, j) λmi is the Clebsch-Gordan coefficient and f0
is the “reduced matrix element”. By virtue of this separation, all potential transitions depend on only one constant f0 . Let us
assume j = j ′ = s = 1. The C-G coefficient h1, m′ (1, 1) λ, mi vanish unless m′ = λ + m, providing some selection rules. In this
case there are nine possible transitions but only one reduced matrix element must be calculated. Further, the selection rule
permits only seven of the nine transitions originally available. Indeed, space inversion symmetry leads to additional selection
rules.

15.15 Cartesian components of tensor operators


It is sometimes convenient to write vectors and tensors in terms of their cartesian components. In the same way that the
position vector is considered as the model for cartesian vectors, the position operator (as defined in quantum mechanics) is
considered as the model of cartesian vector operators. From this reasoning, we provide the following definition

Definition 15.5 (Vector operators-Cartesian components): Three operators {Al : l = 1, 2, 3} are cartesian components of a
vector operator if they satisfy the following commutation relations with the generators of rotations

[Jm , Ak ] = iεmkp Ap

If we now consider the commutator of J_m with a juxtaposition of two components of vector operators A and B, we obtain

[Jm , Ak1 Bk2 ] = [Jm , Ak1 ] Bk2 + Ak1 [Jm , Bk2 ] = iεmk1 p Ap Bk2 + iεmk2 p Ak1 Bp

by assigning Ak1 Bk2 → Tk1 k2 , we can establish the behavior of a second rank cartesian tensor as

[Jm , Tk1 k2 ] = iεmk1 p Tpk2 + iεmk2 p Tk1 p

with further juxtapositions we can obtain the expected behavior of the cartesian components of an arbitrary n-th rank tensor

Definition 15.6 (Tensor operators-Cartesian components): A set of operators {Tk1 ···kn ; ki = 1, 2, 3} are cartesian components
of a n−th rank tensor if they satisfy the following commutation relations with the generators of rotations

[Jm , Tk1 ···kn ] = i εmk1 p Tpk2 ···kn + . . . + εmkn p Tk1 ···kn−1 p

We see then that {Jl } themselves transform as a vector, as well as the momenta operators {Pi }. An example of a second
rank tensor is the stress tensor, Tij .

15.16 Cartesian components of a second rank tensor


Let ei with i = 1, 2, 3 be a cartesian basis for V3 . Second rank tensors are the vectors belonging to the 9-dimensional product
space V32 ≡ V3 ⊗ V3 , whose natural basis is given by

{ei ⊗ ej ; i, j = 1, 2, 3}

so an arbitrary vector T ∈ V₃² can be written as

T = T^{ij} eᵢ ⊗ eⱼ ≡ T^{ij} eᵢeⱼ     (15.134)

It is worth pointing out that T is an (intrinsic) geometrical object while T^{ij} are its components in a specific basis of the tensor space V₃². There is no distinction between upper and lower indices; they are used only for the summation convention.
On the other hand, the SO (3) representation in V3 ⊗ V3 coming from the irreducible representation j = 1 in V3 is given by
Eq. (15.107)

U(R) |m, m′⟩ = |n, n′⟩ D^(1)(R)ⁿ_m D^(1)(R)^{n′}_{m′}     (15.135)

therefore, a vector (second-rank tensor) in the 9-dimensional product space V₃ ⊗ V₃ must transform under rotations according to the D^(1×1) representation of SO(3) described in Eq. (15.135). Consequently

T′^{kp} = D^(1)(R)^k_m D^(1)(R)^p_n T^{mn}     (15.136)

from which we obtain


 
T′^{kp} = D^(1)(R)^k_m T^{mn} D̃^(1)(R)_n^p  ⇒  T′ = D^(1)(R) T D^(1)(R)⁻¹

where we have used the orthogonality condition. Further if we recall that at least in a specific basis, D(1) (R) corresponds to
a three-dimensional rotation in the usual sense, we can write

T ⟶(R) T′ ≡ R T R⁻¹     (15.137)

Observe that Eq. (15.136) expresses a transformation of coordinates of a 9−components vector (the second rank tensor)
through a 9 × 9 matrix representation (associated with the product representation D(1×1) ). Though Eq. (15.137) clearly
represents the same equation, it suggests another interpretation: If we consider T as an operator acting on V3 (instead of a
vector belonging to V3 ⊗ V3 ), Eq. (15.137) expresses the transformation of an operator on V3 under three-dimensional rotations.
This dual role of a second rank tensor either as vector in the 9−dimensional V3 ⊗ V3 space or as an operator on the V3 space,
is very useful in Physics applications.

15.16.1 Decomposition of a second rank tensor in its symmetric and antisymmetric part
The procedure described in Sec. 5.2.2 page 97 for 3 × 3 matrices, is clearly valid for second-rank tensors of three dimensions.
Hence, we can decompose a 3-dimensional second-rank tensor into its symmetric and antisymmetric parts. Eq. (5.29) says that

T = [ T₁₁ T₁₂ T₁₃ ; T₂₁ T₂₂ T₂₃ ; T₃₁ T₃₂ T₃₃ ] = T̂ + T̄

T̂ ≡ (1/2) [ 2T₁₁  T₁₂+T₂₁  T₁₃+T₃₁ ;  T₁₂+T₂₁  2T₂₂  T₂₃+T₃₂ ;  T₁₃+T₃₁  T₂₃+T₃₂  2T₃₃ ]

T̄ ≡ (1/2) [ 0  T₁₂−T₂₁  T₁₃−T₃₁ ;  T₂₁−T₁₂  0  T₂₃−T₃₂ ;  T₃₁−T₁₃  T₃₂−T₂₃  0 ]     (15.138)

in components it is written as

T_{ij} = T̂_{ij} + T̄_{ij} ;  T̂_{ij} = (T_{ij} + T_{ji})/2 ,  T̄_{ij} = (T_{ij} − T_{ji})/2
the antisymmetric part can be parameterized as

T̄ = (1/2) [ 0  T₁₂−T₂₁  T₁₃−T₃₁ ;  T₂₁−T₁₂  0  T₂₃−T₃₂ ;  T₃₁−T₁₃  T₃₂−T₂₃  0 ] ≡ (1/2) [ 0  v₃  −v₂ ;  −v₃  0  v₁ ;  v₂  −v₁  0 ]     (15.139)

Since T̄ has precisely three independent components, it is natural to associate a three-vector with the antisymmetric part T̄ in the form

∗T ≡ (1/2) [ v₁ ; v₂ ; v₃ ] = (1/2) [ T₂₃−T₃₂ ; T₃₁−T₁₃ ; T₁₂−T₂₁ ] ;  (∗T)ᵢ = (1/2) vᵢ = (1/2) εᵢ^{jk} T̄_{jk}     (15.140)
where ∗T is the dual of the antisymmetric tensor T̄. Moreover, the symmetric part can in turn be decomposed into two symmetric components, a traceless one and another that only contains the trace as a degree of freedom. Hence, any tensor T ∈ V₃² can be decomposed into three parts
 
T = T̄ + T̂ = T̄ + T̂_tl + T̂_t ;  T̂_t ≡ (Tr T) [ 0 0 0 ; 0 1 0 ; 0 0 0 ]     (15.141)

T̄ ≡ (1/2) [ 0  T₁₂−T₂₁  T₁₃−T₃₁ ;  T₂₁−T₁₂  0  T₂₃−T₃₂ ;  T₃₁−T₁₃  T₃₂−T₂₃  0 ] ≡ (1/2) [ 0  v₃  −v₂ ;  −v₃  0  v₁ ;  v₂  −v₁  0 ]     (15.142)

T̂_tl ≡ (1/2) [ 2T₁₁  T₁₂+T₂₁  T₁₃+T₃₁ ;  T₁₂+T₂₁  −2T₁₁−2T₃₃  T₂₃+T₃₂ ;  T₁₃+T₃₁  T₂₃+T₃₂  2T₃₃ ] ≡ (1/2) [ k₁ k₂ k₃ ; k₂ −k₁−k₅ k₄ ; k₃ k₄ k₅ ]     (15.143)

This decomposition shows that the 9 components of an arbitrary 3 × 3 tensor T split as follows: (a) the three components vᵢ of the vector given by Eq. (15.140), which account for the components of T̄; (b) the 5 components kᵢ associated with the traceless symmetric tensor T̂_tl; and (c) the trace of T.
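A numerical version of this decomposition is immediate. The sketch below (ours) splits an arbitrary 3×3 array into the dual vector of its antisymmetric part, a traceless symmetric part and its trace; note that here the trace is split off isotropically as (Tr T/3)E rather than in the particular form T̂_t of Eq. (15.141), which carries the same single degree of freedom.

import numpy as np

def decompose_tensor(T):
    """Return (dual vector, traceless symmetric part, trace) of a 3x3 tensor,
    in the spirit of Eqs. (15.138-15.143)."""
    antisym = (T - T.T) / 2
    sym = (T + T.T) / 2
    trace = np.trace(T)
    # (*T)_i = (1/2) eps_ijk Tbar_jk, Eq. (15.140)
    dual = np.array([antisym[1, 2], antisym[2, 0], antisym[0, 1]])
    sym_traceless = sym - (trace / 3) * np.eye(3)   # isotropic trace split (assumption)
    return dual, sym_traceless, trace

T = np.random.rand(3, 3)
dual, sym_tl, tr = decompose_tensor(T)
assert np.isclose(np.trace(sym_tl), 0)
# 3 + 5 + 1 = 9 independent components are recovered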
The set of all linearly independent traceless symmetric tensors forms a five-dimensional subspace V_tl ⊂ V₃². It is convenient to find a basis that spans such a subspace. From Eqs. (15.134, 15.143) any traceless symmetric tensor T̂_tl can be written as

T̂_tl ≡ (1/2) [ k₁ k₂ k₃ ; k₂ −k₁−k₅ k₄ ; k₃ k₄ k₅ ]
     = (1/2) [k₁ e₁e₁ + k₂ e₁e₂ + k₃ e₁e₃ + k₂ e₂e₁ + (−k₁−k₅) e₂e₂ + k₄ e₂e₃ + k₃ e₃e₁ + k₄ e₃e₂ + k₅ e₃e₃]

T̂_tl = (1/2) [k₁(e₁e₁ − e₂e₂) + k₂(e₁e₂ + e₂e₁) + k₃(e₁e₃ + e₃e₁) + k₄(e₂e₃ + e₃e₂) + k₅(e₃e₃ − e₂e₂)]
such that V_tl is spanned by the following basis

e_[12] ≡ (1/2)(e₁e₁ − e₂e₂) ;  e_[32] ≡ (1/2)(e₃e₃ − e₂e₂)

e_{12} ≡ (1/2)(e₁e₂ + e₂e₁) ,  e_{13} ≡ (1/2)(e₁e₃ + e₃e₁) ;  e_{23} ≡ (1/2)(e₂e₃ + e₃e₂)     (15.144)

in a similar way, we can find the basis that spans the three dimensional subspace Va ⊂ V32 generated by the antisymmetric
tensors. From Eq. (15.142) we get

T̄ ≡ (1/2) [ 0 v₃ −v₂ ; −v₃ 0 v₁ ; v₂ −v₁ 0 ] = (1/2) [v₃ e₁e₂ − v₂ e₁e₃ − v₃ e₂e₁ + v₁ e₂e₃ + v₂ e₃e₁ − v₁ e₃e₂]

T̄ = (1/2) [v₁(e₂e₃ − e₃e₂) + v₂(e₃e₁ − e₁e₃) + v₃(e₁e₂ − e₂e₁)]

so that the basis that spans V_a is

e_(23) = (1/2)(e₂e₃ − e₃e₂) ;  e_(31) = (1/2)(e₃e₁ − e₁e₃) ;  e_(12) = (1/2)(e₁e₂ − e₂e₁)     (15.145)
finally, the one-dimensional space that accounts for the trace is spanned by

e₂₂ = e₂e₂     (15.146)

putting Eqs. (15.144, 15.145, 15.146) together we can find the matrix M of transformation from the cartesian basis {ei ej } to
the basis generated above
    
e(23) 0 0 0 0 0 1 0 −1 0 e1 e1
 e(31)   0 0 −1 0 0 0 1 0 0  e1 e2 
    
 e(12)   0 1 0 −1 0 0 0 0 0  e1 e3 
    
 e[12]    
  1 1 0 0 0 −1 0 0 0 0  e2 e1 
 e[32]  =  0 0 0 0 −1 0 0 0 1  e2 e2 
    
 e{12}  2  0 1 0 1 0 0 0 0 0  e2 e3 
    
 e{13}   0 0 1 0 0 0 1 0 0  e3 e1 
    
 e{23}   0 0 0 0 0 1 0 1 0  e3 e2 
e22 0 0 0 0 1 0 0 0 0 e3 e3

we see that det M = −8 ≠ 0, so that the new set of vectors is also a basis for V₃². However, this matrix is not unitary (real orthogonal), so the new basis is not orthonormal; it can be normalized and orthogonalized by the Gram-Schmidt process.
On the other hand, we have already seen that if T ∈ V₃², i.e. if it is a second-rank tensor, it is (actively) transformed under SO(3) as

T ⟶(R) T′ ≡ R T R⁻¹     (15.147)
we shall examine the transformation of each of the tensors defined in Eq. (15.141) under rotations.

15.16.2 Transformation of the trace of a second rank-tensor under SO (3)


Since Eq. (15.147) is a similarity transformation, it is inmediate that the trace of the tensor is invariant under a SO (3)
transformation (see Eq. 3.39, page 42)
T rT′ = T rT

15.16.3 Transformation of the antisymmetric part of a second rank-tensor under SO (3)


Now, let us define the antisymmetric tensor T in the form
Tij − Tji
T ij ≡ (15.148)
2

the image T of T under a SO (3) transformation yields
′  h l i 1 h k l i 1 h k l l i
T ij = RT R−1 ij = Ri k T kl R−1 j = Ri (Tkl − Tlk ) R−1 j = Ri Tkl R−1 j − Ri k Tlk R−1 j
2 2
′ 1h −1
 
−1 k l
i 1h
′ l
 i 1h ′
−1 k −1
 i 1h ′ ′
i
T ij = RTR − R i T lk Rj = (T ) − Rj T lk R i = T − RTR = T − (T )
2 ij 2 ij
2 ij ji 2 ij ji

′ 1 ′ 
T ij = T − Tji′ (15.149)
2 ij
where the orthogonality of R ∈ SO (3) has been used. Hence, the SO (3) transformation preserves the antisymmetry of
T. Note that this could also be seen from lemma 13.1 on page 229, which states that GL (n) −and in particular SO (n) ⊂
GL (n) −transformations are symmetry preserving.
288 CHAPTER 15. ROTATIONS IN THREE-DIMENSIONAL SPACE: THE GROUP SO (3)

A totally antisymmetric tensor in V32 has three independent components (theorem 13.5 page 235). We shall show that they
transform under rotations as the components of a vector. To do this, we use the “dual” of T, denoted by ∗T that accounts on
the three components of T, and it is defined in Eq. (15.140)
1 ijk 1
(∗T)i ≡ ε Tjk = εi jk Tjk
2 2
we calculate the image of ∗T under a SO (3) transformation by using the orthogonality condition and Eq. (15.8)
1 ij ′ 1  1 m 1
(∗T′ )k = εk (T )ij = εk ij RTR−1 ij = εk ij Ri l Tlm R−1 j = εk ij Ri l Tlm Rj m
2 2 2 2
1 ij n 1 1 
= εn δ k Ri l Tlm Rj m = εn ij (Rn a Rk a ) Ri l Tlm Rj m = Rn a Ri l Rj m εnij Tlm Rk a
2 2 2
1 alm a
= ε Tlm Rk a = (∗T) Rk a
2
since upper and lower indices are not significant we can write this relation as
k a
(∗T′ ) = Rk a (∗T)
k
comparing it with Eq. (15.2) on page 255, we see that (∗T) transforms as coordinate components of a vector. Since the basis
used is in principle arbitrary, we conclude that ∗T transforms as a vector under SO (3). Indeed, when parity is incorporated,
it can be seen that ∗T transforms as a pseudovector (or axial vector), and it has to do with the fact that εijk is indeed a
pseudotensor.

15.16.4 Transformation of the symmetric part of a second rank-tensor under SO (3)


b defined as
Further, it is easy to see that the symmetric part of a tensor T
  1
b
T = (Tij + Tji )
ij 2
preserves its symmetry under a SO (3) transformation, that is
  1 ′ 
b′
T = Tij + Tji′
ij 2
which is proved with the same procedure that led from Eq. (15.148) to Eq. (15.149). Finally, since the trace is also invariant
under SO(3) transformations, it is clear that traceless symmetric tensors are transformed under SO(3) into traceless symmetric
tensors.

15.16.5 Decomposition of V32 in invariant irreducible subspaces under SO (3)


From the discussion above, we can say that the subspace generated by all linearly independent antisymmetric second-rank
tensors is invariant under SO (3). This space is of dimension three according with theorem 13.5 page 235. Similarly, the
subspace generated by all linearly independent symmetric second-rank tensors is invariant under SO (3), and such a subspace
is 6-dimensional according with theorem 13.5. We then obtain21

V32 = Vant ⊕ Vsym ; 3⊗3=3⊕6

moreover we have proved that the three independent components of a totally antisymmetric tensor transform as a vector, i.e.
under the (irreducible) D(1) representation of SO (3), from which the invariant subspace Vant is irreducible. It is inmediate that
the six-dimensional invariant subspace generated by the totally symmetric tensors must be reducible since a tensor proportional
to the identity is totally symmetric. Therefore, a 1-dimensional subspace associated with the identity representation D(0) must
be contained in the subspace Vsym so that
3⊗3=3⊕1⊕5 (15.150)
On the other hand, we have seen that the trace of a tensor is invariant under SO (3). Thus, the trace (multiplid by the
identity) transforms as D(0) . Consequently, it is natural to separate the components of a symmetric tensor in its trace and the
remaining independent components. Hence, we are led to study the transformation properties of traceless symmetric tensors22
(which have five components), in order to check whether the five-dimensional subspace in Eq. (15.150) is reducible or not.
21 Despite tensors in the space V 2 are written as 3 × 3 matrix arrangements, we should always bear in mind that they are vectors of 9 components
3
in the 9-dimensional space V32 . However, the notation as a 3 × 3 matrix is more compact than a notation as a 9-column vector. Note that a valid
matrix in V32 must be a 9 × 9 matrix.
22 Note that antisymmetric tensors are automatically traceless. This is consistent with the fact that the identity representation comes entirely from

the subspace generated by the symmetric tensors.


15.16. CARTESIAN COMPONENTS OF A SECOND RANK TENSOR 289

We showed in Sec. 15.16.1 Eq. (15.144) that five tensors given by


1 1
e[12] ≡ (e1 e1 − e2 e2 ) ; e[32] ≡ (e3 e3 − e2 e2 )
2 2
1 1 1
e{12} ≡ (e1 e2 + e2 e1 ) , e{13} ≡ (e1 e3 + e3 e1 ) ; e{23} ≡ (e2 e3 + e3 e2 ) (15.151)
2 2 2
form a basis for the subspace of traceless tensors. If this subspace of V32 happens to be irreducible, it must be equivalent to
the subspace generated by the canonical vectors associated with the five-dimensional irreducible representation of SO (3) (i.e.
j = 2). In other words, the subspace generated by the basis of traceless symmetric tensors Eq. (15.151), must be equivalent
to the subspace generated by
{|2, mi ; m = 2, 1, 0, −1, −2} (15.152)
we construct the J = 2 invariant subspace embedded in V32 (i.e. in the 1 × 1 representation) by using the methods of Sec.
15.13.2 and appendix B. In particular, Eqs. (15.113, 15.112) applied for j = j ′ = m = m′ = 1 says that
|2, 2i = |1, 1i ⊗ |1, 1i
and we can generate the remaining vectors of Eq. (15.152) by sucessive aplication of J− or using the appropriate (tabulated)
Clebsch-Gordan coefficients. The result yields
1
|2, 2i = |1, 1i ⊗ |1, 1i ; |2, 1i = √ (|1, 1i ⊗ |1, 0i + |1, 0i ⊗ |1, 1i)
2
r
1 2
|2, 0i = √ (|1, 1i ⊗ |1, −1i + |1, −1i ⊗ |1, 1i) + |0, 0i ⊗ |0, 0i
6 3
1
|2, −1i = √ (|1, −1i ⊗ |1, 0i + |1, 0i ⊗ |1, −1i) ; |2, −2i = |1, −1i ⊗ |1, −1i (15.153)
2
now we connect the vectors in Eq. (15.153) with the cartesian vectors {ei } by using the relation between the vectors {|1, mi}
and the cartesian vectors (see Eq. 15.103 page 275)
1
|1, ±1i = √ (∓e1 − ie2 ) ; |1, 0i = e3 (15.154)
2
substituting (15.154) in (15.153) we obtain the desired connection. For instance for |2, 2i we have
1 1 1
|2, 2i = |1, 1i ⊗ |1, 1i = √ (−e1 − ie2 ) ⊗ √ (−e1 − ie2 ) = [e1 e1 − e2 e2 + i (e1 e2 + e2 e1 )]
2 2 2
|2, 2i = e[12] + ie{12}
proceeding similarly with the remaining |2, mi vectors we see that all of them are linear combinations of vectors in Eq. (15.151),
so the five-dimensional subspace generated by {|2, mi} contains all vectors in Eq. (15.151). Now since Eq. (15.151) consists of
five linearly independent vectors, they must span the same subspace as {|2, mi}.
A good proof of consistency can be done by constructing the J = 1 irreducible subspace embedded in V32 that is the
subspace generated by
{|1, mi ; m = 1, 0, −1}
and show that is is equivalent to the subspace generated by the basis (15.145) associated with the antisymmetric tensors.
Finally, the J = 0 subspace generated by |0, 0i must be equivalent to the basis associated with the trace Eq. (15.146). Hence,
|0, 0i must be collinear with e2 e2 .

15.16.6 Summary of results


A general second rank tensor transforms under rotations as the D(1×1) representation, so it is reducible. We also see that
D(1×1) = D(0) ⊕ D(1) ⊕ D(2) ⇔ 3⊗3=1⊕3⊕5
further, we proved that
1. The trace of the tensor T rT = δ ij Tij is invariant under SO(3), so it transforms as the identity representation D(0)
2. The antisymmetric part of the tensor
Tij − Tji
T ij =
2
remains antisymmetric after an SO (3) transformation. The three independent components of the anti-symmetric part
of the tensor can be written as (∗T)k = εkij Tij /2 and they behave like a vector under rotations, so they transform as
D(1) .
290 CHAPTER 15. ROTATIONS IN THREE-DIMENSIONAL SPACE: THE GROUP SO (3)

3. The symmetric part of the tensor


Tij + Tji
Tbij =
2
remains symmetric after an SO (3) transformation. Further, the five independent components of the traceless symmetric
part of the tensor transform into each other, and they form the D(2) representation (repesentation of spin 2).

It worths emphasizing that higher rank tensors can also be decomposed into irreducible parts by separating the components
with different symmetry patterns. We can do it systematically by applying the tensor method with the symmetric group as
we will see later.
Chapter 16

The group SU(2) and additional properties


of SO (3)

In many ways the group SU (2) is simpler than SO (3). SU (2) is defined as the group of two-dimensional unitary matrices
with unit determinant. This group is locally equivalent to SO (3) as we shall see later. Therefore, it has the same Lie algebra.
On the global level, it is compact1 and simply connected. Consequently, all irreducible representations for the lie algebra are
single-valued representations of SU (2), in contrast with SO (3) which admits double-valued representations. We shall see also
that SU (2) is the “universal covering group” of SO (3), and this fact permits to derive some of the more advanced topics on
the representations of SO (3) from the study of SU (2).

16.1 Relation between SO (3) and SU (2)


Definition 16.1 (SU(2) group): The group SU (2) consists of all 2 × 2 complex unitary matrices with unit determinant.

We saw in Sec. 15.9.2, that every element of SO (3) can be mapped into a 2×2 unitary matrix with unit determinant
D(1/2) (φ, θ, ψ) given by Eq. (15.97). It can be proved conversely, that all SU (2) matrices can be parameterized in that form.
Let us start with an arbitrary 2×2 matrix  
a b
A= (16.1)
c d
since the elements are in general complex, we have eight real constants. The unitarity condition yields
  ∗   
a b a c∗ aa∗ + bb∗ ac∗ + bd∗
AA† = = =E
c d b∗ d∗ ca∗ + db∗ cc∗ + dd∗

so the constraints imposed by the unitarity condition are


2 2 2 2
|a| + |b| = 1 ; |c| + |d| = 1 ; ac∗ + bd∗ = 0 (16.2)

the first of these equations leads to |a| ≤ 1, and |b| ≤ 1. Further this equation suggest that |a| = cos β, |b| = sin β. Therefore,
the most general solution for the first of Eqs. (16.2) reads

a = cos β eiξa ; b = − sin β eiξb 0 ≤ β ≤ π/2, 0 ≤ (ξa , ξb ) ≤ 2π (16.3)

where the minus in the solution of b is introduced by convenience. Note that sin β ≥ 0 and cos β ≥ 0 for the allowed interval
of β, and a possible change of sign is contained in the phases. Similarly, the second of Eqs. (16.2) implies

c = sin α eiξc ; d = cos α eiξd ; 0 ≤ α ≤ π/2, 0 ≤ (ξc , ξd ) ≤ 2π (16.4)

substituting (16.3, 16.4) in the third of Eqs. (16.2) we have


 ∗  ∗
cos β eiξa sin α eiξc + − sin β eiξb cos α eiξd =0

cos β sin α ei(ξa −ξc ) = sin β cos α ei(ξb −ξd ) (16.5)


1 As before, compactness leads to the fact that all representations are finite-dimensional and equivalent to unitary representations, and that most

of results obtained for finite groups can be extended appropriately for this group.

291
292 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

equating the magnitudes in Eq. (16.5) we find

cos β sin α = sin β cos α ⇒ sin β cos α − cos β sin α = 0


sin (β − α) = 0

for the allowed range of β and α the only solution is


α=β (16.6)

Now equating the phases in Eq. (16.5) we get

ξa − ξc = ξb − ξd (modulo 2π)
ξa + ξd = ξb + ξc ≡ 2λ (modulo 2π) (16.7)

we have a constraint over four phases, so only three of them are independent. We reparameterize the three independent phases
in the following way

ξa = λ+ζ ; ξd = λ − ζ (modulo 2π) (16.8)


ξb = λ+η ; ξc = λ − η (modulo 2π) (16.9)

where λ, η and ζ are arbitrary independent real phases. Taking into account the definition of the λ parameter Eq. (16.7), the
range of the ξ phases Eqs. (16.3, 16.4), and the fact that 0 ≤ ξb + ξc < 2π (they are modulo 2π) we see that 0 ≤ 2λ < 2π or
0 ≤ λ < π. Now, from Eqs. (16.8) and the ranges of λ and the ξ phases we see that 0 ≤ (ζ, η) < 2π.
Replacing Eqs. (16.3, 16.4, 16.6, 16.8, 16.9) in Eq. (16.1) we find
     
a b cos β eiξa − sin β eiξb cos β ei(λ+ζ) − sin β ei(λ+η)
A= = =
c d sin α eiξc cos α eiξd sin β ei(λ−η) cos β ei(λ−ζ)

this leads directly to the following theorem

Theorem 16.1 An arbitrary 2×2 unitary matrix A can be expressed as


 
cos β eiζ − sin β eiη π
A = eiλ ; 0≤β≤ ; 0≤λ<π
sin β e−iη cos β e−iζ 2
0 ≤ ζ < 2π ; 0 ≤ η < 2π (16.10)

Note in particular that the restriction on the range of λ can also be seen from the fact that any additional overall phase
factor eiπ = −1 can always be absorbed in the ζ and η phases in Eq. (16.10).

Theorem 16.2 An arbitrary 2×2 SU(2) matrix A can be parameterized in terms of three real parameters (β, η, ζ) as in Eq.
(16.10) without the overall phase factor eiλ .
 
cos β eiζ − sin β eiη π
A= ; 0≤β≤ ; 0 ≤ ζ < 2π ; 0 ≤ η < 2π (16.11)
sin β e−iη cos β e−iζ 2

Proof : For the matrix A in Eq. (16.10) to be a SU(2) matrix we require the additional condition of unit determinant. It
leads to eiλ = 1 whose only solution within the allowed range of λ is λ = 0. QED.

Corollary 16.3 Any 2 × 2 matrix of the type SU (2) can be written in the form of D(1/2) (φ, θ, ψ) given by Eq. (15.97) page
273, by using the correspondences
θ −φ − ψ −φ + ψ
β= ; ζ= ; η= (16.12)
2 2 2
where the ranges of the new variables become

0 ≤ θ ≤ π ; 0 ≤ φ < 2π ; 0 ≤ ψ < 4π (16.13)

We observe that the range of ψ is twice that of the physical Euler angle ψ, which comes from the fact that the SU (2)
matrices form a double-valued representation of SO (3).
16.2. CARTESIAN PARAMETERIZATION OF SU (2) MATRICES AND THE GROUP MANIFOLD 293

16.2 Cartesian parameterization of SU (2) matrices and the group manifold


The unitarity condition for the matrix (16.1)  
a b
A= (16.14)
c d
can also be expressed by    
1 d −b a∗ c∗
A−1 = A† ⇒ = (16.15)
det A −c a b∗ d∗
and using the special condition det A = 1, we obtain
d = a∗ , −b = c∗ , −c = b∗ , a = d∗ ⇒
 
a b
A = ; det A = aa∗ + bb∗ = 1 (16.16)
−b∗ a∗

For other purposes, it is convenient to parameterize a SU (2) matrix using the cartesian form of complex numbers and the
structure given by Eq. (16.16)
 
r0 − ir3 −r2 − ir1
A = (16.17)
r2 − ir1 r0 + ir3
det A = r02 + r12 + r22 + r32 = 1 (16.18)
where ri are all real numbers. From this parameterization, the structure of the group manifold is more apparent. If we
see {ri : i = 0, 1, 2, 3} as the cartesian coordinates in a 4-dimensional Euclidean space, the group parameter space is simply
the surface of the unit 4-sphere2. This manifold is compact (closed and bounded) and simply-connected. These topological
properties make SU (2) a well-behaved group. In contrast, the SO(3) manifold was double-connected and it led to double-valued
representations.
In the cartesian parameterization of the SU (2) matrices, we can regard (r1 , r2 , r3 ) as the independent variables, with
q
r0 = 1 − (r12 + r22 + r32 ) (16.19)

from Eqs. (16.17, 16.19) it is clear that the identity of the group is obtained by setting r1 = r2 = r3 = 0. As customary, we
assume an infinitesimal transformation around the identity. Therefore, we settle rk = drk , k = 1, 2, 3, for the three independent
variables. On the other hand taking into account that
√ 1 
1 − x ≃ 1 − x + O x2
2
the dependent variable r0 in Eq. (16.19) becomes
q  h 4 i
1
r0 = 1 − (drk ) (drk ) = 1 − drk (drk ) + O drk
2
thus r0 is the identity at first order in drk . We can then write Eq. (16.17) for an infinitesimal transformation as
 
 1 − idr3 −dr2 − idr1
A drk =
dr2 − idr1 1 + idr3
   
1 0 −idr3 −dr2 − idr1
= +
0 1 dr2 − idr1 idr3
       
1 0 0 −idr1 0 −dr2 −idr3 0
= + + +
0 1 −idr1 0 dr2 0 0 idr3
       
1 0 0 1 0 −i 1 0
= − idr1 − idr2 − idr3
0 1 1 0 i 0 0 −1
from which the infinitesimal matrix becomes
     
 0 1 0 −i 1 0
A drk = E − idrk σk ; σ1 = ; σ2 = ; σ3 = (16.20)
1 0 i 0 0 −1

where σi are the Pauli matrices defined in Eq. (15.91). Equation (16.20) says that the Pauli matrices are the generators of
SU (2). From their explicit form, we can see that the Pauli matrices satisfy the following commutation relations
[σk , σl ] = 2iεklm σm
2 Note that for SO (3) the manifold was the VOLUME (and not the surface) of a 3-sphere of radius π.
294 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

comparing with Eq. (15.60) we see that SU(2) and SO(3) have the same Lie algebra if we make the identification
σk
Jk →
2
We constructed all the irreducible representations of this Lie algebra in Sec. 15.5. Since SU (2) is a simply connected group,
all the irreducible representations of its Lie algebra are also single-valued irreducible representations of the group.

16.3 An alternative way to see the relation between SO (3) and SU (2) (op-
tional)
We can see the fact that every SU (2) matrix is associated with a rotation in an alternative way. Let us associate every
coordinate vector x ≡ x1 , x2 , x3 , with a 2 × 2 hermitian traceless matrix
     
i 1 0 1 2 0 −i 3 1 0
X = σi x = x +x +x (16.21)
1 0 i 0 0 −1
 
x3 x1 − ix2
X = 1 2 (16.22)
x + ix −x3

it is easy to see that


2
det X = −xi xi = − |x| ; T rX = 0
X can be considered as an operator in the C2 space. Thus, a matrix A in SU (2) induces a linear transformation on X

X → X ′ ≡ AXA−1 = AXA† (16.23)

it is clear that X ′ is also hermitian and traceless, and preserves the determinant
    †
T rX ′ = T r AXA−1 = T r A−1 AX = T rX = 0 ; X ′† = AXA† = AXA† = X ′
 
det [X ′ ] ≡ det AXA−1 = det A det X det A−1 = det X

hence, X ′ can also be associated with a coordinate vector x′ as in Eqs. (16.21, 16.22) so that
2 2
− det X ′ = |x′ | = |x|

In conclusion, the SU (2) transformation (16.23), induces an SO (3) transformation x → x′ in the three-dimensional Euclidean
space. Moreover, Eq. (16.23) shows that the two SU (2) matrices ±A are mapped into the same rotation. Thus the mapping
from A ∈ SU (2) to R ∈ SO (3) is two-to-one.
Let us see explicitly the form of the transformation. Substituting Eqs. (16.16, 16.22) in Eq. (16.23) we find

     ∗ 
x′3 x′1 − ix′2 a b x3x1 − ix2 a −b
=
x + ix′2
′1
−x′3 −b∗ a∗ x + ix2
1
−x3 b∗ a
    3 ∗   
x′3 x′1 − ix′2 a b x a + x1 − ix2 b∗ −x3 b + x1 + ix2 a
=
x + ix′2
′1
−x′3 −b∗ a∗ x1 + ix2 a∗ − x3 b∗ − x1 + ix2 b − x3 a

expanding out the matrix products we obtain the following complex equations
 
2 2
x′3 = x3 |a| − |b| + x1 (b∗ a + a∗ b) + ix2 (ba∗ − b∗ a) (16.24)
h i h i
2 2 2 2
x′1 + ix′2 = x1 (a∗ ) − (b∗ ) + ix2 (a∗ ) + (b∗ ) − 2x3 a∗ b∗ (16.25)

Separating the real and imaginary parts in (16.25) and expressing it in matrix form we have
 h i h ∗ 2 i 
 ′1   1  (a∗ )2 + a2 − (b∗ )2 − b2 i
(a ) + (b ∗ 2
) − a 2
− b 2
a∗ b∗ + ab   1 
x 
 2 h 2 h 
 x
 i  i
 x′2  = 1
(a∗ )2 + b2 − (b∗ )2 − a2 1
(a∗ )2 + (b∗ )2 + a2 + b2 i (a∗ b∗ − ab)  x2  (16.26)
′3 
 2i 2 

x   x3
b ∗ a + a∗ b i (ba∗ − b∗ a) |a|2 − |b|2

Now we should verify that the 3×3 matrix obtained corresponds to a rotation SO (3). To see it, let us take the following
replacements
a = e( 2 ) , b = 0 ⇒ |a|2 + |b|2 = 1

(16.27)
16.4. REPRESENTATION MATRICES FOR SU (2): THE TENSOR METHOD 295

we observe that Eq. (16.27) is compatible with Eqs. (16.2, 16.3). According with (16.27) and (16.26) we obtain

x′1 = x1 cos α + x2 sin α


′2
x = −x1 sin α + x2 cos α
x′3 = x3 (16.28)

where we have taken into account the complex identities for sin α and cos α. The Eq. (16.28) represents a rotation by an angle
α, around the x3 axis. Further, replacing Eq. (16.27) in Eq. (16.16) gives the analoguous SU (2) matrix in C2 . Denoting
T
(ξ1 , ξ2 ) a vector of complex components in C2 , we have the following equivalence
!   1 
iα  cos α sin α 0 x
e2 0 ξ1
−iα ⇐⇒  − sin α cos α 0   x2 
0 e 2 ξ2
0 0 1 x3

By using Eq. (15.94) we can find



!
e 2 0
= e( 2 )σ3

−iα (16.29)
0 e 2

 
cos α sin α 0
 − sin α cos α 0  = eiJ3 α
0 0 1

so we can write
e( 2 )σ3 ⇐⇒ eiJ3 α

(16.30)
and similarly for rotations around the x1 and x2 axes
   
β β 
a = cos , b = i sin x1 − axis (16.31)
2 2
γ  γ  
a = cos , b = sin x2 − axis (16.32)
2 2
so that
        1 
β   1 0 0 x
cos i sin β2
 2     ξ1 ⇐⇒  0 cos β sin β   x2  (16.33)
i sin β2 cos β2 ξ2
0 − sin β cos β x3
  1 
 γ
 γ
   cos γ 0 − sin γ x
cos 2  sin 2 ξ1  0   x2 
γ γ ⇐⇒ 1 0 (16.34)
− sin cos ξ2
2 2 sin γ 0 cos γ x3

which can be written as



e( 2 )σ1 ⇐⇒ eiJ1 β (16.35)

e( 2 )σ2 ⇐⇒ eiJ2 γ (16.36)

Checking Eqs. (16.30), (16.35) and (16.36), we conclude that there is a relation between rotations SO (3) and the associated
SU (2) transformations for half-angles. For each rotation SO (3) in an angle θ, there exist two possible SU (2) transformations,
associated with θ and θ + 2π. In general for a rotation around an arbitrary axis n we can write

e( 2 )n·~σ ⇐⇒ eiθJ·n

⇒ ≡J
2

16.4 Representation matrices for SU (2): The tensor method


SU (2) is clearly a subgroup of GL (2, C). Therefore, we can apply the tensor method of Chapter 13, to construct higher
dimensional representations of SU (2). For this special case, it can be shown that (i) The irreducible tensors belonging to
symmetry classes of the permutation group, generate irreducible representations of SU (2) as they do for GL (2, C). (ii) The
totally symmetric tensors of rank n (in the space V2n ) form an (n + 1) −dimensional space, generating the j = n/2 irreducible
representation of SU (2). Hence, since n/2 runs over all integers and half-odd-integers, all irreducible representations of SU (2)
m
and SO (3) can be generated by the tensor method. From this fact we shall construct the functions d(j) (θ) m′ of Eq. (15.85)
explicitly, for all (j).
296 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

We denote the basic matrix Eq. (15.96) in the j = 1/2 representation as


 
c −s θ θ
r (θ) ≡ d(1/2) (θ) = ; c ≡ cos , s ≡ sin (16.37)
s c 2 2

let ξ i , i = +− be the components of an arbitrary two-component complex vector ξ (spinor) in the space V2 (isomorphic with
C2 ). As usual we define their transformations as
i
ξ i → ξ ′i = r (θ) j ξ j

this relation can be written explicitly by using Eq. (16.37)


 ′+    
ξ c −s ξ+
=
ξ ′− s c ξ−
ξ ′+ = cξ + − sξ − ; ξ ′− = sξ + + cξ − (16.38)

In the tensor space V2n , we define a tensor ξ {i} with components

ξ {i} = ξ i1 ξ i2 · · · ξ in (16.39)

which is totally symmetric by construction, and irreducible (see Sec. 13.2). Since ik = +, − ;we can use the symmetry of the
tensor to reorder the product in Eq. (16.39), in order to gather all components of the type ξ + and then all components of the
type ξ −
k − n−k
ξ {i} = ξ + ξ ; 0≤k≤n (16.40)
there are n + 1 independent components characterized by the n + 1 possible values of k. We shall label each independent
component with m ≡ k − n/2, so that
n n n
k= +m ; n−k =n− −m= −m
2 2 2
and Eq. (16.40) becomes
 n2 +m  n2 −m
ξ (m) = ξ + ξ− (16.41)
now, since n/2 is integer or half-odd-integer we define j = n/2, so that k = j + m and since 0 ≤ k ≤ n = 2j we have that
m = −j, −j + 1, . . . , j. Replacing j = n/2 and normalizing the tensor in Eq. (16.41) we find
j+m j−m
(ξ + ) (ξ − ) n
ξ (m) = (2j)! ; j = ; m = −j, −j + 1, . . . , j (16.42)
[(j + m)! (j − m)!] 2

It can be shown that ξ (m) transform as the canonical components of the j = n/2 irreducible representation of the SU (2)
Lie algebra. It can be shown by determining the action of J± on the symmetric tensors of V2n . It means that

r (θ) : ξ (m) → ξ ′(m) = d(j) (β)


m ( m′ )
m′ ξ (16.43)

applying Eq. (16.42) to ξ ′(m) and ξ (m ) and Eq. (16.38) to ξ ′± , we have
j+m j−m j+m j−m
(ξ ′+ ) (ξ ′− ) (cξ + − sξ − ) (sξ + + cξ − )
ξ ′(m) = (2j)! = (2j)! (16.44)
[(j + m)! (j − m)!] [(j + m)! (j − m)!]
j+m′ j−m′
′ (ξ + ) (ξ − )
ξ (m ) = (2j)! (16.45)
[(j + m′ )! (j − m′ )!]
replacing Eqs. (16.44, 16.45) in Eq. (16.43) we have
j+m j−m j+m′ j−m′
(cξ + − sξ − ) (sξ + + cξ − ) (j) m (ξ + ) (ξ − )
(2j)! = d (β) m′ (2j)!
[(j + m)! (j − m)!] [(j + m′ )! (j − m′ )!]
from which we can deduce a closed expression for the general matrix element
p  2j+m−m′ −2k  2k−m+m′
m
X k (j + m)! (j − m)! (j + m′ )! (j − m′ )! θ θ
(j)
d (θ) m′ = (−1) cos sin (16.46)
k! (j + m − k)! (k − m + m′ )! (j − m′ − k)! 2 2
k

this result along with the first of Eqs. (15.85) gives us the complete expression for the matrices of all representations of SO (3)
and SU (2).
16.5. INVARIANT INTEGRATION MEASURE 297

16.5 Invariant integration measure


We have remarked that the rearrangement Lemma is crucial in establishing the central theorems in the representation theory
for finite groups shown in Chapters 7, 9. For this reason, an appropriate generalization must be provided in the case of
continuous groups, and it led to the concept of invariant integration measure described in Sec. 14.1.3. In Sec. 14.1.3 we
establish the proper invariant measure for SO (2). This section focus in establishing the appropriate invariant measure for the
groups SU (2) and SO (3).
Following the discussion in Sec. 14.1.3, we look for an infinitesimal volume measure dτA around each group element A,
such that Z Z Z

f (A) dτA = f B −1 A dτA = f (A) dτBA (16.47)

for an arbitrary function of the group element f (A). Since both, SU (2) and SO (3) require three parameters to determine
their elements, assume a set (ξ, η, ζ) of parameters that characterize a group element A, the volume measure dτA reads

dτA = ρ (ξ, η, ζ) dξ dη dζ (16.48)

where ρ is the “density” or “weight” function for the invariant measure. The validity of Eq. (16.47) demands that dτA = dτBA
which in turn leads to (see Eq. 14.20)
ρ (ξ, η, ζ) ∂ (ξ ′ , η ′ , ζ ′ )
′ ′ ′
= (16.49)
ρ (ξ , η , ζ ) ∂ (ξ, η, ζ)
where (ξ ′ , η ′ , ζ ′ ) are the set of parameters associated with A′ = BA and the RHS is the Jacobian determinant for the change
of variables (ξ, η, ζ) → (ξ ′ , η ′ , ζ ′ ). It worths emphasizing that the same equation is obtained when we consider both sets of
parameters as associated with the same group element, with their corresponding weight functions on the LHS of Eq. (16.49).
Hence the change of variables (ξ, η, ζ) → (ξ ′ , η ′ , ζ ′ ) could have an active interpretation in which A (ξ, η, ζ) represents a group
element different from A (ξ ′ , η ′ , ζ ′ ) and running over all values of these coordinates is equivalent to run over all elements of the
group. On the other hand, the change of variables (ξ, η, ζ) → (ξ ′ , η ′ , ζ ′ ) could have a passive interpretation in which A (ξ, η, ζ)
represents the same group element as A (ξ ′ , η ′ , ζ ′ ) but expresssed in another coordinate system. The reasoning that led to Eq.
(16.49) was an active point of view, but a passsive point of view leads to the same equation3 .
In this section we develop a method to obtain the measure for the specific case of SU(2), the approach in the next section
is more general. It is clear that the weight function is simpler if the RHS of Eq. (16.49) is a constant. In turn, it happens
if (ξ ′ , η ′ , ζ ′ ) are linear functions of (ξ, η, ζ). Fortunately, one of the parameterizations of SU (2) and SO (3) group elements
P
satisfies this condition, namely (r0 , r1 , r2 , r3 ) with the constraint 3i=0 ri2 = 1. Let {ri } , {ri′ } and {si } be the sets of parameters
associated with the elements A, A′ and B respectively (see Eqs. 16.17, 16.18) with A′ = BA, so that {ri′ } are linear functions
of {ri }. Let us ignore for a while the constraints on the sets {ri } and {ri′ }, from Eq. (16.18), we have
3
X 3
X
2
det A′ = (ri′ ) = det (BA) = det B det A = det A = ri2
i=0 i=0

P3 P3
since i=0 ri2 = i=0 ri′2 , it is clear that the linear mapping {ri } → {ri′ } is an orthogonal transformation. The jacobian for
this linear transformation is det B = 1. Therefore, the RHS of Eq. (16.49) is equal to 1 and the simplest choice ρ = 1 suffices.
Now, introducing the constraint (16.18) on the set {ri } and using Eq. (16.48), we have
3
!
X
de
τA = δ 1 − ri2 dr0 dr1 dr2 dr3 (16.50)
i=0

where the Dirac delta function accounts on the fact that r0 is not independent. If we consider r1 , r2 , r3 as the independent
variables, we integrate the spurious variable r0 , so that the invariant measure in Eq. (16.50) is replaced by
Z "Z 3
!#
1 X
dτA ≡ dr0 de
τA = dr0 δ 1 − ri2 dr1 dr2 dr3 (16.51)
0 i=0

by using the property (3.76), page 57 of the Dirac delta function, we have
3
X 3
X
f (r0 ) = 1 − ri2 = 1 − r02 − ri2
i=0 i=1

3 In other words when Eq. (16.49) is satisfied in a passive point of view we establish that ρ (ξ, η, ζ) defines a measure, if we then verify that such

an equation is satisfied in an active point of view, we establish that this measure is invariant.
298 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

the roots in which f (r0A ) = 0 are given by


v v
u 3 u 3
u X u X
r0A1 = t1 − ri2 , r0A2 = −t1 − ri2 = −r0A1
i=1 i=1

the derivative of f (r0 ) is given by


v
u 3
u X
f ′ (r0 ) = −2r0 ′ ′ t
; |f (r0A1 )| = |f (r0A2 )| = 2 1 − ri2
i=1

therefore, property (3.76) says that


3
!
X 1 1 1
2
δ 1− ri = ′ δ (r0 − r0A1 ) + ′ δ (r0 − r0A2 ) = q P3 [δ (r0 − r0A1 ) + δ (r0 + r0A1 )] (16.52)
|f (r0A1 )| |f (r0A2 )|
i=0 2 1 − i=1 ri2

substituting (16.52), in Eq. (16.51) and integrating we obtain


 
Z Z 1 
1
dτA = dr0 de
τA = dr0 q P3 [δ (r0 − r0A1 ) + δ (r0 + r0A1 )] dr1 dr2 dr3
 0 
2 1 − i=1 ri2
Z 1 
1
dτA = q P3 dr0 δ (r0 − r0A1 ) dr1 dr2 dr3
2 1 − i=1 ri2 0

where we have taken into account that δ (r0 + r0A1 ) does not contribute because r0 and r0A1 are both positive. Thus we finally
obtain 
3
!1/2 −1
X
dτA = 2 1 − rk2  dr1 dr2 dr3 (16.53)
k=1

The first factor on the RHS of Eq. (16.53), gives the appropriate weight function for this set of parameters. The weight
function is not simple anymore4 , because after eliminating r0′ and r0 the set (r1′ , r2′ , r3′ ) is not a linear function of (r1 , r2 , r3 ).
We can see then that the addition of the spurious parameters r0 , r0′ permitted to deal with a linear transformation, and it was
because of this fact, that the constraint was introduced only at the end of the process.

16.5.1 Invariant measure in different sets of coordinates


To find the invariant measure in terms of other parameterizations such as the angles (β, ζ, η) in Eq. (16.11) or the Euler angles
(φ, θ, ψ), we should take a passive point of view for the change of variables. In particular, we shall use the previous result with
{ri } as the original parameterization and then derive the new weight function by means of Eq. (16.49), taking the latter with
a passive point of view. It is convenient to relax again the constraint in Eq. (16.18) and compensate it by multiplying the
matrix elements in the other parameterization, Eq. (16.11), by
v
u 3
uX
r≡t ri2 (16.54)
i=0

to find    
r cos β eiζ −r sin β eiη r cos β (cos ζ + i sin ζ) −r sin β (cos η + i sin η)
A= = (16.55)
r sin β e−iη r cos β e−iζ r sin β (cos η − i sin η) r cos β (cos ζ − i sin ζ)
we could then take into account the constraint later by setting r = 1. By now, r is used as another free parameter. We shall
study the change of coordinates (r0 , r1 , r2 , r3 ) → (r, β, ξ, η) and at the end of the process we use the constraints on r0 and r.
Comparing Eq. (16.55) with Eq. (16.17) we obtain
   
r cos β cos ζ + ir cos β sin ζ −r sin β cos η − ir sin β sin η r0 − ir3 −r2 − ir1
=
r sin β cos η − ir sin β sin η r cos β cos ζ − ir cos β sin ζ r2 − ir1 r0 + ir3
4 Remember that ρ = 1 suffices before the introduction of the constraint. Notwithstanding, the apropriate density or weight is obtained when only

independent parameters are involved.


16.5. INVARIANT INTEGRATION MEASURE 299

so that

r0 = r cos β cos ζ ; r3 = −r cos β sin ζ


r2 = r sin β cos η ; r1 = r sin β sin η

the jacobian determinant can be calculated to obtain


∂r
0 ∂r0 ∂r0 ∂r0

∂r ∂β ∂ζ ∂η cos β cos ζ −r sin β cos ζ −r cos β sin ζ 0
∂ (r0 , r1 , r2 , r3 ) ∂r
∂r
1 ∂r1
∂β
∂r1
∂ζ
∂r1
∂η sin β sin η
r cos β sin η 0 r sin β cos η

J ≡ = ∂r ∂r2 ∂r2 ∂r2 = sin β cos η
∂ (r, β, ζ, η) ∂r 2
∂β ∂ζ ∂η r cos β cos η 0 −r sin β sin η
∂r3 ∂r3 ∂r3 ∂r3
− cos β sin ζ r sin β sin ζ −r cos β cos ζ 0
∂r ∂β ∂ζ ∂η

expanding along the fourth column



cos β cos ζ −r sin β cos ζ −r cos β sin ζ

J = r sin β cos η sin β cos η r cos β cos η 0

− cos β sin ζ r sin β sin ζ −r cos β cos ζ

cos β cos ζ −r sin β cos ζ −r cos β sin ζ

− (−r sin β sin η) sin β sin η r cos β sin η 0

− cos β sin ζ r sin β sin ζ −r cos β cos ζ

 
sin β cos η r cos β cos η cos β cos ζ −r sin β cos ζ
J =
r sin β cos η −r cos β sin ζ
− r cos β cos ζ
− cos β sin ζ r sin β sin ζ sin β cos η r cos β cos η
 
sin β sin η r cos β sin η cos β cos ζ −r sin β cos ζ
+r sin β sin η −r cos β sin ζ
− r cos β cos ζ
− cos β sin ζ r sin β sin ζ sin β sin η r cos β sin η

  
J = r3 sin β cos β cos η − sin ζ sin2 β sin ζ cos η + cos2 β sin ζ cos η − cos ζ cos2 β cos η cos ζ + sin2 β cos η cos ζ
  
+r3 sin β cos β sin η − sin ζ sin2 β sin ζ sin η + cos2 β sin ζ sin η − cos ζ cos2 β sin η cos ζ + sin2 β sin η cos ζ
  
= r3 sin β cos β cos η − sin2 ζ cos η sin2 β + cos2 β − cos2 ζ cos η cos2 β + sin2 β
  
+r3 sin β cos β sin η − sin2 ζ sin η sin2 β + cos2 β − cos2 ζ sin η cos2 β + sin2 β
   
= r3 sin β cos β cos η − sin2 ζ cos η − cos2 ζ cos η + r3 sin β cos β sin η − sin2 ζ sin η − cos2 ζ sin η
   
= −r3 sin β cos β cos2 η sin2 ζ + cos2 ζ − r3 sin β cos β sin2 η sin2 ζ + cos2 ζ

= −r3 sin β cos β cos2 η − r3 sin β cos β sin2 η = −r3 sin β cos β cos2 η + sin2 η

∂ (r0 , r1 , r2 , r3 ) 1
= −r3 cos β sin β = − r3 sin 2β (16.56)
∂ (r, β, ζ, η) 2
and using Eq. (16.49) we have

ρ (r, β, ζ, η) ∂ (r0 , r1 , r2 , r3 ) ∂ (r0 , r1 , r2 , r3 ) 1


= ⇒ ρ (r, β, ζ, η) = = − r3 sin 2β (16.57)
ρ (r0 , r1 , r2 , r3 ) ∂ (r, β, ζ, η) ∂ (r, β, ζ, η) 2

where we have used Eq. (16.56) and the fact that ρ (r0 , r1 , r2 , r3 ) = 1, before using the constraint. The constraint on r is
obtained by combining Eqs. (16.18, 16.54), from which we get r2 = 1. From Eqs. (16.48, 16.57) and taking into account this
constraint at the end of the process, we find
1 3 1
de
τA = −δ 1 − r2 r sin 2β dr dβ dζ dη = δ 1 − r2 r3 [−2 sin 2β dβ] dr dζ dη
2 4

2 1 3
de
τA = δ 1−r r d (cos 2β) dr dζ dη
4
the new invariant measure is obtained after integrating the spurious variable r, and using the property
  1
δ a2 − r 2 = δ r 2 − a2 = [δ (r + a) + δ (r − a)]
2 |a|

so that Z  Z
1  1
dτA = δ 1 − r2 r3 dr d (cos 2β) dζ dη = d (cos 2β) dζ dη [δ (r − 1) + δ (r + 1)] r3 dr
4 8
300 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

but r is non-negative. Consequently, the factor δ (r + 1) does not contribute and we find
Z  Z
1  1
dτA = δ 1 − r2 r3 dr d (cos 2β) dζ dη = d (cos 2β) dζ dη δ (r − 1) r3 dr
4 8
1
dτA = d (cos 2β) dζ dη
8
where we have used the fact that the overall constant can be chosen arbitrarily to change the sign. When we use the Euler
angles parameterization we simply use Eq. (16.12) to obtain
1
dτA = dφ d (cos θ) dψ (16.58)
16
the overall constant factor can be chosen arbitrarily.
We have then used the following strategy to find an appropriate integration measure: we look for an specific parameterization
(the cartesian one) in which the determination of the weight function is easy. For this, we start relaxing the constraint on
the four original parameters using all of them as free, in that way the mapping {ri } → {ri′ } (active mapping) between
the parameters associated with different group elements becomes linear, leading to a very simple weight function, then the
constraint is added at the end of the process. To find the invariant measure in an arbitrary parameterization we use the
cartesian parameterization as the starting point, and apply Eq. (16.49) with a passive point of view to determine the weight,
and thus the invariant volume in an arbitrary parameterization, as before it is convenient in this process to relax the constraint
at the beginning and introduce it only at the end of the process.

16.6 Invariant integration measure, general approach for compact Lie groups
The procedure in Sec. 16.5, rests on the fact that we have found a linear parameterization given by Eq. (16.17). In this
section, we shall develop a general method to obtain the measure on the manifold of any compact Lie group, and in which the
starting point is an arbitrary parameterization of the manifold.

Theorem 16.4 (Invariant integration measure for Compact Lie groups): The following procedure defines an invariant inte-
gration measure on a compact Lie group G, with elements A ∈ G:
(i) For a given set of parameters {ξi }, calculate ∂A (ξ) /∂ξi
(ii) Form the product A−1 (ξ) ∂A(ξ)
∂ξi and expresses it as a linear combination of the basis elements {Jα } of the Lie algebra
(the generators), defining the matrix Ae from these linear combinations

∂A X e (ξ)α i
A−1 = Jα A (16.59)
∂ξi α

(iii) the weight function and invariant measure are given by


Y
e (ξ)
ρA (ξ) = det A ; dτA = ρA (ξ) dξi (16.60)
i

Proof : First of all, this definition is independent of the local coordinates used for A. Let {ηi } be another parameterization,
then  j
∂A ∂A ∂ξj X X X  α
A−1 = A−1 = e (ξ)α j ∂ξj =
Jα A e (ξ)α j ∂ξ i =
Jα A Jα A e (ξ) ∂ξ i
∂ηi ∂ξj ∂ηi α
∂ηi α
∂η α
∂η
on the other hand by definition
∂A X e (η)α i
A−1 = Jα A
∂ηi α

comparing the last two equations we have


e (ξ) ∂ξ
e (η) = A
A
∂η
so that   h i 
e e ∂ξ e ∂ξ
det A (η) = det A (ξ) = det A (ξ) det
∂η ∂η
and using Eq. (16.60) we have
ρA (η) ∂ξ
= det (16.61)
ρA (ξ) ∂η
16.6. INVARIANT INTEGRATION MEASURE, GENERAL APPROACH FOR COMPACT LIE GROUPS 301

as required by Eq. (16.49) in a passive point of view. Now let {ξi } be a given set of group parameters (local coordinates) at
A. For a fixed group element B, we choose local coordinates at BA as

(BA) (ξ) = B · A (ξ) ; ∀A (ξ) ∈ G

so that
∂ ∂A ∂A
(BA)−1 (BA) = A−1 B −1 B · = A−1 (16.62)
∂ξi ∂ξi ∂ξi
and from Eqs. (16.59, 16.62)

∂A X ∂ X
A−1 = e (ξ)α i = (BA)−1
Jα A (BA) = g (ξ)α i ⇒ A
Jα BA e (ξ) = BA
g (ξ)
∂ξi α
∂ξi α
e (ξ) =
det A g (ξ)
det BA

e (ξ) = det BA
where we have used the linear independence of Jα . It is then obvious that det A g (ξ), and applying Eq. (16.60)
we get
ρBA (ξ) = ρA (ξ) (16.63)
for arbitrary local coordinates {ηi } at BA, and using Eqs. (16.61, 16.63), we have

∂ξ ∂ (ξi ) ρA (ξ) ∂ (ηi )


ρBA (η) = ρBA (ξ) · det = ρA (ξ) ⇒ =
∂η ∂ (ηi ) ρBA (η) ∂ (ξi )

we can call η as ξ ′ where ξ ′ is associated with A′ = BA so that Eq. (16.49) is satisfied in general, and Eq. (16.60) defines an
invariant measure on the group parameter space. QED.
It worths remarking that this definition is independent of the choice of generators {Jα } in Eq. (16.59). It is because a
change of basis for the Lie algebra will modify the weight function by an overall constant only (given by the determinant of
the transformation matrix). Further, this proof does not depend on the details of the group, but only on the fact that they are
determined by a continuous set (manifold) of group parameters, and they have generators Jα . We restrict the algorithm for
compact groups to ensure that the volume is finite (see for instance Eq. 16.64), and that all integrals over the group manifold
exist.

16.6.1 Application of the general method to SU (2) and SO (3)


Applying this method for SU (2) in the Euler angle parameterization in Eq. (15.97) gives
φ   ψ φ   ψ !
e−i 2 cos 21 θ e−i 2 −e−i 2 sin 12 θ ei 2
A = φ   ψ φ   ψ
ei 2 sin 12 θ e−i 2 ei 2 cos 12 θ ei 2

 1
 iψ −i φ
 1
 iψ !
e 2 cos θ e 2 e 2 sin 2  e
θ 2
A−1 = A† = iφ
 21  −i ψ −i φ
 1 −i ψ
−e 2 sin 2 θ e 2 e 2 cos 2 θ e 2
φ   ψ φ   ψ !
∂A − 2i e−i 2 cos 12 θ e−i 2 2i e−i 2 sin 12 θ ei 2
= i iφ
  ψ
i iφ
  ψ
∂φ 2e
2 sin 12 θ e−i 2 2e
2 cos 12 θ ei 2
φ   ψ φ   ψ ! φ   ψ φ   ψ !
2 −1 ∂A ei 2 cos 12 θ ei 2 e−i 2 sin 12 θ ei 2 −e−i 2 cos 12 θ e−i 2 e−i 2 sin 12 θ ei 2
A = φ   ψ φ   ψ φ   ψ φ   ψ
i ∂φ −ei 2 sin 12 θ e−i 2 e−i 2 cos 21 θ e−i 2 ei 2 sin 12 θ e−i 2 ei 2 cos 21 θ ei 2
let us write the elements one-by-one
 1        
2 −1 ∂A φ 1 ψ φ 1 ψ φ 1 ψ φ 1 ψ
A 1 = −ei 2 cos θ ei 2 e−i 2 cos θ e−i 2 + e−i 2 sin θ ei 2 ei 2 sin θ e−i 2
i ∂φ 2 2 2 2
1 1
= − cos2 θ + sin2 θ
2 2
= − cos θ
 1        
2 −1 ∂A iφ 1 iψ −i φ 1 iψ −i φ 1 iψ iφ 1 ψ
A 2 = e 2 cos θ e e 2 2 sin θ e + e
2 2 sin θ e e 2 2 cos θ ei 2
i ∂φ 2 2 2 2
        
1 1 1 1 1 1
= cos θ sin θ e + sin θ cos θ e = 2 cos θ sin θ eiψ
iψ iψ
2 2 2 2 2 2
= sin θ eiψ
302 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

 2        
2 −1 ∂A φ 1 ψ φ 1 ψ φ 1 ψ φ 1 ψ
A 1 = ei 2 sin θ e−i 2 e−i 2 cos θ e−i 2 + e−i 2 cos θ e−i 2 ei 2 sin θ e−i 2
i ∂φ 2 2 2 2
        
1 1 1 1 1 1
= sin θ cos θ e−iψ + cos θ sin θ e−iψ = 2 cos θ sin θ e−iψ
2 2 2 2 2 2
= sin θe−iψ
 2        
2 −1 ∂A iφ 1 −i ψ −i φ 1 iψ −i φ 1 −i ψ iφ 1 ψ
A 2 = −e 2 sin θ e 2 e 2 sin θ e + e
2 2 cos θ e 2 e 2 cos θ ei 2
i ∂φ 2 2 2 2
     
1 1 1 1 1 1
= − sin θ sin θ + cos θ cos θ = cos2 θ − sin2 θ
2 2 2 2 2 2
= cos θ

turning back to the matrix form


   
2 −1 ∂A − cos θ sin θ eiψ − cos θ sin θ (cos ψ + i sin ψ)
A = =
i ∂φ sin θ e−iψ cos θ sin θ (cos ψ − i sin ψ) cos θ
     
− cos θ 0 0 sin θ cos ψ 0 i sin θ sin ψ
= + +
0 cos θ sin θ cos ψ 0 −i sin θ sin ψ 0
     
1 0 0 1 0 −i
= − cos θ + sin θ cos ψ − sin θ sin ψ
0 −1 1 0 i 0
= σ1 sin θ cos ψ − σ2 sin θ sin ψ − σ3 cos θ

proceeding the same for the other two coordinates we have


2 −1 ∂A
A = σ1 sin θ cos ψ − σ2 sin θ sin ψ − σ3 cos θ
i ∂φ
2 −1 ∂A 2 −1 ∂A
A = −σ1 sin ψ − σ2 cos ψ ; A = −σ3
i ∂θ i ∂ψ
taking into account that Jk = σk /2 for SU (2) and SO (3) we have
∂A
A−1 = iJ1 sin θ cos ψ − iJ2 sin θ sin ψ − iJ3 cos θ
∂φ
∂A ∂A
A−1 = −iJ1 sin ψ − iJ2 cos ψ ; A−1 = −iJ3
∂θ ∂ψ
 
   i sin θ cos ψ −i sin ψ 0
A−1 ∂A
∂φ A−1 ∂A
∂θ A−1 ∂A
∂ψ = J1 J2 J3  −i sin θ sin ψ −i cos ψ 0 
−i cos θ 0 −i
e (φ, θ, ψ) becomes
so that from Eq. (16.59) the matrix A
 
i cos ψ sin θ −i sin ψ 0
e (φ, θ, ψ) =  −i sin θ sin ψ
A −i cos ψ 0 
−i cos θ 0 −i

and from Eq. (16.60) ρA (φ, θ, ψ) = det Ae (φ, θ, ψ). Hence



i cos ψ sin θ −i sin ψ 0 cos ψ sin θ − sin ψ 0

ρA = −i sin θ sin ψ −i cos ψ 0 = i3 − sin θ sin ψ − cos ψ 0

−i cos θ 0 −i − cos θ 0 −1

cos ψ sin θ − sin ψ sin ψ
i = −i sin θ cos ψ
− sin θ sin ψ − cos ψ − sin ψ cos ψ = −i sin θ
de
τA = −i sin θ dθ dφ dψ = i dφ d (cos θ) dψ
⇒ de
τA ≡ dφ d (cos θ) dψ

where we have omitted the overall factor i. Therefore, we have reproduced Eq. (16.58) except for a global constant. In what
follows, we shall normalize the invariant measure by means of the “group volume”
Z
VG ≡ dφ d (cos θ) dψ (16.64)
16.7. ORTHONORMALITY RELATIONS OF D(J) 303

from which we obtain the normalized invariant measure


de
τA dφ d (cos θ) dψ
dτA ≡ = (16.65)
VG VG
where
Z 2π Z 1 Z 2π
VSO(3) = dφ d (cos θ) dψ = 8π 2 (16.66)
0 −1 0
Z 2π Z 1 Z 4π
VSU(2) = dφ d (cos θ) dψ = 16π 2 (16.67)
0 −1 0

the rearrangement lemma for (say) SU (2) yields


Z 2π Z 1 Z 4π Z 2π Z 1 Z 4π
1 1
dφ d (cos θ) dψ f [A (φ, θ, ψ)] = dφ d (cos θ) dψ f [BA (φ, θ, ψ)] (16.68)
16π 2 0 −1 0 16π 2 0 −1 0

we observe here that we have discussed only the “left-handed invariant integration”, as can be seen from Eq. (16.68). A
corresponding “right-handed invariant integration” in which B appears to the right-hand side of A, should be in principle
constructed and perhaps it has its own weight function since the groups are non-abelian. However, it can be proved that for
compact Lie groups such as SU (2) and SO (3), the two weight functions coincide.

16.7 Orthonormality relations of D(j)


From the normalized invariant integration measure for SU (2), we can define quantities similar to Eqs. (7.25, 7.31) which were
essential in proving the central theorems of Chapter 7. This integrated quantities over the group manifold permits to prove
that all the important results of the representation theory of finite groups also hold for SU (2) and SO (3) and indeed to all
compact Lie groups. For instance, every irreducible representation of SU (2) is finite-dimensional and equivalent to a unitary
representation. It leads in turn to the fact that if a representation is reducible is fully reducible (see theorem 7.5). We have
already constructed all the unitary irreducible representations of the SU (2) and SO (3) groups in Sec. 15.5. We now write in
detail the orthonormality relations satisfied by the representation functions

Theorem 16.5 (Orthonormality of D [R]): The irreducible representation matrices D(j) (A) for the SU (2) group satisfy the
following orthonormality condition:
Z
′ n′
(A) n D(j ) (A) m′ = δj j δn n δm′ m
† m ′ ′
(2j + 1) dτA D(j) (16.69)

† m  n ∗
where D(j) (A) n = D(j) (A) m , and dτA is the normalized invariant measure.

We interpret it as an orthonormality condition on the ground of considering (j, m, n) as indices for “vectors” D, with
“components” labelled by the group elements A. If we use the Euler-angle parameterization of A and use the fact that the
φ−dependence and ψ−dependence of D(j) (φ, θ, ψ) is known (see Eq. 15.85), the relation (16.69) can be converted into
Z h ih i
′ n′
(2j + 1) dτA eiφn d(j) (θ) m eiψm e−iφn d(j ) (θ) m′ e−iψm = δj j δn n δm′ m
n ′ ′ ′ ′
(16.70)

n
where we have used the Cordon-Shortley convention in which the elements d(j) (θ) m are all real. Using Eqs. (16.65, 16.67),
Eq. (16.70) becomes
Z
dφ d (cos θ) dψ iψ(m−m′ ) iφ(n−n′ ) (j) ′ n′
d (θ) m d(j ) (θ) m′ =
n ′ ′
(2j + 1) 2
e e δj j δn n δm′ m
16π
Z Z 4π Z 1
(2j + 1) 2π ′
iφ(n−n ) iψ (m−m′ ) ′ n′
d (cos θ) d(j) (θ) m d(j ) (θ) m′ =
n ′ ′

2
dφ e dψ e δj j δn n δm′ m
16π 0 0 −1
  Z 1
(2j + 1) ′ n′
d (cos θ) d(j) (θ) m d(j ) (θ) m′ =
′ n ′ ′

2
2πδn n (4πδm′ m ) δj j δn n δm′ m
16π −1

and the orthonormality relation simplifies to5


Z
2j + 1 1 ′
d (cos θ) d(j) (θ) n d(j ) (θ) n = δj j
m m ′
; m, n are f ixed (no sumation) (16.71)
2 −1
5 Note that n 6= n′ and/or m 6= m′ lead to a triviality.
304 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

note that for SO (3) we should change the group volume to 8π 2 and the range for the angle ψ will be [0, 2π]. These changes
lead however to the same result Eq. (16.71), as expected since both groups possess the same representations.
Comparing with the orthonormality relations for finite groups, we see that we can pass from Eq. (7.35) page 144 to Eq.
(16.69) with the following replacements
Z Z
1 X de
τA
nµ → 2j + 1 , → ≡ dτA (16.72)
nG VG
g∈G

where dτA is the normalized invariant measure, such that the number of elements of the group nG is replaced by the “group
volume” VG .

16.8 Completeness relations of D(j)


m
In analogy with the case of finite groups and the SO (2) group, we expect that the functions D(j) (φ, θ, ψ) n form a basis
in the space of functions defined on the group manifold. Notwithstanding, the Hilbert space we are dealing with, is infinite-
dimensional (remember that the number of components of a given “vector” equals the number of elements of the group). A
rigorous formulation for completeness in infinite-dimensional Hilbert spaces is out of the scope of this treatment. However, the
result is similar to theorem 7.11, Eq. (7.43) of page 145 for the case of finite groups, and also similar to the Fourier theorem
14.6, of page 249 for SO (2). This is the celebrated Peter-Weyl theorem.

Theorem 16.6 (Peter-Weyl theorem on completeness of D [R]): The irreducible representation functions D(j) (A)m n form a
complete basis in the space of (Lebesgue) square-integrable functions defined on the group manifold.

Hence, if f (A) is any of these functions with A belonging to the group we have
X m
n
f (A) = fjm D(j) (A) n (16.73)
j,m,n

and from the orthonormality Eq. (16.69) we find the inverse relation. Multiplying both sides of (16.73) by the same factor on
the left-hand side we have
† n † n
X r
s
(2j + 1) D(j) (A) m f (A) = (2j + 1) D(j) (A) m fkr D(k) (A) s
k,r,s

integrating both sides over the normalized group volume, and using the orthonormality expressed by Eq. (16.69), we have
Z X Z

(2j + 1) dτA D(j) (A)n m f (A) = s
fkr †
(2j + 1) dτA D(j) (A)n m D(k) (A)r s
k,r,s
X
s
= fkr δj k δn s δm r
k,r,s

so that the inverse relation of Eq. (16.73) is


Z
n † n
fjm = (2j + 1) dτA D(j) (A) mf (A) (16.74)

the two equations (16.73, 16.74) can be combined in a single formal equation
X m † n
(2j + 1) D(j) (A) n D(j) (A′ ) m = δ (A − A′ ) (16.75)
j,m,n

To see it, we multiply Eq. (16.75) by f (A′ ) and integrate over the group volume on both sides
X Z Z
(j) m † ′ n ′
(2j + 1) dτA′ D (A) n D(j) (A ) m f (A ) = dτA′ f (A′ ) δ (A − A′ )
j,m,n
X Z
(j) m † n
D (A) n (2j + 1) dτA′ D(j) (A′ ) m f (A′ ) = f (A)
j,m,n
X
n
fjm D(j) (A)m n = f (A) (16.76)
j,m,n
16.9. COMPLETENESS RELATIONS FOR BOSE-EINSTEIN AND FERMI-DIRAC FUNCTIONS 305

n
where we have defined fjm as in Eq. (16.74). So we have arrived to Eq. (16.73). Since the group elements A, A′ depend on
a continuous set of parameters, the Dirac delta function in Eq. (16.75) must be suitably defined in terms of the parameter
space. For instance, by using the Euler angle parameterization we have

δ (A − A′ ) = 16π 2 δ (φ − φ′ ) δ (cos θ − cos θ′ ) δ (ψ − ψ ′ )

The Peter-Weyl theorem holds for all compact Lie groups, and represents a significant generalization of the Classical Fourier
theorem. Note, in addition that the function f (A) does not necessarily mean a c−number, it can also be a vector valued or
an operator-valued function on the group. This theorem has many applications and we shall see some of them.

16.9 Completeness relations for Bose-Einstein and Fermi-Dirac functions


It is usual in Physics to deal with two types of functions in terms of the Euler parameterization (a) Bose-Einstein functions:
f (φ, θ, ψ + 2π) = f (φ, θ, ψ) and (b) Fermi-Dirac functions: f (φ, θ, ψ + 2π) = −f (φ, θ, ψ). Of course, the Peter-Weyl theorem
can be applied for these particular types of functions. Let us characterize the coefficients of the expansion (16.73) for both
kind of functions. According with Eq. (16.74), an arbitrary function f (φ, θ, ψ) satisfies
Z
n
fjm = (2j + 1) dτA D(j) †
(A)n m f (A)

and using the manifold and normalized invariant measure of SU (2) we have
Z 2π Z 1 Z 4π
n (2j + 1) ∗ m
fjm = dφ d (cos θ) dψ D(j) (φ, θ, ψ) n f (φ, θ, ψ)
16π 2 0 −1 0
Z 2π Z 1 Z 4π
(2j + 1)
n
fjm = dφ d (cos θ) eiφm d(j) (θ)m n dψ eiψn f (φ, θ, ψ)
16π 2 0 −1 0

so that
Z Z 1
(2j + 1) 2π
n
fjm = dφ d (cos θ) eiφm d(j) (θ)m n Kn,j (φ, θ) (16.77)
16π 2 0 −1
Z 4π
Kn,j (φ, θ) ≡ dψ eiψn f (φ, θ, ψ) (16.78)
0

the fact that Kn,j (φ, θ) depends on j will be clear soon. Let us concentrate on the integration over ψ by now, we have
Z 4π Z 2π Z 4π
Kn,j (φ, θ) = dψ eiψn f (φ, θ, ψ) = dψ eiψn f (φ, θ, ψ) + dψ eiψn f (φ, θ, ψ)
0 0 2π

taking ψ ′ = ψ − 2π in the second integral, we have


Z 2π Z 2π

Kn,j (φ, θ) = dψ eiψn f (φ, θ, ψ) + dψ ′ ei(ψ +2π )n
f (φ, θ, ψ ′ + 2π)
0 0
Z 2π  
Kn,j (φ, θ) = dψ eiψn f (φ, θ, ψ) + e2πin f (φ, θ, ψ + 2π)
0

where we have taken into account that ψ ′ is dummy. Now, according with the general theory of angular momentum and Eq.
(15.84), the row and column labels (m, n) are integer (half-odd integer) if (j) is integer (half-odd-integer). Consequently we
have
2j
e2πin = e(2πi)j = (−1)
therefore Z 2π h i
2j
Kn,j (φ, θ) = dψ eiψn f (φ, θ, ψ) + (−1) f (φ, θ, ψ + 2π) (16.79)
0

the function f (φ, θ, ψ) has been kept arbitrary so far. Let us define a label β in which

1/2 for F-D functions
β≡ (16.80)
1 for B-E functions

so that
fβ (φ, θ, ψ + 2π) = (−1)2β fβ (φ, θ, ψ)
306 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

and assuming that the function f (φ, θ, ψ) is either of B-E or F-D type, we have
Z 2π h i
2(j+β)
Kn,j (φ, θ) = dψ eiψn fβ (φ, θ, ψ) + (−1) fβ (φ, θ, ψ)
0
Z 2π h i
2(j+β)
Kn,j (φ, θ) = dψ eiψn fβ (φ, θ, ψ) 1 + (−1)
0

replacing it in the complete expression for the coefficients of the expansion Eq. (16.77), we have
Z Z 1 Z 2π h i
n (2j + 1) 2π iφm (j) m iψn 2(j+β)
fjm = dφ d (cos θ) e d (θ) n dψ e f β (φ, θ, ψ) 1 + (−1)
16π 2 0 −1 0
Z 2π Z 1 Z 2π h i h i
n (2j + 1) iφm (j) m iψn 2(j+β)
fjm = dφ d (cos θ) dψ e d (θ) n e f β (φ, θ, ψ) 1 + (−1)
16π 2 0 −1 0
Z 2π Z 1 Z 2π h i
n (2j + 1) † n 2(j+β)
fjm = 2
dφ d (cos θ) dψ D(j) (φ, θ, ψ) m fβ (φ, θ, ψ) 1 + (−1) (16.81)
16π 0 −1 0
h i
2(j+β)
and many features of these coefficients depend on the term 1 + (−1) . For β = 1/2 (F-D case) we see that this term
vanishes if j is integer, and if j is half-odd-integer Eq. (16.81) becomes
Z Z 1 Z 2π
n (2j + 1) 2π † n
fjm = dφ d (cos θ) dψ D(j) (φ, θ, ψ) m fβ (φ, θ, ψ) (16.82)
8π 2 0 −1 0

on the other hand, for β = 1 (B-E case) the coefficients vanish if j is half-odd-integer, and when j is integer the form of the
coefficient coincide with Eq. (16.82).

16.9.1 Summary of properties of Bose-Einstein and Fermi-Dirac functions


For physical systems, when we use the Euler angles parameterization, the functions f (A) usually belong to either of two
categories: (a) Bose-Einstein case: f (φ, θ, ψ + 2π) = f (φ, θ, ψ) and (b) Fermi-Dirac case: f (φ, θ, ψ + 2π) = −f (φ, θ, ψ). In
the Bose-Einstein, and Fermi-Dirac cases we have
m ′ 1
fjm = 0 , ∀j = n + , n = 0, 1, 2, 3, . . . (B − E)
2

m
fjm = 0 , ∀j = n , n = 0, 1, 2, 3, . . . (F − D)
and in both cases the non-vanishing coefficients in terms of the Euler angles is given by Eq. (16.82), i.e.
Z Z 1 Z 2π
m′ 2j + 1 2π † m′
fjm = dφ d cos θ dψ D(j) (φ, θ, ψ) m f (φ, θ, ψ) (16.83)
8π 2 0 −1 0

By using the definition (16.80) we could summarize all these results as follows: For functions of the type fβ (φ, θ, ψ) the
coefficients of the expansion (16.73) are given by Eq. (16.81), i.e.
2j + 1 h i Z 2π Z 1 Z 2π
m′ 2(j+β) † m′
fjm,β = 2
1 + (−1) dφ d (cos θ) dψ D(j) (φ, θ, ψ) m fβ (φ, θ, ψ) (16.84)
16π 0 −1 0

16.10 Completeness relations for partially separable functions


In many applications, the function f (φ, θ, ψ) acquires the form

f (φ, θ, ψ) = e−iλψ fe(φ, θ) (16.85)


where λ is either integer or half-odd-integer. In that case Eq. (16.83) becomes
Z Z 1 Z 2π h i
m′ 2j + 1 2π ∗ m −iλψ e
fjm = dφ d cos θ dψ D (j) (φ, θ, ψ) m ′ e f (φ, θ)
8π 2 0 −1 0
Z 2π Z 1 Z 2π h ih i
m′ 2j + 1 iφm (j) m iψm′ −iλψ e
fjm = dφ d cos θ dψ e d (θ) m ′ e e f (φ, θ)
8π 2 0 −1 0
Z 2π Z 2π Z 1 h ih i
m′ 2j + 1 iψm′ −iλψ iφm (j) m e(φ, θ)
fjm = dψ e e dφ d cos θ e d (θ) m ′ f
8π 2 0 0 −1
Z 2π  Z 2π Z 1 h ih i
m ′ 2j + 1 ′
iψ (m −λ) iφm (j) m i0·m′ e(φ, θ)
fjm = dψ e dφ d cos θ e d (θ) m ′e f (16.86)
8π 2 0 0 −1
16.10. COMPLETENESS RELATIONS FOR PARTIALLY SEPARABLE FUNCTIONS 307

where we have used Eq. (15.85) and the Cordon-Shortley convention. Now, since
Z 2π

dψ eiψ(m −λ) = 2π δm′ ,λ (16.87)
0

we see from Eqs. (16.86, 16.87) that if f (φ, θ, ψ) acquires the form of Eq. (16.85), the coefficients become

m λ
fjm = fjm δm′ ,λ and (16.88)
Z 2π Z 1
2j + 1 λ
λ
fjm = dφ †
d (cos θ) D(j) (φ, θ, 0) m fe(φ, θ) (16.89)
4π 0 −1

Replacing Eq. (16.88) in Eq. (16.73) and following the same procedure to go from (16.73) to (16.75) we obtain
X m
λ
f (A) = fjm D(j) (A) λ (16.90)
j,m
X m † λ
(2j + 1) D(j) (A) λ D(j) (A′ ) m = δ (A − A′ ) (16.91)
j,m

Therefore, for the set of all (Lebesgue) square-integrable functions f (φ, θ, ψ) of the form described by Eq. (16.85), the
completeness relations given by Eqs. (16.73, 16.75) are reduced to Eqs. (16.90, 16.91)
On the other hand, setting λ = 0 in Eqs. (16.85, 16.89) we obtain
Z Z 1 Z Z 1
2j + 1 2π 0 2j + 1 2π  m ∗
0
fjm = dφ †
d (cos θ) D(j) (φ, θ, 0) m fe(φ, θ) = dφ d (cos θ) D(j) (φ, θ, 0) 0 fe(φ, θ)
4π 0 −1 4π 0 −1
Z 2π Z 1 h i
2j + 1 m
= dφ d (cos θ) eiφ·m d(j) (θ) 0 ei0·0 fe(φ, θ)
4π 0 −1
Z Z 1
2j + 1 2π m
0
fjm = dφ d (cos θ) eimφ d(j) (θ) 0 fe(φ, θ) (16.92)
4π 0 −1

16.10.1 Completeness for λ = 0 and spherical harmonics


It is convenient to conjugate Eq. (16.92)
Z 2π Z 1
∗ 2j + 1 m
0
fjm = dφ d (cos θ) e−imφ d(j) (θ) 0 fe∗ (φ, θ) (16.93)
4π 0 −1

Where we have taken into account that d(j) (θ) is real in the Cordon-Shortley convention. On the other hand, when λ = 0,
it is clear from Eq. (16.85), that f (φ, θ, ψ) depends only on two angles φ, θ. For the case λ = 0, let us call the j−label as l (as
usual when the angular momentum is integer6 ), and make the following redefinitions
r

0 ∗ 2l + 1
e∗
f (φ, θ) ≡ F (θ, φ) ; flm ≡ flm (16.94)

r r
2l + 1 (l) m 2l + 1 −imφ (j) m ∗
D (φ, θ, 0) 0 = e d (θ) 0 ≡ Ylm (θ, φ) (16.95)
4π 4π
using the definitions (16.94, 16.95) in Eq. (16.93) we obtain
r r Z Z 1 "r #
2l + 1 2l + 1 2π 2l + 1 imφ (l) m
flm = dφ d (cos θ) e d (θ) 0 F (θ, φ) ⇒
4π 4π 0 −1 4π
Z 2π Z 1

flm = dφ d (cos θ) Ylm (θ, φ) F (θ, φ) (16.96)
0 −1

replacing Eqs. (16.85, 16.94, 16.95) with λ = 0 in the reduced Peter-Weyl theorem Eq. (16.90) we have
r
X m
X 2l + 1 m
X
e
f (φ, θ) = 0 (l) ∗
flm D (φ, θ, 0) 0 ⇒ F (θ, φ) = ∗
flm D(l) (φ, θ, 0) 0 = ∗
flm ∗
Ylm (θ, φ) ⇒

lm lm lm
X
F (θ, φ) = flm Ylm (θ, φ) (16.97)
l,m

6 From Eqs. (16.88, 16.89), we see that λ, m, m′ must be of the same type (integer or half-odd-integer), to have non-trivial contributions, and so
j as well.
308 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

putting together Eqs. (16.96, 16.97), we are led to a corollary of the Peter-Weyl theorem. If F (θ, φ) is any (Lebesgue) square-
integrable function7 on the space defined by the ranges 0 ≤ θ ≤ π and 0 ≤ φ < 2π, this function can be expanded in terms of
the functions Ylm (θ, φ) defined in Eq. (16.95), hence
X Z

F (θ, φ) = flm Ylm (θ, φ) ; flm = d (cos θ) dφ Ylm (θ, φ) F (θ, φ) (16.98)
l,m

so that the functions {Ylm (θ, φ)} form an orthonormal basis in the space of square-integrable functions defined on the unit
sphere. By now, Eq. (16.95) is simply a redefinition of some of the representation functions D(j) (A)m n of SO (3). We obtain in
Sec. 16.12 a differential equation for these functions (generated by group-theoretical methods), and we find that they coincide
with the celebrated spherical harmonics.

16.11 Generalized projection operators in SO (3) and SU (2)


The most convenient way to apply the Peter-Weyl theorem to vectors and operators is by means of the generalized projection
operators defined in Sec. 9.2, Eq. (9.7), page 178. The generalization of these Projections to the continuous groups SO (3)
and SU (2) can be written by taking into account the assignments (16.72) in Eq. (9.7) so that
Z
n
Pjm = (2j + 1) dτA D(j)†
(A)n m U (A) (16.99)

where we have assumed unitary representations. From theorem 9.2, we see that given any vector |xi in the representation
space, the set  n
Pjm |xi , m = −j, . . . , j ≡ {|j, m; ni} j, n f ixed (16.100)
forms a basis for an irreducibly invariant (j)−subspace under SU (2) or SO (3), i.e. they are canonical basis vectors {|j, m; ni}
for such subspaces. Since n is fixed, we call these vectors simply as {|j, mi}. In addition, according with definition 9.1 page
176, the set
{|j, mi , m = −j, . . . , j}
provides an irreducible set of basis vectors transforming according with the (j) −representation. Consequently, theorem 9.3,
Eq. (9.11) is valid for such a set, hence

n
Pjm |j ′ m′ i = |j ′ mi δj j δm′ n (16.101)
we shall apply these results in Sec. 17.5 to systems of one and two particles with spin.

16.12 Differential equations and recurrence relations for the D(j) functions
Special functions are usually defined by means of the differential equations that they satisfy. Here we shall establish the
differential equations for our D(j) functions and we see their relationship with well-known special functions of classical analysis.
In the group theoretical approach, the differential operators are associated with the generators of the group that determines
the infinitesimal transformations. So the special functions arising this way acquire geometrical significance. This provides a
better method to understand and classify them.
A prototype of differential equations arising from the generators by employing infinitesimal transformations was obtained
in Eqs. (14.7, 14.40) for SO (2) and T1 (x) respectively. We shall generalize it to the one-parameter subgroups of SU (2) and
SO (3). We start from Eq. (15.59), page 267

R (φ, θ, ψ) = e−iφJ3 e−iθJ2 e−iψJ3 ; R−1 = R† = eiψJ3 eiθJ2 eiφJ3 (16.102)

an infinitesimal transformation dφ is parameterized as

R (dφ, 0, 0) = E − idφJ3

and considering the rotation R (φ + dφ, θ, ψ), we write it in two ways

R (φ + dφ, θ, ψ) = R (dφ, 0, 0) R (φ, θ, ψ) = [E − idφJ3 ] R (φ, θ, ψ) = R (φ, θ, ψ) − idφJ3 R (φ, θ, ψ)


∂R (φ, θ, ψ)
R (φ + dφ, θ, ψ) = R (φ, θ, ψ) + dφ
∂φ
7 Note that if fe (φ, θ) runs over all square integrable functions, then its conjugate also does. Consequently, the function F (θ, φ) defined in Eq.

(16.94), runs over all integrable functions over θ and φ.


16.12. DIFFERENTIAL EQUATIONS AND RECURRENCE RELATIONS FOR THE D(J) FUNCTIONS 309

and comparing both expressions we find


∂R (φ, θ, ψ)
= −iJ3 R (φ, θ, ψ) (16.103)
∂φ
note that this equation can also be obtained by direct differentiation of Eq. (16.102) respecting the order of the products

∂R (φ, θ, ψ) ∂  −iφJ3 −iθJ2 −iψJ3  ∂e−iφJ3 −iθJ2 −iψJ3 


= e e e = e e = −iJ3 e−iφJ3 e−iθJ2 e−iψJ3 = −iJ3 R (φ, θ, ψ)
∂φ ∂φ ∂φ

we can insert an identity to write Eq. (16.103) as

∂R  
= −iJ3 R (φ, θ, ψ) = −iR R−1 J3 R (16.104)
∂φ

similarly we can obtain the differential equations for θ and ψ


 −iθJ2 
∂R (φ, θ, ψ) ∂  −iφJ3 −iθJ2 −iψJ3  −iφJ3 ∂e
= e e e =e e−iψJ3 = −ie−iφJ3 e−iθJ2 J2 e−iψJ3
∂θ ∂θ ∂θ
    
= −ie−iφJ3 e−iθJ2 e−iψJ3 eiψJ3 J2 e−iψJ3 = −i e−iφJ3 e−iθJ2 e−iψJ3 eiψJ3 J2 e−iψJ3
∂R (φ, θ, ψ)  
= −iR eiψJ3 J2 e−iψJ3 (16.105)
∂θ

∂R (φ, θ, ψ) ∂  −iφJ3 −iθJ2 −iψJ3  ∂ −iψJ3


= e e e = e−iφJ3 e−iθJ2 e = e−iφJ3 e−iθJ2 e−iψJ3 (−iJ3 )
∂ψ ∂ψ ∂ψ
∂R (φ, θ, ψ)
= −iRJ3 (16.106)
∂ψ

Therefore, Eqs. (16.104, 16.105, 16.106) show the effects due to the infinitesimal changes in the three group parameters
(φ, θ, ψ), which are described by.

∂  
i R (φ, θ, ψ) = J3 R (φ, θ, ψ) = R R−1 J3 R (16.107)
∂φ
∂  
i R (φ, θ, ψ) = R eiψJ3 J2 e−iψJ3 (16.108)
∂θ

i R (φ, θ, ψ) = RJ3 (16.109)
∂ψ

16.12.1 Some useful formulas


The quantities inside the square brackets on the RHS of Eqs. (16.107, 16.108) can be written as linear combinations of J3 and
the ladder operators by using the Baker-Hausdorff-Campbell theorem

1 1
eiαG He−iαG = H + [iαG, H] + [iαG, [iαG, H]] + [iαG, [iαG, [iαG, H]]] + . . . (16.110)
2! 3!
this equation can be rewritten appropriately by defining

b (A) ≡ [iαG, A]
G (16.111)

from which Eq. (16.110) is rewritten as

b (H) + 1 G
eiαG He−iαG = H + G b 2 (H) + 1 Gb3 (H) + . . . = eGb H
2! 3!

the term in square brackets in (16.108) becomes

X∞ bn
1 b2 1 J3 (J2 )
eiψJ3 J2 e−iψJ3 = J2 + Jb3 (J2 ) + J3 (J2 ) + Jb33 (J2 ) + . . . = ; Jb3 (A) ≡ [iψJ3 , A] (16.112)
2! 3! n=0
n!
310 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

the succesive powers become

Jb30 (J2 ) = (iψ)0 J2 ; Jb3 (J2 ) ≡ [iψJ3 , J2 ] = iψ [J3 , J2 ] = (−i) (iψ) J1


h i
2 2 2
Jb32 (J2 ) = iψJ3 , Jb3 (J2 ) = [iψJ3 , (−i) (iψ) (J1 )] = (−i) (iψ) [J3 , J1 ] = (−i) (iψ) iJ2 = (iψ) J2
h i h i
2 3 3 3
Jb33 (J2 ) = iψJ3 , Jb32 (J2 ) = iψJ3 , (iψ) J2 = (iψ) [J3 , J2 ] = (iψ) (−iJ1 ) = (−i) (iψ) J1
h i h i
3 4 4 4
Jb34 (J2 ) = iψJ3 , Jb33 (J2 ) = iψJ3 , (−i) (iψ) J1 = (−i) (iψ) [J3 , J1 ] = (−i) (iψ) iJ2 = (iψ) J2
h i h i
4 5 5
Jb35 (J2 ) = iψJ3 , Jb34 (J2 ) = iψJ3 , (iψ) J2 = (iψ) [J3 , J2 ] = (−i) (iψ) J1

the rule is clearly


2n n n
Jb32n (J2 ) = (iψ) J2 = i 2 ψ 2n J2 = (−1) ψ 2n J2
2n+1 n+1 n+1 n
Jb32n+1 (J2 ) = (−i) (iψ) J1 = −i2n+2 ψ 2n+1 J1 = − i2 ψ 2n+1 J1 = − (−1) ψ 2n+1 J1 = (−1) ψ 2n+1 J1

the rule simplifies to


n n
Jb32n (J2 ) = (−1) ψ 2n J2 ; Jb32n+1 (J2 ) = (−1) ψ 2n+1 J1 (16.113)
rigorous proof can be carried out by induction, assuming the validity of the first of Eqs. (16.113) we have
2(n+1)   
Jb3 (J2 ) = Jb32 Jb32n (J2 ) = Jb32 (−1)n ψ 2n J2 = Jb3 iψJ3 , (−1)n ψ 2n J2 = i (−1)n ψ 2n+1 Jb3 ([J3 , J2 ])
n n n
= i (−1) ψ 2n+1 Jb3 (−iJ1 ) = i (−1) ψ 2n+1 [iψJ3 , −iJ1 ] = i (−1) ψ 2n+2 [J3 , J1 ]
n n+1
= i2 (−1) ψ 2n+2 J2 = (−1) ψ 2(n+1) J2

and similarly for the second of Eqs. (16.113). Substituting Eqs. (16.113) in Eq. (16.112) we obtain
∞ bk
X X∞ b2n ∞ ∞ n ∞ n
J (J2 )
3 J3 (J2 ) X Jb32n+1 (J2 ) X (−1) ψ 2n X (−1) ψ 2n+1
eiψJ3 J2 e−iψJ3 = = + = J2 + J1
k! n=0
(2n)! n=0
(2n + 1)! n=0
(2n)! n=0
(2n + 1)!
k=0
   
J− − J+ J+ + J−
eiψJ3 J2 e−iψJ3 = J2 cos ψ + J1 sin ψ = i cos ψ + sin ψ (16.114)
2 2
J+ J− J+ h  π  π i
= (sin ψ − i cos ψ) + (sin ψ + i cos ψ) = cos ψ − + i sin ψ −
2 2 2 2 2
J− h  π  π i
+ cos ψ − − i sin ψ −
2 2 2
J+ i(ψ− π2 ) J− −i(ψ− π2 ) J+ iψ −i π J− −iψ i π
= e + e = e e 2 + e e 2
2 2 2 2
so we obtain finally
iJ+ iψ iJ− −iψ
eiψJ3 J2 e−iψJ3 = − e + e (16.115)
2 2
we have expressed this quantity in terms of J± instead of J1 , J2 because it is easier to handle with the action of J± on the
canonical basis |j, mi.
It worths remarking that the same result can be obtained by combining Eqs. (15.52, 16.102), from which
h i  
eiψJ3 J2 e−iψJ3 = e−i0·J3 e−i0·J2 e−i(−ψ)J3 J2 e−i0·J3 e−i0·J2 e−iψJ3 = R (0, 0, −ψ) J2 R (0, 0, ψ) = R (0, 0, −ψ) J2 R−1 (0, 0, −ψ)
= Jl Rl 2 (0, 0, −ψ) = J1 R1 2 (0, 0, −ψ) + J2 R2 2 (0, 0, −ψ) + J3 R3 2 (0, 0, −ψ)

and using the explicit form of the rotation matrix in the y−convention Eq. (15.17) page 259, we get

eiψJ3 J2 e−iψJ3 = −J1 sin (−ψ) + J2 cos (−ψ) = J2 cos ψ + J1 sin ψ

which coincides with the first of Eqs. (16.114). The remaining of the procedure to arrive to Eq. (16.115) is exactly the same.
Nevertheless, we should point out that this alternative method depends on relation (15.52) which is only valid for rotations in
the Euclidean space (or the j = 1 representation), and also on the explicit form of the matrix of three-dimensional rotations
(which in turn was used in the y−convention). In contrast, the derivation based on the Baker-Hausdorff-Campbell (BHC)
formula only depended on the Lie algebra of the generators. Therefore, the BHC derivation guarantees that Eq. (16.115) is
valid in any representation and is independent of any choice of basis and/or convention.
16.12. DIFFERENTIAL EQUATIONS AND RECURRENCE RELATIONS FOR THE D(J) FUNCTIONS 311

Now we calculate the quantity on the square brackets of the RHS of Eqs. (16.107), by using (16.102, 16.110) we have
     
R−1 J3 R = eiψJ3 eiθJ2 eiφJ3 J3 e−iφJ3 e−iθJ2 e−iψJ3 = eiψJ3 eiθJ2 eiφJ3 J3 e−iφJ3 e−iθJ2 e−iψJ3
 
= eiψJ3 eiθJ2 J3 e−iθJ2 e−iψJ3 = eiψJ3 eiθJ2 J3 e−iθJ2 e−iψJ3

employing the Baker-Hausdorff-Campbell relation twice we obtain



−1 J+ eiψ + J− e−iψ
R J3 R = − sin θ + J3 cos θ (16.116)
2

16.12.2 Recurrence relations


Substituting Eqs. (16.115, 16.116) in Eqs. (16.107, 16.108, 16.109) the latter become

∂ RJ+ eiψ + RJ− e−iψ
i R = − sin θ + RJ3 cos θ (16.117)
∂φ 2
∂ −RJ+ eiψ + RJ− e−iψ
R = (16.118)
∂θ 2

i R (φ, θ, ψ) = RJ3 (16.119)
∂ψ

multiplying Eq. (16.117) by (2/ sin θ) eiψ and Eq. (16.118) by 2eiψ and adding them we have
 
2i iψ ∂ 2iψ 2 ∂
e R = −RJ+ e − RJ− + eiψ RJ3 cos θ ; 2eiψ R = −RJ+ e2iψ + RJ− ⇒(16.120)
sin θ ∂φ sin θ ∂θ
 
2i iψ ∂ iψ ∂ 2iψ 2
e R + 2e R = −2RJ+ e + eiψ RJ3 cos θ ⇒
sin θ ∂φ ∂θ sin θ

   
i ∂ ∂ 1
2eiψ R+ R− RJ3 cos θ = −2RJ+ e2iψ ⇒
sin θ ∂φ ∂θ sin θ
   
i ∂ ∂ 1
e−iψ R+ R− RJ3 cos θ = −RJ+ (16.121)
sin θ ∂φ ∂θ sin θ

substituting (16.119) in Eq. (16.121) we have


   
−iψ i ∂ ∂ 1 ∂
e R+ R−i cos θ R = −RJ+ (16.122)
sin θ ∂φ ∂θ sin θ ∂ψ

substracting Eqs. (16.120) we have


 
2i iψ ∂ ∂ 2
e R − 2eiψ R = −RJ+ e2iψ − RJ− + eiψ RJ3 cos θ + RJ+ e2iψ − RJ− ⇒
sin θ ∂φ ∂θ sin θ
   
i ∂ ∂ 2
2eiψ R− R = −2RJ− + eiψ RJ3 cos θ ⇒
sin θ ∂φ ∂θ sin θ
 
i ∂ ∂ RJ3 cos θ
eiψ R− R− = −RJ− (16.123)
sin θ ∂φ ∂θ sin θ

substituting Eq. (16.119) in Eq. (16.123) yields


 
iψ i ∂ ∂ cos θ ∂
e R− R−i R = −RJ− (16.124)
sin θ ∂φ ∂θ sin θ ∂ψ

Putting Eqs. (16.119, 16.122, 16.124) together we have

  
∂ i ∂ ∂
e−iψ − − − cos θ R = RJ+ (16.125)
∂θ sin θ ∂φ ∂ψ
  
∂ i ∂ ∂
eiψ − − cos θ R = RJ− (16.126)
∂θ sin θ ∂φ ∂ψ
∂R
i = RJ3 (16.127)
∂ψ
312 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

in these equations, we have isolated the three generators on the RHS of Eqs. (16.107, 16.108, 16.109). Exactly the same
equations apply to the operators U (j) (φ, θ, ψ) on each representation space, because these equations come from the Lie algebra
of the generators. Replacing R by U (j) and sandwiching both sides of equations (16.125, 16.126) between canonical states |jmi
and hjm′ |, we obtain
  
∂ i ∂ ∂
e−iψ − − − cos θ hjm′ | U (j) (φ, θ, ψ) |jmi = hjm′ | U (j) (φ, θ, ψ) J+ |jmi
∂θ sin θ ∂φ ∂ψ
  
∂ i ∂ ∂
eiψ − − cos θ hjm′ | U (j) (φ, θ, ψ) |jmi = hjm′ | U (j) (φ, θ, ψ) J− |jmi
∂θ sin θ ∂φ ∂ψ

using Eqs. (15.73, 15.84) we have


  
−iψ ∂ i ∂ ∂ n
p
e − − − cos θ hjm′ | jniD(j) (φ, θ, ψ) m = hj, m′ | U (j) (φ, θ, ψ) |j, m + 1i j (j + 1) − m (m + 1)
∂θ sin θ ∂φ ∂ψ
  
iψ ∂ i ∂ ∂ n
p
e − − cos θ hjm′ | jniD(j) (φ, θ, ψ) m = hj, m′ | U (j) (φ, θ, ψ) |j, m − 1i j (j + 1) − m (m − 1)
∂θ sin θ ∂φ ∂ψ

using Eq. (15.84) again, and the fact that hj, m′ | j, ni = δm′ ,n we have
  
∂ i ∂ ∂ p
e−iψ − − − cos θ δm′ ,n D(j) (φ, θ, ψ)n m = hj, m′ | j, niD(j) (φ, θ, ψ)n m+1 j (j + 1) − m (m + 1)
∂θ sin θ ∂φ ∂ψ
  
∂ i ∂ ∂ n n
p
eiψ − − cos θ δm′ ,n D(j) (φ, θ, ψ) m = hj, m′ | j, niD(j) (φ, θ, ψ) m−1 j (j + 1) − m (m − 1)
∂θ sin θ ∂φ ∂ψ
so that
  
∂ i ∂ ∂ m′ m′
p
e−iψ − − − cos θ D(j) (φ, θ, ψ) m = D(j) (φ, θ, ψ) m+1 j (j + 1) − m (m + 1)
∂θ sin θ ∂φ ∂ψ
  
∂ i ∂ ∂ m′ m′
p
eiψ − − cos θ D(j) (φ, θ, ψ) m = D(j) (φ, θ, ψ) m−1 j (j + 1) − m (m − 1)
∂θ sin θ ∂φ ∂ψ

Now, using Eq. (15.85), we have


   h i h ip
−iψ ∂ i ∂ ∂ ′ m′ ′ m′
e − − − cos θ e−im φ d(j) (θ) m e−imψ = e−im φ d(j) (θ) m+1 e−i(m+1)ψ j (j + 1) − m (m + 1)
∂θ sin θ ∂φ ∂ψ
   h i h ip
iψ ∂ i ∂ ∂ ′ m′ ′ m′
e − − cos θ e−im φ d(j) (θ) m e−imψ = e−im φ d(j) (θ) m−1 e−i(m−1)ψ j (j + 1) − m (m − 1)
∂θ sin θ ∂φ ∂ψ
after deriving these equations, they become
 h i h i
∂ i ′ m′ ′ m′
p
e−iψ − − (−im′ + im cos θ) e−im φ d(j) (θ) m e−imψ = e−im φ d(j) (θ) m+1 e−imψ e−iψ j (j + 1) − m (m + 1
∂θ sin θ
 h i h i
∂ i ′ ′ ′ ′ p
eiψ − (−im′ + im cos θ) e−im φ d(j) (θ)m m e−imψ = e−im φ d(j) (θ)m m−1 e−imψ eiψ j (j + 1) − m (m − 1)
∂θ sin θ
m′
simplifying we obtain an ordinary differential equation in θ for the reduced matrix elements d(j) (θ) m
 
d 1 m′ m′
p
− − (m − m cos θ) d(j) (θ) m = d(j) (θ) m+1 j (j + 1) − m (m + 1)

(16.128)
dθ sin θ
 
d 1 m′ m′
p
− (m − m cos θ) d(j) (θ) m = d(j) (θ) m−1 j (j + 1) − m (m − 1)

(16.129)
dθ sin θ
m′
Eqs. (16.128, 16.129) are recurrence relations in the column index of d(j) (θ) m induced by the raising and lowering operators
J± .

16.12.3 Differential equations for D (j) (φ, θ, ψ) functions


In order to derive differential equations for D(j) (φ, θ, ψ) and d(j) (θ) that do not involve a change of indices, we combine the
action of J+ and J− . Applying property (15.70) we have
 
RJ 2 = R J+ J− + J32 − J3 (16.130)
16.12. DIFFERENTIAL EQUATIONS AND RECURRENCE RELATIONS FOR THE D(J) FUNCTIONS 313

Now we shall evaluate each term on the RHS of Eq. (16.130). First of all, we apply the operator i∂ψ on both sides of Eq.
(16.127) and taking into account that the generators Ji are independent of the parameters, we obtain
   
2 ∂ ∂R ∂R
i = i J3
∂ψ ∂ψ ∂ψ
and applying Eq. (16.127) on the RHS of this equation, we get
∂2R
− = RJ32 (16.131)
∂2ψ
On the other hand let us rewrite Eqs. (16.125, 16.126) in the form
b
e−iψ AR = RJ+ ; b = RJ−
eiψ BR (16.132)
     
b ∂ i ∂ ∂ b ∂ i ∂ ∂
A ≡ − − − cos θ ; B≡ − − cos θ (16.133)
∂θ sin θ ∂φ ∂ψ ∂θ sin θ ∂φ ∂ψ
b on both sides of the second of Eqs. (16.132) and using the first of Eqs. (16.132) we
applying the differential operator e−iψ A
have h i   h i
e−iψ Aeb iψ B
b R = e−iψ AR b J− ⇒ e−iψ Ae b iψ B
b R = RJ+ J− (16.134)
b and B,
where we have used the fact that J− is independent of the parameters. Using the explicit form of the operators A b the
second of Eqs. (16.134) becomes
     
∂ i ∂ ∂ ∂ i ∂ ∂
e−iψ − − − cos θ eiψ − − cos θ R = RJ+ J− (16.135)
∂θ sin θ ∂φ ∂ψ ∂θ sin θ ∂φ ∂ψ
and substituting Eqs. (16.135, 16.131, 16.127) on the RHS of Eq. (16.130) we obtain
       
∂ i ∂ ∂ ∂ i ∂ ∂ ∂2 ∂
RJ 2 = e−iψ − − − cos θ eiψ − − cos θ − − i R
∂θ sin θ ∂φ ∂ψ ∂θ sin θ ∂φ ∂ψ ∂ψ 2 ∂ψ
substituting R by U (j) (φ, θ, ψ), and sandwiching this equation between canonical states, we obtain
hjm′ | U (j) (φ, θ, ψ) J 2 |jmi
       
−iψ ∂ i ∂ ∂ iψ ∂ i ∂ ∂ ∂2 ∂
= e − − − cos θ e − − cos θ − −i hjm′ | U (j) (φ, θ, ψ) |jmi
∂θ sin θ ∂φ ∂ψ ∂θ sin θ ∂φ ∂ψ ∂ψ 2 ∂ψ
so that
m′
j (j + 1) D(j) (φ, θ, ψ) m
       
−iψ ∂ i ∂ ∂ iψ ∂ i ∂ ∂ ∂2 ∂ m′
= e − − − cos θ e − − cos θ − 2
−i D(j) (φ, θ, ψ) m
∂θ sin θ ∂φ ∂ψ ∂θ sin θ ∂φ ∂ψ ∂ψ ∂ψ
m′
let us abbreviate the notation D(j) (φ, θ, ψ) m to simply D for a while, we find
       
−iψ ∂ i ∂ ∂ iψ ∂D i ∂D ∂D ∂ 2D ∂D
e − − − cos θ e − − cos θ − −i − j (j + 1) D = 0 (16.136)
∂θ sin θ ∂φ ∂ψ ∂θ sin θ ∂φ ∂ψ ∂ψ 2 ∂ψ
Equation (16.136) can be rewritten as
 
∂2D ∂D
W− − i − j (j + 1) D =0 (16.137)
∂ψ 2 ∂ψ
     
−iψ ∂ i ∂ ∂ iψ ∂D i ∂D ∂D
W ≡e − − − cos θ e − − cos θ (16.138)
∂θ sin θ ∂φ ∂ψ ∂θ sin θ ∂φ ∂ψ
and we shall expand the W term
     
∂D i ∂D ∂D ∂ i ∂ ∂
W = e−iψ − − cos θ − − − cos θ eiψ
∂θ sin θ ∂φ ∂ψ ∂θ sin θ ∂φ ∂ψ
     
∂ i ∂ ∂ ∂D i ∂D ∂D
+e−iψ eiψ − − − cos θ − − cos θ
∂θ sin θ ∂φ ∂ψ ∂θ sin θ ∂φ ∂ψ
    
∂D i ∂D ∂D i cos θ ∂
W = e−iψ − − cos θ eiψ
∂θ sin θ ∂φ ∂ψ sin θ ∂ψ
     
∂ i ∂ ∂ ∂D i ∂D ∂D
+ − − − cos θ − − cos θ
∂θ sin θ ∂φ ∂ψ ∂θ sin θ ∂φ ∂ψ
314 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

     
∂D i ∂D ∂D cos θ ∂ ∂D i ∂D ∂D
W = − − − cos θ − − − cos θ
∂θ sin θ ∂φ ∂ψ sin θ ∂θ ∂θ sin θ ∂φ ∂ψ
     
i ∂ ∂D i ∂D ∂D i cos θ ∂ ∂D i ∂D ∂D
− − − cos θ + − − cos θ
sin θ ∂φ ∂θ sin θ ∂φ ∂ψ sin θ ∂ψ ∂θ sin θ ∂φ ∂ψ

   
∂D i ∂D i cos θ ∂D cos θ ∂ ∂D i ∂D i cos θ ∂D
W = − − + − − +
∂θ sin θ ∂φ sin θ ∂ψ sin θ ∂θ ∂θ sin θ ∂φ sin θ ∂ψ
 2  2 2
  2
 2 
i ∂ D i ∂ D ∂ D i cos θ ∂ D i ∂ D ∂2D
− − − cos θ + − − cos θ
sin θ ∂φ∂θ sin θ ∂φ2 ∂φ∂ψ sin θ ∂ψ∂θ sin θ ∂ψ∂φ ∂ψ 2

 2   
cos θ ∂D i cos θ ∂D i cos2 θ ∂D ∂ D ∂D ∂ 1 i ∂2D ∂D ∂ (cot θ) i cos θ ∂ 2 D
W = − + − − − i − + i +
sin θ ∂θ sin θ sin θ ∂φ sin2 θ ∂ψ ∂θ2 ∂φ ∂θ sin θ sin θ ∂θ∂φ ∂ψ ∂θ sin θ ∂θ∂ψ
 2 2 2
  
i ∂ D i ∂ D i cos θ ∂ D i cos θ ∂ 2 D i ∂ 2D i cos θ ∂ 2 D
− − + + − +
sin θ ∂φ∂θ sin θ ∂φ2 sin θ ∂φ∂ψ sin θ ∂ψ∂θ sin θ ∂ψ∂φ sin θ ∂ψ 2

cos θ ∂D i cos θ ∂D i cos2 θ ∂D ∂ 2 D ∂D cos θ i ∂2D 1 ∂D i cos θ ∂ 2 D


W = − + 2 − 2 − 2
−i 2 + +i 2 −
sin θ ∂θ sin θ ∂φ sin θ ∂ψ ∂θ ∂φ sin θ sin θ ∂θ∂φ sin θ ∂ψ sin θ ∂θ∂ψ
2 2 2 2 2 2 2
i ∂ D 1 ∂ D cos θ ∂ D i cos θ ∂ D cos θ ∂ D cos θ ∂ D
− − + + + −
sin θ ∂φ∂θ sin2 θ ∂φ2 sin2 θ ∂φ∂ψ sin θ ∂ψ∂θ sin2 θ ∂ψ∂φ sin2 θ ∂ψ 2
reordering by putting common derivatives together we have

cos θ ∂D ∂ 2 D i ∂2D i ∂2D i cos θ ∂ 2 D i cos θ ∂ 2 D


W = − − + − + −
sin θ ∂θ ∂θ2 sin θ ∂θ∂φ sin θ ∂φ∂θ sin θ ∂ψ∂θ sin θ ∂θ∂ψ
i cos θ ∂D ∂D cos θ 1 ∂2D 1 ∂D i cos2 θ ∂D cos2 θ ∂ 2 D
+ 2 −i − +i 2 − −
sin θ ∂φ ∂φ sin2 θ sin2 θ ∂φ2 sin θ ∂ψ sin2 θ ∂ψ sin2 θ ∂ψ 2
2 2
cos θ ∂ D cos θ ∂ D
+ 2 +
sin θ ∂φ∂ψ sin2 θ ∂ψ∂φ
symplifying we obtain

cos θ ∂D ∂ 2 D 1 ∂2D 1  ∂D cos2 θ ∂ 2 D cos θ ∂ 2 D


W =− − 2
− 2 2
+ i 2 1 − cos2 θ − 2 2
+2 2 (16.139)
sin θ ∂θ ∂θ sin θ ∂φ sin θ ∂ψ sin θ ∂ψ sin θ ∂φ∂ψ
we observe that      2 
1 ∂ ∂ 1 ∂ ∂D cos θ ∂D ∂ D
sin θ D= sin θ = + (16.140)
sin θ ∂θ ∂θ sin θ ∂θ ∂θ sin θ ∂θ ∂θ2
hence the first two terms on the RHS of Eq. (16.139) can be replaced by the first member in Eq. (16.140)
 
1 ∂ ∂ 1 ∂2D ∂D cos θ ∂ 2 D cos2 θ ∂ 2 D
W =− sin θ D− + i + 2 − (16.141)
sin θ ∂θ ∂θ sin2 θ ∂φ2 ∂ψ sin2 θ ∂φ∂ψ sin2 θ ∂ψ 2

substituting Eq. (16.141) in Eq. (16.137) we obtain


   
1 ∂ ∂ 1 ∂2D ∂D cos θ ∂ 2 D cos2 θ ∂ 2 D ∂ 2 D ∂D
− sin θ D− +i +2 2 − − −i − j (j + 1) D = 0
sin θ ∂θ ∂θ sin2 θ ∂φ2 ∂ψ sin θ ∂φ∂ψ sin2 θ ∂ψ 2 ∂ψ 2 ∂ψ
   
1 ∂ ∂ 1 ∂2D cos θ ∂ 2 D 1 ∂2D
− sin θ D− + 2 − − j (j + 1) D = 0
sin θ ∂θ ∂θ sin2 θ ∂φ2 sin2 θ ∂φ∂ψ sin2 θ ∂ψ 2

recovering the notation D → D(j) (φ, θ, ψ) and reordering we finally obtain


  2  
1 ∂ ∂ 1 ∂ ∂2 ∂2 m′
sin θ + 2 2
+ 2
− 2 cos θ + j (j + 1) D(j) (φ, θ, ψ) m = 0 (16.142)
sin θ ∂θ ∂θ sin θ ∂φ ∂ψ ∂φ∂ψ
Moreover, in order to obtain an ordinary differential equation, we substitute Eq. (15.85) in Eq. (16.142), and we find
 
1 d d 1 2 ′2 ′
 m′
sin θ − 2 m + m − 2mm cos θ + j (j + 1) d(j) (θ) m = 0 (16.143)
sin θ dθ dθ sin θ
16.13. GROUP-THEORETICAL INTERPRETATION OF THE SPHERICAL HARMONICS 315

therefore, we obtain an ordinary differential equation for the d(j) (θ) function. For general values of (j, m, m′ ), the functions
m′
d(j) (θ) m are related with the classical Jacobi polynomials Plφ,θ . We see it by recalling the differential equation for the latter
 
 2
2 d d
1−z + [θ − φ − (2 + φ + θ) z] + l (l + φ + θ + 1) Plφ,θ = 0 (16.144)
dz 2 dz
it is a long but direct calculation to show that Eq. (16.143) can be transformed into Eq. (16.144) with the following assignment
s  m+m′  m−m′
(j) m′ (j + m′ )! (j − m′ )! θ θ m′ −m, m′ +m
d (θ) m ≡ cos sin Pj−m ′ (cos θ) (16.145)
(j + m)! (j − m)! 2 2

It is interesting to compare Eq. (16.145) with the Eq. (16.46) obtained by the tensor method.

Differential equations of D(j) (φ, θ, ψ) for m = 0 and spherical harmonics


A more familiar equation is obtained with the particular case m = 0. In that case, using Eq. (15.85) we see that
∂ (j) k ∂ h −iφk (j) k
i h
k
i
D (φ, θ, ψ) 0 = e d (θ) m e−imψ = −ime−iφk d(j) (θ) m =0 (16.146)
∂ψ ∂ψ m=0 m=0

it is clear that m = 0 requires from j to be an integer. Consequently, for the case m = 0, we shall make the substitutions
(j, m′ ) → (l, m) in Eq. (16.142) and take into account Eq. (16.146) to obtain
 
1 ∂ ∂ 1 ∂2 m
sin θ + + l (l + 1) D(l) (φ, θ, ψ) 0 = 0 (16.147)
sin θ ∂θ ∂θ sin2 θ ∂φ2
In Eq. (16.147) we recognize the well-known differential equation for spherical harmonics. Indeed the functions become the
spherical harmonics only after a proper normalization
r
2l + 1 h (l) m
i∗
D (φ, θ, 0) 0 ≡ Ylm (θ, φ) (16.148)

by using the relation between spherical harmonics and associated Legendre functions Plm (θ) , we can see that the latter are
basically the d(j) (θ) functions except for constant factors
s
(l) m m (l − m)!
d (θ) 0 = (−1) Plm (θ) (16.149)
(l + m)!

On the other hand, Eq. (16.149) can also be reproduced from Eq. (16.145) by observing that associated Legendre polynomials
are special cases of the Jacobi polynomials

m,m m l! −m/2
Pl−m (z) = (−2) 1 − z2 Plm (z) (16.150)
(l − m)!
in particular, the most familiar Legendre polynomials are given by
0
Pl (cos θ) = Pl0 (cos θ) = Pl0,0 (cos θ) = d(l) (θ) 0 (16.151)

16.13 Group-Theoretical interpretation of the spherical harmonics


We have been able to obtain some special functions well-known from classical analysis, by group-theoretical grounds. It gives
us a new geometrical insight for these functions. We shall illustrate the power of this new point of view by generating the
general properties of spherical harmonics by means of group-theoretical methods.
Let us start with the definition of Ylm (θ, φ) in terms of group representation matrix elements
r
2l + 1 h (l) m
i∗
Ylm (θ, φ) = D (φ, θ, 0) 0 = hθφ |lmi (16.152)

r
X m 2l + 1
|θ, φi ≡ |l, mi D(l) (φ, θ, 0) 0 (16.153)

lm

With these definitions we can see the spherical harmonics either as (a) special irreducible representation matrix elements,
see Eq. (16.152), or (b) Elements of the “transformation matrix” from the |θ, φi basis to the |l, mi basis see Eq. (16.153). The
general properties of the spherical harmonics are derived by taking one or the other point of view.
316 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

16.13.1 Transformation under rotation and addition theorem


Consider a coordinate system X1 X2 X3 and a unit vector u which is specified by the polar and azimuthal angles (θ, φ) with
respect to this coordinate system. Let us apply a rotation R (α, β, γ) on both the coordinate system X1 X2 X3 and on the unit
vector u. We obtain a new coordinate system X1′ X2′ X3′ and a new unit vector v. It is clear that the orientation of v with
respect to X1′ X2′ X3′ is specified by the polar and azimuthal angles (θ, φ)8 . Let us denote the polar and azimuthal angles of v
with respect to X1 X2 X3 as (ξ, ψ). By definition, we have

|ξ, ψi = U (α, β, γ) |θ, φi ⇒ hξ, ψ| = hθ, φ| U † (α, β, γ) ⇒ hξ, ψ| U (α, β, γ) = hθ, φ|

from which
m′
hθ, φ| lmi = hξ, ψ| U (α, β, γ) |lmi = hξ, ψ| lm′ iD(l) (α, β, γ) m (16.154)
and looking at the spherical harmonics as the “transformation matrix” from the |θ, φi basis to the |l, mi, Eq. (16.154) gives
the transformation law of spherical harmonics under rotations
m′
Ylm (θ, φ) = Ylm′ (ξ, ψ) D(l) (α, β, γ) m (16.155)

which is basically a manifestation of the group multiplication law. We shall also prove this theorem, by employing the
transformation law of fields under rotations (see Sec. 17.3 Eq. 17.27, page 330).
Now, we consider the special case of Eq. (16.155) in which m = 0. The LHS becomes
r
2l + 1
Yl0 (θ, φ) = Pl (cos θ)

and on the RHS we find r
m′
X 4π
(l) ∗
Ylm′ (ξ, ψ) D (α, β, γ) 0 = Ylm′ (ξ, ψ) Ylm ′ (β, α)
2l + 1
m′

equating both equations we get


2l + 1 X

Pl (cos θ) = Ylm′ (ξ, ψ) Ylm ′ (β, α)
4π ′ m

this is the celebrated addition theorem for spherical harmonics: (ξ, ψ) and (β, α) specify the direction of the two vectors v
and X3′ with respect to the original coordinate system9 X1 X2 X3 and θ is the angle between v and X3′ which is also the angle
between u and X3 . This relation is often used to decouple the dependence of Pl (cos θ) on the direction of the individual
vectors v (ξ, ψ) and X3′ (β, α) which span the angle θ.

16.13.2 Decomposition of products of Ylm with the same arguments


If we have a product of spherical harmonics with the same arguments (θ, φ) but of different order l, we can interpret it as the
product representation matrix of a given group element. We have learnt how to decompose this product into its irreducible
parts, from which linear combinations of single spherical harmonics of the same arguments are obtained. Using Eq. (15.121)
for integer angular momenta we write
′ m′
X
D(l) (φ, θ, ψ) n D(l ) (φ, θ, ψ) n′ =
m M
hmm′ (ll′ ) LM i D(L) (φ, θ, ψ) N hLN (ll′ ) nn′ i
L,M,N

and setting n = n′ = ψ = 0, we have


′ ′ X
D(l) (φ, θ, 0)m 0 D(l ) (φ, θ, 0)m 0 = hmm′ (ll′ ) LM i D(L) (φ, θ, 0)M N hLN (ll′ ) 0 0i
L,M,N

conjugating both sides, taking into account that the C-G coefficients have been chosen real, and using the selection rules
(15.119) we obtain
h i∗ h i∗ X h i∗
′ m′ m+m′
D(l ) (φ, θ, 0) 0
m
D(l) (φ, θ, 0) 0 = hmm′ (ll′ ) L, m + m′ i D(L) (φ, θ, 0) 0 hL, 0 (ll′ ) 0 0i
L,M,N
"r # "r # "r #
4π 4π X 4π
′ ′ ′
Ylm (θ, φ) Yl′ m′ (θ, φ) = hmm (ll ) L, m + m i YL,m+m′ (θ, φ) hL0 (ll′ ) 00i
2l + 1 2l′ + 1 2L + 1
L
8 The relative orientation of v with respect to X1′ X2′ X3′ is the same as the relative orientation of u with respect to X1 X2 X3 .
9 The fact that (β, α) gives the direction of X3′ with respect to X1 X2 X3 can be obtained by observing that u′3 = e−iαJ3 e−iβJ2 e−iγJ3 u3 =
e−iαJ3 e−iβJ2 u3 .
16.13. GROUP-THEORETICAL INTERPRETATION OF THE SPHERICAL HARMONICS 317

where we have applied Eq. (16.95) in the last step. Therefore, we have obtained a decomposition of products of Ylm with the
same arguments
s
X (2l + 1) (2l′ + 1)
Ylm (θ, φ) Yl′ m′ (θ, φ) = hmm′ (ll′ ) L, m + m′ i YL,m+m′ (θ, φ) hL0 (ll′ ) 00i (16.156)
4π (2L + 1)
L

where the first factor of the RHS is the Clebsch-Gordan coefficient.

16.13.3 Recursion formulas for Ylm (θ, φ) with l fixed


The simplest kind of recursion formula is generated by the raising and lowering operators in group representation space, as
mentioned in Sec. 16.13.2. Thus, those formulas come from Eqs. (16.125, 16.126) or equivalently, from Eqs. (16.128, 16.129):
 
d 1 m′ m′
p
− − (m − m cos θ) d(j) (θ) m = d(j) (θ) m+1 j (j + 1) − m (m + 1)

(16.157)
dθ sin θ
 
d 1 m′ m′
p
− (m′ − m cos θ) d(j) (θ) m = d(j) (θ) m−1 j (j + 1) − m (m − 1) (16.158)
dθ sin θ
Setting m′ = 0 and θ → −θ in Eqs. (16.157, 16.158), we obtain
 
d 1 p
− m cos θ d(j) (−θ)0 m = d(j) (−θ)0 m+1 j (j + 1) − m (m + 1) (16.159)
dθ sin θ
 
d 1 0 0
p
− − m cos θ d(j) (−θ) m = d(j) (−θ) m−1 j (j + 1) − m (m − 1) (16.160)
dθ sin θ
and using the identity (15.86) page 271 with m′ = 0 we find
 h i
d p
− m cot θ d(j) (θ)m 0 = d(j) (θ)m+1 0 j (j + 1) − m (m + 1)

 
d m m−1
p
− − m cot θ d(j) (θ) 0 = d(j) (θ) 0 j (j + 1) − m (m − 1)

1/2
we shall restrict to integer values of j, hence we replace j → l. Multiplying these equations by [(2l + 1) /4π] eimφ , we have
  "r # "r #
d 2l + 1 imφ (l) m 2l + 1 m+1
p
− m cot θ e d (θ) 0 = e−iφ ei(m+1)φ d(l) (θ) 0 l (l + 1) − m (m + 1)
dθ 4π 4π
  "r # "r #
d 2l + 1 imφ (l) m iφ 2l + 1 i(m−1)φ (l) m−1 p
− − m cot θ e d (θ) 0 = e e d (θ) 0 l (l + 1) − m (m − 1)
dθ 4π 4π

then, appealing to the definition (15.85) we have


  (r ) (r )
d 2l + 1 h (l) m
i∗
−iφ 2l + 1 h (l) m+1
i∗ p
− m cot θ D (φ, θ, 0) 0 = e D (φ, θ, 0) 0 l (l + 1) − m (m + 1)
dθ 4π 4π
  (r ) (r )
d 2l + 1 h (l) m
i∗
iφ 2l + 1 h (l) m−1
i∗ p
− − m cot θ D (φ, θ, 0) 0 = e D (φ, θ, 0) 0 l (l + 1) − m (m − 1)
dθ 4π 4π

and from Eq. (16.148) we get


 
d p
− m cot θ Ylm (θ, φ) = e−iφ Yl,m+1 l (l + 1) − m (m + 1)

 
d p
− − m cot θ Ylm (θ, φ) = eiφ Yl,m−1 (θ, φ) l (l + 1) − m (m − 1)

so that the recurrence relations for the spherical harmonics finally become
 
p iφ d
l (l + 1) − m (m + 1)Yl,m+1 (θ, φ) = e − m cot θ Ylm (θ, φ) (16.161)

 
p −iφ d
l (l + 1) − m (m − 1)Yl,m−1 (θ, φ) = e − − m cot θ Ylm (θ, φ) (16.162)

note that l remains unchanged in these relations. It is because the raising and lowering operators do not change this eigenvalue.
318 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

16.13.4 Recursion formulas for Ylm (θ, φ) with m fixed


In order to increase the value of l, we should enlarge the group representation space. It can be done by taking the direct
products of representations as it was done to obtain Eq. (16.156). Several recursion formulas couldpbe derived from this
procedure. As a matter of example, setting l′ = 1, m′ = 0 in Eq. (16.156), making use of Y10 = cos θ 3/ (4π) and the fact
that the C-G coefficients are real (C-S convention) we find
s
X 3 (2l + 1)
Ylm (θ, φ) Y10 (θ, φ) = hm, 0 (l, 1) L, mi YL,m (θ, φ) hL, 0 (l, 1) 0, 0i
4π (2L + 1)
L
r s
3 X 3 (2l + 1)
cos θ Ylm (θ, φ) = hL, m (l, 1) m, 0i YL,m (θ, φ) hL, 0 (l, 1) 0, 0i (16.163)
4π 4π (2L + 1)
L

Coefficients of the type hJ, M (j, 1) M − m′ , m′ i are given in table C.2, page 391. So our coefficients hL, m (l, 1) m, 0i are
obtained by setting J = L, m′ = 0, M = m in such a table. The quantity L can take the values l + 1, l, and l − 1; and using
the column associated with m′ = 0 of table C.2, we have
s s
(l − m + 1) (l + m + 1) (l + 1)
hl + 1, m (l, 1) m, 0i = ; hl + 1, 0 (l, 1) 0, 0i =
(2l + 1) (l + 1) (2l + 1)
m
hl, m (l, 1) m, 0i = p ; hl, 0 (l, 1) 0, 0i = 0
l (l + 1)
s s
(l − m) (l + m) l
hl − 1, m (l, 1) m, 0i = − ; hl − 1, 0 (l, 1) 0, 0i = −
l (2l + 1) (2l + 1)

replacing these coefficients in Eq. (16.163) we find


s
X (2l + 1)
cos θ Ylm (θ, φ) = hL, m (l, 1) m, 0i YL,m (θ, φ) hL, 0 (l, 1) 0, 0i
(2L + 1)
L
s
(2l + 1)
= hl + 1, m (l, 1) m, 0i Yl+1,m (θ, φ) hl + 1, 0 (l, 1) 0, 0i + hl, m (l, 1) m, 0i Yl,m (θ, φ) hl, 0 (l, 1) 0, 0i
[2 (l + 1) + 1]
s
(2l + 1)
+ hl − 1, m (l, 1) m, 0i Yl−1,m (θ, φ) hl − 1, 0 (l, 1) 0, 0i
[2 (l − 1) + 1]
s s s s s s
(l − m + 1) (l + m + 1) (l + 1) (2l + 1) (l − m) (l + m) l (2l + 1)
cos θ Ylm (θ, φ) = Yl+1,m (θ, φ) + Yl−1,m
(2l + 1) (l + 1) (2l + 1) (2l + 3) l (2l + 1) (2l + 1) (2l − 1)

and the recursion formula finally becomes


s s
√ (l − m + 1) (l + m + 1) (l − m) (l + m)
2l + 1 cos θ Ylm (θ, φ) = Yl+1,m (θ, φ) + Yl−1,m (θ, φ) (16.164)
(2l + 3) (2l − 1)

16.13.5 Symmetry relations for spherical harmonics


From the definition in Eq. (16.152) and the properties of the D-function to be shown in Sec. 16.15.3, and given by Eqs.
(16.187) page 323, we find
r r r
∗ 2l + 1 (l) m 2l + 1 −imφ (j) m 2l + 1 −imφ (j) −m m−0
Yl,m (θ, φ) = D (φ, θ, 0) 0 = e d (θ) 0 = e d (θ) −0 (−1)
4π 4π 4π
r r
m 2l + 1 h −i(−m)φ (j) −m
i∗
m 2l + 1 h (j) −m
i∗
m
= (−1) e d (θ) 0 = (−1) D (φ, θ, 0) 0 = (−1) Yl,−m (θ, φ)
4π 4π

obtaining finally
∗m
Yl,−m (θ, φ) = (−1) Ylm (θ, φ) (16.165)
16.14. GROUP THEORY, SPECIAL FUNCTIONS AND GENERALIZED FOURIER ANALYSIS 319

16.13.6 Orthonormality and completeness


On the other hand, the orthonormality relation of the spherical harmonics arises as a special case of theorem 16.5 along with
the definition in Eq. (16.152). Setting m = m′ = 0 and j ≡ l in the orthonormality relation (16.69), we have
Z
′ n′
(A) n D(l ) (A) 0 = δl l δn n δ0 0
† 0 ′ ′
(2l + 1) dτA D(l)
Z Z 1 Z 4π h i∗
(2l + 1) 2π ′ n′
D(l ) (φ, θ, ψ) 0 = δl l δn n
n ′ ′
(l)
2
dφ d (cos θ) dψ D (φ, θ, ψ) 0
16π 0 −1 0

from definition (15.85) it is clear that D(j) (φ, θ, ψ)m 0 = D(j) (φ, θ, 0)m 0 . Using this fact and definition (16.152) we get
Z Z 1 Z 4π h i∗
(2l + 1) 2π ′ ′
D(l ) (φ, θ, ψ)n 0
n ′ ′
(l)
2
dφ d (cos θ) dψ D (φ, θ, ψ) 0 = δl l δn n
16π 0 −1 0
Z 2π Z 1 "r # "r #Z

(2l + 1) 4π 4π ′ ′
dφ d (cos θ) Yl,n (θ, φ) Y l,n′ (θ, φ) dψ = δl l δn n
16π 2 0 −1 2l + 1 2l′ + 1 0

obtaining finally
Z 2π Z 1 Z

dφ d (cos θ) Ylm (θ, φ) Yl′ m′ (θ, φ) = δll′ δmm′
0 −1

in addition, the Peter-Weyl theorem discussed in Sec. 16.8 leads to the completeness of {Ylm } over the space of Lebesgue
square-integrable functions defined on the unit sphere. We have already discussed the completeness of the spherical harmonics
as a special case of the Peter-Weyl theorem (see Sec. 16.8, Eq. 16.98). A more compact form to express the completeness of
the spherical harmonics comes from a special case of Eq. (16.75)
X

Ylm (θ, φ) Ylm (θ′ , φ′ ) = δ (cos θ − cos θ′ ) δ (φ − φ′ )
l,m

16.14 Group theory, special functions and generalized Fourier analysis


We have seen in the discussion of SO (2) , SO (3) , SU (2) that there is a deep relation between group theory (associated
with symmetries) and special functions of the classical analysis (plane waves, spherical harmonics, etc.). This association
gives a geometrical interpretation to the differential equations, recursion formulas, addition theorems, orthonormality and
completeness, behavior under symmetry operations etc, for special functions of the classical analysis. This approach also
provides a generalized form of the Fourier analysis to functions defined on all group manifolds of compact Lie groups for which
the Peter-Weyl theorem applies.
The D−functions in general, and the spherical harmonics in particular, give a natural basis on the group manifold not
only for complex functions but also for vectors and operators on Hilbert spaces. These facts permitted in particular, the
characterization of the one and two particle kets in the quantum mechanical Hilbert space.

16.15 Properties of the D(j) (φ, θ, ψ) representations of SO (3)


16.15.1 “Special” unitarity
We have seen that the compactness of the group parameter space in SO (3) leads to the fact that all representations are
finite-dimensional and equivalent to unitary representations. Thus, we have chosen to construct unitary irreducible (finite-
dimensional) representations of SO (3). This was carried out by choosing hermitian generators Jk and a real set of parameters
k
{αk } such that eiαk J is unitary (see for instance Eq. 15.54). So for all representations the matrices satisfy

D(j)† (φ, θ, ψ) = D−1 (φ, θ, ψ) = D (−φ, −θ, −ψ)

which is evident from Eq. (16.102), page 308. On the other hand, we have seen that
 all continuous
 orthogonal transforma-
tions must satisfy det D = +1. As a matter of consistency, let us show that det D(j) (φ, θ, ψ) = 1. Using the axis-angle
parameterization we see
n o n  o n o n o n  o
det D(j) [Rn (Ψ)] = det D(j) RR3 (Ψ) R−1 = det D(j) [R] det D(j) [R3 (Ψ)] det D(j) R−1
n o
= det D(j) [R3 (Ψ)]
320 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

But in the canonical basis of common eigenvectors of J 2 and J3 , the matrix D(j) [R3 (Ψ)] is obviously diagonal. Therefore
n o j
Y
det D(j) [R3 (Ψ)] = e−imΨ = 1
m=−j

in the last equality, we use the fact that for each given m, both terms e−imΨ and eimΨ are present in the product. Consequently
n o n o
det D(j) [Rn (Ψ)] = det D(j) (φ, θ, ψ) = 1

so that D(j) (φ, θ, ψ) are special unitary matrices. This fact shows that the “special” character is essential in the group, and
so the name SO (3).

16.15.2 Other properties


From Eq. (15.43) and the fact that R−n (ψ) = Rn (−ψ) we have
R−n (ψ) = e−iψJ−n = Rn (−ψ) = eiψJn
⇒ J−n = −Jn (16.166)
it is easy to see geometrically that
R2 (π) e1 = −e1 ; R2 (π) e2 = e2 ; R2 (π) e3 = −e3 (16.167)
combining Eqs. (16.167, 16.166) with Lemma 15.1 Eq. (15.44) we have
R2 (π) J1 R2−1 (π) = J−e1 = −J1 ; R2 (π) J2 R2−1 (π) = Je2 = J2 ; R2 (π) J3 R2−1 (π) = J−e3 = −J3
R2 (π) J± R2−1 (π) = R2 (π) (J1 ± iJ2 ) R2−1 (π) = −J1 ± iJ2 = − (J1 ∓ J2 ) = −J∓
since these equations must be satisfied by any representation, we have that
(j) (j) (j) (j)
U (j) [R2 (π)] Jk U (j)† [R2 (π)] = (−1)k Jk ; U (j) [R2 (π)] J± U (j)† [R2 (π)] = −J∓ (16.168)
where we have used the fact that the representation is unitary U −1 = U † . Starting with the first of Eqs. (16.168) with k = 3
we have
(j) (j) (j) (j)
J3 = −U (j) [R2 (π)] J3 U (j)† [R2 (π)] ⇒ J3 U (j) [R2 (π)] = −U (j) [R2 (π)] J3 U (j)† [R2 (π)] U (j) [R2 (π)]
(j) (j)
⇒ J3 U (j) [R2 (π)] |j, mi = −U (j) [R2 (π)] J3 |j, mi
n o n o
(j)
⇒ J3 U (j) [R2 (π)] |j, mi = −m U (j) [R2 (π)] |j, mi

the last equation establishes that U (j) [R2 (π)] |j, mi is an eigenvector of J3 with eigenvalue −m. Therefore, such a vector must
be proportional to |j, −mi
U (j) [R2 (π)] |j, mi = |j, −mi ηm
j
(16.169)
multiplying the first of Eqs. (16.168) by U (j)† [R2 (π)] on left and by U (j) [R2 (π)] on right, and using k = 3 we have
(j) (j)
J3 = −U (j)† [R2 (π)] J3 U (j) [R2 (π)]
and sandwiching this equation with the vector |j, mi we have
(j) (j) (j) ∗
hj, m| J3 |j, mi = − hj, m| U (j)† [R2 (π)] J3 U (j) [R2 (π)] |j, mi = − hj, −m| J3 |j, −mi ηm
j j
ηm ⇒
j 2
m hj, m| j, mi = m hj, −m| j, −mi ηm

if the canonical basis is properly normalized it means that


j 2
ηm = 1

as expected from Eq. (16.169), the fact that both |j, mi and |j, −mi are normalized and that U (j) [R2 (π)] preserves norms
(unitary representation). On the other hand, the second of Eqs. (16.168) can also be written as
(j) (j) (j) (j)
U (j) [R2 (π)] J± = −J∓ U (j) [R2 (π)] ⇒ U (j) [R2 (π)] J± |j, mi = −J∓ U (j) [R2 (π)] |j, mi
p (j)
U (j) [R2 (π)] |j, m ± 1i j (j + 1) − m (m ± 1) = −J∓ |j, −mi ηm j

j
p p j
|j, − (m ± 1)i ηm±1 j (j + 1) − m (m ± 1) = − |j, −m ∓ 1i j (j + 1) − (−m) [− (m) ∓ 1]ηm ⇒
j
p p j
|j, − (m ± 1)i ηm±1 j (j + 1) − m (m ± 1) = − |j, − (m ± 1)i j (j + 1) − m [m ± 1]ηm (16.170)
16.15. PROPERTIES OF THE D(J) (φ, θ, ψ) REPRESENTATIONS OF SO (3) 321

since m and j are arbitrary we obtain


j j
ηm±1 = −ηm (16.171)
as long as m ± 1 is permitted10 . We summarize these properties in the form
k
J−n = −Jn , R2 (π) ek = (−1) ek ; k = 1, 2, 3 (16.172)
(j) (j) (j)† k (j) (j) (j) (j)† (j)
U [R2 (π)] Jk U [R2 (π)] = (−1) Jk ; U [R2 (π)] J± U [R2 (π)] = −J∓ (16.173)
j 2
U (j) [R2 (π)] |j, mi = j
|j, −mi ηm ; ηm =1 (16.174)
j j
ηm±1 = −ηm as long as m ± 1 is allowed (16.175)

16.15.3 Properties in the Cordon-Shortley convention


Reality of d(j) (θ) in the Cordon-Shortley convention
In the Cordon-Shortley convention, the matrix representations of J± in the canonical basis {|j, mi} are real, it leads to
purely imaginary representations of J2 (see Sec. 15.5.1), this in turn implies that the matrix representation of exp [−iθJ2 ],
in the canonical basis is real. But according with Eq. (15.85), it corresponds to the reduced matrix d(j) (θ). Further, since
D(j) (φ, θ, ψ) is unitary, Eq. (15.85) shows that d(j) (θ) also is, it means that d(j) (θ) are real orthogonal in the C-S convention.
Therefore h i−1
d(j) (θ) = d(j) (−θ) = d^(j) (θ) (16.176)

where d^
(j) (θ) means the transpose of d(j) (θ).

Action of R2 (π) on the canonical basis in the Cordon-Shortley convention


1/2
From Eqs. (15.96, 16.169) we see that we have adopted the convention η1/2 = 1. This can be thought as another way to
1/2
establish the Cordon-Shortley convention. We shall show that if we adopt η1/2 = 1, then ηjj = 1 for all j. We proceed by
induction, for which we start assuming that ηjj = 1 hence

U (j) [R2 (π)] |j, ji = |j, −ji (16.177)

now we consider the direct product of vectors and operators (see Sec. 15.13.2 page 278)
 
1 1 1 1

|j, ji ⊗ ,
= j + , j + ; U (j) [R2 (π)] ⊗ U (1/2) [R2 (π)] = U (j+1/2) [R2 (π)]
2 2 2 2
such that
 n o 
1 1 1 1
U (j+1/2) [R2 (π)] j + , j + = U (j) [R2 (π)] ⊗ U (1/2) [R2 (π)] |j, ji ⊗ ,
2 2 2 2
n o   
1 1 1 1
= U (j) [R2 (π)] |j, ji ⊗ U (1/2) [R2 (π)] , = |j, −ji ⊗ , −
2 2 2 2
  
1 1
U (j+1/2) [R2 (π)] j + , j + = j + 1 , − j + 1
2 2 2 2
j+1/2
therefore, if ηjj = 1 then ηj+1/2 = 1. Using the fact that ηjj = 1 and Eq. (16.175) we have

1 = ηjj = −ηj−1
j
= (−1)2 ηj−2
j
= . . . = (−1)n ηj−n
j
; n≤j
j−m j
with m ≡ j − n we have (−1) ηm = 1 so that
j j−m
ηm = (−1)
the matrix representation of the operator U (j) [R2 (π)] in the canonical basis reads
′ ′
U (j) [R2 (π)] |j, mi = |j, m′ i D(j) [R2 (π)]m m
j
⇒ |j, −mi ηm = |j, m′ i D(j) [R2 (π)]m m

because of the linear independence of the canonical basis it means that


m′ ′ j−m ′
D(j) [R2 (π)] m
j
= ηm δm −m = (−1) δm −m (16.178)
10 Note j j
that if m = ±j Eq. (16.170) automatically leads to ηm±1 = 0 even for ηm 6= 0.
322 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

on the other hand, it is easy to see on geometrical grounds that11


π (j) (j) π (j) π (j) π (j)
U (j) [R1 (π)] = ei 2 J3 e−iπJ2 e−i 2 J3 = ei 2 J3 U (j) [R2 (π)] e−i 2 J3 (16.179)

its matrix representation in the canonical basis yields


m′ π (j) π (j)
D(j) [R1 (π)] m = hj, m′ | U (j) [R1 (π)] |j, mi = hj, m′ | ei 2 J3 U (j) [R2 (π)] e−i 2 J3 |j, mi
′ m′
= ei 2 m hj, m′ | U (j) [R (π)] |j, mi e−i 2 m = ei 2 (m −m) D(j) [R (π)]
π ′ π π
2 2 m
(j) m′ iπ ′
2 (m −m)
j−m m′ iπ
2 (−m−m)
j−m m′ −iπm j−m ′
D [R1 (π)] m = e (−1) δ −m =e (−1) δ −m =e (−1) δm −m

−iπ m j−m m′ m j−m m′ j m′
= e (−1) δ −m = (−1) (−1) δ −m = (−1) δ −m

where we have applied Eq. (16.178). We summarize our results in the following way: The Cordon-Shortley convention is
1/2
equivalent to define η1/2 = 1 in Eq. (16.174), and from this we obtain

1/2 j−m
If η1/2 = 1 then ηjj = 1 and ηm
j
= (−1) (16.180)
′ ′ ′ ′
m j−m m j
D(j) [R2 (π)] m = (−1) δm −m ; D(j) [R1 (π)] m = (−1) δ m −m (16.181)

Complex conjugation of D(j)


Given a matrix representation, the conjugate of the matrices also yields a representation. Thus, it deserves to check whether
the conjugate representation is equivalent or not to the original representation. We shall first study the conjugate of rotations
around X3 and X2 . Since J3 is real in the C-S convention, the complex conjugate of the matrix D [R3 (Ψ)] gives
    h i
D(j)∗ [R3 (Ψ)] = D(j)∗ e−iΨJ3 = D(j) eiΨJ3 = D(j) e−i(−Ψ)J3 = D(j) [R3 (−Ψ)] = D(j) [R−e3 (Ψ)]
D(j)∗ [R3 (Ψ)] = D(j) [R2 (π) R3 (Ψ) R2 (−π)] = D(j) [R2 (π)] D(j) [R3 (Ψ)] D(j) [R2 (−π)]

in the last step, we have used Eq. (15.32) and the fact that −e3 = R2 (π) e3 . On the other hand, since J2 is purely imaginary,
the matrix representative of D(j) [R2 (Ψ)] is real in the C-S convention, thus

D(j)∗ [R2 (Ψ)] = D(j) [R2 (Ψ)] = D(j) [R2 (π + Ψ − π)] = D(j) [R2 (π) R2 (Ψ) R2 (−π)]
D(j)∗ [R2 (Ψ)] = D(j) [R2 (π)] D(j) [R2 (Ψ)] D(j) [R2 (−π)]

from Eq. (16.181) and changing notation in the form


h im′
j−m m′
Y (j) ≡ D(j) [R2 (π)] ; Y (j) m = (−1) δ −m

we can write
h i−1 h i−1
D(j)∗ [R3 (Ψ)] = Y (j) D(j) [R3 (Ψ)] Y (j) ; D(j)∗ [R2 (Ψ)] = Y (j) D(j) [R2 (Ψ)] Y (j) (16.182)

taking into account Eq. (15.38) and using Eqs. (16.182), the complex conjugate of an arbitrary marix of rotation yields

D(j)∗ [R (φ, θ, ψ)] = D(j)∗ [R3 (φ) R2 (θ) R3 (ψ)] = D(j)∗ [R3 (φ)] D(j)∗ [R2 (θ)] D(j)∗ [R3 (ψ)]
 h i−1   h i−1   h i−1 
= Y (j) D(j) [R3 (Ψ)] Y (j) Y (j) D(j) [R2 (Ψ)] Y (j) Y (j) D(j) [R3 (Ψ)] Y (j)
n oh i−1
= Y (j) D(j) [R3 (Ψ)] D(j) [R2 (Ψ)] D(j) [R3 (Ψ)] Y (j)

then the complex conjugate of a general rotation matrix is given by


h i−1
D(j)∗ (φ, θ, ψ) = Y (j) D(j) (φ, θ, ψ) Y (j)

Therefore, the conjugate representation is equivalent to the original one. Now, appealing to corollary 8.7 on page 166, the
(j)
adjoint representation D is also equivalent with D(j) .
11 For instance, if we apply Eq. (16.179), to u2 in the representation j = 1, we have
π π π (j) (j) π (j)
ei 2 J3 e−iπJ2 e−i 2 J3 u2 = −ei 2 J3 e−iπJ2 u1 = ei 2 J3 u1 = −u2
which is precisely the action of a rotation R1 (π) on u2 . We can proceed similarly with u1 and u3 . Further, since representations must preserve
products, this property must be translated to any representation.
16.15. PROPERTIES OF THE D(J) (φ, θ, ψ) REPRESENTATIONS OF SO (3) 323

Symmetry relations
From Eqs. (16.176, 16.181) we can derive some symmetry relations for the d(j) (θ) matrices. From Eq. (16.176) we see that12

d^

(j) (θ) = d(j) (−θ) ⇒ d(j) (θ)m (j) m
m = d (−θ) m′ (16.183)

and using the definition of d(j) (θ) Eq. (15.85) as well as Eqs. (16.183, 16.181) we have
m′ m′ m′ n
d(j) (θ) m = hj, m′ | U (j) [R2 (θ)] |j, mi = D(j) [R2 (θ − π + π)] m = D(j) [R2 (θ − π)] nD
(j)
[R2 (π)] m
h i h i
m′ j−m n n j−m n
= d(j) (− (π − θ)) n (−1) δ −m = d(j) (π − θ) m′ (−1) δ −m
m′ j−m −m
d(j) (θ) m = (−1) d(j) (π − θ) m′ (16.184)

similarly
m m m m n
d(j) (−θ) m′ = D(j) [R2 (−θ)] m′ = D(j) [R2 (−θ + π − π)] m′ = D(j) [R2 (−θ + π)] nD
(j)
[R2 (−π)] m′

(j) m (j) m j−n (j) m m′
= d (π − θ) nD [R2 (π)] n = (−1) d (π − θ) nδ −n
(j) m j+m′ (j) m
d (−θ) m′ = (−1) d (π − θ) −m′ (16.185)

furthermore
m′ m′ m′ m′ n k
d(j) (θ) m = D(j) [R2 (θ)] m = D(j) [R2 (π + θ − π)] m = D(j) [R2 (π)] kD
(j)
nD[R2 (j)
[R2 (θ)] (−π)] m
j−n ′ n m j−n m ′
n j−k m
m (j) (j) (j)
= (−1) δ −n d (θ) k D [R2 (π)] k = (−1) δ −n d (θ) k (−1) δ −k
(j) m′ j+m′ (j) −m′ j+m 2j+m+m′ −m′
d (θ) m = (−1) d (θ) −m (−1) = (−1) d(j) (θ) −m (16.186)

putting Eqs. (16.183-16.186) together we have



d(j) (θ)m m = d(j) (−θ)m m′ = (−1)j−m d(j) (π − θ)−m m′
j+m′ m 2j+m+m′ −m′
= (−1) d(j) (π − θ) −m′ = (−1) d(j) (θ) −m (16.187)

Relations with spherical harmonics


Section 16.12 shows that the D(j) (φ, θ, ψ) matrices, satisfy differential equations that relate them with some special functions
of the Mathematical Physics. We give the results here for completeness

1. When j ≡ l is integer, the D−functions are related with the spherical harmonics and the Legendre functions in this way

1/2 h i∗
2l + 1 m
Ylm (θ, φ) = D(l) (φ, θ, 0) 0 (16.188)

s
m (l + m)! (l) m
Plm (cos θ) = (−1) d (θ) 0 (16.189)
(l − m)!
0
Pl (cos θ) = Pl0 (cos θ) = d(l) (θ) 0 (16.190)

with Pl (z) the ordinary Legendre polynomial and Plm (z) the associated Legendre function.
m′ (a,b)
2. For arbitrary j, d(j) (θ) m is proportional to the Classical Jacobi Polynomial Pl (cos θ) with a = m′ − m, b = m′ + m
and l = j − m′ .

3. The functions D(j) (φ, θ, ψ) satisfy orthonormality and completeness conditions which are generalizations of Theorems
7.8, 7.11. This could be shown after the appropriate invariant integration measure is established. See theorem 16.5 page
303, and theorem 16.6 page 304.

Characters
To calculate the characters of a given irreducible representation (j) of SO (3) we should classify the elements of the group in
conjugacy classes. Theorem 15.4 provides us with the different classes in the angle-axis parameterization. A class consists of
a set {Rn (Ψ) ; ∀ |n| = 1 and Ψ fixed}. The number of characters is the number (in the continuum) of distinct angles Ψ, so
12 See also the derivation of Eq. (15.86), page 271.
324 CHAPTER 16. THE GROUP SU(2) AND ADDITIONAL PROPERTIES OF SO (3)

that Ψ is the “class label” or equivalently a “character label”. Thus, it is enough to calculate the character of R3 (Ψ). In the
canonical basis (Cordon-Shortley convention) we have
j
X j
X j
X j
X
m
χ(j) (Ψ) = D(j) [R3 (Ψ)] m = hj, m| e−iΨJ3 |j, mi = e−imΨ hj, m |j, mi = e−imΨ
m=−j m=−j m=−j m=−j
j
X j
X
m
χ(j) (Ψ) = e−iΨ ≡ Zm ≡ S
m=−j m=−j

to evaluate S we write

S = Z −j + Z −j+1 + Z −j+2 + . . . + Z j−1 + Z j ; ZS = Z −j+1 + Z −j+2 + . . . + Z j−1 + Z j + Z j+1 ⇒


S − ZS = Z −j − Z j+1 ⇒ (1 − Z) S = Z −j − Z j+1 ⇒
j
X Z −j − Z j+1
S ≡ Zm =
m=−j
(1 − Z)

(note that j does not have to be integer13 ) and remembering the value of Z = e−iΨ we have
Ψ
Z −j − Z j+1 eiΨj − e−iΨ(j+1) eiΨj − e−iΨ(j+1) ei 2 eiΨ(j+1/2) − e−iΨ(j+1/2)
χ(j) (Ψ) = = = Ψ = Ψ Ψ
(1 − Z) 1 − e−iΨ 1 − e−iΨ ei 2 ei 2 − e−i 2
  
sin j + 12 Ψ
χ(j) (Ψ) =
sin (Ψ/2)

when Ψ = 0 the operator is the identity and its trace must be 2j + 1. Note that although the expression above is not defined
in Ψ = 0, we obtain the correct limit
   
sin j + 12 Ψ j + 21 Ψ
lim = lim = 2j + 1
Ψ→0 sin (Ψ/2) Ψ→0 (Ψ/2)

so the character is given by


  
(j) sin j + 12 Ψ
χ (Ψ) = if Ψ 6= 0 and χ(j) (0) = 2j + 1 (16.191)
sin (Ψ/2)

For the representations j = 1/2, and j = 1 we obtain


 
(1/2) sin Ψ Ψ
χ (Ψ) = = 2 cos
sin (Ψ/2) 2
 
(1) sin (Ψ + Ψ/2) sin Ψ cos (Ψ/2) 2 Ψ
χ (Ψ) = = cos Ψ + = cos Ψ + 2 cos = 2 cos Ψ + 1
sin (Ψ/2) sin (Ψ/2) 2

these results can be cross-checked by comparing with the traces of the d(j) (Ψ) matrices given by Eqs. (15.96, 15.104). Observe
that the results coincide despite in d(j) (Ψ) the angle refers to an Euler angle while in χ(j) (Ψ) the angle refers to the angle-axis
parameterization. This is because, d(j) (Ψ) corresponds to a rotation around the axis X2 through an angle Ψ, such that in this
specific case the Euler angle and the angle in the angle-axis parameterization coincide.
Finally, we observe that χ(j) (Ψ) = χ(j) (−Ψ), as expected from the fact that by using the parameterization described in
Sec. 15.4, two angles with the same absolute value belong to the same conjugacy class.

13 Indeed j can be an arbitrary number, we only require that m varies from −j to j in unit intervals.
Chapter 17

Applications in Physics of SO (3) and SU (2)

17.1 Applications of SO (3) for a particle in a central potential


Let us apply the group-theoretical notions and methods to the case of a non-relativistic quantum particle under a central
potential V (r). The fact that V (r) depends only on the distance to the origin, leads to the invariance of the Hamiltonian
−1
under any rotation U (R) with R ∈ SO (3). Therefore U (R) HU (R) = H so that the Hamiltonian is a scalar, this can be
written as
[H, U (R)] = 0 , ∀R ∈ SO (3) ⇔ [H, Ji ] = 0 , for i = 1, 2, 3
we shall derive many important properties that arise from symmetry grounds without using the specific form of the potential.
This serves as a prototype for applications of group theoretical methods in Physical problems

17.1.1 Characterization of states


It is natural to use the basis of the state space consisting of the common eigenvectors of H, J 2 , J3 . We denote these eigenstates
as {|E, l, mi}. Since each |E, l, mi is eigenvector of H, J 2 , J3 we have

H |E, l, mi = |E, l, mi E , J 2 |E, l, mi = |E, l, mi l (l + 1) , J3 |E, l, mi = |E, l, mi m


l = integer m = −l, −l + 1, . . . , l − 1, l

the Schrödinger wave functions {ψElm (x)} associated with the kets {|E, l, mi} read

ψElm (x) = hx| E, l, mi (17.1)

with |xi an eigenstate of the position operator X. We shall use spherical coordinates (r, θ, φ) for the coordinate vector x, and
we shall fix the phase convention for |xi ≡ |r, θ, φi by defining

|r, θ, φi = U (φ, θ, 0) |re3 i = e−iφJ3 e−iθJ2 e−i0J3 |r, 0, 0i = e−iφJ3 e−iθJ2 |r, 0, 0i (17.2)

where we define each state in terms of a “standard state of reference” |re3 i = |r, 0, 0i which represents a localized state located
on the X3 −axis at a distance r from the origin. For a point-like particle, which is the system implicit in the formulation, we
see that such a state must be invariant under a rotation around the X3 −axis, therefore
" ∞
# ∞
X n
X n
−iψJ3
e |r, 0, 0i = |r, 0, 0i ⇒ I+ (−iψJ3 ) |r, 0, 0i = |r, 0, 0i ⇒ |r, 0, 0i + (−iψJ3 ) |r, 0, 0i = |r, 0, 0i
n=1 n=1

X n n
⇒ (−iψ) (J3 ) |r, 0, 0i = 0
n=1

since it is valid for all ψ ∈ [0, π], we see that

e−iψJ3 |r, 0, 0i = |r, 0, 0i ⇔ J3 |r, 0, 0i = 0 (17.3)

where the opposite implication is obvious. From Eqs. (17.1, 17.2) we obtain

ψElm (r, θ, φ) = hr, θ, φ| E, l, mi = hr, 0, 0| U † (φ, θ, 0) |E, l, mi


h im′

ψElm (r, θ, φ) = hr, 0, 0| E, l, m′ i D(l) (φ, θ, 0) m (17.4)

325
326 CHAPTER 17. APPLICATIONS IN PHYSICS OF SO (3) AND SU (2)

now, taking into account Eq. (17.3), we find


hr, 0, 0| E, l, m′ i = hr, 0, 0| (1 + J3 ) |E, l, m′ i ⇒ hr, 0, 0| E, l, m′ i = hr, 0, 0| (1 + m′ ) |E, l, m′ i
⇒ hr, 0, 0| E, l, m′ i = hr, 0, 0| E, l, m′ i + m′ hr, 0, 0| E, l, m′ i
⇒ m′ hr, 0, 0| E, l, m′ i = 0
so that hr, 0, 0| E, l, m′ i must be zero except perhaps in the case in which m′ = 0. We then obtain

hr, 0, 0| E, l, m′ i = δm′ ,0 ψeEl (r) ; ψeEl (r) ≡ hr, 0, 0| E, l, 0i (17.5)

where it is clear that ψeEl (r) does not depend on m′ . Replacing (17.5) in (17.4) we obtain
h im′ h i0
† †
ψElm (r, θ, φ) = δm′ ,0 ψeEl (r) D(l) (φ, θ, 0) e
m = ψEl (r) D(l) (φ, θ, 0) m
h i∗
m
ψElm (r, θ, φ) = ψeEl (r) D(l) (φ, θ, 0) 0

and using the relation (16.188) we obtain


r
4π e
ψElm (r, θ, φ) = ψEl (r) Ylm (θ, φ) ; ψeEl (r) ≡ hr, 0, 0| E, l, 0i (17.6)
2l + 1
Equation (17.6) is the well-known factorization of ψ (x) in the angular function Ylm (θ, φ) and the “radial-wave function”
ψeEl (r) which depends on the specific form of the potential V (r). The appearance of Ylm (θ, φ) in this decomposition is a
direct consequence of spherical symmetry1 . Further, the arguments based on invariance under rotations led to the fact that
the radial solution cannot depend on the quantum number m.

17.1.2 Asymptotic plane wave states


If the potential V (r) goes to zero faster than r−1 at large distances, the asymptotic state far away from the origin, can be
considered a free particle state i.e. a plane-wave function. Plane-waves are eigenstates of the operator P. Denoting the
magnitude of the momentum by p and its direction as p b (θp , φp ) we have

p2
E= ; |pi = |p, θp , φp i = U (φp , θp , 0) |pb
p3 i
2m
where we have chosen the “standard reference state” along the p b 3 −axis. We can relate these plane wave states with angular
momentum eigenstates2 |p, lp , mp i by using the projection operator technique described in Chapter 9. Applying theorem 9.2
to this case (or the version for SO (3) Eq. 16.99, page 308), one can see that
s  Z 2π Z 1
2lp + 1 h i∗
m
|p, lp , mp i = dφp d (cos θp ) |p, θp , φp i D(l) (φp , θp , 0) 0
4π 0 −1
Z
|p, lp , mp i = dΩp |p, θp , φp i Ylp mp (θp , φp ) (17.7)

where dΩp = dφp d (cos θp ). The inverse of Eq. (17.7) is given by


X
|p, θp , φp i = |p, lp , mp i Yl∗p ,mp (θp , φp ) (17.8)
lp ,mp

the notation θp , φp , lp , mp owes to the fact that they are defined in the momentum space and not in the coordinate space.

17.1.3 Partial wave decomposition


Now we characterize the scattering of a particle in a central potential field V (r). We assume in this section that |pi | = |pf | ≡ p
(elastic scattering). Let the momentum of the initial asymptotic state be along the X3 −axis so that pi = (p, θi = φi = 0), and
the final state has a momentum pf = (p, θ, φ). It is well known that the scattering amplitude can be written as
hpf | T |pi i = hp, θ, φ| T |p, 0, 0i (17.9)
where the scattering operator T depends on the Hamiltonian. We shall not be concerned about the specific form of the
scattering operator, we shall only assume that T is rotationally invariant, it means that [T, Ji ] = 0. Let us prove the following
q  (l) ∗
1 Note that in this problem, the term Ylm (θ, φ) = 2l+1

D (φ, θ, 0)m 0 , has its origin in the group representation theory for SO(3).
2 Since E = p2 /2m for a free particle we use |E, l, mi or |p, l, mi to denote angular momentum eigenstates at convenience.
17.1. APPLICATIONS OF SO (3) FOR A PARTICLE IN A CENTRAL POTENTIAL 327

Theorem 17.1 Let T be a rotationally invariant operator. Let {|p, l, mi} be eigenstates of J 2 and J3 with eigenvalues l (l + 1)
and m respectively. We see that
hp, l, m| T |p, l, mi = hp, l, m + 1| T |p, l, m + 1i if m 6= l
hp, l, m| T |p, l, mi = hp, l, m − 1| T |p, l, m − 1i if m 6= −l
so that this matrix element is independent of the quantum number m
hp, l, m| T |p, l, mi = hp, l, 0| T |p, l, 0i ≡ Tl (p) (17.10)
Proof: If l = 0 it is trivial. Let us assume that l 6= 0. From Eq. (15.73) we have
p
J± |p, j, mi = |p, j, m ± 1i c± ±∗ ±
jm ; hp, j, m| J∓ = cjm hp, j, m ± 1| ; cjm ≡ j (j + 1) − m (m ± 1) (17.11)
since T is rotationally invariant, it commutes with J1 and J2 . Therefore [T, J± ] = 0. Using this fact and Eq. (17.11), we
obtain
1 1 c+
hp, l, m + 1| T |p, l, m + 1i = + 2 hp, l, m| J− T J+ |p, l, mi = + 2 hp, l, m| T J− J+ |p, l, mi = +lm 2 hp, l, m| T J− |p, l, m + 1i
c c c
lm lm lm
c+ −
lm cl,(m+1)
hp, l, m + 1| T |p, l, m + 1i = + 2 hp, l, m| T |p, l, mi
c
lm

but from Eq. (17.11), we see that


p p
c−
l,(m+1) = l (l + 1) − (m + 1) [(m + 1) − 1] = l (l + 1) − m (m + 1) = c+
l,m

hence
c+ c+
hp, l, m + 1| T |p, l, m + 1i = lm lm
hp, l, m| T |p, l, mi = hp, l, m| T |p, l, mi
c+ 2
lm

observe that it is essential that |p, l, m + 1i 6= 0 (i.e. m 6= l) because otherwise c+


l,m = 0 and it cannot be put in the denominator.
If |p, l, m + 1i = 0 (m = l), and taking into account that l 6= 0, lemmas 15.2, 15.3, say that |p, l, m − 1i 6= 0. It can be proved
by an analoguous argument that
hp, l, m − 1| T |p, l, m − 1i = hp, l, m| T |p, l, mi if m 6= −l
QED.
Theorem 17.2 Let T be a rotationally invariant operator. Let {|p, l, mi} be eigenstates of J 2 and J3 with eigenvalues l (l + 1)
and m respectively. We see that
hp, l′ , m′ | T |p, l, mi = δll′ δmm′ Tl (p) ; Tl (p) ≡ hp, l, 0| T |p, l, 0i (17.12)
Proof : Since T is rotationally invariant, it commutes with J 2 and J3 so that
J3 T |p, l, mi = T J3 |p, l, mi ⇒ J3 T |p, l, mi = mT |p, l, mi
J 2 T |p, l, mi = T J 2 |p, l, mi ⇒ J 2 T |p, l, mi = l (l + 1) T |p, l, mi
so T |p, l, mi is also an eigenvector of J3 and J 2 with the same eigenvalues (see theorem 3.17, page 51). In other words, when
T acts on an eigenstate of the angular momentum, it will not change the quantum numbers (l, m). Therefore
T |p, l, mi = |p, l, mi Tlm (p) ⇒ hp, l′ , m′ | T |p, l, mi = hp, l′ , m′ | p, l, miTlm (p)
′ ′
hp, l , m | T |p, l, mi = δl′ l δmm′ Tlm (p)
where we have used the fact that if l 6= l′ they are two different eigenvalues of the same hermitian operator J 2 so eigenstates
with l 6= l′ must be orthogonal. For similar reasons, eigenstates with m 6= m′ are orthogonal. This equation can also be written
as
hp, l′ , m′ | T |p, l, mi = hp, l, m| T |p, l, mi δl′ l δmm′
comparing the last two equations and using Eq. (17.10) we obtain
hp, l′ , m′ | T |p, l, mi = δl′ l δmm′ Tl (p)
QED.
Observe that theorems 17.1, 17.2 are purely geometrical in nature; they are not concerned with scattering processes3.
Our final theorem is more directly related to the elastic scattering of a particle in a central potential.
3 For the vectors |p, l, mi in these theorems, p is an arbitrary continuous parameter. We only assume that p remains constant in the process.

Theorem 17.3 (Partial wave decomposition of a rotationally invariant elastic-scattering operator): Let T be a rotationally
invariant elastic-scattering operator of one particle. Let {|p, l, mi} be eigenstates of J 2 and J3 with eigenvalues l (l + 1) and m
respectively. The scattering amplitude hpf | T |pi i can be expanded in the following way

hpf | T |pi i = Σ_{l=0}^{∞} [(2l + 1)/(4π)] Tl (E) Pl (cos θ) ; Tl (E) ≡ hE, l, 0| T |E, l, 0i (17.13)

Proof : If the scattering is elastic, the magnitude of the momentum is the same in both asymptotic states, and the scattering
amplitude acquires the form of Eq. (17.9)
hpf | T |pi i = hp, θ, φ| T |p, 0, 0i (17.14)
where we have taken the X3-axis along the incident direction of the asymptotic one-particle state. Using Eq. (17.8) in Eq. (17.14) we get
hpf | T |pi i = hp, θ, φ| T |p, 0, 0i = [Σ_{l′,m′} Yl′m′ (θ, φ) hp, l′, m′| T ] [Σ_{l,m} Y∗_{lm} (0, 0) |p, l, mi]
= Σ_{l′,m′} Σ_{l,m} Yl′m′ (θ, φ) hp, l′, m′| T |p, l, mi Y∗_{lm} (0, 0)

and using Eq. (17.12) we obtain


hpf | T |pi i = Σ_{l′,m′} Σ_{l,m} δl′l δm′m Yl′m′ (θ, φ) Tl (p) Y∗_{lm} (0, 0) = Σ_{l,m} Ylm (θ, φ) Tl (p) Y∗_{lm} (0, 0) (17.15)

now, taking into account that


Ylm (θ, φ) = √[(2l + 1)/(4π) · (l − m)!/(l + m)!] P^m_l (cos θ) e^{imφ} ; P^m_l (±1) = 0 for m ≠ 0 ; P^0_l (cos θ) = Pl (cos θ)
we have
Ylm (0, 0) = √[(2l + 1)/(4π) · (l − m)!/(l + m)!] P^m_l (1) = √[(2l + 1)/(4π) · (l − 0)!/(l + 0)!] P^0_l (1) δm0 = √[(2l + 1)/(4π)] Pl (1) δm0
Ylm (0, 0) = √[(2l + 1)/(4π)] δm0 (17.16)

replacing (17.16) in (17.15) we find
hpf | T |pi i = Σ_{l,m} Ylm (θ, φ) Tl (p) √[(2l + 1)/(4π)] δm0 = Σ_{l=0}^{∞} √[(2l + 1)/(4π)] Yl0 (θ, φ) Tl (p)
hpf | T |pi i = Σ_{l=0}^{∞} [(2l + 1)/(4π)] Tl (E) Pl (cos θ)

where we have taken into account that for a free particle E = p2 /2m, so that the magnitude of the momentum p leads to the
value of the energy in the asymptotic regions. QED.
This is the partial wave expansion of the scattering amplitude. We see that considerable simplifications are possible because
of the underlying spherical symmetry. All the “dynamics” is contained in the “partial wave amplitude” Tl (E).
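As an illustration, the series (17.13) is easy to evaluate numerically once the partial-wave amplitudes are known. The following is a minimal sketch; the amplitudes T_l used below are arbitrary illustrative numbers (an assumption for the example, not derived from any particular potential), and the factor (2l+1)/(4π) follows Eq. (17.13) as reconstructed above.

```python
# Minimal sketch of the partial-wave expansion (17.13):
#   <p_f|T|p_i> = sum_l (2l+1)/(4*pi) * T_l(E) * P_l(cos(theta)).
import numpy as np
from scipy.special import eval_legendre

def scattering_amplitude(theta, T_l):
    """Sum the partial-wave series for amplitudes T_l, l = 0, 1, ..."""
    cos_t = np.cos(theta)
    return sum((2 * l + 1) / (4 * np.pi) * T * eval_legendre(l, cos_t)
               for l, T in enumerate(T_l))

# Hypothetical amplitudes for l = 0, 1, 2 (complex in general).
T_l = [1.0 + 0.5j, 0.3 - 0.1j, 0.05]
for theta in (0.0, np.pi / 4, np.pi / 2):
    print("theta =", round(theta, 2), " amplitude =", scattering_amplitude(theta, T_l))
```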

17.2 Kinematic effects, dynamic effects and group theory


The foregoing examples show the power of group-theoretical methods when applied to physical problems. They permit a clean
separation of the “kinematic” (or geometrical) effects from the “dynamic” effects that are specific to the system under
study, which leads to important simplifications. For instance, to find the Schrödinger wave function in Eq. (17.1), symmetry
considerations led to Eq. (17.6) in which the kinematic or geometrical part is given by the spherical harmonics, while the
radial function gives the dynamic part. In practice, it leads to the reduction of a three-dimensional partial differential equation
to an ordinary differential equation in the radial variable. Similarly, in the partial wave expansion (17.13), for the scattering
problem of a particle in a central potential, the terms Tl (E) comprise the dynamics and the remaining factors are geometrical
universal terms.
It is worth remarking that in both problems there is valuable geometric information even in the dynamic part, namely the
fact that the dynamical objects ψ̃El (r) and Tl (E) do not depend on the quantum number m. This leads to further simplifications
when a specific dynamics is studied.

17.3 Transformation properties of fields under SO (3)


We have studied so far the transformation of vectors under symmetry operations. But in physical applications other mathe-
matical objects such as fields and operators are very important, as well as their properties of transformations under symmetry
operations.
Let us assume a localized vector |xi. As our starting point, we consider the basic relations

U [R] |xi = |x′i ; x′ = Rx ; x′^i = R^i_j x^j (17.17)

where x, x′ are coordinate space three-vectors while |xi , |x′ i are localized states at x, x′ respectively. R ∈ SO (3) is a rotation.
Let |ψi be an arbitrary state vector, then
|ψi = ∫ |xi ψ (x) d³x (17.18)

where ψ (x) is a c-number field in the coordinate representation. This equation says that a field is specified by its value at
each point x in the coordinate space. Our question now is how ψ (x) transforms under a rotation R, specifically if
|ψ′i = U [R] |ψi = ∫ |xi ψ′ (x) d³x (17.19)

we wonder how ψ ′ (x) is related with ψ (x). We answer this question in the following theorem

Theorem 17.4 (Transformation formula for fields): The field of an arbitrary state transforms under rotations as:

ψ (x) → ψ′ (x) = ψ(R⁻¹x) (17.20)

Proof : Applying the rotation operator on both sides of Eq. (17.18), and using Eq. (17.17) we find
U [R] |ψi = ∫ U [R] |xi ψ (x) d³x = ∫ |x′i ψ (x) d³x = ∫ |x′i ψ(R⁻¹x′) d³x
making a change of integration variable x → x′ = Rx (with unit Jacobian since det R = 1) we find that d³x = d³x′, so that
U [R] |ψi = ∫ |x′i ψ(R⁻¹x′) d³x′ = ∫ |xi ψ(R⁻¹x) d³x

in the last step we renamed the dummy integration variable. Comparing with Eq. (17.19) we find

ψ′ (x) = ψ(R⁻¹x)

QED.
Note that the appearance of the inverse of R in Eq. (17.20) ensures that the mapping ψ (x) → ψ ′ (x) is a representation
of the symmetry group, as can be seen in Sec. 8.2, Eq. (8.32), in which we studied group representations on function spaces
induced by representations defined on their domains.

Example 17.1 Let |ψi = |pi be a plane-wave state; then ψp (x) = eip·x . From theorem 17.4, this field changes under rotations
as follows
ψ′ (x) = ψp(R⁻¹x) = e^{ip·R⁻¹x} (17.21)

writing p · R⁻¹x in matrix notation and using the associativity of the matrix products, we find
p · R⁻¹x = p̃ (R⁻¹x) = (p̃ R⁻¹) x = (p̃ R̃) x = (Rp)˜ x = Rp · x (17.22)

where the orthogonality condition of rotations has been used. Replacing (17.22) in (17.21) we find

ψ′ (x) = ψp(R⁻¹x) = e^{iRp·x} (17.23)

The same result can be obtained from

ψ ′ (x) = hx| U [R] |pi = hx| p′ i = ψp′ (x) ; p′ = Rp


⇒ ψ ′ (x) = ψp′ (x) = ψRp (x) = eiRp·x
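The transformation rule (17.20) is also easy to verify numerically for this plane-wave example. The following is a small sketch under the assumption of units with ~ = 1; the momentum, rotation and sample point are arbitrary test values.

```python
# Numerical check of Eqs. (17.20, 17.23): for psi_p(x) = exp(i p.x),
# the rotated field psi'(x) = psi_p(R^{-1} x) equals exp(i (Rp).x).
import numpy as np
from scipy.spatial.transform import Rotation

rng = np.random.default_rng(0)
p = rng.normal(size=3)                                   # arbitrary momentum
R = Rotation.from_rotvec([0.3, -0.7, 1.1]).as_matrix()   # some rotation R in SO(3)

def psi_p(x, p):
    return np.exp(1j * np.dot(p, x))

x = rng.normal(size=3)
lhs = psi_p(R.T @ x, p)        # psi'(x) = psi_p(R^{-1} x), with R^{-1} = R^T
rhs = psi_p(x, R @ p)          # exp(i (Rp).x)
print(abs(lhs - rhs))          # ~1e-16
```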

Example 17.2 Let |ψi = |E, l, mi be a simultaneous eigenstate of H, J 2 , J3 for a central potential as described in Sec. 17.1.
From Eq. (17.6) we see that
ψElm (x) = cl ψ̃El (r) Ylm (x̂) (17.24)
where x̂ is a unit vector with polar and azimuthal angles θ, φ respectively. Further
|ψ′i = U [R] |ψi = U [R] |E, l, mi = |E, l, m′i D^{(l)}[R]^{m′}_m

and from (17.24) we have


hx |ψ′i = hx |E, l, m′i D^{(l)}[R]^{m′}_m ⇒ ψ′ (x) = ψElm′ (x) D^{(l)}[R]^{m′}_m ; no sum over (l)
ψ′ (x) = cl ψ̃El (r) Ylm′ (x̂) D^{(l)}[R]^{m′}_m (17.25)

from Eq. (17.20) and taking into account that R⁻¹ keeps the coordinate r invariant, we have
ψ′ (x) = ψ′ (r x̂) = ψ(R⁻¹(r x̂)) = ψ(r R⁻¹x̂) = cl ψ̃El (r) Ylm(R⁻¹x̂) (17.26)

and comparing Eqs. (17.25, 17.26) we find
Ylm(R⁻¹x̂) = Ylm′ (x̂) D^{(l)}[R]^{m′}_m ; no sum over (l) (17.27)

which is a well-known property of the spherical harmonics. This result coincides with the one obtained in page 316, Eq.
(16.155).
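A quick numerical check of Eq. (17.27) is possible in the special case of a rotation about the X3-axis, for which the Wigner matrix is diagonal, D^{(l)}[Rz(α)]^{m′}_m = e^{−imα} δ^{m′}_m, so the relation reduces to Ylm(θ, φ − α) = Ylm(θ, φ) e^{−imα}. The sketch below assumes that simplification and uses scipy's convention sph_harm(m, l, azimuth, polar).

```python
# Check of Eq. (17.27) for R = R_z(alpha): Y_lm(theta, phi - alpha) = Y_lm(theta, phi) exp(-i m alpha).
import numpy as np
from scipy.special import sph_harm   # sph_harm(m, l, azimuthal angle, polar angle)

l, m = 2, 1
theta, phi, alpha = 0.7, 1.3, 0.4    # polar angle, azimuth, rotation angle (arbitrary test values)
lhs = sph_harm(m, l, phi - alpha, theta)            # Y_lm evaluated at R^{-1} x_hat
rhs = sph_harm(m, l, phi, theta) * np.exp(-1j * m * alpha)
print(abs(lhs - rhs))                                # ~1e-16
```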

17.3.1 Transformation of multicomponent fields under SO(3)


Now, we shall extend the study of transformation properties of fields under rotations to the case in which the field car-
ries an additional discrete index. If the field is to be irreducible, such an index must run over 2j + 1 values, so that
the multi-component field transforms under the (j) −irreducible representation of SO (3). We shall denote those fields as
{|x, ki ; k = −j, −j + 1, . . . , j − 1, j} with k a discrete index with 2j + 1 values. These fields transform as

U [R] |x, ki = |Rx, ni D(j) [R]n k (17.28)

note that we have a transformation related with the space coordinates x, and another one concerning the discrete components.
On the space coordinates x, we define the transformation as an ordinary rotation in Euclidean space (j = 1 irreducible
representation), while the components of the field transform according with the (j) −representation in which 2j + 1 is the
number of values of the discrete index. An arbitrary state of “spin” j is written by
|ψi = ∫ |x, ki ψ^k (x) d³x ; k = −j, −j + 1, . . . , j − 1, j (17.29)

where ψ k (x) is a field of (2j + 1) −components associated with |ψi and there is sum over k. Once again, we ask for the way
in which ψ k (x) transforms under rotations. It follows from Eqs. (17.28, 17.29) that
|ψ′i = U [R] |ψi = U [R] ∫ |x, ki ψ^k (x) d³x = ∫ {U [R] |x, ki} ψ^k (x) d³x = ∫ |Rx, ni D^{(j)}[R]^n_k ψ^k (x) d³x

by redefining x → Rx and taking into account the invariance of the differential of volume under this transformation, we find
|ψ′i = ∫ |x, ni D^{(j)}[R]^n_k ψ^k(R⁻¹x) d³x

on the other hand, the transformation can be expressed by


|ψ′i = ∫ |x, ni ψ′^n (x) d³x

comparing the last two equations we find


ψ −R→ ψ′ ; ψ′^n (x) = D^{(j)}[R]^n_k ψ^k(R⁻¹x) (17.30)

It can be checked that the mapping defined in Eq. (17.30), provides a representation of the rotation group SO (3). To prove
it, consider the composition of two mappings R1 and R2 such that
ψ −R1→ ψ′ ; ψ′ −R2→ ψ′′

for which we have


ψ′′^m (x) = D^{(j)}[R2]^m_l ψ′^l(R2⁻¹x) = D^{(j)}[R2]^m_l D^{(j)}[R1]^l_n ψ^n(R1⁻¹R2⁻¹x)
ψ′′^m (x) = D^{(j)}[R2R1]^m_n ψ^n((R2R1)⁻¹x)

so that this mapping is a homomorphism.

Example 17.3 Let ψ^k (x) be a Pauli spinor wave function. Such a field describes state vectors of the form {|x, σi ; σ = ±1/2},
and they transform under rotations as
ψ′^λ (x) = D^{(1/2)}[R]^λ_σ ψ^σ(R⁻¹x) ; λ, σ = ±1/2 (17.31)

and the matrix representatives are given by Eq. (15.97).
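The homomorphism property shown above can be illustrated numerically for this j = 1/2 case. The sketch below is an assumption-laden toy check: the test spinor field psi is an arbitrary Gaussian-type function, and the SU(2) matrices are built from the axis–angle formula D^{(1/2)}[R] = cos(θ/2) 1 − i sin(θ/2) n·σ (consistent with the corresponding SO(3) rotation built from the same axis and angle).

```python
# Check that applying the rule (17.31) for R1 and then R2 equals applying it once for R2*R1.
import numpy as np
from scipy.spatial.transform import Rotation

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

def su2(axis, angle):
    n = np.asarray(axis, float) / np.linalg.norm(axis)
    return np.cos(angle / 2) * np.eye(2) - 1j * np.sin(angle / 2) * (n[0] * sx + n[1] * sy + n[2] * sz)

def so3(axis, angle):
    n = np.asarray(axis, float) / np.linalg.norm(axis)
    return Rotation.from_rotvec(angle * n).as_matrix()

def psi(x):
    """Arbitrary two-component test field (illustration only)."""
    return np.array([np.exp(-np.dot(x, x)), x[0] + 1j * x[1]])

def rotate_field(field, axis, angle):
    D, R = su2(axis, angle), so3(axis, angle)
    return lambda x: D @ field(R.T @ x)       # psi'(x) = D[R] psi(R^{-1} x)

a1, t1 = [0, 0, 1], 0.9
a2, t2 = [1, 1, 0], 1.4
step_by_step = rotate_field(rotate_field(psi, a1, t1), a2, t2)

D12 = su2(a2, t2) @ su2(a1, t1)               # D[R2] D[R1]
R12 = so3(a2, t2) @ so3(a1, t1)               # R2 R1
combined = lambda x: D12 @ psi(R12.T @ x)

x = np.array([0.2, -0.5, 0.8])
print(np.max(np.abs(step_by_step(x) - combined(x))))   # ~1e-16
```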

Of course, Eq. (17.30) reduces to Eq. (17.20) when j = 0, since in that case we are dealing with the identity representation
of SO (3). The behavior of multi-component fields under rotations induces the following definition

Definition 17.1 (Irreducible fields): A set of multi-component functions {φm (x) ; m = −j, −j + 1, . . . , j − 1, j} of the coor-
dinate vector x, is said to form an irreducible field of spin j, if they transform under rotations as:
φ −R→ φ′ ; φ′^n (x) = D^{(j)}[R]^n_k φ^k(R⁻¹x)

where D^{(j)}[R]^n_k denotes the matrix associated with the (j)-irreducible representation of SO (3). The set of multi-component
functions is also called a (2j + 1) −multiplet of SO (3) because of the dimension of the representation. For j = 0, 1/2, and 1
we have a singlet, a doublet, and a triplet of SO (3) respectively. Singlets are called scalars, doublets are called spinors and
triplets are vectors.

Example 17.4 Some important zero spin (scalar or singlet) fields in Physics are: The spinless wave-functions in quantum
mechanics, thermal fields, scalar potential fields, pressure fields etc.

Example 17.5 The wave-functions of fundamental fermions in quantum mechanics are very important j = 1/2 fields (Pauli
or spinorial fields). A Dirac wave-function is a field which is reducible under SO(3), consisting of the direct sum of two spin
1/2 irreducible fields.

Example 17.6 Some fields of j = 1 (vector fields) are the electric field E i (x) , the magnetic field B i (x) , the velocity field etc.

Example 17.7 Some fields of j = 2 (tensor fields) are: the stress-tensor field, the electromagnetic tensor field etc.

17.4 Transformation properties of operators under SO (3)


Now we consider the transformation properties of operators on the state vector space. We shall use as a prototype the
coordinate vector operator X i defined by the eigenvalue equation

X i |xi = |xi xi (17.32)

Theorem 17.5 (Transformation formula for vector operators): Components of the coordinate vector operator X (hence all
“vector operators” by definition) transform under rotations as follows
Xi′ ≡ U [R] Xi U [R]⁻¹ = Xj R^j_i (17.33)

where Rj i is the 3×3 SO(3) matrix that defines the rotation, see Eqs. (15.1, 15.2). Equivalently, Rj i is the 3×3 SO(3) matrix
that defines the irreducible representation associated with j = 1.

Proof : Applying the operator U [R] to Eq. (17.32) we have

U [R] X^i |xi = U [R] |xi x^i ⇒ U [R] X^i U [R]⁻¹ U [R] |xi = U [R] |xi x^i
⇒ U [R] X^i U [R]⁻¹ |x′i = |x′i x^i ⇒ U [R] X^i U [R]⁻¹ |x′i = |x′i (R⁻¹)^i_j x′^j
where we have used Eq. (15.2). Renaming the variable x′ → x, we find
U [R] X^i U [R]⁻¹ |xi = |xi (R⁻¹)^i_j x^j

On the other hand, from Eq. (17.32) we see that |xi xj = Xj |xi so that
U [R] X^i U [R]⁻¹ |xi = X^j |xi (R⁻¹)^i_j

since this is valid for |xi arbitrary we have


U [R] X^i U [R]⁻¹ = (R⁻¹)^i_j X^j (17.34)
Applying the orthogonality of R⁻¹ we have
U [R] X^i U [R]⁻¹ = (R̃)^i_j X^j = R^j_i X^j ⇒
U [R] Xi U [R]⁻¹ = Xj R^j_i (17.35)

where we have used the fact that Xj = X^j and R_j^i = R^j_i, since indices are raised and lowered with the Euclidean metric. QED.


It can be shown, by applying two rotations in succession, that this transformation formula provides a representation of the
group on the space of operators (which is also a vector space).
We have defined a vector operator as an operator whose rule of transformation under rotations is given by Eq. (17.33). A
very important case of vector operators concern the momentum operators

U [R] Pi U [R]−1 = Pj Rj i (17.36)

note that the RHS of Eqs. (17.33, 17.36) are similar to Eq. (15.1) for unit coordinate vectors, rather than that of Eq. (15.2)
for the coordinate components. It is also worth saying that Eqs. (17.33, 17.36) are similar in form to Eq. (15.52) involving
J = (J1 , J2 , J3 ). Thus, the angular momentum operator J also transforms as a vector operator. Generalizations of Eqs. (17.36,
15.52) will be crucial in the characterization of space-time symmetry groups.
Note that the concept of vector operators, is a particular case of the notion of irreducible operators or irreducible tensors
introduced in definition 9.3 page 183, applied to the specific group SO (3). When a Hamiltonian is rotationally invariant it
is an irreducible operator corresponding to j = 0 (scalar operator). We shall discuss some higher rank tensors in subsequent
developments.
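The vector-operator rule (17.33) can be checked numerically in a finite-dimensional setting. The sketch below is an illustration under the assumption that the Pauli matrices play the role of the vector operator and U[R] is the corresponding spin-1/2 (SU(2)) matrix built from the same axis and angle as R; it is not the position operator of the theorem, only an operator with the same transformation law.

```python
# Check of U[R] V_i U[R]^{-1} = V_j R^j_i with V_i = sigma_i in the spin-1/2 representation.
import numpy as np
from scipy.spatial.transform import Rotation

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

axis, angle = np.array([0.2, -0.4, 0.9]), 1.1
n = axis / np.linalg.norm(axis)
U = np.cos(angle / 2) * np.eye(2) - 1j * np.sin(angle / 2) * sum(n[k] * sigma[k] for k in range(3))
R = Rotation.from_rotvec(angle * n).as_matrix()

for i in range(3):
    lhs = U @ sigma[i] @ U.conj().T                     # U sigma_i U^{-1} (U is unitary)
    rhs = sum(sigma[j] * R[j, i] for j in range(3))     # sigma_j R^j_i
    print(i, np.max(np.abs(lhs - rhs)))                 # ~1e-16 each
```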

17.4.1 Transformation properties of local operators under SO (3)


Now, let us consider the case of local operators, that is operators that depend on the x variable. This scenario is useful
in quantum field theory and many-body quantum theory (in general a theory that involves second quantization i.e. the
quantization of classically continuous media). When a second quantization is carried out, we find that the field functions (or
wave-functions) become operators on the vector space of physical states. Let us denote a multi-component local operator as
Ψk (x). We shall use as a reference the second quantized Schrödinger theory of a spin 1/2 system. The local operator is a
two-component operator-valued Pauli spinor Ψσ (x). We will find the way in which such an operator transforms under a
rotation R. To obtain such a transformation we need the relation between the operator Ψ and the c-number wave-function
defined in ordinary quantum mechanics. The operator Ψ†σ (x) is interpreted as a creation operator of a localized particle, so
that
Ψ†σ (x) |0i = |x, σi ⇒ h0| Ψσ (x) = hx, σ|
where |0i is the vacuum state or 0−particle state. Therefore

hx, σ| ψi = h0| Ψσ (x) |ψi = ψ σ (x) (17.37)

where ψ σ (x) is the Pauli wave function for the state. Under an arbitrary rotation

U [R] |ψi = |ψ′i ⇒ |ψi = U [R]⁻¹ |ψ′i (17.38)

and using the fact that the vacuum is invariant under any rotation we have

U⁻¹[R] |0i = U†[R] |0i = |0i ⇒ h0| = h0| U [R] (17.39)

and ψ′^λ (x) is related to ψ^σ (x) by Eq. (17.31), which in matrix form reads
ψ′ (x) = D^{(1/2)}[R] ψ(R⁻¹x) ⇒ ψ(R⁻¹x) = D^{(1/2)}[R⁻¹] ψ′ (x)

defining y ≡ R⁻¹x, then x = Ry and
ψ (y) = D^{(1/2)}[R⁻¹] ψ′ (Ry)

writing it in components and redefining y → x, we have
ψ^σ (x) = D^{(1/2)}[R⁻¹]^σ_λ ψ′^λ (Rx) (17.40)

replacing Eqs. (17.38, 17.39, 17.40) in Eq. (17.37) we find


h0| U [R] Ψσ (x) U [R]⁻¹ |ψ′i = D^{(1/2)}[R⁻¹]^σ_λ ψ′^λ (Rx) (17.41)
 
On the other hand, multiplying Eq. (17.37) on the left by D^{(1/2)}[R⁻¹], and writing it in matrix form, we find
h0| D^{(1/2)}[R⁻¹] Ψ (x) |ψi = D^{(1/2)}[R⁻¹] ψ (x)

where we have used the invariance of h0| under rotations. Redefining ψ → ψ′ and y = R⁻¹x we obtain
h0| D^{(1/2)}[R⁻¹] Ψ (Ry) |ψ′i = D^{(1/2)}[R⁻¹] ψ′ (Ry)

writing this equation in components with y → x, it is found
h0| D^{(1/2)}[R⁻¹]^σ_λ Ψ^λ (Rx) |ψ′i = D^{(1/2)}[R⁻¹]^σ_λ ψ′^λ (Rx) (17.42)

comparing equations (17.41, 17.42), and taking into account the arbitrariness of |ψ ′ i we obtain
U [R] Ψσ (x) U [R]⁻¹ = D^{(1/2)}[R⁻¹]^σ_λ Ψ^λ (Rx) (17.43)

when comparing Eqs. (17.31, 17.43)4, we see that their right-hand sides differ by the interchange R ↔ R⁻¹. The reason for
this difference is the same as that for the difference between the operators X^i, Eq. (17.33), and the components x^i given by Eq.
(15.2)5. Finally, it can be proved that Eq. (17.43) defines a representation of the rotation group on the space of the operators
{Ψσ (x)}.
Now we remember that when we studied the transformation of the vector operator Xi we passed from Eq. (17.34) to Eq.
(17.35) which basically passed the D−matrix to the RHS. We wonder whether we can do the same in Eq. (17.43). To try it,
we use the fact that D [R]⁻¹ = D†[R] (unitarity)
D[R⁻¹]^σ_λ = D†[R]^σ_λ = (D[R]^λ_σ)∗ (17.44)

but substituting this expression on the RHS of Eq. (17.43), the result is quite complicated. It is more advantageous to take
the adjoint on both sides of Eq. (17.43), using the unitarity of U [R]
U [R] Ψ†σ (x) U [R]⁻¹ = Ψ†λ (Rx) (D^{(1/2)}[R⁻¹]^σ_λ)∗
where we have taken into account that D^{(1/2)}[R⁻¹]^σ_λ is a complex number and not a matrix. Using the unitarity of
D^{(1/2)}[R⁻¹] expressed by Eq. (17.44), we obtain
U [R] Ψ†σ (x) U [R]⁻¹ = Ψ†λ (Rx) D^{(1/2)}[R]^λ_σ (17.45)

this equation is the analog of Eqs. (9.27, 15.52, 17.33, 17.36). However, in the previous cases the operators involved had integer
spin and were chosen to be hermitian, whereas half-odd-integer spin operators are not naturally hermitian. Therefore, Eqs.
(17.43, 17.45) are distinct and will be used to characterize the transformations of the two independent operators Ψσ (x) and
Ψ†σ (x) under rotations.
The generalization of the previous result to any multi-component local field operator can be expressed as follows:

Definition 17.2 Let {Am (x) ; m = 1, 2, . . . , N } be a set of field operators which transform among themselves under rotations,
then we must have
U [R] A^m (x) U [R]⁻¹ = D[R⁻¹]^m_n A^n (Rx)
where {D [R]} is some N-dimensional representation of SO (3). If the representation is irreducible and equivalent to a given
j (so that D[R⁻¹] ≡ D^{(j)}[R⁻¹]), then {A} is said to have spin j.

Example 17.8 The example described above corresponds to j = 1/2. For vector fields such as the second-quantized electro-
magnetic fields E (x) , B (x) and the vector potential A (x) we have D [R] = D(1) [R] = R, and corresponds to j = 1.

Example 17.9 For the relativistic Dirac field, we obtain the reducible representation 1/2 ⊕ 1/2. It is important to take into
account however, that this representation is not reducible anymore when the Lorentz symmetry and space inversion are included.
4 Eq. (17.31) is for a multi-component field, while Eq. (17.43) is for a multi-component field operator.
5 In the same way that X^i is a quantization of x^i, the local operator Ψσ (x) is a quantization of the field ψσ (x). Heuristically, we can say that the
process of quantization, “produces” an interchange in the roles of R and R−1 in the rules of transformations of the objects in question under SO(3).

17.5 Applications of the generalized projection operators in SO (3) and SU (2)


In Sec. 16.11 we established the generalization of the projection operators applied to SO (3) and SU (2). Such a projector
became
P^n_{jm} = (2j + 1) ∫ dτA D^{(j)†}(A)^n_m U (A) (17.46)

where dτA is the normalized volume element. We shall apply these results to systems of one and two particles with spin. By
contrasting the projection-operator treatment with the traditional one, we shall appreciate the power of the technique.

17.5.1 Single particle states with spin


We say that a single-particle system possesses an intrinsic spin s, if the quantum mechanical states of this particle in its own
rest frame are eigenstates of J 2 with the eigenvalue s (s + 1). We denote such states as |p = 0, λi where λ = −s, . . . , s is the
eigenvalue of the operator J3 (along the “quantization axis” X3 ), in the rest frame. It is then natural to ask how to characterize
such a system when the particle is not at rest. Since description is easier in terms of conserved quantities, we define states
of either definite linear momentum p or definite energy and angular momentum (E, J, M ), depending on the nature of the
problem (both sets form conjugate bases, see Sec. 17.1, Eqs. 17.7, 17.8). For a particle with spin s there are, however,
(2s + 1) spin states for each p or (E, J, M ); we shall now characterize those spin states6.

Linear momentum eigenstates and helicity


We construct a state of definite momentum with magnitude p and direction n (θ, φ) as in Sec. 17.1, i.e. we define a “standard
state” in a given direction (usually the X3 −axis), and then we form a specific state by means of a specific rotational operation7 .
The standard state is an eigenstate of momentum with p = pe3
Pi |pe3 , λi = |pe3 , λi p δi3 ; i = 1, 2, 3 (17.47)

since there is no orbital angular momentum along the direction of motion e3 , we can interpret the spin index λ as the eigenvalue
of the total angular momentum J along such a direction, i.e. the eigenvalue of J3 . For an arbitrary direction of motion, λ will
be the eigenvalue of J · n (θ, φ) = J · P/p. More formally, let us calculate the commutator of J · P with P
[J^k Pk , Pm] = J^k [Pk , Pm] + [Jk , Pm] P^k = [Lk + Sk , Pm] P^k = [Lk , Pm] P^k
= [εkqr X^q P^r , Pm] P^k = εkqr X^q [P^r , Pm] P^k + εkqr [X^q , Pm] P^r P^k
[J · P, Pm] = εkqr (i~ δ^q_m) P^r P^k = i~ εkmr P^r P^k = −i~ εmkr P^r P^k = 0
the last equality follows from the antisymmetry of εmkr and the fact that Pk commutes with Pr . Since J · P commutes with
P, the standard state can be chosen as a simultaneous eigenstate of these operators; combining these facts with Eqs. (17.47)
we have
(J · P/p) |pe3, λi = (Jk P^k/p) |pe3, λi = (Jk p δ^k_3/p) |pe3, λi = J3 |pe3, λi ⇒
(J · P/p) |pe3, λi = J3 |pe3, λi = |pe3, λi λ (17.48)
then, we define a general single-particle state with momentum in the n (θ, φ) direction as
|p, λi ≡ |p, θ, φ, λi = U (φ, θ, 0) |pe3 , λi (17.49)
it can be verified that |p, λi is an eigenstate of P with eigenvalue p. By construction, λ represents the helicity of the
particle (projection of the angular momentum J along the direction of motion P/p). It is important to check whether this
interpretation applies for an arbitrary direction of the momentum, i.e. whether it is preserved by Eq. (17.49). To prove it, we
observe that J · P is a scalar
[J^k Pk , Jm] = J^k [Pk , Jm] + [Jk , Jm] P^k = J^k [Pk , Lm] + [Jk , Jm] P^k = J^k [Pk , εmpq X^p P^q] + i~ εkmj J^j P^k
= εmpq J^k X^p [Pk , P^q] + εmpq J^k [Pk , X^p] P^q + i~ εkmj J^j P^k = εmpq J^k (−i~ δ_k^p) P^q + i~ εkmj J^j P^k
[J · P, Jm] = −i~ εmpq J^p P^q − i~ εmkj J^j P^k = −i~ εmjk J^j P^k + i~ εmjk J^j P^k = 0
where we have taken into account that summed indices are dummy. We finally find
[J · P, Jm ] = 0 ; m = 1, 2, 3
6 It is well known in quantum mechanics that the total angular momentum of a system undergoing a central interaction is a conserved quantity,
even though its partial angular momenta are not in general conserved.


7 It is analogous to the procedure in which all angular momentum states |j, mi in a given multiplet, are obtained from the “standard state” |j, ji.

so that J · P is invariant under all rotations. Now we check the action of J · P/p on |p, λi

(J · P/p) |p, λi = (J · P/p) U [R (φ, θ, 0)] |pe3, λi = U [R (φ, θ, 0)] (J · P/p) |pe3, λi = U [R (φ, θ, 0)] |pe3, λi λ ⇒
(J · P/p) |p, λi = |p, θ, φ, λi λ (17.50)

therefore, relation (17.48) is valid for arbitrary directions of motion.

Construction of angular momentum eigenstates by the projection method

On the other hand, we shall construct states with definite angular momentum (J, M ). We have learnt that a canonical basis
for an invariant minimal subspace associated with the representation (J) can be constructed by taking any element |xi of the
representation space and applying to it a generalized projector P^λ_{JM} as shown in Eq. (16.100), provided the vectors in the set
(16.100) are not all null. In particular, we can take as the arbitrary vector |xi the standard state |pe3, λi and apply to it the
generalized projection operator P^λ_{JM}
|pJM λi = P^λ_{JM} |pe3, λi (17.51)
with P^λ_{JM} given by Eq. (17.46). The label p in |pJM λi indicates the standard state |pe3, λi from which it is generated in
Eq. (17.51)8. The index λ labels the subspace generated. The RHS of Eq. (17.51) is non-null according to Eq. (16.101).
Further, theorem 9.2 says that the vectors defined this way automatically transform as the canonical basis {|J, M i} under
rotations. We now write Eq. (17.51) in terms of physical variables, by using the explicit form of the projector Eq. (17.46)
|pJM λi = (2J + 1) ∫ dτA D^{(J)†}(A)^λ_M U (A) |pe3, λi
= [(2J + 1)/(8π²)] ∫ dφ d(cos θ) dψ U (φ, θ, ψ) |pe3, λi D^{(J)†}(φ, θ, ψ)^λ_M
= [(2J + 1)/(8π²)] ∫ dφ d(cos θ) dψ [e^{−iφJ3} e^{−iθJ2} e^{−iψJ3} |pe3, λi] D^{(J)∗}(φ, θ, ψ)^M_λ
= [(2J + 1)/(8π²)] ∫ dφ d(cos θ) dψ [e^{−iφJ3} e^{−iθJ2} e^{−iψJ3} |pe3, λi] e^{iφM} d^{(J)}(θ)^M_λ e^{iψλ}

where we have used Eqs. (15.59, 15.85) and the Condon-Shortley convention. From Eq. (17.48) we have
|pJM λi = [(2J + 1)/(8π²)] [∫₀^{2π} e^{iψλ} e^{−iψλ} dψ] ∫ dφ d(cos θ) [e^{−iφJ3} e^{−iθJ2} |pe3, λi] e^{iφM} d^{(J)}(θ)^M_λ
|pJM λi = [(2J + 1)/(8π²)] (2π) ∫ dφ d(cos θ) [e^{−iφJ3} e^{−iθJ2} e^{−i0·J3} |pe3, λi] e^{iφM} d^{(J)}(θ)^M_λ e^{iλ·0}
|pJM λi = [(2J + 1)/(4π)] ∫ dφ d(cos θ) U (φ, θ, 0) |pe3, λi [D^{(J)}(φ, θ, 0)^M_λ]∗
|pJM λi = [(2J + 1)/(4π)] ∫ dφ d(cos θ) U (φ, θ, 0) |pe3, λi D^{(J)†}(φ, θ, 0)^λ_M
now using Eq. (17.49), the relation between plane-wave momentum eigenstates and angular momentum eigenstates finally becomes
|pJM λi = [(2J + 1)/(4π)] ∫ dφ d(cos θ) |p, θ, φ, λi D^{(J)†}(φ, θ, 0)^λ_M (17.52)
which is the exact vectorial analog of the scalar equation (16.89). It shows that angular momentum states are linear combina-
tions of the linear momentum states through the transfer matrix given by (2J + 1) D^{(J)†}(φ, θ, 0)^λ_M. In the particular case of
spinless particles we can set the index λ on the D−matrices as λ = 0, and drop it as a label of states. In this case we obtain the
well-known connection between orbital angular momentum and linear momentum states through the spherical harmonics, Eq.
(17.7) in Sec. 17.1.2. Equation (17.52) is a generalization of that result to states with arbitrary intrinsic angular momentum
s. Indeed, Eq. (17.52) is an example of the application of the Peter-Weyl theorem to vector-valued functions (instead of scalar
functions). It also corresponds to the special case of the theorem expressed by Eqs. (16.85, 16.88, 16.89).
The angular momentum states {|J, M i} also form a basis in the space of single-particle states. Therefore, Eq. (17.52) can
be inverted. To do it, we multiply both sides of (17.52) by D^{(J)}(φ, θ, 0)^M_λ, sum over J, M and use the completeness condition
8 It does NOT mean that |pJM λi is an eigenstate of the linear momentum.

Eq. (16.91)9, to get
Σ_{JM} |pJM λi D^{(J)}(φ, θ, 0)^M_λ = Σ_{JM} [(2J + 1)/(4π)] ∫ dφ′ d(cos θ′) |p, θ′, φ′, λi D^{(J)†}(φ′, θ′, 0)^λ_M D^{(J)}(φ, θ, 0)^M_λ
= (1/4π) ∫ dφ′ d(cos θ′) |p, θ′, φ′, λi Σ_{JM} (2J + 1) D^{(J)†}(φ′, θ′, 0)^λ_M D^{(J)}(φ, θ, 0)^M_λ
= (1/4π) ∫ dφ′ d(cos θ′) |p, θ′, φ′, λi [4π δ(φ − φ′) δ(cos θ − cos θ′)]
obtaining finally10
|p, θ, φ, λi = Σ_{JM} |pJM λi D^{(J)}(φ, θ, 0)^M_λ

in particular, the “standard state” is expressed as
|pe3, λi = |p, 0, 0, λi = Σ_{JM} |pJM λi D^{(J)}(0, 0, 0)^M_λ = Σ_{JM} |pJM λi E^M_λ = Σ_{JM} |pJM λi δ^M_λ
|pe3, λi = Σ_J |pJλλi

and we see that it is composed of states of all angular momenta.

Other approaches to the problem


There are other ways to characterize the states of a quantum mechanical single particle system with spin. Historically, the first
characterization was in terms of eigenstates of spin angular momentum operators {Si } along with linear momentum {Pi } or
orbital angular momentum {Li }. Therefore, instead of states of the type in Eq. (17.49) one deals with states |p, θ, φ, σi
which are eigenstates of {Pi } and S3 . Alternatively, we can use eigenstates of orbital angular momentum |p, l, m, σi, which are
eigenstates of P 2 , L2 , L3 and S3 . Both sets of eigenstates are connected by the relation
|p, l, m, σi = ∫ dΩ |p, θ, φ, σi Ylm (θ, φ) (17.53)

the disadvantage with these types of states is that {Si } and {Li } are not conserved quantities like {Pi } and {Ji }. Further,
they can be defined only for restricted systems (e.g. non-relativistic) and under specific dynamical assumptions. There is no
unambiguous general method to measure the quantum numbers (l, m, σ). To form eigenstates of the total angular momentum
(which is a conserved quantity) from those given in Eq. (17.53) we must combine “orbital” and “static spin” degrees of freedom
by using the rules of addition of angular momentum
|pJM li = Σ_{m,σ} |p, l, m, σi hmσ (l, s) JM i (17.54)

however, these states still differ from the ones in Eq. (17.52) in the last quantum number. Each vector in the set (17.54)
is a linear combination of the vectors in the set (17.52) and vice versa (it can be shown that l takes on the same number of values as λ).
Although both sets are in principle equivalent, the quantum number λ (helicity) has a well-defined physical meaning, while
the label l is model-dependent unless it happens to correspond to a conserved quantity (e.g. parity for spin 1/2 systems).
Another advantage of the helicity characterization of states is that it applies equally well to zero-mass states (such as
photons) and to non-zero-mass states. In contrast, the static spin has no meaning for zero-mass states (which travel at the
speed of light), so that zero-mass and non-zero-mass states must be treated separately in the old formalism.

17.5.2 Two particle states with spin


Two-particle states are important either as bound states or in scattering problems. If one or both particles possess non-zero
intrinsic spin, the characterization of their quantum mechanical states can be rather difficult in the old formalism. If the spin
and orbital angular momentum are used, one must couple the spins of the two particles with the (relative) orbital angular
momentum in various ways, such as the L−S coupling, the j−j coupling, etc. Then, by using several Clebsch-Gordan coefficients,
we form a state with definite total angular momentum. To describe a physical process we must go through one of these coupling
schemes for the initial state (to form states with well-defined total angular momentum which is a conserved quantity), and
consider the transition to some final state (with the same total angular momentum) which must also be characterized in a
9 Note that we are dealing with functions that are independent of ψ and only depend on φ, θ. Thus we are dealing with functions of the form

given by Eq. (16.85). Consequently, we can use the completeness relation (16.91) instead of the more complete one given by Eq. (16.75).
10 The factor 4π instead of 16π 2 in the Dirac delta function has to do with the fact that there is not integration over ψ.

coupling scheme. This process is quite complicated and can be simplified if the initial characterization of Physical states is
chosen appropriately.
We follow a reasoning similar to the one described for one particle. Our starting point is then a “standard” state in the
center of mass frame (CM) of the two particles in which particle 1 has momentum pe3 and helicity λ1 . Particle 2 moves with
opposite momentum in the CM frame such that its momentum is −pe3 and its helicity is λ2

|pe3 , λ1 , λ2 i ≡ |pe3 , λ1 i ⊗ |−pe3 , λ2 i = |pe3 , λ1 i ⊗ U (0, π, 0) |pe3 , λ2 i


the net angular momentum of this state is λ1 − λ2 along e3 so that

J3 |pe3 , λ1 , λ2 i = |pe3 , λ1 , λ2 i (λ1 − λ2 )

so the general “plane-wave states” are defined as in Eq. (17.49)

|p, θ, φ, λ1 , λ2 i = U (φ, θ, 0) |pe3 , λ1 , λ2 i (17.55)


applying the generalized projection operator P^{λ1−λ2}_{JM}, eigenstates of total angular momentum are obtained directly
|pJM λ1 λ2 i = [(2J + 1)/(4π)] ∫ dΩ |p, θ, φ, λ1, λ2 i D^{(J)†}(φ, θ, 0)^{λ1−λ2}_M (17.56)
inverting this equation we find
|p, θ, φ, λ1, λ2 i = Σ_{JM} |pJM λ1 λ2 i D^{(J)}(φ, θ, 0)^M_{λ1−λ2} (17.57)

The two particle helicity plane-wave and total angular momentum states were first formulated by Jacob and Wick. The helicity
states of Jacob and Wick have many advantages

1. All quantum numbers appearing in the state are physical and measurable, being independent of dynamical assumptions.
2. The relation between “plane-wave” states Eq. (17.55) and “angular momentum states” Eq. (17.56), is direct and free
from arbitrary coupling schemes, such as L − S, j − j etc.
3. The behavior of helicity states under discrete physical symmetry transformations such as space-inversion, time-reversal,
charge conjugation, and permutation of identical particles are in general simple and well-defined.
4. The formalism is applicable to zero-mass systems and non-zero-mass systems on the same footing. This is not the case when
other coupling schemes for spins are used.
5. For the reasons discussed above, the description of scattering and decay processes in atomic, nuclear and particle Physics
is simple, elegant and practical.

17.5.3 Scattering of two particles with spin: partial-wave decomposition

Figure 17.1: Two-particle scattering from the point of view of the center of mass.

Consider the scattering of two particles with spin (λa , λb ) in the initial state that after the collision create two particles
with spins (λc , λd ) in the final state. In the center of mass frame, the momentum of particle a is opposite to the momentum of

particle b in the initial state and the same happens for particles c and d in the final state as can be seen in Fig. 17.1. According
with Sec. 17.5.2, Eq. (17.57), the initial and final two particle states can be characterized as
|pi, λa, λb i = Σ_J |pi, J, M = λa − λb, λa, λb i D^{(J)}(φ, θ, 0)^{λa−λb}_{λa−λb} (17.58)
|pf, λc, λd i = Σ_{JM} |pf, J, M, λc, λd i D^{(J)}(φ, θ, 0)^M_{λc−λd} (17.59)

where we have assumed that the initial states have been “prepared” with well defined values of helicity (hence there is no
sum over M ) while the final states have not a well defined helicity. Since all known interactions are invariant under spatial
rotations, the scattering matrix conserves angular momentum. Therefore, it corresponds to an irreducible spherical tensor
with s = 0. Applying the Wigner-Eckart theorem Eq. (15.133), and taking into account that s = 0 corresponds to the identity
representation of SO (3), we obtain

hpf Jf Mf λc λd | T |pi Ji Mi λa λb i = hλc λd | |TJ (E)| |λa λb i δJ Jf δMi Mf δJi J (17.60)

where E denotes the total energy of the system (a conserved quantity) and the first factor on the RHS is the “reduced matrix
element”, which depends only on the variables explicitly displayed. Combining Eqs. (17.58, 17.60) we find
hpf λc λd | T |pi λa λb i = Σ_J hλc λd || TJ (E) ||λa λb i d^{(J)}(θ)^{λa−λb}_{λc−λd} e^{i(λa−λb)φ}

which is the general partial wave expansion for any two particle scattering or reaction. If we compare these developments with
the ones in Sec. 17.1.3, we can see that in the helicity formalism, the spin degrees of freedom of the particles involved do
not introduce any significant complication with respect to spinless particles. By contrast, in the conventional approach with
static spin labels for the particles, the relationship between “plane wave” and “angular momentum” states, involves multiple
Clebsch-Gordan coupling coefficients for both the initial and final states, and consequently the partial wave expansion is much
more complicated than that of spinless particles. Further, additional properties of the partial wave expansion can be extracted
from spatial inversion and time reversal symmetries.
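The Wigner d^{(J)}(θ) matrices entering this expansion can be generated numerically. The following is a minimal sketch, assuming the convention d^{(J)}(θ) = exp(−iθ J2) in the canonical basis ordered m = J, J−1, ..., −J and units with ~ = 1; the helper name wigner_small_d is our own, not a library routine.

```python
# Sketch: Wigner d^(J)(theta) = exp(-i theta J_y) built from the ladder operators
# J_± |J m> = sqrt(J(J+1) - m(m±1)) |J m±1>.
import numpy as np
from scipy.linalg import expm

def wigner_small_d(J, theta):
    dim = int(round(2 * J + 1))
    m = np.array([J - k for k in range(dim)])            # m = J, J-1, ..., -J
    Jp = np.zeros((dim, dim))
    for k in range(1, dim):                              # <m[k]+1 | J_+ | m[k]>
        Jp[k - 1, k] = np.sqrt(J * (J + 1) - m[k] * (m[k] + 1))
    Jy = (Jp - Jp.T) / (2j)
    return expm(-1j * theta * Jy)

d = wigner_small_d(0.5, 1.0)
# For J = 1/2 this reproduces [[cos(t/2), -sin(t/2)], [sin(t/2), cos(t/2)]].
ref = np.array([[np.cos(0.5), -np.sin(0.5)], [np.sin(0.5), np.cos(0.5)]])
print(np.max(np.abs(d - ref)))                           # ~1e-16
```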
Chapter 18

Euclidean Groups in two dimensions

Experience has shown that physical space possesses two important properties: homogeneity and isotropy. Homogeneity says
that there is no privileged point in space, while isotropy says that there is no privileged direction. A Euclidean
space is a space endowed with these features. These postulates imply that the results of scientific experiments performed
on isolated systems should depend neither on the origin of the reference frame nor on the orientation of the coordinate axes
or experimental setup. It is then natural to incorporate two types of symmetries associated with homogeneity and isotropy:
(a) uniform translations T (b) along a direction b̂ through a distance b, which account for the fact that experimental results on
isolated systems must be invariant under a shift of the origin; (b) uniform rotations Rn (θ) around a direction n through an angle
θ, which account for the invariance of the results under reorientation of the reference frames. Consequently, we define
Definition 18.1 (Euclidean groups): The Euclidean group En consists of all continuous linear transformations on the n-dimensional
Euclidean space Rn that leave the length of all vectors invariant.

Points in Rn are characterized by their coordinates x1 , . . . , xn . A general linear transformation reads
x → x′ ; x′i = Ri j xj + bi (18.1)
the length (or norm) of a vector is the usual distance between the end points x and y of the vector
l = √[Σ_{i=1}^{n} (x^i − y^i)²] (18.2)

for this length to be invariant under a transformation of the type (18.1), the necessary and sufficient condition is that Ri j be
an orthogonal matrix. The “homogeneous” part of Eq. (18.1) corresponds to a rotation while the “inhomogeneous” part of
Eq. (18.1) (parameterized by bi ) corresponds to a uniform translation of all points.
Although we have studied the group of continuous translations in Sec. 14.2 and the group of continuous rotations (in two
and three dimensions) in chapters 14, 15, the transformations T (b) and Rn (θ) do not commute. Therefore, their interplay
will give rise to new interesting properties1. Besides the utility of the Euclidean space to study the space-time
structure of non-relativistic theories, it provides a good scenario to introduce new techniques for analyzing the irreducible
unitary representations of non-compact, non-abelian Lie groups. Further, it permits us to extend our understanding of special
functions (in this case Bessel functions) that appear in classical analysis and that are related to representation functions of
the Euclidean groups. Finally, it prepares the way to study the relativistic structure of space-time that leads to the Lorentz
and Poincaré groups.
The Euclidean group En is also called the group of motions in the space Rn, because En is the symmetry group of
general motion in physical space in both classical and quantum mechanics. Since motion is usually governed by the
Lagrangian, and for many purposes the Lagrangian can be taken as T − V with T the kinetic energy and V a potential
energy, it is useful to examine the behavior of the Lagrangian under Euclidean transformations. In classical physics,
the kinetic energy reads
T = Σ_r (1/2) mr vr²
where r is a label for the particles. Since dxr is the difference of two coordinates, we see that vr = dxr/dt is invariant under
translations and so is T. Additionally, vr² = kvr k² is clearly invariant under rotations, so that T is invariant under the full
Euclidean group. In quantum mechanics the kinetic energy reads
T = Σ_r p_r²/(2mr) = − Σ_r (~²/2mr) ∂_r²

1 Note that it also says that En is NOT the direct product of the SO (n) and T (b) groups. This is because the elements of one group factor must
commute with the elements of the other group factor to be able to form such a product (see sections 6.12, 6.13).


which is also invariant under the Euclidean group. If the potential energy depends on the coordinate vectors {xr } only, the
homogeneity of space implies that the laws of motion derived from V should not depend on the origin chosen. Hence, the
potential can only depend on relative coordinates xrs = xr − xs . On the other hand, isotropy demands that the laws of motion
arising from V , should not depend on the orientation of coordinate axes. Consequently, the variables {xrs } can only enter V
in rotationally invariant scalar combinations (such as kxrs k or xrs · xmn )2. For two-particle systems this implies that the
interaction between them must be a function of the magnitude of the relative coordinate only, i.e. V = V (kx1 − x2 k) = V (r).
The homogeneity of space is in general tied to the conservation of linear momentum, while the isotropy of space is related
to the conservation of angular momentum, via Noether's theorem.
Non-relativistic classical and quantum mechanics can be described in the framework of the Euclidean space. Thus, the
Euclidean group is appropriate to study the space-time symmetries of non-relativistic theories. Symmetry considerations like
the ones in the previous paragraph, are also useful in characterizing unknown interactions in frontier areas of Physics.
The most important Euclidean groups for Physics applications are the two and three dimensional ones. We start with the
simplest two dimensional Euclidean group

18.1 The Euclidean group in two dimensions E2


In a two-dimensional space the axis of rotation is fixed, so that rotations are characterized by a single parameter θ, while
translations are determined by two parameters (b1, b2). Eq. (18.1) becomes

x′1 = x1 cos θ − x2 sin θ + b1 (18.3)


x′2 = x1 sin θ + x2 cos θ + b2 (18.4)

this element of the E2 group will be denoted as g (b, θ). The transformation rule described by Eqs. (18.3, 18.4) can be
expressed in matrix form if we represent each point x by a 3-component vector

x ≡ (x1, x2, 1) (18.5)

so that we can write Eqs. (18.3, 18.4) as
[x′1]   [cos θ  −sin θ  b1] [x1]
[x′2] = [sin θ   cos θ  b2] [x2]   (18.6)
[ 1 ]   [  0       0    1 ] [ 1 ]

so that the matrix representation of g (b, θ) becomes
           [cos θ  −sin θ  b1]
g (b, θ) = [sin θ   cos θ  b2]   (18.7)
           [  0       0    1 ]

The group multiplication rule can be obtained by matrix multiplication
                                     [cos θ2  −sin θ2  b2^1] [cos θ1  −sin θ1  b1^1]
g (b3, θ3) = g (b2, θ2) g (b1, θ1) = [sin θ2   cos θ2  b2^2] [sin θ1   cos θ1  b1^2]
                                     [   0        0     1  ] [   0        0     1  ]
             [cos (θ1 + θ2)  −sin (θ1 + θ2)  cos θ2 b1^1 − sin θ2 b1^2 + b2^1]
g (b3, θ3) = [sin (θ1 + θ2)   cos (θ1 + θ2)  sin θ2 b1^1 + cos θ2 b1^2 + b2^2]   (18.8)
             [      0                0                       1               ]

writing g (b3, θ3) as in Eq. (18.7) and equating with Eq. (18.8) we have
g (b3, θ3) = g (b2, θ2) g (b1, θ1)   (18.9)
[cos θ3  −sin θ3  b3^1]   [cos (θ1 + θ2)  −sin (θ1 + θ2)  cos θ2 b1^1 − sin θ2 b1^2 + b2^1]
[sin θ3   cos θ3  b3^2] = [sin (θ1 + θ2)   cos (θ1 + θ2)  sin θ2 b1^1 + cos θ2 b1^2 + b2^2]   (18.10)
[   0        0     1  ]   [      0                0                       1               ]

the first two columns of these matrices clearly give


θ3 = θ1 + θ2 (18.11)
2 This kind of potential leads to central forces. We can see it by observing that for two particles at rest, the only privileged vector that can be
constructed is the one joining them; if the force between them were not along this vector, the space would not be isotropic. When the particles start
moving, there are other privileged vectors (the velocities of the particles) that permit the force not to be central (e.g. magnetic forces). We can also
obtain non-central forces if there is an intrinsic vector property associated with the particles. This is the case when point particles are endowed with
spin, or in the case of point-like dipole moments.

and matching the third column of both matrices we find
[b3^1]   [cos θ2 b1^1 − sin θ2 b1^2 + b2^1]   [cos θ2  −sin θ2  0] [b1^1]   [b2^1]
[b3^2] = [sin θ2 b1^1 + cos θ2 b1^2 + b2^2] = [sin θ2   cos θ2  0] [b1^2] + [b2^2]
[  1 ]   [               1                ]   [   0        0    0] [  1 ]   [  1 ]
[b3^1]                              [cos θ  −sin θ  0]
[b3^2] = R (θ2) b1 + b2 ;  R (θ) ≡ [sin θ   cos θ  0]   (18.12)
[  1 ]                              [  0       0    0]

Gathering Eqs. (18.11, 18.12) we obtain the rule of multiplication

g (b3 , θ3 ) = g (b2 , θ2 ) g (b1 , θ1 )


θ3 = θ1 + θ2 ; b3 = R (θ2 ) b1 + b2 (18.13)

From Eqs. (18.9, 18.10) we can characterize the inverse of a given element
E = g (b2, θ2) g (b1, θ1)
[1 0 0]   [cos (θ1 + θ2)  −sin (θ1 + θ2)  cos θ2 b1^1 − sin θ2 b1^2 + b2^1]
[0 1 0] = [sin (θ1 + θ2)   cos (θ1 + θ2)  sin θ2 b1^1 + cos θ2 b1^2 + b2^2]
[0 0 1]   [      0                0                       1               ]

equating the first two columns clearly gives


θ1 = −θ2 (18.14)
for the third column we have
[0]   [cos θ2 b1^1 − sin θ2 b1^2 + b2^1]   [cos θ2  −sin θ2  0] [b1^1]   [b2^1]
[0] = [sin θ2 b1^1 + cos θ2 b1^2 + b2^2] = [sin θ2   cos θ2  0] [b1^2] + [b2^2]
[1]   [               1                ]   [   0        0    0] [  1 ]   [  1 ]

writing the first two components only (the third one is trivial) and using (18.14) we have
[0]   [cos θ2 b1^1 − sin θ2 b1^2 + b2^1]   [cos θ2  −sin θ2] [b1^1]   [b2^1]
[0] = [sin θ2 b1^1 + cos θ2 b1^2 + b2^2] = [sin θ2   cos θ2] [b1^2] + [b2^2]
0 = R (θ2) b1 + b2 ⇒ b2 = −R (θ2) b1
b2 = −R (−θ1) b1   (18.15)

and it is straightforward to realize that this equation still holds when the third component is added again. From Eqs. (18.14,
18.15) we see that
[g (b, θ)]⁻¹ = g (−R (−θ) b, −θ) (18.16)
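Both the multiplication rule (18.13) and the inverse formula (18.16) are easy to verify numerically with the 3×3 matrices of Eq. (18.7). The following sketch uses arbitrary test parameters; for the check, R(θ) acts on b as the ordinary 2×2 rotation block.

```python
# Numerical check of the E2 multiplication rule (18.13) and inverse (18.16).
import numpy as np

def g(b, theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, b[0]],
                     [s,  c, b[1]],
                     [0,  0, 1.0]])

def rot2(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

b1, t1 = np.array([1.0, -2.0]), 0.6
b2, t2 = np.array([0.5,  3.0]), -1.1

# (18.13): g(b2,t2) g(b1,t1) = g(R(t2) b1 + b2, t1 + t2)
print(np.max(np.abs(g(b2, t2) @ g(b1, t1) - g(rot2(t2) @ b1 + b2, t1 + t2))))   # ~1e-16

# (18.16): g(b,t)^{-1} = g(-R(-t) b, -t)
ginv = g(-rot2(-t1) @ b1, -t1)
print(np.max(np.abs(ginv @ g(b1, t1) - np.eye(3))))                              # ~1e-16
```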
The subset defined by {g (0, θ) = R (θ)} forms the subgroup of rotations3, i.e. the SO (2) group studied in Sec. 14.1. The
generator of this one-parameter subgroup is written in the present representation as
    [0  −i  0]
J = [i   0  0]   (18.17)
    [0   0  0]

and a general element of the rotation subgroup is given by R (θ) = e−iθJ . The generator J is the angular momentum operator
(see Sec. 14.1.5).
The subset {g (b,0) = T (b)} is the subgroup of translations T2 . It has two independent one-parameter subgroups with
parameters b1 and b2 and generators
     [0  0  i]        [0  0  0]
P1 = [0  0  0] ; P2 = [0  0  i]   (18.18)
     [0  0  0]        [0  0  0]

it can be checked that P1 and P2 commute with each other, as expected since translations commute. Therefore, general
translations can be written as
T (b) = e^{−ib·P} = e^{−ib^1 P1} e^{−ib^2 P2}
with P being the momentum operator (see Sec. 14.2.1).
3 Comparing g (0, θ) obtained from Eq. (18.7) with R (θ) defined in (18.12), we see that [g (0, θ)]^3_3 ≠ [R (θ)]^3_3. However, this only affects the
third (spurious) component of the three-vectors defined in (18.5).

Theorem 18.1 (Decomposition of elements of E2 ): An arbitrary element of E2 can be written as

g (b, θ) = T (b) R (θ) (18.19)

Proof : Using Eqs. (18.16, 18.13) we have

g (b, θ) R (θ)⁻¹ = g (b, θ) g (0, θ)⁻¹ = g (b, θ) g (−R (−θ) 0, −θ) = g (b, θ) g (0, −θ)
= g (R (θ) 0 + b, θ − θ) = g (b, θ − θ) = g (b, 0) = T (b) ⇒
g (b, θ) R (θ)⁻¹ = T (b) ⇒ g (b, θ) = T (b) R (θ)

QED.
It is important to study the interplay between translations and rotations. We start by establishing the Lie algebra.

Theorem 18.2 (Lie algebra of E2 ): The generators of E2 satisfy the following commutation relations (Lie algebra):

[P1 , P2 ] = 0 ; [J, Pk ] = iεkm Pm ; k = 1, 2 (18.20)

where εkm is the 2-dimensional unit anti-symmetric tensor.

Proof : It can be checked by using the explicit matrix representations given by Eqs. (18.17, 18.18). QED.
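The check mentioned in the proof can be carried out directly; a short sketch using the matrices of Eqs. (18.17, 18.18) follows.

```python
# Check of the E2 Lie algebra (18.20) with the explicit matrices (18.17, 18.18).
import numpy as np

J  = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])
P1 = np.array([[0, 0, 1j], [0, 0, 0], [0, 0, 0]])
P2 = np.array([[0, 0, 0], [0, 0, 1j], [0, 0, 0]])

def comm(A, B):
    return A @ B - B @ A

print(np.max(np.abs(comm(P1, P2))))              # [P1, P2] = 0
print(np.max(np.abs(comm(J, P1) - 1j * P2)))     # [J, P1] = +i P2
print(np.max(np.abs(comm(J, P2) + 1j * P1)))     # [J, P2] = -i P1
```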
Observe that the second of Eqs. (18.20) says that under rotations, the set {Pk } transform as components of a vector
operator. It can also be expressed as

Theorem 18.3 The generators Pk and the translation operators T (b) transform under rotations in E2 as follows
R (θ) Pk R (−θ) = e^{−iθJ} Pk e^{iθJ} = Pm R (θ)^m_k (18.21)
R (θ) T (b) R (−θ) = e^{−iθJ} T (b) e^{iθJ} = T [R (θ) b] (18.22)

Proof : Using the Baker-Hausdorff-Campbell identity Eq. (16.110), page 309, we have
e^{−iθJ} Pk e^{iθJ} = Pk + [−iθJ, Pk] + (1/2!)[−iθJ, [−iθJ, Pk]] + (1/3!)[−iθJ, [−iθJ, [−iθJ, Pk]]] + . . . (18.23)

let us define an operator Ĵ acting on another operator A as
Ĵ A ≡ [−iθJ, A] (18.24)

applying definition (18.24) in Eq. (18.23) for k = 1, we obtain
e^{−iθJ} P1 e^{iθJ} = P1 + Ĵ P1 + (1/2!) Ĵ² P1 + (1/3!) Ĵ³ P1 + . . . = Σ_{n=0}^{∞} Ĵ^{2n} P1/(2n)! + Σ_{n=0}^{∞} Ĵ^{2n+1} P1/(2n + 1)! (18.25)

we can prove by induction that


n n
Jb2n P1 = (−1) θ2n P1 ; Jb2n+1 P1 = (−1) θ2n+1 P2 (18.26)

for this we note first that, by using the definition (18.24), the LHS of Eq. (18.26) with n = 0 gives P1 = P1 and Ĵ P1 = θP2,
showing that (18.26) works for n = 0. We should also prove that if Eq. (18.26) is satisfied for n, it is also satisfied for n + 1.
To show it, we first calculate Ĵ² P1 and Ĵ² P2
Ĵ² P1 = Ĵ (Ĵ P1) = Ĵ ([−iθJ, P1]) = −iθ Ĵ (iP2) = θ Ĵ P2 = θ [−iθJ, P2] = −iθ² (−iP1)
Ĵ² P1 = −θ² P1 (18.27)

similarly
Ĵ² P2 = −θ² P2 (18.28)
combining Eqs. (18.26, 18.27, 18.28) we see that
Ĵ^{2(n+1)} P1 = Ĵ² [Ĵ^{2n} P1] = Ĵ² [(−1)^n θ^{2n} P1] = (−1)^n θ^{2n} Ĵ² P1 = (−1)^{n+1} θ^{2(n+1)} P1
Ĵ^{2(n+1)+1} P1 = Ĵ² [Ĵ^{2n+1} P1] = Ĵ² [(−1)^n θ^{2n+1} P2] = (−1)^n θ^{2n+1} Ĵ² P2 = (−1)^{n+1} θ^{2(n+1)+1} P2

which proves the validity of Eq. (18.26). Now replacing (18.26) in (18.25) we find

e^{−iθJ} P1 e^{iθJ} = P1 Σ_{n=0}^{∞} (−1)^n θ^{2n}/(2n)! + P2 Σ_{n=0}^{∞} (−1)^n θ^{2n+1}/(2n + 1)!
e^{−iθJ} P1 e^{iθJ} = P1 cos θ + P2 sin θ (18.29)

we can prove similarly that


e−iθJ P2 eiθJ = −P1 sin θ + P2 cos θ (18.30)
Eqs. (18.29, 18.30) can be rewritten as
e^{−iθJ} (P1 P2) e^{iθJ} = (P1 P2) [cos θ  −sin θ]
                                   [sin θ   cos θ]
⇒ e^{−iθJ} P e^{iθJ} = P · R (θ)

which is the matrix form of Eq. (18.21). Now, for an arbitrary operator we see that

e^{−iθJ} (A)² e^{iθJ} = e^{−iθJ} A A e^{iθJ} = e^{−iθJ} A (e^{iθJ} e^{−iθJ}) A e^{iθJ} = (e^{−iθJ} A e^{iθJ}) (e^{−iθJ} A e^{iθJ}) = (e^{−iθJ} A e^{iθJ})²
which is easy to generalize as
e^{−iθJ} (A)^n e^{iθJ} = (e^{−iθJ} A e^{iθJ})^n (18.31)
e−iθJ (A) eiθJ = e−iθJ Ae−iθJ (18.31)
this is hardly a surprise since similarity transformations preserve products. Using Eqs. (18.31, 18.21) we find
"∞   #
h i X −i b1 P1 + b2 P2 n
−iθJ iθJ −iθJ −i(b1 P1 +b2 P2 ) iθJ −iθJ
e T (b) e = e e e =e eiθJ
n=0
n!
∞  n iθJ ∞  −iθJ   n
X e −iθJ 1
−i b P1 + b P2 2
e X e −i b1 P1 + b2 P2 eiθJ
= =
n=0
n! n=0
n!
∞   n ∞  m m  n
X −i b1 e−iθJ P1 eiθJ + b2 e−iθJ P2 eiθJ X −i Pm R (θ) 1 b1 + Pm R (θ) 2 b2
= =
n=0
n! n=0
n!

X {−iP· [R (θ) b]}n
= = exp {−iP· [R (θ) b]} = T [R (θ) b]
n=0
n!

which proves Eq. (18.22). QED.
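The finite form (18.21) can also be verified directly with the matrices (18.17, 18.18) via the matrix exponential; a short sketch with an arbitrary angle follows.

```python
# Check of Eq. (18.21): exp(-i theta J) P_k exp(i theta J) = P_m R(theta)^m_k.
import numpy as np
from scipy.linalg import expm

J = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])
P = [np.array([[0, 0, 1j], [0, 0, 0], [0, 0, 0]]),
     np.array([[0, 0, 0], [0, 0, 1j], [0, 0, 0]])]

theta = 0.8
c, s = np.cos(theta), np.sin(theta)
R = np.array([[c, -s], [s, c]])            # 2x2 rotation block acting on (P1, P2)

U = expm(-1j * theta * J)
Uinv = expm(1j * theta * J)
for k in range(2):
    lhs = U @ P[k] @ Uinv
    rhs = sum(P[m] * R[m, k] for m in range(2))
    print(k, np.max(np.abs(lhs - rhs)))    # ~1e-16
```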


According to Theorem 17.5 and Eq. 17.36, expression (18.21) says that P is a vector operator. Note that the second of
Eqs. (18.20) and Eq. (18.21) are similar to Eqs. (15.60, 15.52) respectively, which express the rotational properties of Jk .
On the other hand, it can be shown that Eq. (18.22) is equivalent to the multiplication rule Eq. (18.13). To see this, we
observe that Eq. (18.22) along with Eq. (18.19) leads to Eq. (18.13)

g (a, θ) g (b, φ) = T (a) R (θ) T (b) R (φ) = T (a) R (θ) T (b) [R (−θ) R (θ)] R (φ) = T (a) [R (θ) T (b) R (−θ)] R (θ) R (φ)
= T (a) T [R (θ) b] R (θ + φ) = T [R (θ) b + a] R (θ + φ)
= g [R (θ) b + a, θ + φ]

Theorem 18.4 The translations T2 ≡ {T (b)} form an invariant abelian subgroup in E2 . The factor group E2 /T2 is isomorphic
to SO (2).

Proof : Using Eqs. (18.19, 18.22) we obtain


g (b, θ) T (a) g (b, θ)⁻¹ = [T (b) R (θ)] T (a) [R (−θ) T (−b)] = T (b) [R (θ) T (a) R (−θ)] T (−b) = T (b) T [R (θ) a] T (−b)
g (b, θ) T (a) g (b, θ)⁻¹ = T (b) T (−b) T [R (θ) a] = T [R (θ) a] ∈ T2 ∀ T (a) ∈ T2 and ∀ g (b, θ) ∈ E2 (18.32)

where we have used the abelianity of T2 . Equation (18.32) shows that T2 is invariant in E2 . On the other hand, the elements
of E2 /T2 are cosets defined by {T · g (b, θ)} where the argument of T is omitted because each coset contains all elements of
the form T (a). Left-cosets and right-cosets are identical since T2 is invariant in E2 . Using the decomposition in Eq. (18.19)
we obtain (as an equation of cosets)

T g (b, θ) = T [T (b) R (θ)] = [T T (b)] R (θ) = T R (θ)



where we have used the fact that the product of two translations is another translation. Observe that distinct cosets T R (θ) are
labelled by one parameter θ, so that they are in one-to-one correspondence with elements of SO (2) = {R (θ)}. Using Eq.
(18.22), we can determine the rule of multiplication between cosets
[T R (θ)] [T R (φ)] = T R (θ) T R (φ) = T R (θ) T [R (−θ) R (θ)] R (φ) = T [R (θ) T R (−θ)] R (θ) R (φ) = T T R (θ) R (φ)
[T R (θ)] [T R (φ)] = T R (θ + φ)
which proves the homomorphism. Therefore, E2 /T2 is isomorphic to SO (2). QED. Since T2 is invariant in E2 and the
factorization in Eq. (18.19) is unique, definition 6.24 says that E2 is the semi-direct product of T2 with SO (2)
E2 = T2 ∧ SO (2) (18.33)
On the other hand, since E2 contains at least one abelian invariant subgroup, it is neither simple nor semi-simple. In addition,
the parameters b1, b2 that characterize translations are not bounded, so that the group manifold is not compact. Therefore,
the representation theory for E2 (and En in general) is much more complicated than for the Lie groups we have studied.
We shall not develop the complete theory of representations; instead, we shall restrict ourselves to the unitary representations,
which are widely used in Physics. These representations are typically infinite-dimensional, but most of the methods developed
which are widely used in Physics. These representations are typically infinite-dimensional but most of the methods developed
previously can be applied for these types of representations.

18.2 One-dimensional representations of E2


Before entering into the general treatment, we notice that some degenerate representations can be obtained4 from the factor
group E2 /T2 ≈ SO (2) according to theorem 18.4. We learnt in Sec. 14.1 that the irreducible representations of SO (2)
are labelled by an integer m = 0, ±1, ±2, . . ., and all of them are one-dimensional, where the homomorphism is given by
R (θ) → e−imθ . The induced one-dimensional representations of E2 read
g (b, θ) ∈ E2 → Um (b, θ) = e−imθ ∀b
we can check that it is a representation5 by noting that Um (b, 0) = 1 ∀m, b and that
Um (b, θ) Um (a, φ) = e−imθ e−imφ = e−im(θ+φ)
which agrees with the expected rule Um [R (θ) a + b, θ + φ] = e−im(θ+φ) . These are the only finite-dimensional irreducible
unitary representations of E2 .

18.3 Basic Lie algebra


The group SO (2) has one generator J, while T2 has two generators P1, P2. However, Eq. (18.20) shows that J does not commute
with P1 and P2, so that joining both groups gives other interesting features. Following the procedure in Sec. 15.5, we start by
defining raising and lowering operators, consisting of generators of translations
P± = P1 ± iP2 (18.34)
P1 = (P+ + P−)/2 ; P2 = (P+ − P−)/(2i) (18.35)
now Eq. (18.20) says that P1, P2 commute with each other, so P+, P− do as well
[P1 , P2 ] = [P+ , P− ] = 0 (18.36)
From the second of Eqs. (18.20) we can evaluate the commutator of P± with J
[J, P± ] = [J, P1 ± iP2 ] = [J, P1 ] ± i [J, P2 ] = iP2 ± i (−iP1 ) = ± (P1 ± iP2 )
[J, P± ] = ±P± (18.37)
Now we define the squared momentum operator as
P² ≡ (P1)² + (P2)² = [(P+ + P−)/2]² + [(P+ − P−)/(2i)]²
= (1/4)(P+² + P−² + 2P+P−) − (1/4)(P+² + P−² − 2P+P−)
P² ≡ (P1)² + (P2)² = P+P− = P−P+ (18.38)
4 This is related with the non-simple nature of the group, as can be seen from Corollary 7.3 page 134.
5 It means that all elements of the kernel of the homomorphism (i.e. the elements of the invariant subgroup T2 in E2) are mapped into a single
element (the identity) in the quotient group, as it must be (see Sec. 6.11, page 124).

from Eqs. (18.37, 18.36) it is easy to show that P² commutes with all the generators of the group
[J, P²] = [J, P+P−] = P+ [J, P−] + [J, P+] P− = −P+P− + P+P− = 0
[Pi, P²] = [Pi, P^k Pk] = P^k [Pi, Pk] + [Pi, P^k] Pk = 0 ⇒
[J, P²] = [Pi, P²] = [P±, P²] = 0 (18.39)

therefore, P2 is a Casimir operator and has a unique eigenvalue for each irreducible representation (i.e. its matrix representation
in an irreducible invariant subspace is proportional to the identity). For unitary representations the generators (J, P1 , P2 ) are
mapped into hermitian operators that we shall denote by the same symbols. From Eq. (18.34), with P1,2 hermitian, it is clear
that (P±)† = P∓ . Since Pi has real eigenvalues, Pi^2 has non-negative eigenvalues, so P^2 is a positive operator, whose
eigenvalues will be denoted by p^2 ≥ 0.
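The commutation relations above are easy to verify numerically. The sketch below (an added illustration) uses a 3×3 matrix realization of the e2 algebra acting on column vectors (x, y, 1)ᵀ; the explicit matrices are an assumption chosen to reproduce the commutators used above ([J, P1] = iP2, [J, P2] = −iP1, [P1, P2] = 0), not something stated in the notes.

```python
import numpy as np

# 3x3 realization of the e2 algebra acting on (x, y, 1)^T,
# chosen so that [J, P1] = i P2, [J, P2] = -i P1, [P1, P2] = 0.
J  = np.array([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])
P1 = 1j * np.array([[0, 0, 1], [0, 0, 0], [0, 0, 0]])
P2 = 1j * np.array([[0, 0, 0], [0, 0, 1], [0, 0, 0]])

def comm(A, B):
    return A @ B - B @ A

Pp, Pm = P1 + 1j * P2, P1 - 1j * P2   # raising/lowering combinations, Eq. (18.34)

assert np.allclose(comm(J, P1),  1j * P2)
assert np.allclose(comm(J, P2), -1j * P1)
assert np.allclose(comm(P1, P2), 0)
assert np.allclose(comm(J, Pp),  Pp)            # Eq. (18.37)
assert np.allclose(comm(J, Pm), -Pm)
assert np.allclose(P1 @ P1 + P2 @ P2, Pp @ Pm)  # Eq. (18.38); in this realization both sides vanish
print("e2 commutation relations verified in the 3x3 realization")
```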

18.4 Unitary irreducible representations of E2 : Lie algebra method


The basic procedure to construct irreducible representations of Lie groups starts with an initial “standard reference vector”
|v0 i in the representation space V , and operating on this vector successively with either the generators or the elements of the
group until a closed (irreducible) invariant subspace, or a basis for such a subspace, is generated. We shall carry out this procedure
by using the generators, for which reason we call it the Lie algebra method. The procedure is similar to the one followed
for SO (3), but in our present case all faithful representations will be infinite-dimensional.
Following the procedure of chapter 15, we construct the general unitary representations of E2 . A unitary irreducible repre-
sentation of E2 is also a unitary representation (in general reducible) of the subgroup SO (2). Consequently, the representation
space is a direct sum of one-dimensional subspaces labelled by the eigenvalue m of J with m = 0, ±1, ±2, . . .
Since P2 and J commute, they admit a complete set of simultaneous eigenvectors that we shall denote as {|pmi}

P2 |pmi = |pmi p2 ; p2 ≥ 0 (18.40)


J |pmi = |pmi m ; m = 0, ±1, ±2, . . . (18.41)

From Eq. (18.37) we see that

(JP± − P± J) |pmi = ±P± |pmi ⇒ JP± |pmi = P± J |pmi ± P± |pmi = P± |pmi m ± P± |pmi
J [P± |pmi] = P± |pmi [m ± 1] (18.42)

hence, P± |pmi is an eigenvector of J with eigenvalue m ± 1. Let us normalize these vectors
‖P± |pmi‖^2 = hpm| P±† P± |pmi = hpm| P∓ P± |pmi = hpm| P^2 |pmi = p^2 hpm|pmi
‖P± |pmi‖^2 = p^2   (18.43)

where we have used Eqs. (18.38, 18.40) and we assumed that the vectors |pmi are already normalized. We should study two
cases
(i) p2 = 0, from Eq. (18.43) it implies
P± |0 mi = 0 (18.44)
providing a one-dimensional representation6

J |0 mi = |0 mi m ; R (θ) |0 mi = e−iJθ |0 mi = |0 mi e−imθ ; T (b) |0 mi = e−ib·P |0 mi = |0 mi e−ib·0 = |0 mi

we basically rederived the one-dimensional representations previously obtained through the quotient group E2 /T2 ≈ SO (2).
In the equations above, we have used the symbols R (θ) and T (b) to denote either the group elements or their realization
as unitary operators on the representation space. Strictly speaking, the notation for realizations should be of the form U [R (θ)]
and U [T (b)] but we shall simplify the notation unless it proves to be necessary the distinction between group elements and
realizations on a given representation space.
(ii) p2 > 0. Since Eq. (18.42) says that P± |pmi is an eigenvector of J with eigenvalue m ± 1, such a vector must be
proportional to |p m ± 1i. Further, Eq. (18.43) says that for |p m ± 1i to be normalized, it must be of the form P± |pmi eiθ /p.
For later convenience we choose the phase factor in the form
|p m ± 1i = P± |pmi (±i/p)   (18.45)
6 Note that Eq. (18.44) says that there exists a “fundamental state” that is annihilated by both the raising and lowering operators. In contrast,

for SO (3) there are states that are annihilated by either J+ or J− but not by both simultaneously.

it is very easy to show by induction that
(P±)^n |pmi = |p m ± ni (p/(±i))^n   (18.46)
According to Eq. (18.45), starting from a given “standard” or “reference” vector, repeated application of the operators P±
leads to the sequence of basis vectors
{|p, mi , m = 0, ±1, ±2, . . .} (18.47)
this set generates an irreducible invariant subspace under E2 . This can be seen by showing that the action of the generators of
E2 on such a set produces linear combinations of its elements.

J |p, mi = |p, mi m
P1 |p, mi = (P+ + P−)/2 |p, mi = (ip/2) [− |p m + 1i + |p m − 1i]
P2 |p, mi = (P+ − P−)/(2i) |p, mi = −(p/2) [|p m + 1i + |p m − 1i]
finally, since (Pk)^n |p, mi involves vectors of the form |p, m ± ni, we see that all elements in Eq. (18.47) are necessary to obtain
the closure property, thus showing that the subspace generated is irreducible.
The irreducible representation space spanned by the basis in Eq. (18.47) is clearly infinite-dimensional. In this space the
matrix elements of the generators and raising and lowering operators are
hpm′| J |pmi = m δ^{m′}_{m} ; hpm′| P± |pmi = ∓ip δ^{m′}_{m±1}   (18.48)
hpm′| P1 |pmi = (1/2) hpm′| (P− + P+) |pmi = (ip/2) [δ^{m′}_{m−1} − δ^{m′}_{m+1}]   (18.49)
hpm′| P2 |pmi = (1/(2i)) hpm′| (P+ − P−) |pmi = −(p/2) [δ^{m′}_{m+1} + δ^{m′}_{m−1}]   (18.50)
Theorem 18.5 (Faithful unitary irreducible representations of E2 ): The faithful unitary irreducible representations of E2 are
characterized by a strictly positive number p. The matrix elements of the generators are given by Eqs. (18.48, 18.49, 18.50),
and the representation matrices for finite transformations are given by

D(p) (b, θ)^{m′}_{m} ≡ hpm′| g (b, θ) |pmi = e^{i(m−m′)φ} J_{m−m′} (pb) e^{−imθ}   (18.51)

where (b, φ) are the polar coordinates of the 2-vector b ≡ (b1 , b2 ), and

J_{m−m′} (pb) ≡ (pb/2)^{m−m′} Σ_{q=0}^{∞} [ (−1)^q (pb/2)^{2q} / (q! (q + m − m′)!) ] ; m′ ≤ m
J_{m−m′} (pb) ≡ (−1)^{m′−m} J_{m′−m} (pb) ; m′ > m   (18.52)

Jn defined by Eq. (18.52) is called the Bessel function of the first kind.

Proof : Taking into account the decomposition (18.19), it is enough to study rotations and translations separately. For
rotations we see from Eq. (18.41) that

hpm′| e^{−iθJ} |pmi = e^{−imθ} δ^{m′}_{m}   (18.53)
in the case of translations we have from (18.22) that
T (b) = T (b, φ) = T [R (φ) (b, 0)] = R (φ) T (b, 0) R (φ)^{−1}

from which we obtain
hpm′| T (b) |pmi = hpm′| R (φ) T (b, 0) R (φ)^{−1} |pmi = hpm′| e^{−iφJ} T (b, 0) e^{iφJ} |pmi = e^{−iφm′} hpm′| T (b, 0) |pmi e^{iφm}
hpm′| T (b) |pmi ≡ hpm′| T (b, φ) |pmi = e^{i(m−m′)φ} hpm′| T (b, 0) |pmi   (18.54)

therefore, it suffices to consider translations along the X−axis only, using Eq. (18.35) we have
T (b, 0) = e^{−ibP1} = e^{−ib(P+ + P−)/2} = e^{−ibP+/2} e^{−ibP−/2} = Σ_k [(−iP+)^k (b/2)^k / k!] Σ_l [(−iP−)^l (b/2)^l / l!]
T (b, 0) = Σ_{k,l} (−i)^{k+l} P+^k P−^l (b/2)^{k+l} / (k! l!)   (18.55)

where we have used Eq. (3.117) page 69, and the fact that P+ and P− commute with each other. Using Eqs. (18.46, 18.55),
the matrix elements of T (b, 0) between the states {|pmi} read
hpm′| T (b, 0) |pmi = Σ_{k,l} (−i)^{k+l} hpm′| P+^k P−^l |pmi (b/2)^{k+l}/(k! l!) = Σ_{k,l} (−i)^{k+l} (p/(−i))^l hpm′| P+^k |p m − li (b/2)^{k+l}/(k! l!)
= Σ_{k,l} (−i)^{k+l} (p/(−i))^l (p/i)^k hpm′ |p m − l + ki (b/2)^{k+l}/(k! l!) = Σ_{k,l} (−1)^k p^{k+l} hpm′ |p m − l + ki (b/2)^{k+l}/(k! l!)
= Σ_{k,l} (−1)^k δ^{m′}_{m−l+k} (pb/2)^{k+l}/(k! l!) = Σ_{k,l} (−1)^k δ^{l}_{m−m′+k} (pb/2)^{k+l}/(k! l!) = Σ_{k=0}^{∞} (−1)^k (pb/2)^{2k+m−m′} / [k! (k + m − m′)!]   (18.56)

defining n = 2k + m − m′ as the summation index we have

hpm′| T (b, 0) |pmi = Σ_{n=m−m′} (−1)^{(n−m+m′)/2} (pb/2)^n / {[(n − m + m′)/2]! [(n − m + m′)/2 + m − m′]!}
hpm′| T (b, 0) |pmi = Σ_{n=m−m′} (−1)^{(n−m+m′)/2} (pb/2)^n / {[(n − m + m′)/2]! [(n + m − m′)/2]!}   (18.57)

for definiteness, let us assume m′ ≤ m; the summation is then taken from n = |m − m′| to infinity in steps of two (since n = 2k + m − m′).
The first equality in Eq. (18.56) shows that the non-vanishing terms satisfy m′ = m − l + k or, equivalently,
k − l = m′ − m. From these observations we depict the summation indices and limits in Fig. 18.1. Now we shift the summation

Figure 18.1: Summation indices and limits for the series representation of Bessel functions.

index to q = (n + m′ − m) /2 to obtain

hpm′| T (b, 0) |pmi = (pb/2)^{m−m′} Σ_{q=0}^{∞} (−1)^q (pb/2)^{2q} / [q! (q + m − m′)!] ; m′ ≤ m   (18.58)

the RHS of Eq. (18.58) coincides with Eq. (18.52) for m′ ≤ m, and corresponds to the standard series representation of the
Bessel function of the first kind Jm−m′ (pb).

hpm′ | T (b, 0) |pmi = Jm−m′ (pb) ; m′ ≤ m

For the case m < m′ , we shift the summation index to q = (n − m′ + m) /2 in Eq. (18.57), to find

hpm′| T (b, 0) |pmi = (−pb/2)^{m′−m} Σ_{q=0}^{∞} (−1)^q (pb/2)^{2q} / [q! (q + m′ − m)!] = (−1)^{m′−m} J_{m′−m} (pb)
hpm′| T (b, 0) |pmi = J_{m−m′} (pb) ; m < m′   (18.59)

where we have used in the last step the usual definition of Bessel functions of negative integer indices. Therefore, combining
Eqs. (18.58, 18.59, 18.54) we obtain for the general matrix element of translations the result

hpm′| T (b, φ) |pmi = e^{i(m−m′)φ} J_{m−m′} (pb)   (18.60)

combining (18.60) with the general matrix of rotations, Eq. (18.53), and taking into account the decomposition (18.19) we have
D(p) (b, θ)^{m′}_{m} ≡ hpm′| g (b, θ) |pmi = hpm′| T (b) R (θ) |pmi = Σ_n hpm′| T (b) |pni hpn| R (θ) |pmi
= Σ_n e^{i(n−m′)φ} J_{n−m′} (pb) e^{−imθ} δ^{n}_{m} = e^{i(m−m′)φ} J_{m−m′} (pb) e^{−imθ}

and we obtain Eq. (18.51). QED.


Since the basis {|pmi , m = 0, ±1, ±2, . . .} consists of eigenvectors of the “angular momentum operator” J, we call it the
angular momentum basis. Note that Bessel functions have arisen as representation functions of the E2 group. We recall
that Bessel functions usually appear in Physics in problems that are effectively 2-dimensional, e.g. systems with cylindrical
symmetry. The group-theoretical characterization highlights the geometrical origin of their appearance in such systems, independently of
any other details. We shall study the properties of Bessel functions from the group-theory point of view in Sec. 18.7.
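Although the representation is infinite-dimensional, its unitarity can be illustrated numerically. The sketch below (added; the truncation of the angular momentum basis and the chosen index range are assumptions of the sketch) builds the translation block of Eq. (18.51) with SciPy's Bessel function and checks that rows far from the truncation edge are orthonormal, which works because J_n(pb) decays rapidly for |n| much larger than pb.

```python
import numpy as np
from scipy.special import jv  # Bessel function of the first kind J_v(z)

def D_translation(p, b, phi, mmax):
    """Matrix elements <p m'| T(b, phi) |p m> = exp(i(m-m')phi) J_{m-m'}(pb),
    truncated to |m|, |m'| <= mmax (cf. Eq. 18.60)."""
    m = np.arange(-mmax, mmax + 1)
    diff = m[None, :] - m[:, None]          # m - m'
    return np.exp(1j * diff * phi) * jv(diff, p * b)

p, b, phi = 1.0, 2.5, 0.7
D = D_translation(p, b, phi, mmax=40)
core = slice(20, 61)   # rows/columns well inside the truncated basis
print(np.allclose((D.conj().T @ D)[core, core], np.eye(81)[core, core], atol=1e-10))
```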

18.5 The induced representation method and the plane-wave basis


The induced representation method is an alternative way to derive irreducible representations of continuous groups that contain
an abelian invariant subgroup (i.e. that are not semi-simple). It consists of constructing a basis for the irreducible vector space
consisting of eigenvectors of the generators of the abelian invariant subgroup together with other suitably defined operators.
In the case of E2 the abelian invariant subgroup is the 2-dimensional translation group T2 . The two generators (P1 , P2 )
are components of a 2-dimensional vector operator P. The eigenvalues of P are two-dimensional vectors with real components
(since P is hermitian for unitary representations). We proceed in the following way
(i) We select a “standard” or “reference” momentum vector (eigenvector of P) that we denote as p0 ≡ (p, 0). We then
generate the subspace (invariant and irreducible with respect to the abelian invariant subgroup T2 ) associated with
this standard eigenvector. The dimensionality of this subspace depends on whether there are any group operations in the
quotient group (SO (2) in this case) which leave such an eigenvector invariant. Equivalently, we can look for the maximal
set of generators of the quotient group that commute with P. In this case, the quotient group SO (2) has only one generator
J and it does not commute with P. Consequently, there is only one independent eigenstate of P associated with the standard
momentum eigenvector p0 . Then we have

P1 |p0 i = |p0 i p ; P2 |p0 i = 0 ; P^2 |p0 i = |p0 i p^2   (18.61)

(ii) Now we generate the full irreducible invariant subspace (invariant with respect to E2 ). This is done by group operations
which produce new eigenvalues of P. These operations are associated with generators that do not commute with P7 . In our
case, they can only be R (θ) = e^{−iθJ} . Therefore, we study the “momentum content” of R (θ) |p0 i, using Eqs. (18.21, 18.61)
we have
Pk R (θ) |p0 i = R (θ) [R (−θ) Pk R (θ)] |p0 i = R (θ) Pl R (−θ)^{l}_{k} |p0 i = R (θ) |p0 i p_{0l} R (−θ)^{l}_{k}   (18.62)

using the fact that (p1 , p2 ) is a cartesian 2-vector, and the orthogonality condition for R (θ) we have.

p_k = p_{0l} R (−θ)^{l}_{k}   or   p^k = R (θ)^{k}_{l} p_0^{l}

so that p ≡ (p1 , p2 ) is a new cartesian vector rotated through an angle θ with respect to p0 . Hence, we obtain

Pk [R (θ) |p0 i] = R (θ) |p0 i pk (18.63)

therefore, R (θ) |p0 i is a new eigenvector of P corresponding to a new eigenvalue (momentum vector) p = R (θ) p0 . It suggests
to define
|pi ≡ R (θ) |p0 i (18.64)
this definition also determines the relative phase of |pi with respect to the reference vector |p0 i. Since Eq. (18.64) describes
a rotation of θ that keeps the vector length unaltered (R (θ) = e−iθJ is a unitary operator), the polar coordinates of the new
coordinate vector are p = (p, θ). So if |p0 i is normalized, then |pi also is.
7 According to Theorem 3.17, page 51, generators that commute with P do not produce new eigenvalues of P.

Further, it can be shown that the set of vectors

{|pi ≡ R (θ) |p0 i ≡ |p, θi for some θ with 0 ≤ θ < 2π} (18.65)

associated with the standard vector |p0 i (i.e. the set of all vectors obtained from an arbitrary rotation of |p0 i), is closed under
all group operations. To see it, we note first that |pi are eigenstates of P with eigenvalue p, so that

T (b) |pi = e−ib·P |pi = |pi e−ib·p (18.66)

now for rotations we have


R (φ) |pi = R (φ) R (θ) |p0 i = R (φ + θ) |p0 i ≡ |p′ i
so that the set {|pi} under group operations yields

T (b) |pi = |pi e−ib·p , R (φ) |pi = |p′ i ; p′ ≡ R (φ) p = (p, θ + φ) (18.67)

showing the invariance of the space generated by the basis {|pi}. Moreover, Eqs. (18.67) also show that all vectors in the
set defined by Eq. (18.65) are necessary to keep the invariance. Therefore, {|pi} forms a basis of an irreducible vector space
invariant under E2 .
(iii) We fix the normalization of the set {|pi} of basis vectors. If p 6= p′ the associated eigenvectors |pi and |p′ i must be
orthogonal to each other because they are eigenvectors of an hermitian operator P associated with different eigenvalues. We
now ask for the appropriate normalization when p = p′ , since p2 is the (unique) eigenvalue of a Casimir operator P2 , it is
invariant under all group operations, then we only need to consider the continuous parameter θ in |p, θi ≡ |pi. The definition
of |pi in Eq. (18.64) clearly indicates a one-to-one correspondence between these basis vectors and the group elements R (θ)
of the subgroup of rotations SO (2). It is then natural to adopt the invariant integration measure of SO (2) (say dθ/2π) for
the basis vectors. Therefore, the orthonormality condition becomes

hp′ |pi = hp θ′ |p θi = 2πδ (θ′ − θ) (18.68)

note that the existence of an invariant abelian subgroup (in this case T2 ) was essential to develop the method. The abelian
nature of T2 allowed us to label the basis vectors by p. On the other hand, in Eq. (18.62), we have used Eq. (18.21), which is
equivalent to theorem 18.4. Therefore, the invariance property was essential in the procedure of generating all |pi from |p0 i.
The set {|pi} of linear momentum basis vectors will be called plane wave basis in order to use a terminology familiar in
Physics.
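To make the construction concrete, the following sketch (an added illustration; the discretization of the angle θ on a grid is purely a numerical assumption) represents a state by its components on the plane-wave kets |p, θ⟩ and implements T(b) and R(φ) according to Eqs. (18.66, 18.67), then checks the semi-direct product relation R(φ)T(b)R(−φ) = T(R(φ)b).

```python
import numpy as np

N, p = 360, 1.0
thetas = np.linspace(0.0, 2 * np.pi, N, endpoint=False)

def translate(psi, b):
    """T(b)|p,theta> = |p,theta> exp(-i b.p(theta)), Eq. (18.66): a diagonal phase."""
    px, py = p * np.cos(thetas), p * np.sin(thetas)
    return psi * np.exp(-1j * (b[0] * px + b[1] * py))

def rotate(psi, phi):
    """R(phi)|p,theta> = |p,theta+phi>, Eq. (18.67): a shift of the angular label."""
    shift = int(round(phi / (2 * np.pi) * N))   # phi restricted to grid multiples
    return np.roll(psi, shift)

phi, b = 2 * np.pi * 17 / N, np.array([0.3, -1.2])
psi = np.exp(1j * np.random.default_rng(1).uniform(0, 2 * np.pi, N))  # arbitrary state
lhs = rotate(translate(rotate(psi, -phi), b), phi)
Rb = np.array([[np.cos(phi), -np.sin(phi)], [np.sin(phi), np.cos(phi)]]) @ b
print(np.allclose(lhs, translate(psi, Rb)))   # R(phi) T(b) R(-phi) = T(R(phi) b)
```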

18.6 Relation between the angular momentum and plane wave bases
Let us study the relation between the two bases generated so far to expand invariant subspaces under E2 . The plane-wave
basis can be denoted as {|p θi} while the angular momentum basis is denoted by {|p mi}. Since the parameter p is an invariant
label under all group transformations, we can simplify the notation to {|θi} and {|mi}.
We first obtain any vector |mi in terms of the set {|θi}. The ket |mi is an eigenvector of J, which is the generator of the
compact Lie subgroup SO (2). It can be obtained from any vector of the representation space by using the projection method
explained in Sec. 9.2. The extension of the projection operator to the SO (3) continuous group was studied in Sec. 16.11
and is given by Eq. (17.46). We can apply Eq. (17.46) using U (g) = R (φ), D (g) = e−imφ , dτA = dφ/2π and 2j + 1 → 1
(dimensionality of the irreducible representations of SO (2)), so we get
P_m = ∫ (dφ/2π) e^{imφ} R (φ)   (18.69)

According to the discussion in Sec. 9.2, given any vector |xi in the representation space, the vector Pm |xi transforms
irreducibly under SO (2). Let us take |xi = |p0 i ≡ |φ = 0i, the reference vector in the plane-wave states, and apply the
generalized projector (18.69) to it
Pm |p0 i ≡ |m̃i = ∫_0^{2π} (dφ/2π) R (φ) |p0 i e^{imφ} = ∫ (dφ/2π) |φi e^{imφ}   (18.70)
We can check that |m̃i is an eigenstate of J with eigenvalue m as follows
R (θ) |m̃i ≡ R (θ) ∫_0^{2π} (dφ/2π) R (φ) |p0 i e^{imφ} = ∫_0^{2π} (dφ/2π) R (θ + φ) |p0 i e^{imφ} = ∫_0^{2π} (dφ/2π) |θ + φi e^{imφ}
= ∫_0^{2π} (dφ/2π) |θ + φi e^{im(φ+θ)} e^{−imθ} = ∫_0^{2π} (dφ′/2π) |φ′i e^{imφ′} e^{−imθ}
R (θ) |m̃i = |m̃i e^{−imθ}

consequently, |m̃i must be proportional to |mi. To find the constant of proportionality, we check the normalization and relative
phases of |m̃i for different eigenvalues m. We check the normalization of |m̃i by using the orthonormality of the set {|θi} given by
Eq. (18.68) as well as the definition in Eq. (18.70)
hm̃′|m̃i = ∫ (dφ/2π) ∫ (dφ′/2π) hφ′|φi e^{imφ} e^{−im′φ′} = ∫ (dφ/2π) ∫ (dφ′/2π) 2πδ (φ′ − φ) e^{i(mφ−m′φ′)} = ∫ (dφ/2π) e^{i(m−m′)φ}
hm̃′|m̃i = δ^{m′}_{m}

we now investigate the behavior of |m̃i under the action of P± . For this we first examine the quantity P± |φi: since the state
|φi is rotated by an angle φ with respect to |p0 i, and the eigenvalues of P associated with |p0 i are (p, 0), it is clear that
the eigenvalues of P associated with |pi ≡ |φi are

(P1 , P2 ) |pi ≡ P |φi = |φi p ; p = (p cos φ, p sin φ)   (18.71)

Further, using Eqs. (18.63, 18.64, 18.71) we obtain
P± |φi = P± R (φ) |p0 i = (P1 ± iP2 ) R (φ) |p0 i = P1 R (φ) |p0 i ± iP2 R (φ) |p0 i = R (φ) |p0 i (p cos φ ± ip sin φ)
P± |φi = |φi p e^{±iφ}   (18.72)

from Eq. (18.72) and the definition (18.70), we can find the action of P± on |m̃i
P± |m̃i = ∫_0^{2π} (dφ/2π) P± |φi e^{imφ} = ∫_0^{2π} (dφ/2π) |φi p e^{±iφ} e^{imφ} = [∫_0^{2π} (dφ/2π) |φi e^{i(m±1)φ}] p
P± |m̃i = |\widetilde{m ± 1}i p   (18.73)

On the other hand, we have from Eq. (18.46) that

P± |mi = |m ± 1i p (∓i) (18.74)

from which |m̃i and |mi must be proportional to one another; the constant of proportionality is at most a function of m. Both
sets are properly normalized, so that the constant of proportionality must be of magnitude one. Hence we write
|m̃i = |mi e^{−iλ_m} ; |mi = |m̃i e^{iλ_m}   (18.75)

to find λ_m we re-evaluate the action of P± on |m̃i, but now by means of its relationship with |mi. Using Eqs. (18.74, 18.75), we get
P± |m̃i = P± |mi e^{−iλ_m} = |m ± 1i p (∓i) e^{−iλ_m} = |\widetilde{m ± 1}i e^{iλ_{m±1}} p (∓i) e^{−iλ_m}
P± |m̃i = |\widetilde{m ± 1}i p (∓i) e^{i(λ_{m±1} − λ_m)}   (18.76)

comparing Eqs. (18.73, 18.76) we find
(∓i) e^{i(λ_{m±1} − λ_m)} = 1 ⇒ e^{i(λ_{m±1} − λ_m)} = ±i ⇒ e^{i(λ_{m+1} − λ_m)} = e^{iπ/2}
λ_{m+1} = λ_m + π/2   (18.77)
we choose to define λ_0 = 0, i.e. |0̃i = |0i, from which Eq. (18.77) yields
λ_1 = λ_0 + π/2 = π/2 , λ_2 = λ_1 + π/2 = 2(π/2) , . . . , λ_m = mπ/2   (18.78)
replacing (18.78) in (18.75) we have
|mi = |m̃i e^{iλ_m} = |m̃i e^{imπ/2} = |m̃i (e^{iπ/2})^m = |m̃i i^m ⇒
|mi = |m̃i i^m = |m̃i e^{imπ/2}
and using the definition of |m̃i, Eq. (18.70), we obtain
|mi = |m̃i e^{imπ/2} = ∫ (dφ/2π) |φi e^{im(φ + π/2)}   (18.79)

from this equation it is immediate to obtain the transfer matrix

hφ |mi = eim(φ+π/2) (18.80)



the inverse of Eq. (18.79) yields
|φi = Σ_m |mi hm|φi = Σ_m |mi e^{−im(φ+π/2)}   (18.81)
in particular, when φ = 0 we have
|φ = 0i ≡ |p0 i = Σ_m |mi (−i)^m   (18.82)

applying a translation T (b) = T (b, θ) on both sides of Eq. (18.79) and using Eqs. (18.67, 18.71) we get
T (b) |mi = T (b, θ) |mi = ∫ (dφ/2π) T (b) |φi e^{im(φ+π/2)} = ∫ (dφ/2π) |φi e^{−ib·p} e^{im(φ+π/2)}
= ∫ (dφ/2π) |φi e^{−i(b cos θ, b sin θ)·(p cos φ, p sin φ)} e^{im(φ+π/2)} = ∫ (dφ/2π) |φi e^{−ibp(cos φ cos θ + sin θ sin φ)} e^{im(φ+π/2)}
T (b, θ) |mi = ∫ (dφ/2π) |φi e^{im(φ+π/2) − ipb cos(θ−φ)}   (18.83)

combining (18.83, 18.80) we obtain the matrix representation of T (b) in the basis {|mi}
hm′| T (b, θ) |mi = ∫ (dφ/2π) hm′|φi e^{im(φ+π/2) − ipb cos(θ−φ)}
hm′| T (b, θ) |mi = ∫ (dφ/2π) e^{i(m−m′)(φ+π/2) − ipb cos(θ−φ)}   (18.84)
with the change of variable ψ = φ + π/2 − θ we obtain dψ = dφ and
hm′| T (b, θ) |mi = ∫ (dφ/2π) e^{i(m−m′)(φ+π/2−θ+θ) − ipb cos(θ−φ−π/2+π/2)} = ∫ (dψ/2π) e^{i(m−m′)(ψ+θ) − ipb cos(π/2−ψ)}
hm′| T (b, θ) |mi = e^{i(m−m′)θ} ∫ (dψ/2π) e^{i(m−m′)ψ − ipb sin ψ}   (18.85)

Comparing Eqs. (18.60, 18.85), we obtain
Jn (z) = ∫_0^{2π} (dψ/2π) e^{inψ − iz sin ψ}   (18.86)
which is the (well-known) integral representation of the Bessel functions of the first kind.
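As a cross-check (added here for illustration, not part of the original notes), the integral representation (18.86) can be evaluated by numerical quadrature and compared with SciPy's Bessel function:

```python
import numpy as np
from scipy.special import jv

def bessel_from_integral(n, z, N=2000):
    """Evaluate Eq. (18.86), J_n(z) = (1/2pi) int_0^{2pi} exp(i n psi - i z sin psi) dpsi,
    by the trapezoidal rule on the periodic integrand."""
    psi = np.linspace(0.0, 2 * np.pi, N, endpoint=False)
    return np.mean(np.exp(1j * n * psi - 1j * z * np.sin(psi)))

for n in (0, 1, 5, -3):
    approx = bessel_from_integral(n, 2.7)
    print(n, approx.real, jv(n, 2.7), abs(approx - jv(n, 2.7)) < 1e-12)
```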

18.7 Differential equations, recursion formulas and addition theorem of the


Bessel functions
18.7.1 Recursion formulas
Since Bessel functions are representation functions of E2 , we shall use group-theoretical techniques to explore some of their
properties. It is enough to concentrate on the translation subgroup. Defining
b± ≡ (b1 ± ib2)/2 ; b1 = b+ + b− ; b2 = i (b− − b+)   (18.87)
using Eqs. (18.87, 18.35) we have
b·P = b1 P1 + b2 P2 = (b+ + b−)(P+ + P−)/2 + i (b− − b+)(P+ − P−)/(2i)
    = (1/2) [(b+ + b−)(P+ + P−) + (b− − b+)(P+ − P−)]
b·P = P+ b− + P− b+
so that the group element T (b) can be expressed by
T (b) = exp [−ib · P] = exp [−i (b− P+ + b+ P−)]   (18.88)
differentiating this expression with respect to b± we have
∂T (b)/∂b± = exp [−i (b− P+ + b+ P−)] (−iP∓) ⇒
i ∂T (b)/∂b∓ = T (b) P±   (18.89)

now expressing b in polar coordinates we have
b ≡ (b, φ) ; b± = (b1 ± ib2)/2 = (b cos φ ± ib sin φ)/2 = b (cos φ ± i sin φ)/2 = (b/2) e^{±iφ}
‖b‖^2 = |b|^2 = 4 (b/2)^2 e^{iφ} e^{−iφ} = 4 b+ b− ; φ = (i/2) ln e^{−2iφ} = (i/2) ln [(b/2) e^{−iφ} / ((b/2) e^{iφ})] = (i/2) ln (b−/b+)
in summary
b ≡ (b, φ) ; b± = (b/2) e^{±iφ} ; b = √(4 b+ b−) ; φ = (i/2) ln (b−/b+) = −(i/2) ln (b+/b−)   (18.90)
the idea is to express the differential equation (18.89) in terms of the polar variables b, φ. To do so, we write b = b (b+ , b−) and
φ = φ (b+ , b−) by means of Eqs. (18.90) and calculate ∂b/∂b± and ∂φ/∂b± as follows
∂b/∂b± = ∂(4 b+ b−)^{1/2}/∂b± = 4 b∓ / (2 √(4 b+ b−)) = 2 (b/2) e^{∓iφ} / b = e^{∓iφ}
∂φ/∂b+ = −(i/2) ∂/∂b+ ln (b+/b−) = −(i/2)(1/b+) = −i / (2 (b/2) e^{iφ}) = −(i/b) e^{−iφ}
∂φ/∂b− = (i/2) ∂/∂b− ln (b−/b+) = (i/2)(1/b−) = i / (2 (b/2) e^{−iφ}) = (i/b) e^{iφ}
so that
∂b/∂b± = e^{∓iφ} ; ∂φ/∂b± = ∓(i/b) e^{∓iφ}   (18.91)
the derivative on the LHS of Eq. (18.89) becomes
∂T (b)/∂b± = (∂b/∂b±) ∂T (b)/∂b + (∂φ/∂b±) ∂T (b)/∂φ = e^{∓iφ} ∂T (b)/∂b ∓ (i/b) e^{∓iφ} ∂T (b)/∂φ
∂T (b)/∂b∓ = e^{±iφ} [∂/∂b ± (i/b) ∂/∂φ] T (b)   (18.92)

replacing (18.92) in Eq. (18.89), the latter becomes
i e^{±iφ} [∂/∂b ± (i/b) ∂/∂φ] T (b, φ) = T (b, φ) P±   (18.93)

this is still an “abstract” differential equation; to make it explicit we must use a basis. For this we sandwich both sides of Eq.
(18.93) between the states hm′| and |mi to obtain
i e^{±iφ} [∂/∂b ± (i/b) ∂/∂φ] hm′| T (b, φ) |mi = hm′| T (b, φ) P± |mi = hm′| T (b, φ) (∓ip) |m ± 1i
e^{±iφ} [∂/∂b ± (i/b) ∂/∂φ] hm′| T (b, φ) |mi = ∓p hm′| T (b, φ) |m ± 1i   (18.94)

where we have used Eq. (18.46). Using Eq. (18.60) in Eq. (18.94), and setting n = m − m′ and p = 1, we obtain
e^{±iφ} [∂/∂b ± (i/b) ∂/∂φ] e^{i(m−m′)φ} J_{m−m′} (pb) = ∓p e^{i(m±1−m′)φ} J_{m±1−m′} (pb)
e^{±iφ} [∂/∂b ± (i/b) ∂/∂φ] e^{inφ} Jn (b) = ∓e^{i(n±1)φ} J_{n±1} (b)
e^{±iφ} e^{inφ} (d/db) Jn (b) ± (i/b) Jn (b) (in) e^{±iφ} e^{inφ} = ∓e^{i(n±1)φ} J_{n±1} (b)
e^{±iφ} e^{inφ} (d/db) Jn (b) ∓ (n/b) Jn (b) e^{±iφ} e^{inφ} = ∓e^{±iφ} e^{inφ} J_{n±1} (b)
obtaining finally
(d/db ∓ n/b) Jn (b) = ∓J_{n±1} (b)   (18.95)

Theorem 18.6 (Recursion formulas for the Bessel functions): The Bessel functions Jn (b) satisfy the following recursion
formulas

2 (d/db) Jn (b) = Jn−1 (b) − Jn+1 (b)
(2n/b) Jn (b) = Jn−1 (b) + Jn+1 (b)

Proof : Writing Eqs. (18.95) separately, then adding and subtracting them, we have
(d/db + n/b) Jn (b) = Jn−1 (b) ; (d/db − n/b) Jn (b) = −Jn+1 (b)
(d/db + n/b) Jn (b) + (d/db − n/b) Jn (b) = Jn−1 (b) − Jn+1 (b) ⇒ 2 (d/db) Jn (b) = Jn−1 (b) − Jn+1 (b)
(d/db + n/b) Jn (b) − (d/db − n/b) Jn (b) = Jn−1 (b) + Jn+1 (b) ⇒ (2n/b) Jn (b) = Jn−1 (b) + Jn+1 (b)

QED.
In our group-theoretical context the (well-known) recurrence relations for Bessel functions have the geometrical interpre-
tation of coming from raising and lowering operators as can be seen from Eq. (18.89).
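A short numerical illustration of the two recursion formulas of Theorem 18.6 (added; it relies on SciPy's jv and jvp, the latter returning the derivative of J_n):

```python
import numpy as np
from scipy.special import jv, jvp  # J_n and its derivative J_n'

b = np.linspace(0.5, 10.0, 200)
for n in range(-3, 4):
    lhs_derivative = 2 * jvp(n, b)            # 2 J_n'(b)
    lhs_algebraic  = (2 * n / b) * jv(n, b)   # (2n/b) J_n(b)
    assert np.allclose(lhs_derivative, jv(n - 1, b) - jv(n + 1, b))
    assert np.allclose(lhs_algebraic,  jv(n - 1, b) + jv(n + 1, b))
print("recursion formulas of Theorem 18.6 verified numerically")
```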

18.7.2 Differential equation for Bessel functions


Differential equations not leading to a recurrence relation (but directly to a solution) should contain the raising and lowering
operators in a “balanced” form such as P+ P− , P+ P+ P− P− , etc. To obtain a differential equation that contains the product
operator P+ P− it is natural to differentiate Eq. (18.89) once more, as follows
∂/∂b− [i ∂T (b)/∂b+] = ∂/∂b− [T (b) P−] = [∂T (b)/∂b−] P− = −i T (b) P+ P−
∂/∂b− [∂T (b)/∂b+] = −T (b) P+ P−   (18.96)

the LHS of Eq. (18.96) can be calculated from Eqs. (18.91, 18.92)
∂/∂b− [∂T (b)/∂b+] = (∂b/∂b−) ∂/∂b [∂T (b)/∂b+] + (∂φ/∂b−) ∂/∂φ [∂T (b)/∂b+]
= e^{iφ} ∂/∂b { e^{−iφ} [∂/∂b − (i/b) ∂/∂φ] T (b) } + (i/b) e^{iφ} ∂/∂φ { e^{−iφ} [∂/∂b − (i/b) ∂/∂φ] T (b) }
= [∂^2/∂b^2 + (i/b^2) ∂/∂φ − (i/b) ∂^2/∂b∂φ] T (b)
  + (i/b) { −i [∂/∂b − (i/b) ∂/∂φ] + [∂^2/∂φ∂b − (i/b) ∂^2/∂φ^2] } T (b)
= [∂^2/∂b^2 + (i/b^2) ∂/∂φ − (i/b) ∂^2/∂b∂φ + (1/b) ∂/∂b − (i/b^2) ∂/∂φ + (i/b) ∂^2/∂φ∂b + (1/b^2) ∂^2/∂φ^2] T (b)
∂/∂b− [∂T (b)/∂b+] = [∂^2/∂b^2 + (1/b) ∂/∂b + (1/b^2) ∂^2/∂φ^2] T (b)   (18.97)
combining Eqs. (18.97, 18.96) we obtain
[∂^2/∂b^2 + (1/b) ∂/∂b + (1/b^2) ∂^2/∂φ^2] T (b) = −T (b) P+ P−
Using Eq. (18.38) we have
−[∂^2/∂b^2 + (1/b) ∂/∂b + (1/b^2) ∂^2/∂φ^2] T (b) = T (b) P^2   (18.98)

this is the “abstract” equation which becomes explicit by sandwiching it with hm′| and |mi
−[∂^2/∂b^2 + (1/b) ∂/∂b + (1/b^2) ∂^2/∂φ^2] hm′| T (b) |mi = hm′| T (b) P^2 |mi
−[∂^2/∂b^2 + (1/b) ∂/∂b + (1/b^2) ∂^2/∂φ^2] hm′| T (b) |mi = p^2 hm′| T (b) |mi   (18.99)
where we have used Eq. (18.40). It leads to the following:
Theorem 18.7 (Differential equation for Bessel functions): The Bessel functions satisfy the following differential equation

[d^2/db^2 + (1/b) d/db + (1 − n^2/b^2)] Jn (b) = 0   (18.100)

Proof : Substituting Eq. (18.60) in Eq. (18.99), and replacing p = 1, n = m − m′ we have
−[∂^2/∂b^2 + (1/b) ∂/∂b + (1/b^2) ∂^2/∂φ^2] e^{i(m−m′)φ} J_{m−m′} (pb) = p^2 e^{i(m−m′)φ} J_{m−m′} (pb)
[∂^2/∂b^2 + (1/b) ∂/∂b + (1/b^2) ∂^2/∂φ^2] e^{inφ} Jn (b) = −e^{inφ} Jn (b)
e^{inφ} (d^2/db^2) Jn (b) + e^{inφ} (1/b)(d/db) Jn (b) + (Jn (b)/b^2) (∂^2/∂φ^2) e^{inφ} = −e^{inφ} Jn (b)
e^{inφ} (d^2/db^2) Jn (b) + e^{inφ} (1/b)(d/db) Jn (b) − n^2 (Jn (b)/b^2) e^{inφ} = −e^{inφ} Jn (b)
(d^2/db^2) Jn (b) + (1/b)(d/db) Jn (b) − n^2 Jn (b)/b^2 + Jn (b) = 0
(d^2/db^2) Jn (b) + (1/b)(d/db) Jn (b) + (1 − n^2/b^2) Jn (b) = 0
getting finally
[d^2/db^2 + (1/b) d/db + (1 − n^2/b^2)] Jn (b) = 0
QED.
Expression (18.100) provides the traditional differential equation for Bessel functions.
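Again this is easy to confirm numerically (an added illustration; jvp(n, b, 2) in SciPy returns the second derivative of J_n):

```python
import numpy as np
from scipy.special import jv, jvp

b = np.linspace(0.5, 20.0, 500)
for n in range(5):
    residual = jvp(n, b, 2) + jvp(n, b, 1) / b + (1.0 - n**2 / b**2) * jv(n, b)
    assert np.max(np.abs(residual)) < 1e-12
print("Bessel differential equation (18.100) satisfied to machine precision")
```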

18.7.3 Addition theorem

Figure 18.2: Addition rule of the parameters for T (r) elements.

An addition theorem for Bessel functions can be derived from the group multiplication rule. The procedure is analogous
to the one followed to derive the addition theorem for spherical harmonics in Sec. 16.13.1. Once again, restriction to the
translation subgroup suffices for our purposes; the multiplication rule for this subgroup reads
T (r) T (r′ ) = T (r + r′ ) = T (R) (18.101)

we shall express the vectors in polar coordinates. Without any loss of generality, we can choose r ≡ (r, 0) and r′ ≡ (r′ , φ). Since
R is the vector sum of r and r′ , we see that

r ≡ (r, 0) ; r′ ≡ (r′ , φ) ⇒ R = (R, θ) , cos θ = (r + r′ cos φ)/R   (18.102)
the situation is depicted in Fig. 18.2. It leads to the following theorem

Theorem 18.8 (Addition theorem for Bessel functions): The Bessel functions satisfy

e^{inθ} Jn (R) = Σ_{k=−∞}^{∞} e^{ikφ} Jk (r′) J_{n−k} (r)   (18.103)

where the summation is over all integers, and r, r′ , θ, φ are related as shown in Fig. 18.2.

Proof : Taking the matrix element on both sides of Eq. (18.101), inserting an identity, and taking into account Eq.
(18.102), we get

Σ_{k=−∞}^{∞} hm′| T (r) |ki hk| T (r′) |mi = hm′| T (R) |mi
Σ_{k=−∞}^{∞} hm′| T (r, 0) |ki hk| T (r′, φ) |mi = hm′| T (R, θ) |mi   (18.104)

and using Eq. (18.60) in Eq. (18.104), we find

e^{i(m−m′)θ} J_{m−m′} (R) = Σ_{k=−∞}^{∞} e^{i(k−m′)·0} J_{k−m′} (r) e^{i(m−k)φ} J_{m−k} (r′) = Σ_{k=−∞}^{∞} J_{k−m′} (r) e^{i(m−k)φ} J_{m−k} (r′)

with the substitutions m − k = p and m − m′ = n, we obtain

e^{inθ} Jn (R) = Σ_{p=−∞}^{∞} J_{n−p} (r) e^{ipφ} Jp (r′)
which is Eq. (18.103). QED.
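A numerical spot-check of Theorem 18.8 (added; the geometry of Fig. 18.2 is reproduced by ordinary vector addition, and the truncation of the sum is an assumption justified by the rapid decay of J_k):

```python
import numpy as np
from scipy.special import jv

rng = np.random.default_rng(2)
r, rp, phi = 1.3, 2.1, rng.uniform(0, 2 * np.pi)
# Vector sum of r = (r, 0) and r' = (r', phi), cf. Eq. (18.102)
Rvec = np.array([r + rp * np.cos(phi), rp * np.sin(phi)])
R, theta = np.linalg.norm(Rvec), np.arctan2(Rvec[1], Rvec[0])

kmax = 60   # J_k(r') is negligible for |k| >> r', so the series can be truncated
k = np.arange(-kmax, kmax + 1)
for n in (-2, 0, 1, 4):
    rhs = np.sum(np.exp(1j * k * phi) * jv(k, rp) * jv(n - k, r))
    lhs = np.exp(1j * n * theta) * jv(n, R)
    assert np.isclose(lhs, rhs, atol=1e-12)
print("addition theorem (18.103) verified numerically")
```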

18.7.4 Summary
Note that all the properties of Bessel functions derived here (recurrence relations, the associated differential equation, and the
addition theorem) depended on the subgroup of translations only. This is because the raising and lowering operators play the crucial
role in generating such properties, and we have no raising and lowering operators associated with the rotations8 . It is worth remarking
that in this group-theoretical context the Bessel functions themselves (representation functions of E2 ) were encountered first and
their associated differential equation was obtained later, opposite to the order in the usual texts of Classical Analysis or Mathematical Physics.

18.8 Method of group contraction: SO (3) and E2


We shall consider an interesting relation between SO (3) and E2 that permits us to induce representations of E2 from those of
SO (3). As groups of transformations, SO (3) is the symmetry group of the surface of a sphere, while E2 is the symmetry group
of a plane9 . The stereographic projection provides a one-to-one correspondence between transformations of SO (3) (points on
the surface of the sphere) and transformations of E2 (points on the plane).
Let us put the plane tangent to the sphere touching its “north pole”, and set up coordinates as indicated in Fig. 18.3. This
figure shows that rotations around the X−axis (Y −axis) are associated by the stereographic projection with a translation in the
−ey (+ex ) direction on the plane (and vice versa). Further, rotations around the Z−axis maintain the same interpretation
in both cases. We express these assertions in the form

e−iθx Jx ↔ eiby Py ; e−iθy Jy ↔ e−ibx Px ; e−iθz Jz ↔ e−iθz Jz (18.105)


8 It is because there is only one generator J of the rotation group SO (2), and the definition of raising and lowering operators requires at least two generators.
9 We should not confuse this space of transformations with the manifold (space of parameters) of the group. For instance, the manifold of SO (3)
can be characterized by a “solid” sphere with radius π, while its space of transformations can be characterized by the “surface” of a sphere with
arbitrary radius.

Figure 18.3: Stereographic projection of the space of transformations of SO (3) (the sphere) onto the space of transformations
of E2 (the plane). We also show a (positive) rotation around the X−axis using a clockwise (left-handed) convention, and that
such a rotation around the X−axis corresponds to a translation of magnitude by in the direction −ey .

If sx is the length of arc subtended by the angle θx in Fig. 18.3, and R is the radius of the sphere, we see that

sx = Rθx ≈ by

this approximation becomes exact when we take the limit of R going to infinity. Therefore, when R → ∞ we see that
θx = by /R ; θy = bx /R   when R → ∞   (18.106)
substituting (18.106) in the correspondences (18.105), we can write them as

e−iθx Jx = e−iby Jx /R ↔ eiby Py ; e−iθy Jy = e−ibx Jy /R ↔ e−ibx Px ; e−iθz Jz ↔ e−iθz Jz (18.107)

we can interpret it as a mapping among the generators
−Jx /R ↔ Py , Jy /R ↔ Px , Jz ↔ Jz ; when R → ∞   (18.108)
R R
let us see whether the Lie algebra preserves such a correspondence
 
Jy Jx 1 iJz
[Px , Py ] ↔ , − = 2 [Jx , Jy ] = 2 → 0
R R R R
 
Jy 1 iJx
[Px , Jz ] ↔ , Jz = [Jy , Jz ] = ↔ −iPy
R R R
 
Jx 1 iJy
[Py , Jz ] ↔ − , Jz = − [Jx , Jz ] = ↔ iPx
R R R

so that the correspondence given by Eqs. (18.107, 18.108) leads to the correct Lie algebra of E2 . We express this in the form
of a theorem

Theorem 18.9 (Contraction of SO (3) to E2 ): Let R be the radius of the sphere whose surface represents the space of
transformations of the SO (3) group. In the limit of infinite R, the group SO (3) contracts to the group E2 according to the
correspondence given by Eqs. (18.108).

It is also useful to obtain the relationship between the raising and lowering operators of SO (3) and the raising and lowering operators
of E2 when the contraction limit is taken. From Eqs. (18.108) we see that
J± /R = Jx /R ± i Jy /R ↔ −Py ± iPx = ±i (Px ± iPy ) = ±iP± ⇒
J± /R ↔ ±iP±   (18.109)

18.8.1 Relation between the irreducible representations of SO (3) and E2


We shall derive the irreducible representations of E2 in the basis {|p mi} from the representations of the rotation group SO (3).
We start by observing that the matrix elements of Jz should not change. As for the raising and lowering operators J± of SO (3),
we have from Eq. (15.81), page 270,
hm′| J± |mi = δ^{m′}_{m±1} √(j (j + 1) − m (m ± 1))   (18.110)
now, using the correspondence (18.109), this equation translates into
±iR hm′| P± |mi = δ^{m′}_{m±1} √(j (j + 1) − m (m ± 1))
hm′| P± |mi = ∓(i/R) δ^{m′}_{m±1} √(j (j + 1) − m (m ± 1))   (18.111)
so that we have induced the matrix elements of the raising and lowering operators P± of E2 , from the matrix elements of the
raising and lowering operators of SO (3). We see that in order to keep the RHS of Eq. (18.111) non-vanishing as R increases
without limit, we should use values of j proportional to R.
Theorem 18.10 (Irreducible representations of E2 by contraction from SO (3)): Non-trivial irreducible representations of E2
can be obtained from those of SO (3) if j = pR as R → ∞.
Proof : If j = pR when R → ∞, then from Eq. (18.111) we see that
hm′| P± |mi = ∓i δ^{m′}_{m±1} lim_{R→∞} (1/R) √(pR (pR + 1) − m (m ± 1)) = ∓i δ^{m′}_{m±1} lim_{R→∞} √[(pR (pR + 1) − m (m ± 1)) / R^2]
hm′| P± |mi = ∓ip δ^{m′}_{m±1}
on the other hand
hm′| P± P∓ |mi = hm′| P^2 |mi = p^2 δ^{m′}_{m}
where we have used Eqs. (18.38, 18.40). We have re-derived the irreducible representations described in Sec. 18.4. QED.
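The contraction limit can also be visualized numerically (an added sketch; it simply evaluates Eq. (18.111) with j = pR for increasing R):

```python
import numpy as np

def P_plus_element(m, p, R):
    """|<m+1| P_+ |m>| induced from SO(3) via Eq. (18.111) with j = p R."""
    j = p * R
    return np.sqrt(j * (j + 1) - m * (m + 1)) / R

p, m = 2.0, 5
for R in (10, 100, 1000, 10000):
    print(R, P_plus_element(m, p, R))   # tends to p = 2.0 as R grows
# The limit reproduces |<m+1| P_+ |m>| = p, i.e. the matrix elements of Eq. (18.48).
```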

18.8.2 Relation between representation functions of SO (3) and E2


Theorem 18.10 permits us to express Bessel functions as limits of the d(j) −functions defined in Eq. (15.85) page 271. First of
all, by using Eq. (18.60), the Bessel functions can be expressed in a simple way as a matrix element of a translation bx along
the X−axis
Jn (pbx ) = ei[0−(−n)]·0 J0−(−n) (pbx ) = h−n| T (pbx , 0) |0i
Jn (pbx ) = h−n| T (pbx , 0) |0i (18.112)
Now, consider a rotation around the Y −axis as R → ∞; according to Eq. (18.106) and theorem 18.10 we see that
θy = bx /R = pbx /(pR) = pbx /j   when R → ∞
The representation function for this rotation is
d(j) (θy )^{m′}_{m} = d(j) (bx /R)^{m′}_{m} = d(j) (pbx /j)^{m′}_{m}   when R → ∞
according to the discussion at the beginning of Sec. 18.8, in the limit j → ∞ (which is equivalent to R → ∞), these
functions (matrix elements) should converge to the representation functions (matrix elements) of a translation along the
(positive) X−direction by the displacement bx (i.e. b = (bx , 0)); therefore
lim_{j→∞} d(j) (pbx /j)^{m′}_{m} = hm′| T (pbx , 0) |mi   (18.113)

combining Eqs. (18.112, 18.113) with p = 1, m′ = −n and m = 0 we obtain

Jn (bx ) = h−n| T (bx , 0) |0i_{p=1} = lim_{j→∞} d(j) (bx /j)^{−n}_{0}   (18.114)

which gives the Bessel functions (representation functions associated with E2 ) in terms of an appropriate limit (contraction)
of the d(j) −functions (representation functions associated with SO (3)).
Chapter 19

General Treatment of continuous groups

19.1 The notion of continuity in a group


We can define a criterion of nearness by means of a metric over Rn (or more generally over a manifold). Let us denote the elements
of the group by U (a1 , . . . , ar ), where a1 , . . . , ar is a set of continuous parameters defining a manifold. The criterion of nearness
should be such that small changes in the parameters produce small changes in the element, i.e. U (a1 + δa1 , . . . , ar + δar ) =
U (a1 , . . . , ar ) + δU. We also require that a small change in a factor of a product leads to a small change in the product.
This gives a notion of continuity for the elements of the group. The most important continuous groups in Physics are the Lie
groups.
In a continuous group we should first of all demand the usual group requirements. We shall denote the elements of the
group as R (a), where a denotes the whole set of continuous parameters, i.e. a ≡ {a1 , ..., ar } in the manifold, and we define the
identity as R (a0) ≡ R (0). Those elements should satisfy the following properties
1) R (0) R (a) = R (a) R (0) = R (a) , ∀a in the manifold
2) ∀a in the manifold ∃ā such that R (ā) = [R (a)]−1
3) R (a) R (b) = R (c) with R (c) ∈ G, and c belongs to the manifold.
4) R (a) [R (b) R (c)] = [R (a) R (b)] R (c) , ∀a, b, c in the manifold
In property 3), c is a function of a and b. Remembering that each of these symbols stands for r parameters, we have

ck = φk (a1 , ..., ar ; b1 , ..., br ) ; k = 1, ..., r

or in short notation
c = φ (a; b) (19.1)
The reader should be careful not to confuse the law of composition in the manifold, Eq. (19.1), with the law of composition
of the group, R (a) R (b) = R (c). To obtain the law of composition in the group, we need the law of composition in
the manifold c = φ (a; b) plus the mapping that assigns a group element to each point of the manifold, a → R (a).

19.2 Noether theorem


The Noether theorem says that any invariance of the action under a continuous symmetry leads to a conserved current. However,
besides continuity, the symmetry must also satisfy a condition of connectedness: every element that describes a symmetry
transformation must be reachable continuously from the identity by infinitesimal transformations.
Therefore, groups of continuous transformations that can be connected with the identity by infinitesimal transformations are
special for physics. This leads us to consider groups with those properties.

19.3 Lie Groups


For a continuous group to be a Lie group we demand the following additional requirements
1) φ must be analytic, i.e. it possesses continuous derivatives of all orders in a and b.
2) ā must be an analytic function of a, i.e. ā = f (a) with f analytic in a.
With these additional requirements we say that the set of elements U (a1 , ..., ar ) is an r−parameter Lie group.
These requirements guarantee that we can go from the identity to any other element of the group by successive infinitesimal
continuous transformations. The statement of the Noether theorem then says that conserved currents appear when the group of
symmetries of the action is a Lie group.


Lie groups in Physics are groups of transformations in which we must specify not only the parameters that define the
point in the manifold but also the coordinates of the element of the vector space on which the operators act. Then, an
r−parameter Lie group of transformations on an n−dimensional vector space V is defined as

x′i = Fi (x1 , ..., xn ; a1 , ..., ar ) ; i = 1, ..., n

or, in short,
x′ = F (x; a)
with x, x′ ∈ V, where the Fi are analytic functions of a. This is an r−parameter Lie group of transformations acting on an
n−dimensional vector space1 . The requirements of analyticity and of the basic group properties impose strong restrictions on the
functions Fi
F (F (x; a) ; b) = F (x; φ (a; b)) ∀a, b ∈ M (G) , ∀x ∈ V

Example 19.1 x′ = ax , a ≠ 0: a one-parameter abelian group. a0 = 1, ā = 1/a, c = ba. All requirements can be checked.

Example 19.2 x′ = a1 x + a2 , a1 ≠ 0; a0 ≡ (a1 = 1, a2 = 0) ; ā ≡ (1/a1 , −a2 /a1 ) ;
c = ab = (b1 a1 , b2 + b1 a2 ). This is a two-parameter non-abelian group.
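A tiny sketch (added for illustration) checking the group axioms of Example 19.2 numerically; note that with the composition formula written in the example, the composite parameters act on x as "first a, then b", which is the convention verified below.

```python
import numpy as np

def act(a, x):
    """Transformation of Example 19.2: x' = a1 x + a2."""
    return a[0] * x + a[1]

def compose(a, b):
    """Parameter composition c = (b1 a1, b2 + b1 a2) as written in the example."""
    return np.array([b[0] * a[0], b[1] + b[0] * a[1]])

def inverse(a):
    """Inverse parameters: (1/a1, -a2/a1)."""
    return np.array([1.0 / a[0], -a[1] / a[0]])

rng = np.random.default_rng(3)
a, b = rng.normal(size=2), rng.normal(size=2)
a[0], b[0] = a[0] + 2.0, b[0] + 2.0   # keep a1, b1 away from zero
x, e = rng.normal(), np.array([1.0, 0.0])   # e = identity parameters a0

assert np.isclose(act(compose(a, b), x), act(b, act(a, x)))   # with this formula, a acts first
assert np.isclose(act(compose(a, inverse(a)), x), x)
assert np.isclose(act(e, x), x)
print("Example 19.2 satisfies the group axioms")
```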

For a group to be an r−parameter Lie group, it is necessary that all parameters be essential, i.e. it is not possible to construct
a smaller set of parameters yielding a group isomorphic to the original one.

Example 19.3 x′ = x + a + b; a and b are not essential, for we can define a single parameter c = a + b and obtain a group
isomorphic to the original one.

Theorem 19.1 A one-parameter continuous group is equivalent to a group of translations and must be abelian, whenever every
element of the group can be reached continuously from the identity.

Example 19.4 There are some mixed continuous groups that, besides the continuous parameters, require a discrete label to
characterize all the elements, e.g. G : x′ = ±x + a. The group manifold consists of two disconnected sets. The subset
H1 : x′ = x + a can be reached continuously from the identity, while H2 : x′ = −x + a cannot. Observe that with the discrete
symmetry I : x′ = −x, we see that H2 = H1 I, and we can write G = H1 + H1 I. H1 is an invariant subgroup and the quotient
group G/H1 is the group of order two.
Theorem 19.2 Lie’s theorem: given a non-mixing unitary Lie group {Û (α1 , ..., αr ; x)}, we can express each element of the
group in terms of hermitian operators L̂k in the following way

Û (α1 , ..., αr ; x1 , ..., xn ) ~r = exp [−i Σ_{k=1}^{r} αk L̂k ] ~r

where
−i L̂k = ∂Û (α1 , ..., αr ; x) / ∂αk |_{α=0}   (19.2)

and
[L̂k , L̂m ] = c^j_{km} L̂j   (19.3)

with
c^j_{km} = −c^j_{mk} ; c^m_{ij} c^n_{mk} + c^m_{jk} c^n_{mi} + c^m_{ki} c^n_{mj} = 0   (19.4)

Clearly, the set of linear combinations of the hermitian operators L̂k forms an algebra under the commutator as the law of
combination; it is called a Lie algebra. Conversely, if we have a Lie algebra, defined by Eqs. (19.3, 19.4), we can induce a Lie group.
Observe that these requirements make it necessary to have a unitary vector space. It is only by defining an inner product
that we can say that an element of the group is unitary and that the operators L̂k are hermitian. In addition, in Eq. (19.2) the
derivatives are evaluated at α = 0, in the sense that Û (0, ..., 0; x) must be the identity (a convenient parametrization achieving this
is always possible).
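As a concrete illustration of Lie's theorem (added here; the SO(2) example and the explicit matrix conventions are assumptions of the sketch, not taken from the notes), exponentiating the single hermitian generator of SO(2) reproduces the rotation matrices and the one-parameter group law:

```python
import numpy as np
from scipy.linalg import expm

# Hermitian generator of SO(2) in the defining two-dimensional representation
L = np.array([[0, -1j], [1j, 0]])

def U(alpha):
    """Group element obtained from the generator, U(alpha) = exp(-i alpha L)."""
    return expm(-1j * alpha * L)

alpha, beta = 0.4, 1.1
R = lambda t: np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
print(np.allclose(U(alpha), R(alpha)))                   # exp(-i alpha L) is a rotation by alpha
print(np.allclose(U(alpha) @ U(beta), U(alpha + beta)))  # one-parameter (abelian) group law
print(np.allclose(L, L.conj().T))                        # L is hermitian, so U is unitary
```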

Definition 19.1 The rank of a Lie group is the largest number of generators that commute with each other.
1 It is important not to confuse the vector space on which the transformations act (described by the coordinates xi ), with the space of points
(manifold) described by the parameters ak .

Example 19.5 The translation group in three dimensions has generators p̂k = −i∂/∂xk ; all of them commute with each other,
so the rank is 3.

Example 19.6 For SO (3) the generators are the angular momentum operators. No two distinct generators commute with each
other, so the rank is 1.
Definition 19.2 If we can find a subset of generators {Û1 , ..., Ûn } such that

[Ĝi , Ûk ] = a_{ikl} Ûl   ∀ Ĝi ∈ G

we say that the subset generates an ideal subalgebra, and it generates an invariant subgroup of G2 .

Definition 19.3 A Lie algebra is simple if it does not possess any ideal subalgebra different from the identity alone. The group
generated by the algebra is also simple, since it will not contain any non-trivial invariant subgroup.

Definition 19.4 A Lie algebra is semisimple if it does not possess any abelian ideal subalgebra (different from the trivial ones). The
corresponding group is semisimple since it does not contain any non-trivial abelian invariant subgroup.

Theorem 19.3 Theorem of Racah: For any semi-simple group of rank k, there exists a set of k Casimir operators, i.e. operators
that are functions of the generators L̂i , of the form Cλ (L̂1 , ..., L̂n ), λ = 1, ..., k, such that these Casimir operators commute
with every operator of the Lie group and among themselves. The eigenvalues of the Casimir operators uniquely characterize the
multiplets of the group.

2 Observe that the criterion for an ideal subalgebra is more restrictive than that of a mere subalgebra, i.e. [Ûi , Ûk ] = a_{ikl} Ûl .
Appendix A

Definition and properties of angular


momentum

A.1 Definition of angular momentum


Definition A.1 A set of observables J = (J1 , J2 , J3 ) are called angular momentum operators if they obey the commutation
rules given by
[Ji , Jj ] = iεijk Jk (A.1)
Defining the operator
J^2 = J1^2 + J2^2 + J3^2
we see that J^2 is also hermitian. It is worth emphasizing that the character of observable (i.e. a hermitian operator with a
complete set of eigenvectors in the space in which it is defined) is essential in the definition of angular momentum1 . It is
straightforward to calculate the commutator of the operators J^2 and J
[J^2 , J1 ] = [J1^2 + J2^2 + J3^2 , J1 ] = [J2^2 , J1 ] + [J3^2 , J1 ] = J2 [J2 , J1 ] + [J2 , J1 ] J2 + J3 [J3 , J1 ] + [J3 , J1 ] J3
= −iJ2 J3 − iJ3 J2 + iJ3 J2 + iJ2 J3
[J^2 , J1 ] = 0
and similarly with the other components, such that
[J^2 , J] = 0   (A.2)
all the theory of angular momentum can be derived from the commutation rules (A.1), and hence from its Lie algebra
structure. Now, since J^2 and Ji commute and are observables, we can find a common basis of eigenvectors of J^2 and one of
the Ji ’s. It is a usual convention to choose J3 . Therefore, we shall diagonalize simultaneously the operators J^2 and J3 .

A.2 Algebraic properties of the angular momentum


In this section, we study in detail the structure of the eigenvalues of the operators J2 and J3 as well as the structure of their
common eigenvectors. We start by defining the following operators
J+ ≡ J1 + iJ2 ; J− ≡ J1 − iJ2   (A.3)
J1 = (J+ + J−)/2 ; J2 = (J+ − J−)/(2i)   (A.4)
the operators J± are not hermitian; they are the hermitian conjugates of each other. In our study of angular momentum theory
we shall work with the operators J^2 , J3 , J+ , J− , so we shall require the commutation relations between them.

A.2.1 Algebra of the operators J2 , J3 , J+ , J−


Using Eqs. (A.1, A.2, A.3) we can find the required commutation relations
[J3 , J± ] = [J3 , J1 ± iJ2 ] = [J3 , J1 ] ± i [J3 , J2 ] = iJ2 ± i (−iJ1 ) = ± (J1 ± iJ2 )
[J3 , J+ ] = J+ ; [J3 , J− ] = −J−
1 For a set of three operators, the character of observable can only be verified when we know the vector space on which the operators act.
The commutation rules do not specify the space in which the operators are defined. Finally, hermitian operators are automatically observables in
finite-dimensional vector spaces, but this is not the case when we work in infinite-dimensional spaces.


[J+ , J− ] = [J1 + iJ2 , J1 − iJ2 ] = [J1 , J1 − iJ2 ] + i [J2 , J1 − iJ2 ]


= [J1 , J1 ] − i [J1 , J2 ] + i [J2 , J1 ] + [J2 , J2 ] = 2i [J2 , J1 ] = 2i (−iJ3 )
[J+ , J− ] = 2J3 (A.5)

[J^2 , J± ] = [J^2 , J1 ± iJ2 ] = [J^2 , J1 ] ± i [J^2 , J2 ]
[J^2 , J± ] = 0

the following product is also useful

J+ J− = (J1 + iJ2 ) (J1 − iJ2 ) = J1^2 + J2^2 + iJ2 J1 − iJ1 J2
= J1^2 + J2^2 + J3^2 − J3^2 + i [J2 , J1 ] = J^2 − J3^2 + i (−iJ3 )
J+ J− = J^2 − J3^2 + J3   (A.6)

now the product J− J+ can be obtained either explicitly or using the Eqs. (A.5, A.6)

J− J+ = J+ J− − [J+ , J− ] = J^2 − J3^2 + J3 − 2J3
J− J+ = J^2 − J3^2 − J3

Summarizing, we have the following definitions

J ≡ (J1 , J2 , J3 ) ; J^2 ≡ J1^2 + J2^2 + J3^2   (A.7)
J+ ≡ J1 + iJ2 ; J− ≡ J1 − iJ2   (A.8)

where the Ji ’s are observables with the following algebraic properties

[Ji , Jj ] = iεijk Jk ; [J^2 , J] = 0   (A.9)
[J3 , J+ ] = J+ ; [J3 , J− ] = −J−   (A.10)
[J+ , J− ] = 2J3 ; [J^2 , J± ] = 0   (A.11)
J+ J− = J^2 − J3^2 + J3 ; J− J+ = J^2 − J3^2 − J3   (A.12)

A.3 Structure of the eigenvalues and eigenvectors


Since J^2 is the sum of the squares of three hermitian operators, such an operator is positive
hψ| J^2 |ψi = hψ| J1^2 |ψi + hψ| J2^2 |ψi + hψ| J3^2 |ψi = hψ| J1† J1 |ψi + hψ| J2† J2 |ψi + hψ| J3† J3 |ψi
= ‖J1 |ψi‖^2 + ‖J2 |ψi‖^2 + ‖J3 |ψi‖^2 ≥ 0

In particular, choosing |ψi as an eigenvector of J^2 we see that
hψ| J^2 |ψi = hψ| a |ψi = a hψ|ψi = a ‖|ψi‖^2 ≥ 0 ⇒ a ≥ 0

so that the eigenvalues are non-negative. On the other hand, it is easy to show that for all a ≥ 0, the equation

j (j + 1) = a (A.13)

has one and only one non-negative root2 . Therefore, the specification of a determines j completely and vice versa. Hence,
without any loss of generality we can denote the eigenvalues of J2 in the form

J2 |ψi = j (j + 1) |ψi ; j ≥ 0

Let us consider that {|ψi} is the basis of eigenvectors common to J2 and J3 . We shall denote the eigenvalues of J3 in the form

J3 |ψi = m |ψi

where m is a dimensionless quantity. In summary, we shall write the eigenvalue equations in the form

J2 |j, mi = j (j + 1) |j, mi ; J3 |j, mi = m |j, mi (A.14)


2 Eq. (A.13) has the solutions j± = (−1 ± √(1 + 4a))/2. If a ≥ 0, the only non-negative solution for j is j+ .

A.3.1 General features of the eigenvalues of J2 and J3


Let us assume that the eigenvectors are normalized and that J2 and J3 are observables. We start by characterizing the
eigenvectors J+ |j, mi and J− |j, mi

‖J+ |j, mi‖^2 = hj, m| J− J+ |j, mi ≥ 0   (A.15)
‖J− |j, mi‖^2 = hj, m| J+ J− |j, mi ≥ 0   (A.16)

using Eqs. (A.12, A.14) we obtain

‖J± |j, mi‖^2 = hj, m| (J^2 − J3^2 ∓ J3 ) |j, mi
             = hj, m| (j (j + 1) − m^2 ∓ m) |j, mi
             = j (j + 1) − m^2 ∓ m
‖J± |j, mi‖^2 = j (j + 1) − m (m ± 1)   (A.17)

substituting (A.17) in (A.15, A.16) we have that

j (j + 1) − m (m + 1) = (j − m) (j + m + 1) ≥ 0 (A.18)
j (j + 1) − m (m − 1) = (j − m + 1) (j + m) ≥ 0 (A.19)

let us assume that j − m < 0, since j ≥ 0 then m > 0 and j + m + 1 > 0. Therefore, (j − m) (j + m + 1) < 0, contradicting
Eq. (A.18). Hence, we should reject the hypothesis that j − m < 0.
It is then necessary that j − m ≥ 0, from this hypothesis we obtain that j − m + 1 > 0, and in order to satisfy Eq. (A.19) we
require that (j + m) ≥ 0, we then have that the conditions

j−m≥0 and j+m≥0 (A.20)

satisfy Eq. (A.19) by construction. We then have to check whether these conditions also satisfy the inequality (A.18). Using
the second condition j + m ≥ 0 we see that it implies j + m + 1 > 0, this together with the first condition in (A.20) satisfy
Eq. (A.18). We see then that the conditions (A.20) are necessary and sufficient to satisfy the inequalities (A.18) and (A.19).
Finally, and taking into account that j is non-negative, these conditions can be rewritten as

j−m ≥ 0 and j + m ≥ 0 ⇔ j ≥ m and j ≥ −m


⇔ j ≥ |m| ⇔ −j ≤ m ≤ j

from which the following lemma holds

Lemma A.1 If j (j + 1) and m are eigenvalues of J2 and J3 associated to the common eigenvector |j, mi then j and m satisfy
the inequality
−j ≤ m ≤ j (A.21)

Now, based on Eq. (A.21), we shall study the features of the vectors J− |j, mi and J+ |j, mi, where |j, mi is an eigenvector
common to J^2 and J3 . First of all, we shall seek the necessary and sufficient conditions for the vector J− |j, mi to be null.
This can be carried out from Eq. (A.17)

J− |j, mi = 0 ⇔ kJ− |j, mik2 = 0 ⇔ j (j + 1) − m (m − 1) = 0


⇔ (j − m + 1) (j + m) = 0

whose solutions are m = −j (its minimum value) and m = j + 1. But the second solution contradicts lemma A.1 Eq. (A.21).
Therefore
m = −j ⇔ J− |j, mi = 0 (A.22)
so that if m > −j the vector J− |j, mi will be non-null as long as Eq. (A.21) is satisfied. This can be verified by replacing
m > −j in Eq. (A.17), verifying that the norm of J− |j, mi is non-null. Now we shall prove that J− |j, mi is an eigenvector of
J^2 and J3 . Since J^2 and J− commute, according to Eq. (A.11) we can write

[J^2 , J− ] |j, mi = 0 ⇒ J^2 J− |j, mi = J− J^2 |j, mi ⇒ J^2 J− |j, mi = J− j (j + 1) |j, mi
⇒ J^2 [J− |j, mi] = j (j + 1) [J− |j, mi]

therefore J− |j, mi is an eigenvector of J^2 with eigenvalue j (j + 1). This result is related to the fact that J^2 and J− commute,
as can be seen in theorem 3.17, page 51. Now we shall see that J− |j, mi is also an eigenvector of J3 , for which we use Eq.
(A.10)

[J3 , J− ] |j, mi = −J− |j, mi ⇒ J3 J− |j, mi = (J− J3 − J− ) |j, mi ⇒


J3 J− |j, mi = (J− m − J− ) |j, mi
⇒ J3 [J− |j, mi] = (m − 1) [J− |j, mi]

such that J− |j, mi is an eigenvector of J3 with eigenvalue (m − 1). The results above can be condensed in the following lemma

Lemma A.2 Let |j, mi be an eigenvector common to J2 and J3 with eigenvalues j (j + 1) and m. We have that (a) m = −j
if and only if J− |j, mi = 0. (b) If m > −j then J− |j, mi 6= 0 and it is an eigenvector of J2 and J3 with eigenvalues j (j + 1)
and (m − 1).

The next natural step is to study the vector J+ |j, mi. From Eq. (A.17) we can see the necessary and sufficient conditions
for J+ |j, mi to be null.
J+ |j, mi = 0 ⇔ ‖J+ |j, mi‖^2 = 0 ⇔ j (j + 1) − m (m + 1) = 0
⇔ (j + m + 1) (j − m) = 0

whose solutions are m = j and m = − (j + 1) but the second solution is incompatible with lemma A.1 Eq. (A.21). Therefore

m=j ⇔ J+ |j, mi = 0 (A.23)

if m < j, then using (A.11, A.10) we obtain

[J^2 , J+ ] |j, mi = 0 ⇒ J^2 J+ |j, mi = J+ J^2 |j, mi ⇒
J^2 [J+ |j, mi] = j (j + 1) [J+ |j, mi]

[J3 , J+ ] |j, mi = J+ |j, mi ⇒ J3 J+ |j, mi = J+ J3 |j, mi + J+ |j, mi
J3 J+ |j, mi = m J+ |j, mi + J+ |j, mi
J3 [J+ |j, mi] = (m + 1) [J+ |j, mi]

so that J+ |j, mi is an eigenvector of J2 and of J3 with eigenvalues j (j + 1) and (m + 1). We have then the following lemma

Lemma A.3 Let |j, mi be an eigenvector common to J2 and J3 with eigenvalues j (j + 1) and m. We have that (a) m = j if
and only if J+ |j, mi = 0. (b) If m < j then J+ |j, mi 6= 0 and it is an eigenvector of J2 and J3 with eigenvalues j (j + 1) and
(m + 1).

We shall see that these lemmas permit us to find the spectrum of J^2 and J3 .

A.3.2 Determination of the eigenvalues of J2 and J3


Let us assume that |j, mi is an eigenvector of J2 and J3 with eigenvalues j (j + 1) and m. Our lemma A.1 says that

−j ≤ m ≤ j

since the vector is fixed, the values of j and m are fixed. It is clear that there exists a non-negative integer p, such that

−j ≤ m − p < −j + 1 (A.24)

let us form a sequence of vectors

{|j, mi , J− |j, mi , (J−)^2 |j, mi , . . . , (J−)^p |j, mi}   (A.25)

we shall prove that these are non-null eigenvectors of J^2 and J3 and that for powers greater than (J−)^p we obtain null vectors.
This is carried out by means of successive applications of lemma A.2.
We start by applying lemma A.2 to |j, mi. By hypothesis, |j, mi is a non-null eigenvector of J^2 and J3 with eigenvalues
j (j + 1) and m. If m > −j we can apply lemma A.2, from which J− |j, mi ≡ |j, m − 1i is a non-null eigenvector of J^2 and
J3 with eigenvalues j (j + 1) and (m − 1). If m − 1 > −j we can apply such a lemma again, and J− |j, m − 1i = (J−)^2 |j, mi ≡
|j, m − 2i is a non-null eigenvector of J^2 and J3 with eigenvalues j (j + 1) and (m − 2). In general, if m − (n − 1) > −j then

J− [(J−)^{n−1} |j, mi] = J− |j, m − (n − 1)i = (J−)^n |j, mi ≡ |j, m − ni is a non-null eigenvector of J^2 and J3 with eigenvalues
j (j + 1) and (m − n).
We shall see now that these conditions are satisfied only for n = 0, 1, . . . , p. If we assume that 0 ≤ n ≤ p then
m − (n − 1) = m − n + 1 ≥ m − p + 1 ≥ −j + 1
in the last step we have used the first inequality in expression (A.24). Consequently
m − (n − 1) ≥ −j + 1 > −j
such that the condition m − (n − 1) > −j necessary to apply lemma A.2 is satisfied when n = 0, 1, . . . , p.
Now let us see what happens with the vector (J−)^{p+1} |j, mi = J− [(J−)^p |j, mi]. Since (J−)^p |j, mi is an eigenvector of J^2
and J3 with eigenvalues j (j + 1) and (m − p), lemma A.1 Eq. (A.21) says that (m − p) ≥ −j.
Let us assume for a moment that
(m − p) > −j
p
an additional application of lemma A.2 says that J− [(J− ) |j, mi] is a non-null eigenvector of J2 and J3 with eigenvalues j (j + 1)
and (m − p − 1). Now, applying the second inequality of expression (A.24), we have that
m − p − 1 < −j
which contradicts lemma A.1 Eq. (A.21). Hence, we should reject the hypothesis m − p > −j. Then we are left with the
condition m − p = −j, and after applying lemma A.2 we obtain
(J−)^{p+1} |j, mi = J− |j, m − pi = 0
and all greater powers vanish as well. This vanishing avoids a conflict with lemma A.1.
From the discussion above, we deduce that there exists a non-negative integer p such that
m − p = −j   (A.26)
By a similar reasoning, there exists a non-negative integer q such that
j − 1 < m + q ≤ j
and it can be proved that for this non-negative integer q, the sequence

{|j, mi , J+ |j, mi , (J+)^2 |j, mi , . . . , (J+)^q |j, mi}   (A.27)

consists of non-null vectors, but greater powers of J+ produce null vectors, with which we avoid a contradiction with lemma
A.1. This implies in turn that there exists a non-negative integer q such that
m + q = j   (A.28)
We see that both operators J+ and J− have a limited sequence of powers that generate non-null vectors. This is in turn related
to the fact that J+ (J−) is an operator that increases (decreases) the value of m leaving j unaltered. But for a given j, the
quantity m has an upper and a lower limit, so we have limits for both the increment and the decrement.
Combining Eqs. (A.26, A.28) we have that
p + q = 2j ⇒ j = (p + q)/2
but p + q is a non-negative integer. Therefore, j can only acquire non-negative integer values or positive half-odd-integer values
j = 0, 1/2, 1, 3/2, 2, 5/2, . . .
Indeed, each value of j will correspond to an irreducible representation for either the Lie algebra or the Lie group SO (3). In
addition, if a non-null eigenvector |j, mi of J2 and J3 exists, the sequences (A.25, A.27) consist of non-null eigenvectors of J2
and J3 , where j (j + 1) is the eigenvalue of J2 while the eigenvalues of J3 are given by
m = −j, (−j + 1) , (−j + 2) , . . . , (j − 2) , (j − 1) , j (A.29)
it means that there are 2j + 1 possible values of m for a given j. Since these values are obtained from the sequences previously
studied, all the 2j + 1 possible values of m under the restriction (A.21) are accessible eigenvalues for a given j. In addition,
since p and q are non-negative integers, Eqs. (A.26, A.28) say that m is integer (half-odd-integer) if and only if j is integer
(half-odd-integer). Consequently, for a given value of j, all values of m described in Eq. (A.29) are associated with a vector
|j, mi, and they are also the only accessible values of m for the given value of j.
We can summarize these results in the following way: Let J be an arbitrary angular momentum that obeys the commutation
rules (A.1), and let j (j + 1) and m denote the eigenvalues of J2 and J3 associated with the common eigenvector |j, mi. We have that

• The only possible values of j are non-negative integers or positive half-odd-integers: 0, 1/2, 1, 3/2, 2, 5/2, . . .

• For a given value of j, there exist 2j + 1 possible values of m given by Eq. (A.29). The quantity m is integer if and only if
j is integer, and half-odd-integer if and only if j is half-odd-integer. For a given j, all the values of m described in Eq.
(A.29) are permitted, and they are indeed the only ones that are allowed.
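As a simple illustration (ours, not part of the original summary): if the lowering chain of Eq. (A.25) has p = 2 non-null steps and the raising chain of Eq. (A.27) has q = 1, then j = (p + q)/2 = 3/2 and Eq. (A.29) gives the 2j + 1 = 4 accessible values m = −3/2, −1/2, 1/2, 3/2.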

A.4 Properties of the eigenvectors of J2 and J3


We shall see that the algebraic properties of the operators J2 , J3 , J+ , J− , permit us to extract information about the
eigenvectors of J2 and J3 and construct the irreducible invariant subspaces associated with the rotations.
We should remember that for a given j all values of m given by Eq. (A.29) must appear. We shall develop a method to
generate the eigenvectors of J2 and J3 from certain subset of these vectors, and from the operators J+ , J− .

A.4.1 Generation of eigenvectors by means of the operators J+ and J−


Let us consider an angular momentum operator J that acts on a given vector space E, and we shall show an algorithm to
construct an orthonormal basis in E consisting of eigenvectors common to J2 and J3 .
Let us take a couple of eigenvalues j (j + 1) and m such that eigenvectors of the form |j, m, ki are contained in the vector
space E, where the label k indicates a possible degeneracy in j, m. The eigenvectors associated with the pair (j, m) form an eigensubspace E (j, m) of dimension g (j, m). If g (j, m) > 1 for at least one pair (j, m), then the set J2 , J3 does not form a C.S.C.O.
We shall choose an orthonormal basis of vectors {|j, m, ki} with k = 1, . . . , g (j, m), and with j, m fixed.
If m ≠ j there is a subspace E (j, m + 1) of E composed of eigenvectors of J2 , J3 with eigenvalues j (j + 1) and (m + 1). Analogously, if m ≠ −j there is a subspace E (j, m − 1) with eigenvectors of J2 , J3 and eigenvalues j (j + 1) , (m − 1). If m ≠ j we shall construct an orthonormal basis in E (j, m + 1) from the basis already constructed for E (j, m). Similarly, if m ≠ −j we shall generate an orthonormal basis in E (j, m − 1) starting from the basis in E (j, m).
First of all, we shall show that for k1 ≠ k2 the vectors J+ |j, m, k1 i and J+ |j, m, k2 i are orthogonal. In the same way, we shall see that J− |j, m, k1 i and J− |j, m, k2 i are orthogonal. To show this, we calculate the inner product between the vectors we are concerned with, using formulas (A.12)

(J± |j, m, k2 i , J± |j, m, k1 i) = hj, m, k2 | J∓ J± |j, m, k1 i = hj, m, k2 | (J2 − J3² ∓ J3 ) |j, m, k1 i
= [j (j + 1) − m² ∓ m] hj, m, k2 | j, m, k1 i
(J± |j, m, k2 i , J± |j, m, k1 i) = [j (j + 1) − m (m ± 1)] hj, m, k2 | j, m, k1 i    (A.30)

and since the vectors {|j, m, ki i} associated with E (j, m) are orthonormal by hypothesis, we have that

Theorem A.1 Let |j, m, k1 i and |j, m, k2 i be two orthogonal eigenvectors of J2 and J3 with eigenvalues j (j + 1) , m, and
k1 ≠ k2 . Then J± |j, m, k2 i is orthogonal to J± |j, m, k1 i.

If k1 = k2 , Eq. (A.30) permits us to calculate the norm of J± |j, m, k2 i


‖J± |j, m, ki‖² = j (j + 1) − m (m ± 1)

therefore, we can construct orthonormal vectors |j, m ± 1, ki simply by normalizing the vectors J± |j, m, ki. Let us start with J+ |j, m, ki: normalizing these vectors, we obtain an orthonormal set in E (j, m + 1)
given by
|j, m + 1, ki ≡ J+ |j, m, ki / √(j (j + 1) − m (m + 1))    (A.31)
multiplying Eq. (A.31) by J− , and using Eq. (A.12) we have

J− |j, m + 1, ki = J− J+ |j, m, ki / √(j (j + 1) − m (m + 1)) = (J2 − J3² − J3 ) |j, m, ki / √(j (j + 1) − m (m + 1))
= [j (j + 1) − m (m + 1)] |j, m, ki / √(j (j + 1) − m (m + 1))
J− |j, m + 1, ki = √(j (j + 1) − m (m + 1)) |j, m, ki    (A.32)

Theorem A.2 The orthonormal set {|j, m + 1, ki} in E (j, m + 1) generated by all the elements of the basis {|j, m, ki} of
E (j, m) by means of Eq. (A.31), constitutes a basis for E (j, m + 1).

Proof: We shall proceed by contradiction. For this we assume that the set {|j, m + 1, ki} is not a basis; according to theorem 2.28, page 27, this is equivalent to saying that there is a non-null vector |j, m + 1, αi in E (j, m + 1) orthogonal to all the vectors in the set {|j, m + 1, ki}. This implies that α ≠ k for all k's of such a set. Since m + 1 ≠ −j, the vector J− |j, m + 1, αi is non-null because of lemma A.2, and such a vector lies in E (j, m). Now, since α ≠ k, theorem A.1 says that J− |j, m + 1, αi will be orthogonal to all vectors of the form J− |j, m + 1, ki. On the other hand, Eq. (A.32) says that J− |j, m + 1, ki is collinear with |j, m, ki. Consequently, by running through the whole basis {|j, m, ki} we obtain that the set {J− |j, m + 1, ki} generated in this way is also a basis for E (j, m). From the previous discussion we see that J− |j, m + 1, αi is a non-null vector of E (j, m), orthogonal to all vectors of the basis {|j, m, ki}, but this is impossible by virtue of theorem 2.28. Therefore, the set of vectors {|j, m + 1, ki} generated by the basis {|j, m, ki} of E (j, m) by means of (A.31) is complete. QED.
Similarly, it can be proved that when m ≠ −j, we are able to define vectors |j, m − 1, ki in the form

|j, m − 1, ki ≡ J− |j, m, ki / √(j (j + 1) − m (m − 1))    (A.33)

and also similarly we can prove the following theorem

Theorem A.3 The orthonormal set {|j, m − 1, ki} in E (j, m − 1) generated by all the elements of the basis {|j, m, ki} of
E (j, m) by means of Eq. (A.33), constitutes a basis for E (j, m − 1).

Note that Eq. (A.33) is obtained from (A.32) by replacing m → m − 1. Equations (A.31, A.33) imply a choice of zero in
the phase difference between |j, m ± 1, ki and the vector J± |j, m, ki, such that the constant of proportionality between them
is real and positive. This convention of zero phase is known as the Condon-Shortley convention.
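As a quick check of these formulas (our illustration, not in the original text), take j = 1 and m = 0: since j (j + 1) − m (m ± 1) = 2, Eqs. (A.31, A.33) give
|1, 1, ki = J+ |1, 0, ki / √2 ; |1, −1, ki = J− |1, 0, ki / √2
with real and positive proportionality constants, as required by the Condon-Shortley convention.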
Furthermore, we see that Eqs. (A.31) establish a one-to-one and onto relation between the bases of E (j, m) and E (j, m + 1).
In the same way, Eqs. (A.33) give us a one-to-one and onto relation between the bases of E (j, m) and E (j, m − 1). Consequently,
the spaces E (j, m) and E (j, m ± 1) have the same dimensionality. By induction, we obtain that the dimensionality of any E (j, m) depends only on j
g (j, m) = g (j)
let us describe a systematic procedure to generate an orthonormal basis for the whole space E consisting of eigenvectors of J2
and J3 . For an accessible value of j we find a subspace of the form E (j, m), say E (j, j), and we find an orthonormal basis in such a space3 {|j, j, ki ; k = 1, . . . , g (j)}. Now using (A.33) we construct the bases for E (j, j − 1) , E (j, j − 2) , . . . , E (j, −j)
by iteration. The union of the bases of the 2j + 1 subspaces associated with j gives us an orthonormal basis for the subspace
E (j) given by
E (j) = E (j, j) ⊕ E (j, j − 1) ⊕ E (j, j − 2) ⊕ . . . ⊕ E (j, −j) (A.34)
it is clear that the space E (j) has dimensionality (2j + 1) g (j). Once the basis is generated for a given E (j), we change to
another accessible value of j and repeat the procedure, until all possible values of j in E have been considered. An orthonormal basis for E consisting of eigenvectors of J2 and J3 is obtained from the union of the bases associated with each accessible value
of j since
E = E (j1 ) ⊕ E (j2 ) ⊕ E (j3 ) ⊕ . . . (A.35)
where {j1 , j2 , j3 , . . .} are the accessible values of j in the vector space considered. We emphasize that this set must be a subset of all the non-negative integer and half-odd-integer numbers. Table A.1 describes schematically the algorithm to generate
a basis for E (j) from the basis for E (j, j).
The basis generated by this algorithm is called the standard basis of the vector space E, for which there are orthonormality
and completeness relations
hj, m, k |j ′ , m′ , k ′ i = δjj ′ δmm′ δkk′   ;   Σ_j Σ_{m=−j}^{+j} Σ_{k=1}^{g(j)} |j, m, ki hj, m, k| = I    (A.36)

Of course, we can start from E (j, −j) and then construct the basis by using J+ . Finally, we can start from a given E (j, m) with −j < m < j; in such a case we should generate with J+ “upward” up to m = j and with J− “downward” down to m = −j.
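As an illustration of the algorithm of Table A.1, the following sketch (ours, not from the notes; Python with NumPy, ħ = 1, and with a concrete matrix representation of J− chosen by us) generates the 2j + 1 vectors of one column of the table by repeated use of Eq. (A.33):

import numpy as np

def lowering_matrix(j):
    """Matrix of J- in the basis |j, j>, |j, j-1>, ..., |j, -j> (hbar = 1), from Eq. (A.37)."""
    m = np.arange(j, -j, -1)                      # m = j, j-1, ..., -j+1
    c = np.sqrt(j * (j + 1) - m * (m - 1))        # J-|j, m> = c(m) |j, m-1>
    return np.diag(c, k=-1)

def standard_column(j, top):
    """From a normalized top vector |j, j, k>, build |j, m, k> for m = j, ..., -j via Eq. (A.33)."""
    Jm = lowering_matrix(j)
    col = [top]
    for m in np.arange(j, -j, -1):                # lower from m to m-1 and renormalize
        col.append(Jm @ col[-1] / np.sqrt(j * (j + 1) - m * (m - 1)))
    return col

# example: j = 1, starting from |1, 1> represented by (1, 0, 0)
column = standard_column(1, np.array([1.0, 0.0, 0.0]))

In this trivial one-column case the output is just the canonical basis itself, but the same loop applies to any concrete representation of J− acting on E.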

A.4.2 Summary of results


The eigenvectors common to the operators J2 and J3 are denoted by {|j, m, ki}, where k labels their possible degeneracy.
The action of the operators J2 , Ji and J± on this canonical basis yields
J2 |j, m, ki = j (j + 1) |j, m, ki ; J3 |j, m, ki = m |j, m, ki ; J± |j, m, ki = √(j (j + 1) − m (m ± 1)) |j, m ± 1, ki    (A.37)
3 It is possible that for certain integer or half-odd-integer values of j, there is no subspace of the form E (j, m) in the vector space E. It has to

do with the fact that not necessarily all representations of SO (3) are contained in a given vector space, so that only some values of j are possible.

k=1 k=2 ... k = g (j)


E (j, j) |j, j, 1i |j, j, 2i ... |j, j, g (j)i
⇓ J− ⇓ J− ⇓ J− ... ⇓ J−
E (j, j − 1) |j, j − 1, 1i |j, j − 1, 2i ... |j, j − 1, g (j)i
⇓ J− ⇓ J− ⇓ J− ... ⇓ J−
.. .. .. ..
. . . .
E (j, m) |j, m, 1i |j, m, 2i ... |j, m, g (j)i
⇓ J− ⇓ J− ⇓ J− ... ⇓ J−
.. .. .. ..
. . . .
E (j, −j) |j, −j, 1i |j, −j, 2i ... |j, −j, g (j)i
E (j, k = 1) E (j, k = 2) E (j, k = g (j))
Table A.1: Construction of the standard basis for E (j) of dimension (2j + 1) g (j). Starting with each of the g (j) vectors |j, j, ki of the first row, we use the operator J− to construct the 2j + 1 vectors of each column. The g (j) vectors of the m-th row span the subspace E (j, m). The 2j + 1 vectors of the k-th column span the subspace E (j, k). There is a total of 2j + 1 subspaces of the form E (j, m) and a total of g (j) subspaces of the form E (j, k). The space E (j) can be obtained as the direct sum of the E (j, m), or alternatively as the direct sum of the E (j, k).

where j takes non-negative integer or half-odd-integer values and −j ≤ m ≤ j. We also have


J1 |j ′ , m′ , k ′ i = (1/2) (J+ + J− ) |j ′ , m′ , k ′ i = (1/2) [√(j ′ (j ′ + 1) − m′ (m′ + 1)) |j ′ , m′ + 1, k ′ i
+ √(j ′ (j ′ + 1) − m′ (m′ − 1)) |j ′ , m′ − 1, k ′ i]    (A.38)

J2 |j ′ , m′ , k ′ i = (1/2i) (J+ − J− ) |j ′ , m′ , k ′ i = (1/2i) [√(j ′ (j ′ + 1) − m′ (m′ + 1)) |j ′ , m′ + 1, k ′ i
− √(j ′ (j ′ + 1) − m′ (m′ − 1)) |j ′ , m′ − 1, k ′ i]    (A.39)
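As a quick numerical check of this summary (a sketch of our own, not from the notes; Python with NumPy, ħ = 1, basis ordered as m = j, j − 1, . . . , −j, function names are ours), one can build the matrices of Eq. (A.37) and verify that [J1 , J2 ] = iJ3 and that J2 = j (j + 1) I:

import numpy as np

def angular_momentum_matrices(j):
    """J3, J+ and J- in the basis |j, j>, |j, j-1>, ..., |j, -j>, with hbar = 1 (Eq. A.37)."""
    m = np.arange(j, -j - 1, -1)                     # m = j, j-1, ..., -j
    J3 = np.diag(m)
    c = np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1))   # <j, m+1 | J+ | j, m>
    Jp = np.diag(c, k=1)
    return J3, Jp, Jp.T                              # J- is the (real) adjoint of J+

j = 3 / 2
J3, Jp, Jm = angular_momentum_matrices(j)
J1 = (Jp + Jm) / 2
J2 = (Jp - Jm) / (2 * 1j)
Jsq = J1 @ J1 + J2 @ J2 + J3 @ J3

assert np.allclose(J1 @ J2 - J2 @ J1, 1j * J3)                      # [J1, J2] = i J3
assert np.allclose(Jsq, j * (j + 1) * np.eye(int(2 * j + 1)))       # J^2 = j(j+1) I on E(j)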

A.5 Construction of a standard basis from a C.S.C.O


A very useful method to generate a standard basis consists of using a set of observables

{A1 , A2 , . . . , An }

that along with J2 and J3 form a C.S.C.O. and also commute with all the components of J

[Ai , J] = 0 ; i = 1, . . . , n

an observable that commutes with all the components of J is called a scalar. For the sake of simplicity, we shall assume that a single scalar A is enough to form a C.S.C.O. with J2 and J3 . Let us see the action of A on an arbitrary vector |j, m, ki of E (j, m); defining |ψi ≡ A |j, m, ki we have

J2 |ψi = J2 A |j, m, ki = AJ2 |j, m, ki = j (j + 1) A |j, m, ki = j (j + 1) |ψi


J3 |ψi = J3 A |j, m, ki = AJ3 |j, m, ki = mA |j, m, ki = m |ψi

where we have used the fact that A commutes with J2 and J3 . We have then that |ψi ≡ A |j, m, ki is an eigenvector of J2
and J3 with eigenvalues j (j + 1) and m and therefore belongs to E (j, m). Hence, each subspace E (j, m) is invariant under
the action of an operator A that commutes with J2 and J3 . If we now choose a value of j, the subspace E (j, j) will in particular be invariant under A, and we can diagonalize the restriction of A to E (j, j) in a certain orthonormal basis {|j, j, ki} of E (j, j),4 such that
A |j, j, ki = ajk |j, j, ki (A.40)
the set {|j, j, ki ; j fixed; k = 1, . . . , g (j)} is an orthonormal basis of E (j, j), from which we can construct the orthonormal basis for E (j). Applying this procedure for each accessible value of j we obtain the orthonormal basis {|j, m, ki} for the whole
vector space E.
The previous results do not require A to be a scalar; they only require that it commutes with J2 and J3 . Let {|j, m, ki} be the basis of E (j, m) obtained from successive application of J− to the basis {|j, j, ki}. We shall see that if A is a scalar,
4 Remember that A is hermitian and hence normal. For every normal operator there exists a unitary transformation that diagonalizes it.

the vectors {|j, m, ki}, besides being eigenvectors of J2 and J3 , are also eigenvectors of A. To see this, we observe that for a scalar A we have
[A, J− ] = [A, J1 − iJ2 ] = [A, J1 ] − i [A, J2 ] = 0 (A.41)
Using (A.40, A.41) we get
A [J− |j, j, ki] = J− A |j, j, ki = ajk [J− |j, j, ki]
hence J− |j, j, ki is an eigenvector of A with the same eigenvalue as |j, j, ki (theorem 3.17). Equivalently, |j, j − 1, ki is an eigenvector of A with the same eigenvalue as |j, j, ki. Applying this procedure successively we see that the vectors given by
|j, j, ki , |j, j − 1, ki , . . . , |j, −j, ki
are eigenvectors of A with eigenvalues ajk . Therefore, we can write
A |j, m, ki = ajk |j, m, ki ; m = j, j − 1, . . . , −j + 1, − j (A.42)
the spectrum of A is then the same for all the subspaces E (j, m) with j fixed, but it depends in general on j and also on k, so that the set of labels (j, m, k) uniquely defines a vector |j, m, ki of E, as it must for a C.S.C.O.
Note that an observable that commutes with J2 and J3 does not necessarily commute with J1 and J2 . In particular, the set (J2 , J3 , A) could form a C.S.C.O. but A does not have to commute with J1 and/or J2 . In such a case, however, J± does not commute with A and therefore J± |j, m, ki is not necessarily an eigenvector of A with the same eigenvalue as |j, m, ki. Hence, when A commutes with J2 and J3 but is not a scalar, the basis {|j, m, ki} obtained by successive application of J− to {|j, j, ki} should be rotated to another basis {|j, m, αi} to diagonalize the restriction of A to E (j, m). In contrast, when A is a scalar, the latter rotation is not necessary.

A.6 Decomposition of E in subspaces of the type E (j, k)


In the previous procedures, we have decomposed the whole space E in the form given by the combination of Eqs. (A.34, A.35)
E = E (j1 , j1 ) ⊕ E (j1 , j1 − 1) ⊕ E (j1 , j1 − 2) ⊕ . . . ⊕ E (j1 , −j1 ) ⊕
E (j2 , j2 ) ⊕ E (j2 , j2 − 1) ⊕ E (j2 , j2 − 2) ⊕ . . . ⊕ E (j2 , −j2 ) ⊕
E (j3 , j3 ) ⊕ E (j3 , j3 − 1) ⊕ E (j3 , j3 − 2) ⊕ . . . ⊕ E (j3 , −j3 ) ⊕ . . .
where j1 , j2 , j3 , . . . are the permitted values of j for the vector space under study. This is a decomposition in subspaces of the
type E (j, m). Nevertheless, the subspaces E (j, m) have some disadvantages. On one hand, their dimension g (j) depends on the specific vector space under study, because this dimension accounts for the degeneracy associated with the pair (j, m); therefore g (j) is unknown in the general case. In addition, a subspace of the type E (j, m) is not invariant under J, and so it is not invariant under SO (3). For instance
J1 |j, m, ki = (1/2) (J+ + J− ) |j, m, ki = (1/2) c+ |j, m + 1, ki + (1/2) c− |j, m − 1, ki    (A.43)
and according to (A.36) this vector is orthogonal to |j, m, ki and is non-null, since at least one of the vectors |j, m + 1, ki , |j, m − 1, ki has to be non-null and both are orthogonal to each other.
Examining Table A.1 we see that each subspace of the type E (j, m) is spanned by the g (j) vectors of the m-th row of the table (the g (j) possible values of k). We see, however, that there is another way of collecting the vectors: we can generate a subspace with the (2j + 1) vectors of a fixed column of the table, from which we obtain a subspace of the type E (j, k), since in that case it is the pair (j, k) that remains fixed in the expansion.
The decomposition of E acquires the form
E = E (j1 , k = 1) ⊕ E (j1 , k = 2) ⊕ . . . ⊕ E (j1 , k = g (j1 )) ⊕
E (j2 , k = 1) ⊕ E (j2 , k = 2) ⊕ . . . ⊕ E (j2 , k = g (j2 )) ⊕
E (j3 , k = 1) ⊕ E (j3 , k = 2) ⊕ . . . ⊕ E (j3 , k = g (j3 )) ⊕ . . . (A.44)
the subspaces E (j, k) possess the following properties: (a) the dimension of E (j, k) is 2j + 1, such that for a given j its dimension is known regardless of the vector space we are working with; (b) E (j, k) is an irreducible invariant subspace under the SO (3) group and under its associated Lie algebra (see theorem 15.9 page 269).
We can see that E (j, k) and E (j, k ′ ) behave identically under any function of the generators F (J) or of the elements of the SO (3) group F (Rn (Φ)). Consequently, the label k can only be distinguished by applying an operator outside of SO (3). For this reason, the invariant subspaces are labeled in Sec. 15.5 simply as E (j) (see Eq. 15.76). As an example, if the total space in question decomposes as in Eq. (A.44), we see that Eq. (15.79) on page 270 must be rewritten as
hj, m, k| J3 |j ′ , m′ , k ′ i = mδjj ′ δmm′ δkk′ (A.45)
and similarly for Eqs. (15.80-15.83).
Appendix B

Addition of two angular momenta

The direct product representations of SO (3) lead naturally to the problem of the addition rules for two angular momenta. The theory can be developed consistently thanks to the fact that when the tensor product of two vector spaces is taken, the sum of the angular momenta associated with each component space gives another angular momentum.

B.1 Total and partial angular momenta


Let J(1) be an angular momentum defined on a vector space E1 , and let J(2) be another angular momentum defined on a
different vector space E2 . By making the tensor product E ≡ E 1 ⊗ E2 and defining the extensions of J(1) and J(2) on E, we can
define the operator
J ≡ J(1) + J(2)
Corollary 15.12 on page 278 says that J is also an angular momentum. Hence, all properties of angular momentum are valid for J. Besides, we also have properties for mixed commutators (involving a total angular momentum and a partial angular momentum). In particular, let us examine the commutation properties of J2
J2 = (J(1) + J(2) )² = J2(1) + J2(2) + 2J(1) · J(2)    (B.1)

where we have taken into account that J(1) and J(2) commute. The scalar product can be expressed in terms of the raising and lowering operators J±(1) , J±(2) as well as the operators J3(1) and J3(2) .
J(1) · J(2) = J1(1) J1(2) + J2(1) J2(2) + J3(1) J3(2)    (B.2)
= (1/4) (J+(1) + J−(1) ) (J+(2) + J−(2) ) + (1/(4i²)) (J+(1) − J−(1) ) (J+(2) − J−(2) ) + J3(1) J3(2)
= (1/4) [J+(1) J+(2) + J+(1) J−(2) + J−(1) J+(2) + J−(1) J−(2) − J+(1) J+(2) + J+(1) J−(2) + J−(1) J+(2) − J−(1) J−(2) ] + J3(1) J3(2)
J(1) · J(2) = (1/2) (J+(1) J−(2) + J−(1) J+(2) ) + J3(1) J3(2)    (B.3)
2
Now the idea is to compare the commuting sets
{ J2(1) , J3(1) , J2(2) , J3(2) }   ;   { J2 , J3 }

where the first consists of partial angular momenta and the second of total angular momenta. Since J(1) and J(2) commute with J2(1) and J2(2) , then J also commutes with them
[J, J2(1) ] = [J, J2(2) ] = 0

in particular J2 and J3 commute with J2(1) and J2(2)


[J3 , J2(1) ] = [J3 , J2(2) ] = 0    (B.4)
[J2 , J2(1) ] = [J2 , J2(2) ] = 0    (B.5)


on the other hand, it is obvious that J3 commutes with J3(1) and J3(2)
[J3 , J3(1) ] = [J3 , J3(2) ] = 0    (B.6)

but J2 does not commute with J3(1) nor with J3(2) , which can be seen by using (B.1, B.2)
[J2 , J3(1) ] = [J2(1) + J2(2) + 2J(1) · J(2) , J3(1) ] = 2 [J(1) · J(2) , J3(1) ]
[J2 , J3(1) ] = 2 [J1(1) J1(2) + J2(1) J2(2) , J3(1) ] = 2 [J1(1) J1(2) , J3(1) ] + 2 [J2(1) J2(2) , J3(1) ]
= 2J1(1) [J1(2) , J3(1) ] + 2 [J1(1) , J3(1) ] J1(2) + 2J2(1) [J2(2) , J3(1) ] + 2 [J2(1) , J3(1) ] J2(2)
[J2 , J3(1) ] = −2iJ2(1) J1(2) + 2iJ1(1) J2(2)
obtaining finally
[J2 , J3(1) ] = 2i (J1(1) J2(2) − J2(1) J1(2) )    (B.7)

and since J is an angular momentum, we have
[J2 , J] = 0
so that
[J2 , J3(1) + J3(2) ] = 0  ⇒  [J2 , J3(1) ] = − [J2 , J3(2) ]

the previous analysis shows us that the following set of operators commute each other
n o
J2 , J3 , J2(1) , J2(2)
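A quick numerical check of these commutation relations (our own sketch, not part of the original notes; Python with NumPy, ħ = 1) can be made by building the extensions of J(1) and J(2) on a small product space, say j(1) = 1 and j(2) = 1/2:

import numpy as np

def jmat(j):
    """Return (J1, J2, J3) for a single angular momentum j (hbar = 1)."""
    m = np.arange(j, -j - 1, -1)
    Jp = np.diag(np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1)), k=1)
    return (Jp + Jp.T) / 2, (Jp - Jp.T) / (2 * 1j), np.diag(m)

def comm(A, B):
    return A @ B - B @ A

j1, j2 = 1, 0.5
d1, d2 = int(2 * j1 + 1), int(2 * j2 + 1)
Jops1 = [np.kron(Ji, np.eye(d2)) for Ji in jmat(j1)]   # extensions of J(1) to E1 (x) E2
Jops2 = [np.kron(np.eye(d1), Ji) for Ji in jmat(j2)]   # extensions of J(2)
Jtot  = [a + b for a, b in zip(Jops1, Jops2)]          # total angular momentum J

Jsq  = sum(Ji @ Ji for Ji in Jtot)
Jsq1 = sum(Ji @ Ji for Ji in Jops1)
assert np.allclose(comm(Jsq, Jtot[2]), 0)              # [J^2, J3] = 0
assert np.allclose(comm(Jsq, Jsq1), 0)                 # [J^2, J^2(1)] = 0, Eq. (B.5)
assert np.allclose(comm(Jtot[2], Jsq1), 0)             # [J3, J^2(1)] = 0, Eq. (B.4)
assert not np.allclose(comm(Jsq, Jops1[2]), 0)         # [J^2, J3(1)] != 0, Eq. (B.7)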

We shall illustrate the methodology for the addition of two angular momenta by taking a simple example, before going to
the general formalism.

B.2 Addition of two angular momenta with j(1) = j(2) = 1/2


Each space E1/2^(k) associated with a given j(k) is a two-dimensional space. Its tensor product E = E1/2^(1) ⊗ E1/2^(2) is 4-dimensional.
The orthonormal “natural” basis in this space will be denoted as {|ε1 i ⊗ |ε2 i} ≡ {|ε1 , ε2 i}. Explicitly, it gives

{|ε1 , ε2 i} = {|+, +i , |+, −i , |−, +i , |−, −i} (B.8)


they are eigenvectors of J2(1) , J3(1) , J2(2) , J3(2) . Strictly speaking, they are the tensor extensions to the E space; from now on it will be assumed that the appropriate extensions are taken when necessary

J2(1) |ε1 , ε2 i = J2(2) |ε1 , ε2 i = (3/4) |ε1 , ε2 i    (B.9)
J3(1) |ε1 , ε2 i = (ε1 /2) |ε1 , ε2 i ; J3(2) |ε1 , ε2 i = (ε2 /2) |ε1 , ε2 i    (B.10)
the set
J2(1) , J3(1) , J2(2) , J3(2)    (B.11)
forms a C.S.C.O. for the space E = E1/2^(1) ⊗ E1/2^(2) . In other words, the basis (B.8) consists of eigenvectors common to the set { J2(1) , J3(1) , J2(2) , J3(2) }. Strictly speaking, J2(1) , J2(2) can be excluded since they are proportional to the identity1 . We have also
seen that the 4 observables
J2(1) , J2(2) , J2 , J3 (B.12)
commute with each other. We shall see that it is also a C.S.C.O. in E = E1/2^(1) ⊗ E1/2^(2) . The addition of two angular momenta amounts to constructing the orthonormal system of eigenvectors common to the set (B.12). This system differs from the basis (B.8) since J2 does not
1 Note that Eq. (B.9) says J2(1) = J2(2) , understood as their appropriate extensions. It is seen from the fact that they act in the same way on all the elements of the basis. It can also be seen by taking into account that both are proportional to the identity in their corresponding spaces, such that their extensions are J2(1) = (3/4)ℏ² E (1) ⊗ E (2) and J2(2) = E (1) ⊗ (3/4)ℏ² E (2) . Therefore J2(1) = J2(2) = (3/4)ℏ² E (1×2) .

commute with J3(1) , J3(2) . We shall denote the vectors of the new basis as |J, M i where the eigenvalues of J2(1) , J2(2) (which remain the same) are implicit2 . These vectors satisfy the relations

J2(1) |J, M i = J2(2) |J, M i = (3/4) |J, M i    (B.13)
J2 |J, M i = J (J + 1) |J, M i (B.14)
J3 |J, M i = M |J, M i (B.15)

since J is an angular momentum, J should be a non-negative integer or half-odd-integer, and M must lie between −J and J, varying in unit steps. The problem is then to find the values that J and M can take, based on the values of j1 , j2 and m1 , m2 ,
as well as expressing the basis {|J, M i} in terms of the known basis (B.8).
We shall solve the problem by diagonalizing the 4 × 4 matrices that represent J2 and J3 in the basis {|ε1 , ε2 i}. We shall
develop later another method more suitable for higher dimensional representations.

B.2.1 Eigenvalues of J3 and their degeneracy


Note that for the observables J2(1,2) all vectors in the space E = E1/2^(1) ⊗ E1/2^(2) are eigenvectors, hence the |J, M i are already eigenvectors of these observables. On the other hand, Eqs. (B.4, B.6) say that J3 commutes with the four observables of the C.S.C.O. given by Eq. (B.11). Therefore, we expect the basis vectors {|ε1 , ε2 i} to be automatically eigenvectors of J3 . Using (B.10) we
find that
J3 |ε1 , ε2 i = (J3(1) + J3(2) ) |ε1 , ε2 i = ((ε1 + ε2 )/2) |ε1 , ε2 i

we see then that |ε1 , ε2 i is an eigenvector of J3 with eigenvalue

M = (ε1 + ε2 )/2    (B.16)

since ε1 and ε2 take the values ±1, we see that M takes the values +1, 0, −1.
The values M = ±1 are non-degenerate. There is only one associated eigenvector for each of them: |+, +i corresponds to
+1 and |−, −i corresponds to −1. In other words, for M = +1 there is only one possibility ε1 = ε2 = +1, the case M = −1 is
possible only if ε1 = ε2 = −1. By contrast, M = 0 is 2-fold degenerate, there are 2 linearly independent vectors associated with
it, they are |+, −i and |−, +i. This means that there are 2 solutions for M = 0, they are ε1 = −ε2 = 1 and ε1 = −ε2 = −1.
Any linear combination of the vectors |+, −i and |−, +i is an eigenvector of J3 with eigenvalue M = 0.
These results are apparent in the matrix representation of J3 in the basis {|ε1 , ε2 i}. Ordering the vectors in the form of
Eq. (B.8) this matrix reads
(J3 ) =
( 1  0  0  0 )
( 0  0  0  0 )
( 0  0  0  0 )
( 0  0  0 −1 )

B.2.2 Diagonalization of J2
We shall apply J2 to the vectors of the basis (B.8), for which we shall use Eqs. (B.1, B.3)

J2 = (J(1) + J(2) )² = J2(1) + J2(2) + J+(1) J−(2) + J−(1) J+(2) + 2J3(1) J3(2)

the 4 vectors |ε1 , ε2 i are eigenvectors of J2(1) , J2(2) , J3(1) and J3(2) as can be seen in Eqs. (B.9, B.10), and the action of the raising and lowering operators is given by Eqs. (15.73), page 269 with j = 1/2. Therefore, we can evaluate J2 |ε1 , ε2 i for all


2 The complete notation would be |J, M (j(1) , j(2) )i = |J, M (1/2, 1/2)i.

the elements of the basis {|ε1 , ε2 i}

J2 |+, +i = (3/4 + 3/4) |+, +i + (1/2) |+, +i
= 2 |+, +i    (B.17)
J2 |+, −i = (3/4 + 3/4) |+, −i − (1/2) |+, −i + |−, +i
= [|+, −i + |−, +i]    (B.18)
J2 |−, +i = (3/4 + 3/4) |−, +i − (1/2) |−, +i + |+, −i
= [|+, −i + |−, +i]    (B.19)
J2 |−, −i = (3/4 + 3/4) |−, −i + (1/2) |−, −i
= 2 |−, −i    (B.20)

the matrix representative of J2 in the basis {|ε1 , ε2 i} in the order given by (B.8) yields
(J2 ) =
( 2  0  0  0 )
( 0  1  1  0 )
( 0  1  1  0 )
( 0  0  0  2 )

since J2 commutes with J3 , the matrix will have non-null elements only between eigenvectors of J3 associated with the same eigenvalue; this explains the null elements of the matrix. According to the results of Sec. B.2.1, the only non-diagonal elements of J2 that are different from zero are those relating the vectors {|+, −i , |−, +i}, which are associated with the same
value of M (M = 0).
Now, to diagonalize this matrix we take into account that it is block-diagonal with the following structure
( A1×1   0     0   )
(  0    B2×2   0   )
(  0     0    C1×1 )

the one-dimensional matrices are associated with the vectors |±, ±i which are eigenvectors of J2 , as can be seen in Eqs.
(B.17,B.20). The associated eigenvalues are both given by J (J + 1) = 2 corresponding to J = 1. Now we should diagonalize
the submatrix
B2×2 =
( 1  1 )
( 1  1 )
that represents J2 within the 2-dimensional subspace generated by {|+, −i , |−, +i}, i.e. the eigensubspace of J3 that corre-
sponds to M = 0. The eigenvalues λ = J (J + 1) of this matrix are found with the secular equation
(1 − λ)² − 1 = 0

whose roots are λ = 0 and λ = 2. This gives us the remaining eigenvalues of J2 : 0 and 2, corresponding to J = 0 and J = 1. The
corresponding eigenvectors yield
|J = 1, M = 0i = (1/√2) [|+, −i + |−, +i]    (B.21)
|J = 0, M = 0i = (1/√2) [|+, −i − |−, +i]    (B.22)
as is customary, a global phase can be included if desired.
We see then that J2 has 2 different eigenvalues: 0 and 2. The null eigenvalue is non-degenerate and its only associated
vector is (B.22). On the other hand, the eigenvalue 2 is three-fold degenerate, since it is associated with the vectors |+, +i , |−, −i as well as the linear combination (B.21).
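The diagonalization just described can be reproduced numerically. The following sketch (ours, not from the notes; Python with NumPy, ħ = 1) builds the 4 × 4 matrices of J3 and J2 in the basis (B.8) by means of Kronecker products and recovers the eigenvalues 0 and 2:

import numpy as np

# spin-1/2 operators in the basis {|+>, |->}, hbar = 1
s3 = np.diag([0.5, -0.5])
sp = np.array([[0.0, 1.0], [0.0, 0.0]])     # S+|-> = |+>
I2 = np.eye(2)

# extensions to E = E1 (x) E2, ordered as |++>, |+->, |-+>, |-->  (Eq. B.8)
J3 = np.kron(s3, I2) + np.kron(I2, s3)
Jp = np.kron(sp, I2) + np.kron(I2, sp)
Jm = Jp.T
Jsq = Jp @ Jm + J3 @ J3 - J3                # J^2 = J+ J- + J3^2 - J3, from Eqs. (A.12)

print(np.round(Jsq, 3))                     # reproduces the 4x4 matrix of J^2 above
print(np.round(np.linalg.eigvalsh(Jsq), 3)) # -> [0., 2., 2., 2.]: singlet J = 0 and triplet J = 1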

B.2.3 Eigenstates of J2 and J3 : singlet and triplet


We have then obtained the eigenvalues of J2 and J3 as well as a complete set of common eigenvectors of J2 and J3 (which are
automatically eigenvectors of J2(1) and J2(2) ). We shall express the eigenvectors in the notation given by (B.13-B.15).

The quantum number J of (B.14) can take two values: 0 and 1. The first is associated with a unique vector, which is also an eigenvector of J3 with null eigenvalue; we denote this vector as
|0, 0i = (1/√2) [|+, −i − |−, +i]    (B.23)
while for J = 1 there are three vectors associated with three different values of M
|1, 1i = |+, +i ; |1, 0i = (1/√2) [|+, −i + |−, +i] ; |1, −1i = |−, −i    (B.24)

it can be easily checked that the four vectors in (B.23, B.24) are orthonormal. The specification of J and M determines a vector of this basis uniquely (up to a phase factor), such that J2 and J3 form a C.S.C.O. Although it is not necessary, the operators J2(1) and J2(2) can be added to this C.S.C.O.
Therefore, when we add two angular momenta with j1 = j2 = 1/2, the J number that characterizes the eigenvalue J (J + 1)
of the operator J2 can be equal to zero or equal to one. With each one of these values there is an associated family of (2J + 1)
orthogonal vectors (three for J = 1, and one for J = 0) that corresponds to the 2J + 1 values of M for J fixed.
The family (B.24) of three vectors associated with J = 1 is called a triplet. The vector |0, 0i associated with J = 0 is
called a singlet. Eq. (B.24) shows that the vectors of the triplet are symmetric with respect to the interchange of the two angular momenta, while the singlet vector Eq. (B.23) is antisymmetric. In other words, if each vector |ε1 , ε2 i is replaced by |ε2 , ε1 i, the expressions (B.24) remain invariant while the vector (B.23) changes its sign. This is of great importance for the study of a system of two identical fermions. Further, it indicates the linear combination of |+, −i with |−, +i required to complete the triplet (it must be symmetric). The singlet is then the antisymmetric linear combination of |+, −i with |−, +i, which is orthogonal to the symmetric part and of course to the other vectors of the triplet.

B.3 General method of addition of two angular momenta


Let us consider a vector space E, and J an angular momentum operator acting on it. J may be a partial angular momentum or the total angular momentum of the system. We saw in section A.4.1 that a standard or canonical basis {|j, m, ki}, consisting of eigenvectors common to J2 and J3 , can be constructed

J2 |j, m, ki = j (j + 1) |j, m, ki ; J3 |j, m, ki = m |j, m, ki (B.25)

and the action of the raising and lowering operators on this canonical basis is given by Eqs. (15.73), page 269
J± |j, m, ki = √(j (j + 1) − m (m ± 1)) |j, m ± 1, ki    (B.26)

we denote as E (j, k) the eigensubspace spanned by the vectors of the canonical basis with j, k fixed. This is a (2j + 1)-dimensional space corresponding to the different values of m for a given j. The dimension does not depend on k, and spaces of different k with the same j are identical with respect to any function of the angular momenta. Eqs. (B.25, B.26) say that the 2j + 1 vectors of the basis for E (j, k) transform among themselves under the operators J2 , J3 , J+ , J− . It means that the eigensubspace E (j, k) is invariant under the four operators and in general under the action of any function F (J) (they are irreducible invariant subspaces under the j representation of SO (3)). The complete space E can be written as a direct sum of
orthogonal subspaces E (j, k) as can be seen in Eq. (A.44)

E = E (j1 , k = 1) ⊕ E (j1 , k = 2) ⊕ . . . ⊕ E (j1 , k = g (j1 )) ⊕


E (j2 , k = 1) ⊕ E (j2 , k = 2) ⊕ . . . ⊕ E (j2 , k = g (j2 )) ⊕
E (j3 , k = 1) ⊕ E (j3 , k = 2) ⊕ . . . ⊕ E (j3 , k = g (j3 )) ⊕ . . . (B.27)

owing to the invariance of these subspaces under the operators J2 , J3 , J+ , J− , F (J), such operators will have a matrix repre-
sentation in the canonical basis, where the non-null matrix elements are within each subspace E (j, k). Further, each matrix
element of a function F (J) within a subspace E (j, k) is independent of k.
On the other hand, if we add to J2 and J3 the necessary operators to form a C.S.C.O. we can give a meaning to k and the
subspaces E (j, k) with the same j can be distinguished by the action of the added operators. If, for example, only one operator A is required to form a C.S.C.O. and we assume that A commutes with J (i.e. A is a scalar), we can demand that the eigenvectors |j, m, ki be eigenvectors of A as well
A |j, m, ki = aj,k |j, m, ki (B.28)
such that the standard basis {|j, m, ki} will be determined by Eqs. (B.25, B.26, B.28). Each E (j, k) is also an eigenspace of
A and the index k discriminates among the different eigenvalues aj,k associated with each value of k. When more than one
operator must be added to form a C.S.C.O. the index k indeed corresponds to several indices.

B.3.1 Forming the tensor space and the associated angular momenta
Assume that we have two spaces E1 and E2 . We shall use indices (1) and (2) to denote quantities associated with each space.
Let us also assume that for E1 we know a canonical basis {|j1 , m1 , k1 i} of eigenvectors common to J2(1) and J3(1) , with J(1) being the angular momentum associated with E1 . Therefore, Eqs. (B.25, B.26) yield
J2(1) |j1 , m1 , k1 i = j1 (j1 + 1) |j1 , m1 , k1 i ; J3(1) |j1 , m1 , k1 i = m1 |j1 , m1 , k1 i
J±(1) |j1 , m1 , k1 i = √(j1 (j1 + 1) − m1 (m1 ± 1)) |j1 , m1 ± 1, k1 i

and similarly for the canonical basis {|j2 , m2 , k2 i} of E2


J2(2) |j2 , m2 , k2 i = j2 (j2 + 1) |j2 , m2 , k2 i ; J3(2) |j2 , m2 , k2 i = m2 |j2 , m2 , k2 i
J±(2) |j2 , m2 , k2 i = √(j2 (j2 + 1) − m2 (m2 ± 1)) |j2 , m2 ± 1, k2 i

now we form the tensor space of E1 and E2


E = E1 ⊗ E2
we know that the tensor product of the bases of E1 and E2 forms a basis in E. We denote this basis as

|j1 , m1 , k1 i ⊗ |j2 , m2 , k2 i ≡ |j1 , j2 ; m1 , m2 ; k1 , k2 i (B.29)

the spaces E1 and E2 are direct sums of subspaces of the form E1 (j1 , k1 ) and E2 (j2 , k2 ) respectively. These sums are described
by Eq. (B.27)
E1 = E1 (j1^(1) , k(1) = 1) ⊕ E1 (j1^(1) , k(1) = 2) ⊕ . . . ⊕ E1 (j1^(1) , k(1) = g(j1^(1) )) ⊕
E1 (j2^(1) , k(1) = 1) ⊕ E1 (j2^(1) , k(1) = 2) ⊕ . . . ⊕ E1 (j2^(1) , k(1) = g(j2^(1) )) ⊕
E1 (j3^(1) , k(1) = 1) ⊕ E1 (j3^(1) , k(1) = 2) ⊕ . . . ⊕ E1 (j3^(1) , k(1) = g(j3^(1) )) ⊕ . . .    (B.30)

and similarly for E2 . The notation ji^(m) represents the different values of j for the component space Em . We shall however omit this notation from now on, and we use jm to denote a value of j associated with the space Em . These sums can be written compactly in the form
E1 = Σ⊕ E1 (j1 , k1 ) ; E2 = Σ⊕ E2 (j2 , k2 )

therefore E will be the direct sum of subspaces E (j1 , j2 ; k1 , k2 ) obtained by the tensor product of the subspaces E1 (j1 , k1 ) and E2 (j2 , k2 )
E = Σ⊕ E (j1 , j2 ; k1 , k2 ) ; E (j1 , j2 ; k1 , k2 ) = E1 (j1 , k1 ) ⊗ E2 (j2 , k2 )    (B.31)

the dimension of the subspace E (j1 , j2 ; k1 , k2 ) is (2j1 + 1) (2j2 + 1). This subspace is invariant (though not necessarily irreducible) under any function of the form F (J(1) ) or F (J(2) ), where J(1) and J(2) are of course the appropriate extensions.
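In matrix terms, the extension of an operator of E1 to E = E1 ⊗ E2 is its Kronecker product with the identity of E2 , and conversely. A minimal sketch of this construction (ours, not from the notes; Python with NumPy) also shows that any extended operator of E1 commutes with any extended operator of E2 :

import numpy as np

def extend_from_E1(A1, dim2):
    """Extension of an operator A1 of E1 to E = E1 (x) E2 (it acts trivially on E2)."""
    return np.kron(A1, np.eye(dim2))

def extend_from_E2(A2, dim1):
    """Extension of an operator A2 of E2 to E = E1 (x) E2."""
    return np.kron(np.eye(dim1), A2)

# example: dim E1 = 3 (j1 = 1), dim E2 = 2 (j2 = 1/2); the two extensions always commute
A = extend_from_E1(np.random.rand(3, 3), 2)
B = extend_from_E2(np.random.rand(2, 2), 3)
assert np.allclose(A @ B, B @ A)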

B.3.2 Total angular momentum and its commutation relations


We have seen that J = J(1) + J(2) is also an angular momentum where J(1) and J(2) are proper extensions. Hence, in the same
way as J(1) and J(2) , the operator J satisfies the algebraic properties of an angular momentum. Nevertheless, there are also commutation relations between total and partial angular momenta that are important in our discussion. Since J(1) and J(2) commute with J2(1) and J2(2) , they also commute with J. In particular, J2 and J3 commute with J2(1) and J2(2) . Besides, it is immediate that J3(1) and J3(2) commute with J3 ; consequently
[J3 , J2(1) ] = [J3 , J2(2) ] = [J2 , J2(1) ] = [J2 , J2(2) ] = [J3 , J3(1) ] = [J3 , J3(2) ] = 0    (B.32)

but J3(1) and J3(2) do not commute with J2 , as can be seen from Eqs. (B.1, B.3)

J2 = J2(1) + J2(2) + 2J(1) · J(2) (B.33)


J2 = J2(1) + J2(2) + 2J3(1) J3(2) + J+(1) J−(2) + J−(1) J+(2)    (B.34)

with which we arrived at Eq. (B.7)


[J2 , J3(1) ] = − [J2 , J3(2) ] = 2i (J1(1) J2(2) − J2(1) J1(2) )    (B.35)

B.3.3 Change of basis to be carried out


A vector of the basis
{|j1 , m1 , k1 i ⊗ |j2 , m2 , k2 i} ≡ {|j1 , j2 ; m1 , m2 ; k1 , k2 i} (B.36)
is a simultaneous eigenvector of the observables
J2(1) , J2(2) , J3(1) , J3(2)
with eigenvalues j1 (j1 + 1) , j2 (j2 + 1) , m1 , m2 . Now, Eqs. (B.32) say that the set of observables
J2(1) , J2(2) , J2 , J3
also commute with each other. Observe that a basis common to these observables must be different from the basis (B.36), because, according to Eq. (B.35), J2 does not commute with J3(1) nor with J3(2) .
On the other hand, the meaning of the indices k1 and k2 comes from a natural extension of the procedure for each space Ei . If {A1 , J2(1) , J3(1) } forms a C.S.C.O. in E1 , where A1 commutes with J(1) , then we can choose a canonical basis {|j1 , m1 , k1 i} consisting of the complete set of orthonormal vectors common to these observables. If something similar occurs with the set of observables {A2 , J2(2) , J3(2) } in E2 , then the set

A1 , A2 ; J2(1) , J2(2) ; J3(1) , J3(2)
forms a C.S.C.O. in E whose eigenvectors are given by Eq. (B.36). On the other hand, since A1 commutes with J(1) and with
J(2) then it commutes with J. This in turn implies that A1 commutes with J2 and J3 . The same occurs for the observable A2 ,
therefore the observables in the set
A1 , A2 ; J2(1) , J2(2) ; J2 , J3
commute with each other. It can also be shown that they form a C.S.C.O. and that the new canonical basis that we will find is an orthonormal system of eigenvectors common to the operators in the C.S.C.O.
Now the subspace E (j1 , j2 ; k1 , k2 ) defined in (B.31) is invariant under the action of an operator of the form F (J(1) ) or of the form F (J(2) ). Therefore, it is invariant under the action of a function of the type F (J). Hence the observables J2 and J3 that we are trying to diagonalize have non-null matrix elements only within each subspace E (j1 , j2 ; k1 , k2 ). The matrices that represent J2 and J3 in the basis (B.36) are block-diagonal and can be written as a direct sum of submatrices, each one associated with a subspace of the form E (j1 , j2 ; k1 , k2 ). Consequently, the problem reduces to diagonalizing the submatrices associated with each subspace E (j1 , j2 ; k1 , k2 ), whose dimension is (2j1 + 1) (2j2 + 1).
On the other hand, the matrix elements in the basis (B.36) of any function F (J(1) ) or F (J(2) ) are independent of k1 and k2 (only the matrix elements of A1 depend on k1 and the ones of A2 depend on k2 ). Hence, this is also valid for J2 and J3 . Consequently, the diagonalization of these two operators within all the subspaces E (j1 , j2 ; k1 , k2 ) with the same values of j1 and j2 is done in identical form. So we can talk about the addition of angular momenta without making reference to the other indices. Therefore, we shall simplify the notation by omitting the indices k1 and k2
E (j1 , j2 ) ≡ E (j1 , j2 ; k1 , k2 ) ; |j1 , j2 ; m1 , m2 i ≡ |j1 , j2 ; m1 , m2 ; k1 , k2 i
since J is an angular momentum and E (j1 , j2 ) is invariant under F (J) then E (j1 , j2 ) is a direct sum of orthogonal subspaces
E (J, k) each one of them invariant under the action of J2 , J3 , J±
E (j1 , j2 ) = Σ⊕ E (J, k)    (B.37)

the following questions arise, given a pair j1 and j2 : What are the values of J that contribute to the direct sum (B.37)? And how many subspaces E (J, k) are associated with a given J?
Since there is a known basis (B.36), it will be our starting point to arrive at the basis associated with J2 and J3 . Then the problem arises of expanding the eigenvectors of the basis we are looking for, which is associated with E (j1 , j2 ), in terms of
the eigenvectors of the known basis (B.36).
It is important to mention that if we have more angular momenta we can add the first two and then add the result to the third, and so on. This is possible because the addition algorithm is commutative and associative, as we shall see later.

B.3.4 Eigenvectors of J2 and J3 : Case of j1 = j2 = 1/2.


In this case, each space E1 and E2 contains only one invariant subspace since each one of them is associated with a fixed value
of j. The tensor product E = E1 ⊗ E2 is associated with only one subspace of the form E (j1 , j2 ) with j1 = j2 = 1/2.
According to the decomposition (B.37), the space E (1/2, 1/2) is the direct sum of subspaces of the type E (J, k) of dimension 2J + 1. Each one of these subspaces contains one and only one eigenvector of J3 associated with each of the values of M such that |M | ≤ J. We have seen in Sec. B.2.1 that M only takes the values 1, 0, −1, where the first and the third are
not degenerate while M = 0 is 2-fold degenerate. From this we conclude that:

1. Values of J > 1 are excluded. For example, for J = 2 to be possible, there would have to exist at least one eigenvector of J3 with M = 2. This is due to the fact that the theory of angular momentum says that for a given j the allowed values of m consist of all the integers or half-odd-integers that cover the interval −j ≤ m ≤ j in unit steps.
2. E (J = 1, k) appears only once (so that k is unique), because M = ±1 only appears once, so that M = ±1 is non-
degenerate.
3. E (J = 0, k) appears only once. This is because M = 0 is 2-fold degenerate, but one of the eigenvectors with M = 0 lies in the subspace with J = 1, such that only one eigenvector with M = 0 is associated with a subspace with J = 0.

Therefore, the 4-dimensional space E (1/2, 1/2) is decomposed in subspaces of the type E (J, k) according to Eq. (B.37) in the form
E (1/2, 1/2) = E (J = 1) ⊕ E (J = 0)
which are of dimension 3 and 1 respectively. We shall see now how to extend these conclusions to the general case.

B.3.5 Eigenvalues of J3 and their degeneracy: general case

Figure B.1: (a) Illustration of the addition rules for angular momenta in the general case. (b) Possible pairs of values of
(m, m′ ) = (m1 , m2 ) for the specific case j = j1 = 2, j ′ = j2 = 1. In both cases, the points associated with a given value of
M = m + m′ = m1 + m2 are located on a straight line with slope −1, depicted as a dashed line. We have supposed that j = j1 ≥ j ′ = j2 , so that the width of the rectangle is greater than or equal to its height.

Let us consider a subspace of the form E (j1 , j2 ) of dimension (2j1 + 1) (2j2 + 1). Let us assume that j1 and j2 are labelled
such that
j1 ≥ j2
the basis vectors {|j1 , j2 ; m1 , m2 i} of this subspace (that are constructed from the tensor product of the bases of the component
spaces) are already eigenvectors of J3
J3 |j1 , j2 ; m1 , m2 i = (J3(1) + J3(2) ) |j1 , j2 ; m1 , m2 i = (m1 + m2 ) |j1 , j2 ; m1 , m2 i ≡ M |j1 , j2 ; m1 , m2 i

so that the corresponding eigenvalue M is given by

M = m1 + m2 (B.38)

from which M takes the values


M = j1 + j2 , j1 + j2 − 1, j1 + j2 − 2, . . . , − (j1 + j2 ) (B.39)
We denote the degree of degeneracy of each M in the subspace E (j1 , j2 ) as gj1 ,j2 (M ). In order to find this degeneracy we shall use the following geometrical procedure: we plot a diagram in two dimensions, associating with each vector |j1 , j2 ; m1 , m2 i an ordered pair where the horizontal axis is associated with m1 and the vertical axis with m2

|j1 , j2 ; m1 , m2 i ≡ (m1 , m2 )

all points associated with these vectors are located on the border or in the interior of a rectangle whose vertices are at (j1 , j2 ) , (j1 , −j2 ) , (−j1 , −j2 ) and (−j1 , j2 ). Fig. B.1 represents the points associated with an arbitrary configuration (left) and a configuration with j1 = 2, j2 = 1 (right). If we start from a given point (vector) of the type P = (m1 , m2 ) it is clear that “neighbour” vectors of the

type P± ≡ (m1 ± 1, m2 ∓ 1) possess the same value of M = m1 + m2 , as long as the incremented and decremented values of m1 and m2 exist. When some of the incremented or decremented values do not exist, it means that the vector (m1 , m2 ) lies on one of the borders of the rectangle (or in a corner). For vectors P in the interior of the rectangle, both P+ and P− exist. Two neighbour points defined with this relation can be joined by a straight line of slope −1
slope = [(m2 ∓ 1) − m2 ] / [(m1 ± 1) − m1 ] = −1
In conclusion, all points joined by a dashed line of slope −1, as shown in Figs. B.1a and B.1b, correspond to all vectors with
the same given value of M = m1 + m2 . The number of points (vectors) joined by a line defines the degree of degeneracy
gj1 ,j2 (M ) of the associated value of M .
Let us consider the different values of M in descending order, Eq. (B.39). We shall observe the pattern of the dashed lines as M decreases. Starting with the maximum M = j1 + j2 , we see that this value is non-degenerate because the line that crosses it passes only through the upper right corner (the line is indeed a point in this case), whose coordinates are (j1 , j2 ).
We see then that
gj1 ,j2 (j1 + j2 ) = 1 (B.40)
for the following M = j1 + j2 − 1 the degeneracy is double (unless j1 and/or j2 vanish), since the corresponding line contains
the points (j1 , j2 − 1) and (j1 − 1, j2 ). Then
gj1 ,j2 (j1 + j2 − 1) = 2 (B.41)
The degeneracy increases by one unit for each decrease of M by one unit, until the lower right corner (j1 , −j2 ) of the rectangle is reached3 , which corresponds to the value M = j1 − j2 ≥ 0 since we have assumed that j1 ≥ j2 . For M = j1 − j2 , the number of points reaches its maximum (this number of points is a measure of the “height” of the rectangle) and gives
gj1 ,j2 (j1 − j2 ) = 2j2 + 1 (B.42)

if we continue decreasing M , the number of points remains constant at 2j2 + 1 as long as the line associated with M crosses the rectangle touching its upper (m2 = j2 ) and lower (m2 = −j2 ) sides. This occurs until the associated line reaches the upper left corner (−j1 , j2 ) of the rectangle, for which M = −j1 + j2 ≤ 0. Therefore, the maximum number of points 2j2 + 1 is maintained in an interval of M given by
gj1 ,j2 (M ) = 2j2 + 1   for   − (j1 − j2 ) ≤ M ≤ j1 − j2    (B.43)
finally, for values of M less than − (j1 − j2 ), the line associated with each M no longer intersects the upper side of the rectangle (m2 = j2 ), so that gj1 ,j2 (M ) decreases monotonically by one unit for each decrease of M by one unit, reaching the value 1 again when M = − (j1 + j2 ), corresponding to the lower left corner of the rectangle. Consequently
gj1 ,j2 (−M ) = gj1 ,j2 (M ) (B.44)
these results are summarized in Fig. B.2 for the case j1 = 2 and j2 = 1; the figure displays g2,1 (M ) as a function of M .
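The counting of gj1 ,j2 (M ) can also be done directly by enumerating the pairs (m1 , m2 ); a small sketch of our own (Python, not part of the notes) for the case j1 = 2, j2 = 1 of Fig. B.1b:

from collections import Counter

def degeneracy(j1, j2):
    """g_{j1,j2}(M): number of pairs (m1, m2) with m1 + m2 = M."""
    m1_vals = [j1 - k for k in range(int(2 * j1) + 1)]
    m2_vals = [j2 - k for k in range(int(2 * j2) + 1)]
    return Counter(m1 + m2 for m1 in m1_vals for m2 in m2_vals)

g = degeneracy(2, 1)
# g[3] = 1, g[2] = 2, g[1] = g[0] = g[-1] = 3 = 2*j2 + 1, g[-2] = 2, g[-3] = 1,
# reproducing Eqs. (B.40-B.44) and the profile plotted in Fig. B.2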

B.3.6 Eigenvalues of J2 : general case


From Eq. (B.39), we see that the values of M are integers if j1 and j2 are both integers or both half-odd-integers. In the same way, the values of M are half-odd-integers if one of the ji 's is integer and the other one is half-odd-integer. On the other hand, the general theory of angular momentum says that J is integer (half-odd-integer) if and only if M is integer (half-odd-integer). We can then distinguish two situations: (1) j1 and j2 are both of the same type (both integers or both half-odd-integers), (2) j1 and j2 are of different type. The first case leads to integer pairs (J, M ) and the second case to half-odd-integer pairs
(J, M ).
Since the maximum value of M is j1 + j2 , we have that values of J > j1 + j2 do not appear in E (j1 , j2 ) and therefore they do not appear in the direct sum (B.37). This is because, to get J > j1 + j2 , the corresponding value M = J would have to exist, according to the general theory of angular momentum. For J = j1 + j2 there is an invariant subspace
E (J = j1 + j2 ) associated because M = j1 + j2 exists. Further, this subspace is unique since M = j1 + j2 is non-degenerate. In
this subspace there is one and only one vector associated with M = j1 + j2 − 1, and since M = j1 + j2 − 1 is 2-fold degenerate
in E (j1 , j2 ), we have that J = j1 + j2 − 1 also appears and it has a unique invariant space E (J = j1 + j2 − 1) associated.
In a general context we denote as pj1 ,j2 (J) the number of subspaces E (J, k) of E (j1 , j2 ) associated with a given J. In other words, this is the number of different values of k for the given value of J (where j1 and j2 are fixed from the beginning).
We shall see that pj1 ,j2 (J) and gj1 ,j2 (M ) are related in a simple way. Let us consider a particular value of M . With this value of M there is associated one and only one vector in each subspace E (J, k), as long as J ≥ |M |. Its degree of degeneracy is then given by
gj1 ,j2 (M ) = pj1 ,j2 (J = |M |) + pj1 ,j2 (J = |M | + 1) + pj1 ,j2 (J = |M | + 2) + . . .
3 Since we are assuming j1 ≥ j2 , the lower right corner (j1 , −j2 ) is reached sooner than the upper left corner (−j1 , j2 ) in this sequence. At most, both corners might be reached simultaneously when j1 = j2 , in which case we have a square.

Figure B.2: Plot of the degree of degeneracy gj1 ,j2 (M ) versus M , for the case j1 = 2, j2 = 1 illustrated in Fig. B.1b. The degree of degeneracy is obtained by counting the number of points that touch each dashed line in Fig. B.1b. In addition,
this figure shows the symmetry expressed by Eq. (B.44).

Inverting this relation, we obtain pj1 ,j2 (J) in terms of gj1 ,j2 (M )

pj1 ,j2 (J) = gj1 ,j2 (M = J) − gj1 ,j2 (M = J + 1)


= gj1 ,j2 (M = −J) − gj1 ,j2 (M = −J − 1) (B.45)

it is worth emphasizing that in Eq. (B.45), J is fixed and the values of M are not associated with the fixed value of J, but
with all the allowed values of M in E (j1 , j2 ). For this reason, the values of gj1 ,j2 (M = J + 1) and gj1 ,j2 (M = −J − 1) can be
non-null.
Taking into account the degeneracy of the values of M studied in Sec. B.3.5, we can determine the values of the number J
that occur in E (j1 , j2 ) and the number of invariant subspaces E (J, k) associated with each one of them. First of all we have
that
pj1 ,j2 (J) = 0   for   J > j1 + j2
since gj1 ,j2 (M ) = 0 for |M | > j1 + j2 . If we now apply Eqs. (B.40, B.41) we find

pj1 ,j2 (J = j1 + j2 ) = gj1 ,j2 (M = j1 + j2 ) − gj1 ,j2 (M = j1 + j2 + 1)


pj1 ,j2 (J = j1 + j2 ) = gj1 ,j2 (M = j1 + j2 ) = 1

pj1 ,j2 (J = j1 + j2 − 1) = gj1 ,j2 (M = j1 + j2 − 1) − gj1 ,j2 (M = j1 + j2 ) = 2 − 1


pj1 ,j2 (J = j1 + j2 − 1) = 1

therefore, all the values of pj1 ,j2 (J) can be found by iteration

pj1 ,j2 (J = j1 + j2 − 2) = 1, . . . , pj1 ,j2 (J = j1 − j2 ) = 1

finally, applying Eq. (B.43) we have

pj1 ,j2 (J) = 0   for   J < j1 − j2 = |j1 − j2 |

the last equality is obtained recalling that we have maintained the assumption j1 ≥ j2 throughout the treatment. For the case
j2 ≥ j1 we only have to interchange the indices 1 and 2.
In conclusion, for fixed values of j1 and j2 , i.e. within a subspace E (j1 , j2 ) of dimension (2j1 + 1) (2j2 + 1), the eigenvalues
of J2 are such that
J = j1 + j2 , j1 + j2 − 1, j1 + j2 − 2, . . . , |j1 − j2 |
and each value of J is associated with a unique invariant subspace E (J, k) in the direct sum given by Eq. (B.37), so this
equation reduces to
E (j1 , j2 ) = Σ⊕_{J=|j1 −j2 |}^{j1 +j2 } E (J)    (B.46)

such that the index k is indeed unnecessary. This implies in particular that if we take a fixed value of J and a fixed value of M compatible with J (|M | ≤ J), there exists a unique vector |J, M i in E (j1 , j2 ) associated with these numbers. The specification
of J is enough to determine the invariant subspace, and the specification of M leads to a unique vector in such a subspace.
Consequently J2 and J3 form a C.S.C.O. in E (j1 , j2 ).
As a consistency check, we can show that the number N of pairs (J, M ) found for E (j1 , j2 ) coincides with the dimension
(2j1 + 1) (2j2 + 1) of E (j1 , j2 ), since the set {|J, M i} constitutes a basis for E (j1 , j2 ). Let us assume for simplicity that j1 ≥ j2 .
Since each subspace E (J) is of dimension 2J + 1 (i.e. contains 2J + 1 different values of M ), the direct sum (B.46) says that
N = Σ_{J=j1 −j2 }^{j1 +j2 } (2J + 1)    (B.47)

if we replace
J = j1 − j2 + i
in Eq. (B.47), we find

N = Σ_{J=j1 −j2 }^{j1 +j2 } (2J + 1) = Σ_{i=0}^{2j2 } [2 (j1 − j2 + i) + 1] = [2 (j1 − j2 ) + 1] Σ_{i=0}^{2j2 } 1 + 2 Σ_{i=0}^{2j2 } i
= [2 (j1 − j2 ) + 1] (2j2 + 1) + 2 · [2j2 (2j2 + 1)/2] = (2j1 − 2j2 + 1) (2j2 + 1) + 2j2 (2j2 + 1)
= [(2j1 − 2j2 + 1) + 2j2 ] (2j2 + 1) = (2j1 + 1) (2j2 + 1)
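These counting arguments are easy to verify numerically; the following sketch (ours, in Python, reusing the pair-counting idea of Sec. B.3.5) computes pj1 ,j2 (J) from Eq. (B.45) and checks the dimension identity just derived:

from collections import Counter

def degeneracy(j1, j2):
    """g_{j1,j2}(M): number of pairs (m1, m2) with m1 + m2 = M."""
    m1 = [j1 - k for k in range(int(2 * j1) + 1)]
    m2 = [j2 - k for k in range(int(2 * j2) + 1)]
    return Counter(a + b for a in m1 for b in m2)

def multiplicities(j1, j2):
    """p_{j1,j2}(J) = g(M = J) - g(M = J + 1), Eq. (B.45)."""
    g = degeneracy(j1, j2)
    Js = [j1 + j2 - k for k in range(int(2 * min(j1, j2)) + 1)]   # J = j1+j2, ..., |j1-j2|
    return {J: g[J] - g[J + 1] for J in Js}

j1, j2 = 2, 1.5
p = multiplicities(j1, j2)
assert all(v == 1 for v in p.values())                            # each allowed J appears exactly once
assert sum(2 * J + 1 for J in p) == (2 * j1 + 1) * (2 * j2 + 1)   # dimension check, Eq. (B.47)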

B.4 Eigenvectors common to J2 and J3


The “natural” basis of E (j1 , j2 ) is the basis of the tensor products between the bases of E (j1 ) and E (j2 ) denoted by
{|j1 , j2 , m1 , m2 i}. This is the basis of eigenvectors common to J2(1) , J3(1) , J2(2) , J3(2) . Now, the eigenvectors common to J2 , J3 , J2(1) , J2(2) will be denoted as |JM i. Strictly speaking, the notation should include the values j1 and j2 from which the tensor product comes. Nevertheless, this notation will be dropped since j1 and j2 are fixed in the whole process. For the same reason, the notation of the natural basis will be simplified by writing it in the form {|m1 , m2 i}. The two bases will be distinguished by a subscript, in the form |JM iJ and |m1 , m2 ij , when necessary. The transformation from the basis {|m1 , m2 i} to the basis {|JM i} should be realized through a unitary transformation, because both bases are orthonormal.
Since the set {|JM i} consists of eigenvectors common to J2 , J3 , J2(1) , J2(2) we have that

J2 |JM i = J (J + 1) |JM i ; J3 |JM i = M |JM i


J2(1) |JM i = j1 (j1 + 1) |JM i ; J2(2) |JM i = j2 (j2 + 1) |JM i

B.4.1 Special case j1 = j2 = 1/2


In Sec. B.2, we have found the eigenvectors |J, M i in E (1/2, 1/2) by means of the diagonalization of the matrix representations.
In this case we shall resort to the generation of the different vectors by means of the raising and lowering operators J± . The
advantage of this method is that it is easier to generalize and to manage when we have high values of angular momenta.
First of all, the vector |1/2, 1/2i ≡ |++i is the only eigenvector of J3 in E (1/2, 1/2) that corresponds to M = 1. Since J2
and J3 commute, and the value M = 1 is non-degenerate, theorem 3.17 page 51 says that |++i has to be an eigenvector of J2
as well. Following the reasoning of Sec. B.3.4, the corresponding value has to be J = 1. Therefore, we can choose the phase of the vector |J = 1, M = 1i so that it coincides with |++i
|1, 1i = |++i (B.48)
the other vectors of the triplet J = 1 are obtained by successive application of the operator J− as described in Sec. A.4.1.
Using Eq. (B.26), we then have
J− |1, 1i = √(1 (1 + 1) − 1 (1 − 1)) |1, 0i = √2 |1, 0i
from which we obtain
|1, 0i = (1/√2) J− |1, 1i = (1/√2) J− |++i
to calculate |1, 0i in terms of the original basis {|m1 , m2 i} it is enough to remember that
J− = J−(1) + J−(2)

from which
|1, 0i = (1/√2) (J−(1) + J−(2) ) |++i = (1/√2) (|−+i + |+−i)
|1, 0i = (1/√2) (|−+i + |+−i)    (B.49)
now we apply J− on |1, 0i to obtain the last element |1, −1i of the triplet.

J− |1, 0i = √2 |1, −1i    (B.50)

combining Eqs. (B.49, B.50) we get


|1, −1i = (1/√2) J− |1, 0i = (1/√2) (J−(1) + J−(2) ) (1/√2) (|−+i + |+−i)
= (1/2) [(J−(1) + J−(2) ) |−+i + (J−(1) + J−(2) ) |+−i] = (1/2) [J−(2) |−+i + J−(1) |+−i]
= (1/2) [|−−i + |−−i]
|1, −1i = |−−i

note that the vector |−−i could have been obtained by an argument similar to the one used to find |++i, since the vector with M = −1 is non-degenerate, as in the case of the vector with M = 1. The previous procedure has nevertheless the advantage of showing the general algorithm and, in addition, it permits adjusting the phase conventions that could appear in |1, 0i and |1, −1i. There are two places in the procedure in which the phases are fixed: in Eq. (B.48) an arbitrary phase can be introduced, and in Eqs. (B.26) for J± phases depending on m can be introduced.
Finally, we shall find the singlet vector |J = 0, M = 0i, which is the only vector of the one-dimensional subspace E (J = 0). It can be found, up to a constant phase, by imposing the condition of being orthonormal to the triplet.
Being orthogonal to |1, 1i = |++i and to |1, −1i = |−−i, the vector |0, 0i must be a linear combination of |+−i and |−+i

|0, 0i = α |+−i + β |−+i (B.51)


h0, 0 |0, 0i = |α|² + |β|² = 1    (B.52)

where we have added the normalization condition. Taking into account that |0, 0i must also be orthogonal to |1, 0i, Eqs.
(B.49, B.51) yield
h1, 0 |0, 0i = (1/√2) [h−+| + h+−|] [α |+−i + β |−+i] = 0
⇒ α h−+| + −i + β h−+| − +i + α h+−| + −i + β h+−| − +i = 0
β+α = 0 (B.53)

combining Eqs. (B.52, B.53) we have

α = −β ⇒ |α|^2 = |β|^2 ⇒ 2 |α|^2 = 1 ⇒ |α| = 1/√2

from which

α = −β = (1/√2) e^{iχ}

where χ is any real number. Choosing χ = 0, we have

|0, 0i = (1/√2) [|+−i − |−+i]
It is important to observe that with this method it was not necessary to resort to the matrix representations of the operators,
in particular that of J2 (which was the operator that had to be diagonalized in the earlier approach).
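The ladder-operator construction above can also be checked numerically. The following sketch is ours and not part of the original derivation; it assumes NumPy is available and uses units with ℏ = 1. The Kronecker products implement J(1)− and J(2)− acting on their respective factors of the tensor product, and the script reproduces Eqs. (B.48), (B.49) and the singlet.

import numpy as np

def jminus(j):
    """Matrix of J- in the basis |j, m>, ordered m = j, j-1, ..., -j (hbar = 1)."""
    dim = int(round(2 * j)) + 1
    m = j - np.arange(dim)
    op = np.zeros((dim, dim))
    for k in range(dim - 1):
        # J- |j, m> = sqrt(j(j+1) - m(m-1)) |j, m-1>, Eqs. (B.26)
        op[k + 1, k] = np.sqrt(j * (j + 1) - m[k] * (m[k] - 1))
    return op

# product basis ordered |++>, |+->, |-+>, |-->
Jm = np.kron(jminus(0.5), np.eye(2)) + np.kron(np.eye(2), jminus(0.5))

ket_pp = np.array([1.0, 0.0, 0.0, 0.0])                  # |1, 1> = |++>, Eq. (B.48)
ket_10 = Jm @ ket_pp / np.sqrt(2)                        # Eq. (B.49): (|+-> + |-+>)/sqrt(2)
ket_1m1 = Jm @ ket_10 / np.sqrt(2)                       # |1, -1> = |-->
ket_00 = np.array([0.0, 1.0, -1.0, 0.0]) / np.sqrt(2)    # singlet, orthogonal to |1, 0>

print(ket_10)                # [0.  0.7071  0.7071  0. ]
print(ket_1m1)               # [0.  0.  0.  1. ]
print(abs(ket_10 @ ket_00))  # 0.0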

B.5 Eigenvectors of J2 and J3 : general case


We have seen in Sec. B.3.6, Eq. (B.46) that the decomposition of E (j1 , j2 ) as a direct sum of invariant subspaces E (J) is
given by
E (j1 , j2 ) = E (j1 + j2 ) ⊕ E (j1 + j2 − 1) ⊕ . . . ⊕ E (|j1 − j2 |) (B.54)
we shall now determine the vectors |J, M i for each one of these subspaces.

B.5.1 Determination of the vectors |JMi of the subspace E (j1 + j2 )


The vector |m1 = j1 , m2 = j2 i is the only eigenvector of J3 in E (j1 , j2 ) with M = j1 + j2 . Since J2 and J3 commute and
M = j1 + j2 is non-degenerate, theorem 3.17 page 51 says that |m1 = j1 , m2 = j2 i must also be an eigenvector of J2 . According
to Eq. (B.54), the associated value of J can only be J = j1 + j2 . We can choose the phase factor in such a way that

|J = j1 + j2 , M = j1 + j2 i = |m1 = j1 , m2 = j2 i

which we also denote as


|j1 + j2 , j1 + j2 iJ = |j1 , j2 ij (B.55)
Successive application of J− allows us to find all vectors of the type |J, M i associated with J = j1 + j2 . Applying Eqs. (B.26),
we have

J− |j1 + j2 , j1 + j2 iJ = √[2 (j1 + j2 )] |j1 + j2 , j1 + j2 − 1iJ
|j1 + j2 , j1 + j2 − 1iJ = (1/√[2 (j1 + j2 )]) J− |j1 + j2 , j1 + j2 iJ     (B.56)

to write the vector |j1 + j2 , j1 + j2 − 1iJ in terms of the original basis |m1 , m2 ij , we should write the RHS of Eq. (B.56) in
the original basis, for which we take into account that J− = J(1)− + J(2)− and that |j1 + j2 , j1 + j2 iJ = |j1 , j2 ij ; from which
Eq. (B.56) reads

|j1 + j2 , j1 + j2 − 1iJ = (J(1)− + J(2)−) |j1 , j2 ij / √[2 (j1 + j2 )] = [√(2j1 ) |j1 − 1, j2 ij + √(2j2 ) |j1 , j2 − 1ij ] / √[2 (j1 + j2 )]
obtaining finally

|j1 + j2 , j1 + j2 − 1iJ = √[j1 /(j1 + j2 )] |j1 − 1, j2 ij + √[j2 /(j1 + j2 )] |j1 , j2 − 1ij     (B.57)

Note that the linear combination of the original basis vectors that forms |j1 + j2 , j1 + j2 − 1iJ is automatically normalized.
To obtain |j1 + j2 , j1 + j2 − 2iJ , we apply J− on both sides of Eq. (B.57), writing such an operator as J− = J(1)− + J(2)− on
the RHS of that equation. We can repeat this procedure systematically, until we arrive at the vector |j1 + j2 , − (j1 + j2 )iJ ,
which coincides with |−j1 , −j2 ij by an argument similar to the one that led us to Eq. (B.55), since M = − (j1 + j2 ) is
non-degenerate as well.
When this process ends, we have found all 2 (j1 + j2 ) + 1 vectors of the form |J = j1 + j2 , M i, and they span the
subspace E (J = j1 + j2 ) of E (j1 , j2 ).
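As a quick illustration (ours, not part of the notes; NumPy assumed, ℏ = 1), the same Kronecker-sum construction used in Sec. B.4.1 reproduces Eq. (B.57) for j1 = 1, j2 = 1/2.

import numpy as np

def jminus(j):                               # same helper as in the sketch of Sec. B.4.1
    dim = int(round(2 * j)) + 1
    m = j - np.arange(dim)
    op = np.zeros((dim, dim))
    for k in range(dim - 1):
        op[k + 1, k] = np.sqrt(j * (j + 1) - m[k] * (m[k] - 1))
    return op

j1, j2 = 1.0, 0.5
d1, d2 = int(2 * j1) + 1, int(2 * j2) + 1
Jm = np.kron(jminus(j1), np.eye(d2)) + np.kron(np.eye(d1), jminus(j2))

top = np.zeros(d1 * d2); top[0] = 1.0        # |j1 + j2, j1 + j2> = |j1, j2>, Eq. (B.55)
v = Jm @ top / np.sqrt(2 * (j1 + j2))        # Eq. (B.56)
print(v)
# component sqrt(2/3) on |j1 - 1, j2> = |0, 1/2> and sqrt(1/3) on |j1, j2 - 1> = |1, -1/2>,
# exactly as predicted by Eq. (B.57)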

B.5.2 Determination of the vectors |JMi in the other subspaces


Let us now define the supplement (or orthogonal complement) G (j1 + j2 ) of the subspace E (j1 + j2 ) in E (j1 , j2 ). According
to Eq. (B.54), such an orthogonal complement is given by

G (j1 + j2 ) = E (j1 + j2 − 1) ⊕ E (j1 + j2 − 2) ⊕ . . . ⊕ E (|j1 − j2 |)

and we apply to G (j1 + j2 ) an analysis similar to the one carried out in Sec. B.5.1 for E (j1 + j2 ).
In G (j1 + j2 ) the degree of degeneracy g′j1,j2 (M ) of a given value of M is less by one unit than the degeneracy in the whole
space E (j1 , j2 )

g′j1,j2 (M ) = gj1,j2 (M ) − 1     (B.58)

This is because E (j1 + j2 ) possesses one and only one vector associated with each accessible value of M in E (j1 , j2 ): for each
M in the interval − (j1 + j2 ) ≤ M ≤ j1 + j2 there is one and only one vector in E (j1 + j2 ). In particular, the value M = j1 + j2
no longer exists in G (j1 + j2 ), and therefore the maximum value of M in G (j1 + j2 ) is M = j1 + j2 − 1; since this value was
2-fold degenerate in E (j1 , j2 ), it is non-degenerate in G (j1 + j2 ). By arguments similar to the ones in Sec. B.5.1, the vector
associated with M = j1 + j2 − 1 in this subspace must be proportional to |J = j1 + j2 − 1, M = j1 + j2 − 1i. Now we want to
find its expansion in terms of the basis {|m1 , m2 i}. By virtue of the value M = j1 + j2 − 1, such an expansion must have
the form
|j1 + j2 − 1, j1 + j2 − 1iJ = α |j1 , j2 − 1ij + β |j1 − 1, j2 ij ; |α|2 + |β|2 = 1 (B.59)
where we have also demanded normalization. Additionally, this state must be orthogonal to |j1 + j2 , j1 + j2 − 1iJ ∈ E (j1 + j2 ),
i.e. to the vector of the orthogonal complement of G (j1 + j2 ) with the same value M = j1 + j2 − 1. Using expressions (B.57,
B.59) for these vectors, such orthogonality is written as

J hj1 + j2 , j1 + j2 − 1 |j1 + j2 − 1, j1 + j2 − 1iJ = 0

[√(j1 /(j1 + j2 )) j hj1 − 1, j2 | + √(j2 /(j1 + j2 )) j hj1 , j2 − 1|] [α |j1 , j2 − 1ij + β |j1 − 1, j2 ij ] = 0

β √(j1 /(j1 + j2 )) j hj1 − 1, j2 | j1 − 1, j2 ij + α √(j2 /(j1 + j2 )) j hj1 , j2 − 1| j1 , j2 − 1ij = 0

β √(j1 /(j1 + j2 )) + α √(j2 /(j1 + j2 )) = 0     (B.60)

the normalization condition (B.59) along with Eq. (B.60) permits us to find α and β up to a phase factor. Choosing α real
and positive, Eq. (B.60) says that β is real and takes the value

β = −α √(j2 /j1 ) ⇒ α^2 + β^2 = α^2 (1 + j2 /j1 ) = α^2 (j1 + j2 )/j1 = 1
α = √[j1 /(j1 + j2 )] ; β = −α √(j2 /j1 ) = −√[j2 /(j1 + j2 )]

With which Eq. (B.59) becomes

|j1 + j2 − 1, j1 + j2 − 1iJ = √[j1 /(j1 + j2 )] |j1 , j2 − 1ij − √[j2 /(j1 + j2 )] |j1 − 1, j2 ij     (B.61)

this is the first vector of a new family characterized by J = j1 + j2 − 1, analogous to the vector associated with J = j1 + j2
in Sec. B.5.1. The other vectors of this new family can be generated by successive application of the operator J− . In this way,
we obtain [2 (j1 + j2 − 1) + 1] vectors of the type |J = j1 + j2 − 1, M i where J and M take the values

J = j1 + j2 − 1 ; M = j1 + j2 − 1, j1 + j2 − 2, . . . , − (j1 + j2 − 1)

these vectors span the subspace E (j1 + j2 − 1).


Now, if j1 + j2 − 2 ≥ |j1 − j2 | we can form the orthogonal complement of the direct sum E (j1 + j2 ) ⊕ E (j1 + j2 − 1) in the
space E (j1 , j2 )
G (j1 + j2 , j1 + j2 − 1) = E (j1 + j2 − 2) ⊕ E (j1 + j2 − 3) ⊕ . . . ⊕ E (|j1 − j2 |)
in the supplement G (j1 + j2 , j1 + j2 − 1), the degeneracy of each value of M decreases by one unit with respect to its
degeneracy in the previous orthogonal complement G (j1 + j2 ). In particular, the maximum value of M is now M = j1 + j2 − 2
and it is non-degenerate. The associated vector in G (j1 + j2 , j1 + j2 − 1) will be |J = j1 + j2 − 2, M = j1 + j2 − 2i.
To calculate |j1 + j2 − 2, j1 + j2 − 2iJ in terms of the basis |m1 , m2 i, it is enough to observe that it must be a linear
combination of three vectors

|j1 + j2 − 2, j1 + j2 − 2iJ = α1 |j1 , j2 − 2ij + α2 |j1 − 1, j2 − 1ij + α3 |j1 − 2, j2 ij (B.62)

the three coefficients are fixed up to a phase factor by the condition of normalization and by the orthogonality with the (already
known) vectors |j1 + j2 , j1 + j2 − 2i and |j1 + j2 − 1, j1 + j2 − 2i, that is, the vectors in the orthogonal complement of
G (j1 + j2 , j1 + j2 − 1) with the same value M = j1 + j2 − 2. Once the coefficients in (B.62) are determined, we can find
the other vectors of this third family by successive application of J− . These vectors span E (j1 + j2 − 2).
The procedure can be repeated until we cover all values of M greater than or equal to |j1 − j2 |, and by virtue of Eq. (B.44),
also all values of M less than or equal to − |j1 − j2 |. In this way, we have determined all vectors {|J, M i} in terms of the original
basis {|m1 , m2 i}; a numerical sketch of this algorithm is given below.
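The whole procedure of Secs. B.5.1 and B.5.2 (ladder down within each family, then pass to the orthogonal complement to start the next family) can be summarized in a short numerical sketch. The code below is only an illustration under stated assumptions (NumPy available, ℏ = 1, real arithmetic); it is not taken from the notes.

import numpy as np

def jminus(j):
    """Matrix of J- in the basis |j, m>, m = j, ..., -j (hbar = 1), Eqs. (B.26)."""
    dim = int(round(2 * j)) + 1
    m = j - np.arange(dim)
    op = np.zeros((dim, dim))
    for k in range(dim - 1):
        op[k + 1, k] = np.sqrt(j * (j + 1) - m[k] * (m[k] - 1))
    return op

def coupled_basis(j1, j2):
    """Return a dict (J, M) -> expansion of |J, M> in the product basis |m1, m2>."""
    d1, d2 = int(round(2 * j1)) + 1, int(round(2 * j2)) + 1
    Jm = np.kron(jminus(j1), np.eye(d2)) + np.kron(np.eye(d1), jminus(j2))
    # value of M = m1 + m2 for each product-basis vector (m1 varies slowest)
    m1 = np.repeat(j1 - np.arange(d1), d2)
    m2 = np.tile(j2 - np.arange(d2), d1)
    Mtot = m1 + m2
    states, built = {}, []
    J = j1 + j2
    while J >= abs(j1 - j2) - 1e-9:
        # highest state of the family: the unit vector of the M = J sector orthogonal
        # to every state built so far (Sec. B.5.2); for J = j1 + j2 it is |j1, j2>
        sector = np.where(np.abs(Mtot - J) < 1e-9)[0]
        v = np.zeros(d1 * d2)
        v[sector[0]] = 1.0                    # sector[0] has the largest m1
        for w in built:
            v -= (w @ v) * w
        v /= np.linalg.norm(v)
        if v[sector[0]] < 0:                  # sign convention <j1, J - j1 | J, J> > 0
            v = -v
        M = J
        while True:
            states[(J, M)] = v
            built.append(v)
            if M <= -J + 1e-9:
                break
            v = Jm @ v / np.sqrt(J * (J + 1) - M * (M - 1))   # Eqs. (B.26)
            M -= 1
        J -= 1
    return states

# the singlet of two spin-1/2's, Sec. B.4.1:
print(coupled_basis(0.5, 0.5)[(0.0, 0.0)])    # ~ [0, 0.7071, -0.7071, 0]

The coefficients hm1 , m2 | J, M i that can be read off from these vectors are precisely the Clebsch-Gordan coefficients introduced in Appendix C.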
Appendix C

Transformation from the decoupled basis to the coupled basis and Clebsch-Gordan coefficients in SO (3)

In the space E (j1 , j2 ), the eigenvectors common to J2(1) , J(1)3 , J2(2) , J(2)3 , which we denote (in complete notation) as
{|j1 , j2 ; m1 , m2 i}, form an orthonormal basis known as the “decoupled” basis, in the sense that this is the basis that arises
directly from the direct product of spaces. On the other hand, the eigenvectors common to J2 , J3 , J2(1) , J2(2) , which we denote
(in complete notation) as {|j1 , j2 ; J, M i}, form an orthonormal basis known as the “coupled” basis, because this is the canonical
basis that spans the invariant irreducible subspaces under SO (3).
The transformation that carries us from the decoupled basis to the coupled one must be unitary since it is a transformation
between two orthonormal bases. This unitary transformation is written easily by using the completeness of the decoupled basis
|j1 , j2 ; J, M i = Σ_{m1 =−j1}^{j1} Σ_{m2 =−j2}^{j2} |j1 , j2 ; m1 , m2 i hj1 , j2 ; m1 , m2 | J, M i     (C.1)

changing the notation slightly, the coefficients of the expansion read

hj1 , j2 ; m1 , m2 | J, M i ≡ hm1 , m2 (j1 , j2 ) J, M i (C.2)

from which the expansion (C.1) is written as


|j1 , j2 ; J, M i = Σ_{m1 =−j1}^{j1} Σ_{m2 =−j2}^{j2} |j1 , j2 ; m1 , m2 i hm1 , m2 (j1 , j2 ) J, M i     (C.3)

the coefficients hm1 , m2 (j1 , j2 ) J, M i of this expansion, which are the elements of the unitary transformation matrix, are
called the Clebsch-Gordan coefficients. The labels on the left-hand side indicate a vector of the decoupled basis, while the
labels on the right-hand side indicate a vector of the coupled basis; the labels (j1 , j2 ) in the middle tell us which angular
momenta j1 and j2 are being coupled. It is worth pointing out that the original notation |j1 , j2 ; m1 , m2 ; k1 , k2 i , |j1 , j2 ; J, M ; k1 , k2 i
for the bases is not necessary, since the inner products are independent of k1 and k2 , and within the space E (j1 , j2 ) the k ′ s
take a single value, such that within this subspace this label does not discriminate among its vectors.
Of course, the inverse of relation (C.3) can be obtained using the completeness of the coupled basis

|j1 , j2 ; m1 , m2 i = Σ_{J=|j1 −j2 |}^{j1 +j2} Σ_{M =−J}^{J} |J, M i hJ, M |j1 , j2 ; m1 , m2 i ≡ Σ_{J=|j1 −j2 |}^{j1 +j2} Σ_{M =−J}^{J} |J, M i hJ, M (j1 , j2 ) m1 , m2 i     (C.4)

In summary, the C-G coefficients determine the unitary transformation from the decoupled basis to the coupled basis and
vice versa.
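For concreteness, the expansion (C.3) can be evaluated with a computer-algebra system. The snippet below assumes SymPy is installed and that sympy.physics.wigner.clebsch_gordan(j1, j2, J, m1, m2, M) returns hm1 , m2 (j1 , j2 ) J, M i in the Condon-Shortley convention (which, to the best of our knowledge, is the convention SymPy follows); it lists the coefficients of |j1 = 1, j2 = 1; J = 1, M = 0i in the decoupled basis.

from sympy.physics.wigner import clebsch_gordan

j1 = j2 = 1
J, M = 1, 0
for m1 in range(-j1, j1 + 1):
    m2 = M - m1                      # only m2 = M - m1 contributes, Eq. (C.10)
    if abs(m2) <= j2:
        print(m1, m2, clebsch_gordan(j1, j2, J, m1, m2, M))
# expected output: -sqrt(2)/2, 0 and sqrt(2)/2 for m1 = -1, 0, 1 respectively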

C.1 Properties of the Clebsch-Gordan coefficients for SO (3)


General expressions for Clebsch-Gordan coefficients are not easy to obtain. They can be generated by the algorithm explained
in previous sections. Additionally, there are numerical tables of these coefficients. For example, Eqs. (B.55, B.57, B.61) permit


us to find some Clebsch-Gordan coefficients


hj1 , j2 (j1 , j2 ) j1 + j2 , j1 + j2 i = 1     (C.5)
hj1 − 1, j2 (j1 , j2 ) j1 + j2 , j1 + j2 − 1i = √[j1 /(j1 + j2 )]     (C.6)
hj1 , j2 − 1 (j1 , j2 ) j1 + j2 , j1 + j2 − 1i = √[j2 /(j1 + j2 )]     (C.7)
hj1 , j2 − 1 (j1 , j2 ) j1 + j2 − 1, j1 + j2 − 1i = √[j1 /(j1 + j2 )]     (C.8)
hj1 − 1, j2 (j1 , j2 ) j1 + j2 − 1, j1 + j2 − 1i = −√[j2 /(j1 + j2 )]     (C.9)

In order to determine these coefficients uniquely, Eq. (C.3) is not enough: we should choose certain phase conventions.
However, some phase-independent properties are needed in order to fix those phases properly, so we deal with the
phase-independent properties first.

C.1.1 Selection rules


First of all, the rules of addition that we have obtained in appendix B show that these coefficients must obey certain selection
rules: the coefficient hj1 , j2 ; m1 , m2 | J, M i is different from zero only if
M = m1 + m2 and |j1 − j2 | ≤ J ≤ j1 + j2 (C.10)
where J must be of the same type (integer or half-odd-integer) as the values j1 + j2 and |j1 − j2 |. The second condition
in (C.10) is usually known as the “triangle rule” because it expresses the fact that if the condition is satisfied, it must be
possible to form a triangle with three segments of length j1 , j2 and J. In other words, the second equation (C.10) expresses the
well-known theorem which says that a side J of any triangle is less than the sum of the other two sides, and greater than their
difference. Consequently, the three numbers play a symmetrical role such that the second of Eqs. (C.10) can be rewritten as
|J − j1 | ≤ j2 ≤ J + j1 or |J − j2 | ≤ j1 ≤ J + j2 (C.11)
Moreover, from the general properties of angular momentum, the vector |J, M i and hence the C-G coefficient hm1 , m2 (j1 , j2 ) J, M i
exists only if M takes on one of the values
M = J, J − 1, J − 2, . . . , − J (C.12)
and it is also necessary that
m1 = j1 , j1 − 1, j1 − 2, . . . , − j1 and m2 = j2 , j2 − 1, j2 − 2, . . . , − j2 (C.13)
otherwise, the C-G coefficients are not defined. However, for many purposes it is better to assume that they exist for all
M, m1 , m2 but are null when the conditions above are not satisfied simultaneously. From this point of view, Eqs. (C.10, C.12,
C.13) can be seen as selection rules for the C-G coefficients.
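These selection rules are easy to encode. The following helper is a sketch of ours (the function name and the use of Python fractions are assumptions, not part of the notes); it returns True exactly when hm1 , m2 (j1 , j2 ) J, M i is allowed to be non-zero by Eqs. (C.10)-(C.13).

from fractions import Fraction as F

def cg_allowed(j1, j2, m1, m2, J, M):
    """True only when <m1, m2 (j1, j2) J, M> may be non-zero, Eqs. (C.10)-(C.13)."""
    def is_int(x):
        return x == int(x)
    return (M == m1 + m2                        # first of Eqs. (C.10)
            and abs(j1 - j2) <= J <= j1 + j2    # triangle rule
            and is_int(j1 + j2 - J)             # J of the same type as j1 + j2
            and abs(m1) <= j1 and is_int(j1 - m1)
            and abs(m2) <= j2 and is_int(j2 - m2)
            and abs(M) <= J and is_int(J - M))

print(cg_allowed(F(1, 2), F(1, 2), F(1, 2), F(-1, 2), 1, 0))   # True
print(cg_allowed(F(1, 2), F(1, 2), F(1, 2), F(1, 2), 1, 0))    # False: M != m1 + m2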

C.1.2 Unitarity of the transformation


A reducible invariant subspace E (j1 , j2 ; k1 , k2 ) under SO (3), has a natural basis of the form
{|j1 , j2 ; m1 , m2 i : −j1 ≤ m1 ≤ j1 , −j2 ≤ m2 ≤ j2 }
so that a completeness or closure relation in such a space reads

Σ_{m1 =−j1}^{j1} Σ_{m2 =−j2}^{j2} |j1 , j2 ; m1 , m2 i hj1 , j2 ; m1 , m2 | = 1

inserting this closure relation in the orthonormality relation


hJ, M | J ′ , M ′ i = δJJ ′ δMM ′
and using our notation for the Clebsch-Gordan coefficients yields

Σ_{m1 =−j1}^{j1} Σ_{m2 =−j2}^{j2} hJ, M (j1 , j2 ) m1 , m2 ihm1 , m2 (j1 , j2 ) J ′ , M ′ i = δJJ ′ δMM ′     (C.14)

This equation is the manifestation that the Clebsch-Gordan (C-G) coefficients are elements of a unitary matrix, since this
matrix transforms an orthonormal basis into another orthonormal basis. Assuming that the C-G coefficients can be chosen as
real (as we shall see later), the condition

hJ, M (j1 , j2 ) m1 , m2 i = hm1 , m2 (j1 , j2 ) J, M i     (C.15)

is satisfied and the matrix becomes real orthogonal. In that case, Eq. (C.14) becomes

Σ_{m1 =−j1}^{j1} Σ_{m2 =−j2}^{j2} hm1 , m2 (j1 , j2 ) J, M ihm1 , m2 (j1 , j2 ) J ′ , M ′ i = δJJ ′ δMM ′     (C.16)

which is a first orthogonality relation for the C-G coefficients. It is worth emphasizing that the summation on the LHS runs
effectively over only one index, since the coefficient is non-null only if m1 and m2 are related by the first of Eqs. (C.10). The
second orthogonality relation appears when we apply the closure relation for a subspace E (j1 , j2 ; k1 , k2 ) with respect to the
coupled basis

Σ_{J=|j1 −j2 |}^{j1 +j2} Σ_{M =−J}^{J} |J, M i hJ, M | = 1

and insert it into the orthonormality relation for |j1 , j2 , m1 , m2 i


Σ_{J=|j1 −j2 |}^{j1 +j2} Σ_{M =−J}^{J} hm1 , m2 (j1 , j2 ) J, M ihJ, M (j1 , j2 ) m′1 , m′2 i = δm1 ,m′1 δm2 ,m′2

assuming real C-G coefficients again, we have


Σ_{J=|j1 −j2 |}^{j1 +j2} Σ_{M =−J}^{J} hm1 , m2 (j1 , j2 ) J, M ihm′1 , m′2 (j1 , j2 ) J, M i = δm1 ,m′1 δm2 ,m′2

once again, by virtue of the first of Eqs. (C.10), the summation is performed effectively over only one index.
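Both orthogonality relations can be spot-checked numerically. The example below is ours and assumes SymPy's clebsch_gordan (same caveats as before); it verifies Eq. (C.16) for j1 = 1, j2 = 1/2.

from sympy import S, simplify
from sympy.physics.wigner import clebsch_gordan

j1, j2 = S(1), S(1)/2

def m_values(j):
    return [j - k for k in range(int(2 * j) + 1)]

pairs = [((S(3)/2, S(1)/2), (S(3)/2, S(1)/2)),   # same (J, M): sum should be 1
         ((S(3)/2, S(1)/2), (S(1)/2, S(1)/2))]   # different J: sum should be 0
for (J, M), (Jp, Mp) in pairs:
    total = sum(clebsch_gordan(j1, j2, J, m1, m2, M) *
                clebsch_gordan(j1, j2, Jp, m1, m2, Mp)
                for m1 in m_values(j1) for m2 in m_values(j2))
    print(J, M, Jp, Mp, simplify(total))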

C.1.3 Recurrence relations


Since J(1) , J(2) , and J ≡ J(1) + J(2) are all angular momenta, they satisfy Eqs. (B.26)

J(1)± |j1 , j2 ; m1 , m2 i = √[j1 (j1 + 1) − m1 (m1 ± 1)] |j1 , j2 ; m1 ± 1, m2 i
J(2)± |j1 , j2 ; m1 , m2 i = √[j2 (j2 + 1) − m2 (m2 ± 1)] |j1 , j2 ; m1 , m2 ± 1i
J± |J, M i = √[J (J + 1) − M (M ± 1)] |J, M ± 1i

then applying J− = J(1)− + J(2)− on both sides of Eq. (C.3) we have (if M > −J)

J− |J, M i = Σ_{m′1 =−j1}^{j1} Σ_{m′2 =−j2}^{j2} (J(1)− + J(2)−) |j1 , j2 ; m′1 , m′2 i hm′1 , m′2 (j1 , j2 ) J, M i

√[J (J + 1) − M (M − 1)] |J, M − 1i = Σ_{m′1 =−j1}^{j1} Σ_{m′2 =−j2}^{j2} { √[j1 (j1 + 1) − m′1 (m′1 − 1)] |j1 , j2 ; m′1 − 1, m′2 i
    + √[j2 (j2 + 1) − m′2 (m′2 − 1)] |j1 , j2 ; m′1 , m′2 − 1i } hm′1 , m′2 (j1 , j2 ) J, M i     (C.17)

and multiplying this relation by the bra hj1 , j2 ; m1 , m2 | we have

√[J (J + 1) − M (M − 1)] hm1 , m2 (j1 , j2 ) J, M − 1i = Σ_{m′1 =−j1}^{j1} Σ_{m′2 =−j2}^{j2} { δm1 ,m′1 −1 δm2 ,m′2 √[j1 (j1 + 1) − m′1 (m′1 − 1)]
    + δm1 ,m′1 δm2 ,m′2 −1 √[j2 (j2 + 1) − m′2 (m′2 − 1)] } hm′1 , m′2 (j1 , j2 ) J, M i

and finally

√[J (J + 1) − M (M − 1)] hm1 , m2 (j1 , j2 ) J, M − 1i = √[j1 (j1 + 1) − m1 (m1 + 1)] hm1 + 1, m2 (j1 , j2 ) J, M i
    + √[j2 (j2 + 1) − m2 (m2 + 1)] hm1 , m2 + 1 (j1 , j2 ) J, M i     (C.18)

similarly, by applying J+ = J(1)+ + J(2)+ on both sides of Eq. (C.3) we have (if M < J)

√[J (J + 1) − M (M + 1)] hm1 , m2 (j1 , j2 ) J, M + 1i = √[j1 (j1 + 1) − m1 (m1 − 1)] hm1 − 1, m2 (j1 , j2 ) J, M i
    + √[j2 (j2 + 1) − m2 (m2 − 1)] hm1 , m2 − 1 (j1 , j2 ) J, M i     (C.19)

Equations (C.18, C.19) are recurrence relations for the Clebsch-Gordan coefficients. If M = ±J we have J± |J, ±Ji = 0, and
Eqs. (C.18, C.19) remain valid provided we use the selection rule according to which the C-G coefficient is zero if |M | > J.
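As a consistency check (ours; SymPy assumed, same convention caveat as above), one can verify the recurrence (C.18) for a particular allowed set of quantum numbers.

from sympy import S, sqrt, simplify
from sympy.physics.wigner import clebsch_gordan

j1, j2, J = S(3)/2, S(1), S(3)/2
M, m1, m2 = S(1)/2, S(1)/2, -S(1)    # note m1 + m2 = M - 1, as required by (C.18)
lhs = sqrt(J*(J + 1) - M*(M - 1)) * clebsch_gordan(j1, j2, J, m1, m2, M - 1)
rhs = (sqrt(j1*(j1 + 1) - m1*(m1 + 1)) * clebsch_gordan(j1, j2, J, m1 + 1, m2, M)
       + sqrt(j2*(j2 + 1) - m2*(m2 + 1)) * clebsch_gordan(j1, j2, J, m1, m2 + 1, M))
print(simplify(lhs - rhs))           # 0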

C.1.4 Phase conventions


We have seen that Eq. (C.3) is not sufficient to determine the C-G coefficients completely; we should choose certain phase
conventions, which are usually fixed in such a way that these coefficients are real. The choice of certain phases determines the
sign of some coefficients. Of course, the relative signs of the coefficients that appear in the expansion of the same vector |J, M i
are fixed; only the global sign can be chosen arbitrarily. First of all, the phase of the normalized vector |J, M i must be chosen:
the relative phase of the 2J + 1 vectors |J, M i associated with the same value of J is fixed through the action of the operators
J± described by Eqs. (B.26)¹, but we still have to choose the phase of one of the vectors in the set (say |J, Ji) to determine
them completely.
In particular, Eq. (C.17) can be used to fix the relative phases of the vectors |J, M i associated with the same value of J.
We complete the definition of the C-G coefficients in Eq. (C.3) by fixing the phases of the vectors |J, Ji. For this purpose, we
shall study some properties of the coefficients hm1 , m2 (j1 , j2 ) J, Ji.
In the coefficient hm1 , m2 (j1 , j2 ) J, Ji the maximum value of m1 is m1^max = j1 . Now, the first of Eqs. (C.10) with M = J
shows that in this case m2 acquires its minimum value m2^min = J − j1 . As m1 decreases from j1 one unit at a time, the
associated m2 increases one unit at a time until m2 reaches its maximum value m2^max = j2 ; in the latter case m1 acquires its
minimum value m1^min = J − j2 . Note in particular that from Eq. (C.11) we have |J − j1 | ≤ j2 , or equivalently m2^min ≤ m2^max ;
an analogous relation for m1 is obtained from Eq. (C.11)

m1^min ≤ m1^max ; m2^min ≤ m2^max

When we run from m1^min to m1^max in unit steps, the number of different values of m1 is m1^max − m1^min + 1. Something similar
occurs with m2 . Consequently, the number N of different C-G coefficients of the form hm1 , m2 (j1 , j2 ) J, Ji is given by

N = m1^max − m1^min + 1 = j1 − (J − j2 ) + 1 ; N = m2^max − m2^min + 1 = j2 − (J − j1 ) + 1
N = j1 + j2 − J + 1

we shall show that all these j1 + j2 − J + 1 coefficients are strictly different from zero. To prove it, we observe that with
M = J the LHS of Eq. (C.19) vanishes and we are left with

hm1 − 1, m2 (j1 , j2 ) J, Ji = −√[ (j2 (j2 + 1) − m2 (m2 − 1)) / (j1 (j1 + 1) − m1 (m1 − 1)) ] hm1 , m2 − 1 (j1 , j2 ) J, Ji     (C.20)

As long as the C-G coefficients satisfy relations (C.13), the radical on the RHS of Eq. (C.20) is neither zero nor infinite. For
the C-G coefficients on either side of Eq. (C.20) to exist, we require m1 + m2 − 1 = J. Hence, if m1 = j1 then m2 = J − j1 + 1,
and in this case relation (C.20) shows that if the RHS coefficient hj1 , J − j1 (j1 , j2 ) J, Ji were zero, then the LHS coefficient
hj1 − 1, J − j1 + 1 (j1 , j2 ) J, Ji would also be zero, as well as all the succeeding coefficients hm1 , J − m1 (j1 , j2 ) J, Ji. This
would imply that |J, Ji is orthogonal to all vectors in the basis {|j1 , j2 ; m1 , m2 i} of the subspace E (j1 , j2 ; k1 , k2 ). According
to theorem 2.28, Eq. (2.34) page 27, this in turn implies that |J, Ji = 0 in E (j1 , j2 ; k1 , k2 ), which contradicts the general
theory of angular momentum. In conclusion, all the j1 + j2 − J + 1 C-G coefficients of the form hm1 , m2 (j1 , j2 ) J, Ji
compatible with all selection rules and with J − j2 ≤ m1 ≤ j1 are different from zero.
Let us take m1 = j1 (the maximum value of m1 ); its associated coefficient hj1 , J − j1 (j1 , j2 ) J, Ji is in particular non-
vanishing. To fix the phase of |J, Ji we shall demand for this coefficient that

hj1 , J − j1 (j1 , j2 ) J, Ji > 0 (C.21)


¹ In turn, different phase conventions can be chosen for Eqs. (B.26). Indeed, we could multiply the normalization constant in Eqs. (B.26) by an
m-dependent phase factor. It is standard to choose the normalization constants in these equations as real and positive (phase zero), which is called
the Condon-Shortley convention.

and from Eq. (C.20) the remaining coefficients are also real, with alternating signs. Setting m1 = j1 in Eq.
(C.20) yields
sgn {hj1 − 1, J − j1 + 1 (j1 , j2 ) J, Ji} = −sgn {hj1 , J − j1 (j1 , j2 ) J, Ji} (C.22)
and with m1 = j1 − 1 Eq. (C.20) yields
sgn {hj1 − 2, J − (j1 − 1) + 1 (j1 , j2 ) J, Ji} = −sgn {hj1 − 1, J − (j1 − 1) (j1 , j2 ) J, Ji} = −sgn {hj1 − 1, J − j1 + 1 (j1 , j2 ) J, Ji}
sgn {hj1 − 2, J − (j1 − 2) (j1 , j2 ) J, Ji} = (−1)^2 sgn {hj1 , J − j1 (j1 , j2 ) J, Ji}
where we have used Eq. (C.22) in the last step. Proceeding in the same way k times we have

sgn {hj1 − k, J − (j1 − k) (j1 , j2 ) J, Ji} = (−1)^k sgn {hj1 , J − j1 (j1 , j2 ) J, Ji}     (C.23)

The LHS of Eq. (C.23) can be rewritten assigning j1 − k ≡ m1 and J − (j1 − k) ≡ m2 = J − m1 . Therefore

sgn {hm1 , J − m1 (j1 , j2 ) J, Ji} = (−1)^(j1 −m1 ) sgn {hj1 , J − j1 (j1 , j2 ) J, Ji}     (C.24)

and using the phase convention (C.21) we get

sgn {hm1 , J − m1 (j1 , j2 ) J, Ji} = (−1)^(j1 −m1 )     (C.25)
Note finally that the phase convention adopted in Eq. (C.21) treats the two angular momenta j1 and j2 asymmetrically: it
depends on the order in which j1 and j2 appear in the C-G coefficient. If j1 and j2 are permuted, the phase of the vector |J, Ji
is fixed by the condition

hj2 , J − j2 (j2 , j1 ) J, Ji > 0     (C.26)

which is not equivalent in general to the convention (C.21); the two conventions could define different phases (relative signs)
for |J, Ji. We shall examine this point later.
Observe that the recurrence relation (C.18) enables us to express all the coefficients hm1 , m2 (j1 , j2 ) J, J − 1i in terms of
the coefficients hm1 , m2 (j1 , j2 ) J, Ji, then all the coefficients of the form hm1 , m2 (j1 , j2 ) J, J − 2i, and so on. Since no
imaginary numbers appear in these relations, and the coefficients hm1 , m2 (j1 , j2 ) J, Ji are real, all C-G coefficients are real

hm1 , m2 (j1 , j2 ) J, M i* = hm1 , m2 (j1 , j2 ) J, M i

This in turn has to do with the Condon-Shortley convention for the action of J± adopted in Eqs. (B.26). The reality can also
be written as

hm1 , m2 (j1 , j2 ) J, M i = hJ, M (j1 , j2 ) m1 , m2 i

Nevertheless, there is no simple pattern for the sign of hm1 , m2 (j1 , j2 ) J, M i when M ≠ J. Notwithstanding, the signs of
some of them can be identified easily.
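The sign rule (C.25) can also be spot-checked with SymPy (assumptions as before); for j1 = 2, j2 = 3/2 and J = 3/2 the coefficients hm1 , J − m1 (j1 , j2 ) J, Ji alternate in sign starting from a positive value at m1 = j1 .

from sympy import S, sign
from sympy.physics.wigner import clebsch_gordan

j1, j2, J = S(2), S(3)/2, S(3)/2
for k in range(int(j1 + j2 - J) + 1):        # m1 = j1, j1 - 1, ..., J - j2
    m1 = j1 - k
    c = clebsch_gordan(j1, j2, J, m1, J - m1, J)
    print(m1, c, sign(c) == (-1)**k)         # True for every m1, cf. Eq. (C.25)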

C.1.5 Signs of some C-G coefficients


For instance, Eq. (C.5) shows that hj1 , j2 (j1 , j2 ) j1 + j2 , j1 + j2 i can be chosen as 1, which is in particular real and positive,
in agreement with Eq. (C.21). Setting M = J = j1 + j2 in Eq. (C.18) we see that the coefficients hm1 , m2 (j1 , j2 ) j1 + j2 , j1 + j2 − 1i
are positive, so that the coefficients hm1 , m2 (j1 , j2 ) j1 + j2 , j1 + j2 − 2i also are, and so on. Therefore, by recurrence we
obtain that

hm1 , m2 (j1 , j2 ) j1 + j2 , M i ≥ 0

Now, given a coefficient of the form hm1 , m2 (j1 , j2 ) J, M i with J and M fixed, we shall characterize the coefficients in which
m1 takes its maximum value. In principle the maximum value of m1 is m1^max = j1 , which leads to m2 = M − j1 ; but we also
require that m2 ≥ −j2 , from which M − j1 ≥ −j2 , or M ≥ j1 − j2 . If M < j1 − j2 the associated m2 is not allowed, so that
m1^max is obtained by taking the minimum of m2 , hence m1^max = M − m2^min = M − (−j2 ). Therefore, if M < j1 − j2 then
m1^max = M + j2 . We can summarize these results as

m1^max = j1 if M ≥ j1 − j2 ; m1^max = M + j2 if M ≤ j1 − j2     (C.27)
note that both cases coincide when M = j1 − j2 . We shall show that all C-G coefficients for which m1 acquires its maximum
value are strictly positive.
Let us take the first case, M ≥ j1 − j2 in Eq. (C.27), for which m1^max = j1 . Setting m1 = j1 in Eq. (C.18) we obtain
√[J (J + 1) − M (M − 1)] hj1 , m2 (j1 , j2 ) J, M − 1i = √[j2 (j2 + 1) − m2 (m2 + 1)] hj1 , m2 + 1 (j1 , j2 ) J, M i
hj1 , m2 (j1 , j2 ) J, M − 1i = √[ (j2 (j2 + 1) − m2 (m2 + 1)) / (J (J + 1) − M (M − 1)) ] hj1 , m2 + 1 (j1 , j2 ) J, M i     (C.28)

the radical on the RHS of Eq. (C.28) can be neither zero nor infinite (note for instance that for the coefficient on the RHS of
Eq. (C.28) to exist, we require m2 + 1 ≤ j2 and hence m2 < j2 , which prevents the radical from vanishing). Setting M = J and
using Eq. (C.21) we see that the C-G coefficient on the RHS of Eq. (C.28) is positive, so the LHS also is. Applying a recurrence
argument by successive application of Eq. (C.28) we obtain that all coefficients hj1 , M − j1 (j1 , j2 ) J, M i are strictly positive
if M ≥ j1 − j2 (otherwise m1 ≠ j1 and such coefficients do not exist).
We now take the second case in which M ≤ j1 − j2 . In this case, when m1 acquires its maximum value then m2 = −j2 .
Applying m2 = −j2 in Eq. (C.19) we have
√[J (J + 1) − M (M + 1)] hm1 , −j2 (j1 , j2 ) J, M + 1i = √[j1 (j1 + 1) − m1 (m1 − 1)] hm1 − 1, −j2 (j1 , j2 ) J, M i
hm1 , −j2 (j1 , j2 ) J, M + 1i = √[ (j1 (j1 + 1) − m1 (m1 − 1)) / (J (J + 1) − M (M + 1)) ] hm1 − 1, −j2 (j1 , j2 ) J, M i     (C.29)

Observe that for the coefficient on the LHS of Eq. (C.29) to exist we require M < J, so that the radical on the RHS of Eq.
(C.29) can be neither zero nor infinite. An analogous argument shows that the coefficients of the form hM + j2 , −j2 (j1 , j2 ) J, M i
are strictly positive if M ≤ j1 − j2 . In summary

hj1 , M − j1 (j1 , j2 ) J, M i > 0 if M ≥ j1 − j2 (C.30)


hM + j2 , −j2 (j1 , j2 ) J, M i > 0 if M ≤ j1 − j2 (C.31)

Eq. (C.25) yields

sgn {hm1 , J − m1 (j1 , j2 ) J, Ji} = (−1)^(j1 −m1 )

and applying m2 = j2 and m1 = M − j2 = J − j2 in this equation we get

sgn {hJ − j2 , j2 (j1 , j2 ) J, Ji} = (−1)^(j1 +j2 −J)     (C.32)

we can determine the sign of hm1 , m2 (j1 , j2 ) J, − Ji by setting M = −J in Eq. (C.18), obtaining

hm1 + 1, m2 (j1 , j2 ) J, −Ji = −√[ (j2 (j2 + 1) − m2 (m2 + 1)) / (j1 (j1 + 1) − m1 (m1 + 1)) ] hm1 , m2 + 1 (j1 , j2 ) J, −Ji     (C.33)

by a procedure similar to the one that led from Eq. (C.20) to Eq. (C.25) we see that the sign of hm1 , m2 (j1 , j2 ) J, −Ji
changes whenever m1 (or m2 ) varies by ±1.
Now if we take M = −J then M ≤ j1 − j2 and Eq. (C.31) says that

h−J + j2 , −j2 (j1 , j2 ) J, − Ji > 0 (C.34)

combining Eqs. (C.33, C.34), we obtain

sgn {hm1 , m2 (j1 , j2 ) J, − Ji} = (−1)^(m2 +j2 )     (C.35)

in particular if m1 = −j1 then m2 = −J + j1 and

sgn {h−j1 , −J + j1 (j1 , j2 ) J, − Ji} = (−1)^(j1 +j2 −J)     (C.36)

C.1.6 Changing the order of j1 and j2


We have already mentioned that the phase of the ket |J, Ji has been chosen in a way that depends on the order in which the
two angular momenta j1 and j2 are taken in the C-G coefficient. So it is useful to establish how the phase of |J, Ji changes
under the interchange of j1 and j2 .
Let us change the notation for a while: we denote by |J, Ji^(12) the ket in the phase convention derived from the order j1 , j2 ,
while |J, Ji^(21) is defined accordingly. Taking the order j1 and j2 , we established in Eq. (C.21) that

hj1 , J − j1 |J, Ji(12) > 0 (C.37)

from which we arrive at Eq. (C.32), which says

sgn {hJ − j2 , j2 |J, Ji^(12)} = (−1)^(j1 +j2 −J)     (C.38)

interchanging the order of j1 and j2 , relation (C.38) becomes

sgn {hJ − j1 , j1 |J, Ji^(12)} = sgn {hj2 , J − j2 |J, Ji^(12)} = (−1)^(j1 +j2 −J)     (C.39)

it is natural to impose the condition (C.26)

hj2 , J − j2 (j2 , j1 ) J, Ji^(21) > 0 ⇒ sgn {hj2 , J − j2 |J, Ji^(21)} = +1     (C.40)

which is the analog of condition (C.37) but with the j2 , j1 order. Now writing |J, Ji^(21) = (−1)^p |J, Ji^(12) and combining
Eqs. (C.40, C.39) we have

sgn {hj2 , J − j2 |J, Ji^(21)} = (−1)^p sgn {hj2 , J − j2 |J, Ji^(12)} = (−1)^p (−1)^(j1 +j2 −J) = +1

for this to be valid for all j1 , j2 and J we require (−1)^p = (−1)^(j1 +j2 −J) . Therefore, the difference of phases between |J, Ji^(12)
and |J, Ji^(21) is

|J, Ji^(21) = (−1)^(j1 +j2 −J) |J, Ji^(12)

now, taking into account that all kets |J, M i are constructed from |J, Ji by successive application of J− , and observing that
the action of this operator is not sensitive to the interchange of j1 and j2 , we conclude that

|J, M i^(21) = (−1)^(j1 +j2 −J) |J, M i^(12)

therefore, the exchange of j1 and j2 leads to the relation

hm2 , m1 (j2 , j1 ) J, M i = (−1)^(j1 +j2 −J) hm1 , m2 (j1 , j2 ) J, M i
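The exchange relation just obtained can be verified numerically as well (our check; SymPy assumed, same caveats as before).

from sympy import S, simplify
from sympy.physics.wigner import clebsch_gordan

j1, j2, J, M = S(1), S(1)/2, S(1)/2, S(1)/2
m1, m2 = S(1), -S(1)/2
lhs = clebsch_gordan(j2, j1, J, m2, m1, M)            # order j2, j1
rhs = (-1)**(j1 + j2 - J) * clebsch_gordan(j1, j2, J, m1, m2, M)
print(simplify(lhs - rhs))    # 0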

C.1.7 Simultaneous change of the sign of m1 , m2 and M


We have generated all kets of the form |J, M i by starting from |J, Ji and applying J− successively. We could equally well start
from |J, −Ji and obtain those kets by successive application of J+ . The procedure is the same, and we find for the kets |J, −M i
the same expansion coefficients on the kets |j1 , j2 ; −m1 , −m2 i as for the kets |J, M i on the kets |j1 , j2 ; m1 , m2 i. However,
there could be some differences because of the phase conventions for the kets |J, M i, since the analogue of Eq. (C.21) is now

h−j1 , −J + j1 (j1 , j2 ) J, −Ji > 0

from which we could define the phase of the vector |J, −Ji. But according to Eq. (C.36) the sign of this coefficient is in reality
(−1)^(j1 +j2 −J) , from which it follows that

h−m1 , −m2 (j1 , j2 ) J, −M i = (−1)^(j1 +j2 −J) hm1 , m2 (j1 , j2 ) J, M i

in particular by taking m1 = m2 = 0 we see that h0, 0 (j1 , j2 ) J, 0i is zero when j1 + j2 − J is an odd number.
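Both the sign-reversal relation and its consequence for h0, 0 (j1 , j2 ) J, 0i can be spot-checked (our check; SymPy assumed, same caveats).

from sympy import S, simplify
from sympy.physics.wigner import clebsch_gordan

j1, j2, J, M, m1, m2 = S(2), S(1), S(2), S(1), S(1), S(0)
lhs = clebsch_gordan(j1, j2, J, -m1, -m2, -M)
rhs = (-1)**(j1 + j2 - J) * clebsch_gordan(j1, j2, J, m1, m2, M)
print(simplify(lhs - rhs))                  # 0
print(clebsch_gordan(2, 1, 2, 0, 0, 0))     # 0, since 2 + 1 - 2 is odd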

C.1.8 Evaluation of hm, −m (j, j) 0, 0i


According to Eq. (C.10), J can be zero only if j1 = j2 ≡ j. Setting j1 = j2 = j and m1 = m, m2 = −m − 1, J = M = 0 in
Eq. (C.18), we obtain

0 = √[j (j + 1) − m (m + 1)] hm + 1, −m − 1 (j, j) 0, 0i + √[j (j + 1) − (−m − 1) [(−m − 1) + 1]] hm, −m (j, j) 0, 0i
0 = √[j (j + 1) − m (m + 1)] {hm + 1, − (m + 1) (j, j) 0, 0i + hm, −m (j, j) 0, 0i}

so we find
hm + 1, − (m + 1) (j, j) 0, 0i = −hm, −m (j, j) 0, 0i (C.41)
so that all coefficients of the form hm, −m (j, j) 0, 0i have the same modulus, and their signs alternate whenever m varies by
one unit. Applying J = M = 0, j1 = j2 ≡ j and m1 = m in Eqs. (C.21, C.25) we have

hj, −j (j, j) 0, 0i > 0     (C.42)

sgn {hm, −m (j, j) 0, 0i} = (−1)^(j−m)     (C.43)

now using the orthonormality relation (C.16) with the same setting we have

Σ_{m1 =−j}^{j} Σ_{m2 =−j}^{j} hm1 , m2 (j, j) 0, 0ihm1 , m2 (j, j) 0, 0i = δ00 δ00

              m′ = 1/2                       m′ = −1/2
J = j + 1/2   √[(j + M + 1/2)/(2j + 1)]      √[(j − M + 1/2)/(2j + 1)]
J = j − 1/2   −√[(j − M + 1/2)/(2j + 1)]     √[(j + M + 1/2)/(2j + 1)]

Table C.1: Clebsch-Gordan coefficients of the type hJ, M (j, 1/2) M − m′ , m′ i.

              m′ = 1                                        m′ = 0                                         m′ = −1
J = j + 1     √[(j + M)(j + M + 1)/((2j + 1)(2j + 2))]      √[(j − M + 1)(j + M + 1)/((2j + 1)(j + 1))]    √[(j − M)(j − M + 1)/((2j + 1)(2j + 2))]
J = j         −√[(j + M)(j − M + 1)/(2j(j + 1))]            M/√[j(j + 1)]                                  √[(j − M)(j + M + 1)/(2j(j + 1))]
J = j − 1     √[(j − M)(j − M + 1)/(2j(2j + 1))]            −√[(j − M)(j + M)/(j(2j + 1))]                 √[(j + M + 1)(j + M)/(2j(2j + 1))]

Table C.2: Clebsch-Gordan coefficients of the type hJ, M (j, 1) M − m′ , m′ i.

since M = 0 then m1 = −m2 ≡ m and we rewrite this as


Σ_{m=−j}^{j} Σ_{m2 =−j}^{j} hm, m2 (j, j) 0, 0ihm, m2 (j, j) 0, 0i δm,−m2 = 1

which finally takes the form

Σ_{m=−j}^{j} hm, −m (j, j) 0, 0i^2 = 1     (C.44)

and we find

hm, −m (j, j) 0, 0i = (−1)^(j−m) / √(2j + 1)     (C.45)
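Eq. (C.45) can be verified directly (our check; SymPy assumed, same caveats), for instance for j = 3/2.

from sympy import S, sqrt, simplify
from sympy.physics.wigner import clebsch_gordan

j = S(3)/2
for k in range(int(2 * j) + 1):
    m = j - k
    c = clebsch_gordan(j, j, 0, m, -m, 0)
    print(m, c, simplify(c - (-1)**(j - m) / sqrt(2 * j + 1)) == 0)   # all True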

C.1.9 Some specific Clebsch-Gordan coefficients

Some families of C-G coefficients hJ, M (j, j ′ ) m, m′ i can be obtained in closed form; we show some of them in Tables C.1
and C.2 (for j ′ = 1/2 and j ′ = 1, respectively).
