From the Lorentz Group to the Celestial Sphere
Blagoje Oblak∗
Physique Théorique et Mathématique
Université Libre de Bruxelles and International Solvay Institutes
Campus Plaine C.P. 231, B-1050 Bruxelles, Belgium
Abstract
In these lecture notes we review the isomorphism between the
(connected) Lorentz group and the set of conformal transformations of the sphere. More precisely,
after establishing the main properties of the Lorentz group, we show that it is isomorphic to the group
$\mathrm{SL}(2,\mathbb{C})$ of complex $2\times 2$ matrices with unit
determinant, quotiented by its center $\mathbb{Z}_2$. We then classify conformal transformations of the sphere, define the notion of
null infinity in Minkowski space-time, and show that the action of Lorentz transformations on the celestial
spheres at null infinity is precisely that of conformal transformations. In particular, we discuss the
optical phenomena observed by the pilots of the Millennium Falcon during the jump to lightspeed.
This text, aimed at undergraduate
students, was written for the seventh Brussels Summer School of
Mathematics† that took place at Université Libre de Bruxelles in August 2014. An abridged
version has been
published in the proceedings of the school, “Notes de la septième BSSM”, printed in July 2015.
∗ Research Fellow of the Fund for Scientific Research-FNRS
Belgium. E-mail: boblak@ulb.ac.be
† Website: http://bssm.ulb.ac.be/
Introduction
The Lorentz group is essentially the symmetry group of special relativity. It is commonly defined as a set of
(linear) transformations acting on a four-dimensional vector space $\mathbb{R}^4$, representing changes of
inertial frames in Minkowski space-time. But as
we will see
below, one can exhibit an isomorphism between the Lorentz group and the group of
conformal transformations of the sphere $S^2$; the latter is of course two-dimensional.
This
isomorphism thus relates the action of a group on a four-dimensional space to its action on
a two-dimensional manifold. At first sight, such a relation seems surprising: loosely speaking, one
expects to have lost some information in going from four to two dimensions. In particular, the isomorphism
looks like a coincidence of the group structure: there is no obvious geometric
relation between the original four-dimensional space on the one hand, and the sphere on the other hand.
The purpose of these notes is to show that such a relation actually exists, and is even quite natural.
Indeed, by defining a notion of “celestial
spheres”, one can derive a direct link between four-dimensional Minkowski space and the
two-dimensional sphere. In short, the celestial sphere of an inertial observer in Minkowski space is the
sphere of all directions towards which the observer can look, and coordinate transformations between
inertial observers (i.e. Lorentz transformations) correspond to conformal transformations of this
sphere [1, 2, 3, 4]. In this work we will review this construction
in a self-contained way.
Keeping this motivation in mind, the text is organized as follows. In section 1, we review
the basic principles of special relativity and define the natural symmetry groups that follow, namely the
Poincaré group and its homogeneous subgroup, the Lorentz group [5, 6, 7].
This will also be an occasion to discuss certain elegant properties of the Lorentz group that are seldom covered in elementary courses on special relativity, in particular regarding the physical meaning of the notion of “rapidity” [8, 9, 10]. In section 2, we then establish the isomorphism between the connected Lorentz group and the group $\mathrm{SL}(2,\mathbb{C})$ of complex, two by two matrices of unit determinant, quotiented by its center $\mathbb{Z}_2$. We also derive the analogue of this result in three space-time dimensions. Section 3 is devoted to the construction of conformal transformations of the sphere; it is shown, in particular, that such transformations span a group isomorphic to $\mathrm{SL}(2,\mathbb{C})/\mathbb{Z}_2$, a key result in the realm of two-dimensional conformal field theories [11, 12].
At that point, the stage will be set for the final link between the Lorentz group and the sphere, which is established in section 4. The conclusion, section 5, relates these observations to some recent developments in quantum gravity, in particular BMS symmetry [13, 14, 15, 16, 17, 18, 19, 20] and holography [21, 22, 23, 24, 25].
The presentation deliberately starts with fairly elementary considerations, in order to be accessible
(hopefully) to undergraduate
students. Though some basic knowledge of group theory and special relativity should come in handy, no prior
knowledge of differential geometry, general relativity or conformal field theory is assumed. In particular,
sections 1 and 2 are mostly based on the undergraduate-level lecture notes
[5].
1 Special relativity and the Lorentz group
In this section, after reviewing the basic principles of special relativity (subsection
1.1), we define the associated
symmetry groups (subsection 1.2) and introduce in particular the Lorentz group. In subsection
1.3, we then
define certain natural subgroups of the latter. Subsection 1.4 is devoted to the notion
of Lorentz boosts and to the associated additive parameter, which turns out to have the physical meaning of
“rapidity”. Finally, in subsection 1.5 we show that any Lorentz transformation preserving the
orientation of space and the direction of time flow can be written as
the product of two rotations and a boost, and then use this result in subsection 1.6 to classify the
connected components of the Lorentz group. All these results are well known; readers already familiar with them may
safely jump directly to section 2. The presentation of this section is mainly inspired by the
lecture notes [5] and
[6]; more specialized references will be cited in due time.
1.1 The principles of special relativity
1.1.1 Events and reference frames
In special relativity, natural phenomena take place in the arena of space-time. The latter consists of points, called events, which occur at some position in space, at some moment in time. Events are seen by observers who use coordinate systems, also called reference frames, to specify the location of an event in space-time. In the realm of special relativity, reference frames typically consist of three orthonormal spatial coordinates $(x^1, x^2, x^3)$ and one time coordinate $t$, measured by a clock carried by the observer. (The presentation here is confined to four-dimensional space-times, but the generalization to $d$-dimensional space-times is straightforward: simply take $d-1$ spatial coordinates $(x^1, \dots, x^{d-1})$.) For practical purposes, the speed of light in the vacuum,
$$c = 299\,792\,458\ \text{m/s}, \tag{1.1}$$
is used as a conversion factor to express time as a quantity with dimensions of distance. This is done by
defining a new time coordinate $x^0 \equiv ct$.
Thus, in a given reference frame, an event occurring in space-time is labelled by its four coordinates $(x^0, x^1, x^2, x^3)$, collectively denoted as $x^\mu$. (From now on, Greek indices run over the values $0$, $1$, $2$, $3$.) Of course, the event’s existence is independent of the observers who see it, but its coordinates are not: if Alice and Bob are two observers looking at the same event, Alice may use a set of four numbers $x^\mu$ to describe its location, but Bob will in general use different coordinates $x'^\mu$ to locate the same event. Besides, if we do not specify further the relation between Alice and Bob, there is no link whatsoever between the coordinates they use. What we need are restrictions on the possible reference frames used by Alice and Bob; the principles of special relativity will then apply only to those observers whose reference frames satisfy the given restrictions.

1.1.2 The principles of special relativity
We now state the defining assumptions of special relativity. The first three basic assumptions are
homogeneity of space-time, isotropy of space and causality [8, 7]. The remaining
principles, discussed below, lead to constraints on the relation between reference frames. To expose these
principles, we first need to define the notion of inertial frames.
According to the principle of inertia, a body left to itself, without forces acting
on it, should move in space in a constant direction, with a constant velocity. Obviously, this principle
cannot hold in all reference frames. For example, suppose Alice observes that the principle of inertia is
true in her reference frame (for example by throwing tennis balls in space and observing that they move in
straight
lines at constant velocity). Then, if Bob is accelerated with respect to Alice, he will naturally use a
comoving frame and the straight motions seen by Alice will become curved motions in his reference frame.
Therefore, if two reference frames are accelerated with respect to each other, the principle of
inertia cannot hold in both frames. More generally, we call inertial frame a reference frame in
which
the principle of inertia holds [7]; the results of special relativity apply only to such frames.
Accordingly,
an observer using an inertial frame is called an inertial observer; in physical terms, it is an observer
falling freely in empty space. We will see in
subsection 1.2 what restrictions are imposed on the relation between coordinates of inertial
frames; at present, we already know that, if two such frames move with respect to each other, then this
motion must take place along a straight line, at constant velocity.
Given this definition, we are in a position to state the two crucial defining principles of special relativity.
The
first, giving its name to the theory, is the principle of relativity (in the restricted sense
[26]), which
states that the laws of Nature must take the same form in all inertial frames. In other words, according to
this principle, there exists no privileged inertial frame in the Universe: there is no experiment that would
allow an experimenter to distinguish a given inertial frame from the others. This is a principle of special relativity in that it only applies to inertial frames; a principle of general relativity would
apply to all possible reference frames, inertial or not. The latter principle leads to the theory of general
relativity, which we will not discuss further here.
The second principle is Einstein’s historical “second postulate”, which states that the speed of light in the vacuum takes the same value $c$, written in (1.1), in all inertial frames. In fact, if one assumes that Maxwell’s theory of electromagnetism holds, then the second postulate is a consequence of the principle of relativity. Indeed, saying that the speed of light (in the vacuum) is the same in all inertial frames is really saying that the laws of electromagnetism are identical in all inertial frames [7].
1.2 The Poincaré group and the Lorentz group
We now work out the relation between coordinates of inertial frames; the set of all such relations will form a group, called the Poincaré group. We will see that the second postulate is crucial in determining the form of this group, through the notion of “space-time interval”.
1.2.1 Linear structure
Suppose $S$ and $S'$ are two inertial frames, i.e. the principle of inertia holds in both of them. Then, a particle moving along a straight line at constant velocity, as seen from $S$, must also move at constant velocity along a straight line when seen from $S'$. Thus, calling $x^\mu$ the space-time coordinates of $S$ and $x'^\mu$ those of $S'$, the relation between these coordinates must be such that any straight line in the coordinates $x^\mu$ is mapped on a straight line in the coordinates $x'^\mu$. The most general transformation satisfying this property is a projective map [6], for which
$$x'^\mu = \frac{\Lambda^\mu{}_\nu\, x^\nu + a^\mu}{b_\nu\, x^\nu + d}. \tag{1.2}$$
(From now on, summation over repeated indices will always be understood.) Here $\Lambda^\mu{}_\nu$, $a^\mu$, $b_\nu$ and $d$ are constant coefficients. If we insist that points having finite values of coordinates in $S$ remain with finite coordinates in $S'$, we must set $b_\nu = 0$. Then, absorbing the constant $d$ in the parameters $\Lambda^\mu{}_\nu$ and $a^\mu$, the transformation (1.2) reduces to
$$x'^\mu = \Lambda^\mu{}_\nu\, x^\nu + a^\mu. \tag{1.3}$$
Thus, the principle of inertia endows space-time with a linear structure. In order for the transformation (1.3) to be invertible, we must also demand that the matrix $\Lambda = (\Lambda^\mu{}_\nu)$ be invertible. Apart from that, using only the principle of inertia, we cannot go further at this point. The principle of relativity will set additional restrictions on $\Lambda$.
1.2.2 Invariance of the interval
Let $S$ again be an inertial frame and let $A$ and $B$ be two events in space-time. Call $\Delta x^\mu$ the components of the vector going from $A$ to $B$ in the frame $S$. Then, we call the number
$$\Delta s^2 \equiv \eta_{\mu\nu}\,\Delta x^\mu \Delta x^\nu = -(\Delta x^0)^2 + (\Delta x^1)^2 + (\Delta x^2)^2 + (\Delta x^3)^2 \tag{1.4}$$
the square of the interval between $A$ and $B$. The matrix
$$\eta = (\eta_{\mu\nu}) = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \tag{1.5}$$
appearing in this definition is called the Minkowski metric matrix. The terminology associated with definition (1.4) may seem inconsistent, in that we call “square of the interval between two events” a quantity that seems to depend not only on the events, but also on the coordinates chosen to locate them (in the present case, the separation $\Delta x^\mu$). This is not the case, however, thanks to the following important result:
Proposition.
Let $S$ and $S'$ be two inertial frames, $A$ and $B$ two events, the square of the interval between them being $\Delta s^2$ in the coordinates of $S$ and $\Delta s'^2$ in those of $S'$. Then,
$$\Delta s^2 = \Delta s'^2. \tag{1.6}$$
In other words, the number (1.4) does not depend on the inertial coordinates used to define it.
Proof.
First suppose that $A$ and $B$ are light-like separated, i.e. that there exists a light ray going from $A$ to $B$ (or from $B$ to $A$). Then, $\Delta s^2 = 0$ by construction. But, by Einstein’s second postulate, the speed of light is the same in both reference frames $S$ and $S'$, so $\Delta s'^2 = 0$ as well. Thus,
$$\Delta s^2 = 0 \quad\Longleftrightarrow\quad \Delta s'^2 = 0.$$
Now, since $S$ and $S'$ are inertial frames, the relation between their coordinates must be of the linear form (1.3); in particular, $\Delta x'^\mu = \Lambda^\mu{}_\nu\, \Delta x^\nu$. Therefore $\Delta s'^2$ is a polynomial of second order in the components $\Delta x^\mu$. But we have just seen that $\Delta s^2$ and $\Delta s'^2$ have identical roots; since polynomials having identical roots are necessarily proportional to each other, we know that there exists some number $K$ such that
$$\Delta s'^2 = K\, \Delta s^2. \tag{1.7}$$
This number depends on the matrix $\Lambda$ appearing in (1.3), which itself depends on the velocity $\mathbf{v}$ of the frame $S'$ with respect to the frame $S$. (This velocity is constant, since accelerated frames cannot be inertial.) But space is isotropic by assumption, so $K$ actually depends only on the modulus $|\mathbf{v}|$ of $\mathbf{v}$, and not on its direction. In particular, $K(\mathbf{v}) = K(-\mathbf{v})$. Since the velocity of $S$ with respect to $S'$ is $-\mathbf{v}$, we know that
$$\Delta s^2 = K(-\mathbf{v})\, \Delta s'^2 = K(-\mathbf{v})\, K(\mathbf{v})\, \Delta s^2 = K(\mathbf{v})^2\, \Delta s^2,$$
implying that $K = \pm 1$. Since real transformations cannot change the signature of a quadratic form, $K$ cannot be negative, so $K = 1$. ∎
1.2.3 Lorentz transformations
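We now know that coordinate transformations between inertial frames must preserve the square of the interval; let us work out the consequences of this statement for the matrix $\Lambda$ in (1.3). To simplify notations, we will see $x^\mu$, $x'^\mu$ and $a^\mu$ as four-component column vectors $x$, $x'$ and $a$, so that (1.3) can be written as
$$x' = \Lambda \cdot x + a. \tag{1.8}$$
Similarly, seeing $\Delta x^\mu$ as a vector $\Delta x$, the square of the interval (1.4) becomes $\Delta s^2 = \Delta x^t\, \eta\, \Delta x$. (The superscript “$t$” denotes transposition.) Then, since $\Delta x' = \Lambda \cdot \Delta x$, demanding invariance of the square of the interval under (1.8) amounts to the equality
$$\Delta x^t\, \Lambda^t \eta \Lambda\, \Delta x = \Delta x^t\, \eta\, \Delta x,$$
to be satisfied for any $\Delta x$. Because $\Lambda^t \eta \Lambda$ and $\eta$ are both symmetric matrices, this implies that the matrix $\Lambda$ satisfies
$$\Lambda^t\, \eta\, \Lambda = \eta. \tag{1.9}$$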
Definition.
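The Lorentz group (in four dimensions) is
$$L \equiv O(3,1) \equiv \left\{ \Lambda \in M(4,\mathbb{R}) \;\middle|\; \Lambda^t \eta \Lambda = \eta \right\}, \tag{1.10}$$
where $M(4,\mathbb{R})$ denotes the set of real $4\times 4$ matrices. More generally, the Lorentz group in $d$ space-time dimensions, $O(d-1,1)$, is the group of real $d\times d$ matrices $\Lambda$ satisfying property (1.9) for the $d$-dimensional Minkowski metric matrix $\eta = \mathrm{diag}(-1, 1, \dots, 1)$.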
Remark.
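This definition is equivalent to saying that the rows and columns of a Lorentz matrix form a Lorentz basis of $\mathbb{R}^4$, that is, a basis $\{e_0, e_1, e_2, e_3\}$ of 4-vectors such that $\eta_{\mu\nu}\, e_\alpha^\mu\, e_\beta^\nu = \eta_{\alpha\beta}$. The Lorentz group in $d$ space-time dimensions is a Lie group of real dimension $d(d-1)/2$. This is analogous to the orthogonal group $O(d)$, defined as the set of $d\times d$ matrices $\mathcal{O}$ satisfying $\mathcal{O}^t\, \mathbb{I}\, \mathcal{O} = \mathbb{I}$, where $\mathbb{I}$ is the identity matrix. In particular, the rows and columns of an orthogonal matrix form an orthonormal basis of $\mathbb{R}^d$.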
Definition.
The group consisting of inhomogeneous transformations (1.8), where $\Lambda$ belongs to the Lorentz group, is called the Poincaré group or the inhomogeneous Lorentz group. Its abstract structure is that of a semi-direct product $O(3,1) \ltimes \mathbb{R}^4$, where $\mathbb{R}^4$ is the group of translations, the group operation being given by
$$(\Lambda, a) \cdot (\Lambda', a') = (\Lambda \cdot \Lambda',\; a + \Lambda \cdot a').$$
Of course, this definition is readily generalized to $d$-dimensional space-times upon replacing $O(3,1)$ by $O(d-1,1)$ and $\mathbb{R}^4$ by $\mathbb{R}^d$. We will revisit the definition of the Lorentz and Poincaré groups at the end of subsection 3.1, with the tools of pseudo-Riemannian geometry. Apart from that, in the rest of these notes, we will mostly need only the homogeneous Lorentz group (1.10) and we will not really use the Poincaré group. We stress, though, that the latter is crucial for particle physics and quantum field theory [27, 28].
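As an illustration of the semi-direct product structure, the following minimal Python sketch checks numerically that composing two transformations of the form (1.8) reproduces the group law $(\Lambda, a)\cdot(\Lambda', a') = (\Lambda\Lambda', a + \Lambda a')$. The helper names (`matmul`, `matvec`, `rotation_12`, `act`, `compose`) and the sample numbers are assumptions made for this example only; spatial rotations are used as Lorentz matrices because they leave the time coordinate untouched and preserve the interval.

```python
import math


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def matvec(A, x):
    """Matrix acting on a column vector."""
    return [sum(A[i][k] * x[k] for k in range(len(x))) for i in range(len(A))]


def rotation_12(theta):
    """Rotation by theta in the (x^1, x^2) plane; x^0 and x^3 are unchanged."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def act(g, x):
    """Apply the transformation g = (Lam, a) to x, i.e. x -> Lam.x + a."""
    Lam, a = g
    return [y + t for y, t in zip(matvec(Lam, x), a)]


def compose(g1, g2):
    """Group law (Lam1, a1).(Lam2, a2) = (Lam1 Lam2, a1 + Lam1 a2)."""
    (Lam1, a1), (Lam2, a2) = g1, g2
    return matmul(Lam1, Lam2), [s + t for s, t in zip(a1, matvec(Lam1, a2))]


# Arbitrary sample data (hypothetical values, chosen only for the check).
g1 = (rotation_12(0.3), [1.0, 2.0, 3.0, 4.0])
g2 = (rotation_12(-1.1), [0.5, -1.0, 0.0, 2.0])
x = [1.0, 0.2, -0.7, 3.0]

lhs = act(compose(g1, g2), x)  # act once with the composed element
rhs = act(g1, act(g2, x))      # act with g2 first, then with g1
print(max(abs(p - q) for p, q in zip(lhs, rhs)))  # rounding-level difference
```

The translation part of the product involves $\Lambda \cdot a'$ rather than $a'$ alone, which is what distinguishes a semi-direct product from a direct one.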
1.3 Subgroups of the Lorentz group
The defining property (1.9) implies that each matrix $\Lambda$ in the Lorentz group satisfies
$\det(\Lambda) = \pm 1$. This splits the Lorentz group in two disconnected subsets, corresponding to matrices
with determinant
$+1$ or $-1$. In particular, Lorentz matrices with determinant $+1$
span a subgroup of the Lorentz group $O(3,1)$, called the proper Lorentz group and denoted
$SO(3,1)$ or $L_+$. It is the set of Lorentz transformations that preserve
the orientation of space-time.
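Another natural subgroup of $O(3,1)$ can be isolated using (1.9), though in a somewhat less obvious way. Namely, consider the $00$ component of eq. (1.9),
$$-(\Lambda^0{}_0)^2 + \Lambda^i{}_0\, \Lambda^i{}_0 = -1.$$
This implies the property
$$(\Lambda^0{}_0)^2 = 1 + \Lambda^i{}_0\, \Lambda^i{}_0 \geq 1, \tag{1.11}$$
valid for any matrix $\Lambda$ in the Lorentz group. The inequality is saturated only if $\Lambda^i{}_0 = 0$ for all $i = 1, 2, 3$. Since the inverse of relation (1.9) implies $\Lambda\, \eta\, \Lambda^t = \eta$ for any Lorentz matrix $\Lambda$, we also find
$$(\Lambda^0{}_0)^2 = 1 + \Lambda^0{}_i\, \Lambda^0{}_i \geq 1, \tag{1.12}$$
with equality iff $\Lambda^0{}_i = 0$ for all $i$. Thus, in particular, $\Lambda^i{}_0 = 0$ for all $i$ iff $\Lambda^0{}_i = 0$ for all $i$, in which case the spatial components $\Lambda^i{}_j$ of $\Lambda$ form a matrix in $O(3)$. Just as the determinant property $\det(\Lambda) = \pm 1$, the inequality in (1.11) splits the Lorentz group in two disconnected components, corresponding to matrices with positive or negative $\Lambda^0{}_0$. Note that the product of two matrices $\Lambda$, $\Lambda'$, with positive $\Lambda^0{}_0$ and $\Lambda'^0{}_0$, is itself a matrix with positive $00$ component:
$$\begin{aligned}
(\Lambda\Lambda')^0{}_0 &= \Lambda^0{}_0\, \Lambda'^0{}_0 + \Lambda^0{}_i\, \Lambda'^i{}_0 \\
&= \sqrt{1 + \Lambda^0{}_i \Lambda^0{}_i}\, \sqrt{1 + \Lambda'^j{}_0 \Lambda'^j{}_0} + \Lambda^0{}_i\, \Lambda'^i{}_0 \\
&> \sqrt{\Lambda^0{}_i \Lambda^0{}_i}\, \sqrt{\Lambda'^j{}_0 \Lambda'^j{}_0} + \Lambda^0{}_i\, \Lambda'^i{}_0 \;\geq\; 0.
\end{aligned} \tag{1.13}$$
(In the very last inequality we applied the Cauchy-Schwarz lemma to the spatial vectors whose components
are $\Lambda^0{}_i$ and $\Lambda'^i{}_0$.) Therefore, the set of Lorentz matrices with positive $\Lambda^0{}_0$ forms a
subgroup of the Lorentz group, called the orthochronous Lorentz group and denoted
$O(3,1)^\uparrow$ or $L^\uparrow$. As the name indicates, elements of $L^\uparrow$ are Lorentz
transformations that preserve the direction of the arrow of time.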
Given these subgroups, one defines the proper, orthochronous Lorentz group
$$L_+^\uparrow \equiv SO(3,1)^\uparrow \equiv L_+ \cap L^\uparrow,$$
which is of course a subgroup of $O(3,1)$. In fact, we will see at the end of this section that this is the maximal connected subgroup of the Lorentz group. The rows and columns of Lorentz matrices belonging to $L_+^\uparrow$ form Lorentz bases with a future-directed time-like unit vector, and with positive orientation. The group of orientation-preserving rotations of space, $SO(3)$, is a natural subgroup of $L_+^\uparrow$, consisting of matrices of the form
$$\begin{pmatrix} 1 & 0 \\ 0 & R \end{pmatrix}, \qquad R \in SO(3). \tag{1.14}$$
Note that the subgroup $L_+^\uparrow \cup T L_+^\uparrow$ can be generated by adding to $L_+^\uparrow$ the time-reversal matrix
$$T = \mathrm{diag}(-1, 1, 1, 1). \tag{1.15}$$
Similarly, $L^\uparrow$ can be obtained by adding to $L_+^\uparrow$ the parity matrix
$$P = \mathrm{diag}(1, -1, -1, -1). \tag{1.16}$$
More generally, the whole Lorentz group can be obtained by adding $T$ and $P$ to $L_+^\uparrow$. (Note that $T$ and $P$ do not commute with all matrices in $L_+^\uparrow$. This should be contrasted with the case of $O(3)$ and $SO(3)$, where the three-dimensional parity operator $-\mathbb{I}$ belongs to the center of $O(3)$.)
1.4 Boosts and rapidity
We have just seen that any rotation, acting only on the space coordinates, is a Lorentz transformation. In the language of inertial frames, this is obvious: if the spatial axes of a frame $S'$ are rotated with respect to those of an inertial frame $S$ (and provided the time coordinates in $S$ and $S'$ coincide), then $S'$ is certainly an inertial frame. The same would be true even in Galilean relativity [7]. In order to see effects specific to Einsteinian special relativity, we need to consider Lorentz transformations involving inertial frames in relative motion.

1.4.1 Boosts
Call $S$ and $S'$ the inertial frames used by Alice and Bob, with respective coordinates $x^\mu$ and $x'^\mu$. Suppose Bob moves with respect to Alice in a straight line, at constant velocity $v$. Without loss of generality, we may assume that the origins of the frames $S$ and $S'$ coincide. (If they don’t, just apply a suitable space-time translation to bring them together.) By rotating the spatial axes of $S$ and $S'$, we can also choose their coordinates to satisfy $x'^2 = x^2$ and $x'^3 = x^3$. Then, the only coordinates of $S$ and $S'$ that are related by a non-trivial transformation are $(x^0, x^1)$ and $(x'^0, x'^1)$. Finally, using parity and time-reversal if necessary, we may choose the same orientation for the spatial frames of $S$ and $S'$, and the same orientation for their time arrows.

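Under these assumptions the relation between the coordinates of $S$ and those of $S'$ takes the form
$$x' = \Lambda \cdot x, \qquad \Lambda = \begin{pmatrix} p & q & 0 & 0 \\ r & s & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix},$$
where $p$, $q$, $r$ and $s$ are some real coefficients such that $\Lambda$ belongs to $L_+^\uparrow$; in particular, $ps - qr = 1$. Since $S'$ moves with respect to $S$ at constant velocity $v$ (along the direction $x^1$), the coordinate $x'^1$ of the origin of $S'$ must vanish when $x^1 = vt = (v/c)\, x^0$. By virtue of linearity, we may write
$$x'^1 = \gamma(v) \left( x^1 - \frac{v}{c}\, x^0 \right),$$
where $\gamma(v)$ is some $v$-dependent, positive coefficient (on account of the fact that the directions $x^1$ and $x'^1$ coincide). Demanding that $\Lambda$ satisfies relation (1.9), with the restrictions $\det(\Lambda) = 1$ and $\Lambda^0{}_0 > 0$, then yields
$$\Lambda = \begin{pmatrix} \gamma & -\gamma v/c & 0 & 0 \\ -\gamma v/c & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \gamma(v) = \frac{1}{\sqrt{1 - v^2/c^2}}. \tag{1.17}$$
A Lorentz transformation of this form is called a boost (with velocity $v$ in the direction $x^1$). In
particular, reality of $\gamma$ requires $|v|$ to be smaller than $c$: boosts faster than light
are forbidden.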
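The properties of the boost matrix can be checked numerically. The short Python sketch below builds the matrix (1.17) for a given velocity, verifies the defining relation (1.9), and confirms that the square of the interval (1.4) is unchanged. It assumes the metric signature $(-,+,+,+)$; the function names (`boost`, `is_lorentz`, `interval_squared`) and the sample separation are illustrative choices, not notation from the references.

```python
import math

C = 299_792_458.0  # speed of light in m/s, eq. (1.1)

# Minkowski metric matrix; the signature (-, +, +, +) is assumed here.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]


def matmul(A, B):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def boost(v):
    """Boost matrix of the form (1.17) with velocity v along x^1."""
    beta = v / C
    if abs(beta) >= 1.0:
        raise ValueError("boosts require |v| < c")
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def is_lorentz(Lam, tol=1e-9):
    """Check the defining relation Lam^t . eta . Lam = eta, eq. (1.9)."""
    M = matmul(matmul(transpose(Lam), ETA), Lam)
    return all(abs(M[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))


def interval_squared(dx):
    """Square of the interval, eq. (1.4)."""
    return -dx[0] ** 2 + dx[1] ** 2 + dx[2] ** 2 + dx[3] ** 2


Lam = boost(0.6 * C)                      # gamma = 1.25 for v = 0.6 c
dx = [2.0, 1.0, -0.5, 3.0]                # arbitrary separation (hypothetical)
dx_prime = [sum(Lam[i][k] * dx[k] for k in range(4)) for i in range(4)]

print(is_lorentz(Lam))
print(interval_squared(dx), interval_squared(dx_prime))
```

Calling `boost` with $|v| \geq c$ raises an error, mirroring the statement that $\gamma$ is real only for $|v| < c$.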
Boosts give rise to the counterintuitive phenomena of time dilation and length contraction. Let us briefly describe the former. Suppose Bob, moving at velocity $v$ with respect to Alice, carries a clock and measures a time interval $\Delta t'$ in his reference frame; for definiteness, suppose he measures the time elapsed between two consecutive “ticks” of his clock, and let the clock be located at the origin of his reference frame. Call $A$ the event “Bob’s clock ticks for the first time (at his location at that moment)”, and call $B$ the event “Bob’s clock ticks for the second time (at his location at that time)”. Then, in Alice’s coordinates, the time interval $\Delta t$ separating the events $A$ and $B$ is not equal to $\Delta t'$; rather, according to (1.17), one has $\Delta t = \gamma\, \Delta t'$. Since $\gamma$ is larger than one whenever $v \neq 0$, this means that Alice measures a longer duration than Bob: Bob’s time is “dilated” compared to Alice’s time, and $\gamma$ is precisely the dilation factor. This phenomenon is responsible, for instance, for the fact that cosmic muons falling into Earth’s atmosphere can be detected at the level of the oceans even though their time of flight (as measured by an observer standing still on Earth’s surface) is about a hundred times longer than their proper lifetime. In subsection 4.3, we will see that boosts also lead to surprising optical effects on the celestial sphere.
1.4.2 Notion of rapidity
Although the notion of velocity used above is the most intuitive one, it is not the most practical one from a mathematical viewpoint. In particular, composing two boosts with velocities $v$ and $w$ (in the same direction) does not yield a boost with velocity $v + w$. It would be convenient to find an alternative parameter to specify boosts, one that would be additive when two boosts are combined. This leads to the notion of rapidity [9, 5],
$$\chi \equiv \mathrm{arctanh}\!\left(\frac{v}{c}\right), \tag{1.18}$$
in terms of which the boost matrix (1.17) becomes
$$L(\chi) = \begin{pmatrix} \cosh\chi & -\sinh\chi & 0 & 0 \\ -\sinh\chi & \cosh\chi & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. \tag{1.19}$$
One verifies that the composition of two such boosts with rapidities $\chi_1$ and $\chi_2$ is a
boost of the same form, with rapidity $\chi_1 + \chi_2$.
Rapidity is thus the additive parameter specifying Lorentz boosts. It exhibits the fact that boosts along a given axis form a non-compact, one-parameter subgroup of $L_+^\uparrow$. It also readily provides a formula for the addition of velocities: the composition of two boosts with velocities $v$ and $w$ is a boost with rapidity $\chi = \mathrm{arctanh}(v/c) + \mathrm{arctanh}(w/c)$; equivalently, according to (1.18), the velocity $V$ of the resulting boost is
$$V = c \tanh\chi = \frac{v + w}{1 + vw/c^2}.$$
This is the usual formula for the addition of velocities (in the same direction) in special relativity
[7].
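The additivity of rapidity is easy to test numerically. The sketch below, written under the assumption that velocities are expressed in units of $c$ (so that $\beta = v/c$), multiplies the non-trivial $2\times 2$ blocks of two boosts of the form (1.19) and compares the result with a single boost of rapidity $\chi_1 + \chi_2$; it also compares the velocity-addition formula with $\tanh(\chi_1 + \chi_2)$. The function names are hypothetical.

```python
import math


def boost_block(chi):
    """Non-trivial 2x2 block of the boost (1.19), acting on (x^0, x^1)."""
    return [[math.cosh(chi), -math.sinh(chi)],
            [-math.sinh(chi), math.cosh(chi)]]


def matmul2(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def rapidity(beta):
    """Rapidity (1.18) for a velocity beta = v/c with |beta| < 1."""
    return math.atanh(beta)


def add_velocities(beta1, beta2):
    """Relativistic addition of collinear velocities, in units of c."""
    return (beta1 + beta2) / (1.0 + beta1 * beta2)


chi1, chi2 = 0.3, 1.1
product = matmul2(boost_block(chi1), boost_block(chi2))
expected = boost_block(chi1 + chi2)
print(max(abs(product[i][j] - expected[i][j])
          for i in range(2) for j in range(2)))  # rounding-level difference

# Two boosts at half the speed of light combine into a boost at 0.8 c.
print(add_velocities(0.5, 0.5))
print(math.tanh(rapidity(0.5) + rapidity(0.5)))
```

Working with the $2\times 2$ block is enough here because the two boosts share the same axis, so the remaining coordinates are untouched.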
As practical as rapidity is, its physical meaning is a bit obscure: the definition
(1.18) does not seem related to any measurable quantity whatsoever. But in fact, there exist at least
three different natural definitions of the notion of “speed”, and rapidity is one of them
[10]. To illustrate these definitions, consider an observer, Bob, who travels by train from
Brussels to Paris [29], and
measures his speed during the journey. For simplicity, we will assume that the motion takes place along the
$x$ axis of Alice, an inertial observer standing still on the ground.
A first notion of speed he might want
to define is an “extrinsic” one: he lets Alice measure the distance between Brussels and Paris, and the two
clocks of the Brussels and Paris train stations are synchronized. Then, looking at the clocks upon
departure and upon arrival, he defines his velocity $v$ as the ratio of the distance measured by Alice to
the duration of his trip, measured by the clocks in Brussels and Paris. The infinitesimal version of velocity
is the usual expression $v = dx/dt$, where $x$ and $t$ are the space and time coordinates of an inertial
frame which, in general, is not
related to Bob. (In the present case, these are the coordinates that Alice, or any inertial observer
standing still on the ground, would likely use.)
A second natural definition of speed is given by proper velocity. To define this notion, Bob still lets Alice measure the distance between Brussels and Paris, but now he divides this distance by the duration that he himself has measured using his wristwatch. The infinitesimal version of (the $x$ component of) proper velocity is $u = dx/d\tau$, where $\tau$ denotes Bob’s proper time, defined by
$$d\tau \equiv \frac{1}{c} \sqrt{-\eta_{\mu\nu}\, dx^\mu\, dx^\nu} \tag{1.20}$$
along Bob’s trajectory. (In (1.20), it is understood that Bob’s trajectory is written as $x^\mu(\tau)$
in the coordinates of Alice, but the value of $d\tau$ would be the same in any inertial frame with
the same direction for the arrow of time, by virtue of Lorentz-invariance of the interval,
eq. (1.6).) If Bob’s motion occurs at constant speed, the relation
between the $x$ component of
proper velocity and standard velocity is $u = \gamma v$, as follows from time dilation.
Finally, Bob may decide not to believe Alice’s measurement of distance, and that he wants to measure everything by himself. Of course, sitting in the train, he cannot measure the distance between Brussels and Paris using a measuring tape. He therefore carries an accelerometer and measures his proper acceleration at each moment during the journey. Starting from rest in Brussels (at proper time $\tau = 0$ say), he can then integrate this acceleration from $0$ to $\tau$ to obtain a measure of his speed at proper time $\tau$. This notion of speed is precisely the rapidity introduced above [10], up to a conversion factor given by the speed of light. Indeed, assuming that Bob accelerates in the direction of positive $x$, his proper acceleration at proper time $\tau$ is the Lorentz-invariant quantity [6, 30]
$$a(\tau) = \sqrt{\eta_{\mu\nu}\, \frac{d^2 x^\mu}{d\tau^2}\, \frac{d^2 x^\nu}{d\tau^2}}.$$
It is the value of acceleration that would be measured by a “locally inertial observer”, that is, an observer whose velocity coincides with Bob’s velocity at proper time $\tau$, but who is falling freely instead of following an accelerated trajectory. The integral of this quantity along proper time, provided Bob accelerates in the direction of positive $x$, is thus
$$\int_0^\tau d\tau'\, a(\tau') = \int_0^\tau d\tau'\, \sqrt{\eta_{\mu\nu}\, \frac{d^2 x^\mu}{d\tau'^2}\, \frac{d^2 x^\nu}{d\tau'^2}} \tag{1.21}$$
$$= \int_0^\tau d\tau'\, \frac{du/d\tau'}{\sqrt{1 + u^2/c^2}}, \tag{1.22}$$
where we used the definition (1.20) of proper time, which implies
$$\left( \frac{dx^0}{d\tau} \right)^2 = c^2 + u^2.$$
But $\frac{d}{du}\left[ c\, \mathrm{arcsinh}(u/c) \right] = 1/\sqrt{1 + u^2/c^2}$; since Bob’s proper velocity along $x$ vanishes at proper time $\tau = 0$, the integral (1.22) can be written as
$$\int_0^\tau d\tau'\, a(\tau') = c\, \mathrm{arcsinh}\!\left( \frac{u(\tau)}{c} \right) = c\, \mathrm{arctanh}\!\left( \frac{v(\tau)}{c} \right) = c\, \chi(\tau), \tag{1.23}$$
where $u(\tau)$ and $v(\tau)$ are Bob’s proper velocity and velocity at proper time $\tau$. This is precisely the relation we wanted to prove: up to the factor $c$, the integral (1.21) of proper acceleration coincides with rapidity. In subsection 4.3, we will also see that rapidity is the simplest parameter describing the effect of Lorentz boosts on the celestial sphere.
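The statement that integrated proper acceleration equals rapidity can be illustrated numerically. The following sketch works in units where $c = 1$ and picks an arbitrary proper-velocity profile $u(\tau) = \tau + \tau^2/2$ (a hypothetical choice, with $u(0) = 0$ and $du/d\tau > 0$). It integrates the proper acceleration $\dot{u}/\sqrt{1 + u^2}$ with a simple midpoint rule and compares the result with $\mathrm{arctanh}(v)$, using $u = \gamma v$ to recover $v$.

```python
import math


def proper_velocity(tau):
    """Hypothetical proper-velocity profile u(tau), in units where c = 1."""
    return tau + 0.5 * tau * tau


def proper_velocity_rate(tau):
    """Derivative du/dtau of the profile above."""
    return 1.0 + tau


def proper_acceleration(tau):
    """Integrand of (1.22): (du/dtau) / sqrt(1 + u^2), with c = 1."""
    u = proper_velocity(tau)
    return proper_velocity_rate(tau) / math.sqrt(1.0 + u * u)


def integrate_midpoint(f, a, b, n=10_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def rapidity_from_proper_velocity(u):
    """Rapidity arctanh(v), with v recovered from u = gamma * v."""
    v = u / math.sqrt(1.0 + u * u)
    return math.atanh(v)


tau_end = 2.0
integrated = integrate_midpoint(proper_acceleration, 0.0, tau_end)
chi = rapidity_from_proper_velocity(proper_velocity(tau_end))
print(integrated, chi)  # the two numbers agree up to discretization error
```

Any other smooth profile with $u(0) = 0$ and positive acceleration could be substituted for `proper_velocity`, provided `proper_velocity_rate` is updated accordingly.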
1.5 Standard decomposition theorem
We now derive the following important result, which essentially states that any proper, orthochronous Lorentz transformation can be written as a combination of rotations together with a standard boost (1.17). We closely follow [5].
Theorem.
Any $\Lambda \in L_+^\uparrow$ can be written as a product
$$\Lambda = R_1\, L(\chi)\, R_2, \tag{1.24}$$
where $R_1$ and $R_2$ are rotations of the form (1.14) and $L(\chi)$ is a Lorentz boost of the form (1.19). The decomposition (1.24) is called the standard decomposition of a proper, orthochronous Lorentz transformation. There are many such decompositions for a given $\Lambda$.
Proof.
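Let $\Lambda \in L_+^\uparrow$. Let $\mathbf{v}$ denote the vector in $\mathbb{R}^3$ whose components are the coefficients $\Lambda^i{}_0$. If $\mathbf{v} = 0$, then $\Lambda^0{}_0 = 1$ and $\Lambda$ is of the form (1.14). In that case the decomposition (1.24) is trivially satisfied with, say, $R_1 = \Lambda$, $R_2 = \mathbb{I}$ and $\chi = 0$. If $\mathbf{v} \neq 0$, let $\mathbf{e}_1$ denote one of the two unit vectors proportional to $\mathbf{v}$ ($\mathbf{e}_1 = \lambda \mathbf{v}$, $\lambda = \pm 1/|\mathbf{v}|$); write its components as $e_1^i$. Let also $\mathbf{e}_2$ and $\mathbf{e}_3$ be two vectors in $\mathbb{R}^3$ such that the set $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ be an orthonormal basis of $\mathbb{R}^3$ with positive orientation (i.e. the matrix whose entries are the components of $\mathbf{e}_1$, $\mathbf{e}_2$ and $\mathbf{e}_3$ belongs to $SO(3)$); denote their respective components as $e_2^i$ and $e_3^i$. Then consider the rotation matrix
$$R \equiv \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & e_1^1 & e_1^2 & e_1^3 \\ 0 & e_2^1 & e_2^2 & e_2^3 \\ 0 & e_3^1 & e_3^2 & e_3^3 \end{pmatrix}.$$
The product $R\Lambda$ takes the form
$$R\Lambda = \begin{pmatrix} \Lambda^0{}_0 & * & * & * \\ * & * & * & * \\ 0 & f_2^1 & f_2^2 & f_2^3 \\ 0 & f_3^1 & f_3^2 & f_3^3 \end{pmatrix}, \tag{1.25}$$
where the $*$’s are unimportant numbers and where $f_2^i$ and $f_3^i$ are the components of two mutually orthogonal unit vectors in $\mathbb{R}^3$ (because $R\Lambda \in L_+^\uparrow$, so the rows and columns of this matrix form a Lorentz basis); let us denote these unit vectors by $\mathbf{f}_2$ and $\mathbf{f}_3$, respectively. Let also $\mathbf{f}_1$ be the (unique) vector such that $\{\mathbf{f}_1, \mathbf{f}_2, \mathbf{f}_3\}$ be an orthonormal basis of $\mathbb{R}^3$ with positive orientation. Define the rotation $R'$ by
$$R' \equiv \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & f_1^1 & f_2^1 & f_3^1 \\ 0 & f_1^2 & f_2^2 & f_3^2 \\ 0 & f_1^3 & f_2^3 & f_3^3 \end{pmatrix},$$
where $f_1^i$ are the components of $\mathbf{f}_1$. Then, the product $R \Lambda R'$ reads
$$R \Lambda R' = \begin{pmatrix} \Lambda^0{}_0 & * & * & * \\ * & * & * & * \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$
Since $R \Lambda R'$ belongs to $L_+^\uparrow$, the two last rows of this matrix must be orthogonal to the two first ones, so the upper right $2\times 2$ block
$$\begin{pmatrix} * & * \\ * & * \end{pmatrix}$$
must vanish. Thus $R \Lambda R'$ is block-diagonal,
$$R \Lambda R' = \begin{pmatrix} M & 0 \\ 0 & \mathbb{I}_2 \end{pmatrix},$$
with $M \in SO(1,1)^\uparrow$. This implies that
$$M = \begin{pmatrix} \cosh\chi & -\sinh\chi \\ -\sinh\chi & \cosh\chi \end{pmatrix}$$
for some $\chi \in \mathbb{R}$, so that $R \Lambda R'$ is a pure boost of the form (1.19). In other words, writing $R_1 = R^{-1}$ and $R_2 = R'^{-1}$, the decomposition (1.24) is satisfied. ∎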
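A simple consequence of the decomposition (1.24) can be checked numerically: since rotations of the form (1.14) leave the time direction untouched, the $00$ entry of $R_1 L(\chi) R_2$ equals $\cosh\chi$, so the rapidity can be read off (up to sign) from $\Lambda^0{}_0$. The Python sketch below builds such a product from arbitrarily chosen rotations and verifies this. It assumes the signature $(-,+,+,+)$, and all function names and angles are hypothetical choices for the example; it does not implement the constructive proof above.

```python
import math

ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]  # assumed signature (-, +, +, +)


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(A):
    return [list(row) for row in zip(*A)]


def rotation(i, j, theta):
    """Rotation by theta in the spatial (x^i, x^j) plane, embedded as in (1.14)."""
    R = [[1.0 if a == b else 0.0 for b in range(4)] for a in range(4)]
    c, s = math.cos(theta), math.sin(theta)
    R[i][i], R[i][j], R[j][i], R[j][j] = c, -s, s, c
    return R


def boost(chi):
    """Boost of rapidity chi along x^1, eq. (1.19)."""
    ch, sh = math.cosh(chi), math.sinh(chi)
    return [[ch, -sh, 0.0, 0.0],
            [-sh, ch, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def is_lorentz(Lam, tol=1e-9):
    """Check Lam^t . eta . Lam = eta, eq. (1.9)."""
    M = matmul(matmul(transpose(Lam), ETA), Lam)
    return all(abs(M[i][j] - ETA[i][j]) < tol
               for i in range(4) for j in range(4))


def rapidity_of(Lam):
    """Absolute value of the rapidity, read off from Lam^0_0 = cosh(chi)."""
    return math.acosh(max(1.0, Lam[0][0]))


R1 = matmul(rotation(1, 2, 0.7), rotation(2, 3, 0.2))
R2 = matmul(rotation(1, 3, -0.4), rotation(1, 2, 1.3))
chi = 1.2
Lam = matmul(matmul(R1, boost(chi)), R2)

print(is_lorentz(Lam), rapidity_of(Lam))
```

The norm of the spatial vector with components $\Lambda^i{}_0$ is $|\sinh\chi|$, in agreement with (1.11).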
Remark.
The standard decomposition theorem holds in any space-time dimension $d \geq 3$. The proof is a straightforward adaptation of the argument used in the four-dimensional case. (In $d = 2$ space-time dimensions, there are no spatial rotations and the proper, orthochronous Lorentz group only contains boosts in the spatial direction $x^1$. In this sense, the standard decomposition theorem is trivially satisfied also in $d = 2$.)
1.6 Connected components of the Lorentz group
In a general topological group $G$ (and in particular in any Lie group), we call connected component of $g \in G$ the set $G_g$ of all elements in $G$ that can be reached by a continuous path starting at $g$. In particular, we denote by $G_e$ the connected component of the identity $e \in G$. A group is connected if it has only one connected component, that of the identity $e$, in which case $G = G_e$.
Proposition.
$G_e$ is a normal subgroup of $G$. Furthermore, the set of connected components of $G$ coincides with the quotient group $G/G_e$.
Proof.
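We first prove that $G_e$ is a subgroup of $G$. Let $g$ and $h$ belong to $G_e$, and let
$$\gamma_g, \gamma_h : [0,1] \to G$$
be two continuous paths such that $\gamma_g(0) = \gamma_h(0) = e$ and $\gamma_g(1) = g$, $\gamma_h(1) = h$. Then the path
$$[0,1] \to G : t \mapsto \gamma_g(t) \cdot \gamma_h(t)^{-1}$$
joins $e$ to $g \cdot h^{-1}$. Therefore $g \cdot h^{-1}$ belongs to $G_e$, and the latter is a subgroup of $G$.
Let us now show that $G_e$ is a normal subgroup. Let $g \in G_e$ and let $h \in G$. Consider the element $h g h^{-1}$ in $G$. Since $g$ belongs to the connected component of the identity, there exists a continuous path $\gamma : [0,1] \to G$ such that $\gamma(0) = e$ and $\gamma(1) = g$. But then the map
$$[0,1] \to G : t \mapsto h \cdot \gamma(t) \cdot h^{-1}$$
is also a continuous path in $G$, joining $e$ to $h g h^{-1}$. Therefore $h g h^{-1} \in G_e$.
Since this is true for any $g \in G_e$ and any $h \in G$, $G_e$ is a normal subgroup of $G$.
We now turn to the second part of the proposition. Suppose first that $g$ and $g'$ belong to the same
connected component in $G$; let
$$\gamma : [0,1] \to G$$
be a continuous path such that $\gamma(0) = g$ and
$\gamma(1) = g'$. Then, $g^{-1} \cdot \gamma(t)$ is a continuous path joining the identity to $g^{-1} g'$, so
$g^{-1} g'$ belongs to $G_e$. Therefore, the cosets $g\, G_e$ and $g'\, G_e$, seen as elements of the quotient
group $G/G_e$, coincide. (Since $G_e$ is a normal subgroup of $G$, $G/G_e$ is indeed a group.)
Conversely, suppose the cosets $g\, G_e$ and $g'\, G_e$ coincide as elements of $G/G_e$. Then there exists an $h$ in $G_e$ such that $g' = g \cdot h$, and a path $\gamma$ such that $\gamma(0) = e$ and $\gamma(1) = h$. But then the path $g \cdot \gamma(t)$ joins $g$ to $g'$, so $g$ and $g'$ belong to the same connected component of $G$. ∎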
Let us now apply this proposition to the Lorentz group. First observe that $L_+^\uparrow$ is connected, as follows from the standard decomposition theorem: in (1.24), each factor can be linked to the identity by a continuous path, and this remains true for the product of these factors. (Footnote 2: This argument relies in particular on the fact that the group $\mathrm{SO}(3)$ of orientation-preserving rotations is connected.) Thus, $L_+^\uparrow$ is the connected subgroup of the Lorentz group, and the connected components of the latter coincide with the quotient $\mathrm{O}(3,1)/L_+^\uparrow$. As noted below expression (1.16), each element of the Lorentz group can be reached by adding parity and/or time-reversal to the connected Lorentz group $L_+^\uparrow$. Thus, the quotient $\mathrm{O}(3,1)/L_+^\uparrow$ is the group generated by $P$ and $T$, and the Lorentz group has exactly four connected components, denoted as follows:
$L_+^\uparrow$ : | $\det\Lambda=+1$ and | $\Lambda^0{}_0\geq 1$, |
$L_+^\downarrow$ : | $\det\Lambda=+1$ and | $\Lambda^0{}_0\leq -1$, |
$L_-^\uparrow$ : | $\det\Lambda=-1$ and | $\Lambda^0{}_0\geq 1$, |
$L_-^\downarrow$ : | $\det\Lambda=-1$ and | $\Lambda^0{}_0\leq -1$. |
As already mentioned, $L_+=L_+^\uparrow\cup L_+^\downarrow$ and $L^\uparrow=L_+^\uparrow\cup L_-^\uparrow$ are subgroups of the Lorentz group. Note that $L_+^\uparrow\cup L_-^\downarrow$ is also a group.
Remark.
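Analogous results hold for the Lorentz group $\mathrm{O}(d-1,1)$ in any space-time dimension $d$. The properties $\det\Lambda=\pm1$ and $|\Lambda^0{}_0|\geq1$ remain true and the definitions of the proper Lorentz group $\mathrm{SO}(d-1,1)$ and the orthochronous Lorentz group $\mathrm{O}(d-1,1)^\uparrow$ are straightforward generalizations of $L_+$ and $L^\uparrow$. Similarly, one defines $\mathrm{SO}(d-1,1)^\uparrow=\mathrm{SO}(d-1,1)\cap\mathrm{O}(d-1,1)^\uparrow$. The standard decomposition theorem (1.24) remains true, provided $\mathrm{SO}(3)$ is replaced by $\mathrm{SO}(d-1)$. In particular, $\mathrm{SO}(d-1,1)^\uparrow$ is the connected subgroup of the Lorentz group and the latter splits in four connected components. The transition between different components is realised by the generalization of the time-reversal and parity matrices (1.15) and (1.16). (In odd space-time dimensions, parity is not just $\mathrm{diag}(1,-1,\dots,-1)$, since that matrix belongs to the proper Lorentz group. Rather, in odd dimensions, parity is the reflection of a single spatial coordinate, for instance $\mathrm{diag}(1,-1,1,\dots,1)$.)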
2 Lorentz groups and special linear groups
Having defined the Lorentz group, we now turn to its realization as the group of volume-preserving linear transformations of $\mathbb{C}^2$. Since the method used to derive this isomorphism has a wide range of applications, we will use it repeatedly in this section, proving three different isomorphisms along the way: first, in subsection 2.1, we relate $\mathrm{SU}(2)$ to $\mathrm{SO}(3)$. Then, in subsection 2.2, we turn to the isomorphism between $\mathrm{SL}(2,\mathbb{R})/\mathbb{Z}_2$ and the Lorentz group in three space-time dimensions. Finally, in subsection 2.3, we establish the announced link between $\mathrm{SL}(2,\mathbb{C})$ and $L_+^\uparrow$. (Footnote 3: These results have important implications for representation theory; we shall not discuss those implications here and refer to [5, 28, 31] for more details.) We end in subsection 2.4 by mentioning (without proving them) higher-dimensional generalizations of these results and their relation to division algebras.
Before dealing with specific constructions, let us review a general group-theoretic result. Let $G$ and $H$ be groups, $f:G\to H$ a homomorphism. Then, the kernel of $f$ is a normal subgroup of $G$ and the quotient of $G$ by $\mathrm{Ker}(f)$ is isomorphic (as a group) to the image of $f$:
$$G/\mathrm{Ker}(f)\cong\mathrm{Im}(f).\tag{2.1}$$
The proof is elementary, as it suffices to observe that the map
$$G/\mathrm{Ker}(f)\to\mathrm{Im}(f):g\,\mathrm{Ker}(f)\mapsto f(g)$$
is a bijective homomorphism, that is, the sought-for isomorphism. All isomorphisms exposed in this section will be obtained using that method: we will construct well-chosen homomorphisms that will lead us to the desired isomorphisms through relation (2.1).
2.1 A compact analogue
Here we establish the isomorphism between $\mathrm{SO}(3)$ and the quotient of $\mathrm{SU}(2)$ by its center, following [5]. This relation is important for our purposes both because of the simplicity of the example, and because of its role in the isomorphism between the Lorentz group in four dimensions and $\mathrm{SL}(2,\mathbb{C})/\mathbb{Z}_2$. We begin by reviewing briefly the main properties of the unitary group in two dimensions.
2.1.1 Properties of $\mathrm{SU}(2)$
The unitary group $\mathrm{U}(2)$ in two dimensions is the group of linear transformations of $\mathbb{C}^2$ that preserve the norm $\|z\|=\sqrt{|z_1|^2+|z_2|^2}$. It consists of $2\times2$ complex matrices $U$ that are unitary in the sense that
$$U^\dagger U=\mathbb{I},\tag{2.2}$$
where $\dagger$ denotes hermitian conjugation ($U^\dagger=\bar U^{t}$). It follows that the lines and columns of each matrix $U\in\mathrm{U}(2)$ define an orthonormal basis of $\mathbb{C}^2$ for the scalar product $\langle z|w\rangle=\bar z_1w_1+\bar z_2w_2$. By virtue of the defining property (2.2), each $U\in\mathrm{U}(2)$ has $|\det U|=1$. In particular, we define the special unitary group in two dimensions as the subgroup $\mathrm{SU}(2)$ of $\mathrm{U}(2)$ consisting of matrices with unit determinant, $\det U=1$.
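For later purposes, we will need to know some topological properties of $\mathrm{SU}(2)$. Demanding that the matrix
$$U=\begin{pmatrix}\alpha&\beta\\ \gamma&\delta\end{pmatrix}$$
belong to $\mathrm{SU}(2)$ imposes the conditions $|\alpha|^2+|\beta|^2=1$, $\alpha\bar\gamma+\beta\bar\delta=0$ and $\alpha\delta-\beta\gamma=1$. These requirements are solved by $\delta=\bar\alpha$ and $\gamma=-\bar\beta$, so each matrix in $\mathrm{SU}(2)$ can be written as
$$U=\begin{pmatrix}\alpha&\beta\\ -\bar\beta&\bar\alpha\end{pmatrix},\qquad|\alpha|^2+|\beta|^2=1.$$
Thus, each element of $\mathrm{SU}(2)$ is uniquely determined by four real numbers $\mathrm{Re}\,\alpha$, $\mathrm{Im}\,\alpha$, $\mathrm{Re}\,\beta$, $\mathrm{Im}\,\beta$ such that $|\alpha|^2+|\beta|^2=1$. These numbers define a point on the unit $3$-sphere $S^3$. (Footnote 4: Recall that the $n$-sphere $S^n$ is defined as the set of points in $\mathbb{R}^{n+1}$ that are located at unit distance from the origin.) Furthermore, this description is not redundant (two different quadruples lead to two different elements of $\mathrm{SU}(2)$), so $\mathrm{SU}(2)$ is homeomorphic to $S^3$, as a topological space. (Footnote 5: By definition, two topological spaces are homeomorphic if there exists a continuous bijection, mapping the first space on the second one, whose inverse is also continuous.) In particular, $\mathrm{SU}(2)$ is connected and simply connected.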
Finally, recall that the center of a group $G$ is the set of elements that commute with all elements of $G$. In particular, it is an Abelian normal subgroup of $G$. It is easy to show that the center of $\mathrm{SU}(2)$ consists of the two matrices
$$\mathbb{I}=\begin{pmatrix}1&0\\0&1\end{pmatrix}\quad\text{and}\quad-\mathbb{I}=\begin{pmatrix}-1&0\\0&-1\end{pmatrix}\tag{2.3}$$
and is thus isomorphic to $\mathbb{Z}_2$. This observation will be important in the next paragraph, and we will use it again once we turn to the Lorentz group in four dimensions.
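The identification of $\mathrm{SU}(2)$ with the $3$-sphere can be checked numerically. The following is a minimal Python sketch, assuming the parametrization $\alpha=a+ib$, $\beta=c+id$ with rows $(\alpha,\beta)$ and $(-\bar\beta,\bar\alpha)$; all function names are hypothetical helpers introduced only for this illustration.

```python
def su2_from_s3(a, b, c, d, tol=1e-12):
    """Map a point (a, b, c, d) of the unit 3-sphere to a 2x2 matrix.

    Assumed parametrization: alpha = a + i b, beta = c + i d and
    U = [[alpha, beta], [-conj(beta), conj(alpha)]].
    """
    if abs(a * a + b * b + c * c + d * d - 1.0) > tol:
        raise ValueError("(a, b, c, d) must lie on the unit 3-sphere")
    alpha, beta = complex(a, b), complex(c, d)
    return [[alpha, beta], [-beta.conjugate(), alpha.conjugate()]]


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(A):
    """Hermitian conjugate (conjugate transpose) of a 2x2 matrix."""
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]


def det(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def close(A, B, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(2) for j in range(2))


IDENTITY = [[1, 0], [0, 1]]
U = su2_from_s3(0.5, 0.5, 0.5, 0.5)           # a generic point of S^3
MINUS_ONE = su2_from_s3(-1.0, 0.0, 0.0, 0.0)  # the point (-1, 0, 0, 0)

unitary = close(matmul(dagger(U), U), IDENTITY)               # U^dagger U = 1
unit_det = abs(det(U) - 1) < 1e-12                            # det U = 1
central = close(matmul(MINUS_ONE, U), matmul(U, MINUS_ONE))   # -1 commutes with U
```

In this sketch the point $(-1,0,0,0)$ of $S^3$ is sent to minus the identity, the non-trivial element of the center (2.3).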
2.1.2 The isomorphism
Theorem.
One has the following isomorphism:
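$$\mathrm{SO}(3)\cong\mathrm{SU}(2)/\mathbb{Z}_2,\tag{2.4}$$
where $\mathbb{Z}_2$ is the center of $\mathrm{SU}(2)$. In other words, $\mathrm{SU}(2)$ is the double cover of $\mathrm{SO}(3)$, and it is also its universal cover.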
Proof.
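Consider the space $\mathcal{H}_0$ of $2\times2$ traceless Hermitian matrices. Each matrix $X\in\mathcal{H}_0$ can be written as $X=x^i\sigma_i$ (with implicit summation over $i=1,2,3$), where the $x^i$’s are real coefficients, while the $\sigma_i$’s are Pauli matrices
$$\sigma_1=\begin{pmatrix}0&1\\1&0\end{pmatrix},\qquad\sigma_2=\begin{pmatrix}0&-i\\i&0\end{pmatrix},\qquad\sigma_3=\begin{pmatrix}1&0\\0&-1\end{pmatrix}.\tag{2.5}$$
The space $\mathcal{H}_0$ is obviously a three-dimensional real vector space. Note that $\det(X)=-\|x\|^2$, where $\|\cdot\|$ denotes the Euclidean norm in $\mathbb{R}^3$. In addition, as a vector space, $\mathcal{H}_0$ is isomorphic to the Lie algebra of $\mathrm{SU}(2)$. The group $\mathrm{SU}(2)$ naturally acts on its Lie algebra by the adjoint action, for which $U\in\mathrm{SU}(2)$ maps $X$ on $UXU^\dagger$. This action is a representation of $\mathrm{SU}(2)$, that is, a homomorphism from $\mathrm{SU}(2)$ into the linear group of $\mathcal{H}_0$. Furthermore, it preserves the norm in $\mathcal{H}_0\cong\mathbb{R}^3$ in the sense that
$$\det(UXU^\dagger)=\det(X).$$
Thus, the adjoint action of $\mathrm{SU}(2)$ on $\mathcal{H}_0$ consists of orthogonal transformations and we can define a homomorphism
$$f:\mathrm{SU}(2)\to\mathrm{O}(3):U\mapsto f[U],\tag{2.6}$$
where the matrix $f[U]$ is given by the condition
$$U\,x^i\sigma_i\,U^\dagger=\sigma_i\,f[U]^i{}_j\,x^j\qquad\text{for all }x\in\mathbb{R}^3.\tag{2.7}$$
It remains to compute the image and the kernel of the map $f$ so defined.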
We begin with the image. By (2.7), the entries of the matrix $f[U]$ are quadratic combinations of the entries of $U$, so $f$ is continuous. Since $\mathrm{SU}(2)$ is connected, the image of $f$ must be contained in the connected subgroup of $\mathrm{O}(3)$, that is, $\mathrm{SO}(3)$. To prove the isomorphism (2.4), we need to show that the opposite inclusion holds as well, i.e. that any matrix in $\mathrm{SO}(3)$ can be written as $f[U]$ for some $U\in\mathrm{SU}(2)$.
The latter statement actually follows from a geometric observation [5]: any rotation of $\mathbb{R}^3$ (around an axis $n$, by an angle $\theta$) can be written as the product of two reflexions $r_1$ and $r_2$ with respect to planes whose intersection is the rotation axis, the angle between the planes being half the angle of rotation. Explicitly, $r_1$ and $r_2$ can be written as
$$r_a(x)=x-2\,(n_a\cdot x)\,n_a,\qquad a=1,2,\tag{2.8}$$
where $n_1$ and $n_2$ are unit vectors orthogonal to the planes corresponding to the reflexions $r_1$ and $r_2$, respectively. Then, define the matrices $N_1=n_1^i\sigma_i$ and $N_2=n_2^i\sigma_i$. These matrices are Hermitian, traceless, have determinant $-1$ and square to unity. Defining similarly $X=x^i\sigma_i$, the reflexions (2.8) can be written as
$$r_a:X\mapsto-N_aXN_a.$$
Therefore, the rotation $R=r_2\circ r_1$ (with $r_1$ acting first) acts on $X$ according to
$$X\mapsto N_2N_1\,X\,N_1N_2.\tag{2.9}$$
But now note that the product $U=N_2N_1$ is such that $N_1N_2=U^\dagger$ with $UU^\dagger=\mathbb{I}$, and it has unit determinant. Hence $U$ belongs to $\mathrm{SU}(2)$ and the transformation (2.9) is of the form (2.7) defining the homomorphism $f$. Hence we can write the rotation $R$ as $f[U]$. This proves that $f$ is surjective on $\mathrm{SO}(3)$.
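The geometric argument above can be tested numerically. The sketch below assumes the standard Pauli matrices and the reflexion formula $x\mapsto x-2(n\cdot x)n$; the helper names are hypothetical. It compares the composition of two reflexions with the conjugation $X\mapsto UXU^\dagger$, where $U$ is the product of the two matrices built from the unit normals.

```python
import math

# Pauli matrices sigma_1, sigma_2, sigma_3 (standard convention, assumed).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def to_matrix(x):
    """X = x^i sigma_i for a real 3-vector x."""
    return [[sum(x[i] * SIGMA[i][r][c] for i in range(3)) for c in range(2)]
            for r in range(2)]


def to_vector(X):
    """Invert to_matrix using x^i = (1/2) Tr(sigma_i X)."""
    vec = []
    for s in SIGMA:
        M = matmul(s, X)
        vec.append(complex(0.5 * (M[0][0] + M[1][1])).real)
    return vec


def reflect(x, n):
    """Reflexion of x through the plane orthogonal to the unit vector n."""
    dot = sum(xi * ni for xi, ni in zip(x, n))
    return [xi - 2 * dot * ni for xi, ni in zip(x, n)]


def rotate_via_su2(n1, n2, x):
    """Apply X -> U X U^dagger with U = N2 N1 (reflexion n1 first, then n2)."""
    N1, N2 = to_matrix(n1), to_matrix(n2)
    U = matmul(N2, N1)
    U_dagger = matmul(N1, N2)  # valid because N1 and N2 are Hermitian
    return to_vector(matmul(matmul(U, to_matrix(x)), U_dagger))


phi = 0.3                                  # angle between the two planes
n1 = [1.0, 0.0, 0.0]
n2 = [math.cos(phi), math.sin(phi), 0.0]
x = [1.0, 2.0, 3.0]
via_su2 = rotate_via_su2(n1, n2, x)
via_reflexions = reflect(reflect(x, n1), n2)
```

With these inputs both results describe a rotation of `x` by the angle `2 * phi` around the third axis, in line with the statement that the angle between the planes is half the rotation angle.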
To conclude the proof of (2.4) we turn to the kernel of $f$, that is, the inverse image of the unit element in $\mathrm{SO}(3)$. Saying that $f[U]$ is the identity in $\mathrm{SO}(3)$ is just saying that $UXU^\dagger=X$ for any $X$ in $\mathcal{H}_0$, which in turn is equivalent to saying that $U$ commutes with all $\sigma_i$’s. But, when this holds, $U$ also commutes with any $e^{iX}$. Since $X$ is Hermitian and traceless, this is the same as saying that $U$ commutes with all elements of $\mathrm{SU}(2)$, i.e. that $U$ belongs to the center of $\mathrm{SU}(2)$. The latter consists of the two matrices (2.3), which form a group $\mathbb{Z}_2$. ∎
Remark.
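From the definition (2.7) of the homomorphism $f$ and the form of the Pauli matrices, one can easily read off the explicit expression of the orthogonal matrix $f[U]$, for an element $U$ of $\mathrm{SU}(2)$:
$$f\left[\begin{pmatrix}\alpha&\beta\\-\bar\beta&\bar\alpha\end{pmatrix}\right]=\begin{pmatrix}\mathrm{Re}(\alpha^2-\beta^2)&\mathrm{Im}(\alpha^2+\beta^2)&-2\,\mathrm{Re}(\alpha\beta)\\-\mathrm{Im}(\alpha^2-\beta^2)&\mathrm{Re}(\alpha^2+\beta^2)&2\,\mathrm{Im}(\alpha\beta)\\2\,\mathrm{Re}(\alpha\bar\beta)&2\,\mathrm{Im}(\alpha\bar\beta)&|\alpha|^2-|\beta|^2\end{pmatrix}.\tag{2.10}$$
This formula exhibits the fact that $f$ is insensitive to an overall change of sign in the entries of its argument, since the right-hand side only involves quadratic combinations of those entries. In particular, it implies that the kernel of $f$ must contain the matrices (2.3).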
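For readers who wish to experiment, here is a minimal Python sketch of the homomorphism from $\mathrm{SU}(2)$ to $\mathrm{SO}(3)$. It assumes the standard Pauli matrices and uses the trace identity $\mathrm{Tr}(\sigma_i\sigma_j)=2\delta_{ij}$ to extract the rotation matrix from the adjoint action; the function names are hypothetical.

```python
import cmath
import math

# Pauli matrices sigma_1, sigma_2, sigma_3 (standard convention, assumed).
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(A, B):
    """Product of two matrices stored as nested lists (any compatible sizes)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def dagger(A):
    """Hermitian conjugate of a matrix stored as nested lists."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]


def su2_to_so3(U):
    """Return R with R[i][j] = (1/2) Tr(sigma_i U sigma_j U^dagger)."""
    U_dagger = dagger(U)
    R = []
    for s_i in SIGMA:
        row = []
        for s_j in SIGMA:
            M = matmul(s_i, matmul(U, matmul(s_j, U_dagger)))
            row.append(complex(0.5 * (M[0][0] + M[1][1])).real)
        R.append(row)
    return R


theta = 0.8
# Diagonal element of SU(2); it should map to a rotation around the third axis.
U = [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]
minus_U = [[-entry for entry in row] for row in U]

R = su2_to_so3(U)
R_minus = su2_to_so3(minus_U)   # same rotation: the sign of U drops out
```

In this convention the diagonal matrix above is sent to the rotation by `theta` around the third axis, and `U` and `-U` give the same rotation, which illustrates why the kernel contains the two matrices (2.3).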
2.2 The Lorentz group in three dimensions
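The Lorentz group in three space-time dimensions is the group $\mathrm{O}(2,1)$, as defined in subsection 1.2. Its connected subgroup is $\mathrm{SO}(2,1)^\uparrow$. We will show here that this connected group is isomorphic to the quotient of $\mathrm{SL}(2,\mathbb{R})$ by its center. Before doing that, we review a few topological properties of $\mathrm{SL}(2,\mathbb{R})$. For the record, the results of this subsection will play a minor role in the remainder of these notes, so they may be skipped in a first reading.
2.2.1 Properties of $\mathrm{SL}(2,\mathbb{R})$
The group $\mathrm{SL}(2,\mathbb{R})$ is the group of volume-preserving linear transformations of the plane $\mathbb{R}^2$. It can be seen as the group of real $2\times2$ matrices with unit determinant:
$$\mathrm{SL}(2,\mathbb{R})=\left\{\begin{pmatrix}a&b\\c&d\end{pmatrix}\;\middle|\;a,b,c,d\in\mathbb{R},\ ad-bc=1\right\}.$$
Lemma.
The group $\mathrm{SL}(2,\mathbb{R})$ is connected, but not simply connected. It is homotopic to a circle; in particular, the fundamental group of $\mathrm{SL}(2,\mathbb{R})$ is isomorphic to $\mathbb{Z}$.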
Proof.
Let
Since , the vectors and in are linearly independent. We can therefore find three real numbers , and such that the set
(2.11) |
be an orthonormal basis of . Equivalently, there exists a matrix
such that the product
be an orthogonal matrix (since the lines and columns of an orthogonal matrix form an orthonormal basis). We can choose , making positive. Since , we may choose the orientation of the basis (2.11) so that , i.e. . Thus, any matrix can be written as
for some and strictly positive. Now, the set of triangular matrices of the form
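is homeomorphic to $\mathbb{R}^+\times\mathbb{R}$, which is connected and has the homotopy type of a point. On the other hand, $\mathrm{SO}(2)$ is homeomorphic to a circle. This shows that $\mathrm{SL}(2,\mathbb{R})$ is connected and homotopic to a circle. In particular, the fundamental group of $\mathrm{SL}(2,\mathbb{R})$ is $\mathbb{Z}$. ∎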
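The decomposition used in this proof (a rotation times a triangular matrix with positive diagonal, sometimes called the Iwasawa decomposition) is easy to compute explicitly. The sketch below is one possible way to do it, by applying the Gram-Schmidt procedure to the first column; the ordering of the factors, $S=Q\,K$ with $Q$ a rotation and $K$ upper triangular, is an assumption of this example and may differ from the ordering used in the proof. Function names are hypothetical.

```python
import math


def decompose_sl2r(S, tol=1e-9):
    """Write S in SL(2, R) as rotation(theta) @ [[r, s], [0, 1/r]] with r > 0.

    Returns the triple (theta, r, s).
    """
    (a, b), (c, d) = S
    if abs(a * d - b * c - 1.0) > tol:
        raise ValueError("matrix does not have unit determinant")
    r = math.hypot(a, c)          # length of the first column, never zero
    theta = math.atan2(c, a)      # direction of the first column
    s = (a * b + c * d) / r       # overlap of the two columns
    return theta, r, s


def rebuild(theta, r, s):
    """Multiply the rotation by theta with the triangular matrix [[r, s], [0, 1/r]]."""
    co, si = math.cos(theta), math.sin(theta)
    return [[co * r, co * s - si / r],
            [si * r, si * s + co / r]]


S = [[2.0, 3.0], [1.0, 2.0]]      # det = 2*2 - 3*1 = 1
theta, r, s = decompose_sl2r(S)
S_again = rebuild(theta, r, s)
```

The pair `(r, s)` ranges over the contractible space $\mathbb{R}^+\times\mathbb{R}$, while `theta` parametrizes a circle, which is the topological content of the lemma.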
Note that the center of $\mathrm{SL}(2,\mathbb{R})$ is the same as that of $\mathrm{SU}(2)$ (see eq. (2.3)): it consists of the identity matrix and minus the identity matrix, forming a group isomorphic to $\mathbb{Z}_2$.
2.2.2 The isomorphism
Theorem.
There is an isomorphism
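$$\mathrm{SO}(2,1)^\uparrow\cong\mathrm{SL}(2,\mathbb{R})/\mathbb{Z}_2,\tag{2.12}$$
where $\mathbb{Z}_2$ is the center of $\mathrm{SL}(2,\mathbb{R})$. In other words, $\mathrm{SL}(2,\mathbb{R})$ is the double cover of the connected Lorentz group in three dimensions (but it is not its universal cover, since it is not simply connected).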
Proof.
Our goal is to build a well chosen homomorphism mapping on . Consider, therefore, the space of real, traceless matrices, that is, the Lie algebra of . Each matrix in can be written as
where the ’s are real numbers, while the ’s are the following matrices:
(2.13) |
(These matrices are generators of the Lie algebra of .) Note that, with this convention, the determinant of is, up to a sign, the square of the Minkowskian norm of the corresponding vector :
(2.14) |
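There is a natural action of $\mathrm{SL}(2,\mathbb{R})$ on the space $\mathfrak{sl}(2,\mathbb{R})$ of real, traceless $2\times2$ matrices $X=x^\mu t_\mu$, where the $t_\mu$’s are the matrices (2.13). Namely, with each $S\in\mathrm{SL}(2,\mathbb{R})$, associate the map
$$\mathfrak{sl}(2,\mathbb{R})\to\mathfrak{sl}(2,\mathbb{R}):X\mapsto SXS^{-1}.\tag{2.15}$$
This is the adjoint action of $\mathrm{SL}(2,\mathbb{R})$. It is linear and it preserves the determinant, since $\det(SXS^{-1})=\det(X)$. In addition, thanks to (2.14), each map (2.15) can be seen as a Lorentz transformation acting on the $3$-vector $(x^\mu)$. We can thus define a map
$$f:\mathrm{SL}(2,\mathbb{R})\to\mathrm{O}(2,1):S\mapsto f[S],$$
where the matrix $f[S]$ is given by
$$S\,x^\mu t_\mu\,S^{-1}=t_\mu\,f[S]^\mu{}_\nu\,x^\nu$$
or equivalently,
$$S\,t_\nu\,S^{-1}=t_\mu\,f[S]^\mu{}_\nu.\tag{2.16}$$
Because $(ST)X(ST)^{-1}=S(TXT^{-1})S^{-1}$ for all matrices $S$, $T$ in $\mathrm{SL}(2,\mathbb{R})$, the map $f$ is obviously a homomorphism. Furthermore, by (2.16), the entries of $f[S]$ are quadratic combinations of the entries of $S$; therefore $f$ is continuous. In particular, since $\mathrm{SL}(2,\mathbb{R})$ is connected, the image of $f$ is certainly contained in the connected Lorentz group $\mathrm{SO}(2,1)^\uparrow$.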
It remains to prove that is surjective on and to compute its kernel. We begin with the former. Let therefore . The standard decomposition theorem (1.24) adapted to states that can be written as , where and belong to the subgroup of , while is a standard boost of the form (1.19) with the last line and last column suppressed. To prove surjectivity of on , we need to show that there exist matrices , and in such that
(2.17) |
We begin with the rotations. If belongs to the subgroup of , i.e.
(2.18) |
for some angle , then formula (2.16) gives
(2.19) |
This implies that any rotation in can be realised as for some matrix of the form (2.18) in . Thus, to prove surjectivity of as in (2.17), it only remains to find a matrix such that be a standard boost with rapidity in three space-time dimensions. Again, using (2.16), one verifies that the matrix
is precisely such that
(2.20) |
which was the desired relation. We conclude that is surjective on , as expected.
Finally, to establish (2.12), we need to show that the kernel of $f$ is isomorphic to $\mathbb{Z}_2$. The proof is essentially the same as for $\mathrm{SU}(2)$. Indeed, saying that $S$ belongs to the kernel of $f$ means that $S$ commutes with any linear combination of the generators (2.13). But this implies that $S$ commutes with all elements of $\mathrm{SL}(2,\mathbb{R})$, i.e. that $S$ belongs to the center of $\mathrm{SL}(2,\mathbb{R})$, which is just $\mathbb{Z}_2$. ∎
Remark.
Using (2.16), one can write down explicitly the homomorphism as
where the argument of $f$ on the left-hand side is a matrix in $\mathrm{SL}(2,\mathbb{R})$. We already displayed two special cases of this relation in equations (2.19) and (2.20). As in the analogous homomorphism (2.10) for $\mathrm{SU}(2)$, the fact that the right-hand side only involves quadratic combinations of the entries of the matrix implies that $f$ is insensitive to overall signs, so that its kernel necessarily contains $\mathbb{Z}_2$.
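The homomorphism from $\mathrm{SL}(2,\mathbb{R})$ to the three-dimensional Lorentz group can be explored with a short Python sketch. The basis of traceless matrices used below is an assumption made for this example (it is chosen so that the determinant equals minus the Minkowskian norm squared in mostly-plus signature) and need not coincide with the generators (2.13); in particular, the orientation of the resulting boost may differ from the conventions of (1.19). All names are hypothetical.

```python
import math

# Assumed basis T_0, T_1, T_2 of real traceless 2x2 matrices: for
# X = x^0 T_0 + x^1 T_1 + x^2 T_2 one has det(X) = (x^0)^2 - (x^1)^2 - (x^2)^2.
BASIS = [
    [[0, 1], [-1, 0]],
    [[0, 1], [1, 0]],
    [[1, 0], [0, -1]],
]
ETA = [-1, 1, 1]   # diagonal of the Minkowski metric (mostly-plus, assumed)


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(S):
    """Inverse of a 2x2 matrix with unit determinant."""
    (a, b), (c, d) = S
    return [[d, -b], [-c, a]]


def components(M):
    """Coefficients (x^0, x^1, x^2) of a traceless matrix M in BASIS."""
    return [(M[0][1] - M[1][0]) / 2, (M[0][1] + M[1][0]) / 2, M[0][0]]


def sl2r_to_lorentz(S):
    """3x3 matrix L defined by S T_nu S^{-1} = sum over mu of T_mu L[mu][nu]."""
    S_inv = inverse(S)
    columns = [components(matmul(matmul(S, t), S_inv)) for t in BASIS]
    return [[columns[nu][mu] for nu in range(3)] for mu in range(3)]


def preserves_eta(L, tol=1e-9):
    """Check the Lorentz condition L^T eta L = eta."""
    for r in range(3):
        for s in range(3):
            value = sum(ETA[m] * L[m][r] * L[m][s] for m in range(3))
            if abs(value - (ETA[r] if r == s else 0)) > tol:
                return False
    return True


S = [[2.0, 3.0], [1.0, 2.0]]                      # det = 1
L = sl2r_to_lorentz(S)
L_minus = sl2r_to_lorentz([[-2.0, -3.0], [-1.0, -2.0]])   # image of -S

chi = 0.7
boost = sl2r_to_lorentz([[math.exp(chi / 2), 0.0], [0.0, math.exp(-chi / 2)]])
```

In this sketch the diagonal matrix with entries $e^{\pm\chi/2}$ is sent to a boost of rapidity $\chi$ mixing the first two coordinates, and `S` and `-S` have the same image, as expected from the kernel computation.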
2.3 The Lorentz group in four dimensions
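We now turn to the analogue of the previous isomorphism for the Lorentz group in four dimensions, following [5] once again. As usual, we will begin by reviewing certain topological properties of $\mathrm{SL}(2,\mathbb{C})$, turning to the isomorphism later.
2.3.1 Properties of $\mathrm{SL}(2,\mathbb{C})$
The group $\mathrm{SL}(2,\mathbb{C})$ is the set of volume-preserving linear transformations of the vector space $\mathbb{C}^2$. It can be seen as the group of complex $2\times2$ matrices with unit determinant:
$$\mathrm{SL}(2,\mathbb{C})=\left\{\begin{pmatrix}a&b\\c&d\end{pmatrix}\;\middle|\;a,b,c,d\in\mathbb{C},\ ad-bc=1\right\}.$$
Lemma.
The group $\mathrm{SL}(2,\mathbb{C})$ is connected and simply connected.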
Proof.
We use essentially the same technique as for the group in subsection 2.2. Let
Then , so the vectors and in are linearly independent. We can then find three complex numbers , and such that
(2.21) |
be an orthonormal basis of . In other words, there exists a matrix
such that the product
be unitary (since the lines and columns of a unitary matrix form an orthonormal basis). We may take , so that is real and strictly positive. Since is a complex number with unit modulus, and since the second basis vector in the orthonormal basis (2.21) is determined up to a phase, we may choose , that is, . Thus and is also a strictly positive real number. We conclude that any matrix in can be written as
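for some $U\in\mathrm{SU}(2)$ and some triangular matrix with unit determinant and strictly positive diagonal entries. The group $\mathrm{SU}(2)$ is diffeomorphic to $S^3$, so it is connected and simply connected. Furthermore, the group of triangular matrices of the form
is homeomorphic to $\mathbb{R}^+\times\mathbb{C}$, which is also connected and simply connected. As a consequence, $\mathrm{SL}(2,\mathbb{C})$ itself is connected and simply connected. ∎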
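Note the topological difference between $\mathrm{SL}(2,\mathbb{R})$ and $\mathrm{SL}(2,\mathbb{C})$: both are connected, but only $\mathrm{SL}(2,\mathbb{C})$ is simply connected, while $\mathrm{SL}(2,\mathbb{R})$ is homotopic to a circle. This subtlety has important consequences for (projective) unitary representations of $\mathrm{SL}(2,\mathbb{R})$ and $\mathrm{SL}(2,\mathbb{C})$, and, accordingly, for those of the Lorentz groups in three and four dimensions [28, 27, 32, 33].
One can also verify that the center of $\mathrm{SL}(2,\mathbb{C})$ consists of the two matrices (2.3) and is thus isomorphic to $\mathbb{Z}_2$, exactly as in the case of $\mathrm{SU}(2)$ and $\mathrm{SL}(2,\mathbb{R})$.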
2.3.2 The isomorphism
Theorem.
There exists an isomorphism
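$$L_+^\uparrow\cong\mathrm{SL}(2,\mathbb{C})/\mathbb{Z}_2,\tag{2.22}$$
where $\mathbb{Z}_2$ is the center of $\mathrm{SL}(2,\mathbb{C})$. In other words, $\mathrm{SL}(2,\mathbb{C})$ is the double cover of the connected Lorentz group in four dimensions, and it is also its universal cover.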
Proof.
Proceeding as for the Lorentz group in three dimensions, we wish to build a homomorphism and compute its image and its kernel. Consider, therefore, the vector space of Hermitian matrices. It is a real, four-dimensional vector space: any matrix can be written as
(2.23) |
where the $x^\mu$’s are real coefficients. This can also be written as $X=x^\mu\tau_\mu$, where $\tau_0$ denotes the identity matrix, while $\tau_1$, $\tau_2$ and $\tau_3$ coincide, up to signs, with the Pauli matrices (2.5). (Footnote 6: The choice of signs is slightly unconventional here; it will eventually ensure that the action (4.8) of Lorentz transformations on celestial spheres coincides with the standard expression (3.23) of conformal transformations.) Then, just as in (2.14),
$$\det(X)=-\eta_{\mu\nu}x^\mu x^\nu.\tag{2.24}$$
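Let us now define an action of $\mathrm{SL}(2,\mathbb{C})$ on the space $\mathcal{H}$ of $2\times2$ Hermitian matrices: for each matrix $S\in\mathrm{SL}(2,\mathbb{C})$, we consider the linear map
$$\mathcal{H}\to\mathcal{H}:X\mapsto SXS^\dagger.\tag{2.25}$$
This action preserves the determinant because $\det(S)=1$. By (2.24), this amounts to preserving the square of the Minkowskian norm of the four-vector $(x^\mu)$, so the transformation (2.25) can be seen as a Lorentz transformation acting on $(x^\mu)$. We thus define a map
$$f:\mathrm{SL}(2,\mathbb{C})\to\mathrm{O}(3,1):S\mapsto f[S],\tag{2.26}$$
where the matrix $f[S]$ is given by
$$S\,x^\mu\tau_\mu\,S^\dagger=\tau_\mu\,f[S]^\mu{}_\nu\,x^\nu\tag{2.27}$$
or equivalently,
$$S\,\tau_\nu\,S^\dagger=\tau_\mu\,f[S]^\mu{}_\nu.\tag{2.28}$$
Since $(ST)X(ST)^\dagger=S(TXT^\dagger)S^\dagger$, the map $f$ is obviously a homomorphism. Furthermore, by (2.28), the entries of $f[S]$ are quadratic combinations of the entries of $S$ and their complex conjugates; so $f$ is continuous. In particular, since $\mathrm{SL}(2,\mathbb{C})$ is connected, the image of $f$ is contained in the connected Lorentz group $L_+^\uparrow$.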
It remains to prove that is surjective on and to compute its kernel. We have just seen that by continuity, so as far as surjectivity is concerned, we need only prove the opposite inclusion. Let therefore belong to . By the standard decomposition theorem (1.24), we can write as a standard boost , for some value of the rapidity , sandwiched between two (orientation-preserving) spatial rotations: . Thus, in order to prove surjectivity of on , it suffices to find three matrices , and in such that
(2.29) |
since in that case . Now, the restriction of the homomorphism (2.26) to the subgroup of is precisely the homomorphism (2.6) that we used to prove the isomorphism . We know, therefore, that there exist matrices and in such that conditions (2.29) hold. As for the matrix , we make the educated guess
The image of under can be read off from the property
Comparing with the definition (2.27) of , we see that is precisely the standard
Lorentz boost , as written in (1.19). In conclusion, the homomorphism is surjective on
.
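The last missing piece of the proof is the computation of the kernel of $f$. By definition, the kernel consists of matrices $S$ such that $SXS^\dagger=X$ for any $X\in\mathcal{H}$. Taking $X=\mathbb{I}$, we see that $S$ must belong to $\mathrm{SU}(2)$. Then, taking $X$ to be each of the Pauli matrices in turn, we observe that $S$ must belong to the center of $\mathrm{SU}(2)$, that is, $\mathbb{Z}_2$. ∎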
Remark.
As usual, the definition (2.28) can be used to compute explicitly the matrix , when belongs to . The result is
(2.30) | |||
involving only quadratic combinations of the entries of the argument of , which exhibits the fact that the kernel of must contain .
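To make the map from $\mathrm{SL}(2,\mathbb{C})$ to the Lorentz group concrete, here is a minimal Python sketch. It assumes the standard basis (identity and Pauli matrices with their usual signs), which may differ by signs from the convention adopted in these notes, and a mostly-plus Minkowski metric. The Lorentz matrix is extracted with the trace identity $\mathrm{Tr}(\tau_\mu\tau_\nu)=2\delta_{\mu\nu}$, valid for this assumed basis. Function names are hypothetical.

```python
import math

# Assumed basis: tau_0 = identity, tau_1..tau_3 = Pauli matrices (standard signs).
TAU = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
ETA = [-1, 1, 1, 1]   # diagonal of the Minkowski metric (mostly-plus, assumed)


def matmul(A, B):
    """Product of two matrices stored as nested lists (any compatible sizes)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def dagger(A):
    """Hermitian conjugate of a matrix stored as nested lists."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]


def sl2c_to_lorentz(S):
    """4x4 real matrix L with L[mu][nu] = (1/2) Tr(tau_mu S tau_nu S^dagger)."""
    S_dagger = dagger(S)
    L = []
    for t_mu in TAU:
        row = []
        for t_nu in TAU:
            M = matmul(t_mu, matmul(S, matmul(t_nu, S_dagger)))
            row.append(complex(0.5 * (M[0][0] + M[1][1])).real)
        L.append(row)
    return L


def preserves_eta(L, tol=1e-9):
    """Check the Lorentz condition L^T eta L = eta."""
    for r in range(4):
        for s in range(4):
            value = sum(ETA[m] * L[m][r] * L[m][s] for m in range(4))
            if abs(value - (ETA[r] if r == s else 0)) > tol:
                return False
    return True


S1 = [[1, 1 + 1j], [0, 1]]       # det = 1
S2 = [[2, 0], [1j, 0.5]]         # det = 1
L1, L2 = sl2c_to_lorentz(S1), sl2c_to_lorentz(S2)
L12 = sl2c_to_lorentz(matmul(S1, S2))
L1_minus = sl2c_to_lorentz([[-entry for entry in row] for row in S1])

chi = 0.5
boost = sl2c_to_lorentz([[math.exp(chi / 2), 0], [0, math.exp(-chi / 2)]])
```

In this convention the diagonal matrix with entries $e^{\pm\chi/2}$ is mapped to a boost of rapidity $\chi$ along the third spatial axis, and the homomorphism property can be checked by comparing `L12` with the matrix product of `L1` and `L2`.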
2.3.3 Examples
For future reference, let us display two specific families of matrices in corresponding to rotations around the axis and boosts along that axis, associated respectively with the Lorentz matrices
Demanding that these matrices be of the form for some determines uniquely, up to a sign, through formula (2.30). One thus finds that rotations by around are represented by
(2.31) |
while boosts with rapidity along are given by
(2.32) |
We will put these formulas to use in subsection 4.3, when describing the effect of Lorentz transformations on celestial spheres.
2.4 Lorentz groups and division algebras
In the two previous subsections, we proved the two very similar isomorphisms
$$\mathrm{SO}(2,1)^\uparrow\cong\mathrm{SL}(2,\mathbb{R})/\mathbb{Z}_2\quad\text{and}\quad L_+^\uparrow\cong\mathrm{SL}(2,\mathbb{C})/\mathbb{Z}_2.\tag{2.33}$$
From this viewpoint, going from three to four space-time dimensions amounts to changing $\mathbb{R}$ into $\mathbb{C}$. Now, from Hurwitz’s theorem it is well known that $\mathbb{R}$ and $\mathbb{C}$ are only the two first entries of a list of four normed division algebras (see e.g. [34]): the two remaining algebras are the set $\mathbb{H}$ of quaternions and the set $\mathbb{O}$ of octonions. Given this classification and the apparent coincidence (2.33), it is tempting to ask whether similar isomorphisms hold between certain higher-dimensional Lorentz groups and special linear groups of the form $\mathrm{SL}(2,\mathbb{H})$ or $\mathrm{SL}(2,\mathbb{O})$. This turns out to be the case indeed: one can prove that the connected Lorentz groups in six and ten space-time dimensions satisfy [35, 36, 37, 38, 39]
$$\mathrm{SO}(5,1)^\uparrow\cong\mathrm{SL}(2,\mathbb{H})/\mathbb{Z}_2\quad\text{and}\quad\mathrm{SO}(9,1)^\uparrow\cong\mathrm{SL}(2,\mathbb{O})/\mathbb{Z}_2.$$
We will not prove these isomorphisms here. We will not even attempt to explain the meaning of the last isomorphism in this list, given that octonions are not associative, so that what we call “$\mathrm{SL}(2,\mathbb{O})$” is not obvious. Let us simply mention, as a curiosity, that these isomorphisms are related to the fact that minimal supersymmetric gauge field theories (with minimally coupled massless spinors) can only be defined in space-time dimensions 3, 4, 6 and 10. More generally, the relation between spinors and division algebras spreads all the way up to superstring theory. We will not study these questions here and refer for instance to [38, 39] for many more details.
3 Conformal transformations of the sphere
This section is a differential-geometric interlude: setting the Lorentz group aside, we will show that the quotient $\mathrm{SL}(2,\mathbb{C})/\mathbb{Z}_2$ may be seen as the group of conformal transformations of the $2$-sphere. Accordingly, our battle plan will be the following. We will first define, in general terms, the notion of conformal transformations of a manifold (subsection 3.1). We will then apply this definition to the plane and the sphere (subsections 3.2 and 3.3) and classify the corresponding conformal transformations. These matters should be familiar to readers acquainted with conformal field theories in two dimensions, which we briefly mention in subsection 3.4. Although some basic knowledge of differential geometry may be useful at this point, it is not mandatory for understanding the text, as our presentation will not be cast in a mathematically rigorous language. We refer for instance to [40, 41] for an introduction to differential geometry.
3.1 Notion of conformal transformations
In short, a conformal transformation of some space is a transformation which preserves the angles. To define precisely what we mean by “angles” (and hence conformal transformations), we will now review at lightspeed the notions of manifolds and Riemannian metrics.
3.1.1 Manifolds and metrics
Roughly speaking, a (smooth) manifold is a topological space that looks locally like a Euclidean space $\mathbb{R}^n$, the number $n$ being called the dimension of the manifold. Here, by “locally”, we mean “upon zooming in on the manifold”: any point on the manifold admits a neighbourhood that is homeomorphic to $\mathbb{R}^n$. Two typical examples of $n$-dimensional manifolds are $\mathbb{R}^n$ itself, and the sphere $S^n$. Thanks to the locally Euclidean structure, we can define, at each point $p$ of a manifold $M$, a vector space consisting of vectors tangent to $M$ at $p$; this vector space is called the tangent space of $M$ at $p$, denoted $T_pM$. If we think of a manifold $M$ as a smooth set of points embedded in some higher-dimensional Euclidean space $\mathbb{R}^N$, then the tangent space $T_pM$ is literally the (affine) hyperplane in $\mathbb{R}^N$ that is tangent to $M$ at $p$, endowed with the vector space structure inherited from $\mathbb{R}^N$.

Given a vector space, it is natural to endow it with a scalar product, allowing one to compute norms of vectors and angles between vectors. Since a manifold has a tangent space at each point, one would like to define a scalar product in the tangent space at each point of ; a metric does precisely this job.
Definition.
A (Riemannian) metric on $M$ is the data of a scalar product in each tangent space of $M$, such that this scalar product varies smoothly on $M$ [42]. More precisely, a metric $g$ is a symmetric, positive-definite, smooth tensor field
$$g:M\ni p\mapsto g_p,\tag{3.1}$$
where $g_p$ is the aforementioned scalar product in $T_pM$:
$$g_p:T_pM\times T_pM\to\mathbb{R}:(v,w)\mapsto g_p(v,w).\tag{3.2}$$
The requirements of symmetry and positive-definiteness ensure that $g_p$ satisfies all the standard properties of a scalar product. This definition can be extended to pseudo-Riemannian metrics, that is, symmetric tensor fields such as (3.1) that are not necessarily positive-definite. In particular, we will see below that $d$-dimensional Minkowski space-time is the manifold $\mathbb{R}^d$ endowed with the pseudo-Riemannian metric (3.9).
3.1.2 Examples
To illustrate concretely the above definition, let us consider a few simple examples of metrics on the manifold $\mathbb{R}^2$. We can endow this manifold with global (Cartesian) coordinates $(x,y)$ such that any point $p$ is identified with its pair $(x,y)$ of coordinates. Our first example is the Euclidean metric, whose expression in Cartesian coordinates is
$$g=dx\otimes dx+dy\otimes dy.\tag{3.3}$$
To explain the meaning of this notation, let us pick a point $p$ in $\mathbb{R}^2$ and two vectors $v$ and $w$ at that point, with respective components $(v^x,v^y)$ and $(w^x,w^y)$. Their scalar product is given by (3.2), i.e.
$$g_p(v,w)=(dx\otimes dx+dy\otimes dy)(v,w).\tag{3.4}$$
By definition, upon acting on a vector, $dx$ gives the $x$-component of this vector. (In the standard language of differential geometry, $dx$ is the differential of $x$, that is, the one-form dual to the vector field $\partial/\partial x$ associated with the coordinate $x$ on $\mathbb{R}^2$.) The notation $dx\otimes dx$ is then understood as the operation which, upon acting on two vectors, gives the product of their components along $x$. A similar definition holds for $dy$ and $dy\otimes dy$, except that they, of course, give $y$-components of vectors. Applying these rules to (3.4), we find that the metric (3.3) defines the standard Euclidean scalar product,
$$g_p(v,w)=v^xw^x+v^yw^y.\tag{3.5}$$
Of course, one can define more generally the Euclidean metric on $\mathbb{R}^n$ to be $g=dx^1\otimes dx^1+\dots+dx^n\otimes dx^n$ in terms of Cartesian coordinates $(x^1,\dots,x^n)$.

A slightly less trivial example of metric on is given by
(3.6) |
where is the point at which the metric is evaluated. If then and are two vectors at , with the same components as before, their scalar product with respect to this new metric is
where we have used once more the rule saying that $dx$ (resp. $dy$), upon acting on a vector, gives the $x$-component (resp. $y$-component) of the vector. By contrast to the Euclidean scalar product (3.5), this expression depends explicitly on the point $p$. In other words, if we take two families of vectors on $\mathbb{R}^2$ with constant components $(v^x,v^y)$ and $(w^x,w^y)$ at each point of the plane, their scalar product will vary as we move on $\mathbb{R}^2$.
Of course, the metric (3.6) that we picked was chosen for illustrative purposes only: any positive function on multiplying would give a (generally position-dependent) Riemannian metric on . More generally, any position-dependent, real quadratic combination of ’s and ’s,
(3.7) |
is a Riemannian metric on as long as and are everywhere positive. If and are two vectors at with the same components as before, their scalar product with respect to the metric (3.7) is . Again, the generalization of these considerations to is straightforward: in terms of Cartesian coordinates ,…,, the most general Riemannian metric on takes the form (with implicit summation over ), where is a symmetric, positive-definite matrix at each point.
3.1.3 Angles and conformal transformations
Metrics can be used to define norms and angles on tangent spaces of a manifold. Indeed, suppose we are given a manifold $M$ endowed with a metric $g$. Let $p$ be a point in $M$ and let $v$ be a tangent vector of $M$ at $p$. Then, the norm of $v$ is naturally defined to be $\|v\|=\sqrt{g_p(v,v)}$. Furthermore, if $v$ and $w$ are two vectors at $p$, the angle $\theta$ between them is defined (up to a sign) by
$$\cos\theta=\frac{g_p(v,w)}{\|v\|\,\|w\|}.$$
Note that this definition is blind to the local normalization of the metric. Indeed, suppose we define two metrics $g$ and $g'$ on $M$, such that
$$g'_p=\Omega(p)\,g_p,$$
where $\Omega$ is some smooth, positive real function on $M$. In other words, let us assume that $g$ and $g'$ are proportional, the proportionality factor being position-dependent. Then, these two metrics define the same angles. The proof is elementary: if $v$ and $w$ are two vectors at $p$, then the cosine of the angle between these vectors is
$$\frac{g'_p(v,w)}{\sqrt{g'_p(v,v)\,g'_p(w,w)}}=\frac{\Omega(p)\,g_p(v,w)}{\sqrt{\Omega(p)^2\,g_p(v,v)\,g_p(w,w)}}=\frac{g_p(v,w)}{\sqrt{g_p(v,v)\,g_p(w,w)}},$$
which is obviously independent of whether we choose to use the metric $g$ or the metric $g'$. This observation will be crucial in the following pages.
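The insensitivity of angles to a rescaling of the metric is easy to check numerically. The following minimal Python sketch represents a metric at a fixed point by a symmetric, positive-definite matrix; the function names and the sample values are hypothetical choices made for this illustration.

```python
import math


def inner(g, v, w):
    """Scalar product g(v, w) for a metric given by a symmetric matrix g."""
    return sum(g[i][j] * v[i] * w[j]
               for i in range(len(v)) for j in range(len(w)))


def angle(g, v, w):
    """Angle between v and w, defined through cos(theta) = g(v,w) / (|v| |w|)."""
    cos_theta = inner(g, v, w) / math.sqrt(inner(g, v, v) * inner(g, w, w))
    # Clamp to [-1, 1] to protect acos against rounding errors.
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def rescale(g, omega):
    """Multiply the metric by a positive factor omega (its value at the point)."""
    if omega <= 0:
        raise ValueError("the conformal factor must be positive")
    return [[omega * entry for entry in row] for row in g]


g = [[2.0, 0.5], [0.5, 1.0]]      # a positive-definite metric at some point
v, w = [1.0, 0.0], [0.0, 1.0]

theta_original = angle(g, v, w)
theta_rescaled = angle(rescale(g, 3.7), v, w)   # same angle, different norms
```

The norms of `v` and `w` change under the rescaling, but the two angles agree up to rounding.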
Given a manifold $M$, it is natural to wonder what modifications $M$ may undergo, such that these modifications “preserve the structure” of $M$. To answer this question, we must specify precisely what is the structure we wish to preserve. Clearly, a first feature we would like to preserve when deforming $M$ is its local Euclidean structure. This leads to the notion of diffeomorphisms: by definition, a diffeomorphism of a manifold $M$ is a smooth, invertible map $\phi:M\to M$ such that the inverse map $\phi^{-1}$ be smooth as well. (Footnote 7: A diffeomorphism is thus a smooth generalization of the notion of homeomorphism, the word “smooth” replacing the word “continuous”.) In this sentence, the word “smooth” means “that preserves the local Euclidean structure in a continuous and differentiable way”. In heuristic terms, a diffeomorphism of $M$ is a smooth, invertible deformation of $M$ when the latter is seen as a rubber space.

Suppose now we pick a manifold $M$ endowed with a metric $g$, and consider a diffeomorphism $\phi$ of that manifold. Since the diffeomorphism is a deformation of $M$, it will in general affect distances and angles on $M$; in other words, a general diffeomorphism does not preserve the metric on $M$ and maps the original metric $g$ on some new metric $g'$. (In precise terms, what we call the transformed metric is the pull-back of $g$ by $\phi$, that is, $g'=\phi^*g$.) This gives a motivation for defining certain subclasses of diffeomorphisms that preserve some part (or the entirety) of the metric structure, i.e. diffeomorphisms for which the new metric $\phi^*g$ has certain properties in common with the first metric $g$.
Definition.
A conformal transformation of $(M,g)$ is a diffeomorphism $\phi:M\to M$ such that the original metric $g$ and the transformed metric $\phi^*g$ define the same angles (possibly up to signs).
Given the property, shown above, that proportional metrics define identical angles (possibly up to signs), it is easy to write down an explicit formula for what we mean by a conformal transformation: it is a diffeomorphism $\phi$ for which the transformed metric is related to the original metric as
$$(\phi^*g)_p=\Omega(p)\,g_p,\tag{3.8}$$
where $\Omega$ is some smooth, positive function on $M$. When $\Omega(p)=1$ for all $p$ in $M$, we say that the diffeomorphism $\phi$ is an isometry: it preserves not only the angles, but also the norms defined by the metric $g$. Of course, conformal transformations and isometries can also be defined for pseudo-Riemannian metrics.
Remark.
We are now equipped with the tools needed to restate in differential-geometric terms the definition of the Lorentz and Poincaré groups, originally described in subsection 1.2. Namely, define $d$-dimensional Minkowski space to be the manifold $\mathbb{R}^d$ endowed with a pseudo-Riemannian metric $g$ such that there exist global coordinates $x^\mu$ on $\mathbb{R}^d$ in which the metric takes the form
$$g=\eta_{\mu\nu}\,dx^\mu\otimes dx^\nu,\tag{3.9}$$
with $\eta$ the Minkowski metric matrix written in (1.5) for the case $d=4$. In the language of subsection 1.1, the coordinates $x^\mu$ are those of an inertial frame. Then the isometry group of this manifold is precisely the Poincaré group in $d$ dimensions, acting on $\mathbb{R}^d$ according to (1.8), and the stabilizer for this action is the Lorentz group $\mathrm{O}(d-1,1)$. From this viewpoint, the property (1.6) of invariance of the interval is simply the defining criterion for the transformation to be an isometry.
3.2 Conformal transformations of the plane
To illustrate the definition of conformal transformations in the simplest possible case, let us consider the plane endowed with the Euclidean metric (3.3). To make things technically simpler, we see as the complex plane and introduce a complex coordinate , in terms of which the metric (3.3) becomes (with the complex conjugate of ). Then a generic diffeomorphism is a map
(3.10) |
where the function generally depends on both and . Demanding that be a conformal
transformation imposes certain restrictions on this function, which we now work out.
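Since $\phi$ maps $z$ on $f(z,\bar z)$ and since the metric is just $dz\,d\bar z$, it is natural that the transformed metric be
$$ \left(\phi^{*}(dz\,d\bar z)\right)_{z} = \left(df\,d\bar f\right)_{z} \tag{3.11} $$
where the subscript means that both sides are evaluated at the point $z$. (This is just the definition applied to (3.10).) Here the differential of $f$ is
$$ df = \frac{\partial f}{\partial z}\,dz + \frac{\partial f}{\partial \bar z}\,d\bar z. $$
Plugging this expression (and its complex conjugate) in (3.11), we find
$$ df\,d\bar f = \frac{\partial f}{\partial z}\frac{\partial \bar f}{\partial z}\,dz^2 + \frac{\partial f}{\partial \bar z}\frac{\partial \bar f}{\partial \bar z}\,d\bar z^2 + \left(\left|\frac{\partial f}{\partial z}\right|^2 + \left|\frac{\partial f}{\partial \bar z}\right|^2\right) dz\,d\bar z. $$
According to the definition surrounding eq. (3.8), requiring that $\phi$ be a conformal transformation amounts to demanding that this expression be proportional to $dz\,d\bar z$. The terms involving $dz^2$ or $d\bar z^2$ must therefore vanish, which is the case if and only if
$$ \frac{\partial f}{\partial \bar z} = 0 \quad\text{or}\quad \frac{\partial f}{\partial z} = 0 \tag{3.12} $$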
In other words, the function $f$ must depend either only on $z$, or only on $\bar z$. The latter possibility represents conformal transformations that change the orientation of $\mathbb{C}$ (they map an angle $\theta$ on an angle $-\theta$), and we will discard them from now on. Thus, a diffeomorphism (3.10) is an orientation-preserving conformal transformation of $\mathbb{C}$ provided $f$ is a function of $z$ only, that is, a meromorphic function. Furthermore, locally, any such function is admissible. [Footnote 8: Upon writing $z = x + iy$ and $f = u + iv$, where $u$ and $v$ are real functions on the plane, the first equation in (3.12) can be rewritten as the two Cauchy-Riemann equations for $u$ and $v$.]
Of course, this is not the end of the story since (3.10) must be a smooth bijection. This restricts the form of $f$ even further. To begin with, $f$ must be regular, so $f$ must be an analytic function
$$ f(z) = a_0 + a_1 z + a_2 z^2 + \cdots \tag{3.13} $$
The zeros of $f$ are the points that are mapped on the origin $z=0$. Since $f$ must be an injective map, there can be only one such zero, say $z_0$. If this zero is degenerate, then the map will not be injective in a neighbourhood of that zero. (If $w\neq 0$ is sufficiently close to $0$ and if $z_0$ is a zero of $f$ with order $n>1$, then $w$ has $n$ different preimages under $f$ near $z_0$, and $f$ cannot be injective.) Thus, in (3.13), the coefficients of all powers of $z$ higher than one must vanish, i.e. $a_2=0$, $a_3=0$, etc. In other words, the function $f$ must be linear in $z$. Finally, requiring $f$ to be surjective imposes that the coefficient of the $z$-linear term be non-zero. We conclude that all conformal transformations of the plane are of the form
$$ z \mapsto az + b, \qquad a, b \in \mathbb{C}, \quad a \neq 0 \tag{3.14} $$
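These transformations naturally split in three classes:
Class | Transformation | Parameter |
---|---|---|
Translations | $z \mapsto z + b$ | $b \in \mathbb{C}$ |
Rotations | $z \mapsto e^{i\theta} z$ | $\theta \in \mathbb{R}$ |
Dilations | $z \mapsto \lambda z$ | $\lambda > 0$ |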
We will see in the next subsection that these transformations may also be seen as (a subclass of) conformal
transformations of the
sphere.
Before going further, let us note one important detail: in deriving the set of conformal transformations (3.14), the fact that the metric on $\mathbb{R}^2$ was the Euclidean metric (3.3) played a minor role. Indeed, we would have obtained the exact same set of transformations for any metric of the form $\Omega(z,\bar z)\,dz\,d\bar z$ on the plane (with $\Omega$ a positive function), since conformal transformations are blind to the multiplication of metrics by (positive) functions. The only crucial point was that the metric be proportional to $dz\,d\bar z$, since it is this property that led to the condition (3.12). The further restrictions leading to (3.14), on the other hand, originated from topological (hence metric-independent) considerations. These observations will be essential in the following subsection.
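As a quick numerical illustration of the condition (3.12), the following Python sketch compares the angle between two tangent vectors before and after a map of the plane. The helper names (`angle_between`, `pushforward`) and the sample maps are illustrative assumptions, not part of the original notes: `affine` is of the form (3.14), while `skew` depends on both $z$ and $\bar z$ and so should fail to preserve angles.

```python
import cmath

def angle_between(u, v):
    """Oriented angle from tangent vector u to tangent vector v (complex numbers)."""
    return cmath.phase(v / u)

def pushforward(f, z, u, h=1e-6):
    """Central finite-difference image of the tangent vector u at z under f."""
    return (f(z + h * u) - f(z - h * u)) / (2 * h)

# Hypothetical sample maps: one of the form (3.14), one depending on conj(z).
a, b = 2.0 * cmath.exp(0.7j), 1.0 - 3.0j
affine = lambda z: a * z + b
skew = lambda z: z + 0.5 * z.conjugate()

z0 = 0.3 + 0.4j
u, v = 1.0 + 0.0j, cmath.exp(1.1j)

before = angle_between(u, v)
after_affine = angle_between(pushforward(affine, z0, u), pushforward(affine, z0, v))
after_skew = angle_between(pushforward(skew, z0, u), pushforward(skew, z0, v))

print(before, after_affine, after_skew)
```

The affine map reproduces the original angle up to rounding, whereas the map involving $\bar z$ changes it, in line with the requirement that $f$ depend on $z$ only.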
3.3 Conformal transformations of the sphere
We now turn to the main goal of this section: the classification of conformal transformations of the sphere $S^2$. By definition, the latter is a two-dimensional manifold consisting of all points with Cartesian coordinates $(x_1,x_2,x_3)$ in $\mathbb{R}^3$ such that $x_1^2+x_2^2+x_3^2=r^2$, where $r$ is some fixed (positive) radius. (The notation $S^2$ is usually reserved for the unit sphere, with radius $r=1$, but here we will denote any sphere by $S^2$, regardless of its radius.)
3.3.1 Stereographic coordinates
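The standard way to locate points on a sphere of radius $r$ relies on polar coordinates $\theta$ and $\varphi$ defined by
$$ x_1 = r\sin\theta\cos\varphi, \qquad x_2 = r\sin\theta\sin\varphi, \qquad x_3 = r\cos\theta $$
for any point $(x_1,x_2,x_3)$ belonging to the sphere.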

In the present case, however, it will be more convenient to use so-called stereographic coordinates, which will simplify the treatment of conformal transformations. These coordinates are defined as follows. Consider a point $(x_1,x_2,x_3)$ on the sphere, different from the south pole $(0,0,-r)$. Then, there exists a unique straight line in $\mathbb{R}^3$ passing through that point and the south pole. Explicitly, all points belonging to this line have coordinates of the form
$$ (0,0,-r) + t\,(x_1,\,x_2,\,x_3+r) \tag{3.15} $$
where $t$ is a parameter running over all real values. (The point corresponding to $t=0$ is the south pole, while $t=1$ corresponds to $(x_1,x_2,x_3)$.) The straight line so obtained crosses the equatorial plane $x_3=0$ at exactly one point, called the stereographic projection of $(x_1,x_2,x_3)$ through the south pole. The coordinates of this projection are obtained by setting the third component to zero in eq. (3.15), that is, by taking $t=r/(r+x_3)$, which gives
$$ \left(\frac{r\,x_1}{r+x_3},\;\frac{r\,x_2}{r+x_3},\;0\right) \tag{3.16} $$
We will refer to the first two entries, measured in units of $r$, as the stereographic coordinates on the sphere. They can be combined into a single complex coordinate
$$ z = \frac{x_1 + i x_2}{r + x_3} \tag{3.17} $$
which is related to polar coordinates through
$$ z = e^{i\varphi}\tan(\theta/2) \tag{3.18} $$
For future reference, note that the inverse of relation (3.17) gives $x_1$, $x_2$, $x_3$ in terms of $z$ and $\bar z$ as
$$ x_1 = r\,\frac{z+\bar z}{1+z\bar z}, \qquad x_2 = -ir\,\frac{z-\bar z}{1+z\bar z}, \qquad x_3 = r\,\frac{1-z\bar z}{1+z\bar z} \tag{3.19} $$
where we used the fact that $x_1^2+x_2^2+x_3^2=r^2$. Of course, we could have carried out a parallel construction by projecting points of the sphere on the equatorial plane through the north pole; this would have given formulas analogous to (3.16) and (3.17), but with $x_3$ replaced by $-x_3$.
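The projection through the south pole and its inverse can be sketched in a few lines of Python. This is a minimal illustration that assumes the complex coordinate is normalised as $z=(x_1+ix_2)/(r+x_3)$, so that the equator corresponds to $|z|=1$; the function names are hypothetical.

```python
import cmath
import math

def to_stereographic(x1, x2, x3):
    """Complex stereographic coordinate of a point on the sphere
    (projection through the south pole; undefined at the south pole itself)."""
    r = math.sqrt(x1 * x1 + x2 * x2 + x3 * x3)
    return complex(x1, x2) / (r + x3)

def from_stereographic(z, r=1.0):
    """Inverse map: Cartesian coordinates of the point of the sphere of radius r
    whose stereographic coordinate is z."""
    n = 1.0 + abs(z) ** 2
    w = 2.0 * r * z / n
    return (w.real, w.imag, r * (1.0 - abs(z) ** 2) / n)

# Sample point given in polar coordinates (theta measured from the north pole).
r, theta, phi = 2.5, 1.2, -0.7
point = (r * math.sin(theta) * math.cos(phi),
         r * math.sin(theta) * math.sin(phi),
         r * math.cos(theta))

z = to_stereographic(*point)
back = from_stereographic(z, r)
polar_form = cmath.exp(1j * phi) * math.tan(theta / 2.0)

print(z, polar_form, back)
```

The round trip returns the original point, and `z` agrees with the polar expression $e^{i\varphi}\tan(\theta/2)$.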

The stereographic projection is a concrete illustration of the fact that a sphere is locally the same as a plane: any point on the sphere, other than the south pole, can be projected to the equatorial plane through the south pole. Points that are close to the north pole get projected near the origin $z=0$; the whole northern hemisphere is projected in the unit disc $|z|<1$, and the equator is left fixed by the projection, corresponding to the unit circle $|z|=1$. Points belonging to the southern hemisphere, on the other hand, are projected outside of the unit disc. In particular, points located near the south pole are projected far from the origin, at large values of $|z|$: as points get closer to the south pole, they get projected further and further away. In fact, one may view the infinitely remote point on the plane, the “point at infinity” $z=\infty$, as the projection of the south pole itself. (Of course, the actual projection of the south pole is ill-defined, so the point at infinity does not have a well-defined argument.) We conclude that the sphere is diffeomorphic to a plane, up to a point. More precisely,
$$ S^2 \cong \mathbb{C}\cup\{\infty\} \tag{3.20} $$
The representation of the sphere as a plane to which one adds the point at infinity is called the Riemann sphere [43]. This relation hints that some of the results derived above for conformal transformations of the plane should be applicable to the sphere as well. In order to see concretely if this is the case, we first need to express the metric of a sphere in terms of the coordinate $z$.
3.3.2 The metric on a sphere in stereographic coordinates
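The natural metric on a sphere follows from the definition of a sphere as a submanifold of $\mathbb{R}^3$. Namely, endowing $\mathbb{R}^3$ with the Euclidean metric $dx_1^2+dx_2^2+dx_3^2$, the metric on the sphere is simply
$$ ds^2 = \left.\left(dx_1^2+dx_2^2+dx_3^2\right)\right|_{S^2} \tag{3.21} $$
To express this metric in terms of stereographic coordinates, we use formula (3.17), from which it follows that the differential of $z$ is
$$ dz = \frac{dx_1 + i\,dx_2}{r+x_3} - \frac{x_1+ix_2}{(r+x_3)^2}\,dx_3. $$
On the sphere defined by $x_1^2+x_2^2+x_3^2=r^2$, the differentials of $x_1$, $x_2$ and $x_3$ satisfy the relation $x_1dx_1+x_2dx_2+x_3dx_3=0$, which can then be used to show that
$$ dz\,d\bar z = \frac{dx_1^2+dx_2^2}{(r+x_3)^2} + \frac{dx_3^2}{(r+x_3)^2} = \frac{1}{(r+x_3)^2}\left(dx_1^2+dx_2^2+dx_3^2\right). $$
In the last term of this expression we recognize the metric (3.21) on the sphere, whose expression in terms of $z$ thus becomes
$$ ds^2 = \frac{4r^2\,dz\,d\bar z}{(1+z\bar z)^2} \tag{3.22} $$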
where we used the third relation of (3.19) to write $x_3$ as a function of $z$ and $\bar z$. This metric is position-dependent, since it explicitly depends on $z$. In fact, up to the factor $r^2$, it is precisely the metric (3.6) that we took as an example earlier on, written in terms of $z=x+iy$. The only subtlety is that, in contrast to (3.6) where $x$ and $y$ only take finite values, expression (3.22) must be understood as a metric on the Riemann sphere, where $z$ may be infinite.
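The conformal factor in (3.22) can be checked numerically. The sketch below is an illustration under the assumption that the stereographic coordinate is $z=(x_1+ix_2)/(r+x_3)$; the names `embed` and `chord_ratio` are hypothetical. It embeds two nearby values of $z$ in $\mathbb{R}^3$ and compares their Euclidean separation with the prediction $2r|dz|/(1+|z|^2)$.

```python
import math

def embed(z, r):
    """Point of the sphere of radius r with stereographic coordinate z."""
    n = 1.0 + abs(z) ** 2
    w = 2.0 * r * z / n
    return (w.real, w.imag, r * (1.0 - abs(z) ** 2) / n)

def chord_ratio(z, dz, r):
    """Euclidean distance between embed(z) and embed(z + dz), divided by
    the length 2 r |dz| / (1 + |z|^2) predicted by the metric (3.22)."""
    p, q = embed(z, r), embed(z + dz, r)
    predicted = 2.0 * r * abs(dz) / (1.0 + abs(z) ** 2)
    return math.dist(p, q) / predicted

z, r = 0.8 - 0.3j, 3.0
ratio_a = chord_ratio(z, 1e-6 * (0.6 + 0.8j), r)   # displacement in one direction
ratio_b = chord_ratio(z, 1e-6 * (-0.8 + 0.6j), r)  # orthogonal displacement

print(ratio_a, ratio_b)
```

Both ratios come out close to one, independently of the direction of the displacement, which is the statement that the metric on the sphere is proportional to $dz\,d\bar z$.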
Crucially, the metric (3.22) is proportional to the Euclidean metric $dz\,d\bar z$, which implies that, as far as conformal transformations are concerned, we can simply repeat the derivation carried out in subsection 3.2 for the plane. More precisely, if we demand that a diffeomorphism $z\mapsto f(z,\bar z)$ be a conformal transformation, the arguments that led to (3.12) remain true and the function $f$ must depend either only on $z$, or only on $\bar z$. The latter choice corresponds to transformations that do not preserve the orientation of the sphere, so we will ignore them. Thus, any orientation-preserving conformal transformation of the sphere is a meromorphic function of the form $z\mapsto f(z)$, and locally on the sphere this is all we can say.
Globally, of course, this is not yet the end of the story, since we must further require that the function $f$ be a diffeomorphism of the sphere, that is, a diffeomorphism of the plane with the point at infinity added as in (3.20). This point will play a key role. Indeed, requiring that $f$ be regular on $S^2$ no longer means that $f$ is analytic as in (3.13); rather, $f$ now may (and should) have at least one pole, at $z=z_*$ say, corresponding to the point that is mapped to the south pole $z=\infty$. Thus, $f$ should now be a rational function of the general form
$$ f(z) = \frac{P(z)}{Q(z)}, $$
with $P$ and $Q$ polynomials in $z$, where the roots of the numerator (resp. denominator) correspond to the points that are mapped on the origin $z=0$ (resp. the point at infinity $z=\infty$), i.e. on the north pole (resp. the south pole). Since $f$ must be an injective map, there must be one, and only one, point that is mapped to the north pole, and also exactly one other point that is mapped to the south pole. As in subsection 3.2, this requires that both the numerator and the denominator be linear functions of $z$. We can thus write any orientation-preserving conformal transformation of the Riemann sphere as
$$ z \mapsto f(z) = \frac{az+b}{cz+d} \tag{3.23} $$
where $a$, $b$, $c$ and $d$ are complex numbers. Requiring this map to be surjective finally imposes that
$$ \det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc \neq 0 \tag{3.24} $$
This is the classification of conformal transformations of the sphere that we were looking for. Such transformations are also called Möbius transformations. They obviously contain the set of conformal mappings (3.14) of the plane, so that translations of , rotations and dilations also represent conformal transformations of the sphere. However, there is now an additional two-parameter family of transformations of the form
corresponding to so-called special conformal transformations [11, 12]. Such transformations map
the north pole on the south pole, and vice-versa. Any conformal transformation of the sphere can be obtained
as the composition of a special conformal transformation, a translation, a rotation and a dilation (possibly
in a different order).
By construction, conformal transformations span a group, so it is worthwhile to investigate the group structure of the set of Möbius transformations. Clearly, formula (3.23) is blind to the overall normalization of the matrix in (3.24), since multiplying all entries of the matrix by the same non-zero complex number leads to the same transformation (3.23). We can thus assume, without loss of generality, that the non-zero determinant (3.24) is actually one, i.e. that the matrix belongs to $SL(2,\mathbb{C})$. Furthermore, two matrices in $SL(2,\mathbb{C})$ that differ only by their sign define the same conformal transformation, so the group of all non-degenerate transformations of the form (3.23) is actually isomorphic to the quotient
$$ SL(2,\mathbb{C})/\mathbb{Z}_2 \tag{3.25} $$
In other words, according to (2.22), the set of orientation-preserving conformal transformations of the sphere forms a group isomorphic to the connected Lorentz group in four dimensions! At this stage, this relation appears just as a coincidence of group theory: there seems to be no relation whatsoever between the Möbius transformations (3.23) and the original definition of the Lorentz group as a matrix group acting on $\mathbb{R}^4$. The purpose of the next section will be to show that this apparent coincidence actually has a geometric origin, rooted in the structure of light-like straight lines in Minkowski space-time.
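The group structure just described can be illustrated with a short Python sketch. It checks, for two arbitrarily chosen sample matrices, that composing Möbius transformations corresponds to multiplying the associated $2\times 2$ matrices, and that rescaling a matrix (in particular flipping its sign, or normalising its determinant to one) does not change the transformation. The helper names are illustrative, and the sketch only handles finite $z$ with $cz+d\neq 0$.

```python
import cmath

def mobius(m, z):
    """Apply z -> (a z + b) / (c z + d) for m = ((a, b), (c, d)); finite z only."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

def matmul(m, n):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(tuple(sum(m[i][k] * n[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def det(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def normalize(m):
    """Rescale a matrix with non-zero determinant to unit determinant."""
    s = cmath.sqrt(det(m))
    return tuple(tuple(entry / s for entry in row) for row in m)

# Arbitrary sample matrices with non-zero determinant.
A = ((2.0 + 1.0j, 0.5), (1.0j, 3.0))
B = ((1.0, -2.0j), (0.25, 1.0 + 1.0j))
z = 0.3 + 0.9j

composed = mobius(A, mobius(B, z))
via_product = mobius(matmul(A, B), z)
minus_A = tuple(tuple(-entry for entry in row) for row in A)

print(composed, via_product, mobius(minus_A, z), mobius(normalize(A), z))
```

Since `A` and `-A` (and more generally any non-zero multiple of `A`) give the same map, only the class of the matrix in the quotient (3.25) matters.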
3.4 An aside: conformal field theories in two dimensions
In the two previous subsections we have seen that any (orientation-preserving) conformal transformation of a two-dimensional manifold with a conformally flat metric can be written as a meromorphic function $z\mapsto f(z)$. Demanding that this map be a bijection of the manifold imposes certain restrictions on the function $f$, leading to (3.14) in the case of the plane, and (3.23) in the case of the sphere. However, in physical applications, it is often the case that “global” requirements such as bijectivity play a minor role. This is particularly true in the case of local quantum field theories, whose properties are mostly determined by local (as opposed to global) considerations. [Footnote 9: We will not explain the meaning of “quantum field theory” here. For an introduction, we refer for instance to the textbooks [44, 28].]
This feature is of central importance in the context of conformal field theories in two dimensions [11, 12]. By definition, a conformal field theory in $d$ dimensions is a quantum field theory, defined on a $d$-dimensional manifold $\mathcal{M}$ endowed with some metric $g$, that is invariant under conformal transformations of $\mathcal{M}$. In the case $\mathcal{M}=S^2$, with a metric proportional to $dz\,d\bar z$ in terms of stereographic coordinates, this leads to theories that are invariant under all Möbius transformations (3.23). However, the actual set of infinitesimal symmetries of such theories (i.e. symmetries found without taking global issues into account) turns out to be much, much larger than the finite-dimensional group (3.25). Indeed, since global requirements such as bijectivity play a secondary role, conformal field theories in two dimensions turn out to be invariant under all transformations that can be written locally as $z\mapsto f(z)$, where $f$ is any meromorphic function. [Footnote 10: At this point we should mention that proving conformal invariance of a quantum theory may be a subtle issue when the curvature of the underlying manifold does not vanish, due to the Weyl anomaly [11, 45]. We will not discuss these subtleties here.] This leads to an infinite-dimensional symmetry algebra that constrains such theories in an extremely powerful way [46]. For instance, when combined with an additional symmetry property called “modular invariance”, conformal invariance of a two-dimensional field theory implies a universal formula for the entropy of that theory, known as the Cardy formula [47]. We will briefly return to conformal field theories in the conclusion of these notes.
4 Lorentz group and celestial spheres
So far we have seen that the connected Lorentz group in four dimensions is isomorphic to the quotient $SL(2,\mathbb{C})/\mathbb{Z}_2$. We have also shown that the latter arises as the group of orientation-preserving conformal transformations of the sphere. However, at this stage, the relation between the Lorentz group and the sphere appears as a mere coincidence. In particular, since the original Lorentz group is defined by its linear action on a four-dimensional space, there is no reason for it to have anything to do with certain non-linear transformations of a two-dimensional manifold such as the sphere. The purpose of this section is to establish this missing link. This will require first defining a notion of “celestial spheres” in Minkowski space-time (subsection 4.1), and then computing the action of Lorentz transformations on such spheres (subsection 4.2). Subsection 4.3 is devoted to the analysis of the somewhat counterintuitive action of Lorentz boosts in terms of celestial spheres. Our approach is motivated by the notion of “asymptotic symmetries” in gravity [14], and will rely on a specific choice of coordinates that simplifies the description of null infinity in Minkowski space-time. The results as such are well known, and coordinate-independent; see for instance [3, 4]. It should be noted that similar relations exist also in other space-time dimensions. For instance, in three dimensions, the connected Lorentz group acts on the celestial circles at null infinity through projective transformations spanning a group $PSL(2,\mathbb{R})$, in accordance with the isomorphism (2.12). In this section, however, we will restrict our attention to the four-dimensional case.
4.1 Notion of celestial spheres
As explained in section 1, inertial observers in special relativity live in Minkowski space-time, which may be seen as the vector space . Inertial coordinates consist of one time coordinate or , and three Cartesian space coordinates . (Latin indices run over the values , , .) Given such coordinates, there is a natural way to define a corresponding family of spheres. Namely, one may describe the spatial location of an event in terms of spherical, rather than Cartesian, coordinates, defined as
(4.1) |
where points on the sphere of radius are labelled by the stereographic coordinates (3.17).
In particular, the spatial coordinates take the form (3.19) when expressed in terms of
and . Note that the parity transformation defined by the matrix (1.16) acts on the coordinate
according to .
For each non-zero , we thus have a spatial sphere naturally associated with the inertial coordinates
. Since Lorentz transformations relate different sets of inertial coordinates through
linear transformations , one might hope that the rewriting of these
transformations in terms of spherical coordinates (4.1) could give rise to a “nice” action of the
Lorentz group on spatial spheres, one that would make the relation to conformal transformations more
apparent. This is not quite the case, however; roughly speaking, the “celestial sphere” that we actually
wish to
define should be the sphere that
an inertial observer looks at. This is not achieved by the sphere of radius defined by (4.1),
because radial, ingoing light-rays emitted by the sphere need a non-zero time to get from
the sphere to the origin at (which we take to be the position of the observer). Thus, we need to work a
little more: we must somehow combine space and time
coordinates so as to take into account the finite velocity of light, and define the celestial sphere seen by
an observer at some moment of time as an object living in the past.
The argument just outlined hints at the right definition of what we would like to call a “celestial sphere”. Indeed, consider a radial light ray whose trajectory in space-time is described by [Footnote 11: Here $z$ is of course the coordinate (4.1) locating points on the sphere, and not the coordinate $z$ of a Cartesian coordinate system.]
In particular, an ingoing radial light ray satisfies where is some strictly positive initial radius. Along the trajectory of this light ray, the quantity
(4.2) |
is constant; it represents the time at which the observer located at sees the light ray. One can thus parametrize the time of emission of ingoing radial light rays by the value of . Instead of using coordinates to locate events in space-time, one may then use the Bondi coordinates , in which case plays the role of a time coordinate and is called advanced time. The situation can be depicted as follows:

In terms of Bondi coordinates, a sphere at constant and constant coincides with the sphere seen by an observer sitting at the origin at time . We can then define the celestial sphere at time as the sphere located at an infinite distance, , and at a fixed value of . It is the sphere of all directions towards which an observer at can look [3], the reason for the name “celestial” being obvious in that context. (Celestial spheres are also sometimes called “heavenly spheres” [37].) Less obvious is the fact that this definition is the one needed to match Lorentz transformations and Möbius transformations, which will be the purpose of the next subsection. The region spanned by the coordinates and at is called past null infinity [14]: it consists of events located at an infinite distance from the observer, and it can be reached from the line by following a past-directed null vector, that is, a vector whose norm squared vanishes with respect to the Minkowski metric (3.9).

Remark.
In these notes we define celestial spheres by using a specific set of coordinates in Minkowski space-time. There also exists a different definition, according to which the celestial sphere associated with a point in space-time is the projective space of its (past) light-cone, that is, the set of past-directed null directions passing through that point [3]. In such terms, the celestial sphere at time that we defined above is the set of past-directed null directions through the point with Bondi coordinates . (This could be any point in space-time since the Minkowski metric is invariant under all space-time translations.) This sphere can be thought of as the complex projective line , and Lorentz transformations span the group of its projective transformations which are nothing but Möbius transformations when seeing as a sphere [3]. The advantage of this projective viewpoint is that it is manifestly coordinate-independent, but we will not adopt this approach here. (See, however, the end of subsection 4.3.)
4.2 Lorentz transformations acting on celestial spheres
The Lorentz group is defined as the set of linear transformations between coordinates of inertial observers in Minkowski space-time. In the previous subsection we have introduced new, non-inertial, Bondi coordinates associated with each choice of inertial coordinates . In order to find the action of the Lorentz group on Bondi coordinates, we must express both and in terms of the associated Bondi coordinates, then rewrite the relation in Bondi coordinates and read off the Lorentz transformation properties of . Since the relation between inertial coordinates and Bondi coordinates is non-linear, this procedure leads in general to cumbersome expressions for in terms of and of the matrix elements of a Lorentz transformation. Fortunately, we are not actually interested in the general relation between and , but only in its limit with finite . Provided Lorentz transformations preserve that limit (which is to be expected since they are linear in inertial coordinates), keeping finite, they correspond to well-defined transformations of past null infinity.
4.2.1 Transformation of the radial coordinate
Let us begin by computing the transformation law of the radial coordinate under Lorentz transformations. By definition, the (square of the) radial coordinate associated with the inertial coordinates is . If now we assume that the coordinates are obtained by acting on certain coordinates with a Lorentz transformation , we have and
(4.3) | |||||
The next step consists in expressing the coordinates in terms of Bondi coordinates through relations (3.19) and (4.2). Taking the limit while keeping and fixed, the only terms that survive in the parentheses are those proportional to , which gives
(In particular, in that limit, we may replace by .) Here the terms of order outside the parentheses are subdominant with respect to the terms of order coming from the parentheses. As the final touch, we take to be a proper, orthochronous Lorentz transformation, i.e. an element of the connected Lorentz group . We can then express all entries of the Lorentz matrix in terms of complex numbers , , , forming a matrix in , as in eq. (2.30). Taking the square root to express in terms of , this gives the lengthy relation
(4.4) | |||||
At first sight, this expression looks terrible: if we were to expand all the products and parentheses in such a way that the argument of the square root be a sum of monomials in , and the numbers , , , (and their complex conjugates), then the sum would contain about terms. Fortunately, as one can check by a straightforward but tedious computation, the terms of the sum conspire to give a very simple final answer:
$$ r' = r\,\frac{|az+b|^2+|cz+d|^2}{1+|z|^2} + \mathcal{O}(1) \tag{4.5} $$
This result shows that, as expected, Lorentz transformations do not spoil the limit : the leading effect of Lorentz transformations on is just an angle-dependent rescaling by some function . Furthermore, the occurrence of combinations such as and is reminiscent of conformal transformations of the sphere, eq. (3.23). The terms in (4.5) are subleading corrections that we will not write down, though they will play a role in the transformation law of advanced time.
4.2.2 Transformation of advanced time
Having derived the transformation law of the radial coordinate (in the large limit), we now turn to the transformation of the remaining coordinates and . We begin with the former; using the definition (4.2), we write
(4.6) |
We are now supposed to express and in terms of unprimed coordinates using their Lorentz transformation laws, then write everything in terms of , and , and read off the transformation law of at . But there is a subtlety in carrying out this procedure. Namely, we have just seen that the transformation law of is ; when plugged into (4.6), this implies that the transformation law of should read
Here the leading term is dangerous: if there is nothing to cancel it, the limit of the transformation law of will be ill-defined. The only way to get rid of this term is to cancel it against the leading term in the transformation law of , which is given by
Plugging this in expression (4.6) and using (4.5), one sees that the dangerous terms, proportional
to , cancel out! This means that the limit of the transformation law of is
well-defined; in terms of observers, it means that if Alice and Bob are boosted with respect to each other
and
if Alice assigns a finite value of advanced time to some event, then Bob will assign to it a
Lorentz-transformed value which is also finite, though in general different, even if the event is
located at an infinite distance from both Alice and Bob. This cancellation of potentially divergent terms in
the transformation law of is actually the very reason why celestial spheres are defined at null
infinity rather than spatial infinity. (The latter would correspond to the limit with
the coordinate being kept finite instead of , and in that case the large limit of the
transformation law of would be ill-defined.)
Using the fact that the transformation law of is well-defined at (no term), we know on dimensional grounds that
(4.7) |
where is some unknown function. (Indeed, and are the
only Bondi coordinates with dimension of length, so, upon expanding the transformation law of in
powers of , the term of order zero in must be proportional to .) In particular, the effect of
Lorentz transformations on advanced time at infinity is just an angle-dependent rescaling, just as the
transformation (4.5) of the radial coordinate. The question, then, is to compute the rescaling
.
Just as for the radial coordinate, this calculation is straightforward but cumbersome: it requires evaluating the subleading term in the transformation law of , which we did not derive in (4.5) as it was included in the terms. This subleading term can be found by Taylor-expanding the transformation law (4.3) in powers of , keeping fixed. In practice though, the entire computation can be circumvented thanks to the following argument: the Lorentz-invariant interval reduces to in the limit , implying that the product is Lorentz-invariant up to corrections. [Footnote 12: Incidentally, this is an alternative proof that the dangerous term mentioned below (4.6) vanishes.] This, in turn, implies that the leading coefficient in the transformation law of is the inverse of the coefficient multiplying the radial coordinate in eq. (4.5). Accordingly, the Lorentz transformation law of advanced time is [14]
This is of course of the announced form (4.7), with .
4.2.3 Transformation of stereographic coordinates
The last case to be considered, and the most interesting one for our purposes, is the transformation law of the coordinate $z$ under Lorentz transformations in the large $r$ limit. The computation is more or less straightforward, as it only involves the dominant piece of the transformation law of $r$, displayed in (4.5). To begin, one uses (4.1) to write the transformed coordinate $z'$ as
where the primed coordinates on the right-hand side are obtained by acting with a Lorentz transformation on unprimed coordinates:
(We used eq. (4.5) in writing this.) Expressing the ’s in terms of Bondi coordinates through relations (3.19) and (4.2) and keeping finite, one finds
up to corrections. Finally, writing the entries of the Lorentz matrix as in eq. (2.30), both the numerator and the denominator of the last expression become certain complicated polynomials in , , , , , and their complex conjugates. Fortunately, many terms in these polynomials cancel against each other, leading to a simple expression of in terms of :
$$ z' = \frac{az+b}{cz+d} + \mathcal{O}(1/r) \tag{4.8} $$
This is precisely the standard expression of Möbius transformations, eq. (3.23): Lorentz transformations coincide with conformal transformations of the celestial sphere. This is the result we wanted to prove.
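The statement can be tested numerically without expanding any polynomial by hand. The Python sketch below is an illustration under explicitly assumed conventions, which may differ from those of (2.23) and (2.30): a vector $(x^0,x^1,x^2,x^3)$ is encoded in the Hermitian matrix with entries $-x^0-x^3$, $x^1+ix^2$, $x^1-ix^2$, $-x^0+x^3$, and a matrix $A$ of unit determinant acts by $X\mapsto AXA^\dagger$, which preserves $\det X=(x^0)^2-|\vec x|^2$. All function names are hypothetical. A past-directed null vector pointing to the direction $z$ stands in for a point of the celestial sphere at large $r$.

```python
import math

def matmul(m, n):
    return tuple(tuple(sum(m[i][k] * n[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def dagger(m):
    return tuple(tuple(complex(m[j][i]).conjugate() for j in range(2))
                 for i in range(2))

def to_hermitian(x):
    """Assumed sign convention (chosen so that the result takes the form (3.23))."""
    x0, x1, x2, x3 = x
    return ((complex(-x0 - x3), complex(x1, x2)),
            (complex(x1, -x2), complex(-x0 + x3)))

def from_hermitian(X):
    p, q, s = X[0][0].real, X[0][1], X[1][1].real
    return (-(p + s) / 2.0, q.real, q.imag, (s - p) / 2.0)

def lorentz(A, x):
    """Linear action of A on the vector x through X -> A X A^dagger."""
    return from_hermitian(matmul(matmul(A, to_hermitian(x)), dagger(A)))

def interval(x):
    return x[0] ** 2 - x[1] ** 2 - x[2] ** 2 - x[3] ** 2

def past_null_vector(z, r):
    """Null vector (-r, r n), with n the unit vector of stereographic coordinate z."""
    n = 1.0 + abs(z) ** 2
    w = 2.0 * r * z / n
    return (-r, w.real, w.imag, r * (1.0 - abs(z) ** 2) / n)

def direction(x):
    """Stereographic coordinate and norm of the spatial part of x."""
    r = math.sqrt(x[1] ** 2 + x[2] ** 2 + x[3] ** 2)
    return complex(x[1], x[2]) / (r + x[3]), r

# Sample matrix of unit determinant (d is fixed by a d - b c = 1).
a, b, c = 1.0 + 0.5j, 0.3 - 0.2j, -0.4j
d = (1.0 + b * c) / a
A = ((a, b), (c, d))

z, r = 0.7 - 1.1j, 5.0
z_new, r_new = direction(lorentz(A, past_null_vector(z, r)))
z_mobius = (a * z + b) / (c * z + d)
r_scaled = r * (abs(a * z + b) ** 2 + abs(c * z + d) ** 2) / (1.0 + abs(z) ** 2)

x_generic = (2.0, 0.3, -0.5, 1.0)
print(z_new, z_mobius, r_new, r_scaled)
```

With these conventions the transformed direction coincides with the Möbius image of $z$, and the norm of the spatial part is rescaled by $(|az+b|^2+|cz+d|^2)/(1+|z|^2)$, the same angle-dependent factor that multiplies $r$ at leading order.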
Remark.
In deriving (4.8), our choices of conventions played an important role. Indeed, we could have defined the homomorphism of subsection 2.3 by acting on Hermitian matrices of the form (2.23), but with different signs in front of the components . (The standard choice [5] would correspond to changing the sign in front of .) This would have led to a different expression of the homomorphism (2.30), which in turn would have given a different formula for the action of Lorentz transformations on celestial spheres. For instance, the terms and would then be replaced by combinations such as and , or and , or other variations on the same theme. But the statement that Lorentz transformations act as conformal transformations on celestial spheres remains true regardless of one’s choices of conventions. Furthermore, the physical effect of such conformal transformations is also convention-independent; we will see an illustration of such a physical (actually, optical) effect in the next subsection.
4.3 Boosts and optics
It is worthwhile to analyse the transformation law (4.8) for certain specific examples of Lorentz transformations. Namely, recall that the homomorphism (2.30) allowed us to represent rotations and boosts by the matrices (2.31) and (2.32), respectively. We can then plug these matrices in eq. (4.8) and interpret the resulting formula as the conformal transformation of the celestial sphere that corresponds to a change of frames between two inertial observers, say Alice and Bob. For instance, a rotation by along Alice’s axis corresponds to a rotation of the sphere represented by . This is not surprising: if the frames of Alice and Bob are rotated with respect to each other, it is obvious that their respective celestial spheres will be identical, up to a rotation.
4.3.1 Boosts, optics and the Millennium Falcon
A more interesting phenomenon occurs when Bob is boosted with respect to Alice, with rapidity $\chi$ say. The stereographic coordinate $z'$ of the celestial sphere seen by Bob is then related to the coordinate $z$ of the sphere seen by Alice according to
$$ z' = e^{-\chi}\,z \tag{4.9} $$
Let us take $\chi>0$ for definiteness, i.e. let us assume that Bob moves in the direction of positive $x^3$, towards the north pole of the sphere, located at $z=0$. Then eq. (4.9) tells us that, although Alice and Bob are looking at the same celestial sphere, the points of Bob’s sphere are all pulled closer to the north pole than the points of Alice’s sphere. If we imagine that shining stars are glued to the celestial sphere, then the stars seen by Bob are grouped closer to the north pole (i.e. closer to the direction of Bob’s motion) than those seen by Alice.
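To get a feeling for the size of this effect, one can translate the contraction (4.9) into polar angles. The sketch below assumes that the boost acts as $z'=e^{-\chi}z$ and that $z=e^{i\varphi}\tan(\theta/2)$ as in (3.18), with $\theta$ measured from the direction of motion; it then cross-checks the result against the textbook aberration formula $\cos\theta'=(\cos\theta+\beta)/(1+\beta\cos\theta)$ with $\beta=\tanh\chi$. Function names and sample values are illustrative.

```python
import math

def boosted_polar_angle(theta, chi):
    """Polar angle seen by the boosted observer, from tan(theta'/2) = exp(-chi) tan(theta/2).
    Valid for 0 <= theta < pi."""
    return 2.0 * math.atan(math.exp(-chi) * math.tan(theta / 2.0))

def aberration_polar_angle(theta, chi):
    """Same angle from the standard aberration formula, with beta = tanh(chi)."""
    beta = math.tanh(chi)
    return math.acos((math.cos(theta) + beta) / (1.0 + beta * math.cos(theta)))

angles_deg = (30.0, 60.0, 90.0, 120.0, 150.0)
rapidities = (0.0, 1.0, 3.0)

table = {
    chi: [math.degrees(boosted_polar_angle(math.radians(t), chi)) for t in angles_deg]
    for chi in rapidities
}

for chi, row in table.items():
    print(chi, [round(value, 2) for value in row])
```

For vanishing rapidity the angles are unchanged, while for growing rapidity every star, including those initially behind the observer, migrates towards the direction of motion.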

This result is somewhat counterintuitive, if we base our intuition on our habit of objects flowing past us when driving on the highway. Roughly speaking, one would expect that boosting in a given direction should make objects spread away from that direction. This intuition is well illustrated in the movie Star Wars Episode IV: A New Hope [48]. In the screenshot reproduced below, Han Solo and Chewbacca are sitting in the cockpit of the Millennium Falcon spaceship and have just switched on the “hyperspace” mode: they are accelerating straight ahead. This acceleration corresponds to a continuous family of boosts in the direction of the acceleration. In the picture, these boosts are represented by stars flowing away from the direction of the motion, exactly as dictated by the naive, intuitive expectation just described:
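[Figure: screenshot from Star Wars Episode IV: A New Hope [48], showing the cockpit of the Millennium Falcon during the jump to lightspeed, with stars flowing away from the direction of motion.]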
Formula (4.9) tells us that this representation of the “jump to lightspeed” is wrong: what Han Solo and Chewbacca should really see is a contraction of the sphere at which they are
looking, towards the direction of their acceleration. In other words, as long as the stars are far enough
from the
observer undergoing a boost, they should cluster close to the direction of the boost rather than flow away
from it.
There is of course a subtlety in this argument: our intuition of objects flowing past us when we drive on the highway is obviously correct, so how come it contradicts the result (4.9)? The answer is that formula (4.9) holds only in the limit $r\to\infty$, when the sphere we are talking about is located at an infinite distance from the observer. In that limit, the observer’s motion does not affect its distance to a point on the celestial sphere; in particular, all points on the sphere remain at an infinite distance from the observer, and there is no way they could flow past him. In real-world applications, however, all objects are necessarily located at some finite distance, in which case the corrections of order $1/r$ neglected in (4.8) become relevant. In particular, when Bob is moving with respect to Alice, the relation between his Bondi coordinates and those of Alice involves some time-dependent factors in the $1/r$ corrections. These corrections imply that the objects seen by Bob (be it stars, or cows on the side of the highway) do indeed flow past him when he gets close enough to them. In this sense the picture of the Millennium Falcon cockpit shown above is not completely wrong. Still, for stars located sufficiently far from Han Solo and Chewbacca, the $1/r$ corrections in (4.8) are negligible and the optical effect described by the contraction (4.9) is valid.
4.3.2 Subtleties
While $1/r$ corrections are the most obvious source of modifications to the result (4.9), there
are several other caveats in trying to apply Lorentz transformations to realistic situations
such as the jump to lightspeed in the Millennium Falcon. The first is the fact that the motion of the
spaceship during the jump is actually accelerated, so that Han Solo and Chewbacca are
definitely not inertial observers! This does not prevent us from guessing what should happen: roughly
speaking, accelerated motion may be seen as an infinite sequence of infinitesimal boosts, so
if (4.9) remains valid for each infinitesimal boost, one expects the celestial
sphere seen by an accelerated observer to undergo a time-dependent contraction (in the direction of
acceleration), with a scaling factor
that gets smaller and smaller as time goes by. More precisely, since rapidity is the integral (1.23)
of proper acceleration, the naive application of (4.9) to accelerated motion predicts that an
accelerated observer, looking at the celestial sphere in the direction of his acceleration, should see a
proper-time-dependent contraction with a scaling factor $e^{-\chi(\tau)}$, where $\chi(\tau)$ is the integral
(1.21) of proper acceleration over proper time $\tau$.
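The naive prediction above can be illustrated numerically. The following Python sketch rests on two assumptions that are not spelled out in this passage: the proper acceleration $a$ is constant, so that the rapidity is $\chi(\tau) = a\tau/c$, and the contraction (4.9) acts on the angle $\theta$ measured from the direction of motion as $\tan(\theta'/2) = e^{-\chi}\tan(\theta/2)$. The function names and the numerical values (an acceleration of 9.81 m/s², a star initially 60 degrees off axis) are illustrative choices only.

```python
import math

C = 299_792_458.0          # speed of light in m/s
YEAR = 365.25 * 24 * 3600  # seconds in a Julian year


def rapidity(proper_acceleration, proper_time):
    """Rapidity chi = a * tau / c, valid for constant proper acceleration (assumption)."""
    return proper_acceleration * proper_time / C


def apparent_angle(theta, chi):
    """Angle from the direction of motion after a boost of rapidity chi.

    Assumes the contraction takes the form tan(theta'/2) = exp(-chi) * tan(theta/2).
    """
    return 2.0 * math.atan(math.exp(-chi) * math.tan(theta / 2.0))


if __name__ == "__main__":
    theta0 = math.radians(60.0)  # hypothetical star, 60 degrees off the direction of motion
    for years in (0.0, 0.5, 1.0, 2.0, 4.0):
        chi = rapidity(9.81, years * YEAR)
        theta = math.degrees(apparent_angle(theta0, chi))
        print(f"tau = {years:3.1f} yr   chi = {chi:5.3f}   "
              f"scaling = {math.exp(-chi):6.4f}   theta' = {theta:6.2f} deg")
```

Under these assumptions the scaling factor decreases monotonically with proper time, so the star drifts steadily towards the direction of acceleration.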
The potential problem with this expectation is that special relativity was established for inertial observers from the very beginning, so one might fear that acceleration invalidates the application of special-relativistic techniques to the Millennium Falcon. Fortunately, there is in principle no obstacle in describing accelerated observers in special relativity [30]. For example, Thomas precession is a well known special-relativistic effect that applies to such observers [30, 49], and it is derived precisely by thinking of accelerated motion as a sequence of infinitesimal boosts. The only issue is that the reference frames associated with accelerated observers¹³ are not global coordinate systems: they do not cover the whole of space-time. This is related to the existence of horizons: for instance, a uniformly accelerated observer in Minkowski space-time (a Rindler observer) cannot receive light rays coming from behind his future horizon [30]. Thus, since our definition of celestial spheres relied on the limit $r\to+\infty$ in Bondi coordinates, one may wonder whether acceleration invalidates our approach, as the limit may be ill-defined. We will not attempt to address this issue here, but we will rederive formula (4.9) in a local way at the end of this section. This will confirm that the optical effects of boosts on the celestial sphere do not actually rely on a large $r$ limit, as already mentioned at the end of subsection 4.1. In particular, the local nature of the derivation implies that it remains valid for an accelerated observer, in the sense that acceleration deforms the shape of light-cones centered on the observer, as is of course well known in general relativity. Whether this deformation can be seen by an “asymptotic” computation analogous to the one explained above is another matter, which we will not discuss.
¹³ Such reference frames are usually defined by attaching a Fermi-Walker transported tetrad to the world-line of the observer, then using this tetrad to define a local coordinate system; see [30], chap. 6.
A second subtlety to be considered is the fact that the light seen by Han Solo and Chewbacca during the jump to lightspeed is blue-shifted due to the Doppler effect. As the velocity of the Millennium Falcon increases, the frequency of the light rays hitting the observers inside the cockpit increases as well. Eventually, the frequency should become so high that the stars actually become invisible: the starlight seen by the pilots of the spaceship has reached the ultraviolet region. Thus, the stars seen by Han Solo and Chewbacca not only move in the direction of acceleration, but they also change colour, becoming blue, then purple, then invisible¹⁴.
¹⁴ Strictly speaking, they become invisible to a human eye; to the best of our knowledge, it is not known whether Wookiees are able to see a broader spectrum of electromagnetic radiation than human beings: while the stars definitely become invisible to Han Solo, they might still be visible to Chewbacca.
This blue shift applies of course to any electromagnetic radiation reaching the observers inside the cockpit. In particular, it applies to the cosmic microwave background radiation¹⁵. Thus, at sufficiently high velocities, the background radiation should reach the visible spectrum and the actual picture seen from the cockpit of the Millennium Falcon should include a fuzzy disc of light centered around the direction of the motion [50, 51]. Upon taking this effect into account and recalling that most stars become invisible because of the blue shift, one concludes that the actual landscape seen by Han Solo and his hairy companion is indeed far, far away from the image shown in the movie.
¹⁵ Here we are assuming that the Star Wars took place in a universe that started off with a Big Bang.
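To get a feeling for the orders of magnitude involved, here is a short Python sketch of the blue shift. It uses the standard relativistic Doppler factor $E'/E = \cosh\chi + \sinh\chi\cos\theta$ for light arriving from an angle $\theta$ off the direction of motion (with $\theta$ measured in the frame where the source is at rest), which reduces to $e^{\chi}$ for light coming from straight ahead. The limits of human vision (380 to 750 nm), the 550 nm reference wavelength for starlight and the use of the Wien peak as a proxy for the microwave background are illustrative assumptions, as are all function names.

```python
import math

WIEN_B = 2.897771955e-3      # Wien displacement constant, in m*K
T_CMB = 2.725                # temperature of the microwave background, in K
VISIBLE_NM = (380.0, 750.0)  # rough limits of human vision, in nm (assumption)


def doppler_factor(chi, theta=0.0):
    """Ratio E'/E for light arriving from angle theta off the direction of motion."""
    return math.cosh(chi) + math.sinh(chi) * math.cos(theta)


def observed_wavelength(rest_nm, chi, theta=0.0):
    """Wavelength seen by the boosted observer, in nm."""
    return rest_nm / doppler_factor(chi, theta)


def rapidity_for_shift(rest_nm, target_nm):
    """Rapidity at which head-on light of wavelength rest_nm is seen at target_nm."""
    return math.log(rest_nm / target_nm)


if __name__ == "__main__":
    # Yellow-green starlight (550 nm) pushed to the violet edge of the visible range.
    chi_star = rapidity_for_shift(550.0, VISIBLE_NM[0])
    # Peak of the microwave background pushed to the red edge of the visible range.
    cmb_peak_nm = WIEN_B / T_CMB * 1e9
    chi_cmb = rapidity_for_shift(cmb_peak_nm, VISIBLE_NM[1])
    for label, chi in (("starlight leaves the visible range", chi_star),
                       ("background peak enters the visible range", chi_cmb)):
        print(f"{label}: chi = {chi:.3f}, 1 - v/c = {1.0 - math.tanh(chi):.3e}")
```

With these inputs, starlight from straight ahead leaves the visible range at a rapidity of roughly 0.37, whereas the background peak only becomes visible at a rapidity of roughly 7, in line with the qualitative statement that the stars fade long before the fuzzy disc appears.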
4.3.3 A local derivation
We have just seen that boosting an observer in a given direction affects his celestial sphere by
contracting all points of the sphere towards that direction. Given the counterintuitive nature of
this optical phenomenon, it is worthwhile to rederive it using a different technique. Namely, consider two
inertial observers, Alice and Bob, using inertial
coordinates $x^\mu$ and $x'^\mu$ respectively. We take Bob’s coordinates to be boosted, with rapidity
$\chi$, with respect to those of Alice. For definiteness, we will assume that the boost takes place along the
$x^3$ direction, so that Bob’s coordinates are obtained from Alice’s coordinates by acting
with the boost matrix (1.19).
Now suppose Alice and Bob both see one incoming photon, whose energy-momentum vector is
$p$ in Alice’s coordinates, and $p'$ in
Bob’s coordinates. We denote by $E$ and $E'$ the photon’s energy in Alice’s and Bob’s frames,
respectively. The angle $\theta$ (resp. $\theta'$) is the angle between the photon’s direction and the $x^3$ axis
in Alice’s (resp. Bob’s) frame. The question is: what is the relation between
$\theta'$ and $\theta$?
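Since 4-momentum transforms under boosts just as standard inertial coordinates do (the energy-momentum vector is a “four-vector”), we know that the photon’s 4-momenta in Bob’s and Alice’s coordinate systems are related by the same boost matrix (1.19). Explicitly, in terms of energies and angles, this means that
$$E' = E\,(\cosh\chi + \sinh\chi\cos\theta), \qquad E'\sin\theta' = E\sin\theta, \qquad E'\cos\theta' = E\,(\sinh\chi + \cosh\chi\cos\theta).$$
From this we read off
$$\tan\theta' = \frac{\sin\theta}{\sinh\chi + \cosh\chi\cos\theta},$$
which can be rewritten in terms of half angles as
$$\frac{2\tan(\theta'/2)}{1-\tan^2(\theta'/2)} = \frac{2\,e^{-\chi}\tan(\theta/2)}{1-e^{-2\chi}\tan^2(\theta/2)}.$$
This is a quadratic equation for $\tan(\theta'/2)$ as a function of $\tan(\theta/2)$. The solution that ensures $\theta'=\theta$ when $\chi=0$ is the simplest one,
$$\tan(\theta'/2) = e^{-\chi}\tan(\theta/2). \tag{4.10}$$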
Since here $\theta$ and $\theta'$ should be thought of as standard azimuthal coordinates on the sphere in
unprimed and primed coordinate systems, we can relate them to the stereographic coordinate $z$
through relation (3.18). The result (4.10) thus coincides with the
contraction (4.9), as it should. In particular, provided $\theta$ is in the first quadrant
(between $0$ and $\pi/2$), $\theta'$ is smaller than $\theta$ when the rapidity $\chi$ is
positive. (Conversely, when $\theta$ is larger than $\pi/2$, the angle $\pi-\theta'$ measured from the opposite direction is larger than $\pi-\theta$,
corresponding to the fact that points of the celestial sphere located in the direction opposite to the boost
undergo a dilation.)
The important difference between this computation and the one based on Bondi coordinates is the fact that here we never needed to take a “large $r$” limit. The result (4.10) is valid locally, for any boosted observer detecting a light ray. This implies in particular that an accelerated observer looking in the direction of his/her acceleration should see a time-dependent contraction of the celestial sphere, just as mentioned above for the case of Han Solo and Chewbacca.
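The local derivation lends itself to a quick numerical check. The Python sketch below boosts the null 4-momentum of a photon and compares the resulting angle with the half-angle law $\tan(\theta'/2) = e^{-\chi}\tan(\theta/2)$. Two conventions are assumed here and are not taken from the text: the boosted observer moves along $+x^3$, so the boost matrix carries $-\sinh\chi$ in its off-diagonal entries, and $\theta$ is the position of the source on the sky, i.e. the direction opposite to the photon’s spatial momentum. All function names are hypothetical.

```python
import math


def boost_x3(p, chi):
    """Boost a 4-vector p = (p0, p1, p2, p3) with rapidity chi along x^3.

    Sign convention (assumption): the primed observer moves along +x^3.
    """
    p0, p1, p2, p3 = p
    ch, sh = math.cosh(chi), math.sinh(chi)
    return (ch * p0 - sh * p3, p1, p2, -sh * p0 + ch * p3)


def photon_from_sky(theta, energy=1.0):
    """Null 4-momentum of a photon arriving from polar angle theta on the sky."""
    return (energy, -energy * math.sin(theta), 0.0, -energy * math.cos(theta))


def sky_angle(p):
    """Polar angle of the direction from which the photon arrives (opposite to its momentum)."""
    _, p1, p2, p3 = p
    return math.atan2(math.hypot(p1, p2), -p3)


def aberrated_sky_angle(theta, chi):
    """Sky angle seen by the boosted observer."""
    return sky_angle(boost_x3(photon_from_sky(theta), chi))


if __name__ == "__main__":
    for theta in (0.3, 1.2, 2.5):
        for chi in (-0.8, 0.0, 1.5):
            lhs = math.tan(aberrated_sky_angle(theta, chi) / 2.0)
            rhs = math.exp(-chi) * math.tan(theta / 2.0)
            print(f"theta = {theta:.1f}  chi = {chi:+.1f}  "
                  f"tan(theta'/2) = {lhs:.6f}  e^(-chi) tan(theta/2) = {rhs:.6f}")
```

No large-distance limit enters this check: only the transformation law of a single light ray is used, which is the point of the local derivation.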
5 Conclusion
Let us take a look back at what we have done. We have seen how the Lorentz group arises as the set of
homogeneous coordinate transformations between inertial observers in Minkowski space-time. Since it consists
of linear transformations, it can be represented in terms of matrices. We have then shown that (the
maximal connected subgroup of) this matrix group is isomorphic to $\mathrm{SL}(2,\mathbb{C})/\mathbb{Z}_2$, a type of relation
that also
occurs in other space-time dimensions. As observed in section 3, the group $\mathrm{SL}(2,\mathbb{C})/\mathbb{Z}_2$
also arises, somewhat coincidentally, as the group of conformal transformations of the sphere. The question,
then, was whether there exists a relation between the action of Lorentz transformations on space-time and that
of Möbius transformations on a sphere. We answered this question positively, by showing that the
difference between the celestial spheres seen by two inertial observers whose coordinates are related by a
Lorentz transformation is precisely a conformal transformation. Finally, we used this relation to
discuss the slightly unexpected optical phenomenon associated with boosts: we saw that an observer boosted in
a given direction sees the stars of his/her celestial sphere being dragged towards that direction. This
result is applicable, in particular, to the jump to lightspeed as seen from the cockpit of the
Millennium Falcon.
As emphasized in the introduction, the surprising aspect of the relation between Lorentz and conformal
transformations is the fact that it links the action of a group on a
four-dimensional space to its action on a two-dimensional manifold. Of course, from a mathematical viewpoint
there is nothing wrong with that, but from an intuitive viewpoint it is not a priori obvious that such
a connection has any physical meaning, i.e. that this relation can actually be seen in a concrete
experiment, such as accelerating in a spaceship. The purpose of these notes was to unveil that meaning,
which is well known in the literature but perhaps less well known to undergraduate students following
a course in special relativity, group theory, or even general relativity.
In fact, part of the motivation for these lectures was that the idea of relating some space to a
lower-dimensional subspace is closely connected to certain recent developments in the study of (quantum)
gravity, all encompassed under the general name of holography. Recall that a hologram is a
two-dimensional surface that produces a three-dimensional image; such an optical device is
typically found on credit cards or banknotes. The idea of holography in
quantum gravity [21, 22, 23, 24, 25] roughly states that there exists a
correspondence (and in certain regimes an actual equivalence) between a gravitational system in
a given number of space-time
dimensions and some quantum theory living on a lower-dimensional subspace of the gravitational
system; one says that the two theories are “dual” to each other. In particular, according to this idea,
the four-dimensional world that we see around us might be a “hologram” of some lower-dimensional
theory. This correspondence is motivated by countless computations matching quantities evaluated on the
high-dimensional, gravitational side, to some other quantities evaluated on the low-dimensional side; the
interested reader may consult the abundant literature on the subject.
The modest result derived in these notes may be seen as a remnant of the holographic principle: we have shown that Lorentz invariance in four dimensions becomes conformal invariance in two dimensions upon focusing on celestial spheres. In fact, this feature is only part of a much larger construction, that is still under study today. Indeed, it was shown in the sixties by Bondi, van der Burg, Metzner and Sachs [13, 14] that the natural symmetry group of four-dimensional “asymptotically Minkowskian” space-times is an infinite-dimensional extension of the Poincaré group, known nowadays as the Bondi-Metzner-Sachs (BMS) group. The transformations of space-time generated by this group precisely act on “null infinity”, the region that we used in section 4 to define celestial spheres, and extend the natural action of Poincaré transformations on that region. In the holographic context, the BMS group is to be interpreted as the symmetry group of the would-be (as yet conjectural) dual field theory; the latter, if it exists, is expected to be some version of a conformal field theory (recall the brief discussion of subsection 3.4), since Lorentz transformations are part of the symmetry group and act as conformal transformations on celestial spheres. BMS symmetry has recently been the focus of renewed interest, as it was shown that it can be extended to include arbitrary, local conformal transformations of the celestial spheres [15, 16, 17], and also that it is related to standard quantum field theory in Minkowski space through certain “soft theorems” that were known in a completely different language ever since the sixties [52] (see e.g. [18, 19, 20], references therein, and their follow-ups). Many more open problems remain to be settled, both regarding holography in general, and BMS symmetry in particular; the hope of the author is that addressing such questions may open the door to a deeper understanding of quantum gravity.
Acknowledgements
I am grateful to the organizers of the Brussels Summer School of Mathematics, and in particular to T. Connor, J. Distexhe, J. Meyer and P. Weber, for giving me the opportunity to share some of the beauties of physics with the audience of the school. I am also indebted to my teachers M. Bertelson, J. Federinov and M. Henneaux for lectures on physics and mathematics of exceptional quality, which were most helpful in preparing these notes. Finally, I wish to thank L. Donnay for pointing out a typo in earlier versions of the formula I wrote for the transformation law of advanced time. This work was supported by Fonds de la Recherche Scientifique-FNRS under grant number FC-95570.
References
- [1] J. Terrell, “Invisibility of the Lorentz Contraction,” Phys. Rev. 116 (Nov, 1959) 1041–1045.
- [2] R. Penrose, “The Apparent shape of a relativistically moving sphere,” Proc.Cambridge Phil.Soc. 55 (1959) 137–139.
- [3] R. Penrose and W. Rindler, Spinors and Space-Time: Volume 1, Two-Spinor Calculus and Relativistic Fields. Cambridge Monographs on Mathematical Physics. Cambridge University Press, 1987.
- [4] A. Held, E. T. Newman, and R. Posadas, “The Lorentz Group and the Sphere,” Journal of Mathematical Physics 11 (1970), no. 11, 3145–3154.
- [5] M. Henneaux, “Groupes et représentations: I. Groupe des rotations à 3 dimensions, groupe de Lorentz et groupe de Poincaré.” ULB, 2009. Available at http://www.ulb.ac.be/sciences/ptm/pmif/membres/notescours.html.
- [6] M. Henneaux, “Cours de relativité générale.” ULB, 2011.
- [7] J. Hladik and M. Chrysos, Introduction à la relativité restreinte: cours et exercices corrigés. Dunod, 2001.
- [8] J.-M. Lévy-Leblond, “One more derivation of the Lorentz transformation,” American Journal of Physics 44 (Mar., 1976) 271–277.
- [9] J.-M. Lévy-Leblond and J.-P. Provost, “Additivity, rapidity, relativity,” American Journal of Physics 47 (Dec., 1979) 1045–1049.
- [10] J.-M. Lévy-Leblond, “Speed(s),” American Journal of Physics 48 (May, 1980) 345–347.
- [11] P. Di Francesco, P. Mathieu, and D. Sénéchal, Conformal Field Theory. Springer, 1997.
- [12] R. Blumenhagen and E. Plauschinn, Introduction to Conformal Field Theory: With Applications to String Theory. Lecture notes in physics. Springer, 2009.
- [13] H. Bondi, M. G. J. van der Burg, and A. W. K. Metzner, “Gravitational waves in general relativity. vii. waves from axi-symmetric isolated systems,” Proceedings of the Royal Society of London A: Mathematical, Physical and Engineering Sciences 269 (1962), no. 1336, 21–52.
- [14] R. Sachs, “Asymptotic symmetries in gravitational theory,” Phys. Rev. 128 (Dec, 1962) 2851–2864.
- [15] G. Barnich and C. Troessaert, “Symmetries of asymptotically flat 4 dimensional spacetimes at null infinity revisited,” Phys.Rev.Lett. 105 (2010) 111103, 0909.2617.
- [16] G. Barnich and C. Troessaert, “Aspects of the BMS/CFT correspondence,” JHEP 1005 (2010) 062, 1001.1541.
- [17] T. Banks, “A Critique of pure string theory: Heterodox opinions of diverse dimensions,” hep-th/0306074. (see footnote 17).
- [18] A. Strominger, “On BMS Invariance of Gravitational Scattering,” JHEP 1407 (2014) 152, 1312.2229.
- [19] T. He, V. Lysov, P. Mitra, and A. Strominger, “BMS supertranslations and Weinberg’s soft graviton theorem,” 1401.7026.
- [20] D. Kapec, V. Lysov, S. Pasterski, and A. Strominger, “Semiclassical Virasoro symmetry of the quantum gravity $\mathcal{S}$-matrix,” JHEP 1408 (2014) 058, 1406.3312.
- [21] G. ’t Hooft, “Dimensional reduction in quantum gravity,” gr-qc/9310026.
- [22] L. Susskind, “The World as a hologram,” J.Math.Phys. 36 (1995) 6377–6396, hep-th/9409089.
- [23] J. M. Maldacena, “The Large N limit of superconformal field theories and supergravity,” Int.J.Theor.Phys. 38 (1999) 1113–1133, hep-th/9711200.
- [24] E. Witten, “Anti-de Sitter space and holography,” Adv.Theor.Math.Phys. 2 (1998) 253–291, hep-th/9802150.
- [25] S. Gubser, I. R. Klebanov, and A. M. Polyakov, “Gauge theory correlators from noncritical string theory,” Phys.Lett. B428 (1998) 105–114, hep-th/9802109.
- [26] A. Einstein, Relativity: the special and general theory. Henry Holt and Company, New York, 1920. Digital reprint available at http://www.ibiblio.org/ebooks/Einstein/Einstein_Relativity.pdf.
- [27] E. P. Wigner, “On Unitary Representations of the Inhomogeneous Lorentz Group,” Annals Math. 40 (1939) 149–204.
- [28] S. Weinberg, The Quantum Theory of Fields (Volume 1). Cambridge University Press, 1 ed., 1995.
- [29] F. Ferrari, “Cours de relativité générale.” ULB, 2010.
- [30] C. Misner, K. Thorne, and J. Wheeler, Gravitation. W. H. Freeman, 1973.
- [31] J. Cornwell, Group theory in physics. Techniques of physics. Academic Press, 1984.
- [32] B. Binegar, “Relativistic Field Theories in Three-Dimensions,” J.Math.Phys. 23 (1982) 1511.
- [33] D. Grigore, “The Projective unitary irreducible representations of the Poincare group in (1+2)-dimensions,” Journal of Mathematical Physics 34 (1993), no. 9, 4172–4189.
- [34] L. La Fuente-Gravy, “Le théorème de Hurwitz,” in Notes de la quatrième BSSM. Brussels, 2011. Available at https://bssm.ulb.ac.be/en/notes_2011.html.
- [35] T. Kugo and P. Townsend, “Supersymmetry and the division algebras,” Nuclear Physics B 221 (July, 1983) 357–380.
- [36] A. Sudbery, “Division algebras, (pseudo)orthogonal groups and spinors,” Journal of Physics A Mathematical General 17 (Apr., 1984) 939–955.
- [37] J. C. Baez, “The octonions,” Bull. Amer. Math. Soc 39 (2002) 145–205.
- [38] J. C. Baez and J. Huerta, “Division Algebras and Supersymmetry I,” 0909.0551.
- [39] J. C. Baez and J. Huerta, “Division Algebras and Supersymmetry II,” Adv.Theor.Math.Phys. 15 (2011) 1373–1410, 1003.3436.
- [40] J. Lee, Manifolds and Differential Geometry. Graduate studies in mathematics. American Mathematical Society, 2009.
- [41] F. Bourgeois, “Géométrie différentielle.” ULB, 2009. Available at http://homepages.ulb.ac.be/~nrichard/#MATH-F-310.
- [42] M. Bertelson, “Géométrie riemannienne.” ULB, 2011.
- [43] J. Brown and R. Churchill, Complex Variables and Applications. Brown-Churchill series. McGraw-Hill Higher Education, 2004.
- [44] M. Peskin and D. Schroeder, An Introduction to Quantum Field Theory. Advanced book classics. Addison-Wesley Publishing Company, 1995.
- [45] M. Duff, “Twenty years of the Weyl anomaly,” Class.Quant.Grav. 11 (1994) 1387–1404, hep-th/9308075.
- [46] A. A. Belavin, A. M. Polyakov, and A. B. Zamolodchikov, “Infinite conformal symmetry in two-dimensional quantum field theory,” Nuclear Physics B 241 (July, 1984) 333–380.
- [47] J. L. Cardy, “Operator Content of Two-Dimensional Conformally Invariant Theories,” Nucl.Phys. B270 (1986) 186–204.
- [48] “Star Wars Episode IV: A New Hope.” Dir. G. Lucas. Perf. M. Hamill, H. Ford, C. Fisher, P. Cushing, A. Guinness. 20th Century Fox, 1977.
- [49] W. Rindler, Relativity: Special, General, and Cosmological. Oxford University Press, 2001.
- [50] J. Argyle, R. Connors, K. Dexter, and C. Scoular, “P1_3 Relativistic Optics,” Journal of Physics Special Topics 11 (2012), no. 1.
- [51] R. Connors, J. Argyle, K. Dexter, and C. Scoular, “P1_5 Relativistic Optics Strikes Back,” Journal of Physics Special Topics 11 (2012), no. 1.
- [52] S. Weinberg, “Infrared photons and gravitons,” Phys.Rev. 140 (1965) B516–B524.