Notions of optimal transport theory and how to implement them on a computer
最优输送理论的概念及如何在计算机上实施它们

Bruno Lévy and Erica Schwindt
Bruno Lévy 和 Erica Schwindt

Abstract 摘要

This article gives an introduction to optimal transport, a mathematical theory that makes it possible to measure distances between functions (or distances between more general objects), to interpolate between objects or to enforce mass/volume conservation in certain computational physics simulations. Optimal transport is a rich scientific domain, with active research communities, both on its theoretical aspects and on more applicative considerations, such as geometry processing and machine learning. This article aims at explaining the main principles behind the theory of optimal transport, introduce the different involved notions, and more importantly, how they relate, to let the reader grasp an intuition of the elegant theory that structures them. Then we will consider a specific setting, called semi-discrete, where a continuous function is transported to a discrete sum of Dirac masses. Studying this specific setting naturally leads to an efficient computational algorithm, that uses classical notions of computational geometry, such as a generalization of Voronoi diagrams called Laguerre diagrams.
本文介绍了最优输运的数学理论，该理论使得能够衡量函数之间的距离（或更一般对象之间的距离），在对象之间进行插值，或在某些计算物理模拟中强制执行质量/体积保守性。最优输运是一个丰富的科学领域，拥有积极的研究社区，既包括其理论方面，也包括更实际的考虑，如几何处理和机器学习。本文旨在解释最优输运理论背后的主要原则，介绍涉及的不同概念，更重要的是它们之间的关系，以便读者把握构建它们的优雅理论的直觉。然后我们将考虑一个称为半离散的特定设置，其中连续函数被输送到 Dirac 质量的离散总和。研究这个特定设置自然地导致了一种高效的计算算法，该算法使用计算几何的经典概念，如称为拉盖尔图的 Voronoi 图的一般化。

1 Introduction 1 介绍

This article presents an introduction to optimal transport. It summarizes and complements a series of conferences given by B. Lévy between 2014 and 2017. The presentations stays at an elementary level, that corresponds to a computer scientist’s vision of the problem. In the article, we stick to using standard notions of analysis (functions, integrals) and linear algebra (vectors, matrices), and give an intuition of the notion of measure. The main objective of the presentation is to understand the overall structure of the reasoning ¹¹1Teach principles, not equations. [R. Feynman]
教授原理，而非方程式。 [R. Feynman]
本文介绍了最优输运。它总结和补充了 B. Lévy 在 2014 年至 2017 年间举办的一系列会议。演示在一个初等水平上停留，对应于计算机科学家对问题的看法。在本文中，我们坚持使用分析的标准概念（函数，积分）和线性代数（向量，矩阵），并提供关于测度概念的直觉。这篇演讲的主要目标是理解推理的整体结构 ¹ , and to follow a continuous path from the theory to an efficient algorithm that can be implemented in a computer.
以及从理论到有效算法的连续路径，这些算法可以在计算机中实现。

Optimal transport, initially studied by Monge, [Mon84], is a very general mathematical framework that can be used to model a wide class of application domains. In particular, it is a natural formulation for several fundamental questions in computer graphics [Mém11, Mér11, BvdPPH11], because it makes it possible to define new ways of comparing functions, of measuring distances between functions and interpolating between two (or more) functions :
最优输运最初由蒙日[Monge，Mon84]研究，是一个非常通用的数学框架，可以用来模拟广泛的应用领域。特别是，它是计算机图形学中几个基本问题的自然表述[Mém11，Mér11，BvdPPH11]，因为它可以定义比较函数、测量函数之间距离以及插值两个（或多个）函数的新方法：

Refer to caption — Figure 1: Comparing functions: one would like to say that $f_{1}$ is nearer to $f_{2}$ than $f_{3}$ , but the classical $L_{2}$ norm “does not see” that the graph of $f_{2}$ corresponds to the graph of $f_{1}$ slightly shifted along the $x$ axis.
图 1：比较函数：人们希望说 $f_{1}$ 比 $f_{3}$ 更靠近 $f_{2}$ ，但是经典的 $L_{2}$ 范数“看不到” $f_{2}$ 的图与 $f_{1}$ 的图沿 $x$ 轴稍微移动的对应关系。

Comparing functions 比较函数

Consider the functions $f_{1}$ , $f_{2}$ and $f_{3}$ in Figure 1. Here we have chosen a function $f_{1}$ with a wildly oscillating graph, and a function $f_{2}$ obtained by translating the graph of $f_{1}$ along the $x$ axis. The function $f_{3}$ corresponds to the mean value of $f_{1}$ (or $f_{2}$ ). If one measures the relative distances between these functions using the classical $L_{2}$ norm, that is $d_{L_{2}}(f,g)=\int(f(x)-g(x))^{2}dx$ , one will find that $f_{1}$ is nearer to $f_{3}$ than $f_{2}$ . Optimal transport makes it possible to define a distance that will take into account that the graph of $f_{2}$ can be obtained from $f_{1}$ through a translation (like here), or through a deformation of the graph of $f_{1}$ . From the point of view of this new distance, the function $f_{1}$ will be nearer to $f_{2}$ than to $f_{3}$ .
考虑图 1 中的函数 $f_{1}$ ， $f_{2}$ 和 $f_{3}$ 。这里我们选择了一个图形急剧振荡的函数 $f_{1}$ ，以及一个通过沿 $x$ 轴平移 $f_{1}$ 的图形获得的函数 $f_{2}$ 。函数 $f_{3}$ 对应于 $f_{1}$ （或 $f_{2}$ ）的平均值。如果使用经典的 $L_{2}$ 范数测量这些函数之间的相对距离，即 $d_{L_{2}}(f,g)=\int(f(x)-g(x))^{2}dx$ ，就会发现 $f_{1}$ 比 $f_{2}$ 更接近 $f_{3}$ 。最优输运使得可能定义一个考虑到 $f_{2}$ 的图形可以通过平移（就像这里一样）或通过 $f_{1}$ 的图形的变形从 $f_{1}$ 获得。从这种新距离的视角来看，函数 $f_{1}$ 将比 $f_{3}$ 更接近 $f_{2}$ 。

Interpolating between functions:
在函数之间插值：

Now consider the functions $u$ and $v$ in Figure 2. Here we suppose that $u$ corresponds to a certain physical quantity measured at an initial time $t=0$ and that $v$ corresponds to the same phenomenon measured at a final time $t=1$ . The problem that we consider now consists in reconstructing what happened between $t=0$ and $t=1$ . If we use linear interpolation (Figure 2, top-right), we will see the left hump progressively disappearing while the right hump progressively appears, which is not very realistic if the functions represent for instance a propagating wave. Optimal transport makes it possible to define another type of interpolation (Mc. Cann’s displacement interpolation, Figure 2, bottom-right), that will progressively displace and morph the graph of $u$ into the graph of $v$ .
现在考虑图 2 中的函数 $u$ 和 $v$ 。在这里，我们假设 $u$ 对应于在初始时间 $t=0$ 测量的某个物理量，而 $v$ 对应于在最终时间 $t=1$ 测量的同一现象。我们现在考虑的问题是重建 $t=0$ 和 $t=1$ 之间发生的事情。如果我们使用线性插值（图 2，右上角），我们会看到左侧的小丘逐渐消失，而右侧的小丘逐渐出现，这对于函数例如代表传播波的情况并不是很现实的。最优输运使得可以定义另一种类型的插值（Mc. Cann 的位移插值，图 2，右下角），这将逐渐将 $u$ 的图形位移和变形为 $v$ 的图形。

Optimal transport makes it possible to define a geometry of a space of functions²²2or more general objects, called probability measures, more on this later.
或更通用的对象，称为概率测度，以后会详细介绍。
最优输运使得可以定义函数空间的几何 ² , and thus gives a definition of distance in this space, as well as means of interpolating between different functions, and in general, defining the barycenter of a weighted family of functions, in a very general context. Thus, optimal transport appears as a fundamental tool in many applied domains. In computer graphics, applications were proposed, to compare and interpolate objects of diverse natures [BvdPPH11], to generate lenses that can concentrate light to form caustics in a prescribed manner [MMT17, STTP14]. Moreover, optimal transport defines new tools that can be used to discretize Partial Differential Equations, and define new numerical solution mechanisms [BCMO14]. This type of numerical solution mechanism can be used to simulate for instance fluids [GM17], with spectacular applications and results in computer graphics [dGWH⁺15].
，从而为该空间提供了距离的定义，以及在不同函数之间插值的手段，通常在非常一般的情况下定义加权函数族的重心。因此，最优输运在许多应用领域中都是一种基本工具。在计算机图形学中，有一些应用被提出，用于比较和插值不同性质的对象[BvdPPH11]，生成可以聚焦光线以预定方式形成焦散光斑的透镜[MMT17, STTP14]。此外，最优输运定义了可以用来离散化偏微分方程并定义新的数值解机制[BCMO14]的新工具。这种数值解机制可以用来模拟例如流体[GM17]，在计算机图形学中具有引人注目的应用和结果[dGWH ⁺ 15]。

The two sections that follow are partly inspired by [Vil09], [San14], [Caf03] and [AG13], but stay at an elementary level. Here the main goal is to give an intuition of the different concepts, and more importantly an idea of the way the relate together. Finally we will see how they can be directly used to design a computational algorithm with very good performance, that can be used in practice in several application domains.
以下两个部分在一定程度上受到[Vil09]，[San14]，[Caf03]和[AG13]的启发，但保持在初级水平。在这里，主要目标是给出对不同概念的直觉，更重要的是对它们之间的关系的一种理解。最后，我们将看到如何直接使用它们来设计一个性能非常优异的计算算法，该算法可以在实践中在多个应用领域中使用。

2 Monge’s problem 蒙日问题

The initial optimal transport problem was first introduced and studied by Monge, right before the French revolution [Mon84]. We first give an intuitive idea of the problem, then quickly introduce the notion of measure, that is necessary to formally state the problem in its most general form and to analyze it.
最初的最优输运问题是在法国大革命前由蒙日首次引入和研究的[Mon84]。我们首先直观地介绍问题的概念，然后快速引入测度的概念，这对于在其最一般形式下正式陈述问题并分析问题是必要的。

2.1 Intuition 直觉

Monge’s initial motivation to study this problem was very practical: supposing you have an army of workmen, how can you transform a terrain with an initial landscape into a given desired target landscape, while minimizing the total amount of work ?
蒙日最初研究这个问题的动机非常实际：假设你有一支工人队，如何将初始地形转变为目标地形，同时最大限度地减少总工作量？

Monge’s initial problem statement was as follows:
蒙日最初的问题陈述如下：

\begin{array}[]{l}\inf\limits_{T:X\rightarrow X}\int\limits_{X}c(x,T(x))u(x)dx\\[17.07164pt] \mbox{\ \ \ \ subject to: }\\[17.07164pt] \forall B\subset X,\int\limits_{T^{-1}(B)}u(x)dx=\int\limits_{B}v(x)dx\end{array}

where $X$ is a subset of ${\mathbb{R}}^{2}$ , $u$ and $v$ are two positive functions defined on $X$ and such that $\int_{X}u(x)dx$ = $\int_{Y}v(x)dx$ , and $c(\cdot,\cdot)$ is a convex distance (the Euclidean distance in Monge’s initial problem statement).
其中 $X$ 是 ${\mathbb{R}}^{2}$ 的子集， $u$ 和 $v$ 是在 $X$ 上定义的两个正函数，使得 $\int_{X}u(x)dx$ = $\int_{Y}v(x)dx$ ， $c(\cdot,\cdot)$ 是一个凸距离（蒙日初始问题陈述中的欧几里得距离）。

The functions $u$ and $v$ represent the height of the current landscape and the height of the target landscape respectively (symbolized as gray levels in Figure 3). The problem consists in finding (if it exists) a function $T$ from $X$ to $X$ that transforms the current landscape $u$ into the desired one $v$ , while minimizing the product of the amount of transported earth $u(x)$ with the distance $c(x,T(x))$ to which it was transported. Clearly, the amount of earth is conserved during transport, thus the total quantity of earth should be the same in the source and target landscapes (the integrals of $u$ and $v$ over $X$ should coincide). This global matter conservation constraint needs to be completed with a local one. The local matter conservation constraint enforces that in the target landscape, the quantity of earth received in any subset $B$ of $X$ corresponds to what was transported here, that is the quantity of earth initially present in the pre-image $T^{-1}(B)$ of $B$ under $T$ . Without this constraint, one could locally create matter in some places and annihilate matter in other places in a counterbalancing way. A map $T$ that satisfies the local mass conservation constraint is called a transport map.
函数 $u$ 和 $v$ 分别表示当前地貌的高度和目标地貌的高度(在图 3 中以灰度级表示)。问题在于找到（如果存在）一个从 $X$ 到 $X$ 的函数 $T$ ，将当前地貌 $u$ 转变为期望的地貌 $v$ ，同时最小化通过运输的土壤数量 $u(x)$ 与其运输距离 $c(x,T(x))$ 的乘积。显然，土壤的数量在运输过程中是不变的，因此源地貌和目标地貌中土壤的总量应当相同（在 $X$ 上 $u$ 和 $v$ 的积分应当一致）。这一全局物质守恒约束需要进一步补充一条局部物质守恒约束。局部物质守恒约束要求在目标地貌中，任意子集 $B$ 所接收的土壤数量应与运输到这里的土壤数量对应，也就是最初存在于 $T$ 下的 $B$ 的前像中土壤数量应当一致。如果没有这一约束，人们可以在一些地方本地创造物质，并在其他地方以相等的方式消除物质。符合局部质量守恒约束的地图称为输运地图。

2.2 Monge’s problem with measures
2.2 蒙日问题与度量

We now suppose that instead of a “target landscape”, we wish to transport earth (or a resource) towards a set of points (that will be denoted by $Y$ for now on), that represent for instance a set of factories that exploit a resource, see Figure 4. Each factory wishes to receive a certain quantity of resource (depending for instance of the number of potential customers around the factory). Thus, the function $v$ that represents the “target landscape” is replaced with a function on a finite set of points. However, if a function $v$ is zero everywhere except on a finite set of points, then its integral over $X$ is also zero. This is a problem, because for instance one cannot properly express the mass conservation constraint. For this reason, the notion of function is not rich enough for representing this configuration. One can use instead measures (more on this below), and associate with each factory a Dirac mass weighted by the quantity of resource to be transported to the factory.
现在我们假设，我们不是想要将土地（或资源）运输到“目标景观”，而是希望将土地（或资源）运输到一组点（暂时表示为 $Y$ ）。这些点例如代表一组开采资源的工厂，参见图 4。每个工厂都希望获得一定数量的资源（取决于工厂周围潜在客户的数量）。因此，表示“目标景观”的函数 $v$ 被替换为一组有限点上的函数。然而，如果函数 $v$ 在除一组有限点外的所有地方均为零，则其在 $X$ 上的积分也为零。这是一个问题，因为例如不能正确表达质量守恒约束。因此，函数的概念不足以表示这种配置。可以改用度量（下文详述），并将每个工厂与要运往工厂的资源数量加权的 Dirac 质量相关联。

From now on, we will use measures $\mu$ and $\nu$ to represent the “current landscape” and the “target landscape”. These measures are supported by sets $X$ and $Y$ , that may be different sets (in the present example, $X$ is a subset of ${\mathbb{R}}^{2}$ and $Y$ is a discrete set of points). Using measures instead of function not only makes it possible to study our “transport to discrete set of factories” problem, but also it can be used to formalize computer objects (meshes) and directly leads to a computational algorithm. This algorithm is very elegant because it is a verbatim computer translation of the mathematical theory (see §7.6). In this particular setting, translating from the mathematical language to the algorithmic setting does not require to make any approximation. This is made possible by the generality of the notion of measure.
从现在开始，我们将使用测度 $\mu$ 和 $\nu$ 来表示“当前景观”和“目标景观”。这些测度由集合 $X$ 和 $Y$ 支持，这些集合可能是不同的（在当前示例中， $X$ 是 ${\mathbb{R}}^{2}$ 的子集， $Y$ 是一组离散的点）。使用测度而不是函数不仅可以研究我们的“运输到离散工厂集合”的问题，还可以用来形式化计算机对象（网格），并直接导致一个计算算法。这个算法非常优雅，因为它是数学理论的逐字计算机转换（见第 7.6 节）。在这种特定情境中，将数学语言转化为算法设置不需要进行任何近似。这是由测度概念的一般性所实现的。

The reader who wishes to learn more on measure theory may refer to the textbook [Tao11]. To keep the length of this article reasonable, we will not give here the formal definition of a measure. In our context, one can think of a measure as a “function” that can be only queried using integrals and that can be “concentrated” on very small sets (points). The following table can be used to intuitively translate from the “language of functions” to the “language of measures” :
希望学习测度论的读者可以参考教材[Tao11]。为了保持本文的长度合理，我们不会在此处给出测度的正式定义。在我们的语境中，可以将测度看作是一个只能通过积分查询并能“聚集”在非常小的集合（点）上的“函数”。以下表格可用于直观地从“函数的语言”翻译成“测度的语言”：

\begin{array}[]{|c|c|}\hline\cr\mbox{Function $u$}&\mbox{Measure $\mu$ }\\[5.69054pt] \hline\cr\int_{B}u(x)dx&\mu(B)\mbox{ or }\int_{B}d\mu\\[5.69054pt] \int_{B}f(x)u(x)dx&\int_{B}f(x)d\mu\\[5.69054pt] u(x)&N/A\\ \hline\cr\end{array}

(Note: in contrast with functions, measures cannot be evaluated at a point, they can be only integrated over domains).
（注意：与函数相对应，测度不能在一个点被评估，它们只能在域上被积分）。

In its version with measures, Monge’s problem can be stated as follows:
在具有测度的版本中，蒙日问题可以陈述如下：

\begin{array}[]{l}\inf\limits_{T:X\rightarrow Y}\int\limits_{X}c(x,T(x))d\mu\ \mbox{ subject to }\ \nu=T\sharp\mu\end{array}

(M)

where $X$ and $Y$ are Borel sets (that is, sets that can be measured), $\mu$ and $\nu$ are two measures on $X$ and $Y$ respectively such that $\mu(X)=\nu(Y)$ and $c(\cdot,\cdot)$ is a convex distance. The constraint $\nu=T\sharp\mu$ , that reads “ $T$ pushes $\mu$ onto $\nu$ ” corresponds to the local mass conservation constraint. Given a measure $\mu$ on $X$ and a map $T$ from $X$ to $Y$ , the measure $T\sharp\mu$ on $Y$ , called “the pushforward of $\mu$ by $T$ ”, is such that $T\sharp\mu(B)=\mu(T^{-1}(B))$ for all Borel set $B\subset Y$ . Thus, the local mass conservation constraint means that $\mu(T^{-1}(B))=\nu(B)$ for all Borel set $B$ $\subset$ $Y$ .
其中 $X$ 和 $Y$ 是 Borel 集合（即可以被测量的集合）， $\mu$ 和 $\nu$ 分别是 $X$ 和 $Y$ 上的两个测度，使得 $\mu(X)=\nu(Y)$ 和 $c(\cdot,\cdot)$ 是一个凸距离。约束条件 $\nu=T\sharp\mu$ ，“ $T$ 将 $\mu$ 推到 $\nu$ 上”对应于局部质量守恒约束。给定 $X$ 上的一个测度 $\mu$ 和从 $X$ 到 $Y$ 的映射 $T$ ， $Y$ 上的测度 $T\sharp\mu$ ，称为“ $\mu$ 被 $T$ 推送的前向”，满足所有 Borel 集合 $B\subset Y$ 。因此，局部质量守恒约束意味着所有 Borel 集合 $B$ 上 $\subset$ $Y$ 。

The local mass conservation constraint makes the problem very difficult: imagine now that you want to implement a computer program that enforces it: the constraint concerns all the subsets $B$ of $Y$ . Could you imagine an algorithm that just tests whether a given map satisfies it ? What about enforcing it ? We will see below a series of transformations of the initial problem into equivalent problems, where the constraint becomes linear. We will finally end up with a simple convex optimization problem, that can be solved numerically using classical methods.
本地质量守恒约束使问题变得非常困难：现在想象一下，您想要实现一个强制执行它的计算机程序：约束涉及所有子集 $B$ $Y$ 。您能想象一个仅测试给定映射是否满足它的算法吗？关于强制执行它呢？我们将在下面看到一系列初始问题转化为等价问题的转换，约束变为线性。最后，我们将得到一个简单的凸优化问题，可以使用经典方法进行数值解决。

Before then, let us get back to examine the original problem. The local mass conservation constraint is not the only difficulty: the functional optimized by Monge’s problem is non-symmetric, and this causes additional difficulties when studying the existence of solutions for problem (M). The problem is not symmetric because $T$ needs to be a map, therefore the source and target landscape do not play the same role. Thus, it is possible to merge earth (if $T(x_{1})=T(x_{2})$ for two different points $x_{1}$ and $x_{2}$ ), but it is not possible to split earth (for that, we would need a “map” $T$ that could send the same point $x$ to two different points $y_{1}$ and $y_{2}$ ). The problem is illustrated in Figure 5: suppose that you want to compute the optimal transport between a segment $L_{1}$ (that symbolizes a “wall of earth”) and two parallel segments $L_{2}$ and $L_{3}$ (that symbolize two “trenches” with a depth that correspond to half the height of the wall of earth). Now we want to transport the wall of earth to the trenches, to make the landscape flat. To do so, it is possible to decompose $L_{1}$ into segments of length $h$ , sent alternatively towards $L_{2}$ and $L_{3}$ (Figure 5 on the left). For any length $h$ , it is always possible to find a better map $T$ , that is a lower value of the functional in (M), by subdividing $L_{1}$ into smaller segments (Figure 5 on the right). The best way to proceed consists in sending from each point of $L_{1}$ half the earth to $L_{2}$ and half the earth to $L_{3}$ , which cannot be represented by a map. Thus, the best solution of problem (M) is not a map. In a more general setting, this problem appears each time the source measure $\mu$ has mass concentrated on a manifold of dimension $d-1$ [McC95] (like the segment $L_{1}$ in the present example).
在此之前，让我们回到检查原始问题。当地质量守恒约束并不是唯一的困难时：蒙热问题的优化功能是非对称的，这在研究问题（M）的解存在时导致额外困难。该问题不对称是因为 $T$ 需要是一张地图，因此源地形和目标地形扮演的角色并不相同。因此，可以合并大地（如果 $T(x_{1})=T(x_{2})$ 两个不同点 $x_{1}$ 和 $x_{2}$ ），但不可能分开地球（为此，我们需要一张“地图” $T$ 能将同一点 $x$ 发送到两个不同点 $y_{1}$ 和 $y_{2}$ ）。该问题在图 5 中得到了说明：假设你想要计算一个代表“一堵土墙”的线段 $L_{1}$ 和两个平行线段 $L_{2}$ 和 $L_{3}$ 之间的最佳传输（代表两个深度相当于土墙高度一半的“沟渠”）。现在我们想把土墙运输到沟渠，使地形平坦。为了做到这一点，可以将 $L_{1}$ 分解为长度为 $h$ 的线段，交替地发送到 $L_{2}$ 和 $L_{3}$ （左边的图 5）。对于任意长度 $h$ ，总是可以通过将 $L_{1}$ 细分为更小的段（右侧第 5 图）而找到一个更好的地图 $T$ ，即(M)中的功能值更低的地图。最佳操作方法是从 $L_{1}$ 的每一点向 $L_{2}$ 发送地球的一半，向 $L_{3}$ 发送地球的一半，而这不能用地图表示。因此，问题(M)的最佳解决方案不是地图。在一个更一般的设定中，每当源测度 $\mu$ 在一个维度 $d-1$ 的流形上有集中质量时[McC95]（例如当前示例中的段 $L_{1}$ ）时，该问题会出现。

3 Kantorovich’s relaxed problem
3Kantorovich 的放松问题

To overcome this difficulty, Kantorovich stated a problem with a larger space of solutions, that is, a relaxation of problem (M), where mass can be both split and merged. The idea consists in solving for the “graph of $T$ ” instead of $T$ . One may think of the graph of $T$ as a function $g$ defined on $X\times Y$ that indicates for each couple of points $x\in X,y\in Y$ how much matter goes from $x$ to $y$ . However, once again, we cannot use standard functions to represent the graph of $T$ : if you think about the graph of a univariate function $x\mapsto f(x)$ , it is defined on $\mathbb{R}^{2}$ but concentrated on a curve. For this reason, as in our previous example with factories, one needs to use measures. Thus, we are now looking for a measure $\gamma$ supported by the product space $X\times Y$ . The relaxed problem is stated as follows:
为了克服这一困难，坎托罗维奇提出了一个具有更大解空间的问题，即问题（M）的放松，其中质量既可以分裂也可以合并。这个想法在解决“ $T$ 的图”而不是 $T$ 时得以体现。我们可以将 $T$ 的图想象为定义在 $X\times Y$ 上的函数 $g$ ，它指示每对点 $x\in X,y\in Y$ 之间有多少物质从 $x$ 流向 $y$ 。然而，再一次，我们不能使用标准函数来表示 $T$ 的图：如果你想到一个一元函数 $x\mapsto f(x)$ 的图，它在 $\mathbb{R}^{2}$ 上定义，但集中在一条曲线上。因此，与我们之前关于工厂的例子一样，我们需要使用测度。因此，我们现在正在寻找一个支持产品空间 $X\times Y$ 的测度 $\gamma$ 。放松后的问题陈述如下：

\begin{array}[]{l}\inf\limits_{\gamma}\left\{\int\limits_{X\times Y}c(x,y)d\gamma\ |\ \gamma\geq 0\mbox{ and }\gamma\in\Pi(\mu,\nu)\right\}\\[14.22636pt] \mbox{where: }\\[8.53581pt] \Pi(\mu,\nu)=\{\gamma\in{\cal P}(X\times Y)\ |\ (P_{X})\sharp\gamma=\mu\ ;\ (P_{Y})\sharp\gamma=\nu\}\end{array}

(K)

where $(P_{X})$ and $(P_{Y})$ denote the two projections $(x,y)\in X\times Y\mapsto x$ and $(x,y)\in X\times Y\mapsto y$ respectively.
其中 $(P_{X})$ 和 $(P_{Y})$ 表示两个投影 $(x,y)\in X\times Y\mapsto x$ 和 $(x,y)\in X\times Y\mapsto y$ 。

The two measures $(P_{X})\sharp\gamma$ and $(P_{Y})\sharp\gamma$ obtained by pushing forward $\gamma$ by the two projections are called the marginals of $\gamma$ . The measures $\gamma$ in the admissible set $\Pi(\mu,\nu)$ , that is, the measures that have $\mu$ and $\nu$ as marginals, are called optimal transport plans. Let us now have a closer look at the two constraints on the marginals $(P_{X})\sharp\gamma=\mu$ and $(P_{X})\sharp\gamma$ that define the set of optimal transport plans $\Pi(\mu,\nu)$ . Recalling the definition of the pushforward (previous subsection), these two constraints can also be written as:
通过两个投射获得的由 $\gamma$ 推动而来的两个值 $(P_{X})\sharp\gamma$ 和 $(P_{Y})\sharp\gamma$ 被称为 $\gamma$ 的两个边际。在可接受集合 $\Pi(\mu,\nu)$ 中 $\gamma$ 度量，即具有 $\mu$ 和 $\nu$ 作为边际的测度，被称为最优输运方案。现在让我们更仔细地看看定义最优输运方案 $\Pi(\mu,\nu)$ 集合的两个边际 $(P_{X})\sharp\gamma=\mu$ 和 $(P_{X})\sharp\gamma$ 的两个约束条件。回顾向前推进的定义（前面的小节），这两个约束条件也可以写成：

\begin{array}[]{lcl}(P_{X})\sharp\gamma=\mu&\iff&\forall B\subset X,\int_{B}d\mu=\int_{B\times Y}d\gamma\\[5.69054pt] (P_{Y})\sharp\gamma=\nu&\iff&\forall B^{\prime}\subset Y,\int_{B^{\prime}}d\nu=\int_{X\times B^{\prime}}d\gamma.\end{array}

(1)

Intuitively, the first constraint $(P_{X})\sharp\gamma=\mu$ means that everything that comes from a subset $B$ of $X$ should correspond to the amount of matter (initially) contained by $B$ in the source landscape, and the second one $(P_{Y})\sharp\gamma=\nu$ means that everything that goes into a subset $B^{\prime}$ of $Y$ should correspond to the (prescribed) amount of matter contained by $B^{\prime}$ in the target landscape $\nu$ .
直观上，第一个约束条件 $(P_{X})\sharp\gamma=\mu$ 意味着来自 $X$ 的子集的一切应该对应于来源景观 $B$ 中包含的物质量（最初），而第二个条件 $(P_{Y})\sharp\gamma=\nu$ 意味着进入 $Y$ 的子集的一切应该对应于目标景观 $\nu$ 中 $B^{\prime}$ 所包含的（规定的）物质量。

We now examine the relation between the relaxed problem (K) and the initial problem (M). One can easily check that among the optimal transport plans, those with the form $(Id\times T)\sharp\mu$ correspond to a transport map $T$ :
我们现在研究松弛问题（K）与初始问题（M）之间的关系。可以很容易地检查到，在最优输运方案中，那些形式为 $(Id\times T)\sharp\mu$ 的方案对应于一个输运映射 $T$ ：

Observation 1.

If $(Id\times T)\sharp\mu$ is a transport plan, then $T$ pushes $\mu$ onto $\nu$ .

观察 1. 如果

(Id\times T)\sharp\mu

是一个输运方案，则

T

会将

\mu

推送到

\nu

。

Proof.

$(Id\times T)\sharp\mu$ is in $\Pi(\mu,\nu)$ , thus $(P_{Y})\sharp(Id\times T)\sharp\mu=\nu$ , or $\left((P_{Y})\circ(Id\times T)\right)\sharp\mu=\nu$ , and finally $T\sharp\mu=\nu$ . ∎

证明。

(Id\times T)\sharp\mu

在

\Pi(\mu,\nu)

中，因此

(P_{Y})\sharp(Id\times T)\sharp\mu=\nu

，或

\left((P_{Y})\circ(Id\times T)\right)\sharp\mu=\nu

，最终

T\sharp\mu=\nu

。∎

We can now observe that if a transport plan $\gamma$ has the form $\gamma=(Id\times T)\sharp\mu$ , then problem (K) becomes:
我们现在可以观察到，如果一个运输计划 $\gamma$ 的形式为 $\gamma=(Id\times T)\sharp\mu$ ，那么问题(K)就变成了：

\mbox{min}\left\{\int\limits_{X\times Y}c(x,y)d\left((Id\times T)\sharp\mu\right)\right\}\quad=\quad\mbox{min}\left\{\int\limits_{X}c(x,T(x))d\mu\right)

(one retrieves the initial Monge problem).
（其中一个是检索最初的蒙日问题）。

To further help grasping an intuition of this notion of transport plan, we show four 1D examples in Figure 6 (the transport plan is then $1D\times 1D=2D$ ). Intuitively, the transport plan $\gamma$ may be thought of as a “table” indexed by $x$ and $y$ that indicates the quantity of matter transported from $x$ to $y$ . More exactly³³3We recall that one cannot evaluate a measure $\gamma$ at a point $(x,y)$ , we can just compute integrals with $\gamma$ .
我们回顾一下，不能在点 $(x,y)$ 处评估度量 $\gamma$ ，我们只能计算 $\gamma$ 的积分。
为了进一步帮助理解这种运输计划的概念，我们在图 6 中展示了四个 1D 示例（然后运输计划为 $1D\times 1D=2D$ ）。直觉上，运输计划 $\gamma$ 可以被认为是由 $x$ 和 $y$ 索引的“表”，指示从 $x$ 运输到 $y$ 的物质数量。更确切地说 ³ , the measure $\gamma$ is non-zero on subsets of $X\times Y$ that contain points $(x,y)$ such that some matter is transported from $x$ to $y$ . Whenever $\gamma$ derives from a transport map $T$ , that is if $\gamma$ has the form $(Id\times T)\sharp\mu$ , then we can consider $\gamma$ as the “graph of $T$ ” like in the first two examples of Figure 6 (A) and (B)⁴⁴4Note that the measure $\mu$ is supposed to be absolutely continuous with respect to the Lebesgue measure. This is required, because for instance in example (B) of Figure 6, the transport map $T$ is undefined at the center of the segment. The absolute continuity requirement allows one to remove from $X$ any subset with zero measure.
注意，度量 $\mu$ 应该对勒贝格测度是绝对连续的。这是必需的，因为例如在图 6 的示例（B）中，运输映射 $T$ 在该段的中心处是未定义的。绝对连续性要求允许从 $X$ 中去除任何零度量子集。
，度量 $\gamma$ 在包含点 $(x,y)$ 的 $X\times Y$ 的子集上非零，点 $(x,y)$ 使从 $x$ 到 $y$ 转移了一些物质。每当 $\gamma$ 由运输图 $T$ 推导出来时，即如果 $\gamma$ 具有形式 $(Id\times T)\sharp\mu$ ，那么我们可以将 $\gamma$ 视为“ $T$ 的图”，就像图 6（A）和（B）的前两个示例中 ⁴ 。 in Figure 6. 在图 6 中。

The transport plans of the two other examples (C) and (D) have no associated transport map, because they split Dirac masses. The transport plan associated with Figure 5 has the same nature (but this time in $2D\times 2D=4D$ ). It cannot be written with the form $(Id\times T)\sharp\mu$ because it splits the mass concentrated in $L_{1}$ into $L_{2}$ and $L_{3}$ .
另外两个示例（C）和（D）的运输计划没有关联的传输地图，因为它们分裂了狄拉克质量。与图 5 相关的运输计划具有相同的性质（但这次是在 $2D\times 2D=4D$ ）。它不能用形式 $(Id\times T)\sharp\mu$ 来写，因为它将集中在 $L_{1}$ 的质量分裂成 $L_{2}$ 和 $L_{3}$ 。

Now the theoretical questions are:
现在的理论问题是：

•

when does an optimal transport plan exist ?

• 何时存在最优输运方案？
•

when does it admit an associated optimal transport map ?

• 何时承认其存在关联的最优输运图？

A standard approach to tackle this type of existence problem is to find a certain regularity both in the functional and in the space of the admissible transport plans, that is, proving that the functional is sufficiently “smooth” and finding a compact set of admissible transport plans. Since the set of admissible transport plans contains at least the product measure $\mu\otimes\nu$ , it is non-empty, and existence can be proven thanks to a topological argument that exploits the regularity of the functional and the compactness of the set. Once the existence of a transport plan is established, the other question concerns the existence of an associated transport map. Unfortunately, problem (K) does not directly show the structure required by this reasoning path. However, one can observe that (K) is a linear optimization problem subject to linear constraints. This suggests using certain tools, such as the dual formulation, that was also developed by Kantorovich. With this dual formulation, it is possible to exhibit an interesting structure of the problem, that can be used to answer both questions (existence of a transport plan, and existence of an associated transport map).
一种解决这种类型存在问题的标准方法是在可传递计划的函数和空间中找到一定的规律，即证明函数足够"平滑"并找到一组可接受的紧致传输计划集。由于可接受传输计划集至少包含乘积度量 $\mu\otimes\nu$ ，因此它不为空，并且可以通过利用函数的规则性和集合的紧致性所做的拓扑论证来证明它的存在性。一旦传输计划的存在性确立，另一个问题涉及相关传输映射的存在性。不幸的是，问题(K)并未直接显示出这种推理路径所需的结构。然而，我们可以观察到(K)是一个受线性约束的线性优化问题。这表明可以使用某些工具，例如康托洛维奇也开发的对偶形式。通过这种对偶形式，可以展示出该问题的一个有趣结构，可用来回答两个问题（传输计划的存在性和相关传输映射的存在性）。

4 Kantorovich’s dual problem
4Kantorovich 的对偶问题

Kantorovich’s duality applies to problem (K), in its most general form (with measures). To facilitate understanding, we will consider instead a discrete version of problem (K), where the involved entities are vectors and matrices (instead of measures and operators). This makes it easy to better discern the structure of (K), that is the linear nature of both the functional and the constraints. This also makes it easier for the reader to understand how to construct the dual by manipulating simpler objects (matrices and vectors), however the structure of the reasoning is the same in the general case.
Kantorovich 的对偶性适用于问题 (K)，以其最一般形式 (带有度量)。为了方便理解，我们将考虑问题 (K) 的离散版本，其中涉及的实体是向量和矩阵 (而不是度量和算子)。这样能更容易地辨别问题 (K) 的结构，即泛函和约束的线性性质。这也使读者更容易理解如何通过操作更简单的对象 (矩阵和向量) 构造对偶问题，然而，在一般情况下，推理的结构是相同的。

4.1 The discrete Kantorovich problem
4.1 离散 Kantorovich 问题

In Figure 7, we show a discrete version of the 1D transport between two segments of Figure 6. The measures $\mu$ and $\nu$ are replaced with vectors $U=(u_{i})_{i=1\ldots m}$ and $V=(v_{j})_{i=1\ldots n}$ . The transport plan $\gamma$ becomes a set of coefficients $\gamma_{ij}$ . Each coefficient $\gamma_{ij}$ indicates the quantity of matter that will be transported from $u_{i}$ to $v_{j}$ . The discrete Kantorovich problem can be written as follows:
在图 7 中，我们展示了图 6 中两个区段之间的一维传输的离散版本。测度 $\mu$ 和 $\nu$ 被向量 $U=(u_{i})_{i=1\ldots m}$ 和 $V=(v_{j})_{i=1\ldots n}$ 替换。传输计划 $\gamma$ 成为一组系数 $\gamma_{ij}$ 。每个系数 $\gamma_{ij}$ 表示将从 $u_{i}$ 输送到 $v_{j}$ 的物质数量。离散 Kantorovich 问题可以表示如下：

\quad\quad\min\limits_{\gamma}<C,\gamma>\mbox{ subject to }\left\{\begin{array}[]{l}{\bf P_{x}}\gamma=U\\ {\bf P_{y}}\gamma=V\\ \gamma_{i,j}\geq 0\quad\forall i,j\end{array}\right.

(2)

where $\gamma$ is the vector of $\mathbb{R}^{m\times n}$ with all coefficients $\gamma_{ij}$ (that is, the matrix $\gamma_{ij}$ “unrolled” into a vector), and $C$ the vector of $\mathbb{R}^{m\times n}$ with the coefficients $c_{ij}$ indicating the transport cost between point $i$ and point $j$ (for instance, the Euclidean cost). The objective function is simply the dot product, denoted by $<C,\gamma>$ , of the cost vector $C$ and the vector $\gamma$ . The objective function is linear in $\gamma$ . The constraints on the marginals (1) impose in this discrete version that the sums of the $\gamma_{ij}$ coefficients over the columns correspond to the $u_{i}$ coefficients (Figure 7-B) and the sums over the rows correspond to the $v_{j}$ coefficients (Figure 7-C). Intuitively, everything that leaves point $i$ should correspond to $u_{i}$ , that is the quantity of matter initially present in $i$ in the source landscape, and everything that arrives at a point $j$ should correspond to $v_{j}$ , that is the quantity of matter desired at $j$ in the target landscape. As one can easily notice, in this form, both constraints are linear in $\gamma$ . They can be written with two matrices ${\bf P_{x}}$ and ${\bf P_{y}}$ , of dimensions $m\times mn$ and $n\times mn$ respectively.
其中 $\gamma$ 是具有所有系数 $\gamma_{ij}$ （即矩阵 $\gamma_{ij}$ “展开”为向量）的 $\mathbb{R}^{m\times n}$ 的向量， $C$ 是具有指示点 $i$ 和点 $j$ 之间运输成本 $c_{ij}$ 的 $\mathbb{R}^{m\times n}$ 的向量（例如，欧几里得成本）。目标函数简单地是成本向量 $C$ 和向量 $\gamma$ 的点积，用 $<C,\gamma>$ 表示。目标函数在 $\gamma$ 中是线性的。边际限制（1）在这个离散版本中要求列上系数 $\gamma_{ij}$ 的总和对应于 $u_{i}$ 系数（图 7-B），行上的总和对应于 $v_{j}$ 系数（图 7-C）。直观地看，离开点 $i$ 的一切都应与 $u_{i}$ 相对应，即最初在源景观中存在的物质量，而到达某一点 $j$ 的一切都应与 $v_{j}$ 相对应，即目标景观中 $j$ 处所需的物质量。正如大家可以轻易发现的那样，在这种形式下，这两个约束在 $\gamma$ 中都是线性的。它们可以分别用两个维数分别为 $m\times mn$ 和 $n\times mn$ 的矩阵 ${\bf P_{x}}$ 和 ${\bf P_{y}}$ 来表示。

4.2 Constructing the Kantorovich dual in the discrete setting
在离散设置中构建 Kantorovich 对偶

We introduce, arbitrarily for now, the following function $L$ defined by:
我们介绍，目前随意地，以下由 $L$ 定义的函数。

L(\varphi,\psi)=<C,\gamma>-<\varphi,{\bf P_{x}}\gamma-U>-<\psi,{\bf P_{y}}\gamma-V>

that takes as arguments two vectors, $\varphi$ in ${\mathbb{R}}^{m}$ and $\psi$ in ${\mathbb{R}}^{n}$ . The function $L$ is constructed from the objective function $<C,\gamma>$ from which we subtracted the dot products of $\varphi$ and $\psi$ with the vectors that correspond to the degree of violation of the constraints. One can observe that:
该函数接受两个向量作为参数， $\varphi$ 在 ${\mathbb{R}}^{m}$ 中， $\psi$ 在 ${\mathbb{R}}^{n}$ 中。函数 $L$ 是从目标函数 $<C,\gamma>$ 构建的，我们从中减去了与对应于约束违反程度的向量的点积。可以观察到：

\begin{array}[]{lcl}\sup\limits_{\varphi,\psi}[L(\varphi,\psi)]&=&<C,\gamma>\mbox{ if }{\bf P_{x}}\gamma=U\mbox{ and }{\bf P_{y}}\gamma=V\\ &=&+\infty\mbox{ otherwise.}\end{array}

Indeed, if for instance a component $i$ of ${\bf P_{x}}\gamma$ is non-zero, one can make $L$ arbitrarily large by suitably choosing the associated coefficient $\varphi_{i}$ .
实际上，例如，如果 ${\bf P_{x}}\gamma$ 的一个分量 $i$ 是非零的，则可以通过适当选择相关系数 $\varphi_{i}$ 使 $L$ 随意增大。

Now we consider: 现在我们考虑:

\inf\limits_{\gamma\geq 0}\left[\sup\limits_{\varphi,\psi}[L(\varphi,\psi)]\right]=\inf\limits_{\tiny\begin{array}[]{l}\gamma\geq 0\\ {\bf P_{x}}\gamma=U\\ {\bf P_{y}}\gamma=V\end{array}}\left[<C,\gamma>\right].

There is equality, because to minimize $\sup[L(\varphi,\psi)]$ , $\gamma$ has no other choice than satisfying the constraints (see the previous observation). Thus, we obtain a new expression (left-hand side) of the discrete Kantorovich problem (right-hand side). We now further examine it, and replace $L$ by its expression:
有相等性，因为要使 $\sup[L(\varphi,\psi)]$ 最小化， $\gamma$ 除了满足限制条件之外别无选择（请参见前面的观察）。因此，我们获得了离散 Kantorovich 问题的新表达式（左侧），第 2 项用其表达式替换。

	$\displaystyle\inf\limits_{\gamma\geq 0}\left[\sup\limits_{\varphi,\psi}\left(\begin{array}[]{ll}<C,\gamma>&-<\varphi,{\bf P_{x}}\gamma-U>\\ &-<\psi,{\bf P_{y}}\gamma-V>\end{array}\right)\right]$	(5)
$\displaystyle=$	$\displaystyle\sup\limits_{\varphi,\psi}\left[\inf\limits_{\gamma\geq 0}\left(\begin{array}[]{ll}<C,\gamma>&-<\varphi,{\bf P_{x}}\gamma-U>\\ &-<\psi,{\bf P_{y}}\gamma-V>\end{array}\right)\right]$	(8)
$\displaystyle=$	$\displaystyle\sup\limits_{\varphi,\psi}\left[\inf\limits_{\gamma\geq 0}\left(\begin{array}[]{l}<\gamma,C-{\bf P_{x}}^{t}\varphi-{\bf P_{y}}^{t}\psi>+\\[2.84526pt] <\varphi,U>+<\psi,V>\end{array}\right)\right]$	(11)
$\displaystyle=$	$\displaystyle\sup\limits_{\tiny\begin{array}[]{c}\varphi,\psi\\ {\bf P_{x}}^{t}\varphi+{\bf P_{y}}^{t}\psi\leq C\end{array}}\left[<\varphi,U>+<\psi,V>\right].\quad\quad$	(14)

The first step (8) consists in exchanging the “ $\inf$ ” and “ $\sup$ ”. Then we rearrange the terms (11). By reinterpreting this equation as a constrained optimization problem (similarly to what we did in the previous paragraph), we finally obtain the constrained optimization problem in (14). In the constraint ${\bf P_{x}}^{t}\varphi+{\bf P_{y}}^{t}\psi\leq C$ , the inequality is to be considered componentwise. Finally, the problem (14) can be rewritten as:
第一步（8）是交换“ $\inf$ ”和“ $\sup$ ”。然后我们重新排列项（11）。通过将这个方程重新解释为一个约束优化问题（类似于前一段的做法），我们最终得到了（14）中的约束优化问题。在约束 ${\bf P_{x}}^{t}\varphi+{\bf P_{y}}^{t}\psi\leq C$ 中，不等式应当被按分量考虑。最后，问题（14）可以被重写为：

\begin{array}[]{l}\sup\limits_{\varphi,\psi}\left[<\varphi,U>+<\psi,V>\right]\\[5.69054pt] \mbox{ subject to }\varphi_{i}+\psi_{j}\leq c_{ij},\quad\forall i,j.\end{array}

(15)

As compared to the primal problem (2) that depends on $m\times n$ variables (all the coefficients $\gamma_{ij}$ of the optimal transport plan for all couples of points $(i,j)$ ), this dual problem depends on $m+n$ variables (the components $\varphi_{i}$ and $\psi_{j}$ attached to the source points and target points). We will see later how to further reduce the number of variables, but before then, we go back to the general continuous setting (that is, with functions, measures and operators).
与基本问题（2）相比，该问题依赖于 $m\times n$ 变量（所有与一对点 $(i,j)$ 的最优输运方案的系数 $\gamma_{ij}$ ），而该对偶问题依赖于 $m+n$ 个变量（与源点和目标点相关联的分量 $\varphi_{i}$ 和 $\psi_{j}$ ）。我们稍后将看到如何进一步减少变量的数量，但在那之前，我们回到一般的连续设置（即，使用函数、测度和算子）。

4.3 The Kantorovich dual in the continuous setting
连续设置中的 Kantorovich 对偶

The same reasoning path can be applied to the continuous Kantorovich problem (K), leading to the following problem (DK):
相同的推理路径可以应用于连续 Kantorovich 问题（K），导致以下问题（DK）：

\begin{array}[]{l}(DK)\quad\quad\sup\limits_{\varphi,\psi}\left[\int\limits_{X}\varphi d\mu+\int\limits_{Y}\psi d\nu\right]\\[11.38109pt] \mbox{ subject to: }\\ \varphi(x)+\psi(y)\leq c(x,y)\quad\forall(x,y)\in X\times Y,\end{array}

(16)

where $\varphi$ and $\psi$ are now functions defined on $X$ and $Y$ ⁵⁵5The functions $\varphi$ and $\psi$ need to be taken in $L^{1}(\mu)$ and $L^{1}(\nu)$ . The proof of the equivalence with problem (K) requires more precautions than in the discrete case, in particular step (8) (exchanging sup and inf), that uses a result of convex analysis (due to Rockafellar), see [Vil09] chapter 5.
函数 $\varphi$ 和 $\psi$ 需要在 $L^{1}(\mu)$ 和 $L^{1}(\nu)$ 中考虑。与问题（K）等价的证明需要比离散情况更多的预防措施，特别是步骤（8）（交换 sup 和 inf），该步骤使用了凸分析的结果（来自 Rockafellar），参见 [Vil09] 章节 5。
其中 $\varphi$ 和 $\psi$ 现在是在 $X$ 和 $Y$ 上定义的函数 ⁵ ，.

The classical image that gives an intuitive meaning to this dual problem is to consider that instead of transporting earth by ourselves, we are now hiring a company that will do the work on our behalf. The company has a special way of determining the price: the function $\varphi(x)$ corresponds to what they charge for loading earth at $x$ , and $\psi(y)$ corresponds to what they charge for unloading earth at $y$ . The company aims at maximizing its profit (this is why the dual problem is a “sup” rather than an “inf)”, but it cannot charge more than what it would cost us if we were doing the work by ourselves (hence the constraint).
给这个双边问题带来直观含义的经典形象是，考虑到我们现在不再亲自搬运土地，而是雇佣一家代劳公司来代劳。该公司有一种特殊的确定价格的方法：函数 $\varphi(x)$ 对应他们在 $x$ 处装载土地的收费， $\psi(y)$ 对应他们在 $y$ 处卸载土地的收费。该公司旨在最大化利润（这就是为什么双边问题是“sup”而不是“inf”），但不能收取比我们自己亲自完成工作时的成本更高的费用（因此有了约束）。

The existence of solutions for (DK) remains difficult to study, because the set of functions $\varphi,\psi$ that satisfy the constraint is not compact. However, it is possible to reveal more structure of the problem, by introducing the notion of c-transform, that makes it possible to exhibit a set of admissible functions with sufficient regularity:
对于（DK）的解的存在性仍然很难研究，因为满足约束条件的函数 $\varphi,\psi$ 的集合并不紧致。但是，通过引入 c 变换的概念，可以揭示问题的更多结构，从而展示出一组具有足够规则性的可接受函数：

Definition 1. 定义 1。

•

For $f$ any function on $Y$ with values in $\mathbb{R}\cup\{-\infty\}$ and not identically $-\infty$ , we define its $c$ -transform by

f^{c}(x)=\inf\limits_{y\in Y}\left[c(x,y)-f(y)\right],\quad x\in X.

• 对于

Y

上的任意函数

f

，其取值在

\mathbb{R}\cup\{-\infty\}

中并且不是恒等于

-\infty

，我们定义其

c

-变换为

•

If a function $\varphi$ is such that there exists a function $f$ such that $\varphi=f^{c}$ , then $\varphi$ is said to be $c$ -concave;

• 如果一个函数 $\varphi$ 存在这样一个函数 $f$ ，使得 $\varphi=f^{c}$ ，那么 $\varphi$ 被称为 $c$ -凹函数;
•

${\bf\Psi}_{c}(X)$ denotes the set of c-concave functions on $X$ .

• ${\bf\Psi}_{c}(X)$ 表示在 $X$ 上的 c-凹函数集合。

We now show two properties of (DK) that will allow us to restrict the problem to the class of $c$ -concave functions to search for $\varphi$ and $\psi$ :
我们现在展示(DK)的两个特性，这将允许我们将问题限制在 $c$ -凹函数类中，以便搜索 $\varphi$ 和 $\psi$ ：

Observation 2.

If the pair $(\varphi,\psi)$ is admissible for (DK), then the pair $(\psi^{c},\psi)$ is admissible as well.

观察 2. 如果对于(DK)而言对

(\varphi,\psi)

是可接受的，那么对

(\psi^{c},\psi)

也是可接受的。

Proof. 证明。

\begin{array}[]{l}\left\{\begin{array}[]{l}\forall(x,y)\in X\times Y,\varphi(x)+\psi(y)\leq c(x,y)\\ \psi^{c}(x)=\inf\limits_{y\in Y}\left[c(x,y)-\psi(y)\right]\end{array}\right.\\[8.53581pt] \begin{array}[]{lcl}\psi^{c}(x)+\psi(y)&=&\inf\limits_{y^{\prime}\in Y}\left[c(x,y^{\prime})-\psi(y^{\prime})\right]+\psi(y)\\ &\leq&(c(x,y)-\psi(y))+\psi(y)\\ &\leq&c(x,y).\end{array}\end{array}

∎

Observation 3.

If the pair $(\varphi,\psi)$ is admissible for (DK), then one obtains a better pair by replacing $\varphi$ with $\psi^{c}$ :

注意 3。如果对（DK）的

(\varphi,\psi)

对是可以接受的，那么通过将

\varphi

替换为

\psi^{c}

可以得到更好的对：

Proof. 证明。

\left.\begin{array}[]{lcl}&\psi^{c}(x)=&\inf\limits_{y\in Y}\left[c(x,y)-\psi(y)\right]\\ \forall y\in Y,&\varphi(x)\leq&c(x,y)-\psi(y)\end{array}\right\}\Rightarrow\psi^{c}(x)\leq\varphi(x).

∎

In terms of the previous intuitive image, this means that by replacing $\varphi$ with $\psi^{c}$ , the company can charge more while the price remains acceptable for the client (that is, the constraint is satisfied). Thus, we have:
根据先前的直观形象，这意味着通过用 $\psi^{c}$ 替换 $\varphi$ ，公司可以收取更多费用，同时价格对客户来说仍然可以接受（也就是说，约束得到满足）。因此，我们有：

\begin{array}[]{ll}\inf(K)&=\sup\limits_{\varphi\in{\bf\Psi}_{c}(X)}\quad\int\limits_{X}\varphi\ d\mu+\int\limits_{Y}\varphi^{c}\ d\nu\\[11.38109pt] &=\sup\limits_{\psi\in{\bf\Psi}_{c}(Y)}\quad\int\limits_{X}\psi^{c}\ d\mu+\int\limits_{Y}\psi\ d\nu\end{array}

We do not give here the detailed proof for existence. The reader is referred to [Vil09], Chapter 4. The idea is that we are now in a much better situation, since the set of admissible functions ${\bf\Psi}_{c}(X)$ is compact⁶⁶6Provided that the value of $\psi$ is fixed at a point of $Y$ in order to suppress invariance with respect to adding a constant to $\psi$ .
假定值为 $\psi$ 的固定在 $Y$ 处，以抑制对于添加常数到 $\psi$ 的不变性。
我们这里不提供存在性的详细证明。读者可以参考 [Vil09]，第 4 章。这个想法是，我们现在处于一个好得多的情况，因为可接受函数集 ${\bf\Psi}_{c}(X)$ 是紧致的 ⁶ 。.

The optimal value gives an interesting information, that is the minimum cost of transforming $\mu$ into $\nu$ . This can be used to define a distance between distributions, and also gives a way to compare different distributions, which is of practical interest for some applications.
最优值提供了一个有趣的信息，即将 $\mu$ 转换为 $\nu$ 的最低成本。这可用于定义分布之间的距离，还提供了一种比较不同分布的方法，这在某些应用中具有实际意义。

5 From the Kantorovich dual to the optimal transport map
从 Kantorovich 对偶到最优输运映射

5.1 The $c$ -superdifferential
5.1The $c$ -超微分

Suppose now that in addition to the optimal cost you want to know the associated way to transform $\mu$ into $\nu$ , in other words, when it exists, the map $T$ from $X$ to $Y$ which associated transport plan $(Id\times T)\sharp\mu$ minimizes the functional of the Monge problem. A result characterizes the support of $\gamma$ , that is the subset $\partial^{c}\varphi\subset X\times Y$ of the pairs of points $(x,y)$ connected by the transport plan:
现在假设除了最优成本外，您还想了解将 $\mu$ 转化为 $\nu$ 的关联方式，换句话说，当存在时，从 $X$ 到 $Y$ 的映射 $T$ 关联的运输方案 $(Id\times T)\sharp\mu$ 最小化蒙日问题的泛函。结果描述了 $\gamma$ 的支持，即由运输方案连接的点对 $(x,y)$ 的子集 $\partial^{c}\varphi\subset X\times Y$ ：

Theorem 1.

Let $\varphi$ a $c$ -concave function. For all $(x,y)\in\partial^{c}\varphi$ , we have

\nabla\varphi(x)-\nabla_{x}c(x,y)=0,

where $\partial^{c}\varphi=\{(x,y)\ |\ \varphi(z)\leq\varphi(x)+(c(z,y)-c(x,y)),\forall z\in X\}$ ⁷⁷7By definition of the $c$ -transform, if $(x,y)\in\partial^{c}\varphi$ , then $\varphi^{c}(y)=c(x,y)-\varphi(x)$ . Then, the $c$ -superdifferential can be characterized by the set of all points $(x,y)\in X\times Y$ such that $\varphi(x)+\varphi^{c}(y)=c(x,y)$
根据 $c$ -变换的定义，如果 $(x,y)\in\partial^{c}\varphi$ ，则 $\varphi^{c}(y)=c(x,y)-\varphi(x)$ 。然后， $c$ -超微分可以由所有满足条件 $\varphi(x)+\varphi^{c}(y)=c(x,y)$ 的点 $(x,y)\in X\times Y$ 来刻画. 其中 $\partial^{c}\varphi=\{(x,y)\ |\ \varphi(z)\leq\varphi(x)+(c(z,y)-c(x,y)),\forall z\in X\}$ ⁷ denotes the so-called $c$ -superdifferential of $\varphi$ .
表示所谓的 $c$ -超微分 $\varphi$ 。

定理 1. 设

\varphi

为

c

-凹函数。对于所有

(x,y)\in\partial^{c}\varphi

，我们有

Proof.

See [Vil09] chapters 9 and 10. ∎

证明。见[Vil09]第 9 和第 10 章。∎

In order to give an idea of the relation between the $c$ -superdifferential and the associated transport map $T$ , we present below a heuristic argument: consider a point $(x,y)$ in the $c$ -superdifferential $\partial^{c}\varphi$ , then for all $z\in X$ we have
为了让大家了解 $c$ -超微分和相关的传输映射 $T$ 之间的关系，我们以下给出一个启发式论点：考虑 $c$ -超微分 $\partial^{c}\varphi$ 中的点 $(x,y)$ ，那么对于所有 $z\in X$ ，我们有

c(x,y)-\varphi(x)\leq c(z,y)-\varphi(z).

(17)

Now, by using (17), we can compute the derivative at $x$ with respect to an arbitrary direction $w$
现在，通过使用（17），我们可以计算关于任意方向 $w$ 在 $x$ 处的导数

\lim_{t\to 0^{+}}\frac{\varphi(x+tw)-\varphi(x)}{t}\leq\lim_{t\to 0^{+}}\frac{c(x+tw,y)-c(x,y)}{t}

and we obtain $\nabla\varphi(x)\cdot w\leq\nabla_{x}c(x,y)\cdot w$ . We can do the same derivation along direction $-w$ instead of $w$ , and then we get $\nabla\varphi(x)\cdot w=\nabla_{x}c(x,y)\cdot w$ , $\forall w\in X$ .
然后我们得到 $\nabla\varphi(x)\cdot w\leq\nabla_{x}c(x,y)\cdot w$ 。我们可以沿着方向 $-w$ 而不是 $w$ 进行相同的推导，然后我们得到 $\nabla\varphi(x)\cdot w=\nabla_{x}c(x,y)\cdot w$ ， $\forall w\in X$ 。

In the particular case of the $L_{2}$ cost, that is with $c(x,y)=1/2\|x-y\|^{2}$ , this relation becomes $\forall(x,y)\in\partial^{c}\varphi,\nabla\varphi(x)+y-x=0$ , thus, when the optimal transport map $T$ exists, it is given by
在特定情况下的 $L_{2}$ 成本，即与 $c(x,y)=1/2\|x-y\|^{2}$ ，这个关系变成 $\forall(x,y)\in\partial^{c}\varphi,\nabla\varphi(x)+y-x=0$ ，因此，当最优输运映射 $T$ 存在时，它是由下式给出的

T(x)=x-\nabla\varphi(x)=\nabla(\|x\|^{2}/2-\varphi(x)).

Not only this gives an expression of $T$ in function of $\varphi$ , which is of high interest to us if we want to compute the transport explicitly. In addition, this makes it possible to characterize $T$ as the gradient of a convex function (see also Brenier’s polar factorization theorem [Bre91]). This convexity property is interesting, because it means that two “transported particles” $x_{1}\mapsto T(x_{1})$ et $x_{2}\mapsto T(x_{2})$ will never collide. We now see how to prove these two assertions ( $T$ gradient of a convex function and absence of collision) in the case of the $L_{2}$ transport (with $(c(x,y)=1/2\|x-y\|^{2}$ ).
不仅这给出了 $T$ 关于 $\varphi$ 的表达式，如果我们想明确计算输运，这对我们很有帮助。此外，这使得可以将 $T$ 表征为凸函数的梯度（也见 Brenier 的极分解定理[Bre91]）。这种凸性质是有趣的，因为它意味着两个“输运粒子” $x_{1}\mapsto T(x_{1})$ 和 $x_{2}\mapsto T(x_{2})$ 永远不会相撞。现在我们看到如何证明这两个断言（凸函数的梯度和无碰撞）在 $L_{2}$ 输运的情况下（与 $(c(x,y)=1/2\|x-y\|^{2}$ ）。

Observation 4.

If $c(x,y)$ = $1/2\|x-y\|^{2}$ and $\varphi\in{\bf\Psi}_{c}(X)$ , then $\bar{\varphi}:x\mapsto\bar{\varphi}(x)=\|x\|^{2}/2-\varphi(x)$ is a convex function (it is an equivalence if $X=Y=\mathbb{R}^{d}$ , see [San15]).

观察 4。如果

c(x,y)

1/2\|x-y\|^{2}

和

\varphi\in{\bf\Psi}_{c}(X)

，那么

\bar{\varphi}:x\mapsto\bar{\varphi}(x)=\|x\|^{2}/2-\varphi(x)

是凸函数（如果

X=Y=\mathbb{R}^{d}

，则它是等价的，参见[San15]）。

Proof. 证明。

\begin{split}\varphi(x)=&\psi^{c}(x)\\ =&\inf\limits_{y}\left[\frac{\|x-y\|^{2}}{2}-\psi(y)\right]\\ =&\inf\limits_{y}\left[\frac{\|x\|^{2}}{2}-x\cdot y+\frac{\|y\|^{2}}{2}-\psi(y)\right].\end{split}

Then, 然后，

\begin{split}-\bar{\varphi}(x)=&\varphi(x)-\frac{\|x\|^{2}}{2}=\inf\limits_{y}\left[-x\cdot y+\left(\frac{\|y\|^{2}}{2}-\psi(y)\right)\right].\end{split}

Or equivalently, 或者等价的，

\begin{split}\bar{\varphi}(x)=&\sup\limits_{y}\left[x\cdot y-\left(\frac{\|y\|^{2}}{2}-\psi(y)\right)\right].\end{split}

The function $x\mapsto x\cdot y-\left(\frac{\|y\|^{2}}{2}-\psi(y)\right)$ is affine in $x$ , therefore the graph of $\bar{\varphi}$ is the upper envelope of a family of hyperplanes, therefore $\bar{\varphi}$ is a convex function (see Figure 8). ∎
函数 $x\mapsto x\cdot y-\left(\frac{\|y\|^{2}}{2}-\psi(y)\right)$ 在 $x$ 中是仿射的，因此 $\bar{\varphi}$ 的图形是一族超平面的上包络，因此 $\bar{\varphi}$ 是一个凸函数（见图 8）。∎

Observation 5.

We now consider the trajectories of two particles parameterized by $t\in[0,1]$ , $t\mapsto(1-t)x_{1}+tT(x_{1})$ and $t\mapsto(1-t)x_{2}+tT(x_{2})$ . If $x_{1}\neq x_{2}$ and $0<t<1$ then there is no collision between the two particles.

观察 5.我们现在考虑两个由

t\in[0,1]

，

t\mapsto(1-t)x_{1}+tT(x_{1})

和

t\mapsto(1-t)x_{2}+tT(x_{2})

参数化的粒子的轨迹。如果

x_{1}\neq x_{2}

和

0<t<1

，那么两个粒子之间就不会发生碰撞。

Proof.

By contradiction, suppose there is a collision, that is there exists $t\in(0,1)$ and $x_{1}\neq x_{2}$ such that

(1-t)x_{1}+tT(x_{1})=(1-t)x_{2}+tT(x_{2}).

Since $T=\nabla\bar{\varphi}$ , we can rewrite the last equality as
自 $T=\nabla\bar{\varphi}$ ，我们可以重写最后的等式为

(1-t)(x_{1}-x_{2})+t(\nabla\bar{\varphi}(x_{1})-\nabla\bar{\varphi}(x_{2}))=0.

Therefore, 所以，

(1-t)\|x_{1}-x_{2}\|^{2}+t(\nabla\bar{\varphi}(x_{1})-\nabla\bar{\varphi}(x_{2}))\cdot(x_{1}-x_{2})=0.

The last step leads to a contradiction, between the left-hand side is the sum of two strictly positive numbers (recalling the definition of the convexity of $\bar{\varphi}$ : $\forall x_{1}\neq x_{2},(x_{1}-x_{2})\cdot(\nabla\bar{\varphi}(x_{1})-\nabla\bar{\varphi}(x_{2}))>0$ )⁸⁸8Note that even if there is no collision, the trajectories can cross, that is $(1-t)x_{1}+tT(x_{1})=(1-t^{\prime})x_{2}+t^{\prime}T(x_{2})$ for some $t\neq t^{\prime}$ (see example in [Vil09]). If the cost is the Euclidean distance (instead of squared Euclidean distance), the non-intersection property is stronger and trajectories cannot cross. This comes at the expense of losing the uniqueness of the optimal transport plan[Vil09].
请注意，即使没有碰撞，轨迹也可以相交，即对于某些 $t\neq t^{\prime}$ （请参见[ Vil09]中的示例）。如果成本是欧氏距离（而不是平方欧氏距离），非交叉属性更强，轨迹不能相交。这是以失去最优运输方案的唯一性为代价[ Vil09]。
最后一步导致矛盾，左侧是两个严格正数的和（回想凸性的定义： $\bar{\varphi}$ ： $\forall x_{1}\neq x_{2},(x_{1}-x_{2})\cdot(\nabla\bar{\varphi}(x_{1})-\nabla\bar{\varphi}(x_{2}))>0$ ） ⁸ ∎

证明。通过反证法，假设存在一次碰撞，即存在

t\in(0,1)

和

x_{1}\neq x_{2}

，使得

6 Continuous, discrete and semi-discrete transport
连续、离散和半离散传输 6

The properties that we have presented in the previous sections are true for any couple of source and target measures $\mu$ and $\nu$ , that can derive from continuous functions or that can be discrete empirical measures (sum of Dirac masses). Figure 9 presents three configurations that are interesting to study. These configurations have specific properties, that lead to different algorithms for computing the transport. We give here some indications and references concerning the continuous $\rightarrow$ continuous and discrete $\rightarrow$ discrete cases. Then we will develop the continuous $\rightarrow$ discrete case with more details in the next section.
我们在前几节中提到的性质对于任意一对源和目标度量 $\mu$ 和 $\nu$ 都成立，可以由连续函数导出，也可以是离散经验度量（Dirac 质量之和）。图 9 呈现了三个有趣研究的配置。这些配置具有特定属性，导致了不同的计算传输算法。我们在此提供一些关于连续 $\rightarrow$ 连续和离散 $\rightarrow$ 离散情况的指示和参考文献。然后我们将在下一节更详细地讨论连续 $\rightarrow$ 离散情况。

6.1 The continuous $\rightarrow$ continuous case and Monge-Ampère equation
6.1 连续 $\rightarrow$ 连续情况和 Monge-Ampère 方程

We recall that when the optimal transport map exists, in the case of the $L_{2}$ cost (that is, $c(x,y)=1/2\|x-y\|^{2}$ ), it can be deduced from the function $\varphi$ by using the relation $T(x)=\nabla\bar{\varphi}=x-\nabla\varphi$ . The change of variable formula for integration over a subset $B$ of $X$ can be written as:
我们回顾一下，当最优输运映射存在时，在 $L_{2}$ 成本的情况下（即 $c(x,y)=1/2\|x-y\|^{2}$ ）。可以通过使用关系 $T(x)=\nabla\bar{\varphi}=x-\nabla\varphi$ 从函数 $\varphi$ 推导出来 $L_{2}$ 。对于 $X$ 的子集 $B$ 的积分的变量变换公式可以写成：

\forall B\subset X,\int_{B}1d\mu=\mu(B)=\nu(T(B))=\int_{B}\left|\det J_{T}(x)\right|d\mu

(18)

where $J_{T}$ denotes the Jacobian matrix of $T$ and $\det$ denotes the determinant.
这里 $J_{T}$ 表示 $T$ 的雅可比矩阵， $\det$ 表示行列式。

If $\mu$ and $\nu$ have densities $u$ and $v$ respectively, that is $\forall B,\mu(B)=\int_{B}u(x)dx$ and $\nu(B)=\int_{B}v(x)dx$ , then one can (formally) consider (18) pointwise in $X$ :
如果 $\mu$ 和 $\nu$ 的密度分别为 $u$ 和 $v$ ，即为 $\forall B,\mu(B)=\int_{B}u(x)dx$ 和 $\nu(B)=\int_{B}v(x)dx$ ，那么可以在 $X$ 中（正式地）逐点考虑（18）：

\forall x\in X,\ u(x)=\left|\det J_{T}(x)\right|v(T(x)).

(19)

By injecting $T=\nabla\bar{\varphi}$ and $J_{T}=H\bar{\varphi}$ into (19), one obtains:
通过将 $T=\nabla\bar{\varphi}$ 和 $J_{T}=H\bar{\varphi}$ 代入（19），可以得到：

\forall x\in X,\ u(x)=\left|\det H\bar{\varphi}(x)\right|v(\nabla\bar{\varphi}(x)),

(20)

where $H\bar{\varphi}$ denotes the Hessian matrix of $\bar{\varphi}$ . Equation (20) is known ad the Monge-Ampère equation. It is a highly non-linear equation, and its solutions when they exist often present singularities⁹⁹9 This is similar to the eikonal equation, which solution corresponds to the distance field, that has a singularity on the medial axis.
这类似于等距方程，其解对应于距离场，在中轴线上存在奇点。
其中 $H\bar{\varphi}$ 表示 $\bar{\varphi}$ 的 Hessian 矩阵。方程（20）被称为蒙热-安普尔方程。这是一个高度非线性的方程，当存在解时，通常会呈现奇点 ⁹ . Note that the derivation above is purely formal, and that studying the solutions of the Monge-Ampère equation require using more sophisticated tools. In particular, it is possible to define several types of weak solutions (viscosity solutions, solution in the sense of Brenier, solutions in the sense of Alexandrov …). Several algorithms to compute numerical solutions of the Monge-Ampère equations were proposed. As such, see for instance the Benamou-Brenier algorithm [BB00], that uses a dynamic formulation inspired by fluid dynamics (incompressible Euler equation with specific boundary conditions). See also [PPO14].
。请注意，上述推导纯粹是形式化的，研究蒙热-安普尔方程的解需要使用更复杂的工具。特别是，可以定义几种类型的弱解（粘度解，Brenier 意义上的解，Alexandrov 意义上的解…）。提出了几种计算蒙热-安普尔方程数值解的算法。例如，参见 Benamou-Brenier 算法[BB00]，它使用受流体动力学启发的动态形式（具有特定边界条件的不可压缩欧拉方程）。另请参阅[PPO14]。

6.2 The discrete $\rightarrow$ discrete case
6.2 离散 $\rightarrow$ 离散案例

If $\mu$ is the sum of $m$ Dirac masses and $\nu$ the sum of $n$ Dirac masses, then the problem boils down to finding the $m\times n$ coefficients $\gamma_{ij}$ that give for each pair of points $i$ of the source space and $j$ of the target space the quantity of matter transported from $i$ to $j$ . This corresponds to the transport plan in the discrete Kantorovich problem that we have seen previously §4. This type of problem (referred to as an assignment problem) can be solved by different methods of linear programming [BDM09]. These method can be dramatically accelerated by adding a regularization term, that can be interpreted as the entropy of the transport plan [Leo13]. This regularized version of optimal transport can be solved by highly efficient numerical algorithms [Cut13].
如果 $\mu$ 是 $m$ 个 Dirac 质点的总和， $\nu$ 是 $n$ 个 Dirac 质点的总和，则问题归结为找出 $m\times n$ 系数 $\gamma_{ij}$ ，使得源空间 $i$ 和目标空间 $j$ 每对点之间从 $i$ 传输到 $j$ 的物质量。这对应于前面我们看到的离散 Kantorovich 问题中的传输计划第 4 节。这种问题（称为分配问题）可以通过不同的线性规划方法来解决[BDM09]。通过添加一个正则化项，可以显著加速这些方法，该正则化项可解释为传输计划的熵[Leo13]。这种正则化版本的最优传输可以通过高效的数值算法来解决[Cut13]。

6.3 The continuous $\rightarrow$ discrete case
连续 $\rightarrow$ 离散情况 6.3

This configuration, also called semi-discrete, corresponds to a continuous function transported to a sum of Dirac masses (see the examples of c-concave functions in [GM96]). This correspond to our example with factories that consume a resource, in §2.2. Semi-discrete transport has interesting connections with some notions of computational geometry and some specific sets of convex polyhedra that were studied by Alexandrov [Ale05] and later by Aurenhammer, Hoffman and Aranov [AHA92]. The next section is devoted to this configuration.
这种配置，也被称为半离散，对应于将连续函数转换为一组 Dirac 质量的总和（参见[GM96]中的 c-凹函数示例）。这对应于我们在第 2.2 节中消耗资源的工厂示例。半离散传输与计算几何的某些概念和一些特定的凸多面体集合有有趣的联系，这些概念曾由亚历山德罗夫[Ale05]以及后来由 Aurenhammer，Hoffman 和 Aranov[AHA92]进行研究。接下来的部分将致力于这种配置。

7 Semi-discrete transport 7 半离散输运

We now suppose that the source measure $\mu$ is continuous, and that the target measure $\nu$ is a sum of Dirac masses. A practical example of this type of configuration corresponds to a resource which available quantity is represented by a function $u$ . The resource is collected by a set of $n$ factories, as shown in Figure 10. Each factory is supposed to collect a certain prescribed quantity of resource $\nu_{j}$ . Clearly, the sum of all prescriptions corresponds to the total quantity of available resource ( $\sum_{j=1}^{n}\nu_{j}=\int_{X}u(x)dx$ ).
我们现在假设源度量 $\mu$ 为连续的，目标度量 $\nu$ 是 Dirac 质量的总和。这种配置的一个实际例子对应于一种资源，其可用数量由函数 $u$ 表示。该资源由一组 $n$ 工厂收集，如图 10 所示。每个工厂都应该收集一定数量的资源 $\nu_{j}$ 。显然，所有预定数量的总和对应于可用资源的总数量 ( $\sum_{j=1}^{n}\nu_{j}=\int_{X}u(x)dx$ )。

7.1 Semi-discrete Monge problem
7.1 半离散 Monge 问题

In this specific configuration, the Monge problem becomes:
在这种特定配置中，Monge 问题变为:

\begin{array}[]{l}\inf\limits_{T:X\rightarrow Y}\int\limits_{X}c(x,T(x))u(x)dx,\mbox{ subject to}\int\limits_{T^{-1}(y_{j})}u(x)dx=\nu_{j},\forall j.\end{array}

A transport map $T$ associates with each point $x$ of $X$ one of the points $y_{j}$ . Thus, it is possible to partition $X$ , by associating to each $y_{j}$ the region $T^{-1}(y_{j})$ that contains all the points transported towards $y_{j}$ by $T$ . The constraint imposes that the quantity of collected resource over each region $T^{-1}(y_{j})$ corresponds to the prescribed quantity $\nu_{j}$ .
一个运输映射 $T$ 与 $X$ 上的每个点 $x$ 关联，关联到所有由 $T$ 运送到 $y_{j}$ 的点 $y_{j}$ 的一个点。因此，通过将每个 $y_{j}$ 关联到包含所有通过 $T$ 运送到 $y_{j}$ 的点的区域 $T^{-1}(y_{j})$ ，可以对 $X$ 进行划分。约束要求在每个区域 $T^{-1}(y_{j})$ 收集的资源量与规定的数量 $\nu_{j}$ 相对应。

Let us now examine the form of the dual Kantorovich problem. In terms of measure, the source measure $\mu$ has a density $u$ , and the target measure $\nu$ is a sum of Dirac masses $\nu=\sum_{j=1}^{n}\nu_{j}\delta_{y_{j}}$ , supported by the set of points $Y=\left\{y_{j}\right\}$ . We recall that in its general form, the dual Kantorovich problem is written as follows:
现在让我们来检查对偶 Kantorovich 问题的形式。就测度而言，源测度 $\mu$ 具有密度 $u$ ，目标测度 $\nu$ 是由支持在点集 $Y=\left\{y_{j}\right\}$ 上的 Dirac 质量 $\nu=\sum_{j=1}^{n}\nu_{j}\delta_{y_{j}}$ 之和。我们想起，对偶 Kantorovich 问题的一般形式如下所示：

\sup\limits_{\psi\in{\bf\Psi^{c}}(Y)}\left[\int_{X}\psi^{c}(x)d\mu+\int_{Y}\psi(y)d\nu\right].

(21)

In our semi-discrete case, the functional becomes a function of $n$ variables, with the following form:
在我们的半离散情况中，该函数成为 $n$ 个变量的函数，形式如下：

$\displaystyle F(\psi)$	$\displaystyle=$	$\displaystyle\!F(\psi_{1},\psi_{2},\ldots\psi_{n})$	(22)
	$\displaystyle=$	$\displaystyle\!\int\limits_{X}\!\psi^{c}(x)u(x)dx+\sum_{j=1}^{n}\psi_{j}\nu_{j}$	(23)
	$\displaystyle=$	$\displaystyle\!\int\limits_{X}\!\inf\limits_{y_{j}\in Y}\left[c(x,y_{j})-\psi_{j}\right]u(x)dx+\sum_{j=1}^{n}\psi_{j}\nu_{j}$	(24)
	$\displaystyle=$	$\displaystyle\!\sum\limits_{j=1}^{n}\int\limits_{\mathrm{L}ag_{\psi}^{c}(y_{j})}\!\left(c(x,y_{j})-\psi_{j}\right)u(x)dx+\sum_{j=1}^{n}\psi_{j}\nu_{j}.$	(25)

The first step (23) takes into account the nature of the measures $\mu$ and $\nu$ . In particular, one can notice that the measure $\nu$ is completely defined by the scalars $\nu_{j}$ associated with the points $y_{j}$ , and the function $\psi$ is defined by the scalars $\psi_{j}$ that correspond to its value at each point $y_{j}$ . The integral $\int_{Y}\psi(y)d\nu$ becomes the dot product $\sum_{j}\psi_{j}\nu_{j}$ . Thus, the functional that corresponds to the dual Kantorovich problem becomes a function $F$ that depends on $n$ variables (the $\psi_{j}$ ). Let us now replace the c-conjugate $\psi^{c}$ with its expression, which gives (24). The integral in the left term can be reorganized, by grouping the points of $X$ for which the same point $y_{j}$ minimizes $c(x,y_{j})-\psi_{j}$ , which gives (25), where the Laguerre cell $\mathrm{L}ag_{\psi}^{c}(y_{j})$ is defined by:
第一步（23）考虑了测量 $\mu$ 和 $\nu$ 的性质。特别是，可以注意到测量 $\nu$ 完全由与点 $y_{j}$ 相关联的标量 $\nu_{j}$ 定义，函数 $\psi$ 由对应于每个点 $y_{j}$ 的标量 $\psi_{j}$ 定义。积分 $\int_{Y}\psi(y)d\nu$ 变成点积 $\sum_{j}\psi_{j}\nu_{j}$ 。因此，对偶 Kantorovich 问题对应的函数变成了一个依赖 $n$ 个变量（ $\psi_{j}$ ）的函数 $F$ 。现在让我们用它的表达式替换 c 共轭 $\psi^{c}$ ，得到（24）。左项中的积分可以重新组织，通过将使得相同点 $y_{j}$ 最小化 $c(x,y_{j})-\psi_{j}$ 的 $X$ 的点进行分组，得到（25），其中 Laguerre 单元 $\mathrm{L}ag_{\psi}^{c}(y_{j})$ 定义为：

\mathrm{L}ag_{\psi}^{c}(y_{j})=\left\{x\in X\quad|\quad c(x,y_{j})-\psi_{j}\leq c(x,y_{k})-\psi_{k},\ \forall k\neq j\right\}.

The Laguerre diagram, formed by the union of the Laguerre cells, is a classical structure in computational geometry. In the case of the $L_{2}$ cost $c(x,y)=1/2\|x-y\|^{2}$ , it corresponds to the power diagram, that was studied by Aurenhammer at the end of the 80’s [Aur87]. One of its particularities is that the boundaries of the cells are rectilinear, making it reasonably easy to design computational algorithms to construct them.
拉盖尔图是由拉盖尔单元的联合形成的，是计算几何学中的经典结构。在 $L_{2}$ 成本 $c(x,y)=1/2\|x-y\|^{2}$ 的情况下，它对应于能量图，这是在 80 年代末由 Aurenhammer 研究的[ Aur87]。其特点之一是单元的边界是直线的，这使得设计计算算法来构建它们相对容易。

7.2 Concavity of $F$ 7.2 $F$ 的凹性

The objective function $F$ is a particular case of the Kantorovich dual, and naturally inherits its properties, such as its concavity (that we did not discuss yet). This property is interesting both from a theoretical point of view, to study the existence and uniqueness of solutions, and from a practical point of view, to design efficient numerical solution mechanisms. In the semi-discrete case, the concavity of $F$ is easier to prove than in the general case. We summarize here the proof by Aurenhammer et. al [AHA92], that leads to an efficient algorithm [Mér11], [Lév15], [KMT16].
目标函数 $F$ 是康托罗维奇对偶问题的一个特例，并自然地继承了其性质，比如它的凹性（我们还没有讨论）。这一性质既有理论意义，用于研究解的存在性和唯一性，也有实际意义，用于设计高效的数值解算机制。在半离散情况下， $F$ 的凹性比一般情况容易证明。我们在这里总结 Aurenhammer 等人提出的证明，这导致了一个高效的算法。

Theorem 2.

The objective function $F$ of the semi-discrete Kantorovich dual problem (21) is concave.

定理 2。半离散康托罗维奇对偶问题（21）的目标函数

F

是凹的。

Proof.

Consider the function $G$ defined by:

G(A,\left[\psi_{1},\ldots\psi_{n}\right])=\int\limits_{X}\left(c(x,y_{A(x)})-\psi_{A(x)}\right)u(x)dx,

(26)

and parameterized by an assignment $A~{}:X\rightarrow[1\ldots n]$ , that is, a function that associates with each point $x$ of $X$ the index $j$ of one of the points $y_{j}$ . If we denote $A^{-1}(j)=\left\{x|A(x)=j\right\}$ , then $G$ can be also written as:
并由赋值 $A~{}:X\rightarrow[1\ldots n]$ 参数化，即，将每个点 $x$ 对应到 $X$ 中某个点 $y_{j}$ 的索引 $j$ 的函数。如果我们表示为 $A^{-1}(j)=\left\{x|A(x)=j\right\}$ ，那么 $G$ 也可写成：

\begin{array}[]{lcl}G(A,\psi)&=&\sum\limits_{j}\int\limits_{A^{-1}(j)}\left(c(x,y_{j})-\psi_{j}\right)u(x)dx\\[5.69054pt] &=&\sum\limits_{j}\int\limits_{A^{-1}(j)}c(x,y_{j})u(x)dx-\sum\limits_{j}\psi_{j}\int\limits_{A^{-1}(j)}u(x)dx.\end{array}

(27)

The first term does not depend on the $\psi_{j}$ , and the second one is a linear combination of the $\psi_{j}$ coefficients, thus, for a given fixed assignment $A$ , $\psi\mapsto G(A,\psi)$ is an affine function of $\psi$ . Figure 11 depicts the appearance of the graph of $G$ for different assignment. The horizontal axis symbolizes the components of the vector $\psi$ (of dimension $n$ ) and the vertical axis the value of $G(A,\psi)$ . For a given assignment $A$ , the graph of $G$ is an hyperplane (symbolized here by a straight line).
第一项不依赖于 $\psi_{j}$ ，第二项是 $\psi_{j}$ 系数的线性组合，因此，对于给定的固定赋值 $A$ ， $\psi\mapsto G(A,\psi)$ 是 $\psi$ 的仿射函数。图 11 描述了 $G$ 的图形在不同赋值下的外观。水平轴表示向量 $\psi$ （维度为 $n$ ）的分量，垂直轴表示 $G(A,\psi)$ 的值。对于给定赋值 $A$ ， $G$ 的图形是一个超平面（这里用直线表示）。

证明。考虑由

G

定义的函数：

Among all the possible assignments $A$ , we distinguish $A^{\psi}$ that associates with a point $x$ the index $j$ of the Laguerre cell $x$ belongs to ¹⁰¹⁰10 $A$ is undefined on the set of Laguerre cell boundaries, this does not cause any difficulty because this set has zero measure.
$A$ 在拉盖尔单元边界集上未定义，这不会造成任何困难，因为该集合具有零测度。
在所有可能的赋值 $A$ 中，我们将 $A^{\psi}$ 与点 $x$ 相关联，该点为 $x$ 的拉盖尔单元 $j$ 的索引所属于 ¹⁰ , that is: ，即：

A^{\psi}(x)=\arg\min\limits_{j}\left[c(x,y_{j})-\psi_{j}\right].

For a fixed vector $\psi=\psi^{0}$ , among all the possible assignments $A$ , the assignment $A^{\psi^{0}}$ minimizes the value $G(A,\psi^{0})$ , because it minimizes the integrand pointwise (see Figure 11 on the left. Thus, since $G$ is affine with respect to $\psi$ , the graph of the function $\psi\rightarrow G(A^{\psi},\psi)$ is the lower envelope of a family of hyperplanes (symbolized as straight lines in Figure 11 on the right), hence $\psi\rightarrow G(A^{\psi},\psi)$ is a concave function. Finally, the objective function $F$ of the dual Kantorovich problem can be written as $F(\psi)=G(A^{\psi},\psi)+\sum_{j}\nu_{j}\psi_{j}$ , that is, the sum of a concave function and a linear function, hence it is also a concave function. ∎
对于一个固定的向量 $\psi=\psi^{0}$ ，在所有可能的分配 $A$ 中，分配 $A^{\psi^{0}}$ 使值 $G(A,\psi^{0})$ 最小化，因为它逐点最小化被积函数（见图 11 左侧）。因此，由于 $G$ 对 $\psi$ 是仿射的，函数 $\psi\rightarrow G(A^{\psi},\psi)$ 的图是一族超平面的下包络线（在图 11 右侧用直线符号表示），因此 $\psi\rightarrow G(A^{\psi},\psi)$ 是凹函数。最后，对偶 Kantorovich 问题的目标函数 $F$ 可以写成 $F(\psi)=G(A^{\psi},\psi)+\sum_{j}\nu_{j}\psi_{j}$ ，即凹函数与线性函数之和，所以它也是凹函数。 ∎

7.3 The semi-discrete optimal transport map
7.3 半离散最优输运映射

Let us now examine the $c$ -superdifferential $\partial^{c}\psi$ , that is, the set of points ( $x\in X$ , $y\in Y$ ) connected by the optimal transport map. We recall that the $c$ -superdifferential $\partial^{c}\psi$ can be defined alternatively as $\partial^{c}\psi=\left\{(x,y_{j})\ |\ \psi^{c}(x)+\psi_{j}=c(x,y_{j})\right\}$ . Consider a point $x$ of $X$ that belongs to the Laguerre cell $\mathrm{L}ag_{\psi}^{c}(y_{j})$ . The $c$ -superdifferential $[\partial^{c}\psi](x)$ at $x$ is defined by:
让我们现在来检验 $c$ -超微分 $\partial^{c}\psi$ ，即由最优输运映射连接的点（ $x\in X$ ， $y\in Y$ ）的集合。我们回忆起 $c$ -超微分 $\partial^{c}\psi$ 可以另行定义为 $\partial^{c}\psi=\left\{(x,y_{j})\ |\ \psi^{c}(x)+\psi_{j}=c(x,y_{j})\right\}$ 。考虑一个属于拉盖尔单元 $\mathrm{L}ag_{\psi}^{c}(y_{j})$ 的 $X$ 点 $x$ 。在 $x$ 处的 $c$ -超微分 $[\partial^{c}\psi](x)$ 定义为：

$\displaystyle[\partial^{c}\psi](x)$	$\displaystyle=$	$\displaystyle\left\{y_{k}\ \|\ \psi^{c}(x)+\psi_{k}=c(x,y_{k})\right\}$	(28)
	$\displaystyle=$	$\displaystyle\left\{y_{k}\ \|\ \inf_{y_{l}}\left[c(x,y_{l})-\psi_{l}\right]+\psi_{k}=c(x,y_{k})\right\}$	(29)
	$\displaystyle=$	$\displaystyle\left\{y_{k}\ \|\ c(x,y_{j})-\psi_{j}+\psi_{k}=c(x,y_{k})\right\}$	(30)
	$\displaystyle=$	$\displaystyle\left\{y_{k}\ \|\ c(x,y_{j})-\psi_{j}=c(x,y_{k})-\psi_{k}\right\}$	(31)
	$\displaystyle=$	$\displaystyle\left\{y_{j}\right\}.$	(32)

In the first step (29), we replace $\psi^{c}$ by its definition, then we use the fact that $x$ belongs to the Laguerre cell of $y_{j}$ (30), and finally, the only point of $Y$ that satisfies (31) is $y_{j}$ because we have supposed $x$ inside the Laguerre cell of $y_{j}$ .
在第一步（29），我们用 $\psi^{c}$ 的定义替换它，然后我们利用 $x$ 属于 $y_{j}$ 的 Laguerre 单元（30），最后，满足（31）的唯一 $Y$ 点是 $y_{j}$ ，因为我们假设 $x$ 在 $y_{j}$ 的 Laguerre 单元内。

To summarize, the optimal transport map $T$ moves each point $x$ to the point $y_{j}$ associated with the Laguerre cell $\mathrm{L}ag_{\psi}^{c}(y_{j})$ that contains $x$ . The vector $\psi_{1},\ldots\psi_{n}$ is the unique vector that maximizes the discrete dual Kantorovich function $F$ such that $\psi$ is c-concave. It is possible to show that $\psi$ is c-concave if and only if no Laguerre cell is empty of matter, that is the integral of $u$ is non-zero on each Laguerre cell. Indeed, we have the following results:
综上所述，最佳传输映射 $T$ 将每个点 $x$ 移动到与包含 $x$ 的 Laguerre 单元 $\mathrm{L}ag_{\psi}^{c}(y_{j})$ 相关联的点 $y_{j}$ 。向量 $\psi_{1},\ldots\psi_{n}$ 是唯一的向量，它最大化离散对偶 Kantorovich 函数 $F$ ，使得 $\psi$ 是 c-凹的。可以证明，当且仅当没有 Laguerre 单元缺少物质时，即每个 Laguerre 单元上 $u$ 的积分非零时， $\psi$ 是 c-凹的。事实上，我们有以下结果：

Theorem 3.

Let $Y=\{y_{1},\ldots,y_{n}\}$ be a set of $n$ points. Let $\psi$ be any function defined on $Y$ such that $\mathrm{L}ag_{\psi}^{c}(y_{i})$ are not empty sets. Then $\psi$ is a $c$ -concave function.

定理 3. 设

Y=\{y_{1},\ldots,y_{n}\}

是一个包含

n

个点的集合。设

\psi

是在

Y

上定义的任意函数，使得

\mathrm{L}ag_{\psi}^{c}(y_{i})

不是空集。那么

\psi

是一个

c

-凹函数。

Proof.

By definition, for $i=1,2,\ldots,k$

\begin{split}(\psi^{c})^{c}(y_{i})&=\inf\limits_{x\in X}\left[c(x,y_{i})-\psi^{c}(x)\right]\\ &=\inf\limits_{x\in X}\left[c(x,y_{i})-\left(\inf\limits_{y_{j}\in Y}\left[c(x,y_{j})-\psi(y_{j})\right]\right)\right]\\ &=\inf\limits_{x\in X}\left\{\begin{array}[]{ll}\psi(y_{i}),\quad\quad\quad\quad\quad\ \quad\quad\quad\quad\hbox{if $x\in\mathrm{L}ag_{\psi}^{c}(y_{i})$ }\\ c(x,y_{i})-(\overbrace{c(x,y_{j})-\psi(y_{j})}^{\quad<(c(x,y_{i})-\psi(y_{i}))}),\ \hbox{if $x\in\mathrm{L}ag_{\psi}^{c}(y_{j})$ $(j\neq i)$ }\end{array}\right.\\ &=\psi(y_{i}).\end{split}

This allows to conclude that $\psi$ is a $c$ -convex function (see [San15, Proposition 1.34]). ∎
这样可以得出结论， $\psi$ 是一个 $c$ -凸函数（参见[San15，命题 1.34]）。 ∎

证明。根据定义，对于

i=1,2,\ldots,k

Moreover, the converse of the theorem is also true:
此外，定理的逆命题也成立：

Theorem 4.

Let $Y=\{y_{1},\ldots,y_{n}\}$ be a set of $n$ points. Let $\psi$ be a $c$ -concave function defined on $Y$ . Then the sets $\mathrm{L}ag_{\psi}^{c}(y_{i})$ are not empty for all $i=1,\ldots,n$ ,.

定理 4 . 让

Y=\{y_{1},\ldots,y_{n}\}

成为一个由

n

个点组成的集合。让

\psi

是一个定义在

Y

上的

c

-凹函数。那么对于所有的

i=1,\ldots,n

，集合

\mathrm{L}ag_{\psi}^{c}(y_{i})

都不为空。

Proof.

Reasoning by contradiction, we suppose that $\psi$ is a $c$ -concave function and there exist $i_{0}\in\{1,\ldots,n\}$ such that $\mathrm{L}ag_{\psi}^{c}(y_{i_{0}})=\emptyset$ . Then, from definition

\begin{split}\forall x\in X,\ \exists j\in\{1,\ldots,n\}\ \hbox{with}\ j\neq i_{0},\ \hbox{and}\ \epsilon_{j}>0\ \hbox{such that}\\ c(x,y_{i_{0}})-\psi(y_{i_{0}})\geq c(x,y_{j})-\psi(y_{j})+\epsilon_{j}.\end{split}

We will write $x=x_{j}$ . Thus,
我们会写 $x=x_{j}$ 。因此。

\begin{split}c(x_{j},y_{i_{0}})&-\left(\inf\limits_{y_{j}\in Y}\left[c(x_{j},y_{j})-\psi(y_{j})\right]\right)\\ &\geq c(x_{j},y_{i_{0}})-\left(c(x_{j},y_{i_{0}})-\psi(y_{i_{0}})-\epsilon_{j}\right)\\ &=\psi(y_{i_{0}})+\epsilon_{j}.\end{split}

(33)

Therefore, 所以，

\begin{split}(\psi^{c})^{c}(y_{i_{0}})&=\inf\limits_{x\in X}\left[c(x,y_{i_{0}})-\left(\inf\limits_{y_{j}\in Y}\left[c(x,y_{j})-\psi(y_{j})\right]\right)\right]\\ &=\inf\limits_{j}\left[c(x_{j},y_{i_{0}})-\left(\inf\limits_{y_{j}\in Y}\left[c(x_{j},y_{j})-\psi(y_{j})\right]\right)\right]\\ &=\inf\limits_{j}\left[\psi(y_{i_{0}})+\epsilon_{j}\right]\\ &>\psi(y_{i_{0}}).\end{split}

This contradicts the fact that $\psi$ is a c-concave function, because [San15, Proposition 1.34]. ∎
这与 $\psi$ 是 c-凹函数的事实相矛盾，因为[San15，命题 1.34]。∎

证明。通过反证法，我们假设

\psi

是一个

c

-凹函数，并且存在这样的

i_{0}\in\{1,\ldots,n\}

，使得

\mathrm{L}ag_{\psi}^{c}(y_{i_{0}})=\emptyset

。然后，根据定义

We now proceed to compute the first and second order derivatives of the objective function $F$ . These derivatives are useful in practice to design computational algorithms.
现在我们开始计算目标函数 $F$ 的一阶和二阶导数。这些导数在实践中用于设计计算算法。

7.4 First-order derivatives of the objective function
目标函数的一阶导数

Since it is concave, $F$ admits a unique maximum $\psi^{*}$ , characterized by $\nabla F(\psi^{*})=0$ where $\nabla F$ denotes the gradient. Let us now examine the form of the gradient $\nabla F=\nabla\left(G(A\psi,\psi)+\sum_{j=1}^{n}\psi_{j}\nu_{j}\right)$ . By replacing $G(A_{\psi},\psi)$ with its expression (27), one obtains:
既然它是凹的， $F$ 允许一个由 $\nabla F(\psi^{*})=0$ 描述的独特最大值 $\psi^{*}$ ，其中 $\nabla F$ 表示梯度。现在让我们考察梯度 $\nabla F=\nabla\left(G(A\psi,\psi)+\sum_{j=1}^{n}\psi_{j}\nu_{j}\right)$ 的形式。通过用其表达式（27）替换 $G(A_{\psi},\psi)$ ，就可以得到：

\begin{split}\frac{\partial G}{\partial\psi_{j}}(\psi)&=\lim_{t\to 0}\frac{G(\psi+t{e}_{j})-G(\psi)}{t}\\ &=\lim_{t\to 0}\frac{1}{t}\left\{\int_{X}\inf\left[c(x,y_{1})-\psi_{1},\ldots,c(x,y_{j})-\psi_{j}-t,\ldots,c(x,y_{n})-\psi_{n}\right]\right.\\ &\hskip 8.5359pt\left.-\inf\limits_{i}\left[c(x,y_{i})-\psi_{i}\right]u(x)dx\right\}.\end{split}

Let $x\in\mathrm{L}ag_{\psi}^{c}(y_{m})$ , then for $t$ small enough we have
假设 $x\in\mathrm{L}ag_{\psi}^{c}(y_{m})$ ，则对于足够小的 $t$ 我们有

\begin{split}\inf&\left[c(x,y_{1})-\psi_{1},\ldots,c(x,y_{j})-\psi_{j}-t,\ldots,c(x,y_{k})-\psi_{n}\right]\\ &=c(x,y_{m})-\psi_{m}.\end{split}

Indeed, since $c(x,y_{m})-\psi_{m}<c(x,y_{i})-\psi_{i}$ for all $i\neq m$ , in particular, if $m\neq j$ : $c(x,y_{m})-\psi_{m}<c(x,y_{j})-\psi_{j}$ . Thus, for $t$ small enough
实际上，由于对于所有的 $i\neq m$ 都有 $c(x,y_{m})-\psi_{m}<c(x,y_{i})-\psi_{i}$ ，特别地，如果 $m\neq j$ : $c(x,y_{m})-\psi_{m}<c(x,y_{j})-\psi_{j}$ 。因此，对于足够小的 $t$

c(x,y_{m})-\psi_{m}\leq c(x,y_{j})-\psi_{j}-t.

In the case that $m=j$ ,
在 $m=j$ 的情况下，

\begin{split}\inf&\left[c(x,y_{1})-\psi_{1},\ldots,c(x,y_{j})-\psi_{j}-t,\ldots,c(x,y_{k})-\psi_{k}\right]\\ &=c(x,y_{j})-\psi_{j}-t.\end{split}

Consequently, we obtain that
因此，我们得到

\begin{split}\frac{\partial G}{\partial\psi_{j}}(\psi)=-\int_{\mathrm{L}ag_{\psi}^{c}(y_{j})}u(x)dx.\end{split}

Recalling that the objective function $F$ is given by $F(\psi)=G(A_{\psi},\psi)+\sum_{j}\nu_{j}\psi_{j}$ , we finally obtain:
回顾目标函数 $F$ 由 $F(\psi)=G(A_{\psi},\psi)+\sum_{j}\nu_{j}\psi_{j}$ 给出，我们最终获得：

\frac{\partial F}{\partial\psi_{j}}=\nu_{j}-\int\limits_{\mathrm{L}ag_{\psi}^{c}(y_{j})}u(x)dx.

(34)

Since the objective function $F$ is concave, it admits a unique maximum. At this maximum, all the components of the gradient vanish, this implies that the quantity of matter obtained at $y_{j}$ , that is the integral of $u$ on the Laguerre cell of $y_{j}$ , corresponds to the prescribed quantity of matter, that is $\nu_{j}$ .
由于目标函数 $F$ 是凹的，它具有唯一的最大值。在这个最大值处，梯度的所有分量都消失，这意味着在 $y_{j}$ 处获得的物质量，即在 $y_{j}$ 的拉盖尔单元上 $u$ 的积分，与所规定的物质量相对应，即 $\nu_{j}$ 。

7.5 Second order derivatives of the objective function
目标函数的二阶导数

The coefficients of the Hessian matrix $\partial^{2}F/\partial\psi_{i}\partial\psi_{j}$ are slightly more difficult to compute, since here, we cannot invoke the envelope theorem. We cannot avoid invoking Reynold’s formula. We do not detail the computations for length considerations, but give the final result.
Hessian 矩阵的系数稍微难以计算，因为在这里，我们无法调用包络定理。我们无法避免调用雷诺公式。出于篇幅考虑，我们不详细说明计算过程，但给出最终结果。

In the particular case of the $L_{2}$ cost, that is $c(x,y)=1/2\|x-y\|^{2}$ , the second-order derivatives are given by:
在 $L_{2}$ 成本的特定情况下，即 $c(x,y)=1/2\|x-y\|^{2}$ ，二阶导数为：

\begin{split}\frac{\partial^{2}G}{\partial\psi_{i}\partial\psi_{j}}(\psi)&=\int\limits_{\mathrm{L}ag_{\psi}^{c}(y_{i})\cap\mathrm{L}ag_{\psi}^{c}(y_{j})}\frac{u(x)}{\|y_{i}-y_{j}\|}\ dS(x)\quad(i\neq j),\\ \frac{\partial^{2}G}{\partial\psi_{j}^{2}}(\psi)&=-\sum\limits_{i\neq j}\frac{\partial^{2}G}{\partial\psi_{i}\partial\psi_{j}}(\psi).\end{split}

(35)

7.6 A computational algorithm for $L_{2}$ semi-discrete optimal transport
7.6A 计算算法用于 $L_{2}$ 半离散最优输运

With the definition of $F(\psi)$ , the expression of its first order derivatives (gradient $\nabla F$ ) and second order derivatives (Hessian matrix $\nabla^{2}F=\left(\partial^{2}F/\partial\psi_{i}\partial\psi_{j}\right)_{ij}$ ), we are now equipped to design a numerical solution mechanism that computes semi-discrete optimal transport by maximizing $F$ , based on a particular version [KMT16] of Newton’s optimization method [NW06]:
借助 $F(\psi)$ 的定义，以及其一阶导数（梯度 $\nabla F$ ）和二阶导数（Hessian 矩阵 $\nabla^{2}F=\left(\partial^{2}F/\partial\psi_{i}\partial\psi_{j}\right)_{ij}$ ）的表达式，我们现在能够设计一个数值解机制，通过最大化 $F$ 来计算半离散最优输运，基于牛顿优化方法的一个特定版本[KMT16]。

\begin{array}[]{ll}\mbox{\bf Input:}&\mbox{a mesh that supports the source density }u\\ &\mbox{the points }(y_{j})_{j=1}^{n}\\ &\mbox{the prescribed quantities }(\nu_{j})_{j=1}^{n}\\[5.69054pt] \mbox{\bf Output:}&\mbox{the (unique) Laguerre diagram }\mathrm{L}ag_{\psi}^{c}\mbox{ such that:}\\ &\quad\int\limits_{\mathrm{L}ag_{\psi}^{c}(y_{j})}u(x)dx=\nu_{j}\quad\forall j\\[14.22636pt] \hline\cr\\ (1)&\psi\leftarrow[0\ldots 0]\\ (2)&\mbox{While convergence is not reached}\\ (3)&\quad\mbox{Compute }\nabla F\mbox{ and }\nabla^{2}F\\ (4)&\quad\mbox{ Find }p\in{\mathbb{R}}^{n}\mbox{ such that }\nabla^{2}F(\psi)p=-\nabla F(\psi)\\ (5)&\quad\mbox{ Find the descent parameter }\alpha\\ (6)&\quad\psi\leftarrow\psi+\alpha p\\ (7)&\mbox{End while}\end{array}

The source measure is given by its density, that is a positive piecewise linear function $u$ , supported by a triangulated mesh (2D) or tetrahedral mesh (3D) of a domain $X$ . The target measure is discrete, and supported by the pointset $Y=(y_{j})_{j=1}^{n}$ . Each target point will receive the prescribed quantity of matter $\nu_{j}$ . Clearly, the prescriptions should be balanced with the available resource, that is $\int_{X}u(x)dx=\sum_{j}\nu_{j}$ . The algorithm computes for each point of the target measure the subset of $X$ that is affected to it through the optimal transport, $T^{-1}(y_{j})=\mathrm{L}ag_{\psi}^{c}(y_{j})$ , that corresponds to the Laguerre cell of $y_{j}$ . The Laguerre diagram is completely determined by the vector $\psi$ that maximizes $F$ .
源测度由其密度给出，这是一个正的分段线性函数 $u$ ，支持一个域 $X$ 的三角形网格（2D）或四面体网格（3D）。目标测度是离散的，支持点集 $Y=(y_{j})_{j=1}^{n}$ 。每个目标点将接收指定量的物质 $\nu_{j}$ 。显然，配额应该与可用资源保持平衡，即 $\int_{X}u(x)dx=\sum_{j}\nu_{j}$ 。该算法为目标测度的每个点计算经由最优传输分配给它的 $X$ 子集， $T^{-1}(y_{j})=\mathrm{L}ag_{\psi}^{c}(y_{j})$ ，对应于 $y_{j}$ 的拉盖尔单元。拉盖尔图由使 $F$ 最大化的矢量 $\psi$ 完全确定。

Line (2) needs a criterion for convergence. The classical convergence criterion for a Newton algorithm uses the norm of the gradient of $F$ . In our case, the components of the gradient of $F$ have a geometric meaning, since $\partial F/\partial\psi_{j}$ corresponds to the difference between the prescribed quantity $\nu_{j}$ associated with $j$ and the quantity of matter present in the Laguerre cell of $y_{j}$ given by $\int_{\mathrm{L}ag_{\psi}^{c}(y_{j})}\!\!u(x)dx$ . Thus, we can decide to stop the algorithm as soon as the largest absolute value of a component becomes smaller than a certain percentage of the smallest prescription. Thus we consider that convergence is reached if $\max|\nabla F_{j}|<\epsilon\min_{j}\nu_{j}$ , for a user-defined $\epsilon$ (typically 1% in the examples below).
第（2）行需要一个收敛标准。牛顿算法的经典收敛标准使用 $F$ 的梯度范数。在我们的情况下， $F$ 的梯度分量具有几何意义，因为 $\partial F/\partial\psi_{j}$ 对应于所需数量 $\nu_{j}$ 与 $j$ 相关联的拉吉尔细胞中存在的物质数量与 $y_{j}$ 给定的拉格朗日细胞中存在的物质数量之间的差异。因此，我们可以决定在某个分量的最大绝对值小于最小规定值的一定百分比时停止算法。因此，我们认为当 $\max|\nabla F_{j}|<\epsilon\min_{j}\nu_{j}$ 时收敛已经实现，对于用户定义的 $\epsilon$ （在下面的示例中通常为 1%）。

Line (3) computes the coefficients of the gradient and the Hessian matrix of $F$ , using (34) and (35). These computations involve integrals over the Laguerre cells and over their boundaries. For the $L_{2}$ cost $c(x,y)=1/2\|x-y\|^{2}$ , the boundaries of the Laguerre cells are rectilinear, which dramatically simplifies the computations of the Hessian coefficients (35). In addition, it makes it possible to use efficient algorithms to compute the Laguerre diagram [Bow81, Wat81]. Their implementation is available in several programming libraries, such as GEOGRAM¹¹¹¹11http://alice.loria.fr/software/geogram/doc/html/index.html
第（3）行使用（34）和（35）来计算 $F$ 的梯度和 Hessian 矩阵的系数。这些计算涉及拉盖尔单元和它们边界上的积分。对于 $L_{2}$ 花费 $c(x,y)=1/2\|x-y\|^{2}$ ，拉盖尔单元的边界是直线的，这显着简化了 Hessian 系数（35）的计算。此外，这使得可以使用高效的算法来计算拉盖尔图[ Bow81，Wat81]。它们的实现可在几个编程库中找到，如 GEOGRAM ¹¹ et CGAL¹²¹²12http://www.cgal.org 和 CGAL ¹² . Then one needs to compute the intersection between each Laguerre cell and the mesh that supports the density $u$ . This can be done with specialized algorithms [Lév15], also available in GEOGRAM.
。然后需要计算每个拉盖尔单元与支持密度 $u$ 的网格之间的交集。这可以通过专门的算法[Lév15]完成，也可在 GEOGRAM 中找到。

Line (4) finds the Newton step $p$ by solving a linear system. We use the Conjugate Gradient algorithm [HS52] with the Jacobi preconditioner. In our empirical experiments below, we stopped the conjugate iterations as soon as $\|\nabla^{2}Fp+\nabla F\|/\|\nabla F\|<10^{-3}$ .
第（4）行通过解线性系统找到牛顿步 $p$ 。我们使用带有雅可比预处理器的共轭梯度算法[HS52]。在我们的经验实验中，我们在共轭迭代次数达到 $\|\nabla^{2}Fp+\nabla F\|/\|\nabla F\|<10^{-3}$ 时停止。

Line (5) determines the descent parameter $\alpha$ . A result due to Mérigot and Kitagawa [KMT16] ensures the convergence of the Newton algorithm if the measure of the smallest Laguerre cell remains larger than a certain threshold (that is, half the smallest prescription $\nu_{j}$ ). There is also a condition on the norm of the gradient $\|\nabla F\|$ that we do not repeat here (the reader is referred to Mérigot and Kitagawa’s original article for more details). In our implementation, starting with $\alpha=1$ , we iteratively divide $\alpha$ by two until both conditions are satisfied.
第（5）行确定了下降参数 $\alpha$ 。Mérigot 和 Kitagawa [KMT16]的结果确保了如果最小拉盖尔单元的度量保持高于一定阈值（即最小规定 $\nu_{j}$ 的一半），牛顿算法的收敛。对于梯度 $\|\nabla F\|$ 的范数也有一个条件我们这里不再重复（读者可参考 Mérigot 和 Kitagawa 原始文章以获取更多细节）。在我们的实现中，从 $\alpha=1$ 开始，我们将 $\alpha$ 迭代除以二直至满足这两个条件。

Let us now make one step backwards and think about the original definition of Monge’s problem (M). We wish to stress that the initial constraint (local mass conservation) that characterizes transport maps was terribly difficult. It is remarkable that after several rewrites (Kantorovich relaxation, duality, c-convexity), the final problem becomes as simple as optimizing a regular ( $C^{2}$ ) concave function, for which computing the gradient and Hessian is easy in the semi-discrete case and boils down to evaluating volumes and areas in a Laguerre diagram. We wish also to stress that the computational algorithm did not require to make any approximation or discretization. The discrete, computer version is a particular setting of the general theory, that fully describes not only transport between smooth objects (functions), but also transport between less regular objects, such as pointsets and triangulated meshes. This is made possible by the rich mathematical vocabulary (measures) on which optimal transport theory acts. Thus, the computational algorithm is an elegant, direct verbatim translation of the theory into a computer program.
让我们现在往回退一步，思考一下蒙日问题（M）的原始定义。我们希望强调，表征传输映射的初始约束（局部质量守恒）是非常困难的。令人惊奇的是，在经过几次改写（康托洛维奇放松，对偶，c-凸性）之后，最终问题变得像优化正则（ $C^{2}$ ）凹函数那么简单，其中在半离散情况下计算梯度和 Hessian 是容易的，而且归结为在拉盖尔图中评估体积和面积。我们还希望强调的是，计算算法不需要做任何近似或离散化。离散的、计算机版本是一般理论的特定设置，不仅完全描述了光滑对象（函数）之间的传输，也描述了不太规则的对象之间的传输，如点集和三角形网格。这得益于在最优传输理论上起作用的丰富数学词汇（度量）。因此，计算算法是将理论直接优雅地翻译成计算机程序的直接逐字翻译。

We now show some computational results and list possible applications of this algorithm.
我们现在展示一些计算结果，并列出这个算法的可能应用。

8 Results, examples and applications of $L_{2}$ semi-discrete transport
8 结果，例子和 $L_{2}$ 半离散传输的应用。

Figure 12 shows some examples of transport in 2D, between a uniform density and a pointset (A). Each Laguerre cell has the same area. Then we consider the same pointset, but this time with a density $u(x,y)=10(1+\sin(2\pi x)\sin(2\pi y))$ (image B). We obtain a completely different Laguerre diagram. Each cell of this diagram has the same value for the integrated density $u$ . (C): the coefficients of the gradient and the Hessian of $F$ computed in the previous section involve integrals of the density $u$ over the Laguerre cells and over their boundaries. The density $u$ is supported by a triangulated mesh, and linearly interpolated on the triangles. The integrals of $u$ over the Laguerre cells and their boundaries are evaluated in closed form, by computing the intersections between the Laguerre cells and the triangles of the mesh that supports $u$ . (D): the same algorithm can compute the optimal transport between a measure supported by a 3D surface and a pointset in 3D.
图 12 显示了在二维空间中，在均匀密度和点集 (A) 之间的一些运输示例。每个拉盖尔细胞的面积相同。然后我们考虑相同的点集，但是这次密度为 $u(x,y)=10(1+\sin(2\pi x)\sin(2\pi y))$ (图像 B)。我们获得了完全不同的拉盖尔图。这个图的每个细胞对于综合密度 $u$ 有相同的值。(C)：在上一节中计算的梯度和 $F$ 的 Hessian 系数涉及拉盖尔细胞和边界上密度 $u$ 的积分。密度 $u$ 由一个三角化网格支持，并且在三角形上进行线性插值。拉盖尔细胞和其边界上 $u$ 的积分通过计算拉盖尔细胞与支持 $u$ 的网格三角形之间的交集来以闭式形式计算。(D)：相同的算法可以计算由 3D 表面支持的测度与 3D 中的点集之间的最优运输。

Figure 13 shows two examples of volume deformation by optimal transport. The same algorithm is used, but this time with 3D Laguerre diagrams, and by computing intersections between the Laguerre cells and a tetrahedral mesh that supports the source density $u$ . The intermediary steps of the animation are generated by linear interpolation of the positions (Mc. Cann’s interpolation). Figure 14 demonstrates a more challenging configuration, computing transport between a sphere (surface) and a cube (volume). The sphere is approximated with 10 million Dirac masses. As can be seen, transport has a singularity that resembles the medial axis of the cube (rightmost figure).
图 13 显示了两个通过最优输运进行体积变形的示例。使用相同的算法，但这次使用了 3D 拉盖尔图，并通过计算拉盖尔单元与支持源密度 $u$ 的四面体网格之间的交点。动画的中间步骤是通过位置的线性插值（Mc. Cann 的插值）生成的。图 14 展示了一个更具挑战性的配置，计算了球（表面）和立方体（体积）之间的输运。球用 1000 万点质量近似。可以看出，输运具有类似于立方体中轴的奇点（最右图）。

Figure 15 demonstrates an application in computational fluid dynamics. A heavy incompressible fluid (in red) is placed on top of a lighter incompressible fluid (in blue). Both fluids tend to exchange their positions, due to gravity, but incompressibility is an obstacle to their movement. As a result, vortices are created, and they become faster and faster (Taylor-Rayleigh instability). The numerical simulation method that we used here [GM17] directly computes the trajectories of particles (Lagrangian coordinates) while taking into account the incompressibility constraint, which is in general unnatural in Lagrangian coordinates. By providing a means of controlling the volumes of the Laguerre cells, optimal transport appears here as an easy way of enforcing incompressibility. Using some efficient geometric algorithms for the Laguerre diagram and its intersections [Lév15], the Gallouet-Mérigot numerical scheme for incompressible Euler scheme can be also applied in 3D to simulate the behavior of non-mixing incompressible fluids (Figure 16). The 3D version of the Taylor-Rayleigh instability is shown in Figure 17, with 10 million Laguerre cells. More complicated vortices are generated (shown here in cross-section).
图 15 展示了在计算流体力学中的应用。重的不可压缩流体（红色）直接放在轻的不可压缩流体（蓝色）上方。由于重力，这两种流体往往会交换位置，但不可压缩性是它们移动的障碍。结果，涡旋被形成，并且变得越来越快（泰勒-雷利不稳定性）。我们在这里使用的数值模拟方法[GM17]直接计算粒子的轨迹（拉格朗日坐标），同时考虑不可压缩性约束，这一般在拉格朗日坐标中是不自然的。通过提供控制拉盖尔细胞体积的方法，最优输运在这里作为执行不可压缩性的一种简单方式。利用一些有效的几何算法来处理拉盖尔图和其交点[Lév15]，可以将 Gallouet-Mérigot 数值方案用于三维中模拟不混合不可压缩流体的行为（图 16）。图 17 展示了三维版本的泰勒-雷利不稳定性，包括 1000 万个拉盖尔细胞。更复杂的涡流会生成（如交叉部分所示）。

The applications in computer animation and computational physics seem to be promising, because the semi-discrete algorithm behaves very well in practice. It is now reasonable to design simulation algorithms that solve a transport problem in each timestep (as our early computational fluid dynamics of the previous paragraph do). With our optimized implementation that uses a multicore processor for the geometric part of the algorithm and a GPU for solving the linear systems, the semi-discrete algorithm takes no more than a few tens of seconds to solve a problem with 10 millions unknown. This opens the door to numerical solution mechanisms for difficult problems. For instance, in astrophysics, the Early Universe Reconstruction problem [BFH⁺03] consists in going “backward in time” from observation data on the repartition of galaxy clusters, to “play the Big-Bang movie” backward. Our numerical experiments tend to confirm Brenier’s point of view that he expressed in the 2000’s, that semi-discrete optimal transport can be an efficient way of solving this problem.
计算机动画和计算物理中的应用似乎很有前途，因为半离散算法在实践中表现得非常好。现在设计解决每个时间步长中的传输问题的仿真算法是合理的（就像我们前一段落中的早期计算流体力学所做的那样）。通过我们优化的实现，该实现使用多核处理器处理算法的几何部分，使用 GPU 解决线性系统，半离散算法仅需几十秒即可解决涉及 1000 万未知数的问题。这为解决困难问题打开了数值解机制的大门。例如，在天体物理学中，早期宇宙重建问题[BFH ⁺ 03]包括从对星系团分布的观测数据“时光倒流”，以“倒放大爆炸电影”。我们的数值实验倾向于证实布雷尼尔在 2000 年代表达的观点，即半离散最优传输可以是解决这个问题的有效方法。

To ensure that our results are reproducible, the source-code associated with the numerical solution mechanism used in all these experiments is available in the EXPLORAGRAM component of the GEOGRAM programming library¹³¹³13http://alice.loria.fr/software/geogram/doc/html/index.html
为了确保我们的结果具有重现性，所有这些实验中使用的数字解决方案机制相关的源代码可以在 GEOGRAM 编程库的 EXPLORAGRAM 组件中找到 ¹³ .

Acknowledgments 致谢

This research is supported by EXPLORAGRAM (Inria Exploratory Research Grant). The authors wish to thank Quentin Mérigot, Yann Brenier, Jean-David Benamou, Nicolas Bonneel and Lénaïc Chizat for many discussions.
本研究得到 EXPLORAGRAM（Inria 探索性研究资助）的支持。作者感谢 Quentin Mérigot，Yann Brenier，Jean-David Benamou，Nicolas Bonneel 和 Lénaïc Chizat 的许多讨论。

References 参考文献

[AG13] Luigi Ambrosio and Nicolas Gigli. A user’s guide to optimal transport. Modelling and Optimisation of Flows on Networks, Lecture Notes in Mathematics, pages 1–155, 2013.
Luigi Ambrosio 和 Nicolas Gigli。最优输运的用户指南。网络流建模与优化，数学讲义，2013 年，1-155 页。
[AHA92] Franz Aurenhammer, Friedrich Hoffmann, and Boris Aronov. Minkowski-type theorems and least-squares partitioning. In Symposium on Computational Geometry, pages 350–357, 1992.
Franz Aurenhammer、Friedrich Hoffmann 和 Boris Aronov。闵可夫斯基类型定理和最小二乘分区。在计算几何学研讨会中，第 350-357 页，1992 年。
[Ale05] A. D. Alexandrov. Intrinsic geometry of convex surfaces (translation of the 1948 Russian original). CRC Press, 2005.
A. D. Alexandrov. 凸曲面的固有几何（1948 年俄文原著的翻译）。CRC 出版社，2005 年。
[Aur87] Franz Aurenhammer. Power diagrams: Properties, algorithms and applications. SIAM J. Comput., 16(1):78–96, 1987.
Franz Aurenhammer. 力图：性质、算法和应用。SIAM 计算杂志，16(1)：78-96，1987 年。
[BB00] Jean-David Benamou and Yann Brenier. A computational fluid mechanics solution to the monge-kantorovich mass transfer problem. Numerische Mathematik, 84(3):375–393, 2000.
Jean-David Benamou 和 Yann Brenier。一个计算流体力学解决蒙日 (monge-kantorovich) 质量转移问题的问题解. 数值数学, 84(3):375–393, 2000.
[BCMO14] Jean-David Benamou, Guillaume Carlier, Quentin Mérigot, and Edouard Oudet. Discretization of functionals involving the monge-ampère operator. arXiv, August 2014. [math.NA] http://arxiv.org/abs/1408.4336.
Jean-David Benamou，Guillaume Carlier，Quentin Mérigot 和 Edouard Oudet。涉及蒙日安培算子的泛函离散化。arXiv，2014 年 8 月。[math.NA] http://arxiv.org/abs/1408.4336.
[BDM09] Rainer Burkard, Mauro Dell’Amico, and Silvano Martello. Assignment Problems. SIAM, 2009.
[BFH⁺03] Y. Brenier, U. Frisch, M. Henon, G. Loeper, S. Matarrese, R. Mohayaee, and A. Sobolevskii. Reconstruction of the early universe as a convex optimization problem. arXiv, September 2003. arXiv:astro-ph/0304214v3.
[Bow81] Adrian Bowyer. Computing dirichlet tessellations. Comput. J., 24(2):162–166, 1981.
[Bre91] Yann Brenier. Polar factorization and monotone rearrangement of vector-valued functions. Communications on Pure and Applied Mathematics, 44:375–417, 1991.
[BvdPPH11] Nicolas Bonneel, Michiel van de Panne, Sylvain Paris, and Wolfgang Heidrich. Displacement interpolation using lagrangian mass transport. ACM Trans. Graph., 30(6):158, 2011.
[Caf03] Luis Caffarelli. The monge-ampère equation and optimal transportation, an elementary review. Optimal transportation and applications (Martina Franca, 2001), Lecture Notes in Mathematics, pages 1–10, 2003.
[Cut13] Marco Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. In Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013. Proceedings of a meeting held December 5-8, 2013, Lake Tahoe, Nevada, United States., pages 2292–2300, 2013.
[dGWH⁺15] Fernando de Goes, Corentin Wallez, Jin Huang, Dmitry Pavlov, and Mathieu Desbrun. Power particles: an incompressible fluid solver based on power diagrams. ACM Trans. Graph., 34(4):50:1–50:11, 2015.
[GM96] Wilfrid Gangbo and Robert J. McCann. The geometry of optimal transportation. Acta Math., 177(2):113–161, 1996.
[GM17] Thomas O. Gallouët and Quentin Mérigot. A lagrangian scheme à la Brenier for the incompressible euler equations. Foundations of Computational Mathematics, May 2017.
[HS52] Magnus R. Hestenes and Eduard Stiefel. Methods of Conjugate Gradients for Solving Linear Systems. Journal of Research of the National Bureau of Standards, 49(6):409–436, December 1952.
[KMT16] Jun Kitagawa, Quentin Mérigot, and Boris Thibert. A newton algorithm for semi-discrete optimal transport. CoRR, abs/1603.05579, 2016.
[Leo13] Christian Leonard. A survey of the schrödinger problem and some of its connections with optimal transport. arXiv, August 2013. [math.PR] http://arxiv.org/abs/1308.0215.
[Lév15] Bruno Lévy. A numerical algorithm for $L_{2}$ semi-discrete optimal transport in 3d. ESAIM M2AN (Mathematical Modeling and Analysis), 2015.
[McC95] Robert J. McCann. Existence and uniqueness of monotone measure-preserving maps. Duke Mathematical Journal, 80(2):309–323, 1995.
[Mém11] Facundo Mémoli. Gromov-wasserstein distances and the metric approach to object matching. Foundations of Computational Mathematics, 11(4):417–487, 2011.
[Mér11] Quentin Mérigot. A multiscale approach to optimal transport. Comput. Graph. Forum, 30(5):1583–1592, 2011.
[MMT17] Quentin Mérigot, Jocelyn Meyron, and Boris Thibert. Light in power: A general and parameter-free algorithm for caustic design. CoRR, abs/1708.04820, 2017.
[Mon84] Gaspard Monge. Mémoire sur la théorie des déblais et des remblais. Histoire de l’Académie Royale des Sciences (1781), pages 666–704, 1784.
[NW06] J. Nocedal and S. J. Wright. Numerical Optimization. Springer, New York, 2nd edition, 2006.
[PPO14] Nicolas Papadakis, Gabriel Peyré, and Edouard Oudet. Optimal Transport with Proximal Splitting. SIAM Journal on Imaging Sciences, 7(1):212–238, January 2014.
[San14] Filippo Santambrogio. Introduction to Optimal Transport Theory. In Optimal Transport, Theory and Applications, August 2014. [math.PR] http://arxiv.org/abs/1009.3856.
[San15] Filippo Santambrogio. Optimal transport for applied mathematicians, volume 87 of Progress in Nonlinear Differential Equations and their Applications. Birkhäuser/Springer, Cham, 2015. Calculus of variations, PDEs, and modeling.
[STTP14] Yuliy Schwartzburg, Romain Testuz, Andrea Tagliasacchi, and Mark Pauly. High-contrast computational caustic design. ACM Trans. Graph., 33(4):74:1–74:11, 2014.
[Tao11] Terence Tao. An Introduction to Measure Theory. American Mathematical Society, 2011.
[Vil09] Cédric Villani. Optimal transport : old and new. Grundlehren der mathematischen Wissenschaften. Springer, Berlin, 2009.
[Wat81] David Watson. Computing the n-dimensional delaunay tessellation with application to voronoi polytopes. Comput. J., 24(2):167–172, 1981.

Notions of optimal transport theory and how to implement them on a computer最优输送理论的概念及如何在计算机上实施它们

Abstract 摘要

1 Introduction 1 介绍

Comparing functions 比较函数

Interpolating between functions:在函数之间插值：

2 Monge’s problem 蒙日问题

2.1 Intuition 直觉

2.2 Monge’s problem with measures2.2 蒙日问题与度量

3 Kantorovich’s relaxed problem3Kantorovich 的放松问题

Observation 1.

Proof.

4 Kantorovich’s dual problem4Kantorovich 的对偶问题

4.1 The discrete Kantorovich problem4.1 离散 Kantorovich 问题

4.2 Constructing the Kantorovich dual in the discrete setting在离散设置中构建 Kantorovich 对偶

4.3 The Kantorovich dual in the continuous setting连续设置中的 Kantorovich 对偶

Definition 1. 定义 1。

Observation 2.

Proof. 证明。

Observation 3.

Proof. 证明。

5 From the Kantorovich dual to the optimal transport map从 Kantorovich 对偶到最优输运映射

5.1 The c𝑐c-superdifferential5.1The c𝑐c -超微分

Theorem 1.

Proof.

Observation 4.

Proof. 证明。

Observation 5.

Proof.

6 Continuous, discrete and semi-discrete transport连续、离散和半离散传输 6

6.1 The continuous →→\rightarrow continuous case and Monge-Ampère equation6.1 连续 →→\rightarrow 连续情况和 Monge-Ampère 方程

6.2 The discrete →→\rightarrow discrete case6.2 离散 →→\rightarrow 离散案例

6.3 The continuous →→\rightarrow discrete case连续 →→\rightarrow 离散情况 6.3

7 Semi-discrete transport 7 半离散输运

7.1 Semi-discrete Monge problem7.1 半离散 Monge 问题

7.2 Concavity of F𝐹F 7.2 F𝐹F 的凹性

Theorem 2.

Proof.

7.3 The semi-discrete optimal transport map7.3 半离散最优输运映射

Theorem 3.

Proof.

Theorem 4.

Proof.

7.4 First-order derivatives of the objective function目标函数的一阶导数

7.5 Second order derivatives of the objective function目标函数的二阶导数

7.6 A computational algorithm for L2subscript𝐿2L_{2} semi-discrete optimal transport7.6A 计算算法用于 L2subscript𝐿2L_{2} 半离散最优输运

8 Results, examples and applications of L2subscript𝐿2L_{2} semi-discrete transport8 结果，例子和 L2subscript𝐿2L_{2} 半离散传输的应用。

Acknowledgments 致谢

References 参考文献

Notions of optimal transport theory and how to implement them on a computer
最优输送理论的概念及如何在计算机上实施它们

Interpolating between functions:
在函数之间插值：

2.2 Monge’s problem with measures
2.2 蒙日问题与度量

3 Kantorovich’s relaxed problem
3Kantorovich 的放松问题

4 Kantorovich’s dual problem
4Kantorovich 的对偶问题

4.1 The discrete Kantorovich problem
4.1 离散 Kantorovich 问题

4.2 Constructing the Kantorovich dual in the discrete setting
在离散设置中构建 Kantorovich 对偶

4.3 The Kantorovich dual in the continuous setting
连续设置中的 Kantorovich 对偶

5 From the Kantorovich dual to the optimal transport map
从 Kantorovich 对偶到最优输运映射

5.1 The $c$ -superdifferential
5.1The $c$ -超微分

6 Continuous, discrete and semi-discrete transport
连续、离散和半离散传输 6

6.1 The continuous $\rightarrow$ continuous case and Monge-Ampère equation
6.1 连续 $\rightarrow$ 连续情况和 Monge-Ampère 方程

6.2 The discrete $\rightarrow$ discrete case
6.2 离散 $\rightarrow$ 离散案例

6.3 The continuous $\rightarrow$ discrete case
连续 $\rightarrow$ 离散情况 6.3

7.1 Semi-discrete Monge problem
7.1 半离散 Monge 问题

7.2 Concavity of $F$ 7.2 $F$ 的凹性

7.3 The semi-discrete optimal transport map
7.3 半离散最优输运映射

7.4 First-order derivatives of the objective function
目标函数的一阶导数

7.5 Second order derivatives of the objective function
目标函数的二阶导数

7.6 A computational algorithm for $L_{2}$ semi-discrete optimal transport
7.6A 计算算法用于 $L_{2}$ 半离散最优输运

8 Results, examples and applications of $L_{2}$ semi-discrete transport
8 结果，例子和 $L_{2}$ 半离散传输的应用。