A Randomized Algorithm for Single-Source Shortest Path on Undirected Real-Weighted Graphs 无向实权图单源最短路径的随机算法
Ran Duan ^(**1){ }^{* 1}, Jiayi Mao ^(†1){ }^{\dagger 1}, Xinkai Shu ^(‡2){ }^{\ddagger 2}, and Longhui Yin ^(§1){ }^{\S 1} Ran Duan ^(**1){ }^{* 1} , Jiayi Mao ^(†1){ }^{\dagger 1} , Xinkai Shu ^(‡2){ }^{\ddagger 2} , 和 Longhui Yin ^(§1){ }^{\S 1}^(1){ }^{1} Institute for Interdisciplinary Information Sciences, Tsinghua University ^(1){ }^{1} 清华大学跨学科信息科学研究院^(2){ }^{2} The University of Hong Kong ^(2){ }^{2} 香港大学
October 5, 2023 2023 年 10 月 5 日
Abstract 摘要
In undirected graphs with real non-negative weights, we give a new randomized algorithm for the single-source shortest path (SSSP) problem with running time O(msqrt(log n*log log n))O(m \sqrt{\log n \cdot \log \log n}) in the comparison-addition model. This is the first algorithm to break the O(m+n log n)O(m+n \log n) time bound for real-weighted sparse graphs by Dijkstra’s algorithm with Fibonacci heaps. Previous undirected non-negative SSSP algorithms give time bound of O(m alpha(m,n)+min{n log n,n log log r})O(m \alpha(m, n)+\min \{n \log n, n \log \log r\}) in comparison-addition model, where alpha\alpha is the inverse-Ackermann function and rr is the ratio of the maximum-to-minimum edge weight [Pettie & Ramachandran 2005], and linear time for integer edge weights in RAM model [Thorup 1999]. Note that there is a proposed complexity lower bound of Omega(m+min{n log n,n log log r})\Omega(m+\min \{n \log n, n \log \log r\}) for hierarchy-based algorithms for undirected real-weighted SSSP [Pettie & Ramachandran 2005], but our algorithm does not obey the properties required for that lower bound. As a non-hierarchy-based approach, our algorithm shows great advantage with much simpler structure, and is much easier to implement. 在具有真实非负权重的无向图中,我们提出了一种新的随机算法,用于单源最短路径(SSSP)问题,其在比较-加法模型中的运行时间为 O(msqrt(log n*log log n))O(m \sqrt{\log n \cdot \log \log n}) 。这是第一个突破 Dijkstra 算法与 Fibonacci 堆在真实权重稀疏图中 O(m+n log n)O(m+n \log n) 时间界限的算法。之前的无向非负 SSSP 算法在比较-加法模型中给出的时间界限为 O(m alpha(m,n)+min{n log n,n log log r})O(m \alpha(m, n)+\min \{n \log n, n \log \log r\}) ,其中 alpha\alpha 是反阿克曼函数, rr 是最大边权重与最小边权重的比率[Pettie & Ramachandran 2005],而在 RAM 模型中,对于整数边权重则是线性时间[Thorup 1999]。请注意,针对无向真实权重 SSSP 的基于层次的算法有一个提出的复杂度下界 Omega(m+min{n log n,n log log r})\Omega(m+\min \{n \log n, n \log \log r\}) [Pettie & Ramachandran 2005],但我们的算法不遵循该下界所需的属性。作为一种非基于层次的方法,我们的算法显示出巨大的优势,结构更简单,实施起来也更容易。
1 Introduction 1 引言
Shortest path is one of the most fundamental problems in graph theory, and its algorithms lie at the core of graph algorithm research. In a graph G=(V,E,w)G=(V, E, w) with m=|E|,n=|V|m=|E|, n=|V| and non-negative edge weight w:E rarrR_( >= 0)w: E \rightarrow \mathbb{R}_{\geq 0}, single-source shortest path (SSSP) problem asks for the distances from a given source s in Vs \in V to all other vertices. Dijkstra’s algorithm [8] computes the distances dist(s,u)\operatorname{dist}(s, u) by dynamic programming. For each vertex uu, it maintains a temporal distance d(u)d(u), which represents the shortest path from ss to uu only passing through the vertices in current SS, where SS is the set of visited vertices during algorithm process. In each round of iteration it selects vertex uu with the smallest d(u)d(u) from the unvisited nodes. Finally when S=V,d(u)=dist(s,u)S=V, d(u)=\operatorname{dist}(s, u) for all vertex uu. Advanced data structures with amortized O(1)O(1) time for insertion and decreasekey, and O(log n)O(\log n) for extract-min, called Fibonacci heap [14] and relaxed heap [10], make the time bound for Dijkstra’s algorithm to O(m+n log n)O(m+n \log n). This time bound is in the comparisonaddition model where only comparison and addition operations on edge weights are allowed and considered as unit-time operations, which is the most common model for real number inputs. For undirected graphs, Pettie and Ramachandran [25] proposed an SSSP algorithm with running time O(m alpha(m,n)+min{n log n,n log log r})O(m \alpha(m, n)+\min \{n \log n, n \log \log r\}) in the comparison-addition model, where alpha\alpha is the inverseAckermann function and rr bounds the ratio of any two edge weights. However, no SSSP algorithm faster than O(m+n log n)O(m+n \log n) has been found for real-weighted graphs without ratio constraints, both for undirected and directed graphs. 最短路径是图论中最基本的问题之一,其算法是图算法研究的核心。在一个图 G=(V,E,w)G=(V, E, w) 中,具有 m=|E|,n=|V|m=|E|, n=|V| 和非负边权 w:E rarrR_( >= 0)w: E \rightarrow \mathbb{R}_{\geq 0} ,单源最短路径(SSSP)问题要求计算从给定源 s in Vs \in V 到所有其他顶点的距离。Dijkstra 算法 [8] 通过动态规划计算距离 dist(s,u)\operatorname{dist}(s, u) 。对于每个顶点 uu ,它维护一个临时距离 d(u)d(u) ,表示从 ss 到 uu 的最短路径,仅经过当前 SS 中的顶点,其中 SS 是算法过程中访问过的顶点集合。在每轮迭代中,它从未访问的节点中选择距离最小的顶点 uu 。最后,当 S=V,d(u)=dist(s,u)S=V, d(u)=\operatorname{dist}(s, u) 对所有顶点 uu 完成时。具有摊销 O(1)O(1) 时间的高级数据结构用于插入和 decreasekey,以及用于 extract-min 的 O(log n)O(\log n) ,称为 Fibonacci 堆 [14] 和松弛堆 [10],使 Dijkstra 算法的时间界限为 O(m+n log n)O(m+n \log n) 。 此时间界限是在比较加法模型中,其中仅允许对边权进行比较和加法操作,并将其视为单位时间操作,这是实数输入中最常见的模型。对于无向图,Pettie 和 Ramachandran [25] 提出了一个在比较加法模型中运行时间为 O(m alpha(m,n)+min{n log n,n log log r})O(m \alpha(m, n)+\min \{n \log n, n \log \log r\}) 的单源最短路径(SSSP)算法,其中 alpha\alpha 是反阿克曼函数, rr 限制了任何两个边权的比率。然而,对于没有比率约束的实权重图,无论是无向图还是有向图,尚未找到比 O(m+n log n)O(m+n \log n) 更快的 SSSP 算法。
A byproduct of Dijkstra’s algorithm is the sorting of all vertices by their distances from ss, but the lower bound of Omega(n log n)\Omega(n \log n) lies for comparison-based sorting algorithms. Researchers used to believe that this sorting bottleneck existed for many graph problems, and breaking this bottleneck is an important and interesting direction. Yao [7] gave a minimum spanning tree (MST) algorithm with running time O(m log log n)O(m \log \log n), citing an unpublished result of O(msqrt(log n))O(m \sqrt{\log n}) by Tarjan. The current best results for MST are the randomized linear time algorithm [22], the deterministic O(m alpha(m,n))O(m \alpha(m, n))-time algorithm [4], and a deterministic algorithm with proven optimal (but unknown) complexity [24]. In the bottleneck path problem, we want to find the path maximizing the minimum edge weight on it between two vertices. Gabow and Tarjan [18] gave an O(mlog^(**)n)O\left(m \log ^{*} n\right)-time algorithm for s-t bottleneck path problem in directed graphs, which was later improved to randomized O(m beta(m,n))O(m \beta(m, n)) time [5]. For single-source all-destination bottleneck path problem in directed graphs, there is a recent result of O(msqrt(log n))O(m \sqrt{\log n})-time randomized algorithm by Duan et al. [11]. For single-source nondecreasing path problem, Virginia V.Williams [34] proposed an algorithm with time bound O(m log log n)O(m \log \log n). All the results above are comparison-based though, the techniques in these works, such as local construction or divide-and-conquer approach, hardly works for the shortest path problem. Therefore it remains how to break the sorting bottleneck for SSSP. Dijkstra 算法的一个副产品是按与 ss 的距离对所有顶点进行排序,但 Omega(n log n)\Omega(n \log n) 的下界适用于基于比较的排序算法。研究人员曾认为,这种排序瓶颈在许多图问题中存在,突破这一瓶颈是一个重要且有趣的方向。Yao [7] 提出了一个运行时间为 O(m log log n)O(m \log \log n) 的最小生成树 (MST) 算法,引用了 Tarjan 的一项未发表结果 O(msqrt(log n))O(m \sqrt{\log n}) 。目前 MST 的最佳结果是随机线性时间算法 [22]、确定性 O(m alpha(m,n))O(m \alpha(m, n)) -时间算法 [4] 和一个具有已证明最优(但未知)复杂度的确定性算法 [24]。在瓶颈路径问题中,我们希望找到在两个顶点之间最大化最小边权重的路径。Gabow 和 Tarjan [18] 提出了一个针对有向图中 s-t 瓶颈路径问题的 O(mlog^(**)n)O\left(m \log ^{*} n\right) -时间算法,后来改进为随机 O(m beta(m,n))O(m \beta(m, n)) 时间 [5]。对于有向图中的单源全目标瓶颈路径问题,Duan 等人 [11] 最近提出了一个 O(msqrt(log n))O(m \sqrt{\log n}) -时间的随机算法。 对于单源非递减路径问题,Virginia V.Williams [34] 提出了一个时间界限为 O(m log log n)O(m \log \log n) 的算法。尽管上述所有结果都是基于比较的,但这些工作的技术,如局部构造或分治方法,几乎无法用于最短路径问题。因此,如何打破单源最短路径的排序瓶颈仍然是一个问题。
1.1 Our Results 1.1 我们的结果
In this paper we propose the first SSSP algorithm for undirected real-weighted graphs that breaks the sorting bottleneck. 在本文中,我们提出了第一个针对无向实重图的单源最短路径算法,打破了排序瓶颈。
Theorem 1. In an undirected graph G=(V,E,w)G=(V, E, w) with nonnegative edge weights w:E rarrR_( >= 0)w: E \rightarrow \mathbb{R}_{\geq 0}, there is a comparison-addition based Las-Vegas randomized algorithm that solves the single-source shortest path problem in O(msqrt(log n*log log n))O(m \sqrt{\log n \cdot \log \log n}) time, in which the results are always correct and it can achieve this time bound with high probability. The time complexity can be improved to O(sqrt(mn log n)+nsqrt(log n log log n))O(\sqrt{m n \log n}+n \sqrt{\log n \log \log n}) when m=omega(n)m=\omega(n) and m=o(n log n)m=o(n \log n). 定理 1. 在一个边权非负的无向图 G=(V,E,w)G=(V, E, w) 中,有一种基于比较-加法的拉斯维加斯随机算法,可以在 O(msqrt(log n*log log n))O(m \sqrt{\log n \cdot \log \log n}) 时间内解决单源最短路径问题,结果总是正确,并且可以以高概率达到这个时间界限。当 m=omega(n)m=\omega(n) 和 m=o(n log n)m=o(n \log n) 时,时间复杂度可以提高到 O(sqrt(mn log n)+nsqrt(log n log log n))O(\sqrt{m n \log n}+n \sqrt{\log n \log \log n}) 。
Note that there is a (worst-case) lower bound of Omega(m+min{n log n,n log log r})\Omega(m+\min \{n \log n, n \log \log r\}) in [25] for 请注意,在[25]中存在一个(最坏情况)下界为 Omega(m+min{n log n,n log log r})\Omega(m+\min \{n \log n, n \log \log r\})
“hierarchy-based” algorithms for undirected real-weighted SSSP, but our algorithm is randomized and not hierarchy-based. See Remark 2 for discussions. “基于层次”的无向实重 SSSP 算法,但我们的算法是随机的,并不是基于层次的。有关讨论,请参见备注 2。
Technical Overview. The bottleneck of Dijkstra-based algorithm is the priority queue. For this reason, we only add a fraction of vertices into the priority queue. As in many works on distance oracles or spanners, we sample a subset of vertices RR, and the heap is only for vertices in RR, then we “bundle” every other vertex vv to its nearest vertex in RR, which is called b(v)\mathrm{b}(v). Then define Ball (v)(v) to be the set of vertices closer than b(v)\mathrm{b}(v) to vv. Since the algorithm doesn’t know the correct order of most vertices on a shortest path, relaxing only neighbours as in Dijkstra’s algorithm doesn’t work. So when popping a vertex u in Ru \in R from the heap, we also deal with vertices vv which are bundled to uu. In an undirected graph, this also implies that |dist(s,u)-dist(s,v)||\operatorname{dist}(s, u)-\operatorname{dist}(s, v)| is not large. Here we relax vv from vertices in Ball(v)\operatorname{Ball}(v) and their neighbors, then from vv we relax neighbors of vv and vertices in their balls. (To make it easier to describe, we first change the graph to a constant-degree graph with O(m)O(m) vertices.) Details of the algorithm will be discussed in Section 3, as well as the analysis of correctness and running time. The detailed construction of bundles so that the algorithm can achieve the time bound w.h.p. will be introduced in Section 4. The improvement of time complexity by relaxing the constant-degree constraint will be discussed in Section 5 . 技术概述。基于 Dijkstra 算法的瓶颈是优先队列。因此,我们只将一部分顶点添加到优先队列中。与许多关于距离神谕或生成树的研究一样,我们对顶点 RR 进行采样,堆仅用于顶点 RR ,然后我们将每个其他顶点 vv “捆绑”到其在 RR 中最近的顶点,这称为 b(v)\mathrm{b}(v) 。然后定义球 (v)(v) 为比 b(v)\mathrm{b}(v) 更接近 vv 的顶点集合。由于算法不知道大多数顶点在最短路径上的正确顺序,仅像 Dijkstra 算法那样放松邻居是行不通的。因此,当从堆中弹出顶点 u in Ru \in R 时,我们还处理与 uu 捆绑的顶点 vv 。在无向图中,这也意味着 |dist(s,u)-dist(s,v)||\operatorname{dist}(s, u)-\operatorname{dist}(s, v)| 不是很大。在这里,我们从顶点 Ball(v)\operatorname{Ball}(v) 及其邻居中放松 vv ,然后从 vv 中放松 vv 的邻居和它们的球中的顶点。(为了更容易描述,我们首先将图更改为具有 O(m)O(m)