Elsevier

Journal of Computational and Applied Mathematics

Volume 369, 1 May 2020, 112512
第 369 卷,2020 年 5 月 1 日,112512
Journal of Computational and Applied Mathematics

Iterative refinement for singular value decomposition based on matrix multiplication
基于矩阵乘法的奇异值分解迭代细化

https://doi.org/10.1016/j.cam.2019.112512Get rights and content  获取权利和内容
Under an Elsevier user license
在 Elsevier 用户许可下
open archive  打开档案

Abstract  抽象

We propose a refinement algorithm for singular value decomposition (SVD) of a real matrix. In the same manner as Newton’s method, the proposed algorithm converges quadratically if a modestly accurate initial guess is given. Since the proposed algorithm is based on matrix multiplication, it can efficiently be implemented. Numerical results demonstrate the excellent performance of the proposed algorithm in terms of the convergence rate and the measured computing time compared to a standard approach using multiple precision arithmetic.
我们提出了一种用于实矩阵奇异值分解 (SVD) 的细化算法。与牛顿方法相同,如果给出适度准确的初始估计值,则所提出的算法将呈二次方收敛。由于所提出的算法是基于矩阵乘法的,因此可以有效地实现。数值结果表明,与使用多精度算术的标准方法相比,所提算法在收敛速率和实测计算时间方面具有优异的性能。

MSC  MSC 系列

65F30
15A18
15A23

Keywords  关键字

SVD
Iterative refinement
Convergence analysis
Higher-precision arithmetic

SVD
迭代细化
收敛分析
更高精度的算术

1. Introduction  1. 引言

Let A be a real m×n matrix. In this paper, we propose a refinement algorithm for the singular value decomposition (SVD) of A with mn. If m<n, considering the SVD of AT yields equivalent results. It is well known that the SVD has many applications in various fields, such as signal processing [1], [2], statistical analysis [3], [4], and so forth. Excellent overviews of the SVD can be found in [5], [6].
A 为实 m×n 矩阵。在本文中,我们提出了一种 with mnA 奇异值分解 (SVD) 的细化算法。如果 m<n ,考虑 的 SVD AT 会产生等效的结果。众所周知,SVD 在各个领域都有许多应用,例如信号处理、统计分析等。SVD 的精彩概述可以在 、 中找到。
Throughout the paper, let In and O denote the n×n identity matrix and the zero matrix of appropriate size, respectively. Moreover, denotes the spectral norm for matrices. If necessary, we distinguish between the approximate quantities and the computed results, e.g., for some quantity α, we write α˜ and αˆ as an approximation of α and a computed result for α, respectively.
在整篇论文中,let InO 分别表示适当大小的 n×n 单位矩阵和零矩阵。此外, 表示矩阵的谱范数。如有必要,我们区分近似量和计算结果,例如,对于某个量 α ,我们分别将 和 αˆα˜α 的近似 α 值和 计算结果 。
Let σiR, i=1,,n, denote the singular values of A. We consider the (full size) SVD of A such that A=UΣVT,URm×m,VRn×n,ΣRm×n,where both U and V are orthogonal and Σ is diagonal with Σii=σi. For simplicity, we assume that σ1>σ2>>σn>0.In other words, we consider the case that all singular values are simple, and A has full column rank. If there are multiple or nearly multiple singular values, we need some special care as in [7], [8].
σiRi=1,,n , 表示 的 A 奇异值。我们考虑 的(全尺寸)SVD 是 A 这样的,其中 UV 都是正交的,并且 Σ 是与 Σii=σi 的对角线。为简单起见,我们假设换句话说,我们考虑所有奇异值都是简单的,并且 A 具有完整的列排名。如果存在多个或几乎多个奇异值,则需要特别小心,如 , 。
Recently, the authors proposed refinement algorithms for symmetric eigenvalue decomposition in [7], [8]. In the same spirit of the previous papers, the use of higher-precision arithmetic in our proposed refinement algorithm for the SVD is primarily restricted to matrix multiplication, which accounts for most of the computational cost. There are several approaches for higher-precision matrix multiplication. For example, XBLAS (extra precise BLAS) [9] and fast and accurate algorithms for dot products [10] and matrix products [11] based on error-free transformations are available for efficient implementation.
最近,作者在 [7][8] 中提出了对称特征值分解的改进算法。本着与前几篇论文相同的精神,在我们提出的 SVD 细化算法中,更高精度算术的使用主要限于矩阵乘法,这占了大部分计算成本。有几种方法可以进行更高精度的矩阵乘法。例如,XBLAS (超精确 BLAS) [9] 和基于无差错变换的点积 [10] 和矩阵积 [11] 的快速准确算法可用于高效实现。
The idea of our algorithm is to use the following relations: (1)UTU=Im(orthogonality of U)(2)VTV=In(orthogonality of V)(3)UTAV=Σ(diagonality of A as the SVD) Using these relations, we develop a refinement algorithm for the SVD in the same manner as Newton’s method. Thus, the proposed algorithm has quadratic convergence.
我们算法的思路是使用以下关系式:(1)UTU=Im(U 的 正交性 )(2)VTV=In(V 的 正交性 (3)Unexpected text node: '('利用这些关系,我们以与牛顿方法相同的方式为 SVD 开发了一种细化算法。因此,所提出的算法具有二次收敛性。
There exist several refinement algorithms for SVD that are based on Newton’s method for nonlinear equations (cf. e.g., [12]). Since this sort of algorithm is designed to improve a triplet (σ,u,v)R×Rn×Rn individually, where σ is a singular value and u and v are corresponding left and right singular vectors, applying such an approach to all triplets requires O(n4) arithmetic operations. In [13], Davies and Smith proposed an iterative refinement algorithm for updating the singular value decomposition in O(n3) operations. However, similarly to the Davies–Modi algorithm [14] for the symmetric eigenvalue decomposition as mentioned in the previous paper [7], the Davies–Smith algorithm has the limitation of achievable accuracy of the results. The reason is as follows. The Davies–Smith algorithm assumes that a given real matrix A is preconditioned to a nearly diagonal matrix such as UˆTAVˆ, where Uˆ and Vˆ are computed SVD factors, i.e., Uˆ and Vˆ are approximately orthogonal matrices. Since Uˆ and Vˆ involve numerical errors, the matrix multiplications in UˆTAVˆ are generally not orthogonal transformations, and the singular values of UˆTAVˆ are slightly perturbed from the original matrix A. Then, the singular vectors are also perturbed. Therefore, even if the Davies–Smith algorithm provides accurate singular vectors of UˆTAVˆ, they are not necessarily accurate ones of A. On the other hand, our proposed algorithm uses the original matrix A for obtaining accurate singular vectors of A.
SVD 有几种基于牛顿非线性方程法的细化算法(参见 e.g.)。由于这种算法旨在单独改进三元组 (σ,u,v)R×Rn×Rn ,其中 σ 是奇异值 和 uv 是相应的左和右奇异向量,因此将这种方法应用于所有三元组需要 O(n4) 算术运算。在 中,Davies 和 Smith 提出了一种迭代细化算法,用于更新运算中的 O(n3) 奇异值分解。然而,与上一篇文章中提到的对称特征值分解的 Davies-Modi 算法类似,Davies-Smith 算法在结果的可实现精度方面存在局限性。原因如下。Davies-Smith 算法假设给定的实数矩阵 A 被预设为接近对角线的矩阵,例如 UˆTAVˆ ,其中 UˆVˆ 是计算的 SVD 因子,即 UˆVˆ 是近似正交矩阵。由于 UˆVˆ 涉及数值误差,因此 中的 UˆTAVˆ 矩阵乘法通常不是正交变换,并且 的 UˆTAVˆ 奇异值与原始矩阵 A 略有不同。然后,奇异向量也会受到扰动。因此,即使 Davies-Smith 算法提供了 的 UˆTAVˆ 精确奇异向量 ,它们也不一定是 的 A 准确向量。另一方面,我们提出的算法使用原始矩阵 A 来获取 的 A 精确奇异向量。
The rest of the paper is organized as follows. In Section 2, we present a refinement algorithm for the SVD. In Section 3, we provide a convergence analysis of the proposed algorithm. In Section 4, we present some numerical results showing the behavior and performance of the proposed algorithm.
本文的其余部分组织如下。在第 2 节中,我们提出了 SVD 的优化算法。在第 3 节中,我们提供了所提出的算法的收敛分析。在第 4 节中,我们提供了一些数值结果,显示了所提出的算法的行为和性能。

2. Proposed algorithm  2. 建议的算法

Let UˆRm×m and VˆRn×n be given approximation of U and V, respectively. Let further FRm×m and GRn×n be correction matrices satisfying U=Uˆ(Im+F) and V=Vˆ(In+G), respectively. Let ϵ be defined as (4)ϵmax(ϵF,ϵG),ϵFF,ϵGG.We assume that ϵ<1. Then, both Im+F and In+G are nonsingular, and (Im+F)1=ImF+ΔF,ΔFk=2(F)k,ΔFϵF1ϵF,(In+G)1=InG+ΔG,ΔGk=2(G)k,ΔGϵG1ϵG. Inserting U=Uˆ(Im+F) into (1), we have (Im+FT)UˆTUˆ(Im+F)=Imand UˆTUˆ=(Im+FT)1(Im+F)1=(ImFT+ΔFT)(ImF+ΔF),which yields (5)F+FT=ImUˆTUˆ+Δ1,Δ1ΔF+ΔFT+(FΔF)T(FΔF).Similarly, inserting V=Vˆ(In+G) into (2), we have (6)G+GT=InVˆTVˆ+Δ2,Δ2ΔG+ΔGT+(GΔG)T(GΔG).Moreover, inserting U=Uˆ(Im+F) and V=Vˆ(In+G) into (3), we have (7)ΣFTΣΣG=UˆTAVˆ+Δ3,Δ3ΣΔGΔFTΣ(FΔF)TΣ(GΔG).Here, (8)Δ1(32ϵF)ϵF2(1ϵF)2χ(ϵ)ϵ2,(9)Δ2(32ϵG)ϵG2(1ϵG)2χ(ϵ)ϵ2,(10)Δ3ϵF2+ϵG2+(1ϵFϵG)ϵFϵG(1ϵF)(1ϵG)Σχ(ϵ)ϵ2A, where
UˆRm×mVˆRn×n 分别得到 UV 的近似值。设 further FRm×mGRn×n 分别是满足 U=Uˆ(Im+F)V=Vˆ(In+G) 的校正矩阵。设 ϵ 定义为 我们假设 ϵ<1 。那么,和 都是 Im+F 奇异的,并且 插入 U=Uˆ(Im+F) 到 ,我们有 和 ,得到 同样,插入 V=Vˆ(In+G) 到 ,我们有 此外,插入 U=Uˆ(Im+F)V=Vˆ(In+G) 进入 ,我们有 这里 In+G ,其中
(11)χ(ϵ)32ϵ(1ϵ)2.
Omitting the second-order terms Δ1, Δ2, and Δ3 from (5), (6), and (7) in a similar way to Newton’s method, we obtain a system of matrix equations for F˜=(f˜ij)Rm×m, G˜=(g˜ij)Rn×n, and Σ˜=diag(σ˜i)Rm×n as (12)F˜+F˜T=R,RImUˆTUˆG˜+G˜T=S,SInVˆTVˆΣ˜F˜TΣ˜Σ˜G˜=T,TUˆTAVˆ(13)f˜ij+f˜ji=rijfor1i,jmg˜ij+g˜ji=sijfor1i,jnΣ˜ijσ˜jf˜jiσ˜ig˜ij=tijfor1im,1jn. All that remains is to solve (12) for F˜, G˜, and Σ˜.
以与牛顿方法类似的方式省略 (5)、(6)(7) 中的二阶项 Δ1Δ2Δ3 ,我们得到一个矩阵 F˜=(f˜ij)Rm×m 方程组 , G˜=(g˜ij)Rn×n Σ˜=diag(σ˜i)Rm×n 因为 (12)F˜+F˜T=R,RImUˆTUˆG˜+G˜T=S,SInVˆTVˆΣ˜F˜TΣ˜Σ˜G˜=T,TUˆTAVˆ(13)f˜ij+f˜ji=rijfor1i,jmg˜ij+g˜ji=sijfor1i,jnΣ˜ijσ˜jf˜jiσ˜ig˜ij=tijfor1im,1jn. 剩下的就是求解 F˜ 、 和 G˜ Σ˜(12)
In the following, we will show that we can easily solve the system of matrix equations (12). We partition F˜,Σ˜,R,T as follows: nmnF˜=F˜11F˜12}nF˜21F˜22}mn,nΣ˜=Σ˜n}nO}mnnmnR=R11R12}nR21R22}mn,nT=T1}nT2}mn Then it follows from (12) that (14a)F˜11+F˜11T=R11,(14b)F˜21+F˜12T=R21,(14c)F˜22+F˜22T=R22, and (15a)Σ˜nF˜11TΣ˜nΣ˜nG˜=T1,(15b)F˜12TΣ˜n=T2Σ˜nF˜12=T2T.
在下文中,我们将展示我们可以轻松求解矩阵方程组 (12)。我们按如下方式进行分区 F˜,Σ˜,R,T :{n {m−n F ̃= F ̃11 F ̃12 }n F ̃21 F ̃22 }m−n, {n Σ ̃= Σ ̃n }n O }m−n {n {m−n R= R11 R12 }n R21 R22 }m−n, {n T= T1 }n T2 }m−n 然后,从 (12) (14a)F˜11+F˜11T=R11,(14b)F˜21+F˜12T=R21,(14c)F˜22+F˜22T=R22, 可以得出 和 (15a)Σ˜nF˜11TΣ˜nΣ˜nG˜=T1,(15b)F˜12TΣ˜n=T2Σ˜nF˜12=T2T.
First, we focus on the diagonal parts of F˜11 and G˜. It follows from the first and second equations in (13) that f˜ii=rii2,g˜ii=sii2for1in.Moreover, the third equation in (13) yields (1f˜iig˜ii)σ˜i=(1(rii+sii)2)σ˜i=tiifor1in.Thus, if rii+sii2 for 1in, we have (16)σ˜i=tii1(rii+sii)2for1in.
首先,我们关注 F˜11 和 的 G˜ 对角线部分。从 (13) 中的第一个和第二个方程中可以得出, f˜ii=rii2,g˜ii=sii2for1in. 此外,(13) 中的第三个方程得出 (1f˜iig˜ii)σ˜i=(1(rii+sii)2)σ˜i=tiifor1in. 因此,如果 rii+sii2 for 1in ,我们有 (16)σ˜i=tii1(rii+sii)2for1in.

Remark 1  注 1

In theory, there is a possibility that rii+sii=2. However, R and S are residuals in terms of orthogonality, and it is likely that |rii|1 and |sii|1, and |rii+sii|1 in practice.
理论上,有可能 rii+sii=2 .但是, R and S 是正交性的残差,并且很可能是 |rii|1|sii|1 ,并且 |rii+sii|1 在实践中。
Next, we focus on the off-diagonal parts of F˜11 and G˜. Combining (13), (16), they can be determined by solving 4 × 4 linear systems (17)f˜ij+f˜ji=rij(18)g˜ij+g˜ji=sij(19)σ˜if˜ij+σ˜jg˜ji=tji(20)σ˜jf˜ji+σ˜ig˜ij=tij for 1i,jn, ij. By multiplying (19) by σ˜i and (20) by σ˜j, σ˜i2f˜ij+σ˜iσ˜jg˜ji=σ˜itji,σ˜j2f˜ji+σ˜iσ˜jg˜ij=σ˜jtij, and σ˜i2f˜ij+σ˜j2f˜ji+σ˜iσ˜j(g˜ij+g˜ji)=σ˜itjiσ˜jtij.Inserting (18) into this yields σ˜i2f˜ij+σ˜j2f˜ji=σ˜itjiσ˜jtijσ˜iσ˜jsij.Combining this and (17), we have (σ˜j2σ˜i2)f˜ij=σ˜j2rij+σ˜itji+σ˜jtij+σ˜iσ˜jsij=σ˜j(tij+σ˜jrij)+σ˜i(tji+σ˜jsij).Similarly, using (17)(20), we obtain (σ˜j2σ˜i2)g˜ij=σ˜i(tij+σ˜jrij)+σ˜j(tji+σ˜jsij).Hence, (21)f˜ij=αijσ˜j+βijσ˜iσ˜j2σ˜i2,g˜ij=αijσ˜i+βijσ˜jσ˜j2σ˜i2ifσ˜iσ˜jfor1i,jn,ij,where αijtij+σ˜jrij and βijtji+σ˜jsij. Moreover, combining (15b), (16), F˜12 can also be determined as (22)f˜ij=tjiσ˜iifσ˜i0for1in,n+1jm.Furthermore, combining (14b), (22), F˜21 is determined as F˜21=R21F˜12T and f˜ij=rijf˜ji=rij+tijσ˜jifσ˜j0forn+1im,1jn.Finally, F˜22 can arbitrarily be determined on the condition (14c). Thus we choose f˜ij as f˜ij=rij2forn+1i,jm,ij.
  1. Download: Download high-res image (228KB)
  2. Download: Download full-size image

接下来,我们关注 F˜11G˜ 的非对角线部分。将 (13)、(16) 组合起来,可以通过求解 、 的 4 × 4 个线性方程组 (17)f˜ij+f˜ji=rij(18)g˜ij+g˜ji=sij(19)σ˜if˜ij+σ˜jg˜ji=tji(20)σ˜jf˜ji+σ˜ig˜ij=tij 1i,jn 来确定 ij 它们。通过将 (19) 乘以 σ˜i(20)σ˜j 以 , σ˜i2f˜ij+σ˜iσ˜jg˜ji=σ˜itji,σ˜j2f˜ji+σ˜iσ˜jg˜ij=σ˜jtij, 并将 σ˜i2f˜ij+σ˜j2f˜ji+σ˜iσ˜j(g˜ij+g˜ji)=σ˜itjiσ˜jtij. (18) 插入其中,得到 σ˜i2f˜ij+σ˜j2f˜ji=σ˜itjiσ˜jtijσ˜iσ˜jsij. 将此和 (17) 组合起来,我们得到 (σ˜j2σ˜i2)f˜ij=σ˜j2rij+σ˜itji+σ˜jtij+σ˜iσ˜jsij=σ˜j(tij+σ˜jrij)+σ˜i(tji+σ˜jsij). 同样,使用 (17)(20),我们得到 (σ˜j2σ˜i2)g˜ij=σ˜i(tij+σ˜jrij)+σ˜j(tji+σ˜jsij). 因此, (21)f˜ij=αijσ˜j+βijσ˜iσ˜j2σ˜i2,g˜ij=αijσ˜i+βijσ˜jσ˜j2σ˜i2ifσ˜iσ˜jfor1i,jn,ij, 其中 αijtij+σ˜jrijβijtji+σ˜jsij 。此外,结合 (15b)、(16) F˜12 也可以确定为 (22)f˜ij=tjiσ˜iifσ˜i0for1in,n+1jm. 此外,结合 (14b)、(22) F˜21 被确定为 F˜21=R21F˜12Tf˜ij=rijf˜ji=rij+tijσ˜jifσ˜j0forn+1im,1jn. 最后, F˜22 可以根据条件 (14c) 任意确定。因此,我们选择 f˜ij f˜ij=rij2forn+1i,jm,ij.
  1. Download: Download high-res image (228KB)
  2. Download: Download full-size image
Summarizing the above discussion, we present a refinement algorithm for the SVD of a real matrix in Algorithm 1.
总结上述讨论,我们在算法 1 中提出了一种实矩阵 SVD 的细化算法。

Remark 2  注 2

In Algorithm 1, we assume that σ˜iσ˜j for all (i,j). If σ˜i=σ˜j for some (i,j), we need some care in a similar way to the treatment for the symmetric eigenvalue problem in [7].
在算法 1 中, σ˜iσ˜j 我们假设对于所有 (i,j) .如果 σ˜i=σ˜j 对于某些 (i,j) ,我们需要一些小心,就像 [7] 中对称特征值问题的处理一样。

Remark 3  注 3

Algorithm 1 would not work for the thin SVD unless C(Uˆ)=C(U), as C(Uˆ)C(Uˆ) at each iteration, where C(X) is the column space of a matrix X.
算法 1 不适用于瘦 SVD,除非 C(Uˆ)=C(U) ,就像在每次迭代中一样 C(Uˆ)C(Uˆ) ,其中 C(X) 是矩阵 X 的列空间。
In the next section, we will discuss the convergence of the proposed algorithms in this section, which is proved to be quadratic.
在下一节中,我们将讨论本节中提出的算法的收敛性,它被证明是二次的。

3. Convergence analysis  3. 收敛分析

Here we prove quadratic convergence of Algorithm 1. Let ϵ be defined as in (4). Recall that F,F˜,G,G˜ are obtained from the following equations: (23)F+FT=R+Δ1,RImUˆTUˆ,Δ1χ(ϵ)ϵ2,(24)G+GT=S+Δ2,SInVˆTVˆ,Δ2χ(ϵ)ϵ2,(25)ΣFTΣΣG=T+Δ3,TUˆTAVˆ,Δ3χ(ϵ)Aϵ2,(26)F˜+F˜T=R,(27)G˜+G˜T=S,(28)Σ˜F˜TΣ˜Σ˜G˜=T.
在这里,我们证明了算法 1 的二次收敛性。设 ϵ 定义为 (4) 中。回想一下, F,F˜,G,G˜ 从以下方程式中获得: (23)F+FT=R+Δ1,RImUˆTUˆ,Δ1χ(ϵ)ϵ2,(24)G+GT=S+Δ2,SInVˆTVˆ,Δ2χ(ϵ)ϵ2,(25)ΣFTΣΣG=T+Δ3,TUˆTAVˆ,Δ3χ(ϵ)Aϵ2,(26)F˜+F˜T=R,(27)G˜+G˜T=S,(28)Σ˜F˜TΣ˜Σ˜G˜=T.
The main difference from the discussion about the symmetric eigenvalue decompositions is that we consider the case of rectangular matrices, i.e., m>n. In connection with this, for n+1im, the ith columns of U are not unique. Hence, we uniquely determine U depending on a given Uˆ as follows. Define U such that the lower right (mn)×(mn) submatrix of Uˆ1U is symmetric positive definite; see [7, § 3.2] for the proof of its uniqueness. Then, F22 is symmetric. Moreover, the next lemma can be proved in the same manner as [7, Lemma 3].
与对称特征值分解的讨论的主要区别在于,我们考虑了矩形矩阵的情况,即 m>n 。与此相关,对于 n+1im ,的 i U 第 th 列不是唯一的。因此,我们根据给定 Uˆ 的给定来唯一确定 U ,如下所示。定义 U 的右下角 (mn)×(mn) 子矩阵 Uˆ1U 是对称的正定矩阵;参见 [7, § 3.2] 以证明其唯一性。然后, F22 是对称的。此外,下一个引理可以用与 [7, 引理 3] 相同的方式证明。

Lemma 1  引理 1

Let ARm×n,UˆRm×m, andVˆRn×n withm>n. In addition, let U be a set of orthogonal matrices comprising the normalized left singular vectors ofA. ForUˆRm×m obtained by Algorithm1 and any fixed UαU, we define Fα such that (29)Uα=Uˆ(Im+Fα).In addition, we define F such that (30)U=Uˆ(Im+F),where URm×m comprises normalized left singular vectors such that the lower right(mn)×(mn) submatrix of Uˆ1U is symmetric positive definite. Then, we have (31)F3Fα.
ARm×n UˆRm×m , 和 VˆRn×n m>n 。此外,设 U 为一组正交矩阵,其中包含 的 A 归一化左奇异向量。对于 UˆRm×m 通过算法1 和任何固定 UαU 获得我们定义 Fα 如下 (29)Uα=Uˆ(Im+Fα). 此外,我们定义 F (30)U=Uˆ(Im+F), where 包含归一化的左奇异向量,使得 的 Uˆ1U(mn)×(mn)子矩阵是对称的正定。 URm×m 那么,我们有 (31)F3Fα.
Noting the above lemma, we prove the quadratic convergence. First, we estimate F˜F and G˜G in some neighborhood of the solutions.
注意到上述引理,我们证明了二次收敛。首先,我们估计 F˜F 并在 G˜G 解的某个邻域中。

Lemma 2  引理 2

Suppose mn andm2, and defineσn+10 for the sake of convenience. Let ϵ be defined as in (4). If (32)ϵ<min1in(σiσi+1)30mAis satisfied, then (33)ϵ<160.Moreover, letting (34)η(ϵ)2χ(ϵ)(12ϵ)(12ϵχ(ϵ)ϵ2),we obtain (35)max(F˜F,G˜G)(2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ2)mAϵ2min1in(σiσi+1)2η(ϵ)Aϵ2,where χ(ϵ) in(11) and η(ϵ) in (34) satisfy (36)χ(ϵ)3.068,η(ϵ)6.572.
假设 mn m2 ,并为方便起见定义 σn+10 。设 ϵ 定义为 in(4)。如果 (32)ϵ<min1in(σiσi+1)30mA 满足,则 (33)ϵ<160. 此外,让我们 (34)η(ϵ)2χ(ϵ)(12ϵ)(12ϵχ(ϵ)ϵ2), 得到 (35)max(F˜F,G˜G)(2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ2)mAϵ2min1in(σiσi+1)2η(ϵ)Aϵ2, 其中 χ(ϵ) in(11) η(ϵ) in(34)满足 (36)χ(ϵ)3.068,η(ϵ)6.572.

Proof  证明

Since we have (33) from ϵ<1301mmin1in(σiσi+1)A130121=160in (32), it is easy to see that χ(ϵ) in (11) and η(ϵ) satisfy (36).
由于我们在 (32) 中有 ϵ<1301mmin1in(σiσi+1)A130121=160 (33),因此很容易看出 χ(ϵ) in (11)η(ϵ) 满足 (36)。
First of all, we estimate the diagonal elements of FF˜. From (23), (26), we have (37)(FF˜)+(FF˜)T=Δ1,Δ1χ(ϵ)ϵ2.In addition, we see (38)(GG˜)+(GG˜)T=Δ2,Δ2χ(ϵ)ϵ2in the same manner as (37). Therefore, we obtain (39)max(|fiif˜ii|,|giig˜ii|)χ(ϵ)2ϵ2for1in,|fiif˜ii|χ(ϵ)2ϵ2forn+1im.
首先,我们估计 的 FF˜ 对角线元素。从 (23)、(26) 中,我们有 (37)(FF˜)+(FF˜)T=Δ1,Δ1χ(ϵ)ϵ2. 此外,我们以与 (37) 相同的方式看到 (38)(GG˜)+(GG˜)T=Δ2,Δ2χ(ϵ)ϵ2 。因此,我们获得 (39)max(|fiif˜ii|,|giig˜ii|)χ(ϵ)2ϵ2for1in,|fiif˜ii|χ(ϵ)2ϵ2forn+1im.
Next, we estimate Σ˜Σ. From (25), (28), Σ˜ and Σ are determined as σ˜i=tii(1f˜iig˜ii) and σi=(tiiΔ3(i,i))(1fiigii). Thus, from easy calculations, σ˜iσi=tii(1fiigii)tii(1f˜iig˜ii)(1fiigii)(1f˜iig˜ii)+Δ3(i,i)1fiigii=tii(fiif˜ii+giig˜ii)(1fiigii)(1fiigii+(fiif˜ii+giig˜ii))+Δ3(i,i)1fiigii. On the right hand side, we have (40)|Δ3(i,i)1fiigii|χ(ϵ)Aϵ212ϵ.In addition, |tiiσi|A(2ϵ+χ(ϵ)ϵ2)from (25). It then follows that |tii|A(1+2ϵ+χ(ϵ)ϵ2).Hence, it is easy to see that (41)|tii(fiif˜ii+giig˜ii)(1fiigii)(1fiigii+(fiif˜ii+giig˜ii))|χ(ϵ)A(1+2ϵ+χ(ϵ)ϵ2)ϵ2(12ϵ)(12ϵχ(ϵ)ϵ2).Using (34), we have (42)|σ˜iσi|<η(ϵ)Aϵ2for1in.In addition, we see σ˜i>0 for i=1,,n in view of (43)σ˜i>σi+η(ϵ)Aϵ2>30mAϵ+η(ϵ)Aϵ2=(30mη(ϵ)ϵ)Aϵ>0from (42), (32), and (36).
接下来,我们估计 Σ˜Σ 。从 (25)、(28) Σ˜Σ 确定为 σ˜i=tii(1f˜iig˜ii)σi=(tiiΔ3(i,i))(1fiigii) 。因此,从简单的计算中, σ˜iσi=tii(1fiigii)tii(1f˜iig˜ii)(1fiigii)(1f˜iig˜ii)+Δ3(i,i)1fiigii=tii(fiif˜ii+giig˜ii)(1fiigii)(1fiigii+(fiif˜ii+giig˜ii))+Δ3(i,i)1fiigii. 在右侧,我们有 (40)|Δ3(i,i)1fiigii|χ(ϵ)Aϵ212ϵ. 此外, |tiiσi|A(2ϵ+χ(ϵ)ϵ2) 来自 (25)。因此 |tii|A(1+2ϵ+χ(ϵ)ϵ2). ,很容易看出 (41)|tii(fiif˜ii+giig˜ii)(1fiigii)(1fiigii+(fiif˜ii+giig˜ii))|χ(ϵ)A(1+2ϵ+χ(ϵ)ϵ2)ϵ2(12ϵ)(12ϵχ(ϵ)ϵ2). 使用 (34),我们有 (42)|σ˜iσi|<η(ϵ)Aϵ2for1in. 此外,我们从 (43)σ˜i>σi+η(ϵ)Aϵ2>30mAϵ+η(ϵ)Aϵ2=(30mη(ϵ)ϵ)Aϵ>0 (42)、(32)(36) 中看到 σ˜i>0 for i=1,,n
In what follows, we estimate the off-diagonal elements of F˜ and G˜. Combining (25) with (42), we have (44)Σ˜FTΣ˜Σ˜G=T+Δ˜5,where (45)|Δ˜5(i,j)|(χ(ϵ)+2η(ϵ)ϵ)Aϵ2forij. In addition, from (28), (46)(FF˜)TΣ˜+Σ˜(GG˜)=Δ˜5holds. Using (37), (38), and (46), we estimate the off-diagonal elements of F˜ and G˜.
在下文中,我们估计 F˜G˜ 的非对角线元素。将 (25)(42) 组合,我们得到 (44)Σ˜FTΣ˜Σ˜G=T+Δ˜5, 其中 (45)|Δ˜5(i,j)|(χ(ϵ)+2η(ϵ)ϵ)Aϵ2forij. 此外,从 (28) 中, (46)(FF˜)TΣ˜+Σ˜(GG˜)=Δ˜5 成立。使用 (37)、(38)(46),我们估计 F˜G˜ 的非对角线元素。
Recall f˜ij=f˜ji=rij2 for n+1i,jm in the proposed algorithm. Hence, from (37), (47)|f˜ijfij|χ(ϵ)2ϵ2forn+1i,jm.Next, for 1in,n+1jm, from the bottom part of (46), we have (48)|f˜ijfij|Aσi˜(χ(ϵ)+2η(ϵ)ϵ)ϵ2(χ(ϵ)+2η(ϵ)ϵ)Aϵ2σiη(ϵ)Aϵ2.Combining this with (37), for n+1im,1jn, we have (49)|f˜ijfij|χ(ϵ)ϵ2+Aσj˜(χ(ϵ)+2η(ϵ)ϵ)ϵ2(2χ(ϵ)+2η(ϵ)ϵχ(ϵ)η(ϵ)ϵ2)Aϵ2σjη(ϵ)Aϵ2.Moreover, for 1i,jn,ij, we have (50)(fijf˜ij)+(fjif˜ji)=ϵ1,ij,|ϵ1,ij|χ(ϵ)ϵ2,(51)(gijg˜ij)+(gjig˜ji)=ϵ2,ij,|ϵ2,ij|χ(ϵ)ϵ2,(52)σ˜i(fijf˜ij)+σ˜j(gjig˜ji)=ϵ3,ij,|ϵ3,ij|(χ(ϵ)+2η(ϵ)ϵ)Aϵ2 from (37), (38), (45), and (46).
Recall f˜ij=f˜ji=rij2 for n+1i,jm 在建议的算法中。因此,从 (37)开始, (47)|f˜ijfij|χ(ϵ)2ϵ2forn+1i,jm. 接下来,对于 1in,n+1jm ,从 (46) 的底部,我们有 (48)|f˜ijfij|Aσi˜(χ(ϵ)+2η(ϵ)ϵ)ϵ2(χ(ϵ)+2η(ϵ)ϵ)Aϵ2σiη(ϵ)Aϵ2. Combine this 与 (37),对于 n+1im,1jn ,我们有 (49)|f˜ijfij|χ(ϵ)ϵ2+Aσj˜(χ(ϵ)+2η(ϵ)ϵ)ϵ2(2χ(ϵ)+2η(ϵ)ϵχ(ϵ)η(ϵ)ϵ2)Aϵ2σjη(ϵ)Aϵ2. 此外,对于 1i,jn,ij ,我们有 (50)(fijf˜ij)+(fjif˜ji)=ϵ1,ij,|ϵ1,ij|χ(ϵ)ϵ2,(51)(gijg˜ij)+(gjig˜ji)=ϵ2,ij,|ϵ2,ij|χ(ϵ)ϵ2,(52)σ˜i(fijf˜ij)+σ˜j(gjig˜ji)=ϵ3,ij,|ϵ3,ij|(χ(ϵ)+2η(ϵ)ϵ)Aϵ2 来自 (37)、(38)、(45)(46)。
Similarly to (21), all of fijf˜ij and gijg˜ij are calculated as follows. By multiplying (52) by σ˜i, σ˜i2(fijf˜ij)+σ˜iσ˜j(gjig˜ji)=σ˜iϵ3,ij,σ˜j2(fjif˜ji)+σ˜iσ˜j(gijg˜ij)=σ˜jϵ3,ji, where the second equation is due to the symmetry of i and j. Thus, σ˜i2(fijf˜ij)+σ˜j2(fjif˜ji)+σ˜iσ˜j((gijg˜ij)+(gjig˜ji))=σ˜iϵ3,ij+σ˜jϵ3,ji.Inserting (51) into this yields σ˜i2(fijf˜ij)+σ˜j2(fjif˜ji)=σ˜iϵ3,ij+σ˜jϵ3,jiσ˜iσ˜jϵ2,ji.Combining this and (50), we have (σ˜j2σ˜i2)(fjif˜ji)=σ˜iϵ3,ij+σ˜jϵ3,jiσ˜iσ˜jϵ2,jiσ˜i2ϵ1,ij.Thus, noting σ˜i>0(i=1,,n) as in (43), we have |(σ˜j2σ˜i2)(fjif˜ji)|σ˜i|ϵ3,ij|+σ˜j|ϵ3,ji|+σ˜iσ˜j|ϵ2,ji|+σ˜i2|ϵ1,ij|(σ˜i+σ˜j)(χ(ϵ)+2η(ϵ)ϵ)Aϵ2+σ˜i(σ˜i+σ˜j)χ(ϵ)ϵ2(σ˜i+σ˜j)((χ(ϵ)+2η(ϵ)ϵ)Aϵ2+(A+η(ϵ)Aϵ2)χ(ϵ)ϵ2), where the second inequality is due to (50), (51), and (52), and the third inequality is due to (42) and σiA. Therefore, for 1i,jn,ij, we obtain (53)|f˜ijfij|(2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ2)Aϵ2(σ˜i+σ˜j)|σ˜i2σ˜j2|(2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ2)Aϵ2|σiσj|2η(ϵ)Aϵ2,where the second inequality is due to (42). Similarly, for 1i,jn,ij, we have (54)|g˜ijgij|(2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ2)Aϵ2|σiσj|2η(ϵ)Aϵ2.
(21) 类似,所有 和 fijf˜ij gijg˜ij 的计算方式如下。通过将 (52) 乘以 σ˜iσ˜i2(fijf˜ij)+σ˜iσ˜j(gjig˜ji)=σ˜iϵ3,ij,σ˜j2(fjif˜ji)+σ˜iσ˜j(gijg˜ij)=σ˜jϵ3,ji, 其中第二个方程是由于 和 ji 对称性。因此, σ˜i2(fijf˜ij)+σ˜j2(fjif˜ji)+σ˜iσ˜j((gijg˜ij)+(gjig˜ji))=σ˜iϵ3,ij+σ˜jϵ3,ji.(51) 代入其中,得到 σ˜i2(fijf˜ij)+σ˜j2(fjif˜ji)=σ˜iϵ3,ij+σ˜jϵ3,jiσ˜iσ˜jϵ2,ji. 将 this 和 (50) 组合起来,我们得到 (σ˜j2σ˜i2)(fjif˜ji)=σ˜iϵ3,ij+σ˜jϵ3,jiσ˜iσ˜jϵ2,jiσ˜i2ϵ1,ij. 因此, σ˜i>0(i=1,,n) 注意在 (43) 中,我们有 |(σ˜j2σ˜i2)(fjif˜ji)|σ˜i|ϵ3,ij|+σ˜j|ϵ3,ji|+σ˜iσ˜j|ϵ2,ji|+σ˜i2|ϵ1,ij|(σ˜i+σ˜j)(χ(ϵ)+2η(ϵ)ϵ)Aϵ2+σ˜i(σ˜i+σ˜j)χ(ϵ)ϵ2(σ˜i+σ˜j)((χ(ϵ)+2η(ϵ)ϵ)Aϵ2+(A+η(ϵ)Aϵ2)χ(ϵ)ϵ2), 第二个不等式是由于 (50)、(51)(52) 造成的,第三个不等式是由于 (42)σiA .因此,对于 1i,jn,ij ,我们得到 (53)|f˜ijfij|(2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ2)Aϵ2(σ˜i+σ˜j)|σ˜i2σ˜j2|(2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ2)Aϵ2|σiσj|2η(ϵ)Aϵ2, 第二个不等式是由于 (42) 引起的。同样,对于 1i,jn,ij ,我们有 (54)|g˜ijgij|(2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ2)Aϵ2|σiσj|2η(ϵ)Aϵ2.
From (39), (47), (48), (49), (53), and (54), we have |f˜ijfij|(2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ2)Aϵ2min1kn(σkσk+1)2η(ϵ)Aϵ2for1i,jm,|g˜ijgij|(2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ2)Aϵ2min1kn(σkσk+1)2η(ϵ)Aϵ2for1i,jn. In view of F˜F2i,j|f˜ijfij|2 and G˜G2i,j|g˜ijgij|2, we obtain (35).  □
(39)、(47)、(48)、(49)、(53)(54) 中,我们有 |f˜ijfij|(2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ2)Aϵ2min1kn(σkσk+1)2η(ϵ)Aϵ2for1i,jm,|g˜ijgij|(2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ2)Aϵ2min1kn(σkσk+1)2η(ϵ)Aϵ2for1i,jn. 鉴于 F˜F2i,j|f˜ijfij|2G˜G2i,j|g˜ijgij|2 ,我们得到 (35)。 □
From Lemma 2, the next lemma is readily accessible.
引理 2 开始,下一个引理很容易获得。

Lemma 3  引理 3

Under the same assumption as in Lemma 2, we obtain (55)max(F˜F,G˜G)<65300ϵ,(56)lim supϵ0max(F˜F,G˜G)ϵ26mAmin1in(σiσi+1).
在与引理 2 相同的假设下,我们得到 (55)max(F˜F,G˜G)<65300ϵ,(56)lim supϵ0max(F˜F,G˜G)ϵ26mAmin1in(σiσi+1).

Proof  证明

Noting (36), we have (57)2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ26.360.Therefore, we see (58)F˜F<(2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ2)ϵ30min1in(σiσi+1)30mAϵ2η(ϵ)ϵ30m<6.4ϵ30171800<65300ϵ.Since G˜G<65ϵ300 also holds, we have (55). Combining (35) with χ(0)=3, we obtain (56).  □
注意 (36),我们有 (57)2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ26.360. 因此,我们看到 (58)F˜F<(2χ(ϵ)+2η(ϵ)ϵ+χ(ϵ)η(ϵ)ϵ2)ϵ30min1in(σiσi+1)30mAϵ2η(ϵ)ϵ30m<6.4ϵ30171800<65300ϵ. Since G˜G<65ϵ300 也成立,我们有 (55)。(35)χ(0)=3 组合在一起,我们得到 (56)。 □
On the basis of the above lemmas, we obtain the main theorem that states the quadratic convergence.
根据上述引理,我们得到了陈述二次收敛的主定理。

Theorem 1  定理 1

Let ARm×n,UˆRm×m, andVˆRn×n withmn andm2. Defineσn+10 for the sake of convenience. Define ϵmax(F,G) with F,G satisfyingU=Uˆ(Im+F),V=Vˆ(In+G). Similarly, define ϵmax(F,G) with F,G satisfyingU=Uˆ(Im+F),V=Vˆ(In+G), whereUˆ,Vˆ are obtained in Algorithm 1. If (59)ϵ<min1in(σiσi+1)30mAis satisfied, then (60)ϵ<710ϵ,(61)lim supϵ0ϵϵ218mAmin1in(σiσi+1).
ARm×n UˆRm×m , 和 VˆRn×n mn m2 。为方便起见,定义 σn+10 。定义 ϵmax(F,G) F G 满足 U=Uˆ(Im+F),V=Vˆ(In+G) .同样,使用 satisfying U=Uˆ(Im+F) V=Vˆ(In+G) 定义 F G ϵmax(F,G) ,其中 Uˆ Vˆ 在算法 1 中获得。如果 (59)ϵ<min1in(σiσi+1)30mA 满足,则 (60)ϵ<710ϵ,(61)lim supϵ0ϵϵ218mAmin1in(σiσi+1).

Proof  证明

Define Fα such that U=Uˆ(Im+Fα). Noting Uˆ(Im+Fα)=Uˆ(Im+F) and Uˆ=Uˆ(Im+F˜), we have UˆFα=Uˆ(Im+F)Uˆ=Uˆ(FF˜)=Uˆ(Im+F˜)1(FF˜).It then follows that (62)Fα=(Im+F˜)1(FF˜).Noting (55) and F˜F˜F+F<2ϵ<130 from (33), we have (63)FαFF˜1F˜65300ϵ1130<730ϵ.In Lemma 1, letting UαU, we see (64)F3Fα.Thus we obtain F<710F.Regarding G, it is easy to see that GGG˜1G˜<730ϵin the same manner as (63). Therefore, we obtain (60). Moreover, using (56), (62), and (64), we obtain (61).  □
定义 Fα ,使 U=Uˆ(Im+Fα) .注意 Uˆ(Im+Fα)=Uˆ(Im+F)Uˆ=Uˆ(Im+F˜) ,我们得到 UˆFα=Uˆ(Im+F)Uˆ=Uˆ(FF˜)=Uˆ(Im+F˜)1(FF˜). 然后, (62)Fα=(Im+F˜)1(FF˜). 注意到 (55) 并从 F˜F˜F+F<2ϵ<130 (33) 中,我们有 (63)FαFF˜1F˜65300ϵ1130<730ϵ.引理 1 中,让 UαU ,我们看到 (64)F3Fα. 因此我们得到 F<710F. 关于 G ,很容易看出它与 GGG˜1G˜<730ϵ (63) 相同。因此,我们得到 (60)。此外,使用 (56)、(62)(64),我们得到 (61)。 □

Remark 4  注 4

From (42), singular values are convergent, where the rate can be estimated by η(ϵ)Aϵ2 that is quadratically convergent.
(42) 中,奇异值是收敛的,其中速率可以通过 η(ϵ)Aϵ2 来估计是二次收敛的。

4. Numerical results  4. 数值结果

We present numerical results to demonstrate the effectiveness of the proposed algorithm (Algorithm 1). The numerical experiments were conducted using MATLAB R2017b on a PC with 2.5 GHz Intel Core i7 and 16 GB of main memory. To realize multiple-precision arithmetic, we adopt Advanpix Multiprecision Computing Toolbox version 4.6.0 [15], which utilizes well-known, fast, and reliable multiple-precision arithmetic libraries including GMP and MPFR. In the multiple-precision toolbox, we can control the arithmetic precision d in decimal digits using the command mp.Digits (d).
我们提供了数值结果来证明所提出的算法的有效性 (算法 1)。数值实验是在配备 2.5 GHz Intel Core i7 和 16 GB 主内存的 PC 上使用 MATLAB R2017b 进行的。为了实现多精度算术,我们采用了 Advanpix Multiprecision Computing Toolbox 版本 4.6.0 [15],它利用了众所周知、快速且可靠的多精度算术库,包括 GMP 和 MPFR。在多精度工具箱中,我们可以使用命令 mp 控制以十进制数字为单位的算术精度 d 。数字d )。

4.1. Convergence property
4.1. 收敛属性

First, we confirm the convergence property of the proposed algorithm for various singular value distributions. We generate m×n rectangular real matrices using Higham’s randsvd [16] by the following MATLAB command.
  1. Download: Download high-res image (12KB)
  2. Download: Download full-size image
The singular value distribution and condition number of A can be controlled by the input arguments mode{1,2,3,4,5} and cndα1, as follows:
首先,我们确认了所提出的算法对各种奇异值分布的收敛特性。我们使用以下 MATLAB 命令使用 Higham 的 randsvd [16] 生成 m×n 矩形实矩阵。
  1. Download: Download high-res image (12KB)
  2. Download: Download full-size image
的奇异值分布和条件数 A 可以通过输入参数 mode{1,2,3,4,5}cndα1 来控制,如下所示:
  • 1.
    one large: σ11, σiα1, i=2,,n
    一个大: σ11σiα1i=2,,n
  • 2.
    one small: σnα1, σi1, i=1,,n1
    一小: σnα1σi1i=1,,n1
  • 3.
    geometrically distributed: σiα(i1)(n1), i=1,,n
    几何分布: σiα(i1)(n1)i=1,,n
  • 4.
    arithmetically distributed: σi1(1α1)(i1)(n1), i=1,,n
    算术分布: σi1(1α1)(i1)(n1)i=1,,n
  • 5.
    random with uniformly distributed logarithm: σiαr(i), i=1,,n, where r(i) are pseudo-random values drawn from the standard uniform distribution on (0,1).
    具有均匀分布对数的 random: σiαr(i)i=1,,n , 其中 r(i) 是从 上的 (0,1) 标准均匀分布中提取的伪随机值。
Here, κ(A)cnd for cnd<u11016. Note that for mode{1,2}, there is a cluster of singular values.
在这里, κ(A)cnd 对于 cnd<u11016 .请注意,对于 mode{1,2} ,存在一组奇异值。
We start with small examples such as m=10 and n=5 to observe the convergence behavior of the algorithm. Moreover, we set cnd=108 to generate moderately ill-conditioned problems in binary64. We compute U(0), V(0) as initial approximate left and right singular vector matrices using the MATLAB function svd for the singular value decomposition in binary64 arithmetic. To see the behavior of the proposed algorithm precisely, we use multiple-precision arithmetic with sufficiently long precision to simulate the exact arithmetic in the algorithm. Then, we expect that Algorithm 1 (RefSVD) works effectively for mode{3,4,5}, but does not for mode{1,2}. For reference, we also use the built-in function svd in the multiple-precision toolbox to compute the singular values σi, i=1,2,,n. The results are shown in Fig. 1, which provides max1in|σˆiσi||σi| as the maximum relative error of the computed singular values σˆi, max(R,S) where RIUˆTUˆ and SIVˆTVˆ as the orthogonality of computed left and right singular vector matrices, offdiag(UˆTAVˆ)A as the diagonality of UˆTAVˆ, and max(F˜,G˜) where F˜ and G˜ are computed in Algorithm 1. Here, offdiag() denotes the off-diagonal part. The horizontal axis shows the number of iterations ν of Algorithm 1.
我们从 m=10n=5 等小例子开始,观察算法的收敛行为。此外,我们设置 cnd=108 在 binary64 中生成中度病态问题。我们使用 MATLAB 函数 svd 计算 U(0)V(0) 作为初始近似左和右奇异向量矩阵,用于 binary64 算术中的奇异值分解。为了精确地查看所提出的算法的行为,我们使用具有足够长精度的多精度算术来模拟算法中的精确算术。然后,我们期望算法 1 (RefSVD) 对 有效, mode{3,4,5}mode{1,2} 对 无效。作为参考,我们还使用多精度工具箱中的内置函数 svd 来计算奇异值 σii=1,2,,n 结果如图 1 所示,其中 提供 max1in|σˆiσi||σi| 计算奇异值 σˆi 的最大相对误差 , max(R,S) 其中 RIUˆTUˆSIVˆTVˆ 作为计算的左和右奇异向量矩阵的正交性, offdiag(UˆTAVˆ)A 作为 的 UˆTAVˆ 对角线, max(F˜,G˜) 其中 F˜G˜ 在算法 1 中计算。这里, offdiag() 表示非对角线部分。横轴显示算法 1 的迭代 ν 次数。
In the case of mode{3,4,5}, all the quantities decrease quadratically in every iteration, i.e., we observe the quadratic convergence of Algorithm 1, as expected. On the other hand, in the case of mode{1,2}, the algorithm fails to improve the accuracy of approximate singular vector matrices because the test matrices for mode{1,2} have clustered singular values. In fact, the assumption (59) for the convergence of Algorithm 1 is not satisfied.
在 的情况下 mode{3,4,5} ,所有量在每次迭代中都呈二次方递减,即,正如我们预期的那样,我们观察到算法 1 的二次收敛。另一方面,在 mode{1,2} 的情况下,该算法无法提高近似奇异向量矩阵的准确率,因为 的 mode{1,2} 测试矩阵具有聚类奇异值。事实上,算法 1 收敛的假设 (59) 不满足。
  1. Download: Download high-res image (450KB)
    下载: 下载高分辨率图片 (450KB)
  2. Download: Download full-size image
    下载:下载全尺寸图像

Fig. 1. Results of iterative refinement by Algorithm 1 (RefSVD) in sufficiently long precision arithmetic for randsvd matrices.
图 1.算法 1 ( RefSVD ) 在足够长的矩阵精度算术 randsvd 中迭代细化的结果。

4.2. Computational speed  4.2. 计算速度

To evaluate the computational speed of the proposed algorithm, we compare the computing time of Algorithm 1 to that of an approach using multiple-precision arithmetic, which is called “MP-approach”. In the multiple-precision toolbox, LAPACK’s routine xGESDD, which is based on a divide-and-conquer method, is implemented sophisticatedly with parallelism to solve singular value problems.
为了评估所提出的算法的计算速度,我们将算法 1 的计算时间与使用多精度算术的方法(称为 “MP-approach”)的计算时间进行了比较。在多精度工具箱中,LAPACK 的例程 xGESDD 基于分而治之法,通过并行性复杂地实现,以解决奇异值问题。
We generate a pseudo-random real n×n matrix with n{500,1000} using the MATLAB function randn such as A = randn(n). We use the MATLAB function svd in binary64, and iteratively refine the computed left and right singular vectors using Algorithm 1 twice. In Algorithm 1, for matrix multiplication at steps 1 and 8 we adopt a fast and accurate algorithm [11] using IEEE 754 binary64 (double precision) as working precision, and for other parts we use the multiple-precision toolbox with necessary arithmetic precision dν for ν=1,2, where ν denotes the iteration number of Algorithm 1. Since the binary64 arithmetic is used for obtaining initial guesses Σˆ0, Uˆ0, and Vˆ0, it is reasonable for the binary128 (quadruple precision) arithmetic to be used for ν=1 in order to achieve the quadratic convergence of the proposed algorithm. In the multiple-precision toolbox, the binary128 arithmetic can be realized when d=34 for mp.Digits (d), and we set d1=34. For ν=2, we determine d2 by estimating the error of Uˆ0 and Vˆ0 using ε1max(F˜0,G˜0) where F˜0 and G˜0 can be obtained at the first iteration (ν=1). Since we expect that the error of Uˆ1 and Vˆ1 is of the order of ε12, the computational precision required in the second iteration should correspond to (ε12)2=ε14. Thus, we set d2=4log10ε11. In the MP-approach, we adjust the arithmetic precision d to d1 and d2 corresponding to Algorithm 1. Note that the case for d=34 is specially tuned in the multiple-precision toolbox and faster than that for d<34, and we do not set d such that d<34 for timing fairness.
n{500,1000} 我们使用 MATLAB 函数 randn 生成一个伪随机实 n×n 矩阵,例如 A = randn(n)。我们在 binary64 中使用 MATLAB 函数 svd,并使用算法 1 迭代优化计算出的左和右奇异向量两次。在算法 1 中,对于步骤 1 和 8 的矩阵乘法,我们采用快速准确的算法 [11],使用 IEEE 754 binary64(双精度)作为工作精度,对于其他部分,我们使用具有必要算术精度 dν 的多精度工具箱, ν=1,2 其中 ν 表示算法 1 的迭代次数。由于 binary64 算术用于获取初始猜测值 Σˆ0Uˆ0Vˆ0 ,因此使用 ν=1 binary128(四倍精度)算术是合理的,以实现所提算法的二次收敛。在多精度工具箱中,当 d=34 for mp.数字d ),我们设置 d1=34 。对于 ν=2 ,我们 d2 通过估计 的误差 Uˆ0 并使用 Vˆ0 ε1max(F˜0,G˜0) where F˜0 来确定 ,并且可以 G˜0 在第一次迭代 ( ) 时获得 ν=1 。由于我们预计 Uˆ1 的误差为 , Vˆ1 ε12 因此第二次迭代所需的计算精度应对应于 (ε12)2=ε14 。因此,我们设置 d2=4log10ε11 。在 MP 方法中,我们将算术精度 d 调整为 d1d2 对应于算法 1。请注意,case for d=34 在 multiple-precision 工具箱中进行了专门调整,并且比 for d<34 更快,我们没有设置 d that d<34 以实现 timing fairness。
In Table 1, Table 2, we show AUˆΣˆVˆTA as the relative residual norm, max(R,S) as the orthogonality of Uˆ and Vˆ, and the measured computing time. In addition, we show max(F˜,G˜) in Algorithm 1 for each iteration. As can be seen from max(F˜ν,G˜ν) in the tables, Algorithm 1 quadratically improves the accuracy of the computed singular vectors. The residual AUˆνΣˆνVˆνTA decreases and the orthogonality max(Rν,Sν) is improved when the iteration number ν increases. Moreover, Algorithm 1 is considerably faster than the MP-approach.
表 1表 2 中,我们表示 AUˆΣˆVˆTA 为相对残差范数, max(R,S) Uˆ 以及 和 Vˆ 的正交性,以及测得的计算时间。此外,我们在 算法 1 中显示了 max(F˜,G˜) 每次迭代的 Algorithm 1。从 max(F˜ν,G˜ν) 表中可以看出,算法 1 二次方提高了计算的奇异向量的精度。当迭代次数 ν 增加时,残差 AUˆνΣˆνVˆνTA 减少,正交性 max(Rν,Sν) 提高。此外,算法 1 比 MP 方法快得多。

Acknowledgments  确认

The authors wish to thank the anonymous referees for their valuable comments. This study was partially supported by JST CREST Grant Number JPMJCR14D4 and JSPS KAKENHI Grant Numbers 16H03917, 17K14143.
作者感谢匿名审稿人的宝贵意见。这项研究得到了 JST CREST Grant Number JPMJCR14D4 和 JSPS 的部分支持 KAKENHI 授权号 16H0391717K14143

Table 1. Results for a pseudo-random real 500 × 500 matrix.
表 1.伪随机实数 500 × 500 矩阵的结果。

Algorithm 1  算法 1ν=0 (svd in binary64)
ν=0Binary64 中的 SVD
ν=1 (d1=34)
ν=1d1=34
ν=2 (d2=44)
ν=2d2=44
max(F˜ν,G˜ν)1.73×10111.50×10223.40×1044
AUˆνΣˆνVˆνTA6.73×10152.03×10224.75×1044
max(Rν,Sν)6.55×10152.99×10226.76×1044
Accumulated elapsed time (s)
累计运行时间 (s)
0.051.245.54
MP-approach  MP 方法mp.Digits (d)
MP.数字d
d=34d=44
AUˆΣˆVˆTA5.96×10334.71×1043
max(R,S)7.72×10334.65×1043
Elapsed time (s)  经过时间 (s)18.8073.92

Table 2. Results for a pseudo-random real 1000 × 1000 matrix.
表 2.伪随机实数 1000 × 1000 矩阵的结果。

Algorithm 1  算法 1ν=0 (svd in binary64)
ν=0Binary64 中的 SVD
ν=1 (d1=34)
ν=1d1=34
ν=2 (d2=39)
ν=2d2=39
max(F˜ν,G˜ν)2.1×10102.1×10208.5×1040
AUˆνΣˆνVˆνTA1.0×10144.2×10201.6×1039
max(Rν,Sν)1.0×10144.2×10201.6×1039
Accumulated elapsed time (s)
累计运行时间 (s)
0.316.0723.48
MP-approach  MP 方法mp.Digits (d)
MP.数字d
d=34d=39
AUˆΣˆVˆTA6.61×10336.08×1038
max(R,S)9.77×10338.17×1038
Elapsed time (s)  经过时间 (s)131.201338.50

References

Cited by (7)

View all citing articles on Scopus