Technical Note

Real-Time 3D Mapping in Complex Environments Using a Spinning Actuated LiDAR System

School of Geodesy and Geomatics, Wuhan University, Wuhan 430079, China
Hubei Luojia Laboratory, Wuhan University, Wuhan 430079, China
Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(4), 963; https://doi.org/10.3390/rs15040963
Submission received: 28 December 2022 / Revised: 3 February 2023 / Accepted: 6 February 2023 / Published: 9 February 2023

Abstract

LiDAR is a crucial sensor for 3D environment perception. However, limited by the field of view of the LiDAR, it is sometimes difficult to achieve complete coverage of the environment with a single LiDAR.

In this paper, we designed a spinning actuated LiDAR mapping system that is compatible with both UAV and backpack platforms and propose a tightly coupled laser–inertial SLAM algorithm for it. In our algorithm, edge and plane features in the point cloud are first extracted.

Then, for the significant changes in the distribution of point cloud features between two adjacent scans caused by the continuous rotation of the LiDAR, we employed an adaptive scan accumulation method to improve the stability and accuracy of point cloud registration.

After feature matching, the LiDAR feature factors and IMU pre-integration factor are added to the factor graph and jointly optimized to output the trajectory. In addition, an improved loop closure detection algorithm based on the Cartographer algorithm is used to reduce the drift.

We conducted exhaustive experiments to evaluate the performance of the proposed algorithm in complex indoor and outdoor scenarios.

The results showed that our algorithm is more accurate than the state-of-the-art algorithms LIO-SAM and FAST-LIO2 for the spinning actuated LiDAR system, and it can achieve real-time performance.

Graphical Abstract

1. Introduction

Light Detection and Ranging (LiDAR) can achieve highly accurate and efficient distance measurement. It is insensitive to illumination changes because it measures by actively emitting a pulsed laser.
By combining LiDAR with Simultaneous Localization and Mapping (SLAM) technology, the laser SLAM system can obtain a three-dimensional map of the surrounding environment both indoors and outdoors.
These advantages make the laser SLAM system play an important role in many fields, such as autonomous driving [1], building inspection [2], forestry investigation [3], etc. Currently, the LiDAR used for laser SLAM can be broadly divided into two categories. Mechanical LiDAR is the most commonly used type. Its horizontal Field Of View (FOV) can reach nearly 360°, but its vertical FOV is very limited.
In addition, with the development of microelectronics technology, solid-state LiDAR, e.g., the DJI Livox series, has become increasingly common [4,5]. Generally, solid-state LiDAR can provide a larger vertical FOV than mechanical LiDAR, but its horizontal FOV is much smaller. Therefore, in many scenarios, neither type of LiDAR can completely cover the whole environment.
This shortcoming greatly limits the mapping efficiency of laser SLAM systems, especially when using a platform with limited endurance such as an Unmanned Aerial Vehicle (UAV) to carry the mapping system [6]. In narrow environments, this limited FOV will degrade the accuracy and stability of localization as only a small number of objects can be scanned.
Much research has been conducted to expand the FOV of LiDAR, and the solutions can be roughly divided into two categories. The first category is to combine multiple LiDARs [7,8,9,10,11]. Such systems typically use a horizontally mounted LiDAR to provide primary positioning information and a vertically mounted LiDAR to improve the scanning coverage. Its advantage is that the mechanical structure is simple.
However, due to the high price of LiDARs, the cost of these solutions is greatly increased. Another category of solutions is to actuate the LiDAR dynamically, and LOAM [12] is one of the most-famous methods among them. LOAM first accumulates the point cloud with the assistance of the IMU and then matches the accumulated point cloud with the global map to correct the accumulated errors.
Considering that the prediction error of the IMU grows rapidly with time and that the feature extraction and matching algorithms need to wait for sufficient data to be accumulated, this time interval becomes a bottleneck limiting the accuracy of the SLAM system.
To address this problem, we designed a mapping system based on a spinning actuated multi-beam LiDAR. By using a multi-beam LiDAR, more information can be obtained in each scan, which helps to increase the frequency of point cloud matching and generate a denser point cloud map.
We first extract feature points from each point cloud frame through an improved feature extraction method based on LOAM.
Then, we judge whether the current point cloud contains enough information by analyzing the spatial distribution of feature points and perform scan-to-map matching once the requirement is met.
Compared with accumulating a fixed number of scans before each registration, this method strikes a better balance between the matching frequency of point clouds and the accuracy and reliability of the matching results.
Finally, in order to eliminate the accumulated error, we added a loop closure detection module to the algorithm. The main contributions of this paper can be summarized as follows:
  • We propose a tightly coupled laser–inertial SLAM algorithm named Spin-LOAM for a spinning actuated LiDAR system.
  • We propose an adaptive scan accumulation method that improves the accuracy and reliability of matching by analyzing the spatial distribution of feature points.
  • Extensive experiments were conducted in indoor and outdoor environments to verify the effectiveness of our algorithm.

2. Related Work

SLAM is a fundamental problem in robotics, and numerous SLAM algorithms based on LiDAR have been proposed. LeGO-LOAM [13] extracts the ground point cloud from the real-time scan results to improve the accuracy in the elevation direction. T-LOAM [14] simultaneously extracts edge features, sphere features, planar features, and ground features to improve the matching accuracy. The works [15,16,17] refined the localization result by introducing plane constraints. SuMa++ [18] not only uses geometric features in point clouds, but also introduces semantic information to assist point cloud matching. Besides these LiDAR-only odometry algorithms, many multi-sensor-fusion-based algorithms have been proposed to improve the accuracy and robustness.
LIO-SAM [19] combines the IMU pre-integration factor with the LiDAR odometry factor through a factor graph. LIO-Mapping [20] integrates the LiDAR and IMU in a tightly coupled fashion. FAST-LIO [21] adopts a tightly coupled iterated extended Kalman filter on a manifold to fuse the data and is accelerated by introducing an incremental KD-Tree in FAST-LIO2 [22]. CLINS [23] fuses LiDAR and IMU data by representing trajectories as a continuous-time function, and this framework is well compatible with arbitrary-frequency data from other asynchronous sensors.
For the purposes of acquiring complete 3D information of the environment, researchers have proposed many actuated LiDAR systems. Bosse et al. [24] designed a mapping system named Zebedee, which connects the sensors to the platform via springs. By treating the trajectory as a function of time, a surfel-based matching algorithm was adopted to estimate the 6-DOF pose. Kaul et al. [25] proposed a passively actuated rotating LiDAR system for UAV mapping, and they used a continuous-time SLAM algorithm to produce the trajectory. However, it cannot process the data in real-time. Park et al. [26] addressed this issue by introducing map deformation to replace the original global trajectory optimization in continuous-time SLAM. Fang et al. [27] proposed a two-stage matching algorithm to estimate the trajectory of a rotating LiDAR.

In the algorithm, the distortion of the current point cloud was first removed using the estimated motion generated by matching it with the local map, and then, the undistorted point cloud was matched with the global map. Unlike the aforementioned works, Zhan et al. [28] used a rotating multi-beam LiDAR for 3D mapping and combined it with stereo cameras for dense 3D reconstruction. The precision of this system is very high, but it needs to remain stationary while collecting data. R-LOAM [29] improves the localization accuracy of rotating LiDAR by leveraging prior knowledge about a reference object. Karimi et al. [30] proposed an actuated LiDAR system using the Lissajous pattern [31]. By using the scan slice instead of the full-sweep point cloud to match with the global map, they achieved low-latency localization for a UAV without an IMU in an indoor environment.

3. System Overview

We first introduce the hardware system used in the study. As shown in Figure 1, our device mainly consists of a laser scanner, a stepper motor, and an IMU, in which the scanner is driven to rotate by the motor. The rotation angle is recorded by an encoder. The IMU is rigidly attached to the platform, so we regarded the IMU frame {I} as the body frame {B} for simplicity. The LiDAR frame is denoted as {L}, and the fixed LiDAR frame {FL} coincides with the initial LiDAR frame when the motor is reset. The Y-axis of the motor frame {M} is aligned with the spin axis. The two extrinsic parameters $T_I^{FL} \in SE(3)$ and $T_M^{FL} \in SE(3)$ are both calibrated manually, where $T_I^{FL}$ represents the transformation from the frame {FL} to the frame {I} and $T_M^{FL}$ represents the transformation from the frame {FL} to the frame {M}. The timestamp of each sensor is synchronized at the hardware level to ensure accuracy.
Figure 1. The mechanical structure of the hardware device.
Figure 2 provides an overview of our SLAM algorithm. In the front-end, first, the IMU measurements are used to construct the pre-integration factor and produce pose predictions.

Next, the raw LiDAR point cloud is transformed to the fixed LiDAR frame and de-skewed using the pose predictions and motor encoder data. Then, edge and plane feature points are extracted from the de-skewed point cloud.

In the scan-to-map registration module, the correspondences of these feature points and the global map are established. The spatial distribution of these matching point pairs is examined to decide whether they are to be added to the factor graph or accumulated to the next scan.

In the back-end, the IMU pre-integration factor and LiDAR factors are jointly optimized to estimate the system states, as shown in Figure 3. In order to bound the amount of computation, only the latest state is optimized when no loop closure constraints are added. After optimization, the feature points are added to the global map using the optimized state.

Loop closure detection is performed periodically in the background to reduce drift.
Figure 2. System overview of the proposed laser–inertial SLAM algorithm for the spinning actuated LiDAR system.
Figure 3. Structure of the factor graph. Only the LiDAR feature factor of the latest scan is directly used in the factor graph, and the others are replaced by the odometry factors, which represent the relative transformation between two adjacent scans.

4. Methodology

The system state $x$ to be estimated at time $t$ is defined as
$$x_t = \left[ R_t,\; p_t,\; v_t,\; b_t^{\omega},\; b_t^{a} \right],$$
where $b_t^{\omega} \in \mathbb{R}^3$ and $b_t^{a} \in \mathbb{R}^3$ are the gyroscope and accelerometer biases, respectively. $R_t \in SO(3)$, $p_t \in \mathbb{R}^3$, and $v_t \in \mathbb{R}^3$ are the orientation, position, and velocity of the sensor platform in the global coordinate frame {G}, respectively.

4.1. IMU Processing

The platform’s angular velocity $\omega_t$ and acceleration $a_t$ in the IMU frame can be obtained by the IMU, but the raw measurements of the IMU are corrupted by noise and bias. Commonly, the slow variations in the bias are modeled with Brownian motion; hence, the IMU measurement model is given by
$$\tilde{\omega}_t = \omega_t + b_t^{\omega} + \eta^{\omega}, \qquad \tilde{a}_t = a_t - R_t^{\top} g + b_t^{a} + \eta^{a},$$
$$\dot{b}_t^{\omega} = \eta^{b\omega}, \qquad \dot{b}_t^{a} = \eta^{ba},$$
where $g$ is the gravity vector; $\eta^{\omega}$, $\eta^{a}$, $\eta^{b\omega}$, and $\eta^{ba}$ are white Gaussian noise with standard deviations $\sigma_{\eta\omega}$, $\sigma_{\eta a}$, $\sigma_{\eta b\omega}$, and $\sigma_{\eta ba}$, respectively.

4.1.1. Pose Prediction

Based on the posterior state $\hat{x}_{k-1}$ at the previous moment and the IMU measurements in $[t_{k-1}, t_k]$, we can use Euler integration to calculate the predicted poses. The recursive formula is
$$R_{i+1} = R_i\, \mathrm{Exp}\!\left( (\tilde{\omega}_i - b_i^{\omega})\, \Delta t \right),$$
$$v_{i+1} = v_i + g\, \Delta t + R_i (\tilde{a}_i - b_i^{a})\, \Delta t,$$
$$p_{i+1} = p_i + v_i\, \Delta t + \tfrac{1}{2} g\, \Delta t^2 + \tfrac{1}{2} R_i (\tilde{a}_i - b_i^{a})\, \Delta t^2,$$
where $\mathrm{Exp}(\cdot)$ is the exponential map of the manifold $SO(3)$.
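The Euler-integration step above can be sketched as follows; this is a minimal numpy sketch, and the helper names (`so3_exp`, `predict_pose`) are ours, not from the paper:

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix of a 3-vector."""
    return np.array([[0, -v[2], v[1]],
                     [v[2], 0, -v[0]],
                     [-v[1], v[0], 0]])

def so3_exp(phi):
    """Exponential map of SO(3): rotation vector -> rotation matrix (Rodrigues)."""
    theta = np.linalg.norm(phi)
    if theta < 1e-9:
        return np.eye(3) + skew(phi)
    a = phi / theta
    K = skew(a)
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def predict_pose(R, p, v, gyro, acc, b_w, b_a, g, dt):
    """One Euler-integration step of the IMU kinematics."""
    w = gyro - b_w          # bias-corrected angular velocity
    a = acc - b_a           # bias-corrected specific force (body frame)
    a_world = R @ a + g     # acceleration in the global frame
    p_next = p + v * dt + 0.5 * a_world * dt ** 2
    v_next = v + a_world * dt
    R_next = R @ so3_exp(w * dt)
    return R_next, p_next, v_next
```

Integrating all IMU samples between two LiDAR timestamps this way yields the pose predictions used for de-skewing and as the registration initial value.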
These poses are sufficient to de-skew the point cloud and provide an initial value for registration. However, to help reject the outliers in Section 4.4, the covariance of these predicted poses also needs to be estimated. The recursive formula of the error-state covariance propagation is
$$P_{i+1} = F_i P_i F_i^{\top} + G_i Q G_i^{\top},$$
where $\omega = \tilde{\omega}_i - b_i^{\omega}$, $a = \tilde{a}_i - b_i^{a}$, $F_i$ and $G_i$ are the Jacobians of the discrete error-state dynamics with respect to the state and the noise, and $P_i$ is the covariance matrix of the system state at time $t_i$. The diagonal elements of the noise covariance matrix $Q$ are $\sigma_{\eta\omega}^2$, $\sigma_{\eta a}^2$, $\sigma_{\eta ba}^2$, and $\sigma_{\eta b\omega}^2$. $J_r$ is the right Jacobian of $SO(3)$. Note that the covariance of $R$ is defined in the tangent space.

4.1.2. IMU Pre-Integration

The IMU pre-integration technique was first proposed in [32], and it has been widely applied in SLAM research. It uses the IMU measurements between $(t_{k-1}, t_k)$ to establish the constraint between the two states $x_{k-1}$ and $x_k$. The IMU pre-integration factor is calculated as follows:
$$\Delta R_{k-1,k} = R_{k-1}^{\top} R_k, \qquad \Delta v_{k-1,k} = R_{k-1}^{\top} \left( v_k - v_{k-1} - g\, \Delta t_{k-1,k} \right),$$
$$\Delta p_{k-1,k} = R_{k-1}^{\top} \left( p_k - p_{k-1} - v_{k-1}\, \Delta t_{k-1,k} - \tfrac{1}{2} g\, \Delta t_{k-1,k}^2 \right),$$
where $\Delta R$, $\Delta v$, and $\Delta p$ are the relative motion between the two timestamps $t_{k-1}$ and $t_k$. More details about the on-manifold IMU pre-integration can be found in [33].
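As a sketch (our numpy implementation, not the paper's code), the pre-integrated quantities can be accumulated directly from bias-corrected IMU samples, independently of the absolute state, which is what makes them reusable when the linearization point changes:

```python
import numpy as np

def rotvec_to_mat(phi):
    """Rodrigues formula: rotation vector -> rotation matrix."""
    theta = np.linalg.norm(phi)
    if theta < 1e-9:
        return np.eye(3)
    a = phi / theta
    K = np.array([[0, -a[2], a[1]], [a[2], 0, -a[0]], [-a[1], a[0], 0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)

def preintegrate(gyro_meas, acc_meas, b_w, b_a, dt):
    """Accumulate the relative motion (dR, dv, dp) between two keyframes
    from raw IMU samples; gravity is handled in the residual, not here."""
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro_meas, acc_meas):
        a_c = a - b_a
        dp = dp + dv * dt + 0.5 * (dR @ a_c) * dt ** 2
        dv = dv + (dR @ a_c) * dt
        dR = dR @ rotvec_to_mat((w - b_w) * dt)
    return dR, dv, dp
```

The returned triplet forms the measurement of the pre-integration factor connecting $x_{k-1}$ and $x_k$ in the factor graph.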

4.2. Feature Extraction

The raw point cloud is measured in the LiDAR frame and distorted by the sensor’s motion. Therefore, before feature extraction, it needs to be transformed to the frame {FL} and de-skewed. Suppose $c_i^L$ is a point in the raw point cloud scan $C = \{c_0, c_1, \ldots, c_n\}$:
$$c_i^{FL} = \left(T_I^{FL}\right)^{-1} T_{B_k B_i}\, T_I^{FL} \left(T_M^{FL}\right)^{-1} R_i^M\, T_M^{FL}\, c_i^L,$$
where $R_i^M$ is the rotation matrix generated by the encoder data at time $t_i$, which represents the rotation of the LiDAR frame relative to the fixed LiDAR frame in the motor frame. $T_{B_k B_i} = T_{B_k}^{-1} T_{B_i}$, where $T_{B_i}$ and $T_{B_k}$ are the poses of the body frame obtained by linear interpolation of the predicted poses, and $t_k$ is the timestamp of the latest point in the point cloud scan $C$.
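The de-skewing step can be sketched as below. This is our reading of the transformation chain (function name and argument layout are ours), with the encoder rotation already conjugated into the fixed LiDAR frame and the extrinsic passed as a 4x4 homogeneous matrix:

```python
import numpy as np

def deskew_point(c_L, R_motor, T_Bk_Bi, T_I_FL):
    """Map a raw LiDAR point to the fixed LiDAR frame at the scan-end time t_k.
    c_L:     3-vector, raw point in the LiDAR frame at time t_i
    R_motor: 3x3 encoder rotation at t_i (LiDAR frame -> fixed LiDAR frame)
    T_Bk_Bi: 4x4 relative body motion T_{Bk}^{-1} T_{Bi} from interpolated poses
    T_I_FL:  4x4 extrinsic between the fixed LiDAR frame and the body/IMU frame
    """
    c_fl = R_motor @ np.asarray(c_L)                 # undo the motor rotation
    c_h = np.append(c_fl, 1.0)                       # homogeneous coordinates
    # motion compensation: into the body frame, apply the relative motion, back
    c_out = np.linalg.inv(T_I_FL) @ T_Bk_Bi @ T_I_FL @ c_h
    return c_out[:3]
```

When the platform is stationary and the motor is at its reset position, all transforms are identity and the point passes through unchanged.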
Our feature extraction method extracts planar features $F_p$ and edge features $F_e$ from the input point cloud, as shown in Figure 4. The workflow of the method is as follows:
Figure 4. Illustration of the edge point extraction. As described in Steps 3 and 4, there are two types of edge points. The first type of edge point is at the end of the wall, and the second type of edge point is at the corners.

These edge points can be identified from unstructured points by analyzing the properties of the points in their neighborhoods.
1. For a point $c_i \in C$, find its previous neighbors $N_i^{pre} = \{c_{i-k}, \ldots, c_{i-1}\}$ and succeeding neighbors $N_i^{succ} = \{c_{i+1}, \ldots, c_{i+k}\}$ in the same scan line.
2. Calculate the features $\alpha$ and $\beta$ of the point, where $\alpha$ indicates the changing angle of the scan line at the point $c_i$ and is used to characterize the smoothness; points with $\alpha < \alpha_{thr}$ are labeled as smooth points. $\beta$ is the ratio of the distances from point $c_i$ to $c_{i-1}$ and $c_{i+1}$, which is used to determine whether the point is an edge point.
3. For a point $c_i$ with $\beta > \beta_{thr}$, if all points in its closer neighbors (depending on whether the closest point belongs to $N_i^{pre}$ or $N_i^{succ}$) are smooth points, add $c_i$ to $F_e$.
4. For a point $c_i$ with $\beta \le \beta_{thr}$, if all points in its previous and succeeding neighbors are smooth points, add $c_i$ to the candidate set of edge points.
5. Use the standard LOAM-based method to extract planar features $F_p$ and edge points $F_e$, except that the edge points must belong to the candidate set.
This modification is based on the consideration that there are many unstructured objects (e.g., vegetation) in outdoor environments, which degrade the performance of edge extraction. A comparison result is shown in Figure 5. In our experiments, $\alpha_{thr}$ was set to 15°, and $k$ and $\beta_{thr}$ were set to 2 and 4, respectively, according to [34].
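A minimal sketch of Step 2 follows. This is our interpretation: we take α as the deviation of the scan line from a straight segment at $c_i$, and β as the larger-over-smaller ratio of the two neighbor distances; the paper's exact formulas may differ:

```python
import numpy as np

def point_features(c_prev, c, c_next):
    """Smoothness angle alpha (degrees) and distance ratio beta at point c.
    alpha is 0 when the three points are collinear; beta >= 1 compares the
    distances from c to its two neighbors (max/min is our assumption)."""
    c_prev, c, c_next = map(np.asarray, (c_prev, c, c_next))
    v1, v2 = c_prev - c, c_next - c
    cos_a = np.dot(-v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    alpha = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
    d1, d2 = np.linalg.norm(v1), np.linalg.norm(v2)
    beta = max(d1, d2) / min(d1, d2)
    return alpha, beta
```

With the thresholds in the text, a collinear equidistant triplet (α = 0, β = 1) is a smooth interior point, while a 90° corner exceeds α_thr = 15° and is an edge candidate.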
Figure 5. Feature point extraction result. The green points are the raw point cloud. The edge points extracted by the standard LOAM-based extraction method are shown in blue, and the edge points extracted by our method are shown in red.
It can be seen that many blue points are located in the vegetation, and this noise will affect the accuracy of the alignment.

4.3. Scan-to-Map Registration

Similar to [19], for an input point cloud scan $C$, we used the pose predicted by the IMU to find the adjacent point cloud scans in the global map and merged them into the feature maps $M_p$ and $M_e$ for data association. The map is downsampled by a voxel filter to accelerate the computation. Then, we find the $k$ nearest neighboring points $N_f$ of each feature point $f_i \in F_p$ or $F_e$ in the corresponding feature map and denote its nearest point as $m_i$. For $f_i \in F_p$, the plane normal vector $n_p$ of its $N_f$ is computed, and for $f_i \in F_e$, the line direction $n_e$ of its $N_f$ is computed. In addition, as the LiDAR is continuously rotating, to ensure the reliability of the matching in the initial stage, the initial global map is constructed by keeping the sensor platform stationary for about 3 s.
According to the results of feature matching, the LiDAR residual $r_{LiDAR}$ can be computed using the point-to-plane distance $d_p$ and the point-to-line distance $d_e$:
$$d_p = \left| n_p^{\top} \left( R f_i + p - m_i \right) \right|, \qquad d_e = \left\| n_e \times \left( R f_i + p - m_i \right) \right\|,$$
where $R$ and $p$ are the predicted orientation and position, $n_p$ and $n_e$ are normalized, and $m_i$ is the corresponding point of the feature point $f_i$ in the global map.
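These distances can be sketched directly; a minimal numpy helper (ours, with `n_p` and `n_e` assumed unit-length):

```python
import numpy as np

def lidar_residuals(R, p, f_i, m_i, n_p, n_e):
    """Point-to-plane and point-to-line distances for a matched feature f_i.
    R, p: predicted orientation (3x3) and position (3,)
    m_i:  corresponding map point; n_p: plane normal; n_e: line direction."""
    q = R @ np.asarray(f_i) + np.asarray(p)          # feature in the global frame
    d_p = abs(np.dot(n_p, q - m_i))                  # point-to-plane distance
    d_e = np.linalg.norm(np.cross(n_e, q - m_i))     # point-to-line distance
    return d_p, d_e
```

In practice each feature contributes only one of the two residuals, depending on whether it was matched against $M_p$ or $M_e$.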

4.4. Adaptive Scan Accumulation

The spatial distribution of the point cloud has an important influence on the accuracy of the registration result. Taking point-to-plane registration as an example, the matched points should involve at least three non-parallel planes.
Unlike the point cloud obtained by the fixed mounted LiDAR, the LiDAR in our device is continuously rotating, resulting in the continuous change of the spatial distribution of the obtained point cloud. As shown in Figure 6, using only a single frame of the point cloud sometimes fails to provide reliable registration results.
A simple solution is to accumulate multi-scan point clouds before each registration, but this will reduce the computational efficiency, so we propose a method to adaptively decide whether to perform point cloud accumulation and how many scans need to be accumulated according to the spatial distribution of the point cloud.
Figure 6. Illustration of multi-scan point clouds’ accumulation. The first scan point cloud is rendered in green and only contains a few points in the horizontal plane, which will lead to an inaccurate registration result, especially in the Z direction.
This problem can be solved by merging it with the next scan point cloud.

4.4.1. Features’ Distribution Inspection

To ensure the registration results are reliable, we used the covariance of the registration result as the indicator. In the least-squares registration problem, the normal equation formed by the matched feature points is
$$J^{\top} J\, \delta x = J^{\top} b, \quad (11)$$
where $J \in \mathbb{R}^{n \times 6}$ is constructed by stacking the Jacobian matrices of $d_p$ and $d_e$ with respect to the pose, $n$ is the number of matched points, and $b \in \mathbb{R}^{n \times 1}$ is constructed by stacking the distance residuals $d_p$ and $d_e$. The solution to Equation (11) is given by
$$\delta x = \left( J^{\top} J \right)^{-1} J^{\top} b = H^{-1} J^{\top} b, \quad (12)$$
where $H = J^{\top} J$. The covariance of the estimated pose is
$$\Sigma_{\delta x} = H^{-1} J^{\top} E\!\left[ b b^{\top} \right] J\, H^{-1}, \quad (13)$$
where $E[\cdot]$ is the expected-value operator. The covariance of $b$ is determined by the laser measurements, and each laser observation can be treated as an independent observation. Therefore, its covariance is
$$E\!\left[ b b^{\top} \right] = \sigma_l^2 I, \quad (14)$$
where $\sigma_l$ is the accuracy of the laser measurement, which can be set depending on the model of the LiDAR. Substituting Equation (14) into Equation (13), then
$$\Sigma_{\delta x} = \sigma_l^2 H^{-1}. \quad (15)$$
Equation (15) reveals how the spatial distribution of the matched points in matrix $J$ affects the covariance of the registration result. If the spatial distribution is suitable for registration, the covariance of the estimated pose should not vary greatly in all directions. Suppose $H_{tran} \in \mathbb{R}^{3 \times 3}$ is the submatrix of $H$ corresponding to the translation part of the pose; the ratio $\gamma$ can be computed by
$$\gamma = \frac{\max\left( \mathrm{diag}\!\left( H_{tran}^{-1} \right) \right)}{\min\left( \mathrm{diag}\!\left( H_{tran}^{-1} \right) \right)}, \quad (16)$$
where the operator $\mathrm{diag}(\cdot)$ extracts the diagonal elements of the matrix as a vector. The smaller the value of $\gamma$, the more uniform the distribution of the point cloud is, and vice versa. Here, we set the threshold $\gamma_{th} = 3$. If $\gamma < \gamma_{th}$, the matched pairs will be accepted, and these plane and edge features will be added to the factor graph as constraints; otherwise, the extracted features will be accumulated to the next scan.
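The distribution check can be sketched as follows (our implementation; we assume the translation columns are the last three of $J$):

```python
import numpy as np

def distribution_ratio(J, gamma_th=3.0):
    """Compute gamma from the translation block of H = J^T J and decide
    whether the matched features are accepted or accumulated further."""
    H = J.T @ J
    H_tran = H[3:, 3:]                  # translation part (assumed layout)
    d = np.diag(np.linalg.inv(H_tran))  # per-axis covariance, up to sigma_l^2
    gamma = d.max() / d.min()
    return gamma, gamma < gamma_th
```

An isotropic constraint set yields γ = 1 (accept), while a scan that weakly constrains one axis, as in Figure 6, drives γ above the threshold and triggers accumulation with the next scan.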

4.4.2. Outlier Removal in Matched Features

Since the point cloud in a single scan is sparse, there will inevitably be errors in the feature extraction results. These incorrectly matched point pairs will interfere with the features’ distribution inspection and reduce the accuracy of pose estimation.
To remove them while performing distribution inspection, an outlier removal algorithm based on the propagation of covariance is applied.
We assumed that the error in the global map can be ignored, which means that the distance residual is mainly caused by the error of the predicted pose and the error of the laser measurement. Then, the standard deviation of the distance residuals can be computed by
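As an illustrative sketch of such covariance-based gating (the test statistic and the 3σ threshold are our assumptions, not the paper's exact formula):

```python
import numpy as np

def reject_outliers(residuals, sigma_d, k=3.0):
    """Return a boolean mask keeping matches whose distance residual lies
    within k standard deviations of zero (simple k-sigma gate)."""
    return np.abs(np.asarray(residuals, dtype=float)) < k * np.asarray(sigma_d)
```

Matches failing the gate are excluded both from the distribution inspection and from the factor graph.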