Elsevier

Information Fusion

Volume 97, September 2023, 101819

Long sequence time-series forecasting with deep learning: A survey

https://doi.org/10.1016/j.inffus.2023.101819

Highlights

  • Long sequence time-series forecasting (LSTF) is defined from two perspectives.
  • We propose a new taxonomy and give a comprehensive review of LSTF.
  • A Kruskal–Wallis test based LSTF performance evaluation method is proposed.
  • Abundant resources of TSF and LSTF are collected including an open-source library.
  • We summarize four possible future research directions.

Abstract

The development of deep learning technology has brought great improvements to the field of time series forecasting. Short sequence time-series forecasting no longer satisfies the current research community, and long-term future prediction, referred to as long sequence time-series forecasting (LSTF), is becoming a hotspot. LSTF has been widely studied in the extant literature, but few reviews of its research development have been reported. In this article, we provide a comprehensive survey of LSTF studies with deep learning technology. We propose rigorous definitions of LSTF and summarize its evolution in terms of a proposed taxonomy based on network structure. Next, we discuss three key problems and the corresponding solutions: long dependency modeling, computation cost, and evaluation metrics. In particular, we propose a Kruskal–Wallis test based evaluation method for the evaluation metrics problem. We further synthesize the applications, datasets, and open-source codes of LSTF. Moreover, we conduct extensive case studies comparing the proposed Kruskal–Wallis test based evaluation method with existing metrics, and the results demonstrate its effectiveness. Finally, we propose potential research directions in this rapidly growing field. All resources and codes are assembled and organized under a unified framework that is available online at https://github.com/Masterleia/TSF_LSTF_Compare.

Keywords

Time series forecasting
Long time series forecasting
Transformer
Data mining
Deep learning


1. Introduction

Time series forecasting (TSF) is a classical forecasting task that predicts the future trend of a time series, and it has been widely used in real-world applications such as energy [1], transportation [2], and meteorology [3]. Historically, traditional statistical methods were the main way to forecast time series. As early as 1927, Yule proposed the autoregressive (AR) model [4] for predicting univariate stationary time series. Since then, the moving average (MA) model [5] proposed by Walker and the autoregressive moving average (ARMA) model [6], a combination of the two, have been used to solve univariate stationary TSF problems. However, these traditional statistical methods rest on many prior assumptions about the data, such as stationarity, linear correlation, normal distribution, and independence, which limit their effectiveness in real-world applications. In particular, they struggle to capture the nonlinear relationships between time series. Traditional machine learning (ML) algorithms offered a solution to this problem. Two classic algorithms, support vector machines (SVMs) [7] and adaptive boosting (AdaBoost) [8], were developed in 1995 and have achieved great success in the field of TSF. They rely on feature engineering that reorganizes the data with a sliding time window; i.e., statistics such as the minimum, maximum, mean, and variance are calculated within the sliding window and used as new features for prediction. These models solve, to some extent, the problem of predicting multivariate, heteroskedastic time series with nonlinear relationships. However, they generalize poorly and exhibit limited predictive accuracy that can often only be improved with complex manual feature engineering.
The rapid development of deep learning (DL) in recent years, as shown in Fig. 1, has greatly improved the nonlinear modeling capabilities of TSF methods. DL is regarded as an effective solution for TSF and is widely used in finance, energy, meteorology, transportation, and medical fields. It has also been extended to many other TSF-related problems, such as hierarchical time series forecasting [9], intermittent time series forecasting [10], sparse multivariate time series forecasting [11], and asynchronous time series forecasting [12], [13], and even to multiobjective and multigranular forecasting scenarios [14] and multimodal time series forecasting scenarios [15], [16]. With the development of technology, a new task has gradually emerged: using more historical data to predict the longer-term future, that is, long sequence time-series forecasting (LSTF) [17], [18], [19], and researchers have explored feasible solutions to it.
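The sliding-window feature engineering used by traditional ML forecasters, as described in this section, can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `sliding_window_features`, the one-step-ahead target, and the toy series are hypothetical and do not come from the surveyed works.

```python
import numpy as np

def sliding_window_features(series, window):
    """Build one-step-ahead training pairs from a univariate series.

    Each feature row holds the min, max, mean, and variance of `window`
    consecutive observations; the target is the value right after the window.
    (Hypothetical helper for illustration only.)
    """
    series = np.asarray(series, dtype=float)
    if window < 1 or window >= len(series):
        raise ValueError("window must satisfy 1 <= window < len(series)")
    # Keep only windows that are followed by an observation to predict.
    windows = np.stack([series[i:i + window] for i in range(len(series) - window)])
    features = np.column_stack([
        windows.min(axis=1),
        windows.max(axis=1),
        windows.mean(axis=1),
        windows.var(axis=1),
    ])
    targets = series[window:]
    return features, targets

# Toy series; the resulting rows could be fed to any regressor (e.g., an SVM).
features, targets = sliding_window_features([1.0, 2.0, 4.0, 8.0, 16.0], window=3)
```

The window length and the choice of statistics are exactly the manual design decisions that the text identifies as a weakness of this family of methods.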
LSTF is in growing demand in financial, energy, meteorological, transportation, and medical scenarios, for example in electricity usage planning [25] and long-term financial strategy guidance [26]. By forecasting outcomes further in advance, LSTF helps people plan better for the future. Nevertheless, prior algorithms have performed unsatisfactorily when forecasting over longer horizons with longer input sequences. Specifically, their inference performance and predictive accuracy decline significantly as the sequence length grows. Furthermore, they struggle to extract effective dependencies from the data efficiently, which poses an urgent problem in time series forecasting. With the remarkable advances in computational resources and deep learning techniques, however, this challenge has gradually been addressed.

Table 1. Recent survey comparison.

| Survey | Theme | Contribution |
|---|---|---|
| Bianchi F M [20] | Focus on RNN in short-term load forecasting | A survey on short-term load forecasting. A case study of the reviewed models. |
| Manibardo E L [21] | Focus on DL in road traffic forecasting | A critical analysis of the state of the art in the use of DL for Intelligent Transportation Systems research. Extensive experimentation. |
| Lara-Benítez P [22] | Focus on DL in TSF | An exhaustive review of DL in TSF. An open-source deep learning framework for TSF and a comparative analysis. |
| Deb C [23] | Focus on energy consumption forecasting based on ML | A comprehensive review and comparison of ML in energy consumption forecasting. Constructive future directions. |
| K Benidis [24] | The main focus is to educate, review and popularize the latest developments in NN-driven forecasting | A broad and deep survey on DL for TSF. A deeper look to identify future directions. |
| Ours | Focus on DL in long sequence time-series forecasting | Deep explanations of the LSTF definition from multiple perspectives. New taxonomy and comprehensive review for LSTF. New performance evaluation for LSTF. Abundant resources for TSF and particularly for LSTF, such as datasets, metrics, and an open-source library. Future directions for LSTF. |

Fig. 1. Technological Development of DL in TSF.

Many researchers have studied LSTF from the perspectives of traditional statistics and machine learning [27], [28], [29], [30], [31]. With the development of DL, and especially the emergence of Transformers [32], the ability to deal with long-sequence problems has attracted wide attention in the industry. Researchers started to study the applications of Transformers in the field of TSF, and works such as [17], [33] demonstrated the effectiveness of Transformer-based models for LSTF. The recent work of [34] summarized the applications of Transformers in TSF. The parallel computation in Transformer-based architectures speeds up inference in LSTF and increases the length of the series that the model can handle. Moreover, the number of operations that self-attention requires to relate two time points does not grow with the distance between them, which makes it easier to capture long-term dependencies in the data. However, the expensive training and deployment costs of these models can make them unaffordable in real LSTF applications. Transformer-based models with many parameters are also prone to overfitting on some data distributions. Thus, many non-Transformer works [29], [35], [36], [37], [38], [39], [40], [41] also study TSF in general and LSTF in particular. As LSTF receives more and more attention, the related research keeps growing. In view of the current status of existing work, we found the following three problems in the field of LSTF, which have not been standardized or sorted out in other works:
  • Problem 1: To our knowledge, long sequence time-series forecasting still lacks a comprehensive overview in the data mining and deep learning fields. In particular, the rapid development of LSTF based on deep learning technology has created a pressing need for an overview of methods, data, and metrics, both to inspire researchers who are interested in entering this rapidly developing field and to serve experts who wish to compare LSTF methods. Many existing studies survey time series forecasting [20], [21], [22], [23], [24], but these reviews tend to categorize and summarize time series forecasting research under one overarching theme, as shown in Table 1, without distinguishing between LSTF and other forecasting approaches.
  • Problem 2: Long sequence time-series forecasting lacks deep explanations from multiple perspectives. Generally, LSTF is defined as forecasting the distant future [17]. However, such a definition is macroscopic, and the existing literature lacks a comprehensive and nuanced explanation of LSTF. It remains unclear which specific forecasting tasks qualify as LSTF.
  • Problem 3: Long sequence time-series forecasting still lacks technical analysis and performance comparison of different neural networks under a unified framework. The traditional MSE and MAE metrics only reflect the error between the true and predicted values. However, in LSTF, as the prediction horizon increases, it is not reliable to compare the accuracy of models purely by the numerical magnitude of these metrics. For example, from the reported results [42] of MSE=0.202 for PatchTST/42 and MSE=0.203 for DLinear, we cannot conclude which model is superior. We need to prove the superiority of one model over the other in a statistical sense. However, no such performance evaluation exists.
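To make the idea in Problem 3 concrete, the sketch below computes the tie-corrected Kruskal–Wallis H statistic for the per-window errors of two models and compares it with a chi-square critical value. It is a generic textbook form of the test, not the evaluation procedure proposed later in this paper; the function name `kruskal_wallis_h` and the error values are hypothetical. SciPy's `scipy.stats.kruskal` offers the same test with a p-value.

```python
from collections import Counter

def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H statistic for two or more samples."""
    if len(groups) < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)
    # Assign 1-based average ranks so that tied values share one rank.
    counts = Counter(pooled)
    rank_of = {}
    start = 1
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = start + (t - 1) / 2.0
        start += t
    # H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1)
    rank_term = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * rank_term - 3.0 * (n_total + 1)
    # Standard tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    correction = 1.0 - sum(t**3 - t for t in counts.values()) / (n_total**3 - n_total)
    if correction == 0.0:
        return 0.0  # every pooled value is identical
    return h / correction

# Hypothetical per-window absolute errors of two forecasting models.
errors_model_a = [0.21, 0.19, 0.22, 0.20, 0.18, 0.23]
errors_model_b = [0.20, 0.21, 0.19, 0.22, 0.21, 0.20]
h_stat = kruskal_wallis_h(errors_model_a, errors_model_b)

# Upper chi-square quantile for alpha = 0.05 and v = k - 1 = 1 degree of freedom.
CHI2_CRIT_ALPHA_005_DF_1 = 3.841
significant = h_stat > CHI2_CRIT_ALPHA_005_DF_1
```

When `significant` is false, the two error samples cannot be distinguished at the chosen level, which is the situation the text describes for two nearly identical MSE values.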
To address the above challenges, we provide a comprehensive overview of LSTF. This article is intended both for researchers who want to enter this fast-growing field and for experts who want to compare LSTF models. To cover a broad range of approaches, we consider LSTF to be the task of forecasting a more distant future in TSF. In other words, we treat as LSTF literature any work whose contribution, motivation, and problem addressed are to predict a more distant future in TSF. Thus, LSTF is part of TSF. Spatio-temporal forecasting and LSTF view the problem from different perspectives, and the two overlap. We retrieved literature in a predefined manner from the Web of Science (WoS) Core Collection database and DataBase systems and Logic Programming (DBLP). The document types were confined to articles and reviews. After filtering the articles by reading their abstracts, we also selected closely related references to complement them.
The contributions of this paper are summarized as follows:
  • (1) Deep multi-perspective explanations of the LSTF definition: The existing definition of LSTF is quite macroscopic and ambiguous, and defining LSTF clearly is challenging. To address this, we provide deep explanations from multiple perspectives: relative and absolute concepts.
  • (2) New Taxonomy and Comprehensive Review: We propose a new taxonomy of LSTF research. We classify existing works into RNN-based models, CNN-based models, GNN-based models, Transformer-based models, compound models, and miscellaneous methods according to the main structural characteristics of their predictive networks. In addition, we summarize several key problems of LSTF and investigate the related work and effective solutions for each key problem.
  • (3) New Performance Evaluation: To enable fair and effective performance evaluation of LSTF, we apply the Kruskal–Wallis test and propose a new LSTF prediction performance evaluation method from two perspectives.
  • (4) Abundant Resources: We have collected abundant resources for TSF and for LSTF in particular. We have classified five application domains and collected relevant datasets for each domain, giving the open-source link for each dataset. We also collected relevant evaluation metrics, classified them into scale-dependent, scale-independent, and scaled errors, and discussed the properties of each metric. Finally, we selected recent SOTA models (Pyraformer, FEDformer, Autoformer, Informer, Reformer, Transformer, MTGNN, Graph WaveNet, LSTNet). Based on the datasets and evaluation metrics we collected, we give a technical analysis and performance comparison of different neural networks under a unified framework in LSTF, and we provide an open-source library.
  • (5) Future Directions: Based on the results of our research and experiments, we have summarized four possible future research directions.
The motivation and logical flow of this paper are shown in Fig. 2. Section 2 gives the definitions of time series data and TSF tasks, together with deep multi-perspective explanations of the LSTF definition. In Section 3, we review the relevant research and the evolution of DL networks in the LSTF field in recent years and classify the related studies according to the technical contributions of each work. Section 4 summarizes the key problems encountered in LSTF, sorts out the research directions and effective solutions for each key problem in current works, and proposes our new LSTF evaluation approach for the performance evaluation problem. Section 5 summarizes the relevant open-source datasets in the five application areas. Section 6 summarizes, discusses, and classifies the predictive performance evaluation metrics. In Section 7, we conduct a case study that evaluates and compares the LSTF performance of SOTA models developed in recent years. In Section 8, we discuss future research directions.

Fig. 2. Article Structure.

Fig. 3. Univariate Forecasting.

Fig. 4. Covariate Forecasting.

2. Problem definition

In this section, we present the definitions of time series data and their characteristics. TSF tasks and their classification methods are given based on their properties. We also note that the terms “variable” and “feature” are used interchangeably in this paper. Unless otherwise noted, the notations used in this study are shown in Table 2.

2.1. Time series data

Time series data exist in various fields of real life, so it is particularly important to understand their distribution characteristics and to predict their future behavior. Predicting future fluctuations in gold or stock prices can yield large financial benefits, and predicting future weather or traffic flows can benefit people's daily lives; grasping how time series data will develop is therefore very meaningful. So what are time series data?
Time series data are series of finite or infinite random variables (sequence data) arranged on the basis of time. This type of data reflects the changing state of a certain thing or phenomenon over time. In addition to digital data in the traditional sense, if sequence data such as video data, audio data, and voice data are arranged according to time, they can be regarded as time series data. That is, any data observed and recorded in a time sequence can be regarded as time series data. An N-dimensional time series matrix $\hat{X}_{1:N,\,t-n:t}$ with n samples can be expressed formally as follows:
$$\hat{X}_{1:N,\,t-n:t}=\{x_{1,t-n:t},\,x_{2,t-n:t},\,\dots,\,x_{N,t-n:t}\} \tag{1}$$
Among them, $x_{1,t-n:t}$ is represented as a vector, $x_{1,t-n:t}=\{x_{1,t-n},\,x_{1,t-n+1},\,\dots,\,x_{1,t}\}$, and $x_{1,t-n}$ is denoted as a time series point at time $t-n$ of a time series. Thus, $\hat{X}_{1:N,\,t-n:t}$ denotes time series data consisting of multiple time series.
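As a rough illustration of Eq. (1), the matrix of N series over a history window can be held in a two-dimensional array with one row per feature. The sizes, the random data, and the variable names below are hypothetical, and the window from t-n to t is assumed to include both endpoints, which gives n + 1 columns.

```python
import numpy as np

N, n, t = 3, 4, 10                      # hypothetical: 3 features, window t-n..t
rng = np.random.default_rng(0)
history = rng.normal(size=(N, t + 1))   # observations for time steps 0..t

# Rows are the vectors x_{i, t-n:t}; columns are time steps t-n, ..., t.
X_window = history[:, t - n : t + 1]

x_1 = X_window[0]             # the vector x_{1, t-n:t}
x_1_first = X_window[0, 0]    # the single point x_{1, t-n}
```

Storing features along rows and time along columns mirrors the subscript order in Eq. (1); many DL libraries use the transposed layout (time first), so the axis order should be checked before feeding such an array to a model.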

Table 2. A summary of notations.

| Notation | Description |
|---|---|
| $X$ | Denotes a multidimensional time series matrix |
| $x$ | Denotes a time series vector |
| $f(\cdot)$ | Denotes a time series forecasting model |
| $\hat{X}_{i,t+1}$ | Denotes the predicted value of the ith feature at time step $t+1$ |
| $X_{i,t-n:t}$ | Denotes the historical value of the ith feature from time step $t-n$ to $t$ |
| $t$ | The current time step |
| $l$ | Indicates the interval between time series data |
| $m$ | Indicates the sample size |
| $\hat{y}$ | Indicates the predicted value |
| $y$ | Indicates the true value |
| $\bar{y}$ | Denotes the mean of the real sample data |
| $y_{\max}$ | Denotes the maximum value of the real sample data |
| $y_{\min}$ | Denotes the minimum value of the real sample data |
| $\mathrm{mean}(\cdot)$ | Denotes the mean value |
| $H$ | Denotes the statistic in the K-W test |
| $R_i$ | Denotes the sum of the ranks of the $n_i$ observations $\{X_1,\dots,X_{n_i}\}$ of the ith sample in this arrangement |
| $H$ | Denotes the corrected K-W test statistic |
| $\alpha$ | Indicates the significance level |
| $v$ | Degree of freedom |
| $\chi^2$ | Denotes the chi-square distribution |
| $\chi^2_{\alpha,v}$ | Denotes the upper quantile of the chi-square distribution |

Time series data exist in major fields such as finance, energy, meteorology, transportation, and medical care. They all have the following common characteristics.
  • Trend (T): A trend is a change in a time series as time progresses or as an independent variable changes; the data may show a relatively slow, long-term tendency to rise or fall continuously.
  • Periodicity (C): Periodicity means that the time series changes with time or with its independent variables in such a way that the data rise and fall regularly.
  • Seasonality (S): Seasonality means that the time series changes with natural seasons and time periods, so that its data rise or fall regularly as the seasons change. The definition of seasonality is very similar to that of periodicity, and the two are often confused, but there is a significant difference between them. Seasonality means that the time series changes at a known and fixed frequency under the influence of seasons, while periodicity means that the time series rises or falls at a frequency that is not fixed.
  • Randomness (I): Randomness means that individual data points change irregularly while the series as a whole still follows statistical laws; these irregular changes are sometimes called residuals.
In summary, time series have four characteristics: trend, periodicity, seasonality, and randomness. Time series can also be classified according to other characteristics relevant to TSF tasks, such as their data distribution. For example, time series can be stationary or nonstationary. Stationarity means that the mean and variance of the time series stay consistent over time. In many cases, methods such as time series decomposition are applied before forecasting to convert a nonstationary time series into a stationary one, which can improve forecasting accuracy. Notably, according to their nature, time series can also be divided into hierarchical time series [9], fuzzy time series [43], and so on. However, this paper only discusses LSTF for precise time series. In the next section, we introduce the definition of TSF tasks.
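The decomposition step mentioned above can be sketched with a classical additive decomposition, which is one common choice and is used here only as an assumption, since the text does not name a specific method. The helper `additive_decompose`, the known period, and the synthetic series are all hypothetical; the sketch separates trend, a fixed-period seasonal part, and a residual, and does not model variable-frequency periodicity separately.

```python
import numpy as np

def additive_decompose(series, period):
    """Split a series into trend, seasonal, and residual parts (additive)."""
    y = np.asarray(series, dtype=float)
    if period < 2 or len(y) < 2 * period:
        raise ValueError("need period >= 2 and at least two full periods")
    # Centered moving average; an even period uses half weights at both ends.
    if period % 2 == 0:
        kernel = np.r_[0.5, np.ones(period - 1), 0.5] / period
    else:
        kernel = np.ones(period) / period
    half = len(kernel) // 2
    trend = np.full(len(y), np.nan)   # edges stay NaN (no full window there)
    trend[half : len(y) - half] = np.convolve(y, kernel, mode="valid")
    # Seasonal part: mean detrended value per phase, shifted to average zero.
    detrended = y - trend
    phase_means = np.array([np.nanmean(detrended[p::period]) for p in range(period)])
    phase_means -= phase_means.mean()
    seasonal = np.tile(phase_means, len(y) // period + 1)[: len(y)]
    residual = y - trend - seasonal
    return trend, seasonal, residual

# Synthetic example: linear trend plus a repeating period-4 pattern.
t = np.arange(48)
series = 0.1 * t + np.tile([1.0, -1.0, 2.0, -2.0], 12)
trend, seasonal, residual = additive_decompose(series, period=4)
```

After removing the estimated trend and seasonal parts, the residual is much closer to a stationary series, which is the property the forecasting model benefits from.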

2.2. Time series forecasting

Many analysis and processing tasks are available for time series data, such as time series classification and TSF. This paper focuses on the TSF task. TSF is a classic problem that uses one or more time-ordered features of an object over a prior period to predict the features of that object over a future period. This task differs from general regression analysis: a TSF model needs to capture the order of the series data to grasp its future trend. If the order of the same sequence data is changed, the result produced by feeding the data into the time series model is completely different. If there is an ith group time series with n sample data