Elsevier

Journal of Commodity Markets

Volume 32, December 2023, 100352

Regular article
The role of higher moments in predicting China's oil futures volatility: Evidence from machine learning models

https://doi.org/10.1016/j.jcomm.2023.100352

Abstract

This paper expands the emerging literature on volatility forecasting for China's oil market by exploring the predictive ability of higher-order moments (skewness, kurtosis, hyperskewness, and hyperkurtosis) based on high-frequency data. Our investigation is originally based on the heterogeneous autoregressive (HAR) framework, but considering the possible multicollinearity and nonlinearity, it is extended to various machine learning (ML) models and combination forecasting models. The results reveal that higher-order moments, including the two highest moments, always significantly improve predictive performance during the COVID-19 crisis. We further examine the interpretability of ML models and each factor's contribution to the prediction, finding that odd and even moments contain short- and long-term prediction information, respectively. This paper also highlights the effectiveness of ML models for capturing trends in oil futures volatility with higher-order moments and the satisfactory performance of combination forecasting models. Finally, we investigate the predictability of asymmetric risk patterns and obtain identical results. Our study has important implications for financial risk management, asset pricing, and portfolio allocation.

Keywords

China's oil futures
COVID-19
Higher-order moments
Machine learning
Combination forecasting


1. Introduction

Oil is a major commodity in financial markets, and volatility forecasting of oil prices is crucial for financial modeling and decision-making (Chiang et al., 2015; Hamilton, 1983; Kilian and Park, 2009; Pan et al., 2017). A growing number of studies have made significant gains in volatility prediction by applying realized variance measures built on returns, such as jumps, bipower variation, realized semi-variance, etc. Higher-order moments are no exception (Bollerslev et al., 2016; Ma et al., 2019; Patton and Sheppard, 2015; Prokopczuk et al., 2016; Sévi, 2014). Because higher-order moments are an integral consideration in risk management, portfolio allocation, and asset pricing, it is worth examining whether they are relevant to oil volatility prediction.

This study is the first to explore the role of higher-order moments in influencing realized volatility (RV) forecasting in the oil market.
The inspiration for focusing on the predictive role of higher-order moments in oil futures RV stems from a review of the extensive theoretical literature dating back to Kraus and Litzenberger (1976), including macroeconomic disaster studies by Barro (2006), Longstaff and Piazzesi (2004), and Rietz (1988), who argue that heavy-tailed shocks in general, and left-tailed events in particular, significantly influence asset-price behavior.

Throughout the process, higher-order moments aim to capture RV asymmetries and extreme fluctuations, which are easily connected to the COVID-19 pandemic and the ensuing profound shifts in the world's financial markets (Kostakis et al., 2012). Thus, higher-order moments are recognized as essential indicators of market-wide risk, with the capability to enhance RV predictions.

Empirically, higher-order moments can improve investment performance when incorporated into portfolio strategies and can properly define the returns distribution when employed in model frameworks (Dittmar, 2002; Ghysels et al., 2016; Jensen et al., 2000; Liu et al., 2020). Notably, Mei et al. (2017) are the first to reveal the predictive ability of higher-order moments for stock market volatility in the US and China. Building on this study, Gkillas et al. (2019) conclude that higher-order moments can improve our comprehension of exchange-rate volatility dynamics based on six major currencies relative to the US dollar. Bonato et al. (2022) provide further evidence of the predictive ability of higher-order moments for the RV of international real estate investment trusts, which exceeds that of jumps. However, these articles only study skewness and kurtosis. So far, few researchers have investigated the potential predictive importance of the fifth and sixth moments. Kinateder and Papavassiliou (2019) discover that the two highest moments contribute to sovereign bond returns forecasts. Driven by the assumption that agents prefer higher odd moments and dislike even ones, Khademalomoom et al. (2019) verify that incorporating odd/even moments (third and fifth moments/fourth and sixth moments) into return/variance modeling improves the performance of generalized autoregressive conditional heteroskedasticity (GARCH) models in predicting exchange-rate returns. However, these papers neglect the oil market.

Given the importance of higher-order moments and the oil market, this study examines, for the first time, the impact of hyper-order moments on oil futures volatility forecasting. We also differentiate between odd and even moments and take the lead in measuring the effect of higher-order moments on positive and negative volatility separately. It is worth noting that this paper focuses on the RV of China's oil market during the COVID-19 pandemic, paying particular attention to four considerations. 1) The importance of China's crude oil futures. The first RMB-denominated crude oil futures were formally launched on the Shanghai International Energy Exchange (INE) on March 26, 2018, opening up to international participants, which had theoretical and practical significance for China (Ji and Zhang, 2019). China is rapidly advancing its position in the global market as the second-biggest consumer of crude oil and the world's largest importer of commodities (Liu and Lee, 2021). 2) The uniqueness of China's crude oil futures. China's crude oil futures hold the distinction of being the first price benchmark in Asia, setting China apart from other countries such as Japan and Singapore, which have made unsuccessful attempts to establish their own benchmarks. Furthermore, in just three months since its launch, the INE ranked first in Asia and among the top three worldwide in terms of trading volume (Sun et al., 2022), which reveals the exceptional appeal of China's crude oil futures. 3) The limited research on China's crude oil futures. Compared to the large amount of research on global benchmarks, the literature on China's crude oil futures is scant (Duan et al., 2023; Huang and Huang, 2020; Niu et al., 2021; Yang and Zhou, 2020). 4) The noteworthy issues in China's crude oil futures. The INE has faced various extreme domestic and international challenges over the three years since its launch, including the severe shock of the COVID-19 pandemic, which triggered huge fluctuations in oil prices.

Considering the crucial ability of higher-order moments to cope with extreme situations, it makes sense for us to undertake the first exploration of the relationship between higher-order moments and the volatility of China's oil market during the COVID-19 crisis.
Building on the scarce empirical literature that predicts volatility using higher-order moments (Bonato et al., 2022; Gkillas et al., 2019), we investigate the impact of different higher-order moments by incorporating third- and fourth-order moments, odd moments, even moments, and all higher-order moments (including fifth- and sixth-order moments) into the benchmark heterogeneous autoregressive (HAR) models developed by Corsi (2009). The out-of-sample results show that third- to sixth-order moments all provide useful information for oil futures volatility forecasts in China.

We also discover that odd moments are more important than even moments for short-term forecasting, whereas even moments are more critical for long-term forecasting, which agrees with an earlier finding that odd and even moments have different effects (Khademalomoom et al., 2019). These results also align with forecasts of positive and negative volatility. Moreover, based on alternative evaluation approaches, forecasting horizons, and rolling schemes, the robustness of these conclusions is confirmed.
Given the inclusion of higher-order moments and their lags as explanatory variables in the forecasting models, which are all realized variance measures constructed from returns, it is reasonable to suspect correlations among them.

In such cases, conventional linear HAR models may falter. Additionally, the presence of nonlinearity contributing to volatility forecasting among higher-order moments cannot be identified using conventional linear models.

However, machine learning (ML) methods have proven to be effective in addressing multicollinearity and accommodating more sophisticated functional forms, enabling the exploration of complex nonlinear relationships among variables. This advantage holds promise for better approximating the unknown and likely complex data generating process underlying oil price volatility.

Furthermore, some empirical research has supported the claim that ML models are superior to other models. For example, Ding et al. (2021) show that the least absolute shrinkage and selection operator (LASSO) method outperforms the benchmark HAR model in forecasting the RV of stocks in eight countries. Christensen et al. (2021) demonstrate that a wide range of ML (regularization, tree-based regression, and neural network) models outperform the HAR model for volatility prediction.

Therefore, we investigate the influence of higher-order moments on oil futures volatility using ML models, including LASSO, elastic net (EN), gradient boosting (GBDT), and random forests (RF).
Interestingly, individual forecasts may react differently to structural disruptions or fluctuating market conditions, but combination forecasts can balance bias and estimation variance to achieve prediction efficiency (Elliott and Timmermann, 2016; Smith and Wallis, 2009; Timmermann, 2006). Many papers have shown that combination methods are the most effective in most cases (Claeskens et al., 2016; Clemen, 1989; Makridakis et al., 2018). Thus, we base our combination forecasts on ML models. To the best of our knowledge, this study is the first to offer empirical proof of the correlation between higher-order moments and volatility based on ML models and combination forecasting models.
Our paper makes several contributions. First, no study has investigated the role of higher (up to sixth-order) moments in predicting RV in the oil market. Devpura and Narayan (2020) assert that the COVID-19 pandemic led to nearly 10-fold more volatility in oil prices during its early stages, contributing between 8% and 22% per day to oil price volatility.

Due to such oil price volatility characteristics, third- and fourth-order moments may be inadequate for understanding volatility dynamics. In fact, empirical research on hyperskewness and hyperkurtosis has offered important predictive information on sovereign bond returns, cryptocurrency returns, and exchange-rate behavior but not oil price volatility. Therefore, we fill a gap in the existing literature and discover that higher (third- to sixth-order) moments are effective for oil futures volatility forecasting.

Second, previous papers have studied higher-order moments using GARCH, HAR, and MIDAS models (Khademalomoom et al., 2019; Mei et al., 2017). Considering the complexity of the oil futures market and the variable correlations among higher-order moments, this paper is the first to employ ML models to explore the relevance of higher-order moments to oil futures volatility.

Also, since we further use combination forecasting models for volatility prediction, the evidence indicates that combination forecasting models are superior to other models, demonstrating that all the ML techniques applied to higher-order moment predictions in this study have specific capabilities and inevitable restrictions.
Third, we pay attention to the interpretability of ML models and the contribution of each higher-order moment to forecasting, measured using three different tools.

Moreover, considering out-of-sample prediction results produced by HAR-type models collectively, we conclude that odd moments outperform even moments for short-term forecasts, whereas even moments contain more predictive information for long-term forecasts.

Fourth, we consider the special case of the COVID-19 crisis and further measure the impact of higher-order moments on positive and negative volatility forecasts.

Additionally, given the scarcity of previous studies on China's oil market, our study concentrates on China's oil futures market to explore the relationship between higher-order moments and volatility forecasting, and utilizes 5-min high-frequency data to better reflect the volatility information (Andersen and Bollerslev, 1998; Haugom et al., 2014; Liu et al., 2015).
The remainder of the paper is organized as follows. The next section provides a definition of higher-order moments. Section 3 describes the data. Section 4 introduces the models used to forecast oil price volatility. Section 5 presents the empirical prediction analysis and results. Section 6 explains the robustness testing. Section 7 provides a further discussion of positive and negative volatility forecasts, and the final section presents the conclusions.

2. Volatility estimation and higher-order moments

2.1. Realized volatility

The purpose of this study is to forecast the realized volatility of China's oil prices as a natural estimator for the quadratic variation of a process. Following Andersen and Bollerslev (1998), we define day $t$'s realized volatility ($RV_t$) as:

$$RV_t = \sum_{i=1}^{N} r_{t,i}^{2}, \tag{1}$$

where $r_{t,i}$ is the $i$th intraday return ($i = 1, 2, \ldots, N$) on day $t$, and $1/N$ is the sampling frequency. Thus, $r_{t,i} = p_{t,i} - p_{t,i-1}$ ($t = 1, \ldots, T$; $i = 1, \ldots, N$), where $T$ is the total number of trading days, $N$ is the number of intraday intervals, and $p_{t,i}$ is the log oil price.
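As an illustration of Eq. (1), the following minimal sketch computes one day's realized volatility from a list of intraday log prices. The function name `realized_volatility` and the price values are hypothetical, and the sketch assumes that N returns are obtained from N + 1 consecutive log prices.

```python
import math

def realized_volatility(log_prices):
    """Eq. (1): sum of squared intraday log returns for one trading day."""
    # r_{t,i} = p_{t,i} - p_{t,i-1}, computed from consecutive log prices.
    returns = [b - a for a, b in zip(log_prices, log_prices[1:])]
    return sum(r * r for r in returns)

# Hypothetical intraday prices (illustrative values only, not from the paper's data).
prices = [100.0, 100.5, 99.8, 100.2]
rv = realized_volatility([math.log(p) for p in prices])
```

With 5-min data, the list would contain one log price per 5-min interval of the trading day.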

2.2. Higher-order moment methods

Moments are scalar values that summarize the univariate features of the underlying data distribution. They reflect the average of powers of deviations from the mean of a unimodal distribution (Pillai, 2019). The third- to sixth-order moments (skewness, kurtosis, hyperskewness, and hyperkurtosis, respectively) have different meanings, and we define these higher-order moments in the following paragraphs.
According to Amaya et al. (2015), skewness (the third moment) explains the asymmetry of daily returns distribution, and kurtosis (the fourth moment) reflects extreme occurrences. We compute the realized third moment, $RSK_t$, and realized fourth moment, $RKU_t$, as:

$$RSK_t = \frac{\sqrt{N} \sum_{i=1}^{N} r_{t,i}^{3}}{RV_t^{3/2}}, \tag{2}$$

$$RKU_t = \frac{N \sum_{i=1}^{N} r_{t,i}^{4}}{RV_t^{2}}. \tag{3}$$

Furthermore, following Khademalomoom et al. (2019) and Kinateder and Papavassiliou (2019), hyperskewness (the fifth moment) is the asymmetric sensitivity of the kurtosis, and hyperkurtosis (the sixth moment) measures the level of peakedness and tailedness of the normal distribution. We define the realized fifth moment, $HRSK_t$, and realized sixth moment, $HRKU_t$, as follows:

$$HRSK_t = \frac{\sqrt{N^{3}} \sum_{i=1}^{N} r_{t,i}^{5}}{RV_t^{5/2}}, \tag{4}$$

$$HRKU_t = \frac{N^{2} \sum_{i=1}^{N} r_{t,i}^{6}}{RV_t^{3}}. \tag{5}$$
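Eqs. (2)-(5) share one pattern: the $k$th realized moment equals $N^{k/2-1} \sum_i r_{t,i}^k / RV_t^{k/2}$. The sketch below uses that pattern to compute all four measures from one day's intraday returns. The function name `realized_moments` and the dictionary keys are illustrative choices, not the authors' code.

```python
def realized_moments(returns):
    """Realized skewness, kurtosis, hyperskewness, and hyperkurtosis (Eqs. 2-5)."""
    n = len(returns)
    rv = sum(r ** 2 for r in returns)  # Eq. (1)

    def scaled(k):
        # N^(k/2 - 1) * sum(r^k) / RV^(k/2), the common form of Eqs. (2)-(5).
        return n ** (k / 2 - 1) * sum(r ** k for r in returns) / rv ** (k / 2)

    return {"RV": rv, "RSK": scaled(3), "RKU": scaled(4),
            "HRSK": scaled(5), "HRKU": scaled(6)}

# Hypothetical symmetric returns: the odd moments vanish, the even ones equal 1.
moments = realized_moments([0.01, -0.01, 0.01, -0.01])
```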

3. Methodology

3.1. HAR-type models

Benchmark forecasting (HAR) model. Our research is based on the HAR framework originally developed by Corsi (2009), which has become popular due to its ease of implementation and effective performance.

The idea is to blend daily (past 1 day), weekly (past 5 days), and monthly (past 22 days) average volatility elements to capture the empirical properties of volatility series, such as multi-scaling behavior, long memory, and fat tails. The HAR model is defined as follows:

$$RV_{t+h} = \alpha_0 + \alpha_1 RV_t + \alpha_2 RV_{t-5:t} + \alpha_3 RV_{t-22:t} + \varepsilon_t, \tag{6}$$

where $RV_{t+h}$ ($h = 1, 5, 22$) represents the 1-day-, 1-week-, and 1-month-ahead RV, respectively; $RV_t$ is the daily RV defined in Eq. (1); $RV_{t-5:t}$ and $RV_{t-22:t}$ denote the average daily RV over lags 1 to 5 and lags 1 to 22, representing the weekly and monthly RVs.
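To make the structure of Eq. (6) concrete, the sketch below builds the regressor rows and targets from a daily RV series. It rests on two assumptions that the text leaves open: the weekly and monthly windows end at day $t$ and include it, and the target is the single value $RV_{t+h}$ rather than an average over the horizon. The function name `har_design` is hypothetical.

```python
def har_design(rv, h=1):
    """Return (X, y) for Eq. (6): each row of X is [1, RV_daily, RV_weekly, RV_monthly]."""
    rows, targets = [], []
    # Start once 22 past days are available; stop while RV_{t+h} is still observed.
    for t in range(21, len(rv) - h):
        daily = rv[t]
        weekly = sum(rv[t - 4 : t + 1]) / 5      # 5-day average ending at day t
        monthly = sum(rv[t - 21 : t + 1]) / 22   # 22-day average ending at day t
        rows.append([1.0, daily, weekly, monthly])
        targets.append(rv[t + h])
    return rows, targets

# Hypothetical RV series used only to show the alignment of regressors and target.
X, y = har_design([float(v) for v in range(30)], h=1)
```

Any least-squares routine can then be applied to `X` and `y` to estimate the $\alpha$ coefficients.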
Relatively low (third- and fourth-order) moment-based forecasting model (HAR-LOW): To check whether the two highest moments (hyperskewness and hyperkurtosis) contain information for oil price volatility forecasting, we include only third- and fourth-order moments in the benchmark model. The model is expressed by

$$RV_{t+h} = \alpha_0 + \alpha_1 RV_t + \alpha_2 RV_{t-5:t} + \alpha_3 RV_{t-22:t} + \beta_1 RSK + \beta_3 RKU + \varepsilon_t, \tag{7}$$

where $RSK = [RSK_t, RSK_{t-5:t}, RSK_{t-22:t}]$ and $RKU = [RKU_t, RKU_{t-5:t}, RKU_{t-22:t}]$ are vectors of the daily, weekly, and monthly skewness and kurtosis, respectively, and $\beta_1 = [\beta_{11}, \beta_{12}, \beta_{13}]$ and $\beta_3 = [\beta_{31}, \beta_{32}, \beta_{33}]$ are the coefficients for $RSK$ and $RKU$.
Odd (third- and fifth-order) moment-based forecasting model (HAR-ODD): To explore the importance of odd-order moments, we expand the HAR model by incorporating the third-order moment (skewness) and the fifth-order moment (hyperskewness). The HAR-ODD model is given by

$$RV_{t+h} = \alpha_0 + \alpha_1 RV_t + \alpha_2 RV_{t-5:t} + \alpha_3 RV_{t-22:t} + \beta_1 RSK + \beta_2 HRSK + \varepsilon_t, \tag{8}$$

where $RSK = [RSK_t, RSK_{t-5:t}, RSK_{t-22:t}]$ and $HRSK = [HRSK_t, HRSK_{t-5:t}, HRSK_{t-22:t}]$ are vectors of the daily, weekly, and monthly skewness and hyperskewness, respectively, and $\beta_1 = [\beta_{11}, \beta_{12}, \beta_{13}]$ and $\beta_2 = [\beta_{21}, \beta_{22}, \beta_{23}]$ are the coefficients for $RSK$ and $HRSK$.
Even (fourth- and sixth-order) moment-based forecasting model (HAR-EVEN): To investigate the difference between even and odd moments, we extend the model by inserting the fourth-order moment (kurtosis) and sixth-order moment (hyperkurtosis). We term the extended model HAR-EVEN, which is expressed by

$$RV_{t+h} = \alpha_0 + \alpha_1 RV_t + \alpha_2 RV_{t-5:t} + \alpha_3 RV_{t-22:t} + \beta_3 RKU + \beta_4 HRKU + \varepsilon_t, \tag{9}$$

where $RKU = [RKU_t, RKU_{t-5:t}, RKU_{t-22:t}]$ and $HRKU = [HRKU_t, HRKU_{t-5:t}, HRKU_{t-22:t}]$ are vectors of the daily, weekly, and monthly kurtosis and hyperkurtosis, respectively, and $\beta_3 = [\beta_{31}, \beta_{32}, \beta_{33}]$ and $\beta_4 = [\beta_{41}, \beta_{42}, \beta_{43}]$ are the coefficients for $RKU$ and $HRKU$.
All-order (third, fourth, fifth, and sixth) moment-based forecasting model (HAR-ALL): We add all higher-order moments into the benchmark model and construct the HAR-ALL model as:

$$RV_{t+h} = \alpha_0 + \alpha_1 RV_t + \alpha_2 RV_{t-5:t} + \alpha_3 RV_{t-22:t} + \beta_1 RSK + \beta_2 HRSK + \beta_3 RKU + \beta_4 HRKU + \varepsilon_t. \tag{10}$$
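The five specifications in Eqs. (6)-(10) differ only in which moment vectors enter alongside the three RV terms. A small lookup table, sketched below, makes the nesting explicit. The column-naming scheme (`RV_d`, `RSK_w`, and so on) is a hypothetical convention introduced for illustration.

```python
HORIZONS = ("d", "w", "m")  # daily, weekly, and monthly components

# Moment vectors added to the benchmark HAR regressors in each model.
MODEL_MOMENTS = {
    "HAR": [],
    "HAR-LOW": ["RSK", "RKU"],
    "HAR-ODD": ["RSK", "HRSK"],
    "HAR-EVEN": ["RKU", "HRKU"],
    "HAR-ALL": ["RSK", "HRSK", "RKU", "HRKU"],
}

def regressors(model):
    """List the regressor names (excluding the intercept) for one HAR-type model."""
    names = [f"RV_{s}" for s in HORIZONS]
    for moment in MODEL_MOMENTS[model]:
        names += [f"{moment}_{s}" for s in HORIZONS]
    return names
```

Under this scheme HAR-ALL has 15 regressors, which is also the predictor set that the ML models described later draw on.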

3.2. Machine learning models

3.2.1. Least absolute shrinkage and selection operator

The problem of overfitting tends to occur when the number of indicators increases in low signal-to-noise environments. Instead of fitting the relevant information, linear models tend to fit the noise.

LASSO regression is an efficient technique for avoiding overfitting, increasing out-of-sample performance, and selecting predictors in a computationally efficient manner (Ding et al., 2021).
The LASSO technique is essentially a restricted least squares regression in which regression coefficients are shrunk by imposing a penalty term. The LASSO equation for forecasting oil price volatility is given as

$$\widehat{RV}_{t+h} = \hat{\beta}_0 + \sum_{i=1}^{n} \hat{\beta}_i x_{i,t}, \tag{11}$$

and

$$\hat{\beta} = \arg\min_{\beta} \left( \frac{1}{2(t-1)} \sum_{l=1}^{t-1} \Big( RV_{l+h} - \beta_0 - \sum_{i=1}^{n} \beta_i x_{i,l} \Big)^{2} + \lambda \sum_{i=1}^{n} |\beta_i| \right), \tag{12}$$

where $x_{i,t}$ denotes the $i$th predictor on day $t$, $n$ is the total number of predictors in the regression, $\hat{\beta}$ is the shrinkage estimator of the regression coefficients, and $\lambda$ is the tuning parameter that controls the strictness of the penalty. The first part of Eq. (12) is the least squares criterion, and the second part is the penalty term for the regression parameters. Increasing $\lambda$ causes more coefficients from the LASSO regression to be penalized to zero, with stricter coefficient selection. All the coefficients are set to zero when $\lambda$ is sufficiently large. The LASSO model includes all predictors from the HAR-ALL model.
Equivalently, Eq. (12) can also be written as a constrained OLS estimator:

$$\hat{\beta} = \arg\min_{\beta} \frac{1}{2} \sum_{l=1}^{t-1} \Big( RV_{l+h} - \beta_0 - \sum_{i=1}^{n} \beta_i x_{i,l} \Big)^{2}, \tag{13}$$

with an L1 penalty function of

$$\sum_{i=1}^{n} |\beta_i| < \psi, \tag{14}$$

where the parameter $\psi$ plays the same role as $\lambda$, controlling for the amount of shrinkage.
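One common way to solve a problem of the form in Eq. (12) is cyclic coordinate descent with soft-thresholding. The sketch below is a minimal pure-Python version under that assumption. It is not the authors' estimation routine, and the names `soft_threshold` and `lasso_cd`, the fixed iteration count, and the toy data are all illustrative.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def lasso_cd(X, y, lam, n_iter=500):
    """Minimize (1/(2m)) * sum of squared errors + lam * sum(|beta_j|) by coordinate descent."""
    m, n = len(X), len(X[0])
    beta0, beta = sum(y) / m, [0.0] * n
    for _ in range(n_iter):
        for j in range(n):
            rho = z = 0.0
            for row, target in zip(X, y):
                # Fit from the intercept and every predictor except j.
                others = beta0 + sum(beta[k] * row[k] for k in range(n) if k != j)
                rho += row[j] * (target - others)
                z += row[j] ** 2
            beta[j] = soft_threshold(rho / m, lam) / (z / m)
        # The intercept is not penalized: set it to the mean remaining residual.
        beta0 = sum(t - sum(b * v for b, v in zip(beta, row)) for row, t in zip(X, y)) / m
    return beta0, beta

# Toy data (hypothetical): y = 1 + 2 * x1, with x2 irrelevant.
X = [[1.0, 0.5], [2.0, -0.5], [3.0, 0.5], [4.0, -0.5]]
y = [3.0, 5.0, 7.0, 9.0]
ols_like = lasso_cd(X, y, lam=0.0)   # no penalty: approaches the least-squares fit
all_zero = lasso_cd(X, y, lam=100.0) # heavy penalty: every slope is set to zero
```

The second call illustrates the statement in the text that a stricter penalty drives more coefficients to zero.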

3.2.2. Elastic net

Introduced by Zou and Hastie (2005), EN is another widely used shrinkage method to overcome overfitting. Besides the L1-norm penalty of the LASSO method, the elastic net method simultaneously introduces an L2 penalty. As with the LASSO model, the EN forecast of oil price volatility is computed as follows:

$$\widehat{RV}_{t+h} = \hat{\beta}_0 + \sum_{i=1}^{n} \hat{\beta}_i x_{i,t}, \tag{15}$$

and

$$\hat{\beta} = \arg\min_{\beta} \left( \frac{1}{2(t-1)} \sum_{l=1}^{t-1} \Big( RV_{l+h} - \beta_0 - \sum_{i=1}^{n} \beta_i x_{i,l} \Big)^{2} + \lambda \sum_{i=1}^{n} \Big( \alpha |\beta_i| + \frac{1-\alpha}{2} \beta_i^{2} \Big) \right), \tag{16}$$

where $\alpha$ is a constant ranging from 0 to 1. When $\alpha = 1$, the EN becomes LASSO, and it reduces to ridge regression when $\alpha = 0$. The predictor variables of the EN model are the same as those included in the HAR-ALL model.
Eq. (16) can also be described as a constrained OLS estimator:

$$\hat{\beta} = \arg\min_{\beta} \frac{1}{2} \sum_{l=1}^{t-1} \Big( RV_{l+h} - \beta_0 - \sum_{i=1}^{n} \beta_i x_{i,l} \Big)^{2}, \tag{17}$$

with L1 and L2 penalty functions of

$$\alpha \sum_{i=1}^{n} |\beta_i| + \frac{1-\alpha}{2} \sum_{i=1}^{n} \beta_i^{2} < \psi, \tag{18}$$

where the parameter $\psi$ plays the same role as $\lambda$, controlling for the amount of shrinkage.
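The roles of $\alpha$ and $\lambda$ in Eq. (16) are easiest to see in the one-coefficient case. Holding all other coefficients fixed, the minimizer for a single $\beta_j$ is a soft-thresholded value divided by a ridge-inflated denominator. The sketch below shows that update. The function names and the inputs `rho` (the average of $x_j$ times the partial residual) and `z` (the average of $x_j^2$) are notation introduced here, not taken from the paper.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside [-threshold, threshold]."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def en_coordinate_update(rho, z, lam, alpha):
    """Single-coefficient minimizer of the Eq. (16) objective, other coefficients fixed."""
    # L1 part: thresholding by lam * alpha. L2 part: denominator grows by lam * (1 - alpha).
    return soft_threshold(rho, lam * alpha) / (z + lam * (1.0 - alpha))

lasso_case = en_coordinate_update(2.0, 1.0, 0.5, alpha=1.0)  # pure L1: (2 - 0.5) / 1
ridge_case = en_coordinate_update(2.0, 1.0, 0.5, alpha=0.0)  # pure L2: 2 / (1 + 0.5)
```

With $\alpha = 1$ the update can return exactly zero (variable selection), whereas with $\alpha = 0$ it only shrinks the coefficient, which matches the LASSO and ridge limits stated above.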

3.2.3. Gradient boosting

Possible nonlinear relationships between dependent variables and predictors, and the interactions among predictors, cannot be captured by linear models. In contrast, regression trees allow for nonlinearity and consider interactions among predictors using a fully nonparametric approach. On this basis, Friedman (2001) introduced GBDT, which grows each tree depending on information extracted from the preceding tree. The essential idea behind gradient boosting is the tree-boosting model.

To minimize the errors caused by the current models, new models are added until no further enhancement is achieved. This process is known as “boosting” (Christensen et al., 2021) and proceeds as follows:
  • $\hat{f}_0(Z_t)$ is set as a constant given by $\hat{f}_0(Z_t) = \arg\min_{\rho} \sum_{t \in \text{in-sample}} L(RV_t, \rho)$. This prediction coincides with the sample average when the mean squared error (MSE) is used as the loss function.
  • The negative gradient of the loss function (equivalent to the residuals) with respect to the prediction value is calculated.
  • A shallow tree is fitted to the residuals, yielding a group of terminal nodes $R_{jb}$, $j = 1, \ldots, J_b$, where $j$ and $b$ indicate the leaf and tree, respectively.
  • The size of the gradient descent step is chosen by $\rho_{jb} = \arg\min_{\rho} \sum_{Z_t \in R_{jb}} L(RV_t, \hat{f}_{b-1}(Z_t) + \rho)$.
  • Iteratively, $\hat{f}_b(Z_t)$ is updated as follows:

$$\hat{f}_b(Z_t) = \hat{f}_{b-1}(Z_t) + \nu \sum_{j=1}^{J_b} \rho_{jb} \mathbf{1}\{Z_t \in R_{jb}\}, \tag{19}$$

for $b = 1, \ldots, B$, where $\nu$ is the learning rate and $\hat{f}(Z_t) \equiv \hat{f}_B(Z_t)$ is the final prediction.
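The boosting steps above can be sketched in a few lines for the simplest possible setting: a single predictor, two-leaf trees (stumps), and the squared-error loss $L = \frac{1}{2}(RV - f)^2$, for which the negative gradient is the residual and the optimal leaf step $\rho_{jb}$ is the mean residual in the leaf. The function names, the toy data, and the default settings are illustrative assumptions, not the configuration used in the paper.

```python
def fit_stump(x, residuals):
    """Best two-leaf regression tree on one predictor under squared error."""
    best = None
    for split in sorted(set(x))[:-1]:  # candidate thresholds leave both leaves non-empty
        left = [r for v, r in zip(x, residuals) if v <= split]
        right = [r for v, r in zip(x, residuals) if v > split]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, split, left_mean, right_mean)
    return best[1:]

def boost(x, y, n_trees=50, learning_rate=0.1):
    """Gradient boosting with stumps; returns the constant f0, the trees, and the fitted values."""
    f0 = sum(y) / len(y)               # MSE-optimal constant prediction
    preds = [f0] * len(y)
    trees = []
    for _ in range(n_trees):
        residuals = [t - p for t, p in zip(y, preds)]  # negative gradient of the loss
        split, left_mean, right_mean = fit_stump(x, residuals)
        trees.append((split, left_mean, right_mean))
        # Eq. (19): add the shrunken leaf values to the previous prediction.
        preds = [p + learning_rate * (left_mean if v <= split else right_mean)
                 for p, v in zip(preds, x)]
    return f0, trees, preds

# Hypothetical step-shaped data that one split can describe exactly.
x = [1, 2, 3, 4, 5, 6]
y = [1.0, 1.0, 1.0, 3.0, 3.0, 3.0]
f0, trees, preds = boost(x, y)
```

Because each round removes only a fraction `learning_rate` of the remaining residual, the fitted values approach the targets gradually rather than in a single step.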

3.2.4. Random forests

The essence of RF is a combination prediction model, which randomly builds a forest containing many single regression trees (Breiman, 2001) that are weakly correlated or even uncorrelated. Subsequently, the regression trees use aggregate statistics to jointly predict the value of a newly observed output variable.

The input variables of the RF model in this study include the predictors of the standard HAR (RVD, RVW, and RVM), all higher-order moments, and their lags, which are the same as in the HAR-ALL model.
In the regression tree $T$, the input indicator space $Y = (Y_1, Y_2, \ldots)$ is recursively partitioned by the terminal nodes $J$ into $s$ non-overlapping regions $R_s$. At the top level of the regression tree, the model uses a greedy algorithm to select the first partition in such a way that the partition variable $l$ and its partition point $p$, defining the half-planes $R_1(l,p) = \{x_l \mid x_l \le p\}$ and $R_2(l,p) = \{x_l \mid x_l > p\}$, minimize the loss function:

$$\min_{l,p} \left\{ \min_{\overline{RV}_1} \sum_{x_l \in R_1(l,p)} (RV_i - \overline{RV}_1)^{2} + \min_{\overline{RV}_2} \sum_{x_l \in R_2(l,p)} (RV_i - \overline{RV}_2)^{2} \right\}, \tag{20}$$

where $i$ indexes the half-plane-specific observations of RV, and $\overline{RV}_k = \mathrm{mean}\{RV_i \mid x_l \in R_k(l,p)\}$, $k = 1, 2$, is the half-plane-specific mean of RV. The region-specific squared error loss is minimized by selecting half-plane-specific means via inner minimization.

The outer minimization finds the first optimal partitioning predictor and the first optimal partitioning point, resulting in a new regression tree with two terminal nodes, by searching through all possible combinations of $l$ and $p$.
The above method is then repeated until the tree's maximum number of terminal nodes or each terminal node's minimum number of observations, whichever comes first, is reached.

The final tree sends each observation of the predictors $x_i$ to its region $R_l$ and forecasts RV with the region-specific mean ($\mathbf{1}(\cdot)$ denotes the indicator function):

$$T\left(x_i,\{R_l\}_1^L\right)=\sum_{l=1}^{L}\overline{RV}_l\,\mathbf{1}(x_i\in R_l).\tag{21}$$
The “randomness” of the RF is reflected in two ways: 1) each training sample is drawn from the original sample by bootstrap resampling, and 2) the splitting variables are chosen at random.

When each regression tree is built, the optimal splitting variable at a given node is selected from a random subset of the candidate variables rather than from the full set of variables.
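As an illustration of the greedy search in Eq. (20), the following Python sketch scans every candidate predictor $s$ and split point $p$ and keeps the pair with the smallest two-region squared error. The function names (`sse`, `best_split`), the optional `candidate_features` argument (standing in for the random subset of predictors used by an RF tree), and the choice of observed values as candidate split points are assumptions made for illustration; they are not taken from the paper.

```python
from statistics import mean


def sse(values):
    """Squared error around the mean, i.e. the inner minimisation in Eq. (20)."""
    if not values:
        return 0.0
    centre = mean(values)
    return sum((v - centre) ** 2 for v in values)


def best_split(X, y, candidate_features=None):
    """Return (loss, s, p) for the split minimising the two-region squared error.

    X is a list of predictor rows and y the matching RV targets.
    candidate_features optionally restricts the search to a subset of columns.
    """
    features = range(len(X[0])) if candidate_features is None else candidate_features
    best = None
    for s in features:
        # Observed values serve as split points; the largest is skipped so
        # that the region x_s > p is never empty.
        for p in sorted({row[s] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[s] <= p]
            right = [yi for row, yi in zip(X, y) if row[s] > p]
            loss = sse(left) + sse(right)
            if best is None or loss < best[0]:
                best = (loss, s, p)
    return best


# Toy data (hypothetical): the first predictor separates low and high RV.
X = [[1.0, 5.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]]
y = [1.0, 1.2, 3.0, 3.1]
loss, s, p = best_split(X, y)
```

Applying `best_split` recursively to the two resulting regions, with a stopping rule on the number of terminal nodes or observations per node, would grow a full tree.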

3.3. Combination forecasting models

The previous section introduced four popular ML models, which yield four individual volatility forecasts.

However, because of model uncertainty, the stability of out-of-sample predictions cannot be guaranteed, and different ML models might be effective in different scenarios (Li and Tang, 2021). To address this issue, we build on the earlier literature (Liang et al., 2020; Zhang et al., 2019) and apply the five combination approaches introduced by Rapach et al. (2010) to make forecasting more robust and to further exploit the predictive potential of higher-order moments. Formally, the combination volatility forecast is given as:

$$RV_{t+h}=\sum_{i=1}^{N}\omega_{i,t}\,\widehat{RV}_{i,t+h},\tag{22}$$

where $RV_{t+h}$ denotes the combination forecast of China's oil market volatility on day $t+h$, $\widehat{RV}_{i,t+h}$ is the individual prediction yielded by the $i$th ML model, $\omega_{i,t}$ is the combination weight of the $i$th individual prediction formed at $t$, and $N$ is the number of individual ML models. The five combination weighting techniques are as follows.
  • Mean combination: Each individual forecast receives the same weight (i.e., $\omega_{i,t}=1/N$).
  • Median combination: The median of the $N$ individual predictions is used as the combination forecast.
  • Trimmed mean combination: The maximum and minimum individual predictions are assigned $\omega_{i,t}=0$, and the remaining individual forecasts are assigned $\omega_{i,t}=1/(N-2)$.
  • Discount mean square prediction error (DMSPE) combination: The combination weight of the $i$th individual forecast on day $t$ is computed as $\omega_{i,t}=\phi_{i,t}^{-1}/\sum_{l=1}^{N}\phi_{l,t}^{-1}$, where $\phi_{l,t}=\sum_{s=m+1}^{t}\theta^{t-s}\left(RV_s-\widehat{RV}_{l,s}\right)^2$, $RV_s$ is the actual RV on day $s$, $m$ is the number of observations in the first training sample, and $\theta$ is a discount factor. We consider two values of $\theta$ (1 and 0.9) and refer to the resulting combinations as DMSPE(1) and DMSPE(0.9).
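The weighting rules listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names and the toy numbers are hypothetical, the hold-out history passed to `dmspe_weights` is assumed to cover days $m+1,\ldots,t$ in order, and every model is assumed to have a strictly positive discounted error so that the inverse exists.

```python
from statistics import mean, median


def mean_combination(forecasts):
    """Equal weights 1/N on all individual forecasts."""
    return mean(forecasts)


def median_combination(forecasts):
    """Median of the N individual forecasts."""
    return median(forecasts)


def trimmed_mean_combination(forecasts):
    """Drop the largest and smallest forecast, average the remaining N - 2."""
    return mean(sorted(forecasts)[1:-1])


def dmspe_weights(actual, past_forecasts, theta):
    """DMSPE weights from the hold-out history.

    actual[idx] is the realised RV on hold-out day idx (oldest first) and
    past_forecasts[l][idx] is model l's forecast for that day.
    """
    n_days = len(actual)
    phi = []
    for model_forecasts in past_forecasts:
        # The most recent day gets theta**0; older days are discounted more.
        phi.append(sum(theta ** (n_days - 1 - idx) * (a - f) ** 2
                       for idx, (a, f) in enumerate(zip(actual, model_forecasts))))
    inverse = [1.0 / value for value in phi]  # assumes every phi is positive
    total = sum(inverse)
    return [value / total for value in inverse]


def combine(weights, forecasts):
    """Weighted combination forecast as in Eq. (22)."""
    return sum(w * f for w, f in zip(weights, forecasts))


# Hypothetical example with two models and two hold-out days.
weights = dmspe_weights([1.0, 2.0], [[1.0, 3.0], [2.0, 2.0]], theta=0.9)
combined = combine(weights, [2.4, 2.6])
```

With $\theta<1$, the model whose error lies further in the past (the second one here) receives the larger weight, which is the intended effect of discounting.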
Furthermore, we consider an additional effective forecast combination model, the polynomially weighted average forecaster with multiple learning rates (ML-Poly). In this model, a learner makes online sequential predictions over a series of rounds with the assistance of $K$ experts.

In each round $t=1,2,\ldots,T$, the learner makes a prediction by choosing a vector $p_t=(p_{1,t},\ldots,p_{K,t})$ of non-negative weights that sum to one. Every expert $k$ incurs a loss $l_{k,t}$, and the learner's loss is $\hat{l}_t=p_t^{\top}l_t=\sum_{k=1}^{K}p_{k,t}l_{k,t}$, where $l_t=(l_{1,t},\ldots,l_{K,t})$. The goal of the learner is to control its cumulative loss, which can be achieved by controlling its regret $R_{k,T}$ against each expert $k$. Following Cesa-Bianchi and Lugosi (2003) and Gaillard et al. (2014), Algorithm 1 summarizes the implementation of ML-Poly.

Algorithm 1: The polynomially weighted average forecaster with multiple learning rates (ML-Poly)
Initialization: $p_1=(1/K,\ldots,1/K)$ and $R_0=(0,\ldots,0)$
For each instance $t=1,2,\ldots,T$:
0. pick the learning rates
   $\eta_{k,t-1}=1\Big/\left(1+\sum_{s=1}^{t-1}\left(l_s(\widehat{RV}_s)-l_s(x_{k,s})\right)^2\right)$
1. form the mixture $\hat{p}_t$ defined component-wise by
   $\hat{p}_{k,t}=\eta_{k,t-1}(R_{k,t-1})_+\big/\,\eta_{t-1}\cdot(R_{t-1})_+$
   where $x_+$ denotes the vector of non-negative parts of the components of $x$
2. output the prediction $\widehat{RV}_t=\hat{p}_t\cdot x_t$
3. for each expert $k$, update the regret
   $R_{k,t}=R_{k,t-1}+l_t(\widehat{RV}_t)-l_t(x_{k,t})$
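The steps of Algorithm 1 can be traced in a short Python sketch. Two choices here are assumptions rather than details from the paper: the loss is taken to be the squared error, and uniform weights are used whenever no expert has positive cumulative regret (which is always the case in the first round). The function name `ml_poly` and the toy data are hypothetical.

```python
def ml_poly(expert_forecasts, actual):
    """Sketch of ML-Poly with squared loss.

    expert_forecasts[t][k] is expert k's forecast in round t and actual[t]
    the realised value. Returns the combined forecasts and the weights used
    in the final round.
    """
    n_experts = len(expert_forecasts[0])
    regret = [0.0] * n_experts   # cumulative regret R_{k,t-1}
    sq_sum = [0.0] * n_experts   # running sum of squared instantaneous regrets
    weights = [1.0 / n_experts] * n_experts
    combined = []
    for x_t, y_t in zip(expert_forecasts, actual):
        # Step 0: learning rates from past squared instantaneous regrets.
        eta = [1.0 / (1.0 + s) for s in sq_sum]
        # Step 1: weights proportional to eta * positive part of the regret.
        scores = [e * max(r, 0.0) for e, r in zip(eta, regret)]
        total = sum(scores)
        if total > 0:
            weights = [s / total for s in scores]
        else:
            # Assumed fallback: uniform weights when all regrets are <= 0.
            weights = [1.0 / n_experts] * n_experts
        # Step 2: combined prediction.
        y_hat = sum(w * x for w, x in zip(weights, x_t))
        combined.append(y_hat)
        # Step 3: update each expert's regret.
        learner_loss = (y_t - y_hat) ** 2
        for k in range(n_experts):
            instant = learner_loss - (y_t - x_t[k]) ** 2
            regret[k] += instant
            sq_sum[k] += instant ** 2
    return combined, weights


# Hypothetical example: expert 0 is always right, expert 1 is always off by 1.
combined, weights = ml_poly(
    [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0], [4.0, 5.0]],
    [1.0, 2.0, 3.0, 4.0],
)
```

In this toy example the learner starts from equal weights and shifts all weight to the accurate expert from the second round onward.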

3.4. Sample splitting and the time series cross validation

The ML methods discussed before rely on the selection of hyperparameters, which are critical to the models’ performance. Hyperparameters include penalty parameters in Lasso and EN, number of iterated trees in GBDT, number of random trees in RF, and others. We follow Gu et al. (2020) and select the tuning parameters adaptively from the available options.
We maintain the temporal ordering of our sample and divide the data into three disjoint time periods. The first subsample is the “training set”, which is used to estimate the model for a specific set of tuning parameter values.

The second subsample, the “validation set”, is used to tune the hyperparameters. We use the model estimated on the training set to forecast the data points in the validation set and iteratively search for the hyperparameters that optimize the validation objective, which is based on forecast errors.

It is important to note that the validation set is used solely for tuning the estimated model and is not considered truly out-of-sample. Consequently, the third “testing set” is employed to evaluate the forecasting performance of each model. Table A.1 in the Internet Appendix summarizes the hyperparameter tuning schemes for each ML model.
Hyperparameters should be selected and optimized through cross validation. Because our data are time series, we use the time series cross validation proposed by Hyndman and Athanasopoulos (2018).
The validation set consists of $K$ observations. Initially, we fit the model on the training set and assess its predictive performance on the first data point of the validation set. Then, starting from the initial point of the training set, we shift the data used for model fitting forward by one time point and measure the performance on the subsequent time point. The process is repeated for $k=1,2,\ldots,K$, and the forecasting errors obtained at each iteration are averaged. Each candidate value of the tuning parameter is evaluated by the average error computed in this way.
The cross-validation error function is the mean square error (MSE). Denoting the tuning parameter by $\lambda$,

$$TCV(\lambda)=\frac{1}{K}\sum_{k=1}^{K}\left(RV_{t_k}-\widehat{RV}_{t_k}\right)^2,\tag{23}$$

where $RV_{t_k}$ is the $k$-th validation observation and $\widehat{RV}_{t_k}$ is its prediction. The optimal $\lambda$ is selected by minimizing $TCV(\lambda)$:

$$\lambda^{TCV}=\arg\min_{\lambda}TCV(\lambda).\tag{24}$$
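The selection rule in Eqs. (23) and (24) can be sketched as follows. The forecaster used here, an exponentially weighted average with smoothing parameter `lam`, is only a hypothetical stand-in for the ML models of this paper, and the fitting window is assumed to keep a fixed length while it shifts forward by one point per validation observation. All function names are illustrative.

```python
def tcv_error(series, n_train, forecast_fn):
    """Average squared one-step error over the K validation points (Eq. 23).

    The first n_train points form the initial fitting window; the remaining
    K = len(series) - n_train points form the validation set.
    """
    n_val = len(series) - n_train
    errors = []
    for k in range(n_val):
        window = series[k:k + n_train]   # fitting window shifted forward by k
        target = series[k + n_train]     # k-th validation observation
        errors.append((target - forecast_fn(window)) ** 2)
    return sum(errors) / n_val


def ewma_forecaster(lam):
    """Stand-in model: exponentially weighted average with tuning parameter lam."""
    def forecast(window):
        level = window[0]
        for value in window[1:]:
            level = lam * value + (1 - lam) * level
        return level
    return forecast


def select_lambda(series, n_train, grid):
    """Return the grid value with the smallest TCV error (Eq. 24) and all scores."""
    scores = {lam: tcv_error(series, n_train, ewma_forecaster(lam)) for lam in grid}
    return min(scores, key=scores.get), scores


# Hypothetical trending series: heavier weight on recent data should win.
best_lam, scores = select_lambda([1, 2, 3, 4, 5, 6, 7, 8], n_train=5, grid=[0.2, 0.9])
```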
We adopt two sample splitting schemes in our empirical exercise. The first is a “rolling” scheme, in which the training and validation samples gradually shift forward in time while the length of each training and validation sample remains fixed.

In each rolling window, the models are refitted by optimizing the tuning parameters through time series cross-validation using the prevailing training and validation samples. Subsequently, the updated models predict the next time point in the testing set.
The second is a “recursive” scheme. Like the rolling scheme, it gradually incorporates more recent observations into the training and validation windows. However, it retains the entire history in the training sample, while the size of the validation set remains constant.

As a result, the window size gradually increases over time. Figure A.1 in the Internet Appendix provides a visual and detailed illustration of our rolling scheme.
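The difference between the two schemes is easiest to see in terms of index ranges. The sketch below generates, for each out-of-sample point, the training, validation, and test indices under either scheme; the function name `split_windows` and the zero-based indexing are assumptions made for illustration.

```python
def split_windows(n_obs, n_train, n_val, scheme="rolling"):
    """Yield (train_range, val_range, test_index) for each out-of-sample point.

    rolling:   training and validation windows keep a fixed length and shift.
    recursive: the training window always starts at the first observation,
               so it grows over time; the validation window keeps its length.
    """
    if scheme not in ("rolling", "recursive"):
        raise ValueError("scheme must be 'rolling' or 'recursive'")
    for test in range(n_train + n_val, n_obs):
        val_start = test - n_val
        train_start = val_start - n_train if scheme == "rolling" else 0
        yield range(train_start, val_start), range(val_start, test), test


# Sample sizes as reported in Section 4: 489 + 97 + 392 = 978 observations.
rolling = list(split_windows(978, 489, 97, scheme="rolling"))
recursive = list(split_windows(978, 489, 97, scheme="recursive"))
```

Under the rolling scheme every training window contains 489 observations, whereas the last recursive training window spans all 880 observations that precede the final validation window.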

4. Data

In this paper, we use INE data from April 24, 2018 to July 3, 2022 to assess whether higher-order moments can improve forecasting accuracy for China's oil price RV. To test the predictive ability of higher-order moments during the COVID-19 period, we set the training sample as April 24, 2018 to December 31, 2019 (489 days) and the validation sample as January 1, 2020 to March 22, 2020 (97 days), and we use the remaining 392 observations for out-of-sample testing.

The size of the validation sample stays the same, but the sample is rolled forward to include the most recent 97 days. Following previous research (Liu et al., 2015), this study uses 5-min high-frequency data because this sampling frequency is useful for dealing with market microstructure noise, and we construct the corresponding input variables from these data.
The descriptive statistics for the variables are shown in Table 1. The skewness coefficients indicate that the series for skewness (RSK, RSK-5, and RSK-22) and hyperskewness (HRSK, HRSK-5, and HRSK-22) are left skewed, whereas the series for RV (RVD, RVW, and RVM), kurtosis (RKU, RKU-5, and RKU-22), and hyperkurtosis (HRKU, HRKU-5, and HRKU-22) are mostly right skewed. All data have high kurtosis, except for the monthly data for higher-order moments. The Jarque–Bera test rejects the Gaussian distribution at the 1% significance level for all variables except monthly kurtosis. The Augmented Dickey–Fuller tests show that no variable has a unit root, so all variables are stationary in levels.

Table 1. Summary statistics for all variables.

| Variable | Mean | Std.Dev. | Median | Skewness | Kurtosis | Jarque-Bera | ADF tests |
|---|---|---|---|---|---|---|---|
| RVD | 2.643 | 3.080 | 1.746 | 3.627 | 20.212 | 18357.367*** | −7.198*** |
| RVW | 2.641 | 1.901 | 2.076 | 1.904 | 5.070 | 1590.238*** | −6.259*** |
| RVM | 2.644 | 1.391 | 2.271 | 1.078 | 0.946 | 198.212*** | −4.184*** |
| RSK | 0.013 | 2.046 | 0.000 | −0.178 | 1.677 | 117.814*** | −9.691*** |
| RSK-5 | 0.013 | 0.905 | 0.040 | −0.168 | 0.388 | 10.880*** | −9.863*** |
| RSK-22 | 0.007 | 0.431 | 0.070 | −0.450 | −0.228 | 31.778*** | −5.493*** |
| RKU | 11.510 | 10.970 | 7.565 | 2.618 | 8.934 | 4270.228*** | −9.489*** |
| RKU-5 | 11.522 | 4.714 | 11.028 | 0.969 | 1.484 | 235.034*** | −8.209*** |
| RKU-22 | 11.501 | 2.535 | 11.662 | −0.143 | −0.061 | 2.231 | −3.977** |
| HRSK | −0.872 | 111.395 | −0.421 | −0.909 | 13.946 | 7878.071*** | −9.355*** |
| HRSK-5 | −0.903 | 48.766 | 0.581 | −0.636 | 4.010 | 696.175*** | −9.433*** |
| HRSK-22 | −1.202 | 24.613 | 0.669 | −0.437 | −0.076 | 34.135*** | −4.881*** |
| HRKU | 406.553 | 829.222 | 103.908 | 4.479 | 27.733 | 33808.896*** | −10.083*** |
| HRKU-5 | 407.466 | 359.508 | 314.302 | 2.017 | 5.853 | 1979.390*** | −9.284*** |
| HRKU-22 | 406.668 | 174.354 | 405.114 | 0.281 | −0.094 | 16.152*** | −4.281*** |
Notes: This table reports the descriptive statistics of all variables used. The Jarque-Bera statistic is used to test the null hypothesis of normal distribution. The ADF test aims to examine the existence of unit roots. Q(10) is the Ljung-Box statistic for up to the tenth order serial correlation. The entire sample period is from April 24, 2018 to July 3, 2022. The asterisks ***, **, and * denote the 1%, 5%, and 10% levels of significance, respectively. To make subsequent tables and figures clearer and easier to present, we use RVD, RVW, and RVM to represent $RV_t$, $RV_{t-5:t}$, and $RV_{t-22:t}$ in Eq. (6), respectively. These variables correspond to the daily, weekly, and monthly data of RV. Similarly, we use RSK, RSK-5, and RSK-22 to denote $RSK_t$, $RSK_{t-5:t}$, and $RSK_{t-22:t}$ in Eq. (7), representing the daily, weekly, and monthly skewness, respectively. The remaining higher-order moments follow the same naming pattern.

5. Empirical results

5.1. Full sample analysis

In this section, we consider the full sample period. After normalizing the full-sample data, we estimate the HAR-type models by OLS. For our purposes, the statistical significance of the moment coefficients is more informative than the magnitudes of the estimates. Table 2 reports the coefficient estimation results for all the HAR-type models.

Table 2. Full sample estimation results.

Panel A: h = 1

| MODEL | HAR | HAR-LOW | HAR-ODD | HAR-EVEN | HAR-ALL |
|---|---|---|---|---|---|
| RVD | 0.151*** | 0.141*** | 0.129*** | 0.160*** | 0.136*** |
| RVW | 0.257*** | 0.260*** | 0.252*** | 0.260*** | 0.252*** |
| RVM | 0.056 | 0.010 | 0.054 | 0.021 | 0.023 |
| RSK | | −0.103*** | −0.308*** | | −0.275*** |
| RSK-5 | | −0.101** | −0.251*** | | −0.207** |
| RSK-22 | | −0.029 | 0.007 | | 0.020 |
| HRSK | | | 0.233*** | | 0.191** |
| HRSK-5 | | | 0.157* | | 0.098 |
| HRSK-22 | | | −0.010 | | −0.041 |
| RKU | | −0.061* | | 0.019 | 0.017 |
| RKU-5 | | −0.047 | | 0.140 | 0.193 |
| RKU-22 | | −0.084* | | −0.237* | −0.246* |
| HRKU | | | | −0.077 | −0.072 |
| HRKU-5 | | | | −0.188 | −0.231* |
| HRKU-22 | | | | 0.157 | 0.157 |

Panel B: h = 5

| MODEL | HAR | HAR-LOW | HAR-ODD | HAR-EVEN | HAR-ALL |
|---|---|---|---|---|---|
| RVD | 0.483*** | 0.493*** | 0.456*** | 0.514*** | 0.486*** |
| RVW | 0.265*** | 0.250*** | 0.264*** | 0.246*** | 0.243*** |
| RVM | 0.115*** | 0.067* | 0.121*** | 0.072* | 0.080* |
| RSK | | −0.103*** | −0.269*** | | −0.225*** |
| RSK-5 | | −0.128*** | −0.290*** | | −0.236*** |
| RSK-22 | | −0.016 | 0.071 | | 0.090 |
| HRSK | | | 0.193*** | | 0.135* |
| HRSK-5 | | | 0.170** | | 0.096 |
| HRSK-22 | | | −0.056 | | −0.097 |
| RKU | | −0.113*** | | −0.059 | −0.040 |
| RKU-5 | | −0.024 | | 0.247** | 0.315*** |
| RKU-22 | | −0.121*** | | −0.395*** | −0.414*** |
| HRKU | | | | −0.049 | −0.069 |
| HRKU-5 | | | | −0.271** | −0.334*** |
| HRKU-22 | | | | 0.275** | 0.281** |

Panel C: h = 22

| MODEL | HAR | HAR-LOW | HAR-ODD | HAR-EVEN | HAR-ALL |
|---|---|---|---|---|---|
| RVD | 0.249*** | 0.239*** | 0.238*** | 0.255*** | 0.241*** |
| RVW | 0.161*** | 0.149*** | 0.170*** | 0.146*** | 0.153*** |
| RVM | 0.285*** | 0.214*** | 0.290*** | 0.201*** | 0.206*** |
| RSK | | −0.087*** | −0.199** | | −0.133* |
| RSK-5 | | −0.101*** | −0.100 | | −0.051 |
| RSK-22 | | −0.003 | 0.032 | | 0.107 |
| HRSK | | | 0.123* | | 0.053 |
| HRSK-5 | | | −0.023 | | −0.063 |
| HRSK-22 | | | 0.046 | | −0.100 |
| RKU | | −0.091** | | −0.156 | −0.127 |
| RKU-5 | | −0.019 | | −0.112 | −0.004 |
| RKU-22 | | −0.275*** | | −0.515*** | −0.565*** |
| HRKU | | | | 0.077 | 0.044 |
| HRKU-5 | | | | 0.097 | −0.022 |
| HRKU-22 | | | | 0.256* | 0.288* |
Notes: This table shows the parameter estimation results from a full-sample perspective. The standard HAR model serves as the benchmark, followed in turn by HAR-LOW (adds skewness and kurtosis), HAR-ODD (adds skewness and hyperskewness), HAR-EVEN (adds kurtosis and hyperkurtosis), and HAR-ALL (adds skewness, kurtosis, hyperskewness, and hyperkurtosis). Panels A, B, and C show the parameter estimation results for the 1-step, 5-step, and 22-step forecasts of realized volatility, respectively. The asterisks ***, **, and * denote the 1%, 5%, and 10% levels of significance, respectively.
The results in Table 2 show that the estimated coefficients on the higher-order moments and their lags are nonzero and that more than half of them are statistically significant.

For the day-ahead prediction, we contrast the HAR-ODD model with the HAR-EVEN model and discover that odd moments have four significant coefficients, whereas even moments have only one.

The HAR-ALL model yields the same finding (i.e., the number of significant coefficients for odd moments is greater than that for even moments).

Although odd and even moments have the same number of significant coefficients in the HAR-LOW model containing third- and fourth-order moments, the significance of odd-moment coefficients is higher.
For the week-ahead and month-ahead predictions, odd and even moments have the same number of significant coefficients, both within the HAR-LOW model (third- and fourth-order moments) and when the HAR-ODD model (third- and fifth-order moments) is compared with the HAR-EVEN model (fourth- and sixth-order moments).

In the HAR-ALL model, however, even moments have more significant coefficients at these horizons. Across the forecasting horizons (h = 1, 5, and 22), the month-ahead regressions have the fewest significant coefficients and the weakest significance levels.
In summary, more than half of the higher-order moment coefficients are significant, which indicates that higher-order moments contain important information for forecasting the RV of China's oil market.

For the short-term horizon (h = 1), odd moments have a greater impact on oil price volatility. The effect of even moments is greater for the medium- and long-term horizons (h = 5, 22).
In addition, the significance of the moment coefficients weakens or disappears in the HAR-ALL model relative to the other models, which suggests that multicollinearity among the variables may affect the final prediction results.

Thus, we choose four ML methods (LASSO, EN, GBDT, and RF) to further explore the above issues.
Owing to the complex “black box” design of ML models, we use three measures to explore variable importance: SHAP values (Lundberg and Lee, 2017), the reduction in the predictive $R^2$, and SSD. The first measure, SHAP values, quantifies the contribution of each factor and has been used in several studies as an interpretation and visualization tool for ML models (Khalfaoui et al., 2022; Lu et al., 2022; Parsa et al., 2020). For each sample, the model generates a predicted value, and each feature of the sample is assigned a SHAP value. Let $x_i$ be the $i$th sample, $x_{i,j}$ the $j$th feature of sample $x_i$, and $y_i$ the model's prediction for sample $x_i$. In this study, the baseline of the entire model, denoted $y_{\mathrm{basement}}$, is the average of the dependent variable over all samples. The SHAP values satisfy

$$y_i=y_{\mathrm{basement}}+f(x_{i,1})+f(x_{i,2})+\cdots+f(x_{i,k}),\tag{25}$$

where $f(x_{i,j})$ is the SHAP value of $x_{i,j}$. A positive (negative) $f(x_{i,j})$ indicates that the variable increases (decreases) the forecast value.
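The additive decomposition in Eq. (25) can be checked by hand for a linear model, for which the SHAP value of feature $j$ has the closed form $\beta_j(x_{i,j}-\bar{x}_j)$ when features are treated as independent. The sketch below uses this special case only; it is not the procedure applied to the tree-based models in this paper, the baseline is taken to be the mean model prediction (which coincides with the mean of the dependent variable for an OLS fit with an intercept), and all names and numbers are hypothetical.

```python
def linear_shap(coefs, intercept, X):
    """Baseline and SHAP values beta_j * (x_ij - mean_j) of a linear model."""
    n = len(X)
    means = [sum(row[j] for row in X) / n for j in range(len(coefs))]
    # Baseline: the model prediction at the feature means (= mean prediction).
    base = intercept + sum(b * m for b, m in zip(coefs, means))
    shap_values = [[b * (x - m) for b, x, m in zip(coefs, row, means)] for row in X]
    return base, shap_values


def mean_abs_shap(shap_values):
    """Mean absolute SHAP value per feature, a common importance summary."""
    n = len(shap_values)
    return [sum(abs(row[j]) for row in shap_values) / n
            for j in range(len(shap_values[0]))]


# Hypothetical linear model y = 1.0 + 0.5 * x1 - 0.2 * x2 on three samples.
coefs, intercept = [0.5, -0.2], 1.0
X = [[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]]
base, shap_values = linear_shap(coefs, intercept, X)
importance = mean_abs_shap(shap_values)
```

For each sample, the baseline plus the sum of its SHAP values reproduces the model prediction, which is exactly the statement of Eq. (25).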

SHAP values have an advantage over other explainable artificial intelligence approaches: they quantify the contribution of each factor to the prediction without depending on the internal structure of the ML model.
Following Gu et al. (2020), the second measure is the reduction in panel predictive $R^2$ from setting all values of predictor $j$ to zero while holding the remaining model estimates fixed. The third measure is the sum of squared partial derivatives (SSD) of the model with respect to each input variable $j$, which summarizes the sensitivity of model fits to changes in that variable (Dimopoulos et al., 1995). In particular, SSD defines the importance of the $j$-th variable as

$$SSD_j=\sum_{t}\left(\left.\frac{\partial g(z;\theta)}{\partial z_j}\right|_{z=z_t}\right)^2.\tag{26}$$
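Both measures can be computed for any fitted model that is available as a prediction function. The sketch below is a minimal illustration under stated assumptions: `model` is a hypothetical callable that maps one feature row to a forecast, the partial derivatives in Eq. (26) are approximated by central finite differences, and the function names and toy model are not taken from the paper.

```python
def r_squared(y, y_hat):
    """Ordinary R^2 of predictions y_hat against targets y."""
    y_bar = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    ss_tot = sum((a - y_bar) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot


def r2_reduction(model, X, y, j):
    """Drop in R^2 when predictor j is set to zero and the model is kept fixed."""
    full = r_squared(y, [model(row) for row in X])
    zeroed = [[0.0 if col == j else v for col, v in enumerate(row)] for row in X]
    return full - r_squared(y, [model(row) for row in zeroed])


def ssd(model, X, j, eps=1e-5):
    """Sum of squared partial derivatives of the model with respect to input j."""
    total = 0.0
    for row in X:
        up, down = list(row), list(row)
        up[j] += eps
        down[j] -= eps
        # Central finite difference as a numerical stand-in for the derivative.
        total += ((model(up) - model(down)) / (2 * eps)) ** 2
    return total


# Hypothetical fitted model g(z) = 2 * z1 + z2**2 and toy inputs.
def model(z):
    return 2.0 * z[0] + z[1] ** 2


X = [[1.0, 1.0], [2.0, 0.0], [3.0, -1.0]]
y = [model(row) for row in X]
ssd_first, ssd_second = ssd(model, X, 0), ssd(model, X, 1)
drop_first, drop_second = r2_reduction(model, X, y, 0), r2_reduction(model, X, y, 1)
```

In this toy case both measures rank the first input above the second, which mirrors how the two measures are compared with the SHAP ranking in the text.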
Fig. 1a, b, and c present the variable importance based on the SHAP values for the four ML models in the short-, medium-, and long-term forecasts, respectively. The left side of each model's results is a SHAP summary plot showing the range and distribution of the effects of the variables on RV prediction. As expected, all horizons depend on RVD, RVW, and RVM as the main set of predictor variables, since these variables are related to the recent history or future expectations of oil price RV. More importantly, we find that both the third- and fourth-order moments and the fifth- and sixth-order moments contribute to the prediction.

In some cases, higher-order moments even outrank the RV lags and place among the top three variables.

Fig. 1a. Variable importance for four machine learning models based on SHAP in day-ahead forecasts. The left side of each model's results is a SHAP summary plot, showing the range and distribution of the impacts of the explanatory variables on the RV prediction. The explanatory variables are listed on the y-axis in order of importance from top to bottom. The color of each dot is determined by the explanatory variable's value, which ranges from blue (low) to red (high). The right side shows the variable importance of only the higher moments (not considering the lags of RV, i.e., RVD, RVW, and RVM), measured by the mean of the absolute SHAP values.


Fig. 1b. Variable importance for four machine learning models based on SHAP in week-ahead forecasts. The left side of each model's results is a SHAP summary plot, showing the range and distribution of the impacts of the explanatory variables on the RV prediction. The explanatory variables are listed on the y-axis in order of importance from top to bottom. The color of each dot is determined by the explanatory variable's value, which ranges from blue (low) to red (high). The right side shows the variable importance of only the higher moments (not considering the lags of RV, i.e., RVD, RVW, and RVM), measured by the mean of the absolute SHAP values.


Fig. 1c. Variable importance for four machine learning models based on SHAP in month-ahead forecasts. The left side of each model's results is a SHAP summary plot, showing the range and distribution of the impacts of the explanatory variables on the RV prediction. The explanatory variables are listed on the y-axis in order of importance from top to bottom. The color of each dot is determined by the explanatory variable's value, which ranges from blue (low) to red (high). The right side shows the variable importance of only the higher moments (not considering the lags of RV, i.e., RVD, RVW, and RVM), measured by the mean of the absolute SHAP values.

The right side of the results shows the variable importance of higher-order moments (not considering the RV lag) measured by the mean of the absolute SHAP values.

The plots further indicate that the two highest moments (fifth- and sixth-order moments) provide useful information for RV forecasts, in some cases even more than the third- and fourth-order moments. In the week-ahead and month-ahead predictions presented in Fig. 1b and c, even moments outperform odd moments, as measured by their cumulative contributions. In the day-ahead prediction presented in Fig. 1a, odd moments perform better in some models.
The second measure of variable importance, the reduction in $R^2$, produces results that are highly consistent with the SHAP values. Our findings indicate that all higher-order moments, including the two highest moments, provide valuable information for RV forecasting.

In both day-ahead and week-ahead forecasting, odd moments carry more useful information compared to even moments. The additional measure of variable importance, SSD, also aligns with the previous findings in general.

The one slight difference is that the linear LASSO and EN models do not effectively leverage the predictive information in the fifth- and sixth-order moments. This discrepancy may explain why their forecasting performance is inferior to that of the tree-based models.

Across all ML models, higher-order moments continue to play a critical role. Figures A.2 and A.3 in the Internet Appendix show the full results for the reduction in $R^2$ and SSD.
In conclusion, higher-order moments aid the prediction of oil price RV. Odd moments (skewness and hyperskewness) are more essential for day-ahead forecasting, whereas even moments (kurtosis and hyperkurtosis) are more useful for week- and month-ahead forecasting.

5.2. Out-of-sample analysis

5.2.1. Forecast evaluation

Out-of-sample forecasting ability reflects a model's predictive performance for future volatility, which is highly valued because it can provide important decision-making information to investors and policymakers.

In this section, we focus on the ability of higher-order moments to improve the out-of-sample prediction accuracy of volatility models. We use the MSE and root mean square error (RMSE) recommended by Patton (2011) and the out-of-sample $R^2$ statistic ($R^2_{oos}$) recommended by Campbell and Thompson (2008) to assess the out-of-sample forecasting ability of the models. The specific definitions are as follows:

$$MSE=\frac{1}{q}\sum_{t=m+1}^{m+q}\left(RV_t-\widehat{RV}_t\right)^2,\tag{27}$$

$$RMSE=\sqrt{\frac{1}{q}\sum_{t=m+1}^{m+q}\left(RV_t-\widehat{RV}_t\right)^2},\tag{28}$$

$$R^2_{oos}=1-\frac{\sum_{t=m+1}^{m+q}\left(RV_t-\widehat{RV}_t\right)^2}{\sum_{t=m+1}^{m+q}\left(RV_t-\overline{RV}_t\right)^2},\tag{29}$$

where $RV_t$ is the true value of oil price volatility, $\widehat{RV}_t$ is the out-of-sample forecast of volatility, and $\overline{RV}_t$ is the historical average of volatility, defined as $\overline{RV}_t=\sum_{s=1}^{t-1}RV_s/(t-1)$. The MSE and RMSE measure the differences between the out-of-sample predicted and true values of volatility.

The nearer the MSE and RMSE converge to zero, the smaller the gap between the out-of-sample prediction and the true value, and the better the model prediction performance.
$R^2_{oos}$ takes the historical average of volatility as the natural benchmark for the predictive model and measures the reduction in mean squared prediction error (MSPE) of the predictive regression model relative to the historical average model. Intuitively, if higher-order moments contain useful forecasting information, $\widehat{RV}_t$ should perform better than $\overline{RV}_t$. Thus, $R^2_{oos}>0$ indicates that the model performs better than the historical average forecast, and larger $R^2_{oos}$ values indicate better model prediction performance.
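The three evaluation measures in Eqs. (27) to (29) translate directly into code. The sketch below assumes that the realised series is stored with the first $m$ in-sample observations followed by the $q$ out-of-sample ones, and it builds the historical-average benchmark from data available before each forecast day only. The function name `oos_metrics` and the toy numbers are hypothetical.

```python
from math import sqrt


def oos_metrics(rv, forecasts, m):
    """MSE, RMSE and out-of-sample R^2 for forecasts of rv[m:].

    rv is the full realised series (zero-based, m >= 1 in-sample points first)
    and forecasts[i] is the forecast of rv[m + i].
    """
    actual = rv[m:]
    q = len(actual)
    if len(forecasts) != q:
        raise ValueError("need exactly one forecast per out-of-sample point")
    sq_err = [(a - f) ** 2 for a, f in zip(actual, forecasts)]
    # Expanding historical mean using only observations before each day.
    hist_mean = [sum(rv[:t]) / t for t in range(m, m + q)]
    sq_bench = [(a - h) ** 2 for a, h in zip(actual, hist_mean)]
    mse = sum(sq_err) / q
    return {
        "MSE": mse,
        "RMSE": sqrt(mse),
        "R2_oos": 1.0 - sum(sq_err) / sum(sq_bench),
    }


# Hypothetical example: three out-of-sample days, each forecast off by 0.5.
metrics = oos_metrics([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], [3.5, 4.5, 5.5], m=3)
```

A forecast series equal to the historical average itself yields $R^2_{oos}=0$, which is the benchmark against which the models in this section are judged.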
5.2.1.1. Out-of-sample forecasting performance
Notably, COVID-19 broke out during the sample period studied in this paper and posed serious threats to the global oil industry. Some studies have shown that higher-order moments are associated with market risk (Lo et al., 2008; Navatte and Villa, 2000). Hence, whether higher-order moments can further improve the prediction of China's oil futures volatility in this extreme market environment is an important question.

We conduct out-of-sample forecasting between March 23, 2020, and July 3, 2021, using a fixed-length rolling window equal to half the sample size of 438 observations in total. In addition to day-ahead forecasts, we generate week-ahead and month-ahead forecasts.
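The fixed-window procedure can be illustrated with a short sketch. Everything named here is hypothetical: `rolling_forecasts` and `naive_mean` are illustrative helpers, the h-step target is assumed to be the observation h steps after the last training point, and the placeholder forecaster simply averages the window, whereas the paper fits HAR-type and ML models.

```python
from statistics import mean

def rolling_forecasts(series, window, horizon, fit_predict):
    """Slide a fixed-length window over `series` and forecast `horizon` steps ahead.

    Returns a list of (target_index, forecast) pairs.
    """
    out = []
    for end in range(window, len(series) - horizon + 1):
        train = series[end - window:end]   # the most recent `window` points
        target = end + horizon - 1         # index of the value being forecast
        out.append((target, fit_predict(train, horizon)))
    return out

def naive_mean(train, horizon):
    # Placeholder forecaster (assumption): the window average.
    return mean(train)

toy = [float(i) for i in range(10)]
day_ahead = rolling_forecasts(toy, window=4, horizon=1, fit_predict=naive_mean)
week_ahead = rolling_forecasts(toy, window=4, horizon=5, fit_predict=naive_mean)
```

Longer horizons leave fewer evaluable forecasts because the last h - 1 targets fall outside the sample.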

We measure forecasting performance using MSE, RMSE, and $R^2_{oos}$. To select the best model and compare the performance of different models, we use the Model Confidence Set (MCS) test and the Diebold-Mariano (DM) test.
Table 3 reports the prediction accuracy of the higher-order moment models at each horizon under the fixed window for the COVID-19 period. Models based on higher-order moments perform significantly better than the HAR model at all horizons.

Specifically, for every higher-order moment model, the MSE and RMSE are significantly lower, and the $R^2_{oos}$ is significantly higher, than for the benchmark HAR model. This is consistent with previous findings that third- and fourth-order moments contain important information for volatility forecasting (Bonato et al., 2022; Gkillas et al., 2019; Mei et al., 2017). Comparing the HAR-ODD and HAR-EVEN models, we find that the models with skewness and hyperskewness outperform those with kurtosis and hyperkurtosis for day-ahead and week-ahead predictions, whereas the opposite holds for month-ahead predictions.

This indicates that odd moments carry more useful information about oil price RV than even moments for short- and medium-term forecasting, whereas even moments carry more useful information than odd moments for long-term forecasting.

This finding echoes Khademalomoom et al. (2019), who suggest distinguishing between odd and even moments.

Table 3. Out-of-sample forecasting results during COVID-19 based on the rolling window method.

For the medium- and long-term horizons, the HAR-ALL models outperform the HAR-LOW models in terms of prediction accuracy, indicating that the two highest moments (fifth- and sixth-order moments) carry useful information for RV forecasting.

Issues such as overfitting and nonlinearity in the relationship between higher-order moments and oil price RV may have affected the performance of the HAR-ALL model for day-ahead forecasting. Thus, we employ four ML models containing the same predictors as the HAR-ALL model.

Consistent with the evidence provided by Christensen et al. (2021), Díaz et al. (2020), and Leippold et al. (2022), the ML models perform significantly better than the HAR-ALL and HAR-LOW models at all horizons, suggesting that ML algorithms can capture nonlinear relationships and mitigate the overfitting that affects the HAR-based linear model. The HAR-RF model achieves the highest prediction accuracy among the four ML models for medium- and long-term forecasting.

The HAR-GBDT model is a good alternative to the HAR-RF model: it typically produces the second-best prediction accuracy and slightly outperforms the HAR-RF model for day-ahead forecasting.

Previous studies have demonstrated that combination forecasting models typically outperform "winner-takes-all" individual models, since they allow more information to be used objectively and eliminate potential model or data bias (Becker and Clements, 2008; Kang et al., 2022). We observe that the combination forecasting models based on ML models are generally more accurate than individual forecasts, and DMSPE(1) always yields the best performance, which aligns with the extant literature.
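A compact sketch of four of the combination schemes is given below. It is illustrative only: the trimmed mean is assumed to drop the single smallest and largest forecast, and the DMSPE weights follow the common discounted-MSPE form in which each model's weight is proportional to the inverse of its discounted sum of past squared errors with discount factor theta. The function names are hypothetical, and ML-Poly is not covered.

```python
from statistics import mean, median

def trimmed_mean(forecasts):
    # Assumption: discard the smallest and the largest forecast, average the rest.
    ordered = sorted(forecasts)
    return mean(ordered[1:-1])

def dmspe_weights(past_true, past_forecasts, theta):
    """Weights proportional to the inverse discounted squared-error sum.

    past_forecasts: one list per model, aligned with past_true.
    theta = 1 weights all past errors equally; theta < 1 favors recent ones.
    Assumes no model has a perfect record (its error sum must be positive).
    """
    n = len(past_true)
    phi = [
        sum(theta ** (n - 1 - s) * (past_true[s] - f[s]) ** 2 for s in range(n))
        for f in past_forecasts
    ]
    inverse = [1.0 / p for p in phi]
    total = sum(inverse)
    return [w / total for w in inverse]

def combine(current_forecasts, weights):
    return sum(w * f for w, f in zip(weights, current_forecasts))

# Toy example with four hypothetical model forecasts for the next period.
current = [1.0, 1.2, 0.9, 1.5]
simple = (mean(current), median(current), trimmed_mean(current))
weights = dmspe_weights([1.0, 1.0], [[1.1, 0.9], [1.2, 0.8]], theta=1.0)
```

With `theta=1.0` the second toy model has four times the squared-error sum of the first, so it receives one quarter of the first model's weight.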

It is worth noting that for monthly forecasting, there is no clear advantage in using combination approaches, because the prediction accuracy of the four ML methods varies significantly. However, ML-Poly stands out and enhances the robustness of forecasting.
In short, higher-order moments carry key information for volatility forecasting.

Odd moments (skewness and hyperskewness) are more helpful than even moments (kurtosis and hyperkurtosis) for day-ahead and week-ahead forecasting, but the opposite is true for month-ahead forecasting.

ML can be used to further improve prediction accuracy and the information utilization of higher-order moments in volatility forecasting, where the best ML model is the HAR-RF model, followed by the HAR-GBDT model.

Combination forecasting models can also enhance the forecasting performance of ML models using higher-order moments, where DMSPE(1) takes the lead.
In this section, we assess the accuracy of our models' predictions using the MCS developed by Hansen et al. (2011) and select the models that perform best across all loss functions (MSE and MAE in this study). The MCS test has been widely employed to gauge the prediction effectiveness of volatility models because it does not require a predefined benchmark model (Bauwens and Otranto, 2016; Niu et al., 2022). Consistent with previous literature, we select the range and semi-quadratic statistics as MCS statistics, given as follows:
$$T_R=\max_{u,v\in M}\frac{\left|\bar{d}_{i,uv}\right|}{\sqrt{\mathrm{var}\left(\bar{d}_{i,uv}\right)}},\qquad T_{SQ}=\max_{u,v\in M}\frac{\left(\bar{d}_{i,uv}\right)^2}{\mathrm{var}\left(\bar{d}_{i,uv}\right)},\qquad \bar{d}_{i,uv}=n^{-1}\sum_{t=1}^{n}d_{i,uv,t}, \tag{30}$$
where $d_{i,uv,t}$ and $\bar{d}_{i,uv}$ are the relative sample-loss statistics used to compare the sample losses of models $u$ and $v$ in the model set $M$.
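To make Eq. (30) concrete, the sketch below computes the two statistics for a set of per-period loss series. It is a simplified illustration under stated assumptions: the variance of each mean loss differential is estimated with a circular block bootstrap (the block length, number of replications, and function name are choices made here), and the code stops at the statistics themselves. It does not reproduce the sequential elimination and p-values of the full MCS procedure.

```python
import random
from statistics import mean

def mcs_statistics(losses, n_boot=200, block=5, seed=0):
    """Return (T_R, T_SQ) for a dict mapping model name -> list of losses."""
    names = list(losses)
    n = len(losses[names[0]])
    rng = random.Random(seed)

    # Draw the bootstrap index sets once so every pair shares the resamples.
    boot_indices = []
    for _ in range(n_boot):
        idx = []
        while len(idx) < n:
            start = rng.randrange(n)
            idx.extend((start + k) % n for k in range(block))
        boot_indices.append(idx[:n])

    t_r, t_sq = 0.0, 0.0
    for a in range(len(names)):
        for b in range(a + 1, len(names)):
            d = [x - y for x, y in zip(losses[names[a]], losses[names[b]])]
            d_bar = mean(d)
            boot_means = [mean(d[i] for i in idx) for idx in boot_indices]
            var = mean((m - d_bar) ** 2 for m in boot_means)
            if var == 0.0:
                continue  # identical loss differential in every resample
            t_r = max(t_r, abs(d_bar) / var ** 0.5)
            t_sq = max(t_sq, d_bar ** 2 / var)
    return t_r, t_sq

# Toy loss series for three hypothetical models.
toy_losses = {
    "A": [0.10 + 0.01 * (t % 5) for t in range(60)],
    "B": [0.20 + 0.02 * (t % 7) for t in range(60)],
    "C": [0.15 + 0.01 * (t % 3) for t in range(60)],
}
t_r, t_sq = mcs_statistics(toy_losses)
```

Because both statistics in Eq. (30) are maxima over the same pairs, `t_sq` equals the square of `t_r` when they share one variance estimate.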
Table 4 shows the MCS results for the proposed models under the rolling window method during the COVID-19 crisis. A p-value of 1 indicates that the model is selected as the best, and the closer the p-value is to 1, the better the model's forecasting performance.

We obtain the following findings. First, although the HAR-type models fail the MCS test, the HAR-ALL model's p-values are larger than those of the HAR-LOW model for medium-term forecasting, showing that the two highest moments (fifth- and sixth-order moments) contain useful forecasting information.

Second, the MCS p-values exceed 0.1 for some ML models and combination forecasts. Notably, DMSPE(1) obtains a p-value of 1 for short- and medium-term forecasts, and HAR-ALL-RF obtains the same p-value for long-term forecasts, indicating that ML models and combination approaches improve the utilization of higher-order moments for oil price volatility forecasting in the COVID-19 period.

Furthermore, the results in rows 3 and 4 show that the odd moments contain more forecasting information than the even moments for the day-ahead forecasts, whereas the gaps in the week-ahead and month-ahead predictions are too small to be informative.

Overall, higher-order moments significantly improve RV forecasting.

Table 4. Out-of-sample forecast accuracy (MCS) during COVID-19 based on the rolling window method.

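Panel A: h = 1-day; Panel B: h = 1-week; Panel C: h = 1-month. Columns give the MCS statistic (Range or SemiQ) and the loss function (MSE or MAE).

| Models | A: Range MSE | A: Range MAE | A: SemiQ MSE | A: SemiQ MAE | B: Range MSE | B: Range MAE | B: SemiQ MSE | B: SemiQ MAE | C: Range MSE | C: Range MAE | C: SemiQ MSE | C: SemiQ MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HAR | 0.035 | 0.271 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-LOW | 0.021 | 0.325 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-ODD | 0.118 | 0.271 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-EVEN | 0.021 | 0.271 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-ALL | 0.035 | 0.271 | 0.000 | 0.000 | 0.075 | 0.003 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-ALL-LASSO | 0.035 | 0.271 | 0.000 | 0.000 | 0.016 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-ALL-EN | 0.035 | 0.271 | 0.000 | 0.000 | 0.018 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-ALL-GBDT | 0.152 | 0.325 | 0.000 | 0.000 | 0.022 | 0.285 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 |
| HAR-ALL-RF | 0.118 | 0.271 | 0.000 | 0.000 | 0.335 | 0.669 | 0.000 | 0.000 | 1.000* | 1.000* | 1.000* | 1.000* |
| MEAN | 0.508 | 0.444 | 0.000 | 0.000 | 0.722 | 0.669 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 |
| MEDIAN | 0.921 | 0.736 | 0.000 | 0.000 | 0.722 | 0.961 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 |
| TRIMMED MEAN | 0.921 | 0.736 | 0.921 | 0.682 | 0.722 | 0.961 | 0.615 | 0.000 | 0.001 | 0.000 | 0.001 | 0.000 |
| DMSPE(1) | 0.715 | 0.736 | 0.000 | 0.000 | 0.722 | 0.848 | 0.615 | 0.961 | 0.001 | 0.000 | 0.001 | 0.000 |
| DMSPE(0.9) | 1.000* | 1.000* | 1.000* | 1.000* | 1.000* | 1.000* | 1.000* | 1.000* | 0.001 | 0.000 | 0.000 | 0.000 |
| ML-Poly | 0.035 | 0.271 | 0.000 | 0.000 | 0.335 | 0.669 | 0.000 | 0.000 | 0.336 | 0.471 | 0.336 | 0.471 |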
Note: This table presents the model confidence set (MCS) p-values based on the Range and SemiQ test statistics, $T_R$ and $T_{SQ}$. Models with p > 0.1 are indicated in bold. Panels A, B, and C report the findings for the 1-step, 5-step, and 22-step forecasts of realized volatility, respectively.

The HAR model is the benchmark. HAR-LOW adds skewness and kurtosis; HAR-ODD adds skewness and hyperskewness; HAR-EVEN adds kurtosis and hyperkurtosis; HAR-ALL adds skewness, kurtosis, hyperskewness, and hyperkurtosis.

HAR-ALL-LASSO, HAR-ALL-EN, HAR-ALL-GBDT, and HAR-ALL-RF contain the same variables as HAR-ALL. MEAN, MEDIAN, TRIMMED MEAN, DMSPE(1), DMSPE(0.9), and ML-Poly are all combinations of the four ML models above. The forecasting model with the best performance is highlighted with *.
To ensure the validity of the results and to make pairwise comparisons between methods, we use another forecast evaluation approach, the Diebold and Mariano (2002) test for differences in out-of-sample predictive accuracy between two models (see also Gu et al., 2020). The DM statistic is calculated as follows:
$$DM_{statistic}=\frac{\bar{d}}{\sqrt{\mathrm{Var}(d)}}, \tag{31}$$
where $\bar{d}=\frac{1}{q}\sum_{t=m+1}^{m+q}d_t$, $d_t$ denotes the difference between the loss functions of the two models, and $\mathrm{Var}(d)$ is the variance of $d_t$. A positive (negative) DM statistic denotes that the forecasting model outperforms (underperforms) the benchmark.
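The pairwise comparison can be sketched as follows. This is a minimal illustration under explicit assumptions: the loss differential is defined as benchmark loss minus model loss so that a positive statistic favors the forecasting model, the denominator is the plain standard error of the mean differential (no autocorrelation correction), and the p-value uses a two-sided standard normal approximation. The function name is hypothetical.

```python
import math
from statistics import mean

def dm_statistic(loss_benchmark, loss_model):
    """Return (DM statistic, two-sided normal p-value) for aligned loss lists."""
    d = [lb - lm for lb, lm in zip(loss_benchmark, loss_model)]
    q = len(d)
    d_bar = mean(d)
    var_d = sum((x - d_bar) ** 2 for x in d) / (q - 1)  # sample variance of d_t
    stat = d_bar / math.sqrt(var_d / q)                 # mean over its standard error
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))     # equals 2 * (1 - Phi(|stat|))
    return stat, p_value

# Toy squared-error losses, not values from the paper.
bench = [1.0, 1.2, 0.9, 1.1, 1.3, 1.0]
model = [0.5, 0.7, 0.6, 0.4, 0.8, 0.5]
stat, p_value = dm_statistic(bench, model)
```

Swapping the two arguments flips the sign of the statistic and leaves the p-value unchanged. For multi-step forecasts the differentials are serially correlated, so a long-run variance estimator would be the more careful choice.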
The DM test results for the short-, medium-, and long-term forecasts are shown in Tables A.2a, A.2b, and A.2c in the Internet Appendix, respectively. Positive numbers indicate that the column model outperforms the row model. The entries in the first row are almost all positive and statistically significant, indicating that the models containing higher-order moments have better predictive power.

The ML models also show positive values, continuing their excellent performance, especially HAR-ALL-GBDT and HAR-ALL-RF. Compared to the ML models, the combination models provide more accurate day-ahead and week-ahead forecasts, especially the DMSPE-type models.

ML-Poly also performs strongly in monthly forecasting. These results are in line with the previous ones, indicating that higher-order moments significantly improve the model's forecasting accuracy.

6. Robustness check

For the predictions described above, we rely on the rolling window to generate forecasts of China's oil futures RV during the COVID-19 crisis.

As a robustness check, we repeat the exercise with the HAR-type models, ML models, and combination forecasting models, but use the recursive window as an alternative.
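The difference between the two estimation schemes comes down to where each training sample starts. The sketch below lists the training bounds under both schemes. The helper name `training_bounds` is hypothetical, and the sketch assumes that the recursive window starts from the same initial length and then grows by one observation per step.

```python
def training_bounds(n_obs, first_window, scheme):
    """Return (start, end) index pairs; each training sample is series[start:end]."""
    if scheme not in ("rolling", "recursive"):
        raise ValueError("scheme must be 'rolling' or 'recursive'")
    bounds = []
    for end in range(first_window, n_obs):
        # Rolling: drop the oldest point as a new one arrives.
        # Recursive: always start from the first observation.
        start = end - first_window if scheme == "rolling" else 0
        bounds.append((start, end))
    return bounds

rolling = training_bounds(6, 3, "rolling")
recursive = training_bounds(6, 3, "recursive")
```

Under the recursive scheme the estimation sample grows over time, so later forecasts draw on more history than under the fixed-length window.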
Table 5 reports the prediction evaluation of the models using the recursive window, which yields the following findings. First, all models are superior to the benchmark HAR model. This result is the same as, or better than, the result based on the fixed window in Table 3, indicating that higher-order moments help in forecasting oil price volatility for the COVID-19 period.

Second, the HAR-ALL model performs better than the HAR-LOW model for daily and weekly forecasting; thus, we conclude that the two highest moments contain useful forecasting information for oil price volatility.

Third, the ML models continue to prevail over the HAR-type models according to several statistical indicators, with the HAR-RF models having the best predictive accuracy, followed by the HAR-GBDT models.

Fourth, combination forecasting models (in particular, DMSPE(1)) outperform the other models, except for monthly forecasting.

Also, comparing rows 3 and 4, we find that the HAR-ODD models outperform the HAR-EVEN models for both short- and medium-term forecasting, indicating that odd moments are more informative for daily and weekly forecasting, whereas even moments perform better for monthly forecasting.

These results align with our out-of-sample analysis using the rolling window.

Table 5. Out-of-sample forecasting results during COVID-19 based on the recursive window method.

The MCS test under the recursive window shows that combination approaches and ML models are more likely to pass the MCS test. Odd moments consistently outperform even moments in daily and weekly forecasting. These results, shown in Table A3 in the Internet Appendix, align with those obtained from the MCS test using the fixed window.
The DM test results under the recursive window further confirm the robustness of our findings. Models incorporating higher-order moments continue to perform strongly, displaying significantly positive values relative to the benchmark model. Table A4 in the Internet Appendix shows the full set of DM test results.

7. Further analysis of positive and negative volatility

The predictability of asymmetric volatility is crucial for risk assessment and portfolio diversification decisions (Garcia and Tsafack, 2011). According to Gong and Lin (2017), positive volatility may support producers and optimistic traders of physical commodities in the oil futures market, whereas negative volatility may support consumers and short sellers.

Thus, we investigate the role of higher-order moments in forecasting asymmetric volatility.

7.1. Methodology for positive and negative volatility forecasts

Given the evidence above that higher-order moments carry predictive information for oil price RV, we follow Barndorff-Nielsen et al. (2008) and decompose RV into positive realized semivariance (PRV) and negative realized semivariance (NRV), formulated as follows:
$$PRV_t=\sum_{i=1}^{N_t}r_{t,i}^2\,I(r_{t,i}>0),\qquad NRV_t=\sum_{i=1}^{N_t}r_{t,i}^2\,I(r_{t,i}<0), \tag{32}$$
where $I(\cdot)$ is an indicator function equal to 1 when the condition in parentheses is true and 0 otherwise. We then adapt the forecasting model specifications accordingly.
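The decomposition in Eq. (32) is straightforward to compute from one day's intraday returns. The sketch below is illustrative: the function name and the toy returns are assumptions, and zero returns contribute to neither component.

```python
def realized_semivariances(intraday_returns):
    """Return (PRV, NRV) for one trading day's intraday returns."""
    prv = sum(r * r for r in intraday_returns if r > 0)  # upside moves only
    nrv = sum(r * r for r in intraday_returns if r < 0)  # downside moves only
    return prv, nrv

# Toy returns for a single day, not data from the paper.
returns = [0.01, -0.02, 0.0, 0.03]
prv, nrv = realized_semivariances(returns)
rv = sum(r * r for r in returns)  # realized variance as the sum of squared returns
```

By construction the two semivariances add up to the sum of squared intraday returns, so PRV and NRV split total realized variance by the sign of each return.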
Specifically, for positive (good) volatility forecasts, we construct five HAR-type models (i.e., HAR-PRV, HAR-PRV-LOW, HAR-PRV-ODD, HAR-PRV-EVEN, and HAR-PRV-ALL), four ML models (i.e., HAR-PRV-LASSO, HAR-PRV-EN, HAR-PRV-GBDT, and HAR-PRV-RF), and six combination forecasting models (i.e., MEAN, MEDIAN, TRIMMED MEAN, DMSPE(1), DMSPE(0.9), and ML-Poly).

The variables for the ML models are the same as those for the HAR-PRV-ALL model, and the combination forecasting models are based on the four ML models. The five HAR-type models are given by
$$PRV_{t+h}=\alpha_0+\alpha_1PRV_t+\alpha_2PRV_{t-5:t}+\alpha_3PRV_{t-22:t}+\varepsilon_t, \tag{33}$$
$$PRV_{t+h}=\alpha_0+\alpha_1PRV_t+\alpha_2PRV_{t-5:t}+\alpha_3PRV_{t-22:t}+\beta_1RSK+\beta_3RKU+\varepsilon_t, \tag{34}$$
$$PRV_{t+h}=\alpha_0+\alpha_1PRV_t+\alpha_2PRV_{t-5:t}+\alpha_3PRV_{t-22:t}+\beta_1RSK+\beta_2HRSK+\varepsilon_t, \tag{35}$$
$$PRV_{t+h}=\alpha_0+\alpha_1PRV_t+\alpha_2PRV_{t-5:t}+\alpha_3PRV_{t-22:t}+\beta_3RKU+\beta_4HRKU+\varepsilon_t, \tag{36}$$
$$PRV_{t+h}=\alpha_0+\alpha_1PRV_t+\alpha_2PRV_{t-5:t}+\alpha_3PRV_{t-22:t}+\beta_1RSK+\beta_2HRSK+\beta_3RKU+\beta_4HRKU+\varepsilon_t, \tag{37}$$
where $PRV_{t+h}$ $(h=1,5,22)$ represents the day-ahead, week-ahead, and month-ahead positive volatility, respectively; $PRV_t$ is the daily PRV, and $PRV_{t-i:t}=(PRV_t+PRV_{t-1}+\dots+PRV_{t-(i-1)})/i$ $(i=5,22)$ is the weekly and monthly PRV. $RSK=[RSK_t,RSK_{t-5:t},RSK_{t-22:t}]$, $RKU=[RKU_t,RKU_{t-5:t},RKU_{t-22:t}]$, $HRSK=[HRSK_t,HRSK_{t-5:t},HRSK_{t-22:t}]$, and $HRKU=[HRKU_t,HRKU_{t-5:t},HRKU_{t-22:t}]$ are vectors of the daily, weekly, and monthly skewness, kurtosis, hyperskewness, and hyperkurtosis, respectively. $\beta_1=[\beta_{11},\beta_{12},\beta_{13}]$, $\beta_2=[\beta_{21},\beta_{22},\beta_{23}]$, $\beta_3=[\beta_{31},\beta_{32},\beta_{33}]$, and $\beta_4=[\beta_{41},\beta_{42},\beta_{43}]$ are the coefficients for RSK, HRSK, RKU, and HRKU, respectively.
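The regressors of the baseline model in Eq. (33) can be assembled as below. This is a sketch under the averaging definition given above, with a hypothetical function name; the same averaging would apply to each higher-order moment series to build the RSK, RKU, HRSK, and HRKU vectors.

```python
def har_rows(prv, horizon):
    """Build ((daily, weekly, monthly), target) pairs for the HAR-PRV model.

    weekly  = average of the latest 5 daily values, ending at t
    monthly = average of the latest 22 daily values, ending at t
    target  = PRV at t + horizon
    """
    rows = []
    for t in range(21, len(prv) - horizon):
        daily = prv[t]
        weekly = sum(prv[t - 4:t + 1]) / 5
        monthly = sum(prv[t - 21:t + 1]) / 22
        rows.append(((daily, weekly, monthly), prv[t + horizon]))
    return rows

# Toy series 0, 1, ..., 29 so the averages are easy to check by hand.
toy_prv = [float(i) for i in range(30)]
rows = har_rows(toy_prv, horizon=1)
```

The first usable row is at t = 21 because the monthly average needs 22 observations. Each row can then be passed to any least-squares or ML routine.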
We follow the same process for negative volatility prediction, formulating five HAR-type models (i.e., HAR-NRV, HAR-NRV-LOW, HAR-NRV-ODD, HAR-NRV-EVEN, and HAR-NRV-ALL), four ML models (i.e., HAR-NRV-LASSO, HAR-NRV-EN, HAR-NRV-GBDT, and HAR-NRV-RF), and six combination forecasting models (i.e., MEAN, MEDIAN, TRIMMED MEAN, DMSPE(1), DMSPE(0.9), and ML-Poly).

The models for negative volatility are constructed in the same manner.

7.2. Full sample estimation results

Table 6 presents the coefficient estimates for all HAR-type models for positive volatility prediction.

The coefficients on the higher-order moments and their lags are nonzero in all models, and more than half of them are statistically significant, reflecting that higher-order moments can influence predictions of positive volatility.

Compared to the HAR-PRV-EVEN models, the HAR-PRV-ODD models have more significant higher-order moment coefficients in the day-ahead forecasts but fewer in the week-ahead and month-ahead forecasts.

We conclude that odd moments contain short-term predictive information about positive volatility, and even moments contain more medium- and long-term predictive information.

Table 6. Full sample estimation results of positive volatility.

| MODEL | HAR-PRV | HAR-PRV-LOW | HAR-PRV-ODD | HAR-PRV-EVEN | HAR-PRV-ALL |
|---|---|---|---|---|---|
| Panel A: h = 1 | | | | | |
| PRVD | 0.073* | 0.083 | 0.088* | 0.052 | 0.070 |
| PRVW | 0.221*** | 0.247* | 0.242*** | 0.216 | 0.241* |
| PRVM | 0.062 | 0.012*** | 0.033 | 0.039*** | 0.012*** |
| RSK | | −0.122*** | −0.276*** | | −0.247*** |
| RSK-5 | | −0.117** | −0.164* | | −0.124 |
| RSK-22 | | −0.051 | −0.106 | | −0.101 |
| HRSK | | | 0.166* | | 0.134* |
| HRSK-5 | | | 0.038 | | −0.021 |
| HRSK-22 | | | 0.083 | | 0.077 |
| RKU | | −0.006 | | 0.106 | 0.109 |
| RKU-5 | | −0.012 | | 0.152 | 0.231** |
| RKU-22 | | −0.090 | | −0.342* | −0.347** |
| HRKU | | | | −0.091 | −0.107 |
| HRKU-5 | | | | −0.157 | −0.243* |
| HRKU-22 | | | | 0.250* | 0.261* |
| Panel B: h = 5 | | | | | |
| PRVD | 0.481*** | 0.504*** | 0.490*** | 0.473*** | 0.489*** |
| PRVW | 0.240*** | 0.266*** | 0.273*** | 0.222*** | 0.258*** |
| PRVM | 0.115*** | 0.043 | 0.081* | 0.074* | 0.042 |
| RSK | | −0.137*** | −0.244*** | | −0.200*** |
| RSK-5 | | −0.155*** | −0.239*** | | −0.190 |
| RSK-22 | | −0.059* | −0.102 | | −0.087 |
| HRSK | | | 0.125* | | 0.066 |
| HRSK-5 | | | 0.077 | | 0.008 |
| HRSK-22 | | | 0.079 | | 0.064 |
| RKU | | −0.063* | 0.490 | 0.027 | 0.059 |
| RKU-5 | | −0.011 | 0.273 | 0.198* | 0.280** |
| RKU-22 | | −0.136*** | 0.081 | −0.531*** | −0.533*** |
| HRKU | | | | −0.063 | −0.117** |
| HRKU-5 | | | | −0.197* | −0.290** |
| HRKU-22 | | | | 0.396*** | 0.402*** |
| Panel C: h = 22 | | | | | |
| PRVD | 0.231*** | 0.241*** | 0.244*** | 0.224*** | 0.241*** |
| PRVW | 0.096** | 0.107** | 0.132*** | 0.089* | 0.111** |
| PRVM | 0.354*** | 0.242*** | 0.322*** | 0.247*** | 0.235*** |
| RSK | | −0.127*** | −0.247*** | | −0.186** |
| RSK-5 | | −0.136*** | −0.191** | | −0.157* |
| RSK-22 | | −0.079* | −0.068 | | 0.023 |
| HRSK | | | 0.131* | | 0.072 |
| HRSK-5 | | | 0.028 | | 0.024 |
| HRSK-22 | | | 0.064 | | −0.092 |
| RKU | | −0.075** | | −0.178* | −0.159 |
| RKU-5 | | 0.002 | | −0.192* | −0.118 |
| RKU-22 | | −0.313*** | | −0.579*** | −0.538*** |
| HRKU | | | | 0.137 | 0.096 |
| HRKU-5 | | | | 0.204* | 0.121 |
| HRKU-22 | | | | 0.293** | 0.235* |
Notes: This table shows the parameter estimation results from a full-sample perspective.

The standard HAR-PRV model is the benchmark. HAR-PRV-LOW adds skewness and kurtosis; HAR-PRV-ODD adds skewness and hyperskewness; HAR-PRV-EVEN adds kurtosis and hyperkurtosis; HAR-PRV-ALL adds skewness, kurtosis, hyperskewness, and hyperkurtosis.

The parameter estimation results for the 1-step, 5-step, and 22-step forecasts of realized volatility are shown in Panels A, B, and C, respectively. The asterisks ***, **, and * denote significance at the 1%, 5%, and 10% levels.
To measure the importance of each explanatory variable for prediction and to further understand the modeled process, we employ SHAP values to calculate the variables' importance. Fig. 2a, 2b, and 2c depict the importance of the variables in the four ML models for positive volatility forecasting at the short-, medium-, and long-term horizons, respectively. The left side of each model's results shows the SHAP summary plot. Setting aside the effect of the shrinkage penalties, we find that all variables have an effect on positive volatility forecasts.

The SHAP values of some higher-order moments are even larger than those of the RV lags. The right side of each model's results shows the importance of the higher-order moments (not considering the RV lags), measured by the mean of the absolute SHAP values.

All higher-order moments have predictive ability, and the two highest moments (hyperskewness and hyperkurtosis) are sometimes even more informative than skewness and kurtosis. In the medium- and long-term forecasts presented in Fig. 2b and 2c, even moments outperform odd moments as measured by their cumulative contribution, whereas odd moments perform better in the short-term forecasts. The variable importance for positive volatility forecasting based on the reduction in R² and SSD is presented in Figures A.4 and A.5 in the Internet Appendix. These results also align closely with the findings discussed above.
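The importance measure on the right-hand panels, the mean absolute SHAP value, can be illustrated for the simplest case of a linear model. Under the usual assumption of independent features, the SHAP value of feature j for one observation is the coefficient times the feature's deviation from its sample mean. The sketch below uses that closed form with made-up coefficients and data. For the tree-based models in the paper, a dedicated explainer such as the `shap` library's `TreeExplainer` would be the natural tool; the code here does not depend on it.

```python
from statistics import mean

def linear_shap_importance(coefs, rows):
    """Mean |SHAP| per feature for a linear model with independent features.

    coefs: dict feature -> coefficient (hypothetical values below)
    rows:  list of dicts, feature -> observed value
    """
    importance = {}
    for name, beta in coefs.items():
        column = [row[name] for row in rows]
        mu = mean(column)
        shap_values = [beta * (x - mu) for x in column]
        importance[name] = mean(abs(v) for v in shap_values)
    return importance

def cumulative(importance, names):
    # Cumulative contribution of a group of features, e.g. the odd moments.
    return sum(importance[n] for n in names)

# Hypothetical coefficients and two toy observations.
coefs = {"RSK": -0.2, "HRSK": 0.1, "RKU": 0.05, "HRKU": 0.3}
rows = [
    {"RSK": 1.0, "HRSK": 0.0, "RKU": 2.0, "HRKU": 1.0},
    {"RSK": 3.0, "HRSK": 2.0, "RKU": 4.0, "HRKU": 1.0},
]
imp = linear_shap_importance(coefs, rows)
odd_total = cumulative(imp, ["RSK", "HRSK"])
even_total = cumulative(imp, ["RKU", "HRKU"])
```

A feature that never varies, such as `HRKU` in the toy data, receives zero importance whatever its coefficient, which is why the measure reflects contribution to the prediction rather than coefficient size.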

Fig. 2a. Variable importance of positive volatility for four machine learning models based on SHAP in day-ahead forecasts. The left side of each model's results is a SHAP summary plot, showing the range and distribution of the impacts of the explanatory variables on the RV prediction. The explanatory variables are listed on the y-axis in order of importance from top to bottom. The color of each dot is determined by the explanatory variable's value, which ranges from blue (low) to red (high). The right side shows the variable importance of the higher moments only (not considering the lags of RV, i.e., RVD, RVW, and RVM), measured by the mean of the absolute SHAP values.


Fig. 2b. Variable importance of positive volatility for four machine learning models based on SHAP in week-ahead forecasts. The left side of each model's results is a SHAP summary plot, showing the range and distribution of the impacts of the explanatory variables on the RV prediction. The explanatory variables are listed on the y-axis in order of importance from top to bottom. The color of each dot is determined by the explanatory variable's value, which ranges from blue (low) to red (high). The right side shows the variable importance of the higher moments only (not considering the lags of RV, i.e., RVD, RVW, and RVM), measured by the mean of the absolute SHAP values.


Fig. 2c. Variable importance of positive volatility for four machine learning models based on SHAP in month-ahead forecasts. The left side of each model's results is a SHAP summary plot, showing the range and distribution of the impacts of the explanatory variables on the RV prediction. The explanatory variables are listed on the y-axis in order of importance from top to bottom. The color of each dot is determined by the explanatory variable's value, which ranges from blue (low) to red (high). The right side shows the variable importance of the higher moments only (not considering the lags of RV, i.e., RVD, RVW, and RVM), measured by the mean of the absolute SHAP values.

The full-sample estimation results for the HAR-type models of negative volatility are displayed in Table 7. The coefficients on the higher-order moments and their lags are nonzero, and more than half of them are significant, implying that higher-order moments play a predictive role for negative volatility.

Table 7. Full sample estimation results of negative volatility.

| MODEL | HAR-NRV | HAR-NRV-LOW | HAR-NRV-ODD | HAR-NRV-EVEN | HAR-NRV-ALL |
|---|---|---|---|---|---|
| Panel A: h = 1 | | | | | |
| NRVD | 0.118*** | 0.113 | 0.098** | 0.138 | 0.120** |
| NRVW | 0.161*** | 0.132** | 0.131** | 0.175*** | 0.132** |
| NRVM | 0.091* | 0.076** | 0.119** | 0.061*** | 0.089* |
| RSK | | −0.032 | −0.231** | | −0.196** |
| RSK-5 | | −0.088* | −0.303*** | | −0.272*** |
| RSK-22 | | −0.002 | 0.107 | | 0.131 |
| HRSK | | | 0.226** | | 0.193* |
| HRSK-5 | | | 0.234** | | 0.202* |
| HRSK-22 | | | −0.083 | | −0.141 |
| RKU | | −0.074* | | 0.000 | −0.047 |
| RKU-5 | | −0.045 | | 0.119 | 0.106 |
| RKU-22 | | −0.078* | | −0.131 | −0.103 |
| HRKU | | | | −0.079 | −0.027 |
| HRKU-5 | | | | −0.172 | −0.136 |
| HRKU-22 | | | | 0.056 | 0.015 |
| Panel B: h = 5 | | | | | |
| NRVD | 0.464*** | 0.456*** | 0.428*** | 0.498*** | 0.463*** |
| NRVW | 0.195*** | 0.148*** | 0.167*** | 0.199*** | 0.153*** |
| NRVM | 0.158*** | 0.164*** | 0.214*** | 0.122*** | 0.177*** |
| RSK | | −0.064* | −0.257*** | | −0.215*** |
| RSK-5 | | −0.120*** | −0.337*** | | −0.292*** |
| RSK-22 | | 0.041 | 0.230*** | | 0.259*** |
| HRSK | | | 0.217*** | | 0.179** |
| HRSK-5 | | | 0.244*** | | 0.183** |
| HRSK-22 | | | −0.162* | | −0.232*** |
| RKU | | −0.108*** | | −0.076 | −0.101 |
| RKU-5 | | −0.014 | | 0.292** | 0.313** |
| RKU-22 | | −0.116*** | | −0.268* | −0.265* |
| HRKU | | | | −0.040 | −0.007 |
| HRKU-5 | | | | −0.319** | −0.320** |
| HRKU-22 | | | | 0.149 | 0.124 |
| Panel C: h = 22 | | | | | |
| NRVD | 0.253*** | 0.235*** | 0.242*** | 0.257*** | 0.239*** |
| NRVW | 0.184*** | 0.182*** | 0.185*** | 0.187*** | 0.187*** |
| NRVM | 0.233*** | 0.222*** | 0.291*** | 0.169*** | 0.210*** |
| RSK | | −0.040 | −0.128* | | −0.063 |
| RSK-5 | | −0.041 | −0.005 | | 0.061 |
| RSK-22 | | 0.069* | 0.121 | | 0.181* |
| HRSK | | | 0.099 | | 0.024 |
| HRSK-5 | | | −0.054 | | −0.131* |
| HRSK-22 | | | 0.025 | | −0.104 |
| RKU | | −0.078* | | −0.074 | −0.056 |
| RKU-5 | | −0.035 | | 0.041 | 0.145 |
| RKU-22 | | −0.226*** | | −0.502*** | −0.577*** |
| HRKU | | | | −0.009 | −0.020 |
| HRKU-5 | | | | −0.079 | −0.195* |
| HRKU-22 | | | | 0.271* | 0.336** |
Notes: This table shows the parameter estimation results from a full-sample perspective.
注释:本表展示了从全样本视角的参数估计结果。

The standard HAR-NRV model serves as the benchmark; HAR-NRV-LOW (adds skewness and kurtosis), HAR-NRV-ODD (adds skewness and hyper-skewness), HAR-NRV-EVEN (adds kurtosis and hyper-kurtosis), and HAR-NRV-ALL (adds skewness, kurtosis, hyper-skewness, and hyper-kurtosis) are displayed in turn.

The parameter estimation results for the 1-step, 5-step, and 22-step forecasts of realized volatility are shown in Panels A, B, and C, respectively. The asterisks ***, **, and * denote significance at the 1%, 5%, and 10% levels, respectively.
Fig. 3 shows the importance of the variables in the four ML models for negative volatility forecasting. Fig. 3a, b, and c depict the short-, medium-, and long-term forecast results, respectively. The SHAP summary plots are presented on the left of the model results. We observe that all the variables are related to negative volatility in the absence of the effect of a shrinkage penalty.

Higher-order moments' SHAP values are higher than the RV lag values in some situations. The right side of the model results shows the variables’ importance in terms of higher-order moments (not considering the RV lags) measured by the mean of the absolute SHAP values.

We find that all the higher-order moments contain useful predictive information, and the two highest moments (hyperskewness and hyperkurtosis) have more information than skewness and kurtosis in some cases.

Moreover, even moments perform better than odd moments, as measured by their cumulative contributions, in the medium- and long-term forecasts presented in Fig. 3b and c, whereas odd moments perform better in the short-term forecasts. Figures A.6 and A.7 in the Internet Appendix display the variable importance measured by the reduction in $R^2$ and by SSD, respectively, when forecasting negative volatility. Both figures align with the aforementioned findings, providing additional evidence for the importance of higher-order moments in predicting negative volatility.
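As a rough illustration of the importance measure discussed above, the sketch below computes the mean absolute SHAP value for each variable and then sums those means over the odd and even moment groups. The SHAP matrix is made-up toy data, and the function names `mean_abs_shap` and `group_contribution` are hypothetical; in practice the SHAP values would come from an explainer fitted to each ML model.

```python
def mean_abs_shap(shap_values, names):
    """Mean absolute SHAP value per variable (rows are observations)."""
    n_obs = len(shap_values)
    return {
        name: sum(abs(row[j]) for row in shap_values) / n_obs
        for j, name in enumerate(names)
    }


def group_contribution(importance, group):
    """Cumulative importance of the variables listed in `group`."""
    return sum(importance[name] for name in group)


# Toy SHAP matrix (illustrative numbers only, not results from the paper).
names = ["RSK", "HRSK", "RKU", "HRKU"]
shap_values = [
    [0.10, -0.20, 0.05, 0.30],
    [-0.30, 0.10, -0.15, 0.10],
    [0.20, 0.00, 0.10, -0.20],
]

importance = mean_abs_shap(shap_values, names)
odd = group_contribution(importance, ["RSK", "HRSK"])   # skewness-type moments
even = group_contribution(importance, ["RKU", "HRKU"])  # kurtosis-type moments
print(importance)
print("odd:", odd, "even:", even)
```

Comparing `odd` and `even` at each forecast horizon is one simple way to express the cumulative-contribution comparison made in the text.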

Fig. 3a. Variable importance of negative volatility for four machine learning models based on SHAP in day-ahead forecasts. The left side of each model's results shows the SHAP summary plot, which displays the range and distribution of the impacts of each explanatory variable on the RV prediction. The explanatory variables are listed on the y-axis in order of importance from top to bottom. The color of each dot is determined by the explanatory variable's value, which ranges from blue (low) to red (high). The right side shows the variable importance of the higher moments only (not considering the lags of RV, i.e., RVD, RVW, and RVM), measured by the mean of the absolute SHAP values.


Fig. 3b. Variable importance of negative volatility for four machine learning models based on SHAP in week-ahead forecasts. The left side of each model's results shows the SHAP summary plot, which displays the range and distribution of the impacts of each explanatory variable on the RV prediction. The explanatory variables are listed on the y-axis in order of importance from top to bottom. The color of each dot is determined by the explanatory variable's value, which ranges from blue (low) to red (high). The right side shows the variable importance of the higher moments only (not considering the lags of RV, i.e., RVD, RVW, and RVM), measured by the mean of the absolute SHAP values.


Fig. 3c. Variable importance of negative volatility for four machine learning models based on SHAP in month-ahead forecasts. The left side of each model's results shows the SHAP summary plot, which displays the range and distribution of the impacts of each explanatory variable on the RV prediction. The explanatory variables are listed on the y-axis in order of importance from top to bottom. The color of each dot is determined by the explanatory variable's value, which ranges from blue (low) to red (high). The right side shows the variable importance of the higher moments only (not considering the lags of RV, i.e., RVD, RVW, and RVM), measured by the mean of the absolute SHAP values.

7.3. Out-of-sample forecasts

The forecasting performance of the proposed models for positive volatility during the COVID-19 period, as measured by MSE, RMSE, and $R^2_{oos}$, is displayed in Table 8. First, all models based on higher-order moments outperform the benchmark HAR-PRV model. The HAR-PRV-ALL model performs better than the HAR-PRV-LOW model for week-ahead and month-ahead forecasting.

These results provide evidence that both third- and fourth-order moments and fifth- and sixth-order moments indeed improve forecasting accuracy.

Second, the four ML models have excellent performance (better than the HAR-ALL models), showing that ML models can extract valuable information from higher-order moments for positive volatility forecasting.

Third, combination forecasting models also yield substantially higher accuracy and achieve the best performance in short- and medium-term forecasts.

Furthermore, the HAR-PRV-ODD models are more accurate than the HAR-PRV-EVEN models for day-ahead and week-ahead forecasting, whereas the HAR-PRV-EVEN models perform better for month-ahead forecasting.
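The three accuracy measures used in this comparison can be sketched as follows. This is a minimal illustration on toy numbers; the out-of-sample $R^2$ is written in the common form of one minus the ratio of the model's squared errors to the benchmark's, which is an assumption about the exact definition used here, and the function names are hypothetical.

```python
import math


def mse(actual, forecast):
    """Mean squared forecast error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared forecast error."""
    return math.sqrt(mse(actual, forecast))


def r2_oos(actual, forecast, benchmark):
    """Out-of-sample R^2 relative to a benchmark forecast.

    Assumed form: 1 - SSE(model) / SSE(benchmark); a positive value means
    the model has smaller squared errors than the benchmark.
    """
    sse_model = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    sse_bench = sum((a - b) ** 2 for a, b in zip(actual, benchmark))
    return 1.0 - sse_model / sse_bench


# Toy series (illustrative numbers only).
actual = [1.0, 2.0, 3.0, 4.0]
model = [1.0, 2.0, 3.0, 5.0]      # one error of size 1
benchmark = [2.0, 2.0, 3.0, 6.0]  # errors of size 1 and 2
print(mse(actual, model), rmse(actual, model), r2_oos(actual, model, benchmark))
```

Under this definition, a model with higher-order moments beats the HAR benchmark whenever `r2_oos` is positive.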

Table 8. Out-of-sample forecast accuracy (MSE, RMSE, $R^2_{oos}$) based on the rolling window method for positive volatility during COVID-19.

Table 9 shows the MCS test results for predicting positive volatility during the COVID-19 period. The p values of the ML models and combination models are higher than those of the HAR-type models. For day-ahead and week-ahead forecasting, the TRIMMED MEAN model attains a p value of 1 under several loss functions. For month-ahead forecasts, HAR-PRV-RF performs better. The DM test for positive volatility forecasting shows that all models containing higher-order moments are better than the simple HAR model.

The tree-based ML models (HAR-PRV-GBDT and HAR-PRV-RF) demonstrate competitive information extraction capabilities, and combination models exhibit better forecasting performance. The detailed results of the DM test are presented in Table A5 of the Internet Appendix.
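For readers unfamiliar with the DM test, the sketch below shows a simplified Diebold-Mariano comparison of two forecasts under squared-error loss. It is an illustration under stated assumptions, not the paper's exact procedure: it uses the plain sample variance of the loss differential (no autocorrelation correction, which multi-step forecasts would normally need), a normal approximation for the p value, and toy data.

```python
import math


def dm_test(actual, forecast_a, forecast_b):
    """Simplified Diebold-Mariano test with squared-error loss.

    Returns (statistic, one-sided p value) for the alternative that
    forecast_b has lower expected loss than forecast_a.
    """
    d = [(y - a) ** 2 - (y - b) ** 2
         for y, a, b in zip(actual, forecast_a, forecast_b)]
    n = len(d)
    d_bar = sum(d) / n
    var_d = sum((x - d_bar) ** 2 for x in d) / (n - 1)
    stat = d_bar / math.sqrt(var_d / n)
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * (1.0 - math.erf(stat / math.sqrt(2.0)))
    return stat, p_value


# Toy data: the "augmented" forecast is closer to the actual values.
actual = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
benchmark = [2.0, 3.5, 3.5, 5.5, 6.0, 7.5]
augmented = [1.5, 2.5, 3.0, 4.5, 5.5, 6.5]
stat, p_value = dm_test(actual, benchmark, augmented)
print(stat, p_value)
```

A large positive statistic with a small p value would indicate that the second forecast is significantly more accurate than the first.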

Table 9. Out-of-sample forecast accuracy for positive volatility based on the rolling window (MCS).

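Panel A: h = 1-day; Panel B: h = 1-week; Panel C: h = 1-month.

| Models | A Range MSE | A Range MAE | A SemiQ MSE | A SemiQ MAE | B Range MSE | B Range MAE | B SemiQ MSE | B SemiQ MAE | C Range MSE | C Range MAE | C SemiQ MSE | C SemiQ MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HAR-PRV | 0.265 | 0.308 | 0.000 | 0.000 | 0.143 | 0.014 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-PRV-LOW | 0.135 | 0.546 | 0.000 | 0.000 | 0.131 | 0.111 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-PRV-ODD | 0.149 | 0.308 | 0.000 | 0.000 | 0.131 | 0.014 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-PRV-EVEN | 0.135 | 0.308 | 0.000 | 0.000 | 0.131 | 0.014 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-PRV-ALL | 0.135 | 0.308 | 0.000 | 0.000 | 0.131 | 0.047 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-PRV-LASSO | 0.135 | 0.308 | 0.000 | 0.000 | 0.142 | 0.078 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-PRV-EN | 0.149 | 0.308 | 0.000 | 0.000 | 0.131 | 0.014 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-PRV-GBDT | 0.993 | 1.000* | 0.000 | 1.000* | 0.143 | 0.078 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-PRV-RF | 0.265 | 0.308 | 0.000 | 0.000 | 0.440 | 0.997 | 0.000 | 0.997 | 1.000* | 1.000* | 1.000* | 1.000* |
| MEAN | 0.832 | 0.600 | 0.000 | 0.000 | 0.395 | 0.014 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| MEDIAN | 0.993 | 0.878 | 0.000 | 0.000 | 0.440 | 0.997 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| TRIMMED MEAN | 1.000* | 0.878 | 1.000* | 0.872 | 1.000* | 0.997 | 1.000* | 0.997 | 0.000 | 0.000 | 0.000 | 0.000 |
| DMSPE(1) | 0.897 | 0.684 | 0.000 | 0.000 | 0.440 | 1.000* | 0.000 | 1.000* | 0.000 | 0.000 | 0.001 | 0.000 |
| DMSPE(0.9) | 0.951 | 0.878 | 0.000 | 0.000 | 0.440 | 0.014 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| ML-Poly | 0.897 | 0.844 | 0.000 | 0.000 | 0.395 | 0.985 | 0.000 | 0.000 | 0.235 | 0.704 | 0.235 | 0.704 |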
Note: This table presents the model confidence set (MCS) p values based on the Range and SemiQ test statistics, $T_R$ and $T_{SQ}$. Models with p > 0.25 are indicated in bold. Panels A, B, and C report the results for the 1-step, 5-step, and 22-step forecasts of realized volatility, respectively. The HAR-PRV model is the benchmark; HAR-PRV-LOW (adds skewness and kurtosis), HAR-PRV-ODD (adds skewness and hyper-skewness), HAR-PRV-EVEN (adds kurtosis and hyper-kurtosis), and HAR-PRV-ALL (adds skewness, kurtosis, hyper-skewness, and hyper-kurtosis) are displayed in turn. HAR-PRV-LASSO, HAR-PRV-EN, HAR-PRV-GBDT, and HAR-PRV-RF contain the same variables as HAR-PRV-ALL. MEAN, MEDIAN, TRIMMED MEAN, DMSPE(1), and DMSPE(0.9) are all combinations of the four ML models above. The forecasting model with the best performance is marked with *.
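The combination schemes named in the note can be sketched as below. This is an illustration under assumptions rather than the paper's implementation: the DMSPE weights follow the common discounted mean squared prediction error form, in which each model is weighted by the inverse of its discounted past squared errors and the value in parentheses is read as the discount factor `theta`; the trimming rule (drop the single smallest and largest forecast) is also assumed. All numbers are toy values.

```python
from statistics import mean, median


def trimmed_mean(forecasts, trim=1):
    """Average after dropping the `trim` smallest and largest forecasts."""
    ordered = sorted(forecasts)
    return mean(ordered[trim:len(ordered) - trim])


def dmspe_weights(actual, past_forecasts, theta=1.0):
    """Weights proportional to the inverse discounted squared errors.

    past_forecasts[i][s] is model i's forecast for period s; with
    theta < 1, recent errors count more than older ones.
    """
    n = len(actual)
    phi = [
        sum(theta ** (n - 1 - s) * (actual[s] - f[s]) ** 2 for s in range(n))
        for f in past_forecasts
    ]
    inverse = [1.0 / p for p in phi]
    total = sum(inverse)
    return [w / total for w in inverse]


# Toy history for four hypothetical models and their next-period forecasts.
history_actual = [1.0, 2.0, 3.0]
history_forecasts = [
    [1.5, 2.5, 3.5],
    [1.0, 2.0, 4.0],
    [2.0, 3.0, 4.0],
    [1.0, 2.5, 3.0],
]
new_forecasts = [3.9, 4.2, 5.0, 4.0]

weights = dmspe_weights(history_actual, history_forecasts, theta=1.0)
combined = {
    "MEAN": mean(new_forecasts),
    "MEDIAN": median(new_forecasts),
    "TRIMMED MEAN": trimmed_mean(new_forecasts),
    "DMSPE(1)": sum(w * f for w, f in zip(weights, new_forecasts)),
}
print(combined)
```

With exactly four forecasts, dropping one value at each end makes the trimmed mean coincide with the median, so the trimming rule or the model pool behind the reported TRIMMED MEAN results may differ from this sketch.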
Table 10 presents the models’ forecast accuracy for negative volatility during the COVID-19 crisis.

We observe that the HAR-NRV-ALL models outperform the HAR-NRV-LOW and HAR-NRV models for both short- and medium-term forecasting, indicating that higher-order moments are useful predictors of negative volatility.

Both ML models and combination models yield relatively high accuracy. Specifically, HAR-NRV-LASSO has the best performance among the ML models for day-ahead forecasting, while HAR-NRV-RF is better for week-ahead and month-ahead forecasting.

Also, combination models improve on the prediction accuracy of the ML models, except at long forecast horizons.

Moreover, HAR-NRV-ODD models outperform HAR-NRV-EVEN models for short- and medium-term forecasting, whereas HAR-NRV-EVEN models perform better for long-term forecasting.

Table 10. Out-of-sample forecast accuracy (MSE, RMSE, $R^2_{oos}$) based on the rolling window method for negative volatility during COVID-19.

The MCS test results for the negative volatility models for the COVID-19 period are presented in Table 11. The models that include higher-order moments have satisfactory p values compared to the simple HAR model. The p values of HAR-NRV-RF are equal to 1 in the medium- and long-term forecasts. The DM test for negative volatility is presented in Table A6 in the Internet Appendix. The results also show that higher-order moments contain meaningful predictive information.

Table 11. Out-of-sample forecast accuracy for negative volatility based on the rolling window (MCS).

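Panel A: h = 1-day; Panel B: h = 1-week; Panel C: h = 1-month.

| Models | A Range MSE | A Range MAE | A SemiQ MSE | A SemiQ MAE | B Range MSE | B Range MAE | B SemiQ MSE | B SemiQ MAE | C Range MSE | C Range MAE | C SemiQ MSE | C SemiQ MAE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| HAR-NRV | 0.031 | 0.226 | 0.015 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-NRV-LOW | 0.076 | 0.338 | 0.076 | 0.142 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-NRV-ODD | 0.034 | 0.338 | 0.015 | 0.130 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-NRV-EVEN | 0.034 | 0.329 | 0.015 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-NRV-ALL | 1.000* | 1.000* | 1.000* | 1.000* | 0.095 | 0.020 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-NRV-LASSO | 0.034 | 0.226 | 0.015 | 0.000 | 0.003 | 0.002 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-NRV-EN | 0.034 | 0.226 | 0.015 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| HAR-NRV-GBDT | 0.031 | 0.338 | 0.000 | 0.130 | 0.014 | 0.015 | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 |
| HAR-NRV-RF | 0.031 | 0.226 | 0.000 | 0.000 | 1.000* | 1.000* | 1.000* | 1.000* | 1.000* | 1.000* | 1.000* | 1.000* |
| MEAN | 0.031 | 0.329 | 0.015 | 0.000 | 0.869 | 0.208 | 0.381 | 0.106 | 0.000 | 0.000 | 0.000 | 0.000 |
| MEDIAN | 0.031 | 0.338 | 0.000 | 0.000 | 0.869 | 0.208 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| TRIMMED MEAN | 0.031 | 0.338 | 0.015 | 0.130 | 0.869 | 0.218 | 0.381 | 0.153 | 0.000 | 0.000 | 0.001 | 0.000 |
| DMSPE(1) | 0.034 | 0.338 | 0.015 | 0.130 | 0.982 | 0.281 | 0.982 | 0.387 | 0.006 | 0.000 | 0.005 | 0.000 |
| DMSPE(0.9) | 0.034 | 0.338 | 0.015 | 0.000 | 0.869 | 0.218 | 0.381 | 0.153 | 0.000 | 0.000 | 0.000 | 0.000 |
| ML-Poly | 0.034 | 0.226 | 0.015 | 0.000 | 0.869 | 0.563 | 0.381 | 0.563 | 0.069 | 0.043 | 0.069 | 0.043 |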
Note: This table presents the model confidence set (MCS) p values based on the Range and SemiQ test statistics, $T_R$ and $T_{SQ}$. Models with p > 0.25 are indicated in bold. Panels A, B, and C report the results for the 1-step, 5-step, and 22-step forecasts of realized volatility, respectively. The HAR-NRV model is the benchmark; HAR-NRV-LOW (adds skewness and kurtosis), HAR-NRV-ODD (adds skewness and hyper-skewness), HAR-NRV-EVEN (adds kurtosis and hyper-kurtosis), and HAR-NRV-ALL (adds skewness, kurtosis, hyper-skewness, and hyper-kurtosis) are displayed in turn. HAR-NRV-LASSO, HAR-NRV-EN, HAR-NRV-GBDT, and HAR-NRV-RF contain the same variables as HAR-NRV-ALL. MEAN, MEDIAN, TRIMMED MEAN, DMSPE(1), and DMSPE(0.9) are all combinations of the four ML models above. The forecasting model with the best performance is marked with *.
In summary, the empirical results for the positive and negative volatility forecasting align with previous results. First, all higher-order moments contain useful predictive information for both positive and negative volatility. Second, ML techniques and combination forecasting models have superior accuracy.

Third, odd (third- and fifth-order) moments have more predictive information than even (fourth- and sixth-order) moments for short-term forecasting, whereas even moments contain more information for long-term forecasting.

7.4. Robustness tests

To validate our findings, we rerun the empirical analysis described above using a recursive window. Table 7, Table 8, and Table 12 show the prediction accuracy, MCS test, and DM test results for positive volatility forecasts.
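The difference between the rolling and recursive schemes can be sketched with index ranges alone. The generators below are a minimal illustration for one-step-ahead forecasts; the function names and window sizes are hypothetical, and `train_end` is exclusive.

```python
def rolling_windows(n_obs, window):
    """Yield (train_start, train_end, target) with a fixed-length window."""
    for target in range(window, n_obs):
        yield target - window, target, target


def recursive_windows(n_obs, first_window):
    """Yield (train_start, train_end, target) with an expanding window."""
    for target in range(first_window, n_obs):
        yield 0, target, target


# Toy example: six observations and an initial window of three.
rolling = list(rolling_windows(6, 3))
recursive = list(recursive_windows(6, 3))
print(rolling)
print(recursive)
```

The rolling scheme drops the oldest observation each time the window moves forward, whereas the recursive scheme keeps every past observation, so the estimation sample grows with each forecast.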

We find that models containing higher-order moments outperform the HAR benchmark models, and that ML and combination models have satisfactory outcomes in the medium- and long-term forecasts.

The HAR-PRV-ODD models also outperform the HAR-PRV-EVEN models for short- and medium-term forecasting, and even moments have greater predictive ability for long-term forecasting. The above results are robust.

Table 12. Out-of-sample forecast accuracy (MSE, RMSE, $R^2_{oos}$) based on the recursive window for positive volatility during COVID-19.

The prediction accuracy, MCS test and DM test results for negative volatility forecasts are presented in Table 9, Table 10, Table 13. The HAR-ALL models yield the best performance among the HAR-type models for all horizons, indicating that higher-order moments enhance the prediction of negative volatility. ML and combination models produce better outcomes than HAR-type models.

Moreover, HAR-NRV-ODD models perform better for short-term forecasting, and HAR-NRV-EVEN models outperform HAR-NRV-ODD models for long-term forecasting. These conclusions align with those shown above.

Table 13. Out-of-sample forecast accuracy (MSE, RMSE, $R^2_{oos}$) based on the recursive window for negative volatility during COVID-19.

8. Conclusion

In this study, we assess the significance of higher-order moments, ranked from third to sixth order, for forecasting oil futures volatility.

We use variants of the popular HAR models augmented with different orders of moments to investigate the impact of skewness and kurtosis, hyper moments, odd moments, and even moments on RV forecasts.

We also examine whether machine learning models and combination forecasting models could achieve improved performance. Moreover, we dissect the data structure using SHAP values and the reductions in predictive $R^2$ and SSD to determine the importance of each indicator. Finally, we demonstrate the predictive ability of higher-order moments for positive and negative volatility. Thus, our work expands on existing studies.
The full-sample estimation results and the variable importance from the ML models, measured by SHAP values and the reductions in $R^2$ and SSD, show that higher-order moments contribute to volatility prediction, and that the two highest moments (fifth- and sixth-order moments) contain more predictive information than skewness and kurtosis in some cases.

Compared to odd moments, even moments show stronger predictive ability in long-term forecasts.

Using out-of-sample forecasts, we examine the COVID-19 period and demonstrate that not only third- and fourth-order moments but also hyper moments capture significant predictive information for oil RV during the crisis.

The results also prove that ML models consistently outperform HAR-type models, indicating that ML models can handle nonlinearity well and rely on higher-order moments to obtain more information about oil RV forecasts.

Moreover, combination forecasting models perform satisfactorily and improve ML models’ capabilities by considering strengths and compensating for weaknesses. Notably, odd moments are relatively more important for oil RV predictions for short- and medium-term forecast horizons, whereas even moments have more information for month-ahead forecasting.

These results are robust, considering alternative forecasting and evaluation methods.
Further extending the analysis to positive and negative volatility, we discover that higher-order moments still play a major role for different horizons. ML models and combination forecasts also yield outstanding outcomes.

Additionally, both positive and negative volatility forecasts using odd moments are more definite in the short and medium term, whereas even moments provide more incremental information for long-term forecasting.
China's oil futures are accessible to investors worldwide, and their volatility during the COVID-19 crisis plays an important role in risk management, asset pricing, and investment portfolios.

The useful forecasting indicators—higher-order moments—used in this study are of great applied economic importance for helping investors, market participants, and policymakers to forecast the volatility of China's oil futures more effectively, reduce investment risk, and achieve better returns.

CRediT authorship contribution statement

Hongwei Zhang: Writing – review & editing, Supervision, Conceptualization. Xinyi Zhao: Writing – review & editing, Writing – original draft, Software, Methodology, Conceptualization. Wang Gao: Writing – review & editing, Data curation. Zibo Niu: Writing – review & editing, Writing – original draft, Methodology, Conceptualization.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

We gratefully acknowledge financial support from the National Natural Science Foundation of China (Nos. 72204273 and 72074228), the Chinese National Funding of Social Sciences (Nos. 21&ZD103 and 22FGLB032), the Humanities and Social Science Research Project of the Ministry of Education of China (No. 22YJCZH235), the Scientific Research Foundation of Hunan Provincial Education Department (No. 21A0009), the Hunan Provincial Natural Science Foundation (No. 2023JJ30709), and the Innovation Driven Project of Central South University (No. 2022ZZTS0326).

Appendix A. Supplementary data

The following is the Supplementary data to this article: Multimedia component 1.

Data availability

Data will be made available on request.

References


1
In order to enhance the clarity and convenience of presenting subsequent tables and figures, we adopt RVD, RVW, and RVM to represent $RV_t$, $RV_{t-5:t}$, and $RV_{t-22:t}$ in Eq. (6), respectively. These variables correspond to the daily, weekly, and monthly data of RV. Similarly, we use RSK, RSK-5, and RSK-22 to denote $RSK_t$, $RSK_{t-5:t}$, and $RSK_{t-22:t}$ in Eq. (7), representing the daily, weekly, and monthly skewness, respectively. The remaining higher-order moments follow a similar pattern to the skewness measures described above.
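As an illustration of the daily, weekly, and monthly components named in this footnote, the sketch below builds RVD, RVW, and RVM from a daily series. It assumes the usual HAR convention that the weekly and monthly terms are simple averages over the most recent 5 and 22 trading days including day t; the exact windowing in Eq. (6) may differ, and the function name and toy series are hypothetical.

```python
def har_regressors(rv, weekly=5, monthly=22):
    """Return rows (t, RVD, RVW, RVM) for each day with a full monthly window."""
    rows = []
    for t in range(monthly - 1, len(rv)):
        rvd = rv[t]                                       # daily component
        rvw = sum(rv[t - weekly + 1:t + 1]) / weekly      # 5-day average
        rvm = sum(rv[t - monthly + 1:t + 1]) / monthly    # 22-day average
        rows.append((t, rvd, rvw, rvm))
    return rows


# Toy daily series 1, 2, ..., 30 (illustrative only).
rv = [float(i) for i in range(1, 31)]
rows = har_regressors(rv)
print(rows[0])
```

The same function could be applied to a daily skewness or kurtosis series to obtain, for example, RSK, RSK-5, and RSK-22 under the same averaging assumption.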
2
Because tree-based models are not differentiable, the Dimopoulos et al. (1995) method is not applicable. Therefore, when we employ SSD, we measure variable importance for RF and GBDT using the mean decrease in impurity, following Gu et al. (2020).