Research paper

Short-term PV-Wind forecasting of large-scale regional site clusters based on FCM clustering and hybrid Inception-ResNet embedded with Informer

^{a} College of Automation Engineering, Shanghai University of Electric Power, Shanghai 200090, China

^{b} Electric Power Science Research Institute, State Grid Shanghai Electric Power Company, Shanghai 200437, China

ARTICLE INFO

Keywords:

Wind and photovoltaic cluster forecasting
Fuzzy c-means clustering
The improved gray wolf algorithm
Informer

Abstract

To cope with the challenge that a high proportion of new energy generation poses to the stable operation of the power grid, this paper proposes an innovative short-term power forecasting model for regional site clusters based on fuzzy c-means (FCM) clustering and a hybrid Inception-ResNet deep neural network embedded with Informer. Firstly, multiple wind farms and photovoltaic sites are grouped by the FCM clustering algorithm for clustering prediction. Secondly, strongly correlated factors are selected by combining linear and nonlinear correlation analysis between the variables and power generation. Furthermore, an improved gray wolf optimization (GWO) algorithm determines the optimal parameters of the deep network model, and Informer and Inception are integrated to capture temporal relationships and extract features effectively. Finally, a wind and photovoltaic dataset from western China is employed to verify the model. The results demonstrate that it outperforms other algorithms, with $5.400\%$ and $4.200\%$ higher $R^{2}$ and $2.525\%$ and $2.090\%$ lower MAPE in wind and solar forecasting, respectively, improving both the accuracy and the efficiency of prediction.

1. Introduction

As energy and power systems become lower-carbon and cleaner in response to climate change and the energy crisis, wind and photovoltaic power are increasingly popular due to their clean and sustainable nature [1]. Since wind power relies on external factors such as wind direction, temperature, and air pressure [2], and photovoltaic generation depends on variables such as solar radiation, relative humidity, and temperature [3], the randomness and fluctuation of power generation pose significant challenges to the safe and economical operation of the power grid. Precise forecasting can raise the operation and maintenance standard of wind and photovoltaic sites, facilitate efficient resource allocation and grid management, and promote energy integration [4].

Power forecasting methods can be divided into two main categories: mathematical statistical methods and artificial intelligence methods [5]. Statistical methods examine the relationships within historical time series of power generation and use mathematical models to anticipate performance, for instance, Autoregressive Integrated Moving Average (ARIMA) and Exponentially Weighted Moving Average (EWMA). Abdulla et al. [6] employed different seasonal Holt-Winters models to anticipate power generation in Kuwait from 2020 to 2030. Nevertheless, power generation is a nonlinear stochastic process, and such methods struggle to capture its complicated nonlinear interactions. Artificial intelligence methods, with outstanding nonlinear function approximation and computational capability, can manage nonlinear interactions and have therefore been introduced into power generation forecasting. Recurrent Neural Networks (RNN) are adept at handling long-term dependencies in sequential data and can be effective for power forecasting. For example, Mohamad et al. [7] developed a framework for accurately predicting offshore wind power that combined Long Short-Term Memory Recurrent Neural Networks (LSTM) with Swarm Intelligence (SI). Wu et al. [8] suggested a short-term PV power prediction model based on Extreme Gradient Boosting (IXGBoost), with similar-day clustering and signal mode decomposition. Convolutional Neural Networks (CNN) excel at capturing spatial information and are mainly applied to image processing, but they can also handle time series by treating the data as image-like inputs. Hu et al. [8] introduced a novel forecasting method termed Temporal Collaborative Attention. Liu et al. [9] integrated Bidirectional Long Short-Term Memory (BiLSTM) with CNN to forecast photovoltaic (PV) power. Adeel et al. [10] proposed a hybrid embedded deep neural network including ResNet, CNN, and an Inception module, which improved feature extraction and forecasting accuracy over time.

The above models are deterministic: they can only forecast the expected value of the output and cannot describe the uncertainty of power generation. To obtain more information about future output and reduce the decision-making risk of the power system, it is particularly important to design an accurate probabilistic model of power generation. Probabilistic forecasts can be classified as probability density forecasts, interval forecasts, and quantile forecasts. Specifically, a probability density forecast provides the probability density function of future power output, an interval forecast provides the approximate range of the future output, and a quantile forecast gives the value at a certain quantile [11]. Chen et al. [12] validated a quantile regression model on the dataset from an offshore wind farm in Penglai District, Shandong Province. Yang et al. [13] proposed a wind farm cluster prediction model based on Graph Convolution Neural Networks (GCN) and fluctuation correlation. In reference [14], an innovative Capsule Network (ACCNet) stood out in interval prediction tasks. Machine learning can express the uncertainty of forecasts as probability distributions, which helps in adapting to future uncertainty [15].

Previous research has concentrated on the prediction of centralized power farms, whereas the power system is concerned with the total generation of distributed generation sites, and there are still relatively few studies on the power prediction of distributed sites. In general, there are three common frameworks for forecasting the generation of power sites: cumulative prediction, predicting cumulatively, and clustering prediction. Cumulative prediction accumulates the values of the distributed sites to obtain the total for a region and forecasts that total; predicting cumulatively forecasts each site and simply superimposes the results; and clustering prediction categorizes the sites into clusters, forecasts each cluster, and then accumulates the cluster results. Power generation forecasting must strike a balance between accuracy and efficiency. On the one hand, cumulative prediction improves prediction efficiency but may increase the forecasting error. On the other hand, predicting cumulatively improves prediction accuracy but consumes considerable computation time and storage. Clustering prediction achieves good prediction accuracy while also increasing forecasting efficiency, and it has aroused strong interest in recent years. However, relatively few studies have been conducted in this area. Hou et al. [16] combined DBSCAN clustering and LSTM to cluster wind turbines and select a representative turbine to predict power output.
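The trade-off between the three frameworks can be illustrated with a minimal Python sketch. The forecaster here is a hypothetical placeholder (a moving average), not the model proposed in this paper, and the site histories and cluster labels are made-up values. Because the placeholder is linear, all three frameworks return the same number; the point of the sketch is only how many times the model must be run.

```python
class CountingForecaster:
    """Placeholder model: predicts the mean of the last `window` values and counts calls."""

    def __init__(self, window=3):
        self.window = window
        self.calls = 0

    def __call__(self, series):
        self.calls += 1
        recent = series[-self.window:]
        return sum(recent) / len(recent)


def total_series(site_series):
    """Element-wise sum of equally long site histories."""
    return [sum(values) for values in zip(*site_series)]


def cumulative_prediction(site_series, model):
    """Sum the histories first, then forecast the regional total once."""
    return model(total_series(site_series))


def predicting_cumulatively(site_series, model):
    """Forecast every site separately, then add up the forecasts."""
    return sum(model(series) for series in site_series)


def clustering_prediction(site_series, labels, model):
    """Forecast once per cluster (on the cluster total), then add up the results."""
    clusters = {}
    for series, label in zip(site_series, labels):
        clusters.setdefault(label, []).append(series)
    return sum(model(total_series(members)) for members in clusters.values())


# Made-up histories for four sites and an assumed two-cluster assignment.
sites = [[1, 2, 3], [2, 2, 2], [0, 1, 5], [4, 4, 4]]
labels = [0, 0, 1, 1]

for name, run in [
    ("cumulative", lambda m: cumulative_prediction(sites, m)),
    ("predicting cumulatively", lambda m: predicting_cumulatively(sites, m)),
    ("clustering", lambda m: clustering_prediction(sites, labels, m)),
]:
    model = CountingForecaster()
    print(name, run(model), "model runs:", model.calls)
```

With a real nonlinear forecaster the three totals would generally differ, which is where the accuracy side of the trade-off comes from.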

The forecasting performance of power generation is affected by various characteristic variables. To strengthen the performance of the forecasting model, the input variables must be carefully selected from the dataset. Reference [10] used the Pearson correlation coefficient (PCC) to analyze only the linear relationship between meteorological environmental factors and historical generation data. Reference [14] calculated the Pearson correlations (PCs) between the meteorological parameters and the PV power. Reference [17] evaluated the correlation between meteorological data and PV power using the Pearson correlation coefficient. In summary, while it is important to identify the crucial variables from the correlation between power generation and its associated characteristics, most studies focus solely on either linear or nonlinear correlation, which may lead to the omission of key information.

Although the above models demonstrate exceptional prediction performance, clustering-based forecasting still deserves further study [18], and most research has applied a traditional correlation method that analyzes only a single type of relationship [19]. Furthermore, it is difficult for an individual model to handle complex and variable fluctuations in power generation, and the model parameters may need to be optimized to improve prediction performance [20]. Therefore, this paper proposes an innovative short-term power forecasting model for regional site clusters based on variable selection, the FCM clustering algorithm, an improved GWO algorithm, and a hybrid Inception-ResNet deep neural network embedded with Informer, which achieves a tradeoff between accuracy and efficiency. The main contributions of this paper are as follows:
(1) To select variables that are strongly correlated with power generation and can characterize the power sites, the Pearson correlation coefficient and the Spearman rank correlation coefficient derived from the copula function are used together to analyze both the linear and the nonlinear correlation between influencing factors and power generation.
(2) Large-scale power sites are grouped by the FCM clustering algorithm for clustering prediction, the center of each cluster is selected as the representative site, and the Silhouette Coefficient (SC) is used to evaluate the clustering results. In addition, the strongly correlated variables of the representative site are combined with its power generation to construct the input data set.
(3) A hybrid Inception-ResNet deep neural network model embedded with Informer is proposed. It is built on ResNet, which effectively alleviates the vanishing gradient problem. The Inception module captures multi-scale features in the time series, and the Informer layer, benefiting from self-attention, is proficient in capturing long-term dependencies in the time series.
(4) The GWO algorithm, which has outstanding global search ability, is improved by introducing a new adaptive position update strategy and a new nonlinearly adjusted convergence factor to search for optimal parameters, which lays a solid foundation for prediction accuracy and robustness.

The rest of the paper is organized as follows. Section 2 provides a detailed description of the algorithms and models. Section 3 presents and analyzes the experimental results of various forecasting models. Finally, the last section concludes the research and outlines directions for future work.

The detailed architecture is divided into four components, as illustrated in Fig. 1. Firstly, the dataset is preprocessed, including outlier filtering and missing value filling. Secondly, Pearson analysis and copula-based Spearman analysis are applied to select influencing factors and determine the input and output. Then, distributed power sites are clustered using the FCM clustering algorithm, and a hybrid Inception-ResNet forecasting model embedded with Informer is constructed, whose parameters are optimized through the improved GWO algorithm. Finally, a wind and photovoltaic dataset from western China is used to evaluate the model against different forecasting models.
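As a rough illustration of the clustering step in this pipeline, the following pure-Python sketch implements the standard fuzzy c-means updates (memberships from relative distances, centers as membership-weighted means). It is a generic textbook version written under stated assumptions, not the authors' implementation: the fuzzifier `m = 2`, the deterministic initialization, the helper `representative_sites` (which picks the site nearest to each center), and the two-feature site descriptors are all hypothetical choices made for the example.

```python
import math


def fcm(points, n_clusters=2, m=2.0, max_iter=100, tol=1e-6):
    """Fuzzy c-means on equal-length feature vectors.

    Returns (centers, memberships); memberships[j][i] is the degree to which
    point j belongs to cluster i, and each row sums to 1.
    """
    n = len(points)
    if not 2 <= n_clusters <= n:
        raise ValueError("need 2 <= n_clusters <= number of points")
    dim = len(points[0])
    # Assumed deterministic start: centers taken evenly along the input order.
    centers = [list(points[i * (n - 1) // (n_clusters - 1)]) for i in range(n_clusters)]
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(max_iter):
        # Membership update: u_ji = 1 / sum_k (d_ji / d_jk)^(2 / (m - 1)).
        memberships = []
        for p in points:
            dists = [max(math.dist(p, c), 1e-12) for c in centers]
            memberships.append(
                [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]
            )
        # Center update: mean of the points weighted by u_ji^m.
        new_centers = []
        for i in range(n_clusters):
            weights = [u[i] ** m for u in memberships]
            total = sum(weights)
            new_centers.append(
                [sum(w * p[d] for w, p in zip(weights, points)) / total for d in range(dim)]
            )
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


def representative_sites(points, centers):
    """Index of the site closest to each cluster center (one reading of 'representative site')."""
    return [min(range(len(points)), key=lambda j: math.dist(points[j], c)) for c in centers]


# Made-up two-feature descriptors for six sites forming two obvious groups.
site_features = [(0.9, 0.1), (1.0, 0.2), (1.1, 0.1), (5.0, 4.9), (5.1, 5.0), (4.9, 5.2)]
centers, memberships = fcm(site_features, n_clusters=2)
hard_labels = [max(range(len(u)), key=lambda i: u[i]) for u in memberships]
print(hard_labels, representative_sites(site_features, centers))
```

In practice the number of clusters would be chosen by comparing a quality index such as the Silhouette Coefficient across candidate values, which this sketch does not cover.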

2.1. The analysis of influencing factor correlation

Since wind power is strongly and randomly affected by meteorological factors, it is necessary to analyze the correlation between power generation and meteorological factors and to screen the input variables appropriately. To describe this correlation more accurately and specifically, the Pearson correlation coefficient and the Spearman rank correlation coefficient based on copula theory are employed to analyze linear and nonlinear correlation and to select relevant variables accordingly.

Pearson correlation analysis measures the linear correlation between random variables $X$ and $Y$, where $\bar{x}$ and $\bar{y}$ denote the averages of $X$ and $Y$. The Pearson correlation coefficient $r_{x y}$ between the two variables is defined as the quotient of their covariance and the product of their standard deviations, which can be expressed as follows:

r_{x y} = \frac{\sum_{i = 1}^{n} (x_{i} - \bar{x}) (y_{i} - \bar{y})}{\sqrt{\sum_{i = 1}^{n} {(x_{i} - \bar{x})}^{2}} \sqrt{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}}

Fig. 1. The flowchart of the proposed model based on FCM clustering and hybrid Inception-ResNet embedded with Informer.

The correlation classification of the Pearson correlation coefficient is shown in Table 1, where the absolute value of $r_{x y}$ represents the strength of the correlation between the two variables. When $r_{x y} = 1$, the two variables are completely positively correlated; when $r_{x y} = -1$, they are completely negatively correlated. The correlation coefficient is 0 when there is no linear correlation.
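The definition of $r_{x y}$ and the classification of Table 1 can be sketched in a few lines of Python. The sample values are made up for illustration, and the handling of the boundary values 0.4 and 0.6 (assigned to the stronger class) is an assumption, since the ranges in Table 1 share their endpoints.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient r_xy of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Numerator: sum of products of deviations from the means.
    cov = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    # Denominator: product of the root sums of squared deviations.
    sx = math.sqrt(sum((xi - x_bar) ** 2 for xi in x))
    sy = math.sqrt(sum((yi - y_bar) ** 2 for yi in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def classify_pearson(r):
    """Map |r_xy| to the classes of Table 1 (boundaries go to the stronger class)."""
    magnitude = abs(r)
    if magnitude >= 0.6:
        return "strong"
    if magnitude >= 0.4:
        return "moderate"
    return "weak"


# Made-up wind speed (m/s) and power (MW) samples.
wind_speed = [3.1, 4.0, 5.2, 6.1, 7.3, 8.0]
power = [0.9, 1.6, 2.9, 4.1, 6.0, 7.2]
r = pearson(wind_speed, power)
print(round(r, 3), classify_pearson(r))
```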

Spearman correlation analysis is less sensitive to outliers because it converts raw data into rank data: $x_{i}$ and $y_{i}$ are sorted in descending order and assigned the corresponding ranks $x_{i}^{'}$ and $y_{i}^{'}$. The Spearman rank difference $d_{i}$ and the Spearman rank correlation coefficient $\rho_{s}$ are:

d_{i} = x_{i}^{'} - y_{i}^{'}

\rho_{s} = 1 - \frac{6 \sum d_{i}^{2}}{n (n^{2} - 1)}
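The rank-difference formula can be checked with a small Python sketch. It follows the descending ranking described above and assumes there are no tied values, because the closed-form expression holds exactly only in that case; the function names and sample data are illustrative.

```python
def descending_ranks(values):
    """Rank 1 for the largest value, n for the smallest (no ties allowed)."""
    if len(set(values)) != len(values):
        raise ValueError("the rank-difference formula assumes no tied values")
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def spearman(x, y):
    """Spearman rank correlation: 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = descending_ranks(x), descending_ranks(y)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# A monotone but nonlinear relationship still gives a coefficient of 1.
x = [1, 2, 3, 4, 5]
print(spearman(x, [v ** 3 for v in x]))   # 1.0
print(spearman(x, [2, 1, 4, 3, 5]))       # 0.8
```

The first call shows why the rank coefficient complements the Pearson coefficient: a cubic relationship is perfectly monotone, so its rank correlation is 1 even though it is not linear.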

Copula functions construct joint probability distribution functions that can characterize nonlinear correlations between variables, and they have been introduced into renewable energy correlation and volatility analysis in recent years [21]. According to Sklar's theorem, the joint distribution function of an $n$-dimensional random vector can be described by the marginal distributions of the $n$ variables and a copula function [22]. For a series of random variables $x_{1}, x_{2}, \dots, x_{n}$ whose respective marginal distribution functions are $F_{1} (x_{1}), F_{2} (x_{2}), \dots, F_{n} (x_{n})$, the joint distribution function of the variables based on the copula function $C$ is:

F (x_{1}, x_{2}, \dots, x_{n}) = C (F_{1} (x_{1}), F_{2} (x_{2}), \dots, F_{n} (x_{n}))

Because it is complicated to characterize the correlation between variables with the joint distribution function itself, the Spearman rank correlation coefficient derived from the copula function is used to quantify the nonlinear correlation between variables. For two random variables $X$ and $Y$ whose distribution functions are $F (X)$ and $G (Y)$, the Spearman rank correlation coefficient $\rho_{s}$ obtained from the copula function $C (F (X), G (Y))$ is:

\rho_{s} = 12 \int_{0}^{1} \int_{0}^{1} F (X) G (Y) \, d C (F (X), G (Y)) - 3
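One way to read the integral above is as $12 \, E[F(X) G(Y)] - 3$, which can be estimated from a sample by replacing $F$ and $G$ with empirical distribution values. The sketch below does this with the midpoint pseudo-observations $(\text{rank} - 0.5) / n$; that choice of estimator, the function names, and the no-ties requirement are assumptions of this example rather than details given in the text.

```python
def pseudo_observations(values):
    """Approximate F(x_i) by (rank_i - 0.5) / n using ascending ranks (no ties)."""
    n = len(values)
    if len(set(values)) != n:
        raise ValueError("this sketch assumes no tied values")
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = (rank - 0.5) / n
    return u


def spearman_from_copula(x, y):
    """Estimate rho_s = 12 * E[F(X) G(Y)] - 3 from the empirical copula sample."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    u = pseudo_observations(x)
    v = pseudo_observations(y)
    mean_uv = sum(ui * vi for ui, vi in zip(u, v)) / len(u)
    return 12 * mean_uv - 3


# Perfectly monotone samples: the estimate approaches +1 or -1 as n grows.
x = list(range(1, 101))
print(spearman_from_copula(x, [v ** 3 for v in x]))
print(spearman_from_copula(x, [-v for v in x]))
```

For a perfectly increasing relationship this estimator returns $1 - 1/n^{2}$ rather than exactly 1, a finite-sample effect of discretizing the integral.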

The Spearman rank correlation coefficient $\rho_{s}$ indicates the direction of correlation between the independent variable $X$ and the dependent variable $Y$, and its value lies between $-1$ and $1$. When $X$ and $Y$ change in the same direction and are perfectly positively correlated, $\rho_{s} = 1$; when they are perfectly negatively correlated, $\rho_{s} = -1$. The correlation classification of the Spearman rank correlation coefficient is given in Table 2.

Table 1
Correlation classification of the Pearson correlation coefficient.

Pearson correlation coefficient	Classification
$0.6 \leq | r_{x y} | \leq 1$	Strong correlation
$0.4 \leq | r_{x y} | \leq 0.6$	Moderate correlation
$0 \leq | r_{x y} | \leq 0.4$	Weak correlation

Table 2
Correlation classification of the Spearman rank correlation coefficient based on the copula function.

Spearman rank correlation coefficient	Classification
$0.7 \leq | \rho_{s} | \leq 1$	Strong correlation
$0.5 \leq | \rho_{s} | \leq 0.7$	Moderate correlation
$0 \leq | \rho_{s} | \leq 0.5$	Weak correlation

As shown in Fig. 2, variables whose absolute Pearson correlation coefficient and absolute Spearman rank correlation coefficient are both below the specified threshold of 0.6 are excluded to simplify the model and improve prediction accuracy. A variable is retained as long as either coefficient reaches the threshold, and it is excluded only if both coefficients fall below it, since highly correlated variables convey the most significant information. The retained variables are then used to characterize the power stations, which can effectively improve the performance of the clustering algorithm and of each forecasting model.
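The screening rule can be written as a short Python sketch. It assumes the reading that a variable is kept when at least one of the two coefficients reaches the 0.6 threshold; the function name and the coefficient values are hypothetical and do not come from the dataset used in this paper.

```python
def select_variables(coefficients, threshold=0.6):
    """Keep a variable unless |Pearson| and |Spearman| are both below the threshold.

    `coefficients` maps a variable name to a (pearson, spearman) pair.
    """
    selected = []
    for name, (r_xy, rho_s) in coefficients.items():
        if abs(r_xy) >= threshold or abs(rho_s) >= threshold:
            selected.append(name)
    return selected


# Hypothetical coefficients between each factor and power generation.
coefficients = {
    "wind_speed": (0.91, 0.93),      # strong on both measures
    "air_pressure": (-0.35, -0.62),  # weak linear, stronger monotone relation
    "humidity": (0.12, -0.20),       # weak on both measures
}
print(select_variables(coefficients))  # ['wind_speed', 'air_pressure']
```

The second entry shows the case the combined rule is meant to catch: a factor that a Pearson-only screen would discard is kept because its rank correlation is high.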