Data-driven learning-based classification model for mitigating false data injection attacks on dynamic line rating systems

https://doi.org/10.1016/j.segan.2024.101347

Highlights

  • DLR systems allow an improvement in transmission lines’ ampacities.
  • Cyber-physical power systems involving DLR systems are prone to FDIA.
  • Combination of statistics, feature selection, machine learning mitigates FDIA.
  • MR-MR selects the best features to improve testing of classification models.
  • Z-score-MR-MR-BGLM-LR model effectively mitigates FDIA in DLR systems.

Abstract

The increasing need to explore electric power grid expansion technologies like dynamic line rating (DLR) systems, and their dependence on real-time weather data for system planning, necessitates research into false data injection attacks (FDIA) on cyber-physical power systems (CPS). This study aims to develop a robust machine learning model to mitigate FDIA in DLR systems, focusing on statistical data processing, feature ranking and selection, training, validation and evaluation. It synthesises z-score and other statistical analyses with the minimum redundancy maximum relevance (MR-MR) feature ranking and selection algorithm to improve the performance and generalisation of the binary generalised linear model logistic regression (BGLM-LR) and other machine learning classification algorithms. The resulting models, formed with BGLM-LR, Gaussian naïve Bayes (GNB), linear support vector machine (LSVM), wide neural network (WNN), and decision tree (DT), were trained and tested with 10-year hourly DLR history data features. Evaluation of the models on unseen data revealed enhanced validation and testing accuracies after MR-MR feature ranking and selection. BGLM-LR, GNB, LSVM, and WNN showed promising performance for mitigating FDIA. However, the study identifies DT’s limitations as overfitting and a lack of generalisation in FDIA mitigation. The z-score-MR-MR-BGLM-LR and z-score-MR-MR-LSVM models exhibited outstanding performance with zero false negative rates, highlighting the significance of feature ranking and selection. Still, the z-score-MR-MR-BGLM-LR combination exhibits the highest marginal improvement from training to testing, the lowest training and validation time and a perfect area under the curve (AUC) of the receiver operating characteristics, making it the best choice for mitigating FDIA when computational resources are limited.

Abbreviations

ARP
Address Resolution Protocol
AUC
Area Under Curve
BiLSTM
Bi-directional Long Short-Term Memory
BGLM-LR
Binary Generalised Linear Model Logistic Regression
CPS
Cyber-physical Power Systems
DLR
Dynamic Line Rating
DoS
Denial of Service
DT
Decision Tree
FL
Federated Learning
FN
False Negative
FNR
False Negative Rate
FP
False Positive
FPR
False Positive Rate
FDIA
False Data Injection Attack
GAB
Gentle AdaBoost
GGNN
Gated Graph Neural Network
GNB
Gaussian Naïve Bayes
HMI
Human Machine Interface
ICT
Information and Communication Technology
IEEE
Institute of Electrical and Electronics Engineers
LSTM
Long Short-Term Memory
LSVM
Linear Support Vector Machine
MIQ
Mutual Information Quotient
MR-MR
Minimum Redundancy-Maximum Relevance
PCA
Principal Component Analysis
PMU
Phasor Measurement Unit
PSC
Power System Component
ReLU
Rectified Linear Unit
RES
Renewable Energy Source
RTU
Remote Terminal Unit
ROC
Receiver Operating Characteristics
SCADA
Supervisory Control and Data Acquisition
SPDS
Safety Parameter Display System
SVR
Support Vector Regression
TN
True Negative
TP
True Positive
UPS
Uninterruptible Power Supply
WNN
Wide Neural Network

Keywords

Dynamic line rating systems
Cyber-physical power systems
False data injection attacks
Affordable & clean energy
Energy security
Industry, innovation & infrastructure

1. Introduction

Smart grids emerged to improve the efficiency and reliability of traditional power grids. They are large cyber-physical power systems (CPS) that combine physical components with the cyber components of advanced information and communication technology (ICT) to improve the efficiency, reliability, and sustainability of electricity generation, transmission, and distribution [1], [2]. Power system components (PSC), dynamic line rating (DLR) sensors, and phasor measurement units (PMUs) form the physical part, while elements such as supervisory control and data acquisition (SCADA) systems, human-machine interfaces (HMIs), and remote terminal units (RTUs) form the cyber part. This combination enables remote monitoring and control of the grid. These systems are designed to be resilient by enhancing the interoperability of physical devices and the communication networks that connect them, enabling optimal power delivery.
In tackling the global energy crisis, the target of sustainable development goal 7 (SDG 7), clean, affordable, modern energy for all, could be achieved by matching the increase in renewable energy sources (RES) with grid expansion technologies like DLR [3]. DLR helps to reduce congestion on the grid by increasing the current-carrying capacity of transmission lines, thereby enabling the integration of more clean and affordable energy towards the achievement of SDG 7 [4]. DLR operations have been reviewed and implemented in various studies [5], [6], [7] and rely on real-time measured weather parameters, such as wind speed, wind angle, and ambient temperature, to calculate the capacity of transmission lines and plan future operations. These weather parameters are used to determine the convective cooling, radiative cooling, and solar heating effects on the lines [8].
Unfortunately, as illustrated in Fig. 1, DLR sensors collecting these data communicate weather conditions and line ratings in real-time with operators [4], making them particularly susceptible to false data injection attacks (FDIA). FDIA is the infiltration of erroneous data into RTUs, leading operators to misconceive line conditions, generator output, or demand, inducing instability and potentially causing a section of the power grid network, or the entire network, to collapse. Attackers focus on components that operate in real time, like the DLR, to have the maximum impact on the PSC. DLR systems are a preferred target for these attackers [9] because they rely on ICT, which inevitably makes them vulnerable. Steps are taken to prevent these potentially compromising cyber-attacks to achieve data confidentiality and integrity while maintaining the data’s security and reliability.

Fig. 1. FDIA and Mitigation Illustration.

Other forms of these cyber-attacks include resource disconnection and denial of service (DoS) [10], [11], [12]. They range in severity and scope, and the solutions are categorised according to implementation complexity and mitigation efficacy. Resource disconnection attacks involve hackers accessing systems like SCADA, HMI, or RTU to control circuit breakers, manipulate power flow, and disconnect customers. In some cases, even the software used to restore service after an outage has been targeted, making it inaccessible to operators. DoS attacks prevent customers from reporting their outage experiences to the utility.
Practical strategies for mitigating FDIA on smart grids should be able to prevent, preempt and identify vulnerabilities, pinpoint the sources of attacks if they occur, and replace false data with accurate data. With this, smart grid systems can bolster their resilience against cyber threats and ensure the integrity and reliability of grid operations. This research works towards these goals as follows: Section 2 reviews studies on different types of cyber-attacks on power grids, focusing on FDIA and real cases; Section 3 discusses data-driven, learning-based algorithms designed to prevent, detect, and mitigate these attacks; Section 4 assesses the resilience of the developed algorithm and other viable algorithms against FDIA to determine the most reliable method for mitigating these attacks without errors; and Section 5 presents concluding observations and directions for future research.

2. Cyber-attacks and FDIA review

DoS, theft of service, spying and plant damage on power system infrastructures have benefited attackers through ransom, data theft for military intelligence and other political purposes, theft of operational technologies and espionage. The following sub-sections on FDIA studies, cases and state estimation review recent work on FDIA prevention and mitigation during cyber-physical attacks on smart grids.

2.1. FDIA studies

The concept of FDIA for power grid state estimation was introduced by Liu, Ning and Reiter [13]. They pointed out that attackers can infiltrate the CPS and ICT network infrastructure to manipulate measurement devices and obtain network parameters and topology. This allows them to construct false measurement data that satisfies the constraints of state estimation, thereby bypassing the bad data detection process. It further enables the attacker to launch attacks unnoticed by the control centre, causing it to lose its ability to accurately perceive the system’s current operating state or topology. As a result, the control centre produces incorrect estimates and issues incorrect instructions and commands, disrupting the regular operation of the power grid [14].
While traditional algorithms have effectively detected bad data, recent advances in deep learning have also made it possible to estimate system states more accurately, even under cyber-attack. A Gaussian mixture model was proposed by Shi, Xie and Peng [15]; the proposed model was tested on Institute of Electrical and Electronics Engineers (IEEE) bus systems and showed between 1.5% and 5% improvement in accuracy compared with other models. However, the evaluation scenarios did not encompass the full range of potential attacks or system conditions. This limited evaluation scope could restrict the generalisability and reliability of the proposed FDIA detection methods when applied to different power grid configurations. In a study by Xiong et al. [16], a machine learning algorithm, the support vector machine-gentle AdaBoost (SVM-GAB), was used to detect FDIA. GAB was used to cascade multiple weak support vector machine (SVM) classifiers to form a robust classifier capable of distinguishing normal from abnormal data. The authors used IEEE performance metrics, including mean time to detection and accuracy, to evaluate the effectiveness of the SVM-GAB algorithm. The results showed that the false alarm rate of the SVM-GAB algorithm was 25% lower than that of traditional detection algorithms.
Moradzadeh et al. [17] evaluated the reliability and accuracy of deep learning techniques, such as support vector regression (SVR), long short-term memory (LSTM), and bi-directional LSTM (BiLSTM), in predicting DLR using real-world data from two transmission lines. The authors examined the resilience of these algorithms by simulating cyber-attacks. However, they focused on increasing the wind speed, wind angle, and ambient temperature of the historical data by chosen percentages without providing a rationale for these choices of variation. It is important to note that data variations could encompass increases, decreases, and changes in percentages, ratios, and exponentials. This raises questions about the comprehensiveness of the assessment of the algorithms’ performance in handling various scenarios and data variations.
Another study deployed a spatiotemporal machine-learning algorithm to detect FDIA [18]. The authors emphasised that machine learning algorithms that recognise normal distribution dynamics, such as an LSTM, an autoencoder or other unsupervised learning methods, are the most effective way to detect FDIA. This is because these models can analyse the data and identify patterns that deviate from the normal distribution, which may indicate the presence of an FDIA. The study showed that other unsupervised learning methods, such as clustering or density estimation, could also be used. The algorithm’s efficacy was judged by assessing the residual of the measurements with and without basic and stealth FDIA. Graph-based detection of FDIA in the power grid was also proposed [19]. Spatial features of the grid topology were extracted through a graph neural network, and the study discovered that the accuracy of most data-driven detection methods decreases as the topology of the network changes.
On the contrary, the gated graph neural network (GGNN) increases accuracy as the topology changes [20]. A stacked autoencoder network was used to extract cyber-physical attack genes, a fine-tuning amplifier was used for training and updating network parameters, and a cuckoo search algorithm was used to optimise the model parameters and enhance attack detection. The FDIA studies, their prospects and their limitations are summarised in Table 1.

Table 1. FDIA studies.

[13]
Description: This study introduces the concept of FDIA for power grid state estimation. It highlights the possibility of attackers manipulating measurement devices and bypassing bad data detection.
Prospects: It raises awareness about potential vulnerabilities in power grid systems.
Constraints: The study lacks specific detection methods and evaluation results. It created arbitrary attack scenarios and did not focus on DLR systems but instead on the impact of FDIA on state estimation. However, it suggests utilising network anomaly detection techniques to protect against false data injection attacks in other areas of the power system network.

[14]
Description: This study describes how attackers can launch unnoticed attacks on the power grid, causing disruption.
Prospects: It highlights the impact of attacks on the control centre’s ability to perceive the system’s operating state.
Constraints: The study lacks a specific solution or evaluation method, focuses on linear abnormalities, and suggests exploring more complex ones. It is limited to the IEEE 24-bus system and lacks generalisation. The method relies on static line ratings. To enhance the approach, the study recommends incorporating machine learning to detect abnormalities in load patterns, enabling more effective countermeasures.

[15]
Description: This study proposes a Gaussian mixture model for FDIA detection, showing improved accuracy compared to other models.
Prospects: It advances deep learning for accurate state estimation, even in cyber attacks.
Constraints: The study did not address DLR, lacked coverage of potential attacks and system conditions, and provided insufficient information on the model used.

[16]
Description: The SVM-GAB algorithm was deployed to detect FDIA with a lower false alarm rate than traditional methods.
Prospects: It demonstrates the effectiveness of machine learning algorithms for FDIA detection using metrics such as accuracy and recall.
Constraints: It did not consider DLR. It did not provide detailed information on the SVM-GAB algorithm and the attack intensity rule used to perform the volatility test, differentiating FDIA and power flow surge.

[17]
Description: This study evaluated the reliability and accuracy of deep learning techniques (SVR, LSTM, BiLSTM) for predicting DLR and their resilience to cyber-attacks.
Prospects: It explores the use of real-world data and the simulation of cyber-attacks.
Constraints: The study lacks a rationale for the specific data variations, limiting the comprehensive algorithm performance assessment.

[18]
Description: This study deploys spatiotemporal learning algorithms, such as LSTM autoencoders and unsupervised learning methods, for FDIA detection. It emphasises the importance of models recognising normal distribution dynamics.
Prospects: It addresses the FDIA challenges that may be encountered in DLR forecasting. It highlights unsupervised learning methods’ effectiveness and ability to detect deviations from standard distribution patterns.
Constraints: In most cases, when these models are efficient, trade-offs exist in computational complexity, resource requirements, and detection latency.

[19]
Description: Graph-based detection of FDIA using a graph neural network. It shows that the accuracy of data-driven detection methods decreases with changing network topology, while the GGNN improves accuracy.
Prospects: It offers a novel approach using graph neural networks for FDIA detection.
Constraints: Model complexity, resource requirements, and mitigation are the limitations of the proposed model. It did not incorporate DLR in the analyses.

[20]
Description: It utilised a stacked autoencoder network, fine-tuning amplifier, and cuckoo search algorithm for cyber-physical attack detection. It mentions deficiencies in accuracy and sensitivity of algorithms that do not use extensive historical data.
Prospects: It introduces a comprehensive approach involving multiple techniques for attack detection.
Constraints: The structure of the deep network model is complex, and the model training time is extended. To speed up the model training process, dynamic optimisation of the learning rate and other parameters during training will be considered in future work.

[21]
Description: This article proposes ensemble learning algorithms for DLR forecasting to address transmission congestion caused by high renewable energy penetration. Traditional DLR methods require extensive infrastructure, but the proposed approach leverages historical meteorological data. Simulations demonstrate the effectiveness of ensemble learning algorithms, achieving a significant capacity increase for 400 kV lines and alleviating congestion issues without additional infrastructure.
Prospects: It offers promising prospects for DLR forecasting, providing accurate predictions without extensive infrastructure. The demonstrated capacity increase highlights the potential for this approach to improve grid efficiency and reliability, particularly in regions experiencing transmission congestion due to renewable energy integration.
Constraints: Testing against different data points is essential to ensure the reliability of forecasting models across various line segments. Additionally, the effectiveness of the proposed approach may depend on factors such as data quality and the severity of cyberattacks. Addressing these constraints is crucial to ensuring the practical applicability and reliability of ensemble learning algorithms for DLR forecasting.

[22]
Description: This article presents a novel approach using federated learning (FL) for DLR forecasting, crucial for enhancing grid-side flexibility by accurately predicting overhead transmission line capacity. FL generates a global model from data across different regions, ensuring security and protection against cyberattacks.
Prospects: FL offers promising prospects for DLR forecasting, enabling accurate predictions even in regions lacking data. The global supermodel generated by FL has the potential to improve grid reliability and flexibility, providing timely forecasts.
Constraints: Despite its benefits, FL implementation faces constraints related to data availability, privacy, and infrastructure requirements. Ensuring data security and addressing computational challenges are key considerations. Additionally, the global supermodel’s performance may vary based on the diversity of training data, potentially leading to inaccuracies in certain regions.
Most of the algorithms described in Table 1 considered FDIA detection and did not include DLR data in their analyses. They are also deficient in accuracy and sensitivity because they did not use extensive historical data. In addition, while these algorithms may demonstrate improvements in accuracy or false alarm rates in some cases compared to traditional algorithms, there is usually a trade-off in other aspects, such as computational complexity, resource requirements and communication latency. An overview of the real cases of FDIA on cyber-physical power systems is presented to provide an appropriate understanding of the impact on the utilities and their host communities.

2.2. FDIA cases

It has been established that cyber-attacks on smart grids can be motivated by various goals, including military objectives, theft, politics, or hostility. One of the earliest known cyberattacks on a power grid occurred in 2003, when the Slammer malware affected the Davis-Besse plant in the United States [23]. The nuclear power plant had a safety monitoring system called the safety parameter display system (SPDS) used to monitor and control the plant’s safety. However, attackers bypassed the firewall and gained access to the SPDS through a consultant working with one of the plant’s applications. As a result, the attackers caused a slowdown of the servers and a DoS. The Slammer worm disabled access to the server for 5 hours and demonstrated the potentially devastating consequences of a cyberattack on the control of system components.
Another notable example is the Ukrainian smart grid attack in 2015 [24]. The cyber-attack on the Ukrainian power grid caused outages that affected over 225,000 households in three provinces. The episode started with operators falling victim to a spear phishing attack, in which they downloaded a document from their emails that contained malware. This malware gathered information about the system’s state and gave the attackers access to the network through corporate user accounts. The attackers also launched a telephony DoS attack, which flooded the call centre and prevented real customers from reporting the outage. In addition, the attackers turned off uninterruptible power supplies (UPS), corrupted the firmware of the RTUs and used a ‘kill disk’ to wipe out HMIs and several workstations. This coordinated attack targeted six energy companies, but three were vulnerable and suffered outages [25].
Several other high-profile cyber-attacks on power grids have occurred in recent years, including the ‘Stuxnet’ attack on the Iranian nuclear power station in 2010. ‘Stuxnet’ is a computer worm that targets the programmable logic controllers used to automate power systems. It typically targets computers running the Windows operating system and real-time data transmission software. In the Iranian attack, the worm was planted on critical infrastructure management centres, allowing the attackers to collect real-time data from industrial systems. They also caused the uranium gas centrifuges to spin out of control, causing widespread damage to the power grid. The Stuxnet virus was also detected in power, chemical, and industrial control plants in Germany that used SCADA and Siemens software. While it targeted the ‘WinCC’ software, it was discovered and patched before it could affect any economic or real-time data aggregation operations. In 2017, a cyber-attack involving the Address Resolution Protocol (ARP) cache virus targeted a wind farm in the United States. Another similar attack, on the Venezuelan hydropower plant’s control centre, occurred in 2019 [26].
These cyber-attacks, for whatever reasons, could come in several ways. One common tactic attackers use to cause cascading failures is to conceal physical damage to power system components by injecting false data into the energy management system or communication network, thereby altering the system’s state [27]. FDIA can also occur when data from DLR sensors, which are used for monitoring and real-time control purposes, is affected by false data. The false data can alter the normal power flow of the network or introduce unnecessary monitoring and operation delays, leading to outages, equipment damage, operator injuries, and even fatalities. Therefore, it is essential to implement prevention, early detection, and countermeasures to protect against FDIA and the negative consequences of these attacks. A comprehensive study explored various spatiotemporal perspectives to improve the security of CPS against FDIA [28]. The study identified data-driven corrective measures, including false data identification, correction and traffic anomaly detection, as effective ways to prevent FDIA.
It is essential to note that implementing these measures may also increase communication latency and potentially impact the data used for decision-making processes. To mitigate this, a large amount of training and testing data needs to be analysed using data processing algorithms to understand the relationship between accurate and false data and subsequently correctly classify the data and resolve any issues that may arise. A data-driven and learning-based approach considering communication latency is proposed in this study to identify and mitigate FDIA. This will not be possible without assessing the state of the power system network before FDIA and after FDIA.

2.3. State estimation

State estimation using DLR sensors is a technique that utilises sensors to determine the state variables, such as the ampacity, of a power system. DLR sensors are installed on transmission lines to measure actual temperature and weather conditions, which are used to determine the thermal rating of the line in real-time. The measurements obtained from these sensors are integrated into the state estimation equations, which are then used to estimate the state variables of the power system. The state estimation equations are developed based on the measurements acquired from the DLR sensors and are used to estimate the power system flows. The primary objective is to minimise the difference between the actual measurements obtained from the DLR sensors and the estimated values acquired from the state estimation equations to achieve the most accurate estimation possible. Consider the actual state of a system of measurements,

x = H⁻¹(z)    (1)

State estimation in the presence of a false data injection attack can be modelled as in (2):

x̂ = H⁻¹(z + Hs)    (2)

where x̂ is the estimated state, z is the measurement vector, H is the measurement matrix, and s is the vector representing the false data injection attack. The objective of the attacker is to manipulate the measurement vector z so that the estimated state x̂ deviates from the actual state x. FDIA is achieved by adding a false measurement component, Hs, to the measurement vector z. FDIA directly disrupts the outcomes of state estimation within DLR systems. By surreptitiously altering sensor readings, attackers introduce undetected errors into the calculation of state variables and values. Consequently, the estimated state of the DLR system diverges from the actual state, significantly impacting decision-making processes. The compromised state estimation resulting from FDIA can cause the grid regulator to make erroneous decisions, affecting the operation and stability of the power grid. This threat extends to the overall security of grid operations, potentially disrupting power flow management, load balancing, and fault detection mechanisms [29], [30], [31].
To counter such attacks, it becomes crucial to design and deploy a hybrid algorithm that can detect and mitigate the impact of FDIAs, ensuring accurate state estimation despite manipulated measurements. It is necessary to incorporate additional constraints and measurements to the state estimation problem to detect and mitigate the effects of false data injection attacks. For example, the differences between the measured and estimated values are appropriate for detecting anomalies in the measurement vector using residuals. Another approach is to use decentralised state estimation, where each node in the system independently estimates its state and the final estimate is obtained through consensus. It is worth noting that countering false data injection attacks poses a significant challenge, prompting the development of various models proposed in the existing literature to tackle this issue. Selecting a specific approach relies on system specifications and the type of attack under consideration. Once precise estimates of the state variables are obtained, they become valuable resources for making real-time decisions regarding power system operation. These decisions encompass a range of actions, including power distribution across different regions, control of generators and transmission lines, and management of power system contingencies. Moreover, the estimated values play a crucial role in analysing the impact of dynamic line rating, consequently enhancing the effectiveness and reliability of power system operation.
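To make the residual-based check mentioned above concrete, the following minimal sketch (in Python with NumPy, using an illustrative 8-measurement, 3-state linear model rather than data from this study) estimates the state by least squares, flags bad data when the measurement residual exceeds a threshold, and shows why a structured attack of the form Hs passes the residual test while biasing the estimate. All values, names and the threshold heuristic are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative linear measurement model z = Hx + noise (H, x and sigma are made up).
H = rng.normal(size=(8, 3))            # 8 measurements of 3 state variables
x_true = np.array([1.0, -2.0, 0.5])    # actual system state
sigma = 0.01
z = H @ x_true + rng.normal(scale=sigma, size=8)

def estimate_state(H, z):
    """Least-squares state estimate x_hat minimising ||z - Hx||."""
    x_hat, *_ = np.linalg.lstsq(H, z, rcond=None)
    return x_hat

def residual_test(H, z, threshold):
    """Flag bad data when the norm of the measurement residual exceeds a threshold."""
    x_hat = estimate_state(H, z)
    residual = z - H @ x_hat
    return np.linalg.norm(residual) > threshold, x_hat

threshold = 4 * sigma * np.sqrt(len(z))   # heuristic detection threshold for this sketch

# A naive attack that corrupts a single measurement inflates the residual and is flagged.
z_naive = z.copy()
z_naive[0] += 0.5
print("naive attack flagged:", residual_test(H, z_naive, threshold)[0])

# A stealth FDIA adds Hs, leaving the residual unchanged while biasing the estimate by s.
s = np.array([0.3, 0.0, -0.2])
z_stealth = z + H @ s
flagged, x_hat_attacked = residual_test(H, z_stealth, threshold)
print("stealth attack flagged:", flagged)           # typically False
print("estimate bias:", x_hat_attacked - x_true)    # approximately s
```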
Researchers must develop a model to safeguard critical infrastructure and ensure a resilient power system. The primary objective of this research is to propose an effective data-driven, learning-based classification model capable of identifying and mitigating FDIA in DLR sensor data. This model will be picked from several models trained, tested and affirmed as the best protection against FDIA. It is imperative to build a robust and accurate classification system to distinguish genuine ‘Good Measure’ data from potentially malicious FDIA instances. An integrated model for training and testing viable mitigation approaches to fortify DLR systems against FDIA is described in the methodology section.

3. Methodology

An approach to determine the most accurate algorithm for a classification problem is to evaluate the accuracy of candidate algorithms and select the one with the best validation and testing accuracy. Still, no consensus exists on the best way to judge the efficacy of individual algorithms because no single classification metric can determine the overall proficiency of an algorithm [32]. In addition, combining classifiers through ensemble creation has been proposed to improve individual classifier performance across several metrics, and various ensemble creation methods have been suggested. Therefore, research on building good ensembles of classifiers is an active area of study in supervised learning. Nonetheless, ensemble methods have weaknesses, including increased storage and computation requirements and decreased comprehensibility.
A data-driven FDIA mitigation algorithm should identify and correct data errors to estimate a system’s state, such as that of a power grid. This involves formulating a mathematical optimisation problem to find the most accurate estimate of the system’s state based on the available data while considering uncertainties and errors. On the other hand, a learning-based algorithm depends on patterns learnt from historical data to predict the system’s state. Both methods use statistical or machine learning models to implement the mitigation. The major models used to detect FDIA are classified into statistical analysis, anomaly detection and intrusion detection. Examples of statistical analysis include the normal distribution function; SVM, k-means and neural networks are used for anomaly detection; while intrusion detection involves decision trees and random forests. Anomaly and intrusion detection algorithms also include principal component analysis (PCA) and naïve Bayes. The machine learning methods of logistic regression, SVM, decision trees, naïve Bayes and neural networks are prominent among these methods because of their proficiency in binary classification. This section first assesses the existing and proposed classification algorithms, discusses the proposed model training, validation and testing methods, and finally introduces a feature ranking and selection algorithm to improve the efficacy of the existing and proposed algorithms.

3.1. Classification algorithms

The characteristics, structure, parameters and optimisation benefits of the proposed model and other existing algorithms will be compared to mitigate the potential consequences of FDIA, such as communication disruptions and equipment damage. FDIA classification models are designed to identify and correct data errors to ensure the system’s safe and efficient operation. The demerits of most of the models mentioned above are assumptions leading to false alarms, complexity, and limited fault coverage. The characteristics, structure, parameter constraints and optimisation benefits of the machine learning algorithms deployed in the binary classification of DLR data are presented in Table 2.

Table 2. Machine-Learning Classification Models.

Model: Linear Support Vector Machine (LSVM)
Characteristics: Linear support vector machines (LSVMs) are suitable for small to medium-sized datasets and problems with clear margins between classes. They are versatile, with different kernel functions to handle linear and nonlinear decision boundaries. They work well in cases where the instances are fewer than the features.
Structure and parameter description: This algorithm tries to find a straight-line boundary between categories. It automatically adjusts the importance of this boundary based on the data. If a data point is misclassified, a penalty determined by a value of 1 is applied. Before processing, the data is adjusted to have an average value of 0 and a consistent spread (standardised).
Optimisation benefits: LSVM can be computationally expensive, especially for large datasets. LSVMs often struggle with imbalanced datasets, requiring additional techniques such as class weighting or resampling. Feature selection makes the model more efficient for FDIA detection, especially in high-dimensional spaces. It improves generalisation by focusing on essential features, aiding SVMs in managing imbalanced FDIA datasets.

Model: Wide Neural Network (WNN)
Characteristics: Wide neural networks (WNNs) are complex models involving interconnected layers and activation functions. They are suitable for problems with complex relationships and large amounts of data. They are highly flexible, with nonlinear decision boundaries.
Structure and parameter description: Wide neural networks can learn complex nonlinear relationships between features, making them highly flexible and powerful. They can handle high-dimensional data and automatically extract relevant features from raw input. The WNN deployed here has one hidden layer with 100 processing nodes. After each node processes data, it uses a function called rectified linear units (ReLU) to decide on the output. The training process runs for a maximum of 1000 iterations to adjust and improve. No additional rule prevents it from fitting too closely to the training data. Just as in the SVM, the data is adjusted to have an average value of 0 and a consistent spread before processing.
Optimisation benefits: They are computationally intensive and may require significant computational resources, especially for large and complex models. Large neural networks can be more susceptible to vanishing gradients and overfitting, requiring additional techniques such as batch normalisation and dropout. Feature selection optimises the wide neural network’s performance by focusing on the most critical attributes in FDIA detection to enhance interpretability and training efficiency.

Model: Decision Tree (DT)
Characteristics: Decision trees (DTs) use a set of if-else conditions to recursively split the feature space based on the values of input features. The decision tree can be represented as a hierarchical structure of decision and leaf nodes.
Structure and parameter description: They provide a hierarchical structure of decisions based on feature splits. This makes them understandable and easy to interpret. They can handle both numerical and categorical features effectively. The decision tree has a maximum depth controlled by allowing up to 100 decision-making points. When deciding how to branch, it uses Gini’s diversity index, which helps differentiate between categories. If a primary decision rule is not applicable, there is no backup rule because the surrogate decision split is turned off.
Optimisation benefits: It sometimes leads to poor generalisation on unseen data. Overfitting can occur when the tree becomes too deep, and decision trees may struggle with capturing certain complex relationships or logic patterns. Additionally, small changes in the data can lead to different tree structures and potential instability.

Model: Gaussian Naïve Bayes (GNB)
Characteristics: Gaussian naïve Bayes (GNB) is a probabilistic classification algorithm that assumes the features are conditionally independent given the class label. It may struggle with rare events or zero probabilities. It is efficient for probabilistic classification tasks and datasets with feature independence.
Structure and parameter description: GNB models assume that rating features are conditionally independent given the class label. This is usually not the case because a temporal correlation exists between DLR ratings. For this method, numerical data is assumed to follow a bell-curve-like (Gaussian) distribution, while categorical data, like the presence or absence of FDIA, is assumed to follow a distribution that counts occurrences (multinomial).
Optimisation benefits: The GNB feature independence assumption might not hold in real-world scenarios, leading to suboptimal performance. Feature selection will assist in identifying crucial features for FDIA detection and maintaining the independence assumption. It will also help in handling rare events of zero probabilities by focusing only on influential features.

Model: Binary Generalised Linear Model Logistic Regression (BGLM-LR)
Characteristics: Binary generalised linear model logistic regression (BGLM-LR) assumes a linear relationship between the features and the log odds of the binary outcome variable. It is a specialised form of generalised linear model designed for binary classification tasks. It may struggle with complex nonlinear relationships. It has limited feature interaction capture but is effective for binary classification tasks. It can provide probability estimates for anomaly detection using an unbounded and continuous log of odds termed the logit function.
Structure and parameter description: Logistic regressions are linear models with sigmoid functions. Coefficient estimation is pivotal because it defines how each feature affects the outcome, ensuring predictions are anchored in the data’s true patterns. Regularisation in BGLM-LR guards against over-optimising training samples, ensuring the model’s broad applicability. Lastly, threshold adjustment in BGLM-LR enables precise calibration of predictions, especially when certain errors have more severe consequences than others. The regularisation strength (lambda) was set to 0.5 by default to control model complexity, with higher values leading to more robust regularisation.
Optimisation benefits: Feature selection can help identify the most relevant features for FDIA detection, reducing noise and improving model interpretability. It is particularly beneficial when dealing with high-dimensional datasets or feature-rich FDIA scenarios.
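As a rough illustration of how the five classifiers in Table 2 might be configured, the following sketch uses scikit-learn analogues of the stated settings: standardised inputs, an SVM misclassification penalty of 1, a 100-node ReLU hidden layer trained for up to 1000 iterations, a tree limited to roughly 100 decision points, and logistic regression with regularisation strength lambda = 0.5 approximated by C = 1/lambda = 2. The original study does not specify these library mappings, so they are assumptions rather than an exact reproduction.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression

# LSVM: linear boundary, misclassification penalty (box constraint) of 1, standardised inputs.
lsvm = make_pipeline(StandardScaler(), LinearSVC(C=1.0))

# WNN: one wide hidden layer of 100 ReLU nodes, up to 1000 training iterations,
# no extra regularisation, standardised inputs.
wnn = make_pipeline(StandardScaler(),
                    MLPClassifier(hidden_layer_sizes=(100,), activation="relu",
                                  max_iter=1000, alpha=0.0))

# DT: Gini's diversity index; max_leaf_nodes approximates the cap of 100 decision points.
dt = DecisionTreeClassifier(criterion="gini", max_leaf_nodes=100)

# GNB: Gaussian likelihoods for the numeric DLR features.
gnb = GaussianNB()

# BGLM-LR: logistic regression; C = 1/lambda = 2 approximates the stated lambda = 0.5.
bglm_lr = make_pipeline(StandardScaler(), LogisticRegression(C=2.0, max_iter=1000))

models = {"LSVM": lsvm, "WNN": wnn, "DT": dt, "GNB": gnb, "BGLM-LR": bglm_lr}
```

Each estimator can then be fitted on the training months with `model.fit(X_train, y_train)` and scored on the held-out months.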

3.2. Proposed model training, validation and testing

DLR sensors measure hourly weather data and use it to calculate the DLR values that make up the history data. DLR history data calculated over ten years for a typical power transmission line is used in this case study. Latency values indicate the delay in computing and communicating these values for real-time use. The following procedure explains the model that identifies and mitigates false data in a DLR array:
  • (I)
    Statistical procedure:
Step 1: Inputting history data
The history data (Hα) consists of hourly ampacities spread over ten years. They are calculated using the IEEE 738 standard for estimating the current/temperature relationship of overhead conductors, while the latency data Lα (ms) corresponds to the delay in relaying each sensor measurement to the servers containing the algorithm. This makes the hourly latency data and history data size depend on the number of days in a month. It may be 24 × 300 for a 30-day month, 24 × 310 for a 31-day month, and 24 × 282 for February over ten years.

Hα = [ m1,1    m1,2    ⋯   m1,d×y
       m2,1    m2,2    ⋯   m2,d×y
       ⋮       ⋮            ⋮
       m24,1   m24,2   ⋯   m24,d×y ]        (3)

where m1,d×y represents the first hour of the last day of a particular month in the 10-year historical data. A simplified example of Lα and Hα for the first month (January) is given in (4) and (5).

L1 = [ 249   296   ⋯   382
       456   269   ⋯   374
       442   400   ⋯   403
       334   378   ⋯   370
       ⋮     ⋮          ⋮
       360   432   ⋯   453
       469   ⋯    ⋯   370 ]        (4)

H1 = [ 3456   3906   ⋯   4230
       4867   3987   ⋯   4208
       5421   5217   ⋯   6129
       4534   3787   ⋯   4670
       ⋮      ⋮           ⋮
       4360   4532   ⋯   4513
       4369   ⋯     ⋯   3870 ]        (5)
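A minimal sketch of how such a 24-row history matrix could be assembled from a flat hourly series is shown below; the synthetic values and the 310-day January example are assumptions for illustration only.

```python
import numpy as np

# Synthetic stand-in for ten years of hourly ampacity readings for a 31-day month
# (the values and the 310-day assumption are for illustration only).
rng = np.random.default_rng(1)
n_days = 310                                   # 31 days x 10 years
hourly_ampacity = rng.normal(4500, 600, size=24 * n_days)

# History matrix H_alpha as in Eq. (3): 24 rows (hours of the day), one column per day,
# assuming the flat series is ordered day by day with 24 consecutive hourly readings.
H_alpha = hourly_ampacity.reshape(n_days, 24).T
print(H_alpha.shape)                           # (24, 310)
```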
Step 2: Checking the corresponding latencies of daily measurements
Each element of the daily data, denoting the measured ampacity, has a delay time stamp (L1) for delivering the measurement to where it is needed for computation. For the most recent measurements in the historical data, the first element in the first row and column of H1 in (5), 3456, was delivered after 249 ms, while the measurement in the second row and first column, 4867, was delivered after 456 ms. Assuming a latency threshold of 400 ms is set for this delivery, beyond which the DLR data becomes unusable, the previous hour’s reading is used instead to avoid any suspicion of FDIA.
Step 3: Computing the hourly measurements.
The daily data are compiled from each hourly reading that meets the latency requirement. This makes, for example, the daily data D1,1 for the first day of the first month a vector of dimension 24 × 1. This dimension will be the same for all days in all months.

D1,1 = [ 3307
         3265
         ⋮
         3277 ]        (6)
Step 4: Checking if the daily data can fit into the history data through z-score
Obtain the elements of the daily data.
Perform a check to determine if the daily data element should be added to the history data:
  • a.
    Place the corresponding daily data element in the first column of the history data array.
  • b.
    Calculate the z-score of the new daily data element based on the standard deviation and mean of the history data.
  • c.
    If the z-score exceeds 1.0, skip adding the daily data element to the history data and use the previous hour's data before proceeding to test the next hour’s data in the daily data.
  • d.
    If the z-score of the subsequent data is within the limit, update the initial and present hour of the history data with the successful element in the daily data. This ensures all elements in the daily data are within the set limit.
  • e.
At the end of each day, all the daily data elements would have replaced the first column of the history data, thereby shifting every other column by one column to the right and eliminating the last column of the updated history data to maintain the matrix size. The updated history data in (5) now appears as in (7).
The updated history data is used to train machine learning classification models. This learning-based training, validation and testing is done in Step 5.

Hu1 = [ 3307   3456   ⋯   3906
        3265   3867   ⋯   3987
        3277   5421   ⋯   5217
        3397   4534   ⋯   4562
        ⋮      ⋮           ⋮
        4670   3989   ⋯   4598
        4532   ⋯     ⋯   4513 ]        (7)
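The latency and z-score screening of Steps 2 to 4 could be sketched as follows. The 400 ms threshold and the z-score limit of 1.0 come from the procedure above, while the function name, the fallback rule for the very first hour of a day, and the use of NumPy are assumptions for illustration.

```python
import numpy as np

LATENCY_THRESHOLD_MS = 400   # Step 2: delivery threshold
Z_LIMIT = 1.0                # Step 4(c): z-score limit

def screen_daily_readings(history, daily_readings, daily_latency_ms):
    """Screen one day's 24 hourly DLR readings against the history matrix (Steps 2-4).

    history: 24 x N array of past hourly ampacities (rows are hours of the day).
    daily_readings, daily_latency_ms: length-24 arrays of new measurements and delays.
    Returns the screened daily column and the updated history matrix.
    """
    screened = np.empty(24)
    for hour in range(24):
        candidate = daily_readings[hour]
        # Assumed fallback: the previous accepted hour, or the most recent history
        # value for hour 0 (the procedure above only specifies "the previous hour").
        fallback = screened[hour - 1] if hour > 0 else history[hour, 0]

        # Step 2: a late delivery makes the reading unusable.
        if daily_latency_ms[hour] > LATENCY_THRESHOLD_MS:
            candidate = fallback

        # Steps 4(b)-(c): reject readings whose z-score against that hour's history exceeds the limit.
        mu, sd = history[hour].mean(), history[hour].std()
        if abs((candidate - mu) / sd) > Z_LIMIT:
            candidate = fallback   # suspected FDIA: reuse the previous hour's value

        screened[hour] = candidate

    # Step 4(e): prepend the screened day and drop the oldest column to keep the matrix size.
    updated_history = np.column_stack([screened, history[:, :-1]])
    return screened, updated_history
```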
  • (II)
    Machine learning:
Steps 1–4 represent the data-driven process involving the statistical calculation of the z-score. In contrast, Step 5 represents the learning-based approach that utilises BGLM-LR to learn from features and classify each reading as either a ‘Good Measure’ or FDIA. Fig. 2 illustrates the proposed data-driven, learning-based model to mitigate these challenges. The features to learn from for each month consist of over 300 daily historical and statistical data features. The MR-MR algorithm is used to assess all features and select the essential ones for training, thus reducing the occurrence of false positives and false negatives. Since this is a binary classification scenario aiming to minimise misclassifications, particularly instances of false negatives, the combination of BGLM-LR with the z-score and the minimum redundancy-maximum relevance (MR-MR) algorithm aims to outperform its counterparts. The model is expected to excel in detecting FDIA by distinguishing genuine and manipulated data instances through binary classification. Its probabilistic outputs enable nuanced certainty assessments in FDIA predictions, and its coefficients offer valuable insights into the key features driving FDIA detection. Despite its linear assumptions, BGLM-LR can capture complex FDIA patterns using techniques like polynomial terms, making it ideal for real-time detection and large datasets. Its simplicity, interpretability, and robustness to noisy data contribute to effective FDIA detection and make it suitable for real-world mitigation.

Fig. 2. Data-driven, learning-based FDIA mitigation.

Step 5: Train the machine learning algorithms with the updated history knowledge base
Training:
  • a.
    Preprocessing of Data: The historical data is updated and transformed into predictors, incorporating additional columns that represent statistical measures of mean, median, range, standard deviation, and z-score for each hour. This preprocessing step ensures the data is appropriately formatted and ready for training.
  • b.
    Feature Ranking: Besides preprocessing, feature selection and extraction techniques are applied to the data to identify the most relevant and impactful features for the classification task. This is done by the MR-MR algorithm selecting the most relevant and least redundant features from the training dataset comprising hundreds of features from hourly ratings for each month over ten years as obtained in (7), their mean, median, range and z-score. This technique helps reduce the dimensionality of the data and eliminate irrelevant or redundant features, enhancing the training efficiency by reducing computational complexity.
  • c.
    Data Splitting for Training: After feature ranking and selection, the data is split into training and testing datasets. The algorithm is trained using eight months of hourly data (January to August), while the remaining four months (September to December) are reserved for testing. This splitting is achieved using a 67–33% training-to-testing ratio. This ratio is carefully chosen to prevent overfitting while considering data volume, computational efficiency, complexity, and variation.
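A minimal sketch of this chronological 67–33% split, assuming the engineered features are held in a pandas DataFrame with a numeric month column (1–12) and a binary label column (the column names are assumptions), is given below.

```python
import pandas as pd

def split_by_month(features: pd.DataFrame, label_col: str = "class", month_col: str = "month"):
    """Chronological 67-33% split: January-August for training, September-December for testing.

    `features` is assumed to hold one row per hourly observation, a numeric month
    column (1-12) and a binary label column ('Good Measure' vs 'FDIA').
    """
    train = features[features[month_col] <= 8]
    test = features[features[month_col] > 8]
    X_train, y_train = train.drop(columns=[label_col]), train[label_col]
    X_test, y_test = test.drop(columns=[label_col]), test[label_col]
    return X_train, X_test, y_train, y_test
```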
Testing:
Classification Algorithm Evaluation: The four months of data reserved for testing will be tested for FDIA. Once the proposed model is trained using the training dataset, the efficacy of the proposed model and other similar machine-learning algorithms are evaluated using the testing dataset. The evaluation assesses their performance and generalisation to new, unseen data. The model’s ability to accurately detect and distinguish between ‘Good Measure’ and FDIA instances is thoroughly assessed during this phase.
Validation:
Model Validation: The trained z-score-BGLM-LR model is subjected to the same procedure as the linear support vector machine, wide neural network, Gaussian naïve Bayes and decision tree models, making them z-score-LSVM, z-score-WNN, z-score-GNB and z-score-DT. They were all validated using fundamental and derived classification validation metrics. Accurate identification of the presence and absence of FDIA is of utmost importance, with the additional significance of avoiding false identification of the presence of FDIA as a ‘Good Measure’. The models’ performance is rigorously evaluated through comprehensive validation tests, ensuring their reliability and efficacy in detecting and mitigating FDIA accurately.
In summary, the procedure entails training classification models through data preprocessing, data splitting into training and testing datasets, feature selection and extraction for training and testing. The models are then evaluated using the testing dataset to assess their performance and generalisation capabilities. Finally, the trained models undergo validation to ensure their accuracy and effectiveness in detecting and mitigating false data injection attacks. This comprehensive approach, including feature selection and extraction, contributes to developing a robust and reliable system to safeguard the integrity of the CPS against potential false data injection attacks. Under normal operating conditions, the physical part of the CPS functions as intended, and cyberspace processes distribute information with normal communication latency. However, if extraneous data infiltrates the DLR sensor, it can inhibit the communication rate between the sensors and operators, causing traffic anomalies and fluctuations in the communication space. This leads to abnormal operations of the cyber and physical components of the CPS. In another sense, an attacker could alter the DLR data, causing an exponential increase or decrease in the anticipated line rating, thereby causing operators to erroneously pick or drop loads, causing damage to several pieces of equipment, accidents, and fire outbreaks. The contributions of this study to false data injection attacks on CPS are:
(a) Integrated Mitigation Model: This study proposes an innovative model integrating data-driven and learning-based techniques to detect FDIA in DLR sensor data. Combining statistical analysis and machine learning via z-score, MR-MR and a BGLM-LR classifier, it addresses uncertainties and errors in sensor measurements, forming the historical data for DLR operations in real-time power systems. The integration enables the model to distinguish between genuine ‘Good Measure’ data and potential FDIA through a robust feature selection, training, testing and validation.
(b) Feature Selection: It incorporates the MR-MR feature ranking and selection algorithm as a crucial step in the integrated mitigation model. By applying this technique, the model identifies the most relevant and impactful features from the historical data to improve the computational efficiency of the proposed model. Feature selection and extraction help reduce the dimensionality of the data, eliminate irrelevant and redundant features, and focus on those that contribute most to distinguishing between ‘Good Measure’ and potential FDIA instances.
(c) Performance Metrics and Algorithm Validation: This study rigorously validates the proposed model’s performance using fundamental and derived classification error metrics. By extensively exploring multiple classifiers used for similar purposes, including LSVM, WNN, GNB and BGLM-LR, it identifies the best-performing algorithm capable of accurately distinguishing between ‘Good Measure’ data and potential FDIA using fundamental and derived metrics. This systematic validation ensures reliable and efficient DLR data protection for the security of cyber-physical systems.

3.3. MR-MR feature ranking and selection

The MR-MR algorithm allows the identification of the most important and informative DLR entries that contribute significantly to determining the presence or absence of FDIA. It avoids redundant or irrelevant entries that may not provide additional insights. This feature selection process is especially useful when dealing with large and dynamic datasets like DLR histories, as it can help improve the efficiency and accuracy of analytical models and decision support systems. The basic steps involved are:
  • 1.
    Initialisation:
    • (a)
      Define an empty set, SelectedFeatures.
    • (b)
      Compute the mutual information between each DLR feature and the hourly measures of central tendencies and dispersions calculated with the FDIA classification target. Store these values.
  • 2.
    Initial Selection:
    • (c)
      Identify the DLR feature with the highest mutual information with the FDIA target. This represents the most relevant feature.
    • (d)
      Add this feature to the SelectedFeatures set.
  • 3.
    Iterative Feature Selection:
    • (e)
      For each remaining unselected DLR feature, compute its mutual information quotient (MIQ). This is done by:
      • (i)
        Calculating its mutual information with the FDIA target (relevance).
      • (ii)
Calculating its average mutual information with the features already in the SelectedFeatures set (redundancy).
      • (iii)
        Taking the quotient: MIQ = Relevance / (1 + Redundancy).
    • (f)
      From the unselected features, pick the one with the highest MIQ value. This feature is relevant to the FDIA target and minimally redundant with the previously selected features.
    • (g)
      Add this feature to the SelectedFeatures set.
    • (h)
      Repeat this step until a pre-defined stopping criterion is met (e.g., a certain number of features are selected, or the MIQ value falls below a threshold).
  • 4.
    Final Feature Set:
    • (i)
The SelectedFeatures set now contains the best-ranked DLR features and statistical measures, selected based on their MIQ values for the FDIA classification task.
    • (ii)
      Use this set for improving the viable machine learning algorithms prepared for the FDIA classification model.
Integrating feature ranking and selection with the machine learning algorithms guarantees that the chosen DLR features and statistical values provide the most relevant insights into FDIA cases by maximising their relevance to the classification task and minimising information redundancy, resulting in efficient classification. However, the proposed model’s reliance on real-time DLR data features, resource limitations, assumptions regarding known attack models and insufficient consideration of adversarial behaviour are inherent limitations. Overcoming these obstacles will require more robust interdisciplinary cooperation to devise resilient and scalable solutions that adequately safeguard power systems from FDIA threats.
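A compact sketch of the MIQ-based selection loop described above is shown below. The greedy structure follows the listed steps, while the use of scikit-learn's mutual information estimators, the function name and the stopping rule on a fixed number of features are assumptions for illustration.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif, mutual_info_regression

def mrmr_miq(X, y, n_select):
    """MIQ-based MR-MR ranking sketch.

    X: (n_samples, n_features) array of DLR-derived features; y: binary FDIA labels.
    Relevance is the mutual information of each feature with the target, redundancy the
    average mutual information with already-selected features, and features are picked
    greedily by MIQ = relevance / (1 + redundancy).
    """
    n_features = X.shape[1]
    relevance = mutual_info_classif(X, y, random_state=0)       # step 1(b): relevance to FDIA target
    selected = [int(np.argmax(relevance))]                      # step 2: most relevant feature first
    remaining = set(range(n_features)) - set(selected)

    while remaining and len(selected) < n_select:               # step 3: iterative selection
        best_feature, best_miq = None, -np.inf
        for j in remaining:
            redundancy = np.mean([
                mutual_info_regression(X[:, [s]], X[:, j], random_state=0)[0]
                for s in selected
            ])
            miq = relevance[j] / (1.0 + redundancy)
            if miq > best_miq:
                best_feature, best_miq = j, miq
        selected.append(best_feature)
        remaining.remove(best_feature)

    return selected                                             # step 4: ranked feature indices
```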

3.4. Error metrics

Accuracy and precision metrics are vital; however, as in medical diagnosis, forensics and fraud detection, the best model should have the highest sensitivity and the lowest false negative rate (FNR). Table 3 presents the fundamental and derived error metrics used to gauge the efficacy of FDIA mitigation algorithms [32], [33].

Table 3. Error metrics.

FUNDAMENTAL METRICS

ACCURACY = (TP + TN) / (TP + TN + FP + FN). Accuracy is easy to understand and widely used in many fields. However, it can be misleading on imbalanced datasets, where it may give a high score even though the classifier only correctly identifies the majority class.

PRECISION = TP / (TP + FP). Precision indicates the proportion of correctly identified positive instances out of all cases identified as positive by the classifier, but it does not consider false negatives.

RECALL (SENSITIVITY) = TP / (TP + FN). Recall indicates the proportion of correctly identified positive instances out of all positive samples in the dataset, making it essential where false negatives are more costly.

FALSE NEGATIVE RATE = FN / (TP + FN). FNR indicates the proportion of false negatives out of all positive instances in the dataset but does not consider false positives.

SPECIFICITY = TN / (FP + TN). Specificity indicates the proportion of correctly identified negative instances out of all actual negative instances. It is crucial where false positive predictions are more costly than false negatives, but it does not consider false negatives.

DERIVED METRICS

F-MEASURE = 2 × (PRECISION × RECALL) / (PRECISION + RECALL). F-measure ranges from 0 to 1, where 0 indicates the worst performance and 1 the best. It is useful when the data are imbalanced.

INFORMEDNESS = TPR + TNR − 1. Informedness depicts the extent to which the model’s predictions are better than random guessing. It ranges from −1 to 1, where −1 indicates the worst performance and 1 the best (complete agreement between the model’s predictions and the actual values).

MARKEDNESS = PPV + NPV − 1. The positive predictive value (PPV) is the same as precision, and the negative predictive value (NPV) is the number of true negatives divided by the sum of true and false negatives. Markedness measures the extent to which the model’s predictions are more informative than random guessing. It ranges from −1 to 1, where −1 indicates the worst performance and 1 the best (total agreement between the model’s predictions and the actual values).

CORRELATION = (TP × TN − FP × FN) / √((TP + FP)(TP + FN)(TN + FP)(TN + FN)). Correlation quantifies the extent of the linear relationship between the model’s predictions and the actual values. It ranges from −1 to 1, where −1 indicates a perfect negative correlation and 1 a perfect positive correlation.

ROC (AUC) = ∫₀¹ TPR(FPR) dFPR. The area under the receiver operating characteristic (ROC) curve (AUC) is a standard metric for evaluating a binary classifier. It ranges from 0 to 1, where 0 indicates the worst performance and 1 the best.

TP, TN, FP, FN, TPR, TNR and FPR denote the true positives, true negatives, false positives, false negatives, true positive rate, true negative rate and false positive rate, respectively.
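For concreteness, the metrics in Table 3 can be computed directly from the four confusion-matrix counts, as in the sketch below; the function and variable names are hypothetical, and a non-degenerate confusion matrix (no zero denominators) is assumed.

```python
# Fundamental and derived metrics of Table 3 from confusion-matrix counts.
# Assumes a non-degenerate confusion matrix (no zero denominators).
import math

def classification_metrics(tp, tn, fp, fn):
    tpr = tp / (tp + fn)          # recall / sensitivity
    tnr = tn / (tn + fp)          # specificity
    ppv = tp / (tp + fp)          # precision (positive predictive value)
    npv = tn / (tn + fn)          # negative predictive value
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": ppv,
        "recall": tpr,
        "fnr": fn / (tp + fn),
        "specificity": tnr,
        "f_measure": 2 * ppv * tpr / (ppv + tpr),
        "informedness": tpr + tnr - 1,
        "markedness": ppv + npv - 1,
        "correlation": (tp * tn - fp * fn) / math.sqrt(
            (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    }
```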

4. Results and Discussion

The simulation output shows how FDIA affects DLR historical data and latency values. Combining a statistical (data-driven) algorithm based on the hourly z-score with several machine-learning (learning-based) models allowed hybrid algorithm training. Training and testing were first performed without feature ranking and selection and then repeated with it. The MR-MR algorithm significantly impacts the training, validation and testing of machine learning algorithms for mitigating FDIA. During training, MR-MR aids in ranking and selecting the most relevant, distinguishing features, reducing overfitting and improving model generalisation to unseen data. Among these DLR features and their statistical representations, 30 highly ranked features were selected by the MR-MR algorithm: the z-score, the class label (Good Measure or FDIA), the new measure and 27 other DLR data features from different days. In the validation phase, MR-MR refines the model’s performance by excluding irrelevant or redundant features, resulting in more reliable output.
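The hourly z-score standardisation referred to here could be realised as in the sketch below, assuming the DLR history sits in a pandas DataFrame indexed by timestamp with a 'rating' column; the column names and the 3-sigma flag are illustrative assumptions, not the study’s exact pre-processing.

```python
# Illustrative hourly z-score standardisation of a DLR history (assumed schema).
import pandas as pd

def hourly_zscore(dlr: pd.DataFrame, col: str = "rating") -> pd.Series:
    hour = dlr.index.hour                              # hour of day, 0-23
    mu = dlr.groupby(hour)[col].transform("mean")      # hourly central tendency
    sigma = dlr.groupby(hour)[col].transform("std")    # hourly dispersion
    return (dlr[col] - mu) / sigma

# Example usage (hypothetical threshold): flag entries with |z| > 3 as suspect FDIA.
# dlr["z"] = hourly_zscore(dlr)
# dlr["suspect"] = dlr["z"].abs() > 3
```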

4.1. Performance evaluation metrics

Table 4(a) through 4(d) provide a comprehensive overview of the evaluation metrics observed during both the validation and testing phases of BGLM-LR and the other algorithms. These metrics offer a comparative analysis between models built with MR-MR feature ranking and selection and those built without it. Derived from the confusion matrix of each model, the error metrics show average values for informedness, specificity and markedness, coupled with satisfactory levels of accuracy, precision and recall during the validation phase. Intriguingly, MR-MR-selected features yielded noticeable improvements across these metrics. Particularly noteworthy is the marginal improvement in the BGLM-LR model validation, as highlighted in Tables 4(a) and 4(b): MR-MR feature selection raised precision from 0.789 to 0.897, while informedness rose significantly from 0.427 to 0.833 during the validation phase. These enhancements underscore the efficacy of integrating the MR-MR feature ranking and selection approach, showcasing its ability to refine model performance and bolster effective classification.

Table 4(a). Model Validation Results.

Metric                 BGLM-LR    GNB      LSVM     WNN      Fine DT
FUNDAMENTAL METRICS
ACCURACY               0.714      0.770    0.755    0.760    1.000
PRECISION              0.789      0.815    0.764    0.778    1.000
RECALL                 0.827      0.885    0.935    0.935    1.000
FNR                    0.250      0.167    0.063    0.094    0.000
SPECIFICITY            0.415      0.471    0.093    0.302    1.000
TRAINING TIME (s)      66.70      65.44    64.33    59.19    72.34
DERIVED METRICS
F-MEASURE              0.807      0.848    0.850    0.850    1.000
INFORMEDNESS           0.427      0.542    0.542    0.521    1.000
MARKEDNESS             0.266      0.424    0.424    0.418    1.000
CORRELATION            0.254      0.389    0.281    0.315    1.000
ROC (AUC)              0.681      0.798    0.819    0.751    1.000

(BGLM-LR: binary generalised linear model logistic regression; GNB: Gaussian naïve Bayes; LSVM: linear support vector machine; WNN: wide neural network; Fine DT: fine decision tree.)
In the pre-MR-MR scenario of Table 4(a), a notable number of false negatives was recorded during validation. With MR-MR feature selection, as outlined in Table 4(b), model performance improved significantly. For instance, BGLM-LR’s accuracy improved from 0.714 to 0.917 post-MR-MR application, indicating the technique’s effectiveness in enhancing model generalisation. This improvement was further validated by a decrease in its false negative rate (FNR) from 0.250 to 0.000. LSVM performed on par with BGLM-LR both before and after MR-MR ranking and selection, while GNB and WNN delivered average performance that was, in most cases, below that of BGLM-LR. The training time of BGLM-LR before MR-MR feature selection, 66.70 s, was the highest among its peers except for the DT at 72.34 s. While BGLM-LR is a powerful and interpretable algorithm, its performance may be inferior to other algorithms in scenarios characterised by imbalanced classes, non-linear relationships and high-dimensional data.

Table 4(b). MR-MR Ranked Features Model Validation Results.

Metric                 BGLM-LR    GNB      LSVM     WNN      Fine DT
FUNDAMENTAL METRICS
ACCURACY               0.917      0.984    0.917    0.917    1.000
PRECISION              0.897      0.979    0.897    0.897    1.000
RECALL                 1.000      1.000    1.000    1.000    1.000
FNR                    0.000      0.000    0.000    0.000    0.000
SPECIFICITY            0.698      0.943    0.698    0.698    1.000
TRAINING TIME (s)      69.68      79.37    77.38    122.22   72.34
DERIVED METRICS
F-MEASURE              0.946      0.989    0.946    0.946    1.000
INFORMEDNESS           0.833      0.967    0.833    0.833    1.000
MARKEDNESS             0.897      0.979    0.897    0.897    1.000
CORRELATION            0.791      0.961    0.791    0.791    1.000
ROC (AUC)              1.000      0.994    0.730    0.730    1.000
After MR-MR feature selection, BGLM-LR surpassed all other algorithms in training and validation time, as illustrated in Table 4(b). With the dimensionality of the data reduced, it was trained and validated in 69.68 s, while GNB, LSVM, WNN and DT completed training and validation after 79.37 s, 77.38 s, 122.22 s and 72.34 s, respectively. Its simplicity, efficiency, scalability, interpretability and ability to provide probabilistic predictions make binary GLM logistic regression a valuable choice for quick training in classification problems, particularly when computational resources are limited or when interpretability and model transparency are essential.
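A sketch of how such a timing comparison might be made is given below; the split variables, the feature-index list `selected` and the solver settings are assumptions for illustration, not the study’s configuration.

```python
# Time training/validation of the logistic-regression model on MR-MR-selected columns.
import time
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def time_bglm_lr(X_train, y_train, X_val, y_val, selected):
    start = time.perf_counter()
    model = LogisticRegression(max_iter=1000).fit(X_train[:, selected], y_train)
    val_auc = roc_auc_score(y_val, model.predict_proba(X_val[:, selected])[:, 1])
    elapsed = time.perf_counter() - start
    return model, elapsed, val_auc   # e.g. compare elapsed times across classifiers
```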
During the testing phase, the models initially demonstrated commendable performance across various metrics such as F-measure, recall, markedness, and correlation. However, following the integration of the MR-MR feature ranking algorithm, these metrics experienced a remarkable transition from commendable to exceptional levels. Table 4(c) and 4(d) provide specific insights into this transition, particularly highlighting the BGLM-LR model’s performance post-feature selection. Notably, the model exhibited optimal values for FNR and recall, with FNR reduced to 0.000 and recall reaching 1.000. These values represent a substantial improvement from their previous counterparts, where FNR stood at 0.042 and recall at 0.942. This significant enhancement underscores the efficacy of incorporating the MR-MR feature ranking approach. It demonstrates the algorithm’s ability to refine model performance, effectively reducing false negatives and enhancing recall, thereby fortifying the model’s classification capabilities and overall reliability.

Table 4(c). Model Testing Results.

Metric                 BGLM-LR    GNB      LSVM     WNN      Fine DT
FUNDAMENTAL METRICS
TESTING ACCURACY       0.698      0.781    0.729    0.729    1.000
PRECISION              0.722      0.786    0.772    0.747    1.000
RECALL                 0.942      0.957    0.884    0.942    1.000
FNR                    0.042      0.031    0.083    0.042    0.000
SPECIFICITY            0.074      0.333    0.333    0.185    1.000
DERIVED METRICS
F-MEASURE              0.818      0.863    0.824    0.833    1.000
INFORMEDNESS           -0.302     -0.219   -0.271   -0.271   1.000
MARKEDNESS             0.560      0.536    0.302    0.303    1.000
CORRELATION            0.030      0.394    0.256    0.196    1.000
ROC (AUC)              0.556      0.847    0.851    0.712    1.000
The empirical findings reveal that the BGLM-LR, GNB, LSVM and WNN models exhibited commendable performance on novel data following MR-MR feature optimisation. These algorithms demonstrated elevated metrics in accuracy, precision, recall and F-measure, highlighting their suitability for practical FDIA detection deployments. Variability in performance metrics was observed among the models, with the Fine DT model standing out with exemplary scores of 1.000 across accuracy, precision, recall, F-measure and ROC (AUC) during validation. While these outcomes were impressive, concerns arose regarding potential overfitting, suggesting a possible memorisation of the training data. Upon testing the models on new datasets, the initial results captured in Table 4(c) were surpassed after MR-MR optimisation, as demonstrated in Table 4(d). For instance, the WNN’s accuracy improved from 0.729 to 0.802, and the GNB model’s precision rose from 0.786 to 0.983, reaffirming the pivotal role of MR-MR in refining model performance.

Table 4(d). MR-MR Ranked Features Model Testing Results.

Metric                 BGLM-LR    GNB      LSVM     WNN      Fine DT
FUNDAMENTAL METRICS
TESTING ACCURACY       0.844      0.875    0.844    0.802    1.000
PRECISION              0.821      0.983    0.821    0.813    1.000
RECALL                 1.000      0.841    1.000    0.942    1.000
FNR                    0.000      0.115    0.000    0.042    0.000
SPECIFICITY            0.444      0.963    0.444    0.444    1.000
DERIVED METRICS
F-MEASURE              0.902      0.906    0.902    0.872    1.000
INFORMEDNESS           -0.406     -0.125   -0.156   -0.198   1.000
MARKEDNESS             0.821      0.686    0.821    0.563    1.000
CORRELATION            0.604      0.742    0.604    0.466    1.000
ROC (AUC)              1.000      0.915    1.000    0.960    1.000
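The overfitting suspected above for the fine decision tree can be probed with a simple train-versus-test accuracy gap, as in the hedged sketch below; the split ratio and the function name overfit_gap are illustrative assumptions, not the study’s procedure.

```python
# Train-vs-test accuracy gap as a quick overfitting indicator (illustrative only).
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

def overfit_gap(model, X, y, test_size=0.2, random_state=0):
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, random_state=random_state, stratify=y)
    model.fit(X_tr, y_tr)
    train_acc = accuracy_score(y_tr, model.predict(X_tr))
    test_acc = accuracy_score(y_te, model.predict(X_te))
    return train_acc, test_acc, train_acc - test_acc  # a large gap suggests memorisation
```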

4.2. Comparison of Model Validation and Testing Results

In the initial assessment of validation and testing in Fig. 3(a–h), BGLM-LR demonstrated moderate performance. During validation it achieved an AUC of 0.6809 (Fig. 3(a)), while during testing it yielded an AUC of 0.5563 (Fig. 3(b)), indicative of performance approaching chance level, as denoted by the red dotted line. Conversely, GNB, as depicted in Fig. 3(c) and (d), surpassed BGLM-LR with a validation AUC of 0.7978 and a testing AUC of 0.8465. Similarly, LSVM outperformed BGLM-LR during both validation and testing before the application of feature ranking and selection, recording AUC values of 0.8188 during validation (Fig. 3(e)) and 0.8514 during testing (Fig. 3(f)). The WNN, illustrated in Fig. 3(g) and (h), also demonstrated competitive results, with an AUC of 0.7513 during validation and 0.7117 during testing.
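ROC curves of this kind can be generated with scikit-learn and matplotlib, as in the sketch below; the dictionary of fitted models and the use of decision_function for the LSVM are assumptions made for illustration.

```python
# Plot ROC curves with the chance-level diagonal for a set of fitted classifiers.
import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, auc

def plot_roc(models, X_test, y_test):
    for name, model in models.items():
        if hasattr(model, "predict_proba"):
            scores = model.predict_proba(X_test)[:, 1]
        else:                                   # e.g. LinearSVC has no predict_proba
            scores = model.decision_function(X_test)
        fpr, tpr, _ = roc_curve(y_test, scores)
        plt.plot(fpr, tpr, label=f"{name} (AUC = {auc(fpr, tpr):.3f})")
    plt.plot([0, 1], [0, 1], "r:", label="chance level")   # red dotted diagonal
    plt.xlabel("False positive rate")
    plt.ylabel("True positive rate")
    plt.legend()
    plt.show()
```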

Fig. 3. Confusion matrix of algorithms.

The integration of MR-MR feature ranking and selection marked a pivotal enhancement for the algorithms under evaluation. Notably, this integration propelled BGLM-LR, previously showing moderate success, to a significantly higher level of performance. Across all evaluation metrics, BGLM-LR and LSVM emerged as the standout algorithms. During the evaluation phases, their AUC scores reached an exemplary 1.000, as demonstrated in Fig. 4(c) and 4(d) respectively, highlighting their superiority over other classifiers. These results strongly affirm the transformative potential of MR-MR feature selection techniques in enhancing model efficacy.

Fig. 4. (a) and (b). BGLM-LR Validation and Testing. (c) and (d). GNB Validation and Testing. (e) and (f). LSVM Validation and Testing. (g) and (h). WNN Validation and Testing.

Furthermore, GNB and WNN also experienced substantial benefits from MR-MR optimisation, evident in Fig. 4(a) and (b) respectively. Their testing AUC scores witnessed remarkable improvements, rising from 0.8465 in Fig. 3(d) and 0.7117 in Fig. 3(h) to 0.9147 in Fig. 4(a) and 0.9595 in Fig. 4(b) respectively, indicative of enhanced performance. However, it is noteworthy that despite these considerable gains, they still slightly trailed the BGLM-LR. This underscores the exceptional effectiveness of BGLM-LR when combined with MR-MR, positioning it as the leading choice for detecting FDIA.
In summary, this research underscores the critical role of optimisation techniques like MR-MR in False Data Injection Attack detection. BGLM-LR’s transformation into a top-performing algorithm exemplifies the potential of feature selection in elevating model performance, ultimately establishing itself as the premier choice among the evaluated classifiers.

5. Conclusion

This research has addressed FDIA on DLR systems in cyber-physical power system networks using integrated statistical and machine learning models. By standardising DLR ratings with the z-score technique and employing MR-MR feature ranking and selection, the misidentification of FDIA by the learning algorithms was minimised. The research highlights the efficacy of MR-MR in enhancing model performance, with notable potential seen in BGLM logistic regression and the linear support vector machine. The two algorithms achieved comparably exceptional results in testing, but BGLM logistic regression exhibits the greater marginal improvement in FNR, recall and AUC values from training to testing. Additionally, the training and validation time of BGLM-LR was minimal compared with LSVM and the other models, making it the fastest among its peers and a desirable choice when conserving computational resources is pertinent. Implementing this model on CPS networks will allow fuller utilisation of line capacity, thereby accommodating more renewables and gearing the grid towards sustainable energy. Such integrated solutions would aid in meeting the United Nations’ 7th and 13th Sustainable Development Goals for 2030, namely affordable and greener energy generation and climate change mitigation [34]. Future studies should explore diverse optimisation methods and evaluate algorithms across various datasets and environments to advance FDIA detection methodologies.

CRediT authorship contribution statement

Jiashen Teh: Project administration, Validation, Visualization, Writing – review & editing. Olatunji Lawal: Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Resources, Software, Writing – original draft. Bader Alharbi: Project administration, Supervision, Validation, Visualization, Writing – review & editing. Ching-Ming Lai: Investigation, Methodology, Project administration, Supervision, Validation, Visualization, Writing – review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

The authors extend their appreciation to the Deanship of Postgraduate Studies and Scientific Research at Majmaah University for funding this research work through project number (R-2024-1013).

Appendix A. Supplementary material

Data availability

Data will be made available on request.

References
