MSE-TCN: Multi-scale temporal convolutional network with channel attention for open-set gas classification
Xu Ma $^{\mathrm{a}}$, Fan Wu $^{\mathrm{a}}$, Jiaxin Yue $^{\mathrm{a,b}}$, Peter Feng $^{\mathrm{c}}$, Xiaoyan Peng $^{\mathrm{a,b}}$, Jin Chu $^{\mathrm{a,b,*}}$
$^{\mathrm{a}}$ College of Artificial Intelligence, Southwest University, Chongqing 400715, China
$^{\mathrm{b}}$ Chongqing Key Laboratory of Brain-Inspired Computing and Intelligent Chips, Chongqing 400715, China
$^{\mathrm{c}}$ Department of Physics, University of Puerto Rico, San Juan, PR 00931, USA
ARTICLE INFO
Keywords:
Open-Set
Gas classification
E-nose
TCN
SE-ResNet
Abstract
Research on closed-set gas classification has achieved great success in the electronic nose (E-nose) field. In practice, however, E-noses face the more challenging and realistic task of open-set gas classification. To obtain an accurate open-set gas classification model, we propose MSE-TCN, which embeds a squeeze-and-excitation residual network (SE-ResNet) into the temporal convolutional network (TCN) and expands the number of TCN channels to form a multi-scale feature extraction encoder; open-set gas classification is then realized with the OpenMax algorithm. The underlying SE-ResNet module focuses on the relatively important sensor channels, the multi-scale TCN captures temporal relationships in the E-nose data at three scales, and OpenMax identifies unknown classes by redistributing the class probabilities. The gas sensing performance of the sensors used in the self-sampling dataset was analyzed, demonstrating the reliability of the data acquisition. Comparative experiments on the self-sampling dataset and a public dataset were then performed to determine the number of encoders and to demonstrate the necessity of the SE-ResNet module, and ablation experiments demonstrate the effectiveness of the proposed model. In addition, comparative experiments with other open-set classifiers show that MSE-TCN achieves the highest open-set classification accuracies among all models, 0.9245 and 0.9387 on the two datasets, respectively. The model therefore provides an effective method for high-accuracy closed-set and open-set gas classification in the E-nose field.
1. Introduction
An electronic nose (E-nose) is a system designed to simulate animal olfaction and usually consists of two parts: a sensor array and a pattern recognition system [1]. With the rapid development of pattern recognition algorithms, the performance of E-noses has improved greatly, leading to their wide application in food [2,3], medical treatment [4,5], agriculture [6,7], and other fields.
Gas classification is a primary task of the E-nose, and the classification accuracy depends heavily on the pattern recognition algorithm of the E-nose system. Before artificial neural networks (ANNs) [8] were proposed, the classification algorithms for E-noses were mainly traditional machine learning methods, such as K-nearest neighbor (KNN) [9], support vector machine (SVM) [10], and principal component analysis (PCA) [11]. Although these methods are effective, the manual feature extraction they require limits the accuracy of gas classification [12,13]. Currently, neural network algorithms such as the convolutional neural network (CNN) [14], recurrent neural network (RNN) [15], and long short-term memory (LSTM) [16] are widely used in gas classification and other fields with satisfactory results.
Further, because E-nose data exhibit temporal features, the classification of E-nose data is a sequence modeling task, which has been studied thoroughly by various researchers. For example, Bai et al. compared convolutional architectures with recurrent architectures and developed a temporal convolutional network (TCN) suitable for sequence modeling [17]. Wu et al. developed TETCN for modeling E-nose data by combining the multi-head attention mechanism with TCN [18]. In addition, some researchers have combined the Transformer with CNN to process E-nose data, forming a stronger classification network [19].
However, the majority of current research on gas classification focuses on the closed-set setting, in which the training set labels and the test set labels originate from the same set [20]. In practice, open-set gas classification is more realistic and challenging due to the uncertainty
Most open-set classification methods based on traditional machine learning are built on SVM, such as 1-vs-Set [21], W-SVM [23], and $\mathrm{P}_{\mathrm{I}}$-SVM [24]. There are also models based on other algorithms, such as SROSR [25] and NNO [26]. However, these methods are tightly coupled to traditional machine learning algorithms, are difficult to adapt to deep neural networks, and are therefore not applicable in our study. For deep neural networks, the general approach to open-set classification is to decide whether a sample belongs to an unknown class by adding a threshold after the SoftMax layer, called the SoftMax threshold (ST). For example, Akshay Raj Dhamija et al. [27] achieved satisfactory results in image classification with unknown classes using the ST method together with additional background or garbage classes. Gaurav Jaiswal [28] proposed the Threshold Softmax Layer (TSM) and a learning algorithm that handles the unknown class problem with reduced misclassification error. However, Bendale et al. [29] argued that ST rejects uncertain samples rather than samples of unknown classes. They therefore proposed the OpenMax algorithm, which recalculates the class probabilities after fitting a Weibull distribution and replaces the SoftMax layer to handle unknown classes, yielding better image classification results.
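The SoftMax threshold (ST) rule described above can be illustrated with a minimal sketch. It is not taken from the cited works: the function names, the threshold value of 0.5, and the use of `-1` as the "unknown" label are assumptions made only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable SoftMax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_threshold_predict(logits, threshold=0.5):
    """Return the index of the most probable known class, or -1 ("unknown")
    when the highest SoftMax probability falls below the threshold.

    The threshold value and the -1 label are illustrative assumptions.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best if probs[best] >= threshold else -1
```

A sample with one dominant logit is assigned to that class, whereas a sample with nearly uniform probabilities is rejected. This also shows the limitation raised by Bendale et al.: the rule rejects samples the network is uncertain about, which is not the same as detecting a genuinely unknown class.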
In addition, Shu et al. replaced the SoftMax layer with a 1-vs-rest final layer of sigmoids and presented the Deep Open Classifier (DOC) model for open-set text classification [30]. Other researchers have approached open-set classification from the perspective of the loss function and proposed a distance-based loss function called CAC [31]. It is worth noting that current open-set classification algorithms concentrate on the field of computer vision and neglect application and evaluation on time series data, such as E-nose data [22].
The performance of an open-set classifier depends on its underlying closed-set classifier, so high-performance open-set classification requires powerful and detailed feature extraction. Therefore, to find a high-performance open-set classification model for gas data, we started by developing a closed-set classifier with powerful feature extraction capabilities and proposed a model composed of an optimized multi-scale TCN and an embedded squeeze-and-excitation residual network (SE-ResNet), named MSE-TCN. The multi-scale TCN explores the temporal relationships in the E-nose data with a receptive field of variable length and the ability to capture long-term dependence [17], and achieves thorough feature extraction through expanded channels. The embedded SE-ResNet module focuses on the sensor channels that contribute more gas features by learning channel weights [32]. Finally, the open-set classification results are obtained with the OpenMax algorithm, which replaces the SoftMax layer of the closed-set classifier to produce the open-set classifier. Extensive experiments demonstrate that MSE-TCN outperforms the comparison models in both closed-set and open-set classification.
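To make the idea of learning channel weights concrete, the following is a minimal sketch of a generic squeeze-and-excitation gate applied to multi-channel sensor data. It is not the authors' SE-ResNet module: the function name `se_reweight`, the weight matrices, and the omission of the convolutional residual branch and biases are simplifying assumptions.

```python
import math


def se_reweight(x, w_reduce, w_expand):
    """Generic squeeze-and-excitation gate (illustrative sketch only).

    x        : list of C channels, each a list of T samples
    w_reduce : R x C weight matrix of the bottleneck layer (hypothetical)
    w_expand : C x R weight matrix of the expansion layer (hypothetical)
    Returns (scaled channels, per-channel gates in (0, 1)).
    """
    # Squeeze: global average over time gives one descriptor per channel.
    z = [sum(ch) / len(ch) for ch in x]
    # Excitation, step 1: bottleneck fully connected layer with ReLU.
    h = [max(0.0, sum(w * zi for w, zi in zip(row, z))) for row in w_reduce]
    # Excitation, step 2: expansion layer with sigmoid gives channel gates.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * hi for w, hi in zip(row, h))))
        for row in w_expand
    ]
    # Scale: every sample of a channel is multiplied by that channel's gate.
    scaled = [[v * g for v in ch] for ch, g in zip(x, gates)]
    return scaled, gates
```

In a trained network the two weight matrices are learned, so channels that carry more discriminative gas information receive gates closer to 1 and less informative channels are suppressed.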
The contributions of this article are summarized as follows:
(1) To extract features thoroughly, the internal channels of the TCN were expanded to form a multi-scale feature extraction TCN, whose backbone was optimized by using the Gaussian error linear unit (GELU) to weight the inputs, giving more accurate results than the rectified linear unit (ReLU).
(2) The experimentally selected SE-ResNet module was embedded into the multi-scale TCN to capture the information of the more important sensor channels, resulting in a more powerful encoder for feature extraction.
(3) OpenMax was used to reconstruct the classification vectors, achieving high-accuracy gas classification, and was verified on the self-sampling dataset and the public dataset from the UCI website.
(4) The gas sensing performance of the sensors used in the self-sampling dataset was analyzed in terms of response, repeatability, and selectivity.
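The difference between the two activation functions named in contribution (1) can be seen in a short sketch. It uses the standard exact definition of GELU, $x\,\Phi(x)$ with $\Phi$ the standard normal cumulative distribution function; whether the model uses this exact form or a common approximation is not stated in this section.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes the rest."""
    return max(0.0, x)


def gelu(x):
    """Gaussian error linear unit, exact form x * Phi(x), where Phi is the
    standard normal CDF expressed through the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


if __name__ == "__main__":
    for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"x={value:+.1f}  relu={relu(value):.4f}  gelu={gelu(value):.4f}")
```

ReLU gates its input by sign alone, whereas GELU weights each input by the probability mass below it, so small negative inputs are attenuated rather than set to zero.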
The rest of the article is structured as follows: Section 2 describes the two datasets used in the experiments; Section 3 explains the details of the model; Section 4 analyzes the results of the experiments and discusses the performance of the proposed model; Section 5 is the conclusion.
2. Materials
In this section, the self-sampling dataset and the public dataset from the UCI website (UCI dataset) used in the experiments are introduced.
2.1. Self-sampling dataset
The data acquisition device in our lab consists of a computer, a gas mixing control chamber, a test system main chamber, a sensor array, and a data collection system, as illustrated in Fig. 1. The sensor array consists of 10 gas sensors, 2 of each type (MQ-7B, TGS816, TGS822, TGS826, and MQ136), and the data are collected using the JF02F software provided by Guiyan Jinfeng Company. The background gas and the test gas are passed into the gas mixing control chamber. The mixed gas at the desired concentration is then delivered into the main chamber, in which the gas analytes react fully with the sensing layer of the sensor array. The signal of the sensor array is collected by the data collection system. During this process, the sensors and built-in high-precision flow control modules in the test system main chamber provide real-time monitoring of pressure, temperature, and flow data. Through the coordinated operation of the system, data collection is efficient and highly stable, which ensures the reliability of the subsequent data processing and gas classification.
The details of the data collection process are as follows. The test gas and the background gas, consisting of 70% nitrogen and 30% oxygen, were fed to the gas mixing control chamber. The total gas flow rate was set to 300 mL/min and kept constant during the entire experimental process, and the gas concentration was varied by adjusting the flow ratio of background gas to test gas. According to the table presented on the operating screen, the test voltage was adjusted to 8 V, while the optimal operating temperature of the sensors, which is attained via a built-in heater driven by an external circuit, was obtained when the heater voltage was set to 5 V.
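Controlling the concentration through the flow ratio at a constant total flow amounts to simple dilution arithmetic, sketched below. The total flow of 300 mL/min comes from the text; the concentration of the test gas supply (`source_ppm`) is not given in this section, so the value used in the example is purely hypothetical.

```python
TOTAL_FLOW_ML_MIN = 300.0  # total flow rate stated in the text


def flow_split(target_ppm, source_ppm, total_flow=TOTAL_FLOW_ML_MIN):
    """Return (test gas flow, background gas flow) in mL/min so that the
    mixture reaches target_ppm at a constant total flow.

    source_ppm is the concentration of the test gas supply; it is a
    hypothetical parameter that the text does not specify.
    """
    if not 0 <= target_ppm <= source_ppm:
        raise ValueError("target concentration must lie in [0, source_ppm]")
    test_flow = total_flow * target_ppm / source_ppm
    return test_flow, total_flow - test_flow


if __name__ == "__main__":
    # Example with an assumed 1000 ppm test gas supply.
    print(flow_split(100, 1000))
```

Under this assumed supply concentration, a 100 ppm mixture requires one tenth of the total flow to be test gas, with the background gas making up the remainder.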
The sensors were pre-heated for 48 h before the sensing measurements began to guarantee reproducible response patterns in each experiment. For each measurement, the chamber was cleaned for 120 s, followed by a sampling time of 180 s, divided into 60 s of gas inflation and 120 s of deflation; this procedure was repeated for the subsequent experiments. Fifteen concentrations ranging from 0 to 300 ppm were set for all test gases, with a gradient interval of 20 ppm.
A total of 791 measurements were conducted under strictly controlled experimental conditions for the following seven gases: carbon monoxide (CO), ammonia ($\mathrm{NH}_{3}$), ethylene ($\mathrm{C}_{2}\mathrm{H}_{4}$), hydrogen chloride (HCl), propane ($\mathrm{C}_{3}\mathrm{H}_{8}$), nitrogen dioxide ($\mathrm{NO}_{2}$), and sulfur dioxide ($\mathrm{SO}_{2}$). Each data point in the sensor data serves as a feature and corresponds to the measured resistance values of the 10 sensors. To obtain reliable data, we extracted an 80 s feature segment from each response curve, covering the last 20 s of the deflation phase and the full ventilation phase. Fig. 2 displays the response curves of the gas sensors during the first 400 s of measurements for one gas and a feature map of the extracted data; the sensor model for each sensor array channel is listed in Table 1.
Fig. 1. Diagram of experimental setup for self-sampling dataset.
Fig. 2. (a) Response curves of the sensors during the first 400 s for one gas, and (b) a feature map of 80 s used in this work. Each curve represents the data collected by a sensor.
Table 1
Correspondence table between sensor array channel and sensor model.
To evaluate the generalization performance of the model, the dataset published by Alexander Vergara et al. on the UCI website was also used [33]. The dataset was collected using 9 sensor arrays, each containing 8 sensors. The collection device was placed in a 2.5 m × 1.2 m × 0.4 m wind tunnel research test facility at the Biocircuits Institute of the University of California, San Diego. To ensure the authenticity of the data collection, they positioned the sensors in 6 different locations in turn, forming 6 subsets (L1, L2, …, L6) according to the distance from the gas source, as shown in Fig. S1.
In this dataset, a total of 18,000 time series were collected from 72 metal oxide gas sensors exposed to 10 types of gas: acetone ($\mathrm{C}_{3}\mathrm{H}_{6}\mathrm{O}$), acetaldehyde ($\mathrm{C}_{2}\mathrm{H}_{4}\mathrm{O}$), $\mathrm{NH}_{3}$, butanol ($\mathrm{C}_{4}\mathrm{H}_{9}\mathrm{OH}$), $\mathrm{C}_{2}\mathrm{H}_{4}$, methane ($\mathrm{CH}_{4}$), CO, benzene ($\mathrm{C}_{6}\mathrm{H}_{6}$), methanol ($\mathrm{CH}_{3}\mathrm{OH}$), and toluene ($\mathrm{C}_{7}\mathrm{H}_{8}$), with each sample containing a full response cycle of each gas sensor array. The sensor arrays were first exposed to clean air at a flow speed of 0.21 m/s for 20 s before data collection. One of the 10 test gases, selected at random, was then passed through the device for 180 s, followed by ventilation at the same fan speed for 60 s to return the sensor resistance to its baseline after the gas injection was halted.
The duration of each cycle in this dataset was 260 s, with a data acquisition frequency of 100 Hz. Since high-dimensional data can slow the convergence of the model, the dimensionality of the dataset should be reduced while retaining as many features as possible. Hence, average pooling was selected for dimensionality reduction, calculated by Eq. (1):

$$y_{i}=\frac{1}{k} \sum_{j=i \times k}^{(i+1) \times k-1} x_{j} \tag{1}$$

where $x$ is the input data, $y$ is the pooled data, $i$ is the index in the output data, and $k$ is the size of the pooling window.
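Eq. (1) can be written directly as a short routine. The sketch below assumes non-overlapping windows and drops any trailing samples that do not fill a complete window; the text does not say how such a remainder is handled, and the window size `k` used for the dataset is not given in this section.

```python
def average_pool(x, k):
    """Non-overlapping average pooling following Eq. (1):
    y[i] = (1 / k) * sum(x[j] for j in range(i * k, (i + 1) * k)).

    Trailing samples that do not fill a whole window are dropped
    (an assumption; the text does not specify this case).
    """
    if k <= 0:
        raise ValueError("pooling window k must be a positive integer")
    n_out = len(x) // k
    return [sum(x[i * k:(i + 1) * k]) / k for i in range(n_out)]


if __name__ == "__main__":
    print(average_pool([1, 2, 3, 4, 5, 6], 2))
```

For scale, a 260 s cycle sampled at 100 Hz contains 26,000 points per sensor, so a window of size $k$ reduces each series to $\lfloor 26000 / k \rfloor$ points.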
3. Methodology
This section first describes the pre-processing methods for E-nose gas data, followed by the details of MSE-TCN, including the TCN, the multi-scale channel, the SE-ResNet module, and the OpenMax algorithm.
Corresponding author at: College of Artificial Intelligence, Southwest University, Chongqing 400715, China.