Elsevier

Water Research

Volume 189, 1 February 2021, 116657
Development of an ensemble of machine learning algorithms to model aerobic granular sludge reactors

https://doi.org/10.1016/j.watres.2020.116657

Highlights

  • A new multi-stage model structure was developed to explain the effluent predictions.
  • Multicollinearity reduction and the RReliefF algorithm improved the model performance.
  • An ensemble of machine learning algorithms was used for more accurate predictions.
  • The average R2, nRMSE, and sMAPE were 95.7%, 0.032, and 3.7%, respectively.
  • The model was able to explain a failure instance in the predicted dataset.

Abstract

Machine learning models provide an adaptive tool to predict the performance of treatment reactors under varying operational and influent conditions. Aerobic granular sludge (AGS) is still an emerging technology and does not have a long history of full-scale application. There is, therefore, a scarcity of long-term data in this field, which has hindered the development of data-driven models. In this study, a machine learning model was developed for simulating the AGS process using 475 days of data collected from three lab-based reactors. Inputs were selected based on RReliefF ranking after multicollinearity reduction. A five-stage model structure was adopted in which each parameter was predicted using a separate model with the preceding stages' parameters as inputs. An ensemble of artificial neural networks, support vector regression and adaptive neuro-fuzzy inference systems was used to improve the models' performance. The developed model was able to predict the MLSS, MLVSS, SVI5, SVI30, granule size, and effluent COD, NH4-N, and PO43− with average R2, nRMSE and sMAPE of 95.7%, 0.032 and 3.7%, respectively.

Keywords

Machine learning
Artificial neural networks
Adaptive neuro-fuzzy inference systems
Support vector regression
Aerobic granular sludge
Sequencing batch reactors

1. Introduction

Aerobic granular sludge (AGS) is a promising biological wastewater treatment technology that has shown excellent performance in laboratories for the treatment of domestic and high-strength wastewater and is starting to be applied in full-scale wastewater treatment plants (WWTPs) (Pronk et al., 2015; Zheng et al., 2020). AGS has certain advantages over conventional activated sludge (CAS) in terms of lower reactor footprint, higher capacity for organic loading and simultaneous removal of nutrients and organics (He et al., 2020). The compact structure of the biomass granules gives the reactor higher resilience against shock loads and toxic wastewater and provides better biomass retention due to the enhanced settling properties (Franca et al., 2018; Nancharaiah & Reddy, 2018).
Although AGS has consistently shown promise in terms of performance, the operation of AGS bioreactors is challenging due to the large number of factors affecting the process (Wilén et al., 2018). The characteristics of the influent wastewater, the biomass properties within the reactor and the operational conditions all play a significant role in the removal efficiency of the reactor. Additionally, these factors are interconnected and have complex nonlinear relationships (Khan et al., 2013). Influent characteristics and the mode of operation of the sequencing batch reactor (SBR) play a significant role in shaping the microbial culture, which in turn affects the integrity of the granule structure and its settling ability. The settling ability of the biomass is also affected by the settling time, volumetric exchange ratio, and discharge time. At the end of the settling period, slow-settling biomass that does not settle below the effluent port gets washed out of the reactor during the decant phase, leaving the faster-settling granules inside the reactor (Qin et al., 2004; Wang et al., 2004). This also affects the concentration of biomass left inside the reactor after every cycle of operation, which provides seed for new granule formation and, therefore, directly affects the level of organics and nutrients removal. A certain aeration time is necessary for aerobic degradation of organics, for nitrification, and for providing the shear force that triggers granulation in the biomass flocs, the latter being the governing factor (Hamza et al., 2018). Other factors that affect the AGS process include the influent pH, volumetric exchange ratio, hydraulic retention time (HRT), and temperature (Khan et al., 2013). Sudden changes to these factors can lead to the failure of the structural integrity of the granules and the washout of biomass from the reactor, leaving the reactor unable to meet the required effluent quality. Since these factors continually change, the operation of an AGS system is challenging and requires careful monitoring.
A tool that can simulate AGS reactors considering all previously mentioned factors would help alleviate some of this challenge. Such a tool can provide operators with the ability to predict the reactor performance and adapt as the quality of influent wastewater changes.
There are several studies in the literature that present physical models for AGS reactors (Baeten et al., 2019; Ni & Yu, 2010). Physical AGS models utilize the biofilm model to simulate the diffusion of the substrate into the granules and Activated Sludge Models (ASM) based equations to simulate the kinetics of the biological process (Cui et al., 2020). Many restrictions and assumptions have to be made to keep physical models from becoming overly complicated (Ni & Yu, 2010). The calibrated kinetic and stoichiometric parameters will change with any change in operation or influent wastewater, making it challenging to use physical models for process control (Baeten et al., 2018). Physical models, however, are excellent for understanding the biological processes and conversion rates, and for studying the factors affecting the process performance.
Machine learning provides an excellent alternative to physical models for predicting reactor performance and process control. Data-driven models can overcome the need for continuous re-calibration of physical models. They are more adaptive and can learn from new data that is collected as the process continues to run (El-Din et al., 2004). The use of machine learning for AGS modelling has not been studied as extensively as physical modelling (Baeten et al., 2019). Artificial neural networks (ANN) were used to simulate the AGS process in a simple model structure where only chemical oxygen demand (COD) and total nitrogen (TN) removals were predicted, with an R2 of 0.9 and 0.81 (Gong et al., 2018). Single hidden layer ANNs were used to predict the effluent COD using six inputs, resulting in an R2 of 0.91 (Mahmod & Wahab, 2017). Another AGS model was developed using single hidden layer feed-forward ANNs with eight inputs to predict the effluent COD, NH4-N, and TN with an R2 of 0.9988, 0.9997, and 0.9991, respectively (Liang et al., 2020). A more comprehensive model structure was developed using feed-forward multi-layer ANNs to simulate the full AGS process, including the prediction of biomass characteristics and the removal rates of COD, ammonia, and phosphates, with a minimum prediction R2 of 99% (Zaghloul et al., 2018). Adaptive neuro-fuzzy inference systems (ANFIS) and support vector regression (SVR) were investigated as alternative algorithms to ANN, concluding that SVR provided comparable results to ANN with a minimum R2 of 0.997, while ANFIS provided lower prediction accuracy than ANN and SVR with a minimum R2 of 0.815 when simulating AGS reactors (Zaghloul et al., 2020).
Aside from modelling AGS, machine learning has shown excellent performance in simulating other wastewater treatment processes such as CAS, showing the potential application of machine learning in forecasting and process control (Corominas et al., 2018). An ANN model was successfully used for modelling the BOD and TSS removal in a full-scale CAS process using single input-single output models with an R2 of 0.665 and 0.542, respectively (Hamed et al., 2004). ANN was also used for the development of software sensors that predict the effluent TN, TP and COD for a real-time remote monitoring system in another full-scale CAS treatment plant, with an R2 of 0.952, 0.934, and 0.921, respectively (Lee et al., 2008). SVR was used to predict the removal of COD, ammonia, and nitrates in a CAS process using microbial community data with an R2 of 0.9501, 0.7936, and 0.8916, respectively (Seshan et al., 2014). ANN and SVR were compared for the prediction of effluent TN in a CAS process treating food waste leachate, showing that both algorithms performed similarly where the R2 was 0.47 and 0.46 respectively, however, the SVR suffered from overfitting where the training R2 was 1.00 (Guo et al., 2015). ANFIS was compared to SVR for predicting the removal of TKN in a full-scale BNR plant, where SVR provided better performance than ANFIS with R2 values of 0.85 and 0.91, respectively (Manu & Thalla, 2017).
The studies above concluded that ANN, SVR and ANFIS are capable of simulating various biological treatment processes in WWTPs. It was also observed that while ANN provided reliable results, it required the largest training datasets to provide good-quality modelling. SVR was reported to provide unique solutions to regression problems and to be less likely than ANN to get trapped in local error minima during error optimization, but the final model formulation is hard to interpret, and the computational requirements increase with larger datasets (Karamizadeh et al., 2014). ANFIS models can be relatively easier to interpret than ANN and SVR, but the number of fuzzy rules increases exponentially with the number of input variables and input membership functions (Stathakis et al., 2006). Ye et al. (2020) detailed the characteristics, advantages and limitations of several algorithms, including the ones used in this study. They showed that: (1) ANNs are accurate but carry a risk of overfitting, and finding the best architecture is difficult; (2) SVR works well with noisy data and does not require as much training data, but needs more computational power than other algorithms; (3) ANFIS can optimally solve nonlinear problems, but it is difficult to find the best model structure.
Machine learning ideally requires large datasets for training the algorithms (Liu et al., 2017). Databases from AGS WWTPs are still not large enough for conventional machine learning simulations. Small datasets are challenging when used for training machine learning models, i.e. the training process becomes highly affected by data quality issues, dimensionality, and multicollinearity (Shaikhina & Khovanova, 2017). Additionally, small datasets with high dimensionality increase the required level of model complexity to achieve reasonable prediction accuracies (Wójcik & Kurdziel, 2019). Data pre-processing and feature selection play an important role in handling outliers and gaps, normalizing features, and reducing dimensionality and multicollinearity in the dataset, which improves the model training and final performance.
This work presents a modelling approach for AGS reactors when only small datasets are available. Data were pre-processed and cleaned, then feature-selection was performed using the variance inflation factor (VIF) for reducing multicollinearity and the RReliefF algorithm for ranking inputs. A combination of ANN, SVR and ANFIS algorithms was used via different ensemble techniques, and the best performing technique was used for the final model. A multi-stage model structure was developed to provide stepwise predictions where outputs of each stage get added to the potential pool of inputs for the following stage. The purpose of this model is to provide a tool that can predict the biomass characteristics inside AGS reactors, effluent characteristics (concentrations of COD, NH4-N, and PO43−), and potential failure to meet user-defined treatment requirements.

2. Methods

2.1. Experimental Setup

Three SBRs were set up and operated to collect the required data for this study. Reactor R1 had a diameter of 89 mm, and a working volume of 4.5 L. Reactors R2 and R3 had a diameter of 150 mm, and a working volume of 19 L. Fig. 1 shows the general setup of the reactors.
Fig. 1. SBR reactors setup.

The SBR operation was automated with scheduled times for fill, idle, aeration (reaction), settling, and draw (decant). Table 1 shows the cycle times and superficial air velocity used for the duration of the data collection period. Aeration was provided using air compressors and controlled using Cole-Parmer airflow meters and regulators. Air was diffused into the reactor using Paintair fine bubble ceramic diffusers (AS4). Masterflex peristaltic pumps were used for feeding the reactors.

Table 1. Reactor operation parameters.

Parameter                       | Reactor R1 | Reactor R2 | Reactor R3
Fill Time (min)                 | 6 – 7      | 6 – 8      | 60
Idle Time (min)                 | 0 – 5      | 1 – 3      | 2
Aeration Time (min)             | 180 – 182  | 180 – 222  | 145 – 172
Settling Time (min)             | 3 – 15     | 8 – 30     | 5 – 30
Decanting Time (min)            | 1 – 6      | 1          | 1
Superficial Air Velocity (cm/s) | 1.6 – 4    | 2.11       | 3
The reactors were operated using synthetic wastewater prepared as detailed in Tay et al. (2002). The main carbon, nitrogen and phosphorus sources were sodium acetate, ammonium chloride, and monopotassium and dipotassium phosphate, respectively. Return activated sludge (RAS) was procured from the Pine Creek wastewater treatment plant for seeding the granulation process. The reactors were run at a stable temperature of 18±2°C. Influent, effluent and biomass samples were collected daily. Mixed liquor suspended solids (MLSS), mixed liquor volatile suspended solids (MLVSS), 5-minute sludge volumetric index (SVI5) and 30-minute SVI (SVI30) were measured according to standard methods (Rice et al., 2017). The United States Environmental Protection Agency (USEPA) reactor digestion method was adopted for the measurement of COD using a HACH DR-2400 spectrophotometer. The salicylate method was used to measure ammonia with TNT 830, 831, 832 and 833 kits. Ion chromatography was used to measure reactive phosphate using a Metrohm Compact IC Flex based on the Standard Methods for the Examination of Water and Wastewater (Rice et al., 2017). Laser particle size analysis was used to measure the granule size (Malvern MasterSizer Series 2000).

2.2. Model Structure

This study adopted a 5-stage model structure where each of the stages 2–5 is predicted using the preceding stages as potential inputs, as shown in Fig. 2. The multi-stage model structure is designed to simulate the cause-effect process in AGS reactors, where the influent characteristics and operational parameters affect the biomass concentration due to the growth and decay of the microbial community. The biomass concentration and the SBR operation directly affect the biomass settling properties, which in turn affect the granule growth. All the previous parameters and interactions affect the removal efficiency and the effluent wastewater quality. Each of the parameters in stages 2–5 was predicted using a separate model, except for the F/M ratio, which was calculated using the influent organics and biomass concentrations and then added as an input for stages 4 and 5. The multi-stage structure also adds versatility during model development, as it allows using different inputs for each output.
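The stage-by-stage chaining described above can be sketched in code. This is a minimal illustration, not the authors' implementation: the predict_* functions are hypothetical placeholders standing in for the trained per-parameter models, and the F/M calculation assumes the conventional definition (COD load divided by MLVSS inventory).

```python
# Minimal sketch of the multi-stage prediction chain (Fig. 2). The
# predict_* functions are hypothetical placeholders; each stage's output
# joins the input pool for the stages that follow.

def predict_mlss(pool):   # stage 2 (placeholder relation)
    return 700.0 * pool["OLR"]

def predict_mlvss(pool):  # stage 2 (placeholder relation)
    return 0.8 * pool["MLSS"]

def predict_svi30(pool):  # stage 3 (placeholder relation)
    return 60.0 + 10.0 * pool["F/M"]

def run_chain(stage1):
    pool = dict(stage1)                      # stage 1: influent + operation
    pool["MLSS"] = predict_mlss(pool)        # stage 2: biomass concentration
    pool["MLVSS"] = predict_mlvss(pool)
    # F/M is calculated, not modelled, then fed to stages 4 and 5
    pool["F/M"] = pool["COD_in"] * pool["Q"] / (pool["MLVSS"] * pool["V"])
    pool["SVI30"] = predict_svi30(pool)      # stages 3-5 follow the same pattern
    return pool

state = run_chain({"OLR": 11.0, "COD_in": 3352.0, "Q": 62.6, "V": 17.5})
```

The key design point is that the dictionary of predictions grows as the chain advances, so later stages can draw on any earlier output as a candidate input.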
Fig. 2. Multi-stage model structure (stage 1 contains parameters after multicollinearity reduction).

In this study, three algorithm alternatives were individually used for simulating the AGS process: ANN, SVR and ANFIS. The outputs of individual models were combined as inputs to ensemble algorithms using five different alternative methods: ANN, SVR, ANFIS, arithmetic mean (E-AVG), and weighted average (E-WAVG). In total, each output was predicted eight times using the individual and ensemble alternatives. The best performing algorithm out of the eight alternatives was chosen for the final model. The ensemble algorithms were denoted with the prefix “E-”. Fig. 3 shows the algorithm choice approach.
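The two non-trained combiners (E-AVG and E-WAVG) can be sketched as follows; the learned ensembles (E-ANN, E-SVR, E-ANFIS) would instead be trained on the stacked base-model predictions. The weighting scheme shown is an assumption for illustration (weights proportional to some accuracy score, normalized to sum to one).

```python
import numpy as np

def e_avg(preds):
    """E-AVG: arithmetic mean of the base ANN/SVR/ANFIS predictions."""
    return np.mean(preds, axis=0)

def e_wavg(preds, weights):
    """E-WAVG: weighted average; weights (assumed to reflect each base
    model's accuracy) are normalized to sum to one."""
    w = np.asarray(weights, float)
    return np.tensordot(w / w.sum(), np.asarray(preds), axes=1)

base = [np.array([1.0, 2.0]),   # e.g. ANN predictions
        np.array([2.0, 4.0]),   # e.g. SVR predictions
        np.array([3.0, 6.0])]   # e.g. ANFIS predictions
```

For example, `e_avg(base)` returns the element-wise mean of the three prediction vectors, while `e_wavg(base, [1, 1, 2])` doubles the influence of the third model.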
Fig. 3. Algorithm alternatives flowchart: each model output is predicted using eight different algorithms.

The dataset of 475 days was divided into 404 days for developing the models and 71 days for evaluation. The evaluation dataset was completely isolated and was only used after the models were trained and chosen. Fig. 4 shows the data divisions for the algorithms used in this study. The model development data (404 days) was divided according to the requirements of the algorithm being trained. The ANN and E-ANN models had a data division scheme of 70% for training, 15% for test and 15% for validation, which corresponded to 284, 60 and 60 days, respectively. The SVR and E-SVR models utilized the full 404 days for training. The ANFIS and E-ANFIS models used 85% of the data for training and 15% for validation, which corresponded to 344 and 60 days, respectively. The E-AVG and E-WAVG ensembles are not machine learning algorithms; thus, they did not require training and validation.
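The division above reduces to simple index bookkeeping. A sketch (the random seed is arbitrary and only keeps the example reproducible):

```python
import numpy as np

rng = np.random.default_rng(42)      # arbitrary seed for the sketch
days = rng.permutation(475)          # randomized 475-day dataset

dev, evaluation = days[:404], days[404:]               # 404 development / 71 evaluation
train, test, val = dev[:284], dev[284:344], dev[344:]  # ANN: 70/15/15 -> 284/60/60
```

The evaluation indices never appear in any development subset, which mirrors the complete isolation of the 71-day evaluation set described above.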
Fig. 4. Data division for model development and evaluation.

2.3. Data Pre-Processing

The dataset collected for this study consisted of 475 days. Datasets used for training machine learning algorithms need to undergo a cleaning process that mainly removes outliers, fills missing points, randomizes the dataset, and normalizes all the data features to the same scale. In this study, outliers were removed during data collection, and missing data points were filled using linear regression imputation (Lakshminarayan et al., 1999).
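Regression imputation can be sketched as fitting a line between the gappy feature and a correlated, fully observed one. This is a simplified single-predictor version for illustration; the study cites Lakshminarayan et al. (1999) for the general method.

```python
import numpy as np

def impute_linear(x, y):
    """Fill NaNs in y via a least-squares line y ~ a*x + b fitted on the
    observed (x, y) pairs; x must be fully observed."""
    y = np.asarray(y, float).copy()
    obs = ~np.isnan(y)
    a, b = np.polyfit(x[obs], y[obs], 1)   # fit on observed pairs only
    y[~obs] = a * x[~obs] + b              # predict the missing entries
    return y
```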
Randomization is done to remove the effect of phased operation and the use of multiple reactors when the dataset is split into training and evaluation datasets. The statistical properties of the training and evaluation datasets must be as close as possible to ensure proper evaluation of the models. Table 2 shows the maximum, minimum, mean and coefficient of variation for the training and evaluation datasets.

Table 2. Statistical properties of the training and evaluation datasets (Train. / Eval.).

Parameter                   | Max.          | Min.          | Mean          | Coef. of Var.
Influent COD (mg/L)         | 8758 / 7445   | 1287 / 1518   | 3352 / 3069   | 0.56 / 0.51
Influent NH4-N (mg/L)       | 234 / 201     | 53 / 68       | 129 / 135     | 0.32 / 0.28
Influent PO43− (mg/L)       | 124 / 83      | 3 / 6         | 48 / 47       | 0.38 / 0.33
Influent Flowrate (L/d)     | 74.81 / 74.81 | 11.03 / 11.03 | 62.63 / 60.95 | 0.28 / 0.31
Volume (L)                  | 19 / 19       | 4.5 / 4.5     | 17.49 / 17.16 | 0.25 / 0.28
Influent pH                 | 9.06 / 8.47   | 0.00 / 6.62   | 7.17 / 7.19   | 0.08 / 0.04
OLR (kg COD/m3)             | 33.08 / 26.87 | 3.48 / 4.47   | 11.06 / 9.70  | 0.66 / 0.57
HRT (h)                     | 9.67 / 9.67   | 5.92 / 5.92   | 7.69 / 7.79   | 0.10 / 0.09
Exchange Ratio (%)          | 0.56 / 0.56   | 0.35 / 0.35   | 0.50 / 0.50   | 0.09 / 0.09
Superficial Air Vel. (cm/s) | 3 / 3         | 1.56 / 1.56   | 2.51 / 2.53   | 0.18 / 0.18
Temperature (°C)            | 24.1 / 23.7   | 12.4 / 14.3   | 20.5 / 20.5   | 0.09 / 0.09
Settling time (min)         | 30 / 30       | 3 / 3         | 13.14 / 14.17 | 0.41 / 0.40
Aeration time (min)         | 221 / 221     | 163 / 163     | 187 / 185     | 0.13 / 0.13
MLSS (mg/L)                 | 25157 / 24411 | 779 / 2485    | 7966 / 7446   | 0.66 / 0.61
MLVSS (mg/L)                | 19303 / 18675 | 523 / 2015    | 6329 / 6023   | 0.61 / 0.57
SVI5 (mL/g)                 | 446 / 241     | 20 / 22       | 113 / 117     | 0.56 / 0.49
SVI30 (mL/g)                | 278 / 137     | 18 / 21       | 73 / 75       | 0.48 / 0.39
Granule Size (μm)           | 952 / 930     | 66 / 76       | 440 / 468     | 0.47 / 0.44
F/M Ratio                   | 12.14 / 4.21  | 0.54 / 0.92   | 2.07 / 1.89   | 0.51 / 0.34
Effluent COD (mg/L)         | 4227 / 2940   | 0 / 9         | 210 / 136     | 3.03 / 3.11
Effluent NH4-N (mg/L)       | 116 / 115     | 0 / 0         | 24 / 31       | 1.19 / 1.13
Effluent PO43− (mg/L)       | 51 / 27       | 0 / 0         | 8 / 8         | 1.14 / 1.09
Following randomization, each feature in the training dataset is normalized to the scale of (0 - 1) by dividing the feature by its maximum value. The evaluation dataset was normalized using the training maximum to keep the evaluation dataset unseen during the model development; therefore, the normalized values might slightly exceed one if the maximum of the evaluation dataset was higher than that of the training dataset.
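A sketch of this max-normalization, with the evaluation set scaled by the training maxima so that evaluation values can exceed one:

```python
import numpy as np

train = np.array([[200.0, 10.0],
                  [400.0, 40.0]])
eval_ = np.array([[500.0, 20.0]])   # first feature exceeds the training max

train_max = train.max(axis=0)       # per-feature training maxima
train_norm = train / train_max      # scaled to (0 - 1)
eval_norm = eval_ / train_max       # may slightly exceed 1
```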

2.4. Feature Selection

The choice of model inputs has a significant effect on model performance. The dataset collected contained input parameters that are correlated to varying degrees and that contribute differently to each of the outputs. Each output had a pool of candidate parameters from which its inputs were chosen using feature-selection methods: multicollinearity reduction and the RReliefF algorithm.
Linearly correlated input parameters reduce the orthogonality of the model, a condition known as multicollinearity (Alin, 2010). Multicollinearity is a source of overfitting during training and results in models with low reliability (Read & Belsley, 1994). The level of multicollinearity in a set of parameters can be measured using the variance inflation factor (VIF), as shown in Eq. (1), where Ri2 is the coefficient of determination obtained by regressing parameter i on the remaining parameters:

VIF = 1 / (1 − Ri2)    (1)

It is generally accepted that VIF values of 5 and below are acceptable for regression problems. If the maximum VIF is larger than 5, the parameter with the highest VIF is removed, and the test is repeated until the maximum VIF is 5 or less.
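The iterative VIF screening can be sketched with ordinary least squares (a sketch only; the threshold is fixed at 5 as in the text):

```python
import numpy as np

def vif(X):
    """VIF_i = 1/(1 - R_i^2), with R_i^2 from regressing column i on the
    remaining columns (intercept included)."""
    n, p = X.shape
    out = np.empty(p)
    for i in range(p):
        y = X[:, i]
        A = np.column_stack([np.delete(X, i, axis=1), np.ones(n)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        r2 = 1.0 - (y - A @ coef).var() / y.var()
        out[i] = 1.0 / max(1.0 - r2, 1e-12)   # guard against R^2 -> 1
    return out

def reduce_multicollinearity(X, names, threshold=5.0):
    """Iteratively drop the highest-VIF feature until max VIF <= threshold."""
    names = list(names)
    while True:
        v = vif(X)
        worst = int(np.argmax(v))
        if v[worst] <= threshold:
            return X, names
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
```

With a feature that is a near-exact sum of two others, the loop removes one member of the collinear group and the remaining VIFs drop toward one.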
After multicollinearity is reduced and redundant parameters are removed, the remaining parameters were sorted according to a weight calculated for each parameter using the RReliefF algorithm. The RReliefF algorithm is one of the filter methods for feature selection that assigns weights to input parameters based on their effect on the output using the k-nearest neighbours approach for input-output instances, where higher weights correspond to more important inputs (Urbanowicz et al., 2018). Weights are calculated based on three probabilities at the nearest instances: a different input value at nearest outputs, a different output value at nearest inputs, and a different output value when there is a difference in the input value. Detailed mathematical formulation and the algorithm structure can be found in (Robnik-Šikonja & Kononenko, 1997). Inputs that are more consistent with the nearest neighbours in explaining the variation in outputs receive higher weights. The RReliefF algorithm, being one of the filter methods, carries the advantage that it is not affected by the induction algorithms applied to the raw data (data pre-processing) (Urbanowicz et al., 2018). This allows the chosen inputs to be used with different machine learning algorithms with confidence.
The number of nearest neighbours (k) used in this study was determined by calculating the input weights as k was increased from 1 to 500. Weights were taken at k = 200, where the results had stabilized, as shown in Fig. 5.
Fig. 5. Stability of input scores (y-axes) with varying k values (x-axes).
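A simplified numpy sketch of the RReliefF weighting, following the three-probability formulation of Robnik-Šikonja and Kononenko; the full algorithm additionally weights neighbours by distance rank, which is omitted here for brevity.

```python
import numpy as np

def rrelieff(X, y, k=200):
    """Simplified RReliefF: weight each feature by how well its differences
    track output differences among the k nearest neighbours."""
    m, p = X.shape
    k = min(k, m - 1)
    xr = np.ptp(X, axis=0)            # feature ranges, for [0, 1] diffs
    xr[xr == 0] = 1.0
    yr = np.ptp(y) or 1.0             # output range
    NdC, NdA, NdCdA = 0.0, np.zeros(p), np.zeros(p)
    for i in range(m):
        d = np.linalg.norm(X - X[i], axis=1)
        nn = np.argsort(d)[1:k + 1]              # k nearest, excluding self
        dy = np.abs(y[nn] - y[i]) / yr           # output diffs
        dx = np.abs(X[nn] - X[i]) / xr           # feature diffs
        NdC += dy.sum()                          # P(different output)
        NdA += dx.sum(axis=0)                    # P(different feature)
        NdCdA += (dy[:, None] * dx).sum(axis=0)  # P(both differ)
    return NdCdA / NdC - (NdA - NdCdA) / (m * k - NdC)
```

A feature that drives the output receives a larger weight than an irrelevant one, which is the property used for the input ranking above.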

2.5. Artificial neural networks

Artificial neural networks mimic the way neurons work in the human brain to perform complex operations. The ANN type used in this study, feed-forward neural networks, utilizes an error minimization algorithm to tune the network weights (Fernando & Shamseldin, 2009; Sammut & Webb, 2016). Well-designed and well-trained neural networks can achieve outstanding prediction accuracies; however, this comes at the expense of long training times, as the error minimization functions are generally slow to converge. Additionally, large training datasets are needed to reach high prediction accuracies without overfitting. Neural networks can also overfit if the inputs are not well selected or the layer architecture is poorly designed (Lawrence & Giles, 2000; Ye et al., 2020).
In this study, Bayesian Regularization was used as the objective function for error minimization using a linear formulation of squared errors and network weights (Foresee & Hagan, 1997). The network architectures were selected by training all ANN combinations of (1-3) hidden layers and (1-10) hidden nodes in each layer. This approach resulted in the training of 8880 neural networks for the 8 outputs. The best performing network selected for each of the outputs was the one with the most accurate prediction (lowest error) and with similar training and test performance. These conditions ensured that the chosen networks did not overfit or underfit.
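The architecture search can be sketched by enumerating the candidate layer layouts; 10 + 10² + 10³ = 1110 candidates per output gives the 8880 networks reported for the 8 outputs. Note the fit below is an approximation, not the authors' setup: scikit-learn's MLPRegressor has no Bayesian Regularization, so plain L2 regularization (alpha) stands in for it here.

```python
from itertools import product

import numpy as np
from sklearn.neural_network import MLPRegressor

# All 1-3 hidden-layer layouts with 1-10 nodes per layer:
# 10 + 10**2 + 10**3 = 1110 candidates per output (8880 for 8 outputs).
layouts = [arch for depth in (1, 2, 3)
           for arch in product(range(1, 11), repeat=depth)]

# Fit one candidate on synthetic data; alpha is an L2 stand-in for
# Bayesian Regularization.
rng = np.random.default_rng(0)
X = rng.random((200, 4))
y = X @ np.array([1.0, 2.0, 0.5, -1.0])
net = MLPRegressor(hidden_layer_sizes=(8, 8), alpha=1e-3,
                   max_iter=1000, random_state=0).fit(X, y)
```

In a full search, each layout would be trained and the one with the lowest error and similar training/test performance kept, as described above.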

2.6. Adaptive neuro-fuzzy inference systems

Adaptive neuro-fuzzy inference systems (ANFIS) are used to simulate complex processes with measurement uncertainty. ANFIS is a universal approximator that utilizes logical rules to reach an output through human-like reasoning (Jang, 1993). In ANFIS, membership functions are used to map numerical inputs to fuzzy sets. A learning algorithm, similar to the back-propagation algorithm, is used to minimize the errors by optimizing the ANFIS parameters.
Each of the outputs in this study was predicted using a separate ANFIS model. The clustering method used was Grid Partitioning, as it allows for choosing the desired membership functions. Grid partitioning, however, assigns a fuzzy rule for each input-membership function combination, which exponentially increases the number of rules and the computational requirement for training the models.
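The rule growth under grid partitioning is easy to quantify: the rule count equals the number of membership functions per input raised to the number of inputs.

```python
def grid_partition_rules(n_inputs, n_mfs=2):
    """Fuzzy rule count under grid partitioning: one rule per
    input-membership-function combination."""
    return n_mfs ** n_inputs
```

With the two membership functions used here, a six-input model carries 2⁶ = 64 rules, and each additional input doubles the rule base, which is the computational ceiling discussed later in Section 3.2.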

2.7. Support vector regression

Support vector regression (SVR) is an algorithm, based on statistical learning theory, that uses the structural risk minimization (SRM) method to minimize the modelling error while maintaining low model complexity (Smola & Schölkopf, 2004; Vapnik, 2000). The nonlinear SVR model formula is shown in Eq. (2) for an input vector x, output y, and N samples:

y = Σ(i = 1 to N) wi φi(x) + b    (2)

where wi is a weight, b is a bias, and φi is a nonlinear kernel function that maps the training data to a higher-dimensional feature space, making it possible to linearize the model (Smola & Schölkopf, 2004). A Gaussian kernel was used as it is easier to tune than other functions, and it can also handle complex error boundaries (Goyal & Ojha, 2011).
Three hyperparameters need tuning in SVR: the kernel scale (γ), the box constraint (C) and the error band (ε). The kernel scale determines how strongly the kernel function responds to variation in the input vectors; it is inversely proportional to the sensitivity of the kernel function to input variation. The SVR model can underfit if the kernel function is not sensitive enough to detect changes in the inputs, and can overfit if it is so sensitive that it reacts to the smallest variation in the inputs. The box constraint is a regularization factor needed by the SRM to control the penalty on large prediction residuals. It represents the trade-off between the training error and model complexity, where small C values result in poor predictions and large values cause overfitting. Finally, the error band represents the space around the actual measured values within which predictions can be made. Tighter error bands result in more accurate predictions at the expense of model complexity. More details on the mathematical formulation of SVR can be found in Awad and Khanna (2015), Cristianini and Shawe-Taylor (2000), and Smola and Schölkopf (2004).
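A sketch of a Gaussian-kernel SVR with the three hyperparameters tuned by 5-fold cross-validation. A plain grid search is shown here; the study used Bayesian optimization, for which scikit-optimize's BayesSearchCV would be the closer analogue. The data and grid values are illustrative only.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.random((150, 3))
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]   # synthetic target

param_grid = {
    "C": [0.1, 1, 10, 100],       # box constraint (regularization)
    "gamma": [0.01, 0.1, 1],      # kernel scale
    "epsilon": [0.01, 0.1, 0.5],  # error band
}
search = GridSearchCV(SVR(kernel="rbf"), param_grid, cv=5).fit(X, y)
model = search.best_estimator_
```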

3. Results and Discussion

3.1. Aerobic Granular Sludge Performance

The reactors simulated in this study were operated for data collection for a collective total of 475 days. Periods of stable operation were observed, along with some disruptions due to biomass washout. The average (± standard deviation) influent COD, NH4+-N, and PO43− concentrations in the reactors were 3309±1838, 130±41, and 48±18 mg/L, respectively. The systems exhibited stable organics and nutrients removal throughout the duration of the experiment, with average COD, NH4+-N, and PO43− removal efficiencies of 96±8, 81±18, and 84±18%, respectively. Aerobic granular sludge has been proven to have consistently good removal of organics and nutrients (de Kreuk et al., 2005; Iorhemen et al., 2020; Nancharaiah & Reddy, 2018). The stratification of aerobic, anoxic, and anaerobic microbial communities has also been observed, resulting in better nitrogen and phosphorus removal (Wang et al., 2008; Yilmaz et al., 2008).
The average MLSS and MLVSS concentrations were 7888±5158 and 6284±3810 mg/L, respectively. The average MLVSS/MLSS ratio was 80%, which is a typical value in aerobic treatment reactors. The average SVI5 and SVI30 were 114±63 and 73±34 mL/g, respectively, demonstrating the fast settling of the granules. Granules are considered to have good settling properties when SVI30 is below 100 mL/g (Hamza et al., 2018; Liu et al., 2007). The average ratio of granulation in AGS reactors, calculated by the SVI30/SVI5 (Hamza et al., 2018), was 64%, and the average granule diameter was 445±206 µm, which showed that the biomass was mostly granular. The presence of some floccular biomass and fluctuation in settling properties were expected due to the variation in F/M ratio (2±1) (Hamza et al., 2018).

3.2. Feature Selection

The ANN, SVR and ANFIS models that were used to predict outputs 1, 2 and 3 (Fig. 3) were developed using the operation and performance dataset, which was collected from the laboratory (Table 2). Feature selection was necessary to overcome the multicollinearity between different parameters and remove the inputs that adversely affect the model performance. The feature selection methods used in this study were able to reduce the level of multicollinearity and identify the most effective parameters for each output, which resulted in the elimination of some parameters. Calculating the VIF for the inputs of each stage resulted in the reduction of inputs by 4, 5, 8, and 7 for stages 2, 3, 4, and 5, respectively. The results of the multicollinearity reduction are shown in Table 3. The level of multicollinearity is accepted once the maximum VIF is below 5. The data collection plan was quite thorough in the selection of parameters to measure. This resulted in a high level of multicollinearity in the initial dataset due to the close relationships between parameters, such as the OLR-COD, or the flowrate-reactor volume-HRT (Price, 1998).

Table 3. Multicollinearity reduction.

Model stage | Max. VIF before reduction | Max. VIF after reduction | Initial number of inputs | Final number of inputs | Number of inputs removed
Stage 2     | 990.55                    | 4.63                     | 13                       | 9                      | 4
Stage 3     | 1078.9                    | 2.49                     | 15                       | 10                     | 5
Stage 4     | 1132.1                    | 3.11                     | 18                       | 10                     | 8
Stage 5     | 1141.9                    | 4.28                     | 19                       | 12                     | 7
Further dimensionality reduction was applied using the RReliefF algorithm to rank the inputs for each output. Unlike the VIF test, which depends only on the inputs, the RReliefF weights depend on the relationship between the inputs and outputs, which resulted in different input weights for each of the outputs even within the same stage. Table 4 shows the final inputs used for the ANN, SVR, and ANFIS models. In Table 4, ANN is denoted by "N", SVR by "S", and ANFIS by "A". Table 4 shows, for each output, the models in which each of the predictors was used; for example, the influent NH4-N, influent PO43−, OLR and HRT were used as inputs for the MLVSS ANFIS model. The ranking produced by the RReliefF algorithm was used for all three modelling algorithms. Inputs were added from the ranked list while noting the model performance, until the model stopped improving or was unable to complete training due to computational limitations (in the ANFIS models).

Table 4. Algorithms in which each of the inputs was used for each output.

| Parameter | Models in which the input was used (N = ANN, S = SVR, A = ANFIS) |
|---|---|
| Influent NH4-N (mg/L) | N-S-A, N-S-A, N-S-A, N-S-A, N-S-A, N-S-A, N-S-A, N-S-A |
| Influent PO43− (mg/L) | N-S-A, N-S-A, N-S, N-S, N-S, N-S-A, N-S-A, N-S-A |
| Volume (L) | N-S-A, N-S, N-S, N-S-A, N-S-A |
| Influent pH | N-S-A, N-S, N-S, N-S, N-S, N-S, N-S |
| OLR (kg COD/m3) | N-S-A, N-S-A |
| HRT (h) | N-S-A, N-S-A, N-S-A, N-S-A, N-S-A, N-S-A, N-S-A, N-S |
| Superficial Air Vel. (cm/s) | A, N-S-A, N-S-A, N-S-A, N-S-A, N-S-A |
| Temperature (°C) | N-S, N-S, N-S, N-S-A, N-S-A, N-S-A, N-S |
| Settling time (min) | S, N-S-A, N-S-A, N-S, N-S-A, N-S |
| MLVSS (mg/L) | N-S-A, N-S-A, N-S, N-S-A, N-S-A |
| SVI5 (mL/g) | N-S-A |
| SVI30 (mL/g) | N-S-A |
| Granule Size (μm) | N-S-A, N, N-S-A |
| F/M Ratio | N-S, N, N-S |
The number of inputs to the ANFIS models was restricted because it was found that adding more inputs significantly increased the required training time and CPU usage. ANFIS was therefore limited to a maximum of six inputs, selected according to the RReliefF ranking, except for the MLVSS, where the best performance was achieved with four inputs, and the effluent ammonia, where seven inputs were needed to achieve acceptable performance.
Feature selection was found to be the most important step in the development, as it improved the performance of the ANN, SVR, and ANFIS models, raising the overall evaluation average R2 from 93%, 85%, and 83% to 94.2%, 92.4%, and 85.6%, respectively, using the same data and model structure.
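The ranking step above can be sketched as follows. This is a simplified RReliefF for a continuous target, using uniform weighting over the k nearest neighbours rather than the distance-rank weighting of the full algorithm of Robnik-Šikonja and Kononenko; it is intended only to illustrate how per-feature relevance weights are obtained, and the parameter `k` is illustrative.

```python
import numpy as np

def rrelieff(X, y, k=10):
    """Simplified RReliefF: relevance weight per feature for a continuous target.

    Accumulates, over each instance's k nearest neighbours, the probability-style
    counts N_dC, N_dA and N_dC&dA, then combines them into the RReliefF weight.
    """
    n, p = X.shape
    # normalise so diff() is comparable across features
    Xn = (X - X.min(axis=0)) / (np.ptp(X, axis=0) + 1e-12)
    yn = (y - y.min()) / (np.ptp(y) + 1e-12)
    n_dc = 0.0
    n_da = np.zeros(p)
    n_dcda = np.zeros(p)
    for i in range(n):
        d = np.abs(Xn - Xn[i]).sum(axis=1)
        neigh = np.argsort(d)[1:k + 1]  # skip the instance itself
        for j in neigh:
            diff_y = abs(yn[i] - yn[j])
            diff_a = np.abs(Xn[i] - Xn[j])
            n_dc += diff_y / k
            n_da += diff_a / k
            n_dcda += diff_y * diff_a / k
    return n_dcda / n_dc - (n_da - n_dcda) / (n - n_dc)
```

Features that change together with the target receive higher weights, so sorting the inputs by this weight reproduces the kind of ranked list used for input selection above.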

3.3. Model Development

The ANN models were trained using the selected inputs in Table 4. The network architectures for each ANN model are shown in Table 5; all ANN models had three hidden layers with different numbers of hidden nodes. Table 5 also shows the tuned SVR hyperparameters. There is a large variation in the values of C, γ and ε between the models. The available literature explains the individual effect of each hyperparameter on the performance of trained SVR models and the risk of overfitting; however, the hyperparameters affect the performance in combination. The values of C, γ and ε were therefore optimized within the SVR training process using Bayesian optimization to minimize the 5-fold cross-validation error, yielding the best possible prediction accuracy without significant overfitting.

Finally, the ANFIS models were all assigned two input membership functions. It was not practical to increase the number of membership functions because of computational limitations that caused training to fail. Different membership function types were tested, and the best-performing ones were chosen for the final model. Triangular membership functions were used for the MLSS, MLVSS, SVI30 and effluent NH4-N (Eq. 3):

$$y = \begin{cases} 0, & x \le a \\ \dfrac{x-a}{b-a}, & a \le x \le b \\ \dfrac{c-x}{c-b}, & b \le x \le c \\ 0, & x \ge c \end{cases} \tag{3}$$

where a, b and c are constants determined through the ANFIS training. The remaining ANFIS models used Gaussian combination membership functions, whose sides follow the Gaussian form (Eq. 4):

$$y = e^{-\frac{(x-\mu)^2}{2\sigma^2}} \tag{4}$$

where σ is the standard deviation and µ is the mean of the training data.

The average training performance of the ANN models across all outputs was 96%, 0.03, and 3.3% for R2, nRMSE and sMAPE, respectively; for the SVR models it was 95.8%, 0.026, and 1.9%; and for the ANFIS models it was 92.5%, 0.047, and 4.6%.
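The two membership-function forms mentioned above can be evaluated directly. This is a minimal NumPy sketch with illustrative function names; the Gaussian form follows the single-Gaussian expression of Eq. 4, whereas the "Gaussian combination" type used in ANFIS toolboxes joins two such curves with different parameters on each side.

```python
import numpy as np

def trimf(x, a, b, c):
    """Triangular membership function (Eq. 3): rises on [a, b], falls on [b, c]."""
    x = np.asarray(x, dtype=float)
    left = (x - a) / (b - a) if b != a else np.ones_like(x)
    right = (c - x) / (c - b) if c != b else np.ones_like(x)
    return np.clip(np.minimum(left, right), 0.0, 1.0)

def gaussmf(x, mu, sigma):
    """Gaussian membership function (Eq. 4), centred at mu with spread sigma."""
    x = np.asarray(x, dtype=float)
    return np.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))
```

Both functions return membership degrees in [0, 1], peaking at b (triangular) or µ (Gaussian).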
The overall performance of the SVR was better than that of the ANN in terms of nRMSE and sMAPE. The ANFIS model was considerably less accurate than the ANN and SVR, which can be attributed to the computational limits on the number of inputs and membership functions that could be used.
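The hyperparameter search can be reproduced in outline. The study used Bayesian optimization with 5-fold cross-validation; the sketch below substitutes scikit-learn's randomized search over log-spaced grids (Bayesian optimization proper would need an additional package such as scikit-optimize), and the toy data stands in for the selected reactor inputs.

```python
import numpy as np
from sklearn.model_selection import RandomizedSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Toy stand-in data; in the study these would be the RReliefF-selected inputs.
rng = np.random.default_rng(1)
X = rng.normal(size=(120, 4))
y = X[:, 0] - 0.5 * X[:, 1] + 0.1 * rng.normal(size=120)

model = make_pipeline(StandardScaler(), SVR(kernel="rbf"))
search = RandomizedSearchCV(
    model,
    # wide, log-spaced candidate grids for C, gamma and epsilon, mirroring
    # the large spread of tuned values reported in Table 5
    param_distributions={
        "svr__C": np.logspace(-2, 3, 50),
        "svr__gamma": np.logspace(-3, 2, 50),
        "svr__epsilon": np.logspace(-6, -1, 50),
    },
    n_iter=40,
    cv=5,  # 5-fold cross-validation, as in the study
    scoring="neg_root_mean_squared_error",
    random_state=0,
)
search.fit(X, y)
best = search.best_params_
```

`best` then holds one tuned (C, γ, ε) triple per output model, analogous to the rows of Table 5.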

Table 5. ANN architectures, SVR hyperparameters and ANFIS membership functions.

| Output | ANN Architecture | SVR C | SVR γ | SVR ε | ANFIS Membership Function |
|---|---|---|---|---|---|
| MLSS (mg/L) | 6-4-3 | 1.763 | 1.8302 | 1.33E-04 | Triangular |
| MLVSS (mg/L) | 5-4-9 | 244.46 | 8.6594 | 0.0004098 | Triangular |
| SVI5 (mL/g) | 6-3-1 | 49.959 | 4.1236 | 0.0001425 | Gaussian Combination |
| SVI30 (mL/g) | 2-7-4 | 0.59194 | 5.2062 | 0.013026 | Triangular |
| Granule Size (μm) | 2-9-8 | 121.55 | 1.4949 | 0.0003605 | Gaussian Combination |
| Effluent COD (mg/L) | 7-1-1 | 7.9007 | 2.7712 | 7.226E-06 | Gaussian Combination |
| Effluent NH4-N (mg/L) | 6-6-1 | 1.2535 | 1.0669 | 0.0007429 | Triangular |
| Effluent PO43− (mg/L) | 5-1-1 | 17.613 | 2.87 | 0.0002448 | Gaussian Combination |
The outputs of the ANN, SVR and ANFIS models were used as inputs to five ensemble alternatives: E-ANN, E-SVR, E-ANFIS, E-AVG and E-WAVG. The E-AVG was the arithmetic mean of the three inputs, which resulted in an overall average training performance of 97.3%, 0.027, and 2.8% for R2, nRMSE and sMAPE, respectively. The E-WAVG was a weighted average of the three inputs, using the training R2 of the ANN, SVR and ANFIS for each output as relative weights. This approach had the advantage of favouring the more accurate inputs, which provided an improvement over the E-AVG when there was a large difference in accuracy between the ANN, SVR and ANFIS. The overall average training performance of the E-WAVG was 97.8%, 0.024, and 2.4% for R2, nRMSE and sMAPE, respectively.

The E-ANN, E-SVR, and E-ANFIS are machine learning-based ensembles in which the ANN, SVR and ANFIS outputs were used as inputs. Table 6 shows the architectures, hyperparameters and membership functions of the E-ANN, E-SVR, and E-ANFIS, respectively. These models were much simpler to develop, as they were intended to combine three estimates of the true output, and their number of inputs was much smaller than that of the original three models. Keeping the models simple was essential to minimize overfitting, considering the small number of inputs (three) and the fact that the inputs are variants of the same parameter that are already close to the desired solution. All of the ensemble neural networks had one hidden layer with a single neuron, providing a training performance of 98.7%, 0.016, and 1.8% for R2, nRMSE and sMAPE, respectively. The E-SVR hyperparameters provide insight into the behaviour of the algorithm: the large C values indicate larger penalties on errors for all outputs, while the γ values were much higher than those of the SVR model, indicating that the kernel function is less sensitive to variation in the inputs.
The overall average training performance of the E-SVR was 98.7%, 0.016, and 1.6% for R2, nRMSE and sMAPE, respectively. The E-ANFIS models had two Gaussian combination membership functions, with an overall training performance of 94.3%, 0.031, and 1.9% for R2, nRMSE and sMAPE, respectively.
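The E-AVG and E-WAVG combinations described above are straightforward to express. In this minimal sketch (array shapes and names are illustrative), the columns of `preds` hold the ANN, SVR and ANFIS predictions for one output.

```python
import numpy as np

def e_avg(preds):
    """E-AVG: arithmetic mean of the base-model predictions (columns of preds)."""
    return np.mean(preds, axis=1)

def e_wavg(preds, train_r2):
    """E-WAVG: average weighted by each base model's training R^2."""
    w = np.asarray(train_r2, dtype=float)
    w = w / w.sum()          # relative weights
    return preds @ w
```

When one base model is clearly more accurate, its larger R2 weight pulls the E-WAVG prediction towards it, which is the advantage over the unweighted E-AVG noted above.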

Table 6. E-ANN architectures, E-SVR hyperparameters and E-ANFIS membership functions.

| Output | E-ANN Architecture | E-SVR C | E-SVR γ | E-SVR ε | E-ANFIS Membership Function |
|---|---|---|---|---|---|
| MLSS (mg/L) | 1 | 274.69 | 3.1267 | 0.0001948 | Gaussian Combination |
| MLVSS (mg/L) | 1 | 969.84 | 16.172 | 0.0006375 | Gaussian Combination |
| SVI5 (mL/g) | 1 | 932.21 | 222.88 | 0.0005993 | Gaussian Combination |
| SVI30 (mL/g) | 1 | 286.84 | 38.591 | 0.0071388 | Gaussian Combination |
| Granule Size (μm) | 1 | 892.45 | 377.84 | 0.0002951 | Gaussian Combination |
| Effluent COD (mg/L) | 1 | 998.21 | 15.86 | 0.0001844 | Gaussian Combination |
| Effluent NH4-N (mg/L) | 1 | 886.38 | 329.18 | 0.0002088 | Gaussian Combination |
| Effluent PO43− (mg/L) | 1 | 942.59 | 231.49 | 0.0012618 | Gaussian Combination |

3.4. Model Performance

The algorithms developed in this study (ANN, SVR, ANFIS, E-ANN, E-SVR, E-ANFIS, E-AVG, E-WAVG) were all validated using the evaluation dataset that was isolated before model development. Table 7 shows the evaluation performance averaged over all outputs for each algorithm. The E-ANN provided the best performance in terms of R2, nRMSE, and sMAPE. The E-SVR and E-WAVG performed comparably to the E-ANN, but the E-ANFIS was not able to reach the same level of performance.

Table 7. Overall average evaluation performance.

| Metric | ANN | SVR | ANFIS | E-ANN | E-SVR | E-ANFIS | E-AVG | E-WAVG |
|---|---|---|---|---|---|---|---|---|
| R2 | 94.2% | 92.4% | 85.6% | 95.2% | 94.5% | 80.3% | 94.6% | 95% |
| nRMSE | 0.037 | 0.043 | 0.062 | 0.034 | 0.036 | 0.081 | 0.037 | 0.035 |
| sMAPE | 4.2% | 4.6% | 7.7% | 3.8% | 4% | 6.4% | 4.5% | 4.2% |
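The evaluation metrics used throughout can be computed as below. The paper does not state which normalization it uses for nRMSE, so normalization by the observed range is an assumption in this sketch; the sMAPE follows the standard symmetric definition.

```python
import numpy as np

def nrmse(y_true, y_pred):
    """RMSE normalised by the observed range (one common convention)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))
    return rmse / (np.max(y_true) - np.min(y_true))

def smape(y_true, y_pred):
    """Symmetric mean absolute percentage error, in percent."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return 100.0 * np.mean(2.0 * np.abs(y_pred - y_true)
                           / (np.abs(y_true) + np.abs(y_pred)))
```

Both metrics are scale-free, which is what allows the averages in Table 7 to be taken across outputs with very different units.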
Although the E-ANN outperformed the other ensemble algorithms overall, the E-WAVG provided somewhat better prediction accuracy for the granule size, with an R2 of 89% as opposed to 85% for the E-ANN. The ensemble models did not improve on the ANN for the prediction of the effluent COD, where the ANN provided an R2 of 99.65% as opposed to 99.3% for the E-ANN. The best-performing model for each output was selected for the final model, as shown in Table 8. Fig. 6 shows the diagonal plots of the final selected models.

Table 8. Final model performance using the best performing algorithms.

| Output | Chosen Algorithm | Training R2 (%) | Evaluation R2 (%) | Training nRMSE | Evaluation nRMSE | Training sMAPE (%) | Evaluation sMAPE (%) |
|---|---|---|---|---|---|---|---|
| MLSS (mg/L) | E-ANN | 97.80% | 96.11% | 0.031 | 0.036 | 3.06% | 4.05% |
| MLVSS (mg/L) | E-ANN | 97.58% | 94.30% | 0.031 | 0.043 | 2.91% | 4.56% |
| SVI5 (mL/g) | E-ANN | 98.53% | 95.89% | 0.017 | 0.028 | 2.29% | 3.16% |
| SVI30 (mL/g) | E-ANN | 95.81% | 92.19% | 0.025 | 0.030 | 2.93% | 3.49% |
| Granule Size (μm) | E-WAVG | 97.06% | 88.75% | 0.038 | 0.073 | 2.54% | 3.78% |
| Effluent COD (mg/L) | ANN | 99.65% | 99.65% | 0.009 | 0.008 | 3.06% | 5.63% |
| Effluent NH4-N (mg/L) | E-ANN | 99.89% | 99.53% | 0.008 | 0.021 | 0.92% | 1.98% |
| Effluent PO43− (mg/L) | E-ANN | 99.72% | 98.96% | 0.010 | 0.018 | 1.85% | 2.88% |

Fig. 6. Diagonal plots of the final model predictions vs. target measured values using the evaluation dataset.

After using the best-performing ensemble algorithms, the final model improved the overall prediction accuracy over the ANN model, the best-performing single algorithm, by raising the overall average R2 from 94.2% to 95.2%, reducing the overall average nRMSE from 0.037 to 0.032, and reducing the overall average sMAPE from 4.2% to 3.7%. The most significant improvement was observed in the granule size, where the R2 increased from 81% to 88.2%.
It was found that even though the E-SVR was close to the E-ANN in terms of prediction accuracy on the evaluation data, it suffered from a slightly higher level of overfitting for some outputs, where the difference between the training and evaluation predictions was larger than for the E-ANN. The E-ANFIS did not provide predictions as accurate as the other ensembles, as there was not enough distinction in the input parameters for the ANFIS to develop a reliable rule base.
Table 9 compares the model developed in this study to other machine learning models in the literature, using the prediction R2 on the validation datasets as the performance indicator. The comparison shows that this study achieves prediction accuracies similar to those of AGS machine learning models trained on much larger datasets (Zaghloul et al., 2020), and higher than those of AGS models developed with small datasets (Gong et al., 2018; Mahmod & Wahab, 2017). Other CAS models with small datasets produced predictions consistent with AGS models (El-Din et al., 2004; Manu & Thalla, 2017). Small test datasets can yield less reliable models that do not provide consistent results, as demonstrated by Mahmod and Wahab (2017), where the training and testing R2 were 78% and 91.17%, respectively. SVR and ANNs were found to perform at the same level of accuracy in other CAS and AGS models, which is consistent with the models developed in this study (Gong et al., 2018; Guo et al., 2015; Mahmod & Wahab, 2017; Seshan et al., 2014).

Table 9. Comparison between the dataset size and prediction R2 (%) of this study and other machine learning models for AGS and CAS.

| Study | Algorithm | Model | Dataset Size (days) | MLSS | MLVSS | SVI5 | SVI30 | Granule Size | COD | NH4-N | TKN | TN | PO43− |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| This study | Ensemble | AGS | 475 | 96.1 | 94.3 | 95.9 | 92.2 | 88.75 | 99.7 | 99.5 | - | - | 99.0 |
| (Zaghloul et al., 2020) | ANFIS | AGS | 2920 | 87.5 | 86.6 | 96.3 | 95.6 | 81.5 | 98.5 | 99.6 | - | - | 86.7 |
| | SVR | AGS | 2920 | 99.9 | 99.9 | 99.9 | 99.8 | 99.8 | 99.9 | 99.9 | - | - | 99.7 |
| (Zaghloul et al., 2018) | ANN | AGS | 2886 | 99.5 | 99.6 | 99.6 | 99.0 | 99.2 | 100.0 | 99.9 | - | - | 99.9 |
| (Gong et al., 2018) | ANN | AGS | 205 (136) | - | - | - | - | - | 90.0 | - | - | 81.0 | - |
| (Mahmod & Wahab, 2017) | ANN | AGS | 21 | - | - | - | - | - | 91.2 | - | - | - | - |
| (Manu & Thalla, 2017) | ANFIS | CAS | 88 | - | - | - | - | - | - | - | 72.0 | - | - |
| | SVR | CAS | 88 | - | - | - | - | - | - | - | 82.5 | - | - |
| (Guo et al., 2015) | ANN | CAS | 305 | - | - | - | - | - | - | - | - | 47.0 | - |
| | SVR | CAS | 305 | - | - | - | - | - | - | - | - | 46.0 | - |

The COD dataset was 205 days; the TN dataset was 136 days.

4. Multi-Stage Model Structure

AGS machine learning models in the literature are all single-stage models, in which a group of inputs is used to predict the final effluent quality parameters without considering the process sequence (Gong et al., 2018; Mahmod & Wahab, 2017). Two-stage models have been designed to improve on this structure, with success (Zaghloul et al., 2020). The multi-stage model structure makes the model developed in this study more versatile than other machine learning models, as it captures the effect of the influent characteristics on the biomass properties and their consequent effect on the effluent quality. The multi-stage structure also makes it possible to identify the source of predicted effluent quality issues, because the biomass properties predicted at the same instance can be analyzed alongside the effluent predictions. This mitigates a key disadvantage of black-box models, which can be difficult to use for process interpretation (Newhart et al., 2019).
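A multi-stage chain of this kind can be sketched as follows, using a generic regressor in place of the paper's ensembles; the stage grouping and names are illustrative. The key idea is that each stage's prediction is appended to the feature matrix before the next stage is fitted, so later outputs see the predicted biomass properties as inputs.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def fit_multistage(X, stage_targets):
    """Fit a chain of stage models in process order.

    stage_targets: list of (name, y) pairs, e.g. biomass properties first,
    then settleability, then granule size, then effluent quality.
    """
    models, Z = [], X.copy()
    for name, y in stage_targets:
        m = RandomForestRegressor(n_estimators=50, random_state=0).fit(Z, y)
        models.append((name, m))
        Z = np.column_stack([Z, m.predict(Z)])  # feed prediction forward
    return models

def predict_multistage(models, X):
    """Run the chain on new data, returning a dict of per-stage predictions."""
    out, Z = {}, X.copy()
    for name, m in models:
        p = m.predict(Z)
        out[name] = p
        Z = np.column_stack([Z, p])
    return out
```

Because every intermediate prediction is retained, an anomalous effluent prediction can be traced back through the chain, which is the interpretability advantage described above.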

5. Failure prediction

The model developed in this study can accurately predict the performance of AGS reactors under varying operational and influent conditions. It provides a tool for forecasting reactor behaviour during operation, which can guide operators in experimental design. Fig. 7 shows the predictions made for a portion of the evaluation dataset (chronologically ordered and obtained from the same reactor), plotted together with the failure thresholds. Operators can use such figures to identify process failures and their potential causes.

Fig. 7. Measured vs predicted values with the local treated effluent regulations.

The local treated-effluent standards were set as thresholds for this study. The maximum effluent COD, NH4-N, and PO43− were set to 20 mg/L, 10 mg/L, and 0.5 mg/L, respectively. The threshold for the granule size was set to 200 µm, below which the biomass would be considered floccular (Liu et al., 2010), indicating either a failure of the structural integrity of the granules or a failure to achieve granulation. The MLSS threshold was set at 4000 mg/L for this study; an MLSS concentration falling below this threshold would indicate a washout of the biomass due to poor settling. The settling properties can also be predicted from the SVI values. Additionally, the SVI30/SVI5 ratio indicates the degree of granulation inside the reactor, as defined by Kocaturk and Erguder (2016); its threshold was set to 50% for this study.
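A threshold check of this kind is simple to automate. The sketch below mirrors the limits quoted above; the dictionary keys and function names are illustrative, not from the paper.

```python
# Hypothetical threshold table mirroring the limits quoted above.
THRESHOLDS = {
    "effluent_cod": ("max", 20.0),    # mg/L
    "effluent_nh4": ("max", 10.0),    # mg/L
    "effluent_po4": ("max", 0.5),     # mg/L
    "granule_size": ("min", 200.0),   # um; below -> floccular biomass
    "mlss":         ("min", 4000.0),  # mg/L; below -> biomass washout
    "svi30_svi5":   ("min", 0.5),     # granulation ratio
}

def flag_failures(sample):
    """Return the list of predicted parameters that violate their threshold."""
    flags = []
    for key, (kind, limit) in THRESHOLDS.items():
        v = sample.get(key)
        if v is None:
            continue  # parameter not predicted for this sample
        if (kind == "max" and v > limit) or (kind == "min" and v < limit):
            flags.append(key)
    return flags
```

Applying such a check to each predicted sample flags the same kind of failure window that is analysed below for the NH4-N excursion.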
It can be observed that a failure to remove NH4-N occurred between samples 218 and 232, where the effluent concentration reached 28 mg/L. The MLSS plot shows a rapid decline in concentration, which entailed a loss of the slow-growing nitrifying bacteria and thus a delayed period of poor NH4-N removal while the biomass recovered. Further analysis of the results shows a drop in the granule size and the granulation ratio inside the reactor, indicating that a partial washout had occurred, followed by rapid growth of heterotrophic biomass in floccular form due to the abundance of organics (the F/M ratio was disturbed by the loss of biomass).
The influent COD was reduced from around 7000 mg/L to 4500 mg/L after the biomass washout event. This improved the observed COD removal efficiency, as the new heterotrophic growth was able to handle the influent organics. The effluent COD concentration remained above the allowed threshold because the reactor was being operated to treat high-organic-content wastewater, whose effluent was to be polished further before disposal. The aerobic biological process, although successful in removing most of the organic load, was unable to bring the COD concentrations below the required limits (Hamza et al., 2018).

6. Conclusion

A machine learning model was developed for AGS reactors using a combination of neural networks, support vector regression and adaptive neuro-fuzzy inference systems. Feature selection methods were applied, and a five-stage model structure was adopted. This study shows that proper feature selection and combining multiple machine learning algorithms in an ensemble can improve the performance of data-driven models when the available dataset is small. Two feature selection methods were applied to reduce the dimensionality of the regression problem and the multicollinearity of the input data. Combining multiple algorithms using simple neural networks or weighted-average ensembles reduced the levels of over- and under-fitting of the individual algorithms. The modular nature of the model structure allowed the best-performing model, out of the eight alternatives, to be used for each output. The model developed in this study was able to predict the behaviour of AGS reactors and provide insight into the process by explaining the causes of predicted failures.

Declaration of Competing Interest

None.

Acknowledgements

The authors would like to acknowledge the Natural Sciences and Engineering Research Council of Canada for funding this research.
