Parameter Identification for Partial Differential Equations with Spatiotemporal Varying Coefficients
Abstract
To comprehend complex systems with multiple states, it is imperative to reveal the identity of these states from the system outputs. Nevertheless, the mathematical models describing these systems often exhibit nonlinearity, which renders the resolution of the parameter inverse problem from observed spatiotemporal data a challenging endeavor. Starting from the observed data obtained from such systems, we propose a novel framework that facilitates the investigation of parameter identification for multi-state systems governed by spatiotemporally varying parametric partial differential equations. Our framework consists of two integral components: a constrained self-adaptive physics-informed neural network, encompassing a sub-network, as our methodology for parameter identification, and a finite mixture model approach to detect regions of probable parameter variations. Through our scheme, we can precisely ascertain the unknown varying parameters of a complex multi-state system, thereby accomplishing the inversion of the varying parameters. Furthermore, we showcase the efficacy of our framework on two numerical cases: the 1D Burgers’ equation with time-varying parameters and the 2D wave equation with a space-varying parameter.
keywords:
Multi-state complex system; Parameter identification; Inverse problem; Physics-informed neural network; Finite mixture model; Change-point detection
1 Introduction
Parameter identification for partial differential equations (PDEs), also known as the inverse problem, draws on various mathematical branches such as numerical analysis, nonlinear analysis, and optimization. The goal of the inverse problem is to infer unknown parameters of a PDE from a set of spatiotemporal data with potential noise [1], and this field has progressed rapidly over the past few decades with proposed methods such as the sparse Bayesian learning algorithm [2], the least squares method [3], frequentist and Bayesian methods [4], and physics-informed neural networks (PINNs) [5]. In the mechanics of materials, accurate detection of property parameters benefits damage detection and the design of new multi-functional materials [6]. In biomechanics, identifying important parameters of human tissue can be helpful for treatment and disease prevention [7, 8]. Parameter identification methods are also widely used in other engineering fields such as oil exploration and fluid mechanics [9, 10].
Nowadays, multi-state systems with time-varying or space-varying parameters are widely used in fields such as physics, biology, chemical processes [11], and society [12]. One of the most powerful ways of understanding a multi-state complex system with time-varying parameters is discovering its state transition path. Various theories have been proposed to model and characterize the system dynamics, such as transition path theory [13], transition path sampling [14], and the Markov state model [15]. For a system governed by a varying parametric PDE, the evolutionary process of the varying parameters determines the state transition path of the multi-state system. For space-varying parameters in higher dimensions, a transition region can be characterized instead of a transition path. As such, identifying the unknown varying parameters is becoming a necessary first step toward discovering pattern variation in complex systems.
PINNs have been demonstrated to be an efficient way to infer the unknown parameters of PDEs from observed data. The original idea behind PINNs was introduced by Lagaris in 1998 [16] and was well established by Raissi et al. for solving two main problems: the forward problem of solving PDEs and parameter identification for PDEs [17, 18]. Since Raissi, various numerical techniques have been proposed to improve the performance of PINNs on these two problems [19, 20], and PINNs have been successfully used to solve problems in materials [21], biology [22], topological optimization [23], and fluids [24]. For the task of inferring varying parameters, Revanth et al. proposed backward compatible PINNs (bc-PINNs) [25] to learn the time-varying parameters of a time-varying parametric Burgers’ equation from observed data without any prior information. However, the inferred results of bc-PINNs only follow a trend similar to the true values. Such inaccurate results are insufficient for exploring the transition path, so a more accurate parameter identification method is needed.
After obtaining the inferred results of the varying parametric PDEs, the next step is detecting the change region of the varying parameters. Change-point detection is an important part of time series analysis and probabilistic anomaly detection [26]. This work requires us to pinpoint the locations of changes in statistical characteristics and the points in time at which the probability density functions change [27]. Based on the parameter inference results of a varying system, a fast and accurate change point detection method can help detect the change points of the system and locate their positions, which is significant for discovering the state transition path. For time series data, there has been extensive work on detecting change points [28, 29, 30], and it has become a significant part of controlling the reliability and stability of a system. Unlike time series analysis, in this study the change points of time-varying and space-varying parameters make up the region of variation of the intrinsic nature of the multi-state complex system. It is interesting research to reveal this hidden parameter variation from the output of the system.
Data-driven statistical modeling based on finite mixture distributions is a rapidly evolving field with a wide and expanding range of applications [31]. Recently, finite mixture models have been utilized in various fields, such as biometrics, physics, medicine, and marketing. They offer a straightforward method for describing a continuous system’s variation through a discrete state space. Despite being a simple linear extension of classical statistical models, finite mixture models share features concerning inference, specifically a discrete latent structure that results in certain fundamental challenges in estimation, such as the need to determine the unknown number of groups, states, and clusters. The expectation-maximization (EM) algorithm is an iterative technique based on maximum likelihood estimation for estimating the parameters of statistical models when the data comprise both observed and hidden variables, as in the context of finite mixture models [32]. The key advantage of the EM algorithm is that it provides a means of estimating the parameters of models with latent variables without explicitly computing the posterior distribution of the latent variables. This statistical method is particularly useful when there are missing or incomplete data, or when the data are partially observed, and it can be computationally efficient, especially when dealing with complex models.
In this paper, we introduce a novel framework for discovering the state transition path of a multi-state parametric PDE system in two steps. First, we use modified constrained self-adaptive physics-informed neural networks (cSPINNs) to identify the unknown varying parameters, and we then detect the change region via a change point detection method based on a finite mixture model. Specifically, we modify the cSPINNs by adding a sub-network to learn the varying parameters, which obtains more accurate results than the previous bc-PINNs. Next, we detect the change points, i.e., where the parameters change, based on the inference results by employing the finite mixture method. Finally, we take the 1D time-varying parametric Burgers’ equation and the 2D space-varying wave equation as test examples to demonstrate the performance of our method.
This paper is structured as follows. In section 2, we describe forms of parametric partial differential equations with time- and space-varying parameters, which are the test cases for our method. In section 3, the proposed framework containing two main methods for discovering the state transition path is presented in detail. In section 4, we test the performance of our framework on the 1D time-varying parametric Burgers’ equation and analyze the results. Section 5 compares our framework with existing methods, including a traditional change-point detection approach and bc-PINNs, on the 1D Burgers’ equation. Section 6 presents the performance of our framework on the 2D space-varying wave equation, and section 7 is the conclusion and discussion.
2 Parametric Partial Differential Equations with Time and Space Varying Parameter
To elucidate the situation of partial differential equations with time-varying parameters, we use the following 1D time-varying parametric Burgers’ equation as an example. The Burgers’ equation is a nonlinear second-order partial differential equation that is used as a simplified model in fluid mechanics. The equation is named after the Dutch mathematician Johannes Burgers [33], and in this study we generally write it in the following form
(2.1)  u_t + u u_x = ν u_xx
where u(x, t) is the fluid velocity at position x and time t, the term u u_x is known as the convective term, the term ν u_xx is the diffusive term, and ν is the kinematic viscosity of the fluid. The Burgers’ equation combines the effects of convection and diffusion in a non-linear way and is used to model a variety of phenomena in fluid mechanics, including shock waves, turbulence, and flow in porous media [34]. Besides fluid mechanics, it has also been used in other areas of physics, such as in modeling traffic flow in transportation engineering [35].
Let λ₁(t) and λ₂(t) be time-varying parameters that take values in a finite discrete parameter space. We rewrite equation (2.1) as
(2.2)  u_t + λ₁(t) u u_x = λ₂(t) u_xx
Thus we get a continuous system with discrete states. In this time-varying parameter system, the parameter may exhibit local invariance. As such, in the global time domain, research attention is directed toward how the system state changes over time and at which points these changes occur. The subsequent objective of this study is to establish a comprehensive mathematical framework that builds upon existing solutions of the system. This framework serves to address the inverse problem for parameters in the equation and change point detection of time-varying parameter systems.
In this paper, the observed data of the 1D time-varying parametric Burgers’ equation are computed via a numerical method based on the fast Fourier transform, where the initial value is given as:
(2.3)
on the corresponding space-time domain. The observed data of three cases (constant parameters without a change point; one parameter changing only once; and both parameters changing with multiple change points) are shown in figure 1.
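As an illustration of how such observed data can be generated, the following is a minimal pseudo-spectral sketch of a Burgers’ solver with constant parameters. The periodic domain, the stand-in initial profile u(x, 0) = -sin(x), the grid sizes, and the parameter values are our assumptions for illustration, not the paper’s actual setup.

```python
import numpy as np

def burgers_spectral(nx=256, nt=2000, t_end=1.0, lam1=1.0, lam2=0.1):
    """Pseudo-spectral forward-Euler solver for u_t + lam1*u*u_x = lam2*u_xx
    on the periodic domain [-pi, pi)."""
    x = np.linspace(-np.pi, np.pi, nx, endpoint=False)
    k = np.fft.fftfreq(nx, d=2 * np.pi / nx) * 2 * np.pi  # angular wavenumbers
    u = -np.sin(x)                                        # stand-in initial condition
    dt = t_end / nt
    for _ in range(nt):
        u_hat = np.fft.fft(u)
        u_x = np.real(np.fft.ifft(1j * k * u_hat))        # spectral first derivative
        u_xx = np.real(np.fft.ifft(-(k ** 2) * u_hat))    # spectral second derivative
        u = u + dt * (-lam1 * u * u_x + lam2 * u_xx)      # explicit Euler time step
    return x, u

x, u = burgers_spectral()
```

Replacing the constants `lam1` and `lam2` with piecewise schedules evaluated at each time step would produce observed data of the time-varying kind used in the experiments.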
From the above figures, it is clear that the transition path of the states and the change points of the time-varying system cannot be revealed directly from the observed data. Moreover, the difference between the first and second cases is obscure, let alone the system information and state transfer paths. Therefore, we apply the modified cSPINNs as a bridge linking the observed data and the unknown parameters. In this way, we can discover the hidden information together with the transition path.
For the situation of partial differential equations with space-varying parameters, in contrast to the previous example, we introduce the 2D wave equation, whose parameter is not a constant in the spatial plane. The wave equation is a mathematical model that describes wave phenomena. It is typically expressed as a partial differential equation and can describe wave processes in both space and time. It has widespread applications in physics, engineering, mathematics, and other fields. The general form of the wave equation can be written as:
(2.4)  ∂²u/∂t² = c² ∇²u
Here, u represents the wave amplitude, c is the wave speed, and ∇² is the Laplacian operator, which represents the second derivative in space. This equation describes how the wave amplitude changes and propagates during a wave process: the second time derivative ∂²u/∂t² represents the acceleration of the wave amplitude, while the Laplacian ∇²u represents the second derivative in space. Let c = c(x, y) be a space-varying parameter and rewrite (2.4) as
(2.5)  ∂²u/∂t² = c(x, y)² ∇²u
In many cases in scientific computing research, there can be sudden and discontinuous changes in local regions, which can have a significant impact on the output of the system. Therefore, it is of great practical significance to obtain, through scientific computation, the regions of such a space where the state varies. By doing so, we can better understand the underlying physical processes and develop more accurate models to describe them.
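To make the role of a locally varying wave speed concrete, here is a small finite-difference sketch of equation (2.5) on a periodic grid. The square low-velocity anomaly, the Gaussian initial pulse, and all grid and speed values are illustrative assumptions; the paper’s anomaly is ellipsoidal and is simulated with specfem2D.

```python
import numpy as np

def wave_step(u_prev, u_curr, c, dt, dx):
    """One leapfrog update of u_tt = c(x, y)^2 * (u_xx + u_yy) on a periodic grid."""
    lap = (np.roll(u_curr, 1, 0) + np.roll(u_curr, -1, 0)
           + np.roll(u_curr, 1, 1) + np.roll(u_curr, -1, 1) - 4 * u_curr) / dx ** 2
    return 2 * u_curr - u_prev + (c * dt) ** 2 * lap

n = 64
dx = 1.0 / n
c = np.full((n, n), 1.0)
c[24:40, 24:40] = 0.6                                    # stand-in low-velocity anomaly
xx, yy = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n))
u0 = np.exp(-200 * ((xx - 0.5) ** 2 + (yy - 0.5) ** 2))  # Gaussian pulse, initially at rest
u1 = u0.copy()
dt = 0.5 * dx / c.max()                                  # satisfies the 2D CFL condition
for _ in range(50):
    u0, u1 = u1, wave_step(u0, u1, c, dt, dx)
```

As the pulse crosses the anomaly, the locally reduced speed distorts the wavefront, which is the kind of signature the inverse problem must recover from the output field.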
3 Data-driven Discovery of Parameter Identification Framework for Partial Differential Equations
In this section, we use two parts to introduce our state transition path discovery framework for a varying parameter system. Firstly, we illustrate the modified cSPINNs method for identifying varying parameters from the observed data. Next, we describe the finite mixture model as our change point detection method.
3.1 Modified Constrained Self-adaptive Physics Informed Neural Networks
In this subsection, we introduce the modified cSPINNs to solve the inverse problem. We first consider the model problem, given the spatial domain Ω and temporal domain [0, T], with the explicit parametric form of the parameterized PDE:
(3.1)  u_t + N[u; λ] = 0,  x ∈ Ω,  t ∈ [0, T]
where N[·; λ] is an operator parameterized by the physics parameter λ, which includes any combination of linear and non-linear terms of spatial derivatives. To infer the unknown parameters of the PDE via PINNs [5], we need to construct a neural network u_θ(x, t), taking the spatial and temporal inputs (x, t), with trainable parameters θ to fit the data {(x_i, t_i, u_i)}. Meanwhile, the neural network also needs to satisfy the physics laws, i.e., the parameterized governing PDE. Therefore, we can train a physics-informed model by minimizing the following loss function
(3.2)  L(θ) = w_f L_f + w_data L_data
where
(3.3a)  L_f = (1/N_f) Σ_{i=1}^{N_f} | f(x_f^i, t_f^i) |²,  with the residual f := u_t + N[u; λ]
(3.3b)  L_data = (1/N_data) Σ_{i=1}^{N_data} | u_θ(x_d^i, t_d^i) − u_d^i |²
Here, L_f and L_data are the loss functions due to the residual of the PDE and the data misfit between the observed data and the value predicted by the network, respectively. We use u_θ to represent the output of the neural network, in other words the PDE solution, which is parameterized by θ. The weights w_f and w_data can highly influence the convergence rate of the different loss components and the final accuracy of PINNs [36]. Recently, many works [36, 37, 38, 39, 40] have been proposed to explore weighting strategies during PINN training, which has become one of the mainstream directions of PINN research. To further enhance the learning ability in physics domains with complex solutions and to improve the accuracy of the inferred parameters, we introduce a constrained self-adaptive weighted residual loss function. For the inverse problem, the training goal is determined by the residual loss and the data loss; here we mainly consider the residual loss, which is closely related to the accuracy of the inferred parameters. We first rewrite the residual loss function as
(3.4a)  L_f = (1/N_f) Σ_{i=1}^{N_f} w_i | f(x_f^i, t_f^i) |²
during training, we update the trainable weights as
(3.5a)  w̃_i^{k+1} = w_i^k + η ∂L_f/∂w_i = w_i^k + (η/N_f) | f(x_f^i, t_f^i) |²
(3.5b)  ŵ_i^{k+1} = N_f w̃_i^{k+1} / Σ_{j=1}^{N_f} w̃_j^{k+1}
(3.5c)  w_i^{k+1} = β w_i^k + (1 − β) ŵ_i^{k+1}
where we denote {(x_f^i, t_f^i)} as the residual points in Ω × [0, T] and k as the training iteration number. w̃ is an intermediate variable before normalization; in other words, we first normalize w̃ to obtain ŵ and then get the final weights by a weighted sum of the weights of the previous iteration and the normalized ŵ of the current iteration. We constrain the expectation of the weights in PINNs, i.e., we let E[w] = 1. We update the weights by gradient ascent to raise PINNs’ attention on areas that are difficult to learn. Figure 2 illustrates the modified constrained self-adaptive PINN framework for parameter identification problems. Neural Network 1 is used to approximate the solution u(x, t), and Neural Network 2, the added sub-network mentioned above, is applied to reconstruct the varying parameters λ₁ and λ₂. The training loss is composed of the modified PDE loss and the data loss, which correspond to the physics laws and the real observed data, respectively. Here, we obtained the observed data by numerically solving the time-dependent Burgers’ equation as in [41], which depends on a spectral method, and by using the specfem2D package to simulate the wave equation. It is worth noting that we consider the physical parameters of the system to evolve, which can be modeled using a neural network with time as input and the predicted parameters as output. Readers can see the neural network structure in Figure 2 for more details.
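The weight-update logic described above (gradient ascent on the per-point residual, normalization enforcing E[w] = 1, then a moving average with the previous iteration's weights) can be sketched as follows. The exact update form, step size, and averaging factor are our assumptions based on the description, not the paper's exact scheme.

```python
import numpy as np

def update_weights(w_prev, res, lr=0.01, beta=0.9):
    """One constrained self-adaptive weight update: gradient ascent on the
    per-point squared residual, renormalization so the weights average to one,
    then a moving average with the previous iteration's weights."""
    w_tilde = w_prev + lr * res ** 2                 # ascend: larger residual, larger weight
    w_hat = w_tilde * len(w_tilde) / w_tilde.sum()   # enforce E[w] = 1
    return beta * w_prev + (1.0 - beta) * w_hat      # smooth with the previous weights

w = np.ones(100)                                     # uniform initial weights
res = np.random.default_rng(0).normal(size=100)      # stand-in residuals f(x_i, t_i)
w = update_weights(w, res)
```

The normalization keeps the average weight fixed, so the update only redistributes attention toward points with large residuals rather than inflating the whole loss.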
3.2 Change Point Detection by Finite Mixture Method
For the estimates of the system’s varying parameters λ₁ and λ₂ obtained from the modified cSPINNs, we need to find a suitable way to perform change-point detection for system (2.2). In general, due to the observation noise and the biased estimation from training the network, we have
(3.6)  λ̂(t_i) = λ(t_i) + ε_i,  ε_i ~ N(0, σ²)
where λ̂(t_i) is a biased estimate of the parameter at the discrete time points t_i, obtained from the system’s output spatiotemporal data, and N(0, σ²) is the 1D Gaussian distribution with density function
(3.7)  φ(x; μ, σ²) = (1 / (√(2π) σ)) exp(−(x − μ)² / (2σ²))
Due to the time-varying parameters in system (2.2), the probabilistic model (3.6) is extended to a Gaussian mixture model (GMM) for the observations
(3.8)  p(x; Θ) = Σ_{k=1}^{K} α_k φ(x; μ_k, σ_k²)
where Θ = {α_k, μ_k, σ_k}_{k=1}^{K} collects the unidentified model parameters with proportional factors α_k, Σ_k α_k = 1. Define the latent variable z_{ik} as an encoding of the assignment of observation x_i to subgroup k of the mixture model:
(3.9)  z_{ik} = 1 if observation x_i belongs to component k, and z_{ik} = 0 otherwise,
with its responsibility estimate
(3.10)  γ_{ik} = E[z_{ik} | x_i; Θ] = P(z_{ik} = 1 | x_i; Θ)
The expectation step uses the parameter estimates of the model from the previous step to calculate the conditional expectation of the log-likelihood function for the observation data
(3.11)  Q(Θ; Θ^(m)) = Σ_{i=1}^{n} Σ_{k=1}^{K} γ_{ik} [ log α_k + log φ(x_i; μ_k, σ_k²) ]
The maximization step determines the parameters maximizing the log-likelihood function of the complete data obtained in the expectation step
(3.12)  Θ^(m+1) = argmax_Θ Q(Θ; Θ^(m))
By the Lagrange constrained optimization method, the updates of the model parameters in each iteration are
(3.13)  α_k = (1/n) Σ_{i=1}^{n} γ_{ik},  μ_k = Σ_i γ_{ik} x_i / Σ_i γ_{ik},  σ_k² = Σ_i γ_{ik} (x_i − μ_k)² / Σ_i γ_{ik}
Then, iterating continuously until the algorithm converges, the final two-state GMM parameter estimates
(3.14)  Θ̂ = {α̂_k, μ̂_k, σ̂_k}_{k=1,2}
are generated. Thus the soft classification probability results of the observations based on the GMM can be obtained as a matrix
(3.15)  Γ = (γ_{ik})_{n×K}
where
(3.16)  γ_{ik} = α_k φ(x_i; μ_k, σ_k²) / Σ_{j=1}^{K} α_j φ(x_i; μ_j, σ_j²)
which is deduced from Bayes’ theorem and reveals the magnitude of the probability that the i-th sample belongs to the k-th mixture component of the GMM model, for each observation x_i. For the 1D Burgers’ equation with time-varying parameters, we can calculate a corresponding sequence of change-point probabilities over the time interval
(3.17)  P(t_i) = Σ_{k ≠ k'} γ_{i,k} γ_{i+1,k'},  i = 1, …, n − 1
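A toy end-to-end sketch of this detection step: EM for a two-component 1D GMM fitted to a synthetic two-state parameter sequence, followed by an adjacent-disagreement change-point score. The deterministic initialization, the synthetic signal, and the exact form of the score are our assumptions.

```python
import numpy as np

def em_gmm_1d(x, n_iter=200):
    """EM for a two-component 1D Gaussian mixture, initialized deterministically
    at the sample extremes; returns responsibilities gamma[i, k] and the fit."""
    mu = np.array([x.min(), x.max()], dtype=float)
    sigma = np.full(2, x.std() + 1e-6)
    alpha = np.full(2, 0.5)
    for _ in range(n_iter):
        # E-step: responsibilities via Bayes' theorem
        dens = alpha * np.exp(-(x[:, None] - mu) ** 2 / (2 * sigma ** 2)) \
               / (np.sqrt(2 * np.pi) * sigma)
        gamma = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted maximum-likelihood updates
        nk = gamma.sum(axis=0)
        alpha = nk / len(x)
        mu = (gamma * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((gamma * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-9
    return gamma, mu, sigma, alpha

# Synthetic two-state parameter sequence: 0.1 then 0.5, plus estimation noise
t = np.arange(256)
lam_hat = np.where(t < 128, 0.1, 0.5) + np.random.default_rng(1).normal(0, 0.02, 256)
gamma, mu, sigma, alpha = em_gmm_1d(lam_hat)
# Change-point score: probability that adjacent points sit in different components
p_change = (gamma[:-1] * gamma[1:, ::-1]).sum(axis=1)
```

The score is near zero wherever adjacent points share a component and peaks at the boundary between the two states, which is the behavior the framework relies on when thresholding.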
For the 2D wave equation with a space-varying parameter c(x, y), we need to consider the Gaussian distribution in a higher dimension
(3.18)  φ(x; μ, Σ) = (1 / (2π |Σ|^{1/2})) exp(−(x − μ)ᵀ Σ⁻¹ (x − μ) / 2)
Similarly, after fitting the two-dimensional GMM, we have the soft classification probability results for a spatial point (x, y) in the domain
(3.19)  γ_k(x, y) = α_k φ((x, y); μ_k, Σ_k) / Σ_{j=1}^{K} α_j φ((x, y); μ_j, Σ_j)
Then we give a similar calculation of the change-point probabilities over a cross-shaped five-point region
(3.20)  P(x_i, y_j) = (1/4) Σ_{(x', y') ∈ N(x_i, y_j)} Σ_{k ≠ k'} γ_k(x_i, y_j) γ_{k'}(x', y')
where N(x_i, y_j) denotes the four axis-aligned neighbors of (x_i, y_j).
Finally, the peaks of this time series could be regarded as the detected state change-points of systems (2.2) and (2.5) in global time.
4 1D Burgers’ Equation with Time-varying Parameter
In this section, we use three distinctive types of numerical cases to test the performance of our framework. These three categories of cases represent different evolutionary models of the time-varying 1D parametric Burgers’ equation, and their hidden state transition paths can be discovered via our framework.
To better identify the parameters, a sub-network with input t and output (λ₁(t), λ₂(t)) is used to model the dynamics of the parameters, where its trainable parameters are optimized during training together with those of the backbone network so as to recover the time-varying parameters λ₁ and λ₂. The loss function is denoted as
(a). Mean squared error on the observed data
(4.1)  MSE_data = (1/N_d) Σ_{i=1}^{N_d} | u_θ(x_d^i, t_d^i) − u_d^i |²
(b). Mean squared error at the residual points
(4.2)  MSE_f = (1/N_f) Σ_{i=1}^{N_f} w_i | f(x_f^i, t_f^i) |²,  f := u_t + λ₁(t) u u_x − λ₂(t) u_xx
(c). Total mean squared error for the inverse problem
(4.3)  MSE = w_data MSE_data + w_f MSE_f
In the following numerical experiments, the observed data and residual points are randomly sampled from the computational domain. The weights of the data loss term and the residual loss term are fixed during training. We use the modified multilayer perceptron (MLP) [39] with a depth of 6, a width of 128, and the tanh activation function as Neural Network 1 for solving the inverse problem. As for Neural Network 2, a modified MLP is used here, which has 1 input neuron and consists of 4 hidden layers with 40 neurons in each layer, and the activation function is chosen as tanh. The Adam optimizer is used to minimize the loss function. Meanwhile, we batch the residual points to reduce the memory requirement of the hardware. The initial learning rate is 0.001, and an exponential learning rate annealing method is applied during training. The total time domain of the parametric Burgers’ equation has been discretized into 256 time steps uniformly. To identify the parameters λ₁ and λ₂, all the observed data within each five-step segment have been chosen. Prediction errors of the parameters identified via the modified cSPINNs are shown in Appendix B (absolute error between the reference and predicted solutions of the 1D parametric Burgers’ equation), while the statistical inference results and the error for learning the Burgers’ equation are shown in table 1. Next, we exhibit our results.
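As described above, Neural Network 2 maps time to the parameter pair. The following forward-pass sketch uses that architecture (1 input neuron, 4 hidden layers of width 40, tanh activations, 2 outputs); the Xavier-style initialization and all names are illustrative assumptions, and training is omitted.

```python
import numpy as np

def init_subnet(layers=(1, 40, 40, 40, 40, 2), seed=0):
    """Xavier-style initialization for the parameter sub-network t -> (lam1, lam2)."""
    rng = np.random.default_rng(seed)
    return [(rng.normal(0.0, np.sqrt(2.0 / (m + n)), (m, n)), np.zeros(n))
            for m, n in zip(layers[:-1], layers[1:])]

def subnet_forward(params, t):
    """Forward pass: tanh on the hidden layers, linear output layer."""
    h = np.atleast_2d(np.asarray(t, dtype=float)).T
    for W, b in params[:-1]:
        h = np.tanh(h @ W + b)
    W, b = params[-1]
    return h @ W + b  # shape (len(t), 2): predicted (lam1(t), lam2(t))

params = init_subnet()
lam = subnet_forward(params, np.linspace(0.0, 1.0, 256))
```

Because the sub-network is a smooth function of t, abrupt parameter jumps are recovered as steep but continuous transitions, which is why a separate change-point detection step is still needed.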
4.1 Case 1: Burgers’ Equation with Single Change Point
The fundamental evolutionary model for a parametric PDE-governed time-varying system necessitates that one time-varying parameter contains one change point throughout the entire process. In this study, we explore three conditions: the first is the trivial case with no change point; the second and third are cases that feature a single varying parameter with one abrupt shift or one gradual change, respectively.
(4.4) |
Following the proposed framework of Section 3, we apply the modified cSPINNs with a sub-network to learn the time-varying parameters of the parametric Burgers’ equation, and then we detect the change points by a finite mixture model. Through the results obtained above, the transition path can be discovered. Figure 3 shows the time-varying parameter values obtained using the modified cSPINNs and the results of our change point detection scheme. Sub-figures in the first and second columns illustrate the learned values of λ₁ and λ₂, respectively, and the last row of sub-figures shows the results of the finite mixture model. It demonstrates that our framework performs well for all three cases. The main advantage of our framework is that we can discover the transition path of a time-varying system governed by a parametric PDE without any prior information. More specifically, we can predict the values of the parameters (constant or time-varying) and the locations of change points without any prior information about time segments.
From figure 3, we observe that the predicted parameters accurately fit the reference solution for both the constant and time-varying cases, where the prediction errors mainly appear at locations with discontinuities. Moreover, the error for case 1.2, shown in the second row with an abrupt change, is larger than for case 1.3, in which the time-varying parameter evolves gradually. This phenomenon seems reasonable, since it is always hard for PINNs to tackle problems with discontinuities [42]. To better identify the change points, we prefer a probabilistic method for the change point detection task: our detection criterion is measured by probability through the finite mixture model. It successfully captures the same change point in case 1.2 as the reference solution, with low variance and high confidence. In this way, we manage to find all the change points in the evolutionary process of case 1.3.
The total number of discretized time points in all experiments is 256, so we obtain 255 probability results through the change point detection method. For better analysis, we set a threshold of 1e-6. The results show that there exists one change point for case 1.2 and 9 change points for case 1.3. The transition paths of (λ₁, λ₂) for the cases above can thus be read off, with sequential gradual change points for case 1.3.
4.2 Case 2: One Time-varying Parameter with Multiple Change Points
The second type of evolutionary model is one time-varying parameter with multiple change points. Here, one parameter is time-varying and the other is constant at 0.1. In this scenario, we test the performance of our framework through two cases: in the first, the time-varying parameter takes two values with two change points, and in the second, it takes two values with three change points. The reference solution has been obtained as follows:
(4.5) |
Figure 4 illustrates the results in the same way as the cases discussed above, and the errors are mainly located at positions where the discontinuities occur. For both cases, our framework successfully identifies the time-varying parameter precisely and captures all change points consistently with the reference solution. In this way, the transition path has been discovered.
For case 2.1, the time-varying parameter has the same state at the beginning and the end, mixed with a small ratio of a different state in the middle. Based on the results of the modified cSPINNs, the change point detection method detects the two change points precisely. Our framework also performs well on case 2.2, a more complex three-state mixing time-varying system with three change points.
As mentioned above, the transition paths of (λ₁, λ₂) for these two cases can be read off from the detected change points.
4.3 Case 3: Multiple Time-varying Parameters with Multiple Change Points
This type of case describes a more complicated time-varying system with multiple time-varying parameters and multiple change points; more precisely, a mixing time-varying 1D parametric Burgers’ equation with multiple change points, in which the time-varying parameters λ₁ and λ₂ vary simultaneously along different paths. The reference solution of this case has been calculated as follows:
(4.6) |
The results of the modified cSPINNs fit the reference solution well, and the detection method successfully captures all four change points within the evolutionary process. In this case, the transition path of (λ₁, λ₂) traverses the detected sequence of states, whose parameter values represent the corresponding phases of the system.
5 Comparison with Existing Methods
In this section, we compare the proposed methods with traditional approaches for change-point detection and existing neural network models. The aim is to assess the effectiveness and advantages of our proposed techniques in addressing the respective research problems. By examining these comparisons, we can gain insights into the performance improvements and novel features offered by our proposed methods.
5.1 Comparison of Change-point Detection by Finite Mixture Method with Traditional Approach
Traditional research focuses on the consistency and convergence rates of CUSUM-type estimators for detecting change points in the mean of dependent observations [43]. The results obtained in that work hold under weak assumptions on the dependence structure, allowing for non-linear and non-stationary sequences. The consistency of CUSUM-type estimators is proven for detecting shifts in the mean of a sequence of observations, and the rates of convergence are derived. The analysis considers a broad range of dependence structures, making the findings applicable to various scenarios. The estimator of the change point is defined as
(5.1)  k̂ = argmax_{1 ≤ k < n} |C_k|
where
(5.2)  C_k = Σ_{i=1}^{k} X_i − (k/n) Σ_{i=1}^{n} X_i
Our tool enables the comprehensive detection of all four change points in a sequence, encompassing their precise positions and distinctive attributes. Conversely, the traditional method is limited to identifying solely the final change point, at 0.8242 s, thus failing to capture the other change points. This discrepancy arises from the sequence’s limited length, which impairs the accuracy of change point detection using conventional methods. Traditional approaches heavily rely on specific statistical models and assumptions to facilitate change point detection. However, in shorter sequences, these methods often struggle to identify early change points. This limitation stems from the constrained sensitivity and accuracy of traditional approaches when confronted with shorter sequences. In contrast, our tool employs a flexible and adaptive approach to detect change points, effectively adjusting to the data’s unique features and patterns. By leveraging additional information, it accurately determines the presence and characteristics of change points, granting superior detection capability even in shorter sequences.
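For completeness, the CUSUM-type baseline can be sketched in a few lines; the statistic shown is the classical mean-shift form of (5.1) and (5.2), and the synthetic sequence is an illustrative assumption.

```python
import numpy as np

def cusum_changepoint(x):
    """CUSUM-type estimator: the k maximizing |C_k|, with
    C_k = sum_{i<=k} x_i - (k/n) * sum_{i<=n} x_i."""
    n = len(x)
    k = np.arange(1, n + 1)
    c = np.cumsum(x) - k / n * x.sum()
    return int(np.abs(c[:-1]).argmax()) + 1  # number of samples before the shift

# Synthetic mean-shift sequence: 100 samples near 0, then 60 samples near 1
x = np.concatenate([np.zeros(100), np.ones(60)])
x += np.random.default_rng(0).normal(0, 0.1, 160)
k_hat = cusum_changepoint(x)
```

Since the estimator returns a single argmax, it locates at most one change point per pass, which is the limitation discussed above for sequences containing several shifts.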
5.2 Comparison of Modified cSPINNs with bc-PINNs
In this part, we compare the results of the modified cSPINNs and bc-PINNs. We plot the predicted results of the time-varying parameter learned by bc-PINNs [25] and by the modified cSPINNs, together with their detected change points, in figure 6. The shape of the bc-PINNs result is more like a smooth parabola with no apparent cut-off points between the three different steps. Consequently, the parameter identification result of bc-PINNs yields wrongly detected change points, resulting in statistics with larger variance. In contrast, the result of the modified cSPINNs fits the reference solution better. Based on it, the finite mixture model can detect the four change points precisely. In this case, the transition path of the Burgers’ equation parameter is
(5.3) |
The bc-PINNs algorithm has previously been used to solve PDE inverse problems with time-varying parameters. However, this method was found to have limited accuracy, and it was unable to accurately detect the system’s change points. In contrast, the cSPINNs algorithm has been developed as a new and improved approach for solving these types of problems. With significantly higher accuracy than bc-PINNs, cSPINNs can accurately identify the turning points of a system, which is essential for many scientific applications. By using cSPINNs, we can gain deeper insights into complex systems and develop more accurate models to describe their behavior. As a result, the cSPINNs algorithm, in combination with the finite mixture model, is a powerful tool for scientific computing and can accurately detect change points in a wide range of complex systems.
6 2D Wave Equation with Space-varying Parameter
Here, we consider the 2D space-varying acoustic wave equation as another test case for our framework. The parametric wave equation is (2.5) with the space-varying parameter c(x, y). Similarly, we first use the modified cSPINNs to infer the space-varying parameter, whose loss function can be defined as:
(6.1a)  $\mathcal{L}(\theta) = \lambda_{r}\,\mathcal{L}_{r}(\theta) + \lambda_{d}\,\mathcal{L}_{d}(\theta) + \lambda_{b}\,\mathcal{L}_{b}(\theta) + \lambda_{0}\,\mathcal{L}_{0}(\theta)$

(6.1b)  $\mathcal{L}_{r}(\theta) = \frac{1}{N_{r}} \sum_{j=1}^{N_{r}} \left| \partial_{tt}\hat{u}(x_{j},y_{j},t_{j}) - c^{2}(x_{j},y_{j}) \left( \partial_{xx}\hat{u} + \partial_{yy}\hat{u} \right)(x_{j},y_{j},t_{j}) \right|^{2}$

(6.1c)  $\mathcal{L}_{d}(\theta) = \frac{1}{N_{d}} \sum_{j=1}^{N_{d}} \left| \hat{u}(x_{j},y_{j},t_{j}) - u_{j}^{\mathrm{obs}} \right|^{2}$

(6.1d)  $\mathcal{L}_{b}(\theta) = \frac{1}{N_{b}} \sum_{j=1}^{N_{b}} \left| \mathcal{B}[\hat{u}](x_{j},y_{j},t_{j}) \right|^{2}$

(6.1e)  $\mathcal{L}_{0}(\theta) = \frac{1}{N_{0}} \sum_{j=1}^{N_{0}} \left| \hat{u}(x_{j},y_{j},t_{0}) - u_{j}^{0} \right|^{2}$

where $\hat{u}$ denotes the network prediction, $u^{\mathrm{obs}}$ the observed seismogram data, $\mathcal{B}$ the (free-surface) boundary operator, and $u^{0}$ the early-time displacement snapshots.
The wavefield is simulated on a 2D domain with a prescribed wave-speed distribution using the specfem2D package [44], and a free-surface condition is imposed on the domain. The generated seismograms serve as the observed data for inference, and we use two early-time snapshots of the displacement field for training, taken before the wave interacts with any heterogeneities in the ground-truth model. The reference solution used for the training set is shown in the left part of figure 12.
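As a sanity check on the PDE residual that such a loss penalizes, one can verify numerically that an exact plane-wave solution of the constant-speed acoustic wave equation drives the residual to (near) zero. The wave speed and wavenumbers below are illustrative placeholders, not the model used in the paper.

```python
import numpy as np

# Plane-wave solution of u_tt = c^2 (u_xx + u_yy):
# u = sin(kx*x + ky*y - w*t) with dispersion relation w = c * |k|
c, kx, ky = 2.0, 3.0, 4.0
w = c * np.hypot(kx, ky)
u = lambda x, y, t: np.sin(kx * x + ky * y - w * t)

h = 1e-4                      # finite-difference step
x0, y0, t0 = 0.3, 0.2, 0.1    # arbitrary evaluation point
# second derivatives by central differences
u_tt = (u(x0, y0, t0 + h) - 2 * u(x0, y0, t0) + u(x0, y0, t0 - h)) / h**2
u_xx = (u(x0 + h, y0, t0) - 2 * u(x0, y0, t0) + u(x0 - h, y0, t0)) / h**2
u_yy = (u(x0, y0 + h, t0) - 2 * u(x0, y0, t0) + u(x0, y0 - h, t0)) / h**2

residual = u_tt - c**2 * (u_xx + u_yy)   # ~0 for the exact solution
```

In the inverse problem, the same residual is evaluated with automatic differentiation through the network prediction, and c(x, y) is itself an output of the sub-network rather than a known constant.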
The space-varying wave-speed parameter is the quantity we need to infer. For a direct comparison, we also fix the weights of the different loss terms during training, which can be denoted as
(6.2)
For the following training, we select residual points from a mesh and boundary points from each edge. Our architecture is a fully-connected neural network, trained with the modified cSPINN scheme in four corresponding stages. The backbone network is an MLP with a depth of 8 and a width of 100; the sub-network is a fully-connected neural network with a depth of 5 and a width of 10. Similar to the modified cSPINNs for the forward problems, we train the PINNs in the following four stages. In Stage 3 and Stage 4, an exponential learning-rate decay is applied to the Adam optimizer, with a decay rate of 0.7 every 2500 iterations. Finally, L-BFGS is used to further optimize the backbone network for 1000 epochs.
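The staircase decay used in Stages 3 and 4 (multiply the learning rate by 0.7 every 2500 iterations) can be written compactly; the base learning rate below is an assumed placeholder, as the paper does not state it here.

```python
def lr_at(iteration, base_lr=1e-3, gamma=0.7, step=2500):
    """Staircase exponential decay: the LR is multiplied by gamma every `step` iterations."""
    return base_lr * gamma ** (iteration // step)

# constant within each window, then a discrete drop at every multiple of `step`
schedule = [lr_at(i) for i in (0, 2499, 2500, 5000)]
```

In PyTorch, the same schedule corresponds to `torch.optim.lr_scheduler.StepLR(optimizer, step_size=2500, gamma=0.7)` attached to the Adam optimizer.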
In this case, a low-velocity anomaly in the shape of an ellipsoid is situated within a uniform background model. The velocity transition between the anomaly and the background is abrupt, resembling a sharp step function. The first panel of figure 7 shows the reference wave-speed model used in the specfem2D simulation. The second panel shows the modified cSPINNs inversion of the wave-speed parameter for the 2D wave equation; compared with the reference solution, the inverted solution is smoothed rather than reproducing the sharp discontinuous transition. The last panel shows the result of the finite mixture model, which corresponds well with the reference solution.
Using deep learning and statistical algorithms to infer the locations of parameter variations in spatial properties from the solution of equations has significant implications: it means we can infer certain characteristics of a system from observation data without prior knowledge of all parameters and physical properties. This is particularly useful in practice, as real-world systems often contain numerous complex parameters and physical properties with intricate relationships to one another. With deep learning algorithms, we can learn these relationships from large amounts of observation data and use them to make predictions and control the system. We hope this method will find applications in many fields, such as weather forecasting, climate modeling, environmental monitoring, and engineering design. By inferring the locations of parameter variations in spatial properties, we can gain insights into the behavior of complex systems, make more accurate predictions, and achieve better control.
7 Conclusion and Discussion
The rapid development of parameter identification methods for complex physical models has been enabled by advances in computational modeling. In this study, we propose a novel framework for discovering the hidden transition path of a time-varying complex system. Specifically, we combine modified cSPINNs with a finite mixture model that serves as the change point detection method: once the change points of the system are identified, the transition path behind them can be discovered. Our method has been tested on the 1D time-varying parametric Burgers’ equation under three different types of evolutionary models, and the framework performed well in all cases. The modified cSPINNs method proves to be an efficient approach for parameter identification in time-varying parametric Burgers’ equations, and the change point detection step is crucial for identifying change points in time-varying systems. We use finite mixture models together with the EM algorithm to describe a system’s variation through a discrete state space in a straightforward way, making such statistical computational algorithms widely applicable across fields. Our future work will focus on more challenging models such as the Navier-Stokes equations, with the goal of providing powerful computational tools for computer-vision applications in detecting the location of angiomas and diagnosing vascular aging.
Appendix A: Parameter Estimation Results
The following table 1 shows the statistical inference results of the finite mixture model based on the outputs of modified cSPINNs; for comparison, we also give the inference results based on bc-PINNs. The results contain the parameter estimates, the Gaussian variances, and the mixture ratios. Taking case 3 as an example, the mixture ratios of the components with estimated values 0.5052, 0.7663, and 1.0015 are 0.1908, 0.4479, and 0.3612, respectively. The last columns give the L2 relative errors of modified cSPINNs on the 1D time-varying parametric Burgers’ equation: one is the L2 error between the inferred parameter and the reference solution, and the other is the error between the reference solution and the solution computed with the inferred parameter.
| Numerical Example | Equation Coefficient | True Value | Parameter Estimation | Gaussian Variance | Mixture Ratio | Relative Error (parameter) | Relative Error (solution) |
|---|---|---|---|---|---|---|---|
| Case 1.1: Non-change | | 1.50 | 1.4996 | 1.9800e-05 | 1.0000 | 2.455e-04 | 2.978e-03 |
| | | 0.10 | 0.1000 | 1.6652e-07 | 1.0000 | 4.081e-03 | |
| Case 1.2: Single-change | | 0.50 | 0.4988 | 8.9737e-05 | 0.5000 | 1.348e-04 | 1.709e-02 |
| | | 1.00 | 0.9985 | 2.7171e-05 | 0.5000 | | |
| | | 0.10 | 0.1000 | 9.1572e-08 | 1.0000 | 3.026e-03 | |
| Case 1.3: Gradual change | | 0.50 | 0.5060 | 9.6023e-04 | 0.5000 | 7.419e-05 | 3.472e-03 |
| | | 1.00 | 0.9938 | 9.3191e-04 | 0.5000 | | |
| | | 0.10 | 0.1000 | 0.3418e-08 | 1.0000 | 2.897e-03 | |
| Case 2.1: Multi-change, Two States | | 0.50 | 0.5001 | 3.0196e-07 | 0.8253 | 2.110e-04 | 3.389e-02 |
| | | 1.00 | 0.7897 | 0.0582 | 0.1747 | | |
| | | 0.10 | 0.1001 | 1.2926e-06 | 1.0000 | 1.139e-02 | |
| Case 2.2: Multi-change, Three States | | 0.50 | 0.4987 | 6.6471e-05 | 0.3693 | 3.514e-04 | 3.169e-02 |
| | | 0.75 | 0.7570 | 0.0044 | 0.4511 | | |
| | | 1.00 | 1.0010 | 4.3976e-05 | 0.1796 | | |
| | | 0.10 | 0.1000 | 3.9237e-07 | 1.0000 | 6.264e-03 | |
| Case 3: Multi-change, Three States, Two-Parameter Varying | | 0.50 | 0.5052 | 1.5886e-04 | 0.1908 | 4.656e-04 | 3.451e-02 |
| | | 0.75 | 0.7663 | 0.0043 | 0.4479 | | |
| | | 1.00 | 1.0015 | 6.7240e-05 | 0.3612 | | |
| | | 1.00 | 0.9964 | 1.1749e-04 | 0.3508 | 3.810e-02 | |
| | | 1.33 | 1.3201 | 0.0188 | 0.4656 | | |
| | | 2.00 | 1.9989 | 6.1425e-04 | 0.1836 | | |
| Comparison case: bc-PINNs for Multi-change | | 0.50 | 0.4770 | 4.5955e-04 | 0.1509 | 1.130e-02 | 1.057e-01 |
| | | 0.75 | 0.6825 | 0.0041 | 0.3488 | | |
| | | 1.00 | 0.9895 | 0.0137 | 0.5003 | | |
| | | 0.10 | 0.1004 | 3.7265e-05 | 1.0000 | 5.477e-02 | |
| Comparison case: modified cSPINNs for Multi-change | | 0.50 | 0.4982 | 6.0886e-05 | 0.3593 | 4.627e-04 | 4.119e-02 |
| | | 0.75 | 0.7423 | 0.0048 | 0.4661 | | |
| | | 1.00 | 0.1007 | 1.2454e-05 | 0.1746 | | |
| | | 0.10 | 0.1000 | 6.2961e-07 | 1.0000 | 7.949e-03 | |
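The relative errors reported in table 1 are discrete L2 relative errors. A minimal implementation, shown on a toy example rather than the paper’s data:

```python
import numpy as np

def l2_relative_error(pred, ref):
    """Discrete L2 relative error: ||pred - ref||_2 / ||ref||_2."""
    pred, ref = np.asarray(pred, dtype=float), np.asarray(ref, dtype=float)
    return np.linalg.norm(pred - ref) / np.linalg.norm(ref)

# toy example: one of four entries is off by 0.01, so the error is 0.01 / 2 = 0.005
err = l2_relative_error([1.0, 1.0, 1.0, 1.01], [1.0, 1.0, 1.0, 1.0])
```

The same function applies both to the inferred parameter sequence versus the reference parameters and to the predicted solution field versus the reference solution, after flattening the field to a vector.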
Appendix B: Absolute Error between Reference and Predicted Solution of 1D parametric Burgers’ Equation
Figures 8, 9, and 10 show the reference solution, the predicted solution, and the absolute error between them.
Appendix C: Absolute Error between Reference and Predicted Solution of 2D Space-varying Wave Equation
The following figure 12 shows the reference solution, the predicted solution, and the absolute error between them.
References
- [1] S. Zhang, G. Lin, Robust data-driven discovery of governing physical laws with error bars, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 474 (2217) (2018) 20180305.
- [2] S. Yuan, S. Wang, M. Ma, Y. Ji, L. Deng, Sparse Bayesian learning-based time-variant deconvolution, IEEE Transactions on Geoscience and Remote Sensing 55 (11) (2017) 6182–6194.
- [3] C. Qi, H.-T. Zhang, H.-X. Li, A multi-channel spatio-temporal Hammerstein modeling approach for nonlinear distributed parameter processes, Journal of Process Control 19 (1) (2009) 85–99.
- [4] G. Frasso, J. Jaeger, P. Lambert, Parameter estimation and inference in dynamic systems described by linear partial differential equations, AStA Advances in Statistical Analysis 100 (2016) 259–287.
- [5] M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational Physics 378 (2019) 686–707.
- [6] J. Li, J. Zhang, W. Ge, X. Liu, Multi-scale methodology for complex systems, Chemical Engineering Science 59 (8-9) (2004) 1687–1700.
- [7] M. Fatemi, J. F. Greenleaf, Ultrasound-stimulated vibro-acoustic spectrography, Science 280 (5360) (1998) 82–85.
- [8] S. Cai, H. Li, F. Zheng, F. Kong, M. Dao, G. E. Karniadakis, S. Suresh, Artificial intelligence velocimetry and microaneurysm-on-a-chip for three-dimensional analysis of blood flow in physiology and disease, Proceedings of the National Academy of Sciences 118 (13) (2021) e2100697118.
- [9] M. N. Fienen, P. K. Kitanidis, D. Watson, P. Jardine, An application of Bayesian inverse methods to vertical deconvolution of hydraulic conductivity in a heterogeneous aquifer at Oak Ridge National Laboratory, Mathematical Geology 36 (2004) 101–126.
- [10] J. Brigham, W. Aquino, F. Mitri, J. F. Greenleaf, M. Fatemi, Inverse estimation of viscoelastic material properties for solids immersed in fluids using vibroacoustic techniques, Journal of Applied Physics 101 (2) (2007) 023509.
- [11] S. Swain, Handbook of stochastic methods for physics, chemistry and the natural sciences, Optica Acta 31 (9) (1984) 977–978.
- [12] L. Helfmann, E. Ribera Borrell, C. Schütte, P. Koltai, Extending transition path theory: Periodically driven and finite-time dynamics, Journal of Nonlinear Science 30 (6) (2020) 3321–3366.
- [13] E. Vanden-Eijnden, Transition path theory, Computer Simulations in Condensed Matter Systems: From Materials to Chemical Biology Volume 1 (2006) 453–493.
- [14] C. Dellago, P. G. Bolhuis, P. L. Geissler, Transition path sampling, Advances in Chemical Physics 123 (2002) 1–78.
- [15] J. D. Chodera, F. Noé, Markov state models of biomolecular conformational dynamics, Current Opinion in Structural Biology 25 (2014) 135–144.
- [16] I. Lagaris, A. Likas, D. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE Transactions on Neural Networks 9 (5) (1998) 987–1000.
- [17] M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics informed deep learning (part i): Data-driven solutions of nonlinear partial differential equations, arXiv preprint arXiv:1711.10561.
- [18] M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics informed deep learning (part ii): Data-driven discovery of nonlinear partial differential equations, arXiv preprint arXiv:1711.10566.
- [19] J. Yu, L. Lu, X. Meng, G. E. Karniadakis, Gradient-enhanced physics-informed neural networks for forward and inverse pde problems, Computer Methods in Applied Mechanics and Engineering 393 (2022) 114823.
- [20] A. Krishnapriyan, A. Gholami, S. Zhe, R. Kirby, M. W. Mahoney, Characterizing possible failure modes in physics-informed neural networks, Advances in Neural Information Processing Systems 34 (2021) 26548–26560.
- [21] Y. Chen, L. Lu, G. E. Karniadakis, L. Dal Negro, Physics-informed neural networks for inverse problems in nano-optics and metamaterials, Optics express 28 (8) (2020) 11618–11633.
- [22] M. Daneker, Z. Zhang, G. E. Karniadakis, L. Lu, Systems biology: Identifiability analysis and parameter identification via systems-biology informed neural networks, arXiv preprint arXiv:2202.01723.
- [23] L. Lu, R. Pestourie, W. Yao, Z. Wang, F. Verdugo, S. G. Johnson, Physics-informed neural networks with hard constraints for inverse design, SIAM Journal on Scientific Computing 43 (6) (2021) B1105–B1132.
- [24] T. Kadeethum, T. M. Jørgensen, H. M. Nick, Physics-informed neural networks for solving inverse problems of nonlinear biot’s equations: batch training, in: 54th US Rock Mechanics/Geomechanics Symposium, OnePetro, 2020.
- [25] R. Mattey, S. Ghosh, A novel sequential method to train physics informed neural networks for Allen-Cahn and Cahn-Hilliard equations, Computer Methods in Applied Mechanics and Engineering 390 (2022) 114474.
- [26] S. Aminikhanghahi, D. J. Cook, A survey of methods for time series change point detection, Knowledge and Information Systems 51 (2) (2017) 339–367.
- [27] N. James, M. Menzies, L. Azizi, J. Chan, Novel semi-metrics for multivariate change point analysis and anomaly detection, Physica D: Nonlinear Phenomena 412 (2020) 132636.
- [28] G. J. Ross, Parametric and nonparametric sequential change detection in R: The cpm package, Journal of Statistical Software 66 (2015) 1–20.
- [29] C. Truong, L. Oudre, N. Vayatis, Selective review of offline change point detection methods, Signal Processing 167 (2020) 107299.
- [30] A. Goswami, D. Sharma, H. Mathuku, S. M. P. Gangadharan, C. S. Yadav, S. K. Sahu, M. K. Pradhan, J. Singh, H. Imran, Change detection in remote sensing image data comparing algebraic and machine learning methods, Electronics 11 (3) (2022) 431.
- [31] G. J. McLachlan, S. X. Lee, S. I. Rathnayake, Finite mixture models, Annual Review of Statistics and Its Application 6 (2019) 355–378.
- [32] G. J. McLachlan, T. Krishnan, The EM algorithm and extensions, John Wiley & Sons, 2007.
- [33] J. M. Burgers, The nonlinear diffusion equation: asymptotic solutions and statistical problems, Springer Science & Business Media, 2013.
- [34] J. Smoller, Shock waves and reaction-diffusion equations, Vol. 258, Springer Science & Business Media, 2012.
- [35] R. Velasco, P. Saavedra, A first order model in traffic flow, Physica D: Nonlinear Phenomena 228 (2) (2007) 153–158.
- [36] L. McClenny, U. Braga-Neto, Self-adaptive physics-informed neural networks using a soft attention mechanism, arXiv preprint arXiv:2009.04544.
- [37] C. L. Wight, J. Zhao, Solving Allen-Cahn and Cahn-Hilliard equations using the adaptive physics informed neural networks, arXiv preprint arXiv:2007.04542.
- [38] S. Wang, S. Sankaran, P. Perdikaris, Respecting causality is all you need for training physics-informed neural networks, arXiv preprint arXiv:2203.07404.
- [39] S. Wang, Y. Teng, P. Perdikaris, Understanding and mitigating gradient flow pathologies in physics-informed neural networks, SIAM Journal on Scientific Computing 43 (5) (2021) A3055–A3081.
- [40] R. Mattey, S. Ghosh, A novel sequential method to train physics informed neural networks for Allen-Cahn and Cahn-Hilliard equations, Computer Methods in Applied Mechanics and Engineering 390 (2022) 114474.
- [41] S. Rudy, A. Alla, S. L. Brunton, J. N. Kutz, Data-driven identification of parametric partial differential equations, SIAM Journal on Applied Dynamical Systems 18 (2) (2018) 643–660.
- [42] A. D. Jagtap, Z. Mao, N. Adams, G. E. Karniadakis, Physics-informed neural networks for inverse problems in supersonic flows, Journal of Computational Physics 466 (2022) 111402.
- [43] P. Kokoszka, R. Leipus, Change-point in the mean of dependent observations, Statistics & Probability Letters 40 (4) (1998) 385–393.
- [44] C. Song, T. A. Alkhalifah, Wavefield reconstruction inversion via physics-informed neural networks, IEEE Transactions on Geoscience and Remote Sensing 60 (2021) 1–12.