
Parameter Identification for Partial Differential Equations with Spatiotemporal Varying Coefficients

Guangtao Zhang, Yiting Duan, Guanyu Pan, Qijing Chen, Huiyu Yang, Zhikun Zhang (zkzhang@hust.edu.cn)
Abstract

To comprehend complex systems with multiple states, it is imperative to reveal the identity of these states from system outputs. However, the mathematical models describing these systems often exhibit nonlinearity, which renders solving the parameter inverse problem from observed spatiotemporal data a challenging endeavor. Starting from the data observed from such systems, we propose a novel framework that facilitates the investigation of parameter identification for multi-state systems governed by spatiotemporally varying parametric partial differential equations. Our framework consists of two integral components: a constrained self-adaptive physics-informed neural network, encompassing a sub-network, as our methodology for parameter identification, and a finite mixture model approach to detect regions of probable parameter variation. Through our scheme, we can precisely ascertain the unknown varying parameters of a complex multi-state system, thereby accomplishing the inversion of the varying parameters. Furthermore, we showcase the efficacy of our framework on two numerical cases: the 1D Burgers' equation with time-varying parameters and the 2D wave equation with a space-varying parameter.

keywords:
Multi-state complex system; Parameter identification; Inverse problem; Physics-informed neural network; Finite mixture model; Change-point detection;

1 Introduction

Parameter identification for partial differential equations (PDEs), also known as the inverse problem, draws on various mathematical branches such as numerical analysis, nonlinear analysis, and optimization. The goal of the inverse problem is to infer the unknown parameters of a PDE from a set of spatiotemporal data with potential noise [1], and this field has progressed rapidly over the past few decades with proposed methods such as the sparse Bayesian learning algorithm [2], the least squares method [3], frequentist and Bayesian methods [4], and physics-informed neural networks (PINNs) [5]. In the mechanics of materials, accurate detection of property parameters benefits damage detection and the design of new multi-functional materials [6]. In biomechanics, identifying important parameters in human tissue can be helpful for treatment and disease prevention [7, 8]. Parameter identification methods are also widely used in other engineering fields such as oil exploration and fluid mechanics [9, 10].

Nowadays, multi-state systems with time-varying or space-varying parameters appear widely in fields such as physics, biology, chemical processes [11], and the social sciences [12]. One of the most powerful ways of understanding a multi-state complex system with time-varying parameters is to discover its state transition path. Various theories have been proposed to model and characterize such system dynamics, including transition path theory [13], transition path sampling [14], and the Markov state model [15]. For a system governed by a varying parametric PDE, the evolution of the varying parameters determines the state transition path of the multi-state system. For space-varying parameters in higher dimensions, a transition region can be characterized instead of a transition path. As such, identifying the unknown varying parameters becomes a necessary first step for discovering pattern variation in complex systems.

PINNs have been demonstrated to be an efficient way to infer the unknown parameters of PDEs from observed data. The original idea of PINNs was introduced by Lagaris et al. in 1998 [16] and was well established by Raissi et al. for solving two main problems: the forward problem of PDE resolution and parameter identification for PDEs [17, 18]. Since then, various numerical techniques have been proposed to improve the performance of PINNs on these two problems [19, 20], and PINNs have been successfully used to solve problems in materials [21], biology [22], topological optimization [23], and fluids [24]. For the varying-parameter inference task, Mattey et al. proposed backward compatible PINNs (bc-PINNs) [25] to learn the time-varying parameters of the parametric Burgers' equation from observed data without any prior information. However, the inferred results of bc-PINNs only follow a trend similar to the true values. Such inaccurate results are insufficient for exploring the transition path, so a more accurate parameter identification method is needed.

After obtaining the inference results for the varying parametric PDEs, the next step is detecting the change region of the varying parameters. Change-point detection is an important part of time series analysis and probabilistic anomaly detection [26]. This task requires us to pinpoint the locations of changes in statistical characteristics and the points in time at which the probability density functions change [27]. Based on the parameter inference results for a varying system, a fast and accurate change-point detection method can detect the change points of the system and locate their positions, which is significant for discovering the state transition path. For time series data, there has been extensive work on detecting change points [28, 29, 30], and such detection has become a significant part of controlling the reliability and stability of systems. Unlike classical time series analysis, in this study the change points of time-varying and space-varying parameters make up the region of variation in the intrinsic nature of the multi-state complex system. It is interesting research to reveal this hidden parameter variation from the output of the system.

Data-driven statistical modeling based on finite mixture distributions is a rapidly evolving field with a wide and expanding range of applications [31]. Recently, finite mixture models have been utilized in various fields such as biometrics, physics, medicine, and marketing. They offer a straightforward method for describing a continuous system's variation through a discrete state space. Despite being a simple linear extension of classical statistical models, finite mixture models share features concerning inference, specifically a discrete latent structure that leads to certain fundamental challenges in estimation, such as the need to determine the unknown number of groups, states, or clusters. The expectation-maximization (EM) algorithm is an iterative technique based on maximum likelihood estimation for estimating the parameters of statistical models when the data comprise both observed and hidden variables, as in the context of finite mixture models [32]. The key advantage of the EM algorithm is that it provides a means of estimating the parameters of models with latent variables without explicitly computing the posterior distribution of the latent variables. This statistical method is particularly useful when data are missing, incomplete, or only partially observed, and it can be computationally efficient, especially when dealing with complex models.

In this paper, we introduce a novel framework for discovering the state transition path of a multi-state parametric PDE system in two steps. First, we use modified constrained self-adaptive physics-informed neural networks (cSPINNs) to identify the unknown varying parameters; we then detect the change region via a change-point detection method based on a finite mixture model. Specifically, we modify the cSPINNs by adding a sub-network to learn the varying parameters, which yields more accurate results than the previous bc-PINNs. Next, we detect the change points, i.e., where the parameters change, from the inference results by employing the finite mixture method. Finally, we take the 1D time-varying parametric Burgers' equation and the 2D space-varying wave equation as test examples to demonstrate the performance of our method.

This paper is structured as follows. In Section 2, we describe the forms of the parametric partial differential equations with time- and space-varying parameters that serve as test cases for our method. In Section 3, the proposed framework containing the two main methods for discovering the state transition path is presented in detail. In Section 4, we test the performance of our framework on the 1D time-varying parametric Burgers' equation and analyze the results. Section 5 compares cSPINNs and bc-PINNs via the 1D Burgers' equation. Section 6 presents the performance of our framework on the 2D space-varying wave equation, and Section 7 gives the conclusion and discussion.

2 Parametric Partial Differential Equations with Time and Space Varying Parameter

To elucidate the situation of partial differential equations with time-varying parameters, we use the following 1D time-varying parametric Burgers' equation as an example. The Burgers' equation is a nonlinear second-order partial differential equation used as a simplified model in fluid mechanics. The equation is named after the Dutch mathematician Johannes Burgers [33], and in this study we write it in the following form

u_t = \lambda_1 u u_x + \lambda_2 u_{xx}, \qquad (2.1)

where $u$ is the fluid velocity at position $x$ and time $t$, the term $\lambda_1 u u_x$ is known as the convective term, the term $\lambda_2 u_{xx}$ is the diffusive term, and $\lambda_2$ is the kinematic viscosity of the fluid. The Burgers' equation combines the effects of convection and diffusion in a nonlinear way and is used to model a variety of phenomena in fluid mechanics, including shock waves, turbulence, and flow in porous media [34]. Besides fluid mechanics, it has also been used in other areas of physics, such as modeling traffic flow in transportation engineering [35].

Let $\lambda_1$ and $\lambda_2$ be time-varying parameters that take values in a finite discrete parameter space. We rewrite equation (2.1) as

u_t = \lambda_1(t) u u_x + \lambda_2(t) u_{xx}. \qquad (2.2)

Thus we obtain a continuous system with discrete states. In this time-varying parameter system, the parameters may exhibit local invariance. As such, in the global time domain, research attention is directed toward how the system state changes over time and at which points these changes occur. The subsequent objective of this study is to establish a comprehensive mathematical framework that builds upon existing solutions of the system. This framework addresses the inverse problem for the parameters in the equation and change-point detection for time-varying parameter systems.

In this paper, the observed data for the 1D time-varying parametric Burgers' equation are computed via a fast-Fourier-transform-based spectral method, with the initial condition

u(x, 0) = \exp\{-(x+1)^2\}, \qquad (2.3)

and the domain $(x, t) \in [-8, 8] \times (0, 10]$. The observed data for three cases, constant parameters without a change point, only $\lambda_1$ changing once, and both $\lambda_1$ and $\lambda_2$ changing with multiple change points, are shown in Figure 1.
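As a concrete illustration of this data-generation step, the following is a minimal Python sketch of a Fourier pseudo-spectral solver for equation (2.2), not the authors' exact implementation: it assumes periodic boundary conditions (the Gaussian initial profile (2.3) is effectively zero at the domain edges) and classical RK4 time stepping, with the piecewise $\lambda_1$ corresponding to case 1.2 below; all names are illustrative.

```python
import numpy as np

# Minimal sketch: integrate u_t = lam1(t) u u_x + lam2(t) u_xx on [-8, 8]
# with Fourier pseudo-spectral derivatives and RK4 in time, assuming
# periodic boundary conditions.
N, L = 256, 16.0
x = -8.0 + L * np.arange(N) / N
k = 2.0 * np.pi * np.fft.fftfreq(N, d=L / N)    # spectral wavenumbers

def rhs(u, t, lam1, lam2):
    u_hat = np.fft.fft(u)
    u_x = np.fft.ifft(1j * k * u_hat).real
    u_xx = np.fft.ifft(-(k**2) * u_hat).real
    return lam1(t) * u * u_x + lam2(t) * u_xx

def solve(u0, T=10.0, dt=1e-3,
          lam1=lambda t: 0.5 if t < 5.0 else 1.0,   # case 1.2 below
          lam2=lambda t: 0.1):
    u, t = u0.copy(), 0.0
    while t < T:
        k1 = rhs(u, t, lam1, lam2)
        k2 = rhs(u + 0.5 * dt * k1, t + 0.5 * dt, lam1, lam2)
        k3 = rhs(u + 0.5 * dt * k2, t + 0.5 * dt, lam1, lam2)
        k4 = rhs(u + dt * k3, t + dt, lam1, lam2)
        u = u + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t += dt
    return u

u_final = solve(np.exp(-(x + 1.0)**2))    # initial condition (2.3)
```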

Figure 1: Numerical solution of the 1D time-varying parametric Burgers' equation for three cases. From left to right, the figures correspond to constant parameters without a change point, time-varying $\lambda_1$ with one change point, and time-varying $\lambda_1$ and $\lambda_2$ with multiple change points.

From the above figures, it is clear that the transition path of the states and the change points of the time-varying system cannot be read directly from the observed data. Moreover, the difference between the first and second cases is subtle, let alone the system information and state transfer paths. Therefore, we apply the modified cSPINNs as a bridge to link the observed data and the unknown parameters. In this way, we can discover the hidden information together with the transition path.

For the situation of partial differential equations with space-varying parameters, in contrast to the previous example, we introduce the 2D wave equation, whose parameter is not constant over the spatial plane. The wave equation is a mathematical model that describes wave phenomena. It is typically expressed as a partial differential equation and can describe wave processes in both space and time, with widespread applications in physics, engineering, mathematics, and other fields. The general form of the wave equation can be written as:

u_{tt} = \alpha^2 \nabla^2 u. \qquad (2.4)

Here, $u$ represents the wave amplitude, $\alpha$ is the wave speed, and $\nabla^2$ is the Laplacian operator. This equation describes how the wave amplitude changes and propagates during a wave process: the second time derivative represents the acceleration of the wave amplitude, while the Laplacian operator represents the second derivative in space. Let $\alpha$ be a space-varying parameter $\alpha(x, y)$ and rewrite (2.4) as

u_{tt} = [\alpha(x, y)]^2 \nabla^2 u. \qquad (2.5)

In many cases in scientific computing research, there can be sudden and discontinuous changes in local regions, which can have a significant impact on the output of the system. Therefore, it is of great practical significance to locate the regions where the state varies in such a space through scientific computation. By doing so, we can better understand the underlying physical processes and develop more accurate models to describe them.

3 Data-driven Discovery of Parameter Identification Framework for Partial Differential Equations

In this section, we introduce our state-transition-path discovery framework for a varying parameter system in two parts. First, we illustrate the modified cSPINNs method for identifying varying parameters from the observed data. Next, we describe the finite mixture model that serves as our change-point detection method.

3.1 Modified Constrained Self-adaptive Physics Informed Neural Networks

In this subsection, we introduce the modified cSPINNs for solving the inverse problem. We first consider the model problem on the spatial domain $\Omega$ and temporal domain $t \in [0, T]$, with the parameterized PDE in explicit parametric form:

u_t(x, t) + \mathcal{N}[u(x, t); \lambda_p(t)] = 0, \quad x \in \Omega, \; t \in (0, T], \qquad (3.1)

where $\mathcal{N}[\cdot]$ is an operator parameterized by the physics parameter $\lambda_p(t)$, which includes any combination of linear and nonlinear terms of spatial derivatives. To infer the unknown parameters of the PDE via PINNs [5], we construct a neural network $\hat{u}(x, t; \boldsymbol{w})$ with spatial input $x \in \Omega$, temporal input $t \in [0, T]$, and trainable parameters $\boldsymbol{w}$ to fit the data $\{x_k^o, t_k^o, u_k^o\}_{k=1}^{N_o}$. Meanwhile, the neural network also needs to satisfy the physics laws, i.e., the parameterized governing PDE. Therefore, we can train a physics-informed model by minimizing the following loss function

\mathcal{L}(\boldsymbol{w}) = \lambda_r \mathcal{L}_r(\boldsymbol{w}) + \lambda_o \mathcal{L}_o(\boldsymbol{w}), \qquad (3.2)

where

\mathcal{L}_r(\boldsymbol{w}) = \frac{1}{N_r} \sum_{i=1}^{N_r} \left| \hat{u}_t(x_r^i, t_r^i) + \mathcal{N}[\hat{u}(x_r^i, t_r^i); \lambda_p(t_r^i)] \right|^2, \qquad (3.3a)
\mathcal{L}_o(\boldsymbol{w}) = \frac{1}{N_o} \sum_{i=1}^{N_o} \left| \hat{u}(x_o^i, t_o^i) - u_o(x_o^i, t_o^i) \right|^2. \qquad (3.3b)

Here, $\mathcal{L}_r$ and $\mathcal{L}_o$ are the PDE residual loss and the data loss between the observed data and the value predicted by the network, respectively. We use $\hat{u}$ to represent the output of the neural network, in other words the PDE solution, which is parameterized by $\boldsymbol{w}$. The weights $\lambda_r$ and $\lambda_o$ can strongly influence the convergence rate of the different loss components and the final accuracy of PINNs [36]. Recently, many works [36, 37, 38, 39, 40] have explored weighting strategies during PINN training, which has become one of the mainstream research directions for PINNs. To further enhance the learning ability in physics domains with complex solutions and improve the accuracy of the inferred parameters, we introduce a constrained self-adaptive weighting residual loss function. For the inverse problem, the training goal is determined by the residual loss and the data loss; here we mainly consider the residual loss, which is closely related to the accuracy of the inferred parameters. We first rewrite the residual loss function as

\mathcal{L}_r(\boldsymbol{w}) = \frac{1}{N_r} \sum_{i=1}^{N_r} \hat{\lambda}_r^i \left| \hat{u}_t(x_r^i, t_r^i) + \mathcal{N}[\hat{u}(x_r^i, t_r^i); \lambda_p(t_r^i)] \right|^2. \qquad (3.4)

During training, we update the trainable weights $\{\hat{\lambda}_r^i\}_{i=1}^{N_r}$ as

\boldsymbol{\lambda}_r^{k+1} = \hat{\boldsymbol{\lambda}}_r^k + \eta_k \nabla_{\hat{\boldsymbol{\lambda}}_r^k} \mathcal{L}(\boldsymbol{w}, \lambda_r, \lambda_o, \hat{\boldsymbol{\lambda}}_r^k), \qquad (3.5a)
\lambda_{r_i}^{k+1} = \frac{|\lambda_{r_i}^{k+1}|}{\sum_{i=1}^{N_r} |\lambda_{r_i}^{k+1}|} \times C, \qquad (3.5b)
\hat{\lambda}_{r_i}^{k+1} = (1 - \epsilon) \times \hat{\lambda}_{r_i}^k + \epsilon \times \lambda_{r_i}^{k+1}, \qquad (3.5c)

where $r_i$ denotes the $i$-th residual point in $\{x_r^i, t_r^i\}_{i=1}^{N_r}$, and $k$ and $k+1$ are training iteration numbers. $\boldsymbol{\lambda}_r^{k+1}$ is an intermediate variable before normalization; in other words, we first normalize each $\lambda_{r_i}^{k+1} \in \boldsymbol{\lambda}_r^{k+1}$ and then obtain the final $\hat{\lambda}_{r_i}^{k+1}$ as a weighted sum of the previous weight $\hat{\lambda}_{r_i}^k$ from iteration $k$ and the normalized $\lambda_{r_i}^{k+1}$ of the current iteration $k+1$. We set $C$ to the expectation of the weights in PINNs, i.e., $C = \mathrm{E}(\sum_{i=1}^{N_r} \hat{\lambda}_{r_i}) = N_r$. We update the weights by gradient ascent to raise the PINNs' attention in areas that are difficult to learn. Figure 2 illustrates the modified constrained self-adaptive PINN framework for parameter identification problems. Neural Network 1 is used to approximate the solution $\hat{u}$, and Neural Network 2, the added sub-network mentioned above, is applied to reconstruct the varying parameters $\lambda_1(t)$ and $\lambda_2(t)$. The training loss is composed of the modified PDE loss and the data loss, which correspond to the physics laws and the real observed data, respectively. Here, we obtained the observed data by numerically solving the time-dependent Burgers' equation as in [41], which depends on a spectral method, and by using the specfem2D package to simulate the wave equation. It is worth noting that we consider the physical parameters of the system to evolve, which can be modeled using a neural network with time as input and the predicted parameters as output. Readers can see the neural network structure in Figure 2 for more details.
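Since $\mathcal{L}_r$ in (3.4) is linear in the weights, the gradient-ascent step (3.5a) reduces to adding the scaled squared residual at each collocation point. The following is a minimal NumPy sketch of one weight update, Eqs. (3.5a)-(3.5c), with illustrative step sizes; the global factor $\lambda_r$ is omitted for simplicity.

```python
import numpy as np

# Minimal sketch of one constrained self-adaptive weight update,
# Eqs. (3.5a)-(3.5c). `residuals` holds the PDE residuals at the
# N_r collocation points under the current network parameters.
def update_weights(lam_hat, residuals, eta=1e-2, eps=0.1):
    N_r = lam_hat.size
    # (3.5a): gradient ascent; dL_r/dlam_hat_i = residual_i**2 / N_r
    lam_mid = lam_hat + eta * residuals**2 / N_r
    # (3.5b): normalize so the weights sum to C = N_r
    lam_norm = np.abs(lam_mid) / np.sum(np.abs(lam_mid)) * N_r
    # (3.5c): moving average of the previous and normalized weights
    return (1.0 - eps) * lam_hat + eps * lam_norm
```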

Figure 2: The schematics of the constrained self-adaptive PINNs’ framework for the inverse problem.

3.2 Change Point Detection by Finite Mixture Method

Given the system's varying parameters $\lambda_1(t)$ and $\lambda_2(t)$ obtained from the modified cSPINNs, we need a suitable way to perform change-point detection for system (2.2). In general, due to observation noise and the biased estimation from training the network, we have

\Pr(y_t \mid \mathbf{X}) = \mathcal{N}(\mathrm{E}(y_t), \sigma^2), \qquad (3.6)

where $y_t \in \mathrm{Y} = \{y_1, y_2, \cdots, y_N\}$ is a biased estimate of the parameter $\lambda(t)$ at the discrete time points $\{t_1, t_2, \cdots, t_N\}$ in $[0, T]$, $\mathbf{X}$ is the system output spatiotemporal data $u(x, t)$, and $\mathcal{N}$ is the 1D Gaussian distribution with density function

f_{\mathcal{N}}(y; \mu, \sigma^2) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left\{ -\frac{(y - \mu)^2}{2\sigma^2} \right\}. \qquad (3.7)

Due to the time-varying parameters in system (2.2), the probabilistic model (3.6) is extended to a Gaussian mixture model (GMM) for observations

\Pr(y \mid \vartheta) = \sum_{k=1}^{K} \alpha_k f_{\mathcal{N}}(y; \mu_k, \sigma_k^2), \qquad (3.8)

where $\vartheta = \{\mu_k, \sigma_k, \alpha_k, \; k = 1, 2, \cdots, K\}$ are the unidentified model parameters with proportions satisfying $\sum_{k=1}^{K} \alpha_k = 1$. Define the latent variable $\gamma_{tk}$ as a 0/1 encoding of the assignment of the observation $y_t$ to the $k$-th subgroup of the mixture model:

\gamma_{tk} = \begin{cases} 1, & y_t \text{ is from subgroup } k, \\ 0, & \text{otherwise}, \end{cases} \qquad (3.9)

with its responsibility estimate

\hat{\gamma}_{tk} = \mathrm{E}(\gamma_{tk} \mid \mathrm{Y}, \vartheta) = \frac{\alpha_k f_{\mathcal{N}}(y_t; \mu_k, \sigma_k^2)}{\sum_{k=1}^{K} \alpha_k f_{\mathcal{N}}(y_t; \mu_k, \sigma_k^2)}. \qquad (3.10)

The expectation step uses the model parameter estimates from the previous step to calculate the conditional expectation of the log-likelihood function of the observed data

\mathrm{E}[\log \Pr(y, \gamma \mid \mathrm{Y}, \vartheta)] = \sum_{k=1}^{K} \Big\{ (\log \alpha_k) \sum_{t=1}^{N} \gamma_{tk} + \Big[ \log(1/\sqrt{2\pi}) - \log \sigma_k - \frac{(y_t - \mu_k)^2}{2\sigma_k^2} \Big] \sum_{t=1}^{N} \hat{\gamma}_{tk} \Big\}. \qquad (3.11)

The maximization step determines the parameters $\hat{\vartheta}^{\mathrm{new}}$ that maximize the log-likelihood function of the complete data obtained in the expectation step

\hat{\vartheta}^{\mathrm{new}} = \arg\max_{\vartheta} \mathrm{E}[\log \Pr(y, \gamma \mid \mathrm{Y}, \vartheta)]. \qquad (3.12)

By the Lagrange multiplier method for this constrained optimization, the updates of the model parameters in each iteration are

\{\hat{\mu}_k, \hat{\sigma}_k^2, \hat{\eta}_k\}^{\mathrm{new}} = \Big\{ \frac{\sum_{t=1}^{N} \hat{\gamma}_{tk} y_t}{\sum_{t=1}^{N} \hat{\gamma}_{tk}}, \; \frac{\sum_{t=1}^{N} \hat{\gamma}_{tk} (y_t - \mu_k)^2}{\sum_{t=1}^{N} \hat{\gamma}_{tk}}, \; \frac{\sum_{t=1}^{N} \hat{\gamma}_{tk}}{N} \Big\}. \qquad (3.13)

Iterating these two steps until the algorithm converges yields the final GMM parameter estimates

\hat{\vartheta} = \{\hat{\mu}_k, \hat{\sigma}_k, \hat{\eta}_k, \; k = 1, 2, \cdots, K\}. \qquad (3.14)
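As a concrete illustration, a minimal NumPy sketch of this EM loop, Eqs. (3.10) and (3.13), is given below; the initialization and iteration count are illustrative choices rather than the exact settings used in our experiments.

```python
import numpy as np

# Minimal sketch of EM for a 1D Gaussian mixture (Eqs. 3.10 and 3.13).
def em_gmm(y, K, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    mu = rng.choice(y, size=K, replace=False)   # initial means
    sigma2 = np.full(K, y.var())                # initial variances
    eta = np.full(K, 1.0 / K)                   # mixture proportions
    for _ in range(n_iter):
        # E-step: responsibilities gamma_hat_{tk} (Eq. 3.10)
        d = y[:, None] - mu[None, :]
        dens = np.exp(-0.5 * d**2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
        g = eta * dens
        g /= g.sum(axis=1, keepdims=True)
        # M-step: closed-form updates (Eq. 3.13)
        Nk = g.sum(axis=0)
        mu = (g * y[:, None]).sum(axis=0) / Nk
        d = y[:, None] - mu[None, :]
        sigma2 = np.maximum((g * d**2).sum(axis=0) / Nk, 1e-12)
        eta = Nk / y.size
    return mu, sigma2, eta
```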

Thus the soft classification probabilities of the observations $\mathrm{Y}$ under the GMM can be obtained as an $N \times K$ matrix

\mathbf{G} = \{g_{i,j}\}_{1 \leq i \leq N, \, 1 \leq j \leq K}, \qquad (3.15)

where

g_{i,j} = \frac{\eta_j f_{\mathcal{N}}(y_i; \mu_j, \sigma_j)}{\sum_{k=1}^{K} \eta_k f_{\mathcal{N}}(y_i; \mu_k, \sigma_k)}, \qquad (3.16)

which follows from Bayes' theorem and gives the probability that the $i$-th sample belongs to the $j$-th mixture component of the GMM. Hence, for the observation data $\mathrm{Y} = \{y_1, y_2, \cdots, y_N\}$ of the 1D Burgers' equation with time-varying parameters, we can calculate a corresponding sequence of change-point probabilities over the time window $[t-1, t+1]$:

\mathbf{P}^{\mathrm{change}} = \Big\{ p_t = 1 - \sum_{k=1}^{K} (g_{t-1,k} \cdot g_{t,k} \cdot g_{t+1,k}), \; 2 \leq t \leq N-1 \Big\}. \qquad (3.17)
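Continuing the sketch above, the soft assignment matrix $\mathbf{G}$ of Eqs. (3.15)-(3.16) and the change-point probabilities of Eq. (3.17) follow directly from the fitted parameters; the snippet assumes the illustrative `em_gmm` helper defined earlier, with `y` the $\lambda(t)$ sequence inferred by the modified cSPINNs.

```python
# Soft assignments (Eq. 3.16) and change-point probabilities (Eq. 3.17).
def change_point_probabilities(y, K=2):
    mu, sigma2, eta = em_gmm(y, K)
    d = y[:, None] - mu[None, :]
    dens = np.exp(-0.5 * d**2 / sigma2) / np.sqrt(2 * np.pi * sigma2)
    G = eta * dens
    G /= G.sum(axis=1, keepdims=True)    # N x K matrix of Eq. (3.15)
    # p_t = 1 - sum_k g_{t-1,k} g_{t,k} g_{t+1,k},  2 <= t <= N-1
    return 1.0 - np.sum(G[:-2] * G[1:-1] * G[2:], axis=1)
```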

For the 2D wave equation with a space-varying parameter $\alpha$, we need to consider the Gaussian distribution in higher dimension,

f_{\mathcal{N}}(\mathbf{x}; \boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{d/2} |\boldsymbol{\Sigma}|^{1/2}} \exp\Big\{ -\frac{1}{2} (\mathbf{x} - \boldsymbol{\mu})^{\mathrm{T}} \boldsymbol{\Sigma}^{-1} (\mathbf{x} - \boldsymbol{\mu}) \Big\}. \qquad (3.18)

Similarly, after fitting the two-dimensional GMM, we have the soft classification probability for a spatial point $(x, y)$ in the domain

g_{x,y,j} = \frac{\eta_j f_{\mathcal{N}}((x, y); \boldsymbol{\mu}_j, \boldsymbol{\Sigma}_j)}{\sum_{k=1}^{K} \eta_k f_{\mathcal{N}}((x, y); \boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k)}. \qquad (3.19)

Then we compute the change-point probabilities analogously on the cross-shaped five-point stencil $\{(x, y), (x-1, y), (x+1, y), (x, y-1), (x, y+1)\}$:

\mathbf{P}^{\mathrm{change}} = \Big\{ p_{x,y} = 1 - \sum_{k=1}^{K} (g_{x,y,k} \cdot g_{x-1,y,k} \cdot g_{x+1,y,k} \cdot g_{x,y-1,k} \cdot g_{x,y+1,k}), \; 2 \leq x \leq N_x - 1, \; 2 \leq y \leq N_y - 1 \Big\}. \qquad (3.20)

Finally, the peaks of these probability sequences can be regarded as the detected state change points of systems (2.2) and (2.5).
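For the 2D case, one practical reading of Eqs. (3.18)-(3.20) is to fit the mixture to the inferred wave-speed values $\alpha(x, y)$ sampled on a grid and then evaluate the five-point stencil probability; the sketch below takes that route using scikit-learn's EM-based GaussianMixture, and the grid layout and component count are illustrative assumptions.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Minimal sketch of the 2D detector (Eq. 3.20): `alpha_grid` is the
# inferred alpha(x, y) on an Nx-by-Ny grid; K is the assumed number
# of states. GaussianMixture runs EM internally.
def change_region_probabilities(alpha_grid, K=2):
    Nx, Ny = alpha_grid.shape
    gmm = GaussianMixture(n_components=K).fit(alpha_grid.reshape(-1, 1))
    G = gmm.predict_proba(alpha_grid.reshape(-1, 1)).reshape(Nx, Ny, K)
    center = G[1:-1, 1:-1]
    # p_{x,y} = 1 - sum_k product over the cross-shaped five-point stencil
    return 1.0 - np.sum(center * G[:-2, 1:-1] * G[2:, 1:-1]
                        * G[1:-1, :-2] * G[1:-1, 2:], axis=2)
```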

4 1D Burgers’ Equation with Time-varying Parameter

In this section, we use three distinct types of numerical cases to test the performance of our framework. These three categories represent different evolutionary models of the time-varying 1D parametric Burgers' equation, and their hidden state transition paths can be discovered via our framework.

To better identify the parameter $\lambda_1(t)$, a sub-network with input $t$ and output $\lambda_1(t; \phi)$ is used to model the dynamics of the parameter, where $\phi$ denotes all trainable parameters of the sub-network, which are optimized during training together with the time-varying parameters $\lambda_1$ and $\lambda_2$. The loss function is composed as follows.

(a). Mean squared error on the observed data

\mathrm{MSE}_o = \frac{1}{N_o} \sum_{k=1}^{N_o} \left( \hat{u}(\boldsymbol{x}_k^o, t_k^o) - u_k^o \right)^2, \quad (\boldsymbol{x}_k^o, t_k^o) \in \Omega \times T. \qquad (4.1)

(b). Mean squared error of the residual points

\mathrm{MSE}_R = \frac{1}{N_r} \sum_{k=1}^{N_r} \left( R(\boldsymbol{x}_k^r, t_k^r) \right)^2, \quad (\boldsymbol{x}_k^r, t_k^r) \in \Omega \times T, \qquad (4.2)
R := \hat{u}_t - \lambda_1(t) \hat{u} \hat{u}_x - \lambda_2(t) \hat{u}_{xx}.

(c). Total mean squared error for inverse

\mathrm{MSE} = \lambda_o \mathrm{MSE}_o + \lambda_R \mathrm{MSE}_R. \qquad (4.3)
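To make the two-network setup and the residual in (4.2) concrete, a minimal PyTorch sketch is given below; plain MLPs stand in for the modified MLPs described in the next paragraph, and all names and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

# Network 1 approximates u(x, t); Network 2 maps t to the
# time-varying parameters (lambda1(t), lambda2(t)).
def mlp(in_dim, out_dim, width, depth):
    layers = [nn.Linear(in_dim, width), nn.Tanh()]
    for _ in range(depth - 1):
        layers += [nn.Linear(width, width), nn.Tanh()]
    return nn.Sequential(*layers, nn.Linear(width, out_dim))

u_net = mlp(2, 1, 128, 6)     # Neural Network 1: (x, t) -> u_hat
lam_net = mlp(1, 2, 40, 4)    # Neural Network 2: t -> (lambda1, lambda2)

def grad(y, v):
    return torch.autograd.grad(y, v, torch.ones_like(y), create_graph=True)[0]

def residual(x, t):
    x, t = x.requires_grad_(True), t.requires_grad_(True)
    u = u_net(torch.cat([x, t], dim=1))
    u_t, u_x = grad(u, t), grad(u, x)
    u_xx = grad(u_x, x)
    lam1, lam2 = lam_net(t).split(1, dim=1)
    # R := u_t - lambda1(t) u u_x - lambda2(t) u_xx   (Eq. 4.2)
    return u_t - lam1 * u * u_x - lam2 * u_xx

x = torch.rand(4000, 1) * 16.0 - 8.0     # collocation points in [-8, 8]
t = torch.rand(4000, 1) * 10.0           # times in [0, 10]
mse_r = residual(x, t).pow(2).mean()     # MSE_R of Eq. (4.2)
```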

In the following numerical experiments, we use $N_o = 4{,}000$ observed data points and $N_r = 64{,}000$ residual points randomly sampled from the computational domain with $\Omega = [-8, 8]$, $T = 10$. We set the weights of the data loss term and the residual loss term to $\lambda_o = \lambda_R = 1$. We use the modified multilayer perceptron (MLP) [39] with a depth of 6, a width of 128, and the tanh activation function as Neural Network 1 for solving the inverse problem. As for Neural Network 2, a modified MLP is used, which has 1 input neuron, consists of 4 hidden layers with 40 neurons each, and also uses the tanh activation function. The Adam optimizer is used to minimize the loss function for $N_e = 200{,}000$ epochs. Meanwhile, we set the batch size of residual points to $N_{bs} = 4{,}000$ to reduce the memory requirement of the hardware. The initial learning rate is 0.001, and an exponential learning rate annealing method is applied during training with hyper-parameter $\gamma = 0.9$. The total time domain $[0, 10]$ of the parametric Burgers' equation has been discretized uniformly into 256 time steps. To identify the parameters $\lambda_1$ and $\lambda_2$, all the observed data within each five-step segment are used. Prediction errors for the parameters identified via the modified cSPINNs are shown in Appendix B, while the statistical inference results and the $L^2$ errors for learning the Burgers' equation are shown in Table 1. Next, we exhibit our results.

4.1 Case 1: Burgers’ Equation with Single Change Point

The fundamental evolutionary model for a time-varying system governed by a parametric PDE is one in which a single time-varying parameter contains at most one change point throughout the entire process. In this study, we explore three conditions: the first is the trivial case with no change point; the second and third feature a single varying parameter with one abrupt shift or one gradual change, respectively.

\mathrm{case\ 1.1}: \lambda_1(t) = 1.5. \quad \mathrm{case\ 1.2}: \lambda_1(t) = \begin{cases} 0.5, & 0 \leq t < 5, \\ 1, & 5 \leq t \leq 10. \end{cases} \quad \mathrm{case\ 1.3}: \lambda_1(t) = \begin{cases} 0.5, & 0 \leq t < 4.77, \\ 0.98t - 4.19, & 4.77 \leq t < 5.27, \\ 1, & 5.27 \leq t \leq 10. \end{cases} \qquad (4.4)

Following the proposed framework described in Section 3, we apply the modified cSPINNs with a sub-network to learn the time-varying parameter $\lambda_1$ of the parametric Burgers' equation, and then detect the change points with the finite mixture model. Through the results obtained above, the transition path can be discovered. Figure 3 shows the time-varying parameter values obtained using the modified cSPINNs and the results of our change-point detection scheme. Sub-figures in the first and second columns illustrate the learned values of $\lambda_1$ and $\lambda_2$, respectively, and the last column shows the results of the finite mixture model. The framework performs well for all three cases. The main advantage of our framework is that we can discover the transition path of a time-varying system governed by a parametric PDE without any prior information. More specifically, we can predict the values of the parameters (constant or time-varying) and the locations of the change points without any prior information about the time segments.

From Figure 3, we can observe that the predicted parameters accurately fit the reference solution for both the constant and time-varying cases, with the prediction errors mainly appearing at locations with discontinuities. Moreover, the error for case 1.2 (second row), which has an abrupt change, is larger than for case 1.3, where the time-varying parameter $\lambda_1$ evolves gradually. This seems reasonable, since it is always hard for PINNs to tackle problems with discontinuities [42]. To better identify the change points, we use a probabilistic criterion derived from the finite mixture model for the change-point detection task. It successfully captures the same change point in case 1.2 as the reference solution, with low variance and high confidence. In this way, we also find all the change points in the evolutionary process of case 1.3.

Figure 3: Figures in the first and second columns represent $\lambda_1$ and $\lambda_2$ learned using the modified cSPINNs approach for the time-varying parametric Burgers' equation. Figures in the last column illustrate the change-point detection results for the learned $\lambda_1$. The first, second, and third rows correspond to cases 1.1, 1.2, and 1.3.

The total number of discretized time points in all experiments is 256, so we obtain 255 probability values from the change-point detection method. For analysis, we set a threshold of 1e-6. The results show that there is one change point at $t = 5$ for case 1.2 and 9 change points for case 1.3. Thus the transition paths of $(\lambda_1, \lambda_2)$ for the cases above are $(1.5, 0.1) \rightarrow (1.5, 0.1)$, $(0.5, 0.1) \rightarrow (1, 0.1)$, and $(0.5, 0.1) \rightarrow (1, 0.1)$ with sequential gradual change points.

4.2 Case 2: One Time-varying Parameter with Multiple Change Points

The second type of evolutionary model has one time-varying parameter with multiple change points. Here, the time-varying parameter is $\lambda_1$, while $\lambda_2$ is constant at 0.1. In this scenario, we test the performance of our framework through two cases: in the first, the time-varying parameter $\lambda_1$ takes two values with two change points; in the second, $\lambda_1$ takes three values with three change points. The reference solution is obtained as follows:

\mathrm{case\ 2.1}: \lambda_1(t) = \begin{cases} 0.5, & 0 \leq t < 4, \\ 1, & 4 \leq t < 5, \\ 0.5, & 5 \leq t \leq 10. \end{cases} \quad \mathrm{case\ 2.2}: \lambda_1(t) = \begin{cases} 1, & 0 \leq t < 2, \\ 0.5, & 2 \leq t < 4, \\ 0.75, & 4 \leq t < 8, \\ 0.5, & 8 \leq t \leq 10. \end{cases} \qquad (4.5)

Figure 4 presents the results in the same way as for the cases discussed above; errors are mainly located at the positions where discontinuities occur. For both cases, our framework successfully identifies the time-varying parameter $\lambda_1$ precisely and captures all change points, consistent with the reference solution. In this way, the transition path is discovered.

For case 2.1, the time-varying parameter $\lambda_1$ has the same state at the beginning and the end, mixed with a short stretch of a different state in the middle. Based on the results of the modified cSPINNs, the change-point detection method detects the two change points precisely. Our framework also performs well on case 2.2, a more complex three-state mixing time-varying system with three change points.

Figure 4: Figures in the first and second columns represent the learned $\lambda_1$ and $\lambda_2$. The last column illustrates the change-point detection results for the time-varying parameter $\lambda_1$. The first and second rows correspond to cases 2.1 and 2.2.

As mentioned above, the transition paths of $(\lambda_1, \lambda_2)$ for these two cases are $(0.5, 0.1) \rightarrow (1, 0.1) \rightarrow (0.5, 0.1)$ and $(1, 0.1) \rightarrow (0.5, 0.1) \rightarrow (0.75, 0.1) \rightarrow (0.5, 0.1)$.

4.3 Case 3: Multiple Time-varying Parameters with Multiple Change Points

This type of case describes a more complicated time-varying system with multiple time-varying parameters and multiple change points; more precisely, a mixing time-varying 1D parametric Burgers' equation with multiple change points, in which the time-varying parameters $\lambda_1$ and $\lambda_2$ vary simultaneously along different paths. The reference solution for this case is calculated as follows:

\mathrm{case\ 3}: \big(\lambda_1(t), \lambda_2(t)\big) = \begin{cases} (1.00, 1.00), & 0 \leq t < 2, \\ (0.75, 1.33), & 2 \leq t < 4, \\ (0.50, 2.00), & 4 \leq t < 6, \\ (0.75, 1.33), & 6 \leq t < 8, \\ (1.00, 1.00), & 8 \leq t \leq 10. \end{cases} \qquad (4.6)

The results of the modified cSPINNs fit the reference solution well, and the detection method successfully captures all four change points within the evolutionary process. In this case, the transition path of $(\lambda_1, \lambda_2)$ is $(1, 1) \rightarrow (0.75, 1.33) \rightarrow (0.5, 2) \rightarrow (0.75, 1.33) \rightarrow (1, 1)$. The values of the parameters represent the corresponding phases of the system.

Figure 5: The first and second figures show the learned $\lambda_1$ and $\lambda_2$ for case 3. The last figure shows the result of the change-point detection method.

5 Comparison with Existing Methods

In this section, we compare the proposed methods with traditional approaches for change-point detection and existing neural network models. The aim is to assess the effectiveness and advantages of our proposed techniques in addressing the respective research problems. By examining these comparisons, we can gain insights into the performance improvements and novel features offered by our proposed methods.

5.1 Comparison of Change-point Detection by Finite Mixture Method with Traditional Approach

Traditional research focuses on the consistency and convergence rates of CUSUM-type estimators for detecting change points in the mean of dependent observations [43]. The results obtained in that work hold under weak assumptions on the dependence structure, allowing for non-linear and non-stationary sequences. The consistency of CUSUM-type estimators is proven for detecting shifts in the mean of a sequence of observations, and the rates of convergence are derived. The analysis considers a broad range of dependence structures, making the findings applicable to various scenarios. The estimator of the change point is defined as

\hat{k}_n(\alpha) = \underset{1 \leq k \leq n-1}{\operatorname{argmax}} \left| U_k(\alpha) \right|, \qquad (5.1)

where

U_k(\alpha) = \left( \frac{k(n-k)}{n} \right)^{1-\alpha} \left( \frac{1}{k} \sum_{i=1}^{k} X_i - \frac{1}{n-k} \sum_{i=k+1}^{n} X_i \right), \quad 1 \leq k \leq n-1. \qquad (5.2)
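A minimal NumPy sketch of this baseline estimator, Eqs. (5.1)-(5.2), follows; it returns the single split index maximizing $|U_k(\alpha)|$, which is precisely why it cannot report more than one change point at a time.

```python
import numpy as np

# CUSUM-type estimator of Eqs. (5.1)-(5.2); alpha in [0, 1/2).
def cusum_change_point(X, alpha=0.0):
    n = X.size
    k = np.arange(1, n)                       # candidate split points
    csum = np.cumsum(X)[:-1]
    left = csum / k                           # mean of X_1..X_k
    right = (X.sum() - csum) / (n - k)        # mean of X_{k+1}..X_n
    U = (k * (n - k) / n) ** (1.0 - alpha) * (left - right)
    return int(np.argmax(np.abs(U)) + 1)      # argmax over 1 <= k <= n-1
```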

Our tool enables the comprehensive detection of all four change points in a sequence, including their precise positions and distinctive attributes. Conversely, the traditional method is limited to identifying only the final change point, at 0.8242 s, failing to capture the other change points. This discrepancy arises from the sequence's limited length, which impairs the accuracy of change-point detection with conventional methods. Traditional approaches rely heavily on specific statistical models and assumptions to facilitate change-point detection; in shorter sequences, these methods often struggle to identify early change points, a limitation stemming from their constrained sensitivity and accuracy. In contrast, our tool employs a flexible and adaptive approach to detect change points, effectively adjusting to the data's unique features and patterns. By leveraging additional information, it accurately determines the presence and characteristics of the change points, granting superior detection capability even in shorter sequences.

Figure 6: Figures in the first row represent the learned $\lambda_1$ and $\lambda_2$ results. The figures in the second row illustrate the change-point detection results for the learned $\lambda_1$: the first shows the four change points detected based on the modified cSPINNs, and the second shows the poor results based on bc-PINNs.

5.2 Comparison of Modified cSPINNs with bc-PINNs

In this part, we compare the results of the modified cSPINNs and bc-PINNs. We plot the predicted $\lambda_1$ learned by bc-PINNs [25] and by the modified cSPINNs, together with their detected change points, in Figure 6. The bc-PINNs result resembles a smooth parabola with no apparent cut-off points between the three different steps. Consequently, the parameter identification result of bc-PINNs leads to wrongly detected change points and a larger-variance statistical result. In contrast, the result of the modified cSPINNs fits the reference solution better, and based on it, the finite mixture model detects the four change points precisely. In this case, the time-varying parameters of the Burgers' equation are

\mathrm{Comparison\ case}: \lambda_1(t) = \begin{cases} 1.00, & 0 \leq t < 2, \\ 0.75, & 2 \leq t < 4, \\ 0.50, & 4 \leq t < 6, \\ 0.75, & 6 \leq t < 8, \\ 1.00, & 8 \leq t < 10, \end{cases} \qquad \lambda_2(t) = 0.1, \; 0 \leq t < 10. \qquad (5.3)

The bc-PINNs algorithm has previously been used to solve PDE inverse problems with time-varying parameters. However, this method was found to have limited accuracy and was unable to accurately detect the system's change points. In contrast, the cSPINNs algorithm is a new and improved approach for solving these types of problems. With significantly higher accuracy than bc-PINNs, cSPINNs can accurately identify the turning points of a system, which is essential for many scientific applications. By using cSPINNs, we can gain deeper insights into complex systems and develop more accurate models to describe their behavior. As a result, the cSPINNs algorithm is a powerful tool for scientific computing and, in combination with the finite mixture model, can accurately detect change points in a wide range of complex systems.

6 2D Wave Equation with Space-varying Parameter

Here, we consider the 2D space-varying acoustic wave equation as another test case for our framework. The parametric wave equation is (2.5), and the space-varying parameter is $\alpha(x, y)$. Similarly, we first use the modified cSPINNs to infer the space-varying parameter $\alpha(x, y)$, with the residual terms of the loss defined as:

R_{PDE} := \alpha^2 \nabla^2 \phi - \frac{\partial^2 \phi}{\partial t^2}, \qquad (6.1a)
R_{P.C} := \rho \alpha^2 \nabla^2 \phi(x, t, z = 0), \qquad (6.1b)
R_{S_1} := \nabla \phi(x, z, t = t_1^0) - U_1^0(x, z), \qquad (6.1c)
R_{S_2} := \nabla \phi(x, z, t = t_2^0) - U_2^0(x, z), \qquad (6.1d)
R_{obs} := \nabla \phi(x, z, t) - U_{obs}(x, z, t), \qquad (6.1e)

with the domain $\{(x, z, t) \mid (x, z, t) \in [0, 1.2] \times [0, 0.45] \times [0, 0.5]\}$. We construct a 2D domain with a prescribed distribution of wave speed $\alpha(x, y)$ and obtain the reference result using the specfem2D package [44]. Moreover, we impose a free-surface condition on the 2D domain. The generated seismograms serve as the observed data for inference, and we use two early-time snapshots of the displacement field for training, taken before the wave interacts with any heterogeneities in the ground-truth model. The reference solution used for the training set can be found in the left part of Figure 12.

The space-varying wave speed $\alpha(x, y)$ is what we need to infer. For direct comparison, we set the weights of the different loss terms to $\lambda_1 = 0.1$, $\lambda_2 = 1$, $\lambda_3 = 1$, $\lambda_4 = 0.1$ during training, so that the total loss is

MSE(\Theta) = \lambda_1 MSE_{PDE} + \lambda_2 MSE_S + \lambda_3 MSE_{P.C} + \lambda_4 MSE_{Obs}. \qquad (6.2)

For training, we select $N_r = 40{,}000$ residual points from a mesh of size $200 \times 200$ and $N_b = 5{,}000$ boundary points from each edge. Our architecture here is a fully-connected neural network, trained using the modified cSPINN scheme with four corresponding stages. For the backbone network, an MLP with a depth of 8 and a width of 100 is used; for the sub-network, a fully-connected neural network with a depth of 5 and a width of 10 is used. Similar to the modified cSPINNs for forward problems, we train the PINNs in four stages with $N_{S_1} = 5{,}000$, $N_{S_2} = 2{,}000$, and $N_{S_3} = N_{S_4} = 30{,}000$ iterations. In Stages 3 and 4, an exponential learning rate decay is applied to the Adam optimizer with a decay rate of 0.7 every 2,500 iterations. Then, L-BFGS is used to further optimize the backbone network for 1,000 epochs.

In this case, a low-velocity anomaly in the shape of an ellipsoid, with a wave speed of 2 km/s, is situated within a uniform background model with a wave speed of 3 km/s. The transition in velocity between the anomaly and the background is abrupt, resembling a sharp step function. The first panel of Figure 7 shows a good match between the wave simulated with specfem2D and the wave-speed model. The second panel shows the inverted solution for the wave-speed parameter $\alpha(x, y)$ of the 2D wave equation obtained by the modified cSPINNs; compared with the reference solution, the inverted solution is smoothed instead of exhibiting the sharp discontinuous transition. The last panel is the result of the finite mixture model, which corresponds well with the reference solution.

Using deep learning and statistical algorithms to infer the locations of parameter variations in spatial properties from the solutions of equations has significant implications. It means that we can infer certain characteristics of a system from observation data without prior knowledge of all parameters and physical properties. This approach is particularly useful for practical problems, as real-world systems often contain numerous complex parameters and physical properties with intricate interrelationships. With deep learning algorithms, we can learn these relationships from large amounts of observation data and use them to make predictions and control the system. We hope that this method will find a wide range of applications in fields such as weather forecasting, climate modeling, environmental monitoring, and engineering design. By inferring the locations of parameter variations in spatial properties, we can gain insight into the behavior of complex systems, make more accurate predictions, and achieve better control.

Figure 7: From left to right: the numerical solution of specfem2D, the inference results of the modified cSPINNs, and the change-point detection results.

7 Conclusion and Discussion

The rapid development of parameter identification methods for complex physical models has been enabled by advances in computational modeling. In this study, we propose a novel framework for discovering the hidden transition path of a time-varying complex system. Specifically, we combine modified cSPINNs with a finite mixture model as the change-point detection method to identify change points in the system, after which we can discover the transition path behind it. Our method has been tested on the 1D time-varying parametric Burgers' equation for three different types of evolutionary models, and our framework performed well in all cases. The modified cSPINNs method has proven to be an efficient approach for parameter identification in time-varying parametric Burgers' equations, and the change-point detection method is crucial for identifying change points in time-varying systems. We use finite mixture models together with the EM algorithm to offer a straightforward way of describing a system's variation through a discrete state space, making these statistical computational algorithms widely applicable across various fields. Our future work will focus on more challenging models such as the Navier-Stokes equations, with the goal of providing powerful computational tools for applications of computer vision in detecting the location of angiomas and diagnosing vascular aging.

Appendix A: Parameter Estimation Results

Table 1 below shows the statistical inference results of the finite mixture model based on the results of the modified cSPINNs; we also give the inference results based on bc-PINNs. The results contain the parameter estimates, the Gaussian variances, and the mixture ratios. Taking case 3 as an example, the mixture ratio of $\lambda_1$ with values 0.5052, 0.7663, and 1.0015 is 0.1908, 0.4479, and 0.3612, respectively. The last columns give the relative $L^2$ errors of the modified cSPINNs for the 1D time-varying parametric Burgers' equation: the value for $\lambda_k$ is the $L^2$ error between the inference results and the reference solution, and the value for $u(t, x)$ is the error between the reference solution and the result calculated with the inferred $\lambda_k$.

Table 1: Estimates of parameters for the Burgers' equation with time-varying parameters.

| Numerical example | Coefficient | True value | Estimate | Gaussian variance | Mixture ratio | Rel. $L^2$ error of $u(t,x)$ | Rel. $L^2$ error of $\lambda_k$ |
|---|---|---|---|---|---|---|---|
| Case 1.1: Non-change | $\lambda_1$ | 1.50 | 1.4996 | 1.9800e-5 | 1.0000 | 2.455e-04 | 2.978e-03 |
| | $\lambda_2$ | 0.10 | 0.1000 | 1.6652e-7 | 1.0000 | | 4.081e-03 |
| Case 1.2: Single-change | $\lambda_1$ | 0.50 | 0.4988 | 8.9737e-5 | 0.5000 | 1.348e-04 | 1.709e-02 |
| | | 1.00 | 0.9985 | 2.7171e-5 | 0.5000 | | |
| | $\lambda_2$ | 0.10 | 0.1000 | 9.1572e-8 | 1.0000 | | 3.026e-03 |
| Case 1.3: Gradual change | $\lambda_1$ | 0.50 | 0.5060 | 9.6023e-4 | 0.5000 | 7.419e-05 | 3.472e-03 |
| | | 1.00 | 0.9938 | 9.3191e-4 | 0.5000 | | |
| | $\lambda_2$ | 0.10 | 0.1000 | 0.3418e-8 | 1.0000 | | 2.897e-03 |
| Case 2.1: Multi-change, two states | $\lambda_1$ | 0.50 | 0.5001 | 3.0196e-7 | 0.8253 | 2.110e-04 | 3.389e-02 |
| | | 1.00 | 0.7897 | 0.0582 | 0.1747 | | |
| | $\lambda_2$ | 0.10 | 0.1001 | 1.2926e-6 | 1.0000 | | 1.139e-02 |
| Case 2.2: Multi-change, three states | $\lambda_1$ | 0.50 | 0.4987 | 6.6471e-5 | 0.3693 | 3.514e-04 | 3.169e-02 |
| | | 0.75 | 0.7570 | 0.0044 | 0.4511 | | |
| | | 1.00 | 1.0010 | 4.3976e-5 | 0.1796 | | |
| | $\lambda_2$ | 0.10 | 0.1000 | 3.9237e-7 | 1.0000 | | 6.264e-03 |
| Case 3: Multi-change, three states, two parameters varying | $\lambda_1$ | 0.50 | 0.5052 | 1.5886e-4 | 0.1908 | 4.656e-04 | 3.451e-02 |
| | | 0.75 | 0.7663 | 0.0043 | 0.4479 | | |
| | | 1.00 | 1.0015 | 6.7240e-5 | 0.3612 | | |
| | $\lambda_2$ | 1.00 | 0.9964 | 1.1749e-4 | 0.3508 | | 3.810e-02 |
| | | 1.33 | 1.3201 | 0.0188 | 0.4656 | | |
| | | 2.00 | 1.9989 | 6.1425e-4 | 0.1836 | | |
| Comparison case: bc-PINNs for multi-change | $\lambda_1$ | 0.50 | 0.4770 | 4.5955e-4 | 0.1509 | 1.130e-02 | 1.057e-01 |
| | | 0.75 | 0.6825 | 0.0041 | 0.3488 | | |
| | | 1.00 | 0.9895 | 0.0137 | 0.5003 | | |
| | $\lambda_2$ | 0.10 | 0.1004 | 3.7265e-5 | 1.0000 | | 5.477e-02 |
| Comparison case: modified cSPINNs for multi-change | $\lambda_1$ | 0.50 | 0.4982 | 6.0886e-5 | 0.3593 | 4.627e-04 | 4.119e-02 |
| | | 0.75 | 0.7423 | 0.0048 | 0.4661 | | |
| | | 1.00 | 0.1007 | 1.2454e-5 | 0.1746 | | |
| | $\lambda_2$ | 0.10 | 0.1000 | 6.2961e-7 | 1.0000 | | 7.949e-03 |

Appendix B: Absolute Error between Reference and Predicted Solution of 1D parametric Burgers’ Equation

We plot the reference solution, the predicted solution, and the absolute error in Figures 8, 9, and 10.

Figure 8: From top to bottom: constant parameters with no change point (case 1.1), a single varying parameter with one abrupt shift (case 1.2), and one gradual shift (case 1.3).
Figure 9: From top to bottom: one time-varying parameter with two change points (case 2.1), and modified cSPINNs and bc-PINNs for one time-varying parameter taking three values with three change points (case 2.2).
Figure 10: Case 3: Multiple time-varying parameters with multiple change points
Figure 11: Figures in the first row represent the errors of the modified cSPINNs, and the second row shows the errors of bc-PINNs. The result of cSPINNs is more accurate than that of bc-PINNs.

Appendix C: Absolute Error between Reference and Predicted Solution of the 2D Space-varying Wave Equation

Figure 12 below shows the reference solution, the predicted solution, and the absolute error.

Figure 12: Comparison between ground-truth and modeled wavefields and their absolute pointwise differences for the synthetic crosswell experiment with a discontinuous ellipsoidal anomaly from SpecFem2D. Snapshots at $t = 0$, $t = 0.01$ s, and $t = 0.15$ s are used as the training data.

References

  • [1] S. Zhang, G. Lin, Robust data-driven discovery of governing physical laws with error bars, Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences 474 (2217) (2018) 20180305.
  • [2] S. Yuan, S. Wang, M. Ma, Y. Ji, L. Deng, Sparse bayesian learning-based time-variant deconvolution, IEEE Transactions on Geoscience and Remote Sensing 55 (11) (2017) 6182–6194.
  • [3] C. Qi, H.-T. Zhang, H.-X. Li, A multi-channel spatio-temporal hammerstein modeling approach for nonlinear distributed parameter processes, Journal of Process Control 19 (1) (2009) 85–99.
  • [4] G. Frasso, J. Jaeger, P. Lambert, Parameter estimation and inference in dynamic systems described by linear partial differential equations, AStA Advances in Statistical Analysis 100 (2016) 259–287.
  • [5] M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations, Journal of Computational physics 378 (2019) 686–707.
  • [6] J. Li, J. Zhang, W. Ge, X. Liu, Multi-scale methodology for complex systems, Chemical engineering science 59 (8-9) (2004) 1687–1700.
  • [7] M. Fatemi, J. F. Greenleaf, Ultrasound-stimulated vibro-acoustic spectrography, Science 280 (5360) (1998) 82–85.
  • [8] S. Cai, H. Li, F. Zheng, F. Kong, M. Dao, G. E. Karniadakis, S. Suresh, Artificial intelligence velocimetry and microaneurysm-on-a-chip for three-dimensional analysis of blood flow in physiology and disease, Proceedings of the National Academy of Sciences 118 (13) (2021) e2100697118.
  • [9] M. N. Fienen, P. K. Kitanidis, D. Watson, P. Jardine, An application of bayesian inverse methods to vertical deconvolution of hydraulic conductivity in a heterogeneous aquifer at oak ridge national laboratory, Mathematical geology 36 (2004) 101–126.
  • [10] J. Brigham, W. Aquino, F. Mitri, J. F. Greenleaf, M. Fatemi, Inverse estimation of viscoelastic material properties for solids immersed in fluids using vibroacoustic techniques, Journal of applied physics 101 (2) (2007) 023509.
  • [11] S. Swain, Handbook of stochastic methods for physics, chemistry and the natural sciences, Optica Acta 31 (9) (1984) 977–978.
  • [12] L. Helfmann, E. Ribera Borrell, C. Schütte, P. Koltai, Extending transition path theory: Periodically driven and finite-time dynamics, Journal of nonlinear science 30 (6) (2020) 3321–3366.
  • [13] E. Vanden-Eijnden, Transition path theory, Computer Simulations in Condensed Matter Systems: From Materials to Chemical Biology Volume 1 (2006) 453–493.
  • [14] C. Dellago, P. G. Bolhuis, P. L. Geissler, Transition path sampling, Advances in chemical physics 123 (2002) 1–78.
  • [15] J. D. Chodera, F. Noé, Markov state models of biomolecular conformational dynamics, Current opinion in structural biology 25 (2014) 135–144.
  • [16] I. Lagaris, A. Likas, D. Fotiadis, Artificial neural networks for solving ordinary and partial differential equations, IEEE Transactions on Neural Networks 9 (5) (1998) 987–1000.
  • [17] M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics informed deep learning (part i): Data-driven solutions of nonlinear partial differential equations, arXiv preprint arXiv:1711.10561.
  • [18] M. Raissi, P. Perdikaris, G. E. Karniadakis, Physics informed deep learning (part ii): Data-driven discovery of nonlinear partial differential equations, ArXiv abs/1711.10566.
  • [19] J. Yu, L. Lu, X. Meng, G. E. Karniadakis, Gradient-enhanced physics-informed neural networks for forward and inverse pde problems, Computer Methods in Applied Mechanics and Engineering 393 (2022) 114823.
  • [20] A. Krishnapriyan, A. Gholami, S. Zhe, R. Kirby, M. W. Mahoney, Characterizing possible failure modes in physics-informed neural networks, Advances in Neural Information Processing Systems 34 (2021) 26548–26560.
  • [21] Y. Chen, L. Lu, G. E. Karniadakis, L. Dal Negro, Physics-informed neural networks for inverse problems in nano-optics and metamaterials, Optics express 28 (8) (2020) 11618–11633.
  • [22] M. Daneker, Z. Zhang, G. E. Karniadakis, L. Lu, Systems biology: Identifiability analysis and parameter identification via systems-biology informed neural networks, arXiv preprint arXiv:2202.01723.
  • [23] L. Lu, R. Pestourie, W. Yao, Z. Wang, F. Verdugo, S. G. Johnson, Physics-informed neural networks with hard constraints for inverse design, SIAM Journal on Scientific Computing 43 (6) (2021) B1105–B1132.
  • [24] T. Kadeethum, T. M. Jørgensen, H. M. Nick, Physics-informed neural networks for solving inverse problems of nonlinear biot’s equations: batch training, in: 54th US Rock Mechanics/Geomechanics Symposium, OnePetro, 2020.
  • [25] R. Mattey, S. Ghosh, A novel sequential method to train physics informed neural networks for allen cahn and cahn hilliard equations, Computer Methods in Applied Mechanics and Engineering 390 (2022) 114474.
  • [26] S. Aminikhanghahi, D. J. Cook, A survey of methods for time series change point detection, Knowledge and information systems 51 (2) (2017) 339–367.
  • [27] N. James, M. Menzies, L. Azizi, J. Chan, Novel semi-metrics for multivariate change point analysis and anomaly detection, Physica D: Nonlinear Phenomena 412 (2020) 132636.
  • [28] G. J. Ross, Parametric and nonparametric sequential change detection in r: The cpm package, Journal of Statistical Software 66 (2015) 1–20.
  • [29] C. Truong, L. Oudre, N. Vayatis, Selective review of offline change point detection methods, Signal Processing 167 (2020) 107299.
  • [30] A. Goswami, D. Sharma, H. Mathuku, S. M. P. Gangadharan, C. S. Yadav, S. K. Sahu, M. K. Pradhan, J. Singh, H. Imran, Change detection in remote sensing image data comparing algebraic and machine learning methods, Electronics 11 (3) (2022) 431.
  • [31] G. J. McLachlan, S. X. Lee, S. I. Rathnayake, Finite mixture models, Annual review of statistics and its application 6 (2019) 355–378.
  • [32] G. J. McLachlan, T. Krishnan, The EM algorithm and extensions, John Wiley & Sons, 2007.
  • [33] J. M. Burgers, The nonlinear diffusion equation: asymptotic solutions and statistical problems, Springer Science & Business Media, 2013.
  • [34] J. Smoller, Shock waves and reaction—diffusion equations, Vol. 258, Springer Science & Business Media, 2012.
  • [35] R. Velasco, P. Saavedra, A first order model in traffic flow, Physica D: Nonlinear Phenomena 228 (2) (2007) 153–158.
  • [36] L. McClenny, U. Braga-Neto, Self-adaptive physics-informed neural networks using a soft attention mechanism, arXiv preprint arXiv:2009.04544.
  • [37] C. L. Wight, J. Zhao, Solving Allen-Cahn and Cahn-Hilliard equations using the adaptive physics informed neural networks, arXiv preprint arXiv:2007.04542.
  • [38] S. Wang, S. Sankaran, P. Perdikaris, Respecting causality is all you need for training physics-informed neural networks, arXiv preprint arXiv:2203.07404.
  • [39] S. Wang, Y. Teng, P. Perdikaris, Understanding and mitigating gradient flow pathologies in physics-informed neural networks, SIAM Journal on Scientific Computing 43 (5) (2021) A3055–A3081.
  • [40] R. Mattey, S. Ghosh, A novel sequential method to train physics informed neural networks for allen cahn and cahn hilliard equations, Computer Methods in Applied Mechanics and Engineering.
  • [41] S. Rudy, A. Alla, S. L. Brunton, J. N. Kutz, Data-driven identification of parametric partial differential equations, SIAM Journal on Applied Dynamical Systems 18 (2) (2018) 643–660.
  • [42] A. D. Jagtap, Z. Mao, N. Adams, G. E. Karniadakis, Physics-informed neural networks for inverse problems in supersonic flows, Journal of Computational Physics 466 (2022) 111402.
  • [43] P. Kokoszka, R. Leipus, Change-point in the mean of dependent observations, Statistics & probability letters 40 (4) (1998) 385–393.
  • [44] C. Song, T. A. Alkhalifah, Wavefield reconstruction inversion via physics-informed neural networks, IEEE Transactions on Geoscience and Remote Sensing 60 (2021) 1–12.