News Blog Paper China
A Neuro-Symbolic Method for Solving Differential and Functional Equations2020-11-04   ${\displaystyle \cong }$
When neural networks are used to solve differential equations, they usually produce solutions in the form of black-box functions that are not directly mathematically interpretable. We introduce a method for generating symbolic expressions to solve differential equations while leveraging deep learning training methods. Unlike existing methods, our system does not require learning a language model over symbolic mathematics, making it scalable, compact, and easily adaptable for a variety of tasks and configurations. As part of the method, we propose a novel neural architecture for learning mathematical expressions to optimize a customizable objective. The system is designed to always return a valid symbolic formula, generating a useful approximation when an exact analytic solution to a differential equation is not or cannot be found. We demonstrate through examples how our method can be applied on a number of differential equations, often obtaining symbolic approximations that are useful or insightful. Furthermore, we show how the system can be effortlessly generalized to find symbolic solutions to other mathematical tasks, including integration and functional equations.
Symbolically Solving Partial Differential Equations using Deep Learning2020-11-12   ${\displaystyle \cong }$
We describe a neural-based method for generating exact or approximate solutions to differential equations in the form of mathematical expressions. Unlike other neural methods, our system returns symbolic expressions that can be interpreted directly. Our method uses a neural architecture for learning mathematical expressions to optimize a customizable objective, and is scalable, compact, and easily adaptable for a variety of tasks and configurations. The system has been shown to effectively find exact or approximate symbolic solutions to various differential equations with applications in natural sciences. In this work, we highlight how our method applies to partial differential equations over multiple variables and more complex boundary and initial value conditions.
Evolutional Deep Neural Network2021-03-17   ${\displaystyle \cong }$
The notion of an Evolutional Deep Neural Network (EDNN) is introduced for the solution of partial differential equations (PDE). The parameters of the network are trained to represent the initial state of the system only, and are subsequently updated dynamically, without any further training, to provide an accurate prediction of the evolution of the PDE system. In this framework, the network parameters are treated as functions with respect to the appropriate coordinate and are numerically updated using the governing equations. By marching the neural network weights in the parameter space, EDNN can predict state-space trajectories that are indefinitely long, which is difficult for other neural network approaches. Boundary conditions of the PDEs are treated as hard constraints, are embedded into the neural network, and are therefore exactly satisfied throughout the entire solution trajectory. Several applications including the heat equation, the advection equation, the Burgers equation, the Kuramoto Sivashinsky equation and the Navier-Stokes equations are solved to demonstrate the versatility and accuracy of EDNN. The application of EDNN to the incompressible Navier-Stokes equation embeds the divergence-free constraint into the network design so that the projection of the momentum equation to solenoidal space is implicitly achieved. The numerical results verify the accuracy of EDNN solutions relative to analytical and benchmark numerical solutions, both for the transient dynamics and statistics of the system.
Deep learning based numerical approximation algorithms for stochastic partial differential equations and high-dimensional nonlinear filtering problems2020-12-02   ${\displaystyle \cong }$
In this article we introduce and study a deep learning based approximation algorithm for solutions of stochastic partial differential equations (SPDEs). In the proposed approximation algorithm we employ a deep neural network for every realization of the driving noise process of the SPDE to approximate the solution process of the SPDE under consideration. We test the performance of the proposed approximation algorithm in the case of stochastic heat equations with additive noise, stochastic heat equations with multiplicative noise, stochastic Black--Scholes equations with multiplicative noise, and Zakai equations from nonlinear filtering. In each of these SPDEs the proposed approximation algorithm produces accurate results with short run times in up to 50 space dimensions.
Inference of Stochastic Dynamical Systems from Cross-Sectional Population Data2020-12-09   ${\displaystyle \cong }$
Inferring the driving equations of a dynamical system from population or time-course data is important in several scientific fields such as biochemistry, epidemiology, financial mathematics and many others. Despite the existence of algorithms that learn the dynamics from trajectorial measurements there are few attempts to infer the dynamical system straight from population data. In this work, we deduce and then computationally estimate the Fokker-Planck equation which describes the evolution of the population's probability density, based on stochastic differential equations. Then, following the USDL approach, we project the Fokker-Planck equation to a proper set of test functions, transforming it into a linear system of equations. Finally, we apply sparse inference methods to solve the latter system and thus induce the driving forces of the dynamical system. Our approach is illustrated in both synthetic and real data including non-linear, multimodal stochastic differential equations, biochemical reaction networks as well as mass cytometry biological measurements.
Deep Forward-Backward SDEs for Min-max Control2019-06-11   ${\displaystyle \cong }$
This paper presents a novel approach to numerically solve stochastic differential games for nonlinear systems. The proposed approach relies on the nonlinear Feynman-Kac theorem that establishes a connection between parabolic deterministic partial differential equations and forward-backward stochastic differential equations. Using this theorem the Hamilton-Jacobi-Isaacs partial differential equation associated with differential games is represented by a system of forward-backward stochastic differential equations. Numerical solution of the aforementioned system of stochastic differential equations is performed using importance sampling and a Long-Short Term Memory recurrent neural network, which is trained in an offline fashion. The resulting algorithm is tested on two example systems in simulation and compared against the standard risk neutral stochastic optimal control formulations.
Data-driven peakon and periodic peakon travelling wave solutions of some nonlinear dispersive equations via deep learning2021-01-12   ${\displaystyle \cong }$
In the field of mathematical physics, there exist many physically interesting nonlinear dispersive equations with peakon solutions, which are solitary waves with discontinuous first-order derivative at the wave peak. In this paper, we apply the multi-layer physics-informed neural networks (PINNs) deep learning to successfully study the data-driven peakon and periodic peakon solutions of some well-known nonlinear dispersion equations with initial-boundary value conditions such as the Camassa-Holm (CH) equation, Degasperis-Procesi equation, modified CH equation with cubic nonlinearity, Novikov equation with cubic nonlinearity, mCH-Novikov equation, b-family equation with quartic nonlinearity, generalized modified CH equation with quintic nonlinearity, and etc. These results will be useful to further study the peakon solutions and corresponding experimental design of nonlinear dispersive equations.
Asymptotics of Reinforcement Learning with Neural Networks2019-11-13   ${\displaystyle \cong }$
We prove that a single-layer neural network trained with the Q-learning algorithm converges in distribution to a random ordinary differential equation as the size of the model and the number of training steps become large. Analysis of the limit differential equation shows that it has a unique stationary solution which is the solution of the Bellman equation, thus giving the optimal control for the problem. In addition, we study the convergence of the limit differential equation to the stationary solution. As a by-product of our analysis, we obtain the limiting behavior of single-layer neural networks when trained on i.i.d. data with stochastic gradient descent under the widely-used Xavier initialization.
Scalable Gradients for Stochastic Differential Equations2020-07-07   ${\displaystyle \cong }$
The adjoint sensitivity method scalably computes gradients of solutions to ordinary differential equations. We generalize this method to stochastic differential equations, allowing time-efficient and constant-memory computation of gradients with high-order adaptive solvers. Specifically, we derive a stochastic differential equation whose solution is the gradient, a memory-efficient algorithm for caching noise, and conditions under which numerical solutions converge. In addition, we combine our method with gradient-based stochastic variational inference for latent stochastic differential equations. We use our method to fit stochastic dynamics defined by neural networks, achieving competitive performance on a 50-dimensional motion capture dataset.
Learning To Solve Differential Equations Across Initial Conditions2020-04-19   ${\displaystyle \cong }$
Recently, there has been a lot of interest in using neural networks for solving partial differential equations. A number of neural network-based partial differential equation solvers have been formulated which provide performances equivalent, and in some cases even superior, to classical solvers. However, these neural solvers, in general, need to be retrained each time the initial conditions or the domain of the partial differential equation changes. In this work, we posit the problem of approximating the solution of a fixed partial differential equation for any arbitrary initial conditions as learning a conditional probability distribution. We demonstrate the utility of our method on Burger's Equation.
Solving non-linear Kolmogorov equations in large dimensions by using deep learning: a numerical comparison of discretization schemes2020-12-09   ${\displaystyle \cong }$
Non-linear partial differential Kolmogorov equations are successfully used to describe a wide range of time dependent phenomena, in natural sciences, engineering or even finance. For example, in physical systems, the Allen-Cahn equation describes pattern formation associated to phase transitions. In finance, instead, the Black-Scholes equation describes the evolution of the price of derivative investment instruments. Such modern applications often require to solve these equations in high-dimensional regimes in which classical approaches are ineffective. Recently, an interesting new approach based on deep learning has been introduced by E, Han, and Jentzen [1], [2]. The main idea is to construct a deep network which is trained from the samples of discrete stochastic differential equations underlying Kolmogorov's equation. The network is able to approximate the solutions of the Kolmogorov equation with polynomial complexity in whole spatial domains, therefore avoiding the curse of dimensionality. In this contribution we study variants of the deep networks by using different discretizations schemes of the stochastic differential equation. We compare the performance of the associated networks, on benchmarked examples, and show that, for some discretization schemes, improvements in the accuracy are possible without affecting the computational complexity.
Equation Embeddings2018-03-24   ${\displaystyle \cong }$
We present an unsupervised approach for discovering semantic representations of mathematical equations. Equations are challenging to analyze because each is unique, or nearly unique. Our method, which we call equation embeddings, finds good representations of equations by using the representations of their surrounding words. We used equation embeddings to analyze four collections of scientific articles from the arXiv, covering four computer science domains (NLP, IR, AI, and ML) and $\sim$98.5k equations. Quantitatively, we found that equation embeddings provide better models when compared to existing word embedding approaches. Qualitatively, we found that equation embeddings provide coherent semantic representations of equations and can capture semantic similarity to other equations and to words.
StarNet: Gradient-free Training of Deep Generative Models using Determined System of Linear Equations2021-01-03   ${\displaystyle \cong }$
In this paper we present an approach for training deep generative models solely based on solving determined systems of linear equations. A network that uses this approach, called a StarNet, has the following desirable properties: 1) training requires no gradient as solution to the system of linear equations is not stochastic, 2) is highly scalable when solving the system of linear equations w.r.t the latent codes, and similarly for the parameters of the model, and 3) it gives desirable least-square bounds for the estimation of latent codes and network parameters within each layer.
General solutions for nonlinear differential equations: a rule-based self-learning approach using deep reinforcement learning2019-05-29   ${\displaystyle \cong }$
A universal rule-based self-learning approach using deep reinforcement learning (DRL) is proposed for the first time to solve nonlinear ordinary differential equations and partial differential equations. The solver consists of a deep neural network-structured actor that outputs candidate solutions, and a critic derived only from physical rules (governing equations and boundary and initial conditions). Solutions in discretized time are treated as multiple tasks sharing the same governing equation, and the current step parameters provide an ideal initialization for the next owing to the temporal continuity of the solutions, which shows a transfer learning characteristic and indicates that the DRL solver has captured the intrinsic nature of the equation. The approach is verified through solving the Schrödinger, Navier-Stokes, Burgers', Van der Pol, and Lorenz equations and an equation of motion. The results indicate that the approach gives solutions with high accuracy, and the solution process promises to get faster.
A deep surrogate approach to efficient Bayesian inversion in PDE and integral equation models2020-02-25   ${\displaystyle \cong }$
We propose a novel deep learning approach to efficiently perform Bayesian inference in partial differential equation (PDE) and integral equation models over potentially high-dimensional parameter spaces. The contributions of this paper are two-fold; the first is the introduction of a neural network approach to approximating the solutions of Fredholm and Volterra integral equations of the first and second kind. The second is the description of a deep surrogate model which allows for efficient sampling from a Bayesian posterior distribution in which the likelihood depends on the solutions of PDEs or integral equations. For the latter, our method relies on the approximate representation of parametric solutions by neural networks. This deep learning approach allows the accurate and efficient approximation of parametric solutions in significantly higher dimensions than is possible using classical techniques. Since the approximated solutions are very cheap to evaluate, the solutions of Bayesian inverse problems over large parameter spaces are tractable using Markov chain Monte Carlo. We demonstrate the efficiency of our method using two real-world examples; these include Bayesian inference in the PDE and integral equation case for an example from electrochemistry, and Bayesian inference of a function-valued heat-transfer parameter with applications in aviation.
A Kaczmarz Algorithm for Solving Tree Based Distributed Systems of Equations2019-04-11   ${\displaystyle \cong }$
The Kaczmarz algorithm is an iterative method for solving systems of linear equations. We introduce a modified Kaczmarz algorithm for solving systems of linear equations in a distributed environment, i.e. the equations within the system are distributed over multiple nodes within a network. The modification we introduce is designed for a network with a tree structure that allows for passage of solution estimates between the nodes in the network. We prove that the modified algorithm converges under no additional assumptions on the equations. We demonstrate that the algorithm converges to the solution, or the solution of minimal norm, when the system is consistent. We also demonstrate that in the case of an inconsistent system of equations, the modified relaxed Kaczmarz algorithm converges to a weighted least squares solution as the relaxation parameter approaches $0$.
Deep neural network for solving differential equations motivated by Legendre-Galerkin approximation2020-10-24   ${\displaystyle \cong }$
Nonlinear differential equations are challenging to solve numerically and are important to understanding the dynamics of many physical systems. Deep neural networks have been applied to help alleviate the computational cost that is associated with solving these systems. We explore the performance and accuracy of various neural architectures on both linear and nonlinear differential equations by creating accurate training sets with the spectral element method. Next, we implement a novel Legendre-Galerkin Deep Neural Network (LGNet) algorithm to predict solutions to differential equations. By constructing a set of a linear combination of the Legendre basis, we predict the corresponding coefficients, $?_i$ which successfully approximate the solution as a sum of smooth basis functions $u \simeq \sum_{i=0}^{N} ?_i \varphi_i$. As a computational example, linear and nonlinear models with Dirichlet or Neumann boundary conditions are considered.
Informed Equation Learning2021-05-13   ${\displaystyle \cong }$
Distilling data into compact and interpretable analytic equations is one of the goals of science. Instead, contemporary supervised machine learning methods mostly produce unstructured and dense maps from input to output. Particularly in deep learning, this property is owed to the generic nature of simple standard link functions. To learn equations rather than maps, standard non-linearities can be replaced with structured building blocks of atomic functions. However, without strong priors on sparsity and structure, representational complexity and numerical conditioning limit this direct approach. To scale to realistic settings in science and engineering, we propose an informed equation learning system. It provides a way to incorporate expert knowledge about what are permitted or prohibited equation components, as well as a domain-dependent structured sparsity prior. Our system then utilizes a robust method to learn equations with atomic functions exhibiting singularities, as e.g. logarithm and division. We demonstrate several artificial and real-world experiments from the engineering domain, in which our system learns interpretable models of high predictive power.
Automated Mathematical Equation Structure Discovery for Visual Analysis2021-04-17   ${\displaystyle \cong }$
Finding the best mathematical equation to deal with the different challenges found in complex scenarios requires a thorough understanding of the scenario and a trial and error process carried out by experts. In recent years, most state-of-the-art equation discovery methods have been widely applied in modeling and identification systems. However, equation discovery approaches can be very useful in computer vision, particularly in the field of feature extraction. In this paper, we focus on recent AI advances to present a novel framework for automatically discovering equations from scratch with little human intervention to deal with the different challenges encountered in real-world scenarios. In addition, our proposal can reduce human bias by proposing a search space design through generative network instead of hand-designed. As a proof of concept, the equations discovered by our framework are used to distinguish moving objects from the background in video sequences. Experimental results show the potential of the proposed approach and its effectiveness in discovering the best equation in video sequences. The code and data are available at: https://github.com/carolinepacheco/equation-discovery-scene-analysis
Actor-Critic Algorithm for High-dimensional Partial Differential Equations2020-10-07   ${\displaystyle \cong }$
We develop a deep learning model to effectively solve high-dimensional nonlinear parabolic partial differential equations (PDE). We follow Feynman-Kac formula to reformulate PDE into the equivalent stochastic control problem governed by a Backward Stochastic Differential Equation (BSDE) system. The Markovian property of the BSDE is utilized in designing our neural network architecture, which is inspired by the Actor-Critic algorithm usually applied for deep Reinforcement Learning. Compared to the State-of-the-Art model, we make several improvements including 1) largely reduced trainable parameters, 2) faster convergence rate and 3) fewer hyperparameters to tune. We demonstrate those improvements by solving a few well-known classes of PDEs such as Hamilton-Jacobian-Bellman equation, Allen-Cahn equation and Black-Scholes equation with dimensions on the order of 100.