Informed Equation Learning (2021-05-13)
Distilling data into compact and interpretable analytic equations is one of the goals of science. Instead, contemporary supervised machine learning methods mostly produce unstructured and dense maps from input to output. Particularly in deep learning, this property is owed to the generic nature of simple standard link functions. To learn equations rather than maps, standard non-linearities can be replaced with structured building blocks of atomic functions. However, without strong priors on sparsity and structure, representational complexity and numerical conditioning limit this direct approach. To scale to realistic settings in science and engineering, we propose an informed equation learning system. It provides a way to incorporate expert knowledge about which equation components are permitted or prohibited, as well as a domain-dependent structured sparsity prior. Our system then utilizes a robust method to learn equations with atomic functions exhibiting singularities, such as logarithm and division. We demonstrate several artificial and real-world experiments from the engineering domain, in which our system learns interpretable models of high predictive power.
Solving non-linear Kolmogorov equations in large dimensions by using deep learning: a numerical comparison of discretization schemes (2020-12-09)
Non-linear partial differential Kolmogorov equations are successfully used to describe a wide range of time-dependent phenomena in the natural sciences, engineering, and even finance. For example, in physical systems, the Allen-Cahn equation describes pattern formation associated with phase transitions. In finance, instead, the Black-Scholes equation describes the evolution of the price of derivative investment instruments. Such modern applications often require solving these equations in high-dimensional regimes in which classical approaches are ineffective. Recently, an interesting new approach based on deep learning has been introduced by E, Han, and Jentzen [1], [2]. The main idea is to construct a deep network which is trained from the samples of discrete stochastic differential equations underlying Kolmogorov's equation. The network is able to approximate the solutions of the Kolmogorov equation with polynomial complexity in whole spatial domains, therefore avoiding the curse of dimensionality. In this contribution we study variants of the deep networks by using different discretization schemes of the stochastic differential equation. We compare the performance of the associated networks on benchmark examples, and show that, for some discretization schemes, improvements in accuracy are possible without affecting the computational complexity.
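To make the idea of "different discretization schemes of the stochastic differential equation" concrete, the following minimal sketch simulates geometric Brownian motion (the SDE underlying the Black-Scholes equation) with two textbook schemes, Euler-Maruyama and Milstein. It is an illustration only: the abstract does not say which schemes the paper compares, no neural network is trained here, and the function names and parameter values (`simulate_gbm`, `monte_carlo_mean`, drift 0.05, volatility 0.2) are assumptions chosen for the example.

```python
import math
import random


def simulate_gbm(x0, mu, sigma, horizon, steps, rng, scheme="euler"):
    """Simulate one path of dX = mu*X dt + sigma*X dW and return X at the horizon."""
    dt = horizon / steps
    x = x0
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        step = mu * x * dt + sigma * x * dw  # Euler-Maruyama update
        if scheme == "milstein":
            # Milstein adds a correction term derived from the diffusion coefficient.
            step += 0.5 * sigma**2 * x * (dw * dw - dt)
        x += step
    return x


def monte_carlo_mean(scheme, paths=5000, seed=0):
    """Estimate E[X_T] by averaging simulated terminal values (assumed toy parameters)."""
    rng = random.Random(seed)
    total = sum(simulate_gbm(1.0, 0.05, 0.2, 1.0, 20, rng, scheme) for _ in range(paths))
    return total / paths


# Closed-form E[X_T] for geometric Brownian motion with x0 = 1, mu = 0.05, T = 1.
exact = math.exp(0.05)
euler_mean = monte_carlo_mean("euler")
milstein_mean = monte_carlo_mean("milstein")
print(exact, euler_mean, milstein_mean)
```

Both schemes cost one Gaussian sample per step, so swapping one for the other changes the per-path accuracy but not the order of the computational cost, which is the kind of trade-off the abstract refers to.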
A Neuro-Symbolic Method for Solving Differential and Functional Equations (2020-11-04)
When neural networks are used to solve differential equations, they usually produce solutions in the form of black-box functions that are not directly mathematically interpretable. We introduce a method for generating symbolic expressions to solve differential equations while leveraging deep learning training methods. Unlike existing methods, our system does not require learning a language model over symbolic mathematics, making it scalable, compact, and easily adaptable for a variety of tasks and configurations. As part of the method, we propose a novel neural architecture for learning mathematical expressions to optimize a customizable objective. The system is designed to always return a valid symbolic formula, generating a useful approximation when an exact analytic solution to a differential equation is not or cannot be found. We demonstrate through examples how our method can be applied on a number of differential equations, often obtaining symbolic approximations that are useful or insightful. Furthermore, we show how the system can be effortlessly generalized to find symbolic solutions to other mathematical tasks, including integration and functional equations.
Equation Embeddings (2018-03-24)
We present an unsupervised approach for discovering semantic representations of mathematical equations. Equations are challenging to analyze because each is unique, or nearly unique. Our method, which we call equation embeddings, finds good representations of equations by using the representations of their surrounding words. We used equation embeddings to analyze four collections of scientific articles from the arXiv, covering four computer science domains (NLP, IR, AI, and ML) and $\sim$98.5k equations. Quantitatively, we found that equation embeddings provide better models when compared to existing word embedding approaches. Qualitatively, we found that equation embeddings provide coherent semantic representations of equations and can capture semantic similarity to other equations and to words.
Inference of Stochastic Dynamical Systems from Cross-Sectional Population Data (2020-12-09)
Inferring the driving equations of a dynamical system from population or time-course data is important in several scientific fields such as biochemistry, epidemiology, financial mathematics and many others. Despite the existence of algorithms that learn the dynamics from trajectorial measurements, there are few attempts to infer the dynamical system directly from population data. In this work, we deduce and then computationally estimate the Fokker-Planck equation which describes the evolution of the population's probability density, based on stochastic differential equations. Then, following the USDL approach, we project the Fokker-Planck equation onto a proper set of test functions, transforming it into a linear system of equations. Finally, we apply sparse inference methods to solve the latter system and thus infer the driving forces of the dynamical system. Our approach is illustrated on both synthetic and real data, including non-linear, multimodal stochastic differential equations, biochemical reaction networks, and mass cytometry biological measurements.
Evolutional Deep Neural Network (2021-03-17)
The notion of an Evolutional Deep Neural Network (EDNN) is introduced for the solution of partial differential equations (PDE). The parameters of the network are trained to represent the initial state of the system only, and are subsequently updated dynamically, without any further training, to provide an accurate prediction of the evolution of the PDE system. In this framework, the network parameters are treated as functions with respect to the appropriate coordinate and are numerically updated using the governing equations. By marching the neural network weights in the parameter space, EDNN can predict state-space trajectories that are indefinitely long, which is difficult for other neural network approaches. Boundary conditions of the PDEs are treated as hard constraints, are embedded into the neural network, and are therefore exactly satisfied throughout the entire solution trajectory. Several applications including the heat equation, the advection equation, the Burgers equation, the Kuramoto-Sivashinsky equation and the Navier-Stokes equations are solved to demonstrate the versatility and accuracy of EDNN. The application of EDNN to the incompressible Navier-Stokes equation embeds the divergence-free constraint into the network design so that the projection of the momentum equation to solenoidal space is implicitly achieved. The numerical results verify the accuracy of EDNN solutions relative to analytical and benchmark numerical solutions, both for the transient dynamics and statistics of the system.
Deep Forward-Backward SDEs for Min-max Control (2019-06-11)
This paper presents a novel approach to numerically solve stochastic differential games for nonlinear systems. The proposed approach relies on the nonlinear Feynman-Kac theorem, which establishes a connection between parabolic deterministic partial differential equations and forward-backward stochastic differential equations. Using this theorem, the Hamilton-Jacobi-Isaacs partial differential equation associated with differential games is represented by a system of forward-backward stochastic differential equations. Numerical solution of the aforementioned system of stochastic differential equations is performed using importance sampling and a Long Short-Term Memory recurrent neural network, which is trained in an offline fashion. The resulting algorithm is tested on two example systems in simulation and compared against the standard risk-neutral stochastic optimal control formulations.
Learning To Solve Differential Equations Across Initial Conditions (2020-04-19)
Recently, there has been a lot of interest in using neural networks for solving partial differential equations. A number of neural network-based partial differential equation solvers have been formulated that perform on par with, and in some cases even better than, classical solvers. However, these neural solvers, in general, need to be retrained each time the initial conditions or the domain of the partial differential equation changes. In this work, we pose the problem of approximating the solution of a fixed partial differential equation for arbitrary initial conditions as learning a conditional probability distribution. We demonstrate the utility of our method on Burgers' equation.
Data-driven peakon and periodic peakon travelling wave solutions of some nonlinear dispersive equations via deep learning (2021-01-12)
In the field of mathematical physics, there exist many physically interesting nonlinear dispersive equations with peakon solutions, which are solitary waves with a discontinuous first-order derivative at the wave peak. In this paper, we apply multi-layer physics-informed neural networks (PINNs) to study the data-driven peakon and periodic peakon solutions of some well-known nonlinear dispersive equations with initial-boundary value conditions, such as the Camassa-Holm (CH) equation, the Degasperis-Procesi equation, the modified CH equation with cubic nonlinearity, the Novikov equation with cubic nonlinearity, the mCH-Novikov equation, the b-family equation with quartic nonlinearity, and the generalized modified CH equation with quintic nonlinearity. These results will be useful for further study of the peakon solutions and the corresponding experimental design of nonlinear dispersive equations.
StarNet: Gradient-free Training of Deep Generative Models using Determined System of Linear Equations (2021-01-03)
In this paper we present an approach for training deep generative models solely based on solving determined systems of linear equations. A network that uses this approach, called a StarNet, has the following desirable properties: 1) training requires no gradient, as the solution to the system of linear equations is not stochastic, 2) it is highly scalable when solving the system of linear equations w.r.t. the latent codes, and similarly for the parameters of the model, and 3) it gives desirable least-squares bounds for the estimation of latent codes and network parameters within each layer.
Probabilistic Grammars for Equation Discovery (2020-12-01)
Equation discovery, also known as symbolic regression, is a type of automated modeling that discovers scientific laws, expressed in the form of equations, from observed data and expert knowledge. Deterministic grammars, such as context-free grammars, have been used to limit the search spaces in equation discovery by providing hard constraints that specify which equations to consider and which not. In this paper, we propose the use of probabilistic context-free grammars in the context of equation discovery. Such grammars encode soft constraints on the space of equations, specifying a prior probability distribution on the space of possible equations. We show that probabilistic grammars can be used to elegantly and flexibly formulate the parsimony principle, that favors simpler equations, through probabilities attached to the rules in the grammars. We demonstrate that the use of probabilistic, rather than deterministic grammars, in the context of a Monte-Carlo algorithm for grammar-based equation discovery, leads to more efficient equation discovery. Finally, by specifying prior probability distributions over equation spaces, the foundations are laid for Bayesian approaches to equation discovery.
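As a rough illustration of how rule probabilities in a probabilistic context-free grammar induce a prior that favors simpler equations, the sketch below samples expressions from a small toy grammar and tracks the log-probability of each derivation. The grammar, its probabilities, and the names `GRAMMAR` and `sample` are hypothetical choices made for this example; they are not taken from the paper.

```python
import math
import random

# Hypothetical toy grammar: each nonterminal maps to (probability, right-hand side) pairs.
# Recursive rules get probability 0.3, so longer expressions become less likely.
GRAMMAR = {
    "E": [(0.3, ["E", "+", "T"]), (0.7, ["T"])],
    "T": [(0.3, ["T", "*", "V"]), (0.7, ["V"])],
    "V": [(1 / 3, ["x"]), (1 / 3, ["y"]), (1 / 3, ["c"])],
}


def sample(symbol, rng):
    """Expand `symbol`; return (terminal tokens, log-probability of the derivation)."""
    if symbol not in GRAMMAR:
        return [symbol], 0.0  # terminals are emitted unchanged
    rules = GRAMMAR[symbol]
    prob, rhs = rng.choices(rules, weights=[p for p, _ in rules])[0]
    tokens, log_prob = [], math.log(prob)
    for child in rhs:
        child_tokens, child_log_prob = sample(child, rng)
        tokens += child_tokens
        log_prob += child_log_prob
    return tokens, log_prob


rng = random.Random(0)
samples = [sample("E", rng) for _ in range(1000)]

# Prior probability of the simplest expressions (a single variable or constant).
single_variable = [math.exp(lp) for tokens, lp in samples if len(tokens) == 1]
for tokens, log_prob in samples[:5]:
    print(" ".join(tokens), math.exp(log_prob))
```

In this toy grammar every additional operator multiplies the derivation probability by a factor below one, so a single variable is the most probable expression. This is one simple way the parsimony principle can be expressed through rule probabilities.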
Modeling the Dynamics of PDE Systems with Physics-Constrained Deep Auto-Regressive Networks (2019-12-03)
In recent years, deep learning has proven to be a viable methodology for surrogate modeling and uncertainty quantification for a vast number of physical systems. However, in their traditional form, such models can require a large amount of training data. This is of particular importance for various engineering and scientific applications where data may be extremely expensive to obtain. To overcome this shortcoming, physics-constrained deep learning provides a promising methodology as it only utilizes the governing equations. In this work, we propose a novel auto-regressive dense encoder-decoder convolutional neural network to solve and model non-linear dynamical systems without training data at a computational cost that is potentially orders of magnitude lower than standard numerical solvers. This model includes a Bayesian framework that allows for uncertainty quantification of the predicted quantities of interest at each time-step. We rigorously test this model on several non-linear transient partial differential equation systems, including the turbulence of the Kuramoto-Sivashinsky equation, multi-shock formation and interaction with the 1D Burgers' equation, and 2D wave dynamics with coupled Burgers' equations. For each system, the predictive results and uncertainty are presented and discussed together with comparisons to the results obtained from traditional numerical analysis methods.
Asymptotics of Reinforcement Learning with Neural Networks (2019-11-13)
We prove that a single-layer neural network trained with the Q-learning algorithm converges in distribution to a random ordinary differential equation as the size of the model and the number of training steps become large. Analysis of the limit differential equation shows that it has a unique stationary solution which is the solution of the Bellman equation, thus giving the optimal control for the problem. In addition, we study the convergence of the limit differential equation to the stationary solution. As a by-product of our analysis, we obtain the limiting behavior of single-layer neural networks when trained on i.i.d. data with stochastic gradient descent under the widely-used Xavier initialization.
Automated Mathematical Equation Structure Discovery for Visual Analysis (2021-04-17)
Finding the best mathematical equation to deal with the different challenges found in complex scenarios requires a thorough understanding of the scenario and a trial-and-error process carried out by experts. In recent years, most state-of-the-art equation discovery methods have been applied in modeling and identification systems. However, equation discovery approaches can also be very useful in computer vision, particularly in the field of feature extraction. In this paper, we build on recent AI advances to present a novel framework for automatically discovering equations from scratch with little human intervention to deal with the different challenges encountered in real-world scenarios. In addition, our proposal can reduce human bias by designing the search space with a generative network rather than by hand. As a proof of concept, the equations discovered by our framework are used to distinguish moving objects from the background in video sequences. Experimental results show the potential of the proposed approach and its effectiveness in discovering the best equation in video sequences. The code and data are available at: https://github.com/carolinepacheco/equation-discovery-scene-analysis
Data-Driven Discovery of Coarse-Grained Equations (2020-07-27)
Statistical (machine learning) tools for equation discovery require large amounts of data that are typically computer generated rather than experimentally observed. Multiscale modeling and stochastic simulations are two areas where learning on simulated data can lead to such discovery. In both, the data are generated with a reliable but impractical model, e.g., molecular dynamics simulations, while a model on the scale of interest is uncertain, requiring phenomenological constitutive relations and ad-hoc approximations. We replace the human discovery of such models, which typically involves spatial/stochastic averaging or coarse-graining, with a machine-learning strategy based on sparse regression that can be executed in two modes. The first, direct equation-learning, discovers a differential operator from the whole dictionary. The second, constrained equation-learning, discovers only those terms in the differential operator that need to be discovered, i.e., learns closure approximations. We illustrate our approach by learning a deterministic equation that governs the spatiotemporal evolution of the probability density function of a system state whose dynamics are described by a nonlinear partial differential equation with random inputs. A series of examples demonstrates the accuracy, robustness, and limitations of our approach to equation discovery.
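The core idea of sparse regression over a dictionary of candidate terms can be sketched on a much simpler problem than the one treated in the paper. The example below recovers the logistic ODE dx/dt = x - x^2 from samples of its solution, using sequentially thresholded least squares. Everything here is an assumption made for illustration: the ODE setting (rather than a PDE for a probability density), the dictionary [1, x, x^2, x^3], the threshold 0.1, the finite-difference derivative, and the helper names `solve_least_squares` and `stlsq`. The abstract does not specify which sparse solver the authors use.

```python
import math


def solve_least_squares(A, b):
    """Minimise ||A w - b||_2 via the normal equations and Gauss-Jordan elimination."""
    n = len(A[0])
    # Augmented matrix [A^T A | A^T b].
    M = [[sum(row[i] * row[j] for row in A) for j in range(n)]
         + [sum(row[i] * bi for row, bi in zip(A, b))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]  # partial pivoting
        for k in range(n):
            if k != col:
                factor = M[k][col] / M[col][col]
                M[k] = [a - factor * c for a, c in zip(M[k], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def stlsq(theta, target, threshold=0.1, iterations=10):
    """Sequentially thresholded least squares: refit after dropping small coefficients."""
    n = len(theta[0])
    active = list(range(n))
    coeffs = [0.0] * n
    for _ in range(iterations):
        if not active:
            break
        sub = [[row[j] for j in active] for row in theta]
        weights = solve_least_squares(sub, target)
        coeffs = [0.0] * n
        for j, w in zip(active, weights):
            coeffs[j] = w
        kept = [j for j in active if abs(coeffs[j]) >= threshold]
        if kept == active:
            break
        active = kept
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]


# Samples of the logistic solution x(t) = 1 / (1 + 9 exp(-t)), which satisfies dx/dt = x - x^2.
h = 0.01
xs = [1.0 / (1.0 + 9.0 * math.exp(-k * h)) for k in range(1001)]
# Central finite differences approximate dx/dt at the interior points.
dxdt = [(xs[k + 1] - xs[k - 1]) / (2 * h) for k in range(1, len(xs) - 1)]
# Dictionary of candidate terms evaluated at the same points: 1, x, x^2, x^3.
theta = [[1.0, x, x * x, x ** 3] for x in xs[1:-1]]

coeffs = stlsq(theta, dxdt)
print(coeffs)  # coefficients for [1, x, x^2, x^3]
```

The constrained mode described in the abstract would correspond to fixing the coefficients of terms that are already known and regressing only the remaining ones; that variant is not shown here.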
Data-Driven Continuum Dynamics via Transport-Teleport Duality (2020-06-30)
In recent years, machine learning methods have been widely used to study physical systems that are challenging to solve with governing equations. Physicists and engineers are framing the data-driven paradigm as an alternative approach to physical sciences. In this paradigm change, the deep learning approach is playing a pivotal role. However, most learning architectures do not inherently incorporate conservation laws in the form of continuity equations, and they require dense data to learn the dynamics of conserved quantities. In this study, we introduce a clever mathematical transform to represent the classical dynamics as a point-wise process of disappearance and reappearance of a quantity, which dramatically reduces model complexity and training data for machine learning of transport phenomena. We demonstrate that just a few observational data and a simple learning model can be enough to learn the dynamics of real-world objects. The approach does not require the explicit use of governing equations and only depends on observation data. Because the continuity equation is a general equation that any conserved quantity should obey, the applicability may range from physical to social and medical sciences or any field where data are conserved quantities.
Partial Differential Equations is All You Need for Generating Neural Architectures -- A Theory for Physical Artificial Intelligence Systems (2021-03-09)
In this work, we generalize the reaction-diffusion equation in statistical physics, the Schrödinger equation in quantum mechanics, and the Helmholtz equation in paraxial optics into the neural partial differential equations (NPDE), which can be considered as the fundamental equations in the field of artificial intelligence research. We use the finite difference method to discretize the NPDE and find numerical solutions, which generates the basic building blocks of deep neural network architectures, including the multi-layer perceptron, convolutional neural networks, and recurrent neural networks. Learning strategies, such as adaptive moment estimation, L-BFGS, pseudoinverse learning algorithms, and partial differential equation constrained optimization, are also presented. We believe it is significant that this work presents a clear physical picture of interpretable deep neural networks, which makes it possible to apply them to analog computing device design and paves the way to physical artificial intelligence.
Physics-informed neural networks for the shallow-water equations on the sphere (2021-04-01)
We propose the use of physics-informed neural networks for solving the shallow-water equations on the sphere. Physics-informed neural networks are trained to satisfy the differential equations along with the prescribed initial and boundary data, and thus can be seen as an alternative approach to solving differential equations compared to traditional numerical approaches such as finite difference, finite volume or spectral methods. We discuss the training difficulties of physics-informed neural networks for the shallow-water equations on the sphere and propose a simple multi-model approach to tackle test cases of comparatively long time intervals. We illustrate the abilities of the method by solving the most prominent test cases proposed by Williamson et al. [J. Comput. Phys. 102, 211-224, 1992].
General solutions for nonlinear differential equations: a rule-based self-learning approach using deep reinforcement learning (2019-05-29)
A universal rule-based self-learning approach using deep reinforcement learning (DRL) is proposed for the first time to solve nonlinear ordinary differential equations and partial differential equations. The solver consists of a deep neural network-structured actor that outputs candidate solutions, and a critic derived only from physical rules (governing equations and boundary and initial conditions). Solutions in discretized time are treated as multiple tasks sharing the same governing equation, and the current step parameters provide an ideal initialization for the next owing to the temporal continuity of the solutions, which shows a transfer learning characteristic and indicates that the DRL solver has captured the intrinsic nature of the equation. The approach is verified through solving the Schrödinger, Navier-Stokes, Burgers', Van der Pol, and Lorenz equations and an equation of motion. The results indicate that the approach gives solutions with high accuracy, and the solution process promises to get faster.
Discovery of Nonlinear Dynamical Systems using a Runge-Kutta Inspired Dictionary-based Sparse Regression Approach (2021-05-11)
Discovering dynamical models that describe the underlying dynamical behavior is essential for drawing decisive conclusions and for engineering studies, e.g., optimizing a process. Although the availability of experimental data has increased significantly, interpretable and explainable models in science and engineering remain elusive. In this work, we blend machine learning and dictionary-based learning with numerical analysis tools to discover governing differential equations from noisy and sparsely-sampled measurement data. We utilize the fact that, given a dictionary containing a large number of candidate nonlinear functions, dynamical models can often be described by a few appropriately chosen candidates. As a result, we obtain interpretable and parsimonious models, which tend to generalize better beyond the sampling regime. Additionally, we integrate a numerical integration framework with dictionary learning that yields differential equations without requiring or approximating derivative information at any stage. Hence, it is particularly effective for corrupted and sparsely-sampled data. We discuss its extension to governing equations containing rational nonlinearities that typically appear in biological networks. Moreover, we generalize the method to governing equations that are subject to parameter variations and externally controlled inputs. We demonstrate the efficiency of the method in discovering a number of diverse differential equations using noisy measurements, including a model describing neural dynamics, the chaotic Lorenz model, Michaelis-Menten kinetics, and a parameterized Hopf normal form.