Stochastic (ordinary) differential equations, or SDEs, are differential equations that describe certain random processes. They are used in statistical physics to model systems subject to both deterministic and random influences. The simplest example is a classical particle in a potential coupled to a heat bath; physicists often refer to the corresponding class of equations as ‘the’ Langevin equation.
The weather, and the climate, can be modelled using similar but more complicated SDEs. For example, the El Niño/Southern Oscillation is an important phenomenon centered in the Pacific Ocean; it displays irregular oscillatory behavior and has been studied using stochastic differential equations. For more, see ENSO.
Numerical SDE solvers can also be used to synthesize random processes, producing synthetic time series on which techniques of time series analysis can be tested when real data are not available.
This page is about ordinary stochastic differential equations only; for the partial case, see stochastic partial differential equations.
We will concentrate on one-dimensional ordinary SDEs, the Itô calculus, and integration with respect to Brownian motion (also known as the ‘Wiener process’).
We’ll deliberately switch to the sloppy physicist’s language in this paragraph.
We have a system that can be modelled as a particle moving in a potential $V$, subject to friction proportional to its velocity and to random influences from the coupling to a heat bath. Let $t \in \mathbb{R}$ be the time, and let $x(t)$ denote the space coordinate of our particle and $\dot{x}$ its time derivative.
As a non-relativistic classical particle, its equation of motion is

$$ m \ddot{x} = - V'(x) - \gamma \dot{x} + \sqrt{2D} \, \dot{W} $$
Here the constant $m$ is the mass of our particle. The constant $\gamma$ describes a frictional force that depends linearly on the velocity of the particle; this ansatz for the friction is called Stokes friction.
It is the term $\sqrt{2D} \; \dot{W}$ that makes this differential equation stochastic: it describes the influence of the heat bath. The constant $D$ is often called the diffusion constant or diffusion coefficient. $W$ is not a fixed function of time; it is a random function, or technically a random process called the Wiener process, also known as Brownian motion or red noise. $\dot{W}$ is used to denote the ‘time derivative’ of the Wiener process, which is called white noise. Physicists often speak as if white noise were a random process. Strictly speaking there is no random process that is the time derivative of the Wiener process, but it is possible to construct white noise as a distribution-valued random process in a framework called the ‘white noise calculus’. Luckily, it is not necessary to understand this in any more detail for the following.
If we assume our particle moves in the ‘high friction limit’, where the acceleration is negligible, we can absorb the constants $m$ and $\gamma$ into the potential function $V$ and the diffusion constant $D$ and rewrite the equation as

$$ \dot{x} = - V'(x) + \sqrt{2D} \, \dot{W} $$
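As a concrete illustration, here is a minimal simulation of this high-friction equation, using a simple Euler discretization. This is a sketch: the double-well potential and all parameter values are illustrative choices, not taken from the text.

```python
import numpy as np

# High-friction (overdamped) Langevin dynamics in a double-well potential
# V(x) = x^4/4 - x^2/2, so that -V'(x) = x - x^3.  The potential and all
# parameter values are illustrative choices, not taken from the text.
rng = np.random.default_rng(seed=42)

def simulate_langevin(x0=0.0, D=0.25, dt=1e-3, n_steps=50_000):
    """Simple Euler discretization of dx = -V'(x) dt + sqrt(2D) dW."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    for n in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))   # Brownian increment ~ N(0, dt)
        x[n + 1] = x[n] + (x[n] - x[n]**3) * dt + np.sqrt(2 * D) * dW
    return x

path = simulate_langevin()   # one sample path
```

For small $D$ the particle lingers near one of the two minima at $x = \pm 1$ and occasionally jumps over the barrier at $x = 0$.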
This is the way that physicists, chemists, etc. usually write the stochastic differential equation

$$ dX_t = - V'(X_t) \, dt + \sqrt{2D} \, dW_t $$
Every Langevin equation, or more generally every SDE of the form $dX_t = a(X_t, t) \, dt + b(X_t, t) \, dW_t$,
has an associated Fokker-Planck equation (as physicists say), a.k.a. Kolmogorov forward equation (as mathematicians say), that describes the time evolution of the probability distribution of the solution process. From the physicist’s point of view, this probability distribution tells us the probability of finding the particle at a certain time in a certain place.
The mathematical theorem making all of this precise is the Feynman-Kac formula.
In our example the associated Fokker-Planck equation is

$$ \partial_t P = \partial_x \left( V'(x) \, P \right) + D \, \partial_x^2 P $$
with $P(x, t \mid x', t')$ denoting the probability to find the particle at time $t$ at position $x$ if it was at time $t'$ at position $x'$.
If the particle starts at $x'$ at time $0$, then our initial condition is $P(x, 0 \mid x', 0) = \delta(x - x')$.
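For the overdamped equation, the stationary solution of the Fokker-Planck equation is the Boltzmann distribution $P_\infty(x) \propto e^{-V(x)/D}$, which gives a quick way to sanity-check simulations. Here is a sketch with illustrative parameters, using the harmonic potential $V(x) = x^2/2$, for which the stationary density is Gaussian with variance $D$:

```python
import numpy as np

# Ornstein-Uhlenbeck process: V(x) = x^2/2, so dx = -x dt + sqrt(2D) dW.
# Its stationary density, the Boltzmann distribution exp(-V(x)/D) up to
# normalization, is a Gaussian with mean 0 and variance D.
# All parameter values below are illustrative.
rng = np.random.default_rng(seed=0)
D, dt, n_steps = 0.5, 1e-2, 200_000

x = 0.0
samples = np.empty(n_steps)
for n in range(n_steps):
    x += -x * dt + np.sqrt(2 * D) * rng.normal(0.0, np.sqrt(dt))
    samples[n] = x

# Discard the first half as burn-in; the empirical variance of the rest
# should be close to D.
variance = samples[n_steps // 2:].var()
```

A histogram of `samples` can likewise be compared against $e^{-V(x)/D}$ directly.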
We’ll collect some mathematical theorems and other results that are needed for the models and simulations on the Azimuth project here. For starters, note that the usual chain rule of calculus is not valid for Itô stochastic processes, but one has to use the Itô formula instead.
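For reference, here is the standard one-dimensional Itô formula, stated in the notation of this page: for a process with $dX_t = a(X_t, t) \, dt + b(X_t, t) \, dW_t$ and a twice continuously differentiable function $f$,

```latex
\mathrm{d}f(X_t) = \left( a(X_t, t) \, f'(X_t)
  + \tfrac{1}{2} \, b(X_t, t)^2 \, f''(X_t) \right) \mathrm{d}t
  + b(X_t, t) \, f'(X_t) \, \mathrm{d}W_t
```

The extra term $\tfrac{1}{2} b^2 f''$ is exactly what distinguishes the Itô formula from the ordinary chain rule.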
Attractors play an important role in the study of the asymptotic behavior of dynamical systems. There is an extensive literature dealing with attractors of deterministic dynamical systems; for stochastic dynamical systems, however, comparatively little progress has been made so far.
Two different approaches to stochastic bifurcation have been pursued so far, the “physicist’s” (P) and the “dynamical” (D) approach, see [L. Arnold et al. (1995)]. Suppose we are given a random dynamical system that depends on a parameter. The approach called (P)-bifurcation then deals with qualitative changes of stationary measures as the parameter is varied. It is particularly useful for stochastic differential equations, since densities of stationary measures are delivered by the Fokker–Planck equation. (P)-bifurcations have been studied intensively by engineers […]
The approach called (D)-bifurcation studies the loss of stability of invariant measures and the occurrence of new invariant measures when a parameter is varied. It is tied to sample stability through the use of Lyapunov exponents (given by the multiplicative ergodic theorem) to characterize stability of invariant measures. This approach has proven to be very successful.
It is possible to solve SDEs numerically, in the sense that one can compute numerical approximations to certain properties of the stochastic process that is the solution of a given SDE. Algorithms for solving SDEs numerically are generalizations of algorithms for solving ordinary differential equations (ODEs).
For some general terminology see discrete approximation scheme.
Let $X_t$ be an Itô process that is a solution to the SDE

$$ dX_t = a(X_t, t) \, dt + b(X_t, t) \, dW_t $$

on $[0, T]$ with the initial condition $X_0 = C \in \mathbb{R}$. For a given equidistant discretization $0 = \tau_0 < \tau_1 < \dots < \tau_N = T$ of $[0, T]$, an Euler approximation is a stochastic process $Y_t$ satisfying the recursion relation

$$ Y_{n+1} = Y_n + a(Y_n, \tau_n) \, (\tau_{n+1} - \tau_n) + b(Y_n, \tau_n) \left( W(\tau_{n+1}) - W(\tau_n) \right) $$
and $Y_0 = C$. If we compare this approximation to the usual Euler scheme for an ODE, we see that in order to compute $Y$ numerically using a computer, we need random numbers with a Gaussian distribution, since $W(\tau_{n+1}) - W(\tau_n)$ is distributed according to $N(0, \tau_{n+1} - \tau_n)$. For this, we need a random number generator.
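A sketch of this Euler scheme in Python; the function name and the example SDE are illustrative choices:

```python
import numpy as np

def euler_maruyama(a, b, x0, T, n_steps, rng):
    """Euler approximation Y_{n+1} = Y_n + a(Y_n, t_n) dt + b(Y_n, t_n) dW_n,
    with Brownian increments dW_n drawn from N(0, dt)."""
    dt = T / n_steps
    y = np.empty(n_steps + 1)
    y[0] = x0
    for n in range(n_steps):
        t = n * dt
        dW = rng.normal(0.0, np.sqrt(dt))  # the required Gaussian random numbers
        y[n + 1] = y[n] + a(y[n], t) * dt + b(y[n], t) * dW
    return y

# Example: Ornstein-Uhlenbeck process dX = -X dt + 0.3 dW.
rng = np.random.default_rng(seed=1)
path = euler_maruyama(lambda x, t: -x, lambda x, t: 0.3, 1.0, 5.0, 1000, rng)
```

This is the Euler scheme for SDEs (often called Euler-Maruyama); passing `a` and `b` as callables keeps it reusable for any scalar SDE of the form above.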
The simplest scheme that achieves a higher order of strong convergence (see below) than the Euler scheme is the Milstein scheme.
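A sketch of the Milstein recursion in Python (an illustrative implementation): relative to the Euler scheme it adds the correction term $\tfrac{1}{2} \, b \, b' \, \big( (\Delta W)^2 - \Delta t \big)$, which raises the strong order from 0.5 to 1.0.

```python
import numpy as np

def milstein(a, b, db, x0, T, n_steps, rng):
    """Milstein scheme: the Euler step plus the correction
    0.5 * b * b' * (dW**2 - dt).  `db` is the derivative of b in x."""
    dt = T / n_steps
    y = np.empty(n_steps + 1)
    y[0] = x0
    for n in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))
        y[n + 1] = (y[n]
                    + a(y[n]) * dt
                    + b(y[n]) * dW
                    + 0.5 * b(y[n]) * db(y[n]) * (dW**2 - dt))
    return y

# Geometric Brownian motion dX = mu X dt + sigma X dW, where b'(x) = sigma.
rng = np.random.default_rng(seed=2)
mu, sigma = 0.1, 0.4
path = milstein(lambda x: mu * x, lambda x: sigma * x, lambda x: sigma,
                1.0, 1.0, 500, rng)
```

Note that the scheme needs the derivative of the diffusion coefficient; for additive noise ($b' = 0$) it reduces to the Euler scheme.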
We will use the nomenclature of the preceding paragraph about the Euler scheme.
We need to generalize the concept of convergence of a discrete approximation to the solution of an ODE to the stochastic setting. There are several ways to do this; two common ones are:
Let $\delta := \max (|\tau_n - \tau_{n+1}|)$ be the maximum time difference of a given discretization of $[0, T]$.
Strong Convergence
A discrete approximation $Y_t$ to an Itô process $X_t$ converges strongly if

$$ \lim_{\delta \to 0} E \left( | X_T - Y_T | \right) = 0 $$

It is said to be of strong order $\gamma$ if there is a constant $C$ such that

$$ E \left( | X_T - Y_T | \right) \leq C \, \delta^{\gamma} $$
As we see from the definition, a strong approximation needs to approximate the original process $X_t$ pathwise.
Weak Convergence
A discrete approximation $Y_t$ to an Itô process $X_t$ converges weakly for a given class of functions $G$, if for every function $g \in G$ we have

$$ \lim_{\delta \to 0} \left| E(g(X_T)) - E(g(Y_T)) \right| = 0 $$

where we assume that all appearing expectation values exist and are finite. It is said to be of weak order $\gamma$ with respect to $G$ if there is a constant $C$ such that for every function $g \in G$:

$$ \left| E(g(X_T)) - E(g(Y_T)) \right| \leq C \, \delta^{\gamma} $$
We speak simply of weak convergence when $G$ is the set of all polynomials.
So weak convergence means that the approximation needs to approximate the probability distribution of the original process $X_t$. Weak and strong convergence are indeed different concepts; for example:
The Euler scheme is of strong order 0.5 and of weak order 1.0.
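The strong order can be checked empirically. Here is a sketch (all parameter values illustrative) that estimates the strong order of the Euler scheme on geometric Brownian motion, for which an exact solution is available for comparison:

```python
import numpy as np

# Empirical estimate of the strong order of the Euler scheme, using
# geometric Brownian motion dX = mu X dt + sigma X dW, which has the
# exact solution X_T = X_0 exp((mu - sigma^2/2) T + sigma W_T).
# All parameter values are illustrative.
rng = np.random.default_rng(seed=3)
mu, sigma, x0, T, n_paths = 0.05, 0.5, 1.0, 1.0, 5000

steps = [2**k for k in range(4, 9)]
errors = []
for n in steps:
    dt = T / n
    dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
    y = np.full(n_paths, x0)
    for i in range(n):                    # Euler scheme, all paths at once
        y = y + mu * y * dt + sigma * y * dW[:, i]
    W_T = dW.sum(axis=1)                  # terminal value of each Brownian path
    x_exact = x0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * W_T)
    errors.append(np.mean(np.abs(x_exact - y)))   # pathwise (strong) error

# log-log slope of error versus number of steps; minus the slope
# estimates the strong order, which should come out near 0.5.
slope, _ = np.polyfit(np.log(steps), np.log(errors), 1)
```

Crucially, the exact solution and the Euler approximation are evaluated on the *same* Brownian path, since strong convergence is a pathwise notion.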
Strong approximations are needed if we are interested in properties of the process $X_t$ that depend on its sample paths, for example when we are interested in the trajectories of projectiles. Weak approximations are needed when we are interested in properties of the probability distribution of $X_t$, for example when we would like to compute functionals of $X_t$: according to the Feynman-Kac formula, we can use weak approximations to solve the associated Fokker-Planck equation numerically. This may be an interesting alternative to numerical methods for parabolic partial differential equations, such as Fokker-Planck equations, in high dimensions or with complex boundary conditions.
Applications of stochastic differential equations to climate physics are described here:
In particular, the following article describes some very simple models of long-term climate variability:
As a sample, here is the simplest one:
This equation describes the interaction between the atmosphere and the ocean’s surface. Here $T_a$ is the atmosphere temperature and $T_s$ is the ocean surface temperature, which are idealized as functions only of time. The constants $c_a$ and $c_s$ describe the heat capacity of the atmosphere and the ocean, respectively. In this idealization the atmosphere and ocean temperatures are assumed to interact with each other and with cold outer space in a linear way, following Newton’s law of cooling:
The rate of heat loss of a body is proportional to the difference in temperatures between the body and its surroundings.
These linear interactions are described by the constants $b_d, b_s, b_2, b_3$. Finally, we assume the atmosphere is affected by random fluctuations described by the $\sigma \dot{W}$ term: here $\dot{W}$ is white noise, and $\sigma$ is a constant.
For a somewhat realistic model, we assume the heat capacity of the ocean surface is much greater than that of the atmosphere: $c_s \gg c_a$.
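The qualitative effect can be illustrated with a quick simulation. Note that this is *not* the article's model: the couplings and all parameter values (`lam`, `kappa`, `sigma`, the heat capacities) are invented for this sketch, in place of the constants $b_d, b_s, b_2, b_3$, which are not reproduced here.

```python
import numpy as np

# Illustrative two-box "atmosphere + ocean surface" model in the spirit of
# the text.  NOT the article's model: couplings and parameter values are
# invented for this sketch.  Both boxes exchange heat following Newton's
# law of cooling and lose heat to space; only the fast atmosphere is
# forced by white noise, and c_s >> c_a.
rng = np.random.default_rng(seed=4)
c_a, c_s = 1.0, 50.0              # heat capacities, atmosphere << ocean
lam, kappa, sigma = 2.0, 0.5, 1.0  # coupling, cooling, noise strength
dt, n_steps = 0.01, 200_000

Ta, Ts = 0.0, 0.0
Ta_hist = np.empty(n_steps)
Ts_hist = np.empty(n_steps)
for n in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt))
    Ta += ((-lam * (Ta - Ts) - kappa * Ta) * dt + sigma * dW) / c_a
    Ts += (-lam * (Ts - Ta) - kappa * Ts) * dt / c_s
    Ta_hist[n] = Ta
    Ts_hist[n] = Ts
```

The slow ocean acts as a low-pass filter on the fast atmospheric noise: $T_s$ fluctuates much less, and much more slowly, than $T_a$.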
Starting from these assumptions one can work out the power spectrum of both the atmosphere and ocean surface temperatures, $T_a$ and $T_s$. The power spectrum of a random process is the square of the absolute value of its Fourier transform: this measures how ‘strong’ that process is at various frequencies. It turns out that the power spectrum for $T_s$ is roughly flat for frequencies below a certain crossover frequency
and drops off in a manner like the power spectrum of ‘red noise’ at higher frequencies. As mentioned above, red noise is another name for Brownian motion, and it has power inversely proportional to the square of frequency.
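The claim that red noise has power inversely proportional to the square of the frequency is easy to check numerically. A sketch, with illustrative values:

```python
import numpy as np

# Numerical check that Brownian motion ("red noise") has a power spectrum
# falling off like 1/f^2.  Parameter values are illustrative.
rng = np.random.default_rng(seed=5)
n, dt = 2**18, 1.0
W = np.cumsum(rng.normal(0.0, np.sqrt(dt), size=n))  # one Brownian path

power = np.abs(np.fft.rfft(W))**2    # periodogram
freqs = np.fft.rfftfreq(n, d=dt)

# Fit a log-log slope over an intermediate frequency band;
# it should come out close to -2.
band = (freqs > 1e-3) & (freqs < 1e-1)
slope, _ = np.polyfit(np.log(freqs[band]), np.log(power[band]), 1)
```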
The moral is that rapid fluctuations of the atmospheric temperature can drive fluctuations in the ocean temperature that last much longer, even in this simple linearized model. This is a small first step towards understanding how temperature fluctuations at time scales greater than a year can arise in the Earth’s climate. However, many examples of such fluctuations, such as the ENSO and AMO, are quasi-periodic in nature and require much more sophisticated models.
Stochastic resonance is a phenomenon that is best studied using SDE.
A given time series can be used to fit an SDE from a fixed set of models; see Parametric estimation for stochastic differential equations.
If the right hand side of a SDE depends on values of the process in the past, one talks about stochastic differential delay equations.
Stochastic differential equation, Wikipedia.
Brownian noise, Wikipedia. Also known as ‘red noise’.
White noise, Wikipedia.
The following textbook proceeds by explaining little chunks of theory and then progresses through exercises of increasing complexity. It starts with basic probability theory and ends with SDEs:
A modern classic, due to its lucid style and the many applications that are explained, is this:
The standard references are
and its accompanying volume
The following book is an elementary introduction to models in physics, chemistry and other sciences using Langevin equations:
The following book is a classic reference for the practitioner; Risken explains model building and approximate solution methods like linear response theory using a minimum of mathematical background: