The Azimuth Project
Parametric estimation for stochastic differential equations



When we study a time series, the central question is often this: is there a pattern or tendency, or is what we see purely coincidental? There are many statistical techniques for tackling this question. One of them is to fit a stochastic differential equation (SDE) to the given time series, which is the subject of this page.


If there is reason to assume that a process is described by a class of SDEs with nontrivial diffusion that is parametrized by a finite set of parameters, we can fit a given time series by maximum likelihood estimation of those parameters.

This is possible because, according to the Girsanov theorem, the probability laws of the processes that solve the given SDE for different parameter values are in this case absolutely continuous with respect to each other. By the Radon-Nikodym theorem, each of these laws therefore has a probability density with respect to every other one. We can even calculate this density, see the examples section.


For a simple example we look at the family of one-dimensional SDEs parametrized by one real parameter $\alpha$:

$$dX_{t, \alpha} = \alpha \, f(X_t) \, dt + dW_t$$

The maximum likelihood estimate $\hat\alpha(T)$ for a process on the interval $[0, T]$ is the value of $\alpha$ that maximizes the likelihood ratio:

$$L(\alpha, T) = \exp\left( \alpha \int_0^T f(X_t) \, dX_t - \frac{1}{2} \alpha^2 \int_0^T f^2(X_t) \, dt \right)$$

According to the Girsanov theorem, this is the Radon-Nikodym density of the law of the solution $X_{t, \alpha}$ of the SDE with respect to the law of the Wiener process.

To find the maximum, we solve the equation

$$\frac{\partial L}{\partial \alpha} = 0$$

and find

$$\hat \alpha(T) = \frac{\int_0^T f(X_t) \, dX_t}{\int_0^T f^2(X_t) \, dt}$$

This estimator is a random variable depending on the sample path. Under mild conditions it is possible to prove that $\lim_{T \to \infty} \hat \alpha(T) = \alpha$ with probability one, that is, the estimator is strongly consistent.
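
One way to see why this convergence is plausible is to substitute the SDE into the formula for the estimator. The following is a sketch of that step rather than a full proof. It assumes that $\int_0^T f^2(X_t) \, dt \to \infty$ as $T \to \infty$, which is one form the mild conditions can take:

$$\hat \alpha(T) = \frac{\int_0^T f(X_t) \left( \alpha f(X_t) \, dt + dW_t \right)}{\int_0^T f^2(X_t) \, dt} = \alpha + \frac{\int_0^T f(X_t) \, dW_t}{\int_0^T f^2(X_t) \, dt}$$

The numerator of the error term is a local martingale whose quadratic variation is exactly the denominator, so by the strong law of large numbers for martingales the ratio tends to zero with probability one whenever the denominator diverges.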

For a quick sanity check, we can set $f(X_t) = 1$ and rediscover linear regression:

$$\hat \alpha(T) = \frac{1}{T} \int_0^T dX_t = \frac{X_T - X_0}{T}$$
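
This special case can also be checked numerically. The following is a minimal sketch, assuming a simulated path of Brownian motion with constant drift sampled on a fine grid. The parameter values and variable names are illustrative and not part of the original page.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative values (assumptions): true drift, time horizon, number of steps.
alpha_true, T, n = 0.5, 200.0, 20_000
dt = T / n

# Increments of dX = alpha dt + dW, i.e. the case f = 1.
dX = alpha_true * dt + np.sqrt(dt) * rng.standard_normal(n)
X = np.concatenate(([0.0], np.cumsum(dX)))

# With f = 1 the sum of increments telescopes, so the estimator
# is just the average slope of the path.
alpha_hat = (X[-1] - X[0]) / T
alpha_hat_sum = np.sum(dX) / T

print(alpha_true, alpha_hat, alpha_hat_sum)
```

Both expressions agree up to rounding, and the estimate fluctuates around the true drift with a standard deviation of $1/\sqrt{T}$.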

The formula for the estimator assumes that we know a sample path of the process in continuous time. The next question is therefore: what do we do when we have only a time series, that is, a finite set of values observed at discrete times?
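
One natural first approach, offered here as a hedged sketch and not as the method this page goes on to develop, is to replace the stochastic integral by a left-endpoint (Itô) sum and the time integral by a Riemann sum. The example below assumes equally spaced observations with step `dt` and the drift shape $f(x) = -x$. The helper names `simulate_path` and `estimate_alpha` are hypothetical, and the test path is generated with the Euler-Maruyama scheme.

```python
import numpy as np

def simulate_path(alpha, f, x0, dt, n_steps, rng):
    """Euler-Maruyama path of dX = alpha * f(X) dt + dW (illustrative helper)."""
    x = np.empty(n_steps + 1)
    x[0] = x0
    noise = np.sqrt(dt) * rng.standard_normal(n_steps)
    for i in range(n_steps):
        x[i + 1] = x[i] + alpha * f(x[i]) * dt + noise[i]
    return x

def estimate_alpha(x, dt, f):
    """Discretized estimator: Ito (left-endpoint) sum divided by a Riemann sum."""
    fx = f(x[:-1])   # f evaluated at the left endpoint of each step
    dx = np.diff(x)  # increments X_{i+1} - X_i
    return np.sum(fx * dx) / (np.sum(fx ** 2) * dt)

def f(x):
    # Assumed drift shape: mean reversion towards zero.
    return -x

rng = np.random.default_rng(42)
alpha_true, dt, n_steps = 1.0, 0.01, 50_000  # time horizon T = 500
path = simulate_path(alpha_true, f, 0.0, dt, n_steps, rng)
alpha_hat = estimate_alpha(path, dt, f)

print(alpha_true, alpha_hat)
```

The left endpoint matters: evaluating $f$ at the right endpoint or the midpoint of each step would approximate a different stochastic integral and bias the estimate. For coarsely sampled data the discretization itself introduces an additional error that this sketch does not correct.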

To be continued…


It is of course possible to perform nonparametric estimation for stochastic differential equations as well. Including this approach, stochastic differential equations are more general than all explicit nonlinear time series models that are considered in the references mentioned on the page time series analysis.