The Azimuth Project Functions for empirical data fitting (changes)

Functions for empirical data fitting

Idea

Various kinds of functions used in data fitting. These functions are chosen because they may reveal some structure in the dataset. This differs from general function approximation with, e.g., non-parametric density estimation. TODO: an important issue is “redundant variables”, which can confuse the fitting.

Details

In general, data variables are denoted $x$ for “input” variables and $y$ for “output” variables, with $\sim$ denoting “basically fits, with an unmodelled noise term”. (This is deliberately loose.)

Functions for 1-dimensional fitting

Linear, vertical errors:

(1) $y \sim a x + b$
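
As a minimal sketch of fitting (1), assuming the noise is purely vertical and ordinary least squares is an acceptable criterion, the closed-form solution can be written in plain Python. The function name `fit_line` and the sample data are hypothetical and only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ~ a*x + b, minimising vertical residuals."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx              # slope
    b = mean_y - a * mean_x    # intercept
    return a, b


# Illustrative data only: generated from y = 2x + 1 with no noise.
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

If the errors were in $x$ as well as $y$, a different criterion (for example total least squares) would be needed; this sketch does not cover that case.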

Gaussian:

(2) $y \sim a \exp\left(-\frac{(x-\mu)^2}{\sigma^2}\right)$

Exponential:

(3) $y \sim a \exp(\lambda x)$
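
One common way to fit (3), sketched here under the assumption that every $y$ is strictly positive, is to take logarithms and fit a straight line to $(x, \log y)$. This is a log-linearisation, not a least-squares fit on $y$ itself: it minimises residuals in $\log y$, which suits multiplicative noise better than additive noise. The name `fit_exponential` and the sample data are hypothetical.

```python
import math


def fit_exponential(xs, ys):
    """Fit y ~ a*exp(lam*x) by linear least squares on (x, log y)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    if any(y <= 0 for y in ys):
        raise ValueError("log-linearisation requires every y > 0")
    log_ys = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(log_ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; lam is undefined")
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, log_ys))
    lam = sxl / sxx                        # slope of the line in log space
    a = math.exp(mean_l - lam * mean_x)    # intercept, mapped back from log space
    return a, lam


# Illustrative data only: generated from y = 3*exp(0.5*x) with no noise.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [3.0 * math.exp(0.5 * x) for x in xs]
a_hat, lam_hat = fit_exponential(xs, ys)
```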

Combined Gaussian and exponential:

(4) $y \sim a \exp\left(-(b x^2 + c x)\right)$

Power law with offset:

(5) $y \sim a x^b + c$

Beta:

(6) $y \sim a x^b (1-x)^c$

TODO: work out how these break down into (possibly identity) affine transformations of the basic variable combined with other functions.
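
As one illustration of such a breakdown (a sketch, assuming $b > 0$; the symbols $u$, $v$ and $a'$ are introduced only for this example): the Gaussian (2) is the basic function $\exp(-u^2)$ applied to the affine transformation $u = (x-\mu)/\sigma$, and the exponential (3) is $\exp(v)$ with $v = \lambda x$. Completing the square in the exponent of (4) gives

$b x^2 + c x = b\left(x + \frac{c}{2b}\right)^2 - \frac{c^2}{4b}$

so that

$a \exp\left(-(b x^2 + c x)\right) = a' \exp\left(-\frac{(x-\mu)^2}{\sigma^2}\right)$, with $a' = a \exp\left(\frac{c^2}{4b}\right)$, $\mu = -\frac{c}{2b}$, $\sigma^2 = \frac{1}{b}$.

Under that assumption, (4) describes the same family of curves as (2), only with different parameters, so fitting both to the same data would not distinguish them.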

Functions for multidimensional fitting

In some cases, particularly if the axes correspond to meaningful variables, a multidimensional dataset may be well fitted by a separable function composed from several of the 1-dimensional functions above. Otherwise an intrinsically multidimensional function like those below may be better.
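
A minimal sketch of a separable function, assuming “separable” means a product of one 1-dimensional function per axis (a sum would be another reasonable reading). The helper names `gaussian`, `exponential` and `separable` are hypothetical.

```python
import math


def gaussian(a, mu, sigma):
    """1-D function a*exp(-(x - mu)^2 / sigma^2), as in (2)."""
    return lambda x: a * math.exp(-((x - mu) ** 2) / sigma ** 2)


def exponential(a, lam):
    """1-D function a*exp(lam*x), as in (3)."""
    return lambda x: a * math.exp(lam * x)


def separable(*factors):
    """Product of 1-D functions, one per axis: f(x1..xn) = f1(x1)*...*fn(xn)."""
    def f(*xs):
        if len(xs) != len(factors):
            raise ValueError("expected exactly one coordinate per factor")
        result = 1.0
        for g, x in zip(factors, xs):
            result *= g(x)
        return result
    return f


# Gaussian along the first axis, exponential decay along the second.
f = separable(gaussian(1.0, 0.0, 1.0), exponential(2.0, -0.5))
value = f(0.0, 0.0)
```

In a product like this the two amplitudes enter only through their product, so fitting both as free parameters would be an instance of the “redundant variables” issue; fixing one of them to 1 avoids it.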