# The Azimuth Project Climate network

## Idea

A climate network is an undirected graph whose nodes represent points in a spatial grid, and where the edge weight (link strength) between nodes i and j is calculated from the historical weather record at the two points. For example, it could be based on the cross-correlation of the temperature histories at i and j. In weighted graph formulations, the cross-correlation may supply the raw data for the edge weight. In unweighted graph formulations, a binary decision rule based on the cross-correlation between the histories at i and j may be used to specify whether i and j are “connected” or not.
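As a rough illustration of the unweighted formulation, the sketch below connects two grid points when the absolute Pearson correlation of their histories reaches a cutoff. The function names, the cutoff value and the toy histories are all assumptions made for this example; they are not taken from any particular paper.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def build_edges(histories, threshold):
    """Return the undirected edges (i, j), i < j, whose absolute
    correlation is at least `threshold` (a hypothetical decision rule)."""
    edges = set()
    for i, j in combinations(sorted(histories), 2):
        if abs(pearson(histories[i], histories[j])) >= threshold:
            edges.add((i, j))
    return edges

# Made-up temperature histories at three grid points.
histories = {
    0: [1.0, 2.0, 3.0, 4.0, 5.0],
    1: [2.1, 3.9, 6.2, 8.0, 9.9],  # closely tracks point 0
    2: [3.0, 1.0, 4.0, 1.0, 5.0],  # only loosely related to the others
}
edges = build_edges(histories, threshold=0.9)
```

With these toy numbers only points 0 and 1 end up connected. A weighted formulation would keep the correlation values themselves instead of thresholding them.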

The graph structure of the network is a function of time.

Climate networks have found important applications to the detection and forecasting of El Niño events. See:

## Definition of climate networks

There are many ways in which a “link strength” could be defined. First we will set up some notation. Given a time $t$, we can define a map which takes $t$ to a vector of times no later than $t$. For example, if the units of time are days, a “previous year” map $Y(t)$ can be defined as:

$Y(t) = (t-364, \dots, t-1, t).$

The covariance of two vectors $x$ and $y$ of the same length $n$ is defined in the usual way:

$\operatorname{cov}(x,y) = \operatorname{E}{\big[(x - \operatorname{E}[x])(y - \operatorname{E}[y])\big]} = \frac{1}{n}\sum_i \big( x_i - \frac{1}{n}\sum_j x_j \big) \big(y_i - \frac{1}{n}\sum_j y_j \big)$

From this, the correlation can be defined as

$\operatorname{cor}(x,y) = \frac{\operatorname{cov}(x,y)}{\operatorname{cov}(x,x)^{1/2} \; \operatorname{cov}(y,y)^{1/2}}.$
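The two formulas translate directly into code. The following is a minimal sketch using only the Python standard library; the names `mean`, `cov` and `cor` are chosen here to mirror the notation and are not from any library.

```python
import math

def mean(x):
    """Arithmetic mean of a sequence."""
    return sum(x) / len(x)

def cov(x, y):
    """Covariance with the 1/n normalisation used in the formula above."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)

def cor(x, y):
    """Correlation: covariance divided by the two standard deviations."""
    return cov(x, y) / (math.sqrt(cov(x, x)) * math.sqrt(cov(y, y)))

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]   # exactly 2 * x
cxy = cov(x, y)            # 2.5
rxy = cor(x, y)            # 1.0 up to rounding
```

Note that `cor` is undefined when either vector is constant, since the corresponding standard deviation is then zero.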

Finally, if $f(t)$ is a function of time, such as the temperature, we extend $f$ to a map between vectors in the obvious way:

$f((x_1, \dots, x_n)) = (f(x_1), \dots, f(x_n))$
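To make the notation concrete, here is a small sketch of the “previous year” map and of the extension of a function of time to vectors of times. The helper names `previous_year` and `lift`, and the made-up `temperature` function, are assumptions for illustration only.

```python
def previous_year(t, days=365):
    """Y(t) = (t-364, ..., t-1, t) for the default window of 365 days."""
    return tuple(range(t - days + 1, t + 1))

def lift(f):
    """Extend a function of a single time to act elementwise on a
    vector of times."""
    return lambda times: tuple(f(s) for s in times)

# Hypothetical temperature record: a made-up function of the day index.
def temperature(day):
    return 10.0 + 0.01 * day

window = previous_year(1000)          # (636, 637, ..., 1000)
temps = lift(temperature)(window)     # temperatures on those 365 days
```

In this notation `temps` plays the role of $f(Y(t))$ with $f$ the temperature and $t = 1000$.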

The following definitions are derived from:

First, $T_i(t)$ is the temperature on day $t$ at point $i$. For a time lag of $\tau$ days, $0 \leq \tau \leq 200$, they define the time-lagged cross-covariances

$C^{(t)}_{i,j}(-\tau) = \operatorname{cov}( T_i(Y(t)), \; T_j(Y(t-\tau)) )$

and

$C^{(t)}_{i,j}(\tau) = \operatorname{cov}( T_i(Y(t-\tau)), \; T_j(Y(t)) )$

and then divide these by the corresponding standard deviations to obtain the cross-correlations

$c^{(t)}_{i,j}(-\tau) = \frac{C^{(t)}_{i,j}(-\tau)}{C^{(t)}_{i,i}(0)^{1/2} \; C^{(t-\tau)}_{j,j}(0)^{1/2}}$

and, with the roles of the two windows swapped,

$c^{(t)}_{i,j}(\tau) = \frac{C^{(t)}_{i,j}(\tau)}{C^{(t-\tau)}_{i,i}(0)^{1/2} \; C^{(t)}_{j,j}(0)^{1/2}}.$
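The time-lagged cross-correlation can be sketched as follows, assuming each temperature record is a Python list indexed by day number. The function names, the signed-lag convention for `tau`, and the synthetic sine-wave records are assumptions made for this example.

```python
import math

def window(series, t, days=365):
    """Values of `series` on the days Y(t) = (t-days+1, ..., t)."""
    if t - days + 1 < 0:
        raise ValueError("not enough history before day t")
    return series[t - days + 1 : t + 1]

def cov(x, y):
    """Covariance with 1/n normalisation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / n

def lagged_cor(T_i, T_j, t, tau, days=365):
    """Cross-correlation c^{(t)}_{i,j}(tau) for a signed lag tau.

    tau <= 0: the window for i ends at t, the window for j at t - |tau|.
    tau > 0:  the window for i ends at t - tau, the window for j at t.
    """
    if tau <= 0:
        x, y = window(T_i, t, days), window(T_j, t + tau, days)
    else:
        x, y = window(T_i, t - tau, days), window(T_j, t, days)
    return cov(x, y) / math.sqrt(cov(x, x) * cov(y, y))

# Synthetic records: point j repeats point i five days later.
T_i = [math.sin(2 * math.pi * d / 50) for d in range(400)]
T_j = [math.sin(2 * math.pi * (d - 5) / 50) for d in range(400)]

c_at_lag = lagged_cor(T_i, T_j, t=300, tau=5, days=100)   # close to 1
c_no_lag = lagged_cor(T_i, T_j, t=300, tau=0, days=100)   # noticeably smaller
```

Scanning `tau` over the range $-200, \dots, 200$ gives the full correlation profile for one pair of points at one time $t$.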

The description of $S_{ij}(t)$ was changed in the correction to the paper. It is confusing, but I hope the following is correct. For each point in time $t$ and each pair of points $i,j$, they determine (by taking statistics over $\tau$) the maximum, the mean, and the standard deviation about the mean of $| c^{(t)}_{i,j}(\tau) |$. They then define the link strength $S_{ij}(t)$ as the difference between the maximum and the mean, divided by the standard deviation.
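Under the reading just given, the link strength for one pair of points at one time could be computed as in this sketch. The function name `link_strength` is hypothetical, and the use of the population standard deviation (dividing by the number of lags) is an assumption, since the text above does not say which normalisation is meant.

```python
import math

def link_strength(cors):
    """(max - mean) / std of |c| over all lags.

    `cors` holds the cross-correlations c(tau) for every lag considered.
    The standard deviation divides by n (an assumption).
    """
    a = [abs(c) for c in cors]
    n = len(a)
    m = sum(a) / n
    std = math.sqrt(sum((v - m) ** 2 for v in a) / n)
    if std == 0.0:
        raise ValueError("flat correlation profile: link strength undefined")
    return (max(a) - m) / std

# Made-up correlation profile with a single pronounced peak.
profile = [0.1, -0.1, 0.1, -0.9, 0.1]
s = link_strength(profile)   # (0.9 - 0.26) / 0.32, about 2.0
```

A large value means the correlation profile has a peak that stands well clear of its typical level, rather than simply being large everywhere.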

## El Niño prediction

The time dependent average link strength $S(t)$ is obtained by averaging over $S_{ij}(t)$ where $i$ is in the “El Niño basin” and $j$ is in an area of the Pacific outside the basin.

It is observed that $S(t)$ decreases during El Niño events. The prediction of El Niño events is done by choosing a threshold, and predicting an El Niño event when $S(t)$ rises above the threshold.
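The averaging and thresholding steps might look like the sketch below. It treats "rises above the threshold" as an upward crossing between consecutive time steps, which is an assumption; any further conditions used in practice are not described here. The function names and all numbers are made up for illustration.

```python
def average_link_strength(S, basin, outside):
    """Mean of S[(i, j)] over i in the basin and j outside it."""
    values = [S[(i, j)] for i in basin for j in outside]
    return sum(values) / len(values)

def upward_crossings(series, threshold):
    """Indices k where the series goes from at-or-below the threshold
    at step k-1 to strictly above it at step k."""
    return [k for k in range(1, len(series))
            if series[k - 1] <= threshold < series[k]]

# Toy link strengths at a single time: nodes 0, 1 inside the basin,
# nodes 2, 3 outside it.
S = {(0, 2): 1.0, (0, 3): 3.0, (1, 2): 2.0, (1, 3): 2.0}
avg = average_link_strength(S, basin=[0, 1], outside=[2, 3])   # 2.0

# Toy time series of S(t) and a hypothetical threshold.
S_t = [2.0, 2.2, 2.6, 2.5, 2.3, 2.7]
alarms = upward_crossings(S_t, threshold=2.4)   # [2, 5]
```

Each index in `alarms` marks a time step at which this simplified rule would issue a prediction.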

The link strength defined above by Yamasaki et al. has some surprising behaviour. See Experiments with varieties of link strength for El Niño prediction.

category: climate