The Azimuth Project
Climate network (changes)


A climate network is an undirected graph whose nodes represent points in a spatial grid, and where the edge weight (link strength) between nodes i and j is calculated from the historical weather record at the two points. For example, it could be based on the cross-correlation of temperature histories at i and j. In weighted graph formulations, the cross-correlation may supply the raw data for the edge weight. In unweighted graph formulations, a binary decision rule, based on the cross-correlations between the histories at i and j, may be used to specify whether i and j are “connected” or not.

The graph structure of the network is a function of time.

Climate networks have found important applications to the detection and forecasting of El Niño events. See:

Definition of climate networks

There are many ways in which a “link strength” could be defined. First we will set up some notation. Given a time $t$, we can define a map which takes $t$ to a vector of times no later than $t$. For example, if the units of time are days, a “previous year” map $Y(t)$ can be defined by:

$$ Y(t) = (t-364, \dots, t-1, t). $$

The covariance of two vectors $x$ and $y$ of the same length $n$ is defined in the usual way:

$$ \operatorname{cov}(x,y) = \operatorname{E}\big[(x - \operatorname{E}[x])(y - \operatorname{E}[y])\big] = \frac{1}{n}\sum_i \Big( x_i - \frac{1}{n}\sum_j x_j \Big) \Big( y_i - \frac{1}{n}\sum_j y_j \Big). $$

From this, the correlation can be defined as

$$ \operatorname{cor}(x,y) = \frac{\operatorname{cov}(x,y)}{\operatorname{cov}(x,x)^{1/2} \; \operatorname{cov}(y,y)^{1/2}}. $$
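As a small check of these definitions, the following R sketch implements the $1/n$ covariance directly. The names `cov.n` and `cor.n` are made up for this illustration. Note that R's built-in `cov` divides by $n-1$ rather than $n$; the factor cancels in the correlation, so the built-in `cor` agrees with the definition above.

```r
# Covariance with 1/n normalisation, as in the formula above.
# (cov.n and cor.n are illustrative names, not part of base R.)
cov.n <- function(x, y) {
  stopifnot(length(x) == length(y))
  mean((x - mean(x)) * (y - mean(y)))
}

# Correlation built from cov.n.
cor.n <- function(x, y) {
  cov.n(x, y) / sqrt(cov.n(x, x) * cov.n(y, y))
}

x <- c(1, 2, 3, 4, 5)
y <- c(2, 1, 4, 3, 6)
n <- length(x)

# R's cov() uses 1/(n-1), so it differs from cov.n by the factor (n-1)/n.
print(all.equal(cov.n(x, y), cov(x, y) * (n - 1) / n))
# The normalisation cancels in the correlation.
print(all.equal(cor.n(x, y), cor(x, y)))
```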

Finally, if $f(t)$ is a function of time, such as the temperature, we extend $f$ to a map between vectors in the obvious way:

$$ f((x_1, \dots, x_n)) = (f(x_1), \dots, f(x_n)). $$
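In R, the map $Y$ and the element-wise extension of $f$ can be sketched as follows, assuming a temperature record stored as a vector with one entry per day, so that entry `d` holds the value on day `d`. The data here are random placeholders, not real temperatures, and the name `temp` is chosen only for this example.

```r
# "Previous year" map: the 365 days ending at day t (time measured in days).
# Only meaningful for t >= 365 when days are numbered from 1.
Y <- function(t) (t - 364):t

# Placeholder record: temp[d] stands for the temperature on day d.
set.seed(1)
temp <- rnorm(1000)

# Indexing by a vector of days applies the series element-wise,
# which is the extension f((x_1, ..., x_n)) = (f(x_1), ..., f(x_n)).
window <- temp[Y(500)]
print(length(window))
print(range(Y(500)))
```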

The following definitions are derived from:

Firstly, $T_i(t)$ is the temperature on day $t$ at point $i$. For a time lag of $\tau$ days, $0 \leq \tau \leq 200$, they define time-lagged cross-covariances

$$ C^{(t)}_{i,j}(-\tau) = \operatorname{cov}( T_i(Y(t)), \; T_j(Y(t-\tau)) ) $$

$$ C^{(t)}_{i,j}(\tau) = \operatorname{cov}( T_i(Y(t-\tau)), \; T_j(Y(t)) ) $$

and then divide these by the corresponding standard deviations to obtain cross-correlations:

$$ c^{(t)}_{i,j}(-\tau) = \frac{C^{(t)}_{i,j}(-\tau)}{C^{(t)}_{i,i}(0)^{1/2} \; C^{(t-\tau)}_{j,j}(0)^{1/2}}, $$

and a similar expression for $c^{(t)}_{i,j}(\tau)$.
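A minimal R sketch of these lagged cross-correlations is given below. It assumes that the “similar expression” for the positive lag normalises by the standard deviations of $T_i$ over $Y(t-\tau)$ and of $T_j$ over $Y(t)$, in which case each value is simply the ordinary correlation of the two windows. The function name `lagged.cor` and the test series are made up for illustration.

```r
Y <- function(t) (t - 364):t

# Cross-correlation for a signed lag; Ti and Tj are daily series indexed by day.
# lag <= 0 gives c(-tau) with tau = -lag: T_j is taken tau days earlier.
# lag >  0 gives c(+tau) with tau = lag:  T_i is taken tau days earlier.
lagged.cor <- function(Ti, Tj, t, lag) {
  tau <- abs(lag)
  stopifnot(t - tau - 364 >= 1, t <= min(length(Ti), length(Tj)))
  if (lag <= 0) {
    cor(Ti[Y(t)], Tj[Y(t - tau)])
  } else {
    cor(Ti[Y(t - tau)], Tj[Y(t)])
  }
}

# Placeholder data: Tj repeats Ti ten days later, plus a little noise.
set.seed(42)
Ti <- rnorm(600)
Tj <- c(rep(0, 10), Ti[1:590]) + 0.1 * rnorm(600)

print(lagged.cor(Ti, Tj, t = 600, lag = 10))   # windows line up: close to 1
print(lagged.cor(Ti, Tj, t = 600, lag = -10))  # windows misaligned: near 0
```

With white-noise placeholder data the correlation is large only at the lag that aligns the two windows; real temperature series are autocorrelated, so the profile over $\tau$ would be much smoother.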

The description of $S_{ij}(t)$ is changed in the correction to the paper. It is confusing, but I hope the following is correct: for each point in time $t$ and each pair of points $i,j$, they determine (by taking expectations over $\tau$) the maximum, the mean, and the standard deviation around the mean of $| c^{(t)}_{i,j}(\tau) |$, and define the link strength $S_{ij}(t)$ as the difference between the maximum and the mean value, divided by the standard deviation.
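Read this way, the link strength is a one-line computation on the vector of cross-correlations over all lags. The sketch below is only an illustration of that reading: the name `link.strength` is made up, and R's `sd` divides by $n-1$, which is an assumption since the normalisation used in the paper is not stated here.

```r
# Link strength from the cross-correlations over all lags:
# (max - mean) / sd of the absolute values.
link.strength <- function(c.values) {
  a <- abs(c.values)
  (max(a) - mean(a)) / sd(a)
}

# Two made-up correlation profiles over five lags.
peaked <- c(0.02, -0.03, 0.30, 0.01, -0.02)  # weak, but with one clear peak
flat   <- c(0.90,  0.95, 0.90, 0.95,  0.90)  # uniformly strong

print(link.strength(peaked))
print(link.strength(flat))
```

Because the measure compares the peak with the spread of the profile, the weak but sharply peaked profile scores higher than the uniformly strong one.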

El Niño prediction

The time-dependent average link strength $S(t)$ is obtained by averaging $S_{ij}(t)$ over all pairs where $i$ is in the “El Niño basin” and $j$ is in an area of the Pacific outside the basin.

It is observed that $S(t)$ decreases during El Niño events. The prediction of El Niño events is done by choosing a threshold, and predicting an El Niño event when $S(t)$ rises above the threshold.
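The averaging and the threshold rule can be sketched in R as follows. This is a bare illustration under stated assumptions: the link strengths at one time are held in a matrix indexed by grid point, the node sets `basin` and `outside` are supplied by the user, and an alarm is raised at each upward crossing of the threshold. The function names and the numbers are invented, and any further conditions a real forecasting scheme might impose are not modelled.

```r
# Average of S_ij over i in the basin and j in the chosen outside region.
# S.ij is a (hypothetical) matrix of link strengths at a single time t.
average.link.strength <- function(S.ij, basin, outside) {
  mean(S.ij[basin, outside, drop = FALSE])
}

# Indices at which the series S goes from not-above to above the threshold.
# A series that starts above the threshold counts as an onset at index 1.
alarm.onsets <- function(S, threshold) {
  above <- S > threshold
  previous <- c(FALSE, head(above, -1))
  which(above & !previous)
}

S.ij <- matrix(1:16, nrow = 4)
print(average.link.strength(S.ij, basin = c(1, 2), outside = c(3, 4)))

S <- c(1, 2, 3, 2, 1, 3, 4)
print(alarm.onsets(S, threshold = 2.5))
```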

The link strength defined above by Yamasaki et al. has some surprising behaviour. See Experiments with varieties of link strength for El Niño prediction. The following graphs are all based on simulated data. There are two time series of length 565, called “signal 1” and “signal 2” in the graphs, which consist of quadratics $q_1$ and $q_2$ plus independent Gaussian noise. The noise has the same amplitude (standard deviation) in all cases, but $q_1$ and $q_2$ are multiplied by 1000 (leftmost column), 9 (second column), 3 (third column) and 1 (fourth column).

Examples of the signals themselves are shown in the top two rows, the value of $c^{(t)}_{i,j}(\tau)$ is in the third row, and the fourth row shows an estimated density of the link strength derived from 100 replicates (different samplings of noise).

In the first column, $q_1$ and $q_2$ overwhelm the Gaussian noise, so you can see their shapes. In particular, note that they have positive correlation for all delays: it varies between about 0.87 and 0.97. The other three columns are intended to be more realistic signals which roughly resemble climate data. One would expect that as the multiplier for $q_1$ and $q_2$ decreases, the link strength would also decrease, but the opposite is the case.

Simulated link strengths

The code is below.

# Two quadratic test signals on a normalised time axis.
signal1 <- function(period) {
  x <- period / length(period)
  x + (x-.5)^2
}

signal2 <- function(period) {
  x <- period / length(period)
  x - (x-.5)^2
}

period <- 1:566
tau.max <- 200
tau.range <- -tau.max:tau.max
cperiod <- 365

# Simulate nreps noisy signal pairs. Returns the link strength of every
# replicate, plus the signals and |correlation| profile of the last one.
make.Csamples <- function(nreps, scale) {
  LSs <- rep(0, nreps)
  C <- rep(0, length(tau.range))
  for (r in 1:nreps) {
    t1 <- scale * signal1(period) + rnorm(length(period))
    t2 <- scale * signal2(period) + rnorm(length(period))
    for (tau in tau.range) {
      # Shift one of the two 365-day windows by |tau| days.
      if (tau <= 0) {
        x <- t1[1:cperiod]
        y <- t2[(-tau+1):(-tau+cperiod)]
      } else {
        x <- t1[(tau+1):(tau+cperiod)]
        y <- t2[1:cperiod]
      }
      C[tau.max+1+tau] <- abs(cor(x,y))
    }
    # Link strength: (max - mean) / sd over all lags.
    LSs[r] <- (max(C)-mean(C))/ sd(C)
  }
  quantiles <- quantile(LSs, probs=c(.05,.25,.5,.75,.95))
  list(C=C, LSs=LSs, t1=t1, t2=t2, quantiles = round(quantiles, digits=2))
}

# One column of four plots per scaling factor.
op <- par(mfcol=c(4,4), mar=c(4,5,1,1))
for (s in 1:4) {
  scaling <- c(1000,9,3,1)[s]
  dsmps <- make.Csamples(100,scaling)
  maintxt <- paste0("signal scaled by ", scaling)
  plot(period, dsmps$t1, type='l', ylab="signal 1", xlab="days")
  plot(period, dsmps$t2, type='l', ylab="signal 2", xlab="days")
  plot(tau.range, dsmps$C, type='l', ylab="C(tau)", xlab="tau")
  dens <- density(dsmps$LSs)
  plot(dens$x, dens$y, type='l', xlab="signal strength", ylab="density")
}
par(op)  # restore the previous graphics settings

category: climate