# The Azimuth Project Blog - El Niño project (part 4)

This is a blog article in progress, written by John Baez. To see discussions of the article as it is being written, visit the Azimuth Forum.

As the first big step in our El Niño project, Graham Jones replicated the paper by Ludescher et al that I explained in Part 3. Let’s see how this works!

Graham did this using R, a programming language that’s good for statistics. So, I’ll tell you how everything works with R. If you prefer another language, go ahead and write software for that… and let us know! We can add it to our repository.

Today I’ll explain this stuff to people who know their way around computers. But I’m not one of those people! So, next time I’ll explain the nitty-gritty details in a way that may be helpful to people like me.

### Getting temperature data

Say you want to predict El Niños from 1950 to 1980 using Ludescher et al's method. To do this, you need daily average surface air temperatures in this 7 × 23 grid in the Pacific Ocean:

Each square here is 7.5° × 7.5°. To get this data, you have to first download area-averaged temperatures on a grid with squares that are 1.5° × 1.5° in size:

• Earth System Research Laboratory, NCEP Reanalysis Daily Averages Surface Level, or ftp site.
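The two grids line up nicely, since 7.5° is exactly five times 1.5°. Here's a quick check of the counts, in Python just for illustration (the project's own code is in R):

```python
# Each 7.5-degree square covers a 5 x 5 block of the 1.5-degree download cells.
cells_per_square = int(7.5 / 1.5)      # 5

rows = 7 * cells_per_square            # 35 small cells north-south
cols = 23 * cells_per_square           # 115 small cells east-west
print(rows, cols)                      # 35 115
```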

You can get the website to deliver you temperatures in a given rectangle in a given time interval. But it gives you this data in a format called NetCDF, meaning Network Common Data Form. We’ll take a different approach. We’ll download all the Earth’s temperatures from 1948 to 2013, and then extract the data we need using R scripts. That way, when we play other games with temperature data later, we’ll already have it.

So, go ahead and download all files from air.sig995.1948.nc to air.sig995.2013.nc. If you just want to do this one project, you only need the files up to air.sig995.1980.nc. It will take a while…
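You can also script the downloads instead of clicking through the website. Here is a sketch in Python (the project's code is in R; this is just for illustration). The FTP path below is my guess at the ESRL layout, so verify it on the download page before running:

```python
# Sketch of fetching the yearly NetCDF files.
# NOTE: the base URL is an assumption -- check the ESRL site for the real path.
from urllib.request import urlretrieve

base = "ftp://ftp.cdc.noaa.gov/Datasets/ncep.reanalysis.dailyavgs/surface"
files = [f"air.sig995.{year}.nc" for year in range(1948, 2014)]
print(len(files), files[0], files[-1])   # 66 files, 1948 through 2013

# Uncomment to actually download (it's a lot of data, so it will take a while):
# for name in files:
#     urlretrieve(f"{base}/{name}", name)
```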

### Getting the temperatures you need

Now you have files of daily average temperatures on a 1.5° by 1.5° grid, from 1948 to 2013. Make sure all these files are in your working directory for R, and download this R script from GitHub:

netcdf-convertor-ludescher.R, johncarlosbaez/el-nino Github site.

Graham wrote it; I just modified it a bit. You can use this to get the temperatures in any time interval and rectangle of grid points you want. However, the defaults are set to precisely what you need now!

When you run this, you should get a file called Pacific-1948-1980.txt. This has daily average temperatures in the region we care about, from 1948 to 1980.
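As a sanity check on the output, you can count how many days the file should cover, assuming one sample per day from 1948-01-01 through 1980-12-31 (again in Python, just as a worked example):

```python
# Number of daily records expected in Pacific-1948-1980.txt:
# 33 years, 9 of them leap years.
from datetime import date

n_days = (date(1981, 1, 1) - date(1948, 1, 1)).days
print(n_days)  # 12054
```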

### Getting the El Niño data

You’ll use this data to predict El Niños, so you also want a file of the Niño 3.4 data. Remember from last time, this says how much hotter than average the surface water is in this patch of the Pacific Ocean:

You can download the file from here:

nino3.4-anoms.txt, johncarlosbaez/el-nino Github site.

This is a copy of the Monthly Niño 3.4 index data from the US National Weather Service, which I discussed last time.

Put this file in your working directory for R.
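If you want to poke at this file yourself, parsing it is easy. Here is a sketch in Python (the project's code is in R). I'm assuming a whitespace-separated layout with year, month, and anomaly columns, and the sample rows below are made up, so check the actual file's header before relying on this:

```python
# Hypothetical sample rows -- the real file may have more columns.
sample = """\
1950  1  -1.60
1950  2  -1.30
1950  3  -1.10
"""

# Index the anomalies by (year, month).
anoms = {}
for line in sample.splitlines():
    year, month, anom = line.split()
    anoms[(int(year), int(month))] = float(anom)

print(anoms[(1950, 1)])  # -1.6
```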

Here is the result (click to enlarge):

This is almost but not quite the same as the graph in Ludescher et al:

### Niño 3.4

In Part 3 I mentioned a way to get Niño 3.4 data from NOAA. However, Graham started with data from a different source:

Monthly Niño 3.4 index, Climate Prediction Center, National Weather Service.

The actual temperatures in Celsius are close to the NOAA data I mentioned last time. But the anomalies, which give the actual Niño 3.4 index, are rather different, because they are computed in a way that takes global warming into account. See the website for details.
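To see why the base period matters, here is a toy illustration in Python. This is not the Climate Prediction Center's exact algorithm (they update 30-year base periods every 5 years); it just shows how a warming trend inflates anomalies computed against a frozen climatology but not against an updated one:

```python
# Anomaly of each value relative to the mean over a chosen base slice.
def anomalies(series, base):
    start, end = base
    mean = sum(series[start:end]) / (end - start)
    return [x - mean for x in series]

# A purely artificial warming trend: +0.02 degrees per "year" for 60 years.
temps = [20.0 + 0.02 * year for year in range(60)]

fixed  = anomalies(temps, (0, 30))    # climatology frozen at years 0-29
moving = anomalies(temps, (25, 55))   # climatology updated to years 25-54

# The frozen base turns the trend itself into a large apparent anomaly;
# the updated base absorbs most of it.
print(round(fixed[-1], 3), round(moving[-1], 3))  # 0.89 0.39
```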

The code is at GitHub. It took about 35 minutes to run.

category: blog, climate