# Some El Nino related data

I cannot make heads or tails of the NASA and NOAA data, so I randomly search for useful data (when I cannot sleep):

Really useful visualization of El Nino and La Nina:

Historical El Niño/La Niña Watch

In there I found sea level satellite data:

Warm water pool returns to near normal state March 98

The larger dataset:

Ocean Surface Topography from Space

This caught my eye for time-series wavelet applications and forecasts:

Global Mean Sea Level Trend from Integrated Multi-Mission Ocean Altimeters TOPEX/Poseidon Jason-1 and OSTM/Jason-2 Version 2

• Options

Dara, Regarding the last link, this data is from the PO.DAAC oceanography database http://podaac.jpl.nasa.gov/ maintained by JPL. I worked with the scientist who is in charge of the site on a recent collaborative project and can ask him some favors for accessing data that interests us. The amount of oceanography data that PO.DAAC archives is astounding.

Paul Pukite

• Options

Hello Paul thank you.

As far as the correlation is concerned, we need to match different kinds of data, e.g. temperature and sea levels; just matching temperature to temperature might not give us enough clues.

D

• Options

The sea-level-to-temperature correlation is simple to first order but quite involved to second order. The first-order physics approximation is to assume an addition of heat to ocean water and then invoke steric expansion to estimate the sea level increase. (See http://theoilconundrum.blogspot.ch/2013/07/expansion-of-atmosphere-and-ocean.html for the simple derivation and some numbers.)

To second order, we have to integrate through all depths, as the thermal coefficient of expansion for water depends on temperature.
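As a rough illustration of the two approximations, here is a minimal Python sketch. The function `alpha`, the layer temperatures, and the warming profile are all assumed, illustrative values (not measurements and not taken from the linked derivation); the point is only to show a constant-coefficient estimate next to a depth-integrated one.

```python
# Illustrative sketch of first- vs second-order steric sea level rise.
# All numbers below are rough, assumed values, not measurements.

def alpha(temp_c):
    """Hypothetical linear fit for the thermal expansion coefficient (1/K).

    Assumed to grow with temperature, from about 0.5e-4 near 0 C
    to about 2.5e-4 near 20 C.
    """
    return 0.5e-4 + 1.0e-5 * temp_c

def first_order_rise(layer_depth_m, delta_t, alpha_const=2.0e-4):
    """Uniform warming of one layer with a constant expansion coefficient."""
    return alpha_const * delta_t * layer_depth_m

def second_order_rise(temps_c, delta_ts, dz_m):
    """Sum alpha(T(z)) * dT(z) * dz over layers of thickness dz_m."""
    return sum(alpha(t) * dt * dz_m for t, dt in zip(temps_c, delta_ts))

# Assumed profile: 7 layers of 100 m, warm at the surface, cold at depth,
# with the warming concentrated near the surface.
temps = [25.0, 20.0, 15.0, 10.0, 7.0, 5.0, 4.0]
warming = [1.0, 0.8, 0.5, 0.3, 0.1, 0.05, 0.0]

rise_1 = first_order_rise(700.0, 1.0)
rise_2 = second_order_rise(temps, warming, 100.0)
print(f"first order:  {rise_1:.4f} m")
print(f"second order: {rise_2:.4f} m")
```

With a real temperature profile and a proper equation of state for seawater, the same layer-by-layer sum would replace the hypothetical `alpha`.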

• Options

Ok Paul this is impressive, could you help me out to sketch out some ideas so I start coding some computational models?

• Options

If you're interested in bringing more variables than just sea surface temperature into the ENSO prediction problem, you may want to read this paper, which adds three additional atmospheric variables (atmospheric streamfunction, diabatic heating, and velocity potential); see Section 2 for methodology. Not all of these correspond to easily-accessible data products, but they may suggest datasets to look for.

• Options

Thank you Nathan, I need to see the bigger picture. I really appreciate this paper.

Also I need to define the INPUT for my forecast algorithms, I could easily add more variables. But how?

Example:

Let a_1, a_2, ..., a_n be an input array at time t, and make a finite list of such arrays at sequential times:

input_1 = a_{1,1}, a_{1,2}, ..., a_{1,n}

input_2 = a_{2,1}, a_{2,2}, ..., a_{2,n}

...

input_t = a_{t,1}, a_{t,2}, ..., a_{t,n}

Forecast(input_t) = El Nino index at time t + 1

We could use this list as an input to a forecast algorithm by taking the row i and corresponding output e.g. El Nino index and attempt to forecast the output for row i+1.

I suspect this will produce ok to inferior results.

It is better to take the entire list above, wavelet-transform it, and decompose it into a finite list of components, some of them trends and some noise (high frequency):

Make the entire input list into a 2D matrix

M = {input_1, input_2, ..., input_t}, where each input is a row of this matrix.

Decompose M = M1 + M2 + ... + Mr, where r is the maximum refinement level (depth).

Then select the components with the larger Energy Fraction and pass those (the trends) to the forecast algorithm, e.g. M2 or M1 + M2, depending on the data.

In most papers only one parameter, e.g. time, is passed to the forecast algorithm, and the algorithm's performance is compromised.
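To make the decomposition step concrete, here is a minimal sketch using a Haar multiresolution analysis written in plain Python. It is an assumption-laden stand-in: the choice of the Haar wavelet, the column-by-column treatment of M, and the function names are mine, and the series length is assumed to be divisible by 2**levels. Each series is split into additive components that sum back to the original, and the energy fraction of each component is reported so the dominant ones can be selected.

```python
# Minimal Haar multiresolution sketch (pure Python, no wavelet library).
# Assumes the series length is divisible by 2**levels.

def block_means(x, size):
    """Replace each block of `size` samples by its mean (Haar approximation)."""
    out = []
    for start in range(0, len(x), size):
        block = x[start:start + size]
        mean = sum(block) / len(block)
        out.extend([mean] * len(block))
    return out

def haar_components(x, levels):
    """Return [detail_1, ..., detail_levels, trend], which sum back to x."""
    components = []
    previous = list(x)
    for level in range(1, levels + 1):
        approx = block_means(x, 2 ** level)
        components.append([p - a for p, a in zip(previous, approx)])
        previous = approx
    components.append(previous)  # coarsest approximation, i.e. the trend
    return components

def energy_fractions(x, components):
    """Share of the total squared amplitude carried by each component."""
    total = sum(v * v for v in x)
    return [sum(v * v for v in c) / total for c in components]

def decompose_matrix(rows, levels):
    """Decompose each column (one variable over time) of a row-per-time matrix."""
    columns = [list(col) for col in zip(*rows)]
    return [haar_components(col, levels) for col in columns]

series = [2.0, 4.0, 6.0, 8.0, 9.0, 7.0, 5.0, 3.0]
parts = haar_components(series, 2)
fractions = energy_fractions(series, parts)
rebuilt = [sum(vals) for vals in zip(*parts)]
print("energy fractions:", [round(f, 3) for f in fractions])
```

Because the Haar components are mutually orthogonal, the energy fractions add up to one, which makes "keep the components with the largest fraction" a well-defined selection rule. A library such as PyWavelets would be the natural choice for other wavelet families.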

Dara

• Options

Perhaps this is too pedantic a concern, but prediction based on dead-reckoning extrapolation of multiple variate indices will continue to be suboptimal as long as the individual indices are not by themselves predictable.

Until the data is reduced to known periodic or at least deterministic factors, the extrapolation will continue to diverge quickly a few years into the future.

As a case in point, consider whether Total Solar Irradiance (TSI) is a driver of ENSO variability. If it were indeed a factor, we would need to predict future sunspot activity, which is a challenge on its own. The same would go for volcanic activity, which some have also considered a potential driver.

However, if more predictable forcing factors, such as tidal forces, were the main source, then there is a better chance of finding a good prediction algorithm.

And I realize this is not the best scientific approach because it is the equivalent of the drunk looking for his car keys under the lamp-post, but I am thinking that starting from known sources may be more productive.

So, in keeping with the theme of this discussion thread, I think it is a very good idea to continue to look for related data to ENSO.

Other data that I have considered are observations of Length of Day (LOD) of the earth's rotation and the Quasi-Biennial Oscillations (QBO) of the stratospheric winds. Also the Chandler Wobble has been associated with ocean dynamics. In pure terms, these observables show arguably more determinism so may link better to natural sources.

This is basically the way that I am approaching the challenge.

"Never make a calculation until you know the answer: make an estimate before every calculation, try a simple physical argument (symmetry! invariance! conservation!) before every derivation, guess the answer to every puzzle. Courage: no one else needs to know what the guess is. Therefore make it quickly, by instinct. A right guess reinforces this instinct. A wrong guess brings the refreshment of surprise. In either case life as a spacetime expert, however long, is more fun!"


Edwin F. Taylor describing John A. Wheeler's approach to problem solving.

• Options

Hello Paul

You are not pedantic, and I am amazed at your abilities.

multiple variate indices will continue to be suboptimal as long as the individual indices are not by themselves predictable.

I am not sure that is always applicable (I use the term well-behaved loosely):

1. Assume x is a list of random numbers of some kind, and compute y = m*x + b for constants m and b. Clearly both x and y are random and unpredictable, but the collection of (x, y) pairs is very well behaved, i.e. it moves along a perfect line! So I cannot tell you where each coordinate x and y might be next, but I can tell you they will be on a fixed line. In general you do this with the differential equation you published: you do not know where the variables of the differential equation will be next, but you know the inter-relationship between them is bound by that equation.

2. Assume x is random and y = p(x)x + q(x), where p and q are well-behaved functions. Then xp(x) - y is well behaved and predictable, i.e. in a multivariate data set of (x, y) pairs the function xp(x) - y is well behaved and predictable. Now to make it more real, add a small noise e to p(x); then xp(x) - y is well behaved save for a small noise term, which we could remove or flag as an error.

3. Think of a robotic copter, facing random gusts, and lifting a sack of liquid, the forces applied to the copter are random and multivariate, but you could have a guidance system (machine learning type) that will land and take-off successfully.

So I am not sure about the multivariate systems.
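The first example above can be checked numerically with a small sketch. The constants, sample size, and helper names are arbitrary choices for the demonstration: x is drawn at random, so its past says almost nothing about its next value, yet a least-squares fit recovers the line that ties x and y together with essentially zero residual.

```python
import random

def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def lag1_autocorrelation(values):
    """Correlation between each value and the one that follows it."""
    mean = sum(values) / len(values)
    num = sum((a - mean) * (b - mean) for a, b in zip(values, values[1:]))
    den = sum((v - mean) ** 2 for v in values)
    return num / den

rng = random.Random(0)
m, b = 2.0, -1.0                       # arbitrary constants for the demo
xs = [rng.gauss(0.0, 1.0) for _ in range(200)]
ys = [m * x + b for x in xs]

slope, intercept = fit_line(xs, ys)
max_residual = max(abs(y - (slope * x + intercept)) for x, y in zip(xs, ys))
print("lag-1 autocorrelation of x:", round(lag1_autocorrelation(xs), 3))
print("fitted line:", round(slope, 6), round(intercept, 6))
print("largest residual:", max_residual)
```

This only shows that the relationship between the variables is predictable; it does not make either variable forecastable on its own, which is the point under discussion.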

Dara

• Options

So, in keeping with the theme of this discussion thread, I think it is a very good idea to continue to look for related data to ENSO.

agreed.

I have a hodge-podge approach: collect a bunch of data, apply a bunch of code to forecast, do postmortem stats and study the cases, start all over again, and again ...

If I do my hodge-podge, then other smart people can make proper inferences and find relationships. I think the division of labor should be as follows: the computing fellows do the computing, and the theoreticians and scientists do the models and inferences. I think the problems arise from a lack of division of labor :)

Dara

• Options

Dara, Just some observations.

With ENSO, your equation y = m*x + b could have y = SOI pressure and x = SST or thermocline, which doesn't add a lot to the puzzle except as related data.

I believe your copter example is an illustration of a Kalman filter application, whereby a stochastic model of the noise is used to create an optimal control strategy. The Kalman filter is there to remove the undesired noise to estimate the actual system state more accurately. This is still a kind of "dead-reckoning" class of model in that "only the estimated state from the previous time step and the current measurement are needed to compute the estimate for the current state". http://en.wikipedia.org/wiki/Kalman_filter
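For readers unfamiliar with the filter, here is a minimal scalar sketch, assuming a random-walk state observed with Gaussian noise. The variances, the seed, and the constant "true" value are made-up demo parameters. Note how each step uses only the previous estimate and the current measurement, which is the "dead-reckoning" character described above.

```python
import random

def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state observed with noise."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed to persist; its uncertainty grows.
        p = p + process_var
        # Update: blend the prediction with the current measurement.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates

rng = random.Random(1)
true_value = 5.0                       # assumed constant state for the demo
noisy = [true_value + rng.gauss(0.0, 1.0) for _ in range(200)]
estimates = kalman_1d(noisy, process_var=1e-4, meas_var=1.0)

mean_abs_noise = sum(abs(z - true_value) for z in noisy) / len(noisy)
print("mean absolute measurement error:", round(mean_abs_noise, 3))
print("final estimate error:", round(abs(estimates[-1] - true_value), 3))
```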

But what happens if the noise is possibly very small? Noise ceases to be noise when it can be ascribed to some other physical phenomenon that is more predictable. Then we have conceivably an equation state with a residual describing the error from that model.

So in general, I agree with you as long as we have more (to use your terms) "well behaved and predictable" factors rather than "random and unpredictable".

I can see what you mean if we take the nearly monotonic GHG forcing as an example of a factor that is mixed in with, say, red noise and then can try to isolate the GHG forcing from the noisy temperature trend. The GHG forcing may be more well-behaved and predictable here. But that is not ENSO, which has no long-term trend and therefore the signal is more "broad band" than "DC". So the discrimination of noise from signal is more difficult with ENSO.

• Options
edited August 2014

Hello Paul

I feel lucky to have you here, about CONTROL system strategy you mentioned above:

Idea: A quadcopter faces gusts while carrying a heavy, sloshing sack of liquid. The wind and the sloshing of the liquid are difficult to predict, but (adaptive, non-linear) navigation software could control the copter so that it takes off and lands properly! How is that possible?

Motivation: We could build a function that is unpredictable and yet its inverse is fully predictable:

f: N ---> N

Here f is a random variable, i.e. f(n) is random; however, the inverse of f, call it FINV, is fully predictable:

f(n) ---> n

f(n+1) ---> n+1

f(n+2) ---> n+2

In other words, someone looking at the past few pairs f(n) ---> n could guess the next value accurately!

For navigation systems, e.g. the quadcopter earlier, that works well, if we use the Phase Space map for f, and wind and other forces could be hard to predict:

f: Phase Space ---> Position or R^3

The Phase Space is a subset of R^4 made of tuples (rotor1, rotor2, rotor3, rotor4), which are the RPMs of each rotor in the copter; by altering these RPM values the copter flies, lands, and so on.

f(rotor1, rotor2, rotor3, rotor4) = (x, y, z)

So the problem is how to change 4 RPM values so the copter lifts off or lands in wind with sloshing sack, f is not well behaved:

Work with f inverse or FINV(x,y,z) = (rotor1, rotor2, rotor3, rotor4)

Make a list of the history of flight, _i means at time i or ith, but TIME SHIFT THE LIST:

(x,y,z)_i ---> (rotor1, rotor2, rotor3, rotor4)_{i-1}

(x,y,z)_{i+1} ---> (rotor1, rotor2, rotor3, rotor4)_i

(x,y,z)_{i+2} ---> (rotor1, rotor2, rotor3, rotor4)_{i+1}

...

Run a forecast algorithm on FINV, call it FINV_forecast, and set up the forecast for take-off as follows:

if the copter is at position (x, y, z), what should the 4 RPMs change to so that the copter finds itself at (x, y, z+1) afterwards, or

FINV_forecast(x, y, z+1)_{i+1} = (rotor1, rotor2, rotor3, rotor4)_i

You no longer need to worry about the wind or sloshing, since the algorithm is localized in time and does localized adaptive forecasts.
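A minimal sketch of the time-shifted history and a localized forecast on it is given below. The nearest-neighbour averaging used for `finv_forecast` is my own stand-in for whatever adaptive forecast algorithm is intended, and the flight log is invented toy data.

```python
def build_shifted_history(positions, rotor_settings):
    """Pair the position at step i+1 with the rotor settings applied at step i."""
    return list(zip(positions[1:], rotor_settings[:-1]))

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def finv_forecast(history, target_position, k=3):
    """Average the rotor settings of the k past positions nearest the target.

    A hypothetical local forecast: it only looks at recent experience near
    the requested position and never models wind or sloshing explicitly.
    """
    nearest = sorted(
        history, key=lambda pair: squared_distance(pair[0], target_position)
    )[:k]
    n_rotors = len(nearest[0][1])
    return tuple(
        sum(rotors[j] for _, rotors in nearest) / len(nearest)
        for j in range(n_rotors)
    )

# Toy flight log: positions (x, y, z) and the four rotor RPMs at each step.
positions = [(0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, 3)]
rotors = [(3100,) * 4, (3200,) * 4, (3300,) * 4, (3400,) * 4]

history = build_shifted_history(positions, rotors)
suggested = finv_forecast(history, (0, 0, 2), k=1)
print("rotor settings that previously led to (0, 0, 2):", suggested)
```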

Apply this for El-Nino:

f: time ---> (T1, T2, T3, T4, T5) the 5 NEXT consecutive month temperature

This function f is difficult to predict, so let's work on FINV, but choose a Phase Space in place of time, where var1 ... varn are any parameters we guess impact El Nino, e.g. CO2 emissions or rain volume in California:

f: (var1, var2, ... varn) ---> (T1, T2, T3, T4, T5)

FINV: (T1, T2, T3, T4, T5) ---> (var1, var2, ... varn)

Then make a TIME SHIFTED history list:

(T1, T2, T3, T4, T5)_{i+1} ---> (var1, var2, ... varn)_i

(T1, T2, T3, T4, T5)_{i+2} ---> (var1, var2, ... varn)_{i+1}

(T1, T2, T3, T4, T5)_{i+3} ---> (var1, var2, ... varn)_{i+2}

So at time n+1, FINV forecasts (var1, var2, ... varn) at time n; i.e. for a PROPOSED set of the NEXT 5 consecutive El-Nino temperatures, what should var1 ... varn be today? Manually choose 5 temperature values that indicate a strong El-Nino, compute FINV to get var1 ... varn, and compare these phase space parameters to today's actual values. If they are very close, there is a good chance of El-Nino; if the vars are too far off, try another 5 consecutive temperatures.

Use Differential Evolution to minimize the distance between the FINV output for the vars and today's vars, i.e. start with a guess (as in Newton-Raphson) and then iterate the Differential Evolution algorithm until a global minimum is reached. This finds the (T1, T2, T3, T4, T5)_{n+1} that minimizes |(var1, var2, ... varn)_n - FINV_forecast((T1, T2, T3, T4, T5)_{n+1})|.
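The search step could be sketched as follows, using a small hand-written differential evolution (the common rand/1/bin variant) in plain Python. Everything specific here is an assumption for illustration: `toy_finv_forecast` is a made-up stand-in for a trained FINV_forecast, and `todays_vars`, the temperature bounds, and the tuning constants are invented.

```python
import math
import random

def differential_evolution(objective, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=200, seed=0):
    """Minimize `objective` inside box `bounds` with DE (rand/1/bin)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]
    for _ in range(generations):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = rng.sample(others, 3)
            forced = rng.randrange(dim)  # at least one coordinate mutates
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == forced:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]

def toy_finv_forecast(temps):
    """Hypothetical FINV_forecast: maps 5 temperatures to 2 made-up vars."""
    return (sum(temps) / len(temps), temps[-1] - temps[0])

todays_vars = (27.5, 1.0)              # assumed observed values for the demo

def mismatch(temps):
    """Distance between the forecast vars and today's vars."""
    return math.dist(toy_finv_forecast(temps), todays_vars)

best_temps, best_score = differential_evolution(mismatch, [(24.0, 30.0)] * 5)
print("candidate temperatures:", [round(t, 2) for t in best_temps])
print("remaining mismatch:", best_score)
```

In this toy setup many temperature sequences map to the same vars, so the minimizer is not unique; a real application would need to check whether the sequence found is physically plausible.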

This is like building a virtual drone chasing an El-Nino storm, but in reverse time :)

Dara

• Options

Here is some more related data to ENSO: http://contextearth.com/2014/08/15/change-of-tide-in-thought/

This covers the concept of geopotential height, which is a mix of pressure and temperature.

At the end, I wrote this to summarize -- One of the challenges of deconstructing phenomena such as ENSO is that one doesn't exactly know where to start from. There is enough variation in the predictions from assorted ENSO models that one realizes that the foundation may be a bit shaky, if not otherwise governed by chaotic uncertainty. Yet the agreement of closely related phenomena such as atmospheric tides with fundamental theories of forcing functions (à la lunar gravitational effects) provides hope that these promising leads will amount to something that generates better short- or long-term predictions.

• Options

Great, I am going to post a forecast for Darwin index and we could use that to start linking more data into the forecast algorithm and experiment.

• Options
edited August 2014

I haven't managed to decide which El Nino areas need to be analysed (to some depths). Ludescher et al. use a version of EN3.4, Ian Ross's thesis uses EN3, and I've just read this on the NCDC NOAA website:

SST values in the Niño 3.4 region may not be the best choice for determining La Niña episodes but, for consistency, the index has been defined by negative anomalies in this area. A better choice might be the Niño 4 region, since that region normally has SSTs at or above the threshold for deep convection throughout the year. An SST anomaly of -0.5°C in that region would be sufficient to bring water temperatures below the 28°C threshold, which would result in a significant westward shift in the pattern of deep convection in the tropical Pacific.

Sea surface temperature anomalies were calculated using the Extended Reconstructed Sea Surface Temperature version 3 (ERSST.v3).

Equatorial Pacific sea surface temperatures

The obvious but presumably computationally costly answer is all of them.

• Options

Hello Jim

Thanks, I will take the sub-grid for the equatorial temperatures and attempt forecast algorithms without applying any moving averages.

My earlier computations show there is a trend in this data that vibrates north-south; in other words, the equatorial temperature sub-grid is not enough of an input for such forecasts.

I also played around with the k-nearest neighbors (k-NN) algorithm last night and found that non-Euclidean metrics produce much better similarities between the sub-intervals of the Darwin anomaly data than the Euclidean metric does.
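As a sketch of what swapping the metric in such a similarity search looks like, the snippet below finds the past sub-intervals most similar to the latest one under a pluggable distance. The window length, the toy series, and the function names are assumptions; it makes no claim about which metric works best on the Darwin data.

```python
def minkowski(u, v, p):
    """Minkowski distance: p=1 is Manhattan, p=2 is Euclidean."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)

def manhattan(u, v):
    return minkowski(u, v, 1)

def euclidean(u, v):
    return minkowski(u, v, 2)

def chebyshev(u, v):
    """Largest coordinate-wise difference (the p -> infinity limit)."""
    return max(abs(a - b) for a, b in zip(u, v))

def nearest_windows(series, window, k, metric):
    """Start indices of the k past windows most similar to the latest window.

    Only windows that do not overlap the latest one are considered.
    """
    query = series[-window:]
    starts = range(0, len(series) - 2 * window + 1)
    ranked = sorted(starts, key=lambda s: metric(series[s:s + window], query))
    return ranked[:k]

# Toy anomaly series; the last two values form the query window.
toy_series = [0, 1, 0, 1, 5, 5, 0, 1, 9, 9, 0, 1]
for name, metric in [("manhattan", manhattan),
                     ("euclidean", euclidean),
                     ("chebyshev", chebyshev)]:
    print(name, nearest_windows(toy_series, window=2, k=3, metric=metric))
```

The values that follow each matched window could then serve as the k-NN forecast for what follows the latest window.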

I will try to post something, and pick John's brain on modeling these data with non-Euclidean geometries.

The issue in all this is you need heavy computing or we will not see the patterns in the data and the structures that could tell us what is happening.

Dara