Nearest Neighbor Algorithm: Darwin Anomalies

I ran the Nearest Neighbor algorithm on the Darwin Anomalies and on the Darwin Delta Anomalies (Darwin(t+1) - Darwin(t)). That is, I took a sub-interval of the Darwin Delta Anomalies and found its nearest neighbor by comparing it against all other sub-intervals of the same length.

I did this using 8 different metrics.
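A minimal sketch of this sub-interval nearest-neighbor search, using synthetic data in place of the real Darwin Delta Anomalies (not reproduced here). The post does not list the 8 metrics, so the set below is an assumption drawn from SciPy's built-in distance functions:

```python
import numpy as np
from scipy.spatial.distance import cdist

# Hypothetical stand-in for the Darwin Delta Anomalies series
rng = np.random.default_rng(0)
series = rng.standard_normal(200)

window = 12  # sub-interval length in months (an assumed choice)

# All overlapping sub-intervals of the same length
windows = np.lib.stride_tricks.sliding_window_view(series, window)

# Eight metrics, as in the experiment; this particular list is an assumption
metrics = ["euclidean", "cityblock", "chebyshev", "cosine",
           "correlation", "canberra", "braycurtis", "minkowski"]

query = windows[0:1]           # the sub-interval we search a neighbor for
candidates = windows[window:]  # skip windows that overlap the query

for metric in metrics:
    d = cdist(query, candidates, metric=metric)[0]
    print(metric, "-> nearest neighbor at offset", int(np.argmin(d)) + window)
```

Running this on the real series would show which metrics agree on the same nearest neighbor.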

The algorithm seemed to agree on a particular nearest neighbor across metrics for the Darwin Delta Anomalies; in particular, Correlation Distance and Cosine Distance agreed on 5+ nearest neighbors!

This is because the mean of the Darwin Delta Anomalies is almost 0, and Correlation Distance is just Cosine Distance applied to mean-centered vectors, so the two metrics nearly coincide for this series.
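This near-coincidence can be checked directly with synthetic zero-mean vectors standing in for the anomaly deltas:

```python
import numpy as np
from scipy.spatial.distance import cosine, correlation

rng = np.random.default_rng(1)
# Two vectors forced to (almost) exactly zero mean,
# mimicking the Darwin Delta Anomalies
u = rng.standard_normal(50)
u -= u.mean()
v = rng.standard_normal(50)
v -= v.mean()

# Correlation distance is cosine distance of the mean-centered vectors,
# so for zero-mean data the two metrics coincide
print(cosine(u, v), correlation(u, v))
```

For data whose mean is merely close to 0, the two distances differ only slightly rather than agreeing exactly.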

What does this mean?

Correlation for the Darwin Delta Anomalies is then geometrical rather than statistical. Therefore the correlations for the Darwin Delta Anomalies could be classified by a certain cone (of tolerance): outside the cone there is no correlation, inside the cone there is high correlation.

Therefore, possibly, a cone could be defined for a strong El Niño: the vectors within the cone correlate with the strong El Niño, and those outside show no correlation.

Other data of a similar or dissimilar nature could be appended to intervals of the Darwin Delta Anomalies, and as long as the mean stays close to 0, this cone classification would hold.

It might provide a better equation form for modeling El Niño, namely a condition of the form angle < tolerance.
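A sketch of such an angle < tolerance test; the reference vector and threshold below are made up for illustration, standing in for an actual El Niño pattern:

```python
import numpy as np

def in_cone(x, reference, tolerance_deg):
    """Return True if vector x lies inside the cone of the given
    half-angle (degrees) around the reference vector."""
    cos_angle = np.dot(x, reference) / (np.linalg.norm(x) * np.linalg.norm(reference))
    angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
    return angle < tolerance_deg

ref = np.array([1.0, 2.0, -1.0, 0.5])                 # stand-in El Niño pattern
near = ref + 0.05 * np.array([1.0, -1.0, 1.0, 1.0])   # slightly perturbed copy
far = -ref                                            # points the opposite way

print(in_cone(near, ref, tolerance_deg=15.0))  # inside the cone: "correlated"
print(in_cone(far, ref, tolerance_deg=15.0))   # outside: "uncorrelated"
```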




    The nearest neighbor results do not reveal any patterns, i.e. the nearest neighbor could be 100 months ago or 12 months ago; there are no particularly noticeable relationships.


    What does this nearest-neighbor relationship mean? ((Darwin(t+1)+Darwin(t-1))/2 - Darwin(t))

    Or is this a trick question?

    edited August 2014

    There are two ways to deal with correlations:

    1. Statistical methods, which are very limited: you get a correlation number and that is it.
    2. Replace the concept of correlation with similarity: use similarity metrics (i.e. metric-space distance functions), vary the metric, segment the data, and measure the similarity distances to find similar patterns inside the same data or across different data.

    Item 2 is used in machine learning and is quite effective for surveying large data to find patterns of all sorts, followed by a fast forecast algorithm, e.g. k-NN regression. Dara
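    A minimal sketch of the k-NN regression forecast mentioned above, on synthetic data; the window length, k, and the Euclidean metric are assumed here, not taken from the post:

```python
import numpy as np

def knn_forecast(series, window, k=3):
    """One-step-ahead forecast via k-nearest-neighbor regression:
    find the k past windows most similar to the latest window and
    average the values that followed them."""
    windows = np.lib.stride_tricks.sliding_window_view(series[:-1], window)
    followers = series[window:]      # value that followed each past window
    query = series[-window:]
    d = np.linalg.norm(windows - query, axis=1)  # Euclidean; any metric works
    nearest = np.argsort(d)[:k]
    return followers[nearest].mean()

rng = np.random.default_rng(2)
t = np.arange(300)
series = np.sin(2 * np.pi * t / 37) + 0.1 * rng.standard_normal(300)
print(knn_forecast(series, window=12, k=3))
```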


    This is a pattern I often see in machine-learning time-series analysis of ENSO SST and SOI.

    The tool discovers terms that follow this pattern:

    Y'' = sin(wt-kY)

    From the substitution U = wt-kY

    this gives

    U'' = - k sin(U)

    The solution to this is the Jacobi amplitude function (am).


    What the effect amounts to is to essentially provide a "braking" to excursions of a sharp wave amplitude and thus to extend the period by flattening the peaks. It is a nonlinear effect that depends on near-neighbors that define the second derivative and amplitude.
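    The period-stretching ("braking") can be checked numerically by integrating U'' = -k sin(U) and comparing the period at small and large amplitude; k = 1 below is an assumed value:

```python
import numpy as np
from scipy.integrate import solve_ivp

K = 1.0  # assumed stand-in value for the constant k

def pendulum(t, y):
    # U'' = -k sin(U), written as a first-order system in (U, U')
    u, v = y
    return [v, -K * np.sin(u)]

def period(u0):
    """Estimate the oscillation period for initial amplitude u0 from the
    first zero crossing of U, which marks a quarter period."""
    sol = solve_ivp(pendulum, [0.0, 50.0], [u0, 0.0],
                    dense_output=True, rtol=1e-10, atol=1e-10)
    t = np.linspace(0.0, 50.0, 200001)
    u = sol.sol(t)[0]
    quarter = t[np.argmax(u < 0.0)]  # first time U dips below zero
    return 4.0 * quarter

small = period(0.1)  # small amplitude: period close to 2*pi/sqrt(k)
large = period(3.0)  # near the peak of sin: the "braking" stretches the period
print(small, large)
```

    The large-amplitude period comes out substantially longer than the small-amplitude one, which is the flattened-peak, extended-period behavior described above.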

    In terms of other physical systems, I have seen references to the Jacobi amplitude function in papers on "Oscillations of a body with an orbital tethered system" and in "Complex trajectories in a classical periodic potential".

    This nonlinear modulation is not as common as the Mathieu effect of

    Y'' = sin(wt) Y

    but does appear often enough to be interesting.


    Thanx Paul, I am thinking about this, I might make some pdf files to investigate more... very soon
