CDF: Tidal vs. ENSO for Paul

Paul's Tidal vs. ENSO

Hello Paul

This is an Enterprise CDF for your diff EQ work.

First, go to the input field that contains the diff EQ and boundary values, alter the coefficients or anything else, and simply hit RETURN; the entire solution plot updates on the fly.

Second, press SAVE SOLUTION AS and two files are placed in your MY DOCUMENTS: a .csv with the numerical solution and a .mml with the actual diffEQ + boundary conditions.

Third, move the range slider to slide up and down the large plot.
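
For anyone reading along, here is a minimal sketch of how a panel like this can be wired up in the Wolfram Language; the sample equation, names, and coefficient ranges are my illustrative assumptions, not the actual CDF source:

```
(* Minimal sketch of an interactive solve-and-plot panel. The sample     *)
(* equation and coefficient ranges are placeholders, not the CDF's code. *)
Manipulate[
 Module[{sol},
  (* re-solve whenever a coefficient changes, then plot on the fly *)
  sol = NDSolve[{y''[t] + a y'[t] + b y[t] == Sin[t],
      y[0] == 1, y'[0] == 0}, y, {t, 0, 50}];
  Plot[Evaluate[y[t] /. sol], {t, 0, 50}, PlotRange -> All]],
 {{a, 0.1}, 0, 1}, {{b, 1}, 0, 4}]
```

The SAVE SOLUTION AS behaviour could be approximated with `Export["solution.csv", ...]` for the numerical solution and `Export["equation.mml", eqns, "MathML"]` for the diffEQ + boundary conditions.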

If you give me the ENSO data, or whatever data you were comparing this against, then I will do a double plot and report the errors and correlations.

BTW, you should not put your photos online; they might end up in my CDFs.

Dara

Comments

  • 1.
    edited October 2014

    thanks Dara,

    For that equation, the data is here http://www.psmsl.org/data/obtaining/rlr.monthly.data/65.rlrdata

    or this one http://www.psmsl.org/data/obtaining/rlr.monthly.data/196.rlrdata

    Easy to parse: semicolon-delimited year/height pairs; a few bad data points have -9999 entries.
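
    A sketch of how one of these files might be parsed in the Wolfram Language (an assumption based on the description above; the PSMSL files may carry extra flag fields after the first two, so only year and height are kept here):

    ```
    (* Sketch: parse a PSMSL rlr monthly file; -9999 marks missing data *)
    raw = Import["65.rlrdata", "Table", "FieldSeparators" -> ";"];
    pairs = raw[[All, 1 ;; 2]];               (* keep year and height  *)
    good = Select[pairs, #[[2]] != -9999 &];  (* drop bad data points  *)
    ListLinePlot[good, AxesLabel -> {"year", "height (mm)"}]
    ```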

  • 2.

    Thanx Paul, I will add these to the CDF.

    If you play with the coefficients of your equations you will see a lot of interesting effects.

    I will change my code to make sure it allows for more iterations for convergence in NDSolve.

    So these are my goals for these sorts of CDFs: first, the equations are solved numerically in reasonable time; second, the equations and boundary conditions can be altered interactively; third, you learn the behaviour of the diff EQ by repeated use of #1 and #2.

    Dara

  • 3.

    Dara, I think your CDF idea will work out really well.

    Here is something to shoot for that may knock your socks off:

    ![sydneyTides](http://imageshack.com/a/img661/9030/b59rUU.gif)

    The raw tidal data was processed with a zero-lag 12-month box filter to get rid of the annual and semiannual signals (which are more of a nuisance IMO). I actually didn't use an automated search on this, but brute-forced the fit by changing the parameters by hand. The biennial forcing RHS and Mathieu modulation LHS are equally important. If you look closely, there is also a break-point at the year 1953 where the biennial switches parity -- odd-to-even or even-to-odd, depending on how we want to define it. I highlighted the regions in yellow where it looks like the biennial pattern was getting out of sync -- the area at 1955 may be the response to the transition at 1953.
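
    For readers who want the model written out: from this description (Mathieu modulation on the LHS, biennial forcing on the RHS, and the parameter list given later in #9), the fitted equation is presumably something of the schematic form

    $$ f''(t) + \left[a + q \cos(\omega t + \phi_1)\right] f(t) = A \sin(\pi t + \phi_2) + c, $$

    with $t$ in years, so that the RHS sine has the 2-year biennial period; the exact form used for the fit is not spelled out in this thread.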

    This may be leading to something. What do you think?


    Besides your CDF approach, another interesting possibility, which I have learned from the Matlab world, is to use a server to host a time-limited online competition to try to find the best fit. The server can identify each competitor and store the set of coefficients which will improve the fit the best. There are all sorts of gaming aspects to this approach as the competitors may decide to hold back on submitting their results until the very end. The rules are that there are no rules -- competitors can use machine learning concepts or do it by hand and use other people's contributions to improve their own fit.

  • 4.

    Did you try your new DIFF EQ on the CDF? All you have to do is change the constants.

    About your game, I could simply code the SAVE AS button to save the results onto our server under an assumed name, or email the results to a group of recipients.

    Dara

  • 5.

    Dara asked:

    > Did you try your new DIFF EQ on the CDF?

    I am certain that it will work the same. The example you provided had the right pattern, so I trust that it will work as advertised for other DiffEQs.

    Here is my evaluation:

    ![daraEq](http://imageshack.com/a/img673/2077/g1y8rF.gif)

    Looks very good! Thanks!

  • 6.

    Good to see you used it. I could make all kinds of calculators like this, mixing symbolic, numeric, graphics, geometry, and data ...

    I will add the data tonight to this CDF.

    Thanx for using it,

    Dara

  • 7.

    Dara, the beauty of this is that the equation is so simple, yet it is able to reconstruct as complicated a waveform as the one shown in #4. So the entire formulation fits in the input box of the CDF.

    This is essentially what we call in the sciences a "one-liner". The fact that the formulation is trained/optimized with respect to post-1953 data and then validated against the pre-1953 data indicates that this may be a winning approach. The [Akaike Information Criterion](http://en.wikipedia.org/wiki/Akaike_information_criterion) (AIC) metric for this may also be quite good -- low complexity and excellent fit.
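
    For reference, the AIC captures exactly that trade-off: with $k$ fitted parameters and maximized likelihood $\hat{L}$,

    $$ \mathrm{AIC} = 2k - 2\ln\hat{L}, $$

    so a model that fits well with few parameters scores low (better).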

    Looking forward to what other ideas you have!

  • 8.

    > This is essentially what we call in the sciences a “one-liner”.

    As long as you do not call it 'crackpot' I am on safe grounds :)

    > Looking forward to what other ideas you have!

    Lots of computational ideas; my concern is to bring them to production, like the CDF here, and also to have John use them seriously for research.

    Dara

  • 9.

    One of the challenges that one always has to defend against is that of over-fitting. The tidal sloshing model has very few parameters, with the biennial forcing fixed at a specific period. The main Mathieu parameters are the $a$ and $q$ amplitudes and the $\omega$ sloshing modulation frequency. After that we have two phase angles, the forcing amplitude, a drift term if needed, and the initial conditions.
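
    To make the small parameter count concrete, here is the model written out as a hypothetical NDSolve call; the functional form and every numerical value below are illustrative assumptions, not the actual fitted code:

    ```
    (* Hypothetical parameterization of the sloshing model described   *)
    (* above; all values are placeholders, t is in years.              *)
    params = {a -> 2.8, q -> 1.2, w -> 6.3, phi1 -> 0.4, phi2 -> 1.1,
       amp -> 0.5, drift -> 0.01, y0 -> 0, v0 -> 0.1};
    sol = NDSolve[{
        y''[t] + (a + q Cos[w t + phi1]) y[t] ==
          amp Sin[Pi t + phi2] + drift,    (* biennial forcing + drift *)
        y[1880] == y0, y'[1880] == v0} /. params,
      y, {t, 1880, 2014}, MaxSteps -> 100000];
    (* amplitude-insensitive goodness of fit against a monthly series: *)
    (* Correlation[Table[First[y[t] /. sol], {t, 1880, 2013, 1/12}], data] *)
    ```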

    The overall amplitude is not critical if we use a correlation coefficient or a related function as the goodness-of-fit criterion.

    The parsimony in this model is that the formulation is a direct lift from a hydrodynamics textbook on sloshing [1], while the biennial forcing is one of those terms that derives from symmetry considerations. In other words, something like this physical process should be happening, but until a simple enough model is applied, no one could find a toehold from where to start.

    That's the rock climber's dilemma -- you know it's a challenging route so you spend time looking for the best toehold. This could be it.

    [1] O. M. Faltinsen and A. N. Timokha, *Sloshing*, Cambridge University Press, 2009.

  • 10.

    I need to know how you handled the missing data (-9999).

  • 11.

    The diffEQ in #4 produces a solution with periods around 20-30 months, plus longer ones. (A wavelet transform showed me this.)

    However, you filtered the data to keep the high-frequency activity, while the solutions have almost zero amplitude at high frequencies, i.e. periods of less than 20 months.

    D

  • 12.

    Dara, I combined the two Sydney data sets. The set with the earliest data has no missing entries, so I spliced #2 into that to get one series with only two missing entries, and those are just interpolated.
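
    A sketch of that splice-and-patch step, assuming two parsed {year, height} series and isolated missing points (the names are mine):

    ```
    (* Splice series2 onto the end of series1, then patch isolated  *)
    (* -9999 entries by averaging their neighbours.                 *)
    spliced = Join[series1, Select[series2, #[[1]] > series1[[-1, 1]] &]];
    patch[list_] := Module[{l = list},
       Do[If[l[[i, 2]] == -9999,
          l[[i, 2]] = (l[[i - 1, 2]] + l[[i + 1, 2]])/2],
        {i, 2, Length[l] - 1}];
       l];
    clean = patch[spliced];
    ```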

    This is my most recent analysis. http://contextearth.com/2014/09/21/an-enso-predictor-based-on-a-tide-gauge-data-model/

    I think you have to filter the high freq stuff out because some of that is due to steric and other seasonal effects that are not sloshing-related. But then you add the forcing in later.

    The latest plot:

    ![latest](http://imageshack.com/a/img903/7049/yxkLF2.gif)

  • 13.

    > I think you have to filter the high freq stuff out because some of that is due to steric and other seasonal effects that are not sloshing-related

    Paul, MeanFilter[36] removes SOME of the frequencies with periodicities of 76+ months and leaves the shorter periods untouched.

    Dara

  • 14.

    Paul, I cannot sync with the way you read your data. I will publish a CDF today; maybe I will take the data out and you will read it from a local file in MY DOCUMENTS. This way the CDF is independent of the data.

    I am not sure: your solutions to the diffEQ produce periodicities from 20 months to 152 months, while you tried to remove this range from the original data; somehow I get confused here. If you cut the data in that range out, you should have solutions that are also out of that range, if you set up the equations properly. Just a thought; otherwise why cut them out?

    Dara

  • 15.

    I looked at the output of MeanFilter[36] by itself, and it is smooth in comparison to the higher freq signal I am capturing.

    The removed low-freq and trend are shown below in blue:

    ![low pass](http://imageshack.com/a/img743/1209/CCpIDJ.gif)

  • 16.

    Paul, you subtract this MeanFilter[36] from the raw data, and what you are left with is the high-frequency signal; what you plotted in #16 is without subtraction.
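
    In code, the subtraction is a one-liner (a sketch, with `data` standing in for the raw monthly series):

    ```
    (* MeanFilter[data, 36] averages over a radius-36 window, i.e.    *)
    (* 73 months, so subtracting it leaves the high-frequency signal. *)
    highFreq = data - MeanFilter[data, 36];
    ListLinePlot[{data, highFreq}, PlotLegends -> {"raw", "high-freq"}]
    ```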

    Personally, having done both versions, yours and wavelets, my preference is wavelets.

    Dara

  • 17.

    Dara, I think it is critical that something like a MeanFilter is applied to the data. What is needed is a transformation that creates a signal with balanced excursions that are + and - in amplitude. This will remove any secular trend, such as that associated with global warming, while also accounting for the multidecadal variations that are associated with ocean/land transfers of water.

    http://podaac.jpl.nasa.gov/OceanEvents/GRACE_2010-11_GMSL_ENSO_Oct2012

    What happens is that significant amounts of rain water can get temporarily trapped in huge basins such as occur in Australia. These subtract from the ocean sea-level, but take time to equilibrate.

    ![rain](http://podaac.jpl.nasa.gov/sites/default/files/content/OceanStory-GRACE-2012-10-fig3_edited.JPG)

    These are not part of the sloshing signal that we are trying to identify.

    It is all about trying to isolate the signal related to sloshing and removing everything else. We will never be totally successful unless corrections are applied for changes due to steric expansion of sea-water and mass loss/gain of water, which contribute to sea-level height variations not related to sloshing.

    But let's not forget how close we are getting, judging from the results of #13!

    I have some questions on wavelets which I will ask in another comment.

  • 18.

    I am not thinking about any of this; I am just concerned with the computing, and with the CDF matching your results.

    I added the MeanFilter to the CDF, with a slider so you can change its value.

  • 19.

    Dara,

    OK, let me back up a few steps. I did the manual correction of the -9999 values in a spreadsheet. While in the spreadsheet, I also applied a balanced 12-point box averager to the monthly data to remove the annual and biannual signal. Much like what happens with the SOI data, any seasonal fluctuations obscure the longer term fluctuations. Discrimination is the key here -- any correlation coefficient computed will be determined completely by the annual signal, leaving little discriminating power for the residual, which is what we are interested in.
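
    A sketch of such a balanced 12-point box average, assuming monthly values; the half-weight endpoints are the standard way to give an even-length window zero lag:

    ```
    (* Zero-lag 12-month box average: a 13-point centred kernel with  *)
    (* half weights at the endpoints; the kernel sums to 1.           *)
    weights = Join[{1/2}, ConstantArray[1, 11], {1/2}]/12;
    smoothed = ListConvolve[weights, monthly];  (* 12 points shorter  *)
    deseasoned = monthly[[7 ;; -7]] - smoothed; (* align and subtract *)
    ```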

    Same thing with diurnal and semidiurnal tides. If we were really being precise, we would want the daily and hourly values of the tidal gauge, because those changes are huge! But if we did that, we would really be at our wits' end, because we would not see the smaller changes that we are truly interested in.

    Does that now make sense?

  • 20.

    Actually, those filterings will affect the results in diverse ways.

    In order to make a useful division of labour, I will have the CDF read a file from MY DOCUMENTS; it will be a 1D array of numbers, of any length.
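
    A sketch of that file hook (the filename is a placeholder; one number per line is assumed):

    ```
    (* Read a 1D array of numbers from the user's documents folder. *)
    data = Flatten[Import[
        FileNameJoin[{$UserDocumentsDirectory, "mydata.txt"}], "Table"]];
    ```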

    This way you are free to do as you like.

    Dara

  • 21.

    Dara said:

    > Actually, those filterings will affect the results in diverse ways.

    Let's put that to good use. See what you can do with the data and we can contrast the results in #13 with an alternate approach.

    That is the way to do analysis -- hammer away from as many angles as possible.

    BTW, I used a 12-month filter on the tide data before I did this correlation:

    ![](http://imagizer.imageshack.us/a/img912/9310/ag13Vs.gif)

    If I didn't, the two curves would have been buried in an envelope of noise.

    Wavelets are the way to do this analysis at multiple scales, but [Mathieu wavelets](http://en.wikipedia.org/wiki/Mathieu_wavelet) are not available yet AFAIK.

  • 22.

    Paul, I just upgraded to 10.0.1 and my CDF files are working but issuing a warning, so I will not be able to send you anything tonight. I sent a ticket to tech support.
