NinjaTrader Support Forum  
Old 08-06-2012, 02:05 AM   #1
LiquidDrift
Junior Member
 
Join Date: Aug 2012
Posts: 15
Thanks: 2
Thanked 0 times in 0 posts
EMA output from custom DataSeries = wonky

Hey, I'm trying to put my own data into a DataSeries so that I can run Indicator methods on it.

So for example, I set the last 12 values of a DataSeries to be 100.0 and then pass it into an EMA.

Here's the example code:

Code:
for( int i=0; i<12; i++ )
{
	testSeries.Set( i, 100.0 );
}

IDataSeries emaTest = EMA( testSeries, 5 );

for( int i=0; i<5; i++ )
{
	Print( "VALUE: " + testSeries[i].ToString() );
	Print( "EMA: " + emaTest[i].ToString() );
}
I would expect the output from the EMA to be 100.0 since that's all that's being fed into it. However, this is the output:

Code:
VALUE: 100
EMA: 109.097235188783
VALUE: 100
EMA: 113.645852783175
VALUE: 100
EMA: 120.468779174763
VALUE: 100
EMA: 130.703168762144
VALUE: 100
EMA: 146.054753143216
This appears to only be a problem with the EMA. SMA and LinReg both return correct values. If I fill testSeries with a larger number of values (e.g., 30), then it gets closer to the correct EMA value (100.0). However, if I am setting the period to 5, it shouldn't be looking back in the array further than 10 if I'm only looking at the first 5 values.

What am I doing wrong here? Is there a bug in the EMA method?
Old 08-06-2012, 02:10 AM   #2
NinjaTrader_Bertrand
NinjaTrader Customer Service
 
NinjaTrader_Bertrand's Avatar
 
Join Date: Sep 2008
Location: Germany
Posts: 22,421
Thanks: 252
Thanked 982 times in 964 posts

LiquidDrift, the EMA is an infinite filter: every earlier value of the series contributes to the current output, although the weights of older values become very small as the series gets longer.
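
To make the "infinite filter" point concrete, here is a minimal standalone C# sketch (plain console code, not NinjaScript and not NinjaTrader's actual source). It assumes the common recursive EMA definition with a smoothing factor of 2 / (period + 1); the class name EmaDecaySketch and the starting value of 250 are made up for illustration. With a constant input of 100, whatever the EMA held beforehand still contributes, and for a period of 5 that leftover shrinks by a factor of 2/3 per bar. The same ratio of 1.5 appears between successive errors in the output posted above (9.10, 13.65, 20.47, 30.70, 46.05).

```csharp
using System;

public static class EmaDecaySketch
{
    // One step of the textbook recursive EMA (assumed definition):
    // next = alpha * input + (1 - alpha) * previous, with alpha = 2 / (period + 1).
    public static double Step(double previous, double input, int period)
    {
        double alpha = 2.0 / (period + 1);
        return alpha * input + (1.0 - alpha) * previous;
    }

    // Feeds the same input 'bars' times, starting from 'seed'.
    public static double RunConstant(double seed, double input, int period, int bars)
    {
        double ema = seed;
        for (int i = 0; i < bars; i++)
        {
            ema = Step(ema, input, period);
        }
        return ema;
    }

    public static void Main()
    {
        const double seed = 250.0; // hypothetical stale value, chosen only for illustration
        foreach (int bars in new[] { 1, 5, 12, 30 })
        {
            double ema = RunConstant(seed, 100.0, 5, bars);
            Console.WriteLine("bars=" + bars + " ema=" + ema + " error=" + (ema - 100.0));
        }
    }
}
```

The error never reaches exactly zero; it only gets smaller with each additional bar of constant input, which fits the observation that filling 30 values gets closer to 100 than filling 12.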
Old 08-06-2012, 05:56 AM   #3
koganam
Senior Member
 
Join Date: Feb 2008
Location: Durham, North Carolina, USA
Posts: 3,221
Thanks: 24
Thanked 1,233 times in 1,004 posts

You are not quite doing what you think that you are.
Code:
IDataSeries emaTest = EMA( testSeries, 5 );
is not the correct way to populate a DataSeries.
Old 08-06-2012, 11:21 AM   #4
LiquidDrift
Junior Member
 
Join Date: Aug 2012
Posts: 15
Thanks: 2
Thanked 0 times in 0 posts

OK, thanks, I figured out that indeed it is an infinite series and older entries in my "testSeries" were throwing off the values.

@koganam - What am I doing wrong there? I've been populating dataseries that way all over the place with no problems yet. I'm coming from a C++ background, so I may be doing improper assignment for sure. What is the correct way?

Thanks!
Old 08-07-2012, 08:21 PM   #5
koganam
Senior Member
 
Join Date: Feb 2008
Location: Durham, North Carolina, USA
Posts: 3,221
Thanks: 24
Thanked 1,233 times in 1,004 posts

Quote:
Originally Posted by LiquidDrift View Post
OK, thanks, I figured out that indeed it is an infinite series and older entries in my "testSeries" were throwing off the values.

@koganam - What am I doing wrong there? I've been populating dataseries that way all over the place with no problems yet. I'm coming from a C++ background, so I may be doing improper assignment for sure. What is the correct way?

Thanks!
Unfortunately that is not your problem. Your original statement was the correct one. It does not matter how many terms there are in a rectangular distribution, the average value, no matter how it is measured, ema, sma, weighted, etc., will always be exactly the same; the value of every identical member of the distribution. Even a cursory glance at how any average is calculated will make this clear. If all the members of the distribution are 100, then the value of the average MUST be 100, or the average itself must be being miscalculated or improperly defined. This is not simply a mathematical nicety. The definition of average, as the most likely value, means that it must be so. The most likely value of a collection whose every member is 100 cannot be anything but 100.

The problem lies in your code.

As you have written it, on each OnBarUpdate() call you are redeclaring an interface variable, IDataSeries, and initializing it to an unknown state.

To correctly do what you want, you should declare a class variable of type EMA (EMA is a class, hence an object). You then assign/instantiate this named instance of an EMA in either Initialize(), or preferably in OnStartUp(), or even reinitialize it every time in OnBarUpdate() like you have done. But it must be a named instance of the class, not a new declaration of the interface.

Here is what I mean:
Code:
private EMA emaTest;            // named EMA instance held as a class-level field
private DataSeries testSeries;  // custom series from the original post (declaration assumed)
Code:
protected override void Initialize()
{
	testSeries = new DataSeries(this);
}
Code:
protected override void OnBarUpdate()
{
	if (CurrentBar < 5) return;

	// Fill the most recent 12 slots of the custom series with a constant.
	for( int i=0; i<12; i++ )
	{
		testSeries.Set( i, 100.0 );
	}

	emaTest = EMA( testSeries, 5 ); // this statement can go in Initialize(), or OnStartUp(), which is the most efficient place for it.

	// Print the 5 most recent input values next to the EMA values.
	for( int i=0; i<5; i++ )
	{
		Print( "VALUE: " + testSeries[i].ToString() );
		Print( "EMA: " + emaTest[i].ToString() );
	}
}
Old 08-08-2012, 01:20 AM   #6
LiquidDrift
Junior Member
 
Join Date: Aug 2012
Posts: 15
Thanks: 2
Thanked 0 times in 0 posts

Ah, I see. I did not know what a C# interface was; that was helpful, thanks.

I'm still having problems with this however. It appears that DataSeries data lingers in the EMA, even if the DataSeries has been completely overwritten.

For example:

Code:
++barNum;
if( barNum == 40 || barNum == 100 )
{
	Print( "-------------------------" );
	int cnt = Math.Min( testSeries.Count, 256 );
	for( int i=1; i<cnt; i++ )
	{
		testSeries.Set( i, 100.0 );
	}
	testSeries.Set( 0, 500.0 );
	
	emaTest = EMA( testSeries, 5 );

	for( int i=0; i<5; i++ )
	{
		Print( "VALUE: " + testSeries[i].ToString() );
		Print( "EMA: " + emaTest[i].ToString() );
	}

}
return;
Produces this:

Code:
-------------------------
VALUE: 500
EMA: 233.333333333333
VALUE: 100
EMA: 100
VALUE: 100
EMA: 100
VALUE: 100
EMA: 100
VALUE: 100
EMA: 100
-------------------------
VALUE: 500
EMA: 322.222222222222
VALUE: 100
EMA: 233.333333333333
VALUE: 100
EMA: 100
VALUE: 100
EMA: 100
VALUE: 100
EMA: 100
So it runs through the same code twice at different times and comes up with 2 different results. I can get it to work correctly if I insert:

Code:
	EMA( testSeries, 5 ).Dispose();
before this line:
Code:
	emaTest = EMA( testSeries, 5 );
But that results in horrible performance in a backtest and NT eventually runs out of memory. Is there some way to do this properly such that the EMA data is correct, but I don't need to call Dispose() every time?
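
The two printed results differ in a way that plain EMA arithmetic can reproduce. The standalone C# sketch below assumes the recursive EMA definition with alpha = 2 / (period + 1); it is not NinjaTrader's implementation, and EmaCarrySketch is a made-up name. It applies a 500 input to an EMA resting at 100, then applies 500 once more to the retained result. The values come out at about 233.33 and 322.22, matching the output above. That is consistent with the indicator carrying an earlier value forward instead of recomputing from the rewritten series, though nothing in this thread confirms what happens internally.

```csharp
using System;

public static class EmaCarrySketch
{
    // Textbook recursive EMA step; alpha = 2 / (period + 1) is an assumption.
    public static double Step(double previous, double input, int period)
    {
        double alpha = 2.0 / (period + 1);
        return alpha * input + (1.0 - alpha) * previous;
    }

    public static void Main()
    {
        const int period = 5;
        double resting = 100.0;                            // EMA after a long run of 100s

        double firstRun = Step(resting, 500.0, period);    // one 500 on top of 100
        double secondRun = Step(firstRun, 500.0, period);  // a second 500 on the retained value

        Console.WriteLine("first run:  " + firstRun.ToString("F6"));
        Console.WriteLine("second run: " + secondRun.ToString("F6"));
    }
}
```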
Old 08-08-2012, 06:10 AM   #7
koganam
Senior Member
 
Join Date: Feb 2008
Location: Durham, North Carolina, USA
Posts: 3,221
Thanks: 24
Thanked 1,233 times in 1,004 posts

Where are you running this code? As a method, or in an event handler? Yes, it would make a difference.
Old 08-08-2012, 07:18 AM   #8
koganam
Senior Member
 
Join Date: Feb 2008
Location: Durham, North Carolina, USA
Posts: 3,221
Thanks: 24
Thanked 1,233 times in 1,004 posts

Ah, the painful vicissitudes of optimized processors and FPUs. That looks to me like floating point inaccuracies inherent in trying to use digital equipment to approximate floating point numbers.

The clue that it is probably an effect of the optimization, which holds the structure in the pipeline instead of flushing it? The fact that when you explicitly Dispose() of it, the problem goes away.

It looks like you may have to specify the precision of your calculation results if you want consistency.
Old 08-08-2012, 11:57 AM   #9
LiquidDrift
Junior Member
 
Join Date: Aug 2012
Posts: 15
Thanks: 2
Thanked 0 times in 0 posts

Quote:
Originally Posted by koganam View Post
Ah, the painful vicissitudes of optimized processors and FPUs. That looks to me like floating point inaccuracies inherent in trying to use digital equipment to approximate floating point numbers.

The clue that it is probably an effect of the optimization, which holds the structure in the pipeline instead of flushing it? The fact that when you explicitly Dispose() of it, the problem goes away.

It looks like you may have to specify the precision of your calculation results if you want consistency.
Floating point errors are usually relatively small compared to a number like 100. Also, if it's a floating point error, it's a hell of a coincidence that the second EMA of the second set is identical to the first EMA of the first set. AND, I'm completely filling the DataSeries before I do any calculation, so the floating point error should show up the same way both times.

The EMA / DataSeries appears to be doing some stuff under the hood that does not allow this kind of behavior. I believe I'm going to have to integrate my own EMA with a regular array to get what I need.
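
A hand-rolled EMA over a regular array could look roughly like the standalone C# sketch below. It assumes the recursive definition with alpha = 2 / (period + 1), seeded with the first element; ArrayEmaSketch and Compute are hypothetical names, not part of NinjaScript. Unlike a DataSeries, where index 0 is the most recent bar, the array here is ordered oldest first.

```csharp
using System;

public static class ArrayEmaSketch
{
    // Computes an EMA over 'values' (oldest first), seeded with the first element.
    // Nothing is cached, so the result depends only on the array passed in.
    public static double[] Compute(double[] values, int period)
    {
        if (values == null) throw new ArgumentNullException("values");
        if (period < 1) throw new ArgumentOutOfRangeException("period");

        double alpha = 2.0 / (period + 1);
        double[] ema = new double[values.Length];
        for (int i = 0; i < values.Length; i++)
        {
            ema[i] = i == 0
                ? values[0]
                : alpha * values[i] + (1.0 - alpha) * ema[i - 1];
        }
        return ema;
    }

    public static void Main()
    {
        double[] data = new double[12];
        for (int i = 0; i < data.Length; i++) data[i] = 100.0;
        data[data.Length - 1] = 500.0; // newest value, as in the 500/100 test

        double[] first = Compute(data, 5);
        double[] second = Compute(data, 5); // same input, so same output
        Console.WriteLine(first[first.Length - 1].ToString("F6"));
        Console.WriteLine(second[second.Length - 1].ToString("F6"));
    }
}
```

Because every call starts from the array alone, running it twice on the same data gives the same numbers, at the cost of recomputing the whole series each time.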

Thanks so much for looking at it everyone. NT people - it would be good if you could look at this further, this may be a sign that there's a bug on your end somewhere.
Old 08-08-2012, 08:26 PM   #10
koganam
Senior Member
 
Join Date: Feb 2008
Location: Durham, North Carolina, USA
Posts: 3,221
Thanks: 24
Thanked 1,233 times in 1,004 posts

Quote:
Originally Posted by LiquidDrift View Post
Floating point errors are usually relatively small compared to a number like 100. Also, if it's a floating point error, it's a hell of a coincidence that the second EMA of the second set is identical to the first EMA of the first set. AND, I'm completely filling the DataSeries before I do any calculation, so the floating point error should show up the same way both times.

The EMA / DataSeries appears to be doing some stuff under the hood that does not allow this kind of behavior. I believe I'm going to have to integrate my own EMA with a regular array to get what I need.

Thanks so much for looking at it everyone. NT people - it would be good if you could look at this further, this may be a sign that there's a bug on your end somewhere.
Hey, wait a Holy Minute right there!! There is more to this than meets the eye. When I run your code, that is NOT the output that I got. My output is what I made the comment from.

This is what I got:

Code:
-------------------------
VALUE: 500
EMA: 233.333333333333
VALUE: 100
EMA: 100
VALUE: 100
EMA: 100
VALUE: 100
EMA: 100
VALUE: 100
EMA: 100
-------------------------
VALUE: 500
EMA: 233.33333333696
VALUE: 100
EMA: 100.000000005439
VALUE: 100
EMA: 100.000000008159
VALUE: 100
EMA: 100.000000012239
VALUE: 100
EMA: 100.000000018358
where we do see a reasonably small floating point error. In fact, just to be sure, I reset barNum at the end of the loop, to force the code to run multiple times, and the solution did converge to a consistent rounding error. That is why, in addition to your discovery of what happens when the named EMA is disposed, I concluded that all I was seeing was a rounding error caused by pipeline optimization.
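
One other reading of these small deviations, offered as an assumption and not something established in this thread: they are about the size that an earlier EMA value of 233.33 would leave behind after decaying toward 100. The standalone C# sketch below assumes a recursive EMA with alpha = 2 / (period + 1), a first run at bar 40 and a second at bar 100 (the barNum values in the test code), and an input of 100 on the 59 bars in between. EmaResidualSketch is a made-up name.

```csharp
using System;

public static class EmaResidualSketch
{
    // Residual left above the resting level after 'bars' further inputs at that level,
    // for a recursive EMA with alpha = 2 / (period + 1) (assumed definition).
    public static double Residual(double initialExcess, int period, int bars)
    {
        double decay = 1.0 - 2.0 / (period + 1);
        return initialExcess * Math.Pow(decay, bars);
    }

    public static void Main()
    {
        double excess = 400.0 / 3.0; // 233.33... minus 100, left by the first run

        // Bars 41 through 99 make 59 bars; 58 bars gives the value one bar earlier.
        Console.WriteLine(Residual(excess, 5, 59).ToString("E4"));
        Console.WriteLine(Residual(excess, 5, 58).ToString("E4"));
    }
}
```

Under those assumptions the residuals come out near 5.44e-9 and 8.16e-9, close to the amounts by which the second block's EMA values exceed 100 in the output above, and successive values differ by the same factor of 1.5.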
Old 08-08-2012, 10:05 PM   #11
sledge
Senior Member
 
Join Date: Aug 2010
Location: Washington, D.C.
Posts: 1,204
Thanks: 182
Thanked 305 times in 263 posts

Quote:
Originally Posted by koganam View Post
Unfortunately that is not your problem. Your original statement was the correct one. It does not matter how many terms there are in a rectangular distribution, the average value, no matter how it is measured, ema, sma, weighted, etc., will always be exactly the same; the value of every identical member of the distribution. Even a cursory glance at how any average is calculated will make this clear. If all the members of the distribution are 100, then the value of the average MUST be 100, or the average itself must be being miscalculated or improperly defined. This is not simply a mathematical nicety. The definition of average, as the most likely value, means that it must be so. The most likely value of a collection whose every member is 100 cannot be anything but 100.

The problem lies in your code.
Good grief. I have a BS in math and comp sci. Rectangular distribution? I guess I need to catch up. My real world job has made me soft.
Old 08-09-2012, 11:28 AM   #12
LiquidDrift
Junior Member
 
Join Date: Aug 2012
Posts: 15
Thanks: 2
Thanked 0 times in 0 posts

Quote:
Hey, wait a Holy Minute right there!! There is more to this than meets the eye. When I run your code, that is NOT the output that I got. My output is what I made the comment from.
Yes, your output does appear to be floating point errors, and indeed it may be floating point errors if you are resetting barNum at the end of the loop. I got my results by having the code snippet run just a couple of times, far apart from each other, i.e., at the barNum values of 40 and 100.

My output is much further off, and again, the coincidence of the 233.333333333 value in both outputs leads me to conclude that even though I'm overwriting the DataSeries, old data continues to live somewhere in NT and continues to be processed by the EMA.

I'm bummed that you were not able to reproduce my results, maybe you can if you try running on different bars further apart. Since I believe it's a memory/data issue, no 2 computers are going to get the same results every time however.

The reason I created the code snippet was to attempt to narrow down the same issue that I'm seeing in a more complex strategy, and to hopefully either figure out if I'm doing something wrong, or shine some light on the problem.

I now have 2 (3 if I count yours) cases where this is happening, and that's more than enough for me to not trust data that I'm filling into a DataSeries and processing with an Indicator. When I use third party code to do the same thing using regular arrays, I'm not seeing any issues.

Just want to say once again, I'm very thankful for your time koganam for taking a look at this and your feedback!
Old 08-09-2012, 12:48 PM   #13
koganam
Senior Member
 
Join Date: Feb 2008
Location: Durham, North Carolina, USA
Posts: 3,221
Thanks: 24
Thanked 1,233 times in 1,004 posts

Quote:
Originally Posted by LiquidDrift View Post
Yes, your output does appear to be floating point errors, and indeed it may be floating point errors if you are resetting barNum at the end of the loop. I got my results by having the code snippet run just a couple of times, far apart from each other, i.e., at the barNum values of 40 and 100.
Actually, the output that I showed was what I got from running your exact code (cut-and-paste). It was while investigating whether I was just seeing FP error that I reset barNum so that the code would run multiple times (plenty of bars on the chart), knowing that if I was just seeing FP error, then the solution would have to converge to a stable state, which it did.

Quote:
My output is much further off, and again, the coincidence of the 233.333333333 value in both outputs leads me to conclude that even though I'm overwriting the DataSeries, old data continues to live somewhere in NT and continues to be processed by the EMA.
The 233.3 recurring is actually exactly correct for the way that the EMA is being initialized. The calculation is also mathematically correct.

I am actually more surprised that there are FP errors in the first place. I would expect each run of the code to produce the exact same results; not perfectly correct results in the first run, then anything else in the next. After all, the values are being shown to be exactly 100 in both cases, so there should be no difference.

The only question now is: "Have we found an error in the way C# handles objects, or is the error in the way that the CPU handles caching and pipelining?" To me, our mutual results point to a processing error somewhere.
Quote:
... I now have 2 (3 if I count yours) cases where this is happening, and that's more than enough for me to not trust data that I'm filling into a DataSeries and processing with an Indicator. When I use third party code to do the same thing using regular arrays, I'm not seeing any issues.

Just want to say once again, I'm very thankful for your time koganam for taking a look at this and your feedback!
Don't mention it. It was an interesting conundrum arising from something that at first glance looked trivial. I am the better for having looked at it. Thank you!
Last edited by koganam; 08-29-2012 at 12:37 AM. Reason: Corrected spelling
Old 08-09-2012, 12:56 PM   #14
koganam
Senior Member
 
Join Date: Feb 2008
Location: Durham, North Carolina, USA
Posts: 3,221
Thanks: 24
Thanked 1,233 times in 1,004 posts

Quote:
Originally Posted by sledge View Post
Good grief. I have a BS in math and comp sci. Rectangular distribution? I guess I need to catch up. My real world job has made me soft.
Just made you soft? It knocked me out, then picked me up, chewed me up really nicely, and then spit me right out.
Old 08-09-2012, 01:45 PM   #15
LiquidDrift
Junior Member
 
Join Date: Aug 2012
Posts: 15
Thanks: 2
Thanked 0 times in 0 posts

Quote:
The 233.3 recurring is actually exactly correct for the way that the EMA is being initialized. The calculation is also mathematically correct.
If this is true, then why do we not see it recurring in your output, only mine? Or were you seeing it in your output as well?

If what you're saying is true, then the EMA doesn't take into account new data entered into the DataSeries, which would make sense given the output.