Lots and Lots of Forex Data

December 4, 2006 by Trader Rich 

A lot of my time in the last week has been dedicated to compiling
backtesting data for multiple currency pairs and multiple time frames. 
I previously had GBP tick data that I was working from but now I'm
collating this data to multiple time frames and increasing my currency
pairs to 35 or so.  This process takes a lot of time and requires some
of the following steps:

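  1. Creation of databases and tables.  As far as I know, MySQL has no
    fixed row limit that matters here; table size is bounded by the
    storage engine and disk space, and hundreds of millions of rows per
    table are workable.  Still, when you're dealing with tick data, size
    has to be a consideration.  A couple of years of GBP data can easily
    include over 20,000,000 ticks.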
  2. Importing the data into respective database tables.
  3. Adding indexes for faster querying. Indexes on the datetime are a
    must.  Querying without indexes is like searching for a term in a book
    that has no index: you would have to go page by page to look for the
    term.  With an index, you look up the term and it points you to the
    page number, which is what a database index does for rows.
  4. Go tick by tick and verify that the data is there.  The key here
    is to verify that there are no gaps in the data.  A gap in the data can
    throw a calculation totally out of whack.
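
The four steps above can be sketched in a few lines.  This is only a minimal illustration, not my actual setup: it uses Python's built-in sqlite3 module as a stand-in for MySQL so it runs without a server, and the table name gbpusd_ticks, the column names, the sample prices and the 5-minute gap threshold are all assumptions made up for the example.

```python
import sqlite3
from datetime import datetime, timedelta


def build_tick_db(ticks):
    """Create a tick table, load (ts, bid, ask) rows, and index the timestamp."""
    conn = sqlite3.connect(":memory:")  # stand-in for a real MySQL database
    conn.execute(
        """
        CREATE TABLE gbpusd_ticks (
            ts  TEXT NOT NULL,
            bid REAL NOT NULL,
            ask REAL NOT NULL
        )
        """
    )
    conn.executemany(
        "INSERT INTO gbpusd_ticks (ts, bid, ask) VALUES (?, ?, ?)", ticks
    )
    # Index on the datetime column so range queries do not scan every row.
    conn.execute("CREATE INDEX idx_gbpusd_ticks_ts ON gbpusd_ticks (ts)")
    conn.commit()
    return conn


def find_gaps(conn, max_gap=timedelta(minutes=5)):
    """Return (previous_ts, next_ts) pairs that are further apart than max_gap."""
    gaps = []
    prev = None
    # ISO-formatted timestamps sort chronologically when sorted as text.
    for (ts,) in conn.execute("SELECT ts FROM gbpusd_ticks ORDER BY ts"):
        cur = datetime.fromisoformat(ts)
        if prev is not None and cur - prev > max_gap:
            gaps.append((prev, cur))
        prev = cur
    return gaps


# Made-up sample ticks with one hole in the middle.
sample_ticks = [
    ("2006-12-04 09:00:00", 1.9750, 1.9753),
    ("2006-12-04 09:00:30", 1.9751, 1.9754),
    ("2006-12-04 09:01:10", 1.9749, 1.9752),
    ("2006-12-04 09:20:00", 1.9760, 1.9763),  # nothing for almost 19 minutes
    ("2006-12-04 09:20:05", 1.9761, 1.9764),
]

conn = build_tick_db(sample_ticks)
gaps = find_gaps(conn)
for start, end in gaps:
    print(f"gap from {start} to {end}")
```

A check this simple will also flag weekends and holiday closures as gaps, so a real version would need to know the broker's trading hours before deciding which gaps are actual missing data.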

My point in explaining all of this is that many variables go into
backtesting, and the steps above are the ones hidden from you when you
use an off-the-shelf software package.  With that said, different
brokers have different data.  Some brokers open earlier on certain days
and holidays, other brokers close on certain holidays; the bottom line
is that the data is disparate.  Forex doesn't have a central exchange,
so the prices you're working from could be very different from the next
person's.  In turn, your stochastic indicator could be showing oversold
while another guy's may be at 50.
For example, right now, my Metatrader charts (Alpari) show the 62 EMA
on the 30-min GBP/USD at 1.9749.  The 62 EMA on the 30-min GBP/USD at
Oanda is 1.9759.  That's a 10 pip difference. 
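
To see how the data feeds alone can move an indicator, here is a rough sketch of an EMA computed on made-up prices.  It assumes the common smoothing factor 2 / (period + 1) and seeds the average with the first price; charting packages may seed differently, and none of the prices below are real Alpari or Oanda quotes.

```python
def ema(prices, period):
    """Exponential moving average of the whole series, seeded with the first price."""
    alpha = 2.0 / (period + 1)
    value = prices[0]
    for price in prices[1:]:
        value = alpha * price + (1 - alpha) * value
    return value


def pip_difference(a, b, pip_size=0.0001):
    """Absolute difference between two prices, expressed in pips."""
    return abs(a - b) / pip_size


# Hypothetical 30-min closes: broker B quotes 10 pips above broker A on every bar.
broker_a = [1.9700 + 0.0001 * i for i in range(200)]
broker_b = [price + 0.0010 for price in broker_a]

ema_a = ema(broker_a, 62)
ema_b = ema(broker_b, 62)
print(round(pip_difference(ema_a, ema_b), 2))  # 10.0

# Same quotes as broker A, but only the last 50 bars of history are loaded.
short_history = broker_a[150:]
ema_short = ema(short_history, 62)
print(ema_short != ema_a)  # True
```

The first comparison shows a persistent quote difference passing straight through to the EMA.  The second shows that even identical quotes give a different reading when one chart simply has less history loaded behind it.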

I've always said that backtesting is necessary, but I do believe that
as the time frames get shorter and shorter, the results have to be
taken with more "grains of salt".  That is why, in addition to being
backtested, your system should always be forward tested.  The object of
backtesting should be to get a general idea of the system's viability
and to help determine if further testing of the trading system is
worthwhile.  For example, if you were to backtest a moving average
crossover, it becomes quite evident that this type of trading system
just isn't profitable (most of the time).  The negative results
associated with this type of system are stacked up so high that you can
assume it just wouldn't work in real time.  This of course may not always
be the case.  Interpreting backtesting results requires you to make
assumptions and educated guesses.  Backtesting assists in moving your
system testing to the next phase, whatever that may be.  After
all, we have to make do with what we have and if backtesting can add
the inner confidence to stick to a trading system, then it has helped
in one way or another.  
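
For anyone who wants to try the moving average crossover mentioned above, a bare-bones backtest might look like the sketch below.  It is an illustration under heavy assumptions, not a real test harness: it uses simple moving averages, the 5 and 20 periods are arbitrary, it is always long or short after the warm-up, it ignores spread and slippage, and the price series is synthetic.

```python
def sma(prices, period, end):
    """Simple moving average of the `period` prices ending at index `end`."""
    window = prices[end - period + 1 : end + 1]
    return sum(window) / period


def backtest_crossover(prices, fast=5, slow=20):
    """Total profit in price units from a long/short SMA crossover.

    The signal is computed on the close of bar i and applied to the move
    from bar i to bar i + 1, so no future prices leak into the decision.
    """
    total = 0.0
    for i in range(slow - 1, len(prices) - 1):
        fast_ma = sma(prices, fast, i)
        slow_ma = sma(prices, slow, i)
        move = prices[i + 1] - prices[i]
        if fast_ma > slow_ma:
            total += move  # long: profit when price rises
        elif fast_ma < slow_ma:
            total -= move  # short: profit when price falls
    return total


# Synthetic series rising 1 pip per bar; real data is nowhere near this clean.
uptrend = [1.9000 + 0.0001 * i for i in range(100)]
profit = backtest_crossover(uptrend)
print(round(profit / 0.0001), "pips")  # 80 pips
```

A perfectly smooth trend is the best case for this kind of system, which is exactly why the number above means nothing on its own.  Run the same function over real, choppy data with costs included before drawing any conclusion.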

Comments

2 Responses to “Lots and Lots of Forex Data”

  1. Craig on December 4th, 2006 2:13 pm

    Hi Rich,

    I assume you are using the Gain Capital data. That stuff is full of holes & overlaps, so be careful. But good for you on getting stuck into the backtesting.

  2. knight.com on April 1st, 2008 2:05 am

    Rich,

    Built a really simple back testing / forward testing system for equities. Very fast and real way to validate and test strategies. Realize Tick Data is not complete… You need Level II and market noise… i.e. News, S&P / DOW movements etc.

    Our objective was to quantify results from semi controlled data. We used random number generators to simulate transactions extrapolated from hi / low / open / close / volume. Then we would use the same simulated minute system and force bullish, bearish and choppy trends to see how the different trading bots would perform. We would run 1000's of simulations in a matter of a few minutes and tweak one parameter at a time to fine tune our dynamic trading bots.

    Our trading bots adapt to the market dynamically and adjust when conditions are recognized and triggered.

    You are the coach and your team is inside the 10 yard line ready to score. You wouldn't be going for the long bomb would you?

    4 hours a run for each back test against snapshot history is just mental masturbation. Focus on Building strategies adaptive to patterned semi-random data and run enough iterations to quantify your objective.

    You are driving down the road looking in your rear view mirror to see if you're still on the road. Looking at lane markers and curbs with no concept of where or how fast you're going. Switch gears and look forward…
