This is another post in the series: How to build your own algotrading platform.
Jon V
Before running any live algotrading system, it is good practice to backtest (that is, run a simulation of) our algorithms.
Have in mind that this doesn't mean that if your system has been killing it for the last 5 years/months/days it will make a profit, but it is a good indicator that you might be on to something.
There are four things that we need to take into consideration when we do our backtesting:
1. The quality of the data
2. How to load them efficiently
3. How to build our backtesting system
4. Try to have our backtesting and our live system share as much code as we can
For Forex data, I am using GainCapital. Their data are in the form of ticks. For a free source it is good enough. I used to use
Oanda's historical data service but it seems that they moved it to a premium product. Too bad. Make sure that you use
GainCapital's data only for experimentation.
For any other kind of paid historical data (ETFs, stocks, options etc.), I am using eoddata.com (they also have some forex historical data, but I haven't used it).
Let's download data for a week and experiment a little bit. The link to the data for the first week of November 2015 is
http://ratedata.gaincapital.com/2015/11%20November/EUR_USD_Week1.zip.
Download and unzip it, and you'll get a 25MB file named EUR_USD_Week1.csv. This is the data for one week for one currency pair. You can
imagine the amount of data you need to process for all currencies for the last five years (hint: a lot!). But don't worry, we
are going to optimize this. For now, let's open the file and inspect it.
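If you want to script the download instead of clicking the link, a small helper can build the URL for any week and fetch it. This is a sketch based only on the URL pattern above; the `gain_capital_url` and `download_week` names are my own, and I'm assuming every month's archive follows the same `MM%20MonthName` layout:

```python
import io
import urllib.request
import zipfile

MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November", "December"]

def gain_capital_url(year, month, week, pair="EUR_USD"):
    """Build the GainCapital rate-data URL, e.g. .../2015/11%20November/EUR_USD_Week1.zip."""
    month_part = "%02d%%20%s" % (month, MONTH_NAMES[month - 1])
    return "http://ratedata.gaincapital.com/%d/%s/%s_Week%d.zip" % (
        year, month_part, pair, week)

def download_week(year, month, week, pair="EUR_USD", dest="."):
    """Download one week of tick data and extract the CSV into dest."""
    url = gain_capital_url(year, month, week, pair)
    with urllib.request.urlopen(url) as resp:
        with zipfile.ZipFile(io.BytesIO(resp.read())) as zf:
            zf.extractall(dest)
```

Calling `download_week(2015, 11, 1)` should drop EUR_USD_Week1.csv in the current directory, assuming the URL pattern holds.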
>head EUR_USD_Week1.csv
lTid cDealable CurrencyPair RateDateTime RateBid RateAsk
4464650058 D EUR/USD 2015-11-01 17:00:06.490000000 1.103380 1.103770
4464650061 D EUR/USD 2015-11-01 17:00:06.740000000 1.103400 1.103760
4464650072 D EUR/USD 2015-11-01 17:00:07.990000000 1.103390 1.103750
4464650083 D EUR/USD 2015-11-01 17:00:08.990000000 1.103400 1.103750
The fields that we care about are RateDateTime, RateBid and RateAsk. As you can see, each line has a
timestamp and the prices at which you could buy or sell. The formats provided by other services are pretty similar.
There are many ways to load this data into Python, but the most convenient when it comes to slicing and
manipulating data is pandas. We could always use the csv library to load the data (and it might be faster), but we first need to do
some optimizations and processing which, as you will see, are pretty easy with pandas.
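Loading the ticks with pandas can look like this. It's a minimal sketch: I'm assuming the CSV is comma-separated with the header shown above, and the `load_ticks` name is mine:

```python
import pandas as pd

def load_ticks(filename):
    """Load GainCapital tick data, parse the timestamps, and index by them."""
    ticks = pd.read_csv(
        filename,
        parse_dates=["RateDateTime"],   # column name as in the CSV header above
        index_col="RateDateTime",
    )
    # keep only the columns we care about
    return ticks[["RateBid", "RateAsk"]]
```

Indexing by the parsed timestamp is what lets us slice by time ranges and resample later.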
Another great tool for loading TONS of GBs pretty efficiently and very fast is Bcolz, covered in a much later post (or you
can read a preview if you have signed up for the newsletter).
Let's group all this data into 15-minute intervals. How? Time to fall in love with resample.
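A minimal sketch of the grouping, assuming the ticks are in a DataFrame indexed by timestamp with RateBid/RateAsk columns (the `to_ohlc` name is mine):

```python
import pandas as pd

def to_ohlc(ticks, freq="15min"):
    """Group ticks into open/high/low/close bars per 15-minute interval."""
    # resample().ohlc() computes open, high, low, close for each column
    return ticks[["RateBid", "RateAsk"]].resample(freq).ohlc()
```

The result has one row per 15-minute bucket and a column per (price, open/high/low/close) pair.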
This is called an OHLC (Open-High-Low-Close) bar for every 15 minutes. You can see now that the ticks are grouped into 15-minute segments, and you have the highest and lowest point that the price reached during those 15 minutes, as well as the
open/close for buy and sell. Pure gold! Not only do you have all the information you need, but it is now extremely fast to load.
You just need to save the data:
:: python
# save to file
grouped_data.to_pickle(filename+'-OHLC.pkl')
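Reading the pickle back is then much faster than re-parsing the raw tick CSV. A small sketch (the `load_bars` name is mine, matching the `-OHLC.pkl` suffix used above):

```python
import pandas as pd

def load_bars(filename):
    """Reload the pre-computed OHLC bars saved with to_pickle above."""
    return pd.read_pickle(filename + '-OHLC.pkl')
```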
We can write a simple momentum algorithm that checks whether there was a huge movement in the last 15 minutes and, if
that was the case, buys. We will dive into this in a later post.
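Just to make the idea concrete, the core check could be as simple as comparing a bar's open and close against a threshold. This is my own toy sketch, not the strategy from the later post; the function names and the 10-pip threshold are assumptions:

```python
def big_move(bar_open, bar_close, threshold=0.0010):
    """Did the last 15-minute bar move more than `threshold` (10 pips) either way?"""
    return abs(bar_close - bar_open) > threshold

def should_buy(bar_open, bar_close, threshold=0.0010):
    """Naive momentum rule: buy only after a strong upward move."""
    return bar_close - bar_open > threshold
```

In a real backtest you would run this over every bar produced by the resampling step and track the resulting positions.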
Legal outro. This is an engineering tutorial on how to build an algotrading platform for experimentation and FUN. Nothing
suggested here is financial advice. If you lose any (or all of) your money because you followed any trading advice or
deployed this system in production, you cannot blame this random blog (and/or me). Enjoy at your own risk.