www.prediction-machines.com
Special thanks to -
Algorithmic Trading (e.g., HFT) vs Human Systematic Trading
[Figure: network diagram with labels Input, FC ReLU, FC ReLU, + (functional pass-through), Output]
Lots of this leading to this
Machine Learning vs Reinforcement Learning
A good textbook on this is by Sutton and Barto -
No supervisor
Trial and error paradigm
Feedback delayed
Time sequenced
Agent influences the environment
Agent
Environment
$s_t, a_t, r_t, s_{t+1}, a_{t+1}, r_{t+1}, s_{t+2}, a_{t+2}, r_{t+2}, \ldots$
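The agent-environment loop that produces this sequence of states, actions and rewards can be sketched in a few lines. This is only an illustration: `LineWorld`, `random_policy` and `rollout` are hypothetical names invented for this sketch, and the reward rule is an arbitrary choice.

```python
import random


class LineWorld:
    """Hypothetical toy environment: a position on a line, rewarded for staying near 0."""

    def __init__(self):
        self.position = 0

    def reset(self):
        self.position = 3
        return self.position

    def step(self, action):
        # The agent's action changes the state it will observe next.
        self.position += action
        reward = -abs(self.position)
        return self.position, reward


def random_policy(state, rng):
    """Trial and error in its crudest form: pick a direction at random."""
    return rng.choice([-1, 1])


def rollout(env, policy, steps, seed=0):
    """Collect (state, action, reward) triples in time order."""
    rng = random.Random(seed)
    trajectory = []
    state = env.reset()
    for _ in range(steps):
        action = policy(state, rng)
        next_state, reward = env.step(action)
        trajectory.append((state, action, reward))
        state = next_state
    return trajectory


trajectory = rollout(LineWorld(), random_policy, steps=5)
```

There is no supervisor in the loop: the only feedback the agent gets is the reward returned by `step`.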
Value function
Policy function
REINFORCEjs
Reward function
GridWorld :
---demo---
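For readers without the demo to hand, the idea can be approximated with tabular Q-learning on a one-dimensional corridor. This is a stand-in written for this text, not REINFORCEjs code: the corridor layout, the reward of 1 at the goal, and the constants `ALPHA`, `GAMMA` and `EPSILON` are all assumptions.

```python
import random

N_STATES = 5            # cells 0..4; cell 4 is the goal (assumed layout)
ACTIONS = [-1, 1]       # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1


def step(state, action):
    """Reward function: 1 on reaching the goal, 0 otherwise."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def train(episodes=200, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # value estimate per (state, action)
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < EPSILON:
                a = rng.randrange(2)            # explore
            else:
                best = max(q[state])            # exploit, breaking ties at random
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward, done = step(state, ACTIONS[a])
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


q_table = train()
# Policy function: in each non-terminal cell, take the action with the highest value.
greedy_policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
```

The learned table plays the role of the value function, and reading off the best action per cell gives the policy.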
Application to Trading
Typical dynamics of a mean-reverting asset or pairs-trading where the spread exhibits mean reversion
[Figure labels: i = 2, i = -2; sell, buy]
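A minimal sketch of such dynamics, assuming a discrete process that is pulled back toward its mean at every step. The parameters `theta` and `sigma`, and the rule that maps the upper level to a sell and the lower level to a buy, are assumptions made for illustration rather than details taken from the slides.

```python
import random


def simulate_spread(steps, theta=0.1, mean=0.0, sigma=0.5, seed=0):
    """Each step pulls x back toward the mean by a fraction theta, plus Gaussian noise."""
    rng = random.Random(seed)
    x, path = mean, []
    for _ in range(steps):
        x += theta * (mean - x) + sigma * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def threshold_signal(level, upper=2.0, lower=-2.0):
    """Assumed rule: sell at or above the upper level, buy at or below the lower level."""
    if level >= upper:
        return "sell"
    if level <= lower:
        return "buy"
    return "hold"


spread = simulate_spread(500)
signals = [threshold_signal(x) for x in spread]
```

A fixed threshold rule like this is the hand-written baseline; the point of the reinforcement learning approach is to let an agent discover when to trade from rewards alone.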
Mean Reversion Game Simulator
Level 3
Example sell transaction
… over-and-over
[Figure: two network diagrams side by side, each with labels Input, FC ReLU, FC ReLU, + (functional pass-through), Output; additional label: lattice position]
Data Generator: next(), rewind(). Variants: Random Walks, Deterministic Signals, CSV Replay, Market Data Streamer
Environment: render(), step(), reset(). Variants: Single Asset, Multi Asset, Market Making
Agent: act(), observe(), end(). Variants: DQN, Double DQN, A3C
Memory: add(), sample()
Brain: train(), predict()
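How these components might fit together can be sketched as follows. The method names come from the diagram, but every class, signature and return value here is hypothetical and written for this text; it is not the Trading-Gym API, so consult the repository for the real interfaces.

```python
import random
from collections import deque


class RandomWalkGenerator:
    """Hypothetical data generator: next() yields a price, rewind() restarts the series."""

    def __init__(self, seed=0):
        self.seed = seed
        self.rewind()

    def rewind(self):
        self.rng = random.Random(self.seed)
        self.price = 100.0

    def next(self):
        self.price += self.rng.choice([-1.0, 1.0])
        return self.price


class SingleAssetEnv:
    """Hypothetical environment wrapping a generator."""

    def __init__(self, generator):
        self.generator = generator

    def reset(self):
        self.generator.rewind()
        self.price = self.generator.next()
        return self.price

    def step(self, position):
        # position: -1 short, 0 flat, +1 long; reward is the P&L of holding it for one tick.
        new_price = self.generator.next()
        reward = position * (new_price - self.price)
        self.price = new_price
        return new_price, reward


class Memory:
    def __init__(self, capacity=1000, seed=2):
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, experience):
        self.buffer.append(experience)

    def sample(self, n):
        return self.rng.sample(list(self.buffer), min(n, len(self.buffer)))


class RandomAgent:
    """Placeholder agent: acts at random and stores what it observes."""

    def __init__(self, memory, seed=1):
        self.memory = memory
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.choice([-1, 0, 1])

    def observe(self, experience):
        self.memory.add(experience)


env = SingleAssetEnv(RandomWalkGenerator())
agent = RandomAgent(Memory())
state = env.reset()
for _ in range(50):
    action = agent.act(state)
    next_state, reward = env.step(action)
    agent.observe((state, action, reward, next_state))
    state = next_state
batch = agent.memory.sample(4)
```

A learning agent would replace `RandomAgent` and pass sampled batches to a brain with `train()` and `predict()`.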
Trading-Gym - OpenSourced
Prediction Machines has released the Trading-Gym environment as open source
- - demo - -
Experience Replay
Removes correlation in sequences
Smooths over changes in data distribution
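A minimal sketch of a uniform replay buffer, assuming experiences are stored as `(state, action, reward, next_state, done)` tuples. The class name and the capacity are illustrative choices.

```python
import random
from collections import deque


class ReplayBuffer:
    def __init__(self, capacity, seed=0):
        # A bounded deque: once full, the oldest experience is dropped on each add.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement, so a batch mixes distant time steps
        # instead of consecutive, highly correlated ones.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


buffer = ReplayBuffer(capacity=100)
for t in range(150):
    buffer.add(t, 0, 0.0, t + 1, False)
batch = buffer.sample(8)
```

Because the buffer holds experiences gathered under many past policies, each batch also averages over shifts in the data distribution.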
Prioritized Experience Replay
Speeds up learning by sampling experiences from a weighted distribution
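One common way to build the weighted distribution is proportional prioritization, where an experience is drawn with probability proportional to its absolute TD error raised to a power `alpha`. The sketch below assumes that variant; `alpha = 0.6` and the small constant added to each priority are illustrative values, and the importance-sampling correction usually paired with this scheme is omitted.

```python
import random


def sampling_probabilities(priorities, alpha=0.6):
    """P(i) = p_i^alpha / sum_k p_k^alpha; alpha = 0 recovers uniform sampling."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_indices(priorities, batch_size, alpha=0.6, seed=0):
    """Draw indices with replacement according to the prioritized distribution."""
    rng = random.Random(seed)
    probs = sampling_probabilities(priorities, alpha)
    return rng.choices(range(len(priorities)), weights=probs, k=batch_size)


td_errors = [0.1, 0.1, 2.0, 0.1]
# A small constant keeps experiences with zero error from never being replayed.
priorities = [abs(e) + 1e-3 for e in td_errors]
probs = sampling_probabilities(priorities)
indices = sample_indices(priorities, batch_size=1000)
```

The experience with the largest error is replayed most often, which is where the speed-up comes from.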
Separate target network from Q network
Removes correlation with target - improves stability
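The mechanism can be sketched with a toy linear Q function standing in for the neural network. Everything here is an assumption made for illustration: the `QNetwork` class, the learning rate, and the sync interval `SYNC_EVERY`.

```python
import copy


class QNetwork:
    """Stand-in for a neural network: one weight vector per action."""

    def __init__(self, n_actions, n_features):
        self.weights = [[0.0] * n_features for _ in range(n_actions)]

    def predict(self, features):
        return [sum(w * x for w, x in zip(row, features)) for row in self.weights]


def td_target(reward, next_features, done, target_net, gamma=0.99):
    # Bootstrap from the frozen target network, not from the network being trained.
    if done:
        return reward
    return reward + gamma * max(target_net.predict(next_features))


def sgd_step(q_net, features, action, target, lr=0.1):
    error = target - q_net.predict(features)[action]
    q_net.weights[action] = [
        w + lr * error * x for w, x in zip(q_net.weights[action], features)
    ]


q_net = QNetwork(n_actions=2, n_features=2)
target_net = copy.deepcopy(q_net)
SYNC_EVERY = 5

for step in range(1, 11):
    y = td_target(1.0, [1.0, 0.0], False, target_net)
    sgd_step(q_net, [1.0, 0.0], 0, y)
    if step % SYNC_EVERY == 0:
        # Copy the trained weights into the target network only occasionally.
        target_net = copy.deepcopy(q_net)
```

Between syncs the regression target stays fixed, so the network is not chasing a value that moves with every update.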
Double Q learning
Removes much of the non-uniform overestimation by separating action selection from action evaluation
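The difference between the two targets fits in a few lines. In this sketch the Q values are plain lists of made-up numbers, and the function names are illustrative.

```python
def dqn_target(reward, next_q_target, gamma=0.99):
    """Standard target: the same network both picks and scores the best next action."""
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99):
    """Double Q target: the online network selects, the target network evaluates."""
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


# Made-up Q values for the next state, one entry per action.
next_q_online = [1.0, 2.0, 1.5]
next_q_target = [1.2, 0.8, 3.0]

single = dqn_target(0.0, next_q_target)
double = double_dqn_target(0.0, next_q_online, next_q_target)
```

Here the target network's estimate for the last action is an outlier; the standard target takes it at face value, while the double Q target ignores it because the online network prefers a different action.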
Dueling Q learning
Improves learning when many actions have similar values. Separates the Q value into two parts: state value and state-dependent action advantage
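A sketch of how the two streams are commonly recombined, assuming the mean-subtracted form $Q(s,a) = V(s) + A(s,a) - \frac{1}{|A|}\sum_{a'} A(s,a')$. The numbers are made up.

```python
def dueling_q_values(state_value, advantages):
    """Combine a scalar state value with per-action advantages.

    Subtracting the mean advantage makes the split identifiable: without it,
    a constant could be moved freely between the value and the advantages.
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


q_values = dueling_q_values(10.0, [0.5, -0.5, 0.0])
```

When the actions are nearly equivalent, most of the signal sits in the state value, which every action's update helps to learn.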
Keras v Tensorflow
                   Keras   Tensorflow
High level         ✔
Standardized API   ✔
Tensorboard        ✔*      ✔
My installation was on CentOS in Docker with GPU*, but I also installed locally on Ubuntu 16 for this demo.
*Built from source for maximum speed.
https://blog.abysm.org/2016/06/building-tensorflow-centos-6/
https://www.tensorflow.org/install/install_sources
Tensorflow - what is it
with tf.variable_scope('prediction'):
Sessions
https://github.com/Prediction-Machines/Trading-Gym
Open sourced
https://github.com/Prediction-Machines/Trading-Brain
Contains example of Dueling Double DQN for single stock trading game
examples/tf_example.py
References
Much of the Brain and config code in this example is adapted from the devsisters GitHub repository:
https://github.com/devsisters/DQN-tensorflow
Our GitHub:
https://github.com/Prediction-Machines
Our blog:
http://prediction-machines.com/blog/