Keras implementation of Temporal Difference Models by V. Pong et al. (2018) + value-based sampling
Requirements:
The structure of this code is built on Keras-rl
A few tweaks that we made:
- Relabelled goals, with some probability, based on the expected reward from that point onward. We found that this led to faster convergence in the FetchReach environment.
- Gradually decayed the radius of the 'goal reached' condition. This led to faster convergence as well.
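
The two tweaks above can be sketched roughly as follows. This is a minimal illustration, not the code in this repository: the function names, the softmax weighting over estimated values, the linear decay schedule, and all default numbers (`relabel_prob`, `start_radius`, `end_radius`, `decay_steps`) are assumptions chosen for the example. `value_fn` stands in for whatever value estimate the agent provides.

```python
import numpy as np


def relabel_goal(state, original_goal, candidate_goals, value_fn,
                 relabel_prob=0.5, rng=None):
    """With probability `relabel_prob`, swap the goal for a candidate goal
    sampled in proportion to a softmax over its estimated value.

    `candidate_goals` would typically be states achieved later in the same
    trajectory; `value_fn(state, goal)` is a hypothetical value estimate.
    """
    if rng is None:
        rng = np.random.default_rng()
    # Keep the original goal most of the time (assumed behaviour).
    if rng.random() >= relabel_prob:
        return original_goal
    values = np.array([value_fn(state, g) for g in candidate_goals], dtype=float)
    # Numerically stable softmax: higher-value goals are sampled more often.
    weights = np.exp(values - values.max())
    probs = weights / weights.sum()
    idx = rng.choice(len(candidate_goals), p=probs)
    return candidate_goals[idx]


def goal_radius(step, start_radius=0.2, end_radius=0.05, decay_steps=10000):
    """Linearly shrink the 'goal reached' radius from start to end (assumed schedule)."""
    frac = min(max(step / decay_steps, 0.0), 1.0)
    return start_radius + frac * (end_radius - start_radius)


def goal_reached(achieved, goal, radius):
    """True when the achieved position lies within `radius` of the goal."""
    diff = np.asarray(achieved, dtype=float) - np.asarray(goal, dtype=float)
    return bool(np.linalg.norm(diff) <= radius)
```

In a training loop, one would call `goal_radius(step)` once per step and pass the result to `goal_reached`, and call `relabel_goal` when sampling transitions from the replay buffer.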