Twin Delayed DDPG

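TD3 can be seen as an improved version of DDPG. It uses clipped double Q-learning: it learns two action-value functions (critics) instead of one and uses the smaller of the two estimates when forming the learning target, which reduces overestimation of the action values.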
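As a minimal sketch of the clipped double Q-learning target, assuming the next-state values from the two target critics are already available as arrays. The function name `clipped_double_q_target`, its parameters, and the discount `gamma=0.99` are hypothetical and chosen for illustration; they are not taken from this repository's code.

```python
import numpy as np


def clipped_double_q_target(reward, done, q1_next, q2_next, gamma=0.99):
    """Bellman target built from the smaller of two critic estimates.

    All names and the default gamma are illustrative assumptions.
    """
    # Element-wise minimum of the two target-critic estimates.
    min_q_next = np.minimum(q1_next, q2_next)
    # No bootstrapping from the next state when the episode has ended.
    return reward + gamma * (1.0 - done) * min_q_next


# Toy batch of two transitions; the second one is terminal.
reward = np.array([1.0, 0.0])
done = np.array([0.0, 1.0])
q1_next = np.array([10.0, 5.0])
q2_next = np.array([8.0, 7.0])

target = clipped_double_q_target(reward, done, q1_next, q2_next)
print(target)
```

In this sketch both critics would be regressed toward the same `target`, so an overestimate by one critic does not propagate as long as the other critic is lower.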

The actor updates are also delayed: the actor is updated less frequently than the critics.
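The delayed schedule can be illustrated with a small counting loop. This is only a sketch: the function `count_updates` and the parameter `policy_delay` are hypothetical names, the value 2 is an assumed delay, and the actual network updates are replaced by counters.

```python
def count_updates(total_steps, policy_delay=2):
    """Count critic and actor updates under a delayed actor schedule.

    `policy_delay` is an assumed hyperparameter: the actor is updated
    once for every `policy_delay` critic updates.
    """
    critic_updates = 0
    actor_updates = 0
    for step in range(1, total_steps + 1):
        # The critics are updated on every training step.
        critic_updates += 1
        # The actor is updated only on every `policy_delay`-th step.
        if step % policy_delay == 0:
            actor_updates += 1
    return critic_updates, actor_updates


critic_updates, actor_updates = count_updates(total_steps=10, policy_delay=2)
print(critic_updates, actor_updates)
```

With a delay of 2, the actor receives half as many updates as the critics, so each actor step is taken against critics that have had more time to settle.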

Result of the trained DDPG agent after 500 episodes in the HalfCheetah environment.
Result of the trained DDPG agent after 500 episodes in the Pendulum environment.