Hi there,
Thank you for your question. The differences between "Restricted" and [46] are:
The restricted method shares the same action space and a similar observation space with [46], but uses slightly different reward shaping. The only difference in the observation space is that the restricted method uses normalized foot contact forces instead of the binary foot contact observations used in [46], because the foot contact sensors on our Go1 robot are not very reliable...
The velocity measurement for the restricted method comes from a tracking camera, while the velocity measurement for [46] comes from a Kalman filter that combines information from (1) forward kinematics and (2) the onboard accelerometer. This measurement is used in both the observation and the reward function.
Please see our project website for the reward function. The main changes affecting learning speed are that we scaled up the velocity reward and used a near-quadratic term for it. This kind of reward shaping lets the algorithm pick up reward signals earlier in training, which makes training faster.
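For readers who want to see what these two differences might look like in code, here is a minimal sketch. It is not the APRL implementation: the function names, the contact threshold, the normalization constant, and the `scale` and `exponent` values are all assumptions made up for illustration, and the actual reward is the one given on the project website.

```python
import math


def binary_contacts(forces, threshold=20.0):
    """Binary foot contact flags, the style of observation described for [46].

    `threshold` is an assumed value, not taken from either paper.
    """
    return [1.0 if f > threshold else 0.0 for f in forces]


def normalized_contacts(forces, max_force=200.0):
    """Foot contact forces scaled into [0, 1], as described for "Restricted".

    `max_force` is an assumed normalization constant.
    """
    return [min(max(f / max_force, 0.0), 1.0) for f in forces]


def velocity_reward(v, scale=1.0, exponent=1.0):
    """Hypothetical velocity term: scale * sign(v) * |v| ** exponent.

    With the defaults this is a plain linear term. A larger `scale` and an
    `exponent` close to 2 give a scaled-up, near-quadratic term.
    """
    return scale * math.copysign(abs(v) ** exponent, v)


if __name__ == "__main__":
    forces = [5.0, 60.0, 150.0, 250.0]  # made-up sensor readings
    print(binary_contacts(forces))
    print(normalized_contacts(forces))
    for v in (0.06, 0.44):  # the two speeds quoted in this thread
        print(v, velocity_reward(v), velocity_reward(v, scale=10.0, exponent=1.8))
```

Under these assumed constants the scaled term is larger than the linear one at both speeds, but how much that matters in practice depends on the real coefficients and on the other reward terms.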
Dear author:
Thanks a lot for developing APRL and open-sourcing an official implementation.
I have a question about the performance of "Restricted".
Fig. 3 in [46] ("Demonstrating a walk in the park: Learning to walk in 20 minutes with model-free reinforcement learning") shows that the robot can move on flat ground at an average speed of 0.06 m/s after 20 min.
Fig. 6 in the APRL paper shows that the robot can move on flat ground at an average speed of 0.44 m/s after 20 min.
That is roughly a 7x difference. I noticed that the APRL paper used a Go1 while [46] used an A1, and that different velocity measurements may have been used (tracking camera vs. Kalman filter). I would like to know whether there are any other differences between "Restricted" and [46].
Thanks!