added support for Dict action space #301
Conversation
Epsilon greedy policy error when generating new action from Dict Action space. #276
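The underlying issue: an epsilon greedy policy must choose, per step, between the greedy action and a random sample, and when the action space is a Dict both branches must produce the same nested structure. A minimal, framework-free sketch of that idea (all names here are hypothetical, not tf-agents APIs):

```python
import random

def epsilon_greedy(greedy_action, sample_fn, epsilon, rng=random):
    """With probability epsilon return a fresh random sample, otherwise
    return the greedy action. Swapping the whole structure at once keeps
    dict-valued actions well-formed in both branches."""
    if rng.random() < epsilon:
        return sample_fn()
    return greedy_action

# Hypothetical Dict action space with two discrete components.
def sample_action():
    return {"move": random.randrange(4), "fire": random.randrange(2)}

greedy = {"move": 2, "fire": 0}
action = epsilon_greedy(greedy, sample_action, epsilon=0.1)
assert set(action) == {"move", "fire"}  # same nested structure either way
```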
Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). 📝 Please visit https://cla.developers.google.com/ to sign. Once you've signed (or fixed any issues), please reply here.
What to do if you already signed the CLA: Individual signers | Corporate signers
ℹ️ Googlers: Go here for more info.
@googlebot I signed it!
CLAs look good, thanks! ℹ️ Googlers: Go here for more info.
DQN loss calculation error when using Dict Action space #297
@oars I saw you did a +2 on the internal sync, so I added you as the reviewer here; you can assign someone else.
We're trying to tighten up the contract that Agents present to the world. In particular, we don't want to allow arbitrary nested structures "so long as it's only got one entry" when the agent does not support more than one action in the action space. It's better if you just add a wrapper that converts the tensor to a dict. If we start allowing this kind of soft contract, it becomes much harder for future developers to know which agents really support multiple action spaces, and harder for us to add pytype to tf-agents. Sorry; we will have to reject this PR.
(a good followup PR would be to disallow nested action_specs in this agent in the first place). |
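The wrapper the reviewer suggests would sit between the agent and the environment, converting the agent's single flat action into the Dict the environment expects. A minimal sketch under that assumption (class and method names are hypothetical, not tf-agents APIs):

```python
class DictActionWrapper:
    """Wraps an environment whose action space is a one-entry Dict so the
    agent can keep emitting a single flat action. Hypothetical sketch."""

    def __init__(self, env, key):
        self._env = env
        self._key = key  # the Dict action space's single entry

    def step(self, action):
        # Convert the flat action into the dict the environment expects.
        return self._env.step({self._key: action})

# Fake environment that just records the action it receives.
class FakeEnv:
    def step(self, action):
        self.last_action = action
        return action

env = FakeEnv()
wrapped = DictActionWrapper(env, "move")
wrapped.step(3)
assert env.last_action == {"move": 3}
```

This keeps the DQN agent's contract intact: the agent never sees a nested structure, and the conversion lives in one clearly named place.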
@ebrevdo, I agree with you that the code needs to be simple and clean for the agents, especially if a certain agent implementation (in this case DQN) is meant for just a one-dimensional action space. But this PR contains 2 commits: one for the DQN agent and the second for the epsilon greedy policy. Personally, I just wanted to test the tf-agents framework with an external, complex gym project and see that they work well together. I used the DQN tutorial just as an easy entry point to the tf-agents framework. Anyway, thanks for helping out :)
Ah I didn't see the changes to epsilon_greedy. That change is fine! Can you revert the change to DQN agent? |
tf_agents/agents/dqn/dqn_agent.py
Outdated
@@ -546,7 +547,8 @@ def _compute_next_q_values(self, next_time_steps):
     # action constraints are respected and helps centralize the greedy logic.
     greedy_actions = self._target_greedy_policy.action(
         next_time_steps, dummy_state).action
+    greedy_actions = tf.nest.flatten(greedy_actions)[0]
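For reference, `tf.nest.flatten` on a dict yields its leaves in sorted-key order, so `[0]` picks the value under the alphabetically first key; this is the (later reverted) DQN-side change under discussion. A pure-Python analog of that behavior (not the TensorFlow implementation):

```python
def nest_flatten(structure):
    """Pure-Python analog of tf.nest.flatten for dicts, lists, and leaves.
    Dicts are traversed in sorted key order, matching TensorFlow's nest."""
    if isinstance(structure, dict):
        flat = []
        for key in sorted(structure):
            flat.extend(nest_flatten(structure[key]))
        return flat
    if isinstance(structure, (list, tuple)):
        flat = []
        for item in structure:
            flat.extend(nest_flatten(item))
        return flat
    return [structure]

actions = {"move": 2, "fire": 0}
assert nest_flatten(actions) == [0, 2]  # "fire" sorts before "move"
assert nest_flatten(actions)[0] == 0
```

Note why the reviewers objected: with more than one entry in the Dict, `[0]` silently drops every action but the first, which is exactly the soft contract the review wants to avoid.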
revert this file please.
This reverts commit ff356ed.
DQN changes have been reverted.
Is there a unit test?
@JacobHanouna If there is a unit test, please reference it. If not, can you add one so we can merge this and get it closed? Not to be a drag, but without a test the chances of it getting broken by some other change increase greatly. Thank you for understanding. We are so close on this and appreciate you powering through to the end.
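A unit test for the epsilon greedy change could assert that the returned action has the same nested structure as the Dict action spec in both the greedy and the random branch. A framework-free sketch of that idea (hypothetical names, not the actual tf-agents test):

```python
import random
import unittest

def epsilon_greedy(greedy_action, sample_fn, epsilon, rng=random):
    # Probability-epsilon random branch, otherwise greedy; the whole
    # dict is swapped at once, so structure is preserved either way.
    return sample_fn() if rng.random() < epsilon else greedy_action

class EpsilonGreedyDictActionTest(unittest.TestCase):
    def test_structure_preserved_in_both_branches(self):
        greedy = {"move": 2, "fire": 0}
        sample = lambda: {"move": random.randrange(4),
                          "fire": random.randrange(2)}
        for epsilon in (0.0, 1.0):  # force each branch deterministically
            action = epsilon_greedy(greedy, sample, epsilon)
            self.assertEqual(set(action), {"move", "fire"})

suite = unittest.defaultTestLoader.loadTestsFromTestCase(
    EpsilonGreedyDictActionTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
assert result.wasSuccessful()
```

Pinning epsilon to 0.0 and 1.0 exercises both branches without relying on seeding, which keeps the test deterministic.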
So close to high fives all around.
MERGED |