
added support for Dict action space #301

Merged
merged 4 commits into from
Mar 19, 2020

Conversation

JaCoderX
Contributor

@JaCoderX JaCoderX commented Feb 6, 2020

Epsilon greedy policy error when generating new action from Dict Action space. #276

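The linked issue boils down to the random branch of the epsilon greedy policy not handling a nested (Dict) action spec: sampling logic written for a single flat spec fails when the spec is a dictionary. A minimal plain-Python sketch of the idea behind the fix, with hypothetical names (the real tf-agents code operates on `tensor_spec` objects, not tuples):

```python
import random

def sample_flat(spec, rng):
    """Sample from a single flat spec given as (low, high) integer bounds."""
    low, high = spec
    return rng.randrange(low, high + 1)

def sample_nested(spec, rng):
    """Handle a Dict-style spec by recursing over its entries."""
    if isinstance(spec, dict):
        return {key: sample_nested(sub, rng) for key, sub in spec.items()}
    return sample_flat(spec, rng)

rng = random.Random(0)
# A Dict action space with two independent sub-actions (hypothetical example).
dict_spec = {"trade": (0, 2), "amount": (0, 9)}
action = sample_nested(dict_spec, rng)  # a dict with one sample per entry
```

Calling `sample_flat` directly on `dict_spec` would raise, which is the shape of the error reported in #276; recursing over the structure is what the epsilon greedy change enables.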
@googlebot
Collaborator

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here with @googlebot I signed it! and we'll verify it.



ℹ️ Googlers: Go here for more info.

@JaCoderX
Contributor Author

JaCoderX commented Feb 6, 2020

@googlebot I signed it!

@googlebot
Collaborator

CLAs look good, thanks!

ℹ️ Googlers: Go here for more info.


@googlebot googlebot added cla: yes and removed cla: no labels Feb 6, 2020
DQN loss calculation error when using Dict Action space #297
@JaCoderX JaCoderX changed the title from "added support for Dict action space for Epsilon greedy policy" to "added support for Dict action space" Feb 6, 2020
@tfboyd tfboyd requested a review from oars February 6, 2020 17:18
@tfboyd
Member

tfboyd commented Feb 6, 2020

@oars I saw you did a +2 on the internal sync, so I added you as the reviewer here, you can assign someone else.

@ebrevdo
Contributor

ebrevdo commented Feb 7, 2020

We're trying to tighten up the contract that Agents present to the world; in particular, we don't want to allow arbitrary nested structures "so long as it's only got one entry", if the agent does not support more than one action in the action space. Better to just add a wrapper that converts the tensor to a dict.

If we start allowing this kind of soft contract, it makes it much harder for future developers to know which agents really support multiple action spaces, and makes it harder for us to add pytype to tf-agents. Sorry; we will have to reject this PR.
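The suggested alternative, a wrapper that converts the tensor to a dict, can be sketched in a few lines. All names here are hypothetical (this is not the tf-agents or Gym API): the agent keeps emitting a single flat action, and a thin wrapper converts it into the one-entry dict that the underlying environment expects.

```python
class TensorToDictActionWrapper:
    """Hypothetical wrapper: the agent emits a single (tensor-like) action;
    the wrapped environment expects a one-entry Dict action."""

    def __init__(self, env, key):
        self._env = env
        self._key = key  # name of the single entry in the env's Dict space

    def step(self, action):
        # Convert the flat action into the Dict action before stepping.
        return self._env.step({self._key: action})

class FakeDictEnv:
    """Stand-in environment that only accepts dict actions."""

    def step(self, action):
        assert isinstance(action, dict), "this env requires a Dict action"
        return action["trade"] * 2  # dummy observation

env = TensorToDictActionWrapper(FakeDictEnv(), key="trade")
obs = env.step(3)  # the agent side passes a plain scalar; obs == 6
```

This keeps the agent's contract flat (one action tensor) while the environment still receives the Dict action it declares, which is the boundary ebrevdo argues the conversion should live at.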

@ebrevdo ebrevdo closed this Feb 7, 2020
@ebrevdo
Contributor

ebrevdo commented Feb 7, 2020

(a good followup PR would be to disallow nested action_specs in this agent in the first place).

@JaCoderX
Contributor Author

JaCoderX commented Feb 7, 2020

We're trying to tighten up the contract that Agents present to the world, in particular we don't want to allow arbitrary nested structures "so long as it's only got one entry", if the agent does not support more than one action in the action space.

@ebrevdo, I agree with you that the code needs to be simple and clean for the agents, especially if a certain agent implementation (in this case DQN) is meant for just a one-dimensional action space.

But this PR contains 2 commits: one for the DQN and the second for the epsilon greedy policy. As far as I understand, the epsilon greedy policy isn't bound only to DQN and can be used by other agents that can have a structured action space. Is that commit also too case-specific?

Personally, I just wanted to test the tf-agents framework with an external and complex gym project and see that they work well together. I used the DQN tutorial just as an easy entry to the tf-agents framework.

Anyway, thanks for helping out :)

@ebrevdo
Contributor

ebrevdo commented Feb 21, 2020

Ah, I didn't see the changes to epsilon_greedy. That change is fine! Can you revert the change to the DQN agent?

@ebrevdo ebrevdo reopened this Feb 21, 2020
@@ -546,7 +547,8 @@ def _compute_next_q_values(self, next_time_steps):
       # action constraints are respected and helps centralize the greedy logic.
       greedy_actions = self._target_greedy_policy.action(
           next_time_steps, dummy_state).action
+      greedy_actions = tf.nest.flatten(greedy_actions)[0]
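For context, `tf.nest.flatten` turns a nested structure into a flat list of leaves (traversing dict keys in sorted order), so indexing `[0]` silently picks the single entry of a one-element Dict action, which is exactly the "soft contract" being rejected above. A plain-Python sketch of that flattening behavior (not the TensorFlow implementation):

```python
def flatten(structure):
    """Mimic tf.nest.flatten for dicts, lists, and tuples:
    return the leaves of a nested structure as a flat list."""
    if isinstance(structure, dict):
        leaves = []
        for key in sorted(structure):  # tf.nest sorts dict keys
            leaves.extend(flatten(structure[key]))
        return leaves
    if isinstance(structure, (list, tuple)):
        leaves = []
        for item in structure:
            leaves.extend(flatten(item))
        return leaves
    return [structure]

greedy_actions = {"trade": 1}        # a one-entry Dict action
single = flatten(greedy_actions)[0]  # -> 1: assumes exactly one entry
```

With more than one entry, `[0]` would silently discard the rest, which is why the reviewer asks for this line to be reverted rather than generalized.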
revert this file please.

@JaCoderX
Contributor Author

DQN changes have been reverted

@ebrevdo
Contributor

ebrevdo commented Mar 17, 2020

Is there a unit test?

@tfboyd
Member

tfboyd commented Mar 17, 2020

@JacobHanouna If there is a unit test, please reference it. If not, can you add one so we can merge this and get it closed? Not to be a drag, but without a test the chances of it getting broken by some other change increase greatly. Thank you for understanding. We are so close on this and appreciate you powering through to the end.

@JaCoderX
Contributor Author

@tfboyd
Member

tfboyd commented Mar 17, 2020

So close to high fives all around.

@JaCoderX JaCoderX closed this Mar 17, 2020
@JaCoderX JaCoderX reopened this Mar 17, 2020
@tfboyd
Member

tfboyd commented Mar 19, 2020

MERGED


6 participants