
Bug: You must feed a value for placeholder tensor 'input_1/dones_ph' with dtype float and shape [1] #391

Closed
the-jb opened this issue Jun 27, 2019 · 2 comments

the-jb commented Jun 27, 2019

Hello.
When I try to use pretrain with MlpLnLstmPolicy before the actual learning, it raises the traceback below and doesn't work.

Looking at the code, I can't find where "dones_ph" is fed in the base_class code:

# base_class.py : 228
feed_dict = {
    obs_ph: expert_obs,
    actions_ph: expert_actions,
}
train_loss_, _ = self.sess.run([loss, optim_op], feed_dict)
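
For comparison, here is a sketch of what I would expect the feed dict to need for a recurrent policy. This is my assumption, not actual library code; the placeholder names come from policies.py and the traceback below, and the shapes and zero values are guesses:

# Sketch (assumption, not library code): a recurrent policy would presumably
# also need the LSTM state and the episode-start mask fed in.
feed_dict = {
    obs_ph: expert_obs,
    actions_ph: expert_actions,
    policy.states_ph: np.zeros((n_envs, n_lstm * 2)),  # initial LSTM state (guessed zeros)
    policy.dones_ph: np.zeros((n_batch,)),             # 'dones_ph' from the error (guessed zeros)
}
train_loss_, _ = self.sess.run([loss, optim_op], feed_dict)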

I think this pretrain is not compatible with MlpLnLstmPolicy.

Below are my code and the error traceback.

Please correct me if I am wrong.

# My Code
from stable_baselines import PPO2
from stable_baselines.common.vec_env import DummyVecEnv
from stable_baselines.gail import ExpertDataset

env = DummyVecEnv([lambda: MyEnv()])  # MyEnv is my custom environment
dataset = ExpertDataset(expert_path="my_data_set.npz", traj_limitation=-1, batch_size=1)

model = PPO2(policy="MlpLnLstmPolicy", env=env)

model.pretrain(dataset, n_epochs=100)

Errors:

Traceback (most recent call last):
  File "lib\tensorflow\python\client\session.py", line 1334, in _do_call
    return fn(*args)
  File "lib\tensorflow\python\client\session.py", line 1319, in _run_fn
    options, feed_dict, fetch_list, target_list, run_metadata)
  File "lib\tensorflow\python\client\session.py", line 1407, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input_1/dones_ph' with dtype float and shape [1]
	 [[{{node input_1/dones_ph}}]]
	 [[{{node pretrain/Mean}}]]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "home/ppo/main.py", line 161, in <module>
    start()
  File "home/ppo/main.py", line 37, in start
    model.pretrain(dataset, n_epochs=100)
  File "lib\stable_baselines\common\base_class.py", line 232, in pretrain
    train_loss_, _ = self.sess.run([loss, optim_op], feed_dict)
  File "lib\tensorflow\python\client\session.py", line 929, in run
    run_metadata_ptr)
  File "lib\tensorflow\python\client\session.py", line 1152, in _run
    feed_dict_tensor, options, run_metadata)
  File "lib\tensorflow\python\client\session.py", line 1328, in _do_run
    run_metadata)
  File "lib\tensorflow\python\client\session.py", line 1348, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: You must feed a value for placeholder tensor 'input_1/dones_ph' with dtype float and shape [1]
	 [[node input_1/dones_ph (defined at lib\stable_baselines\common\policies.py:346) ]]
	 [[node pretrain/Mean (defined at lib\stable_baselines\common\base_class.py:214) ]]

Caused by op 'input_1/dones_ph', defined at:
  File "home/ppo/main.py", line 161, in <module>
    start()
  File "home/ppo/main.py", line 31, in start
    model = PPO2(policy="MlpLnLstmPolicy", env=env)
  File "lib\stable_baselines\ppo2\ppo2.py", line 93, in __init__
    self.setup_model()
  File "lib\stable_baselines\ppo2\ppo2.py", line 126, in setup_model
    n_batch_step, reuse=False, **self.policy_kwargs)
  File "lib\stable_baselines\common\policies.py", line 701, in __init__
    layer_norm=True, feature_extraction="mlp", **_kwargs)
  File "lib\stable_baselines\common\policies.py", line 406, in __init__
    scale=(feature_extraction == "cnn"))
  File "lib\stable_baselines\common\policies.py", line 346, in __init__
    self._dones_ph = tf.placeholder(tf.float32, (n_batch, ), name="dones_ph")  # (done t-1)
  File "lib\tensorflow\python\ops\array_ops.py", line 2077, in placeholder
    return gen_array_ops.placeholder(dtype=dtype, shape=shape, name=name)
  File "lib\tensorflow\python\ops\gen_array_ops.py", line 6834, in placeholder
    "Placeholder", dtype=dtype, shape=shape, name=name)
  File "lib\tensorflow\python\framework\op_def_library.py", line 788, in _apply_op_helper
    op_def=op_def)
  File "lib\tensorflow\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "lib\tensorflow\python\framework\ops.py", line 3300, in create_op
    op_def=op_def)
  File "lib\tensorflow\python\framework\ops.py", line 1801, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'input_1/dones_ph' with dtype float and shape [1]
	 [[node input_1/dones_ph (defined at lib\stable_baselines\common\policies.py:346) ]]
	 [[node pretrain/Mean (defined at lib\stable_baselines\common\base_class.py:214) ]]

System Info

  • Anaconda (Windows)
  • 1070 with CUDA v10.0
  • Python 3.6.8
  • Tensorflow 1.13.1
araffin (Collaborator) commented Jun 27, 2019

Hello,

Recurrent policies are currently not supported for pretraining.
There is an issue (#253) and a PR awaiting review (#315) that should solve this.
It should be part of the next major release.
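
In the meantime, a minimal sketch of the currently supported path is to pretrain a non-recurrent policy. This reuses your MyEnv and dataset names; the batch size here is arbitrary:

# Workaround sketch: pretraining currently works with feedforward policies
from stable_baselines import PPO2
from stable_baselines.common.vec_env import DummyVecEnv
from stable_baselines.gail import ExpertDataset

env = DummyVecEnv([lambda: MyEnv()])  # MyEnv: your custom environment
dataset = ExpertDataset(expert_path="my_data_set.npz", traj_limitation=-1, batch_size=32)

model = PPO2(policy="MlpPolicy", env=env)  # non-recurrent MLP policy
model.pretrain(dataset, n_epochs=100)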

If you try the PR, could you give us feedback?

Note: I will close this issue to avoid duplicates.

the-jb (Author) commented Jun 27, 2019

Thank you for the quick answer.

I missed that issue.
I will certainly try that PR and give feedback.

the-jb closed this as completed Jun 27, 2019