About jax.PRNGKey: Error reporting when running #8

euyy · 2021-10-16T09:36:08Z

Excuse me. When I tried to run this code, I ran into a problem with this line:

xmcgan_image_generation/xmcgan/train_utils.py

Line 167 in 22a7ef2

generator_variables = generator(train=False).init(g_rng, (inputs, z))

The error is flax.errors.InvalidRngError: rngs should be a dictionary mapping strings to jax.PRNGKey. In fact, g_rng is an array of shape (2,).
Can anyone help me solve this problem?

By the way, I have configured CUDA, but JAX still tells me that CUDA was not found:
xla_bridge.py:232] Unable to initialize backend 'gpu': NOT_FOUND: Could not find registered platform with name: "cuda". Available platform names are: Interpreter Host
I even tested the GPU with TensorFlow, and it works there. I don't know what the problem is.


woctezuma · 2021-10-17T06:56:32Z

No idea about the second issue, but you can find others having the same issue online: kuixu/alphafold#8

kohjingyu · 2021-10-18T05:50:14Z

The first issue might be because of some recent change to Flax. What version are you using? Can you try changing it to:

generator_variables = generator(train=False).init({'params': g_rng}, (inputs, z))

(ref: https://github.com/google/flax/blob/main/examples/imagenet/train.py#L74)
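The error message asks for a dictionary mapping strings to keys, so a small compatibility helper can wrap a bare key before it is passed to init. This is only a minimal sketch: as_rng_dict is a hypothetical name that is not part of Flax or this repository, and the tuple below is a stand-in for a real key so the snippet runs without JAX installed.

```python
def as_rng_dict(rngs, default_name="params"):
    """Wrap a bare PRNG key as {default_name: key}; return dictionaries unchanged.

    Hypothetical helper for illustration only, not a Flax API.
    """
    if isinstance(rngs, dict):
        return rngs
    return {default_name: rngs}


# Stand-in for a real key such as jax.random.PRNGKey(42), which has shape (2,).
g_rng = (0, 42)

rngs = as_rng_dict(g_rng)
print(sorted(rngs))  # the stream names that would be passed to init

# In train_utils.py the call from this thread would then read (not executed here):
# generator_variables = generator(train=False).init(as_rng_dict(g_rng), (inputs, z))
```

If the model later needs more than one random stream, a dictionary with several named keys can be passed through the same helper unchanged.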

The second issue is likely due to a problem during setup. Could you try asking at https://github.com/google/jax?
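Before asking upstream, it can help to check which backend JAX actually selected. The sketch below uses jax.default_backend() and jax.devices(); the helper name describe_jax_backend is hypothetical. If it reports only cpu, one possible cause (an assumption, not confirmed in this thread) is that the installed jaxlib build has no CUDA support, which would be independent of whether TensorFlow can see the GPU.

```python
def describe_jax_backend():
    """Return a short description of the backend JAX picked, or a note if JAX is missing."""
    try:
        import jax
    except ImportError:
        return "jax is not installed"
    # jax.devices() lists the devices of the default backend.
    platforms = [device.platform for device in jax.devices()]
    return f"backend={jax.default_backend()}, devices={platforms}"


print(describe_jax_backend())
```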

hyeonjinXZ · 2021-10-18T19:01:59Z

The first issue was fixed for me by upgrading to the latest version of Flax:
pip install --upgrade git+https://github.com/google/flax.git
(ref: https://pythonrepo.com/repo/google-flax-python-deep-learning)

euyy · 2021-10-20T12:33:34Z

@woctezuma @Hyeonjin1989 @kohjingyu Thanks for your help.
But now I have a new problem:
UNKNOWN: Failed to determine best cudnn convolution algorithm: UNKNOWN: GetConvolveAlgorithms failed.
Convolution performance may be suboptimal. To ignore this failure and try to use a fallback algorithm, use XLA_FLAGS=--xla_gpu_strict_conv_algorithm_picker=false. Please also file a bug for the root cause of failing autotuning.
I don't know whether this error is caused by my hardware, so I would like to know whether there are minimum hardware requirements for training. If anyone knows, please tell me. Thanks.

adambot806 · 2021-11-23T13:30:31Z

Add:

os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = '.7'

For an explanation, refer to the JAX notes on GPU memory allocation.
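A minimal sketch of how this setting could be applied, assuming it is done at the very top of the training script so that it takes effect before JAX initializes its GPU backend. The helper name configure_xla_env is hypothetical; the optional flag is the one quoted in the error message above, and it is appended to XLA_FLAGS rather than overwriting any flags already set.

```python
import os


def configure_xla_env(mem_fraction=".7", relax_conv_picker=False):
    """Set XLA-related environment variables; call this before importing jax."""
    # Fraction of GPU memory that the XLA client preallocates.
    os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = mem_fraction
    if relax_conv_picker:
        # Flag quoted in the cuDNN error message; lets XLA use a fallback algorithm.
        flag = "--xla_gpu_strict_conv_algorithm_picker=false"
        existing = os.environ.get("XLA_FLAGS", "")
        if flag not in existing:
            os.environ["XLA_FLAGS"] = f"{existing} {flag}".strip()


configure_xla_env()
# import jax  # import JAX only after the environment is configured
```

Lowering the fraction leaves more GPU memory for other processes; whether that resolves the cuDNN failure depends on the device, so treat the value as something to experiment with.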

woctezuma mentioned this issue Nov 7, 2021: Changing batch size and using multiple gpu makes Incompatible shapes issue. #9 (Open)

hyeonjinXZ mentioned this issue Dec 22, 2021: what's your jax version? #16 (Open)
