Confused about dtype and precision #3987

davidshen84 · 2024-06-12T06:53:15Z

Hi,

I am a bit confused about the dtype, param_dtype, and precision parameters in some of the flax.linen modules.

According to the documentation for Conv, these parameters control the precision:

dtype: inferred from the input if not set
param_dtype: defaults to float32
precision: defaults to None; I guess it resolves to the default precision? https://jax.readthedocs.io/en/latest/jax.lax.html#jax.lax.Precision

If I want to use bfloat16 for my model, which parameter should I use?

Also, I found that the dtype of the value returned by nn.Conv.apply is controlled by neither dtype nor precision, but by param_dtype.

For example, if I create a simple 2-layer conv net and do not set any of these parameters, the precision of the 1st conv layer can be controlled by the input type, but the precision of the 2nd conv layer is controlled by the output type of the 1st layer, which is always float32.

Should I explicitly set param_dtype on every layer?

Is there a way to control the precision globally? I guess it would cause trouble for some layers, like BatchNorm, which always prefers higher precision.

Do we have official guidelines on controlling the model precision and utilising hardware features?

Thanks!


Replies: 0 comments
