Releases: angerson/tensorflow
Another DOI Test
DOI Test v1.0
This is an experiment to see how easy it is to use Zenodo to publish a DOI alongside a code release.
TensorFlow 2.4.0
Release 2.4.0
Major Features and Improvements
- tf.distribute introduces experimental support for asynchronous training of models via the tf.distribute.experimental.ParameterServerStrategy API. Please see the tutorial to learn more.
- MultiWorkerMirroredStrategy is now a stable API and is no longer considered experimental. Some of the major improvements involve handling peer failure and many bug fixes. Please check out the detailed tutorial on Multi-worker training with Keras.
- Introduces experimental support for a new module named tf.experimental.numpy, which is a NumPy-compatible API for writing TF programs. See the detailed guide to learn more. Additional details below.
- Adds support for TensorFloat-32 on Ampere-based GPUs. TensorFloat-32, or TF32 for short, is a math mode for NVIDIA Ampere-based GPUs and is enabled by default.
- A major refactoring of the internals of the Keras Functional API has been completed, which should improve the reliability, stability, and performance of constructing Functional models.
- Keras mixed precision API tf.keras.mixed_precision is no longer experimental and allows the use of 16-bit floating point formats during training, improving performance by up to 3x on GPUs and 60% on TPUs. Please see below for additional details.
- TensorFlow Profiler now supports profiling MultiWorkerMirroredStrategy and tracing multiple workers using the sampling mode API.
- TFLite Profiler for Android is available. See the detailed guide to learn more.
- TensorFlow pip packages are now built with CUDA11 and cuDNN 8.0.2.
Breaking Changes
- TF Core:
  - Certain float32 ops run in lower precision on Ampere-based GPUs, including matmuls and convolutions, due to the use of TensorFloat-32. Specifically, inputs to such ops are rounded from 23 bits of precision to 10 bits of precision. This is unlikely to cause issues in practice for deep learning models. In some cases, TensorFloat-32 is also used for complex64 ops. TensorFloat-32 can be disabled by running tf.config.experimental.enable_tensor_float_32_execution(False).
  - The byte layout for string tensors across the C-API has been updated to match TF Core/C++; i.e., a contiguous array of tensorflow::tstring/TF_TStrings.
  - C-API functions TF_StringDecode, TF_StringEncode, and TF_StringEncodedSize are no longer relevant and have been removed; see core/platform/ctstring.h for string access/modification in C.
  - tensorflow.python, tensorflow.core and tensorflow.compiler modules are now hidden. These modules are not part of TensorFlow public API.
  - tf.raw_ops.Max and tf.raw_ops.Min no longer accept inputs of type tf.complex64 or tf.complex128, because the behavior of these ops is not well defined for complex types.
  - XLA:CPU and XLA:GPU devices are no longer registered by default. Use TF_XLA_FLAGS=--tf_xla_enable_xla_devices if you really need them, but this flag will eventually be removed in subsequent releases.
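To make the TensorFloat-32 precision change above concrete, here is a minimal pure-Python sketch of what rounding a float32 input from 23 mantissa bits down to 10 looks like. The helper name round_to_tf32 is hypothetical, and the sketch assumes round-to-nearest-even on the 13 dropped bits; the rounding mode actually used by the hardware may differ, and NaN handling is ignored.

```python
import struct


def round_to_tf32(x: float) -> float:
    """Illustrative only: keep 10 of the 23 float32 mantissa bits.

    Assumes round-to-nearest-even; real hardware may round differently.
    NaN inputs are not handled specially.
    """
    # Reinterpret the value as its 32-bit IEEE 754 single-precision pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    drop = 13  # 23 float32 mantissa bits minus the 10 bits TF32 keeps
    # Add just under half of the last kept unit, plus the lowest kept bit,
    # so that exact ties round to an even mantissa.
    lowest_kept_bit = (bits >> drop) & 1
    bits += (1 << (drop - 1)) - 1 + lowest_kept_bit
    # Clear the dropped bits.
    bits &= 0xFFFFFFFF & ~((1 << drop) - 1)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


for value in (1.0, 1.0 + 2**-10, 1.0 + 2**-12, 0.1):
    print(value, "->", round_to_tf32(value))
```

Values that need more than 10 mantissa bits, such as 1 + 2^-12, collapse to a neighbouring representable value, which is the precision loss the note describes. In TensorFlow itself, the mode is switched off with the tf.config.experimental.enable_tensor_float_32_execution(False) call quoted above.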
- tf.keras:
  - The steps_per_execution argument in model.compile() is no longer experimental; if you were passing experimental_steps_per_execution, rename it to steps_per_execution in your code. This argument controls the number of batches to run during each tf.function call when calling model.fit(). Running multiple batches inside a single tf.function call can greatly improve performance on TPUs or small models with a large Python overhead.
  - A major refactoring of the internals of the Keras Functional API may affect code that relies on certain internal details:
    - Code that uses isinstance(x, tf.Tensor) instead of tf.is_tensor when checking Keras symbolic inputs/outputs should switch to using tf.is_tensor.
    - Code that is overly dependent on the exact names attached to symbolic tensors (e.g. assumes there will be ":0" at the end of the inputs, treats names as unique identifiers instead of using tensor.ref(), etc.) may break.
    - Code that uses the full path for get_concrete_function to trace Keras symbolic inputs directly should switch to building matching tf.TensorSpecs directly and tracing the TensorSpec objects.
    - Code that relies on the exact number and names of the op layers that TensorFlow operations were converted into may break, because these may have changed.
    - Code that uses tf.map_fn/tf.cond/tf.while_loop/control flow as op layers and happened to work before TF 2.4 is now explicitly unsupported. Converting these ops to Functional API op layers was unreliable before TF 2.4, and prone to erroring incomprehensibly or being silently buggy.
    - Code that directly asserts on a Keras symbolic value in cases where ops like tf.rank used to return a static or symbolic value depending on whether the input had a fully static shape. Now these ops always return symbolic values.
    - Code already susceptible to leaking tensors outside of graphs becomes slightly more likely to do so now.
    - Code that tries directly getting gradients with respect to symbolic Keras inputs/outputs. Use GradientTape on the actual Tensors passed to the already-constructed model instead.
    - Code that requires very tricky shape manipulation via converted op layers in order to work, where the Keras symbolic shape inference proves insufficient.
    - Code that tries manually walking a tf.keras.Model layer by layer and assumes layers only ever have one positional argument. This assumption did not hold before TF 2.4 either, but is more likely to cause issues now.
    - Code that manually enters keras.backend.get_graph() before building a functional model no longer needs to do so.
    - Input shape assumptions are now enforced when calling Functional API Keras models. This may break some users if there is a mismatch between the shape used when creating Input objects in a Functional model and the shape of the data passed to that model. You can fix this mismatch either by calling the model with correctly shaped data or by relaxing Input shape assumptions (note that you can pass shapes with None entries for axes that are meant to be dynamic). You can also disable the input checking entirely by setting model.input_spec = None.
  - Several changes have been made to tf.keras.mixed_precision.experimental. Note that it is now recommended to use the non-experimental tf.keras.mixed_precision API.
  - AutoCastVariable.dtype now refers to the actual variable dtype, not the dtype it will be cast to.
  - When mixed precision is enabled, tf.keras.layers.Embedding now outputs a float16 or bfloat16 tensor instead of a float32 tensor.
  - The property tf.keras.mixed_precision.experimental.LossScaleOptimizer.loss_scale is now a tensor, not a LossScale object. This means that to get the loss scale of a LossScaleOptimizer as a tensor, you must now use opt.loss_scale instead of calling opt.loss_scale().
  - The property should_cast_variables has been removed from tf.keras.mixed_precision.experimental.Policy.
  - When passing a tf.mixed_precision.experimental.DynamicLossScale to tf.keras.mixed_precision.experimental.LossScaleOptimizer, the DynamicLossScale's multiplier must be 2.
  - When passing a tf.mixed_precision.experimental.DynamicLossScale to tf.keras.mixed_precision.experimental.LossScaleOptimizer, the weights of the DynamicLossScale are copied into the LossScaleOptimizer instead of being reused. This means modifying the weights of the DynamicLossScale will no longer affect the weights of the LossScaleOptimizer, and vice versa.
  - The global policy can no longer be set to a non-floating point policy in tf.keras.mixed_precision.experimental.set_policy.
  - In Layer.call, AutoCastVariables will no longer be cast within MirroredStrategy.run or ReplicaContext.merge_call. This is because a thread-local variable is used to determine whether AutoCastVariables are cast, and those two functions run in a different thread. Note this only applies if one of these two functions is called within Layer.call; if one of those two functions calls Layer.call, AutoCastVariables will still be cast.
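The input shape enforcement described in the tf.keras notes above comes down to comparing a declared shape against the shape of the incoming data, with None acting as a wildcard for dynamic axes. The sketch below illustrates that rule in plain Python. The helper shape_matches is hypothetical and is not the check Keras performs internally; it only shows why relaxing an axis to None removes a mismatch.

```python
from typing import Optional, Sequence


def shape_matches(declared: Sequence[Optional[int]],
                  actual: Sequence[int]) -> bool:
    """Return True if `actual` fits `declared`; None means 'any size'."""
    if len(declared) != len(actual):
        return False  # rank mismatch can never be compatible
    return all(d is None or d == a for d, a in zip(declared, actual))


declared_strict = (None, 32)     # dynamic batch axis, feature axis fixed at 32
declared_relaxed = (None, None)  # both axes declared dynamic

print(shape_matches(declared_strict, (8, 32)))   # data matches the declaration
print(shape_matches(declared_strict, (8, 16)))   # feature axis disagrees
print(shape_matches(declared_relaxed, (8, 16)))  # relaxed declaration accepts it
```

Under this reading, the two fixes named in the notes correspond to changing the data so the second call passes, or relaxing the declaration as in the third call.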
- tf.data:
  - tf.data.experimental.service.DispatchServer now takes a config tuple instead of individual arguments. Usages should be updated to tf.data.experimental.service.DispatchServer(dispatcher_config).
  - tf.data.experimental.service.WorkerServer now takes a config tuple instead of individual arguments. Usages should be updated to tf.data.experimental.service.WorkerServer(worker_config).
- tf.distribute:
  - Removes tf.distribute.Strategy.experimental_make_numpy_dataset. Please use tf.data.Dataset.from_tensor_slices instead.
  - Renames experimental_hints in tf.distribute.StrategyExtended.reduce_to, tf.distribute.StrategyExtended.batch_reduce_to, tf.distribute.ReplicaContext.all_reduce to options.
  - Renames tf.distribute.experimental.CollectiveHints
  ...
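For the removal of experimental_make_numpy_dataset noted under tf.distribute above, a migration might look like the following sketch. It assumes TensorFlow 2.4 or later and NumPy are installed, uses made-up arrays named features and labels, and relies on the default strategy returned by tf.distribute.get_strategy(); a real setup would substitute its own strategy and data.

```python
import numpy as np
import tensorflow as tf

# Hypothetical in-memory data: 6 examples with 2 features each.
features = np.arange(12, dtype=np.float32).reshape(6, 2)
labels = np.arange(6, dtype=np.int32)

# Build the dataset directly from the NumPy arrays, then batch it.
dataset = tf.data.Dataset.from_tensor_slices((features, labels)).batch(2)

# Hand the dataset to a strategy; the default strategy is used here
# only so the sketch runs on a single device.
strategy = tf.distribute.get_strategy()
dist_dataset = strategy.experimental_distribute_dataset(dataset)

num_batches = sum(1 for _ in dist_dataset)
print("batches:", num_batches)
```

Batching before distributing keeps the global batch size explicit, which is one reasonable choice; other pipelines may prefer to shuffle or repeat first.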
Doi test 3
Yet one more DOI test.