This document lists the release notes for the TensorFlow-Neuron package.
- Issue: when compiling large models, users may run out of memory and encounter this fatal error:
terminate called after throwing an instance of 'std::bad_alloc'
Solution: run compilation on a c5.4xlarge instance type or larger.
Date: 09/22/2020
- tensorflow-neuron now automatically enables data parallel mode on the four cores of one Inferentia. In tensorflow-model-server-neuron, most models can now fully utilize four cores automatically. In Python TensorFlow, running threaded inference with >= 4 Python threads in the same TensorFlow Session leads to full utilization of the four cores.
- tensorflow-neuron now tries to enable dynamic batch size automatically for a limited number of models, such as ResNet50.
- Improved logging during tfn.saved_model.compile to display input/output information about the subgraphs that will be compiled by neuron-cc.
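The threaded-inference pattern described above can be sketched as follows. This is an illustration only: the `predictor` function below is a stand-in for a real predictor loaded from a compiled SavedModel (for example via `tf.contrib.predictor.from_saved_model` in TF 1.15), and the feed names and values are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a predictor loaded from a compiled SavedModel, e.g.
# tf.contrib.predictor.from_saved_model('./resnet50_neuron') in TF 1.15.
# Real inference would run inside one shared tensorflow Session.
def predictor(feed):
    return {'output': [x * 2 for x in feed['input']]}

# With >= 4 threads issuing inferences concurrently against the same
# Session, all four NeuronCores of one Inferentia can be kept busy.
feeds = [{'input': [i]} for i in range(8)]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(predictor, feeds))
```

`pool.map` preserves input order, so `results[i]` corresponds to `feeds[i]` even though the calls run concurrently.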
Date: 08/08/2020
Various minor improvements.
Date: 08/05/2020
Various minor improvements.
Date: 07/16/2020
This version contains a few bug fixes and user experience improvements.
- Bumped the tensorflow base package version number to 1.15.3
- Added tensorflow >= 1.15.0, < 1.16.0 as an installation dependency so that packages depending on tensorflow can be installed together with tensorflow-neuron without error
- tensorflow-neuron now displays a summary of model performance when profiling is enabled by setting the environment variable NEURON_PROFILE
- The environment variable NEURON_PROFILE can now be set to a non-existing path, which will be created automatically
- Fixed a bug in tfn.saved_model.compile that caused compilation failures when dynamic_batch_size=True is specified on a SavedModel with unknown-rank inputs
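Profiling can be enabled by setting NEURON_PROFILE before the model is loaded; a minimal sketch, where the directory name is a placeholder:

```python
import os

# Minimal sketch: point NEURON_PROFILE at a directory before loading the
# model. The path here is a placeholder; per the note above, a
# non-existing path is created automatically by tensorflow-neuron.
os.environ['NEURON_PROFILE'] = './neuron_profile'
```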
Date: 6/11/2020
This version contains a few bug fixes.
- Fixed a bug related to device placement. Models with device information hardcoded to GPU can now be successfully compiled with tfn.saved_model.compile
- Fixed a bug in tfn.saved_model.compile that caused models containing Reshape operators to function incorrectly when compiled with dynamic_batch_size=True
- Fixed a bug in tfn.saved_model.compile that caused models containing Table-related operators to initialize incorrectly after compilation
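For illustration, a hedged sketch of a compile call with dynamic batch size enabled. The directory paths are placeholders, and running this requires the tensorflow-neuron package and compiler toolchain, so it is a sketch rather than a complete example:

```python
# Hypothetical sketch (paths are placeholders): compile a SavedModel
# with dynamic batch size enabled so the compiled model can accept
# varying batch sizes at inference time.
import tensorflow.neuron as tfn

tfn.saved_model.compile(
    './resnet50_savedmodel',   # input SavedModel directory
    './resnet50_neuron',       # output directory for the compiled model
    dynamic_batch_size=True,
)
```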
Date: 5/11/2020
This version contains some bug fixes and new features.
- Tensorflow-Neuron is now built on TensorFlow 1.15.2 instead of TensorFlow 1.15.0
- Fixed a bug that caused some Neuron runtime resources to remain unreleased when a tensorflow-neuron process terminated with in-flight inferences
- Inference timeout value set at compile time is now correctly recognized at runtime
Date: 3/26/2020
- Improved performance between TensorFlow and the Neuron runtime.
- Fixed a bug in Neuron runtime adaptor operator's shape function when dynamic batch size inference is enabled
- The framework method tensorflow.neuron.saved_model.compile now handles compiler timeout termination more gracefully by letting the compiler clean up before exiting.
Date: 2/27/2020
- Enabled runtime memory optimizations by default to improve inference performance, specifically in cases with large input/output tensors
- tfn.saved_model.compile now displays a warning message instead of "successfully compiled" if fewer than 30% of operators are mapped to Inferentia
- Improved error messages. Runtime failure error messages are now more descriptive and also provide instructions for restarting neuron-rtd when necessary.
- Issue: when compiling a large model, you may encounter:
terminate called after throwing an instance of 'std::bad_alloc'
Solution: run compilation on a c5.4xlarge instance type or larger.
Date: 1/27/2020
- Added support for NCHW pooling operators in tfn.saved_model.compile.
- Fixed GRPC transient status error issue.
- Fixed a graph partitioner issue with control inputs.
- Issue: when compiling a large model, you may encounter:
terminate called after throwing an instance of 'std::bad_alloc'
Solution: run compilation on a c5.4xlarge instance type or larger.
Date: 12/20/2019
- Improved handling of tf.neuron.saved_model.compile arguments
Date: 12/1/2019
- Fix race condition between model load and model unload when the process is killed
- Remove unnecessary GRPC calls when the process is killed
- When compiling a large model, you may encounter "terminate called after throwing an instance of 'std::bad_alloc'". Solution: run compilation on a c5.4xlarge instance type or larger.
- The pip package wrapt may have a conflicting version in some installations. This is seen when the following error occurs:
ERROR: Cannot uninstall 'wrapt'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
To solve this, you can update wrapt to the newer version:
python3 -m pip install wrapt --ignore-installed
python3 -m pip install tensorflow-neuron
Within a Conda environment:
conda update wrapt
conda update tensorflow-neuron
Date: 11/25/2019
This version is available only in the released DLAMI v26.0 and is based on TensorFlow version 1.15.0. Please update to the latest version.
The following models have successfully run on Neuron-Inferentia systems:
- BERT_LARGE and BERT_BASE
- Transformer
- Resnet50 V1/V2
- Inception-V2/V3/V4
- Python versions supported:
- 3.5, 3.6, 3.7
- Linux distribution supported:
- Ubuntu 16, Ubuntu 18, Amazon Linux 2