Update dependency transformers to v4.38.0 [SECURITY] #69
This PR contains the following updates:

| Package | Change |
| --- | --- |
| transformers | `==4.30.2` -> `==4.38.0` |
GitHub Vulnerability Alerts
CVE-2023-7018
Deserialization of Untrusted Data in GitHub repository huggingface/transformers prior to 4.36.
CVE-2023-6730
Deserialization of Untrusted Data in GitHub repository huggingface/transformers prior to 4.36.0.
CVE-2024-3568
The huggingface/transformers library is vulnerable to arbitrary code execution through deserialization of untrusted data within the `load_repo_checkpoint()` function of the `TFPreTrainedModel()` class. Attackers can execute arbitrary code and commands by crafting a malicious serialized payload, exploiting the use of `pickle.load()` on data from potentially untrusted sources. This vulnerability allows for remote code execution (RCE) by deceiving victims into loading a seemingly harmless checkpoint during a normal training process, thereby enabling attackers to execute arbitrary code on the targeted machine.

Release Notes
huggingface/transformers (transformers)
v4.38.0: v4.38: Gemma, Depth Anything, Stable LM; Static Cache, HF Quantizer, AQLM (Compare Source)
New model additions
💎 Gemma 💎
Gemma is a new open-source language model series from Google AI that comes with a 2B and a 7B variant. The release includes the pre-trained and instruction fine-tuned versions, and you can use them via the `AutoModelForCausalLM`, `GemmaForCausalLM`, or `pipeline` interfaces! Read more about it in the Gemma release blogpost: https://hf.co/blog/gemma

You can use the model with Flash Attention, SDPA, the static cache, and the quantization API for further optimizations!
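For reference, a minimal sketch of loading Gemma through the `AutoModelForCausalLM` interface (the checkpoint id `google/gemma-7b` and gated Hub access are assumptions here, not part of these notes):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "google/gemma-7b"  # assumed checkpoint id; the 2B variant follows the same pattern

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The same checkpoint can also be served through `pipeline("text-generation", model=model_id)` if you prefer the pipeline interface.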
Depth Anything Model
The Depth Anything model was proposed in Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data by Lihe Yang, Bingyi Kang, Zilong Huang, Xiaogang Xu, Jiashi Feng, Hengshuang Zhao. Depth Anything is based on the DPT architecture, trained on ~62 million images, obtaining state-of-the-art results for both relative and absolute depth estimation.
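As a hedged sketch, the model can be exercised through the depth-estimation pipeline; the checkpoint id below is an assumed example of a Depth Anything checkpoint on the Hub:

```python
from transformers import pipeline

# Assumed checkpoint id, shown for illustration only.
depth_estimator = pipeline(task="depth-estimation", model="LiheYoung/depth-anything-small-hf")

result = depth_estimator("http://images.cocodataset.org/val2017/000000039769.jpg")
depth_image = result["depth"]             # PIL image holding the predicted depth map
depth_tensor = result["predicted_depth"]  # raw model output as a tensor
```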
Stable LM
StableLM 3B 4E1T was proposed in StableLM 3B 4E1T: Technical Report by Stability AI and is the first model in a series of multi-epoch pre-trained language models.
StableLM 3B 4E1T is a decoder-only base language model pre-trained on 1 trillion tokens of diverse English and code datasets for four epochs. The model architecture is transformer-based with partial Rotary Position Embeddings, SwiGLU activation, LayerNorm, etc.
The team also provides StableLM Zephyr 3B, an instruction fine-tuned version of the model that can be used for chat-based applications.
- `StableLM` by @jon-tow in #28810

⚡️ Static cache was introduced in the following PRs ⚡️

Static past key value cache allows `LlamaForCausalLM`'s forward pass to be compiled using `torch.compile`! This means that (cuda) graphs can be used for inference, which speeds up the decoding step by 4x! A forward pass of Llama2 7B takes around 10.5 ms to run with this on an A100! Equivalent to TGI performance! ⚡️

- [`Core generation`] Adds support for static KV cache by @ArthurZucker in #27931
- [`CLeanup`] Revert SDPA attention changes that got in the static kv cache PR by @ArthurZucker in #29027

Support in `generate` is not included yet. This feature is experimental and subject to changes in subsequent releases.
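As a rough sketch only: the static-cache setup itself is still experimental in this release and is not shown here; this only illustrates the `torch.compile` side that the fixed-shape cache makes graph-friendly. The Llama-2 checkpoint id is an assumption (gated access required).

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # assumed example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="cuda")

# With a static (fixed-shape) KV cache the decoding step keeps constant tensor shapes,
# which is what lets CUDA graphs be captured and replayed instead of re-launching kernels.
model.forward = torch.compile(model.forward, mode="reduce-overhead")

inputs = tokenizer("The theory of relativity states", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model(**inputs)  # the first call triggers compilation; subsequent calls are fast
```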
Quantization

🧼 HF Quantizer 🧼
`HfQuantizer` makes it easy for quantization method researchers and developers to add inference and/or quantization support in 🤗 transformers. If you are interested in adding support for new methods, please refer to this documentation page: https://huggingface.co/docs/transformers/main/en/hf_quantizer

- `HfQuantizer` class for quantization-related stuff in `modeling_utils.py` by @poedator in #26610
- [`HfQuantizer`] Move it to "Developper guides" by @younesbelkada in #28768
- [`HFQuantizer`] Remove `check_packages_compatibility` logic by @younesbelkada in #28789

⚡️ AQLM ⚡️
AQLM is a new quantization method that enables 2-bit precision with no performance degradation. Check out this demo on how to run Mixtral in 2-bit on a free-tier Google Colab instance: https://huggingface.co/posts/ybelkada/434200761252287
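A minimal sketch of loading an AQLM-quantized checkpoint, assuming the `aqlm` package is installed and using an assumed example model id (not taken from these notes):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed example of an AQLM 2-bit checkpoint on the Hub; requires `pip install aqlm[gpu]`.
model_id = "ISTA-DASLab/Mixtral-8x7b-AQLM-2Bit-1x16-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

inputs = tokenizer("2-bit quantization works because", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=30)[0], skip_special_tokens=True))
```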
🧼 Moving canonical repositories 🧼
The canonical repositories on the Hugging Face Hub (models that did not have an organization, like `bert-base-cased`) have been moved under organizations. You can find the entire list of moved models here: https://huggingface.co/collections/julien-c/canonical-models-65ae66e29d5b422218567567

Redirection has been set up so that your code continues working even if you keep calling the previous paths. We still encourage you to update your code to use the new links so that it is entirely future-proof.
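In practice both spellings load the same model; the org-scoped id below is taken as an assumption from the canonical-models collection linked above:

```python
from transformers import AutoTokenizer

# Legacy, organization-less path: still works thanks to the redirection.
legacy = AutoTokenizer.from_pretrained("bert-base-cased")

# Org-scoped path (assumed new canonical location): the future-proof spelling.
canonical = AutoTokenizer.from_pretrained("google-bert/bert-base-cased")
```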
Flax Improvements 🚀
The Mistral model was added to the library in Flax.
TensorFlow Improvements 🚀
With Keras 3 becoming the standard version of Keras in TensorFlow 2.16, we've made some internal changes to maintain compatibility. We now have full compatibility with TF 2.16 as long as the `tf-keras` compatibility package is installed. We've also taken the opportunity to do some cleanup - in particular, objects like `BatchEncoding` that are returned by our tokenizers and processors can now be directly passed to Keras methods like `model.fit()`, which should simplify a lot of code and eliminate a long-standing source of annoyances.
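A small, hedged sketch of what this enables (the checkpoint id is an assumed example; under TF 2.16 the `tf-keras` package mentioned above must be installed):

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

model_id = "distilbert-base-uncased"  # assumed example checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = TFAutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

batch = tokenizer(["great movie", "terrible movie"], padding=True, return_tensors="tf")
labels = tf.constant([1, 0])

model.compile(optimizer="adam")  # no loss given: the model's internal loss computation is used
# The BatchEncoding returned by the tokenizer can now be handed to fit() directly.
model.fit(batch, labels, epochs=1)
```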
Pre-Trained backbone weights 🚀

Enable loading in pretrained backbones in a new model, where all other weights are randomly initialized. Note that validation checks are still in place when creating a config: passing in `use_pretrained_backbone` will raise an error. You can override this by setting `config.use_pretrained_backbone = True` after creating a config. However, it is not yet guaranteed to be fully backwards compatible.

Introduce a helper function `load_backbone` to load a backbone from a backbone's model config, e.g. `ResNetConfig`, or from a model config which contains backbone information. This enables cleaner modeling files and cross-loading between timm and transformers backbones.

- [`Backbone`] Use `load_backbone` instead of `AutoBackbone.from_config` by @amyeroberts in #28661
- Add in API references, list supported backbones, updated examples, clarification and moving information to better reflect usage and docs
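A tentative sketch of the helper, assuming the import path below (the function name comes from these notes, the module path does not):

```python
from transformers import ResNetConfig
from transformers.utils.backbone_utils import load_backbone  # assumed import path

# Build a backbone from a backbone model config...
resnet_backbone = load_backbone(ResNetConfig(out_features=["stage2", "stage3", "stage4"]))

# ...or hand load_backbone a downstream model config that carries backbone information,
# which is what keeps modeling files clean and lets timm and transformers backbones cross-load.
```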
Image Processor work 🚀
Bugfixes and improvements 🚀
- [`Llava`] Update convert_llava_weights_to_hf.py script by @isaac-vidas in #28617
- [`GPTNeoX`] Fix GPTNeoX + Flash Attention 2 issue by @younesbelkada in #28645
- [`SigLIP`] Only import tokenizer if sentencepiece available by @amyeroberts in #28636
- `PartialState().default_device` as it has been officially released by @statelesshz in #27256
- `tensor_size` - fix copy/paste error msg typo by @scruel in #28660
- `CodeGenTokenizer` by @cmathw in #28628
- `GenerationConfig`, now the `generation_config.json` can be loaded successfully by @ParadoxZW in #28604
- [`chore`] Add missing space in warning by @tomaarsen in #28695
- [`Vilt`] align input and model dtype in the ViltPatchEmbeddings forward pass by @faaany in #28633
- [`docs`] Improve visualization for vertical parallelism by @petergtz in #28583
- `LocalEntryNotFoundError` during `processor_config.json` loading by @ydshieh in #28709
- [`docs`] Update preprocessing.md by @velaia in #28719
- `weights_only` by @ydshieh in #28725
- `GatedRepoError` to use cache file (fix #28558). by @scruel in #28566
- [`Siglip`] protect from imports if sentencepiece not installed by @amyeroberts in #28737
- `DepthEstimationPipeline`'s docstring by @ydshieh in #28733
- `Block`. by @xkszltl in #28727
- `load_in_8bit` and `load_in_4bit` at the same time by @osanseviero in #28266
- [`bnb`] Fix bnb slow tests by @younesbelkada in #28788
- `torch.arange` dtype on `float` usage to avoid incorrect initialization by @gante in #28760
- `is_torch_bf16_available_on_device` more strict by @ydshieh in #28796
- `-v` for `pytest` on CircleCI by @ydshieh in #28840
- `test_encoder_decoder_model_generate` for `vision_encoder_deocder` as flaky by @amyeroberts in #28842
- [`Doc`] update contribution guidelines by @ArthurZucker in #28858
- `save_only_model` with `load_best_model_at_end` for DeepSpeed/FSDP by @pacman100 in #28866
- `FastSpeech2ConformerModelTest` and skip it on CPU by @ydshieh in #28888
- `torchaudio` get the correct version in `torch_and_flax_job` by @ydshieh in #28899
- `logging_first_step` by removing "evaluate" by @Sai-Suraj-27 in #28884
- `Exception` when trying to generate 0 tokens
- `torch_dtype` as `str` to actual torch data type (i.e. "float16" … to `torch.float16`) by @KossaiSbai in #28208
- [`pipelines`] updated docstring with vqa alias by @cmahmut in #28951
- `test_save_load_fast_init_from_base` as flaky by @gante in #28930
- [`NllbTokenizer`] refactor with added tokens decoder by @ArthurZucker in #27717
- [`DETR`] Update the processing to adapt masks & bboxes to reflect padding by @amyeroberts in #28363
- `quantization_config` is in config but not passed as an arg by @younesbelkada in #28988
- [`AutoQuantizer`]: enhance trainer + not supported quant methods by @younesbelkada in #28991
- [`Doc`] Fix docbuilder - make `BackboneMixin` and `BackboneConfigMixin` importable from `utils`. by @amyeroberts in #29002
- `test_trainer` to float32 by @statelesshz in #28920
- [`Trainer` / tags]: Fix trainer + tags when users do not pass `"tags"` to `trainer.push_to_hub()` by @younesbelkada in #29009
- `logger.warning` + inline with recent refactor by @younesbelkada in #29039
- `test_save_load_low_cpu_mem_usage` tests by @amyeroberts in #29043
- `generation/utils.py::GenerateEncoderDecoderOutput`'s docstring by @sadra-barikbin in #29044
- `auto_find_batch_size` isn't yet supported with DeepSpeed/FSDP. Raise error accordingly. by @pacman100 in #29058
- [`Awq`] Add peft support for AWQ by @younesbelkada in #28987
- [`bnb` / `tests`]: Fix currently failing bnb tests by @younesbelkada in #29092
- `bert-base-cased` tokenizer configuration test by @LysandreJik in #29105
- `examples/pytorch/text-classification/run_classification.py` by @Ja1Zhou in #29072
- `pipelines/base.py::Pipeline::_sanitize_parameters()`'s docstring by @sadra-barikbin in #29102
- [`gradient_checkpointing`] default to use it for torch 2.3 by @ArthurZucker in #28538
- [`Trainer` / `bnb`]: Add RMSProp from `bitsandbytes` to HF `Trainer` by @younesbelkada in #29082
- [`bnb` / `tests`] Propagate the changes from #29092 to 4-bit tests by @younesbelkada in #29122
- [`cuda kernels`] only compile them when initializing by @ArthurZucker in #29133
- [`PEFT` / `Trainer`] Handle better peft + quantized compiled models by @younesbelkada in #29055
- [`Core tokenization`] `add_dummy_prefix_space` option to help with latest issues by @ArthurZucker in #28010
- [`pipeline`] Add pool option to image feature extraction pipeline by @amyeroberts in #28985

Significant community contributions
The following contributors have made significant changes to the library over the last release:
Configuration
📅 Schedule: Branch creation - "" (UTC), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
🔕 Ignore: Close this PR and you won't be reminded about these updates again.
This PR was generated by Mend Renovate. View the repository job log.