Releases: keras-team/keras-hub
v0.11.0
Summary
This release has no major feature updates, but changes the location where our source code is held. Source code is now split into `src/` and `api/` directories, with an explicit API surface similar to core Keras.
When adding or removing API in a PR, use `./shell/api_gen.sh` to update the autogenerated `api/` files. See our contributing guide.
What's Changed
- Change the order of importing `keras` by @james77777778 in #1596
- Add backend info to HF model card by @SamanehSaadat in #1599
- Bump required kagglehub version to 0.2.4 by @SamanehSaadat in #1600
- Bump `bert_tiny_en_uncased_sst2` classifier version by @SamanehSaadat in #1602
- Allow a task preprocessor to be an argument in `from_preset` by @SamanehSaadat in #1603
- API Generation by @sampathweb in #1608
- Update readme with some recent changes by @mattdangerw in #1575
- Bump the python group with 2 updates by @dependabot in #1611
- Version bump 0.11.0.dev0 by @mattdangerw in #1615
- Unexport models from the 0.11 release by @mattdangerw in #1614
- Version bump 0.11.0 by @mattdangerw in #1616
New Contributors
- @james77777778 made their first contribution in #1596
Full Changelog: v0.10.0...v0.11.0
v0.10.0
Summary
- Added support for `Task` (`CausalLM` and `Classifier`) saving and loading, which allows uploading `Task`s.
- Added basic Model Card for Hugging Face upload.
- Added support for a `positions` array in our `RotaryEmbedding` layer.
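To make the new `positions` argument concrete, here is a minimal NumPy sketch of rotary embedding applied at explicit positions. This is an illustration of the technique only, not the KerasNLP implementation; the interleaved even/odd pairing used below is an assumption and may differ from `keras_nlp.layers.RotaryEmbedding`.

```python
import numpy as np

def rotary_embedding(x, positions, max_wavelength=10000):
    """Rotate feature pairs of x by position-dependent angles.

    x: (seq_len, dim) with even dim; positions: (seq_len,) positions.
    """
    dim = x.shape[-1]
    inv_freq = 1.0 / (max_wavelength ** (np.arange(0, dim, 2) / dim))
    angles = positions[:, None] * inv_freq[None, :]  # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    # 2D rotation applied to each (even, odd) feature pair.
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x2 * cos + x1 * sin
    return out
```

Accepting an explicit positions array (rather than assuming `0..seq_len-1`) is what lets cached generation rotate only newly appended tokens at their true offsets.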
What's Changed
- 0.9 is out, nightly should be a preview of 0.10 now by @mattdangerw in #1570
- Do the reverse embedding in the same dtype as the input embedding by @mattdangerw in #1548
- Add support for `positions` array in `keras_nlp.layers.RotaryEmbedding` layer by @tirthasheshpatel in #1571
- Support Task Saving/Loading by @SamanehSaadat in #1547
- Improve error handling for non-keras model loading attempts by @SamanehSaadat in #1577
- Add Model Card for Hugging Face Upload by @SamanehSaadat in #1578
- Add Saving Tests by @SamanehSaadat in #1590
- Improve error handling for missing TensorFlow dependency in keras_nlp. by @SamanehSaadat in #1585
- Fix Keras import by @sampathweb in #1593
- Check kagglehub version before upload by @SamanehSaadat in #1594
- Version bump to 0.10.0.dev0 by @SamanehSaadat in #1595
- Version bump 0.10.0.dev1 by @SamanehSaadat in #1601
- Version bump to 0.10.0.dev2 by @SamanehSaadat in #1604
- Version bump to 0.10.0 by @SamanehSaadat in #1606
Full Changelog: v0.9.3...v0.10.0
v0.9.3
Patch release with fixes for Llama and Mistral saving.
What's Changed
- Fix saving bug for untied weights with keras 3.2 by @mattdangerw in #1568
- Version bump for dev release by @mattdangerw in #1569
- Version bump 0.9.3 by @mattdangerw in #1572
Full Changelog: v0.9.2...v0.9.3
v0.9.2
Summary
- Initial release of CodeGemma.
- Bump to a Gemma 1.1 version without download issues on Kaggle.
What's Changed
- Fix `print_fn` issue in task test by @SamanehSaadat in #1563
- Update presets for code gemma by @mattdangerw in #1564
- version bump 0.9.2.dev0 by @mattdangerw in #1565
- Version bump 0.9.2 by @mattdangerw in #1566
Full Changelog: v0.9.1...v0.9.2
v0.9.1
Patch fix for a bug with `stop_token_ids`.
What's Changed
- Fix the new stop_token_ids argument by @mattdangerw in #1558
- Fix tests with the "auto" default for stop token ids by @mattdangerw in #1559
- Version bump for 0.9.1 by @mattdangerw in #1560
Full Changelog: v0.9.0...v0.9.1
v0.9.0
The 0.9.0 release adds new models, hub integrations, and general usability improvements.
Summary
- Added the Gemma 1.1 release.
- Added the Llama 2, BLOOM and ELECTRA models.
- Expose new base classes. Allow `from_preset()` on base classes:
  - `keras_nlp.models.Backbone`
  - `keras_nlp.models.Task`
  - `keras_nlp.models.Classifier`
  - `keras_nlp.models.CausalLM`
  - `keras_nlp.models.Seq2SeqLM`
  - `keras_nlp.models.MaskedLM`
- Some initial features for uploading to model hubs:
  - `backbone.save_to_preset`, `tokenizer.save_to_preset`, `keras_nlp.upload_preset`.
  - `from_preset` and `upload_preset` now work with the Hugging Face Models Hub.
  - More features (task saving, lora saving) and full documentation coming soon.
- Numerical fixes for the Gemma model at `mixed_bfloat16` precision. Thanks to unsloth for catching!
```python
# Llama 2. Needs Kaggle consent and login, see https://github.com/Kaggle/kagglehub
causal_lm = keras_nlp.models.LlamaCausalLM.from_preset(
    "llama2_7b_en",
    dtype="bfloat16",  # Run at half precision for inference.
)
causal_lm.generate("Keras is a", max_length=128)

# Base class usage.
keras_nlp.models.Classifier.from_preset("bert_base_en", num_classes=2)
keras_nlp.models.Tokenizer.from_preset("gemma_2b_en")
keras_nlp.models.CausalLM.from_preset("gpt2_base_en", dtype="mixed_bfloat16")
```
What's Changed
- Add dtype arg to Gemma HF conversion script by @nkovela1 in #1452
- Fix gemma testing import by @mattdangerw in #1462
- Add docstring for PyTorch conversion script install instructions by @nkovela1 in #1471
- Add an annotation to tests that need kaggle auth by @mattdangerw in #1470
- Fix Mistral memory consumption with JAX and default dtype bug by @tirthasheshpatel in #1460
- Bump the master version to 0.9 by @mattdangerw in #1473
- Pin to TF 2.16 RC0 by @sampathweb in #1478
- Fix gemma rms_normalization's use of epsilon by @cpsauer in #1472
- Add `FalconBackbone` by @SamanehSaadat in #1475
- CI - Add kaggle creds to pull model by @sampathweb in #1459
- bug in example for ReversibleEmbedding by @TheCrazyT in #1484
- Doc fix for contrastive sampler by @mattdangerw in #1488
- Remove broken link to masking and padding guide by @mattdangerw in #1487
- Fix a typo in causal_lm_preprocessors by @SamanehSaadat in #1489
- Fix dtype accessors of tasks/backbones by @mattdangerw in #1486
- Auto-labels 'gemma' on 'gemma' issues/PRs. by @shmishra99 in #1490
- Add BloomCausalLM by @abuelnasr0 in #1467
- Remove the bert jupyter conversion notebooks by @mattdangerw in #1492
- Add `FalconTokenizer` by @SamanehSaadat in #1485
- Add `FalconPreprocessor` by @SamanehSaadat in #1498
- Rename 176B presets & Add other presets into bloom_presets.py by @abuelnasr0 in #1496
- Add bloom presets by @abuelnasr0 in #1501
- Create workflow for auto assignment of issues and for stale issues by @sachinprasadhs in #1495
- Update requirements to TF 2.16 by @sampathweb in #1503
- Expose Task and Backbone by @mattdangerw in #1506
- Clean up and add our gemma conversion script by @mattdangerw in #1493
- Don't auto-update JAX GPU by @sampathweb in #1507
- Keep rope at float32 precision by @grasskin in #1497
- Bump the python group with 2 updates by @dependabot in #1509
- Fixes for the LLaMA backbone + add dropout by @tirthasheshpatel in #1499
- Add `LlamaPreprocessor` and `LlamaCausalLMPreprocessor` by @tirthasheshpatel in #1511
- Always run the rotary embedding layer in float32 by @tirthasheshpatel in #1508
- CI: Fix psutil - Remove install of Python 3.9 and alias of python3 by @sampathweb in #1514
- Update gemma_backbone.py for sharding config. by @qlzh727 in #1491
- Docs/modelling layers by @mykolaskrynnyk in #1502
- Standardize docstring by @sachinprasadhs in #1516
- Support tokenization of special tokens for word_piece_tokenizer by @abuelnasr0 in #1397
- Upload Model to Kaggle by @SamanehSaadat in #1512
- Add scoring mode to MistralCausalLM by @RyanMullins in #1521
- Add Mistral Instruct V0.2 preset by @tirthasheshpatel in #1520
- Add Tests for Kaggle Upload Validation by @SamanehSaadat in #1524
- Add presets for Electra and checkpoint conversion script by @pranavvp16 in #1384
- Allow saving / loading from Huggingface Hub preset by @Wauplin in #1510
- Stop on multiple end tokens by @grasskin in #1518
- Fix doc: `mistral_base_en` -> `mistral_7b_en` by @asmith26 in #1528
- Add lora example to GemmaCausalLM docstring by @SamanehSaadat in #1527
- Add LLaMA Causal LM with 7B presets by @tirthasheshpatel in #1526
- Add task base classes; support out of tree library extensions by @mattdangerw in #1517
- Doc fixes by @mattdangerw in #1530
- Run the LLaMA and Mistral RMS Layer Norm in float32 by @tirthasheshpatel in #1532
- Adds score API to GPT-2 by @RyanMullins in #1533
- increase pip timeout to 1000s to avoid connection resets by @sampathweb in #1535
- Adds the score API to LlamaCausalLM by @RyanMullins in #1534
- Implement compute_output_spec() for tokenizers with vocabulary. by @briango28 in #1523
- Remove straggler type annotations by @mattdangerw in #1536
- Always run SiLU activation in float32 for LLaMA and Mistral by @tirthasheshpatel in #1540
- Bump the python group with 2 updates by @dependabot in #1538
- Disallow saving to preset from keras 2 by @SamanehSaadat in #1545
- Fix the rotary embedding computation in LLaMA by @tirthasheshpatel in #1544
- Fix re-compilation bugs by @mattdangerw in #1541
- Fix preprocessor from_preset bug by @mattdangerw in #1549
- Fix a strange issue with preprocessing layer output types by @mattdangerw in #1550
- Fix lowercase bug in wordpiece tokenizer by @abuelnasr0 in #1543
- Small docs updates by @mattdangerw in #1553
- Add a few new preset for gemma by @mattdangerw in #1556
- Remove the dev prefix for 0.9.0 release by @mattdangerw in #1557
New Contributors
- @cpsauer made their first contribution in #1472
- @SamanehSaadat made their first contribution in #1475
- @TheCrazyT made their first contribution in #1484
- @shmishra99 made their first contribution in #1490
- @sachinprasadhs made their first contribution in #1495
- @mykolaskrynnyk made their first contribution in #1502
- @RyanMullins made their first contribution in #1521
- @Wauplin made their first contribution in #1510
- @asmith26 made their first contribution in #1528
- @briango28 made their first contribution in #1523
Full Changelog: v0.8.2...v0.9.0
v0.8.2
Summary
- Mistral fixes for dtype and memory usage. #1458
What's Changed
- Fix Mistral memory consumption with JAX and default dtype bug by @tirthasheshpatel in #1460
- Version bump for dev release by @mattdangerw in #1474
Full Changelog: v0.8.1...v0.8.2.dev0
v0.8.1
Minor fixes to Kaggle Gemma assets.
What's Changed
- Update to the newest version of Gemma on Kaggle by @mattdangerw in #1454
- Dev release 0.8.1.dev0 by @mattdangerw in #1456
- 0.8.1 version bump by @mattdangerw in #1457
Full Changelog: v0.8.0...v0.8.1
v0.8.0
The 0.8.0 release focuses on generative LLM features in KerasNLP.
Summary
- Added the `Mistral` and `Gemma` models.
- Allow passing `dtype` directly to backbone and task constructors.
- Add a settable `sequence_length` property to all preprocessing layers.
- Added `enable_lora()` to the backbone class for parameter efficient fine-tuning.
- Added layer attributes to backbone models for easier access to model internals.
- Added `AlibiBias` layer.
```python
# Pass dtype to a model.
causal_lm = keras_nlp.models.MistralCausalLM.from_preset(
    "mistral_instruct_7b_en",
    dtype="bfloat16",
)
# Settable sequence length property.
causal_lm.preprocessor.sequence_length = 128
# Lora API.
causal_lm.enable_lora(rank=4)
# Easy layer attributes.
for layer in causal_lm.backbone.transformer_layers:
    print(layer.count_params())
```
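The new `AlibiBias` layer implements ALiBi-style attention biases. As an illustration, here is a minimal NumPy sketch of the published ALiBi scheme for power-of-two head counts; it shows the idea, not the KerasNLP layer's exact code.

```python
import numpy as np

def alibi_slopes(num_heads):
    # For power-of-two head counts, ALiBi uses the geometric
    # sequence 2^(-8/n), 2^(-16/n), ..., one slope per head.
    return np.array([2.0 ** (-8.0 * (i + 1) / num_heads) for i in range(num_heads)])

def alibi_bias(num_heads, query_len, key_len):
    # Additive attention bias: a linear penalty on query-key distance,
    # scaled by a fixed per-head slope. Added to scores before softmax.
    slopes = alibi_slopes(num_heads)                    # (heads,)
    distance = np.arange(key_len)[None, :] - np.arange(query_len)[:, None]
    penalty = np.minimum(distance, 0)                   # future keys get 0 (masked anyway)
    return slopes[:, None, None] * penalty[None, :, :]  # (heads, q_len, k_len)
```

Because the penalty grows linearly with distance, ALiBi replaces learned positional embeddings and extrapolates to sequence lengths longer than those seen in training.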
What's Changed
- Fix test for recent keras 3 change by @mattdangerw in #1400
- Pass less state to jax generate function by @mattdangerw in #1398
- Add llama tokenizer by @mattdangerw in #1401
- Add Bloom Model by @abuelnasr0 in #1382
- Try fixing tests by @mattdangerw in #1411
- Revert "Pass less state to jax generate function (#1398)" by @mattdangerw in #1412
- Bloom tokenizer by @abuelnasr0 in #1403
- Update black formatting by @mattdangerw in #1415
- Add Alibi bias layer by @abuelnasr0 in #1404
- Pin to `tensorflow-hub 0.16.0` to fix CI error by @sampathweb in #1420
- Update TF Text and remove TF Hub deps by @sampathweb in #1423
- Pin Jax Version in GPU CI by @sampathweb in #1430
- Add Bloom preprocessor by @abuelnasr0 in #1424
- Add layer attributes for all functional models by @mattdangerw in #1421
- Allow setting dtype per model by @mattdangerw in #1431
- Add a Causal LM model for Mistral by @tirthasheshpatel in #1429
- Fix bart by @mattdangerw in #1434
- Add a settable property for sequence_length by @mattdangerw in #1437
- Add dependabot to update GH Actions and Python dependencies by @pnacht in #1380
- Bump the github-actions group with 1 update by @dependabot in #1438
- Add 7B presets for Mistral by @tirthasheshpatel in #1436
- Update byte_pair_tokenizer.py to close merges file properly by @divyashreepathihalli in #1440
- bump version to 0.8 by @mattdangerw in #1441
- Update our sampler documentation to reflect usage by @mattdangerw in #1444
- Add Gemma model by @mattdangerw in #1448
- Version bump for dev release by @mattdangerw in #1449
- Version bump to 0.8.0 by @mattdangerw in #1450
New Contributors
- @dependabot made their first contribution in #1438
- @divyashreepathihalli made their first contribution in #1440
Full Changelog: v0.7.0...v0.8.0
v0.7.0
This release integrates KerasNLP and Kaggle Models. KerasNLP models will now work in Kaggle offline notebooks and all assets will quickly attach to a notebook rather than needing a slow download.
Summary
KerasNLP pre-trained models are now all made available through Kaggle Models. You can see all models currently available in both KerasCV and KerasNLP here. Individual model pages will include example usage and a file browser to examine all available assets for a model preset.
This change will not affect the existing usage of `from_preset()`. Statements like `keras_nlp.models.BertClassifier.from_preset("bert_base_en")` will continue to work and download checkpoints from the Kaggle Models hub.
A note on model saving: for saving support across Keras 2 and Keras 3, we recommend using the new Keras saved model format. You can use `model.save('path/to/location.keras')` for a full model and `model.save_weights('path/to/location.weights.h5')` for checkpoints. See the Keras saving guide for more details.
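The saving recommendation can be sketched with a toy model (plain Keras standing in for a KerasNLP model; the paths are illustrative):

```python
import os
import tempfile

import numpy as np
import keras  # Keras 3 (the .keras format also works in recent tf.keras 2.x)

# A toy functional model stands in for a KerasNLP model.
inputs = keras.Input(shape=(8,))
outputs = keras.layers.Dense(4)(inputs)
model = keras.Model(inputs, outputs)

tmp = tempfile.mkdtemp()
model.save(os.path.join(tmp, "model.keras"))               # full model
model.save_weights(os.path.join(tmp, "model.weights.h5"))  # checkpoint only

# Round-trip: the restored model reproduces the original outputs.
restored = keras.models.load_model(os.path.join(tmp, "model.keras"))
x = np.ones((1, 8), dtype="float32")
print(np.allclose(model.predict(x, verbose=0), restored.predict(x, verbose=0)))
```

The `.keras` file captures architecture, weights, and compile state in one artifact, while the `.weights.h5` checkpoint requires rebuilding the model in code before loading.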
What's Changed
- Don't export model internals publicly by @mattdangerw in #1255
- Bump master branch version number to 0.7.0.dev0 by @mattdangerw in #1254
- Fix/allow different encoder and decoder feature dimensions in transformer decoder layer by @ferraric in #1260
- Doc updates to switch branding to Keras 3 by @mattdangerw in #1259
- Remove unused TPU testing for backbones by @mattdangerw in #1266
- Make gelu a function, not a lambda so it can be loaded without safe_mode=False by @calvingiles in #1262
- Update requirements and install instructions for multi-backend keras by @mattdangerw in #1257
- Support Keras 3 installation by @mattdangerw in #1258
- Remove dtensor by @mattdangerw in #1268
- Add a lora dense layer by @mattdangerw in #1263
- Factor out testing routines for models by @mattdangerw in #1269
- Convert T5 to Keras 3 by @nkovela1 in #1274
- Fix missing backticks in DistilBertClassifier docstrings by @Philmod in #1278
- T5 checkpoint conversion with HF by @nkovela1 in #1277
- Use gelu_approximate directly in t5 presets by @mattdangerw in #1284
- Add preset tests and weights URLs by @nkovela1 in #1285
- Support loading keras 3 nightly by @mattdangerw in #1286
- Remove the use of `SentencePieceTrainer` from tests by @tirthasheshpatel in #1283
- Fix XLM-RoBERTa detokenize() by @abheesht17 in #1289
- Correct tie_embedding_weights and add logit checking by @nkovela1 in #1288
- Add detokenize testing for model tokenizers by @mattdangerw in #1290
- Fix Whisper by @abheesht17 in #1287
- Test against Keras 3 by @mattdangerw in #1273
- Support TF_USE_LEGACY_KERAS by @mattdangerw in #1295
- Run workflows with read-only tokens by @pnacht in #1305
- Update CONTRIBUTING.md by @mattdangerw in #1310
- Add GitHub Action for Nightly by @sampathweb in #1309
- Fix the publish to pypi action by @mattdangerw in #1311
- Fix nightly tf failure by @mattdangerw in #1316
- Switch deberta to use the "int" dtype by @mattdangerw in #1315
- Add security policy by @pnacht in #1319
- Fix missing export for reversible embedding by @mattdangerw in #1327
- Add `version` API to keras_nlp by @grasskin in #1324
- Fix Keras 3 version check by @sampathweb in #1328
- Simplify running KerasNLP with Keras 3 by @mattdangerw in #1308
- Fix issues with version by @mattdangerw in #1332
- Fix typo in whisper presets files by @mattdangerw in #1337
- `ELECTRA` backbone implementation in keras by @pranavvp16 in #1291
- Fix t5 tokenizer expected output by @mattdangerw in #1348
- Add `__init__.py` for electra by @mattdangerw in #1352
- Remove lora dense for now by @mattdangerw in #1359
- Adds Kokoro Build script for Keras-NLP GPU tests by @sampathweb in #1355
- Fixes GPU Test failures for Keras 3 by @sampathweb in #1361
- Change Continuous config to also run only large tests by @sampathweb in #1362
- ElectraTokenizer by @pranavvp16 in #1357
- Add MistralAI's 7B Transformer as a backbone in KerasNLP Models by @tirthasheshpatel in #1314
- changing pooling output by @mbrhd in #1364
- Add `LlamaBackbone` by @shivance in #1203
- Align pip_build with keras by @sampathweb in #1374
- Remove cloudbuild config by @mattdangerw in #1375
- Fix one last bad preset hash by @mattdangerw in #1381
- Add a tokenizer for the Mistral backbone by @tirthasheshpatel in #1383
- Kaggle Presets by @sampathweb in #1365
- Fix mistral and electra tokenizer to match kaggle changes by @mattdangerw in #1387
- Align requirements with Keras by @sampathweb in #1386
- Add a preprocessor for the Mistral backbone by @tirthasheshpatel in #1385
- Switch to always expect full Kaggle preset handles by @mattdangerw in #1390
New Contributors
- @calvingiles made their first contribution in #1262
- @tirthasheshpatel made their first contribution in #1283
- @pnacht made their first contribution in #1305
- @grasskin made their first contribution in #1324
- @pranavvp16 made their first contribution in #1291
- @mbrhd made their first contribution in #1364
Full Changelog: v0.6.4...v0.7.0